Jaka Čibej, Špela Arhar Holdt, Kaja Dobrovoljc, Simon Krek

A GUIDE TO FREQUENCY LISTS FROM THE GIGAFIDA 2.0 AND GOS 1.0 CORPORA

Book series: Sporazumevanje (ISSN 2738-4527)
Editorial board: Špela Arhar Holdt and Vojko Gorjanc
Authors: Jaka Čibej, Špela Arhar Holdt, Kaja Dobrovoljc, Simon Krek
Design and layout: Dvokotnik, Lenka Trdina, s.p.
Published by: Ljubljana University Press, Faculty of Arts
Issued by: University of Ljubljana, Centre for Language Resources and Technologies
For the publisher: Roman Kuhar, Dean of the Faculty of Arts, University of Ljubljana
Ljubljana, 2020
First edition, e-edition
Publication is free of charge.
Publication is available at: https://e-knjige.ff.uni-lj.si
DOI: 10.4312/9789610604006

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

Cataloguing-in-publication (CIP) record prepared by the National and University Library in Ljubljana
COBISS.SI-ID=39089923
ISBN 978-961-06-0400-6 (pdf)

The authors acknowledge that the project New grammar of contemporary standard Slovene: sources and methods (J6-8256) was financially supported by the Slovenian Research Agency.

CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora

Introduction

The research project titled “The New Grammar of Modern Standard Slovene: Resources and Methods” (ARRS J6-8256) was carried out between 2017 and 2020 by researchers of the Jožef Stefan Institute, the Faculty of Arts, and the Faculty of Computer and Information Science of the University of Ljubljana. The goal of the project was to define a linguistic methodological basis for a computational analysis of written and spoken Slovene as represented in modern Slovene language corpora. Based on these new methods, a series of open-access corpus-based databases was generated. These can serve as a basis for preparing an empirical grammatical description of modern Slovene and for developing language technologies for Slovene. More information on the participants, the purpose, and the results of the project is available on the project website.

The work package dedicated to corpus data on the levels of morphology and word formation produced LIST, an open-access program for the statistical analysis of large corpora, available at the CLARIN.SI repository (Krsnik et al. 2019). The program was developed in Java and extracts corpus data on different levels: characters, word parts, words, and word sets. It is optimized for individual use on a personal computer and supports the TEI P5 XML and VERT formats.

In the project, LIST was used to obtain frequency lists from the reference corpora of written standard Slovene (Gigafida 2.0, described in Chapter 1) and spoken Slovene (GOS, described in Chapter 2), taking into account the different settings available in the program. The purpose of these ready-made frequency lists is to give the community of researchers and developers simpler, quicker, and better-documented access to corpus data. A total of 768 lists were made, all of which are available in organized packages at the CLARIN.SI repository. The lists extracted with LIST were complemented by lists of consonant-vowel structures of words, which were prepared with a separate script.

To make the frequency lists easier to use, the packages at the CLARIN.SI repository contain two versions of each list: in addition to the entire versions (some of which are quite large), shortened versions are also available. These are more suitable for import into data analysis software such as Microsoft Excel.
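The lists can also be read programmatically rather than in a spreadsheet. The following is a minimal sketch in Python, not part of the LIST program. It assumes that a list is a tab-separated file with a header row and that numeric cells are formatted as in the excerpts printed in this guide (commas as thousands separators, percentages followed by "%"); the files in the repository may format numbers differently. The function names `parse_number` and `load_frequency_list` are hypothetical, and the sample rows are copied from the excerpt of the character 1-gram list shown later in this guide.

```python
import csv
import io


def parse_number(cell):
    """Convert cells such as '598,958,606', '10.15 %' or '101,512.60' to a number."""
    text = cell.replace("%", "").replace(",", "").strip()
    return float(text) if "." in text else int(text)


def load_frequency_list(handle):
    """Read a tab-separated frequency list into a list of dictionaries.

    The first column (the listed string itself) is kept as text, because it
    may be a digit or a punctuation mark; all other columns are parsed as numbers.
    """
    # QUOTE_NONE keeps quotation marks literal, since they can be list entries.
    reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    header = next(reader)
    rows = []
    for fields in reader:
        if not fields:
            continue  # skip empty lines
        row = {header[0]: fields[0]}
        for name, cell in zip(header[1:], fields[1:]):
            row[name] = parse_number(cell)
        rows.append(row)
    return rows


# Inline sample in the assumed format (first four columns only).
SAMPLE = (
    "Character string\tTotal absolute frequency of character string\t"
    "Percentage of total sum of all found character strings\t"
    "Total relative frequency (per million occurrences)\n"
    "a\t598,958,606\t10.15 %\t101,512.60\n"
    "e\t586,236,625\t9.94 %\t99,356.46\n"
    "0\t13,997,379\t0.24 %\t2,372.30\n"
)

rows = load_frequency_list(io.StringIO(SAMPLE))
for row in rows:
    print(row)
```

To read a downloaded file instead of the inline sample, the same function can be given a file object opened with `open(path, encoding="utf-8", newline="")`; the UTF-8 encoding is likewise an assumption.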
The short versions of the lists are available at the same URLs as the unabridged versions and can be identified by the word “short” in the filename (as opposed to “entire”).

The purpose of this publication (titled A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora) is to provide a quick overview of the data available at the repository and to demonstrate the functions and uses of the LIST program, which can be used to extract similar frequency lists from other corpora. The guide features short excerpts of all available frequency lists, i.e. the table header and approximately 30 lines. Each table also includes a link to the data in the CLARIN.SI repository. Each subsection of a chapter begins with a short description of the conditions used in the extraction. The guide is available in Slovene and English.

1. Frequency lists from the Gigafida 2.0 Reference Corpus of Written Standard Slovene

Gigafida 2.0 is currently the latest version of the reference corpus of written standard Slovene. It was published in 2019 as a result of the project titled “Upgrade of the Gigafida, Kres, ccGigafida, and ccKres Corpora”, which was financed by the Ministry of Culture of the Republic of Slovenia between 2015 and 2018. The project was carried out by the Centre for Language Resources and Technologies of the University of Ljubljana.

The corpus consists of daily newspapers, magazines, a selected range of internet texts (also covering news articles to a certain extent), and different books (fiction, schoolbooks, non-fiction). The texts were selected and automatically processed to ensure that the corpus serves as a sample of modern standard Slovene for research in linguistics and the humanities, for the compilation of modern dictionaries, grammars, and learning materials, and for the development of language technologies for Slovene.

Gigafida 2.0 is the updated version of the Gigafida corpus (Logar et al. 2012). Unlike the previous version, Gigafida 2.0 is a corpus of standard Slovene: most of the texts containing non-standard language elements (such as user comments from news sites) were removed as part of the upgrade. Other improvements included text deduplication (which removed identical text fragments occurring more than once), improved automatic part-of-speech tagging, and several minor changes to interface design. In terms of content, the corpus was updated with texts from selected websites with significant daily text production (e.g. news sites, daily newspaper sites). In addition, texts of types that were under-represented in the first version, such as schoolbooks and fiction, were added. More information on these changes to the corpus is available in Krek et al. (2020). A detailed overview of the text type distribution in the corpus is also available online.

Gigafida 2.0 consists mostly of newspapers, magazines, and internet texts (because of the way the corpus was compiled, the latter also include news sites), while fiction and non-fiction make up a smaller portion. The following acronyms define the text types according to the taxonomy used in the corpus (the acronyms appear in the tables presented in this chapter):

• SSJ.T.D – Printed texts / Other
• SSJ.I – Internet texts
• SSJ.T.P.C – Printed texts / Periodical / Newspapers
• SSJ.T.P.R – Printed texts / Periodical / Magazines
• SSJ.T.K.N – Printed texts / Books / Unknown
• SSJ.T.K.L – Printed texts / Books / Fiction
• SSJ.T.K.S – Printed texts / Books / Non-fiction

The rest of this chapter describes the frequency lists extracted from the Gigafida 2.0 corpus. The lists are divided into sections by level, starting with characters (Section 1.1.), word parts (Section 1.2.), and words with consonant-vowel structures (Section 1.3.), and ending with word sets (Section 1.4.).

1.1. Frequency lists of characters from the Gigafida 2.0 corpus

Tables 1.1.1. to 1.1.5. contain frequency lists of character-level n-grams extracted from word forms in the Gigafida 2.0 corpus (from 1-grams, i.e. individual characters, to 5-grams, i.e. sequences of five characters). For instance, from the word “Slonom”, the following character 2-grams are extracted: “Sl”, “lo”, “on”, “no”, and “om”, as well as the following 3-grams: “Slo”, “lon”, “ono”, and “nom”. Each character n-gram entry on the list also contains its absolute and relative frequencies and percentages based on all character n-grams in the corpus (or in part of the corpus when the frequency refers to a specific taxonomy branch; see below for further details). Tables 1.1.6. to 1.1.10. contain character-level n-grams extracted in the same manner, but from lemmas in the Gigafida 2.0 corpus instead of word forms.

Total absolute frequencies are the sums of all occurrences of a specific character n-gram in all units (word forms or lemmas) in the corpus. Total relative frequencies indicate how frequently a character n-gram occurs per 1,000,000 occurrences of character n-grams of equal length in the corpus. They are calculated according to the following formula, where $f_a$ is the total absolute frequency of the character n-gram and $N$ is the total absolute frequency of all character n-grams of equal length in the corpus:

$$f_r = \frac{f_a}{N} \times 1{,}000{,}000$$

The percentage of a character n-gram represents the share of the n-gram among all extracted character n-grams of equal length in the corpus, and is calculated in the following manner:

$$p = \frac{f_a}{N} \times 100\,\%$$

The character n-grams are also listed with their absolute and relative frequencies within individual taxonomy branches in the Gigafida 2.0 corpus. These indicate how frequently a certain character n-gram appears in a certain text type (e.g. internet texts, newspapers, or fiction).
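The extraction of character n-grams and the two derived measures (relative frequency per million n-grams of equal length, and percentage) can be illustrated with a short sketch. It is written in Python purely for illustration and is not the code of LIST, which is a Java program. The function names `char_ngrams` and `frequency_table` are hypothetical, and the three-word input is a toy example, not corpus data.

```python
from collections import Counter


def char_ngrams(word, n):
    """Return all contiguous character n-grams of a word, from left to right."""
    return [word[i:i + n] for i in range(len(word) - n + 1)]


def frequency_table(words, n):
    """Count character n-grams over all words and derive both measures.

    Returns {ngram: (absolute frequency, relative frequency per million, percentage)}.
    """
    counts = Counter(gram for word in words for gram in char_ngrams(word, n))
    total = sum(counts.values())  # N: all extracted n-grams of this length
    return {
        gram: (fa, fa / total * 1_000_000, fa / total * 100)
        for gram, fa in counts.items()
    }


print(char_ngrams("Slonom", 2))  # ['Sl', 'lo', 'on', 'no', 'om']
print(char_ngrams("Slonom", 3))  # ['Slo', 'lon', 'ono', 'nom']

# Toy input: 3 + 4 + 5 = 12 character 2-grams in total, 3 of which are "sl".
table = frequency_table(["slon", "slona", "slonom"], 2)
fa, fr, p = table["sl"]
print(fa, fr, p)  # 3 250000.0 25.0
```

The same computation applies within a single taxonomy branch if only the words from texts of that branch are passed to `frequency_table`, so that the total covers that branch alone.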
In this case, the absolute frequencies represent the sum of all occurrences of a character n-gram in the texts of a specific taxonomy branch. The relative frequencies ($f_{rT}$) and percentages ($p_T$) are calculated using the following formulas, where $f_{aT}$ is the absolute frequency of a character n-gram in a taxonomy branch, and $N_T$ is the total frequency of all extracted character n-grams of equal length within that taxonomy branch:

$$f_{rT} = \frac{f_{aT}}{N_T} \times 1{,}000{,}000$$

$$p_T = \frac{f_{aT}}{N_T} \times 100\,\%$$

Table 1.1.1: List of character-level 1-grams in word forms in the Gigafida 2.0 corpus
File at CLARIN.SI: GF2.0-characters-lowercase_forms-1grams-taxonomy-entire.tsv

Columns: Character string | Total absolute frequency of character string | Percentage of total sum of all found character strings | Total relative frequency (per million occurrences) | Absolute frequency [SSJ.T.D] | Percentage [SSJ.T.D] | Relative frequency [SSJ.T.D] | Absolute frequency [SSJ.I] | Percentage [SSJ.I] | Relative frequency [SSJ.I] | Absolute frequency [SSJ.T.P.C] | Percentage [SSJ.T.P.C] | Relative frequency [SSJ.T.P.C] | Absolute frequency [SSJ.T.K.S] | Percentage [SSJ.T.K.S] | Relative frequency [SSJ.T.K.S] | Absolute frequency [SSJ.T.P.R] | Percentage [SSJ.T.P.R] | Relative frequency [SSJ.T.P.R] | Absolute frequency [SSJ.T.K.N] | Percentage [SSJ.T.K.N] | Relative frequency [SSJ.T.K.N] | Absolute frequency [SSJ.T.K.L] | Percentage [SSJ.T.K.L] | Relative frequency [SSJ.T.K.L]

a 598,958,606 10.15 % 101,512.60 2,184,424 10.40 % 104,026.42 170,759,742 10.24 % 102,410 286,321,161 10.10 % 100,945.25 22,314,144 9.79 % 97,916.32 97,864,508 10.14 % 101,367.21 4,365 8.93 % 89,309.46 19,510,262 10.71 % 107,116.10 e 586,236,625 9.94 % 99,356.46 2,087,620 9.94 % 99,416.43 165,024,600 9.90 % 98,970.45 278,968,651 9.84 % 98,353.05 23,008,431 10.10 % 100,962.91 97,476,445 10.10 % 100,965.26 5,695 11.65 % 116,521.74 19,665,183 10.80 % 107,966.66 o 540,730,482 9.16 % 91,643.99 1,979,619 9.43 % 94,273.22
151,866,298 9.11 % 91,079.01 259,759,973 9.16 % 91,580.85 21,346,544 9.37 % 93,670.41 89,406,747 9.26 % 92,606.74 4,695 9.61 % 96,061.38 16,366,606 8.99 % 89,856.66 i 525,579,546 8.91 % 89,076.18 1,839,185 8.76 % 87,585.48 146,991,182 8.82 % 88,155.24 252,667,917 8.91 % 89,080.48 20,768,156 9.11 % 91,132.40 87,181,903 9.03 % 90,302.26 4,390 8.98 % 89,820.97 16,126,813 8.85 % 88,540.14 n 399,521,517 6.77 % 67,711.64 1,457,481 6.94 % 69,408.02 112,414,843 6.74 % 67,418.72 192,052,593 6.77 % 67,709.97 15,930,189 6.99 % 69,902.99 66,015,209 6.84 % 68,377.98 3,324 6.80 % 68,010.23 11,647,878 6.39 % 63,949.69 r 315,501,548 5.35 % 53,471.78 1,118,026 5.32 % 53,242.52 89,699,851 5.38 % 53,795.83 153,078,146 5.40 % 53,969.16 11,907,120 5.22 % 52,249.43 50,807,668 5.26 % 52,626.14 2,489 5.09 % 50,925.83 8,888,248 4.88 % 48,798.65 s 273,909,109 4.64 % 46,422.62 952,496 4.54 % 45,359.67 77,368,681 4.64 % 46,400.44 130,523,582 4.60 % 46,017.33 10,822,686 4.75 % 47,490.85 45,161,684 4.68 % 46,778.08 1,938 3.96 % 39,652.17 9,078,042 4.98 % 49,840.67 t 268,629,370 4.55 % 45,527.80 941,325 4.48 % 44,827.69 75,995,217 4.56 % 45,576.73 128,227,916 4.52 % 45,207.97 10,773,400 4.73 % 47,274.57 44,934,820 4.65 % 46,543.10 3,212 6.57 % 65,718.67 7,753,480 4.26 % 42,568.50 l 265,899,612 4.51 % 45,065.15 886,117 4.22 % 42,198.58 74,878,845 4.49 % 44,907.20 127,556,486 4.50 % 44,971.25 9,714,268 4.26 % 42,627.02 42,934,946 4.45 % 44,471.65 1,933 3.96 % 39,549.87 9,927,017 5.45 % 54,501.75 j 250,906,640 4.25 % 42,524.12 833,674 3.97 % 39,701.14 70,715,286 4.24 % 42,410.19 120,184,074 4.24 % 42,372.04 9,730,291 4.27 % 42,697.33 41,033,341 4.25 % 42,501.98 1,963 4.02 % 40,163.68 8,408,011 4.62 % 46,162.03 v 248,863,856 4.22 % 42,177.90 912,157 4.34 % 43,438.65 71,162,021 4.27 % 42,678.11 121,490,001 4.28 % 42,832.46 9,373,309 4.11 % 41,130.86 39,331,881 4.07 % 40,739.62 1,294 2.65 % 26,475.70 6,593,193 3.62 % 36,198.24 k 213,782,298 3.62 % 36,232.22 738,157 3.52 % 35,152.44 59,487,986 3.57 % 
35,676.82 103,561,722 3.65 % 36,511.67 8,040,915 3.53 % 35,284.20 35,296,515 3.66 % 36,559.83 1,742 3.56 % 35,641.94 6,655,261 3.65 % 36,539.01 d 211,403,779 3.58 % 35,829.10 784,338 3.73 % 37,351.67 61,189,463 3.67 % 36,697.25 102,732,827 3.62 % 36,219.44 7,515,951 3.30 % 32,980.62 33,058,462 3.42 % 34,241.67 1,497 3.06 % 30,629.16 6,121,241 3.36 % 33,607.11 p 201,540,006 3.42 % 34,157.37 766,470 3.65 % 36,500.76 57,730,436 3.46 % 34,622.76 96,020,266 3.38 % 33,852.86 7,713,711 3.38 % 33,848.41 33,224,846 3.44 % 34,414.01 2,059 4.21 % 42,127.88 6,082,218 3.34 % 33,392.86 m 175,047,714 2.97 % 29,667.41 587,122 2.80 % 27,959.87 47,630,711 2.86 % 28,565.64 82,912,725 2.92 % 29,231.67 6,966,121 3.06 % 30,567.92 30,631,301 3.17 % 31,727.64 1,699 3.48 % 34,762.15 6,318,035 3.47 % 34,687.56 z 124,307,716 2.11 % 21,067.90 506,918 2.41 % 24,140.40 35,494,079 2.13 % 21,286.92 59,055,186 2.08 % 20,820.47 5,017,328 2.20 % 22,016.45 20,354,120 2.11 % 21,082.62 1,102 2.25 % 22,547.31 3,878,983 2.13 % 21,296.56 u 119,147,508 2.02 % 20,193.34 413,952 1.97 % 19,713.18 34,581,712 2.07 % 20,739.74 57,202,489 2.02 % 20,167.28 4,679,425 2.05 % 20,533.71 18,971,857 1.97 % 19,650.89 946 1.94 % 19,355.50 3,297,127 1.81 % 18,102.03 b 105,976,190 1.80 % 17,961.04 356,941 1.70 % 16,998.21 29,704,140 1.78 % 17,814.51 51,253,141 1.81 % 18,069.78 3,857,863 1.69 % 16,928.62 17,170,033 1.78 % 17,784.57 567 1.16 % 11,601.02 3,633,505 2.00 % 19,948.83 g 85,180,392 1.44 % 14,436.53 336,020 1.60 % 16,001.91 24,033,881 1.44 % 14,413.88 41,028,309 1.45 % 14,464.92 3,338,720 1.47 % 14,650.58 13,589,801 1.41 % 14,076.20 505 1.03 % 10,332.48 2,853,156 1.57 % 15,664.52 č 78,511,624 1.33 % 13,306.29 306,574 1.46 % 14,599.64 21,268,709 1.28 % 12,755.52 37,935,372 1.34 % 13,374.48 3,160,215 1.39 % 13,867.29 13,183,120 1.36 % 13,654.96 649 1.33 % 13,278.77 2,656,985 1.46 % 14,587.50 h 63,465,194 1.08 % 10,756.20 217,953 1.04 % 10,379.34 17,536,696 1.05 % 10,517.31 30,142,431 1.06 % 10,627 2,686,269 1.18 % 
11,787.57 10,994,068 1.14 % 11,387.56 573 1.17 % 11,723.79 1,887,204 1.04 % 10,361.21 š 56,691,217 0.96 % 9,608.13 168,468 0.80 % 8,022.77 15,421,662 0.93 % 9,248.86 28,337,680 1.00 % 9,990.72 1,931,642 0.85 % 8,476.21 9,043,312 0.94 % 9,366.98 416 0.85 % 8,511.51 1,788,037 0.98 % 9,816.76 c 52,786,813 0.90 % 8,946.41 165,037 0.79 % 7,859.38 15,388,509 0.92 % 9,228.97 25,679,128 0.91 % 9,053.42 1,960,866 0.86 % 8,604.44 8,400,508 0.87 % 8,701.17 592 1.21 % 12,112.53 1,192,173 0.66 % 6,545.32 ž 35,740,568 0.61 % 6,057.38 122,688 0.58 % 5,842.64 9,946,743 0.60 % 5,965.38 17,295,224 0.61 % 6,097.60 1,366,098 0.60 % 5,994.55 5,899,101 0.61 % 6,110.24 427 0.87 % 8,736.57 1,110,287 0.61 % 6,095.75 0 13,997,379 0.24 % 2,372.30 38,824 0.18 % 1,848.87 4,400,564 0.26 % 2,639.16 7,272,488 0.26 % 2,563.98 376,876 0.17 % 1,653.76 1,890,585 0.20 % 1,958.25 207 0.42 % 4,235.29 17,835 0.01 % 97.92 f 13,845,562 0.23 % 2,346.57 36,876 0.18 % 1,756.11 4,275,521 0.26 % 2,564.16 6,411,677 0.23 % 2,260.50 522,884 0.23 % 2,294.46 2,340,996 0.24 % 2,424.78 48 0.10 % 982.10 257,560 0.14 % 1,414.07 1 12,360,406 0.21 % 2,094.86 46,628 0.22 % 2,220.51 3,822,597 0.23 % 2,292.53 6,491,952 0.23 % 2,288.80 559,426 0.24 % 2,454.81 1,407,006 0.15 % 1,457.36 147 0.30 % 3,007.67 32,650 0.02 % 179.26 , 9,940,200 0.17 % 1,684.68 50,484 0.24 % 2,404.14 2,447,875 0.15 % 1,468.07 5,865,291 0.21 % 2,067.86 422,960 0.19 % 1,855.98 1,117,876 0.12 % 1,157.89 35 0.07 % 716.11 35,679 0.02 % 195.89 2 9,195,924 0.16 % 1,558.54 29,571 0.14 % 1,408.23 3,068,275 0.18 % 1,840.14 4,727,964 0.17 % 1,666.89 312,599 0.14 % 1,371.71 1,043,172 0.11 % 1,080.51 108 0.22 % 2,209.72 14,235 0.01 % 78.15 5 5,445,200 0.09 % 922.86 17,802 0.09 % 847.77 1,555,080 0.09 % 932.63 2,991,123 0.10 % 1,054.55 193,227 0.09 % 847.90 677,786 0.07 % 702.04 77 0.16 % 1,575.45 10,105 0.01 % 55.48 3 5,365,748 0.09 % 909.40 22,778 0.11 % 1,084.73 1,560,837 0.09 % 936.08 2,938,417 0.10 % 1,035.97 205,538 0.09 % 901.92 628,071 0.07 % 650.55 59 0.12 
% 1,207.16 10,048 0.01 % 55.17 9 4,688,195 0.08 % 794.56 22,232 0.11 % 1,058.73 1,237,902 0.07 % 742.41 2,472,971 0.09 % 871.87 285,577 0.12 % 1,253.14 652,502 0.07 % 675.86 8 0.02 % 163.68 17,003 0.01 % 93.35

Table 1.1.2: List of character-level 2-grams in word forms in the Gigafida 2.0 corpus
File at CLARIN.SI: GF2.0-characters-lowercase_forms-2grams-taxonomy-entire.tsv

Columns: Character string | Total absolute frequency of character string | Percentage of total sum of all found character strings | Total relative frequency (per million occurrences) | Absolute frequency [SSJ.T.D] | Percentage [SSJ.T.D] | Relative frequency [SSJ.T.D] | Absolute frequency [SSJ.I] | Percentage [SSJ.I] | Relative frequency [SSJ.I] | Absolute frequency [SSJ.T.P.C] | Percentage [SSJ.T.P.C] | Relative frequency [SSJ.T.P.C] | Absolute frequency [SSJ.T.K.S] | Percentage [SSJ.T.K.S] | Relative frequency [SSJ.T.K.S] | Absolute frequency [SSJ.T.P.R] | Percentage [SSJ.T.P.R] | Relative frequency [SSJ.T.P.R] | Absolute frequency [SSJ.T.K.N] | Percentage [SSJ.T.K.N] | Relative frequency [SSJ.T.K.N] | Absolute frequency [SSJ.T.K.L] | Percentage [SSJ.T.K.L] | Relative frequency [SSJ.T.K.L]

je 96,506,859 2.02 % 20,250.54 322,174 1.89 % 18,903.73 27,437,248 2.03 % 20,331.81 45,300,651 1.98 % 19,750.21 3,772,508 2.04 % 20,396.87 15,690,015 2.02 % 20,166.40 348 0.89 % 8,884.80 3,983,915 2.80 % 27,971.92 na 81,178,495 1.70 % 17,034.11 288,094 1.69 % 16,904.07 23,143,474 1.72 % 17,150 38,795,585 1.69 % 16,914.13 3,145,089 1.70 % 17,004.59 13,496,331 1.74 % 17,346.86 687 1.75 % 17,539.83 2,309,235 1.62 % 16,213.64 ni 75,424,233 1.58 % 15,826.66 276,830 1.62 % 16,243.15 21,374,386 1.58 % 15,839.05 37,023,973 1.61 % 16,141.74 2,663,531 1.44 % 14,400.95 11,914,136 1.53 % 15,313.26 488 1.25 % 12,459.15 2,170,889 1.52 % 15,242.28 pr 72,561,683 1.52 % 15,226 301,038 1.77 % 17,663.57 20,739,950 1.54 % 15,368.91 35,088,277 1.53 % 15,297.81 2,659,702 1.44 % 14,380.25 11,759,627 1.51 % 15,114.67
585 1.49 % 14,935.66 2,012,504 1.41 % 14,130.22 ra 70,316,143 1.48 % 14,754.81 267,278 1.57 % 15,682.68 19,476,861 1.44 % 14,432.93 33,835,549 1.48 % 14,751.65 2,870,675 1.55 % 15,520.92 11,835,100 1.52 % 15,211.67 585 1.49 % 14,935.66 2,030,095 1.43 % 14,253.73 st 69,379,371 1.46 % 14,558.24 273,399 1.60 % 16,041.83 19,962,644 1.48 % 14,792.91 33,061,291 1.44 % 14,414.09 2,982,824 1.61 % 16,127.27 11,168,775 1.44 % 14,355.24 559 1.43 % 14,271.85 1,929,879 1.35 % 13,550.10 po 68,656,405 1.44 % 14,406.53 266,419 1.56 % 15,632.28 20,011,371 1.48 % 14,829.01 32,194,901 1.40 % 14,036.36 2,796,595 1.51 % 15,120.39 11,259,147 1.45 % 14,471.40 727 1.86 % 18,561.07 2,127,245 1.49 % 14,935.84 re 65,163,700 1.37 % 13,673.64 255,316 1.50 % 14,980.80 18,661,025 1.38 % 13,828.37 31,392,281 1.37 % 13,686.43 2,408,925 1.30 % 13,024.37 10,548,563 1.36 % 13,558.08 640 1.63 % 16,339.87 1,896,950 1.33 % 13,318.89 ko 58,439,989 1.23 % 12,262.77 208,470 1.22 % 12,232.09 15,978,967 1.18 % 11,840.88 27,908,195 1.22 % 12,167.44 2,301,413 1.24 % 12,443.08 10,061,025 1.29 % 12,931.45 663 1.69 % 16,927.08 1,981,256 1.39 % 13,910.82 an 57,722,117 1.21 % 12,112.13 202,234 1.19 % 11,866.19 17,084,008 1.27 % 12,659.75 28,476,501 1.24 % 12,415.21 2,170,064 1.17 % 11,732.91 8,622,816 1.11 % 11,082.92 481 1.23 % 12,280.43 1,166,013 0.82 % 8,186.83 in 57,496,021 1.21 % 12,064.69 214,628 1.26 % 12,593.41 15,640,234 1.16 % 11,589.87 27,720,463 1.21 % 12,085.59 2,460,293 1.33 % 13,302.10 9,653,842 1.24 % 12,408.10 967 2.47 % 24,688.52 1,805,594 1.27 % 12,677.46 ne 56,597,273 1.19 % 11,876.10 224,776 1.32 % 13,188.85 15,737,659 1.17 % 11,662.07 26,813,150 1.17 % 11,690.02 2,354,475 1.27 % 12,729.97 9,673,905 1.24 % 12,433.89 365 0.93 % 9,318.83 1,792,943 1.26 % 12,588.64 ov 55,866,264 1.17 % 11,722.71 180,110 1.06 % 10,568.05 16,244,070 1.20 % 12,037.33 28,011,749 1.22 % 12,212.58 1,953,201 1.06 % 10,560.40 8,255,904 1.06 % 10,611.33 165 0.42 % 4,212.62 1,221,065 0.86 % 8,573.36 en 55,705,394 1.17 % 
11,688.96 229,648 1.35 % 13,474.72 15,680,285 1.16 % 11,619.55 26,573,171 1.16 % 11,585.39 2,271,695 1.23 % 12,282.40 9,330,133 1.20 % 11,992.03 376 0.96 % 9,599.67 1,620,086 1.14 % 11,374.97 li 50,991,948 1.07 % 10,699.91 172,245 1.01 % 10,106.57 13,906,680 1.03 % 10,305.26 24,949,266 1.09 % 10,877.40 2,045,458 1.11 % 11,059.20 8,392,230 1.08 % 10,786.55 516 1.32 % 13,174.02 1,525,553 1.07 % 10,711.24 no 50,882,675 1.07 % 10,676.98 187,495 1.10 % 11,001.37 13,908,579 1.03 % 10,306.67 24,264,709 1.06 % 10,578.95 2,226,549 1.20 % 12,038.31 8,854,638 1.14 % 11,380.88 429 1.09 % 10,952.82 1,440,276 1.01 % 10,112.49 al 50,434,041 1.06 % 10,582.84 160,956 0.94 % 9,444.18 14,522,836 1.08 % 10,761.85 23,958,941 1.04 % 10,445.64 1,788,788 0.97 % 9,671.46 7,891,518 1.01 % 10,142.98 246 0.63 % 6,280.64 2,110,756 1.48 % 14,820.07 te 47,676,005 1.00 % 10,004.11 194,121 1.14 % 11,390.15 13,219,488 0.98 % 9,796.03 22,000,624 0.96 % 9,591.85 2,194,655 1.19 % 11,865.87 8,679,121 1.12 % 11,155.29 1,944 4.96 % 49,632.35 1,386,052 0.97 % 9,731.77 la 47,561,474 1.00 % 9,980.07 169,662 1.00 % 9,955.01 13,364,115 0.99 % 9,903.20 22,645,918 0.99 % 9,873.18 1,576,085 0.85 % 8,521.44 7,516,474 0.97 % 9,660.94 316 0.81 % 8,067.81 2,288,904 1.61 % 16,070.89 ve 47,099,302 0.99 % 9,883.09 162,916 0.96 % 9,559.18 13,511,125 1.00 % 10,012.14 22,765,294 0.99 % 9,925.23 1,749,632 0.95 % 9,459.76 7,582,784 0.97 % 9,746.16 117 0.30 % 2,987.13 1,327,434 0.93 % 9,320.20 ri 45,933,584 0.96 % 9,638.49 149,008 0.87 % 8,743.12 12,928,885 0.96 % 9,580.68 22,222,031 0.97 % 9,688.38 1,776,400 0.96 % 9,604.49 7,509,709 0.96 % 9,652.24 390 1.00 % 9,957.11 1,347,161 0.95 % 9,458.71 za 44,760,122 0.94 % 9,392.25 188,126 1.10 % 11,038.39 13,088,699 0.97 % 9,699.11 21,630,911 0.94 % 9,430.66 1,507,299 0.81 % 8,149.53 7,007,645 0.90 % 9,006.94 310 0.79 % 7,914.62 1,337,132 0.94 % 9,388.29 se 44,510,387 0.93 % 9,339.85 162,742 0.95 % 9,548.97 11,907,138 0.88 % 8,823.54 20,744,161 0.90 % 9,044.06 1,676,747 0.91 % 
9,065.69 7,774,390 1.00 % 9,992.44 291 0.74 % 7,429.53 2,244,918 1.58 % 15,762.05 ja 44,497,489 0.93 % 9,337.14 164,832 0.97 % 9,671.61 13,353,419 0.99 % 9,895.28 21,466,267 0.94 % 9,358.88 1,698,824 0.92 % 9,185.05 6,747,142 0.87 % 8,672.11 334 0.85 % 8,527.37 1,066,671 0.75 % 7,489.33 od 43,916,907 0.92 % 9,215.32 183,624 1.08 % 10,774.24 12,986,739 0.96 % 9,623.56 21,452,244 0.94 % 9,352.77 1,561,504 0.84 % 8,442.60 6,627,883 0.85 % 8,518.83 473 1.21 % 12,076.18 1,104,440 0.78 % 7,754.51 ti 43,851,176 0.92 % 9,201.52 155,661 0.91 % 9,133.49 11,936,273 0.89 % 8,845.13 20,735,684 0.90 % 9,040.36 1,940,584 1.05 % 10,492.18 7,556,481 0.97 % 9,712.36 229 0.58 % 5,846.61 1,526,264 1.07 % 10,716.23 il 42,744,394 0.90 % 8,969.28 111,472 0.65 % 6,540.68 12,775,486 0.95 % 9,467.01 20,289,942 0.89 % 8,846.02 1,265,366 0.68 % 6,841.47 6,233,149 0.80 % 8,011.48 126 0.32 % 3,216.91 2,068,853 1.45 % 14,525.86 da 40,620,509 0.85 % 8,523.61 147,480 0.86 % 8,653.47 11,834,643 0.88 % 8,769.82 19,193,491 0.84 % 8,367.99 1,350,944 0.73 % 7,304.17 6,454,210 0.83 % 8,295.61 336 0.86 % 8,578.43 1,639,405 1.15 % 11,510.61 el 40,300,151 0.85 % 8,456.39 119,460 0.70 % 7,009.38 10,975,653 0.81 % 8,133.28 19,401,352 0.85 % 8,458.62 1,510,378 0.82 % 8,166.18 6,696,072 0.86 % 8,606.47 277 0.71 % 7,072.10 1,596,959 1.12 % 11,212.59 nj 40,187,972 0.84 % 8,432.85 152,057 0.89 % 8,922.03 11,046,181 0.82 % 8,185.55 19,516,426 0.85 % 8,508.79 1,793,699 0.97 % 9,698.02 6,543,032 0.84 % 8,409.77 318 0.81 % 8,118.87 1,136,259 0.80 % 7,977.92 em 40,130,086 0.84 % 8,420.71 148,975 0.87 % 8,741.19 11,150,979 0.83 % 8,263.20 18,921,160 0.82 % 8,249.26 1,665,476 0.90 % 9,004.75 6,741,184 0.87 % 8,664.45 301 0.77 % 7,684.84 1,502,011 1.05 % 10,545.94 ka 39,023,002 0.82 % 8,188.40 132,835 0.78 % 7,794.17 10,856,229 0.80 % 8,044.79 18,498,231 0.81 % 8,064.87 1,435,590 0.78 % 7,761.82 6,644,057 0.85 % 8,539.62 178 0.45 % 4,544.53 1,455,882 1.02 % 10,222.06

Table 1.1.3: List of character-level 3-grams in word forms in the Gigafida 2.0 corpus
File at CLARIN.SI: GF2.0-characters-lowercase_forms-3grams-taxonomy-entire.tsv

Columns: Character string | Total absolute frequency of character string | Percentage of total sum of all found character strings | Total relative frequency (per million occurrences) | Absolute frequency [SSJ.T.D] | Percentage [SSJ.T.D] | Relative frequency [SSJ.T.D] | Absolute frequency [SSJ.I] | Percentage [SSJ.I] | Relative frequency [SSJ.I] | Absolute frequency [SSJ.T.P.C] | Percentage [SSJ.T.P.C] | Relative frequency [SSJ.T.P.C] | Absolute frequency [SSJ.T.K.S] | Percentage [SSJ.T.K.S] | Relative frequency [SSJ.T.K.S] | Absolute frequency [SSJ.T.P.R] | Percentage [SSJ.T.P.R] | Relative frequency [SSJ.T.P.R] | Absolute frequency [SSJ.T.K.N] | Percentage [SSJ.T.K.N] | Relative frequency [SSJ.T.K.N] | Absolute frequency [SSJ.T.K.L] | Percentage [SSJ.T.K.L] | Relative frequency [SSJ.T.K.L]

pre 26,797,393 0.73 % 7,264.83 114,343 0.86 % 8,590.93 7,868,098 0.75 % 7,505.18 12,808,196 0.72 % 7,201.46 962,431 0.67 % 6,671.48 4,307,154 0.72 % 7,180.67 216 0.71 % 7,110.88 736,955 0.71 % 7,065.14 pri 20,184,578 0.55 % 5,472.08 72,503 0.55 % 5,447.36 5,542,400 0.53 % 5,286.75 9,759,022 0.55 % 5,487.05 790,932 0.55 % 5,482.67 3,423,785 0.57 % 5,707.96 174 0.57 % 5,728.21 595,762 0.57 % 5,711.54 ost 18,637,435 0.51 % 5,052.65 72,950 0.55 % 5,480.95 5,151,866 0.49 % 4,914.23 8,638,228 0.49 % 4,856.88 952,437 0.66 % 6,602.21 3,363,575 0.56 % 5,607.58 140 0.46 % 4,608.90 458,239 0.44 % 4,393.11 anj 18,282,250 0.50 % 4,956.36 78,574 0.59 % 5,903.50 5,168,957 0.49 % 4,930.54 8,902,862 0.50 % 5,005.67 866,966 0.60 % 6,009.73 2,936,984 0.49 % 4,896.39 103 0.34 % 3,390.83 327,804 0.31 % 3,142.64 nje 16,993,900 0.46 % 4,607.08 74,401 0.56 % 5,589.97 4,631,745 0.44 % 4,418.10 8,037,848 0.45 % 4,519.31 809,449 0.56 % 5,611.03 2,878,450 0.48 % 4,798.80 89 0.29 % 2,929.94 561,918 0.54 % 5,387.07 ega 16,384,096 0.44 % 4,441.76 84,840 0.64 % 6,374.28 4,637,432 0.44
% 4,423.53 7,977,221 0.45 % 4,485.23 642,676 0.45 % 4,454.97 2,569,184 0.43 % 4,283.21 69 0.23 % 2,271.53 472,674 0.45 % 4,531.50 sta 15,998,261 0.43 % 4,337.16 61,073 0.46 % 4,588.59 4,873,887 0.47 % 4,649.08 7,485,562 0.42 % 4,208.79 607,860 0.42 % 4,213.63 2,461,124 0.41 % 4,103.06 101 0.33 % 3,324.99 508,654 0.49 % 4,876.44 red 14,799,399 0.40 % 4,012.15 64,971 0.49 % 4,881.46 4,380,015 0.42 % 4,177.99 7,437,407 0.42 % 4,181.71 490,177 0.34 % 3,397.86 2,131,578 0.35 % 3,553.66 56 0.18 % 1,843.56 295,195 0.28 % 2,830.02 ali 13,747,362 0.37 % 3,726.94 63,961 0.48 % 4,805.58 3,652,081 0.35 % 3,483.63 6,787,735 0.38 % 3,816.43 599,942 0.42 % 4,158.74 2,257,408 0.38 % 3,763.44 77 0.25 % 2,534.90 386,158 0.37 % 3,702.07 rav 13,190,280 0.36 % 3,575.91 69,828 0.53 % 5,246.38 3,625,985 0.35 % 3,458.74 6,422,140 0.36 % 3,610.88 490,647 0.34 % 3,401.12 2,208,860 0.37 % 3,682.50 75 0.25 % 2,469.05 372,745 0.36 % 3,573.48 sti 12,678,683 0.34 % 3,437.22 50,549 0.38 % 3,797.90 3,547,549 0.34 % 3,383.92 6,058,280 0.34 % 3,406.29 600,524 0.42 % 4,162.78 2,083,261 0.35 % 3,473.11 99 0.33 % 3,259.15 338,421 0.32 % 3,244.42 ove 11,531,063 0.31 % 3,126.10 34,622 0.26 % 2,601.25 3,492,065 0.33 % 3,330.99 5,794,707 0.33 % 3,258.10 376,843 0.26 % 2,612.24 1,597,369 0.27 % 2,663.05 20 0.07 % 658.41 235,437 0.23 % 2,257.12 pra 11,359,867 0.31 % 3,079.68 60,177 0.45 % 4,521.28 3,087,649 0.29 % 2,945.23 5,581,705 0.31 % 3,138.34 409,731 0.28 % 2,840.22 1,859,519 0.31 % 3,100.09 143 0.47 % 4,707.66 360,943 0.35 % 3,460.34 ova 10,974,157 0.30 % 2,975.12 36,725 0.28 % 2,759.26 3,151,041 0.30 % 3,005.70 5,577,953 0.31 % 3,136.23 368,651 0.26 % 2,555.46 1,572,754 0.26 % 2,622.01 8 0.03 % 263.37 267,025 0.26 % 2,559.95 del 10,819,806 0.29 % 2,933.27 41,415 0.31 % 3,111.63 2,994,381 0.29 % 2,856.26 5,341,540 0.30 % 3,003.30 426,273 0.29 % 2,954.89 1,704,242 0.28 % 2,841.23 38 0.12 % 1,250.99 311,917 0.30 % 2,990.33 bil 10,642,526 0.29 % 2,885.21 28,339 0.21 % 2,129.19 2,992,369 0.28 % 2,854.35 
5,091,670 0.29 % 2,862.81 315,223 0.22 % 2,185.10 1,588,277 0.27 % 2,647.89 6 0.02 % 197.52 626,642 0.60 % 6,007.58 eni 10,479,413 0.28 % 2,840.99 48,441 0.36 % 3,639.51 3,060,868 0.29 % 2,919.68 5,103,963 0.29 % 2,869.73 372,694 0.26 % 2,583.48 1,637,128 0.27 % 2,729.34 72 0.24 % 2,370.29 256,247 0.25 % 2,456.62 sto 10,461,611 0.28 % 2,836.17 40,442 0.30 % 3,038.53 3,169,163 0.30 % 3,022.98 5,054,214 0.28 % 2,841.75 396,363 0.28 % 2,747.55 1,516,860 0.25 % 2,528.83 77 0.25 % 2,534.90 284,492 0.27 % 2,727.41 nik 10,423,096 0.28 % 2,825.72 32,524 0.24 % 2,443.62 2,809,001 0.27 % 2,679.43 5,282,391 0.30 % 2,970.05 333,926 0.23 % 2,314.74 1,727,337 0.29 % 2,879.73 70 0.23 % 2,304.45 237,847 0.23 % 2,280.23 udi 10,333,862 0.28 % 2,801.53 26,837 0.20 % 2,016.34 2,903,742 0.28 % 2,769.81 5,072,922 0.28 % 2,852.27 331,741 0.23 % 2,299.60 1,793,690 0.30 % 2,990.35 28 0.09 % 921.78 204,902 0.20 % 1,964.38 let 9,994,628 0.27 % 2,709.57 20,702 0.16 % 1,555.40 3,068,343 0.29 % 2,926.81 5,011,137 0.28 % 2,817.53 255,509 0.18 % 1,771.17 1,489,315 0.25 % 2,482.91 13 0.04 % 427.97 149,609 0.14 % 1,434.29 pos 9,817,811 0.27 % 2,661.63 44,722 0.34 % 3,360.10 2,841,083 0.27 % 2,710.04 4,650,134 0.26 % 2,614.56 429,179 0.30 % 2,975.03 1,604,344 0.27 % 2,674.68 157 0.52 % 5,168.55 248,192 0.24 % 2,379.40 lov 9,685,835 0.26 % 2,625.85 34,347 0.26 % 2,580.59 2,905,768 0.28 % 2,771.74 4,971,752 0.28 % 2,795.39 283,455 0.20 % 1,964.88 1,371,737 0.23 % 2,286.89 20 0.07 % 658.41 118,756 0.11 % 1,138.51 nih 9,495,910 0.26 % 2,574.36 42,056 0.32 % 3,159.79 2,739,324 0.26 % 2,612.97 4,593,862 0.26 % 2,582.92 404,963 0.28 % 2,807.17 1,562,135 0.26 % 2,604.31 19 0.06 % 625.49 153,551 0.15 % 1,472.08 ili 9,292,825 0.25 % 2,519.31 17,068 0.13 % 1,282.37 2,715,609 0.26 % 2,590.35 4,842,202 0.27 % 2,722.55 236,614 0.16 % 1,640.19 1,272,288 0.21 % 2,121.09 51 0.17 % 1,678.96 208,993 0.20 % 2,003.60 raz 9,288,355 0.25 % 2,518.09 37,641 0.28 % 2,828.08 2,462,945 0.23 % 2,349.34 4,378,356 0.25 % 2,461.75 
524,774 0.36 % 3,637.69 1,594,064 0.27 % 2,657.54 102 0.34 % 3,357.91 290,473 0.28 % 2,784.75 ako 9,203,590 0.25 % 2,495.11 50,073 0.38 % 3,762.13 2,392,640 0.23 % 2,282.28 4,318,568 0.24 % 2,428.13 379,132 0.26 % 2,628.11 1,642,292 0.27 % 2,737.95 98 0.32 % 3,226.23 420,787 0.40 % 4,034.06 pro 9,194,157 0.25 % 2,492.56 38,758 0.29 % 2,912 2,598,663 0.25 % 2,478.80 4,534,548 0.26 % 2,549.57 344,226 0.24 % 2,386.14 1,496,952 0.25 % 2,495.64 29 0.10 % 954.70 180,981 0.17 % 1,735.05 ter 9,142,006 0.25 % 2,478.42 39,875 0.30 % 2,995.93 2,688,417 0.26 % 2,564.41 4,311,234 0.24 % 2,424.01 398,535 0.28 % 2,762.61 1,525,909 0.25 % 2,543.92 132 0.43 % 4,345.54 177,904 0.17 % 1,705.56 tud 9,134,541 0.25 % 2,476.39 24,307 0.18 % 1,826.26 2,562,967 0.24 % 2,444.75 4,530,180 0.26 % 2,547.11 286,917 0.20 % 1,988.88 1,581,326 0.26 % 2,636.31 21 0.07 % 691.34 148,823 0.14 % 1,426.76 nov 9,040,025 0.24 % 2,450.77 24,720 0.19 % 1,857.29 2,519,966 0.24 % 2,403.73 4,742,570 0.27 % 2,666.53 273,844 0.19 % 1,898.26 1,344,218 0.22 % 2,241.01 40 0.13 % 1,316.83 134,667 0.13 % 1,291.04 nos 8,982,081 0.24 % 2,435.06 41,122 0.31 % 3,089.62 2,516,346 0.24 % 2,400.28 4,179,985 0.23 % 2,350.21 501,978 0.35 % 3,479.67 1,552,927 0.26 % 2,588.96 5 0.02 % 164.60 189,718 0.18 % 1,818.82

Table 1.1.4: List of character-level 4-grams in word forms in the Gigafida 2.0 corpus
File at CLARIN.SI: GF2.0-characters-lowercase_forms-4grams-taxonomy-entire.tsv

Columns: Character string | Total absolute frequency of character string | Percentage of total sum of all found character strings | Total relative frequency (per million occurrences) | Absolute frequency [SSJ.T.D] | Percentage [SSJ.T.D] | Relative frequency [SSJ.T.D] | Absolute frequency [SSJ.I] | Percentage [SSJ.I] | Relative frequency [SSJ.I] | Absolute frequency [SSJ.T.P.C] | Percentage [SSJ.T.P.C] | Relative frequency [SSJ.T.P.C] | Absolute frequency [SSJ.T.K.S] | Percentage [SSJ.T.K.S] | Relative frequency [SSJ.T.K.S] | Absolute frequency [SSJ.T.P.R] | Percentage [SSJ.T.P.R] | Relative frequency [SSJ.T.P.R] | Absolute frequency [SSJ.T.K.N] | Percentage [SSJ.T.K.N] | Relative frequency [SSJ.T.K.N] | Absolute frequency [SSJ.T.K.L] | Percentage [SSJ.T.K.L] | Relative frequency [SSJ.T.K.L]

prav 9,195,001 0.32 % 3,198.10 53,132 0.51 % 5,072.12 2,542,039 0.31 % 3,098.06 4,523,111 0.33 % 3,257.92 314,049 0.28 % 2,778.76 1,503,360 0.32 % 3,232.20 63 0.27 % 2,676.64 259,247 0.33 % 3,338.91 pred 9,103,424 0.32 % 3,166.24 39,996 0.38 % 3,818.13 2,810,538 0.34 % 3,425.29 4,563,572 0.33 % 3,287.07 252,776 0.22 % 2,236.61 1,262,075 0.27 % 2,713.44 32 0.14 % 1,359.56 174,435 0.23 % 2,246.59 tudi 8,861,632 0.31 % 3,082.15 23,806 0.23 % 2,272.58 2,484,818 0.30 % 3,028.32 4,393,813 0.32 % 3,164.79 277,683 0.25 % 2,456.99 1,537,688 0.33 % 3,306.01 21 0.09 % 892.21 143,803 0.18 % 1,852.08 nost 7,825,931 0.27 % 2,721.92 36,719 0.35 % 3,505.29 2,207,887 0.27 % 2,690.82 3,686,943 0.27 % 2,655.64 443,645 0.39 % 3,925.45 1,301,290 0.28 % 2,797.75 4 0.02 % 169.95 149,443 0.19 % 1,924.72 anje 7,332,379 0.26 % 2,550.26 36,804 0.35 % 3,513.41 2,029,806 0.25 % 2,473.79 3,498,873 0.25 % 2,520.18 385,207 0.34 % 3,408.38 1,252,350 0.27 % 2,692.53 44 0.19 % 1,869.40 129,295 0.17 % 1,665.22 nega 6,151,002 0.21 % 2,139.37 29,432 0.28 % 2,809.66 1,839,268 0.22 % 2,241.57 2,969,940 0.21 % 2,139.20 248,342 0.22 % 2,197.38 931,043 0.20 % 2,001.73 32 0.14 % 1,359.56 132,945 0.17 % 1,712.23 slov 5,876,485 0.20 % 2,043.89 20,551 0.20 % 1,961.85 1,867,570 0.23 % 2,276.06 3,149,782 0.23 % 2,268.74 112,233 0.10 % 993.06 698,324 0.15 % 1,501.39 0 0 % 0 28,025 0.04 % 360.94 vanj 5,488,484 0.19 % 1,908.94 26,133 0.25 % 2,494.73 1,576,537 0.19 % 1,921.37 2,731,255 0.20 % 1,967.28 261,579 0.23 % 2,314.50 816,120 0.17 % 1,754.65 6 0.03 % 254.92 76,854 0.10 % 989.82 osti 5,418,510 0.19 % 1,884.60 24,334 0.23 % 2,322.99 1,497,583 0.18 % 1,825.15 2,558,277 0.18 % 1,842.68 299,566 0.27 % 2,650.62 930,437 0.20 % 2,000.43 16 0.07 % 679.78
108,297 0.14 % 1,394.79 love 5,074,775 0.18 % 1,765.05 17,574 0.17 % 1,677.66 1,602,553 0.20 % 1,953.08 2,650,266 0.19 % 1,908.94 120,190 0.11 % 1,063.46 636,504 0.14 % 1,368.47 0 0 % 0 47,688 0.06 % 614.19 stav 4,601,164 0.16 % 1,600.32 27,304 0.26 % 2,606.51 1,441,083 0.18 % 1,756.29 2,153,583 0.15 % 1,551.19 175,712 0.15 % 1,554.73 710,185 0.15 % 1,526.89 30 0.13 % 1,274.59 93,267 0.12 % 1,201.21 oven 4,176,692 0.14 % 1,452.69 14,865 0.14 % 1,419.05 1,382,308 0.17 % 1,684.66 2,240,815 0.16 % 1,614.02 60,788 0.05 % 537.86 470,492 0.10 % 1,011.55 0 0 % 0 7,424 0.01 % 95.62 ljen 4,161,075 0.14 % 1,447.26 19,272 0.18 % 1,839.76 1,109,258 0.14 % 1,351.89 1,908,585 0.14 % 1,374.72 214,088 0.19 % 1,894.29 778,635 0.17 % 1,674.05 39 0.17 % 1,656.97 131,198 0.17 % 1,689.73 drug 4,158,996 0.14 % 1,446.53 26,229 0.25 % 2,503.89 1,164,632 0.14 % 1,419.37 1,959,542 0.14 % 1,411.43 186,676 0.17 % 1,651.74 687,247 0.15 % 1,477.57 19 0.08 % 807.24 134,651 0.17 % 1,734.21 ovan 4,153,096 0.14 % 1,444.48 19,399 0.18 % 1,851.88 1,191,012 0.14 % 1,451.52 2,145,989 0.15 % 1,545.72 161,274 0.14 % 1,426.98 581,017 0.12 % 1,249.18 0 0 % 0 54,405 0.07 % 700.70 anja 4,148,579 0.14 % 1,442.91 18,753 0.18 % 1,790.21 1,219,763 0.15 % 1,486.56 1,973,375 0.14 % 1,421.39 215,747 0.19 % 1,908.97 645,803 0.14 % 1,388.47 14 0.06 % 594.81 75,124 0.10 % 967.54 tako 4,120,708 0.14 % 1,433.22 14,351 0.14 % 1,369.99 1,030,417 0.13 % 1,255.80 1,920,034 0.14 % 1,382.97 166,017 0.15 % 1,468.95 790,846 0.17 % 1,700.31 47 0.20 % 1,996.86 198,996 0.26 % 2,562.92 ater 4,014,439 0.14 % 1,396.25 20,342 0.19 % 1,941.90 1,114,784 0.14 % 1,358.62 1,869,698 0.14 % 1,346.71 196,648 0.17 % 1,739.98 715,507 0.15 % 1,538.33 8 0.03 % 339.89 97,452 0.13 % 1,255.11 ravi 3,940,650 0.14 % 1,370.59 21,004 0.20 % 2,005.10 1,078,765 0.13 % 1,314.72 1,937,789 0.14 % 1,395.76 147,309 0.13 % 1,303.42 651,866 0.14 % 1,401.50 36 0.15 % 1,529.51 103,881 0.13 % 1,337.91 svoj 3,859,675 0.13 % 1,342.43 11,554 0.11 % 1,102.98 1,035,947 
0.13 % 1,262.54 1,788,004 0.13 % 1,287.87 163,859 0.14 % 1,449.85 705,349 0.15 % 1,516.49 3 0.01 % 127.46 154,959 0.20 % 1,995.76 osta 3,790,542 0.13 % 1,318.38 10,358 0.10 % 988.80 1,140,123 0.14 % 1,389.50 1,730,111 0.12 % 1,246.17 154,426 0.14 % 1,366.39 636,806 0.14 % 1,369.12 51 0.22 % 2,166.80 118,667 0.15 % 1,528.34 kate 3,754,185 0.13 % 1,305.74 18,116 0.17 % 1,729.40 1,055,320 0.13 % 1,286.15 1,767,133 0.13 % 1,272.84 177,420 0.16 % 1,569.84 649,872 0.14 % 1,397.22 8 0.03 % 339.89 86,316 0.11 % 1,111.69 acij 3,657,495 0.13 % 1,272.11 16,281 0.15 % 1,554.23 1,137,247 0.14 % 1,386 1,723,055 0.12 % 1,241.09 182,234 0.16 % 1,612.44 571,549 0.12 % 1,228.82 3 0.01 % 127.46 27,126 0.04 % 349.36 avlj 3,628,706 0.13 % 1,262.09 22,801 0.22 % 2,176.64 1,069,682 0.13 % 1,303.65 1,695,216 0.12 % 1,221.04 159,543 0.14 % 1,411.67 610,831 0.13 % 1,313.28 11 0.05 % 467.35 70,622 0.09 % 909.56 lahk 3,505,684 0.12 % 1,219.31 17,427 0.17 % 1,663.63 884,023 0.11 % 1,077.39 1,474,806 0.11 % 1,062.28 206,698 0.18 % 1,828.90 799,742 0.17 % 1,719.43 42 0.18 % 1,784.42 122,946 0.16 % 1,583.45 bolj 3,499,899 0.12 % 1,217.29 5,989 0.06 % 571.73 947,093 0.12 % 1,154.25 1,666,647 0.12 % 1,200.46 119,682 0.11 % 1,058.97 674,760 0.14 % 1,450.72 20 0.09 % 849.73 85,708 0.11 % 1,103.86 oval 3,443,899 0.12 % 1,197.82 8,776 0.08 % 837.78 987,981 0.12 % 1,204.08 1,743,194 0.13 % 1,255.59 100,492 0.09 % 889.17 501,435 0.11 % 1,078.08 6 0.03 % 254.92 102,015 0.13 % 1,313.88 ahko 3,426,117 0.12 % 1,191.63 17,127 0.16 % 1,634.99 865,448 0.10 % 1,054.75 1,441,947 0.10 % 1,038.61 202,129 0.18 % 1,788.47 780,467 0.17 % 1,677.99 42 0.18 % 1,784.42 118,957 0.15 % 1,532.08 kega 3,298,581 0.12 % 1,147.27 10,050 0.10 % 959.40 967,449 0.12 % 1,179.06 1,730,227 0.12 % 1,246.25 105,269 0.09 % 931.44 427,112 0.09 % 918.28 6 0.03 % 254.92 58,468 0.07 % 753.02 pove 3,215,883 0.11 % 1,118.51 8,752 0.08 % 835.49 988,988 0.12 % 1,205.31 1,539,560 0.11 % 1,108.92 128,550 0.11 % 1,137.43 453,411 0.10 % 974.83 0 0 % 
0 96,622 0.12 % 1,244.42 nski 3,209,534 0.11 % 1,116.30 9,663 0.09 % 922.46 935,923 0.11 % 1,140.64 1,696,409 0.12 % 1,221.90 89,752 0.08 % 794.14 443,320 0.10 % 953.13 3 0.01 % 127.46 34,464 0.04 % 443.87 reds 3,156,120 0.11 % 1,097.72 10,746 0.10 % 1,025.84 1,053,247 0.13 % 1,283.62 1,649,536 0.12 % 1,188.13 63,833 0.06 % 564.81 351,049 0.07 % 754.75 0 0 % 0 27,709 0.04 % 356.87 CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 10 File at CLARIN.SI 1.1.5 List of character-level 5-grams in word forms in the Gigafida 2.0 corpusGF2.0-characters-lowercase_forms- 5grams-taxonomy-entire.tsvCharacter string Total absolute frequency of character string Percentage of total sum of all found character strings Total relative frequency (per million occurrences) Absolute frequency [SSJ.T.D] Percentage [SSJ.T.D] Relative frequency [SSJ.T.D] Absolute frequency [SSJ.I] Percentage [SSJ.I] Relative frequency [SSJ.I] Absolute frequency [SSJ.T.P.C] Percentage [SSJ.T.P.C] Relative frequency [SSJ.T.P.C] Absolute frequency [SSJ.T.K.S] Percentage [SSJ.T.K.S] Relative frequency [SSJ.T.K.S] Absolute frequency [SSJ.T.P.R] Percentage [SSJ.T.P.R] Relative frequency [SSJ.T.P.R] Absolute frequency [SSJ.T.K.N] Percentage [SSJ.T.K.N] Relative frequency [SSJ.T.K.N] Absolute frequency [SSJ.T.K.L] Percentage [SSJ.T.K.L] Relative frequency [SSJ.T.K.L] slove 4,289,953 0.20 % 1,994.84 14,933 0.19 % 1,881.52 1,414,480 0.23 % 2,294.14 2,305,579 0.22 % 2,216.55 62,945 0.07 % 739.47 482,454 0.14 % 1,394.43 0 0 % 0 9,562 0.02 % 174.72 loven 4,148,231 0.19 % 1,928.94 14,794 0.19 % 1,864.01 1,375,228 0.22 % 2,230.48 2,226,486 0.21 % 2,140.51 59,366 0.07 % 697.42 465,876 0.14 % 1,346.51 0 0 % 0 6,481 0.01 % 118.42 nosti 3,785,650 0.18 % 1,760.34 19,213 0.24 % 2,420.79 1,081,678 0.17 % 1,754.37 1,803,015 0.17 % 1,733.39 220,256 0.26 % 2,587.53 599,561 0.17 % 1,732.90 3 0.02 % 173 61,924 0.11 % 1,131.48 kater 3,556,404 0.17 % 1,653.74 17,421 0.22 % 2,195 1,003,431 0.16 % 1,627.46 
1,667,633 0.16 % 1,603.24 168,858 0.20 % 1,983.72 617,358 0.18 % 1,784.34 8 0.05 % 461.33 81,695 0.15 % 1,492.74 pravi 3,465,573 0.16 % 1,611.50 19,542 0.25 % 2,462.24 977,900 0.16 % 1,586.05 1,710,424 0.16 % 1,644.37 122,507 0.14 % 1,439.19 543,865 0.16 % 1,571.92 35 0.20 % 2,018.34 91,300 0.17 % 1,668.25 lahko 3,422,712 0.16 % 1,591.57 17,124 0.22 % 2,157.58 864,237 0.14 % 1,401.70 1,439,972 0.14 % 1,384.37 202,074 0.24 % 2,373.93 780,328 0.23 % 2,255.37 42 0.24 % 2,422.01 118,935 0.22 % 2,173.20 ovanj 3,154,606 0.15 % 1,466.90 14,217 0.18 % 1,791.31 925,580 0.15 % 1,501.20 1,615,789 0.15 % 1,553.39 122,270 0.14 % 1,436.41 437,719 0.13 % 1,265.13 0 0 % 0 39,031 0.07 % 713.18 preds 2,771,033 0.13 % 1,288.54 6,303 0.08 % 794.16 924,207 0.15 % 1,498.97 1,458,594 0.14 % 1,402.27 49,822 0.06 % 585.30 306,080 0.09 % 884.66 0 0 % 0 26,027 0.05 % 475.57 velik 2,724,218 0.13 % 1,266.77 6,039 0.08 % 760.90 706,042 0.12 % 1,145.13 1,288,868 0.12 % 1,239.10 109,438 0.13 % 1,285.66 540,373 0.16 % 1,561.83 5 0.03 % 288.33 73,453 0.13 % 1,342.14 držav 2,523,846 0.12 % 1,173.60 18,372 0.23 % 2,314.83 869,907 0.14 % 1,410.90 1,335,559 0.13 % 1,283.98 58,961 0.07 % 692.66 231,739 0.07 % 669.79 0 0 % 0 9,308 0.02 % 170.08 vanje 2,425,451 0.11 % 1,127.84 12,913 0.16 % 1,627 675,234 0.11 % 1,095.16 1,182,588 0.11 % 1,136.92 129,243 0.15 % 1,518.33 392,806 0.11 % 1,135.32 0 0 % 0 32,667 0.06 % 596.90 avlja 2,399,343 0.11 % 1,115.70 17,918 0.23 % 2,257.62 728,217 0.12 % 1,181.09 1,116,010 0.11 % 1,072.91 107,233 0.13 % 1,259.76 383,918 0.11 % 1,109.63 1 0.01 % 57.67 46,046 0.08 % 841.36 govor 2,291,319 0.11 % 1,065.47 8,623 0.11 % 1,086.48 682,023 0.11 % 1,106.17 1,082,232 0.10 % 1,040.44 98,580 0.12 % 1,158.10 314,814 0.09 % 909.90 0 0 % 0 105,047 0.19 % 1,919.43 vljen 2,262,086 0.10 % 1,051.88 8,882 0.11 % 1,119.11 599,530 0.10 % 972.38 1,033,035 0.10 % 993.14 122,831 0.14 % 1,443 434,764 0.13 % 1,256.59 10 0.06 % 576.67 63,034 0.12 % 1,151.77 skega 2,237,428 0.10 % 1,040.41 6,892 
0.09 % 868.37 688,764 0.11 % 1,117.11 1,195,705 0.12 % 1,149.53 61,974 0.07 % 728.06 263,543 0.08 % 761.71 2 0.01 % 115.33 20,548 0.04 % 375.46 ateri 2,122,272 0.10 % 986.86 10,616 0.13 % 1,337.59 588,680 0.10 % 954.78 995,180 0.10 % 956.75 102,779 0.12 % 1,207.43 379,116 0.11 % 1,095.75 8 0.05 % 461.33 45,893 0.08 % 838.56 vensk 1,953,511 0.09 % 908.39 3,104 0.04 % 391.10 599,948 0.10 % 973.05 1,097,962 0.11 % 1,055.56 31,088 0.04 % 365.22 218,074 0.06 % 630.29 0 0 % 0 3,335 0.01 % 60.94 ovens 1,952,096 0.09 % 907.73 3,113 0.04 % 392.23 599,388 0.10 % 972.15 1,097,403 0.11 % 1,055.03 31,216 0.04 % 366.72 217,629 0.06 % 629.01 0 0 % 0 3,347 0.01 % 61.16 drugi 1,915,559 0.09 % 890.74 10,022 0.13 % 1,262.75 570,703 0.09 % 925.62 905,911 0.09 % 870.93 82,042 0.10 % 963.82 301,385 0.09 % 871.09 6 0.04 % 346 45,490 0.08 % 831.20 redst 1,890,474 0.09 % 879.08 7,769 0.10 % 978.87 600,415 0.10 % 973.81 949,321 0.09 % 912.66 55,367 0.07 % 650.44 256,518 0.07 % 741.41 0 0 % 0 21,084 0.04 % 385.25 posta 1,882,986 0.09 % 875.60 5,536 0.07 % 697.52 562,639 0.09 % 912.54 832,081 0.08 % 799.95 92,969 0.11 % 1,092.18 330,211 0.10 % 954.40 21 0.12 % 1,211 59,529 0.11 % 1,087.72 stran 1,870,456 0.09 % 869.77 7,709 0.10 % 971.31 584,220 0.10 % 947.55 831,839 0.08 % 799.72 80,322 0.09 % 943.61 316,297 0.09 % 914.19 77 0.44 % 4,440.34 49,992 0.09 % 913.46 poved 1,869,438 0.09 % 869.30 4,154 0.05 % 523.39 625,743 0.10 % 1,014.89 928,266 0.09 % 892.42 48,124 0.06 % 565.35 197,473 0.06 % 570.75 0 0 % 0 65,678 0.12 % 1,200.08 aradi 1,840,602 0.09 % 855.89 5,218 0.07 % 657.45 579,691 0.09 % 940.20 878,888 0.08 % 844.95 57,155 0.07 % 671.45 282,834 0.08 % 817.47 22 0.13 % 1,268.67 36,794 0.07 % 672.30 oveni 1,796,082 0.08 % 835.18 11,099 0.14 % 1,398.45 655,044 0.11 % 1,062.41 912,306 0.09 % 877.08 18,779 0.02 % 220.61 197,344 0.06 % 570.38 0 0 % 0 1,510 0 % 27.59 zarad 1,793,203 0.08 % 833.85 5,125 0.07 % 645.74 570,232 0.09 % 924.86 862,002 0.08 % 828.71 53,033 0.06 % 623.02 267,070 0.08 % 
771.91 0 0 % 0 35,741 0.07 % 653.06 ljubl 1,781,667 0.08 % 828.48 1,650 0.02 % 207.90 601,311 0.10 % 975.27 928,607 0.09 % 892.75 30,674 0.04 % 360.35 206,003 0.06 % 595.41 0 0 % 0 13,422 0.03 % 245.25 jublj 1,781,559 0.08 % 828.43 1,649 0.02 % 207.77 601,266 0.10 % 975.19 928,550 0.09 % 892.69 30,667 0.04 % 360.27 206,004 0.06 % 595.41 0 0 % 0 13,423 0.03 % 245.27 venij 1,781,239 0.08 % 828.28 11,087 0.14 % 1,396.93 649,831 0.10 % 1,053.96 903,419 0.09 % 868.53 18,207 0.02 % 213.89 196,348 0.06 % 567.50 0 0 % 0 2,347 0 % 42.88 njego 1,768,610 0.08 % 822.41 5,170 0.07 % 651.41 516,196 0.08 % 837.22 783,459 0.07 % 753.20 78,204 0.09 % 918.73 274,803 0.08 % 794.26 0 0 % 0 110,778 0.20 % 2,024.15 jegov 1,766,197 0.08 % 821.29 5,167 0.07 % 651.03 515,698 0.08 % 836.41 781,981 0.07 % 751.78 78,045 0.09 % 916.86 274,580 0.08 % 793.61 0 0 % 0 110,726 0.20 % 2,023.20 nekaj 1,746,509 0.08 % 812.13 3,397 0.04 % 428.01 422,782 0.07 % 685.71 823,115 0.08 % 791.33 50,702 0.06 % 595.64 349,695 0.10 % 1,010.72 17 0.10 % 980.34 96,801 0.18 % 1,768.76 CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 11 File at CLARIN.SI 1.1.6 List of character-level 1-grams in lemmas in the Gigafida 2.0 corpusGF2.0-characters-lemmas- 1grams-taxonomy-entire.tsvCharacter string Character string (lower case) Total absolute frequency of character string Percentage of total sum of all found character strings Total relative frequency (per million occurrences) Absolute frequency [SSJ.T.D] Percentage [SSJ.T.D] Relative frequency [SSJ.T.D] Absolute frequency [SSJ.I] Percentage [SSJ.I] Relative frequency [SSJ.I] Absolute frequency [SSJ.T.P.C] Percentage [SSJ.T.P.C] Relative frequency [SSJ.T.P.C] Absolute frequency [SSJ.T.K.S] Percentage [SSJ.T.K.S] Relative frequency [SSJ.T.K.S] Absolute frequency [SSJ.T.P.R] Percentage [SSJ.T.P.R] Relative frequency [SSJ.T.P.R] Absolute frequency [SSJ.T.K.N] Percentage [SSJ.T.K.N] Relative frequency [SSJ.T.K.N] Absolute frequency [SSJ.T.K.L] 
Percentage [SSJ.T.K.L] Relative frequency [SSJ.T.K.L] i i 687,633,789 11.80 % 118,024.76 2,239,182 10.88 % 108,784.18 194,368,161 11.79 % 117,884.14 325,911,711 11.65 % 116,550.86 25,616,770 11.44 % 114,414.76 112,230,783 11.80 % 117,990.90 5,385 11.60 % 115,986.04 27,261,797 14.71 % 147,070.69 a a 584,847,838 10.04 % 100,382.69 2,183,731 10.61 % 106,090.26 165,739,386 10.05 % 100,520.81 280,076,814 10.02 % 100,159.62 22,188,020 9.91 % 99,100.59 96,376,347 10.13 % 101,322.75 5,273 11.36 % 113,573.71 18,278,267 9.86 % 98,606.75 e e 519,030,684 8.91 % 89,085.90 1,870,527 9.09 % 90,874.15 146,562,571 8.89 % 88,890.09 248,693,373 8.89 % 88,936.44 20,393,690 9.11 % 91,086.39 86,023,229 9.04 % 90,438.27 3,801 8.19 % 81,868.70 15,483,493 8.35 % 83,529.64 o o 451,688,905 7.75 % 77,527.42 1,683,270 8.18 % 81,776.81 127,205,896 7.71 % 77,150.28 216,560,671 7.75 % 77,445.31 17,980,455 8.03 % 80,307.92 74,274,859 7.81 % 78,086.93 3,842 8.28 % 82,751.79 13,979,912 7.54 % 75,418.19 t t 446,141,982 7.66 % 76,575.35 1,468,601 7.13 % 71,347.73 126,185,469 7.65 % 76,531.39 210,702,359 7.54 % 75,350.29 16,864,325 7.53 % 75,322.83 73,394,074 7.72 % 77,160.94 3,431 7.39 % 73,899.37 17,523,723 9.45 % 94,536.18 n n 393,280,828 6.75 % 67,502.32 1,445,458 7.02 % 70,223.40 110,437,902 6.70 % 66,980.50 188,789,133 6.75 % 67,513.79 15,819,334 7.07 % 70,655.49 65,102,856 6.84 % 68,444.18 3,469 7.47 % 74,717.84 11,682,676 6.30 % 63,025.16 r r 311,066,334 5.34 % 53,391.11 1,106,294 5.38 % 53,746.10 88,448,249 5.36 % 53,643.80 150,869,327 5.39 % 53,953.11 11,753,804 5.25 % 52,497.20 50,138,642 5.27 % 52,711.95 2,491 5.37 % 53,652.97 8,747,527 4.72 % 47,190.76 s s 235,095,551 4.04 % 40,351.56 848,677 4.12 % 41,230.52 66,286,281 4.02 % 40,202.58 111,907,033 4.00 % 40,019.61 9,584,257 4.28 % 42,807.13 38,873,378 4.09 % 40,868.51 1,730 3.73 % 37,262 7,594,195 4.10 % 40,968.81 v v 234,791,634 4.03 % 40,299.40 875,329 4.25 % 42,525.33 67,074,549 4.07 % 40,680.66 114,184,394 4.08 % 40,834.03 9,049,091 
4.04 % 40,416.87 37,311,247 3.92 % 39,226.20 1,276 2.75 % 27,483.42 6,295,748 3.40 % 33,964.01 k k 212,342,897 3.65 % 36,446.32 741,936 3.60 % 36,044.82 59,108,179 3.58 % 35,849.07 102,332,531 3.66 % 36,595.63 8,117,233 3.62 % 36,254.82 35,500,317 3.73 % 37,322.33 1,742 3.75 % 37,520.46 6,540,959 3.53 % 35,286.86 d d 203,985,455 3.50 % 35,011.86 775,113 3.77 % 37,656.62 58,814,274 3.57 % 35,670.81 98,920,880 3.54 % 35,375.57 7,342,530 3.28 % 32,794.68 32,096,198 3.37 % 33,743.50 1,507 3.25 % 32,458.86 6,034,953 3.26 % 32,557.09 l l 197,997,076 3.40 % 33,984.02 750,994 3.65 % 36,484.87 54,725,972 3.32 % 33,191.26 95,266,334 3.41 % 34,068.65 7,940,224 3.55 % 35,464.22 33,761,217 3.55 % 35,493.97 1,877 4.04 % 40,428.19 5,550,458 2.99 % 29,943.36 p p 195,940,608 3.36 % 33,631.05 759,062 3.69 % 36,876.83 56,034,847 3.40 % 33,985.09 93,030,386 3.33 % 33,269.05 7,570,028 3.38 % 33,810.78 32,533,997 3.42 % 34,203.77 2,056 4.43 % 44,283.62 6,010,232 3.24 % 32,423.72 j j 168,590,744 2.89 % 28,936.74 601,314 2.92 % 29,213.10 47,195,377 2.86 % 28,623.96 81,761,294 2.92 % 29,239.05 6,604,886 2.95 % 29,500.07 27,417,738 2.88 % 28,824.92 1,025 2.21 % 22,077.19 5,009,110 2.70 % 27,022.92 b b 165,893,255 2.85 % 28,473.75 514,527 2.50 % 24,996.81 46,981,379 2.85 % 28,494.17 78,701,987 2.81 % 28,145 5,858,993 2.62 % 26,168.61 26,581,408 2.79 % 27,945.67 663 1.43 % 14,280.18 7,254,298 3.91 % 39,135.15 z z 133,276,183 2.29 % 22,875.39 543,569 2.64 % 26,407.73 37,600,818 2.28 % 22,804.87 62,988,983 2.25 % 22,525.79 5,414,030 2.42 % 24,181.23 22,271,953 2.34 % 23,415.03 1,326 2.86 % 28,560.35 4,455,504 2.40 % 24,036.35 m m 112,577,850 1.93 % 19,322.75 375,200 1.82 % 18,228.01 30,916,108 1.88 % 18,750.60 53,525,589 1.91 % 19,141.54 4,577,822 2.04 % 20,446.39 19,516,140 2.05 % 20,517.78 1,156 2.49 % 24,898.77 3,665,835 1.98 % 19,776.28 u u 83,769,400 1.44 % 14,378.09 284,489 1.38 % 13,821.08 23,916,980 1.45 % 14,505.63 40,305,457 1.44 % 14,413.83 3,367,752 1.50 % 15,041.73 13,586,110 1.43 
% 14,283.40 764 1.65 % 16,455.59 2,307,848 1.25 % 12,450.27 č č 79,564,751 1.37 % 13,656.41 312,636 1.52 % 15,188.52 21,530,693 1.31 % 13,058.35 38,258,464 1.37 % 13,681.79 3,213,412 1.44 % 14,352.39 13,364,609 1.41 % 14,050.53 746 1.61 % 16,067.89 2,884,191 1.56 % 15,559.50 g g 62,232,489 1.07 % 10,681.52 237,577 1.15 % 11,541.99 17,581,801 1.07 % 10,663.35 29,829,580 1.07 % 10,667.50 2,497,786 1.12 % 11,156.11 10,059,456 1.06 % 10,575.75 378 0.81 % 8,141.64 2,025,911 1.09 % 10,929.29 c c 49,151,670 0.84 % 8,436.34 159,997 0.78 % 7,772.99 14,240,681 0.86 % 8,636.96 24,030,367 0.86 % 8,593.62 1,841,875 0.82 % 8,226.55 7,756,344 0.81 % 8,154.43 480 1.03 % 10,338.59 1,121,926 0.60 % 6,052.51 š š 47,324,031 0.81 % 8,122.65 148,917 0.72 % 7,234.70 12,902,047 0.78 % 7,825.08 23,837,220 0.85 % 8,524.54 1,623,771 0.72 % 7,252.41 7,440,091 0.78 % 7,821.94 403 0.87 % 8,680.11 1,371,582 0.74 % 7,399.35 ž ž 33,853,882 0.58 % 5,810.65 117,811 0.57 % 5,723.51 9,465,390 0.57 % 5,740.75 16,376,208 0.59 % 5,856.37 1,275,112 0.57 % 5,695.16 5,548,471 0.58 % 5,833.24 329 0.71 % 7,086.24 1,070,561 0.58 % 5,775.41 h h 27,021,526 0.46 % 4,637.95 87,462 0.42 % 4,249.09 7,113,475 0.43 % 4,314.32 12,192,653 0.44 % 4,360.27 1,270,319 0.57 % 5,673.75 5,195,783 0.55 % 5,462.45 417 0.90 % 8,981.65 1,161,417 0.63 % 6,265.56 0 0 13,996,365 0.24 % 2,402.32 38,824 0.19 % 1,886.15 4,400,764 0.27 % 2,669.06 7,271,948 0.26 % 2,600.56 376,635 0.17 % 1,682.20 1,890,153 0.20 % 1,987.16 207 0.45 % 4,458.52 17,834 0.01 % 96.21 1 1 12,359,065 0.21 % 2,121.30 46,627 0.23 % 2,265.24 3,822,233 0.23 % 2,318.18 6,491,455 0.23 % 2,321.44 559,225 0.25 % 2,497.72 1,406,736 0.15 % 1,478.94 147 0.32 % 3,166.19 32,642 0.02 % 176.10 f f 11,568,306 0.20 % 1,985.57 32,037 0.16 % 1,556.43 3,511,776 0.21 % 2,129.89 5,317,515 0.19 % 1,901.62 456,079 0.20 % 2,037.03 2,034,015 0.21 % 2,138.41 46 0.10 % 990.78 216,838 0.12 % 1,169.79 , , 9,937,282 0.17 % 1,705.62 50,481 0.24 % 2,452.47 2,446,480 0.15 % 1,483.79 5,864,700 
0.21 % 2,097.30 422,773 0.19 % 1,888.27 1,117,136 0.12 % 1,174.47 35 0.07 % 753.86 35,677 0.02 % 192.47 S s 9,493,244 0.16 % 1,629.41 20,481 0.10 % 995.01 3,204,962 0.19 % 1,943.81 4,755,064 0.17 % 1,700.48 188,674 0.08 % 842.69 1,216,088 0.13 % 1,278.50 0 0 % 0 107,975 0.06 % 582.50 2 2 9,194,260 0.16 % 1,578.09 29,570 0.14 % 1,436.57 3,067,954 0.19 % 1,860.71 4,727,204 0.17 % 1,690.52 312,410 0.14 % 1,395.35 1,042,782 0.11 % 1,096.30 108 0.23 % 2,326.18 14,232 0.01 % 76.78 M m 7,007,990 0.12 % 1,202.84 6,999 0.03 % 340.03 2,186,393 0.13 % 1,326.05 3,575,407 0.13 % 1,278.62 148,702 0.07 % 664.16 958,641 0.10 % 1,007.84 1 0 % 21.54 131,847 0.07 % 711.28 B b 5,766,382 0.10 % 989.74 7,220 0.04 % 350.76 1,801,966 0.11 % 1,092.89 3,036,096 0.11 % 1,085.75 119,246 0.05 % 532.60 709,691 0.07 % 746.12 0 0 % 0 92,163 0.05 % 497.20 CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 12 File at CLARIN.SI 1.1.7 List of character-level 2-grams in lemmas in the Gigafida 2.0 corpusGF2.0-characters-lemmas- 2grams-taxonomy-entire.tsvCharacter string Character string (lower case) Total absolute frequency of character string Percentage of total sum of all found character strings Total relative frequency (per million occurrences) Absolute frequency [SSJ.T.D] Percentage [SSJ.T.D] Relative frequency [SSJ.T.D] Absolute frequency [SSJ.I] Percentage [SSJ.I] Relative frequency [SSJ.I] Absolute frequency [SSJ.T.P.C] Percentage [SSJ.T.P.C] Relative frequency [SSJ.T.P.C] Absolute frequency [SSJ.T.K.S] Percentage [SSJ.T.K.S] Relative frequency [SSJ.T.K.S] Absolute frequency [SSJ.T.P.R] Percentage [SSJ.T.P.R] Relative frequency [SSJ.T.P.R] Absolute frequency [SSJ.T.K.N] Percentage [SSJ.T.K.N] Relative frequency [SSJ.T.K.N] Absolute frequency [SSJ.T.K.L] Percentage [SSJ.T.K.L] Relative frequency [SSJ.T.K.L] ti ti 221,388,074 4.72 % 47,189.30 669,005 4.02 % 40,234.02 61,996,299 4.66 % 46,583.36 102,511,704 4.55 % 45,488.31 8,088,568 4.47 % 44,698.29 36,652,251 4.80 % 
47,988.98 1,960 5.34 % 53,375.45 11,468,287 7.87 % 78,738.94 it it 145,220,811 3.10 % 30,954.10 413,144 2.48 % 24,846.52 41,565,947 3.12 % 31,232.21 67,989,266 3.02 % 30,169.40 4,915,795 2.72 % 27,165.21 22,943,622 3.00 % 30,040.20 928 2.53 % 25,271.64 7,392,109 5.08 % 50,752.73 en en 105,010,102 2.24 % 22,383.11 443,543 2.67 % 26,674.72 29,740,403 2.23 % 22,346.62 50,298,363 2.23 % 22,319.28 4,467,285 2.47 % 24,686.69 17,598,156 2.30 % 23,041.36 643 1.75 % 17,510.42 2,461,709 1.69 % 16,901.60 bi bi 100,643,304 2.15 % 21,452.32 260,993 1.57 % 15,696.14 28,708,156 2.16 % 21,571 47,508,634 2.11 % 21,081.37 3,176,616 1.75 % 17,554.32 15,677,874 2.05 % 20,527.12 197 0.54 % 5,364.78 5,310,834 3.65 % 36,463.11 na na 73,688,017 1.57 % 15,706.75 241,509 1.45 % 14,524.37 20,979,984 1.58 % 15,764.14 35,554,721 1.58 % 15,776.97 2,849,015 1.57 % 15,743.96 12,024,196 1.57 % 15,743.34 756 2.06 % 20,587.67 2,037,836 1.40 % 13,991.37 pr pr 71,765,288 1.53 % 15,296.91 299,439 1.80 % 18,008.29 20,529,209 1.54 % 15,425.43 34,628,515 1.54 % 15,365.98 2,636,319 1.46 % 14,568.58 11,667,066 1.53 % 15,275.75 586 1.60 % 15,958.17 2,004,154 1.38 % 13,760.12 at at 71,278,266 1.52 % 15,193.10 255,544 1.54 % 15,368.44 19,922,208 1.50 % 14,969.33 32,913,353 1.46 % 14,604.90 2,967,025 1.64 % 16,396.10 12,374,325 1.62 % 16,201.77 866 2.36 % 23,583.24 2,844,945 1.95 % 19,532.82 ra ra 70,620,330 1.50 % 15,052.86 267,862 1.61 % 16,109.24 19,428,515 1.46 % 14,598.38 33,938,902 1.51 % 15,059.97 2,946,383 1.63 % 16,282.03 11,993,118 1.57 % 15,702.65 639 1.74 % 17,401.49 2,044,911 1.40 % 14,039.95 po po 66,990,706 1.43 % 14,279.20 262,095 1.58 % 15,762.42 19,534,143 1.47 % 14,677.75 31,289,651 1.39 % 13,884.40 2,753,326 1.52 % 15,215.17 11,061,121 1.45 % 14,482.38 723 1.97 % 19,689.01 2,089,647 1.44 % 14,347.09 st st 66,983,982 1.43 % 14,277.77 270,347 1.63 % 16,258.69 19,141,453 1.44 % 14,382.68 32,064,174 1.42 % 14,228.08 2,909,509 1.61 % 16,078.26 10,725,402 1.40 % 14,042.82 586 1.60 % 15,958.17 
1,872,511 1.29 % 12,856.28 re re 63,131,954 1.35 % 13,456.70 248,880 1.50 % 14,967.67 18,115,863 1.36 % 13,612.07 30,442,818 1.35 % 13,508.63 2,339,827 1.29 % 12,930.13 10,121,891 1.32 % 13,252.64 676 1.84 % 18,409.08 1,861,999 1.28 % 12,784.11 in in 57,092,879 1.22 % 12,169.46 214,219 1.29 % 12,883.15 15,528,711 1.17 % 11,668.11 27,522,358 1.22 % 12,212.71 2,443,287 1.35 % 13,501.86 9,583,712 1.25 % 12,548 967 2.63 % 26,333.71 1,799,625 1.24 % 12,355.86 an an 56,512,893 1.21 % 12,045.83 201,236 1.21 % 12,102.35 16,796,448 1.26 % 12,620.67 27,952,661 1.24 % 12,403.65 2,094,686 1.16 % 11,575.46 8,356,620 1.09 % 10,941.37 473 1.29 % 12,880.91 1,110,769 0.76 % 7,626.32 ve ve 52,179,409 1.11 % 11,122.14 199,342 1.20 % 11,988.45 14,950,346 1.12 % 11,233.53 25,165,587 1.12 % 11,166.92 1,967,936 1.09 % 10,875.02 8,444,570 1.11 % 11,056.52 129 0.35 % 3,512.98 1,451,499 1.00 % 9,965.70 je je 51,050,610 1.09 % 10,881.54 212,688 1.28 % 12,791.08 14,121,704 1.06 % 10,610.90 24,531,178 1.09 % 10,885.41 2,364,016 1.31 % 13,063.80 8,484,679 1.11 % 11,109.03 285 0.78 % 7,761.23 1,336,060 0.92 % 9,173.12 te te 49,686,991 1.06 % 10,590.88 175,488 1.05 % 10,553.86 14,639,725 1.10 % 11,000.13 23,727,388 1.05 % 10,528.74 1,892,179 1.05 % 10,456.38 8,091,969 1.06 % 10,594.86 375 1.02 % 10,212.14 1,159,867 0.80 % 7,963.41 ja ja 49,635,269 1.06 % 10,579.86 191,413 1.15 % 11,511.59 14,807,202 1.11 % 11,125.97 23,618,494 1.05 % 10,480.42 1,829,330 1.01 % 10,109.07 7,782,837 1.02 % 10,190.11 310 0.84 % 8,442.04 1,405,683 0.96 % 9,651.14 ov ov 48,687,448 1.04 % 10,377.83 157,862 0.95 % 9,493.83 13,888,961 1.04 % 10,436.02 24,372,754 1.08 % 10,815.11 1,785,468 0.99 % 9,866.69 7,341,407 0.96 % 9,612.14 148 0.40 % 4,030.39 1,140,848 0.78 % 7,832.83 ko ko 47,109,129 1.00 % 10,041.40 170,747 1.03 % 10,268.74 12,802,905 0.96 % 9,619.97 22,254,675 0.99 % 9,875.24 1,913,863 1.06 % 10,576.21 8,260,548 1.08 % 10,815.58 521 1.42 % 14,188.07 1,705,870 1.17 % 11,712.16 et et 46,163,689 0.98 % 9,839.88 
128,974 0.78 % 7,756.51 12,930,991 0.97 % 9,716.21 22,232,547 0.99 % 9,865.42 1,565,174 0.86 % 8,649.32 7,570,679 0.99 % 9,912.33 245 0.67 % 6,671.93 1,735,079 1.19 % 11,912.70 za za 45,990,602 0.98 % 9,802.99 193,383 1.16 % 11,630.07 13,434,510 1.01 % 10,094.55 22,133,077 0.98 % 9,821.28 1,587,486 0.88 % 8,772.62 7,256,985 0.95 % 9,501.61 397 1.08 % 10,811.25 1,384,764 0.95 % 9,507.51 ri ri 44,442,529 0.95 % 9,473.01 150,253 0.90 % 9,036.23 12,535,922 0.94 % 9,419.36 21,408,652 0.95 % 9,499.83 1,727,245 0.95 % 9,544.94 7,315,758 0.96 % 9,578.56 313 0.85 % 8,523.73 1,304,386 0.90 % 8,955.65 ta ta 43,443,739 0.93 % 9,260.12 195,349 1.18 % 11,748.31 12,145,429 0.91 % 9,125.95 20,554,293 0.91 % 9,120.72 1,755,340 0.97 % 9,700.20 7,265,063 0.95 % 9,512.18 364 0.99 % 9,912.58 1,527,901 1.05 % 10,490.26 le le 43,099,284 0.92 % 9,186.70 148,289 0.89 % 8,918.11 12,291,880 0.92 % 9,235.99 20,833,560 0.92 % 9,244.64 1,560,597 0.86 % 8,624.03 7,139,838 0.94 % 9,348.23 230 0.63 % 6,263.45 1,124,890 0.77 % 7,723.27 da da 42,871,238 0.91 % 9,138.09 160,585 0.97 % 9,657.60 12,421,832 0.93 % 9,333.63 20,225,209 0.90 % 8,974.69 1,470,569 0.81 % 8,126.52 6,861,021 0.90 % 8,983.17 509 1.39 % 13,861.28 1,731,513 1.19 % 11,888.22 se se 42,248,075 0.90 % 9,005.26 157,909 0.95 % 9,496.66 11,415,695 0.86 % 8,577.63 19,736,962 0.88 % 8,758.04 1,660,975 0.92 % 9,178.72 7,402,898 0.97 % 9,692.65 283 0.77 % 7,706.76 1,873,353 1.29 % 12,862.06 od od 41,148,169 0.88 % 8,770.81 179,809 1.08 % 10,813.73 12,147,544 0.91 % 9,127.53 19,935,104 0.89 % 8,845.96 1,519,633 0.84 % 8,397.65 6,298,992 0.82 % 8,247.30 468 1.27 % 12,744.75 1,066,619 0.73 % 7,323.19 ki ki 41,004,848 0.87 % 8,740.26 135,433 0.81 % 8,144.95 12,067,344 0.91 % 9,067.27 20,633,209 0.92 % 9,155.73 1,393,277 0.77 % 7,699.40 6,047,643 0.79 % 7,918.21 109 0.30 % 2,968.33 727,833 0.50 % 4,997.15 va va 40,706,839 0.87 % 8,676.74 167,334 1.01 % 10,063.48 11,643,970 0.88 % 8,749.15 19,835,520 0.88 % 8,801.77 1,612,450 0.89 % 8,910.57 
6,499,710 0.85 % 8,510.10 220 0.60 % 5,991.12 947,635 0.65 % 6,506.27 ka ka 39,345,294 0.84 % 8,386.53 131,660 0.79 % 7,918.04 10,699,957 0.80 % 8,039.83 18,453,183 0.82 % 8,188.37 1,502,372 0.83 % 8,302.27 7,007,058 0.92 % 9,174.38 320 0.87 % 8,714.36 1,550,744 1.06 % 10,647.09 ed ed 38,159,777 0.81 % 8,133.83 135,537 0.81 % 8,151.21 11,163,124 0.84 % 8,387.85 18,585,922 0.82 % 8,247.28 1,330,184 0.73 % 7,350.74 5,789,824 0.76 % 7,580.65 169 0.46 % 4,602.27 1,155,017 0.79 % 7,930.11 ni ni 37,741,360 0.80 % 8,044.65 130,848 0.79 % 7,869.21 10,730,149 0.81 % 8,062.52 18,718,205 0.83 % 8,305.97 1,217,361 0.67 % 6,727.27 5,760,518 0.75 % 7,542.28 304 0.83 % 8,278.64 1,183,975 0.81 % 8,128.93 CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 13 File at CLARIN.SI 1.1.8 List of character-level 3-grams in lemmas in the Gigafida 2.0 corpusGF2.0-characters-lemmas- 3grams-taxonomy-entire.tsvCharacter string Character string (lower case) Total absolute frequency of character string Percentage of total sum of all found character strings Total relative frequency (per million occurrences) Absolute frequency [SSJ.T.D] Percentage [SSJ.T.D] Relative frequency [SSJ.T.D] Absolute frequency [SSJ.I] Percentage [SSJ.I] Relative frequency [SSJ.I] Absolute frequency [SSJ.T.P.C] Percentage [SSJ.T.P.C] Relative frequency [SSJ.T.P.C] Absolute frequency [SSJ.T.K.S] Percentage [SSJ.T.K.S] Relative frequency [SSJ.T.K.S] Absolute frequency [SSJ.T.P.R] Percentage [SSJ.T.P.R] Relative frequency [SSJ.T.P.R] Absolute frequency [SSJ.T.K.N] Percentage [SSJ.T.K.N] Relative frequency [SSJ.T.K.N] Absolute frequency [SSJ.T.K.L] Percentage [SSJ.T.K.L] Relative frequency [SSJ.T.K.L] iti iti 130,870,082 3.62 % 36,206.78 357,596 2.77 % 27,731.85 37,281,741 3.62 % 36,204.49 61,177,330 3.52 % 35,190.34 4,321,520 3.08 % 30,809.51 20,617,578 3.52 % 35,209.29 881 3.15 % 31,544.27 7,113,436 6.62 % 66,150.46 bit bit 95,007,023 2.63 % 26,284.83 243,576 1.89 % 18,889.51 27,184,155 2.64 % 
26,398.67 | 44,811,830 | 2.58 % | 25,776.60 | 2,965,296 | 2.11 % | 21,140.55 | 14,623,627 | 2.50 % | 24,973.23 | 179 | 0.64 % | 6,409.11 | 5,178,360 | 4.82 % | 48,155.48 |
| ati | ati | 52,314,493 | 1.45 % | 14,473.43 | 187,241 | 1.45 % | 14,520.69 | 14,618,037 | 1.42 % | 14,195.65 | 23,943,533 | 1.38 % | 13,772.77 | 2,224,515 | 1.59 % | 15,859.28 | 9,048,957 | 1.54 % | 15,453.19 | 731 | 2.62 % | 26,173.51 | 2,291,479 | 2.13 % | 21,309.31 |
| pre | pre | 26,893,554 | 0.74 % | 7,440.42 | 114,484 | 0.89 % | 8,878.32 | 7,901,217 | 0.77 % | 7,672.91 | 12,831,765 | 0.74 % | 7,381.07 | 960,493 | 0.69 % | 6,847.66 | 4,320,312 | 0.74 % | 7,377.93 | 221 | 0.79 % | 7,912.92 | 765,062 | 0.71 % | 7,114.59 |
| nje | nje | 21,691,702 | 0.60 % | 6,001.27 | 97,233 | 0.75 % | 7,540.50 | 6,114,652 | 0.59 % | 5,937.97 | 10,233,127 | 0.59 % | 5,886.29 | 1,092,824 | 0.78 % | 7,791.09 | 3,595,711 | 0.61 % | 6,140.51 | 105 | 0.38 % | 3,759.53 | 558,050 | 0.52 % | 5,189.51 |
| ski | ski | 20,206,804 | 0.56 % | 5,590.45 | 60,790 | 0.47 % | 4,714.31 | 6,098,332 | 0.59 % | 5,922.12 | 10,548,997 | 0.61 % | 6,067.98 | 610,567 | 0.43 % | 4,352.93 | 2,674,751 | 0.46 % | 4,567.76 | 26 | 0.09 % | 930.93 | 213,341 | 0.20 % | 1,983.94 |
| pri | pri | 19,961,157 | 0.55 % | 5,522.49 | 72,212 | 0.56 % | 5,600.10 | 5,483,173 | 0.53 % | 5,324.74 | 9,631,438 | 0.55 % | 5,540.18 | 783,711 | 0.56 % | 5,587.33 | 3,396,672 | 0.58 % | 5,800.60 | 167 | 0.60 % | 5,979.45 | 593,784 | 0.55 % | 5,521.82 |
| ost | ost | 17,940,663 | 0.50 % | 4,963.50 | 71,373 | 0.55 % | 5,535.03 | 5,001,130 | 0.49 % | 4,856.62 | 8,355,266 | 0.48 % | 4,806.11 | 918,648 | 0.66 % | 6,549.34 | 3,152,930 | 0.54 % | 5,384.36 | 133 | 0.48 % | 4,762.08 | 441,183 | 0.41 % | 4,102.72 |
| eti | eti | 17,867,281 | 0.49 % | 4,943.20 | 56,247 | 0.44 % | 4,362 | 4,674,103 | 0.45 % | 4,539.05 | 8,155,597 | 0.47 % | 4,691.25 | 661,950 | 0.47 % | 4,719.25 | 3,212,693 | 0.55 % | 5,486.42 | 140 | 0.50 % | 5,012.71 | 1,106,551 | 1.03 % | 10,290.23 |
| anj | anj | 17,333,154 | 0.48 % | 4,795.43 | 76,033 | 0.59 % | 5,896.42 | 4,943,727 | 0.48 % | 4,800.88 | 8,465,771 | 0.49 % | 4,869.67 | 824,505 | 0.59 % | 5,878.16 | 2,743,048 | 0.47 % | 4,684.39 | 97 | 0.35 % | 3,473.09 | 279,973 | 0.26 % | 2,603.57 |
| ija | ija | 16,786,725 | 0.46 % | 4,644.25 | 61,963 | 0.48 % | 4,805.28 | 5,324,355 | 0.52 % | 5,170.51 | 8,087,046 | 0.47 % | 4,651.82 | 595,778 | 0.42 % | 4,247.49 | 2,491,311 | 0.42 % | 4,254.49 | 38 | 0.14 % | 1,360.59 | 226,234 | 0.21 % | 2,103.83 |
| ven | ven | 15,617,743 | 0.43 % | 4,320.84 | 69,592 | 0.54 % | 5,396.91 | 4,837,108 | 0.47 % | 4,697.34 | 7,872,084 | 0.45 % | 4,528.17 | 505,184 | 0.36 % | 3,601.62 | 2,148,852 | 0.37 % | 3,669.66 | 37 | 0.13 % | 1,324.79 | 184,886 | 0.17 % | 1,719.32 |
| red | red | 14,720,656 | 0.41 % | 4,072.65 | 64,799 | 0.50 % | 5,025.21 | 4,356,787 | 0.42 % | 4,230.90 | 7,401,448 | 0.43 % | 4,257.45 | 487,746 | 0.35 % | 3,477.30 | 2,117,215 | 0.36 % | 3,615.64 | 56 | 0.20 % | 2,005.08 | 292,605 | 0.27 % | 2,721.04 |
| sta | sta | 13,461,223 | 0.37 % | 3,724.21 | 60,173 | 0.47 % | 4,666.46 | 3,867,525 | 0.38 % | 3,755.77 | 6,428,434 | 0.37 % | 3,697.76 | 563,174 | 0.40 % | 4,015.05 | 2,141,254 | 0.37 % | 3,656.69 | 94 | 0.34 % | 3,365.68 | 400,569 | 0.37 % | 3,725.04 |
| rav | rav | 13,134,578 | 0.36 % | 3,633.84 | 69,750 | 0.54 % | 5,409.17 | 3,613,127 | 0.35 % | 3,508.73 | 6,388,835 | 0.37 % | 3,674.98 | 489,562 | 0.35 % | 3,490.24 | 2,202,898 | 0.38 % | 3,761.96 | 75 | 0.27 % | 2,685.38 | 370,331 | 0.34 % | 3,443.84 |
| ove | ove | 13,039,898 | 0.36 % | 3,607.64 | 41,934 | 0.33 % | 3,252.01 | 4,049,322 | 0.39 % | 3,932.32 | 6,512,022 | 0.38 % | 3,745.84 | 433,166 | 0.31 % | 3,088.18 | 1,785,557 | 0.30 % | 3,049.25 | 7 | 0.03 % | 250.64 | 217,890 | 0.20 % | 2,026.24 |
| ova | ova | 12,718,300 | 0.35 % | 3,518.67 | 46,634 | 0.36 % | 3,616.50 | 3,554,610 | 0.34 % | 3,451.90 | 6,380,270 | 0.37 % | 3,670.05 | 491,139 | 0.35 % | 3,501.49 | 1,973,481 | 0.34 % | 3,370.18 | 35 | 0.12 % | 1,253.18 | 272,131 | 0.25 % | 2,530.65 |
| pra | pra | 11,282,615 | 0.31 % | 3,121.47 | 60,229 | 0.47 % | 4,670.81 | 3,063,619 | 0.30 % | 2,975.10 | 5,532,034 | 0.32 % | 3,182.13 | 410,018 | 0.29 % | 2,923.15 | 1,856,760 | 0.32 % | 3,170.85 | 155 | 0.56 % | 5,549.79 | 359,800 | 0.34 % | 3,345.91 |
| rat | rat | 10,843,566 | 0.30 % | 3,000 | 36,164 | 0.28 % | 2,804.55 | 2,978,285 | 0.29 % | 2,892.23 | 5,103,065 | 0.29 % | 2,935.38 | 415,905 | 0.30 % | 2,965.12 | 1,902,069 | 0.33 % | 3,248.22 | 98 | 0.35 % | 3,508.90 | 407,980 | 0.38 % | 3,793.96 |
| let | let | 10,823,273 | 0.30 % | 2,994.39 | 23,281 | 0.18 % | 1,805.46 | 3,354,111 | 0.33 % | 3,257.19 | 5,351,140 | 0.31 % | 3,078.08 | 284,612 | 0.20 % | 2,029.09 | 1,630,076 | 0.28 % | 2,783.73 | 27 | 0.10 % | 966.74 | 180,026 | 0.17 % | 1,674.13 |
| lov | lov | 10,794,671 | 0.30 % | 2,986.48 | 36,708 | 0.28 % | 2,846.73 | 3,226,061 | 0.31 % | 3,132.84 | 5,481,996 | 0.32 % | 3,153.35 | 338,055 | 0.24 % | 2,410.10 | 1,558,980 | 0.27 % | 2,662.32 | 20 | 0.07 % | 716.10 | 152,851 | 0.14 % | 1,421.42 |
| ter | ter | 10,733,174 | 0.30 % | 2,969.46 | 43,645 | 0.34 % | 3,384.70 | 3,169,310 | 0.31 % | 3,077.73 | 5,120,091 | 0.29 % | 2,945.17 | 427,505 | 0.30 % | 3,047.82 | 1,775,993 | 0.30 % | 3,032.92 | 137 | 0.49 % | 4,905.30 | 196,493 | 0.18 % | 1,827.26 |
| ina | ina | 10,728,685 | 0.30 % | 2,968.22 | 39,474 | 0.31 % | 3,061.24 | 3,063,925 | 0.30 % | 2,975.39 | 5,366,242 | 0.31 % | 3,086.76 | 441,145 | 0.32 % | 3,145.06 | 1,627,287 | 0.28 % | 2,778.97 | 133 | 0.48 % | 4,762.08 | 190,479 | 0.18 % | 1,771.33 |
| vat | vat | 10,617,726 | 0.29 % | 2,937.52 | 39,481 | 0.31 % | 3,061.78 | 2,972,414 | 0.29 % | 2,886.53 | 5,007,338 | 0.29 % | 2,880.31 | 482,024 | 0.34 % | 3,436.50 | 1,797,199 | 0.31 % | 3,069.13 | 73 | 0.26 % | 2,613.77 | 319,197 | 0.30 % | 2,968.33 |
| jen | jen | 10,548,059 | 0.29 % | 2,918.25 | 49,955 | 0.39 % | 3,874.05 | 2,798,428 | 0.27 % | 2,717.57 | 4,887,272 | 0.28 % | 2,811.25 | 507,213 | 0.36 % | 3,616.08 | 1,945,017 | 0.33 % | 3,321.57 | 55 | 0.20 % | 1,969.28 | 360,119 | 0.34 % | 3,348.88 |
| čen | čen | 10,513,957 | 0.29 % | 2,908.81 | 46,122 | 0.36 % | 3,576.80 | 2,864,280 | 0.28 % | 2,781.52 | 4,873,952 | 0.28 % | 2,803.59 | 555,459 | 0.40 % | 3,960.05 | 1,943,067 | 0.33 % | 3,318.24 | 93 | 0.33 % | 3,329.87 | 230,984 | 0.21 % | 2,148.01 |
| jat | jat | 10,425,347 | 0.29 % | 2,884.30 | 44,408 | 0.34 % | 3,443.88 | 3,020,711 | 0.29 % | 2,933.43 | 4,705,215 | 0.27 % | 2,706.53 | 448,843 | 0.32 % | 3,199.95 | 1,812,937 | 0.31 % | 3,096.01 | 72 | 0.26 % | 2,577.97 | 393,161 | 0.37 % | 3,656.15 |
| nik | nik | 10,234,923 | 0.28 % | 2,831.61 | 32,251 | 0.25 % | 2,501.09 | 2,753,579 | 0.27 % | 2,674.01 | 5,192,050 | 0.30 % | 2,986.56 | 329,640 | 0.23 % | 2,350.11 | 1,701,057 | 0.29 % | 2,904.95 | 70 | 0.25 % | 2,506.36 | 226,276 | 0.21 % | 2,104.22 |
| sto | sto | 10,212,157 | 0.28 % | 2,825.32 | 38,685 | 0.30 % | 3,000.05 | 3,144,155 | 0.30 % | 3,053.30 | 4,948,916 | 0.28 % | 2,846.71 | 379,283 | 0.27 % | 2,704.03 | 1,452,689 | 0.25 % | 2,480.80 | 68 | 0.24 % | 2,434.75 | 248,361 | 0.23 % | 2,309.60 |
| ica | ica | 9,860,999 | 0.27 % | 2,728.16 | 36,158 | 0.28 % | 2,804.08 | 2,611,628 | 0.25 % | 2,536.17 | 4,801,563 | 0.28 % | 2,761.95 | 351,609 | 0.25 % | 2,506.73 | 1,701,157 | 0.29 % | 2,905.12 | 233 | 0.83 % | 8,342.58 | 358,651 | 0.33 % | 3,335.23 |
| pos | pos | 9,768,625 | 0.27 % | 2,702.61 | 45,219 | 0.35 % | 3,506.77 | 2,831,086 | 0.28 % | 2,749.28 | 4,602,622 | 0.27 % | 2,647.51 | 429,617 | 0.31 % | 3,062.88 | 1,610,195 | 0.28 % | 2,749.78 | 157 | 0.56 % | 5,621.40 | 249,729 | 0.23 % | 2,322.32 |
| udi | udi | 9,630,544 | 0.27 % | 2,664.41 | 25,377 | 0.20 % | 1,968.01 | 2,674,329 | 0.26 % | 2,597.05 | 4,750,767 | 0.27 % | 2,732.73 | 304,779 | 0.22 % | 2,172.87 | 1,688,117 | 0.29 % | 2,882.85 | 28 | 0.10 % | 1,002.54 | 187,147 | 0.17 % | 1,740.35 |

1.1.9 List of character-level 4-grams in lemmas in the Gigafida 2.0 corpus
File at CLARIN.SI: GF2.0-characters-lemmas-4grams-taxonomy-entire.tsv

| Character string | Character string (lower case) | Total absolute frequency of character string | Percentage of total sum of all found character strings | Total relative frequency (per million occurrences) | Absolute frequency [SSJ.T.P.R] | Percentage [SSJ.T.P.R] | Relative frequency [SSJ.T.P.R] | Absolute frequency [SSJ.T.K.S] | Percentage [SSJ.T.K.S] | Relative frequency [SSJ.T.K.S] | Absolute frequency [SSJ.T.K.L] | Percentage [SSJ.T.K.L] | Relative frequency [SSJ.T.K.L] | Absolute frequency [SSJ.T.K.N] | Percentage [SSJ.T.K.N] | Relative frequency [SSJ.T.K.N] | Absolute frequency [SSJ.I] | Percentage [SSJ.I] | Relative frequency [SSJ.I] | Absolute frequency [SSJ.T.P.C] | Percentage [SSJ.T.P.C] | Relative frequency [SSJ.T.P.C] | Absolute frequency [SSJ.T.D] | Percentage [SSJ.T.D] | Relative frequency [SSJ.T.D] |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| biti | biti | 94,280,785 | 3.44 % | 34,379.50 | 14,504,498 | 3.28 % | 32,783.46 | 2,937,647 | 2.74 % | 27,360.11 | 5,162,567 | 6.65 % | 66,509.05 | 177 | 0.84 % | 8,387.04 | 26,965,095 | 3.44 % | 34,373.24 | 44,470,061 | 3.37 % | 33,677.12 | 240,740 | 2.42 % | 24,204.67 |
| anje | anje | 13,154,539 | 0.48 % | 4,796.80 | 2,108,292 | 0.48 % | 4,765.22 | 677,765 | 0.63 % | 6,312.44 | 199,468 | 0.26 % | 2,569.73 | 64 | 0.30 % | 3,032.60 | 3,811,948 | 0.49 % | 4,859.21 | 6,292,587 | 0.48 % | 4,765.37 | 64,415 | 0.65 % | 6,476.46 |
| vati | vati | 10,357,158 | 0.38 % | 3,776.74 | 1,760,736 | 0.40 % | 3,979.66 | 474,005 | 0.44 % | 4,414.70 | 314,033 | 0.41 % | 4,045.67 | 72 | 0.34 % | 3,411.68 | 2,898,609 | 0.37 % | 3,694.95 | 4,870,857 | 0.37 % | 3,688.69 | 38,846 | 0.39 % | 3,905.68 |
| jati | jati | 9,911,092 | 0.36 % | 3,614.08 | 1,702,172 | 0.39 % | 3,847.30 | 431,623 | 0.40 % | 4,019.97 | 364,357 | 0.47 % | 4,693.99 | 72 | 0.34 % | 3,411.68 | 2,885,628 | 0.37 % | 3,678.40 | 4,485,923 | 0.34 % | 3,397.18 | 41,317 | 0.41 % | 4,154.13 |
| prav | prav | 9,193,169 | 0.34 % | 3,352.29 | 1,503,137 | 0.34 % | 3,397.43 | 313,930 | 0.29 % | 2,923.82 | 259,168 | 0.33 % | 3,338.85 | 63 | 0.30 % | 2,985.22 | 2,541,410 | 0.32 % | 3,239.61 | 4,522,332 | 0.34 % | 3,424.76 | 53,129 | 0.53 % | 5,341.74 |
| pred | pred | 9,067,088 | 0.33 % | 3,306.32 | 1,258,038 | 0.28 % | 2,843.45 | 252,033 | 0.23 % | 2,347.34 | 173,851 | 0.22 % | 2,239.71 | 32 | 0.15 % | 1,516.30 | 2,803,762 | 0.36 % | 3,574.04 | 4,539,446 | 0.34 % | 3,437.72 | 39,926 | 0.40 % | 4,014.27 |
| tudi | tudi | 8,861,014 | 0.32 % | 3,231.17 | 1,537,646 | 0.35 % | 3,475.43 | 277,621 | 0.26 % | 2,585.66 | 143,928 | 0.18 % | 1,854.22 | 21 | 0.10 % | 995.07 | 2,484,618 | 0.32 % | 3,167.22 | 4,393,363 | 0.33 % | 3,327.09 | 23,817 | 0.24 % | 2,394.63 |
| nost | nost | 7,822,667 | 0.28 % | 2,852.54 | 1,300,670 | 0.29 % | 2,939.81 | 443,104 | 0.41 % | 4,126.90 | 149,399 | 0.19 % | 1,924.70 | 4 | 0.02 % | 189.54 | 2,207,266 | 0.28 % | 2,813.67 | 3,685,509 | 0.28 % | 2,791.03 | 36,715 | 0.37 % | 3,691.43 |
| nski | nski | 7,605,959 | 0.28 % | 2,773.51 | 1,023,951 | 0.23 % | 2,314.36 | 220,031 | 0.20 % | 2,049.28 | 77,379 | 0.10 % | 996.87 | 7 | 0.03 % | 331.69 | 2,239,008 | 0.28 % | 2,854.13 | 4,021,262 | 0.30 % | 3,045.30 | 24,321 | 0.24 % | 2,445.30 |
| love | love | 6,997,306 | 0.26 % | 2,551.57 | 953,994 | 0.22 % | 2,156.24 | 204,968 | 0.19 % | 1,908.99 | 91,258 | 0.12 % | 1,175.67 | 0 | 0 % | 0 | 2,159,007 | 0.28 % | 2,752.15 | 3,563,289 | 0.27 % | 2,698.47 | 24,790 | 0.25 % | 2,492.46 |
| oven | oven | 6,944,802 | 0.25 % | 2,532.42 | 845,867 | 0.19 % | 1,911.85 | 164,808 | 0.15 % | 1,534.96 | 23,579 | 0.03 % | 303.77 | 5 | 0.02 % | 236.92 | 2,264,004 | 0.29 % | 2,886 | 3,620,538 | 0.27 % | 2,741.83 | 26,001 | 0.26 % | 2,614.21 |
| rati | rati | 6,408,401 | 0.23 % | 2,336.82 | 1,113,625 | 0.25 % | 2,517.05 | 267,208 | 0.25 % | 2,488.67 | 227,277 | 0.29 % | 2,928 | 72 | 0.34 % | 3,411.68 | 1,767,846 | 0.23 % | 2,253.53 | 3,008,444 | 0.23 % | 2,278.29 | 23,929 | 0.24 % | 2,405.89 |
| niti | niti | 6,358,979 | 0.23 % | 2,318.80 | 1,028,379 | 0.23 % | 2,324.37 | 253,755 | 0.24 % | 2,363.38 | 495,872 | 0.64 % | 6,388.29 | 121 | 0.57 % | 5,733.51 | 1,753,663 | 0.22 % | 2,235.45 | 2,808,710 | 0.21 % | 2,127.03 | 18,479 | 0.19 % | 1,857.93 |
| leto | leto | 6,040,278 | 0.22 % | 2,202.59 | 844,427 | 0.19 % | 1,908.60 | 134,410 | 0.12 % | 1,251.84 | 62,851 | 0.08 % | 809.71 | 0 | 0 % | 0 | 1,849,851 | 0.24 % | 2,358.06 | 3,136,168 | 0.24 % | 2,375.02 | 12,571 | 0.13 % | 1,263.92 |
| cija | cija | 5,651,361 | 0.21 % | 2,060.77 | 825,116 | 0.19 % | 1,864.95 | 244,864 | 0.23 % | 2,280.57 | 47,951 | 0.06 % | 617.75 | 5 | 0.02 % | 236.92 | 1,869,557 | 0.24 % | 2,383.18 | 2,638,946 | 0.20 % | 1,998.47 | 24,922 | 0.25 % | 2,505.73 |
| vanj | vanj | 5,377,581 | 0.20 % | 1,960.94 | 792,911 | 0.18 % | 1,792.16 | 255,631 | 0.24 % | 2,380.85 | 63,547 | 0.08 % | 818.67 | 1 | 0.01 % | 47.38 | 1,554,974 | 0.20 % | 1,982.17 | 2,684,648 | 0.20 % | 2,033.08 | 25,869 | 0.26 % | 2,600.94 |
| meti | meti | 5,091,554 | 0.19 % | 1,856.64 | 933,193 | 0.21 % | 2,109.23 | 191,236 | 0.18 % | 1,781.10 | 235,633 | 0.30 % | 3,035.65 | 16 | 0.08 % | 758.15 | 1,286,965 | 0.16 % | 1,640.53 | 2,423,516 | 0.18 % | 1,835.33 | 20,995 | 0.21 % | 2,110.90 |
| ovat | ovat | 5,064,656 | 0.18 % | 1,846.83 | 891,761 | 0.20 % | 2,015.58 | 232,618 | 0.22 % | 2,166.51 | 147,465 | 0.19 % | 1,899.78 | 21 | 0.10 % | 995.07 | 1,384,797 | 0.18 % | 1,765.24 | 2,390,660 | 0.18 % | 1,810.44 | 17,334 | 0.17 % | 1,742.81 |
| viti | viti | 5,052,913 | 0.18 % | 1,842.55 | 814,818 | 0.18 % | 1,841.67 | 169,949 | 0.16 % | 1,582.84 | 168,397 | 0.22 % | 2,169.45 | 78 | 0.37 % | 3,695.98 | 1,431,709 | 0.18 % | 1,825.04 | 2,450,635 | 0.19 % | 1,855.86 | 17,327 | 0.17 % | 1,742.10 |
| riti | riti | 4,706,950 | 0.17 % | 1,716.39 | 733,980 | 0.17 % | 1,658.96 | 180,280 | 0.17 % | 1,679.06 | 244,621 | 0.32 % | 3,151.44 | 43 | 0.20 % | 2,037.53 | 1,414,045 | 0.18 % | 1,802.53 | 2,120,329 | 0.16 % | 1,605.72 | 13,652 | 0.14 % | 1,372.61 |
| ičen | ičen | 4,625,125 | 0.17 % | 1,686.55 | 898,550 | 0.20 % | 2,030.93 | 263,164 | 0.24 % | 2,451.01 | 70,403 | 0.09 % | 907 | 29 | 0.14 % | 1,374.15 | 1,222,981 | 0.16 % | 1,558.97 | 2,151,256 | 0.16 % | 1,629.14 | 18,742 | 0.19 % | 1,884.37 |
| stav | stav | 4,595,185 | 0.17 % | 1,675.63 | 709,325 | 0.16 % | 1,603.24 | 175,634 | 0.16 % | 1,635.79 | 93,173 | 0.12 % | 1,200.34 | 30 | 0.14 % | 1,421.53 | 1,439,933 | 0.18 % | 1,835.53 | 2,149,797 | 0.16 % | 1,628.04 | 27,293 | 0.27 % | 2,744.11 |
| ljen | ljen | 4,535,679 | 0.17 % | 1,653.94 | 842,650 | 0.19 % | 1,904.58 | 230,930 | 0.21 % | 2,150.79 | 138,878 | 0.18 % | 1,789.16 | 39 | 0.18 % | 1,847.99 | 1,201,680 | 0.15 % | 1,531.82 | 2,101,016 | 0.16 % | 1,591.10 | 20,486 | 0.21 % | 2,059.72 |
| enje | enje | 4,510,979 | 0.16 % | 1,644.93 | 806,725 | 0.18 % | 1,823.38 | 235,239 | 0.22 % | 2,190.93 | 110,598 | 0.14 % | 1,424.83 | 28 | 0.13 % | 1,326.76 | 1,200,246 | 0.15 % | 1,529.99 | 2,139,537 | 0.16 % | 1,620.27 | 18,606 | 0.19 % | 1,870.70 |
| itev | itev | 4,457,494 | 0.16 % | 1,625.43 | 564,693 | 0.13 % | 1,276.33 | 153,870 | 0.14 % | 1,433.09 | 29,340 | 0.04 % | 377.99 | 1 | 0.01 % | 47.38 | 1,443,799 | 0.18 % | 1,840.46 | 2,241,821 | 0.17 % | 1,697.73 | 23,970 | 0.24 % | 2,410.01 |
| diti | diti | 4,373,282 | 0.16 % | 1,594.72 | 757,859 | 0.17 % | 1,712.93 | 158,546 | 0.15 % | 1,476.64 | 202,076 | 0.26 % | 2,603.33 | 72 | 0.34 % | 3,411.68 | 1,185,027 | 0.15 % | 1,510.59 | 2,058,015 | 0.16 % | 1,558.53 | 11,687 | 0.12 % | 1,175.04 |
| avit | avit | 4,359,378 | 0.16 % | 1,589.65 | 709,060 | 0.16 % | 1,602.64 | 139,350 | 0.13 % | 1,297.85 | 134,900 | 0.17 % | 1,737.91 | 62 | 0.29 % | 2,937.83 | 1,246,978 | 0.16 % | 1,589.56 | 2,114,914 | 0.16 % | 1,601.62 | 14,114 | 0.14 % | 1,419.06 |
| imet | imet | 4,286,044 | 0.16 % | 1,562.91 | 821,438 | 0.19 % | 1,856.64 | 158,676 | 0.15 % | 1,477.85 | 183,531 | 0.24 % | 2,364.42 | 17 | 0.08 % | 805.53 | 1,119,723 | 0.14 % | 1,427.35 | 1,987,027 | 0.15 % | 1,504.77 | 15,632 | 0.16 % | 1,571.68 |
| nica | nica | 4,240,842 | 0.15 % | 1,546.42 | 625,280 | 0.14 % | 1,413.27 | 121,381 | 0.11 % | 1,130.50 | 134,199 | 0.17 % | 1,728.88 | 51 | 0.24 % | 2,416.60 | 1,161,731 | 0.15 % | 1,480.89 | 2,180,534 | 0.17 % | 1,651.32 | 17,666 | 0.18 % | 1,776.19 |
| drug | drug | 4,156,605 | 0.15 % | 1,515.71 | 686,923 | 0.15 % | 1,552.60 | 186,484 | 0.17 % | 1,736.84 | 134,632 | 0.17 % | 1,734.46 | 19 | 0.09 % | 900.30 | 1,163,802 | 0.15 % | 1,483.53 | 1,958,518 | 0.15 % | 1,483.18 | 26,227 | 0.26 % | 2,636.94 |
| ovan | ovan | 4,152,456 | 0.15 % | 1,514.19 | 580,862 | 0.13 % | 1,312.88 | 161,177 | 0.15 % | 1,501.14 | 54,386 | 0.07 % | 700.65 | 0 | 0 % | 0 | 1,190,927 | 0.15 % | 1,518.11 | 2,145,708 | 0.16 % | 1,624.94 | 19,396 | 0.20 % | 1,950.13 |
| dati | dati | 4,123,934 | 0.15 % | 1,503.79 | 639,216 | 0.14 % | 1,444.77 | 140,476 | 0.13 % | 1,308.34 | 249,052 | 0.32 % | 3,208.52 | 186 | 0.88 % | 8,813.50 | 1,175,519 | 0.15 % | 1,498.47 | 1,904,077 | 0.14 % | 1,441.95 | 15,408 | 0.15 % | 1,549.16 |

1.1.10 List of character-level 5-grams in lemmas in the Gigafida 2.0 corpus
File at CLARIN.SI: GF2.0-characters-lemmas-5grams-taxonomy-entire.tsv

| Character string | Character string (lower case) | Total absolute frequency of character string | Percentage of total sum of all found character strings | Total relative frequency (per million occurrences) | Absolute frequency [SSJ.T.P.R] | Percentage [SSJ.T.P.R] | Relative frequency [SSJ.T.P.R] | Absolute frequency [SSJ.T.K.S] | Percentage [SSJ.T.K.S] | Relative frequency [SSJ.T.K.S] | Absolute frequency [SSJ.T.K.L] | Percentage [SSJ.T.K.L] | Relative frequency [SSJ.T.K.L] | Absolute frequency [SSJ.T.K.N] | Percentage [SSJ.T.K.N] | Relative frequency [SSJ.T.K.N] | Absolute frequency [SSJ.I] | Percentage [SSJ.I] | Relative frequency [SSJ.I] | Absolute frequency [SSJ.T.P.C] | Percentage [SSJ.T.P.C] | Relative frequency [SSJ.T.P.C] | Absolute frequency [SSJ.T.D] | Percentage [SSJ.T.D] | Relative frequency [SSJ.T.D] |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| vanje | vanje | 5,238,689 | 0.27 % | 2,674.25 | 781,672 | 0.25 % | 2,481.78 | 252,280 | 0.32 % | 3,239.82 | 61,791 | 0.12 % | 1,204.87 | 1 | 0.01 % | 66.72 | 1,520,205 | 0.27 % | 2,701.20 | 2,597,386 | 0.28 % | 2,749.34 | 25,354 | 0.35 % | 3,479.74 |
| ovati | ovati | 5,027,672 | 0.26 % | 2,566.53 | 886,196 | 0.28 % | 2,813.64 | 231,690 | 0.30 % | 2,975.40 | 146,927 | 0.29 % | 2,864.95 | 21 | 0.14 % | 1,401.21 | 1,372,403 | 0.24 % | 2,438.57 | 2,373,148 | 0.25 % | 2,511.98 | 17,287 | 0.24 % | 2,372.57 |
| loven | loven | 4,904,016 | 0.25 % | 2,503.40 | 581,938 | 0.18 % | 1,847.63 | 89,546 | 0.12 % | 1,149.97 | 12,072 | 0.02 % | 235.39 | 0 | 0 % | 0 | 1,594,505 | 0.28 % | 2,833.22 | 2,606,719 | 0.28 % | 2,759.22 | 19,236 | 0.26 % | 2,640.06 |
| imeti | imeti | 4,149,008 | 0.21 % | 2,117.99 | 786,424 | 0.25 % | 2,496.87 | 152,374 | 0.20 % | 1,956.81 | 179,989 | 0.35 % | 3,509.63 | 12 | 0.08 % | 800.69 | 1,085,363 | 0.19 % | 1,928.54 | 1,930,104 | 0.20 % | 2,043.02 | 14,742 | 0.20 % | 2,023.28 |
| velik | velik | 3,913,323 | 0.20 % | 1,997.68 | 760,268 | 0.24 % | 2,413.83 | 150,169 | 0.19 % | 1,928.50 | 85,568 | 0.17 % | 1,668.50 | 6 | 0.04 % | 400.35 | 1,037,206 | 0.18 % | 1,842.97 | 1,871,933 | 0.20 % | 1,981.45 | 8,173 | 0.11 % | 1,121.71 |
| aviti | aviti | 3,894,129 | 0.20 % | 1,987.88 | 627,663 | 0.20 % | 1,992.81 | 120,869 | 0.15 % | 1,552.22 | 130,553 | 0.26 % | 2,545.67 | 61 | 0.41 % | 4,070.19 | 1,104,074 | 0.20 % | 1,961.79 | 1,899,739 | 0.20 % | 2,010.88 | 11,170 | 0.15 % | 1,533.04 |
| ljati | ljati | 3,870,494 | 0.20 % | 1,975.81 | 702,154 | 0.22 % | 2,229.32 | 176,801 | 0.23 % | 2,270.51 | 135,250 | 0.26 % | 2,637.26 | 50 | 0.33 % | 3,336.22 | 1,045,506 | 0.19 % | 1,857.72 | 1,788,995 | 0.19 % | 1,893.66 | 21,738 | 0.30 % | 2,983.45 |
| ateri | ateri | 3,793,603 | 0.19 % | 1,936.56 | 672,799 | 0.21 % | 2,136.11 | 183,378 | 0.23 % | 2,354.97 | 86,809 | 0.17 % | 1,692.70 | 8 | 0.05 % | 533.80 | 1,064,810 | 0.19 % | 1,892.02 | 1,765,853 | 0.19 % | 1,869.16 | 19,946 | 0.27 % | 2,737.51 |
| pravi | pravi | 3,641,605 | 0.19 % | 1,858.97 | 603,817 | 0.19 % | 1,917.10 | 131,484 | 0.17 % | 1,688.54 | 102,532 | 0.20 % | 1,999.29 | 31 | 0.21 % | 2,068.46 | 1,006,069 | 0.18 % | 1,787.65 | 1,778,257 | 0.19 % | 1,882.29 | 19,415 | 0.27 % | 2,664.63 |
| kater | kater | 3,554,748 | 0.18 % | 1,814.63 | 617,206 | 0.20 % | 1,959.61 | 168,821 | 0.22 % | 2,168.03 | 81,676 | 0.16 % | 1,592.61 | 8 | 0.05 % | 533.80 | 1,002,826 | 0.18 % | 1,781.88 | 1,666,790 | 0.18 % | 1,764.30 | 17,421 | 0.24 % | 2,390.96 |
| evati | evati | 3,534,970 | 0.18 % | 1,804.53 | 560,158 | 0.18 % | 1,778.48 | 161,828 | 0.21 % | 2,078.22 | 97,887 | 0.19 % | 1,908.71 | 11 | 0.07 % | 733.97 | 1,005,523 | 0.18 % | 1,786.68 | 1,694,010 | 0.18 % | 1,793.11 | 15,553 | 0.21 % | 2,134.59 |
| lahko | lahko | 3,534,884 | 0.18 % | 1,804.49 | 804,640 | 0.26 % | 2,554.71 | 207,701 | 0.27 % | 2,667.33 | 121,802 | 0.24 % | 2,375.04 | 44 | 0.29 % | 2,935.88 | 892,195 | 0.16 % | 1,585.31 | 1,491,261 | 0.16 % | 1,578.50 | 17,241 | 0.24 % | 2,366.26 |
| acija | acija | 3,202,267 | 0.16 % | 1,634.70 | 493,257 | 0.16 % | 1,566.07 | 157,715 | 0.20 % | 2,025.40 | 24,955 | 0.05 % | 486.60 | 3 | 0.02 % | 200.17 | 1,013,595 | 0.18 % | 1,801.02 | 1,498,631 | 0.16 % | 1,586.30 | 14,111 | 0.19 % | 1,936.68 |
| ovanj | ovanj | 3,154,179 | 0.16 % | 1,610.15 | 437,615 | 0.14 % | 1,389.41 | 122,221 | 0.16 % | 1,569.58 | 39,030 | 0.08 % | 761.05 | 0 | 0 % | 0 | 925,484 | 0.16 % | 1,644.46 | 1,615,613 | 0.17 % | 1,710.13 | 14,216 | 0.20 % | 1,951.09 |
| preds | preds | 2,769,864 | 0.14 % | 1,413.96 | 306,012 | 0.10 % | 971.58 | 49,693 | 0.06 % | 638.17 | 26,012 | 0.05 % | 507.21 | 0 | 0 % | 0 | 924,077 | 0.16 % | 1,641.96 | 1,457,770 | 0.15 % | 1,543.05 | 6,300 | 0.09 % | 864.65 |
| enski | enski | 2,591,836 | 0.13 % | 1,323.08 | 325,273 | 0.10 % | 1,032.73 | 56,273 | 0.07 % | 722.67 | 16,792 | 0.03 % | 327.43 | 2 | 0.01 % | 133.45 | 778,601 | 0.14 % | 1,383.47 | 1,409,492 | 0.15 % | 1,491.95 | 5,403 | 0.07 % | 741.54 |
| držav | držav | 2,523,233 | 0.13 % | 1,288.06 | 231,609 | 0.07 % | 735.35 | 58,935 | 0.08 % | 756.85 | 9,307 | 0.02 % | 181.48 | 0 | 0 % | 0 | 869,717 | 0.15 % | 1,545.37 | 1,335,295 | 0.14 % | 1,413.41 | 18,370 | 0.25 % | 2,521.21 |
| ajati | ajati | 2,499,546 | 0.13 % | 1,275.97 | 442,525 | 0.14 % | 1,405 | 117,258 | 0.15 % | 1,505.85 | 83,890 | 0.16 % | 1,635.78 | 10 | 0.07 % | 667.24 | 735,166 | 0.13 % | 1,306.29 | 1,110,571 | 0.12 % | 1,175.54 | 10,126 | 0.14 % | 1,389.75 |
| slove | slove | 2,470,188 | 0.13 % | 1,260.98 | 294,735 | 0.09 % | 935.77 | 47,665 | 0.06 % | 612.12 | 8,326 | 0.02 % | 162.35 | 0 | 0 % | 0 | 741,488 | 0.13 % | 1,317.52 | 1,373,334 | 0.14 % | 1,453.68 | 4,640 | 0.06 % | 636.82 |
| avlja | avlja | 2,399,474 | 0.12 % | 1,224.88 | 384,585 | 0.12 % | 1,221.04 | 107,417 | 0.14 % | 1,379.47 | 46,319 | 0.09 % | 903.18 | 1 | 0.01 % | 66.72 | 727,946 | 0.13 % | 1,293.46 | 1,115,281 | 0.12 % | 1,180.53 | 17,925 | 0.25 % | 2,460.14 |
| ijski | ijski | 2,312,611 | 0.12 % | 1,180.54 | 314,102 | 0.10 % | 997.26 | 71,152 | 0.09 % | 913.75 | 18,867 | 0.04 % | 367.89 | 2 | 0.01 % | 133.45 | 749,992 | 0.13 % | 1,332.63 | 1,149,429 | 0.12 % | 1,216.67 | 9,067 | 0.12 % | 1,244.41 |
| anski | anski | 2,292,067 | 0.12 % | 1,170.06 | 308,781 | 0.10 % | 980.37 | 72,388 | 0.09 % | 929.62 | 26,749 | 0.05 % | 521.58 | 2 | 0.01 % | 133.45 | 703,669 | 0.12 % | 1,250.32 | 1,175,158 | 0.12 % | 1,243.91 | 5,320 | 0.07 % | 730.15 |
| govor | govor | 2,290,928 | 0.12 % | 1,169.47 | 314,789 | 0.10 % | 999.44 | 98,545 | 0.13 % | 1,265.53 | 105,037 | 0.20 % | 2,048.13 | 0 | 0 % | 0 | 681,904 | 0.12 % | 1,211.65 | 1,082,030 | 0.12 % | 1,145.33 | 8,623 | 0.12 % | 1,183.47 |
| vljen | vljen | 2,261,796 | 0.12 % | 1,154.60 | 434,728 | 0.14 % | 1,380.25 | 122,791 | 0.16 % | 1,576.90 | 63,026 | 0.12 % | 1,228.95 | 10 | 0.07 % | 667.24 | 599,375 | 0.11 % | 1,065.01 | 1,032,986 | 0.11 % | 1,093.42 | 8,880 | 0.12 % | 1,218.74 |
| edati | edati | 2,245,116 | 0.12 % | 1,146.09 | 335,461 | 0.11 % | 1,065.08 | 70,421 | 0.09 % | 904.36 | 169,615 | 0.33 % | 3,307.35 | 17 | 0.11 % | 1,134.32 | 629,238 | 0.11 % | 1,118.07 | 1,033,871 | 0.11 % | 1,094.35 | 6,493 | 0.09 % | 891.14 |
| Slove | slove | 2,156,074 | 0.11 % | 1,100.63 | 241,989 | 0.08 % | 768.31 | 25,688 | 0.03 % | 329.89 | 2,610 | 0.01 % | 50.89 | 0 | 0 % | 0 | 765,895 | 0.14 % | 1,360.89 | 1,108,300 | 0.12 % | 1,173.14 | 11,592 | 0.16 % | 1,590.96 |
| irati | irati | 2,120,922 | 0.11 % | 1,082.69 | 389,345 | 0.12 % | 1,236.16 | 95,145 | 0.12 % | 1,221.87 | 57,217 | 0.11 % | 1,115.68 | 18 | 0.12 % | 1,201.04 | 596,007 | 0.11 % | 1,059.02 | 975,312 | 0.10 % | 1,032.37 | 7,878 | 0.11 % | 1,081.22 |
| stati | stati | 2,115,854 | 0.11 % | 1,080.10 | 367,066 | 0.12 % | 1,165.42 | 89,705 | 0.12 % | 1,152.01 | 108,634 | 0.21 % | 2,118.27 | 12 | 0.08 % | 800.69 | 580,498 | 0.10 % | 1,031.47 | 963,460 | 0.10 % | 1,019.82 | 6,479 | 0.09 % | 889.22 |
| orati | orati | 2,073,000 | 0.11 % | 1,058.23 | 342,431 | 0.11 % | 1,087.21 | 90,512 | 0.12 % | 1,162.37 | 93,593 | 0.18 % | 1,824.98 | 6 | 0.04 % | 400.35 | 564,752 | 0.10 % | 1,003.49 | 970,681 | 0.10 % | 1,027.47 | 11,025 | 0.15 % | 1,513.14 |
| morat | morat | 2,054,930 | 0.10 % | 1,049 | 337,843 | 0.11 % | 1,072.64 | 89,280 | 0.12 % | 1,146.55 | 93,206 | 0.18 % | 1,817.44 | 6 | 0.04 % | 400.35 | 559,660 | 0.10 % | 994.44 | 963,927 | 0.10 % | 1,020.32 | 11,008 | 0.15 % | 1,510.80 |
| janje | janje | 2,021,930 | 0.10 % | 1,032.16 | 325,564 | 0.10 % | 1,033.65 | 115,889 | 0.15 % | 1,488.27 | 25,381 | 0.05 % | 494.91 | 3 | 0.02 % | 200.17 | 604,686 | 0.11 % | 1,074.44 | 935,492 | 0.10 % | 990.22 | 14,915 | 0.20 % | 2,047.02 |
| poved | poved | 2,007,516 | 0.10 % | 1,024.80 | 242,600 | 0.08 % | 770.25 | 56,211 | 0.07 % | 721.87 | 79,220 | 0.15 % | 1,544.72 | 0 | 0 % | 0 | 642,533 | 0.11 % | 1,141.69 | 981,822 | 0.10 % | 1,039.26 | 5,130 | 0.07 % | 704.07 |

A total of 260 frequency lists of word parts were extracted from Gigafida 2.0: 20 for each part-of-speech according to the MTE-6 annotation scheme used to automatically annotate the corpus (nouns, verbs, adjectives, adverbs, pronouns, numerals, conjunctions, prepositions, interjections, particles, abbreviations, and residual words), with an additional 20 lists that contain word parts extracted from all word forms or lemmas regardless of their part-of-speech. Each group of 20 frequency lists is divided into two parts: 10 lists extracted from lemmas, and 10 lists extracted from lower-case word forms. Each of these two parts is further divided into 5 lists of initial word parts and 5 lists of final word parts (of length 1 to 5). Each line in the list contains the unit (either the word form or the lemma, e.g. “Slovenija”; with lemmas, the lower-case lemma is also listed, e.g. “slovenija”), its initial or final word part (e.g. “slo”), and the rest of the word (e.g. “venija”).
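The split into a word part and the rest of the word can be illustrated with a minimal Python sketch. This is not the LIST implementation (LIST is a Java program); the function name `split_unit` and its parameters are hypothetical. The sketch assumes that units shorter than the requested length are left out of a list, which is what the sample tables suggest (single-letter lemmas such as “v” do not appear in the 2-gram lists).

```python
from typing import Optional, Tuple


def split_unit(unit: str, n: int, position: str = "initial") -> Optional[Tuple[str, str]]:
    """Split `unit` into a word part of length `n` and the rest of the word.

    Returns (part, rest), or None when the unit is shorter than `n`
    (assumption: such units are not included in the list).
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    if position not in ("initial", "final"):
        raise ValueError("position must be 'initial' or 'final'")
    if len(unit) < n:
        return None
    if position == "initial":
        # First n characters are the word part; the remainder is the rest.
        return unit[:n], unit[n:]
    # Last n characters are the word part; everything before them is the rest.
    return unit[-n:], unit[:-n]


# Hypothetical usage: word parts of length 1 to 5 for one lemma.
for n in range(1, 6):
    print(n, split_unit("Slovenija", n, "initial"), split_unit("Slovenija", n, "final"))
```

When the unit is exactly as long as the requested word part (e.g. “kot” with length 3), the rest of the word is an empty string, which corresponds to an empty cell in the list.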
The numerical data consists of the absolute frequency ($f_a$) of the split unit, i.e. the total number of its occurrences in the corpus, followed by its percentage ($p$) of the total frequency ($N$) of all units (either lower-case forms or lemmas) in the corpus:

$$p = \frac{f_a}{N} \times 100$$

The list also contains the unit’s total relative frequency ($f_r$), which indicates how many times the split unit occurs per 1,000,000 units in the corpus. It is calculated with the following formula, where $f_a$ is the total absolute frequency of the split unit in the corpus, and $N$ is the total frequency of all units in the corpus:

$$f_r = \frac{f_a}{N} \times 1{,}000{,}000$$

The lists containing only the units with a specific part-of-speech also feature numerical data for individual text-type subcorpora (e.g. internet texts, newspapers, and fiction). In this case, the absolute frequencies ($f_{aT}$) represent the sum of all occurrences of the split unit in the texts pertaining to a specific taxonomy branch. The relative frequencies ($f_{rT}$) and percentages ($p_T$) are calculated using the following formulas, where $f_{aT}$ is the absolute frequency of the split unit in the taxonomy branch, and $N_T$ is the absolute frequency of all units in the taxonomy branch:

$$p_T = \frac{f_{aT}}{N_T} \times 100 \qquad f_{rT} = \frac{f_{aT}}{N_T} \times 1{,}000{,}000$$

The tables are sorted in the following manner:
• 1.2.1.–1.2.5. → All parts of speech / lemmas / initial word parts
• 1.2.6.–1.2.10. → All parts of speech / lemmas / final word parts
• 1.2.11.–1.2.15. → All parts of speech / lower-case word forms / initial word parts
• 1.2.16.–1.2.20. → All parts of speech / lower-case word forms / final word parts
• 1.2.21.–1.2.25. → Nouns / lemmas / initial word parts
• 1.2.26.–1.2.30. → Nouns / lemmas / final word parts
• 1.2.31.–1.2.35. → Nouns / lower-case word forms / initial word parts
• 1.2.36.–1.2.40. → Nouns / lower-case word forms / final word parts
• ...

1.2.
Frequency lists of word parts from the Gigafida 2.0 corpus CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 17 File at CLARIN.SI 1.2.1 List of initial character-level 1-grams from all lemmas in the Gigafida 2.0 corpus GF2.0-word_parts-all-lemmas- initial-1grams-entire.tsvLemma Lemma (lower-case) Initial part of the word Rest of the word Total absolute frequency of lemma Percentage of all found lemmas Total relative frequency (per million occurrences) biti biti b iti 91,522,113 8.07 % 80,657.97 v v v 31,839,375 2.81 % 28,059.88 in in i n 29,141,024 2.57 % 25,681.84 na na n a 19,074,713 1.68 % 16,810.45 se se s e 18,657,742 1.64 % 16,442.97 z z z 15,738,609 1.39 % 13,870.36 za za z a 15,202,396 1.34 % 13,397.79 da da d a 14,827,013 1.31 % 13,066.97 ki ki k i 12,624,190 1.11 % 11,125.63 on on o n 11,886,761 1.05 % 10,475.74 ta ta t a 11,387,563 1.00 % 10,035.80 pa pa p a 11,004,454 0.97 % 9,698.17 tudi tudi t udi 8,479,355 0.75 % 7,472.81 ne ne n e 6,768,223 0.60 % 5,964.80 po po p o 6,362,319 0.56 % 5,607.08 še še š e 5,787,754 0.51 % 5,100.72 kot kot k ot 5,745,987 0.51 % 5,063.91 leto leto l eto 4,967,810 0.44 % 4,378.11 ves ves v es 4,233,166 0.37 % 3,730.67 imeti imeti i meti 4,146,971 0.36 % 3,654.70 iz iz i z 4,006,974 0.35 % 3,531.33 pri pri p ri 3,934,295 0.35 % 3,467.27 od od o d 3,856,863 0.34 % 3,399.03 že že ž e 3,735,431 0.33 % 3,292.02 tako tako t ako 3,700,590 0.33 % 3,261.31 o o o 3,700,276 0.33 % 3,261.03 lahko lahko l ahko 3,491,432 0.31 % 3,076.98 jaz jaz j az 3,490,681 0.31 % 3,076.32 svoj svoj s voj 3,455,845 0.30 % 3,045.62 do do d o 3,436,523 0.30 % 3,028.59 drug drug d rug 3,403,426 0.30 % 2,999.42 ali ali a li 3,265,801 0.29 % 2,878.13 CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 18 File at CLARIN.SI 1.2.2 List of initial character-level 2-grams from all lemmas in the Gigafida 2.0 corpus GF2.0-word_parts-all-lemmas- initial-2grams-entire.tsvLemma Lemma (lower-case) Initial part 
of the word Rest of the word Total absolute frequency of lemma Percentage of all found lemmas Total relative frequency (per million occurrences) biti biti bi ti 91,522,113 8.50 % 80,657.97 in in in 29,141,024 2.71 % 25,681.84 na na na 19,074,713 1.77 % 16,810.45 se se se 18,657,742 1.73 % 16,442.97 za za za 15,202,396 1.41 % 13,397.79 da da da 14,827,013 1.38 % 13,066.97 ki ki ki 12,624,190 1.17 % 11,125.63 on on on 11,886,761 1.10 % 10,475.74 ta ta ta 11,387,563 1.06 % 10,035.80 pa pa pa 11,004,454 1.02 % 9,698.17 tudi tudi tu di 8,479,355 0.79 % 7,472.81 ne ne ne 6,768,223 0.63 % 5,964.80 po po po 6,362,319 0.59 % 5,607.08 še še še 5,787,754 0.54 % 5,100.72 kot kot ko t 5,745,987 0.53 % 5,063.91 leto leto le to 4,967,810 0.46 % 4,378.11 ves ves ve s 4,233,166 0.39 % 3,730.67 imeti imeti im eti 4,146,971 0.39 % 3,654.70 iz iz iz 4,006,974 0.37 % 3,531.33 pri pri pr i 3,934,295 0.36 % 3,467.27 od od od 3,856,863 0.36 % 3,399.03 že že že 3,735,431 0.35 % 3,292.02 tako tako ta ko 3,700,590 0.34 % 3,261.31 lahko lahko la hko 3,491,432 0.32 % 3,076.98 jaz jaz ja z 3,490,681 0.32 % 3,076.32 svoj svoj sv oj 3,455,845 0.32 % 3,045.62 do do do 3,436,523 0.32 % 3,028.59 drug drug dr ug 3,403,426 0.32 % 2,999.42 ali ali al i 3,265,801 0.30 % 2,878.13 ko ko ko 3,180,411 0.29 % 2,802.88 med med me d 3,026,425 0.28 % 2,667.17 kar kar ka r 3,004,818 0.28 % 2,648.13 CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 19 File at CLARIN.SI 1.2.3 List of initial character-level 3-grams from all lemmas in the Gigafida 2.0 corpus GF2.0-word_parts-all-lemmas- initial-3grams-entire.tsvLemma Lemma (lower-case) Initial part of the word Rest of the word Total absolute frequency of lemma Percentage of all found lemmas Total relative frequency (per million occurrences) biti biti bit i 91,522,113 10.49 % 80,657.97 tudi tudi tud i 8,479,355 0.97 % 7,472.81 kot kot kot 5,745,987 0.66 % 5,063.91 leto leto let o 4,967,810 0.57 % 4,378.11 ves ves ves 4,233,166 0.48 % 
3,730.67 imeti imeti ime ti 4,146,971 0.47 % 3,654.70 pri pri pri 3,934,295 0.45 % 3,467.27 tako tako tak o 3,700,590 0.42 % 3,261.31 lahko lahko lah ko 3,491,432 0.40 % 3,076.98 jaz jaz jaz 3,490,681 0.40 % 3,076.32 svoj svoj svo j 3,455,845 0.40 % 3,045.62 drug drug dru g 3,403,426 0.39 % 2,999.42 ali ali ali 3,265,801 0.37 % 2,878.13 med med med 3,026,425 0.35 % 2,667.17 kar kar kar 3,004,818 0.34 % 2,648.13 velik velik vel ik 2,657,041 0.30 % 2,341.64 kateri kateri kat eri 2,632,286 0.30 % 2,319.82 nov nov nov 2,597,170 0.30 % 2,288.87 več več več 2,417,738 0.28 % 2,130.74 pred pred pre d 2,269,568 0.26 % 2,000.16 prvi prvi prv i 2,198,891 0.25 % 1,937.87 morati morati mor ati 2,049,546 0.23 % 1,806.25 naj naj naj 1,987,783 0.23 % 1,751.82 saj saj saj 1,922,973 0.22 % 1,694.71 dan dan dan 1,885,968 0.22 % 1,662.09 slovenski slovenski slo venski 1,883,301 0.22 % 1,659.74 čas čas čas 1,869,795 0.21 % 1,647.84 dober dober dob er 1,844,130 0.21 % 1,625.22 zaradi zaradi zar adi 1,792,448 0.21 % 1,579.68 Slovenija slovenija Slo venija 1,765,488 0.20 % 1,555.92 njegov njegov nje gov 1,765,377 0.20 % 1,555.82 nekaj nekaj nek aj 1,708,192 0.20 % 1,505.42 CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 20 File at CLARIN.SI 1.2.4 List of initial character-level 4-grams from all lemmas in the Gigafida 2.0 corpus GF2.0-word_parts-all-lemmas- initial-4grams-entire.tsvLemma Lemma (lower-case) Initial part of the word Rest of the word Total absolute frequency of lemma Percentage of all found lemmas Total relative frequency (per million occurrences) biti biti biti 91,522,113 11.68 % 80,657.97 tudi tudi tudi 8,479,355 1.08 % 7,472.81 leto leto leto 4,967,810 0.63 % 4,378.11 imeti imeti imet i 4,146,971 0.53 % 3,654.70 tako tako tako 3,700,590 0.47 % 3,261.31 lahko lahko lahk o 3,491,432 0.45 % 3,076.98 svoj svoj svoj 3,455,845 0.44 % 3,045.62 drug drug drug 3,403,426 0.43 % 2,999.42 velik velik veli k 2,657,041 0.34 % 2,341.64 kateri kateri kate 
ri 2,632,286 0.34 % 2,319.82 pred pred pred 2,269,568 0.29 % 2,000.16 prvi prvi prvi 2,198,891 0.28 % 1,937.87 morati morati mora ti 2,049,546 0.26 % 1,806.25 slovenski slovenski slov enski 1,883,301 0.24 % 1,659.74 dober dober dobe r 1,844,130 0.23 % 1,625.22 zaradi zaradi zara di 1,792,448 0.23 % 1,579.68 Slovenija slovenija Slov enija 1,765,488 0.23 % 1,555.92 njegov njegov njeg ov 1,765,377 0.23 % 1,555.82 nekaj nekaj neka j 1,708,192 0.22 % 1,505.42 zato zato zato 1,625,474 0.21 % 1,432.52 delo delo delo 1,612,070 0.21 % 1,420.71 mesto mesto mest o 1,587,744 0.20 % 1,399.27 človek človek člov ek 1,543,697 0.20 % 1,360.45 država država drža va 1,502,640 0.19 % 1,324.27 zelo zelo zelo 1,445,559 0.18 % 1,273.96 sicer sicer sice r 1,341,818 0.17 % 1,182.54 svet svet svet 1,307,958 0.17 % 1,152.70 prav prav prav 1,266,134 0.16 % 1,115.84 bolj bolj bolj 1,262,376 0.16 % 1,112.53 zadnji zadnji zadn ji 1,254,914 0.16 % 1,105.95 tisti tisti tist i 1,204,479 0.15 % 1,061.50 kjer kjer kjer 1,147,659 0.15 % 1,011.43 CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 21 File at CLARIN.SI 1.2.5 List of initial character-level 5-grams from all lemmas in the Gigafida 2.0 corpus GF2.0-word_parts-all-lemmas- initial-5grams-entire.tsvLemma Lemma (lower-case) Initial part of the word Rest of the word Total absolute frequency of lemma Percentage of all found lemmas Total relative frequency (per million occurrences) imeti imeti imeti 4,146,971 0.70 % 3,654.70 lahko lahko lahko 3,491,432 0.59 % 3,076.98 velik velik velik 2,657,041 0.45 % 2,341.64 kateri kateri kater i 2,632,286 0.45 % 2,319.82 morati morati morat i 2,049,546 0.35 % 1,806.25 slovenski slovenski slove nski 1,883,301 0.32 % 1,659.74 dober dober dober 1,844,130 0.31 % 1,625.22 zaradi zaradi zarad i 1,792,448 0.30 % 1,579.68 Slovenija slovenija Slove nija 1,765,488 0.30 % 1,555.92 njegov njegov njego v 1,765,377 0.30 % 1,555.82 nekaj nekaj nekaj 1,708,192 0.29 % 1,505.42 mesto mesto mesto 
1,587,744 0.27 % 1,399.27 človek človek člove k 1,543,697 0.26 % 1,360.45 država država držav a 1,502,640 0.25 % 1,324.27 sicer sicer sicer 1,341,818 0.23 % 1,182.54 zadnji zadnji zadnj i 1,254,914 0.21 % 1,105.95 tisti tisti tisti 1,204,479 0.20 % 1,061.50 začeti začeti začet i 1,138,052 0.19 % 1,002.96 konec konec konec 1,104,610 0.19 % 973.49 odstotek odstotek odsto tek 1,089,885 0.18 % 960.51 Ljubljana ljubljana Ljubl jana 1,078,624 0.18 % 950.59 trije trije trije 1,077,626 0.18 % 949.71 vedno vedno vedno 1,075,131 0.18 % 947.51 priti priti priti 1,046,846 0.18 % 922.58 njihov njihov njiho v 1,022,474 0.17 % 901.10 podjetje podjetje podje tje 1,001,130 0.17 % 882.29 vendar vendar venda r 999,463 0.17 % 880.82 tekma tekma tekma 978,013 0.17 % 861.92 povedati povedati poved ati 970,820 0.16 % 855.58 dobiti dobiti dobit i 966,380 0.16 % 851.67 evropski evropski evrop ski 952,910 0.16 % 839.79 veliko veliko velik o 945,317 0.16 % 833.10 CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 22 File at CLARIN.SI 1.2.6 List of final character-level 1-grams from all lemmas in the Gigafida 2.0 corpus GF2.0-word_parts-all-lemmas- final-1grams-entire.tsvLemma Lemma (lower-case) Rest of the word Final part of the word Total absolute frequency of lemma Percentage of all found lemmas Total relative frequency (per million occurrences) biti biti bit i 91,522,113 8.07 % 80,657.97 v v v 31,839,375 2.81 % 28,059.88 in in i n 29,141,024 2.57 % 25,681.84 na na n a 19,074,713 1.68 % 16,810.45 se se s e 18,657,742 1.64 % 16,442.97 z z z 15,738,609 1.39 % 13,870.36 za za z a 15,202,396 1.34 % 13,397.79 da da d a 14,827,013 1.31 % 13,066.97 ki ki k i 12,624,190 1.11 % 11,125.63 on on o n 11,886,761 1.05 % 10,475.74 ta ta t a 11,387,563 1.00 % 10,035.80 pa pa p a 11,004,454 0.97 % 9,698.17 tudi tudi tud i 8,479,355 0.75 % 7,472.81 ne ne n e 6,768,223 0.60 % 5,964.80 po po p o 6,362,319 0.56 % 5,607.08 še še š e 5,787,754 0.51 % 5,100.72 kot kot ko t 5,745,987 
0.51 % 5,063.91 leto leto let o 4,967,810 0.44 % 4,378.11 ves ves ve s 4,233,166 0.37 % 3,730.67 imeti imeti imet i 4,146,971 0.36 % 3,654.70 iz iz i z 4,006,974 0.35 % 3,531.33 pri pri pr i 3,934,295 0.35 % 3,467.27 od od o d 3,856,863 0.34 % 3,399.03 že že ž e 3,735,431 0.33 % 3,292.02 tako tako tak o 3,700,590 0.33 % 3,261.31 o o o 3,700,276 0.33 % 3,261.03 lahko lahko lahk o 3,491,432 0.31 % 3,076.98 jaz jaz ja z 3,490,681 0.31 % 3,076.32 svoj svoj svo j 3,455,845 0.30 % 3,045.62 do do d o 3,436,523 0.30 % 3,028.59 drug drug dru g 3,403,426 0.30 % 2,999.42 ali ali al i 3,265,801 0.29 % 2,878.13 CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 23 File at CLARIN.SI 1.2.7 List of final character-level 2-grams from all lemmas in the Gigafida 2.0 corpus GF2.0-word_parts-all-lemmas- final-2grams-entire.tsvLemma Lemma (lower-case) Rest of the word Final part of the word Total absolute frequency of lemma Percentage of all found lemmas Total relative frequency (per million occurrences) biti biti bi ti 91,522,113 8.50 % 80,657.97 in in in 29,141,024 2.71 % 25,681.84 na na na 19,074,713 1.77 % 16,810.45 se se se 18,657,742 1.73 % 16,442.97 za za za 15,202,396 1.41 % 13,397.79 da da da 14,827,013 1.38 % 13,066.97 ki ki ki 12,624,190 1.17 % 11,125.63 on on on 11,886,761 1.10 % 10,475.74 ta ta ta 11,387,563 1.06 % 10,035.80 pa pa pa 11,004,454 1.02 % 9,698.17 tudi tudi tu di 8,479,355 0.79 % 7,472.81 ne ne ne 6,768,223 0.63 % 5,964.80 po po po 6,362,319 0.59 % 5,607.08 še še še 5,787,754 0.54 % 5,100.72 kot kot k ot 5,745,987 0.53 % 5,063.91 leto leto le to 4,967,810 0.46 % 4,378.11 ves ves v es 4,233,166 0.39 % 3,730.67 imeti imeti ime ti 4,146,971 0.39 % 3,654.70 iz iz iz 4,006,974 0.37 % 3,531.33 pri pri p ri 3,934,295 0.36 % 3,467.27 od od od 3,856,863 0.36 % 3,399.03 že že že 3,735,431 0.35 % 3,292.02 tako tako ta ko 3,700,590 0.34 % 3,261.31 lahko lahko lah ko 3,491,432 0.32 % 3,076.98 jaz jaz j az 3,490,681 0.32 % 3,076.32 svoj svoj sv 
oj 3,455,845 0.32 % 3,045.62 do do do 3,436,523 0.32 % 3,028.59 drug drug dr ug 3,403,426 0.32 % 2,999.42 ali ali a li 3,265,801 0.30 % 2,878.13 ko ko ko 3,180,411 0.29 % 2,802.88 med med m ed 3,026,425 0.28 % 2,667.17 kar kar k ar 3,004,818 0.28 % 2,648.13 CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 24 File at CLARIN.SI 1.2.8 List of final character-level 3-grams from all lemmas in the Gigafida 2.0 corpus GF2.0-word_parts-all-lemmas- final-3grams-entire.tsvLemma Lemma (lower-case) Rest of the word Final part of the word Total absolute frequency of lemma Percentage of all found lemmas Total relative frequency (per million occurrences) biti biti b iti 91,522,113 10.49 % 80,657.97 tudi tudi t udi 8,479,355 0.97 % 7,472.81 kot kot kot 5,745,987 0.66 % 5,063.91 leto leto l eto 4,967,810 0.57 % 4,378.11 ves ves ves 4,233,166 0.48 % 3,730.67 imeti imeti im eti 4,146,971 0.47 % 3,654.70 pri pri pri 3,934,295 0.45 % 3,467.27 tako tako t ako 3,700,590 0.42 % 3,261.31 lahko lahko la hko 3,491,432 0.40 % 3,076.98 jaz jaz jaz 3,490,681 0.40 % 3,076.32 svoj svoj s voj 3,455,845 0.40 % 3,045.62 drug drug d rug 3,403,426 0.39 % 2,999.42 ali ali ali 3,265,801 0.37 % 2,878.13 med med med 3,026,425 0.35 % 2,667.17 kar kar kar 3,004,818 0.34 % 2,648.13 velik velik ve lik 2,657,041 0.30 % 2,341.64 kateri kateri kat eri 2,632,286 0.30 % 2,319.82 nov nov nov 2,597,170 0.30 % 2,288.87 več več več 2,417,738 0.28 % 2,130.74 pred pred p red 2,269,568 0.26 % 2,000.16 prvi prvi p rvi 2,198,891 0.25 % 1,937.87 morati morati mor ati 2,049,546 0.23 % 1,806.25 naj naj naj 1,987,783 0.23 % 1,751.82 saj saj saj 1,922,973 0.22 % 1,694.71 dan dan dan 1,885,968 0.22 % 1,662.09 slovenski slovenski sloven ski 1,883,301 0.22 % 1,659.74 čas čas čas 1,869,795 0.21 % 1,647.84 dober dober do ber 1,844,130 0.21 % 1,625.22 zaradi zaradi zar adi 1,792,448 0.21 % 1,579.68 Slovenija slovenija Sloven ija 1,765,488 0.20 % 1,555.92 njegov njegov nje gov 1,765,377 0.20 % 1,555.82 
| nekaj | nekaj | ne | kaj | 1,708,192 | 0.20 % | 1,505.42 |

1.2.9 List of final character-level 4-grams from all lemmas in the Gigafida 2.0 corpus
File at CLARIN.SI: GF2.0-word_parts-all-lemmas-final-4grams-entire.tsv

| Lemma | Lemma (lower-case) | Rest of the word | Final part of the word | Total absolute frequency of lemma | Percentage of all found lemmas | Total relative frequency (per million occurrences) |
| --- | --- | --- | --- | --- | --- | --- |
| biti | biti | | biti | 91,522,113 | 11.68 % | 80,657.97 |
| tudi | tudi | | tudi | 8,479,355 | 1.08 % | 7,472.81 |
| leto | leto | | leto | 4,967,810 | 0.63 % | 4,378.11 |
| imeti | imeti | i | meti | 4,146,971 | 0.53 % | 3,654.70 |
| tako | tako | | tako | 3,700,590 | 0.47 % | 3,261.31 |
| lahko | lahko | l | ahko | 3,491,432 | 0.45 % | 3,076.98 |
| svoj | svoj | | svoj | 3,455,845 | 0.44 % | 3,045.62 |
| drug | drug | | drug | 3,403,426 | 0.43 % | 2,999.42 |
| velik | velik | v | elik | 2,657,041 | 0.34 % | 2,341.64 |
| kateri | kateri | ka | teri | 2,632,286 | 0.34 % | 2,319.82 |
| pred | pred | | pred | 2,269,568 | 0.29 % | 2,000.16 |
| prvi | prvi | | prvi | 2,198,891 | 0.28 % | 1,937.87 |
| morati | morati | mo | rati | 2,049,546 | 0.26 % | 1,806.25 |
| slovenski | slovenski | slove | nski | 1,883,301 | 0.24 % | 1,659.74 |
| dober | dober | d | ober | 1,844,130 | 0.23 % | 1,625.22 |
| zaradi | zaradi | za | radi | 1,792,448 | 0.23 % | 1,579.68 |
| Slovenija | slovenija | Slove | nija | 1,765,488 | 0.23 % | 1,555.92 |
| njegov | njegov | nj | egov | 1,765,377 | 0.23 % | 1,555.82 |
| nekaj | nekaj | n | ekaj | 1,708,192 | 0.22 % | 1,505.42 |
| zato | zato | | zato | 1,625,474 | 0.21 % | 1,432.52 |
| delo | delo | | delo | 1,612,070 | 0.21 % | 1,420.71 |
| mesto | mesto | m | esto | 1,587,744 | 0.20 % | 1,399.27 |
| človek | človek | čl | ovek | 1,543,697 | 0.20 % | 1,360.45 |
| država | država | dr | žava | 1,502,640 | 0.19 % | 1,324.27 |
| zelo | zelo | | zelo | 1,445,559 | 0.18 % | 1,273.96 |
| sicer | sicer | s | icer | 1,341,818 | 0.17 % | 1,182.54 |
| svet | svet | | svet | 1,307,958 | 0.17 % | 1,152.70 |
| prav | prav | | prav | 1,266,134 | 0.16 % | 1,115.84 |
| bolj | bolj | | bolj | 1,262,376 | 0.16 % | 1,112.53 |
| zadnji | zadnji | za | dnji | 1,254,914 | 0.16 % | 1,105.95 |
| tisti | tisti | t | isti | 1,204,479 | 0.15 % | 1,061.50 |
| kjer | kjer | | kjer | 1,147,659 | 0.15 % | 1,011.43 |
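
The "Rest of the word" and "Final part of the word" columns in the lists above can be reproduced with plain string slicing. The following is a minimal sketch in Python, not the LIST program itself (which was developed in Java); the function names `split_final`, `split_initial` and `share_of_found` are hypothetical. It assumes that words shorter than n characters are simply left out of an n-gram list, which the tables suggest: "in" and "na" are absent from the 3-gram list, and the percentage for "biti" rises from 8.50 % to 10.49 % to 11.68 % while its relative frequency stays at 80,657.97. It also assumes that characters such as "č" are stored as single precomposed code points.

```python
from typing import Dict, Optional, Tuple


def split_final(word: str, n: int) -> Optional[Tuple[str, str]]:
    """Return (rest of the word, final n characters), or None if the word is too short."""
    if n <= 0 or len(word) < n:
        return None
    return word[:-n], word[-n:]


def split_initial(word: str, n: int) -> Optional[Tuple[str, str]]:
    """Return (initial n characters, rest of the word), or None if the word is too short."""
    if n <= 0 or len(word) < n:
        return None
    return word[:n], word[n:]


def share_of_found(freqs: Dict[str, int], n: int) -> Dict[str, float]:
    """Percentage of each word among the words long enough to yield an n-gram.

    Assumption: the denominator only counts words with at least n characters.
    """
    found = {w: f for w, f in freqs.items() if len(w) >= n}
    total = sum(found.values())
    return {w: 100.0 * f / total for w, f in found.items()}


# Toy sample using four absolute frequencies from the lemma lists above.
# The resulting shares are NOT the published percentages, because the
# denominator here covers only these four lemmas, not the whole corpus.
sample = {"biti": 91_522_113, "in": 29_141_024, "tudi": 8_479_355, "kot": 5_745_987}

if __name__ == "__main__":
    for lemma in sample:
        print(lemma, split_final(lemma, 3))
    print(share_of_found(sample, 3))
```

Applied to "Slovenija" with n = 3, `split_final` yields the rest "Sloven" and the final part "ija", matching the corresponding row of the 3-gram list; a word whose length equals n yields an empty rest, as in the row for "kot".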
1.2.10 List of final character-level 5-grams from all lemmas in the Gigafida 2.0 corpus
File at CLARIN.SI: GF2.0-word_parts-all-lemmas-final-5grams-entire.tsv

| Lemma | Lemma (lower-case) | Rest of the word | Final part of the word | Total absolute frequency of lemma | Percentage of all found lemmas | Total relative frequency (per million occurrences) |
| --- | --- | --- | --- | --- | --- | --- |
| imeti | imeti | | imeti | 4,146,971 | 0.70 % | 3,654.70 |
| lahko | lahko | | lahko | 3,491,432 | 0.59 % | 3,076.98 |
| velik | velik | | velik | 2,657,041 | 0.45 % | 2,341.64 |
| kateri | kateri | k | ateri | 2,632,286 | 0.45 % | 2,319.82 |
| morati | morati | m | orati | 2,049,546 | 0.35 % | 1,806.25 |
| slovenski | slovenski | slov | enski | 1,883,301 | 0.32 % | 1,659.74 |
| dober | dober | | dober | 1,844,130 | 0.31 % | 1,625.22 |
| zaradi | zaradi | z | aradi | 1,792,448 | 0.30 % | 1,579.68 |
| Slovenija | slovenija | Slov | enija | 1,765,488 | 0.30 % | 1,555.92 |
| njegov | njegov | n | jegov | 1,765,377 | 0.30 % | 1,555.82 |
| nekaj | nekaj | | nekaj | 1,708,192 | 0.29 % | 1,505.42 |
| mesto | mesto | | mesto | 1,587,744 | 0.27 % | 1,399.27 |
| človek | človek | č | lovek | 1,543,697 | 0.26 % | 1,360.45 |
| država | država | d | ržava | 1,502,640 | 0.25 % | 1,324.27 |
| sicer | sicer | | sicer | 1,341,818 | 0.23 % | 1,182.54 |
| zadnji | zadnji | z | adnji | 1,254,914 | 0.21 % | 1,105.95 |
| tisti | tisti | | tisti | 1,204,479 | 0.20 % | 1,061.50 |
| začeti | začeti | z | ačeti | 1,138,052 | 0.19 % | 1,002.96 |
| konec | konec | | konec | 1,104,610 | 0.19 % | 973.49 |
| odstotek | odstotek | ods | totek | 1,089,885 | 0.18 % | 960.51 |
| Ljubljana | ljubljana | Ljub | ljana | 1,078,624 | 0.18 % | 950.59 |
| trije | trije | | trije | 1,077,626 | 0.18 % | 949.71 |
| vedno | vedno | | vedno | 1,075,131 | 0.18 % | 947.51 |
| priti | priti | | priti | 1,046,846 | 0.18 % | 922.58 |
| njihov | njihov | n | jihov | 1,022,474 | 0.17 % | 901.10 |
| podjetje | podjetje | pod | jetje | 1,001,130 | 0.17 % | 882.29 |
| vendar | vendar | v | endar | 999,463 | 0.17 % | 880.82 |
| tekma | tekma | | tekma | 978,013 | 0.17 % | 861.92 |
| povedati | povedati | pov | edati | 970,820 | 0.16 % | 855.58 |
| dobiti | dobiti | d | obiti | 966,380 | 0.16 % | 851.67 |
| evropski | evropski | evr | opski | 952,910 | 0.16 % | 839.79 |
| veliko | veliko | v | eliko | 945,317 | 0.16 % | 833.10 |

1.2.11 List of initial character-level 1-grams from all lower-case word forms in
the Gigafida 2.0 corpus
File at CLARIN.SI: GF2.0-word_parts-all-lowercase_forms-initial-1grams-entire.tsv

| Lower-case word form | Initial part of the word | Rest of the word | Total absolute frequency of lower-case word form | Percentage of all found lower-case word forms | Total relative frequency (per million occurrences) |
| --- | --- | --- | --- | --- | --- |
| je | j | e | 39,900,636 | 3.52 % | 35,164.23 |
| v | v | | 31,824,841 | 2.81 % | 28,047.07 |
| in | i | n | 29,140,300 | 2.57 % | 25,681.20 |
| na | n | a | 19,074,884 | 1.68 % | 16,810.60 |
| se | s | e | 16,041,389 | 1.41 % | 14,137.19 |
| za | z | a | 15,202,312 | 1.34 % | 13,397.72 |
| da | d | a | 14,926,166 | 1.31 % | 13,154.35 |
| so | s | o | 13,322,722 | 1.17 % | 11,741.25 |
| ki | k | i | 12,625,507 | 1.11 % | 11,126.80 |
| pa | p | a | 11,003,997 | 0.97 % | 9,697.77 |
| z | z | | 8,673,736 | 0.76 % | 7,644.12 |
| tudi | t | udi | 8,478,795 | 0.75 % | 7,472.32 |
| s | s | | 7,090,077 | 0.62 % | 6,248.45 |
| ne | n | e | 6,768,091 | 0.60 % | 5,964.68 |
| bi | b | i | 6,497,514 | 0.57 % | 5,726.23 |
| po | p | o | 6,362,308 | 0.56 % | 5,607.07 |
| bo | b | o | 5,992,153 | 0.53 % | 5,280.85 |
| še | š | e | 5,787,746 | 0.51 % | 5,100.71 |
| kot | k | ot | 5,708,530 | 0.50 % | 5,030.90 |
| ni | n | i | 4,416,665 | 0.39 % | 3,892.38 |
| to | t | o | 4,127,429 | 0.36 % | 3,637.48 |
| iz | i | z | 4,006,982 | 0.35 % | 3,531.33 |
| pri | p | ri | 3,934,274 | 0.35 % | 3,467.26 |
| od | o | d | 3,857,055 | 0.34 % | 3,399.20 |
| tako | t | ako | 3,742,620 | 0.33 % | 3,298.35 |
| že | ž | e | 3,735,428 | 0.33 % | 3,292.01 |
| o | o | | 3,706,188 | 0.33 % | 3,266.24 |
| do | d | o | 3,436,485 | 0.30 % | 3,028.56 |
| lahko | l | ahko | 3,378,762 | 0.30 % | 2,977.69 |
| ali | a | li | 3,266,629 | 0.29 % | 2,878.86 |
| ko | k | o | 3,180,620 | 0.28 % | 2,803.06 |
| med | m | ed | 3,010,972 | 0.27 % | 2,653.55 |

1.2.12 List of initial character-level 2-grams from all lower-case word forms in the Gigafida 2.0 corpus
File at CLARIN.SI: GF2.0-word_parts-all-lowercase_forms-initial-2grams-entire.tsv

| Lower-case word form | Initial part of the word | Rest of the word | Total absolute frequency of lower-case word form | Percentage of all found lower-case word forms | Total relative frequency (per million occurrences) |
| --- | --- | --- | --- | --- | --- |
| je | je | | 39,900,636 | 3.71 % | 35,164.23 |
| in | in | | 29,140,300 | 2.71 % | 25,681.20 |
| na | na | | 19,074,884 | 1.77 % | 16,810.60 |
| se | se | | 16,041,389 | 1.49 % | 14,137.19 |
| za | za | | 15,202,312 | 1.41 % | 13,397.72 |
| da | da | | 14,926,166 | 1.39 % | 13,154.35 |
| so | so | | 13,322,722 | 1.24 % | 11,741.25 |
| ki | ki | | 12,625,507 | 1.17 % | 11,126.80 |
| pa | pa | | 11,003,997 | 1.02 % | 9,697.77 |
| tudi | tu | di | 8,478,795 | 0.79 % | 7,472.32 |
| ne | ne | | 6,768,091 | 0.63 % | 5,964.68 |
| bi | bi | | 6,497,514 | 0.60 % | 5,726.23 |
| po | po | | 6,362,308 | 0.59 % | 5,607.07 |
| bo | bo | | 5,992,153 | 0.56 % | 5,280.85 |
| še | še | | 5,787,746 | 0.54 % | 5,100.71 |
| kot | ko | t | 5,708,530 | 0.53 % | 5,030.90 |
| ni | ni | | 4,416,665 | 0.41 % | 3,892.38 |
| to | to | | 4,127,429 | 0.38 % | 3,637.48 |
| iz | iz | | 4,006,982 | 0.37 % | 3,531.33 |
| pri | pr | i | 3,934,274 | 0.36 % | 3,467.26 |
| od | od | | 3,857,055 | 0.36 % | 3,399.20 |
| tako | ta | ko | 3,742,620 | 0.35 % | 3,298.35 |
| že | že | | 3,735,428 | 0.35 % | 3,292.01 |
| do | do | | 3,436,485 | 0.32 % | 3,028.56 |
| lahko | la | hko | 3,378,762 | 0.31 % | 2,977.69 |
| ali | al | i | 3,266,629 | 0.30 % | 2,878.86 |
| ko | ko | | 3,180,620 | 0.29 % | 2,803.06 |
| med | me | d | 3,010,972 | 0.28 % | 2,653.55 |
| ga | ga | | 2,820,925 | 0.26 % | 2,486.07 |
| bodo | bo | do | 2,658,670 | 0.25 % | 2,343.07 |
| kar | ka | r | 2,643,190 | 0.24 % | 2,329.43 |
| jih | ji | h | 2,543,854 | 0.24 % | 2,241.89 |

1.2.13 List of initial character-level 3-grams from all lower-case word forms in the Gigafida 2.0 corpus
File at CLARIN.SI: GF2.0-word_parts-all-lowercase_forms-initial-3grams-entire.tsv

| Lower-case word form | Initial part of the word | Rest of the word | Total absolute frequency of lower-case word form | Percentage of all found lower-case word forms | Total relative frequency (per million occurrences) |
| --- | --- | --- | --- | --- | --- |
| tudi | tud | i | 8,478,795 | 1.04 % | 7,472.32 |
| kot | kot | | 5,708,530 | 0.70 % | 5,030.90 |
| pri | pri | | 3,934,274 | 0.48 % | 3,467.26 |
| tako | tak | o | 3,742,620 | 0.46 % | 3,298.35 |
| lahko | lah | ko | 3,378,762 | 0.41 % | 2,977.69 |
| ali | ali | | 3,266,629 | 0.40 % | 2,878.86 |
| med | med | | 3,010,972 | 0.37 % | 2,653.55 |
| bodo | bod | o | 2,658,670 | 0.33 % | 2,343.07 |
| kar | kar | | 2,643,190 | 0.33 % | 2,329.43 |
| jih | jih | | 2,543,854 | 0.31 % | 2,241.89 |
| tem | tem | | 2,440,313 | 0.30 % | 2,150.64 |
| več | več | | 2,415,744 | 0.30 % | 2,128.98 |
| sem | sem | | 2,410,811 | 0.30 % | 2,124.64 |
| bil | bil | | 2,389,589 | 0.29 % | 2,105.93 |
| pred | pre | d | 2,269,564 | 0.28 % | 2,000.16 |
| sta | sta | | 2,179,842 | 0.27 % | 1,921.08 |
| vse | vse | | 2,134,934 | 0.26 % | 1,881.51 |
| bilo | bil | o | 2,082,027 | 0.26 % | 1,834.88 |
| naj | naj | | 1,987,817 | 0.24 % | 1,751.85 |
| bila | bil | a | 1,952,427 | 0.24 % | 1,720.66 |
| saj | saj | | 1,922,966 | 0.24 % | 1,694.70 |
| smo | smo | | 1,909,804 | 0.23 % | 1,683.10 |
| leta | let | a | 1,848,499 | 0.23 % | 1,629.07 |
| zaradi | zar | adi | 1,792,345 | 0.22 % | 1,579.58 |
| nekaj | nek | aj | 1,681,897 | 0.21 % | 1,482.25 |
| ter | ter | | 1,670,311 | 0.20 % | 1,472.04 |
| zato | zat | o | 1,625,491 | 0.20 % | 1,432.54 |
| ker | ker | | 1,508,486 | 0.18 % | 1,329.42 |
| zelo | zel | o | 1,442,297 | 0.18 % | 1,271.09 |
| sicer | sic | er | 1,341,817 | 0.17 % | 1,182.54 |
| tega | teg | a | 1,301,031 | 0.16 % | 1,146.59 |
| prav | pra | v | 1,266,144 | 0.16 % | 1,115.85 |

1.2.14 List of initial character-level 4-grams from all lower-case word forms in the Gigafida 2.0 corpus
File at CLARIN.SI: GF2.0-word_parts-all-lowercase_forms-initial-4grams-entire.tsv

| Lower-case word form | Initial part of the word | Rest of the word | Total absolute frequency of lower-case word form | Percentage of all found lower-case word forms | Total relative frequency (per million occurrences) |
| --- | --- | --- | --- | --- | --- |
| tudi | tudi | | 8,478,795 | 1.17 % | 7,472.32 |
| tako | tako | | 3,742,620 | 0.52 % | 3,298.35 |
| lahko | lahk | o | 3,378,762 | 0.47 % | 2,977.69 |
| bodo | bodo | | 2,658,670 | 0.37 % | 2,343.07 |
| pred | pred | | 2,269,564 | 0.31 % | 2,000.16 |
| bilo | bilo | | 2,082,027 | 0.29 % | 1,834.88 |
| bila | bila | | 1,952,427 | 0.27 % | 1,720.66 |
| leta | leta | | 1,848,499 | 0.26 % | 1,629.07 |
| zaradi | zara | di | 1,792,345 | 0.25 % | 1,579.58 |
| nekaj | neka | j | 1,681,897 | 0.23 % | 1,482.25 |
| zato | zato | | 1,625,491 | 0.22 % | 1,432.54 |
| zelo | zelo | | 1,442,297 | 0.20 % | 1,271.09 |
| sicer | sice | r | 1,341,817 | 0.18 % | 1,182.54 |
| tega | tega | | 1,301,031 | 0.18 % | 1,146.59 |
| prav | prav | | 1,266,144 | 0.17 % | 1,115.85 |
| bolj | bolj | | 1,262,089 | 0.17 % | 1,112.27 |
| veliko | veli | ko | 1,199,832 | 0.17 % | 1,057.41 |
| kjer | kjer | | 1,147,660 | 0.16 % | 1,011.43 |
| zdaj | zdaj | | 1,082,612 | 0.15 % | 954.10 |
| vedno | vedn | o | 1,075,231 | 0.15 % | 947.60 |
| niso | niso | | 1,033,081 | 0.14 % | 910.45 |
| kako | kako | | 1,015,957 | 0.14 % | 895.36 |
| samo | samo | | 1,000,522 | 0.14 % | 881.75 |
| vendar | vend | ar | 999,460 | 0.14 % | 880.82 |
| bili | bili | | 949,367 | 0.13 % | 836.67 |
| danes | dane | s | 905,175 | 0.12 % | 797.73 |
| namreč | namr | eč | 902,137 | 0.12 % | 795.05 |
| brez | brez | | 896,121 | 0.12 % | 789.75 |
| svoje | svoj | e | 896,119 | 0.12 % | 789.75 |
| proti | prot | i | 886,922 | 0.12 % | 781.64 |
| predvsem | pred | vsem | 844,382 | 0.12 % | 744.15 |
| drugi | drug | i | 822,391 | 0.11 % | 724.77 |

1.2.15 List of initial character-level 5-grams from all lower-case word forms in the Gigafida 2.0 corpus
File at CLARIN.SI: GF2.0-word_parts-all-lowercase_forms-initial-5grams-entire.tsv

| Lower-case word form | Initial part of the word | Rest of the word | Total absolute frequency of lower-case word form | Percentage of all found lower-case word forms | Total relative frequency (per million occurrences) |
| --- | --- | --- | --- | --- | --- |
| lahko | lahko | | 3,378,762 | 0.55 % | 2,977.69 |
| zaradi | zarad | i | 1,792,345 | 0.29 % | 1,579.58 |
| nekaj | nekaj | | 1,681,897 | 0.27 % | 1,482.25 |
| sicer | sicer | | 1,341,817 | 0.22 % | 1,182.54 |
| veliko | velik | o | 1,199,832 | 0.19 % | 1,057.41 |
| vedno | vedno | | 1,075,231 | 0.17 % | 947.60 |
| vendar | venda | r | 999,460 | 0.16 % | 880.82 |
| danes | danes | | 905,175 | 0.15 % | 797.73 |
| namreč | namre | č | 902,137 | 0.15 % | 795.05 |
| svoje | svoje | | 896,119 | 0.14 % | 789.75 |
| proti | proti | | 886,922 | 0.14 % | 781.64 |
| predvsem | predv | sem | 844,382 | 0.14 % | 744.15 |
| drugi | drugi | | 822,391 | 0.13 % | 724.77 |
| potem | potem | | 814,625 | 0.13 % | 717.92 |
| dobro | dobro | | 786,829 | 0.13 % | 693.43 |
| najbolj | najbo | lj | 780,726 | 0.13 % | 688.05 |
| evrov | evrov | | 773,109 | 0.12 % | 681.34 |
| seveda | seved | a | 725,856 | 0.12 % | 639.69 |
| treba | treba | | 715,340 | 0.12 % | 630.43 |
| čeprav | čepra | v | 679,676 | 0.11 % | 599 |
| pravi | pravi | | 676,188 | 0.11 % | 595.92 |
| glede | glede | | 671,713 | 0.11 % | 591.98 |
| strani | stran | i | 656,391 | 0.11 % | 578.47 |
| oziroma | oziro | ma | 648,348 | 0.10 % | 571.39 |
| ljudi | ljudi | | 634,368 | 0.10 % | 559.07 |
| ljubljana | ljubl | jana | 629,756 | 0.10 % | 555 |
| mesto | mesto | | 628,321 | 0.10 % | 553.74 |
| skupaj | skupa | j | 614,586 | 0.10 % | 541.63 |
| letos | letos | | 606,877 | 0.10 % | 534.84 |
| poleg | poleg | | 594,327 | 0.10 % | 523.78 |
| letih | letih | | 593,893 | 0.10 % | 523.39 |
| imajo | imajo | | 589,979 | 0.10 % | 519.95 |

1.2.16 List of final character-level 1-grams from all
lower-case word forms in the Gigafida 2.0 corpus
File at CLARIN.SI: GF2.0-word_parts-all-lowercase_forms-final-1grams-entire.tsv

| Lower-case word form | Rest of the word | Final part of the word | Total absolute frequency of lower-case word form | Percentage of all found lower-case word forms | Total relative frequency (per million occurrences) |
| --- | --- | --- | --- | --- | --- |
| je | j | e | 39,900,636 | 3.52 % | 35,164.23 |
| v | | v | 31,824,841 | 2.81 % | 28,047.07 |
| in | i | n | 29,140,300 | 2.57 % | 25,681.20 |
| na | n | a | 19,074,884 | 1.68 % | 16,810.60 |
| se | s | e | 16,041,389 | 1.41 % | 14,137.19 |
| za | z | a | 15,202,312 | 1.34 % | 13,397.72 |
| da | d | a | 14,926,166 | 1.31 % | 13,154.35 |
| so | s | o | 13,322,722 | 1.17 % | 11,741.25 |
| ki | k | i | 12,625,507 | 1.11 % | 11,126.80 |
| pa | p | a | 11,003,997 | 0.97 % | 9,697.77 |
| z | | z | 8,673,736 | 0.76 % | 7,644.12 |
| tudi | tud | i | 8,478,795 | 0.75 % | 7,472.32 |
| s | | s | 7,090,077 | 0.62 % | 6,248.45 |
| ne | n | e | 6,768,091 | 0.60 % | 5,964.68 |
| bi | b | i | 6,497,514 | 0.57 % | 5,726.23 |
| po | p | o | 6,362,308 | 0.56 % | 5,607.07 |
| bo | b | o | 5,992,153 | 0.53 % | 5,280.85 |
| še | š | e | 5,787,746 | 0.51 % | 5,100.71 |
| kot | ko | t | 5,708,530 | 0.50 % | 5,030.90 |
| ni | n | i | 4,416,665 | 0.39 % | 3,892.38 |
| to | t | o | 4,127,429 | 0.36 % | 3,637.48 |
| iz | i | z | 4,006,982 | 0.35 % | 3,531.33 |
| pri | pr | i | 3,934,274 | 0.35 % | 3,467.26 |
| od | o | d | 3,857,055 | 0.34 % | 3,399.20 |
| tako | tak | o | 3,742,620 | 0.33 % | 3,298.35 |
| že | ž | e | 3,735,428 | 0.33 % | 3,292.01 |
| o | | o | 3,706,188 | 0.33 % | 3,266.24 |
| do | d | o | 3,436,485 | 0.30 % | 3,028.56 |
| lahko | lahk | o | 3,378,762 | 0.30 % | 2,977.69 |
| ali | al | i | 3,266,629 | 0.29 % | 2,878.86 |
| ko | k | o | 3,180,620 | 0.28 % | 2,803.06 |
| med | me | d | 3,010,972 | 0.27 % | 2,653.55 |

1.2.17 List of final character-level 2-grams from all lower-case word forms in the Gigafida 2.0 corpus
File at CLARIN.SI: GF2.0-word_parts-all-lowercase_forms-final-2grams-entire.tsv

| Lower-case word form | Rest of the word | Final part of the word | Total absolute frequency of lower-case word form | Percentage of all found lower-case word forms | Total relative frequency (per million occurrences) |
| --- | --- | --- | --- | --- | --- |
| je | | je | 39,900,636 | 3.71 % | 35,164.23 |
| in | | in | 29,140,300 | 2.71 % | 25,681.20 |
| na | | na | 19,074,884 | 1.77 % | 16,810.60 |
| se | | se | 16,041,389 | 1.49 % | 14,137.19 |
| za | | za | 15,202,312 | 1.41 % | 13,397.72 |
| da | | da | 14,926,166 | 1.39 % | 13,154.35 |
| so | | so | 13,322,722 | 1.24 % | 11,741.25 |
| ki | | ki | 12,625,507 | 1.17 % | 11,126.80 |
| pa | | pa | 11,003,997 | 1.02 % | 9,697.77 |
| tudi | tu | di | 8,478,795 | 0.79 % | 7,472.32 |
| ne | | ne | 6,768,091 | 0.63 % | 5,964.68 |
| bi | | bi | 6,497,514 | 0.60 % | 5,726.23 |
| po | | po | 6,362,308 | 0.59 % | 5,607.07 |
| bo | | bo | 5,992,153 | 0.56 % | 5,280.85 |
| še | | še | 5,787,746 | 0.54 % | 5,100.71 |
| kot | k | ot | 5,708,530 | 0.53 % | 5,030.90 |
| ni | | ni | 4,416,665 | 0.41 % | 3,892.38 |
| to | | to | 4,127,429 | 0.38 % | 3,637.48 |
| iz | | iz | 4,006,982 | 0.37 % | 3,531.33 |
| pri | p | ri | 3,934,274 | 0.36 % | 3,467.26 |
| od | | od | 3,857,055 | 0.36 % | 3,399.20 |
| tako | ta | ko | 3,742,620 | 0.35 % | 3,298.35 |
| že | | že | 3,735,428 | 0.35 % | 3,292.01 |
| do | | do | 3,436,485 | 0.32 % | 3,028.56 |
| lahko | lah | ko | 3,378,762 | 0.31 % | 2,977.69 |
| ali | a | li | 3,266,629 | 0.30 % | 2,878.86 |
| ko | | ko | 3,180,620 | 0.29 % | 2,803.06 |
| med | m | ed | 3,010,972 | 0.28 % | 2,653.55 |
| ga | | ga | 2,820,925 | 0.26 % | 2,486.07 |
| bodo | bo | do | 2,658,670 | 0.25 % | 2,343.07 |
| kar | k | ar | 2,643,190 | 0.24 % | 2,329.43 |
| jih | j | ih | 2,543,854 | 0.24 % | 2,241.89 |

1.2.18 List of final character-level 3-grams from all lower-case word forms in the Gigafida 2.0 corpus
File at CLARIN.SI: GF2.0-word_parts-all-lowercase_forms-final-3grams-entire.tsv

| Lower-case word form | Rest of the word | Final part of the word | Total absolute frequency of lower-case word form | Percentage of all found lower-case word forms | Total relative frequency (per million occurrences) |
| --- | --- | --- | --- | --- | --- |
| tudi | t | udi | 8,478,795 | 1.04 % | 7,472.32 |
| kot | | kot | 5,708,530 | 0.70 % | 5,030.90 |
| pri | | pri | 3,934,274 | 0.48 % | 3,467.26 |
| tako | t | ako | 3,742,620 | 0.46 % | 3,298.35 |
| lahko | la | hko | 3,378,762 | 0.41 % | 2,977.69 |
| ali | | ali | 3,266,629 | 0.40 % | 2,878.86 |
| med | | med | 3,010,972 | 0.37 % | 2,653.55 |
| bodo | b | odo | 2,658,670 | 0.33 % | 2,343.07 |
| kar | | kar | 2,643,190 | 0.33 % | 2,329.43 |
| jih | | jih | 2,543,854 | 0.31 % | 2,241.89 |
| tem | | tem | 2,440,313 | 0.30 % | 2,150.64 |
| več | | več | 2,415,744 | 0.30 % | 2,128.98 |
| sem | | sem | 2,410,811 | 0.30 % | 2,124.64 |
| bil | | bil | 2,389,589 | 0.29 % | 2,105.93 |
| pred | p | red | 2,269,564 | 0.28 % | 2,000.16 |
| sta | | sta | 2,179,842 | 0.27 % | 1,921.08 |
| vse | | vse | 2,134,934 | 0.26 % | 1,881.51 |
| bilo | b | ilo | 2,082,027 | 0.26 % | 1,834.88 |
| naj | | naj | 1,987,817 | 0.24 % | 1,751.85 |
| bila | b | ila | 1,952,427 | 0.24 % | 1,720.66 |
| saj | | saj | 1,922,966 | 0.24 % | 1,694.70 |
| smo | | smo | 1,909,804 | 0.23 % | 1,683.10 |
| leta | l | eta | 1,848,499 | 0.23 % | 1,629.07 |
| zaradi | zar | adi | 1,792,345 | 0.22 % | 1,579.58 |
| nekaj | ne | kaj | 1,681,897 | 0.21 % | 1,482.25 |
| ter | | ter | 1,670,311 | 0.20 % | 1,472.04 |
| zato | z | ato | 1,625,491 | 0.20 % | 1,432.54 |
| ker | | ker | 1,508,486 | 0.18 % | 1,329.42 |
| zelo | z | elo | 1,442,297 | 0.18 % | 1,271.09 |
| sicer | si | cer | 1,341,817 | 0.17 % | 1,182.54 |
| tega | t | ega | 1,301,031 | 0.16 % | 1,146.59 |
| prav | p | rav | 1,266,144 | 0.16 % | 1,115.85 |

1.2.19 List of final character-level 4-grams from all lower-case word forms in the Gigafida 2.0 corpus
File at CLARIN.SI: GF2.0-word_parts-all-lowercase_forms-final-4grams-entire.tsv

| Lower-case word form | Rest of the word | Final part of the word | Total absolute frequency of lower-case word form | Percentage of all found lower-case word forms | Total relative frequency (per million occurrences) |
| --- | --- | --- | --- | --- | --- |
| tudi | | tudi | 8,478,795 | 1.17 % | 7,472.32 |
| tako | | tako | 3,742,620 | 0.52 % | 3,298.35 |
| lahko | l | ahko | 3,378,762 | 0.47 % | 2,977.69 |
| bodo | | bodo | 2,658,670 | 0.37 % | 2,343.07 |
| pred | | pred | 2,269,564 | 0.31 % | 2,000.16 |
| bilo | | bilo | 2,082,027 | 0.29 % | 1,834.88 |
| bila | | bila | 1,952,427 | 0.27 % | 1,720.66 |
| leta | | leta | 1,848,499 | 0.26 % | 1,629.07 |
| zaradi | za | radi | 1,792,345 | 0.25 % | 1,579.58 |
| nekaj | n | ekaj | 1,681,897 | 0.23 % | 1,482.25 |
| zato | | zato | 1,625,491 | 0.22 % | 1,432.54 |
| zelo | | zelo | 1,442,297 | 0.20 % | 1,271.09 |
| sicer | s | icer | 1,341,817 | 0.18 % | 1,182.54 |
| tega | | tega | 1,301,031 | 0.18 % | 1,146.59 |
| prav | | prav | 1,266,144 | 0.17 % | 1,115.85 |
| bolj | | bolj | 1,262,089 | 0.17 % | 1,112.27 |
| veliko | ve | liko | 1,199,832 | 0.17 % | 1,057.41 |
| kjer | | kjer | 1,147,660 | 0.16 % | 1,011.43 |
| zdaj | | zdaj | 1,082,612 | 0.15 % | 954.10 |
| vedno | v | edno | 1,075,231 | 0.15 % | 947.60 |
| niso | | niso | 1,033,081 | 0.14 % | 910.45 |
| kako | | kako | 1,015,957 | 0.14 % | 895.36 |
| samo | | samo | 1,000,522 | 0.14 % | 881.75 |
| vendar | ve | ndar | 999,460 | 0.14 % | 880.82 |
| bili | | bili | 949,367 | 0.13 % | 836.67 |
| danes | d | anes | 905,175 | 0.12 % | 797.73 |
| namreč | na | mreč | 902,137 | 0.12 % | 795.05 |
| brez | | brez | 896,121 | 0.12 % | 789.75 |
| svoje | s | voje | 896,119 | 0.12 % | 789.75 |
| proti | p | roti | 886,922 | 0.12 % | 781.64 |
| predvsem | pred | vsem | 844,382 | 0.12 % | 744.15 |
| drugi | d | rugi | 822,391 | 0.11 % | 724.77 |

1.2.20 List of final character-level 5-grams from all lower-case word forms in the Gigafida 2.0 corpus
File at CLARIN.SI: GF2.0-word_parts-all-lowercase_forms-final-5grams-entire.tsv

| Lower-case word form | Rest of the word | Final part of the word | Total absolute frequency of lower-case word form | Percentage of all found lower-case word forms | Total relative frequency (per million occurrences) |
| --- | --- | --- | --- | --- | --- |
| lahko | | lahko | 3,378,762 | 0.55 % | 2,977.69 |
| zaradi | z | aradi | 1,792,345 | 0.29 % | 1,579.58 |
| nekaj | | nekaj | 1,681,897 | 0.27 % | 1,482.25 |
| sicer | | sicer | 1,341,817 | 0.22 % | 1,182.54 |
| veliko | v | eliko | 1,199,832 | 0.19 % | 1,057.41 |
| vedno | | vedno | 1,075,231 | 0.17 % | 947.60 |
| vendar | v | endar | 999,460 | 0.16 % | 880.82 |
| danes | | danes | 905,175 | 0.15 % | 797.73 |
| namreč | n | amreč | 902,137 | 0.15 % | 795.05 |
| svoje | | svoje | 896,119 | 0.14 % | 789.75 |
| proti | | proti | 886,922 | 0.14 % | 781.64 |
| predvsem | pre | dvsem | 844,382 | 0.14 % | 744.15 |
| drugi | | drugi | 822,391 | 0.13 % | 724.77 |
| potem | | potem | 814,625 | 0.13 % | 717.92 |
| dobro | | dobro | 786,829 | 0.13 % | 693.43 |
| najbolj | na | jbolj | 780,726 | 0.13 % | 688.05 |
| evrov | | evrov | 773,109 | 0.12 % | 681.34 |
| seveda | s | eveda | 725,856 | 0.12 % | 639.69 |
| treba | | treba | 715,340 | 0.12 % | 630.43 |
| čeprav | č | eprav | 679,676 | 0.11 % | 599 |
| pravi | | pravi | 676,188 | 0.11 % | 595.92 |
| glede | | glede | 671,713 | 0.11 % | 591.98 |
| strani | s | trani | 656,391 | 0.11 % | 578.47 |
| oziroma | oz | iroma | 648,348 | 0.10 % | 571.39 |
| ljudi | | ljudi | 634,368 | 0.10 % | 559.07 |
| ljubljana | ljub | ljana | 629,756 | 0.10 % | 555 |
| mesto | | mesto | 628,321 | 0.10 % | 553.74 |
| skupaj | s | kupaj | 614,586 | 0.10 % | 541.63 |
| letos | | letos | 606,877 | 0.10 % | 534.84 |
| poleg | | poleg | 594,327 | 0.10 % | 523.78 |
| letih | | letih | 593,893 | 0.10 % | 523.39 |
| imajo | | imajo | 589,979 | 0.10 % | 519.95 |

File at CLARIN.SI 1.2.21 List of initial character-level 1-grams from noun lemmas in
the Gigafida 2.0 corpus with text-type distributionGF2.0-word_parts-nouns-lemmas- initial-1grams-taxonomy-entire.tsvLemma Lemma (lower-case) Initial part of the word Rest of the word Total absolute frequency of lemma Percentage of all found lemmas Total relative frequency (per million occurrences) Absolute frequency [SSJ.T.K.N] Percentage [SSJ.T.K.N] Relative frequency [SSJ.T.K.N] Absolute frequency [SSJ.T.K.L] Percentage [SSJ.T.K.L] Relative frequency [SSJ.T.K.L] Absolute frequency [SSJ.T.D] Percentage [SSJ.T.D] Relative frequency [SSJ.T.D] Absolute frequency [SSJ.T.P.C] Percentage [SSJ.T.P.C] Relative frequency [SSJ.T.P.C] Absolute frequency [SSJ.T.P.R] Percentage [SSJ.T.P.R] Relative frequency [SSJ.T.P.R] Absolute frequency [SSJ.I] Percentage [SSJ.I] Relative frequency [SSJ.I] Absolute frequency [SSJ.T.K.S] Percentage [SSJ.T.K.S] Relative frequency [SSJ.T.K.S] leto leto l eto 4,967,678 1.41 % 4,377.99 0 0 % 0 58,808 0.69 % 1,480.72 11,974 0.97 % 3,026.90 2,501,788 1.45 % 4,609.71 732,438 1.33 % 3,908.05 1,531,105 1.49 % 4,815.72 131,565 1.02 % 3,064.31 čas čas č as 1,869,767 0.53 % 1,647.82 15 0.49 % 1,545.28 65,738 0.78 % 1,655.21 5,627 0.46 % 1,422.44 884,076 0.51 % 1,628.97 353,071 0.64 % 1,883.87 493,125 0.48 % 1,551.01 68,115 0.53 % 1,586.48 dan dan d an 1,852,418 0.53 % 1,632.53 4 0.13 % 412.07 63,871 0.75 % 1,608.20 7,241 0.58 % 1,830.45 911,593 0.53 % 1,679.67 285,803 0.52 % 1,524.95 544,212 0.53 % 1,711.69 39,694 0.31 % 924.52 Slovenija slovenija S lovenija 1,764,516 0.50 % 1,555.06 0 0 % 0 1,408 0.02 % 35.45 11,071 0.90 % 2,798.63 894,914 0.52 % 1,648.94 193,536 0.35 % 1,032.64 645,977 0.63 % 2,031.77 17,610 0.14 % 410.16 delo delo d elo 1,611,830 0.46 % 1,420.50 2 0.07 % 206.04 21,312 0.25 % 536.61 6,576 0.53 % 1,662.34 832,012 0.48 % 1,533.04 223,957 0.41 % 1,194.96 455,097 0.44 % 1,431.40 72,874 0.56 % 1,697.32 mesto mesto m esto 1,587,720 0.45 % 1,399.25 0 0 % 0 30,392 0.36 % 765.24 2,457 0.20 % 621.10 831,762 0.48 % 1,532.58 187,507 0.34 % 
1,000.48 494,079 0.48 % 1,554.01 41,523 0.32 % 967.12 človek človek č lovek 1,543,691 0.44 % 1,360.45 0 0 % 0 64,013 0.76 % 1,611.78 4,089 0.33 % 1,033.66 693,749 0.40 % 1,278.28 286,123 0.52 % 1,526.66 411,720 0.40 % 1,294.97 83,997 0.65 % 1,956.39 država država d ržava 1,502,639 0.43 % 1,324.27 0 0 % 0 5,437 0.06 % 136.90 9,324 0.75 % 2,357.01 761,820 0.44 % 1,403.70 149,312 0.27 % 796.68 536,561 0.52 % 1,687.62 40,185 0.31 % 935.96 svet svet s vet 1,216,329 0.34 % 1,071.94 0 0 % 0 27,166 0.32 % 684.01 4,324 0.35 % 1,093.06 611,210 0.35 % 1,126.19 177,328 0.32 % 946.16 351,299 0.34 % 1,104.93 45,002 0.35 % 1,048.15 odstotek odstotek o dstotek 1,089,884 0.31 % 960.51 0 0 % 0 898 0.01 % 22.61 583 0.05 % 147.38 558,212 0.32 % 1,028.54 82,011 0.15 % 437.58 439,901 0.43 % 1,383.60 8,279 0.06 % 192.83 Ljubljana ljubljana L jubljana 1,078,589 0.31 % 950.56 0 0 % 0 2,707 0.03 % 68.16 940 0.08 % 237.62 570,072 0.33 % 1,050.40 89,473 0.16 % 477.40 398,878 0.39 % 1,254.58 16,519 0.13 % 384.75 podjetje podjetje p odjetje 1,001,078 0.28 % 882.24 0 0 % 0 2,879 0.03 % 72.49 2,661 0.21 % 672.67 537,241 0.31 % 989.90 154,968 0.28 % 826.86 283,563 0.28 % 891.88 19,766 0.15 % 460.37 tekma tekma t ekma 978,012 0.28 % 861.92 0 0 % 0 1,508 0.02 % 37.97 81 0.01 % 20.48 519,746 0.30 % 957.67 45,417 0.08 % 242.33 409,852 0.40 % 1,289.09 1,408 0.01 % 32.79 del del d el 960,422 0.27 % 846.42 7 0.23 % 721.13 13,353 0.16 % 336.21 3,660 0.30 % 925.21 460,019 0.27 % 847.62 165,687 0.30 % 884.05 267,226 0.26 % 840.50 50,470 0.39 % 1,175.51 evro evro e vro 952,189 0.27 % 839.16 0 0 % 0 587 0.01 % 14.78 39 0 % 9.86 403,032 0.23 % 742.61 41,108 0.07 % 219.34 505,459 0.49 % 1,589.80 1,964 0.01 % 45.74 predsednik predsednik p redsednik 935,429 0.27 % 824.39 0 0 % 0 3,383 0.04 % 85.18 2,020 0.16 % 510.63 520,594 0.30 % 959.23 65,730 0.12 % 350.71 339,004 0.33 % 1,066.26 4,698 0.04 % 109.42 primer primer p rimer 905,773 0.26 % 798.25 8 0.26 % 824.15 14,594 0.17 % 367.46 5,373 0.43 % 1,358.24 406,445 
0.24 % 748.90 154,197 0.28 % 822.74 266,038 0.26 % 836.76 59,118 0.46 % 1,376.93 konec konec k onec 893,711 0.25 % 787.62 26 0.85 % 2,678.48 29,432 0.35 % 741.07 2,089 0.17 % 528.08 428,007 0.25 % 788.63 131,299 0.24 % 700.57 277,983 0.27 % 874.33 24,875 0.19 % 579.37 ura ura u ra 887,827 0.25 % 782.44 38 1.24 % 3,914.70 28,341 0.33 % 713.60 1,567 0.13 % 396.12 495,541 0.29 % 913.07 115,281 0.21 % 615.10 229,747 0.22 % 722.61 17,312 0.13 % 403.22 milijon milijon m ilijon 863,041 0.24 % 760.59 0 0 % 0 2,587 0.03 % 65.14 561 0.04 % 141.81 472,592 0.27 % 870.78 73,218 0.13 % 390.67 308,212 0.30 % 969.41 5,871 0.04 % 136.74 otrok otrok o trok 858,940 0.24 % 756.98 0 0 % 0 32,917 0.39 % 828.81 1,735 0.14 % 438.59 386,949 0.23 % 712.98 167,600 0.30 % 894.26 219,308 0.21 % 689.78 50,431 0.39 % 1,174.60 skupina skupina s kupina 857,991 0.24 % 756.14 0 0 % 0 6,619 0.08 % 166.66 2,360 0.19 % 596.58 424,903 0.25 % 782.91 115,348 0.21 % 615.46 268,220 0.26 % 843.62 40,541 0.31 % 944.25 stran stran s tran 856,113 0.24 % 754.49 24 0.78 % 2,472.44 28,776 0.34 % 724.55 2,133 0.17 % 539.20 357,367 0.21 % 658.47 172,145 0.31 % 918.51 250,185 0.24 % 786.90 45,483 0.35 % 1,059.35 vlada vlada v lada 767,169 0.22 % 676.10 0 0 % 0 1,902 0.02 % 47.89 5,609 0.45 % 1,417.89 398,609 0.23 % 734.46 56,720 0.10 % 302.64 297,430 0.29 % 935.49 6,899 0.05 % 160.69 življenje življenje ž ivljenje 762,139 0.22 % 671.67 0 0 % 0 38,014 0.45 % 957.15 2,198 0.18 % 555.63 319,278 0.18 % 588.29 169,305 0.31 % 903.36 178,445 0.17 % 561.26 54,899 0.42 % 1,278.66 družba družba d ružba 762,071 0.22 % 671.61 0 0 % 0 6,818 0.08 % 171.67 4,866 0.39 % 1,230.07 394,846 0.23 % 727.53 79,568 0.14 % 424.55 247,831 0.24 % 779.49 28,142 0.22 % 655.46 zakon zakon z akon 706,952 0.20 % 623.03 1 0.03 % 103.02 4,285 0.05 % 107.89 20,577 1.66 % 5,201.64 358,907 0.21 % 661.31 70,886 0.13 % 378.22 236,331 0.23 % 743.32 15,965 0.12 % 371.84 občina občina o bčina 688,467 0.20 % 606.74 0 0 % 0 430 0.01 % 10.83 2,616 0.21 % 661.30 
491,291 0.28 % 905.24 24,281 0.04 % 129.56 167,303 0.16 % 526.21 2,546 0.02 % 59.30 teden teden t eden 683,852 0.19 % 602.68 2 0.07 % 206.04 14,002 0.17 % 352.56 904 0.07 % 228.52 351,497 0.20 % 647.66 84,244 0.15 % 449.50 225,387 0.22 % 708.90 7,816 0.06 % 182.04 prostor prostor p rostor 672,859 0.19 % 592.99 0 0 % 0 13,594 0.16 % 342.28 1,932 0.16 % 488.39 359,646 0.21 % 662.67 111,497 0.20 % 594.91 160,130 0.16 % 503.65 26,060 0.20 % 606.97 točka točka t očka 672,245 0.19 % 592.45 0 0 % 0 3,432 0.04 % 86.41 3,686 0.30 % 931.78 305,174 0.18 % 562.30 49,829 0.09 % 265.87 294,805 0.29 % 927.24 15,319 0.12 % 356.80 program program p rogram 667,540 0.19 % 588.30 0 0 % 0 1,804 0.02 % 45.42 3,767 0.30 % 952.26 339,935 0.20 % 626.35 138,849 0.25 % 740.85 162,628 0.16 % 511.51 20,557 0.16 % 478.80 CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 38 File at CLARIN.SI 1.2.22 List of initial character-level 2-grams from noun lemmas in the Gigafida 2.0 corpus with text-type distributionGF2.0-word_parts-nouns-lemmas- initial-2grams-taxonomy-entire.tsvLemma Lemma (lower-case) Initial part of the word Rest of the word Total absolute frequency of lemma Percentage of all found lemmas Total relative frequency (per million occurrences) Absolute frequency [SSJ.T.K.N] Percentage [SSJ.T.K.N] Relative frequency [SSJ.T.K.N] Absolute frequency [SSJ.T.K.L] Percentage [SSJ.T.K.L] Relative frequency [SSJ.T.K.L] Absolute frequency [SSJ.T.D] Percentage [SSJ.T.D] Relative frequency [SSJ.T.D] Absolute frequency [SSJ.T.P.C] Percentage [SSJ.T.P.C] Relative frequency [SSJ.T.P.C] Absolute frequency [SSJ.T.P.R] Percentage [SSJ.T.P.R] Relative frequency [SSJ.T.P.R] Absolute frequency [SSJ.I] Percentage [SSJ.I] Relative frequency [SSJ.I] Absolute frequency [SSJ.T.K.S] Percentage [SSJ.T.K.S] Relative frequency [SSJ.T.K.S] leto leto le to 4,967,678 1.41 % 4,377.99 0 0 % 0 58,808 0.69 % 1,480.72 11,974 0.97 % 3,026.90 2,501,788 1.46 % 4,609.71 732,438 1.34 % 3,908.05 
1,531,105 1.49 % 4,815.72 131,565 1.02 % 3,064.31 čas čas ča s 1,869,767 0.53 % 1,647.82 15 0.50 % 1,545.28 65,738 0.78 % 1,655.21 5,627 0.46 % 1,422.44 884,076 0.52 % 1,628.97 353,071 0.65 % 1,883.87 493,125 0.48 % 1,551.01 68,115 0.53 % 1,586.48 dan dan da n 1,852,418 0.53 % 1,632.53 4 0.13 % 412.07 63,871 0.76 % 1,608.20 7,241 0.59 % 1,830.45 911,593 0.53 % 1,679.67 285,803 0.52 % 1,524.95 544,212 0.53 % 1,711.69 39,694 0.31 % 924.52 Slovenija slovenija Sl ovenija 1,764,516 0.50 % 1,555.06 0 0 % 0 1,408 0.02 % 35.45 11,071 0.90 % 2,798.63 894,914 0.52 % 1,648.94 193,536 0.35 % 1,032.64 645,977 0.63 % 2,031.77 17,610 0.14 % 410.16 delo delo de lo 1,611,830 0.46 % 1,420.50 2 0.07 % 206.04 21,312 0.25 % 536.61 6,576 0.54 % 1,662.34 832,012 0.48 % 1,533.04 223,957 0.41 % 1,194.96 455,097 0.44 % 1,431.40 72,874 0.57 % 1,697.32 mesto mesto me sto 1,587,720 0.45 % 1,399.25 0 0 % 0 30,392 0.36 % 765.24 2,457 0.20 % 621.10 831,762 0.48 % 1,532.58 187,507 0.34 % 1,000.48 494,079 0.48 % 1,554.01 41,523 0.32 % 967.12 človek človek čl ovek 1,543,691 0.44 % 1,360.45 0 0 % 0 64,013 0.76 % 1,611.78 4,089 0.33 % 1,033.66 693,749 0.40 % 1,278.28 286,123 0.53 % 1,526.66 411,720 0.40 % 1,294.97 83,997 0.65 % 1,956.39 država država dr žava 1,502,639 0.43 % 1,324.27 0 0 % 0 5,437 0.06 % 136.90 9,324 0.76 % 2,357.01 761,820 0.44 % 1,403.70 149,312 0.27 % 796.68 536,561 0.52 % 1,687.62 40,185 0.31 % 935.96 svet svet sv et 1,216,329 0.35 % 1,071.94 0 0 % 0 27,166 0.32 % 684.01 4,324 0.35 % 1,093.06 611,210 0.36 % 1,126.19 177,328 0.33 % 946.16 351,299 0.34 % 1,104.93 45,002 0.35 % 1,048.15 odstotek odstotek od stotek 1,089,884 0.31 % 960.51 0 0 % 0 898 0.01 % 22.61 583 0.05 % 147.38 558,212 0.33 % 1,028.54 82,011 0.15 % 437.58 439,901 0.43 % 1,383.60 8,279 0.06 % 192.83 Ljubljana ljubljana Lj ubljana 1,078,589 0.31 % 950.56 0 0 % 0 2,707 0.03 % 68.16 940 0.08 % 237.62 570,072 0.33 % 1,050.40 89,473 0.16 % 477.40 398,878 0.39 % 1,254.58 16,519 0.13 % 384.75 podjetje podjetje po djetje 
1,001,078 0.28 % 882.24 0 0 % 0 2,879 0.03 % 72.49 2,661 0.22 % 672.67 537,241 0.31 % 989.90 154,968 0.28 % 826.86 283,563 0.28 % 891.88 19,766 0.15 % 460.37
tekma tekma te kma 978,012 0.28 % 861.92 0 0 % 0 1,508 0.02 % 37.97 81 0.01 % 20.48 519,746 0.30 % 957.67 45,417 0.08 % 242.33 409,852 0.40 % 1,289.09 1,408 0.01 % 32.79
del del de l 960,422 0.27 % 846.42 7 0.23 % 721.13 13,353 0.16 % 336.21 3,660 0.30 % 925.21 460,019 0.27 % 847.62 165,687 0.30 % 884.05 267,226 0.26 % 840.50 50,470 0.39 % 1,175.51
evro evro ev ro 952,189 0.27 % 839.16 0 0 % 0 587 0.01 % 14.78 39 0 % 9.86 403,032 0.23 % 742.61 41,108 0.07 % 219.34 505,459 0.49 % 1,589.80 1,964 0.01 % 45.74
predsednik predsednik pr edsednik 935,429 0.27 % 824.39 0 0 % 0 3,383 0.04 % 85.18 2,020 0.16 % 510.63 520,594 0.30 % 959.23 65,730 0.12 % 350.71 339,004 0.33 % 1,066.26 4,698 0.04 % 109.42
primer primer pr imer 905,773 0.26 % 798.25 8 0.27 % 824.15 14,594 0.17 % 367.46 5,373 0.44 % 1,358.24 406,445 0.24 % 748.90 154,197 0.28 % 822.74 266,038 0.26 % 836.76 59,118 0.46 % 1,376.93
konec konec ko nec 893,711 0.25 % 787.62 26 0.87 % 2,678.48 29,432 0.35 % 741.07 2,089 0.17 % 528.08 428,007 0.25 % 788.63 131,299 0.24 % 700.57 277,983 0.27 % 874.33 24,875 0.19 % 579.37
ura ura ur a 887,827 0.25 % 782.44 38 1.27 % 3,914.70 28,341 0.34 % 713.60 1,567 0.13 % 396.12 495,541 0.29 % 913.07 115,281 0.21 % 615.10 229,747 0.22 % 722.61 17,312 0.14 % 403.22
milijon milijon mi lijon 863,041 0.25 % 760.59 0 0 % 0 2,587 0.03 % 65.14 561 0.05 % 141.81 472,592 0.28 % 870.78 73,218 0.13 % 390.67 308,212 0.30 % 969.41 5,871 0.05 % 136.74
otrok otrok ot rok 858,940 0.24 % 756.98 0 0 % 0 32,917 0.39 % 828.81 1,735 0.14 % 438.59 386,949 0.23 % 712.98 167,600 0.31 % 894.26 219,308 0.21 % 689.78 50,431 0.39 % 1,174.60
skupina skupina sk upina 857,991 0.24 % 756.14 0 0 % 0 6,619 0.08 % 166.66 2,360 0.19 % 596.58 424,903 0.25 % 782.91 115,348 0.21 % 615.46 268,220 0.26 % 843.62 40,541 0.32 % 944.25
stran stran st ran 856,113 0.24 % 754.49 24 0.80 % 2,472.44 28,776 0.34 % 724.55 2,133 0.17 % 539.20 357,367 0.21 % 658.47 172,145 0.32 % 918.51 250,185 0.24 % 786.90 45,483 0.35 % 1,059.35
vlada vlada vl ada 767,169 0.22 % 676.10 0 0 % 0 1,902 0.02 % 47.89 5,609 0.46 % 1,417.89 398,609 0.23 % 734.46 56,720 0.10 % 302.64 297,430 0.29 % 935.49 6,899 0.05 % 160.69
življenje življenje ži vljenje 762,139 0.22 % 671.67 0 0 % 0 38,014 0.45 % 957.15 2,198 0.18 % 555.63 319,278 0.19 % 588.29 169,305 0.31 % 903.36 178,445 0.17 % 561.26 54,899 0.43 % 1,278.66
družba družba dr užba 762,071 0.22 % 671.61 0 0 % 0 6,818 0.08 % 171.67 4,866 0.40 % 1,230.07 394,846 0.23 % 727.53 79,568 0.15 % 424.55 247,831 0.24 % 779.49 28,142 0.22 % 655.46
zakon zakon za kon 706,952 0.20 % 623.03 1 0.03 % 103.02 4,285 0.05 % 107.89 20,577 1.67 % 5,201.64 358,907 0.21 % 661.31 70,886 0.13 % 378.22 236,331 0.23 % 743.32 15,965 0.12 % 371.84
občina občina ob čina 688,467 0.20 % 606.74 0 0 % 0 430 0.01 % 10.83 2,616 0.21 % 661.30 491,291 0.29 % 905.24 24,281 0.04 % 129.56 167,303 0.16 % 526.21 2,546 0.02 % 59.30
teden teden te den 683,852 0.20 % 602.68 2 0.07 % 206.04 14,002 0.17 % 352.56 904 0.07 % 228.52 351,497 0.20 % 647.66 84,244 0.15 % 449.50 225,387 0.22 % 708.90 7,816 0.06 % 182.04
prostor prostor pr ostor 672,859 0.19 % 592.99 0 0 % 0 13,594 0.16 % 342.28 1,932 0.16 % 488.39 359,646 0.21 % 662.67 111,497 0.20 % 594.91 160,130 0.16 % 503.65 26,060 0.20 % 606.97
točka točka to čka 672,245 0.19 % 592.45 0 0 % 0 3,432 0.04 % 86.41 3,686 0.30 % 931.78 305,174 0.18 % 562.30 49,829 0.09 % 265.87 294,805 0.29 % 927.24 15,319 0.12 % 356.80
program program pr ogram 667,540 0.19 % 588.30 0 0 % 0 1,804 0.02 % 45.42 3,767 0.31 % 952.26 339,935 0.20 % 626.35 138,849 0.26 % 740.85 162,628 0.16 % 511.51 20,557 0.16 % 478.80

1.2.23 List of initial character-level 3-grams from noun lemmas in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-word_parts-nouns-lemmas-initial-3grams-taxonomy-entire.tsv
Columns: Lemma; Lemma (lower-case); Initial part of the word; Rest of the word; Total absolute frequency of lemma; Percentage of all found lemmas; Total relative frequency (per million occurrences); Absolute frequency [SSJ.T.K.N]; Percentage [SSJ.T.K.N]; Relative frequency [SSJ.T.K.N]; Absolute frequency [SSJ.T.K.L]; Percentage [SSJ.T.K.L]; Relative frequency [SSJ.T.K.L]; Absolute frequency [SSJ.T.D]; Percentage [SSJ.T.D]; Relative frequency [SSJ.T.D]; Absolute frequency [SSJ.T.P.C]; Percentage [SSJ.T.P.C]; Relative frequency [SSJ.T.P.C]; Absolute frequency [SSJ.T.P.R]; Percentage [SSJ.T.P.R]; Relative frequency [SSJ.T.P.R]; Absolute frequency [SSJ.I]; Percentage [SSJ.I]; Relative frequency [SSJ.I]; Absolute frequency [SSJ.T.K.S]; Percentage [SSJ.T.K.S]; Relative frequency [SSJ.T.K.S]
leto leto let o 4,967,678 1.43 % 4,377.99 0 0 % 0 58,808 0.70 % 1,480.72 11,974 0.98 % 3,026.90 2,501,788 1.47 % 4,609.71 732,438 1.36 % 3,908.05 1,531,105 1.50 % 4,815.72 131,565 1.03 % 3,064.31
čas čas čas 1,869,767 0.54 % 1,647.82 15 0.51 % 1,545.28 65,738 0.78 % 1,655.21 5,627 0.46 % 1,422.44 884,076 0.52 % 1,628.97 353,071 0.65 % 1,883.87 493,125 0.48 % 1,551.01 68,115 0.53 % 1,586.48
dan dan dan 1,852,418 0.53 % 1,632.53 4 0.14 % 412.07 63,871 0.76 % 1,608.20 7,241 0.59 % 1,830.45 911,593 0.54 % 1,679.67 285,803 0.53 % 1,524.95 544,212 0.54 % 1,711.69 39,694 0.31 % 924.52
Slovenija slovenija Slo venija 1,764,516 0.51 % 1,555.06 0 0 % 0 1,408 0.02 % 35.45 11,071 0.90 % 2,798.63 894,914 0.53 % 1,648.94 193,536 0.36 % 1,032.64 645,977 0.64 % 2,031.77 17,610 0.14 % 410.16
delo delo del o 1,611,830 0.46 % 1,420.50 2 0.07 % 206.04 21,312 0.25 % 536.61 6,576 0.54 % 1,662.34 832,012 0.49 % 1,533.04 223,957 0.41 % 1,194.96 455,097 0.45 % 1,431.40 72,874 0.57 % 1,697.32
mesto mesto mes to 1,587,720 0.46 % 1,399.25 0 0 % 0 30,392 0.36 % 765.24 2,457 0.20 % 621.10 831,762 0.49 % 1,532.58 187,507 0.35 % 1,000.48 494,079 0.49 % 1,554.01 41,523 0.33 % 967.12
človek človek člo vek 1,543,691 0.44 % 1,360.45 0 0 % 0 64,013 0.76 % 1,611.78 4,089 0.33 % 1,033.66 693,749 0.41 % 1,278.28 286,123 0.53 % 1,526.66 411,720 0.41 % 1,294.97 83,997 0.66 % 1,956.39
država država drž ava 1,502,639 0.43 % 1,324.27 0 0 % 0 5,437 0.06 % 136.90 9,324 0.76 % 2,357.01 761,820 0.45 % 1,403.70 149,312 0.28 % 796.68 536,561 0.53 % 1,687.62 40,185 0.32 % 935.96
svet svet sve t 1,216,329 0.35 % 1,071.94 0 0 % 0 27,166 0.32 % 684.01 4,324 0.35 % 1,093.06 611,210 0.36 % 1,126.19 177,328 0.33 % 946.16 351,299 0.34 % 1,104.93 45,002 0.35 % 1,048.15
odstotek odstotek ods totek 1,089,884 0.31 % 960.51 0 0 % 0 898 0.01 % 22.61 583 0.05 % 147.38 558,212 0.33 % 1,028.54 82,011 0.15 % 437.58 439,901 0.43 % 1,383.60 8,279 0.07 % 192.83
Ljubljana ljubljana Lju bljana 1,078,589 0.31 % 950.56 0 0 % 0 2,707 0.03 % 68.16 940 0.08 % 237.62 570,072 0.34 % 1,050.40 89,473 0.17 % 477.40 398,878 0.39 % 1,254.58 16,519 0.13 % 384.75
podjetje podjetje pod jetje 1,001,078 0.29 % 882.24 0 0 % 0 2,879 0.03 % 72.49 2,661 0.22 % 672.67 537,241 0.32 % 989.90 154,968 0.29 % 826.86 283,563 0.28 % 891.88 19,766 0.15 % 460.37
tekma tekma tek ma 978,012 0.28 % 861.92 0 0 % 0 1,508 0.02 % 37.97 81 0.01 % 20.48 519,746 0.30 % 957.67 45,417 0.08 % 242.33 409,852 0.40 % 1,289.09 1,408 0.01 % 32.79
del del del 960,422 0.28 % 846.42 7 0.24 % 721.13 13,353 0.16 % 336.21 3,660 0.30 % 925.21 460,019 0.27 % 847.62 165,687 0.31 % 884.05 267,226 0.26 % 840.50 50,470 0.40 % 1,175.51
evro evro evr o 952,189 0.27 % 839.16 0 0 % 0 587 0.01 % 14.78 39 0 % 9.86 403,032 0.24 % 742.61 41,108 0.08 % 219.34 505,459 0.50 % 1,589.80 1,964 0.01 % 45.74
predsednik predsednik pre dsednik 935,429 0.27 % 824.39 0 0 % 0 3,383 0.04 % 85.18 2,020 0.17 % 510.63 520,594 0.31 % 959.23 65,730 0.12 % 350.71 339,004 0.33 % 1,066.26 4,698 0.04 % 109.42
primer primer pri mer 905,773 0.26 % 798.25 8 0.27 % 824.15 14,594 0.17 % 367.46 5,373 0.44 % 1,358.24 406,445 0.24 % 748.90 154,197 0.29 % 822.74 266,038 0.26 % 836.76 59,118 0.46 % 1,376.93
konec konec kon ec 893,711 0.26 % 787.62 26 0.88 % 2,678.48 29,432 0.35 % 741.07 2,089 0.17 % 528.08 428,007 0.25 % 788.63 131,299 0.24 % 700.57 277,983 0.27 % 874.33 24,875 0.20 % 579.37
ura ura ura 887,827 0.26 % 782.44 38 1.28 % 3,914.70 28,341 0.34 % 713.60 1,567 0.13 % 396.12 495,541 0.29 % 913.07 115,281 0.21 % 615.10 229,747 0.23 % 722.61 17,312 0.14 % 403.22
milijon milijon mil ijon 863,041 0.25 % 760.59 0 0 % 0 2,587 0.03 % 65.14 561 0.05 % 141.81 472,592 0.28 % 870.78 73,218 0.14 % 390.67 308,212 0.30 % 969.41 5,871 0.05 % 136.74
otrok otrok otr ok 858,940 0.25 % 756.98 0 0 % 0 32,917 0.39 % 828.81 1,735 0.14 % 438.59 386,949 0.23 % 712.98 167,600 0.31 % 894.26 219,308 0.22 % 689.78 50,431 0.40 % 1,174.60
skupina skupina sku pina 857,991 0.25 % 756.14 0 0 % 0 6,619 0.08 % 166.66 2,360 0.19 % 596.58 424,903 0.25 % 782.91 115,348 0.21 % 615.46 268,220 0.26 % 843.62 40,541 0.32 % 944.25
stran stran str an 856,113 0.25 % 754.49 24 0.81 % 2,472.44 28,776 0.34 % 724.55 2,133 0.17 % 539.20 357,367 0.21 % 658.47 172,145 0.32 % 918.51 250,185 0.25 % 786.90 45,483 0.36 % 1,059.35
vlada vlada vla da 767,169 0.22 % 676.10 0 0 % 0 1,902 0.02 % 47.89 5,609 0.46 % 1,417.89 398,609 0.23 % 734.46 56,720 0.10 % 302.64 297,430 0.29 % 935.49 6,899 0.05 % 160.69
življenje življenje živ ljenje 762,139 0.22 % 671.67 0 0 % 0 38,014 0.45 % 957.15 2,198 0.18 % 555.63 319,278 0.19 % 588.29 169,305 0.31 % 903.36 178,445 0.17 % 561.26 54,899 0.43 % 1,278.66
družba družba dru žba 762,071 0.22 % 671.61 0 0 % 0 6,818 0.08 % 171.67 4,866 0.40 % 1,230.07 394,846 0.23 % 727.53 79,568 0.15 % 424.55 247,831 0.24 % 779.49 28,142 0.22 % 655.46
zakon zakon zak on 706,952 0.20 % 623.03 1 0.03 % 103.02 4,285 0.05 % 107.89 20,577 1.68 % 5,201.64 358,907 0.21 % 661.31 70,886 0.13 % 378.22 236,331 0.23 % 743.32 15,965 0.12 % 371.84
občina občina obč ina 688,467 0.20 % 606.74 0 0 % 0 430 0.01 % 10.83 2,616 0.21 % 661.30 491,291 0.29 % 905.24 24,281 0.04 % 129.56 167,303 0.16 % 526.21 2,546 0.02 % 59.30
teden teden ted en 683,852 0.20 % 602.68 2 0.07 % 206.04 14,002 0.17 % 352.56 904 0.07 % 228.52 351,497 0.21 % 647.66 84,244 0.16 % 449.50 225,387 0.22 % 708.90 7,816 0.06 % 182.04
prostor prostor pro stor 672,859 0.19 % 592.99 0 0 % 0 13,594 0.16 % 342.28 1,932 0.16 % 488.39 359,646 0.21 % 662.67 111,497 0.21 % 594.91 160,130 0.16 % 503.65 26,060 0.20 % 606.97
točka točka toč ka 672,245 0.19 % 592.45 0 0 % 0 3,432 0.04 % 86.41 3,686 0.30 % 931.78 305,174 0.18 % 562.30 49,829 0.09 % 265.87 294,805 0.29 % 927.24 15,319 0.12 % 356.80
program program pro gram 667,540 0.19 % 588.30 0 0 % 0 1,804 0.02 % 45.42 3,767 0.31 % 952.26 339,935 0.20 % 626.35 138,849 0.26 % 740.85 162,628 0.16 % 511.51 20,557 0.16 % 478.80

1.2.24 List of initial character-level 4-grams from noun lemmas in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-word_parts-nouns-lemmas-initial-4grams-taxonomy-entire.tsv
Columns: Lemma; Lemma (lower-case); Initial part of the word; Rest of the word; Total absolute frequency of lemma; Percentage of all found lemmas; Total relative frequency (per million occurrences); Absolute frequency [SSJ.T.K.N]; Percentage [SSJ.T.K.N]; Relative frequency [SSJ.T.K.N]; Absolute frequency [SSJ.T.K.L]; Percentage [SSJ.T.K.L]; Relative frequency [SSJ.T.K.L]; Absolute frequency [SSJ.T.D]; Percentage [SSJ.T.D]; Relative frequency [SSJ.T.D]; Absolute frequency [SSJ.T.P.C]; Percentage [SSJ.T.P.C]; Relative frequency [SSJ.T.P.C]; Absolute frequency [SSJ.T.P.R]; Percentage [SSJ.T.P.R]; Relative frequency [SSJ.T.P.R]; Absolute frequency [SSJ.I]; Percentage [SSJ.I]; Relative frequency [SSJ.I]; Absolute frequency [SSJ.T.K.S]; Percentage [SSJ.T.K.S]; Relative frequency [SSJ.T.K.S]
leto leto leto 4,967,678 1.53 % 4,377.99 0 0 % 0 58,808 0.78 % 1,480.72 11,974 1.03 % 3,026.90 2,501,788 1.57 % 4,609.71 732,438 1.46 % 3,908.05 1,531,105 1.61 % 4,815.72 131,565 1.10 % 3,064.31
Slovenija slovenija Slov enija 1,764,516 0.54 % 1,555.06 0 0 % 0 1,408 0.02 % 35.45 11,071 0.95 % 2,798.63 894,914 0.56 % 1,648.94 193,536 0.39 % 1,032.64 645,977 0.68 % 2,031.77 17,610 0.15 % 410.16
delo delo delo 1,611,830 0.49 % 1,420.50 2 0.07 % 206.04 21,312 0.28 % 536.61 6,576 0.56 % 1,662.34 832,012 0.52 % 1,533.04 223,957 0.45 % 1,194.96 455,097 0.48 % 1,431.40 72,874 0.61 % 1,697.32
mesto mesto mest o 1,587,720 0.49 % 1,399.25 0 0 % 0 30,392 0.40 % 765.24 2,457 0.21 % 621.10 831,762 0.52 % 1,532.58 187,507 0.37 % 1,000.48 494,079 0.52 % 1,554.01 41,523 0.35 % 967.12
človek človek člov ek 1,543,691 0.47 % 1,360.45 0 0 % 0 64,013 0.84 % 1,611.78 4,089 0.35 % 1,033.66 693,749 0.43 % 1,278.28 286,123 0.57 % 1,526.66 411,720 0.43 % 1,294.97 83,997 0.70 % 1,956.39
država država drža va 1,502,639 0.46 % 1,324.27 0 0 % 0 5,437 0.07 % 136.90 9,324 0.80 % 2,357.01 761,820 0.48 % 1,403.70 149,312 0.30 % 796.68 536,561 0.56 % 1,687.62 40,185 0.34 % 935.96
svet svet svet 1,216,329 0.37 % 1,071.94 0 0 % 0 27,166 0.36 % 684.01 4,324 0.37 % 1,093.06 611,210 0.38 % 1,126.19 177,328 0.35 % 946.16 351,299 0.37 % 1,104.93 45,002 0.38 % 1,048.15
odstotek odstotek odst otek 1,089,884 0.34 % 960.51 0 0 % 0 898 0.01 % 22.61 583 0.05 % 147.38 558,212 0.35 % 1,028.54 82,011 0.16 % 437.58 439,901 0.46 % 1,383.60 8,279 0.07 % 192.83
Ljubljana ljubljana Ljub ljana 1,078,589 0.33 % 950.56 0 0 % 0 2,707 0.04 % 68.16 940 0.08 % 237.62 570,072 0.36 % 1,050.40 89,473 0.18 % 477.40 398,878 0.42 % 1,254.58 16,519 0.14 % 384.75
podjetje podjetje podj etje 1,001,078 0.31 % 882.24 0 0 % 0 2,879 0.04 % 72.49 2,661 0.23 % 672.67 537,241 0.34 % 989.90 154,968 0.31 % 826.86 283,563 0.30 % 891.88 19,766 0.17 % 460.37
tekma tekma tekm a 978,012 0.30 % 861.92 0 0 % 0 1,508 0.02 % 37.97 81 0.01 % 20.48 519,746 0.33 % 957.67 45,417 0.09 % 242.33 409,852 0.43 % 1,289.09 1,408 0.01 % 32.79
evro evro evro 952,189 0.29 % 839.16 0 0 % 0 587 0.01 % 14.78 39 0 % 9.86 403,032 0.25 % 742.61 41,108 0.08 % 219.34 505,459 0.53 % 1,589.80 1,964 0.02 % 45.74
predsednik predsednik pred sednik 935,429 0.29 % 824.39 0 0 % 0 3,383 0.04 % 85.18 2,020 0.17 % 510.63 520,594 0.33 % 959.23 65,730 0.13 % 350.71 339,004 0.36 % 1,066.26 4,698 0.04 % 109.42
primer primer prim er 905,773 0.28 % 798.25 8 0.30 % 824.15 14,594 0.19 % 367.46 5,373 0.46 % 1,358.24 406,445 0.26 % 748.90 154,197 0.31 % 822.74 266,038 0.28 % 836.76 59,118 0.50 % 1,376.93
konec konec kone c 893,711 0.28 % 787.62 26 0.96 % 2,678.48 29,432 0.39 % 741.07 2,089 0.18 % 528.08 428,007 0.27 % 788.63 131,299 0.26 % 700.57 277,983 0.29 % 874.33 24,875 0.21 % 579.37
milijon milijon mili jon 863,041 0.27 % 760.59 0 0 % 0 2,587 0.03 % 65.14 561 0.05 % 141.81 472,592 0.30 % 870.78 73,218 0.15 % 390.67 308,212 0.32 % 969.41 5,871 0.05 % 136.74
otrok otrok otro k 858,940 0.26 % 756.98 0 0 % 0 32,917 0.43 % 828.81 1,735 0.15 % 438.59 386,949 0.24 % 712.98 167,600 0.34 % 894.26 219,308 0.23 % 689.78 50,431 0.42 % 1,174.60
skupina skupina skup ina 857,991 0.26 % 756.14 0 0 % 0 6,619 0.09 % 166.66 2,360 0.20 % 596.58 424,903 0.27 % 782.91 115,348 0.23 % 615.46 268,220 0.28 % 843.62 40,541 0.34 % 944.25
stran stran stra n 856,113 0.26 % 754.49 24 0.89 % 2,472.44 28,776 0.38 % 724.55 2,133 0.18 % 539.20 357,367 0.22 % 658.47 172,145 0.34 % 918.51 250,185 0.26 % 786.90 45,483 0.38 % 1,059.35
vlada vlada vlad a 767,169 0.24 % 676.10 0 0 % 0 1,902 0.03 % 47.89 5,609 0.48 % 1,417.89 398,609 0.25 % 734.46 56,720 0.11 % 302.64 297,430 0.31 % 935.49 6,899 0.06 % 160.69
življenje življenje živl jenje 762,139 0.23 % 671.67 0 0 % 0 38,014 0.50 % 957.15 2,198 0.19 % 555.63 319,278 0.20 % 588.29 169,305 0.34 % 903.36 178,445 0.19 % 561.26 54,899 0.46 % 1,278.66
družba družba druž ba 762,071 0.23 % 671.61 0 0 % 0 6,818 0.09 % 171.67 4,866 0.42 % 1,230.07 394,846 0.25 % 727.53 79,568 0.16 % 424.55 247,831 0.26 % 779.49 28,142 0.24 % 655.46
zakon zakon zako n 706,952 0.22 % 623.03 1 0.04 % 103.02 4,285 0.06 % 107.89 20,577 1.77 % 5,201.64 358,907 0.23 % 661.31 70,886 0.14 % 378.22 236,331 0.25 % 743.32 15,965 0.13 % 371.84
občina občina obči na 688,467 0.21 % 606.74 0 0 % 0 430 0.01 % 10.83 2,616 0.23 % 661.30 491,291 0.31 % 905.24 24,281 0.05 % 129.56 167,303 0.18 % 526.21 2,546 0.02 % 59.30
teden teden tede n 683,852 0.21 % 602.68 2 0.07 % 206.04 14,002 0.18 % 352.56 904 0.08 % 228.52 351,497 0.22 % 647.66 84,244 0.17 % 449.50 225,387 0.24 % 708.90 7,816 0.07 % 182.04
prostor prostor pros tor 672,859 0.21 % 592.99 0 0 % 0 13,594 0.18 % 342.28 1,932 0.17 % 488.39 359,646 0.23 % 662.67 111,497 0.22 % 594.91 160,130 0.17 % 503.65 26,060 0.22 % 606.97
točka točka točk a 672,245 0.21 % 592.45 0 0 % 0 3,432 0.04 % 86.41 3,686 0.32 % 931.78 305,174 0.19 % 562.30 49,829 0.10 % 265.87 294,805 0.31 % 927.24 15,319 0.13 % 356.80
program program prog ram 667,540 0.20 % 588.30 0 0 % 0 1,804 0.02 % 45.42 3,767 0.32 % 952.26 339,935 0.21 % 626.35 138,849 0.28 % 740.85 162,628 0.17 % 511.51 20,557 0.17 % 478.80
mesec mesec mese c 653,149 0.20 % 575.62 1 0.04 % 103.02 11,071 0.15 % 278.76 2,582 0.22 % 652.70 331,605 0.21 % 611 96,549 0.19 % 515.15 202,005 0.21 % 635.36 9,336 0.08 % 217.45
težava težava teža va 621,900 0.19 % 548.08 1 0.04 % 103.02 10,735 0.14 % 270.30 620 0.05 % 156.73 292,123 0.18 % 538.26 119,138 0.24 % 635.68 179,213 0.19 % 563.67 20,070 0.17 % 467.45
vprašanje vprašanje vpra šanje 619,723 0.19 % 546.16 0 0 % 0 12,480 0.17 % 314.23 2,996 0.26 % 757.36 317,174 0.20 % 584.41 81,372 0.16 % 434.17 177,497 0.19 % 558.27 28,204 0.24 % 656.90
vrsta vrsta vrst a 605,858 0.19 % 533.94 5 0.18 % 515.09 14,107 0.19 % 355.20 3,289 0.28 % 831.42 283,858 0.18 % 523.03 114,099 0.23 % 608.79 148,498 0.16 % 467.06 42,002 0.35 % 978.28

1.2.25 List of initial character-level 5-grams from noun lemmas in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-word_parts-nouns-lemmas-initial-5grams-taxonomy-entire.tsv
Columns: Lemma; Lemma (lower-case); Initial part of the word; Rest of the word; Total absolute frequency of lemma; Percentage of all found lemmas; Total relative frequency (per million occurrences); Absolute frequency [SSJ.T.K.N]; Percentage [SSJ.T.K.N]; Relative frequency [SSJ.T.K.N]; Absolute frequency [SSJ.T.K.L]; Percentage [SSJ.T.K.L]; Relative frequency [SSJ.T.K.L]; Absolute frequency [SSJ.T.D]; Percentage [SSJ.T.D]; Relative frequency [SSJ.T.D]; Absolute frequency [SSJ.T.P.C]; Percentage [SSJ.T.P.C]; Relative frequency [SSJ.T.P.C]; Absolute frequency [SSJ.T.P.R]; Percentage [SSJ.T.P.R]; Relative frequency [SSJ.T.P.R]; Absolute frequency [SSJ.I]; Percentage [SSJ.I]; Relative frequency [SSJ.I]; Absolute frequency [SSJ.T.K.S]; Percentage [SSJ.T.K.S]; Relative frequency [SSJ.T.K.S]
Slovenija slovenija Slove nija 1,764,516 0.62 % 1,555.06 0 0 % 0 1,408 0.02 % 35.45 11,071 1.06 % 2,798.63 894,914 0.64 % 1,648.94 193,536 0.45 % 1,032.64 645,977 0.77 % 2,031.77 17,610 0.17 % 410.16
mesto mesto mesto 1,587,720 0.56 % 1,399.25 0 0 % 0 30,392 0.50 % 765.24 2,457 0.24 % 621.10 831,762 0.59 % 1,532.58 187,507 0.43 % 1,000.48 494,079 0.59 % 1,554.01 41,523 0.40 % 967.12
človek človek člove k 1,543,691 0.54 % 1,360.45 0 0 % 0 64,013 1.05 % 1,611.78 4,089 0.39 % 1,033.66 693,749 0.49 % 1,278.28 286,123 0.66 % 1,526.66 411,720 0.49 % 1,294.97 83,997 0.81 % 1,956.39
država država držav a 1,502,639 0.53 % 1,324.27 0 0 % 0 5,437 0.09 % 136.90 9,324 0.90 % 2,357.01 761,820 0.54 % 1,403.70 149,312 0.34 % 796.68 536,561 0.64 % 1,687.62 40,185 0.39 % 935.96
odstotek odstotek odsto tek 1,089,884 0.38 % 960.51 0 0 % 0 898 0.01 % 22.61 583 0.06 % 147.38 558,212 0.40 % 1,028.54 82,011 0.19 % 437.58 439,901 0.52 % 1,383.60 8,279 0.08 % 192.83
Ljubljana ljubljana Ljubl jana 1,078,589 0.38 % 950.56 0 0 % 0 2,707 0.04 % 68.16 940 0.09 % 237.62 570,072 0.41 % 1,050.40 89,473 0.21 % 477.40 398,878 0.47 % 1,254.58 16,519 0.16 % 384.75
podjetje podjetje podje tje 1,001,078 0.35 % 882.24 0 0 % 0 2,879 0.05 % 72.49 2,661 0.26 % 672.67 537,241 0.38 % 989.90 154,968 0.36 % 826.86 283,563 0.34 % 891.88 19,766 0.19 % 460.37
tekma tekma tekma 978,012 0.34 % 861.92 0 0 % 0 1,508 0.03 % 37.97 81 0.01 % 20.48 519,746 0.37 % 957.67 45,417 0.10 % 242.33 409,852 0.49 % 1,289.09 1,408 0.01 % 32.79
predsednik predsednik preds ednik 935,429 0.33 % 824.39 0 0 % 0 3,383 0.06 % 85.18 2,020 0.19 % 510.63 520,594 0.37 % 959.23 65,730 0.15 % 350.71 339,004 0.40 % 1,066.26 4,698 0.04 % 109.42
primer primer prime r 905,773 0.32 % 798.25 8 0.37 % 824.15 14,594 0.24 % 367.46 5,373 0.52 % 1,358.24 406,445 0.29 % 748.90 154,197 0.36 % 822.74 266,038 0.32 % 836.76 59,118 0.57 % 1,376.93
konec konec konec 893,711 0.31 % 787.62 26 1.21 % 2,678.48 29,432 0.48 % 741.07 2,089 0.20 % 528.08 428,007 0.31 % 788.63 131,299 0.30 % 700.57 277,983 0.33 % 874.33 24,875 0.24 % 579.37
milijon milijon milij on 863,041 0.30 % 760.59 0 0 % 0 2,587 0.04 % 65.14 561 0.05 % 141.81 472,592 0.34 % 870.78 73,218 0.17 % 390.67 308,212 0.37 % 969.41 5,871 0.06 % 136.74
otrok otrok otrok 858,940 0.30 % 756.98 0 0 % 0 32,917 0.54 % 828.81 1,735 0.17 % 438.59 386,949 0.28 % 712.98 167,600 0.39 % 894.26 219,308 0.26 % 689.78 50,431 0.48 % 1,174.60
skupina skupina skupi na 857,991 0.30 % 756.14 0 0 % 0 6,619 0.11 % 166.66 2,360 0.23 % 596.58 424,903 0.30 % 782.91 115,348 0.27 % 615.46 268,220 0.32 % 843.62 40,541 0.39 % 944.25
stran stran stran 856,113 0.30 % 754.49 24 1.12 % 2,472.44 28,776 0.47 % 724.55 2,133 0.20 % 539.20 357,367 0.26 % 658.47 172,145 0.40 % 918.51 250,185 0.30 % 786.90 45,483 0.44 % 1,059.35
vlada vlada vlada 767,169 0.27 % 676.10 0 0 % 0 1,902 0.03 % 47.89 5,609 0.54 % 1,417.89 398,609 0.28 % 734.46 56,720 0.13 % 302.64 297,430 0.35 % 935.49 6,899 0.07 % 160.69
življenje življenje življ enje 762,139 0.27 % 671.67 0 0 % 0 38,014 0.62 % 957.15 2,198 0.21 % 555.63 319,278 0.23 % 588.29 169,305 0.39 % 903.36 178,445 0.21 % 561.26 54,899 0.53 % 1,278.66
družba družba družb a 762,071 0.27 % 671.61 0 0 % 0 6,818 0.11 % 171.67 4,866 0.47 % 1,230.07 394,846 0.28 % 727.53 79,568 0.18 % 424.55 247,831 0.29 % 779.49 28,142 0.27 % 655.46
zakon zakon zakon 706,952 0.25 % 623.03 1 0.05 % 103.02 4,285 0.07 % 107.89 20,577 1.98 % 5,201.64 358,907 0.26 % 661.31 70,886 0.16 % 378.22 236,331 0.28 % 743.32 15,965 0.15 % 371.84
občina občina občin a 688,467 0.24 % 606.74 0 0 % 0 430 0.01 % 10.83 2,616 0.25 % 661.30 491,291 0.35 % 905.24 24,281 0.06 % 129.56 167,303 0.20 % 526.21 2,546 0.02 % 59.30
teden teden teden 683,852 0.24 % 602.68 2 0.09 % 206.04 14,002 0.23 % 352.56 904 0.09 % 228.52 351,497 0.25 % 647.66 84,244 0.19 % 449.50 225,387 0.27 % 708.90 7,816 0.07 % 182.04
prostor prostor prost or 672,859 0.24 % 592.99 0 0 % 0 13,594 0.22 % 342.28 1,932 0.19 % 488.39 359,646 0.26 % 662.67 111,497 0.26 % 594.91 160,130 0.19 % 503.65 26,060 0.25 % 606.97
točka točka točka 672,245 0.24 % 592.45 0 0 % 0 3,432 0.06 % 86.41 3,686 0.35 % 931.78 305,174 0.22 % 562.30 49,829 0.12 % 265.87 294,805 0.35 % 927.24 15,319 0.15 % 356.80
program program progr am 667,540 0.23 % 588.30 0 0 % 0 1,804 0.03 % 45.42 3,767 0.36 % 952.26 339,935 0.24 % 626.35 138,849 0.32 % 740.85 162,628 0.19 % 511.51 20,557 0.20 % 478.80
mesec mesec mesec 653,149 0.23 % 575.62 1 0.05 % 103.02 11,071 0.18 % 278.76 2,582 0.25 % 652.70 331,605 0.24 % 611 96,549 0.22 % 515.15 202,005 0.24 % 635.36 9,336 0.09 % 217.45
težava težava težav a 621,900 0.22 % 548.08 1 0.05 % 103.02 10,735 0.18 % 270.30 620 0.06 % 156.73 292,123 0.21 % 538.26 119,138 0.28 % 635.68 179,213 0.21 % 563.67 20,070 0.19 % 467.45
vprašanje vprašanje vpraš anje 619,723 0.22 % 546.16 0 0 % 0 12,480 0.20 % 314.23 2,996 0.29 % 757.36 317,174 0.23 % 584.41 81,372 0.19 % 434.17 177,497 0.21 % 558.27 28,204 0.27 % 656.90
vrsta vrsta vrsta 605,858 0.21 % 533.94 5 0.23 % 515.09 14,107 0.23 % 355.20 3,289 0.32 % 831.42 283,858 0.20 % 523.03 114,099 0.26 % 608.79 148,498 0.18 % 467.06 42,002 0.40 % 978.28
beseda beseda besed a 594,966 0.21 % 524.34 0 0 % 0 32,479 0.53 % 817.79 2,963 0.28 % 749.01 261,656 0.19 % 482.12 74,175 0.17 % 395.77 190,944 0.23 % 600.57 32,749 0.31 % 762.76
stranka stranka stran ka 591,212 0.21 % 521.03 0 0 % 0 3,498 0.06 % 88.08 3,818 0.37 % 965.15 283,718 0.20 % 522.77 56,592 0.13 % 301.96 233,813 0.28 % 735.40 9,773 0.09 % 227.62
podatek podatek podat ek 587,651 0.21 % 517.89 1 0.05 % 103.02 3,365 0.06 % 84.73 4,325 0.42 % 1,093.31 242,117 0.17 % 446.12 108,286 0.25 % 577.78 206,424 0.25 % 649.26 23,133 0.22 % 538.80
začetek začetek začet ek 584,429 0.20 % 515.05 1 0.05 % 103.02 8,101 0.13 % 203.97 1,762 0.17 % 445.41 287,624 0.20 % 529.97 85,385 0.20 % 455.59 182,376 0.22 % 573.62 19,180 0.18 % 446.73

1.2.26 List of final character-level 1-grams from noun lemmas in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-word_parts-nouns-lemmas-final-1grams-taxonomy-entire.tsv
Columns: Lemma; Lemma (lower-case); Rest of the word; Final part of the word; Total absolute frequency of lemma; Percentage of all found lemmas; Total relative frequency (per million occurrences); Absolute frequency [SSJ.T.K.N]; Percentage [SSJ.T.K.N]; Relative frequency [SSJ.T.K.N]; Absolute frequency [SSJ.T.K.L]; Percentage [SSJ.T.K.L]; Relative frequency [SSJ.T.K.L]; Absolute frequency [SSJ.T.D]; Percentage [SSJ.T.D]; Relative frequency [SSJ.T.D]; Absolute frequency [SSJ.T.P.C]; Percentage [SSJ.T.P.C]; Relative frequency [SSJ.T.P.C]; Absolute frequency [SSJ.T.P.R]; Percentage [SSJ.T.P.R]; Relative frequency [SSJ.T.P.R]; Absolute frequency [SSJ.I]; Percentage [SSJ.I]; Relative frequency [SSJ.I]; Absolute frequency [SSJ.T.K.S]; Percentage [SSJ.T.K.S]; Relative frequency [SSJ.T.K.S]
leto leto let o 4,967,678 1.41 % 4,377.99 0 0 % 0 58,808 0.69 % 1,480.72 11,974 0.97 % 3,026.90 2,501,788 1.45 % 4,609.71 732,438 1.33 % 3,908.05 1,531,105 1.49 % 4,815.72 131,565 1.02 % 3,064.31
čas čas ča s 1,869,767 0.53 % 1,647.82 15 0.49 % 1,545.28 65,738 0.78 % 1,655.21 5,627 0.46 % 1,422.44 884,076 0.51 %
1,628.97 353,071 0.64 % 1,883.87 493,125 0.48 % 1,551.01 68,115 0.53 % 1,586.48 dan dan da n 1,852,418 0.53 % 1,632.53 4 0.13 % 412.07 63,871 0.75 % 1,608.20 7,241 0.58 % 1,830.45 911,593 0.53 % 1,679.67 285,803 0.52 % 1,524.95 544,212 0.53 % 1,711.69 39,694 0.31 % 924.52 Slovenija slovenija Slovenij a 1,764,516 0.50 % 1,555.06 0 0 % 0 1,408 0.02 % 35.45 11,071 0.90 % 2,798.63 894,914 0.52 % 1,648.94 193,536 0.35 % 1,032.64 645,977 0.63 % 2,031.77 17,610 0.14 % 410.16 delo delo del o 1,611,830 0.46 % 1,420.50 2 0.07 % 206.04 21,312 0.25 % 536.61 6,576 0.53 % 1,662.34 832,012 0.48 % 1,533.04 223,957 0.41 % 1,194.96 455,097 0.44 % 1,431.40 72,874 0.56 % 1,697.32 mesto mesto mest o 1,587,720 0.45 % 1,399.25 0 0 % 0 30,392 0.36 % 765.24 2,457 0.20 % 621.10 831,762 0.48 % 1,532.58 187,507 0.34 % 1,000.48 494,079 0.48 % 1,554.01 41,523 0.32 % 967.12 človek človek člove k 1,543,691 0.44 % 1,360.45 0 0 % 0 64,013 0.76 % 1,611.78 4,089 0.33 % 1,033.66 693,749 0.40 % 1,278.28 286,123 0.52 % 1,526.66 411,720 0.40 % 1,294.97 83,997 0.65 % 1,956.39 država država držav a 1,502,639 0.43 % 1,324.27 0 0 % 0 5,437 0.06 % 136.90 9,324 0.75 % 2,357.01 761,820 0.44 % 1,403.70 149,312 0.27 % 796.68 536,561 0.52 % 1,687.62 40,185 0.31 % 935.96 svet svet sve t 1,216,329 0.34 % 1,071.94 0 0 % 0 27,166 0.32 % 684.01 4,324 0.35 % 1,093.06 611,210 0.35 % 1,126.19 177,328 0.32 % 946.16 351,299 0.34 % 1,104.93 45,002 0.35 % 1,048.15 odstotek odstotek odstote k 1,089,884 0.31 % 960.51 0 0 % 0 898 0.01 % 22.61 583 0.05 % 147.38 558,212 0.32 % 1,028.54 82,011 0.15 % 437.58 439,901 0.43 % 1,383.60 8,279 0.06 % 192.83 Ljubljana ljubljana Ljubljan a 1,078,589 0.31 % 950.56 0 0 % 0 2,707 0.03 % 68.16 940 0.08 % 237.62 570,072 0.33 % 1,050.40 89,473 0.16 % 477.40 398,878 0.39 % 1,254.58 16,519 0.13 % 384.75 podjetje podjetje podjetj e 1,001,078 0.28 % 882.24 0 0 % 0 2,879 0.03 % 72.49 2,661 0.21 % 672.67 537,241 0.31 % 989.90 154,968 0.28 % 826.86 283,563 0.28 % 891.88 19,766 0.15 % 460.37 tekma tekma 
tekm a 978,012 0.28 % 861.92 0 0 % 0 1,508 0.02 % 37.97 81 0.01 % 20.48 519,746 0.30 % 957.67 45,417 0.08 % 242.33 409,852 0.40 % 1,289.09 1,408 0.01 % 32.79 del del de l 960,422 0.27 % 846.42 7 0.23 % 721.13 13,353 0.16 % 336.21 3,660 0.30 % 925.21 460,019 0.27 % 847.62 165,687 0.30 % 884.05 267,226 0.26 % 840.50 50,470 0.39 % 1,175.51 evro evro evr o 952,189 0.27 % 839.16 0 0 % 0 587 0.01 % 14.78 39 0 % 9.86 403,032 0.23 % 742.61 41,108 0.07 % 219.34 505,459 0.49 % 1,589.80 1,964 0.01 % 45.74 predsednik predsednik predsedni k 935,429 0.27 % 824.39 0 0 % 0 3,383 0.04 % 85.18 2,020 0.16 % 510.63 520,594 0.30 % 959.23 65,730 0.12 % 350.71 339,004 0.33 % 1,066.26 4,698 0.04 % 109.42 primer primer prime r 905,773 0.26 % 798.25 8 0.26 % 824.15 14,594 0.17 % 367.46 5,373 0.43 % 1,358.24 406,445 0.24 % 748.90 154,197 0.28 % 822.74 266,038 0.26 % 836.76 59,118 0.46 % 1,376.93 konec konec kone c 893,711 0.25 % 787.62 26 0.85 % 2,678.48 29,432 0.35 % 741.07 2,089 0.17 % 528.08 428,007 0.25 % 788.63 131,299 0.24 % 700.57 277,983 0.27 % 874.33 24,875 0.19 % 579.37 ura ura ur a 887,827 0.25 % 782.44 38 1.24 % 3,914.70 28,341 0.33 % 713.60 1,567 0.13 % 396.12 495,541 0.29 % 913.07 115,281 0.21 % 615.10 229,747 0.22 % 722.61 17,312 0.13 % 403.22 milijon milijon milijo n 863,041 0.24 % 760.59 0 0 % 0 2,587 0.03 % 65.14 561 0.04 % 141.81 472,592 0.27 % 870.78 73,218 0.13 % 390.67 308,212 0.30 % 969.41 5,871 0.04 % 136.74 otrok otrok otro k 858,940 0.24 % 756.98 0 0 % 0 32,917 0.39 % 828.81 1,735 0.14 % 438.59 386,949 0.23 % 712.98 167,600 0.30 % 894.26 219,308 0.21 % 689.78 50,431 0.39 % 1,174.60 skupina skupina skupin a 857,991 0.24 % 756.14 0 0 % 0 6,619 0.08 % 166.66 2,360 0.19 % 596.58 424,903 0.25 % 782.91 115,348 0.21 % 615.46 268,220 0.26 % 843.62 40,541 0.31 % 944.25 stran stran stra n 856,113 0.24 % 754.49 24 0.78 % 2,472.44 28,776 0.34 % 724.55 2,133 0.17 % 539.20 357,367 0.21 % 658.47 172,145 0.31 % 918.51 250,185 0.24 % 786.90 45,483 0.35 % 1,059.35 vlada vlada vlad a 
767,169 0.22 % 676.10 0 0 % 0 1,902 0.02 % 47.89 5,609 0.45 % 1,417.89 398,609 0.23 % 734.46 56,720 0.10 % 302.64 297,430 0.29 % 935.49 6,899 0.05 % 160.69 življenje življenje življenj e 762,139 0.22 % 671.67 0 0 % 0 38,014 0.45 % 957.15 2,198 0.18 % 555.63 319,278 0.18 % 588.29 169,305 0.31 % 903.36 178,445 0.17 % 561.26 54,899 0.42 % 1,278.66 družba družba družb a 762,071 0.22 % 671.61 0 0 % 0 6,818 0.08 % 171.67 4,866 0.39 % 1,230.07 394,846 0.23 % 727.53 79,568 0.14 % 424.55 247,831 0.24 % 779.49 28,142 0.22 % 655.46 zakon zakon zako n 706,952 0.20 % 623.03 1 0.03 % 103.02 4,285 0.05 % 107.89 20,577 1.66 % 5,201.64 358,907 0.21 % 661.31 70,886 0.13 % 378.22 236,331 0.23 % 743.32 15,965 0.12 % 371.84 občina občina občin a 688,467 0.20 % 606.74 0 0 % 0 430 0.01 % 10.83 2,616 0.21 % 661.30 491,291 0.28 % 905.24 24,281 0.04 % 129.56 167,303 0.16 % 526.21 2,546 0.02 % 59.30 teden teden tede n 683,852 0.19 % 602.68 2 0.07 % 206.04 14,002 0.17 % 352.56 904 0.07 % 228.52 351,497 0.20 % 647.66 84,244 0.15 % 449.50 225,387 0.22 % 708.90 7,816 0.06 % 182.04 prostor prostor prosto r 672,859 0.19 % 592.99 0 0 % 0 13,594 0.16 % 342.28 1,932 0.16 % 488.39 359,646 0.21 % 662.67 111,497 0.20 % 594.91 160,130 0.16 % 503.65 26,060 0.20 % 606.97 točka točka točk a 672,245 0.19 % 592.45 0 0 % 0 3,432 0.04 % 86.41 3,686 0.30 % 931.78 305,174 0.18 % 562.30 49,829 0.09 % 265.87 294,805 0.29 % 927.24 15,319 0.12 % 356.80 program program progra m 667,540 0.19 % 588.30 0 0 % 0 1,804 0.02 % 45.42 3,767 0.30 % 952.26 339,935 0.20 % 626.35 138,849 0.25 % 740.85 162,628 0.16 % 511.51 20,557 0.16 % 478.80 CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 43 File at CLARIN.SI 1.2.27 List of final character-level 2-grams from noun lemmas in the Gigafida 2.0 corpus with text-type distributionGF2.0-word_parts-nouns-lemmas- final-2grams-taxonomy-entire.tsvLemma Lemma (lower-case) Rest of the word Final part of the word Total absolute frequency of lemma Percentage of 
all found lemmas Total relative frequency (per million occurrences) Absolute frequency [SSJ.T.K.N] Percentage [SSJ.T.K.N] Relative frequency [SSJ.T.K.N] Absolute frequency [SSJ.T.K.L] Percentage [SSJ.T.K.L] Relative frequency [SSJ.T.K.L] Absolute frequency [SSJ.T.D] Percentage [SSJ.T.D] Relative frequency [SSJ.T.D] Absolute frequency [SSJ.T.P.C] Percentage [SSJ.T.P.C] Relative frequency [SSJ.T.P.C] Absolute frequency [SSJ.T.P.R] Percentage [SSJ.T.P.R] Relative frequency [SSJ.T.P.R] Absolute frequency [SSJ.I] Percentage [SSJ.I] Relative frequency [SSJ.I] Absolute frequency [SSJ.T.K.S] Percentage [SSJ.T.K.S] Relative frequency [SSJ.T.K.S] leto leto le to 4,967,678 1.41 % 4,377.99 0 0 % 0 58,808 0.69 % 1,480.72 11,974 0.97 % 3,026.90 2,501,788 1.46 % 4,609.71 732,438 1.34 % 3,908.05 1,531,105 1.49 % 4,815.72 131,565 1.02 % 3,064.31 čas čas č as 1,869,767 0.53 % 1,647.82 15 0.50 % 1,545.28 65,738 0.78 % 1,655.21 5,627 0.46 % 1,422.44 884,076 0.52 % 1,628.97 353,071 0.65 % 1,883.87 493,125 0.48 % 1,551.01 68,115 0.53 % 1,586.48 dan dan d an 1,852,418 0.53 % 1,632.53 4 0.13 % 412.07 63,871 0.76 % 1,608.20 7,241 0.59 % 1,830.45 911,593 0.53 % 1,679.67 285,803 0.52 % 1,524.95 544,212 0.53 % 1,711.69 39,694 0.31 % 924.52 Slovenija slovenija Sloveni ja 1,764,516 0.50 % 1,555.06 0 0 % 0 1,408 0.02 % 35.45 11,071 0.90 % 2,798.63 894,914 0.52 % 1,648.94 193,536 0.35 % 1,032.64 645,977 0.63 % 2,031.77 17,610 0.14 % 410.16 delo delo de lo 1,611,830 0.46 % 1,420.50 2 0.07 % 206.04 21,312 0.25 % 536.61 6,576 0.54 % 1,662.34 832,012 0.48 % 1,533.04 223,957 0.41 % 1,194.96 455,097 0.44 % 1,431.40 72,874 0.57 % 1,697.32 mesto mesto mes to 1,587,720 0.45 % 1,399.25 0 0 % 0 30,392 0.36 % 765.24 2,457 0.20 % 621.10 831,762 0.48 % 1,532.58 187,507 0.34 % 1,000.48 494,079 0.48 % 1,554.01 41,523 0.32 % 967.12 človek človek člov ek 1,543,691 0.44 % 1,360.45 0 0 % 0 64,013 0.76 % 1,611.78 4,089 0.33 % 1,033.66 693,749 0.40 % 1,278.28 286,123 0.53 % 1,526.66 411,720 0.40 % 1,294.97 83,997 0.65 
% 1,956.39 država država drža va 1,502,639 0.43 % 1,324.27 0 0 % 0 5,437 0.06 % 136.90 9,324 0.76 % 2,357.01 761,820 0.44 % 1,403.70 149,312 0.27 % 796.68 536,561 0.52 % 1,687.62 40,185 0.31 % 935.96 svet svet sv et 1,216,329 0.35 % 1,071.94 0 0 % 0 27,166 0.32 % 684.01 4,324 0.35 % 1,093.06 611,210 0.36 % 1,126.19 177,328 0.33 % 946.16 351,299 0.34 % 1,104.93 45,002 0.35 % 1,048.15 odstotek odstotek odstot ek 1,089,884 0.31 % 960.51 0 0 % 0 898 0.01 % 22.61 583 0.05 % 147.38 558,212 0.33 % 1,028.54 82,011 0.15 % 437.58 439,901 0.43 % 1,383.60 8,279 0.06 % 192.83 Ljubljana ljubljana Ljublja na 1,078,589 0.31 % 950.56 0 0 % 0 2,707 0.03 % 68.16 940 0.08 % 237.62 570,072 0.33 % 1,050.40 89,473 0.16 % 477.40 398,878 0.39 % 1,254.58 16,519 0.13 % 384.75 podjetje podjetje podjet je 1,001,078 0.28 % 882.24 0 0 % 0 2,879 0.03 % 72.49 2,661 0.22 % 672.67 537,241 0.31 % 989.90 154,968 0.28 % 826.86 283,563 0.28 % 891.88 19,766 0.15 % 460.37 tekma tekma tek ma 978,012 0.28 % 861.92 0 0 % 0 1,508 0.02 % 37.97 81 0.01 % 20.48 519,746 0.30 % 957.67 45,417 0.08 % 242.33 409,852 0.40 % 1,289.09 1,408 0.01 % 32.79 del del d el 960,422 0.27 % 846.42 7 0.23 % 721.13 13,353 0.16 % 336.21 3,660 0.30 % 925.21 460,019 0.27 % 847.62 165,687 0.30 % 884.05 267,226 0.26 % 840.50 50,470 0.39 % 1,175.51 evro evro ev ro 952,189 0.27 % 839.16 0 0 % 0 587 0.01 % 14.78 39 0 % 9.86 403,032 0.23 % 742.61 41,108 0.07 % 219.34 505,459 0.49 % 1,589.80 1,964 0.01 % 45.74 predsednik predsednik predsedn ik 935,429 0.27 % 824.39 0 0 % 0 3,383 0.04 % 85.18 2,020 0.16 % 510.63 520,594 0.30 % 959.23 65,730 0.12 % 350.71 339,004 0.33 % 1,066.26 4,698 0.04 % 109.42 primer primer prim er 905,773 0.26 % 798.25 8 0.27 % 824.15 14,594 0.17 % 367.46 5,373 0.44 % 1,358.24 406,445 0.24 % 748.90 154,197 0.28 % 822.74 266,038 0.26 % 836.76 59,118 0.46 % 1,376.93 konec konec kon ec 893,711 0.25 % 787.62 26 0.87 % 2,678.48 29,432 0.35 % 741.07 2,089 0.17 % 528.08 428,007 0.25 % 788.63 131,299 0.24 % 700.57 277,983 0.27 % 
874.33 24,875 0.19 % 579.37 ura ura u ra 887,827 0.25 % 782.44 38 1.27 % 3,914.70 28,341 0.34 % 713.60 1,567 0.13 % 396.12 495,541 0.29 % 913.07 115,281 0.21 % 615.10 229,747 0.22 % 722.61 17,312 0.14 % 403.22 milijon milijon milij on 863,041 0.25 % 760.59 0 0 % 0 2,587 0.03 % 65.14 561 0.05 % 141.81 472,592 0.28 % 870.78 73,218 0.13 % 390.67 308,212 0.30 % 969.41 5,871 0.05 % 136.74 otrok otrok otr ok 858,940 0.24 % 756.98 0 0 % 0 32,917 0.39 % 828.81 1,735 0.14 % 438.59 386,949 0.23 % 712.98 167,600 0.31 % 894.26 219,308 0.21 % 689.78 50,431 0.39 % 1,174.60 skupina skupina skupi na 857,991 0.24 % 756.14 0 0 % 0 6,619 0.08 % 166.66 2,360 0.19 % 596.58 424,903 0.25 % 782.91 115,348 0.21 % 615.46 268,220 0.26 % 843.62 40,541 0.32 % 944.25 stran stran str an 856,113 0.24 % 754.49 24 0.80 % 2,472.44 28,776 0.34 % 724.55 2,133 0.17 % 539.20 357,367 0.21 % 658.47 172,145 0.32 % 918.51 250,185 0.24 % 786.90 45,483 0.35 % 1,059.35 vlada vlada vla da 767,169 0.22 % 676.10 0 0 % 0 1,902 0.02 % 47.89 5,609 0.46 % 1,417.89 398,609 0.23 % 734.46 56,720 0.10 % 302.64 297,430 0.29 % 935.49 6,899 0.05 % 160.69 življenje življenje življen je 762,139 0.22 % 671.67 0 0 % 0 38,014 0.45 % 957.15 2,198 0.18 % 555.63 319,278 0.19 % 588.29 169,305 0.31 % 903.36 178,445 0.17 % 561.26 54,899 0.43 % 1,278.66 družba družba druž ba 762,071 0.22 % 671.61 0 0 % 0 6,818 0.08 % 171.67 4,866 0.40 % 1,230.07 394,846 0.23 % 727.53 79,568 0.15 % 424.55 247,831 0.24 % 779.49 28,142 0.22 % 655.46 zakon zakon zak on 706,952 0.20 % 623.03 1 0.03 % 103.02 4,285 0.05 % 107.89 20,577 1.67 % 5,201.64 358,907 0.21 % 661.31 70,886 0.13 % 378.22 236,331 0.23 % 743.32 15,965 0.12 % 371.84 občina občina obči na 688,467 0.20 % 606.74 0 0 % 0 430 0.01 % 10.83 2,616 0.21 % 661.30 491,291 0.29 % 905.24 24,281 0.04 % 129.56 167,303 0.16 % 526.21 2,546 0.02 % 59.30 teden teden ted en 683,852 0.20 % 602.68 2 0.07 % 206.04 14,002 0.17 % 352.56 904 0.07 % 228.52 351,497 0.20 % 647.66 84,244 0.15 % 449.50 225,387 0.22 % 
708.90 7,816 0.06 % 182.04
prostor prostor prost or 672,859 0.19 % 592.99 0 0 % 0 13,594 0.16 % 342.28 1,932 0.16 % 488.39 359,646 0.21 % 662.67 111,497 0.20 % 594.91 160,130 0.16 % 503.65 26,060 0.20 % 606.97
točka točka toč ka 672,245 0.19 % 592.45 0 0 % 0 3,432 0.04 % 86.41 3,686 0.30 % 931.78 305,174 0.18 % 562.30 49,829 0.09 % 265.87 294,805 0.29 % 927.24 15,319 0.12 % 356.80
program program progr am 667,540 0.19 % 588.30 0 0 % 0 1,804 0.02 % 45.42 3,767 0.31 % 952.26 339,935 0.20 % 626.35 138,849 0.26 % 740.85 162,628 0.16 % 511.51 20,557 0.16 % 478.80

1.2.28 List of final character-level 3-grams from noun lemmas in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-word_parts-nouns-lemmas-final-3grams-taxonomy-entire.tsv
Columns: Lemma; Lemma (lower-case); Rest of the word; Final part of the word; Total absolute frequency of lemma; Percentage of all found lemmas; Total relative frequency (per million occurrences); then Absolute frequency, Percentage, and Relative frequency for each text type, in this order: [SSJ.T.K.N], [SSJ.T.K.L], [SSJ.T.D], [SSJ.T.P.C], [SSJ.T.P.R], [SSJ.I], [SSJ.T.K.S].

leto leto l eto 4,967,678 1.43 % 4,377.99 0 0 % 0 58,808 0.70 % 1,480.72 11,974 0.98 % 3,026.90 2,501,788 1.47 % 4,609.71 732,438 1.36 % 3,908.05 1,531,105 1.50 % 4,815.72 131,565 1.03 % 3,064.31
čas čas čas 1,869,767 0.54 % 1,647.82 15 0.51 % 1,545.28 65,738 0.78 % 1,655.21 5,627 0.46 % 1,422.44 884,076 0.52 % 1,628.97 353,071 0.65 % 1,883.87 493,125 0.48 % 1,551.01 68,115 0.53 % 1,586.48
dan
dan dan 1,852,418 0.53 % 1,632.53 4 0.14 % 412.07 63,871 0.76 % 1,608.20 7,241 0.59 % 1,830.45 911,593 0.54 % 1,679.67 285,803 0.53 % 1,524.95 544,212 0.54 % 1,711.69 39,694 0.31 % 924.52 Slovenija slovenija Sloven ija 1,764,516 0.51 % 1,555.06 0 0 % 0 1,408 0.02 % 35.45 11,071 0.90 % 2,798.63 894,914 0.53 % 1,648.94 193,536 0.36 % 1,032.64 645,977 0.64 % 2,031.77 17,610 0.14 % 410.16 delo delo d elo 1,611,830 0.46 % 1,420.50 2 0.07 % 206.04 21,312 0.25 % 536.61 6,576 0.54 % 1,662.34 832,012 0.49 % 1,533.04 223,957 0.41 % 1,194.96 455,097 0.45 % 1,431.40 72,874 0.57 % 1,697.32 mesto mesto me sto 1,587,720 0.46 % 1,399.25 0 0 % 0 30,392 0.36 % 765.24 2,457 0.20 % 621.10 831,762 0.49 % 1,532.58 187,507 0.35 % 1,000.48 494,079 0.49 % 1,554.01 41,523 0.33 % 967.12 človek človek člo vek 1,543,691 0.44 % 1,360.45 0 0 % 0 64,013 0.76 % 1,611.78 4,089 0.33 % 1,033.66 693,749 0.41 % 1,278.28 286,123 0.53 % 1,526.66 411,720 0.41 % 1,294.97 83,997 0.66 % 1,956.39 država država drž ava 1,502,639 0.43 % 1,324.27 0 0 % 0 5,437 0.06 % 136.90 9,324 0.76 % 2,357.01 761,820 0.45 % 1,403.70 149,312 0.28 % 796.68 536,561 0.53 % 1,687.62 40,185 0.32 % 935.96 svet svet s vet 1,216,329 0.35 % 1,071.94 0 0 % 0 27,166 0.32 % 684.01 4,324 0.35 % 1,093.06 611,210 0.36 % 1,126.19 177,328 0.33 % 946.16 351,299 0.34 % 1,104.93 45,002 0.35 % 1,048.15 odstotek odstotek odsto tek 1,089,884 0.31 % 960.51 0 0 % 0 898 0.01 % 22.61 583 0.05 % 147.38 558,212 0.33 % 1,028.54 82,011 0.15 % 437.58 439,901 0.43 % 1,383.60 8,279 0.07 % 192.83 Ljubljana ljubljana Ljublj ana 1,078,589 0.31 % 950.56 0 0 % 0 2,707 0.03 % 68.16 940 0.08 % 237.62 570,072 0.34 % 1,050.40 89,473 0.17 % 477.40 398,878 0.39 % 1,254.58 16,519 0.13 % 384.75 podjetje podjetje podje tje 1,001,078 0.29 % 882.24 0 0 % 0 2,879 0.03 % 72.49 2,661 0.22 % 672.67 537,241 0.32 % 989.90 154,968 0.29 % 826.86 283,563 0.28 % 891.88 19,766 0.15 % 460.37 tekma tekma te kma 978,012 0.28 % 861.92 0 0 % 0 1,508 0.02 % 37.97 81 0.01 % 20.48 519,746 0.30 
% 957.67 45,417 0.08 % 242.33 409,852 0.40 % 1,289.09 1,408 0.01 % 32.79 del del del 960,422 0.28 % 846.42 7 0.24 % 721.13 13,353 0.16 % 336.21 3,660 0.30 % 925.21 460,019 0.27 % 847.62 165,687 0.31 % 884.05 267,226 0.26 % 840.50 50,470 0.40 % 1,175.51 evro evro e vro 952,189 0.27 % 839.16 0 0 % 0 587 0.01 % 14.78 39 0 % 9.86 403,032 0.24 % 742.61 41,108 0.08 % 219.34 505,459 0.50 % 1,589.80 1,964 0.01 % 45.74 predsednik predsednik predsed nik 935,429 0.27 % 824.39 0 0 % 0 3,383 0.04 % 85.18 2,020 0.17 % 510.63 520,594 0.31 % 959.23 65,730 0.12 % 350.71 339,004 0.33 % 1,066.26 4,698 0.04 % 109.42 primer primer pri mer 905,773 0.26 % 798.25 8 0.27 % 824.15 14,594 0.17 % 367.46 5,373 0.44 % 1,358.24 406,445 0.24 % 748.90 154,197 0.29 % 822.74 266,038 0.26 % 836.76 59,118 0.46 % 1,376.93 konec konec ko nec 893,711 0.26 % 787.62 26 0.88 % 2,678.48 29,432 0.35 % 741.07 2,089 0.17 % 528.08 428,007 0.25 % 788.63 131,299 0.24 % 700.57 277,983 0.27 % 874.33 24,875 0.20 % 579.37 ura ura ura 887,827 0.26 % 782.44 38 1.28 % 3,914.70 28,341 0.34 % 713.60 1,567 0.13 % 396.12 495,541 0.29 % 913.07 115,281 0.21 % 615.10 229,747 0.23 % 722.61 17,312 0.14 % 403.22 milijon milijon mili jon 863,041 0.25 % 760.59 0 0 % 0 2,587 0.03 % 65.14 561 0.05 % 141.81 472,592 0.28 % 870.78 73,218 0.14 % 390.67 308,212 0.30 % 969.41 5,871 0.05 % 136.74 otrok otrok ot rok 858,940 0.25 % 756.98 0 0 % 0 32,917 0.39 % 828.81 1,735 0.14 % 438.59 386,949 0.23 % 712.98 167,600 0.31 % 894.26 219,308 0.22 % 689.78 50,431 0.40 % 1,174.60 skupina skupina skup ina 857,991 0.25 % 756.14 0 0 % 0 6,619 0.08 % 166.66 2,360 0.19 % 596.58 424,903 0.25 % 782.91 115,348 0.21 % 615.46 268,220 0.26 % 843.62 40,541 0.32 % 944.25 stran stran st ran 856,113 0.25 % 754.49 24 0.81 % 2,472.44 28,776 0.34 % 724.55 2,133 0.17 % 539.20 357,367 0.21 % 658.47 172,145 0.32 % 918.51 250,185 0.25 % 786.90 45,483 0.36 % 1,059.35 vlada vlada vl ada 767,169 0.22 % 676.10 0 0 % 0 1,902 0.02 % 47.89 5,609 0.46 % 1,417.89 398,609 0.23 % 
734.46 56,720 0.10 % 302.64 297,430 0.29 % 935.49 6,899 0.05 % 160.69
življenje življenje življe nje 762,139 0.22 % 671.67 0 0 % 0 38,014 0.45 % 957.15 2,198 0.18 % 555.63 319,278 0.19 % 588.29 169,305 0.31 % 903.36 178,445 0.17 % 561.26 54,899 0.43 % 1,278.66
družba družba dru žba 762,071 0.22 % 671.61 0 0 % 0 6,818 0.08 % 171.67 4,866 0.40 % 1,230.07 394,846 0.23 % 727.53 79,568 0.15 % 424.55 247,831 0.24 % 779.49 28,142 0.22 % 655.46
zakon zakon za kon 706,952 0.20 % 623.03 1 0.03 % 103.02 4,285 0.05 % 107.89 20,577 1.68 % 5,201.64 358,907 0.21 % 661.31 70,886 0.13 % 378.22 236,331 0.23 % 743.32 15,965 0.12 % 371.84
občina občina obč ina 688,467 0.20 % 606.74 0 0 % 0 430 0.01 % 10.83 2,616 0.21 % 661.30 491,291 0.29 % 905.24 24,281 0.04 % 129.56 167,303 0.16 % 526.21 2,546 0.02 % 59.30
teden teden te den 683,852 0.20 % 602.68 2 0.07 % 206.04 14,002 0.17 % 352.56 904 0.07 % 228.52 351,497 0.21 % 647.66 84,244 0.16 % 449.50 225,387 0.22 % 708.90 7,816 0.06 % 182.04
prostor prostor pros tor 672,859 0.19 % 592.99 0 0 % 0 13,594 0.16 % 342.28 1,932 0.16 % 488.39 359,646 0.21 % 662.67 111,497 0.21 % 594.91 160,130 0.16 % 503.65 26,060 0.20 % 606.97
točka točka to čka 672,245 0.19 % 592.45 0 0 % 0 3,432 0.04 % 86.41 3,686 0.30 % 931.78 305,174 0.18 % 562.30 49,829 0.09 % 265.87 294,805 0.29 % 927.24 15,319 0.12 % 356.80
program program prog ram 667,540 0.19 % 588.30 0 0 % 0 1,804 0.02 % 45.42 3,767 0.31 % 952.26 339,935 0.20 % 626.35 138,849 0.26 % 740.85 162,628 0.16 % 511.51 20,557 0.16 % 478.80

1.2.29 List of final character-level 4-grams from noun lemmas in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-word_parts-nouns-lemmas-final-4grams-taxonomy-entire.tsv
Columns: Lemma; Lemma (lower-case); Rest of the word; Final part of the word; Total absolute frequency of lemma; Percentage of all found lemmas; Total relative frequency (per million occurrences); then Absolute frequency, Percentage, and Relative frequency for each text type, in this order: [SSJ.T.K.N], [SSJ.T.K.L], [SSJ.T.D], [SSJ.T.P.C], [SSJ.T.P.R], [SSJ.I], [SSJ.T.K.S].

leto leto leto 4,967,678 1.53 % 4,377.99 0 0 % 0 58,808 0.78 % 1,480.72 11,974 1.03 % 3,026.90 2,501,788 1.57 % 4,609.71 732,438 1.46 % 3,908.05 1,531,105 1.61 % 4,815.72 131,565 1.10 % 3,064.31
Slovenija slovenija Slove nija 1,764,516 0.54 % 1,555.06 0 0 % 0 1,408 0.02 % 35.45 11,071 0.95 % 2,798.63 894,914 0.56 % 1,648.94 193,536 0.39 % 1,032.64 645,977 0.68 % 2,031.77 17,610 0.15 % 410.16
delo delo delo 1,611,830 0.49 % 1,420.50 2 0.07 % 206.04 21,312 0.28 % 536.61 6,576 0.56 % 1,662.34 832,012 0.52 % 1,533.04 223,957 0.45 % 1,194.96 455,097 0.48 % 1,431.40 72,874 0.61 % 1,697.32
mesto mesto m esto 1,587,720 0.49 % 1,399.25 0 0 % 0 30,392 0.40 % 765.24 2,457 0.21 % 621.10 831,762 0.52 % 1,532.58 187,507 0.37 % 1,000.48 494,079 0.52 % 1,554.01 41,523 0.35 % 967.12
človek človek čl ovek 1,543,691 0.47 % 1,360.45 0 0 % 0 64,013 0.84 % 1,611.78 4,089 0.35 % 1,033.66 693,749 0.43 % 1,278.28 286,123 0.57 % 1,526.66 411,720 0.43 % 1,294.97 83,997 0.70 % 1,956.39
država država dr žava 1,502,639 0.46 % 1,324.27 0 0 % 0 5,437 0.07 % 136.90 9,324 0.80 % 2,357.01 761,820 0.48 % 1,403.70 149,312 0.30 % 796.68 536,561 0.56 % 1,687.62 40,185 0.34 % 935.96
svet svet svet 1,216,329 0.37 % 1,071.94 0 0 % 0 27,166 0.36 % 684.01 4,324 0.37 % 1,093.06 611,210 0.38 % 1,126.19 177,328 0.35 % 946.16 351,299 0.37 % 1,104.93 45,002 0.38 % 1,048.15
odstotek odstotek odst otek 1,089,884 0.34 % 960.51 0 0 % 0 898 0.01 % 22.61 583 0.05 % 147.38
558,212 0.35 % 1,028.54 82,011 0.16 % 437.58 439,901 0.46 % 1,383.60 8,279 0.07 % 192.83 Ljubljana ljubljana Ljubl jana 1,078,589 0.33 % 950.56 0 0 % 0 2,707 0.04 % 68.16 940 0.08 % 237.62 570,072 0.36 % 1,050.40 89,473 0.18 % 477.40 398,878 0.42 % 1,254.58 16,519 0.14 % 384.75 podjetje podjetje podj etje 1,001,078 0.31 % 882.24 0 0 % 0 2,879 0.04 % 72.49 2,661 0.23 % 672.67 537,241 0.34 % 989.90 154,968 0.31 % 826.86 283,563 0.30 % 891.88 19,766 0.17 % 460.37 tekma tekma t ekma 978,012 0.30 % 861.92 0 0 % 0 1,508 0.02 % 37.97 81 0.01 % 20.48 519,746 0.33 % 957.67 45,417 0.09 % 242.33 409,852 0.43 % 1,289.09 1,408 0.01 % 32.79 evro evro evro 952,189 0.29 % 839.16 0 0 % 0 587 0.01 % 14.78 39 0 % 9.86 403,032 0.25 % 742.61 41,108 0.08 % 219.34 505,459 0.53 % 1,589.80 1,964 0.02 % 45.74 predsednik predsednik predse dnik 935,429 0.29 % 824.39 0 0 % 0 3,383 0.04 % 85.18 2,020 0.17 % 510.63 520,594 0.33 % 959.23 65,730 0.13 % 350.71 339,004 0.36 % 1,066.26 4,698 0.04 % 109.42 primer primer pr imer 905,773 0.28 % 798.25 8 0.30 % 824.15 14,594 0.19 % 367.46 5,373 0.46 % 1,358.24 406,445 0.26 % 748.90 154,197 0.31 % 822.74 266,038 0.28 % 836.76 59,118 0.50 % 1,376.93 konec konec k onec 893,711 0.28 % 787.62 26 0.96 % 2,678.48 29,432 0.39 % 741.07 2,089 0.18 % 528.08 428,007 0.27 % 788.63 131,299 0.26 % 700.57 277,983 0.29 % 874.33 24,875 0.21 % 579.37 milijon milijon mil ijon 863,041 0.27 % 760.59 0 0 % 0 2,587 0.03 % 65.14 561 0.05 % 141.81 472,592 0.30 % 870.78 73,218 0.15 % 390.67 308,212 0.32 % 969.41 5,871 0.05 % 136.74 otrok otrok o trok 858,940 0.26 % 756.98 0 0 % 0 32,917 0.43 % 828.81 1,735 0.15 % 438.59 386,949 0.24 % 712.98 167,600 0.34 % 894.26 219,308 0.23 % 689.78 50,431 0.42 % 1,174.60 skupina skupina sku pina 857,991 0.26 % 756.14 0 0 % 0 6,619 0.09 % 166.66 2,360 0.20 % 596.58 424,903 0.27 % 782.91 115,348 0.23 % 615.46 268,220 0.28 % 843.62 40,541 0.34 % 944.25 stran stran s tran 856,113 0.26 % 754.49 24 0.89 % 2,472.44 28,776 0.38 % 724.55 2,133 0.18 % 
539.20 357,367 0.22 % 658.47 172,145 0.34 % 918.51 250,185 0.26 % 786.90 45,483 0.38 % 1,059.35 vlada vlada v lada 767,169 0.24 % 676.10 0 0 % 0 1,902 0.03 % 47.89 5,609 0.48 % 1,417.89 398,609 0.25 % 734.46 56,720 0.11 % 302.64 297,430 0.31 % 935.49 6,899 0.06 % 160.69 življenje življenje življ enje 762,139 0.23 % 671.67 0 0 % 0 38,014 0.50 % 957.15 2,198 0.19 % 555.63 319,278 0.20 % 588.29 169,305 0.34 % 903.36 178,445 0.19 % 561.26 54,899 0.46 % 1,278.66 družba družba dr užba 762,071 0.23 % 671.61 0 0 % 0 6,818 0.09 % 171.67 4,866 0.42 % 1,230.07 394,846 0.25 % 727.53 79,568 0.16 % 424.55 247,831 0.26 % 779.49 28,142 0.24 % 655.46 zakon zakon z akon 706,952 0.22 % 623.03 1 0.04 % 103.02 4,285 0.06 % 107.89 20,577 1.77 % 5,201.64 358,907 0.23 % 661.31 70,886 0.14 % 378.22 236,331 0.25 % 743.32 15,965 0.13 % 371.84 občina občina ob čina 688,467 0.21 % 606.74 0 0 % 0 430 0.01 % 10.83 2,616 0.23 % 661.30 491,291 0.31 % 905.24 24,281 0.05 % 129.56 167,303 0.18 % 526.21 2,546 0.02 % 59.30 teden teden t eden 683,852 0.21 % 602.68 2 0.07 % 206.04 14,002 0.18 % 352.56 904 0.08 % 228.52 351,497 0.22 % 647.66 84,244 0.17 % 449.50 225,387 0.24 % 708.90 7,816 0.07 % 182.04 prostor prostor pro stor 672,859 0.21 % 592.99 0 0 % 0 13,594 0.18 % 342.28 1,932 0.17 % 488.39 359,646 0.23 % 662.67 111,497 0.22 % 594.91 160,130 0.17 % 503.65 26,060 0.22 % 606.97 točka točka t očka 672,245 0.21 % 592.45 0 0 % 0 3,432 0.04 % 86.41 3,686 0.32 % 931.78 305,174 0.19 % 562.30 49,829 0.10 % 265.87 294,805 0.31 % 927.24 15,319 0.13 % 356.80 program program pro gram 667,540 0.20 % 588.30 0 0 % 0 1,804 0.02 % 45.42 3,767 0.32 % 952.26 339,935 0.21 % 626.35 138,849 0.28 % 740.85 162,628 0.17 % 511.51 20,557 0.17 % 478.80 mesec mesec m esec 653,149 0.20 % 575.62 1 0.04 % 103.02 11,071 0.15 % 278.76 2,582 0.22 % 652.70 331,605 0.21 % 611 96,549 0.19 % 515.15 202,005 0.21 % 635.36 9,336 0.08 % 217.45 težava težava te žava 621,900 0.19 % 548.08 1 0.04 % 103.02 10,735 0.14 % 270.30 620 0.05 % 156.73 
292,123 0.18 % 538.26 119,138 0.24 % 635.68 179,213 0.19 % 563.67 20,070 0.17 % 467.45
vprašanje vprašanje vpraš anje 619,723 0.19 % 546.16 0 0 % 0 12,480 0.17 % 314.23 2,996 0.26 % 757.36 317,174 0.20 % 584.41 81,372 0.16 % 434.17 177,497 0.19 % 558.27 28,204 0.24 % 656.90
vrsta vrsta v rsta 605,858 0.19 % 533.94 5 0.18 % 515.09 14,107 0.19 % 355.20 3,289 0.28 % 831.42 283,858 0.18 % 523.03 114,099 0.23 % 608.79 148,498 0.16 % 467.06 42,002 0.35 % 978.28

1.2.30 List of final character-level 5-grams from noun lemmas in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-word_parts-nouns-lemmas-final-5grams-taxonomy-entire.tsv
Columns: Lemma; Lemma (lower-case); Rest of the word; Final part of the word; Total absolute frequency of lemma; Percentage of all found lemmas; Total relative frequency (per million occurrences); then Absolute frequency, Percentage, and Relative frequency for each text type, in this order: [SSJ.T.K.N], [SSJ.T.K.L], [SSJ.T.D], [SSJ.T.P.C], [SSJ.T.P.R], [SSJ.I], [SSJ.T.K.S].

Slovenija slovenija Slov enija 1,764,516 0.62 % 1,555.06 0 0 % 0 1,408 0.02 % 35.45 11,071 1.06 % 2,798.63 894,914 0.64 % 1,648.94 193,536 0.45 % 1,032.64 645,977 0.77 % 2,031.77 17,610 0.17 % 410.16
mesto mesto mesto 1,587,720 0.56 % 1,399.25 0 0 % 0 30,392 0.50 % 765.24 2,457 0.24 % 621.10 831,762 0.59 % 1,532.58 187,507 0.43 % 1,000.48 494,079 0.59 % 1,554.01 41,523 0.40 % 967.12
človek človek č lovek 1,543,691 0.54 % 1,360.45 0 0 % 0 64,013 1.05 % 1,611.78 4,089 0.39 % 1,033.66 693,749 0.49 %
1,278.28 286,123 0.66 % 1,526.66 411,720 0.49 % 1,294.97 83,997 0.81 % 1,956.39 država država d ržava 1,502,639 0.53 % 1,324.27 0 0 % 0 5,437 0.09 % 136.90 9,324 0.90 % 2,357.01 761,820 0.54 % 1,403.70 149,312 0.34 % 796.68 536,561 0.64 % 1,687.62 40,185 0.39 % 935.96 odstotek odstotek ods totek 1,089,884 0.38 % 960.51 0 0 % 0 898 0.01 % 22.61 583 0.06 % 147.38 558,212 0.40 % 1,028.54 82,011 0.19 % 437.58 439,901 0.52 % 1,383.60 8,279 0.08 % 192.83 Ljubljana ljubljana Ljub ljana 1,078,589 0.38 % 950.56 0 0 % 0 2,707 0.04 % 68.16 940 0.09 % 237.62 570,072 0.41 % 1,050.40 89,473 0.21 % 477.40 398,878 0.47 % 1,254.58 16,519 0.16 % 384.75 podjetje podjetje pod jetje 1,001,078 0.35 % 882.24 0 0 % 0 2,879 0.05 % 72.49 2,661 0.26 % 672.67 537,241 0.38 % 989.90 154,968 0.36 % 826.86 283,563 0.34 % 891.88 19,766 0.19 % 460.37 tekma tekma tekma 978,012 0.34 % 861.92 0 0 % 0 1,508 0.03 % 37.97 81 0.01 % 20.48 519,746 0.37 % 957.67 45,417 0.10 % 242.33 409,852 0.49 % 1,289.09 1,408 0.01 % 32.79 predsednik predsednik preds ednik 935,429 0.33 % 824.39 0 0 % 0 3,383 0.06 % 85.18 2,020 0.19 % 510.63 520,594 0.37 % 959.23 65,730 0.15 % 350.71 339,004 0.40 % 1,066.26 4,698 0.04 % 109.42 primer primer p rimer 905,773 0.32 % 798.25 8 0.37 % 824.15 14,594 0.24 % 367.46 5,373 0.52 % 1,358.24 406,445 0.29 % 748.90 154,197 0.36 % 822.74 266,038 0.32 % 836.76 59,118 0.57 % 1,376.93 konec konec konec 893,711 0.31 % 787.62 26 1.21 % 2,678.48 29,432 0.48 % 741.07 2,089 0.20 % 528.08 428,007 0.31 % 788.63 131,299 0.30 % 700.57 277,983 0.33 % 874.33 24,875 0.24 % 579.37 milijon milijon mi lijon 863,041 0.30 % 760.59 0 0 % 0 2,587 0.04 % 65.14 561 0.05 % 141.81 472,592 0.34 % 870.78 73,218 0.17 % 390.67 308,212 0.37 % 969.41 5,871 0.06 % 136.74 otrok otrok otrok 858,940 0.30 % 756.98 0 0 % 0 32,917 0.54 % 828.81 1,735 0.17 % 438.59 386,949 0.28 % 712.98 167,600 0.39 % 894.26 219,308 0.26 % 689.78 50,431 0.48 % 1,174.60 skupina skupina sk upina 857,991 0.30 % 756.14 0 0 % 0 6,619 0.11 % 166.66 
2,360 0.23 % 596.58 424,903 0.30 % 782.91 115,348 0.27 % 615.46 268,220 0.32 % 843.62 40,541 0.39 % 944.25 stran stran stran 856,113 0.30 % 754.49 24 1.12 % 2,472.44 28,776 0.47 % 724.55 2,133 0.20 % 539.20 357,367 0.26 % 658.47 172,145 0.40 % 918.51 250,185 0.30 % 786.90 45,483 0.44 % 1,059.35 vlada vlada vlada 767,169 0.27 % 676.10 0 0 % 0 1,902 0.03 % 47.89 5,609 0.54 % 1,417.89 398,609 0.28 % 734.46 56,720 0.13 % 302.64 297,430 0.35 % 935.49 6,899 0.07 % 160.69 življenje življenje živl jenje 762,139 0.27 % 671.67 0 0 % 0 38,014 0.62 % 957.15 2,198 0.21 % 555.63 319,278 0.23 % 588.29 169,305 0.39 % 903.36 178,445 0.21 % 561.26 54,899 0.53 % 1,278.66 družba družba d ružba 762,071 0.27 % 671.61 0 0 % 0 6,818 0.11 % 171.67 4,866 0.47 % 1,230.07 394,846 0.28 % 727.53 79,568 0.18 % 424.55 247,831 0.29 % 779.49 28,142 0.27 % 655.46 zakon zakon zakon 706,952 0.25 % 623.03 1 0.05 % 103.02 4,285 0.07 % 107.89 20,577 1.98 % 5,201.64 358,907 0.26 % 661.31 70,886 0.16 % 378.22 236,331 0.28 % 743.32 15,965 0.15 % 371.84 občina občina o bčina 688,467 0.24 % 606.74 0 0 % 0 430 0.01 % 10.83 2,616 0.25 % 661.30 491,291 0.35 % 905.24 24,281 0.06 % 129.56 167,303 0.20 % 526.21 2,546 0.02 % 59.30 teden teden teden 683,852 0.24 % 602.68 2 0.09 % 206.04 14,002 0.23 % 352.56 904 0.09 % 228.52 351,497 0.25 % 647.66 84,244 0.19 % 449.50 225,387 0.27 % 708.90 7,816 0.07 % 182.04 prostor prostor pr ostor 672,859 0.24 % 592.99 0 0 % 0 13,594 0.22 % 342.28 1,932 0.19 % 488.39 359,646 0.26 % 662.67 111,497 0.26 % 594.91 160,130 0.19 % 503.65 26,060 0.25 % 606.97 točka točka točka 672,245 0.24 % 592.45 0 0 % 0 3,432 0.06 % 86.41 3,686 0.35 % 931.78 305,174 0.22 % 562.30 49,829 0.12 % 265.87 294,805 0.35 % 927.24 15,319 0.15 % 356.80 program program pr ogram 667,540 0.23 % 588.30 0 0 % 0 1,804 0.03 % 45.42 3,767 0.36 % 952.26 339,935 0.24 % 626.35 138,849 0.32 % 740.85 162,628 0.19 % 511.51 20,557 0.20 % 478.80 mesec mesec mesec 653,149 0.23 % 575.62 1 0.05 % 103.02 11,071 0.18 % 278.76 2,582 
0.25 % 652.70 331,605 0.24 % 611 96,549 0.22 % 515.15 202,005 0.24 % 635.36 9,336 0.09 % 217.45
težava težava t ežava 621,900 0.22 % 548.08 1 0.05 % 103.02 10,735 0.18 % 270.30 620 0.06 % 156.73 292,123 0.21 % 538.26 119,138 0.28 % 635.68 179,213 0.21 % 563.67 20,070 0.19 % 467.45
vprašanje vprašanje vpra šanje 619,723 0.22 % 546.16 0 0 % 0 12,480 0.20 % 314.23 2,996 0.29 % 757.36 317,174 0.23 % 584.41 81,372 0.19 % 434.17 177,497 0.21 % 558.27 28,204 0.27 % 656.90
vrsta vrsta vrsta 605,858 0.21 % 533.94 5 0.23 % 515.09 14,107 0.23 % 355.20 3,289 0.32 % 831.42 283,858 0.20 % 523.03 114,099 0.26 % 608.79 148,498 0.18 % 467.06 42,002 0.40 % 978.28
beseda beseda b eseda 594,966 0.21 % 524.34 0 0 % 0 32,479 0.53 % 817.79 2,963 0.28 % 749.01 261,656 0.19 % 482.12 74,175 0.17 % 395.77 190,944 0.23 % 600.57 32,749 0.31 % 762.76
stranka stranka st ranka 591,212 0.21 % 521.03 0 0 % 0 3,498 0.06 % 88.08 3,818 0.37 % 965.15 283,718 0.20 % 522.77 56,592 0.13 % 301.96 233,813 0.28 % 735.40 9,773 0.09 % 227.62
podatek podatek po datek 587,651 0.21 % 517.89 1 0.05 % 103.02 3,365 0.06 % 84.73 4,325 0.42 % 1,093.31 242,117 0.17 % 446.12 108,286 0.25 % 577.78 206,424 0.25 % 649.26 23,133 0.22 % 538.80
začetek začetek za četek 584,429 0.20 % 515.05 1 0.05 % 103.02 8,101 0.13 % 203.97 1,762 0.17 % 445.41 287,624 0.20 % 529.97 85,385 0.20 % 455.59 182,376 0.22 % 573.62 19,180 0.18 % 446.73

1.2.31 List of initial character-level 1-grams from noun lower-case word forms in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-word_parts-nouns-lowercase_forms-initial-1grams-taxonomy-entire.tsv
Columns: Lower-case word form; Initial part of the word; Rest of the word; Total absolute frequency of lower-case word form; Percentage of all found lower-case word forms; Total relative frequency (per million occurrences); then Absolute frequency, Percentage, and Relative frequency for each text type, in this order: [SSJ.T.P.R], [SSJ.T.K.S], [SSJ.T.K.L], [SSJ.T.K.N], [SSJ.I], [SSJ.T.P.C], [SSJ.T.D].

leta l eta 1,848,166 0.52 % 1,628.78 269,635 0.49 % 1,438.68 66,252 0.51 % 1,543.09 16,515 0.20 % 415.83 0 0 % 0 611,187 0.59 % 1,922.34 879,792 0.51 % 1,621.07 4,785 0.39 % 1,209.60
let l et 873,231 0.25 % 769.57 144,472 0.26 % 770.86 20,474 0.16 % 476.86 18,975 0.22 % 477.77 0 0 % 0 237,663 0.23 % 747.51 449,263 0.26 % 827.80 2,384 0.19 % 602.65
evrov e vrov 773,102 0.22 % 681.33 31,324 0.06 % 167.13 780 0.01 % 18.17 446 0.01 % 11.23 0 0 % 0 417,524 0.41 % 1,313.22 323,022 0.19 % 595.19 6 0 % 1.52
leto l eto 721,611 0.20 % 635.95 98,875 0.18 % 527.56 10,303 0.08 % 239.97 6,698 0.08 % 168.65 0 0 % 0 209,509 0.20 % 658.96 394,765 0.23 % 727.38 1,461 0.12 % 369.33
del d el 709,817 0.20 % 625.56 112,840 0.21 % 602.08 31,994 0.25 % 745.18 9,763 0.12 % 245.82 4 0.13 % 412.07 196,278 0.19 % 617.35 356,268 0.21 % 656.45 2,670 0.22 % 674.95
dan d an 699,476 0.20 % 616.44 125,516 0.23 % 669.71 17,378 0.13 % 404.75 28,703 0.34 % 722.71 0 0 % 0 190,718 0.18 % 599.86 335,315 0.20 % 617.84 1,846 0.15 % 466.65
dela d ela 657,992 0.19 % 579.89 87,263 0.16 % 465.61 29,686 0.23 % 691.42 8,364 0.10 % 210.60 1 0.03 % 103.02 192,362 0.19 % 605.03 337,659 0.20 % 622.16 2,657 0.21 % 671.66
strani s trani 656,360 0.19 % 578.45 128,171 0.23 % 683.88 33,031 0.26 % 769.33 19,223 0.23 % 484.01 5 0.16 % 515.09 199,449 0.19 % 627.32 274,744 0.16 % 506.23 1,737 0.14 % 439.09
ljudi l judi 634,249 0.18 % 558.96 96,577 0.18 % 515.30 24,644 0.19 % 573.99 16,405 0.19 % 413.06 0 0 % 0 210,518 0.20 % 662.13 284,778
0.17 % 524.72 1,327 0.11 % 335.45 ljubljana l jubljana 629,709 0.18 % 554.96 31,935 0.06 % 170.39 10,049 0.08 % 234.05 517 0.01 % 13.02 0 0 % 0 273,474 0.27 % 860.15 313,379 0.18 % 577.42 355 0.03 % 89.74 mesto m esto 628,282 0.18 % 553.70 69,292 0.13 % 369.72 14,019 0.11 % 326.52 10,635 0.12 % 267.78 0 0 % 0 181,789 0.18 % 571.77 351,818 0.20 % 648.25 729 0.06 % 184.28 letih l etih 593,892 0.17 % 523.39 88,203 0.16 % 470.62 15,668 0.12 % 364.93 7,036 0.08 % 177.16 0 0 % 0 186,187 0.18 % 585.61 295,576 0.17 % 544.62 1,222 0.10 % 308.91 delo d elo 592,066 0.17 % 521.78 87,155 0.16 % 465.03 26,130 0.20 % 608.60 9,140 0.11 % 230.14 0 0 % 0 162,646 0.16 % 511.56 304,906 0.18 % 561.81 2,089 0.17 % 528.08 čas č as 584,983 0.17 % 515.54 115,093 0.21 % 614.10 21,773 0.17 % 507.12 25,547 0.30 % 643.25 2 0.07 % 206.04 147,089 0.14 % 462.63 273,358 0.16 % 503.68 2,121 0.17 % 536.17 slovenije s lovenije 579,657 0.16 % 510.85 56,763 0.10 % 302.87 6,065 0.05 % 141.26 465 0.01 % 11.71 0 0 % 0 198,780 0.19 % 625.21 310,741 0.18 % 572.56 6,843 0.55 % 1,729.84 sloveniji s loveniji 567,197 0.16 % 499.87 78,143 0.14 % 416.95 6,967 0.05 % 162.27 478 0.01 % 12.04 0 0 % 0 187,227 0.18 % 588.88 292,345 0.17 % 538.66 2,037 0.17 % 514.93 odstotkov o dstotkov 559,799 0.16 % 493.35 53,835 0.10 % 287.25 5,020 0.04 % 116.92 578 0.01 % 14.55 0 0 % 0 188,394 0.18 % 592.55 311,673 0.18 % 574.28 299 0.02 % 75.58 predsednik p redsednik 545,455 0.15 % 480.71 37,608 0.07 % 200.66 2,827 0.02 % 65.84 1,964 0.02 % 49.45 0 0 % 0 192,584 0.19 % 605.73 309,379 0.18 % 570.05 1,093 0.09 % 276.30 času č asu 515,535 0.15 % 454.34 82,263 0.15 % 438.93 18,270 0.14 % 425.53 7,848 0.09 % 197.60 1 0.03 % 103.02 157,737 0.15 % 496.12 247,802 0.14 % 456.59 1,614 0.13 % 408 časa č asa 504,880 0.14 % 444.95 104,840 0.19 % 559.39 20,705 0.16 % 482.24 24,957 0.29 % 628.39 12 0.39 % 1,236.22 124,379 0.12 % 391.20 228,687 0.13 % 421.37 1,300 0.10 % 328.63 dni d ni 492,162 0.14 % 433.74 74,007 0.14 % 394.88 8,434 0.07 % 
196.44 13,720 0.16 % 345.45 2 0.07 % 206.04 137,827 0.13 % 433.50 256,647 0.15 % 472.89 1,525 0.12 % 385.50
svetu s vetu 452,594 0.13 % 398.87 76,162 0.14 % 406.38 14,688 0.11 % 342.10 9,995 0.12 % 251.66 0 0 % 0 137,347 0.13 % 431.99 213,404 0.12 % 393.21 998 0.08 % 252.28
milijonov m ilijonov 451,741 0.13 % 398.12 39,191 0.07 % 209.11 2,589 0.02 % 60.30 883 0.01 % 22.23 0 0 % 0 145,704 0.14 % 458.28 263,033 0.15 % 484.66 341 0.03 % 86.20
države d ržave 441,261 0.12 % 388.88 43,421 0.08 % 231.68 12,978 0.10 % 302.27 1,697 0.02 % 42.73 0 0 % 0 160,359 0.16 % 504.37 219,400 0.13 % 404.26 3,406 0.28 % 861
slovenija s lovenija 434,662 0.12 % 383.07 38,121 0.07 % 203.40 3,078 0.02 % 71.69 216 0 % 5.44 0 0 % 0 192,738 0.19 % 606.21 199,029 0.12 % 366.72 1,480 0.12 % 374.13
tolarjev t olarjev 412,138 0.12 % 363.22 40,402 0.07 % 215.57 398 0 % 9.27 93 0 % 2.34 0 0 % 0 1,394 0 % 4.38 368,701 0.21 % 679.36 1,150 0.09 % 290.71
mestu m estu 409,944 0.12 % 361.28 53,464 0.10 % 285.27 9,822 0.08 % 228.77 10,096 0.12 % 254.21 0 0 % 0 137,112 0.13 % 431.25 198,916 0.12 % 366.52 534 0.04 % 134.99
odstotka o dstotka 408,111 0.12 % 359.67 15,227 0.03 % 81.25 1,027 0.01 % 23.92 48 0 % 1.21 0 0 % 0 209,030 0.20 % 657.45 182,717 0.11 % 336.67 62 0.01 % 15.67
letu l etu 396,319 0.11 % 349.27 50,649 0.09 % 270.25 9,551 0.07 % 222.45 1,695 0.02 % 42.68 0 0 % 0 128,073 0.12 % 402.82 205,061 0.12 % 377.84 1,290 0.10 % 326.10
koncu k oncu 386,487 0.11 % 340.61 60,717 0.11 % 323.97 12,930 0.10 % 301.16 11,331 0.13 % 285.30 20 0.65 % 2,060.37 118,425 0.12 % 372.48 182,053 0.11 % 335.44 1,011 0.08 % 255.57
leti l eti 380,841 0.11 % 335.63 60,949 0.11 % 325.20 5,767 0.04 % 134.32 6,530 0.08 % 164.42 0 0 % 0 106,368 0.10 % 334.56 200,657 0.12 % 369.72 570 0.05 % 144.09
ljudje l judje 380,144 0.11 % 335.02 75,678 0.14 % 403.79 20,838 0.16 % 485.34 17,127 0.20 % 431.24 0 0 % 0 92,650 0.09 % 291.41 172,775 0.10 % 318.35 1,076 0.09 % 272

1.2.32 List of initial character-level 2-grams from noun lower-case word forms in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-word_parts-nouns-lowercase_forms-initial-2grams-taxonomy-entire.tsv
Columns: Lower-case word form; Initial part of the word; Rest of the word; Total absolute frequency of lower-case word form; Percentage of all found lower-case word forms; Total relative frequency (per million occurrences); then Absolute frequency, Percentage, and Relative frequency for each text type, in this order: [SSJ.T.P.R], [SSJ.T.K.S], [SSJ.T.K.L], [SSJ.T.K.N], [SSJ.I], [SSJ.T.P.C], [SSJ.T.D].

leta le ta 1,848,166 0.53 % 1,628.78 269,635 0.49 % 1,438.68 66,252 0.52 % 1,543.09 16,515 0.20 % 415.83 0 0 % 0 611,187 0.59 % 1,922.34 879,792 0.51 % 1,621.07 4,785 0.39 % 1,209.60
let le t 873,231 0.25 % 769.57 144,472 0.27 % 770.86 20,474 0.16 % 476.86 18,975 0.22 % 477.77 0 0 % 0 237,663 0.23 % 747.51 449,263 0.26 % 827.80 2,384 0.19 % 602.65
evrov ev rov 773,102 0.22 % 681.33 31,324 0.06 % 167.13 780 0.01 % 18.17 446 0.01 % 11.23 0 0 % 0 417,524 0.41 % 1,313.22 323,022 0.19 % 595.19 6 0 % 1.52
leto le to 721,611 0.20 % 635.95 98,875 0.18 % 527.56 10,303 0.08 % 239.97 6,698 0.08 % 168.65 0 0 % 0 209,509 0.20 % 658.96 394,765 0.23 % 727.38 1,461 0.12 % 369.33
del de l 709,817 0.20 % 625.56 112,840 0.21 % 602.08 31,994 0.25 % 745.18 9,763 0.12 % 245.82 4 0.13 % 412.07 196,278 0.19 % 617.35 356,268 0.21 % 656.45 2,670 0.22 % 674.95
dan da n 699,476 0.20 % 616.44 125,516 0.23 % 669.71 17,378 0.14 % 404.75 28,703 0.34 % 722.71 0 0 % 0 190,718 0.19 %
599.86 335,315 0.20 % 617.84 1,846 0.15 % 466.65 dela de la 657,992 0.19 % 579.89 87,263 0.16 % 465.61 29,686 0.23 % 691.42 8,364 0.10 % 210.60 1 0.03 % 103.02 192,362 0.19 % 605.03 337,659 0.20 % 622.16 2,657 0.22 % 671.66 strani st rani 656,360 0.19 % 578.45 128,171 0.23 % 683.88 33,031 0.26 % 769.33 19,223 0.23 % 484.01 5 0.17 % 515.09 199,449 0.19 % 627.32 274,744 0.16 % 506.23 1,737 0.14 % 439.09 ljudi lj udi 634,249 0.18 % 558.96 96,577 0.18 % 515.30 24,644 0.19 % 573.99 16,405 0.19 % 413.06 0 0 % 0 210,518 0.20 % 662.13 284,778 0.17 % 524.72 1,327 0.11 % 335.45 ljubljana lj ubljana 629,709 0.18 % 554.96 31,935 0.06 % 170.39 10,049 0.08 % 234.05 517 0.01 % 13.02 0 0 % 0 273,474 0.27 % 860.15 313,379 0.18 % 577.42 355 0.03 % 89.74 mesto me sto 628,282 0.18 % 553.70 69,292 0.13 % 369.72 14,019 0.11 % 326.52 10,635 0.13 % 267.78 0 0 % 0 181,789 0.18 % 571.77 351,818 0.20 % 648.25 729 0.06 % 184.28 letih le tih 593,892 0.17 % 523.39 88,203 0.16 % 470.62 15,668 0.12 % 364.93 7,036 0.08 % 177.16 0 0 % 0 186,187 0.18 % 585.61 295,576 0.17 % 544.62 1,222 0.10 % 308.91 delo de lo 592,066 0.17 % 521.78 87,155 0.16 % 465.03 26,130 0.20 % 608.60 9,140 0.11 % 230.14 0 0 % 0 162,646 0.16 % 511.56 304,906 0.18 % 561.81 2,089 0.17 % 528.08 čas ča s 584,983 0.17 % 515.54 115,093 0.21 % 614.10 21,773 0.17 % 507.12 25,547 0.30 % 643.25 2 0.07 % 206.04 147,089 0.14 % 462.63 273,358 0.16 % 503.68 2,121 0.17 % 536.17 slovenije sl ovenije 579,657 0.17 % 510.85 56,763 0.10 % 302.87 6,065 0.05 % 141.26 465 0.01 % 11.71 0 0 % 0 198,780 0.19 % 625.21 310,741 0.18 % 572.56 6,843 0.56 % 1,729.84 sloveniji sl oveniji 567,197 0.16 % 499.87 78,143 0.14 % 416.95 6,967 0.05 % 162.27 478 0.01 % 12.04 0 0 % 0 187,227 0.18 % 588.88 292,345 0.17 % 538.66 2,037 0.17 % 514.93 odstotkov od stotkov 559,799 0.16 % 493.35 53,835 0.10 % 287.25 5,020 0.04 % 116.92 578 0.01 % 14.55 0 0 % 0 188,394 0.18 % 592.55 311,673 0.18 % 574.28 299 0.02 % 75.58 predsednik pr edsednik 545,455 0.15 % 480.71 37,608 0.07 
% 200.66 2,827 0.02 % 65.84 1,964 0.02 % 49.45 0 0 % 0 192,584 0.19 % 605.73 309,379 0.18 % 570.05 1,093 0.09 % 276.30 času ča su 515,535 0.15 % 454.34 82,263 0.15 % 438.93 18,270 0.14 % 425.53 7,848 0.09 % 197.60 1 0.03 % 103.02 157,737 0.15 % 496.12 247,802 0.14 % 456.59 1,614 0.13 % 408 časa ča sa 504,880 0.14 % 444.95 104,840 0.19 % 559.39 20,705 0.16 % 482.24 24,957 0.29 % 628.39 12 0.40 % 1,236.22 124,379 0.12 % 391.20 228,687 0.13 % 421.37 1,300 0.11 % 328.63 dni dn i 492,162 0.14 % 433.74 74,007 0.14 % 394.88 8,434 0.07 % 196.44 13,720 0.16 % 345.45 2 0.07 % 206.04 137,827 0.13 % 433.50 256,647 0.15 % 472.89 1,525 0.12 % 385.50 svetu sv etu 452,594 0.13 % 398.87 76,162 0.14 % 406.38 14,688 0.11 % 342.10 9,995 0.12 % 251.66 0 0 % 0 137,347 0.13 % 431.99 213,404 0.12 % 393.21 998 0.08 % 252.28 milijonov mi lijonov 451,741 0.13 % 398.12 39,191 0.07 % 209.11 2,589 0.02 % 60.30 883 0.01 % 22.23 0 0 % 0 145,704 0.14 % 458.28 263,033 0.15 % 484.66 341 0.03 % 86.20 države dr žave 441,261 0.13 % 388.88 43,421 0.08 % 231.68 12,978 0.10 % 302.27 1,697 0.02 % 42.73 0 0 % 0 160,359 0.16 % 504.37 219,400 0.13 % 404.26 3,406 0.28 % 861 slovenija sl ovenija 434,662 0.12 % 383.07 38,121 0.07 % 203.40 3,078 0.02 % 71.69 216 0 % 5.44 0 0 % 0 192,738 0.19 % 606.21 199,029 0.12 % 366.72 1,480 0.12 % 374.13 tolarjev to larjev 412,138 0.12 % 363.22 40,402 0.07 % 215.57 398 0 % 9.27 93 0 % 2.34 0 0 % 0 1,394 0 % 4.38 368,701 0.21 % 679.36 1,150 0.09 % 290.71 mestu me stu 409,944 0.12 % 361.28 53,464 0.10 % 285.27 9,822 0.08 % 228.77 10,096 0.12 % 254.21 0 0 % 0 137,112 0.13 % 431.25 198,916 0.12 % 366.52 534 0.04 % 134.99 odstotka od stotka 408,111 0.12 % 359.67 15,227 0.03 % 81.25 1,027 0.01 % 23.92 48 0 % 1.21 0 0 % 0 209,030 0.20 % 657.45 182,717 0.11 % 336.67 62 0.01 % 15.67 letu le tu 396,319 0.11 % 349.27 50,649 0.09 % 270.25 9,551 0.07 % 222.45 1,695 0.02 % 42.68 0 0 % 0 128,073 0.12 % 402.82 205,061 0.12 % 377.84 1,290 0.10 % 326.10 koncu ko ncu 386,487 0.11 % 340.61 
60,717 0.11 % 323.97 12,930 0.10 % 301.16 11,331 0.13 % 285.30 20 0.67 % 2,060.37 118,425 0.12 % 372.48 182,053 0.11 % 335.44 1,011 0.08 % 255.57
leti le ti 380,841 0.11 % 335.63 60,949 0.11 % 325.20 5,767 0.04 % 134.32 6,530 0.08 % 164.42 0 0 % 0 106,368 0.10 % 334.56 200,657 0.12 % 369.72 570 0.05 % 144.09
ljudje lj udje 380,144 0.11 % 335.02 75,678 0.14 % 403.79 20,838 0.16 % 485.34 17,127 0.20 % 431.24 0 0 % 0 92,650 0.09 % 291.41 172,775 0.10 % 318.35 1,076 0.09 % 272

1.2.33 List of initial character-level 3-grams from noun lower-case word forms in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-word_parts-nouns-lowercase_forms-initial-3grams-taxonomy-entire.tsv

Columns: Lower-case word form | Initial part of the word | Rest of the word | Total absolute frequency of lower-case word form | Percentage of all found lower-case word forms | Total relative frequency (per million occurrences) | Absolute frequency [SSJ.T.P.R] | Percentage [SSJ.T.P.R] | Relative frequency [SSJ.T.P.R] | Absolute frequency [SSJ.T.K.S] | Percentage [SSJ.T.K.S] | Relative frequency [SSJ.T.K.S] | Absolute frequency [SSJ.T.K.L] | Percentage [SSJ.T.K.L] | Relative frequency [SSJ.T.K.L] | Absolute frequency [SSJ.T.K.N] | Percentage [SSJ.T.K.N] | Relative frequency [SSJ.T.K.N] | Absolute frequency [SSJ.I] | Percentage [SSJ.I] | Relative frequency [SSJ.I] | Absolute frequency [SSJ.T.P.C] | Percentage [SSJ.T.P.C] | Relative frequency [SSJ.T.P.C] | Absolute frequency [SSJ.T.D] | Percentage [SSJ.T.D] | Relative frequency [SSJ.T.D]

leta let a 1,848,166 0.53 % 1,628.78 269,635 0.50 % 1,438.68 66,252 0.52 % 1,543.09 16,515 0.20 % 415.83 0 0 % 0 611,187 0.60 % 1,922.34 879,792 0.52 % 1,621.07 4,785 0.39 % 1,209.60
let let 873,231 0.25 % 769.57 144,472 0.27 % 770.86 20,474 0.16 % 476.86 18,975 0.23 % 477.77 0 0 % 0 237,663 0.23 % 747.51 449,263 0.26 % 827.80 2,384 0.20 % 602.65
evrov evr ov 773,102 0.22 % 681.33 31,324 0.06 % 167.13 780 0.01 % 18.17 446 0.01 % 11.23 0 0 % 0 417,524 0.41 % 1,313.22 323,022 0.19 % 595.19 6 0 % 1.52
leto let o 721,611 0.21 % 635.95 98,875 0.18 % 527.56 10,303 0.08 % 239.97 6,698 0.08 % 168.65 0 0 % 0 209,509 0.21 % 658.96 394,765 0.23 % 727.38 1,461 0.12 % 369.33
del del 709,817 0.20 % 625.56 112,840 0.21 % 602.08 31,994 0.25 % 745.18 9,763 0.12 % 245.82 4 0.14 % 412.07 196,278 0.19 % 617.35 356,268 0.21 % 656.45 2,670 0.22 % 674.95
dan dan 699,476 0.20 % 616.44 125,516 0.23 % 669.71 17,378 0.14 % 404.75 28,703 0.34 % 722.71 0 0 % 0 190,718 0.19 % 599.86 335,315 0.20 % 617.84 1,846 0.15 % 466.65
dela del a 657,992 0.19 % 579.89 87,263 0.16 % 465.61 29,686 0.23 % 691.42 8,364 0.10 % 210.60 1 0.03 % 103.02 192,362 0.19 % 605.03 337,659 0.20 % 622.16 2,657 0.22 % 671.66
strani str ani 656,360 0.19 % 578.45 128,171 0.24 % 683.88 33,031 0.26 % 769.33 19,223 0.23 % 484.01 5 0.17 % 515.09 199,449 0.20 % 627.32 274,744 0.16 % 506.23 1,737 0.14 % 439.09
ljudi lju di 634,249 0.18 % 558.96 96,577 0.18 % 515.30 24,644 0.19 % 573.99 16,405 0.19 % 413.06 0 0 % 0 210,518 0.21 % 662.13 284,778 0.17 % 524.72 1,327 0.11 % 335.45
ljubljana lju bljana 629,709 0.18 % 554.96 31,935 0.06 % 170.39 10,049 0.08 % 234.05 517 0.01 % 13.02 0 0 % 0 273,474 0.27 % 860.15 313,379 0.18 % 577.42 355 0.03 % 89.74
mesto mes to 628,282 0.18 % 553.70 69,292 0.13 % 369.72 14,019 0.11 % 326.52 10,635 0.13 % 267.78 0 0 % 0 181,789 0.18 % 571.77 351,818 0.21 % 648.25 729 0.06 % 184.28
letih let ih 593,892 0.17 % 523.39 88,203 0.16 % 470.62 15,668 0.12 % 364.93 7,036 0.08 % 177.16 0 0 % 0 186,187 0.18 % 585.61 295,576 0.17 % 544.62 1,222 0.10 % 308.91
delo del o 592,066 0.17 % 521.78 87,155 0.16 % 465.03 26,130 0.20 % 608.60 9,140 0.11 % 230.14 0 0 % 0 162,646 0.16 % 511.56 304,906 0.18 % 561.81 2,089 0.17 % 528.08
čas čas 584,983 0.17 % 515.54 115,093 0.21 % 614.10 21,773 0.17 % 507.12 25,547 0.30 % 643.25 2 0.07 % 206.04 147,089 0.14 % 462.63 273,358 0.16 % 503.68 2,121 0.17 % 536.17
slovenije slo venije 579,657 0.17 % 510.85 56,763 0.10 % 302.87 6,065 0.05 % 141.26 465 0.01 % 11.71 0 0 % 0 198,780 0.20 % 625.21 310,741 0.18 % 572.56 6,843 0.56 % 1,729.84
sloveniji slo veniji 567,197 0.16 % 499.87 78,143 0.14 % 416.95 6,967 0.06 % 162.27 478 0.01 % 12.04 0 0 % 0 187,227 0.18 % 588.88 292,345 0.17 % 538.66 2,037 0.17 % 514.93
odstotkov ods totkov 559,799 0.16 % 493.35 53,835 0.10 % 287.25 5,020 0.04 % 116.92 578 0.01 % 14.55 0 0 % 0 188,394 0.18 % 592.55 311,673 0.18 % 574.28 299 0.02 % 75.58
predsednik pre dsednik 545,455 0.16 % 480.71 37,608 0.07 % 200.66 2,827 0.02 % 65.84 1,964 0.02 % 49.45 0 0 % 0 192,584 0.19 % 605.73 309,379 0.18 % 570.05 1,093 0.09 % 276.30
času čas u 515,535 0.15 % 454.34 82,263 0.15 % 438.93 18,270 0.14 % 425.53 7,848 0.09 % 197.60 1 0.03 % 103.02 157,737 0.15 % 496.12 247,802 0.15 % 456.59 1,614 0.13 % 408
časa čas a 504,880 0.14 % 444.95 104,840 0.19 % 559.39 20,705 0.16 % 482.24 24,957 0.30 % 628.39 12 0.41 % 1,236.22 124,379 0.12 % 391.20 228,687 0.13 % 421.37 1,300 0.11 % 328.63
dni dni 492,162 0.14 % 433.74 74,007 0.14 % 394.88 8,434 0.07 % 196.44 13,720 0.16 % 345.45 2 0.07 % 206.04 137,827 0.14 % 433.50 256,647 0.15 % 472.89 1,525 0.12 % 385.50
svetu sve tu 452,594 0.13 % 398.87 76,162 0.14 % 406.38 14,688 0.12 % 342.10 9,995 0.12 % 251.66 0 0 % 0 137,347 0.14 % 431.99 213,404 0.12 % 393.21 998 0.08 % 252.28
milijonov mil ijonov 451,741 0.13 % 398.12 39,191 0.07 % 209.11 2,589 0.02 % 60.30 883 0.01 % 22.23 0 0 % 0 145,704 0.14 % 458.28 263,033 0.15 % 484.66 341 0.03 % 86.20
države drž ave 441,261 0.13 % 388.88 43,421 0.08 % 231.68 12,978 0.10 % 302.27 1,697 0.02 % 42.73 0 0 % 0 160,359 0.16 % 504.37 219,400 0.13 % 404.26 3,406 0.28 % 861
slovenija slo venija 434,662 0.12 % 383.07 38,121 0.07 % 203.40 3,078 0.02 % 71.69 216 0 % 5.44 0 0 % 0 192,738 0.19 % 606.21 199,029 0.12 % 366.72 1,480 0.12 % 374.13
tolarjev tol arjev 412,138 0.12 % 363.22 40,402 0.07 % 215.57 398 0 % 9.27 93 0 % 2.34 0 0 % 0 1,394 0 % 4.38 368,701 0.22 % 679.36 1,150 0.09 %
290.71
mestu mes tu 409,944 0.12 % 361.28 53,464 0.10 % 285.27 9,822 0.08 % 228.77 10,096 0.12 % 254.21 0 0 % 0 137,112 0.14 % 431.25 198,916 0.12 % 366.52 534 0.04 % 134.99
odstotka ods totka 408,111 0.12 % 359.67 15,227 0.03 % 81.25 1,027 0.01 % 23.92 48 0 % 1.21 0 0 % 0 209,030 0.20 % 657.45 182,717 0.11 % 336.67 62 0.01 % 15.67
letu let u 396,319 0.11 % 349.27 50,649 0.09 % 270.25 9,551 0.07 % 222.45 1,695 0.02 % 42.68 0 0 % 0 128,073 0.13 % 402.82 205,061 0.12 % 377.84 1,290 0.10 % 326.10
koncu kon cu 386,487 0.11 % 340.61 60,717 0.11 % 323.97 12,930 0.10 % 301.16 11,331 0.13 % 285.30 20 0.68 % 2,060.37 118,425 0.12 % 372.48 182,053 0.11 % 335.44 1,011 0.08 % 255.57
leti let i 380,841 0.11 % 335.63 60,949 0.11 % 325.20 5,767 0.04 % 134.32 6,530 0.08 % 164.42 0 0 % 0 106,368 0.10 % 334.56 200,657 0.12 % 369.72 570 0.05 % 144.09
ljudje lju dje 380,144 0.11 % 335.02 75,678 0.14 % 403.79 20,838 0.16 % 485.34 17,127 0.20 % 431.24 0 0 % 0 92,650 0.09 % 291.41 172,775 0.10 % 318.35 1,076 0.09 % 272

1.2.34 List of initial character-level 4-grams from noun lower-case word forms in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-word_parts-nouns-lowercase_forms-initial-4grams-taxonomy-entire.tsv

Columns: Lower-case word form | Initial part of the word | Rest of the word | Total absolute frequency of lower-case word form | Percentage of all found lower-case word forms | Total relative frequency (per million occurrences) | Absolute frequency [SSJ.T.P.R] | Percentage [SSJ.T.P.R] | Relative frequency [SSJ.T.P.R] | Absolute frequency [SSJ.T.K.S] | Percentage [SSJ.T.K.S] | Relative frequency [SSJ.T.K.S] | Absolute frequency [SSJ.T.K.L] | Percentage [SSJ.T.K.L] | Relative frequency [SSJ.T.K.L] | Absolute frequency [SSJ.T.K.N] | Percentage [SSJ.T.K.N] | Relative frequency [SSJ.T.K.N] | Absolute frequency [SSJ.I] | Percentage [SSJ.I] | Relative frequency [SSJ.I] | Absolute frequency [SSJ.T.P.C] | Percentage [SSJ.T.P.C] | Relative frequency [SSJ.T.P.C] | Absolute frequency [SSJ.T.D] | Percentage [SSJ.T.D] | Relative frequency [SSJ.T.D]

leta leta 1,848,166 0.56 % 1,628.78 269,635 0.52 % 1,438.68 66,252 0.54 % 1,543.09 16,515 0.21 % 415.83 0 0 % 0 611,187 0.63 % 1,922.34 879,792 0.54 % 1,621.07 4,785 0.40 % 1,209.60
evrov evro v 773,102 0.23 % 681.33 31,324 0.06 % 167.13 780 0.01 % 18.17 446 0.01 % 11.23 0 0 % 0 417,524 0.43 % 1,313.22 323,022 0.20 % 595.19 6 0 % 1.52
leto leto 721,611 0.22 % 635.95 98,875 0.19 % 527.56 10,303 0.08 % 239.97 6,698 0.09 % 168.65 0 0 % 0 209,509 0.21 % 658.96 394,765 0.24 % 727.38 1,461 0.12 % 369.33
dela dela 657,992 0.20 % 579.89 87,263 0.17 % 465.61 29,686 0.24 % 691.42 8,364 0.11 % 210.60 1 0.04 % 103.02 192,362 0.20 % 605.03 337,659 0.21 % 622.16 2,657 0.22 % 671.66
strani stra ni 656,360 0.20 % 578.45 128,171 0.25 % 683.88 33,031 0.27 % 769.33 19,223 0.24 % 484.01 5 0.18 % 515.09 199,449 0.20 % 627.32 274,744 0.17 % 506.23 1,737 0.15 % 439.09
ljudi ljud i 634,249 0.19 % 558.96 96,577 0.19 % 515.30 24,644 0.20 % 573.99 16,405 0.21 % 413.06 0 0 % 0 210,518 0.22 % 662.13 284,778 0.17 % 524.72 1,327 0.11 % 335.45
ljubljana ljub ljana 629,709 0.19 % 554.96 31,935 0.06 % 170.39 10,049 0.08 % 234.05 517 0.01 % 13.02 0 0 % 0 273,474 0.28 % 860.15 313,379 0.19 % 577.42 355 0.03 % 89.74
mesto mest o 628,282 0.19 % 553.70 69,292 0.14 % 369.72 14,019 0.11 % 326.52 10,635 0.13 % 267.78 0 0 % 0 181,789 0.19 % 571.77 351,818 0.22 % 648.25 729 0.06 % 184.28
letih leti h 593,892 0.18 % 523.39 88,203 0.17 % 470.62 15,668 0.13 % 364.93 7,036 0.09 % 177.16 0 0 % 0 186,187 0.19 % 585.61 295,576 0.18 % 544.62 1,222 0.10 % 308.91
delo delo 592,066 0.18 % 521.78 87,155 0.17 % 465.03 26,130 0.21 % 608.60 9,140 0.12 % 230.14 0 0 % 0 162,646 0.17 % 511.56 304,906 0.19 % 561.81 2,089 0.18 % 528.08
slovenije slov enije 579,657 0.17 % 510.85 56,763 0.11 % 302.87 6,065 0.05 % 141.26 465 0.01 % 11.71 0 0 % 0 198,780 0.20 % 625.21 310,741 0.19 % 572.56 6,843 0.58 % 1,729.84
sloveniji slov eniji 567,197 0.17 % 499.87 78,143 0.15 % 416.95 6,967 0.06 % 162.27 478 0.01 % 12.04 0 0 % 0 187,227 0.19 % 588.88 292,345 0.18 % 538.66 2,037 0.17 % 514.93
odstotkov odst otkov 559,799 0.17 % 493.35 53,835 0.10 % 287.25 5,020 0.04 % 116.92 578 0.01 % 14.55 0 0 % 0 188,394 0.19 % 592.55 311,673 0.19 % 574.28 299 0.03 % 75.58
predsednik pred sednik 545,455 0.16 % 480.71 37,608 0.07 % 200.66 2,827 0.02 % 65.84 1,964 0.03 % 49.45 0 0 % 0 192,584 0.20 % 605.73 309,379 0.19 % 570.05 1,093 0.09 % 276.30
času času 515,535 0.15 % 454.34 82,263 0.16 % 438.93 18,270 0.15 % 425.53 7,848 0.10 % 197.60 1 0.04 % 103.02 157,737 0.16 % 496.12 247,802 0.15 % 456.59 1,614 0.14 % 408
časa časa 504,880 0.15 % 444.95 104,840 0.20 % 559.39 20,705 0.17 % 482.24 24,957 0.32 % 628.39 12 0.42 % 1,236.22 124,379 0.13 % 391.20 228,687 0.14 % 421.37 1,300 0.11 % 328.63
svetu svet u 452,594 0.14 % 398.87 76,162 0.15 % 406.38 14,688 0.12 % 342.10 9,995 0.13 % 251.66 0 0 % 0 137,347 0.14 % 431.99 213,404 0.13 % 393.21 998 0.08 % 252.28
milijonov mili jonov 451,741 0.14 % 398.12 39,191 0.08 % 209.11 2,589 0.02 % 60.30 883 0.01 % 22.23 0 0 % 0 145,704 0.15 % 458.28 263,033 0.16 % 484.66 341 0.03 % 86.20
države drža ve 441,261 0.13 % 388.88 43,421 0.08 % 231.68 12,978 0.11 % 302.27 1,697 0.02 % 42.73 0 0 % 0 160,359 0.17 % 504.37 219,400 0.14 % 404.26 3,406 0.29 % 861
slovenija slov enija 434,662 0.13 % 383.07 38,121 0.07 % 203.40 3,078 0.03 % 71.69 216 0 % 5.44 0 0 % 0 192,738 0.20 % 606.21 199,029 0.12 % 366.72 1,480 0.12 % 374.13
tolarjev tola rjev 412,138 0.12 % 363.22 40,402 0.08 % 215.57 398 0 % 9.27 93 0 % 2.34 0 0 % 0 1,394 0 % 4.38 368,701 0.23 % 679.36 1,150 0.10 % 290.71
mestu mest u 409,944 0.12 % 361.28 53,464 0.10 % 285.27 9,822 0.08 % 228.77 10,096 0.13 % 254.21 0 0 % 0 137,112 0.14 % 431.25 198,916 0.12 % 366.52 534 0.04 % 134.99
odstotka odst otka 408,111 0.12 % 359.67 15,227 0.03 % 81.25 1,027 0.01 % 23.92 48 0 % 1.21 0 0 % 0 209,030 0.21 % 657.45 182,717 0.11 % 336.67 62
0.01 % 15.67
letu letu 396,319 0.12 % 349.27 50,649 0.10 % 270.25 9,551 0.08 % 222.45 1,695 0.02 % 42.68 0 0 % 0 128,073 0.13 % 402.82 205,061 0.13 % 377.84 1,290 0.11 % 326.10
koncu konc u 386,487 0.12 % 340.61 60,717 0.12 % 323.97 12,930 0.10 % 301.16 11,331 0.14 % 285.30 20 0.70 % 2,060.37 118,425 0.12 % 372.48 182,053 0.11 % 335.44 1,011 0.09 % 255.57
leti leti 380,841 0.11 % 335.63 60,949 0.12 % 325.20 5,767 0.05 % 134.32 6,530 0.08 % 164.42 0 0 % 0 106,368 0.11 % 334.56 200,657 0.12 % 369.72 570 0.05 % 144.09
ljudje ljud je 380,144 0.11 % 335.02 75,678 0.15 % 403.79 20,838 0.17 % 485.34 17,127 0.22 % 431.24 0 0 % 0 92,650 0.10 % 291.41 172,775 0.11 % 318.35 1,076 0.09 % 272
primer prim er 373,094 0.11 % 328.81 74,757 0.14 % 398.88 31,034 0.25 % 722.82 7,528 0.10 % 189.55 8 0.28 % 824.15 86,453 0.09 % 271.92 171,615 0.10 % 316.21 1,699 0.14 % 429.49
sveta svet a 372,133 0.11 % 327.96 43,196 0.08 % 230.48 12,861 0.10 % 299.55 6,226 0.08 % 156.76 0 0 % 0 110,306 0.11 % 346.94 198,045 0.12 % 364.91 1,499 0.13 % 378.93
delu delu 372,018 0.11 % 327.86 53,176 0.10 % 283.73 18,125 0.15 % 422.15 4,223 0.05 % 106.33 0 0 % 0 112,003 0.12 % 352.28 183,117 0.11 % 337.41 1,374 0.12 % 347.33
način nači n 366,442 0.11 % 322.94 69,269 0.14 % 369.60 24,837 0.20 % 578.48 8,527 0.11 % 214.70 2 0.07 % 206.04 104,410 0.11 % 328.40 156,026 0.10 % 287.49 3,371 0.28 % 852.15
podjetja podj etja 357,777 0.11 % 315.31 54,217 0.10 % 289.28 8,399 0.07 % 195.62 870 0.01 % 21.91 0 0 % 0 100,211 0.10 % 315.19 193,160 0.12 % 355.91 920 0.08 % 232.57

1.2.35 List of initial character-level 5-grams from noun lower-case word forms in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-word_parts-nouns-lowercase_forms-initial-5grams-taxonomy-entire.tsv

Columns: Lower-case word form | Initial part of the word | Rest of the word | Total absolute frequency of lower-case word form | Percentage of all found lower-case word forms | Total relative frequency (per million occurrences) | Absolute frequency [SSJ.T.K.N] | Percentage [SSJ.T.K.N] | Relative frequency [SSJ.T.K.N] | Absolute frequency [SSJ.T.K.L] | Percentage [SSJ.T.K.L] | Relative frequency [SSJ.T.K.L] | Absolute frequency [SSJ.T.D] | Percentage [SSJ.T.D] | Relative frequency [SSJ.T.D] | Absolute frequency [SSJ.T.P.C] | Percentage [SSJ.T.P.C] | Relative frequency [SSJ.T.P.C] | Absolute frequency [SSJ.T.P.R] | Percentage [SSJ.T.P.R] | Relative frequency [SSJ.T.P.R] | Absolute frequency [SSJ.I] | Percentage [SSJ.I] | Relative frequency [SSJ.I] | Absolute frequency [SSJ.T.K.S] | Percentage [SSJ.T.K.S] | Relative frequency [SSJ.T.K.S]

evrov evrov 773,102 0.26 % 681.33 0 0 % 0 446 0.01 % 11.23 6 0 % 1.52 323,022 0.22 % 595.19 31,324 0.07 % 167.13 417,524 0.47 % 1,313.22 780 0.01 % 18.17
strani stran i 656,360 0.22 % 578.45 5 0.22 % 515.09 19,223 0.29 % 484.01 1,737 0.16 % 439.09 274,744 0.19 % 506.23 128,171 0.28 % 683.88 199,449 0.23 % 627.32 33,031 0.30 % 769.33
ljudi ljudi 634,249 0.21 % 558.96 0 0 % 0 16,405 0.25 % 413.06 1,327 0.12 % 335.45 284,778 0.19 % 524.72 96,577 0.21 % 515.30 210,518 0.24 % 662.13 24,644 0.22 % 573.99
ljubljana ljubl jana 629,709 0.21 % 554.96 0 0 % 0 517 0.01 % 13.02 355 0.03 % 89.74 313,379 0.21 % 577.42 31,935 0.07 % 170.39 273,474 0.31 % 860.15 10,049 0.09 % 234.05
mesto mesto 628,282 0.21 % 553.70 0 0 % 0 10,635 0.16 % 267.78 729 0.07 % 184.28 351,818 0.24 % 648.25 69,292 0.15 % 369.72 181,789 0.21 % 571.77 14,019 0.13 % 326.52
letih letih 593,892 0.20 % 523.39 0 0 % 0 7,036 0.11 % 177.16 1,222 0.11 % 308.91 295,576 0.20 % 544.62 88,203 0.19 % 470.62 186,187 0.21 % 585.61 15,668 0.14 % 364.93
slovenije slove nije 579,657 0.19 % 510.85 0 0 % 0 465 0.01 % 11.71 6,843 0.62 % 1,729.84 310,741 0.21 % 572.56 56,763 0.12 % 302.87 198,780 0.23 % 625.21 6,065 0.06 % 141.26
sloveniji slove niji 567,197 0.19 % 499.87 0 0 % 0 478 0.01 % 12.04 2,037 0.19 % 514.93 292,345 0.20 % 538.66 78,143 0.17 % 416.95 187,227 0.21 % 588.88 6,967 0.06 % 162.27
odstotkov odsto tkov 559,799 0.19 % 493.35 0 0 % 0 578 0.01 % 14.55 299 0.03 % 75.58 311,673 0.21 % 574.28 53,835 0.12 % 287.25 188,394 0.21 % 592.55 5,020 0.05 % 116.92
predsednik preds ednik 545,455 0.18 % 480.71 0 0 % 0 1,964 0.03 % 49.45 1,093 0.10 % 276.30 309,379 0.21 % 570.05 37,608 0.08 % 200.66 192,584 0.22 % 605.73 2,827 0.03 % 65.84
svetu svetu 452,594 0.15 % 398.87 0 0 % 0 9,995 0.15 % 251.66 998 0.09 % 252.28 213,404 0.14 % 393.21 76,162 0.17 % 406.38 137,347 0.16 % 431.99 14,688 0.13 % 342.10
milijonov milij onov 451,741 0.15 % 398.12 0 0 % 0 883 0.01 % 22.23 341 0.03 % 86.20 263,033 0.18 % 484.66 39,191 0.09 % 209.11 145,704 0.17 % 458.28 2,589 0.02 % 60.30
države držav e 441,261 0.15 % 388.88 0 0 % 0 1,697 0.03 % 42.73 3,406 0.31 % 861 219,400 0.15 % 404.26 43,421 0.10 % 231.68 160,359 0.18 % 504.37 12,978 0.12 % 302.27
slovenija slove nija 434,662 0.14 % 383.07 0 0 % 0 216 0 % 5.44 1,480 0.14 % 374.13 199,029 0.14 % 366.72 38,121 0.08 % 203.40 192,738 0.22 % 606.21 3,078 0.03 % 71.69
tolarjev tolar jev 412,138 0.14 % 363.22 0 0 % 0 93 0 % 2.34 1,150 0.10 % 290.71 368,701 0.25 % 679.36 40,402 0.09 % 215.57 1,394 0 % 4.38 398 0 % 9.27
mestu mestu 409,944 0.14 % 361.28 0 0 % 0 10,096 0.15 % 254.21 534 0.05 % 134.99 198,916 0.14 % 366.52 53,464 0.12 % 285.27 137,112 0.16 % 431.25 9,822 0.09 % 228.77
odstotka odsto tka 408,111 0.14 % 359.67 0 0 % 0 48 0 % 1.21 62 0.01 % 15.67 182,717 0.12 % 336.67 15,227 0.03 % 81.25 209,030 0.24 % 657.45 1,027 0.01 % 23.92
koncu koncu 386,487 0.13 % 340.61 20 0.86 % 2,060.37 11,331 0.17 % 285.30 1,011 0.09 % 255.57 182,053 0.12 % 335.44 60,717 0.13 % 323.97 118,425 0.14 % 372.48 12,930 0.12 % 301.16
ljudje ljudj e 380,144 0.13 % 335.02 0 0 % 0 17,127 0.26 % 431.24 1,076 0.10 % 272 172,775 0.12 % 318.35 75,678 0.17 % 403.79 92,650 0.10 % 291.41 20,838 0.19 % 485.34
primer prime r 373,094 0.12 % 328.81 8 0.34 % 824.15 7,528 0.11 % 189.55 1,699 0.15 % 429.49 171,615 0.12 % 316.21 74,757 0.16 % 398.88 86,453 0.10 % 271.92 31,034 0.28 % 722.82
sveta sveta 372,133 0.12 % 327.96 0 0 % 0 6,226 0.09 % 156.76 1,499 0.14 % 378.93 198,045 0.14 % 364.91 43,196 0.10 % 230.48 110,306 0.12 % 346.94 12,861 0.12 % 299.55
način način 366,442 0.12 % 322.94 2 0.09 % 206.04 8,527 0.13 % 214.70 3,371 0.31 % 852.15 156,026 0.11 % 287.49 69,269 0.15 % 369.60 104,410 0.12 % 328.40 24,837 0.23 % 578.48
podjetja podje tja 357,777 0.12 % 315.31 0 0 % 0 870 0.01 % 21.91 920 0.08 % 232.57 193,160 0.13 % 355.91 54,217 0.12 % 289.28 100,211 0.11 % 315.19 8,399 0.08 % 195.62
življenje življ enje 336,327 0.11 % 296.40 0 0 % 0 18,645 0.28 % 469.46 1,095 0.10 % 276.80 143,565 0.10 % 264.53 73,976 0.16 % 394.71 77,091 0.09 % 242.47 21,955 0.20 % 511.36
primeru prime ru 329,763 0.11 % 290.62 0 0 % 0 4,501 0.07 % 113.33 2,246 0.20 % 567.76 144,090 0.10 % 265.50 49,248 0.11 % 262.77 114,733 0.13 % 360.87 14,945 0.14 % 348.09
vprašanje vpraš anje 329,493 0.11 % 290.38 0 0 % 0 6,144 0.09 % 154.70 1,445 0.13 % 365.28 173,591 0.12 % 319.85 44,565 0.10 % 237.78 92,549 0.10 % 291.09 11,199 0.10 % 260.84
vlada vlada 312,080 0.10 % 275.03 0 0 % 0 928 0.01 % 23.37 2,791 0.26 % 705.53 162,534 0.11 % 299.48 22,227 0.05 % 118.60 120,955 0.14 % 380.43 2,645 0.02 % 61.61
ljubljani ljubl jani 298,833 0.10 % 263.36 0 0 % 0 1,086 0.02 % 27.34 371 0.03 % 93.78 164,428 0.11 % 302.97 38,080 0.08 % 203.18 90,498 0.10 % 284.64 4,370 0.04 % 101.78
začetku začet ku 297,915 0.10 % 262.55 1 0.04 % 103.02 3,617 0.06 % 91.07 694 0.06 % 175.44 147,717 0.10 % 272.18 45,622 0.10 % 243.42 90,370 0.10 % 284.24 9,894 0.09 % 230.44
mesta mesta 295,647 0.10 % 260.55 0 0 % 0 6,951 0.10 % 175.02 630 0.06 % 159.26 152,791 0.10 % 281.53 36,280 0.08 % 193.58 89,032 0.10 % 280.03 9,963 0.09 % 232.05
otrok otrok 287,157 0.10 % 253.07 0 0 % 0 10,771 0.16 % 271.20 602 0.06 % 152.18 129,206 0.09 % 238.07 51,492 0.11 % 274.74 77,643 0.09 % 244.21 17,443 0.16 % 406.27
vlade vlade 287,061 0.10 % 252.99 0 0 % 0 543 0.01 % 13.67 1,602 0.15 % 404.97 150,128
0.10 % 276.62 21,656 0.05 % 115.55 110,720 0.13 % 348.24 2,412 0.02 % 56.18

1.2.36 List of final character-level 1-grams from noun lower-case word forms in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-word_parts-nouns-lowercase_forms-final-1grams-taxonomy-entire.tsv

Columns: Lower-case word form | Rest of the word | Final part of the word | Total absolute frequency of lower-case word form | Percentage of all found lower-case word forms | Total relative frequency (per million occurrences) | Absolute frequency [SSJ.T.K.N] | Percentage [SSJ.T.K.N] | Relative frequency [SSJ.T.K.N] | Absolute frequency [SSJ.T.K.L] | Percentage [SSJ.T.K.L] | Relative frequency [SSJ.T.K.L] | Absolute frequency [SSJ.T.D] | Percentage [SSJ.T.D] | Relative frequency [SSJ.T.D] | Absolute frequency [SSJ.T.P.C] | Percentage [SSJ.T.P.C] | Relative frequency [SSJ.T.P.C] | Absolute frequency [SSJ.T.P.R] | Percentage [SSJ.T.P.R] | Relative frequency [SSJ.T.P.R] | Absolute frequency [SSJ.I] | Percentage [SSJ.I] | Relative frequency [SSJ.I] | Absolute frequency [SSJ.T.K.S] | Percentage [SSJ.T.K.S] | Relative frequency [SSJ.T.K.S]

leta let a 1,848,166 0.52 % 1,628.78 0 0 % 0 16,515 0.20 % 415.83 4,785 0.39 % 1,209.60 879,792 0.51 % 1,621.07 269,635 0.49 % 1,438.68 611,187 0.59 % 1,922.34 66,252 0.51 % 1,543.09
let le t 873,231 0.25 % 769.57 0 0 % 0 18,975 0.22 % 477.77 2,384 0.19 % 602.65 449,263 0.26 % 827.80 144,472 0.26 % 770.86 237,663 0.23 % 747.51 20,474 0.16 % 476.86
evrov evro v 773,102 0.22 % 681.33 0 0 % 0 446 0.01 % 11.23 6 0 % 1.52 323,022 0.19 % 595.19 31,324 0.06 % 167.13 417,524 0.41 % 1,313.22 780 0.01 % 18.17
leto let o 721,611 0.20 % 635.95 0 0 % 0 6,698 0.08 % 168.65 1,461 0.12 % 369.33 394,765 0.23 % 727.38 98,875 0.18 % 527.56 209,509 0.20 % 658.96 10,303 0.08 % 239.97
del de l 709,817 0.20 % 625.56 4 0.13 % 412.07 9,763 0.12 % 245.82 2,670 0.22 % 674.95 356,268 0.21 % 656.45 112,840 0.21 % 602.08 196,278 0.19 % 617.35 31,994 0.25 % 745.18
dan da n 699,476 0.20 % 616.44 0 0 % 0 28,703 0.34 % 722.71 1,846 0.15 % 466.65 335,315 0.20 % 617.84 125,516 0.23 % 669.71 190,718 0.18 % 599.86 17,378 0.13 % 404.75
dela del a 657,992 0.19 % 579.89 1 0.03 % 103.02 8,364 0.10 % 210.60 2,657 0.21 % 671.66 337,659 0.20 % 622.16 87,263 0.16 % 465.61 192,362 0.19 % 605.03 29,686 0.23 % 691.42
strani stran i 656,360 0.19 % 578.45 5 0.16 % 515.09 19,223 0.23 % 484.01 1,737 0.14 % 439.09 274,744 0.16 % 506.23 128,171 0.23 % 683.88 199,449 0.19 % 627.32 33,031 0.26 % 769.33
ljudi ljud i 634,249 0.18 % 558.96 0 0 % 0 16,405 0.19 % 413.06 1,327 0.11 % 335.45 284,778 0.17 % 524.72 96,577 0.18 % 515.30 210,518 0.20 % 662.13 24,644 0.19 % 573.99
ljubljana ljubljan a 629,709 0.18 % 554.96 0 0 % 0 517 0.01 % 13.02 355 0.03 % 89.74 313,379 0.18 % 577.42 31,935 0.06 % 170.39 273,474 0.27 % 860.15 10,049 0.08 % 234.05
mesto mest o 628,282 0.18 % 553.70 0 0 % 0 10,635 0.12 % 267.78 729 0.06 % 184.28 351,818 0.20 % 648.25 69,292 0.13 % 369.72 181,789 0.18 % 571.77 14,019 0.11 % 326.52
letih leti h 593,892 0.17 % 523.39 0 0 % 0 7,036 0.08 % 177.16 1,222 0.10 % 308.91 295,576 0.17 % 544.62 88,203 0.16 % 470.62 186,187 0.18 % 585.61 15,668 0.12 % 364.93
delo del o 592,066 0.17 % 521.78 0 0 % 0 9,140 0.11 % 230.14 2,089 0.17 % 528.08 304,906 0.18 % 561.81 87,155 0.16 % 465.03 162,646 0.16 % 511.56 26,130 0.20 % 608.60
čas ča s 584,983 0.17 % 515.54 2 0.07 % 206.04 25,547 0.30 % 643.25 2,121 0.17 % 536.17 273,358 0.16 % 503.68 115,093 0.21 % 614.10 147,089 0.14 % 462.63 21,773 0.17 % 507.12
slovenije slovenij e 579,657 0.16 % 510.85 0 0 % 0 465 0.01 % 11.71 6,843 0.55 % 1,729.84 310,741 0.18 % 572.56 56,763 0.10 % 302.87 198,780 0.19 % 625.21 6,065 0.05 % 141.26
sloveniji slovenij i 567,197 0.16 % 499.87 0 0 % 0 478 0.01 % 12.04 2,037 0.17 % 514.93 292,345 0.17 % 538.66 78,143 0.14 % 416.95 187,227 0.18 % 588.88 6,967 0.05 % 162.27
odstotkov odstotko v 559,799 0.16 % 493.35 0 0 % 0 578 0.01 % 14.55 299 0.02 % 75.58 311,673 0.18 % 574.28 53,835 0.10 % 287.25 188,394 0.18 % 592.55 5,020 0.04 % 116.92
predsednik predsedni k 545,455 0.15 % 480.71 0 0 % 0 1,964 0.02 % 49.45 1,093 0.09 % 276.30 309,379 0.18 % 570.05 37,608 0.07 % 200.66 192,584 0.19 % 605.73 2,827 0.02 % 65.84
času čas u 515,535 0.15 % 454.34 1 0.03 % 103.02 7,848 0.09 % 197.60 1,614 0.13 % 408 247,802 0.14 % 456.59 82,263 0.15 % 438.93 157,737 0.15 % 496.12 18,270 0.14 % 425.53
časa čas a 504,880 0.14 % 444.95 12 0.39 % 1,236.22 24,957 0.29 % 628.39 1,300 0.10 % 328.63 228,687 0.13 % 421.37 104,840 0.19 % 559.39 124,379 0.12 % 391.20 20,705 0.16 % 482.24
dni dn i 492,162 0.14 % 433.74 2 0.07 % 206.04 13,720 0.16 % 345.45 1,525 0.12 % 385.50 256,647 0.15 % 472.89 74,007 0.14 % 394.88 137,827 0.13 % 433.50 8,434 0.07 % 196.44
svetu svet u 452,594 0.13 % 398.87 0 0 % 0 9,995 0.12 % 251.66 998 0.08 % 252.28 213,404 0.12 % 393.21 76,162 0.14 % 406.38 137,347 0.13 % 431.99 14,688 0.11 % 342.10
milijonov milijono v 451,741 0.13 % 398.12 0 0 % 0 883 0.01 % 22.23 341 0.03 % 86.20 263,033 0.15 % 484.66 39,191 0.07 % 209.11 145,704 0.14 % 458.28 2,589 0.02 % 60.30
države držav e 441,261 0.12 % 388.88 0 0 % 0 1,697 0.02 % 42.73 3,406 0.28 % 861 219,400 0.13 % 404.26 43,421 0.08 % 231.68 160,359 0.16 % 504.37 12,978 0.10 % 302.27
slovenija slovenij a 434,662 0.12 % 383.07 0 0 % 0 216 0 % 5.44 1,480 0.12 % 374.13 199,029 0.12 % 366.72 38,121 0.07 % 203.40 192,738 0.19 % 606.21 3,078 0.02 % 71.69
tolarjev tolarje v 412,138 0.12 % 363.22 0 0 % 0 93 0 % 2.34 1,150 0.09 % 290.71 368,701 0.21 % 679.36 40,402 0.07 % 215.57 1,394 0 % 4.38 398 0 % 9.27
mestu mest u 409,944 0.12 % 361.28 0 0 % 0 10,096 0.12 % 254.21 534 0.04 % 134.99 198,916 0.12 % 366.52 53,464 0.10 % 285.27 137,112 0.13 % 431.25 9,822 0.08 % 228.77
odstotka odstotk a 408,111 0.12 % 359.67 0 0 % 0 48 0 % 1.21 62 0.01 % 15.67 182,717 0.11 % 336.67 15,227 0.03 % 81.25 209,030 0.20 % 657.45 1,027 0.01 % 23.92
letu let u 396,319 0.11 % 349.27 0 0 % 0 1,695 0.02 % 42.68 1,290 0.10
% 326.10 205,061 0.12 % 377.84 50,649 0.09 % 270.25 128,073 0.12 % 402.82 9,551 0.07 % 222.45
koncu konc u 386,487 0.11 % 340.61 20 0.65 % 2,060.37 11,331 0.13 % 285.30 1,011 0.08 % 255.57 182,053 0.11 % 335.44 60,717 0.11 % 323.97 118,425 0.12 % 372.48 12,930 0.10 % 301.16
leti let i 380,841 0.11 % 335.63 0 0 % 0 6,530 0.08 % 164.42 570 0.05 % 144.09 200,657 0.12 % 369.72 60,949 0.11 % 325.20 106,368 0.10 % 334.56 5,767 0.04 % 134.32
ljudje ljudj e 380,144 0.11 % 335.02 0 0 % 0 17,127 0.20 % 431.24 1,076 0.09 % 272 172,775 0.10 % 318.35 75,678 0.14 % 403.79 92,650 0.09 % 291.41 20,838 0.16 % 485.34

1.2.37 List of final character-level 2-grams from noun lower-case word forms in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-word_parts-nouns-lowercase_forms-final-2grams-taxonomy-entire.tsv

Columns: Lower-case word form | Rest of the word | Final part of the word | Total absolute frequency of lower-case word form | Percentage of all found lower-case word forms | Total relative frequency (per million occurrences) | Absolute frequency [SSJ.T.K.N] | Percentage [SSJ.T.K.N] | Relative frequency [SSJ.T.K.N] | Absolute frequency [SSJ.T.K.L] | Percentage [SSJ.T.K.L] | Relative frequency [SSJ.T.K.L] | Absolute frequency [SSJ.T.D] | Percentage [SSJ.T.D] | Relative frequency [SSJ.T.D] | Absolute frequency [SSJ.T.P.C] | Percentage [SSJ.T.P.C] | Relative frequency [SSJ.T.P.C] | Absolute frequency [SSJ.T.P.R] | Percentage [SSJ.T.P.R] | Relative frequency [SSJ.T.P.R] | Absolute frequency [SSJ.I] | Percentage [SSJ.I] | Relative frequency [SSJ.I] | Absolute frequency [SSJ.T.K.S] | Percentage [SSJ.T.K.S] | Relative frequency [SSJ.T.K.S]

leta le ta 1,848,166 0.53 % 1,628.78 0 0 % 0 16,515 0.20 % 415.83 4,785 0.39 % 1,209.60 879,792 0.51 % 1,621.07 269,635 0.49 % 1,438.68 611,187 0.59 % 1,922.34 66,252 0.52 % 1,543.09
let l et 873,231 0.25 % 769.57 0 0 % 0 18,975 0.22 % 477.77 2,384 0.19 % 602.65 449,263 0.26 % 827.80 144,472 0.27 % 770.86 237,663 0.23 % 747.51 20,474 0.16 % 476.86
evrov evr ov 773,102 0.22 % 681.33 0 0 % 0 446 0.01 % 11.23 6 0 % 1.52 323,022 0.19 % 595.19 31,324 0.06 % 167.13 417,524 0.41 % 1,313.22 780 0.01 % 18.17
leto le to 721,611 0.20 % 635.95 0 0 % 0 6,698 0.08 % 168.65 1,461 0.12 % 369.33 394,765 0.23 % 727.38 98,875 0.18 % 527.56 209,509 0.20 % 658.96 10,303 0.08 % 239.97
del d el 709,817 0.20 % 625.56 4 0.13 % 412.07 9,763 0.12 % 245.82 2,670 0.22 % 674.95 356,268 0.21 % 656.45 112,840 0.21 % 602.08 196,278 0.19 % 617.35 31,994 0.25 % 745.18
dan d an 699,476 0.20 % 616.44 0 0 % 0 28,703 0.34 % 722.71 1,846 0.15 % 466.65 335,315 0.20 % 617.84 125,516 0.23 % 669.71 190,718 0.19 % 599.86 17,378 0.14 % 404.75
dela de la 657,992 0.19 % 579.89 1 0.03 % 103.02 8,364 0.10 % 210.60 2,657 0.22 % 671.66 337,659 0.20 % 622.16 87,263 0.16 % 465.61 192,362 0.19 % 605.03 29,686 0.23 % 691.42
strani stra ni 656,360 0.19 % 578.45 5 0.17 % 515.09 19,223 0.23 % 484.01 1,737 0.14 % 439.09 274,744 0.16 % 506.23 128,171 0.23 % 683.88 199,449 0.19 % 627.32 33,031 0.26 % 769.33
ljudi lju di 634,249 0.18 % 558.96 0 0 % 0 16,405 0.19 % 413.06 1,327 0.11 % 335.45 284,778 0.17 % 524.72 96,577 0.18 % 515.30 210,518 0.20 % 662.13 24,644 0.19 % 573.99
ljubljana ljublja na 629,709 0.18 % 554.96 0 0 % 0 517 0.01 % 13.02 355 0.03 % 89.74 313,379 0.18 % 577.42 31,935 0.06 % 170.39 273,474 0.27 % 860.15 10,049 0.08 % 234.05
mesto mes to 628,282 0.18 % 553.70 0 0 % 0 10,635 0.13 % 267.78 729 0.06 % 184.28 351,818 0.20 % 648.25 69,292 0.13 % 369.72 181,789 0.18 % 571.77 14,019 0.11 % 326.52
letih let ih 593,892 0.17 % 523.39 0 0 % 0 7,036 0.08 % 177.16 1,222 0.10 % 308.91 295,576 0.17 % 544.62 88,203 0.16 % 470.62 186,187 0.18 % 585.61 15,668 0.12 % 364.93
delo de lo 592,066 0.17 % 521.78 0 0 % 0 9,140 0.11 % 230.14 2,089 0.17 % 528.08 304,906 0.18 % 561.81 87,155 0.16 % 465.03 162,646 0.16 % 511.56 26,130 0.20 % 608.60
čas č as 584,983 0.17 % 515.54 2 0.07 % 206.04 25,547 0.30 % 643.25 2,121 0.17 % 536.17 273,358 0.16 % 503.68 115,093 0.21 % 614.10 147,089 0.14 % 462.63 21,773 0.17 % 507.12
slovenije sloveni je 579,657 0.17 % 510.85 0 0 % 0 465 0.01 % 11.71 6,843 0.56 % 1,729.84 310,741 0.18 % 572.56 56,763 0.10 % 302.87 198,780 0.19 % 625.21 6,065 0.05 % 141.26
sloveniji sloveni ji 567,197 0.16 % 499.87 0 0 % 0 478 0.01 % 12.04 2,037 0.17 % 514.93 292,345 0.17 % 538.66 78,143 0.14 % 416.95 187,227 0.18 % 588.88 6,967 0.05 % 162.27
odstotkov odstotk ov 559,799 0.16 % 493.35 0 0 % 0 578 0.01 % 14.55 299 0.02 % 75.58 311,673 0.18 % 574.28 53,835 0.10 % 287.25 188,394 0.18 % 592.55 5,020 0.04 % 116.92
predsednik predsedn ik 545,455 0.15 % 480.71 0 0 % 0 1,964 0.02 % 49.45 1,093 0.09 % 276.30 309,379 0.18 % 570.05 37,608 0.07 % 200.66 192,584 0.19 % 605.73 2,827 0.02 % 65.84
času ča su 515,535 0.15 % 454.34 1 0.03 % 103.02 7,848 0.09 % 197.60 1,614 0.13 % 408 247,802 0.14 % 456.59 82,263 0.15 % 438.93 157,737 0.15 % 496.12 18,270 0.14 % 425.53
časa ča sa 504,880 0.14 % 444.95 12 0.40 % 1,236.22 24,957 0.29 % 628.39 1,300 0.11 % 328.63 228,687 0.13 % 421.37 104,840 0.19 % 559.39 124,379 0.12 % 391.20 20,705 0.16 % 482.24
dni d ni 492,162 0.14 % 433.74 2 0.07 % 206.04 13,720 0.16 % 345.45 1,525 0.12 % 385.50 256,647 0.15 % 472.89 74,007 0.14 % 394.88 137,827 0.13 % 433.50 8,434 0.07 % 196.44
svetu sve tu 452,594 0.13 % 398.87 0 0 % 0 9,995 0.12 % 251.66 998 0.08 % 252.28 213,404 0.12 % 393.21 76,162 0.14 % 406.38 137,347 0.13 % 431.99 14,688 0.11 % 342.10
milijonov milijon ov 451,741 0.13 % 398.12 0 0 % 0 883 0.01 % 22.23 341 0.03 % 86.20 263,033 0.15 % 484.66 39,191 0.07 % 209.11 145,704 0.14 % 458.28 2,589 0.02 % 60.30
države drža ve 441,261 0.13 % 388.88 0 0 % 0 1,697 0.02 % 42.73 3,406 0.28 % 861 219,400 0.13 % 404.26 43,421 0.08 % 231.68 160,359 0.16 % 504.37 12,978 0.10 % 302.27
slovenija sloveni ja 434,662 0.12 % 383.07 0 0 % 0 216 0 % 5.44 1,480 0.12 % 374.13 199,029 0.12 % 366.72 38,121 0.07 % 203.40 192,738 0.19 % 606.21 3,078 0.02 % 71.69
tolarjev tolarj
ev 412,138 0.12 % 363.22 0 0 % 0 93 0 % 2.34 1,150 0.09 % 290.71 368,701 0.21 % 679.36 40,402 0.07 % 215.57 1,394 0 % 4.38 398 0 % 9.27
mestu mes tu 409,944 0.12 % 361.28 0 0 % 0 10,096 0.12 % 254.21 534 0.04 % 134.99 198,916 0.12 % 366.52 53,464 0.10 % 285.27 137,112 0.13 % 431.25 9,822 0.08 % 228.77
odstotka odstot ka 408,111 0.12 % 359.67 0 0 % 0 48 0 % 1.21 62 0.01 % 15.67 182,717 0.11 % 336.67 15,227 0.03 % 81.25 209,030 0.20 % 657.45 1,027 0.01 % 23.92
letu le tu 396,319 0.11 % 349.27 0 0 % 0 1,695 0.02 % 42.68 1,290 0.10 % 326.10 205,061 0.12 % 377.84 50,649 0.09 % 270.25 128,073 0.12 % 402.82 9,551 0.07 % 222.45
koncu kon cu 386,487 0.11 % 340.61 20 0.67 % 2,060.37 11,331 0.13 % 285.30 1,011 0.08 % 255.57 182,053 0.11 % 335.44 60,717 0.11 % 323.97 118,425 0.12 % 372.48 12,930 0.10 % 301.16
leti le ti 380,841 0.11 % 335.63 0 0 % 0 6,530 0.08 % 164.42 570 0.05 % 144.09 200,657 0.12 % 369.72 60,949 0.11 % 325.20 106,368 0.10 % 334.56 5,767 0.04 % 134.32
ljudje ljud je 380,144 0.11 % 335.02 0 0 % 0 17,127 0.20 % 431.24 1,076 0.09 % 272 172,775 0.10 % 318.35 75,678 0.14 % 403.79 92,650 0.09 % 291.41 20,838 0.16 % 485.34

1.2.38 List of final character-level 3-grams from noun lower-case word forms in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-word_parts-nouns-lowercase_forms-final-3grams-taxonomy-entire.tsv

Columns: Lower-case word form | Rest of the word | Final part of the word | Total absolute frequency of lower-case word form | Percentage of all found lower-case word forms | Total relative frequency (per million occurrences) | Absolute frequency [SSJ.T.K.N] | Percentage [SSJ.T.K.N] | Relative frequency [SSJ.T.K.N] | Absolute frequency [SSJ.T.K.L] | Percentage [SSJ.T.K.L] | Relative frequency [SSJ.T.K.L] | Absolute frequency [SSJ.T.D] | Percentage [SSJ.T.D] | Relative frequency [SSJ.T.D] | Absolute frequency [SSJ.T.P.C] | Percentage [SSJ.T.P.C] | Relative frequency [SSJ.T.P.C] | Absolute frequency [SSJ.T.P.R] | Percentage [SSJ.T.P.R] | Relative frequency [SSJ.T.P.R] | Absolute frequency [SSJ.I] | Percentage [SSJ.I] | Relative frequency [SSJ.I] | Absolute frequency [SSJ.T.K.S] | Percentage [SSJ.T.K.S] | Relative frequency [SSJ.T.K.S]

leta l eta 1,848,166 0.53 % 1,628.78 0 0 % 0 16,515 0.20 % 415.83 4,785 0.39 % 1,209.60 879,792 0.52 % 1,621.07 269,635 0.50 % 1,438.68 611,187 0.60 % 1,922.34 66,252 0.52 % 1,543.09
let let 873,231 0.25 % 769.57 0 0 % 0 18,975 0.23 % 477.77 2,384 0.20 % 602.65 449,263 0.26 % 827.80 144,472 0.27 % 770.86 237,663 0.23 % 747.51 20,474 0.16 % 476.86
evrov ev rov 773,102 0.22 % 681.33 0 0 % 0 446 0.01 % 11.23 6 0 % 1.52 323,022 0.19 % 595.19 31,324 0.06 % 167.13 417,524 0.41 % 1,313.22 780 0.01 % 18.17
leto l eto 721,611 0.21 % 635.95 0 0 % 0 6,698 0.08 % 168.65 1,461 0.12 % 369.33 394,765 0.23 % 727.38 98,875 0.18 % 527.56 209,509 0.21 % 658.96 10,303 0.08 % 239.97
del del 709,817 0.20 % 625.56 4 0.14 % 412.07 9,763 0.12 % 245.82 2,670 0.22 % 674.95 356,268 0.21 % 656.45 112,840 0.21 % 602.08 196,278 0.19 % 617.35 31,994 0.25 % 745.18
dan dan 699,476 0.20 % 616.44 0 0 % 0 28,703 0.34 % 722.71 1,846 0.15 % 466.65 335,315 0.20 % 617.84 125,516 0.23 % 669.71 190,718 0.19 % 599.86 17,378 0.14 % 404.75
dela d ela 657,992 0.19 % 579.89 1 0.03 % 103.02 8,364 0.10 % 210.60 2,657 0.22 % 671.66 337,659 0.20 % 622.16 87,263 0.16 % 465.61 192,362 0.19 % 605.03 29,686 0.23 % 691.42
strani str ani 656,360 0.19 % 578.45 5 0.17 % 515.09 19,223 0.23 % 484.01 1,737 0.14 % 439.09 274,744 0.16 % 506.23 128,171 0.24 % 683.88 199,449 0.20 % 627.32 33,031 0.26 % 769.33
ljudi lj udi 634,249 0.18 % 558.96 0 0 % 0 16,405 0.19 % 413.06 1,327 0.11 % 335.45 284,778 0.17 % 524.72 96,577 0.18 % 515.30 210,518 0.21 % 662.13 24,644 0.19 % 573.99
ljubljana ljublj ana 629,709 0.18 % 554.96 0 0 % 0 517 0.01 % 13.02 355 0.03 % 89.74 313,379 0.18 % 577.42 31,935 0.06 % 170.39 273,474 0.27 % 860.15 10,049 0.08 % 234.05
mesto me sto 628,282 0.18 % 553.70 0 0 % 0 10,635 0.13 % 267.78 729 0.06 % 184.28 351,818 0.21 % 648.25 69,292 0.13 % 369.72 181,789 0.18 % 571.77 14,019 0.11 % 326.52
letih le tih 593,892 0.17 % 523.39 0 0 % 0 7,036 0.08 % 177.16 1,222 0.10 % 308.91 295,576 0.17 % 544.62 88,203 0.16 % 470.62 186,187 0.18 % 585.61 15,668 0.12 % 364.93
delo d elo 592,066 0.17 % 521.78 0 0 % 0 9,140 0.11 % 230.14 2,089 0.17 % 528.08 304,906 0.18 % 561.81 87,155 0.16 % 465.03 162,646 0.16 % 511.56 26,130 0.20 % 608.60
čas čas 584,983 0.17 % 515.54 2 0.07 % 206.04 25,547 0.30 % 643.25 2,121 0.17 % 536.17 273,358 0.16 % 503.68 115,093 0.21 % 614.10 147,089 0.14 % 462.63 21,773 0.17 % 507.12
slovenije sloven ije 579,657 0.17 % 510.85 0 0 % 0 465 0.01 % 11.71 6,843 0.56 % 1,729.84 310,741 0.18 % 572.56 56,763 0.10 % 302.87 198,780 0.20 % 625.21 6,065 0.05 % 141.26
sloveniji sloven iji 567,197 0.16 % 499.87 0 0 % 0 478 0.01 % 12.04 2,037 0.17 % 514.93 292,345 0.17 % 538.66 78,143 0.14 % 416.95 187,227 0.18 % 588.88 6,967 0.06 % 162.27
odstotkov odstot kov 559,799 0.16 % 493.35 0 0 % 0 578 0.01 % 14.55 299 0.02 % 75.58 311,673 0.18 % 574.28 53,835 0.10 % 287.25 188,394 0.18 % 592.55 5,020 0.04 % 116.92
predsednik predsed nik 545,455 0.16 % 480.71 0 0 % 0 1,964 0.02 % 49.45 1,093 0.09 % 276.30 309,379 0.18 % 570.05 37,608 0.07 % 200.66 192,584 0.19 % 605.73 2,827 0.02 % 65.84
času č asu 515,535 0.15 % 454.34 1 0.03 % 103.02 7,848 0.09 % 197.60 1,614 0.13 % 408 247,802 0.15 % 456.59 82,263 0.15 % 438.93 157,737 0.15 % 496.12 18,270 0.14 % 425.53
časa č asa 504,880 0.14 % 444.95 12 0.41 % 1,236.22 24,957 0.30 % 628.39 1,300 0.11 % 328.63 228,687 0.13 % 421.37 104,840 0.19 % 559.39 124,379 0.12 % 391.20 20,705 0.16 % 482.24
dni dni 492,162 0.14 % 433.74 2 0.07 % 206.04 13,720 0.16 % 345.45 1,525 0.12 % 385.50 256,647 0.15 % 472.89 74,007 0.14 % 394.88 137,827 0.14 % 433.50 8,434 0.07 % 196.44
svetu sv etu 452,594 0.13 % 398.87 0 0 % 0 9,995 0.12 % 251.66 998 0.08 % 252.28 213,404 0.12 % 393.21 76,162 0.14 % 406.38 137,347 0.14 % 431.99 14,688 0.12 %
342.10
milijonov milijo nov 451,741 0.13 % 398.12 0 0 % 0 883 0.01 % 22.23 341 0.03 % 86.20 263,033 0.15 % 484.66 39,191 0.07 % 209.11 145,704 0.14 % 458.28 2,589 0.02 % 60.30
države drž ave 441,261 0.13 % 388.88 0 0 % 0 1,697 0.02 % 42.73 3,406 0.28 % 861 219,400 0.13 % 404.26 43,421 0.08 % 231.68 160,359 0.16 % 504.37 12,978 0.10 % 302.27
slovenija sloven ija 434,662 0.12 % 383.07 0 0 % 0 216 0 % 5.44 1,480 0.12 % 374.13 199,029 0.12 % 366.72 38,121 0.07 % 203.40 192,738 0.19 % 606.21 3,078 0.02 % 71.69
tolarjev tolar jev 412,138 0.12 % 363.22 0 0 % 0 93 0 % 2.34 1,150 0.09 % 290.71 368,701 0.22 % 679.36 40,402 0.07 % 215.57 1,394 0 % 4.38 398 0 % 9.27
mestu me stu 409,944 0.12 % 361.28 0 0 % 0 10,096 0.12 % 254.21 534 0.04 % 134.99 198,916 0.12 % 366.52 53,464 0.10 % 285.27 137,112 0.14 % 431.25 9,822 0.08 % 228.77
odstotka odsto tka 408,111 0.12 % 359.67 0 0 % 0 48 0 % 1.21 62 0.01 % 15.67 182,717 0.11 % 336.67 15,227 0.03 % 81.25 209,030 0.20 % 657.45 1,027 0.01 % 23.92
letu l etu 396,319 0.11 % 349.27 0 0 % 0 1,695 0.02 % 42.68 1,290 0.10 % 326.10 205,061 0.12 % 377.84 50,649 0.09 % 270.25 128,073 0.13 % 402.82 9,551 0.07 % 222.45
koncu ko ncu 386,487 0.11 % 340.61 20 0.68 % 2,060.37 11,331 0.13 % 285.30 1,011 0.08 % 255.57 182,053 0.11 % 335.44 60,717 0.11 % 323.97 118,425 0.12 % 372.48 12,930 0.10 % 301.16
leti l eti 380,841 0.11 % 335.63 0 0 % 0 6,530 0.08 % 164.42 570 0.05 % 144.09 200,657 0.12 % 369.72 60,949 0.11 % 325.20 106,368 0.10 % 334.56 5,767 0.04 % 134.32
ljudje lju dje 380,144 0.11 % 335.02 0 0 % 0 17,127 0.20 % 431.24 1,076 0.09 % 272 172,775 0.10 % 318.35 75,678 0.14 % 403.79 92,650 0.09 % 291.41 20,838 0.16 % 485.34

1.2.39 List of final character-level 4-grams from noun lower-case word forms in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-word_parts-nouns-lowercase_forms-final-4grams-taxonomy-entire.tsv

Lower-case word form Rest
of the word Final part of the word Total absolute frequency of lower-case word form Percentage of all found lower-case word forms Total relative frequency (per million occurrences) Absolute frequency [SSJ.T.K.N] Percentage [SSJ.T.K.N] Relative frequency [SSJ.T.K.N] Absolute frequency [SSJ.T.K.L] Percentage [SSJ.T.K.L] Relative frequency [SSJ.T.K.L] Absolute frequency [SSJ.T.D] Percentage [SSJ.T.D] Relative frequency [SSJ.T.D] Absolute frequency [SSJ.T.P.C] Percentage [SSJ.T.P.C] Relative frequency [SSJ.T.P.C] Absolute frequency [SSJ.T.P.R] Percentage [SSJ.T.P.R] Relative frequency [SSJ.T.P.R] Absolute frequency [SSJ.I] Percentage [SSJ.I] Relative frequency [SSJ.I] Absolute frequency [SSJ.T.K.S] Percentage [SSJ.T.K.S] Relative frequency [SSJ.T.K.S] leta leta 1,848,166 0.56 % 1,628.78 0 0 % 0 16,515 0.21 % 415.83 4,785 0.40 % 1,209.60 879,792 0.54 % 1,621.07 269,635 0.52 % 1,438.68 611,187 0.63 % 1,922.34 66,252 0.54 % 1,543.09 evrov e vrov 773,102 0.23 % 681.33 0 0 % 0 446 0.01 % 11.23 6 0 % 1.52 323,022 0.20 % 595.19 31,324 0.06 % 167.13 417,524 0.43 % 1,313.22 780 0.01 % 18.17 leto leto 721,611 0.22 % 635.95 0 0 % 0 6,698 0.09 % 168.65 1,461 0.12 % 369.33 394,765 0.24 % 727.38 98,875 0.19 % 527.56 209,509 0.21 % 658.96 10,303 0.08 % 239.97 dela dela 657,992 0.20 % 579.89 1 0.04 % 103.02 8,364 0.11 % 210.60 2,657 0.22 % 671.66 337,659 0.21 % 622.16 87,263 0.17 % 465.61 192,362 0.20 % 605.03 29,686 0.24 % 691.42 strani st rani 656,360 0.20 % 578.45 5 0.18 % 515.09 19,223 0.24 % 484.01 1,737 0.15 % 439.09 274,744 0.17 % 506.23 128,171 0.25 % 683.88 199,449 0.20 % 627.32 33,031 0.27 % 769.33 ljudi l judi 634,249 0.19 % 558.96 0 0 % 0 16,405 0.21 % 413.06 1,327 0.11 % 335.45 284,778 0.17 % 524.72 96,577 0.19 % 515.30 210,518 0.22 % 662.13 24,644 0.20 % 573.99 ljubljana ljubl jana 629,709 0.19 % 554.96 0 0 % 0 517 0.01 % 13.02 355 0.03 % 89.74 313,379 0.19 % 577.42 31,935 0.06 % 170.39 273,474 0.28 % 860.15 10,049 0.08 % 234.05 mesto m esto 628,282 0.19 % 553.70 0 0 % 0 
10,635 0.13 % 267.78 729 0.06 % 184.28 351,818 0.22 % 648.25 69,292 0.14 % 369.72 181,789 0.19 % 571.77 14,019 0.11 % 326.52 letih l etih 593,892 0.18 % 523.39 0 0 % 0 7,036 0.09 % 177.16 1,222 0.10 % 308.91 295,576 0.18 % 544.62 88,203 0.17 % 470.62 186,187 0.19 % 585.61 15,668 0.13 % 364.93 delo delo 592,066 0.18 % 521.78 0 0 % 0 9,140 0.12 % 230.14 2,089 0.18 % 528.08 304,906 0.19 % 561.81 87,155 0.17 % 465.03 162,646 0.17 % 511.56 26,130 0.21 % 608.60 slovenije slove nije 579,657 0.17 % 510.85 0 0 % 0 465 0.01 % 11.71 6,843 0.58 % 1,729.84 310,741 0.19 % 572.56 56,763 0.11 % 302.87 198,780 0.20 % 625.21 6,065 0.05 % 141.26 sloveniji slove niji 567,197 0.17 % 499.87 0 0 % 0 478 0.01 % 12.04 2,037 0.17 % 514.93 292,345 0.18 % 538.66 78,143 0.15 % 416.95 187,227 0.19 % 588.88 6,967 0.06 % 162.27 odstotkov odsto tkov 559,799 0.17 % 493.35 0 0 % 0 578 0.01 % 14.55 299 0.03 % 75.58 311,673 0.19 % 574.28 53,835 0.10 % 287.25 188,394 0.19 % 592.55 5,020 0.04 % 116.92 predsednik predse dnik 545,455 0.16 % 480.71 0 0 % 0 1,964 0.03 % 49.45 1,093 0.09 % 276.30 309,379 0.19 % 570.05 37,608 0.07 % 200.66 192,584 0.20 % 605.73 2,827 0.02 % 65.84 času času 515,535 0.15 % 454.34 1 0.04 % 103.02 7,848 0.10 % 197.60 1,614 0.14 % 408 247,802 0.15 % 456.59 82,263 0.16 % 438.93 157,737 0.16 % 496.12 18,270 0.15 % 425.53 časa časa 504,880 0.15 % 444.95 12 0.42 % 1,236.22 24,957 0.32 % 628.39 1,300 0.11 % 328.63 228,687 0.14 % 421.37 104,840 0.20 % 559.39 124,379 0.13 % 391.20 20,705 0.17 % 482.24 svetu s vetu 452,594 0.14 % 398.87 0 0 % 0 9,995 0.13 % 251.66 998 0.08 % 252.28 213,404 0.13 % 393.21 76,162 0.15 % 406.38 137,347 0.14 % 431.99 14,688 0.12 % 342.10 milijonov milij onov 451,741 0.14 % 398.12 0 0 % 0 883 0.01 % 22.23 341 0.03 % 86.20 263,033 0.16 % 484.66 39,191 0.08 % 209.11 145,704 0.15 % 458.28 2,589 0.02 % 60.30 države dr žave 441,261 0.13 % 388.88 0 0 % 0 1,697 0.02 % 42.73 3,406 0.29 % 861 219,400 0.14 % 404.26 43,421 0.08 % 231.68 160,359 0.17 % 504.37 12,978 0.11 % 
302.27 slovenija slove nija 434,662 0.13 % 383.07 0 0 % 0 216 0 % 5.44 1,480 0.12 % 374.13 199,029 0.12 % 366.72 38,121 0.07 % 203.40 192,738 0.20 % 606.21 3,078 0.03 % 71.69 tolarjev tola rjev 412,138 0.12 % 363.22 0 0 % 0 93 0 % 2.34 1,150 0.10 % 290.71 368,701 0.23 % 679.36 40,402 0.08 % 215.57 1,394 0 % 4.38 398 0 % 9.27 mestu m estu 409,944 0.12 % 361.28 0 0 % 0 10,096 0.13 % 254.21 534 0.04 % 134.99 198,916 0.12 % 366.52 53,464 0.10 % 285.27 137,112 0.14 % 431.25 9,822 0.08 % 228.77 odstotka odst otka 408,111 0.12 % 359.67 0 0 % 0 48 0 % 1.21 62 0.01 % 15.67 182,717 0.11 % 336.67 15,227 0.03 % 81.25 209,030 0.21 % 657.45 1,027 0.01 % 23.92 letu letu 396,319 0.12 % 349.27 0 0 % 0 1,695 0.02 % 42.68 1,290 0.11 % 326.10 205,061 0.13 % 377.84 50,649 0.10 % 270.25 128,073 0.13 % 402.82 9,551 0.08 % 222.45 koncu k oncu 386,487 0.12 % 340.61 20 0.70 % 2,060.37 11,331 0.14 % 285.30 1,011 0.09 % 255.57 182,053 0.11 % 335.44 60,717 0.12 % 323.97 118,425 0.12 % 372.48 12,930 0.10 % 301.16 leti leti 380,841 0.11 % 335.63 0 0 % 0 6,530 0.08 % 164.42 570 0.05 % 144.09 200,657 0.12 % 369.72 60,949 0.12 % 325.20 106,368 0.11 % 334.56 5,767 0.05 % 134.32 ljudje lj udje 380,144 0.11 % 335.02 0 0 % 0 17,127 0.22 % 431.24 1,076 0.09 % 272 172,775 0.11 % 318.35 75,678 0.15 % 403.79 92,650 0.10 % 291.41 20,838 0.17 % 485.34 primer pr imer 373,094 0.11 % 328.81 8 0.28 % 824.15 7,528 0.10 % 189.55 1,699 0.14 % 429.49 171,615 0.10 % 316.21 74,757 0.14 % 398.88 86,453 0.09 % 271.92 31,034 0.25 % 722.82 sveta s veta 372,133 0.11 % 327.96 0 0 % 0 6,226 0.08 % 156.76 1,499 0.13 % 378.93 198,045 0.12 % 364.91 43,196 0.08 % 230.48 110,306 0.11 % 346.94 12,861 0.10 % 299.55 delu delu 372,018 0.11 % 327.86 0 0 % 0 4,223 0.05 % 106.33 1,374 0.12 % 347.33 183,117 0.11 % 337.41 53,176 0.10 % 283.73 112,003 0.12 % 352.28 18,125 0.15 % 422.15 način n ačin 366,442 0.11 % 322.94 2 0.07 % 206.04 8,527 0.11 % 214.70 3,371 0.28 % 852.15 156,026 0.10 % 287.49 69,269 0.14 % 369.60 104,410 0.11 % 328.40 
24,837 0.20 % 578.48
podjetja podj etja 357,777 0.11 % 315.31 0 0 % 0 870 0.01 % 21.91 920 0.08 % 232.57 193,160 0.12 % 355.91 54,217 0.10 % 289.28 100,211 0.10 % 315.19 8,399 0.07 % 195.62

1.2.40 List of final character-level 5-grams from noun lower-case word forms in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-word_parts-nouns-lowercase_forms-final-5grams-taxonomy-entire.tsv

Lower-case word form Rest of the word Final part of the word Total absolute frequency of lower-case word form Percentage of all found lower-case word forms Total relative frequency (per million occurrences) Absolute frequency [SSJ.T.K.N] Percentage [SSJ.T.K.N] Relative frequency [SSJ.T.K.N] Absolute frequency [SSJ.T.K.L] Percentage [SSJ.T.K.L] Relative frequency [SSJ.T.K.L] Absolute frequency [SSJ.T.D] Percentage [SSJ.T.D] Relative frequency [SSJ.T.D] Absolute frequency [SSJ.T.P.C] Percentage [SSJ.T.P.C] Relative frequency [SSJ.T.P.C] Absolute frequency [SSJ.T.P.R] Percentage [SSJ.T.P.R] Relative frequency [SSJ.T.P.R] Absolute frequency [SSJ.I] Percentage [SSJ.I] Relative frequency [SSJ.I] Absolute frequency [SSJ.T.K.S] Percentage [SSJ.T.K.S] Relative frequency [SSJ.T.K.S]
evrov evrov 773,102 0.26 % 681.33 0 0 % 0 446 0.01 % 11.23 6 0 % 1.52 323,022 0.22 % 595.19 31,324 0.07 % 167.13 417,524 0.47 % 1,313.22 780 0.01 % 18.17
strani s trani 656,360 0.22 % 578.45 5 0.22 % 515.09 19,223 0.29 % 484.01 1,737 0.16 % 439.09 274,744 0.19 % 506.23 128,171 0.28 % 683.88 199,449 0.23 % 627.32 33,031 0.30 % 769.33
ljudi ljudi 634,249 0.21 % 558.96 0 0 % 0 16,405 0.25 % 413.06 1,327 0.12 % 335.45 284,778 0.19 % 524.72 96,577 0.21 % 515.30 210,518 0.24 % 662.13 24,644 0.22 % 573.99
ljubljana ljub ljana 629,709 0.21 % 554.96 0 0 % 0 517 0.01 % 13.02 355 0.03 % 89.74 313,379 0.21 % 577.42 31,935 0.07 % 170.39 273,474 0.31 % 860.15 10,049 0.09 % 234.05
mesto mesto 628,282 0.21 % 553.70 0 0 % 0
10,635 0.16 % 267.78 729 0.07 % 184.28 351,818 0.24 % 648.25 69,292 0.15 % 369.72 181,789 0.21 % 571.77 14,019 0.13 % 326.52 letih letih 593,892 0.20 % 523.39 0 0 % 0 7,036 0.11 % 177.16 1,222 0.11 % 308.91 295,576 0.20 % 544.62 88,203 0.19 % 470.62 186,187 0.21 % 585.61 15,668 0.14 % 364.93 slovenije slov enije 579,657 0.19 % 510.85 0 0 % 0 465 0.01 % 11.71 6,843 0.62 % 1,729.84 310,741 0.21 % 572.56 56,763 0.12 % 302.87 198,780 0.23 % 625.21 6,065 0.06 % 141.26 sloveniji slov eniji 567,197 0.19 % 499.87 0 0 % 0 478 0.01 % 12.04 2,037 0.19 % 514.93 292,345 0.20 % 538.66 78,143 0.17 % 416.95 187,227 0.21 % 588.88 6,967 0.06 % 162.27 odstotkov odst otkov 559,799 0.19 % 493.35 0 0 % 0 578 0.01 % 14.55 299 0.03 % 75.58 311,673 0.21 % 574.28 53,835 0.12 % 287.25 188,394 0.21 % 592.55 5,020 0.05 % 116.92 predsednik preds ednik 545,455 0.18 % 480.71 0 0 % 0 1,964 0.03 % 49.45 1,093 0.10 % 276.30 309,379 0.21 % 570.05 37,608 0.08 % 200.66 192,584 0.22 % 605.73 2,827 0.03 % 65.84 svetu svetu 452,594 0.15 % 398.87 0 0 % 0 9,995 0.15 % 251.66 998 0.09 % 252.28 213,404 0.14 % 393.21 76,162 0.17 % 406.38 137,347 0.16 % 431.99 14,688 0.13 % 342.10 milijonov mili jonov 451,741 0.15 % 398.12 0 0 % 0 883 0.01 % 22.23 341 0.03 % 86.20 263,033 0.18 % 484.66 39,191 0.09 % 209.11 145,704 0.17 % 458.28 2,589 0.02 % 60.30 države d ržave 441,261 0.15 % 388.88 0 0 % 0 1,697 0.03 % 42.73 3,406 0.31 % 861 219,400 0.15 % 404.26 43,421 0.10 % 231.68 160,359 0.18 % 504.37 12,978 0.12 % 302.27 slovenija slov enija 434,662 0.14 % 383.07 0 0 % 0 216 0 % 5.44 1,480 0.14 % 374.13 199,029 0.14 % 366.72 38,121 0.08 % 203.40 192,738 0.22 % 606.21 3,078 0.03 % 71.69 tolarjev tol arjev 412,138 0.14 % 363.22 0 0 % 0 93 0 % 2.34 1,150 0.10 % 290.71 368,701 0.25 % 679.36 40,402 0.09 % 215.57 1,394 0 % 4.38 398 0 % 9.27 mestu mestu 409,944 0.14 % 361.28 0 0 % 0 10,096 0.15 % 254.21 534 0.05 % 134.99 198,916 0.14 % 366.52 53,464 0.12 % 285.27 137,112 0.16 % 431.25 9,822 0.09 % 228.77 odstotka ods totka 
408,111 0.14 % 359.67 0 0 % 0 48 0 % 1.21 62 0.01 % 15.67 182,717 0.12 % 336.67 15,227 0.03 % 81.25 209,030 0.24 % 657.45 1,027 0.01 % 23.92 koncu koncu 386,487 0.13 % 340.61 20 0.86 % 2,060.37 11,331 0.17 % 285.30 1,011 0.09 % 255.57 182,053 0.12 % 335.44 60,717 0.13 % 323.97 118,425 0.14 % 372.48 12,930 0.12 % 301.16 ljudje l judje 380,144 0.13 % 335.02 0 0 % 0 17,127 0.26 % 431.24 1,076 0.10 % 272 172,775 0.12 % 318.35 75,678 0.17 % 403.79 92,650 0.10 % 291.41 20,838 0.19 % 485.34 primer p rimer 373,094 0.12 % 328.81 8 0.34 % 824.15 7,528 0.11 % 189.55 1,699 0.15 % 429.49 171,615 0.12 % 316.21 74,757 0.16 % 398.88 86,453 0.10 % 271.92 31,034 0.28 % 722.82 sveta sveta 372,133 0.12 % 327.96 0 0 % 0 6,226 0.09 % 156.76 1,499 0.14 % 378.93 198,045 0.14 % 364.91 43,196 0.10 % 230.48 110,306 0.12 % 346.94 12,861 0.12 % 299.55 način način 366,442 0.12 % 322.94 2 0.09 % 206.04 8,527 0.13 % 214.70 3,371 0.31 % 852.15 156,026 0.11 % 287.49 69,269 0.15 % 369.60 104,410 0.12 % 328.40 24,837 0.23 % 578.48 podjetja pod jetja 357,777 0.12 % 315.31 0 0 % 0 870 0.01 % 21.91 920 0.08 % 232.57 193,160 0.13 % 355.91 54,217 0.12 % 289.28 100,211 0.11 % 315.19 8,399 0.08 % 195.62 življenje živl jenje 336,327 0.11 % 296.40 0 0 % 0 18,645 0.28 % 469.46 1,095 0.10 % 276.80 143,565 0.10 % 264.53 73,976 0.16 % 394.71 77,091 0.09 % 242.47 21,955 0.20 % 511.36 primeru pr imeru 329,763 0.11 % 290.62 0 0 % 0 4,501 0.07 % 113.33 2,246 0.20 % 567.76 144,090 0.10 % 265.50 49,248 0.11 % 262.77 114,733 0.13 % 360.87 14,945 0.14 % 348.09 vprašanje vpra šanje 329,493 0.11 % 290.38 0 0 % 0 6,144 0.09 % 154.70 1,445 0.13 % 365.28 173,591 0.12 % 319.85 44,565 0.10 % 237.78 92,549 0.10 % 291.09 11,199 0.10 % 260.84 vlada vlada 312,080 0.10 % 275.03 0 0 % 0 928 0.01 % 23.37 2,791 0.26 % 705.53 162,534 0.11 % 299.48 22,227 0.05 % 118.60 120,955 0.14 % 380.43 2,645 0.02 % 61.61 ljubljani ljub ljani 298,833 0.10 % 263.36 0 0 % 0 1,086 0.02 % 27.34 371 0.03 % 93.78 164,428 0.11 % 302.97 38,080 0.08 % 203.18 
90,498 0.10 % 284.64 4,370 0.04 % 101.78
začetku za četku 297,915 0.10 % 262.55 1 0.04 % 103.02 3,617 0.06 % 91.07 694 0.06 % 175.44 147,717 0.10 % 272.18 45,622 0.10 % 243.42 90,370 0.10 % 284.24 9,894 0.09 % 230.44
mesta mesta 295,647 0.10 % 260.55 0 0 % 0 6,951 0.10 % 175.02 630 0.06 % 159.26 152,791 0.10 % 281.53 36,280 0.08 % 193.58 89,032 0.10 % 280.03 9,963 0.09 % 232.05
otrok otrok 287,157 0.10 % 253.07 0 0 % 0 10,771 0.16 % 271.20 602 0.06 % 152.18 129,206 0.09 % 238.07 51,492 0.11 % 274.74 77,643 0.09 % 244.21 17,443 0.16 % 406.27
vlade vlade 287,061 0.10 % 252.99 0 0 % 0 543 0.01 % 13.67 1,602 0.15 % 404.97 150,128 0.10 % 276.62 21,656 0.05 % 115.55 110,720 0.13 % 348.24 2,412 0.02 % 56.18

1.2.41 List of initial character-level 1-grams from verb lemmas in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-word_parts-verbs-lemmas-initial-1grams-taxonomy-entire.tsv

Lemma Lemma (lower-case) Initial part of the word Rest of the word Total absolute frequency of lemma Percentage of all found lemmas Total relative frequency (per million occurrences) Absolute frequency [SSJ.T.K.N] Percentage [SSJ.T.K.N] Relative frequency [SSJ.T.K.N] Absolute frequency [SSJ.T.K.L] Percentage [SSJ.T.K.L] Relative frequency [SSJ.T.K.L] Absolute frequency [SSJ.T.D] Percentage [SSJ.T.D] Relative frequency [SSJ.T.D] Absolute frequency [SSJ.T.P.C] Percentage [SSJ.T.P.C] Relative frequency [SSJ.T.P.C] Absolute frequency [SSJ.T.P.R] Percentage [SSJ.T.P.R] Relative frequency [SSJ.T.P.R] Absolute frequency [SSJ.I] Percentage [SSJ.I] Relative frequency [SSJ.I] Absolute frequency [SSJ.T.K.S] Percentage [SSJ.T.K.S] Relative frequency [SSJ.T.K.S]
biti biti b iti 91,521,762 45.65 % 80,657.66 130 6.79 % 13,392.40 5,035,173 46.44 % 126,780.21 232,963 38.78 % 58,890.53 43,157,916 46.56 % 79,521.31 14,021,776 42.64 % 74,815.59 26,234,349 46.57 % 82,513.86 2,839,455 39.78 % 66,134.28
imeti
imeti i meti 4,146,931 2.07 % 3,654.67 12 0.63 % 1,236.22 179,933 1.66 % 4,530.52 14,728 2.45 % 3,723.08 1,929,486 2.08 % 3,555.21 785,754 2.39 % 4,192.53 1,085,001 1.93 % 3,412.61 152,017 2.13 % 3,540.66 morati morati m orati 2,048,887 1.02 % 1,805.67 6 0.31 % 618.11 93,053 0.86 % 2,342.97 10,982 1.83 % 2,776.13 960,889 1.04 % 1,770.50 336,918 1.02 % 1,797.68 558,017 0.99 % 1,755.11 89,022 1.25 % 2,073.43 iti iti i ti 1,422,557 0.71 % 1,253.69 1 0.05 % 103.02 87,499 0.81 % 2,203.13 4,730 0.79 % 1,195.69 635,556 0.69 % 1,171.05 245,607 0.75 % 1,310.48 407,923 0.72 % 1,283.02 41,241 0.58 % 960.55 začeti začeti z ačeti 1,138,044 0.57 % 1,002.95 5 0.26 % 515.09 48,368 0.45 % 1,217.85 3,141 0.52 % 794.01 542,728 0.59 % 1,000.01 183,809 0.56 % 980.74 324,275 0.58 % 1,019.93 35,718 0.50 % 831.91 priti priti p riti 1,046,777 0.52 % 922.52 2 0.10 % 206.04 71,696 0.66 % 1,805.23 3,067 0.51 % 775.30 476,007 0.51 % 877.07 175,716 0.53 % 937.56 286,526 0.51 % 901.20 33,763 0.47 % 786.38 povedati povedati p ovedati 970,784 0.48 % 855.55 0 0 % 0 54,069 0.50 % 1,361.40 2,102 0.35 % 531.36 490,903 0.53 % 904.52 114,814 0.35 % 612.61 292,099 0.52 % 918.73 16,797 0.23 % 391.22 dobiti dobiti d obiti 966,276 0.48 % 851.57 25 1.31 % 2,575.46 24,925 0.23 % 627.58 2,237 0.37 % 565.49 498,593 0.54 % 918.69 162,196 0.49 % 865.42 253,287 0.45 % 796.65 25,013 0.35 % 582.58 želeti želeti ž eleti 916,972 0.46 % 808.12 12 0.63 % 1,236.22 26,786 0.25 % 674.44 2,633 0.44 % 665.59 392,752 0.42 % 723.67 153,351 0.47 % 818.23 311,583 0.55 % 980.01 29,855 0.42 % 695.36 vedeti vedeti v edeti 869,903 0.43 % 766.64 1 0.05 % 103.02 115,253 1.06 % 2,901.95 3,084 0.51 % 779.60 365,237 0.39 % 672.97 174,422 0.53 % 930.66 183,677 0.33 % 577.71 28,229 0.40 % 657.49 moči moči m oči 829,908 0.41 % 731.39 0 0 % 0 65,444 0.60 % 1,647.81 3,619 0.60 % 914.84 382,779 0.41 % 705.30 143,731 0.44 % 766.90 196,963 0.35 % 619.50 37,372 0.52 % 870.44 videti videti v ideti 729,726 0.36 % 643.10 0 0 % 0 94,473 0.87 % 
2,378.73 1,702 0.28 % 430.25 273,372 0.29 % 503.71 146,780 0.45 % 783.17 178,717 0.32 % 562.11 34,682 0.49 % 807.79 praviti praviti p raviti 720,199 0.36 % 634.71 0 0 % 0 19,047 0.18 % 479.58 1,668 0.28 % 421.65 379,505 0.41 % 699.26 122,357 0.37 % 652.86 180,251 0.32 % 566.94 17,371 0.24 % 404.59 postati postati p ostati 640,874 0.32 % 564.80 8 0.42 % 824.15 24,724 0.23 % 622.52 1,981 0.33 % 500.78 280,862 0.30 % 517.51 131,659 0.40 % 702.49 166,830 0.30 % 524.72 34,810 0.49 % 810.77 reči reči r eči 623,230 0.31 % 549.25 0 0 % 0 143,415 1.32 % 3,611.03 3,775 0.63 % 954.28 220,496 0.24 % 406.28 130,052 0.40 % 693.91 102,237 0.18 % 321.56 23,255 0.33 % 541.64 dejati dejati d ejati 588,836 0.29 % 518.94 0 0 % 0 11,125 0.10 % 280.12 129 0.02 % 32.61 238,385 0.26 % 439.24 24,514 0.07 % 130.80 310,945 0.55 % 978 3,738 0.05 % 87.06 pomeniti pomeniti p omeniti 572,293 0.28 % 504.36 1 0.05 % 103.02 15,176 0.14 % 382.12 2,098 0.35 % 530.35 265,611 0.29 % 489.41 114,402 0.35 % 610.41 146,456 0.26 % 460.64 28,549 0.40 % 664.94 ostati ostati o stati 562,250 0.28 % 495.51 1 0.05 % 103.02 26,621 0.25 % 670.29 1,527 0.25 % 386.01 268,949 0.29 % 495.56 90,209 0.27 % 481.33 157,481 0.28 % 495.32 17,462 0.24 % 406.71 dati dati d ati 532,058 0.27 % 468.90 31 1.62 % 3,193.57 38,456 0.35 % 968.28 3,258 0.54 % 823.59 242,506 0.26 % 446.83 100,968 0.31 % 538.73 125,248 0.22 % 393.94 21,591 0.30 % 502.88 odločiti odločiti o dločiti 511,266 0.26 % 450.58 0 0 % 0 11,502 0.11 % 289.61 1,705 0.28 % 431.01 253,119 0.27 % 466.39 84,393 0.26 % 450.29 150,035 0.27 % 471.90 10,512 0.15 % 244.84 najti najti n ajti 509,948 0.25 % 449.41 1 0.05 % 103.02 30,279 0.28 % 762.39 1,136 0.19 % 287.17 213,191 0.23 % 392.82 113,686 0.35 % 606.59 127,739 0.23 % 401.77 23,916 0.34 % 557.03 igrati igrati i grati 507,131 0.25 % 446.93 0 0 % 0 10,320 0.10 % 259.85 250 0.04 % 63.20 259,647 0.28 % 478.42 72,953 0.22 % 389.25 157,472 0.28 % 495.29 6,489 0.09 % 151.14 doseči doseči d oseči 504,161 0.25 % 444.31 2 0.10 
% 206.04 5,633 0.05 % 141.83 968 0.16 % 244.70 251,182 0.27 % 462.82 61,834 0.19 % 329.93 166,863 0.30 % 524.83 17,679 0.25 % 411.76
hoteti hoteti h oteti 473,951 0.24 % 417.69 4 0.21 % 412.07 66,334 0.61 % 1,670.22 1,878 0.31 % 474.74 213,028 0.23 % 392.52 94,743 0.29 % 505.52 78,178 0.14 % 245.89 19,786 0.28 % 460.84
narediti narediti n arediti 471,071 0.23 % 415.15 8 0.42 % 824.15 26,480 0.24 % 666.74 1,294 0.21 % 327.11 200,417 0.22 % 369.28 101,403 0.31 % 541.05 124,061 0.22 % 390.20 17,408 0.24 % 405.45
govoriti govoriti g ovoriti 450,094 0.22 % 396.67 0 0 % 0 37,708 0.35 % 949.45 2,215 0.37 % 559.93 204,344 0.22 % 376.52 77,505 0.24 % 413.54 107,171 0.19 % 337.08 21,151 0.30 % 492.63
delati delati d elati 437,814 0.22 % 385.84 1 0.05 % 103.02 19,899 0.18 % 501.04 1,349 0.23 % 341.01 203,224 0.22 % 374.45 83,284 0.25 % 444.38 115,508 0.20 % 363.30 14,549 0.20 % 338.86
pričakovati pričakovati p ričakovati 429,229 0.21 % 378.28 0 0 % 0 9,942 0.09 % 250.33 536 0.09 % 135.50 221,371 0.24 % 407.89 54,496 0.17 % 290.77 135,527 0.24 % 426.27 7,357 0.10 % 171.35
pomagati pomagati p omagati 425,110 0.21 % 374.65 1 0.05 % 103.02 18,730 0.17 % 471.60 929 0.15 % 234.84 187,285 0.20 % 345.08 86,740 0.26 % 462.82 109,931 0.20 % 345.76 21,494 0.30 % 500.62
uspeti uspeti u speti 416,462 0.21 % 367.03 0 0 % 0 11,945 0.11 % 300.76 441 0.07 % 111.48 202,705 0.22 % 373.50 63,264 0.19 % 337.56 130,990 0.23 % 412 7,117 0.10 % 165.76
kazati kazati k azati 414,236 0.21 % 365.06 1 0.05 % 103.02 9,240 0.09 % 232.65 811 0.14 % 205.01 204,947 0.22 % 377.63 64,708 0.20 % 345.26 114,799 0.20 % 361.07 19,730 0.28 % 459.54
pokazati pokazati p okazati 412,679 0.21 % 363.69 0 0 % 0 17,416 0.16 % 438.52 714 0.12 % 180.49 195,963 0.21 % 361.07 69,548 0.21 % 371.09 112,955 0.20 % 355.27 16,083 0.23 % 374.59

1.2.42 List of initial character-level 2-grams from verb lemmas in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-word_parts-verbs-lemmas-initial-2grams-taxonomy-entire.tsv

Lemma Lemma (lower-case) Initial part of the word Rest of the word Total absolute frequency of lemma Percentage of all found lemmas Total relative frequency (per million occurrences) Absolute frequency [SSJ.T.K.N] Percentage [SSJ.T.K.N] Relative frequency [SSJ.T.K.N] Absolute frequency [SSJ.T.K.L] Percentage [SSJ.T.K.L] Relative frequency [SSJ.T.K.L] Absolute frequency [SSJ.T.D] Percentage [SSJ.T.D] Relative frequency [SSJ.T.D] Absolute frequency [SSJ.T.P.C] Percentage [SSJ.T.P.C] Relative frequency [SSJ.T.P.C] Absolute frequency [SSJ.T.P.R] Percentage [SSJ.T.P.R] Relative frequency [SSJ.T.P.R] Absolute frequency [SSJ.I] Percentage [SSJ.I] Relative frequency [SSJ.I] Absolute frequency [SSJ.T.K.S] Percentage [SSJ.T.K.S] Relative frequency [SSJ.T.K.S]
biti biti bi ti 91,521,762 45.65 % 80,657.66 130 6.79 % 13,392.40 5,035,173 46.44 % 126,780.21 232,963 38.78 % 58,890.53 43,157,916 46.56 % 79,521.31 14,021,776 42.64 % 74,815.59 26,234,349 46.57 % 82,513.86 2,839,455 39.78 % 66,134.28
imeti imeti im eti 4,146,931 2.07 % 3,654.67 12 0.63 % 1,236.22 179,933 1.66 % 4,530.52 14,728 2.45 % 3,723.08 1,929,486 2.08 % 3,555.21 785,754 2.39 % 4,192.53 1,085,001 1.93 % 3,412.61 152,017 2.13 % 3,540.66
morati morati mo rati 2,048,887 1.02 % 1,805.67 6 0.31 % 618.11 93,053 0.86 % 2,342.97 10,982 1.83 % 2,776.13 960,889 1.04 % 1,770.50 336,918 1.02 % 1,797.68 558,017 0.99 % 1,755.11 89,022 1.25 % 2,073.43
iti iti it i 1,422,557 0.71 % 1,253.69 1 0.05 % 103.02 87,499 0.81 % 2,203.13 4,730 0.79 % 1,195.69 635,556 0.69 % 1,171.05 245,607 0.75 % 1,310.48 407,923 0.72 % 1,283.02 41,241 0.58 % 960.55
začeti začeti za četi 1,138,044 0.57 % 1,002.95 5 0.26 % 515.09 48,368 0.45 % 1,217.85 3,141 0.52 % 794.01 542,728 0.59 % 1,000.01 183,809 0.56 % 980.74 324,275 0.58 % 1,019.93 35,718 0.50 % 831.91
priti priti pr iti 1,046,777 0.52 % 922.52 2 0.10 % 206.04 71,696 0.66 % 1,805.23 3,067 0.51 % 775.30
476,007 0.51 % 877.07 175,716 0.53 % 937.56 286,526 0.51 % 901.20 33,763 0.47 % 786.38 povedati povedati po vedati 970,784 0.48 % 855.55 0 0 % 0 54,069 0.50 % 1,361.40 2,102 0.35 % 531.36 490,903 0.53 % 904.52 114,814 0.35 % 612.61 292,099 0.52 % 918.73 16,797 0.23 % 391.22 dobiti dobiti do biti 966,276 0.48 % 851.57 25 1.31 % 2,575.46 24,925 0.23 % 627.58 2,237 0.37 % 565.49 498,593 0.54 % 918.69 162,196 0.49 % 865.42 253,287 0.45 % 796.65 25,013 0.35 % 582.58 želeti želeti že leti 916,972 0.46 % 808.12 12 0.63 % 1,236.22 26,786 0.25 % 674.44 2,633 0.44 % 665.59 392,752 0.42 % 723.67 153,351 0.47 % 818.23 311,583 0.55 % 980.01 29,855 0.42 % 695.36 vedeti vedeti ve deti 869,903 0.43 % 766.64 1 0.05 % 103.02 115,253 1.06 % 2,901.95 3,084 0.51 % 779.60 365,237 0.39 % 672.97 174,422 0.53 % 930.66 183,677 0.33 % 577.71 28,229 0.40 % 657.49 moči moči mo či 829,908 0.41 % 731.39 0 0 % 0 65,444 0.60 % 1,647.81 3,619 0.60 % 914.84 382,779 0.41 % 705.30 143,731 0.44 % 766.90 196,963 0.35 % 619.50 37,372 0.52 % 870.44 videti videti vi deti 729,726 0.36 % 643.10 0 0 % 0 94,473 0.87 % 2,378.73 1,702 0.28 % 430.25 273,372 0.29 % 503.71 146,780 0.45 % 783.17 178,717 0.32 % 562.11 34,682 0.49 % 807.79 praviti praviti pr aviti 720,199 0.36 % 634.71 0 0 % 0 19,047 0.18 % 479.58 1,668 0.28 % 421.65 379,505 0.41 % 699.26 122,357 0.37 % 652.86 180,251 0.32 % 566.94 17,371 0.24 % 404.59 postati postati po stati 640,874 0.32 % 564.80 8 0.42 % 824.15 24,724 0.23 % 622.52 1,981 0.33 % 500.78 280,862 0.30 % 517.51 131,659 0.40 % 702.49 166,830 0.30 % 524.72 34,810 0.49 % 810.77 reči reči re či 623,230 0.31 % 549.25 0 0 % 0 143,415 1.32 % 3,611.03 3,775 0.63 % 954.28 220,496 0.24 % 406.28 130,052 0.40 % 693.91 102,237 0.18 % 321.56 23,255 0.33 % 541.64 dejati dejati de jati 588,836 0.29 % 518.94 0 0 % 0 11,125 0.10 % 280.12 129 0.02 % 32.61 238,385 0.26 % 439.24 24,514 0.07 % 130.80 310,945 0.55 % 978 3,738 0.05 % 87.06 pomeniti pomeniti po meniti 572,293 0.28 % 504.36 1 0.05 % 103.02 
15,176 0.14 % 382.12 2,098 0.35 % 530.35 265,611 0.29 % 489.41 114,402 0.35 % 610.41 146,456 0.26 % 460.64 28,549 0.40 % 664.94 ostati ostati os tati 562,250 0.28 % 495.51 1 0.05 % 103.02 26,621 0.25 % 670.29 1,527 0.25 % 386.01 268,949 0.29 % 495.56 90,209 0.27 % 481.33 157,481 0.28 % 495.32 17,462 0.24 % 406.71 dati dati da ti 532,058 0.27 % 468.90 31 1.62 % 3,193.57 38,456 0.35 % 968.28 3,258 0.54 % 823.59 242,506 0.26 % 446.83 100,968 0.31 % 538.73 125,248 0.22 % 393.94 21,591 0.30 % 502.88 odločiti odločiti od ločiti 511,266 0.26 % 450.58 0 0 % 0 11,502 0.11 % 289.61 1,705 0.28 % 431.01 253,119 0.27 % 466.39 84,393 0.26 % 450.29 150,035 0.27 % 471.90 10,512 0.15 % 244.84 najti najti na jti 509,948 0.25 % 449.41 1 0.05 % 103.02 30,279 0.28 % 762.39 1,136 0.19 % 287.17 213,191 0.23 % 392.82 113,686 0.35 % 606.59 127,739 0.23 % 401.77 23,916 0.34 % 557.03 igrati igrati ig rati 507,131 0.25 % 446.93 0 0 % 0 10,320 0.10 % 259.85 250 0.04 % 63.20 259,647 0.28 % 478.42 72,953 0.22 % 389.25 157,472 0.28 % 495.29 6,489 0.09 % 151.14 doseči doseči do seči 504,161 0.25 % 444.31 2 0.10 % 206.04 5,633 0.05 % 141.83 968 0.16 % 244.70 251,182 0.27 % 462.82 61,834 0.19 % 329.93 166,863 0.30 % 524.83 17,679 0.25 % 411.76 hoteti hoteti ho teti 473,951 0.24 % 417.69 4 0.21 % 412.07 66,334 0.61 % 1,670.22 1,878 0.31 % 474.74 213,028 0.23 % 392.52 94,743 0.29 % 505.52 78,178 0.14 % 245.89 19,786 0.28 % 460.84 narediti narediti na rediti 471,071 0.23 % 415.15 8 0.42 % 824.15 26,480 0.24 % 666.74 1,294 0.21 % 327.11 200,417 0.22 % 369.28 101,403 0.31 % 541.05 124,061 0.22 % 390.20 17,408 0.24 % 405.45 govoriti govoriti go voriti 450,094 0.22 % 396.67 0 0 % 0 37,708 0.35 % 949.45 2,215 0.37 % 559.93 204,344 0.22 % 376.52 77,505 0.24 % 413.54 107,171 0.19 % 337.08 21,151 0.30 % 492.63 delati delati de lati 437,814 0.22 % 385.84 1 0.05 % 103.02 19,899 0.18 % 501.04 1,349 0.23 % 341.01 203,224 0.22 % 374.45 83,284 0.25 % 444.38 115,508 0.20 % 363.30 14,549 0.20 % 338.86 pričakovati 
pričakovati pr ičakovati 429,229 0.21 % 378.28 0 0 % 0 9,942 0.09 % 250.33 536 0.09 % 135.50 221,371 0.24 % 407.89 54,496 0.17 % 290.77 135,527 0.24 % 426.27 7,357 0.10 % 171.35
pomagati pomagati po magati 425,110 0.21 % 374.65 1 0.05 % 103.02 18,730 0.17 % 471.60 929 0.15 % 234.84 187,285 0.20 % 345.08 86,740 0.26 % 462.82 109,931 0.20 % 345.76 21,494 0.30 % 500.62
uspeti uspeti us peti 416,462 0.21 % 367.03 0 0 % 0 11,945 0.11 % 300.76 441 0.07 % 111.48 202,705 0.22 % 373.50 63,264 0.19 % 337.56 130,990 0.23 % 412 7,117 0.10 % 165.76
kazati kazati ka zati 414,236 0.21 % 365.06 1 0.05 % 103.02 9,240 0.09 % 232.65 811 0.14 % 205.01 204,947 0.22 % 377.63 64,708 0.20 % 345.26 114,799 0.20 % 361.07 19,730 0.28 % 459.54
pokazati pokazati po kazati 412,679 0.21 % 363.69 0 0 % 0 17,416 0.16 % 438.52 714 0.12 % 180.49 195,963 0.21 % 361.07 69,548 0.21 % 371.09 112,955 0.20 % 355.27 16,083 0.23 % 374.59

1.2.43 List of initial character-level 3-grams from verb lemmas in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-word_parts-verbs-lemmas-initial-3grams-taxonomy-entire.tsv

Lemma Lemma (lower-case) Initial part of the word Rest of the word Total absolute frequency of lemma Percentage of all found lemmas Total relative frequency (per million occurrences) Absolute frequency [SSJ.T.K.N] Percentage [SSJ.T.K.N] Relative frequency [SSJ.T.K.N] Absolute frequency [SSJ.T.K.L] Percentage [SSJ.T.K.L] Relative frequency [SSJ.T.K.L] Absolute frequency [SSJ.T.D] Percentage [SSJ.T.D] Relative frequency [SSJ.T.D] Absolute frequency [SSJ.T.P.C] Percentage [SSJ.T.P.C] Relative frequency [SSJ.T.P.C] Absolute frequency [SSJ.T.P.R] Percentage [SSJ.T.P.R] Relative frequency [SSJ.T.P.R] Absolute frequency [SSJ.I] Percentage [SSJ.I] Relative frequency [SSJ.I] Absolute frequency [SSJ.T.K.S] Percentage [SSJ.T.K.S] Relative frequency [SSJ.T.K.S]
biti biti bit i 91,521,762 45.65 % 80,657.66
130 6.79 % 13,392.40 5,035,173 46.44 % 126,780.21 232,963 38.79 % 58,890.53 43,157,916 46.56 % 79,521.31 14,021,776 42.64 % 74,815.59 26,234,349 46.57 % 82,513.86 2,839,455 39.78 % 66,134.28 imeti imeti ime ti 4,146,931 2.07 % 3,654.67 12 0.63 % 1,236.22 179,933 1.66 % 4,530.52 14,728 2.45 % 3,723.08 1,929,486 2.08 % 3,555.21 785,754 2.39 % 4,192.53 1,085,001 1.93 % 3,412.61 152,017 2.13 % 3,540.66 morati morati mor ati 2,048,887 1.02 % 1,805.67 6 0.31 % 618.11 93,053 0.86 % 2,342.97 10,982 1.83 % 2,776.13 960,889 1.04 % 1,770.50 336,918 1.02 % 1,797.68 558,017 0.99 % 1,755.11 89,022 1.25 % 2,073.43 iti iti iti 1,422,557 0.71 % 1,253.69 1 0.05 % 103.02 87,499 0.81 % 2,203.13 4,730 0.79 % 1,195.69 635,556 0.69 % 1,171.05 245,607 0.75 % 1,310.48 407,923 0.72 % 1,283.02 41,241 0.58 % 960.55 začeti začeti zač eti 1,138,044 0.57 % 1,002.95 5 0.26 % 515.09 48,368 0.45 % 1,217.85 3,141 0.52 % 794.01 542,728 0.59 % 1,000.01 183,809 0.56 % 980.74 324,275 0.58 % 1,019.93 35,718 0.50 % 831.91 priti priti pri ti 1,046,777 0.52 % 922.52 2 0.10 % 206.04 71,696 0.66 % 1,805.23 3,067 0.51 % 775.30 476,007 0.51 % 877.07 175,716 0.53 % 937.56 286,526 0.51 % 901.20 33,763 0.47 % 786.38 povedati povedati pov edati 970,784 0.48 % 855.55 0 0 % 0 54,069 0.50 % 1,361.40 2,102 0.35 % 531.36 490,903 0.53 % 904.52 114,814 0.35 % 612.61 292,099 0.52 % 918.73 16,797 0.23 % 391.22 dobiti dobiti dob iti 966,276 0.48 % 851.57 25 1.31 % 2,575.46 24,925 0.23 % 627.58 2,237 0.37 % 565.49 498,593 0.54 % 918.69 162,196 0.49 % 865.42 253,287 0.45 % 796.65 25,013 0.35 % 582.58 želeti želeti žel eti 916,972 0.46 % 808.12 12 0.63 % 1,236.22 26,786 0.25 % 674.44 2,633 0.44 % 665.59 392,752 0.42 % 723.67 153,351 0.47 % 818.23 311,583 0.55 % 980.01 29,855 0.42 % 695.36 vedeti vedeti ved eti 869,903 0.43 % 766.64 1 0.05 % 103.02 115,253 1.06 % 2,901.95 3,084 0.51 % 779.60 365,237 0.39 % 672.97 174,422 0.53 % 930.66 183,677 0.33 % 577.71 28,229 0.40 % 657.49 moči moči moč i 829,908 0.41 % 731.39 0 0 % 0 65,444 
0.60 % 1,647.81 3,619 0.60 % 914.84 382,779 0.41 % 705.30 143,731 0.44 % 766.90 196,963 0.35 % 619.50 37,372 0.52 % 870.44 videti videti vid eti 729,726 0.36 % 643.10 0 0 % 0 94,473 0.87 % 2,378.73 1,702 0.28 % 430.25 273,372 0.29 % 503.71 146,780 0.45 % 783.17 178,717 0.32 % 562.11 34,682 0.49 % 807.79 praviti praviti pra viti 720,199 0.36 % 634.71 0 0 % 0 19,047 0.18 % 479.58 1,668 0.28 % 421.65 379,505 0.41 % 699.26 122,357 0.37 % 652.86 180,251 0.32 % 566.94 17,371 0.24 % 404.59 postati postati pos tati 640,874 0.32 % 564.80 8 0.42 % 824.15 24,724 0.23 % 622.52 1,981 0.33 % 500.78 280,862 0.30 % 517.51 131,659 0.40 % 702.49 166,830 0.30 % 524.72 34,810 0.49 % 810.77 reči reči reč i 623,230 0.31 % 549.25 0 0 % 0 143,415 1.32 % 3,611.03 3,775 0.63 % 954.28 220,496 0.24 % 406.28 130,052 0.40 % 693.91 102,237 0.18 % 321.56 23,255 0.33 % 541.64 dejati dejati dej ati 588,836 0.29 % 518.94 0 0 % 0 11,125 0.10 % 280.12 129 0.02 % 32.61 238,385 0.26 % 439.24 24,514 0.07 % 130.80 310,945 0.55 % 978 3,738 0.05 % 87.06 pomeniti pomeniti pom eniti 572,293 0.28 % 504.36 1 0.05 % 103.02 15,176 0.14 % 382.12 2,098 0.35 % 530.35 265,611 0.29 % 489.41 114,402 0.35 % 610.41 146,456 0.26 % 460.64 28,549 0.40 % 664.94 ostati ostati ost ati 562,250 0.28 % 495.51 1 0.05 % 103.02 26,621 0.25 % 670.29 1,527 0.25 % 386.01 268,949 0.29 % 495.56 90,209 0.27 % 481.33 157,481 0.28 % 495.32 17,462 0.24 % 406.71 dati dati dat i 532,058 0.27 % 468.90 31 1.62 % 3,193.57 38,456 0.35 % 968.28 3,258 0.54 % 823.59 242,506 0.26 % 446.83 100,968 0.31 % 538.73 125,248 0.22 % 393.94 21,591 0.30 % 502.88 odločiti odločiti odl očiti 511,266 0.26 % 450.58 0 0 % 0 11,502 0.11 % 289.61 1,705 0.28 % 431.01 253,119 0.27 % 466.39 84,393 0.26 % 450.29 150,035 0.27 % 471.90 10,512 0.15 % 244.84 najti najti naj ti 509,948 0.25 % 449.41 1 0.05 % 103.02 30,279 0.28 % 762.39 1,136 0.19 % 287.17 213,191 0.23 % 392.82 113,686 0.35 % 606.59 127,739 0.23 % 401.77 23,916 0.34 % 557.03 igrati igrati igr ati 507,131 0.25 % 
446.93 0 0 % 0 10,320 0.10 % 259.85 250 0.04 % 63.20 259,647 0.28 % 478.42 72,953 0.22 % 389.25 157,472 0.28 % 495.29 6,489 0.09 % 151.14 doseči doseči dos eči 504,161 0.25 % 444.31 2 0.10 % 206.04 5,633 0.05 % 141.83 968 0.16 % 244.70 251,182 0.27 % 462.82 61,834 0.19 % 329.93 166,863 0.30 % 524.83 17,679 0.25 % 411.76 hoteti hoteti hot eti 473,951 0.24 % 417.69 4 0.21 % 412.07 66,334 0.61 % 1,670.22 1,878 0.31 % 474.74 213,028 0.23 % 392.52 94,743 0.29 % 505.52 78,178 0.14 % 245.89 19,786 0.28 % 460.84 narediti narediti nar editi 471,071 0.23 % 415.15 8 0.42 % 824.15 26,480 0.24 % 666.74 1,294 0.21 % 327.11 200,417 0.22 % 369.28 101,403 0.31 % 541.05 124,061 0.22 % 390.20 17,408 0.24 % 405.45 govoriti govoriti gov oriti 450,094 0.23 % 396.67 0 0 % 0 37,708 0.35 % 949.45 2,215 0.37 % 559.93 204,344 0.22 % 376.52 77,505 0.24 % 413.54 107,171 0.19 % 337.08 21,151 0.30 % 492.63 delati delati del ati 437,814 0.22 % 385.84 1 0.05 % 103.02 19,899 0.18 % 501.04 1,349 0.23 % 341.01 203,224 0.22 % 374.45 83,284 0.25 % 444.38 115,508 0.20 % 363.30 14,549 0.20 % 338.86 pričakovati pričakovati pri čakovati 429,229 0.21 % 378.28 0 0 % 0 9,942 0.09 % 250.33 536 0.09 % 135.50 221,371 0.24 % 407.89 54,496 0.17 % 290.77 135,527 0.24 % 426.27 7,357 0.10 % 171.35 pomagati pomagati pom agati 425,110 0.21 % 374.65 1 0.05 % 103.02 18,730 0.17 % 471.60 929 0.15 % 234.84 187,285 0.20 % 345.08 86,740 0.26 % 462.82 109,931 0.20 % 345.76 21,494 0.30 % 500.62 uspeti uspeti usp eti 416,462 0.21 % 367.03 0 0 % 0 11,945 0.11 % 300.76 441 0.07 % 111.48 202,705 0.22 % 373.50 63,264 0.19 % 337.56 130,990 0.23 % 412 7,117 0.10 % 165.76 kazati kazati kaz ati 414,236 0.21 % 365.06 1 0.05 % 103.02 9,240 0.09 % 232.65 811 0.14 % 205.01 204,947 0.22 % 377.63 64,708 0.20 % 345.26 114,799 0.20 % 361.07 19,730 0.28 % 459.54 pokazati pokazati pok azati 412,679 0.21 % 363.69 0 0 % 0 17,416 0.16 % 438.52 714 0.12 % 180.49 195,963 0.21 % 361.07 69,548 0.21 % 371.09 112,955 0.20 % 355.27 16,083 0.23 % 374.59 
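The "Initial part of the word" and "Rest of the word" columns in the sample above can be reproduced with a simple character split. The following is a minimal Python sketch, not the LIST implementation (which is written in Java); the function names `split_initial` and `per_million` are hypothetical, and the sketch assumes that the relative frequency is the absolute frequency scaled to one million occurrences of the corresponding (sub)corpus.

```python
from typing import Tuple


def split_initial(lemma: str, n: int) -> Tuple[str, str]:
    """Return (first n characters, remainder) of the lower-cased lemma."""
    lower = lemma.lower()
    return lower[:n], lower[n:]


def per_million(count: int, total: int) -> float:
    """Scale an absolute frequency to occurrences per million.

    Assumption: `total` is the size of the (sub)corpus the count was taken from.
    """
    return count * 1_000_000 / total


if __name__ == "__main__":
    # Lemmas taken from the sample table of initial 3-grams.
    for lemma in ["biti", "imeti", "iti", "pričakovati"]:
        initial, rest = split_initial(lemma, 3)
        print(lemma, repr(initial), repr(rest))
```

For a lemma that is exactly n characters long, such as iti with n = 3, the remainder is an empty string, which matches the empty "Rest of the word" cell in the sample.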
1.2.44 List of initial character-level 4-grams from verb lemmas in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-word_parts-verbs-lemmas-initial-4grams-taxonomy-entire.tsv
Columns: Lemma; Lemma (lower-case); Initial part of the word; Rest of the word; Total absolute frequency of lemma; Percentage of all found lemmas; Total relative frequency (per million occurrences); Absolute frequency [SSJ.T.K.N]; Percentage [SSJ.T.K.N]; Relative frequency [SSJ.T.K.N]; Absolute frequency [SSJ.T.K.L]; Percentage [SSJ.T.K.L]; Relative frequency [SSJ.T.K.L]; Absolute frequency [SSJ.T.D]; Percentage [SSJ.T.D]; Relative frequency [SSJ.T.D]; Absolute frequency [SSJ.T.P.C]; Percentage [SSJ.T.P.C]; Relative frequency [SSJ.T.P.C]; Absolute frequency [SSJ.T.P.R]; Percentage [SSJ.T.P.R]; Relative frequency [SSJ.T.P.R]; Absolute frequency [SSJ.I]; Percentage [SSJ.I]; Relative frequency [SSJ.I]; Absolute frequency [SSJ.T.K.S]; Percentage [SSJ.T.K.S]; Relative frequency [SSJ.T.K.S]
biti biti biti 91,521,762 45.98 % 80,657.66 130 6.80 % 13,392.40 5,035,173 46.82 % 126,780.21 232,963 39.10 % 58,890.53 43,157,916 46.89 % 79,521.31 14,021,776 42.97 % 74,815.59 26,234,349 46.92 % 82,513.86 2,839,455 40.02 % 66,134.28
imeti imeti imet i 4,146,931 2.08 % 3,654.67 12 0.63 % 1,236.22 179,933 1.67 % 4,530.52 14,728 2.47 % 3,723.08 1,929,486 2.10 % 3,555.21 785,754 2.41 % 4,192.53 1,085,001 1.94 % 3,412.61 152,017 2.14 % 3,540.66
morati morati mora ti 2,048,887 1.03 % 1,805.67 6 0.31 % 618.11 93,053 0.86 % 2,342.97 10,982 1.84 % 2,776.13 960,889 1.04 % 1,770.50 336,918 1.03 % 1,797.68 558,017 1.00 % 1,755.11 89,022 1.25 % 2,073.43
začeti začeti zače ti 1,138,044 0.57 % 1,002.95 5 0.26 % 515.09 48,368 0.45 % 1,217.85 3,141 0.53 % 794.01 542,728 0.59 % 1,000.01 183,809 0.56 % 980.74 324,275 0.58 % 1,019.93 35,718 0.50 % 831.91
priti priti prit i 1,046,777 0.53 % 922.52 2 0.10 % 206.04 71,696 0.67 % 1,805.23 3,067 0.52 % 775.30 476,007
0.52 % 877.07 175,716 0.54 % 937.56 286,526 0.51 % 901.20 33,763 0.48 % 786.38 povedati povedati pove dati 970,784 0.49 % 855.55 0 0 % 0 54,069 0.50 % 1,361.40 2,102 0.35 % 531.36 490,903 0.53 % 904.52 114,814 0.35 % 612.61 292,099 0.52 % 918.73 16,797 0.24 % 391.22 dobiti dobiti dobi ti 966,276 0.48 % 851.57 25 1.31 % 2,575.46 24,925 0.23 % 627.58 2,237 0.38 % 565.49 498,593 0.54 % 918.69 162,196 0.50 % 865.42 253,287 0.45 % 796.65 25,013 0.35 % 582.58 želeti želeti žele ti 916,972 0.46 % 808.12 12 0.63 % 1,236.22 26,786 0.25 % 674.44 2,633 0.44 % 665.59 392,752 0.43 % 723.67 153,351 0.47 % 818.23 311,583 0.56 % 980.01 29,855 0.42 % 695.36 vedeti vedeti vede ti 869,903 0.44 % 766.64 1 0.05 % 103.02 115,253 1.07 % 2,901.95 3,084 0.52 % 779.60 365,237 0.40 % 672.97 174,422 0.53 % 930.66 183,677 0.33 % 577.71 28,229 0.40 % 657.49 moči moči moči 829,908 0.42 % 731.39 0 0 % 0 65,444 0.61 % 1,647.81 3,619 0.61 % 914.84 382,779 0.42 % 705.30 143,731 0.44 % 766.90 196,963 0.35 % 619.50 37,372 0.53 % 870.44 videti videti vide ti 729,726 0.37 % 643.10 0 0 % 0 94,473 0.88 % 2,378.73 1,702 0.29 % 430.25 273,372 0.30 % 503.71 146,780 0.45 % 783.17 178,717 0.32 % 562.11 34,682 0.49 % 807.79 praviti praviti prav iti 720,199 0.36 % 634.71 0 0 % 0 19,047 0.18 % 479.58 1,668 0.28 % 421.65 379,505 0.41 % 699.26 122,357 0.38 % 652.86 180,251 0.32 % 566.94 17,371 0.24 % 404.59 postati postati post ati 640,874 0.32 % 564.80 8 0.42 % 824.15 24,724 0.23 % 622.52 1,981 0.33 % 500.78 280,862 0.30 % 517.51 131,659 0.40 % 702.49 166,830 0.30 % 524.72 34,810 0.49 % 810.77 reči reči reči 623,230 0.31 % 549.25 0 0 % 0 143,415 1.33 % 3,611.03 3,775 0.63 % 954.28 220,496 0.24 % 406.28 130,052 0.40 % 693.91 102,237 0.18 % 321.56 23,255 0.33 % 541.64 dejati dejati deja ti 588,836 0.30 % 518.94 0 0 % 0 11,125 0.10 % 280.12 129 0.02 % 32.61 238,385 0.26 % 439.24 24,514 0.07 % 130.80 310,945 0.56 % 978 3,738 0.05 % 87.06 pomeniti pomeniti pome niti 572,293 0.29 % 504.36 1 0.05 % 103.02 15,176 0.14 % 
382.12 2,098 0.35 % 530.35 265,611 0.29 % 489.41 114,402 0.35 % 610.41 146,456 0.26 % 460.64 28,549 0.40 % 664.94 ostati ostati osta ti 562,250 0.28 % 495.51 1 0.05 % 103.02 26,621 0.25 % 670.29 1,527 0.26 % 386.01 268,949 0.29 % 495.56 90,209 0.28 % 481.33 157,481 0.28 % 495.32 17,462 0.25 % 406.71 dati dati dati 532,058 0.27 % 468.90 31 1.62 % 3,193.57 38,456 0.36 % 968.28 3,258 0.55 % 823.59 242,506 0.26 % 446.83 100,968 0.31 % 538.73 125,248 0.22 % 393.94 21,591 0.30 % 502.88 odločiti odločiti odlo čiti 511,266 0.26 % 450.58 0 0 % 0 11,502 0.11 % 289.61 1,705 0.29 % 431.01 253,119 0.28 % 466.39 84,393 0.26 % 450.29 150,035 0.27 % 471.90 10,512 0.15 % 244.84 najti najti najt i 509,948 0.26 % 449.41 1 0.05 % 103.02 30,279 0.28 % 762.39 1,136 0.19 % 287.17 213,191 0.23 % 392.82 113,686 0.35 % 606.59 127,739 0.23 % 401.77 23,916 0.34 % 557.03 igrati igrati igra ti 507,131 0.26 % 446.93 0 0 % 0 10,320 0.10 % 259.85 250 0.04 % 63.20 259,647 0.28 % 478.42 72,953 0.22 % 389.25 157,472 0.28 % 495.29 6,489 0.09 % 151.14 doseči doseči dose či 504,161 0.25 % 444.31 2 0.10 % 206.04 5,633 0.05 % 141.83 968 0.16 % 244.70 251,182 0.27 % 462.82 61,834 0.19 % 329.93 166,863 0.30 % 524.83 17,679 0.25 % 411.76 hoteti hoteti hote ti 473,951 0.24 % 417.69 4 0.21 % 412.07 66,334 0.62 % 1,670.22 1,878 0.32 % 474.74 213,028 0.23 % 392.52 94,743 0.29 % 505.52 78,178 0.14 % 245.89 19,786 0.28 % 460.84 narediti narediti nare diti 471,071 0.24 % 415.15 8 0.42 % 824.15 26,480 0.25 % 666.74 1,294 0.22 % 327.11 200,417 0.22 % 369.28 101,403 0.31 % 541.05 124,061 0.22 % 390.20 17,408 0.24 % 405.45 govoriti govoriti govo riti 450,094 0.23 % 396.67 0 0 % 0 37,708 0.35 % 949.45 2,215 0.37 % 559.93 204,344 0.22 % 376.52 77,505 0.24 % 413.54 107,171 0.19 % 337.08 21,151 0.30 % 492.63 delati delati dela ti 437,814 0.22 % 385.84 1 0.05 % 103.02 19,899 0.18 % 501.04 1,349 0.23 % 341.01 203,224 0.22 % 374.45 83,284 0.26 % 444.38 115,508 0.21 % 363.30 14,549 0.20 % 338.86 pričakovati pričakovati prič 
akovati 429,229 0.22 % 378.28 0 0 % 0 9,942 0.09 % 250.33 536 0.09 % 135.50 221,371 0.24 % 407.89 54,496 0.17 % 290.77 135,527 0.24 % 426.27 7,357 0.10 % 171.35
pomagati pomagati poma gati 425,110 0.21 % 374.65 1 0.05 % 103.02 18,730 0.17 % 471.60 929 0.16 % 234.84 187,285 0.20 % 345.08 86,740 0.27 % 462.82 109,931 0.20 % 345.76 21,494 0.30 % 500.62
uspeti uspeti uspe ti 416,462 0.21 % 367.03 0 0 % 0 11,945 0.11 % 300.76 441 0.07 % 111.48 202,705 0.22 % 373.50 63,264 0.19 % 337.56 130,990 0.23 % 412 7,117 0.10 % 165.76
kazati kazati kaza ti 414,236 0.21 % 365.06 1 0.05 % 103.02 9,240 0.09 % 232.65 811 0.14 % 205.01 204,947 0.22 % 377.63 64,708 0.20 % 345.26 114,799 0.20 % 361.07 19,730 0.28 % 459.54
pokazati pokazati poka zati 412,679 0.21 % 363.69 0 0 % 0 17,416 0.16 % 438.52 714 0.12 % 180.49 195,963 0.21 % 361.07 69,548 0.21 % 371.09 112,955 0.20 % 355.27 16,083 0.23 % 374.59
voditi voditi vodi ti 410,241 0.21 % 361.54 1 0.05 % 103.02 7,552 0.07 % 190.15 1,581 0.27 % 399.66 209,331 0.23 % 385.71 49,427 0.15 % 263.73 128,999 0.23 % 405.74 13,350 0.19 % 310.94

1.2.45 List of initial character-level 5-grams from verb lemmas in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-word_parts-verbs-lemmas-initial-5grams-taxonomy-entire.tsv
Columns: Lemma; Lemma (lower-case); Initial part of the word; Rest of the word; Total absolute frequency of lemma; Percentage of all found lemmas; Total relative frequency (per million occurrences); Absolute frequency [SSJ.T.K.N]; Percentage [SSJ.T.K.N]; Relative frequency [SSJ.T.K.N]; Absolute frequency [SSJ.T.K.L]; Percentage [SSJ.T.K.L]; Relative frequency [SSJ.T.K.L]; Absolute frequency [SSJ.T.D]; Percentage [SSJ.T.D]; Relative frequency [SSJ.T.D]; Absolute frequency [SSJ.T.P.C]; Percentage [SSJ.T.P.C]; Relative frequency [SSJ.T.P.C]; Absolute frequency [SSJ.T.P.R]; Percentage [SSJ.T.P.R]; Relative frequency [SSJ.T.P.R]; Absolute frequency [SSJ.I]; Percentage [SSJ.I]; Relative frequency [SSJ.I]; Absolute frequency [SSJ.T.K.S]; Percentage [SSJ.T.K.S]; Relative frequency [SSJ.T.K.S]
imeti imeti imeti 4,146,931 3.96 % 3,654.67 12 0.70 % 1,236.22 179,933 3.32 % 4,530.52 14,728 4.20 % 3,723.08 1,929,486 4.04 % 3,555.21 785,754 4.34 % 4,192.53 1,085,001 3.73 % 3,412.61 152,017 3.67 % 3,540.66
morati morati morat i 2,048,887 1.95 % 1,805.67 6 0.35 % 618.11 93,053 1.72 % 2,342.97 10,982 3.13 % 2,776.13 960,889 2.01 % 1,770.50 336,918 1.86 % 1,797.68 558,017 1.92 % 1,755.11 89,022 2.15 % 2,073.43
začeti začeti začet i 1,138,044 1.08 % 1,002.95 5 0.29 % 515.09 48,368 0.89 % 1,217.85 3,141 0.90 % 794.01 542,728 1.14 % 1,000.01 183,809 1.01 % 980.74 324,275 1.11 % 1,019.93 35,718 0.86 % 831.91
priti priti priti 1,046,777 1.00 % 922.52 2 0.12 % 206.04 71,696 1.32 % 1,805.23 3,067 0.88 % 775.30 476,007 1.00 % 877.07 175,716 0.97 % 937.56 286,526 0.98 % 901.20 33,763 0.81 % 786.38
povedati povedati poved ati 970,784 0.93 % 855.55 0 0 % 0 54,069 1.00 % 1,361.40 2,102 0.60 % 531.36 490,903 1.03 % 904.52 114,814 0.63 % 612.61 292,099 1.00 % 918.73 16,797 0.41 % 391.22
dobiti dobiti dobit i 966,276 0.92 % 851.57 25 1.46 % 2,575.46 24,925 0.46 % 627.58 2,237 0.64 % 565.49 498,593 1.04 % 918.69 162,196 0.90 % 865.42 253,287 0.87 % 796.65 25,013 0.60 % 582.58
želeti želeti želet i 916,972 0.87 % 808.12 12 0.70 % 1,236.22 26,786 0.49 % 674.44 2,633 0.75 % 665.59 392,752 0.82 % 723.67 153,351 0.85 % 818.23 311,583 1.07 % 980.01 29,855 0.72 % 695.36
vedeti vedeti vedet i 869,903 0.83 % 766.64 1 0.06 % 103.02 115,253 2.12 % 2,901.95 3,084 0.88 % 779.60 365,237 0.77 % 672.97 174,422 0.96 % 930.66 183,677 0.63 % 577.71 28,229 0.68 % 657.49
videti videti videt i 729,726 0.70 % 643.10 0 0 % 0 94,473 1.74 % 2,378.73 1,702 0.49 % 430.25 273,372 0.57 % 503.71 146,780 0.81 % 783.17 178,717 0.61 % 562.11 34,682 0.84 % 807.79
praviti praviti pravi ti 720,199 0.69 % 634.71 0 0 % 0 19,047 0.35 % 479.58 1,668 0.48 % 421.65 379,505 0.80 % 699.26
122,357 0.68 % 652.86 180,251 0.62 % 566.94 17,371 0.42 % 404.59 postati postati posta ti 640,874 0.61 % 564.80 8 0.47 % 824.15 24,724 0.46 % 622.52 1,981 0.56 % 500.78 280,862 0.59 % 517.51 131,659 0.73 % 702.49 166,830 0.57 % 524.72 34,810 0.84 % 810.77 dejati dejati dejat i 588,836 0.56 % 518.94 0 0 % 0 11,125 0.20 % 280.12 129 0.04 % 32.61 238,385 0.50 % 439.24 24,514 0.14 % 130.80 310,945 1.07 % 978 3,738 0.09 % 87.06 pomeniti pomeniti pomen iti 572,293 0.55 % 504.36 1 0.06 % 103.02 15,176 0.28 % 382.12 2,098 0.60 % 530.35 265,611 0.56 % 489.41 114,402 0.63 % 610.41 146,456 0.50 % 460.64 28,549 0.69 % 664.94 ostati ostati ostat i 562,250 0.54 % 495.51 1 0.06 % 103.02 26,621 0.49 % 670.29 1,527 0.44 % 386.01 268,949 0.56 % 495.56 90,209 0.50 % 481.33 157,481 0.54 % 495.32 17,462 0.42 % 406.71 odločiti odločiti odloč iti 511,266 0.49 % 450.58 0 0 % 0 11,502 0.21 % 289.61 1,705 0.49 % 431.01 253,119 0.53 % 466.39 84,393 0.47 % 450.29 150,035 0.52 % 471.90 10,512 0.25 % 244.84 najti najti najti 509,948 0.49 % 449.41 1 0.06 % 103.02 30,279 0.56 % 762.39 1,136 0.32 % 287.17 213,191 0.45 % 392.82 113,686 0.63 % 606.59 127,739 0.44 % 401.77 23,916 0.58 % 557.03 igrati igrati igrat i 507,131 0.48 % 446.93 0 0 % 0 10,320 0.19 % 259.85 250 0.07 % 63.20 259,647 0.54 % 478.42 72,953 0.40 % 389.25 157,472 0.54 % 495.29 6,489 0.16 % 151.14 doseči doseči doseč i 504,161 0.48 % 444.31 2 0.12 % 206.04 5,633 0.10 % 141.83 968 0.28 % 244.70 251,182 0.53 % 462.82 61,834 0.34 % 329.93 166,863 0.57 % 524.83 17,679 0.43 % 411.76 hoteti hoteti hotet i 473,951 0.45 % 417.69 4 0.23 % 412.07 66,334 1.22 % 1,670.22 1,878 0.54 % 474.74 213,028 0.45 % 392.52 94,743 0.52 % 505.52 78,178 0.27 % 245.89 19,786 0.48 % 460.84 narediti narediti nared iti 471,071 0.45 % 415.15 8 0.47 % 824.15 26,480 0.49 % 666.74 1,294 0.37 % 327.11 200,417 0.42 % 369.28 101,403 0.56 % 541.05 124,061 0.43 % 390.20 17,408 0.42 % 405.45 govoriti govoriti govor iti 450,094 0.43 % 396.67 0 0 % 0 37,708 0.69 % 949.45 
2,215 0.63 % 559.93 204,344 0.43 % 376.52 77,505 0.43 % 413.54 107,171 0.37 % 337.08 21,151 0.51 % 492.63 delati delati delat i 437,814 0.42 % 385.84 1 0.06 % 103.02 19,899 0.37 % 501.04 1,349 0.39 % 341.01 203,224 0.43 % 374.45 83,284 0.46 % 444.38 115,508 0.40 % 363.30 14,549 0.35 % 338.86 pričakovati pričakovati priča kovati 429,229 0.41 % 378.28 0 0 % 0 9,942 0.18 % 250.33 536 0.15 % 135.50 221,371 0.46 % 407.89 54,496 0.30 % 290.77 135,527 0.47 % 426.27 7,357 0.18 % 171.35 pomagati pomagati pomag ati 425,110 0.41 % 374.65 1 0.06 % 103.02 18,730 0.34 % 471.60 929 0.27 % 234.84 187,285 0.39 % 345.08 86,740 0.48 % 462.82 109,931 0.38 % 345.76 21,494 0.52 % 500.62 uspeti uspeti uspet i 416,462 0.40 % 367.03 0 0 % 0 11,945 0.22 % 300.76 441 0.13 % 111.48 202,705 0.42 % 373.50 63,264 0.35 % 337.56 130,990 0.45 % 412 7,117 0.17 % 165.76 kazati kazati kazat i 414,236 0.40 % 365.06 1 0.06 % 103.02 9,240 0.17 % 232.65 811 0.23 % 205.01 204,947 0.43 % 377.63 64,708 0.36 % 345.26 114,799 0.39 % 361.07 19,730 0.48 % 459.54 pokazati pokazati pokaz ati 412,679 0.39 % 363.69 0 0 % 0 17,416 0.32 % 438.52 714 0.20 % 180.49 195,963 0.41 % 361.07 69,548 0.38 % 371.09 112,955 0.39 % 355.27 16,083 0.39 % 374.59 voditi voditi vodit i 410,241 0.39 % 361.54 1 0.06 % 103.02 7,552 0.14 % 190.15 1,581 0.45 % 399.66 209,331 0.44 % 385.71 49,427 0.27 % 263.73 128,999 0.44 % 405.74 13,350 0.32 % 310.94 dodati dodati dodat i 399,061 0.38 % 351.69 136 7.95 % 14,010.51 7,250 0.13 % 182.55 1,166 0.33 % 294.75 151,686 0.32 % 279.49 61,463 0.34 % 327.95 163,610 0.56 % 514.60 13,750 0.33 % 320.25 veljati veljati velja ti 399,008 0.38 % 351.64 0 0 % 0 3,601 0.07 % 90.67 2,893 0.83 % 731.32 194,170 0.41 % 357.77 71,297 0.39 % 380.42 110,243 0.38 % 346.74 16,804 0.41 % 391.39 zgoditi zgoditi zgodi ti 398,852 0.38 % 351.51 0 0 % 0 21,111 0.39 % 531.55 911 0.26 % 230.29 185,364 0.39 % 341.55 67,503 0.37 % 360.17 112,678 0.39 % 354.40 11,285 0.27 % 262.84 pripraviti pripraviti pripr aviti 393,709 0.38 % 
346.97 23 1.34 % 2,369.42 7,290 0.13 % 183.55 1,084 0.31 % 274.02 229,431 0.48 % 422.74 56,054 0.31 % 299.09 90,982 0.31 % 286.16 8,845 0.21 % 206.01

1.2.46 List of final character-level 1-grams from verb lemmas in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-word_parts-verbs-lemmas-final-1grams-taxonomy-entire.tsv
Columns: Lemma; Lemma (lower-case); Rest of the word; Final part of the word; Total absolute frequency of lemma; Percentage of all found lemmas; Total relative frequency (per million occurrences); Absolute frequency [SSJ.T.K.N]; Percentage [SSJ.T.K.N]; Relative frequency [SSJ.T.K.N]; Absolute frequency [SSJ.T.K.L]; Percentage [SSJ.T.K.L]; Relative frequency [SSJ.T.K.L]; Absolute frequency [SSJ.T.D]; Percentage [SSJ.T.D]; Relative frequency [SSJ.T.D]; Absolute frequency [SSJ.T.P.C]; Percentage [SSJ.T.P.C]; Relative frequency [SSJ.T.P.C]; Absolute frequency [SSJ.T.P.R]; Percentage [SSJ.T.P.R]; Relative frequency [SSJ.T.P.R]; Absolute frequency [SSJ.I]; Percentage [SSJ.I]; Relative frequency [SSJ.I]; Absolute frequency [SSJ.T.K.S]; Percentage [SSJ.T.K.S]; Relative frequency [SSJ.T.K.S]
biti biti bit i 91,521,762 45.65 % 80,657.66 130 6.79 % 13,392.40 5,035,173 46.44 % 126,780.21 232,963 38.78 % 58,890.53 43,157,916 46.56 % 79,521.31 14,021,776 42.64 % 74,815.59 26,234,349 46.57 % 82,513.86 2,839,455 39.78 % 66,134.28
imeti imeti imet i 4,146,931 2.07 % 3,654.67 12 0.63 % 1,236.22 179,933 1.66 % 4,530.52 14,728 2.45 % 3,723.08 1,929,486 2.08 % 3,555.21 785,754 2.39 % 4,192.53 1,085,001 1.93 % 3,412.61 152,017 2.13 % 3,540.66
morati morati morat i 2,048,887 1.02 % 1,805.67 6 0.31 % 618.11 93,053 0.86 % 2,342.97 10,982 1.83 % 2,776.13 960,889 1.04 % 1,770.50 336,918 1.02 % 1,797.68 558,017 0.99 % 1,755.11 89,022 1.25 % 2,073.43
iti iti it i 1,422,557 0.71 % 1,253.69 1 0.05 % 103.02 87,499 0.81 % 2,203.13 4,730 0.79 % 1,195.69 635,556 0.69 % 1,171.05 245,607 0.75 % 1,310.48 407,923
0.72 % 1,283.02 41,241 0.58 % 960.55 začeti začeti začet i 1,138,044 0.57 % 1,002.95 5 0.26 % 515.09 48,368 0.45 % 1,217.85 3,141 0.52 % 794.01 542,728 0.59 % 1,000.01 183,809 0.56 % 980.74 324,275 0.58 % 1,019.93 35,718 0.50 % 831.91 priti priti prit i 1,046,777 0.52 % 922.52 2 0.10 % 206.04 71,696 0.66 % 1,805.23 3,067 0.51 % 775.30 476,007 0.51 % 877.07 175,716 0.53 % 937.56 286,526 0.51 % 901.20 33,763 0.47 % 786.38 povedati povedati povedat i 970,784 0.48 % 855.55 0 0 % 0 54,069 0.50 % 1,361.40 2,102 0.35 % 531.36 490,903 0.53 % 904.52 114,814 0.35 % 612.61 292,099 0.52 % 918.73 16,797 0.23 % 391.22 dobiti dobiti dobit i 966,276 0.48 % 851.57 25 1.31 % 2,575.46 24,925 0.23 % 627.58 2,237 0.37 % 565.49 498,593 0.54 % 918.69 162,196 0.49 % 865.42 253,287 0.45 % 796.65 25,013 0.35 % 582.58 želeti želeti želet i 916,972 0.46 % 808.12 12 0.63 % 1,236.22 26,786 0.25 % 674.44 2,633 0.44 % 665.59 392,752 0.42 % 723.67 153,351 0.47 % 818.23 311,583 0.55 % 980.01 29,855 0.42 % 695.36 vedeti vedeti vedet i 869,903 0.43 % 766.64 1 0.05 % 103.02 115,253 1.06 % 2,901.95 3,084 0.51 % 779.60 365,237 0.39 % 672.97 174,422 0.53 % 930.66 183,677 0.33 % 577.71 28,229 0.40 % 657.49 moči moči moč i 829,908 0.41 % 731.39 0 0 % 0 65,444 0.60 % 1,647.81 3,619 0.60 % 914.84 382,779 0.41 % 705.30 143,731 0.44 % 766.90 196,963 0.35 % 619.50 37,372 0.52 % 870.44 videti videti videt i 729,726 0.36 % 643.10 0 0 % 0 94,473 0.87 % 2,378.73 1,702 0.28 % 430.25 273,372 0.29 % 503.71 146,780 0.45 % 783.17 178,717 0.32 % 562.11 34,682 0.49 % 807.79 praviti praviti pravit i 720,199 0.36 % 634.71 0 0 % 0 19,047 0.18 % 479.58 1,668 0.28 % 421.65 379,505 0.41 % 699.26 122,357 0.37 % 652.86 180,251 0.32 % 566.94 17,371 0.24 % 404.59 postati postati postat i 640,874 0.32 % 564.80 8 0.42 % 824.15 24,724 0.23 % 622.52 1,981 0.33 % 500.78 280,862 0.30 % 517.51 131,659 0.40 % 702.49 166,830 0.30 % 524.72 34,810 0.49 % 810.77 reči reči reč i 623,230 0.31 % 549.25 0 0 % 0 143,415 1.32 % 3,611.03 3,775 0.63 % 
954.28 220,496 0.24 % 406.28 130,052 0.40 % 693.91 102,237 0.18 % 321.56 23,255 0.33 % 541.64 dejati dejati dejat i 588,836 0.29 % 518.94 0 0 % 0 11,125 0.10 % 280.12 129 0.02 % 32.61 238,385 0.26 % 439.24 24,514 0.07 % 130.80 310,945 0.55 % 978 3,738 0.05 % 87.06 pomeniti pomeniti pomenit i 572,293 0.28 % 504.36 1 0.05 % 103.02 15,176 0.14 % 382.12 2,098 0.35 % 530.35 265,611 0.29 % 489.41 114,402 0.35 % 610.41 146,456 0.26 % 460.64 28,549 0.40 % 664.94 ostati ostati ostat i 562,250 0.28 % 495.51 1 0.05 % 103.02 26,621 0.25 % 670.29 1,527 0.25 % 386.01 268,949 0.29 % 495.56 90,209 0.27 % 481.33 157,481 0.28 % 495.32 17,462 0.24 % 406.71 dati dati dat i 532,058 0.27 % 468.90 31 1.62 % 3,193.57 38,456 0.35 % 968.28 3,258 0.54 % 823.59 242,506 0.26 % 446.83 100,968 0.31 % 538.73 125,248 0.22 % 393.94 21,591 0.30 % 502.88 odločiti odločiti odločit i 511,266 0.26 % 450.58 0 0 % 0 11,502 0.11 % 289.61 1,705 0.28 % 431.01 253,119 0.27 % 466.39 84,393 0.26 % 450.29 150,035 0.27 % 471.90 10,512 0.15 % 244.84 najti najti najt i 509,948 0.25 % 449.41 1 0.05 % 103.02 30,279 0.28 % 762.39 1,136 0.19 % 287.17 213,191 0.23 % 392.82 113,686 0.35 % 606.59 127,739 0.23 % 401.77 23,916 0.34 % 557.03 igrati igrati igrat i 507,131 0.25 % 446.93 0 0 % 0 10,320 0.10 % 259.85 250 0.04 % 63.20 259,647 0.28 % 478.42 72,953 0.22 % 389.25 157,472 0.28 % 495.29 6,489 0.09 % 151.14 doseči doseči doseč i 504,161 0.25 % 444.31 2 0.10 % 206.04 5,633 0.05 % 141.83 968 0.16 % 244.70 251,182 0.27 % 462.82 61,834 0.19 % 329.93 166,863 0.30 % 524.83 17,679 0.25 % 411.76 hoteti hoteti hotet i 473,951 0.24 % 417.69 4 0.21 % 412.07 66,334 0.61 % 1,670.22 1,878 0.31 % 474.74 213,028 0.23 % 392.52 94,743 0.29 % 505.52 78,178 0.14 % 245.89 19,786 0.28 % 460.84 narediti narediti naredit i 471,071 0.23 % 415.15 8 0.42 % 824.15 26,480 0.24 % 666.74 1,294 0.21 % 327.11 200,417 0.22 % 369.28 101,403 0.31 % 541.05 124,061 0.22 % 390.20 17,408 0.24 % 405.45 govoriti govoriti govorit i 450,094 0.22 % 396.67 0 0 % 0 
37,708 0.35 % 949.45 2,215 0.37 % 559.93 204,344 0.22 % 376.52 77,505 0.24 % 413.54 107,171 0.19 % 337.08 21,151 0.30 % 492.63
delati delati delat i 437,814 0.22 % 385.84 1 0.05 % 103.02 19,899 0.18 % 501.04 1,349 0.23 % 341.01 203,224 0.22 % 374.45 83,284 0.25 % 444.38 115,508 0.20 % 363.30 14,549 0.20 % 338.86
pričakovati pričakovati pričakovat i 429,229 0.21 % 378.28 0 0 % 0 9,942 0.09 % 250.33 536 0.09 % 135.50 221,371 0.24 % 407.89 54,496 0.17 % 290.77 135,527 0.24 % 426.27 7,357 0.10 % 171.35
pomagati pomagati pomagat i 425,110 0.21 % 374.65 1 0.05 % 103.02 18,730 0.17 % 471.60 929 0.15 % 234.84 187,285 0.20 % 345.08 86,740 0.26 % 462.82 109,931 0.20 % 345.76 21,494 0.30 % 500.62
uspeti uspeti uspet i 416,462 0.21 % 367.03 0 0 % 0 11,945 0.11 % 300.76 441 0.07 % 111.48 202,705 0.22 % 373.50 63,264 0.19 % 337.56 130,990 0.23 % 412 7,117 0.10 % 165.76
kazati kazati kazat i 414,236 0.21 % 365.06 1 0.05 % 103.02 9,240 0.09 % 232.65 811 0.14 % 205.01 204,947 0.22 % 377.63 64,708 0.20 % 345.26 114,799 0.20 % 361.07 19,730 0.28 % 459.54
pokazati pokazati pokazat i 412,679 0.21 % 363.69 0 0 % 0 17,416 0.16 % 438.52 714 0.12 % 180.49 195,963 0.21 % 361.07 69,548 0.21 % 371.09 112,955 0.20 % 355.27 16,083 0.23 % 374.59

1.2.47 List of final character-level 2-grams from verb lemmas in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-word_parts-verbs-lemmas-final-2grams-taxonomy-entire.tsv
Columns: Lemma; Lemma (lower-case); Rest of the word; Final part of the word; Total absolute frequency of lemma; Percentage of all found lemmas; Total relative frequency (per million occurrences); Absolute frequency [SSJ.T.K.N]; Percentage [SSJ.T.K.N]; Relative frequency [SSJ.T.K.N]; Absolute frequency [SSJ.T.K.L]; Percentage [SSJ.T.K.L]; Relative frequency [SSJ.T.K.L]; Absolute frequency [SSJ.T.D]; Percentage [SSJ.T.D]; Relative frequency [SSJ.T.D]; Absolute frequency [SSJ.T.P.C]; Percentage [SSJ.T.P.C]; Relative frequency [SSJ.T.P.C]; Absolute frequency [SSJ.T.P.R]; Percentage [SSJ.T.P.R]; Relative frequency [SSJ.T.P.R]; Absolute frequency [SSJ.I]; Percentage [SSJ.I]; Relative frequency [SSJ.I]; Absolute frequency [SSJ.T.K.S]; Percentage [SSJ.T.K.S]; Relative frequency [SSJ.T.K.S]
biti biti bi ti 91,521,762 45.65 % 80,657.66 130 6.79 % 13,392.40 5,035,173 46.44 % 126,780.21 232,963 38.78 % 58,890.53 43,157,916 46.56 % 79,521.31 14,021,776 42.64 % 74,815.59 26,234,349 46.57 % 82,513.86 2,839,455 39.78 % 66,134.28
imeti imeti ime ti 4,146,931 2.07 % 3,654.67 12 0.63 % 1,236.22 179,933 1.66 % 4,530.52 14,728 2.45 % 3,723.08 1,929,486 2.08 % 3,555.21 785,754 2.39 % 4,192.53 1,085,001 1.93 % 3,412.61 152,017 2.13 % 3,540.66
morati morati mora ti 2,048,887 1.02 % 1,805.67 6 0.31 % 618.11 93,053 0.86 % 2,342.97 10,982 1.83 % 2,776.13 960,889 1.04 % 1,770.50 336,918 1.02 % 1,797.68 558,017 0.99 % 1,755.11 89,022 1.25 % 2,073.43
iti iti i ti 1,422,557 0.71 % 1,253.69 1 0.05 % 103.02 87,499 0.81 % 2,203.13 4,730 0.79 % 1,195.69 635,556 0.69 % 1,171.05 245,607 0.75 % 1,310.48 407,923 0.72 % 1,283.02 41,241 0.58 % 960.55
začeti začeti zače ti 1,138,044 0.57 % 1,002.95 5 0.26 % 515.09 48,368 0.45 % 1,217.85 3,141 0.52 % 794.01 542,728 0.59 % 1,000.01 183,809 0.56 % 980.74 324,275 0.58 % 1,019.93 35,718 0.50 % 831.91
priti priti pri ti 1,046,777 0.52 % 922.52 2 0.10 % 206.04 71,696 0.66 % 1,805.23 3,067 0.51 % 775.30 476,007 0.51 % 877.07 175,716 0.53 % 937.56 286,526 0.51 % 901.20 33,763 0.47 % 786.38
povedati povedati poveda ti 970,784 0.48 % 855.55 0 0 % 0 54,069 0.50 % 1,361.40 2,102 0.35 % 531.36 490,903 0.53 % 904.52 114,814 0.35 % 612.61 292,099 0.52 % 918.73 16,797 0.23 % 391.22
dobiti dobiti dobi ti 966,276 0.48 % 851.57 25 1.31 % 2,575.46 24,925 0.23 % 627.58 2,237 0.37 % 565.49 498,593 0.54 % 918.69 162,196 0.49 % 865.42 253,287 0.45 % 796.65 25,013 0.35 % 582.58
želeti želeti žele ti 916,972 0.46 % 808.12 12 0.63 % 1,236.22 26,786 0.25 % 674.44 2,633 0.44 % 665.59 392,752
0.42 % 723.67 153,351 0.47 % 818.23 311,583 0.55 % 980.01 29,855 0.42 % 695.36 vedeti vedeti vede ti 869,903 0.43 % 766.64 1 0.05 % 103.02 115,253 1.06 % 2,901.95 3,084 0.51 % 779.60 365,237 0.39 % 672.97 174,422 0.53 % 930.66 183,677 0.33 % 577.71 28,229 0.40 % 657.49 moči moči mo či 829,908 0.41 % 731.39 0 0 % 0 65,444 0.60 % 1,647.81 3,619 0.60 % 914.84 382,779 0.41 % 705.30 143,731 0.44 % 766.90 196,963 0.35 % 619.50 37,372 0.52 % 870.44 videti videti vide ti 729,726 0.36 % 643.10 0 0 % 0 94,473 0.87 % 2,378.73 1,702 0.28 % 430.25 273,372 0.29 % 503.71 146,780 0.45 % 783.17 178,717 0.32 % 562.11 34,682 0.49 % 807.79 praviti praviti pravi ti 720,199 0.36 % 634.71 0 0 % 0 19,047 0.18 % 479.58 1,668 0.28 % 421.65 379,505 0.41 % 699.26 122,357 0.37 % 652.86 180,251 0.32 % 566.94 17,371 0.24 % 404.59 postati postati posta ti 640,874 0.32 % 564.80 8 0.42 % 824.15 24,724 0.23 % 622.52 1,981 0.33 % 500.78 280,862 0.30 % 517.51 131,659 0.40 % 702.49 166,830 0.30 % 524.72 34,810 0.49 % 810.77 reči reči re či 623,230 0.31 % 549.25 0 0 % 0 143,415 1.32 % 3,611.03 3,775 0.63 % 954.28 220,496 0.24 % 406.28 130,052 0.40 % 693.91 102,237 0.18 % 321.56 23,255 0.33 % 541.64 dejati dejati deja ti 588,836 0.29 % 518.94 0 0 % 0 11,125 0.10 % 280.12 129 0.02 % 32.61 238,385 0.26 % 439.24 24,514 0.07 % 130.80 310,945 0.55 % 978 3,738 0.05 % 87.06 pomeniti pomeniti pomeni ti 572,293 0.28 % 504.36 1 0.05 % 103.02 15,176 0.14 % 382.12 2,098 0.35 % 530.35 265,611 0.29 % 489.41 114,402 0.35 % 610.41 146,456 0.26 % 460.64 28,549 0.40 % 664.94 ostati ostati osta ti 562,250 0.28 % 495.51 1 0.05 % 103.02 26,621 0.25 % 670.29 1,527 0.25 % 386.01 268,949 0.29 % 495.56 90,209 0.27 % 481.33 157,481 0.28 % 495.32 17,462 0.24 % 406.71 dati dati da ti 532,058 0.27 % 468.90 31 1.62 % 3,193.57 38,456 0.35 % 968.28 3,258 0.54 % 823.59 242,506 0.26 % 446.83 100,968 0.31 % 538.73 125,248 0.22 % 393.94 21,591 0.30 % 502.88 odločiti odločiti odloči ti 511,266 0.26 % 450.58 0 0 % 0 11,502 0.11 % 289.61 
1,705 0.28 % 431.01 253,119 0.27 % 466.39 84,393 0.26 % 450.29 150,035 0.27 % 471.90 10,512 0.15 % 244.84 najti najti naj ti 509,948 0.25 % 449.41 1 0.05 % 103.02 30,279 0.28 % 762.39 1,136 0.19 % 287.17 213,191 0.23 % 392.82 113,686 0.35 % 606.59 127,739 0.23 % 401.77 23,916 0.34 % 557.03 igrati igrati igra ti 507,131 0.25 % 446.93 0 0 % 0 10,320 0.10 % 259.85 250 0.04 % 63.20 259,647 0.28 % 478.42 72,953 0.22 % 389.25 157,472 0.28 % 495.29 6,489 0.09 % 151.14 doseči doseči dose či 504,161 0.25 % 444.31 2 0.10 % 206.04 5,633 0.05 % 141.83 968 0.16 % 244.70 251,182 0.27 % 462.82 61,834 0.19 % 329.93 166,863 0.30 % 524.83 17,679 0.25 % 411.76 hoteti hoteti hote ti 473,951 0.24 % 417.69 4 0.21 % 412.07 66,334 0.61 % 1,670.22 1,878 0.31 % 474.74 213,028 0.23 % 392.52 94,743 0.29 % 505.52 78,178 0.14 % 245.89 19,786 0.28 % 460.84 narediti narediti naredi ti 471,071 0.23 % 415.15 8 0.42 % 824.15 26,480 0.24 % 666.74 1,294 0.21 % 327.11 200,417 0.22 % 369.28 101,403 0.31 % 541.05 124,061 0.22 % 390.20 17,408 0.24 % 405.45 govoriti govoriti govori ti 450,094 0.22 % 396.67 0 0 % 0 37,708 0.35 % 949.45 2,215 0.37 % 559.93 204,344 0.22 % 376.52 77,505 0.24 % 413.54 107,171 0.19 % 337.08 21,151 0.30 % 492.63 delati delati dela ti 437,814 0.22 % 385.84 1 0.05 % 103.02 19,899 0.18 % 501.04 1,349 0.23 % 341.01 203,224 0.22 % 374.45 83,284 0.25 % 444.38 115,508 0.20 % 363.30 14,549 0.20 % 338.86 pričakovati pričakovati pričakova ti 429,229 0.21 % 378.28 0 0 % 0 9,942 0.09 % 250.33 536 0.09 % 135.50 221,371 0.24 % 407.89 54,496 0.17 % 290.77 135,527 0.24 % 426.27 7,357 0.10 % 171.35 pomagati pomagati pomaga ti 425,110 0.21 % 374.65 1 0.05 % 103.02 18,730 0.17 % 471.60 929 0.15 % 234.84 187,285 0.20 % 345.08 86,740 0.26 % 462.82 109,931 0.20 % 345.76 21,494 0.30 % 500.62 uspeti uspeti uspe ti 416,462 0.21 % 367.03 0 0 % 0 11,945 0.11 % 300.76 441 0.07 % 111.48 202,705 0.22 % 373.50 63,264 0.19 % 337.56 130,990 0.23 % 412 7,117 0.10 % 165.76 kazati kazati kaza ti 414,236 0.21 % 
365.06 1 0.05 % 103.02 9,240 0.09 % 232.65 811 0.14 % 205.01 204,947 0.22 % 377.63 64,708 0.20 % 345.26 114,799 0.20 % 361.07 19,730 0.28 % 459.54
pokazati pokazati pokaza ti 412,679 0.21 % 363.69 0 0 % 0 17,416 0.16 % 438.52 714 0.12 % 180.49 195,963 0.21 % 361.07 69,548 0.21 % 371.09 112,955 0.20 % 355.27 16,083 0.23 % 374.59

1.2.48 List of final character-level 3-grams from verb lemmas in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-word_parts-verbs-lemmas-final-3grams-taxonomy-entire.tsv
Columns: Lemma; Lemma (lower-case); Rest of the word; Final part of the word; Total absolute frequency of lemma; Percentage of all found lemmas; Total relative frequency (per million occurrences); Absolute frequency [SSJ.T.K.N]; Percentage [SSJ.T.K.N]; Relative frequency [SSJ.T.K.N]; Absolute frequency [SSJ.T.K.L]; Percentage [SSJ.T.K.L]; Relative frequency [SSJ.T.K.L]; Absolute frequency [SSJ.T.D]; Percentage [SSJ.T.D]; Relative frequency [SSJ.T.D]; Absolute frequency [SSJ.T.P.C]; Percentage [SSJ.T.P.C]; Relative frequency [SSJ.T.P.C]; Absolute frequency [SSJ.T.P.R]; Percentage [SSJ.T.P.R]; Relative frequency [SSJ.T.P.R]; Absolute frequency [SSJ.I]; Percentage [SSJ.I]; Relative frequency [SSJ.I]; Absolute frequency [SSJ.T.K.S]; Percentage [SSJ.T.K.S]; Relative frequency [SSJ.T.K.S]
biti biti b iti 91,521,762 45.65 % 80,657.66 130 6.79 % 13,392.40 5,035,173 46.44 % 126,780.21 232,963 38.79 % 58,890.53 43,157,916 46.56 % 79,521.31 14,021,776 42.64 % 74,815.59 26,234,349 46.57 % 82,513.86 2,839,455 39.78 % 66,134.28
imeti imeti im eti 4,146,931 2.07 % 3,654.67 12 0.63 % 1,236.22 179,933 1.66 % 4,530.52 14,728 2.45 % 3,723.08 1,929,486 2.08 % 3,555.21 785,754 2.39 % 4,192.53 1,085,001 1.93 % 3,412.61 152,017 2.13 % 3,540.66
morati morati mor ati 2,048,887 1.02 % 1,805.67 6 0.31 % 618.11 93,053 0.86 % 2,342.97 10,982 1.83 % 2,776.13 960,889 1.04 % 1,770.50 336,918 1.02 % 1,797.68 558,017 0.99 % 1,755.11
89,022 1.25 % 2,073.43 iti iti iti 1,422,557 0.71 % 1,253.69 1 0.05 % 103.02 87,499 0.81 % 2,203.13 4,730 0.79 % 1,195.69 635,556 0.69 % 1,171.05 245,607 0.75 % 1,310.48 407,923 0.72 % 1,283.02 41,241 0.58 % 960.55 začeti začeti zač eti 1,138,044 0.57 % 1,002.95 5 0.26 % 515.09 48,368 0.45 % 1,217.85 3,141 0.52 % 794.01 542,728 0.59 % 1,000.01 183,809 0.56 % 980.74 324,275 0.58 % 1,019.93 35,718 0.50 % 831.91 priti priti pr iti 1,046,777 0.52 % 922.52 2 0.10 % 206.04 71,696 0.66 % 1,805.23 3,067 0.51 % 775.30 476,007 0.51 % 877.07 175,716 0.53 % 937.56 286,526 0.51 % 901.20 33,763 0.47 % 786.38 povedati povedati poved ati 970,784 0.48 % 855.55 0 0 % 0 54,069 0.50 % 1,361.40 2,102 0.35 % 531.36 490,903 0.53 % 904.52 114,814 0.35 % 612.61 292,099 0.52 % 918.73 16,797 0.23 % 391.22 dobiti dobiti dob iti 966,276 0.48 % 851.57 25 1.31 % 2,575.46 24,925 0.23 % 627.58 2,237 0.37 % 565.49 498,593 0.54 % 918.69 162,196 0.49 % 865.42 253,287 0.45 % 796.65 25,013 0.35 % 582.58 želeti želeti žel eti 916,972 0.46 % 808.12 12 0.63 % 1,236.22 26,786 0.25 % 674.44 2,633 0.44 % 665.59 392,752 0.42 % 723.67 153,351 0.47 % 818.23 311,583 0.55 % 980.01 29,855 0.42 % 695.36 vedeti vedeti ved eti 869,903 0.43 % 766.64 1 0.05 % 103.02 115,253 1.06 % 2,901.95 3,084 0.51 % 779.60 365,237 0.39 % 672.97 174,422 0.53 % 930.66 183,677 0.33 % 577.71 28,229 0.40 % 657.49 moči moči m oči 829,908 0.41 % 731.39 0 0 % 0 65,444 0.60 % 1,647.81 3,619 0.60 % 914.84 382,779 0.41 % 705.30 143,731 0.44 % 766.90 196,963 0.35 % 619.50 37,372 0.52 % 870.44 videti videti vid eti 729,726 0.36 % 643.10 0 0 % 0 94,473 0.87 % 2,378.73 1,702 0.28 % 430.25 273,372 0.29 % 503.71 146,780 0.45 % 783.17 178,717 0.32 % 562.11 34,682 0.49 % 807.79 praviti praviti prav iti 720,199 0.36 % 634.71 0 0 % 0 19,047 0.18 % 479.58 1,668 0.28 % 421.65 379,505 0.41 % 699.26 122,357 0.37 % 652.86 180,251 0.32 % 566.94 17,371 0.24 % 404.59 postati postati post ati 640,874 0.32 % 564.80 8 0.42 % 824.15 24,724 0.23 % 622.52 1,981 0.33 
% 500.78 280,862 0.30 % 517.51 131,659 0.40 % 702.49 166,830 0.30 % 524.72 34,810 0.49 % 810.77 reči reči r eči 623,230 0.31 % 549.25 0 0 % 0 143,415 1.32 % 3,611.03 3,775 0.63 % 954.28 220,496 0.24 % 406.28 130,052 0.40 % 693.91 102,237 0.18 % 321.56 23,255 0.33 % 541.64 dejati dejati dej ati 588,836 0.29 % 518.94 0 0 % 0 11,125 0.10 % 280.12 129 0.02 % 32.61 238,385 0.26 % 439.24 24,514 0.07 % 130.80 310,945 0.55 % 978 3,738 0.05 % 87.06 pomeniti pomeniti pomen iti 572,293 0.28 % 504.36 1 0.05 % 103.02 15,176 0.14 % 382.12 2,098 0.35 % 530.35 265,611 0.29 % 489.41 114,402 0.35 % 610.41 146,456 0.26 % 460.64 28,549 0.40 % 664.94 ostati ostati ost ati 562,250 0.28 % 495.51 1 0.05 % 103.02 26,621 0.25 % 670.29 1,527 0.25 % 386.01 268,949 0.29 % 495.56 90,209 0.27 % 481.33 157,481 0.28 % 495.32 17,462 0.24 % 406.71 dati dati d ati 532,058 0.27 % 468.90 31 1.62 % 3,193.57 38,456 0.35 % 968.28 3,258 0.54 % 823.59 242,506 0.26 % 446.83 100,968 0.31 % 538.73 125,248 0.22 % 393.94 21,591 0.30 % 502.88 odločiti odločiti odloč iti 511,266 0.26 % 450.58 0 0 % 0 11,502 0.11 % 289.61 1,705 0.28 % 431.01 253,119 0.27 % 466.39 84,393 0.26 % 450.29 150,035 0.27 % 471.90 10,512 0.15 % 244.84 najti najti na jti 509,948 0.25 % 449.41 1 0.05 % 103.02 30,279 0.28 % 762.39 1,136 0.19 % 287.17 213,191 0.23 % 392.82 113,686 0.35 % 606.59 127,739 0.23 % 401.77 23,916 0.34 % 557.03 igrati igrati igr ati 507,131 0.25 % 446.93 0 0 % 0 10,320 0.10 % 259.85 250 0.04 % 63.20 259,647 0.28 % 478.42 72,953 0.22 % 389.25 157,472 0.28 % 495.29 6,489 0.09 % 151.14 doseči doseči dos eči 504,161 0.25 % 444.31 2 0.10 % 206.04 5,633 0.05 % 141.83 968 0.16 % 244.70 251,182 0.27 % 462.82 61,834 0.19 % 329.93 166,863 0.30 % 524.83 17,679 0.25 % 411.76 hoteti hoteti hot eti 473,951 0.24 % 417.69 4 0.21 % 412.07 66,334 0.61 % 1,670.22 1,878 0.31 % 474.74 213,028 0.23 % 392.52 94,743 0.29 % 505.52 78,178 0.14 % 245.89 19,786 0.28 % 460.84 narediti narediti nared iti 471,071 0.23 % 415.15 8 0.42 % 824.15 26,480 
0.24 % 666.74 1,294 0.21 % 327.11 200,417 0.22 % 369.28 101,403 0.31 % 541.05 124,061 0.22 % 390.20 17,408 0.24 % 405.45
govoriti govoriti govor iti 450,094 0.23 % 396.67 0 0 % 0 37,708 0.35 % 949.45 2,215 0.37 % 559.93 204,344 0.22 % 376.52 77,505 0.24 % 413.54 107,171 0.19 % 337.08 21,151 0.30 % 492.63
delati delati del ati 437,814 0.22 % 385.84 1 0.05 % 103.02 19,899 0.18 % 501.04 1,349 0.23 % 341.01 203,224 0.22 % 374.45 83,284 0.25 % 444.38 115,508 0.20 % 363.30 14,549 0.20 % 338.86
pričakovati pričakovati pričakov ati 429,229 0.21 % 378.28 0 0 % 0 9,942 0.09 % 250.33 536 0.09 % 135.50 221,371 0.24 % 407.89 54,496 0.17 % 290.77 135,527 0.24 % 426.27 7,357 0.10 % 171.35
pomagati pomagati pomag ati 425,110 0.21 % 374.65 1 0.05 % 103.02 18,730 0.17 % 471.60 929 0.15 % 234.84 187,285 0.20 % 345.08 86,740 0.26 % 462.82 109,931 0.20 % 345.76 21,494 0.30 % 500.62
uspeti uspeti usp eti 416,462 0.21 % 367.03 0 0 % 0 11,945 0.11 % 300.76 441 0.07 % 111.48 202,705 0.22 % 373.50 63,264 0.19 % 337.56 130,990 0.23 % 412 7,117 0.10 % 165.76
kazati kazati kaz ati 414,236 0.21 % 365.06 1 0.05 % 103.02 9,240 0.09 % 232.65 811 0.14 % 205.01 204,947 0.22 % 377.63 64,708 0.20 % 345.26 114,799 0.20 % 361.07 19,730 0.28 % 459.54
pokazati pokazati pokaz ati 412,679 0.21 % 363.69 0 0 % 0 17,416 0.16 % 438.52 714 0.12 % 180.49 195,963 0.21 % 361.07 69,548 0.21 % 371.09 112,955 0.20 % 355.27 16,083 0.23 % 374.59

1.2.49 List of final character-level 4-grams from verb lemmas in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-word_parts-verbs-lemmas-final-4grams-taxonomy-entire.tsv
Columns: Lemma; Lemma (lower-case); Rest of the word; Final part of the word; Total absolute frequency of lemma; Percentage of all found lemmas; Total relative frequency (per million occurrences); Absolute frequency [SSJ.T.K.N]; Percentage [SSJ.T.K.N]; Relative frequency [SSJ.T.K.N]; Absolute frequency [SSJ.T.K.L]; Percentage [SSJ.T.K.L]; Relative frequency [SSJ.T.K.L]; Absolute frequency [SSJ.T.D]; Percentage [SSJ.T.D]; Relative frequency [SSJ.T.D]; Absolute frequency [SSJ.T.P.C]; Percentage [SSJ.T.P.C]; Relative frequency [SSJ.T.P.C]; Absolute frequency [SSJ.T.P.R]; Percentage [SSJ.T.P.R]; Relative frequency [SSJ.T.P.R]; Absolute frequency [SSJ.I]; Percentage [SSJ.I]; Relative frequency [SSJ.I]; Absolute frequency [SSJ.T.K.S]; Percentage [SSJ.T.K.S]; Relative frequency [SSJ.T.K.S]
biti biti biti 91,521,762 45.98 % 80,657.66 130 6.80 % 13,392.40 5,035,173 46.82 % 126,780.21 232,963 39.10 % 58,890.53 43,157,916 46.89 % 79,521.31 14,021,776 42.97 % 74,815.59 26,234,349 46.92 % 82,513.86 2,839,455 40.02 % 66,134.28
imeti imeti i meti 4,146,931 2.08 % 3,654.67 12 0.63 % 1,236.22 179,933 1.67 % 4,530.52 14,728 2.47 % 3,723.08 1,929,486 2.10 % 3,555.21 785,754 2.41 % 4,192.53 1,085,001 1.94 % 3,412.61 152,017 2.14 % 3,540.66
morati morati mo rati 2,048,887 1.03 % 1,805.67 6 0.31 % 618.11 93,053 0.86 % 2,342.97 10,982 1.84 % 2,776.13 960,889 1.04 % 1,770.50 336,918 1.03 % 1,797.68 558,017 1.00 % 1,755.11 89,022 1.25 % 2,073.43
začeti začeti za četi 1,138,044 0.57 % 1,002.95 5 0.26 % 515.09 48,368 0.45 % 1,217.85 3,141 0.53 % 794.01 542,728 0.59 % 1,000.01 183,809 0.56 % 980.74 324,275 0.58 % 1,019.93 35,718 0.50 % 831.91
priti priti p riti 1,046,777 0.53 % 922.52 2 0.10 % 206.04 71,696 0.67 % 1,805.23 3,067 0.52 % 775.30 476,007 0.52 % 877.07 175,716 0.54 % 937.56 286,526 0.51 % 901.20 33,763 0.48 % 786.38
povedati povedati pove dati 970,784 0.49 % 855.55 0 0 % 0 54,069 0.50 % 1,361.40 2,102 0.35 % 531.36 490,903 0.53 % 904.52 114,814 0.35 % 612.61 292,099 0.52 % 918.73 16,797 0.24 % 391.22
dobiti dobiti do biti 966,276 0.48 % 851.57 25 1.31 % 2,575.46 24,925 0.23 % 627.58 2,237 0.38 % 565.49 498,593 0.54 % 918.69 162,196 0.50 % 865.42 253,287 0.45 % 796.65 25,013 0.35 % 582.58
želeti želeti že leti 916,972 0.46 % 808.12 12 0.63 % 1,236.22 26,786 0.25 % 674.44 2,633 0.44 % 665.59 392,752 0.43 %
723.67 153,351 0.47 % 818.23 311,583 0.56 % 980.01 29,855 0.42 % 695.36 vedeti vedeti ve deti 869,903 0.44 % 766.64 1 0.05 % 103.02 115,253 1.07 % 2,901.95 3,084 0.52 % 779.60 365,237 0.40 % 672.97 174,422 0.53 % 930.66 183,677 0.33 % 577.71 28,229 0.40 % 657.49 moči moči moči 829,908 0.42 % 731.39 0 0 % 0 65,444 0.61 % 1,647.81 3,619 0.61 % 914.84 382,779 0.42 % 705.30 143,731 0.44 % 766.90 196,963 0.35 % 619.50 37,372 0.53 % 870.44 videti videti vi deti 729,726 0.37 % 643.10 0 0 % 0 94,473 0.88 % 2,378.73 1,702 0.29 % 430.25 273,372 0.30 % 503.71 146,780 0.45 % 783.17 178,717 0.32 % 562.11 34,682 0.49 % 807.79 praviti praviti pra viti 720,199 0.36 % 634.71 0 0 % 0 19,047 0.18 % 479.58 1,668 0.28 % 421.65 379,505 0.41 % 699.26 122,357 0.38 % 652.86 180,251 0.32 % 566.94 17,371 0.24 % 404.59 postati postati pos tati 640,874 0.32 % 564.80 8 0.42 % 824.15 24,724 0.23 % 622.52 1,981 0.33 % 500.78 280,862 0.30 % 517.51 131,659 0.40 % 702.49 166,830 0.30 % 524.72 34,810 0.49 % 810.77 reči reči reči 623,230 0.31 % 549.25 0 0 % 0 143,415 1.33 % 3,611.03 3,775 0.63 % 954.28 220,496 0.24 % 406.28 130,052 0.40 % 693.91 102,237 0.18 % 321.56 23,255 0.33 % 541.64 dejati dejati de jati 588,836 0.30 % 518.94 0 0 % 0 11,125 0.10 % 280.12 129 0.02 % 32.61 238,385 0.26 % 439.24 24,514 0.07 % 130.80 310,945 0.56 % 978 3,738 0.05 % 87.06 pomeniti pomeniti pome niti 572,293 0.29 % 504.36 1 0.05 % 103.02 15,176 0.14 % 382.12 2,098 0.35 % 530.35 265,611 0.29 % 489.41 114,402 0.35 % 610.41 146,456 0.26 % 460.64 28,549 0.40 % 664.94 ostati ostati os tati 562,250 0.28 % 495.51 1 0.05 % 103.02 26,621 0.25 % 670.29 1,527 0.26 % 386.01 268,949 0.29 % 495.56 90,209 0.28 % 481.33 157,481 0.28 % 495.32 17,462 0.25 % 406.71 dati dati dati 532,058 0.27 % 468.90 31 1.62 % 3,193.57 38,456 0.36 % 968.28 3,258 0.55 % 823.59 242,506 0.26 % 446.83 100,968 0.31 % 538.73 125,248 0.22 % 393.94 21,591 0.30 % 502.88 odločiti odločiti odlo čiti 511,266 0.26 % 450.58 0 0 % 0 11,502 0.11 % 289.61 1,705 0.29 % 
431.01 253,119 0.28 % 466.39 84,393 0.26 % 450.29 150,035 0.27 % 471.90 10,512 0.15 % 244.84 najti najti n ajti 509,948 0.26 % 449.41 1 0.05 % 103.02 30,279 0.28 % 762.39 1,136 0.19 % 287.17 213,191 0.23 % 392.82 113,686 0.35 % 606.59 127,739 0.23 % 401.77 23,916 0.34 % 557.03 igrati igrati ig rati 507,131 0.26 % 446.93 0 0 % 0 10,320 0.10 % 259.85 250 0.04 % 63.20 259,647 0.28 % 478.42 72,953 0.22 % 389.25 157,472 0.28 % 495.29 6,489 0.09 % 151.14 doseči doseči do seči 504,161 0.25 % 444.31 2 0.10 % 206.04 5,633 0.05 % 141.83 968 0.16 % 244.70 251,182 0.27 % 462.82 61,834 0.19 % 329.93 166,863 0.30 % 524.83 17,679 0.25 % 411.76 hoteti hoteti ho teti 473,951 0.24 % 417.69 4 0.21 % 412.07 66,334 0.62 % 1,670.22 1,878 0.32 % 474.74 213,028 0.23 % 392.52 94,743 0.29 % 505.52 78,178 0.14 % 245.89 19,786 0.28 % 460.84 narediti narediti nare diti 471,071 0.24 % 415.15 8 0.42 % 824.15 26,480 0.25 % 666.74 1,294 0.22 % 327.11 200,417 0.22 % 369.28 101,403 0.31 % 541.05 124,061 0.22 % 390.20 17,408 0.24 % 405.45 govoriti govoriti govo riti 450,094 0.23 % 396.67 0 0 % 0 37,708 0.35 % 949.45 2,215 0.37 % 559.93 204,344 0.22 % 376.52 77,505 0.24 % 413.54 107,171 0.19 % 337.08 21,151 0.30 % 492.63 delati delati de lati 437,814 0.22 % 385.84 1 0.05 % 103.02 19,899 0.18 % 501.04 1,349 0.23 % 341.01 203,224 0.22 % 374.45 83,284 0.26 % 444.38 115,508 0.21 % 363.30 14,549 0.20 % 338.86 pričakovati pričakovati pričako vati 429,229 0.22 % 378.28 0 0 % 0 9,942 0.09 % 250.33 536 0.09 % 135.50 221,371 0.24 % 407.89 54,496 0.17 % 290.77 135,527 0.24 % 426.27 7,357 0.10 % 171.35 pomagati pomagati poma gati 425,110 0.21 % 374.65 1 0.05 % 103.02 18,730 0.17 % 471.60 929 0.16 % 234.84 187,285 0.20 % 345.08 86,740 0.27 % 462.82 109,931 0.20 % 345.76 21,494 0.30 % 500.62 uspeti uspeti us peti 416,462 0.21 % 367.03 0 0 % 0 11,945 0.11 % 300.76 441 0.07 % 111.48 202,705 0.22 % 373.50 63,264 0.19 % 337.56 130,990 0.23 % 412 7,117 0.10 % 165.76 kazati kazati ka zati 414,236 0.21 % 365.06 1 0.05 % 
103.02 9,240 0.09 % 232.65 811 0.14 % 205.01 204,947 0.22 % 377.63 64,708 0.20 % 345.26 114,799 0.20 % 361.07 19,730 0.28 % 459.54
pokazati pokazati poka zati 412,679 0.21 % 363.69 0 0 % 0 17,416 0.16 % 438.52 714 0.12 % 180.49 195,963 0.21 % 361.07 69,548 0.21 % 371.09 112,955 0.20 % 355.27 16,083 0.23 % 374.59
voditi voditi vo diti 410,241 0.21 % 361.54 1 0.05 % 103.02 7,552 0.07 % 190.15 1,581 0.27 % 399.66 209,331 0.23 % 385.71 49,427 0.15 % 263.73 128,999 0.23 % 405.74 13,350 0.19 % 310.94

1.2.50 List of final character-level 5-grams from verb lemmas in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-word_parts-verbs-lemmas-final-5grams-taxonomy-entire.tsv
Columns: Lemma; Lemma (lower-case); Rest of the word; Final part of the word; Total absolute frequency of lemma; Percentage of all found lemmas; Total relative frequency (per million occurrences); Absolute frequency [SSJ.T.K.N]; Percentage [SSJ.T.K.N]; Relative frequency [SSJ.T.K.N]; Absolute frequency [SSJ.T.K.L]; Percentage [SSJ.T.K.L]; Relative frequency [SSJ.T.K.L]; Absolute frequency [SSJ.T.D]; Percentage [SSJ.T.D]; Relative frequency [SSJ.T.D]; Absolute frequency [SSJ.T.P.C]; Percentage [SSJ.T.P.C]; Relative frequency [SSJ.T.P.C]; Absolute frequency [SSJ.T.P.R]; Percentage [SSJ.T.P.R]; Relative frequency [SSJ.T.P.R]; Absolute frequency [SSJ.I]; Percentage [SSJ.I]; Relative frequency [SSJ.I]; Absolute frequency [SSJ.T.K.S]; Percentage [SSJ.T.K.S]; Relative frequency [SSJ.T.K.S]
imeti imeti imeti 4,146,931 3.96 % 3,654.67 12 0.70 % 1,236.22 179,933 3.32 % 4,530.52 14,728 4.20 % 3,723.08 1,929,486 4.04 % 3,555.21 785,754 4.34 % 4,192.53 1,085,001 3.73 % 3,412.61 152,017 3.67 % 3,540.66
morati morati m orati 2,048,887 1.95 % 1,805.67 6 0.35 % 618.11 93,053 1.72 % 2,342.97 10,982 3.13 % 2,776.13 960,889 2.01 % 1,770.50 336,918 1.86 % 1,797.68 558,017 1.92 % 1,755.11 89,022 2.15 % 2,073.43
začeti začeti z ačeti 1,138,044 1.08 %
1,002.95 5 0.29 % 515.09 48,368 0.89 % 1,217.85 3,141 0.90 % 794.01 542,728 1.14 % 1,000.01 183,809 1.01 % 980.74 324,275 1.11 % 1,019.93 35,718 0.86 % 831.91 priti priti priti 1,046,777 1.00 % 922.52 2 0.12 % 206.04 71,696 1.32 % 1,805.23 3,067 0.88 % 775.30 476,007 1.00 % 877.07 175,716 0.97 % 937.56 286,526 0.98 % 901.20 33,763 0.81 % 786.38 povedati povedati pov edati 970,784 0.93 % 855.55 0 0 % 0 54,069 1.00 % 1,361.40 2,102 0.60 % 531.36 490,903 1.03 % 904.52 114,814 0.63 % 612.61 292,099 1.00 % 918.73 16,797 0.41 % 391.22 dobiti dobiti d obiti 966,276 0.92 % 851.57 25 1.46 % 2,575.46 24,925 0.46 % 627.58 2,237 0.64 % 565.49 498,593 1.04 % 918.69 162,196 0.90 % 865.42 253,287 0.87 % 796.65 25,013 0.60 % 582.58 želeti želeti ž eleti 916,972 0.87 % 808.12 12 0.70 % 1,236.22 26,786 0.49 % 674.44 2,633 0.75 % 665.59 392,752 0.82 % 723.67 153,351 0.85 % 818.23 311,583 1.07 % 980.01 29,855 0.72 % 695.36 vedeti vedeti v edeti 869,903 0.83 % 766.64 1 0.06 % 103.02 115,253 2.12 % 2,901.95 3,084 0.88 % 779.60 365,237 0.77 % 672.97 174,422 0.96 % 930.66 183,677 0.63 % 577.71 28,229 0.68 % 657.49 videti videti v ideti 729,726 0.70 % 643.10 0 0 % 0 94,473 1.74 % 2,378.73 1,702 0.49 % 430.25 273,372 0.57 % 503.71 146,780 0.81 % 783.17 178,717 0.61 % 562.11 34,682 0.84 % 807.79 praviti praviti pr aviti 720,199 0.69 % 634.71 0 0 % 0 19,047 0.35 % 479.58 1,668 0.48 % 421.65 379,505 0.80 % 699.26 122,357 0.68 % 652.86 180,251 0.62 % 566.94 17,371 0.42 % 404.59 postati postati po stati 640,874 0.61 % 564.80 8 0.47 % 824.15 24,724 0.46 % 622.52 1,981 0.56 % 500.78 280,862 0.59 % 517.51 131,659 0.73 % 702.49 166,830 0.57 % 524.72 34,810 0.84 % 810.77 dejati dejati d ejati 588,836 0.56 % 518.94 0 0 % 0 11,125 0.20 % 280.12 129 0.04 % 32.61 238,385 0.50 % 439.24 24,514 0.14 % 130.80 310,945 1.07 % 978 3,738 0.09 % 87.06 pomeniti pomeniti pom eniti 572,293 0.55 % 504.36 1 0.06 % 103.02 15,176 0.28 % 382.12 2,098 0.60 % 530.35 265,611 0.56 % 489.41 114,402 0.63 % 610.41 146,456 0.50 
% 460.64 28,549 0.69 % 664.94 ostati ostati o stati 562,250 0.54 % 495.51 1 0.06 % 103.02 26,621 0.49 % 670.29 1,527 0.44 % 386.01 268,949 0.56 % 495.56 90,209 0.50 % 481.33 157,481 0.54 % 495.32 17,462 0.42 % 406.71 odločiti odločiti odl očiti 511,266 0.49 % 450.58 0 0 % 0 11,502 0.21 % 289.61 1,705 0.49 % 431.01 253,119 0.53 % 466.39 84,393 0.47 % 450.29 150,035 0.52 % 471.90 10,512 0.25 % 244.84 najti najti najti 509,948 0.49 % 449.41 1 0.06 % 103.02 30,279 0.56 % 762.39 1,136 0.32 % 287.17 213,191 0.45 % 392.82 113,686 0.63 % 606.59 127,739 0.44 % 401.77 23,916 0.58 % 557.03 igrati igrati i grati 507,131 0.48 % 446.93 0 0 % 0 10,320 0.19 % 259.85 250 0.07 % 63.20 259,647 0.54 % 478.42 72,953 0.40 % 389.25 157,472 0.54 % 495.29 6,489 0.16 % 151.14 doseči doseči d oseči 504,161 0.48 % 444.31 2 0.12 % 206.04 5,633 0.10 % 141.83 968 0.28 % 244.70 251,182 0.53 % 462.82 61,834 0.34 % 329.93 166,863 0.57 % 524.83 17,679 0.43 % 411.76 hoteti hoteti h oteti 473,951 0.45 % 417.69 4 0.23 % 412.07 66,334 1.22 % 1,670.22 1,878 0.54 % 474.74 213,028 0.45 % 392.52 94,743 0.52 % 505.52 78,178 0.27 % 245.89 19,786 0.48 % 460.84 narediti narediti nar editi 471,071 0.45 % 415.15 8 0.47 % 824.15 26,480 0.49 % 666.74 1,294 0.37 % 327.11 200,417 0.42 % 369.28 101,403 0.56 % 541.05 124,061 0.43 % 390.20 17,408 0.42 % 405.45 govoriti govoriti gov oriti 450,094 0.43 % 396.67 0 0 % 0 37,708 0.69 % 949.45 2,215 0.63 % 559.93 204,344 0.43 % 376.52 77,505 0.43 % 413.54 107,171 0.37 % 337.08 21,151 0.51 % 492.63 delati delati d elati 437,814 0.42 % 385.84 1 0.06 % 103.02 19,899 0.37 % 501.04 1,349 0.39 % 341.01 203,224 0.43 % 374.45 83,284 0.46 % 444.38 115,508 0.40 % 363.30 14,549 0.35 % 338.86 pričakovati pričakovati pričak ovati 429,229 0.41 % 378.28 0 0 % 0 9,942 0.18 % 250.33 536 0.15 % 135.50 221,371 0.46 % 407.89 54,496 0.30 % 290.77 135,527 0.47 % 426.27 7,357 0.18 % 171.35 pomagati pomagati pom agati 425,110 0.41 % 374.65 1 0.06 % 103.02 18,730 0.34 % 471.60 929 0.27 % 234.84 
187,285 0.39 % 345.08 86,740 0.48 % 462.82 109,931 0.38 % 345.76 21,494 0.52 % 500.62
uspeti uspeti u speti 416,462 0.40 % 367.03 0 0 % 0 11,945 0.22 % 300.76 441 0.13 % 111.48 202,705 0.42 % 373.50 63,264 0.35 % 337.56 130,990 0.45 % 412 7,117 0.17 % 165.76
kazati kazati k azati 414,236 0.40 % 365.06 1 0.06 % 103.02 9,240 0.17 % 232.65 811 0.23 % 205.01 204,947 0.43 % 377.63 64,708 0.36 % 345.26 114,799 0.39 % 361.07 19,730 0.48 % 459.54
pokazati pokazati pok azati 412,679 0.39 % 363.69 0 0 % 0 17,416 0.32 % 438.52 714 0.20 % 180.49 195,963 0.41 % 361.07 69,548 0.38 % 371.09 112,955 0.39 % 355.27 16,083 0.39 % 374.59
voditi voditi v oditi 410,241 0.39 % 361.54 1 0.06 % 103.02 7,552 0.14 % 190.15 1,581 0.45 % 399.66 209,331 0.44 % 385.71 49,427 0.27 % 263.73 128,999 0.44 % 405.74 13,350 0.32 % 310.94
dodati dodati d odati 399,061 0.38 % 351.69 136 7.95 % 14,010.51 7,250 0.13 % 182.55 1,166 0.33 % 294.75 151,686 0.32 % 279.49 61,463 0.34 % 327.95 163,610 0.56 % 514.60 13,750 0.33 % 320.25
veljati veljati ve ljati 399,008 0.38 % 351.64 0 0 % 0 3,601 0.07 % 90.67 2,893 0.83 % 731.32 194,170 0.41 % 357.77 71,297 0.39 % 380.42 110,243 0.38 % 346.74 16,804 0.41 % 391.39
zgoditi zgoditi zg oditi 398,852 0.38 % 351.51 0 0 % 0 21,111 0.39 % 531.55 911 0.26 % 230.29 185,364 0.39 % 341.55 67,503 0.37 % 360.17 112,678 0.39 % 354.40 11,285 0.27 % 262.84
pripraviti pripraviti pripr aviti 393,709 0.38 % 346.97 23 1.34 % 2,369.42 7,290 0.13 % 183.55 1,084 0.31 % 274.02 229,431 0.48 % 422.74 56,054 0.31 % 299.09 90,982 0.31 % 286.16 8,845 0.21 % 206.01

1.2.51 List of initial character-level 1-grams from verb lower-case word forms in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-word_parts-verbs-lowercase_forms-initial-1grams-taxonomy-entire.tsv
Columns: Lower-case word form; Initial part of the word; Rest of the word; Total absolute frequency of lower-case word form; Percentage of all found lower-case word forms; Total relative frequency (per million occurrences); Absolute frequency [SSJ.T.K.N]; Percentage [SSJ.T.K.N]; Relative frequency [SSJ.T.K.N]; Absolute frequency [SSJ.T.K.L]; Percentage [SSJ.T.K.L]; Relative frequency [SSJ.T.K.L]; Absolute frequency [SSJ.T.D]; Percentage [SSJ.T.D]; Relative frequency [SSJ.T.D]; Absolute frequency [SSJ.T.P.C]; Percentage [SSJ.T.P.C]; Relative frequency [SSJ.T.P.C]; Absolute frequency [SSJ.T.P.R]; Percentage [SSJ.T.P.R]; Relative frequency [SSJ.T.P.R]; Absolute frequency [SSJ.I]; Percentage [SSJ.I]; Relative frequency [SSJ.I]; Absolute frequency [SSJ.T.K.S]; Percentage [SSJ.T.K.S]; Relative frequency [SSJ.T.K.S]
je j e 39,852,651 19.88 % 35,121.94 53 2.77 % 5,459.98 2,355,179 21.72 % 59,300.86 95,246 15.86 % 24,077.16 18,126,599 19.56 % 33,399.46 6,003,429 18.26 % 32,032.32 11,999,560 21.30 % 37,741.74 1,272,585 17.83 % 29,640.02
so s o 13,322,285 6.64 % 11,740.86 25 1.31 % 2,575.46 381,659 3.52 % 9,609.76 33,146 5.52 % 8,378.95 6,609,847 7.13 % 12,179.08 1,974,260 6.00 % 10,534 3,829,327 6.80 % 12,044.23 494,021 6.92 % 11,506.34
bi b i 6,497,427 3.24 % 5,726.15 4 0.21 % 412.07 388,773 3.58 % 9,788.88 19,288 3.21 % 4,875.80 3,212,413 3.47 % 5,919.08 1,006,674 3.06 % 5,371.28 1,665,728 2.96 % 5,239.15 204,547 2.87 % 4,764.14
bo b o 5,991,999 2.99 % 5,280.72 9 0.47 % 927.17 165,711 1.53 % 4,172.42 10,939 1.82 % 2,765.26 3,054,378 3.29 % 5,627.89 827,838 2.52 % 4,417.07 1,837,278 3.26 % 5,778.72 95,846 1.34 % 2,232.37
ni n i 4,416,652 2.20 % 3,892.37 0 0 % 0 283,727 2.62 % 7,143.94 15,184 2.53 % 3,838.35 2,061,015 2.22 % 3,797.56 719,621 2.19 % 3,839.66 1,202,740 2.13 % 3,782.93 134,365 1.88 % 3,129.52
bodo b odo 2,658,670 1.33 % 2,343.07 4 0.21 % 412.07 31,054 0.29 % 781.91 3,637 0.61 % 919.39 1,471,149 1.59 % 2,710.69 302,865 0.92 % 1,615.99 814,748 1.45 % 2,562.59 35,213 0.49 % 820.15
sem s em 2,397,474 1.20 % 2,112.88 0 0 % 0 402,492 3.71 % 10,134.31 6,808 1.13 % 1,720.99 928,864 1.00 % 1,711.49 454,648 1.38 % 2,425.85 534,969
0.95 % 1,682.62 69,693 0.98 % 1,623.23 bil b il 2,389,569 1.19 % 2,105.92 0 0 % 0 167,677 1.55 % 4,221.93 6,270 1.04 % 1,584.99 1,107,569 1.20 % 2,040.77 323,764 0.98 % 1,727.50 713,850 1.27 % 2,245.24 70,439 0.99 % 1,640.61 bilo b ilo 2,082,016 1.04 % 1,834.87 1 0.05 % 103.02 149,980 1.38 % 3,776.33 6,314 1.05 % 1,596.11 1,019,432 1.10 % 1,878.37 298,941 0.91 % 1,595.05 544,248 0.97 % 1,711.80 63,100 0.88 % 1,469.67 bila b ila 1,952,423 0.97 % 1,720.66 0 0 % 0 144,082 1.33 % 3,627.83 5,749 0.96 % 1,453.29 902,435 0.97 % 1,662.80 291,559 0.89 % 1,555.66 543,982 0.97 % 1,710.97 64,616 0.91 % 1,504.98 smo s mo 1,909,803 0.95 % 1,683.10 0 0 % 0 44,381 0.41 % 1,117.47 5,673 0.94 % 1,434.07 994,108 1.07 % 1,831.71 348,648 1.06 % 1,860.27 471,393 0.84 % 1,482.65 45,600 0.64 % 1,062.08 sta s ta 1,875,334 0.94 % 1,652.72 2 0.10 % 206.04 96,486 0.89 % 2,429.41 2,861 0.48 % 723.23 881,185 0.95 % 1,623.64 274,754 0.84 % 1,466 566,839 1.01 % 1,782.86 53,207 0.74 % 1,239.25 ima i ma 1,068,307 0.53 % 941.49 5 0.26 % 515.09 28,747 0.27 % 723.82 4,891 0.81 % 1,236.39 479,110 0.52 % 882.79 225,335 0.69 % 1,202.31 286,154 0.51 % 900.03 44,065 0.62 % 1,026.33 niso n iso 1,033,073 0.52 % 910.44 2 0.10 % 206.04 29,832 0.28 % 751.14 3,575 0.59 % 903.72 525,937 0.57 % 969.07 152,957 0.47 % 816.13 285,738 0.51 % 898.72 35,032 0.49 % 815.94 bili b ili 949,365 0.47 % 836.67 1 0.05 % 103.02 39,173 0.36 % 986.33 2,864 0.48 % 723.99 475,580 0.51 % 876.29 138,560 0.42 % 739.31 259,161 0.46 % 815.13 34,026 0.48 % 792.51 gre g re 814,855 0.41 % 718.13 1 0.05 % 103.02 19,458 0.18 % 489.93 2,827 0.47 % 714.64 383,213 0.41 % 706.10 137,971 0.42 % 736.17 247,421 0.44 % 778.20 23,964 0.34 % 558.15 bomo b omo 651,882 0.33 % 574.50 0 0 % 0 14,951 0.14 % 376.45 2,964 0.49 % 749.27 340,630 0.37 % 627.63 110,301 0.34 % 588.53 163,942 0.29 % 515.64 19,094 0.27 % 444.72 imajo i majo 589,974 0.29 % 519.94 1 0.05 % 103.02 8,325 0.08 % 209.61 2,572 0.43 % 650.17 294,609 0.32 % 542.84 109,963 0.33 % 586.73 
145,513 0.26 % 457.68 28,991 0.41 % 675.23 pravi p ravi 519,683 0.26 % 457.99 0 0 % 0 10,534 0.10 % 265.23 1,232 0.20 % 311.44 278,850 0.30 % 513.80 86,508 0.26 % 461.58 131,590 0.23 % 413.88 10,969 0.15 % 255.48 biti b iti 505,624 0.25 % 445.60 4 0.21 % 412.07 21,007 0.19 % 528.93 2,919 0.49 % 737.89 223,980 0.24 % 412.70 96,166 0.29 % 513.11 135,922 0.24 % 427.51 25,626 0.36 % 596.86 povedal p ovedal 470,909 0.23 % 415.01 0 0 % 0 17,662 0.16 % 444.71 699 0.12 % 176.70 256,636 0.28 % 472.87 30,466 0.09 % 162.56 162,162 0.29 % 510.04 3,284 0.05 % 76.49 dejal d ejal 467,145 0.23 % 411.69 0 0 % 0 7,833 0.07 % 197.23 81 0.01 % 20.48 196,501 0.21 % 362.07 16,632 0.05 % 88.74 243,317 0.43 % 765.30 2,781 0.04 % 64.77 imel i mel 456,762 0.23 % 402.54 0 0 % 0 42,126 0.39 % 1,060.69 950 0.16 % 240.15 202,873 0.22 % 373.81 66,132 0.20 % 352.86 131,752 0.23 % 414.39 12,929 0.18 % 301.13 pomeni p omeni 439,400 0.22 % 387.24 1 0.05 % 103.02 9,080 0.08 % 228.62 1,576 0.26 % 398.40 202,050 0.22 % 372.29 91,731 0.28 % 489.45 112,511 0.20 % 353.88 22,451 0.32 % 522.91 mora m ora 437,226 0.22 % 385.33 3 0.16 % 309.06 13,897 0.13 % 349.91 5,423 0.90 % 1,370.88 196,118 0.21 % 361.36 77,719 0.24 % 414.68 120,038 0.21 % 377.55 24,028 0.34 % 559.64 imeli i meli 424,733 0.21 % 374.32 0 0 % 0 11,741 0.11 % 295.63 995 0.17 % 251.53 227,147 0.24 % 418.53 63,176 0.19 % 337.09 110,375 0.20 % 347.16 11,299 0.16 % 263.17 boste b oste 424,723 0.21 % 374.31 8 0.42 % 824.15 11,314 0.10 % 284.87 1,143 0.19 % 288.94 144,008 0.15 % 265.34 176,813 0.54 % 943.42 60,928 0.11 % 191.63 30,509 0.43 % 710.59 ste s te 389,782 0.19 % 343.51 16 0.84 % 1,648.30 23,529 0.22 % 592.43 1,606 0.27 % 405.98 155,988 0.17 % 287.42 111,345 0.34 % 594.10 74,884 0.13 % 235.53 22,414 0.31 % 522.05 bile b ile 375,459 0.19 % 330.89 1 0.05 % 103.02 20,500 0.19 % 516.17 1,483 0.25 % 374.89 178,752 0.19 % 329.36 57,280 0.17 % 305.63 99,776 0.18 % 313.82 17,667 0.25 % 411.49 bom b om 366,374 0.18 % 322.88 0 0 % 0 59,430 0.55 % 
1,496.38 1,893 0.32 % 478.53 150,679 0.16 % 277.64 60,672 0.18 % 323.73 83,377 0.15 % 262.24 10,323 0.14 % 240.43
morali m orali 356,817 0.18 % 314.46 0 0 % 0 7,714 0.07 % 194.23 654 0.11 % 165.32 196,175 0.21 % 361.47 48,407 0.15 % 258.28 94,072 0.17 % 295.88 9,795 0.14 % 228.14
nisem n isem 333,913 0.17 % 294.28 0 0 % 0 57,684 0.53 % 1,452.42 857 0.14 % 216.64 129,813 0.14 % 239.19 61,255 0.19 % 326.84 76,320 0.14 % 240.05 7,984 0.11 % 185.96

1.2.52 List of initial character-level 2-grams from verb lower-case word forms in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-word_parts-verbs-lowercase_forms-initial-2grams-taxonomy-entire.tsv
Columns: Lower-case word form; Initial part of the word; Rest of the word; Total absolute frequency of lower-case word form; Percentage of all found lower-case word forms; Total relative frequency (per million occurrences); Absolute frequency [SSJ.T.K.N]; Percentage [SSJ.T.K.N]; Relative frequency [SSJ.T.K.N]; Absolute frequency [SSJ.T.K.L]; Percentage [SSJ.T.K.L]; Relative frequency [SSJ.T.K.L]; Absolute frequency [SSJ.T.D]; Percentage [SSJ.T.D]; Relative frequency [SSJ.T.D]; Absolute frequency [SSJ.T.P.C]; Percentage [SSJ.T.P.C]; Relative frequency [SSJ.T.P.C]; Absolute frequency [SSJ.T.P.R]; Percentage [SSJ.T.P.R]; Relative frequency [SSJ.T.P.R]; Absolute frequency [SSJ.I]; Percentage [SSJ.I]; Relative frequency [SSJ.I]; Absolute frequency [SSJ.T.K.S]; Percentage [SSJ.T.K.S]; Relative frequency [SSJ.T.K.S]
je je 39,852,651 19.88 % 35,121.94 53 2.77 % 5,459.98 2,355,179 21.72 % 59,300.86 95,246 15.86 % 24,077.16 18,126,599 19.56 % 33,399.46 6,003,429 18.26 % 32,032.32 11,999,560 21.30 % 37,741.74 1,272,585 17.83 % 29,640.02
so so 13,322,285 6.64 % 11,740.86 25 1.31 % 2,575.46 381,659 3.52 % 9,609.76 33,146 5.52 % 8,378.95 6,609,847 7.13 % 12,179.08 1,974,260 6.00 % 10,534 3,829,327 6.80 % 12,044.23 494,021 6.92 % 11,506.34
bi bi 6,497,427 3.24 % 5,726.15
4 0.21 % 412.07 388,773 3.58 % 9,788.88 19,288 3.21 % 4,875.80 3,212,413 3.47 % 5,919.08 1,006,674 3.06 % 5,371.28 1,665,728 2.96 % 5,239.15 204,547 2.87 % 4,764.14 bo bo 5,991,999 2.99 % 5,280.72 9 0.47 % 927.17 165,711 1.53 % 4,172.42 10,939 1.82 % 2,765.26 3,054,378 3.29 % 5,627.89 827,838 2.52 % 4,417.07 1,837,278 3.26 % 5,778.72 95,846 1.34 % 2,232.37 ni ni 4,416,652 2.20 % 3,892.37 0 0 % 0 283,727 2.62 % 7,143.94 15,184 2.53 % 3,838.35 2,061,015 2.22 % 3,797.56 719,621 2.19 % 3,839.66 1,202,740 2.13 % 3,782.93 134,365 1.88 % 3,129.52 bodo bo do 2,658,670 1.33 % 2,343.07 4 0.21 % 412.07 31,054 0.29 % 781.91 3,637 0.61 % 919.39 1,471,149 1.59 % 2,710.69 302,865 0.92 % 1,615.99 814,748 1.45 % 2,562.59 35,213 0.49 % 820.15 sem se m 2,397,474 1.20 % 2,112.88 0 0 % 0 402,492 3.71 % 10,134.31 6,808 1.13 % 1,720.99 928,864 1.00 % 1,711.49 454,648 1.38 % 2,425.85 534,969 0.95 % 1,682.62 69,693 0.98 % 1,623.23 bil bi l 2,389,569 1.19 % 2,105.92 0 0 % 0 167,677 1.55 % 4,221.93 6,270 1.04 % 1,584.99 1,107,569 1.20 % 2,040.77 323,764 0.98 % 1,727.50 713,850 1.27 % 2,245.24 70,439 0.99 % 1,640.61 bilo bi lo 2,082,016 1.04 % 1,834.87 1 0.05 % 103.02 149,980 1.38 % 3,776.33 6,314 1.05 % 1,596.11 1,019,432 1.10 % 1,878.37 298,941 0.91 % 1,595.05 544,248 0.97 % 1,711.80 63,100 0.88 % 1,469.67 bila bi la 1,952,423 0.97 % 1,720.66 0 0 % 0 144,082 1.33 % 3,627.83 5,749 0.96 % 1,453.29 902,435 0.97 % 1,662.80 291,559 0.89 % 1,555.66 543,982 0.97 % 1,710.97 64,616 0.91 % 1,504.98 smo sm o 1,909,803 0.95 % 1,683.10 0 0 % 0 44,381 0.41 % 1,117.47 5,673 0.94 % 1,434.07 994,108 1.07 % 1,831.71 348,648 1.06 % 1,860.27 471,393 0.84 % 1,482.65 45,600 0.64 % 1,062.08 sta st a 1,875,334 0.94 % 1,652.72 2 0.10 % 206.04 96,486 0.89 % 2,429.41 2,861 0.48 % 723.23 881,185 0.95 % 1,623.64 274,754 0.84 % 1,466 566,839 1.01 % 1,782.86 53,207 0.74 % 1,239.25 ima im a 1,068,307 0.53 % 941.49 5 0.26 % 515.09 28,747 0.27 % 723.82 4,891 0.81 % 1,236.39 479,110 0.52 % 882.79 225,335 0.69 % 1,202.31 
286,154 0.51 % 900.03 44,065 0.62 % 1,026.33 niso ni so 1,033,073 0.52 % 910.44 2 0.10 % 206.04 29,832 0.28 % 751.14 3,575 0.59 % 903.72 525,937 0.57 % 969.07 152,957 0.47 % 816.13 285,738 0.51 % 898.72 35,032 0.49 % 815.94 bili bi li 949,365 0.47 % 836.67 1 0.05 % 103.02 39,173 0.36 % 986.33 2,864 0.48 % 723.99 475,580 0.51 % 876.29 138,560 0.42 % 739.31 259,161 0.46 % 815.13 34,026 0.48 % 792.51 gre gr e 814,855 0.41 % 718.13 1 0.05 % 103.02 19,458 0.18 % 489.93 2,827 0.47 % 714.64 383,213 0.41 % 706.10 137,971 0.42 % 736.17 247,421 0.44 % 778.20 23,964 0.34 % 558.15 bomo bo mo 651,882 0.33 % 574.50 0 0 % 0 14,951 0.14 % 376.45 2,964 0.49 % 749.27 340,630 0.37 % 627.63 110,301 0.34 % 588.53 163,942 0.29 % 515.64 19,094 0.27 % 444.72 imajo im ajo 589,974 0.29 % 519.94 1 0.05 % 103.02 8,325 0.08 % 209.61 2,572 0.43 % 650.17 294,609 0.32 % 542.84 109,963 0.33 % 586.73 145,513 0.26 % 457.68 28,991 0.41 % 675.23 pravi pr avi 519,683 0.26 % 457.99 0 0 % 0 10,534 0.10 % 265.23 1,232 0.20 % 311.44 278,850 0.30 % 513.80 86,508 0.26 % 461.58 131,590 0.23 % 413.88 10,969 0.15 % 255.48 biti bi ti 505,624 0.25 % 445.60 4 0.21 % 412.07 21,007 0.19 % 528.93 2,919 0.49 % 737.89 223,980 0.24 % 412.70 96,166 0.29 % 513.11 135,922 0.24 % 427.51 25,626 0.36 % 596.86 povedal po vedal 470,909 0.23 % 415.01 0 0 % 0 17,662 0.16 % 444.71 699 0.12 % 176.70 256,636 0.28 % 472.87 30,466 0.09 % 162.56 162,162 0.29 % 510.04 3,284 0.05 % 76.49 dejal de jal 467,145 0.23 % 411.69 0 0 % 0 7,833 0.07 % 197.23 81 0.01 % 20.48 196,501 0.21 % 362.07 16,632 0.05 % 88.74 243,317 0.43 % 765.30 2,781 0.04 % 64.77 imel im el 456,762 0.23 % 402.54 0 0 % 0 42,126 0.39 % 1,060.69 950 0.16 % 240.15 202,873 0.22 % 373.81 66,132 0.20 % 352.86 131,752 0.23 % 414.39 12,929 0.18 % 301.13 pomeni po meni 439,400 0.22 % 387.24 1 0.05 % 103.02 9,080 0.08 % 228.62 1,576 0.26 % 398.40 202,050 0.22 % 372.29 91,731 0.28 % 489.45 112,511 0.20 % 353.88 22,451 0.32 % 522.91 mora mo ra 437,226 0.22 % 385.33 3 0.16 % 309.06 
13,897 0.13 % 349.91 5,423 0.90 % 1,370.88 196,118 0.21 % 361.36 77,719 0.24 % 414.68 120,038 0.21 % 377.55 24,028 0.34 % 559.64
imeli im eli 424,733 0.21 % 374.32 0 0 % 0 11,741 0.11 % 295.63 995 0.17 % 251.53 227,147 0.24 % 418.53 63,176 0.19 % 337.09 110,375 0.20 % 347.16 11,299 0.16 % 263.17
boste bo ste 424,723 0.21 % 374.31 8 0.42 % 824.15 11,314 0.10 % 284.87 1,143 0.19 % 288.94 144,008 0.15 % 265.34 176,813 0.54 % 943.42 60,928 0.11 % 191.63 30,509 0.43 % 710.59
ste st e 389,782 0.19 % 343.51 16 0.84 % 1,648.30 23,529 0.22 % 592.43 1,606 0.27 % 405.98 155,988 0.17 % 287.42 111,345 0.34 % 594.10 74,884 0.13 % 235.53 22,414 0.31 % 522.05
bile bi le 375,459 0.19 % 330.89 1 0.05 % 103.02 20,500 0.19 % 516.17 1,483 0.25 % 374.89 178,752 0.19 % 329.36 57,280 0.17 % 305.63 99,776 0.18 % 313.82 17,667 0.25 % 411.49
bom bo m 366,374 0.18 % 322.88 0 0 % 0 59,430 0.55 % 1,496.38 1,893 0.32 % 478.53 150,679 0.16 % 277.64 60,672 0.18 % 323.73 83,377 0.15 % 262.24 10,323 0.14 % 240.43
morali mo rali 356,817 0.18 % 314.46 0 0 % 0 7,714 0.07 % 194.23 654 0.11 % 165.32 196,175 0.21 % 361.47 48,407 0.15 % 258.28 94,072 0.17 % 295.88 9,795 0.14 % 228.14
nisem ni sem 333,913 0.17 % 294.28 0 0 % 0 57,684 0.53 % 1,452.42 857 0.14 % 216.64 129,813 0.14 % 239.19 61,255 0.19 % 326.84 76,320 0.14 % 240.05 7,984 0.11 % 185.96

1.2.53 List of initial character-level 3-grams from verb lower-case word forms in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-word_parts-verbs-lowercase_forms-initial-3grams-taxonomy-entire.tsv
Columns: Lower-case word form; Initial part of the word; Rest of the word; Total absolute frequency of lower-case word form; Percentage of all found lower-case word forms; Total relative frequency (per million occurrences); Absolute frequency [SSJ.T.K.N]; Percentage [SSJ.T.K.N]; Relative frequency [SSJ.T.K.N]; Absolute frequency [SSJ.T.K.L]; Percentage [SSJ.T.K.L]; Relative frequency [SSJ.T.K.L]; Absolute frequency [SSJ.T.D]; Percentage [SSJ.T.D]; Relative frequency [SSJ.T.D]; Absolute frequency [SSJ.T.P.C]; Percentage [SSJ.T.P.C]; Relative frequency [SSJ.T.P.C]; Absolute frequency [SSJ.T.P.R]; Percentage [SSJ.T.P.R]; Relative frequency [SSJ.T.P.R]; Absolute frequency [SSJ.I]; Percentage [SSJ.I]; Relative frequency [SSJ.I]; Absolute frequency [SSJ.T.K.S]; Percentage [SSJ.T.K.S]; Relative frequency [SSJ.T.K.S]

bodo bod o 2,658,670 2.04 % 2,343.07 4 0.22 % 412.07 31,054 0.43 % 781.91 3,637 0.85 % 919.39 1,471,149 2.47 % 2,710.69 302,865 1.36 % 1,615.99 814,748 2.28 % 2,562.59 35,213 0.71 % 820.15
sem sem 2,397,474 1.84 % 2,112.88 0 0 % 0 402,492 5.57 % 10,134.31 6,808 1.60 % 1,720.99 928,864 1.56 % 1,711.49 454,648 2.04 % 2,425.85 534,969 1.50 % 1,682.62 69,693 1.42 % 1,623.23
bil bil 2,389,569 1.84 % 2,105.92 0 0 % 0 167,677 2.32 % 4,221.93 6,270 1.47 % 1,584.99 1,107,569 1.86 % 2,040.77 323,764 1.45 % 1,727.50 713,850 2.00 % 2,245.24 70,439 1.43 % 1,640.61
bilo bil o 2,082,016 1.60 % 1,834.87 1 0.06 % 103.02 149,980 2.08 % 3,776.33 6,314 1.48 % 1,596.11 1,019,432 1.71 % 1,878.37 298,941 1.34 % 1,595.05 544,248 1.52 % 1,711.80 63,100 1.28 % 1,469.67
bila bil a 1,952,423 1.50 % 1,720.66 0 0 % 0 144,082 2.00 % 3,627.83 5,749 1.35 % 1,453.29 902,435 1.52 % 1,662.80 291,559 1.31 % 1,555.66 543,982 1.52 % 1,710.97 64,616 1.31 % 1,504.98
smo smo 1,909,803 1.47 % 1,683.10 0 0 % 0 44,381 0.61 % 1,117.47 5,673 1.33 % 1,434.07 994,108 1.67 % 1,831.71 348,648 1.57 % 1,860.27 471,393 1.32 % 1,482.65 45,600 0.93 % 1,062.08
sta sta 1,875,334 1.44 % 1,652.72 2 0.11 % 206.04 96,486 1.34 % 2,429.41 2,861 0.67 % 723.23 881,185 1.48 % 1,623.64 274,754 1.23 % 1,466 566,839 1.59 % 1,782.86 53,207 1.08 % 1,239.25
ima ima 1,068,307 0.82 % 941.49 5 0.27 % 515.09 28,747 0.40 % 723.82 4,891 1.15 % 1,236.39 479,110 0.81 % 882.79 225,335 1.01 % 1,202.31 286,154 0.80 % 900.03 44,065 0.90 % 1,026.33
niso nis o 1,033,073 0.79 % 910.44 2 0.11 % 206.04 29,832 0.41 % 751.14 3,575
0.84 % 903.72 525,937 0.88 % 969.07 152,957 0.69 % 816.13 285,738 0.80 % 898.72 35,032 0.71 % 815.94 bili bil i 949,365 0.73 % 836.67 1 0.06 % 103.02 39,173 0.54 % 986.33 2,864 0.67 % 723.99 475,580 0.80 % 876.29 138,560 0.62 % 739.31 259,161 0.72 % 815.13 34,026 0.69 % 792.51 gre gre 814,855 0.63 % 718.13 1 0.06 % 103.02 19,458 0.27 % 489.93 2,827 0.66 % 714.64 383,213 0.64 % 706.10 137,971 0.62 % 736.17 247,421 0.69 % 778.20 23,964 0.49 % 558.15 bomo bom o 651,882 0.50 % 574.50 0 0 % 0 14,951 0.21 % 376.45 2,964 0.70 % 749.27 340,630 0.57 % 627.63 110,301 0.49 % 588.53 163,942 0.46 % 515.64 19,094 0.39 % 444.72 imajo ima jo 589,974 0.45 % 519.94 1 0.06 % 103.02 8,325 0.12 % 209.61 2,572 0.60 % 650.17 294,609 0.49 % 542.84 109,963 0.49 % 586.73 145,513 0.41 % 457.68 28,991 0.59 % 675.23 pravi pra vi 519,683 0.40 % 457.99 0 0 % 0 10,534 0.15 % 265.23 1,232 0.29 % 311.44 278,850 0.47 % 513.80 86,508 0.39 % 461.58 131,590 0.37 % 413.88 10,969 0.22 % 255.48 biti bit i 505,624 0.39 % 445.60 4 0.22 % 412.07 21,007 0.29 % 528.93 2,919 0.69 % 737.89 223,980 0.38 % 412.70 96,166 0.43 % 513.11 135,922 0.38 % 427.51 25,626 0.52 % 596.86 povedal pov edal 470,909 0.36 % 415.01 0 0 % 0 17,662 0.24 % 444.71 699 0.16 % 176.70 256,636 0.43 % 472.87 30,466 0.14 % 162.56 162,162 0.45 % 510.04 3,284 0.07 % 76.49 dejal dej al 467,145 0.36 % 411.69 0 0 % 0 7,833 0.11 % 197.23 81 0.02 % 20.48 196,501 0.33 % 362.07 16,632 0.07 % 88.74 243,317 0.68 % 765.30 2,781 0.06 % 64.77 imel ime l 456,762 0.35 % 402.54 0 0 % 0 42,126 0.58 % 1,060.69 950 0.22 % 240.15 202,873 0.34 % 373.81 66,132 0.30 % 352.86 131,752 0.37 % 414.39 12,929 0.26 % 301.13 pomeni pom eni 439,400 0.34 % 387.24 1 0.06 % 103.02 9,080 0.13 % 228.62 1,576 0.37 % 398.40 202,050 0.34 % 372.29 91,731 0.41 % 489.45 112,511 0.32 % 353.88 22,451 0.46 % 522.91 mora mor a 437,226 0.34 % 385.33 3 0.17 % 309.06 13,897 0.19 % 349.91 5,423 1.27 % 1,370.88 196,118 0.33 % 361.36 77,719 0.35 % 414.68 120,038 0.34 % 377.55 24,028 0.49 % 
559.64 imeli ime li 424,733 0.33 % 374.32 0 0 % 0 11,741 0.16 % 295.63 995 0.23 % 251.53 227,147 0.38 % 418.53 63,176 0.28 % 337.09 110,375 0.31 % 347.16 11,299 0.23 % 263.17 boste bos te 424,723 0.33 % 374.31 8 0.44 % 824.15 11,314 0.16 % 284.87 1,143 0.27 % 288.94 144,008 0.24 % 265.34 176,813 0.79 % 943.42 60,928 0.17 % 191.63 30,509 0.62 % 710.59 ste ste 389,782 0.30 % 343.51 16 0.88 % 1,648.30 23,529 0.33 % 592.43 1,606 0.38 % 405.98 155,988 0.26 % 287.42 111,345 0.50 % 594.10 74,884 0.21 % 235.53 22,414 0.46 % 522.05 bile bil e 375,459 0.29 % 330.89 1 0.06 % 103.02 20,500 0.28 % 516.17 1,483 0.35 % 374.89 178,752 0.30 % 329.36 57,280 0.26 % 305.63 99,776 0.28 % 313.82 17,667 0.36 % 411.49 bom bom 366,374 0.28 % 322.88 0 0 % 0 59,430 0.82 % 1,496.38 1,893 0.45 % 478.53 150,679 0.25 % 277.64 60,672 0.27 % 323.73 83,377 0.23 % 262.24 10,323 0.21 % 240.43 morali mor ali 356,817 0.27 % 314.46 0 0 % 0 7,714 0.11 % 194.23 654 0.15 % 165.32 196,175 0.33 % 361.47 48,407 0.22 % 258.28 94,072 0.26 % 295.88 9,795 0.20 % 228.14 nisem nis em 333,913 0.26 % 294.28 0 0 % 0 57,684 0.80 % 1,452.42 857 0.20 % 216.64 129,813 0.22 % 239.19 61,255 0.28 % 326.84 76,320 0.21 % 240.05 7,984 0.16 % 185.96 imela ime la 312,385 0.24 % 275.30 0 0 % 0 29,555 0.41 % 744.16 640 0.15 % 161.79 136,559 0.23 % 251.62 52,286 0.23 % 278.98 84,290 0.24 % 265.11 9,055 0.18 % 210.90 moral mor al 287,118 0.22 % 253.04 0 0 % 0 23,576 0.33 % 593.62 571 0.13 % 144.34 134,971 0.23 % 248.69 36,532 0.16 % 194.92 84,515 0.24 % 265.82 6,953 0.14 % 161.94 bosta bos ta 273,268 0.21 % 240.83 0 0 % 0 8,212 0.11 % 206.77 451 0.11 % 114.01 135,358 0.23 % 249.41 35,885 0.16 % 191.47 90,196 0.25 % 283.69 3,166 0.06 % 73.74 začel zač el 271,468 0.21 % 239.24 0 0 % 0 16,850 0.23 % 424.26 485 0.11 % 122.60 128,069 0.21 % 235.98 36,310 0.16 % 193.74 83,600 0.23 % 262.94 6,154 0.12 % 143.33 dobil dob il 251,856 0.19 % 221.96 0 0 % 0 9,123 0.13 % 229.71 370 0.09 % 93.53 126,291 0.21 % 232.70 34,996 0.16 % 186.73 76,748 
0.21 % 241.39 4,328 0.09 % 100.80

1.2.54 List of initial character-level 4-grams from verb lower-case word forms in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-word_parts-verbs-lowercase_forms-initial-4grams-taxonomy-entire.tsv
Columns: Lower-case word form; Initial part of the word; Rest of the word; Total absolute frequency of lower-case word form; Percentage of all found lower-case word forms; Total relative frequency (per million occurrences); Absolute frequency [SSJ.T.K.N]; Percentage [SSJ.T.K.N]; Relative frequency [SSJ.T.K.N]; Absolute frequency [SSJ.T.K.L]; Percentage [SSJ.T.K.L]; Relative frequency [SSJ.T.K.L]; Absolute frequency [SSJ.T.D]; Percentage [SSJ.T.D]; Relative frequency [SSJ.T.D]; Absolute frequency [SSJ.T.P.C]; Percentage [SSJ.T.P.C]; Relative frequency [SSJ.T.P.C]; Absolute frequency [SSJ.T.P.R]; Percentage [SSJ.T.P.R]; Relative frequency [SSJ.T.P.R]; Absolute frequency [SSJ.I]; Percentage [SSJ.I]; Relative frequency [SSJ.I]; Absolute frequency [SSJ.T.K.S]; Percentage [SSJ.T.K.S]; Relative frequency [SSJ.T.K.S]

bodo bodo 2,658,670 2.27 % 2,343.07 4 0.22 % 412.07 31,054 0.50 % 781.91 3,637 0.94 % 919.39 1,471,149 2.74 % 2,710.69 302,865 1.51 % 1,615.99 814,748 2.52 % 2,562.59 35,213 0.78 % 820.15
bilo bilo 2,082,016 1.78 % 1,834.87 1 0.06 % 103.02 149,980 2.43 % 3,776.33 6,314 1.64 % 1,596.11 1,019,432 1.90 % 1,878.37 298,941 1.49 % 1,595.05 544,248 1.68 % 1,711.80 63,100 1.39 % 1,469.67
bila bila 1,952,423 1.67 % 1,720.66 0 0 % 0 144,082 2.33 % 3,627.83 5,749 1.49 % 1,453.29 902,435 1.68 % 1,662.80 291,559 1.46 % 1,555.66 543,982 1.68 % 1,710.97 64,616 1.43 % 1,504.98
niso niso 1,033,073 0.88 % 910.44 2 0.11 % 206.04 29,832 0.48 % 751.14 3,575 0.93 % 903.72 525,937 0.98 % 969.07 152,957 0.77 % 816.13 285,738 0.88 % 898.72 35,032 0.77 % 815.94
bili bili 949,365 0.81 % 836.67 1 0.06 % 103.02 39,173 0.64 % 986.33 2,864 0.74 % 723.99 475,580 0.89 % 876.29 138,560 0.69 %
739.31 259,161 0.80 % 815.13 34,026 0.75 % 792.51 bomo bomo 651,882 0.56 % 574.50 0 0 % 0 14,951 0.24 % 376.45 2,964 0.77 % 749.27 340,630 0.63 % 627.63 110,301 0.55 % 588.53 163,942 0.51 % 515.64 19,094 0.42 % 444.72 imajo imaj o 589,974 0.50 % 519.94 1 0.06 % 103.02 8,325 0.14 % 209.61 2,572 0.67 % 650.17 294,609 0.55 % 542.84 109,963 0.55 % 586.73 145,513 0.45 % 457.68 28,991 0.64 % 675.23 pravi prav i 519,683 0.44 % 457.99 0 0 % 0 10,534 0.17 % 265.23 1,232 0.32 % 311.44 278,850 0.52 % 513.80 86,508 0.43 % 461.58 131,590 0.41 % 413.88 10,969 0.24 % 255.48 biti biti 505,624 0.43 % 445.60 4 0.22 % 412.07 21,007 0.34 % 528.93 2,919 0.76 % 737.89 223,980 0.42 % 412.70 96,166 0.48 % 513.11 135,922 0.42 % 427.51 25,626 0.57 % 596.86 povedal pove dal 470,909 0.40 % 415.01 0 0 % 0 17,662 0.29 % 444.71 699 0.18 % 176.70 256,636 0.48 % 472.87 30,466 0.15 % 162.56 162,162 0.50 % 510.04 3,284 0.07 % 76.49 dejal deja l 467,145 0.40 % 411.69 0 0 % 0 7,833 0.13 % 197.23 81 0.02 % 20.48 196,501 0.37 % 362.07 16,632 0.08 % 88.74 243,317 0.75 % 765.30 2,781 0.06 % 64.77 imel imel 456,762 0.39 % 402.54 0 0 % 0 42,126 0.68 % 1,060.69 950 0.25 % 240.15 202,873 0.38 % 373.81 66,132 0.33 % 352.86 131,752 0.41 % 414.39 12,929 0.29 % 301.13 pomeni pome ni 439,400 0.38 % 387.24 1 0.06 % 103.02 9,080 0.15 % 228.62 1,576 0.41 % 398.40 202,050 0.38 % 372.29 91,731 0.46 % 489.45 112,511 0.35 % 353.88 22,451 0.50 % 522.91 mora mora 437,226 0.37 % 385.33 3 0.17 % 309.06 13,897 0.23 % 349.91 5,423 1.41 % 1,370.88 196,118 0.36 % 361.36 77,719 0.39 % 414.68 120,038 0.37 % 377.55 24,028 0.53 % 559.64 imeli imel i 424,733 0.36 % 374.32 0 0 % 0 11,741 0.19 % 295.63 995 0.26 % 251.53 227,147 0.42 % 418.53 63,176 0.32 % 337.09 110,375 0.34 % 347.16 11,299 0.25 % 263.17 boste bost e 424,723 0.36 % 374.31 8 0.45 % 824.15 11,314 0.18 % 284.87 1,143 0.30 % 288.94 144,008 0.27 % 265.34 176,813 0.88 % 943.42 60,928 0.19 % 191.63 30,509 0.67 % 710.59 bile bile 375,459 0.32 % 330.89 1 0.06 % 103.02 20,500 
0.33 % 516.17 1,483 0.38 % 374.89 178,752 0.33 % 329.36 57,280 0.29 % 305.63 99,776 0.31 % 313.82 17,667 0.39 % 411.49 morali mora li 356,817 0.30 % 314.46 0 0 % 0 7,714 0.12 % 194.23 654 0.17 % 165.32 196,175 0.36 % 361.47 48,407 0.24 % 258.28 94,072 0.29 % 295.88 9,795 0.22 % 228.14 nisem nise m 333,913 0.28 % 294.28 0 0 % 0 57,684 0.94 % 1,452.42 857 0.22 % 216.64 129,813 0.24 % 239.19 61,255 0.31 % 326.84 76,320 0.24 % 240.05 7,984 0.18 % 185.96 imela imel a 312,385 0.27 % 275.30 0 0 % 0 29,555 0.48 % 744.16 640 0.17 % 161.79 136,559 0.25 % 251.62 52,286 0.26 % 278.98 84,290 0.26 % 265.11 9,055 0.20 % 210.90 moral mora l 287,118 0.24 % 253.04 0 0 % 0 23,576 0.38 % 593.62 571 0.15 % 144.34 134,971 0.25 % 248.69 36,532 0.18 % 194.92 84,515 0.26 % 265.82 6,953 0.15 % 161.94 bosta bost a 273,268 0.23 % 240.83 0 0 % 0 8,212 0.13 % 206.77 451 0.12 % 114.01 135,358 0.25 % 249.41 35,885 0.18 % 191.47 90,196 0.28 % 283.69 3,166 0.07 % 73.74 začel zače l 271,468 0.23 % 239.24 0 0 % 0 16,850 0.27 % 424.26 485 0.13 % 122.60 128,069 0.24 % 235.98 36,310 0.18 % 193.74 83,600 0.26 % 262.94 6,154 0.14 % 143.33 dobil dobi l 251,856 0.21 % 221.96 0 0 % 0 9,123 0.15 % 229.71 370 0.10 % 93.53 126,291 0.23 % 232.70 34,996 0.17 % 186.73 76,748 0.24 % 241.39 4,328 0.10 % 100.80 velja velj a 249,697 0.21 % 220.06 0 0 % 0 1,835 0.03 % 46.20 1,075 0.28 % 271.75 118,417 0.22 % 218.19 49,796 0.25 % 265.70 67,241 0.21 % 211.49 11,333 0.25 % 263.96 začeli zače li 247,919 0.21 % 218.49 0 0 % 0 4,872 0.08 % 122.67 456 0.12 % 115.27 134,875 0.25 % 248.52 33,697 0.17 % 179.80 68,006 0.21 % 213.90 6,013 0.13 % 140.05 imamo imam o 244,638 0.21 % 215.60 1 0.06 % 103.02 4,422 0.07 % 111.34 1,128 0.29 % 285.15 118,225 0.22 % 217.84 45,792 0.23 % 244.33 67,498 0.21 % 212.30 7,572 0.17 % 176.36 začela zače la 244,359 0.21 % 215.35 0 0 % 0 12,890 0.21 % 324.56 362 0.09 % 91.51 115,625 0.21 % 213.05 35,436 0.18 % 189.07 74,767 0.23 % 235.16 5,279 0.12 % 122.95 uspelo uspe lo 240,892 0.21 % 212.30 0 0 % 
0 8,476 0.14 % 213.42 165 0.04 % 41.71 119,483 0.22 % 220.16 36,170 0.18 % 192.99 73,282 0.23 % 230.49 3,316 0.07 % 77.23
dobili dobi li 240,319 0.20 % 211.79 1 0.06 % 103.02 3,115 0.05 % 78.43 556 0.14 % 140.55 140,144 0.26 % 258.22 32,111 0.16 % 171.33 60,285 0.19 % 189.61 4,107 0.09 % 95.66
morajo mora jo 238,365 0.20 % 210.07 2 0.11 % 206.04 2,918 0.05 % 73.47 2,055 0.53 % 519.48 118,656 0.22 % 218.63 35,753 0.18 % 190.77 67,032 0.21 % 210.83 11,949 0.26 % 278.31
kaže kaže 229,873 0.20 % 202.59 0 0 % 0 3,215 0.05 % 80.95 403 0.10 % 101.87 118,682 0.22 % 218.68 35,830 0.18 % 191.18 60,440 0.19 % 190.10 11,303 0.25 % 263.26

1.2.55 List of initial character-level 5-grams from verb lower-case word forms in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-word_parts-verbs-lowercase_forms-initial-5grams-taxonomy-entire.tsv
Columns: Lower-case word form; Initial part of the word; Rest of the word; Total absolute frequency of lower-case word form; Percentage of all found lower-case word forms; Total relative frequency (per million occurrences); Absolute frequency [SSJ.T.K.N]; Percentage [SSJ.T.K.N]; Relative frequency [SSJ.T.K.N]; Absolute frequency [SSJ.T.K.L]; Percentage [SSJ.T.K.L]; Relative frequency [SSJ.T.K.L]; Absolute frequency [SSJ.T.D]; Percentage [SSJ.T.D]; Relative frequency [SSJ.T.D]; Absolute frequency [SSJ.T.P.C]; Percentage [SSJ.T.P.C]; Relative frequency [SSJ.T.P.C]; Absolute frequency [SSJ.T.P.R]; Percentage [SSJ.T.P.R]; Relative frequency [SSJ.T.P.R]; Absolute frequency [SSJ.I]; Percentage [SSJ.I]; Relative frequency [SSJ.I]; Absolute frequency [SSJ.T.K.S]; Percentage [SSJ.T.K.S]; Relative frequency [SSJ.T.K.S]

imajo imajo 589,974 0.58 % 519.94 1 0.06 % 103.02 8,325 0.16 % 209.61 2,572 0.78 % 650.17 294,609 0.64 % 542.84 109,963 0.63 % 586.73 145,513 0.52 % 457.68 28,991 0.73 % 675.23
pravi pravi 519,683 0.52 % 457.99 0 0 % 0 10,534 0.20 % 265.23 1,232 0.38 % 311.44 278,850
0.61 % 513.80 86,508 0.50 % 461.58 131,590 0.47 % 413.88 10,969 0.28 % 255.48 povedal poved al 470,909 0.47 % 415.01 0 0 % 0 17,662 0.33 % 444.71 699 0.21 % 176.70 256,636 0.56 % 472.87 30,466 0.17 % 162.56 162,162 0.58 % 510.04 3,284 0.08 % 76.49 dejal dejal 467,145 0.46 % 411.69 0 0 % 0 7,833 0.15 % 197.23 81 0.03 % 20.48 196,501 0.43 % 362.07 16,632 0.10 % 88.74 243,317 0.87 % 765.30 2,781 0.07 % 64.77 pomeni pomen i 439,400 0.44 % 387.24 1 0.06 % 103.02 9,080 0.17 % 228.62 1,576 0.48 % 398.40 202,050 0.44 % 372.29 91,731 0.53 % 489.45 112,511 0.40 % 353.88 22,451 0.56 % 522.91 imeli imeli 424,733 0.42 % 374.32 0 0 % 0 11,741 0.22 % 295.63 995 0.30 % 251.53 227,147 0.49 % 418.53 63,176 0.36 % 337.09 110,375 0.40 % 347.16 11,299 0.28 % 263.17 boste boste 424,723 0.42 % 374.31 8 0.45 % 824.15 11,314 0.21 % 284.87 1,143 0.35 % 288.94 144,008 0.31 % 265.34 176,813 1.01 % 943.42 60,928 0.22 % 191.63 30,509 0.77 % 710.59 morali moral i 356,817 0.35 % 314.46 0 0 % 0 7,714 0.14 % 194.23 654 0.20 % 165.32 196,175 0.43 % 361.47 48,407 0.28 % 258.28 94,072 0.34 % 295.88 9,795 0.25 % 228.14 nisem nisem 333,913 0.33 % 294.28 0 0 % 0 57,684 1.08 % 1,452.42 857 0.26 % 216.64 129,813 0.28 % 239.19 61,255 0.35 % 326.84 76,320 0.27 % 240.05 7,984 0.20 % 185.96 imela imela 312,385 0.31 % 275.30 0 0 % 0 29,555 0.55 % 744.16 640 0.20 % 161.79 136,559 0.30 % 251.62 52,286 0.30 % 278.98 84,290 0.30 % 265.11 9,055 0.23 % 210.90 moral moral 287,118 0.28 % 253.04 0 0 % 0 23,576 0.44 % 593.62 571 0.17 % 144.34 134,971 0.29 % 248.69 36,532 0.21 % 194.92 84,515 0.30 % 265.82 6,953 0.17 % 161.94 bosta bosta 273,268 0.27 % 240.83 0 0 % 0 8,212 0.15 % 206.77 451 0.14 % 114.01 135,358 0.29 % 249.41 35,885 0.21 % 191.47 90,196 0.32 % 283.69 3,166 0.08 % 73.74 začel začel 271,468 0.27 % 239.24 0 0 % 0 16,850 0.32 % 424.26 485 0.15 % 122.60 128,069 0.28 % 235.98 36,310 0.21 % 193.74 83,600 0.30 % 262.94 6,154 0.15 % 143.33 dobil dobil 251,856 0.25 % 221.96 0 0 % 0 9,123 0.17 % 229.71 370 0.11 % 
93.53 126,291 0.28 % 232.70 34,996 0.20 % 186.73 76,748 0.28 % 241.39 4,328 0.11 % 100.80 velja velja 249,697 0.25 % 220.06 0 0 % 0 1,835 0.03 % 46.20 1,075 0.33 % 271.75 118,417 0.26 % 218.19 49,796 0.29 % 265.70 67,241 0.24 % 211.49 11,333 0.28 % 263.96 začeli začel i 247,919 0.25 % 218.49 0 0 % 0 4,872 0.09 % 122.67 456 0.14 % 115.27 134,875 0.29 % 248.52 33,697 0.19 % 179.80 68,006 0.24 % 213.90 6,013 0.15 % 140.05 imamo imamo 244,638 0.24 % 215.60 1 0.06 % 103.02 4,422 0.08 % 111.34 1,128 0.34 % 285.15 118,225 0.26 % 217.84 45,792 0.26 % 244.33 67,498 0.24 % 212.30 7,572 0.19 % 176.36 začela začel a 244,359 0.24 % 215.35 0 0 % 0 12,890 0.24 % 324.56 362 0.11 % 91.51 115,625 0.25 % 213.05 35,436 0.20 % 189.07 74,767 0.27 % 235.16 5,279 0.13 % 122.95 uspelo uspel o 240,892 0.24 % 212.30 0 0 % 0 8,476 0.16 % 213.42 165 0.05 % 41.71 119,483 0.26 % 220.16 36,170 0.21 % 192.99 73,282 0.26 % 230.49 3,316 0.08 % 77.23 dobili dobil i 240,319 0.24 % 211.79 1 0.06 % 103.02 3,115 0.06 % 78.43 556 0.17 % 140.55 140,144 0.30 % 258.22 32,111 0.18 % 171.33 60,285 0.22 % 189.61 4,107 0.10 % 95.66 morajo moraj o 238,365 0.24 % 210.07 2 0.11 % 206.04 2,918 0.06 % 73.47 2,055 0.63 % 519.48 118,656 0.26 % 218.63 35,753 0.20 % 190.77 67,032 0.24 % 210.83 11,949 0.30 % 278.31 prišel priše l 211,310 0.21 % 186.23 0 0 % 0 21,234 0.40 % 534.65 480 0.15 % 121.34 92,681 0.20 % 170.77 32,652 0.19 % 174.22 58,948 0.21 % 185.41 5,315 0.13 % 123.79 morala moral a 211,288 0.21 % 186.21 0 0 % 0 15,694 0.29 % 395.16 442 0.14 % 111.73 100,937 0.22 % 185.98 28,381 0.16 % 151.43 61,142 0.22 % 192.31 4,692 0.12 % 109.28 moramo moram o 205,491 0.20 % 181.10 0 0 % 0 4,182 0.08 % 105.30 660 0.20 % 166.84 86,068 0.19 % 158.59 44,596 0.26 % 237.95 53,929 0.19 % 169.62 16,056 0.40 % 373.96 zgodilo zgodi lo 196,054 0.19 % 172.78 0 0 % 0 15,427 0.29 % 388.44 449 0.14 % 113.50 93,840 0.20 % 172.91 31,188 0.18 % 166.41 50,496 0.18 % 158.82 4,654 0.12 % 108.40 postal posta l 195,102 0.19 % 171.94 0 0 % 0 
7,745 0.14 % 195.01 350 0.11 % 88.48 88,788 0.19 % 163.60 33,422 0.19 % 178.33 57,517 0.21 % 180.91 7,280 0.18 % 169.56
prišlo prišl o 190,940 0.19 % 168.27 0 0 % 0 5,570 0.10 % 140.25 615 0.19 % 155.47 94,348 0.21 % 173.84 23,088 0.13 % 123.19 62,015 0.22 % 195.05 5,304 0.13 % 123.54
nismo nismo 181,629 0.18 % 160.07 0 0 % 0 4,707 0.09 % 118.52 531 0.16 % 134.23 95,326 0.21 % 175.64 30,240 0.17 % 161.35 47,229 0.17 % 148.55 3,596 0.09 % 83.76
dodal dodal 179,735 0.18 % 158.40 0 0 % 0 3,209 0.06 % 80.80 47 0.01 % 11.88 73,621 0.16 % 135.65 7,800 0.04 % 41.62 94,091 0.34 % 295.94 967 0.02 % 22.52
rekel rekel 178,062 0.18 % 156.93 0 0 % 0 66,774 1.25 % 1,681.30 1,158 0.35 % 292.73 55,284 0.12 % 101.86 25,116 0.14 % 134.01 23,982 0.09 % 75.43 5,748 0.14 % 133.88
videti videt i 176,327 0.17 % 155.40 0 0 % 0 23,660 0.44 % 595.73 280 0.09 % 70.78 63,466 0.14 % 116.94 41,741 0.24 % 222.72 39,253 0.14 % 123.46 7,927 0.20 % 184.63
želijo želij o 171,909 0.17 % 151.50 1 0.06 % 103.02 932 0.02 % 23.47 214 0.07 % 54.10 81,012 0.18 % 149.27 22,149 0.13 % 118.18 64,068 0.23 % 201.51 3,533 0.09 % 82.29

1.2.56 List of final character-level 1-grams from verb lower-case word forms in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-word_parts-verbs-lowercase_forms-final-1grams-taxonomy-entire.tsv
Columns: Lower-case word form; Rest of the word; Final part of the word; Total absolute frequency of lower-case word form; Percentage of all found lower-case word forms; Total relative frequency (per million occurrences); Absolute frequency [SSJ.T.K.N]; Percentage [SSJ.T.K.N]; Relative frequency [SSJ.T.K.N]; Absolute frequency [SSJ.T.K.L]; Percentage [SSJ.T.K.L]; Relative frequency [SSJ.T.K.L]; Absolute frequency [SSJ.T.D]; Percentage [SSJ.T.D]; Relative frequency [SSJ.T.D]; Absolute frequency [SSJ.T.P.C]; Percentage [SSJ.T.P.C]; Relative frequency [SSJ.T.P.C]; Absolute frequency [SSJ.T.P.R]; Percentage [SSJ.T.P.R]; Relative frequency [SSJ.T.P.R]; Absolute frequency [SSJ.I]; Percentage [SSJ.I]; Relative frequency [SSJ.I]; Absolute frequency [SSJ.T.K.S]; Percentage [SSJ.T.K.S]; Relative frequency [SSJ.T.K.S]

je j e 39,852,651 19.88 % 35,121.94 53 2.77 % 5,459.98 2,355,179 21.72 % 59,300.86 95,246 15.86 % 24,077.16 18,126,599 19.56 % 33,399.46 6,003,429 18.26 % 32,032.32 11,999,560 21.30 % 37,741.74 1,272,585 17.83 % 29,640.02
so s o 13,322,285 6.64 % 11,740.86 25 1.31 % 2,575.46 381,659 3.52 % 9,609.76 33,146 5.52 % 8,378.95 6,609,847 7.13 % 12,179.08 1,974,260 6.00 % 10,534 3,829,327 6.80 % 12,044.23 494,021 6.92 % 11,506.34
bi b i 6,497,427 3.24 % 5,726.15 4 0.21 % 412.07 388,773 3.58 % 9,788.88 19,288 3.21 % 4,875.80 3,212,413 3.47 % 5,919.08 1,006,674 3.06 % 5,371.28 1,665,728 2.96 % 5,239.15 204,547 2.87 % 4,764.14
bo b o 5,991,999 2.99 % 5,280.72 9 0.47 % 927.17 165,711 1.53 % 4,172.42 10,939 1.82 % 2,765.26 3,054,378 3.29 % 5,627.89 827,838 2.52 % 4,417.07 1,837,278 3.26 % 5,778.72 95,846 1.34 % 2,232.37
ni n i 4,416,652 2.20 % 3,892.37 0 0 % 0 283,727 2.62 % 7,143.94 15,184 2.53 % 3,838.35 2,061,015 2.22 % 3,797.56 719,621 2.19 % 3,839.66 1,202,740 2.13 % 3,782.93 134,365 1.88 % 3,129.52
bodo bod o 2,658,670 1.33 % 2,343.07 4 0.21 % 412.07 31,054 0.29 % 781.91 3,637 0.61 % 919.39 1,471,149 1.59 % 2,710.69 302,865 0.92 % 1,615.99 814,748 1.45 % 2,562.59 35,213 0.49 % 820.15
sem se m 2,397,474 1.20 % 2,112.88 0 0 % 0 402,492 3.71 % 10,134.31 6,808 1.13 % 1,720.99 928,864 1.00 % 1,711.49 454,648 1.38 % 2,425.85 534,969 0.95 % 1,682.62 69,693 0.98 % 1,623.23
bil bi l 2,389,569 1.19 % 2,105.92 0 0 % 0 167,677 1.55 % 4,221.93 6,270 1.04 % 1,584.99 1,107,569 1.20 % 2,040.77 323,764 0.98 % 1,727.50 713,850 1.27 % 2,245.24 70,439 0.99 % 1,640.61
bilo bil o 2,082,016 1.04 % 1,834.87 1 0.05 % 103.02 149,980 1.38 % 3,776.33 6,314 1.05 % 1,596.11 1,019,432 1.10 % 1,878.37 298,941 0.91 % 1,595.05 544,248 0.97 % 1,711.80 63,100 0.88 % 1,469.67
bila bil a 1,952,423 0.97 %
1,720.66 0 0 % 0 144,082 1.33 % 3,627.83 5,749 0.96 % 1,453.29 902,435 0.97 % 1,662.80 291,559 0.89 % 1,555.66 543,982 0.97 % 1,710.97 64,616 0.91 % 1,504.98 smo sm o 1,909,803 0.95 % 1,683.10 0 0 % 0 44,381 0.41 % 1,117.47 5,673 0.94 % 1,434.07 994,108 1.07 % 1,831.71 348,648 1.06 % 1,860.27 471,393 0.84 % 1,482.65 45,600 0.64 % 1,062.08 sta st a 1,875,334 0.94 % 1,652.72 2 0.10 % 206.04 96,486 0.89 % 2,429.41 2,861 0.48 % 723.23 881,185 0.95 % 1,623.64 274,754 0.84 % 1,466 566,839 1.01 % 1,782.86 53,207 0.74 % 1,239.25 ima im a 1,068,307 0.53 % 941.49 5 0.26 % 515.09 28,747 0.27 % 723.82 4,891 0.81 % 1,236.39 479,110 0.52 % 882.79 225,335 0.69 % 1,202.31 286,154 0.51 % 900.03 44,065 0.62 % 1,026.33 niso nis o 1,033,073 0.52 % 910.44 2 0.10 % 206.04 29,832 0.28 % 751.14 3,575 0.59 % 903.72 525,937 0.57 % 969.07 152,957 0.47 % 816.13 285,738 0.51 % 898.72 35,032 0.49 % 815.94 bili bil i 949,365 0.47 % 836.67 1 0.05 % 103.02 39,173 0.36 % 986.33 2,864 0.48 % 723.99 475,580 0.51 % 876.29 138,560 0.42 % 739.31 259,161 0.46 % 815.13 34,026 0.48 % 792.51 gre gr e 814,855 0.41 % 718.13 1 0.05 % 103.02 19,458 0.18 % 489.93 2,827 0.47 % 714.64 383,213 0.41 % 706.10 137,971 0.42 % 736.17 247,421 0.44 % 778.20 23,964 0.34 % 558.15 bomo bom o 651,882 0.33 % 574.50 0 0 % 0 14,951 0.14 % 376.45 2,964 0.49 % 749.27 340,630 0.37 % 627.63 110,301 0.34 % 588.53 163,942 0.29 % 515.64 19,094 0.27 % 444.72 imajo imaj o 589,974 0.29 % 519.94 1 0.05 % 103.02 8,325 0.08 % 209.61 2,572 0.43 % 650.17 294,609 0.32 % 542.84 109,963 0.33 % 586.73 145,513 0.26 % 457.68 28,991 0.41 % 675.23 pravi prav i 519,683 0.26 % 457.99 0 0 % 0 10,534 0.10 % 265.23 1,232 0.20 % 311.44 278,850 0.30 % 513.80 86,508 0.26 % 461.58 131,590 0.23 % 413.88 10,969 0.15 % 255.48 biti bit i 505,624 0.25 % 445.60 4 0.21 % 412.07 21,007 0.19 % 528.93 2,919 0.49 % 737.89 223,980 0.24 % 412.70 96,166 0.29 % 513.11 135,922 0.24 % 427.51 25,626 0.36 % 596.86 povedal poveda l 470,909 0.23 % 415.01 0 0 % 0 17,662 0.16 % 
444.71 699 0.12 % 176.70 256,636 0.28 % 472.87 30,466 0.09 % 162.56 162,162 0.29 % 510.04 3,284 0.05 % 76.49 dejal deja l 467,145 0.23 % 411.69 0 0 % 0 7,833 0.07 % 197.23 81 0.01 % 20.48 196,501 0.21 % 362.07 16,632 0.05 % 88.74 243,317 0.43 % 765.30 2,781 0.04 % 64.77 imel ime l 456,762 0.23 % 402.54 0 0 % 0 42,126 0.39 % 1,060.69 950 0.16 % 240.15 202,873 0.22 % 373.81 66,132 0.20 % 352.86 131,752 0.23 % 414.39 12,929 0.18 % 301.13 pomeni pomen i 439,400 0.22 % 387.24 1 0.05 % 103.02 9,080 0.08 % 228.62 1,576 0.26 % 398.40 202,050 0.22 % 372.29 91,731 0.28 % 489.45 112,511 0.20 % 353.88 22,451 0.32 % 522.91 mora mor a 437,226 0.22 % 385.33 3 0.16 % 309.06 13,897 0.13 % 349.91 5,423 0.90 % 1,370.88 196,118 0.21 % 361.36 77,719 0.24 % 414.68 120,038 0.21 % 377.55 24,028 0.34 % 559.64 imeli imel i 424,733 0.21 % 374.32 0 0 % 0 11,741 0.11 % 295.63 995 0.17 % 251.53 227,147 0.24 % 418.53 63,176 0.19 % 337.09 110,375 0.20 % 347.16 11,299 0.16 % 263.17 boste bost e 424,723 0.21 % 374.31 8 0.42 % 824.15 11,314 0.10 % 284.87 1,143 0.19 % 288.94 144,008 0.15 % 265.34 176,813 0.54 % 943.42 60,928 0.11 % 191.63 30,509 0.43 % 710.59 ste st e 389,782 0.19 % 343.51 16 0.84 % 1,648.30 23,529 0.22 % 592.43 1,606 0.27 % 405.98 155,988 0.17 % 287.42 111,345 0.34 % 594.10 74,884 0.13 % 235.53 22,414 0.31 % 522.05 bile bil e 375,459 0.19 % 330.89 1 0.05 % 103.02 20,500 0.19 % 516.17 1,483 0.25 % 374.89 178,752 0.19 % 329.36 57,280 0.17 % 305.63 99,776 0.18 % 313.82 17,667 0.25 % 411.49 bom bo m 366,374 0.18 % 322.88 0 0 % 0 59,430 0.55 % 1,496.38 1,893 0.32 % 478.53 150,679 0.16 % 277.64 60,672 0.18 % 323.73 83,377 0.15 % 262.24 10,323 0.14 % 240.43 morali moral i 356,817 0.18 % 314.46 0 0 % 0 7,714 0.07 % 194.23 654 0.11 % 165.32 196,175 0.21 % 361.47 48,407 0.15 % 258.28 94,072 0.17 % 295.88 9,795 0.14 % 228.14 nisem nise m 333,913 0.17 % 294.28 0 0 % 0 57,684 0.53 % 1,452.42 857 0.14 % 216.64 129,813 0.14 % 239.19 61,255 0.19 % 326.84 76,320 0.14 % 240.05 7,984 0.11 % 185.96 
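The way the word-part columns in the initial and final n-gram lists relate to the word form can be illustrated with a short sketch. The Python below is inferred only from the rows printed in this guide (for example, povedal appears as pov + edal in the initial 3-gram list and as poveda + l in the final 1-gram list); it is not the code of the LIST program, and the function names split_initial and split_final are hypothetical. It also assumes that word forms shorter than n are left out of a list, which would be consistent with je and so being absent from the 3-gram rows shown.

```python
from typing import Optional, Tuple


def split_initial(word: str, n: int) -> Optional[Tuple[str, str]]:
    """Return (initial n-gram, rest of the word).

    Returns None when the word has fewer than n characters
    (assumption: such forms do not appear in the n-gram list).
    """
    if n < 1:
        raise ValueError("n must be a positive integer")
    if len(word) < n:
        return None
    return word[:n], word[n:]


def split_final(word: str, n: int) -> Optional[Tuple[str, str]]:
    """Return (rest of the word, final n-gram).

    Returns None when the word has fewer than n characters
    (assumption: such forms do not appear in the n-gram list).
    """
    if n < 1:
        raise ValueError("n must be a positive integer")
    if len(word) < n:
        return None
    return word[:-n], word[-n:]


if __name__ == "__main__":
    for form in ("povedal", "sem", "je"):
        print(form, split_initial(form, 3), split_final(form, 1))
```

If shorter forms are indeed dropped, the total behind the "Percentage of all found lower-case word forms" column shrinks as n grows. This would account for bodo being listed at 1.33 % in the 1-gram and 2-gram lists but at 2.04 % in the initial 3-gram list, while its absolute and relative frequencies stay the same.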
1.2.57 List of final character-level 2-grams from verb lower-case word forms in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-word_parts-verbs-lowercase_forms-final-2grams-taxonomy-entire.tsv
Columns: Lower-case word form; Rest of the word; Final part of the word; Total absolute frequency of lower-case word form; Percentage of all found lower-case word forms; Total relative frequency (per million occurrences); Absolute frequency [SSJ.T.K.N]; Percentage [SSJ.T.K.N]; Relative frequency [SSJ.T.K.N]; Absolute frequency [SSJ.T.K.L]; Percentage [SSJ.T.K.L]; Relative frequency [SSJ.T.K.L]; Absolute frequency [SSJ.T.D]; Percentage [SSJ.T.D]; Relative frequency [SSJ.T.D]; Absolute frequency [SSJ.T.P.C]; Percentage [SSJ.T.P.C]; Relative frequency [SSJ.T.P.C]; Absolute frequency [SSJ.T.P.R]; Percentage [SSJ.T.P.R]; Relative frequency [SSJ.T.P.R]; Absolute frequency [SSJ.I]; Percentage [SSJ.I]; Relative frequency [SSJ.I]; Absolute frequency [SSJ.T.K.S]; Percentage [SSJ.T.K.S]; Relative frequency [SSJ.T.K.S]

je je 39,852,651 19.88 % 35,121.94 53 2.77 % 5,459.98 2,355,179 21.72 % 59,300.86 95,246 15.86 % 24,077.16 18,126,599 19.56 % 33,399.46 6,003,429 18.26 % 32,032.32 11,999,560 21.30 % 37,741.74 1,272,585 17.83 % 29,640.02
so so 13,322,285 6.64 % 11,740.86 25 1.31 % 2,575.46 381,659 3.52 % 9,609.76 33,146 5.52 % 8,378.95 6,609,847 7.13 % 12,179.08 1,974,260 6.00 % 10,534 3,829,327 6.80 % 12,044.23 494,021 6.92 % 11,506.34
bi bi 6,497,427 3.24 % 5,726.15 4 0.21 % 412.07 388,773 3.58 % 9,788.88 19,288 3.21 % 4,875.80 3,212,413 3.47 % 5,919.08 1,006,674 3.06 % 5,371.28 1,665,728 2.96 % 5,239.15 204,547 2.87 % 4,764.14
bo bo 5,991,999 2.99 % 5,280.72 9 0.47 % 927.17 165,711 1.53 % 4,172.42 10,939 1.82 % 2,765.26 3,054,378 3.29 % 5,627.89 827,838 2.52 % 4,417.07 1,837,278 3.26 % 5,778.72 95,846 1.34 % 2,232.37
ni ni 4,416,652 2.20 % 3,892.37 0 0 % 0 283,727 2.62 % 7,143.94 15,184 2.53 % 3,838.35
2,061,015 2.22 % 3,797.56 719,621 2.19 % 3,839.66 1,202,740 2.13 % 3,782.93 134,365 1.88 % 3,129.52 bodo bo do 2,658,670 1.33 % 2,343.07 4 0.21 % 412.07 31,054 0.29 % 781.91 3,637 0.61 % 919.39 1,471,149 1.59 % 2,710.69 302,865 0.92 % 1,615.99 814,748 1.45 % 2,562.59 35,213 0.49 % 820.15 sem s em 2,397,474 1.20 % 2,112.88 0 0 % 0 402,492 3.71 % 10,134.31 6,808 1.13 % 1,720.99 928,864 1.00 % 1,711.49 454,648 1.38 % 2,425.85 534,969 0.95 % 1,682.62 69,693 0.98 % 1,623.23 bil b il 2,389,569 1.19 % 2,105.92 0 0 % 0 167,677 1.55 % 4,221.93 6,270 1.04 % 1,584.99 1,107,569 1.20 % 2,040.77 323,764 0.98 % 1,727.50 713,850 1.27 % 2,245.24 70,439 0.99 % 1,640.61 bilo bi lo 2,082,016 1.04 % 1,834.87 1 0.05 % 103.02 149,980 1.38 % 3,776.33 6,314 1.05 % 1,596.11 1,019,432 1.10 % 1,878.37 298,941 0.91 % 1,595.05 544,248 0.97 % 1,711.80 63,100 0.88 % 1,469.67 bila bi la 1,952,423 0.97 % 1,720.66 0 0 % 0 144,082 1.33 % 3,627.83 5,749 0.96 % 1,453.29 902,435 0.97 % 1,662.80 291,559 0.89 % 1,555.66 543,982 0.97 % 1,710.97 64,616 0.91 % 1,504.98 smo s mo 1,909,803 0.95 % 1,683.10 0 0 % 0 44,381 0.41 % 1,117.47 5,673 0.94 % 1,434.07 994,108 1.07 % 1,831.71 348,648 1.06 % 1,860.27 471,393 0.84 % 1,482.65 45,600 0.64 % 1,062.08 sta s ta 1,875,334 0.94 % 1,652.72 2 0.10 % 206.04 96,486 0.89 % 2,429.41 2,861 0.48 % 723.23 881,185 0.95 % 1,623.64 274,754 0.84 % 1,466 566,839 1.01 % 1,782.86 53,207 0.74 % 1,239.25 ima i ma 1,068,307 0.53 % 941.49 5 0.26 % 515.09 28,747 0.27 % 723.82 4,891 0.81 % 1,236.39 479,110 0.52 % 882.79 225,335 0.69 % 1,202.31 286,154 0.51 % 900.03 44,065 0.62 % 1,026.33 niso ni so 1,033,073 0.52 % 910.44 2 0.10 % 206.04 29,832 0.28 % 751.14 3,575 0.59 % 903.72 525,937 0.57 % 969.07 152,957 0.47 % 816.13 285,738 0.51 % 898.72 35,032 0.49 % 815.94 bili bi li 949,365 0.47 % 836.67 1 0.05 % 103.02 39,173 0.36 % 986.33 2,864 0.48 % 723.99 475,580 0.51 % 876.29 138,560 0.42 % 739.31 259,161 0.46 % 815.13 34,026 0.48 % 792.51 gre g re 814,855 0.41 % 718.13 1 0.05 % 103.02 
19,458 0.18 % 489.93 2,827 0.47 % 714.64 383,213 0.41 % 706.10 137,971 0.42 % 736.17 247,421 0.44 % 778.20 23,964 0.34 % 558.15 bomo bo mo 651,882 0.33 % 574.50 0 0 % 0 14,951 0.14 % 376.45 2,964 0.49 % 749.27 340,630 0.37 % 627.63 110,301 0.34 % 588.53 163,942 0.29 % 515.64 19,094 0.27 % 444.72 imajo ima jo 589,974 0.29 % 519.94 1 0.05 % 103.02 8,325 0.08 % 209.61 2,572 0.43 % 650.17 294,609 0.32 % 542.84 109,963 0.33 % 586.73 145,513 0.26 % 457.68 28,991 0.41 % 675.23 pravi pra vi 519,683 0.26 % 457.99 0 0 % 0 10,534 0.10 % 265.23 1,232 0.20 % 311.44 278,850 0.30 % 513.80 86,508 0.26 % 461.58 131,590 0.23 % 413.88 10,969 0.15 % 255.48 biti bi ti 505,624 0.25 % 445.60 4 0.21 % 412.07 21,007 0.19 % 528.93 2,919 0.49 % 737.89 223,980 0.24 % 412.70 96,166 0.29 % 513.11 135,922 0.24 % 427.51 25,626 0.36 % 596.86 povedal poved al 470,909 0.23 % 415.01 0 0 % 0 17,662 0.16 % 444.71 699 0.12 % 176.70 256,636 0.28 % 472.87 30,466 0.09 % 162.56 162,162 0.29 % 510.04 3,284 0.05 % 76.49 dejal dej al 467,145 0.23 % 411.69 0 0 % 0 7,833 0.07 % 197.23 81 0.01 % 20.48 196,501 0.21 % 362.07 16,632 0.05 % 88.74 243,317 0.43 % 765.30 2,781 0.04 % 64.77 imel im el 456,762 0.23 % 402.54 0 0 % 0 42,126 0.39 % 1,060.69 950 0.16 % 240.15 202,873 0.22 % 373.81 66,132 0.20 % 352.86 131,752 0.23 % 414.39 12,929 0.18 % 301.13 pomeni pome ni 439,400 0.22 % 387.24 1 0.05 % 103.02 9,080 0.08 % 228.62 1,576 0.26 % 398.40 202,050 0.22 % 372.29 91,731 0.28 % 489.45 112,511 0.20 % 353.88 22,451 0.32 % 522.91 mora mo ra 437,226 0.22 % 385.33 3 0.16 % 309.06 13,897 0.13 % 349.91 5,423 0.90 % 1,370.88 196,118 0.21 % 361.36 77,719 0.24 % 414.68 120,038 0.21 % 377.55 24,028 0.34 % 559.64 imeli ime li 424,733 0.21 % 374.32 0 0 % 0 11,741 0.11 % 295.63 995 0.17 % 251.53 227,147 0.24 % 418.53 63,176 0.19 % 337.09 110,375 0.20 % 347.16 11,299 0.16 % 263.17 boste bos te 424,723 0.21 % 374.31 8 0.42 % 824.15 11,314 0.10 % 284.87 1,143 0.19 % 288.94 144,008 0.15 % 265.34 176,813 0.54 % 943.42 60,928 0.11 % 
191.63 30,509 0.43 % 710.59
ste s te 389,782 0.19 % 343.51 16 0.84 % 1,648.30 23,529 0.22 % 592.43 1,606 0.27 % 405.98 155,988 0.17 % 287.42 111,345 0.34 % 594.10 74,884 0.13 % 235.53 22,414 0.31 % 522.05
bile bi le 375,459 0.19 % 330.89 1 0.05 % 103.02 20,500 0.19 % 516.17 1,483 0.25 % 374.89 178,752 0.19 % 329.36 57,280 0.17 % 305.63 99,776 0.18 % 313.82 17,667 0.25 % 411.49
bom b om 366,374 0.18 % 322.88 0 0 % 0 59,430 0.55 % 1,496.38 1,893 0.32 % 478.53 150,679 0.16 % 277.64 60,672 0.18 % 323.73 83,377 0.15 % 262.24 10,323 0.14 % 240.43
morali mora li 356,817 0.18 % 314.46 0 0 % 0 7,714 0.07 % 194.23 654 0.11 % 165.32 196,175 0.21 % 361.47 48,407 0.15 % 258.28 94,072 0.17 % 295.88 9,795 0.14 % 228.14
nisem nis em 333,913 0.17 % 294.28 0 0 % 0 57,684 0.53 % 1,452.42 857 0.14 % 216.64 129,813 0.14 % 239.19 61,255 0.19 % 326.84 76,320 0.14 % 240.05 7,984 0.11 % 185.96

1.2.58 List of final character-level 3-grams from verb lower-case word forms in the Gigafida 2.0 corpus with text-type distribution

File at CLARIN.SI: GF2.0-word_parts-verbs-lowercase_forms-final-3grams-taxonomy-entire.tsv

Columns: Lower-case word form; Rest of the word; Final part of the word; Total absolute frequency of lower-case word form; Percentage of all found lower-case word forms; Total relative frequency (per million occurrences); Absolute frequency [SSJ.T.K.N]; Percentage [SSJ.T.K.N]; Relative frequency [SSJ.T.K.N]; Absolute frequency [SSJ.T.K.L]; Percentage [SSJ.T.K.L]; Relative frequency [SSJ.T.K.L]; Absolute frequency [SSJ.T.D]; Percentage [SSJ.T.D]; Relative frequency [SSJ.T.D]; Absolute frequency [SSJ.T.P.C]; Percentage [SSJ.T.P.C]; Relative frequency [SSJ.T.P.C]; Absolute frequency [SSJ.T.P.R]; Percentage [SSJ.T.P.R]; Relative frequency [SSJ.T.P.R]; Absolute frequency [SSJ.I]; Percentage [SSJ.I]; Relative frequency [SSJ.I]; Absolute frequency [SSJ.T.K.S]; Percentage [SSJ.T.K.S]; Relative frequency [SSJ.T.K.S]

bodo b odo 2,658,670 2.04 %
2,343.07 4 0.22 % 412.07 31,054 0.43 % 781.91 3,637 0.85 % 919.39 1,471,149 2.47 % 2,710.69 302,865 1.36 % 1,615.99 814,748 2.28 % 2,562.59 35,213 0.71 % 820.15 sem sem 2,397,474 1.84 % 2,112.88 0 0 % 0 402,492 5.57 % 10,134.31 6,808 1.60 % 1,720.99 928,864 1.56 % 1,711.49 454,648 2.04 % 2,425.85 534,969 1.50 % 1,682.62 69,693 1.42 % 1,623.23 bil bil 2,389,569 1.84 % 2,105.92 0 0 % 0 167,677 2.32 % 4,221.93 6,270 1.47 % 1,584.99 1,107,569 1.86 % 2,040.77 323,764 1.45 % 1,727.50 713,850 2.00 % 2,245.24 70,439 1.43 % 1,640.61 bilo b ilo 2,082,016 1.60 % 1,834.87 1 0.06 % 103.02 149,980 2.08 % 3,776.33 6,314 1.48 % 1,596.11 1,019,432 1.71 % 1,878.37 298,941 1.34 % 1,595.05 544,248 1.52 % 1,711.80 63,100 1.28 % 1,469.67 bila b ila 1,952,423 1.50 % 1,720.66 0 0 % 0 144,082 2.00 % 3,627.83 5,749 1.35 % 1,453.29 902,435 1.52 % 1,662.80 291,559 1.31 % 1,555.66 543,982 1.52 % 1,710.97 64,616 1.31 % 1,504.98 smo smo 1,909,803 1.47 % 1,683.10 0 0 % 0 44,381 0.61 % 1,117.47 5,673 1.33 % 1,434.07 994,108 1.67 % 1,831.71 348,648 1.57 % 1,860.27 471,393 1.32 % 1,482.65 45,600 0.93 % 1,062.08 sta sta 1,875,334 1.44 % 1,652.72 2 0.11 % 206.04 96,486 1.34 % 2,429.41 2,861 0.67 % 723.23 881,185 1.48 % 1,623.64 274,754 1.23 % 1,466 566,839 1.59 % 1,782.86 53,207 1.08 % 1,239.25 ima ima 1,068,307 0.82 % 941.49 5 0.27 % 515.09 28,747 0.40 % 723.82 4,891 1.15 % 1,236.39 479,110 0.81 % 882.79 225,335 1.01 % 1,202.31 286,154 0.80 % 900.03 44,065 0.90 % 1,026.33 niso n iso 1,033,073 0.79 % 910.44 2 0.11 % 206.04 29,832 0.41 % 751.14 3,575 0.84 % 903.72 525,937 0.88 % 969.07 152,957 0.69 % 816.13 285,738 0.80 % 898.72 35,032 0.71 % 815.94 bili b ili 949,365 0.73 % 836.67 1 0.06 % 103.02 39,173 0.54 % 986.33 2,864 0.67 % 723.99 475,580 0.80 % 876.29 138,560 0.62 % 739.31 259,161 0.72 % 815.13 34,026 0.69 % 792.51 gre gre 814,855 0.63 % 718.13 1 0.06 % 103.02 19,458 0.27 % 489.93 2,827 0.66 % 714.64 383,213 0.64 % 706.10 137,971 0.62 % 736.17 247,421 0.69 % 778.20 23,964 0.49 % 558.15 bomo b 
omo 651,882 0.50 % 574.50 0 0 % 0 14,951 0.21 % 376.45 2,964 0.70 % 749.27 340,630 0.57 % 627.63 110,301 0.49 % 588.53 163,942 0.46 % 515.64 19,094 0.39 % 444.72 imajo im ajo 589,974 0.45 % 519.94 1 0.06 % 103.02 8,325 0.12 % 209.61 2,572 0.60 % 650.17 294,609 0.49 % 542.84 109,963 0.49 % 586.73 145,513 0.41 % 457.68 28,991 0.59 % 675.23 pravi pr avi 519,683 0.40 % 457.99 0 0 % 0 10,534 0.15 % 265.23 1,232 0.29 % 311.44 278,850 0.47 % 513.80 86,508 0.39 % 461.58 131,590 0.37 % 413.88 10,969 0.22 % 255.48 biti b iti 505,624 0.39 % 445.60 4 0.22 % 412.07 21,007 0.29 % 528.93 2,919 0.69 % 737.89 223,980 0.38 % 412.70 96,166 0.43 % 513.11 135,922 0.38 % 427.51 25,626 0.52 % 596.86 povedal pove dal 470,909 0.36 % 415.01 0 0 % 0 17,662 0.24 % 444.71 699 0.16 % 176.70 256,636 0.43 % 472.87 30,466 0.14 % 162.56 162,162 0.45 % 510.04 3,284 0.07 % 76.49 dejal de jal 467,145 0.36 % 411.69 0 0 % 0 7,833 0.11 % 197.23 81 0.02 % 20.48 196,501 0.33 % 362.07 16,632 0.07 % 88.74 243,317 0.68 % 765.30 2,781 0.06 % 64.77 imel i mel 456,762 0.35 % 402.54 0 0 % 0 42,126 0.58 % 1,060.69 950 0.22 % 240.15 202,873 0.34 % 373.81 66,132 0.30 % 352.86 131,752 0.37 % 414.39 12,929 0.26 % 301.13 pomeni pom eni 439,400 0.34 % 387.24 1 0.06 % 103.02 9,080 0.13 % 228.62 1,576 0.37 % 398.40 202,050 0.34 % 372.29 91,731 0.41 % 489.45 112,511 0.32 % 353.88 22,451 0.46 % 522.91 mora m ora 437,226 0.34 % 385.33 3 0.17 % 309.06 13,897 0.19 % 349.91 5,423 1.27 % 1,370.88 196,118 0.33 % 361.36 77,719 0.35 % 414.68 120,038 0.34 % 377.55 24,028 0.49 % 559.64 imeli im eli 424,733 0.33 % 374.32 0 0 % 0 11,741 0.16 % 295.63 995 0.23 % 251.53 227,147 0.38 % 418.53 63,176 0.28 % 337.09 110,375 0.31 % 347.16 11,299 0.23 % 263.17 boste bo ste 424,723 0.33 % 374.31 8 0.44 % 824.15 11,314 0.16 % 284.87 1,143 0.27 % 288.94 144,008 0.24 % 265.34 176,813 0.79 % 943.42 60,928 0.17 % 191.63 30,509 0.62 % 710.59 ste ste 389,782 0.30 % 343.51 16 0.88 % 1,648.30 23,529 0.33 % 592.43 1,606 0.38 % 405.98 155,988 0.26 % 
287.42 111,345 0.50 % 594.10 74,884 0.21 % 235.53 22,414 0.46 % 522.05
bile b ile 375,459 0.29 % 330.89 1 0.06 % 103.02 20,500 0.28 % 516.17 1,483 0.35 % 374.89 178,752 0.30 % 329.36 57,280 0.26 % 305.63 99,776 0.28 % 313.82 17,667 0.36 % 411.49
bom bom 366,374 0.28 % 322.88 0 0 % 0 59,430 0.82 % 1,496.38 1,893 0.45 % 478.53 150,679 0.25 % 277.64 60,672 0.27 % 323.73 83,377 0.23 % 262.24 10,323 0.21 % 240.43
morali mor ali 356,817 0.27 % 314.46 0 0 % 0 7,714 0.11 % 194.23 654 0.15 % 165.32 196,175 0.33 % 361.47 48,407 0.22 % 258.28 94,072 0.26 % 295.88 9,795 0.20 % 228.14
nisem ni sem 333,913 0.26 % 294.28 0 0 % 0 57,684 0.80 % 1,452.42 857 0.20 % 216.64 129,813 0.22 % 239.19 61,255 0.28 % 326.84 76,320 0.21 % 240.05 7,984 0.16 % 185.96
imela im ela 312,385 0.24 % 275.30 0 0 % 0 29,555 0.41 % 744.16 640 0.15 % 161.79 136,559 0.23 % 251.62 52,286 0.23 % 278.98 84,290 0.24 % 265.11 9,055 0.18 % 210.90
moral mo ral 287,118 0.22 % 253.04 0 0 % 0 23,576 0.33 % 593.62 571 0.13 % 144.34 134,971 0.23 % 248.69 36,532 0.16 % 194.92 84,515 0.24 % 265.82 6,953 0.14 % 161.94
bosta bo sta 273,268 0.21 % 240.83 0 0 % 0 8,212 0.11 % 206.77 451 0.11 % 114.01 135,358 0.23 % 249.41 35,885 0.16 % 191.47 90,196 0.25 % 283.69 3,166 0.06 % 73.74
začel za čel 271,468 0.21 % 239.24 0 0 % 0 16,850 0.23 % 424.26 485 0.11 % 122.60 128,069 0.21 % 235.98 36,310 0.16 % 193.74 83,600 0.23 % 262.94 6,154 0.12 % 143.33
dobil do bil 251,856 0.19 % 221.96 0 0 % 0 9,123 0.13 % 229.71 370 0.09 % 93.53 126,291 0.21 % 232.70 34,996 0.16 % 186.73 76,748 0.21 % 241.39 4,328 0.09 % 100.80

1.2.59 List of final character-level 4-grams from verb lower-case word forms in the Gigafida 2.0 corpus with text-type distribution

File at CLARIN.SI: GF2.0-word_parts-verbs-lowercase_forms-final-4grams-taxonomy-entire.tsv

Columns: Lower-case word form; Rest of the word; Final part of the word; Total absolute frequency of lower-case word form; Percentage of all found lower-case word forms; Total relative frequency (per million occurrences); Absolute frequency [SSJ.T.K.N]; Percentage [SSJ.T.K.N]; Relative frequency [SSJ.T.K.N]; Absolute frequency [SSJ.T.K.L]; Percentage [SSJ.T.K.L]; Relative frequency [SSJ.T.K.L]; Absolute frequency [SSJ.T.D]; Percentage [SSJ.T.D]; Relative frequency [SSJ.T.D]; Absolute frequency [SSJ.T.P.C]; Percentage [SSJ.T.P.C]; Relative frequency [SSJ.T.P.C]; Absolute frequency [SSJ.T.P.R]; Percentage [SSJ.T.P.R]; Relative frequency [SSJ.T.P.R]; Absolute frequency [SSJ.I]; Percentage [SSJ.I]; Relative frequency [SSJ.I]; Absolute frequency [SSJ.T.K.S]; Percentage [SSJ.T.K.S]; Relative frequency [SSJ.T.K.S]

bodo bodo 2,658,670 2.27 % 2,343.07 4 0.22 % 412.07 31,054 0.50 % 781.91 3,637 0.94 % 919.39 1,471,149 2.74 % 2,710.69 302,865 1.51 % 1,615.99 814,748 2.52 % 2,562.59 35,213 0.78 % 820.15
bilo bilo 2,082,016 1.78 % 1,834.87 1 0.06 % 103.02 149,980 2.43 % 3,776.33 6,314 1.64 % 1,596.11 1,019,432 1.90 % 1,878.37 298,941 1.49 % 1,595.05 544,248 1.68 % 1,711.80 63,100 1.39 % 1,469.67
bila bila 1,952,423 1.67 % 1,720.66 0 0 % 0 144,082 2.33 % 3,627.83 5,749 1.49 % 1,453.29 902,435 1.68 % 1,662.80 291,559 1.46 % 1,555.66 543,982 1.68 % 1,710.97 64,616 1.43 % 1,504.98
niso niso 1,033,073 0.88 % 910.44 2 0.11 % 206.04 29,832 0.48 % 751.14 3,575 0.93 % 903.72 525,937 0.98 % 969.07 152,957 0.77 % 816.13 285,738 0.88 % 898.72 35,032 0.77 % 815.94
bili bili 949,365 0.81 % 836.67 1 0.06 % 103.02 39,173 0.64 % 986.33 2,864 0.74 % 723.99 475,580 0.89 % 876.29 138,560 0.69 % 739.31 259,161 0.80 % 815.13 34,026 0.75 % 792.51
bomo bomo 651,882 0.56 % 574.50 0 0 % 0 14,951 0.24 % 376.45 2,964 0.77 % 749.27 340,630 0.63 % 627.63 110,301 0.55 % 588.53 163,942 0.51 % 515.64 19,094 0.42 % 444.72
imajo i majo 589,974 0.50 % 519.94 1 0.06 % 103.02 8,325 0.14 % 209.61 2,572 0.67 % 650.17 294,609 0.55 % 542.84 109,963 0.55 % 586.73 145,513 0.45 % 457.68 28,991 0.64 % 675.23
pravi p ravi 519,683 0.44 % 457.99 0 0 % 0 10,534 0.17 % 265.23
1,232 0.32 % 311.44 278,850 0.52 % 513.80 86,508 0.43 % 461.58 131,590 0.41 % 413.88 10,969 0.24 % 255.48 biti biti 505,624 0.43 % 445.60 4 0.22 % 412.07 21,007 0.34 % 528.93 2,919 0.76 % 737.89 223,980 0.42 % 412.70 96,166 0.48 % 513.11 135,922 0.42 % 427.51 25,626 0.57 % 596.86 povedal pov edal 470,909 0.40 % 415.01 0 0 % 0 17,662 0.29 % 444.71 699 0.18 % 176.70 256,636 0.48 % 472.87 30,466 0.15 % 162.56 162,162 0.50 % 510.04 3,284 0.07 % 76.49 dejal d ejal 467,145 0.40 % 411.69 0 0 % 0 7,833 0.13 % 197.23 81 0.02 % 20.48 196,501 0.37 % 362.07 16,632 0.08 % 88.74 243,317 0.75 % 765.30 2,781 0.06 % 64.77 imel imel 456,762 0.39 % 402.54 0 0 % 0 42,126 0.68 % 1,060.69 950 0.25 % 240.15 202,873 0.38 % 373.81 66,132 0.33 % 352.86 131,752 0.41 % 414.39 12,929 0.29 % 301.13 pomeni po meni 439,400 0.38 % 387.24 1 0.06 % 103.02 9,080 0.15 % 228.62 1,576 0.41 % 398.40 202,050 0.38 % 372.29 91,731 0.46 % 489.45 112,511 0.35 % 353.88 22,451 0.50 % 522.91 mora mora 437,226 0.37 % 385.33 3 0.17 % 309.06 13,897 0.23 % 349.91 5,423 1.41 % 1,370.88 196,118 0.36 % 361.36 77,719 0.39 % 414.68 120,038 0.37 % 377.55 24,028 0.53 % 559.64 imeli i meli 424,733 0.36 % 374.32 0 0 % 0 11,741 0.19 % 295.63 995 0.26 % 251.53 227,147 0.42 % 418.53 63,176 0.32 % 337.09 110,375 0.34 % 347.16 11,299 0.25 % 263.17 boste b oste 424,723 0.36 % 374.31 8 0.45 % 824.15 11,314 0.18 % 284.87 1,143 0.30 % 288.94 144,008 0.27 % 265.34 176,813 0.88 % 943.42 60,928 0.19 % 191.63 30,509 0.67 % 710.59 bile bile 375,459 0.32 % 330.89 1 0.06 % 103.02 20,500 0.33 % 516.17 1,483 0.38 % 374.89 178,752 0.33 % 329.36 57,280 0.29 % 305.63 99,776 0.31 % 313.82 17,667 0.39 % 411.49 morali mo rali 356,817 0.30 % 314.46 0 0 % 0 7,714 0.12 % 194.23 654 0.17 % 165.32 196,175 0.36 % 361.47 48,407 0.24 % 258.28 94,072 0.29 % 295.88 9,795 0.22 % 228.14 nisem n isem 333,913 0.28 % 294.28 0 0 % 0 57,684 0.94 % 1,452.42 857 0.22 % 216.64 129,813 0.24 % 239.19 61,255 0.31 % 326.84 76,320 0.24 % 240.05 7,984 0.18 % 185.96 imela i 
mela 312,385 0.27 % 275.30 0 0 % 0 29,555 0.48 % 744.16 640 0.17 % 161.79 136,559 0.25 % 251.62 52,286 0.26 % 278.98 84,290 0.26 % 265.11 9,055 0.20 % 210.90 moral m oral 287,118 0.24 % 253.04 0 0 % 0 23,576 0.38 % 593.62 571 0.15 % 144.34 134,971 0.25 % 248.69 36,532 0.18 % 194.92 84,515 0.26 % 265.82 6,953 0.15 % 161.94 bosta b osta 273,268 0.23 % 240.83 0 0 % 0 8,212 0.13 % 206.77 451 0.12 % 114.01 135,358 0.25 % 249.41 35,885 0.18 % 191.47 90,196 0.28 % 283.69 3,166 0.07 % 73.74 začel z ačel 271,468 0.23 % 239.24 0 0 % 0 16,850 0.27 % 424.26 485 0.13 % 122.60 128,069 0.24 % 235.98 36,310 0.18 % 193.74 83,600 0.26 % 262.94 6,154 0.14 % 143.33 dobil d obil 251,856 0.21 % 221.96 0 0 % 0 9,123 0.15 % 229.71 370 0.10 % 93.53 126,291 0.23 % 232.70 34,996 0.17 % 186.73 76,748 0.24 % 241.39 4,328 0.10 % 100.80 velja v elja 249,697 0.21 % 220.06 0 0 % 0 1,835 0.03 % 46.20 1,075 0.28 % 271.75 118,417 0.22 % 218.19 49,796 0.25 % 265.70 67,241 0.21 % 211.49 11,333 0.25 % 263.96 začeli za čeli 247,919 0.21 % 218.49 0 0 % 0 4,872 0.08 % 122.67 456 0.12 % 115.27 134,875 0.25 % 248.52 33,697 0.17 % 179.80 68,006 0.21 % 213.90 6,013 0.13 % 140.05 imamo i mamo 244,638 0.21 % 215.60 1 0.06 % 103.02 4,422 0.07 % 111.34 1,128 0.29 % 285.15 118,225 0.22 % 217.84 45,792 0.23 % 244.33 67,498 0.21 % 212.30 7,572 0.17 % 176.36 začela za čela 244,359 0.21 % 215.35 0 0 % 0 12,890 0.21 % 324.56 362 0.09 % 91.51 115,625 0.21 % 213.05 35,436 0.18 % 189.07 74,767 0.23 % 235.16 5,279 0.12 % 122.95 uspelo us pelo 240,892 0.21 % 212.30 0 0 % 0 8,476 0.14 % 213.42 165 0.04 % 41.71 119,483 0.22 % 220.16 36,170 0.18 % 192.99 73,282 0.23 % 230.49 3,316 0.07 % 77.23 dobili do bili 240,319 0.20 % 211.79 1 0.06 % 103.02 3,115 0.05 % 78.43 556 0.14 % 140.55 140,144 0.26 % 258.22 32,111 0.16 % 171.33 60,285 0.19 % 189.61 4,107 0.09 % 95.66 morajo mo rajo 238,365 0.20 % 210.07 2 0.11 % 206.04 2,918 0.05 % 73.47 2,055 0.53 % 519.48 118,656 0.22 % 218.63 35,753 0.18 % 190.77 67,032 0.21 % 210.83 11,949 0.26 
% 278.31
kaže kaže 229,873 0.20 % 202.59 0 0 % 0 3,215 0.05 % 80.95 403 0.10 % 101.87 118,682 0.22 % 218.68 35,830 0.18 % 191.18 60,440 0.19 % 190.10 11,303 0.25 % 263.26

1.2.60 List of final character-level 5-grams from verb lower-case word forms in the Gigafida 2.0 corpus with text-type distribution

File at CLARIN.SI: GF2.0-word_parts-verbs-lowercase_forms-final-5grams-taxonomy-entire.tsv

Columns: Lower-case word form; Rest of the word; Final part of the word; Total absolute frequency of lower-case word form; Percentage of all found lower-case word forms; Total relative frequency (per million occurrences); Absolute frequency [SSJ.T.K.N]; Percentage [SSJ.T.K.N]; Relative frequency [SSJ.T.K.N]; Absolute frequency [SSJ.T.K.L]; Percentage [SSJ.T.K.L]; Relative frequency [SSJ.T.K.L]; Absolute frequency [SSJ.T.D]; Percentage [SSJ.T.D]; Relative frequency [SSJ.T.D]; Absolute frequency [SSJ.T.P.C]; Percentage [SSJ.T.P.C]; Relative frequency [SSJ.T.P.C]; Absolute frequency [SSJ.T.P.R]; Percentage [SSJ.T.P.R]; Relative frequency [SSJ.T.P.R]; Absolute frequency [SSJ.I]; Percentage [SSJ.I]; Relative frequency [SSJ.I]; Absolute frequency [SSJ.T.K.S]; Percentage [SSJ.T.K.S]; Relative frequency [SSJ.T.K.S]

imajo imajo 589,974 0.58 % 519.94 1 0.06 % 103.02 8,325 0.16 % 209.61 2,572 0.78 % 650.17 294,609 0.64 % 542.84 109,963 0.63 % 586.73 145,513 0.52 % 457.68 28,991 0.73 % 675.23
pravi pravi 519,683 0.52 % 457.99 0 0 % 0 10,534 0.20 % 265.23 1,232 0.38 % 311.44 278,850 0.61 % 513.80 86,508 0.50 % 461.58 131,590 0.47 % 413.88 10,969 0.28 % 255.48
povedal po vedal 470,909 0.47 % 415.01 0 0 % 0 17,662 0.33 % 444.71 699 0.21 % 176.70 256,636 0.56 % 472.87 30,466 0.17 % 162.56 162,162 0.58 % 510.04 3,284 0.08 % 76.49
dejal dejal 467,145 0.46 % 411.69 0 0 % 0 7,833 0.15 % 197.23 81 0.03 % 20.48 196,501 0.43 % 362.07 16,632 0.10 % 88.74 243,317 0.87 % 765.30 2,781 0.07 % 64.77
pomeni p omeni 439,400 0.44 % 387.24 1 0.06 % 103.02 9,080
0.17 % 228.62 1,576 0.48 % 398.40 202,050 0.44 % 372.29 91,731 0.53 % 489.45 112,511 0.40 % 353.88 22,451 0.56 % 522.91 imeli imeli 424,733 0.42 % 374.32 0 0 % 0 11,741 0.22 % 295.63 995 0.30 % 251.53 227,147 0.49 % 418.53 63,176 0.36 % 337.09 110,375 0.40 % 347.16 11,299 0.28 % 263.17 boste boste 424,723 0.42 % 374.31 8 0.45 % 824.15 11,314 0.21 % 284.87 1,143 0.35 % 288.94 144,008 0.31 % 265.34 176,813 1.01 % 943.42 60,928 0.22 % 191.63 30,509 0.77 % 710.59 morali m orali 356,817 0.35 % 314.46 0 0 % 0 7,714 0.14 % 194.23 654 0.20 % 165.32 196,175 0.43 % 361.47 48,407 0.28 % 258.28 94,072 0.34 % 295.88 9,795 0.25 % 228.14 nisem nisem 333,913 0.33 % 294.28 0 0 % 0 57,684 1.08 % 1,452.42 857 0.26 % 216.64 129,813 0.28 % 239.19 61,255 0.35 % 326.84 76,320 0.27 % 240.05 7,984 0.20 % 185.96 imela imela 312,385 0.31 % 275.30 0 0 % 0 29,555 0.55 % 744.16 640 0.20 % 161.79 136,559 0.30 % 251.62 52,286 0.30 % 278.98 84,290 0.30 % 265.11 9,055 0.23 % 210.90 moral moral 287,118 0.28 % 253.04 0 0 % 0 23,576 0.44 % 593.62 571 0.17 % 144.34 134,971 0.29 % 248.69 36,532 0.21 % 194.92 84,515 0.30 % 265.82 6,953 0.17 % 161.94 bosta bosta 273,268 0.27 % 240.83 0 0 % 0 8,212 0.15 % 206.77 451 0.14 % 114.01 135,358 0.29 % 249.41 35,885 0.21 % 191.47 90,196 0.32 % 283.69 3,166 0.08 % 73.74 začel začel 271,468 0.27 % 239.24 0 0 % 0 16,850 0.32 % 424.26 485 0.15 % 122.60 128,069 0.28 % 235.98 36,310 0.21 % 193.74 83,600 0.30 % 262.94 6,154 0.15 % 143.33 dobil dobil 251,856 0.25 % 221.96 0 0 % 0 9,123 0.17 % 229.71 370 0.11 % 93.53 126,291 0.28 % 232.70 34,996 0.20 % 186.73 76,748 0.28 % 241.39 4,328 0.11 % 100.80 velja velja 249,697 0.25 % 220.06 0 0 % 0 1,835 0.03 % 46.20 1,075 0.33 % 271.75 118,417 0.26 % 218.19 49,796 0.29 % 265.70 67,241 0.24 % 211.49 11,333 0.28 % 263.96 začeli z ačeli 247,919 0.25 % 218.49 0 0 % 0 4,872 0.09 % 122.67 456 0.14 % 115.27 134,875 0.29 % 248.52 33,697 0.19 % 179.80 68,006 0.24 % 213.90 6,013 0.15 % 140.05 imamo imamo 244,638 0.24 % 215.60 1 0.06 % 
103.02 4,422 0.08 % 111.34 1,128 0.34 % 285.15 118,225 0.26 % 217.84 45,792 0.26 % 244.33 67,498 0.24 % 212.30 7,572 0.19 % 176.36 začela z ačela 244,359 0.24 % 215.35 0 0 % 0 12,890 0.24 % 324.56 362 0.11 % 91.51 115,625 0.25 % 213.05 35,436 0.20 % 189.07 74,767 0.27 % 235.16 5,279 0.13 % 122.95 uspelo u spelo 240,892 0.24 % 212.30 0 0 % 0 8,476 0.16 % 213.42 165 0.05 % 41.71 119,483 0.26 % 220.16 36,170 0.21 % 192.99 73,282 0.26 % 230.49 3,316 0.08 % 77.23 dobili d obili 240,319 0.24 % 211.79 1 0.06 % 103.02 3,115 0.06 % 78.43 556 0.17 % 140.55 140,144 0.30 % 258.22 32,111 0.18 % 171.33 60,285 0.22 % 189.61 4,107 0.10 % 95.66 morajo m orajo 238,365 0.24 % 210.07 2 0.11 % 206.04 2,918 0.06 % 73.47 2,055 0.63 % 519.48 118,656 0.26 % 218.63 35,753 0.20 % 190.77 67,032 0.24 % 210.83 11,949 0.30 % 278.31 prišel p rišel 211,310 0.21 % 186.23 0 0 % 0 21,234 0.40 % 534.65 480 0.15 % 121.34 92,681 0.20 % 170.77 32,652 0.19 % 174.22 58,948 0.21 % 185.41 5,315 0.13 % 123.79 morala m orala 211,288 0.21 % 186.21 0 0 % 0 15,694 0.29 % 395.16 442 0.14 % 111.73 100,937 0.22 % 185.98 28,381 0.16 % 151.43 61,142 0.22 % 192.31 4,692 0.12 % 109.28 moramo m oramo 205,491 0.20 % 181.10 0 0 % 0 4,182 0.08 % 105.30 660 0.20 % 166.84 86,068 0.19 % 158.59 44,596 0.26 % 237.95 53,929 0.19 % 169.62 16,056 0.40 % 373.96 zgodilo zg odilo 196,054 0.19 % 172.78 0 0 % 0 15,427 0.29 % 388.44 449 0.14 % 113.50 93,840 0.20 % 172.91 31,188 0.18 % 166.41 50,496 0.18 % 158.82 4,654 0.12 % 108.40 postal p ostal 195,102 0.19 % 171.94 0 0 % 0 7,745 0.14 % 195.01 350 0.11 % 88.48 88,788 0.19 % 163.60 33,422 0.19 % 178.33 57,517 0.21 % 180.91 7,280 0.18 % 169.56 prišlo p rišlo 190,940 0.19 % 168.27 0 0 % 0 5,570 0.10 % 140.25 615 0.19 % 155.47 94,348 0.21 % 173.84 23,088 0.13 % 123.19 62,015 0.22 % 195.05 5,304 0.13 % 123.54 nismo nismo 181,629 0.18 % 160.07 0 0 % 0 4,707 0.09 % 118.52 531 0.16 % 134.23 95,326 0.21 % 175.64 30,240 0.17 % 161.35 47,229 0.17 % 148.55 3,596 0.09 % 83.76 dodal dodal 179,735 
0.18 % 158.40 0 0 % 0 3,209 0.06 % 80.80 47 0.01 % 11.88 73,621 0.16 % 135.65 7,800 0.04 % 41.62 94,091 0.34 % 295.94 967 0.02 % 22.52
rekel rekel 178,062 0.18 % 156.93 0 0 % 0 66,774 1.25 % 1,681.30 1,158 0.35 % 292.73 55,284 0.12 % 101.86 25,116 0.14 % 134.01 23,982 0.09 % 75.43 5,748 0.14 % 133.88
videti v ideti 176,327 0.17 % 155.40 0 0 % 0 23,660 0.44 % 595.73 280 0.09 % 70.78 63,466 0.14 % 116.94 41,741 0.24 % 222.72 39,253 0.14 % 123.46 7,927 0.20 % 184.63
želijo ž elijo 171,909 0.17 % 151.50 1 0.06 % 103.02 932 0.02 % 23.47 214 0.07 % 54.10 81,012 0.18 % 149.27 22,149 0.13 % 118.18 64,068 0.23 % 201.51 3,533 0.09 % 82.29

1.2.61 List of initial character-level 1-grams from adjective lemmas in the Gigafida 2.0 corpus with text-type distribution

File at CLARIN.SI: GF2.0-word_parts-adjectives-lemmas-initial-1grams-taxonomy-entire.tsv

Columns: Lemma; Lemma (lower-case); Initial part of the word; Rest of the word; Total absolute frequency of lemma; Percentage of all found lemmas; Total relative frequency (per million occurrences); Absolute frequency [SSJ.T.K.N]; Percentage [SSJ.T.K.N]; Relative frequency [SSJ.T.K.N]; Absolute frequency [SSJ.T.K.L]; Percentage [SSJ.T.K.L]; Relative frequency [SSJ.T.K.L]; Absolute frequency [SSJ.T.D]; Percentage [SSJ.T.D]; Relative frequency [SSJ.T.D]; Absolute frequency [SSJ.T.P.C]; Percentage [SSJ.T.P.C]; Relative frequency [SSJ.T.P.C]; Absolute frequency [SSJ.T.P.R]; Percentage [SSJ.T.P.R]; Relative frequency [SSJ.T.P.R]; Absolute frequency [SSJ.I]; Percentage [SSJ.I]; Relative frequency [SSJ.I]; Absolute frequency [SSJ.T.K.S]; Percentage [SSJ.T.K.S]; Relative frequency [SSJ.T.K.S]

velik velik v elik 2,652,565 2.04 % 2,337.69 5 0.58 % 515.09 51,414 1.72 % 1,294.55 5,866 1.23 % 1,482.86 1,283,822 2.02 % 2,365.53 485,663 2.24 % 2,591.34 724,281 1.99 % 2,278.05 101,514 1.92 % 2,364.38
nov nov n ov 2,584,746 1.99 % 2,277.92 1 0.12 % 103.02 31,228 1.04 % 786.29 5,052 1.06 % 1,277.09
1,343,551 2.12 % 2,475.58 430,154 1.99 % 2,295.16 711,383 1.95 % 2,237.48 63,377 1.20 % 1,476.13 slovenski slovenski s lovenski 1,880,807 1.44 % 1,657.55 0 0 % 0 3,004 0.10 % 75.64 2,992 0.63 % 756.35 1,056,514 1.67 % 1,946.70 207,876 0.96 % 1,109.16 582,162 1.60 % 1,831.05 28,259 0.54 % 658.19 dober dober d ober 1,837,252 1.41 % 1,619.16 3 0.35 % 309.06 40,204 1.34 % 1,012.29 2,465 0.52 % 623.13 898,379 1.42 % 1,655.32 314,636 1.45 % 1,678.79 540,724 1.49 % 1,700.72 40,841 0.77 % 951.24 zadnji zadnji z adnji 1,254,851 0.96 % 1,105.89 3 0.35 % 309.06 26,159 0.87 % 658.66 1,842 0.39 % 465.64 634,771 1.00 % 1,169.61 170,972 0.79 % 912.25 400,657 1.10 % 1,260.17 20,447 0.39 % 476.23 sam sam s am 1,096,267 0.84 % 966.13 2 0.23 % 206.04 65,317 2.18 % 1,644.61 4,168 0.87 % 1,053.63 485,995 0.77 % 895.48 228,841 1.06 % 1,221.02 258,335 0.71 % 812.53 53,609 1.02 % 1,248.62 evropski evropski e vropski 952,492 0.73 % 839.43 0 0 % 0 1,068 0.04 % 26.89 1,646 0.34 % 416.09 476,018 0.75 % 877.09 83,115 0.38 % 443.47 374,614 1.03 % 1,178.26 16,031 0.30 % 373.38 star star s tar 819,239 0.63 % 721.99 0 0 % 0 40,235 1.34 % 1,013.07 1,669 0.35 % 421.91 428,688 0.68 % 789.89 141,731 0.65 % 756.23 175,740 0.48 % 552.75 31,176 0.59 % 726.13 visok visok v isok 785,457 0.60 % 692.22 1 0.12 % 103.02 15,006 0.50 % 377.83 2,003 0.42 % 506.34 386,467 0.61 % 712.09 116,871 0.54 % 623.59 235,448 0.65 % 740.55 29,661 0.56 % 690.84 pomemben pomemben p omemben 785,444 0.60 % 692.21 0 0 % 0 12,253 0.41 % 308.52 2,080 0.43 % 525.80 369,088 0.58 % 680.07 142,611 0.66 % 760.93 213,803 0.59 % 672.47 45,609 0.86 % 1,062.29 državen državen d ržaven 727,558 0.56 % 641.19 0 0 % 0 2,144 0.07 % 53.98 6,431 1.35 % 1,625.69 424,876 0.67 % 782.86 53,346 0.25 % 284.64 231,274 0.64 % 727.42 9,487 0.18 % 220.96 mlad mlad m lad 710,490 0.55 % 626.15 0 0 % 0 19,798 0.66 % 498.49 901 0.19 % 227.76 384,080 0.61 % 707.69 102,434 0.47 % 546.55 187,441 0.52 % 589.55 15,836 0.30 % 368.84 svetoven svetoven s vetoven 
701,704 0.54 % 618.41 0 0 % 0 2,333 0.08 % 58.74 920 0.19 % 232.57 354,281 0.56 % 652.79 71,182 0.33 % 379.80 261,527 0.72 % 822.57 11,461 0.22 % 266.94 različen različen r azličen 601,579 0.46 % 530.17 10 1.16 % 1,030.18 6,101 0.20 % 153.62 1,707 0.36 % 431.51 265,022 0.42 % 488.32 132,380 0.61 % 706.34 147,499 0.41 % 463.92 48,860 0.93 % 1,138.01 javen javen j aven 599,038 0.46 % 527.93 0 0 % 0 2,139 0.07 % 53.86 4,421 0.93 % 1,117.58 285,690 0.45 % 526.40 46,128 0.21 % 246.12 245,425 0.67 % 771.93 15,235 0.29 % 354.84 majhen majhen m ajhen 586,026 0.45 % 516.46 11 1.28 % 1,133.20 22,105 0.74 % 556.58 1,989 0.42 % 502.80 260,722 0.41 % 480.40 146,093 0.67 % 779.50 121,483 0.33 % 382.10 33,623 0.64 % 783.12 mogoč mogoč m ogoč 549,842 0.42 % 484.57 0 0 % 0 24,830 0.83 % 625.19 2,511 0.53 % 634.75 266,544 0.42 % 491.12 98,624 0.46 % 526.23 129,413 0.36 % 407.04 27,920 0.53 % 650.29 ameriški ameriški a meriški 546,670 0.42 % 481.78 0 0 % 0 2,955 0.10 % 74.40 553 0.12 % 139.79 269,453 0.42 % 496.48 74,371 0.34 % 396.82 191,053 0.53 % 600.91 8,285 0.16 % 192.97 domač domač d omač 514,492 0.40 % 453.42 1 0.12 % 103.02 5,007 0.17 % 126.07 1,010 0.21 % 255.32 290,270 0.46 % 534.84 65,068 0.30 % 347.18 143,722 0.40 % 452.04 9,414 0.18 % 219.26 pravi pravi p ravi 508,554 0.39 % 448.19 1 0.12 % 103.02 19,024 0.64 % 479 1,069 0.22 % 270.23 238,097 0.38 % 438.71 115,643 0.53 % 617.03 117,969 0.32 % 371.04 16,751 0.32 % 390.15 glaven glaven g laven 504,817 0.39 % 444.89 0 0 % 0 8,165 0.27 % 205.59 1,532 0.32 % 387.27 252,083 0.40 % 464.48 79,633 0.37 % 424.90 142,430 0.39 % 447.98 20,974 0.40 % 488.51 leten leten l eten 497,066 0.38 % 438.06 0 0 % 0 930 0.03 % 23.42 819 0.17 % 207.03 258,476 0.41 % 476.26 37,560 0.17 % 200.41 195,112 0.54 % 613.68 4,169 0.08 % 97.10 poseben poseben p oseben 491,903 0.38 % 433.51 1 0.12 % 103.02 7,888 0.26 % 198.61 3,320 0.69 % 839.26 240,439 0.38 % 443.02 100,029 0.46 % 533.72 117,371 0.32 % 369.16 22,855 0.43 % 532.32 znan znan z nan 482,303 
0.37 % 425.05 0 0 % 0 6,258 0.21 % 157.57 949 0.20 % 239.90 239,747 0.38 % 441.75 92,995 0.43 % 496.19 127,889 0.35 % 402.24 14,465 0.27 % 336.91
slab slab s lab 456,819 0.35 % 402.59 0 0 % 0 11,344 0.38 % 285.63 522 0.11 % 131.96 222,346 0.35 % 409.69 74,623 0.34 % 398.16 136,968 0.38 % 430.80 11,016 0.21 % 256.58
številen številen š tevilen 440,492 0.34 % 388.20 1 0.12 % 103.02 3,801 0.13 % 95.71 536 0.11 % 135.50 199,765 0.32 % 368.08 74,277 0.34 % 396.32 143,497 0.39 % 451.34 18,615 0.35 % 433.57
mednaroden mednaroden m ednaroden 438,511 0.34 % 386.46 0 0 % 0 889 0.03 % 22.38 2,246 0.47 % 567.76 237,200 0.37 % 437.06 40,842 0.19 % 217.92 150,001 0.41 % 471.79 7,333 0.14 % 170.79
nekdanji nekdanji n ekdanji 438,100 0.34 % 386.10 0 0 % 0 3,153 0.10 % 79.39 321 0.07 % 81.15 223,487 0.35 % 411.79 46,215 0.21 % 246.59 159,504 0.44 % 501.68 5,420 0.10 % 126.24
prihodnji prihodnji p rihodnji 435,770 0.34 % 384.04 0 0 % 0 1,515 0.05 % 38.15 524 0.11 % 132.46 255,001 0.40 % 469.86 35,529 0.16 % 189.57 139,896 0.38 % 440.01 3,305 0.06 % 76.98
političen političen p olitičen 433,419 0.33 % 381.97 0 0 % 0 2,210 0.07 % 55.65 1,473 0.31 % 372.36 232,367 0.37 % 428.15 50,096 0.23 % 267.30 128,308 0.35 % 403.56 18,965 0.36 % 441.72
lep lep l ep 420,724 0.32 % 370.78 0 0 % 0 23,733 0.79 % 597.57 2,813 0.59 % 711.10 198,767 0.31 % 366.24 101,128 0.47 % 539.59 81,862 0.23 % 257.48 12,421 0.23 % 289.30
letošnji letošnji l etošnji 419,649 0.32 % 369.83 0 0 % 0 181 0.01 % 4.56 258 0.05 % 65.22 248,400 0.39 % 457.69 43,478 0.20 % 231.98 126,783 0.35 % 398.77 549 0.01 % 12.79

1.2.62 List of initial character-level 2-grams from adjective lemmas in the Gigafida 2.0 corpus with text-type distribution

File at CLARIN.SI: GF2.0-word_parts-adjectives-lemmas-initial-2grams-taxonomy-entire.tsv

Columns: Lemma; Lemma (lower-case); Initial part of the word; Rest of the word; Total absolute frequency of lemma; Percentage of all found lemmas; Total relative frequency (per million occurrences); Absolute frequency [SSJ.T.K.N]; Percentage [SSJ.T.K.N]; Relative frequency [SSJ.T.K.N]; Absolute frequency [SSJ.T.K.L]; Percentage [SSJ.T.K.L]; Relative frequency [SSJ.T.K.L]; Absolute frequency [SSJ.T.D]; Percentage [SSJ.T.D]; Relative frequency [SSJ.T.D]; Absolute frequency [SSJ.T.P.C]; Percentage [SSJ.T.P.C]; Relative frequency [SSJ.T.P.C]; Absolute frequency [SSJ.T.P.R]; Percentage [SSJ.T.P.R]; Relative frequency [SSJ.T.P.R]; Absolute frequency [SSJ.I]; Percentage [SSJ.I]; Relative frequency [SSJ.I]; Absolute frequency [SSJ.T.K.S]; Percentage [SSJ.T.K.S]; Relative frequency [SSJ.T.K.S]

velik velik ve lik 2,652,565 2.04 % 2,337.69 5 0.58 % 515.09 51,414 1.72 % 1,294.55 5,866 1.23 % 1,482.86 1,283,822 2.02 % 2,365.53 485,663 2.24 % 2,591.34 724,281 1.99 % 2,278.05 101,514 1.92 % 2,364.38
nov nov no v 2,584,746 1.99 % 2,277.92 1 0.12 % 103.02 31,228 1.04 % 786.29 5,052 1.06 % 1,277.09 1,343,551 2.12 % 2,475.58 430,154 1.99 % 2,295.16 711,383 1.95 % 2,237.48 63,377 1.20 % 1,476.13
slovenski slovenski sl ovenski 1,880,807 1.44 % 1,657.55 0 0 % 0 3,004 0.10 % 75.64 2,992 0.63 % 756.35 1,056,514 1.67 % 1,946.70 207,876 0.96 % 1,109.16 582,162 1.60 % 1,831.05 28,259 0.54 % 658.19
dober dober do ber 1,837,252 1.41 % 1,619.16 3 0.35 % 309.06 40,204 1.34 % 1,012.29 2,465 0.52 % 623.13 898,379 1.42 % 1,655.32 314,636 1.45 % 1,678.79 540,724 1.49 % 1,700.72 40,841 0.77 % 951.24
zadnji zadnji za dnji 1,254,851 0.96 % 1,105.89 3 0.35 % 309.06 26,159 0.87 % 658.66 1,842 0.39 % 465.64 634,771 1.00 % 1,169.61 170,972 0.79 % 912.25 400,657 1.10 % 1,260.17 20,447 0.39 % 476.23
sam sam sa m 1,096,267 0.84 % 966.13 2 0.23 % 206.04 65,317 2.18 % 1,644.61 4,168 0.87 % 1,053.63 485,995 0.77 % 895.48 228,841 1.06 % 1,221.02 258,335 0.71 % 812.53 53,609 1.02 % 1,248.62
evropski evropski ev ropski 952,492 0.73 % 839.43 0 0 % 0 1,068 0.04 % 26.89 1,646 0.34 % 416.09 476,018 0.75 % 877.09 83,115 0.38 % 443.47 374,614 1.03 % 1,178.26 16,031 0.30 %
373.38 star star st ar 819,239 0.63 % 721.99 0 0 % 0 40,235 1.34 % 1,013.07 1,669 0.35 % 421.91 428,688 0.68 % 789.89 141,731 0.65 % 756.23 175,740 0.48 % 552.75 31,176 0.59 % 726.13 visok visok vi sok 785,457 0.60 % 692.22 1 0.12 % 103.02 15,006 0.50 % 377.83 2,003 0.42 % 506.34 386,467 0.61 % 712.09 116,871 0.54 % 623.59 235,448 0.65 % 740.55 29,661 0.56 % 690.84 pomemben pomemben po memben 785,444 0.60 % 692.21 0 0 % 0 12,253 0.41 % 308.52 2,080 0.43 % 525.80 369,088 0.58 % 680.07 142,611 0.66 % 760.93 213,803 0.59 % 672.47 45,609 0.86 % 1,062.29 državen državen dr žaven 727,558 0.56 % 641.19 0 0 % 0 2,144 0.07 % 53.98 6,431 1.35 % 1,625.69 424,876 0.67 % 782.86 53,346 0.25 % 284.64 231,274 0.64 % 727.42 9,487 0.18 % 220.96 mlad mlad ml ad 710,490 0.55 % 626.15 0 0 % 0 19,798 0.66 % 498.49 901 0.19 % 227.76 384,080 0.61 % 707.69 102,434 0.47 % 546.55 187,441 0.52 % 589.55 15,836 0.30 % 368.84 svetoven svetoven sv etoven 701,704 0.54 % 618.41 0 0 % 0 2,333 0.08 % 58.74 920 0.19 % 232.57 354,281 0.56 % 652.79 71,182 0.33 % 379.80 261,527 0.72 % 822.57 11,461 0.22 % 266.94 različen različen ra zličen 601,579 0.46 % 530.17 10 1.16 % 1,030.18 6,101 0.20 % 153.62 1,707 0.36 % 431.51 265,022 0.42 % 488.32 132,380 0.61 % 706.34 147,499 0.41 % 463.92 48,860 0.93 % 1,138.01 javen javen ja ven 599,038 0.46 % 527.93 0 0 % 0 2,139 0.07 % 53.86 4,421 0.93 % 1,117.58 285,690 0.45 % 526.40 46,128 0.21 % 246.12 245,425 0.67 % 771.93 15,235 0.29 % 354.84 majhen majhen ma jhen 586,026 0.45 % 516.46 11 1.28 % 1,133.20 22,105 0.74 % 556.58 1,989 0.42 % 502.80 260,722 0.41 % 480.40 146,093 0.67 % 779.50 121,483 0.33 % 382.10 33,623 0.64 % 783.12 mogoč mogoč mo goč 549,842 0.42 % 484.57 0 0 % 0 24,830 0.83 % 625.19 2,511 0.53 % 634.75 266,544 0.42 % 491.12 98,624 0.46 % 526.23 129,413 0.36 % 407.04 27,920 0.53 % 650.29 ameriški ameriški am eriški 546,670 0.42 % 481.78 0 0 % 0 2,955 0.10 % 74.40 553 0.12 % 139.79 269,453 0.42 % 496.48 74,371 0.34 % 396.82 191,053 0.53 % 600.91 8,285 
0.16 % 192.97 domač domač do mač 514,492 0.40 % 453.42 1 0.12 % 103.02 5,007 0.17 % 126.07 1,010 0.21 % 255.32 290,270 0.46 % 534.84 65,068 0.30 % 347.18 143,722 0.40 % 452.04 9,414 0.18 % 219.26 pravi pravi pr avi 508,554 0.39 % 448.19 1 0.12 % 103.02 19,024 0.64 % 479 1,069 0.22 % 270.23 238,097 0.38 % 438.71 115,643 0.53 % 617.03 117,969 0.32 % 371.04 16,751 0.32 % 390.15 glaven glaven gl aven 504,817 0.39 % 444.89 0 0 % 0 8,165 0.27 % 205.59 1,532 0.32 % 387.27 252,083 0.40 % 464.48 79,633 0.37 % 424.90 142,430 0.39 % 447.98 20,974 0.40 % 488.51 leten leten le ten 497,066 0.38 % 438.06 0 0 % 0 930 0.03 % 23.42 819 0.17 % 207.03 258,476 0.41 % 476.26 37,560 0.17 % 200.41 195,112 0.54 % 613.68 4,169 0.08 % 97.10 poseben poseben po seben 491,903 0.38 % 433.51 1 0.12 % 103.02 7,888 0.26 % 198.61 3,320 0.69 % 839.26 240,439 0.38 % 443.02 100,029 0.46 % 533.72 117,371 0.32 % 369.16 22,855 0.43 % 532.32 znan znan zn an 482,303 0.37 % 425.05 0 0 % 0 6,258 0.21 % 157.57 949 0.20 % 239.90 239,747 0.38 % 441.75 92,995 0.43 % 496.19 127,889 0.35 % 402.24 14,465 0.27 % 336.91 slab slab sl ab 456,819 0.35 % 402.59 0 0 % 0 11,344 0.38 % 285.63 522 0.11 % 131.96 222,346 0.35 % 409.69 74,623 0.34 % 398.16 136,968 0.38 % 430.80 11,016 0.21 % 256.58 številen številen št evilen 440,492 0.34 % 388.20 1 0.12 % 103.02 3,801 0.13 % 95.71 536 0.11 % 135.50 199,765 0.32 % 368.08 74,277 0.34 % 396.32 143,497 0.39 % 451.34 18,615 0.35 % 433.57 mednaroden mednaroden me dnaroden 438,511 0.34 % 386.46 0 0 % 0 889 0.03 % 22.38 2,246 0.47 % 567.76 237,200 0.37 % 437.06 40,842 0.19 % 217.92 150,001 0.41 % 471.79 7,333 0.14 % 170.79 nekdanji nekdanji ne kdanji 438,100 0.34 % 386.10 0 0 % 0 3,153 0.10 % 79.39 321 0.07 % 81.15 223,487 0.35 % 411.79 46,215 0.21 % 246.59 159,504 0.44 % 501.68 5,420 0.10 % 126.24 prihodnji prihodnji pr ihodnji 435,770 0.34 % 384.04 0 0 % 0 1,515 0.05 % 38.15 524 0.11 % 132.46 255,001 0.40 % 469.86 35,529 0.16 % 189.57 139,896 0.38 % 440.01 3,305 0.06 % 76.98 
političen političen po litičen 433,419 0.33 % 381.97 0 0 % 0 2,210 0.07 % 55.65 1,473 0.31 % 372.36 232,367 0.37 % 428.15 50,096 0.23 % 267.30 128,308 0.35 % 403.56 18,965 0.36 % 441.72 lep lep le p 420,724 0.32 % 370.78 0 0 % 0 23,733 0.79 % 597.57 2,813 0.59 % 711.10 198,767 0.31 % 366.24 101,128 0.47 % 539.59 81,862 0.23 % 257.48 12,421 0.23 % 289.30 letošnji letošnji le tošnji 419,649 0.32 % 369.83 0 0 % 0 181 0.01 % 4.56 258 0.05 % 65.22 248,400 0.39 % 457.69 43,478 0.20 % 231.98 126,783 0.35 % 398.77 549 0.01 % 12.79

1.2.63 List of initial character-level 3-grams from adjective lemmas in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-word_parts-adjectives-lemmas-initial-3grams-taxonomy-entire.tsv

Lemma Lemma (lower-case) Initial part of the word Rest of the word Total absolute frequency of lemma Percentage of all found lemmas Total relative frequency (per million occurrences) Absolute frequency [SSJ.T.K.N] Percentage [SSJ.T.K.N] Relative frequency [SSJ.T.K.N] Absolute frequency [SSJ.T.K.L] Percentage [SSJ.T.K.L] Relative frequency [SSJ.T.K.L] Absolute frequency [SSJ.T.D] Percentage [SSJ.T.D] Relative frequency [SSJ.T.D] Absolute frequency [SSJ.T.P.C] Percentage [SSJ.T.P.C] Relative frequency [SSJ.T.P.C] Absolute frequency [SSJ.T.P.R] Percentage [SSJ.T.P.R] Relative frequency [SSJ.T.P.R] Absolute frequency [SSJ.I] Percentage [SSJ.I] Relative frequency [SSJ.I] Absolute frequency [SSJ.T.K.S] Percentage [SSJ.T.K.S] Relative frequency [SSJ.T.K.S] velik velik vel ik 2,652,565 2.04 % 2,337.69 5 0.58 % 515.09 51,414 1.72 % 1,294.55 5,866 1.23 % 1,482.86 1,283,822 2.02 % 2,365.53 485,663 2.24 % 2,591.34 724,281 1.99 % 2,278.05 101,514 1.93 % 2,364.38 nov nov nov 2,584,746 1.99 % 2,277.92 1 0.12 % 103.02 31,228 1.04 % 786.29 5,052 1.06 % 1,277.09 1,343,551 2.12 % 2,475.58 430,154 1.99 % 2,295.16 711,383 1.96 % 2,237.48 63,377 1.20 % 1,476.13 slovenski slovenski
slo venski 1,880,807 1.44 % 1,657.55 0 0 % 0 3,004 0.10 % 75.64 2,992 0.63 % 756.35 1,056,514 1.67 % 1,946.70 207,876 0.96 % 1,109.16 582,162 1.60 % 1,831.05 28,259 0.54 % 658.19 dober dober dob er 1,837,252 1.41 % 1,619.16 3 0.35 % 309.06 40,204 1.34 % 1,012.29 2,465 0.52 % 623.13 898,379 1.42 % 1,655.32 314,636 1.45 % 1,678.79 540,724 1.49 % 1,700.72 40,841 0.77 % 951.24 zadnji zadnji zad nji 1,254,851 0.96 % 1,105.89 3 0.35 % 309.06 26,159 0.87 % 658.66 1,842 0.39 % 465.64 634,771 1.00 % 1,169.61 170,972 0.79 % 912.25 400,657 1.10 % 1,260.17 20,447 0.39 % 476.23 sam sam sam 1,096,267 0.84 % 966.13 2 0.23 % 206.04 65,317 2.18 % 1,644.61 4,168 0.87 % 1,053.63 485,995 0.77 % 895.48 228,841 1.06 % 1,221.02 258,335 0.71 % 812.53 53,609 1.02 % 1,248.62 evropski evropski evr opski 952,492 0.73 % 839.43 0 0 % 0 1,068 0.04 % 26.89 1,646 0.34 % 416.09 476,018 0.75 % 877.09 83,115 0.38 % 443.47 374,614 1.03 % 1,178.26 16,031 0.30 % 373.38 star star sta r 819,239 0.63 % 721.99 0 0 % 0 40,235 1.34 % 1,013.07 1,669 0.35 % 421.91 428,688 0.68 % 789.89 141,731 0.66 % 756.23 175,740 0.48 % 552.75 31,176 0.59 % 726.13 visok visok vis ok 785,457 0.60 % 692.22 1 0.12 % 103.02 15,006 0.50 % 377.83 2,003 0.42 % 506.34 386,467 0.61 % 712.09 116,871 0.54 % 623.59 235,448 0.65 % 740.55 29,661 0.56 % 690.84 pomemben pomemben pom emben 785,444 0.60 % 692.21 0 0 % 0 12,253 0.41 % 308.52 2,080 0.43 % 525.80 369,088 0.58 % 680.07 142,611 0.66 % 760.93 213,803 0.59 % 672.47 45,609 0.86 % 1,062.29 državen državen drž aven 727,558 0.56 % 641.19 0 0 % 0 2,144 0.07 % 53.98 6,431 1.35 % 1,625.69 424,876 0.67 % 782.86 53,346 0.25 % 284.64 231,274 0.64 % 727.42 9,487 0.18 % 220.96 mlad mlad mla d 710,490 0.55 % 626.15 0 0 % 0 19,798 0.66 % 498.49 901 0.19 % 227.76 384,080 0.61 % 707.69 102,434 0.47 % 546.55 187,441 0.52 % 589.55 15,836 0.30 % 368.84 svetoven svetoven sve toven 701,704 0.54 % 618.41 0 0 % 0 2,333 0.08 % 58.74 920 0.19 % 232.57 354,281 0.56 % 652.79 71,182 0.33 % 379.80 261,527 0.72 % 
822.57 11,461 0.22 % 266.94 različen različen raz ličen 601,579 0.46 % 530.17 10 1.16 % 1,030.18 6,101 0.20 % 153.62 1,707 0.36 % 431.51 265,022 0.42 % 488.32 132,380 0.61 % 706.34 147,499 0.41 % 463.92 48,860 0.93 % 1,138.01 javen javen jav en 599,038 0.46 % 527.93 0 0 % 0 2,139 0.07 % 53.86 4,421 0.93 % 1,117.58 285,690 0.45 % 526.40 46,128 0.21 % 246.12 245,425 0.67 % 771.93 15,235 0.29 % 354.84 majhen majhen maj hen 586,026 0.45 % 516.46 11 1.28 % 1,133.20 22,105 0.74 % 556.58 1,989 0.42 % 502.80 260,722 0.41 % 480.40 146,093 0.68 % 779.50 121,483 0.33 % 382.10 33,623 0.64 % 783.12 mogoč mogoč mog oč 549,842 0.42 % 484.57 0 0 % 0 24,830 0.83 % 625.19 2,511 0.53 % 634.75 266,544 0.42 % 491.12 98,624 0.46 % 526.23 129,413 0.36 % 407.04 27,920 0.53 % 650.29 ameriški ameriški ame riški 546,670 0.42 % 481.78 0 0 % 0 2,955 0.10 % 74.40 553 0.12 % 139.79 269,453 0.42 % 496.48 74,371 0.34 % 396.82 191,053 0.53 % 600.91 8,285 0.16 % 192.97 domač domač dom ač 514,492 0.40 % 453.42 1 0.12 % 103.02 5,007 0.17 % 126.07 1,010 0.21 % 255.32 290,270 0.46 % 534.84 65,068 0.30 % 347.18 143,722 0.40 % 452.04 9,414 0.18 % 219.26 pravi pravi pra vi 508,554 0.39 % 448.19 1 0.12 % 103.02 19,024 0.64 % 479 1,069 0.22 % 270.23 238,097 0.38 % 438.71 115,643 0.53 % 617.03 117,969 0.32 % 371.04 16,751 0.32 % 390.15 glaven glaven gla ven 504,817 0.39 % 444.89 0 0 % 0 8,165 0.27 % 205.59 1,532 0.32 % 387.27 252,083 0.40 % 464.48 79,633 0.37 % 424.90 142,430 0.39 % 447.98 20,974 0.40 % 488.51 leten leten let en 497,066 0.38 % 438.06 0 0 % 0 930 0.03 % 23.42 819 0.17 % 207.03 258,476 0.41 % 476.26 37,560 0.17 % 200.41 195,112 0.54 % 613.68 4,169 0.08 % 97.10 poseben poseben pos eben 491,903 0.38 % 433.51 1 0.12 % 103.02 7,888 0.26 % 198.61 3,320 0.69 % 839.26 240,439 0.38 % 443.02 100,029 0.46 % 533.72 117,371 0.32 % 369.16 22,855 0.43 % 532.32 znan znan zna n 482,303 0.37 % 425.05 0 0 % 0 6,258 0.21 % 157.57 949 0.20 % 239.90 239,747 0.38 % 441.75 92,995 0.43 % 496.19 127,889 0.35 % 402.24 
14,465 0.27 % 336.91 slab slab sla b 456,819 0.35 % 402.59 0 0 % 0 11,344 0.38 % 285.63 522 0.11 % 131.96 222,346 0.35 % 409.69 74,623 0.34 % 398.16 136,968 0.38 % 430.80 11,016 0.21 % 256.58 številen številen šte vilen 440,492 0.34 % 388.20 1 0.12 % 103.02 3,801 0.13 % 95.71 536 0.11 % 135.50 199,765 0.32 % 368.08 74,277 0.34 % 396.32 143,497 0.39 % 451.34 18,615 0.35 % 433.57 mednaroden mednaroden med naroden 438,511 0.34 % 386.46 0 0 % 0 889 0.03 % 22.38 2,246 0.47 % 567.76 237,200 0.37 % 437.06 40,842 0.19 % 217.92 150,001 0.41 % 471.79 7,333 0.14 % 170.79 nekdanji nekdanji nek danji 438,100 0.34 % 386.10 0 0 % 0 3,153 0.10 % 79.39 321 0.07 % 81.15 223,487 0.35 % 411.79 46,215 0.21 % 246.59 159,504 0.44 % 501.68 5,420 0.10 % 126.24 prihodnji prihodnji pri hodnji 435,770 0.34 % 384.04 0 0 % 0 1,515 0.05 % 38.15 524 0.11 % 132.46 255,001 0.40 % 469.86 35,529 0.16 % 189.57 139,896 0.38 % 440.01 3,305 0.06 % 76.98 političen političen pol itičen 433,419 0.33 % 381.97 0 0 % 0 2,210 0.07 % 55.65 1,473 0.31 % 372.36 232,367 0.37 % 428.15 50,096 0.23 % 267.30 128,308 0.35 % 403.56 18,965 0.36 % 441.72 lep lep lep 420,724 0.32 % 370.78 0 0 % 0 23,733 0.79 % 597.57 2,813 0.59 % 711.10 198,767 0.31 % 366.24 101,128 0.47 % 539.59 81,862 0.23 % 257.48 12,421 0.23 % 289.30 letošnji letošnji let ošnji 419,649 0.32 % 369.83 0 0 % 0 181 0.01 % 4.56 258 0.05 % 65.22 248,400 0.39 % 457.69 43,478 0.20 % 231.98 126,783 0.35 % 398.77 549 0.01 % 12.79

1.2.64 List of initial character-level 4-grams from adjective lemmas in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-word_parts-adjectives-lemmas-initial-4grams-taxonomy-entire.tsv

Lemma Lemma (lower-case) Initial part of the word Rest of the word Total absolute frequency of lemma Percentage of all found lemmas Total relative frequency (per million occurrences) Absolute frequency [SSJ.T.K.N] Percentage [SSJ.T.K.N] Relative
frequency [SSJ.T.K.N] Absolute frequency [SSJ.T.K.L] Percentage [SSJ.T.K.L] Relative frequency [SSJ.T.K.L] Absolute frequency [SSJ.T.D] Percentage [SSJ.T.D] Relative frequency [SSJ.T.D] Absolute frequency [SSJ.T.P.C] Percentage [SSJ.T.P.C] Relative frequency [SSJ.T.P.C] Absolute frequency [SSJ.T.P.R] Percentage [SSJ.T.P.R] Relative frequency [SSJ.T.P.R] Absolute frequency [SSJ.I] Percentage [SSJ.I] Relative frequency [SSJ.I] Absolute frequency [SSJ.T.K.S] Percentage [SSJ.T.K.S] Relative frequency [SSJ.T.K.S] velik velik veli k 2,652,565 2.14 % 2,337.69 5 0.60 % 515.09 51,414 1.87 % 1,294.55 5,866 1.28 % 1,482.86 1,283,822 2.12 % 2,365.53 485,663 2.37 % 2,591.34 724,281 2.08 % 2,278.05 101,514 2.01 % 2,364.38 slovenski slovenski slov enski 1,880,807 1.51 % 1,657.55 0 0 % 0 3,004 0.11 % 75.64 2,992 0.65 % 756.35 1,056,514 1.75 % 1,946.70 207,876 1.01 % 1,109.16 582,162 1.67 % 1,831.05 28,259 0.56 % 658.19 dober dober dobe r 1,837,252 1.48 % 1,619.16 3 0.36 % 309.06 40,204 1.46 % 1,012.29 2,465 0.54 % 623.13 898,379 1.49 % 1,655.32 314,636 1.53 % 1,678.79 540,724 1.55 % 1,700.72 40,841 0.81 % 951.24 zadnji zadnji zadn ji 1,254,851 1.01 % 1,105.89 3 0.36 % 309.06 26,159 0.95 % 658.66 1,842 0.40 % 465.64 634,771 1.05 % 1,169.61 170,972 0.83 % 912.25 400,657 1.15 % 1,260.17 20,447 0.40 % 476.23 evropski evropski evro pski 952,492 0.77 % 839.43 0 0 % 0 1,068 0.04 % 26.89 1,646 0.36 % 416.09 476,018 0.79 % 877.09 83,115 0.41 % 443.47 374,614 1.07 % 1,178.26 16,031 0.32 % 373.38 star star star 819,239 0.66 % 721.99 0 0 % 0 40,235 1.46 % 1,013.07 1,669 0.36 % 421.91 428,688 0.71 % 789.89 141,731 0.69 % 756.23 175,740 0.50 % 552.75 31,176 0.62 % 726.13 visok visok viso k 785,457 0.63 % 692.22 1 0.12 % 103.02 15,006 0.55 % 377.83 2,003 0.44 % 506.34 386,467 0.64 % 712.09 116,871 0.57 % 623.59 235,448 0.68 % 740.55 29,661 0.59 % 690.84 pomemben pomemben pome mben 785,444 0.63 % 692.21 0 0 % 0 12,253 0.45 % 308.52 2,080 0.45 % 525.80 369,088 0.61 % 680.07 142,611 0.70 % 760.93 
213,803 0.61 % 672.47 45,609 0.90 % 1,062.29 državen državen drža ven 727,558 0.59 % 641.19 0 0 % 0 2,144 0.08 % 53.98 6,431 1.40 % 1,625.69 424,876 0.70 % 782.86 53,346 0.26 % 284.64 231,274 0.66 % 727.42 9,487 0.19 % 220.96 mlad mlad mlad 710,490 0.57 % 626.15 0 0 % 0 19,798 0.72 % 498.49 901 0.20 % 227.76 384,080 0.64 % 707.69 102,434 0.50 % 546.55 187,441 0.54 % 589.55 15,836 0.31 % 368.84 svetoven svetoven svet oven 701,704 0.56 % 618.41 0 0 % 0 2,333 0.09 % 58.74 920 0.20 % 232.57 354,281 0.59 % 652.79 71,182 0.35 % 379.80 261,527 0.75 % 822.57 11,461 0.23 % 266.94 različen različen razl ičen 601,579 0.48 % 530.17 10 1.20 % 1,030.18 6,101 0.22 % 153.62 1,707 0.37 % 431.51 265,022 0.44 % 488.32 132,380 0.65 % 706.34 147,499 0.42 % 463.92 48,860 0.97 % 1,138.01 javen javen jave n 599,038 0.48 % 527.93 0 0 % 0 2,139 0.08 % 53.86 4,421 0.96 % 1,117.58 285,690 0.47 % 526.40 46,128 0.23 % 246.12 245,425 0.70 % 771.93 15,235 0.30 % 354.84 majhen majhen majh en 586,026 0.47 % 516.46 11 1.32 % 1,133.20 22,105 0.80 % 556.58 1,989 0.43 % 502.80 260,722 0.43 % 480.40 146,093 0.71 % 779.50 121,483 0.35 % 382.10 33,623 0.67 % 783.12 mogoč mogoč mogo č 549,842 0.44 % 484.57 0 0 % 0 24,830 0.90 % 625.19 2,511 0.55 % 634.75 266,544 0.44 % 491.12 98,624 0.48 % 526.23 129,413 0.37 % 407.04 27,920 0.55 % 650.29 ameriški ameriški amer iški 546,670 0.44 % 481.78 0 0 % 0 2,955 0.11 % 74.40 553 0.12 % 139.79 269,453 0.45 % 496.48 74,371 0.36 % 396.82 191,053 0.55 % 600.91 8,285 0.16 % 192.97 domač domač doma č 514,492 0.41 % 453.42 1 0.12 % 103.02 5,007 0.18 % 126.07 1,010 0.22 % 255.32 290,270 0.48 % 534.84 65,068 0.32 % 347.18 143,722 0.41 % 452.04 9,414 0.19 % 219.26 pravi pravi prav i 508,554 0.41 % 448.19 1 0.12 % 103.02 19,024 0.69 % 479 1,069 0.23 % 270.23 238,097 0.39 % 438.71 115,643 0.56 % 617.03 117,969 0.34 % 371.04 16,751 0.33 % 390.15 glaven glaven glav en 504,817 0.41 % 444.89 0 0 % 0 8,165 0.30 % 205.59 1,532 0.33 % 387.27 252,083 0.42 % 464.48 79,633 0.39 % 424.90 
142,430 0.41 % 447.98 20,974 0.41 % 488.51 leten leten lete n 497,066 0.40 % 438.06 0 0 % 0 930 0.03 % 23.42 819 0.18 % 207.03 258,476 0.43 % 476.26 37,560 0.18 % 200.41 195,112 0.56 % 613.68 4,169 0.08 % 97.10 poseben poseben pose ben 491,903 0.40 % 433.51 1 0.12 % 103.02 7,888 0.29 % 198.61 3,320 0.72 % 839.26 240,439 0.40 % 443.02 100,029 0.49 % 533.72 117,371 0.34 % 369.16 22,855 0.45 % 532.32 znan znan znan 482,303 0.39 % 425.05 0 0 % 0 6,258 0.23 % 157.57 949 0.21 % 239.90 239,747 0.40 % 441.75 92,995 0.45 % 496.19 127,889 0.37 % 402.24 14,465 0.29 % 336.91 slab slab slab 456,819 0.37 % 402.59 0 0 % 0 11,344 0.41 % 285.63 522 0.11 % 131.96 222,346 0.37 % 409.69 74,623 0.36 % 398.16 136,968 0.39 % 430.80 11,016 0.22 % 256.58 številen številen štev ilen 440,492 0.35 % 388.20 1 0.12 % 103.02 3,801 0.14 % 95.71 536 0.12 % 135.50 199,765 0.33 % 368.08 74,277 0.36 % 396.32 143,497 0.41 % 451.34 18,615 0.37 % 433.57 mednaroden mednaroden medn aroden 438,511 0.35 % 386.46 0 0 % 0 889 0.03 % 22.38 2,246 0.49 % 567.76 237,200 0.39 % 437.06 40,842 0.20 % 217.92 150,001 0.43 % 471.79 7,333 0.14 % 170.79 nekdanji nekdanji nekd anji 438,100 0.35 % 386.10 0 0 % 0 3,153 0.11 % 79.39 321 0.07 % 81.15 223,487 0.37 % 411.79 46,215 0.23 % 246.59 159,504 0.46 % 501.68 5,420 0.11 % 126.24 prihodnji prihodnji prih odnji 435,770 0.35 % 384.04 0 0 % 0 1,515 0.06 % 38.15 524 0.11 % 132.46 255,001 0.42 % 469.86 35,529 0.17 % 189.57 139,896 0.40 % 440.01 3,305 0.07 % 76.98 političen političen poli tičen 433,419 0.35 % 381.97 0 0 % 0 2,210 0.08 % 55.65 1,473 0.32 % 372.36 232,367 0.38 % 428.15 50,096 0.24 % 267.30 128,308 0.37 % 403.56 18,965 0.38 % 441.72 letošnji letošnji leto šnji 419,649 0.34 % 369.83 0 0 % 0 181 0.01 % 4.56 258 0.06 % 65.22 248,400 0.41 % 457.69 43,478 0.21 % 231.98 126,783 0.36 % 398.77 549 0.01 % 12.79 dolg dolg dolg 415,365 0.34 % 366.06 1 0.12 % 103.02 20,443 0.74 % 514.73 1,176 0.26 % 297.28 187,249 0.31 % 345.02 84,197 0.41 % 449.25 103,367 0.30 % 325.12 
18,932 0.37 % 440.95 podoben podoben podo ben 409,114 0.33 % 360.55 2 0.24 % 206.04 13,805 0.50 % 347.59 1,564 0.34 % 395.36 190,385 0.32 % 350.80 90,523 0.44 % 483 90,775 0.26 % 285.51 22,060 0.44 % 513.80 deloven deloven delo ven 399,552 0.32 % 352.12 0 0 % 0 3,474 0.13 % 87.47 2,872 0.62 % 726.01 194,485 0.32 % 358.35 55,822 0.27 % 297.85 125,086 0.36 % 393.43 17,813 0.35 % 414.89

1.2.65 List of initial character-level 5-grams from adjective lemmas in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-word_parts-adjectives-lemmas-initial-5grams-taxonomy-entire.tsv

Lemma Lemma (lower-case) Initial part of the word Rest of the word Total absolute frequency of lemma Percentage of all found lemmas Total relative frequency (per million occurrences) Absolute frequency [SSJ.T.K.N] Percentage [SSJ.T.K.N] Relative frequency [SSJ.T.K.N] Absolute frequency [SSJ.T.K.L] Percentage [SSJ.T.K.L] Relative frequency [SSJ.T.K.L] Absolute frequency [SSJ.T.D] Percentage [SSJ.T.D] Relative frequency [SSJ.T.D] Absolute frequency [SSJ.T.P.C] Percentage [SSJ.T.P.C] Relative frequency [SSJ.T.P.C] Absolute frequency [SSJ.T.P.R] Percentage [SSJ.T.P.R] Relative frequency [SSJ.T.P.R] Absolute frequency [SSJ.I] Percentage [SSJ.I] Relative frequency [SSJ.I] Absolute frequency [SSJ.T.K.S] Percentage [SSJ.T.K.S] Relative frequency [SSJ.T.K.S] velik velik velik 2,652,565 2.23 % 2,337.69 5 0.66 % 515.09 51,414 2.04 % 1,294.55 5,866 1.31 % 1,482.86 1,283,822 2.22 % 2,365.53 485,663 2.49 % 2,591.34 724,281 2.16 % 2,278.05 101,514 2.09 % 2,364.38 slovenski slovenski slove nski 1,880,807 1.58 % 1,657.55 0 0 % 0 3,004 0.12 % 75.64 2,992 0.67 % 756.35 1,056,514 1.83 % 1,946.70 207,876 1.07 % 1,109.16 582,162 1.74 % 1,831.05 28,259 0.58 % 658.19 dober dober dober 1,837,252 1.55 % 1,619.16 3 0.40 % 309.06 40,204 1.60 % 1,012.29 2,465 0.55 % 623.13 898,379 1.55 % 1,655.32 314,636 1.62 % 1,678.79
540,724 1.61 % 1,700.72 40,841 0.84 % 951.24 zadnji zadnji zadnj i 1,254,851 1.06 % 1,105.89 3 0.40 % 309.06 26,159 1.04 % 658.66 1,842 0.41 % 465.64 634,771 1.10 % 1,169.61 170,972 0.88 % 912.25 400,657 1.19 % 1,260.17 20,447 0.42 % 476.23 evropski evropski evrop ski 952,492 0.80 % 839.43 0 0 % 0 1,068 0.04 % 26.89 1,646 0.37 % 416.09 476,018 0.82 % 877.09 83,115 0.43 % 443.47 374,614 1.12 % 1,178.26 16,031 0.33 % 373.38 visok visok visok 785,457 0.66 % 692.22 1 0.13 % 103.02 15,006 0.60 % 377.83 2,003 0.45 % 506.34 386,467 0.67 % 712.09 116,871 0.60 % 623.59 235,448 0.70 % 740.55 29,661 0.61 % 690.84 pomemben pomemben pomem ben 785,444 0.66 % 692.21 0 0 % 0 12,253 0.49 % 308.52 2,080 0.47 % 525.80 369,088 0.64 % 680.07 142,611 0.73 % 760.93 213,803 0.64 % 672.47 45,609 0.94 % 1,062.29 državen državen držav en 727,558 0.61 % 641.19 0 0 % 0 2,144 0.09 % 53.98 6,431 1.44 % 1,625.69 424,876 0.73 % 782.86 53,346 0.27 % 284.64 231,274 0.69 % 727.42 9,487 0.20 % 220.96 svetoven svetoven sveto ven 701,704 0.59 % 618.41 0 0 % 0 2,333 0.09 % 58.74 920 0.21 % 232.57 354,281 0.61 % 652.79 71,182 0.37 % 379.80 261,527 0.78 % 822.57 11,461 0.24 % 266.94 različen različen razli čen 601,579 0.51 % 530.17 10 1.32 % 1,030.18 6,101 0.24 % 153.62 1,707 0.38 % 431.51 265,022 0.46 % 488.32 132,380 0.68 % 706.34 147,499 0.44 % 463.92 48,860 1.01 % 1,138.01 javen javen javen 599,038 0.51 % 527.93 0 0 % 0 2,139 0.09 % 53.86 4,421 0.99 % 1,117.58 285,690 0.49 % 526.40 46,128 0.24 % 246.12 245,425 0.73 % 771.93 15,235 0.31 % 354.84 majhen majhen majhe n 586,026 0.49 % 516.46 11 1.46 % 1,133.20 22,105 0.88 % 556.58 1,989 0.44 % 502.80 260,722 0.45 % 480.40 146,093 0.75 % 779.50 121,483 0.36 % 382.10 33,623 0.69 % 783.12 mogoč mogoč mogoč 549,842 0.46 % 484.57 0 0 % 0 24,830 0.99 % 625.19 2,511 0.56 % 634.75 266,544 0.46 % 491.12 98,624 0.51 % 526.23 129,413 0.39 % 407.04 27,920 0.58 % 650.29 ameriški ameriški ameri ški 546,670 0.46 % 481.78 0 0 % 0 2,955 0.12 % 74.40 553 0.12 % 139.79 
269,453 0.47 % 496.48 74,371 0.38 % 396.82 191,053 0.57 % 600.91 8,285 0.17 % 192.97 domač domač domač 514,492 0.43 % 453.42 1 0.13 % 103.02 5,007 0.20 % 126.07 1,010 0.23 % 255.32 290,270 0.50 % 534.84 65,068 0.33 % 347.18 143,722 0.43 % 452.04 9,414 0.19 % 219.26 pravi pravi pravi 508,554 0.43 % 448.19 1 0.13 % 103.02 19,024 0.76 % 479 1,069 0.24 % 270.23 238,097 0.41 % 438.71 115,643 0.59 % 617.03 117,969 0.35 % 371.04 16,751 0.35 % 390.15 glaven glaven glave n 504,817 0.42 % 444.89 0 0 % 0 8,165 0.32 % 205.59 1,532 0.34 % 387.27 252,083 0.44 % 464.48 79,633 0.41 % 424.90 142,430 0.42 % 447.98 20,974 0.43 % 488.51 leten leten leten 497,066 0.42 % 438.06 0 0 % 0 930 0.04 % 23.42 819 0.18 % 207.03 258,476 0.45 % 476.26 37,560 0.19 % 200.41 195,112 0.58 % 613.68 4,169 0.09 % 97.10 poseben poseben poseb en 491,903 0.41 % 433.51 1 0.13 % 103.02 7,888 0.31 % 198.61 3,320 0.74 % 839.26 240,439 0.42 % 443.02 100,029 0.51 % 533.72 117,371 0.35 % 369.16 22,855 0.47 % 532.32 številen številen števi len 440,492 0.37 % 388.20 1 0.13 % 103.02 3,801 0.15 % 95.71 536 0.12 % 135.50 199,765 0.34 % 368.08 74,277 0.38 % 396.32 143,497 0.43 % 451.34 18,615 0.38 % 433.57 mednaroden mednaroden medna roden 438,511 0.37 % 386.46 0 0 % 0 889 0.04 % 22.38 2,246 0.50 % 567.76 237,200 0.41 % 437.06 40,842 0.21 % 217.92 150,001 0.45 % 471.79 7,333 0.15 % 170.79 nekdanji nekdanji nekda nji 438,100 0.37 % 386.10 0 0 % 0 3,153 0.12 % 79.39 321 0.07 % 81.15 223,487 0.39 % 411.79 46,215 0.24 % 246.59 159,504 0.47 % 501.68 5,420 0.11 % 126.24 prihodnji prihodnji priho dnji 435,770 0.37 % 384.04 0 0 % 0 1,515 0.06 % 38.15 524 0.12 % 132.46 255,001 0.44 % 469.86 35,529 0.18 % 189.57 139,896 0.42 % 440.01 3,305 0.07 % 76.98 političen političen polit ičen 433,419 0.36 % 381.97 0 0 % 0 2,210 0.09 % 55.65 1,473 0.33 % 372.36 232,367 0.40 % 428.15 50,096 0.26 % 267.30 128,308 0.38 % 403.56 18,965 0.39 % 441.72 letošnji letošnji letoš nji 419,649 0.35 % 369.83 0 0 % 0 181 0.01 % 4.56 258 0.06 % 65.22 
248,400 0.43 % 457.69 43,478 0.22 % 231.98 126,783 0.38 % 398.77 549 0.01 % 12.79 podoben podoben podob en 409,114 0.34 % 360.55 2 0.27 % 206.04 13,805 0.55 % 347.59 1,564 0.35 % 395.36 190,385 0.33 % 350.80 90,523 0.47 % 483 90,775 0.27 % 285.51 22,060 0.46 % 513.80 deloven deloven delov en 399,552 0.34 % 352.12 0 0 % 0 3,474 0.14 % 87.47 2,872 0.64 % 726.01 194,485 0.34 % 358.35 55,822 0.29 % 297.85 125,086 0.37 % 393.43 17,813 0.37 % 414.89 naslednji naslednji nasle dnji 398,626 0.34 % 351.31 0 0 % 0 14,763 0.59 % 371.72 2,230 0.50 % 563.72 184,210 0.32 % 339.42 71,373 0.37 % 380.82 106,100 0.32 % 333.71 19,950 0.41 % 464.66 skupen skupen skupe n 393,317 0.33 % 346.63 0 0 % 0 3,599 0.14 % 90.62 1,505 0.34 % 380.45 200,699 0.35 % 369.80 50,130 0.26 % 267.48 121,349 0.36 % 381.67 16,035 0.33 % 373.47 kratek kratek krate k 384,378 0.32 % 338.75 0 0 % 0 12,904 0.51 % 324.91 1,112 0.25 % 281.10 188,289 0.33 % 346.93 78,154 0.40 % 417 89,042 0.27 % 280.06 14,877 0.31 % 346.50 gospodarski gospodarski gospo darski 362,577 0.31 % 319.54 0 0 % 0 414 0.02 % 10.42 2,060 0.46 % 520.75 194,741 0.34 % 358.82 30,506 0.16 % 162.77 126,209 0.38 % 396.96 8,647 0.18 % 201.40 potreben potreben potre ben 349,591 0.29 % 308.09 3 0.40 % 309.06 4,573 0.18 % 115.14 3,388 0.76 % 856.45 171,370 0.30 % 315.76 60,929 0.31 % 325.10 91,369 0.27 % 287.38 17,959 0.37 % 418.29

1.2.66 List of final character-level 1-grams from adjective lemmas in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-word_parts-adjectives-lemmas-final-1grams-taxonomy-entire.tsv

Lemma Lemma (lower-case) Rest of the word Final part of the word Total absolute frequency of lemma Percentage of all found lemmas Total relative frequency (per million occurrences) Absolute frequency [SSJ.T.K.N] Percentage [SSJ.T.K.N] Relative frequency [SSJ.T.K.N] Absolute frequency [SSJ.T.K.L] Percentage [SSJ.T.K.L] Relative frequency
[SSJ.T.K.L] Absolute frequency [SSJ.T.D] Percentage [SSJ.T.D] Relative frequency [SSJ.T.D] Absolute frequency [SSJ.T.P.C] Percentage [SSJ.T.P.C] Relative frequency [SSJ.T.P.C] Absolute frequency [SSJ.T.P.R] Percentage [SSJ.T.P.R] Relative frequency [SSJ.T.P.R] Absolute frequency [SSJ.I] Percentage [SSJ.I] Relative frequency [SSJ.I] Absolute frequency [SSJ.T.K.S] Percentage [SSJ.T.K.S] Relative frequency [SSJ.T.K.S] velik velik veli k 2,652,565 2.04 % 2,337.69 5 0.58 % 515.09 51,414 1.72 % 1,294.55 5,866 1.23 % 1,482.86 1,283,822 2.02 % 2,365.53 485,663 2.24 % 2,591.34 724,281 1.99 % 2,278.05 101,514 1.92 % 2,364.38 nov nov no v 2,584,746 1.99 % 2,277.92 1 0.12 % 103.02 31,228 1.04 % 786.29 5,052 1.06 % 1,277.09 1,343,551 2.12 % 2,475.58 430,154 1.99 % 2,295.16 711,383 1.95 % 2,237.48 63,377 1.20 % 1,476.13 slovenski slovenski slovensk i 1,880,807 1.44 % 1,657.55 0 0 % 0 3,004 0.10 % 75.64 2,992 0.63 % 756.35 1,056,514 1.67 % 1,946.70 207,876 0.96 % 1,109.16 582,162 1.60 % 1,831.05 28,259 0.54 % 658.19 dober dober dobe r 1,837,252 1.41 % 1,619.16 3 0.35 % 309.06 40,204 1.34 % 1,012.29 2,465 0.52 % 623.13 898,379 1.42 % 1,655.32 314,636 1.45 % 1,678.79 540,724 1.49 % 1,700.72 40,841 0.77 % 951.24 zadnji zadnji zadnj i 1,254,851 0.96 % 1,105.89 3 0.35 % 309.06 26,159 0.87 % 658.66 1,842 0.39 % 465.64 634,771 1.00 % 1,169.61 170,972 0.79 % 912.25 400,657 1.10 % 1,260.17 20,447 0.39 % 476.23 sam sam sa m 1,096,267 0.84 % 966.13 2 0.23 % 206.04 65,317 2.18 % 1,644.61 4,168 0.87 % 1,053.63 485,995 0.77 % 895.48 228,841 1.06 % 1,221.02 258,335 0.71 % 812.53 53,609 1.02 % 1,248.62 evropski evropski evropsk i 952,492 0.73 % 839.43 0 0 % 0 1,068 0.04 % 26.89 1,646 0.34 % 416.09 476,018 0.75 % 877.09 83,115 0.38 % 443.47 374,614 1.03 % 1,178.26 16,031 0.30 % 373.38 star star sta r 819,239 0.63 % 721.99 0 0 % 0 40,235 1.34 % 1,013.07 1,669 0.35 % 421.91 428,688 0.68 % 789.89 141,731 0.65 % 756.23 175,740 0.48 % 552.75 31,176 0.59 % 726.13 visok visok viso k 785,457 0.60 % 
692.22 1 0.12 % 103.02 15,006 0.50 % 377.83 2,003 0.42 % 506.34 386,467 0.61 % 712.09 116,871 0.54 % 623.59 235,448 0.65 % 740.55 29,661 0.56 % 690.84 pomemben pomemben pomembe n 785,444 0.60 % 692.21 0 0 % 0 12,253 0.41 % 308.52 2,080 0.43 % 525.80 369,088 0.58 % 680.07 142,611 0.66 % 760.93 213,803 0.59 % 672.47 45,609 0.86 % 1,062.29 državen državen države n 727,558 0.56 % 641.19 0 0 % 0 2,144 0.07 % 53.98 6,431 1.35 % 1,625.69 424,876 0.67 % 782.86 53,346 0.25 % 284.64 231,274 0.64 % 727.42 9,487 0.18 % 220.96 mlad mlad mla d 710,490 0.55 % 626.15 0 0 % 0 19,798 0.66 % 498.49 901 0.19 % 227.76 384,080 0.61 % 707.69 102,434 0.47 % 546.55 187,441 0.52 % 589.55 15,836 0.30 % 368.84 svetoven svetoven svetove n 701,704 0.54 % 618.41 0 0 % 0 2,333 0.08 % 58.74 920 0.19 % 232.57 354,281 0.56 % 652.79 71,182 0.33 % 379.80 261,527 0.72 % 822.57 11,461 0.22 % 266.94 različen različen različe n 601,579 0.46 % 530.17 10 1.16 % 1,030.18 6,101 0.20 % 153.62 1,707 0.36 % 431.51 265,022 0.42 % 488.32 132,380 0.61 % 706.34 147,499 0.41 % 463.92 48,860 0.93 % 1,138.01 javen javen jave n 599,038 0.46 % 527.93 0 0 % 0 2,139 0.07 % 53.86 4,421 0.93 % 1,117.58 285,690 0.45 % 526.40 46,128 0.21 % 246.12 245,425 0.67 % 771.93 15,235 0.29 % 354.84 majhen majhen majhe n 586,026 0.45 % 516.46 11 1.28 % 1,133.20 22,105 0.74 % 556.58 1,989 0.42 % 502.80 260,722 0.41 % 480.40 146,093 0.67 % 779.50 121,483 0.33 % 382.10 33,623 0.64 % 783.12 mogoč mogoč mogo č 549,842 0.42 % 484.57 0 0 % 0 24,830 0.83 % 625.19 2,511 0.53 % 634.75 266,544 0.42 % 491.12 98,624 0.46 % 526.23 129,413 0.36 % 407.04 27,920 0.53 % 650.29 ameriški ameriški amerišk i 546,670 0.42 % 481.78 0 0 % 0 2,955 0.10 % 74.40 553 0.12 % 139.79 269,453 0.42 % 496.48 74,371 0.34 % 396.82 191,053 0.53 % 600.91 8,285 0.16 % 192.97 domač domač doma č 514,492 0.40 % 453.42 1 0.12 % 103.02 5,007 0.17 % 126.07 1,010 0.21 % 255.32 290,270 0.46 % 534.84 65,068 0.30 % 347.18 143,722 0.40 % 452.04 9,414 0.18 % 219.26 pravi pravi prav i 
508,554 0.39 % 448.19 1 0.12 % 103.02 19,024 0.64 % 479 1,069 0.22 % 270.23 238,097 0.38 % 438.71 115,643 0.53 % 617.03 117,969 0.32 % 371.04 16,751 0.32 % 390.15 glaven glaven glave n 504,817 0.39 % 444.89 0 0 % 0 8,165 0.27 % 205.59 1,532 0.32 % 387.27 252,083 0.40 % 464.48 79,633 0.37 % 424.90 142,430 0.39 % 447.98 20,974 0.40 % 488.51 leten leten lete n 497,066 0.38 % 438.06 0 0 % 0 930 0.03 % 23.42 819 0.17 % 207.03 258,476 0.41 % 476.26 37,560 0.17 % 200.41 195,112 0.54 % 613.68 4,169 0.08 % 97.10 poseben poseben posebe n 491,903 0.38 % 433.51 1 0.12 % 103.02 7,888 0.26 % 198.61 3,320 0.69 % 839.26 240,439 0.38 % 443.02 100,029 0.46 % 533.72 117,371 0.32 % 369.16 22,855 0.43 % 532.32 znan znan zna n 482,303 0.37 % 425.05 0 0 % 0 6,258 0.21 % 157.57 949 0.20 % 239.90 239,747 0.38 % 441.75 92,995 0.43 % 496.19 127,889 0.35 % 402.24 14,465 0.27 % 336.91 slab slab sla b 456,819 0.35 % 402.59 0 0 % 0 11,344 0.38 % 285.63 522 0.11 % 131.96 222,346 0.35 % 409.69 74,623 0.34 % 398.16 136,968 0.38 % 430.80 11,016 0.21 % 256.58 številen številen števile n 440,492 0.34 % 388.20 1 0.12 % 103.02 3,801 0.13 % 95.71 536 0.11 % 135.50 199,765 0.32 % 368.08 74,277 0.34 % 396.32 143,497 0.39 % 451.34 18,615 0.35 % 433.57 mednaroden mednaroden mednarode n 438,511 0.34 % 386.46 0 0 % 0 889 0.03 % 22.38 2,246 0.47 % 567.76 237,200 0.37 % 437.06 40,842 0.19 % 217.92 150,001 0.41 % 471.79 7,333 0.14 % 170.79 nekdanji nekdanji nekdanj i 438,100 0.34 % 386.10 0 0 % 0 3,153 0.10 % 79.39 321 0.07 % 81.15 223,487 0.35 % 411.79 46,215 0.21 % 246.59 159,504 0.44 % 501.68 5,420 0.10 % 126.24 prihodnji prihodnji prihodnj i 435,770 0.34 % 384.04 0 0 % 0 1,515 0.05 % 38.15 524 0.11 % 132.46 255,001 0.40 % 469.86 35,529 0.16 % 189.57 139,896 0.38 % 440.01 3,305 0.06 % 76.98 političen političen političe n 433,419 0.33 % 381.97 0 0 % 0 2,210 0.07 % 55.65 1,473 0.31 % 372.36 232,367 0.37 % 428.15 50,096 0.23 % 267.30 128,308 0.35 % 403.56 18,965 0.36 % 441.72 lep lep le p 420,724 0.32 % 370.78 0 
0 % 0 23,733 0.79 % 597.57 2,813 0.59 % 711.10 198,767 0.31 % 366.24 101,128 0.47 % 539.59 81,862 0.23 % 257.48 12,421 0.23 % 289.30 letošnji letošnji letošnj i 419,649 0.32 % 369.83 0 0 % 0 181 0.01 % 4.56 258 0.05 % 65.22 248,400 0.39 % 457.69 43,478 0.20 % 231.98 126,783 0.35 % 398.77 549 0.01 % 12.79

1.2.67 List of final character-level 2-grams from adjective lemmas in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-word_parts-adjectives-lemmas-final-2grams-taxonomy-entire.tsv

Lemma Lemma (lower-case) Rest of the word Final part of the word Total absolute frequency of lemma Percentage of all found lemmas Total relative frequency (per million occurrences) Absolute frequency [SSJ.T.K.N] Percentage [SSJ.T.K.N] Relative frequency [SSJ.T.K.N] Absolute frequency [SSJ.T.K.L] Percentage [SSJ.T.K.L] Relative frequency [SSJ.T.K.L] Absolute frequency [SSJ.T.D] Percentage [SSJ.T.D] Relative frequency [SSJ.T.D] Absolute frequency [SSJ.T.P.C] Percentage [SSJ.T.P.C] Relative frequency [SSJ.T.P.C] Absolute frequency [SSJ.T.P.R] Percentage [SSJ.T.P.R] Relative frequency [SSJ.T.P.R] Absolute frequency [SSJ.I] Percentage [SSJ.I] Relative frequency [SSJ.I] Absolute frequency [SSJ.T.K.S] Percentage [SSJ.T.K.S] Relative frequency [SSJ.T.K.S] velik velik vel ik 2,652,565 2.04 % 2,337.69 5 0.58 % 515.09 51,414 1.72 % 1,294.55 5,866 1.23 % 1,482.86 1,283,822 2.02 % 2,365.53 485,663 2.24 % 2,591.34 724,281 1.99 % 2,278.05 101,514 1.92 % 2,364.38 nov nov n ov 2,584,746 1.99 % 2,277.92 1 0.12 % 103.02 31,228 1.04 % 786.29 5,052 1.06 % 1,277.09 1,343,551 2.12 % 2,475.58 430,154 1.99 % 2,295.16 711,383 1.95 % 2,237.48 63,377 1.20 % 1,476.13 slovenski slovenski slovens ki 1,880,807 1.44 % 1,657.55 0 0 % 0 3,004 0.10 % 75.64 2,992 0.63 % 756.35 1,056,514 1.67 % 1,946.70 207,876 0.96 % 1,109.16 582,162 1.60 % 1,831.05 28,259 0.54 % 658.19 dober dober dob er 1,837,252 1.41 % 1,619.16 3
0.35 % 309.06 40,204 1.34 % 1,012.29 2,465 0.52 % 623.13 898,379 1.42 % 1,655.32 314,636 1.45 % 1,678.79 540,724 1.49 % 1,700.72 40,841 0.77 % 951.24 zadnji zadnji zadn ji 1,254,851 0.96 % 1,105.89 3 0.35 % 309.06 26,159 0.87 % 658.66 1,842 0.39 % 465.64 634,771 1.00 % 1,169.61 170,972 0.79 % 912.25 400,657 1.10 % 1,260.17 20,447 0.39 % 476.23 sam sam s am 1,096,267 0.84 % 966.13 2 0.23 % 206.04 65,317 2.18 % 1,644.61 4,168 0.87 % 1,053.63 485,995 0.77 % 895.48 228,841 1.06 % 1,221.02 258,335 0.71 % 812.53 53,609 1.02 % 1,248.62 evropski evropski evrops ki 952,492 0.73 % 839.43 0 0 % 0 1,068 0.04 % 26.89 1,646 0.34 % 416.09 476,018 0.75 % 877.09 83,115 0.38 % 443.47 374,614 1.03 % 1,178.26 16,031 0.30 % 373.38 star star st ar 819,239 0.63 % 721.99 0 0 % 0 40,235 1.34 % 1,013.07 1,669 0.35 % 421.91 428,688 0.68 % 789.89 141,731 0.65 % 756.23 175,740 0.48 % 552.75 31,176 0.59 % 726.13 visok visok vis ok 785,457 0.60 % 692.22 1 0.12 % 103.02 15,006 0.50 % 377.83 2,003 0.42 % 506.34 386,467 0.61 % 712.09 116,871 0.54 % 623.59 235,448 0.65 % 740.55 29,661 0.56 % 690.84 pomemben pomemben pomemb en 785,444 0.60 % 692.21 0 0 % 0 12,253 0.41 % 308.52 2,080 0.43 % 525.80 369,088 0.58 % 680.07 142,611 0.66 % 760.93 213,803 0.59 % 672.47 45,609 0.86 % 1,062.29 državen državen držav en 727,558 0.56 % 641.19 0 0 % 0 2,144 0.07 % 53.98 6,431 1.35 % 1,625.69 424,876 0.67 % 782.86 53,346 0.25 % 284.64 231,274 0.64 % 727.42 9,487 0.18 % 220.96 mlad mlad ml ad 710,490 0.55 % 626.15 0 0 % 0 19,798 0.66 % 498.49 901 0.19 % 227.76 384,080 0.61 % 707.69 102,434 0.47 % 546.55 187,441 0.52 % 589.55 15,836 0.30 % 368.84 svetoven svetoven svetov en 701,704 0.54 % 618.41 0 0 % 0 2,333 0.08 % 58.74 920 0.19 % 232.57 354,281 0.56 % 652.79 71,182 0.33 % 379.80 261,527 0.72 % 822.57 11,461 0.22 % 266.94 različen različen različ en 601,579 0.46 % 530.17 10 1.16 % 1,030.18 6,101 0.20 % 153.62 1,707 0.36 % 431.51 265,022 0.42 % 488.32 132,380 0.61 % 706.34 147,499 0.41 % 463.92 48,860 0.93 % 
1,138.01 javen javen jav en 599,038 0.46 % 527.93 0 0 % 0 2,139 0.07 % 53.86 4,421 0.93 % 1,117.58 285,690 0.45 % 526.40 46,128 0.21 % 246.12 245,425 0.67 % 771.93 15,235 0.29 % 354.84 majhen majhen majh en 586,026 0.45 % 516.46 11 1.28 % 1,133.20 22,105 0.74 % 556.58 1,989 0.42 % 502.80 260,722 0.41 % 480.40 146,093 0.67 % 779.50 121,483 0.33 % 382.10 33,623 0.64 % 783.12 mogoč mogoč mog oč 549,842 0.42 % 484.57 0 0 % 0 24,830 0.83 % 625.19 2,511 0.53 % 634.75 266,544 0.42 % 491.12 98,624 0.46 % 526.23 129,413 0.36 % 407.04 27,920 0.53 % 650.29 ameriški ameriški ameriš ki 546,670 0.42 % 481.78 0 0 % 0 2,955 0.10 % 74.40 553 0.12 % 139.79 269,453 0.42 % 496.48 74,371 0.34 % 396.82 191,053 0.53 % 600.91 8,285 0.16 % 192.97 domač domač dom ač 514,492 0.40 % 453.42 1 0.12 % 103.02 5,007 0.17 % 126.07 1,010 0.21 % 255.32 290,270 0.46 % 534.84 65,068 0.30 % 347.18 143,722 0.40 % 452.04 9,414 0.18 % 219.26 pravi pravi pra vi 508,554 0.39 % 448.19 1 0.12 % 103.02 19,024 0.64 % 479 1,069 0.22 % 270.23 238,097 0.38 % 438.71 115,643 0.53 % 617.03 117,969 0.32 % 371.04 16,751 0.32 % 390.15 glaven glaven glav en 504,817 0.39 % 444.89 0 0 % 0 8,165 0.27 % 205.59 1,532 0.32 % 387.27 252,083 0.40 % 464.48 79,633 0.37 % 424.90 142,430 0.39 % 447.98 20,974 0.40 % 488.51 leten leten let en 497,066 0.38 % 438.06 0 0 % 0 930 0.03 % 23.42 819 0.17 % 207.03 258,476 0.41 % 476.26 37,560 0.17 % 200.41 195,112 0.54 % 613.68 4,169 0.08 % 97.10 poseben poseben poseb en 491,903 0.38 % 433.51 1 0.12 % 103.02 7,888 0.26 % 198.61 3,320 0.69 % 839.26 240,439 0.38 % 443.02 100,029 0.46 % 533.72 117,371 0.32 % 369.16 22,855 0.43 % 532.32 znan znan zn an 482,303 0.37 % 425.05 0 0 % 0 6,258 0.21 % 157.57 949 0.20 % 239.90 239,747 0.38 % 441.75 92,995 0.43 % 496.19 127,889 0.35 % 402.24 14,465 0.27 % 336.91 slab slab sl ab 456,819 0.35 % 402.59 0 0 % 0 11,344 0.38 % 285.63 522 0.11 % 131.96 222,346 0.35 % 409.69 74,623 0.34 % 398.16 136,968 0.38 % 430.80 11,016 0.21 % 256.58 številen številen števil 
en 440,492 0.34 % 388.20 1 0.12 % 103.02 3,801 0.13 % 95.71 536 0.11 % 135.50 199,765 0.32 % 368.08 74,277 0.34 % 396.32 143,497 0.39 % 451.34 18,615 0.35 % 433.57 mednaroden mednaroden mednarod en 438,511 0.34 % 386.46 0 0 % 0 889 0.03 % 22.38 2,246 0.47 % 567.76 237,200 0.37 % 437.06 40,842 0.19 % 217.92 150,001 0.41 % 471.79 7,333 0.14 % 170.79 nekdanji nekdanji nekdan ji 438,100 0.34 % 386.10 0 0 % 0 3,153 0.10 % 79.39 321 0.07 % 81.15 223,487 0.35 % 411.79 46,215 0.21 % 246.59 159,504 0.44 % 501.68 5,420 0.10 % 126.24 prihodnji prihodnji prihodn ji 435,770 0.34 % 384.04 0 0 % 0 1,515 0.05 % 38.15 524 0.11 % 132.46 255,001 0.40 % 469.86 35,529 0.16 % 189.57 139,896 0.38 % 440.01 3,305 0.06 % 76.98 političen političen politič en 433,419 0.33 % 381.97 0 0 % 0 2,210 0.07 % 55.65 1,473 0.31 % 372.36 232,367 0.37 % 428.15 50,096 0.23 % 267.30 128,308 0.35 % 403.56 18,965 0.36 % 441.72 lep lep l ep 420,724 0.32 % 370.78 0 0 % 0 23,733 0.79 % 597.57 2,813 0.59 % 711.10 198,767 0.31 % 366.24 101,128 0.47 % 539.59 81,862 0.23 % 257.48 12,421 0.23 % 289.30 letošnji letošnji letošn ji 419,649 0.32 % 369.83 0 0 % 0 181 0.01 % 4.56 258 0.05 % 65.22 248,400 0.39 % 457.69 43,478 0.20 % 231.98 126,783 0.35 % 398.77 549 0.01 % 12.79

1.2.68 List of final character-level 3-grams from adjective lemmas in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-word_parts-adjectives-lemmas-final-3grams-taxonomy-entire.tsv
Lemma | Lemma (lower-case) | Rest of the word | Final part of the word | Total absolute frequency of lemma | Percentage of all found lemmas | Total relative frequency (per million occurrences) | Absolute frequency [SSJ.T.K.N] | Percentage [SSJ.T.K.N] | Relative frequency [SSJ.T.K.N] | Absolute frequency [SSJ.T.K.L] | Percentage [SSJ.T.K.L] | Relative frequency [SSJ.T.K.L] | Absolute frequency [SSJ.T.D] | Percentage [SSJ.T.D] | Relative frequency [SSJ.T.D] | Absolute frequency [SSJ.T.P.C] | Percentage [SSJ.T.P.C] | Relative frequency [SSJ.T.P.C] | Absolute frequency [SSJ.T.P.R] | Percentage [SSJ.T.P.R] | Relative frequency [SSJ.T.P.R] | Absolute frequency [SSJ.I] | Percentage [SSJ.I] | Relative frequency [SSJ.I] | Absolute frequency [SSJ.T.K.S] | Percentage [SSJ.T.K.S] | Relative frequency [SSJ.T.K.S]

velik velik ve lik 2,652,565 2.04 % 2,337.69 5 0.58 % 515.09 51,414 1.72 % 1,294.55 5,866 1.23 % 1,482.86 1,283,822 2.02 % 2,365.53 485,663 2.24 % 2,591.34 724,281 1.99 % 2,278.05 101,514 1.93 % 2,364.38 nov nov nov 2,584,746 1.99 % 2,277.92 1 0.12 % 103.02 31,228 1.04 % 786.29 5,052 1.06 % 1,277.09 1,343,551 2.12 % 2,475.58 430,154 1.99 % 2,295.16 711,383 1.96 % 2,237.48 63,377 1.20 % 1,476.13 slovenski slovenski sloven ski 1,880,807 1.44 % 1,657.55 0 0 % 0 3,004 0.10 % 75.64 2,992 0.63 % 756.35 1,056,514 1.67 % 1,946.70 207,876 0.96 % 1,109.16 582,162 1.60 % 1,831.05 28,259 0.54 % 658.19 dober dober do ber 1,837,252 1.41 % 1,619.16 3 0.35 % 309.06 40,204 1.34 % 1,012.29 2,465 0.52 % 623.13 898,379 1.42 % 1,655.32 314,636 1.45 % 1,678.79 540,724 1.49 % 1,700.72 40,841 0.77 % 951.24 zadnji zadnji zad nji 1,254,851 0.96 % 1,105.89 3 0.35 % 309.06 26,159 0.87 % 658.66 1,842 0.39 % 465.64 634,771 1.00 % 1,169.61 170,972 0.79 % 912.25 400,657 1.10 % 1,260.17 20,447 0.39 % 476.23 sam sam sam 1,096,267 0.84 % 966.13 2 0.23 % 206.04 65,317 2.18 % 1,644.61 4,168 0.87 % 1,053.63 485,995 0.77 % 895.48 228,841 1.06 % 1,221.02 258,335 0.71 % 812.53 53,609 1.02 % 1,248.62 evropski evropski evrop ski 952,492 0.73 % 839.43 0 0 % 0 1,068 0.04 % 26.89 1,646 0.34 % 416.09 476,018 0.75 % 877.09 83,115 0.38 % 443.47 374,614 1.03 % 1,178.26 16,031 0.30 % 373.38 star star s tar 819,239 0.63 % 721.99 0 0 % 0 40,235 1.34 % 1,013.07 1,669 0.35 % 421.91 428,688 0.68 % 789.89 141,731 0.66 % 756.23 175,740 0.48 % 552.75 31,176 0.59 % 726.13 visok visok vi sok 785,457 0.60 % 692.22 1 0.12 % 103.02 15,006 0.50 % 377.83 2,003 0.42 % 506.34 386,467 0.61 % 712.09 116,871 0.54 % 623.59 235,448 0.65 % 740.55
29,661 0.56 % 690.84 pomemben pomemben pomem ben 785,444 0.60 % 692.21 0 0 % 0 12,253 0.41 % 308.52 2,080 0.43 % 525.80 369,088 0.58 % 680.07 142,611 0.66 % 760.93 213,803 0.59 % 672.47 45,609 0.86 % 1,062.29 državen državen drža ven 727,558 0.56 % 641.19 0 0 % 0 2,144 0.07 % 53.98 6,431 1.35 % 1,625.69 424,876 0.67 % 782.86 53,346 0.25 % 284.64 231,274 0.64 % 727.42 9,487 0.18 % 220.96 mlad mlad m lad 710,490 0.55 % 626.15 0 0 % 0 19,798 0.66 % 498.49 901 0.19 % 227.76 384,080 0.61 % 707.69 102,434 0.47 % 546.55 187,441 0.52 % 589.55 15,836 0.30 % 368.84 svetoven svetoven sveto ven 701,704 0.54 % 618.41 0 0 % 0 2,333 0.08 % 58.74 920 0.19 % 232.57 354,281 0.56 % 652.79 71,182 0.33 % 379.80 261,527 0.72 % 822.57 11,461 0.22 % 266.94 različen različen razli čen 601,579 0.46 % 530.17 10 1.16 % 1,030.18 6,101 0.20 % 153.62 1,707 0.36 % 431.51 265,022 0.42 % 488.32 132,380 0.61 % 706.34 147,499 0.41 % 463.92 48,860 0.93 % 1,138.01 javen javen ja ven 599,038 0.46 % 527.93 0 0 % 0 2,139 0.07 % 53.86 4,421 0.93 % 1,117.58 285,690 0.45 % 526.40 46,128 0.21 % 246.12 245,425 0.67 % 771.93 15,235 0.29 % 354.84 majhen majhen maj hen 586,026 0.45 % 516.46 11 1.28 % 1,133.20 22,105 0.74 % 556.58 1,989 0.42 % 502.80 260,722 0.41 % 480.40 146,093 0.68 % 779.50 121,483 0.33 % 382.10 33,623 0.64 % 783.12 mogoč mogoč mo goč 549,842 0.42 % 484.57 0 0 % 0 24,830 0.83 % 625.19 2,511 0.53 % 634.75 266,544 0.42 % 491.12 98,624 0.46 % 526.23 129,413 0.36 % 407.04 27,920 0.53 % 650.29 ameriški ameriški ameri ški 546,670 0.42 % 481.78 0 0 % 0 2,955 0.10 % 74.40 553 0.12 % 139.79 269,453 0.42 % 496.48 74,371 0.34 % 396.82 191,053 0.53 % 600.91 8,285 0.16 % 192.97 domač domač do mač 514,492 0.40 % 453.42 1 0.12 % 103.02 5,007 0.17 % 126.07 1,010 0.21 % 255.32 290,270 0.46 % 534.84 65,068 0.30 % 347.18 143,722 0.40 % 452.04 9,414 0.18 % 219.26 pravi pravi pr avi 508,554 0.39 % 448.19 1 0.12 % 103.02 19,024 0.64 % 479 1,069 0.22 % 270.23 238,097 0.38 % 438.71 115,643 0.53 % 617.03 117,969 0.32 % 
371.04 16,751 0.32 % 390.15 glaven glaven gla ven 504,817 0.39 % 444.89 0 0 % 0 8,165 0.27 % 205.59 1,532 0.32 % 387.27 252,083 0.40 % 464.48 79,633 0.37 % 424.90 142,430 0.39 % 447.98 20,974 0.40 % 488.51 leten leten le ten 497,066 0.38 % 438.06 0 0 % 0 930 0.03 % 23.42 819 0.17 % 207.03 258,476 0.41 % 476.26 37,560 0.17 % 200.41 195,112 0.54 % 613.68 4,169 0.08 % 97.10 poseben poseben pose ben 491,903 0.38 % 433.51 1 0.12 % 103.02 7,888 0.26 % 198.61 3,320 0.69 % 839.26 240,439 0.38 % 443.02 100,029 0.46 % 533.72 117,371 0.32 % 369.16 22,855 0.43 % 532.32 znan znan z nan 482,303 0.37 % 425.05 0 0 % 0 6,258 0.21 % 157.57 949 0.20 % 239.90 239,747 0.38 % 441.75 92,995 0.43 % 496.19 127,889 0.35 % 402.24 14,465 0.27 % 336.91 slab slab s lab 456,819 0.35 % 402.59 0 0 % 0 11,344 0.38 % 285.63 522 0.11 % 131.96 222,346 0.35 % 409.69 74,623 0.34 % 398.16 136,968 0.38 % 430.80 11,016 0.21 % 256.58 številen številen števi len 440,492 0.34 % 388.20 1 0.12 % 103.02 3,801 0.13 % 95.71 536 0.11 % 135.50 199,765 0.32 % 368.08 74,277 0.34 % 396.32 143,497 0.39 % 451.34 18,615 0.35 % 433.57 mednaroden mednaroden mednaro den 438,511 0.34 % 386.46 0 0 % 0 889 0.03 % 22.38 2,246 0.47 % 567.76 237,200 0.37 % 437.06 40,842 0.19 % 217.92 150,001 0.41 % 471.79 7,333 0.14 % 170.79 nekdanji nekdanji nekda nji 438,100 0.34 % 386.10 0 0 % 0 3,153 0.10 % 79.39 321 0.07 % 81.15 223,487 0.35 % 411.79 46,215 0.21 % 246.59 159,504 0.44 % 501.68 5,420 0.10 % 126.24 prihodnji prihodnji prihod nji 435,770 0.34 % 384.04 0 0 % 0 1,515 0.05 % 38.15 524 0.11 % 132.46 255,001 0.40 % 469.86 35,529 0.16 % 189.57 139,896 0.38 % 440.01 3,305 0.06 % 76.98 političen političen politi čen 433,419 0.33 % 381.97 0 0 % 0 2,210 0.07 % 55.65 1,473 0.31 % 372.36 232,367 0.37 % 428.15 50,096 0.23 % 267.30 128,308 0.35 % 403.56 18,965 0.36 % 441.72 lep lep lep 420,724 0.32 % 370.78 0 0 % 0 23,733 0.79 % 597.57 2,813 0.59 % 711.10 198,767 0.31 % 366.24 101,128 0.47 % 539.59 81,862 0.23 % 257.48 12,421 0.23 % 289.30 
letošnji letošnji letoš nji 419,649 0.32 % 369.83 0 0 % 0 181 0.01 % 4.56 258 0.05 % 65.22 248,400 0.39 % 457.69 43,478 0.20 % 231.98 126,783 0.35 % 398.77 549 0.01 % 12.79

1.2.69 List of final character-level 4-grams from adjective lemmas in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-word_parts-adjectives-lemmas-final-4grams-taxonomy-entire.tsv
Lemma | Lemma (lower-case) | Rest of the word | Final part of the word | Total absolute frequency of lemma | Percentage of all found lemmas | Total relative frequency (per million occurrences) | Absolute frequency [SSJ.T.K.N] | Percentage [SSJ.T.K.N] | Relative frequency [SSJ.T.K.N] | Absolute frequency [SSJ.T.K.L] | Percentage [SSJ.T.K.L] | Relative frequency [SSJ.T.K.L] | Absolute frequency [SSJ.T.D] | Percentage [SSJ.T.D] | Relative frequency [SSJ.T.D] | Absolute frequency [SSJ.T.P.C] | Percentage [SSJ.T.P.C] | Relative frequency [SSJ.T.P.C] | Absolute frequency [SSJ.T.P.R] | Percentage [SSJ.T.P.R] | Relative frequency [SSJ.T.P.R] | Absolute frequency [SSJ.I] | Percentage [SSJ.I] | Relative frequency [SSJ.I] | Absolute frequency [SSJ.T.K.S] | Percentage [SSJ.T.K.S] | Relative frequency [SSJ.T.K.S]

velik velik v elik 2,652,565 2.14 % 2,337.69 5 0.60 % 515.09 51,414 1.87 % 1,294.55 5,866 1.28 % 1,482.86 1,283,822 2.12 % 2,365.53 485,663 2.37 % 2,591.34 724,281 2.08 % 2,278.05 101,514 2.01 % 2,364.38 slovenski slovenski slove nski 1,880,807 1.51 % 1,657.55 0 0 % 0 3,004 0.11 % 75.64 2,992 0.65 % 756.35 1,056,514 1.75 % 1,946.70 207,876 1.01 % 1,109.16 582,162 1.67 % 1,831.05 28,259 0.56 % 658.19 dober dober d ober 1,837,252 1.48 % 1,619.16 3 0.36 % 309.06 40,204 1.46 % 1,012.29 2,465 0.54 % 623.13 898,379 1.49 % 1,655.32 314,636 1.53 % 1,678.79 540,724 1.55 % 1,700.72 40,841 0.81 % 951.24 zadnji zadnji za dnji 1,254,851 1.01 % 1,105.89 3 0.36 % 309.06 26,159 0.95 % 658.66 1,842 0.40 % 465.64 634,771 1.05 % 1,169.61 170,972 0.83 % 912.25 400,657 1.15 % 1,260.17
20,447 0.40 % 476.23 evropski evropski evro pski 952,492 0.77 % 839.43 0 0 % 0 1,068 0.04 % 26.89 1,646 0.36 % 416.09 476,018 0.79 % 877.09 83,115 0.41 % 443.47 374,614 1.07 % 1,178.26 16,031 0.32 % 373.38 star star star 819,239 0.66 % 721.99 0 0 % 0 40,235 1.46 % 1,013.07 1,669 0.36 % 421.91 428,688 0.71 % 789.89 141,731 0.69 % 756.23 175,740 0.50 % 552.75 31,176 0.62 % 726.13 visok visok v isok 785,457 0.63 % 692.22 1 0.12 % 103.02 15,006 0.55 % 377.83 2,003 0.44 % 506.34 386,467 0.64 % 712.09 116,871 0.57 % 623.59 235,448 0.68 % 740.55 29,661 0.59 % 690.84 pomemben pomemben pome mben 785,444 0.63 % 692.21 0 0 % 0 12,253 0.45 % 308.52 2,080 0.45 % 525.80 369,088 0.61 % 680.07 142,611 0.70 % 760.93 213,803 0.61 % 672.47 45,609 0.90 % 1,062.29 državen državen drž aven 727,558 0.59 % 641.19 0 0 % 0 2,144 0.08 % 53.98 6,431 1.40 % 1,625.69 424,876 0.70 % 782.86 53,346 0.26 % 284.64 231,274 0.66 % 727.42 9,487 0.19 % 220.96 mlad mlad mlad 710,490 0.57 % 626.15 0 0 % 0 19,798 0.72 % 498.49 901 0.20 % 227.76 384,080 0.64 % 707.69 102,434 0.50 % 546.55 187,441 0.54 % 589.55 15,836 0.31 % 368.84 svetoven svetoven svet oven 701,704 0.56 % 618.41 0 0 % 0 2,333 0.09 % 58.74 920 0.20 % 232.57 354,281 0.59 % 652.79 71,182 0.35 % 379.80 261,527 0.75 % 822.57 11,461 0.23 % 266.94 različen različen razl ičen 601,579 0.48 % 530.17 10 1.20 % 1,030.18 6,101 0.22 % 153.62 1,707 0.37 % 431.51 265,022 0.44 % 488.32 132,380 0.65 % 706.34 147,499 0.42 % 463.92 48,860 0.97 % 1,138.01 javen javen j aven 599,038 0.48 % 527.93 0 0 % 0 2,139 0.08 % 53.86 4,421 0.96 % 1,117.58 285,690 0.47 % 526.40 46,128 0.23 % 246.12 245,425 0.70 % 771.93 15,235 0.30 % 354.84 majhen majhen ma jhen 586,026 0.47 % 516.46 11 1.32 % 1,133.20 22,105 0.80 % 556.58 1,989 0.43 % 502.80 260,722 0.43 % 480.40 146,093 0.71 % 779.50 121,483 0.35 % 382.10 33,623 0.67 % 783.12 mogoč mogoč m ogoč 549,842 0.44 % 484.57 0 0 % 0 24,830 0.90 % 625.19 2,511 0.55 % 634.75 266,544 0.44 % 491.12 98,624 0.48 % 526.23 129,413 0.37 % 
407.04 27,920 0.55 % 650.29 ameriški ameriški amer iški 546,670 0.44 % 481.78 0 0 % 0 2,955 0.11 % 74.40 553 0.12 % 139.79 269,453 0.45 % 496.48 74,371 0.36 % 396.82 191,053 0.55 % 600.91 8,285 0.16 % 192.97 domač domač d omač 514,492 0.41 % 453.42 1 0.12 % 103.02 5,007 0.18 % 126.07 1,010 0.22 % 255.32 290,270 0.48 % 534.84 65,068 0.32 % 347.18 143,722 0.41 % 452.04 9,414 0.19 % 219.26 pravi pravi p ravi 508,554 0.41 % 448.19 1 0.12 % 103.02 19,024 0.69 % 479 1,069 0.23 % 270.23 238,097 0.39 % 438.71 115,643 0.56 % 617.03 117,969 0.34 % 371.04 16,751 0.33 % 390.15 glaven glaven gl aven 504,817 0.41 % 444.89 0 0 % 0 8,165 0.30 % 205.59 1,532 0.33 % 387.27 252,083 0.42 % 464.48 79,633 0.39 % 424.90 142,430 0.41 % 447.98 20,974 0.41 % 488.51 leten leten l eten 497,066 0.40 % 438.06 0 0 % 0 930 0.03 % 23.42 819 0.18 % 207.03 258,476 0.43 % 476.26 37,560 0.18 % 200.41 195,112 0.56 % 613.68 4,169 0.08 % 97.10 poseben poseben pos eben 491,903 0.40 % 433.51 1 0.12 % 103.02 7,888 0.29 % 198.61 3,320 0.72 % 839.26 240,439 0.40 % 443.02 100,029 0.49 % 533.72 117,371 0.34 % 369.16 22,855 0.45 % 532.32 znan znan znan 482,303 0.39 % 425.05 0 0 % 0 6,258 0.23 % 157.57 949 0.21 % 239.90 239,747 0.40 % 441.75 92,995 0.45 % 496.19 127,889 0.37 % 402.24 14,465 0.29 % 336.91 slab slab slab 456,819 0.37 % 402.59 0 0 % 0 11,344 0.41 % 285.63 522 0.11 % 131.96 222,346 0.37 % 409.69 74,623 0.36 % 398.16 136,968 0.39 % 430.80 11,016 0.22 % 256.58 številen številen štev ilen 440,492 0.35 % 388.20 1 0.12 % 103.02 3,801 0.14 % 95.71 536 0.12 % 135.50 199,765 0.33 % 368.08 74,277 0.36 % 396.32 143,497 0.41 % 451.34 18,615 0.37 % 433.57 mednaroden mednaroden mednar oden 438,511 0.35 % 386.46 0 0 % 0 889 0.03 % 22.38 2,246 0.49 % 567.76 237,200 0.39 % 437.06 40,842 0.20 % 217.92 150,001 0.43 % 471.79 7,333 0.14 % 170.79 nekdanji nekdanji nekd anji 438,100 0.35 % 386.10 0 0 % 0 3,153 0.11 % 79.39 321 0.07 % 81.15 223,487 0.37 % 411.79 46,215 0.23 % 246.59 159,504 0.46 % 501.68 5,420 0.11 % 
126.24 prihodnji prihodnji priho dnji 435,770 0.35 % 384.04 0 0 % 0 1,515 0.06 % 38.15 524 0.11 % 132.46 255,001 0.42 % 469.86 35,529 0.17 % 189.57 139,896 0.40 % 440.01 3,305 0.07 % 76.98 političen političen polit ičen 433,419 0.35 % 381.97 0 0 % 0 2,210 0.08 % 55.65 1,473 0.32 % 372.36 232,367 0.38 % 428.15 50,096 0.24 % 267.30 128,308 0.37 % 403.56 18,965 0.38 % 441.72 letošnji letošnji leto šnji 419,649 0.34 % 369.83 0 0 % 0 181 0.01 % 4.56 258 0.06 % 65.22 248,400 0.41 % 457.69 43,478 0.21 % 231.98 126,783 0.36 % 398.77 549 0.01 % 12.79 dolg dolg dolg 415,365 0.34 % 366.06 1 0.12 % 103.02 20,443 0.74 % 514.73 1,176 0.26 % 297.28 187,249 0.31 % 345.02 84,197 0.41 % 449.25 103,367 0.30 % 325.12 18,932 0.37 % 440.95 podoben podoben pod oben 409,114 0.33 % 360.55 2 0.24 % 206.04 13,805 0.50 % 347.59 1,564 0.34 % 395.36 190,385 0.32 % 350.80 90,523 0.44 % 483 90,775 0.26 % 285.51 22,060 0.44 % 513.80 deloven deloven del oven 399,552 0.32 % 352.12 0 0 % 0 3,474 0.13 % 87.47 2,872 0.62 % 726.01 194,485 0.32 % 358.35 55,822 0.27 % 297.85 125,086 0.36 % 393.43 17,813 0.35 % 414.89

1.2.70 List of final character-level 5-grams from adjective lemmas in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-word_parts-adjectives-lemmas-final-5grams-taxonomy-entire.tsv
Lemma | Lemma (lower-case) | Rest of the word | Final part of the word | Total absolute frequency of lemma | Percentage of all found lemmas | Total relative frequency (per million occurrences) | Absolute frequency [SSJ.T.K.N] | Percentage [SSJ.T.K.N] | Relative frequency [SSJ.T.K.N] | Absolute frequency [SSJ.T.K.L] | Percentage [SSJ.T.K.L] | Relative frequency [SSJ.T.K.L] | Absolute frequency [SSJ.T.D] | Percentage [SSJ.T.D] | Relative frequency [SSJ.T.D] | Absolute frequency [SSJ.T.P.C] | Percentage [SSJ.T.P.C] | Relative frequency [SSJ.T.P.C] | Absolute frequency [SSJ.T.P.R] | Percentage [SSJ.T.P.R] | Relative frequency [SSJ.T.P.R] | Absolute frequency [SSJ.I] | Percentage [SSJ.I] | Relative frequency [SSJ.I] | Absolute frequency [SSJ.T.K.S] | Percentage [SSJ.T.K.S] | Relative frequency [SSJ.T.K.S]

velik velik velik 2,652,565 2.23 % 2,337.69 5 0.66 % 515.09 51,414 2.04 % 1,294.55 5,866 1.31 % 1,482.86 1,283,822 2.22 % 2,365.53 485,663 2.49 % 2,591.34 724,281 2.16 % 2,278.05 101,514 2.09 % 2,364.38 slovenski slovenski slov enski 1,880,807 1.58 % 1,657.55 0 0 % 0 3,004 0.12 % 75.64 2,992 0.67 % 756.35 1,056,514 1.83 % 1,946.70 207,876 1.07 % 1,109.16 582,162 1.74 % 1,831.05 28,259 0.58 % 658.19 dober dober dober 1,837,252 1.55 % 1,619.16 3 0.40 % 309.06 40,204 1.60 % 1,012.29 2,465 0.55 % 623.13 898,379 1.55 % 1,655.32 314,636 1.62 % 1,678.79 540,724 1.61 % 1,700.72 40,841 0.84 % 951.24 zadnji zadnji z adnji 1,254,851 1.06 % 1,105.89 3 0.40 % 309.06 26,159 1.04 % 658.66 1,842 0.41 % 465.64 634,771 1.10 % 1,169.61 170,972 0.88 % 912.25 400,657 1.19 % 1,260.17 20,447 0.42 % 476.23 evropski evropski evr opski 952,492 0.80 % 839.43 0 0 % 0 1,068 0.04 % 26.89 1,646 0.37 % 416.09 476,018 0.82 % 877.09 83,115 0.43 % 443.47 374,614 1.12 % 1,178.26 16,031 0.33 % 373.38 visok visok visok 785,457 0.66 % 692.22 1 0.13 % 103.02 15,006 0.60 % 377.83 2,003 0.45 % 506.34 386,467 0.67 % 712.09 116,871 0.60 % 623.59 235,448 0.70 % 740.55 29,661 0.61 % 690.84 pomemben pomemben pom emben 785,444 0.66 % 692.21 0 0 % 0 12,253 0.49 % 308.52 2,080 0.47 % 525.80 369,088 0.64 % 680.07 142,611 0.73 % 760.93 213,803 0.64 % 672.47 45,609 0.94 % 1,062.29 državen državen dr žaven 727,558 0.61 % 641.19 0 0 % 0 2,144 0.09 % 53.98 6,431 1.44 % 1,625.69 424,876 0.73 % 782.86 53,346 0.27 % 284.64 231,274 0.69 % 727.42 9,487 0.20 % 220.96 svetoven svetoven sve toven 701,704 0.59 % 618.41 0 0 % 0 2,333 0.09 % 58.74 920 0.21 % 232.57 354,281 0.61 % 652.79 71,182 0.37 % 379.80 261,527 0.78 % 822.57 11,461 0.24 % 266.94 različen različen raz ličen 601,579 0.51 % 530.17 10 1.32 % 1,030.18 6,101 0.24 % 153.62 1,707 0.38 % 431.51 265,022 0.46 % 488.32
132,380 0.68 % 706.34 147,499 0.44 % 463.92 48,860 1.01 % 1,138.01 javen javen javen 599,038 0.51 % 527.93 0 0 % 0 2,139 0.09 % 53.86 4,421 0.99 % 1,117.58 285,690 0.49 % 526.40 46,128 0.24 % 246.12 245,425 0.73 % 771.93 15,235 0.31 % 354.84 majhen majhen m ajhen 586,026 0.49 % 516.46 11 1.46 % 1,133.20 22,105 0.88 % 556.58 1,989 0.44 % 502.80 260,722 0.45 % 480.40 146,093 0.75 % 779.50 121,483 0.36 % 382.10 33,623 0.69 % 783.12 mogoč mogoč mogoč 549,842 0.46 % 484.57 0 0 % 0 24,830 0.99 % 625.19 2,511 0.56 % 634.75 266,544 0.46 % 491.12 98,624 0.51 % 526.23 129,413 0.39 % 407.04 27,920 0.58 % 650.29 ameriški ameriški ame riški 546,670 0.46 % 481.78 0 0 % 0 2,955 0.12 % 74.40 553 0.12 % 139.79 269,453 0.47 % 496.48 74,371 0.38 % 396.82 191,053 0.57 % 600.91 8,285 0.17 % 192.97 domač domač domač 514,492 0.43 % 453.42 1 0.13 % 103.02 5,007 0.20 % 126.07 1,010 0.23 % 255.32 290,270 0.50 % 534.84 65,068 0.33 % 347.18 143,722 0.43 % 452.04 9,414 0.19 % 219.26 pravi pravi pravi 508,554 0.43 % 448.19 1 0.13 % 103.02 19,024 0.76 % 479 1,069 0.24 % 270.23 238,097 0.41 % 438.71 115,643 0.59 % 617.03 117,969 0.35 % 371.04 16,751 0.35 % 390.15 glaven glaven g laven 504,817 0.42 % 444.89 0 0 % 0 8,165 0.32 % 205.59 1,532 0.34 % 387.27 252,083 0.44 % 464.48 79,633 0.41 % 424.90 142,430 0.42 % 447.98 20,974 0.43 % 488.51 leten leten leten 497,066 0.42 % 438.06 0 0 % 0 930 0.04 % 23.42 819 0.18 % 207.03 258,476 0.45 % 476.26 37,560 0.19 % 200.41 195,112 0.58 % 613.68 4,169 0.09 % 97.10 poseben poseben po seben 491,903 0.41 % 433.51 1 0.13 % 103.02 7,888 0.31 % 198.61 3,320 0.74 % 839.26 240,439 0.42 % 443.02 100,029 0.51 % 533.72 117,371 0.35 % 369.16 22,855 0.47 % 532.32 številen številen šte vilen 440,492 0.37 % 388.20 1 0.13 % 103.02 3,801 0.15 % 95.71 536 0.12 % 135.50 199,765 0.34 % 368.08 74,277 0.38 % 396.32 143,497 0.43 % 451.34 18,615 0.38 % 433.57 mednaroden mednaroden medna roden 438,511 0.37 % 386.46 0 0 % 0 889 0.04 % 22.38 2,246 0.50 % 567.76 237,200 0.41 % 437.06 
40,842 0.21 % 217.92 150,001 0.45 % 471.79 7,333 0.15 % 170.79 nekdanji nekdanji nek danji 438,100 0.37 % 386.10 0 0 % 0 3,153 0.12 % 79.39 321 0.07 % 81.15 223,487 0.39 % 411.79 46,215 0.24 % 246.59 159,504 0.47 % 501.68 5,420 0.11 % 126.24 prihodnji prihodnji prih odnji 435,770 0.37 % 384.04 0 0 % 0 1,515 0.06 % 38.15 524 0.12 % 132.46 255,001 0.44 % 469.86 35,529 0.18 % 189.57 139,896 0.42 % 440.01 3,305 0.07 % 76.98 političen političen poli tičen 433,419 0.36 % 381.97 0 0 % 0 2,210 0.09 % 55.65 1,473 0.33 % 372.36 232,367 0.40 % 428.15 50,096 0.26 % 267.30 128,308 0.38 % 403.56 18,965 0.39 % 441.72 letošnji letošnji let ošnji 419,649 0.35 % 369.83 0 0 % 0 181 0.01 % 4.56 258 0.06 % 65.22 248,400 0.43 % 457.69 43,478 0.22 % 231.98 126,783 0.38 % 398.77 549 0.01 % 12.79 podoben podoben po doben 409,114 0.34 % 360.55 2 0.27 % 206.04 13,805 0.55 % 347.59 1,564 0.35 % 395.36 190,385 0.33 % 350.80 90,523 0.47 % 483 90,775 0.27 % 285.51 22,060 0.46 % 513.80 deloven deloven de loven 399,552 0.34 % 352.12 0 0 % 0 3,474 0.14 % 87.47 2,872 0.64 % 726.01 194,485 0.34 % 358.35 55,822 0.29 % 297.85 125,086 0.37 % 393.43 17,813 0.37 % 414.89 naslednji naslednji nasl ednji 398,626 0.34 % 351.31 0 0 % 0 14,763 0.59 % 371.72 2,230 0.50 % 563.72 184,210 0.32 % 339.42 71,373 0.37 % 380.82 106,100 0.32 % 333.71 19,950 0.41 % 464.66 skupen skupen s kupen 393,317 0.33 % 346.63 0 0 % 0 3,599 0.14 % 90.62 1,505 0.34 % 380.45 200,699 0.35 % 369.80 50,130 0.26 % 267.48 121,349 0.36 % 381.67 16,035 0.33 % 373.47 kratek kratek k ratek 384,378 0.32 % 338.75 0 0 % 0 12,904 0.51 % 324.91 1,112 0.25 % 281.10 188,289 0.33 % 346.93 78,154 0.40 % 417 89,042 0.27 % 280.06 14,877 0.31 % 346.50 gospodarski gospodarski gospod arski 362,577 0.31 % 319.54 0 0 % 0 414 0.02 % 10.42 2,060 0.46 % 520.75 194,741 0.34 % 358.82 30,506 0.16 % 162.77 126,209 0.38 % 396.96 8,647 0.18 % 201.40 potreben potreben pot reben 349,591 0.29 % 308.09 3 0.40 % 309.06 4,573 0.18 % 115.14 3,388 0.76 % 856.45 171,370 0.30 % 
315.76 60,929 0.31 % 325.10 91,369 0.27 % 287.38 17,959 0.37 % 418.29

1.2.71 List of initial character-level 1-grams from adjective lower-case word forms in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-word_parts-adjectives-lowercase_forms-initial-1grams-taxonomy-entire.tsv
Lower-case word form | Initial part of the word | Rest of the word | Total absolute frequency of lower-case word form | Percentage of all found lower-case word forms | Total relative frequency (per million occurrences) | Absolute frequency [SSJ.T.K.N] | Percentage [SSJ.T.K.N] | Relative frequency [SSJ.T.K.N] | Absolute frequency [SSJ.T.K.L] | Percentage [SSJ.T.K.L] | Relative frequency [SSJ.T.K.L] | Absolute frequency [SSJ.T.D] | Percentage [SSJ.T.D] | Relative frequency [SSJ.T.D] | Absolute frequency [SSJ.T.P.C] | Percentage [SSJ.T.P.C] | Relative frequency [SSJ.T.P.C] | Absolute frequency [SSJ.T.P.R] | Percentage [SSJ.T.P.R] | Relative frequency [SSJ.T.P.R] | Absolute frequency [SSJ.I] | Percentage [SSJ.I] | Relative frequency [SSJ.I] | Absolute frequency [SSJ.T.K.S] | Percentage [SSJ.T.K.S] | Relative frequency [SSJ.T.K.S]

mogoče m ogoče 502,447 0.39 % 442.80 0 0 % 0 23,836 0.80 % 600.16 2,362 0.49 % 597.09 243,451 0.38 % 448.57 90,333 0.42 % 481.99 116,502 0.32 % 366.43 25,963 0.49 % 604.71 novo n ovo 475,332 0.36 % 418.91 1 0.12 % 103.02 7,539 0.25 % 189.82 788 0.17 % 199.20 258,248 0.41 % 475.84 75,103 0.35 % 400.72 121,642 0.33 % 382.60 12,011 0.23 % 279.75 slovenski s lovenski 446,624 0.34 % 393.61 0 0 % 0 678 0.02 % 17.07 576 0.12 % 145.61 240,556 0.38 % 443.24 47,459 0.22 % 253.23 151,452 0.42 % 476.36 5,903 0.11 % 137.49 sam s am 404,454 0.31 % 356.44 0 0 % 0 28,032 0.94 % 705.82 1,505 0.32 % 380.45 182,788 0.29 % 336.80 76,278 0.35 % 406.99 99,109 0.27 % 311.72 16,742 0.32 % 389.94 nove n ove 383,174 0.29 % 337.69 0 0 % 0 4,160 0.14 % 104.74 806 0.17 % 203.75 197,154 0.31 % 363.27 63,139 0.29 % 336.89 105,758 0.29 % 332.64
12,157 0.23 % 283.15 zadnjih z adnjih 374,079 0.29 % 329.67 0 0 % 0 3,656 0.12 % 92.05 371 0.08 % 93.78 188,954 0.30 % 348.16 40,539 0.19 % 216.30 135,861 0.37 % 427.32 4,698 0.09 % 109.42 slovenske s lovenske 353,715 0.27 % 311.73 0 0 % 0 506 0.02 % 12.74 592 0.12 % 149.65 206,966 0.33 % 381.35 37,129 0.17 % 198.11 103,866 0.28 % 326.69 4,656 0.09 % 108.44 novi n ovi 329,121 0.25 % 290.05 0 0 % 0 3,532 0.12 % 88.93 561 0.12 % 141.81 173,329 0.27 % 319.37 56,367 0.26 % 300.76 89,190 0.24 % 280.53 6,142 0.12 % 143.05 slovenskih s lovenskih 294,271 0.23 % 259.34 0 0 % 0 442 0.01 % 11.13 399 0.08 % 100.86 174,340 0.28 % 321.23 34,387 0.16 % 183.48 80,914 0.22 % 254.50 3,789 0.07 % 88.25 novega n ovega 281,250 0.22 % 247.86 0 0 % 0 3,744 0.12 % 94.27 446 0.09 % 112.74 149,906 0.24 % 276.21 43,885 0.20 % 234.16 77,637 0.21 % 244.19 5,632 0.11 % 131.18 različnih r azličnih 266,466 0.20 % 234.84 2 0.23 % 206.04 2,438 0.08 % 61.39 678 0.14 % 171.39 118,184 0.19 % 217.76 56,333 0.26 % 300.57 68,940 0.19 % 216.83 19,891 0.38 % 463.29 novih n ovih 265,857 0.20 % 234.30 0 0 % 0 2,128 0.07 % 53.58 655 0.14 % 165.58 137,591 0.22 % 253.52 42,406 0.20 % 226.26 74,578 0.20 % 234.57 8,499 0.16 % 197.95 nova n ova 254,400 0.20 % 224.20 0 0 % 0 2,819 0.09 % 70.98 635 0.13 % 160.52 128,227 0.20 % 236.27 41,185 0.19 % 219.75 75,332 0.21 % 236.94 6,202 0.12 % 144.45 veliko v eliko 252,295 0.19 % 222.35 0 0 % 0 7,380 0.25 % 185.82 564 0.12 % 142.57 121,628 0.19 % 224.11 44,659 0.21 % 238.29 68,608 0.19 % 215.79 9,456 0.18 % 220.24 zadnji z adnji 251,253 0.19 % 221.43 1 0.12 % 103.02 6,258 0.21 % 157.57 400 0.08 % 101.12 128,961 0.20 % 237.62 35,235 0.16 % 188 75,833 0.21 % 238.51 4,565 0.09 % 106.32 sami s ami 231,519 0.18 % 204.04 1 0.12 % 103.02 5,772 0.19 % 145.33 753 0.16 % 190.35 111,377 0.18 % 205.22 49,240 0.23 % 262.73 52,561 0.14 % 165.32 11,815 0.22 % 275.19 velika v elika 224,674 0.17 % 198 0 0 % 0 5,726 0.19 % 144.17 592 0.12 % 149.65 104,684 0.17 % 192.89 42,115 0.19 % 224.71 
62,856 0.17 % 197.70 8,701 0.17 % 202.66 pomembno p omembno 223,934 0.17 % 197.35 0 0 % 0 4,575 0.15 % 115.19 439 0.09 % 110.97 99,670 0.16 % 183.65 43,834 0.20 % 233.88 62,790 0.17 % 197.49 12,626 0.24 % 294.07 letni l etni 222,751 0.17 % 196.31 0 0 % 0 309 0.01 % 7.78 182 0.04 % 46.01 114,022 0.18 % 210.09 12,989 0.06 % 69.31 94,228 0.26 % 296.37 1,021 0.02 % 23.78 slovenska s lovenska 220,229 0.17 % 194.09 0 0 % 0 268 0.01 % 6.75 407 0.09 % 102.89 118,994 0.19 % 219.25 22,935 0.11 % 122.37 74,627 0.20 % 234.72 2,998 0.06 % 69.83 veliki v eliki 216,843 0.17 % 191.10 0 0 % 0 5,506 0.18 % 138.64 648 0.14 % 163.81 103,011 0.16 % 189.80 37,422 0.17 % 199.67 60,010 0.17 % 188.75 10,246 0.19 % 238.64 zadnjem z adnjem 213,519 0.16 % 188.17 1 0.12 % 103.02 2,919 0.10 % 73.50 290 0.06 % 73.31 110,962 0.17 % 204.45 28,044 0.13 % 149.63 68,052 0.19 % 214.04 3,251 0.06 % 75.72 sama s ama 209,273 0.16 % 184.43 0 0 % 0 17,968 0.60 % 452.41 729 0.15 % 184.28 84,772 0.13 % 156.20 47,535 0.22 % 253.63 50,301 0.14 % 158.21 7,968 0.15 % 185.58 evropske e vropske 208,721 0.16 % 183.94 0 0 % 0 195 0.01 % 4.91 447 0.09 % 113 106,056 0.17 % 195.42 17,905 0.08 % 95.54 80,530 0.22 % 253.29 3,588 0.07 % 83.57 velik v elik 205,754 0.16 % 181.33 0 0 % 0 5,789 0.19 % 145.76 372 0.08 % 94.04 94,260 0.15 % 173.68 39,261 0.18 % 209.48 58,805 0.16 % 184.96 7,267 0.14 % 169.26 prihodnje p rihodnje 203,631 0.16 % 179.46 0 0 % 0 614 0.02 % 15.46 158 0.03 % 39.94 120,469 0.19 % 221.97 19,044 0.09 % 101.61 61,773 0.17 % 194.29 1,573 0.03 % 36.64 slovenskega s lovenskega 202,094 0.15 % 178.10 0 0 % 0 398 0.01 % 10.02 407 0.09 % 102.89 112,062 0.18 % 206.48 22,466 0.10 % 119.87 62,505 0.17 % 196.59 4,256 0.08 % 99.13 dobro d obro 201,264 0.15 % 177.37 0 0 % 0 6,966 0.23 % 175.40 343 0.07 % 86.71 98,094 0.15 % 180.74 39,696 0.18 % 211.80 50,431 0.14 % 158.62 5,734 0.11 % 133.55 velike v elike 199,502 0.15 % 175.82 4 0.47 % 412.07 4,824 0.16 % 121.46 674 0.14 % 170.38 95,438 0.15 % 175.85 36,157 0.17 % 
192.92 52,627 0.14 % 165.53 9,778 0.18 % 227.74 nov n ov 192,972 0.15 % 170.07 0 0 % 0 3,307 0.11 % 83.27 567 0.12 % 143.33 87,154 0.14 % 160.59 34,997 0.16 % 186.73 62,028 0.17 % 195.09 4,919 0.09 % 114.57 državni d ržavni 190,454 0.15 % 167.85 0 0 % 0 679 0.02 % 17.10 2,076 0.43 % 524.79 111,848 0.18 % 206.09 14,035 0.07 % 74.89 59,642 0.16 % 187.59 2,174 0.04 % 50.64 najboljši n ajboljši 182,232 0.14 % 160.60 0 0 % 0 2,304 0.08 % 58.01 142 0.03 % 35.90 90,767 0.14 % 167.24 23,701 0.11 % 126.46 63,025 0.17 % 198.23 2,293 0.04 % 53.41

1.2.72 List of initial character-level 2-grams from adjective lower-case word forms in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-word_parts-adjectives-lowercase_forms-initial-2grams-taxonomy-entire.tsv
Lower-case word form | Initial part of the word | Rest of the word | Total absolute frequency of lower-case word form | Percentage of all found lower-case word forms | Total relative frequency (per million occurrences) | Absolute frequency [SSJ.T.K.N] | Percentage [SSJ.T.K.N] | Relative frequency [SSJ.T.K.N] | Absolute frequency [SSJ.T.K.L] | Percentage [SSJ.T.K.L] | Relative frequency [SSJ.T.K.L] | Absolute frequency [SSJ.T.D] | Percentage [SSJ.T.D] | Relative frequency [SSJ.T.D] | Absolute frequency [SSJ.T.P.C] | Percentage [SSJ.T.P.C] | Relative frequency [SSJ.T.P.C] | Absolute frequency [SSJ.T.P.R] | Percentage [SSJ.T.P.R] | Relative frequency [SSJ.T.P.R] | Absolute frequency [SSJ.I] | Percentage [SSJ.I] | Relative frequency [SSJ.I] | Absolute frequency [SSJ.T.K.S] | Percentage [SSJ.T.K.S] | Relative frequency [SSJ.T.K.S]

mogoče mo goče 502,447 0.39 % 442.80 0 0 % 0 23,836 0.80 % 600.16 2,362 0.49 % 597.09 243,451 0.38 % 448.57 90,333 0.42 % 481.99 116,502 0.32 % 366.43 25,963 0.49 % 604.71 novo no vo 475,332 0.36 % 418.91 1 0.12 % 103.02 7,539 0.25 % 189.82 788 0.17 % 199.20 258,248 0.41 % 475.84 75,103 0.35 % 400.72 121,642 0.33 % 382.60 12,011 0.23 % 279.75
slovenski | sl | ovenski | 446,624 | 0.34 % | 393.61 | 0 | 0 % | 0 | 678 | 0.02 % | 17.07 | 576 | 0.12 % | 145.61 | 240,556 | 0.38 % | 443.24 | 47,459 | 0.22 % | 253.23 | 151,452 | 0.42 % | 476.36 | 5,903 | 0.11 % | 137.49
sam | sa | m | 404,454 | 0.31 % | 356.44 | 0 | 0 % | 0 | 28,032 | 0.94 % | 705.82 | 1,505 | 0.32 % | 380.45 | 182,788 | 0.29 % | 336.80 | 76,278 | 0.35 % | 406.99 | 99,109 | 0.27 % | 311.72 | 16,742 | 0.32 % | 389.94
nove | no | ve | 383,174 | 0.29 % | 337.69 | 0 | 0 % | 0 | 4,160 | 0.14 % | 104.74 | 806 | 0.17 % | 203.75 | 197,154 | 0.31 % | 363.27 | 63,139 | 0.29 % | 336.89 | 105,758 | 0.29 % | 332.64 | 12,157 | 0.23 % | 283.15
zadnjih | za | dnjih | 374,079 | 0.29 % | 329.67 | 0 | 0 % | 0 | 3,656 | 0.12 % | 92.05 | 371 | 0.08 % | 93.78 | 188,954 | 0.30 % | 348.16 | 40,539 | 0.19 % | 216.30 | 135,861 | 0.37 % | 427.32 | 4,698 | 0.09 % | 109.42
slovenske | sl | ovenske | 353,715 | 0.27 % | 311.73 | 0 | 0 % | 0 | 506 | 0.02 % | 12.74 | 592 | 0.12 % | 149.65 | 206,966 | 0.33 % | 381.35 | 37,129 | 0.17 % | 198.11 | 103,866 | 0.28 % | 326.69 | 4,656 | 0.09 % | 108.44
novi | no | vi | 329,121 | 0.25 % | 290.05 | 0 | 0 % | 0 | 3,532 | 0.12 % | 88.93 | 561 | 0.12 % | 141.81 | 173,329 | 0.27 % | 319.37 | 56,367 | 0.26 % | 300.76 | 89,190 | 0.24 % | 280.53 | 6,142 | 0.12 % | 143.05
slovenskih | sl | ovenskih | 294,271 | 0.23 % | 259.34 | 0 | 0 % | 0 | 442 | 0.01 % | 11.13 | 399 | 0.08 % | 100.86 | 174,340 | 0.28 % | 321.23 | 34,387 | 0.16 % | 183.48 | 80,914 | 0.22 % | 254.50 | 3,789 | 0.07 % | 88.25
novega | no | vega | 281,250 | 0.22 % | 247.86 | 0 | 0 % | 0 | 3,744 | 0.12 % | 94.27 | 446 | 0.09 % | 112.74 | 149,906 | 0.24 % | 276.21 | 43,885 | 0.20 % | 234.16 | 77,637 | 0.21 % | 244.19 | 5,632 | 0.11 % | 131.18
različnih | ra | zličnih | 266,466 | 0.20 % | 234.84 | 2 | 0.23 % | 206.04 | 2,438 | 0.08 % | 61.39 | 678 | 0.14 % | 171.39 | 118,184 | 0.19 % | 217.76 | 56,333 | 0.26 % | 300.57 | 68,940 | 0.19 % | 216.83 | 19,891 | 0.38 % | 463.29
novih | no | vih | 265,857 | 0.20 % | 234.30 | 0 | 0 % | 0 | 2,128 | 0.07 % | 53.58 | 655 | 0.14 % | 165.58 | 137,591 | 0.22 % | 253.52 | 42,406 | 0.20 % | 226.26 | 74,578 | 0.20 % | 234.57 | 8,499 | 0.16 % | 197.95
nova | no | va | 254,400 | 0.20 % | 224.20 | 0 | 0 % | 0 | 2,819 | 0.09 % | 70.98 | 635 | 0.13 % | 160.52 | 128,227 | 0.20 % | 236.27 | 41,185 | 0.19 % | 219.75 | 75,332 | 0.21 % | 236.94 | 6,202 | 0.12 % | 144.45
veliko | ve | liko | 252,295 | 0.19 % | 222.35 | 0 | 0 % | 0 | 7,380 | 0.25 % | 185.82 | 564 | 0.12 % | 142.57 | 121,628 | 0.19 % | 224.11 | 44,659 | 0.21 % | 238.29 | 68,608 | 0.19 % | 215.79 | 9,456 | 0.18 % | 220.24
zadnji | za | dnji | 251,253 | 0.19 % | 221.43 | 1 | 0.12 % | 103.02 | 6,258 | 0.21 % | 157.57 | 400 | 0.08 % | 101.12 | 128,961 | 0.20 % | 237.62 | 35,235 | 0.16 % | 188 | 75,833 | 0.21 % | 238.51 | 4,565 | 0.09 % | 106.32
sami | sa | mi | 231,519 | 0.18 % | 204.04 | 1 | 0.12 % | 103.02 | 5,772 | 0.19 % | 145.33 | 753 | 0.16 % | 190.35 | 111,377 | 0.18 % | 205.22 | 49,240 | 0.23 % | 262.73 | 52,561 | 0.14 % | 165.32 | 11,815 | 0.22 % | 275.19
velika | ve | lika | 224,674 | 0.17 % | 198 | 0 | 0 % | 0 | 5,726 | 0.19 % | 144.17 | 592 | 0.12 % | 149.65 | 104,684 | 0.17 % | 192.89 | 42,115 | 0.19 % | 224.71 | 62,856 | 0.17 % | 197.70 | 8,701 | 0.17 % | 202.66
pomembno | po | membno | 223,934 | 0.17 % | 197.35 | 0 | 0 % | 0 | 4,575 | 0.15 % | 115.19 | 439 | 0.09 % | 110.97 | 99,670 | 0.16 % | 183.65 | 43,834 | 0.20 % | 233.88 | 62,790 | 0.17 % | 197.49 | 12,626 | 0.24 % | 294.07
letni | le | tni | 222,751 | 0.17 % | 196.31 | 0 | 0 % | 0 | 309 | 0.01 % | 7.78 | 182 | 0.04 % | 46.01 | 114,022 | 0.18 % | 210.09 | 12,989 | 0.06 % | 69.31 | 94,228 | 0.26 % | 296.37 | 1,021 | 0.02 % | 23.78
slovenska | sl | ovenska | 220,229 | 0.17 % | 194.09 | 0 | 0 % | 0 | 268 | 0.01 % | 6.75 | 407 | 0.09 % | 102.89 | 118,994 | 0.19 % | 219.25 | 22,935 | 0.11 % | 122.37 | 74,627 | 0.20 % | 234.72 | 2,998 | 0.06 % | 69.83
veliki | ve | liki | 216,843 | 0.17 % | 191.10 | 0 | 0 % | 0 | 5,506 | 0.18 % | 138.64 | 648 | 0.14 % | 163.81 | 103,011 | 0.16 % | 189.80 | 37,422 | 0.17 % | 199.67 | 60,010 | 0.17 % | 188.75 | 10,246 | 0.19 % | 238.64
zadnjem | za | dnjem | 213,519 | 0.16 % | 188.17 | 1 | 0.12 % | 103.02 | 2,919 | 0.10 % | 73.50 | 290 | 0.06 % | 73.31 | 110,962 | 0.17 % | 204.45 | 28,044 | 0.13 % | 149.63 | 68,052 | 0.19 % | 214.04 | 3,251 | 0.06 % | 75.72
sama | sa | ma | 209,273 | 0.16 % | 184.43 | 0 | 0 % | 0 | 17,968 | 0.60 % | 452.41 | 729 | 0.15 % | 184.28 | 84,772 | 0.13 % | 156.20 | 47,535 | 0.22 % | 253.63 | 50,301 | 0.14 % | 158.21 | 7,968 | 0.15 % | 185.58
evropske | ev | ropske | 208,721 | 0.16 % | 183.94 | 0 | 0 % | 0 | 195 | 0.01 % | 4.91 | 447 | 0.09 % | 113 | 106,056 | 0.17 % | 195.42 | 17,905 | 0.08 % | 95.54 | 80,530 | 0.22 % | 253.29 | 3,588 | 0.07 % | 83.57
velik | ve | lik | 205,754 | 0.16 % | 181.33 | 0 | 0 % | 0 | 5,789 | 0.19 % | 145.76 | 372 | 0.08 % | 94.04 | 94,260 | 0.15 % | 173.68 | 39,261 | 0.18 % | 209.48 | 58,805 | 0.16 % | 184.96 | 7,267 | 0.14 % | 169.26
prihodnje | pr | ihodnje | 203,631 | 0.16 % | 179.46 | 0 | 0 % | 0 | 614 | 0.02 % | 15.46 | 158 | 0.03 % | 39.94 | 120,469 | 0.19 % | 221.97 | 19,044 | 0.09 % | 101.61 | 61,773 | 0.17 % | 194.29 | 1,573 | 0.03 % | 36.64
slovenskega | sl | ovenskega | 202,094 | 0.15 % | 178.10 | 0 | 0 % | 0 | 398 | 0.01 % | 10.02 | 407 | 0.09 % | 102.89 | 112,062 | 0.18 % | 206.48 | 22,466 | 0.10 % | 119.87 | 62,505 | 0.17 % | 196.59 | 4,256 | 0.08 % | 99.13
dobro | do | bro | 201,264 | 0.15 % | 177.37 | 0 | 0 % | 0 | 6,966 | 0.23 % | 175.40 | 343 | 0.07 % | 86.71 | 98,094 | 0.15 % | 180.74 | 39,696 | 0.18 % | 211.80 | 50,431 | 0.14 % | 158.62 | 5,734 | 0.11 % | 133.55
velike | ve | like | 199,502 | 0.15 % | 175.82 | 4 | 0.47 % | 412.07 | 4,824 | 0.16 % | 121.46 | 674 | 0.14 % | 170.38 | 95,438 | 0.15 % | 175.85 | 36,157 | 0.17 % | 192.92 | 52,627 | 0.14 % | 165.53 | 9,778 | 0.18 % | 227.74
nov | no | v | 192,972 | 0.15 % | 170.07 | 0 | 0 % | 0 | 3,307 | 0.11 % | 83.27 | 567 | 0.12 % | 143.33 | 87,154 | 0.14 % | 160.59 | 34,997 | 0.16 % | 186.73 | 62,028 | 0.17 % | 195.09 | 4,919 | 0.09 % | 114.57
državni | dr | žavni | 190,454 | 0.15 % | 167.85 | 0 | 0 % | 0 | 679 | 0.02 % | 17.10 | 2,076 | 0.43 % | 524.79 | 111,848 | 0.18 % | 206.09 | 14,035 | 0.07 % | 74.89 | 59,642 | 0.16 % | 187.59 | 2,174 | 0.04 % | 50.64
najboljši | na | jboljši | 182,232 | 0.14 % | 160.60 | 0 | 0 % | 0 | 2,304 | 0.08 % | 58.01 | 142 | 0.03 % | 35.90 | 90,767 | 0.14 % | 167.24 | 23,701 | 0.11 % | 126.46 | 63,025 | 0.17 % | 198.23 | 2,293 | 0.04 % | 53.41

1.2.73 List of initial character-level 3-grams from adjective lower-case word forms in the Gigafida 2.0 corpus with text-type distribution

File at CLARIN.SI: GF2.0-word_parts-adjectives-lowercase_forms-initial-3grams-taxonomy-entire.tsv

Lower-case word form | Initial part of the word | Rest of the word | Total absolute frequency of lower-case word form | Percentage of all found lower-case word forms | Total relative frequency (per million occurrences) | Absolute frequency [SSJ.T.K.N] | Percentage [SSJ.T.K.N] | Relative frequency [SSJ.T.K.N] | Absolute frequency [SSJ.T.K.L] | Percentage [SSJ.T.K.L] | Relative frequency [SSJ.T.K.L] | Absolute frequency [SSJ.T.D] | Percentage [SSJ.T.D] | Relative frequency [SSJ.T.D] | Absolute frequency [SSJ.T.P.C] | Percentage [SSJ.T.P.C] | Relative frequency [SSJ.T.P.C] | Absolute frequency [SSJ.T.P.R] | Percentage [SSJ.T.P.R] | Relative frequency [SSJ.T.P.R] | Absolute frequency [SSJ.I] | Percentage [SSJ.I] | Relative frequency [SSJ.I] | Absolute frequency [SSJ.T.K.S] | Percentage [SSJ.T.K.S] | Relative frequency [SSJ.T.K.S]
mogoče | mog | oče | 502,447 | 0.39 % | 442.80 | 0 | 0 % | 0 | 23,836 | 0.80 % | 600.16 | 2,362 | 0.49 % | 597.09 | 243,451 | 0.38 % | 448.57 | 90,333 | 0.42 % | 481.99 | 116,502 | 0.32 % | 366.43 | 25,963 | 0.49 % | 604.71
novo | nov | o | 475,332 | 0.36 % | 418.91 | 1 | 0.12 % | 103.02 | 7,539 | 0.25 % | 189.82 | 788 | 0.17 % | 199.20 | 258,248 | 0.41 % | 475.84 | 75,103 | 0.35 % | 400.72 | 121,642 | 0.33 % | 382.60 | 12,011 | 0.23 % | 279.75
slovenski | slo | venski | 446,624 | 0.34 % | 393.61 | 0 | 0 % | 0 | 678 | 0.02 % | 17.07 | 576 | 0.12 % | 145.61 | 240,556 | 0.38 % | 443.24 | 47,459 | 0.22 % | 253.23 | 151,452 | 0.42 % | 476.36 | 5,903 | 0.11 % | 137.49
sam | sam | | 404,454 | 0.31 % | 356.44 | 0 | 0 % | 0 | 28,032 | 0.94 % | 705.82 | 1,505 | 0.32 % | 380.45 | 182,788 | 0.29 % | 336.80 | 76,278 | 0.35 % | 406.99 | 99,109 | 0.27 % | 311.72 | 16,742 | 0.32 % | 389.94
nove | nov | e | 383,174 | 0.29 % | 337.69 | 0 | 0 % | 0 | 4,160 | 0.14 % | 104.74 | 806 | 0.17 % | 203.75 | 197,154 | 0.31 % | 363.27 | 63,139 | 0.29 % | 336.89 | 105,758 | 0.29 % | 332.64 | 12,157 | 0.23 % | 283.15
zadnjih | zad | njih | 374,079 | 0.29 % | 329.67 | 0 | 0 % | 0 | 3,656 | 0.12 % | 92.05 | 371 | 0.08 % | 93.78 | 188,954 | 0.30 % | 348.16 | 40,539 | 0.19 % | 216.30 | 135,861 | 0.37 % | 427.32 | 4,698 | 0.09 % | 109.42
slovenske | slo | venske | 353,715 | 0.27 % | 311.73 | 0 | 0 % | 0 | 506 | 0.02 % | 12.74 | 592 | 0.12 % | 149.65 | 206,966 | 0.33 % | 381.35 | 37,129 | 0.17 % | 198.11 | 103,866 | 0.28 % | 326.69 | 4,656 | 0.09 % | 108.44
novi | nov | i | 329,121 | 0.25 % | 290.05 | 0 | 0 % | 0 | 3,532 | 0.12 % | 88.93 | 561 | 0.12 % | 141.81 | 173,329 | 0.27 % | 319.37 | 56,367 | 0.26 % | 300.76 | 89,190 | 0.24 % | 280.53 | 6,142 | 0.12 % | 143.05
slovenskih | slo | venskih | 294,271 | 0.23 % | 259.34 | 0 | 0 % | 0 | 442 | 0.01 % | 11.13 | 399 | 0.08 % | 100.86 | 174,340 | 0.28 % | 321.23 | 34,387 | 0.16 % | 183.48 | 80,914 | 0.22 % | 254.50 | 3,789 | 0.07 % | 88.25
novega | nov | ega | 281,250 | 0.22 % | 247.86 | 0 | 0 % | 0 | 3,744 | 0.12 % | 94.27 | 446 | 0.09 % | 112.74 | 149,906 | 0.24 % | 276.21 | 43,885 | 0.20 % | 234.16 | 77,637 | 0.21 % | 244.19 | 5,632 | 0.11 % | 131.18
različnih | raz | ličnih | 266,466 | 0.20 % | 234.84 | 2 | 0.23 % | 206.04 | 2,438 | 0.08 % | 61.39 | 678 | 0.14 % | 171.39 | 118,184 | 0.19 % | 217.76 | 56,333 | 0.26 % | 300.57 | 68,940 | 0.19 % | 216.83 | 19,891 | 0.38 % | 463.29
novih | nov | ih | 265,857 | 0.20 % | 234.30 | 0 | 0 % | 0 | 2,128 | 0.07 % | 53.58 | 655 | 0.14 % | 165.58 | 137,591 | 0.22 % | 253.52 | 42,406 | 0.20 % | 226.26 | 74,578 | 0.20 % | 234.57 | 8,499 | 0.16 % | 197.95
nova | nov | a | 254,400 | 0.20 % | 224.20 | 0 | 0 % | 0 | 2,819 | 0.09 % | 70.98 | 635 | 0.13 % | 160.52 | 128,227 | 0.20 % | 236.27 | 41,185 | 0.19 % | 219.75 | 75,332 | 0.21 % | 236.94 | 6,202 | 0.12 % | 144.45
veliko | vel | iko | 252,295 | 0.19 % | 222.35 | 0 | 0 % | 0 | 7,380 | 0.25 % | 185.82 | 564 | 0.12 % | 142.57 | 121,628 | 0.19 % | 224.11 | 44,659 | 0.21 % | 238.29 | 68,608 | 0.19 % | 215.79 | 9,456 | 0.18 % | 220.24
zadnji | zad | nji | 251,253 | 0.19 % | 221.43 | 1 | 0.12 % | 103.02 | 6,258 | 0.21 % | 157.57 | 400 | 0.08 % | 101.12 | 128,961 | 0.20 % | 237.62 | 35,235 | 0.16 % | 188 | 75,833 | 0.21 % | 238.51 | 4,565 | 0.09 % | 106.32
sami | sam | i | 231,519 | 0.18 % | 204.04 | 1 | 0.12 % | 103.02 | 5,772 | 0.19 % | 145.33 | 753 | 0.16 % | 190.35 | 111,377 | 0.18 % | 205.22 | 49,240 | 0.23 % | 262.73 | 52,561 | 0.14 % | 165.32 | 11,815 | 0.22 % | 275.19
velika | vel | ika | 224,674 | 0.17 % | 198 | 0 | 0 % | 0 | 5,726 | 0.19 % | 144.17 | 592 | 0.12 % | 149.65 | 104,684 | 0.17 % | 192.89 | 42,115 | 0.19 % | 224.71 | 62,856 | 0.17 % | 197.70 | 8,701 | 0.17 % | 202.66
pomembno | pom | embno | 223,934 | 0.17 % | 197.35 | 0 | 0 % | 0 | 4,575 | 0.15 % | 115.19 | 439 | 0.09 % | 110.97 | 99,670 | 0.16 % | 183.65 | 43,834 | 0.20 % | 233.88 | 62,790 | 0.17 % | 197.49 | 12,626 | 0.24 % | 294.07
letni | let | ni | 222,751 | 0.17 % | 196.31 | 0 | 0 % | 0 | 309 | 0.01 % | 7.78 | 182 | 0.04 % | 46.01 | 114,022 | 0.18 % | 210.09 | 12,989 | 0.06 % | 69.31 | 94,228 | 0.26 % | 296.37 | 1,021 | 0.02 % | 23.78
slovenska | slo | venska | 220,229 | 0.17 % | 194.09 | 0 | 0 % | 0 | 268 | 0.01 % | 6.75 | 407 | 0.09 % | 102.89 | 118,994 | 0.19 % | 219.25 | 22,935 | 0.11 % | 122.37 | 74,627 | 0.20 % | 234.72 | 2,998 | 0.06 % | 69.83
veliki | vel | iki | 216,843 | 0.17 % | 191.10 | 0 | 0 % | 0 | 5,506 | 0.18 % | 138.64 | 648 | 0.14 % | 163.81 | 103,011 | 0.16 % | 189.80 | 37,422 | 0.17 % | 199.67 | 60,010 | 0.17 % | 188.75 | 10,246 | 0.19 % | 238.64
zadnjem | zad | njem | 213,519 | 0.16 % | 188.17 | 1 | 0.12 % | 103.02 | 2,919 | 0.10 % | 73.50 | 290 | 0.06 % | 73.31 | 110,962 | 0.17 % | 204.45 | 28,044 | 0.13 % | 149.63 | 68,052 | 0.19 % | 214.04 | 3,251 | 0.06 % | 75.72
sama | sam | a | 209,273 | 0.16 % | 184.43 | 0 | 0 % | 0 | 17,968 | 0.60 % | 452.41 | 729 | 0.15 % | 184.28 | 84,772 | 0.13 % | 156.20 | 47,535 | 0.22 % | 253.63 | 50,301 | 0.14 % | 158.21 | 7,968 | 0.15 % | 185.58
evropske | evr | opske | 208,721 | 0.16 % | 183.94 | 0 | 0 % | 0 | 195 | 0.01 % | 4.91 | 447 | 0.09 % | 113 | 106,056 | 0.17 % | 195.42 | 17,905 | 0.08 % | 95.54 | 80,530 | 0.22 % | 253.29 | 3,588 | 0.07 % | 83.57
velik | vel | ik | 205,754 | 0.16 % | 181.33 | 0 | 0 % | 0 | 5,789 | 0.19 % | 145.76 | 372 | 0.08 % | 94.04 | 94,260 | 0.15 % | 173.68 | 39,261 | 0.18 % | 209.48 | 58,805 | 0.16 % | 184.96 | 7,267 | 0.14 % | 169.26
prihodnje | pri | hodnje | 203,631 | 0.16 % | 179.46 | 0 | 0 % | 0 | 614 | 0.02 % | 15.46 | 158 | 0.03 % | 39.94 | 120,469 | 0.19 % | 221.97 | 19,044 | 0.09 % | 101.61 | 61,773 | 0.17 % | 194.29 | 1,573 | 0.03 % | 36.64
slovenskega | slo | venskega | 202,094 | 0.15 % | 178.10 | 0 | 0 % | 0 | 398 | 0.01 % | 10.02 | 407 | 0.09 % | 102.89 | 112,062 | 0.18 % | 206.48 | 22,466 | 0.10 % | 119.87 | 62,505 | 0.17 % | 196.59 | 4,256 | 0.08 % | 99.13
dobro | dob | ro | 201,264 | 0.15 % | 177.37 | 0 | 0 % | 0 | 6,966 | 0.23 % | 175.40 | 343 | 0.07 % | 86.71 | 98,094 | 0.15 % | 180.74 | 39,696 | 0.18 % | 211.80 | 50,431 | 0.14 % | 158.62 | 5,734 | 0.11 % | 133.55
velike | vel | ike | 199,502 | 0.15 % | 175.82 | 4 | 0.47 % | 412.07 | 4,824 | 0.16 % | 121.46 | 674 | 0.14 % | 170.38 | 95,438 | 0.15 % | 175.85 | 36,157 | 0.17 % | 192.92 | 52,627 | 0.14 % | 165.53 | 9,778 | 0.18 % | 227.74
nov | nov | | 192,972 | 0.15 % | 170.07 | 0 | 0 % | 0 | 3,307 | 0.11 % | 83.27 | 567 | 0.12 % | 143.33 | 87,154 | 0.14 % | 160.59 | 34,997 | 0.16 % | 186.73 | 62,028 | 0.17 % | 195.09 | 4,919 | 0.09 % | 114.57
državni | drž | avni | 190,454 | 0.15 % | 167.85 | 0 | 0 % | 0 | 679 | 0.02 % | 17.10 | 2,076 | 0.43 % | 524.79 | 111,848 | 0.18 % | 206.09 | 14,035 | 0.07 % | 74.89 | 59,642 | 0.16 % | 187.59 | 2,174 | 0.04 % | 50.64
najboljši | naj | boljši | 182,232 | 0.14 % | 160.60 | 0 | 0 % | 0 | 2,304 | 0.08 % | 58.01 | 142 | 0.03 % | 35.90 | 90,767 | 0.14 % | 167.24 | 23,701 | 0.11 % | 126.46 | 63,025 | 0.17 % | 198.23 | 2,293 | 0.04 % | 53.41

1.2.74 List of initial character-level 4-grams from adjective lower-case word forms in the Gigafida 2.0 corpus with text-type distribution

File at CLARIN.SI: GF2.0-word_parts-adjectives-lowercase_forms-initial-4grams-taxonomy-entire.tsv

Lower-case word form | Initial part of the word | Rest of the word | Total absolute frequency of lower-case word form | Percentage of all found lower-case word forms | Total relative frequency (per million occurrences) | Absolute frequency [SSJ.T.K.N] | Percentage [SSJ.T.K.N] | Relative frequency [SSJ.T.K.N] | Absolute frequency [SSJ.T.K.L] | Percentage [SSJ.T.K.L] | Relative frequency [SSJ.T.K.L] | Absolute frequency [SSJ.T.D] | Percentage [SSJ.T.D] | Relative frequency [SSJ.T.D] | Absolute frequency [SSJ.T.P.C] | Percentage [SSJ.T.P.C] | Relative frequency [SSJ.T.P.C] | Absolute frequency [SSJ.T.P.R] | Percentage [SSJ.T.P.R] | Relative frequency [SSJ.T.P.R] | Absolute frequency [SSJ.I] | Percentage [SSJ.I] | Relative frequency [SSJ.I] | Absolute frequency [SSJ.T.K.S] | Percentage [SSJ.T.K.S] | Relative frequency [SSJ.T.K.S]
mogoče | mogo | če | 502,447 | 0.39 % | 442.80 | 0 | 0 % | 0 | 23,836 | 0.81 % | 600.16 | 2,362 | 0.50 % | 597.09 | 243,451 | 0.39 % | 448.57 | 90,333 | 0.42 % | 481.99 | 116,502 | 0.32 % | 366.43 | 25,963 | 0.49 % | 604.71
novo | novo | | 475,332 | 0.37 % | 418.91 | 1 | 0.12 % | 103.02 | 7,539 | 0.26 % | 189.82 | 788 | 0.17 % | 199.20 | 258,248 | 0.41 % | 475.84 | 75,103 | 0.35 % | 400.72 | 121,642 | 0.34 % | 382.60 | 12,011 | 0.23 % | 279.75
slovenski | slov | enski | 446,624 | 0.34 % | 393.61 | 0 | 0 % | 0 | 678 | 0.02 % | 17.07 | 576 | 0.12 % | 145.61 | 240,556 | 0.38 % | 443.24 | 47,459 | 0.22 % | 253.23 | 151,452 | 0.42 % | 476.36 | 5,903 | 0.11 % | 137.49
nove | nove | | 383,174 | 0.30 % | 337.69 | 0 | 0 % | 0 | 4,160 | 0.14 % | 104.74 | 806 | 0.17 % | 203.75 | 197,154 | 0.31 % | 363.27 | 63,139 | 0.29 % | 336.89 | 105,758 | 0.29 % | 332.64 | 12,157 | 0.23 % | 283.15
zadnjih | zadn | jih | 374,079 | 0.29 % | 329.67 | 0 | 0 % | 0 | 3,656 | 0.12 % | 92.05 | 371 | 0.08 % | 93.78 | 188,954 | 0.30 % | 348.16 | 40,539 | 0.19 % | 216.30 | 135,861 | 0.38 % | 427.32 | 4,698 | 0.09 % | 109.42
slovenske | slov | enske | 353,715 | 0.27 % | 311.73 | 0 | 0 % | 0 | 506 | 0.02 % | 12.74 | 592 | 0.12 % | 149.65 | 206,966 | 0.33 % | 381.35 | 37,129 | 0.17 % | 198.11 | 103,866 | 0.29 % | 326.69 | 4,656 | 0.09 % | 108.44
novi | novi | | 329,121 | 0.25 % | 290.05 | 0 | 0 % | 0 | 3,532 | 0.12 % | 88.93 | 561 | 0.12 % | 141.81 | 173,329 | 0.28 % | 319.37 | 56,367 | 0.26 % | 300.76 | 89,190 | 0.25 % | 280.53 | 6,142 | 0.12 % | 143.05
slovenskih | slov | enskih | 294,271 | 0.23 % | 259.34 | 0 | 0 % | 0 | 442 | 0.01 % | 11.13 | 399 | 0.08 % | 100.86 | 174,340 | 0.28 % | 321.23 | 34,387 | 0.16 % | 183.48 | 80,914 | 0.22 % | 254.50 | 3,789 | 0.07 % | 88.25
novega | nove | ga | 281,250 | 0.22 % | 247.86
0 | 0 % | 0 | 3,744 | 0.13 % | 94.27 | 446 | 0.09 % | 112.74 | 149,906 | 0.24 % | 276.21 | 43,885 | 0.20 % | 234.16 | 77,637 | 0.21 % | 244.19 | 5,632 | 0.11 % | 131.18
različnih | razl | ičnih | 266,466 | 0.21 % | 234.84 | 2 | 0.23 % | 206.04 | 2,438 | 0.08 % | 61.39 | 678 | 0.14 % | 171.39 | 118,184 | 0.19 % | 217.76 | 56,333 | 0.26 % | 300.57 | 68,940 | 0.19 % | 216.83 | 19,891 | 0.38 % | 463.29
novih | novi | h | 265,857 | 0.20 % | 234.30 | 0 | 0 % | 0 | 2,128 | 0.07 % | 53.58 | 655 | 0.14 % | 165.58 | 137,591 | 0.22 % | 253.52 | 42,406 | 0.20 % | 226.26 | 74,578 | 0.21 % | 234.57 | 8,499 | 0.16 % | 197.95
nova | nova | | 254,400 | 0.20 % | 224.20 | 0 | 0 % | 0 | 2,819 | 0.10 % | 70.98 | 635 | 0.13 % | 160.52 | 128,227 | 0.20 % | 236.27 | 41,185 | 0.19 % | 219.75 | 75,332 | 0.21 % | 236.94 | 6,202 | 0.12 % | 144.45
veliko | veli | ko | 252,295 | 0.20 % | 222.35 | 0 | 0 % | 0 | 7,380 | 0.25 % | 185.82 | 564 | 0.12 % | 142.57 | 121,628 | 0.19 % | 224.11 | 44,659 | 0.21 % | 238.29 | 68,608 | 0.19 % | 215.79 | 9,456 | 0.18 % | 220.24
zadnji | zadn | ji | 251,253 | 0.19 % | 221.43 | 1 | 0.12 % | 103.02 | 6,258 | 0.21 % | 157.57 | 400 | 0.08 % | 101.12 | 128,961 | 0.20 % | 237.62 | 35,235 | 0.16 % | 188 | 75,833 | 0.21 % | 238.51 | 4,565 | 0.09 % | 106.32
sami | sami | | 231,519 | 0.18 % | 204.04 | 1 | 0.12 % | 103.02 | 5,772 | 0.20 % | 145.33 | 753 | 0.16 % | 190.35 | 111,377 | 0.18 % | 205.22 | 49,240 | 0.23 % | 262.73 | 52,561 | 0.14 % | 165.32 | 11,815 | 0.23 % | 275.19
velika | veli | ka | 224,674 | 0.17 % | 198 | 0 | 0 % | 0 | 5,726 | 0.20 % | 144.17 | 592 | 0.12 % | 149.65 | 104,684 | 0.17 % | 192.89 | 42,115 | 0.20 % | 224.71 | 62,856 | 0.17 % | 197.70 | 8,701 | 0.17 % | 202.66
pomembno | pome | mbno | 223,934 | 0.17 % | 197.35 | 0 | 0 % | 0 | 4,575 | 0.15 % | 115.19 | 439 | 0.09 % | 110.97 | 99,670 | 0.16 % | 183.65 | 43,834 | 0.20 % | 233.88 | 62,790 | 0.17 % | 197.49 | 12,626 | 0.24 % | 294.07
letni | letn | i | 222,751 | 0.17 % | 196.31 | 0 | 0 % | 0 | 309 | 0.01 % | 7.78 | 182 | 0.04 % | 46.01 | 114,022 | 0.18 % | 210.09 | 12,989 | 0.06 % | 69.31 | 94,228 | 0.26 % | 296.37 | 1,021 | 0.02 % | 23.78
slovenska | slov | enska | 220,229 | 0.17 % | 194.09 | 0 | 0 % | 0 | 268 | 0.01 % | 6.75 | 407 | 0.09 % | 102.89 | 118,994 | 0.19 % | 219.25 | 22,935 | 0.11 % | 122.37 | 74,627 | 0.21 % | 234.72 | 2,998 | 0.06 % | 69.83
veliki | veli | ki | 216,843 | 0.17 % | 191.10 | 0 | 0 % | 0 | 5,506 | 0.19 % | 138.64 | 648 | 0.14 % | 163.81 | 103,011 | 0.16 % | 189.80 | 37,422 | 0.17 % | 199.67 | 60,010 | 0.17 % | 188.75 | 10,246 | 0.20 % | 238.64
zadnjem | zadn | jem | 213,519 | 0.17 % | 188.17 | 1 | 0.12 % | 103.02 | 2,919 | 0.10 % | 73.50 | 290 | 0.06 % | 73.31 | 110,962 | 0.18 % | 204.45 | 28,044 | 0.13 % | 149.63 | 68,052 | 0.19 % | 214.04 | 3,251 | 0.06 % | 75.72
sama | sama | | 209,273 | 0.16 % | 184.43 | 0 | 0 % | 0 | 17,968 | 0.61 % | 452.41 | 729 | 0.15 % | 184.28 | 84,772 | 0.13 % | 156.20 | 47,535 | 0.22 % | 253.63 | 50,301 | 0.14 % | 158.21 | 7,968 | 0.15 % | 185.58
evropske | evro | pske | 208,721 | 0.16 % | 183.94 | 0 | 0 % | 0 | 195 | 0.01 % | 4.91 | 447 | 0.09 % | 113 | 106,056 | 0.17 % | 195.42 | 17,905 | 0.08 % | 95.54 | 80,530 | 0.22 % | 253.29 | 3,588 | 0.07 % | 83.57
velik | veli | k | 205,754 | 0.16 % | 181.33 | 0 | 0 % | 0 | 5,789 | 0.20 % | 145.76 | 372 | 0.08 % | 94.04 | 94,260 | 0.15 % | 173.68 | 39,261 | 0.18 % | 209.48 | 58,805 | 0.16 % | 184.96 | 7,267 | 0.14 % | 169.26
prihodnje | prih | odnje | 203,631 | 0.16 % | 179.46 | 0 | 0 % | 0 | 614 | 0.02 % | 15.46 | 158 | 0.03 % | 39.94 | 120,469 | 0.19 % | 221.97 | 19,044 | 0.09 % | 101.61 | 61,773 | 0.17 % | 194.29 | 1,573 | 0.03 % | 36.64
slovenskega | slov | enskega | 202,094 | 0.16 % | 178.10 | 0 | 0 % | 0 | 398 | 0.01 % | 10.02 | 407 | 0.09 % | 102.89 | 112,062 | 0.18 % | 206.48 | 22,466 | 0.10 % | 119.87 | 62,505 | 0.17 % | 196.59 | 4,256 | 0.08 % | 99.13
dobro | dobr | o | 201,264 | 0.16 % | 177.37 | 0 | 0 % | 0 | 6,966 | 0.24 % | 175.40 | 343 | 0.07 % | 86.71 | 98,094 | 0.16 % | 180.74 | 39,696 | 0.18 % | 211.80 | 50,431 | 0.14 % | 158.62 | 5,734 | 0.11 % | 133.55
velike | veli | ke | 199,502 | 0.15 % | 175.82 | 4 | 0.47 % | 412.07 | 4,824 | 0.16 % | 121.46 | 674 | 0.14 % | 170.38 | 95,438 | 0.15 % | 175.85 | 36,157 | 0.17 % | 192.92 | 52,627 | 0.14 % | 165.53 | 9,778 | 0.19 % | 227.74
državni | drža | vni | 190,454 | 0.15 % | 167.85 | 0 | 0 % | 0 | 679 | 0.02 % | 17.10 | 2,076 | 0.44 % | 524.79 | 111,848 | 0.18 % | 206.09 | 14,035 | 0.07 % | 74.89 | 59,642 | 0.17 % | 187.59 | 2,174 | 0.04 % | 50.64
najboljši | najb | oljši | 182,232 | 0.14 % | 160.60 | 0 | 0 % | 0 | 2,304 | 0.08 % | 58.01 | 142 | 0.03 % | 35.90 | 90,767 | 0.14 % | 167.24 | 23,701 | 0.11 % | 126.46 | 63,025 | 0.17 % | 198.23 | 2,293 | 0.04 % | 53.41
jasno | jasn | o | 178,267 | 0.14 % | 157.11 | 1 | 0.12 % | 103.02 | 8,873 | 0.30 % | 223.41 | 581 | 0.12 % | 146.87 | 82,213 | 0.13 % | 151.48 | 29,396 | 0.14 % | 156.85 | 52,757 | 0.15 % | 165.93 | 4,446 | 0.09 % | 103.55
slovensko | slov | ensko | 172,365 | 0.13 % | 151.90 | 0 | 0 % | 0 | 362 | 0.01 % | 9.11 | 270 | 0.06 % | 68.25 | 96,938 | 0.15 % | 178.61 | 20,514 | 0.10 % | 109.46 | 51,003 | 0.14 % | 160.42 | 3,278 | 0.06 % | 76.35

1.2.75 List of initial character-level 5-grams from adjective lower-case word forms in the Gigafida 2.0 corpus with text-type distribution

File at CLARIN.SI: GF2.0-word_parts-adjectives-lowercase_forms-initial-5grams-taxonomy-entire.tsv

Lower-case word form | Initial part of the word | Rest of the word | Total absolute frequency of lower-case word form | Percentage of all found lower-case word forms | Total relative frequency (per million occurrences) | Absolute frequency [SSJ.T.K.N] | Percentage [SSJ.T.K.N] | Relative frequency [SSJ.T.K.N] | Absolute frequency [SSJ.T.K.L] | Percentage [SSJ.T.K.L] | Relative frequency [SSJ.T.K.L] | Absolute frequency [SSJ.T.D] | Percentage [SSJ.T.D] | Relative frequency [SSJ.T.D] | Absolute frequency [SSJ.T.P.C] | Percentage [SSJ.T.P.C] | Relative frequency [SSJ.T.P.C] | Absolute frequency [SSJ.T.P.R] | Percentage [SSJ.T.P.R] | Relative frequency [SSJ.T.P.R] | Absolute frequency [SSJ.I] | Percentage [SSJ.I] | Relative frequency [SSJ.I] | Absolute frequency [SSJ.T.K.S] | Percentage [SSJ.T.K.S] | Relative frequency [SSJ.T.K.S]
mogoče | mogoč | e | 502,447 | 0.40 % | 442.80 | 0 | 0 % | 0 | 23,836 | 0.86 % | 600.16 | 2,362 | 0.51 % | 597.09 | 243,451 | 0.40 % | 448.57 | 90,333 | 0.44 % | 481.99 | 116,502 | 0.33 % | 366.43 | 25,963 | 0.51 % | 604.71
slovenski | slove | nski | 446,624 | 0.36 % | 393.61 | 0 | 0 % | 0 | 678 | 0.02 % | 17.07 | 576 | 0.12 % | 145.61 | 240,556 | 0.39 % | 443.24 | 47,459 | 0.23 % | 253.23 | 151,452 | 0.43 % | 476.36 | 5,903 | 0.12 % | 137.49
zadnjih | zadnj | ih | 374,079 | 0.30 % | 329.67 | 0 | 0 % | 0 | 3,656 | 0.13 % | 92.05 | 371 | 0.08 % | 93.78 | 188,954 | 0.31 % | 348.16 | 40,539 | 0.20 % | 216.30 | 135,861 | 0.39 % | 427.32 | 4,698 | 0.09 % | 109.42
slovenske | slove | nske | 353,715 | 0.28 % | 311.73 | 0 | 0 % | 0 | 506 | 0.02 % | 12.74 | 592 | 0.13 % | 149.65 | 206,966 | 0.34 % | 381.35 | 37,129 | 0.18 % | 198.11 | 103,866 | 0.29 % | 326.69 | 4,656 | 0.09 % | 108.44
slovenskih | slove | nskih | 294,271 | 0.23 % | 259.34 | 0 | 0 % | 0 | 442 | 0.02 % | 11.13 | 399 | 0.09 % | 100.86 | 174,340 | 0.28 % | 321.23 | 34,387 | 0.17 % | 183.48 | 80,914 | 0.23 % | 254.50 | 3,789 | 0.07 % | 88.25
novega | noveg | a | 281,250 | 0.22 % | 247.86 | 0 | 0 % | 0 | 3,744
0.14 % | 94.27 | 446 | 0.10 % | 112.74 | 149,906 | 0.24 % | 276.21 | 43,885 | 0.21 % | 234.16 | 77,637 | 0.22 % | 244.19 | 5,632 | 0.11 % | 131.18
različnih | razli | čnih | 266,466 | 0.21 % | 234.84 | 2 | 0.24 % | 206.04 | 2,438 | 0.09 % | 61.39 | 678 | 0.15 % | 171.39 | 118,184 | 0.19 % | 217.76 | 56,333 | 0.27 % | 300.57 | 68,940 | 0.20 % | 216.83 | 19,891 | 0.39 % | 463.29
novih | novih | | 265,857 | 0.21 % | 234.30 | 0 | 0 % | 0 | 2,128 | 0.08 % | 53.58 | 655 | 0.14 % | 165.58 | 137,591 | 0.23 % | 253.52 | 42,406 | 0.20 % | 226.26 | 74,578 | 0.21 % | 234.57 | 8,499 | 0.17 % | 197.95
veliko | velik | o | 252,295 | 0.20 % | 222.35 | 0 | 0 % | 0 | 7,380 | 0.27 % | 185.82 | 564 | 0.12 % | 142.57 | 121,628 | 0.20 % | 224.11 | 44,659 | 0.22 % | 238.29 | 68,608 | 0.20 % | 215.79 | 9,456 | 0.19 % | 220.24
zadnji | zadnj | i | 251,253 | 0.20 % | 221.43 | 1 | 0.12 % | 103.02 | 6,258 | 0.23 % | 157.57 | 400 | 0.09 % | 101.12 | 128,961 | 0.21 % | 237.62 | 35,235 | 0.17 % | 188 | 75,833 | 0.21 % | 238.51 | 4,565 | 0.09 % | 106.32
velika | velik | a | 224,674 | 0.18 % | 198 | 0 | 0 % | 0 | 5,726 | 0.21 % | 144.17 | 592 | 0.13 % | 149.65 | 104,684 | 0.17 % | 192.89 | 42,115 | 0.20 % | 224.71 | 62,856 | 0.18 % | 197.70 | 8,701 | 0.17 % | 202.66
pomembno | pomem | bno | 223,934 | 0.18 % | 197.35 | 0 | 0 % | 0 | 4,575 | 0.17 % | 115.19 | 439 | 0.10 % | 110.97 | 99,670 | 0.16 % | 183.65 | 43,834 | 0.21 % | 233.88 | 62,790 | 0.18 % | 197.49 | 12,626 | 0.25 % | 294.07
letni | letni | | 222,751 | 0.18 % | 196.31 | 0 | 0 % | 0 | 309 | 0.01 % | 7.78 | 182 | 0.04 % | 46.01 | 114,022 | 0.19 % | 210.09 | 12,989 | 0.06 % | 69.31 | 94,228 | 0.27 % | 296.37 | 1,021 | 0.02 % | 23.78
slovenska | slove | nska | 220,229 | 0.18 % | 194.09 | 0 | 0 % | 0 | 268 | 0.01 % | 6.75 | 407 | 0.09 % | 102.89 | 118,994 | 0.20 % | 219.25 | 22,935 | 0.11 % | 122.37 | 74,627 | 0.21 % | 234.72 | 2,998 | 0.06 % | 69.83
veliki | velik | i | 216,843 | 0.17 % | 191.10 | 0 | 0 % | 0 | 5,506 | 0.20 % | 138.64 | 648 | 0.14 % | 163.81 | 103,011 | 0.17 % | 189.80 | 37,422 | 0.18 % | 199.67 | 60,010 | 0.17 % | 188.75 | 10,246 | 0.20 % | 238.64
zadnjem | zadnj | em | 213,519 | 0.17 % | 188.17 | 1 | 0.12 % | 103.02 | 2,919 | 0.10 % | 73.50 | 290 | 0.06 % | 73.31 | 110,962 | 0.18 % | 204.45 | 28,044 | 0.14 % | 149.63 | 68,052 | 0.19 % | 214.04 | 3,251 | 0.06 % | 75.72
evropske | evrop | ske | 208,721 | 0.17 % | 183.94 | 0 | 0 % | 0 | 195 | 0.01 % | 4.91 | 447 | 0.10 % | 113 | 106,056 | 0.17 % | 195.42 | 17,905 | 0.09 % | 95.54 | 80,530 | 0.23 % | 253.29 | 3,588 | 0.07 % | 83.57
velik | velik | | 205,754 | 0.16 % | 181.33 | 0 | 0 % | 0 | 5,789 | 0.21 % | 145.76 | 372 | 0.08 % | 94.04 | 94,260 | 0.15 % | 173.68 | 39,261 | 0.19 % | 209.48 | 58,805 | 0.17 % | 184.96 | 7,267 | 0.14 % | 169.26
prihodnje | priho | dnje | 203,631 | 0.16 % | 179.46 | 0 | 0 % | 0 | 614 | 0.02 % | 15.46 | 158 | 0.03 % | 39.94 | 120,469 | 0.20 % | 221.97 | 19,044 | 0.09 % | 101.61 | 61,773 | 0.18 % | 194.29 | 1,573 | 0.03 % | 36.64
slovenskega | slove | nskega | 202,094 | 0.16 % | 178.10 | 0 | 0 % | 0 | 398 | 0.01 % | 10.02 | 407 | 0.09 % | 102.89 | 112,062 | 0.18 % | 206.48 | 22,466 | 0.11 % | 119.87 | 62,505 | 0.18 % | 196.59 | 4,256 | 0.08 % | 99.13
dobro | dobro | | 201,264 | 0.16 % | 177.37 | 0 | 0 % | 0 | 6,966 | 0.25 % | 175.40 | 343 | 0.07 % | 86.71 | 98,094 | 0.16 % | 180.74 | 39,696 | 0.19 % | 211.80 | 50,431 | 0.14 % | 158.62 | 5,734 | 0.11 % | 133.55
velike | velik | e | 199,502 | 0.16 % | 175.82 | 4 | 0.48 % | 412.07 | 4,824 | 0.17 % | 121.46 | 674 | 0.15 % | 170.38 | 95,438 | 0.16 % | 175.85 | 36,157 | 0.17 % | 192.92 | 52,627 | 0.15 % | 165.53 | 9,778 | 0.19 % | 227.74
državni | držav | ni | 190,454 | 0.15 % | 167.85 | 0 | 0 % | 0 | 679 | 0.02 % | 17.10 | 2,076 | 0.45 % | 524.79 | 111,848 | 0.18 % | 206.09 | 14,035 | 0.07 % | 74.89 | 59,642 | 0.17 % | 187.59 | 2,174 | 0.04 % | 50.64
najboljši | najbo | ljši | 182,232 | 0.14 % | 160.60 | 0 | 0 % | 0 | 2,304 | 0.08 % | 58.01 | 142 | 0.03 % | 35.90 | 90,767 | 0.15 % | 167.24 | 23,701 | 0.11 % | 126.46 | 63,025 | 0.18 % | 198.23 | 2,293 | 0.04 % | 53.41
jasno | jasno | | 178,267 | 0.14 % | 157.11 | 1 | 0.12 % | 103.02 | 8,873 | 0.32 % | 223.41 | 581 | 0.13 % | 146.87 | 82,213 | 0.14 % | 151.48 | 29,396 | 0.14 % | 156.85 | 52,757 | 0.15 % | 165.93 | 4,446 | 0.09 % | 103.55
slovensko | slove | nsko | 172,365 | 0.14 % | 151.90 | 0 | 0 % | 0 | 362 | 0.01 % | 9.11 | 270 | 0.06 % | 68.25 | 96,938 | 0.16 % | 178.61 | 20,514 | 0.10 % | 109.46 | 51,003 | 0.14 % | 160.42 | 3,278 | 0.06 % | 76.35
nekdanji | nekda | nji | 170,878 | 0.14 % | 150.59 | 0 | 0 % | 0 | 922 | 0.03 % | 23.21 | 58 | 0.01 % | 14.66 | 84,369 | 0.14 % | 155.46 | 15,884 | 0.08 % | 84.75 | 68,299 | 0.19 % | 214.82 | 1,346 | 0.03 % | 31.35
evropski | evrop | ski | 170,666 | 0.14 % | 150.41 | 0 | 0 % | 0 | 191 | 0.01 % | 4.81 | 328 | 0.07 % | 82.91 | 86,760 | 0.14 % | 159.86 | 15,777 | 0.08 % | 84.18 | 64,420 | 0.18 % | 202.62 | 3,190 | 0.06 % | 74.30
glavni | glavn | i | 169,954 | 0.14 % | 149.78 | 0 | 0 % | 0 | 2,389 | 0.09 % | 60.15 | 442 | 0.10 % | 111.73 | 85,037 | 0.14 % | 156.69 | 24,126 | 0.12 % | 128.73 | 52,387 | 0.15 % | 164.77 | 5,573 | 0.11 % | 129.80
večji | večji | | 166,234 | 0.13 % | 146.50 | 0 | 0 % | 0 | 2,284 | 0.08 % | 57.51 | 307 | 0.07 % | 77.61 | 81,608 | 0.13 % | 150.37 | 32,453 | 0.16 % | 173.16 | 42,062 | 0.12 % | 132.30 | 7,520 | 0.15 % | 175.15
novem | novem | | 165,059 | 0.13 % | 145.47 | 0 | 0 % | 0 | 1,201 | 0.04 % | 30.24 | 183 | 0.04 % | 46.26 | 94,756 | 0.15 % | 174.59 | 22,827 | 0.11 % | 121.80 | 44,218 | 0.13 % | 139.08 | 1,874 | 0.04 % | 43.65
največji | najve | čji | 162,333 | 0.13 % | 143.06 | 0 | 0 % | 0 | 1,346 | 0.05 % | 33.89 | 203 | 0.04 % | 51.32 | 82,320 | 0.14 % | 151.68 | 26,141 | 0.13 % | 139.48 | 48,637 | 0.14 % | 152.98 | 3,686 | 0.07 % | 85.85

1.2.76 List of final character-level 1-grams from adjective lower-case word forms in the Gigafida 2.0 corpus with text-type distribution

File at CLARIN.SI: GF2.0-word_parts-adjectives-lowercase_forms-final-1grams-taxonomy-entire.tsv

Lower-case word form | Rest of the word | Final part of the word | Total absolute frequency of lower-case word form | Percentage of all found lower-case word forms | Total relative frequency (per million occurrences) | Absolute frequency [SSJ.T.K.N] | Percentage [SSJ.T.K.N] | Relative frequency [SSJ.T.K.N] | Absolute frequency [SSJ.T.K.L] | Percentage [SSJ.T.K.L] | Relative frequency [SSJ.T.K.L] | Absolute frequency [SSJ.T.D] | Percentage [SSJ.T.D] | Relative frequency [SSJ.T.D] | Absolute frequency [SSJ.T.P.C] | Percentage [SSJ.T.P.C] | Relative frequency [SSJ.T.P.C] | Absolute frequency [SSJ.T.P.R] | Percentage [SSJ.T.P.R] | Relative frequency [SSJ.T.P.R] | Absolute frequency [SSJ.I] | Percentage [SSJ.I] | Relative frequency [SSJ.I] | Absolute frequency [SSJ.T.K.S] | Percentage [SSJ.T.K.S] | Relative frequency [SSJ.T.K.S]
mogoče | mogoč | e | 502,447 | 0.39 % | 442.80 | 0 | 0 % | 0 | 23,836 | 0.80 % | 600.16 | 2,362 | 0.49 % | 597.09 | 243,451 | 0.38 % | 448.57 | 90,333 | 0.42 % | 481.99 | 116,502 | 0.32 % | 366.43 | 25,963 | 0.49 % | 604.71
novo | nov | o | 475,332 | 0.36 % | 418.91 | 1 | 0.12 % | 103.02 | 7,539 | 0.25 % | 189.82 | 788 | 0.17 % | 199.20 | 258,248 | 0.41 % | 475.84 | 75,103 | 0.35 % | 400.72 | 121,642 | 0.33 % | 382.60 | 12,011 | 0.23 % | 279.75
slovenski | slovensk | i | 446,624 | 0.34 % | 393.61 | 0 | 0 % | 0 | 678 | 0.02 % | 17.07 | 576 | 0.12 % | 145.61 | 240,556 | 0.38 %
443.24 | 47,459 | 0.22 % | 253.23 | 151,452 | 0.42 % | 476.36 | 5,903 | 0.11 % | 137.49
sam | sa | m | 404,454 | 0.31 % | 356.44 | 0 | 0 % | 0 | 28,032 | 0.94 % | 705.82 | 1,505 | 0.32 % | 380.45 | 182,788 | 0.29 % | 336.80 | 76,278 | 0.35 % | 406.99 | 99,109 | 0.27 % | 311.72 | 16,742 | 0.32 % | 389.94
nove | nov | e | 383,174 | 0.29 % | 337.69 | 0 | 0 % | 0 | 4,160 | 0.14 % | 104.74 | 806 | 0.17 % | 203.75 | 197,154 | 0.31 % | 363.27 | 63,139 | 0.29 % | 336.89 | 105,758 | 0.29 % | 332.64 | 12,157 | 0.23 % | 283.15
zadnjih | zadnji | h | 374,079 | 0.29 % | 329.67 | 0 | 0 % | 0 | 3,656 | 0.12 % | 92.05 | 371 | 0.08 % | 93.78 | 188,954 | 0.30 % | 348.16 | 40,539 | 0.19 % | 216.30 | 135,861 | 0.37 % | 427.32 | 4,698 | 0.09 % | 109.42
slovenske | slovensk | e | 353,715 | 0.27 % | 311.73 | 0 | 0 % | 0 | 506 | 0.02 % | 12.74 | 592 | 0.12 % | 149.65 | 206,966 | 0.33 % | 381.35 | 37,129 | 0.17 % | 198.11 | 103,866 | 0.28 % | 326.69 | 4,656 | 0.09 % | 108.44
novi | nov | i | 329,121 | 0.25 % | 290.05 | 0 | 0 % | 0 | 3,532 | 0.12 % | 88.93 | 561 | 0.12 % | 141.81 | 173,329 | 0.27 % | 319.37 | 56,367 | 0.26 % | 300.76 | 89,190 | 0.24 % | 280.53 | 6,142 | 0.12 % | 143.05
slovenskih | slovenski | h | 294,271 | 0.23 % | 259.34 | 0 | 0 % | 0 | 442 | 0.01 % | 11.13 | 399 | 0.08 % | 100.86 | 174,340 | 0.28 % | 321.23 | 34,387 | 0.16 % | 183.48 | 80,914 | 0.22 % | 254.50 | 3,789 | 0.07 % | 88.25
novega | noveg | a | 281,250 | 0.22 % | 247.86 | 0 | 0 % | 0 | 3,744 | 0.12 % | 94.27 | 446 | 0.09 % | 112.74 | 149,906 | 0.24 % | 276.21 | 43,885 | 0.20 % | 234.16 | 77,637 | 0.21 % | 244.19 | 5,632 | 0.11 % | 131.18
različnih | različni | h | 266,466 | 0.20 % | 234.84 | 2 | 0.23 % | 206.04 | 2,438 | 0.08 % | 61.39 | 678 | 0.14 % | 171.39 | 118,184 | 0.19 % | 217.76 | 56,333 | 0.26 % | 300.57 | 68,940 | 0.19 % | 216.83 | 19,891 | 0.38 % | 463.29
novih | novi | h | 265,857 | 0.20 % | 234.30 | 0 | 0 % | 0 | 2,128 | 0.07 % | 53.58 | 655 | 0.14 % | 165.58 | 137,591 | 0.22 % | 253.52 | 42,406 | 0.20 % | 226.26 | 74,578 | 0.20 % | 234.57 | 8,499 | 0.16 % | 197.95
nova | nov | a | 254,400 | 0.20 % | 224.20 | 0 | 0 % | 0 | 2,819 | 0.09 % | 70.98 | 635 | 0.13 % | 160.52 | 128,227 | 0.20 % | 236.27 | 41,185 | 0.19 % | 219.75 | 75,332 | 0.21 % | 236.94 | 6,202 | 0.12 % | 144.45
veliko | velik | o | 252,295 | 0.19 % | 222.35 | 0 | 0 % | 0 | 7,380 | 0.25 % | 185.82 | 564 | 0.12 % | 142.57 | 121,628 | 0.19 % | 224.11 | 44,659 | 0.21 % | 238.29 | 68,608 | 0.19 % | 215.79 | 9,456 | 0.18 % | 220.24
zadnji | zadnj | i | 251,253 | 0.19 % | 221.43 | 1 | 0.12 % | 103.02 | 6,258 | 0.21 % | 157.57 | 400 | 0.08 % | 101.12 | 128,961 | 0.20 % | 237.62 | 35,235 | 0.16 % | 188 | 75,833 | 0.21 % | 238.51 | 4,565 | 0.09 % | 106.32
sami | sam | i | 231,519 | 0.18 % | 204.04 | 1 | 0.12 % | 103.02 | 5,772 | 0.19 % | 145.33 | 753 | 0.16 % | 190.35 | 111,377 | 0.18 % | 205.22 | 49,240 | 0.23 % | 262.73 | 52,561 | 0.14 % | 165.32 | 11,815 | 0.22 % | 275.19
velika | velik | a | 224,674 | 0.17 % | 198 | 0 | 0 % | 0 | 5,726 | 0.19 % | 144.17 | 592 | 0.12 % | 149.65 | 104,684 | 0.17 % | 192.89 | 42,115 | 0.19 % | 224.71 | 62,856 | 0.17 % | 197.70 | 8,701 | 0.17 % | 202.66
pomembno | pomembn | o | 223,934 | 0.17 % | 197.35 | 0 | 0 % | 0 | 4,575 | 0.15 % | 115.19 | 439 | 0.09 % | 110.97 | 99,670 | 0.16 % | 183.65 | 43,834 | 0.20 % | 233.88 | 62,790 | 0.17 % | 197.49 | 12,626 | 0.24 % | 294.07
letni | letn | i | 222,751 | 0.17 % | 196.31 | 0 | 0 % | 0 | 309 | 0.01 % | 7.78 | 182 | 0.04 % | 46.01 | 114,022 | 0.18 % | 210.09 | 12,989 | 0.06 % | 69.31 | 94,228 | 0.26 % | 296.37 | 1,021 | 0.02 % | 23.78
slovenska | slovensk | a | 220,229 | 0.17 % | 194.09 | 0 | 0 % | 0 | 268 | 0.01 % | 6.75 | 407 | 0.09 % | 102.89 | 118,994 | 0.19 % | 219.25 | 22,935 | 0.11 % | 122.37 | 74,627 | 0.20 % | 234.72 | 2,998 | 0.06 % | 69.83
veliki | velik | i | 216,843 | 0.17 % | 191.10 | 0 | 0 % | 0 | 5,506 | 0.18 % | 138.64 | 648 | 0.14 % | 163.81 | 103,011 | 0.16 % | 189.80 | 37,422 | 0.17 % | 199.67 | 60,010 | 0.17 % | 188.75 | 10,246 | 0.19 % | 238.64
zadnjem | zadnje | m | 213,519 | 0.16 % | 188.17 | 1 | 0.12 % | 103.02 | 2,919 | 0.10 % | 73.50 | 290 | 0.06 % | 73.31 | 110,962 | 0.17 % | 204.45 | 28,044 | 0.13 % | 149.63 | 68,052 | 0.19 % | 214.04 | 3,251 | 0.06 % | 75.72
sama | sam | a | 209,273 | 0.16 % | 184.43 | 0 | 0 % | 0 | 17,968 | 0.60 % | 452.41 | 729 | 0.15 % | 184.28 | 84,772 | 0.13 % | 156.20 | 47,535 | 0.22 % | 253.63 | 50,301 | 0.14 % | 158.21 | 7,968 | 0.15 % | 185.58
evropske | evropsk | e | 208,721 | 0.16 % | 183.94 | 0 | 0 % | 0 | 195 | 0.01 % | 4.91 | 447 | 0.09 % | 113 | 106,056 | 0.17 % | 195.42 | 17,905 | 0.08 % | 95.54 | 80,530 | 0.22 % | 253.29 | 3,588 | 0.07 % | 83.57
velik | veli | k | 205,754 | 0.16 % | 181.33 | 0 | 0 % | 0 | 5,789 | 0.19 % | 145.76 | 372 | 0.08 % | 94.04 | 94,260 | 0.15 % | 173.68 | 39,261 | 0.18 % | 209.48 | 58,805 | 0.16 % | 184.96 | 7,267 | 0.14 % | 169.26
prihodnje | prihodnj | e | 203,631 | 0.16 % | 179.46 | 0 | 0 % | 0 | 614 | 0.02 % | 15.46 | 158 | 0.03 % | 39.94 | 120,469 | 0.19 % | 221.97 | 19,044 | 0.09 % | 101.61 | 61,773 | 0.17 % | 194.29 | 1,573 | 0.03 % | 36.64
slovenskega | slovenskeg | a | 202,094 | 0.15 % | 178.10 | 0 | 0 % | 0 | 398 | 0.01 % | 10.02 | 407 | 0.09 % | 102.89 | 112,062 | 0.18 % | 206.48 | 22,466 | 0.10 % | 119.87 | 62,505 | 0.17 % | 196.59 | 4,256 | 0.08 % | 99.13
dobro | dobr | o | 201,264 | 0.15 % | 177.37 | 0 | 0 % | 0 | 6,966 | 0.23 % | 175.40 | 343 | 0.07 % | 86.71 | 98,094 | 0.15 % | 180.74 | 39,696 | 0.18 % | 211.80 | 50,431 | 0.14 % | 158.62 | 5,734 | 0.11 % | 133.55
velike | velik | e | 199,502 | 0.15 % | 175.82 | 4 | 0.47 % | 412.07 | 4,824 | 0.16 % | 121.46 | 674 | 0.14 % | 170.38 | 95,438 | 0.15 % | 175.85 | 36,157 | 0.17 % | 192.92 | 52,627 | 0.14 % | 165.53 | 9,778 | 0.18 % | 227.74
nov | no | v | 192,972 | 0.15 % | 170.07 | 0 | 0 % | 0 | 3,307 | 0.11 % | 83.27 | 567 | 0.12 % | 143.33 | 87,154 | 0.14 % | 160.59 | 34,997 | 0.16 % | 186.73 | 62,028 | 0.17 % | 195.09 | 4,919 | 0.09 % | 114.57
državni | državn | i | 190,454 | 0.15 % | 167.85 | 0 | 0 % | 0 | 679 | 0.02 % | 17.10 | 2,076 | 0.43 % | 524.79 | 111,848 | 0.18 % | 206.09 | 14,035 | 0.07 % | 74.89 | 59,642 | 0.16 % | 187.59 | 2,174 | 0.04 % | 50.64
najboljši | najboljš | i | 182,232 | 0.14 % | 160.60 | 0 | 0 % | 0 | 2,304 | 0.08 % | 58.01 | 142 | 0.03 % | 35.90 | 90,767 | 0.14 % | 167.24 | 23,701 | 0.11 % | 126.46 | 63,025 | 0.17 % | 198.23 | 2,293 | 0.04 % | 53.41

1.2.77 List of final character-level 2-grams from adjective lower-case word forms in the Gigafida 2.0 corpus with text-type distribution

File at CLARIN.SI: GF2.0-word_parts-adjectives-lowercase_forms-final-2grams-taxonomy-entire.tsv

Lower-case word form | Rest of the word | Final part of the word | Total absolute frequency of lower-case word form | Percentage of all found lower-case word forms | Total relative frequency (per million occurrences) | Absolute frequency [SSJ.T.K.N] | Percentage [SSJ.T.K.N] | Relative frequency [SSJ.T.K.N] | Absolute frequency [SSJ.T.K.L] | Percentage [SSJ.T.K.L] | Relative frequency [SSJ.T.K.L] | Absolute frequency [SSJ.T.D] | Percentage [SSJ.T.D] | Relative frequency [SSJ.T.D] | Absolute frequency [SSJ.T.P.C] | Percentage [SSJ.T.P.C] | Relative frequency [SSJ.T.P.C] | Absolute frequency [SSJ.T.P.R] | Percentage [SSJ.T.P.R] | Relative frequency [SSJ.T.P.R] | Absolute frequency [SSJ.I] | Percentage [SSJ.I] | Relative frequency [SSJ.I] | Absolute frequency [SSJ.T.K.S] | Percentage [SSJ.T.K.S] | Relative frequency [SSJ.T.K.S]
mogoče | mogo | če | 502,447 | 0.39 % | 442.80 | 0 | 0 % | 0 | 23,836 | 0.80 % | 600.16 | 2,362 | 0.49 % | 597.09 | 243,451 | 0.38 % | 448.57 | 90,333 | 0.42 % | 481.99 | 116,502 | 0.32 % | 366.43 | 25,963 | 0.49 % | 604.71
novo | no | vo | 475,332 | 0.36 % | 418.91 | 1 | 0.12 % | 103.02 | 7,539 | 0.25 % | 189.82 | 788 | 0.17 % | 199.20 | 258,248 | 0.41 % | 475.84 | 75,103 | 0.35 % | 400.72 | 121,642 | 0.33 % | 382.60 | 12,011 | 0.23 % | 279.75
slovenski | slovens | ki | 446,624 | 0.34 % | 393.61 | 0 | 0 % | 0 | 678 | 0.02 % | 17.07 | 576 | 0.12 % | 145.61 | 240,556 | 0.38 % | 443.24 | 47,459 | 0.22 % | 253.23 | 151,452 | 0.42 % | 476.36 | 5,903 | 0.11 % | 137.49
sam | s | am | 404,454 | 0.31 % | 356.44 | 0 | 0 % | 0 | 28,032 | 0.94 % | 705.82 | 1,505 | 0.32 % | 380.45 | 182,788 | 0.29 % | 336.80 | 76,278 | 0.35 % | 406.99 | 99,109 | 0.27 % | 311.72 | 16,742 | 0.32 % | 389.94
nove | no | ve | 383,174 | 0.29 % | 337.69 | 0 | 0 % | 0 | 4,160 | 0.14 % | 104.74 | 806 | 0.17 % | 203.75 | 197,154 | 0.31 % | 363.27 | 63,139 | 0.29 % | 336.89 | 105,758 | 0.29 % | 332.64 | 12,157 | 0.23 % | 283.15
zadnjih | zadnj | ih | 374,079 | 0.29 % | 329.67 | 0 | 0 % | 0 | 3,656 | 0.12 % | 92.05 | 371 | 0.08 % | 93.78 | 188,954 | 0.30 % | 348.16 | 40,539 | 0.19 % | 216.30 | 135,861 | 0.37 % | 427.32 | 4,698 | 0.09 % | 109.42
slovenske | slovens | ke | 353,715 | 0.27 % | 311.73 | 0 | 0 % | 0 | 506 | 0.02 % | 12.74 | 592 | 0.12 % | 149.65 | 206,966 | 0.33 % | 381.35 | 37,129 | 0.17 % | 198.11 | 103,866 | 0.28 % | 326.69 | 4,656 | 0.09 % | 108.44
novi | no | vi | 329,121 | 0.25 % | 290.05 | 0 | 0 % | 0 | 3,532 | 0.12 % | 88.93 | 561 | 0.12 % | 141.81 | 173,329 | 0.27 % | 319.37 | 56,367 | 0.26 % | 300.76 | 89,190 | 0.24 % | 280.53 | 6,142 | 0.12 % | 143.05
slovenskih | slovensk | ih | 294,271 | 0.23 % | 259.34 | 0 | 0 % | 0 | 442 | 0.01 % | 11.13 | 399 | 0.08 % | 100.86 | 174,340 | 0.28 % | 321.23 | 34,387 | 0.16 % | 183.48 | 80,914 | 0.22 % | 254.50 | 3,789 | 0.07 % | 88.25
novega | nove | ga | 281,250 | 0.22 % | 247.86 | 0 | 0 % | 0 | 3,744 | 0.12 % | 94.27 | 446 | 0.09 % | 112.74 | 149,906 | 0.24 % | 276.21 | 43,885 | 0.20 % | 234.16 | 77,637 | 0.21 % | 244.19 | 5,632 | 0.11 % | 131.18
različnih | različn | ih | 266,466 | 0.20 % | 234.84 | 2 | 0.23 % | 206.04 | 2,438 | 0.08 % | 61.39 | 678 | 0.14 % | 171.39 | 118,184 | 0.19 % | 217.76 | 56,333 | 0.26 % | 300.57 | 68,940 | 0.19 % | 216.83 | 19,891 | 0.38 % | 463.29
novih | nov | ih | 265,857 | 0.20 % | 234.30 | 0 | 0 % | 0 | 2,128 | 0.07 % | 53.58 | 655 | 0.14 % | 165.58 | 137,591 | 0.22 % | 253.52 | 42,406
0.20 % 226.26 74,578 0.20 % 234.57 8,499 0.16 % 197.95
nova no va 254,400 0.20 % 224.20 0 0 % 0 2,819 0.09 % 70.98 635 0.13 % 160.52 128,227 0.20 % 236.27 41,185 0.19 % 219.75 75,332 0.21 % 236.94 6,202 0.12 % 144.45
veliko veli ko 252,295 0.19 % 222.35 0 0 % 0 7,380 0.25 % 185.82 564 0.12 % 142.57 121,628 0.19 % 224.11 44,659 0.21 % 238.29 68,608 0.19 % 215.79 9,456 0.18 % 220.24
zadnji zadn ji 251,253 0.19 % 221.43 1 0.12 % 103.02 6,258 0.21 % 157.57 400 0.08 % 101.12 128,961 0.20 % 237.62 35,235 0.16 % 188 75,833 0.21 % 238.51 4,565 0.09 % 106.32
sami sa mi 231,519 0.18 % 204.04 1 0.12 % 103.02 5,772 0.19 % 145.33 753 0.16 % 190.35 111,377 0.18 % 205.22 49,240 0.23 % 262.73 52,561 0.14 % 165.32 11,815 0.22 % 275.19
velika veli ka 224,674 0.17 % 198 0 0 % 0 5,726 0.19 % 144.17 592 0.12 % 149.65 104,684 0.17 % 192.89 42,115 0.19 % 224.71 62,856 0.17 % 197.70 8,701 0.17 % 202.66
pomembno pomemb no 223,934 0.17 % 197.35 0 0 % 0 4,575 0.15 % 115.19 439 0.09 % 110.97 99,670 0.16 % 183.65 43,834 0.20 % 233.88 62,790 0.17 % 197.49 12,626 0.24 % 294.07
letni let ni 222,751 0.17 % 196.31 0 0 % 0 309 0.01 % 7.78 182 0.04 % 46.01 114,022 0.18 % 210.09 12,989 0.06 % 69.31 94,228 0.26 % 296.37 1,021 0.02 % 23.78
slovenska slovens ka 220,229 0.17 % 194.09 0 0 % 0 268 0.01 % 6.75 407 0.09 % 102.89 118,994 0.19 % 219.25 22,935 0.11 % 122.37 74,627 0.20 % 234.72 2,998 0.06 % 69.83
veliki veli ki 216,843 0.17 % 191.10 0 0 % 0 5,506 0.18 % 138.64 648 0.14 % 163.81 103,011 0.16 % 189.80 37,422 0.17 % 199.67 60,010 0.17 % 188.75 10,246 0.19 % 238.64
zadnjem zadnj em 213,519 0.16 % 188.17 1 0.12 % 103.02 2,919 0.10 % 73.50 290 0.06 % 73.31 110,962 0.17 % 204.45 28,044 0.13 % 149.63 68,052 0.19 % 214.04 3,251 0.06 % 75.72
sama sa ma 209,273 0.16 % 184.43 0 0 % 0 17,968 0.60 % 452.41 729 0.15 % 184.28 84,772 0.13 % 156.20 47,535 0.22 % 253.63 50,301 0.14 % 158.21 7,968 0.15 % 185.58
evropske evrops ke 208,721 0.16 % 183.94 0 0 % 0 195 0.01 % 4.91 447 0.09 % 113 106,056 0.17 % 195.42 17,905 0.08 % 95.54 80,530 0.22 % 253.29 3,588 0.07 % 83.57
velik vel ik 205,754 0.16 % 181.33 0 0 % 0 5,789 0.19 % 145.76 372 0.08 % 94.04 94,260 0.15 % 173.68 39,261 0.18 % 209.48 58,805 0.16 % 184.96 7,267 0.14 % 169.26
prihodnje prihodn je 203,631 0.16 % 179.46 0 0 % 0 614 0.02 % 15.46 158 0.03 % 39.94 120,469 0.19 % 221.97 19,044 0.09 % 101.61 61,773 0.17 % 194.29 1,573 0.03 % 36.64
slovenskega slovenske ga 202,094 0.15 % 178.10 0 0 % 0 398 0.01 % 10.02 407 0.09 % 102.89 112,062 0.18 % 206.48 22,466 0.10 % 119.87 62,505 0.17 % 196.59 4,256 0.08 % 99.13
dobro dob ro 201,264 0.15 % 177.37 0 0 % 0 6,966 0.23 % 175.40 343 0.07 % 86.71 98,094 0.15 % 180.74 39,696 0.18 % 211.80 50,431 0.14 % 158.62 5,734 0.11 % 133.55
velike veli ke 199,502 0.15 % 175.82 4 0.47 % 412.07 4,824 0.16 % 121.46 674 0.14 % 170.38 95,438 0.15 % 175.85 36,157 0.17 % 192.92 52,627 0.14 % 165.53 9,778 0.18 % 227.74
nov n ov 192,972 0.15 % 170.07 0 0 % 0 3,307 0.11 % 83.27 567 0.12 % 143.33 87,154 0.14 % 160.59 34,997 0.16 % 186.73 62,028 0.17 % 195.09 4,919 0.09 % 114.57
državni držav ni 190,454 0.15 % 167.85 0 0 % 0 679 0.02 % 17.10 2,076 0.43 % 524.79 111,848 0.18 % 206.09 14,035 0.07 % 74.89 59,642 0.16 % 187.59 2,174 0.04 % 50.64
najboljši najbolj ši 182,232 0.14 % 160.60 0 0 % 0 2,304 0.08 % 58.01 142 0.03 % 35.90 90,767 0.14 % 167.24 23,701 0.11 % 126.46 63,025 0.17 % 198.23 2,293 0.04 % 53.41

1.2.78 List of final character-level 3-grams from adjective lower-case word forms in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-word_parts-adjectives-lowercase_forms-final-3grams-taxonomy-entire.tsv
Lower-case word form | Rest of the word | Final part of the word | Total absolute frequency of lower-case word form | Percentage of all found lower-case word forms | Total relative frequency (per million occurrences) | Absolute frequency [SSJ.T.K.N] | Percentage [SSJ.T.K.N] | Relative frequency [SSJ.T.K.N] | Absolute frequency [SSJ.T.K.L] | Percentage [SSJ.T.K.L] | Relative frequency [SSJ.T.K.L] | Absolute frequency [SSJ.T.D] | Percentage [SSJ.T.D] | Relative frequency [SSJ.T.D] | Absolute frequency [SSJ.T.P.C] | Percentage [SSJ.T.P.C] | Relative frequency [SSJ.T.P.C] | Absolute frequency [SSJ.T.P.R] | Percentage [SSJ.T.P.R] | Relative frequency [SSJ.T.P.R] | Absolute frequency [SSJ.I] | Percentage [SSJ.I] | Relative frequency [SSJ.I] | Absolute frequency [SSJ.T.K.S] | Percentage [SSJ.T.K.S] | Relative frequency [SSJ.T.K.S]
mogoče mog oče 502,447 0.39 % 442.80 0 0 % 0 23,836 0.80 % 600.16 2,362 0.49 % 597.09 243,451 0.38 % 448.57 90,333 0.42 % 481.99 116,502 0.32 % 366.43 25,963 0.49 % 604.71
novo n ovo 475,332 0.36 % 418.91 1 0.12 % 103.02 7,539 0.25 % 189.82 788 0.17 % 199.20 258,248 0.41 % 475.84 75,103 0.35 % 400.72 121,642 0.33 % 382.60 12,011 0.23 % 279.75
slovenski sloven ski 446,624 0.34 % 393.61 0 0 % 0 678 0.02 % 17.07 576 0.12 % 145.61 240,556 0.38 % 443.24 47,459 0.22 % 253.23 151,452 0.42 % 476.36 5,903 0.11 % 137.49
sam sam 404,454 0.31 % 356.44 0 0 % 0 28,032 0.94 % 705.82 1,505 0.32 % 380.45 182,788 0.29 % 336.80 76,278 0.35 % 406.99 99,109 0.27 % 311.72 16,742 0.32 % 389.94
nove n ove 383,174 0.29 % 337.69 0 0 % 0 4,160 0.14 % 104.74 806 0.17 % 203.75 197,154 0.31 % 363.27 63,139 0.29 % 336.89 105,758 0.29 % 332.64 12,157 0.23 % 283.15
zadnjih zadn jih 374,079 0.29 % 329.67 0 0 % 0 3,656 0.12 % 92.05 371 0.08 % 93.78 188,954 0.30 % 348.16 40,539 0.19 % 216.30 135,861 0.37 % 427.32 4,698 0.09 % 109.42
slovenske sloven ske 353,715 0.27 % 311.73 0 0 % 0 506 0.02 % 12.74 592 0.12 % 149.65 206,966 0.33 % 381.35 37,129 0.17 % 198.11 103,866 0.28 % 326.69 4,656 0.09 % 108.44
novi n ovi 329,121 0.25 % 290.05 0 0 % 0 3,532 0.12 % 88.93 561 0.12 % 141.81 173,329 0.27 % 319.37 56,367 0.26 % 300.76 89,190 0.24 % 280.53 6,142 0.12 % 143.05
slovenskih slovens kih 294,271 0.23 % 259.34 0 0 % 0 442 0.01 % 11.13 399 0.08 % 100.86 174,340 0.28 % 321.23 34,387 0.16 % 183.48 80,914 0.22 %
254.50 3,789 0.07 % 88.25
novega nov ega 281,250 0.22 % 247.86 0 0 % 0 3,744 0.12 % 94.27 446 0.09 % 112.74 149,906 0.24 % 276.21 43,885 0.20 % 234.16 77,637 0.21 % 244.19 5,632 0.11 % 131.18
različnih različ nih 266,466 0.20 % 234.84 2 0.23 % 206.04 2,438 0.08 % 61.39 678 0.14 % 171.39 118,184 0.19 % 217.76 56,333 0.26 % 300.57 68,940 0.19 % 216.83 19,891 0.38 % 463.29
novih no vih 265,857 0.20 % 234.30 0 0 % 0 2,128 0.07 % 53.58 655 0.14 % 165.58 137,591 0.22 % 253.52 42,406 0.20 % 226.26 74,578 0.20 % 234.57 8,499 0.16 % 197.95
nova n ova 254,400 0.20 % 224.20 0 0 % 0 2,819 0.09 % 70.98 635 0.13 % 160.52 128,227 0.20 % 236.27 41,185 0.19 % 219.75 75,332 0.21 % 236.94 6,202 0.12 % 144.45
veliko vel iko 252,295 0.19 % 222.35 0 0 % 0 7,380 0.25 % 185.82 564 0.12 % 142.57 121,628 0.19 % 224.11 44,659 0.21 % 238.29 68,608 0.19 % 215.79 9,456 0.18 % 220.24
zadnji zad nji 251,253 0.19 % 221.43 1 0.12 % 103.02 6,258 0.21 % 157.57 400 0.08 % 101.12 128,961 0.20 % 237.62 35,235 0.16 % 188 75,833 0.21 % 238.51 4,565 0.09 % 106.32
sami s ami 231,519 0.18 % 204.04 1 0.12 % 103.02 5,772 0.19 % 145.33 753 0.16 % 190.35 111,377 0.18 % 205.22 49,240 0.23 % 262.73 52,561 0.14 % 165.32 11,815 0.22 % 275.19
velika vel ika 224,674 0.17 % 198 0 0 % 0 5,726 0.19 % 144.17 592 0.12 % 149.65 104,684 0.17 % 192.89 42,115 0.19 % 224.71 62,856 0.17 % 197.70 8,701 0.17 % 202.66
pomembno pomem bno 223,934 0.17 % 197.35 0 0 % 0 4,575 0.15 % 115.19 439 0.09 % 110.97 99,670 0.16 % 183.65 43,834 0.20 % 233.88 62,790 0.17 % 197.49 12,626 0.24 % 294.07
letni le tni 222,751 0.17 % 196.31 0 0 % 0 309 0.01 % 7.78 182 0.04 % 46.01 114,022 0.18 % 210.09 12,989 0.06 % 69.31 94,228 0.26 % 296.37 1,021 0.02 % 23.78
slovenska sloven ska 220,229 0.17 % 194.09 0 0 % 0 268 0.01 % 6.75 407 0.09 % 102.89 118,994 0.19 % 219.25 22,935 0.11 % 122.37 74,627 0.20 % 234.72 2,998 0.06 % 69.83
veliki vel iki 216,843 0.17 % 191.10 0 0 % 0 5,506 0.18 % 138.64 648 0.14 % 163.81 103,011 0.16 % 189.80 37,422 0.17 % 199.67 60,010 0.17 % 188.75 10,246 0.19 % 238.64
zadnjem zadn jem 213,519 0.16 % 188.17 1 0.12 % 103.02 2,919 0.10 % 73.50 290 0.06 % 73.31 110,962 0.17 % 204.45 28,044 0.13 % 149.63 68,052 0.19 % 214.04 3,251 0.06 % 75.72
sama s ama 209,273 0.16 % 184.43 0 0 % 0 17,968 0.60 % 452.41 729 0.15 % 184.28 84,772 0.13 % 156.20 47,535 0.22 % 253.63 50,301 0.14 % 158.21 7,968 0.15 % 185.58
evropske evrop ske 208,721 0.16 % 183.94 0 0 % 0 195 0.01 % 4.91 447 0.09 % 113 106,056 0.17 % 195.42 17,905 0.08 % 95.54 80,530 0.22 % 253.29 3,588 0.07 % 83.57
velik ve lik 205,754 0.16 % 181.33 0 0 % 0 5,789 0.19 % 145.76 372 0.08 % 94.04 94,260 0.15 % 173.68 39,261 0.18 % 209.48 58,805 0.16 % 184.96 7,267 0.14 % 169.26
prihodnje prihod nje 203,631 0.16 % 179.46 0 0 % 0 614 0.02 % 15.46 158 0.03 % 39.94 120,469 0.19 % 221.97 19,044 0.09 % 101.61 61,773 0.17 % 194.29 1,573 0.03 % 36.64
slovenskega slovensk ega 202,094 0.15 % 178.10 0 0 % 0 398 0.01 % 10.02 407 0.09 % 102.89 112,062 0.18 % 206.48 22,466 0.10 % 119.87 62,505 0.17 % 196.59 4,256 0.08 % 99.13
dobro do bro 201,264 0.15 % 177.37 0 0 % 0 6,966 0.23 % 175.40 343 0.07 % 86.71 98,094 0.15 % 180.74 39,696 0.18 % 211.80 50,431 0.14 % 158.62 5,734 0.11 % 133.55
velike vel ike 199,502 0.15 % 175.82 4 0.47 % 412.07 4,824 0.16 % 121.46 674 0.14 % 170.38 95,438 0.15 % 175.85 36,157 0.17 % 192.92 52,627 0.14 % 165.53 9,778 0.18 % 227.74
nov nov 192,972 0.15 % 170.07 0 0 % 0 3,307 0.11 % 83.27 567 0.12 % 143.33 87,154 0.14 % 160.59 34,997 0.16 % 186.73 62,028 0.17 % 195.09 4,919 0.09 % 114.57
državni drža vni 190,454 0.15 % 167.85 0 0 % 0 679 0.02 % 17.10 2,076 0.43 % 524.79 111,848 0.18 % 206.09 14,035 0.07 % 74.89 59,642 0.16 % 187.59 2,174 0.04 % 50.64
najboljši najbol jši 182,232 0.14 % 160.60 0 0 % 0 2,304 0.08 % 58.01 142 0.03 % 35.90 90,767 0.14 % 167.24 23,701 0.11 % 126.46 63,025 0.17 % 198.23 2,293 0.04 % 53.41

1.2.79 List of final character-level 4-grams from adjective lower-case word forms in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-word_parts-adjectives-lowercase_forms-final-4grams-taxonomy-entire.tsv
Lower-case word form | Rest of the word | Final part of the word | Total absolute frequency of lower-case word form | Percentage of all found lower-case word forms | Total relative frequency (per million occurrences) | Absolute frequency [SSJ.T.K.N] | Percentage [SSJ.T.K.N] | Relative frequency [SSJ.T.K.N] | Absolute frequency [SSJ.T.K.L] | Percentage [SSJ.T.K.L] | Relative frequency [SSJ.T.K.L] | Absolute frequency [SSJ.T.D] | Percentage [SSJ.T.D] | Relative frequency [SSJ.T.D] | Absolute frequency [SSJ.T.P.C] | Percentage [SSJ.T.P.C] | Relative frequency [SSJ.T.P.C] | Absolute frequency [SSJ.T.P.R] | Percentage [SSJ.T.P.R] | Relative frequency [SSJ.T.P.R] | Absolute frequency [SSJ.I] | Percentage [SSJ.I] | Relative frequency [SSJ.I] | Absolute frequency [SSJ.T.K.S] | Percentage [SSJ.T.K.S] | Relative frequency [SSJ.T.K.S]
mogoče mo goče 502,447 0.39 % 442.80 0 0 % 0 23,836 0.81 % 600.16 2,362 0.50 % 597.09 243,451 0.39 % 448.57 90,333 0.42 % 481.99 116,502 0.32 % 366.43 25,963 0.49 % 604.71
novo novo 475,332 0.37 % 418.91 1 0.12 % 103.02 7,539 0.26 % 189.82 788 0.17 % 199.20 258,248 0.41 % 475.84 75,103 0.35 % 400.72 121,642 0.34 % 382.60 12,011 0.23 % 279.75
slovenski slove nski 446,624 0.34 % 393.61 0 0 % 0 678 0.02 % 17.07 576 0.12 % 145.61 240,556 0.38 % 443.24 47,459 0.22 % 253.23 151,452 0.42 % 476.36 5,903 0.11 % 137.49
nove nove 383,174 0.30 % 337.69 0 0 % 0 4,160 0.14 % 104.74 806 0.17 % 203.75 197,154 0.31 % 363.27 63,139 0.29 % 336.89 105,758 0.29 % 332.64 12,157 0.23 % 283.15
zadnjih zad njih 374,079 0.29 % 329.67 0 0 % 0 3,656 0.12 % 92.05 371 0.08 % 93.78 188,954 0.30 % 348.16 40,539 0.19 % 216.30 135,861 0.38 % 427.32 4,698 0.09 % 109.42
slovenske slove nske 353,715 0.27 % 311.73 0 0 % 0 506 0.02 % 12.74 592 0.12 % 149.65 206,966 0.33 % 381.35 37,129 0.17 % 198.11 103,866 0.29 % 326.69 4,656 0.09 % 108.44
novi
novi 329,121 0.25 % 290.05 0 0 % 0 3,532 0.12 % 88.93 561 0.12 % 141.81 173,329 0.28 % 319.37 56,367 0.26 % 300.76 89,190 0.25 % 280.53 6,142 0.12 % 143.05
slovenskih sloven skih 294,271 0.23 % 259.34 0 0 % 0 442 0.01 % 11.13 399 0.08 % 100.86 174,340 0.28 % 321.23 34,387 0.16 % 183.48 80,914 0.22 % 254.50 3,789 0.07 % 88.25
novega no vega 281,250 0.22 % 247.86 0 0 % 0 3,744 0.13 % 94.27 446 0.09 % 112.74 149,906 0.24 % 276.21 43,885 0.20 % 234.16 77,637 0.21 % 244.19 5,632 0.11 % 131.18
različnih razli čnih 266,466 0.21 % 234.84 2 0.23 % 206.04 2,438 0.08 % 61.39 678 0.14 % 171.39 118,184 0.19 % 217.76 56,333 0.26 % 300.57 68,940 0.19 % 216.83 19,891 0.38 % 463.29
novih n ovih 265,857 0.20 % 234.30 0 0 % 0 2,128 0.07 % 53.58 655 0.14 % 165.58 137,591 0.22 % 253.52 42,406 0.20 % 226.26 74,578 0.21 % 234.57 8,499 0.16 % 197.95
nova nova 254,400 0.20 % 224.20 0 0 % 0 2,819 0.10 % 70.98 635 0.13 % 160.52 128,227 0.20 % 236.27 41,185 0.19 % 219.75 75,332 0.21 % 236.94 6,202 0.12 % 144.45
veliko ve liko 252,295 0.20 % 222.35 0 0 % 0 7,380 0.25 % 185.82 564 0.12 % 142.57 121,628 0.19 % 224.11 44,659 0.21 % 238.29 68,608 0.19 % 215.79 9,456 0.18 % 220.24
zadnji za dnji 251,253 0.19 % 221.43 1 0.12 % 103.02 6,258 0.21 % 157.57 400 0.08 % 101.12 128,961 0.20 % 237.62 35,235 0.16 % 188 75,833 0.21 % 238.51 4,565 0.09 % 106.32
sami sami 231,519 0.18 % 204.04 1 0.12 % 103.02 5,772 0.20 % 145.33 753 0.16 % 190.35 111,377 0.18 % 205.22 49,240 0.23 % 262.73 52,561 0.14 % 165.32 11,815 0.23 % 275.19
velika ve lika 224,674 0.17 % 198 0 0 % 0 5,726 0.20 % 144.17 592 0.12 % 149.65 104,684 0.17 % 192.89 42,115 0.20 % 224.71 62,856 0.17 % 197.70 8,701 0.17 % 202.66
pomembno pome mbno 223,934 0.17 % 197.35 0 0 % 0 4,575 0.15 % 115.19 439 0.09 % 110.97 99,670 0.16 % 183.65 43,834 0.20 % 233.88 62,790 0.17 % 197.49 12,626 0.24 % 294.07
letni l etni 222,751 0.17 % 196.31 0 0 % 0 309 0.01 % 7.78 182 0.04 % 46.01 114,022 0.18 % 210.09 12,989 0.06 % 69.31 94,228 0.26 % 296.37 1,021 0.02 % 23.78
slovenska slove nska 220,229 0.17 % 194.09 0 0 % 0 268 0.01 % 6.75 407 0.09 % 102.89 118,994 0.19 % 219.25 22,935 0.11 % 122.37 74,627 0.21 % 234.72 2,998 0.06 % 69.83
veliki ve liki 216,843 0.17 % 191.10 0 0 % 0 5,506 0.19 % 138.64 648 0.14 % 163.81 103,011 0.16 % 189.80 37,422 0.17 % 199.67 60,010 0.17 % 188.75 10,246 0.20 % 238.64
zadnjem zad njem 213,519 0.17 % 188.17 1 0.12 % 103.02 2,919 0.10 % 73.50 290 0.06 % 73.31 110,962 0.18 % 204.45 28,044 0.13 % 149.63 68,052 0.19 % 214.04 3,251 0.06 % 75.72
sama sama 209,273 0.16 % 184.43 0 0 % 0 17,968 0.61 % 452.41 729 0.15 % 184.28 84,772 0.13 % 156.20 47,535 0.22 % 253.63 50,301 0.14 % 158.21 7,968 0.15 % 185.58
evropske evro pske 208,721 0.16 % 183.94 0 0 % 0 195 0.01 % 4.91 447 0.09 % 113 106,056 0.17 % 195.42 17,905 0.08 % 95.54 80,530 0.22 % 253.29 3,588 0.07 % 83.57
velik v elik 205,754 0.16 % 181.33 0 0 % 0 5,789 0.20 % 145.76 372 0.08 % 94.04 94,260 0.15 % 173.68 39,261 0.18 % 209.48 58,805 0.16 % 184.96 7,267 0.14 % 169.26
prihodnje priho dnje 203,631 0.16 % 179.46 0 0 % 0 614 0.02 % 15.46 158 0.03 % 39.94 120,469 0.19 % 221.97 19,044 0.09 % 101.61 61,773 0.17 % 194.29 1,573 0.03 % 36.64
slovenskega slovens kega 202,094 0.16 % 178.10 0 0 % 0 398 0.01 % 10.02 407 0.09 % 102.89 112,062 0.18 % 206.48 22,466 0.10 % 119.87 62,505 0.17 % 196.59 4,256 0.08 % 99.13
dobro d obro 201,264 0.16 % 177.37 0 0 % 0 6,966 0.24 % 175.40 343 0.07 % 86.71 98,094 0.16 % 180.74 39,696 0.18 % 211.80 50,431 0.14 % 158.62 5,734 0.11 % 133.55
velike ve like 199,502 0.15 % 175.82 4 0.47 % 412.07 4,824 0.16 % 121.46 674 0.14 % 170.38 95,438 0.15 % 175.85 36,157 0.17 % 192.92 52,627 0.14 % 165.53 9,778 0.19 % 227.74
državni drž avni 190,454 0.15 % 167.85 0 0 % 0 679 0.02 % 17.10 2,076 0.44 % 524.79 111,848 0.18 % 206.09 14,035 0.07 % 74.89 59,642 0.17 % 187.59 2,174 0.04 % 50.64
najboljši najbo ljši 182,232 0.14 % 160.60 0 0 % 0 2,304 0.08 % 58.01 142 0.03 % 35.90 90,767 0.14 % 167.24 23,701 0.11 % 126.46 63,025 0.17 % 198.23 2,293 0.04 % 53.41
jasno j asno 178,267 0.14 % 157.11 1 0.12 % 103.02 8,873 0.30 % 223.41 581 0.12 % 146.87 82,213 0.13 % 151.48 29,396 0.14 % 156.85 52,757 0.15 % 165.93 4,446 0.09 % 103.55
slovensko slove nsko 172,365 0.13 % 151.90 0 0 % 0 362 0.01 % 9.11 270 0.06 % 68.25 96,938 0.15 % 178.61 20,514 0.10 % 109.46 51,003 0.14 % 160.42 3,278 0.06 % 76.35

1.2.80 List of final character-level 5-grams from adjective lower-case word forms in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-word_parts-adjectives-lowercase_forms-final-5grams-taxonomy-entire.tsv
Lower-case word form | Rest of the word | Final part of the word | Total absolute frequency of lower-case word form | Percentage of all found lower-case word forms | Total relative frequency (per million occurrences) | Absolute frequency [SSJ.T.K.N] | Percentage [SSJ.T.K.N] | Relative frequency [SSJ.T.K.N] | Absolute frequency [SSJ.T.K.L] | Percentage [SSJ.T.K.L] | Relative frequency [SSJ.T.K.L] | Absolute frequency [SSJ.T.D] | Percentage [SSJ.T.D] | Relative frequency [SSJ.T.D] | Absolute frequency [SSJ.T.P.C] | Percentage [SSJ.T.P.C] | Relative frequency [SSJ.T.P.C] | Absolute frequency [SSJ.T.P.R] | Percentage [SSJ.T.P.R] | Relative frequency [SSJ.T.P.R] | Absolute frequency [SSJ.I] | Percentage [SSJ.I] | Relative frequency [SSJ.I] | Absolute frequency [SSJ.T.K.S] | Percentage [SSJ.T.K.S] | Relative frequency [SSJ.T.K.S]
mogoče m ogoče 502,447 0.40 % 442.80 0 0 % 0 23,836 0.86 % 600.16 2,362 0.51 % 597.09 243,451 0.40 % 448.57 90,333 0.44 % 481.99 116,502 0.33 % 366.43 25,963 0.51 % 604.71
slovenski slov enski 446,624 0.36 % 393.61 0 0 % 0 678 0.02 % 17.07 576 0.12 % 145.61 240,556 0.39 % 443.24 47,459 0.23 % 253.23 151,452 0.43 % 476.36 5,903 0.12 % 137.49
zadnjih za dnjih 374,079 0.30 % 329.67 0 0 % 0 3,656 0.13 % 92.05 371 0.08 % 93.78 188,954 0.31 % 348.16 40,539 0.20 % 216.30 135,861 0.39 % 427.32 4,698 0.09 % 109.42
slovenske slov enske 353,715 0.28
% 311.73 0 0 % 0 506 0.02 % 12.74 592 0.13 % 149.65 206,966 0.34 % 381.35 37,129 0.18 % 198.11 103,866 0.29 % 326.69 4,656 0.09 % 108.44
slovenskih slove nskih 294,271 0.23 % 259.34 0 0 % 0 442 0.02 % 11.13 399 0.09 % 100.86 174,340 0.28 % 321.23 34,387 0.17 % 183.48 80,914 0.23 % 254.50 3,789 0.07 % 88.25
novega n ovega 281,250 0.22 % 247.86 0 0 % 0 3,744 0.14 % 94.27 446 0.10 % 112.74 149,906 0.24 % 276.21 43,885 0.21 % 234.16 77,637 0.22 % 244.19 5,632 0.11 % 131.18
različnih razl ičnih 266,466 0.21 % 234.84 2 0.24 % 206.04 2,438 0.09 % 61.39 678 0.15 % 171.39 118,184 0.19 % 217.76 56,333 0.27 % 300.57 68,940 0.20 % 216.83 19,891 0.39 % 463.29
novih novih 265,857 0.21 % 234.30 0 0 % 0 2,128 0.08 % 53.58 655 0.14 % 165.58 137,591 0.23 % 253.52 42,406 0.20 % 226.26 74,578 0.21 % 234.57 8,499 0.17 % 197.95
veliko v eliko 252,295 0.20 % 222.35 0 0 % 0 7,380 0.27 % 185.82 564 0.12 % 142.57 121,628 0.20 % 224.11 44,659 0.22 % 238.29 68,608 0.20 % 215.79 9,456 0.19 % 220.24
zadnji z adnji 251,253 0.20 % 221.43 1 0.12 % 103.02 6,258 0.23 % 157.57 400 0.09 % 101.12 128,961 0.21 % 237.62 35,235 0.17 % 188 75,833 0.21 % 238.51 4,565 0.09 % 106.32
velika v elika 224,674 0.18 % 198 0 0 % 0 5,726 0.21 % 144.17 592 0.13 % 149.65 104,684 0.17 % 192.89 42,115 0.20 % 224.71 62,856 0.18 % 197.70 8,701 0.17 % 202.66
pomembno pom embno 223,934 0.18 % 197.35 0 0 % 0 4,575 0.17 % 115.19 439 0.10 % 110.97 99,670 0.16 % 183.65 43,834 0.21 % 233.88 62,790 0.18 % 197.49 12,626 0.25 % 294.07
letni letni 222,751 0.18 % 196.31 0 0 % 0 309 0.01 % 7.78 182 0.04 % 46.01 114,022 0.19 % 210.09 12,989 0.06 % 69.31 94,228 0.27 % 296.37 1,021 0.02 % 23.78
slovenska slov enska 220,229 0.18 % 194.09 0 0 % 0 268 0.01 % 6.75 407 0.09 % 102.89 118,994 0.20 % 219.25 22,935 0.11 % 122.37 74,627 0.21 % 234.72 2,998 0.06 % 69.83
veliki v eliki 216,843 0.17 % 191.10 0 0 % 0 5,506 0.20 % 138.64 648 0.14 % 163.81 103,011 0.17 % 189.80 37,422 0.18 % 199.67 60,010 0.17 % 188.75 10,246 0.20 % 238.64
zadnjem za dnjem 213,519 0.17 % 188.17 1 0.12 % 103.02 2,919 0.10 % 73.50 290 0.06 % 73.31 110,962 0.18 % 204.45 28,044 0.14 % 149.63 68,052 0.19 % 214.04 3,251 0.06 % 75.72
evropske evr opske 208,721 0.17 % 183.94 0 0 % 0 195 0.01 % 4.91 447 0.10 % 113 106,056 0.17 % 195.42 17,905 0.09 % 95.54 80,530 0.23 % 253.29 3,588 0.07 % 83.57
velik velik 205,754 0.16 % 181.33 0 0 % 0 5,789 0.21 % 145.76 372 0.08 % 94.04 94,260 0.15 % 173.68 39,261 0.19 % 209.48 58,805 0.17 % 184.96 7,267 0.14 % 169.26
prihodnje prih odnje 203,631 0.16 % 179.46 0 0 % 0 614 0.02 % 15.46 158 0.03 % 39.94 120,469 0.20 % 221.97 19,044 0.09 % 101.61 61,773 0.18 % 194.29 1,573 0.03 % 36.64
slovenskega sloven skega 202,094 0.16 % 178.10 0 0 % 0 398 0.01 % 10.02 407 0.09 % 102.89 112,062 0.18 % 206.48 22,466 0.11 % 119.87 62,505 0.18 % 196.59 4,256 0.08 % 99.13
dobro dobro 201,264 0.16 % 177.37 0 0 % 0 6,966 0.25 % 175.40 343 0.07 % 86.71 98,094 0.16 % 180.74 39,696 0.19 % 211.80 50,431 0.14 % 158.62 5,734 0.11 % 133.55
velike v elike 199,502 0.16 % 175.82 4 0.48 % 412.07 4,824 0.17 % 121.46 674 0.15 % 170.38 95,438 0.16 % 175.85 36,157 0.17 % 192.92 52,627 0.15 % 165.53 9,778 0.19 % 227.74
državni dr žavni 190,454 0.15 % 167.85 0 0 % 0 679 0.02 % 17.10 2,076 0.45 % 524.79 111,848 0.18 % 206.09 14,035 0.07 % 74.89 59,642 0.17 % 187.59 2,174 0.04 % 50.64
najboljši najb oljši 182,232 0.14 % 160.60 0 0 % 0 2,304 0.08 % 58.01 142 0.03 % 35.90 90,767 0.15 % 167.24 23,701 0.11 % 126.46 63,025 0.18 % 198.23 2,293 0.04 % 53.41
jasno jasno 178,267 0.14 % 157.11 1 0.12 % 103.02 8,873 0.32 % 223.41 581 0.13 % 146.87 82,213 0.14 % 151.48 29,396 0.14 % 156.85 52,757 0.15 % 165.93 4,446 0.09 % 103.55
slovensko slov ensko 172,365 0.14 % 151.90 0 0 % 0 362 0.01 % 9.11 270 0.06 % 68.25 96,938 0.16 % 178.61 20,514 0.10 % 109.46 51,003 0.14 % 160.42 3,278 0.06 % 76.35
nekdanji nek danji 170,878 0.14 % 150.59 0 0 % 0 922 0.03 % 23.21 58 0.01 % 14.66 84,369 0.14 % 155.46 15,884 0.08 % 84.75 68,299 0.19 % 214.82 1,346 0.03 % 31.35
evropski evr opski 170,666 0.14 % 150.41 0 0 % 0 191 0.01 % 4.81 328 0.07 % 82.91 86,760 0.14 % 159.86 15,777 0.08 % 84.18 64,420 0.18 % 202.62 3,190 0.06 % 74.30
glavni g lavni 169,954 0.14 % 149.78 0 0 % 0 2,389 0.09 % 60.15 442 0.10 % 111.73 85,037 0.14 % 156.69 24,126 0.12 % 128.73 52,387 0.15 % 164.77 5,573 0.11 % 129.80
večji večji 166,234 0.13 % 146.50 0 0 % 0 2,284 0.08 % 57.51 307 0.07 % 77.61 81,608 0.13 % 150.37 32,453 0.16 % 173.16 42,062 0.12 % 132.30 7,520 0.15 % 175.15
novem novem 165,059 0.13 % 145.47 0 0 % 0 1,201 0.04 % 30.24 183 0.04 % 46.26 94,756 0.15 % 174.59 22,827 0.11 % 121.80 44,218 0.13 % 139.08 1,874 0.04 % 43.65
največji naj večji 162,333 0.13 % 143.06 0 0 % 0 1,346 0.05 % 33.89 203 0.04 % 51.32 82,320 0.14 % 151.68 26,141 0.13 % 139.48 48,637 0.14 % 152.98 3,686 0.07 % 85.85

1.2.81 List of initial character-level 1-grams from adverb lemmas in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-word_parts-adverbs-lemmas-initial-1grams-taxonomy-entire.tsv
Lemma | Lemma (lower-case) | Initial part of the word | Rest of the word | Total absolute frequency of lemma | Percentage of all found lemmas | Total relative frequency (per million occurrences) | Absolute frequency [SSJ.T.K.N] | Percentage [SSJ.T.K.N] | Relative frequency [SSJ.T.K.N] | Absolute frequency [SSJ.T.K.L] | Percentage [SSJ.T.K.L] | Relative frequency [SSJ.T.K.L] | Absolute frequency [SSJ.T.D] | Percentage [SSJ.T.D] | Relative frequency [SSJ.T.D] | Absolute frequency [SSJ.T.P.C] | Percentage [SSJ.T.P.C] | Relative frequency [SSJ.T.P.C] | Absolute frequency [SSJ.T.P.R] | Percentage [SSJ.T.P.R] | Relative frequency [SSJ.T.P.R] | Absolute frequency [SSJ.I] | Percentage [SSJ.I] | Relative frequency [SSJ.I] | Absolute frequency [SSJ.T.K.S] | Percentage [SSJ.T.K.S] | Relative frequency [SSJ.T.K.S]
lahko lahko l ahko 3,488,074 5.43 % 3,074.02 44 8.85 % 4,532.81 118,552 3.93 % 2,985.01 17,112 9.02 % 4,325.73 1,472,009 4.94 %
2,712.27 791,767 6.41 % 4,224.61 883,055 5.35 % 2,777.44 205,535 8.48 % 4,787.15
tako tako t ako 3,289,402 5.12 % 2,898.93 31 6.24 % 3,193.57 157,077 5.21 % 3,955.03 11,067 5.84 % 2,797.62 1,532,883 5.14 % 2,824.44 627,155 5.08 % 3,346.29 830,732 5.04 % 2,612.87 130,457 5.38 % 3,038.50
več več v eč 1,895,190 2.95 % 1,670.22 4 0.81 % 412.07 35,879 1.19 % 903.39 3,863 2.04 % 976.52 889,908 2.99 % 1,639.71 302,261 2.45 % 1,612.77 609,107 3.69 % 1,915.80 54,168 2.23 % 1,261.64
nekaj nekaj n ekaj 1,478,636 2.30 % 1,303.11 17 3.42 % 1,751.31 75,312 2.50 % 1,896.27 2,638 1.39 % 666.86 706,346 2.37 % 1,301.49 292,468 2.37 % 1,560.51 361,517 2.19 % 1,137.07 40,338 1.66 % 939.52
zelo zelo z elo 1,441,970 2.24 % 1,270.80 2 0.40 % 206.04 45,177 1.50 % 1,137.51 3,480 1.83 % 879.71 645,584 2.17 % 1,189.53 323,837 2.62 % 1,727.89 366,622 2.22 % 1,153.12 57,268 2.36 % 1,333.84
bolj bolj b olj 1,262,289 1.96 % 1,112.45 13 2.62 % 1,339.24 42,788 1.42 % 1,077.36 2,639 1.39 % 667.11 596,437 2.00 % 1,098.97 266,960 2.16 % 1,424.41 297,200 1.80 % 934.77 56,252 2.32 % 1,310.18
zdaj zdaj z daj 1,082,531 1.68 % 954.03 0 0 % 0 71,531 2.37 % 1,801.07 2,616 1.38 % 661.30 531,590 1.78 % 979.49 150,379 1.22 % 802.37 308,065 1.87 % 968.94 18,350 0.76 % 427.39
vedno vedno v edno 1,075,127 1.67 % 947.50 1 0.20 % 103.02 54,794 1.82 % 1,379.65 2,037 1.07 % 514.93 473,977 1.59 % 873.33 227,298 1.84 % 1,212.79 277,237 1.68 % 871.98 39,783 1.64 % 926.59
kar kar k ar 1,071,938 1.67 % 944.69 0 0 % 0 39,622 1.31 % 997.64 1,731 0.91 % 437.58 542,754 1.82 % 1,000.06 222,883 1.81 % 1,189.23 244,821 1.48 % 770.03 20,127 0.83 % 468.78
kako kako k ako 1,005,454 1.56 % 886.10 2 0.40 % 206.04 90,504 3.00 % 2,278.79 2,958 1.56 % 747.75 424,401 1.42 % 781.99 198,869 1.61 % 1,061.10 237,842 1.44 % 748.08 50,878 2.10 % 1,185.01
veliko veliko v eliko 945,307 1.47 % 833.09 1 0.20 % 103.02 24,019 0.80 % 604.77 1,561 0.82 % 394.60 451,747 1.52 % 832.37 200,120 1.62 % 1,067.77 237,636 1.44 % 747.43 30,223 1.25 % 703.93
dobro dobro d obro 906,394 1.41 % 798.80 17 3.42 % 1,751.31 38,425 1.27 % 967.50 1,435 0.76 % 362.75 419,465 1.41 % 772.89 194,249 1.57 % 1,036.45 219,223 1.33 % 689.51 33,580 1.39 % 782.12
danes danes d anes 905,146 1.41 % 797.70 0 0 % 0 13,186 0.44 % 332.01 2,091 1.10 % 528.58 348,133 1.17 % 641.46 105,919 0.86 % 565.15 416,761 2.53 % 1,310.82 19,056 0.79 % 443.84
potem potem p otem 814,559 1.27 % 717.87 10 2.01 % 1,030.18 74,531 2.47 % 1,876.61 4,470 2.36 % 1,129.97 357,330 1.20 % 658.40 152,779 1.24 % 815.18 198,520 1.20 % 624.40 26,919 1.11 % 626.98
najbolj najbolj n ajbolj 780,725 1.22 % 688.05 1 0.20 % 103.02 13,768 0.46 % 346.66 1,191 0.63 % 301.07 377,064 1.26 % 694.77 156,552 1.27 % 835.31 207,955 1.26 % 654.07 24,194 1.00 % 563.51
treba treba t reba 715,315 1.11 % 630.40 2 0.40 % 206.04 22,139 0.73 % 557.44 3,300 1.74 % 834.20 348,254 1.17 % 641.68 131,260 1.06 % 700.36 182,173 1.10 % 572.98 28,187 1.16 % 656.51
skupaj skupaj s kupaj 614,538 0.96 % 541.59 21 4.22 % 2,163.39 23,547 0.78 % 592.89 1,545 0.81 % 390.56 284,334 0.95 % 523.90 117,286 0.95 % 625.80 170,035 1.03 % 534.80 17,770 0.73 % 413.88
letos letos l etos 606,763 0.94 % 534.74 0 0 % 0 765 0.03 % 19.26 280 0.15 % 70.78 362,999 1.22 % 668.85 61,361 0.50 % 327.40 180,244 1.09 % 566.91 1,114 0.05 % 25.95
manj manj m anj 605,930 0.94 % 534 1 0.20 % 103.02 10,236 0.34 % 257.73 1,249 0.66 % 315.73 293,350 0.98 % 540.52 115,502 0.94 % 616.28 163,127 0.99 % 513.08 22,465 0.93 % 523.24
nato nato n ato 588,918 0.92 % 519.01 52 10.46 % 5,356.96 28,600 0.95 % 720.12 1,074 0.57 % 271.50 236,089 0.79 % 435.01 94,258 0.76 % 502.93 201,513 1.22 % 633.81 27,332 1.13 % 636.59
res res r es 565,140 0.88 % 498.06 0 0 % 0 43,525 1.44 % 1,095.91 1,548 0.82 % 391.32 259,710 0.87 % 478.53 119,655 0.97 % 638.44 127,911 0.78 % 402.31 12,791 0.53 % 297.92
lani lani l ani 505,184 0.79 % 445.22 0 0 % 0 745 0.03 % 18.76 90 0.05 % 22.75 276,175 0.93 % 508.87 38,722 0.31 % 206.61 188,480 1.14 % 592.82 972 0.04 % 22.64
precej precej p recej 492,127 0.77 % 433.71 2 0.40 % 206.04 12,458 0.41 % 313.68 619 0.33 % 156.48 241,123 0.81 % 444.29 101,408 0.82 % 541.08 122,999 0.75 % 386.86 13,518 0.56 % 314.85
takrat takrat t akrat 444,872 0.69 % 392.06 0 0 % 0 20,788 0.69 % 523.42 1,482 0.78 % 374.63 201,503 0.68 % 371.28 88,896 0.72 % 474.32 118,816 0.72 % 373.71 13,387 0.55 % 311.80
tam tam t am 430,492 0.67 % 379.39 0 0 % 0 37,488 1.24 % 943.91 1,592 0.84 % 402.44 193,210 0.65 % 356 84,595 0.69 % 451.37 98,397 0.60 % 309.48 15,210 0.63 % 354.26
povsem povsem p ovsem 426,682 0.66 % 376.03 0 0 % 0 14,047 0.47 % 353.69 572 0.30 % 144.60 206,808 0.69 % 381.06 86,791 0.70 % 463.09 103,440 0.63 % 325.35 15,024 0.62 % 349.93
medtem medtem m edtem 423,755 0.66 % 373.45 6 1.21 % 618.11 18,625 0.62 % 468.96 716 0.38 % 181 185,207 0.62 % 341.26 60,849 0.49 % 324.67 146,062 0.89 % 459.40 12,290 0.51 % 286.25
malo malo m alo 420,328 0.65 % 370.43 29 5.83 % 2,987.53 31,608 1.05 % 795.86 1,133 0.60 % 286.41 185,259 0.62 % 341.35 97,676 0.79 % 521.17 89,437 0.54 % 281.30 15,186 0.63 % 353.70
vse vse v se 415,803 0.65 % 366.45 0 0 % 0 14,050 0.47 % 353.76 827 0.44 % 209.06 199,559 0.67 % 367.70 73,812 0.60 % 393.84 113,290 0.69 % 356.33 14,265 0.59 % 332.25
rad rad r ad 404,452 0.63 % 356.44 0 0 % 0 31,141 1.03 % 784.10 1,256 0.66 % 317.50 176,738 0.59 % 325.65 102,264 0.83 % 545.65 79,765 0.48 % 250.88 13,288 0.55 % 309.49
glede glede g lede 400,382 0.62 % 352.85 1 0.20 % 103.02 4,848 0.16 % 122.07 2,830 1.49 % 715.39 176,820 0.59 % 325.80 67,773 0.55 % 361.61 130,230 0.79 % 409.61 17,880 0.74 % 416.45
dovolj dovolj d ovolj 400,015 0.62 % 352.53 3 0.60 % 309.06 18,306 0.61 % 460.93 806 0.42 % 203.75 184,495 0.62 % 339.94 90,989 0.74 % 485.49 91,354 0.55 % 287.33 14,062 0.58 % 327.52

1.2.82 List of initial character-level 2-grams from adverb lemmas in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-word_parts-adverbs-lemmas-initial-2grams-taxonomy-entire.tsv
Lemma | Lemma (lower-case) | Initial part of the word | Rest of the word | Total absolute frequency of lemma | Percentage of all found lemmas | Total relative frequency (per million occurrences) | Absolute frequency [SSJ.T.K.N] | Percentage [SSJ.T.K.N] | Relative frequency [SSJ.T.K.N] | Absolute frequency [SSJ.T.K.L] | Percentage [SSJ.T.K.L] | Relative frequency [SSJ.T.K.L] | Absolute frequency [SSJ.T.D] | Percentage [SSJ.T.D] | Relative frequency [SSJ.T.D] | Absolute frequency [SSJ.T.P.C] | Percentage [SSJ.T.P.C] | Relative frequency [SSJ.T.P.C] | Absolute frequency [SSJ.T.P.R] | Percentage [SSJ.T.P.R] | Relative frequency [SSJ.T.P.R] | Absolute frequency [SSJ.I] | Percentage [SSJ.I] | Relative frequency [SSJ.I] | Absolute frequency [SSJ.T.K.S] | Percentage [SSJ.T.K.S] | Relative frequency [SSJ.T.K.S]
lahko lahko la hko 3,488,074 5.43 % 3,074.02 44 8.85 % 4,532.81 118,552 3.93 % 2,985.01 17,112 9.02 % 4,325.73 1,472,009 4.94 % 2,712.27 791,767 6.41 % 4,224.61 883,055 5.35 % 2,777.44 205,535 8.48 % 4,787.15
tako tako ta ko 3,289,402 5.12 % 2,898.93 31 6.24 % 3,193.57 157,077 5.21 % 3,955.03 11,067 5.84 % 2,797.62 1,532,883 5.14 % 2,824.44 627,155 5.08 % 3,346.29 830,732 5.04 % 2,612.87 130,457 5.38 % 3,038.50
več več ve č 1,895,190 2.95 % 1,670.22 4 0.81 % 412.07 35,879 1.19 % 903.39 3,863 2.04 % 976.52 889,908 2.99 % 1,639.71 302,261 2.45 % 1,612.77 609,107 3.69 % 1,915.80 54,168 2.23 % 1,261.64
nekaj nekaj ne kaj 1,478,636 2.30 % 1,303.11 17 3.42 % 1,751.31 75,312 2.50 % 1,896.27 2,638 1.39 % 666.86 706,346 2.37 % 1,301.49 292,468 2.37 % 1,560.51 361,517 2.19 % 1,137.07 40,338 1.66 % 939.52
zelo zelo ze lo 1,441,970 2.24 % 1,270.80 2 0.40 % 206.04 45,177 1.50 % 1,137.51 3,480 1.83 % 879.71 645,584 2.17 % 1,189.53 323,837 2.62 % 1,727.89 366,622 2.22 % 1,153.12 57,268 2.36 % 1,333.84
bolj bolj bo lj 1,262,289 1.96 % 1,112.45 13 2.62 % 1,339.24 42,788 1.42 % 1,077.36 2,639 1.39 % 667.11 596,437 2.00 % 1,098.97 266,960 2.16 % 1,424.41
297,200 1.80 % 934.77 56,252 2.32 % 1,310.18
zdaj zdaj zd aj 1,082,531 1.68 % 954.03 0 0 % 0 71,531 2.37 % 1,801.07 2,616 1.38 % 661.30 531,590 1.78 % 979.49 150,379 1.22 % 802.37 308,065 1.87 % 968.94 18,350 0.76 % 427.39
vedno vedno ve dno 1,075,127 1.67 % 947.50 1 0.20 % 103.02 54,794 1.82 % 1,379.65 2,037 1.07 % 514.93 473,977 1.59 % 873.33 227,298 1.84 % 1,212.79 277,237 1.68 % 871.98 39,783 1.64 % 926.59
kar kar ka r 1,071,938 1.67 % 944.69 0 0 % 0 39,622 1.31 % 997.64 1,731 0.91 % 437.58 542,754 1.82 % 1,000.06 222,883 1.81 % 1,189.23 244,821 1.48 % 770.03 20,127 0.83 % 468.78
kako kako ka ko 1,005,454 1.56 % 886.10 2 0.40 % 206.04 90,504 3.00 % 2,278.79 2,958 1.56 % 747.75 424,401 1.42 % 781.99 198,869 1.61 % 1,061.10 237,842 1.44 % 748.08 50,878 2.10 % 1,185.01
veliko veliko ve liko 945,307 1.47 % 833.09 1 0.20 % 103.02 24,019 0.80 % 604.77 1,561 0.82 % 394.60 451,747 1.52 % 832.37 200,120 1.62 % 1,067.77 237,636 1.44 % 747.43 30,223 1.25 % 703.93
dobro dobro do bro 906,394 1.41 % 798.80 17 3.42 % 1,751.31 38,425 1.27 % 967.50 1,435 0.76 % 362.75 419,465 1.41 % 772.89 194,249 1.57 % 1,036.45 219,223 1.33 % 689.51 33,580 1.39 % 782.12
danes danes da nes 905,146 1.41 % 797.70 0 0 % 0 13,186 0.44 % 332.01 2,091 1.10 % 528.58 348,133 1.17 % 641.46 105,919 0.86 % 565.15 416,761 2.53 % 1,310.82 19,056 0.79 % 443.84
potem potem po tem 814,559 1.27 % 717.87 10 2.01 % 1,030.18 74,531 2.47 % 1,876.61 4,470 2.36 % 1,129.97 357,330 1.20 % 658.40 152,779 1.24 % 815.18 198,520 1.20 % 624.40 26,919 1.11 % 626.98
najbolj najbolj na jbolj 780,725 1.22 % 688.05 1 0.20 % 103.02 13,768 0.46 % 346.66 1,191 0.63 % 301.07 377,064 1.26 % 694.77 156,552 1.27 % 835.31 207,955 1.26 % 654.07 24,194 1.00 % 563.51
treba treba tr eba 715,315 1.11 % 630.40 2 0.40 % 206.04 22,139 0.73 % 557.44 3,300 1.74 % 834.20 348,254 1.17 % 641.68 131,260 1.06 % 700.36 182,173 1.10 % 572.98 28,187 1.16 % 656.51
skupaj skupaj sk upaj 614,538 0.96 % 541.59 21 4.22 % 2,163.39 23,547 0.78 % 592.89 1,545 0.81 % 390.56 284,334 0.95 % 523.90 117,286 0.95 % 625.80 170,035 1.03 % 534.80 17,770 0.73 % 413.88
letos letos le tos 606,763 0.94 % 534.74 0 0 % 0 765 0.03 % 19.26 280 0.15 % 70.78 362,999 1.22 % 668.85 61,361 0.50 % 327.40 180,244 1.09 % 566.91 1,114 0.05 % 25.95
manj manj ma nj 605,930 0.94 % 534 1 0.20 % 103.02 10,236 0.34 % 257.73 1,249 0.66 % 315.73 293,350 0.98 % 540.52 115,502 0.94 % 616.28 163,127 0.99 % 513.08 22,465 0.93 % 523.24
nato nato na to 588,918 0.92 % 519.01 52 10.46 % 5,356.96 28,600 0.95 % 720.12 1,074 0.57 % 271.50 236,089 0.79 % 435.01 94,258 0.76 % 502.93 201,513 1.22 % 633.81 27,332 1.13 % 636.59
res res re s 565,140 0.88 % 498.06 0 0 % 0 43,525 1.44 % 1,095.91 1,548 0.82 % 391.32 259,710 0.87 % 478.53 119,655 0.97 % 638.44 127,911 0.78 % 402.31 12,791 0.53 % 297.92
lani lani la ni 505,184 0.79 % 445.22 0 0 % 0 745 0.03 % 18.76 90 0.05 % 22.75 276,175 0.93 % 508.87 38,722 0.31 % 206.61 188,480 1.14 % 592.82 972 0.04 % 22.64
precej precej pr ecej 492,127 0.77 % 433.71 2 0.40 % 206.04 12,458 0.41 % 313.68 619 0.33 % 156.48 241,123 0.81 % 444.29 101,408 0.82 % 541.08 122,999 0.75 % 386.86 13,518 0.56 % 314.85
takrat takrat ta krat 444,872 0.69 % 392.06 0 0 % 0 20,788 0.69 % 523.42 1,482 0.78 % 374.63 201,503 0.68 % 371.28 88,896 0.72 % 474.32 118,816 0.72 % 373.71 13,387 0.55 % 311.80
tam tam ta m 430,492 0.67 % 379.39 0 0 % 0 37,488 1.24 % 943.91 1,592 0.84 % 402.44 193,210 0.65 % 356 84,595 0.69 % 451.37 98,397 0.60 % 309.48 15,210 0.63 % 354.26
povsem povsem po vsem 426,682 0.66 % 376.03 0 0 % 0 14,047 0.47 % 353.69 572 0.30 % 144.60 206,808 0.69 % 381.06 86,791 0.70 % 463.09 103,440 0.63 % 325.35 15,024 0.62 % 349.93
medtem medtem me dtem 423,755 0.66 % 373.45 6 1.21 % 618.11 18,625 0.62 % 468.96 716 0.38 % 181 185,207 0.62 % 341.26 60,849 0.49 % 324.67 146,062 0.89 % 459.40 12,290 0.51 % 286.25
malo malo ma lo 420,328 0.65 % 370.43 29 5.83 % 2,987.53 31,608 1.05 % 795.86 1,133 0.60 % 286.41 185,259 0.62 % 341.35 97,676 0.79 % 521.17 89,437 0.54 % 281.30 15,186 0.63 % 353.70
vse vse vs e 415,803 0.65 % 366.45 0 0 % 0 14,050 0.47 % 353.76 827 0.44 % 209.06 199,559 0.67 % 367.70 73,812 0.60 % 393.84 113,290 0.69 % 356.33 14,265 0.59 % 332.25
rad rad ra d 404,452 0.63 % 356.44 0 0 % 0 31,141 1.03 % 784.10 1,256 0.66 % 317.50 176,738 0.59 % 325.65 102,264 0.83 % 545.65 79,765 0.48 % 250.88 13,288 0.55 % 309.49
glede glede gl ede 400,382 0.62 % 352.85 1 0.20 % 103.02 4,848 0.16 % 122.07 2,830 1.49 % 715.39 176,820 0.59 % 325.80 67,773 0.55 % 361.61 130,230 0.79 % 409.61 17,880 0.74 % 416.45
dovolj dovolj do volj 400,015 0.62 % 352.53 3 0.60 % 309.06 18,306 0.61 % 460.93 806 0.42 % 203.75 184,495 0.62 % 339.94 90,989 0.74 % 485.49 91,354 0.55 % 287.33 14,062 0.58 % 327.52

1.2.83 List of initial character-level 3-grams from adverb lemmas in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-word_parts-adverbs-lemmas-initial-3grams-taxonomy-entire.tsv
Lemma | Lemma (lower-case) | Initial part of the word | Rest of the word | Total absolute frequency of lemma | Percentage of all found lemmas | Total relative frequency (per million occurrences) | Absolute frequency [SSJ.T.K.N] | Percentage [SSJ.T.K.N] | Relative frequency [SSJ.T.K.N] | Absolute frequency [SSJ.T.K.L] | Percentage [SSJ.T.K.L] | Relative frequency [SSJ.T.K.L] | Absolute frequency [SSJ.T.D] | Percentage [SSJ.T.D] | Relative frequency [SSJ.T.D] | Absolute frequency [SSJ.T.P.C] | Percentage [SSJ.T.P.C] | Relative frequency [SSJ.T.P.C] | Absolute frequency [SSJ.T.P.R] | Percentage [SSJ.T.P.R] | Relative frequency [SSJ.T.P.R] | Absolute frequency [SSJ.I] | Percentage [SSJ.I] | Relative frequency [SSJ.I] | Absolute frequency [SSJ.T.K.S] | Percentage [SSJ.T.K.S] | Relative frequency [SSJ.T.K.S]
lahko lahko lah ko 3,488,074 5.45 % 3,074.02 44 8.87 % 4,532.81 118,552 3.96 % 2,985.01 17,112 9.11 % 4,325.73 1,472,009 4.96 % 2,712.27 791,767 6.45 % 4,224.61 883,055 5.37 % 2,777.44 205,535 8.54 %
4,787.15 tako tako tak o 3,289,402 5.14 % 2,898.93 31 6.25 % 3,193.57 157,077 5.25 % 3,955.03 11,067 5.89 % 2,797.62 1,532,883 5.17 % 2,824.44 627,155 5.11 % 3,346.29 830,732 5.05 % 2,612.87 130,457 5.42 % 3,038.50 več več več 1,895,190 2.96 % 1,670.22 4 0.81 % 412.07 35,879 1.20 % 903.39 3,863 2.06 % 976.52 889,908 3.00 % 1,639.71 302,261 2.46 % 1,612.77 609,107 3.71 % 1,915.80 54,168 2.25 % 1,261.64 nekaj nekaj nek aj 1,478,636 2.31 % 1,303.11 17 3.43 % 1,751.31 75,312 2.52 % 1,896.27 2,638 1.41 % 666.86 706,346 2.38 % 1,301.49 292,468 2.38 % 1,560.51 361,517 2.20 % 1,137.07 40,338 1.68 % 939.52 zelo zelo zel o 1,441,970 2.25 % 1,270.80 2 0.40 % 206.04 45,177 1.51 % 1,137.51 3,480 1.85 % 879.71 645,584 2.18 % 1,189.53 323,837 2.64 % 1,727.89 366,622 2.23 % 1,153.12 57,268 2.38 % 1,333.84 bolj bolj bol j 1,262,289 1.97 % 1,112.45 13 2.62 % 1,339.24 42,788 1.43 % 1,077.36 2,639 1.41 % 667.11 596,437 2.01 % 1,098.97 266,960 2.17 % 1,424.41 297,200 1.81 % 934.77 56,252 2.34 % 1,310.18 zdaj zdaj zda j 1,082,531 1.69 % 954.03 0 0 % 0 71,531 2.39 % 1,801.07 2,616 1.39 % 661.30 531,590 1.79 % 979.49 150,379 1.23 % 802.37 308,065 1.88 % 968.94 18,350 0.76 % 427.39 vedno vedno ved no 1,075,127 1.68 % 947.50 1 0.20 % 103.02 54,794 1.83 % 1,379.65 2,037 1.08 % 514.93 473,977 1.60 % 873.33 227,298 1.85 % 1,212.79 277,237 1.69 % 871.98 39,783 1.65 % 926.59 kar kar kar 1,071,938 1.68 % 944.69 0 0 % 0 39,622 1.32 % 997.64 1,731 0.92 % 437.58 542,754 1.83 % 1,000.06 222,883 1.81 % 1,189.23 244,821 1.49 % 770.03 20,127 0.84 % 468.78 kako kako kak o 1,005,454 1.57 % 886.10 2 0.40 % 206.04 90,504 3.02 % 2,278.79 2,958 1.57 % 747.75 424,401 1.43 % 781.99 198,869 1.62 % 1,061.10 237,842 1.45 % 748.08 50,878 2.11 % 1,185.01 veliko veliko vel iko 945,307 1.48 % 833.09 1 0.20 % 103.02 24,019 0.80 % 604.77 1,561 0.83 % 394.60 451,747 1.52 % 832.37 200,120 1.63 % 1,067.77 237,636 1.45 % 747.43 30,223 1.25 % 703.93 dobro dobro dob ro 906,394 1.42 % 798.80 17 3.43 % 1,751.31 38,425 1.28 % 
967.50 1,435 0.76 % 362.75 419,465 1.41 % 772.89 194,249 1.58 % 1,036.45 219,223 1.33 % 689.51 33,580 1.39 % 782.12 danes danes dan es 905,146 1.42 % 797.70 0 0 % 0 13,186 0.44 % 332.01 2,091 1.11 % 528.58 348,133 1.17 % 641.46 105,919 0.86 % 565.15 416,761 2.54 % 1,310.82 19,056 0.79 % 443.84 potem potem pot em 814,559 1.27 % 717.87 10 2.02 % 1,030.18 74,531 2.49 % 1,876.61 4,470 2.38 % 1,129.97 357,330 1.20 % 658.40 152,779 1.24 % 815.18 198,520 1.21 % 624.40 26,919 1.12 % 626.98 najbolj najbolj naj bolj 780,725 1.22 % 688.05 1 0.20 % 103.02 13,768 0.46 % 346.66 1,191 0.63 % 301.07 377,064 1.27 % 694.77 156,552 1.27 % 835.31 207,955 1.26 % 654.07 24,194 1.00 % 563.51 treba treba tre ba 715,315 1.12 % 630.40 2 0.40 % 206.04 22,139 0.74 % 557.44 3,300 1.76 % 834.20 348,254 1.17 % 641.68 131,260 1.07 % 700.36 182,173 1.11 % 572.98 28,187 1.17 % 656.51 skupaj skupaj sku paj 614,538 0.96 % 541.59 21 4.23 % 2,163.39 23,547 0.79 % 592.89 1,545 0.82 % 390.56 284,334 0.96 % 523.90 117,286 0.95 % 625.80 170,035 1.03 % 534.80 17,770 0.74 % 413.88 letos letos let os 606,763 0.95 % 534.74 0 0 % 0 765 0.03 % 19.26 280 0.15 % 70.78 362,999 1.22 % 668.85 61,361 0.50 % 327.40 180,244 1.10 % 566.91 1,114 0.05 % 25.95 manj manj man j 605,930 0.95 % 534 1 0.20 % 103.02 10,236 0.34 % 257.73 1,249 0.67 % 315.73 293,350 0.99 % 540.52 115,502 0.94 % 616.28 163,127 0.99 % 513.08 22,465 0.93 % 523.24 nato nato nat o 588,918 0.92 % 519.01 52 10.48 % 5,356.96 28,600 0.95 % 720.12 1,074 0.57 % 271.50 236,089 0.80 % 435.01 94,258 0.77 % 502.93 201,513 1.23 % 633.81 27,332 1.14 % 636.59 res res res 565,140 0.88 % 498.06 0 0 % 0 43,525 1.45 % 1,095.91 1,548 0.82 % 391.32 259,710 0.88 % 478.53 119,655 0.97 % 638.44 127,911 0.78 % 402.31 12,791 0.53 % 297.92 lani lani lan i 505,184 0.79 % 445.22 0 0 % 0 745 0.03 % 18.76 90 0.05 % 22.75 276,175 0.93 % 508.87 38,722 0.32 % 206.61 188,480 1.15 % 592.82 972 0.04 % 22.64 precej precej pre cej 492,127 0.77 % 433.71 2 0.40 % 206.04 12,458 0.42 % 313.68 
619 0.33 % 156.48 241,123 0.81 % 444.29 101,408 0.83 % 541.08 122,999 0.75 % 386.86 13,518 0.56 % 314.85
takrat takrat tak rat 444,872 0.69 % 392.06 0 0 % 0 20,788 0.69 % 523.42 1,482 0.79 % 374.63 201,503 0.68 % 371.28 88,896 0.72 % 474.32 118,816 0.72 % 373.71 13,387 0.56 % 311.80
tam tam tam 430,492 0.67 % 379.39 0 0 % 0 37,488 1.25 % 943.91 1,592 0.85 % 402.44 193,210 0.65 % 356 84,595 0.69 % 451.37 98,397 0.60 % 309.48 15,210 0.63 % 354.26
povsem povsem pov sem 426,682 0.67 % 376.03 0 0 % 0 14,047 0.47 % 353.69 572 0.30 % 144.60 206,808 0.70 % 381.06 86,791 0.71 % 463.09 103,440 0.63 % 325.35 15,024 0.62 % 349.93
medtem medtem med tem 423,755 0.66 % 373.45 6 1.21 % 618.11 18,625 0.62 % 468.96 716 0.38 % 181 185,207 0.62 % 341.26 60,849 0.50 % 324.67 146,062 0.89 % 459.40 12,290 0.51 % 286.25
malo malo mal o 420,328 0.66 % 370.43 29 5.85 % 2,987.53 31,608 1.06 % 795.86 1,133 0.60 % 286.41 185,259 0.62 % 341.35 97,676 0.80 % 521.17 89,437 0.54 % 281.30 15,186 0.63 % 353.70
vse vse vse 415,803 0.65 % 366.45 0 0 % 0 14,050 0.47 % 353.76 827 0.44 % 209.06 199,559 0.67 % 367.70 73,812 0.60 % 393.84 113,290 0.69 % 356.33 14,265 0.59 % 332.25
rad rad rad 404,452 0.63 % 356.44 0 0 % 0 31,141 1.04 % 784.10 1,256 0.67 % 317.50 176,738 0.60 % 325.65 102,264 0.83 % 545.65 79,765 0.48 % 250.88 13,288 0.55 % 309.49
glede glede gle de 400,382 0.63 % 352.85 1 0.20 % 103.02 4,848 0.16 % 122.07 2,830 1.51 % 715.39 176,820 0.60 % 325.80 67,773 0.55 % 361.61 130,230 0.79 % 409.61 17,880 0.74 % 416.45
dovolj dovolj dov olj 400,015 0.62 % 352.53 3 0.60 % 309.06 18,306 0.61 % 460.93 806 0.43 % 203.75 184,495 0.62 % 339.94 90,989 0.74 % 485.49 91,354 0.56 % 287.33 14,062 0.58 % 327.52

1.2.84 List of initial character-level 4-grams from adverb lemmas in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-word_parts-adverbs-lemmas-initial-4grams-taxonomy-entire.tsv
Lemma; Lemma (lower-case); Initial part of the word; Rest of the word; Total absolute frequency of lemma; Percentage of all found lemmas; Total relative frequency (per million occurrences); Absolute frequency [SSJ.T.K.N]; Percentage [SSJ.T.K.N]; Relative frequency [SSJ.T.K.N]; Absolute frequency [SSJ.T.K.L]; Percentage [SSJ.T.K.L]; Relative frequency [SSJ.T.K.L]; Absolute frequency [SSJ.T.D]; Percentage [SSJ.T.D]; Relative frequency [SSJ.T.D]; Absolute frequency [SSJ.T.P.C]; Percentage [SSJ.T.P.C]; Relative frequency [SSJ.T.P.C]; Absolute frequency [SSJ.T.P.R]; Percentage [SSJ.T.P.R]; Relative frequency [SSJ.T.P.R]; Absolute frequency [SSJ.I]; Percentage [SSJ.I]; Relative frequency [SSJ.I]; Absolute frequency [SSJ.T.K.S]; Percentage [SSJ.T.K.S]; Relative frequency [SSJ.T.K.S]
lahko lahko lahk o 3,488,074 6.09 % 3,074.02 44 9.22 % 4,532.81 118,552 4.46 % 2,985.01 17,112 9.97 % 4,325.73 1,472,009 5.55 % 2,712.27 791,767 7.20 % 4,224.61 883,055 5.99 % 2,777.44 205,535 9.29 % 4,787.15
tako tako tako 3,289,402 5.74 % 2,898.93 31 6.50 % 3,193.57 157,077 5.92 % 3,955.03 11,067 6.45 % 2,797.62 1,532,883 5.78 % 2,824.44 627,155 5.71 % 3,346.29 830,732 5.64 % 2,612.87 130,457 5.90 % 3,038.50
nekaj nekaj neka j 1,478,636 2.58 % 1,303.11 17 3.56 % 1,751.31 75,312 2.84 % 1,896.27 2,638 1.54 % 666.86 706,346 2.66 % 1,301.49 292,468 2.66 % 1,560.51 361,517 2.45 % 1,137.07 40,338 1.82 % 939.52
zelo zelo zelo 1,441,970 2.52 % 1,270.80 2 0.42 % 206.04 45,177 1.70 % 1,137.51 3,480 2.03 % 879.71 645,584 2.43 % 1,189.53 323,837 2.95 % 1,727.89 366,622 2.49 % 1,153.12 57,268 2.59 % 1,333.84
bolj bolj bolj 1,262,289 2.20 % 1,112.45 13 2.73 % 1,339.24 42,788 1.61 % 1,077.36 2,639 1.54 % 667.11 596,437 2.25 % 1,098.97 266,960 2.43 % 1,424.41 297,200 2.02 % 934.77 56,252 2.54 % 1,310.18
zdaj zdaj zdaj 1,082,531 1.89 % 954.03 0 0 % 0 71,531 2.69 % 1,801.07 2,616 1.52 % 661.30 531,590 2.00 % 979.49 150,379 1.37 % 802.37 308,065 2.09 % 968.94 18,350 0.83 % 427.39
vedno vedno vedn o 1,075,127 1.88 % 947.50 1 0.21 % 103.02 54,794
2.06 % 1,379.65 2,037 1.19 % 514.93 473,977 1.79 % 873.33 227,298 2.07 % 1,212.79 277,237 1.88 % 871.98 39,783 1.80 % 926.59 kako kako kako 1,005,454 1.75 % 886.10 2 0.42 % 206.04 90,504 3.41 % 2,278.79 2,958 1.72 % 747.75 424,401 1.60 % 781.99 198,869 1.81 % 1,061.10 237,842 1.61 % 748.08 50,878 2.30 % 1,185.01 veliko veliko veli ko 945,307 1.65 % 833.09 1 0.21 % 103.02 24,019 0.90 % 604.77 1,561 0.91 % 394.60 451,747 1.70 % 832.37 200,120 1.82 % 1,067.77 237,636 1.61 % 747.43 30,223 1.37 % 703.93 dobro dobro dobr o 906,394 1.58 % 798.80 17 3.56 % 1,751.31 38,425 1.45 % 967.50 1,435 0.84 % 362.75 419,465 1.58 % 772.89 194,249 1.77 % 1,036.45 219,223 1.49 % 689.51 33,580 1.52 % 782.12 danes danes dane s 905,146 1.58 % 797.70 0 0 % 0 13,186 0.50 % 332.01 2,091 1.22 % 528.58 348,133 1.31 % 641.46 105,919 0.96 % 565.15 416,761 2.83 % 1,310.82 19,056 0.86 % 443.84 potem potem pote m 814,559 1.42 % 717.87 10 2.10 % 1,030.18 74,531 2.81 % 1,876.61 4,470 2.60 % 1,129.97 357,330 1.35 % 658.40 152,779 1.39 % 815.18 198,520 1.35 % 624.40 26,919 1.22 % 626.98 najbolj najbolj najb olj 780,725 1.36 % 688.05 1 0.21 % 103.02 13,768 0.52 % 346.66 1,191 0.69 % 301.07 377,064 1.42 % 694.77 156,552 1.42 % 835.31 207,955 1.41 % 654.07 24,194 1.09 % 563.51 treba treba treb a 715,315 1.25 % 630.40 2 0.42 % 206.04 22,139 0.83 % 557.44 3,300 1.92 % 834.20 348,254 1.31 % 641.68 131,260 1.19 % 700.36 182,173 1.24 % 572.98 28,187 1.27 % 656.51 skupaj skupaj skup aj 614,538 1.07 % 541.59 21 4.40 % 2,163.39 23,547 0.89 % 592.89 1,545 0.90 % 390.56 284,334 1.07 % 523.90 117,286 1.07 % 625.80 170,035 1.15 % 534.80 17,770 0.80 % 413.88 letos letos leto s 606,763 1.06 % 534.74 0 0 % 0 765 0.03 % 19.26 280 0.16 % 70.78 362,999 1.37 % 668.85 61,361 0.56 % 327.40 180,244 1.22 % 566.91 1,114 0.05 % 25.95 manj manj manj 605,930 1.06 % 534 1 0.21 % 103.02 10,236 0.39 % 257.73 1,249 0.73 % 315.73 293,350 1.11 % 540.52 115,502 1.05 % 616.28 163,127 1.11 % 513.08 22,465 1.01 % 523.24 nato nato nato 588,918 
1.03 % 519.01 52 10.90 % 5,356.96 28,600 1.08 % 720.12 1,074 0.63 % 271.50 236,089 0.89 % 435.01 94,258 0.86 % 502.93 201,513 1.37 % 633.81 27,332 1.24 % 636.59 lani lani lani 505,184 0.88 % 445.22 0 0 % 0 745 0.03 % 18.76 90 0.05 % 22.75 276,175 1.04 % 508.87 38,722 0.35 % 206.61 188,480 1.28 % 592.82 972 0.04 % 22.64 precej precej prec ej 492,127 0.86 % 433.71 2 0.42 % 206.04 12,458 0.47 % 313.68 619 0.36 % 156.48 241,123 0.91 % 444.29 101,408 0.92 % 541.08 122,999 0.83 % 386.86 13,518 0.61 % 314.85 takrat takrat takr at 444,872 0.78 % 392.06 0 0 % 0 20,788 0.78 % 523.42 1,482 0.86 % 374.63 201,503 0.76 % 371.28 88,896 0.81 % 474.32 118,816 0.81 % 373.71 13,387 0.60 % 311.80 povsem povsem povs em 426,682 0.74 % 376.03 0 0 % 0 14,047 0.53 % 353.69 572 0.33 % 144.60 206,808 0.78 % 381.06 86,791 0.79 % 463.09 103,440 0.70 % 325.35 15,024 0.68 % 349.93 medtem medtem medt em 423,755 0.74 % 373.45 6 1.26 % 618.11 18,625 0.70 % 468.96 716 0.42 % 181 185,207 0.70 % 341.26 60,849 0.55 % 324.67 146,062 0.99 % 459.40 12,290 0.56 % 286.25 malo malo malo 420,328 0.73 % 370.43 29 6.08 % 2,987.53 31,608 1.19 % 795.86 1,133 0.66 % 286.41 185,259 0.70 % 341.35 97,676 0.89 % 521.17 89,437 0.61 % 281.30 15,186 0.69 % 353.70 glede glede gled e 400,382 0.70 % 352.85 1 0.21 % 103.02 4,848 0.18 % 122.07 2,830 1.65 % 715.39 176,820 0.67 % 325.80 67,773 0.62 % 361.61 130,230 0.88 % 409.61 17,880 0.81 % 416.45 dovolj dovolj dovo lj 400,015 0.70 % 352.53 3 0.63 % 309.06 18,306 0.69 % 460.93 806 0.47 % 203.75 184,495 0.70 % 339.94 90,989 0.83 % 485.49 91,354 0.62 % 287.33 14,062 0.64 % 327.52 spet spet spet 394,360 0.69 % 347.55 0 0 % 0 41,209 1.55 % 1,037.60 1,269 0.74 % 320.79 187,447 0.71 % 345.38 68,686 0.62 % 366.49 83,350 0.57 % 262.16 12,399 0.56 % 288.79 prej prej prej 381,934 0.67 % 336.60 1 0.21 % 103.02 19,064 0.72 % 480.01 1,381 0.81 % 349.10 186,379 0.70 % 343.42 69,611 0.63 % 371.42 93,050 0.63 % 292.67 12,448 0.56 % 289.93 naprej naprej napr ej 377,483 0.66 % 332.67 3 0.63 % 
309.06 25,693 0.97 % 646.92 1,567 0.91 % 396.12 169,016 0.64 % 311.42 62,691 0.57 % 334.50 105,317 0.71 % 331.25 13,196 0.60 % 307.35
včeraj včeraj včer aj 354,001 0.62 % 311.98 0 0 % 0 3,279 0.12 % 82.56 323 0.19 % 81.65 279,239 1.05 % 514.52 5,457 0.05 % 29.12 65,131 0.44 % 204.85 572 0.03 % 13.32
hitro hitro hitr o 353,030 0.62 % 311.12 3 0.63 % 309.06 19,800 0.75 % 498.54 751 0.44 % 189.84 142,631 0.54 % 262.81 82,662 0.75 % 441.06 92,138 0.62 % 289.80 15,045 0.68 % 350.42
torej torej tore j 350,715 0.61 % 309.08 0 0 % 0 13,285 0.50 % 334.50 1,398 0.81 % 353.40 169,569 0.64 % 312.44 78,027 0.71 % 416.33 70,358 0.48 % 221.29 18,078 0.82 % 421.06

1.2.85 List of initial character-level 5-grams from adverb lemmas in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-word_parts-adverbs-lemmas-initial-5grams-taxonomy-entire.tsv
Lemma; Lemma (lower-case); Initial part of the word; Rest of the word; Total absolute frequency of lemma; Percentage of all found lemmas; Total relative frequency (per million occurrences); Absolute frequency [SSJ.T.K.N]; Percentage [SSJ.T.K.N]; Relative frequency [SSJ.T.K.N]; Absolute frequency [SSJ.T.K.L]; Percentage [SSJ.T.K.L]; Relative frequency [SSJ.T.K.L]; Absolute frequency [SSJ.T.D]; Percentage [SSJ.T.D]; Relative frequency [SSJ.T.D]; Absolute frequency [SSJ.T.P.C]; Percentage [SSJ.T.P.C]; Relative frequency [SSJ.T.P.C]; Absolute frequency [SSJ.T.P.R]; Percentage [SSJ.T.P.R]; Relative frequency [SSJ.T.P.R]; Absolute frequency [SSJ.I]; Percentage [SSJ.I]; Relative frequency [SSJ.I]; Absolute frequency [SSJ.T.K.S]; Percentage [SSJ.T.K.S]; Relative frequency [SSJ.T.K.S]
lahko lahko lahko 3,488,074 7.83 % 3,074.02 44 13.02 % 4,532.81 118,552 5.93 % 2,985.01 17,112 12.39 % 4,325.73 1,472,009 7.14 % 2,712.27 791,767 9.25 % 4,224.61 883,055 7.68 % 2,777.44 205,535 11.77 % 4,787.15
nekaj nekaj nekaj 1,478,636 3.32 % 1,303.11 17 5.03 % 1,751.31 75,312 3.77 %
1,896.27 2,638 1.91 % 666.86 706,346 3.43 % 1,301.49 292,468 3.42 % 1,560.51 361,517 3.14 % 1,137.07 40,338 2.31 % 939.52 vedno vedno vedno 1,075,127 2.41 % 947.50 1 0.30 % 103.02 54,794 2.74 % 1,379.65 2,037 1.48 % 514.93 473,977 2.30 % 873.33 227,298 2.65 % 1,212.79 277,237 2.41 % 871.98 39,783 2.28 % 926.59 veliko veliko velik o 945,307 2.12 % 833.09 1 0.30 % 103.02 24,019 1.20 % 604.77 1,561 1.13 % 394.60 451,747 2.19 % 832.37 200,120 2.34 % 1,067.77 237,636 2.07 % 747.43 30,223 1.73 % 703.93 dobro dobro dobro 906,394 2.03 % 798.80 17 5.03 % 1,751.31 38,425 1.92 % 967.50 1,435 1.04 % 362.75 419,465 2.04 % 772.89 194,249 2.27 % 1,036.45 219,223 1.91 % 689.51 33,580 1.92 % 782.12 danes danes danes 905,146 2.03 % 797.70 0 0 % 0 13,186 0.66 % 332.01 2,091 1.51 % 528.58 348,133 1.69 % 641.46 105,919 1.24 % 565.15 416,761 3.62 % 1,310.82 19,056 1.09 % 443.84 potem potem potem 814,559 1.83 % 717.87 10 2.96 % 1,030.18 74,531 3.73 % 1,876.61 4,470 3.24 % 1,129.97 357,330 1.73 % 658.40 152,779 1.78 % 815.18 198,520 1.73 % 624.40 26,919 1.54 % 626.98 najbolj najbolj najbo lj 780,725 1.75 % 688.05 1 0.30 % 103.02 13,768 0.69 % 346.66 1,191 0.86 % 301.07 377,064 1.83 % 694.77 156,552 1.83 % 835.31 207,955 1.81 % 654.07 24,194 1.39 % 563.51 treba treba treba 715,315 1.60 % 630.40 2 0.59 % 206.04 22,139 1.11 % 557.44 3,300 2.39 % 834.20 348,254 1.69 % 641.68 131,260 1.53 % 700.36 182,173 1.58 % 572.98 28,187 1.61 % 656.51 skupaj skupaj skupa j 614,538 1.38 % 541.59 21 6.21 % 2,163.39 23,547 1.18 % 592.89 1,545 1.12 % 390.56 284,334 1.38 % 523.90 117,286 1.37 % 625.80 170,035 1.48 % 534.80 17,770 1.02 % 413.88 letos letos letos 606,763 1.36 % 534.74 0 0 % 0 765 0.04 % 19.26 280 0.20 % 70.78 362,999 1.76 % 668.85 61,361 0.72 % 327.40 180,244 1.57 % 566.91 1,114 0.06 % 25.95 precej precej prece j 492,127 1.10 % 433.71 2 0.59 % 206.04 12,458 0.62 % 313.68 619 0.45 % 156.48 241,123 1.17 % 444.29 101,408 1.18 % 541.08 122,999 1.07 % 386.86 13,518 0.77 % 314.85 takrat takrat takra t 
444,872 1.00 % 392.06 0 0 % 0 20,788 1.04 % 523.42 1,482 1.07 % 374.63 201,503 0.98 % 371.28 88,896 1.04 % 474.32 118,816 1.03 % 373.71 13,387 0.77 % 311.80 povsem povsem povse m 426,682 0.96 % 376.03 0 0 % 0 14,047 0.70 % 353.69 572 0.41 % 144.60 206,808 1.00 % 381.06 86,791 1.01 % 463.09 103,440 0.90 % 325.35 15,024 0.86 % 349.93 medtem medtem medte m 423,755 0.95 % 373.45 6 1.77 % 618.11 18,625 0.93 % 468.96 716 0.52 % 181 185,207 0.90 % 341.26 60,849 0.71 % 324.67 146,062 1.27 % 459.40 12,290 0.70 % 286.25 glede glede glede 400,382 0.90 % 352.85 1 0.30 % 103.02 4,848 0.24 % 122.07 2,830 2.05 % 715.39 176,820 0.86 % 325.80 67,773 0.79 % 361.61 130,230 1.13 % 409.61 17,880 1.02 % 416.45 dovolj dovolj dovol j 400,015 0.90 % 352.53 3 0.89 % 309.06 18,306 0.92 % 460.93 806 0.58 % 203.75 184,495 0.90 % 339.94 90,989 1.06 % 485.49 91,354 0.79 % 287.33 14,062 0.81 % 327.52 naprej naprej napre j 377,483 0.85 % 332.67 3 0.89 % 309.06 25,693 1.29 % 646.92 1,567 1.14 % 396.12 169,016 0.82 % 311.42 62,691 0.73 % 334.50 105,317 0.92 % 331.25 13,196 0.76 % 307.35 včeraj včeraj včera j 354,001 0.80 % 311.98 0 0 % 0 3,279 0.16 % 82.56 323 0.23 % 81.65 279,239 1.35 % 514.52 5,457 0.06 % 29.12 65,131 0.57 % 204.85 572 0.03 % 13.32 hitro hitro hitro 353,030 0.79 % 311.12 3 0.89 % 309.06 19,800 0.99 % 498.54 751 0.54 % 189.84 142,631 0.69 % 262.81 82,662 0.96 % 441.06 92,138 0.80 % 289.80 15,045 0.86 % 350.42 torej torej torej 350,715 0.79 % 309.08 0 0 % 0 13,285 0.67 % 334.50 1,398 1.01 % 353.40 169,569 0.82 % 312.44 78,027 0.91 % 416.33 70,358 0.61 % 221.29 18,078 1.03 % 421.06 najprej najprej najpr ej 346,336 0.78 % 305.22 1 0.30 % 103.02 12,918 0.65 % 325.26 1,024 0.74 % 258.86 166,148 0.81 % 306.14 67,446 0.79 % 359.87 83,679 0.73 % 263.19 15,120 0.87 % 352.16 nikoli nikoli nikol i 342,994 0.77 % 302.28 0 0 % 0 35,030 1.75 % 882.02 811 0.59 % 205.01 139,253 0.68 % 256.58 74,199 0.87 % 395.90 80,494 0.70 % 253.17 13,207 0.76 % 307.61 hkrati hkrati hkrat i 328,962 0.74 % 289.91 
0 0 % 0 6,504 0.33 % 163.76 764 0.55 % 193.13 159,548 0.77 % 293.98 60,655 0.71 % 323.64 85,984 0.75 % 270.44 15,507 0.89 % 361.18
verjetno verjetno verje tno 328,520 0.74 % 289.52 0 0 % 0 11,371 0.57 % 286.31 874 0.63 % 220.94 159,262 0.77 % 293.45 62,591 0.73 % 333.97 82,117 0.71 % 258.28 12,305 0.70 % 286.60
toliko toliko tolik o 324,313 0.73 % 285.82 13 3.85 % 1,339.24 16,859 0.84 % 424.49 1,101 0.80 % 278.32 159,180 0.77 % 293.30 66,980 0.78 % 357.38 69,370 0.60 % 218.19 10,810 0.62 % 251.78
zakaj zakaj zakaj 323,693 0.73 % 285.27 0 0 % 0 28,271 1.42 % 711.83 1,138 0.82 % 287.67 143,844 0.70 % 265.04 66,510 0.78 % 354.88 70,625 0.61 % 222.13 13,305 0.76 % 309.89
dolgo dolgo dolgo 311,120 0.70 % 274.19 2 0.59 % 206.04 20,159 1.01 % 507.58 917 0.66 % 231.81 141,751 0.69 % 261.19 62,078 0.72 % 331.23 75,292 0.66 % 236.81 10,921 0.62 % 254.36
tokrat tokrat tokra t 307,823 0.69 % 271.28 0 0 % 0 6,904 0.35 % 173.84 352 0.26 % 88.98 167,344 0.81 % 308.34 46,175 0.54 % 246.37 85,275 0.74 % 268.21 1,773 0.10 % 41.30
približno približno pribl ižno 304,422 0.68 % 268.29 17 5.03 % 1,751.31 3,716 0.19 % 93.56 649 0.47 % 164.06 152,500 0.74 % 280.99 55,444 0.65 % 295.83 80,811 0.70 % 254.17 11,285 0.65 % 262.84
nekoliko nekoliko nekol iko 302,982 0.68 % 267.02 3 0.89 % 309.06 8,364 0.42 % 210.60 527 0.38 % 133.22 143,420 0.70 % 264.26 67,728 0.79 % 361.37 71,679 0.62 % 225.45 11,261 0.65 % 262.28
posebej posebej poseb ej 301,006 0.68 % 265.28 3 0.89 % 309.06 6,382 0.32 % 160.69 1,082 0.78 % 273.52 135,912 0.66 % 250.43 68,691 0.80 % 366.51 75,735 0.66 % 238.21 13,201 0.76 % 307.47

1.2.86 List of final character-level 1-grams from adverb lemmas in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-word_parts-adverbs-lemmas-final-1grams-taxonomy-entire.tsv
Lemma; Lemma (lower-case); Rest of the word; Final part of the word; Total absolute frequency of lemma; Percentage of all found lemmas; Total relative frequency (per million occurrences); Absolute frequency [SSJ.T.K.N]; Percentage [SSJ.T.K.N]; Relative frequency [SSJ.T.K.N]; Absolute frequency [SSJ.T.K.L]; Percentage [SSJ.T.K.L]; Relative frequency [SSJ.T.K.L]; Absolute frequency [SSJ.T.D]; Percentage [SSJ.T.D]; Relative frequency [SSJ.T.D]; Absolute frequency [SSJ.T.P.C]; Percentage [SSJ.T.P.C]; Relative frequency [SSJ.T.P.C]; Absolute frequency [SSJ.T.P.R]; Percentage [SSJ.T.P.R]; Relative frequency [SSJ.T.P.R]; Absolute frequency [SSJ.I]; Percentage [SSJ.I]; Relative frequency [SSJ.I]; Absolute frequency [SSJ.T.K.S]; Percentage [SSJ.T.K.S]; Relative frequency [SSJ.T.K.S]
lahko lahko lahk o 3,488,074 5.43 % 3,074.02 44 8.85 % 4,532.81 118,552 3.93 % 2,985.01 17,112 9.02 % 4,325.73 1,472,009 4.94 % 2,712.27 791,767 6.41 % 4,224.61 883,055 5.35 % 2,777.44 205,535 8.48 % 4,787.15
tako tako tak o 3,289,402 5.12 % 2,898.93 31 6.24 % 3,193.57 157,077 5.21 % 3,955.03 11,067 5.84 % 2,797.62 1,532,883 5.14 % 2,824.44 627,155 5.08 % 3,346.29 830,732 5.04 % 2,612.87 130,457 5.38 % 3,038.50
več več ve č 1,895,190 2.95 % 1,670.22 4 0.81 % 412.07 35,879 1.19 % 903.39 3,863 2.04 % 976.52 889,908 2.99 % 1,639.71 302,261 2.45 % 1,612.77 609,107 3.69 % 1,915.80 54,168 2.23 % 1,261.64
nekaj nekaj neka j 1,478,636 2.30 % 1,303.11 17 3.42 % 1,751.31 75,312 2.50 % 1,896.27 2,638 1.39 % 666.86 706,346 2.37 % 1,301.49 292,468 2.37 % 1,560.51 361,517 2.19 % 1,137.07 40,338 1.66 % 939.52
zelo zelo zel o 1,441,970 2.24 % 1,270.80 2 0.40 % 206.04 45,177 1.50 % 1,137.51 3,480 1.83 % 879.71 645,584 2.17 % 1,189.53 323,837 2.62 % 1,727.89 366,622 2.22 % 1,153.12 57,268 2.36 % 1,333.84
bolj bolj bol j 1,262,289 1.96 % 1,112.45 13 2.62 % 1,339.24 42,788 1.42 % 1,077.36 2,639 1.39 % 667.11 596,437 2.00 % 1,098.97 266,960 2.16 % 1,424.41 297,200 1.80 % 934.77 56,252 2.32 % 1,310.18
zdaj zdaj zda j 1,082,531 1.68 % 954.03 0 0 % 0 71,531 2.37 % 1,801.07 2,616 1.38 % 661.30 531,590 1.78 % 979.49 150,379 1.22 % 802.37 308,065 1.87 %
968.94 18,350 0.76 % 427.39 vedno vedno vedn o 1,075,127 1.67 % 947.50 1 0.20 % 103.02 54,794 1.82 % 1,379.65 2,037 1.07 % 514.93 473,977 1.59 % 873.33 227,298 1.84 % 1,212.79 277,237 1.68 % 871.98 39,783 1.64 % 926.59 kar kar ka r 1,071,938 1.67 % 944.69 0 0 % 0 39,622 1.31 % 997.64 1,731 0.91 % 437.58 542,754 1.82 % 1,000.06 222,883 1.81 % 1,189.23 244,821 1.48 % 770.03 20,127 0.83 % 468.78 kako kako kak o 1,005,454 1.56 % 886.10 2 0.40 % 206.04 90,504 3.00 % 2,278.79 2,958 1.56 % 747.75 424,401 1.42 % 781.99 198,869 1.61 % 1,061.10 237,842 1.44 % 748.08 50,878 2.10 % 1,185.01 veliko veliko velik o 945,307 1.47 % 833.09 1 0.20 % 103.02 24,019 0.80 % 604.77 1,561 0.82 % 394.60 451,747 1.52 % 832.37 200,120 1.62 % 1,067.77 237,636 1.44 % 747.43 30,223 1.25 % 703.93 dobro dobro dobr o 906,394 1.41 % 798.80 17 3.42 % 1,751.31 38,425 1.27 % 967.50 1,435 0.76 % 362.75 419,465 1.41 % 772.89 194,249 1.57 % 1,036.45 219,223 1.33 % 689.51 33,580 1.39 % 782.12 danes danes dane s 905,146 1.41 % 797.70 0 0 % 0 13,186 0.44 % 332.01 2,091 1.10 % 528.58 348,133 1.17 % 641.46 105,919 0.86 % 565.15 416,761 2.53 % 1,310.82 19,056 0.79 % 443.84 potem potem pote m 814,559 1.27 % 717.87 10 2.01 % 1,030.18 74,531 2.47 % 1,876.61 4,470 2.36 % 1,129.97 357,330 1.20 % 658.40 152,779 1.24 % 815.18 198,520 1.20 % 624.40 26,919 1.11 % 626.98 najbolj najbolj najbol j 780,725 1.22 % 688.05 1 0.20 % 103.02 13,768 0.46 % 346.66 1,191 0.63 % 301.07 377,064 1.26 % 694.77 156,552 1.27 % 835.31 207,955 1.26 % 654.07 24,194 1.00 % 563.51 treba treba treb a 715,315 1.11 % 630.40 2 0.40 % 206.04 22,139 0.73 % 557.44 3,300 1.74 % 834.20 348,254 1.17 % 641.68 131,260 1.06 % 700.36 182,173 1.10 % 572.98 28,187 1.16 % 656.51 skupaj skupaj skupa j 614,538 0.96 % 541.59 21 4.22 % 2,163.39 23,547 0.78 % 592.89 1,545 0.81 % 390.56 284,334 0.95 % 523.90 117,286 0.95 % 625.80 170,035 1.03 % 534.80 17,770 0.73 % 413.88 letos letos leto s 606,763 0.94 % 534.74 0 0 % 0 765 0.03 % 19.26 280 0.15 % 70.78 362,999 1.22 
% 668.85 61,361 0.50 % 327.40 180,244 1.09 % 566.91 1,114 0.05 % 25.95 manj manj man j 605,930 0.94 % 534 1 0.20 % 103.02 10,236 0.34 % 257.73 1,249 0.66 % 315.73 293,350 0.98 % 540.52 115,502 0.94 % 616.28 163,127 0.99 % 513.08 22,465 0.93 % 523.24 nato nato nat o 588,918 0.92 % 519.01 52 10.46 % 5,356.96 28,600 0.95 % 720.12 1,074 0.57 % 271.50 236,089 0.79 % 435.01 94,258 0.76 % 502.93 201,513 1.22 % 633.81 27,332 1.13 % 636.59 res res re s 565,140 0.88 % 498.06 0 0 % 0 43,525 1.44 % 1,095.91 1,548 0.82 % 391.32 259,710 0.87 % 478.53 119,655 0.97 % 638.44 127,911 0.78 % 402.31 12,791 0.53 % 297.92 lani lani lan i 505,184 0.79 % 445.22 0 0 % 0 745 0.03 % 18.76 90 0.05 % 22.75 276,175 0.93 % 508.87 38,722 0.31 % 206.61 188,480 1.14 % 592.82 972 0.04 % 22.64 precej precej prece j 492,127 0.77 % 433.71 2 0.40 % 206.04 12,458 0.41 % 313.68 619 0.33 % 156.48 241,123 0.81 % 444.29 101,408 0.82 % 541.08 122,999 0.75 % 386.86 13,518 0.56 % 314.85 takrat takrat takra t 444,872 0.69 % 392.06 0 0 % 0 20,788 0.69 % 523.42 1,482 0.78 % 374.63 201,503 0.68 % 371.28 88,896 0.72 % 474.32 118,816 0.72 % 373.71 13,387 0.55 % 311.80 tam tam ta m 430,492 0.67 % 379.39 0 0 % 0 37,488 1.24 % 943.91 1,592 0.84 % 402.44 193,210 0.65 % 356 84,595 0.69 % 451.37 98,397 0.60 % 309.48 15,210 0.63 % 354.26 povsem povsem povse m 426,682 0.66 % 376.03 0 0 % 0 14,047 0.47 % 353.69 572 0.30 % 144.60 206,808 0.69 % 381.06 86,791 0.70 % 463.09 103,440 0.63 % 325.35 15,024 0.62 % 349.93 medtem medtem medte m 423,755 0.66 % 373.45 6 1.21 % 618.11 18,625 0.62 % 468.96 716 0.38 % 181 185,207 0.62 % 341.26 60,849 0.49 % 324.67 146,062 0.89 % 459.40 12,290 0.51 % 286.25 malo malo mal o 420,328 0.65 % 370.43 29 5.83 % 2,987.53 31,608 1.05 % 795.86 1,133 0.60 % 286.41 185,259 0.62 % 341.35 97,676 0.79 % 521.17 89,437 0.54 % 281.30 15,186 0.63 % 353.70 vse vse vs e 415,803 0.65 % 366.45 0 0 % 0 14,050 0.47 % 353.76 827 0.44 % 209.06 199,559 0.67 % 367.70 73,812 0.60 % 393.84 113,290 0.69 % 356.33 14,265 
0.59 % 332.25
rad rad ra d 404,452 0.63 % 356.44 0 0 % 0 31,141 1.03 % 784.10 1,256 0.66 % 317.50 176,738 0.59 % 325.65 102,264 0.83 % 545.65 79,765 0.48 % 250.88 13,288 0.55 % 309.49
glede glede gled e 400,382 0.62 % 352.85 1 0.20 % 103.02 4,848 0.16 % 122.07 2,830 1.49 % 715.39 176,820 0.59 % 325.80 67,773 0.55 % 361.61 130,230 0.79 % 409.61 17,880 0.74 % 416.45
dovolj dovolj dovol j 400,015 0.62 % 352.53 3 0.60 % 309.06 18,306 0.61 % 460.93 806 0.42 % 203.75 184,495 0.62 % 339.94 90,989 0.74 % 485.49 91,354 0.55 % 287.33 14,062 0.58 % 327.52

1.2.87 List of final character-level 2-grams from adverb lemmas in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-word_parts-adverbs-lemmas-final-2grams-taxonomy-entire.tsv
Lemma; Lemma (lower-case); Rest of the word; Final part of the word; Total absolute frequency of lemma; Percentage of all found lemmas; Total relative frequency (per million occurrences); Absolute frequency [SSJ.T.K.N]; Percentage [SSJ.T.K.N]; Relative frequency [SSJ.T.K.N]; Absolute frequency [SSJ.T.K.L]; Percentage [SSJ.T.K.L]; Relative frequency [SSJ.T.K.L]; Absolute frequency [SSJ.T.D]; Percentage [SSJ.T.D]; Relative frequency [SSJ.T.D]; Absolute frequency [SSJ.T.P.C]; Percentage [SSJ.T.P.C]; Relative frequency [SSJ.T.P.C]; Absolute frequency [SSJ.T.P.R]; Percentage [SSJ.T.P.R]; Relative frequency [SSJ.T.P.R]; Absolute frequency [SSJ.I]; Percentage [SSJ.I]; Relative frequency [SSJ.I]; Absolute frequency [SSJ.T.K.S]; Percentage [SSJ.T.K.S]; Relative frequency [SSJ.T.K.S]
lahko lahko lah ko 3,488,074 5.43 % 3,074.02 44 8.85 % 4,532.81 118,552 3.93 % 2,985.01 17,112 9.02 % 4,325.73 1,472,009 4.94 % 2,712.27 791,767 6.41 % 4,224.61 883,055 5.35 % 2,777.44 205,535 8.48 % 4,787.15
tako tako ta ko 3,289,402 5.12 % 2,898.93 31 6.24 % 3,193.57 157,077 5.21 % 3,955.03 11,067 5.84 % 2,797.62 1,532,883 5.14 % 2,824.44 627,155 5.08 % 3,346.29 830,732 5.04 % 2,612.87 130,457 5.38 %
3,038.50 več več v eč 1,895,190 2.95 % 1,670.22 4 0.81 % 412.07 35,879 1.19 % 903.39 3,863 2.04 % 976.52 889,908 2.99 % 1,639.71 302,261 2.45 % 1,612.77 609,107 3.69 % 1,915.80 54,168 2.23 % 1,261.64 nekaj nekaj nek aj 1,478,636 2.30 % 1,303.11 17 3.42 % 1,751.31 75,312 2.50 % 1,896.27 2,638 1.39 % 666.86 706,346 2.37 % 1,301.49 292,468 2.37 % 1,560.51 361,517 2.19 % 1,137.07 40,338 1.66 % 939.52 zelo zelo ze lo 1,441,970 2.24 % 1,270.80 2 0.40 % 206.04 45,177 1.50 % 1,137.51 3,480 1.83 % 879.71 645,584 2.17 % 1,189.53 323,837 2.62 % 1,727.89 366,622 2.22 % 1,153.12 57,268 2.36 % 1,333.84 bolj bolj bo lj 1,262,289 1.96 % 1,112.45 13 2.62 % 1,339.24 42,788 1.42 % 1,077.36 2,639 1.39 % 667.11 596,437 2.00 % 1,098.97 266,960 2.16 % 1,424.41 297,200 1.80 % 934.77 56,252 2.32 % 1,310.18 zdaj zdaj zd aj 1,082,531 1.68 % 954.03 0 0 % 0 71,531 2.37 % 1,801.07 2,616 1.38 % 661.30 531,590 1.78 % 979.49 150,379 1.22 % 802.37 308,065 1.87 % 968.94 18,350 0.76 % 427.39 vedno vedno ved no 1,075,127 1.67 % 947.50 1 0.20 % 103.02 54,794 1.82 % 1,379.65 2,037 1.07 % 514.93 473,977 1.59 % 873.33 227,298 1.84 % 1,212.79 277,237 1.68 % 871.98 39,783 1.64 % 926.59 kar kar k ar 1,071,938 1.67 % 944.69 0 0 % 0 39,622 1.31 % 997.64 1,731 0.91 % 437.58 542,754 1.82 % 1,000.06 222,883 1.81 % 1,189.23 244,821 1.48 % 770.03 20,127 0.83 % 468.78 kako kako ka ko 1,005,454 1.56 % 886.10 2 0.40 % 206.04 90,504 3.00 % 2,278.79 2,958 1.56 % 747.75 424,401 1.42 % 781.99 198,869 1.61 % 1,061.10 237,842 1.44 % 748.08 50,878 2.10 % 1,185.01 veliko veliko veli ko 945,307 1.47 % 833.09 1 0.20 % 103.02 24,019 0.80 % 604.77 1,561 0.82 % 394.60 451,747 1.52 % 832.37 200,120 1.62 % 1,067.77 237,636 1.44 % 747.43 30,223 1.25 % 703.93 dobro dobro dob ro 906,394 1.41 % 798.80 17 3.42 % 1,751.31 38,425 1.27 % 967.50 1,435 0.76 % 362.75 419,465 1.41 % 772.89 194,249 1.57 % 1,036.45 219,223 1.33 % 689.51 33,580 1.39 % 782.12 danes danes dan es 905,146 1.41 % 797.70 0 0 % 0 13,186 0.44 % 332.01 2,091 1.10 % 528.58 
348,133 1.17 % 641.46 105,919 0.86 % 565.15 416,761 2.53 % 1,310.82 19,056 0.79 % 443.84 potem potem pot em 814,559 1.27 % 717.87 10 2.01 % 1,030.18 74,531 2.47 % 1,876.61 4,470 2.36 % 1,129.97 357,330 1.20 % 658.40 152,779 1.24 % 815.18 198,520 1.20 % 624.40 26,919 1.11 % 626.98 najbolj najbolj najbo lj 780,725 1.22 % 688.05 1 0.20 % 103.02 13,768 0.46 % 346.66 1,191 0.63 % 301.07 377,064 1.26 % 694.77 156,552 1.27 % 835.31 207,955 1.26 % 654.07 24,194 1.00 % 563.51 treba treba tre ba 715,315 1.11 % 630.40 2 0.40 % 206.04 22,139 0.73 % 557.44 3,300 1.74 % 834.20 348,254 1.17 % 641.68 131,260 1.06 % 700.36 182,173 1.10 % 572.98 28,187 1.16 % 656.51 skupaj skupaj skup aj 614,538 0.96 % 541.59 21 4.22 % 2,163.39 23,547 0.78 % 592.89 1,545 0.81 % 390.56 284,334 0.95 % 523.90 117,286 0.95 % 625.80 170,035 1.03 % 534.80 17,770 0.73 % 413.88 letos letos let os 606,763 0.94 % 534.74 0 0 % 0 765 0.03 % 19.26 280 0.15 % 70.78 362,999 1.22 % 668.85 61,361 0.50 % 327.40 180,244 1.09 % 566.91 1,114 0.05 % 25.95 manj manj ma nj 605,930 0.94 % 534 1 0.20 % 103.02 10,236 0.34 % 257.73 1,249 0.66 % 315.73 293,350 0.98 % 540.52 115,502 0.94 % 616.28 163,127 0.99 % 513.08 22,465 0.93 % 523.24 nato nato na to 588,918 0.92 % 519.01 52 10.46 % 5,356.96 28,600 0.95 % 720.12 1,074 0.57 % 271.50 236,089 0.79 % 435.01 94,258 0.76 % 502.93 201,513 1.22 % 633.81 27,332 1.13 % 636.59 res res r es 565,140 0.88 % 498.06 0 0 % 0 43,525 1.44 % 1,095.91 1,548 0.82 % 391.32 259,710 0.87 % 478.53 119,655 0.97 % 638.44 127,911 0.78 % 402.31 12,791 0.53 % 297.92 lani lani la ni 505,184 0.79 % 445.22 0 0 % 0 745 0.03 % 18.76 90 0.05 % 22.75 276,175 0.93 % 508.87 38,722 0.31 % 206.61 188,480 1.14 % 592.82 972 0.04 % 22.64 precej precej prec ej 492,127 0.77 % 433.71 2 0.40 % 206.04 12,458 0.41 % 313.68 619 0.33 % 156.48 241,123 0.81 % 444.29 101,408 0.82 % 541.08 122,999 0.75 % 386.86 13,518 0.56 % 314.85 takrat takrat takr at 444,872 0.69 % 392.06 0 0 % 0 20,788 0.69 % 523.42 1,482 0.78 % 374.63 201,503 
0.68 % 371.28 88,896 0.72 % 474.32 118,816 0.72 % 373.71 13,387 0.55 % 311.80
tam tam t am 430,492 0.67 % 379.39 0 0 % 0 37,488 1.24 % 943.91 1,592 0.84 % 402.44 193,210 0.65 % 356 84,595 0.69 % 451.37 98,397 0.60 % 309.48 15,210 0.63 % 354.26
povsem povsem povs em 426,682 0.66 % 376.03 0 0 % 0 14,047 0.47 % 353.69 572 0.30 % 144.60 206,808 0.69 % 381.06 86,791 0.70 % 463.09 103,440 0.63 % 325.35 15,024 0.62 % 349.93
medtem medtem medt em 423,755 0.66 % 373.45 6 1.21 % 618.11 18,625 0.62 % 468.96 716 0.38 % 181 185,207 0.62 % 341.26 60,849 0.49 % 324.67 146,062 0.89 % 459.40 12,290 0.51 % 286.25
malo malo ma lo 420,328 0.65 % 370.43 29 5.83 % 2,987.53 31,608 1.05 % 795.86 1,133 0.60 % 286.41 185,259 0.62 % 341.35 97,676 0.79 % 521.17 89,437 0.54 % 281.30 15,186 0.63 % 353.70
vse vse v se 415,803 0.65 % 366.45 0 0 % 0 14,050 0.47 % 353.76 827 0.44 % 209.06 199,559 0.67 % 367.70 73,812 0.60 % 393.84 113,290 0.69 % 356.33 14,265 0.59 % 332.25
rad rad r ad 404,452 0.63 % 356.44 0 0 % 0 31,141 1.03 % 784.10 1,256 0.66 % 317.50 176,738 0.59 % 325.65 102,264 0.83 % 545.65 79,765 0.48 % 250.88 13,288 0.55 % 309.49
glede glede gle de 400,382 0.62 % 352.85 1 0.20 % 103.02 4,848 0.16 % 122.07 2,830 1.49 % 715.39 176,820 0.59 % 325.80 67,773 0.55 % 361.61 130,230 0.79 % 409.61 17,880 0.74 % 416.45
dovolj dovolj dovo lj 400,015 0.62 % 352.53 3 0.60 % 309.06 18,306 0.61 % 460.93 806 0.42 % 203.75 184,495 0.62 % 339.94 90,989 0.74 % 485.49 91,354 0.55 % 287.33 14,062 0.58 % 327.52

1.2.88 List of final character-level 3-grams from adverb lemmas in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-word_parts-adverbs-lemmas-final-3grams-taxonomy-entire.tsv
Lemma; Lemma (lower-case); Rest of the word; Final part of the word; Total absolute frequency of lemma; Percentage of all found lemmas; Total relative frequency (per million occurrences); Absolute frequency [SSJ.T.K.N]; Percentage [SSJ.T.K.N]; Relative frequency [SSJ.T.K.N]; Absolute frequency [SSJ.T.K.L]; Percentage [SSJ.T.K.L]; Relative frequency [SSJ.T.K.L]; Absolute frequency [SSJ.T.D]; Percentage [SSJ.T.D]; Relative frequency [SSJ.T.D]; Absolute frequency [SSJ.T.P.C]; Percentage [SSJ.T.P.C]; Relative frequency [SSJ.T.P.C]; Absolute frequency [SSJ.T.P.R]; Percentage [SSJ.T.P.R]; Relative frequency [SSJ.T.P.R]; Absolute frequency [SSJ.I]; Percentage [SSJ.I]; Relative frequency [SSJ.I]; Absolute frequency [SSJ.T.K.S]; Percentage [SSJ.T.K.S]; Relative frequency [SSJ.T.K.S]
lahko lahko la hko 3,488,074 5.45 % 3,074.02 44 8.87 % 4,532.81 118,552 3.96 % 2,985.01 17,112 9.11 % 4,325.73 1,472,009 4.96 % 2,712.27 791,767 6.45 % 4,224.61 883,055 5.37 % 2,777.44 205,535 8.54 % 4,787.15
tako tako t ako 3,289,402 5.14 % 2,898.93 31 6.25 % 3,193.57 157,077 5.25 % 3,955.03 11,067 5.89 % 2,797.62 1,532,883 5.17 % 2,824.44 627,155 5.11 % 3,346.29 830,732 5.05 % 2,612.87 130,457 5.42 % 3,038.50
več več več 1,895,190 2.96 % 1,670.22 4 0.81 % 412.07 35,879 1.20 % 903.39 3,863 2.06 % 976.52 889,908 3.00 % 1,639.71 302,261 2.46 % 1,612.77 609,107 3.71 % 1,915.80 54,168 2.25 % 1,261.64
nekaj nekaj ne kaj 1,478,636 2.31 % 1,303.11 17 3.43 % 1,751.31 75,312 2.52 % 1,896.27 2,638 1.41 % 666.86 706,346 2.38 % 1,301.49 292,468 2.38 % 1,560.51 361,517 2.20 % 1,137.07 40,338 1.68 % 939.52
zelo zelo z elo 1,441,970 2.25 % 1,270.80 2 0.40 % 206.04 45,177 1.51 % 1,137.51 3,480 1.85 % 879.71 645,584 2.18 % 1,189.53 323,837 2.64 % 1,727.89 366,622 2.23 % 1,153.12 57,268 2.38 % 1,333.84
bolj bolj b olj 1,262,289 1.97 % 1,112.45 13 2.62 % 1,339.24 42,788 1.43 % 1,077.36 2,639 1.41 % 667.11 596,437 2.01 % 1,098.97 266,960 2.17 % 1,424.41 297,200 1.81 % 934.77 56,252 2.34 % 1,310.18
zdaj zdaj z daj 1,082,531 1.69 % 954.03 0 0 % 0 71,531 2.39 % 1,801.07 2,616 1.39 % 661.30 531,590 1.79 % 979.49 150,379 1.23 % 802.37 308,065 1.88 % 968.94 18,350 0.76 % 427.39
vedno vedno ve dno 1,075,127 1.68 % 947.50 1 0.20 % 103.02 54,794 1.83 % 1,379.65 2,037
1.08 % 514.93 473,977 1.60 % 873.33 227,298 1.85 % 1,212.79 277,237 1.69 % 871.98 39,783 1.65 % 926.59 kar kar kar 1,071,938 1.68 % 944.69 0 0 % 0 39,622 1.32 % 997.64 1,731 0.92 % 437.58 542,754 1.83 % 1,000.06 222,883 1.81 % 1,189.23 244,821 1.49 % 770.03 20,127 0.84 % 468.78 kako kako k ako 1,005,454 1.57 % 886.10 2 0.40 % 206.04 90,504 3.02 % 2,278.79 2,958 1.57 % 747.75 424,401 1.43 % 781.99 198,869 1.62 % 1,061.10 237,842 1.45 % 748.08 50,878 2.11 % 1,185.01 veliko veliko vel iko 945,307 1.48 % 833.09 1 0.20 % 103.02 24,019 0.80 % 604.77 1,561 0.83 % 394.60 451,747 1.52 % 832.37 200,120 1.63 % 1,067.77 237,636 1.45 % 747.43 30,223 1.25 % 703.93 dobro dobro do bro 906,394 1.42 % 798.80 17 3.43 % 1,751.31 38,425 1.28 % 967.50 1,435 0.76 % 362.75 419,465 1.41 % 772.89 194,249 1.58 % 1,036.45 219,223 1.33 % 689.51 33,580 1.39 % 782.12 danes danes da nes 905,146 1.42 % 797.70 0 0 % 0 13,186 0.44 % 332.01 2,091 1.11 % 528.58 348,133 1.17 % 641.46 105,919 0.86 % 565.15 416,761 2.54 % 1,310.82 19,056 0.79 % 443.84 potem potem po tem 814,559 1.27 % 717.87 10 2.02 % 1,030.18 74,531 2.49 % 1,876.61 4,470 2.38 % 1,129.97 357,330 1.20 % 658.40 152,779 1.24 % 815.18 198,520 1.21 % 624.40 26,919 1.12 % 626.98 najbolj najbolj najb olj 780,725 1.22 % 688.05 1 0.20 % 103.02 13,768 0.46 % 346.66 1,191 0.63 % 301.07 377,064 1.27 % 694.77 156,552 1.27 % 835.31 207,955 1.26 % 654.07 24,194 1.00 % 563.51 treba treba tr eba 715,315 1.12 % 630.40 2 0.40 % 206.04 22,139 0.74 % 557.44 3,300 1.76 % 834.20 348,254 1.17 % 641.68 131,260 1.07 % 700.36 182,173 1.11 % 572.98 28,187 1.17 % 656.51 skupaj skupaj sku paj 614,538 0.96 % 541.59 21 4.23 % 2,163.39 23,547 0.79 % 592.89 1,545 0.82 % 390.56 284,334 0.96 % 523.90 117,286 0.95 % 625.80 170,035 1.03 % 534.80 17,770 0.74 % 413.88 letos letos le tos 606,763 0.95 % 534.74 0 0 % 0 765 0.03 % 19.26 280 0.15 % 70.78 362,999 1.22 % 668.85 61,361 0.50 % 327.40 180,244 1.10 % 566.91 1,114 0.05 % 25.95 manj manj m anj 605,930 0.95 % 534 1 0.20 % 
103.02 10,236 0.34 % 257.73 1,249 0.67 % 315.73 293,350 0.99 % 540.52 115,502 0.94 % 616.28 163,127 0.99 % 513.08 22,465 0.93 % 523.24 nato nato n ato 588,918 0.92 % 519.01 52 10.48 % 5,356.96 28,600 0.95 % 720.12 1,074 0.57 % 271.50 236,089 0.80 % 435.01 94,258 0.77 % 502.93 201,513 1.23 % 633.81 27,332 1.14 % 636.59 res res res 565,140 0.88 % 498.06 0 0 % 0 43,525 1.45 % 1,095.91 1,548 0.82 % 391.32 259,710 0.88 % 478.53 119,655 0.97 % 638.44 127,911 0.78 % 402.31 12,791 0.53 % 297.92 lani lani l ani 505,184 0.79 % 445.22 0 0 % 0 745 0.03 % 18.76 90 0.05 % 22.75 276,175 0.93 % 508.87 38,722 0.32 % 206.61 188,480 1.15 % 592.82 972 0.04 % 22.64 precej precej pre cej 492,127 0.77 % 433.71 2 0.40 % 206.04 12,458 0.42 % 313.68 619 0.33 % 156.48 241,123 0.81 % 444.29 101,408 0.83 % 541.08 122,999 0.75 % 386.86 13,518 0.56 % 314.85 takrat takrat tak rat 444,872 0.69 % 392.06 0 0 % 0 20,788 0.69 % 523.42 1,482 0.79 % 374.63 201,503 0.68 % 371.28 88,896 0.72 % 474.32 118,816 0.72 % 373.71 13,387 0.56 % 311.80 tam tam tam 430,492 0.67 % 379.39 0 0 % 0 37,488 1.25 % 943.91 1,592 0.85 % 402.44 193,210 0.65 % 356 84,595 0.69 % 451.37 98,397 0.60 % 309.48 15,210 0.63 % 354.26 povsem povsem pov sem 426,682 0.67 % 376.03 0 0 % 0 14,047 0.47 % 353.69 572 0.30 % 144.60 206,808 0.70 % 381.06 86,791 0.71 % 463.09 103,440 0.63 % 325.35 15,024 0.62 % 349.93 medtem medtem med tem 423,755 0.66 % 373.45 6 1.21 % 618.11 18,625 0.62 % 468.96 716 0.38 % 181 185,207 0.62 % 341.26 60,849 0.50 % 324.67 146,062 0.89 % 459.40 12,290 0.51 % 286.25 malo malo m alo 420,328 0.66 % 370.43 29 5.85 % 2,987.53 31,608 1.06 % 795.86 1,133 0.60 % 286.41 185,259 0.62 % 341.35 97,676 0.80 % 521.17 89,437 0.54 % 281.30 15,186 0.63 % 353.70 vse vse vse 415,803 0.65 % 366.45 0 0 % 0 14,050 0.47 % 353.76 827 0.44 % 209.06 199,559 0.67 % 367.70 73,812 0.60 % 393.84 113,290 0.69 % 356.33 14,265 0.59 % 332.25 rad rad rad 404,452 0.63 % 356.44 0 0 % 0 31,141 1.04 % 784.10 1,256 0.67 % 317.50 176,738 0.60 % 325.65 
102,264 0.83 % 545.65 79,765 0.48 % 250.88 13,288 0.55 % 309.49
glede glede gl ede 400,382 0.63 % 352.85 1 0.20 % 103.02 4,848 0.16 % 122.07 2,830 1.51 % 715.39 176,820 0.60 % 325.80 67,773 0.55 % 361.61 130,230 0.79 % 409.61 17,880 0.74 % 416.45
dovolj dovolj dov olj 400,015 0.62 % 352.53 3 0.60 % 309.06 18,306 0.61 % 460.93 806 0.43 % 203.75 184,495 0.62 % 339.94 90,989 0.74 % 485.49 91,354 0.56 % 287.33 14,062 0.58 % 327.52

1.2.89 List of final character-level 4-grams from adverb lemmas in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-word_parts-adverbs-lemmas-final-4grams-taxonomy-entire.tsv

Lemma | Lemma (lower-case) | Rest of the word | Final part of the word | Total absolute frequency of lemma | Percentage of all found lemmas | Total relative frequency (per million occurrences) | Absolute frequency [SSJ.T.K.N] | Percentage [SSJ.T.K.N] | Relative frequency [SSJ.T.K.N] | Absolute frequency [SSJ.T.K.L] | Percentage [SSJ.T.K.L] | Relative frequency [SSJ.T.K.L] | Absolute frequency [SSJ.T.D] | Percentage [SSJ.T.D] | Relative frequency [SSJ.T.D] | Absolute frequency [SSJ.T.P.C] | Percentage [SSJ.T.P.C] | Relative frequency [SSJ.T.P.C] | Absolute frequency [SSJ.T.P.R] | Percentage [SSJ.T.P.R] | Relative frequency [SSJ.T.P.R] | Absolute frequency [SSJ.I] | Percentage [SSJ.I] | Relative frequency [SSJ.I] | Absolute frequency [SSJ.T.K.S] | Percentage [SSJ.T.K.S] | Relative frequency [SSJ.T.K.S]
lahko lahko l ahko 3,488,074 6.09 % 3,074.02 44 9.22 % 4,532.81 118,552 4.46 % 2,985.01 17,112 9.97 % 4,325.73 1,472,009 5.55 % 2,712.27 791,767 7.20 % 4,224.61 883,055 5.99 % 2,777.44 205,535 9.29 % 4,787.15
tako tako tako 3,289,402 5.74 % 2,898.93 31 6.50 % 3,193.57 157,077 5.92 % 3,955.03 11,067 6.45 % 2,797.62 1,532,883 5.78 % 2,824.44 627,155 5.71 % 3,346.29 830,732 5.64 % 2,612.87 130,457 5.90 % 3,038.50
nekaj nekaj n ekaj 1,478,636 2.58 % 1,303.11 17 3.56 % 1,751.31 75,312 2.84 % 1,896.27 2,638 1.54 % 666.86 706,346 2.66 % 1,301.49 292,468 2.66 % 1,560.51 361,517 2.45 % 1,137.07 40,338 1.82 % 939.52
zelo zelo zelo 1,441,970 2.52 % 1,270.80 2 0.42 % 206.04 45,177 1.70 % 1,137.51 3,480 2.03 % 879.71 645,584 2.43 % 1,189.53 323,837 2.95 % 1,727.89 366,622 2.49 % 1,153.12 57,268 2.59 % 1,333.84
bolj bolj bolj 1,262,289 2.20 % 1,112.45 13 2.73 % 1,339.24 42,788 1.61 % 1,077.36 2,639 1.54 % 667.11 596,437 2.25 % 1,098.97 266,960 2.43 % 1,424.41 297,200 2.02 % 934.77 56,252 2.54 % 1,310.18
zdaj zdaj zdaj 1,082,531 1.89 % 954.03 0 0 % 0 71,531 2.69 % 1,801.07 2,616 1.52 % 661.30 531,590 2.00 % 979.49 150,379 1.37 % 802.37 308,065 2.09 % 968.94 18,350 0.83 % 427.39
vedno vedno v edno 1,075,127 1.88 % 947.50 1 0.21 % 103.02 54,794 2.06 % 1,379.65 2,037 1.19 % 514.93 473,977 1.79 % 873.33 227,298 2.07 % 1,212.79 277,237 1.88 % 871.98 39,783 1.80 % 926.59
kako kako kako 1,005,454 1.75 % 886.10 2 0.42 % 206.04 90,504 3.41 % 2,278.79 2,958 1.72 % 747.75 424,401 1.60 % 781.99 198,869 1.81 % 1,061.10 237,842 1.61 % 748.08 50,878 2.30 % 1,185.01
veliko veliko ve liko 945,307 1.65 % 833.09 1 0.21 % 103.02 24,019 0.90 % 604.77 1,561 0.91 % 394.60 451,747 1.70 % 832.37 200,120 1.82 % 1,067.77 237,636 1.61 % 747.43 30,223 1.37 % 703.93
dobro dobro d obro 906,394 1.58 % 798.80 17 3.56 % 1,751.31 38,425 1.45 % 967.50 1,435 0.84 % 362.75 419,465 1.58 % 772.89 194,249 1.77 % 1,036.45 219,223 1.49 % 689.51 33,580 1.52 % 782.12
danes danes d anes 905,146 1.58 % 797.70 0 0 % 0 13,186 0.50 % 332.01 2,091 1.22 % 528.58 348,133 1.31 % 641.46 105,919 0.96 % 565.15 416,761 2.83 % 1,310.82 19,056 0.86 % 443.84
potem potem p otem 814,559 1.42 % 717.87 10 2.10 % 1,030.18 74,531 2.81 % 1,876.61 4,470 2.60 % 1,129.97 357,330 1.35 % 658.40 152,779 1.39 % 815.18 198,520 1.35 % 624.40 26,919 1.22 % 626.98
najbolj najbolj naj bolj 780,725 1.36 % 688.05 1 0.21 % 103.02 13,768 0.52 % 346.66 1,191 0.69 % 301.07 377,064 1.42 % 694.77 156,552 1.42 % 835.31 207,955 1.41 % 654.07 24,194 1.09 % 563.51
treba treba t reba 715,315 1.25 % 630.40 2 0.42 % 206.04 22,139 0.83 % 557.44 3,300 1.92 % 834.20 348,254 1.31 % 641.68 131,260 1.19 % 700.36 182,173 1.24 % 572.98 28,187 1.27 % 656.51
skupaj skupaj sk upaj 614,538 1.07 % 541.59 21 4.40 % 2,163.39 23,547 0.89 % 592.89 1,545 0.90 % 390.56 284,334 1.07 % 523.90 117,286 1.07 % 625.80 170,035 1.15 % 534.80 17,770 0.80 % 413.88
letos letos l etos 606,763 1.06 % 534.74 0 0 % 0 765 0.03 % 19.26 280 0.16 % 70.78 362,999 1.37 % 668.85 61,361 0.56 % 327.40 180,244 1.22 % 566.91 1,114 0.05 % 25.95
manj manj manj 605,930 1.06 % 534 1 0.21 % 103.02 10,236 0.39 % 257.73 1,249 0.73 % 315.73 293,350 1.11 % 540.52 115,502 1.05 % 616.28 163,127 1.11 % 513.08 22,465 1.01 % 523.24
nato nato nato 588,918 1.03 % 519.01 52 10.90 % 5,356.96 28,600 1.08 % 720.12 1,074 0.63 % 271.50 236,089 0.89 % 435.01 94,258 0.86 % 502.93 201,513 1.37 % 633.81 27,332 1.24 % 636.59
lani lani lani 505,184 0.88 % 445.22 0 0 % 0 745 0.03 % 18.76 90 0.05 % 22.75 276,175 1.04 % 508.87 38,722 0.35 % 206.61 188,480 1.28 % 592.82 972 0.04 % 22.64
precej precej pr ecej 492,127 0.86 % 433.71 2 0.42 % 206.04 12,458 0.47 % 313.68 619 0.36 % 156.48 241,123 0.91 % 444.29 101,408 0.92 % 541.08 122,999 0.83 % 386.86 13,518 0.61 % 314.85
takrat takrat ta krat 444,872 0.78 % 392.06 0 0 % 0 20,788 0.78 % 523.42 1,482 0.86 % 374.63 201,503 0.76 % 371.28 88,896 0.81 % 474.32 118,816 0.81 % 373.71 13,387 0.60 % 311.80
povsem povsem po vsem 426,682 0.74 % 376.03 0 0 % 0 14,047 0.53 % 353.69 572 0.33 % 144.60 206,808 0.78 % 381.06 86,791 0.79 % 463.09 103,440 0.70 % 325.35 15,024 0.68 % 349.93
medtem medtem me dtem 423,755 0.74 % 373.45 6 1.26 % 618.11 18,625 0.70 % 468.96 716 0.42 % 181 185,207 0.70 % 341.26 60,849 0.55 % 324.67 146,062 0.99 % 459.40 12,290 0.56 % 286.25
malo malo malo 420,328 0.73 % 370.43 29 6.08 % 2,987.53 31,608 1.19 % 795.86 1,133 0.66 % 286.41 185,259 0.70 % 341.35 97,676 0.89 % 521.17 89,437 0.61 % 281.30 15,186 0.69 % 353.70
glede glede g lede 400,382 0.70 % 352.85 1
0.21 % 103.02 4,848 0.18 % 122.07 2,830 1.65 % 715.39 176,820 0.67 % 325.80 67,773 0.62 % 361.61 130,230 0.88 % 409.61 17,880 0.81 % 416.45
dovolj dovolj do volj 400,015 0.70 % 352.53 3 0.63 % 309.06 18,306 0.69 % 460.93 806 0.47 % 203.75 184,495 0.70 % 339.94 90,989 0.83 % 485.49 91,354 0.62 % 287.33 14,062 0.64 % 327.52
spet spet spet 394,360 0.69 % 347.55 0 0 % 0 41,209 1.55 % 1,037.60 1,269 0.74 % 320.79 187,447 0.71 % 345.38 68,686 0.62 % 366.49 83,350 0.57 % 262.16 12,399 0.56 % 288.79
prej prej prej 381,934 0.67 % 336.60 1 0.21 % 103.02 19,064 0.72 % 480.01 1,381 0.81 % 349.10 186,379 0.70 % 343.42 69,611 0.63 % 371.42 93,050 0.63 % 292.67 12,448 0.56 % 289.93
naprej naprej na prej 377,483 0.66 % 332.67 3 0.63 % 309.06 25,693 0.97 % 646.92 1,567 0.91 % 396.12 169,016 0.64 % 311.42 62,691 0.57 % 334.50 105,317 0.71 % 331.25 13,196 0.60 % 307.35
včeraj včeraj vč eraj 354,001 0.62 % 311.98 0 0 % 0 3,279 0.12 % 82.56 323 0.19 % 81.65 279,239 1.05 % 514.52 5,457 0.05 % 29.12 65,131 0.44 % 204.85 572 0.03 % 13.32
hitro hitro h itro 353,030 0.62 % 311.12 3 0.63 % 309.06 19,800 0.75 % 498.54 751 0.44 % 189.84 142,631 0.54 % 262.81 82,662 0.75 % 441.06 92,138 0.62 % 289.80 15,045 0.68 % 350.42
torej torej t orej 350,715 0.61 % 309.08 0 0 % 0 13,285 0.50 % 334.50 1,398 0.81 % 353.40 169,569 0.64 % 312.44 78,027 0.71 % 416.33 70,358 0.48 % 221.29 18,078 0.82 % 421.06

1.2.90 List of final character-level 5-grams from adverb lemmas in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-word_parts-adverbs-lemmas-final-5grams-taxonomy-entire.tsv

Lemma | Lemma (lower-case) | Rest of the word | Final part of the word | Total absolute frequency of lemma | Percentage of all found lemmas | Total relative frequency (per million occurrences) | Absolute frequency [SSJ.T.K.N] | Percentage [SSJ.T.K.N] | Relative frequency [SSJ.T.K.N] | Absolute frequency [SSJ.T.K.L] | Percentage [SSJ.T.K.L] | Relative frequency [SSJ.T.K.L] | Absolute frequency [SSJ.T.D] | Percentage [SSJ.T.D] | Relative frequency [SSJ.T.D] | Absolute frequency [SSJ.T.P.C] | Percentage [SSJ.T.P.C] | Relative frequency [SSJ.T.P.C] | Absolute frequency [SSJ.T.P.R] | Percentage [SSJ.T.P.R] | Relative frequency [SSJ.T.P.R] | Absolute frequency [SSJ.I] | Percentage [SSJ.I] | Relative frequency [SSJ.I] | Absolute frequency [SSJ.T.K.S] | Percentage [SSJ.T.K.S] | Relative frequency [SSJ.T.K.S]
lahko lahko lahko 3,488,074 7.83 % 3,074.02 44 13.02 % 4,532.81 118,552 5.93 % 2,985.01 17,112 12.39 % 4,325.73 1,472,009 7.14 % 2,712.27 791,767 9.25 % 4,224.61 883,055 7.68 % 2,777.44 205,535 11.77 % 4,787.15
nekaj nekaj nekaj 1,478,636 3.32 % 1,303.11 17 5.03 % 1,751.31 75,312 3.77 % 1,896.27 2,638 1.91 % 666.86 706,346 3.43 % 1,301.49 292,468 3.42 % 1,560.51 361,517 3.14 % 1,137.07 40,338 2.31 % 939.52
vedno vedno vedno 1,075,127 2.41 % 947.50 1 0.30 % 103.02 54,794 2.74 % 1,379.65 2,037 1.48 % 514.93 473,977 2.30 % 873.33 227,298 2.65 % 1,212.79 277,237 2.41 % 871.98 39,783 2.28 % 926.59
veliko veliko v eliko 945,307 2.12 % 833.09 1 0.30 % 103.02 24,019 1.20 % 604.77 1,561 1.13 % 394.60 451,747 2.19 % 832.37 200,120 2.34 % 1,067.77 237,636 2.07 % 747.43 30,223 1.73 % 703.93
dobro dobro dobro 906,394 2.03 % 798.80 17 5.03 % 1,751.31 38,425 1.92 % 967.50 1,435 1.04 % 362.75 419,465 2.04 % 772.89 194,249 2.27 % 1,036.45 219,223 1.91 % 689.51 33,580 1.92 % 782.12
danes danes danes 905,146 2.03 % 797.70 0 0 % 0 13,186 0.66 % 332.01 2,091 1.51 % 528.58 348,133 1.69 % 641.46 105,919 1.24 % 565.15 416,761 3.62 % 1,310.82 19,056 1.09 % 443.84
potem potem potem 814,559 1.83 % 717.87 10 2.96 % 1,030.18 74,531 3.73 % 1,876.61 4,470 3.24 % 1,129.97 357,330 1.73 % 658.40 152,779 1.78 % 815.18 198,520 1.73 % 624.40 26,919 1.54 % 626.98
najbolj najbolj na jbolj 780,725 1.75 % 688.05 1 0.30 % 103.02 13,768 0.69 % 346.66 1,191 0.86 % 301.07 377,064 1.83 % 694.77 156,552 1.83 % 835.31 207,955 1.81 % 654.07 24,194 1.39 % 563.51
treba treba treba 715,315 1.60 % 630.40 2 0.59 % 206.04 22,139 1.11 % 557.44 3,300 2.39 % 834.20 348,254 1.69 % 641.68 131,260 1.53 % 700.36 182,173 1.58 % 572.98 28,187 1.61 % 656.51
skupaj skupaj s kupaj 614,538 1.38 % 541.59 21 6.21 % 2,163.39 23,547 1.18 % 592.89 1,545 1.12 % 390.56 284,334 1.38 % 523.90 117,286 1.37 % 625.80 170,035 1.48 % 534.80 17,770 1.02 % 413.88
letos letos letos 606,763 1.36 % 534.74 0 0 % 0 765 0.04 % 19.26 280 0.20 % 70.78 362,999 1.76 % 668.85 61,361 0.72 % 327.40 180,244 1.57 % 566.91 1,114 0.06 % 25.95
precej precej p recej 492,127 1.10 % 433.71 2 0.59 % 206.04 12,458 0.62 % 313.68 619 0.45 % 156.48 241,123 1.17 % 444.29 101,408 1.18 % 541.08 122,999 1.07 % 386.86 13,518 0.77 % 314.85
takrat takrat t akrat 444,872 1.00 % 392.06 0 0 % 0 20,788 1.04 % 523.42 1,482 1.07 % 374.63 201,503 0.98 % 371.28 88,896 1.04 % 474.32 118,816 1.03 % 373.71 13,387 0.77 % 311.80
povsem povsem p ovsem 426,682 0.96 % 376.03 0 0 % 0 14,047 0.70 % 353.69 572 0.41 % 144.60 206,808 1.00 % 381.06 86,791 1.01 % 463.09 103,440 0.90 % 325.35 15,024 0.86 % 349.93
medtem medtem m edtem 423,755 0.95 % 373.45 6 1.77 % 618.11 18,625 0.93 % 468.96 716 0.52 % 181 185,207 0.90 % 341.26 60,849 0.71 % 324.67 146,062 1.27 % 459.40 12,290 0.70 % 286.25
glede glede glede 400,382 0.90 % 352.85 1 0.30 % 103.02 4,848 0.24 % 122.07 2,830 2.05 % 715.39 176,820 0.86 % 325.80 67,773 0.79 % 361.61 130,230 1.13 % 409.61 17,880 1.02 % 416.45
dovolj dovolj d ovolj 400,015 0.90 % 352.53 3 0.89 % 309.06 18,306 0.92 % 460.93 806 0.58 % 203.75 184,495 0.90 % 339.94 90,989 1.06 % 485.49 91,354 0.79 % 287.33 14,062 0.81 % 327.52
naprej naprej n aprej 377,483 0.85 % 332.67 3 0.89 % 309.06 25,693 1.29 % 646.92 1,567 1.14 % 396.12 169,016 0.82 % 311.42 62,691 0.73 % 334.50 105,317 0.92 % 331.25 13,196 0.76 % 307.35
včeraj včeraj v čeraj 354,001 0.80 % 311.98 0 0 % 0 3,279 0.16 % 82.56 323 0.23 % 81.65 279,239 1.35 % 514.52 5,457 0.06 % 29.12 65,131 0.57 % 204.85 572 0.03 % 13.32
hitro hitro hitro 353,030 0.79 % 311.12 3 0.89 % 309.06 19,800 0.99 % 498.54 751 0.54 % 189.84 142,631 0.69 % 262.81 82,662 0.96 % 441.06 92,138 0.80 % 289.80 15,045 0.86 % 350.42
torej torej torej 350,715 0.79 % 309.08 0 0 % 0 13,285 0.67 % 334.50 1,398 1.01 % 353.40 169,569 0.82 % 312.44 78,027 0.91 % 416.33 70,358 0.61 % 221.29 18,078 1.03 % 421.06
najprej najprej na jprej 346,336 0.78 % 305.22 1 0.30 % 103.02 12,918 0.65 % 325.26 1,024 0.74 % 258.86 166,148 0.81 % 306.14 67,446 0.79 % 359.87 83,679 0.73 % 263.19 15,120 0.87 % 352.16
nikoli nikoli n ikoli 342,994 0.77 % 302.28 0 0 % 0 35,030 1.75 % 882.02 811 0.59 % 205.01 139,253 0.68 % 256.58 74,199 0.87 % 395.90 80,494 0.70 % 253.17 13,207 0.76 % 307.61
hkrati hkrati h krati 328,962 0.74 % 289.91 0 0 % 0 6,504 0.33 % 163.76 764 0.55 % 193.13 159,548 0.77 % 293.98 60,655 0.71 % 323.64 85,984 0.75 % 270.44 15,507 0.89 % 361.18
verjetno verjetno ver jetno 328,520 0.74 % 289.52 0 0 % 0 11,371 0.57 % 286.31 874 0.63 % 220.94 159,262 0.77 % 293.45 62,591 0.73 % 333.97 82,117 0.71 % 258.28 12,305 0.70 % 286.60
toliko toliko t oliko 324,313 0.73 % 285.82 13 3.85 % 1,339.24 16,859 0.84 % 424.49 1,101 0.80 % 278.32 159,180 0.77 % 293.30 66,980 0.78 % 357.38 69,370 0.60 % 218.19 10,810 0.62 % 251.78
zakaj zakaj zakaj 323,693 0.73 % 285.27 0 0 % 0 28,271 1.42 % 711.83 1,138 0.82 % 287.67 143,844 0.70 % 265.04 66,510 0.78 % 354.88 70,625 0.61 % 222.13 13,305 0.76 % 309.89
dolgo dolgo dolgo 311,120 0.70 % 274.19 2 0.59 % 206.04 20,159 1.01 % 507.58 917 0.66 % 231.81 141,751 0.69 % 261.19 62,078 0.72 % 331.23 75,292 0.66 % 236.81 10,921 0.62 % 254.36
tokrat tokrat t okrat 307,823 0.69 % 271.28 0 0 % 0 6,904 0.35 % 173.84 352 0.26 % 88.98 167,344 0.81 % 308.34 46,175 0.54 % 246.37 85,275 0.74 % 268.21 1,773 0.10 % 41.30
približno približno prib ližno 304,422 0.68 % 268.29 17 5.03 % 1,751.31 3,716 0.19 % 93.56 649 0.47 % 164.06 152,500 0.74 % 280.99 55,444 0.65 % 295.83 80,811 0.70 % 254.17 11,285 0.65 % 262.84
nekoliko nekoliko nek oliko 302,982 0.68 % 267.02 3 0.89
% 309.06 8,364 0.42 % 210.60 527 0.38 % 133.22 143,420 0.70 % 264.26 67,728 0.79 % 361.37 71,679 0.62 % 225.45 11,261 0.65 % 262.28
posebej posebej po sebej 301,006 0.68 % 265.28 3 0.89 % 309.06 6,382 0.32 % 160.69 1,082 0.78 % 273.52 135,912 0.66 % 250.43 68,691 0.80 % 366.51 75,735 0.66 % 238.21 13,201 0.76 % 307.47

1.2.91 List of initial character-level 1-grams from adverb lower-case word forms in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-word_parts-adverbs-lowercase_forms-initial-1grams-taxonomy-entire.tsv

Lower-case word form | Initial part of the word | Rest of the word | Total absolute frequency of lower-case word form | Percentage of all found lower-case word forms | Total relative frequency (per million occurrences) | Absolute frequency [SSJ.T.K.N] | Percentage [SSJ.T.K.N] | Relative frequency [SSJ.T.K.N] | Absolute frequency [SSJ.T.K.L] | Percentage [SSJ.T.K.L] | Relative frequency [SSJ.T.K.L] | Absolute frequency [SSJ.T.D] | Percentage [SSJ.T.D] | Relative frequency [SSJ.T.D] | Absolute frequency [SSJ.T.P.C] | Percentage [SSJ.T.P.C] | Relative frequency [SSJ.T.P.C] | Absolute frequency [SSJ.T.P.R] | Percentage [SSJ.T.P.R] | Relative frequency [SSJ.T.P.R] | Absolute frequency [SSJ.I] | Percentage [SSJ.I] | Relative frequency [SSJ.I] | Absolute frequency [SSJ.T.K.S] | Percentage [SSJ.T.K.S] | Relative frequency [SSJ.T.K.S]
lahko l ahko 3,363,470 5.23 % 2,964.21 42 8.45 % 4,326.77 115,009 3.81 % 2,895.80 16,869 8.89 % 4,264.30 1,415,614 4.75 % 2,608.36 764,622 6.19 % 4,079.77 852,311 5.17 % 2,680.74 199,003 8.21 % 4,635.02
tako t ako 3,289,402 5.12 % 2,898.93 31 6.24 % 3,193.57 157,077 5.21 % 3,955.03 11,067 5.84 % 2,797.62 1,532,883 5.14 % 2,824.44 627,155 5.08 % 3,346.29 830,732 5.04 % 2,612.87 130,457 5.38 % 3,038.50
več v eč 1,895,190 2.95 % 1,670.22 4 0.81 % 412.07 35,879 1.19 % 903.39 3,863 2.04 % 976.52 889,908 2.99 % 1,639.71 302,261 2.45 % 1,612.77 609,107 3.69 % 1,915.80 54,168 2.23 % 1,261.64
nekaj n ekaj 1,478,635 2.30 % 1,303.11 17 3.42 % 1,751.31 75,312 2.50 % 1,896.27 2,638 1.39 % 666.86 706,346 2.37 % 1,301.49 292,467 2.37 % 1,560.51 361,517 2.19 % 1,137.07 40,338 1.66 % 939.52
zelo z elo 1,441,970 2.24 % 1,270.80 2 0.40 % 206.04 45,177 1.50 % 1,137.51 3,480 1.83 % 879.71 645,584 2.17 % 1,189.53 323,837 2.62 % 1,727.89 366,622 2.22 % 1,153.12 57,268 2.36 % 1,333.84
bolj b olj 1,262,087 1.96 % 1,112.27 13 2.62 % 1,339.24 42,774 1.42 % 1,077 2,638 1.39 % 666.86 596,351 2.00 % 1,098.82 266,922 2.16 % 1,424.21 297,149 1.80 % 934.61 56,240 2.32 % 1,309.90
zdaj z daj 1,082,531 1.68 % 954.03 0 0 % 0 71,531 2.37 % 1,801.07 2,616 1.38 % 661.30 531,590 1.78 % 979.49 150,379 1.22 % 802.37 308,065 1.87 % 968.94 18,350 0.76 % 427.39
vedno v edno 1,075,127 1.67 % 947.50 1 0.20 % 103.02 54,794 1.82 % 1,379.65 2,037 1.07 % 514.93 473,977 1.59 % 873.33 227,298 1.84 % 1,212.79 277,237 1.68 % 871.98 39,783 1.64 % 926.59
kar k ar 1,071,938 1.67 % 944.69 0 0 % 0 39,622 1.31 % 997.64 1,731 0.91 % 437.58 542,754 1.82 % 1,000.06 222,883 1.81 % 1,189.23 244,821 1.48 % 770.03 20,127 0.83 % 468.78
kako k ako 1,005,454 1.56 % 886.10 2 0.40 % 206.04 90,504 3.00 % 2,278.79 2,958 1.56 % 747.75 424,401 1.42 % 781.99 198,869 1.61 % 1,061.10 237,842 1.44 % 748.08 50,878 2.10 % 1,185.01
veliko v eliko 947,533 1.47 % 835.06 1 0.20 % 103.02 24,065 0.80 % 605.93 1,569 0.83 % 396.63 452,761 1.52 % 834.24 200,571 1.62 % 1,070.18 238,221 1.44 % 749.27 30,345 1.25 % 706.77
danes d anes 905,146 1.41 % 797.70 0 0 % 0 13,186 0.44 % 332.01 2,091 1.10 % 528.58 348,133 1.17 % 641.46 105,919 0.86 % 565.15 416,761 2.53 % 1,310.82 19,056 0.79 % 443.84
potem p otem 814,559 1.27 % 717.87 10 2.01 % 1,030.18 74,531 2.47 % 1,876.61 4,470 2.36 % 1,129.97 357,330 1.20 % 658.40 152,779 1.24 % 815.18 198,520 1.20 % 624.40 26,919 1.11 % 626.98
najbolj n ajbolj 780,723 1.22 % 688.05 1 0.20 % 103.02 13,768 0.46 % 346.66 1,191 0.63 % 301.07 377,064 1.26 % 694.77 156,552 1.27 % 835.31 207,953 1.26 % 654.07 24,194 1.00 % 563.51
treba t reba 715,315 1.11 % 630.40 2 0.40 % 206.04 22,139 0.73 % 557.44 3,300 1.74 % 834.20 348,254 1.17 % 641.68 131,260 1.06 % 700.36 182,173 1.10 % 572.98 28,187 1.16 % 656.51
skupaj s kupaj 614,538 0.96 % 541.59 21 4.22 % 2,163.39 23,547 0.78 % 592.89 1,545 0.81 % 390.56 284,334 0.95 % 523.90 117,286 0.95 % 625.80 170,035 1.03 % 534.80 17,770 0.73 % 413.88
letos l etos 606,763 0.94 % 534.74 0 0 % 0 765 0.03 % 19.26 280 0.15 % 70.78 362,999 1.22 % 668.85 61,361 0.50 % 327.40 180,244 1.09 % 566.91 1,114 0.05 % 25.95
manj m anj 605,930 0.94 % 534 1 0.20 % 103.02 10,236 0.34 % 257.73 1,249 0.66 % 315.73 293,350 0.98 % 540.52 115,502 0.94 % 616.28 163,127 0.99 % 513.08 22,465 0.93 % 523.24
nato n ato 589,423 0.92 % 519.46 52 10.46 % 5,356.96 28,623 0.95 % 720.70 1,075 0.57 % 271.75 236,312 0.79 % 435.42 94,351 0.76 % 503.43 201,657 1.22 % 634.26 27,353 1.13 % 637.08
dobro d obro 576,943 0.90 % 508.46 16 3.22 % 1,648.30 25,215 0.84 % 634.89 975 0.51 % 246.47 268,699 0.90 % 495.10 122,531 0.99 % 653.79 138,841 0.84 % 436.69 20,666 0.85 % 481.34
res r es 565,323 0.88 % 498.22 0 0 % 0 43,525 1.44 % 1,095.91 1,548 0.82 % 391.32 259,768 0.87 % 478.64 119,765 0.97 % 639.03 127,925 0.78 % 402.36 12,792 0.53 % 297.94
lani l ani 505,184 0.79 % 445.22 0 0 % 0 745 0.03 % 18.76 90 0.05 % 22.75 276,175 0.93 % 508.87 38,722 0.31 % 206.61 188,480 1.14 % 592.82 972 0.04 % 22.64
precej p recej 492,127 0.77 % 433.71 2 0.40 % 206.04 12,458 0.41 % 313.68 619 0.33 % 156.48 241,123 0.81 % 444.29 101,408 0.82 % 541.08 122,999 0.75 % 386.86 13,518 0.56 % 314.85
takrat t akrat 444,872 0.69 % 392.06 0 0 % 0 20,788 0.69 % 523.42 1,482 0.78 % 374.63 201,503 0.68 % 371.28 88,896 0.72 % 474.32 118,816 0.72 % 373.71 13,387 0.55 % 311.80
tam t am 430,619 0.67 % 379.50 0 0 % 0 37,490 1.24 % 943.96 1,594 0.84 % 402.95 193,283 0.65 % 356.14 84,620 0.69 % 451.50 98,420 0.60 % 309.56 15,212 0.63 % 354.31
povsem p ovsem 426,682 0.66 % 376.03 0 0 % 0 14,047 0.47 % 353.69 572
0.30 % 144.60 206,808 0.69 % 381.06 86,791 0.70 % 463.09 103,440 0.63 % 325.35 15,024 0.62 % 349.93
medtem m edtem 423,755 0.66 % 373.45 6 1.21 % 618.11 18,625 0.62 % 468.96 716 0.38 % 181 185,207 0.62 % 341.26 60,849 0.49 % 324.67 146,062 0.89 % 459.40 12,290 0.51 % 286.25
malo m alo 419,742 0.65 % 369.92 29 5.83 % 2,987.53 31,436 1.04 % 791.52 1,114 0.59 % 281.61 185,166 0.62 % 341.18 97,544 0.79 % 520.46 89,322 0.54 % 280.94 15,131 0.62 % 352.42
vse v se 415,803 0.65 % 366.45 0 0 % 0 14,050 0.47 % 353.76 827 0.44 % 209.06 199,559 0.67 % 367.70 73,812 0.60 % 393.84 113,290 0.69 % 356.33 14,265 0.59 % 332.25
glede g lede 400,382 0.62 % 352.85 1 0.20 % 103.02 4,848 0.16 % 122.07 2,830 1.49 % 715.39 176,820 0.59 % 325.80 67,773 0.55 % 361.61 130,230 0.79 % 409.61 17,880 0.74 % 416.45
dovolj d ovolj 400,015 0.62 % 352.53 3 0.60 % 309.06 18,306 0.61 % 460.93 806 0.42 % 203.75 184,495 0.62 % 339.94 90,989 0.74 % 485.49 91,354 0.55 % 287.33 14,062 0.58 % 327.52
spet s pet 394,360 0.61 % 347.55 0 0 % 0 41,209 1.37 % 1,037.60 1,269 0.67 % 320.79 187,447 0.63 % 345.38 68,686 0.56 % 366.49 83,350 0.51 % 262.16 12,399 0.51 % 288.79

1.2.92 List of initial character-level 2-grams from adverb lower-case word forms in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-word_parts-adverbs-lowercase_forms-initial-2grams-taxonomy-entire.tsv

Lower-case word form | Initial part of the word | Rest of the word | Total absolute frequency of lower-case word form | Percentage of all found lower-case word forms | Total relative frequency (per million occurrences) | Absolute frequency [SSJ.T.K.N] | Percentage [SSJ.T.K.N] | Relative frequency [SSJ.T.K.N] | Absolute frequency [SSJ.T.K.L] | Percentage [SSJ.T.K.L] | Relative frequency [SSJ.T.K.L] | Absolute frequency [SSJ.T.D] | Percentage [SSJ.T.D] | Relative frequency [SSJ.T.D] | Absolute frequency [SSJ.T.P.C] | Percentage [SSJ.T.P.C] | Relative frequency [SSJ.T.P.C] | Absolute frequency [SSJ.T.P.R] | Percentage [SSJ.T.P.R] | Relative frequency [SSJ.T.P.R] | Absolute frequency [SSJ.I] | Percentage [SSJ.I] | Relative frequency [SSJ.I] | Absolute frequency [SSJ.T.K.S] | Percentage [SSJ.T.K.S] | Relative frequency [SSJ.T.K.S]
lahko la hko 3,363,470 5.23 % 2,964.21 42 8.45 % 4,326.77 115,009 3.81 % 2,895.80 16,869 8.89 % 4,264.30 1,415,614 4.75 % 2,608.36 764,622 6.19 % 4,079.77 852,311 5.17 % 2,680.74 199,003 8.21 % 4,635.02
tako ta ko 3,289,402 5.12 % 2,898.93 31 6.24 % 3,193.57 157,077 5.21 % 3,955.03 11,067 5.84 % 2,797.62 1,532,883 5.14 % 2,824.44 627,155 5.08 % 3,346.29 830,732 5.04 % 2,612.87 130,457 5.38 % 3,038.50
več ve č 1,895,190 2.95 % 1,670.22 4 0.81 % 412.07 35,879 1.19 % 903.39 3,863 2.04 % 976.52 889,908 2.99 % 1,639.71 302,261 2.45 % 1,612.77 609,107 3.69 % 1,915.80 54,168 2.23 % 1,261.64
nekaj ne kaj 1,478,635 2.30 % 1,303.11 17 3.42 % 1,751.31 75,312 2.50 % 1,896.27 2,638 1.39 % 666.86 706,346 2.37 % 1,301.49 292,467 2.37 % 1,560.51 361,517 2.19 % 1,137.07 40,338 1.66 % 939.52
zelo ze lo 1,441,970 2.24 % 1,270.80 2 0.40 % 206.04 45,177 1.50 % 1,137.51 3,480 1.83 % 879.71 645,584 2.17 % 1,189.53 323,837 2.62 % 1,727.89 366,622 2.22 % 1,153.12 57,268 2.36 % 1,333.84
bolj bo lj 1,262,087 1.96 % 1,112.27 13 2.62 % 1,339.24 42,774 1.42 % 1,077 2,638 1.39 % 666.86 596,351 2.00 % 1,098.82 266,922 2.16 % 1,424.21 297,149 1.80 % 934.61 56,240 2.32 % 1,309.90
zdaj zd aj 1,082,531 1.68 % 954.03 0 0 % 0 71,531 2.37 % 1,801.07 2,616 1.38 % 661.30 531,590 1.78 % 979.49 150,379 1.22 % 802.37 308,065 1.87 % 968.94 18,350 0.76 % 427.39
vedno ve dno 1,075,127 1.67 % 947.50 1 0.20 % 103.02 54,794 1.82 % 1,379.65 2,037 1.07 % 514.93 473,977 1.59 % 873.33 227,298 1.84 % 1,212.79 277,237 1.68 % 871.98 39,783 1.64 % 926.59
kar ka r 1,071,938 1.67 % 944.69 0 0 % 0 39,622 1.31 % 997.64 1,731 0.91 % 437.58 542,754 1.82 % 1,000.06 222,883 1.81 % 1,189.23 244,821 1.48 % 770.03 20,127 0.83 % 468.78
kako ka ko 1,005,454 1.56 % 886.10 2 0.40 % 206.04 90,504 3.00 % 2,278.79 2,958 1.56 % 747.75 424,401 1.42 % 781.99 198,869 1.61 % 1,061.10 237,842 1.44 % 748.08 50,878 2.10 % 1,185.01
veliko ve liko 947,533 1.47 % 835.06 1 0.20 % 103.02 24,065 0.80 % 605.93 1,569 0.83 % 396.63 452,761 1.52 % 834.24 200,571 1.62 % 1,070.18 238,221 1.44 % 749.27 30,345 1.25 % 706.77
danes da nes 905,146 1.41 % 797.70 0 0 % 0 13,186 0.44 % 332.01 2,091 1.10 % 528.58 348,133 1.17 % 641.46 105,919 0.86 % 565.15 416,761 2.53 % 1,310.82 19,056 0.79 % 443.84
potem po tem 814,559 1.27 % 717.87 10 2.01 % 1,030.18 74,531 2.47 % 1,876.61 4,470 2.36 % 1,129.97 357,330 1.20 % 658.40 152,779 1.24 % 815.18 198,520 1.20 % 624.40 26,919 1.11 % 626.98
najbolj na jbolj 780,723 1.22 % 688.05 1 0.20 % 103.02 13,768 0.46 % 346.66 1,191 0.63 % 301.07 377,064 1.26 % 694.77 156,552 1.27 % 835.31 207,953 1.26 % 654.07 24,194 1.00 % 563.51
treba tr eba 715,315 1.11 % 630.40 2 0.40 % 206.04 22,139 0.73 % 557.44 3,300 1.74 % 834.20 348,254 1.17 % 641.68 131,260 1.06 % 700.36 182,173 1.10 % 572.98 28,187 1.16 % 656.51
skupaj sk upaj 614,538 0.96 % 541.59 21 4.22 % 2,163.39 23,547 0.78 % 592.89 1,545 0.81 % 390.56 284,334 0.95 % 523.90 117,286 0.95 % 625.80 170,035 1.03 % 534.80 17,770 0.73 % 413.88
letos le tos 606,763 0.94 % 534.74 0 0 % 0 765 0.03 % 19.26 280 0.15 % 70.78 362,999 1.22 % 668.85 61,361 0.50 % 327.40 180,244 1.09 % 566.91 1,114 0.05 % 25.95
manj ma nj 605,930 0.94 % 534 1 0.20 % 103.02 10,236 0.34 % 257.73 1,249 0.66 % 315.73 293,350 0.98 % 540.52 115,502 0.94 % 616.28 163,127 0.99 % 513.08 22,465 0.93 % 523.24
nato na to 589,423 0.92 % 519.46 52 10.46 % 5,356.96 28,623 0.95 % 720.70 1,075 0.57 % 271.75 236,312 0.79 % 435.42 94,351 0.76 % 503.43 201,657 1.22 % 634.26 27,353 1.13 % 637.08
dobro do bro 576,943 0.90 % 508.46 16 3.22 % 1,648.30 25,215 0.84 % 634.89 975 0.51 % 246.47 268,699 0.90 % 495.10 122,531 0.99 % 653.79 138,841 0.84 % 436.69 20,666 0.85 % 481.34
res re s 565,323 0.88 % 498.22 0 0 % 0 43,525 1.44 % 1,095.91 1,548 0.82 % 391.32 259,768
0.87 % 478.64 119,765 0.97 % 639.03 127,925 0.78 % 402.36 12,792 0.53 % 297.94
lani la ni 505,184 0.79 % 445.22 0 0 % 0 745 0.03 % 18.76 90 0.05 % 22.75 276,175 0.93 % 508.87 38,722 0.31 % 206.61 188,480 1.14 % 592.82 972 0.04 % 22.64
precej pr ecej 492,127 0.77 % 433.71 2 0.40 % 206.04 12,458 0.41 % 313.68 619 0.33 % 156.48 241,123 0.81 % 444.29 101,408 0.82 % 541.08 122,999 0.75 % 386.86 13,518 0.56 % 314.85
takrat ta krat 444,872 0.69 % 392.06 0 0 % 0 20,788 0.69 % 523.42 1,482 0.78 % 374.63 201,503 0.68 % 371.28 88,896 0.72 % 474.32 118,816 0.72 % 373.71 13,387 0.55 % 311.80
tam ta m 430,619 0.67 % 379.50 0 0 % 0 37,490 1.24 % 943.96 1,594 0.84 % 402.95 193,283 0.65 % 356.14 84,620 0.69 % 451.50 98,420 0.60 % 309.56 15,212 0.63 % 354.31
povsem po vsem 426,682 0.66 % 376.03 0 0 % 0 14,047 0.47 % 353.69 572 0.30 % 144.60 206,808 0.69 % 381.06 86,791 0.70 % 463.09 103,440 0.63 % 325.35 15,024 0.62 % 349.93
medtem me dtem 423,755 0.66 % 373.45 6 1.21 % 618.11 18,625 0.62 % 468.96 716 0.38 % 181 185,207 0.62 % 341.26 60,849 0.49 % 324.67 146,062 0.89 % 459.40 12,290 0.51 % 286.25
malo ma lo 419,742 0.65 % 369.92 29 5.83 % 2,987.53 31,436 1.04 % 791.52 1,114 0.59 % 281.61 185,166 0.62 % 341.18 97,544 0.79 % 520.46 89,322 0.54 % 280.94 15,131 0.62 % 352.42
vse vs e 415,803 0.65 % 366.45 0 0 % 0 14,050 0.47 % 353.76 827 0.44 % 209.06 199,559 0.67 % 367.70 73,812 0.60 % 393.84 113,290 0.69 % 356.33 14,265 0.59 % 332.25
glede gl ede 400,382 0.62 % 352.85 1 0.20 % 103.02 4,848 0.16 % 122.07 2,830 1.49 % 715.39 176,820 0.59 % 325.80 67,773 0.55 % 361.61 130,230 0.79 % 409.61 17,880 0.74 % 416.45
dovolj do volj 400,015 0.62 % 352.53 3 0.60 % 309.06 18,306 0.61 % 460.93 806 0.42 % 203.75 184,495 0.62 % 339.94 90,989 0.74 % 485.49 91,354 0.55 % 287.33 14,062 0.58 % 327.52
spet sp et 394,360 0.61 % 347.55 0 0 % 0 41,209 1.37 % 1,037.60 1,269 0.67 % 320.79 187,447 0.63 % 345.38 68,686 0.56 % 366.49 83,350 0.51 % 262.16 12,399 0.51 % 288.79

1.2.93 List of initial character-level 3-grams from adverb lower-case word forms in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-word_parts-adverbs-lowercase_forms-initial-3grams-taxonomy-entire.tsv
Lower-case word form; Initial part of the word; Rest of the word; Total absolute frequency of lower-case word form; Percentage of all found lower-case word forms; Total relative frequency (per million occurrences); Absolute frequency [SSJ.T.K.N]; Percentage [SSJ.T.K.N]; Relative frequency [SSJ.T.K.N]; Absolute frequency [SSJ.T.K.L]; Percentage [SSJ.T.K.L]; Relative frequency [SSJ.T.K.L]; Absolute frequency [SSJ.T.D]; Percentage [SSJ.T.D]; Relative frequency [SSJ.T.D]; Absolute frequency [SSJ.T.P.C]; Percentage [SSJ.T.P.C]; Relative frequency [SSJ.T.P.C]; Absolute frequency [SSJ.T.P.R]; Percentage [SSJ.T.P.R]; Relative frequency [SSJ.T.P.R]; Absolute frequency [SSJ.I]; Percentage [SSJ.I]; Relative frequency [SSJ.I]; Absolute frequency [SSJ.T.K.S]; Percentage [SSJ.T.K.S]; Relative frequency [SSJ.T.K.S]
lahko lah ko 3,363,470 5.26 % 2,964.21 42 8.47 % 4,326.77 115,009 3.84 % 2,895.80 16,869 8.98 % 4,264.30 1,415,614 4.77 % 2,608.36 764,622 6.23 % 4,079.77 852,311 5.19 % 2,680.74 199,003 8.26 % 4,635.02
tako tak o 3,289,402 5.14 % 2,898.93 31 6.25 % 3,193.57 157,077 5.25 % 3,955.03 11,067 5.89 % 2,797.62 1,532,883 5.17 % 2,824.44 627,155 5.11 % 3,346.29 830,732 5.05 % 2,612.87 130,457 5.42 % 3,038.50
več več 1,895,190 2.96 % 1,670.22 4 0.81 % 412.07 35,879 1.20 % 903.39 3,863 2.06 % 976.52 889,908 3.00 % 1,639.71 302,261 2.46 % 1,612.77 609,107 3.71 % 1,915.80 54,168 2.25 % 1,261.64
nekaj nek aj 1,478,635 2.31 % 1,303.11 17 3.43 % 1,751.31 75,312 2.52 % 1,896.27 2,638 1.41 % 666.86 706,346 2.38 % 1,301.49 292,467 2.38 % 1,560.51 361,517 2.20 % 1,137.07 40,338 1.68 % 939.52
zelo zel o 1,441,970 2.25 % 1,270.80 2 0.40 % 206.04 45,177 1.51 % 1,137.51 3,480 1.85 % 879.71 645,584 2.18 % 1,189.53 323,837 2.64 % 1,727.89 366,622
2.23 % 1,153.12 57,268 2.38 % 1,333.84 bolj bol j 1,262,087 1.97 % 1,112.27 13 2.62 % 1,339.24 42,774 1.43 % 1,077 2,638 1.41 % 666.86 596,351 2.01 % 1,098.82 266,922 2.17 % 1,424.21 297,149 1.81 % 934.61 56,240 2.33 % 1,309.90 zdaj zda j 1,082,531 1.69 % 954.03 0 0 % 0 71,531 2.39 % 1,801.07 2,616 1.39 % 661.30 531,590 1.79 % 979.49 150,379 1.23 % 802.37 308,065 1.88 % 968.94 18,350 0.76 % 427.39 vedno ved no 1,075,127 1.68 % 947.50 1 0.20 % 103.02 54,794 1.83 % 1,379.65 2,037 1.08 % 514.93 473,977 1.60 % 873.33 227,298 1.85 % 1,212.79 277,237 1.69 % 871.98 39,783 1.65 % 926.59 kar kar 1,071,938 1.68 % 944.69 0 0 % 0 39,622 1.32 % 997.64 1,731 0.92 % 437.58 542,754 1.83 % 1,000.06 222,883 1.81 % 1,189.23 244,821 1.49 % 770.03 20,127 0.84 % 468.78 kako kak o 1,005,454 1.57 % 886.10 2 0.40 % 206.04 90,504 3.02 % 2,278.79 2,958 1.57 % 747.75 424,401 1.43 % 781.99 198,869 1.62 % 1,061.10 237,842 1.45 % 748.08 50,878 2.11 % 1,185.01 veliko vel iko 947,533 1.48 % 835.06 1 0.20 % 103.02 24,065 0.80 % 605.93 1,569 0.83 % 396.63 452,761 1.53 % 834.24 200,571 1.63 % 1,070.18 238,221 1.45 % 749.27 30,345 1.26 % 706.77 danes dan es 905,146 1.42 % 797.70 0 0 % 0 13,186 0.44 % 332.01 2,091 1.11 % 528.58 348,133 1.17 % 641.46 105,919 0.86 % 565.15 416,761 2.54 % 1,310.82 19,056 0.79 % 443.84 potem pot em 814,559 1.27 % 717.87 10 2.02 % 1,030.18 74,531 2.49 % 1,876.61 4,470 2.38 % 1,129.97 357,330 1.20 % 658.40 152,779 1.24 % 815.18 198,520 1.21 % 624.40 26,919 1.12 % 626.98 najbolj naj bolj 780,723 1.22 % 688.05 1 0.20 % 103.02 13,768 0.46 % 346.66 1,191 0.63 % 301.07 377,064 1.27 % 694.77 156,552 1.27 % 835.31 207,953 1.26 % 654.07 24,194 1.00 % 563.51 treba tre ba 715,315 1.12 % 630.40 2 0.40 % 206.04 22,139 0.74 % 557.44 3,300 1.76 % 834.20 348,254 1.17 % 641.68 131,260 1.07 % 700.36 182,173 1.11 % 572.98 28,187 1.17 % 656.51 skupaj sku paj 614,538 0.96 % 541.59 21 4.23 % 2,163.39 23,547 0.79 % 592.89 1,545 0.82 % 390.56 284,334 0.96 % 523.90 117,286 0.95 % 625.80 170,035 
1.03 % 534.80 17,770 0.74 % 413.88 letos let os 606,763 0.95 % 534.74 0 0 % 0 765 0.03 % 19.26 280 0.15 % 70.78 362,999 1.22 % 668.85 61,361 0.50 % 327.40 180,244 1.10 % 566.91 1,114 0.05 % 25.95 manj man j 605,930 0.95 % 534 1 0.20 % 103.02 10,236 0.34 % 257.73 1,249 0.67 % 315.73 293,350 0.99 % 540.52 115,502 0.94 % 616.28 163,127 0.99 % 513.08 22,465 0.93 % 523.24 nato nat o 589,423 0.92 % 519.46 52 10.48 % 5,356.96 28,623 0.96 % 720.70 1,075 0.57 % 271.75 236,312 0.80 % 435.42 94,351 0.77 % 503.43 201,657 1.23 % 634.26 27,353 1.14 % 637.08 dobro dob ro 576,943 0.90 % 508.46 16 3.23 % 1,648.30 25,215 0.84 % 634.89 975 0.52 % 246.47 268,699 0.91 % 495.10 122,531 1.00 % 653.79 138,841 0.84 % 436.69 20,666 0.86 % 481.34 res res 565,323 0.88 % 498.22 0 0 % 0 43,525 1.45 % 1,095.91 1,548 0.82 % 391.32 259,768 0.88 % 478.64 119,765 0.97 % 639.03 127,925 0.78 % 402.36 12,792 0.53 % 297.94 lani lan i 505,184 0.79 % 445.22 0 0 % 0 745 0.03 % 18.76 90 0.05 % 22.75 276,175 0.93 % 508.87 38,722 0.32 % 206.61 188,480 1.15 % 592.82 972 0.04 % 22.64 precej pre cej 492,127 0.77 % 433.71 2 0.40 % 206.04 12,458 0.42 % 313.68 619 0.33 % 156.48 241,123 0.81 % 444.29 101,408 0.83 % 541.08 122,999 0.75 % 386.86 13,518 0.56 % 314.85 takrat tak rat 444,872 0.69 % 392.06 0 0 % 0 20,788 0.69 % 523.42 1,482 0.79 % 374.63 201,503 0.68 % 371.28 88,896 0.72 % 474.32 118,816 0.72 % 373.71 13,387 0.56 % 311.80 tam tam 430,619 0.67 % 379.50 0 0 % 0 37,490 1.25 % 943.96 1,594 0.85 % 402.95 193,283 0.65 % 356.14 84,620 0.69 % 451.50 98,420 0.60 % 309.56 15,212 0.63 % 354.31 povsem pov sem 426,682 0.67 % 376.03 0 0 % 0 14,047 0.47 % 353.69 572 0.30 % 144.60 206,808 0.70 % 381.06 86,791 0.71 % 463.09 103,440 0.63 % 325.35 15,024 0.62 % 349.93 medtem med tem 423,755 0.66 % 373.45 6 1.21 % 618.11 18,625 0.62 % 468.96 716 0.38 % 181 185,207 0.62 % 341.26 60,849 0.50 % 324.67 146,062 0.89 % 459.40 12,290 0.51 % 286.25 malo mal o 419,742 0.66 % 369.92 29 5.85 % 2,987.53 31,436 1.05 % 791.52 1,114 0.59 % 
281.61 185,166 0.62 % 341.18 97,544 0.79 % 520.46 89,322 0.54 % 280.94 15,131 0.63 % 352.42
vse vse 415,803 0.65 % 366.45 0 0 % 0 14,050 0.47 % 353.76 827 0.44 % 209.06 199,559 0.67 % 367.70 73,812 0.60 % 393.84 113,290 0.69 % 356.33 14,265 0.59 % 332.25
glede gle de 400,382 0.63 % 352.85 1 0.20 % 103.02 4,848 0.16 % 122.07 2,830 1.51 % 715.39 176,820 0.60 % 325.80 67,773 0.55 % 361.61 130,230 0.79 % 409.61 17,880 0.74 % 416.45
dovolj dov olj 400,015 0.62 % 352.53 3 0.60 % 309.06 18,306 0.61 % 460.93 806 0.43 % 203.75 184,495 0.62 % 339.94 90,989 0.74 % 485.49 91,354 0.56 % 287.33 14,062 0.58 % 327.52
spet spe t 394,360 0.62 % 347.55 0 0 % 0 41,209 1.38 % 1,037.60 1,269 0.68 % 320.79 187,447 0.63 % 345.38 68,686 0.56 % 366.49 83,350 0.51 % 262.16 12,399 0.52 % 288.79

1.2.94 List of initial character-level 4-grams from adverb lower-case word forms in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-word_parts-adverbs-lowercase_forms-initial-4grams-taxonomy-entire.tsv
Lower-case word form; Initial part of the word; Rest of the word; Total absolute frequency of lower-case word form; Percentage of all found lower-case word forms; Total relative frequency (per million occurrences); Absolute frequency [SSJ.T.K.N]; Percentage [SSJ.T.K.N]; Relative frequency [SSJ.T.K.N]; Absolute frequency [SSJ.T.K.L]; Percentage [SSJ.T.K.L]; Relative frequency [SSJ.T.K.L]; Absolute frequency [SSJ.T.D]; Percentage [SSJ.T.D]; Relative frequency [SSJ.T.D]; Absolute frequency [SSJ.T.P.C]; Percentage [SSJ.T.P.C]; Relative frequency [SSJ.T.P.C]; Absolute frequency [SSJ.T.P.R]; Percentage [SSJ.T.P.R]; Relative frequency [SSJ.T.P.R]; Absolute frequency [SSJ.I]; Percentage [SSJ.I]; Relative frequency [SSJ.I]; Absolute frequency [SSJ.T.K.S]; Percentage [SSJ.T.K.S]; Relative frequency [SSJ.T.K.S]
lahko lahk o 3,363,470 5.84 % 2,964.21 42 8.80 % 4,326.77 115,009 4.30 % 2,895.80 16,869 9.80 % 4,264.30 1,415,614 5.31 %
2,608.36 764,622 6.91 % 4,079.77 852,311 5.76 % 2,680.74 199,003 8.96 % 4,635.02 tako tako 3,289,402 5.71 % 2,898.93 31 6.50 % 3,193.57 157,077 5.88 % 3,955.03 11,067 6.43 % 2,797.62 1,532,883 5.75 % 2,824.44 627,155 5.67 % 3,346.29 830,732 5.62 % 2,612.87 130,457 5.87 % 3,038.50 nekaj neka j 1,478,635 2.57 % 1,303.11 17 3.56 % 1,751.31 75,312 2.82 % 1,896.27 2,638 1.53 % 666.86 706,346 2.65 % 1,301.49 292,467 2.64 % 1,560.51 361,517 2.44 % 1,137.07 40,338 1.82 % 939.52 zelo zelo 1,441,970 2.50 % 1,270.80 2 0.42 % 206.04 45,177 1.69 % 1,137.51 3,480 2.02 % 879.71 645,584 2.42 % 1,189.53 323,837 2.93 % 1,727.89 366,622 2.48 % 1,153.12 57,268 2.58 % 1,333.84 bolj bolj 1,262,087 2.19 % 1,112.27 13 2.73 % 1,339.24 42,774 1.60 % 1,077 2,638 1.53 % 666.86 596,351 2.24 % 1,098.82 266,922 2.41 % 1,424.21 297,149 2.01 % 934.61 56,240 2.53 % 1,309.90 zdaj zdaj 1,082,531 1.88 % 954.03 0 0 % 0 71,531 2.68 % 1,801.07 2,616 1.52 % 661.30 531,590 2.00 % 979.49 150,379 1.36 % 802.37 308,065 2.08 % 968.94 18,350 0.83 % 427.39 vedno vedn o 1,075,127 1.87 % 947.50 1 0.21 % 103.02 54,794 2.05 % 1,379.65 2,037 1.18 % 514.93 473,977 1.78 % 873.33 227,298 2.05 % 1,212.79 277,237 1.88 % 871.98 39,783 1.79 % 926.59 kako kako 1,005,454 1.75 % 886.10 2 0.42 % 206.04 90,504 3.39 % 2,278.79 2,958 1.72 % 747.75 424,401 1.59 % 781.99 198,869 1.80 % 1,061.10 237,842 1.61 % 748.08 50,878 2.29 % 1,185.01 veliko veli ko 947,533 1.65 % 835.06 1 0.21 % 103.02 24,065 0.90 % 605.93 1,569 0.91 % 396.63 452,761 1.70 % 834.24 200,571 1.81 % 1,070.18 238,221 1.61 % 749.27 30,345 1.37 % 706.77 danes dane s 905,146 1.57 % 797.70 0 0 % 0 13,186 0.49 % 332.01 2,091 1.22 % 528.58 348,133 1.31 % 641.46 105,919 0.96 % 565.15 416,761 2.82 % 1,310.82 19,056 0.86 % 443.84 potem pote m 814,559 1.42 % 717.87 10 2.10 % 1,030.18 74,531 2.79 % 1,876.61 4,470 2.60 % 1,129.97 357,330 1.34 % 658.40 152,779 1.38 % 815.18 198,520 1.34 % 624.40 26,919 1.21 % 626.98 najbolj najb olj 780,723 1.36 % 688.05 1 0.21 % 103.02 13,768 
0.52 % 346.66 1,191 0.69 % 301.07 377,064 1.42 % 694.77 156,552 1.42 % 835.31 207,953 1.41 % 654.07 24,194 1.09 % 563.51 treba treb a 715,315 1.24 % 630.40 2 0.42 % 206.04 22,139 0.83 % 557.44 3,300 1.92 % 834.20 348,254 1.31 % 641.68 131,260 1.19 % 700.36 182,173 1.23 % 572.98 28,187 1.27 % 656.51 skupaj skup aj 614,538 1.07 % 541.59 21 4.40 % 2,163.39 23,547 0.88 % 592.89 1,545 0.90 % 390.56 284,334 1.07 % 523.90 117,286 1.06 % 625.80 170,035 1.15 % 534.80 17,770 0.80 % 413.88 letos leto s 606,763 1.05 % 534.74 0 0 % 0 765 0.03 % 19.26 280 0.16 % 70.78 362,999 1.36 % 668.85 61,361 0.56 % 327.40 180,244 1.22 % 566.91 1,114 0.05 % 25.95 manj manj 605,930 1.05 % 534 1 0.21 % 103.02 10,236 0.38 % 257.73 1,249 0.73 % 315.73 293,350 1.10 % 540.52 115,502 1.04 % 616.28 163,127 1.10 % 513.08 22,465 1.01 % 523.24 nato nato 589,423 1.02 % 519.46 52 10.90 % 5,356.96 28,623 1.07 % 720.70 1,075 0.62 % 271.75 236,312 0.89 % 435.42 94,351 0.85 % 503.43 201,657 1.36 % 634.26 27,353 1.23 % 637.08 dobro dobr o 576,943 1.00 % 508.46 16 3.35 % 1,648.30 25,215 0.94 % 634.89 975 0.57 % 246.47 268,699 1.01 % 495.10 122,531 1.11 % 653.79 138,841 0.94 % 436.69 20,666 0.93 % 481.34 lani lani 505,184 0.88 % 445.22 0 0 % 0 745 0.03 % 18.76 90 0.05 % 22.75 276,175 1.04 % 508.87 38,722 0.35 % 206.61 188,480 1.27 % 592.82 972 0.04 % 22.64 precej prec ej 492,127 0.85 % 433.71 2 0.42 % 206.04 12,458 0.47 % 313.68 619 0.36 % 156.48 241,123 0.91 % 444.29 101,408 0.92 % 541.08 122,999 0.83 % 386.86 13,518 0.61 % 314.85 takrat takr at 444,872 0.77 % 392.06 0 0 % 0 20,788 0.78 % 523.42 1,482 0.86 % 374.63 201,503 0.76 % 371.28 88,896 0.80 % 474.32 118,816 0.80 % 373.71 13,387 0.60 % 311.80 povsem povs em 426,682 0.74 % 376.03 0 0 % 0 14,047 0.53 % 353.69 572 0.33 % 144.60 206,808 0.78 % 381.06 86,791 0.78 % 463.09 103,440 0.70 % 325.35 15,024 0.68 % 349.93 medtem medt em 423,755 0.74 % 373.45 6 1.26 % 618.11 18,625 0.70 % 468.96 716 0.42 % 181 185,207 0.69 % 341.26 60,849 0.55 % 324.67 146,062 0.99 % 
459.40 12,290 0.55 % 286.25
malo malo 419,742 0.73 % 369.92 29 6.08 % 2,987.53 31,436 1.18 % 791.52 1,114 0.65 % 281.61 185,166 0.69 % 341.18 97,544 0.88 % 520.46 89,322 0.60 % 280.94 15,131 0.68 % 352.42
glede gled e 400,382 0.70 % 352.85 1 0.21 % 103.02 4,848 0.18 % 122.07 2,830 1.64 % 715.39 176,820 0.66 % 325.80 67,773 0.61 % 361.61 130,230 0.88 % 409.61 17,880 0.81 % 416.45
dovolj dovo lj 400,015 0.69 % 352.53 3 0.63 % 309.06 18,306 0.69 % 460.93 806 0.47 % 203.75 184,495 0.69 % 339.94 90,989 0.82 % 485.49 91,354 0.62 % 287.33 14,062 0.63 % 327.52
spet spet 394,360 0.69 % 347.55 0 0 % 0 41,209 1.54 % 1,037.60 1,269 0.74 % 320.79 187,447 0.70 % 345.38 68,686 0.62 % 366.49 83,350 0.56 % 262.16 12,399 0.56 % 288.79
prej prej 381,934 0.66 % 336.60 1 0.21 % 103.02 19,064 0.71 % 480.01 1,381 0.80 % 349.10 186,379 0.70 % 343.42 69,611 0.63 % 371.42 93,050 0.63 % 292.67 12,448 0.56 % 289.93
naprej napr ej 377,483 0.66 % 332.67 3 0.63 % 309.06 25,693 0.96 % 646.92 1,567 0.91 % 396.12 169,016 0.63 % 311.42 62,691 0.57 % 334.50 105,317 0.71 % 331.25 13,196 0.59 % 307.35
včeraj včer aj 354,001 0.61 % 311.98 0 0 % 0 3,279 0.12 % 82.56 323 0.19 % 81.65 279,239 1.05 % 514.52 5,457 0.05 % 29.12 65,131 0.44 % 204.85 572 0.03 % 13.32
torej tore j 350,715 0.61 % 309.08 0 0 % 0 13,285 0.50 % 334.50 1,398 0.81 % 353.40 169,569 0.64 % 312.44 78,027 0.70 % 416.33 70,358 0.48 % 221.29 18,078 0.81 % 421.06
najprej najp rej 346,336 0.60 % 305.22 1 0.21 % 103.02 12,918 0.48 % 325.26 1,024 0.59 % 258.86 166,148 0.62 % 306.14 67,446 0.61 % 359.87 83,679 0.57 % 263.19 15,120 0.68 % 352.16

1.2.95 List of initial character-level 5-grams from adverb lower-case word forms in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-word_parts-adverbs-lowercase_forms-initial-5grams-taxonomy-entire.tsv
Lower-case word form; Initial part of the word; Rest of the word; Total absolute frequency of lower-case word form; Percentage of all found lower-case word forms; Total relative frequency (per million occurrences); Absolute frequency [SSJ.T.K.N]; Percentage [SSJ.T.K.N]; Relative frequency [SSJ.T.K.N]; Absolute frequency [SSJ.T.K.L]; Percentage [SSJ.T.K.L]; Relative frequency [SSJ.T.K.L]; Absolute frequency [SSJ.T.D]; Percentage [SSJ.T.D]; Relative frequency [SSJ.T.D]; Absolute frequency [SSJ.T.P.C]; Percentage [SSJ.T.P.C]; Relative frequency [SSJ.T.P.C]; Absolute frequency [SSJ.T.P.R]; Percentage [SSJ.T.P.R]; Relative frequency [SSJ.T.P.R]; Absolute frequency [SSJ.I]; Percentage [SSJ.I]; Relative frequency [SSJ.I]; Absolute frequency [SSJ.T.K.S]; Percentage [SSJ.T.K.S]; Relative frequency [SSJ.T.K.S]
lahko lahko 3,363,470 7.56 % 2,964.21 42 12.35 % 4,326.77 115,009 5.77 % 2,895.80 16,869 12.24 % 4,264.30 1,415,614 6.88 % 2,608.36 764,622 8.95 % 4,079.77 852,311 7.42 % 2,680.74 199,003 11.43 % 4,635.02
nekaj nekaj 1,478,635 3.33 % 1,303.11 17 5.00 % 1,751.31 75,312 3.78 % 1,896.27 2,638 1.91 % 666.86 706,346 3.43 % 1,301.49 292,467 3.42 % 1,560.51 361,517 3.15 % 1,137.07 40,338 2.32 % 939.52
vedno vedno 1,075,127 2.42 % 947.50 1 0.29 % 103.02 54,794 2.75 % 1,379.65 2,037 1.48 % 514.93 473,977 2.30 % 873.33 227,298 2.66 % 1,212.79 277,237 2.41 % 871.98 39,783 2.29 % 926.59
veliko velik o 947,533 2.13 % 835.06 1 0.29 % 103.02 24,065 1.21 % 605.93 1,569 1.14 % 396.63 452,761 2.20 % 834.24 200,571 2.35 % 1,070.18 238,221 2.08 % 749.27 30,345 1.74 % 706.77
danes danes 905,146 2.04 % 797.70 0 0 % 0 13,186 0.66 % 332.01 2,091 1.52 % 528.58 348,133 1.69 % 641.46 105,919 1.24 % 565.15 416,761 3.63 % 1,310.82 19,056 1.09 % 443.84
potem potem 814,559 1.83 % 717.87 10 2.94 % 1,030.18 74,531 3.74 % 1,876.61 4,470 3.24 % 1,129.97 357,330 1.74 % 658.40 152,779 1.79 % 815.18 198,520 1.73 % 624.40 26,919 1.55 % 626.98
najbolj najbo lj 780,723 1.76 % 688.05 1 0.29 % 103.02 13,768 0.69 % 346.66 1,191 0.86 % 301.07 377,064 1.83 % 694.77 156,552 1.83 % 835.31 207,953 1.81 % 654.07 24,194 1.39 % 563.51
treba treba 715,315 1.61 % 630.40 2 0.59 % 206.04 22,139 1.11 % 557.44 3,300 2.39 % 834.20 348,254 1.69 % 641.68 131,260 1.54 % 700.36 182,173 1.59 % 572.98 28,187 1.62 % 656.51 skupaj skupa j 614,538 1.38 % 541.59 21 6.18 % 2,163.39 23,547 1.18 % 592.89 1,545 1.12 % 390.56 284,334 1.38 % 523.90 117,286 1.37 % 625.80 170,035 1.48 % 534.80 17,770 1.02 % 413.88 letos letos 606,763 1.36 % 534.74 0 0 % 0 765 0.04 % 19.26 280 0.20 % 70.78 362,999 1.76 % 668.85 61,361 0.72 % 327.40 180,244 1.57 % 566.91 1,114 0.06 % 25.95 dobro dobro 576,943 1.30 % 508.46 16 4.71 % 1,648.30 25,215 1.26 % 634.89 975 0.71 % 246.47 268,699 1.31 % 495.10 122,531 1.43 % 653.79 138,841 1.21 % 436.69 20,666 1.19 % 481.34 precej prece j 492,127 1.11 % 433.71 2 0.59 % 206.04 12,458 0.62 % 313.68 619 0.45 % 156.48 241,123 1.17 % 444.29 101,408 1.19 % 541.08 122,999 1.07 % 386.86 13,518 0.78 % 314.85 takrat takra t 444,872 1.00 % 392.06 0 0 % 0 20,788 1.04 % 523.42 1,482 1.07 % 374.63 201,503 0.98 % 371.28 88,896 1.04 % 474.32 118,816 1.03 % 373.71 13,387 0.77 % 311.80 povsem povse m 426,682 0.96 % 376.03 0 0 % 0 14,047 0.70 % 353.69 572 0.41 % 144.60 206,808 1.00 % 381.06 86,791 1.02 % 463.09 103,440 0.90 % 325.35 15,024 0.86 % 349.93 medtem medte m 423,755 0.95 % 373.45 6 1.76 % 618.11 18,625 0.93 % 468.96 716 0.52 % 181 185,207 0.90 % 341.26 60,849 0.71 % 324.67 146,062 1.27 % 459.40 12,290 0.71 % 286.25 glede glede 400,382 0.90 % 352.85 1 0.29 % 103.02 4,848 0.24 % 122.07 2,830 2.05 % 715.39 176,820 0.86 % 325.80 67,773 0.79 % 361.61 130,230 1.13 % 409.61 17,880 1.03 % 416.45 dovolj dovol j 400,015 0.90 % 352.53 3 0.88 % 309.06 18,306 0.92 % 460.93 806 0.58 % 203.75 184,495 0.90 % 339.94 90,989 1.06 % 485.49 91,354 0.80 % 287.33 14,062 0.81 % 327.52 naprej napre j 377,483 0.85 % 332.67 3 0.88 % 309.06 25,693 1.29 % 646.92 1,567 1.14 % 396.12 169,016 0.82 % 311.42 62,691 0.73 % 334.50 105,317 0.92 % 331.25 13,196 0.76 % 307.35 včeraj včera j 354,001 0.80 % 311.98 0 0 % 0 3,279 0.16 % 82.56 323 
0.23 % 81.65 279,239 1.36 % 514.52 5,457 0.06 % 29.12 65,131 0.57 % 204.85 572 0.03 % 13.32 torej torej 350,715 0.79 % 309.08 0 0 % 0 13,285 0.67 % 334.50 1,398 1.01 % 353.40 169,569 0.82 % 312.44 78,027 0.91 % 416.33 70,358 0.61 % 221.29 18,078 1.04 % 421.06 najprej najpr ej 346,336 0.78 % 305.22 1 0.29 % 103.02 12,918 0.65 % 325.26 1,024 0.74 % 258.86 166,148 0.81 % 306.14 67,446 0.79 % 359.87 83,679 0.73 % 263.19 15,120 0.87 % 352.16 nikoli nikol i 342,994 0.77 % 302.28 0 0 % 0 35,030 1.76 % 882.02 811 0.59 % 205.01 139,253 0.68 % 256.58 74,199 0.87 % 395.90 80,494 0.70 % 253.17 13,207 0.76 % 307.61 hkrati hkrat i 328,962 0.74 % 289.91 0 0 % 0 6,504 0.33 % 163.76 764 0.55 % 193.13 159,548 0.78 % 293.98 60,655 0.71 % 323.64 85,984 0.75 % 270.44 15,507 0.89 % 361.18 toliko tolik o 324,312 0.73 % 285.81 13 3.82 % 1,339.24 16,859 0.84 % 424.49 1,101 0.80 % 278.32 159,179 0.77 % 293.30 66,980 0.78 % 357.38 69,370 0.60 % 218.19 10,810 0.62 % 251.78 zakaj zakaj 323,691 0.73 % 285.27 0 0 % 0 28,270 1.42 % 711.81 1,138 0.83 % 287.67 143,843 0.70 % 265.04 66,510 0.78 % 354.88 70,625 0.61 % 222.13 13,305 0.76 % 309.89 tokrat tokra t 307,823 0.69 % 271.28 0 0 % 0 6,904 0.35 % 173.84 352 0.26 % 88.98 167,344 0.81 % 308.34 46,175 0.54 % 246.37 85,275 0.74 % 268.21 1,773 0.10 % 41.30 približno pribl ižno 306,100 0.69 % 269.76 17 5.00 % 1,751.31 3,730 0.19 % 93.92 653 0.47 % 165.07 153,334 0.74 % 282.53 55,729 0.65 % 297.35 81,278 0.71 % 255.64 11,359 0.65 % 264.56 nekoliko nekol iko 302,982 0.68 % 267.02 3 0.88 % 309.06 8,364 0.42 % 210.60 527 0.38 % 133.22 143,420 0.70 % 264.26 67,728 0.79 % 361.37 71,679 0.62 % 225.45 11,261 0.65 % 262.28 posebej poseb ej 301,006 0.68 % 265.28 3 0.88 % 309.06 6,382 0.32 % 160.69 1,082 0.79 % 273.52 135,912 0.66 % 250.43 68,691 0.80 % 366.51 75,735 0.66 % 238.21 13,201 0.76 % 307.47 takoj takoj 297,170 0.67 % 261.89 4 1.18 % 412.07 23,961 1.20 % 603.31 1,027 0.74 % 259.61 134,686 0.66 % 248.17 60,428 0.71 % 322.42 65,957 0.57 % 207.45 11,107 
0.64 % 258.70
hitro hitro 284,404 0.64 % 250.64 3 0.88 % 309.06 17,202 0.86 % 433.13 568 0.41 % 143.58 113,054 0.55 % 208.31 66,773 0.78 % 356.28 75,059 0.65 % 236.08 11,745 0.67 % 273.56
največ najve č 283,095 0.64 % 249.49 1 0.29 % 103.02 2,062 0.10 % 51.92 711 0.52 % 179.73 152,774 0.74 % 281.50 41,617 0.49 % 222.05 80,421 0.70 % 252.94 5,509 0.32 % 128.31

1.2.96 List of final character-level 1-grams from adverb lower-case word forms in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-word_parts-adverbs-lowercase_forms-final-1grams-taxonomy-entire.tsv
Lower-case word form; Rest of the word; Final part of the word; Total absolute frequency of lower-case word form; Percentage of all found lower-case word forms; Total relative frequency (per million occurrences); Absolute frequency [SSJ.T.K.N]; Percentage [SSJ.T.K.N]; Relative frequency [SSJ.T.K.N]; Absolute frequency [SSJ.T.K.L]; Percentage [SSJ.T.K.L]; Relative frequency [SSJ.T.K.L]; Absolute frequency [SSJ.T.D]; Percentage [SSJ.T.D]; Relative frequency [SSJ.T.D]; Absolute frequency [SSJ.T.P.C]; Percentage [SSJ.T.P.C]; Relative frequency [SSJ.T.P.C]; Absolute frequency [SSJ.T.P.R]; Percentage [SSJ.T.P.R]; Relative frequency [SSJ.T.P.R]; Absolute frequency [SSJ.I]; Percentage [SSJ.I]; Relative frequency [SSJ.I]; Absolute frequency [SSJ.T.K.S]; Percentage [SSJ.T.K.S]; Relative frequency [SSJ.T.K.S]
lahko lahk o 3,363,470 5.23 % 2,964.21 42 8.45 % 4,326.77 115,009 3.81 % 2,895.80 16,869 8.89 % 4,264.30 1,415,614 4.75 % 2,608.36 764,622 6.19 % 4,079.77 852,311 5.17 % 2,680.74 199,003 8.21 % 4,635.02
tako tak o 3,289,402 5.12 % 2,898.93 31 6.24 % 3,193.57 157,077 5.21 % 3,955.03 11,067 5.84 % 2,797.62 1,532,883 5.14 % 2,824.44 627,155 5.08 % 3,346.29 830,732 5.04 % 2,612.87 130,457 5.38 % 3,038.50
več ve č 1,895,190 2.95 % 1,670.22 4 0.81 % 412.07 35,879 1.19 % 903.39 3,863 2.04 % 976.52 889,908 2.99 % 1,639.71 302,261 2.45 % 1,612.77
609,107 3.69 % 1,915.80 54,168 2.23 % 1,261.64 nekaj neka j 1,478,635 2.30 % 1,303.11 17 3.42 % 1,751.31 75,312 2.50 % 1,896.27 2,638 1.39 % 666.86 706,346 2.37 % 1,301.49 292,467 2.37 % 1,560.51 361,517 2.19 % 1,137.07 40,338 1.66 % 939.52 zelo zel o 1,441,970 2.24 % 1,270.80 2 0.40 % 206.04 45,177 1.50 % 1,137.51 3,480 1.83 % 879.71 645,584 2.17 % 1,189.53 323,837 2.62 % 1,727.89 366,622 2.22 % 1,153.12 57,268 2.36 % 1,333.84 bolj bol j 1,262,087 1.96 % 1,112.27 13 2.62 % 1,339.24 42,774 1.42 % 1,077 2,638 1.39 % 666.86 596,351 2.00 % 1,098.82 266,922 2.16 % 1,424.21 297,149 1.80 % 934.61 56,240 2.32 % 1,309.90 zdaj zda j 1,082,531 1.68 % 954.03 0 0 % 0 71,531 2.37 % 1,801.07 2,616 1.38 % 661.30 531,590 1.78 % 979.49 150,379 1.22 % 802.37 308,065 1.87 % 968.94 18,350 0.76 % 427.39 vedno vedn o 1,075,127 1.67 % 947.50 1 0.20 % 103.02 54,794 1.82 % 1,379.65 2,037 1.07 % 514.93 473,977 1.59 % 873.33 227,298 1.84 % 1,212.79 277,237 1.68 % 871.98 39,783 1.64 % 926.59 kar ka r 1,071,938 1.67 % 944.69 0 0 % 0 39,622 1.31 % 997.64 1,731 0.91 % 437.58 542,754 1.82 % 1,000.06 222,883 1.81 % 1,189.23 244,821 1.48 % 770.03 20,127 0.83 % 468.78 kako kak o 1,005,454 1.56 % 886.10 2 0.40 % 206.04 90,504 3.00 % 2,278.79 2,958 1.56 % 747.75 424,401 1.42 % 781.99 198,869 1.61 % 1,061.10 237,842 1.44 % 748.08 50,878 2.10 % 1,185.01 veliko velik o 947,533 1.47 % 835.06 1 0.20 % 103.02 24,065 0.80 % 605.93 1,569 0.83 % 396.63 452,761 1.52 % 834.24 200,571 1.62 % 1,070.18 238,221 1.44 % 749.27 30,345 1.25 % 706.77 danes dane s 905,146 1.41 % 797.70 0 0 % 0 13,186 0.44 % 332.01 2,091 1.10 % 528.58 348,133 1.17 % 641.46 105,919 0.86 % 565.15 416,761 2.53 % 1,310.82 19,056 0.79 % 443.84 potem pote m 814,559 1.27 % 717.87 10 2.01 % 1,030.18 74,531 2.47 % 1,876.61 4,470 2.36 % 1,129.97 357,330 1.20 % 658.40 152,779 1.24 % 815.18 198,520 1.20 % 624.40 26,919 1.11 % 626.98 najbolj najbol j 780,723 1.22 % 688.05 1 0.20 % 103.02 13,768 0.46 % 346.66 1,191 0.63 % 301.07 377,064 1.26 % 694.77 
156,552 1.27 % 835.31 207,953 1.26 % 654.07 24,194 1.00 % 563.51 treba treb a 715,315 1.11 % 630.40 2 0.40 % 206.04 22,139 0.73 % 557.44 3,300 1.74 % 834.20 348,254 1.17 % 641.68 131,260 1.06 % 700.36 182,173 1.10 % 572.98 28,187 1.16 % 656.51 skupaj skupa j 614,538 0.96 % 541.59 21 4.22 % 2,163.39 23,547 0.78 % 592.89 1,545 0.81 % 390.56 284,334 0.95 % 523.90 117,286 0.95 % 625.80 170,035 1.03 % 534.80 17,770 0.73 % 413.88 letos leto s 606,763 0.94 % 534.74 0 0 % 0 765 0.03 % 19.26 280 0.15 % 70.78 362,999 1.22 % 668.85 61,361 0.50 % 327.40 180,244 1.09 % 566.91 1,114 0.05 % 25.95 manj man j 605,930 0.94 % 534 1 0.20 % 103.02 10,236 0.34 % 257.73 1,249 0.66 % 315.73 293,350 0.98 % 540.52 115,502 0.94 % 616.28 163,127 0.99 % 513.08 22,465 0.93 % 523.24 nato nat o 589,423 0.92 % 519.46 52 10.46 % 5,356.96 28,623 0.95 % 720.70 1,075 0.57 % 271.75 236,312 0.79 % 435.42 94,351 0.76 % 503.43 201,657 1.22 % 634.26 27,353 1.13 % 637.08 dobro dobr o 576,943 0.90 % 508.46 16 3.22 % 1,648.30 25,215 0.84 % 634.89 975 0.51 % 246.47 268,699 0.90 % 495.10 122,531 0.99 % 653.79 138,841 0.84 % 436.69 20,666 0.85 % 481.34 res re s 565,323 0.88 % 498.22 0 0 % 0 43,525 1.44 % 1,095.91 1,548 0.82 % 391.32 259,768 0.87 % 478.64 119,765 0.97 % 639.03 127,925 0.78 % 402.36 12,792 0.53 % 297.94 lani lan i 505,184 0.79 % 445.22 0 0 % 0 745 0.03 % 18.76 90 0.05 % 22.75 276,175 0.93 % 508.87 38,722 0.31 % 206.61 188,480 1.14 % 592.82 972 0.04 % 22.64 precej prece j 492,127 0.77 % 433.71 2 0.40 % 206.04 12,458 0.41 % 313.68 619 0.33 % 156.48 241,123 0.81 % 444.29 101,408 0.82 % 541.08 122,999 0.75 % 386.86 13,518 0.56 % 314.85 takrat takra t 444,872 0.69 % 392.06 0 0 % 0 20,788 0.69 % 523.42 1,482 0.78 % 374.63 201,503 0.68 % 371.28 88,896 0.72 % 474.32 118,816 0.72 % 373.71 13,387 0.55 % 311.80 tam ta m 430,619 0.67 % 379.50 0 0 % 0 37,490 1.24 % 943.96 1,594 0.84 % 402.95 193,283 0.65 % 356.14 84,620 0.69 % 451.50 98,420 0.60 % 309.56 15,212 0.63 % 354.31 povsem povse m 426,682 0.66 % 
376.03 0 0 % 0 14,047 0.47 % 353.69 572 0.30 % 144.60 206,808 0.69 % 381.06 86,791 0.70 % 463.09 103,440 0.63 % 325.35 15,024 0.62 % 349.93
medtem medte m 423,755 0.66 % 373.45 6 1.21 % 618.11 18,625 0.62 % 468.96 716 0.38 % 181 185,207 0.62 % 341.26 60,849 0.49 % 324.67 146,062 0.89 % 459.40 12,290 0.51 % 286.25
malo mal o 419,742 0.65 % 369.92 29 5.83 % 2,987.53 31,436 1.04 % 791.52 1,114 0.59 % 281.61 185,166 0.62 % 341.18 97,544 0.79 % 520.46 89,322 0.54 % 280.94 15,131 0.62 % 352.42
vse vs e 415,803 0.65 % 366.45 0 0 % 0 14,050 0.47 % 353.76 827 0.44 % 209.06 199,559 0.67 % 367.70 73,812 0.60 % 393.84 113,290 0.69 % 356.33 14,265 0.59 % 332.25
glede gled e 400,382 0.62 % 352.85 1 0.20 % 103.02 4,848 0.16 % 122.07 2,830 1.49 % 715.39 176,820 0.59 % 325.80 67,773 0.55 % 361.61 130,230 0.79 % 409.61 17,880 0.74 % 416.45
dovolj dovol j 400,015 0.62 % 352.53 3 0.60 % 309.06 18,306 0.61 % 460.93 806 0.42 % 203.75 184,495 0.62 % 339.94 90,989 0.74 % 485.49 91,354 0.55 % 287.33 14,062 0.58 % 327.52
spet spe t 394,360 0.61 % 347.55 0 0 % 0 41,209 1.37 % 1,037.60 1,269 0.67 % 320.79 187,447 0.63 % 345.38 68,686 0.56 % 366.49 83,350 0.51 % 262.16 12,399 0.51 % 288.79

1.2.97 List of final character-level 2-grams from adverb lower-case word forms in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-word_parts-adverbs-lowercase_forms-final-2grams-taxonomy-entire.tsv
Lower-case word form; Rest of the word; Final part of the word; Total absolute frequency of lower-case word form; Percentage of all found lower-case word forms; Total relative frequency (per million occurrences); Absolute frequency [SSJ.T.K.N]; Percentage [SSJ.T.K.N]; Relative frequency [SSJ.T.K.N]; Absolute frequency [SSJ.T.K.L]; Percentage [SSJ.T.K.L]; Relative frequency [SSJ.T.K.L]; Absolute frequency [SSJ.T.D]; Percentage [SSJ.T.D]; Relative frequency [SSJ.T.D]; Absolute frequency [SSJ.T.P.C]; Percentage [SSJ.T.P.C]; Relative frequency [SSJ.T.P.C]; Absolute frequency [SSJ.T.P.R]; Percentage [SSJ.T.P.R]; Relative frequency [SSJ.T.P.R]; Absolute frequency [SSJ.I]; Percentage [SSJ.I]; Relative frequency [SSJ.I]; Absolute frequency [SSJ.T.K.S]; Percentage [SSJ.T.K.S]; Relative frequency [SSJ.T.K.S]
lahko lah ko 3,363,470 5.23 % 2,964.21 42 8.45 % 4,326.77 115,009 3.81 % 2,895.80 16,869 8.89 % 4,264.30 1,415,614 4.75 % 2,608.36 764,622 6.19 % 4,079.77 852,311 5.17 % 2,680.74 199,003 8.21 % 4,635.02
tako ta ko 3,289,402 5.12 % 2,898.93 31 6.24 % 3,193.57 157,077 5.21 % 3,955.03 11,067 5.84 % 2,797.62 1,532,883 5.14 % 2,824.44 627,155 5.08 % 3,346.29 830,732 5.04 % 2,612.87 130,457 5.38 % 3,038.50
več v eč 1,895,190 2.95 % 1,670.22 4 0.81 % 412.07 35,879 1.19 % 903.39 3,863 2.04 % 976.52 889,908 2.99 % 1,639.71 302,261 2.45 % 1,612.77 609,107 3.69 % 1,915.80 54,168 2.23 % 1,261.64
nekaj nek aj 1,478,635 2.30 % 1,303.11 17 3.42 % 1,751.31 75,312 2.50 % 1,896.27 2,638 1.39 % 666.86 706,346 2.37 % 1,301.49 292,467 2.37 % 1,560.51 361,517 2.19 % 1,137.07 40,338 1.66 % 939.52
zelo ze lo 1,441,970 2.24 % 1,270.80 2 0.40 % 206.04 45,177 1.50 % 1,137.51 3,480 1.83 % 879.71 645,584 2.17 % 1,189.53 323,837 2.62 % 1,727.89 366,622 2.22 % 1,153.12 57,268 2.36 % 1,333.84
bolj bo lj 1,262,087 1.96 % 1,112.27 13 2.62 % 1,339.24 42,774 1.42 % 1,077 2,638 1.39 % 666.86 596,351 2.00 % 1,098.82 266,922 2.16 % 1,424.21 297,149 1.80 % 934.61 56,240 2.32 % 1,309.90
zdaj zd aj 1,082,531 1.68 % 954.03 0 0 % 0 71,531 2.37 % 1,801.07 2,616 1.38 % 661.30 531,590 1.78 % 979.49 150,379 1.22 % 802.37 308,065 1.87 % 968.94 18,350 0.76 % 427.39
vedno ved no 1,075,127 1.67 % 947.50 1 0.20 % 103.02 54,794 1.82 % 1,379.65 2,037 1.07 % 514.93 473,977 1.59 % 873.33 227,298 1.84 % 1,212.79 277,237 1.68 % 871.98 39,783 1.64 % 926.59
kar k ar 1,071,938 1.67 % 944.69 0 0 % 0 39,622 1.31 % 997.64 1,731 0.91 % 437.58 542,754 1.82 % 1,000.06 222,883 1.81 % 1,189.23 244,821 1.48 % 770.03 20,127 0.83 % 468.78
kako ka ko 1,005,454 1.56 %
886.10 2 0.40 % 206.04 90,504 3.00 % 2,278.79 2,958 1.56 % 747.75 424,401 1.42 % 781.99 198,869 1.61 % 1,061.10 237,842 1.44 % 748.08 50,878 2.10 % 1,185.01 veliko veli ko 947,533 1.47 % 835.06 1 0.20 % 103.02 24,065 0.80 % 605.93 1,569 0.83 % 396.63 452,761 1.52 % 834.24 200,571 1.62 % 1,070.18 238,221 1.44 % 749.27 30,345 1.25 % 706.77 danes dan es 905,146 1.41 % 797.70 0 0 % 0 13,186 0.44 % 332.01 2,091 1.10 % 528.58 348,133 1.17 % 641.46 105,919 0.86 % 565.15 416,761 2.53 % 1,310.82 19,056 0.79 % 443.84 potem pot em 814,559 1.27 % 717.87 10 2.01 % 1,030.18 74,531 2.47 % 1,876.61 4,470 2.36 % 1,129.97 357,330 1.20 % 658.40 152,779 1.24 % 815.18 198,520 1.20 % 624.40 26,919 1.11 % 626.98 najbolj najbo lj 780,723 1.22 % 688.05 1 0.20 % 103.02 13,768 0.46 % 346.66 1,191 0.63 % 301.07 377,064 1.26 % 694.77 156,552 1.27 % 835.31 207,953 1.26 % 654.07 24,194 1.00 % 563.51 treba tre ba 715,315 1.11 % 630.40 2 0.40 % 206.04 22,139 0.73 % 557.44 3,300 1.74 % 834.20 348,254 1.17 % 641.68 131,260 1.06 % 700.36 182,173 1.10 % 572.98 28,187 1.16 % 656.51 skupaj skup aj 614,538 0.96 % 541.59 21 4.22 % 2,163.39 23,547 0.78 % 592.89 1,545 0.81 % 390.56 284,334 0.95 % 523.90 117,286 0.95 % 625.80 170,035 1.03 % 534.80 17,770 0.73 % 413.88 letos let os 606,763 0.94 % 534.74 0 0 % 0 765 0.03 % 19.26 280 0.15 % 70.78 362,999 1.22 % 668.85 61,361 0.50 % 327.40 180,244 1.09 % 566.91 1,114 0.05 % 25.95 manj ma nj 605,930 0.94 % 534 1 0.20 % 103.02 10,236 0.34 % 257.73 1,249 0.66 % 315.73 293,350 0.98 % 540.52 115,502 0.94 % 616.28 163,127 0.99 % 513.08 22,465 0.93 % 523.24 nato na to 589,423 0.92 % 519.46 52 10.46 % 5,356.96 28,623 0.95 % 720.70 1,075 0.57 % 271.75 236,312 0.79 % 435.42 94,351 0.76 % 503.43 201,657 1.22 % 634.26 27,353 1.13 % 637.08 dobro dob ro 576,943 0.90 % 508.46 16 3.22 % 1,648.30 25,215 0.84 % 634.89 975 0.51 % 246.47 268,699 0.90 % 495.10 122,531 0.99 % 653.79 138,841 0.84 % 436.69 20,666 0.85 % 481.34 res r es 565,323 0.88 % 498.22 0 0 % 0 43,525 1.44 % 
1,095.91 1,548 0.82 % 391.32 259,768 0.87 % 478.64 119,765 0.97 % 639.03 127,925 0.78 % 402.36 12,792 0.53 % 297.94 lani la ni 505,184 0.79 % 445.22 0 0 % 0 745 0.03 % 18.76 90 0.05 % 22.75 276,175 0.93 % 508.87 38,722 0.31 % 206.61 188,480 1.14 % 592.82 972 0.04 % 22.64 precej prec ej 492,127 0.77 % 433.71 2 0.40 % 206.04 12,458 0.41 % 313.68 619 0.33 % 156.48 241,123 0.81 % 444.29 101,408 0.82 % 541.08 122,999 0.75 % 386.86 13,518 0.56 % 314.85 takrat takr at 444,872 0.69 % 392.06 0 0 % 0 20,788 0.69 % 523.42 1,482 0.78 % 374.63 201,503 0.68 % 371.28 88,896 0.72 % 474.32 118,816 0.72 % 373.71 13,387 0.55 % 311.80 tam t am 430,619 0.67 % 379.50 0 0 % 0 37,490 1.24 % 943.96 1,594 0.84 % 402.95 193,283 0.65 % 356.14 84,620 0.69 % 451.50 98,420 0.60 % 309.56 15,212 0.63 % 354.31 povsem povs em 426,682 0.66 % 376.03 0 0 % 0 14,047 0.47 % 353.69 572 0.30 % 144.60 206,808 0.69 % 381.06 86,791 0.70 % 463.09 103,440 0.63 % 325.35 15,024 0.62 % 349.93 medtem medt em 423,755 0.66 % 373.45 6 1.21 % 618.11 18,625 0.62 % 468.96 716 0.38 % 181 185,207 0.62 % 341.26 60,849 0.49 % 324.67 146,062 0.89 % 459.40 12,290 0.51 % 286.25 malo ma lo 419,742 0.65 % 369.92 29 5.83 % 2,987.53 31,436 1.04 % 791.52 1,114 0.59 % 281.61 185,166 0.62 % 341.18 97,544 0.79 % 520.46 89,322 0.54 % 280.94 15,131 0.62 % 352.42 vse v se 415,803 0.65 % 366.45 0 0 % 0 14,050 0.47 % 353.76 827 0.44 % 209.06 199,559 0.67 % 367.70 73,812 0.60 % 393.84 113,290 0.69 % 356.33 14,265 0.59 % 332.25 glede gle de 400,382 0.62 % 352.85 1 0.20 % 103.02 4,848 0.16 % 122.07 2,830 1.49 % 715.39 176,820 0.59 % 325.80 67,773 0.55 % 361.61 130,230 0.79 % 409.61 17,880 0.74 % 416.45 dovolj dovo lj 400,015 0.62 % 352.53 3 0.60 % 309.06 18,306 0.61 % 460.93 806 0.42 % 203.75 184,495 0.62 % 339.94 90,989 0.74 % 485.49 91,354 0.55 % 287.33 14,062 0.58 % 327.52 spet sp et 394,360 0.61 % 347.55 0 0 % 0 41,209 1.37 % 1,037.60 1,269 0.67 % 320.79 187,447 0.63 % 345.38 68,686 0.56 % 366.49 83,350 0.51 % 262.16 12,399 0.51 % 288.79 
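
The three word columns in the word-part lists above (word form, initial or final part, rest of the word) can be reproduced with plain string slicing. The Python sketch below is an illustration only and is not taken from LIST, which is written in Java; the function names `split_initial` and `split_final` are hypothetical. It assumes that a word form shorter than n characters is left out of the list, which is consistent with the 4-gram and 5-gram lists, where three-letter forms such as "več" and "kar" do not appear.

```python
import unicodedata
from typing import Optional, Tuple


def split_initial(word: str, n: int) -> Optional[Tuple[str, str]]:
    """Return (initial n characters, rest of the word).

    Returns None when the word has fewer than n characters
    (assumption: such forms are excluded from the list).
    """
    # Normalise so that letters such as "č" count as one character.
    word = unicodedata.normalize("NFC", word)
    if n < 1 or len(word) < n:
        return None
    return word[:n], word[n:]


def split_final(word: str, n: int) -> Optional[Tuple[str, str]]:
    """Return (rest of the word, final n characters), in table column order."""
    word = unicodedata.normalize("NFC", word)
    if n < 1 or len(word) < n:
        return None
    return word[:-n], word[-n:]


# Word forms taken from the lists above.
print(split_initial("najbolj", 3))  # ('naj', 'bolj')
print(split_final("več", 2))        # ('v', 'eč')
print(split_initial("več", 4))      # None
```

When n equals the word length, the whole form lands in the n-gram column and the rest is empty, as in the row "več več" of the 3-gram lists.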
1.2.98 List of final character-level 3-grams from adverb lower-case word forms in the Gigafida 2.0 corpus with text-type distribution

File at CLARIN.SI: GF2.0-word_parts-adverbs-lowercase_forms-final-3grams-taxonomy-entire.tsv

Columns: Lower-case word form | Rest of the word | Final part of the word | Total absolute frequency of lower-case word form | Percentage of all found lower-case word forms | Total relative frequency (per million occurrences) | Absolute frequency [SSJ.T.K.N] | Percentage [SSJ.T.K.N] | Relative frequency [SSJ.T.K.N] | Absolute frequency [SSJ.T.K.L] | Percentage [SSJ.T.K.L] | Relative frequency [SSJ.T.K.L] | Absolute frequency [SSJ.T.D] | Percentage [SSJ.T.D] | Relative frequency [SSJ.T.D] | Absolute frequency [SSJ.T.P.C] | Percentage [SSJ.T.P.C] | Relative frequency [SSJ.T.P.C] | Absolute frequency [SSJ.T.P.R] | Percentage [SSJ.T.P.R] | Relative frequency [SSJ.T.P.R] | Absolute frequency [SSJ.I] | Percentage [SSJ.I] | Relative frequency [SSJ.I] | Absolute frequency [SSJ.T.K.S] | Percentage [SSJ.T.K.S] | Relative frequency [SSJ.T.K.S]

lahko la hko 3,363,470 5.26 % 2,964.21 42 8.47 % 4,326.77 115,009 3.84 % 2,895.80 16,869 8.98 % 4,264.30 1,415,614 4.77 % 2,608.36 764,622 6.23 % 4,079.77 852,311 5.19 % 2,680.74 199,003 8.26 % 4,635.02 tako t ako 3,289,402 5.14 % 2,898.93 31 6.25 % 3,193.57 157,077 5.25 % 3,955.03 11,067 5.89 % 2,797.62 1,532,883 5.17 % 2,824.44 627,155 5.11 % 3,346.29 830,732 5.05 % 2,612.87 130,457 5.42 % 3,038.50 več več 1,895,190 2.96 % 1,670.22 4 0.81 % 412.07 35,879 1.20 % 903.39 3,863 2.06 % 976.52 889,908 3.00 % 1,639.71 302,261 2.46 % 1,612.77 609,107 3.71 % 1,915.80 54,168 2.25 % 1,261.64 nekaj ne kaj 1,478,635 2.31 % 1,303.11 17 3.43 % 1,751.31 75,312 2.52 % 1,896.27 2,638 1.41 % 666.86 706,346 2.38 % 1,301.49 292,467 2.38 % 1,560.51 361,517 2.20 % 1,137.07 40,338 1.68 % 939.52 zelo z elo 1,441,970 2.25 % 1,270.80 2 0.40 % 206.04 45,177 1.51 % 1,137.51 3,480 1.85 % 879.71 645,584 2.18 % 1,189.53
323,837 2.64 % 1,727.89 366,622 2.23 % 1,153.12 57,268 2.38 % 1,333.84 bolj b olj 1,262,087 1.97 % 1,112.27 13 2.62 % 1,339.24 42,774 1.43 % 1,077 2,638 1.41 % 666.86 596,351 2.01 % 1,098.82 266,922 2.17 % 1,424.21 297,149 1.81 % 934.61 56,240 2.33 % 1,309.90 zdaj z daj 1,082,531 1.69 % 954.03 0 0 % 0 71,531 2.39 % 1,801.07 2,616 1.39 % 661.30 531,590 1.79 % 979.49 150,379 1.23 % 802.37 308,065 1.88 % 968.94 18,350 0.76 % 427.39 vedno ve dno 1,075,127 1.68 % 947.50 1 0.20 % 103.02 54,794 1.83 % 1,379.65 2,037 1.08 % 514.93 473,977 1.60 % 873.33 227,298 1.85 % 1,212.79 277,237 1.69 % 871.98 39,783 1.65 % 926.59 kar kar 1,071,938 1.68 % 944.69 0 0 % 0 39,622 1.32 % 997.64 1,731 0.92 % 437.58 542,754 1.83 % 1,000.06 222,883 1.81 % 1,189.23 244,821 1.49 % 770.03 20,127 0.84 % 468.78 kako k ako 1,005,454 1.57 % 886.10 2 0.40 % 206.04 90,504 3.02 % 2,278.79 2,958 1.57 % 747.75 424,401 1.43 % 781.99 198,869 1.62 % 1,061.10 237,842 1.45 % 748.08 50,878 2.11 % 1,185.01 veliko vel iko 947,533 1.48 % 835.06 1 0.20 % 103.02 24,065 0.80 % 605.93 1,569 0.83 % 396.63 452,761 1.53 % 834.24 200,571 1.63 % 1,070.18 238,221 1.45 % 749.27 30,345 1.26 % 706.77 danes da nes 905,146 1.42 % 797.70 0 0 % 0 13,186 0.44 % 332.01 2,091 1.11 % 528.58 348,133 1.17 % 641.46 105,919 0.86 % 565.15 416,761 2.54 % 1,310.82 19,056 0.79 % 443.84 potem po tem 814,559 1.27 % 717.87 10 2.02 % 1,030.18 74,531 2.49 % 1,876.61 4,470 2.38 % 1,129.97 357,330 1.20 % 658.40 152,779 1.24 % 815.18 198,520 1.21 % 624.40 26,919 1.12 % 626.98 najbolj najb olj 780,723 1.22 % 688.05 1 0.20 % 103.02 13,768 0.46 % 346.66 1,191 0.63 % 301.07 377,064 1.27 % 694.77 156,552 1.27 % 835.31 207,953 1.26 % 654.07 24,194 1.00 % 563.51 treba tr eba 715,315 1.12 % 630.40 2 0.40 % 206.04 22,139 0.74 % 557.44 3,300 1.76 % 834.20 348,254 1.17 % 641.68 131,260 1.07 % 700.36 182,173 1.11 % 572.98 28,187 1.17 % 656.51 skupaj sku paj 614,538 0.96 % 541.59 21 4.23 % 2,163.39 23,547 0.79 % 592.89 1,545 0.82 % 390.56 284,334 0.96 % 523.90 
117,286 0.95 % 625.80 170,035 1.03 % 534.80 17,770 0.74 % 413.88 letos le tos 606,763 0.95 % 534.74 0 0 % 0 765 0.03 % 19.26 280 0.15 % 70.78 362,999 1.22 % 668.85 61,361 0.50 % 327.40 180,244 1.10 % 566.91 1,114 0.05 % 25.95 manj m anj 605,930 0.95 % 534 1 0.20 % 103.02 10,236 0.34 % 257.73 1,249 0.67 % 315.73 293,350 0.99 % 540.52 115,502 0.94 % 616.28 163,127 0.99 % 513.08 22,465 0.93 % 523.24 nato n ato 589,423 0.92 % 519.46 52 10.48 % 5,356.96 28,623 0.96 % 720.70 1,075 0.57 % 271.75 236,312 0.80 % 435.42 94,351 0.77 % 503.43 201,657 1.23 % 634.26 27,353 1.14 % 637.08 dobro do bro 576,943 0.90 % 508.46 16 3.23 % 1,648.30 25,215 0.84 % 634.89 975 0.52 % 246.47 268,699 0.91 % 495.10 122,531 1.00 % 653.79 138,841 0.84 % 436.69 20,666 0.86 % 481.34 res res 565,323 0.88 % 498.22 0 0 % 0 43,525 1.45 % 1,095.91 1,548 0.82 % 391.32 259,768 0.88 % 478.64 119,765 0.97 % 639.03 127,925 0.78 % 402.36 12,792 0.53 % 297.94 lani l ani 505,184 0.79 % 445.22 0 0 % 0 745 0.03 % 18.76 90 0.05 % 22.75 276,175 0.93 % 508.87 38,722 0.32 % 206.61 188,480 1.15 % 592.82 972 0.04 % 22.64 precej pre cej 492,127 0.77 % 433.71 2 0.40 % 206.04 12,458 0.42 % 313.68 619 0.33 % 156.48 241,123 0.81 % 444.29 101,408 0.83 % 541.08 122,999 0.75 % 386.86 13,518 0.56 % 314.85 takrat tak rat 444,872 0.69 % 392.06 0 0 % 0 20,788 0.69 % 523.42 1,482 0.79 % 374.63 201,503 0.68 % 371.28 88,896 0.72 % 474.32 118,816 0.72 % 373.71 13,387 0.56 % 311.80 tam tam 430,619 0.67 % 379.50 0 0 % 0 37,490 1.25 % 943.96 1,594 0.85 % 402.95 193,283 0.65 % 356.14 84,620 0.69 % 451.50 98,420 0.60 % 309.56 15,212 0.63 % 354.31 povsem pov sem 426,682 0.67 % 376.03 0 0 % 0 14,047 0.47 % 353.69 572 0.30 % 144.60 206,808 0.70 % 381.06 86,791 0.71 % 463.09 103,440 0.63 % 325.35 15,024 0.62 % 349.93 medtem med tem 423,755 0.66 % 373.45 6 1.21 % 618.11 18,625 0.62 % 468.96 716 0.38 % 181 185,207 0.62 % 341.26 60,849 0.50 % 324.67 146,062 0.89 % 459.40 12,290 0.51 % 286.25 malo m alo 419,742 0.66 % 369.92 29 5.85 % 2,987.53 
31,436 1.05 % 791.52 1,114 0.59 % 281.61 185,166 0.62 % 341.18 97,544 0.79 % 520.46 89,322 0.54 % 280.94 15,131 0.63 % 352.42 vse vse 415,803 0.65 % 366.45 0 0 % 0 14,050 0.47 % 353.76 827 0.44 % 209.06 199,559 0.67 % 367.70 73,812 0.60 % 393.84 113,290 0.69 % 356.33 14,265 0.59 % 332.25 glede gl ede 400,382 0.63 % 352.85 1 0.20 % 103.02 4,848 0.16 % 122.07 2,830 1.51 % 715.39 176,820 0.60 % 325.80 67,773 0.55 % 361.61 130,230 0.79 % 409.61 17,880 0.74 % 416.45 dovolj dov olj 400,015 0.62 % 352.53 3 0.60 % 309.06 18,306 0.61 % 460.93 806 0.43 % 203.75 184,495 0.62 % 339.94 90,989 0.74 % 485.49 91,354 0.56 % 287.33 14,062 0.58 % 327.52 spet s pet 394,360 0.62 % 347.55 0 0 % 0 41,209 1.38 % 1,037.60 1,269 0.68 % 320.79 187,447 0.63 % 345.38 68,686 0.56 % 366.49 83,350 0.51 % 262.16 12,399 0.52 % 288.79

1.2.99 List of final character-level 4-grams from adverb lower-case word forms in the Gigafida 2.0 corpus with text-type distribution

File at CLARIN.SI: GF2.0-word_parts-adverbs-lowercase_forms-final-4grams-taxonomy-entire.tsv

Columns: Lower-case word form | Rest of the word | Final part of the word | Total absolute frequency of lower-case word form | Percentage of all found lower-case word forms | Total relative frequency (per million occurrences) | Absolute frequency [SSJ.T.K.N] | Percentage [SSJ.T.K.N] | Relative frequency [SSJ.T.K.N] | Absolute frequency [SSJ.T.K.L] | Percentage [SSJ.T.K.L] | Relative frequency [SSJ.T.K.L] | Absolute frequency [SSJ.T.D] | Percentage [SSJ.T.D] | Relative frequency [SSJ.T.D] | Absolute frequency [SSJ.T.P.C] | Percentage [SSJ.T.P.C] | Relative frequency [SSJ.T.P.C] | Absolute frequency [SSJ.T.P.R] | Percentage [SSJ.T.P.R] | Relative frequency [SSJ.T.P.R] | Absolute frequency [SSJ.I] | Percentage [SSJ.I] | Relative frequency [SSJ.I] | Absolute frequency [SSJ.T.K.S] | Percentage [SSJ.T.K.S] | Relative frequency [SSJ.T.K.S]

lahko l ahko 3,363,470 5.84 % 2,964.21 42 8.80 % 4,326.77 115,009 4.30 % 2,895.80 16,869 9.80
% 4,264.30 1,415,614 5.31 % 2,608.36 764,622 6.91 % 4,079.77 852,311 5.76 % 2,680.74 199,003 8.96 % 4,635.02 tako tako 3,289,402 5.71 % 2,898.93 31 6.50 % 3,193.57 157,077 5.88 % 3,955.03 11,067 6.43 % 2,797.62 1,532,883 5.75 % 2,824.44 627,155 5.67 % 3,346.29 830,732 5.62 % 2,612.87 130,457 5.87 % 3,038.50 nekaj n ekaj 1,478,635 2.57 % 1,303.11 17 3.56 % 1,751.31 75,312 2.82 % 1,896.27 2,638 1.53 % 666.86 706,346 2.65 % 1,301.49 292,467 2.64 % 1,560.51 361,517 2.44 % 1,137.07 40,338 1.82 % 939.52 zelo zelo 1,441,970 2.50 % 1,270.80 2 0.42 % 206.04 45,177 1.69 % 1,137.51 3,480 2.02 % 879.71 645,584 2.42 % 1,189.53 323,837 2.93 % 1,727.89 366,622 2.48 % 1,153.12 57,268 2.58 % 1,333.84 bolj bolj 1,262,087 2.19 % 1,112.27 13 2.73 % 1,339.24 42,774 1.60 % 1,077 2,638 1.53 % 666.86 596,351 2.24 % 1,098.82 266,922 2.41 % 1,424.21 297,149 2.01 % 934.61 56,240 2.53 % 1,309.90 zdaj zdaj 1,082,531 1.88 % 954.03 0 0 % 0 71,531 2.68 % 1,801.07 2,616 1.52 % 661.30 531,590 2.00 % 979.49 150,379 1.36 % 802.37 308,065 2.08 % 968.94 18,350 0.83 % 427.39 vedno v edno 1,075,127 1.87 % 947.50 1 0.21 % 103.02 54,794 2.05 % 1,379.65 2,037 1.18 % 514.93 473,977 1.78 % 873.33 227,298 2.05 % 1,212.79 277,237 1.88 % 871.98 39,783 1.79 % 926.59 kako kako 1,005,454 1.75 % 886.10 2 0.42 % 206.04 90,504 3.39 % 2,278.79 2,958 1.72 % 747.75 424,401 1.59 % 781.99 198,869 1.80 % 1,061.10 237,842 1.61 % 748.08 50,878 2.29 % 1,185.01 veliko ve liko 947,533 1.65 % 835.06 1 0.21 % 103.02 24,065 0.90 % 605.93 1,569 0.91 % 396.63 452,761 1.70 % 834.24 200,571 1.81 % 1,070.18 238,221 1.61 % 749.27 30,345 1.37 % 706.77 danes d anes 905,146 1.57 % 797.70 0 0 % 0 13,186 0.49 % 332.01 2,091 1.22 % 528.58 348,133 1.31 % 641.46 105,919 0.96 % 565.15 416,761 2.82 % 1,310.82 19,056 0.86 % 443.84 potem p otem 814,559 1.42 % 717.87 10 2.10 % 1,030.18 74,531 2.79 % 1,876.61 4,470 2.60 % 1,129.97 357,330 1.34 % 658.40 152,779 1.38 % 815.18 198,520 1.34 % 624.40 26,919 1.21 % 626.98 najbolj naj bolj 780,723 1.36 % 
688.05 1 0.21 % 103.02 13,768 0.52 % 346.66 1,191 0.69 % 301.07 377,064 1.42 % 694.77 156,552 1.42 % 835.31 207,953 1.41 % 654.07 24,194 1.09 % 563.51 treba t reba 715,315 1.24 % 630.40 2 0.42 % 206.04 22,139 0.83 % 557.44 3,300 1.92 % 834.20 348,254 1.31 % 641.68 131,260 1.19 % 700.36 182,173 1.23 % 572.98 28,187 1.27 % 656.51 skupaj sk upaj 614,538 1.07 % 541.59 21 4.40 % 2,163.39 23,547 0.88 % 592.89 1,545 0.90 % 390.56 284,334 1.07 % 523.90 117,286 1.06 % 625.80 170,035 1.15 % 534.80 17,770 0.80 % 413.88 letos l etos 606,763 1.05 % 534.74 0 0 % 0 765 0.03 % 19.26 280 0.16 % 70.78 362,999 1.36 % 668.85 61,361 0.56 % 327.40 180,244 1.22 % 566.91 1,114 0.05 % 25.95 manj manj 605,930 1.05 % 534 1 0.21 % 103.02 10,236 0.38 % 257.73 1,249 0.73 % 315.73 293,350 1.10 % 540.52 115,502 1.04 % 616.28 163,127 1.10 % 513.08 22,465 1.01 % 523.24 nato nato 589,423 1.02 % 519.46 52 10.90 % 5,356.96 28,623 1.07 % 720.70 1,075 0.62 % 271.75 236,312 0.89 % 435.42 94,351 0.85 % 503.43 201,657 1.36 % 634.26 27,353 1.23 % 637.08 dobro d obro 576,943 1.00 % 508.46 16 3.35 % 1,648.30 25,215 0.94 % 634.89 975 0.57 % 246.47 268,699 1.01 % 495.10 122,531 1.11 % 653.79 138,841 0.94 % 436.69 20,666 0.93 % 481.34 lani lani 505,184 0.88 % 445.22 0 0 % 0 745 0.03 % 18.76 90 0.05 % 22.75 276,175 1.04 % 508.87 38,722 0.35 % 206.61 188,480 1.27 % 592.82 972 0.04 % 22.64 precej pr ecej 492,127 0.85 % 433.71 2 0.42 % 206.04 12,458 0.47 % 313.68 619 0.36 % 156.48 241,123 0.91 % 444.29 101,408 0.92 % 541.08 122,999 0.83 % 386.86 13,518 0.61 % 314.85 takrat ta krat 444,872 0.77 % 392.06 0 0 % 0 20,788 0.78 % 523.42 1,482 0.86 % 374.63 201,503 0.76 % 371.28 88,896 0.80 % 474.32 118,816 0.80 % 373.71 13,387 0.60 % 311.80 povsem po vsem 426,682 0.74 % 376.03 0 0 % 0 14,047 0.53 % 353.69 572 0.33 % 144.60 206,808 0.78 % 381.06 86,791 0.78 % 463.09 103,440 0.70 % 325.35 15,024 0.68 % 349.93 medtem me dtem 423,755 0.74 % 373.45 6 1.26 % 618.11 18,625 0.70 % 468.96 716 0.42 % 181 185,207 0.69 % 341.26 
60,849 0.55 % 324.67 146,062 0.99 % 459.40 12,290 0.55 % 286.25 malo malo 419,742 0.73 % 369.92 29 6.08 % 2,987.53 31,436 1.18 % 791.52 1,114 0.65 % 281.61 185,166 0.69 % 341.18 97,544 0.88 % 520.46 89,322 0.60 % 280.94 15,131 0.68 % 352.42 glede g lede 400,382 0.70 % 352.85 1 0.21 % 103.02 4,848 0.18 % 122.07 2,830 1.64 % 715.39 176,820 0.66 % 325.80 67,773 0.61 % 361.61 130,230 0.88 % 409.61 17,880 0.81 % 416.45 dovolj do volj 400,015 0.69 % 352.53 3 0.63 % 309.06 18,306 0.69 % 460.93 806 0.47 % 203.75 184,495 0.69 % 339.94 90,989 0.82 % 485.49 91,354 0.62 % 287.33 14,062 0.63 % 327.52 spet spet 394,360 0.69 % 347.55 0 0 % 0 41,209 1.54 % 1,037.60 1,269 0.74 % 320.79 187,447 0.70 % 345.38 68,686 0.62 % 366.49 83,350 0.56 % 262.16 12,399 0.56 % 288.79 prej prej 381,934 0.66 % 336.60 1 0.21 % 103.02 19,064 0.71 % 480.01 1,381 0.80 % 349.10 186,379 0.70 % 343.42 69,611 0.63 % 371.42 93,050 0.63 % 292.67 12,448 0.56 % 289.93 naprej na prej 377,483 0.66 % 332.67 3 0.63 % 309.06 25,693 0.96 % 646.92 1,567 0.91 % 396.12 169,016 0.63 % 311.42 62,691 0.57 % 334.50 105,317 0.71 % 331.25 13,196 0.59 % 307.35 včeraj vč eraj 354,001 0.61 % 311.98 0 0 % 0 3,279 0.12 % 82.56 323 0.19 % 81.65 279,239 1.05 % 514.52 5,457 0.05 % 29.12 65,131 0.44 % 204.85 572 0.03 % 13.32 torej t orej 350,715 0.61 % 309.08 0 0 % 0 13,285 0.50 % 334.50 1,398 0.81 % 353.40 169,569 0.64 % 312.44 78,027 0.70 % 416.33 70,358 0.48 % 221.29 18,078 0.81 % 421.06 najprej naj prej 346,336 0.60 % 305.22 1 0.21 % 103.02 12,918 0.48 % 325.26 1,024 0.59 % 258.86 166,148 0.62 % 306.14 67,446 0.61 % 359.87 83,679 0.57 % 263.19 15,120 0.68 % 352.16

1.2.100 List of final character-level 5-grams from adverb lower-case word forms in the Gigafida 2.0 corpus with text-type distribution

File at CLARIN.SI: GF2.0-word_parts-adverbs-lowercase_forms-final-5grams-taxonomy-entire.tsv

Columns: Lower-case word form | Rest of the word | Final part of the word | Total absolute frequency of lower-case word form | Percentage of all found lower-case word forms | Total relative frequency (per million occurrences) | Absolute frequency [SSJ.T.K.N] | Percentage [SSJ.T.K.N] | Relative frequency [SSJ.T.K.N] | Absolute frequency [SSJ.T.K.L] | Percentage [SSJ.T.K.L] | Relative frequency [SSJ.T.K.L] | Absolute frequency [SSJ.T.D] | Percentage [SSJ.T.D] | Relative frequency [SSJ.T.D] | Absolute frequency [SSJ.T.P.C] | Percentage [SSJ.T.P.C] | Relative frequency [SSJ.T.P.C] | Absolute frequency [SSJ.T.P.R] | Percentage [SSJ.T.P.R] | Relative frequency [SSJ.T.P.R] | Absolute frequency [SSJ.I] | Percentage [SSJ.I] | Relative frequency [SSJ.I] | Absolute frequency [SSJ.T.K.S] | Percentage [SSJ.T.K.S] | Relative frequency [SSJ.T.K.S]

lahko lahko 3,363,470 7.56 % 2,964.21 42 12.35 % 4,326.77 115,009 5.77 % 2,895.80 16,869 12.24 % 4,264.30 1,415,614 6.88 % 2,608.36 764,622 8.95 % 4,079.77 852,311 7.42 % 2,680.74 199,003 11.43 % 4,635.02 nekaj nekaj 1,478,635 3.33 % 1,303.11 17 5.00 % 1,751.31 75,312 3.78 % 1,896.27 2,638 1.91 % 666.86 706,346 3.43 % 1,301.49 292,467 3.42 % 1,560.51 361,517 3.15 % 1,137.07 40,338 2.32 % 939.52 vedno vedno 1,075,127 2.42 % 947.50 1 0.29 % 103.02 54,794 2.75 % 1,379.65 2,037 1.48 % 514.93 473,977 2.30 % 873.33 227,298 2.66 % 1,212.79 277,237 2.41 % 871.98 39,783 2.29 % 926.59 veliko v eliko 947,533 2.13 % 835.06 1 0.29 % 103.02 24,065 1.21 % 605.93 1,569 1.14 % 396.63 452,761 2.20 % 834.24 200,571 2.35 % 1,070.18 238,221 2.08 % 749.27 30,345 1.74 % 706.77 danes danes 905,146 2.04 % 797.70 0 0 % 0 13,186 0.66 % 332.01 2,091 1.52 % 528.58 348,133 1.69 % 641.46 105,919 1.24 % 565.15 416,761 3.63 % 1,310.82 19,056 1.09 % 443.84 potem potem 814,559 1.83 % 717.87 10 2.94 % 1,030.18 74,531 3.74 % 1,876.61 4,470 3.24 % 1,129.97 357,330 1.74 % 658.40 152,779 1.79 % 815.18 198,520 1.73 % 624.40 26,919 1.55 % 626.98 najbolj na jbolj 780,723 1.76 % 688.05 1 0.29 % 103.02 13,768 0.69 % 346.66 1,191 0.86 % 301.07 377,064 1.83 % 694.77 156,552 1.83 % 835.31 207,953 1.81 %
654.07 24,194 1.39 % 563.51 treba treba 715,315 1.61 % 630.40 2 0.59 % 206.04 22,139 1.11 % 557.44 3,300 2.39 % 834.20 348,254 1.69 % 641.68 131,260 1.54 % 700.36 182,173 1.59 % 572.98 28,187 1.62 % 656.51 skupaj s kupaj 614,538 1.38 % 541.59 21 6.18 % 2,163.39 23,547 1.18 % 592.89 1,545 1.12 % 390.56 284,334 1.38 % 523.90 117,286 1.37 % 625.80 170,035 1.48 % 534.80 17,770 1.02 % 413.88 letos letos 606,763 1.36 % 534.74 0 0 % 0 765 0.04 % 19.26 280 0.20 % 70.78 362,999 1.76 % 668.85 61,361 0.72 % 327.40 180,244 1.57 % 566.91 1,114 0.06 % 25.95 dobro dobro 576,943 1.30 % 508.46 16 4.71 % 1,648.30 25,215 1.26 % 634.89 975 0.71 % 246.47 268,699 1.31 % 495.10 122,531 1.43 % 653.79 138,841 1.21 % 436.69 20,666 1.19 % 481.34 precej p recej 492,127 1.11 % 433.71 2 0.59 % 206.04 12,458 0.62 % 313.68 619 0.45 % 156.48 241,123 1.17 % 444.29 101,408 1.19 % 541.08 122,999 1.07 % 386.86 13,518 0.78 % 314.85 takrat t akrat 444,872 1.00 % 392.06 0 0 % 0 20,788 1.04 % 523.42 1,482 1.07 % 374.63 201,503 0.98 % 371.28 88,896 1.04 % 474.32 118,816 1.03 % 373.71 13,387 0.77 % 311.80 povsem p ovsem 426,682 0.96 % 376.03 0 0 % 0 14,047 0.70 % 353.69 572 0.41 % 144.60 206,808 1.00 % 381.06 86,791 1.02 % 463.09 103,440 0.90 % 325.35 15,024 0.86 % 349.93 medtem m edtem 423,755 0.95 % 373.45 6 1.76 % 618.11 18,625 0.93 % 468.96 716 0.52 % 181 185,207 0.90 % 341.26 60,849 0.71 % 324.67 146,062 1.27 % 459.40 12,290 0.71 % 286.25 glede glede 400,382 0.90 % 352.85 1 0.29 % 103.02 4,848 0.24 % 122.07 2,830 2.05 % 715.39 176,820 0.86 % 325.80 67,773 0.79 % 361.61 130,230 1.13 % 409.61 17,880 1.03 % 416.45 dovolj d ovolj 400,015 0.90 % 352.53 3 0.88 % 309.06 18,306 0.92 % 460.93 806 0.58 % 203.75 184,495 0.90 % 339.94 90,989 1.06 % 485.49 91,354 0.80 % 287.33 14,062 0.81 % 327.52 naprej n aprej 377,483 0.85 % 332.67 3 0.88 % 309.06 25,693 1.29 % 646.92 1,567 1.14 % 396.12 169,016 0.82 % 311.42 62,691 0.73 % 334.50 105,317 0.92 % 331.25 13,196 0.76 % 307.35 včeraj v čeraj 354,001 0.80 % 311.98 0 0 
% 0 3,279 0.16 % 82.56 323 0.23 % 81.65 279,239 1.36 % 514.52 5,457 0.06 % 29.12 65,131 0.57 % 204.85 572 0.03 % 13.32 torej torej 350,715 0.79 % 309.08 0 0 % 0 13,285 0.67 % 334.50 1,398 1.01 % 353.40 169,569 0.82 % 312.44 78,027 0.91 % 416.33 70,358 0.61 % 221.29 18,078 1.04 % 421.06 najprej na jprej 346,336 0.78 % 305.22 1 0.29 % 103.02 12,918 0.65 % 325.26 1,024 0.74 % 258.86 166,148 0.81 % 306.14 67,446 0.79 % 359.87 83,679 0.73 % 263.19 15,120 0.87 % 352.16 nikoli n ikoli 342,994 0.77 % 302.28 0 0 % 0 35,030 1.76 % 882.02 811 0.59 % 205.01 139,253 0.68 % 256.58 74,199 0.87 % 395.90 80,494 0.70 % 253.17 13,207 0.76 % 307.61 hkrati h krati 328,962 0.74 % 289.91 0 0 % 0 6,504 0.33 % 163.76 764 0.55 % 193.13 159,548 0.78 % 293.98 60,655 0.71 % 323.64 85,984 0.75 % 270.44 15,507 0.89 % 361.18 toliko t oliko 324,312 0.73 % 285.81 13 3.82 % 1,339.24 16,859 0.84 % 424.49 1,101 0.80 % 278.32 159,179 0.77 % 293.30 66,980 0.78 % 357.38 69,370 0.60 % 218.19 10,810 0.62 % 251.78 zakaj zakaj 323,691 0.73 % 285.27 0 0 % 0 28,270 1.42 % 711.81 1,138 0.83 % 287.67 143,843 0.70 % 265.04 66,510 0.78 % 354.88 70,625 0.61 % 222.13 13,305 0.76 % 309.89 tokrat t okrat 307,823 0.69 % 271.28 0 0 % 0 6,904 0.35 % 173.84 352 0.26 % 88.98 167,344 0.81 % 308.34 46,175 0.54 % 246.37 85,275 0.74 % 268.21 1,773 0.10 % 41.30 približno prib ližno 306,100 0.69 % 269.76 17 5.00 % 1,751.31 3,730 0.19 % 93.92 653 0.47 % 165.07 153,334 0.74 % 282.53 55,729 0.65 % 297.35 81,278 0.71 % 255.64 11,359 0.65 % 264.56 nekoliko nek oliko 302,982 0.68 % 267.02 3 0.88 % 309.06 8,364 0.42 % 210.60 527 0.38 % 133.22 143,420 0.70 % 264.26 67,728 0.79 % 361.37 71,679 0.62 % 225.45 11,261 0.65 % 262.28 posebej po sebej 301,006 0.68 % 265.28 3 0.88 % 309.06 6,382 0.32 % 160.69 1,082 0.79 % 273.52 135,912 0.66 % 250.43 68,691 0.80 % 366.51 75,735 0.66 % 238.21 13,201 0.76 % 307.47 takoj takoj 297,170 0.67 % 261.89 4 1.18 % 412.07 23,961 1.20 % 603.31 1,027 0.74 % 259.61 134,686 0.66 % 248.17 60,428 0.71 % 322.42 
65,957 0.57 % 207.45 11,107 0.64 % 258.70 hitro hitro 284,404 0.64 % 250.64 3 0.88 % 309.06 17,202 0.86 % 433.13 568 0.41 % 143.58 113,054 0.55 % 208.31 66,773 0.78 % 356.28 75,059 0.65 % 236.08 11,745 0.67 % 273.56 največ n ajveč 283,095 0.64 % 249.49 1 0.29 % 103.02 2,062 0.10 % 51.92 711 0.52 % 179.73 152,774 0.74 % 281.50 41,617 0.49 % 222.05 80,421 0.70 % 252.94 5,509 0.32 % 128.31

1.2.101 List of initial character-level 1-grams from pronoun lemmas in the Gigafida 2.0 corpus with text-type distribution

File at CLARIN.SI: GF2.0-word_parts-pronouns-lemmas-initial-1grams-taxonomy-entire.tsv

Columns: Lemma | Lemma (lower-case) | Initial part of the word | Rest of the word | Total absolute frequency of lemma | Percentage of all found lemmas | Total relative frequency (per million occurrences) | Absolute frequency [SSJ.T.K.N] | Percentage [SSJ.T.K.N] | Relative frequency [SSJ.T.K.N] | Absolute frequency [SSJ.T.K.L] | Percentage [SSJ.T.K.L] | Relative frequency [SSJ.T.K.L] | Absolute frequency [SSJ.T.P.C] | Percentage [SSJ.T.P.C] | Relative frequency [SSJ.T.P.C] | Absolute frequency [SSJ.T.P.R] | Percentage [SSJ.T.P.R] | Relative frequency [SSJ.T.P.R] | Absolute frequency [SSJ.T.K.S] | Percentage [SSJ.T.K.S] | Relative frequency [SSJ.T.K.S] | Absolute frequency [SSJ.T.D] | Percentage [SSJ.T.D] | Relative frequency [SSJ.T.D] | Absolute frequency [SSJ.I] | Percentage [SSJ.I] | Relative frequency [SSJ.I]

se se s e 18,656,426 24.26 % 16,441.81 63 17.45 % 6,490.16 1,227,523 25.39 % 30,907.70 8,331,411 23.93 % 15,351.18 3,444,419 23.78 % 18,378.29 757,814 23.16 % 17,650.39 73,060 23.14 % 18,468.78 4,822,136 25.14 % 15,166.87 on on o n 11,874,012 15.44 % 10,464.51 189 52.35 % 19,470.49 947,214 19.59 % 23,849.82 5,248,082 15.07 % 9,669.94 2,238,961 15.46 % 11,946.36 531,245 16.23 % 12,373.33 38,159 12.09 % 9,646.18 2,870,162 14.96 % 9,027.41 ta ta t a 11,377,542 14.79 % 10,026.97 17 4.71 % 1,751.31 469,391 9.71 % 11,818.76 5,253,309 15.09 % 9,679.57
1,998,988 13.80 % 10,665.94 496,205 15.16 % 11,557.20 74,801 23.69 % 18,908.89 3,084,831 16.08 % 9,702.60 ves ves v es 4,233,029 5.50 % 3,730.55 26 7.20 % 2,678.48 207,705 4.30 % 5,229.79 2,037,592 5.85 % 3,754.40 799,974 5.52 % 4,268.40 152,217 4.65 % 3,545.31 13,137 4.16 % 3,320.89 1,022,378 5.33 % 3,215.64 jaz jaz j az 3,481,582 4.53 % 3,068.30 0 0 % 0 403,401 8.35 % 10,157.20 1,512,005 4.34 % 2,785.97 716,575 4.95 % 3,823.41 118,259 3.61 % 2,754.39 12,057 3.82 % 3,047.88 719,285 3.75 % 2,262.34 svoj svoj s voj 3,453,653 4.49 % 3,043.69 3 0.83 % 309.06 151,218 3.13 % 3,807.51 1,585,205 4.55 % 2,920.85 654,031 4.51 % 3,489.69 155,236 4.74 % 3,615.63 10,998 3.48 % 2,780.18 896,962 4.68 % 2,821.18 kateri kateri k ateri 2,632,017 3.42 % 2,319.58 6 1.66 % 618.11 67,762 1.40 % 1,706.17 1,197,868 3.44 % 2,207.15 448,321 3.10 % 2,392.09 122,313 3.74 % 2,848.82 14,555 4.61 % 3,679.35 781,192 4.07 % 2,457.05 kar kar k ar 1,932,807 2.51 % 1,703.37 1 0.28 % 103.02 73,171 1.51 % 1,842.37 882,447 2.54 % 1,625.97 342,677 2.37 % 1,828.41 69,534 2.12 % 1,619.53 5,556 1.76 % 1,404.50 559,421 2.92 % 1,759.52 njegov njegov n jegov 1,765,365 2.29 % 1,555.81 0 0 % 0 110,706 2.29 % 2,787.46 781,513 2.25 % 1,439.99 274,494 1.90 % 1,464.61 77,991 2.38 % 1,816.50 5,165 1.64 % 1,305.66 515,496 2.69 % 1,621.37 naš naš n aš 1,295,731 1.69 % 1,141.92 0 0 % 0 25,301 0.52 % 637.05 733,564 2.11 % 1,351.64 220,741 1.52 % 1,177.80 44,419 1.36 % 1,034.57 3,839 1.22 % 970.46 267,867 1.40 % 842.51 tisti tisti t isti 1,204,248 1.57 % 1,061.30 1 0.28 % 103.02 85,777 1.77 % 2,159.77 554,134 1.59 % 1,021.03 243,437 1.68 % 1,298.90 53,821 1.65 % 1,253.56 6,616 2.10 % 1,672.45 260,462 1.36 % 819.22 kaj kaj k aj 1,177,693 1.53 % 1,037.89 1 0.28 % 103.02 124,849 2.58 % 3,143.56 508,341 1.46 % 936.65 253,464 1.75 % 1,352.40 47,569 1.45 % 1,107.94 3,588 1.14 % 907.01 239,881 1.25 % 754.49 ti ti t i 1,132,179 1.47 % 997.78 5 1.39 % 515.09 142,326 2.94 % 3,583.61 391,026 1.12 % 720.49 346,431 2.39 % 1,848.44 
60,549 1.85 % 1,410.26 4,357 1.38 % 1,101.40 187,485 0.98 % 589.69 vsak vsak v sak 1,091,923 1.42 % 962.31 8 2.22 % 824.15 44,704 0.93 % 1,125.60 497,898 1.43 % 917.41 237,701 1.64 % 1,268.29 55,860 1.71 % 1,301.05 5,188 1.64 % 1,311.47 250,564 1.31 % 788.09 njihov njihov n jihov 1,022,463 1.33 % 901.09 1 0.28 % 103.02 22,312 0.46 % 561.79 502,914 1.45 % 926.65 168,419 1.16 % 898.63 52,087 1.59 % 1,213.17 4,322 1.37 % 1,092.55 272,408 1.42 % 856.79 njen njen n jen 971,331 1.26 % 856.03 0 0 % 0 81,849 1.69 % 2,060.87 394,090 1.13 % 726.14 181,976 1.26 % 970.96 40,880 1.25 % 952.14 3,157 1.00 % 798.06 269,379 1.40 % 847.27 nekateri nekateri n ekateri 826,585 1.07 % 728.47 2 0.55 % 206.04 11,748 0.24 % 295.80 424,818 1.22 % 782.76 144,522 1.00 % 771.12 41,824 1.28 % 974.13 2,358 0.75 % 596.08 201,313 1.05 % 633.18 kakšen kakšen k akšen 686,530 0.89 % 605.04 2 0.55 % 206.04 38,652 0.80 % 973.22 324,671 0.93 % 598.23 143,530 0.99 % 765.83 22,579 0.69 % 525.89 2,249 0.71 % 568.52 154,847 0.81 % 487.03 takšen takšen t akšen 627,943 0.82 % 553.40 0 0 % 0 19,175 0.40 % 482.81 300,654 0.86 % 553.97 117,208 0.81 % 625.38 21,964 0.67 % 511.57 1,988 0.63 % 502.54 166,954 0.87 % 525.11 moj moj m oj 617,627 0.80 % 544.31 0 0 % 0 92,173 1.91 % 2,320.82 244,505 0.70 % 450.52 127,017 0.88 % 677.72 23,634 0.72 % 550.46 1,908 0.60 % 482.32 128,390 0.67 % 403.82 oba oba o ba 542,160 0.70 % 477.80 17 4.71 % 1,751.31 16,851 0.35 % 424.29 280,698 0.81 % 517.20 81,296 0.56 % 433.77 21,586 0.66 % 502.76 1,356 0.43 % 342.78 140,356 0.73 % 441.46 tak tak t ak 535,614 0.70 % 472.03 0 0 % 0 27,946 0.58 % 703.65 256,456 0.74 % 472.54 111,527 0.77 % 595.07 29,507 0.90 % 687.25 4,605 1.46 % 1,164.09 105,573 0.55 % 332.05 nič nič n ič 475,396 0.62 % 418.96 0 0 % 0 53,076 1.10 % 1,336.40 214,201 0.61 % 394.68 94,691 0.65 % 505.24 16,107 0.49 % 375.15 1,400 0.44 % 353.90 95,921 0.50 % 301.70 kdo kdo k do 456,086 0.59 % 401.95 0 0 % 0 35,045 0.72 % 882.40 222,399 0.64 % 409.78 86,567 0.60 % 461.89 
13,691 0.42 % 318.88 2,132 0.68 % 538.95 96,252 0.50 % 302.74 nek nek n ek 393,758 0.51 % 347.02 0 0 % 0 33,855 0.70 % 852.43 157,860 0.45 % 290.87 79,718 0.55 % 425.35 24,047 0.73 % 560.08 2,404 0.76 % 607.71 95,874 0.50 % 301.55 zame zame z ame 390,470 0.51 % 344.12 0 0 % 0 23,376 0.48 % 588.58 181,725 0.52 % 334.84 68,638 0.47 % 366.23 12,818 0.39 % 298.55 971 0.31 % 245.46 102,942 0.54 % 323.78 noben noben n oben 361,539 0.47 % 318.62 0 0 % 0 24,292 0.50 % 611.65 170,441 0.49 % 314.05 59,200 0.41 % 315.87 13,989 0.43 % 325.82 1,648 0.52 % 416.60 91,969 0.48 % 289.27 vaš vaš v aš 318,812 0.41 % 280.97 0 0 % 0 14,681 0.30 % 369.65 117,509 0.34 % 216.52 104,686 0.72 % 558.57 25,254 0.77 % 588.20 1,057 0.34 % 267.20 55,625 0.29 % 174.96 isti isti i sti 305,329 0.40 % 269.08 5 1.39 % 515.09 14,396 0.30 % 362.48 138,929 0.40 % 255.99 59,259 0.41 % 316.19 18,519 0.57 % 431.33 1,503 0.48 % 379.94 72,718 0.38 % 228.72 nihče nihče n ihče 286,212 0.37 % 252.24 0 0 % 0 26,310 0.54 % 662.46 139,188 0.40 % 256.46 46,911 0.32 % 250.30 7,577 0.23 % 176.48 820 0.26 % 207.29 65,406 0.34 % 205.72 enak enak e nak 255,863 0.33 % 225.49 2 0.55 % 206.04 5,846 0.12 % 147.20 110,141 0.32 % 202.94 51,176 0.35 % 273.06 17,036 0.52 % 396.79 1,377 0.44 % 348.09 70,285 0.37 % 221.06 mnog mnog m nog 247,148 0.32 % 217.81 0 0 % 0 3,585 0.07 % 90.27 128,052 0.37 % 235.94 45,853 0.32 % 244.66 14,872 0.45 % 346.39 768 0.24 % 194.14 54,018 0.28 % 169.90

1.2.102 List of initial character-level 2-grams from pronoun lemmas in the Gigafida 2.0 corpus with text-type distribution

File at CLARIN.SI: GF2.0-word_parts-pronouns-lemmas-initial-2grams-taxonomy-entire.tsv

Columns: Lemma | Lemma (lower-case) | Initial part of the word | Rest of the word | Total absolute frequency of lemma | Percentage of all found lemmas | Total relative frequency (per million occurrences) | Absolute frequency [SSJ.T.K.N] | Percentage [SSJ.T.K.N] | Relative frequency [SSJ.T.K.N] | Absolute frequency [SSJ.T.K.L] | Percentage [SSJ.T.K.L] | Relative frequency [SSJ.T.K.L] | Absolute frequency [SSJ.T.P.C] | Percentage [SSJ.T.P.C] | Relative frequency [SSJ.T.P.C] | Absolute frequency [SSJ.T.P.R] | Percentage [SSJ.T.P.R] | Relative frequency [SSJ.T.P.R] | Absolute frequency [SSJ.T.K.S] | Percentage [SSJ.T.K.S] | Relative frequency [SSJ.T.K.S] | Absolute frequency [SSJ.T.D] | Percentage [SSJ.T.D] | Relative frequency [SSJ.T.D] | Absolute frequency [SSJ.I] | Percentage [SSJ.I] | Relative frequency [SSJ.I]

se se se 18,656,426 24.26 % 16,441.81 63 17.45 % 6,490.16 1,227,523 25.39 % 30,907.70 8,331,411 23.93 % 15,351.18 3,444,419 23.78 % 18,378.29 757,814 23.16 % 17,650.39 73,060 23.14 % 18,468.78 4,822,136 25.14 % 15,166.87 on on on 11,874,012 15.44 % 10,464.51 189 52.35 % 19,470.49 947,214 19.59 % 23,849.82 5,248,082 15.07 % 9,669.94 2,238,961 15.46 % 11,946.36 531,245 16.23 % 12,373.33 38,159 12.09 % 9,646.18 2,870,162 14.96 % 9,027.41 ta ta ta 11,377,542 14.79 % 10,026.97 17 4.71 % 1,751.31 469,391 9.71 % 11,818.76 5,253,309 15.09 % 9,679.57 1,998,988 13.80 % 10,665.94 496,205 15.16 % 11,557.20 74,801 23.69 % 18,908.89 3,084,831 16.08 % 9,702.60 ves ves ve s 4,233,029 5.50 % 3,730.55 26 7.20 % 2,678.48 207,705 4.30 % 5,229.79 2,037,592 5.85 % 3,754.40 799,974 5.52 % 4,268.40 152,217 4.65 % 3,545.31 13,137 4.16 % 3,320.89 1,022,378 5.33 % 3,215.64 jaz jaz ja z 3,481,582 4.53 % 3,068.30 0 0 % 0 403,401 8.35 % 10,157.20 1,512,005 4.34 % 2,785.97 716,575 4.95 % 3,823.41 118,259 3.61 % 2,754.39 12,057 3.82 % 3,047.88 719,285 3.75 % 2,262.34 svoj svoj sv oj 3,453,653 4.49 % 3,043.69 3 0.83 % 309.06 151,218 3.13 % 3,807.51 1,585,205 4.55 % 2,920.85 654,031 4.51 % 3,489.69 155,236 4.74 % 3,615.63 10,998 3.48 % 2,780.18 896,962 4.68 % 2,821.18 kateri kateri ka teri 2,632,017 3.42 % 2,319.58 6 1.66 % 618.11 67,762 1.40 % 1,706.17 1,197,868 3.44 % 2,207.15 448,321 3.10 % 2,392.09 122,313 3.74 % 2,848.82 14,555 4.61 % 3,679.35 781,192 4.07 % 2,457.05 kar kar ka r 1,932,807 2.51 %
1,703.37 1 0.28 % 103.02 73,171 1.51 % 1,842.37 882,447 2.54 % 1,625.97 342,677 2.37 % 1,828.41 69,534 2.12 % 1,619.53 5,556 1.76 % 1,404.50 559,421 2.92 % 1,759.52 njegov njegov nj egov 1,765,365 2.29 % 1,555.81 0 0 % 0 110,706 2.29 % 2,787.46 781,513 2.25 % 1,439.99 274,494 1.90 % 1,464.61 77,991 2.38 % 1,816.50 5,165 1.64 % 1,305.66 515,496 2.69 % 1,621.37 naš naš na š 1,295,731 1.69 % 1,141.92 0 0 % 0 25,301 0.52 % 637.05 733,564 2.11 % 1,351.64 220,741 1.52 % 1,177.80 44,419 1.36 % 1,034.57 3,839 1.22 % 970.46 267,867 1.40 % 842.51 tisti tisti ti sti 1,204,248 1.57 % 1,061.30 1 0.28 % 103.02 85,777 1.77 % 2,159.77 554,134 1.59 % 1,021.03 243,437 1.68 % 1,298.90 53,821 1.65 % 1,253.56 6,616 2.10 % 1,672.45 260,462 1.36 % 819.22 kaj kaj ka j 1,177,693 1.53 % 1,037.89 1 0.28 % 103.02 124,849 2.58 % 3,143.56 508,341 1.46 % 936.65 253,464 1.75 % 1,352.40 47,569 1.45 % 1,107.94 3,588 1.14 % 907.01 239,881 1.25 % 754.49 ti ti ti 1,132,179 1.47 % 997.78 5 1.39 % 515.09 142,326 2.94 % 3,583.61 391,026 1.12 % 720.49 346,431 2.39 % 1,848.44 60,549 1.85 % 1,410.26 4,357 1.38 % 1,101.40 187,485 0.98 % 589.69 vsak vsak vs ak 1,091,923 1.42 % 962.31 8 2.22 % 824.15 44,704 0.93 % 1,125.60 497,898 1.43 % 917.41 237,701 1.64 % 1,268.29 55,860 1.71 % 1,301.05 5,188 1.64 % 1,311.47 250,564 1.31 % 788.09 njihov njihov nj ihov 1,022,463 1.33 % 901.09 1 0.28 % 103.02 22,312 0.46 % 561.79 502,914 1.45 % 926.65 168,419 1.16 % 898.63 52,087 1.59 % 1,213.17 4,322 1.37 % 1,092.55 272,408 1.42 % 856.79 njen njen nj en 971,331 1.26 % 856.03 0 0 % 0 81,849 1.69 % 2,060.87 394,090 1.13 % 726.14 181,976 1.26 % 970.96 40,880 1.25 % 952.14 3,157 1.00 % 798.06 269,379 1.40 % 847.27 nekateri nekateri ne kateri 826,585 1.07 % 728.47 2 0.55 % 206.04 11,748 0.24 % 295.80 424,818 1.22 % 782.76 144,522 1.00 % 771.12 41,824 1.28 % 974.13 2,358 0.75 % 596.08 201,313 1.05 % 633.18 kakšen kakšen ka kšen 686,530 0.89 % 605.04 2 0.55 % 206.04 38,652 0.80 % 973.22 324,671 0.93 % 598.23 143,530 0.99 % 765.83 
22,579 0.69 % 525.89 2,249 0.71 % 568.52 154,847 0.81 % 487.03 takšen takšen ta kšen 627,943 0.82 % 553.40 0 0 % 0 19,175 0.40 % 482.81 300,654 0.86 % 553.97 117,208 0.81 % 625.38 21,964 0.67 % 511.57 1,988 0.63 % 502.54 166,954 0.87 % 525.11 moj moj mo j 617,627 0.80 % 544.31 0 0 % 0 92,173 1.91 % 2,320.82 244,505 0.70 % 450.52 127,017 0.88 % 677.72 23,634 0.72 % 550.46 1,908 0.60 % 482.32 128,390 0.67 % 403.82 oba oba ob a 542,160 0.70 % 477.80 17 4.71 % 1,751.31 16,851 0.35 % 424.29 280,698 0.81 % 517.20 81,296 0.56 % 433.77 21,586 0.66 % 502.76 1,356 0.43 % 342.78 140,356 0.73 % 441.46 tak tak ta k 535,614 0.70 % 472.03 0 0 % 0 27,946 0.58 % 703.65 256,456 0.74 % 472.54 111,527 0.77 % 595.07 29,507 0.90 % 687.25 4,605 1.46 % 1,164.09 105,573 0.55 % 332.05 nič nič ni č 475,396 0.62 % 418.96 0 0 % 0 53,076 1.10 % 1,336.40 214,201 0.61 % 394.68 94,691 0.65 % 505.24 16,107 0.49 % 375.15 1,400 0.44 % 353.90 95,921 0.50 % 301.70 kdo kdo kd o 456,086 0.59 % 401.95 0 0 % 0 35,045 0.72 % 882.40 222,399 0.64 % 409.78 86,567 0.60 % 461.89 13,691 0.42 % 318.88 2,132 0.68 % 538.95 96,252 0.50 % 302.74 nek nek ne k 393,758 0.51 % 347.02 0 0 % 0 33,855 0.70 % 852.43 157,860 0.45 % 290.87 79,718 0.55 % 425.35 24,047 0.73 % 560.08 2,404 0.76 % 607.71 95,874 0.50 % 301.55 zame zame za me 390,470 0.51 % 344.12 0 0 % 0 23,376 0.48 % 588.58 181,725 0.52 % 334.84 68,638 0.47 % 366.23 12,818 0.39 % 298.55 971 0.31 % 245.46 102,942 0.54 % 323.78 noben noben no ben 361,539 0.47 % 318.62 0 0 % 0 24,292 0.50 % 611.65 170,441 0.49 % 314.05 59,200 0.41 % 315.87 13,989 0.43 % 325.82 1,648 0.52 % 416.60 91,969 0.48 % 289.27 vaš vaš va š 318,812 0.41 % 280.97 0 0 % 0 14,681 0.30 % 369.65 117,509 0.34 % 216.52 104,686 0.72 % 558.57 25,254 0.77 % 588.20 1,057 0.34 % 267.20 55,625 0.29 % 174.96 isti isti is ti 305,329 0.40 % 269.08 5 1.39 % 515.09 14,396 0.30 % 362.48 138,929 0.40 % 255.99 59,259 0.41 % 316.19 18,519 0.57 % 431.33 1,503 0.48 % 379.94 72,718 0.38 % 228.72 nihče nihče ni hče 
286,212 0.37 % 252.24 0 0 % 0 26,310 0.54 % 662.46 139,188 0.40 % 256.46 46,911 0.32 % 250.30 7,577 0.23 % 176.48 820 0.26 % 207.29 65,406 0.34 % 205.72 enak enak en ak 255,863 0.33 % 225.49 2 0.55 % 206.04 5,846 0.12 % 147.20 110,141 0.32 % 202.94 51,176 0.35 % 273.06 17,036 0.52 % 396.79 1,377 0.44 % 348.09 70,285 0.37 % 221.06 mnog mnog mn og 247,148 0.32 % 217.81 0 0 % 0 3,585 0.07 % 90.27 128,052 0.37 % 235.94 45,853 0.32 % 244.66 14,872 0.45 % 346.39 768 0.24 % 194.14 54,018 0.28 % 169.90 CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 119 File at CLARIN.SI 1.2.103 List of initial character-level 3-grams from pronoun lemmas in the Gigafida 2.0 corpus with text-type distributionGF2.0-word_parts-pronouns-lemmas-initial- 3grams-taxonomy-entire.tsvLemma Lemma (lower-case) Initial part of the word Rest of the word Total absolute frequency of lemma Percentage of all found lemmas Total relative frequency (per million occurrences) Absolute frequency [SSJ.T.K.N] Percentage [SSJ.T.K.N] Relative frequency [SSJ.T.K.N] Absolute frequency [SSJ.T.K.L] Percentage [SSJ.T.K.L] Relative frequency [SSJ.T.K.L] Absolute frequency [SSJ.T.P.C] Percentage [SSJ.T.P.C] Relative frequency [SSJ.T.P.C] Absolute frequency [SSJ.T.P.R] Percentage [SSJ.T.P.R] Relative frequency [SSJ.T.P.R] Absolute frequency [SSJ.T.K.S] Percentage [SSJ.T.K.S] Relative frequency [SSJ.T.K.S] Absolute frequency [SSJ.T.D] Percentage [SSJ.T.D] Relative frequency [SSJ.T.D] Absolute frequency [SSJ.I] Percentage [SSJ.I] Relative frequency [SSJ.I] ves ves ves 4,233,029 12.50 % 3,730.55 26 29.89 % 2,678.48 207,705 10.15 % 5,229.79 2,037,592 13.07 % 3,754.40 799,974 12.39 % 4,268.40 152,217 10.67 % 3,545.31 13,137 10.48 % 3,320.89 1,022,378 12.45 % 3,215.64 jaz jaz jaz 3,481,582 10.28 % 3,068.30 0 0 % 0 403,401 19.70 % 10,157.20 1,512,005 9.70 % 2,785.97 716,575 11.10 % 3,823.41 118,259 8.29 % 2,754.39 12,057 9.62 % 3,047.88 719,285 8.76 % 2,262.34 svoj svoj svo j 3,453,653 10.20 % 
3,043.69 3 3.45 % 309.06 151,218 7.39 % 3,807.51 1,585,205 10.17 % 2,920.85 654,031 10.13 % 3,489.69 155,236 10.88 % 3,615.63 10,998 8.78 % 2,780.18 896,962 10.92 % 2,821.18 kateri kateri kat eri 2,632,017 7.77 % 2,319.58 6 6.90 % 618.11 67,762 3.31 % 1,706.17 1,197,868 7.68 % 2,207.15 448,321 6.94 % 2,392.09 122,313 8.57 % 2,848.82 14,555 11.62 % 3,679.35 781,192 9.51 % 2,457.05 kar kar kar 1,932,807 5.71 % 1,703.37 1 1.15 % 103.02 73,171 3.57 % 1,842.37 882,447 5.66 % 1,625.97 342,677 5.31 % 1,828.41 69,534 4.87 % 1,619.53 5,556 4.43 % 1,404.50 559,421 6.81 % 1,759.52 njegov njegov nje gov 1,765,365 5.21 % 1,555.81 0 0 % 0 110,706 5.41 % 2,787.46 781,513 5.01 % 1,439.99 274,494 4.25 % 1,464.61 77,991 5.47 % 1,816.50 5,165 4.12 % 1,305.66 515,496 6.28 % 1,621.37 naš naš naš 1,295,731 3.83 % 1,141.92 0 0 % 0 25,301 1.24 % 637.05 733,564 4.71 % 1,351.64 220,741 3.42 % 1,177.80 44,419 3.11 % 1,034.57 3,839 3.06 % 970.46 267,867 3.26 % 842.51 tisti tisti tis ti 1,204,248 3.56 % 1,061.30 1 1.15 % 103.02 85,777 4.19 % 2,159.77 554,134 3.56 % 1,021.03 243,437 3.77 % 1,298.90 53,821 3.77 % 1,253.56 6,616 5.28 % 1,672.45 260,462 3.17 % 819.22 kaj kaj kaj 1,177,693 3.48 % 1,037.89 1 1.15 % 103.02 124,849 6.10 % 3,143.56 508,341 3.26 % 936.65 253,464 3.92 % 1,352.40 47,569 3.33 % 1,107.94 3,588 2.86 % 907.01 239,881 2.92 % 754.49 vsak vsak vsa k 1,091,923 3.23 % 962.31 8 9.20 % 824.15 44,704 2.18 % 1,125.60 497,898 3.19 % 917.41 237,701 3.68 % 1,268.29 55,860 3.92 % 1,301.05 5,188 4.14 % 1,311.47 250,564 3.05 % 788.09 njihov njihov nji hov 1,022,463 3.02 % 901.09 1 1.15 % 103.02 22,312 1.09 % 561.79 502,914 3.23 % 926.65 168,419 2.61 % 898.63 52,087 3.65 % 1,213.17 4,322 3.45 % 1,092.55 272,408 3.32 % 856.79 njen njen nje n 971,331 2.87 % 856.03 0 0 % 0 81,849 4.00 % 2,060.87 394,090 2.53 % 726.14 181,976 2.82 % 970.96 40,880 2.87 % 952.14 3,157 2.52 % 798.06 269,379 3.28 % 847.27 nekateri nekateri nek ateri 826,585 2.44 % 728.47 2 2.30 % 206.04 11,748 0.57 % 295.80 424,818 
2.73 % 782.76 144,522 2.24 % 771.12 41,824 2.93 % 974.13 2,358 1.88 % 596.08 201,313 2.45 % 633.18 kakšen kakšen kak šen 686,530 2.03 % 605.04 2 2.30 % 206.04 38,652 1.89 % 973.22 324,671 2.08 % 598.23 143,530 2.22 % 765.83 22,579 1.58 % 525.89 2,249 1.79 % 568.52 154,847 1.89 % 487.03 takšen takšen tak šen 627,943 1.85 % 553.40 0 0 % 0 19,175 0.94 % 482.81 300,654 1.93 % 553.97 117,208 1.81 % 625.38 21,964 1.54 % 511.57 1,988 1.59 % 502.54 166,954 2.03 % 525.11 moj moj moj 617,627 1.82 % 544.31 0 0 % 0 92,173 4.50 % 2,320.82 244,505 1.57 % 450.52 127,017 1.97 % 677.72 23,634 1.66 % 550.46 1,908 1.52 % 482.32 128,390 1.56 % 403.82 oba oba oba 542,160 1.60 % 477.80 17 19.54 % 1,751.31 16,851 0.82 % 424.29 280,698 1.80 % 517.20 81,296 1.26 % 433.77 21,586 1.51 % 502.76 1,356 1.08 % 342.78 140,356 1.71 % 441.46 tak tak tak 535,614 1.58 % 472.03 0 0 % 0 27,946 1.36 % 703.65 256,456 1.65 % 472.54 111,527 1.73 % 595.07 29,507 2.07 % 687.25 4,605 3.67 % 1,164.09 105,573 1.28 % 332.05 nič nič nič 475,396 1.40 % 418.96 0 0 % 0 53,076 2.59 % 1,336.40 214,201 1.37 % 394.68 94,691 1.47 % 505.24 16,107 1.13 % 375.15 1,400 1.12 % 353.90 95,921 1.17 % 301.70 kdo kdo kdo 456,086 1.35 % 401.95 0 0 % 0 35,045 1.71 % 882.40 222,399 1.43 % 409.78 86,567 1.34 % 461.89 13,691 0.96 % 318.88 2,132 1.70 % 538.95 96,252 1.17 % 302.74 nek nek nek 393,758 1.16 % 347.02 0 0 % 0 33,855 1.65 % 852.43 157,860 1.01 % 290.87 79,718 1.24 % 425.35 24,047 1.69 % 560.08 2,404 1.92 % 607.71 95,874 1.17 % 301.55 zame zame zam e 390,470 1.15 % 344.12 0 0 % 0 23,376 1.14 % 588.58 181,725 1.17 % 334.84 68,638 1.06 % 366.23 12,818 0.90 % 298.55 971 0.78 % 245.46 102,942 1.25 % 323.78 noben noben nob en 361,539 1.07 % 318.62 0 0 % 0 24,292 1.19 % 611.65 170,441 1.09 % 314.05 59,200 0.92 % 315.87 13,989 0.98 % 325.82 1,648 1.31 % 416.60 91,969 1.12 % 289.27 vaš vaš vaš 318,812 0.94 % 280.97 0 0 % 0 14,681 0.72 % 369.65 117,509 0.75 % 216.52 104,686 1.62 % 558.57 25,254 1.77 % 588.20 1,057 0.84 % 267.20 55,625 
0.68 % 174.96 isti isti ist i 305,329 0.90 % 269.08 5 5.75 % 515.09 14,396 0.70 % 362.48 138,929 0.89 % 255.99 59,259 0.92 % 316.19 18,519 1.30 % 431.33 1,503 1.20 % 379.94 72,718 0.89 % 228.72 nihče nihče nih če 286,212 0.84 % 252.24 0 0 % 0 26,310 1.28 % 662.46 139,188 0.89 % 256.46 46,911 0.73 % 250.30 7,577 0.53 % 176.48 820 0.65 % 207.29 65,406 0.80 % 205.72 enak enak ena k 255,863 0.76 % 225.49 2 2.30 % 206.04 5,846 0.29 % 147.20 110,141 0.71 % 202.94 51,176 0.79 % 273.06 17,036 1.19 % 396.79 1,377 1.10 % 348.09 70,285 0.86 % 221.06 mnog mnog mno g 247,148 0.73 % 217.81 0 0 % 0 3,585 0.17 % 90.27 128,052 0.82 % 235.94 45,853 0.71 % 244.66 14,872 1.04 % 346.39 768 0.61 % 194.14 54,018 0.66 % 169.90 nekaj nekaj nek aj 229,505 0.68 % 202.26 0 0 % 0 22,120 1.08 % 556.96 92,539 0.59 % 170.51 49,750 0.77 % 265.45 10,825 0.76 % 252.13 747 0.60 % 188.83 53,524 0.65 % 168.35 nekdo nekdo nek do 185,672 0.55 % 163.63 0 0 % 0 18,069 0.88 % 454.96 80,714 0.52 % 148.72 34,070 0.53 % 181.79 6,740 0.47 % 156.98 575 0.46 % 145.35 45,504 0.55 % 143.12 name name nam e 156,828 0.46 % 138.21 0 0 % 0 16,570 0.81 % 417.21 63,497 0.41 % 117 32,970 0.51 % 175.92 8,779 0.61 % 204.47 434 0.35 % 109.71 34,578 0.42 % 108.76 njun njun nju n 142,046 0.42 % 125.18 1 1.15 % 103.02 10,343 0.51 % 260.43 55,733 0.36 % 102.69 29,091 0.45 % 155.22 5,299 0.37 % 123.42 311 0.25 % 78.62 41,268 0.50 % 129.80 CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 120 File at CLARIN.SI 1.2.104 List of initial character-level 4-grams from pronoun lemmas in the Gigafida 2.0 corpus with text-type distributionGF2.0-word_parts-pronouns-lemmas-initial- 4grams-taxonomy-entire.tsvLemma Lemma (lower-case) Initial part of the word Rest of the word Total absolute frequency of lemma Percentage of all found lemmas Total relative frequency (per million occurrences) Absolute frequency [SSJ.T.K.N] Percentage [SSJ.T.K.N] Relative frequency [SSJ.T.K.N] Absolute frequency [SSJ.T.K.L] Percentage 
[SSJ.T.K.L] Relative frequency [SSJ.T.K.L] Absolute frequency [SSJ.T.P.C] Percentage [SSJ.T.P.C] Relative frequency [SSJ.T.P.C] Absolute frequency [SSJ.T.P.R] Percentage [SSJ.T.P.R] Relative frequency [SSJ.T.P.R] Absolute frequency [SSJ.T.K.S] Percentage [SSJ.T.K.S] Relative frequency [SSJ.T.K.S] Absolute frequency [SSJ.T.D] Percentage [SSJ.T.D] Relative frequency [SSJ.T.D] Absolute frequency [SSJ.I] Percentage [SSJ.I] Relative frequency [SSJ.I] svoj svoj svoj 3,453,653 18.95 % 3,043.69 3 7.89 % 309.06 151,218 16.39 % 3,807.51 1,585,205 18.99 % 2,920.85 654,031 19.27 % 3,489.69 155,236 18.67 % 3,615.63 10,998 15.41 % 2,780.18 896,962 19.23 % 2,821.18 kateri kateri kate ri 2,632,017 14.44 % 2,319.58 6 15.79 % 618.11 67,762 7.34 % 1,706.17 1,197,868 14.35 % 2,207.15 448,321 13.21 % 2,392.09 122,313 14.71 % 2,848.82 14,555 20.40 % 3,679.35 781,192 16.75 % 2,457.05 njegov njegov njeg ov 1,765,365 9.68 % 1,555.81 0 0 % 0 110,706 12.00 % 2,787.46 781,513 9.36 % 1,439.99 274,494 8.09 % 1,464.61 77,991 9.38 % 1,816.50 5,165 7.24 % 1,305.66 515,496 11.05 % 1,621.37 tisti tisti tist i 1,204,248 6.61 % 1,061.30 1 2.63 % 103.02 85,777 9.30 % 2,159.77 554,134 6.64 % 1,021.03 243,437 7.17 % 1,298.90 53,821 6.47 % 1,253.56 6,616 9.27 % 1,672.45 260,462 5.58 % 819.22 vsak vsak vsak 1,091,923 5.99 % 962.31 8 21.05 % 824.15 44,704 4.85 % 1,125.60 497,898 5.96 % 917.41 237,701 7.00 % 1,268.29 55,860 6.72 % 1,301.05 5,188 7.27 % 1,311.47 250,564 5.37 % 788.09 njihov njihov njih ov 1,022,463 5.61 % 901.09 1 2.63 % 103.02 22,312 2.42 % 561.79 502,914 6.03 % 926.65 168,419 4.96 % 898.63 52,087 6.26 % 1,213.17 4,322 6.06 % 1,092.55 272,408 5.84 % 856.79 njen njen njen 971,331 5.33 % 856.03 0 0 % 0 81,849 8.87 % 2,060.87 394,090 4.72 % 726.14 181,976 5.36 % 970.96 40,880 4.92 % 952.14 3,157 4.42 % 798.06 269,379 5.78 % 847.27 nekateri nekateri neka teri 826,585 4.53 % 728.47 2 5.26 % 206.04 11,748 1.27 % 295.80 424,818 5.09 % 782.76 144,522 4.26 % 771.12 41,824 5.03 % 974.13 2,358 3.31 % 
596.08 201,313 4.32 % 633.18 kakšen kakšen kakš en 686,530 3.77 % 605.04 2 5.26 % 206.04 38,652 4.19 % 973.22 324,671 3.89 % 598.23 143,530 4.23 % 765.83 22,579 2.71 % 525.89 2,249 3.15 % 568.52 154,847 3.32 % 487.03 takšen takšen takš en 627,943 3.44 % 553.40 0 0 % 0 19,175 2.08 % 482.81 300,654 3.60 % 553.97 117,208 3.45 % 625.38 21,964 2.64 % 511.57 1,988 2.79 % 502.54 166,954 3.58 % 525.11 zame zame zame 390,470 2.14 % 344.12 0 0 % 0 23,376 2.53 % 588.58 181,725 2.18 % 334.84 68,638 2.02 % 366.23 12,818 1.54 % 298.55 971 1.36 % 245.46 102,942 2.21 % 323.78 noben noben nobe n 361,539 1.98 % 318.62 0 0 % 0 24,292 2.63 % 611.65 170,441 2.04 % 314.05 59,200 1.74 % 315.87 13,989 1.68 % 325.82 1,648 2.31 % 416.60 91,969 1.97 % 289.27 isti isti isti 305,329 1.68 % 269.08 5 13.16 % 515.09 14,396 1.56 % 362.48 138,929 1.66 % 255.99 59,259 1.75 % 316.19 18,519 2.23 % 431.33 1,503 2.11 % 379.94 72,718 1.56 % 228.72 nihče nihče nihč e 286,212 1.57 % 252.24 0 0 % 0 26,310 2.85 % 662.46 139,188 1.67 % 256.46 46,911 1.38 % 250.30 7,577 0.91 % 176.48 820 1.15 % 207.29 65,406 1.40 % 205.72 enak enak enak 255,863 1.40 % 225.49 2 5.26 % 206.04 5,846 0.63 % 147.20 110,141 1.32 % 202.94 51,176 1.51 % 273.06 17,036 2.05 % 396.79 1,377 1.93 % 348.09 70,285 1.51 % 221.06 mnog mnog mnog 247,148 1.36 % 217.81 0 0 % 0 3,585 0.39 % 90.27 128,052 1.53 % 235.94 45,853 1.35 % 244.66 14,872 1.79 % 346.39 768 1.08 % 194.14 54,018 1.16 % 169.90 nekaj nekaj neka j 229,505 1.26 % 202.26 0 0 % 0 22,120 2.40 % 556.96 92,539 1.11 % 170.51 49,750 1.47 % 265.45 10,825 1.30 % 252.13 747 1.05 % 188.83 53,524 1.15 % 168.35 nekdo nekdo nekd o 185,672 1.02 % 163.63 0 0 % 0 18,069 1.96 % 454.96 80,714 0.97 % 148.72 34,070 1.00 % 181.79 6,740 0.81 % 156.98 575 0.81 % 145.35 45,504 0.98 % 143.12 name name name 156,828 0.86 % 138.21 0 0 % 0 16,570 1.80 % 417.21 63,497 0.76 % 117 32,970 0.97 % 175.92 8,779 1.06 % 204.47 434 0.61 % 109.71 34,578 0.74 % 108.76 njun njun njun 142,046 0.78 % 125.18 1 2.63 % 103.02 
10,343 1.12 % 260.43 55,733 0.67 % 102.69 29,091 0.86 % 155.22 5,299 0.64 % 123.42 311 0.44 % 78.62 41,268 0.89 % 129.80 kakršen kakršen kakr šen 138,508 0.76 % 122.07 0 0 % 0 7,383 0.80 % 185.90 70,116 0.84 % 129.19 22,766 0.67 % 121.47 8,270 0.99 % 192.62 821 1.15 % 207.54 29,152 0.62 % 91.69 nekakšen nekakšen neka kšen 116,986 0.64 % 103.10 1 2.63 % 103.02 9,747 1.06 % 245.42 57,488 0.69 % 105.93 24,921 0.73 % 132.97 5,483 0.66 % 127.71 317 0.44 % 80.13 19,029 0.41 % 59.85 vame vame vame 107,947 0.59 % 95.13 5 13.16 % 515.09 15,980 1.73 % 402.36 43,124 0.52 % 79.46 22,753 0.67 % 121.40 5,833 0.70 % 135.86 258 0.36 % 65.22 19,994 0.43 % 62.89 tvoj tvoj tvoj 78,633 0.43 % 69.30 0 0 % 0 22,886 2.48 % 576.24 18,339 0.22 % 33.79 21,258 0.63 % 113.43 5,813 0.70 % 135.39 163 0.23 % 41.20 10,174 0.22 % 32 marsikaj marsikaj mars ikaj 77,371 0.42 % 68.19 0 0 % 0 2,169 0.23 % 54.61 42,346 0.51 % 78.03 15,514 0.46 % 82.78 1,924 0.23 % 44.81 202 0.28 % 51.06 15,216 0.33 % 47.86 zase zase zase 68,312 0.38 % 60.20 0 0 % 0 3,970 0.43 % 99.96 28,617 0.34 % 52.73 16,957 0.50 % 90.48 3,056 0.37 % 71.18 200 0.28 % 50.56 15,512 0.33 % 48.79 kdor kdor kdor 66,805 0.37 % 58.87 0 0 % 0 3,328 0.36 % 83.80 33,331 0.40 % 61.41 13,083 0.39 % 69.81 4,392 0.53 % 102.29 674 0.94 % 170.38 11,997 0.26 % 37.73 vsakdo vsakdo vsak do 55,960 0.31 % 49.32 0 0 % 0 2,938 0.32 % 73.98 25,300 0.30 % 46.62 12,495 0.37 % 66.67 3,653 0.44 % 85.08 226 0.32 % 57.13 11,348 0.24 % 35.69 marsikateri marsikateri mars ikateri 53,517 0.29 % 47.16 0 0 % 0 654 0.07 % 16.47 27,013 0.32 % 49.77 13,784 0.41 % 73.55 1,520 0.18 % 35.40 106 0.15 % 26.80 10,440 0.22 % 32.84 marsikdo marsikdo mars ikdo 52,605 0.29 % 46.36 0 0 % 0 528 0.06 % 13.29 26,799 0.32 % 49.38 12,411 0.37 % 66.22 930 0.11 % 21.66 74 0.10 % 18.71 11,863 0.25 % 37.31 tale tale tale 44,476 0.24 % 39.20 0 0 % 0 9,227 1.00 % 232.33 13,979 0.17 % 25.76 13,760 0.41 % 73.42 1,955 0.23 % 45.53 351 0.49 % 88.73 5,204 0.11 % 16.37 nikakršen nikakršen nika kršen 
44,197 0.24 % 38.95 1 2.63 % 103.02 1,372 0.15 % 34.55 23,681 0.28 % 43.63 7,398 0.22 % 39.47 1,764 0.21 % 41.09 123 0.17 % 31.09 9,858 0.21 % 31.01 CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 121 File at CLARIN.SI 1.2.105 List of initial character-level 5-grams from pronoun lemmas in the Gigafida 2.0 corpus with text-type distributionGF2.0-word_parts-pronouns-lemmas-initial- 5grams-taxonomy-entire.tsvLemma Lemma (lower-case) Initial part of the word Rest of the word Total absolute frequency of lemma Percentage of all found lemmas Total relative frequency (per million occurrences) Absolute frequency [SSJ.T.K.N] Percentage [SSJ.T.K.N] Relative frequency [SSJ.T.K.N] Absolute frequency [SSJ.T.K.L] Percentage [SSJ.T.K.L] Relative frequency [SSJ.T.K.L] Absolute frequency [SSJ.T.P.C] Percentage [SSJ.T.P.C] Relative frequency [SSJ.T.P.C] Absolute frequency [SSJ.T.P.R] Percentage [SSJ.T.P.R] Relative frequency [SSJ.T.P.R] Absolute frequency [SSJ.T.K.S] Percentage [SSJ.T.K.S] Relative frequency [SSJ.T.K.S] Absolute frequency [SSJ.T.D] Percentage [SSJ.T.D] Relative frequency [SSJ.T.D] Absolute frequency [SSJ.I] Percentage [SSJ.I] Relative frequency [SSJ.I] kateri kateri kater i 2,632,017 24.57 % 2,319.58 6 42.86 % 618.11 67,762 13.44 % 1,706.17 1,197,868 23.99 % 2,207.15 448,321 23.43 % 2,392.09 122,313 25.83 % 2,848.82 14,555 32.64 % 3,679.35 781,192 28.04 % 2,457.05 njegov njegov njego v 1,765,365 16.48 % 1,555.81 0 0 % 0 110,706 21.96 % 2,787.46 781,513 15.65 % 1,439.99 274,494 14.35 % 1,464.61 77,991 16.47 % 1,816.50 5,165 11.58 % 1,305.66 515,496 18.50 % 1,621.37 tisti tisti tisti 1,204,248 11.24 % 1,061.30 1 7.14 % 103.02 85,777 17.02 % 2,159.77 554,134 11.10 % 1,021.03 243,437 12.72 % 1,298.90 53,821 11.37 % 1,253.56 6,616 14.84 % 1,672.45 260,462 9.35 % 819.22 njihov njihov njiho v 1,022,463 9.54 % 901.09 1 7.14 % 103.02 22,312 4.43 % 561.79 502,914 10.07 % 926.65 168,419 8.80 % 898.63 52,087 11.00 % 1,213.17 4,322 9.69 % 1,092.55 
272,408 9.78 % 856.79 nekateri nekateri nekat eri 826,585 7.71 % 728.47 2 14.29 % 206.04 11,748 2.33 % 295.80 424,818 8.51 % 782.76 144,522 7.55 % 771.12 41,824 8.83 % 974.13 2,358 5.29 % 596.08 201,313 7.23 % 633.18 kakšen kakšen kakše n 686,530 6.41 % 605.04 2 14.29 % 206.04 38,652 7.67 % 973.22 324,671 6.50 % 598.23 143,530 7.50 % 765.83 22,579 4.77 % 525.89 2,249 5.04 % 568.52 154,847 5.56 % 487.03 takšen takšen takše n 627,943 5.86 % 553.40 0 0 % 0 19,175 3.80 % 482.81 300,654 6.02 % 553.97 117,208 6.13 % 625.38 21,964 4.64 % 511.57 1,988 4.46 % 502.54 166,954 5.99 % 525.11 noben noben noben 361,539 3.38 % 318.62 0 0 % 0 24,292 4.82 % 611.65 170,441 3.41 % 314.05 59,200 3.09 % 315.87 13,989 2.95 % 325.82 1,648 3.70 % 416.60 91,969 3.30 % 289.27 nihče nihče nihče 286,212 2.67 % 252.24 0 0 % 0 26,310 5.22 % 662.46 139,188 2.79 % 256.46 46,911 2.45 % 250.30 7,577 1.60 % 176.48 820 1.84 % 207.29 65,406 2.35 % 205.72 nekaj nekaj nekaj 229,505 2.14 % 202.26 0 0 % 0 22,120 4.39 % 556.96 92,539 1.85 % 170.51 49,750 2.60 % 265.45 10,825 2.29 % 252.13 747 1.68 % 188.83 53,524 1.92 % 168.35 nekdo nekdo nekdo 185,672 1.73 % 163.63 0 0 % 0 18,069 3.58 % 454.96 80,714 1.62 % 148.72 34,070 1.78 % 181.79 6,740 1.42 % 156.98 575 1.29 % 145.35 45,504 1.63 % 143.12 kakršen kakršen kakrš en 138,508 1.29 % 122.07 0 0 % 0 7,383 1.47 % 185.90 70,116 1.40 % 129.19 22,766 1.19 % 121.47 8,270 1.75 % 192.62 821 1.84 % 207.54 29,152 1.05 % 91.69 nekakšen nekakšen nekak šen 116,986 1.09 % 103.10 1 7.14 % 103.02 9,747 1.93 % 245.42 57,488 1.15 % 105.93 24,921 1.30 % 132.97 5,483 1.16 % 127.71 317 0.71 % 80.13 19,029 0.68 % 59.85 marsikaj marsikaj marsi kaj 77,371 0.72 % 68.19 0 0 % 0 2,169 0.43 % 54.61 42,346 0.85 % 78.03 15,514 0.81 % 82.78 1,924 0.41 % 44.81 202 0.45 % 51.06 15,216 0.55 % 47.86 vsakdo vsakdo vsakd o 55,960 0.52 % 49.32 0 0 % 0 2,938 0.58 % 73.98 25,300 0.51 % 46.62 12,495 0.65 % 66.67 3,653 0.77 % 85.08 226 0.51 % 57.13 11,348 0.41 % 35.69 marsikateri marsikateri marsi 
kateri 53,517 0.50 % 47.16 0 0 % 0 654 0.13 % 16.47 27,013 0.54 % 49.77 13,784 0.72 % 73.55 1,520 0.32 % 35.40 106 0.24 % 26.80 10,440 0.38 % 32.84 marsikdo marsikdo marsi kdo 52,605 0.49 % 46.36 0 0 % 0 528 0.10 % 13.29 26,799 0.54 % 49.38 12,411 0.65 % 66.22 930 0.20 % 21.66 74 0.17 % 18.71 11,863 0.43 % 37.31 nikakršen nikakršen nikak ršen 44,197 0.41 % 38.95 1 7.14 % 103.02 1,372 0.27 % 34.55 23,681 0.47 % 43.63 7,398 0.39 % 39.47 1,764 0.37 % 41.09 123 0.28 % 31.09 9,858 0.35 % 31.01 najin najin najin 41,218 0.39 % 36.33 0 0 % 0 9,678 1.92 % 243.68 12,373 0.25 % 22.80 10,372 0.54 % 55.34 1,298 0.27 % 30.23 44 0.10 % 11.12 7,453 0.27 % 23.44 kakršenkoli kakršenkoli kakrš enkoli 32,300 0.30 % 28.47 0 0 % 0 1,247 0.25 % 31.40 13,707 0.28 % 25.26 7,506 0.39 % 40.05 1,424 0.30 % 33.17 285 0.64 % 72.04 8,131 0.29 % 25.57 katerikoli katerikoli kater ikoli 28,599 0.27 % 25.20 0 0 % 0 1,140 0.23 % 28.70 10,160 0.20 % 18.72 8,043 0.42 % 42.91 2,655 0.56 % 61.84 369 0.83 % 93.28 6,232 0.22 % 19.60 tolikšen tolikšen tolik šen 28,371 0.27 % 25 0 0 % 0 1,073 0.21 % 27.02 16,211 0.33 % 29.87 4,900 0.26 % 26.14 1,097 0.23 % 25.55 98 0.22 % 24.77 4,992 0.18 % 15.70 karkoli karkoli karko li 26,931 0.25 % 23.73 0 0 % 0 3,684 0.73 % 92.76 9,689 0.19 % 17.85 6,558 0.34 % 34.99 1,355 0.29 % 31.56 139 0.31 % 35.14 5,506 0.20 % 17.32 vsakršen vsakršen vsakr šen 24,525 0.23 % 21.61 0 0 % 0 1,219 0.24 % 30.69 12,057 0.24 % 22.22 3,870 0.20 % 20.65 1,517 0.32 % 35.33 79 0.18 % 19.97 5,783 0.21 % 18.19 kolikšen kolikšen kolik šen 22,993 0.21 % 20.26 0 0 % 0 317 0.06 % 7.98 12,565 0.25 % 23.15 3,557 0.19 % 18.98 1,890 0.40 % 44.02 60 0.14 % 15.17 4,604 0.17 % 14.48 čigar čigar čigar 21,783 0.20 % 19.20 0 0 % 0 1,015 0.20 % 25.56 10,161 0.20 % 18.72 2,918 0.15 % 15.57 1,022 0.22 % 23.80 118 0.27 % 29.83 6,549 0.23 % 20.60 kdorkoli kdorkoli kdork oli 16,200 0.15 % 14.28 0 0 % 0 1,800 0.36 % 45.32 7,001 0.14 % 12.90 3,105 0.16 % 16.57 533 0.11 % 12.41 85 0.19 % 21.49 3,676 0.13 % 11.56 
nobeden nobeden nobed en 10,904 0.10 % 9.61 0 0 % 0 1,127 0.22 % 28.38 4,720 0.10 % 8.70 2,228 0.12 % 11.89 466 0.10 % 10.85 59 0.13 % 14.91 2,304 0.08 % 7.25 medme medme medme 9,006 0.08 % 7.94 0 0 % 0 464 0.09 % 11.68 4,397 0.09 % 8.10 2,010 0.10 % 10.72 586 0.12 % 13.65 31 0.07 % 7.84 1,518 0.05 % 4.77 vajin vajin vajin 8,656 0.08 % 7.63 0 0 % 0 1,006 0.20 % 25.33 1,899 0.04 % 3.50 3,764 0.20 % 20.08 380 0.08 % 8.85 15 0.03 % 3.79 1,592 0.06 % 5.01 čigav čigav čigav 7,676 0.07 % 6.76 0 0 % 0 585 0.12 % 14.73 4,293 0.09 % 7.91 1,233 0.06 % 6.58 254 0.05 % 5.92 39 0.09 % 9.86 1,272 0.05 % 4 malokdo malokdo malok do 6,623 0.06 % 5.84 0 0 % 0 176 0.04 % 4.43 3,768 0.07 % 6.94 1,260 0.07 % 6.72 147 0.03 % 3.42 15 0.03 % 3.79 1,257 0.04 % 3.95 CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 122 File at CLARIN.SI 1.2.106 List of final character-level 1-grams from pronoun lemmas in the Gigafida 2.0 corpus with text-type distributionGF2.0-word_parts-pronouns-lemmas-final- 1grams-taxonomy-entire.tsvLemma Lemma (lower-case) Rest of the word Final part of the word Total absolute frequency of lemma Percentage of all found lemmas Total relative frequency (per million occurrences) Absolute frequency [SSJ.T.K.N] Percentage [SSJ.T.K.N] Relative frequency [SSJ.T.K.N] Absolute frequency [SSJ.T.K.L] Percentage [SSJ.T.K.L] Relative frequency [SSJ.T.K.L] Absolute frequency [SSJ.T.P.C] Percentage [SSJ.T.P.C] Relative frequency [SSJ.T.P.C] Absolute frequency [SSJ.T.P.R] Percentage [SSJ.T.P.R] Relative frequency [SSJ.T.P.R] Absolute frequency [SSJ.T.K.S] Percentage [SSJ.T.K.S] Relative frequency [SSJ.T.K.S] Absolute frequency [SSJ.T.D] Percentage [SSJ.T.D] Relative frequency [SSJ.T.D] Absolute frequency [SSJ.I] Percentage [SSJ.I] Relative frequency [SSJ.I] se se s e 18,656,426 24.26 % 16,441.81 63 17.45 % 6,490.16 1,227,523 25.39 % 30,907.70 8,331,411 23.93 % 15,351.18 3,444,419 23.78 % 18,378.29 757,814 23.16 % 17,650.39 73,060 23.14 % 18,468.78 
4,822,136 25.14 % 15,166.87 on on o n 11,874,012 15.44 % 10,464.51 189 52.35 % 19,470.49 947,214 19.59 % 23,849.82 5,248,082 15.07 % 9,669.94 2,238,961 15.46 % 11,946.36 531,245 16.23 % 12,373.33 38,159 12.09 % 9,646.18 2,870,162 14.96 % 9,027.41 ta ta t a 11,377,542 14.79 % 10,026.97 17 4.71 % 1,751.31 469,391 9.71 % 11,818.76 5,253,309 15.09 % 9,679.57 1,998,988 13.80 % 10,665.94 496,205 15.16 % 11,557.20 74,801 23.69 % 18,908.89 3,084,831 16.08 % 9,702.60 ves ves ve s 4,233,029 5.50 % 3,730.55 26 7.20 % 2,678.48 207,705 4.30 % 5,229.79 2,037,592 5.85 % 3,754.40 799,974 5.52 % 4,268.40 152,217 4.65 % 3,545.31 13,137 4.16 % 3,320.89 1,022,378 5.33 % 3,215.64 jaz jaz ja z 3,481,582 4.53 % 3,068.30 0 0 % 0 403,401 8.35 % 10,157.20 1,512,005 4.34 % 2,785.97 716,575 4.95 % 3,823.41 118,259 3.61 % 2,754.39 12,057 3.82 % 3,047.88 719,285 3.75 % 2,262.34 svoj svoj svo j 3,453,653 4.49 % 3,043.69 3 0.83 % 309.06 151,218 3.13 % 3,807.51 1,585,205 4.55 % 2,920.85 654,031 4.51 % 3,489.69 155,236 4.74 % 3,615.63 10,998 3.48 % 2,780.18 896,962 4.68 % 2,821.18 kateri kateri kater i 2,632,017 3.42 % 2,319.58 6 1.66 % 618.11 67,762 1.40 % 1,706.17 1,197,868 3.44 % 2,207.15 448,321 3.10 % 2,392.09 122,313 3.74 % 2,848.82 14,555 4.61 % 3,679.35 781,192 4.07 % 2,457.05 kar kar ka r 1,932,807 2.51 % 1,703.37 1 0.28 % 103.02 73,171 1.51 % 1,842.37 882,447 2.54 % 1,625.97 342,677 2.37 % 1,828.41 69,534 2.12 % 1,619.53 5,556 1.76 % 1,404.50 559,421 2.92 % 1,759.52 njegov njegov njego v 1,765,365 2.29 % 1,555.81 0 0 % 0 110,706 2.29 % 2,787.46 781,513 2.25 % 1,439.99 274,494 1.90 % 1,464.61 77,991 2.38 % 1,816.50 5,165 1.64 % 1,305.66 515,496 2.69 % 1,621.37 naš naš na š 1,295,731 1.69 % 1,141.92 0 0 % 0 25,301 0.52 % 637.05 733,564 2.11 % 1,351.64 220,741 1.52 % 1,177.80 44,419 1.36 % 1,034.57 3,839 1.22 % 970.46 267,867 1.40 % 842.51 tisti tisti tist i 1,204,248 1.57 % 1,061.30 1 0.28 % 103.02 85,777 1.77 % 2,159.77 554,134 1.59 % 1,021.03 243,437 1.68 % 1,298.90 53,821 1.65 % 1,253.56 
6,616 2.10 % 1,672.45 260,462 1.36 % 819.22 kaj kaj ka j 1,177,693 1.53 % 1,037.89 1 0.28 % 103.02 124,849 2.58 % 3,143.56 508,341 1.46 % 936.65 253,464 1.75 % 1,352.40 47,569 1.45 % 1,107.94 3,588 1.14 % 907.01 239,881 1.25 % 754.49 ti ti t i 1,132,179 1.47 % 997.78 5 1.39 % 515.09 142,326 2.94 % 3,583.61 391,026 1.12 % 720.49 346,431 2.39 % 1,848.44 60,549 1.85 % 1,410.26 4,357 1.38 % 1,101.40 187,485 0.98 % 589.69 vsak vsak vsa k 1,091,923 1.42 % 962.31 8 2.22 % 824.15 44,704 0.93 % 1,125.60 497,898 1.43 % 917.41 237,701 1.64 % 1,268.29 55,860 1.71 % 1,301.05 5,188 1.64 % 1,311.47 250,564 1.31 % 788.09 njihov njihov njiho v 1,022,463 1.33 % 901.09 1 0.28 % 103.02 22,312 0.46 % 561.79 502,914 1.45 % 926.65 168,419 1.16 % 898.63 52,087 1.59 % 1,213.17 4,322 1.37 % 1,092.55 272,408 1.42 % 856.79 njen njen nje n 971,331 1.26 % 856.03 0 0 % 0 81,849 1.69 % 2,060.87 394,090 1.13 % 726.14 181,976 1.26 % 970.96 40,880 1.25 % 952.14 3,157 1.00 % 798.06 269,379 1.40 % 847.27 nekateri nekateri nekater i 826,585 1.07 % 728.47 2 0.55 % 206.04 11,748 0.24 % 295.80 424,818 1.22 % 782.76 144,522 1.00 % 771.12 41,824 1.28 % 974.13 2,358 0.75 % 596.08 201,313 1.05 % 633.18 kakšen kakšen kakše n 686,530 0.89 % 605.04 2 0.55 % 206.04 38,652 0.80 % 973.22 324,671 0.93 % 598.23 143,530 0.99 % 765.83 22,579 0.69 % 525.89 2,249 0.71 % 568.52 154,847 0.81 % 487.03 takšen takšen takše n 627,943 0.82 % 553.40 0 0 % 0 19,175 0.40 % 482.81 300,654 0.86 % 553.97 117,208 0.81 % 625.38 21,964 0.67 % 511.57 1,988 0.63 % 502.54 166,954 0.87 % 525.11 moj moj mo j 617,627 0.80 % 544.31 0 0 % 0 92,173 1.91 % 2,320.82 244,505 0.70 % 450.52 127,017 0.88 % 677.72 23,634 0.72 % 550.46 1,908 0.60 % 482.32 128,390 0.67 % 403.82 oba oba ob a 542,160 0.70 % 477.80 17 4.71 % 1,751.31 16,851 0.35 % 424.29 280,698 0.81 % 517.20 81,296 0.56 % 433.77 21,586 0.66 % 502.76 1,356 0.43 % 342.78 140,356 0.73 % 441.46 tak tak ta k 535,614 0.70 % 472.03 0 0 % 0 27,946 0.58 % 703.65 256,456 0.74 % 472.54 111,527 0.77 % 
595.07 29,507 0.90 % 687.25 4,605 1.46 % 1,164.09 105,573 0.55 % 332.05 nič nič ni č 475,396 0.62 % 418.96 0 0 % 0 53,076 1.10 % 1,336.40 214,201 0.61 % 394.68 94,691 0.65 % 505.24 16,107 0.49 % 375.15 1,400 0.44 % 353.90 95,921 0.50 % 301.70 kdo kdo kd o 456,086 0.59 % 401.95 0 0 % 0 35,045 0.72 % 882.40 222,399 0.64 % 409.78 86,567 0.60 % 461.89 13,691 0.42 % 318.88 2,132 0.68 % 538.95 96,252 0.50 % 302.74 nek nek ne k 393,758 0.51 % 347.02 0 0 % 0 33,855 0.70 % 852.43 157,860 0.45 % 290.87 79,718 0.55 % 425.35 24,047 0.73 % 560.08 2,404 0.76 % 607.71 95,874 0.50 % 301.55 zame zame zam e 390,470 0.51 % 344.12 0 0 % 0 23,376 0.48 % 588.58 181,725 0.52 % 334.84 68,638 0.47 % 366.23 12,818 0.39 % 298.55 971 0.31 % 245.46 102,942 0.54 % 323.78 noben noben nobe n 361,539 0.47 % 318.62 0 0 % 0 24,292 0.50 % 611.65 170,441 0.49 % 314.05 59,200 0.41 % 315.87 13,989 0.43 % 325.82 1,648 0.52 % 416.60 91,969 0.48 % 289.27 vaš vaš va š 318,812 0.41 % 280.97 0 0 % 0 14,681 0.30 % 369.65 117,509 0.34 % 216.52 104,686 0.72 % 558.57 25,254 0.77 % 588.20 1,057 0.34 % 267.20 55,625 0.29 % 174.96 isti isti ist i 305,329 0.40 % 269.08 5 1.39 % 515.09 14,396 0.30 % 362.48 138,929 0.40 % 255.99 59,259 0.41 % 316.19 18,519 0.57 % 431.33 1,503 0.48 % 379.94 72,718 0.38 % 228.72 nihče nihče nihč e 286,212 0.37 % 252.24 0 0 % 0 26,310 0.54 % 662.46 139,188 0.40 % 256.46 46,911 0.32 % 250.30 7,577 0.23 % 176.48 820 0.26 % 207.29 65,406 0.34 % 205.72 enak enak ena k 255,863 0.33 % 225.49 2 0.55 % 206.04 5,846 0.12 % 147.20 110,141 0.32 % 202.94 51,176 0.35 % 273.06 17,036 0.52 % 396.79 1,377 0.44 % 348.09 70,285 0.37 % 221.06 mnog mnog mno g 247,148 0.32 % 217.81 0 0 % 0 3,585 0.07 % 90.27 128,052 0.37 % 235.94 45,853 0.32 % 244.66 14,872 0.45 % 346.39 768 0.24 % 194.14 54,018 0.28 % 169.90 CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 123 File at CLARIN.SI 1.2.107 List of final character-level 2-grams from pronoun lemmas in the Gigafida 2.0 corpus with 
text-type distributionGF2.0-word_parts-pronouns-lemmas-final- 2grams-taxonomy-entire.tsvLemma Lemma (lower-case) Rest of the word Final part of the word Total absolute frequency of lemma Percentage of all found lemmas Total relative frequency (per million occurrences) Absolute frequency [SSJ.T.K.N] Percentage [SSJ.T.K.N] Relative frequency [SSJ.T.K.N] Absolute frequency [SSJ.T.K.L] Percentage [SSJ.T.K.L] Relative frequency [SSJ.T.K.L] Absolute frequency [SSJ.T.P.C] Percentage [SSJ.T.P.C] Relative frequency [SSJ.T.P.C] Absolute frequency [SSJ.T.P.R] Percentage [SSJ.T.P.R] Relative frequency [SSJ.T.P.R] Absolute frequency [SSJ.T.K.S] Percentage [SSJ.T.K.S] Relative frequency [SSJ.T.K.S] Absolute frequency [SSJ.T.D] Percentage [SSJ.T.D] Relative frequency [SSJ.T.D] Absolute frequency [SSJ.I] Percentage [SSJ.I] Relative frequency [SSJ.I] se se se 18,656,426 24.26 % 16,441.81 63 17.45 % 6,490.16 1,227,523 25.39 % 30,907.70 8,331,411 23.93 % 15,351.18 3,444,419 23.78 % 18,378.29 757,814 23.16 % 17,650.39 73,060 23.14 % 18,468.78 4,822,136 25.14 % 15,166.87 on on on 11,874,012 15.44 % 10,464.51 189 52.35 % 19,470.49 947,214 19.59 % 23,849.82 5,248,082 15.07 % 9,669.94 2,238,961 15.46 % 11,946.36 531,245 16.23 % 12,373.33 38,159 12.09 % 9,646.18 2,870,162 14.96 % 9,027.41 ta ta ta 11,377,542 14.79 % 10,026.97 17 4.71 % 1,751.31 469,391 9.71 % 11,818.76 5,253,309 15.09 % 9,679.57 1,998,988 13.80 % 10,665.94 496,205 15.16 % 11,557.20 74,801 23.69 % 18,908.89 3,084,831 16.08 % 9,702.60 ves ves v es 4,233,029 5.50 % 3,730.55 26 7.20 % 2,678.48 207,705 4.30 % 5,229.79 2,037,592 5.85 % 3,754.40 799,974 5.52 % 4,268.40 152,217 4.65 % 3,545.31 13,137 4.16 % 3,320.89 1,022,378 5.33 % 3,215.64 jaz jaz j az 3,481,582 4.53 % 3,068.30 0 0 % 0 403,401 8.35 % 10,157.20 1,512,005 4.34 % 2,785.97 716,575 4.95 % 3,823.41 118,259 3.61 % 2,754.39 12,057 3.82 % 3,047.88 719,285 3.75 % 2,262.34 svoj svoj sv oj 3,453,653 4.49 % 3,043.69 3 0.83 % 309.06 151,218 3.13 % 3,807.51 1,585,205 4.55 % 
2,920.85 654,031 4.51 % 3,489.69 155,236 4.74 % 3,615.63 10,998 3.48 % 2,780.18 896,962 4.68 % 2,821.18 kateri kateri kate ri 2,632,017 3.42 % 2,319.58 6 1.66 % 618.11 67,762 1.40 % 1,706.17 1,197,868 3.44 % 2,207.15 448,321 3.10 % 2,392.09 122,313 3.74 % 2,848.82 14,555 4.61 % 3,679.35 781,192 4.07 % 2,457.05 kar kar k ar 1,932,807 2.51 % 1,703.37 1 0.28 % 103.02 73,171 1.51 % 1,842.37 882,447 2.54 % 1,625.97 342,677 2.37 % 1,828.41 69,534 2.12 % 1,619.53 5,556 1.76 % 1,404.50 559,421 2.92 % 1,759.52 njegov njegov njeg ov 1,765,365 2.29 % 1,555.81 0 0 % 0 110,706 2.29 % 2,787.46 781,513 2.25 % 1,439.99 274,494 1.90 % 1,464.61 77,991 2.38 % 1,816.50 5,165 1.64 % 1,305.66 515,496 2.69 % 1,621.37 naš naš n aš 1,295,731 1.69 % 1,141.92 0 0 % 0 25,301 0.52 % 637.05 733,564 2.11 % 1,351.64 220,741 1.52 % 1,177.80 44,419 1.36 % 1,034.57 3,839 1.22 % 970.46 267,867 1.40 % 842.51 tisti tisti tis ti 1,204,248 1.57 % 1,061.30 1 0.28 % 103.02 85,777 1.77 % 2,159.77 554,134 1.59 % 1,021.03 243,437 1.68 % 1,298.90 53,821 1.65 % 1,253.56 6,616 2.10 % 1,672.45 260,462 1.36 % 819.22 kaj kaj k aj 1,177,693 1.53 % 1,037.89 1 0.28 % 103.02 124,849 2.58 % 3,143.56 508,341 1.46 % 936.65 253,464 1.75 % 1,352.40 47,569 1.45 % 1,107.94 3,588 1.14 % 907.01 239,881 1.25 % 754.49 ti ti ti 1,132,179 1.47 % 997.78 5 1.39 % 515.09 142,326 2.94 % 3,583.61 391,026 1.12 % 720.49 346,431 2.39 % 1,848.44 60,549 1.85 % 1,410.26 4,357 1.38 % 1,101.40 187,485 0.98 % 589.69 vsak vsak vs ak 1,091,923 1.42 % 962.31 8 2.22 % 824.15 44,704 0.93 % 1,125.60 497,898 1.43 % 917.41 237,701 1.64 % 1,268.29 55,860 1.71 % 1,301.05 5,188 1.64 % 1,311.47 250,564 1.31 % 788.09 njihov njihov njih ov 1,022,463 1.33 % 901.09 1 0.28 % 103.02 22,312 0.46 % 561.79 502,914 1.45 % 926.65 168,419 1.16 % 898.63 52,087 1.59 % 1,213.17 4,322 1.37 % 1,092.55 272,408 1.42 % 856.79 njen njen nj en 971,331 1.26 % 856.03 0 0 % 0 81,849 1.69 % 2,060.87 394,090 1.13 % 726.14 181,976 1.26 % 970.96 40,880 1.25 % 952.14 3,157 1.00 % 798.06 
269,379 1.40 % 847.27 nekateri nekateri nekate ri 826,585 1.07 % 728.47 2 0.55 % 206.04 11,748 0.24 % 295.80 424,818 1.22 % 782.76 144,522 1.00 % 771.12 41,824 1.28 % 974.13 2,358 0.75 % 596.08 201,313 1.05 % 633.18 kakšen kakšen kakš en 686,530 0.89 % 605.04 2 0.55 % 206.04 38,652 0.80 % 973.22 324,671 0.93 % 598.23 143,530 0.99 % 765.83 22,579 0.69 % 525.89 2,249 0.71 % 568.52 154,847 0.81 % 487.03 takšen takšen takš en 627,943 0.82 % 553.40 0 0 % 0 19,175 0.40 % 482.81 300,654 0.86 % 553.97 117,208 0.81 % 625.38 21,964 0.67 % 511.57 1,988 0.63 % 502.54 166,954 0.87 % 525.11 moj moj m oj 617,627 0.80 % 544.31 0 0 % 0 92,173 1.91 % 2,320.82 244,505 0.70 % 450.52 127,017 0.88 % 677.72 23,634 0.72 % 550.46 1,908 0.60 % 482.32 128,390 0.67 % 403.82 oba oba o ba 542,160 0.70 % 477.80 17 4.71 % 1,751.31 16,851 0.35 % 424.29 280,698 0.81 % 517.20 81,296 0.56 % 433.77 21,586 0.66 % 502.76 1,356 0.43 % 342.78 140,356 0.73 % 441.46 tak tak t ak 535,614 0.70 % 472.03 0 0 % 0 27,946 0.58 % 703.65 256,456 0.74 % 472.54 111,527 0.77 % 595.07 29,507 0.90 % 687.25 4,605 1.46 % 1,164.09 105,573 0.55 % 332.05 nič nič n ič 475,396 0.62 % 418.96 0 0 % 0 53,076 1.10 % 1,336.40 214,201 0.61 % 394.68 94,691 0.65 % 505.24 16,107 0.49 % 375.15 1,400 0.44 % 353.90 95,921 0.50 % 301.70 kdo kdo k do 456,086 0.59 % 401.95 0 0 % 0 35,045 0.72 % 882.40 222,399 0.64 % 409.78 86,567 0.60 % 461.89 13,691 0.42 % 318.88 2,132 0.68 % 538.95 96,252 0.50 % 302.74 nek nek n ek 393,758 0.51 % 347.02 0 0 % 0 33,855 0.70 % 852.43 157,860 0.45 % 290.87 79,718 0.55 % 425.35 24,047 0.73 % 560.08 2,404 0.76 % 607.71 95,874 0.50 % 301.55 zame zame za me 390,470 0.51 % 344.12 0 0 % 0 23,376 0.48 % 588.58 181,725 0.52 % 334.84 68,638 0.47 % 366.23 12,818 0.39 % 298.55 971 0.31 % 245.46 102,942 0.54 % 323.78 noben noben nob en 361,539 0.47 % 318.62 0 0 % 0 24,292 0.50 % 611.65 170,441 0.49 % 314.05 59,200 0.41 % 315.87 13,989 0.43 % 325.82 1,648 0.52 % 416.60 91,969 0.48 % 289.27 vaš vaš v aš 318,812 0.41 % 
280.97 0 0 % 0 14,681 0.30 % 369.65 117,509 0.34 % 216.52 104,686 0.72 % 558.57 25,254 0.77 % 588.20 1,057 0.34 % 267.20 55,625 0.29 % 174.96
| isti | isti | is | ti | 305,329 | 0.40 % | 269.08 | 5 | 1.39 % | 515.09 | 14,396 | 0.30 % | 362.48 | 138,929 | 0.40 % | 255.99 | 59,259 | 0.41 % | 316.19 | 18,519 | 0.57 % | 431.33 | 1,503 | 0.48 % | 379.94 | 72,718 | 0.38 % | 228.72 |
| nihče | nihče | nih | če | 286,212 | 0.37 % | 252.24 | 0 | 0 % | 0 | 26,310 | 0.54 % | 662.46 | 139,188 | 0.40 % | 256.46 | 46,911 | 0.32 % | 250.30 | 7,577 | 0.23 % | 176.48 | 820 | 0.26 % | 207.29 | 65,406 | 0.34 % | 205.72 |
| enak | enak | en | ak | 255,863 | 0.33 % | 225.49 | 2 | 0.55 % | 206.04 | 5,846 | 0.12 % | 147.20 | 110,141 | 0.32 % | 202.94 | 51,176 | 0.35 % | 273.06 | 17,036 | 0.52 % | 396.79 | 1,377 | 0.44 % | 348.09 | 70,285 | 0.37 % | 221.06 |
| mnog | mnog | mn | og | 247,148 | 0.32 % | 217.81 | 0 | 0 % | 0 | 3,585 | 0.07 % | 90.27 | 128,052 | 0.37 % | 235.94 | 45,853 | 0.32 % | 244.66 | 14,872 | 0.45 % | 346.39 | 768 | 0.24 % | 194.14 | 54,018 | 0.28 % | 169.90 |

1.2.108 List of final character-level 3-grams from pronoun lemmas in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-word_parts-pronouns-lemmas-final-3grams-taxonomy-entire.tsv

| Lemma | Lemma (lower-case) | Rest of the word | Final part of the word | Total absolute frequency of lemma | Percentage of all found lemmas | Total relative frequency (per million occurrences) | Absolute frequency [SSJ.T.K.N] | Percentage [SSJ.T.K.N] | Relative frequency [SSJ.T.K.N] | Absolute frequency [SSJ.T.K.L] | Percentage [SSJ.T.K.L] | Relative frequency [SSJ.T.K.L] | Absolute frequency [SSJ.T.P.C] | Percentage [SSJ.T.P.C] | Relative frequency [SSJ.T.P.C] | Absolute frequency [SSJ.T.P.R] | Percentage [SSJ.T.P.R] | Relative frequency [SSJ.T.P.R] | Absolute frequency [SSJ.T.K.S] | Percentage [SSJ.T.K.S] | Relative frequency [SSJ.T.K.S] | Absolute frequency [SSJ.T.D] | Percentage [SSJ.T.D] | Relative frequency [SSJ.T.D] | Absolute frequency [SSJ.I] | Percentage [SSJ.I] | Relative frequency [SSJ.I] |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
ves ves ves 4,233,029 12.50 % 3,730.55 26 29.89 % 2,678.48 207,705 10.15 % 5,229.79 2,037,592 13.07 % 3,754.40
799,974 12.39 % 4,268.40 152,217 10.67 % 3,545.31 13,137 10.48 % 3,320.89 1,022,378 12.45 % 3,215.64 jaz jaz jaz 3,481,582 10.28 % 3,068.30 0 0 % 0 403,401 19.70 % 10,157.20 1,512,005 9.70 % 2,785.97 716,575 11.10 % 3,823.41 118,259 8.29 % 2,754.39 12,057 9.62 % 3,047.88 719,285 8.76 % 2,262.34 svoj svoj s voj 3,453,653 10.20 % 3,043.69 3 3.45 % 309.06 151,218 7.39 % 3,807.51 1,585,205 10.17 % 2,920.85 654,031 10.13 % 3,489.69 155,236 10.88 % 3,615.63 10,998 8.78 % 2,780.18 896,962 10.92 % 2,821.18 kateri kateri kat eri 2,632,017 7.77 % 2,319.58 6 6.90 % 618.11 67,762 3.31 % 1,706.17 1,197,868 7.68 % 2,207.15 448,321 6.94 % 2,392.09 122,313 8.57 % 2,848.82 14,555 11.62 % 3,679.35 781,192 9.51 % 2,457.05 kar kar kar 1,932,807 5.71 % 1,703.37 1 1.15 % 103.02 73,171 3.57 % 1,842.37 882,447 5.66 % 1,625.97 342,677 5.31 % 1,828.41 69,534 4.87 % 1,619.53 5,556 4.43 % 1,404.50 559,421 6.81 % 1,759.52 njegov njegov nje gov 1,765,365 5.21 % 1,555.81 0 0 % 0 110,706 5.41 % 2,787.46 781,513 5.01 % 1,439.99 274,494 4.25 % 1,464.61 77,991 5.47 % 1,816.50 5,165 4.12 % 1,305.66 515,496 6.28 % 1,621.37 naš naš naš 1,295,731 3.83 % 1,141.92 0 0 % 0 25,301 1.24 % 637.05 733,564 4.71 % 1,351.64 220,741 3.42 % 1,177.80 44,419 3.11 % 1,034.57 3,839 3.06 % 970.46 267,867 3.26 % 842.51 tisti tisti ti sti 1,204,248 3.56 % 1,061.30 1 1.15 % 103.02 85,777 4.19 % 2,159.77 554,134 3.56 % 1,021.03 243,437 3.77 % 1,298.90 53,821 3.77 % 1,253.56 6,616 5.28 % 1,672.45 260,462 3.17 % 819.22 kaj kaj kaj 1,177,693 3.48 % 1,037.89 1 1.15 % 103.02 124,849 6.10 % 3,143.56 508,341 3.26 % 936.65 253,464 3.92 % 1,352.40 47,569 3.33 % 1,107.94 3,588 2.86 % 907.01 239,881 2.92 % 754.49 vsak vsak v sak 1,091,923 3.23 % 962.31 8 9.20 % 824.15 44,704 2.18 % 1,125.60 497,898 3.19 % 917.41 237,701 3.68 % 1,268.29 55,860 3.92 % 1,301.05 5,188 4.14 % 1,311.47 250,564 3.05 % 788.09 njihov njihov nji hov 1,022,463 3.02 % 901.09 1 1.15 % 103.02 22,312 1.09 % 561.79 502,914 3.23 % 926.65 168,419 2.61 % 898.63 52,087 
3.65 % 1,213.17 4,322 3.45 % 1,092.55 272,408 3.32 % 856.79 njen njen n jen 971,331 2.87 % 856.03 0 0 % 0 81,849 4.00 % 2,060.87 394,090 2.53 % 726.14 181,976 2.82 % 970.96 40,880 2.87 % 952.14 3,157 2.52 % 798.06 269,379 3.28 % 847.27 nekateri nekateri nekat eri 826,585 2.44 % 728.47 2 2.30 % 206.04 11,748 0.57 % 295.80 424,818 2.73 % 782.76 144,522 2.24 % 771.12 41,824 2.93 % 974.13 2,358 1.88 % 596.08 201,313 2.45 % 633.18 kakšen kakšen kak šen 686,530 2.03 % 605.04 2 2.30 % 206.04 38,652 1.89 % 973.22 324,671 2.08 % 598.23 143,530 2.22 % 765.83 22,579 1.58 % 525.89 2,249 1.79 % 568.52 154,847 1.89 % 487.03 takšen takšen tak šen 627,943 1.85 % 553.40 0 0 % 0 19,175 0.94 % 482.81 300,654 1.93 % 553.97 117,208 1.81 % 625.38 21,964 1.54 % 511.57 1,988 1.59 % 502.54 166,954 2.03 % 525.11 moj moj moj 617,627 1.82 % 544.31 0 0 % 0 92,173 4.50 % 2,320.82 244,505 1.57 % 450.52 127,017 1.97 % 677.72 23,634 1.66 % 550.46 1,908 1.52 % 482.32 128,390 1.56 % 403.82 oba oba oba 542,160 1.60 % 477.80 17 19.54 % 1,751.31 16,851 0.82 % 424.29 280,698 1.80 % 517.20 81,296 1.26 % 433.77 21,586 1.51 % 502.76 1,356 1.08 % 342.78 140,356 1.71 % 441.46 tak tak tak 535,614 1.58 % 472.03 0 0 % 0 27,946 1.36 % 703.65 256,456 1.65 % 472.54 111,527 1.73 % 595.07 29,507 2.07 % 687.25 4,605 3.67 % 1,164.09 105,573 1.28 % 332.05 nič nič nič 475,396 1.40 % 418.96 0 0 % 0 53,076 2.59 % 1,336.40 214,201 1.37 % 394.68 94,691 1.47 % 505.24 16,107 1.13 % 375.15 1,400 1.12 % 353.90 95,921 1.17 % 301.70 kdo kdo kdo 456,086 1.35 % 401.95 0 0 % 0 35,045 1.71 % 882.40 222,399 1.43 % 409.78 86,567 1.34 % 461.89 13,691 0.96 % 318.88 2,132 1.70 % 538.95 96,252 1.17 % 302.74 nek nek nek 393,758 1.16 % 347.02 0 0 % 0 33,855 1.65 % 852.43 157,860 1.01 % 290.87 79,718 1.24 % 425.35 24,047 1.69 % 560.08 2,404 1.92 % 607.71 95,874 1.17 % 301.55 zame zame z ame 390,470 1.15 % 344.12 0 0 % 0 23,376 1.14 % 588.58 181,725 1.17 % 334.84 68,638 1.06 % 366.23 12,818 0.90 % 298.55 971 0.78 % 245.46 102,942 1.25 % 323.78 
| noben | noben | no | ben | 361,539 | 1.07 % | 318.62 | 0 | 0 % | 0 | 24,292 | 1.19 % | 611.65 | 170,441 | 1.09 % | 314.05 | 59,200 | 0.92 % | 315.87 | 13,989 | 0.98 % | 325.82 | 1,648 | 1.31 % | 416.60 | 91,969 | 1.12 % | 289.27 |
| vaš | vaš | | vaš | 318,812 | 0.94 % | 280.97 | 0 | 0 % | 0 | 14,681 | 0.72 % | 369.65 | 117,509 | 0.75 % | 216.52 | 104,686 | 1.62 % | 558.57 | 25,254 | 1.77 % | 588.20 | 1,057 | 0.84 % | 267.20 | 55,625 | 0.68 % | 174.96 |
| isti | isti | i | sti | 305,329 | 0.90 % | 269.08 | 5 | 5.75 % | 515.09 | 14,396 | 0.70 % | 362.48 | 138,929 | 0.89 % | 255.99 | 59,259 | 0.92 % | 316.19 | 18,519 | 1.30 % | 431.33 | 1,503 | 1.20 % | 379.94 | 72,718 | 0.89 % | 228.72 |
| nihče | nihče | ni | hče | 286,212 | 0.84 % | 252.24 | 0 | 0 % | 0 | 26,310 | 1.28 % | 662.46 | 139,188 | 0.89 % | 256.46 | 46,911 | 0.73 % | 250.30 | 7,577 | 0.53 % | 176.48 | 820 | 0.65 % | 207.29 | 65,406 | 0.80 % | 205.72 |
| enak | enak | e | nak | 255,863 | 0.76 % | 225.49 | 2 | 2.30 % | 206.04 | 5,846 | 0.29 % | 147.20 | 110,141 | 0.71 % | 202.94 | 51,176 | 0.79 % | 273.06 | 17,036 | 1.19 % | 396.79 | 1,377 | 1.10 % | 348.09 | 70,285 | 0.86 % | 221.06 |
| mnog | mnog | m | nog | 247,148 | 0.73 % | 217.81 | 0 | 0 % | 0 | 3,585 | 0.17 % | 90.27 | 128,052 | 0.82 % | 235.94 | 45,853 | 0.71 % | 244.66 | 14,872 | 1.04 % | 346.39 | 768 | 0.61 % | 194.14 | 54,018 | 0.66 % | 169.90 |
| nekaj | nekaj | ne | kaj | 229,505 | 0.68 % | 202.26 | 0 | 0 % | 0 | 22,120 | 1.08 % | 556.96 | 92,539 | 0.59 % | 170.51 | 49,750 | 0.77 % | 265.45 | 10,825 | 0.76 % | 252.13 | 747 | 0.60 % | 188.83 | 53,524 | 0.65 % | 168.35 |
| nekdo | nekdo | ne | kdo | 185,672 | 0.55 % | 163.63 | 0 | 0 % | 0 | 18,069 | 0.88 % | 454.96 | 80,714 | 0.52 % | 148.72 | 34,070 | 0.53 % | 181.79 | 6,740 | 0.47 % | 156.98 | 575 | 0.46 % | 145.35 | 45,504 | 0.55 % | 143.12 |
| name | name | n | ame | 156,828 | 0.46 % | 138.21 | 0 | 0 % | 0 | 16,570 | 0.81 % | 417.21 | 63,497 | 0.41 % | 117 | 32,970 | 0.51 % | 175.92 | 8,779 | 0.61 % | 204.47 | 434 | 0.35 % | 109.71 | 34,578 | 0.42 % | 108.76 |
| njun | njun | n | jun | 142,046 | 0.42 % | 125.18 | 1 | 1.15 % | 103.02 | 10,343 | 0.51 % | 260.43 | 55,733 | 0.36 % | 102.69 | 29,091 | 0.45 % | 155.22 | 5,299 | 0.37 % | 123.42 | 311 | 0.25 % | 78.62 | 41,268 | 0.50 % | 129.80 |

1.2.109 List of final character-level 4-grams from pronoun lemmas in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-word_parts-pronouns-lemmas-final-4grams-taxonomy-entire.tsv

| Lemma | Lemma (lower-case) | Rest of the word | Final part of the word | Total absolute frequency of lemma | Percentage of all found lemmas | Total relative frequency (per million occurrences) | Absolute frequency [SSJ.T.K.N] | Percentage [SSJ.T.K.N] | Relative frequency [SSJ.T.K.N] | Absolute frequency [SSJ.T.K.L] | Percentage [SSJ.T.K.L] | Relative frequency [SSJ.T.K.L] | Absolute frequency [SSJ.T.P.C] | Percentage [SSJ.T.P.C] | Relative frequency [SSJ.T.P.C] | Absolute frequency [SSJ.T.P.R] | Percentage [SSJ.T.P.R] | Relative frequency [SSJ.T.P.R] | Absolute frequency [SSJ.T.K.S] | Percentage [SSJ.T.K.S] | Relative frequency [SSJ.T.K.S] | Absolute frequency [SSJ.T.D] | Percentage [SSJ.T.D] | Relative frequency [SSJ.T.D] | Absolute frequency [SSJ.I] | Percentage [SSJ.I] | Relative frequency [SSJ.I] |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| svoj | svoj | | svoj | 3,453,653 | 18.95 % | 3,043.69 | 3 | 7.89 % | 309.06 | 151,218 | 16.39 % | 3,807.51 | 1,585,205 | 18.99 % | 2,920.85 | 654,031 | 19.27 % | 3,489.69 | 155,236 | 18.67 % | 3,615.63 | 10,998 | 15.41 % | 2,780.18 | 896,962 | 19.23 % | 2,821.18 |
| kateri | kateri | ka | teri | 2,632,017 | 14.44 % | 2,319.58 | 6 | 15.79 % | 618.11 | 67,762 | 7.34 % | 1,706.17 | 1,197,868 | 14.35 % | 2,207.15 | 448,321 | 13.21 % | 2,392.09 | 122,313 | 14.71 % | 2,848.82 | 14,555 | 20.40 % | 3,679.35 | 781,192 | 16.75 % | 2,457.05 |
| njegov | njegov | nj | egov | 1,765,365 | 9.68 % | 1,555.81 | 0 | 0 % | 0 | 110,706 | 12.00 % | 2,787.46 | 781,513 | 9.36 % | 1,439.99 | 274,494 | 8.09 % | 1,464.61 | 77,991 | 9.38 % | 1,816.50 | 5,165 | 7.24 % | 1,305.66 | 515,496 | 11.05 % | 1,621.37 |
| tisti | tisti | t | isti | 1,204,248 | 6.61 % | 1,061.30 | 1 | 2.63 % | 103.02 | 85,777 | 9.30 % | 2,159.77 | 554,134 | 6.64 % | 1,021.03 | 243,437 | 7.17 % | 1,298.90 | 53,821 | 6.47 % | 1,253.56 | 6,616 | 9.27 % | 1,672.45 | 260,462 | 5.58 % | 819.22 |
| vsak | vsak | | vsak | 1,091,923 | 5.99 % | 962.31 | 8 | 21.05 % | 824.15 | 44,704 | 4.85 % | 1,125.60 | 497,898 | 5.96 % | 917.41 | 237,701 | 7.00 % | 1,268.29 | 55,860 | 6.72 % | 1,301.05 | 5,188 | 7.27 % | 1,311.47 | 250,564 | 5.37 % | 788.09 |
| njihov | njihov | nj | ihov | 1,022,463 | 5.61 % | 901.09 | 1 | 2.63 % | 103.02 | 22,312 | 2.42 % | 561.79 | 502,914 | 6.03 % | 926.65 | 168,419 | 4.96 % | 898.63 | 52,087 | 6.26 % | 1,213.17 | 4,322 | 6.06 % | 1,092.55 | 272,408 | 5.84 % | 856.79 |
njen
njen njen 971,331 5.33 % 856.03 0 0 % 0 81,849 8.87 % 2,060.87 394,090 4.72 % 726.14 181,976 5.36 % 970.96 40,880 4.92 % 952.14 3,157 4.42 % 798.06 269,379 5.78 % 847.27 nekateri nekateri neka teri 826,585 4.53 % 728.47 2 5.26 % 206.04 11,748 1.27 % 295.80 424,818 5.09 % 782.76 144,522 4.26 % 771.12 41,824 5.03 % 974.13 2,358 3.31 % 596.08 201,313 4.32 % 633.18 kakšen kakšen ka kšen 686,530 3.77 % 605.04 2 5.26 % 206.04 38,652 4.19 % 973.22 324,671 3.89 % 598.23 143,530 4.23 % 765.83 22,579 2.71 % 525.89 2,249 3.15 % 568.52 154,847 3.32 % 487.03 takšen takšen ta kšen 627,943 3.44 % 553.40 0 0 % 0 19,175 2.08 % 482.81 300,654 3.60 % 553.97 117,208 3.45 % 625.38 21,964 2.64 % 511.57 1,988 2.79 % 502.54 166,954 3.58 % 525.11 zame zame zame 390,470 2.14 % 344.12 0 0 % 0 23,376 2.53 % 588.58 181,725 2.18 % 334.84 68,638 2.02 % 366.23 12,818 1.54 % 298.55 971 1.36 % 245.46 102,942 2.21 % 323.78 noben noben n oben 361,539 1.98 % 318.62 0 0 % 0 24,292 2.63 % 611.65 170,441 2.04 % 314.05 59,200 1.74 % 315.87 13,989 1.68 % 325.82 1,648 2.31 % 416.60 91,969 1.97 % 289.27 isti isti isti 305,329 1.68 % 269.08 5 13.16 % 515.09 14,396 1.56 % 362.48 138,929 1.66 % 255.99 59,259 1.75 % 316.19 18,519 2.23 % 431.33 1,503 2.11 % 379.94 72,718 1.56 % 228.72 nihče nihče n ihče 286,212 1.57 % 252.24 0 0 % 0 26,310 2.85 % 662.46 139,188 1.67 % 256.46 46,911 1.38 % 250.30 7,577 0.91 % 176.48 820 1.15 % 207.29 65,406 1.40 % 205.72 enak enak enak 255,863 1.40 % 225.49 2 5.26 % 206.04 5,846 0.63 % 147.20 110,141 1.32 % 202.94 51,176 1.51 % 273.06 17,036 2.05 % 396.79 1,377 1.93 % 348.09 70,285 1.51 % 221.06 mnog mnog mnog 247,148 1.36 % 217.81 0 0 % 0 3,585 0.39 % 90.27 128,052 1.53 % 235.94 45,853 1.35 % 244.66 14,872 1.79 % 346.39 768 1.08 % 194.14 54,018 1.16 % 169.90 nekaj nekaj n ekaj 229,505 1.26 % 202.26 0 0 % 0 22,120 2.40 % 556.96 92,539 1.11 % 170.51 49,750 1.47 % 265.45 10,825 1.30 % 252.13 747 1.05 % 188.83 53,524 1.15 % 168.35 nekdo nekdo n ekdo 185,672 1.02 % 163.63 0 0 % 0 
18,069 1.96 % 454.96 80,714 0.97 % 148.72 34,070 1.00 % 181.79 6,740 0.81 % 156.98 575 0.81 % 145.35 45,504 0.98 % 143.12 name name name 156,828 0.86 % 138.21 0 0 % 0 16,570 1.80 % 417.21 63,497 0.76 % 117 32,970 0.97 % 175.92 8,779 1.06 % 204.47 434 0.61 % 109.71 34,578 0.74 % 108.76 njun njun njun 142,046 0.78 % 125.18 1 2.63 % 103.02 10,343 1.12 % 260.43 55,733 0.67 % 102.69 29,091 0.86 % 155.22 5,299 0.64 % 123.42 311 0.44 % 78.62 41,268 0.89 % 129.80 kakršen kakršen kak ršen 138,508 0.76 % 122.07 0 0 % 0 7,383 0.80 % 185.90 70,116 0.84 % 129.19 22,766 0.67 % 121.47 8,270 0.99 % 192.62 821 1.15 % 207.54 29,152 0.62 % 91.69 nekakšen nekakšen neka kšen 116,986 0.64 % 103.10 1 2.63 % 103.02 9,747 1.06 % 245.42 57,488 0.69 % 105.93 24,921 0.73 % 132.97 5,483 0.66 % 127.71 317 0.44 % 80.13 19,029 0.41 % 59.85 vame vame vame 107,947 0.59 % 95.13 5 13.16 % 515.09 15,980 1.73 % 402.36 43,124 0.52 % 79.46 22,753 0.67 % 121.40 5,833 0.70 % 135.86 258 0.36 % 65.22 19,994 0.43 % 62.89 tvoj tvoj tvoj 78,633 0.43 % 69.30 0 0 % 0 22,886 2.48 % 576.24 18,339 0.22 % 33.79 21,258 0.63 % 113.43 5,813 0.70 % 135.39 163 0.23 % 41.20 10,174 0.22 % 32 marsikaj marsikaj mars ikaj 77,371 0.42 % 68.19 0 0 % 0 2,169 0.23 % 54.61 42,346 0.51 % 78.03 15,514 0.46 % 82.78 1,924 0.23 % 44.81 202 0.28 % 51.06 15,216 0.33 % 47.86 zase zase zase 68,312 0.38 % 60.20 0 0 % 0 3,970 0.43 % 99.96 28,617 0.34 % 52.73 16,957 0.50 % 90.48 3,056 0.37 % 71.18 200 0.28 % 50.56 15,512 0.33 % 48.79 kdor kdor kdor 66,805 0.37 % 58.87 0 0 % 0 3,328 0.36 % 83.80 33,331 0.40 % 61.41 13,083 0.39 % 69.81 4,392 0.53 % 102.29 674 0.94 % 170.38 11,997 0.26 % 37.73 vsakdo vsakdo vs akdo 55,960 0.31 % 49.32 0 0 % 0 2,938 0.32 % 73.98 25,300 0.30 % 46.62 12,495 0.37 % 66.67 3,653 0.44 % 85.08 226 0.32 % 57.13 11,348 0.24 % 35.69 marsikateri marsikateri marsika teri 53,517 0.29 % 47.16 0 0 % 0 654 0.07 % 16.47 27,013 0.32 % 49.77 13,784 0.41 % 73.55 1,520 0.18 % 35.40 106 0.15 % 26.80 10,440 0.22 % 32.84 marsikdo 
marsikdo mars ikdo 52,605 0.29 % 46.36 0 0 % 0 528 0.06 % 13.29 26,799 0.32 % 49.38 12,411 0.37 % 66.22 930 0.11 % 21.66 74 0.10 % 18.71 11,863 0.25 % 37.31
| tale | tale | | tale | 44,476 | 0.24 % | 39.20 | 0 | 0 % | 0 | 9,227 | 1.00 % | 232.33 | 13,979 | 0.17 % | 25.76 | 13,760 | 0.41 % | 73.42 | 1,955 | 0.23 % | 45.53 | 351 | 0.49 % | 88.73 | 5,204 | 0.11 % | 16.37 |
| nikakršen | nikakršen | nikak | ršen | 44,197 | 0.24 % | 38.95 | 1 | 2.63 % | 103.02 | 1,372 | 0.15 % | 34.55 | 23,681 | 0.28 % | 43.63 | 7,398 | 0.22 % | 39.47 | 1,764 | 0.21 % | 41.09 | 123 | 0.17 % | 31.09 | 9,858 | 0.21 % | 31.01 |

1.2.110 List of final character-level 5-grams from pronoun lemmas in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-word_parts-pronouns-lemmas-final-5grams-taxonomy-entire.tsv

| Lemma | Lemma (lower-case) | Rest of the word | Final part of the word | Total absolute frequency of lemma | Percentage of all found lemmas | Total relative frequency (per million occurrences) | Absolute frequency [SSJ.T.K.N] | Percentage [SSJ.T.K.N] | Relative frequency [SSJ.T.K.N] | Absolute frequency [SSJ.T.K.L] | Percentage [SSJ.T.K.L] | Relative frequency [SSJ.T.K.L] | Absolute frequency [SSJ.T.P.C] | Percentage [SSJ.T.P.C] | Relative frequency [SSJ.T.P.C] | Absolute frequency [SSJ.T.P.R] | Percentage [SSJ.T.P.R] | Relative frequency [SSJ.T.P.R] | Absolute frequency [SSJ.T.K.S] | Percentage [SSJ.T.K.S] | Relative frequency [SSJ.T.K.S] | Absolute frequency [SSJ.T.D] | Percentage [SSJ.T.D] | Relative frequency [SSJ.T.D] | Absolute frequency [SSJ.I] | Percentage [SSJ.I] | Relative frequency [SSJ.I] |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| kateri | kateri | k | ateri | 2,632,017 | 24.57 % | 2,319.58 | 6 | 42.86 % | 618.11 | 67,762 | 13.44 % | 1,706.17 | 1,197,868 | 23.99 % | 2,207.15 | 448,321 | 23.43 % | 2,392.09 | 122,313 | 25.83 % | 2,848.82 | 14,555 | 32.64 % | 3,679.35 | 781,192 | 28.04 % | 2,457.05 |
| njegov | njegov | n | jegov | 1,765,365 | 16.48 % | 1,555.81 | 0 | 0 % | 0 | 110,706 | 21.96 % | 2,787.46 | 781,513 | 15.65 % | 1,439.99 | 274,494 | 14.35 % | 1,464.61 | 77,991 | 16.47 % | 1,816.50 | 5,165 | 11.58 % | 1,305.66 | 515,496 | 18.50 % | 1,621.37 |
tisti tisti tisti 1,204,248 11.24 %
1,061.30 1 7.14 % 103.02 85,777 17.02 % 2,159.77 554,134 11.10 % 1,021.03 243,437 12.72 % 1,298.90 53,821 11.37 % 1,253.56 6,616 14.84 % 1,672.45 260,462 9.35 % 819.22 njihov njihov n jihov 1,022,463 9.54 % 901.09 1 7.14 % 103.02 22,312 4.43 % 561.79 502,914 10.07 % 926.65 168,419 8.80 % 898.63 52,087 11.00 % 1,213.17 4,322 9.69 % 1,092.55 272,408 9.78 % 856.79 nekateri nekateri nek ateri 826,585 7.71 % 728.47 2 14.29 % 206.04 11,748 2.33 % 295.80 424,818 8.51 % 782.76 144,522 7.55 % 771.12 41,824 8.83 % 974.13 2,358 5.29 % 596.08 201,313 7.23 % 633.18 kakšen kakšen k akšen 686,530 6.41 % 605.04 2 14.29 % 206.04 38,652 7.67 % 973.22 324,671 6.50 % 598.23 143,530 7.50 % 765.83 22,579 4.77 % 525.89 2,249 5.04 % 568.52 154,847 5.56 % 487.03 takšen takšen t akšen 627,943 5.86 % 553.40 0 0 % 0 19,175 3.80 % 482.81 300,654 6.02 % 553.97 117,208 6.13 % 625.38 21,964 4.64 % 511.57 1,988 4.46 % 502.54 166,954 5.99 % 525.11 noben noben noben 361,539 3.38 % 318.62 0 0 % 0 24,292 4.82 % 611.65 170,441 3.41 % 314.05 59,200 3.09 % 315.87 13,989 2.95 % 325.82 1,648 3.70 % 416.60 91,969 3.30 % 289.27 nihče nihče nihče 286,212 2.67 % 252.24 0 0 % 0 26,310 5.22 % 662.46 139,188 2.79 % 256.46 46,911 2.45 % 250.30 7,577 1.60 % 176.48 820 1.84 % 207.29 65,406 2.35 % 205.72 nekaj nekaj nekaj 229,505 2.14 % 202.26 0 0 % 0 22,120 4.39 % 556.96 92,539 1.85 % 170.51 49,750 2.60 % 265.45 10,825 2.29 % 252.13 747 1.68 % 188.83 53,524 1.92 % 168.35 nekdo nekdo nekdo 185,672 1.73 % 163.63 0 0 % 0 18,069 3.58 % 454.96 80,714 1.62 % 148.72 34,070 1.78 % 181.79 6,740 1.42 % 156.98 575 1.29 % 145.35 45,504 1.63 % 143.12 kakršen kakršen ka kršen 138,508 1.29 % 122.07 0 0 % 0 7,383 1.47 % 185.90 70,116 1.40 % 129.19 22,766 1.19 % 121.47 8,270 1.75 % 192.62 821 1.84 % 207.54 29,152 1.05 % 91.69 nekakšen nekakšen nek akšen 116,986 1.09 % 103.10 1 7.14 % 103.02 9,747 1.93 % 245.42 57,488 1.15 % 105.93 24,921 1.30 % 132.97 5,483 1.16 % 127.71 317 0.71 % 80.13 19,029 0.68 % 59.85 marsikaj marsikaj mar 
sikaj 77,371 0.72 % 68.19 0 0 % 0 2,169 0.43 % 54.61 42,346 0.85 % 78.03 15,514 0.81 % 82.78 1,924 0.41 % 44.81 202 0.45 % 51.06 15,216 0.55 % 47.86 vsakdo vsakdo v sakdo 55,960 0.52 % 49.32 0 0 % 0 2,938 0.58 % 73.98 25,300 0.51 % 46.62 12,495 0.65 % 66.67 3,653 0.77 % 85.08 226 0.51 % 57.13 11,348 0.41 % 35.69 marsikateri marsikateri marsik ateri 53,517 0.50 % 47.16 0 0 % 0 654 0.13 % 16.47 27,013 0.54 % 49.77 13,784 0.72 % 73.55 1,520 0.32 % 35.40 106 0.24 % 26.80 10,440 0.38 % 32.84 marsikdo marsikdo mar sikdo 52,605 0.49 % 46.36 0 0 % 0 528 0.10 % 13.29 26,799 0.54 % 49.38 12,411 0.65 % 66.22 930 0.20 % 21.66 74 0.17 % 18.71 11,863 0.43 % 37.31 nikakršen nikakršen nika kršen 44,197 0.41 % 38.95 1 7.14 % 103.02 1,372 0.27 % 34.55 23,681 0.47 % 43.63 7,398 0.39 % 39.47 1,764 0.37 % 41.09 123 0.28 % 31.09 9,858 0.35 % 31.01 najin najin najin 41,218 0.39 % 36.33 0 0 % 0 9,678 1.92 % 243.68 12,373 0.25 % 22.80 10,372 0.54 % 55.34 1,298 0.27 % 30.23 44 0.10 % 11.12 7,453 0.27 % 23.44 kakršenkoli kakršenkoli kakrše nkoli 32,300 0.30 % 28.47 0 0 % 0 1,247 0.25 % 31.40 13,707 0.28 % 25.26 7,506 0.39 % 40.05 1,424 0.30 % 33.17 285 0.64 % 72.04 8,131 0.29 % 25.57 katerikoli katerikoli kater ikoli 28,599 0.27 % 25.20 0 0 % 0 1,140 0.23 % 28.70 10,160 0.20 % 18.72 8,043 0.42 % 42.91 2,655 0.56 % 61.84 369 0.83 % 93.28 6,232 0.22 % 19.60 tolikšen tolikšen tol ikšen 28,371 0.27 % 25 0 0 % 0 1,073 0.21 % 27.02 16,211 0.33 % 29.87 4,900 0.26 % 26.14 1,097 0.23 % 25.55 98 0.22 % 24.77 4,992 0.18 % 15.70 karkoli karkoli ka rkoli 26,931 0.25 % 23.73 0 0 % 0 3,684 0.73 % 92.76 9,689 0.19 % 17.85 6,558 0.34 % 34.99 1,355 0.29 % 31.56 139 0.31 % 35.14 5,506 0.20 % 17.32 vsakršen vsakršen vsa kršen 24,525 0.23 % 21.61 0 0 % 0 1,219 0.24 % 30.69 12,057 0.24 % 22.22 3,870 0.20 % 20.65 1,517 0.32 % 35.33 79 0.18 % 19.97 5,783 0.21 % 18.19 kolikšen kolikšen kol ikšen 22,993 0.21 % 20.26 0 0 % 0 317 0.06 % 7.98 12,565 0.25 % 23.15 3,557 0.19 % 18.98 1,890 0.40 % 44.02 60 0.14 % 15.17 
4,604 0.17 % 14.48
| čigar | čigar | | čigar | 21,783 | 0.20 % | 19.20 | 0 | 0 % | 0 | 1,015 | 0.20 % | 25.56 | 10,161 | 0.20 % | 18.72 | 2,918 | 0.15 % | 15.57 | 1,022 | 0.22 % | 23.80 | 118 | 0.27 % | 29.83 | 6,549 | 0.23 % | 20.60 |
| kdorkoli | kdorkoli | kdo | rkoli | 16,200 | 0.15 % | 14.28 | 0 | 0 % | 0 | 1,800 | 0.36 % | 45.32 | 7,001 | 0.14 % | 12.90 | 3,105 | 0.16 % | 16.57 | 533 | 0.11 % | 12.41 | 85 | 0.19 % | 21.49 | 3,676 | 0.13 % | 11.56 |
| nobeden | nobeden | no | beden | 10,904 | 0.10 % | 9.61 | 0 | 0 % | 0 | 1,127 | 0.22 % | 28.38 | 4,720 | 0.10 % | 8.70 | 2,228 | 0.12 % | 11.89 | 466 | 0.10 % | 10.85 | 59 | 0.13 % | 14.91 | 2,304 | 0.08 % | 7.25 |
| medme | medme | | medme | 9,006 | 0.08 % | 7.94 | 0 | 0 % | 0 | 464 | 0.09 % | 11.68 | 4,397 | 0.09 % | 8.10 | 2,010 | 0.10 % | 10.72 | 586 | 0.12 % | 13.65 | 31 | 0.07 % | 7.84 | 1,518 | 0.05 % | 4.77 |
| vajin | vajin | | vajin | 8,656 | 0.08 % | 7.63 | 0 | 0 % | 0 | 1,006 | 0.20 % | 25.33 | 1,899 | 0.04 % | 3.50 | 3,764 | 0.20 % | 20.08 | 380 | 0.08 % | 8.85 | 15 | 0.03 % | 3.79 | 1,592 | 0.06 % | 5.01 |
| čigav | čigav | | čigav | 7,676 | 0.07 % | 6.76 | 0 | 0 % | 0 | 585 | 0.12 % | 14.73 | 4,293 | 0.09 % | 7.91 | 1,233 | 0.06 % | 6.58 | 254 | 0.05 % | 5.92 | 39 | 0.09 % | 9.86 | 1,272 | 0.05 % | 4 |
| malokdo | malokdo | ma | lokdo | 6,623 | 0.06 % | 5.84 | 0 | 0 % | 0 | 176 | 0.04 % | 4.43 | 3,768 | 0.07 % | 6.94 | 1,260 | 0.07 % | 6.72 | 147 | 0.03 % | 3.42 | 15 | 0.03 % | 3.79 | 1,257 | 0.04 % | 3.95 |

1.2.111 List of initial character-level 1-grams from pronoun lower-case word forms in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-word_parts-pronouns-lowercase_forms-initial-1grams-taxonomy-entire.tsv

| Lower-case word form | Initial part of the word | Rest of the word | Total absolute frequency of lower-case word form | Percentage of all found lower-case word forms | Total relative frequency (per million occurrences) | Absolute frequency [SSJ.T.K.N] | Percentage [SSJ.T.K.N] | Relative frequency [SSJ.T.K.N] | Absolute frequency [SSJ.I] | Percentage [SSJ.I] | Relative frequency [SSJ.I] | Absolute frequency [SSJ.T.K.L] | Percentage [SSJ.T.K.L] | Relative frequency [SSJ.T.K.L] | Absolute frequency [SSJ.T.D] | Percentage [SSJ.T.D] | Relative frequency [SSJ.T.D] | Absolute frequency [SSJ.T.P.R] | Percentage [SSJ.T.P.R] | Relative frequency [SSJ.T.P.R] | Absolute frequency [SSJ.T.K.S] | Percentage [SSJ.T.K.S] | Relative frequency [SSJ.T.K.S] | Absolute frequency [SSJ.T.P.C] | Percentage [SSJ.T.P.C] | Relative frequency [SSJ.T.P.C] |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| se | s | e | 16,040,357 | 20.86 % | 14,136.29 | 61 | 16.90 % | 6,284.12 | 4,200,428 | 21.90 % | 13,211.44 | 1,002,397 | 20.74 % | 25,239.27 | 67,082 | 21.25 % | 16,957.61 | 2,888,135 | 19.94 % | 15,410.14 | 645,048 | 19.71 % | 15,023.93 | 7,237,206 | 20.79 % | 13,335.03 |
| to | t | o | 4,103,550 | 5.34 % | 3,616.44 | 8 | 2.22 % | 824.15 | 1,079,804 | 5.63 % | 3,396.26 | 218,365 | 4.52 % | 5,498.19 | 20,900 | 6.62 % | 5,283.29 | 773,014 | 5.34 % | 4,124.55 | 165,865 | 5.07 % | 3,863.19 | 1,845,594 | 5.30 % | 3,400.63 |
| ga | g | a | 2,820,915 | 3.67 % | 2,486.06 | 57 | 15.79 % | 5,872.05 | 693,711 | 3.62 % | 2,181.90 | 221,093 | 4.57 % | 5,566.88 | 9,589 | 3.04 % | 2,424 | 522,706 | 3.61 % | 2,788.99 | 123,613 | 3.78 % | 2,879.09 | 1,250,146 | 3.59 % | 2,303.48 |
| jih | j | ih | 2,543,853 | 3.31 % | 2,241.88 | 71 | 19.67 % | 7,314.31 | 632,756 | 3.30 % | 1,990.18 | 97,565 | 2.02 % | 2,456.58 | 10,205 | 3.23 % | 2,579.71 | 482,799 | 3.33 % | 2,576.06 | 132,177 | 4.04 % | 3,078.56 | 1,188,280 | 3.41 % | 2,189.48 |
| tem | t | em | 2,422,935 | 3.15 % | 2,135.32 | 5 | 1.39 % | 515.09 | 725,853 | 3.78 % | 2,283 | 65,918 | 1.36 % | 1,659.74 | 13,150 | 4.17 % | 3,324.18 | 389,063 | 2.69 % | 2,075.91 | 100,541 | 3.07 % | 2,341.72 | 1,128,405 | 3.24 % | 2,079.16 |
| si | s | i | 2,116,658 | 2.75 % | 1,865.40 | 0 | 0 % | 0 | 514,578 | 2.68 % | 1,618.48 | 177,478 | 3.67 % | 4,468.70 | 4,363 | 1.38 % | 1,102.92 | 442,516 | 3.06 % | 2,361.12 | 79,910 | 2.44 % | 1,861.20 | 897,813 | 2.58 % | 1,654.28 |
| jo | j | o | 1,732,491 | 2.25 % | 1,526.84 | 46 | 12.74 % | 4,738.85 | 423,164 | 2.21 % | 1,330.96 | 148,766 | 3.08 % | 3,745.77 | 5,376 | 1.70 % | 1,358.99 | 339,940 | 2.35 % | 1,813.81 | 81,944 | 2.50 % | 1,908.57 | 733,255 | 2.11 % | 1,351.07 |
| vse | v | se | 1,718,925 | 2.23 % | 1,514.88 | 21 | 5.82 % | 2,163.39 | 417,831 | 2.18 % | 1,314.19 | 93,251 | 1.93 % | 2,347.96 | 5,102 | 1.62 % | 1,289.73 | 341,670 | 2.36 % | 1,823.04 | 62,134 | 1.90 % | 1,447.17 | 798,916 | 2.29 % | 1,472.06 |
| kar | k | ar | 1,570,304 | 2.04 % | 1,383.90 | 1 | 0.28 % | 103.02 | 433,726 | 2.26 % | 1,364.18 | 63,660 | 1.32 % | 1,602.89 | 4,504 | 1.43 % | 1,138.56 | 290,148 | 2.00 % | 1,548.13 | 56,252 | 1.72 % | 1,310.18 | 722,013 | 2.07 % | 1,330.36 |
ta t a 1,399,658 1.82 % 1,233.51 2
0.55 % 206.04 385,481 2.01 % 1,212.44 51,880 1.07 % 1,306.28 9,758 3.09 % 2,466.72 237,867 1.64 % 1,269.18 70,067 2.14 % 1,631.94 644,603 1.85 % 1,187.72 tega t ega 1,301,008 1.69 % 1,146.57 1 0.28 % 103.02 339,271 1.77 % 1,067.10 57,915 1.20 % 1,458.24 17,867 5.66 % 4,516.58 228,327 1.58 % 1,218.28 52,168 1.59 % 1,215.05 605,459 1.74 % 1,115.60 mu m u 1,099,357 1.43 % 968.86 3 0.83 % 309.06 260,446 1.36 % 819.17 141,366 2.92 % 3,559.44 3,183 1.01 % 804.63 190,707 1.32 % 1,017.55 39,582 1.21 % 921.91 464,070 1.33 % 855.08 kaj k aj 1,092,205 1.42 % 962.55 0 0 % 0 221,852 1.16 % 697.78 116,352 2.41 % 2,929.62 3,316 1.05 % 838.25 235,001 1.62 % 1,253.89 43,222 1.32 % 1,006.69 472,462 1.36 % 870.54 mi m i 928,446 1.21 % 818.23 0 0 % 0 191,967 1.00 % 603.79 141,012 2.92 % 3,550.53 3,855 1.22 % 974.50 191,610 1.32 % 1,022.37 28,100 0.86 % 654.48 371,902 1.07 % 685.25 svoje s voje 896,025 1.17 % 789.66 2 0.55 % 206.04 228,388 1.19 % 718.34 37,102 0.77 % 934.19 2,958 0.94 % 747.75 167,773 1.16 % 895.18 42,226 1.29 % 983.49 417,576 1.20 % 769.41 te t e 778,569 1.01 % 686.15 1 0.28 % 103.02 186,165 0.97 % 585.54 50,755 1.05 % 1,277.96 5,227 1.66 % 1,321.33 144,978 1.00 % 773.55 42,256 1.29 % 984.19 349,187 1.00 % 643.40 nas n as 736,499 0.96 % 649.07 0 0 % 0 162,023 0.84 % 509.60 20,438 0.42 % 514.61 1,687 0.53 % 426.46 151,156 1.04 % 806.52 27,837 0.85 % 648.36 373,358 1.07 % 687.94 jim j im 723,196 0.94 % 637.35 1 0.28 % 103.02 177,348 0.93 % 557.81 25,600 0.53 % 644.58 1,813 0.57 % 458.31 118,672 0.82 % 633.19 26,431 0.81 % 615.61 373,331 1.07 % 687.89 vseh v seh 667,739 0.87 % 588.47 1 0.28 % 103.02 182,831 0.95 % 575.05 14,187 0.29 % 357.21 2,601 0.82 % 657.50 111,575 0.77 % 595.33 25,604 0.78 % 596.35 330,940 0.95 % 609.78 nam n am 660,025 0.86 % 581.68 0 0 % 0 144,622 0.75 % 454.87 16,117 0.33 % 405.81 1,553 0.49 % 392.58 130,942 0.90 % 698.66 24,627 0.75 % 573.59 342,164 0.98 % 630.46 vsi v si 590,283 0.77 % 520.21 1 0.28 % 103.02 137,969 0.72 % 433.95 30,468 0.63 % 
767.15 1,801 0.57 % 455.27 107,195 0.74 % 571.96 18,366 0.56 % 427.77 294,483 0.85 % 542.60
| svojo | s | vojo | 587,202 | 0.76 % | 517.50 | 1 | 0.28 % | 103.02 | 152,737 | 0.80 % | 480.40 | 28,699 | 0.59 % | 722.61 | 1,737 | 0.55 % | 439.09 | 117,000 | 0.81 % | 624.27 | 25,348 | 0.78 % | 590.39 | 261,680 | 0.75 % | 482.16 |
| ji | j | i | 560,679 | 0.73 % | 494.12 | 3 | 0.83 % | 309.06 | 123,492 | 0.64 % | 388.41 | 96,395 | 1.99 % | 2,427.12 | 1,411 | 0.45 % | 356.69 | 120,392 | 0.83 % | 642.37 | 17,867 | 0.55 % | 416.14 | 201,119 | 0.58 % | 370.58 |
| me | m | e | 551,083 | 0.72 % | 485.67 | 0 | 0 % | 0 | 100,520 | 0.52 % | 316.16 | 107,770 | 2.23 % | 2,713.53 | 1,191 | 0.38 % | 301.07 | 117,280 | 0.81 % | 625.77 | 16,084 | 0.49 % | 374.62 | 208,238 | 0.60 % | 383.69 |
| teh | t | eh | 461,307 | 0.60 % | 406.55 | 0 | 0 % | 0 | 118,196 | 0.62 % | 371.76 | 11,382 | 0.23 % | 286.59 | 3,211 | 1.02 % | 811.71 | 76,699 | 0.53 % | 409.24 | 24,262 | 0.74 % | 565.09 | 227,557 | 0.65 % | 419.29 |
| kateri | k | ateri | 443,426 | 0.58 % | 390.79 | 3 | 0.83 % | 309.06 | 138,022 | 0.72 % | 434.12 | 12,103 | 0.25 % | 304.74 | 2,070 | 0.66 % | 523.27 | 70,318 | 0.48 % | 375.19 | 19,163 | 0.59 % | 446.33 | 201,747 | 0.58 % | 371.73 |
| ti | t | i | 441,478 | 0.57 % | 389.07 | 0 | 0 % | 0 | 92,094 | 0.48 % | 289.66 | 66,304 | 1.37 % | 1,669.46 | 1,655 | 0.52 % | 418.37 | 94,191 | 0.65 % | 502.57 | 22,424 | 0.69 % | 522.28 | 164,810 | 0.47 % | 303.67 |
| vsak | v | sak | 435,378 | 0.57 % | 383.70 | 4 | 1.11 % | 412.07 | 98,859 | 0.52 % | 310.94 | 20,678 | 0.43 % | 520.65 | 1,480 | 0.47 % | 374.13 | 95,410 | 0.66 % | 509.08 | 19,497 | 0.60 % | 454.11 | 199,450 | 0.57 % | 367.50 |
| katerem | k | aterem | 422,748 | 0.55 % | 372.57 | 0 | 0 % | 0 | 131,630 | 0.69 % | 414.01 | 9,500 | 0.20 % | 239.20 | 1,712 | 0.54 % | 432.78 | 65,619 | 0.45 % | 350.12 | 17,545 | 0.54 % | 408.64 | 196,742 | 0.56 % | 362.51 |
| katerih | k | aterih | 414,448 | 0.54 % | 365.25 | 2 | 0.55 % | 206.04 | 115,416 | 0.60 % | 363.01 | 9,180 | 0.19 % | 231.14 | 2,561 | 0.81 % | 647.39 | 72,188 | 0.50 % | 385.17 | 24,070 | 0.73 % | 560.62 | 191,031 | 0.55 % | 351.99 |
| temu | t | emu | 395,296 | 0.51 % | 348.37 | 0 | 0 % | 0 | 99,918 | 0.52 % | 314.27 | 12,140 | 0.25 % | 305.67 | 1,550 | 0.49 % | 391.82 | 73,770 | 0.51 % | 393.61 | 14,842 | 0.45 % | 345.69 | 193,076 | 0.56 % | 355.76 |
| svoj | s | voj | 384,876 | 0.50 % | 339.19 | 0 | 0 % | 0 | 102,755 | 0.54 % | 323.19 | 15,313 | 0.32 % | 385.56 | 1,259 | 0.40 % | 318.26 | 72,883 | 0.50 % | 388.88 | 15,034 | 0.46 % | 350.16 | 177,632 | 0.51 % | 327.30 |

1.2.112 List of initial character-level 2-grams from pronoun lower-case word forms in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-word_parts-pronouns-lowercase_forms-initial-2grams-taxonomy-entire.tsv

| Lower-case word form | Initial part of the word | Rest of the word | Total absolute frequency of lower-case word form | Percentage of all found lower-case word forms | Total relative frequency (per million occurrences) | Absolute frequency [SSJ.T.K.N] | Percentage [SSJ.T.K.N] | Relative frequency [SSJ.T.K.N] | Absolute frequency [SSJ.I] | Percentage [SSJ.I] | Relative frequency [SSJ.I] | Absolute frequency [SSJ.T.K.L] | Percentage [SSJ.T.K.L] | Relative frequency [SSJ.T.K.L] | Absolute frequency [SSJ.T.D] | Percentage [SSJ.T.D] | Relative frequency [SSJ.T.D] | Absolute frequency [SSJ.T.P.R] | Percentage [SSJ.T.P.R] | Relative frequency [SSJ.T.P.R] | Absolute frequency [SSJ.T.K.S] | Percentage [SSJ.T.K.S] | Relative frequency [SSJ.T.K.S] | Absolute frequency [SSJ.T.P.C] | Percentage [SSJ.T.P.C] | Relative frequency [SSJ.T.P.C] |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| se | se | | 16,040,357 | 20.86 % | 14,136.29 | 61 | 16.90 % | 6,284.12 | 4,200,428 | 21.90 % | 13,211.44 | 1,002,397 | 20.74 % | 25,239.27 | 67,082 | 21.25 % | 16,957.61 | 2,888,135 | 19.94 % | 15,410.14 | 645,048 | 19.71 % | 15,023.93 | 7,237,206 | 20.79 % | 13,335.03 |
| to | to | | 4,103,550 | 5.34 % | 3,616.44 | 8 | 2.22 % | 824.15 | 1,079,804 | 5.63 % | 3,396.26 | 218,365 | 4.52 % | 5,498.19 | 20,900 | 6.62 % | 5,283.29 | 773,014 | 5.34 % | 4,124.55 | 165,865 | 5.07 % | 3,863.19 | 1,845,594 | 5.30 % | 3,400.63 |
| ga | ga | | 2,820,915 | 3.67 % | 2,486.06 | 57 | 15.79 % | 5,872.05 | 693,711 | 3.62 % | 2,181.90 | 221,093 | 4.57 % | 5,566.88 | 9,589 | 3.04 % | 2,424 | 522,706 | 3.61 % | 2,788.99 | 123,613 | 3.78 % | 2,879.09 | 1,250,146 | 3.59 % | 2,303.48 |
| jih | ji | h | 2,543,853 | 3.31 % | 2,241.88 | 71 | 19.67 % | 7,314.31 | 632,756 | 3.30 % | 1,990.18 | 97,565 | 2.02 % | 2,456.58 | 10,205 | 3.23 % | 2,579.71 | 482,799 | 3.33 % | 2,576.06 | 132,177 | 4.04 % | 3,078.56 | 1,188,280 | 3.41 % | 2,189.48 |
tem te m 2,422,935 3.15 % 2,135.32 5 1.39 % 515.09 725,853 3.78 % 2,283 65,918 1.36 % 1,659.74 13,150
4.17 % 3,324.18 389,063 2.69 % 2,075.91 100,541 3.07 % 2,341.72 1,128,405 3.24 % 2,079.16 si si 2,116,658 2.75 % 1,865.40 0 0 % 0 514,578 2.68 % 1,618.48 177,478 3.67 % 4,468.70 4,363 1.38 % 1,102.92 442,516 3.06 % 2,361.12 79,910 2.44 % 1,861.20 897,813 2.58 % 1,654.28 jo jo 1,732,491 2.25 % 1,526.84 46 12.74 % 4,738.85 423,164 2.21 % 1,330.96 148,766 3.08 % 3,745.77 5,376 1.70 % 1,358.99 339,940 2.35 % 1,813.81 81,944 2.50 % 1,908.57 733,255 2.11 % 1,351.07 vse vs e 1,718,925 2.23 % 1,514.88 21 5.82 % 2,163.39 417,831 2.18 % 1,314.19 93,251 1.93 % 2,347.96 5,102 1.62 % 1,289.73 341,670 2.36 % 1,823.04 62,134 1.90 % 1,447.17 798,916 2.29 % 1,472.06 kar ka r 1,570,304 2.04 % 1,383.90 1 0.28 % 103.02 433,726 2.26 % 1,364.18 63,660 1.32 % 1,602.89 4,504 1.43 % 1,138.56 290,148 2.00 % 1,548.13 56,252 1.72 % 1,310.18 722,013 2.07 % 1,330.36 ta ta 1,399,658 1.82 % 1,233.51 2 0.55 % 206.04 385,481 2.01 % 1,212.44 51,880 1.07 % 1,306.28 9,758 3.09 % 2,466.72 237,867 1.64 % 1,269.18 70,067 2.14 % 1,631.94 644,603 1.85 % 1,187.72 tega te ga 1,301,008 1.69 % 1,146.57 1 0.28 % 103.02 339,271 1.77 % 1,067.10 57,915 1.20 % 1,458.24 17,867 5.66 % 4,516.58 228,327 1.58 % 1,218.28 52,168 1.59 % 1,215.05 605,459 1.74 % 1,115.60 mu mu 1,099,357 1.43 % 968.86 3 0.83 % 309.06 260,446 1.36 % 819.17 141,366 2.92 % 3,559.44 3,183 1.01 % 804.63 190,707 1.32 % 1,017.55 39,582 1.21 % 921.91 464,070 1.33 % 855.08 kaj ka j 1,092,205 1.42 % 962.55 0 0 % 0 221,852 1.16 % 697.78 116,352 2.41 % 2,929.62 3,316 1.05 % 838.25 235,001 1.62 % 1,253.89 43,222 1.32 % 1,006.69 472,462 1.36 % 870.54 mi mi 928,446 1.21 % 818.23 0 0 % 0 191,967 1.00 % 603.79 141,012 2.92 % 3,550.53 3,855 1.22 % 974.50 191,610 1.32 % 1,022.37 28,100 0.86 % 654.48 371,902 1.07 % 685.25 svoje sv oje 896,025 1.17 % 789.66 2 0.55 % 206.04 228,388 1.19 % 718.34 37,102 0.77 % 934.19 2,958 0.94 % 747.75 167,773 1.16 % 895.18 42,226 1.29 % 983.49 417,576 1.20 % 769.41 te te 778,569 1.01 % 686.15 1 0.28 % 103.02 186,165 0.97 % 585.54 
50,755 1.05 % 1,277.96 5,227 1.66 % 1,321.33 144,978 1.00 % 773.55 42,256 1.29 % 984.19 349,187 1.00 % 643.40 nas na s 736,499 0.96 % 649.07 0 0 % 0 162,023 0.84 % 509.60 20,438 0.42 % 514.61 1,687 0.53 % 426.46 151,156 1.04 % 806.52 27,837 0.85 % 648.36 373,358 1.07 % 687.94 jim ji m 723,196 0.94 % 637.35 1 0.28 % 103.02 177,348 0.93 % 557.81 25,600 0.53 % 644.58 1,813 0.57 % 458.31 118,672 0.82 % 633.19 26,431 0.81 % 615.61 373,331 1.07 % 687.89 vseh vs eh 667,739 0.87 % 588.47 1 0.28 % 103.02 182,831 0.95 % 575.05 14,187 0.29 % 357.21 2,601 0.82 % 657.50 111,575 0.77 % 595.33 25,604 0.78 % 596.35 330,940 0.95 % 609.78 nam na m 660,025 0.86 % 581.68 0 0 % 0 144,622 0.75 % 454.87 16,117 0.33 % 405.81 1,553 0.49 % 392.58 130,942 0.90 % 698.66 24,627 0.75 % 573.59 342,164 0.98 % 630.46 vsi vs i 590,283 0.77 % 520.21 1 0.28 % 103.02 137,969 0.72 % 433.95 30,468 0.63 % 767.15 1,801 0.57 % 455.27 107,195 0.74 % 571.96 18,366 0.56 % 427.77 294,483 0.85 % 542.60 svojo sv ojo 587,202 0.76 % 517.50 1 0.28 % 103.02 152,737 0.80 % 480.40 28,699 0.59 % 722.61 1,737 0.55 % 439.09 117,000 0.81 % 624.27 25,348 0.78 % 590.39 261,680 0.75 % 482.16 ji ji 560,679 0.73 % 494.12 3 0.83 % 309.06 123,492 0.64 % 388.41 96,395 1.99 % 2,427.12 1,411 0.45 % 356.69 120,392 0.83 % 642.37 17,867 0.55 % 416.14 201,119 0.58 % 370.58 me me 551,083 0.72 % 485.67 0 0 % 0 100,520 0.52 % 316.16 107,770 2.23 % 2,713.53 1,191 0.38 % 301.07 117,280 0.81 % 625.77 16,084 0.49 % 374.62 208,238 0.60 % 383.69 teh te h 461,307 0.60 % 406.55 0 0 % 0 118,196 0.62 % 371.76 11,382 0.23 % 286.59 3,211 1.02 % 811.71 76,699 0.53 % 409.24 24,262 0.74 % 565.09 227,557 0.65 % 419.29 kateri ka teri 443,426 0.58 % 390.79 3 0.83 % 309.06 138,022 0.72 % 434.12 12,103 0.25 % 304.74 2,070 0.66 % 523.27 70,318 0.48 % 375.19 19,163 0.59 % 446.33 201,747 0.58 % 371.73 ti ti 441,478 0.57 % 389.07 0 0 % 0 92,094 0.48 % 289.66 66,304 1.37 % 1,669.46 1,655 0.52 % 418.37 94,191 0.65 % 502.57 22,424 0.69 % 522.28 164,810 0.47 % 
303.67
vsak vs ak 435,378 0.57 % 383.70 4 1.11 % 412.07 98,859 0.52 % 310.94 20,678 0.43 % 520.65 1,480 0.47 % 374.13 95,410 0.66 % 509.08 19,497 0.60 % 454.11 199,450 0.57 % 367.50
katerem ka terem 422,748 0.55 % 372.57 0 0 % 0 131,630 0.69 % 414.01 9,500 0.20 % 239.20 1,712 0.54 % 432.78 65,619 0.45 % 350.12 17,545 0.54 % 408.64 196,742 0.56 % 362.51
katerih ka terih 414,448 0.54 % 365.25 2 0.55 % 206.04 115,416 0.60 % 363.01 9,180 0.19 % 231.14 2,561 0.81 % 647.39 72,188 0.50 % 385.17 24,070 0.73 % 560.62 191,031 0.55 % 351.99
temu te mu 395,296 0.51 % 348.37 0 0 % 0 99,918 0.52 % 314.27 12,140 0.25 % 305.67 1,550 0.49 % 391.82 73,770 0.51 % 393.61 14,842 0.45 % 345.69 193,076 0.56 % 355.76
svoj sv oj 384,876 0.50 % 339.19 0 0 % 0 102,755 0.54 % 323.19 15,313 0.32 % 385.56 1,259 0.40 % 318.26 72,883 0.50 % 388.88 15,034 0.46 % 350.16 177,632 0.51 % 327.30

1.2.113 List of initial character-level 3-grams from pronoun lower-case word forms in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-word_parts-pronouns-lowercase_forms-initial-3grams-taxonomy-entire.tsv

Lower-case word form | Initial part of the word | Rest of the word | Total absolute frequency of lower-case word form | Percentage of all found lower-case word forms | Total relative frequency (per million occurrences) | Absolute frequency [SSJ.T.K.N] | Percentage [SSJ.T.K.N] | Relative frequency [SSJ.T.K.N] | Absolute frequency [SSJ.I] | Percentage [SSJ.I] | Relative frequency [SSJ.I] | Absolute frequency [SSJ.T.K.L] | Percentage [SSJ.T.K.L] | Relative frequency [SSJ.T.K.L] | Absolute frequency [SSJ.T.D] | Percentage [SSJ.T.D] | Relative frequency [SSJ.T.D] | Absolute frequency [SSJ.T.P.R] | Percentage [SSJ.T.P.R] | Relative frequency [SSJ.T.P.R] | Absolute frequency [SSJ.T.K.S] | Percentage [SSJ.T.K.S] | Relative frequency [SSJ.T.K.S] | Absolute frequency [SSJ.T.P.C] | Percentage [SSJ.T.P.C] | Relative frequency [SSJ.T.P.C]
jih jih 2,543,853 5.79 %
2,241.88 71 39.66 % 7,314.31 632,756 5.84 % 1,990.18 97,565 4.13 % 2,456.58 10,205 5.66 % 2,579.71 482,799 5.79 % 2,576.06 132,177 6.88 % 3,078.56 1,188,280 5.86 % 2,189.48 tem tem 2,422,935 5.52 % 2,135.32 5 2.79 % 515.09 725,853 6.70 % 2,283 65,918 2.79 % 1,659.74 13,150 7.29 % 3,324.18 389,063 4.67 % 2,075.91 100,541 5.23 % 2,341.72 1,128,405 5.56 % 2,079.16 vse vse 1,718,925 3.91 % 1,514.88 21 11.73 % 2,163.39 417,831 3.85 % 1,314.19 93,251 3.95 % 2,347.96 5,102 2.83 % 1,289.73 341,670 4.10 % 1,823.04 62,134 3.23 % 1,447.17 798,916 3.94 % 1,472.06 kar kar 1,570,304 3.58 % 1,383.90 1 0.56 % 103.02 433,726 4.00 % 1,364.18 63,660 2.70 % 1,602.89 4,504 2.50 % 1,138.56 290,148 3.48 % 1,548.13 56,252 2.93 % 1,310.18 722,013 3.56 % 1,330.36 tega teg a 1,301,008 2.96 % 1,146.57 1 0.56 % 103.02 339,271 3.13 % 1,067.10 57,915 2.45 % 1,458.24 17,867 9.90 % 4,516.58 228,327 2.74 % 1,218.28 52,168 2.71 % 1,215.05 605,459 2.98 % 1,115.60 kaj kaj 1,092,205 2.49 % 962.55 0 0 % 0 221,852 2.05 % 697.78 116,352 4.93 % 2,929.62 3,316 1.84 % 838.25 235,001 2.82 % 1,253.89 43,222 2.25 % 1,006.69 472,462 2.33 % 870.54 svoje svo je 896,025 2.04 % 789.66 2 1.12 % 206.04 228,388 2.11 % 718.34 37,102 1.57 % 934.19 2,958 1.64 % 747.75 167,773 2.01 % 895.18 42,226 2.20 % 983.49 417,576 2.06 % 769.41 nas nas 736,499 1.68 % 649.07 0 0 % 0 162,023 1.50 % 509.60 20,438 0.87 % 514.61 1,687 0.94 % 426.46 151,156 1.81 % 806.52 27,837 1.45 % 648.36 373,358 1.84 % 687.94 jim jim 723,196 1.65 % 637.35 1 0.56 % 103.02 177,348 1.64 % 557.81 25,600 1.08 % 644.58 1,813 1.00 % 458.31 118,672 1.42 % 633.19 26,431 1.38 % 615.61 373,331 1.84 % 687.89 vseh vse h 667,739 1.52 % 588.47 1 0.56 % 103.02 182,831 1.69 % 575.05 14,187 0.60 % 357.21 2,601 1.44 % 657.50 111,575 1.34 % 595.33 25,604 1.33 % 596.35 330,940 1.63 % 609.78 nam nam 660,025 1.50 % 581.68 0 0 % 0 144,622 1.33 % 454.87 16,117 0.68 % 405.81 1,553 0.86 % 392.58 130,942 1.57 % 698.66 24,627 1.28 % 573.59 342,164 1.69 % 630.46 vsi vsi 590,283 1.34 
% 520.21 1 0.56 % 103.02 137,969 1.27 % 433.95 30,468 1.29 % 767.15 1,801 1.00 % 455.27 107,195 1.29 % 571.96 18,366 0.96 % 427.77 294,483 1.45 % 542.60 svojo svo jo 587,202 1.34 % 517.50 1 0.56 % 103.02 152,737 1.41 % 480.40 28,699 1.22 % 722.61 1,737 0.96 % 439.09 117,000 1.40 % 624.27 25,348 1.32 % 590.39 261,680 1.29 % 482.16 teh teh 461,307 1.05 % 406.55 0 0 % 0 118,196 1.09 % 371.76 11,382 0.48 % 286.59 3,211 1.78 % 811.71 76,699 0.92 % 409.24 24,262 1.26 % 565.09 227,557 1.12 % 419.29 kateri kat eri 443,426 1.01 % 390.79 3 1.68 % 309.06 138,022 1.27 % 434.12 12,103 0.51 % 304.74 2,070 1.15 % 523.27 70,318 0.84 % 375.19 19,163 1.00 % 446.33 201,747 0.99 % 371.73 vsak vsa k 435,378 0.99 % 383.70 4 2.23 % 412.07 98,859 0.91 % 310.94 20,678 0.88 % 520.65 1,480 0.82 % 374.13 95,410 1.15 % 509.08 19,497 1.01 % 454.11 199,450 0.98 % 367.50 katerem kat erem 422,748 0.96 % 372.57 0 0 % 0 131,630 1.21 % 414.01 9,500 0.40 % 239.20 1,712 0.95 % 432.78 65,619 0.79 % 350.12 17,545 0.91 % 408.64 196,742 0.97 % 362.51 katerih kat erih 414,448 0.94 % 365.25 2 1.12 % 206.04 115,416 1.06 % 363.01 9,180 0.39 % 231.14 2,561 1.42 % 647.39 72,188 0.87 % 385.17 24,070 1.25 % 560.62 191,031 0.94 % 351.99 temu tem u 395,296 0.90 % 348.37 0 0 % 0 99,918 0.92 % 314.27 12,140 0.51 % 305.67 1,550 0.86 % 391.82 73,770 0.89 % 393.61 14,842 0.77 % 345.69 193,076 0.95 % 355.76 svoj svo j 384,876 0.88 % 339.19 0 0 % 0 102,755 0.95 % 323.19 15,313 0.65 % 385.56 1,259 0.70 % 318.26 72,883 0.88 % 388.88 15,034 0.78 % 350.16 177,632 0.88 % 327.30 nič nič 368,884 0.84 % 325.10 0 0 % 0 71,597 0.66 % 225.19 37,466 1.59 % 943.35 1,096 0.61 % 277.06 76,998 0.92 % 410.84 11,727 0.61 % 273.14 170,000 0.84 % 313.24 svojih svo jih 359,437 0.82 % 316.77 0 0 % 0 93,095 0.86 % 292.81 13,076 0.55 % 329.24 1,311 0.73 % 331.41 63,062 0.76 % 336.48 16,724 0.87 % 389.52 172,169 0.85 % 317.23 vam vam 353,752 0.81 % 311.76 4 2.23 % 412.07 54,047 0.50 % 169.99 19,362 0.82 % 487.51 1,295 0.72 % 327.36 131,736 1.58 % 
702.90 21,242 1.11 % 494.75 126,066 0.62 % 232.28
tej tej 348,441 0.79 % 307.08 0 0 % 0 101,358 0.94 % 318.80 10,799 0.46 % 271.91 1,893 1.05 % 478.53 48,392 0.58 % 258.20 15,293 0.80 % 356.19 170,706 0.84 % 314.54
kdo kdo 347,009 0.79 % 305.82 0 0 % 0 73,405 0.68 % 230.88 26,511 1.12 % 667.52 1,709 0.95 % 432.02 64,480 0.77 % 344.04 9,866 0.51 % 229.79 171,038 0.84 % 315.15
njimi nji mi 323,106 0.74 % 284.75 2 1.12 % 206.04 87,905 0.81 % 276.48 10,619 0.45 % 267.37 743 0.41 % 187.82 54,499 0.65 % 290.79 13,886 0.72 % 323.42 155,452 0.77 % 286.43
tisti tis ti 321,273 0.73 % 283.14 0 0 % 0 69,913 0.65 % 219.89 21,523 0.91 % 541.93 1,718 0.95 % 434.29 63,881 0.77 % 340.85 12,686 0.66 % 295.47 151,552 0.75 % 279.24
vsem vse m 314,109 0.71 % 276.82 0 0 % 0 72,840 0.67 % 229.10 13,013 0.55 % 327.65 950 0.53 % 240.15 58,875 0.71 % 314.14 10,914 0.57 % 254.20 157,517 0.78 % 290.24
njegov nje gov 312,067 0.71 % 275.02 0 0 % 0 91,225 0.84 % 286.93 20,759 0.88 % 522.69 788 0.44 % 199.20 48,529 0.58 % 258.93 11,386 0.59 % 265.19 139,380 0.69 % 256.82
svojega svo jega 308,222 0.70 % 271.63 0 0 % 0 77,639 0.72 % 244.19 15,805 0.67 % 397.95 946 0.52 % 239.14 59,785 0.72 % 318.99 14,856 0.77 % 346.01 139,191 0.69 % 256.47
katerega kat erega 301,745 0.69 % 265.93 0 0 % 0 88,965 0.82 % 279.82 8,703 0.37 % 219.13 1,749 0.97 % 442.13 52,123 0.63 % 278.11 12,539 0.65 % 292.05 137,666 0.68 % 253.66
katero kat ero 296,676 0.68 % 261.46 0 0 % 0 86,849 0.80 % 273.16 9,799 0.41 % 246.73 1,822 1.01 % 460.58 54,232 0.65 % 289.36 12,778 0.67 % 297.61 131,196 0.65 % 241.74

1.2.114 List of initial character-level 4-grams from pronoun lower-case word forms in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-word_parts-pronouns-lowercase_forms-initial-4grams-taxonomy-entire.tsv

Lower-case word form | Initial part of the word | Rest of the word | Total absolute frequency of lower-case word form | Percentage of all found lower-case word forms | Total relative frequency (per million occurrences) | Absolute frequency [SSJ.T.K.N] | Percentage [SSJ.T.K.N] | Relative frequency [SSJ.T.K.N] | Absolute frequency [SSJ.I] | Percentage [SSJ.I] | Relative frequency [SSJ.I] | Absolute frequency [SSJ.T.K.L] | Percentage [SSJ.T.K.L] | Relative frequency [SSJ.T.K.L] | Absolute frequency [SSJ.T.D] | Percentage [SSJ.T.D] | Relative frequency [SSJ.T.D] | Absolute frequency [SSJ.T.P.R] | Percentage [SSJ.T.P.R] | Relative frequency [SSJ.T.P.R] | Absolute frequency [SSJ.T.K.S] | Percentage [SSJ.T.K.S] | Relative frequency [SSJ.T.K.S] | Absolute frequency [SSJ.T.P.C] | Percentage [SSJ.T.P.C] | Relative frequency [SSJ.T.P.C]
tega tega 1,301,008 4.72 % 1,146.57 1 1.64 % 103.02 339,271 4.95 % 1,067.10 57,915 3.83 % 1,458.24 17,867 15.21 % 4,516.58 228,327 4.42 % 1,218.28 52,168 4.19 % 1,215.05 605,459 4.78 % 1,115.60
svoje svoj e 896,025 3.25 % 789.66 2 3.28 % 206.04 228,388 3.33 % 718.34 37,102 2.46 % 934.19 2,958 2.52 % 747.75 167,773 3.25 % 895.18 42,226 3.39 % 983.49 417,576 3.29 % 769.41
vseh vseh 667,739 2.42 % 588.47 1 1.64 % 103.02 182,831 2.67 % 575.05 14,187 0.94 % 357.21 2,601 2.21 % 657.50 111,575 2.16 % 595.33 25,604 2.06 % 596.35 330,940 2.61 % 609.78
svojo svoj o 587,202 2.13 % 517.50 1 1.64 % 103.02 152,737 2.23 % 480.40 28,699 1.90 % 722.61 1,737 1.48 % 439.09 117,000 2.26 % 624.27 25,348 2.04 % 590.39 261,680 2.06 % 482.16
kateri kate ri 443,426 1.61 % 390.79 3 4.92 % 309.06 138,022 2.01 % 434.12 12,103 0.80 % 304.74 2,070 1.76 % 523.27 70,318 1.36 % 375.19 19,163 1.54 % 446.33 201,747 1.59 % 371.73
vsak vsak 435,378 1.58 % 383.70 4 6.56 % 412.07 98,859 1.44 % 310.94 20,678 1.37 % 520.65 1,480 1.26 % 374.13 95,410 1.85 % 509.08 19,497 1.57 % 454.11 199,450 1.57 % 367.50
katerem kate rem 422,748 1.53 % 372.57 0 0 % 0 131,630 1.92 % 414.01 9,500 0.63 % 239.20 1,712 1.46 % 432.78 65,619 1.27 % 350.12 17,545 1.41 % 408.64 196,742 1.55 % 362.51
katerih kate rih 414,448 1.50 % 365.25 2 3.28 % 206.04 115,416 1.68 %
363.01 9,180 0.61 % 231.14 2,561 2.18 % 647.39 72,188 1.40 % 385.17 24,070 1.93 % 560.62 191,031 1.51 % 351.99 temu temu 395,296 1.43 % 348.37 0 0 % 0 99,918 1.46 % 314.27 12,140 0.80 % 305.67 1,550 1.32 % 391.82 73,770 1.43 % 393.61 14,842 1.19 % 345.69 193,076 1.52 % 355.76 svoj svoj 384,876 1.40 % 339.19 0 0 % 0 102,755 1.50 % 323.19 15,313 1.01 % 385.56 1,259 1.07 % 318.26 72,883 1.41 % 388.88 15,034 1.21 % 350.16 177,632 1.40 % 327.30 svojih svoj ih 359,437 1.30 % 316.77 0 0 % 0 93,095 1.36 % 292.81 13,076 0.86 % 329.24 1,311 1.12 % 331.41 63,062 1.22 % 336.48 16,724 1.34 % 389.52 172,169 1.36 % 317.23 njimi njim i 323,106 1.17 % 284.75 2 3.28 % 206.04 87,905 1.28 % 276.48 10,619 0.70 % 267.37 743 0.63 % 187.82 54,499 1.05 % 290.79 13,886 1.11 % 323.42 155,452 1.23 % 286.43 tisti tist i 321,273 1.17 % 283.14 0 0 % 0 69,913 1.02 % 219.89 21,523 1.42 % 541.93 1,718 1.46 % 434.29 63,881 1.24 % 340.85 12,686 1.02 % 295.47 151,552 1.20 % 279.24 vsem vsem 314,109 1.14 % 276.82 0 0 % 0 72,840 1.06 % 229.10 13,013 0.86 % 327.65 950 0.81 % 240.15 58,875 1.14 % 314.14 10,914 0.88 % 254.20 157,517 1.24 % 290.24 njegov njeg ov 312,067 1.13 % 275.02 0 0 % 0 91,225 1.33 % 286.93 20,759 1.37 % 522.69 788 0.67 % 199.20 48,529 0.94 % 258.93 11,386 0.91 % 265.19 139,380 1.10 % 256.82 svojega svoj ega 308,222 1.12 % 271.63 0 0 % 0 77,639 1.13 % 244.19 15,805 1.05 % 397.95 946 0.81 % 239.14 59,785 1.16 % 318.99 14,856 1.19 % 346.01 139,191 1.10 % 256.47 katerega kate rega 301,745 1.09 % 265.93 0 0 % 0 88,965 1.30 % 279.82 8,703 0.58 % 219.13 1,749 1.49 % 442.13 52,123 1.01 % 278.11 12,539 1.01 % 292.05 137,666 1.09 % 253.66 katero kate ro 296,676 1.08 % 261.46 0 0 % 0 86,849 1.27 % 273.16 9,799 0.65 % 246.73 1,822 1.55 % 460.58 54,232 1.05 % 289.36 12,778 1.03 % 297.61 131,196 1.03 % 241.74 katere kate re 293,688 1.06 % 258.83 0 0 % 0 83,547 1.22 % 262.78 7,495 0.50 % 188.72 2,320 1.98 % 586.47 51,495 1.00 % 274.76 15,891 1.28 % 370.12 132,940 1.05 % 244.95 njih njih 292,042 1.06 
% 257.38 1 1.64 % 103.02 68,727 1.00 % 216.16 13,870 0.92 % 349.23 966 0.82 % 244.19 56,026 1.08 % 298.94 15,855 1.27 % 369.28 136,597 1.08 % 251.69 nekateri neka teri 291,094 1.06 % 256.54 1 1.64 % 103.02 69,699 1.02 % 219.22 5,577 0.37 % 140.42 710 0.60 % 179.48 51,392 0.99 % 274.21 12,872 1.03 % 299.80 150,843 1.19 % 277.94 njegovo njeg ovo 282,399 1.02 % 248.88 0 0 % 0 75,507 1.10 % 237.49 18,982 1.26 % 477.95 1,040 0.89 % 262.90 48,068 0.93 % 256.48 15,381 1.24 % 358.24 123,421 0.97 % 227.41 naše naše 270,649 0.98 % 238.52 0 0 % 0 56,652 0.83 % 178.19 5,104 0.34 % 128.51 901 0.77 % 227.76 47,448 0.92 % 253.17 11,218 0.90 % 261.28 149,326 1.18 % 275.14 njim njim 269,365 0.98 % 237.39 2 3.28 % 206.04 65,014 0.95 % 204.49 23,698 1.57 % 596.69 704 0.60 % 177.96 51,706 1.00 % 275.89 11,409 0.92 % 265.73 116,832 0.92 % 215.27 njegova njeg ova 263,259 0.95 % 232.01 0 0 % 0 71,977 1.05 % 226.39 16,409 1.09 % 413.16 578 0.49 % 146.11 48,864 0.95 % 260.72 11,792 0.95 % 274.65 113,639 0.90 % 209.39 svoji svoj i 244,889 0.89 % 215.82 0 0 % 0 69,060 1.01 % 217.21 12,098 0.80 % 304.61 728 0.62 % 184.03 44,470 0.86 % 237.28 10,640 0.85 % 247.82 107,893 0.85 % 198.80 vsako vsak o 242,448 0.88 % 213.67 0 0 % 0 56,658 0.83 % 178.20 7,927 0.52 % 199.59 1,017 0.87 % 257.09 48,054 0.93 % 256.40 10,671 0.86 % 248.54 118,121 0.93 % 217.65 nekaterih neka terih 231,651 0.84 % 204.15 0 0 % 0 58,371 0.85 % 183.59 1,963 0.13 % 49.43 643 0.55 % 162.54 38,342 0.74 % 204.58 11,127 0.89 % 259.16 121,205 0.96 % 223.33 njem njem 229,536 0.83 % 202.29 1 1.64 % 103.02 51,728 0.76 % 162.70 14,739 0.97 % 371.11 557 0.47 % 140.80 42,205 0.82 % 225.19 12,298 0.99 % 286.44 108,008 0.85 % 199.01 njej njej 222,731 0.81 % 196.29 0 0 % 0 48,591 0.71 % 152.83 19,397 1.28 % 488.40 562 0.48 % 142.07 41,419 0.80 % 221 11,185 0.90 % 260.51 101,577 0.80 % 187.16 svojem svoj em 213,135 0.77 % 187.83 0 0 % 0 58,828 0.86 % 185.03 8,622 0.57 % 217.09 825 0.70 % 208.55 38,100 0.74 % 203.29 10,163 0.82 % 236.71 
96,597 0.76 % 177.99
nihče nihč e 212,510 0.77 % 187.28 0 0 % 0 49,043 0.71 % 154.25 17,772 1.18 % 447.48 618 0.53 % 156.22 34,654 0.67 % 184.90 5,487 0.44 % 127.80 104,936 0.83 % 193.35

1.2.115 List of initial character-level 5-grams from pronoun lower-case word forms in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-word_parts-pronouns-lowercase_forms-initial-5grams-taxonomy-entire.tsv

Lower-case word form | Initial part of the word | Rest of the word | Total absolute frequency of lower-case word form | Percentage of all found lower-case word forms | Total relative frequency (per million occurrences) | Absolute frequency [SSJ.T.K.N] | Percentage [SSJ.T.K.N] | Relative frequency [SSJ.T.K.N] | Absolute frequency [SSJ.I] | Percentage [SSJ.I] | Relative frequency [SSJ.I] | Absolute frequency [SSJ.T.K.L] | Percentage [SSJ.T.K.L] | Relative frequency [SSJ.T.K.L] | Absolute frequency [SSJ.T.D] | Percentage [SSJ.T.D] | Relative frequency [SSJ.T.D] | Absolute frequency [SSJ.T.P.R] | Percentage [SSJ.T.P.R] | Relative frequency [SSJ.T.P.R] | Absolute frequency [SSJ.T.K.S] | Percentage [SSJ.T.K.S] | Relative frequency [SSJ.T.K.S] | Absolute frequency [SSJ.T.P.C] | Percentage [SSJ.T.P.C] | Relative frequency [SSJ.T.P.C]
svoje svoje 896,025 4.72 % 789.66 2 5.88 % 206.04 228,388 4.72 % 718.34 37,102 3.87 % 934.19 2,958 4.01 % 747.75 167,773 4.79 % 895.18 42,226 4.88 % 983.49 417,576 4.77 % 769.41
svojo svojo 587,202 3.09 % 517.50 1 2.94 % 103.02 152,737 3.15 % 480.40 28,699 2.99 % 722.61 1,737 2.36 % 439.09 117,000 3.34 % 624.27 25,348 2.93 % 590.39 261,680 2.99 % 482.16
kateri kater i 443,426 2.33 % 390.79 3 8.82 % 309.06 138,022 2.85 % 434.12 12,103 1.26 % 304.74 2,070 2.81 % 523.27 70,318 2.01 % 375.19 19,163 2.22 % 446.33 201,747 2.31 % 371.73
katerem kater em 422,748 2.23 % 372.57 0 0 % 0 131,630 2.72 % 414.01 9,500 0.99 % 239.20 1,712 2.32 % 432.78 65,619 1.87 % 350.12 17,545 2.03 % 408.64 196,742 2.25 % 362.51
katerih kater ih 414,448 2.18 % 365.25 2 5.88 % 206.04 115,416 2.38 % 363.01 9,180 0.96 % 231.14 2,561 3.48 % 647.39 72,188 2.06 % 385.17 24,070 2.78 % 560.62 191,031 2.18 % 351.99 svojih svoji h 359,437 1.89 % 316.77 0 0 % 0 93,095 1.92 % 292.81 13,076 1.36 % 329.24 1,311 1.78 % 331.41 63,062 1.80 % 336.48 16,724 1.93 % 389.52 172,169 1.97 % 317.23 njimi njimi 323,106 1.70 % 284.75 2 5.88 % 206.04 87,905 1.82 % 276.48 10,619 1.11 % 267.37 743 1.01 % 187.82 54,499 1.56 % 290.79 13,886 1.60 % 323.42 155,452 1.78 % 286.43 tisti tisti 321,273 1.69 % 283.14 0 0 % 0 69,913 1.44 % 219.89 21,523 2.24 % 541.93 1,718 2.33 % 434.29 63,881 1.82 % 340.85 12,686 1.47 % 295.47 151,552 1.73 % 279.24 njegov njego v 312,067 1.64 % 275.02 0 0 % 0 91,225 1.88 % 286.93 20,759 2.16 % 522.69 788 1.07 % 199.20 48,529 1.39 % 258.93 11,386 1.32 % 265.19 139,380 1.59 % 256.82 svojega svoje ga 308,222 1.62 % 271.63 0 0 % 0 77,639 1.60 % 244.19 15,805 1.65 % 397.95 946 1.28 % 239.14 59,785 1.71 % 318.99 14,856 1.72 % 346.01 139,191 1.59 % 256.47 katerega kater ega 301,745 1.59 % 265.93 0 0 % 0 88,965 1.84 % 279.82 8,703 0.91 % 219.13 1,749 2.37 % 442.13 52,123 1.49 % 278.11 12,539 1.45 % 292.05 137,666 1.57 % 253.66 katero kater o 296,676 1.56 % 261.46 0 0 % 0 86,849 1.79 % 273.16 9,799 1.02 % 246.73 1,822 2.47 % 460.58 54,232 1.55 % 289.36 12,778 1.48 % 297.61 131,196 1.50 % 241.74 katere kater e 293,688 1.55 % 258.83 0 0 % 0 83,547 1.73 % 262.78 7,495 0.78 % 188.72 2,320 3.15 % 586.47 51,495 1.47 % 274.76 15,891 1.84 % 370.12 132,940 1.52 % 244.95 nekateri nekat eri 291,094 1.53 % 256.54 1 2.94 % 103.02 69,699 1.44 % 219.22 5,577 0.58 % 140.42 710 0.96 % 179.48 51,392 1.47 % 274.21 12,872 1.49 % 299.80 150,843 1.72 % 277.94 njegovo njego vo 282,399 1.49 % 248.88 0 0 % 0 75,507 1.56 % 237.49 18,982 1.98 % 477.95 1,040 1.41 % 262.90 48,068 1.37 % 256.48 15,381 1.78 % 358.24 123,421 1.41 % 227.41 njegova njego va 263,259 1.39 % 232.01 0 0 % 0 71,977 1.49 % 226.39 16,409 1.71 % 413.16 578 0.79 
% 146.11 48,864 1.40 % 260.72 11,792 1.36 % 274.65 113,639 1.30 % 209.39 svoji svoji 244,889 1.29 % 215.82 0 0 % 0 69,060 1.43 % 217.21 12,098 1.26 % 304.61 728 0.99 % 184.03 44,470 1.27 % 237.28 10,640 1.23 % 247.82 107,893 1.23 % 198.80 vsako vsako 242,448 1.28 % 213.67 0 0 % 0 56,658 1.17 % 178.20 7,927 0.83 % 199.59 1,017 1.38 % 257.09 48,054 1.37 % 256.40 10,671 1.23 % 248.54 118,121 1.35 % 217.65 nekaterih nekat erih 231,651 1.22 % 204.15 0 0 % 0 58,371 1.21 % 183.59 1,963 0.20 % 49.43 643 0.87 % 162.54 38,342 1.09 % 204.58 11,127 1.29 % 259.16 121,205 1.39 % 223.33 svojem svoje m 213,135 1.12 % 187.83 0 0 % 0 58,828 1.22 % 185.03 8,622 0.90 % 217.09 825 1.12 % 208.55 38,100 1.09 % 203.29 10,163 1.18 % 236.71 96,597 1.10 % 177.99 nihče nihče 212,510 1.12 % 187.28 0 0 % 0 49,043 1.01 % 154.25 17,772 1.85 % 447.48 618 0.84 % 156.22 34,654 0.99 % 184.90 5,487 0.63 % 127.80 104,936 1.20 % 193.35 svojim svoji m 209,107 1.10 % 184.28 0 0 % 0 52,278 1.08 % 164.43 9,134 0.95 % 229.98 537 0.73 % 135.75 42,914 1.23 % 228.98 8,840 1.02 % 205.89 95,404 1.09 % 175.79 nekaj nekaj 203,242 1.07 % 179.12 0 0 % 0 47,858 0.99 % 150.53 18,383 1.92 % 462.86 667 0.91 % 168.61 44,180 1.26 % 235.73 9,029 1.04 % 210.30 83,125 0.95 % 153.16 nekatere nekat ere 202,747 1.07 % 178.68 0 0 % 0 49,489 1.02 % 155.66 2,957 0.31 % 74.45 739 1.00 % 186.81 37,104 1.06 % 197.97 13,192 1.52 % 307.26 99,266 1.14 % 182.90 njegove njego ve 198,326 1.04 % 174.78 0 0 % 0 52,708 1.09 % 165.78 14,987 1.56 % 377.36 714 0.97 % 180.49 33,244 0.95 % 177.38 10,944 1.26 % 254.90 85,729 0.98 % 157.96 katerimi kater imi 197,235 1.04 % 173.82 0 0 % 0 58,790 1.21 % 184.91 3,547 0.37 % 89.31 875 1.19 % 221.19 34,351 0.98 % 183.29 8,749 1.01 % 203.77 90,923 1.04 % 167.53 tiste tiste 189,871 1.00 % 167.33 1 2.94 % 103.02 42,601 0.88 % 133.99 9,150 0.95 % 230.39 1,020 1.39 % 257.84 41,769 1.19 % 222.87 9,106 1.05 % 212.09 86,224 0.98 % 158.87 njegovi njego vi 188,214 0.99 % 165.87 0 0 % 0 52,128 1.08 % 163.96 11,023 
1.15 % 277.55 719 0.98 % 181.76 29,364 0.84 % 156.68 7,727 0.89 % 179.97 87,253 1.00 % 160.77
katerim kater im 179,565 0.95 % 158.25 1 2.94 % 103.02 57,323 1.18 % 180.30 4,658 0.48 % 117.28 866 1.18 % 218.92 32,159 0.92 % 171.59 6,650 0.77 % 154.89 77,908 0.89 % 143.55
njihovo njiho vo 176,435 0.93 % 155.49 0 0 % 0 45,803 0.95 % 144.06 3,451 0.36 % 86.89 955 1.30 % 241.41 29,870 0.85 % 159.38 10,893 1.26 % 253.71 85,463 0.98 % 157.47
njegovih njego vih 171,382 0.90 % 151.04 0 0 % 0 67,441 1.39 % 212.12 8,418 0.88 % 211.96 328 0.45 % 82.91 19,655 0.56 % 104.87 6,550 0.76 % 152.56 68,990 0.79 % 127.12
naših naših 170,366 0.90 % 150.14 0 0 % 0 36,247 0.75 % 114.01 2,106 0.22 % 53.03 358 0.49 % 90.50 27,572 0.79 % 147.12 4,715 0.55 % 109.82 99,368 1.14 % 183.09

1.2.116 List of final character-level 1-grams from pronoun lower-case word forms in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-word_parts-pronouns-lowercase_forms-final-1grams-taxonomy-entire.tsv

Lower-case word form | Rest of the word | Final part of the word | Total absolute frequency of lower-case word form | Percentage of all found lower-case word forms | Total relative frequency (per million occurrences) | Absolute frequency [SSJ.T.K.N] | Percentage [SSJ.T.K.N] | Relative frequency [SSJ.T.K.N] | Absolute frequency [SSJ.T.K.L] | Percentage [SSJ.T.K.L] | Relative frequency [SSJ.T.K.L] | Absolute frequency [SSJ.T.P.C] | Percentage [SSJ.T.P.C] | Relative frequency [SSJ.T.P.C] | Absolute frequency [SSJ.T.P.R] | Percentage [SSJ.T.P.R] | Relative frequency [SSJ.T.P.R] | Absolute frequency [SSJ.T.K.S] | Percentage [SSJ.T.K.S] | Relative frequency [SSJ.T.K.S] | Absolute frequency [SSJ.T.D] | Percentage [SSJ.T.D] | Relative frequency [SSJ.T.D] | Absolute frequency [SSJ.I] | Percentage [SSJ.I] | Relative frequency [SSJ.I]
se s e 16,040,357 20.86 % 14,136.29 61 16.90 % 6,284.12 1,002,397 20.74 % 25,239.27 7,237,206 20.79 % 13,335.03 2,888,135 19.94 %
15,410.14 645,048 19.71 % 15,023.93 67,082 21.25 % 16,957.61 4,200,428 21.90 % 13,211.44 to t o 4,103,550 5.34 % 3,616.44 8 2.22 % 824.15 218,365 4.52 % 5,498.19 1,845,594 5.30 % 3,400.63 773,014 5.34 % 4,124.55 165,865 5.07 % 3,863.19 20,900 6.62 % 5,283.29 1,079,804 5.63 % 3,396.26 ga g a 2,820,915 3.67 % 2,486.06 57 15.79 % 5,872.05 221,093 4.57 % 5,566.88 1,250,146 3.59 % 2,303.48 522,706 3.61 % 2,788.99 123,613 3.78 % 2,879.09 9,589 3.04 % 2,424 693,711 3.62 % 2,181.90 jih ji h 2,543,853 3.31 % 2,241.88 71 19.67 % 7,314.31 97,565 2.02 % 2,456.58 1,188,280 3.41 % 2,189.48 482,799 3.33 % 2,576.06 132,177 4.04 % 3,078.56 10,205 3.23 % 2,579.71 632,756 3.30 % 1,990.18 tem te m 2,422,935 3.15 % 2,135.32 5 1.39 % 515.09 65,918 1.36 % 1,659.74 1,128,405 3.24 % 2,079.16 389,063 2.69 % 2,075.91 100,541 3.07 % 2,341.72 13,150 4.17 % 3,324.18 725,853 3.78 % 2,283 si s i 2,116,658 2.75 % 1,865.40 0 0 % 0 177,478 3.67 % 4,468.70 897,813 2.58 % 1,654.28 442,516 3.06 % 2,361.12 79,910 2.44 % 1,861.20 4,363 1.38 % 1,102.92 514,578 2.68 % 1,618.48 jo j o 1,732,491 2.25 % 1,526.84 46 12.74 % 4,738.85 148,766 3.08 % 3,745.77 733,255 2.11 % 1,351.07 339,940 2.35 % 1,813.81 81,944 2.50 % 1,908.57 5,376 1.70 % 1,358.99 423,164 2.21 % 1,330.96 vse vs e 1,718,925 2.23 % 1,514.88 21 5.82 % 2,163.39 93,251 1.93 % 2,347.96 798,916 2.29 % 1,472.06 341,670 2.36 % 1,823.04 62,134 1.90 % 1,447.17 5,102 1.62 % 1,289.73 417,831 2.18 % 1,314.19 kar ka r 1,570,304 2.04 % 1,383.90 1 0.28 % 103.02 63,660 1.32 % 1,602.89 722,013 2.07 % 1,330.36 290,148 2.00 % 1,548.13 56,252 1.72 % 1,310.18 4,504 1.43 % 1,138.56 433,726 2.26 % 1,364.18 ta t a 1,399,658 1.82 % 1,233.51 2 0.55 % 206.04 51,880 1.07 % 1,306.28 644,603 1.85 % 1,187.72 237,867 1.64 % 1,269.18 70,067 2.14 % 1,631.94 9,758 3.09 % 2,466.72 385,481 2.01 % 1,212.44 tega teg a 1,301,008 1.69 % 1,146.57 1 0.28 % 103.02 57,915 1.20 % 1,458.24 605,459 1.74 % 1,115.60 228,327 1.58 % 1,218.28 52,168 1.59 % 1,215.05 17,867 5.66 % 4,516.58 339,271 
1.77 % 1,067.10 mu m u 1,099,357 1.43 % 968.86 3 0.83 % 309.06 141,366 2.92 % 3,559.44 464,070 1.33 % 855.08 190,707 1.32 % 1,017.55 39,582 1.21 % 921.91 3,183 1.01 % 804.63 260,446 1.36 % 819.17 kaj ka j 1,092,205 1.42 % 962.55 0 0 % 0 116,352 2.41 % 2,929.62 472,462 1.36 % 870.54 235,001 1.62 % 1,253.89 43,222 1.32 % 1,006.69 3,316 1.05 % 838.25 221,852 1.16 % 697.78 mi m i 928,446 1.21 % 818.23 0 0 % 0 141,012 2.92 % 3,550.53 371,902 1.07 % 685.25 191,610 1.32 % 1,022.37 28,100 0.86 % 654.48 3,855 1.22 % 974.50 191,967 1.00 % 603.79 svoje svoj e 896,025 1.17 % 789.66 2 0.55 % 206.04 37,102 0.77 % 934.19 417,576 1.20 % 769.41 167,773 1.16 % 895.18 42,226 1.29 % 983.49 2,958 0.94 % 747.75 228,388 1.19 % 718.34 te t e 778,569 1.01 % 686.15 1 0.28 % 103.02 50,755 1.05 % 1,277.96 349,187 1.00 % 643.40 144,978 1.00 % 773.55 42,256 1.29 % 984.19 5,227 1.66 % 1,321.33 186,165 0.97 % 585.54 nas na s 736,499 0.96 % 649.07 0 0 % 0 20,438 0.42 % 514.61 373,358 1.07 % 687.94 151,156 1.04 % 806.52 27,837 0.85 % 648.36 1,687 0.53 % 426.46 162,023 0.84 % 509.60 jim ji m 723,196 0.94 % 637.35 1 0.28 % 103.02 25,600 0.53 % 644.58 373,331 1.07 % 687.89 118,672 0.82 % 633.19 26,431 0.81 % 615.61 1,813 0.57 % 458.31 177,348 0.93 % 557.81 vseh vse h 667,739 0.87 % 588.47 1 0.28 % 103.02 14,187 0.29 % 357.21 330,940 0.95 % 609.78 111,575 0.77 % 595.33 25,604 0.78 % 596.35 2,601 0.82 % 657.50 182,831 0.95 % 575.05 nam na m 660,025 0.86 % 581.68 0 0 % 0 16,117 0.33 % 405.81 342,164 0.98 % 630.46 130,942 0.90 % 698.66 24,627 0.75 % 573.59 1,553 0.49 % 392.58 144,622 0.75 % 454.87 vsi vs i 590,283 0.77 % 520.21 1 0.28 % 103.02 30,468 0.63 % 767.15 294,483 0.85 % 542.60 107,195 0.74 % 571.96 18,366 0.56 % 427.77 1,801 0.57 % 455.27 137,969 0.72 % 433.95 svojo svoj o 587,202 0.76 % 517.50 1 0.28 % 103.02 28,699 0.59 % 722.61 261,680 0.75 % 482.16 117,000 0.81 % 624.27 25,348 0.78 % 590.39 1,737 0.55 % 439.09 152,737 0.80 % 480.40 ji j i 560,679 0.73 % 494.12 3 0.83 % 309.06 96,395 1.99 % 
2,427.12 201,119 0.58 % 370.58 120,392 0.83 % 642.37 17,867 0.55 % 416.14 1,411 0.45 % 356.69 123,492 0.64 % 388.41
me m e 551,083 0.72 % 485.67 0 0 % 0 107,770 2.23 % 2,713.53 208,238 0.60 % 383.69 117,280 0.81 % 625.77 16,084 0.49 % 374.62 1,191 0.38 % 301.07 100,520 0.52 % 316.16
teh te h 461,307 0.60 % 406.55 0 0 % 0 11,382 0.23 % 286.59 227,557 0.65 % 419.29 76,699 0.53 % 409.24 24,262 0.74 % 565.09 3,211 1.02 % 811.71 118,196 0.62 % 371.76
kateri kater i 443,426 0.58 % 390.79 3 0.83 % 309.06 12,103 0.25 % 304.74 201,747 0.58 % 371.73 70,318 0.48 % 375.19 19,163 0.59 % 446.33 2,070 0.66 % 523.27 138,022 0.72 % 434.12
ti t i 441,478 0.57 % 389.07 0 0 % 0 66,304 1.37 % 1,669.46 164,810 0.47 % 303.67 94,191 0.65 % 502.57 22,424 0.69 % 522.28 1,655 0.52 % 418.37 92,094 0.48 % 289.66
vsak vsa k 435,378 0.57 % 383.70 4 1.11 % 412.07 20,678 0.43 % 520.65 199,450 0.57 % 367.50 95,410 0.66 % 509.08 19,497 0.60 % 454.11 1,480 0.47 % 374.13 98,859 0.52 % 310.94
katerem katere m 422,748 0.55 % 372.57 0 0 % 0 9,500 0.20 % 239.20 196,742 0.56 % 362.51 65,619 0.45 % 350.12 17,545 0.54 % 408.64 1,712 0.54 % 432.78 131,630 0.69 % 414.01
katerih kateri h 414,448 0.54 % 365.25 2 0.55 % 206.04 9,180 0.19 % 231.14 191,031 0.55 % 351.99 72,188 0.50 % 385.17 24,070 0.73 % 560.62 2,561 0.81 % 647.39 115,416 0.60 % 363.01
temu tem u 395,296 0.51 % 348.37 0 0 % 0 12,140 0.25 % 305.67 193,076 0.56 % 355.76 73,770 0.51 % 393.61 14,842 0.45 % 345.69 1,550 0.49 % 391.82 99,918 0.52 % 314.27
svoj svo j 384,876 0.50 % 339.19 0 0 % 0 15,313 0.32 % 385.56 177,632 0.51 % 327.30 72,883 0.50 % 388.88 15,034 0.46 % 350.16 1,259 0.40 % 318.26 102,755 0.54 % 323.19

1.2.117 List of final character-level 2-grams from pronoun lower-case word forms in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-word_parts-pronouns-lowercase_forms-final-2grams-taxonomy-entire.tsv

Lower-case word form | Rest of the word | Final part of the word | Total absolute frequency of lower-case word form | Percentage of all found lower-case word forms | Total relative frequency (per million occurrences) | Absolute frequency [SSJ.T.K.N] | Percentage [SSJ.T.K.N] | Relative frequency [SSJ.T.K.N] | Absolute frequency [SSJ.T.K.L] | Percentage [SSJ.T.K.L] | Relative frequency [SSJ.T.K.L] | Absolute frequency [SSJ.T.P.C] | Percentage [SSJ.T.P.C] | Relative frequency [SSJ.T.P.C] | Absolute frequency [SSJ.T.P.R] | Percentage [SSJ.T.P.R] | Relative frequency [SSJ.T.P.R] | Absolute frequency [SSJ.T.K.S] | Percentage [SSJ.T.K.S] | Relative frequency [SSJ.T.K.S] | Absolute frequency [SSJ.T.D] | Percentage [SSJ.T.D] | Relative frequency [SSJ.T.D] | Absolute frequency [SSJ.I] | Percentage [SSJ.I] | Relative frequency [SSJ.I]
se se 16,040,357 20.86 % 14,136.29 61 16.90 % 6,284.12 1,002,397 20.74 % 25,239.27 7,237,206 20.79 % 13,335.03 2,888,135 19.94 % 15,410.14 645,048 19.71 % 15,023.93 67,082 21.25 % 16,957.61 4,200,428 21.90 % 13,211.44
to to 4,103,550 5.34 % 3,616.44 8 2.22 % 824.15 218,365 4.52 % 5,498.19 1,845,594 5.30 % 3,400.63 773,014 5.34 % 4,124.55 165,865 5.07 % 3,863.19 20,900 6.62 % 5,283.29 1,079,804 5.63 % 3,396.26
ga ga 2,820,915 3.67 % 2,486.06 57 15.79 % 5,872.05 221,093 4.57 % 5,566.88 1,250,146 3.59 % 2,303.48 522,706 3.61 % 2,788.99 123,613 3.78 % 2,879.09 9,589 3.04 % 2,424 693,711 3.62 % 2,181.90
jih j ih 2,543,853 3.31 % 2,241.88 71 19.67 % 7,314.31 97,565 2.02 % 2,456.58 1,188,280 3.41 % 2,189.48 482,799 3.33 % 2,576.06 132,177 4.04 % 3,078.56 10,205 3.23 % 2,579.71 632,756 3.30 % 1,990.18
tem t em 2,422,935 3.15 % 2,135.32 5 1.39 % 515.09 65,918 1.36 % 1,659.74 1,128,405 3.24 % 2,079.16 389,063 2.69 % 2,075.91 100,541 3.07 % 2,341.72 13,150 4.17 % 3,324.18 725,853 3.78 % 2,283
si si 2,116,658 2.75 % 1,865.40 0 0 % 0 177,478 3.67 % 4,468.70 897,813 2.58 % 1,654.28 442,516 3.06 % 2,361.12 79,910 2.44 % 1,861.20 4,363 1.38 % 1,102.92 514,578 2.68 % 1,618.48
jo jo 1,732,491 2.25 % 1,526.84 46 12.74 % 4,738.85 148,766 3.08 %
3,745.77 733,255 2.11 % 1,351.07 339,940 2.35 % 1,813.81 81,944 2.50 % 1,908.57 5,376 1.70 % 1,358.99 423,164 2.21 % 1,330.96 vse v se 1,718,925 2.23 % 1,514.88 21 5.82 % 2,163.39 93,251 1.93 % 2,347.96 798,916 2.29 % 1,472.06 341,670 2.36 % 1,823.04 62,134 1.90 % 1,447.17 5,102 1.62 % 1,289.73 417,831 2.18 % 1,314.19 kar k ar 1,570,304 2.04 % 1,383.90 1 0.28 % 103.02 63,660 1.32 % 1,602.89 722,013 2.07 % 1,330.36 290,148 2.00 % 1,548.13 56,252 1.72 % 1,310.18 4,504 1.43 % 1,138.56 433,726 2.26 % 1,364.18 ta ta 1,399,658 1.82 % 1,233.51 2 0.55 % 206.04 51,880 1.07 % 1,306.28 644,603 1.85 % 1,187.72 237,867 1.64 % 1,269.18 70,067 2.14 % 1,631.94 9,758 3.09 % 2,466.72 385,481 2.01 % 1,212.44 tega te ga 1,301,008 1.69 % 1,146.57 1 0.28 % 103.02 57,915 1.20 % 1,458.24 605,459 1.74 % 1,115.60 228,327 1.58 % 1,218.28 52,168 1.59 % 1,215.05 17,867 5.66 % 4,516.58 339,271 1.77 % 1,067.10 mu mu 1,099,357 1.43 % 968.86 3 0.83 % 309.06 141,366 2.92 % 3,559.44 464,070 1.33 % 855.08 190,707 1.32 % 1,017.55 39,582 1.21 % 921.91 3,183 1.01 % 804.63 260,446 1.36 % 819.17 kaj k aj 1,092,205 1.42 % 962.55 0 0 % 0 116,352 2.41 % 2,929.62 472,462 1.36 % 870.54 235,001 1.62 % 1,253.89 43,222 1.32 % 1,006.69 3,316 1.05 % 838.25 221,852 1.16 % 697.78 mi mi 928,446 1.21 % 818.23 0 0 % 0 141,012 2.92 % 3,550.53 371,902 1.07 % 685.25 191,610 1.32 % 1,022.37 28,100 0.86 % 654.48 3,855 1.22 % 974.50 191,967 1.00 % 603.79 svoje svo je 896,025 1.17 % 789.66 2 0.55 % 206.04 37,102 0.77 % 934.19 417,576 1.20 % 769.41 167,773 1.16 % 895.18 42,226 1.29 % 983.49 2,958 0.94 % 747.75 228,388 1.19 % 718.34 te te 778,569 1.01 % 686.15 1 0.28 % 103.02 50,755 1.05 % 1,277.96 349,187 1.00 % 643.40 144,978 1.00 % 773.55 42,256 1.29 % 984.19 5,227 1.66 % 1,321.33 186,165 0.97 % 585.54 nas n as 736,499 0.96 % 649.07 0 0 % 0 20,438 0.42 % 514.61 373,358 1.07 % 687.94 151,156 1.04 % 806.52 27,837 0.85 % 648.36 1,687 0.53 % 426.46 162,023 0.84 % 509.60 jim j im 723,196 0.94 % 637.35 1 0.28 % 103.02 25,600 0.53 % 
644.58 373,331 1.07 % 687.89 118,672 0.82 % 633.19 26,431 0.81 % 615.61 1,813 0.57 % 458.31 177,348 0.93 % 557.81 vseh vs eh 667,739 0.87 % 588.47 1 0.28 % 103.02 14,187 0.29 % 357.21 330,940 0.95 % 609.78 111,575 0.77 % 595.33 25,604 0.78 % 596.35 2,601 0.82 % 657.50 182,831 0.95 % 575.05 nam n am 660,025 0.86 % 581.68 0 0 % 0 16,117 0.33 % 405.81 342,164 0.98 % 630.46 130,942 0.90 % 698.66 24,627 0.75 % 573.59 1,553 0.49 % 392.58 144,622 0.75 % 454.87 vsi v si 590,283 0.77 % 520.21 1 0.28 % 103.02 30,468 0.63 % 767.15 294,483 0.85 % 542.60 107,195 0.74 % 571.96 18,366 0.56 % 427.77 1,801 0.57 % 455.27 137,969 0.72 % 433.95 svojo svo jo 587,202 0.76 % 517.50 1 0.28 % 103.02 28,699 0.59 % 722.61 261,680 0.75 % 482.16 117,000 0.81 % 624.27 25,348 0.78 % 590.39 1,737 0.55 % 439.09 152,737 0.80 % 480.40 ji ji 560,679 0.73 % 494.12 3 0.83 % 309.06 96,395 1.99 % 2,427.12 201,119 0.58 % 370.58 120,392 0.83 % 642.37 17,867 0.55 % 416.14 1,411 0.45 % 356.69 123,492 0.64 % 388.41 me me 551,083 0.72 % 485.67 0 0 % 0 107,770 2.23 % 2,713.53 208,238 0.60 % 383.69 117,280 0.81 % 625.77 16,084 0.49 % 374.62 1,191 0.38 % 301.07 100,520 0.52 % 316.16 teh t eh 461,307 0.60 % 406.55 0 0 % 0 11,382 0.23 % 286.59 227,557 0.65 % 419.29 76,699 0.53 % 409.24 24,262 0.74 % 565.09 3,211 1.02 % 811.71 118,196 0.62 % 371.76 kateri kate ri 443,426 0.58 % 390.79 3 0.83 % 309.06 12,103 0.25 % 304.74 201,747 0.58 % 371.73 70,318 0.48 % 375.19 19,163 0.59 % 446.33 2,070 0.66 % 523.27 138,022 0.72 % 434.12 ti ti 441,478 0.57 % 389.07 0 0 % 0 66,304 1.37 % 1,669.46 164,810 0.47 % 303.67 94,191 0.65 % 502.57 22,424 0.69 % 522.28 1,655 0.52 % 418.37 92,094 0.48 % 289.66 vsak vs ak 435,378 0.57 % 383.70 4 1.11 % 412.07 20,678 0.43 % 520.65 199,450 0.57 % 367.50 95,410 0.66 % 509.08 19,497 0.60 % 454.11 1,480 0.47 % 374.13 98,859 0.52 % 310.94 katerem kater em 422,748 0.55 % 372.57 0 0 % 0 9,500 0.20 % 239.20 196,742 0.56 % 362.51 65,619 0.45 % 350.12 17,545 0.54 % 408.64 1,712 0.54 % 432.78 131,630 
0.69 % 414.01
katerih kater ih 414,448 0.54 % 365.25 2 0.55 % 206.04 9,180 0.19 % 231.14 191,031 0.55 % 351.99 72,188 0.50 % 385.17 24,070 0.73 % 560.62 2,561 0.81 % 647.39 115,416 0.60 % 363.01
temu te mu 395,296 0.51 % 348.37 0 0 % 0 12,140 0.25 % 305.67 193,076 0.56 % 355.76 73,770 0.51 % 393.61 14,842 0.45 % 345.69 1,550 0.49 % 391.82 99,918 0.52 % 314.27
svoj sv oj 384,876 0.50 % 339.19 0 0 % 0 15,313 0.32 % 385.56 177,632 0.51 % 327.30 72,883 0.50 % 388.88 15,034 0.46 % 350.16 1,259 0.40 % 318.26 102,755 0.54 % 323.19

1.2.118 List of final character-level 3-grams from pronoun lower-case word forms in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-word_parts-pronouns-lowercase_forms-final-3grams-taxonomy-entire.tsv

Lower-case word form | Rest of the word | Final part of the word | Total absolute frequency of lower-case word form | Percentage of all found lower-case word forms | Total relative frequency (per million occurrences) | Absolute frequency [SSJ.T.K.N] | Percentage [SSJ.T.K.N] | Relative frequency [SSJ.T.K.N] | Absolute frequency [SSJ.T.K.L] | Percentage [SSJ.T.K.L] | Relative frequency [SSJ.T.K.L] | Absolute frequency [SSJ.T.P.C] | Percentage [SSJ.T.P.C] | Relative frequency [SSJ.T.P.C] | Absolute frequency [SSJ.T.P.R] | Percentage [SSJ.T.P.R] | Relative frequency [SSJ.T.P.R] | Absolute frequency [SSJ.T.K.S] | Percentage [SSJ.T.K.S] | Relative frequency [SSJ.T.K.S] | Absolute frequency [SSJ.T.D] | Percentage [SSJ.T.D] | Relative frequency [SSJ.T.D] | Absolute frequency [SSJ.I] | Percentage [SSJ.I] | Relative frequency [SSJ.I]
jih jih 2,543,853 5.79 % 2,241.88 71 39.66 % 7,314.31 97,565 4.13 % 2,456.58 1,188,280 5.86 % 2,189.48 482,799 5.79 % 2,576.06 132,177 6.88 % 3,078.56 10,205 5.66 % 2,579.71 632,756 5.84 % 1,990.18
tem tem 2,422,935 5.52 % 2,135.32 5 2.79 % 515.09 65,918 2.79 % 1,659.74 1,128,405 5.56 % 2,079.16 389,063 4.67 % 2,075.91 100,541 5.23 % 2,341.72 13,150 7.29 % 3,324.18
725,853 6.70 % 2,283 vse vse 1,718,925 3.91 % 1,514.88 21 11.73 % 2,163.39 93,251 3.95 % 2,347.96 798,916 3.94 % 1,472.06 341,670 4.10 % 1,823.04 62,134 3.23 % 1,447.17 5,102 2.83 % 1,289.73 417,831 3.85 % 1,314.19 kar kar 1,570,304 3.58 % 1,383.90 1 0.56 % 103.02 63,660 2.70 % 1,602.89 722,013 3.56 % 1,330.36 290,148 3.48 % 1,548.13 56,252 2.93 % 1,310.18 4,504 2.50 % 1,138.56 433,726 4.00 % 1,364.18 tega t ega 1,301,008 2.96 % 1,146.57 1 0.56 % 103.02 57,915 2.45 % 1,458.24 605,459 2.98 % 1,115.60 228,327 2.74 % 1,218.28 52,168 2.71 % 1,215.05 17,867 9.90 % 4,516.58 339,271 3.13 % 1,067.10 kaj kaj 1,092,205 2.49 % 962.55 0 0 % 0 116,352 4.93 % 2,929.62 472,462 2.33 % 870.54 235,001 2.82 % 1,253.89 43,222 2.25 % 1,006.69 3,316 1.84 % 838.25 221,852 2.05 % 697.78 svoje sv oje 896,025 2.04 % 789.66 2 1.12 % 206.04 37,102 1.57 % 934.19 417,576 2.06 % 769.41 167,773 2.01 % 895.18 42,226 2.20 % 983.49 2,958 1.64 % 747.75 228,388 2.11 % 718.34 nas nas 736,499 1.68 % 649.07 0 0 % 0 20,438 0.87 % 514.61 373,358 1.84 % 687.94 151,156 1.81 % 806.52 27,837 1.45 % 648.36 1,687 0.94 % 426.46 162,023 1.50 % 509.60 jim jim 723,196 1.65 % 637.35 1 0.56 % 103.02 25,600 1.08 % 644.58 373,331 1.84 % 687.89 118,672 1.42 % 633.19 26,431 1.38 % 615.61 1,813 1.00 % 458.31 177,348 1.64 % 557.81 vseh v seh 667,739 1.52 % 588.47 1 0.56 % 103.02 14,187 0.60 % 357.21 330,940 1.63 % 609.78 111,575 1.34 % 595.33 25,604 1.33 % 596.35 2,601 1.44 % 657.50 182,831 1.69 % 575.05 nam nam 660,025 1.50 % 581.68 0 0 % 0 16,117 0.68 % 405.81 342,164 1.69 % 630.46 130,942 1.57 % 698.66 24,627 1.28 % 573.59 1,553 0.86 % 392.58 144,622 1.33 % 454.87 vsi vsi 590,283 1.34 % 520.21 1 0.56 % 103.02 30,468 1.29 % 767.15 294,483 1.45 % 542.60 107,195 1.29 % 571.96 18,366 0.96 % 427.77 1,801 1.00 % 455.27 137,969 1.27 % 433.95 svojo sv ojo 587,202 1.34 % 517.50 1 0.56 % 103.02 28,699 1.22 % 722.61 261,680 1.29 % 482.16 117,000 1.40 % 624.27 25,348 1.32 % 590.39 1,737 0.96 % 439.09 152,737 1.41 % 480.40 teh teh 
461,307 1.05 % 406.55 0 0 % 0 11,382 0.48 % 286.59 227,557 1.12 % 419.29 76,699 0.92 % 409.24 24,262 1.26 % 565.09 3,211 1.78 % 811.71 118,196 1.09 % 371.76 kateri kat eri 443,426 1.01 % 390.79 3 1.68 % 309.06 12,103 0.51 % 304.74 201,747 0.99 % 371.73 70,318 0.84 % 375.19 19,163 1.00 % 446.33 2,070 1.15 % 523.27 138,022 1.27 % 434.12 vsak v sak 435,378 0.99 % 383.70 4 2.23 % 412.07 20,678 0.88 % 520.65 199,450 0.98 % 367.50 95,410 1.15 % 509.08 19,497 1.01 % 454.11 1,480 0.82 % 374.13 98,859 0.91 % 310.94 katerem kate rem 422,748 0.96 % 372.57 0 0 % 0 9,500 0.40 % 239.20 196,742 0.97 % 362.51 65,619 0.79 % 350.12 17,545 0.91 % 408.64 1,712 0.95 % 432.78 131,630 1.21 % 414.01 katerih kate rih 414,448 0.94 % 365.25 2 1.12 % 206.04 9,180 0.39 % 231.14 191,031 0.94 % 351.99 72,188 0.87 % 385.17 24,070 1.25 % 560.62 2,561 1.42 % 647.39 115,416 1.06 % 363.01 temu t emu 395,296 0.90 % 348.37 0 0 % 0 12,140 0.51 % 305.67 193,076 0.95 % 355.76 73,770 0.89 % 393.61 14,842 0.77 % 345.69 1,550 0.86 % 391.82 99,918 0.92 % 314.27 svoj s voj 384,876 0.88 % 339.19 0 0 % 0 15,313 0.65 % 385.56 177,632 0.88 % 327.30 72,883 0.88 % 388.88 15,034 0.78 % 350.16 1,259 0.70 % 318.26 102,755 0.95 % 323.19 nič nič 368,884 0.84 % 325.10 0 0 % 0 37,466 1.59 % 943.35 170,000 0.84 % 313.24 76,998 0.92 % 410.84 11,727 0.61 % 273.14 1,096 0.61 % 277.06 71,597 0.66 % 225.19 svojih svo jih 359,437 0.82 % 316.77 0 0 % 0 13,076 0.55 % 329.24 172,169 0.85 % 317.23 63,062 0.76 % 336.48 16,724 0.87 % 389.52 1,311 0.73 % 331.41 93,095 0.86 % 292.81 vam vam 353,752 0.81 % 311.76 4 2.23 % 412.07 19,362 0.82 % 487.51 126,066 0.62 % 232.28 131,736 1.58 % 702.90 21,242 1.11 % 494.75 1,295 0.72 % 327.36 54,047 0.50 % 169.99 tej tej 348,441 0.79 % 307.08 0 0 % 0 10,799 0.46 % 271.91 170,706 0.84 % 314.54 48,392 0.58 % 258.20 15,293 0.80 % 356.19 1,893 1.05 % 478.53 101,358 0.94 % 318.80 kdo kdo 347,009 0.79 % 305.82 0 0 % 0 26,511 1.12 % 667.52 171,038 0.84 % 315.15 64,480 0.77 % 344.04 9,866 0.51 % 229.79 
1,709 0.95 % 432.02 73,405 0.68 % 230.88
njimi nj imi 323,106 0.74 % 284.75 2 1.12 % 206.04 10,619 0.45 % 267.37 155,452 0.77 % 286.43 54,499 0.65 % 290.79 13,886 0.72 % 323.42 743 0.41 % 187.82 87,905 0.81 % 276.48
tisti ti sti 321,273 0.73 % 283.14 0 0 % 0 21,523 0.91 % 541.93 151,552 0.75 % 279.24 63,881 0.77 % 340.85 12,686 0.66 % 295.47 1,718 0.95 % 434.29 69,913 0.65 % 219.89
vsem v sem 314,109 0.71 % 276.82 0 0 % 0 13,013 0.55 % 327.65 157,517 0.78 % 290.24 58,875 0.71 % 314.14 10,914 0.57 % 254.20 950 0.53 % 240.15 72,840 0.67 % 229.10
njegov nje gov 312,067 0.71 % 275.02 0 0 % 0 20,759 0.88 % 522.69 139,380 0.69 % 256.82 48,529 0.58 % 258.93 11,386 0.59 % 265.19 788 0.44 % 199.20 91,225 0.84 % 286.93
svojega svoj ega 308,222 0.70 % 271.63 0 0 % 0 15,805 0.67 % 397.95 139,191 0.69 % 256.47 59,785 0.72 % 318.99 14,856 0.77 % 346.01 946 0.52 % 239.14 77,639 0.72 % 244.19
katerega kater ega 301,745 0.69 % 265.93 0 0 % 0 8,703 0.37 % 219.13 137,666 0.68 % 253.66 52,123 0.63 % 278.11 12,539 0.65 % 292.05 1,749 0.97 % 442.13 88,965 0.82 % 279.82
katero kat ero 296,676 0.68 % 261.46 0 0 % 0 9,799 0.41 % 246.73 131,196 0.65 % 241.74 54,232 0.65 % 289.36 12,778 0.67 % 297.61 1,822 1.01 % 460.58 86,849 0.80 % 273.16

1.2.119 List of final character-level 4-grams from pronoun lower-case word forms in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-word_parts-pronouns-lowercase_forms-final-4grams-taxonomy-entire.tsv

Lower-case word form | Rest of the word | Final part of the word | Total absolute frequency of lower-case word form | Percentage of all found lower-case word forms | Total relative frequency (per million occurrences) | Absolute frequency [SSJ.T.K.N] | Percentage [SSJ.T.K.N] | Relative frequency [SSJ.T.K.N] | Absolute frequency [SSJ.T.K.L] | Percentage [SSJ.T.K.L] | Relative frequency [SSJ.T.K.L] | Absolute frequency [SSJ.T.P.C] | Percentage [SSJ.T.P.C] | Relative frequency [SSJ.T.P.C] | Absolute frequency [SSJ.T.P.R] | Percentage [SSJ.T.P.R] | Relative frequency [SSJ.T.P.R] | Absolute frequency [SSJ.T.K.S] | Percentage [SSJ.T.K.S] | Relative frequency [SSJ.T.K.S] | Absolute frequency [SSJ.T.D] | Percentage [SSJ.T.D] | Relative frequency [SSJ.T.D] | Absolute frequency [SSJ.I] | Percentage [SSJ.I] | Relative frequency [SSJ.I]
tega tega 1,301,008 4.72 % 1,146.57 1 1.64 % 103.02 57,915 3.83 % 1,458.24 605,459 4.78 % 1,115.60 228,327 4.42 % 1,218.28 52,168 4.19 % 1,215.05 17,867 15.21 % 4,516.58 339,271 4.95 % 1,067.10
svoje s voje 896,025 3.25 % 789.66 2 3.28 % 206.04 37,102 2.46 % 934.19 417,576 3.29 % 769.41 167,773 3.25 % 895.18 42,226 3.39 % 983.49 2,958 2.52 % 747.75 228,388 3.33 % 718.34
vseh vseh 667,739 2.42 % 588.47 1 1.64 % 103.02 14,187 0.94 % 357.21 330,940 2.61 % 609.78 111,575 2.16 % 595.33 25,604 2.06 % 596.35 2,601 2.21 % 657.50 182,831 2.67 % 575.05
svojo s vojo 587,202 2.13 % 517.50 1 1.64 % 103.02 28,699 1.90 % 722.61 261,680 2.06 % 482.16 117,000 2.26 % 624.27 25,348 2.04 % 590.39 1,737 1.48 % 439.09 152,737 2.23 % 480.40
kateri ka teri 443,426 1.61 % 390.79 3 4.92 % 309.06 12,103 0.80 % 304.74 201,747 1.59 % 371.73 70,318 1.36 % 375.19 19,163 1.54 % 446.33 2,070 1.76 % 523.27 138,022 2.01 % 434.12
vsak vsak 435,378 1.58 % 383.70 4 6.56 % 412.07 20,678 1.37 % 520.65 199,450 1.57 % 367.50 95,410 1.85 % 509.08 19,497 1.57 % 454.11 1,480 1.26 % 374.13 98,859 1.44 % 310.94
katerem kat erem 422,748 1.53 % 372.57 0 0 % 0 9,500 0.63 % 239.20 196,742 1.55 % 362.51 65,619 1.27 % 350.12 17,545 1.41 % 408.64 1,712 1.46 % 432.78 131,630 1.92 % 414.01
katerih kat erih 414,448 1.50 % 365.25 2 3.28 % 206.04 9,180 0.61 % 231.14 191,031 1.51 % 351.99 72,188 1.40 % 385.17 24,070 1.93 % 560.62 2,561 2.18 % 647.39 115,416 1.68 % 363.01
temu temu 395,296 1.43 % 348.37 0 0 % 0 12,140 0.80 % 305.67 193,076 1.52 % 355.76 73,770 1.43 % 393.61 14,842 1.19 % 345.69 1,550 1.32 % 391.82 99,918 1.46 % 314.27
svoj svoj 384,876 1.40 % 339.19 0 0 % 0 15,313 1.01 % 385.56
177,632 1.40 % 327.30 72,883 1.41 % 388.88 15,034 1.21 % 350.16 1,259 1.07 % 318.26 102,755 1.50 % 323.19 svojih sv ojih 359,437 1.30 % 316.77 0 0 % 0 13,076 0.86 % 329.24 172,169 1.36 % 317.23 63,062 1.22 % 336.48 16,724 1.34 % 389.52 1,311 1.12 % 331.41 93,095 1.36 % 292.81 njimi n jimi 323,106 1.17 % 284.75 2 3.28 % 206.04 10,619 0.70 % 267.37 155,452 1.23 % 286.43 54,499 1.05 % 290.79 13,886 1.11 % 323.42 743 0.63 % 187.82 87,905 1.28 % 276.48 tisti t isti 321,273 1.17 % 283.14 0 0 % 0 21,523 1.42 % 541.93 151,552 1.20 % 279.24 63,881 1.24 % 340.85 12,686 1.02 % 295.47 1,718 1.46 % 434.29 69,913 1.02 % 219.89 vsem vsem 314,109 1.14 % 276.82 0 0 % 0 13,013 0.86 % 327.65 157,517 1.24 % 290.24 58,875 1.14 % 314.14 10,914 0.88 % 254.20 950 0.81 % 240.15 72,840 1.06 % 229.10 njegov nj egov 312,067 1.13 % 275.02 0 0 % 0 20,759 1.37 % 522.69 139,380 1.10 % 256.82 48,529 0.94 % 258.93 11,386 0.91 % 265.19 788 0.67 % 199.20 91,225 1.33 % 286.93 svojega svo jega 308,222 1.12 % 271.63 0 0 % 0 15,805 1.05 % 397.95 139,191 1.10 % 256.47 59,785 1.16 % 318.99 14,856 1.19 % 346.01 946 0.81 % 239.14 77,639 1.13 % 244.19 katerega kate rega 301,745 1.09 % 265.93 0 0 % 0 8,703 0.58 % 219.13 137,666 1.09 % 253.66 52,123 1.01 % 278.11 12,539 1.01 % 292.05 1,749 1.49 % 442.13 88,965 1.30 % 279.82 katero ka tero 296,676 1.08 % 261.46 0 0 % 0 9,799 0.65 % 246.73 131,196 1.03 % 241.74 54,232 1.05 % 289.36 12,778 1.03 % 297.61 1,822 1.55 % 460.58 86,849 1.27 % 273.16 katere ka tere 293,688 1.06 % 258.83 0 0 % 0 7,495 0.50 % 188.72 132,940 1.05 % 244.95 51,495 1.00 % 274.76 15,891 1.28 % 370.12 2,320 1.98 % 586.47 83,547 1.22 % 262.78 njih njih 292,042 1.06 % 257.38 1 1.64 % 103.02 13,870 0.92 % 349.23 136,597 1.08 % 251.69 56,026 1.08 % 298.94 15,855 1.27 % 369.28 966 0.82 % 244.19 68,727 1.00 % 216.16 nekateri neka teri 291,094 1.06 % 256.54 1 1.64 % 103.02 5,577 0.37 % 140.42 150,843 1.19 % 277.94 51,392 0.99 % 274.21 12,872 1.03 % 299.80 710 0.60 % 179.48 69,699 1.02 % 219.22 njegovo 
nje govo 282,399 1.02 % 248.88 0 0 % 0 18,982 1.26 % 477.95 123,421 0.97 % 227.41 48,068 0.93 % 256.48 15,381 1.24 % 358.24 1,040 0.89 % 262.90 75,507 1.10 % 237.49
naše naše 270,649 0.98 % 238.52 0 0 % 0 5,104 0.34 % 128.51 149,326 1.18 % 275.14 47,448 0.92 % 253.17 11,218 0.90 % 261.28 901 0.77 % 227.76 56,652 0.83 % 178.19
njim njim 269,365 0.98 % 237.39 2 3.28 % 206.04 23,698 1.57 % 596.69 116,832 0.92 % 215.27 51,706 1.00 % 275.89 11,409 0.92 % 265.73 704 0.60 % 177.96 65,014 0.95 % 204.49
njegova nje gova 263,259 0.95 % 232.01 0 0 % 0 16,409 1.09 % 413.16 113,639 0.90 % 209.39 48,864 0.95 % 260.72 11,792 0.95 % 274.65 578 0.49 % 146.11 71,977 1.05 % 226.39
svoji s voji 244,889 0.89 % 215.82 0 0 % 0 12,098 0.80 % 304.61 107,893 0.85 % 198.80 44,470 0.86 % 237.28 10,640 0.85 % 247.82 728 0.62 % 184.03 69,060 1.01 % 217.21
vsako v sako 242,448 0.88 % 213.67 0 0 % 0 7,927 0.52 % 199.59 118,121 0.93 % 217.65 48,054 0.93 % 256.40 10,671 0.86 % 248.54 1,017 0.87 % 257.09 56,658 0.83 % 178.20
nekaterih nekat erih 231,651 0.84 % 204.15 0 0 % 0 1,963 0.13 % 49.43 121,205 0.96 % 223.33 38,342 0.74 % 204.58 11,127 0.89 % 259.16 643 0.55 % 162.54 58,371 0.85 % 183.59
njem njem 229,536 0.83 % 202.29 1 1.64 % 103.02 14,739 0.97 % 371.11 108,008 0.85 % 199.01 42,205 0.82 % 225.19 12,298 0.99 % 286.44 557 0.47 % 140.80 51,728 0.76 % 162.70
njej njej 222,731 0.81 % 196.29 0 0 % 0 19,397 1.28 % 488.40 101,577 0.80 % 187.16 41,419 0.80 % 221 11,185 0.90 % 260.51 562 0.48 % 142.07 48,591 0.71 % 152.83
svojem sv ojem 213,135 0.77 % 187.83 0 0 % 0 8,622 0.57 % 217.09 96,597 0.76 % 177.99 38,100 0.74 % 203.29 10,163 0.82 % 236.71 825 0.70 % 208.55 58,828 0.86 % 185.03
nihče n ihče 212,510 0.77 % 187.28 0 0 % 0 17,772 1.18 % 447.48 104,936 0.83 % 193.35 34,654 0.67 % 184.90 5,487 0.44 % 127.80 618 0.53 % 156.22 49,043 0.71 % 154.25

1.2.120 List of final character-level 5-grams from pronoun lower-case word forms in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-word_parts-pronouns-lowercase_forms-final-5grams-taxonomy-entire.tsv

Lower-case word form | Rest of the word | Final part of the word | Total absolute frequency of lower-case word form | Percentage of all found lower-case word forms | Total relative frequency (per million occurrences) | Absolute frequency [SSJ.T.K.N] | Percentage [SSJ.T.K.N] | Relative frequency [SSJ.T.K.N] | Absolute frequency [SSJ.T.K.L] | Percentage [SSJ.T.K.L] | Relative frequency [SSJ.T.K.L] | Absolute frequency [SSJ.T.P.C] | Percentage [SSJ.T.P.C] | Relative frequency [SSJ.T.P.C] | Absolute frequency [SSJ.T.P.R] | Percentage [SSJ.T.P.R] | Relative frequency [SSJ.T.P.R] | Absolute frequency [SSJ.T.K.S] | Percentage [SSJ.T.K.S] | Relative frequency [SSJ.T.K.S] | Absolute frequency [SSJ.T.D] | Percentage [SSJ.T.D] | Relative frequency [SSJ.T.D] | Absolute frequency [SSJ.I] | Percentage [SSJ.I] | Relative frequency [SSJ.I]
svoje svoje 896,025 4.72 % 789.66 2 5.88 % 206.04 37,102 3.87 % 934.19 417,576 4.77 % 769.41 167,773 4.79 % 895.18 42,226 4.88 % 983.49 2,958 4.01 % 747.75 228,388 4.72 % 718.34
svojo svojo 587,202 3.09 % 517.50 1 2.94 % 103.02 28,699 2.99 % 722.61 261,680 2.99 % 482.16 117,000 3.34 % 624.27 25,348 2.93 % 590.39 1,737 2.36 % 439.09 152,737 3.15 % 480.40
kateri k ateri 443,426 2.33 % 390.79 3 8.82 % 309.06 12,103 1.26 % 304.74 201,747 2.31 % 371.73 70,318 2.01 % 375.19 19,163 2.22 % 446.33 2,070 2.81 % 523.27 138,022 2.85 % 434.12
katerem ka terem 422,748 2.23 % 372.57 0 0 % 0 9,500 0.99 % 239.20 196,742 2.25 % 362.51 65,619 1.87 % 350.12 17,545 2.03 % 408.64 1,712 2.32 % 432.78 131,630 2.72 % 414.01
katerih ka terih 414,448 2.18 % 365.25 2 5.88 % 206.04 9,180 0.96 % 231.14 191,031 2.18 % 351.99 72,188 2.06 % 385.17 24,070 2.78 % 560.62 2,561 3.48 % 647.39 115,416 2.38 % 363.01
svojih s vojih 359,437 1.89 % 316.77 0 0 % 0 13,076 1.36 % 329.24 172,169 1.97 % 317.23 63,062 1.80 % 336.48 16,724 1.93 % 389.52 1,311 1.78 % 331.41 93,095 1.92 %
292.81 njimi njimi 323,106 1.70 % 284.75 2 5.88 % 206.04 10,619 1.11 % 267.37 155,452 1.78 % 286.43 54,499 1.56 % 290.79 13,886 1.60 % 323.42 743 1.01 % 187.82 87,905 1.82 % 276.48 tisti tisti 321,273 1.69 % 283.14 0 0 % 0 21,523 2.24 % 541.93 151,552 1.73 % 279.24 63,881 1.82 % 340.85 12,686 1.47 % 295.47 1,718 2.33 % 434.29 69,913 1.44 % 219.89 njegov n jegov 312,067 1.64 % 275.02 0 0 % 0 20,759 2.16 % 522.69 139,380 1.59 % 256.82 48,529 1.39 % 258.93 11,386 1.32 % 265.19 788 1.07 % 199.20 91,225 1.88 % 286.93 svojega sv ojega 308,222 1.62 % 271.63 0 0 % 0 15,805 1.65 % 397.95 139,191 1.59 % 256.47 59,785 1.71 % 318.99 14,856 1.72 % 346.01 946 1.28 % 239.14 77,639 1.60 % 244.19 katerega kat erega 301,745 1.59 % 265.93 0 0 % 0 8,703 0.91 % 219.13 137,666 1.57 % 253.66 52,123 1.49 % 278.11 12,539 1.45 % 292.05 1,749 2.37 % 442.13 88,965 1.84 % 279.82 katero k atero 296,676 1.56 % 261.46 0 0 % 0 9,799 1.02 % 246.73 131,196 1.50 % 241.74 54,232 1.55 % 289.36 12,778 1.48 % 297.61 1,822 2.47 % 460.58 86,849 1.79 % 273.16 katere k atere 293,688 1.55 % 258.83 0 0 % 0 7,495 0.78 % 188.72 132,940 1.52 % 244.95 51,495 1.47 % 274.76 15,891 1.84 % 370.12 2,320 3.15 % 586.47 83,547 1.73 % 262.78 nekateri nek ateri 291,094 1.53 % 256.54 1 2.94 % 103.02 5,577 0.58 % 140.42 150,843 1.72 % 277.94 51,392 1.47 % 274.21 12,872 1.49 % 299.80 710 0.96 % 179.48 69,699 1.44 % 219.22 njegovo nj egovo 282,399 1.49 % 248.88 0 0 % 0 18,982 1.98 % 477.95 123,421 1.41 % 227.41 48,068 1.37 % 256.48 15,381 1.78 % 358.24 1,040 1.41 % 262.90 75,507 1.56 % 237.49 njegova nj egova 263,259 1.39 % 232.01 0 0 % 0 16,409 1.71 % 413.16 113,639 1.30 % 209.39 48,864 1.40 % 260.72 11,792 1.36 % 274.65 578 0.79 % 146.11 71,977 1.49 % 226.39 svoji svoji 244,889 1.29 % 215.82 0 0 % 0 12,098 1.26 % 304.61 107,893 1.23 % 198.80 44,470 1.27 % 237.28 10,640 1.23 % 247.82 728 0.99 % 184.03 69,060 1.43 % 217.21 vsako vsako 242,448 1.28 % 213.67 0 0 % 0 7,927 0.83 % 199.59 118,121 1.35 % 217.65 48,054 1.37 % 256.40 
10,671 1.23 % 248.54 1,017 1.38 % 257.09 56,658 1.17 % 178.20 nekaterih neka terih 231,651 1.22 % 204.15 0 0 % 0 1,963 0.20 % 49.43 121,205 1.39 % 223.33 38,342 1.09 % 204.58 11,127 1.29 % 259.16 643 0.87 % 162.54 58,371 1.21 % 183.59 svojem s vojem 213,135 1.12 % 187.83 0 0 % 0 8,622 0.90 % 217.09 96,597 1.10 % 177.99 38,100 1.09 % 203.29 10,163 1.18 % 236.71 825 1.12 % 208.55 58,828 1.22 % 185.03 nihče nihče 212,510 1.12 % 187.28 0 0 % 0 17,772 1.85 % 447.48 104,936 1.20 % 193.35 34,654 0.99 % 184.90 5,487 0.63 % 127.80 618 0.84 % 156.22 49,043 1.01 % 154.25 svojim s vojim 209,107 1.10 % 184.28 0 0 % 0 9,134 0.95 % 229.98 95,404 1.09 % 175.79 42,914 1.23 % 228.98 8,840 1.02 % 205.89 537 0.73 % 135.75 52,278 1.08 % 164.43 nekaj nekaj 203,242 1.07 % 179.12 0 0 % 0 18,383 1.92 % 462.86 83,125 0.95 % 153.16 44,180 1.26 % 235.73 9,029 1.04 % 210.30 667 0.91 % 168.61 47,858 0.99 % 150.53 nekatere nek atere 202,747 1.07 % 178.68 0 0 % 0 2,957 0.31 % 74.45 99,266 1.14 % 182.90 37,104 1.06 % 197.97 13,192 1.52 % 307.26 739 1.00 % 186.81 49,489 1.02 % 155.66 njegove nj egove 198,326 1.04 % 174.78 0 0 % 0 14,987 1.56 % 377.36 85,729 0.98 % 157.96 33,244 0.95 % 177.38 10,944 1.26 % 254.90 714 0.97 % 180.49 52,708 1.09 % 165.78 katerimi kat erimi 197,235 1.04 % 173.82 0 0 % 0 3,547 0.37 % 89.31 90,923 1.04 % 167.53 34,351 0.98 % 183.29 8,749 1.01 % 203.77 875 1.19 % 221.19 58,790 1.21 % 184.91 tiste tiste 189,871 1.00 % 167.33 1 2.94 % 103.02 9,150 0.95 % 230.39 86,224 0.98 % 158.87 41,769 1.19 % 222.87 9,106 1.05 % 212.09 1,020 1.39 % 257.84 42,601 0.88 % 133.99 njegovi nj egovi 188,214 0.99 % 165.87 0 0 % 0 11,023 1.15 % 277.55 87,253 1.00 % 160.77 29,364 0.84 % 156.68 7,727 0.89 % 179.97 719 0.98 % 181.76 52,128 1.08 % 163.96 katerim ka terim 179,565 0.95 % 158.25 1 2.94 % 103.02 4,658 0.48 % 117.28 77,908 0.89 % 143.55 32,159 0.92 % 171.59 6,650 0.77 % 154.89 866 1.18 % 218.92 57,323 1.18 % 180.30 njihovo nj ihovo 176,435 0.93 % 155.49 0 0 % 0 3,451 0.36 % 86.89 85,463 
0.98 % 157.47 29,870 0.85 % 159.38 10,893 1.26 % 253.71 955 1.30 % 241.41 45,803 0.95 % 144.06
njegovih nje govih 171,382 0.90 % 151.04 0 0 % 0 8,418 0.88 % 211.96 68,990 0.79 % 127.12 19,655 0.56 % 104.87 6,550 0.76 % 152.56 328 0.45 % 82.91 67,441 1.39 % 212.12
naših naših 170,366 0.90 % 150.14 0 0 % 0 2,106 0.22 % 53.03 99,368 1.14 % 183.09 27,572 0.79 % 147.12 4,715 0.55 % 109.82 358 0.49 % 90.50 36,247 0.75 % 114.01

1.2.121 List of initial character-level 1-grams from numeral lemmas in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-word_parts-numerals-lemmas-initial-1grams-taxonomy-entire.tsv

Lemma | Lemma (lower-case) | Initial part of the word | Rest of the word | Total absolute frequency of lemma | Percentage of all found lemmas | Total relative frequency (per million occurrences) | Absolute frequency [SSJ.T.K.N] | Percentage [SSJ.T.K.N] | Relative frequency [SSJ.T.K.N] | Absolute frequency [SSJ.T.K.L] | Percentage [SSJ.T.K.L] | Relative frequency [SSJ.T.K.L] | Absolute frequency [SSJ.T.P.C] | Percentage [SSJ.T.P.C] | Relative frequency [SSJ.T.P.C] | Absolute frequency [SSJ.T.P.R] | Percentage [SSJ.T.P.R] | Relative frequency [SSJ.T.P.R] | Absolute frequency [SSJ.T.K.S] | Percentage [SSJ.T.K.S] | Relative frequency [SSJ.T.K.S] | Absolute frequency [SSJ.T.D] | Percentage [SSJ.T.D] | Relative frequency [SSJ.T.D] | Absolute frequency [SSJ.I] | Percentage [SSJ.I] | Relative frequency [SSJ.I]
drug drug d rug 3,399,068 8.45 % 2,995.58 19 4.38 % 1,957.35 110,912 24.64 % 2,792.64 1,577,458 7.55 % 2,906.57 551,520 10.49 % 2,942.73 157,344 10.86 % 3,664.73 22,761 14.19 % 5,753.74 979,054 8.15 % 3,079.38
prvi prvi p rvi 2,198,324 5.46 % 1,937.37 2 0.46 % 206.04 31,830 7.07 % 801.44 1,053,085 5.04 % 1,940.38 295,479 5.62 % 1,576.58 58,824 4.06 % 1,370.08 9,684 6.04 % 2,448.01 749,420 6.24 % 2,357.12
en en e n 1,710,696 4.25 % 1,507.63 21 4.84 % 2,163.39 67,198 14.93 % 1,691.97 762,842 3.65 % 1,405.59 317,026
6.03 % 1,691.55 66,950 4.62 % 1,559.35 6,327 3.94 % 1,599.40 490,332 4.08 % 1,542.22 dva dva d va 1,632,916 4.06 % 1,439.08 9 2.07 % 927.17 41,991 9.33 % 1,057.29 804,557 3.85 % 1,482.45 255,248 4.86 % 1,361.92 45,807 3.16 % 1,066.90 3,602 2.25 % 910.55 481,702 4.01 % 1,515.08 trije trije t rije 1,077,597 2.68 % 949.68 4 0.92 % 412.07 24,134 5.36 % 607.67 554,125 2.65 % 1,021.01 148,979 2.83 % 794.90 24,987 1.72 % 581.98 2,226 1.39 % 562.71 323,142 2.69 % 1,016.37 štirje štirje š tirje 562,863 1.40 % 496.05 2 0.46 % 206.04 10,338 2.30 % 260.30 287,367 1.38 % 529.49 76,421 1.45 % 407.76 10,991 0.76 % 255.99 649 0.41 % 164.06 177,095 1.47 % 557.01 pet pet p et 490,799 1.22 % 432.54 1 0.23 % 103.02 9,434 2.10 % 237.54 250,598 1.20 % 461.74 63,012 1.20 % 336.21 7,471 0.52 % 174.01 917 0.57 % 231.81 159,366 1.33 % 501.25 1 1 1 473,724 1.18 % 417.49 34 7.83 % 3,502.63 1,167 0.26 % 29.38 224,982 1.08 % 414.54 75,917 1.44 % 405.07 31,719 2.19 % 738.77 5,780 3.60 % 1,461.12 134,125 1.12 % 421.86 tretji tretji t retji 454,803 1.13 % 400.82 0 0 % 0 5,843 1.30 % 147.12 221,805 1.06 % 408.69 50,869 0.97 % 271.42 11,061 0.76 % 257.62 2,681 1.67 % 677.73 162,544 1.35 % 511.24 2 2 2 448,702 1.11 % 395.44 32 7.37 % 3,296.59 870 0.19 % 21.91 232,497 1.11 % 428.39 74,329 1.41 % 396.60 26,758 1.85 % 623.23 5,495 3.43 % 1,389.08 108,721 0.91 % 341.96 deset deset d eset 366,728 0.91 % 323.20 0 0 % 0 9,231 2.05 % 232.43 193,409 0.93 % 356.37 51,610 0.98 % 275.37 5,416 0.37 % 126.15 459 0.29 % 116.03 106,603 0.89 % 335.29 eden eden e den 339,502 0.84 % 299.20 0 0 % 0 9,711 2.16 % 244.51 157,403 0.75 % 290.03 57,846 1.10 % 308.65 10,361 0.71 % 241.32 694 0.43 % 175.44 103,487 0.86 % 325.49 1, 1, 1 , 338,262 0.84 % 298.11 0 0 % 0 729 0.16 % 18.36 212,046 1.01 % 390.71 30,858 0.59 % 164.65 10,574 0.73 % 246.28 3,351 2.09 % 847.10 80,704 0.67 % 253.84 3 3 3 334,249 0.83 % 294.57 20 4.61 % 2,060.37 684 0.15 % 17.22 178,016 0.85 % 328.01 52,468 1.00 % 279.95 20,047 1.38 % 466.92 3,802 2.37 % 
961.10 79,212 0.66 % 249.14 20 20 2 0 318,181 0.79 % 280.41 22 5.07 % 2,266.41 360 0.08 % 9.06 161,967 0.78 % 298.43 43,208 0.82 % 230.54 9,708 0.67 % 226.11 928 0.58 % 234.59 101,988 0.85 % 320.78 tisoč tisoč t isoč 305,442 0.76 % 269.18 0 0 % 0 5,271 1.17 % 132.72 195,780 0.94 % 360.74 37,622 0.72 % 200.74 3,442 0.24 % 80.17 270 0.17 % 68.25 63,057 0.53 % 198.33 4 4 4 303,784 0.76 % 267.72 12 2.77 % 1,236.22 467 0.10 % 11.76 165,356 0.79 % 304.68 46,384 0.88 % 247.49 16,263 1.12 % 378.78 2,293 1.43 % 579.65 73,009 0.61 % 229.63 šest šest š est 303,010 0.75 % 267.04 0 0 % 0 5,487 1.22 % 138.16 154,063 0.74 % 283.87 36,091 0.69 % 192.57 4,534 0.31 % 105.60 741 0.46 % 187.32 102,094 0.85 % 321.11 10 10 1 0 296,783 0.74 % 261.55 21 4.84 % 2,163.39 401 0.09 % 10.10 142,870 0.68 % 263.25 49,185 0.94 % 262.43 13,444 0.93 % 313.13 1,236 0.77 % 312.45 89,626 0.75 % 281.90 5 5 5 265,501 0.66 % 233.98 14 3.23 % 1,442.26 549 0.12 % 13.82 141,453 0.68 % 260.64 45,080 0.86 % 240.53 15,334 1.06 % 357.15 1,690 1.05 % 427.21 61,381 0.51 % 193.06 15 15 1 5 265,493 0.66 % 233.98 19 4.38 % 1,957.35 308 0.07 % 7.76 133,739 0.64 % 246.42 35,793 0.68 % 190.98 8,377 0.58 % 195.11 753 0.47 % 190.35 86,504 0.72 % 272.08 30 30 3 0 251,613 0.62 % 221.75 20 4.61 % 2,060.37 282 0.06 % 7.10 128,479 0.61 % 236.73 38,641 0.73 % 206.18 7,698 0.53 % 179.30 1,127 0.70 % 284.89 75,366 0.63 % 237.05 6 6 6 241,633 0.60 % 212.95 2 0.46 % 206.04 314 0.07 % 7.91 144,313 0.69 % 265.91 28,665 0.55 % 152.95 10,067 0.69 % 234.47 1,004 0.63 % 253.80 57,268 0.48 % 180.12 2, 2, 2 , 238,500 0.59 % 210.19 0 0 % 0 327 0.07 % 8.23 154,120 0.74 % 283.98 23,331 0.44 % 124.49 8,832 0.61 % 205.71 3,076 1.92 % 777.58 48,814 0.41 % 153.53 sedem sedem s edem 221,597 0.55 % 195.29 0 0 % 0 4,599 1.02 % 115.80 112,630 0.54 % 207.53 25,309 0.48 % 135.04 3,382 0.23 % 78.77 261 0.16 % 65.98 75,416 0.63 % 237.20 12 12 1 2 214,333 0.53 % 188.89 5 1.15 % 515.09 297 0.07 % 7.48 109,658 0.53 % 202.05 25,719 0.49 % 137.23 6,921 0.48 
% 161.20 479 0.30 % 121.09 71,254 0.59 % 224.11
3, 3, 3 , 214,002 0.53 % 188.60 0 0 % 0 288 0.06 % 7.25 143,636 0.69 % 264.66 19,566 0.37 % 104.40 7,182 0.50 % 167.28 2,450 1.53 % 619.33 40,880 0.34 % 128.58
osem osem o sem 208,962 0.52 % 184.16 0 0 % 0 3,541 0.79 % 89.16 108,342 0.52 % 199.63 23,456 0.45 % 125.15 2,867 0.20 % 66.78 416 0.26 % 105.16 70,340 0.58 % 221.24
50 50 5 0 208,691 0.52 % 183.92 8 1.84 % 824.15 222 0.05 % 5.59 107,858 0.52 % 198.74 32,852 0.62 % 175.29 5,759 0.40 % 134.13 785 0.49 % 198.44 61,207 0.51 % 192.51
40 40 4 0 180,779 0.45 % 159.32 5 1.15 % 515.09 198 0.04 % 4.99 93,495 0.45 % 172.27 27,276 0.52 % 145.54 4,939 0.34 % 115.04 445 0.28 % 112.49 54,421 0.45 % 171.17
100 100 1 0 180,711 0.45 % 159.26 11 2.54 % 1,133.20 196 0.04 % 4.94 85,233 0.41 % 157.05 36,302 0.69 % 193.70 5,283 0.36 % 123.05 698 0.43 % 176.45 52,988 0.44 % 166.66
25 25 2 5 177,837 0.44 % 156.73 9 2.07 % 927.17 224 0.05 % 5.64 93,905 0.45 % 173.03 22,798 0.43 % 121.64 5,172 0.36 % 120.46 417 0.26 % 105.41 55,312 0.46 % 173.97

1.2.122 List of initial character-level 2-grams from numeral lemmas in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-word_parts-numerals-lemmas-initial-2grams-taxonomy-entire.tsv

Lemma | Lemma (lower-case) | Initial part of the word | Rest of the word | Total absolute frequency of lemma | Percentage of all found lemmas | Total relative frequency (per million occurrences) | Absolute frequency [SSJ.T.K.N] | Percentage [SSJ.T.K.N] | Relative frequency [SSJ.T.K.N] | Absolute frequency [SSJ.T.K.L] | Percentage [SSJ.T.K.L] | Relative frequency [SSJ.T.K.L] | Absolute frequency [SSJ.T.P.C] | Percentage [SSJ.T.P.C] | Relative frequency [SSJ.T.P.C] | Absolute frequency [SSJ.T.P.R] | Percentage [SSJ.T.P.R] | Relative frequency [SSJ.T.P.R] | Absolute frequency [SSJ.T.K.S] | Percentage [SSJ.T.K.S] | Relative frequency [SSJ.T.K.S] | Absolute frequency [SSJ.T.D] | Percentage [SSJ.T.D] | Relative frequency [SSJ.T.D] | Absolute frequency [SSJ.I] | Percentage [SSJ.I] | Relative frequency [SSJ.I]
drug drug dr ug 3,399,068 9.04 % 2,995.58 19 6.17 % 1,957.35 110,912 24.93 % 2,792.64 1,577,458 8.09 % 2,906.57 551,520 11.35 % 2,942.73 157,344 12.10 % 3,664.73 22,761 16.44 % 5,753.74 979,054 8.63 % 3,079.38
prvi prvi pr vi 2,198,324 5.85 % 1,937.37 2 0.65 % 206.04 31,830 7.16 % 801.44 1,053,085 5.40 % 1,940.38 295,479 6.08 % 1,576.58 58,824 4.52 % 1,370.08 9,684 6.99 % 2,448.01 749,420 6.60 % 2,357.12
en en en 1,710,696 4.55 % 1,507.63 21 6.82 % 2,163.39 67,198 15.10 % 1,691.97 762,842 3.91 % 1,405.59 317,026 6.53 % 1,691.55 66,950 5.15 % 1,559.35 6,327 4.57 % 1,599.40 490,332 4.32 % 1,542.22
dva dva dv a 1,632,916 4.34 % 1,439.08 9 2.92 % 927.17 41,991 9.44 % 1,057.29 804,557 4.12 % 1,482.45 255,248 5.25 % 1,361.92 45,807 3.52 % 1,066.90 3,602 2.60 % 910.55 481,702 4.25 % 1,515.08
trije trije tr ije 1,077,597 2.87 % 949.68 4 1.30 % 412.07 24,134 5.42 % 607.67 554,125 2.84 % 1,021.01 148,979 3.07 % 794.90 24,987 1.92 % 581.98 2,226 1.61 % 562.71 323,142 2.85 % 1,016.37
štirje štirje št irje 562,863 1.50 % 496.05 2 0.65 % 206.04 10,338 2.32 % 260.30 287,367 1.47 % 529.49 76,421 1.57 % 407.76 10,991 0.84 % 255.99 649 0.47 % 164.06 177,095 1.56 % 557.01
pet pet pe t 490,799 1.30 % 432.54 1 0.33 % 103.02 9,434 2.12 % 237.54 250,598 1.28 % 461.74 63,012 1.30 % 336.21 7,471 0.57 % 174.01 917 0.66 % 231.81 159,366 1.40 % 501.25
tretji tretji tr etji 454,803 1.21 % 400.82 0 0 % 0 5,843 1.31 % 147.12 221,805 1.14 % 408.69 50,869 1.05 % 271.42 11,061 0.85 % 257.62 2,681 1.94 % 677.73 162,544 1.43 % 511.24
deset deset de set 366,728 0.97 % 323.20 0 0 % 0 9,231 2.08 % 232.43 193,409 0.99 % 356.37 51,610 1.06 % 275.37 5,416 0.42 % 126.15 459 0.33 % 116.03 106,603 0.94 % 335.29
eden eden ed en 339,502 0.90 % 299.20 0 0 % 0 9,711 2.18 % 244.51 157,403 0.81 % 290.03 57,846 1.19 % 308.65 10,361 0.80 % 241.32 694 0.50 % 175.44 103,487 0.91 % 325.49
1, 1, 1, 338,262 0.90 % 298.11 0 0 % 0
729 0.16 % 18.36 212,046 1.09 % 390.71 30,858 0.64 % 164.65 10,574 0.81 % 246.28 3,351 2.42 % 847.10 80,704 0.71 % 253.84 20 20 20 318,181 0.85 % 280.41 22 7.14 % 2,266.41 360 0.08 % 9.06 161,967 0.83 % 298.43 43,208 0.89 % 230.54 9,708 0.75 % 226.11 928 0.67 % 234.59 101,988 0.90 % 320.78 tisoč tisoč ti soč 305,442 0.81 % 269.18 0 0 % 0 5,271 1.19 % 132.72 195,780 1.00 % 360.74 37,622 0.78 % 200.74 3,442 0.27 % 80.17 270 0.20 % 68.25 63,057 0.56 % 198.33 šest šest še st 303,010 0.81 % 267.04 0 0 % 0 5,487 1.23 % 138.16 154,063 0.79 % 283.87 36,091 0.74 % 192.57 4,534 0.35 % 105.60 741 0.54 % 187.32 102,094 0.90 % 321.11 10 10 10 296,783 0.79 % 261.55 21 6.82 % 2,163.39 401 0.09 % 10.10 142,870 0.73 % 263.25 49,185 1.01 % 262.43 13,444 1.03 % 313.13 1,236 0.89 % 312.45 89,626 0.79 % 281.90 15 15 15 265,493 0.71 % 233.98 19 6.17 % 1,957.35 308 0.07 % 7.76 133,739 0.69 % 246.42 35,793 0.74 % 190.98 8,377 0.64 % 195.11 753 0.54 % 190.35 86,504 0.76 % 272.08 30 30 30 251,613 0.67 % 221.75 20 6.49 % 2,060.37 282 0.06 % 7.10 128,479 0.66 % 236.73 38,641 0.80 % 206.18 7,698 0.59 % 179.30 1,127 0.81 % 284.89 75,366 0.66 % 237.05 2, 2, 2, 238,500 0.63 % 210.19 0 0 % 0 327 0.07 % 8.23 154,120 0.79 % 283.98 23,331 0.48 % 124.49 8,832 0.68 % 205.71 3,076 2.22 % 777.58 48,814 0.43 % 153.53 sedem sedem se dem 221,597 0.59 % 195.29 0 0 % 0 4,599 1.03 % 115.80 112,630 0.58 % 207.53 25,309 0.52 % 135.04 3,382 0.26 % 78.77 261 0.19 % 65.98 75,416 0.67 % 237.20 12 12 12 214,333 0.57 % 188.89 5 1.62 % 515.09 297 0.07 % 7.48 109,658 0.56 % 202.05 25,719 0.53 % 137.23 6,921 0.53 % 161.20 479 0.35 % 121.09 71,254 0.63 % 224.11 3, 3, 3, 214,002 0.57 % 188.60 0 0 % 0 288 0.07 % 7.25 143,636 0.74 % 264.66 19,566 0.40 % 104.40 7,182 0.55 % 167.28 2,450 1.77 % 619.33 40,880 0.36 % 128.58 osem osem os em 208,962 0.56 % 184.16 0 0 % 0 3,541 0.80 % 89.16 108,342 0.56 % 199.63 23,456 0.48 % 125.15 2,867 0.22 % 66.78 416 0.30 % 105.16 70,340 0.62 % 221.24 50 50 50 208,691 0.56 % 183.92 8 2.60 % 
824.15 222 0.05 % 5.59 107,858 0.55 % 198.74 32,852 0.68 % 175.29 5,759 0.44 % 134.13 785 0.57 % 198.44 61,207 0.54 % 192.51
40 40 40 180,779 0.48 % 159.32 5 1.62 % 515.09 198 0.04 % 4.99 93,495 0.48 % 172.27 27,276 0.56 % 145.54 4,939 0.38 % 115.04 445 0.32 % 112.49 54,421 0.48 % 171.17
100 100 10 0 180,711 0.48 % 159.26 11 3.57 % 1,133.20 196 0.04 % 4.94 85,233 0.44 % 157.05 36,302 0.75 % 193.70 5,283 0.41 % 123.05 698 0.50 % 176.45 52,988 0.47 % 166.66
25 25 25 177,837 0.47 % 156.73 9 2.92 % 927.17 224 0.05 % 5.64 93,905 0.48 % 173.03 22,798 0.47 % 121.64 5,172 0.40 % 120.46 417 0.30 % 105.41 55,312 0.49 % 173.97
14 14 14 169,856 0.45 % 149.69 0 0 % 0 227 0.05 % 5.72 90,968 0.47 % 167.61 15,762 0.32 % 84.10 4,747 0.36 % 110.56 314 0.23 % 79.38 57,838 0.51 % 181.92
16 16 16 163,126 0.43 % 143.76 1 0.33 % 103.02 201 0.04 % 5.06 85,630 0.44 % 157.78 18,486 0.38 % 98.64 4,603 0.35 % 107.21 231 0.17 % 58.39 53,974 0.48 % 169.76
18 18 18 159,374 0.42 % 140.46 3 0.97 % 309.06 192 0.04 % 4.83 84,973 0.43 % 156.57 15,991 0.33 % 85.32 4,219 0.32 % 98.27 456 0.33 % 115.27 53,540 0.47 % 168.40
četrti četrti če trti 158,938 0.42 % 140.07 0 0 % 0 1,901 0.43 % 47.87 76,520 0.39 % 140.99 15,548 0.32 % 82.96 3,282 0.25 % 76.44 926 0.67 % 234.08 60,761 0.54 % 191.11
20, 20, 20 , 157,565 0.42 % 138.86 0 0 % 0 297 0.07 % 7.48 89,595 0.46 % 165.08 12,425 0.26 % 66.30 4,708 0.36 % 109.65 571 0.41 % 144.34 49,969 0.44 % 157.17
11 11 11 153,446 0.41 % 135.23 0 0 % 0 179 0.04 % 4.51 80,081 0.41 % 147.55 14,037 0.29 % 74.90 4,536 0.35 % 105.65 300 0.22 % 75.84 54,313 0.48 % 170.83

1.2.123 List of initial character-level 3-grams from numeral lemmas in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-word_parts-numerals-lemmas-initial-3grams-taxonomy-entire.tsv

Lemma | Lemma (lower-case) | Initial part of the word | Rest of the word | Total absolute frequency of lemma | Percentage of all found lemmas | Total relative frequency (per million occurrences) | Absolute frequency [SSJ.T.K.N] | Percentage [SSJ.T.K.N] | Relative frequency [SSJ.T.K.N] | Absolute frequency [SSJ.T.K.L] | Percentage [SSJ.T.K.L] | Relative frequency [SSJ.T.K.L] | Absolute frequency [SSJ.T.P.C] | Percentage [SSJ.T.P.C] | Relative frequency [SSJ.T.P.C] | Absolute frequency [SSJ.T.P.R] | Percentage [SSJ.T.P.R] | Relative frequency [SSJ.T.P.R] | Absolute frequency [SSJ.T.K.S] | Percentage [SSJ.T.K.S] | Relative frequency [SSJ.T.K.S] | Absolute frequency [SSJ.T.D] | Percentage [SSJ.T.D] | Relative frequency [SSJ.T.D] | Absolute frequency [SSJ.I] | Percentage [SSJ.I] | Relative frequency [SSJ.I]
drug drug dru g 3,399,068 12.12 % 2,995.58 19 13.77 % 1,957.35 110,912 30.40 % 2,792.64 1,577,458 10.87 % 2,906.57 551,520 15.30 % 2,942.73 157,344 16.58 % 3,664.73 22,761 23.55 % 5,753.74 979,054 11.49 % 3,079.38
prvi prvi prv i 2,198,324 7.84 % 1,937.37 2 1.45 % 206.04 31,830 8.72 % 801.44 1,053,085 7.25 % 1,940.38 295,479 8.20 % 1,576.58 58,824 6.20 % 1,370.08 9,684 10.02 % 2,448.01 749,420 8.80 % 2,357.12
dva dva dva 1,632,916 5.82 % 1,439.08 9 6.52 % 927.17 41,991 11.51 % 1,057.29 804,557 5.54 % 1,482.45 255,248 7.08 % 1,361.92 45,807 4.83 % 1,066.90 3,602 3.73 % 910.55 481,702 5.65 % 1,515.08
trije trije tri je 1,077,597 3.84 % 949.68 4 2.90 % 412.07 24,134 6.62 % 607.67 554,125 3.82 % 1,021.01 148,979 4.13 % 794.90 24,987 2.63 % 581.98 2,226 2.30 % 562.71 323,142 3.79 % 1,016.37
štirje štirje šti rje 562,863 2.01 % 496.05 2 1.45 % 206.04 10,338 2.83 % 260.30 287,367 1.98 % 529.49 76,421 2.12 % 407.76 10,991 1.16 % 255.99 649 0.67 % 164.06 177,095 2.08 % 557.01
pet pet pet 490,799 1.75 % 432.54 1 0.72 % 103.02 9,434 2.59 % 237.54 250,598 1.73 % 461.74 63,012 1.75 % 336.21 7,471 0.79 % 174.01 917 0.95 % 231.81 159,366 1.87 % 501.25
tretji tretji tre tji 454,803 1.62 % 400.82 0 0 % 0 5,843 1.60 % 147.12 221,805 1.53 % 408.69 50,869 1.41 % 271.42 11,061 1.17 % 257.62 2,681 2.77 % 677.73 162,544 1.91 % 511.24
deset deset des et
366,728 1.31 % 323.20 0 0 % 0 9,231 2.53 % 232.43 193,409 1.33 % 356.37 51,610 1.43 % 275.37 5,416 0.57 % 126.15 459 0.47 % 116.03 106,603 1.25 % 335.29 eden eden ede n 339,502 1.21 % 299.20 0 0 % 0 9,711 2.66 % 244.51 157,403 1.08 % 290.03 57,846 1.60 % 308.65 10,361 1.09 % 241.32 694 0.72 % 175.44 103,487 1.22 % 325.49 tisoč tisoč tis oč 305,442 1.09 % 269.18 0 0 % 0 5,271 1.45 % 132.72 195,780 1.35 % 360.74 37,622 1.04 % 200.74 3,442 0.36 % 80.17 270 0.28 % 68.25 63,057 0.74 % 198.33 šest šest šes t 303,010 1.08 % 267.04 0 0 % 0 5,487 1.50 % 138.16 154,063 1.06 % 283.87 36,091 1.00 % 192.57 4,534 0.48 % 105.60 741 0.77 % 187.32 102,094 1.20 % 321.11 sedem sedem sed em 221,597 0.79 % 195.29 0 0 % 0 4,599 1.26 % 115.80 112,630 0.78 % 207.53 25,309 0.70 % 135.04 3,382 0.36 % 78.77 261 0.27 % 65.98 75,416 0.89 % 237.20 osem osem ose m 208,962 0.74 % 184.16 0 0 % 0 3,541 0.97 % 89.16 108,342 0.75 % 199.63 23,456 0.65 % 125.15 2,867 0.30 % 66.78 416 0.43 % 105.16 70,340 0.83 % 221.24 100 100 100 180,711 0.64 % 159.26 11 7.97 % 1,133.20 196 0.05 % 4.94 85,233 0.59 % 157.05 36,302 1.01 % 193.70 5,283 0.56 % 123.05 698 0.72 % 176.45 52,988 0.62 % 166.66 četrti četrti čet rti 158,938 0.57 % 140.07 0 0 % 0 1,901 0.52 % 47.87 76,520 0.53 % 140.99 15,548 0.43 % 82.96 3,282 0.35 % 76.44 926 0.96 % 234.08 60,761 0.71 % 191.11 20, 20, 20, 157,565 0.56 % 138.86 0 0 % 0 297 0.08 % 7.48 89,595 0.62 % 165.08 12,425 0.34 % 66.30 4,708 0.50 % 109.65 571 0.59 % 144.34 49,969 0.59 % 157.17 sto sto sto 152,879 0.55 % 134.73 0 0 % 0 4,718 1.29 % 118.79 84,745 0.58 % 156.15 21,926 0.61 % 116.99 3,217 0.34 % 74.93 184 0.19 % 46.51 38,089 0.45 % 119.80 2,000 2,000 200 0 148,688 0.53 % 131.04 0 0 % 0 245 0.07 % 6.17 89,618 0.62 % 165.13 26,937 0.75 % 143.73 5,880 0.62 % 136.95 159 0.16 % 40.19 25,849 0.30 % 81.30 19, 19, 19, 144,490 0.52 % 127.34 0 0 % 0 334 0.09 % 8.41 83,071 0.57 % 153.06 11,758 0.33 % 62.74 6,449 0.68 % 150.20 689 0.71 % 174.17 42,189 0.49 % 132.70 10, 10, 10, 140,661 
0.50 % 123.96 0 0 % 0 237 0.07 % 5.97 83,115 0.57 % 153.14 11,021 0.31 % 58.80 2,755 0.29 % 64.17 682 0.70 % 172.40 42,851 0.50 % 134.78 18, 18, 18, 136,255 0.49 % 120.08 0 0 % 0 273 0.07 % 6.87 80,084 0.55 % 147.56 10,192 0.28 % 54.38 4,116 0.43 % 95.87 516 0.53 % 130.44 41,074 0.48 % 129.19 devet devet dev et 134,561 0.48 % 118.59 2 1.45 % 206.04 2,497 0.68 % 62.87 68,635 0.47 % 126.46 14,329 0.40 % 76.45 1,695 0.18 % 39.48 117 0.12 % 29.58 47,286 0.56 % 148.73 15, 15, 15, 133,142 0.47 % 117.34 0 0 % 0 432 0.12 % 10.88 75,868 0.52 % 139.79 10,427 0.29 % 55.64 3,144 0.33 % 73.23 565 0.58 % 142.83 42,706 0.50 % 134.32 11, 11, 11, 127,423 0.45 % 112.30 0 0 % 0 267 0.07 % 6.72 69,899 0.48 % 128.79 10,413 0.29 % 55.56 2,697 0.28 % 62.82 554 0.57 % 140.05 43,593 0.51 % 137.11 200 200 200 124,316 0.44 % 109.56 15 10.87 % 1,545.28 119 0.03 % 3 68,266 0.47 % 125.78 20,159 0.56 % 107.56 2,773 0.29 % 64.59 210 0.22 % 53.09 32,774 0.39 % 103.08 12, 12, 12, 122,524 0.44 % 107.98 0 0 % 0 219 0.06 % 5.51 65,640 0.45 % 120.95 10,009 0.28 % 53.40 3,207 0.34 % 74.69 552 0.57 % 139.54 42,897 0.50 % 134.92 17, 17, 17, 121,238 0.43 % 106.85 0 0 % 0 244 0.07 % 6.14 70,583 0.49 % 130.05 8,629 0.24 % 46.04 3,183 0.34 % 74.14 466 0.48 % 117.80 38,133 0.45 % 119.94 16, 16, 16, 112,174 0.40 % 98.86 0 0 % 0 278 0.08 % 7 63,412 0.44 % 116.84 8,435 0.23 % 45.01 3,340 0.35 % 77.79 499 0.52 % 126.14 36,210 0.42 % 113.89 2,007 2,007 200 7 108,871 0.39 % 95.95 0 0 % 0 117 0.03 % 2.95 56,465 0.39 % 104.04 14,573 0.40 % 77.76 4,368 0.46 % 101.74 11 0.01 % 2.78 33,337 0.39 % 104.85 14, 14, 14, 108,240 0.39 % 95.39 0 0 % 0 217 0.06 % 5.46 59,413 0.41 % 109.47 7,886 0.22 % 42.08 2,528 0.27 % 58.88 507 0.52 % 128.16 37,689 0.44 % 118.54 13, 13, 13, 106,668 0.38 % 94.01 0 0 % 0 143 0.04 % 3.60 58,794 0.41 % 108.33 7,834 0.22 % 41.80 2,721 0.29 % 63.38 532 0.55 % 134.48 36,644 0.43 % 115.25 2,008 2,008 200 8 104,919 0.37 % 92.46 0 0 % 0 95 0.03 % 2.39 47,198 0.33 % 86.97 11,128 0.31 % 59.38 2,811 0.30 % 
65.47 30 0.03 % 7.58 43,657 0.51 % 137.31

1.2.124 List of initial character-level 4-grams from numeral lemmas in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-word_parts-numerals-lemmas-initial-4grams-taxonomy-entire.tsv
Columns: Lemma; Lemma (lower-case); Initial part of the word; Rest of the word; Total absolute frequency of lemma; Percentage of all found lemmas; Total relative frequency (per million occurrences); Absolute frequency [SSJ.T.K.N]; Percentage [SSJ.T.K.N]; Relative frequency [SSJ.T.K.N]; Absolute frequency [SSJ.T.K.L]; Percentage [SSJ.T.K.L]; Relative frequency [SSJ.T.K.L]; Absolute frequency [SSJ.T.P.C]; Percentage [SSJ.T.P.C]; Relative frequency [SSJ.T.P.C]; Absolute frequency [SSJ.T.P.R]; Percentage [SSJ.T.P.R]; Relative frequency [SSJ.T.P.R]; Absolute frequency [SSJ.T.K.S]; Percentage [SSJ.T.K.S]; Relative frequency [SSJ.T.K.S]; Absolute frequency [SSJ.T.D]; Percentage [SSJ.T.D]; Relative frequency [SSJ.T.D]; Absolute frequency [SSJ.I]; Percentage [SSJ.I]; Relative frequency [SSJ.I]
drug drug drug 3,399,068 18.08 % 2,995.58 19 61.29 % 1,957.35 110,912 37.42 % 2,792.64 1,577,458 16.61 % 2,906.57 551,520 22.33 % 2,942.73 157,344 23.70 % 3,664.73 22,761 32.85 % 5,753.74 979,054 16.87 % 3,079.38
prvi prvi prvi 2,198,324 11.70 % 1,937.37 2 6.45 % 206.04 31,830 10.74 % 801.44 1,053,085 11.09 % 1,940.38 295,479 11.96 % 1,576.58 58,824 8.86 % 1,370.08 9,684 13.98 % 2,448.01 749,420 12.91 % 2,357.12
trije trije trij e 1,077,597 5.73 % 949.68 4 12.90 % 412.07 24,134 8.14 % 607.67 554,125 5.84 % 1,021.01 148,979 6.03 % 794.90 24,987 3.76 % 581.98 2,226 3.21 % 562.71 323,142 5.57 % 1,016.37
štirje štirje štir je 562,863 2.99 % 496.05 2 6.45 % 206.04 10,338 3.49 % 260.30 287,367 3.03 % 529.49 76,421 3.09 % 407.76 10,991 1.66 % 255.99 649 0.94 % 164.06 177,095 3.05 % 557.01
tretji tretji tret ji 454,803 2.42 % 400.82 0 0 % 0 5,843 1.97 % 147.12 221,805 2.34 % 408.69 50,869 2.06 %
271.42 11,061 1.67 % 257.62 2,681 3.87 % 677.73 162,544 2.80 % 511.24 deset deset dese t 366,728 1.95 % 323.20 0 0 % 0 9,231 3.11 % 232.43 193,409 2.04 % 356.37 51,610 2.09 % 275.37 5,416 0.82 % 126.15 459 0.66 % 116.03 106,603 1.84 % 335.29 eden eden eden 339,502 1.81 % 299.20 0 0 % 0 9,711 3.28 % 244.51 157,403 1.66 % 290.03 57,846 2.34 % 308.65 10,361 1.56 % 241.32 694 1.00 % 175.44 103,487 1.78 % 325.49 tisoč tisoč tiso č 305,442 1.62 % 269.18 0 0 % 0 5,271 1.78 % 132.72 195,780 2.06 % 360.74 37,622 1.52 % 200.74 3,442 0.52 % 80.17 270 0.39 % 68.25 63,057 1.09 % 198.33 šest šest šest 303,010 1.61 % 267.04 0 0 % 0 5,487 1.85 % 138.16 154,063 1.62 % 283.87 36,091 1.46 % 192.57 4,534 0.68 % 105.60 741 1.07 % 187.32 102,094 1.76 % 321.11 sedem sedem sede m 221,597 1.18 % 195.29 0 0 % 0 4,599 1.55 % 115.80 112,630 1.19 % 207.53 25,309 1.02 % 135.04 3,382 0.51 % 78.77 261 0.38 % 65.98 75,416 1.30 % 237.20 osem osem osem 208,962 1.11 % 184.16 0 0 % 0 3,541 1.20 % 89.16 108,342 1.14 % 199.63 23,456 0.95 % 125.15 2,867 0.43 % 66.78 416 0.60 % 105.16 70,340 1.21 % 221.24 četrti četrti četr ti 158,938 0.85 % 140.07 0 0 % 0 1,901 0.64 % 47.87 76,520 0.81 % 140.99 15,548 0.63 % 82.96 3,282 0.49 % 76.44 926 1.34 % 234.08 60,761 1.05 % 191.11 2,000 2,000 2,000 148,688 0.79 % 131.04 0 0 % 0 245 0.08 % 6.17 89,618 0.94 % 165.13 26,937 1.09 % 143.73 5,880 0.89 % 136.95 159 0.23 % 40.19 25,849 0.45 % 81.30 devet devet deve t 134,561 0.72 % 118.59 2 6.45 % 206.04 2,497 0.84 % 62.87 68,635 0.72 % 126.46 14,329 0.58 % 76.45 1,695 0.26 % 39.48 117 0.17 % 29.58 47,286 0.81 % 148.73 2,007 2,007 2,007 108,871 0.58 % 95.95 0 0 % 0 117 0.04 % 2.95 56,465 0.59 % 104.04 14,573 0.59 % 77.76 4,368 0.66 % 101.74 11 0.02 % 2.78 33,337 0.57 % 104.85 2,008 2,008 2,008 104,919 0.56 % 92.46 0 0 % 0 95 0.03 % 2.39 47,198 0.50 % 86.97 11,128 0.45 % 59.38 2,811 0.42 % 65.47 30 0.04 % 7.58 43,657 0.75 % 137.31 2,004 2,004 2,004 98,632 0.53 % 86.92 0 0 % 0 106 0.04 % 2.67 55,211 0.58 % 101.73 17,052 
0.69 % 90.98 3,950 0.59 % 92 38 0.06 % 9.61 22,275 0.38 % 70.06 2,006 2,006 2,006 98,449 0.52 % 86.76 0 0 % 0 122 0.04 % 3.07 52,575 0.55 % 96.87 15,004 0.61 % 80.06 4,316 0.65 % 100.52 48 0.07 % 12.13 26,384 0.46 % 82.98 2,005 2,005 2,005 88,134 0.47 % 77.67 0 0 % 0 88 0.03 % 2.22 46,840 0.49 % 86.31 15,419 0.62 % 82.27 3,549 0.54 % 82.66 51 0.07 % 12.89 22,187 0.38 % 69.78 2,002 2,002 2,002 86,866 0.46 % 76.55 0 0 % 0 118 0.04 % 2.97 50,869 0.54 % 93.73 15,965 0.65 % 85.18 3,731 0.56 % 86.90 39 0.06 % 9.86 16,144 0.28 % 50.78 dvajset dvajset dvaj set 85,701 0.46 % 75.53 0 0 % 0 4,883 1.65 % 122.95 47,739 0.50 % 87.96 16,442 0.67 % 87.73 2,169 0.33 % 50.52 142 0.20 % 35.90 14,326 0.25 % 45.06 2,001 2,001 2,001 85,421 0.45 % 75.28 0 0 % 0 134 0.04 % 3.37 52,447 0.55 % 96.64 14,015 0.57 % 74.78 3,405 0.51 % 79.31 49 0.07 % 12.39 15,371 0.27 % 48.35 peti peti peti 85,266 0.45 % 75.14 0 0 % 0 1,304 0.44 % 32.83 41,917 0.44 % 77.23 9,024 0.36 % 48.15 1,482 0.22 % 34.52 375 0.54 % 94.80 31,164 0.54 % 98.02 2,003 2,003 2,003 83,355 0.44 % 73.46 0 0 % 0 109 0.04 % 2.74 45,807 0.48 % 84.40 16,737 0.68 % 89.30 3,901 0.59 % 90.86 33 0.05 % 8.34 16,768 0.29 % 52.74 2,009 2,009 2,009 79,900 0.42 % 70.42 0 0 % 0 70 0.02 % 1.76 28,954 0.30 % 53.35 6,586 0.27 % 35.14 964 0.14 % 22.45 21 0.03 % 5.31 43,305 0.75 % 136.21 2,010 2,010 2,010 79,410 0.42 % 69.98 0 0 % 0 72 0.02 % 1.81 19,772 0.21 % 36.43 3,005 0.12 % 16.03 587 0.09 % 13.67 2 0 % 0.51 55,972 0.96 % 176.05 šesti šesti šest i 74,252 0.40 % 65.44 0 0 % 0 966 0.33 % 24.32 35,823 0.38 % 66.01 7,316 0.30 % 39.04 1,195 0.18 % 27.83 282 0.41 % 71.29 28,670 0.49 % 90.17 1,999 1,999 1,999 73,271 0.39 % 64.57 0 0 % 0 117 0.04 % 2.95 47,250 0.50 % 87.06 11,085 0.45 % 59.15 3,507 0.53 % 81.68 175 0.25 % 44.24 11,137 0.19 % 35.03 1,998 1,998 1,998 66,087 0.35 % 58.24 0 0 % 0 134 0.04 % 3.37 41,645 0.44 % 76.73 10,313 0.42 % 55.03 3,428 0.52 % 79.84 203 0.29 % 51.32 10,364 0.18 % 32.60 sedmi sedmi sedm i 66,081 0.35 % 58.24 0 0 % 0 
848 0.29 % 21.35 33,649 0.35 % 62 5,901 0.24 % 31.49 975 0.15 % 22.71 207 0.30 % 52.33 24,501 0.42 % 77.06
2,013 2,013 2,013 65,695 0.35 % 57.90 0 0 % 0 59 0.02 % 1.49 5,509 0.06 % 10.15 541 0.02 % 2.89 410 0.06 % 9.55 1 0 % 0.25 59,175 1.02 % 186.12
2,014 2,014 2,014 65,624 0.35 % 57.83 0 0 % 0 42 0.01 % 1.06 1,573 0.02 % 2.90 133 0.01 % 0.71 475 0.07 % 11.06 1 0 % 0.25 63,400 1.09 % 199.41

1.2.125 List of initial character-level 5-grams from numeral lemmas in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-word_parts-numerals-lemmas-initial-5grams-taxonomy-entire.tsv
Columns: Lemma; Lemma (lower-case); Initial part of the word; Rest of the word; Total absolute frequency of lemma; Percentage of all found lemmas; Total relative frequency (per million occurrences); Absolute frequency [SSJ.T.K.N]; Percentage [SSJ.T.K.N]; Relative frequency [SSJ.T.K.N]; Absolute frequency [SSJ.T.K.L]; Percentage [SSJ.T.K.L]; Relative frequency [SSJ.T.K.L]; Absolute frequency [SSJ.T.P.C]; Percentage [SSJ.T.P.C]; Relative frequency [SSJ.T.P.C]; Absolute frequency [SSJ.T.P.R]; Percentage [SSJ.T.P.R]; Relative frequency [SSJ.T.P.R]; Absolute frequency [SSJ.T.K.S]; Percentage [SSJ.T.K.S]; Relative frequency [SSJ.T.K.S]; Absolute frequency [SSJ.T.D]; Percentage [SSJ.T.D]; Relative frequency [SSJ.T.D]; Absolute frequency [SSJ.I]; Percentage [SSJ.I]; Relative frequency [SSJ.I]
trije trije trije 1,077,597 14.63 % 949.68 4 50.00 % 412.07 24,134 21.20 % 607.67 554,125 13.61 % 1,021.01 148,979 17.83 % 794.90 24,987 17.09 % 581.98 2,226 11.91 % 562.71 323,142 14.83 % 1,016.37
štirje štirje štirj e 562,863 7.64 % 496.05 2 25.00 % 206.04 10,338 9.08 % 260.30 287,367 7.06 % 529.49 76,421 9.15 % 407.76 10,991 7.52 % 255.99 649 3.47 % 164.06 177,095 8.13 % 557.01
tretji tretji tretj i 454,803 6.18 % 400.82 0 0 % 0 5,843 5.13 % 147.12 221,805 5.45 % 408.69 50,869 6.09 % 271.42 11,061 7.57 % 257.62 2,681 14.34 % 677.73 162,544
7.46 % 511.24 deset deset deset 366,728 4.98 % 323.20 0 0 % 0 9,231 8.11 % 232.43 193,409 4.75 % 356.37 51,610 6.18 % 275.37 5,416 3.71 % 126.15 459 2.46 % 116.03 106,603 4.89 % 335.29 tisoč tisoč tisoč 305,442 4.15 % 269.18 0 0 % 0 5,271 4.63 % 132.72 195,780 4.81 % 360.74 37,622 4.50 % 200.74 3,442 2.35 % 80.17 270 1.44 % 68.25 63,057 2.89 % 198.33 sedem sedem sedem 221,597 3.01 % 195.29 0 0 % 0 4,599 4.04 % 115.80 112,630 2.77 % 207.53 25,309 3.03 % 135.04 3,382 2.31 % 78.77 261 1.40 % 65.98 75,416 3.46 % 237.20 četrti četrti četrt i 158,938 2.16 % 140.07 0 0 % 0 1,901 1.67 % 47.87 76,520 1.88 % 140.99 15,548 1.86 % 82.96 3,282 2.25 % 76.44 926 4.95 % 234.08 60,761 2.79 % 191.11 devet devet devet 134,561 1.83 % 118.59 2 25.00 % 206.04 2,497 2.19 % 62.87 68,635 1.69 % 126.46 14,329 1.72 % 76.45 1,695 1.16 % 39.48 117 0.63 % 29.58 47,286 2.17 % 148.73 dvajset dvajset dvajs et 85,701 1.16 % 75.53 0 0 % 0 4,883 4.29 % 122.95 47,739 1.17 % 87.96 16,442 1.97 % 87.73 2,169 1.48 % 50.52 142 0.76 % 35.90 14,326 0.66 % 45.06 šesti šesti šesti 74,252 1.01 % 65.44 0 0 % 0 966 0.85 % 24.32 35,823 0.88 % 66.01 7,316 0.88 % 39.04 1,195 0.82 % 27.83 282 1.51 % 71.29 28,670 1.32 % 90.17 sedmi sedmi sedmi 66,081 0.90 % 58.24 0 0 % 0 848 0.74 % 21.35 33,649 0.83 % 62 5,901 0.71 % 31.49 975 0.67 % 22.71 207 1.11 % 52.33 24,501 1.12 % 77.06 dvojen dvojen dvoje n 53,225 0.72 % 46.91 0 0 % 0 1,309 1.15 % 32.96 25,179 0.62 % 46.39 9,987 1.20 % 53.29 2,927 2.00 % 68.17 268 1.43 % 67.75 13,555 0.62 % 42.63 petnajst petnajst petna jst 52,670 0.71 % 46.42 0 0 % 0 2,585 2.27 % 65.09 31,310 0.77 % 57.69 9,890 1.18 % 52.77 1,092 0.75 % 25.43 177 0.95 % 44.74 7,616 0.35 % 23.95 dvanajst dvanajst dvana jst 52,274 0.71 % 46.07 0 0 % 0 2,256 1.98 % 56.80 30,553 0.75 % 56.30 8,579 1.03 % 45.77 1,577 1.08 % 36.73 109 0.58 % 27.55 9,200 0.42 % 28.94 enajst enajst enajs t 49,499 0.67 % 43.62 0 0 % 0 1,534 1.35 % 38.62 28,527 0.70 % 52.56 5,959 0.71 % 31.80 615 0.42 % 14.32 50 0.27 % 12.64 12,814 0.59 
% 40.30 trideset trideset tride set 47,140 0.64 % 41.54 0 0 % 0 2,690 2.36 % 67.73 27,199 0.67 % 50.12 9,456 1.13 % 50.45 1,273 0.87 % 29.65 167 0.89 % 42.22 6,355 0.29 % 19.99 deseti deseti deset i 41,418 0.56 % 36.50 0 0 % 0 692 0.61 % 17.42 22,075 0.54 % 40.67 4,406 0.53 % 23.51 629 0.43 % 14.65 60 0.32 % 15.17 13,556 0.62 % 42.64 deveti deveti devet i 38,542 0.52 % 33.97 0 0 % 0 714 0.63 % 17.98 19,195 0.47 % 35.37 3,512 0.42 % 18.74 578 0.40 % 13.46 79 0.42 % 19.97 14,464 0.66 % 45.49 petdeset petdeset petde set 35,974 0.49 % 31.70 0 0 % 0 2,539 2.23 % 63.93 20,873 0.51 % 38.46 7,067 0.85 % 37.71 1,066 0.73 % 24.83 72 0.39 % 18.20 4,357 0.20 % 13.70 19,30 19,30 19,30 34,632 0.47 % 30.52 0 0 % 0 4 0 % 0.10 30,009 0.74 % 55.29 717 0.09 % 3.83 20 0.01 % 0.47 2 0.01 % 0.51 3,880 0.18 % 12.20 10,000 10,000 10,00 0 31,107 0.42 % 27.41 0 0 % 0 36 0.03 % 0.91 15,753 0.39 % 29.03 4,273 0.51 % 22.80 494 0.34 % 11.51 111 0.59 % 28.06 10,440 0.48 % 32.84 štirinajst štirinajst štiri najst 29,291 0.40 % 25.81 0 0 % 0 1,335 1.17 % 33.61 18,357 0.45 % 33.82 5,206 0.62 % 27.78 627 0.43 % 14.60 48 0.26 % 12.13 3,718 0.17 % 11.69 20,00 20,00 20,00 28,807 0.39 % 25.39 0 0 % 0 2 0 % 0.05 22,837 0.56 % 42.08 1,244 0.15 % 6.64 62 0.04 % 1.44 15 0.08 % 3.79 4,647 0.21 % 14.62 osemdeseti osemdeseti osemd eseti 27,174 0.37 % 23.95 0 0 % 0 276 0.24 % 6.95 14,169 0.35 % 26.11 7,078 0.85 % 37.77 1,373 0.94 % 31.98 37 0.20 % 9.35 4,241 0.20 % 13.34 19,00 19,00 19,00 27,027 0.37 % 23.82 0 0 % 0 7 0.01 % 0.18 20,005 0.49 % 36.86 752 0.09 % 4.01 63 0.04 % 1.47 2 0.01 % 0.51 6,198 0.28 % 19.49 100,000 100,000 100,0 0 25,443 0.35 % 22.42 0 0 % 0 29 0.03 % 0.73 12,044 0.30 % 22.19 3,426 0.41 % 18.28 354 0.24 % 8.25 118 0.63 % 29.83 9,472 0.43 % 29.79 devetdeseti devetdeseti devet deseti 24,539 0.33 % 21.63 0 0 % 0 217 0.19 % 5.46 14,140 0.35 % 26.05 4,995 0.60 % 26.65 979 0.67 % 22.80 32 0.17 % 8.09 4,176 0.19 % 13.13 18,00 18,00 18,00 23,606 0.32 % 20.80 0 0 % 0 6 0.01 % 0.15 17,269 0.42 % 
31.82 632 0.08 % 3.37 112 0.08 % 2.61 12 0.06 % 3.03 5,575 0.26 % 17.53
štirideset štirideset štiri deset 23,199 0.32 % 20.45 0 0 % 0 1,769 1.55 % 44.54 13,090 0.32 % 24.12 4,667 0.56 % 24.90 786 0.54 % 18.31 31 0.17 % 7.84 2,856 0.13 % 8.98
trinajst trinajst trina jst 22,766 0.31 % 20.06 0 0 % 0 736 0.65 % 18.53 13,808 0.34 % 25.44 3,650 0.44 % 19.48 473 0.32 % 11.02 21 0.11 % 5.31 4,078 0.19 % 12.83
dvesto dvesto dvest o 21,911 0.30 % 19.31 0 0 % 0 1,170 1.03 % 29.46 13,791 0.34 % 25.41 3,592 0.43 % 19.17 581 0.40 % 13.53 21 0.11 % 5.31 2,756 0.13 % 8.67
20,000 20,000 20,00 0 20,058 0.27 % 17.68 0 0 % 0 30 0.03 % 0.76 10,180 0.25 % 18.76 2,542 0.30 % 13.56 314 0.21 % 7.31 73 0.39 % 18.45 6,919 0.32 % 21.76

1.2.126 List of final character-level 1-grams from numeral lemmas in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-word_parts-numerals-lemmas-final-1grams-taxonomy-entire.tsv
Columns: Lemma; Lemma (lower-case); Rest of the word; Final part of the word; Total absolute frequency of lemma; Percentage of all found lemmas; Total relative frequency (per million occurrences); Absolute frequency [SSJ.T.K.N]; Percentage [SSJ.T.K.N]; Relative frequency [SSJ.T.K.N]; Absolute frequency [SSJ.T.K.L]; Percentage [SSJ.T.K.L]; Relative frequency [SSJ.T.K.L]; Absolute frequency [SSJ.T.P.C]; Percentage [SSJ.T.P.C]; Relative frequency [SSJ.T.P.C]; Absolute frequency [SSJ.T.P.R]; Percentage [SSJ.T.P.R]; Relative frequency [SSJ.T.P.R]; Absolute frequency [SSJ.T.K.S]; Percentage [SSJ.T.K.S]; Relative frequency [SSJ.T.K.S]; Absolute frequency [SSJ.T.D]; Percentage [SSJ.T.D]; Relative frequency [SSJ.T.D]; Absolute frequency [SSJ.I]; Percentage [SSJ.I]; Relative frequency [SSJ.I]
drug drug dru g 3,399,068 8.45 % 2,995.58 19 4.38 % 1,957.35 110,912 24.64 % 2,792.64 1,577,458 7.55 % 2,906.57 551,520 10.49 % 2,942.73 157,344 10.86 % 3,664.73 22,761 14.19 % 5,753.74 979,054 8.15 % 3,079.38
prvi prvi prv i 2,198,324
5.46 % 1,937.37 2 0.46 % 206.04 31,830 7.07 % 801.44 1,053,085 5.04 % 1,940.38 295,479 5.62 % 1,576.58 58,824 4.06 % 1,370.08 9,684 6.04 % 2,448.01 749,420 6.24 % 2,357.12 en en e n 1,710,696 4.25 % 1,507.63 21 4.84 % 2,163.39 67,198 14.93 % 1,691.97 762,842 3.65 % 1,405.59 317,026 6.03 % 1,691.55 66,950 4.62 % 1,559.35 6,327 3.94 % 1,599.40 490,332 4.08 % 1,542.22 dva dva dv a 1,632,916 4.06 % 1,439.08 9 2.07 % 927.17 41,991 9.33 % 1,057.29 804,557 3.85 % 1,482.45 255,248 4.86 % 1,361.92 45,807 3.16 % 1,066.90 3,602 2.25 % 910.55 481,702 4.01 % 1,515.08 trije trije trij e 1,077,597 2.68 % 949.68 4 0.92 % 412.07 24,134 5.36 % 607.67 554,125 2.65 % 1,021.01 148,979 2.83 % 794.90 24,987 1.72 % 581.98 2,226 1.39 % 562.71 323,142 2.69 % 1,016.37 štirje štirje štirj e 562,863 1.40 % 496.05 2 0.46 % 206.04 10,338 2.30 % 260.30 287,367 1.38 % 529.49 76,421 1.45 % 407.76 10,991 0.76 % 255.99 649 0.41 % 164.06 177,095 1.47 % 557.01 pet pet pe t 490,799 1.22 % 432.54 1 0.23 % 103.02 9,434 2.10 % 237.54 250,598 1.20 % 461.74 63,012 1.20 % 336.21 7,471 0.52 % 174.01 917 0.57 % 231.81 159,366 1.33 % 501.25 1 1 1 473,724 1.18 % 417.49 34 7.83 % 3,502.63 1,167 0.26 % 29.38 224,982 1.08 % 414.54 75,917 1.44 % 405.07 31,719 2.19 % 738.77 5,780 3.60 % 1,461.12 134,125 1.12 % 421.86 tretji tretji tretj i 454,803 1.13 % 400.82 0 0 % 0 5,843 1.30 % 147.12 221,805 1.06 % 408.69 50,869 0.97 % 271.42 11,061 0.76 % 257.62 2,681 1.67 % 677.73 162,544 1.35 % 511.24 2 2 2 448,702 1.11 % 395.44 32 7.37 % 3,296.59 870 0.19 % 21.91 232,497 1.11 % 428.39 74,329 1.41 % 396.60 26,758 1.85 % 623.23 5,495 3.43 % 1,389.08 108,721 0.91 % 341.96 deset deset dese t 366,728 0.91 % 323.20 0 0 % 0 9,231 2.05 % 232.43 193,409 0.93 % 356.37 51,610 0.98 % 275.37 5,416 0.37 % 126.15 459 0.29 % 116.03 106,603 0.89 % 335.29 eden eden ede n 339,502 0.84 % 299.20 0 0 % 0 9,711 2.16 % 244.51 157,403 0.75 % 290.03 57,846 1.10 % 308.65 10,361 0.71 % 241.32 694 0.43 % 175.44 103,487 0.86 % 325.49 1, 1, 1 , 338,262 0.84 
% 298.11 0 0 % 0 729 0.16 % 18.36 212,046 1.01 % 390.71 30,858 0.59 % 164.65 10,574 0.73 % 246.28 3,351 2.09 % 847.10 80,704 0.67 % 253.84 3 3 3 334,249 0.83 % 294.57 20 4.61 % 2,060.37 684 0.15 % 17.22 178,016 0.85 % 328.01 52,468 1.00 % 279.95 20,047 1.38 % 466.92 3,802 2.37 % 961.10 79,212 0.66 % 249.14 20 20 2 0 318,181 0.79 % 280.41 22 5.07 % 2,266.41 360 0.08 % 9.06 161,967 0.78 % 298.43 43,208 0.82 % 230.54 9,708 0.67 % 226.11 928 0.58 % 234.59 101,988 0.85 % 320.78 tisoč tisoč tiso č 305,442 0.76 % 269.18 0 0 % 0 5,271 1.17 % 132.72 195,780 0.94 % 360.74 37,622 0.72 % 200.74 3,442 0.24 % 80.17 270 0.17 % 68.25 63,057 0.53 % 198.33 4 4 4 303,784 0.76 % 267.72 12 2.77 % 1,236.22 467 0.10 % 11.76 165,356 0.79 % 304.68 46,384 0.88 % 247.49 16,263 1.12 % 378.78 2,293 1.43 % 579.65 73,009 0.61 % 229.63 šest šest šes t 303,010 0.75 % 267.04 0 0 % 0 5,487 1.22 % 138.16 154,063 0.74 % 283.87 36,091 0.69 % 192.57 4,534 0.31 % 105.60 741 0.46 % 187.32 102,094 0.85 % 321.11 10 10 1 0 296,783 0.74 % 261.55 21 4.84 % 2,163.39 401 0.09 % 10.10 142,870 0.68 % 263.25 49,185 0.94 % 262.43 13,444 0.93 % 313.13 1,236 0.77 % 312.45 89,626 0.75 % 281.90 5 5 5 265,501 0.66 % 233.98 14 3.23 % 1,442.26 549 0.12 % 13.82 141,453 0.68 % 260.64 45,080 0.86 % 240.53 15,334 1.06 % 357.15 1,690 1.05 % 427.21 61,381 0.51 % 193.06 15 15 1 5 265,493 0.66 % 233.98 19 4.38 % 1,957.35 308 0.07 % 7.76 133,739 0.64 % 246.42 35,793 0.68 % 190.98 8,377 0.58 % 195.11 753 0.47 % 190.35 86,504 0.72 % 272.08 30 30 3 0 251,613 0.62 % 221.75 20 4.61 % 2,060.37 282 0.06 % 7.10 128,479 0.61 % 236.73 38,641 0.73 % 206.18 7,698 0.53 % 179.30 1,127 0.70 % 284.89 75,366 0.63 % 237.05 6 6 6 241,633 0.60 % 212.95 2 0.46 % 206.04 314 0.07 % 7.91 144,313 0.69 % 265.91 28,665 0.55 % 152.95 10,067 0.69 % 234.47 1,004 0.63 % 253.80 57,268 0.48 % 180.12 2, 2, 2 , 238,500 0.59 % 210.19 0 0 % 0 327 0.07 % 8.23 154,120 0.74 % 283.98 23,331 0.44 % 124.49 8,832 0.61 % 205.71 3,076 1.92 % 777.58 48,814 0.41 % 153.53 sedem 
sedem sede m 221,597 0.55 % 195.29 0 0 % 0 4,599 1.02 % 115.80 112,630 0.54 % 207.53 25,309 0.48 % 135.04 3,382 0.23 % 78.77 261 0.16 % 65.98 75,416 0.63 % 237.20
12 12 1 2 214,333 0.53 % 188.89 5 1.15 % 515.09 297 0.07 % 7.48 109,658 0.53 % 202.05 25,719 0.49 % 137.23 6,921 0.48 % 161.20 479 0.30 % 121.09 71,254 0.59 % 224.11
3, 3, 3 , 214,002 0.53 % 188.60 0 0 % 0 288 0.06 % 7.25 143,636 0.69 % 264.66 19,566 0.37 % 104.40 7,182 0.50 % 167.28 2,450 1.53 % 619.33 40,880 0.34 % 128.58
osem osem ose m 208,962 0.52 % 184.16 0 0 % 0 3,541 0.79 % 89.16 108,342 0.52 % 199.63 23,456 0.45 % 125.15 2,867 0.20 % 66.78 416 0.26 % 105.16 70,340 0.58 % 221.24
50 50 5 0 208,691 0.52 % 183.92 8 1.84 % 824.15 222 0.05 % 5.59 107,858 0.52 % 198.74 32,852 0.62 % 175.29 5,759 0.40 % 134.13 785 0.49 % 198.44 61,207 0.51 % 192.51
40 40 4 0 180,779 0.45 % 159.32 5 1.15 % 515.09 198 0.04 % 4.99 93,495 0.45 % 172.27 27,276 0.52 % 145.54 4,939 0.34 % 115.04 445 0.28 % 112.49 54,421 0.45 % 171.17
100 100 10 0 180,711 0.45 % 159.26 11 2.54 % 1,133.20 196 0.04 % 4.94 85,233 0.41 % 157.05 36,302 0.69 % 193.70 5,283 0.36 % 123.05 698 0.43 % 176.45 52,988 0.44 % 166.66
25 25 2 5 177,837 0.44 % 156.73 9 2.07 % 927.17 224 0.05 % 5.64 93,905 0.45 % 173.03 22,798 0.43 % 121.64 5,172 0.36 % 120.46 417 0.26 % 105.41 55,312 0.46 % 173.97

1.2.127 List of final character-level 2-grams from numeral lemmas in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-word_parts-numerals-lemmas-final-2grams-taxonomy-entire.tsv
Columns: Lemma; Lemma (lower-case); Rest of the word; Final part of the word; Total absolute frequency of lemma; Percentage of all found lemmas; Total relative frequency (per million occurrences); Absolute frequency [SSJ.T.K.N]; Percentage [SSJ.T.K.N]; Relative frequency [SSJ.T.K.N]; Absolute frequency [SSJ.T.K.L]; Percentage [SSJ.T.K.L]; Relative frequency [SSJ.T.K.L]; Absolute frequency [SSJ.T.P.C]; Percentage [SSJ.T.P.C]; Relative frequency [SSJ.T.P.C]; Absolute frequency [SSJ.T.P.R]; Percentage [SSJ.T.P.R]; Relative frequency [SSJ.T.P.R]; Absolute frequency [SSJ.T.K.S]; Percentage [SSJ.T.K.S]; Relative frequency [SSJ.T.K.S]; Absolute frequency [SSJ.T.D]; Percentage [SSJ.T.D]; Relative frequency [SSJ.T.D]; Absolute frequency [SSJ.I]; Percentage [SSJ.I]; Relative frequency [SSJ.I]
drug drug dr ug 3,399,068 9.04 % 2,995.58 19 6.17 % 1,957.35 110,912 24.93 % 2,792.64 1,577,458 8.09 % 2,906.57 551,520 11.35 % 2,942.73 157,344 12.10 % 3,664.73 22,761 16.44 % 5,753.74 979,054 8.63 % 3,079.38
prvi prvi pr vi 2,198,324 5.85 % 1,937.37 2 0.65 % 206.04 31,830 7.16 % 801.44 1,053,085 5.40 % 1,940.38 295,479 6.08 % 1,576.58 58,824 4.52 % 1,370.08 9,684 6.99 % 2,448.01 749,420 6.60 % 2,357.12
en en en 1,710,696 4.55 % 1,507.63 21 6.82 % 2,163.39 67,198 15.10 % 1,691.97 762,842 3.91 % 1,405.59 317,026 6.53 % 1,691.55 66,950 5.15 % 1,559.35 6,327 4.57 % 1,599.40 490,332 4.32 % 1,542.22
dva dva d va 1,632,916 4.34 % 1,439.08 9 2.92 % 927.17 41,991 9.44 % 1,057.29 804,557 4.12 % 1,482.45 255,248 5.25 % 1,361.92 45,807 3.52 % 1,066.90 3,602 2.60 % 910.55 481,702 4.25 % 1,515.08
trije trije tri je 1,077,597 2.87 % 949.68 4 1.30 % 412.07 24,134 5.42 % 607.67 554,125 2.84 % 1,021.01 148,979 3.07 % 794.90 24,987 1.92 % 581.98 2,226 1.61 % 562.71 323,142 2.85 % 1,016.37
štirje štirje štir je 562,863 1.50 % 496.05 2 0.65 % 206.04 10,338 2.32 % 260.30 287,367 1.47 % 529.49 76,421 1.57 % 407.76 10,991 0.84 % 255.99 649 0.47 % 164.06 177,095 1.56 % 557.01
pet pet p et 490,799 1.30 % 432.54 1 0.33 % 103.02 9,434 2.12 % 237.54 250,598 1.28 % 461.74 63,012 1.30 % 336.21 7,471 0.57 % 174.01 917 0.66 % 231.81 159,366 1.40 % 501.25
tretji tretji tret ji 454,803 1.21 % 400.82 0 0 % 0 5,843 1.31 % 147.12 221,805 1.14 % 408.69 50,869 1.05 % 271.42 11,061 0.85 % 257.62 2,681 1.94 % 677.73 162,544 1.43 % 511.24
deset deset des et 366,728 0.97 % 323.20 0 0 % 0 9,231 2.08 % 232.43 193,409 0.99 % 356.37 51,610
1.06 % 275.37 5,416 0.42 % 126.15 459 0.33 % 116.03 106,603 0.94 % 335.29 eden eden ed en 339,502 0.90 % 299.20 0 0 % 0 9,711 2.18 % 244.51 157,403 0.81 % 290.03 57,846 1.19 % 308.65 10,361 0.80 % 241.32 694 0.50 % 175.44 103,487 0.91 % 325.49 1, 1, 1, 338,262 0.90 % 298.11 0 0 % 0 729 0.16 % 18.36 212,046 1.09 % 390.71 30,858 0.64 % 164.65 10,574 0.81 % 246.28 3,351 2.42 % 847.10 80,704 0.71 % 253.84 20 20 20 318,181 0.85 % 280.41 22 7.14 % 2,266.41 360 0.08 % 9.06 161,967 0.83 % 298.43 43,208 0.89 % 230.54 9,708 0.75 % 226.11 928 0.67 % 234.59 101,988 0.90 % 320.78 tisoč tisoč tis oč 305,442 0.81 % 269.18 0 0 % 0 5,271 1.19 % 132.72 195,780 1.00 % 360.74 37,622 0.78 % 200.74 3,442 0.27 % 80.17 270 0.20 % 68.25 63,057 0.56 % 198.33 šest šest še st 303,010 0.81 % 267.04 0 0 % 0 5,487 1.23 % 138.16 154,063 0.79 % 283.87 36,091 0.74 % 192.57 4,534 0.35 % 105.60 741 0.54 % 187.32 102,094 0.90 % 321.11 10 10 10 296,783 0.79 % 261.55 21 6.82 % 2,163.39 401 0.09 % 10.10 142,870 0.73 % 263.25 49,185 1.01 % 262.43 13,444 1.03 % 313.13 1,236 0.89 % 312.45 89,626 0.79 % 281.90 15 15 15 265,493 0.71 % 233.98 19 6.17 % 1,957.35 308 0.07 % 7.76 133,739 0.69 % 246.42 35,793 0.74 % 190.98 8,377 0.64 % 195.11 753 0.54 % 190.35 86,504 0.76 % 272.08 30 30 30 251,613 0.67 % 221.75 20 6.49 % 2,060.37 282 0.06 % 7.10 128,479 0.66 % 236.73 38,641 0.80 % 206.18 7,698 0.59 % 179.30 1,127 0.81 % 284.89 75,366 0.66 % 237.05 2, 2, 2, 238,500 0.63 % 210.19 0 0 % 0 327 0.07 % 8.23 154,120 0.79 % 283.98 23,331 0.48 % 124.49 8,832 0.68 % 205.71 3,076 2.22 % 777.58 48,814 0.43 % 153.53 sedem sedem sed em 221,597 0.59 % 195.29 0 0 % 0 4,599 1.03 % 115.80 112,630 0.58 % 207.53 25,309 0.52 % 135.04 3,382 0.26 % 78.77 261 0.19 % 65.98 75,416 0.67 % 237.20 12 12 12 214,333 0.57 % 188.89 5 1.62 % 515.09 297 0.07 % 7.48 109,658 0.56 % 202.05 25,719 0.53 % 137.23 6,921 0.53 % 161.20 479 0.35 % 121.09 71,254 0.63 % 224.11 3, 3, 3, 214,002 0.57 % 188.60 0 0 % 0 288 0.07 % 7.25 143,636 0.74 % 264.66 19,566 
0.40 % 104.40 7,182 0.55 % 167.28 2,450 1.77 % 619.33 40,880 0.36 % 128.58
osem osem os em 208,962 0.56 % 184.16 0 0 % 0 3,541 0.80 % 89.16 108,342 0.56 % 199.63 23,456 0.48 % 125.15 2,867 0.22 % 66.78 416 0.30 % 105.16 70,340 0.62 % 221.24
50 50 50 208,691 0.56 % 183.92 8 2.60 % 824.15 222 0.05 % 5.59 107,858 0.55 % 198.74 32,852 0.68 % 175.29 5,759 0.44 % 134.13 785 0.57 % 198.44 61,207 0.54 % 192.51
40 40 40 180,779 0.48 % 159.32 5 1.62 % 515.09 198 0.04 % 4.99 93,495 0.48 % 172.27 27,276 0.56 % 145.54 4,939 0.38 % 115.04 445 0.32 % 112.49 54,421 0.48 % 171.17
100 100 1 0 180,711 0.48 % 159.26 11 3.57 % 1,133.20 196 0.04 % 4.94 85,233 0.44 % 157.05 36,302 0.75 % 193.70 5,283 0.41 % 123.05 698 0.50 % 176.45 52,988 0.47 % 166.66
25 25 25 177,837 0.47 % 156.73 9 2.92 % 927.17 224 0.05 % 5.64 93,905 0.48 % 173.03 22,798 0.47 % 121.64 5,172 0.40 % 120.46 417 0.30 % 105.41 55,312 0.49 % 173.97
14 14 14 169,856 0.45 % 149.69 0 0 % 0 227 0.05 % 5.72 90,968 0.47 % 167.61 15,762 0.32 % 84.10 4,747 0.36 % 110.56 314 0.23 % 79.38 57,838 0.51 % 181.92
16 16 16 163,126 0.43 % 143.76 1 0.33 % 103.02 201 0.04 % 5.06 85,630 0.44 % 157.78 18,486 0.38 % 98.64 4,603 0.35 % 107.21 231 0.17 % 58.39 53,974 0.48 % 169.76
18 18 18 159,374 0.42 % 140.46 3 0.97 % 309.06 192 0.04 % 4.83 84,973 0.43 % 156.57 15,991 0.33 % 85.32 4,219 0.32 % 98.27 456 0.33 % 115.27 53,540 0.47 % 168.40
četrti četrti četr ti 158,938 0.42 % 140.07 0 0 % 0 1,901 0.43 % 47.87 76,520 0.39 % 140.99 15,548 0.32 % 82.96 3,282 0.25 % 76.44 926 0.67 % 234.08 60,761 0.54 % 191.11
20, 20, 2 0, 157,565 0.42 % 138.86 0 0 % 0 297 0.07 % 7.48 89,595 0.46 % 165.08 12,425 0.26 % 66.30 4,708 0.36 % 109.65 571 0.41 % 144.34 49,969 0.44 % 157.17
11 11 11 153,446 0.41 % 135.23 0 0 % 0 179 0.04 % 4.51 80,081 0.41 % 147.55 14,037 0.29 % 74.90 4,536 0.35 % 105.65 300 0.22 % 75.84 54,313 0.48 % 170.83

1.2.128 List of final character-level 3-grams from numeral lemmas in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-word_parts-numerals-lemmas-final-3grams-taxonomy-entire.tsv
Columns: Lemma; Lemma (lower-case); Rest of the word; Final part of the word; Total absolute frequency of lemma; Percentage of all found lemmas; Total relative frequency (per million occurrences); Absolute frequency [SSJ.T.K.N]; Percentage [SSJ.T.K.N]; Relative frequency [SSJ.T.K.N]; Absolute frequency [SSJ.T.K.L]; Percentage [SSJ.T.K.L]; Relative frequency [SSJ.T.K.L]; Absolute frequency [SSJ.T.P.C]; Percentage [SSJ.T.P.C]; Relative frequency [SSJ.T.P.C]; Absolute frequency [SSJ.T.P.R]; Percentage [SSJ.T.P.R]; Relative frequency [SSJ.T.P.R]; Absolute frequency [SSJ.T.K.S]; Percentage [SSJ.T.K.S]; Relative frequency [SSJ.T.K.S]; Absolute frequency [SSJ.T.D]; Percentage [SSJ.T.D]; Relative frequency [SSJ.T.D]; Absolute frequency [SSJ.I]; Percentage [SSJ.I]; Relative frequency [SSJ.I]
drug drug d rug 3,399,068 12.12 % 2,995.58 19 13.77 % 1,957.35 110,912 30.40 % 2,792.64 1,577,458 10.87 % 2,906.57 551,520 15.30 % 2,942.73 157,344 16.58 % 3,664.73 22,761 23.55 % 5,753.74 979,054 11.49 % 3,079.38
prvi prvi p rvi 2,198,324 7.84 % 1,937.37 2 1.45 % 206.04 31,830 8.72 % 801.44 1,053,085 7.25 % 1,940.38 295,479 8.20 % 1,576.58 58,824 6.20 % 1,370.08 9,684 10.02 % 2,448.01 749,420 8.80 % 2,357.12
dva dva dva 1,632,916 5.82 % 1,439.08 9 6.52 % 927.17 41,991 11.51 % 1,057.29 804,557 5.54 % 1,482.45 255,248 7.08 % 1,361.92 45,807 4.83 % 1,066.90 3,602 3.73 % 910.55 481,702 5.65 % 1,515.08
trije trije tr ije 1,077,597 3.84 % 949.68 4 2.90 % 412.07 24,134 6.62 % 607.67 554,125 3.82 % 1,021.01 148,979 4.13 % 794.90 24,987 2.63 % 581.98 2,226 2.30 % 562.71 323,142 3.79 % 1,016.37
štirje štirje šti rje 562,863 2.01 % 496.05 2 1.45 % 206.04 10,338 2.83 % 260.30 287,367 1.98 % 529.49 76,421 2.12 % 407.76 10,991 1.16 % 255.99 649 0.67 % 164.06 177,095 2.08 % 557.01
pet pet pet 490,799 1.75 % 432.54 1 0.72 % 103.02 9,434 2.59 % 237.54 250,598 1.73 % 461.74
63,012 1.75 % 336.21 7,471 0.79 % 174.01 917 0.95 % 231.81 159,366 1.87 % 501.25 tretji tretji tre tji 454,803 1.62 % 400.82 0 0 % 0 5,843 1.60 % 147.12 221,805 1.53 % 408.69 50,869 1.41 % 271.42 11,061 1.17 % 257.62 2,681 2.77 % 677.73 162,544 1.91 % 511.24 deset deset de set 366,728 1.31 % 323.20 0 0 % 0 9,231 2.53 % 232.43 193,409 1.33 % 356.37 51,610 1.43 % 275.37 5,416 0.57 % 126.15 459 0.47 % 116.03 106,603 1.25 % 335.29 eden eden e den 339,502 1.21 % 299.20 0 0 % 0 9,711 2.66 % 244.51 157,403 1.08 % 290.03 57,846 1.60 % 308.65 10,361 1.09 % 241.32 694 0.72 % 175.44 103,487 1.22 % 325.49 tisoč tisoč ti soč 305,442 1.09 % 269.18 0 0 % 0 5,271 1.45 % 132.72 195,780 1.35 % 360.74 37,622 1.04 % 200.74 3,442 0.36 % 80.17 270 0.28 % 68.25 63,057 0.74 % 198.33 šest šest š est 303,010 1.08 % 267.04 0 0 % 0 5,487 1.50 % 138.16 154,063 1.06 % 283.87 36,091 1.00 % 192.57 4,534 0.48 % 105.60 741 0.77 % 187.32 102,094 1.20 % 321.11 sedem sedem se dem 221,597 0.79 % 195.29 0 0 % 0 4,599 1.26 % 115.80 112,630 0.78 % 207.53 25,309 0.70 % 135.04 3,382 0.36 % 78.77 261 0.27 % 65.98 75,416 0.89 % 237.20 osem osem o sem 208,962 0.74 % 184.16 0 0 % 0 3,541 0.97 % 89.16 108,342 0.75 % 199.63 23,456 0.65 % 125.15 2,867 0.30 % 66.78 416 0.43 % 105.16 70,340 0.83 % 221.24 100 100 100 180,711 0.64 % 159.26 11 7.97 % 1,133.20 196 0.05 % 4.94 85,233 0.59 % 157.05 36,302 1.01 % 193.70 5,283 0.56 % 123.05 698 0.72 % 176.45 52,988 0.62 % 166.66 četrti četrti čet rti 158,938 0.57 % 140.07 0 0 % 0 1,901 0.52 % 47.87 76,520 0.53 % 140.99 15,548 0.43 % 82.96 3,282 0.35 % 76.44 926 0.96 % 234.08 60,761 0.71 % 191.11 20, 20, 20, 157,565 0.56 % 138.86 0 0 % 0 297 0.08 % 7.48 89,595 0.62 % 165.08 12,425 0.34 % 66.30 4,708 0.50 % 109.65 571 0.59 % 144.34 49,969 0.59 % 157.17 sto sto sto 152,879 0.55 % 134.73 0 0 % 0 4,718 1.29 % 118.79 84,745 0.58 % 156.15 21,926 0.61 % 116.99 3,217 0.34 % 74.93 184 0.19 % 46.51 38,089 0.45 % 119.80 2,000 2,000 2 0 148,688 0.53 % 131.04 0 0 % 0 245 0.07 % 6.17 
89,618 0.62 % 165.13 26,937 0.75 % 143.73 5,880 0.62 % 136.95 159 0.16 % 40.19 25,849 0.30 % 81.30 19, 19, 19, 144,490 0.52 % 127.34 0 0 % 0 334 0.09 % 8.41 83,071 0.57 % 153.06 11,758 0.33 % 62.74 6,449 0.68 % 150.20 689 0.71 % 174.17 42,189 0.49 % 132.70 10, 10, 10, 140,661 0.50 % 123.96 0 0 % 0 237 0.07 % 5.97 83,115 0.57 % 153.14 11,021 0.31 % 58.80 2,755 0.29 % 64.17 682 0.70 % 172.40 42,851 0.50 % 134.78 18, 18, 18, 136,255 0.49 % 120.08 0 0 % 0 273 0.07 % 6.87 80,084 0.55 % 147.56 10,192 0.28 % 54.38 4,116 0.43 % 95.87 516 0.53 % 130.44 41,074 0.48 % 129.19 devet devet de vet 134,561 0.48 % 118.59 2 1.45 % 206.04 2,497 0.68 % 62.87 68,635 0.47 % 126.46 14,329 0.40 % 76.45 1,695 0.18 % 39.48 117 0.12 % 29.58 47,286 0.56 % 148.73 15, 15, 15, 133,142 0.47 % 117.34 0 0 % 0 432 0.12 % 10.88 75,868 0.52 % 139.79 10,427 0.29 % 55.64 3,144 0.33 % 73.23 565 0.58 % 142.83 42,706 0.50 % 134.32 11, 11, 11, 127,423 0.45 % 112.30 0 0 % 0 267 0.07 % 6.72 69,899 0.48 % 128.79 10,413 0.29 % 55.56 2,697 0.28 % 62.82 554 0.57 % 140.05 43,593 0.51 % 137.11 200 200 200 124,316 0.44 % 109.56 15 10.87 % 1,545.28 119 0.03 % 3 68,266 0.47 % 125.78 20,159 0.56 % 107.56 2,773 0.29 % 64.59 210 0.22 % 53.09 32,774 0.39 % 103.08 12, 12, 12, 122,524 0.44 % 107.98 0 0 % 0 219 0.06 % 5.51 65,640 0.45 % 120.95 10,009 0.28 % 53.40 3,207 0.34 % 74.69 552 0.57 % 139.54 42,897 0.50 % 134.92 17, 17, 17, 121,238 0.43 % 106.85 0 0 % 0 244 0.07 % 6.14 70,583 0.49 % 130.05 8,629 0.24 % 46.04 3,183 0.34 % 74.14 466 0.48 % 117.80 38,133 0.45 % 119.94 16, 16, 16, 112,174 0.40 % 98.86 0 0 % 0 278 0.08 % 7 63,412 0.44 % 116.84 8,435 0.23 % 45.01 3,340 0.35 % 77.79 499 0.52 % 126.14 36,210 0.42 % 113.89 2,007 2,007 2 7 108,871 0.39 % 95.95 0 0 % 0 117 0.03 % 2.95 56,465 0.39 % 104.04 14,573 0.40 % 77.76 4,368 0.46 % 101.74 11 0.01 % 2.78 33,337 0.39 % 104.85 14, 14, 14, 108,240 0.39 % 95.39 0 0 % 0 217 0.06 % 5.46 59,413 0.41 % 109.47 7,886 0.22 % 42.08 2,528 0.27 % 58.88 507 0.52 % 128.16 37,689 0.44 % 
118.54
13, 13, 13, 106,668 0.38 % 94.01 0 0 % 0 143 0.04 % 3.60 58,794 0.41 % 108.33 7,834 0.22 % 41.80 2,721 0.29 % 63.38 532 0.55 % 134.48 36,644 0.43 % 115.25
2,008 2,008 2 8 104,919 0.37 % 92.46 0 0 % 0 95 0.03 % 2.39 47,198 0.33 % 86.97 11,128 0.31 % 59.38 2,811 0.30 % 65.47 30 0.03 % 7.58 43,657 0.51 % 137.31

1.2.129 List of final character-level 4-grams from numeral lemmas in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-word_parts-numerals-lemmas-final-4grams-taxonomy-entire.tsv
Lemma | Lemma (lower-case) | Rest of the word | Final part of the word | Total absolute frequency of lemma | Percentage of all found lemmas | Total relative frequency (per million occurrences) | Absolute frequency [SSJ.T.K.N] | Percentage [SSJ.T.K.N] | Relative frequency [SSJ.T.K.N] | Absolute frequency [SSJ.T.K.L] | Percentage [SSJ.T.K.L] | Relative frequency [SSJ.T.K.L] | Absolute frequency [SSJ.T.P.C] | Percentage [SSJ.T.P.C] | Relative frequency [SSJ.T.P.C] | Absolute frequency [SSJ.T.P.R] | Percentage [SSJ.T.P.R] | Relative frequency [SSJ.T.P.R] | Absolute frequency [SSJ.T.K.S] | Percentage [SSJ.T.K.S] | Relative frequency [SSJ.T.K.S] | Absolute frequency [SSJ.T.D] | Percentage [SSJ.T.D] | Relative frequency [SSJ.T.D] | Absolute frequency [SSJ.I] | Percentage [SSJ.I] | Relative frequency [SSJ.I]
drug drug drug 3,399,068 18.08 % 2,995.58 19 61.29 % 1,957.35 110,912 37.42 % 2,792.64 1,577,458 16.61 % 2,906.57 551,520 22.33 % 2,942.73 157,344 23.70 % 3,664.73 22,761 32.85 % 5,753.74 979,054 16.87 % 3,079.38
prvi prvi prvi 2,198,324 11.70 % 1,937.37 2 6.45 % 206.04 31,830 10.74 % 801.44 1,053,085 11.09 % 1,940.38 295,479 11.96 % 1,576.58 58,824 8.86 % 1,370.08 9,684 13.98 % 2,448.01 749,420 12.91 % 2,357.12
trije trije t rije 1,077,597 5.73 % 949.68 4 12.90 % 412.07 24,134 8.14 % 607.67 554,125 5.84 % 1,021.01 148,979 6.03 % 794.90 24,987 3.76 % 581.98 2,226 3.21 % 562.71 323,142 5.57 % 1,016.37
štirje štirje št irje
562,863 2.99 % 496.05 2 6.45 % 206.04 10,338 3.49 % 260.30 287,367 3.03 % 529.49 76,421 3.09 % 407.76 10,991 1.66 % 255.99 649 0.94 % 164.06 177,095 3.05 % 557.01 tretji tretji tr etji 454,803 2.42 % 400.82 0 0 % 0 5,843 1.97 % 147.12 221,805 2.34 % 408.69 50,869 2.06 % 271.42 11,061 1.67 % 257.62 2,681 3.87 % 677.73 162,544 2.80 % 511.24 deset deset d eset 366,728 1.95 % 323.20 0 0 % 0 9,231 3.11 % 232.43 193,409 2.04 % 356.37 51,610 2.09 % 275.37 5,416 0.82 % 126.15 459 0.66 % 116.03 106,603 1.84 % 335.29 eden eden eden 339,502 1.81 % 299.20 0 0 % 0 9,711 3.28 % 244.51 157,403 1.66 % 290.03 57,846 2.34 % 308.65 10,361 1.56 % 241.32 694 1.00 % 175.44 103,487 1.78 % 325.49 tisoč tisoč t isoč 305,442 1.62 % 269.18 0 0 % 0 5,271 1.78 % 132.72 195,780 2.06 % 360.74 37,622 1.52 % 200.74 3,442 0.52 % 80.17 270 0.39 % 68.25 63,057 1.09 % 198.33 šest šest šest 303,010 1.61 % 267.04 0 0 % 0 5,487 1.85 % 138.16 154,063 1.62 % 283.87 36,091 1.46 % 192.57 4,534 0.68 % 105.60 741 1.07 % 187.32 102,094 1.76 % 321.11 sedem sedem s edem 221,597 1.18 % 195.29 0 0 % 0 4,599 1.55 % 115.80 112,630 1.19 % 207.53 25,309 1.02 % 135.04 3,382 0.51 % 78.77 261 0.38 % 65.98 75,416 1.30 % 237.20 osem osem osem 208,962 1.11 % 184.16 0 0 % 0 3,541 1.20 % 89.16 108,342 1.14 % 199.63 23,456 0.95 % 125.15 2,867 0.43 % 66.78 416 0.60 % 105.16 70,340 1.21 % 221.24 četrti četrti če trti 158,938 0.85 % 140.07 0 0 % 0 1,901 0.64 % 47.87 76,520 0.81 % 140.99 15,548 0.63 % 82.96 3,282 0.49 % 76.44 926 1.34 % 234.08 60,761 1.05 % 191.11 2,000 2,000 2,000 148,688 0.79 % 131.04 0 0 % 0 245 0.08 % 6.17 89,618 0.94 % 165.13 26,937 1.09 % 143.73 5,880 0.89 % 136.95 159 0.23 % 40.19 25,849 0.45 % 81.30 devet devet d evet 134,561 0.72 % 118.59 2 6.45 % 206.04 2,497 0.84 % 62.87 68,635 0.72 % 126.46 14,329 0.58 % 76.45 1,695 0.26 % 39.48 117 0.17 % 29.58 47,286 0.81 % 148.73 2,007 2,007 2,007 108,871 0.58 % 95.95 0 0 % 0 117 0.04 % 2.95 56,465 0.59 % 104.04 14,573 0.59 % 77.76 4,368 0.66 % 101.74 11 0.02 % 2.78 
33,337 0.57 % 104.85 2,008 2,008 2,008 104,919 0.56 % 92.46 0 0 % 0 95 0.03 % 2.39 47,198 0.50 % 86.97 11,128 0.45 % 59.38 2,811 0.42 % 65.47 30 0.04 % 7.58 43,657 0.75 % 137.31 2,004 2,004 2,004 98,632 0.53 % 86.92 0 0 % 0 106 0.04 % 2.67 55,211 0.58 % 101.73 17,052 0.69 % 90.98 3,950 0.59 % 92 38 0.06 % 9.61 22,275 0.38 % 70.06 2,006 2,006 2,006 98,449 0.52 % 86.76 0 0 % 0 122 0.04 % 3.07 52,575 0.55 % 96.87 15,004 0.61 % 80.06 4,316 0.65 % 100.52 48 0.07 % 12.13 26,384 0.46 % 82.98 2,005 2,005 2,005 88,134 0.47 % 77.67 0 0 % 0 88 0.03 % 2.22 46,840 0.49 % 86.31 15,419 0.62 % 82.27 3,549 0.54 % 82.66 51 0.07 % 12.89 22,187 0.38 % 69.78 2,002 2,002 2,002 86,866 0.46 % 76.55 0 0 % 0 118 0.04 % 2.97 50,869 0.54 % 93.73 15,965 0.65 % 85.18 3,731 0.56 % 86.90 39 0.06 % 9.86 16,144 0.28 % 50.78 dvajset dvajset dva jset 85,701 0.46 % 75.53 0 0 % 0 4,883 1.65 % 122.95 47,739 0.50 % 87.96 16,442 0.67 % 87.73 2,169 0.33 % 50.52 142 0.20 % 35.90 14,326 0.25 % 45.06 2,001 2,001 2,001 85,421 0.45 % 75.28 0 0 % 0 134 0.04 % 3.37 52,447 0.55 % 96.64 14,015 0.57 % 74.78 3,405 0.51 % 79.31 49 0.07 % 12.39 15,371 0.27 % 48.35 peti peti peti 85,266 0.45 % 75.14 0 0 % 0 1,304 0.44 % 32.83 41,917 0.44 % 77.23 9,024 0.36 % 48.15 1,482 0.22 % 34.52 375 0.54 % 94.80 31,164 0.54 % 98.02 2,003 2,003 2,003 83,355 0.44 % 73.46 0 0 % 0 109 0.04 % 2.74 45,807 0.48 % 84.40 16,737 0.68 % 89.30 3,901 0.59 % 90.86 33 0.05 % 8.34 16,768 0.29 % 52.74 2,009 2,009 2,009 79,900 0.42 % 70.42 0 0 % 0 70 0.02 % 1.76 28,954 0.30 % 53.35 6,586 0.27 % 35.14 964 0.14 % 22.45 21 0.03 % 5.31 43,305 0.75 % 136.21 2,010 2,010 2,010 79,410 0.42 % 69.98 0 0 % 0 72 0.02 % 1.81 19,772 0.21 % 36.43 3,005 0.12 % 16.03 587 0.09 % 13.67 2 0 % 0.51 55,972 0.96 % 176.05 šesti šesti š esti 74,252 0.40 % 65.44 0 0 % 0 966 0.33 % 24.32 35,823 0.38 % 66.01 7,316 0.30 % 39.04 1,195 0.18 % 27.83 282 0.41 % 71.29 28,670 0.49 % 90.17 1,999 1,999 1,999 73,271 0.39 % 64.57 0 0 % 0 117 0.04 % 2.95 47,250 0.50 % 87.06 11,085 0.45 % 
59.15 3,507 0.53 % 81.68 175 0.25 % 44.24 11,137 0.19 % 35.03
1,998 1,998 1,998 66,087 0.35 % 58.24 0 0 % 0 134 0.04 % 3.37 41,645 0.44 % 76.73 10,313 0.42 % 55.03 3,428 0.52 % 79.84 203 0.29 % 51.32 10,364 0.18 % 32.60
sedmi sedmi s edmi 66,081 0.35 % 58.24 0 0 % 0 848 0.29 % 21.35 33,649 0.35 % 62 5,901 0.24 % 31.49 975 0.15 % 22.71 207 0.30 % 52.33 24,501 0.42 % 77.06
2,013 2,013 2,013 65,695 0.35 % 57.90 0 0 % 0 59 0.02 % 1.49 5,509 0.06 % 10.15 541 0.02 % 2.89 410 0.06 % 9.55 1 0 % 0.25 59,175 1.02 % 186.12
2,014 2,014 2,014 65,624 0.35 % 57.83 0 0 % 0 42 0.01 % 1.06 1,573 0.02 % 2.90 133 0.01 % 0.71 475 0.07 % 11.06 1 0 % 0.25 63,400 1.09 % 199.41

1.2.130 List of final character-level 5-grams from numeral lemmas in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-word_parts-numerals-lemmas-final-5grams-taxonomy-entire.tsv
Lemma | Lemma (lower-case) | Rest of the word | Final part of the word | Total absolute frequency of lemma | Percentage of all found lemmas | Total relative frequency (per million occurrences) | Absolute frequency [SSJ.T.K.N] | Percentage [SSJ.T.K.N] | Relative frequency [SSJ.T.K.N] | Absolute frequency [SSJ.T.K.L] | Percentage [SSJ.T.K.L] | Relative frequency [SSJ.T.K.L] | Absolute frequency [SSJ.T.P.C] | Percentage [SSJ.T.P.C] | Relative frequency [SSJ.T.P.C] | Absolute frequency [SSJ.T.P.R] | Percentage [SSJ.T.P.R] | Relative frequency [SSJ.T.P.R] | Absolute frequency [SSJ.T.K.S] | Percentage [SSJ.T.K.S] | Relative frequency [SSJ.T.K.S] | Absolute frequency [SSJ.T.D] | Percentage [SSJ.T.D] | Relative frequency [SSJ.T.D] | Absolute frequency [SSJ.I] | Percentage [SSJ.I] | Relative frequency [SSJ.I]
trije trije trije 1,077,597 14.63 % 949.68 4 50.00 % 412.07 24,134 21.20 % 607.67 554,125 13.61 % 1,021.01 148,979 17.83 % 794.90 24,987 17.09 % 581.98 2,226 11.91 % 562.71 323,142 14.83 % 1,016.37
štirje štirje š tirje 562,863 7.64 % 496.05 2 25.00 % 206.04 10,338 9.08 % 260.30 287,367
7.06 % 529.49 76,421 9.15 % 407.76 10,991 7.52 % 255.99 649 3.47 % 164.06 177,095 8.13 % 557.01 tretji tretji t retji 454,803 6.18 % 400.82 0 0 % 0 5,843 5.13 % 147.12 221,805 5.45 % 408.69 50,869 6.09 % 271.42 11,061 7.57 % 257.62 2,681 14.34 % 677.73 162,544 7.46 % 511.24 deset deset deset 366,728 4.98 % 323.20 0 0 % 0 9,231 8.11 % 232.43 193,409 4.75 % 356.37 51,610 6.18 % 275.37 5,416 3.71 % 126.15 459 2.46 % 116.03 106,603 4.89 % 335.29 tisoč tisoč tisoč 305,442 4.15 % 269.18 0 0 % 0 5,271 4.63 % 132.72 195,780 4.81 % 360.74 37,622 4.50 % 200.74 3,442 2.35 % 80.17 270 1.44 % 68.25 63,057 2.89 % 198.33 sedem sedem sedem 221,597 3.01 % 195.29 0 0 % 0 4,599 4.04 % 115.80 112,630 2.77 % 207.53 25,309 3.03 % 135.04 3,382 2.31 % 78.77 261 1.40 % 65.98 75,416 3.46 % 237.20 četrti četrti č etrti 158,938 2.16 % 140.07 0 0 % 0 1,901 1.67 % 47.87 76,520 1.88 % 140.99 15,548 1.86 % 82.96 3,282 2.25 % 76.44 926 4.95 % 234.08 60,761 2.79 % 191.11 devet devet devet 134,561 1.83 % 118.59 2 25.00 % 206.04 2,497 2.19 % 62.87 68,635 1.69 % 126.46 14,329 1.72 % 76.45 1,695 1.16 % 39.48 117 0.63 % 29.58 47,286 2.17 % 148.73 dvajset dvajset dv ajset 85,701 1.16 % 75.53 0 0 % 0 4,883 4.29 % 122.95 47,739 1.17 % 87.96 16,442 1.97 % 87.73 2,169 1.48 % 50.52 142 0.76 % 35.90 14,326 0.66 % 45.06 šesti šesti šesti 74,252 1.01 % 65.44 0 0 % 0 966 0.85 % 24.32 35,823 0.88 % 66.01 7,316 0.88 % 39.04 1,195 0.82 % 27.83 282 1.51 % 71.29 28,670 1.32 % 90.17 sedmi sedmi sedmi 66,081 0.90 % 58.24 0 0 % 0 848 0.74 % 21.35 33,649 0.83 % 62 5,901 0.71 % 31.49 975 0.67 % 22.71 207 1.11 % 52.33 24,501 1.12 % 77.06 dvojen dvojen d vojen 53,225 0.72 % 46.91 0 0 % 0 1,309 1.15 % 32.96 25,179 0.62 % 46.39 9,987 1.20 % 53.29 2,927 2.00 % 68.17 268 1.43 % 67.75 13,555 0.62 % 42.63 petnajst petnajst pet najst 52,670 0.71 % 46.42 0 0 % 0 2,585 2.27 % 65.09 31,310 0.77 % 57.69 9,890 1.18 % 52.77 1,092 0.75 % 25.43 177 0.95 % 44.74 7,616 0.35 % 23.95 dvanajst dvanajst dva najst 52,274 0.71 % 46.07 0 0 % 0 
2,256 1.98 % 56.80 30,553 0.75 % 56.30 8,579 1.03 % 45.77 1,577 1.08 % 36.73 109 0.58 % 27.55 9,200 0.42 % 28.94 enajst enajst e najst 49,499 0.67 % 43.62 0 0 % 0 1,534 1.35 % 38.62 28,527 0.70 % 52.56 5,959 0.71 % 31.80 615 0.42 % 14.32 50 0.27 % 12.64 12,814 0.59 % 40.30 trideset trideset tri deset 47,140 0.64 % 41.54 0 0 % 0 2,690 2.36 % 67.73 27,199 0.67 % 50.12 9,456 1.13 % 50.45 1,273 0.87 % 29.65 167 0.89 % 42.22 6,355 0.29 % 19.99 deseti deseti d eseti 41,418 0.56 % 36.50 0 0 % 0 692 0.61 % 17.42 22,075 0.54 % 40.67 4,406 0.53 % 23.51 629 0.43 % 14.65 60 0.32 % 15.17 13,556 0.62 % 42.64 deveti deveti d eveti 38,542 0.52 % 33.97 0 0 % 0 714 0.63 % 17.98 19,195 0.47 % 35.37 3,512 0.42 % 18.74 578 0.40 % 13.46 79 0.42 % 19.97 14,464 0.66 % 45.49 petdeset petdeset pet deset 35,974 0.49 % 31.70 0 0 % 0 2,539 2.23 % 63.93 20,873 0.51 % 38.46 7,067 0.85 % 37.71 1,066 0.73 % 24.83 72 0.39 % 18.20 4,357 0.20 % 13.70 19,30 19,30 19,30 34,632 0.47 % 30.52 0 0 % 0 4 0 % 0.10 30,009 0.74 % 55.29 717 0.09 % 3.83 20 0.01 % 0.47 2 0.01 % 0.51 3,880 0.18 % 12.20 10,000 10,000 1 0,000 31,107 0.42 % 27.41 0 0 % 0 36 0.03 % 0.91 15,753 0.39 % 29.03 4,273 0.51 % 22.80 494 0.34 % 11.51 111 0.59 % 28.06 10,440 0.48 % 32.84 štirinajst štirinajst štiri najst 29,291 0.40 % 25.81 0 0 % 0 1,335 1.17 % 33.61 18,357 0.45 % 33.82 5,206 0.62 % 27.78 627 0.43 % 14.60 48 0.26 % 12.13 3,718 0.17 % 11.69 20,00 20,00 20,00 28,807 0.39 % 25.39 0 0 % 0 2 0 % 0.05 22,837 0.56 % 42.08 1,244 0.15 % 6.64 62 0.04 % 1.44 15 0.08 % 3.79 4,647 0.21 % 14.62 osemdeseti osemdeseti osemd eseti 27,174 0.37 % 23.95 0 0 % 0 276 0.24 % 6.95 14,169 0.35 % 26.11 7,078 0.85 % 37.77 1,373 0.94 % 31.98 37 0.20 % 9.35 4,241 0.20 % 13.34 19,00 19,00 19,00 27,027 0.37 % 23.82 0 0 % 0 7 0.01 % 0.18 20,005 0.49 % 36.86 752 0.09 % 4.01 63 0.04 % 1.47 2 0.01 % 0.51 6,198 0.28 % 19.49 100,000 100,000 10 0,000 25,443 0.35 % 22.42 0 0 % 0 29 0.03 % 0.73 12,044 0.30 % 22.19 3,426 0.41 % 18.28 354 0.24 % 8.25 118 0.63 % 29.83 
9,472 0.43 % 29.79
devetdeseti devetdeseti devetd eseti 24,539 0.33 % 21.63 0 0 % 0 217 0.19 % 5.46 14,140 0.35 % 26.05 4,995 0.60 % 26.65 979 0.67 % 22.80 32 0.17 % 8.09 4,176 0.19 % 13.13
18,00 18,00 18,00 23,606 0.32 % 20.80 0 0 % 0 6 0.01 % 0.15 17,269 0.42 % 31.82 632 0.08 % 3.37 112 0.08 % 2.61 12 0.06 % 3.03 5,575 0.26 % 17.53
štirideset štirideset štiri deset 23,199 0.32 % 20.45 0 0 % 0 1,769 1.55 % 44.54 13,090 0.32 % 24.12 4,667 0.56 % 24.90 786 0.54 % 18.31 31 0.17 % 7.84 2,856 0.13 % 8.98
trinajst trinajst tri najst 22,766 0.31 % 20.06 0 0 % 0 736 0.65 % 18.53 13,808 0.34 % 25.44 3,650 0.44 % 19.48 473 0.32 % 11.02 21 0.11 % 5.31 4,078 0.19 % 12.83
dvesto dvesto d vesto 21,911 0.30 % 19.31 0 0 % 0 1,170 1.03 % 29.46 13,791 0.34 % 25.41 3,592 0.43 % 19.17 581 0.40 % 13.53 21 0.11 % 5.31 2,756 0.13 % 8.67
20,000 20,000 2 0,000 20,058 0.27 % 17.68 0 0 % 0 30 0.03 % 0.76 10,180 0.25 % 18.76 2,542 0.30 % 13.56 314 0.21 % 7.31 73 0.39 % 18.45 6,919 0.32 % 21.76

1.2.131 List of initial character-level 1-grams from numeral lower-case word forms in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-word_parts-numerals-lowercase_forms-initial-1grams-taxonomy-entire.tsv
Lower-case word form | Initial part of the word | Rest of the word | Total absolute frequency of lower-case word form | Percentage of all found lower-case word forms | Total relative frequency (per million occurrences) | Absolute frequency [SSJ.T.K.N] | Percentage [SSJ.T.K.N] | Relative frequency [SSJ.T.K.N] | Absolute frequency [SSJ.T.K.L] | Percentage [SSJ.T.K.L] | Relative frequency [SSJ.T.K.L] | Absolute frequency [SSJ.T.P.C] | Percentage [SSJ.T.P.C] | Relative frequency [SSJ.T.P.C] | Absolute frequency [SSJ.T.P.R] | Percentage [SSJ.T.P.R] | Relative frequency [SSJ.T.P.R] | Absolute frequency [SSJ.T.K.S] | Percentage [SSJ.T.K.S] | Relative frequency [SSJ.T.K.S] | Absolute frequency [SSJ.T.D] | Percentage [SSJ.T.D] | Relative frequency [SSJ.T.D] | Absolute frequency [SSJ.I] | Percentage [SSJ.I] | Relative frequency [SSJ.I]
drugi d rugi 820,060 2.04 % 722.71 3 0.69 % 309.06 24,165 5.37 % 608.45 392,784 1.88 % 723.73 128,927 2.45 % 687.91 34,299 2.37 % 798.86 3,773 2.35 % 953.77 236,109 1.97 % 742.62
prvi p rvi 808,380 2.01 % 712.42 2 0.46 % 206.04 11,945 2.65 % 300.76 373,300 1.79 % 687.83 112,201 2.13 % 598.67 21,036 1.45 % 489.95 2,570 1.60 % 649.67 287,326 2.39 % 903.72
tri t ri 563,742 1.40 % 496.82 1 0.23 % 103.02 13,856 3.08 % 348.88 286,039 1.37 % 527.05 78,929 1.50 % 421.14 13,581 0.94 % 316.32 939 0.58 % 237.37 170,397 1.42 % 535.94
dva d va 531,037 1.32 % 468 3 0.69 % 309.06 17,833 3.96 % 449.02 259,987 1.24 % 479.04 86,185 1.64 % 459.85 14,415 0.99 % 335.74 1,103 0.69 % 278.83 151,511 1.26 % 476.54
dve d ve 517,328 1.29 % 455.92 2 0.46 % 206.04 13,722 3.05 % 345.51 252,491 1.21 % 465.23 82,830 1.57 % 441.95 15,837 1.09 % 368.86 1,072 0.67 % 270.99 151,374 1.26 % 476.11
1 1 473,721 1.18 % 417.49 34 7.83 % 3,502.63 1,167 0.26 % 29.38 224,981 1.08 % 414.54 75,915 1.44 % 405.06 31,719 2.19 % 738.77 5,780 3.60 % 1,461.12 134,125 1.12 % 421.86
drugih d rugih 462,975 1.15 % 408.02 0 0 % 0 8,523 1.89 % 214.60 224,951 1.08 % 414.49 80,742 1.54 % 430.81 25,540 1.76 % 594.86 3,960 2.47 % 1,001.05 119,259 0.99 % 375.10
2 2 448,702 1.11 % 395.44 32 7.37 % 3,296.59 870 0.19 % 21.91 232,497 1.11 % 428.39 74,329 1.41 % 396.60 26,758 1.85 % 623.23 5,495 3.43 % 1,389.08 108,721 0.91 % 341.96
eno e no 413,972 1.03 % 364.83 7 1.61 % 721.13 19,037 4.23 % 479.33 183,413 0.88 % 337.95 77,701 1.48 % 414.59 16,926 1.17 % 394.23 1,428 0.89 % 360.98 115,460 0.96 % 363.15
dveh d veh 411,331 1.02 % 362.50 2 0.46 % 206.04 6,335 1.41 % 159.51 205,987 0.98 % 379.54 58,443 1.11 % 311.83 11,257 0.78 % 262.19 1,144 0.71 % 289.19 128,163 1.07 % 403.11
druge d ruge 404,732 1.01 % 356.69 1 0.23 % 103.02 10,596 2.35 % 266.80 187,972 0.90 % 346.35 71,965 1.37 % 383.98 23,722 1.64 % 552.51 4,453 2.78 % 1,125.67 106,023
0.88 % 333.47 drugim d rugim 396,234 0.98 % 349.20 3 0.69 % 309.06 6,249 1.39 % 157.34 173,675 0.83 % 320.01 49,406 0.94 % 263.61 10,022 0.69 % 233.42 1,036 0.65 % 261.89 155,843 1.30 % 490.17 ena e na 342,038 0.85 % 301.44 0 0 % 0 11,184 2.48 % 281.60 153,961 0.74 % 283.68 64,745 1.23 % 345.46 12,596 0.87 % 293.38 1,003 0.62 % 253.55 98,549 0.82 % 309.96 eden e den 339,547 0.84 % 299.24 0 0 % 0 9,712 2.16 % 244.54 157,429 0.75 % 290.07 57,853 1.10 % 308.68 10,361 0.71 % 241.32 694 0.43 % 175.44 103,498 0.86 % 325.53 1, 1 , 338,262 0.84 % 298.11 0 0 % 0 729 0.16 % 18.36 212,046 1.01 % 390.71 30,858 0.59 % 164.65 10,574 0.73 % 246.28 3,351 2.09 % 847.10 80,704 0.67 % 253.84 3 3 334,249 0.83 % 294.57 20 4.61 % 2,060.37 684 0.15 % 17.22 178,016 0.85 % 328.01 52,468 1.00 % 279.95 20,047 1.38 % 466.92 3,802 2.37 % 961.10 79,212 0.66 % 249.14 drugo d rugo 330,167 0.82 % 290.97 10 2.30 % 1,030.18 13,160 2.92 % 331.35 160,253 0.77 % 295.28 53,640 1.02 % 286.21 13,845 0.95 % 322.47 2,029 1.26 % 512.91 87,230 0.73 % 274.36 pet p et 320,662 0.80 % 282.60 0 0 % 0 6,969 1.55 % 175.47 162,760 0.78 % 299.90 41,601 0.79 % 221.97 5,031 0.35 % 117.18 521 0.33 % 131.70 103,780 0.86 % 326.42 20 2 0 318,178 0.79 % 280.41 22 5.07 % 2,266.41 360 0.08 % 9.06 161,965 0.78 % 298.43 43,208 0.82 % 230.54 9,708 0.67 % 226.11 928 0.58 % 234.59 101,987 0.85 % 320.78 4 4 303,782 0.76 % 267.72 12 2.77 % 1,236.22 467 0.10 % 11.76 165,355 0.79 % 304.68 46,384 0.88 % 247.49 16,262 1.12 % 378.76 2,293 1.43 % 579.65 73,009 0.61 % 229.63 tisoč t isoč 301,750 0.75 % 265.93 0 0 % 0 5,082 1.13 % 127.96 194,101 0.93 % 357.64 36,893 0.70 % 196.85 3,323 0.23 % 77.40 257 0.16 % 64.97 62,094 0.52 % 195.30 štiri š tiri 296,946 0.74 % 261.70 1 0.23 % 103.02 5,534 1.23 % 139.34 148,934 0.71 % 274.42 40,692 0.77 % 217.12 5,876 0.41 % 136.86 349 0.22 % 88.22 95,560 0.80 % 300.56 10 1 0 296,783 0.74 % 261.55 21 4.84 % 2,163.39 401 0.09 % 10.10 142,870 0.68 % 263.25 49,185 0.94 % 262.43 13,444 0.93 % 313.13 1,236 0.77 
% 312.45 89,626 0.75 % 281.90
drugega d rugega 294,080 0.73 % 259.17 1 0.23 % 103.02 20,272 4.50 % 510.43 126,128 0.60 % 232.40 51,562 0.98 % 275.12 14,473 1.00 % 337.09 2,891 1.80 % 730.81 78,753 0.66 % 247.70
treh t reh 288,800 0.72 % 254.52 2 0.46 % 206.04 3,940 0.88 % 99.20 150,526 0.72 % 277.35 37,751 0.72 % 201.43 6,751 0.47 % 157.24 997 0.62 % 252.03 88,833 0.74 % 279.40
prvo p rvo 284,160 0.71 % 250.43 0 0 % 0 4,862 1.08 % 122.42 144,620 0.69 % 266.47 39,257 0.75 % 209.46 7,276 0.50 % 169.47 601 0.38 % 151.93 87,544 0.73 % 275.35
5 5 265,500 0.66 % 233.98 14 3.23 % 1,442.26 549 0.12 % 13.82 141,453 0.68 % 260.64 45,080 0.86 % 240.53 15,333 1.06 % 357.12 1,690 1.05 % 427.21 61,381 0.51 % 193.06
15 1 5 265,493 0.66 % 233.98 19 4.38 % 1,957.35 308 0.07 % 7.76 133,739 0.64 % 246.42 35,793 0.68 % 190.98 8,377 0.58 % 195.11 753 0.47 % 190.35 86,504 0.72 % 272.08
enega e nega 257,431 0.64 % 226.87 0 0 % 0 8,961 1.99 % 225.63 117,647 0.56 % 216.77 42,696 0.81 % 227.81 8,810 0.61 % 205.20 1,025 0.64 % 259.11 78,292 0.65 % 246.25
30 3 0 251,610 0.62 % 221.74 20 4.61 % 2,060.37 282 0.06 % 7.10 128,477 0.61 % 236.73 38,641 0.73 % 206.18 7,698 0.53 % 179.30 1,126 0.70 % 284.64 75,366 0.63 % 237.05
prvem p rvem 250,842 0.62 % 221.07 0 0 % 0 2,705 0.60 % 68.11 125,127 0.60 % 230.55 26,820 0.51 % 143.10 5,363 0.37 % 124.91 1,064 0.66 % 268.97 89,763 0.75 % 282.33
deset d eset 246,223 0.61 % 217 0 0 % 0 6,849 1.52 % 172.45 128,781 0.62 % 237.29 34,149 0.65 % 182.21 3,845 0.27 % 89.55 285 0.18 % 72.04 72,314 0.60 % 227.45

1.2.132 List of initial character-level 2-grams from numeral lower-case word forms in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-word_parts-numerals-lowercase_forms-initial-2grams-taxonomy-entire.tsv
Lower-case word form | Initial part of the word | Rest of the word | Total absolute frequency of lower-case word form | Percentage of all found lower-case word forms | Total relative frequency (per million occurrences) | Absolute frequency [SSJ.T.K.N] | Percentage [SSJ.T.K.N] | Relative frequency [SSJ.T.K.N] | Absolute frequency [SSJ.T.K.L] | Percentage [SSJ.T.K.L] | Relative frequency [SSJ.T.K.L] | Absolute frequency [SSJ.T.P.C] | Percentage [SSJ.T.P.C] | Relative frequency [SSJ.T.P.C] | Absolute frequency [SSJ.T.P.R] | Percentage [SSJ.T.P.R] | Relative frequency [SSJ.T.P.R] | Absolute frequency [SSJ.T.K.S] | Percentage [SSJ.T.K.S] | Relative frequency [SSJ.T.K.S] | Absolute frequency [SSJ.T.D] | Percentage [SSJ.T.D] | Relative frequency [SSJ.T.D] | Absolute frequency [SSJ.I] | Percentage [SSJ.I] | Relative frequency [SSJ.I]
drugi dr ugi 820,060 2.18 % 722.71 3 0.97 % 309.06 24,165 5.43 % 608.45 392,784 2.01 % 723.73 128,927 2.65 % 687.91 34,299 2.64 % 798.86 3,773 2.73 % 953.77 236,109 2.08 % 742.62
prvi pr vi 808,380 2.15 % 712.42 2 0.65 % 206.04 11,945 2.69 % 300.76 373,300 1.91 % 687.83 112,201 2.31 % 598.67 21,036 1.62 % 489.95 2,570 1.86 % 649.67 287,326 2.53 % 903.72
tri tr i 563,742 1.50 % 496.82 1 0.33 % 103.02 13,856 3.11 % 348.88 286,039 1.47 % 527.05 78,929 1.62 % 421.14 13,581 1.04 % 316.32 939 0.68 % 237.37 170,397 1.50 % 535.94
dva dv a 531,037 1.41 % 468 3 0.97 % 309.06 17,833 4.01 % 449.02 259,987 1.33 % 479.04 86,185 1.77 % 459.85 14,415 1.11 % 335.74 1,103 0.80 % 278.83 151,511 1.33 % 476.54
dve dv e 517,328 1.38 % 455.92 2 0.65 % 206.04 13,722 3.08 % 345.51 252,491 1.29 % 465.23 82,830 1.71 % 441.95 15,837 1.22 % 368.86 1,072 0.77 % 270.99 151,374 1.33 % 476.11
drugih dr ugih 462,975 1.23 % 408.02 0 0 % 0 8,523 1.92 % 214.60 224,951 1.15 % 414.49 80,742 1.66 % 430.81 25,540 1.96 % 594.86 3,960 2.86 % 1,001.05 119,259 1.05 % 375.10
eno en o 413,972 1.10 % 364.83 7 2.27 % 721.13 19,037 4.28 % 479.33 183,413 0.94 % 337.95 77,701 1.60 % 414.59 16,926 1.30 % 394.23 1,428 1.03 % 360.98 115,460 1.02 % 363.15
dveh dv eh 411,331 1.09 % 362.50 2 0.65 % 206.04 6,335 1.42 % 159.51 205,987 1.06 % 379.54 58,443 1.20 % 311.83 11,257 0.86 %
262.19 1,144 0.83 % 289.19 128,163 1.13 % 403.11 druge dr uge 404,732 1.08 % 356.69 1 0.33 % 103.02 10,596 2.38 % 266.80 187,972 0.96 % 346.35 71,965 1.48 % 383.98 23,722 1.82 % 552.51 4,453 3.22 % 1,125.67 106,023 0.93 % 333.47 drugim dr ugim 396,234 1.05 % 349.20 3 0.97 % 309.06 6,249 1.41 % 157.34 173,675 0.89 % 320.01 49,406 1.02 % 263.61 10,022 0.77 % 233.42 1,036 0.75 % 261.89 155,843 1.37 % 490.17 ena en a 342,038 0.91 % 301.44 0 0 % 0 11,184 2.51 % 281.60 153,961 0.79 % 283.68 64,745 1.33 % 345.46 12,596 0.97 % 293.38 1,003 0.72 % 253.55 98,549 0.87 % 309.96 eden ed en 339,547 0.90 % 299.24 0 0 % 0 9,712 2.18 % 244.54 157,429 0.81 % 290.07 57,853 1.19 % 308.68 10,361 0.80 % 241.32 694 0.50 % 175.44 103,498 0.91 % 325.53 1, 1, 338,262 0.90 % 298.11 0 0 % 0 729 0.16 % 18.36 212,046 1.09 % 390.71 30,858 0.64 % 164.65 10,574 0.81 % 246.28 3,351 2.42 % 847.10 80,704 0.71 % 253.84 drugo dr ugo 330,167 0.88 % 290.97 10 3.25 % 1,030.18 13,160 2.96 % 331.35 160,253 0.82 % 295.28 53,640 1.10 % 286.21 13,845 1.06 % 322.47 2,029 1.47 % 512.91 87,230 0.77 % 274.36 pet pe t 320,662 0.85 % 282.60 0 0 % 0 6,969 1.57 % 175.47 162,760 0.83 % 299.90 41,601 0.86 % 221.97 5,031 0.39 % 117.18 521 0.38 % 131.70 103,780 0.92 % 326.42 20 20 318,178 0.85 % 280.41 22 7.14 % 2,266.41 360 0.08 % 9.06 161,965 0.83 % 298.43 43,208 0.89 % 230.54 9,708 0.75 % 226.11 928 0.67 % 234.59 101,987 0.90 % 320.78 tisoč ti soč 301,750 0.80 % 265.93 0 0 % 0 5,082 1.14 % 127.96 194,101 0.99 % 357.64 36,893 0.76 % 196.85 3,323 0.26 % 77.40 257 0.19 % 64.97 62,094 0.55 % 195.30 štiri št iri 296,946 0.79 % 261.70 1 0.33 % 103.02 5,534 1.24 % 139.34 148,934 0.76 % 274.42 40,692 0.84 % 217.12 5,876 0.45 % 136.86 349 0.25 % 88.22 95,560 0.84 % 300.56 10 10 296,783 0.79 % 261.55 21 6.82 % 2,163.39 401 0.09 % 10.10 142,870 0.73 % 263.25 49,185 1.01 % 262.43 13,444 1.03 % 313.13 1,236 0.89 % 312.45 89,626 0.79 % 281.90 drugega dr ugega 294,080 0.78 % 259.17 1 0.33 % 103.02 20,272 4.56 % 510.43 126,128 0.65 % 
232.40 51,562 1.06 % 275.12 14,473 1.11 % 337.09 2,891 2.09 % 730.81 78,753 0.69 % 247.70 treh tr eh 288,800 0.77 % 254.52 2 0.65 % 206.04 3,940 0.89 % 99.20 150,526 0.77 % 277.35 37,751 0.78 % 201.43 6,751 0.52 % 157.24 997 0.72 % 252.03 88,833 0.78 % 279.40 prvo pr vo 284,160 0.76 % 250.43 0 0 % 0 4,862 1.09 % 122.42 144,620 0.74 % 266.47 39,257 0.81 % 209.46 7,276 0.56 % 169.47 601 0.43 % 151.93 87,544 0.77 % 275.35 15 15 265,493 0.71 % 233.98 19 6.17 % 1,957.35 308 0.07 % 7.76 133,739 0.69 % 246.42 35,793 0.74 % 190.98 8,377 0.64 % 195.11 753 0.54 % 190.35 86,504 0.76 % 272.08 enega en ega 257,431 0.69 % 226.87 0 0 % 0 8,961 2.01 % 225.63 117,647 0.60 % 216.77 42,696 0.88 % 227.81 8,810 0.68 % 205.20 1,025 0.74 % 259.11 78,292 0.69 % 246.25 30 30 251,610 0.67 % 221.74 20 6.49 % 2,060.37 282 0.06 % 7.10 128,477 0.66 % 236.73 38,641 0.80 % 206.18 7,698 0.59 % 179.30 1,126 0.81 % 284.64 75,366 0.66 % 237.05 prvem pr vem 250,842 0.67 % 221.07 0 0 % 0 2,705 0.61 % 68.11 125,127 0.64 % 230.55 26,820 0.55 % 143.10 5,363 0.41 % 124.91 1,064 0.77 % 268.97 89,763 0.79 % 282.33 deset de set 246,223 0.66 % 217 0 0 % 0 6,849 1.54 % 172.45 128,781 0.66 % 237.29 34,149 0.70 % 182.21 3,845 0.30 % 89.55 285 0.21 % 72.04 72,314 0.64 % 227.45 2, 2, 238,500 0.63 % 210.19 0 0 % 0 327 0.07 % 8.23 154,120 0.79 % 283.98 23,331 0.48 % 124.49 8,832 0.68 % 205.71 3,076 2.22 % 777.58 48,814 0.43 % 153.53 en en 234,489 0.62 % 206.65 3 0.97 % 309.06 11,541 2.59 % 290.59 98,653 0.51 % 181.77 43,941 0.91 % 234.45 9,394 0.72 % 218.80 882 0.64 % 222.96 70,075 0.62 % 220.40 prva pr va 222,284 0.59 % 195.90 0 0 % 0 3,827 0.86 % 96.36 108,395 0.56 % 199.72 35,035 0.72 % 186.94 7,215 0.56 % 168.05 494 0.36 % 124.88 67,318 0.59 % 211.73 prve pr ve 216,155 0.57 % 190.50 0 0 % 0 2,727 0.61 % 68.66 96,390 0.49 % 177.60 28,125 0.58 % 150.07 6,850 0.53 % 159.54 804 0.58 % 203.24 81,259 0.72 % 255.58 12 12 214,333 0.57 % 188.89 5 1.62 % 515.09 297 0.07 % 7.48 109,658 0.56 % 202.05 25,719 0.53 % 137.23 
6,921 0.53 % 161.20 479 0.35 % 121.09 71,254 0.63 % 224.11

1.2.133 List of initial character-level 3-grams from numeral lower-case word forms in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-word_parts-numerals-lowercase_forms-initial-3grams-taxonomy-entire.tsv
Lower-case word form | Initial part of the word | Rest of the word | Total absolute frequency of lower-case word form | Percentage of all found lower-case word forms | Total relative frequency (per million occurrences) | Absolute frequency [SSJ.T.K.N] | Percentage [SSJ.T.K.N] | Relative frequency [SSJ.T.K.N] | Absolute frequency [SSJ.T.K.L] | Percentage [SSJ.T.K.L] | Relative frequency [SSJ.T.K.L] | Absolute frequency [SSJ.T.P.C] | Percentage [SSJ.T.P.C] | Relative frequency [SSJ.T.P.C] | Absolute frequency [SSJ.T.P.R] | Percentage [SSJ.T.P.R] | Relative frequency [SSJ.T.P.R] | Absolute frequency [SSJ.T.K.S] | Percentage [SSJ.T.K.S] | Relative frequency [SSJ.T.K.S] | Absolute frequency [SSJ.T.D] | Percentage [SSJ.T.D] | Relative frequency [SSJ.T.D] | Absolute frequency [SSJ.I] | Percentage [SSJ.I] | Relative frequency [SSJ.I]
drugi dru gi 820,060 2.78 % 722.71 3 1.92 % 309.06 24,165 5.75 % 608.45 392,784 2.59 % 723.73 128,927 3.32 % 687.91 34,299 3.41 % 798.86 3,773 3.69 % 953.77 236,109 2.64 % 742.62
prvi prv i 808,380 2.74 % 712.42 2 1.28 % 206.04 11,945 2.84 % 300.76 373,300 2.46 % 687.83 112,201 2.89 % 598.67 21,036 2.09 % 489.95 2,570 2.52 % 649.67 287,326 3.21 % 903.72
tri tri 563,742 1.91 % 496.82 1 0.64 % 103.02 13,856 3.29 % 348.88 286,039 1.88 % 527.05 78,929 2.04 % 421.14 13,581 1.35 % 316.32 939 0.92 % 237.37 170,397 1.91 % 535.94
dva dva 531,037 1.80 % 468 3 1.92 % 309.06 17,833 4.24 % 449.02 259,987 1.71 % 479.04 86,185 2.22 % 459.85 14,415 1.43 % 335.74 1,103 1.08 % 278.83 151,511 1.70 % 476.54
dve dve 517,328 1.75 % 455.92 2 1.28 % 206.04 13,722 3.26 % 345.51 252,491 1.66 % 465.23 82,830 2.14 % 441.95 15,837 1.57 % 368.86 1,072 1.05
% 270.99 151,374 1.69 % 476.11 drugih dru gih 462,975 1.57 % 408.02 0 0 % 0 8,523 2.03 % 214.60 224,951 1.48 % 414.49 80,742 2.08 % 430.81 25,540 2.54 % 594.86 3,960 3.88 % 1,001.05 119,259 1.33 % 375.10 eno eno 413,972 1.40 % 364.83 7 4.49 % 721.13 19,037 4.53 % 479.33 183,413 1.21 % 337.95 77,701 2.00 % 414.59 16,926 1.68 % 394.23 1,428 1.40 % 360.98 115,460 1.29 % 363.15 dveh dve h 411,331 1.39 % 362.50 2 1.28 % 206.04 6,335 1.51 % 159.51 205,987 1.36 % 379.54 58,443 1.51 % 311.83 11,257 1.12 % 262.19 1,144 1.12 % 289.19 128,163 1.43 % 403.11 druge dru ge 404,732 1.37 % 356.69 1 0.64 % 103.02 10,596 2.52 % 266.80 187,972 1.24 % 346.35 71,965 1.85 % 383.98 23,722 2.36 % 552.51 4,453 4.36 % 1,125.67 106,023 1.19 % 333.47 drugim dru gim 396,234 1.34 % 349.20 3 1.92 % 309.06 6,249 1.49 % 157.34 173,675 1.14 % 320.01 49,406 1.27 % 263.61 10,022 0.99 % 233.42 1,036 1.01 % 261.89 155,843 1.74 % 490.17 ena ena 342,038 1.16 % 301.44 0 0 % 0 11,184 2.66 % 281.60 153,961 1.01 % 283.68 64,745 1.67 % 345.46 12,596 1.25 % 293.38 1,003 0.98 % 253.55 98,549 1.10 % 309.96 eden ede n 339,547 1.15 % 299.24 0 0 % 0 9,712 2.31 % 244.54 157,429 1.04 % 290.07 57,853 1.49 % 308.68 10,361 1.03 % 241.32 694 0.68 % 175.44 103,498 1.16 % 325.53 drugo dru go 330,167 1.12 % 290.97 10 6.41 % 1,030.18 13,160 3.13 % 331.35 160,253 1.06 % 295.28 53,640 1.38 % 286.21 13,845 1.38 % 322.47 2,029 1.99 % 512.91 87,230 0.98 % 274.36 pet pet 320,662 1.09 % 282.60 0 0 % 0 6,969 1.66 % 175.47 162,760 1.07 % 299.90 41,601 1.07 % 221.97 5,031 0.50 % 117.18 521 0.51 % 131.70 103,780 1.16 % 326.42 tisoč tis oč 301,750 1.02 % 265.93 0 0 % 0 5,082 1.21 % 127.96 194,101 1.28 % 357.64 36,893 0.95 % 196.85 3,323 0.33 % 77.40 257 0.25 % 64.97 62,094 0.69 % 195.30 štiri šti ri 296,946 1.01 % 261.70 1 0.64 % 103.02 5,534 1.32 % 139.34 148,934 0.98 % 274.42 40,692 1.05 % 217.12 5,876 0.58 % 136.86 349 0.34 % 88.22 95,560 1.07 % 300.56 drugega dru gega 294,080 1.00 % 259.17 1 0.64 % 103.02 20,272 4.82 % 510.43 126,128 
0.83 % 232.40 51,562 1.33 % 275.12 14,473 1.44 % 337.09 2,891 2.83 % 730.81 78,753 0.88 % 247.70 treh tre h 288,800 0.98 % 254.52 2 1.28 % 206.04 3,940 0.94 % 99.20 150,526 0.99 % 277.35 37,751 0.97 % 201.43 6,751 0.67 % 157.24 997 0.98 % 252.03 88,833 0.99 % 279.40 prvo prv o 284,160 0.96 % 250.43 0 0 % 0 4,862 1.16 % 122.42 144,620 0.95 % 266.47 39,257 1.01 % 209.46 7,276 0.72 % 169.47 601 0.59 % 151.93 87,544 0.98 % 275.35 enega ene ga 257,431 0.87 % 226.87 0 0 % 0 8,961 2.13 % 225.63 117,647 0.78 % 216.77 42,696 1.10 % 227.81 8,810 0.88 % 205.20 1,025 1.00 % 259.11 78,292 0.88 % 246.25 prvem prv em 250,842 0.85 % 221.07 0 0 % 0 2,705 0.64 % 68.11 125,127 0.82 % 230.55 26,820 0.69 % 143.10 5,363 0.53 % 124.91 1,064 1.04 % 268.97 89,763 1.00 % 282.33 deset des et 246,223 0.83 % 217 0 0 % 0 6,849 1.63 % 172.45 128,781 0.85 % 237.29 34,149 0.88 % 182.21 3,845 0.38 % 89.55 285 0.28 % 72.04 72,314 0.81 % 227.45 prva prv a 222,284 0.75 % 195.90 0 0 % 0 3,827 0.91 % 96.36 108,395 0.71 % 199.72 35,035 0.90 % 186.94 7,215 0.72 % 168.05 494 0.48 % 124.88 67,318 0.75 % 211.73 prve prv e 216,155 0.73 % 190.50 0 0 % 0 2,727 0.65 % 68.66 96,390 0.64 % 177.60 28,125 0.72 % 150.07 6,850 0.68 % 159.54 804 0.79 % 203.24 81,259 0.91 % 255.58 druga dru ga 207,331 0.70 % 182.72 0 0 % 0 6,455 1.53 % 162.53 97,666 0.64 % 179.96 37,580 0.97 % 200.51 10,927 1.08 % 254.50 1,461 1.43 % 369.33 53,242 0.60 % 167.46 drugem dru gem 197,297 0.67 % 173.88 1 0.64 % 103.02 4,399 1.05 % 110.76 92,410 0.61 % 170.27 19,781 0.51 % 105.54 5,479 0.54 % 127.61 1,112 1.09 % 281.10 74,115 0.83 % 233.11 šest šes t 194,710 0.66 % 171.60 0 0 % 0 3,650 0.87 % 91.90 98,377 0.65 % 181.27 23,184 0.60 % 123.70 3,062 0.30 % 71.32 334 0.33 % 84.43 66,103 0.74 % 207.91 100 100 180,711 0.61 % 159.26 11 7.05 % 1,133.20 196 0.05 % 4.94 85,233 0.56 % 157.05 36,302 0.94 % 193.70 5,283 0.53 % 123.05 698 0.68 % 176.45 52,988 0.59 % 166.66 prvih prv ih 176,236 0.60 % 155.32 0 0 % 0 2,002 0.48 % 50.41 89,275 0.59 % 164.50 
24,291 0.63 % 129.61 4,882 0.48 % 113.71 270 0.26 % 68.25 55,516 0.62 % 174.61 dvema dve ma 172,686 0.58 % 152.19 2 1.28 % 206.04 4,089 0.97 % 102.96 85,914 0.57 % 158.30 27,603 0.71 % 147.28 4,285 0.43 % 99.80 268 0.26 % 67.75 50,525 0.56 % 158.91 prvega prv ega 170,665 0.58 % 150.41 0 0 % 0 2,487 0.59 % 62.62 81,130 0.53 % 149.49 19,285 0.50 % 102.90 4,378 0.43 % 101.97 3,542 3.47 % 895.38 59,843 0.67 % 188.22 eni eni 163,314 0.55 % 143.93 4 2.56 % 412.07 5,102 1.21 % 128.46 75,673 0.50 % 139.43 31,603 0.81 % 168.62 7,598 0.76 % 176.97 492 0.48 % 124.37 42,842 0.48 % 134.75
1.2.134 List of initial character-level 4-grams from numeral lower-case word forms in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-word_parts-numerals-lowercase_forms-initial-4grams-taxonomy-entire.tsv
Lower-case word form | Initial part of the word | Rest of the word | Total absolute frequency of lower-case word form | Percentage of all found lower-case word forms | Total relative frequency (per million occurrences) | Absolute frequency [SSJ.T.K.N] | Percentage [SSJ.T.K.N] | Relative frequency [SSJ.T.K.N] | Absolute frequency [SSJ.T.K.L] | Percentage [SSJ.T.K.L] | Relative frequency [SSJ.T.K.L] | Absolute frequency [SSJ.T.P.C] | Percentage [SSJ.T.P.C] | Relative frequency [SSJ.T.P.C] | Absolute frequency [SSJ.T.P.R] | Percentage [SSJ.T.P.R] | Relative frequency [SSJ.T.P.R] | Absolute frequency [SSJ.T.K.S] | Percentage [SSJ.T.K.S] | Relative frequency [SSJ.T.K.S] | Absolute frequency [SSJ.T.D] | Percentage [SSJ.T.D] | Relative frequency [SSJ.T.D] | Absolute frequency [SSJ.I] | Percentage [SSJ.I] | Relative frequency [SSJ.I]
drugi drug i 820,060 4.21 % 722.71 3 7.14 % 309.06 24,165 7.74 % 608.45 392,784 4.01 % 723.73 128,927 4.99 % 687.91 34,299 5.01 % 798.86 3,773 5.24 % 953.77 236,109 3.93 % 742.62 prvi prvi 808,380 4.15 % 712.42 2 4.76 % 206.04 11,945 3.83 % 300.76 373,300 3.81 % 687.83 112,201 4.35 % 598.67 21,036 3.07 % 489.95
2,570 3.57 % 649.67 287,326 4.78 % 903.72 drugih drug ih 462,975 2.38 % 408.02 0 0 % 0 8,523 2.73 % 214.60 224,951 2.29 % 414.49 80,742 3.13 % 430.81 25,540 3.73 % 594.86 3,960 5.50 % 1,001.05 119,259 1.99 % 375.10 dveh dveh 411,331 2.11 % 362.50 2 4.76 % 206.04 6,335 2.03 % 159.51 205,987 2.10 % 379.54 58,443 2.26 % 311.83 11,257 1.65 % 262.19 1,144 1.59 % 289.19 128,163 2.13 % 403.11 druge drug e 404,732 2.08 % 356.69 1 2.38 % 103.02 10,596 3.39 % 266.80 187,972 1.92 % 346.35 71,965 2.79 % 383.98 23,722 3.47 % 552.51 4,453 6.18 % 1,125.67 106,023 1.76 % 333.47 drugim drug im 396,234 2.04 % 349.20 3 7.14 % 309.06 6,249 2.00 % 157.34 173,675 1.77 % 320.01 49,406 1.91 % 263.61 10,022 1.46 % 233.42 1,036 1.44 % 261.89 155,843 2.59 % 490.17 eden eden 339,547 1.75 % 299.24 0 0 % 0 9,712 3.11 % 244.54 157,429 1.61 % 290.07 57,853 2.24 % 308.68 10,361 1.51 % 241.32 694 0.96 % 175.44 103,498 1.72 % 325.53 drugo drug o 330,167 1.70 % 290.97 10 23.81 % 1,030.18 13,160 4.21 % 331.35 160,253 1.64 % 295.28 53,640 2.08 % 286.21 13,845 2.02 % 322.47 2,029 2.82 % 512.91 87,230 1.45 % 274.36 tisoč tiso č 301,750 1.55 % 265.93 0 0 % 0 5,082 1.63 % 127.96 194,101 1.98 % 357.64 36,893 1.43 % 196.85 3,323 0.49 % 77.40 257 0.36 % 64.97 62,094 1.03 % 195.30 štiri štir i 296,946 1.53 % 261.70 1 2.38 % 103.02 5,534 1.77 % 139.34 148,934 1.52 % 274.42 40,692 1.58 % 217.12 5,876 0.86 % 136.86 349 0.48 % 88.22 95,560 1.59 % 300.56 drugega drug ega 294,080 1.51 % 259.17 1 2.38 % 103.02 20,272 6.49 % 510.43 126,128 1.29 % 232.40 51,562 2.00 % 275.12 14,473 2.12 % 337.09 2,891 4.01 % 730.81 78,753 1.31 % 247.70 treh treh 288,800 1.48 % 254.52 2 4.76 % 206.04 3,940 1.26 % 99.20 150,526 1.54 % 277.35 37,751 1.46 % 201.43 6,751 0.99 % 157.24 997 1.38 % 252.03 88,833 1.48 % 279.40 prvo prvo 284,160 1.46 % 250.43 0 0 % 0 4,862 1.56 % 122.42 144,620 1.48 % 266.47 39,257 1.52 % 209.46 7,276 1.06 % 169.47 601 0.83 % 151.93 87,544 1.46 % 275.35 enega eneg a 257,431 1.32 % 226.87 0 0 % 0 8,961 2.87 % 
225.63 117,647 1.20 % 216.77 42,696 1.65 % 227.81 8,810 1.29 % 205.20 1,025 1.42 % 259.11 78,292 1.30 % 246.25 prvem prve m 250,842 1.29 % 221.07 0 0 % 0 2,705 0.87 % 68.11 125,127 1.28 % 230.55 26,820 1.04 % 143.10 5,363 0.78 % 124.91 1,064 1.48 % 268.97 89,763 1.49 % 282.33 deset dese t 246,223 1.26 % 217 0 0 % 0 6,849 2.19 % 172.45 128,781 1.31 % 237.29 34,149 1.32 % 182.21 3,845 0.56 % 89.55 285 0.40 % 72.04 72,314 1.20 % 227.45 prva prva 222,284 1.14 % 195.90 0 0 % 0 3,827 1.23 % 96.36 108,395 1.11 % 199.72 35,035 1.36 % 186.94 7,215 1.05 % 168.05 494 0.69 % 124.88 67,318 1.12 % 211.73 prve prve 216,155 1.11 % 190.50 0 0 % 0 2,727 0.87 % 68.66 96,390 0.98 % 177.60 28,125 1.09 % 150.07 6,850 1.00 % 159.54 804 1.12 % 203.24 81,259 1.35 % 255.58 druga drug a 207,331 1.06 % 182.72 0 0 % 0 6,455 2.07 % 162.53 97,666 1.00 % 179.96 37,580 1.46 % 200.51 10,927 1.60 % 254.50 1,461 2.03 % 369.33 53,242 0.89 % 167.46 drugem drug em 197,297 1.01 % 173.88 1 2.38 % 103.02 4,399 1.41 % 110.76 92,410 0.94 % 170.27 19,781 0.77 % 105.54 5,479 0.80 % 127.61 1,112 1.54 % 281.10 74,115 1.23 % 233.11 šest šest 194,710 1.00 % 171.60 0 0 % 0 3,650 1.17 % 91.90 98,377 1.00 % 181.27 23,184 0.90 % 123.70 3,062 0.45 % 71.32 334 0.46 % 84.43 66,103 1.10 % 207.91 prvih prvi h 176,236 0.91 % 155.32 0 0 % 0 2,002 0.64 % 50.41 89,275 0.91 % 164.50 24,291 0.94 % 129.61 4,882 0.71 % 113.71 270 0.38 % 68.25 55,516 0.92 % 174.61 dvema dvem a 172,686 0.89 % 152.19 2 4.76 % 206.04 4,089 1.31 % 102.96 85,914 0.88 % 158.30 27,603 1.07 % 147.28 4,285 0.63 % 99.80 268 0.37 % 67.75 50,525 0.84 % 158.91 prvega prve ga 170,665 0.88 % 150.41 0 0 % 0 2,487 0.80 % 62.62 81,130 0.83 % 149.49 19,285 0.75 % 102.90 4,378 0.64 % 101.97 3,542 4.92 % 895.38 59,843 1.00 % 188.22 2,000 2,000 148,688 0.76 % 131.04 0 0 % 0 245 0.08 % 6.17 89,618 0.91 % 165.13 26,937 1.04 % 143.73 5,880 0.86 % 136.95 159 0.22 % 40.19 25,849 0.43 % 81.30 tretji tret ji 145,441 0.75 % 128.18 0 0 % 0 1,990 0.64 % 50.11 70,342 0.72 % 129.61 
16,447 0.64 % 87.76 3,373 0.49 % 78.56 686 0.95 % 173.41 52,603 0.88 % 165.45 drugimi drug imi 144,266 0.74 % 127.14 0 0 % 0 3,342 1.07 % 84.15 66,744 0.68 % 122.98 28,301 1.10 % 151 9,385 1.37 % 218.59 938 1.30 % 237.12 35,556 0.59 % 111.83 štirih štir ih 143,128 0.73 % 126.14 0 0 % 0 2,238 0.72 % 56.35 74,029 0.76 % 136.40 18,589 0.72 % 99.18 2,888 0.42 % 67.26 204 0.28 % 51.57 45,180 0.75 % 142.10 sedem sede m 143,046 0.73 % 126.07 0 0 % 0 2,985 0.96 % 75.16 71,859 0.73 % 132.40 16,155 0.63 % 86.20 2,252 0.33 % 52.45 135 0.19 % 34.13 49,660 0.83 % 156.19 osem osem 136,419 0.70 % 120.23 0 0 % 0 2,174 0.70 % 54.74 70,102 0.71 % 129.17 14,763 0.57 % 78.77 1,885 0.28 % 43.90 164 0.23 % 41.46 47,331 0.79 % 148.87 trije trij e 127,337 0.65 % 112.22 0 0 % 0 4,412 1.41 % 111.09 66,917 0.68 % 123.30 17,756 0.69 % 94.74 2,898 0.42 % 67.50 194 0.27 % 49.04 35,160 0.58 % 110.59 enem enem 125,075 0.64 % 110.23 4 9.52 % 412.07 3,694 1.18 % 93.01 55,098 0.56 % 101.52 24,108 0.93 % 128.63 4,278 0.62 % 99.64 568 0.79 % 143.58 37,325 0.62 % 117.40
1.2.135 List of initial character-level 5-grams from numeral lower-case word forms in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-word_parts-numerals-lowercase_forms-initial-5grams-taxonomy-entire.tsv
Lower-case word form | Initial part of the word | Rest of the word | Total absolute frequency of lower-case word form | Percentage of all found lower-case word forms | Total relative frequency (per million occurrences) | Absolute frequency [SSJ.T.K.N] | Percentage [SSJ.T.K.N] | Relative frequency [SSJ.T.K.N] | Absolute frequency [SSJ.T.K.L] | Percentage [SSJ.T.K.L] | Relative frequency [SSJ.T.K.L] | Absolute frequency [SSJ.T.P.C] | Percentage [SSJ.T.P.C] | Relative frequency [SSJ.T.P.C] | Absolute frequency [SSJ.T.P.R] | Percentage [SSJ.T.P.R] | Relative frequency [SSJ.T.P.R] | Absolute frequency [SSJ.T.K.S] | Percentage [SSJ.T.K.S] | Relative frequency [SSJ.T.K.S] | Absolute frequency [SSJ.T.D] | Percentage [SSJ.T.D] | Relative frequency [SSJ.T.D] | Absolute frequency [SSJ.I] | Percentage [SSJ.I] | Relative frequency [SSJ.I]
drugi drugi 820,060 7.24 % 722.71 3 10.71 % 309.06 24,165 10.67 % 608.45 392,784 6.63 % 723.73 128,927 8.87 % 687.91 34,299 10.95 % 798.86 3,773 8.08 % 953.77 236,109 7.02 % 742.62 drugih drugi h 462,975 4.09 % 408.02 0 0 % 0 8,523 3.76 % 214.60 224,951 3.80 % 414.49 80,742 5.55 % 430.81 25,540 8.15 % 594.86 3,960 8.48 % 1,001.05 119,259 3.54 % 375.10 druge druge 404,732 3.57 % 356.69 1 3.57 % 103.02 10,596 4.68 % 266.80 187,972 3.17 % 346.35 71,965 4.95 % 383.98 23,722 7.57 % 552.51 4,453 9.54 % 1,125.67 106,023 3.15 % 333.47 drugim drugi m 396,234 3.50 % 349.20 3 10.71 % 309.06 6,249 2.76 % 157.34 173,675 2.93 % 320.01 49,406 3.40 % 263.61 10,022 3.20 % 233.42 1,036 2.22 % 261.89 155,843 4.63 % 490.17 drugo drugo 330,167 2.92 % 290.97 10 35.71 % 1,030.18 13,160 5.81 % 331.35 160,253 2.71 % 295.28 53,640 3.69 % 286.21 13,845 4.42 % 322.47 2,029 4.34 % 512.91 87,230 2.59 % 274.36 tisoč tisoč 301,750 2.67 % 265.93 0 0 % 0 5,082 2.24 % 127.96 194,101 3.28 % 357.64 36,893 2.54 % 196.85 3,323 1.06 % 77.40 257 0.55 % 64.97 62,094 1.85 % 195.30 štiri štiri 296,946 2.62 % 261.70 1 3.57 % 103.02 5,534 2.44 % 139.34 148,934 2.52 % 274.42 40,692 2.80 % 217.12 5,876 1.88 % 136.86 349 0.75 % 88.22 95,560 2.84 % 300.56 drugega druge ga 294,080 2.60 % 259.17 1 3.57 % 103.02 20,272 8.95 % 510.43 126,128 2.13 % 232.40 51,562 3.55 % 275.12 14,473 4.62 % 337.09 2,891 6.19 % 730.81 78,753 2.34 % 247.70 enega enega 257,431 2.27 % 226.87 0 0 % 0 8,961 3.96 % 225.63 117,647 1.99 % 216.77 42,696 2.94 % 227.81 8,810 2.81 % 205.20 1,025 2.19 % 259.11 78,292 2.33 % 246.25 prvem prvem 250,842 2.21 % 221.07 0 0 % 0 2,705 1.19 % 68.11 125,127 2.11 % 230.55 26,820 1.84 % 143.10 5,363 1.71 % 124.91 1,064 2.28 % 268.97 89,763 2.67 % 282.33 deset deset 246,223 2.17 % 217 0 0 % 0 6,849 3.02 % 172.45 128,781 2.17 % 237.29 34,149 2.35 % 182.21 3,845
1.23 % 89.55 285 0.61 % 72.04 72,314 2.15 % 227.45 druga druga 207,331 1.83 % 182.72 0 0 % 0 6,455 2.85 % 162.53 97,666 1.65 % 179.96 37,580 2.58 % 200.51 10,927 3.49 % 254.50 1,461 3.13 % 369.33 53,242 1.58 % 167.46 drugem druge m 197,297 1.74 % 173.88 1 3.57 % 103.02 4,399 1.94 % 110.76 92,410 1.56 % 170.27 19,781 1.36 % 105.54 5,479 1.75 % 127.61 1,112 2.38 % 281.10 74,115 2.20 % 233.11 prvih prvih 176,236 1.56 % 155.32 0 0 % 0 2,002 0.88 % 50.41 89,275 1.51 % 164.50 24,291 1.67 % 129.61 4,882 1.56 % 113.71 270 0.58 % 68.25 55,516 1.65 % 174.61 dvema dvema 172,686 1.52 % 152.19 2 7.14 % 206.04 4,089 1.80 % 102.96 85,914 1.45 % 158.30 27,603 1.90 % 147.28 4,285 1.37 % 99.80 268 0.57 % 67.75 50,525 1.50 % 158.91 prvega prveg a 170,665 1.51 % 150.41 0 0 % 0 2,487 1.10 % 62.62 81,130 1.37 % 149.49 19,285 1.33 % 102.90 4,378 1.40 % 101.97 3,542 7.59 % 895.38 59,843 1.78 % 188.22 tretji tretj i 145,441 1.28 % 128.18 0 0 % 0 1,990 0.88 % 50.11 70,342 1.19 % 129.61 16,447 1.13 % 87.76 3,373 1.08 % 78.56 686 1.47 % 173.41 52,603 1.56 % 165.45 drugimi drugi mi 144,266 1.27 % 127.14 0 0 % 0 3,342 1.48 % 84.15 66,744 1.13 % 122.98 28,301 1.95 % 151 9,385 3.00 % 218.59 938 2.01 % 237.12 35,556 1.06 % 111.83 štirih štiri h 143,128 1.26 % 126.14 0 0 % 0 2,238 0.99 % 56.35 74,029 1.25 % 136.40 18,589 1.28 % 99.18 2,888 0.92 % 67.26 204 0.44 % 51.57 45,180 1.34 % 142.10 sedem sedem 143,046 1.26 % 126.07 0 0 % 0 2,985 1.32 % 75.16 71,859 1.21 % 132.40 16,155 1.11 % 86.20 2,252 0.72 % 52.45 135 0.29 % 34.13 49,660 1.48 % 156.19 trije trije 127,337 1.12 % 112.22 0 0 % 0 4,412 1.95 % 111.09 66,917 1.13 % 123.30 17,756 1.22 % 94.74 2,898 0.93 % 67.50 194 0.41 % 49.04 35,160 1.04 % 110.59 petih petih 122,008 1.08 % 107.53 1 3.57 % 103.02 1,848 0.82 % 46.53 62,977 1.06 % 116.04 14,811 1.02 % 79.03 1,833 0.58 % 42.69 335 0.72 % 84.68 40,203 1.20 % 126.45 tretje tretj e 97,845 0.86 % 86.23 0 0 % 0 901 0.40 % 22.69 51,539 0.87 % 94.96 10,412 0.72 % 55.56 1,553 0.50 % 36.17 297 0.64 % 
75.08 33,143 0.98 % 104.24 desetih deset ih 86,072 0.76 % 75.85 0 0 % 0 1,844 0.81 % 46.43 45,492 0.77 % 83.82 12,654 0.87 % 67.52 1,244 0.40 % 28.97 163 0.35 % 41.20 24,675 0.73 % 77.61 tremi tremi 84,540 0.75 % 74.50 1 3.57 % 103.02 1,599 0.71 % 40.26 43,950 0.74 % 80.98 12,581 0.86 % 67.13 1,509 0.48 % 35.15 83 0.18 % 20.98 24,817 0.74 % 78.06 devet devet 83,521 0.74 % 73.61 1 3.57 % 103.02 1,452 0.64 % 36.56 42,137 0.71 % 77.64 8,310 0.57 % 44.34 1,062 0.34 % 24.74 69 0.15 % 17.44 30,490 0.91 % 95.90 šestih šesti h 80,748 0.71 % 71.16 0 0 % 0 1,400 0.62 % 35.25 41,731 0.70 % 76.89 9,460 0.65 % 50.48 1,146 0.37 % 26.69 381 0.82 % 96.31 26,630 0.79 % 83.76 tretjem tretj em 61,704 0.55 % 54.38 0 0 % 0 680 0.30 % 17.12 28,203 0.48 % 51.97 5,627 0.39 % 30.02 1,446 0.46 % 33.68 322 0.69 % 81.40 25,426 0.76 % 79.97 štirje štirj e 61,129 0.54 % 53.87 0 0 % 0 1,629 0.72 % 41.02 32,123 0.54 % 59.19 8,065 0.56 % 43.03 1,300 0.41 % 30.28 55 0.12 % 13.90 17,957 0.53 % 56.48 sedmih sedmi h 58,162 0.51 % 51.26 0 0 % 0 1,391 0.61 % 35.02 30,258 0.51 % 55.75 6,643 0.46 % 35.44 918 0.29 % 21.38 111 0.24 % 28.06 18,841 0.56 % 59.26 dvajset dvajs et 57,623 0.51 % 50.78 0 0 % 0 3,915 1.73 % 98.58 32,574 0.55 % 60.02 11,044 0.76 % 58.93 1,459 0.47 % 33.98 87 0.19 % 21.99 8,544 0.25 % 26.87 štirimi štiri mi 55,654 0.49 % 49.05 1 3.57 % 103.02 836 0.37 % 21.05 29,008 0.49 % 53.45 8,278 0.57 % 44.17 831 0.27 % 19.35 36 0.08 % 9.10 16,664 0.49 % 52.41
1.2.136 List of final character-level 1-grams from numeral lower-case word forms in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-word_parts-numerals-lowercase_forms-final-1grams-taxonomy-entire.tsv
Lower-case word form | Rest of the word | Final part of the word | Total absolute frequency of lower-case word form | Percentage of all found lower-case word forms | Total relative frequency (per million occurrences) | Absolute frequency [SSJ.T.K.N] | Percentage [SSJ.T.K.N] | Relative frequency [SSJ.T.K.N] | Absolute frequency [SSJ.T.K.L] | Percentage [SSJ.T.K.L] | Relative frequency [SSJ.T.K.L] | Absolute frequency [SSJ.T.P.C] | Percentage [SSJ.T.P.C] | Relative frequency [SSJ.T.P.C] | Absolute frequency [SSJ.T.P.R] | Percentage [SSJ.T.P.R] | Relative frequency [SSJ.T.P.R] | Absolute frequency [SSJ.T.K.S] | Percentage [SSJ.T.K.S] | Relative frequency [SSJ.T.K.S] | Absolute frequency [SSJ.T.D] | Percentage [SSJ.T.D] | Relative frequency [SSJ.T.D] | Absolute frequency [SSJ.I] | Percentage [SSJ.I] | Relative frequency [SSJ.I]
drugi drug i 820,060 2.04 % 722.71 3 0.69 % 309.06 24,165 5.37 % 608.45 392,784 1.88 % 723.73 128,927 2.45 % 687.91 34,299 2.37 % 798.86 3,773 2.35 % 953.77 236,109 1.97 % 742.62 prvi prv i 808,380 2.01 % 712.42 2 0.46 % 206.04 11,945 2.65 % 300.76 373,300 1.79 % 687.83 112,201 2.13 % 598.67 21,036 1.45 % 489.95 2,570 1.60 % 649.67 287,326 2.39 % 903.72 tri tr i 563,742 1.40 % 496.82 1 0.23 % 103.02 13,856 3.08 % 348.88 286,039 1.37 % 527.05 78,929 1.50 % 421.14 13,581 0.94 % 316.32 939 0.58 % 237.37 170,397 1.42 % 535.94 dva dv a 531,037 1.32 % 468 3 0.69 % 309.06 17,833 3.96 % 449.02 259,987 1.24 % 479.04 86,185 1.64 % 459.85 14,415 0.99 % 335.74 1,103 0.69 % 278.83 151,511 1.26 % 476.54 dve dv e 517,328 1.29 % 455.92 2 0.46 % 206.04 13,722 3.05 % 345.51 252,491 1.21 % 465.23 82,830 1.57 % 441.95 15,837 1.09 % 368.86 1,072 0.67 % 270.99 151,374 1.26 % 476.11 1 1 473,721 1.18 % 417.49 34 7.83 % 3,502.63 1,167 0.26 % 29.38 224,981 1.08 % 414.54 75,915 1.44 % 405.06 31,719 2.19 % 738.77 5,780 3.60 % 1,461.12 134,125 1.12 % 421.86 drugih drugi h 462,975 1.15 % 408.02 0 0 % 0 8,523 1.89 % 214.60 224,951 1.08 % 414.49 80,742 1.54 % 430.81 25,540 1.76 % 594.86 3,960 2.47 % 1,001.05 119,259 0.99 % 375.10 2 2 448,702 1.11 % 395.44 32 7.37 % 3,296.59 870 0.19 % 21.91 232,497 1.11 % 428.39 74,329 1.41 % 396.60 26,758 1.85 % 623.23 5,495 3.43 % 1,389.08 108,721 0.91 % 341.96 eno en o 413,972 1.03 % 364.83 7 1.61 % 721.13 19,037
4.23 % 479.33 183,413 0.88 % 337.95 77,701 1.48 % 414.59 16,926 1.17 % 394.23 1,428 0.89 % 360.98 115,460 0.96 % 363.15 dveh dve h 411,331 1.02 % 362.50 2 0.46 % 206.04 6,335 1.41 % 159.51 205,987 0.98 % 379.54 58,443 1.11 % 311.83 11,257 0.78 % 262.19 1,144 0.71 % 289.19 128,163 1.07 % 403.11 druge drug e 404,732 1.01 % 356.69 1 0.23 % 103.02 10,596 2.35 % 266.80 187,972 0.90 % 346.35 71,965 1.37 % 383.98 23,722 1.64 % 552.51 4,453 2.78 % 1,125.67 106,023 0.88 % 333.47 drugim drugi m 396,234 0.98 % 349.20 3 0.69 % 309.06 6,249 1.39 % 157.34 173,675 0.83 % 320.01 49,406 0.94 % 263.61 10,022 0.69 % 233.42 1,036 0.65 % 261.89 155,843 1.30 % 490.17 ena en a 342,038 0.85 % 301.44 0 0 % 0 11,184 2.48 % 281.60 153,961 0.74 % 283.68 64,745 1.23 % 345.46 12,596 0.87 % 293.38 1,003 0.62 % 253.55 98,549 0.82 % 309.96 eden ede n 339,547 0.84 % 299.24 0 0 % 0 9,712 2.16 % 244.54 157,429 0.75 % 290.07 57,853 1.10 % 308.68 10,361 0.71 % 241.32 694 0.43 % 175.44 103,498 0.86 % 325.53 1, 1 , 338,262 0.84 % 298.11 0 0 % 0 729 0.16 % 18.36 212,046 1.01 % 390.71 30,858 0.59 % 164.65 10,574 0.73 % 246.28 3,351 2.09 % 847.10 80,704 0.67 % 253.84 3 3 334,249 0.83 % 294.57 20 4.61 % 2,060.37 684 0.15 % 17.22 178,016 0.85 % 328.01 52,468 1.00 % 279.95 20,047 1.38 % 466.92 3,802 2.37 % 961.10 79,212 0.66 % 249.14 drugo drug o 330,167 0.82 % 290.97 10 2.30 % 1,030.18 13,160 2.92 % 331.35 160,253 0.77 % 295.28 53,640 1.02 % 286.21 13,845 0.95 % 322.47 2,029 1.26 % 512.91 87,230 0.73 % 274.36 pet pe t 320,662 0.80 % 282.60 0 0 % 0 6,969 1.55 % 175.47 162,760 0.78 % 299.90 41,601 0.79 % 221.97 5,031 0.35 % 117.18 521 0.33 % 131.70 103,780 0.86 % 326.42 20 2 0 318,178 0.79 % 280.41 22 5.07 % 2,266.41 360 0.08 % 9.06 161,965 0.78 % 298.43 43,208 0.82 % 230.54 9,708 0.67 % 226.11 928 0.58 % 234.59 101,987 0.85 % 320.78 4 4 303,782 0.76 % 267.72 12 2.77 % 1,236.22 467 0.10 % 11.76 165,355 0.79 % 304.68 46,384 0.88 % 247.49 16,262 1.12 % 378.76 2,293 1.43 % 579.65 73,009 0.61 % 229.63 tisoč tiso č 
301,750 0.75 % 265.93 0 0 % 0 5,082 1.13 % 127.96 194,101 0.93 % 357.64 36,893 0.70 % 196.85 3,323 0.23 % 77.40 257 0.16 % 64.97 62,094 0.52 % 195.30 štiri štir i 296,946 0.74 % 261.70 1 0.23 % 103.02 5,534 1.23 % 139.34 148,934 0.71 % 274.42 40,692 0.77 % 217.12 5,876 0.41 % 136.86 349 0.22 % 88.22 95,560 0.80 % 300.56 10 1 0 296,783 0.74 % 261.55 21 4.84 % 2,163.39 401 0.09 % 10.10 142,870 0.68 % 263.25 49,185 0.94 % 262.43 13,444 0.93 % 313.13 1,236 0.77 % 312.45 89,626 0.75 % 281.90 drugega drugeg a 294,080 0.73 % 259.17 1 0.23 % 103.02 20,272 4.50 % 510.43 126,128 0.60 % 232.40 51,562 0.98 % 275.12 14,473 1.00 % 337.09 2,891 1.80 % 730.81 78,753 0.66 % 247.70 treh tre h 288,800 0.72 % 254.52 2 0.46 % 206.04 3,940 0.88 % 99.20 150,526 0.72 % 277.35 37,751 0.72 % 201.43 6,751 0.47 % 157.24 997 0.62 % 252.03 88,833 0.74 % 279.40 prvo prv o 284,160 0.71 % 250.43 0 0 % 0 4,862 1.08 % 122.42 144,620 0.69 % 266.47 39,257 0.75 % 209.46 7,276 0.50 % 169.47 601 0.38 % 151.93 87,544 0.73 % 275.35 5 5 265,500 0.66 % 233.98 14 3.23 % 1,442.26 549 0.12 % 13.82 141,453 0.68 % 260.64 45,080 0.86 % 240.53 15,333 1.06 % 357.12 1,690 1.05 % 427.21 61,381 0.51 % 193.06 15 1 5 265,493 0.66 % 233.98 19 4.38 % 1,957.35 308 0.07 % 7.76 133,739 0.64 % 246.42 35,793 0.68 % 190.98 8,377 0.58 % 195.11 753 0.47 % 190.35 86,504 0.72 % 272.08 enega eneg a 257,431 0.64 % 226.87 0 0 % 0 8,961 1.99 % 225.63 117,647 0.56 % 216.77 42,696 0.81 % 227.81 8,810 0.61 % 205.20 1,025 0.64 % 259.11 78,292 0.65 % 246.25 30 3 0 251,610 0.62 % 221.74 20 4.61 % 2,060.37 282 0.06 % 7.10 128,477 0.61 % 236.73 38,641 0.73 % 206.18 7,698 0.53 % 179.30 1,126 0.70 % 284.64 75,366 0.63 % 237.05 prvem prve m 250,842 0.62 % 221.07 0 0 % 0 2,705 0.60 % 68.11 125,127 0.60 % 230.55 26,820 0.51 % 143.10 5,363 0.37 % 124.91 1,064 0.66 % 268.97 89,763 0.75 % 282.33 deset dese t 246,223 0.61 % 217 0 0 % 0 6,849 1.52 % 172.45 128,781 0.62 % 237.29 34,149 0.65 % 182.21 3,845 0.27 % 89.55 285 0.18 % 72.04 72,314 0.60 % 227.45 
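The word-part columns in these lists can be reproduced from the word form alone. The sketch below is an illustration, not the LIST program's implementation: the function names are hypothetical, and returning `None` for forms shorter than n is an assumption based on the printed lists, where short forms such as "tri" do not appear in the 4-gram tables. The corpus size passed to `relative_frequency_per_million` is likewise a caller-supplied assumption.

```python
from typing import Optional, Tuple


def split_initial_ngram(word_form: str, n: int) -> Optional[Tuple[str, str]]:
    """Return (initial part, rest of the word), or None if the form is shorter than n."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    if len(word_form) < n:
        return None  # assumption: forms shorter than n are left out of the list
    return word_form[:n], word_form[n:]


def split_final_ngram(word_form: str, n: int) -> Optional[Tuple[str, str]]:
    """Return (rest of the word, final part), or None if the form is shorter than n."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    if len(word_form) < n:
        return None  # assumption: forms shorter than n are left out of the list
    return word_form[:-n], word_form[-n:]


def relative_frequency_per_million(absolute_frequency: int, total_tokens: int) -> float:
    """Scale an absolute frequency to occurrences per million tokens."""
    if total_tokens <= 0:
        raise ValueError("total_tokens must be positive")
    return absolute_frequency * 1_000_000 / total_tokens


if __name__ == "__main__":
    print(split_final_ngram("drugih", 1))     # ('drugi', 'h'), as in the final 1-gram list
    print(split_initial_ngram("drugega", 4))  # ('drug', 'ega'), as in the initial 4-gram list
    print(split_final_ngram("tri", 4))        # None: shorter than n
```

An empty "rest" string corresponds to rows where the n-gram covers the whole form, for example "prvi" in the 4-gram lists.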
1.2.137 List of final character-level 2-grams from numeral lower-case word forms in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-word_parts-numerals-lowercase_forms-final-2grams-taxonomy-entire.tsv
Lower-case word form | Rest of the word | Final part of the word | Total absolute frequency of lower-case word form | Percentage of all found lower-case word forms | Total relative frequency (per million occurrences) | Absolute frequency [SSJ.T.K.N] | Percentage [SSJ.T.K.N] | Relative frequency [SSJ.T.K.N] | Absolute frequency [SSJ.T.K.L] | Percentage [SSJ.T.K.L] | Relative frequency [SSJ.T.K.L] | Absolute frequency [SSJ.T.P.C] | Percentage [SSJ.T.P.C] | Relative frequency [SSJ.T.P.C] | Absolute frequency [SSJ.T.P.R] | Percentage [SSJ.T.P.R] | Relative frequency [SSJ.T.P.R] | Absolute frequency [SSJ.T.K.S] | Percentage [SSJ.T.K.S] | Relative frequency [SSJ.T.K.S] | Absolute frequency [SSJ.T.D] | Percentage [SSJ.T.D] | Relative frequency [SSJ.T.D] | Absolute frequency [SSJ.I] | Percentage [SSJ.I] | Relative frequency [SSJ.I]
drugi dru gi 820,060 2.18 % 722.71 3 0.97 % 309.06 24,165 5.43 % 608.45 392,784 2.01 % 723.73 128,927 2.65 % 687.91 34,299 2.64 % 798.86 3,773 2.73 % 953.77 236,109 2.08 % 742.62 prvi pr vi 808,380 2.15 % 712.42 2 0.65 % 206.04 11,945 2.69 % 300.76 373,300 1.91 % 687.83 112,201 2.31 % 598.67 21,036 1.62 % 489.95 2,570 1.86 % 649.67 287,326 2.53 % 903.72 tri t ri 563,742 1.50 % 496.82 1 0.33 % 103.02 13,856 3.11 % 348.88 286,039 1.47 % 527.05 78,929 1.62 % 421.14 13,581 1.04 % 316.32 939 0.68 % 237.37 170,397 1.50 % 535.94 dva d va 531,037 1.41 % 468 3 0.97 % 309.06 17,833 4.01 % 449.02 259,987 1.33 % 479.04 86,185 1.77 % 459.85 14,415 1.11 % 335.74 1,103 0.80 % 278.83 151,511 1.33 % 476.54 dve d ve 517,328 1.38 % 455.92 2 0.65 % 206.04 13,722 3.08 % 345.51 252,491 1.29 % 465.23 82,830 1.71 % 441.95 15,837 1.22 % 368.86 1,072 0.77 % 270.99 151,374 1.33 % 476.11 drugih drug ih 462,975 1.23 %
408.02 0 0 % 0 8,523 1.92 % 214.60 224,951 1.15 % 414.49 80,742 1.66 % 430.81 25,540 1.96 % 594.86 3,960 2.86 % 1,001.05 119,259 1.05 % 375.10 eno e no 413,972 1.10 % 364.83 7 2.27 % 721.13 19,037 4.28 % 479.33 183,413 0.94 % 337.95 77,701 1.60 % 414.59 16,926 1.30 % 394.23 1,428 1.03 % 360.98 115,460 1.02 % 363.15 dveh dv eh 411,331 1.09 % 362.50 2 0.65 % 206.04 6,335 1.42 % 159.51 205,987 1.06 % 379.54 58,443 1.20 % 311.83 11,257 0.86 % 262.19 1,144 0.83 % 289.19 128,163 1.13 % 403.11 druge dru ge 404,732 1.08 % 356.69 1 0.33 % 103.02 10,596 2.38 % 266.80 187,972 0.96 % 346.35 71,965 1.48 % 383.98 23,722 1.82 % 552.51 4,453 3.22 % 1,125.67 106,023 0.93 % 333.47 drugim drug im 396,234 1.05 % 349.20 3 0.97 % 309.06 6,249 1.41 % 157.34 173,675 0.89 % 320.01 49,406 1.02 % 263.61 10,022 0.77 % 233.42 1,036 0.75 % 261.89 155,843 1.37 % 490.17 ena e na 342,038 0.91 % 301.44 0 0 % 0 11,184 2.51 % 281.60 153,961 0.79 % 283.68 64,745 1.33 % 345.46 12,596 0.97 % 293.38 1,003 0.72 % 253.55 98,549 0.87 % 309.96 eden ed en 339,547 0.90 % 299.24 0 0 % 0 9,712 2.18 % 244.54 157,429 0.81 % 290.07 57,853 1.19 % 308.68 10,361 0.80 % 241.32 694 0.50 % 175.44 103,498 0.91 % 325.53 1, 1, 338,262 0.90 % 298.11 0 0 % 0 729 0.16 % 18.36 212,046 1.09 % 390.71 30,858 0.64 % 164.65 10,574 0.81 % 246.28 3,351 2.42 % 847.10 80,704 0.71 % 253.84 drugo dru go 330,167 0.88 % 290.97 10 3.25 % 1,030.18 13,160 2.96 % 331.35 160,253 0.82 % 295.28 53,640 1.10 % 286.21 13,845 1.06 % 322.47 2,029 1.47 % 512.91 87,230 0.77 % 274.36 pet p et 320,662 0.85 % 282.60 0 0 % 0 6,969 1.57 % 175.47 162,760 0.83 % 299.90 41,601 0.86 % 221.97 5,031 0.39 % 117.18 521 0.38 % 131.70 103,780 0.92 % 326.42 20 20 318,178 0.85 % 280.41 22 7.14 % 2,266.41 360 0.08 % 9.06 161,965 0.83 % 298.43 43,208 0.89 % 230.54 9,708 0.75 % 226.11 928 0.67 % 234.59 101,987 0.90 % 320.78 tisoč tis oč 301,750 0.80 % 265.93 0 0 % 0 5,082 1.14 % 127.96 194,101 0.99 % 357.64 36,893 0.76 % 196.85 3,323 0.26 % 77.40 257 0.19 % 64.97 62,094 
0.55 % 195.30 štiri šti ri 296,946 0.79 % 261.70 1 0.33 % 103.02 5,534 1.24 % 139.34 148,934 0.76 % 274.42 40,692 0.84 % 217.12 5,876 0.45 % 136.86 349 0.25 % 88.22 95,560 0.84 % 300.56 10 10 296,783 0.79 % 261.55 21 6.82 % 2,163.39 401 0.09 % 10.10 142,870 0.73 % 263.25 49,185 1.01 % 262.43 13,444 1.03 % 313.13 1,236 0.89 % 312.45 89,626 0.79 % 281.90 drugega druge ga 294,080 0.78 % 259.17 1 0.33 % 103.02 20,272 4.56 % 510.43 126,128 0.65 % 232.40 51,562 1.06 % 275.12 14,473 1.11 % 337.09 2,891 2.09 % 730.81 78,753 0.69 % 247.70 treh tr eh 288,800 0.77 % 254.52 2 0.65 % 206.04 3,940 0.89 % 99.20 150,526 0.77 % 277.35 37,751 0.78 % 201.43 6,751 0.52 % 157.24 997 0.72 % 252.03 88,833 0.78 % 279.40 prvo pr vo 284,160 0.76 % 250.43 0 0 % 0 4,862 1.09 % 122.42 144,620 0.74 % 266.47 39,257 0.81 % 209.46 7,276 0.56 % 169.47 601 0.43 % 151.93 87,544 0.77 % 275.35 15 15 265,493 0.71 % 233.98 19 6.17 % 1,957.35 308 0.07 % 7.76 133,739 0.69 % 246.42 35,793 0.74 % 190.98 8,377 0.64 % 195.11 753 0.54 % 190.35 86,504 0.76 % 272.08 enega ene ga 257,431 0.69 % 226.87 0 0 % 0 8,961 2.01 % 225.63 117,647 0.60 % 216.77 42,696 0.88 % 227.81 8,810 0.68 % 205.20 1,025 0.74 % 259.11 78,292 0.69 % 246.25 30 30 251,610 0.67 % 221.74 20 6.49 % 2,060.37 282 0.06 % 7.10 128,477 0.66 % 236.73 38,641 0.80 % 206.18 7,698 0.59 % 179.30 1,126 0.81 % 284.64 75,366 0.66 % 237.05 prvem prv em 250,842 0.67 % 221.07 0 0 % 0 2,705 0.61 % 68.11 125,127 0.64 % 230.55 26,820 0.55 % 143.10 5,363 0.41 % 124.91 1,064 0.77 % 268.97 89,763 0.79 % 282.33 deset des et 246,223 0.66 % 217 0 0 % 0 6,849 1.54 % 172.45 128,781 0.66 % 237.29 34,149 0.70 % 182.21 3,845 0.30 % 89.55 285 0.21 % 72.04 72,314 0.64 % 227.45 2, 2, 238,500 0.63 % 210.19 0 0 % 0 327 0.07 % 8.23 154,120 0.79 % 283.98 23,331 0.48 % 124.49 8,832 0.68 % 205.71 3,076 2.22 % 777.58 48,814 0.43 % 153.53 en en 234,489 0.62 % 206.65 3 0.97 % 309.06 11,541 2.59 % 290.59 98,653 0.51 % 181.77 43,941 0.91 % 234.45 9,394 0.72 % 218.80 882 0.64 % 222.96 
70,075 0.62 % 220.40 prva pr va 222,284 0.59 % 195.90 0 0 % 0 3,827 0.86 % 96.36 108,395 0.56 % 199.72 35,035 0.72 % 186.94 7,215 0.56 % 168.05 494 0.36 % 124.88 67,318 0.59 % 211.73 prve pr ve 216,155 0.57 % 190.50 0 0 % 0 2,727 0.61 % 68.66 96,390 0.49 % 177.60 28,125 0.58 % 150.07 6,850 0.53 % 159.54 804 0.58 % 203.24 81,259 0.72 % 255.58 12 12 214,333 0.57 % 188.89 5 1.62 % 515.09 297 0.07 % 7.48 109,658 0.56 % 202.05 25,719 0.53 % 137.23 6,921 0.53 % 161.20 479 0.35 % 121.09 71,254 0.63 % 224.11
1.2.138 List of final character-level 3-grams from numeral lower-case word forms in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-word_parts-numerals-lowercase_forms-final-3grams-taxonomy-entire.tsv
Lower-case word form | Rest of the word | Final part of the word | Total absolute frequency of lower-case word form | Percentage of all found lower-case word forms | Total relative frequency (per million occurrences) | Absolute frequency [SSJ.T.K.N] | Percentage [SSJ.T.K.N] | Relative frequency [SSJ.T.K.N] | Absolute frequency [SSJ.T.K.L] | Percentage [SSJ.T.K.L] | Relative frequency [SSJ.T.K.L] | Absolute frequency [SSJ.T.P.C] | Percentage [SSJ.T.P.C] | Relative frequency [SSJ.T.P.C] | Absolute frequency [SSJ.T.P.R] | Percentage [SSJ.T.P.R] | Relative frequency [SSJ.T.P.R] | Absolute frequency [SSJ.T.K.S] | Percentage [SSJ.T.K.S] | Relative frequency [SSJ.T.K.S] | Absolute frequency [SSJ.T.D] | Percentage [SSJ.T.D] | Relative frequency [SSJ.T.D] | Absolute frequency [SSJ.I] | Percentage [SSJ.I] | Relative frequency [SSJ.I]
drugi dr ugi 820,060 2.78 % 722.71 3 1.92 % 309.06 24,165 5.75 % 608.45 392,784 2.59 % 723.73 128,927 3.32 % 687.91 34,299 3.41 % 798.86 3,773 3.69 % 953.77 236,109 2.64 % 742.62 prvi p rvi 808,380 2.74 % 712.42 2 1.28 % 206.04 11,945 2.84 % 300.76 373,300 2.46 % 687.83 112,201 2.89 % 598.67 21,036 2.09 % 489.95 2,570 2.52 % 649.67 287,326 3.21 % 903.72 tri tri 563,742 1.91 % 496.82 1 0.64 %
103.02 13,856 3.29 % 348.88 286,039 1.88 % 527.05 78,929 2.04 % 421.14 13,581 1.35 % 316.32 939 0.92 % 237.37 170,397 1.91 % 535.94 dva dva 531,037 1.80 % 468 3 1.92 % 309.06 17,833 4.24 % 449.02 259,987 1.71 % 479.04 86,185 2.22 % 459.85 14,415 1.43 % 335.74 1,103 1.08 % 278.83 151,511 1.70 % 476.54 dve dve 517,328 1.75 % 455.92 2 1.28 % 206.04 13,722 3.26 % 345.51 252,491 1.66 % 465.23 82,830 2.14 % 441.95 15,837 1.57 % 368.86 1,072 1.05 % 270.99 151,374 1.69 % 476.11 drugih dru gih 462,975 1.57 % 408.02 0 0 % 0 8,523 2.03 % 214.60 224,951 1.48 % 414.49 80,742 2.08 % 430.81 25,540 2.54 % 594.86 3,960 3.88 % 1,001.05 119,259 1.33 % 375.10 eno eno 413,972 1.40 % 364.83 7 4.49 % 721.13 19,037 4.53 % 479.33 183,413 1.21 % 337.95 77,701 2.00 % 414.59 16,926 1.68 % 394.23 1,428 1.40 % 360.98 115,460 1.29 % 363.15 dveh d veh 411,331 1.39 % 362.50 2 1.28 % 206.04 6,335 1.51 % 159.51 205,987 1.36 % 379.54 58,443 1.51 % 311.83 11,257 1.12 % 262.19 1,144 1.12 % 289.19 128,163 1.43 % 403.11 druge dr uge 404,732 1.37 % 356.69 1 0.64 % 103.02 10,596 2.52 % 266.80 187,972 1.24 % 346.35 71,965 1.85 % 383.98 23,722 2.36 % 552.51 4,453 4.36 % 1,125.67 106,023 1.19 % 333.47 drugim dru gim 396,234 1.34 % 349.20 3 1.92 % 309.06 6,249 1.49 % 157.34 173,675 1.14 % 320.01 49,406 1.27 % 263.61 10,022 0.99 % 233.42 1,036 1.01 % 261.89 155,843 1.74 % 490.17 ena ena 342,038 1.16 % 301.44 0 0 % 0 11,184 2.66 % 281.60 153,961 1.01 % 283.68 64,745 1.67 % 345.46 12,596 1.25 % 293.38 1,003 0.98 % 253.55 98,549 1.10 % 309.96 eden e den 339,547 1.15 % 299.24 0 0 % 0 9,712 2.31 % 244.54 157,429 1.04 % 290.07 57,853 1.49 % 308.68 10,361 1.03 % 241.32 694 0.68 % 175.44 103,498 1.16 % 325.53 drugo dr ugo 330,167 1.12 % 290.97 10 6.41 % 1,030.18 13,160 3.13 % 331.35 160,253 1.06 % 295.28 53,640 1.38 % 286.21 13,845 1.38 % 322.47 2,029 1.99 % 512.91 87,230 0.98 % 274.36 pet pet 320,662 1.09 % 282.60 0 0 % 0 6,969 1.66 % 175.47 162,760 1.07 % 299.90 41,601 1.07 % 221.97 5,031 0.50 % 117.18 521 0.51 % 
131.70 103,780 1.16 % 326.42 tisoč ti soč 301,750 1.02 % 265.93 0 0 % 0 5,082 1.21 % 127.96 194,101 1.28 % 357.64 36,893 0.95 % 196.85 3,323 0.33 % 77.40 257 0.25 % 64.97 62,094 0.69 % 195.30 štiri št iri 296,946 1.01 % 261.70 1 0.64 % 103.02 5,534 1.32 % 139.34 148,934 0.98 % 274.42 40,692 1.05 % 217.12 5,876 0.58 % 136.86 349 0.34 % 88.22 95,560 1.07 % 300.56 drugega drug ega 294,080 1.00 % 259.17 1 0.64 % 103.02 20,272 4.82 % 510.43 126,128 0.83 % 232.40 51,562 1.33 % 275.12 14,473 1.44 % 337.09 2,891 2.83 % 730.81 78,753 0.88 % 247.70 treh t reh 288,800 0.98 % 254.52 2 1.28 % 206.04 3,940 0.94 % 99.20 150,526 0.99 % 277.35 37,751 0.97 % 201.43 6,751 0.67 % 157.24 997 0.98 % 252.03 88,833 0.99 % 279.40 prvo p rvo 284,160 0.96 % 250.43 0 0 % 0 4,862 1.16 % 122.42 144,620 0.95 % 266.47 39,257 1.01 % 209.46 7,276 0.72 % 169.47 601 0.59 % 151.93 87,544 0.98 % 275.35 enega en ega 257,431 0.87 % 226.87 0 0 % 0 8,961 2.13 % 225.63 117,647 0.78 % 216.77 42,696 1.10 % 227.81 8,810 0.88 % 205.20 1,025 1.00 % 259.11 78,292 0.88 % 246.25 prvem pr vem 250,842 0.85 % 221.07 0 0 % 0 2,705 0.64 % 68.11 125,127 0.82 % 230.55 26,820 0.69 % 143.10 5,363 0.53 % 124.91 1,064 1.04 % 268.97 89,763 1.00 % 282.33 deset de set 246,223 0.83 % 217 0 0 % 0 6,849 1.63 % 172.45 128,781 0.85 % 237.29 34,149 0.88 % 182.21 3,845 0.38 % 89.55 285 0.28 % 72.04 72,314 0.81 % 227.45 prva p rva 222,284 0.75 % 195.90 0 0 % 0 3,827 0.91 % 96.36 108,395 0.71 % 199.72 35,035 0.90 % 186.94 7,215 0.72 % 168.05 494 0.48 % 124.88 67,318 0.75 % 211.73 prve p rve 216,155 0.73 % 190.50 0 0 % 0 2,727 0.65 % 68.66 96,390 0.64 % 177.60 28,125 0.72 % 150.07 6,850 0.68 % 159.54 804 0.79 % 203.24 81,259 0.91 % 255.58 druga dr uga 207,331 0.70 % 182.72 0 0 % 0 6,455 1.53 % 162.53 97,666 0.64 % 179.96 37,580 0.97 % 200.51 10,927 1.08 % 254.50 1,461 1.43 % 369.33 53,242 0.60 % 167.46 drugem dru gem 197,297 0.67 % 173.88 1 0.64 % 103.02 4,399 1.05 % 110.76 92,410 0.61 % 170.27 19,781 0.51 % 105.54 5,479 0.54 % 127.61 
1,112 1.09 % 281.10 74,115 0.83 % 233.11
šest š est 194,710 0.66 % 171.60 0 0 % 0 3,650 0.87 % 91.90 98,377 0.65 % 181.27 23,184 0.60 % 123.70 3,062 0.30 % 71.32 334 0.33 % 84.43 66,103 0.74 % 207.91
100 100 180,711 0.61 % 159.26 11 7.05 % 1,133.20 196 0.05 % 4.94 85,233 0.56 % 157.05 36,302 0.94 % 193.70 5,283 0.53 % 123.05 698 0.68 % 176.45 52,988 0.59 % 166.66
prvih pr vih 176,236 0.60 % 155.32 0 0 % 0 2,002 0.48 % 50.41 89,275 0.59 % 164.50 24,291 0.63 % 129.61 4,882 0.48 % 113.71 270 0.26 % 68.25 55,516 0.62 % 174.61
dvema dv ema 172,686 0.58 % 152.19 2 1.28 % 206.04 4,089 0.97 % 102.96 85,914 0.57 % 158.30 27,603 0.71 % 147.28 4,285 0.43 % 99.80 268 0.26 % 67.75 50,525 0.56 % 158.91
prvega prv ega 170,665 0.58 % 150.41 0 0 % 0 2,487 0.59 % 62.62 81,130 0.53 % 149.49 19,285 0.50 % 102.90 4,378 0.43 % 101.97 3,542 3.47 % 895.38 59,843 0.67 % 188.22
eni eni 163,314 0.55 % 143.93 4 2.56 % 412.07 5,102 1.21 % 128.46 75,673 0.50 % 139.43 31,603 0.81 % 168.62 7,598 0.76 % 176.97 492 0.48 % 124.37 42,842 0.48 % 134.75

1.2.139 List of final character-level 4-grams from numeral lower-case word forms in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-word_parts-numerals-lowercase_forms-final-4grams-taxonomy-entire.tsv
Lower-case word form | Rest of the word | Final part of the word | Total absolute frequency of lower-case word form | Percentage of all found lower-case word forms | Total relative frequency (per million occurrences) | Absolute frequency [SSJ.T.K.N] | Percentage [SSJ.T.K.N] | Relative frequency [SSJ.T.K.N] | Absolute frequency [SSJ.T.K.L] | Percentage [SSJ.T.K.L] | Relative frequency [SSJ.T.K.L] | Absolute frequency [SSJ.T.P.C] | Percentage [SSJ.T.P.C] | Relative frequency [SSJ.T.P.C] | Absolute frequency [SSJ.T.P.R] | Percentage [SSJ.T.P.R] | Relative frequency [SSJ.T.P.R] | Absolute frequency [SSJ.T.K.S] | Percentage [SSJ.T.K.S] | Relative frequency [SSJ.T.K.S] | Absolute frequency [SSJ.T.D] | Percentage [SSJ.T.D] | Relative frequency [SSJ.T.D] | Absolute frequency [SSJ.I] | Percentage [SSJ.I] | Relative frequency [SSJ.I]
drugi d rugi 820,060 4.21 % 722.71 3 7.14 % 309.06 24,165 7.74 % 608.45 392,784 4.01 % 723.73 128,927 4.99 % 687.91 34,299 5.01 % 798.86 3,773 5.24 % 953.77 236,109 3.93 % 742.62
prvi prvi 808,380 4.15 % 712.42 2 4.76 % 206.04 11,945 3.83 % 300.76 373,300 3.81 % 687.83 112,201 4.35 % 598.67 21,036 3.07 % 489.95 2,570 3.57 % 649.67 287,326 4.78 % 903.72
drugih dr ugih 462,975 2.38 % 408.02 0 0 % 0 8,523 2.73 % 214.60 224,951 2.29 % 414.49 80,742 3.13 % 430.81 25,540 3.73 % 594.86 3,960 5.50 % 1,001.05 119,259 1.99 % 375.10
dveh dveh 411,331 2.11 % 362.50 2 4.76 % 206.04 6,335 2.03 % 159.51 205,987 2.10 % 379.54 58,443 2.26 % 311.83 11,257 1.65 % 262.19 1,144 1.59 % 289.19 128,163 2.13 % 403.11
druge d ruge 404,732 2.08 % 356.69 1 2.38 % 103.02 10,596 3.39 % 266.80 187,972 1.92 % 346.35 71,965 2.79 % 383.98 23,722 3.47 % 552.51 4,453 6.18 % 1,125.67 106,023 1.76 % 333.47
drugim dr ugim 396,234 2.04 % 349.20 3 7.14 % 309.06 6,249 2.00 % 157.34 173,675 1.77 % 320.01 49,406 1.91 % 263.61 10,022 1.46 % 233.42 1,036 1.44 % 261.89 155,843 2.59 % 490.17
eden eden 339,547 1.75 % 299.24 0 0 % 0 9,712 3.11 % 244.54 157,429 1.61 % 290.07 57,853 2.24 % 308.68 10,361 1.51 % 241.32 694 0.96 % 175.44 103,498 1.72 % 325.53
drugo d rugo 330,167 1.70 % 290.97 10 23.81 % 1,030.18 13,160 4.21 % 331.35 160,253 1.64 % 295.28 53,640 2.08 % 286.21 13,845 2.02 % 322.47 2,029 2.82 % 512.91 87,230 1.45 % 274.36
tisoč t isoč 301,750 1.55 % 265.93 0 0 % 0 5,082 1.63 % 127.96 194,101 1.98 % 357.64 36,893 1.43 % 196.85 3,323 0.49 % 77.40 257 0.36 % 64.97 62,094 1.03 % 195.30
štiri š tiri 296,946 1.53 % 261.70 1 2.38 % 103.02 5,534 1.77 % 139.34 148,934 1.52 % 274.42 40,692 1.58 % 217.12 5,876 0.86 % 136.86 349 0.48 % 88.22 95,560 1.59 % 300.56
drugega dru gega 294,080 1.51 % 259.17 1 2.38 % 103.02 20,272 6.49 % 510.43 126,128 1.29 % 232.40 51,562 2.00 % 275.12 14,473
2.12 % 337.09 2,891 4.01 % 730.81 78,753 1.31 % 247.70
treh treh 288,800 1.48 % 254.52 2 4.76 % 206.04 3,940 1.26 % 99.20 150,526 1.54 % 277.35 37,751 1.46 % 201.43 6,751 0.99 % 157.24 997 1.38 % 252.03 88,833 1.48 % 279.40
prvo prvo 284,160 1.46 % 250.43 0 0 % 0 4,862 1.56 % 122.42 144,620 1.48 % 266.47 39,257 1.52 % 209.46 7,276 1.06 % 169.47 601 0.83 % 151.93 87,544 1.46 % 275.35
enega e nega 257,431 1.32 % 226.87 0 0 % 0 8,961 2.87 % 225.63 117,647 1.20 % 216.77 42,696 1.65 % 227.81 8,810 1.29 % 205.20 1,025 1.42 % 259.11 78,292 1.30 % 246.25
prvem p rvem 250,842 1.29 % 221.07 0 0 % 0 2,705 0.87 % 68.11 125,127 1.28 % 230.55 26,820 1.04 % 143.10 5,363 0.78 % 124.91 1,064 1.48 % 268.97 89,763 1.49 % 282.33
deset d eset 246,223 1.26 % 217 0 0 % 0 6,849 2.19 % 172.45 128,781 1.31 % 237.29 34,149 1.32 % 182.21 3,845 0.56 % 89.55 285 0.40 % 72.04 72,314 1.20 % 227.45
prva prva 222,284 1.14 % 195.90 0 0 % 0 3,827 1.23 % 96.36 108,395 1.11 % 199.72 35,035 1.36 % 186.94 7,215 1.05 % 168.05 494 0.69 % 124.88 67,318 1.12 % 211.73
prve prve 216,155 1.11 % 190.50 0 0 % 0 2,727 0.87 % 68.66 96,390 0.98 % 177.60 28,125 1.09 % 150.07 6,850 1.00 % 159.54 804 1.12 % 203.24 81,259 1.35 % 255.58
druga d ruga 207,331 1.06 % 182.72 0 0 % 0 6,455 2.07 % 162.53 97,666 1.00 % 179.96 37,580 1.46 % 200.51 10,927 1.60 % 254.50 1,461 2.03 % 369.33 53,242 0.89 % 167.46
drugem dr ugem 197,297 1.01 % 173.88 1 2.38 % 103.02 4,399 1.41 % 110.76 92,410 0.94 % 170.27 19,781 0.77 % 105.54 5,479 0.80 % 127.61 1,112 1.54 % 281.10 74,115 1.23 % 233.11
šest šest 194,710 1.00 % 171.60 0 0 % 0 3,650 1.17 % 91.90 98,377 1.00 % 181.27 23,184 0.90 % 123.70 3,062 0.45 % 71.32 334 0.46 % 84.43 66,103 1.10 % 207.91
prvih p rvih 176,236 0.91 % 155.32 0 0 % 0 2,002 0.64 % 50.41 89,275 0.91 % 164.50 24,291 0.94 % 129.61 4,882 0.71 % 113.71 270 0.38 % 68.25 55,516 0.92 % 174.61
dvema d vema 172,686 0.89 % 152.19 2 4.76 % 206.04 4,089 1.31 % 102.96 85,914 0.88 % 158.30 27,603 1.07 % 147.28 4,285 0.63 % 99.80 268
0.37 % 67.75 50,525 0.84 % 158.91
prvega pr vega 170,665 0.88 % 150.41 0 0 % 0 2,487 0.80 % 62.62 81,130 0.83 % 149.49 19,285 0.75 % 102.90 4,378 0.64 % 101.97 3,542 4.92 % 895.38 59,843 1.00 % 188.22
2,000 2,000 148,688 0.76 % 131.04 0 0 % 0 245 0.08 % 6.17 89,618 0.91 % 165.13 26,937 1.04 % 143.73 5,880 0.86 % 136.95 159 0.22 % 40.19 25,849 0.43 % 81.30
tretji tr etji 145,441 0.75 % 128.18 0 0 % 0 1,990 0.64 % 50.11 70,342 0.72 % 129.61 16,447 0.64 % 87.76 3,373 0.49 % 78.56 686 0.95 % 173.41 52,603 0.88 % 165.45
drugimi dru gimi 144,266 0.74 % 127.14 0 0 % 0 3,342 1.07 % 84.15 66,744 0.68 % 122.98 28,301 1.10 % 151 9,385 1.37 % 218.59 938 1.30 % 237.12 35,556 0.59 % 111.83
štirih št irih 143,128 0.73 % 126.14 0 0 % 0 2,238 0.72 % 56.35 74,029 0.76 % 136.40 18,589 0.72 % 99.18 2,888 0.42 % 67.26 204 0.28 % 51.57 45,180 0.75 % 142.10
sedem s edem 143,046 0.73 % 126.07 0 0 % 0 2,985 0.96 % 75.16 71,859 0.73 % 132.40 16,155 0.63 % 86.20 2,252 0.33 % 52.45 135 0.19 % 34.13 49,660 0.83 % 156.19
osem osem 136,419 0.70 % 120.23 0 0 % 0 2,174 0.70 % 54.74 70,102 0.71 % 129.17 14,763 0.57 % 78.77 1,885 0.28 % 43.90 164 0.23 % 41.46 47,331 0.79 % 148.87
trije t rije 127,337 0.65 % 112.22 0 0 % 0 4,412 1.41 % 111.09 66,917 0.68 % 123.30 17,756 0.69 % 94.74 2,898 0.42 % 67.50 194 0.27 % 49.04 35,160 0.58 % 110.59
enem enem 125,075 0.64 % 110.23 4 9.52 % 412.07 3,694 1.18 % 93.01 55,098 0.56 % 101.52 24,108 0.93 % 128.63 4,278 0.62 % 99.64 568 0.79 % 143.58 37,325 0.62 % 117.40

1.2.140 List of final character-level 5-grams from numeral lower-case word forms in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-word_parts-numerals-lowercase_forms-final-5grams-taxonomy-entire.tsv
Lower-case word form | Rest of the word | Final part of the word | Total absolute frequency of lower-case word form | Percentage of all found lower-case word forms | Total relative frequency (per million occurrences) | Absolute frequency [SSJ.T.K.N] | Percentage [SSJ.T.K.N] | Relative frequency [SSJ.T.K.N] | Absolute frequency [SSJ.T.K.L] | Percentage [SSJ.T.K.L] | Relative frequency [SSJ.T.K.L] | Absolute frequency [SSJ.T.P.C] | Percentage [SSJ.T.P.C] | Relative frequency [SSJ.T.P.C] | Absolute frequency [SSJ.T.P.R] | Percentage [SSJ.T.P.R] | Relative frequency [SSJ.T.P.R] | Absolute frequency [SSJ.T.K.S] | Percentage [SSJ.T.K.S] | Relative frequency [SSJ.T.K.S] | Absolute frequency [SSJ.T.D] | Percentage [SSJ.T.D] | Relative frequency [SSJ.T.D] | Absolute frequency [SSJ.I] | Percentage [SSJ.I] | Relative frequency [SSJ.I]
drugi drugi 820,060 7.24 % 722.71 3 10.71 % 309.06 24,165 10.67 % 608.45 392,784 6.63 % 723.73 128,927 8.87 % 687.91 34,299 10.95 % 798.86 3,773 8.08 % 953.77 236,109 7.02 % 742.62
drugih d rugih 462,975 4.09 % 408.02 0 0 % 0 8,523 3.76 % 214.60 224,951 3.80 % 414.49 80,742 5.55 % 430.81 25,540 8.15 % 594.86 3,960 8.48 % 1,001.05 119,259 3.54 % 375.10
druge druge 404,732 3.57 % 356.69 1 3.57 % 103.02 10,596 4.68 % 266.80 187,972 3.17 % 346.35 71,965 4.95 % 383.98 23,722 7.57 % 552.51 4,453 9.54 % 1,125.67 106,023 3.15 % 333.47
drugim d rugim 396,234 3.50 % 349.20 3 10.71 % 309.06 6,249 2.76 % 157.34 173,675 2.93 % 320.01 49,406 3.40 % 263.61 10,022 3.20 % 233.42 1,036 2.22 % 261.89 155,843 4.63 % 490.17
drugo drugo 330,167 2.92 % 290.97 10 35.71 % 1,030.18 13,160 5.81 % 331.35 160,253 2.71 % 295.28 53,640 3.69 % 286.21 13,845 4.42 % 322.47 2,029 4.34 % 512.91 87,230 2.59 % 274.36
tisoč tisoč 301,750 2.67 % 265.93 0 0 % 0 5,082 2.24 % 127.96 194,101 3.28 % 357.64 36,893 2.54 % 196.85 3,323 1.06 % 77.40 257 0.55 % 64.97 62,094 1.85 % 195.30
štiri štiri 296,946 2.62 % 261.70 1 3.57 % 103.02 5,534 2.44 % 139.34 148,934 2.52 % 274.42 40,692 2.80 % 217.12 5,876 1.88 % 136.86 349 0.75 % 88.22 95,560 2.84 % 300.56
drugega dr ugega 294,080 2.60 % 259.17 1 3.57 % 103.02 20,272 8.95 % 510.43 126,128 2.13 % 232.40 51,562 3.55 % 275.12 14,473 4.62 % 337.09 2,891 6.19 % 730.81 78,753 2.34 % 247.70
enega enega 257,431 2.27 % 226.87 0 0 % 0 8,961 3.96 % 225.63 117,647 1.99 % 216.77 42,696 2.94 % 227.81 8,810 2.81 % 205.20 1,025 2.19 % 259.11 78,292 2.33 % 246.25
prvem prvem 250,842 2.21 % 221.07 0 0 % 0 2,705 1.19 % 68.11 125,127 2.11 % 230.55 26,820 1.84 % 143.10 5,363 1.71 % 124.91 1,064 2.28 % 268.97 89,763 2.67 % 282.33
deset deset 246,223 2.17 % 217 0 0 % 0 6,849 3.02 % 172.45 128,781 2.17 % 237.29 34,149 2.35 % 182.21 3,845 1.23 % 89.55 285 0.61 % 72.04 72,314 2.15 % 227.45
druga druga 207,331 1.83 % 182.72 0 0 % 0 6,455 2.85 % 162.53 97,666 1.65 % 179.96 37,580 2.58 % 200.51 10,927 3.49 % 254.50 1,461 3.13 % 369.33 53,242 1.58 % 167.46
drugem d rugem 197,297 1.74 % 173.88 1 3.57 % 103.02 4,399 1.94 % 110.76 92,410 1.56 % 170.27 19,781 1.36 % 105.54 5,479 1.75 % 127.61 1,112 2.38 % 281.10 74,115 2.20 % 233.11
prvih prvih 176,236 1.56 % 155.32 0 0 % 0 2,002 0.88 % 50.41 89,275 1.51 % 164.50 24,291 1.67 % 129.61 4,882 1.56 % 113.71 270 0.58 % 68.25 55,516 1.65 % 174.61
dvema dvema 172,686 1.52 % 152.19 2 7.14 % 206.04 4,089 1.80 % 102.96 85,914 1.45 % 158.30 27,603 1.90 % 147.28 4,285 1.37 % 99.80 268 0.57 % 67.75 50,525 1.50 % 158.91
prvega p rvega 170,665 1.51 % 150.41 0 0 % 0 2,487 1.10 % 62.62 81,130 1.37 % 149.49 19,285 1.33 % 102.90 4,378 1.40 % 101.97 3,542 7.59 % 895.38 59,843 1.78 % 188.22
tretji t retji 145,441 1.28 % 128.18 0 0 % 0 1,990 0.88 % 50.11 70,342 1.19 % 129.61 16,447 1.13 % 87.76 3,373 1.08 % 78.56 686 1.47 % 173.41 52,603 1.56 % 165.45
drugimi dr ugimi 144,266 1.27 % 127.14 0 0 % 0 3,342 1.48 % 84.15 66,744 1.13 % 122.98 28,301 1.95 % 151 9,385 3.00 % 218.59 938 2.01 % 237.12 35,556 1.06 % 111.83
štirih š tirih 143,128 1.26 % 126.14 0 0 % 0 2,238 0.99 % 56.35 74,029 1.25 % 136.40 18,589 1.28 % 99.18 2,888 0.92 % 67.26 204 0.44 % 51.57 45,180 1.34 % 142.10
sedem sedem 143,046 1.26 % 126.07 0 0 % 0 2,985 1.32 % 75.16 71,859 1.21 % 132.40 16,155 1.11 % 86.20 2,252 0.72 % 52.45 135 0.29 % 34.13 49,660 1.48 % 156.19
trije trije 127,337
1.12 % 112.22 0 0 % 0 4,412 1.95 % 111.09 66,917 1.13 % 123.30 17,756 1.22 % 94.74 2,898 0.93 % 67.50 194 0.41 % 49.04 35,160 1.04 % 110.59
petih petih 122,008 1.08 % 107.53 1 3.57 % 103.02 1,848 0.82 % 46.53 62,977 1.06 % 116.04 14,811 1.02 % 79.03 1,833 0.58 % 42.69 335 0.72 % 84.68 40,203 1.20 % 126.45
tretje t retje 97,845 0.86 % 86.23 0 0 % 0 901 0.40 % 22.69 51,539 0.87 % 94.96 10,412 0.72 % 55.56 1,553 0.50 % 36.17 297 0.64 % 75.08 33,143 0.98 % 104.24
desetih de setih 86,072 0.76 % 75.85 0 0 % 0 1,844 0.81 % 46.43 45,492 0.77 % 83.82 12,654 0.87 % 67.52 1,244 0.40 % 28.97 163 0.35 % 41.20 24,675 0.73 % 77.61
tremi tremi 84,540 0.75 % 74.50 1 3.57 % 103.02 1,599 0.71 % 40.26 43,950 0.74 % 80.98 12,581 0.86 % 67.13 1,509 0.48 % 35.15 83 0.18 % 20.98 24,817 0.74 % 78.06
devet devet 83,521 0.74 % 73.61 1 3.57 % 103.02 1,452 0.64 % 36.56 42,137 0.71 % 77.64 8,310 0.57 % 44.34 1,062 0.34 % 24.74 69 0.15 % 17.44 30,490 0.91 % 95.90
šestih š estih 80,748 0.71 % 71.16 0 0 % 0 1,400 0.62 % 35.25 41,731 0.70 % 76.89 9,460 0.65 % 50.48 1,146 0.37 % 26.69 381 0.82 % 96.31 26,630 0.79 % 83.76
tretjem tr etjem 61,704 0.55 % 54.38 0 0 % 0 680 0.30 % 17.12 28,203 0.48 % 51.97 5,627 0.39 % 30.02 1,446 0.46 % 33.68 322 0.69 % 81.40 25,426 0.76 % 79.97
štirje š tirje 61,129 0.54 % 53.87 0 0 % 0 1,629 0.72 % 41.02 32,123 0.54 % 59.19 8,065 0.56 % 43.03 1,300 0.41 % 30.28 55 0.12 % 13.90 17,957 0.53 % 56.48
sedmih s edmih 58,162 0.51 % 51.26 0 0 % 0 1,391 0.61 % 35.02 30,258 0.51 % 55.75 6,643 0.46 % 35.44 918 0.29 % 21.38 111 0.24 % 28.06 18,841 0.56 % 59.26
dvajset dv ajset 57,623 0.51 % 50.78 0 0 % 0 3,915 1.73 % 98.58 32,574 0.55 % 60.02 11,044 0.76 % 58.93 1,459 0.47 % 33.98 87 0.19 % 21.99 8,544 0.25 % 26.87
štirimi št irimi 55,654 0.49 % 49.05 1 3.57 % 103.02 836 0.37 % 21.05 29,008 0.49 % 53.45 8,278 0.57 % 44.17 831 0.27 % 19.35 36 0.08 % 9.10 16,664 0.49 % 52.41

1.2.141 List of initial character-level 1-grams from preposition lemmas in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-word_parts-prepositions-lemmas-initial-1grams-taxonomy-entire.tsv
Lemma | Lemma (lower-case) | Initial part of the word | Rest of the word | Total absolute frequency of lemma | Percentage of all found lemmas | Total relative frequency (per million occurrences) | Absolute frequency [SSJ.T.K.S] | Percentage [SSJ.T.K.S] | Relative frequency [SSJ.T.K.S] | Absolute frequency [SSJ.I] | Percentage [SSJ.I] | Relative frequency [SSJ.I] | Absolute frequency [SSJ.T.K.N] | Percentage [SSJ.T.K.N] | Relative frequency [SSJ.T.K.N] | Absolute frequency [SSJ.T.P.C] | Percentage [SSJ.T.P.C] | Relative frequency [SSJ.T.P.C] | Absolute frequency [SSJ.T.D] | Percentage [SSJ.T.D] | Relative frequency [SSJ.T.D] | Absolute frequency [SSJ.T.P.R] | Percentage [SSJ.T.P.R] | Relative frequency [SSJ.T.P.R] | Absolute frequency [SSJ.T.K.L] | Percentage [SSJ.T.K.L] | Relative frequency [SSJ.T.K.L]
v v v 31,818,378 25.51 % 28,041.37 1,138,445 25.83 % 26,515.74 9,553,100 26.12 % 30,046.99 299 22.72 % 30,802.51 15,532,386 25.82 % 28,619.45 103,256 23.08 % 26,102 4,674,294 23.87 % 24,940.50 816,598 23.02 % 20,561.05
na na n a 19,072,589 15.29 % 16,808.58 631,488 14.33 % 14,708.11 5,747,052 15.72 % 18,075.97 248 18.84 % 25,548.57 9,072,361 15.08 % 16,716.43 56,074 12.53 % 14,174.90 3,009,439 15.37 % 16,057.38 555,927 15.67 % 13,997.64
z z z 15,732,362 12.62 % 13,864.85 626,742 14.22 % 14,597.57 4,402,058 12.04 % 13,845.62 403 30.62 % 41,516.43 7,267,869 12.08 % 13,391.53 58,748 13.13 % 14,850.86 2,841,173 14.51 % 15,159.57 535,369 15.09 % 13,480.01
za za z a 15,195,407 12.18 % 13,391.64 436,419 9.90 % 10,164.72 4,564,728 12.48 % 14,357.26 88 6.69 % 9,065.62 7,527,662 12.51 % 13,870.22 67,714 15.13 % 17,117.37 2,303,946 11.77 % 12,293.10 294,850 8.31 % 7,424
po po p o 6,360,566 5.10 % 5,605.53 196,183 4.45 % 4,569.34 1,959,908 5.36 % 6,164.42 64 4.86 % 6,593.18 3,054,588 5.08 % 5,628.28 21,577 4.82 % 5,454.43 936,691 4.78 %
4,997.88 191,555 5.40 % 4,823.15
iz iz i z 4,006,518 3.21 % 3,530.92 159,021 3.61 % 3,703.79 1,060,540 2.90 % 3,335.67 37 2.81 % 3,811.68 2,017,138 3.35 % 3,716.71 23,762 5.31 % 6,006.78 611,339 3.12 % 3,261.90 134,681 3.80 % 3,391.12
pri pri p ri 3,934,055 3.15 % 3,467.06 178,024 4.04 % 4,146.39 1,018,794 2.79 % 3,204.37 47 3.57 % 4,841.87 1,888,953 3.14 % 3,480.52 15,360 3.43 % 3,882.84 756,619 3.86 % 4,037.07 76,258 2.15 % 1,920.09
od od o d 3,855,989 3.09 % 3,398.26 157,936 3.58 % 3,678.52 1,061,423 2.90 % 3,338.45 20 1.52 % 2,060.37 1,843,270 3.06 % 3,396.35 13,534 3.02 % 3,421.25 638,050 3.26 % 3,404.43 141,756 4.00 % 3,569.26
o o o 3,670,387 2.94 % 3,234.69 130,867 2.97 % 3,048.05 1,103,026 3.02 % 3,469.30 1 0.08 % 103.02 1,828,963 3.04 % 3,369.99 24,271 5.42 % 6,135.45 488,092 2.49 % 2,604.30 95,167 2.68 % 2,396.20
do do d o 3,434,842 2.75 % 3,027.11 128,879 2.92 % 3,001.75 976,111 2.67 % 3,070.12 16 1.22 % 1,648.30 1,686,736 2.80 % 3,107.92 13,221 2.96 % 3,342.13 544,963 2.78 % 2,907.74 84,916 2.39 % 2,138.09
med med m ed 3,005,484 2.41 % 2,648.72 125,226 2.84 % 2,916.66 906,664 2.48 % 2,851.69 19 1.44 % 1,957.35 1,425,352 2.37 % 2,626.31 7,107 1.59 % 1,796.57 481,142 2.46 % 2,567.22 59,974 1.69 % 1,510.08
ob ob o b 2,529,897 2.03 % 2,229.59 68,415 1.55 % 1,593.47 700,467 1.92 % 2,203.15 9 0.68 % 927.17 1,338,788 2.23 % 2,466.81 5,209 1.16 % 1,316.78 342,313 1.75 % 1,826.47 74,696 2.10 % 1,880.76
pred pred p red 2,269,453 1.82 % 2,000.06 51,884 1.18 % 1,208.44 697,210 1.91 % 2,192.91 7 0.53 % 721.13 1,115,560 1.85 % 2,055.49 5,680 1.27 % 1,435.84 329,208 1.68 % 1,756.55 69,904 1.97 % 1,760.11
zaradi zaradi z aradi 1,792,327 1.44 % 1,579.57 53,002 1.20 % 1,234.48 569,966 1.56 % 1,792.69 0 0 % 0 861,596 1.43 % 1,587.55 5,122 1.15 % 1,294.79 266,969 1.36 % 1,424.46 35,672 1.00 % 898.18
brez brez b rez 895,603 0.72 % 789.29 31,130 0.71 % 725.05 238,407 0.65 % 749.85 11 0.84 % 1,133.20 419,896 0.70 % 773.69 2,893 0.65 % 731.32 166,614 0.85 % 889 36,652 1.03 % 922.86
k k k 881,577 0.71 % 776.93 48,583 1.10 % 1,131.56 227,339 0.62 % 715.04 14 1.06 % 1,442.26 391,830 0.65 % 721.97 3,769 0.84 % 952.76 146,443 0.75 % 781.37 63,599 1.79 % 1,601.35
proti proti p roti 875,538 0.70 % 771.61 31,406 0.71 % 731.48 296,441 0.81 % 932.38 4 0.30 % 412.07 400,159 0.67 % 737.32 2,536 0.57 % 641.07 105,654 0.54 % 563.74 39,338 1.11 % 990.49
pod pod p od 771,297 0.62 % 679.74 31,206 0.71 % 726.82 211,135 0.58 % 664.07 6 0.46 % 618.11 357,676 0.59 % 659.04 3,253 0.73 % 822.32 131,152 0.67 % 699.78 36,869 1.04 % 928.32
poleg poleg p oleg 592,281 0.47 % 521.97 19,376 0.44 % 451.29 155,163 0.42 % 488.03 0 0 % 0 284,760 0.47 % 524.69 1,216 0.27 % 307.39 118,883 0.61 % 634.32 12,883 0.36 % 324.38
nad nad n ad 553,355 0.44 % 487.67 24,668 0.56 % 574.55 152,855 0.42 % 480.77 1 0.08 % 103.02 259,821 0.43 % 478.74 1,840 0.41 % 465.13 89,316 0.46 % 476.56 24,854 0.70 % 625.80
kljub kljub k ljub 509,251 0.41 % 448.80 13,018 0.29 % 303.20 130,869 0.36 % 411.62 0 0 % 0 262,664 0.44 % 483.98 835 0.19 % 211.08 90,789 0.46 % 484.42 11,076 0.31 % 278.88
čez čez č ez 342,085 0.27 % 301.48 12,802 0.29 % 298.17 76,727 0.21 % 241.33 2 0.15 % 206.04 156,143 0.26 % 287.70 764 0.17 % 193.13 63,586 0.33 % 339.27 32,061 0.90 % 807.26
glede glede g lede 271,297 0.22 % 239.09 8,095 0.18 % 188.54 121,462 0.33 % 382.03 0 0 % 0 106,244 0.18 % 195.76 1,543 0.34 % 390.05 30,305 0.15 % 161.70 3,648 0.10 % 91.85
skozi skozi s kozi 261,306 0.21 % 230.29 16,616 0.38 % 387.01 59,303 0.16 % 186.52 2 0.15 % 206.04 106,627 0.18 % 196.47 957 0.21 % 241.92 51,512 0.26 % 274.85 26,289 0.74 % 661.93
izmed izmed i zmed 231,718 0.19 % 204.21 10,396 0.24 % 242.14 68,140 0.19 % 214.32 0 0 % 0 98,094 0.16 % 180.74 698 0.16 % 176.45 47,539 0.24 % 253.65 6,851 0.19 % 172.50
prek prek p rek 230,294 0.18 % 202.96 7,496 0.17 % 174.59 68,452 0.19 % 215.30 0 0 % 0 100,698 0.17 % 185.54 543 0.12 % 137.26 49,035 0.25 % 261.63 4,070 0.12 % 102.48
konec konec k onec 210,897 0.17 % 185.86 3,402 0.08
% 79.24 66,489 0.18 % 209.13 0 0 % 0 112,615 0.19 % 207.50 187 0.04 % 47.27 25,891 0.13 % 138.15 2,313 0.07 % 58.24
okoli okoli o koli 187,237 0.15 % 165.01 7,963 0.18 % 185.47 53,728 0.15 % 168.99 1 0.08 % 103.02 81,299 0.14 % 149.80 320 0.07 % 80.89 33,734 0.17 % 179.99 10,192 0.29 % 256.62
namesto namesto n amesto 180,360 0.14 % 158.95 9,030 0.20 % 210.32 42,657 0.12 % 134.17 5 0.38 % 515.09 81,373 0.14 % 149.94 634 0.14 % 160.27 38,674 0.20 % 206.35 7,987 0.23 % 201.10
sredi sredi s redi 155,626 0.12 % 137.15 5,109 0.12 % 118.99 39,051 0.11 % 122.83 0 0 % 0 75,945 0.13 % 139.93 228 0.05 % 57.64 25,698 0.13 % 137.12 9,595 0.27 % 241.59
preko preko p reko 107,027 0.09 % 94.32 5,005 0.11 % 116.57 35,516 0.10 % 111.71 9 0.68 % 927.17 43,418 0.07 % 80 563 0.13 % 142.32 20,743 0.11 % 110.68 1,773 0.05 % 44.64
zoper zoper z oper 96,349 0.08 % 84.91 2,034 0.05 % 47.37 37,620 0.10 % 118.32 0 0 % 0 48,608 0.08 % 89.56 986 0.22 % 249.25 6,457 0.03 % 34.45 644 0.02 % 16.22

1.2.142 List of initial character-level 2-grams from preposition lemmas in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-word_parts-prepositions-lemmas-initial-2grams-taxonomy-entire.tsv
Lemma | Lemma (lower-case) | Initial part of the word | Rest of the word | Total absolute frequency of lemma | Percentage of all found lemmas | Total relative frequency (per million occurrences) | Absolute frequency [SSJ.T.K.S] | Percentage [SSJ.T.K.S] | Relative frequency [SSJ.T.K.S] | Absolute frequency [SSJ.I] | Percentage [SSJ.I] | Relative frequency [SSJ.I] | Absolute frequency [SSJ.T.K.N] | Percentage [SSJ.T.K.N] | Relative frequency [SSJ.T.K.N] | Absolute frequency [SSJ.T.P.C] | Percentage [SSJ.T.P.C] | Relative frequency [SSJ.T.P.C] | Absolute frequency [SSJ.T.D] | Percentage [SSJ.T.D] | Relative frequency [SSJ.T.D] | Absolute frequency [SSJ.T.P.R] | Percentage [SSJ.T.P.R] | Relative frequency [SSJ.T.P.R] | Absolute frequency [SSJ.T.K.L] | Percentage [SSJ.T.K.L] | Relative frequency [SSJ.T.K.L]
na na na 19,072,589 26.27 % 16,808.58 631,488 25.64 % 14,708.11 5,747,052 27.01 % 18,075.97 248 41.47 % 25,548.57 9,072,361 25.82 % 16,716.43 56,074 21.79 % 14,174.90 3,009,439 26.34 % 16,057.38 555,927 27.30 % 13,997.64
za za za 15,195,407 20.93 % 13,391.64 436,419 17.72 % 10,164.72 4,564,728 21.45 % 14,357.26 88 14.72 % 9,065.62 7,527,662 21.42 % 13,870.22 67,714 26.31 % 17,117.37 2,303,946 20.16 % 12,293.10 294,850 14.48 % 7,424
po po po 6,360,566 8.76 % 5,605.53 196,183 7.97 % 4,569.34 1,959,908 9.21 % 6,164.42 64 10.70 % 6,593.18 3,054,588 8.69 % 5,628.28 21,577 8.38 % 5,454.43 936,691 8.20 % 4,997.88 191,555 9.40 % 4,823.15
iz iz iz 4,006,518 5.52 % 3,530.92 159,021 6.46 % 3,703.79 1,060,540 4.98 % 3,335.67 37 6.19 % 3,811.68 2,017,138 5.74 % 3,716.71 23,762 9.23 % 6,006.78 611,339 5.35 % 3,261.90 134,681 6.61 % 3,391.12
pri pri pr i 3,934,055 5.42 % 3,467.06 178,024 7.23 % 4,146.39 1,018,794 4.79 % 3,204.37 47 7.86 % 4,841.87 1,888,953 5.38 % 3,480.52 15,360 5.97 % 3,882.84 756,619 6.62 % 4,037.07 76,258 3.74 % 1,920.09
od od od 3,855,989 5.31 % 3,398.26 157,936 6.41 % 3,678.52 1,061,423 4.99 % 3,338.45 20 3.34 % 2,060.37 1,843,270 5.25 % 3,396.35 13,534 5.26 % 3,421.25 638,050 5.58 % 3,404.43 141,756 6.96 % 3,569.26
do do do 3,434,842 4.73 % 3,027.11 128,879 5.23 % 3,001.75 976,111 4.59 % 3,070.12 16 2.68 % 1,648.30 1,686,736 4.80 % 3,107.92 13,221 5.14 % 3,342.13 544,963 4.77 % 2,907.74 84,916 4.17 % 2,138.09
med med me d 3,005,484 4.14 % 2,648.72 125,226 5.08 % 2,916.66 906,664 4.26 % 2,851.69 19 3.18 % 1,957.35 1,425,352 4.06 % 2,626.31 7,107 2.76 % 1,796.57 481,142 4.21 % 2,567.22 59,974 2.94 % 1,510.08
ob ob ob 2,529,897 3.48 % 2,229.59 68,415 2.78 % 1,593.47 700,467 3.29 % 2,203.15 9 1.50 % 927.17 1,338,788 3.81 % 2,466.81 5,209 2.02 % 1,316.78 342,313 3.00 % 1,826.47 74,696 3.67 % 1,880.76
pred pred pr ed 2,269,453 3.13 % 2,000.06 51,884 2.11 % 1,208.44 697,210 3.28 % 2,192.91 7 1.17 % 721.13 1,115,560 3.17
% 2,055.49 5,680 2.21 % 1,435.84 329,208 2.88 % 1,756.55 69,904 3.43 % 1,760.11
zaradi zaradi za radi 1,792,327 2.47 % 1,579.57 53,002 2.15 % 1,234.48 569,966 2.68 % 1,792.69 0 0 % 0 861,596 2.45 % 1,587.55 5,122 1.99 % 1,294.79 266,969 2.34 % 1,424.46 35,672 1.75 % 898.18
brez brez br ez 895,603 1.23 % 789.29 31,130 1.26 % 725.05 238,407 1.12 % 749.85 11 1.84 % 1,133.20 419,896 1.20 % 773.69 2,893 1.12 % 731.32 166,614 1.46 % 889 36,652 1.80 % 922.86
proti proti pr oti 875,538 1.21 % 771.61 31,406 1.27 % 731.48 296,441 1.39 % 932.38 4 0.67 % 412.07 400,159 1.14 % 737.32 2,536 0.98 % 641.07 105,654 0.93 % 563.74 39,338 1.93 % 990.49
pod pod po d 771,297 1.06 % 679.74 31,206 1.27 % 726.82 211,135 0.99 % 664.07 6 1.00 % 618.11 357,676 1.02 % 659.04 3,253 1.26 % 822.32 131,152 1.15 % 699.78 36,869 1.81 % 928.32
poleg poleg po leg 592,281 0.82 % 521.97 19,376 0.79 % 451.29 155,163 0.73 % 488.03 0 0 % 0 284,760 0.81 % 524.69 1,216 0.47 % 307.39 118,883 1.04 % 634.32 12,883 0.63 % 324.38
nad nad na d 553,355 0.76 % 487.67 24,668 1.00 % 574.55 152,855 0.72 % 480.77 1 0.17 % 103.02 259,821 0.74 % 478.74 1,840 0.71 % 465.13 89,316 0.78 % 476.56 24,854 1.22 % 625.80
kljub kljub kl jub 509,251 0.70 % 448.80 13,018 0.53 % 303.20 130,869 0.61 % 411.62 0 0 % 0 262,664 0.75 % 483.98 835 0.32 % 211.08 90,789 0.79 % 484.42 11,076 0.54 % 278.88
čez čez če z 342,085 0.47 % 301.48 12,802 0.52 % 298.17 76,727 0.36 % 241.33 2 0.33 % 206.04 156,143 0.44 % 287.70 764 0.30 % 193.13 63,586 0.56 % 339.27 32,061 1.57 % 807.26
glede glede gl ede 271,297 0.37 % 239.09 8,095 0.33 % 188.54 121,462 0.57 % 382.03 0 0 % 0 106,244 0.30 % 195.76 1,543 0.60 % 390.05 30,305 0.27 % 161.70 3,648 0.18 % 91.85
skozi skozi sk ozi 261,306 0.36 % 230.29 16,616 0.68 % 387.01 59,303 0.28 % 186.52 2 0.33 % 206.04 106,627 0.30 % 196.47 957 0.37 % 241.92 51,512 0.45 % 274.85 26,289 1.29 % 661.93
izmed izmed iz med 231,718 0.32 % 204.21 10,396 0.42 % 242.14 68,140 0.32 % 214.32 0 0 % 0 98,094 0.28 % 180.74 698 0.27
% 176.45 47,539 0.42 % 253.65 6,851 0.34 % 172.50
prek prek pr ek 230,294 0.32 % 202.96 7,496 0.30 % 174.59 68,452 0.32 % 215.30 0 0 % 0 100,698 0.29 % 185.54 543 0.21 % 137.26 49,035 0.43 % 261.63 4,070 0.20 % 102.48
konec konec ko nec 210,897 0.29 % 185.86 3,402 0.14 % 79.24 66,489 0.31 % 209.13 0 0 % 0 112,615 0.32 % 207.50 187 0.07 % 47.27 25,891 0.23 % 138.15 2,313 0.11 % 58.24
okoli okoli ok oli 187,237 0.26 % 165.01 7,963 0.32 % 185.47 53,728 0.25 % 168.99 1 0.17 % 103.02 81,299 0.23 % 149.80 320 0.12 % 80.89 33,734 0.29 % 179.99 10,192 0.50 % 256.62
namesto namesto na mesto 180,360 0.25 % 158.95 9,030 0.37 % 210.32 42,657 0.20 % 134.17 5 0.84 % 515.09 81,373 0.23 % 149.94 634 0.25 % 160.27 38,674 0.34 % 206.35 7,987 0.39 % 201.10
sredi sredi sr edi 155,626 0.21 % 137.15 5,109 0.21 % 118.99 39,051 0.18 % 122.83 0 0 % 0 75,945 0.22 % 139.93 228 0.09 % 57.64 25,698 0.23 % 137.12 9,595 0.47 % 241.59
preko preko pr eko 107,027 0.15 % 94.32 5,005 0.20 % 116.57 35,516 0.17 % 111.71 9 1.50 % 927.17 43,418 0.12 % 80 563 0.22 % 142.32 20,743 0.18 % 110.68 1,773 0.09 % 44.64
zoper zoper zo per 96,349 0.13 % 84.91 2,034 0.08 % 47.37 37,620 0.18 % 118.32 0 0 % 0 48,608 0.14 % 89.56 986 0.38 % 249.25 6,457 0.06 % 34.45 644 0.03 % 16.22
okrog okrog ok rog 92,380 0.13 % 81.41 5,996 0.24 % 139.65 15,388 0.07 % 48.40 0 0 % 0 42,736 0.12 % 78.74 304 0.12 % 76.85 16,810 0.15 % 89.69 11,146 0.55 % 280.64
znotraj znotraj zn otraj 80,944 0.11 % 71.34 6,149 0.25 % 143.22 26,268 0.12 % 82.62 0 0 % 0 33,649 0.10 % 62 333 0.13 % 84.18 13,667 0.12 % 72.92 878 0.04 % 22.11
razen razen ra zen 71,460 0.10 % 62.98 3,333 0.14 % 77.63 12,931 0.06 % 40.67 1 0.17 % 103.02 35,674 0.10 % 65.73 840 0.33 % 212.34 13,685 0.12 % 73.02 4,996 0.24 % 125.79
mimo mimo mi mo 70,089 0.10 % 61.77 2,943 0.12 % 68.55 15,955 0.07 % 50.18 0 0 % 0 32,735 0.09 % 60.32 175 0.07 % 44.24 10,508 0.09 % 56.07 7,773 0.38 % 195.72

1.2.143 List of initial character-level 3-grams from preposition lemmas in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-word_parts-prepositions-lemmas-initial-3grams-taxonomy-entire.tsv
Lemma | Lemma (lower-case) | Initial part of the word | Rest of the word | Total absolute frequency of lemma | Percentage of all found lemmas | Total relative frequency (per million occurrences) | Absolute frequency [SSJ.T.K.S] | Percentage [SSJ.T.K.S] | Relative frequency [SSJ.T.K.S] | Absolute frequency [SSJ.I] | Percentage [SSJ.I] | Relative frequency [SSJ.I] | Absolute frequency [SSJ.T.K.N] | Percentage [SSJ.T.K.N] | Relative frequency [SSJ.T.K.N] | Absolute frequency [SSJ.T.P.C] | Percentage [SSJ.T.P.C] | Relative frequency [SSJ.T.P.C] | Absolute frequency [SSJ.T.D] | Percentage [SSJ.T.D] | Relative frequency [SSJ.T.D] | Absolute frequency [SSJ.T.P.R] | Percentage [SSJ.T.P.R] | Relative frequency [SSJ.T.P.R] | Absolute frequency [SSJ.T.K.L] | Percentage [SSJ.T.K.L] | Relative frequency [SSJ.T.K.L]
pri pri pri 3,934,055 21.68 % 3,467.06 178,024 26.02 % 4,146.39 1,018,794 19.56 % 3,204.37 47 40.52 % 4,841.87 1,888,953 21.98 % 3,480.52 15,360 27.30 % 3,882.84 756,619 24.88 % 4,037.07 76,258 13.66 % 1,920.09
med med med 3,005,484 16.56 % 2,648.72 125,226 18.30 % 2,916.66 906,664 17.40 % 2,851.69 19 16.38 % 1,957.35 1,425,352 16.58 % 2,626.31 7,107 12.63 % 1,796.57 481,142 15.82 % 2,567.22 59,974 10.74 % 1,510.08
pred pred pre d 2,269,453 12.51 % 2,000.06 51,884 7.58 % 1,208.44 697,210 13.38 % 2,192.91 7 6.03 % 721.13 1,115,560 12.98 % 2,055.49 5,680 10.10 % 1,435.84 329,208 10.83 % 1,756.55 69,904 12.52 % 1,760.11
zaradi zaradi zar adi 1,792,327 9.88 % 1,579.57 53,002 7.75 % 1,234.48 569,966 10.94 % 1,792.69 0 0 % 0 861,596 10.02 % 1,587.55 5,122 9.11 % 1,294.79 266,969 8.78 % 1,424.46 35,672 6.39 % 898.18
brez brez bre z 895,603 4.94 % 789.29 31,130 4.55 % 725.05 238,407 4.58 % 749.85 11 9.48 % 1,133.20 419,896 4.88 % 773.69 2,893 5.14 % 731.32 166,614 5.48 % 889 36,652 6.56 % 922.86
proti proti pro ti
875,538 4.83 % 771.61 31,406 4.59 % 731.48 296,441 5.69 % 932.38 4 3.45 % 412.07 400,159 4.66 % 737.32 2,536 4.51 % 641.07 105,654 3.48 % 563.74 39,338 7.04 % 990.49
pod pod pod 771,297 4.25 % 679.74 31,206 4.56 % 726.82 211,135 4.05 % 664.07 6 5.17 % 618.11 357,676 4.16 % 659.04 3,253 5.78 % 822.32 131,152 4.31 % 699.78 36,869 6.60 % 928.32
poleg poleg pol eg 592,281 3.26 % 521.97 19,376 2.83 % 451.29 155,163 2.98 % 488.03 0 0 % 0 284,760 3.31 % 524.69 1,216 2.16 % 307.39 118,883 3.91 % 634.32 12,883 2.31 % 324.38
nad nad nad 553,355 3.05 % 487.67 24,668 3.60 % 574.55 152,855 2.93 % 480.77 1 0.86 % 103.02 259,821 3.02 % 478.74 1,840 3.27 % 465.13 89,316 2.94 % 476.56 24,854 4.45 % 625.80
kljub kljub klj ub 509,251 2.81 % 448.80 13,018 1.90 % 303.20 130,869 2.51 % 411.62 0 0 % 0 262,664 3.06 % 483.98 835 1.48 % 211.08 90,789 2.99 % 484.42 11,076 1.98 % 278.88
čez čez čez 342,085 1.89 % 301.48 12,802 1.87 % 298.17 76,727 1.47 % 241.33 2 1.72 % 206.04 156,143 1.82 % 287.70 764 1.36 % 193.13 63,586 2.09 % 339.27 32,061 5.74 % 807.26
glede glede gle de 271,297 1.50 % 239.09 8,095 1.18 % 188.54 121,462 2.33 % 382.03 0 0 % 0 106,244 1.24 % 195.76 1,543 2.74 % 390.05 30,305 1.00 % 161.70 3,648 0.65 % 91.85
skozi skozi sko zi 261,306 1.44 % 230.29 16,616 2.43 % 387.01 59,303 1.14 % 186.52 2 1.72 % 206.04 106,627 1.24 % 196.47 957 1.70 % 241.92 51,512 1.69 % 274.85 26,289 4.71 % 661.93
izmed izmed izm ed 231,718 1.28 % 204.21 10,396 1.52 % 242.14 68,140 1.31 % 214.32 0 0 % 0 98,094 1.14 % 180.74 698 1.24 % 176.45 47,539 1.56 % 253.65 6,851 1.23 % 172.50
prek prek pre k 230,294 1.27 % 202.96 7,496 1.09 % 174.59 68,452 1.31 % 215.30 0 0 % 0 100,698 1.17 % 185.54 543 0.96 % 137.26 49,035 1.61 % 261.63 4,070 0.73 % 102.48
konec konec kon ec 210,897 1.16 % 185.86 3,402 0.50 % 79.24 66,489 1.28 % 209.13 0 0 % 0 112,615 1.31 % 207.50 187 0.33 % 47.27 25,891 0.85 % 138.15 2,313 0.41 % 58.24
okoli okoli oko li 187,237 1.03 % 165.01 7,963 1.16 % 185.47 53,728 1.03 % 168.99 1 0.86 % 103.02 81,299 0.95 % 149.80 320 0.57 % 80.89 33,734 1.11 % 179.99 10,192 1.82 % 256.62
namesto namesto nam esto 180,360 0.99 % 158.95 9,030 1.32 % 210.32 42,657 0.82 % 134.17 5 4.31 % 515.09 81,373 0.95 % 149.94 634 1.13 % 160.27 38,674 1.27 % 206.35 7,987 1.43 % 201.10
sredi sredi sre di 155,626 0.86 % 137.15 5,109 0.75 % 118.99 39,051 0.75 % 122.83 0 0 % 0 75,945 0.88 % 139.93 228 0.41 % 57.64 25,698 0.84 % 137.12 9,595 1.72 % 241.59
preko preko pre ko 107,027 0.59 % 94.32 5,005 0.73 % 116.57 35,516 0.68 % 111.71 9 7.76 % 927.17 43,418 0.51 % 80 563 1.00 % 142.32 20,743 0.68 % 110.68 1,773 0.32 % 44.64
zoper zoper zop er 96,349 0.53 % 84.91 2,034 0.30 % 47.37 37,620 0.72 % 118.32 0 0 % 0 48,608 0.57 % 89.56 986 1.75 % 249.25 6,457 0.21 % 34.45 644 0.12 % 16.22
okrog okrog okr og 92,380 0.51 % 81.41 5,996 0.88 % 139.65 15,388 0.29 % 48.40 0 0 % 0 42,736 0.50 % 78.74 304 0.54 % 76.85 16,810 0.55 % 89.69 11,146 2.00 % 280.64
znotraj znotraj zno traj 80,944 0.45 % 71.34 6,149 0.90 % 143.22 26,268 0.50 % 82.62 0 0 % 0 33,649 0.39 % 62 333 0.59 % 84.18 13,667 0.45 % 72.92 878 0.16 % 22.11
razen razen raz en 71,460 0.39 % 62.98 3,333 0.49 % 77.63 12,931 0.25 % 40.67 1 0.86 % 103.02 35,674 0.41 % 65.73 840 1.49 % 212.34 13,685 0.45 % 73.02 4,996 0.90 % 125.79
mimo mimo mim o 70,089 0.39 % 61.77 2,943 0.43 % 68.55 15,955 0.31 % 50.18 0 0 % 0 32,735 0.38 % 60.32 175 0.31 % 44.24 10,508 0.35 % 56.07 7,773 1.39 % 195.72
blizu blizu bli zu 64,672 0.36 % 57 2,885 0.42 % 67.20 20,277 0.39 % 63.78 0 0 % 0 29,850 0.35 % 55 105 0.19 % 26.54 8,372 0.28 % 44.67 3,183 0.57 % 80.14
zunaj zunaj zun aj 58,074 0.32 % 51.18 3,363 0.49 % 78.33 14,572 0.28 % 45.83 0 0 % 0 29,682 0.34 % 54.69 284 0.51 % 71.79 8,737 0.29 % 46.62 1,436 0.26 % 36.16
izven izven izv en 28,190 0.15 % 24.84 1,401 0.20 % 32.63 10,009 0.19 % 31.48 0 0 % 0 12,402 0.14 % 22.85 262 0.47 % 66.23 3,795 0.12 % 20.25 321 0.06 % 8.08
izpred izpred izp red 26,041 0.14 % 22.95 510 0.07 % 11.88 5,911 0.11 % 18.59 0 0 % 0
15,218 0.18 % 28.04 29 0.05 % 7.33 3,274 0.11 % 17.47 1,099 0.20 % 27.67 izpod izpod izp od 22,299 0.12 % 19.65 939 0.14 % 21.87 4,581 0.09 % 14.41 0 0 % 0 9,967 0.12 % 18.36 51 0.09 % 12.89 4,214 0.14 % 22.48 2,547 0.46 % 64.13 navkljub navkljub nav kljub 16,205 0.09 % 14.28 471 0.07 % 10.97 3,708 0.07 % 11.66 0 0 % 0 7,363 0.09 % 13.57 35 0.06 % 8.85 4,137 0.14 % 22.07 491 0.09 % 12.36 zavoljo zavoljo zav oljo 14,784 0.08 % 13.03 423 0.06 % 9.85 4,629 0.09 % 14.56 0 0 % 0 6,837 0.08 % 12.60 5 0.01 % 1.26 2,221 0.07 % 11.85 669 0.12 % 16.84 CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 160 File at CLARIN.SI 1.2.144 List of initial character-level 4-grams from preposition lemmas in the Gigafida 2.0 corpus with text-type distributionGF2.0-word_parts-prepositions-lemmas- initial-4grams-taxonomy-entire.tsvLemma Lemma (lower-case) Initial part of the word Rest of the word Total absolute frequency of lemma Percentage of all found lemmas Total relative frequency (per million occurrences) Absolute frequency [SSJ.T.K.S] Percentage [SSJ.T.K.S] Relative frequency [SSJ.T.K.S] Absolute frequency [SSJ.I] Percentage [SSJ.I] Relative frequency [SSJ.I] Absolute frequency [SSJ.T.K.N] Percentage [SSJ.T.K.N] Relative frequency [SSJ.T.K.N] Absolute frequency [SSJ.T.P.C] Percentage [SSJ.T.P.C] Relative frequency [SSJ.T.P.C] Absolute frequency [SSJ.T.D] Percentage [SSJ.T.D] Relative frequency [SSJ.T.D] Absolute frequency [SSJ.T.P.R] Percentage [SSJ.T.P.R] Relative frequency [SSJ.T.P.R] Absolute frequency [SSJ.T.K.L] Percentage [SSJ.T.K.L] Relative frequency [SSJ.T.K.L] pred pred pred 2,269,453 23.82 % 2,000.06 51,884 16.65 % 1,208.44 697,210 24.55 % 2,192.91 7 17.07 % 721.13 1,115,560 24.78 % 2,055.49 5,680 20.34 % 1,435.84 329,208 21.70 % 1,756.55 69,904 21.34 % 1,760.11 zaradi zaradi zara di 1,792,327 18.81 % 1,579.57 53,002 17.01 % 1,234.48 569,966 20.07 % 1,792.69 0 0 % 0 861,596 19.14 % 1,587.55 5,122 18.35 % 1,294.79 266,969 17.60 % 1,424.46 
35,672 10.89 % 898.18 brez brez brez 895,603 9.40 % 789.29 31,130 9.99 % 725.05 238,407 8.39 % 749.85 11 26.83 % 1,133.20 419,896 9.33 % 773.69 2,893 10.36 % 731.32 166,614 10.98 % 889 36,652 11.19 % 922.86 proti proti prot i 875,538 9.19 % 771.61 31,406 10.08 % 731.48 296,441 10.44 % 932.38 4 9.76 % 412.07 400,159 8.89 % 737.32 2,536 9.08 % 641.07 105,654 6.97 % 563.74 39,338 12.01 % 990.49 poleg poleg pole g 592,281 6.22 % 521.97 19,376 6.22 % 451.29 155,163 5.46 % 488.03 0 0 % 0 284,760 6.33 % 524.69 1,216 4.36 % 307.39 118,883 7.84 % 634.32 12,883 3.93 % 324.38 kljub kljub klju b 509,251 5.35 % 448.80 13,018 4.18 % 303.20 130,869 4.61 % 411.62 0 0 % 0 262,664 5.83 % 483.98 835 2.99 % 211.08 90,789 5.99 % 484.42 11,076 3.38 % 278.88 glede glede gled e 271,297 2.85 % 239.09 8,095 2.60 % 188.54 121,462 4.28 % 382.03 0 0 % 0 106,244 2.36 % 195.76 1,543 5.53 % 390.05 30,305 2.00 % 161.70 3,648 1.11 % 91.85 skozi skozi skoz i 261,306 2.74 % 230.29 16,616 5.33 % 387.01 59,303 2.09 % 186.52 2 4.88 % 206.04 106,627 2.37 % 196.47 957 3.43 % 241.92 51,512 3.40 % 274.85 26,289 8.03 % 661.93 izmed izmed izme d 231,718 2.43 % 204.21 10,396 3.34 % 242.14 68,140 2.40 % 214.32 0 0 % 0 98,094 2.18 % 180.74 698 2.50 % 176.45 47,539 3.13 % 253.65 6,851 2.09 % 172.50 prek prek prek 230,294 2.42 % 202.96 7,496 2.41 % 174.59 68,452 2.41 % 215.30 0 0 % 0 100,698 2.24 % 185.54 543 1.95 % 137.26 49,035 3.23 % 261.63 4,070 1.24 % 102.48 konec konec kone c 210,897 2.21 % 185.86 3,402 1.09 % 79.24 66,489 2.34 % 209.13 0 0 % 0 112,615 2.50 % 207.50 187 0.67 % 47.27 25,891 1.71 % 138.15 2,313 0.71 % 58.24 okoli okoli okol i 187,237 1.97 % 165.01 7,963 2.56 % 185.47 53,728 1.89 % 168.99 1 2.44 % 103.02 81,299 1.81 % 149.80 320 1.15 % 80.89 33,734 2.22 % 179.99 10,192 3.11 % 256.62 namesto namesto name sto 180,360 1.89 % 158.95 9,030 2.90 % 210.32 42,657 1.50 % 134.17 5 12.20 % 515.09 81,373 1.81 % 149.94 634 2.27 % 160.27 38,674 2.55 % 206.35 7,987 2.44 % 201.10 sredi sredi sred i 155,626 
1.63 % 137.15 5,109 1.64 % 118.99 39,051 1.38 % 122.83 0 0 % 0 75,945 1.69 % 139.93 228 0.82 % 57.64 25,698 1.69 % 137.12 9,595 2.93 % 241.59 preko preko prek o 107,027 1.12 % 94.32 5,005 1.61 % 116.57 35,516 1.25 % 111.71 9 21.95 % 927.17 43,418 0.96 % 80 563 2.02 % 142.32 20,743 1.37 % 110.68 1,773 0.54 % 44.64 zoper zoper zope r 96,349 1.01 % 84.91 2,034 0.65 % 47.37 37,620 1.32 % 118.32 0 0 % 0 48,608 1.08 % 89.56 986 3.53 % 249.25 6,457 0.43 % 34.45 644 0.20 % 16.22 okrog okrog okro g 92,380 0.97 % 81.41 5,996 1.92 % 139.65 15,388 0.54 % 48.40 0 0 % 0 42,736 0.95 % 78.74 304 1.09 % 76.85 16,810 1.11 % 89.69 11,146 3.40 % 280.64 znotraj znotraj znot raj 80,944 0.85 % 71.34 6,149 1.97 % 143.22 26,268 0.93 % 82.62 0 0 % 0 33,649 0.75 % 62 333 1.19 % 84.18 13,667 0.90 % 72.92 878 0.27 % 22.11 razen razen raze n 71,460 0.75 % 62.98 3,333 1.07 % 77.63 12,931 0.46 % 40.67 1 2.44 % 103.02 35,674 0.79 % 65.73 840 3.01 % 212.34 13,685 0.90 % 73.02 4,996 1.52 % 125.79 mimo mimo mimo 70,089 0.74 % 61.77 2,943 0.94 % 68.55 15,955 0.56 % 50.18 0 0 % 0 32,735 0.73 % 60.32 175 0.63 % 44.24 10,508 0.69 % 56.07 7,773 2.37 % 195.72 blizu blizu bliz u 64,672 0.68 % 57 2,885 0.93 % 67.20 20,277 0.71 % 63.78 0 0 % 0 29,850 0.66 % 55 105 0.38 % 26.54 8,372 0.55 % 44.67 3,183 0.97 % 80.14 zunaj zunaj zuna j 58,074 0.61 % 51.18 3,363 1.08 % 78.33 14,572 0.51 % 45.83 0 0 % 0 29,682 0.66 % 54.69 284 1.02 % 71.79 8,737 0.58 % 46.62 1,436 0.44 % 36.16 izven izven izve n 28,190 0.30 % 24.84 1,401 0.45 % 32.63 10,009 0.35 % 31.48 0 0 % 0 12,402 0.28 % 22.85 262 0.94 % 66.23 3,795 0.25 % 20.25 321 0.10 % 8.08 izpred izpred izpr ed 26,041 0.27 % 22.95 510 0.16 % 11.88 5,911 0.21 % 18.59 0 0 % 0 15,218 0.34 % 28.04 29 0.10 % 7.33 3,274 0.22 % 17.47 1,099 0.34 % 27.67 izpod izpod izpo d 22,299 0.23 % 19.65 939 0.30 % 21.87 4,581 0.16 % 14.41 0 0 % 0 9,967 0.22 % 18.36 51 0.18 % 12.89 4,214 0.28 % 22.48 2,547 0.78 % 64.13 navkljub navkljub navk ljub 16,205 0.17 % 14.28 471 0.15 % 10.97 3,708 
0.13 % 11.66 0 0 % 0 7,363 0.16 % 13.57 35 0.12 % 8.85 4,137 0.27 % 22.07 491 0.15 % 12.36 zavoljo zavoljo zavo ljo 14,784 0.15 % 13.03 423 0.14 % 9.85 4,629 0.16 % 14.56 0 0 % 0 6,837 0.15 % 12.60 5 0.02 % 1.26 2,221 0.15 % 11.85 669 0.20 % 16.84 izza izza izza 13,225 0.14 % 11.66 453 0.14 % 10.55 2,746 0.10 % 8.64 0 0 % 0 5,279 0.12 % 9.73 27 0.10 % 6.83 2,222 0.15 % 11.86 2,498 0.76 % 62.90 nasproti nasproti nasp roti 13,059 0.14 % 11.51 1,083 0.35 % 25.22 2,243 0.08 % 7.05 0 0 % 0 6,206 0.14 % 11.43 86 0.31 % 21.74 2,087 0.14 % 11.14 1,354 0.41 % 34.09 zraven zraven zrav en 11,040 0.12 % 9.73 348 0.11 % 8.11 1,602 0.06 % 5.04 0 0 % 0 3,353 0.07 % 6.18 18 0.06 % 4.55 2,164 0.14 % 11.55 3,555 1.08 % 89.51 vzdolž vzdolž vzdo lž 9,721 0.10 % 8.57 1,619 0.52 % 37.71 1,790 0.06 % 5.63 1 2.44 % 103.02 3,028 0.07 % 5.58 78 0.28 % 19.72 2,009 0.13 % 10.72 1,196 0.36 % 30.11 spričo spričo spri čo 9,312 0.10 % 8.21 798 0.26 % 18.59 1,646 0.06 % 5.18 0 0 % 0 5,100 0.11 % 9.40 92 0.33 % 23.26 1,185 0.08 % 6.32 491 0.15 % 12.36 CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 161 File at CLARIN.SI 1.2.145 List of initial character-level 5-grams from preposition lemmas in the Gigafida 2.0 corpus with text-type distributionGF2.0-word_parts-prepositions-lemmas-initial- 5grams-taxonomy-entire.tsvLemma Lemma (lower-case) Initial part of the word Rest of the word Total absolute frequency of lemma Percentage of all found lemmas Total relative frequency (per million occurrences) Absolute frequency [SSJ.T.K.S] Percentage [SSJ.T.K.S] Relative frequency [SSJ.T.K.S] Absolute frequency [SSJ.I] Percentage [SSJ.I] Relative frequency [SSJ.I] Absolute frequency [SSJ.T.K.N] Percentage [SSJ.T.K.N] Relative frequency [SSJ.T.K.N] Absolute frequency [SSJ.T.P.C] Percentage [SSJ.T.P.C] Relative frequency [SSJ.T.P.C] Absolute frequency [SSJ.T.D] Percentage [SSJ.T.D] Relative frequency [SSJ.T.D] Absolute frequency [SSJ.T.P.R] Percentage [SSJ.T.P.R] Relative frequency 
[SSJ.T.P.R] Absolute frequency [SSJ.T.K.L] Percentage [SSJ.T.K.L] Relative frequency [SSJ.T.K.L] zaradi zaradi zarad i 1,792,327 29.72 % 1,579.57 53,002 24.50 % 1,234.48 569,966 31.44 % 1,792.69 0 0 % 0 861,596 30.54 % 1,587.55 5,122 27.69 % 1,294.79 266,969 27.91 % 1,424.46 35,672 17.42 % 898.18 proti proti proti 875,538 14.52 % 771.61 31,406 14.51 % 731.48 296,441 16.35 % 932.38 4 17.39 % 412.07 400,159 14.19 % 737.32 2,536 13.71 % 641.07 105,654 11.04 % 563.74 39,338 19.21 % 990.49 poleg poleg poleg 592,281 9.82 % 521.97 19,376 8.95 % 451.29 155,163 8.56 % 488.03 0 0 % 0 284,760 10.10 % 524.69 1,216 6.57 % 307.39 118,883 12.43 % 634.32 12,883 6.29 % 324.38 kljub kljub kljub 509,251 8.45 % 448.80 13,018 6.02 % 303.20 130,869 7.22 % 411.62 0 0 % 0 262,664 9.31 % 483.98 835 4.51 % 211.08 90,789 9.49 % 484.42 11,076 5.41 % 278.88 glede glede glede 271,297 4.50 % 239.09 8,095 3.74 % 188.54 121,462 6.70 % 382.03 0 0 % 0 106,244 3.77 % 195.76 1,543 8.34 % 390.05 30,305 3.17 % 161.70 3,648 1.78 % 91.85 skozi skozi skozi 261,306 4.33 % 230.29 16,616 7.68 % 387.01 59,303 3.27 % 186.52 2 8.70 % 206.04 106,627 3.78 % 196.47 957 5.17 % 241.92 51,512 5.38 % 274.85 26,289 12.84 % 661.93 izmed izmed izmed 231,718 3.84 % 204.21 10,396 4.80 % 242.14 68,140 3.76 % 214.32 0 0 % 0 98,094 3.48 % 180.74 698 3.77 % 176.45 47,539 4.97 % 253.65 6,851 3.35 % 172.50 konec konec konec 210,897 3.50 % 185.86 3,402 1.57 % 79.24 66,489 3.67 % 209.13 0 0 % 0 112,615 3.99 % 207.50 187 1.01 % 47.27 25,891 2.71 % 138.15 2,313 1.13 % 58.24 okoli okoli okoli 187,237 3.10 % 165.01 7,963 3.68 % 185.47 53,728 2.96 % 168.99 1 4.35 % 103.02 81,299 2.88 % 149.80 320 1.73 % 80.89 33,734 3.53 % 179.99 10,192 4.98 % 256.62 namesto namesto names to 180,360 2.99 % 158.95 9,030 4.17 % 210.32 42,657 2.35 % 134.17 5 21.74 % 515.09 81,373 2.88 % 149.94 634 3.43 % 160.27 38,674 4.04 % 206.35 7,987 3.90 % 201.10 sredi sredi sredi 155,626 2.58 % 137.15 5,109 2.36 % 118.99 39,051 2.15 % 122.83 0 0 % 0 75,945 2.69 % 
139.93 228 1.23 % 57.64 25,698 2.69 % 137.12 9,595 4.69 % 241.59 preko preko preko 107,027 1.77 % 94.32 5,005 2.31 % 116.57 35,516 1.96 % 111.71 9 39.13 % 927.17 43,418 1.54 % 80 563 3.04 % 142.32 20,743 2.17 % 110.68 1,773 0.87 % 44.64 zoper zoper zoper 96,349 1.60 % 84.91 2,034 0.94 % 47.37 37,620 2.08 % 118.32 0 0 % 0 48,608 1.72 % 89.56 986 5.33 % 249.25 6,457 0.68 % 34.45 644 0.32 % 16.22 okrog okrog okrog 92,380 1.53 % 81.41 5,996 2.77 % 139.65 15,388 0.85 % 48.40 0 0 % 0 42,736 1.51 % 78.74 304 1.64 % 76.85 16,810 1.76 % 89.69 11,146 5.44 % 280.64 znotraj znotraj znotr aj 80,944 1.34 % 71.34 6,149 2.84 % 143.22 26,268 1.45 % 82.62 0 0 % 0 33,649 1.19 % 62 333 1.80 % 84.18 13,667 1.43 % 72.92 878 0.43 % 22.11 razen razen razen 71,460 1.19 % 62.98 3,333 1.54 % 77.63 12,931 0.71 % 40.67 1 4.35 % 103.02 35,674 1.26 % 65.73 840 4.54 % 212.34 13,685 1.43 % 73.02 4,996 2.44 % 125.79 blizu blizu blizu 64,672 1.07 % 57 2,885 1.33 % 67.20 20,277 1.12 % 63.78 0 0 % 0 29,850 1.06 % 55 105 0.57 % 26.54 8,372 0.88 % 44.67 3,183 1.55 % 80.14 zunaj zunaj zunaj 58,074 0.96 % 51.18 3,363 1.55 % 78.33 14,572 0.80 % 45.83 0 0 % 0 29,682 1.05 % 54.69 284 1.53 % 71.79 8,737 0.91 % 46.62 1,436 0.70 % 36.16 izven izven izven 28,190 0.47 % 24.84 1,401 0.65 % 32.63 10,009 0.55 % 31.48 0 0 % 0 12,402 0.44 % 22.85 262 1.42 % 66.23 3,795 0.40 % 20.25 321 0.16 % 8.08 izpred izpred izpre d 26,041 0.43 % 22.95 510 0.24 % 11.88 5,911 0.33 % 18.59 0 0 % 0 15,218 0.54 % 28.04 29 0.16 % 7.33 3,274 0.34 % 17.47 1,099 0.54 % 27.67 izpod izpod izpod 22,299 0.37 % 19.65 939 0.43 % 21.87 4,581 0.25 % 14.41 0 0 % 0 9,967 0.35 % 18.36 51 0.28 % 12.89 4,214 0.44 % 22.48 2,547 1.24 % 64.13 navkljub navkljub navkl jub 16,205 0.27 % 14.28 471 0.22 % 10.97 3,708 0.20 % 11.66 0 0 % 0 7,363 0.26 % 13.57 35 0.19 % 8.85 4,137 0.43 % 22.07 491 0.24 % 12.36 zavoljo zavoljo zavol jo 14,784 0.24 % 13.03 423 0.20 % 9.85 4,629 0.26 % 14.56 0 0 % 0 6,837 0.24 % 12.60 5 0.03 % 1.26 2,221 0.23 % 11.85 669 0.33 % 16.84 
nasproti nasproti naspr oti 13,059 0.22 % 11.51 1,083 0.50 % 25.22 2,243 0.12 % 7.05 0 0 % 0 6,206 0.22 % 11.43 86 0.47 % 21.74 2,087 0.22 % 11.14 1,354 0.66 % 34.09 zraven zraven zrave n 11,040 0.18 % 9.73 348 0.16 % 8.11 1,602 0.09 % 5.04 0 0 % 0 3,353 0.12 % 6.18 18 0.10 % 4.55 2,164 0.23 % 11.55 3,555 1.74 % 89.51 vzdolž vzdolž vzdol ž 9,721 0.16 % 8.57 1,619 0.75 % 37.71 1,790 0.10 % 5.63 1 4.35 % 103.02 3,028 0.11 % 5.58 78 0.42 % 19.72 2,009 0.21 % 10.72 1,196 0.58 % 30.11 spričo spričo sprič o 9,312 0.15 % 8.21 798 0.37 % 18.59 1,646 0.09 % 5.18 0 0 % 0 5,100 0.18 % 9.40 92 0.50 % 23.26 1,185 0.12 % 6.32 491 0.24 % 12.36 onkraj onkraj onkra j 8,858 0.15 % 7.81 705 0.33 % 16.42 2,514 0.14 % 7.91 0 0 % 0 3,350 0.12 % 6.17 27 0.15 % 6.83 1,273 0.13 % 6.79 989 0.48 % 24.90 onstran onstran onstr an 7,522 0.12 % 6.63 752 0.35 % 17.51 1,477 0.08 % 4.65 0 0 % 0 3,666 0.13 % 6.75 38 0.20 % 9.61 1,131 0.12 % 6.03 458 0.22 % 11.53 tekom tekom tekom 4,638 0.08 % 4.09 174 0.08 % 4.05 3,169 0.17 % 9.97 0 0 % 0 626 0.02 % 1.15 25 0.14 % 6.32 613 0.06 % 3.27 31 0.01 % 0.78 povrhu povrhu povrh u 2,397 0.04 % 2.11 63 0.03 % 1.47 486 0.03 % 1.53 0 0 % 0 1,213 0.04 % 2.24 3 0.02 % 0.76 477 0.05 % 2.55 155 0.08 % 3.90 širom širom širom 2,386 0.04 % 2.10 35 0.02 % 0.82 711 0.04 % 2.24 0 0 % 0 874 0.03 % 1.61 6 0.03 % 1.52 713 0.07 % 3.80 47 0.02 % 1.18 CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 162 File at CLARIN.SI 1.2.146 List of final character-level 1-grams from preposition lemmas in the Gigafida 2.0 corpus with text-type distributionGF2.0-word_parts-prepositions-lemmas-final- 1grams-taxonomy-entire.tsvLemma Lemma (lower-case) Rest of the word Final part of the word Total absolute frequency of lemma Percentage of all found lemmas Total relative frequency (per million occurrences) Absolute frequency [SSJ.T.K.S] Percentage [SSJ.T.K.S] Relative frequency [SSJ.T.K.S] Absolute frequency [SSJ.I] Percentage [SSJ.I] Relative frequency [SSJ.I] 
Absolute frequency [SSJ.T.K.N] Percentage [SSJ.T.K.N] Relative frequency [SSJ.T.K.N] Absolute frequency [SSJ.T.P.C] Percentage [SSJ.T.P.C] Relative frequency [SSJ.T.P.C] Absolute frequency [SSJ.T.D] Percentage [SSJ.T.D] Relative frequency [SSJ.T.D] Absolute frequency [SSJ.T.P.R] Percentage [SSJ.T.P.R] Relative frequency [SSJ.T.P.R] Absolute frequency [SSJ.T.K.L] Percentage [SSJ.T.K.L] Relative frequency [SSJ.T.K.L] v v v 31,818,378 25.51 % 28,041.37 1,138,445 25.83 % 26,515.74 9,553,100 26.12 % 30,046.99 299 22.72 % 30,802.51 15,532,386 25.82 % 28,619.45 103,256 23.08 % 26,102 4,674,294 23.87 % 24,940.50 816,598 23.02 % 20,561.05 na na n a 19,072,589 15.29 % 16,808.58 631,488 14.33 % 14,708.11 5,747,052 15.72 % 18,075.97 248 18.84 % 25,548.57 9,072,361 15.08 % 16,716.43 56,074 12.53 % 14,174.90 3,009,439 15.37 % 16,057.38 555,927 15.67 % 13,997.64 z z z 15,732,362 12.62 % 13,864.85 626,742 14.22 % 14,597.57 4,402,058 12.04 % 13,845.62 403 30.62 % 41,516.43 7,267,869 12.08 % 13,391.53 58,748 13.13 % 14,850.86 2,841,173 14.51 % 15,159.57 535,369 15.09 % 13,480.01 za za z a 15,195,407 12.18 % 13,391.64 436,419 9.90 % 10,164.72 4,564,728 12.48 % 14,357.26 88 6.69 % 9,065.62 7,527,662 12.51 % 13,870.22 67,714 15.13 % 17,117.37 2,303,946 11.77 % 12,293.10 294,850 8.31 % 7,424 po po p o 6,360,566 5.10 % 5,605.53 196,183 4.45 % 4,569.34 1,959,908 5.36 % 6,164.42 64 4.86 % 6,593.18 3,054,588 5.08 % 5,628.28 21,577 4.82 % 5,454.43 936,691 4.78 % 4,997.88 191,555 5.40 % 4,823.15 iz iz i z 4,006,518 3.21 % 3,530.92 159,021 3.61 % 3,703.79 1,060,540 2.90 % 3,335.67 37 2.81 % 3,811.68 2,017,138 3.35 % 3,716.71 23,762 5.31 % 6,006.78 611,339 3.12 % 3,261.90 134,681 3.80 % 3,391.12 pri pri pr i 3,934,055 3.15 % 3,467.06 178,024 4.04 % 4,146.39 1,018,794 2.79 % 3,204.37 47 3.57 % 4,841.87 1,888,953 3.14 % 3,480.52 15,360 3.43 % 3,882.84 756,619 3.86 % 4,037.07 76,258 2.15 % 1,920.09 od od o d 3,855,989 3.09 % 3,398.26 157,936 3.58 % 3,678.52 1,061,423 2.90 % 3,338.45 20 1.52 % 
2,060.37 1,843,270 3.06 % 3,396.35 13,534 3.02 % 3,421.25 638,050 3.26 % 3,404.43 141,756 4.00 % 3,569.26 o o o 3,670,387 2.94 % 3,234.69 130,867 2.97 % 3,048.05 1,103,026 3.02 % 3,469.30 1 0.08 % 103.02 1,828,963 3.04 % 3,369.99 24,271 5.42 % 6,135.45 488,092 2.49 % 2,604.30 95,167 2.68 % 2,396.20 do do d o 3,434,842 2.75 % 3,027.11 128,879 2.92 % 3,001.75 976,111 2.67 % 3,070.12 16 1.22 % 1,648.30 1,686,736 2.80 % 3,107.92 13,221 2.96 % 3,342.13 544,963 2.78 % 2,907.74 84,916 2.39 % 2,138.09 med med me d 3,005,484 2.41 % 2,648.72 125,226 2.84 % 2,916.66 906,664 2.48 % 2,851.69 19 1.44 % 1,957.35 1,425,352 2.37 % 2,626.31 7,107 1.59 % 1,796.57 481,142 2.46 % 2,567.22 59,974 1.69 % 1,510.08 ob ob o b 2,529,897 2.03 % 2,229.59 68,415 1.55 % 1,593.47 700,467 1.92 % 2,203.15 9 0.68 % 927.17 1,338,788 2.23 % 2,466.81 5,209 1.16 % 1,316.78 342,313 1.75 % 1,826.47 74,696 2.10 % 1,880.76 pred pred pre d 2,269,453 1.82 % 2,000.06 51,884 1.18 % 1,208.44 697,210 1.91 % 2,192.91 7 0.53 % 721.13 1,115,560 1.85 % 2,055.49 5,680 1.27 % 1,435.84 329,208 1.68 % 1,756.55 69,904 1.97 % 1,760.11 zaradi zaradi zarad i 1,792,327 1.44 % 1,579.57 53,002 1.20 % 1,234.48 569,966 1.56 % 1,792.69 0 0 % 0 861,596 1.43 % 1,587.55 5,122 1.15 % 1,294.79 266,969 1.36 % 1,424.46 35,672 1.00 % 898.18 brez brez bre z 895,603 0.72 % 789.29 31,130 0.71 % 725.05 238,407 0.65 % 749.85 11 0.84 % 1,133.20 419,896 0.70 % 773.69 2,893 0.65 % 731.32 166,614 0.85 % 889 36,652 1.03 % 922.86 k k k 881,577 0.71 % 776.93 48,583 1.10 % 1,131.56 227,339 0.62 % 715.04 14 1.06 % 1,442.26 391,830 0.65 % 721.97 3,769 0.84 % 952.76 146,443 0.75 % 781.37 63,599 1.79 % 1,601.35 proti proti prot i 875,538 0.70 % 771.61 31,406 0.71 % 731.48 296,441 0.81 % 932.38 4 0.30 % 412.07 400,159 0.67 % 737.32 2,536 0.57 % 641.07 105,654 0.54 % 563.74 39,338 1.11 % 990.49 pod pod po d 771,297 0.62 % 679.74 31,206 0.71 % 726.82 211,135 0.58 % 664.07 6 0.46 % 618.11 357,676 0.59 % 659.04 3,253 0.73 % 822.32 131,152 0.67 % 699.78 36,869 
1.04 % 928.32 poleg poleg pole g 592,281 0.47 % 521.97 19,376 0.44 % 451.29 155,163 0.42 % 488.03 0 0 % 0 284,760 0.47 % 524.69 1,216 0.27 % 307.39 118,883 0.61 % 634.32 12,883 0.36 % 324.38 nad nad na d 553,355 0.44 % 487.67 24,668 0.56 % 574.55 152,855 0.42 % 480.77 1 0.08 % 103.02 259,821 0.43 % 478.74 1,840 0.41 % 465.13 89,316 0.46 % 476.56 24,854 0.70 % 625.80 kljub kljub klju b 509,251 0.41 % 448.80 13,018 0.29 % 303.20 130,869 0.36 % 411.62 0 0 % 0 262,664 0.44 % 483.98 835 0.19 % 211.08 90,789 0.46 % 484.42 11,076 0.31 % 278.88 čez čez če z 342,085 0.27 % 301.48 12,802 0.29 % 298.17 76,727 0.21 % 241.33 2 0.15 % 206.04 156,143 0.26 % 287.70 764 0.17 % 193.13 63,586 0.33 % 339.27 32,061 0.90 % 807.26 glede glede gled e 271,297 0.22 % 239.09 8,095 0.18 % 188.54 121,462 0.33 % 382.03 0 0 % 0 106,244 0.18 % 195.76 1,543 0.34 % 390.05 30,305 0.15 % 161.70 3,648 0.10 % 91.85 skozi skozi skoz i 261,306 0.21 % 230.29 16,616 0.38 % 387.01 59,303 0.16 % 186.52 2 0.15 % 206.04 106,627 0.18 % 196.47 957 0.21 % 241.92 51,512 0.26 % 274.85 26,289 0.74 % 661.93 izmed izmed izme d 231,718 0.19 % 204.21 10,396 0.24 % 242.14 68,140 0.19 % 214.32 0 0 % 0 98,094 0.16 % 180.74 698 0.16 % 176.45 47,539 0.24 % 253.65 6,851 0.19 % 172.50 prek prek pre k 230,294 0.18 % 202.96 7,496 0.17 % 174.59 68,452 0.19 % 215.30 0 0 % 0 100,698 0.17 % 185.54 543 0.12 % 137.26 49,035 0.25 % 261.63 4,070 0.12 % 102.48 konec konec kone c 210,897 0.17 % 185.86 3,402 0.08 % 79.24 66,489 0.18 % 209.13 0 0 % 0 112,615 0.19 % 207.50 187 0.04 % 47.27 25,891 0.13 % 138.15 2,313 0.07 % 58.24 okoli okoli okol i 187,237 0.15 % 165.01 7,963 0.18 % 185.47 53,728 0.15 % 168.99 1 0.08 % 103.02 81,299 0.14 % 149.80 320 0.07 % 80.89 33,734 0.17 % 179.99 10,192 0.29 % 256.62 namesto namesto namest o 180,360 0.14 % 158.95 9,030 0.20 % 210.32 42,657 0.12 % 134.17 5 0.38 % 515.09 81,373 0.14 % 149.94 634 0.14 % 160.27 38,674 0.20 % 206.35 7,987 0.23 % 201.10 sredi sredi sred i 155,626 0.12 % 137.15 5,109 0.12 % 
118.99 39,051 0.11 % 122.83 0 0 % 0 75,945 0.13 % 139.93 228 0.05 % 57.64 25,698 0.13 % 137.12 9,595 0.27 % 241.59 preko preko prek o 107,027 0.09 % 94.32 5,005 0.11 % 116.57 35,516 0.10 % 111.71 9 0.68 % 927.17 43,418 0.07 % 80 563 0.13 % 142.32 20,743 0.11 % 110.68 1,773 0.05 % 44.64 zoper zoper zope r 96,349 0.08 % 84.91 2,034 0.05 % 47.37 37,620 0.10 % 118.32 0 0 % 0 48,608 0.08 % 89.56 986 0.22 % 249.25 6,457 0.03 % 34.45 644 0.02 % 16.22 CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 163 File at CLARIN.SI 1.2.147 List of final character-level 2-grams from preposition lemmas in the Gigafida 2.0 corpus with text-type distributionGF2.0-word_parts-prepositions-lemmas-final- 2grams-taxonomy-entire.tsvLemma Lemma (lower-case) Rest of the word Final part of the word Total absolute frequency of lemma Percentage of all found lemmas Total relative frequency (per million occurrences) Absolute frequency [SSJ.T.K.S] Percentage [SSJ.T.K.S] Relative frequency [SSJ.T.K.S] Absolute frequency [SSJ.I] Percentage [SSJ.I] Relative frequency [SSJ.I] Absolute frequency [SSJ.T.K.N] Percentage [SSJ.T.K.N] Relative frequency [SSJ.T.K.N] Absolute frequency [SSJ.T.P.C] Percentage [SSJ.T.P.C] Relative frequency [SSJ.T.P.C] Absolute frequency [SSJ.T.D] Percentage [SSJ.T.D] Relative frequency [SSJ.T.D] Absolute frequency [SSJ.T.P.R] Percentage [SSJ.T.P.R] Relative frequency [SSJ.T.P.R] Absolute frequency [SSJ.T.K.L] Percentage [SSJ.T.K.L] Relative frequency [SSJ.T.K.L] na na na 19,072,589 26.27 % 16,808.58 631,488 25.64 % 14,708.11 5,747,052 27.01 % 18,075.97 248 41.47 % 25,548.57 9,072,361 25.82 % 16,716.43 56,074 21.79 % 14,174.90 3,009,439 26.34 % 16,057.38 555,927 27.30 % 13,997.64 za za za 15,195,407 20.93 % 13,391.64 436,419 17.72 % 10,164.72 4,564,728 21.45 % 14,357.26 88 14.72 % 9,065.62 7,527,662 21.42 % 13,870.22 67,714 26.31 % 17,117.37 2,303,946 20.16 % 12,293.10 294,850 14.48 % 7,424 po po po 6,360,566 8.76 % 5,605.53 196,183 7.97 % 4,569.34 
1,959,908 9.21 % 6,164.42 64 10.70 % 6,593.18 3,054,588 8.69 % 5,628.28 21,577 8.38 % 5,454.43 936,691 8.20 % 4,997.88 191,555 9.40 % 4,823.15 iz iz iz 4,006,518 5.52 % 3,530.92 159,021 6.46 % 3,703.79 1,060,540 4.98 % 3,335.67 37 6.19 % 3,811.68 2,017,138 5.74 % 3,716.71 23,762 9.23 % 6,006.78 611,339 5.35 % 3,261.90 134,681 6.61 % 3,391.12 pri pri p ri 3,934,055 5.42 % 3,467.06 178,024 7.23 % 4,146.39 1,018,794 4.79 % 3,204.37 47 7.86 % 4,841.87 1,888,953 5.38 % 3,480.52 15,360 5.97 % 3,882.84 756,619 6.62 % 4,037.07 76,258 3.74 % 1,920.09 od od od 3,855,989 5.31 % 3,398.26 157,936 6.41 % 3,678.52 1,061,423 4.99 % 3,338.45 20 3.34 % 2,060.37 1,843,270 5.25 % 3,396.35 13,534 5.26 % 3,421.25 638,050 5.58 % 3,404.43 141,756 6.96 % 3,569.26 do do do 3,434,842 4.73 % 3,027.11 128,879 5.23 % 3,001.75 976,111 4.59 % 3,070.12 16 2.68 % 1,648.30 1,686,736 4.80 % 3,107.92 13,221 5.14 % 3,342.13 544,963 4.77 % 2,907.74 84,916 4.17 % 2,138.09 med med m ed 3,005,484 4.14 % 2,648.72 125,226 5.08 % 2,916.66 906,664 4.26 % 2,851.69 19 3.18 % 1,957.35 1,425,352 4.06 % 2,626.31 7,107 2.76 % 1,796.57 481,142 4.21 % 2,567.22 59,974 2.94 % 1,510.08 ob ob ob 2,529,897 3.48 % 2,229.59 68,415 2.78 % 1,593.47 700,467 3.29 % 2,203.15 9 1.50 % 927.17 1,338,788 3.81 % 2,466.81 5,209 2.02 % 1,316.78 342,313 3.00 % 1,826.47 74,696 3.67 % 1,880.76 pred pred pr ed 2,269,453 3.13 % 2,000.06 51,884 2.11 % 1,208.44 697,210 3.28 % 2,192.91 7 1.17 % 721.13 1,115,560 3.17 % 2,055.49 5,680 2.21 % 1,435.84 329,208 2.88 % 1,756.55 69,904 3.43 % 1,760.11 zaradi zaradi zara di 1,792,327 2.47 % 1,579.57 53,002 2.15 % 1,234.48 569,966 2.68 % 1,792.69 0 0 % 0 861,596 2.45 % 1,587.55 5,122 1.99 % 1,294.79 266,969 2.34 % 1,424.46 35,672 1.75 % 898.18 brez brez br ez 895,603 1.23 % 789.29 31,130 1.26 % 725.05 238,407 1.12 % 749.85 11 1.84 % 1,133.20 419,896 1.20 % 773.69 2,893 1.12 % 731.32 166,614 1.46 % 889 36,652 1.80 % 922.86 proti proti pro ti 875,538 1.21 % 771.61 31,406 1.27 % 731.48 296,441 1.39 % 
932.38 4 0.67 % 412.07 400,159 1.14 % 737.32 2,536 0.98 % 641.07 105,654 0.93 % 563.74 39,338 1.93 % 990.49 pod pod p od 771,297 1.06 % 679.74 31,206 1.27 % 726.82 211,135 0.99 % 664.07 6 1.00 % 618.11 357,676 1.02 % 659.04 3,253 1.26 % 822.32 131,152 1.15 % 699.78 36,869 1.81 % 928.32 poleg poleg pol eg 592,281 0.82 % 521.97 19,376 0.79 % 451.29 155,163 0.73 % 488.03 0 0 % 0 284,760 0.81 % 524.69 1,216 0.47 % 307.39 118,883 1.04 % 634.32 12,883 0.63 % 324.38 nad nad n ad 553,355 0.76 % 487.67 24,668 1.00 % 574.55 152,855 0.72 % 480.77 1 0.17 % 103.02 259,821 0.74 % 478.74 1,840 0.71 % 465.13 89,316 0.78 % 476.56 24,854 1.22 % 625.80 kljub kljub klj ub 509,251 0.70 % 448.80 13,018 0.53 % 303.20 130,869 0.61 % 411.62 0 0 % 0 262,664 0.75 % 483.98 835 0.32 % 211.08 90,789 0.79 % 484.42 11,076 0.54 % 278.88 čez čez č ez 342,085 0.47 % 301.48 12,802 0.52 % 298.17 76,727 0.36 % 241.33 2 0.33 % 206.04 156,143 0.44 % 287.70 764 0.30 % 193.13 63,586 0.56 % 339.27 32,061 1.57 % 807.26 glede glede gle de 271,297 0.37 % 239.09 8,095 0.33 % 188.54 121,462 0.57 % 382.03 0 0 % 0 106,244 0.30 % 195.76 1,543 0.60 % 390.05 30,305 0.27 % 161.70 3,648 0.18 % 91.85 skozi skozi sko zi 261,306 0.36 % 230.29 16,616 0.68 % 387.01 59,303 0.28 % 186.52 2 0.33 % 206.04 106,627 0.30 % 196.47 957 0.37 % 241.92 51,512 0.45 % 274.85 26,289 1.29 % 661.93 izmed izmed izm ed 231,718 0.32 % 204.21 10,396 0.42 % 242.14 68,140 0.32 % 214.32 0 0 % 0 98,094 0.28 % 180.74 698 0.27 % 176.45 47,539 0.42 % 253.65 6,851 0.34 % 172.50 prek prek pr ek 230,294 0.32 % 202.96 7,496 0.30 % 174.59 68,452 0.32 % 215.30 0 0 % 0 100,698 0.29 % 185.54 543 0.21 % 137.26 49,035 0.43 % 261.63 4,070 0.20 % 102.48 konec konec kon ec 210,897 0.29 % 185.86 3,402 0.14 % 79.24 66,489 0.31 % 209.13 0 0 % 0 112,615 0.32 % 207.50 187 0.07 % 47.27 25,891 0.23 % 138.15 2,313 0.11 % 58.24 okoli okoli oko li 187,237 0.26 % 165.01 7,963 0.32 % 185.47 53,728 0.25 % 168.99 1 0.17 % 103.02 81,299 0.23 % 149.80 320 0.12 % 80.89 33,734 0.29 
% 179.99 10,192 0.50 % 256.62 namesto namesto names to 180,360 0.25 % 158.95 9,030 0.37 % 210.32 42,657 0.20 % 134.17 5 0.84 % 515.09 81,373 0.23 % 149.94 634 0.25 % 160.27 38,674 0.34 % 206.35 7,987 0.39 % 201.10 sredi sredi sre di 155,626 0.21 % 137.15 5,109 0.21 % 118.99 39,051 0.18 % 122.83 0 0 % 0 75,945 0.22 % 139.93 228 0.09 % 57.64 25,698 0.23 % 137.12 9,595 0.47 % 241.59 preko preko pre ko 107,027 0.15 % 94.32 5,005 0.20 % 116.57 35,516 0.17 % 111.71 9 1.50 % 927.17 43,418 0.12 % 80 563 0.22 % 142.32 20,743 0.18 % 110.68 1,773 0.09 % 44.64 zoper zoper zop er 96,349 0.13 % 84.91 2,034 0.08 % 47.37 37,620 0.18 % 118.32 0 0 % 0 48,608 0.14 % 89.56 986 0.38 % 249.25 6,457 0.06 % 34.45 644 0.03 % 16.22 okrog okrog okr og 92,380 0.13 % 81.41 5,996 0.24 % 139.65 15,388 0.07 % 48.40 0 0 % 0 42,736 0.12 % 78.74 304 0.12 % 76.85 16,810 0.15 % 89.69 11,146 0.55 % 280.64 znotraj znotraj znotr aj 80,944 0.11 % 71.34 6,149 0.25 % 143.22 26,268 0.12 % 82.62 0 0 % 0 33,649 0.10 % 62 333 0.13 % 84.18 13,667 0.12 % 72.92 878 0.04 % 22.11 razen razen raz en 71,460 0.10 % 62.98 3,333 0.14 % 77.63 12,931 0.06 % 40.67 1 0.17 % 103.02 35,674 0.10 % 65.73 840 0.33 % 212.34 13,685 0.12 % 73.02 4,996 0.24 % 125.79 mimo mimo mi mo 70,089 0.10 % 61.77 2,943 0.12 % 68.55 15,955 0.07 % 50.18 0 0 % 0 32,735 0.09 % 60.32 175 0.07 % 44.24 10,508 0.09 % 56.07 7,773 0.38 % 195.72 CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 164 File at CLARIN.SI 1.2.148 List of final character-level 3-grams from preposition lemmas in the Gigafida 2.0 corpus with text-type distributionGF2.0-word_parts-prepositions-lemmas-final- 3grams-taxonomy-entire.tsvLemma Lemma (lower-case) Rest of the word Final part of the word Total absolute frequency of lemma Percentage of all found lemmas Total relative frequency (per million occurrences) Absolute frequency [SSJ.T.K.S] Percentage [SSJ.T.K.S] Relative frequency [SSJ.T.K.S] Absolute frequency [SSJ.I] Percentage [SSJ.I] Relative 
frequency [SSJ.I] Absolute frequency [SSJ.T.K.N] Percentage [SSJ.T.K.N] Relative frequency [SSJ.T.K.N] Absolute frequency [SSJ.T.P.C] Percentage [SSJ.T.P.C] Relative frequency [SSJ.T.P.C] Absolute frequency [SSJ.T.D] Percentage [SSJ.T.D] Relative frequency [SSJ.T.D] Absolute frequency [SSJ.T.P.R] Percentage [SSJ.T.P.R] Relative frequency [SSJ.T.P.R] Absolute frequency [SSJ.T.K.L] Percentage [SSJ.T.K.L] Relative frequency [SSJ.T.K.L] pri pri pri 3,934,055 21.68 % 3,467.06 178,024 26.02 % 4,146.39 1,018,794 19.56 % 3,204.37 47 40.52 % 4,841.87 1,888,953 21.98 % 3,480.52 15,360 27.30 % 3,882.84 756,619 24.88 % 4,037.07 76,258 13.66 % 1,920.09 med med med 3,005,484 16.56 % 2,648.72 125,226 18.30 % 2,916.66 906,664 17.40 % 2,851.69 19 16.38 % 1,957.35 1,425,352 16.58 % 2,626.31 7,107 12.63 % 1,796.57 481,142 15.82 % 2,567.22 59,974 10.74 % 1,510.08 pred pred p red 2,269,453 12.51 % 2,000.06 51,884 7.58 % 1,208.44 697,210 13.38 % 2,192.91 7 6.03 % 721.13 1,115,560 12.98 % 2,055.49 5,680 10.10 % 1,435.84 329,208 10.83 % 1,756.55 69,904 12.52 % 1,760.11 zaradi zaradi zar adi 1,792,327 9.88 % 1,579.57 53,002 7.75 % 1,234.48 569,966 10.94 % 1,792.69 0 0 % 0 861,596 10.02 % 1,587.55 5,122 9.11 % 1,294.79 266,969 8.78 % 1,424.46 35,672 6.39 % 898.18 brez brez b rez 895,603 4.94 % 789.29 31,130 4.55 % 725.05 238,407 4.58 % 749.85 11 9.48 % 1,133.20 419,896 4.88 % 773.69 2,893 5.14 % 731.32 166,614 5.48 % 889 36,652 6.56 % 922.86 proti proti pr oti 875,538 4.83 % 771.61 31,406 4.59 % 731.48 296,441 5.69 % 932.38 4 3.45 % 412.07 400,159 4.66 % 737.32 2,536 4.51 % 641.07 105,654 3.48 % 563.74 39,338 7.04 % 990.49 pod pod pod 771,297 4.25 % 679.74 31,206 4.56 % 726.82 211,135 4.05 % 664.07 6 5.17 % 618.11 357,676 4.16 % 659.04 3,253 5.78 % 822.32 131,152 4.31 % 699.78 36,869 6.60 % 928.32 poleg poleg po leg 592,281 3.26 % 521.97 19,376 2.83 % 451.29 155,163 2.98 % 488.03 0 0 % 0 284,760 3.31 % 524.69 1,216 2.16 % 307.39 118,883 3.91 % 634.32 12,883 2.31 % 324.38 nad nad nad 553,355 
3.05 % 487.67 24,668 3.60 % 574.55 152,855 2.93 % 480.77 1 0.86 % 103.02 259,821 3.02 % 478.74 1,840 3.27 % 465.13 89,316 2.94 % 476.56 24,854 4.45 % 625.80 kljub kljub kl jub 509,251 2.81 % 448.80 13,018 1.90 % 303.20 130,869 2.51 % 411.62 0 0 % 0 262,664 3.06 % 483.98 835 1.48 % 211.08 90,789 2.99 % 484.42 11,076 1.98 % 278.88 čez čez čez 342,085 1.89 % 301.48 12,802 1.87 % 298.17 76,727 1.47 % 241.33 2 1.72 % 206.04 156,143 1.82 % 287.70 764 1.36 % 193.13 63,586 2.09 % 339.27 32,061 5.74 % 807.26 glede glede gl ede 271,297 1.50 % 239.09 8,095 1.18 % 188.54 121,462 2.33 % 382.03 0 0 % 0 106,244 1.24 % 195.76 1,543 2.74 % 390.05 30,305 1.00 % 161.70 3,648 0.65 % 91.85 skozi skozi sk ozi 261,306 1.44 % 230.29 16,616 2.43 % 387.01 59,303 1.14 % 186.52 2 1.72 % 206.04 106,627 1.24 % 196.47 957 1.70 % 241.92 51,512 1.69 % 274.85 26,289 4.71 % 661.93 izmed izmed iz med 231,718 1.28 % 204.21 10,396 1.52 % 242.14 68,140 1.31 % 214.32 0 0 % 0 98,094 1.14 % 180.74 698 1.24 % 176.45 47,539 1.56 % 253.65 6,851 1.23 % 172.50 prek prek p rek 230,294 1.27 % 202.96 7,496 1.09 % 174.59 68,452 1.31 % 215.30 0 0 % 0 100,698 1.17 % 185.54 543 0.96 % 137.26 49,035 1.61 % 261.63 4,070 0.73 % 102.48 konec konec ko nec 210,897 1.16 % 185.86 3,402 0.50 % 79.24 66,489 1.28 % 209.13 0 0 % 0 112,615 1.31 % 207.50 187 0.33 % 47.27 25,891 0.85 % 138.15 2,313 0.41 % 58.24 okoli okoli ok oli 187,237 1.03 % 165.01 7,963 1.16 % 185.47 53,728 1.03 % 168.99 1 0.86 % 103.02 81,299 0.95 % 149.80 320 0.57 % 80.89 33,734 1.11 % 179.99 10,192 1.82 % 256.62 namesto namesto name sto 180,360 0.99 % 158.95 9,030 1.32 % 210.32 42,657 0.82 % 134.17 5 4.31 % 515.09 81,373 0.95 % 149.94 634 1.13 % 160.27 38,674 1.27 % 206.35 7,987 1.43 % 201.10 sredi sredi sr edi 155,626 0.86 % 137.15 5,109 0.75 % 118.99 39,051 0.75 % 122.83 0 0 % 0 75,945 0.88 % 139.93 228 0.41 % 57.64 25,698 0.84 % 137.12 9,595 1.72 % 241.59 preko preko pr eko 107,027 0.59 % 94.32 5,005 0.73 % 116.57 35,516 0.68 % 111.71 9 7.76 % 927.17 
43,418 0.51 % 80 563 1.00 % 142.32 20,743 0.68 % 110.68 1,773 0.32 % 44.64 zoper zoper zo per 96,349 0.53 % 84.91 2,034 0.30 % 47.37 37,620 0.72 % 118.32 0 0 % 0 48,608 0.57 % 89.56 986 1.75 % 249.25 6,457 0.21 % 34.45 644 0.12 % 16.22 okrog okrog ok rog 92,380 0.51 % 81.41 5,996 0.88 % 139.65 15,388 0.29 % 48.40 0 0 % 0 42,736 0.50 % 78.74 304 0.54 % 76.85 16,810 0.55 % 89.69 11,146 2.00 % 280.64 znotraj znotraj znot raj 80,944 0.45 % 71.34 6,149 0.90 % 143.22 26,268 0.50 % 82.62 0 0 % 0 33,649 0.39 % 62 333 0.59 % 84.18 13,667 0.45 % 72.92 878 0.16 % 22.11 razen razen ra zen 71,460 0.39 % 62.98 3,333 0.49 % 77.63 12,931 0.25 % 40.67 1 0.86 % 103.02 35,674 0.41 % 65.73 840 1.49 % 212.34 13,685 0.45 % 73.02 4,996 0.90 % 125.79 mimo mimo m imo 70,089 0.39 % 61.77 2,943 0.43 % 68.55 15,955 0.31 % 50.18 0 0 % 0 32,735 0.38 % 60.32 175 0.31 % 44.24 10,508 0.35 % 56.07 7,773 1.39 % 195.72 blizu blizu bl izu 64,672 0.36 % 57 2,885 0.42 % 67.20 20,277 0.39 % 63.78 0 0 % 0 29,850 0.35 % 55 105 0.19 % 26.54 8,372 0.28 % 44.67 3,183 0.57 % 80.14 zunaj zunaj zu naj 58,074 0.32 % 51.18 3,363 0.49 % 78.33 14,572 0.28 % 45.83 0 0 % 0 29,682 0.34 % 54.69 284 0.51 % 71.79 8,737 0.29 % 46.62 1,436 0.26 % 36.16 izven izven iz ven 28,190 0.15 % 24.84 1,401 0.20 % 32.63 10,009 0.19 % 31.48 0 0 % 0 12,402 0.14 % 22.85 262 0.47 % 66.23 3,795 0.12 % 20.25 321 0.06 % 8.08 izpred izpred izp red 26,041 0.14 % 22.95 510 0.07 % 11.88 5,911 0.11 % 18.59 0 0 % 0 15,218 0.18 % 28.04 29 0.05 % 7.33 3,274 0.11 % 17.47 1,099 0.20 % 27.67 izpod izpod iz pod 22,299 0.12 % 19.65 939 0.14 % 21.87 4,581 0.09 % 14.41 0 0 % 0 9,967 0.12 % 18.36 51 0.09 % 12.89 4,214 0.14 % 22.48 2,547 0.46 % 64.13 navkljub navkljub navkl jub 16,205 0.09 % 14.28 471 0.07 % 10.97 3,708 0.07 % 11.66 0 0 % 0 7,363 0.09 % 13.57 35 0.06 % 8.85 4,137 0.14 % 22.07 491 0.09 % 12.36 zavoljo zavoljo zavo ljo 14,784 0.08 % 13.03 423 0.06 % 9.85 4,629 0.09 % 14.56 0 0 % 0 6,837 0.08 % 12.60 5 0.01 % 1.26 2,221 0.07 % 11.85 669 0.12 % 
16.84

1.2.149 List of final character-level 4-grams from preposition lemmas in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-word_parts-prepositions-lemmas-final-4grams-taxonomy-entire.tsv

Lemma | Lemma (lower-case) | Rest of the word | Final part of the word | Total absolute frequency of lemma | Percentage of all found lemmas | Total relative frequency (per million occurrences) | Absolute frequency [SSJ.T.K.S] | Percentage [SSJ.T.K.S] | Relative frequency [SSJ.T.K.S] | Absolute frequency [SSJ.I] | Percentage [SSJ.I] | Relative frequency [SSJ.I] | Absolute frequency [SSJ.T.K.N] | Percentage [SSJ.T.K.N] | Relative frequency [SSJ.T.K.N] | Absolute frequency [SSJ.T.P.C] | Percentage [SSJ.T.P.C] | Relative frequency [SSJ.T.P.C] | Absolute frequency [SSJ.T.D] | Percentage [SSJ.T.D] | Relative frequency [SSJ.T.D] | Absolute frequency [SSJ.T.P.R] | Percentage [SSJ.T.P.R] | Relative frequency [SSJ.T.P.R] | Absolute frequency [SSJ.T.K.L] | Percentage [SSJ.T.K.L] | Relative frequency [SSJ.T.K.L]
pred | pred | | pred | 2,269,453 | 23.82 % | 2,000.06 | 51,884 | 16.65 % | 1,208.44 | 697,210 | 24.55 % | 2,192.91 | 7 | 17.07 % | 721.13 | 1,115,560 | 24.78 % | 2,055.49 | 5,680 | 20.34 % | 1,435.84 | 329,208 | 21.70 % | 1,756.55 | 69,904 | 21.34 % | 1,760.11
zaradi | zaradi | za | radi | 1,792,327 | 18.81 % | 1,579.57 | 53,002 | 17.01 % | 1,234.48 | 569,966 | 20.07 % | 1,792.69 | 0 | 0 % | 0 | 861,596 | 19.14 % | 1,587.55 | 5,122 | 18.35 % | 1,294.79 | 266,969 | 17.60 % | 1,424.46 | 35,672 | 10.89 % | 898.18
brez | brez | | brez | 895,603 | 9.40 % | 789.29 | 31,130 | 9.99 % | 725.05 | 238,407 | 8.39 % | 749.85 | 11 | 26.83 % | 1,133.20 | 419,896 | 9.33 % | 773.69 | 2,893 | 10.36 % | 731.32 | 166,614 | 10.98 % | 889 | 36,652 | 11.19 % | 922.86
proti | proti | p | roti | 875,538 | 9.19 % | 771.61 | 31,406 | 10.08 % | 731.48 | 296,441 | 10.44 % | 932.38 | 4 | 9.76 % | 412.07 | 400,159 | 8.89 % | 737.32 | 2,536 | 9.08 % | 641.07 | 105,654 | 6.97 % | 563.74 | 39,338 | 12.01 % | 990.49
poleg | poleg | p | oleg | 592,281 | 6.22 % | 521.97 | 19,376 | 6.22 % | 451.29 | 155,163 | 5.46 % | 488.03 | 0 | 0 % | 0 | 284,760 | 6.33 % | 524.69 | 1,216 | 4.36 % | 307.39 | 118,883 | 7.84 % | 634.32 |
12,883 | 3.93 % | 324.38
kljub | kljub | k | ljub | 509,251 | 5.35 % | 448.80 | 13,018 | 4.18 % | 303.20 | 130,869 | 4.61 % | 411.62 | 0 | 0 % | 0 | 262,664 | 5.83 % | 483.98 | 835 | 2.99 % | 211.08 | 90,789 | 5.99 % | 484.42 | 11,076 | 3.38 % | 278.88
glede | glede | g | lede | 271,297 | 2.85 % | 239.09 | 8,095 | 2.60 % | 188.54 | 121,462 | 4.28 % | 382.03 | 0 | 0 % | 0 | 106,244 | 2.36 % | 195.76 | 1,543 | 5.53 % | 390.05 | 30,305 | 2.00 % | 161.70 | 3,648 | 1.11 % | 91.85
skozi | skozi | s | kozi | 261,306 | 2.74 % | 230.29 | 16,616 | 5.33 % | 387.01 | 59,303 | 2.09 % | 186.52 | 2 | 4.88 % | 206.04 | 106,627 | 2.37 % | 196.47 | 957 | 3.43 % | 241.92 | 51,512 | 3.40 % | 274.85 | 26,289 | 8.03 % | 661.93
izmed | izmed | i | zmed | 231,718 | 2.43 % | 204.21 | 10,396 | 3.34 % | 242.14 | 68,140 | 2.40 % | 214.32 | 0 | 0 % | 0 | 98,094 | 2.18 % | 180.74 | 698 | 2.50 % | 176.45 | 47,539 | 3.13 % | 253.65 | 6,851 | 2.09 % | 172.50
prek | prek | | prek | 230,294 | 2.42 % | 202.96 | 7,496 | 2.41 % | 174.59 | 68,452 | 2.41 % | 215.30 | 0 | 0 % | 0 | 100,698 | 2.24 % | 185.54 | 543 | 1.95 % | 137.26 | 49,035 | 3.23 % | 261.63 | 4,070 | 1.24 % | 102.48
konec | konec | k | onec | 210,897 | 2.21 % | 185.86 | 3,402 | 1.09 % | 79.24 | 66,489 | 2.34 % | 209.13 | 0 | 0 % | 0 | 112,615 | 2.50 % | 207.50 | 187 | 0.67 % | 47.27 | 25,891 | 1.71 % | 138.15 | 2,313 | 0.71 % | 58.24
okoli | okoli | o | koli | 187,237 | 1.97 % | 165.01 | 7,963 | 2.56 % | 185.47 | 53,728 | 1.89 % | 168.99 | 1 | 2.44 % | 103.02 | 81,299 | 1.81 % | 149.80 | 320 | 1.15 % | 80.89 | 33,734 | 2.22 % | 179.99 | 10,192 | 3.11 % | 256.62
namesto | namesto | nam | esto | 180,360 | 1.89 % | 158.95 | 9,030 | 2.90 % | 210.32 | 42,657 | 1.50 % | 134.17 | 5 | 12.20 % | 515.09 | 81,373 | 1.81 % | 149.94 | 634 | 2.27 % | 160.27 | 38,674 | 2.55 % | 206.35 | 7,987 | 2.44 % | 201.10
sredi | sredi | s | redi | 155,626 | 1.63 % | 137.15 | 5,109 | 1.64 % | 118.99 | 39,051 | 1.38 % | 122.83 | 0 | 0 % | 0 | 75,945 | 1.69 % | 139.93 | 228 | 0.82 % | 57.64 | 25,698 | 1.69 % | 137.12 | 9,595 | 2.93 % | 241.59
preko | preko | p | reko | 107,027 | 1.12 % | 94.32 | 5,005 | 1.61 % | 116.57 | 35,516 | 1.25 % | 111.71 | 9 | 21.95 % | 927.17 | 43,418 | 0.96 % | 80 | 563 | 2.02 % | 142.32 | 20,743 | 1.37 % | 110.68 | 1,773 | 0.54 % | 44.64
zoper | zoper | z | oper | 96,349 | 1.01 % | 84.91 | 2,034 | 0.65 % | 47.37 | 37,620 | 1.32 % | 118.32 | 0 | 0 % | 0 | 48,608 | 1.08 % | 89.56 | 986 | 3.53 % | 249.25 | 6,457 | 0.43 % | 34.45 | 644 | 0.20 % | 16.22
okrog | okrog | o | krog | 92,380 | 0.97 % | 81.41 | 5,996 | 1.92 % | 139.65 | 15,388 | 0.54 % | 48.40 |
0 | 0 % | 0 | 42,736 | 0.95 % | 78.74 | 304 | 1.09 % | 76.85 | 16,810 | 1.11 % | 89.69 | 11,146 | 3.40 % | 280.64
znotraj | znotraj | zno | traj | 80,944 | 0.85 % | 71.34 | 6,149 | 1.97 % | 143.22 | 26,268 | 0.93 % | 82.62 | 0 | 0 % | 0 | 33,649 | 0.75 % | 62 | 333 | 1.19 % | 84.18 | 13,667 | 0.90 % | 72.92 | 878 | 0.27 % | 22.11
razen | razen | r | azen | 71,460 | 0.75 % | 62.98 | 3,333 | 1.07 % | 77.63 | 12,931 | 0.46 % | 40.67 | 1 | 2.44 % | 103.02 | 35,674 | 0.79 % | 65.73 | 840 | 3.01 % | 212.34 | 13,685 | 0.90 % | 73.02 | 4,996 | 1.52 % | 125.79
mimo | mimo | | mimo | 70,089 | 0.74 % | 61.77 | 2,943 | 0.94 % | 68.55 | 15,955 | 0.56 % | 50.18 | 0 | 0 % | 0 | 32,735 | 0.73 % | 60.32 | 175 | 0.63 % | 44.24 | 10,508 | 0.69 % | 56.07 | 7,773 | 2.37 % | 195.72
blizu | blizu | b | lizu | 64,672 | 0.68 % | 57 | 2,885 | 0.93 % | 67.20 | 20,277 | 0.71 % | 63.78 | 0 | 0 % | 0 | 29,850 | 0.66 % | 55 | 105 | 0.38 % | 26.54 | 8,372 | 0.55 % | 44.67 | 3,183 | 0.97 % | 80.14
zunaj | zunaj | z | unaj | 58,074 | 0.61 % | 51.18 | 3,363 | 1.08 % | 78.33 | 14,572 | 0.51 % | 45.83 | 0 | 0 % | 0 | 29,682 | 0.66 % | 54.69 | 284 | 1.02 % | 71.79 | 8,737 | 0.58 % | 46.62 | 1,436 | 0.44 % | 36.16
izven | izven | i | zven | 28,190 | 0.30 % | 24.84 | 1,401 | 0.45 % | 32.63 | 10,009 | 0.35 % | 31.48 | 0 | 0 % | 0 | 12,402 | 0.28 % | 22.85 | 262 | 0.94 % | 66.23 | 3,795 | 0.25 % | 20.25 | 321 | 0.10 % | 8.08
izpred | izpred | iz | pred | 26,041 | 0.27 % | 22.95 | 510 | 0.16 % | 11.88 | 5,911 | 0.21 % | 18.59 | 0 | 0 % | 0 | 15,218 | 0.34 % | 28.04 | 29 | 0.10 % | 7.33 | 3,274 | 0.22 % | 17.47 | 1,099 | 0.34 % | 27.67
izpod | izpod | i | zpod | 22,299 | 0.23 % | 19.65 | 939 | 0.30 % | 21.87 | 4,581 | 0.16 % | 14.41 | 0 | 0 % | 0 | 9,967 | 0.22 % | 18.36 | 51 | 0.18 % | 12.89 | 4,214 | 0.28 % | 22.48 | 2,547 | 0.78 % | 64.13
navkljub | navkljub | navk | ljub | 16,205 | 0.17 % | 14.28 | 471 | 0.15 % | 10.97 | 3,708 | 0.13 % | 11.66 | 0 | 0 % | 0 | 7,363 | 0.16 % | 13.57 | 35 | 0.12 % | 8.85 | 4,137 | 0.27 % | 22.07 | 491 | 0.15 % | 12.36
zavoljo | zavoljo | zav | oljo | 14,784 | 0.15 % | 13.03 | 423 | 0.14 % | 9.85 | 4,629 | 0.16 % | 14.56 | 0 | 0 % | 0 | 6,837 | 0.15 % | 12.60 | 5 | 0.02 % | 1.26 | 2,221 | 0.15 % | 11.85 | 669 | 0.20 % | 16.84
izza | izza | | izza | 13,225 | 0.14 % | 11.66 | 453 | 0.14 % | 10.55 | 2,746 | 0.10 % | 8.64 | 0 | 0 % | 0 | 5,279 | 0.12 % | 9.73 | 27 | 0.10 % | 6.83 | 2,222 | 0.15 % | 11.86 | 2,498 | 0.76 % | 62.90
nasproti | nasproti | nasp | roti | 13,059 | 0.14 % | 11.51 | 1,083 | 0.35 % | 25.22 | 2,243 | 0.08 % | 7.05 | 0 | 0 % | 0 | 6,206 | 0.14 % | 11.43 | 86 | 0.31 % | 21.74 | 2,087 | 0.14 % | 11.14 | 1,354 | 0.41 % |
34.09
zraven | zraven | zr | aven | 11,040 | 0.12 % | 9.73 | 348 | 0.11 % | 8.11 | 1,602 | 0.06 % | 5.04 | 0 | 0 % | 0 | 3,353 | 0.07 % | 6.18 | 18 | 0.06 % | 4.55 | 2,164 | 0.14 % | 11.55 | 3,555 | 1.08 % | 89.51
vzdolž | vzdolž | vz | dolž | 9,721 | 0.10 % | 8.57 | 1,619 | 0.52 % | 37.71 | 1,790 | 0.06 % | 5.63 | 1 | 2.44 % | 103.02 | 3,028 | 0.07 % | 5.58 | 78 | 0.28 % | 19.72 | 2,009 | 0.13 % | 10.72 | 1,196 | 0.36 % | 30.11
spričo | spričo | sp | ričo | 9,312 | 0.10 % | 8.21 | 798 | 0.26 % | 18.59 | 1,646 | 0.06 % | 5.18 | 0 | 0 % | 0 | 5,100 | 0.11 % | 9.40 | 92 | 0.33 % | 23.26 | 1,185 | 0.08 % | 6.32 | 491 | 0.15 % | 12.36

1.2.150 List of final character-level 5-grams from preposition lemmas in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-word_parts-prepositions-lemmas-final-5grams-taxonomy-entire.tsv

Lemma | Lemma (lower-case) | Rest of the word | Final part of the word | Total absolute frequency of lemma | Percentage of all found lemmas | Total relative frequency (per million occurrences) | Absolute frequency [SSJ.T.K.S] | Percentage [SSJ.T.K.S] | Relative frequency [SSJ.T.K.S] | Absolute frequency [SSJ.I] | Percentage [SSJ.I] | Relative frequency [SSJ.I] | Absolute frequency [SSJ.T.K.N] | Percentage [SSJ.T.K.N] | Relative frequency [SSJ.T.K.N] | Absolute frequency [SSJ.T.P.C] | Percentage [SSJ.T.P.C] | Relative frequency [SSJ.T.P.C] | Absolute frequency [SSJ.T.D] | Percentage [SSJ.T.D] | Relative frequency [SSJ.T.D] | Absolute frequency [SSJ.T.P.R] | Percentage [SSJ.T.P.R] | Relative frequency [SSJ.T.P.R] | Absolute frequency [SSJ.T.K.L] | Percentage [SSJ.T.K.L] | Relative frequency [SSJ.T.K.L]
zaradi | zaradi | z | aradi | 1,792,327 | 29.72 % | 1,579.57 | 53,002 | 24.50 % | 1,234.48 | 569,966 | 31.44 % | 1,792.69 | 0 | 0 % | 0 | 861,596 | 30.54 % | 1,587.55 | 5,122 | 27.69 % | 1,294.79 | 266,969 | 27.91 % | 1,424.46 | 35,672 | 17.42 % | 898.18
proti | proti | | proti | 875,538 | 14.52 % | 771.61 | 31,406 | 14.51 % | 731.48 | 296,441 | 16.35 % | 932.38 | 4 | 17.39 % | 412.07 | 400,159 | 14.19 % | 737.32 | 2,536 | 13.71 % | 641.07 | 105,654 | 11.04 % | 563.74 | 39,338 | 19.21 % | 990.49
poleg | poleg | | poleg | 592,281 | 9.82 % | 521.97 | 19,376 | 8.95 % | 451.29 | 155,163 | 8.56 % |
488.03 | 0 | 0 % | 0 | 284,760 | 10.10 % | 524.69 | 1,216 | 6.57 % | 307.39 | 118,883 | 12.43 % | 634.32 | 12,883 | 6.29 % | 324.38
kljub | kljub | | kljub | 509,251 | 8.45 % | 448.80 | 13,018 | 6.02 % | 303.20 | 130,869 | 7.22 % | 411.62 | 0 | 0 % | 0 | 262,664 | 9.31 % | 483.98 | 835 | 4.51 % | 211.08 | 90,789 | 9.49 % | 484.42 | 11,076 | 5.41 % | 278.88
glede | glede | | glede | 271,297 | 4.50 % | 239.09 | 8,095 | 3.74 % | 188.54 | 121,462 | 6.70 % | 382.03 | 0 | 0 % | 0 | 106,244 | 3.77 % | 195.76 | 1,543 | 8.34 % | 390.05 | 30,305 | 3.17 % | 161.70 | 3,648 | 1.78 % | 91.85
skozi | skozi | | skozi | 261,306 | 4.33 % | 230.29 | 16,616 | 7.68 % | 387.01 | 59,303 | 3.27 % | 186.52 | 2 | 8.70 % | 206.04 | 106,627 | 3.78 % | 196.47 | 957 | 5.17 % | 241.92 | 51,512 | 5.38 % | 274.85 | 26,289 | 12.84 % | 661.93
izmed | izmed | | izmed | 231,718 | 3.84 % | 204.21 | 10,396 | 4.80 % | 242.14 | 68,140 | 3.76 % | 214.32 | 0 | 0 % | 0 | 98,094 | 3.48 % | 180.74 | 698 | 3.77 % | 176.45 | 47,539 | 4.97 % | 253.65 | 6,851 | 3.35 % | 172.50
konec | konec | | konec | 210,897 | 3.50 % | 185.86 | 3,402 | 1.57 % | 79.24 | 66,489 | 3.67 % | 209.13 | 0 | 0 % | 0 | 112,615 | 3.99 % | 207.50 | 187 | 1.01 % | 47.27 | 25,891 | 2.71 % | 138.15 | 2,313 | 1.13 % | 58.24
okoli | okoli | | okoli | 187,237 | 3.10 % | 165.01 | 7,963 | 3.68 % | 185.47 | 53,728 | 2.96 % | 168.99 | 1 | 4.35 % | 103.02 | 81,299 | 2.88 % | 149.80 | 320 | 1.73 % | 80.89 | 33,734 | 3.53 % | 179.99 | 10,192 | 4.98 % | 256.62
namesto | namesto | na | mesto | 180,360 | 2.99 % | 158.95 | 9,030 | 4.17 % | 210.32 | 42,657 | 2.35 % | 134.17 | 5 | 21.74 % | 515.09 | 81,373 | 2.88 % | 149.94 | 634 | 3.43 % | 160.27 | 38,674 | 4.04 % | 206.35 | 7,987 | 3.90 % | 201.10
sredi | sredi | | sredi | 155,626 | 2.58 % | 137.15 | 5,109 | 2.36 % | 118.99 | 39,051 | 2.15 % | 122.83 | 0 | 0 % | 0 | 75,945 | 2.69 % | 139.93 | 228 | 1.23 % | 57.64 | 25,698 | 2.69 % | 137.12 | 9,595 | 4.69 % | 241.59
preko | preko | | preko | 107,027 | 1.77 % | 94.32 | 5,005 | 2.31 % | 116.57 | 35,516 | 1.96 % | 111.71 | 9 | 39.13 % | 927.17 | 43,418 | 1.54 % | 80 | 563 | 3.04 % | 142.32 | 20,743 | 2.17 % | 110.68 | 1,773 | 0.87 % | 44.64
zoper | zoper | | zoper | 96,349 | 1.60 % | 84.91 | 2,034 | 0.94 % | 47.37 | 37,620 | 2.08 % | 118.32 | 0 | 0 % | 0 | 48,608 | 1.72 % | 89.56 | 986 | 5.33 % | 249.25 | 6,457 | 0.68 % | 34.45 | 644 | 0.32 % | 16.22
okrog | okrog | | okrog | 92,380 | 1.53 % | 81.41 | 5,996 | 2.77 % | 139.65 | 15,388 | 0.85 % | 48.40 | 0 | 0 % | 0 | 42,736 | 1.51 % | 78.74 | 304 | 1.64 % | 76.85 | 16,810 | 1.76 % | 89.69 | 11,146 | 5.44 % | 280.64
znotraj |
znotraj | zn | otraj | 80,944 | 1.34 % | 71.34 | 6,149 | 2.84 % | 143.22 | 26,268 | 1.45 % | 82.62 | 0 | 0 % | 0 | 33,649 | 1.19 % | 62 | 333 | 1.80 % | 84.18 | 13,667 | 1.43 % | 72.92 | 878 | 0.43 % | 22.11
razen | razen | | razen | 71,460 | 1.19 % | 62.98 | 3,333 | 1.54 % | 77.63 | 12,931 | 0.71 % | 40.67 | 1 | 4.35 % | 103.02 | 35,674 | 1.26 % | 65.73 | 840 | 4.54 % | 212.34 | 13,685 | 1.43 % | 73.02 | 4,996 | 2.44 % | 125.79
blizu | blizu | | blizu | 64,672 | 1.07 % | 57 | 2,885 | 1.33 % | 67.20 | 20,277 | 1.12 % | 63.78 | 0 | 0 % | 0 | 29,850 | 1.06 % | 55 | 105 | 0.57 % | 26.54 | 8,372 | 0.88 % | 44.67 | 3,183 | 1.55 % | 80.14
zunaj | zunaj | | zunaj | 58,074 | 0.96 % | 51.18 | 3,363 | 1.55 % | 78.33 | 14,572 | 0.80 % | 45.83 | 0 | 0 % | 0 | 29,682 | 1.05 % | 54.69 | 284 | 1.53 % | 71.79 | 8,737 | 0.91 % | 46.62 | 1,436 | 0.70 % | 36.16
izven | izven | | izven | 28,190 | 0.47 % | 24.84 | 1,401 | 0.65 % | 32.63 | 10,009 | 0.55 % | 31.48 | 0 | 0 % | 0 | 12,402 | 0.44 % | 22.85 | 262 | 1.42 % | 66.23 | 3,795 | 0.40 % | 20.25 | 321 | 0.16 % | 8.08
izpred | izpred | i | zpred | 26,041 | 0.43 % | 22.95 | 510 | 0.24 % | 11.88 | 5,911 | 0.33 % | 18.59 | 0 | 0 % | 0 | 15,218 | 0.54 % | 28.04 | 29 | 0.16 % | 7.33 | 3,274 | 0.34 % | 17.47 | 1,099 | 0.54 % | 27.67
izpod | izpod | | izpod | 22,299 | 0.37 % | 19.65 | 939 | 0.43 % | 21.87 | 4,581 | 0.25 % | 14.41 | 0 | 0 % | 0 | 9,967 | 0.35 % | 18.36 | 51 | 0.28 % | 12.89 | 4,214 | 0.44 % | 22.48 | 2,547 | 1.24 % | 64.13
navkljub | navkljub | nav | kljub | 16,205 | 0.27 % | 14.28 | 471 | 0.22 % | 10.97 | 3,708 | 0.20 % | 11.66 | 0 | 0 % | 0 | 7,363 | 0.26 % | 13.57 | 35 | 0.19 % | 8.85 | 4,137 | 0.43 % | 22.07 | 491 | 0.24 % | 12.36
zavoljo | zavoljo | za | voljo | 14,784 | 0.24 % | 13.03 | 423 | 0.20 % | 9.85 | 4,629 | 0.26 % | 14.56 | 0 | 0 % | 0 | 6,837 | 0.24 % | 12.60 | 5 | 0.03 % | 1.26 | 2,221 | 0.23 % | 11.85 | 669 | 0.33 % | 16.84
nasproti | nasproti | nas | proti | 13,059 | 0.22 % | 11.51 | 1,083 | 0.50 % | 25.22 | 2,243 | 0.12 % | 7.05 | 0 | 0 % | 0 | 6,206 | 0.22 % | 11.43 | 86 | 0.47 % | 21.74 | 2,087 | 0.22 % | 11.14 | 1,354 | 0.66 % | 34.09
zraven | zraven | z | raven | 11,040 | 0.18 % | 9.73 | 348 | 0.16 % | 8.11 | 1,602 | 0.09 % | 5.04 | 0 | 0 % | 0 | 3,353 | 0.12 % | 6.18 | 18 | 0.10 % | 4.55 | 2,164 | 0.23 % | 11.55 | 3,555 | 1.74 % | 89.51
vzdolž | vzdolž | v | zdolž | 9,721 | 0.16 % | 8.57 | 1,619 | 0.75 % | 37.71 | 1,790 | 0.10 % | 5.63 | 1 | 4.35 % | 103.02 | 3,028 | 0.11 % | 5.58 | 78 | 0.42 % | 19.72 | 2,009 | 0.21 % | 10.72 | 1,196 | 0.58 % | 30.11
spričo | spričo | s | pričo | 9,312 | 0.15 % | 8.21 | 798 | 0.37 % | 18.59 | 1,646 | 0.09 % | 5.18 | 0 | 0 % | 0 |
5,100 | 0.18 % | 9.40 | 92 | 0.50 % | 23.26 | 1,185 | 0.12 % | 6.32 | 491 | 0.24 % | 12.36
onkraj | onkraj | o | nkraj | 8,858 | 0.15 % | 7.81 | 705 | 0.33 % | 16.42 | 2,514 | 0.14 % | 7.91 | 0 | 0 % | 0 | 3,350 | 0.12 % | 6.17 | 27 | 0.15 % | 6.83 | 1,273 | 0.13 % | 6.79 | 989 | 0.48 % | 24.90
onstran | onstran | on | stran | 7,522 | 0.12 % | 6.63 | 752 | 0.35 % | 17.51 | 1,477 | 0.08 % | 4.65 | 0 | 0 % | 0 | 3,666 | 0.13 % | 6.75 | 38 | 0.20 % | 9.61 | 1,131 | 0.12 % | 6.03 | 458 | 0.22 % | 11.53
tekom | tekom | | tekom | 4,638 | 0.08 % | 4.09 | 174 | 0.08 % | 4.05 | 3,169 | 0.17 % | 9.97 | 0 | 0 % | 0 | 626 | 0.02 % | 1.15 | 25 | 0.14 % | 6.32 | 613 | 0.06 % | 3.27 | 31 | 0.01 % | 0.78
povrhu | povrhu | p | ovrhu | 2,397 | 0.04 % | 2.11 | 63 | 0.03 % | 1.47 | 486 | 0.03 % | 1.53 | 0 | 0 % | 0 | 1,213 | 0.04 % | 2.24 | 3 | 0.02 % | 0.76 | 477 | 0.05 % | 2.55 | 155 | 0.08 % | 3.90
širom | širom | | širom | 2,386 | 0.04 % | 2.10 | 35 | 0.02 % | 0.82 | 711 | 0.04 % | 2.24 | 0 | 0 % | 0 | 874 | 0.03 % | 1.61 | 6 | 0.03 % | 1.52 | 713 | 0.07 % | 3.80 | 47 | 0.02 % | 1.18

1.2.151 List of initial character-level 1-grams from preposition lower-case word forms in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-word_parts-prepositions-lowercase_forms-initial-1grams-taxonomy-entire.tsv

Lower-case word form | Initial part of the word | Rest of the word | Total absolute frequency of lower-case word form | Percentage of all found lower-case word forms | Total relative frequency (per million occurrences) | Absolute frequency [SSJ.T.K.S] | Percentage [SSJ.T.K.S] | Relative frequency [SSJ.T.K.S] | Absolute frequency [SSJ.I] | Percentage [SSJ.I] | Relative frequency [SSJ.I] | Absolute frequency [SSJ.T.K.N] | Percentage [SSJ.T.K.N] | Relative frequency [SSJ.T.K.N] | Absolute frequency [SSJ.T.P.C] | Percentage [SSJ.T.P.C] | Relative frequency [SSJ.T.P.C] | Absolute frequency [SSJ.T.D] | Percentage [SSJ.T.D] | Relative frequency [SSJ.T.D] | Absolute frequency [SSJ.T.P.R] | Percentage [SSJ.T.P.R] | Relative frequency [SSJ.T.P.R] | Absolute frequency [SSJ.T.K.L] | Percentage [SSJ.T.K.L] | Relative frequency [SSJ.T.K.L]
v | v | | 31,813,825 | 25.51 % | 28,037.36 | 1,138,081 | 25.82 % | 26,507.26 | 9,551,729 | 26.12 % | 30,042.67 | 299 |
22.72 % | 30,802.51 | 15,531,117 | 25.82 % | 28,617.11 | 103,250 | 23.08 % | 26,100.49 | 4,673,034 | 23.86 % | 24,933.77 | 816,315 | 23.01 % | 20,553.93
na | n | a | 19,072,589 | 15.29 % | 16,808.58 | 631,488 | 14.33 % | 14,708.11 | 5,747,052 | 15.72 % | 18,075.97 | 248 | 18.84 % | 25,548.57 | 9,072,361 | 15.08 % | 16,716.43 | 56,074 | 12.53 % | 14,174.90 | 3,009,439 | 15.37 % | 16,057.38 | 555,927 | 15.67 % | 13,997.64
za | z | a | 15,195,407 | 12.18 % | 13,391.64 | 436,419 | 9.90 % | 10,164.72 | 4,564,728 | 12.48 % | 14,357.26 | 88 | 6.69 % | 9,065.62 | 7,527,662 | 12.51 % | 13,870.22 | 67,714 | 15.13 % | 17,117.37 | 2,303,946 | 11.77 % | 12,293.10 | 294,850 | 8.31 % | 7,424
z | z | | 8,671,164 | 6.95 % | 7,641.85 | 350,573 | 7.95 % | 8,165.26 | 2,398,257 | 6.56 % | 7,543.14 | 237 | 18.01 % | 24,415.37 | 4,006,567 | 6.66 % | 7,382.36 | 31,792 | 7.11 % | 8,036.67 | 1,570,983 | 8.02 % | 8,382.25 | 312,755 | 8.81 % | 7,874.83
s | s | | 7,070,474 | 5.67 % | 6,231.17 | 276,736 | 6.28 % | 6,445.51 | 2,005,941 | 5.49 % | 6,309.21 | 167 | 12.69 % | 17,204.08 | 3,263,890 | 5.42 % | 6,013.93 | 26,986 | 6.03 % | 6,821.77 | 1,273,975 | 6.51 % | 6,797.51 | 222,779 | 6.28 % | 5,609.33
po | p | o | 6,360,566 | 5.10 % | 5,605.53 | 196,183 | 4.45 % | 4,569.34 | 1,959,908 | 5.36 % | 6,164.42 | 64 | 4.86 % | 6,593.18 | 3,054,588 | 5.08 % | 5,628.28 | 21,577 | 4.82 % | 5,454.43 | 936,691 | 4.78 % | 4,997.88 | 191,555 | 5.40 % | 4,823.15
iz | i | z | 4,006,518 | 3.21 % | 3,530.92 | 159,021 | 3.61 % | 3,703.79 | 1,060,540 | 2.90 % | 3,335.67 | 37 | 2.81 % | 3,811.68 | 2,017,138 | 3.35 % | 3,716.71 | 23,762 | 5.31 % | 6,006.78 | 611,339 | 3.12 % | 3,261.90 | 134,681 | 3.80 % | 3,391.12
pri | p | ri | 3,934,055 | 3.15 % | 3,467.06 | 178,024 | 4.04 % | 4,146.39 | 1,018,794 | 2.79 % | 3,204.37 | 47 | 3.57 % | 4,841.87 | 1,888,953 | 3.14 % | 3,480.52 | 15,360 | 3.43 % | 3,882.84 | 756,619 | 3.86 % | 4,037.07 | 76,258 | 2.15 % | 1,920.09
od | o | d | 3,855,989 | 3.09 % | 3,398.26 | 157,936 | 3.58 % | 3,678.52 | 1,061,423 | 2.90 % | 3,338.45 | 20 | 1.52 % | 2,060.37 | 1,843,270 | 3.06 % | 3,396.35 | 13,534 | 3.02 % | 3,421.25 | 638,050 | 3.26 % | 3,404.43 | 141,756 | 4.00 % | 3,569.26
o | o | | 3,670,387 | 2.94 % | 3,234.69 | 130,867 | 2.97 % | 3,048.05 | 1,103,026 | 3.02 % | 3,469.30 | 1 | 0.08 % | 103.02 | 1,828,963 | 3.04 % | 3,369.99 | 24,271 | 5.42 % | 6,135.45 | 488,092 | 2.49 % | 2,604.30 | 95,167 | 2.68 % | 2,396.20
do | d | o | 3,434,842 | 2.75 % | 3,027.11 | 128,879 | 2.92 % |
3,001.75 | 976,111 | 2.67 % | 3,070.12 | 16 | 1.22 % | 1,648.30 | 1,686,736 | 2.80 % | 3,107.92 | 13,221 | 2.96 % | 3,342.13 | 544,963 | 2.78 % | 2,907.74 | 84,916 | 2.39 % | 2,138.09
med | m | ed | 3,005,484 | 2.41 % | 2,648.72 | 125,226 | 2.84 % | 2,916.66 | 906,664 | 2.48 % | 2,851.69 | 19 | 1.44 % | 1,957.35 | 1,425,352 | 2.37 % | 2,626.31 | 7,107 | 1.59 % | 1,796.57 | 481,142 | 2.46 % | 2,567.22 | 59,974 | 1.69 % | 1,510.08
ob | o | b | 2,529,897 | 2.03 % | 2,229.59 | 68,415 | 1.55 % | 1,593.47 | 700,467 | 1.92 % | 2,203.15 | 9 | 0.68 % | 927.17 | 1,338,788 | 2.23 % | 2,466.81 | 5,209 | 1.16 % | 1,316.78 | 342,313 | 1.75 % | 1,826.47 | 74,696 | 2.10 % | 1,880.76
pred | p | red | 2,269,453 | 1.82 % | 2,000.06 | 51,884 | 1.18 % | 1,208.44 | 697,210 | 1.91 % | 2,192.91 | 7 | 0.53 % | 721.13 | 1,115,560 | 1.85 % | 2,055.49 | 5,680 | 1.27 % | 1,435.84 | 329,208 | 1.68 % | 1,756.55 | 69,904 | 1.97 % | 1,760.11
zaradi | z | aradi | 1,792,318 | 1.44 % | 1,579.56 | 53,002 | 1.20 % | 1,234.48 | 569,960 | 1.56 % | 1,792.67 | 0 | 0 % | 0 | 861,596 | 1.43 % | 1,587.55 | 5,122 | 1.15 % | 1,294.79 | 266,969 | 1.36 % | 1,424.46 | 35,669 | 1.00 % | 898.11
brez | b | rez | 895,603 | 0.72 % | 789.29 | 31,130 | 0.71 % | 725.05 | 238,407 | 0.65 % | 749.85 | 11 | 0.84 % | 1,133.20 | 419,896 | 0.70 % | 773.69 | 2,893 | 0.65 % | 731.32 | 166,614 | 0.85 % | 889 | 36,652 | 1.03 % | 922.86
proti | p | roti | 875,538 | 0.70 % | 771.61 | 31,406 | 0.71 % | 731.48 | 296,441 | 0.81 % | 932.38 | 4 | 0.30 % | 412.07 | 400,159 | 0.67 % | 737.32 | 2,536 | 0.57 % | 641.07 | 105,654 | 0.54 % | 563.74 | 39,338 | 1.11 % | 990.49
k | k | | 829,466 | 0.67 % | 731 | 45,573 | 1.03 % | 1,061.45 | 213,857 | 0.58 % | 672.64 | 14 | 1.06 % | 1,442.26 | 368,686 | 0.61 % | 679.33 | 3,567 | 0.80 % | 901.70 | 138,006 | 0.70 % | 736.35 | 59,763 | 1.69 % | 1,504.77
pod | p | od | 771,403 | 0.62 % | 679.83 | 31,212 | 0.71 % | 726.96 | 211,136 | 0.58 % | 664.08 | 6 | 0.46 % | 618.11 | 357,724 | 0.59 % | 659.13 | 3,253 | 0.73 % | 822.32 | 131,202 | 0.67 % | 700.05 | 36,870 | 1.04 % | 928.35
poleg | p | oleg | 592,281 | 0.47 % | 521.97 | 19,376 | 0.44 % | 451.29 | 155,163 | 0.42 % | 488.03 | 0 | 0 % | 0 | 284,760 | 0.47 % | 524.69 | 1,216 | 0.27 % | 307.39 | 118,883 | 0.61 % | 634.32 | 12,883 | 0.36 % | 324.38
nad | n | ad | 553,355 | 0.44 % | 487.67 | 24,668 | 0.56 % | 574.55 | 152,855 | 0.42 % | 480.77 | 1 | 0.08 % | 103.02 | 259,821 | 0.43 % | 478.74 | 1,840 | 0.41 % | 465.13 | 89,316 | 0.46 % | 476.56 | 24,854 | 0.70 % | 625.80
kljub | k | ljub | 509,251 | 0.41 % |
448.80 | 13,018 | 0.29 % | 303.20 | 130,869 | 0.36 % | 411.62 | 0 | 0 % | 0 | 262,664 | 0.44 % | 483.98 | 835 | 0.19 % | 211.08 | 90,789 | 0.46 % | 484.42 | 11,076 | 0.31 % | 278.88
čez | č | ez | 342,085 | 0.27 % | 301.48 | 12,802 | 0.29 % | 298.17 | 76,727 | 0.21 % | 241.33 | 2 | 0.15 % | 206.04 | 156,143 | 0.26 % | 287.70 | 764 | 0.17 % | 193.13 | 63,586 | 0.33 % | 339.27 | 32,061 | 0.90 % | 807.26
glede | g | lede | 271,297 | 0.22 % | 239.09 | 8,095 | 0.18 % | 188.54 | 121,462 | 0.33 % | 382.03 | 0 | 0 % | 0 | 106,244 | 0.18 % | 195.76 | 1,543 | 0.34 % | 390.05 | 30,305 | 0.15 % | 161.70 | 3,648 | 0.10 % | 91.85
skozi | s | kozi | 261,306 | 0.21 % | 230.29 | 16,616 | 0.38 % | 387.01 | 59,303 | 0.16 % | 186.52 | 2 | 0.15 % | 206.04 | 106,627 | 0.18 % | 196.47 | 957 | 0.21 % | 241.92 | 51,512 | 0.26 % | 274.85 | 26,289 | 0.74 % | 661.93
izmed | i | zmed | 231,718 | 0.19 % | 204.21 | 10,396 | 0.24 % | 242.14 | 68,140 | 0.19 % | 214.32 | 0 | 0 % | 0 | 98,094 | 0.16 % | 180.74 | 698 | 0.16 % | 176.45 | 47,539 | 0.24 % | 253.65 | 6,851 | 0.19 % | 172.50
prek | p | rek | 230,294 | 0.18 % | 202.96 | 7,496 | 0.17 % | 174.59 | 68,452 | 0.19 % | 215.30 | 0 | 0 % | 0 | 100,698 | 0.17 % | 185.54 | 543 | 0.12 % | 137.26 | 49,035 | 0.25 % | 261.63 | 4,070 | 0.12 % | 102.48
konec | k | onec | 210,897 | 0.17 % | 185.86 | 3,402 | 0.08 % | 79.24 | 66,489 | 0.18 % | 209.13 | 0 | 0 % | 0 | 112,615 | 0.19 % | 207.50 | 187 | 0.04 % | 47.27 | 25,891 | 0.13 % | 138.15 | 2,313 | 0.07 % | 58.24
okoli | o | koli | 187,237 | 0.15 % | 165.01 | 7,963 | 0.18 % | 185.47 | 53,728 | 0.15 % | 168.99 | 1 | 0.08 % | 103.02 | 81,299 | 0.14 % | 149.80 | 320 | 0.07 % | 80.89 | 33,734 | 0.17 % | 179.99 | 10,192 | 0.29 % | 256.62
namesto | n | amesto | 180,360 | 0.14 % | 158.95 | 9,030 | 0.20 % | 210.32 | 42,657 | 0.12 % | 134.17 | 5 | 0.38 % | 515.09 | 81,373 | 0.14 % | 149.94 | 634 | 0.14 % | 160.27 | 38,674 | 0.20 % | 206.35 | 7,987 | 0.23 % | 201.10
sredi | s | redi | 155,626 | 0.12 % | 137.15 | 5,109 | 0.12 % | 118.99 | 39,051 | 0.11 % | 122.83 | 0 | 0 % | 0 | 75,945 | 0.13 % | 139.93 | 228 | 0.05 % | 57.64 | 25,698 | 0.13 % | 137.12 | 9,595 | 0.27 % | 241.59
preko | p | reko | 107,027 | 0.09 % | 94.32 | 5,005 | 0.11 % | 116.57 | 35,516 | 0.10 % | 111.71 | 9 | 0.68 % | 927.17 | 43,418 | 0.07 % | 80 | 563 | 0.13 % | 142.32 | 20,743 | 0.11 % | 110.68 | 1,773 | 0.05 % | 44.64

1.2.152 List of initial character-level 2-grams from preposition lower-case word forms in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-word_parts-prepositions-lowercase_forms-initial-2grams-taxonomy-entire.tsv

Lower-case word form | Initial part of the word | Rest of the word | Total absolute frequency of lower-case word form | Percentage of all found lower-case word forms | Total relative frequency (per million occurrences) | Absolute frequency [SSJ.T.K.S] | Percentage [SSJ.T.K.S] | Relative frequency [SSJ.T.K.S] | Absolute frequency [SSJ.I] | Percentage [SSJ.I] | Relative frequency [SSJ.I] | Absolute frequency [SSJ.T.K.N] | Percentage [SSJ.T.K.N] | Relative frequency [SSJ.T.K.N] | Absolute frequency [SSJ.T.P.C] | Percentage [SSJ.T.P.C] | Relative frequency [SSJ.T.P.C] | Absolute frequency [SSJ.T.D] | Percentage [SSJ.T.D] | Relative frequency [SSJ.T.D] | Absolute frequency [SSJ.T.P.R] | Percentage [SSJ.T.P.R] | Relative frequency [SSJ.T.P.R] | Absolute frequency [SSJ.T.K.L] | Percentage [SSJ.T.K.L] | Relative frequency [SSJ.T.K.L]
na | na | | 19,072,589 | 26.27 % | 16,808.58 | 631,488 | 25.64 % | 14,708.11 | 5,747,052 | 27.01 % | 18,075.97 | 248 | 41.47 % | 25,548.57 | 9,072,361 | 25.82 % | 16,716.43 | 56,074 | 21.79 % | 14,174.90 | 3,009,439 | 26.34 % | 16,057.38 | 555,927 | 27.30 % | 13,997.64
za | za | | 15,195,407 | 20.93 % | 13,391.64 | 436,419 | 17.72 % | 10,164.72 | 4,564,728 | 21.45 % | 14,357.26 | 88 | 14.72 % | 9,065.62 | 7,527,662 | 21.42 % | 13,870.22 | 67,714 | 26.31 % | 17,117.37 | 2,303,946 | 20.16 % | 12,293.10 | 294,850 | 14.48 % | 7,424
po | po | | 6,360,566 | 8.76 % | 5,605.53 | 196,183 | 7.97 % | 4,569.34 | 1,959,908 | 9.21 % | 6,164.42 | 64 | 10.70 % | 6,593.18 | 3,054,588 | 8.69 % | 5,628.28 | 21,577 | 8.38 % | 5,454.43 | 936,691 | 8.20 % | 4,997.88 | 191,555 | 9.40 % | 4,823.15
iz | iz | | 4,006,518 | 5.52 % | 3,530.92 | 159,021 | 6.46 % | 3,703.79 | 1,060,540 | 4.98 % | 3,335.67 | 37 | 6.19 % | 3,811.68 | 2,017,138 | 5.74 % | 3,716.71 | 23,762 | 9.23 % | 6,006.78 | 611,339 | 5.35 % | 3,261.90 | 134,681 | 6.61 % | 3,391.12
pri | pr | i | 3,934,055 | 5.42 % | 3,467.06 | 178,024 | 7.23 % | 4,146.39 | 1,018,794 | 4.79 % | 3,204.37 | 47 | 7.86 % | 4,841.87 | 1,888,953 | 5.38 % | 3,480.52 | 15,360 | 5.97 % | 3,882.84 | 756,619 | 6.62 % | 4,037.07 | 76,258 | 3.74 % | 1,920.09
od | od | | 3,855,989 | 5.31 % | 3,398.26 | 157,936 |
6.41 % | 3,678.52 | 1,061,423 | 4.99 % | 3,338.45 | 20 | 3.34 % | 2,060.37 | 1,843,270 | 5.25 % | 3,396.35 | 13,534 | 5.26 % | 3,421.25 | 638,050 | 5.58 % | 3,404.43 | 141,756 | 6.96 % | 3,569.26
do | do | | 3,434,842 | 4.73 % | 3,027.11 | 128,879 | 5.23 % | 3,001.75 | 976,111 | 4.59 % | 3,070.12 | 16 | 2.68 % | 1,648.30 | 1,686,736 | 4.80 % | 3,107.92 | 13,221 | 5.14 % | 3,342.13 | 544,963 | 4.77 % | 2,907.74 | 84,916 | 4.17 % | 2,138.09
med | me | d | 3,005,484 | 4.14 % | 2,648.72 | 125,226 | 5.08 % | 2,916.66 | 906,664 | 4.26 % | 2,851.69 | 19 | 3.18 % | 1,957.35 | 1,425,352 | 4.06 % | 2,626.31 | 7,107 | 2.76 % | 1,796.57 | 481,142 | 4.21 % | 2,567.22 | 59,974 | 2.94 % | 1,510.08
ob | ob | | 2,529,897 | 3.48 % | 2,229.59 | 68,415 | 2.78 % | 1,593.47 | 700,467 | 3.29 % | 2,203.15 | 9 | 1.50 % | 927.17 | 1,338,788 | 3.81 % | 2,466.81 | 5,209 | 2.02 % | 1,316.78 | 342,313 | 3.00 % | 1,826.47 | 74,696 | 3.67 % | 1,880.76
pred | pr | ed | 2,269,453 | 3.13 % | 2,000.06 | 51,884 | 2.11 % | 1,208.44 | 697,210 | 3.28 % | 2,192.91 | 7 | 1.17 % | 721.13 | 1,115,560 | 3.17 % | 2,055.49 | 5,680 | 2.21 % | 1,435.84 | 329,208 | 2.88 % | 1,756.55 | 69,904 | 3.43 % | 1,760.11
zaradi | za | radi | 1,792,318 | 2.47 % | 1,579.56 | 53,002 | 2.15 % | 1,234.48 | 569,960 | 2.68 % | 1,792.67 | 0 | 0 % | 0 | 861,596 | 2.45 % | 1,587.55 | 5,122 | 1.99 % | 1,294.79 | 266,969 | 2.34 % | 1,424.46 | 35,669 | 1.75 % | 898.11
brez | br | ez | 895,603 | 1.23 % | 789.29 | 31,130 | 1.26 % | 725.05 | 238,407 | 1.12 % | 749.85 | 11 | 1.84 % | 1,133.20 | 419,896 | 1.20 % | 773.69 | 2,893 | 1.12 % | 731.32 | 166,614 | 1.46 % | 889 | 36,652 | 1.80 % | 922.86
proti | pr | oti | 875,538 | 1.21 % | 771.61 | 31,406 | 1.27 % | 731.48 | 296,441 | 1.39 % | 932.38 | 4 | 0.67 % | 412.07 | 400,159 | 1.14 % | 737.32 | 2,536 | 0.98 % | 641.07 | 105,654 | 0.93 % | 563.74 | 39,338 | 1.93 % | 990.49
pod | po | d | 771,403 | 1.06 % | 679.83 | 31,212 | 1.27 % | 726.96 | 211,136 | 0.99 % | 664.08 | 6 | 1.00 % | 618.11 | 357,724 | 1.02 % | 659.13 | 3,253 | 1.26 % | 822.32 | 131,202 | 1.15 % | 700.05 | 36,870 | 1.81 % | 928.35
poleg | po | leg | 592,281 | 0.82 % | 521.97 | 19,376 | 0.79 % | 451.29 | 155,163 | 0.73 % | 488.03 | 0 | 0 % | 0 | 284,760 | 0.81 % | 524.69 | 1,216 | 0.47 % | 307.39 | 118,883 | 1.04 % | 634.32 | 12,883 | 0.63 % | 324.38
nad | na | d | 553,355 | 0.76 % | 487.67 | 24,668 | 1.00 % | 574.55 | 152,855 | 0.72 % | 480.77 | 1 | 0.17 % | 103.02 | 259,821 | 0.74 % | 478.74 | 1,840 | 0.71 % | 465.13 | 89,316 | 0.78 % | 476.56 | 24,854 | 1.22 % | 625.80
kljub | kl | jub | 509,251 | 0.70 % | 448.80 | 13,018 | 0.53 % | 303.20 | 130,869 | 0.61 % | 411.62 | 0 | 0 % | 0 | 262,664 | 0.75 % | 483.98 | 835 | 0.32 % | 211.08 | 90,789 | 0.79 % | 484.42 | 11,076 | 0.54 % | 278.88
čez | če | z | 342,085 | 0.47 % | 301.48 | 12,802 | 0.52 % | 298.17 | 76,727 | 0.36 % | 241.33 | 2 | 0.33 % | 206.04 | 156,143 | 0.44 % | 287.70 | 764 | 0.30 % | 193.13 | 63,586 | 0.56 % | 339.27 | 32,061 | 1.57 % | 807.26
glede | gl | ede | 271,297 | 0.37 % | 239.09 | 8,095 | 0.33 % | 188.54 | 121,462 | 0.57 % | 382.03 | 0 | 0 % | 0 | 106,244 | 0.30 % | 195.76 | 1,543 | 0.60 % | 390.05 | 30,305 | 0.27 % | 161.70 | 3,648 | 0.18 % | 91.85
skozi | sk | ozi | 261,306 | 0.36 % | 230.29 | 16,616 | 0.68 % | 387.01 | 59,303 | 0.28 % | 186.52 | 2 | 0.33 % | 206.04 | 106,627 | 0.30 % | 196.47 | 957 | 0.37 % | 241.92 | 51,512 | 0.45 % | 274.85 | 26,289 | 1.29 % | 661.93
izmed | iz | med | 231,718 | 0.32 % | 204.21 | 10,396 | 0.42 % | 242.14 | 68,140 | 0.32 % | 214.32 | 0 | 0 % | 0 | 98,094 | 0.28 % | 180.74 | 698 | 0.27 % | 176.45 | 47,539 | 0.42 % | 253.65 | 6,851 | 0.34 % | 172.50
prek | pr | ek | 230,294 | 0.32 % | 202.96 | 7,496 | 0.30 % | 174.59 | 68,452 | 0.32 % | 215.30 | 0 | 0 % | 0 | 100,698 | 0.29 % | 185.54 | 543 | 0.21 % | 137.26 | 49,035 | 0.43 % | 261.63 | 4,070 | 0.20 % | 102.48
konec | ko | nec | 210,897 | 0.29 % | 185.86 | 3,402 | 0.14 % | 79.24 | 66,489 | 0.31 % | 209.13 | 0 | 0 % | 0 | 112,615 | 0.32 % | 207.50 | 187 | 0.07 % | 47.27 | 25,891 | 0.23 % | 138.15 | 2,313 | 0.11 % | 58.24
okoli | ok | oli | 187,237 | 0.26 % | 165.01 | 7,963 | 0.32 % | 185.47 | 53,728 | 0.25 % | 168.99 | 1 | 0.17 % | 103.02 | 81,299 | 0.23 % | 149.80 | 320 | 0.12 % | 80.89 | 33,734 | 0.29 % | 179.99 | 10,192 | 0.50 % | 256.62
namesto | na | mesto | 180,360 | 0.25 % | 158.95 | 9,030 | 0.37 % | 210.32 | 42,657 | 0.20 % | 134.17 | 5 | 0.84 % | 515.09 | 81,373 | 0.23 % | 149.94 | 634 | 0.25 % | 160.27 | 38,674 | 0.34 % | 206.35 | 7,987 | 0.39 % | 201.10
sredi | sr | edi | 155,626 | 0.21 % | 137.15 | 5,109 | 0.21 % | 118.99 | 39,051 | 0.18 % | 122.83 | 0 | 0 % | 0 | 75,945 | 0.22 % | 139.93 | 228 | 0.09 % | 57.64 | 25,698 | 0.23 % | 137.12 | 9,595 | 0.47 % | 241.59
preko | pr | eko | 107,027 | 0.15 % | 94.32 | 5,005 | 0.20 % | 116.57 | 35,516 | 0.17 % | 111.71 | 9 | 1.50 % | 927.17 | 43,418 | 0.12 % | 80 | 563 | 0.22 % | 142.32 | 20,743 | 0.18 % | 110.68 | 1,773 | 0.09 % | 44.64
zoper | zo | per | 96,349 | 0.13 % | 84.91 | 2,034 | 0.08 % | 47.37 | 37,620 | 0.18 % | 118.32 | 0 | 0 % | 0 | 48,608 | 0.14 % | 89.56 | 986 | 0.38 % | 249.25 | 6,457 | 0.06 % | 34.45 | 644 | 0.03 % |
16.22
okrog | ok | rog | 92,380 | 0.13 % | 81.41 | 5,996 | 0.24 % | 139.65 | 15,388 | 0.07 % | 48.40 | 0 | 0 % | 0 | 42,736 | 0.12 % | 78.74 | 304 | 0.12 % | 76.85 | 16,810 | 0.15 % | 89.69 | 11,146 | 0.55 % | 280.64
znotraj | zn | otraj | 80,944 | 0.11 % | 71.34 | 6,149 | 0.25 % | 143.22 | 26,268 | 0.12 % | 82.62 | 0 | 0 % | 0 | 33,649 | 0.10 % | 62 | 333 | 0.13 % | 84.18 | 13,667 | 0.12 % | 72.92 | 878 | 0.04 % | 22.11
razen | ra | zen | 71,460 | 0.10 % | 62.98 | 3,333 | 0.14 % | 77.63 | 12,931 | 0.06 % | 40.67 | 1 | 0.17 % | 103.02 | 35,674 | 0.10 % | 65.73 | 840 | 0.33 % | 212.34 | 13,685 | 0.12 % | 73.02 | 4,996 | 0.24 % | 125.79
mimo | mi | mo | 70,089 | 0.10 % | 61.77 | 2,943 | 0.12 % | 68.55 | 15,955 | 0.07 % | 50.18 | 0 | 0 % | 0 | 32,735 | 0.09 % | 60.32 | 175 | 0.07 % | 44.24 | 10,508 | 0.09 % | 56.07 | 7,773 | 0.38 % | 195.72

1.2.153 List of initial character-level 3-grams from preposition lower-case word forms in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-word_parts-prepositions-lowercase_forms-initial-3grams-taxonomy-entire.tsv

Lower-case word form | Initial part of the word | Rest of the word | Total absolute frequency of lower-case word form | Percentage of all found lower-case word forms | Total relative frequency (per million occurrences) | Absolute frequency [SSJ.T.K.S] | Percentage [SSJ.T.K.S] | Relative frequency [SSJ.T.K.S] | Absolute frequency [SSJ.I] | Percentage [SSJ.I] | Relative frequency [SSJ.I] | Absolute frequency [SSJ.T.K.N] | Percentage [SSJ.T.K.N] | Relative frequency [SSJ.T.K.N] | Absolute frequency [SSJ.T.P.C] | Percentage [SSJ.T.P.C] | Relative frequency [SSJ.T.P.C] | Absolute frequency [SSJ.T.D] | Percentage [SSJ.T.D] | Relative frequency [SSJ.T.D] | Absolute frequency [SSJ.T.P.R] | Percentage [SSJ.T.P.R] | Relative frequency [SSJ.T.P.R] | Absolute frequency [SSJ.T.K.L] | Percentage [SSJ.T.K.L] | Relative frequency [SSJ.T.K.L]
pri | pri | | 3,934,055 | 21.68 % | 3,467.06 | 178,024 | 26.02 % | 4,146.39 | 1,018,794 | 19.56 % | 3,204.37 | 47 | 40.52 % | 4,841.87 | 1,888,953 | 21.98 % | 3,480.52 | 15,360 | 27.30 % | 3,882.84 | 756,619 | 24.88 % | 4,037.07 | 76,258 | 13.66 % | 1,920.09
med | med | | 3,005,484 | 16.56 % | 2,648.72 |
125,226 | 18.30 % | 2,916.66 | 906,664 | 17.40 % | 2,851.69 | 19 | 16.38 % | 1,957.35 | 1,425,352 | 16.58 % | 2,626.31 | 7,107 | 12.63 % | 1,796.57 | 481,142 | 15.82 % | 2,567.22 | 59,974 | 10.74 % | 1,510.08
pred | pre | d | 2,269,453 | 12.51 % | 2,000.06 | 51,884 | 7.58 % | 1,208.44 | 697,210 | 13.38 % | 2,192.91 | 7 | 6.03 % | 721.13 | 1,115,560 | 12.98 % | 2,055.49 | 5,680 | 10.10 % | 1,435.84 | 329,208 | 10.83 % | 1,756.55 | 69,904 | 12.52 % | 1,760.11
zaradi | zar | adi | 1,792,318 | 9.88 % | 1,579.56 | 53,002 | 7.75 % | 1,234.48 | 569,960 | 10.94 % | 1,792.67 | 0 | 0 % | 0 | 861,596 | 10.02 % | 1,587.55 | 5,122 | 9.11 % | 1,294.79 | 266,969 | 8.78 % | 1,424.46 | 35,669 | 6.39 % | 898.11
brez | bre | z | 895,603 | 4.94 % | 789.29 | 31,130 | 4.55 % | 725.05 | 238,407 | 4.58 % | 749.85 | 11 | 9.48 % | 1,133.20 | 419,896 | 4.88 % | 773.69 | 2,893 | 5.14 % | 731.32 | 166,614 | 5.48 % | 889 | 36,652 | 6.56 % | 922.86
proti | pro | ti | 875,538 | 4.83 % | 771.61 | 31,406 | 4.59 % | 731.48 | 296,441 | 5.69 % | 932.38 | 4 | 3.45 % | 412.07 | 400,159 | 4.66 % | 737.32 | 2,536 | 4.51 % | 641.07 | 105,654 | 3.48 % | 563.74 | 39,338 | 7.04 % | 990.49
pod | pod | | 771,403 | 4.25 % | 679.83 | 31,212 | 4.56 % | 726.96 | 211,136 | 4.05 % | 664.08 | 6 | 5.17 % | 618.11 | 357,724 | 4.16 % | 659.13 | 3,253 | 5.78 % | 822.32 | 131,202 | 4.32 % | 700.05 | 36,870 | 6.60 % | 928.35
poleg | pol | eg | 592,281 | 3.26 % | 521.97 | 19,376 | 2.83 % | 451.29 | 155,163 | 2.98 % | 488.03 | 0 | 0 % | 0 | 284,760 | 3.31 % | 524.69 | 1,216 | 2.16 % | 307.39 | 118,883 | 3.91 % | 634.32 | 12,883 | 2.31 % | 324.38
nad | nad | | 553,355 | 3.05 % | 487.67 | 24,668 | 3.60 % | 574.55 | 152,855 | 2.93 % | 480.77 | 1 | 0.86 % | 103.02 | 259,821 | 3.02 % | 478.74 | 1,840 | 3.27 % | 465.13 | 89,316 | 2.94 % | 476.56 | 24,854 | 4.45 % | 625.80
kljub | klj | ub | 509,251 | 2.81 % | 448.80 | 13,018 | 1.90 % | 303.20 | 130,869 | 2.51 % | 411.62 | 0 | 0 % | 0 | 262,664 | 3.06 % | 483.98 | 835 | 1.48 % | 211.08 | 90,789 | 2.99 % | 484.42 | 11,076 | 1.98 % | 278.88
čez | čez | | 342,085 | 1.89 % | 301.48 | 12,802 | 1.87 % | 298.17 | 76,727 | 1.47 % | 241.33 | 2 | 1.72 % | 206.04 | 156,143 | 1.82 % | 287.70 | 764 | 1.36 % | 193.13 | 63,586 | 2.09 % | 339.27 | 32,061 | 5.74 % | 807.26
glede | gle | de | 271,297 | 1.50 % | 239.09 | 8,095 | 1.18 % | 188.54 | 121,462 | 2.33 % | 382.03 | 0 | 0 % | 0 | 106,244 | 1.24 % | 195.76 | 1,543 | 2.74 % | 390.05 | 30,305 | 1.00 % | 161.70 | 3,648 | 0.65 % | 91.85
skozi | sko | zi | 261,306 | 1.44 % | 230.29 | 16,616 | 2.43 % | 387.01 |
59,303 | 1.14 % | 186.52 | 2 | 1.72 % | 206.04 | 106,627 | 1.24 % | 196.47 | 957 | 1.70 % | 241.92 | 51,512 | 1.69 % | 274.85 | 26,289 | 4.71 % | 661.93
izmed | izm | ed | 231,718 | 1.28 % | 204.21 | 10,396 | 1.52 % | 242.14 | 68,140 | 1.31 % | 214.32 | 0 | 0 % | 0 | 98,094 | 1.14 % | 180.74 | 698 | 1.24 % | 176.45 | 47,539 | 1.56 % | 253.65 | 6,851 | 1.23 % | 172.50
prek | pre | k | 230,294 | 1.27 % | 202.96 | 7,496 | 1.09 % | 174.59 | 68,452 | 1.31 % | 215.30 | 0 | 0 % | 0 | 100,698 | 1.17 % | 185.54 | 543 | 0.96 % | 137.26 | 49,035 | 1.61 % | 261.63 | 4,070 | 0.73 % | 102.48
konec | kon | ec | 210,897 | 1.16 % | 185.86 | 3,402 | 0.50 % | 79.24 | 66,489 | 1.28 % | 209.13 | 0 | 0 % | 0 | 112,615 | 1.31 % | 207.50 | 187 | 0.33 % | 47.27 | 25,891 | 0.85 % | 138.15 | 2,313 | 0.41 % | 58.24
okoli | oko | li | 187,237 | 1.03 % | 165.01 | 7,963 | 1.16 % | 185.47 | 53,728 | 1.03 % | 168.99 | 1 | 0.86 % | 103.02 | 81,299 | 0.95 % | 149.80 | 320 | 0.57 % | 80.89 | 33,734 | 1.11 % | 179.99 | 10,192 | 1.82 % | 256.62
namesto | nam | esto | 180,360 | 0.99 % | 158.95 | 9,030 | 1.32 % | 210.32 | 42,657 | 0.82 % | 134.17 | 5 | 4.31 % | 515.09 | 81,373 | 0.95 % | 149.94 | 634 | 1.13 % | 160.27 | 38,674 | 1.27 % | 206.35 | 7,987 | 1.43 % | 201.10
sredi | sre | di | 155,626 | 0.86 % | 137.15 | 5,109 | 0.75 % | 118.99 | 39,051 | 0.75 % | 122.83 | 0 | 0 % | 0 | 75,945 | 0.88 % | 139.93 | 228 | 0.41 % | 57.64 | 25,698 | 0.84 % | 137.12 | 9,595 | 1.72 % | 241.59
preko | pre | ko | 107,027 | 0.59 % | 94.32 | 5,005 | 0.73 % | 116.57 | 35,516 | 0.68 % | 111.71 | 9 | 7.76 % | 927.17 | 43,418 | 0.51 % | 80 | 563 | 1.00 % | 142.32 | 20,743 | 0.68 % | 110.68 | 1,773 | 0.32 % | 44.64
zoper | zop | er | 96,349 | 0.53 % | 84.91 | 2,034 | 0.30 % | 47.37 | 37,620 | 0.72 % | 118.32 | 0 | 0 % | 0 | 48,608 | 0.57 % | 89.56 | 986 | 1.75 % | 249.25 | 6,457 | 0.21 % | 34.45 | 644 | 0.12 % | 16.22
okrog | okr | og | 92,380 | 0.51 % | 81.41 | 5,996 | 0.88 % | 139.65 | 15,388 | 0.29 % | 48.40 | 0 | 0 % | 0 | 42,736 | 0.50 % | 78.74 | 304 | 0.54 % | 76.85 | 16,810 | 0.55 % | 89.69 | 11,146 | 2.00 % | 280.64
znotraj | zno | traj | 80,944 | 0.45 % | 71.34 | 6,149 | 0.90 % | 143.22 | 26,268 | 0.50 % | 82.62 | 0 | 0 % | 0 | 33,649 | 0.39 % | 62 | 333 | 0.59 % | 84.18 | 13,667 | 0.45 % | 72.92 | 878 | 0.16 % | 22.11
razen | raz | en | 71,460 | 0.39 % | 62.98 | 3,333 | 0.49 % | 77.63 | 12,931 | 0.25 % | 40.67 | 1 | 0.86 % | 103.02 | 35,674 | 0.41 % | 65.73 | 840 | 1.49 % | 212.34 | 13,685 | 0.45 % | 73.02 | 4,996 | 0.90 % | 125.79
mimo | mim | o | 70,089 | 0.39 % | 61.77 | 2,943 | 0.43 % | 68.55 | 15,955 | 0.31 % | 50.18 | 0 | 0 % | 0 |
32,735 | 0.38 % | 60.32 | 175 | 0.31 % | 44.24 | 10,508 | 0.35 % | 56.07 | 7,773 | 1.39 % | 195.72
blizu | bli | zu | 64,672 | 0.36 % | 57 | 2,885 | 0.42 % | 67.20 | 20,277 | 0.39 % | 63.78 | 0 | 0 % | 0 | 29,850 | 0.35 % | 55 | 105 | 0.19 % | 26.54 | 8,372 | 0.28 % | 44.67 | 3,183 | 0.57 % | 80.14
zunaj | zun | aj | 58,074 | 0.32 % | 51.18 | 3,363 | 0.49 % | 78.33 | 14,572 | 0.28 % | 45.83 | 0 | 0 % | 0 | 29,682 | 0.34 % | 54.69 | 284 | 0.51 % | 71.79 | 8,737 | 0.29 % | 46.62 | 1,436 | 0.26 % | 36.16
izven | izv | en | 28,190 | 0.15 % | 24.84 | 1,401 | 0.20 % | 32.63 | 10,009 | 0.19 % | 31.48 | 0 | 0 % | 0 | 12,402 | 0.14 % | 22.85 | 262 | 0.47 % | 66.23 | 3,795 | 0.12 % | 20.25 | 321 | 0.06 % | 8.08
izpred | izp | red | 26,041 | 0.14 % | 22.95 | 510 | 0.07 % | 11.88 | 5,911 | 0.11 % | 18.59 | 0 | 0 % | 0 | 15,218 | 0.18 % | 28.04 | 29 | 0.05 % | 7.33 | 3,274 | 0.11 % | 17.47 | 1,099 | 0.20 % | 27.67
izpod | izp | od | 22,299 | 0.12 % | 19.65 | 939 | 0.14 % | 21.87 | 4,581 | 0.09 % | 14.41 | 0 | 0 % | 0 | 9,967 | 0.12 % | 18.36 | 51 | 0.09 % | 12.89 | 4,214 | 0.14 % | 22.48 | 2,547 | 0.46 % | 64.13
navkljub | nav | kljub | 16,205 | 0.09 % | 14.28 | 471 | 0.07 % | 10.97 | 3,708 | 0.07 % | 11.66 | 0 | 0 % | 0 | 7,363 | 0.09 % | 13.57 | 35 | 0.06 % | 8.85 | 4,137 | 0.14 % | 22.07 | 491 | 0.09 % | 12.36
zavoljo | zav | oljo | 14,784 | 0.08 % | 13.03 | 423 | 0.06 % | 9.85 | 4,629 | 0.09 % | 14.56 | 0 | 0 % | 0 | 6,837 | 0.08 % | 12.60 | 5 | 0.01 % | 1.26 | 2,221 | 0.07 % | 11.85 | 669 | 0.12 % | 16.84

1.2.154 List of initial character-level 4-grams from preposition lower-case word forms in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-word_parts-prepositions-lowercase_forms-initial-4grams-taxonomy-entire.tsv

Lower-case word form | Initial part of the word | Rest of the word | Total absolute frequency of lower-case word form | Percentage of all found lower-case word forms | Total relative frequency (per million occurrences) | Absolute frequency [SSJ.T.K.S] | Percentage [SSJ.T.K.S] | Relative frequency [SSJ.T.K.S] | Absolute frequency [SSJ.I] | Percentage [SSJ.I] | Relative frequency [SSJ.I] | Absolute frequency [SSJ.T.K.N] | Percentage [SSJ.T.K.N] | Relative frequency [SSJ.T.K.N] | Absolute frequency [SSJ.T.P.C] | Percentage [SSJ.T.P.C] | Relative frequency [SSJ.T.P.C] | Absolute frequency [SSJ.T.D] | Percentage [SSJ.T.D] | Relative frequency [SSJ.T.D] | Absolute frequency [SSJ.T.P.R] | Percentage [SSJ.T.P.R] | Relative frequency [SSJ.T.P.R] | Absolute frequency [SSJ.T.K.L] | Percentage [SSJ.T.K.L] | Relative frequency [SSJ.T.K.L]
pred | pred | | 2,269,453 | 23.82 % | 2,000.06 | 51,884 | 16.65 % | 1,208.44 | 697,210 | 24.55 % | 2,192.91 | 7 | 17.07 % | 721.13 | 1,115,560 | 24.78 % | 2,055.49 | 5,680 | 20.34 % | 1,435.84 | 329,208 | 21.70 % | 1,756.55 | 69,904 | 21.34 % | 1,760.11
zaradi | zara | di | 1,792,318 | 18.81 % | 1,579.56 | 53,002 | 17.01 % | 1,234.48 | 569,960 | 20.07 % | 1,792.67 | 0 | 0 % | 0 | 861,596 | 19.14 % | 1,587.55 | 5,122 | 18.35 % | 1,294.79 | 266,969 | 17.60 % | 1,424.46 | 35,669 | 10.89 % | 898.11
brez | brez | | 895,603 | 9.40 % | 789.29 | 31,130 | 9.99 % | 725.05 | 238,407 | 8.39 % | 749.85 | 11 | 26.83 % | 1,133.20 | 419,896 | 9.33 % | 773.69 | 2,893 | 10.36 % | 731.32 | 166,614 | 10.98 % | 889 | 36,652 | 11.19 % | 922.86
proti | prot | i | 875,538 | 9.19 % | 771.61 | 31,406 | 10.08 % | 731.48 | 296,441 | 10.44 % | 932.38 | 4 | 9.76 % | 412.07 | 400,159 | 8.89 % | 737.32 | 2,536 | 9.08 % | 641.07 | 105,654 | 6.97 % | 563.74 | 39,338 | 12.01 % | 990.49
poleg | pole | g | 592,281 | 6.22 % | 521.97 | 19,376 | 6.22 % | 451.29 | 155,163 | 5.46 % | 488.03 | 0 | 0 % | 0 | 284,760 | 6.33 % | 524.69 | 1,216 | 4.36 % | 307.39 | 118,883 | 7.84 % | 634.32 | 12,883 | 3.93 % | 324.38
kljub | klju | b | 509,251 | 5.35 % | 448.80 | 13,018 | 4.18 % | 303.20 | 130,869 | 4.61 % | 411.62 | 0 | 0 % | 0 | 262,664 | 5.83 % | 483.98 | 835 | 2.99 % | 211.08 | 90,789 | 5.99 % | 484.42 | 11,076 | 3.38 % | 278.88
glede | gled | e | 271,297 | 2.85 % | 239.09 | 8,095 | 2.60 % | 188.54 | 121,462 | 4.28 % | 382.03 | 0 | 0 % | 0 | 106,244 | 2.36 % | 195.76 | 1,543 | 5.53 % | 390.05 | 30,305 | 2.00 % | 161.70 | 3,648 | 1.11 % | 91.85
skozi | skoz | i | 261,306 | 2.74 % | 230.29 | 16,616 | 5.33 % | 387.01 | 59,303 | 2.09 % | 186.52 | 2 | 4.88 % | 206.04 | 106,627 | 2.37 % | 196.47 | 957 | 3.43 % | 241.92 | 51,512 | 3.40 % | 274.85 | 26,289 | 8.03 % | 661.93
izmed | izme | d | 231,718 | 2.43 % | 204.21 | 10,396 | 3.34 % | 242.14 | 68,140 | 2.40 % | 214.32 | 0 | 0 % | 0 | 98,094 | 2.18 % | 180.74 | 698 | 2.50 % | 176.45 | 47,539 | 3.13 % | 253.65 | 6,851 | 2.09 % | 172.50
prek | prek | | 230,294 | 2.42 % | 202.96 | 7,496 | 2.41 % | 174.59 | 68,452 | 2.41 % | 215.30 | 0 | 0 % | 0 | 100,698 | 2.24 % | 185.54 | 543 | 1.95 % | 137.26 | 49,035 | 3.23 % | 261.63 |
4,070 1.24 % 102.48 konec kone c 210,897 2.21 % 185.86 3,402 1.09 % 79.24 66,489 2.34 % 209.13 0 0 % 0 112,615 2.50 % 207.50 187 0.67 % 47.27 25,891 1.71 % 138.15 2,313 0.71 % 58.24 okoli okol i 187,237 1.97 % 165.01 7,963 2.56 % 185.47 53,728 1.89 % 168.99 1 2.44 % 103.02 81,299 1.81 % 149.80 320 1.15 % 80.89 33,734 2.22 % 179.99 10,192 3.11 % 256.62 namesto name sto 180,360 1.89 % 158.95 9,030 2.90 % 210.32 42,657 1.50 % 134.17 5 12.20 % 515.09 81,373 1.81 % 149.94 634 2.27 % 160.27 38,674 2.55 % 206.35 7,987 2.44 % 201.10 sredi sred i 155,626 1.63 % 137.15 5,109 1.64 % 118.99 39,051 1.38 % 122.83 0 0 % 0 75,945 1.69 % 139.93 228 0.82 % 57.64 25,698 1.69 % 137.12 9,595 2.93 % 241.59 preko prek o 107,027 1.12 % 94.32 5,005 1.61 % 116.57 35,516 1.25 % 111.71 9 21.95 % 927.17 43,418 0.96 % 80 563 2.02 % 142.32 20,743 1.37 % 110.68 1,773 0.54 % 44.64 zoper zope r 96,349 1.01 % 84.91 2,034 0.65 % 47.37 37,620 1.32 % 118.32 0 0 % 0 48,608 1.08 % 89.56 986 3.53 % 249.25 6,457 0.43 % 34.45 644 0.20 % 16.22 okrog okro g 92,380 0.97 % 81.41 5,996 1.92 % 139.65 15,388 0.54 % 48.40 0 0 % 0 42,736 0.95 % 78.74 304 1.09 % 76.85 16,810 1.11 % 89.69 11,146 3.40 % 280.64 znotraj znot raj 80,944 0.85 % 71.34 6,149 1.97 % 143.22 26,268 0.93 % 82.62 0 0 % 0 33,649 0.75 % 62 333 1.19 % 84.18 13,667 0.90 % 72.92 878 0.27 % 22.11 razen raze n 71,460 0.75 % 62.98 3,333 1.07 % 77.63 12,931 0.46 % 40.67 1 2.44 % 103.02 35,674 0.79 % 65.73 840 3.01 % 212.34 13,685 0.90 % 73.02 4,996 1.52 % 125.79 mimo mimo 70,089 0.74 % 61.77 2,943 0.94 % 68.55 15,955 0.56 % 50.18 0 0 % 0 32,735 0.73 % 60.32 175 0.63 % 44.24 10,508 0.69 % 56.07 7,773 2.37 % 195.72 blizu bliz u 64,672 0.68 % 57 2,885 0.93 % 67.20 20,277 0.71 % 63.78 0 0 % 0 29,850 0.66 % 55 105 0.38 % 26.54 8,372 0.55 % 44.67 3,183 0.97 % 80.14 zunaj zuna j 58,074 0.61 % 51.18 3,363 1.08 % 78.33 14,572 0.51 % 45.83 0 0 % 0 29,682 0.66 % 54.69 284 1.02 % 71.79 8,737 0.58 % 46.62 1,436 0.44 % 36.16 izven izve n 28,190 0.30 % 24.84 1,401 0.45 
% 32.63 10,009 0.35 % 31.48 0 0 % 0 12,402 0.28 % 22.85 262 0.94 % 66.23 3,795 0.25 % 20.25 321 0.10 % 8.08 izpred izpr ed 26,041 0.27 % 22.95 510 0.16 % 11.88 5,911 0.21 % 18.59 0 0 % 0 15,218 0.34 % 28.04 29 0.10 % 7.33 3,274 0.22 % 17.47 1,099 0.34 % 27.67 izpod izpo d 22,299 0.23 % 19.65 939 0.30 % 21.87 4,581 0.16 % 14.41 0 0 % 0 9,967 0.22 % 18.36 51 0.18 % 12.89 4,214 0.28 % 22.48 2,547 0.78 % 64.13 navkljub navk ljub 16,205 0.17 % 14.28 471 0.15 % 10.97 3,708 0.13 % 11.66 0 0 % 0 7,363 0.16 % 13.57 35 0.12 % 8.85 4,137 0.27 % 22.07 491 0.15 % 12.36 zavoljo zavo ljo 14,784 0.15 % 13.03 423 0.14 % 9.85 4,629 0.16 % 14.56 0 0 % 0 6,837 0.15 % 12.60 5 0.02 % 1.26 2,221 0.15 % 11.85 669 0.20 % 16.84 izza izza 13,225 0.14 % 11.66 453 0.14 % 10.55 2,746 0.10 % 8.64 0 0 % 0 5,279 0.12 % 9.73 27 0.10 % 6.83 2,222 0.15 % 11.86 2,498 0.76 % 62.90 nasproti nasp roti 13,059 0.14 % 11.51 1,083 0.35 % 25.22 2,243 0.08 % 7.05 0 0 % 0 6,206 0.14 % 11.43 86 0.31 % 21.74 2,087 0.14 % 11.14 1,354 0.41 % 34.09 zraven zrav en 11,040 0.12 % 9.73 348 0.11 % 8.11 1,602 0.06 % 5.04 0 0 % 0 3,353 0.07 % 6.18 18 0.06 % 4.55 2,164 0.14 % 11.55 3,555 1.08 % 89.51 vzdolž vzdo lž 9,721 0.10 % 8.57 1,619 0.52 % 37.71 1,790 0.06 % 5.63 1 2.44 % 103.02 3,028 0.07 % 5.58 78 0.28 % 19.72 2,009 0.13 % 10.72 1,196 0.36 % 30.11 spričo spri čo 9,312 0.10 % 8.21 798 0.26 % 18.59 1,646 0.06 % 5.18 0 0 % 0 5,100 0.11 % 9.40 92 0.33 % 23.26 1,185 0.08 % 6.32 491 0.15 % 12.36 CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 171 File at CLARIN.SI 1.2.155 List of initial character-level 5-grams from preposition lower-case word forms in the Gigafida 2.0 corpus with text-type distributionGF2.0-word_parts-prepositions-lowercase_ forms-initial-5grams-taxonomy-entire.tsvLower-case word form Initial part of the word Rest of the word Total absolute frequency of lower-case word form Percentage of all found lower-case word forms Total relative frequency (per million occurrences) 
Absolute frequency [SSJ.T.K.S] Percentage [SSJ.T.K.S] Relative frequency [SSJ.T.K.S] Absolute frequency [SSJ.I] Percentage [SSJ.I] Relative frequency [SSJ.I] Absolute frequency [SSJ.T.K.N] Percentage [SSJ.T.K.N] Relative frequency [SSJ.T.K.N] Absolute frequency [SSJ.T.P.C] Percentage [SSJ.T.P.C] Relative frequency [SSJ.T.P.C] Absolute frequency [SSJ.T.D] Percentage [SSJ.T.D] Relative frequency [SSJ.T.D] Absolute frequency [SSJ.T.P.R] Percentage [SSJ.T.P.R] Relative frequency [SSJ.T.P.R] Absolute frequency [SSJ.T.K.L] Percentage [SSJ.T.K.L] Relative frequency [SSJ.T.K.L] zaradi zarad i 1,792,318 29.72 % 1,579.56 53,002 24.50 % 1,234.48 569,960 31.44 % 1,792.67 0 0 % 0 861,596 30.54 % 1,587.55 5,122 27.69 % 1,294.79 266,969 27.91 % 1,424.46 35,669 17.42 % 898.11 proti proti 875,538 14.52 % 771.61 31,406 14.51 % 731.48 296,441 16.35 % 932.38 4 17.39 % 412.07 400,159 14.19 % 737.32 2,536 13.71 % 641.07 105,654 11.04 % 563.74 39,338 19.21 % 990.49 poleg poleg 592,281 9.82 % 521.97 19,376 8.96 % 451.29 155,163 8.56 % 488.03 0 0 % 0 284,760 10.10 % 524.69 1,216 6.57 % 307.39 118,883 12.43 % 634.32 12,883 6.29 % 324.38 kljub kljub 509,251 8.45 % 448.80 13,018 6.02 % 303.20 130,869 7.22 % 411.62 0 0 % 0 262,664 9.31 % 483.98 835 4.51 % 211.08 90,789 9.49 % 484.42 11,076 5.41 % 278.88 glede glede 271,297 4.50 % 239.09 8,095 3.74 % 188.54 121,462 6.70 % 382.03 0 0 % 0 106,244 3.77 % 195.76 1,543 8.34 % 390.05 30,305 3.17 % 161.70 3,648 1.78 % 91.85 skozi skozi 261,306 4.33 % 230.29 16,616 7.68 % 387.01 59,303 3.27 % 186.52 2 8.70 % 206.04 106,627 3.78 % 196.47 957 5.17 % 241.92 51,512 5.38 % 274.85 26,289 12.84 % 661.93 izmed izmed 231,718 3.84 % 204.21 10,396 4.80 % 242.14 68,140 3.76 % 214.32 0 0 % 0 98,094 3.48 % 180.74 698 3.77 % 176.45 47,539 4.97 % 253.65 6,851 3.35 % 172.50 konec konec 210,897 3.50 % 185.86 3,402 1.57 % 79.24 66,489 3.67 % 209.13 0 0 % 0 112,615 3.99 % 207.50 187 1.01 % 47.27 25,891 2.71 % 138.15 2,313 1.13 % 58.24 okoli okoli 187,237 3.10 % 165.01 
7,963 3.68 % 185.47 53,728 2.96 % 168.99 1 4.35 % 103.02 81,299 2.88 % 149.80 320 1.73 % 80.89 33,734 3.53 % 179.99 10,192 4.98 % 256.62 namesto names to 180,360 2.99 % 158.95 9,030 4.17 % 210.32 42,657 2.35 % 134.17 5 21.74 % 515.09 81,373 2.88 % 149.94 634 3.43 % 160.27 38,674 4.04 % 206.35 7,987 3.90 % 201.10 sredi sredi 155,626 2.58 % 137.15 5,109 2.36 % 118.99 39,051 2.15 % 122.83 0 0 % 0 75,945 2.69 % 139.93 228 1.23 % 57.64 25,698 2.69 % 137.12 9,595 4.69 % 241.59 preko preko 107,027 1.77 % 94.32 5,005 2.31 % 116.57 35,516 1.96 % 111.71 9 39.13 % 927.17 43,418 1.54 % 80 563 3.04 % 142.32 20,743 2.17 % 110.68 1,773 0.87 % 44.64 zoper zoper 96,349 1.60 % 84.91 2,034 0.94 % 47.37 37,620 2.08 % 118.32 0 0 % 0 48,608 1.72 % 89.56 986 5.33 % 249.25 6,457 0.68 % 34.45 644 0.32 % 16.22 okrog okrog 92,380 1.53 % 81.41 5,996 2.77 % 139.65 15,388 0.85 % 48.40 0 0 % 0 42,736 1.51 % 78.74 304 1.64 % 76.85 16,810 1.76 % 89.69 11,146 5.44 % 280.64 znotraj znotr aj 80,944 1.34 % 71.34 6,149 2.84 % 143.22 26,268 1.45 % 82.62 0 0 % 0 33,649 1.19 % 62 333 1.80 % 84.18 13,667 1.43 % 72.92 878 0.43 % 22.11 razen razen 71,460 1.19 % 62.98 3,333 1.54 % 77.63 12,931 0.71 % 40.67 1 4.35 % 103.02 35,674 1.26 % 65.73 840 4.54 % 212.34 13,685 1.43 % 73.02 4,996 2.44 % 125.79 blizu blizu 64,672 1.07 % 57 2,885 1.33 % 67.20 20,277 1.12 % 63.78 0 0 % 0 29,850 1.06 % 55 105 0.57 % 26.54 8,372 0.88 % 44.67 3,183 1.55 % 80.14 zunaj zunaj 58,074 0.96 % 51.18 3,363 1.55 % 78.33 14,572 0.80 % 45.83 0 0 % 0 29,682 1.05 % 54.69 284 1.53 % 71.79 8,737 0.91 % 46.62 1,436 0.70 % 36.16 izven izven 28,190 0.47 % 24.84 1,401 0.65 % 32.63 10,009 0.55 % 31.48 0 0 % 0 12,402 0.44 % 22.85 262 1.42 % 66.23 3,795 0.40 % 20.25 321 0.16 % 8.08 izpred izpre d 26,041 0.43 % 22.95 510 0.24 % 11.88 5,911 0.33 % 18.59 0 0 % 0 15,218 0.54 % 28.04 29 0.16 % 7.33 3,274 0.34 % 17.47 1,099 0.54 % 27.67 izpod izpod 22,299 0.37 % 19.65 939 0.43 % 21.87 4,581 0.25 % 14.41 0 0 % 0 9,967 0.35 % 18.36 51 0.28 % 12.89 4,214 
0.44 % 22.48 2,547 1.24 % 64.13 navkljub navkl jub 16,205 0.27 % 14.28 471 0.22 % 10.97 3,708 0.20 % 11.66 0 0 % 0 7,363 0.26 % 13.57 35 0.19 % 8.85 4,137 0.43 % 22.07 491 0.24 % 12.36 zavoljo zavol jo 14,784 0.24 % 13.03 423 0.20 % 9.85 4,629 0.26 % 14.56 0 0 % 0 6,837 0.24 % 12.60 5 0.03 % 1.26 2,221 0.23 % 11.85 669 0.33 % 16.84 nasproti naspr oti 13,059 0.22 % 11.51 1,083 0.50 % 25.22 2,243 0.12 % 7.05 0 0 % 0 6,206 0.22 % 11.43 86 0.47 % 21.74 2,087 0.22 % 11.14 1,354 0.66 % 34.09 zraven zrave n 11,040 0.18 % 9.73 348 0.16 % 8.11 1,602 0.09 % 5.04 0 0 % 0 3,353 0.12 % 6.18 18 0.10 % 4.55 2,164 0.23 % 11.55 3,555 1.74 % 89.51 vzdolž vzdol ž 9,721 0.16 % 8.57 1,619 0.75 % 37.71 1,790 0.10 % 5.63 1 4.35 % 103.02 3,028 0.11 % 5.58 78 0.42 % 19.72 2,009 0.21 % 10.72 1,196 0.58 % 30.11 spričo sprič o 9,312 0.15 % 8.21 798 0.37 % 18.59 1,646 0.09 % 5.18 0 0 % 0 5,100 0.18 % 9.40 92 0.50 % 23.26 1,185 0.12 % 6.32 491 0.24 % 12.36 onkraj onkra j 8,858 0.15 % 7.81 705 0.33 % 16.42 2,514 0.14 % 7.91 0 0 % 0 3,350 0.12 % 6.17 27 0.15 % 6.83 1,273 0.13 % 6.79 989 0.48 % 24.90 onstran onstr an 7,522 0.12 % 6.63 752 0.35 % 17.51 1,477 0.08 % 4.65 0 0 % 0 3,666 0.13 % 6.75 38 0.20 % 9.61 1,131 0.12 % 6.03 458 0.22 % 11.53 tekom tekom 4,638 0.08 % 4.09 174 0.08 % 4.05 3,169 0.17 % 9.97 0 0 % 0 626 0.02 % 1.15 25 0.14 % 6.32 613 0.06 % 3.27 31 0.01 % 0.78 povrhu povrh u 2,397 0.04 % 2.11 63 0.03 % 1.47 486 0.03 % 1.53 0 0 % 0 1,213 0.04 % 2.24 3 0.02 % 0.76 477 0.05 % 2.55 155 0.08 % 3.90 širom širom 2,386 0.04 % 2.10 35 0.02 % 0.82 711 0.04 % 2.24 0 0 % 0 874 0.03 % 1.61 6 0.03 % 1.52 713 0.07 % 3.80 47 0.02 % 1.18 CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 172 File at CLARIN.SI 1.2.156 List of final character-level 1-grams from preposition lower-case word forms in the Gigafida 2.0 corpus with text-type distributionGF2.0-word_parts-prepositions-lowercase_ forms-final-1grams-taxonomy-entire.tsvLower-case word form Rest of the word Final part 
of the word Total absolute frequency of lower-case word form Percentage of all found lower-case word forms Total relative frequency (per million occurrences) Absolute frequency [SSJ.T.K.S] Percentage [SSJ.T.K.S] Relative frequency [SSJ.T.K.S] Absolute frequency [SSJ.I] Percentage [SSJ.I] Relative frequency [SSJ.I] Absolute frequency [SSJ.T.K.N] Percentage [SSJ.T.K.N] Relative frequency [SSJ.T.K.N] Absolute frequency [SSJ.T.P.C] Percentage [SSJ.T.P.C] Relative frequency [SSJ.T.P.C] Absolute frequency [SSJ.T.D] Percentage [SSJ.T.D] Relative frequency [SSJ.T.D] Absolute frequency [SSJ.T.P.R] Percentage [SSJ.T.P.R] Relative frequency [SSJ.T.P.R] Absolute frequency [SSJ.T.K.L] Percentage [SSJ.T.K.L] Relative frequency [SSJ.T.K.L] v v 31,813,825 25.51 % 28,037.36 1,138,081 25.82 % 26,507.26 9,551,729 26.12 % 30,042.67 299 22.72 % 30,802.51 15,531,117 25.82 % 28,617.11 103,250 23.08 % 26,100.49 4,673,034 23.86 % 24,933.77 816,315 23.01 % 20,553.93 na n a 19,072,589 15.29 % 16,808.58 631,488 14.33 % 14,708.11 5,747,052 15.72 % 18,075.97 248 18.84 % 25,548.57 9,072,361 15.08 % 16,716.43 56,074 12.53 % 14,174.90 3,009,439 15.37 % 16,057.38 555,927 15.67 % 13,997.64 za z a 15,195,407 12.18 % 13,391.64 436,419 9.90 % 10,164.72 4,564,728 12.48 % 14,357.26 88 6.69 % 9,065.62 7,527,662 12.51 % 13,870.22 67,714 15.13 % 17,117.37 2,303,946 11.77 % 12,293.10 294,850 8.31 % 7,424 z z 8,671,164 6.95 % 7,641.85 350,573 7.95 % 8,165.26 2,398,257 6.56 % 7,543.14 237 18.01 % 24,415.37 4,006,567 6.66 % 7,382.36 31,792 7.11 % 8,036.67 1,570,983 8.02 % 8,382.25 312,755 8.81 % 7,874.83 s s 7,070,474 5.67 % 6,231.17 276,736 6.28 % 6,445.51 2,005,941 5.49 % 6,309.21 167 12.69 % 17,204.08 3,263,890 5.42 % 6,013.93 26,986 6.03 % 6,821.77 1,273,975 6.51 % 6,797.51 222,779 6.28 % 5,609.33 po p o 6,360,566 5.10 % 5,605.53 196,183 4.45 % 4,569.34 1,959,908 5.36 % 6,164.42 64 4.86 % 6,593.18 3,054,588 5.08 % 5,628.28 21,577 4.82 % 5,454.43 936,691 4.78 % 4,997.88 191,555 5.40 % 4,823.15 iz i z 
4,006,518 3.21 % 3,530.92 159,021 3.61 % 3,703.79 1,060,540 2.90 % 3,335.67 37 2.81 % 3,811.68 2,017,138 3.35 % 3,716.71 23,762 5.31 % 6,006.78 611,339 3.12 % 3,261.90 134,681 3.80 % 3,391.12 pri pr i 3,934,055 3.15 % 3,467.06 178,024 4.04 % 4,146.39 1,018,794 2.79 % 3,204.37 47 3.57 % 4,841.87 1,888,953 3.14 % 3,480.52 15,360 3.43 % 3,882.84 756,619 3.86 % 4,037.07 76,258 2.15 % 1,920.09 od o d 3,855,989 3.09 % 3,398.26 157,936 3.58 % 3,678.52 1,061,423 2.90 % 3,338.45 20 1.52 % 2,060.37 1,843,270 3.06 % 3,396.35 13,534 3.02 % 3,421.25 638,050 3.26 % 3,404.43 141,756 4.00 % 3,569.26 o o 3,670,387 2.94 % 3,234.69 130,867 2.97 % 3,048.05 1,103,026 3.02 % 3,469.30 1 0.08 % 103.02 1,828,963 3.04 % 3,369.99 24,271 5.42 % 6,135.45 488,092 2.49 % 2,604.30 95,167 2.68 % 2,396.20 do d o 3,434,842 2.75 % 3,027.11 128,879 2.92 % 3,001.75 976,111 2.67 % 3,070.12 16 1.22 % 1,648.30 1,686,736 2.80 % 3,107.92 13,221 2.96 % 3,342.13 544,963 2.78 % 2,907.74 84,916 2.39 % 2,138.09 med me d 3,005,484 2.41 % 2,648.72 125,226 2.84 % 2,916.66 906,664 2.48 % 2,851.69 19 1.44 % 1,957.35 1,425,352 2.37 % 2,626.31 7,107 1.59 % 1,796.57 481,142 2.46 % 2,567.22 59,974 1.69 % 1,510.08 ob o b 2,529,897 2.03 % 2,229.59 68,415 1.55 % 1,593.47 700,467 1.92 % 2,203.15 9 0.68 % 927.17 1,338,788 2.23 % 2,466.81 5,209 1.16 % 1,316.78 342,313 1.75 % 1,826.47 74,696 2.10 % 1,880.76 pred pre d 2,269,453 1.82 % 2,000.06 51,884 1.18 % 1,208.44 697,210 1.91 % 2,192.91 7 0.53 % 721.13 1,115,560 1.85 % 2,055.49 5,680 1.27 % 1,435.84 329,208 1.68 % 1,756.55 69,904 1.97 % 1,760.11 zaradi zarad i 1,792,318 1.44 % 1,579.56 53,002 1.20 % 1,234.48 569,960 1.56 % 1,792.67 0 0 % 0 861,596 1.43 % 1,587.55 5,122 1.15 % 1,294.79 266,969 1.36 % 1,424.46 35,669 1.00 % 898.11 brez bre z 895,603 0.72 % 789.29 31,130 0.71 % 725.05 238,407 0.65 % 749.85 11 0.84 % 1,133.20 419,896 0.70 % 773.69 2,893 0.65 % 731.32 166,614 0.85 % 889 36,652 1.03 % 922.86 proti prot i 875,538 0.70 % 771.61 31,406 0.71 % 731.48 296,441 0.81 % 
932.38 4 0.30 % 412.07 400,159 0.67 % 737.32 2,536 0.57 % 641.07 105,654 0.54 % 563.74 39,338 1.11 % 990.49 k k 829,466 0.67 % 731 45,573 1.03 % 1,061.45 213,857 0.58 % 672.64 14 1.06 % 1,442.26 368,686 0.61 % 679.33 3,567 0.80 % 901.70 138,006 0.70 % 736.35 59,763 1.69 % 1,504.77 pod po d 771,403 0.62 % 679.83 31,212 0.71 % 726.96 211,136 0.58 % 664.08 6 0.46 % 618.11 357,724 0.59 % 659.13 3,253 0.73 % 822.32 131,202 0.67 % 700.05 36,870 1.04 % 928.35 poleg pole g 592,281 0.47 % 521.97 19,376 0.44 % 451.29 155,163 0.42 % 488.03 0 0 % 0 284,760 0.47 % 524.69 1,216 0.27 % 307.39 118,883 0.61 % 634.32 12,883 0.36 % 324.38 nad na d 553,355 0.44 % 487.67 24,668 0.56 % 574.55 152,855 0.42 % 480.77 1 0.08 % 103.02 259,821 0.43 % 478.74 1,840 0.41 % 465.13 89,316 0.46 % 476.56 24,854 0.70 % 625.80 kljub klju b 509,251 0.41 % 448.80 13,018 0.29 % 303.20 130,869 0.36 % 411.62 0 0 % 0 262,664 0.44 % 483.98 835 0.19 % 211.08 90,789 0.46 % 484.42 11,076 0.31 % 278.88 čez če z 342,085 0.27 % 301.48 12,802 0.29 % 298.17 76,727 0.21 % 241.33 2 0.15 % 206.04 156,143 0.26 % 287.70 764 0.17 % 193.13 63,586 0.33 % 339.27 32,061 0.90 % 807.26 glede gled e 271,297 0.22 % 239.09 8,095 0.18 % 188.54 121,462 0.33 % 382.03 0 0 % 0 106,244 0.18 % 195.76 1,543 0.34 % 390.05 30,305 0.15 % 161.70 3,648 0.10 % 91.85 skozi skoz i 261,306 0.21 % 230.29 16,616 0.38 % 387.01 59,303 0.16 % 186.52 2 0.15 % 206.04 106,627 0.18 % 196.47 957 0.21 % 241.92 51,512 0.26 % 274.85 26,289 0.74 % 661.93 izmed izme d 231,718 0.19 % 204.21 10,396 0.24 % 242.14 68,140 0.19 % 214.32 0 0 % 0 98,094 0.16 % 180.74 698 0.16 % 176.45 47,539 0.24 % 253.65 6,851 0.19 % 172.50 prek pre k 230,294 0.18 % 202.96 7,496 0.17 % 174.59 68,452 0.19 % 215.30 0 0 % 0 100,698 0.17 % 185.54 543 0.12 % 137.26 49,035 0.25 % 261.63 4,070 0.12 % 102.48 konec kone c 210,897 0.17 % 185.86 3,402 0.08 % 79.24 66,489 0.18 % 209.13 0 0 % 0 112,615 0.19 % 207.50 187 0.04 % 47.27 25,891 0.13 % 138.15 2,313 0.07 % 58.24 okoli okol i 187,237 0.15 
% 165.01 7,963 0.18 % 185.47 53,728 0.15 % 168.99 1 0.08 % 103.02 81,299 0.14 % 149.80 320 0.07 % 80.89 33,734 0.17 % 179.99 10,192 0.29 % 256.62 namesto namest o 180,360 0.14 % 158.95 9,030 0.20 % 210.32 42,657 0.12 % 134.17 5 0.38 % 515.09 81,373 0.14 % 149.94 634 0.14 % 160.27 38,674 0.20 % 206.35 7,987 0.23 % 201.10 sredi sred i 155,626 0.12 % 137.15 5,109 0.12 % 118.99 39,051 0.11 % 122.83 0 0 % 0 75,945 0.13 % 139.93 228 0.05 % 57.64 25,698 0.13 % 137.12 9,595 0.27 % 241.59 preko prek o 107,027 0.09 % 94.32 5,005 0.11 % 116.57 35,516 0.10 % 111.71 9 0.68 % 927.17 43,418 0.07 % 80 563 0.13 % 142.32 20,743 0.11 % 110.68 1,773 0.05 % 44.64 CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 173 File at CLARIN.SI 1.2.157 List of final character-level 2-grams from preposition lower-case word forms in the Gigafida 2.0 corpus with text-type distributionGF2.0-word_parts-prepositions-lowercase_ forms-final-2grams-taxonomy-entire.tsvLower-case word form Rest of the word Final part of the word Total absolute frequency of lower-case word form Percentage of all found lower-case word forms Total relative frequency (per million occurrences) Absolute frequency [SSJ.T.K.S] Percentage [SSJ.T.K.S] Relative frequency [SSJ.T.K.S] Absolute frequency [SSJ.I] Percentage [SSJ.I] Relative frequency [SSJ.I] Absolute frequency [SSJ.T.K.N] Percentage [SSJ.T.K.N] Relative frequency [SSJ.T.K.N] Absolute frequency [SSJ.T.P.C] Percentage [SSJ.T.P.C] Relative frequency [SSJ.T.P.C] Absolute frequency [SSJ.T.D] Percentage [SSJ.T.D] Relative frequency [SSJ.T.D] Absolute frequency [SSJ.T.P.R] Percentage [SSJ.T.P.R] Relative frequency [SSJ.T.P.R] Absolute frequency [SSJ.T.K.L] Percentage [SSJ.T.K.L] Relative frequency [SSJ.T.K.L] na na 19,072,589 26.27 % 16,808.58 631,488 25.64 % 14,708.11 5,747,052 27.01 % 18,075.97 248 41.47 % 25,548.57 9,072,361 25.82 % 16,716.43 56,074 21.79 % 14,174.90 3,009,439 26.34 % 16,057.38 555,927 27.30 % 13,997.64 za za 15,195,407 20.93 % 
13,391.64 436,419 17.72 % 10,164.72 4,564,728 21.45 % 14,357.26 88 14.72 % 9,065.62 7,527,662 21.42 % 13,870.22 67,714 26.31 % 17,117.37 2,303,946 20.16 % 12,293.10 294,850 14.48 % 7,424 po po 6,360,566 8.76 % 5,605.53 196,183 7.97 % 4,569.34 1,959,908 9.21 % 6,164.42 64 10.70 % 6,593.18 3,054,588 8.69 % 5,628.28 21,577 8.38 % 5,454.43 936,691 8.20 % 4,997.88 191,555 9.40 % 4,823.15 iz iz 4,006,518 5.52 % 3,530.92 159,021 6.46 % 3,703.79 1,060,540 4.98 % 3,335.67 37 6.19 % 3,811.68 2,017,138 5.74 % 3,716.71 23,762 9.23 % 6,006.78 611,339 5.35 % 3,261.90 134,681 6.61 % 3,391.12 pri p ri 3,934,055 5.42 % 3,467.06 178,024 7.23 % 4,146.39 1,018,794 4.79 % 3,204.37 47 7.86 % 4,841.87 1,888,953 5.38 % 3,480.52 15,360 5.97 % 3,882.84 756,619 6.62 % 4,037.07 76,258 3.74 % 1,920.09 od od 3,855,989 5.31 % 3,398.26 157,936 6.41 % 3,678.52 1,061,423 4.99 % 3,338.45 20 3.34 % 2,060.37 1,843,270 5.25 % 3,396.35 13,534 5.26 % 3,421.25 638,050 5.58 % 3,404.43 141,756 6.96 % 3,569.26 do do 3,434,842 4.73 % 3,027.11 128,879 5.23 % 3,001.75 976,111 4.59 % 3,070.12 16 2.68 % 1,648.30 1,686,736 4.80 % 3,107.92 13,221 5.14 % 3,342.13 544,963 4.77 % 2,907.74 84,916 4.17 % 2,138.09 med m ed 3,005,484 4.14 % 2,648.72 125,226 5.08 % 2,916.66 906,664 4.26 % 2,851.69 19 3.18 % 1,957.35 1,425,352 4.06 % 2,626.31 7,107 2.76 % 1,796.57 481,142 4.21 % 2,567.22 59,974 2.94 % 1,510.08 ob ob 2,529,897 3.48 % 2,229.59 68,415 2.78 % 1,593.47 700,467 3.29 % 2,203.15 9 1.50 % 927.17 1,338,788 3.81 % 2,466.81 5,209 2.02 % 1,316.78 342,313 3.00 % 1,826.47 74,696 3.67 % 1,880.76 pred pr ed 2,269,453 3.13 % 2,000.06 51,884 2.11 % 1,208.44 697,210 3.28 % 2,192.91 7 1.17 % 721.13 1,115,560 3.17 % 2,055.49 5,680 2.21 % 1,435.84 329,208 2.88 % 1,756.55 69,904 3.43 % 1,760.11 zaradi zara di 1,792,318 2.47 % 1,579.56 53,002 2.15 % 1,234.48 569,960 2.68 % 1,792.67 0 0 % 0 861,596 2.45 % 1,587.55 5,122 1.99 % 1,294.79 266,969 2.34 % 1,424.46 35,669 1.75 % 898.11 brez br ez 895,603 1.23 % 789.29 31,130 1.26 % 725.05 
238,407 1.12 % 749.85 11 1.84 % 1,133.20 419,896 1.20 % 773.69 2,893 1.12 % 731.32 166,614 1.46 % 889 36,652 1.80 % 922.86 proti pro ti 875,538 1.21 % 771.61 31,406 1.27 % 731.48 296,441 1.39 % 932.38 4 0.67 % 412.07 400,159 1.14 % 737.32 2,536 0.98 % 641.07 105,654 0.93 % 563.74 39,338 1.93 % 990.49 pod p od 771,403 1.06 % 679.83 31,212 1.27 % 726.96 211,136 0.99 % 664.08 6 1.00 % 618.11 357,724 1.02 % 659.13 3,253 1.26 % 822.32 131,202 1.15 % 700.05 36,870 1.81 % 928.35 poleg pol eg 592,281 0.82 % 521.97 19,376 0.79 % 451.29 155,163 0.73 % 488.03 0 0 % 0 284,760 0.81 % 524.69 1,216 0.47 % 307.39 118,883 1.04 % 634.32 12,883 0.63 % 324.38 nad n ad 553,355 0.76 % 487.67 24,668 1.00 % 574.55 152,855 0.72 % 480.77 1 0.17 % 103.02 259,821 0.74 % 478.74 1,840 0.71 % 465.13 89,316 0.78 % 476.56 24,854 1.22 % 625.80 kljub klj ub 509,251 0.70 % 448.80 13,018 0.53 % 303.20 130,869 0.61 % 411.62 0 0 % 0 262,664 0.75 % 483.98 835 0.32 % 211.08 90,789 0.79 % 484.42 11,076 0.54 % 278.88 čez č ez 342,085 0.47 % 301.48 12,802 0.52 % 298.17 76,727 0.36 % 241.33 2 0.33 % 206.04 156,143 0.44 % 287.70 764 0.30 % 193.13 63,586 0.56 % 339.27 32,061 1.57 % 807.26 glede gle de 271,297 0.37 % 239.09 8,095 0.33 % 188.54 121,462 0.57 % 382.03 0 0 % 0 106,244 0.30 % 195.76 1,543 0.60 % 390.05 30,305 0.27 % 161.70 3,648 0.18 % 91.85 skozi sko zi 261,306 0.36 % 230.29 16,616 0.68 % 387.01 59,303 0.28 % 186.52 2 0.33 % 206.04 106,627 0.30 % 196.47 957 0.37 % 241.92 51,512 0.45 % 274.85 26,289 1.29 % 661.93 izmed izm ed 231,718 0.32 % 204.21 10,396 0.42 % 242.14 68,140 0.32 % 214.32 0 0 % 0 98,094 0.28 % 180.74 698 0.27 % 176.45 47,539 0.42 % 253.65 6,851 0.34 % 172.50 prek pr ek 230,294 0.32 % 202.96 7,496 0.30 % 174.59 68,452 0.32 % 215.30 0 0 % 0 100,698 0.29 % 185.54 543 0.21 % 137.26 49,035 0.43 % 261.63 4,070 0.20 % 102.48 konec kon ec 210,897 0.29 % 185.86 3,402 0.14 % 79.24 66,489 0.31 % 209.13 0 0 % 0 112,615 0.32 % 207.50 187 0.07 % 47.27 25,891 0.23 % 138.15 2,313 0.11 % 58.24 okoli 
oko li 187,237 0.26 % 165.01 7,963 0.32 % 185.47 53,728 0.25 % 168.99 1 0.17 % 103.02 81,299 0.23 % 149.80 320 0.12 % 80.89 33,734 0.29 % 179.99 10,192 0.50 % 256.62 namesto names to 180,360 0.25 % 158.95 9,030 0.37 % 210.32 42,657 0.20 % 134.17 5 0.84 % 515.09 81,373 0.23 % 149.94 634 0.25 % 160.27 38,674 0.34 % 206.35 7,987 0.39 % 201.10 sredi sre di 155,626 0.21 % 137.15 5,109 0.21 % 118.99 39,051 0.18 % 122.83 0 0 % 0 75,945 0.22 % 139.93 228 0.09 % 57.64 25,698 0.23 % 137.12 9,595 0.47 % 241.59 preko pre ko 107,027 0.15 % 94.32 5,005 0.20 % 116.57 35,516 0.17 % 111.71 9 1.50 % 927.17 43,418 0.12 % 80 563 0.22 % 142.32 20,743 0.18 % 110.68 1,773 0.09 % 44.64 zoper zop er 96,349 0.13 % 84.91 2,034 0.08 % 47.37 37,620 0.18 % 118.32 0 0 % 0 48,608 0.14 % 89.56 986 0.38 % 249.25 6,457 0.06 % 34.45 644 0.03 % 16.22 okrog okr og 92,380 0.13 % 81.41 5,996 0.24 % 139.65 15,388 0.07 % 48.40 0 0 % 0 42,736 0.12 % 78.74 304 0.12 % 76.85 16,810 0.15 % 89.69 11,146 0.55 % 280.64 znotraj znotr aj 80,944 0.11 % 71.34 6,149 0.25 % 143.22 26,268 0.12 % 82.62 0 0 % 0 33,649 0.10 % 62 333 0.13 % 84.18 13,667 0.12 % 72.92 878 0.04 % 22.11 razen raz en 71,460 0.10 % 62.98 3,333 0.14 % 77.63 12,931 0.06 % 40.67 1 0.17 % 103.02 35,674 0.10 % 65.73 840 0.33 % 212.34 13,685 0.12 % 73.02 4,996 0.24 % 125.79 mimo mi mo 70,089 0.10 % 61.77 2,943 0.12 % 68.55 15,955 0.07 % 50.18 0 0 % 0 32,735 0.09 % 60.32 175 0.07 % 44.24 10,508 0.09 % 56.07 7,773 0.38 % 195.72 CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 174 File at CLARIN.SI 1.2.158 List of final character-level 3-grams from preposition lower-case word forms in the Gigafida 2.0 corpus with text-type distributionGF2.0-word_parts-prepositions-lowercase_ forms-final-3grams-taxonomy-entire.tsvLower-case word form Rest of the word Final part of the word Total absolute frequency of lower-case word form Percentage of all found lower-case word forms Total relative frequency (per million occurrences) Absolute 
frequency [SSJ.T.K.S] Percentage [SSJ.T.K.S] Relative frequency [SSJ.T.K.S] Absolute frequency [SSJ.I] Percentage [SSJ.I] Relative frequency [SSJ.I] Absolute frequency [SSJ.T.K.N] Percentage [SSJ.T.K.N] Relative frequency [SSJ.T.K.N] Absolute frequency [SSJ.T.P.C] Percentage [SSJ.T.P.C] Relative frequency [SSJ.T.P.C] Absolute frequency [SSJ.T.D] Percentage [SSJ.T.D] Relative frequency [SSJ.T.D] Absolute frequency [SSJ.T.P.R] Percentage [SSJ.T.P.R] Relative frequency [SSJ.T.P.R] Absolute frequency [SSJ.T.K.L] Percentage [SSJ.T.K.L] Relative frequency [SSJ.T.K.L] pri pri 3,934,055 21.68 % 3,467.06 178,024 26.02 % 4,146.39 1,018,794 19.56 % 3,204.37 47 40.52 % 4,841.87 1,888,953 21.98 % 3,480.52 15,360 27.30 % 3,882.84 756,619 24.88 % 4,037.07 76,258 13.66 % 1,920.09 med med 3,005,484 16.56 % 2,648.72 125,226 18.30 % 2,916.66 906,664 17.40 % 2,851.69 19 16.38 % 1,957.35 1,425,352 16.58 % 2,626.31 7,107 12.63 % 1,796.57 481,142 15.82 % 2,567.22 59,974 10.74 % 1,510.08 pred p red 2,269,453 12.51 % 2,000.06 51,884 7.58 % 1,208.44 697,210 13.38 % 2,192.91 7 6.03 % 721.13 1,115,560 12.98 % 2,055.49 5,680 10.10 % 1,435.84 329,208 10.83 % 1,756.55 69,904 12.52 % 1,760.11 zaradi zar adi 1,792,318 9.88 % 1,579.56 53,002 7.75 % 1,234.48 569,960 10.94 % 1,792.67 0 0 % 0 861,596 10.02 % 1,587.55 5,122 9.11 % 1,294.79 266,969 8.78 % 1,424.46 35,669 6.39 % 898.11 brez b rez 895,603 4.94 % 789.29 31,130 4.55 % 725.05 238,407 4.58 % 749.85 11 9.48 % 1,133.20 419,896 4.88 % 773.69 2,893 5.14 % 731.32 166,614 5.48 % 889 36,652 6.56 % 922.86 proti pr oti 875,538 4.83 % 771.61 31,406 4.59 % 731.48 296,441 5.69 % 932.38 4 3.45 % 412.07 400,159 4.66 % 737.32 2,536 4.51 % 641.07 105,654 3.48 % 563.74 39,338 7.04 % 990.49 pod pod 771,403 4.25 % 679.83 31,212 4.56 % 726.96 211,136 4.05 % 664.08 6 5.17 % 618.11 357,724 4.16 % 659.13 3,253 5.78 % 822.32 131,202 4.32 % 700.05 36,870 6.60 % 928.35 poleg po leg 592,281 3.26 % 521.97 19,376 2.83 % 451.29 155,163 2.98 % 488.03 0 0 % 0 284,760 3.31 % 
524.69 1,216 2.16 % 307.39 118,883 3.91 % 634.32 12,883 2.31 % 324.38 nad nad 553,355 3.05 % 487.67 24,668 3.60 % 574.55 152,855 2.93 % 480.77 1 0.86 % 103.02 259,821 3.02 % 478.74 1,840 3.27 % 465.13 89,316 2.94 % 476.56 24,854 4.45 % 625.80 kljub kl jub 509,251 2.81 % 448.80 13,018 1.90 % 303.20 130,869 2.51 % 411.62 0 0 % 0 262,664 3.06 % 483.98 835 1.48 % 211.08 90,789 2.99 % 484.42 11,076 1.98 % 278.88 čez čez 342,085 1.89 % 301.48 12,802 1.87 % 298.17 76,727 1.47 % 241.33 2 1.72 % 206.04 156,143 1.82 % 287.70 764 1.36 % 193.13 63,586 2.09 % 339.27 32,061 5.74 % 807.26 glede gl ede 271,297 1.50 % 239.09 8,095 1.18 % 188.54 121,462 2.33 % 382.03 0 0 % 0 106,244 1.24 % 195.76 1,543 2.74 % 390.05 30,305 1.00 % 161.70 3,648 0.65 % 91.85 skozi sk ozi 261,306 1.44 % 230.29 16,616 2.43 % 387.01 59,303 1.14 % 186.52 2 1.72 % 206.04 106,627 1.24 % 196.47 957 1.70 % 241.92 51,512 1.69 % 274.85 26,289 4.71 % 661.93 izmed iz med 231,718 1.28 % 204.21 10,396 1.52 % 242.14 68,140 1.31 % 214.32 0 0 % 0 98,094 1.14 % 180.74 698 1.24 % 176.45 47,539 1.56 % 253.65 6,851 1.23 % 172.50 prek p rek 230,294 1.27 % 202.96 7,496 1.09 % 174.59 68,452 1.31 % 215.30 0 0 % 0 100,698 1.17 % 185.54 543 0.96 % 137.26 49,035 1.61 % 261.63 4,070 0.73 % 102.48 konec ko nec 210,897 1.16 % 185.86 3,402 0.50 % 79.24 66,489 1.28 % 209.13 0 0 % 0 112,615 1.31 % 207.50 187 0.33 % 47.27 25,891 0.85 % 138.15 2,313 0.41 % 58.24 okoli ok oli 187,237 1.03 % 165.01 7,963 1.16 % 185.47 53,728 1.03 % 168.99 1 0.86 % 103.02 81,299 0.95 % 149.80 320 0.57 % 80.89 33,734 1.11 % 179.99 10,192 1.82 % 256.62 namesto name sto 180,360 0.99 % 158.95 9,030 1.32 % 210.32 42,657 0.82 % 134.17 5 4.31 % 515.09 81,373 0.95 % 149.94 634 1.13 % 160.27 38,674 1.27 % 206.35 7,987 1.43 % 201.10 sredi sr edi 155,626 0.86 % 137.15 5,109 0.75 % 118.99 39,051 0.75 % 122.83 0 0 % 0 75,945 0.88 % 139.93 228 0.41 % 57.64 25,698 0.84 % 137.12 9,595 1.72 % 241.59 preko pr eko 107,027 0.59 % 94.32 5,005 0.73 % 116.57 35,516 0.68 % 111.71 
9 7.76 % 927.17 43,418 0.51 % 80 563 1.00 % 142.32 20,743 0.68 % 110.68 1,773 0.32 % 44.64 zoper zo per 96,349 0.53 % 84.91 2,034 0.30 % 47.37 37,620 0.72 % 118.32 0 0 % 0 48,608 0.57 % 89.56 986 1.75 % 249.25 6,457 0.21 % 34.45 644 0.12 % 16.22 okrog ok rog 92,380 0.51 % 81.41 5,996 0.88 % 139.65 15,388 0.29 % 48.40 0 0 % 0 42,736 0.50 % 78.74 304 0.54 % 76.85 16,810 0.55 % 89.69 11,146 2.00 % 280.64 znotraj znot raj 80,944 0.45 % 71.34 6,149 0.90 % 143.22 26,268 0.50 % 82.62 0 0 % 0 33,649 0.39 % 62 333 0.59 % 84.18 13,667 0.45 % 72.92 878 0.16 % 22.11 razen ra zen 71,460 0.39 % 62.98 3,333 0.49 % 77.63 12,931 0.25 % 40.67 1 0.86 % 103.02 35,674 0.41 % 65.73 840 1.49 % 212.34 13,685 0.45 % 73.02 4,996 0.90 % 125.79 mimo m imo 70,089 0.39 % 61.77 2,943 0.43 % 68.55 15,955 0.31 % 50.18 0 0 % 0 32,735 0.38 % 60.32 175 0.31 % 44.24 10,508 0.35 % 56.07 7,773 1.39 % 195.72 blizu bl izu 64,672 0.36 % 57 2,885 0.42 % 67.20 20,277 0.39 % 63.78 0 0 % 0 29,850 0.35 % 55 105 0.19 % 26.54 8,372 0.28 % 44.67 3,183 0.57 % 80.14 zunaj zu naj 58,074 0.32 % 51.18 3,363 0.49 % 78.33 14,572 0.28 % 45.83 0 0 % 0 29,682 0.34 % 54.69 284 0.51 % 71.79 8,737 0.29 % 46.62 1,436 0.26 % 36.16 izven iz ven 28,190 0.15 % 24.84 1,401 0.20 % 32.63 10,009 0.19 % 31.48 0 0 % 0 12,402 0.14 % 22.85 262 0.47 % 66.23 3,795 0.12 % 20.25 321 0.06 % 8.08 izpred izp red 26,041 0.14 % 22.95 510 0.07 % 11.88 5,911 0.11 % 18.59 0 0 % 0 15,218 0.18 % 28.04 29 0.05 % 7.33 3,274 0.11 % 17.47 1,099 0.20 % 27.67 izpod iz pod 22,299 0.12 % 19.65 939 0.14 % 21.87 4,581 0.09 % 14.41 0 0 % 0 9,967 0.12 % 18.36 51 0.09 % 12.89 4,214 0.14 % 22.48 2,547 0.46 % 64.13 navkljub navkl jub 16,205 0.09 % 14.28 471 0.07 % 10.97 3,708 0.07 % 11.66 0 0 % 0 7,363 0.09 % 13.57 35 0.06 % 8.85 4,137 0.14 % 22.07 491 0.09 % 12.36 zavoljo zavo ljo 14,784 0.08 % 13.03 423 0.06 % 9.85 4,629 0.09 % 14.56 0 0 % 0 6,837 0.08 % 12.60 5 0.01 % 1.26 2,221 0.07 % 11.85 669 0.12 % 16.84 CJVT // A Guide to Frequency Lists from the Gigafida 2.0 
and GOS 1.0 Corpora // 175 File at CLARIN.SI 1.2.159 List of final character-level 4-grams from preposition lower-case word forms in the Gigafida 2.0 corpus with text-type distributionGF2.0-word_parts-prepositions-lowercase_ forms-final-4grams-taxonomy-entire.tsvLower-case word form Rest of the word Final part of the word Total absolute frequency of lower-case word form Percentage of all found lower-case word forms Total relative frequency (per million occurrences) Absolute frequency [SSJ.T.K.S] Percentage [SSJ.T.K.S] Relative frequency [SSJ.T.K.S] Absolute frequency [SSJ.I] Percentage [SSJ.I] Relative frequency [SSJ.I] Absolute frequency [SSJ.T.K.N] Percentage [SSJ.T.K.N] Relative frequency [SSJ.T.K.N] Absolute frequency [SSJ.T.P.C] Percentage [SSJ.T.P.C] Relative frequency [SSJ.T.P.C] Absolute frequency [SSJ.T.D] Percentage [SSJ.T.D] Relative frequency [SSJ.T.D] Absolute frequency [SSJ.T.P.R] Percentage [SSJ.T.P.R] Relative frequency [SSJ.T.P.R] Absolute frequency [SSJ.T.K.L] Percentage [SSJ.T.K.L] Relative frequency [SSJ.T.K.L] pred pred 2,269,453 23.82 % 2,000.06 51,884 16.65 % 1,208.44 697,210 24.55 % 2,192.91 7 17.07 % 721.13 1,115,560 24.78 % 2,055.49 5,680 20.34 % 1,435.84 329,208 21.70 % 1,756.55 69,904 21.34 % 1,760.11 zaradi za radi 1,792,318 18.81 % 1,579.56 53,002 17.01 % 1,234.48 569,960 20.07 % 1,792.67 0 0 % 0 861,596 19.14 % 1,587.55 5,122 18.35 % 1,294.79 266,969 17.60 % 1,424.46 35,669 10.89 % 898.11 brez brez 895,603 9.40 % 789.29 31,130 9.99 % 725.05 238,407 8.39 % 749.85 11 26.83 % 1,133.20 419,896 9.33 % 773.69 2,893 10.36 % 731.32 166,614 10.98 % 889 36,652 11.19 % 922.86 proti p roti 875,538 9.19 % 771.61 31,406 10.08 % 731.48 296,441 10.44 % 932.38 4 9.76 % 412.07 400,159 8.89 % 737.32 2,536 9.08 % 641.07 105,654 6.97 % 563.74 39,338 12.01 % 990.49 poleg p oleg 592,281 6.22 % 521.97 19,376 6.22 % 451.29 155,163 5.46 % 488.03 0 0 % 0 284,760 6.33 % 524.69 1,216 4.36 % 307.39 118,883 7.84 % 634.32 12,883 3.93 % 324.38 kljub k ljub 509,251 
5.35 % 448.80 13,018 4.18 % 303.20 130,869 4.61 % 411.62 0 0 % 0 262,664 5.83 % 483.98 835 2.99 % 211.08 90,789 5.99 % 484.42 11,076 3.38 % 278.88 glede g lede 271,297 2.85 % 239.09 8,095 2.60 % 188.54 121,462 4.28 % 382.03 0 0 % 0 106,244 2.36 % 195.76 1,543 5.53 % 390.05 30,305 2.00 % 161.70 3,648 1.11 % 91.85 skozi s kozi 261,306 2.74 % 230.29 16,616 5.33 % 387.01 59,303 2.09 % 186.52 2 4.88 % 206.04 106,627 2.37 % 196.47 957 3.43 % 241.92 51,512 3.40 % 274.85 26,289 8.03 % 661.93 izmed i zmed 231,718 2.43 % 204.21 10,396 3.34 % 242.14 68,140 2.40 % 214.32 0 0 % 0 98,094 2.18 % 180.74 698 2.50 % 176.45 47,539 3.13 % 253.65 6,851 2.09 % 172.50 prek prek 230,294 2.42 % 202.96 7,496 2.41 % 174.59 68,452 2.41 % 215.30 0 0 % 0 100,698 2.24 % 185.54 543 1.95 % 137.26 49,035 3.23 % 261.63 4,070 1.24 % 102.48 konec k onec 210,897 2.21 % 185.86 3,402 1.09 % 79.24 66,489 2.34 % 209.13 0 0 % 0 112,615 2.50 % 207.50 187 0.67 % 47.27 25,891 1.71 % 138.15 2,313 0.71 % 58.24 okoli o koli 187,237 1.97 % 165.01 7,963 2.56 % 185.47 53,728 1.89 % 168.99 1 2.44 % 103.02 81,299 1.81 % 149.80 320 1.15 % 80.89 33,734 2.22 % 179.99 10,192 3.11 % 256.62 namesto nam esto 180,360 1.89 % 158.95 9,030 2.90 % 210.32 42,657 1.50 % 134.17 5 12.20 % 515.09 81,373 1.81 % 149.94 634 2.27 % 160.27 38,674 2.55 % 206.35 7,987 2.44 % 201.10 sredi s redi 155,626 1.63 % 137.15 5,109 1.64 % 118.99 39,051 1.38 % 122.83 0 0 % 0 75,945 1.69 % 139.93 228 0.82 % 57.64 25,698 1.69 % 137.12 9,595 2.93 % 241.59 preko p reko 107,027 1.12 % 94.32 5,005 1.61 % 116.57 35,516 1.25 % 111.71 9 21.95 % 927.17 43,418 0.96 % 80 563 2.02 % 142.32 20,743 1.37 % 110.68 1,773 0.54 % 44.64 zoper z oper 96,349 1.01 % 84.91 2,034 0.65 % 47.37 37,620 1.32 % 118.32 0 0 % 0 48,608 1.08 % 89.56 986 3.53 % 249.25 6,457 0.43 % 34.45 644 0.20 % 16.22 okrog o krog 92,380 0.97 % 81.41 5,996 1.92 % 139.65 15,388 0.54 % 48.40 0 0 % 0 42,736 0.95 % 78.74 304 1.09 % 76.85 16,810 1.11 % 89.69 11,146 3.40 % 280.64 znotraj zno traj 80,944 0.85 
% 71.34 6,149 1.97 % 143.22 26,268 0.93 % 82.62 0 0 % 0 33,649 0.75 % 62 333 1.19 % 84.18 13,667 0.90 % 72.92 878 0.27 % 22.11 razen r azen 71,460 0.75 % 62.98 3,333 1.07 % 77.63 12,931 0.46 % 40.67 1 2.44 % 103.02 35,674 0.79 % 65.73 840 3.01 % 212.34 13,685 0.90 % 73.02 4,996 1.52 % 125.79 mimo mimo 70,089 0.74 % 61.77 2,943 0.94 % 68.55 15,955 0.56 % 50.18 0 0 % 0 32,735 0.73 % 60.32 175 0.63 % 44.24 10,508 0.69 % 56.07 7,773 2.37 % 195.72 blizu b lizu 64,672 0.68 % 57 2,885 0.93 % 67.20 20,277 0.71 % 63.78 0 0 % 0 29,850 0.66 % 55 105 0.38 % 26.54 8,372 0.55 % 44.67 3,183 0.97 % 80.14 zunaj z unaj 58,074 0.61 % 51.18 3,363 1.08 % 78.33 14,572 0.51 % 45.83 0 0 % 0 29,682 0.66 % 54.69 284 1.02 % 71.79 8,737 0.58 % 46.62 1,436 0.44 % 36.16 izven i zven 28,190 0.30 % 24.84 1,401 0.45 % 32.63 10,009 0.35 % 31.48 0 0 % 0 12,402 0.28 % 22.85 262 0.94 % 66.23 3,795 0.25 % 20.25 321 0.10 % 8.08 izpred iz pred 26,041 0.27 % 22.95 510 0.16 % 11.88 5,911 0.21 % 18.59 0 0 % 0 15,218 0.34 % 28.04 29 0.10 % 7.33 3,274 0.22 % 17.47 1,099 0.34 % 27.67 izpod i zpod 22,299 0.23 % 19.65 939 0.30 % 21.87 4,581 0.16 % 14.41 0 0 % 0 9,967 0.22 % 18.36 51 0.18 % 12.89 4,214 0.28 % 22.48 2,547 0.78 % 64.13 navkljub navk ljub 16,205 0.17 % 14.28 471 0.15 % 10.97 3,708 0.13 % 11.66 0 0 % 0 7,363 0.16 % 13.57 35 0.12 % 8.85 4,137 0.27 % 22.07 491 0.15 % 12.36 zavoljo zav oljo 14,784 0.15 % 13.03 423 0.14 % 9.85 4,629 0.16 % 14.56 0 0 % 0 6,837 0.15 % 12.60 5 0.02 % 1.26 2,221 0.15 % 11.85 669 0.20 % 16.84 izza izza 13,225 0.14 % 11.66 453 0.14 % 10.55 2,746 0.10 % 8.64 0 0 % 0 5,279 0.12 % 9.73 27 0.10 % 6.83 2,222 0.15 % 11.86 2,498 0.76 % 62.90 nasproti nasp roti 13,059 0.14 % 11.51 1,083 0.35 % 25.22 2,243 0.08 % 7.05 0 0 % 0 6,206 0.14 % 11.43 86 0.31 % 21.74 2,087 0.14 % 11.14 1,354 0.41 % 34.09 zraven zr aven 11,040 0.12 % 9.73 348 0.11 % 8.11 1,602 0.06 % 5.04 0 0 % 0 3,353 0.07 % 6.18 18 0.06 % 4.55 2,164 0.14 % 11.55 3,555 1.08 % 89.51 vzdolž vz dolž 9,721 0.10 % 8.57 1,619 0.52 
% 37.71 1,790 0.06 % 5.63 1 2.44 % 103.02 3,028 0.07 % 5.58 78 0.28 % 19.72 2,009 0.13 % 10.72 1,196 0.36 % 30.11
spričo sp ričo 9,312 0.10 % 8.21 798 0.26 % 18.59 1,646 0.06 % 5.18 0 0 % 0 5,100 0.11 % 9.40 92 0.33 % 23.26 1,185 0.08 % 6.32 491 0.15 % 12.36

1.2.160 List of final character-level 5-grams from preposition lower-case word forms in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-word_parts-prepositions-lowercase_forms-final-5grams-taxonomy-entire.tsv
Columns: Lower-case word form; Rest of the word; Final part of the word; Total absolute frequency of lower-case word form; Percentage of all found lower-case word forms; Total relative frequency (per million occurrences); Absolute frequency [SSJ.T.K.S]; Percentage [SSJ.T.K.S]; Relative frequency [SSJ.T.K.S]; Absolute frequency [SSJ.I]; Percentage [SSJ.I]; Relative frequency [SSJ.I]; Absolute frequency [SSJ.T.K.N]; Percentage [SSJ.T.K.N]; Relative frequency [SSJ.T.K.N]; Absolute frequency [SSJ.T.P.C]; Percentage [SSJ.T.P.C]; Relative frequency [SSJ.T.P.C]; Absolute frequency [SSJ.T.D]; Percentage [SSJ.T.D]; Relative frequency [SSJ.T.D]; Absolute frequency [SSJ.T.P.R]; Percentage [SSJ.T.P.R]; Relative frequency [SSJ.T.P.R]; Absolute frequency [SSJ.T.K.L]; Percentage [SSJ.T.K.L]; Relative frequency [SSJ.T.K.L]
zaradi z aradi 1,792,318 29.72 % 1,579.56 53,002 24.50 % 1,234.48 569,960 31.44 % 1,792.67 0 0 % 0 861,596 30.54 % 1,587.55 5,122 27.69 % 1,294.79 266,969 27.91 % 1,424.46 35,669 17.42 % 898.11
proti proti 875,538 14.52 % 771.61 31,406 14.51 % 731.48 296,441 16.35 % 932.38 4 17.39 % 412.07 400,159 14.19 % 737.32 2,536 13.71 % 641.07 105,654 11.04 % 563.74 39,338 19.21 % 990.49
poleg poleg 592,281 9.82 % 521.97 19,376 8.96 % 451.29 155,163 8.56 % 488.03 0 0 % 0 284,760 10.10 % 524.69 1,216 6.57 % 307.39 118,883 12.43 % 634.32 12,883 6.29 % 324.38
kljub kljub 509,251 8.45 % 448.80 13,018 6.02 % 303.20 130,869 7.22 % 411.62 0 0 % 0
262,664 9.31 % 483.98 835 4.51 % 211.08 90,789 9.49 % 484.42 11,076 5.41 % 278.88 glede glede 271,297 4.50 % 239.09 8,095 3.74 % 188.54 121,462 6.70 % 382.03 0 0 % 0 106,244 3.77 % 195.76 1,543 8.34 % 390.05 30,305 3.17 % 161.70 3,648 1.78 % 91.85 skozi skozi 261,306 4.33 % 230.29 16,616 7.68 % 387.01 59,303 3.27 % 186.52 2 8.70 % 206.04 106,627 3.78 % 196.47 957 5.17 % 241.92 51,512 5.38 % 274.85 26,289 12.84 % 661.93 izmed izmed 231,718 3.84 % 204.21 10,396 4.80 % 242.14 68,140 3.76 % 214.32 0 0 % 0 98,094 3.48 % 180.74 698 3.77 % 176.45 47,539 4.97 % 253.65 6,851 3.35 % 172.50 konec konec 210,897 3.50 % 185.86 3,402 1.57 % 79.24 66,489 3.67 % 209.13 0 0 % 0 112,615 3.99 % 207.50 187 1.01 % 47.27 25,891 2.71 % 138.15 2,313 1.13 % 58.24 okoli okoli 187,237 3.10 % 165.01 7,963 3.68 % 185.47 53,728 2.96 % 168.99 1 4.35 % 103.02 81,299 2.88 % 149.80 320 1.73 % 80.89 33,734 3.53 % 179.99 10,192 4.98 % 256.62 namesto na mesto 180,360 2.99 % 158.95 9,030 4.17 % 210.32 42,657 2.35 % 134.17 5 21.74 % 515.09 81,373 2.88 % 149.94 634 3.43 % 160.27 38,674 4.04 % 206.35 7,987 3.90 % 201.10 sredi sredi 155,626 2.58 % 137.15 5,109 2.36 % 118.99 39,051 2.15 % 122.83 0 0 % 0 75,945 2.69 % 139.93 228 1.23 % 57.64 25,698 2.69 % 137.12 9,595 4.69 % 241.59 preko preko 107,027 1.77 % 94.32 5,005 2.31 % 116.57 35,516 1.96 % 111.71 9 39.13 % 927.17 43,418 1.54 % 80 563 3.04 % 142.32 20,743 2.17 % 110.68 1,773 0.87 % 44.64 zoper zoper 96,349 1.60 % 84.91 2,034 0.94 % 47.37 37,620 2.08 % 118.32 0 0 % 0 48,608 1.72 % 89.56 986 5.33 % 249.25 6,457 0.68 % 34.45 644 0.32 % 16.22 okrog okrog 92,380 1.53 % 81.41 5,996 2.77 % 139.65 15,388 0.85 % 48.40 0 0 % 0 42,736 1.51 % 78.74 304 1.64 % 76.85 16,810 1.76 % 89.69 11,146 5.44 % 280.64 znotraj zn otraj 80,944 1.34 % 71.34 6,149 2.84 % 143.22 26,268 1.45 % 82.62 0 0 % 0 33,649 1.19 % 62 333 1.80 % 84.18 13,667 1.43 % 72.92 878 0.43 % 22.11 razen razen 71,460 1.19 % 62.98 3,333 1.54 % 77.63 12,931 0.71 % 40.67 1 4.35 % 103.02 35,674 1.26 % 65.73 
840 4.54 % 212.34 13,685 1.43 % 73.02 4,996 2.44 % 125.79 blizu blizu 64,672 1.07 % 57 2,885 1.33 % 67.20 20,277 1.12 % 63.78 0 0 % 0 29,850 1.06 % 55 105 0.57 % 26.54 8,372 0.88 % 44.67 3,183 1.55 % 80.14 zunaj zunaj 58,074 0.96 % 51.18 3,363 1.55 % 78.33 14,572 0.80 % 45.83 0 0 % 0 29,682 1.05 % 54.69 284 1.53 % 71.79 8,737 0.91 % 46.62 1,436 0.70 % 36.16 izven izven 28,190 0.47 % 24.84 1,401 0.65 % 32.63 10,009 0.55 % 31.48 0 0 % 0 12,402 0.44 % 22.85 262 1.42 % 66.23 3,795 0.40 % 20.25 321 0.16 % 8.08 izpred i zpred 26,041 0.43 % 22.95 510 0.24 % 11.88 5,911 0.33 % 18.59 0 0 % 0 15,218 0.54 % 28.04 29 0.16 % 7.33 3,274 0.34 % 17.47 1,099 0.54 % 27.67 izpod izpod 22,299 0.37 % 19.65 939 0.43 % 21.87 4,581 0.25 % 14.41 0 0 % 0 9,967 0.35 % 18.36 51 0.28 % 12.89 4,214 0.44 % 22.48 2,547 1.24 % 64.13 navkljub nav kljub 16,205 0.27 % 14.28 471 0.22 % 10.97 3,708 0.20 % 11.66 0 0 % 0 7,363 0.26 % 13.57 35 0.19 % 8.85 4,137 0.43 % 22.07 491 0.24 % 12.36 zavoljo za voljo 14,784 0.24 % 13.03 423 0.20 % 9.85 4,629 0.26 % 14.56 0 0 % 0 6,837 0.24 % 12.60 5 0.03 % 1.26 2,221 0.23 % 11.85 669 0.33 % 16.84 nasproti nas proti 13,059 0.22 % 11.51 1,083 0.50 % 25.22 2,243 0.12 % 7.05 0 0 % 0 6,206 0.22 % 11.43 86 0.47 % 21.74 2,087 0.22 % 11.14 1,354 0.66 % 34.09 zraven z raven 11,040 0.18 % 9.73 348 0.16 % 8.11 1,602 0.09 % 5.04 0 0 % 0 3,353 0.12 % 6.18 18 0.10 % 4.55 2,164 0.23 % 11.55 3,555 1.74 % 89.51 vzdolž v zdolž 9,721 0.16 % 8.57 1,619 0.75 % 37.71 1,790 0.10 % 5.63 1 4.35 % 103.02 3,028 0.11 % 5.58 78 0.42 % 19.72 2,009 0.21 % 10.72 1,196 0.58 % 30.11 spričo s pričo 9,312 0.15 % 8.21 798 0.37 % 18.59 1,646 0.09 % 5.18 0 0 % 0 5,100 0.18 % 9.40 92 0.50 % 23.26 1,185 0.12 % 6.32 491 0.24 % 12.36 onkraj o nkraj 8,858 0.15 % 7.81 705 0.33 % 16.42 2,514 0.14 % 7.91 0 0 % 0 3,350 0.12 % 6.17 27 0.15 % 6.83 1,273 0.13 % 6.79 989 0.48 % 24.90 onstran on stran 7,522 0.12 % 6.63 752 0.35 % 17.51 1,477 0.08 % 4.65 0 0 % 0 3,666 0.13 % 6.75 38 0.20 % 9.61 1,131 0.12 % 6.03 458 
0.22 % 11.53
tekom tekom 4,638 0.08 % 4.09 174 0.08 % 4.05 3,169 0.17 % 9.97 0 0 % 0 626 0.02 % 1.15 25 0.14 % 6.32 613 0.06 % 3.27 31 0.01 % 0.78
povrhu p ovrhu 2,397 0.04 % 2.11 63 0.03 % 1.47 486 0.03 % 1.53 0 0 % 0 1,213 0.04 % 2.24 3 0.02 % 0.76 477 0.05 % 2.55 155 0.08 % 3.90
širom širom 2,386 0.04 % 2.10 35 0.02 % 0.82 711 0.04 % 2.24 0 0 % 0 874 0.03 % 1.61 6 0.03 % 1.52 713 0.07 % 3.80 47 0.02 % 1.18

1.2.161 List of initial character-level 1-grams from conjunction lemmas in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-word_parts-conjunctions-lemmas-initial-1grams-taxonomy-entire.tsv
Columns: Lemma; Lemma (lower-case); Initial part of the word; Rest of the word; Total absolute frequency of lemma; Percentage of all found lemmas; Total relative frequency (per million occurrences); Absolute frequency [SSJ.T.K.N]; Percentage [SSJ.T.K.N]; Relative frequency [SSJ.T.K.N]; Absolute frequency [SSJ.T.K.L]; Percentage [SSJ.T.K.L]; Relative frequency [SSJ.T.K.L]; Absolute frequency [SSJ.T.P.C]; Percentage [SSJ.T.P.C]; Relative frequency [SSJ.T.P.C]; Absolute frequency [SSJ.T.P.R]; Percentage [SSJ.T.P.R]; Relative frequency [SSJ.T.P.R]; Absolute frequency [SSJ.T.K.S]; Percentage [SSJ.T.K.S]; Relative frequency [SSJ.T.K.S]; Absolute frequency [SSJ.T.D]; Percentage [SSJ.T.D]; Relative frequency [SSJ.T.D]; Absolute frequency [SSJ.I]; Percentage [SSJ.I]; Relative frequency [SSJ.I]
in in i n 29,117,711 29.55 % 25,661.29 619 55.66 % 63,768.41 1,187,174 30.09 % 29,891.76 13,762,326 29.85 % 25,358 5,256,746 29.89 % 28,048.27 1,338,463 33.09 % 31,174.39 115,835 29.37 % 29,281.84 7,456,548 28.18 % 23,452.78
da da d a 14,810,352 15.03 % 13,052.29 124 11.15 % 12,774.29 752,394 19.07 % 18,944.47 6,760,076 14.66 % 12,455.89 2,457,240 13.97 % 13,111.03 526,301 13.01 % 12,258.18 53,649 13.60 % 13,561.89 4,260,568 16.10 % 13,400.60
ki ki k i 12,623,562 12.81 % 11,125.08 55 4.95 % 5,666.01 338,126 8.57 %
8,513.65 5,915,959 12.83 % 10,900.55 2,202,562 12.52 % 11,752.15 516,575 12.77 % 12,031.65 55,386 14.04 % 14,000.98 3,594,899 13.58 % 11,306.89 pa pa p a 11,003,780 11.17 % 9,697.58 37 3.33 % 3,811.68 319,351 8.09 % 8,040.91 5,422,876 11.76 % 9,992.01 1,898,081 10.79 % 10,127.54 320,699 7.93 % 7,469.46 25,847 6.55 % 6,533.84 3,016,889 11.40 % 9,488.90 kot kot k ot 5,694,030 5.78 % 5,018.12 23 2.07 % 2,369.42 210,691 5.34 % 5,304.97 2,600,526 5.64 % 4,791.64 973,450 5.54 % 5,194.01 231,781 5.73 % 5,398.46 16,409 4.16 % 4,148.02 1,661,150 6.28 % 5,224.75 ko ko k o 3,180,102 3.23 % 2,802.61 36 3.24 % 3,708.66 225,473 5.71 % 5,677.17 1,386,450 3.01 % 2,554.63 573,281 3.26 % 3,058.84 123,662 3.06 % 2,880.24 9,198 2.33 % 2,325.16 862,002 3.26 % 2,711.22 ali ali a li 2,999,080 3.04 % 2,643.07 42 3.78 % 4,326.77 115,783 2.94 % 2,915.29 1,285,899 2.79 % 2,369.35 682,175 3.88 % 3,639.86 232,818 5.76 % 5,422.61 34,601 8.77 % 8,746.76 647,762 2.45 % 2,037.38 če če č e 2,441,017 2.48 % 2,151.26 27 2.43 % 2,781.50 127,590 3.23 % 3,212.58 1,064,667 2.31 % 1,961.72 547,603 3.11 % 2,921.83 138,698 3.43 % 3,230.44 23,279 5.90 % 5,884.68 539,153 2.04 % 1,695.78 saj saj s aj 1,922,955 1.95 % 1,694.69 0 0 % 0 52,415 1.33 % 1,319.75 956,035 2.07 % 1,761.56 353,479 2.01 % 1,886.05 45,680 1.13 % 1,063.94 1,897 0.48 % 479.54 513,449 1.94 % 1,614.93 ter ter t er 1,668,988 1.69 % 1,470.87 100 8.99 % 10,301.84 23,537 0.60 % 592.64 765,500 1.66 % 1,410.48 282,898 1.61 % 1,509.45 66,550 1.65 % 1,550.03 8,191 2.08 % 2,070.60 522,212 1.97 % 1,642.49 a a a 1,586,387 1.61 % 1,398.07 2 0.18 % 206.04 65,928 1.67 % 1,660 611,753 1.33 % 1,127.20 263,856 1.50 % 1,407.85 35,869 0.89 % 835.43 2,997 0.76 % 757.61 605,982 2.29 % 1,905.97 ker ker k er 1,508,421 1.53 % 1,329.36 2 0.18 % 206.04 71,438 1.81 % 1,798.73 727,004 1.58 % 1,339.55 293,861 1.67 % 1,567.95 57,937 1.43 % 1,349.42 5,756 1.46 % 1,455.05 352,423 1.33 % 1,108.46 zato zato z ato 1,302,554 1.32 % 1,147.93 0 0 % 0 34,925 0.89 % 879.37 637,853 
1.38 % 1,175.29 249,506 1.42 % 1,331.28 51,526 1.27 % 1,200.10 3,264 0.83 % 825.10 325,480 1.23 % 1,023.72 kjer kjer k jer 1,147,657 1.17 % 1,011.42 0 0 % 0 30,047 0.76 % 756.55 561,556 1.22 % 1,034.70 188,431 1.07 % 1,005.41 36,017 0.89 % 838.88 2,761 0.70 % 697.95 328,845 1.24 % 1,034.30 vendar vendar v endar 997,194 1.01 % 878.82 0 0 % 0 53,106 1.35 % 1,337.15 515,237 1.12 % 949.36 192,538 1.09 % 1,027.32 48,139 1.19 % 1,121.21 3,173 0.81 % 802.10 185,001 0.70 % 581.88 namreč namreč n amreč 902,134 0.92 % 795.05 0 0 % 0 9,366 0.24 % 235.83 452,514 0.98 % 833.79 148,837 0.85 % 794.15 17,827 0.44 % 415.21 1,164 0.29 % 294.25 272,426 1.03 % 856.85 čeprav čeprav č eprav 679,637 0.69 % 598.96 0 0 % 0 28,778 0.73 % 724.60 333,503 0.72 % 614.50 130,569 0.74 % 696.67 24,686 0.61 % 574.97 1,251 0.32 % 316.24 160,850 0.61 % 505.92 oziroma oziroma o ziroma 648,207 0.66 % 571.26 2 0.18 % 206.04 4,180 0.11 % 105.25 332,593 0.72 % 612.82 100,468 0.57 % 536.06 23,365 0.58 % 544.20 10,581 2.68 % 2,674.76 177,018 0.67 % 556.77 toda toda t oda 588,678 0.60 % 518.80 0 0 % 0 48,129 1.22 % 1,211.84 303,559 0.66 % 559.33 106,097 0.60 % 566.10 26,653 0.66 % 620.78 1,443 0.37 % 364.77 102,797 0.39 % 323.32 ampak ampak a mpak 572,085 0.58 % 504.18 0 0 % 0 58,951 1.49 % 1,484.32 227,969 0.49 % 420.05 115,041 0.65 % 613.82 18,836 0.47 % 438.71 2,969 0.75 % 750.53 148,319 0.56 % 466.50 tako tako t ako 410,279 0.42 % 361.58 12 1.08 % 1,236.22 11,498 0.29 % 289.51 193,920 0.42 % 357.31 77,933 0.44 % 415.82 17,681 0.44 % 411.81 1,533 0.39 % 387.53 107,702 0.41 % 338.75 naj naj n aj 389,738 0.40 % 343.47 0 0 % 0 25,019 0.63 % 629.95 187,176 0.41 % 344.88 57,022 0.32 % 304.25 13,726 0.34 % 319.69 1,027 0.26 % 259.61 105,768 0.40 % 332.67 kakor kakor k akor 272,376 0.28 % 240.04 0 0 % 0 40,148 1.02 % 1,010.88 123,199 0.27 % 227 51,656 0.29 % 275.62 24,063 0.59 % 560.46 1,664 0.42 % 420.64 31,646 0.12 % 99.53 sicer sicer s icer 263,692 0.27 % 232.39 0 0 % 0 2,680 0.07 % 67.48 130,258 0.28 % 
240.01 36,602 0.21 % 195.30 6,798 0.17 % 158.33 847 0.21 % 214.11 86,507 0.33 % 272.09
temveč temveč t emveč 235,490 0.24 % 207.54 0 0 % 0 7,332 0.19 % 184.61 110,231 0.24 % 203.11 43,698 0.25 % 233.16 16,034 0.40 % 373.45 373 0.10 % 94.29 57,822 0.22 % 181.87
torej torej t orej 196,891 0.20 % 173.52 0 0 % 0 3,308 0.08 % 83.29 96,806 0.21 % 178.37 39,105 0.22 % 208.65 9,010 0.22 % 209.85 593 0.15 % 149.90 48,069 0.18 % 151.19
kajti kajti k ajti 179,605 0.18 % 158.28 0 0 % 0 11,065 0.28 % 278.60 106,508 0.23 % 196.25 32,173 0.18 % 171.66 7,547 0.19 % 175.78 1,131 0.29 % 285.90 21,181 0.08 % 66.62
vendarle vendarle v endarle 165,832 0.17 % 146.15 0 0 % 0 4,087 0.10 % 102.91 91,249 0.20 % 168.13 23,643 0.13 % 126.15 3,756 0.09 % 87.48 544 0.14 % 137.52 42,553 0.16 % 133.84
preden preden p reden 148,498 0.15 % 130.87 8 0.72 % 824.15 20,759 0.53 % 522.69 53,160 0.12 % 97.95 32,176 0.18 % 171.68 8,842 0.22 % 205.94 629 0.16 % 159 32,924 0.12 % 103.55
dokler dokler d okler 147,332 0.15 % 129.84 19 1.71 % 1,957.35 16,427 0.42 % 413.61 59,151 0.13 % 108.99 30,010 0.17 % 160.12 9,366 0.23 % 218.15 1,089 0.28 % 275.29 31,270 0.12 % 98.35
kadar kadar k adar 133,016 0.14 % 117.23 1 0.09 % 103.02 13,669 0.35 % 344.17 44,635 0.10 % 82.24 34,877 0.20 % 186.09 16,388 0.41 % 381.70 1,820 0.46 % 460.08 21,626 0.08 % 68.02
kolikor kolikor k olikor 105,767 0.11 % 93.21 1 0.09 % 103.02 5,383 0.14 % 135.54 50,897 0.11 % 93.78 18,240 0.10 % 97.32 4,410 0.11 % 102.71 830 0.21 % 209.82 26,006 0.10 % 81.80

1.2.162 List of initial character-level 2-grams from conjunction lemmas in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-word_parts-conjunctions-lemmas-initial-2grams-taxonomy-entire.tsv
Columns: Lemma; Lemma (lower-case); Initial part of the word; Rest of the word; Total absolute frequency of lemma; Percentage of all found lemmas; Total relative frequency (per million occurrences); Absolute frequency [SSJ.T.K.N]; Percentage [SSJ.T.K.N]; Relative frequency [SSJ.T.K.N]; Absolute frequency [SSJ.T.K.L]; Percentage [SSJ.T.K.L]; Relative frequency [SSJ.T.K.L]; Absolute frequency [SSJ.T.P.C]; Percentage [SSJ.T.P.C]; Relative frequency [SSJ.T.P.C]; Absolute frequency [SSJ.T.P.R]; Percentage [SSJ.T.P.R]; Relative frequency [SSJ.T.P.R]; Absolute frequency [SSJ.T.K.S]; Percentage [SSJ.T.K.S]; Relative frequency [SSJ.T.K.S]; Absolute frequency [SSJ.T.D]; Percentage [SSJ.T.D]; Relative frequency [SSJ.T.D]; Absolute frequency [SSJ.I]; Percentage [SSJ.I]; Relative frequency [SSJ.I]
in in in 29,117,711 30.03 % 25,661.29 619 55.77 % 63,768.41 1,187,174 30.60 % 29,891.76 13,762,326 30.25 % 25,358 5,256,746 30.34 % 28,048.27 1,338,463 33.39 % 31,174.39 115,835 29.60 % 29,281.84 7,456,548 28.84 % 23,452.78
da da da 14,810,352 15.28 % 13,052.29 124 11.17 % 12,774.29 752,394 19.39 % 18,944.47 6,760,076 14.86 % 12,455.89 2,457,240 14.18 % 13,111.03 526,301 13.13 % 12,258.18 53,649 13.71 % 13,561.89 4,260,568 16.48 % 13,400.60
ki ki ki 12,623,562 13.02 % 11,125.08 55 4.96 % 5,666.01 338,126 8.72 % 8,513.65 5,915,959 13.00 % 10,900.55 2,202,562 12.71 % 11,752.15 516,575 12.89 % 12,031.65 55,386 14.15 % 14,000.98 3,594,899 13.90 % 11,306.89
pa pa pa 11,003,780 11.35 % 9,697.58 37 3.33 % 3,811.68 319,351 8.23 % 8,040.91 5,422,876 11.92 % 9,992.01 1,898,081 10.96 % 10,127.54 320,699 8.00 % 7,469.46 25,847 6.60 % 6,533.84 3,016,889 11.67 % 9,488.90
kot kot ko t 5,694,030 5.87 % 5,018.12 23 2.07 % 2,369.42 210,691 5.43 % 5,304.97 2,600,526 5.72 % 4,791.64 973,450 5.62 % 5,194.01 231,781 5.78 % 5,398.46 16,409 4.19 % 4,148.02 1,661,150 6.42 % 5,224.75
ko ko ko 3,180,102 3.28 % 2,802.61 36 3.24 % 3,708.66 225,473 5.81 % 5,677.17 1,386,450 3.05 % 2,554.63 573,281 3.31 % 3,058.84 123,662 3.08 % 2,880.24 9,198 2.35 % 2,325.16 862,002 3.33 % 2,711.22
ali ali al i 2,999,080 3.09 % 2,643.07 42 3.78 % 4,326.77 115,783 2.98 % 2,915.29 1,285,899 2.83 % 2,369.35 682,175 3.94 % 3,639.86 232,818 5.81 % 5,422.61
34,601 8.84 % 8,746.76 647,762 2.50 % 2,037.38 če če če 2,441,017 2.52 % 2,151.26 27 2.43 % 2,781.50 127,590 3.29 % 3,212.58 1,064,667 2.34 % 1,961.72 547,603 3.16 % 2,921.83 138,698 3.46 % 3,230.44 23,279 5.95 % 5,884.68 539,153 2.08 % 1,695.78 saj saj sa j 1,922,955 1.98 % 1,694.69 0 0 % 0 52,415 1.35 % 1,319.75 956,035 2.10 % 1,761.56 353,479 2.04 % 1,886.05 45,680 1.14 % 1,063.94 1,897 0.48 % 479.54 513,449 1.99 % 1,614.93 ter ter te r 1,668,988 1.72 % 1,470.87 100 9.01 % 10,301.84 23,537 0.61 % 592.64 765,500 1.68 % 1,410.48 282,898 1.63 % 1,509.45 66,550 1.66 % 1,550.03 8,191 2.09 % 2,070.60 522,212 2.02 % 1,642.49 ker ker ke r 1,508,421 1.56 % 1,329.36 2 0.18 % 206.04 71,438 1.84 % 1,798.73 727,004 1.60 % 1,339.55 293,861 1.70 % 1,567.95 57,937 1.45 % 1,349.42 5,756 1.47 % 1,455.05 352,423 1.36 % 1,108.46 zato zato za to 1,302,554 1.34 % 1,147.93 0 0 % 0 34,925 0.90 % 879.37 637,853 1.40 % 1,175.29 249,506 1.44 % 1,331.28 51,526 1.28 % 1,200.10 3,264 0.83 % 825.10 325,480 1.26 % 1,023.72 kjer kjer kj er 1,147,657 1.18 % 1,011.42 0 0 % 0 30,047 0.77 % 756.55 561,556 1.23 % 1,034.70 188,431 1.09 % 1,005.41 36,017 0.90 % 838.88 2,761 0.70 % 697.95 328,845 1.27 % 1,034.30 vendar vendar ve ndar 997,194 1.03 % 878.82 0 0 % 0 53,106 1.37 % 1,337.15 515,237 1.13 % 949.36 192,538 1.11 % 1,027.32 48,139 1.20 % 1,121.21 3,173 0.81 % 802.10 185,001 0.71 % 581.88 namreč namreč na mreč 902,134 0.93 % 795.05 0 0 % 0 9,366 0.24 % 235.83 452,514 0.99 % 833.79 148,837 0.86 % 794.15 17,827 0.45 % 415.21 1,164 0.30 % 294.25 272,426 1.05 % 856.85 čeprav čeprav če prav 679,637 0.70 % 598.96 0 0 % 0 28,778 0.74 % 724.60 333,503 0.73 % 614.50 130,569 0.75 % 696.67 24,686 0.62 % 574.97 1,251 0.32 % 316.24 160,850 0.62 % 505.92 oziroma oziroma oz iroma 648,207 0.67 % 571.26 2 0.18 % 206.04 4,180 0.11 % 105.25 332,593 0.73 % 612.82 100,468 0.58 % 536.06 23,365 0.58 % 544.20 10,581 2.70 % 2,674.76 177,018 0.69 % 556.77 toda toda to da 588,678 0.61 % 518.80 0 0 % 0 48,129 1.24 % 
1,211.84 303,559 0.67 % 559.33 106,097 0.61 % 566.10 26,653 0.67 % 620.78 1,443 0.37 % 364.77 102,797 0.40 % 323.32 ampak ampak am pak 572,085 0.59 % 504.18 0 0 % 0 58,951 1.52 % 1,484.32 227,969 0.50 % 420.05 115,041 0.66 % 613.82 18,836 0.47 % 438.71 2,969 0.76 % 750.53 148,319 0.57 % 466.50 tako tako ta ko 410,279 0.42 % 361.58 12 1.08 % 1,236.22 11,498 0.30 % 289.51 193,920 0.43 % 357.31 77,933 0.45 % 415.82 17,681 0.44 % 411.81 1,533 0.39 % 387.53 107,702 0.42 % 338.75 naj naj na j 389,738 0.40 % 343.47 0 0 % 0 25,019 0.65 % 629.95 187,176 0.41 % 344.88 57,022 0.33 % 304.25 13,726 0.34 % 319.69 1,027 0.26 % 259.61 105,768 0.41 % 332.67 kakor kakor ka kor 272,376 0.28 % 240.04 0 0 % 0 40,148 1.03 % 1,010.88 123,199 0.27 % 227 51,656 0.30 % 275.62 24,063 0.60 % 560.46 1,664 0.42 % 420.64 31,646 0.12 % 99.53 sicer sicer si cer 263,692 0.27 % 232.39 0 0 % 0 2,680 0.07 % 67.48 130,258 0.29 % 240.01 36,602 0.21 % 195.30 6,798 0.17 % 158.33 847 0.22 % 214.11 86,507 0.34 % 272.09 temveč temveč te mveč 235,490 0.24 % 207.54 0 0 % 0 7,332 0.19 % 184.61 110,231 0.24 % 203.11 43,698 0.25 % 233.16 16,034 0.40 % 373.45 373 0.10 % 94.29 57,822 0.22 % 181.87 torej torej to rej 196,891 0.20 % 173.52 0 0 % 0 3,308 0.09 % 83.29 96,806 0.21 % 178.37 39,105 0.23 % 208.65 9,010 0.23 % 209.85 593 0.15 % 149.90 48,069 0.19 % 151.19 kajti kajti ka jti 179,605 0.18 % 158.28 0 0 % 0 11,065 0.28 % 278.60 106,508 0.23 % 196.25 32,173 0.19 % 171.66 7,547 0.19 % 175.78 1,131 0.29 % 285.90 21,181 0.08 % 66.62 vendarle vendarle ve ndarle 165,832 0.17 % 146.15 0 0 % 0 4,087 0.10 % 102.91 91,249 0.20 % 168.13 23,643 0.14 % 126.15 3,756 0.09 % 87.48 544 0.14 % 137.52 42,553 0.17 % 133.84 preden preden pr eden 148,498 0.15 % 130.87 8 0.72 % 824.15 20,759 0.54 % 522.69 53,160 0.12 % 97.95 32,176 0.19 % 171.68 8,842 0.22 % 205.94 629 0.16 % 159 32,924 0.13 % 103.55 dokler dokler do kler 147,332 0.15 % 129.84 19 1.71 % 1,957.35 16,427 0.42 % 413.61 59,151 0.13 % 108.99 30,010 0.17 % 160.12 9,366 
0.23 % 218.15 1,089 0.28 % 275.29 31,270 0.12 % 98.35
kadar kadar ka dar 133,016 0.14 % 117.23 1 0.09 % 103.02 13,669 0.35 % 344.17 44,635 0.10 % 82.24 34,877 0.20 % 186.09 16,388 0.41 % 381.70 1,820 0.47 % 460.08 21,626 0.08 % 68.02
kolikor kolikor ko likor 105,767 0.11 % 93.21 1 0.09 % 103.02 5,383 0.14 % 135.54 50,897 0.11 % 93.78 18,240 0.10 % 97.32 4,410 0.11 % 102.71 830 0.21 % 209.82 26,006 0.10 % 81.80
kamor kamor ka mor 78,225 0.08 % 68.94 0 0 % 0 3,973 0.10 % 100.04 37,516 0.08 % 69.13 14,813 0.09 % 79.04 2,953 0.07 % 68.78 176 0.04 % 44.49 18,794 0.07 % 59.11

1.2.163 List of initial character-level 3-grams from conjunction lemmas in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-word_parts-conjunctions-lemmas-initial-3grams-taxonomy-entire.tsv
Columns: Lemma; Lemma (lower-case); Initial part of the word; Rest of the word; Total absolute frequency of lemma; Percentage of all found lemmas; Total relative frequency (per million occurrences); Absolute frequency [SSJ.T.K.N]; Percentage [SSJ.T.K.N]; Relative frequency [SSJ.T.K.N]; Absolute frequency [SSJ.T.K.L]; Percentage [SSJ.T.K.L]; Relative frequency [SSJ.T.K.L]; Absolute frequency [SSJ.T.P.C]; Percentage [SSJ.T.P.C]; Relative frequency [SSJ.T.P.C]; Absolute frequency [SSJ.T.P.R]; Percentage [SSJ.T.P.R]; Relative frequency [SSJ.T.P.R]; Absolute frequency [SSJ.T.K.S]; Percentage [SSJ.T.K.S]; Relative frequency [SSJ.T.K.S]; Absolute frequency [SSJ.T.D]; Percentage [SSJ.T.D]; Relative frequency [SSJ.T.D]; Absolute frequency [SSJ.I]; Percentage [SSJ.I]; Relative frequency [SSJ.I]
kot kot kot 5,694,030 23.98 % 5,018.12 23 10.85 % 2,369.42 210,691 22.73 % 5,304.97 2,600,526 23.29 % 4,791.64 973,450 22.21 % 5,194.01 231,781 22.22 % 5,398.46 16,409 15.19 % 4,148.02 1,661,150 27.14 % 5,224.75
ali ali ali 2,999,080 12.63 % 2,643.07 42 19.81 % 4,326.77 115,783 12.49 % 2,915.29 1,285,899 11.52 % 2,369.35 682,175 15.57 % 3,639.86 232,818 22.32 %
5,422.61 34,601 32.03 % 8,746.76 647,762 10.58 % 2,037.38 saj saj saj 1,922,955 8.10 % 1,694.69 0 0 % 0 52,415 5.66 % 1,319.75 956,035 8.56 % 1,761.56 353,479 8.07 % 1,886.05 45,680 4.38 % 1,063.94 1,897 1.76 % 479.54 513,449 8.39 % 1,614.93 ter ter ter 1,668,988 7.03 % 1,470.87 100 47.17 % 10,301.84 23,537 2.54 % 592.64 765,500 6.86 % 1,410.48 282,898 6.46 % 1,509.45 66,550 6.38 % 1,550.03 8,191 7.58 % 2,070.60 522,212 8.53 % 1,642.49 ker ker ker 1,508,421 6.35 % 1,329.36 2 0.94 % 206.04 71,438 7.71 % 1,798.73 727,004 6.51 % 1,339.55 293,861 6.71 % 1,567.95 57,937 5.55 % 1,349.42 5,756 5.33 % 1,455.05 352,423 5.76 % 1,108.46 zato zato zat o 1,302,554 5.49 % 1,147.93 0 0 % 0 34,925 3.77 % 879.37 637,853 5.71 % 1,175.29 249,506 5.69 % 1,331.28 51,526 4.94 % 1,200.10 3,264 3.02 % 825.10 325,480 5.32 % 1,023.72 kjer kjer kje r 1,147,657 4.83 % 1,011.42 0 0 % 0 30,047 3.24 % 756.55 561,556 5.03 % 1,034.70 188,431 4.30 % 1,005.41 36,017 3.45 % 838.88 2,761 2.56 % 697.95 328,845 5.37 % 1,034.30 vendar vendar ven dar 997,194 4.20 % 878.82 0 0 % 0 53,106 5.73 % 1,337.15 515,237 4.62 % 949.36 192,538 4.39 % 1,027.32 48,139 4.62 % 1,121.21 3,173 2.94 % 802.10 185,001 3.02 % 581.88 namreč namreč nam reč 902,134 3.80 % 795.05 0 0 % 0 9,366 1.01 % 235.83 452,514 4.05 % 833.79 148,837 3.40 % 794.15 17,827 1.71 % 415.21 1,164 1.08 % 294.25 272,426 4.45 % 856.85 čeprav čeprav čep rav 679,637 2.86 % 598.96 0 0 % 0 28,778 3.10 % 724.60 333,503 2.99 % 614.50 130,569 2.98 % 696.67 24,686 2.37 % 574.97 1,251 1.16 % 316.24 160,850 2.63 % 505.92 oziroma oziroma ozi roma 648,207 2.73 % 571.26 2 0.94 % 206.04 4,180 0.45 % 105.25 332,593 2.98 % 612.82 100,468 2.29 % 536.06 23,365 2.24 % 544.20 10,581 9.79 % 2,674.76 177,018 2.89 % 556.77 toda toda tod a 588,678 2.48 % 518.80 0 0 % 0 48,129 5.19 % 1,211.84 303,559 2.72 % 559.33 106,097 2.42 % 566.10 26,653 2.56 % 620.78 1,443 1.34 % 364.77 102,797 1.68 % 323.32 ampak ampak amp ak 572,085 2.41 % 504.18 0 0 % 0 58,951 6.36 % 1,484.32 227,969 
2.04 % 420.05 115,041 2.62 % 613.82 18,836 1.81 % 438.71 2,969 2.75 % 750.53 148,319 2.42 % 466.50 tako tako tak o 410,279 1.73 % 361.58 12 5.66 % 1,236.22 11,498 1.24 % 289.51 193,920 1.74 % 357.31 77,933 1.78 % 415.82 17,681 1.70 % 411.81 1,533 1.42 % 387.53 107,702 1.76 % 338.75 naj naj naj 389,738 1.64 % 343.47 0 0 % 0 25,019 2.70 % 629.95 187,176 1.68 % 344.88 57,022 1.30 % 304.25 13,726 1.32 % 319.69 1,027 0.95 % 259.61 105,768 1.73 % 332.67 kakor kakor kak or 272,376 1.15 % 240.04 0 0 % 0 40,148 4.33 % 1,010.88 123,199 1.10 % 227 51,656 1.18 % 275.62 24,063 2.31 % 560.46 1,664 1.54 % 420.64 31,646 0.52 % 99.53 sicer sicer sic er 263,692 1.11 % 232.39 0 0 % 0 2,680 0.29 % 67.48 130,258 1.17 % 240.01 36,602 0.83 % 195.30 6,798 0.65 % 158.33 847 0.78 % 214.11 86,507 1.41 % 272.09 temveč temveč tem več 235,490 0.99 % 207.54 0 0 % 0 7,332 0.79 % 184.61 110,231 0.99 % 203.11 43,698 1.00 % 233.16 16,034 1.54 % 373.45 373 0.34 % 94.29 57,822 0.94 % 181.87 torej torej tor ej 196,891 0.83 % 173.52 0 0 % 0 3,308 0.36 % 83.29 96,806 0.87 % 178.37 39,105 0.89 % 208.65 9,010 0.86 % 209.85 593 0.55 % 149.90 48,069 0.79 % 151.19 kajti kajti kaj ti 179,605 0.76 % 158.28 0 0 % 0 11,065 1.19 % 278.60 106,508 0.95 % 196.25 32,173 0.73 % 171.66 7,547 0.72 % 175.78 1,131 1.05 % 285.90 21,181 0.35 % 66.62 vendarle vendarle ven darle 165,832 0.70 % 146.15 0 0 % 0 4,087 0.44 % 102.91 91,249 0.82 % 168.13 23,643 0.54 % 126.15 3,756 0.36 % 87.48 544 0.50 % 137.52 42,553 0.69 % 133.84 preden preden pre den 148,498 0.62 % 130.87 8 3.77 % 824.15 20,759 2.24 % 522.69 53,160 0.48 % 97.95 32,176 0.73 % 171.68 8,842 0.85 % 205.94 629 0.58 % 159 32,924 0.54 % 103.55 dokler dokler dok ler 147,332 0.62 % 129.84 19 8.96 % 1,957.35 16,427 1.77 % 413.61 59,151 0.53 % 108.99 30,010 0.69 % 160.12 9,366 0.90 % 218.15 1,089 1.01 % 275.29 31,270 0.51 % 98.35 kadar kadar kad ar 133,016 0.56 % 117.23 1 0.47 % 103.02 13,669 1.48 % 344.17 44,635 0.40 % 82.24 34,877 0.80 % 186.09 16,388 1.57 % 381.70 1,820 
1.69 % 460.08 21,626 0.35 % 68.02
kolikor kolikor kol ikor 105,767 0.45 % 93.21 1 0.47 % 103.02 5,383 0.58 % 135.54 50,897 0.46 % 93.78 18,240 0.42 % 97.32 4,410 0.42 % 102.71 830 0.77 % 209.82 26,006 0.42 % 81.80
kamor kamor kam or 78,225 0.33 % 68.94 0 0 % 0 3,973 0.43 % 100.04 37,516 0.34 % 69.13 14,813 0.34 % 79.04 2,953 0.28 % 68.78 176 0.16 % 44.49 18,794 0.31 % 59.11
bodisi bodisi bod isi 63,373 0.27 % 55.85 2 0.94 % 206.04 1,759 0.19 % 44.29 27,313 0.24 % 50.33 12,796 0.29 % 68.28 5,707 0.55 % 132.92 413 0.38 % 104.40 15,383 0.25 % 48.38
odkar odkar odk ar 63,190 0.27 % 55.69 0 0 % 0 4,813 0.52 % 121.19 28,914 0.26 % 53.28 10,306 0.23 % 54.99 920 0.09 % 21.43 106 0.10 % 26.80 18,131 0.30 % 57.03
četudi četudi čet udi 62,349 0.26 % 54.95 0 0 % 0 2,741 0.30 % 69.02 28,006 0.25 % 51.60 11,127 0.25 % 59.37 3,231 0.31 % 75.25 135 0.12 % 34.13 17,109 0.28 % 53.81
niti niti nit i 56,291 0.24 % 49.61 0 0 % 0 3,040 0.33 % 76.54 27,731 0.25 % 51.10 10,226 0.23 % 54.56 2,181 0.21 % 50.80 222 0.21 % 56.12 12,891 0.21 % 40.55
razen razen raz en 50,366 0.21 % 44.39 0 0 % 0 2,897 0.31 % 72.94 22,525 0.20 % 41.50 10,083 0.23 % 53.80 3,015 0.29 % 70.22 1,122 1.04 % 283.63 10,724 0.17 % 33.73
koder koder kod er 33,663 0.14 % 29.67 0 0 % 0 1,770 0.19 % 44.57 17,062 0.15 % 31.44 5,230 0.12 % 27.91 1,410 0.14 % 32.84 49 0.04 % 12.39 8,142 0.13 % 25.61

1.2.164 List of initial character-level 4-grams from conjunction lemmas in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-word_parts-conjunctions-lemmas-initial-4grams-taxonomy-entire.tsv
Columns: Lemma; Lemma (lower-case); Initial part of the word; Rest of the word; Total absolute frequency of lemma; Percentage of all found lemmas; Total relative frequency (per million occurrences); Absolute frequency [SSJ.T.K.N]; Percentage [SSJ.T.K.N]; Relative frequency [SSJ.T.K.N]; Absolute frequency [SSJ.T.K.L]; Percentage [SSJ.T.K.L]; Relative frequency [SSJ.T.K.L]; Absolute frequency [SSJ.T.P.C]; Percentage [SSJ.T.P.C]; Relative frequency [SSJ.T.P.C]; Absolute frequency [SSJ.T.P.R]; Percentage [SSJ.T.P.R]; Relative frequency [SSJ.T.P.R]; Absolute frequency [SSJ.T.K.S]; Percentage [SSJ.T.K.S]; Relative frequency [SSJ.T.K.S]; Absolute frequency [SSJ.T.D]; Percentage [SSJ.T.D]; Relative frequency [SSJ.T.D]; Absolute frequency [SSJ.I]; Percentage [SSJ.I]; Relative frequency [SSJ.I]
zato zato zato 1,302,554 13.63 % 1,147.93 0 0 % 0 34,925 8.17 % 879.37 637,853 13.75 % 1,175.29 249,506 14.36 % 1,331.28 51,526 13.08 % 1,200.10 3,264 8.14 % 825.10 325,480 14.05 % 1,023.72
kjer kjer kjer 1,147,657 12.01 % 1,011.42 0 0 % 0 30,047 7.03 % 756.55 561,556 12.10 % 1,034.70 188,431 10.85 % 1,005.41 36,017 9.15 % 838.88 2,761 6.88 % 697.95 328,845 14.20 % 1,034.30
vendar vendar vend ar 997,194 10.44 % 878.82 0 0 % 0 53,106 12.43 % 1,337.15 515,237 11.11 % 949.36 192,538 11.08 % 1,027.32 48,139 12.22 % 1,121.21 3,173 7.91 % 802.10 185,001 7.99 % 581.88
namreč namreč namr eč 902,134 9.44 % 795.05 0 0 % 0 9,366 2.19 % 235.83 452,514 9.75 % 833.79 148,837 8.57 % 794.15 17,827 4.53 % 415.21 1,164 2.90 % 294.25 272,426 11.76 % 856.85
čeprav čeprav čepr av 679,637 7.11 % 598.96 0 0 % 0 28,778 6.73 % 724.60 333,503 7.19 % 614.50 130,569 7.52 % 696.67 24,686 6.27 % 574.97 1,251 3.12 % 316.24 160,850 6.94 % 505.92
oziroma oziroma ozir oma 648,207 6.78 % 571.26 2 4.44 % 206.04 4,180 0.98 % 105.25 332,593 7.17 % 612.82 100,468 5.78 % 536.06 23,365 5.93 % 544.20 10,581 26.38 % 2,674.76 177,018 7.64 % 556.77
toda toda toda 588,678 6.16 % 518.80 0 0 % 0 48,129 11.26 % 1,211.84 303,559 6.54 % 559.33 106,097 6.11 % 566.10 26,653 6.77 % 620.78 1,443 3.60 % 364.77 102,797 4.44 % 323.32
ampak ampak ampa k 572,085 5.99 % 504.18 0 0 % 0 58,951 13.79 % 1,484.32 227,969 4.91 % 420.05 115,041 6.62 % 613.82 18,836 4.78 % 438.71 2,969 7.40 % 750.53 148,319 6.40 % 466.50
tako tako tako 410,279 4.29 % 361.58 12 26.67 % 1,236.22 11,498 2.69 % 289.51 193,920 4.18 %
357.31 77,933 4.49 % 415.82 17,681 4.49 % 411.81 1,533 3.82 % 387.53 107,702 4.65 % 338.75 kakor kakor kako r 272,376 2.85 % 240.04 0 0 % 0 40,148 9.39 % 1,010.88 123,199 2.65 % 227 51,656 2.97 % 275.62 24,063 6.11 % 560.46 1,664 4.15 % 420.64 31,646 1.37 % 99.53 sicer sicer sice r 263,692 2.76 % 232.39 0 0 % 0 2,680 0.63 % 67.48 130,258 2.81 % 240.01 36,602 2.11 % 195.30 6,798 1.73 % 158.33 847 2.11 % 214.11 86,507 3.73 % 272.09 temveč temveč temv eč 235,490 2.46 % 207.54 0 0 % 0 7,332 1.72 % 184.61 110,231 2.38 % 203.11 43,698 2.52 % 233.16 16,034 4.07 % 373.45 373 0.93 % 94.29 57,822 2.50 % 181.87 torej torej tore j 196,891 2.06 % 173.52 0 0 % 0 3,308 0.77 % 83.29 96,806 2.09 % 178.37 39,105 2.25 % 208.65 9,010 2.29 % 209.85 593 1.48 % 149.90 48,069 2.08 % 151.19 kajti kajti kajt i 179,605 1.88 % 158.28 0 0 % 0 11,065 2.59 % 278.60 106,508 2.30 % 196.25 32,173 1.85 % 171.66 7,547 1.92 % 175.78 1,131 2.82 % 285.90 21,181 0.91 % 66.62 vendarle vendarle vend arle 165,832 1.74 % 146.15 0 0 % 0 4,087 0.96 % 102.91 91,249 1.97 % 168.13 23,643 1.36 % 126.15 3,756 0.95 % 87.48 544 1.36 % 137.52 42,553 1.84 % 133.84 preden preden pred en 148,498 1.55 % 130.87 8 17.78 % 824.15 20,759 4.86 % 522.69 53,160 1.15 % 97.95 32,176 1.85 % 171.68 8,842 2.25 % 205.94 629 1.57 % 159 32,924 1.42 % 103.55 dokler dokler dokl er 147,332 1.54 % 129.84 19 42.22 % 1,957.35 16,427 3.84 % 413.61 59,151 1.27 % 108.99 30,010 1.73 % 160.12 9,366 2.38 % 218.15 1,089 2.71 % 275.29 31,270 1.35 % 98.35 kadar kadar kada r 133,016 1.39 % 117.23 1 2.22 % 103.02 13,669 3.20 % 344.17 44,635 0.96 % 82.24 34,877 2.01 % 186.09 16,388 4.16 % 381.70 1,820 4.54 % 460.08 21,626 0.93 % 68.02 kolikor kolikor koli kor 105,767 1.11 % 93.21 1 2.22 % 103.02 5,383 1.26 % 135.54 50,897 1.10 % 93.78 18,240 1.05 % 97.32 4,410 1.12 % 102.71 830 2.07 % 209.82 26,006 1.12 % 81.80 kamor kamor kamo r 78,225 0.82 % 68.94 0 0 % 0 3,973 0.93 % 100.04 37,516 0.81 % 69.13 14,813 0.85 % 79.04 2,953 0.75 % 68.78 176 0.44 % 44.49 
18,794 0.81 % 59.11 bodisi bodisi bodi si 63,373 0.66 % 55.85 2 4.44 % 206.04 1,759 0.41 % 44.29 27,313 0.59 % 50.33 12,796 0.74 % 68.28 5,707 1.45 % 132.92 413 1.03 % 104.40 15,383 0.66 % 48.38 odkar odkar odka r 63,190 0.66 % 55.69 0 0 % 0 4,813 1.13 % 121.19 28,914 0.62 % 53.28 10,306 0.59 % 54.99 920 0.23 % 21.43 106 0.26 % 26.80 18,131 0.78 % 57.03 četudi četudi četu di 62,349 0.65 % 54.95 0 0 % 0 2,741 0.64 % 69.02 28,006 0.60 % 51.60 11,127 0.64 % 59.37 3,231 0.82 % 75.25 135 0.34 % 34.13 17,109 0.74 % 53.81 niti niti niti 56,291 0.59 % 49.61 0 0 % 0 3,040 0.71 % 76.54 27,731 0.60 % 51.10 10,226 0.59 % 54.56 2,181 0.55 % 50.80 222 0.55 % 56.12 12,891 0.56 % 40.55 razen razen raze n 50,366 0.53 % 44.39 0 0 % 0 2,897 0.68 % 72.94 22,525 0.49 % 41.50 10,083 0.58 % 53.80 3,015 0.77 % 70.22 1,122 2.80 % 283.63 10,724 0.46 % 33.73 koder koder kode r 33,663 0.35 % 29.67 0 0 % 0 1,770 0.41 % 44.57 17,062 0.37 % 31.44 5,230 0.30 % 27.91 1,410 0.36 % 32.84 49 0.12 % 12.39 8,142 0.35 % 25.61 marveč marveč marv eč 21,385 0.22 % 18.85 0 0 % 0 595 0.14 % 14.98 12,386 0.27 % 22.82 4,535 0.26 % 24.20 1,702 0.43 % 39.64 47 0.12 % 11.88 2,120 0.09 % 6.67 čeravno čeravno čera vno 9,604 0.10 % 8.46 0 0 % 0 600 0.14 % 15.11 5,123 0.11 % 9.44 1,322 0.08 % 7.05 357 0.09 % 8.31 21 0.05 % 5.31 2,181 0.09 % 6.86 zatorej zatorej zato rej 4,880 0.05 % 4.30 0 0 % 0 317 0.07 % 7.98 2,133 0.05 % 3.93 1,440 0.08 % 7.68 344 0.09 % 8.01 22 0.06 % 5.56 624 0.03 % 1.96 najsi najsi najs i 4,664 0.05 % 4.11 0 0 % 0 343 0.08 % 8.64 1,902 0.04 % 3.50 1,216 0.07 % 6.49 395 0.10 % 9.20 66 0.17 % 16.68 742 0.03 % 2.33 predno predno pred no 1,817 0.02 % 1.60 0 0 % 0 143 0.03 % 3.60 944 0.02 % 1.74 274 0.02 % 1.46 223 0.06 % 5.19 42 0.10 % 10.62 191 0.01 % 0.60 odkoder odkoder odko der 1,597 0.02 % 1.41 0 0 % 0 109 0.03 % 2.74 878 0.02 % 1.62 300 0.02 % 1.60 96 0.02 % 2.24 7 0.02 % 1.77 207 0.01 % 0.65 CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 181 File at CLARIN.SI 
1.2.165 List of initial character-level 5-grams from conjunction lemmas in the Gigafida 2.0 corpus with text-type distributionGF2.0-word_parts-conjunctions-lemmas- initial-5grams-taxonomy-entire.tsvLemma Lemma (lower-case) Initial part of the word Rest of the word Total absolute frequency of lemma Percentage of all found lemmas Total relative frequency (per million occurrences) Absolute frequency [SSJ.T.K.N] Percentage [SSJ.T.K.N] Relative frequency [SSJ.T.K.N] Absolute frequency [SSJ.T.K.L] Percentage [SSJ.T.K.L] Relative frequency [SSJ.T.K.L] Absolute frequency [SSJ.T.P.C] Percentage [SSJ.T.P.C] Relative frequency [SSJ.T.P.C] Absolute frequency [SSJ.T.P.R] Percentage [SSJ.T.P.R] Relative frequency [SSJ.T.P.R] Absolute frequency [SSJ.T.K.S] Percentage [SSJ.T.K.S] Relative frequency [SSJ.T.K.S] Absolute frequency [SSJ.T.D] Percentage [SSJ.T.D] Relative frequency [SSJ.T.D] Absolute frequency [SSJ.I] Percentage [SSJ.I] Relative frequency [SSJ.I] vendar vendar venda r 997,194 16.49 % 878.82 0 0 % 0 53,106 17.73 % 1,337.15 515,237 17.68 % 949.36 192,538 17.43 % 1,027.32 48,139 18.55 % 1,121.21 3,173 10.28 % 802.10 185,001 12.86 % 581.88 namreč namreč namre č 902,134 14.92 % 795.05 0 0 % 0 9,366 3.13 % 235.83 452,514 15.53 % 833.79 148,837 13.47 % 794.15 17,827 6.87 % 415.21 1,164 3.77 % 294.25 272,426 18.94 % 856.85 čeprav čeprav čepra v 679,637 11.24 % 598.96 0 0 % 0 28,778 9.61 % 724.60 333,503 11.44 % 614.50 130,569 11.82 % 696.67 24,686 9.51 % 574.97 1,251 4.05 % 316.24 160,850 11.18 % 505.92 oziroma oziroma oziro ma 648,207 10.72 % 571.26 2 6.06 % 206.04 4,180 1.40 % 105.25 332,593 11.41 % 612.82 100,468 9.09 % 536.06 23,365 9.00 % 544.20 10,581 34.27 % 2,674.76 177,018 12.31 % 556.77 ampak ampak ampak 572,085 9.46 % 504.18 0 0 % 0 58,951 19.68 % 1,484.32 227,969 7.82 % 420.05 115,041 10.41 % 613.82 18,836 7.26 % 438.71 2,969 9.62 % 750.53 148,319 10.31 % 466.50 kakor kakor kakor 272,376 4.50 % 240.04 0 0 % 0 40,148 13.40 % 1,010.88 123,199 4.23 % 227 51,656 4.67 
% 275.62 24,063 9.27 % 560.46 1,664 5.39 % 420.64 31,646 2.20 % 99.53 sicer sicer sicer 263,692 4.36 % 232.39 0 0 % 0 2,680 0.90 % 67.48 130,258 4.47 % 240.01 36,602 3.31 % 195.30 6,798 2.62 % 158.33 847 2.74 % 214.11 86,507 6.01 % 272.09 temveč temveč temve č 235,490 3.89 % 207.54 0 0 % 0 7,332 2.45 % 184.61 110,231 3.78 % 203.11 43,698 3.96 % 233.16 16,034 6.18 % 373.45 373 1.21 % 94.29 57,822 4.02 % 181.87 torej torej torej 196,891 3.26 % 173.52 0 0 % 0 3,308 1.10 % 83.29 96,806 3.32 % 178.37 39,105 3.54 % 208.65 9,010 3.47 % 209.85 593 1.92 % 149.90 48,069 3.34 % 151.19 kajti kajti kajti 179,605 2.97 % 158.28 0 0 % 0 11,065 3.69 % 278.60 106,508 3.65 % 196.25 32,173 2.91 % 171.66 7,547 2.91 % 175.78 1,131 3.66 % 285.90 21,181 1.47 % 66.62 vendarle vendarle venda rle 165,832 2.74 % 146.15 0 0 % 0 4,087 1.36 % 102.91 91,249 3.13 % 168.13 23,643 2.14 % 126.15 3,756 1.45 % 87.48 544 1.76 % 137.52 42,553 2.96 % 133.84 preden preden prede n 148,498 2.46 % 130.87 8 24.24 % 824.15 20,759 6.93 % 522.69 53,160 1.82 % 97.95 32,176 2.91 % 171.68 8,842 3.41 % 205.94 629 2.04 % 159 32,924 2.29 % 103.55 dokler dokler dokle r 147,332 2.44 % 129.84 19 57.58 % 1,957.35 16,427 5.48 % 413.61 59,151 2.03 % 108.99 30,010 2.72 % 160.12 9,366 3.61 % 218.15 1,089 3.53 % 275.29 31,270 2.17 % 98.35 kadar kadar kadar 133,016 2.20 % 117.23 1 3.03 % 103.02 13,669 4.56 % 344.17 44,635 1.53 % 82.24 34,877 3.16 % 186.09 16,388 6.31 % 381.70 1,820 5.89 % 460.08 21,626 1.50 % 68.02 kolikor kolikor kolik or 105,767 1.75 % 93.21 1 3.03 % 103.02 5,383 1.80 % 135.54 50,897 1.75 % 93.78 18,240 1.65 % 97.32 4,410 1.70 % 102.71 830 2.69 % 209.82 26,006 1.81 % 81.80 kamor kamor kamor 78,225 1.29 % 68.94 0 0 % 0 3,973 1.33 % 100.04 37,516 1.29 % 69.13 14,813 1.34 % 79.04 2,953 1.14 % 68.78 176 0.57 % 44.49 18,794 1.31 % 59.11 bodisi bodisi bodis i 63,373 1.05 % 55.85 2 6.06 % 206.04 1,759 0.59 % 44.29 27,313 0.94 % 50.33 12,796 1.16 % 68.28 5,707 2.20 % 132.92 413 1.34 % 104.40 15,383 1.07 % 48.38 odkar 
odkar odkar 63,190 1.04 % 55.69 0 0 % 0 4,813 1.61 % 121.19 28,914 0.99 % 53.28 10,306 0.93 % 54.99 920 0.35 % 21.43 106 0.34 % 26.80 18,131 1.26 % 57.03 četudi četudi četud i 62,349 1.03 % 54.95 0 0 % 0 2,741 0.92 % 69.02 28,006 0.96 % 51.60 11,127 1.01 % 59.37 3,231 1.25 % 75.25 135 0.44 % 34.13 17,109 1.19 % 53.81 razen razen razen 50,366 0.83 % 44.39 0 0 % 0 2,897 0.97 % 72.94 22,525 0.77 % 41.50 10,083 0.91 % 53.80 3,015 1.16 % 70.22 1,122 3.63 % 283.63 10,724 0.75 % 33.73 koder koder koder 33,663 0.56 % 29.67 0 0 % 0 1,770 0.59 % 44.57 17,062 0.58 % 31.44 5,230 0.47 % 27.91 1,410 0.54 % 32.84 49 0.16 % 12.39 8,142 0.57 % 25.61 marveč marveč marve č 21,385 0.35 % 18.85 0 0 % 0 595 0.20 % 14.98 12,386 0.42 % 22.82 4,535 0.41 % 24.20 1,702 0.66 % 39.64 47 0.15 % 11.88 2,120 0.15 % 6.67 čeravno čeravno čerav no 9,604 0.16 % 8.46 0 0 % 0 600 0.20 % 15.11 5,123 0.18 % 9.44 1,322 0.12 % 7.05 357 0.14 % 8.31 21 0.07 % 5.31 2,181 0.15 % 6.86 zatorej zatorej zator ej 4,880 0.08 % 4.30 0 0 % 0 317 0.11 % 7.98 2,133 0.07 % 3.93 1,440 0.13 % 7.68 344 0.13 % 8.01 22 0.07 % 5.56 624 0.04 % 1.96 najsi najsi najsi 4,664 0.08 % 4.11 0 0 % 0 343 0.11 % 8.64 1,902 0.07 % 3.50 1,216 0.11 % 6.49 395 0.15 % 9.20 66 0.21 % 16.68 742 0.05 % 2.33 predno predno predn o 1,817 0.03 % 1.60 0 0 % 0 143 0.05 % 3.60 944 0.03 % 1.74 274 0.03 % 1.46 223 0.09 % 5.19 42 0.14 % 10.62 191 0.01 % 0.60 odkoder odkoder odkod er 1,597 0.03 % 1.41 0 0 % 0 109 0.04 % 2.74 878 0.03 % 1.62 300 0.03 % 1.60 96 0.04 % 2.24 7 0.02 % 1.77 207 0.01 % 0.65 dasiravno dasiravno dasir avno 1,435 0.02 % 1.26 0 0 % 0 58 0.02 % 1.46 637 0.02 % 1.17 487 0.04 % 2.60 51 0.02 % 1.19 1 0 % 0.25 201 0.01 % 0.63 dočim dočim dočim 855 0.01 % 0.75 0 0 % 0 15 0.01 % 0.38 73 0 % 0.13 692 0.06 % 3.69 36 0.01 % 0.84 1 0 % 0.25 38 0 % 0.12 super super super 570 0.01 % 0.50 0 0 % 0 26 0.01 % 0.65 174 0.01 % 0.32 124 0.01 % 0.66 5 0 % 0.12 4 0.01 % 1.01 237 0.02 % 0.75 akoravno akoravno akora vno 207 0 % 0.18 0 0 % 0 56 0.02 % 1.41 
44 0 % 0.08 71 0.01 % 0.38 16 0.01 % 0.37 0 0 % 0 20 0 % 0.06 poceni poceni pocen i 179 0 % 0.16 0 0 % 0 4 0 % 0.10 130 0 % 0.24 23 0 % 0.12 5 0 % 0.12 0 0 % 0 17 0 % 0.05 CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 182 File at CLARIN.SI 1.2.166 List of final character-level 1-grams from conjunction lemmas in the Gigafida 2.0 corpus with text-type distributionGF2.0-word_parts-conjunctions-lemmas-final- 1grams-taxonomy-entire.tsvLemma Lemma (lower-case) Rest of the word Final part of the word Total absolute frequency of lemma Percentage of all found lemmas Total relative frequency (per million occurrences) Absolute frequency [SSJ.T.K.N] Percentage [SSJ.T.K.N] Relative frequency [SSJ.T.K.N] Absolute frequency [SSJ.T.K.L] Percentage [SSJ.T.K.L] Relative frequency [SSJ.T.K.L] Absolute frequency [SSJ.T.P.C] Percentage [SSJ.T.P.C] Relative frequency [SSJ.T.P.C] Absolute frequency [SSJ.T.P.R] Percentage [SSJ.T.P.R] Relative frequency [SSJ.T.P.R] Absolute frequency [SSJ.T.K.S] Percentage [SSJ.T.K.S] Relative frequency [SSJ.T.K.S] Absolute frequency [SSJ.T.D] Percentage [SSJ.T.D] Relative frequency [SSJ.T.D] Absolute frequency [SSJ.I] Percentage [SSJ.I] Relative frequency [SSJ.I] in in i n 29,117,711 29.55 % 25,661.29 619 55.66 % 63,768.41 1,187,174 30.09 % 29,891.76 13,762,326 29.85 % 25,358 5,256,746 29.89 % 28,048.27 1,338,463 33.09 % 31,174.39 115,835 29.37 % 29,281.84 7,456,548 28.18 % 23,452.78 da da d a 14,810,352 15.03 % 13,052.29 124 11.15 % 12,774.29 752,394 19.07 % 18,944.47 6,760,076 14.66 % 12,455.89 2,457,240 13.97 % 13,111.03 526,301 13.01 % 12,258.18 53,649 13.60 % 13,561.89 4,260,568 16.10 % 13,400.60 ki ki k i 12,623,562 12.81 % 11,125.08 55 4.95 % 5,666.01 338,126 8.57 % 8,513.65 5,915,959 12.83 % 10,900.55 2,202,562 12.52 % 11,752.15 516,575 12.77 % 12,031.65 55,386 14.04 % 14,000.98 3,594,899 13.58 % 11,306.89 pa pa p a 11,003,780 11.17 % 9,697.58 37 3.33 % 3,811.68 319,351 8.09 % 8,040.91 5,422,876 11.76 % 9,992.01 
1,898,081 10.79 % 10,127.54 320,699 7.93 % 7,469.46 25,847 6.55 % 6,533.84 3,016,889 11.40 % 9,488.90 kot kot ko t 5,694,030 5.78 % 5,018.12 23 2.07 % 2,369.42 210,691 5.34 % 5,304.97 2,600,526 5.64 % 4,791.64 973,450 5.54 % 5,194.01 231,781 5.73 % 5,398.46 16,409 4.16 % 4,148.02 1,661,150 6.28 % 5,224.75 ko ko k o 3,180,102 3.23 % 2,802.61 36 3.24 % 3,708.66 225,473 5.71 % 5,677.17 1,386,450 3.01 % 2,554.63 573,281 3.26 % 3,058.84 123,662 3.06 % 2,880.24 9,198 2.33 % 2,325.16 862,002 3.26 % 2,711.22 ali ali al i 2,999,080 3.04 % 2,643.07 42 3.78 % 4,326.77 115,783 2.94 % 2,915.29 1,285,899 2.79 % 2,369.35 682,175 3.88 % 3,639.86 232,818 5.76 % 5,422.61 34,601 8.77 % 8,746.76 647,762 2.45 % 2,037.38 če če č e 2,441,017 2.48 % 2,151.26 27 2.43 % 2,781.50 127,590 3.23 % 3,212.58 1,064,667 2.31 % 1,961.72 547,603 3.11 % 2,921.83 138,698 3.43 % 3,230.44 23,279 5.90 % 5,884.68 539,153 2.04 % 1,695.78 saj saj sa j 1,922,955 1.95 % 1,694.69 0 0 % 0 52,415 1.33 % 1,319.75 956,035 2.07 % 1,761.56 353,479 2.01 % 1,886.05 45,680 1.13 % 1,063.94 1,897 0.48 % 479.54 513,449 1.94 % 1,614.93 ter ter te r 1,668,988 1.69 % 1,470.87 100 8.99 % 10,301.84 23,537 0.60 % 592.64 765,500 1.66 % 1,410.48 282,898 1.61 % 1,509.45 66,550 1.65 % 1,550.03 8,191 2.08 % 2,070.60 522,212 1.97 % 1,642.49 a a a 1,586,387 1.61 % 1,398.07 2 0.18 % 206.04 65,928 1.67 % 1,660 611,753 1.33 % 1,127.20 263,856 1.50 % 1,407.85 35,869 0.89 % 835.43 2,997 0.76 % 757.61 605,982 2.29 % 1,905.97 ker ker ke r 1,508,421 1.53 % 1,329.36 2 0.18 % 206.04 71,438 1.81 % 1,798.73 727,004 1.58 % 1,339.55 293,861 1.67 % 1,567.95 57,937 1.43 % 1,349.42 5,756 1.46 % 1,455.05 352,423 1.33 % 1,108.46 zato zato zat o 1,302,554 1.32 % 1,147.93 0 0 % 0 34,925 0.89 % 879.37 637,853 1.38 % 1,175.29 249,506 1.42 % 1,331.28 51,526 1.27 % 1,200.10 3,264 0.83 % 825.10 325,480 1.23 % 1,023.72 kjer kjer kje r 1,147,657 1.17 % 1,011.42 0 0 % 0 30,047 0.76 % 756.55 561,556 1.22 % 1,034.70 188,431 1.07 % 1,005.41 36,017 0.89 % 838.88 2,761 
0.70 % 697.95 328,845 1.24 % 1,034.30 vendar vendar venda r 997,194 1.01 % 878.82 0 0 % 0 53,106 1.35 % 1,337.15 515,237 1.12 % 949.36 192,538 1.09 % 1,027.32 48,139 1.19 % 1,121.21 3,173 0.81 % 802.10 185,001 0.70 % 581.88 namreč namreč namre č 902,134 0.92 % 795.05 0 0 % 0 9,366 0.24 % 235.83 452,514 0.98 % 833.79 148,837 0.85 % 794.15 17,827 0.44 % 415.21 1,164 0.29 % 294.25 272,426 1.03 % 856.85 čeprav čeprav čepra v 679,637 0.69 % 598.96 0 0 % 0 28,778 0.73 % 724.60 333,503 0.72 % 614.50 130,569 0.74 % 696.67 24,686 0.61 % 574.97 1,251 0.32 % 316.24 160,850 0.61 % 505.92 oziroma oziroma ozirom a 648,207 0.66 % 571.26 2 0.18 % 206.04 4,180 0.11 % 105.25 332,593 0.72 % 612.82 100,468 0.57 % 536.06 23,365 0.58 % 544.20 10,581 2.68 % 2,674.76 177,018 0.67 % 556.77 toda toda tod a 588,678 0.60 % 518.80 0 0 % 0 48,129 1.22 % 1,211.84 303,559 0.66 % 559.33 106,097 0.60 % 566.10 26,653 0.66 % 620.78 1,443 0.37 % 364.77 102,797 0.39 % 323.32 ampak ampak ampa k 572,085 0.58 % 504.18 0 0 % 0 58,951 1.49 % 1,484.32 227,969 0.49 % 420.05 115,041 0.65 % 613.82 18,836 0.47 % 438.71 2,969 0.75 % 750.53 148,319 0.56 % 466.50 tako tako tak o 410,279 0.42 % 361.58 12 1.08 % 1,236.22 11,498 0.29 % 289.51 193,920 0.42 % 357.31 77,933 0.44 % 415.82 17,681 0.44 % 411.81 1,533 0.39 % 387.53 107,702 0.41 % 338.75 naj naj na j 389,738 0.40 % 343.47 0 0 % 0 25,019 0.63 % 629.95 187,176 0.41 % 344.88 57,022 0.32 % 304.25 13,726 0.34 % 319.69 1,027 0.26 % 259.61 105,768 0.40 % 332.67 kakor kakor kako r 272,376 0.28 % 240.04 0 0 % 0 40,148 1.02 % 1,010.88 123,199 0.27 % 227 51,656 0.29 % 275.62 24,063 0.59 % 560.46 1,664 0.42 % 420.64 31,646 0.12 % 99.53 sicer sicer sice r 263,692 0.27 % 232.39 0 0 % 0 2,680 0.07 % 67.48 130,258 0.28 % 240.01 36,602 0.21 % 195.30 6,798 0.17 % 158.33 847 0.21 % 214.11 86,507 0.33 % 272.09 temveč temveč temve č 235,490 0.24 % 207.54 0 0 % 0 7,332 0.19 % 184.61 110,231 0.24 % 203.11 43,698 0.25 % 233.16 16,034 0.40 % 373.45 373 0.10 % 94.29 57,822 0.22 % 
181.87 torej torej tore j 196,891 0.20 % 173.52 0 0 % 0 3,308 0.08 % 83.29 96,806 0.21 % 178.37 39,105 0.22 % 208.65 9,010 0.22 % 209.85 593 0.15 % 149.90 48,069 0.18 % 151.19 kajti kajti kajt i 179,605 0.18 % 158.28 0 0 % 0 11,065 0.28 % 278.60 106,508 0.23 % 196.25 32,173 0.18 % 171.66 7,547 0.19 % 175.78 1,131 0.29 % 285.90 21,181 0.08 % 66.62 vendarle vendarle vendarl e 165,832 0.17 % 146.15 0 0 % 0 4,087 0.10 % 102.91 91,249 0.20 % 168.13 23,643 0.13 % 126.15 3,756 0.09 % 87.48 544 0.14 % 137.52 42,553 0.16 % 133.84 preden preden prede n 148,498 0.15 % 130.87 8 0.72 % 824.15 20,759 0.53 % 522.69 53,160 0.12 % 97.95 32,176 0.18 % 171.68 8,842 0.22 % 205.94 629 0.16 % 159 32,924 0.12 % 103.55 dokler dokler dokle r 147,332 0.15 % 129.84 19 1.71 % 1,957.35 16,427 0.42 % 413.61 59,151 0.13 % 108.99 30,010 0.17 % 160.12 9,366 0.23 % 218.15 1,089 0.28 % 275.29 31,270 0.12 % 98.35 kadar kadar kada r 133,016 0.14 % 117.23 1 0.09 % 103.02 13,669 0.35 % 344.17 44,635 0.10 % 82.24 34,877 0.20 % 186.09 16,388 0.41 % 381.70 1,820 0.46 % 460.08 21,626 0.08 % 68.02 kolikor kolikor koliko r 105,767 0.11 % 93.21 1 0.09 % 103.02 5,383 0.14 % 135.54 50,897 0.11 % 93.78 18,240 0.10 % 97.32 4,410 0.11 % 102.71 830 0.21 % 209.82 26,006 0.10 % 81.80 CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 183 File at CLARIN.SI 1.2.167 List of final character-level 2-grams from conjunction lemmas in the Gigafida 2.0 corpus with text-type distributionGF2.0-word_parts-conjunctions-lemmas-final- 2grams-taxonomy-entire.tsvLemma Lemma (lower-case) Rest of the word Final part of the word Total absolute frequency of lemma Percentage of all found lemmas Total relative frequency (per million occurrences) Absolute frequency [SSJ.T.K.N] Percentage [SSJ.T.K.N] Relative frequency [SSJ.T.K.N] Absolute frequency [SSJ.T.K.L] Percentage [SSJ.T.K.L] Relative frequency [SSJ.T.K.L] Absolute frequency [SSJ.T.P.C] Percentage [SSJ.T.P.C] Relative frequency [SSJ.T.P.C] Absolute 
frequency [SSJ.T.P.R] Percentage [SSJ.T.P.R] Relative frequency [SSJ.T.P.R] Absolute frequency [SSJ.T.K.S] Percentage [SSJ.T.K.S] Relative frequency [SSJ.T.K.S] Absolute frequency [SSJ.T.D] Percentage [SSJ.T.D] Relative frequency [SSJ.T.D] Absolute frequency [SSJ.I] Percentage [SSJ.I] Relative frequency [SSJ.I] in in in 29,117,711 30.03 % 25,661.29 619 55.77 % 63,768.41 1,187,174 30.60 % 29,891.76 13,762,326 30.25 % 25,358 5,256,746 30.34 % 28,048.27 1,338,463 33.39 % 31,174.39 115,835 29.60 % 29,281.84 7,456,548 28.84 % 23,452.78 da da da 14,810,352 15.28 % 13,052.29 124 11.17 % 12,774.29 752,394 19.39 % 18,944.47 6,760,076 14.86 % 12,455.89 2,457,240 14.18 % 13,111.03 526,301 13.13 % 12,258.18 53,649 13.71 % 13,561.89 4,260,568 16.48 % 13,400.60 ki ki ki 12,623,562 13.02 % 11,125.08 55 4.96 % 5,666.01 338,126 8.72 % 8,513.65 5,915,959 13.00 % 10,900.55 2,202,562 12.71 % 11,752.15 516,575 12.89 % 12,031.65 55,386 14.15 % 14,000.98 3,594,899 13.90 % 11,306.89 pa pa pa 11,003,780 11.35 % 9,697.58 37 3.33 % 3,811.68 319,351 8.23 % 8,040.91 5,422,876 11.92 % 9,992.01 1,898,081 10.96 % 10,127.54 320,699 8.00 % 7,469.46 25,847 6.60 % 6,533.84 3,016,889 11.67 % 9,488.90 kot kot k ot 5,694,030 5.87 % 5,018.12 23 2.07 % 2,369.42 210,691 5.43 % 5,304.97 2,600,526 5.72 % 4,791.64 973,450 5.62 % 5,194.01 231,781 5.78 % 5,398.46 16,409 4.19 % 4,148.02 1,661,150 6.42 % 5,224.75 ko ko ko 3,180,102 3.28 % 2,802.61 36 3.24 % 3,708.66 225,473 5.81 % 5,677.17 1,386,450 3.05 % 2,554.63 573,281 3.31 % 3,058.84 123,662 3.08 % 2,880.24 9,198 2.35 % 2,325.16 862,002 3.33 % 2,711.22 ali ali a li 2,999,080 3.09 % 2,643.07 42 3.78 % 4,326.77 115,783 2.98 % 2,915.29 1,285,899 2.83 % 2,369.35 682,175 3.94 % 3,639.86 232,818 5.81 % 5,422.61 34,601 8.84 % 8,746.76 647,762 2.50 % 2,037.38 če če če 2,441,017 2.52 % 2,151.26 27 2.43 % 2,781.50 127,590 3.29 % 3,212.58 1,064,667 2.34 % 1,961.72 547,603 3.16 % 2,921.83 138,698 3.46 % 3,230.44 23,279 5.95 % 5,884.68 539,153 2.08 % 1,695.78 saj saj s 
aj 1,922,955 1.98 % 1,694.69 0 0 % 0 52,415 1.35 % 1,319.75 956,035 2.10 % 1,761.56 353,479 2.04 % 1,886.05 45,680 1.14 % 1,063.94 1,897 0.48 % 479.54 513,449 1.99 % 1,614.93 ter ter t er 1,668,988 1.72 % 1,470.87 100 9.01 % 10,301.84 23,537 0.61 % 592.64 765,500 1.68 % 1,410.48 282,898 1.63 % 1,509.45 66,550 1.66 % 1,550.03 8,191 2.09 % 2,070.60 522,212 2.02 % 1,642.49 ker ker k er 1,508,421 1.56 % 1,329.36 2 0.18 % 206.04 71,438 1.84 % 1,798.73 727,004 1.60 % 1,339.55 293,861 1.70 % 1,567.95 57,937 1.45 % 1,349.42 5,756 1.47 % 1,455.05 352,423 1.36 % 1,108.46 zato zato za to 1,302,554 1.34 % 1,147.93 0 0 % 0 34,925 0.90 % 879.37 637,853 1.40 % 1,175.29 249,506 1.44 % 1,331.28 51,526 1.28 % 1,200.10 3,264 0.83 % 825.10 325,480 1.26 % 1,023.72 kjer kjer kj er 1,147,657 1.18 % 1,011.42 0 0 % 0 30,047 0.77 % 756.55 561,556 1.23 % 1,034.70 188,431 1.09 % 1,005.41 36,017 0.90 % 838.88 2,761 0.70 % 697.95 328,845 1.27 % 1,034.30 vendar vendar vend ar 997,194 1.03 % 878.82 0 0 % 0 53,106 1.37 % 1,337.15 515,237 1.13 % 949.36 192,538 1.11 % 1,027.32 48,139 1.20 % 1,121.21 3,173 0.81 % 802.10 185,001 0.71 % 581.88 namreč namreč namr eč 902,134 0.93 % 795.05 0 0 % 0 9,366 0.24 % 235.83 452,514 0.99 % 833.79 148,837 0.86 % 794.15 17,827 0.45 % 415.21 1,164 0.30 % 294.25 272,426 1.05 % 856.85 čeprav čeprav čepr av 679,637 0.70 % 598.96 0 0 % 0 28,778 0.74 % 724.60 333,503 0.73 % 614.50 130,569 0.75 % 696.67 24,686 0.62 % 574.97 1,251 0.32 % 316.24 160,850 0.62 % 505.92 oziroma oziroma oziro ma 648,207 0.67 % 571.26 2 0.18 % 206.04 4,180 0.11 % 105.25 332,593 0.73 % 612.82 100,468 0.58 % 536.06 23,365 0.58 % 544.20 10,581 2.70 % 2,674.76 177,018 0.69 % 556.77 toda toda to da 588,678 0.61 % 518.80 0 0 % 0 48,129 1.24 % 1,211.84 303,559 0.67 % 559.33 106,097 0.61 % 566.10 26,653 0.67 % 620.78 1,443 0.37 % 364.77 102,797 0.40 % 323.32 ampak ampak amp ak 572,085 0.59 % 504.18 0 0 % 0 58,951 1.52 % 1,484.32 227,969 0.50 % 420.05 115,041 0.66 % 613.82 18,836 0.47 % 438.71 2,969 0.76 
% 750.53 148,319 0.57 % 466.50 tako tako ta ko 410,279 0.42 % 361.58 12 1.08 % 1,236.22 11,498 0.30 % 289.51 193,920 0.43 % 357.31 77,933 0.45 % 415.82 17,681 0.44 % 411.81 1,533 0.39 % 387.53 107,702 0.42 % 338.75 naj naj n aj 389,738 0.40 % 343.47 0 0 % 0 25,019 0.65 % 629.95 187,176 0.41 % 344.88 57,022 0.33 % 304.25 13,726 0.34 % 319.69 1,027 0.26 % 259.61 105,768 0.41 % 332.67 kakor kakor kak or 272,376 0.28 % 240.04 0 0 % 0 40,148 1.03 % 1,010.88 123,199 0.27 % 227 51,656 0.30 % 275.62 24,063 0.60 % 560.46 1,664 0.42 % 420.64 31,646 0.12 % 99.53 sicer sicer sic er 263,692 0.27 % 232.39 0 0 % 0 2,680 0.07 % 67.48 130,258 0.29 % 240.01 36,602 0.21 % 195.30 6,798 0.17 % 158.33 847 0.22 % 214.11 86,507 0.34 % 272.09 temveč temveč temv eč 235,490 0.24 % 207.54 0 0 % 0 7,332 0.19 % 184.61 110,231 0.24 % 203.11 43,698 0.25 % 233.16 16,034 0.40 % 373.45 373 0.10 % 94.29 57,822 0.22 % 181.87 torej torej tor ej 196,891 0.20 % 173.52 0 0 % 0 3,308 0.09 % 83.29 96,806 0.21 % 178.37 39,105 0.23 % 208.65 9,010 0.23 % 209.85 593 0.15 % 149.90 48,069 0.19 % 151.19 kajti kajti kaj ti 179,605 0.18 % 158.28 0 0 % 0 11,065 0.28 % 278.60 106,508 0.23 % 196.25 32,173 0.19 % 171.66 7,547 0.19 % 175.78 1,131 0.29 % 285.90 21,181 0.08 % 66.62 vendarle vendarle vendar le 165,832 0.17 % 146.15 0 0 % 0 4,087 0.10 % 102.91 91,249 0.20 % 168.13 23,643 0.14 % 126.15 3,756 0.09 % 87.48 544 0.14 % 137.52 42,553 0.17 % 133.84 preden preden pred en 148,498 0.15 % 130.87 8 0.72 % 824.15 20,759 0.54 % 522.69 53,160 0.12 % 97.95 32,176 0.19 % 171.68 8,842 0.22 % 205.94 629 0.16 % 159 32,924 0.13 % 103.55 dokler dokler dokl er 147,332 0.15 % 129.84 19 1.71 % 1,957.35 16,427 0.42 % 413.61 59,151 0.13 % 108.99 30,010 0.17 % 160.12 9,366 0.23 % 218.15 1,089 0.28 % 275.29 31,270 0.12 % 98.35 kadar kadar kad ar 133,016 0.14 % 117.23 1 0.09 % 103.02 13,669 0.35 % 344.17 44,635 0.10 % 82.24 34,877 0.20 % 186.09 16,388 0.41 % 381.70 1,820 0.47 % 460.08 21,626 0.08 % 68.02 kolikor kolikor kolik or 105,767 
0.11 % 93.21 1 0.09 % 103.02 5,383 0.14 % 135.54 50,897 0.11 % 93.78 18,240 0.10 % 97.32 4,410 0.11 % 102.71 830 0.21 % 209.82 26,006 0.10 % 81.80 kamor kamor kam or 78,225 0.08 % 68.94 0 0 % 0 3,973 0.10 % 100.04 37,516 0.08 % 69.13 14,813 0.09 % 79.04 2,953 0.07 % 68.78 176 0.04 % 44.49 18,794 0.07 % 59.11 CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 184 File at CLARIN.SI 1.2.168 List of final character-level 3-grams from conjunction lemmas in the Gigafida 2.0 corpus with text-type distributionGF2.0-word_parts-conjunctions-lemmas-final- 3grams-taxonomy-entire.tsvLemma Lemma (lower-case) Rest of the word Final part of the word Total absolute frequency of lemma Percentage of all found lemmas Total relative frequency (per million occurrences) Absolute frequency [SSJ.T.K.N] Percentage [SSJ.T.K.N] Relative frequency [SSJ.T.K.N] Absolute frequency [SSJ.T.K.L] Percentage [SSJ.T.K.L] Relative frequency [SSJ.T.K.L] Absolute frequency [SSJ.T.P.C] Percentage [SSJ.T.P.C] Relative frequency [SSJ.T.P.C] Absolute frequency [SSJ.T.P.R] Percentage [SSJ.T.P.R] Relative frequency [SSJ.T.P.R] Absolute frequency [SSJ.T.K.S] Percentage [SSJ.T.K.S] Relative frequency [SSJ.T.K.S] Absolute frequency [SSJ.T.D] Percentage [SSJ.T.D] Relative frequency [SSJ.T.D] Absolute frequency [SSJ.I] Percentage [SSJ.I] Relative frequency [SSJ.I] kot kot kot 5,694,030 23.98 % 5,018.12 23 10.85 % 2,369.42 210,691 22.73 % 5,304.97 2,600,526 23.29 % 4,791.64 973,450 22.21 % 5,194.01 231,781 22.22 % 5,398.46 16,409 15.19 % 4,148.02 1,661,150 27.14 % 5,224.75 ali ali ali 2,999,080 12.63 % 2,643.07 42 19.81 % 4,326.77 115,783 12.49 % 2,915.29 1,285,899 11.52 % 2,369.35 682,175 15.57 % 3,639.86 232,818 22.32 % 5,422.61 34,601 32.03 % 8,746.76 647,762 10.58 % 2,037.38 saj saj saj 1,922,955 8.10 % 1,694.69 0 0 % 0 52,415 5.66 % 1,319.75 956,035 8.56 % 1,761.56 353,479 8.07 % 1,886.05 45,680 4.38 % 1,063.94 1,897 1.76 % 479.54 513,449 8.39 % 1,614.93 ter ter ter 1,668,988 7.03 % 
1,470.87 100 47.17 % 10,301.84 23,537 2.54 % 592.64 765,500 6.86 % 1,410.48 282,898 6.46 % 1,509.45 66,550 6.38 % 1,550.03 8,191 7.58 % 2,070.60 522,212 8.53 % 1,642.49 ker ker ker 1,508,421 6.35 % 1,329.36 2 0.94 % 206.04 71,438 7.71 % 1,798.73 727,004 6.51 % 1,339.55 293,861 6.71 % 1,567.95 57,937 5.55 % 1,349.42 5,756 5.33 % 1,455.05 352,423 5.76 % 1,108.46 zato zato z ato 1,302,554 5.49 % 1,147.93 0 0 % 0 34,925 3.77 % 879.37 637,853 5.71 % 1,175.29 249,506 5.69 % 1,331.28 51,526 4.94 % 1,200.10 3,264 3.02 % 825.10 325,480 5.32 % 1,023.72 kjer kjer k jer 1,147,657 4.83 % 1,011.42 0 0 % 0 30,047 3.24 % 756.55 561,556 5.03 % 1,034.70 188,431 4.30 % 1,005.41 36,017 3.45 % 838.88 2,761 2.56 % 697.95 328,845 5.37 % 1,034.30 vendar vendar ven dar 997,194 4.20 % 878.82 0 0 % 0 53,106 5.73 % 1,337.15 515,237 4.62 % 949.36 192,538 4.39 % 1,027.32 48,139 4.62 % 1,121.21 3,173 2.94 % 802.10 185,001 3.02 % 581.88 namreč namreč nam reč 902,134 3.80 % 795.05 0 0 % 0 9,366 1.01 % 235.83 452,514 4.05 % 833.79 148,837 3.40 % 794.15 17,827 1.71 % 415.21 1,164 1.08 % 294.25 272,426 4.45 % 856.85 čeprav čeprav čep rav 679,637 2.86 % 598.96 0 0 % 0 28,778 3.10 % 724.60 333,503 2.99 % 614.50 130,569 2.98 % 696.67 24,686 2.37 % 574.97 1,251 1.16 % 316.24 160,850 2.63 % 505.92 oziroma oziroma ozir oma 648,207 2.73 % 571.26 2 0.94 % 206.04 4,180 0.45 % 105.25 332,593 2.98 % 612.82 100,468 2.29 % 536.06 23,365 2.24 % 544.20 10,581 9.79 % 2,674.76 177,018 2.89 % 556.77 toda toda t oda 588,678 2.48 % 518.80 0 0 % 0 48,129 5.19 % 1,211.84 303,559 2.72 % 559.33 106,097 2.42 % 566.10 26,653 2.56 % 620.78 1,443 1.34 % 364.77 102,797 1.68 % 323.32 ampak ampak am pak 572,085 2.41 % 504.18 0 0 % 0 58,951 6.36 % 1,484.32 227,969 2.04 % 420.05 115,041 2.62 % 613.82 18,836 1.81 % 438.71 2,969 2.75 % 750.53 148,319 2.42 % 466.50 tako tako t ako 410,279 1.73 % 361.58 12 5.66 % 1,236.22 11,498 1.24 % 289.51 193,920 1.74 % 357.31 77,933 1.78 % 415.82 17,681 1.70 % 411.81 1,533 1.42 % 387.53 107,702 
1.76 % 338.75 naj naj naj 389,738 1.64 % 343.47 0 0 % 0 25,019 2.70 % 629.95 187,176 1.68 % 344.88 57,022 1.30 % 304.25 13,726 1.32 % 319.69 1,027 0.95 % 259.61 105,768 1.73 % 332.67 kakor kakor ka kor 272,376 1.15 % 240.04 0 0 % 0 40,148 4.33 % 1,010.88 123,199 1.10 % 227 51,656 1.18 % 275.62 24,063 2.31 % 560.46 1,664 1.54 % 420.64 31,646 0.52 % 99.53 sicer sicer si cer 263,692 1.11 % 232.39 0 0 % 0 2,680 0.29 % 67.48 130,258 1.17 % 240.01 36,602 0.83 % 195.30 6,798 0.65 % 158.33 847 0.78 % 214.11 86,507 1.41 % 272.09 temveč temveč tem več 235,490 0.99 % 207.54 0 0 % 0 7,332 0.79 % 184.61 110,231 0.99 % 203.11 43,698 1.00 % 233.16 16,034 1.54 % 373.45 373 0.34 % 94.29 57,822 0.94 % 181.87 torej torej to rej 196,891 0.83 % 173.52 0 0 % 0 3,308 0.36 % 83.29 96,806 0.87 % 178.37 39,105 0.89 % 208.65 9,010 0.86 % 209.85 593 0.55 % 149.90 48,069 0.79 % 151.19 kajti kajti ka jti 179,605 0.76 % 158.28 0 0 % 0 11,065 1.19 % 278.60 106,508 0.95 % 196.25 32,173 0.73 % 171.66 7,547 0.72 % 175.78 1,131 1.05 % 285.90 21,181 0.35 % 66.62 vendarle vendarle venda rle 165,832 0.70 % 146.15 0 0 % 0 4,087 0.44 % 102.91 91,249 0.82 % 168.13 23,643 0.54 % 126.15 3,756 0.36 % 87.48 544 0.50 % 137.52 42,553 0.69 % 133.84 preden preden pre den 148,498 0.62 % 130.87 8 3.77 % 824.15 20,759 2.24 % 522.69 53,160 0.48 % 97.95 32,176 0.73 % 171.68 8,842 0.85 % 205.94 629 0.58 % 159 32,924 0.54 % 103.55 dokler dokler dok ler 147,332 0.62 % 129.84 19 8.96 % 1,957.35 16,427 1.77 % 413.61 59,151 0.53 % 108.99 30,010 0.69 % 160.12 9,366 0.90 % 218.15 1,089 1.01 % 275.29 31,270 0.51 % 98.35 kadar kadar ka dar 133,016 0.56 % 117.23 1 0.47 % 103.02 13,669 1.48 % 344.17 44,635 0.40 % 82.24 34,877 0.80 % 186.09 16,388 1.57 % 381.70 1,820 1.69 % 460.08 21,626 0.35 % 68.02 kolikor kolikor koli kor 105,767 0.45 % 93.21 1 0.47 % 103.02 5,383 0.58 % 135.54 50,897 0.46 % 93.78 18,240 0.42 % 97.32 4,410 0.42 % 102.71 830 0.77 % 209.82 26,006 0.42 % 81.80 kamor kamor ka mor 78,225 0.33 % 68.94 0 0 % 0 3,973 
0.43 % 100.04 37,516 0.34 % 69.13 14,813 0.34 % 79.04 2,953 0.28 % 68.78 176 0.16 % 44.49 18,794 0.31 % 59.11 bodisi bodisi bod isi 63,373 0.27 % 55.85 2 0.94 % 206.04 1,759 0.19 % 44.29 27,313 0.24 % 50.33 12,796 0.29 % 68.28 5,707 0.55 % 132.92 413 0.38 % 104.40 15,383 0.25 % 48.38 odkar odkar od kar 63,190 0.27 % 55.69 0 0 % 0 4,813 0.52 % 121.19 28,914 0.26 % 53.28 10,306 0.23 % 54.99 920 0.09 % 21.43 106 0.10 % 26.80 18,131 0.30 % 57.03 četudi četudi čet udi 62,349 0.26 % 54.95 0 0 % 0 2,741 0.30 % 69.02 28,006 0.25 % 51.60 11,127 0.25 % 59.37 3,231 0.31 % 75.25 135 0.12 % 34.13 17,109 0.28 % 53.81 niti niti n iti 56,291 0.24 % 49.61 0 0 % 0 3,040 0.33 % 76.54 27,731 0.25 % 51.10 10,226 0.23 % 54.56 2,181 0.21 % 50.80 222 0.21 % 56.12 12,891 0.21 % 40.55 razen razen ra zen 50,366 0.21 % 44.39 0 0 % 0 2,897 0.31 % 72.94 22,525 0.20 % 41.50 10,083 0.23 % 53.80 3,015 0.29 % 70.22 1,122 1.04 % 283.63 10,724 0.17 % 33.73 koder koder ko der 33,663 0.14 % 29.67 0 0 % 0 1,770 0.19 % 44.57 17,062 0.15 % 31.44 5,230 0.12 % 27.91 1,410 0.14 % 32.84 49 0.04 % 12.39 8,142 0.13 % 25.61 CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 185 File at CLARIN.SI 1.2.169 List of final character-level 4-grams from conjunction lemmas in the Gigafida 2.0 corpus with text-type distributionGF2.0-word_parts-conjunctions-lemmas-final- 4grams-taxonomy-entire.tsvLemma Lemma (lower-case) Rest of the word Final part of the word Total absolute frequency of lemma Percentage of all found lemmas Total relative frequency (per million occurrences) Absolute frequency [SSJ.T.K.N] Percentage [SSJ.T.K.N] Relative frequency [SSJ.T.K.N] Absolute frequency [SSJ.T.K.L] Percentage [SSJ.T.K.L] Relative frequency [SSJ.T.K.L] Absolute frequency [SSJ.T.P.C] Percentage [SSJ.T.P.C] Relative frequency [SSJ.T.P.C] Absolute frequency [SSJ.T.P.R] Percentage [SSJ.T.P.R] Relative frequency [SSJ.T.P.R] Absolute frequency [SSJ.T.K.S] Percentage [SSJ.T.K.S] Relative frequency [SSJ.T.K.S] 
Absolute frequency [SSJ.T.D] Percentage [SSJ.T.D] Relative frequency [SSJ.T.D] Absolute frequency [SSJ.I] Percentage [SSJ.I] Relative frequency [SSJ.I] zato zato zato 1,302,554 13.63 % 1,147.93 0 0 % 0 34,925 8.17 % 879.37 637,853 13.75 % 1,175.29 249,506 14.36 % 1,331.28 51,526 13.08 % 1,200.10 3,264 8.14 % 825.10 325,480 14.05 % 1,023.72 kjer kjer kjer 1,147,657 12.01 % 1,011.42 0 0 % 0 30,047 7.03 % 756.55 561,556 12.10 % 1,034.70 188,431 10.85 % 1,005.41 36,017 9.15 % 838.88 2,761 6.88 % 697.95 328,845 14.20 % 1,034.30 vendar vendar ve ndar 997,194 10.44 % 878.82 0 0 % 0 53,106 12.43 % 1,337.15 515,237 11.11 % 949.36 192,538 11.08 % 1,027.32 48,139 12.22 % 1,121.21 3,173 7.91 % 802.10 185,001 7.99 % 581.88 namreč namreč na mreč 902,134 9.44 % 795.05 0 0 % 0 9,366 2.19 % 235.83 452,514 9.75 % 833.79 148,837 8.57 % 794.15 17,827 4.53 % 415.21 1,164 2.90 % 294.25 272,426 11.76 % 856.85 čeprav čeprav če prav 679,637 7.11 % 598.96 0 0 % 0 28,778 6.73 % 724.60 333,503 7.19 % 614.50 130,569 7.52 % 696.67 24,686 6.27 % 574.97 1,251 3.12 % 316.24 160,850 6.94 % 505.92 oziroma oziroma ozi roma 648,207 6.78 % 571.26 2 4.44 % 206.04 4,180 0.98 % 105.25 332,593 7.17 % 612.82 100,468 5.78 % 536.06 23,365 5.93 % 544.20 10,581 26.38 % 2,674.76 177,018 7.64 % 556.77 toda toda toda 588,678 6.16 % 518.80 0 0 % 0 48,129 11.26 % 1,211.84 303,559 6.54 % 559.33 106,097 6.11 % 566.10 26,653 6.77 % 620.78 1,443 3.60 % 364.77 102,797 4.44 % 323.32 ampak ampak a mpak 572,085 5.99 % 504.18 0 0 % 0 58,951 13.79 % 1,484.32 227,969 4.91 % 420.05 115,041 6.62 % 613.82 18,836 4.78 % 438.71 2,969 7.40 % 750.53 148,319 6.40 % 466.50 tako tako tako 410,279 4.29 % 361.58 12 26.67 % 1,236.22 11,498 2.69 % 289.51 193,920 4.18 % 357.31 77,933 4.49 % 415.82 17,681 4.49 % 411.81 1,533 3.82 % 387.53 107,702 4.65 % 338.75 kakor kakor k akor 272,376 2.85 % 240.04 0 0 % 0 40,148 9.39 % 1,010.88 123,199 2.65 % 227 51,656 2.97 % 275.62 24,063 6.11 % 560.46 1,664 4.15 % 420.64 31,646 1.37 % 99.53 sicer sicer 
s icer 263,692 2.76 % 232.39 0 0 % 0 2,680 0.63 % 67.48 130,258 2.81 % 240.01 36,602 2.11 % 195.30 6,798 1.73 % 158.33 847 2.11 % 214.11 86,507 3.73 % 272.09 temveč temveč te mveč 235,490 2.46 % 207.54 0 0 % 0 7,332 1.72 % 184.61 110,231 2.38 % 203.11 43,698 2.52 % 233.16 16,034 4.07 % 373.45 373 0.93 % 94.29 57,822 2.50 % 181.87 torej torej t orej 196,891 2.06 % 173.52 0 0 % 0 3,308 0.77 % 83.29 96,806 2.09 % 178.37 39,105 2.25 % 208.65 9,010 2.29 % 209.85 593 1.48 % 149.90 48,069 2.08 % 151.19 kajti kajti k ajti 179,605 1.88 % 158.28 0 0 % 0 11,065 2.59 % 278.60 106,508 2.30 % 196.25 32,173 1.85 % 171.66 7,547 1.92 % 175.78 1,131 2.82 % 285.90 21,181 0.91 % 66.62 vendarle vendarle vend arle 165,832 1.74 % 146.15 0 0 % 0 4,087 0.96 % 102.91 91,249 1.97 % 168.13 23,643 1.36 % 126.15 3,756 0.95 % 87.48 544 1.36 % 137.52 42,553 1.84 % 133.84 preden preden pr eden 148,498 1.55 % 130.87 8 17.78 % 824.15 20,759 4.86 % 522.69 53,160 1.15 % 97.95 32,176 1.85 % 171.68 8,842 2.25 % 205.94 629 1.57 % 159 32,924 1.42 % 103.55 dokler dokler do kler 147,332 1.54 % 129.84 19 42.22 % 1,957.35 16,427 3.84 % 413.61 59,151 1.27 % 108.99 30,010 1.73 % 160.12 9,366 2.38 % 218.15 1,089 2.71 % 275.29 31,270 1.35 % 98.35 kadar kadar k adar 133,016 1.39 % 117.23 1 2.22 % 103.02 13,669 3.20 % 344.17 44,635 0.96 % 82.24 34,877 2.01 % 186.09 16,388 4.16 % 381.70 1,820 4.54 % 460.08 21,626 0.93 % 68.02 kolikor kolikor kol ikor 105,767 1.11 % 93.21 1 2.22 % 103.02 5,383 1.26 % 135.54 50,897 1.10 % 93.78 18,240 1.05 % 97.32 4,410 1.12 % 102.71 830 2.07 % 209.82 26,006 1.12 % 81.80 kamor kamor k amor 78,225 0.82 % 68.94 0 0 % 0 3,973 0.93 % 100.04 37,516 0.81 % 69.13 14,813 0.85 % 79.04 2,953 0.75 % 68.78 176 0.44 % 44.49 18,794 0.81 % 59.11 bodisi bodisi bo disi 63,373 0.66 % 55.85 2 4.44 % 206.04 1,759 0.41 % 44.29 27,313 0.59 % 50.33 12,796 0.74 % 68.28 5,707 1.45 % 132.92 413 1.03 % 104.40 15,383 0.66 % 48.38 odkar odkar o dkar 63,190 0.66 % 55.69 0 0 % 0 4,813 1.13 % 121.19 28,914 0.62 % 
53.28 10,306 0.59 % 54.99 920 0.23 % 21.43 106 0.26 % 26.80 18,131 0.78 % 57.03
četudi četudi če tudi 62,349 0.65 % 54.95 0 0 % 0 2,741 0.64 % 69.02 28,006 0.60 % 51.60 11,127 0.64 % 59.37 3,231 0.82 % 75.25 135 0.34 % 34.13 17,109 0.74 % 53.81
niti niti niti 56,291 0.59 % 49.61 0 0 % 0 3,040 0.71 % 76.54 27,731 0.60 % 51.10 10,226 0.59 % 54.56 2,181 0.55 % 50.80 222 0.55 % 56.12 12,891 0.56 % 40.55
razen razen r azen 50,366 0.53 % 44.39 0 0 % 0 2,897 0.68 % 72.94 22,525 0.49 % 41.50 10,083 0.58 % 53.80 3,015 0.77 % 70.22 1,122 2.80 % 283.63 10,724 0.46 % 33.73
koder koder k oder 33,663 0.35 % 29.67 0 0 % 0 1,770 0.41 % 44.57 17,062 0.37 % 31.44 5,230 0.30 % 27.91 1,410 0.36 % 32.84 49 0.12 % 12.39 8,142 0.35 % 25.61
marveč marveč ma rveč 21,385 0.22 % 18.85 0 0 % 0 595 0.14 % 14.98 12,386 0.27 % 22.82 4,535 0.26 % 24.20 1,702 0.43 % 39.64 47 0.12 % 11.88 2,120 0.09 % 6.67
čeravno čeravno čer avno 9,604 0.10 % 8.46 0 0 % 0 600 0.14 % 15.11 5,123 0.11 % 9.44 1,322 0.08 % 7.05 357 0.09 % 8.31 21 0.05 % 5.31 2,181 0.09 % 6.86
zatorej zatorej zat orej 4,880 0.05 % 4.30 0 0 % 0 317 0.07 % 7.98 2,133 0.05 % 3.93 1,440 0.08 % 7.68 344 0.09 % 8.01 22 0.06 % 5.56 624 0.03 % 1.96
najsi najsi n ajsi 4,664 0.05 % 4.11 0 0 % 0 343 0.08 % 8.64 1,902 0.04 % 3.50 1,216 0.07 % 6.49 395 0.10 % 9.20 66 0.17 % 16.68 742 0.03 % 2.33
predno predno pr edno 1,817 0.02 % 1.60 0 0 % 0 143 0.03 % 3.60 944 0.02 % 1.74 274 0.02 % 1.46 223 0.06 % 5.19 42 0.10 % 10.62 191 0.01 % 0.60
odkoder odkoder odk oder 1,597 0.02 % 1.41 0 0 % 0 109 0.03 % 2.74 878 0.02 % 1.62 300 0.02 % 1.60 96 0.02 % 2.24 7 0.02 % 1.77 207 0.01 % 0.65

1.2.170 List of final character-level 5-grams from conjunction lemmas in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-word_parts-conjunctions-lemmas-final-5grams-taxonomy-entire.tsv

Lemma Lemma (lower-case) Rest of the word Final part of the word Total absolute
frequency of lemma Percentage of all found lemmas Total relative frequency (per million occurrences) Absolute frequency [SSJ.T.K.N] Percentage [SSJ.T.K.N] Relative frequency [SSJ.T.K.N] Absolute frequency [SSJ.T.K.L] Percentage [SSJ.T.K.L] Relative frequency [SSJ.T.K.L] Absolute frequency [SSJ.T.P.C] Percentage [SSJ.T.P.C] Relative frequency [SSJ.T.P.C] Absolute frequency [SSJ.T.P.R] Percentage [SSJ.T.P.R] Relative frequency [SSJ.T.P.R] Absolute frequency [SSJ.T.K.S] Percentage [SSJ.T.K.S] Relative frequency [SSJ.T.K.S] Absolute frequency [SSJ.T.D] Percentage [SSJ.T.D] Relative frequency [SSJ.T.D] Absolute frequency [SSJ.I] Percentage [SSJ.I] Relative frequency [SSJ.I] vendar vendar v endar 997,194 16.49 % 878.82 0 0 % 0 53,106 17.73 % 1,337.15 515,237 17.68 % 949.36 192,538 17.43 % 1,027.32 48,139 18.55 % 1,121.21 3,173 10.28 % 802.10 185,001 12.86 % 581.88 namreč namreč n amreč 902,134 14.92 % 795.05 0 0 % 0 9,366 3.13 % 235.83 452,514 15.53 % 833.79 148,837 13.47 % 794.15 17,827 6.87 % 415.21 1,164 3.77 % 294.25 272,426 18.94 % 856.85 čeprav čeprav č eprav 679,637 11.24 % 598.96 0 0 % 0 28,778 9.61 % 724.60 333,503 11.44 % 614.50 130,569 11.82 % 696.67 24,686 9.51 % 574.97 1,251 4.05 % 316.24 160,850 11.18 % 505.92 oziroma oziroma oz iroma 648,207 10.72 % 571.26 2 6.06 % 206.04 4,180 1.40 % 105.25 332,593 11.41 % 612.82 100,468 9.09 % 536.06 23,365 9.00 % 544.20 10,581 34.27 % 2,674.76 177,018 12.31 % 556.77 ampak ampak ampak 572,085 9.46 % 504.18 0 0 % 0 58,951 19.68 % 1,484.32 227,969 7.82 % 420.05 115,041 10.41 % 613.82 18,836 7.26 % 438.71 2,969 9.62 % 750.53 148,319 10.31 % 466.50 kakor kakor kakor 272,376 4.50 % 240.04 0 0 % 0 40,148 13.40 % 1,010.88 123,199 4.23 % 227 51,656 4.67 % 275.62 24,063 9.27 % 560.46 1,664 5.39 % 420.64 31,646 2.20 % 99.53 sicer sicer sicer 263,692 4.36 % 232.39 0 0 % 0 2,680 0.90 % 67.48 130,258 4.47 % 240.01 36,602 3.31 % 195.30 6,798 2.62 % 158.33 847 2.74 % 214.11 86,507 6.01 % 272.09 temveč temveč t emveč 235,490 3.89 % 
207.54 0 0 % 0 7,332 2.45 % 184.61 110,231 3.78 % 203.11 43,698 3.96 % 233.16 16,034 6.18 % 373.45 373 1.21 % 94.29 57,822 4.02 % 181.87 torej torej torej 196,891 3.26 % 173.52 0 0 % 0 3,308 1.10 % 83.29 96,806 3.32 % 178.37 39,105 3.54 % 208.65 9,010 3.47 % 209.85 593 1.92 % 149.90 48,069 3.34 % 151.19 kajti kajti kajti 179,605 2.97 % 158.28 0 0 % 0 11,065 3.69 % 278.60 106,508 3.65 % 196.25 32,173 2.91 % 171.66 7,547 2.91 % 175.78 1,131 3.66 % 285.90 21,181 1.47 % 66.62 vendarle vendarle ven darle 165,832 2.74 % 146.15 0 0 % 0 4,087 1.36 % 102.91 91,249 3.13 % 168.13 23,643 2.14 % 126.15 3,756 1.45 % 87.48 544 1.76 % 137.52 42,553 2.96 % 133.84 preden preden p reden 148,498 2.46 % 130.87 8 24.24 % 824.15 20,759 6.93 % 522.69 53,160 1.82 % 97.95 32,176 2.91 % 171.68 8,842 3.41 % 205.94 629 2.04 % 159 32,924 2.29 % 103.55 dokler dokler d okler 147,332 2.44 % 129.84 19 57.58 % 1,957.35 16,427 5.48 % 413.61 59,151 2.03 % 108.99 30,010 2.72 % 160.12 9,366 3.61 % 218.15 1,089 3.53 % 275.29 31,270 2.17 % 98.35 kadar kadar kadar 133,016 2.20 % 117.23 1 3.03 % 103.02 13,669 4.56 % 344.17 44,635 1.53 % 82.24 34,877 3.16 % 186.09 16,388 6.31 % 381.70 1,820 5.89 % 460.08 21,626 1.50 % 68.02 kolikor kolikor ko likor 105,767 1.75 % 93.21 1 3.03 % 103.02 5,383 1.80 % 135.54 50,897 1.75 % 93.78 18,240 1.65 % 97.32 4,410 1.70 % 102.71 830 2.69 % 209.82 26,006 1.81 % 81.80 kamor kamor kamor 78,225 1.29 % 68.94 0 0 % 0 3,973 1.33 % 100.04 37,516 1.29 % 69.13 14,813 1.34 % 79.04 2,953 1.14 % 68.78 176 0.57 % 44.49 18,794 1.31 % 59.11 bodisi bodisi b odisi 63,373 1.05 % 55.85 2 6.06 % 206.04 1,759 0.59 % 44.29 27,313 0.94 % 50.33 12,796 1.16 % 68.28 5,707 2.20 % 132.92 413 1.34 % 104.40 15,383 1.07 % 48.38 odkar odkar odkar 63,190 1.04 % 55.69 0 0 % 0 4,813 1.61 % 121.19 28,914 0.99 % 53.28 10,306 0.93 % 54.99 920 0.35 % 21.43 106 0.34 % 26.80 18,131 1.26 % 57.03 četudi četudi č etudi 62,349 1.03 % 54.95 0 0 % 0 2,741 0.92 % 69.02 28,006 0.96 % 51.60 11,127 1.01 % 59.37 3,231 1.25 % 
75.25 135 0.44 % 34.13 17,109 1.19 % 53.81
razen razen razen 50,366 0.83 % 44.39 0 0 % 0 2,897 0.97 % 72.94 22,525 0.77 % 41.50 10,083 0.91 % 53.80 3,015 1.16 % 70.22 1,122 3.63 % 283.63 10,724 0.75 % 33.73
koder koder koder 33,663 0.56 % 29.67 0 0 % 0 1,770 0.59 % 44.57 17,062 0.58 % 31.44 5,230 0.47 % 27.91 1,410 0.54 % 32.84 49 0.16 % 12.39 8,142 0.57 % 25.61
marveč marveč m arveč 21,385 0.35 % 18.85 0 0 % 0 595 0.20 % 14.98 12,386 0.42 % 22.82 4,535 0.41 % 24.20 1,702 0.66 % 39.64 47 0.15 % 11.88 2,120 0.15 % 6.67
čeravno čeravno če ravno 9,604 0.16 % 8.46 0 0 % 0 600 0.20 % 15.11 5,123 0.18 % 9.44 1,322 0.12 % 7.05 357 0.14 % 8.31 21 0.07 % 5.31 2,181 0.15 % 6.86
zatorej zatorej za torej 4,880 0.08 % 4.30 0 0 % 0 317 0.11 % 7.98 2,133 0.07 % 3.93 1,440 0.13 % 7.68 344 0.13 % 8.01 22 0.07 % 5.56 624 0.04 % 1.96
najsi najsi najsi 4,664 0.08 % 4.11 0 0 % 0 343 0.11 % 8.64 1,902 0.07 % 3.50 1,216 0.11 % 6.49 395 0.15 % 9.20 66 0.21 % 16.68 742 0.05 % 2.33
predno predno p redno 1,817 0.03 % 1.60 0 0 % 0 143 0.05 % 3.60 944 0.03 % 1.74 274 0.03 % 1.46 223 0.09 % 5.19 42 0.14 % 10.62 191 0.01 % 0.60
odkoder odkoder od koder 1,597 0.03 % 1.41 0 0 % 0 109 0.04 % 2.74 878 0.03 % 1.62 300 0.03 % 1.60 96 0.04 % 2.24 7 0.02 % 1.77 207 0.01 % 0.65
dasiravno dasiravno dasi ravno 1,435 0.02 % 1.26 0 0 % 0 58 0.02 % 1.46 637 0.02 % 1.17 487 0.04 % 2.60 51 0.02 % 1.19 1 0 % 0.25 201 0.01 % 0.63
dočim dočim dočim 855 0.01 % 0.75 0 0 % 0 15 0.01 % 0.38 73 0 % 0.13 692 0.06 % 3.69 36 0.01 % 0.84 1 0 % 0.25 38 0 % 0.12
super super super 570 0.01 % 0.50 0 0 % 0 26 0.01 % 0.65 174 0.01 % 0.32 124 0.01 % 0.66 5 0 % 0.12 4 0.01 % 1.01 237 0.02 % 0.75
akoravno akoravno ako ravno 207 0 % 0.18 0 0 % 0 56 0.02 % 1.41 44 0 % 0.08 71 0.01 % 0.38 16 0.01 % 0.37 0 0 % 0 20 0 % 0.06
poceni poceni p oceni 179 0 % 0.16 0 0 % 0 4 0 % 0.10 130 0 % 0.24 23 0 % 0.12 5 0 % 0.12 0 0 % 0 17 0 % 0.05

File at CLARIN.SI
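The "Rest of the word" and "Final part of the word" columns of the list above (and the "Initial part of the word" columns of the lists that follow) can be read as a simple split of each word at a fixed number of characters. The sketch below is only an illustration of that reading, not the LIST implementation; the function names `split_final` and `split_initial` are hypothetical. It assumes, as the lists suggest, that words shorter than the n-gram length are left out (e.g. zato does not appear among the final 5-grams).

```python
from typing import Optional, Tuple


def split_final(word: str, n: int) -> Optional[Tuple[str, str]]:
    """Split a word into (rest, final n characters).

    Returns None when the word is shorter than n, mirroring the
    assumption that such words are omitted from the n-gram lists.
    """
    if n <= 0 or len(word) < n:
        return None
    return word[:-n], word[-n:]


def split_initial(word: str, n: int) -> Optional[Tuple[str, str]]:
    """Split a word into (initial n characters, rest)."""
    if n <= 0 or len(word) < n:
        return None
    return word[:n], word[n:]


if __name__ == "__main__":
    # Lemmas taken from the lists in this section.
    for lemma in ["vendar", "ampak", "zato"]:
        print(lemma, split_final(lemma, 5))
    for form in ["oziroma", "kot"]:
        print(form, split_initial(form, 2))
```

A word that is exactly n characters long, such as ampak for n = 5, yields an empty "rest" part, which is why the corresponding cell in the lists is blank.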
1.2.171 List of initial character-level 1-grams from conjunction lower-case word forms in the Gigafida 2.0 corpus with text-type distribution
GF2.0-word_parts-conjunctions-lowercase_forms-initial-1grams-taxonomy-entire.tsv

Lower-case word form Initial part of the word Rest of the word Total absolute frequency of lower-case word form Percentage of all found lower-case word forms Total relative frequency (per million occurrences) Absolute frequency [SSJ.T.K.N] Percentage [SSJ.T.K.N] Relative frequency [SSJ.T.K.N] Absolute frequency [SSJ.T.K.L] Percentage [SSJ.T.K.L] Relative frequency [SSJ.T.K.L] Absolute frequency [SSJ.T.P.C] Percentage [SSJ.T.P.C] Relative frequency [SSJ.T.P.C] Absolute frequency [SSJ.T.P.R] Percentage [SSJ.T.P.R] Relative frequency [SSJ.T.P.R] Absolute frequency [SSJ.T.K.S] Percentage [SSJ.T.K.S] Relative frequency [SSJ.T.K.S] Absolute frequency [SSJ.T.D] Percentage [SSJ.T.D] Relative frequency [SSJ.T.D] Absolute frequency [SSJ.I] Percentage [SSJ.I] Relative frequency [SSJ.I]
in i n 29,117,711 29.55 % 25,661.29 619 55.66 % 63,768.41 1,187,174 30.09 % 29,891.76 13,762,326 29.85 % 25,358 5,256,746 29.89 % 28,048.27 1,338,463 33.09 % 31,174.39 115,835 29.37 % 29,281.84 7,456,548 28.18 % 23,452.78
da d a 14,810,352 15.03 % 13,052.29 124 11.15 % 12,774.29 752,394 19.07 % 18,944.47 6,760,076 14.66 % 12,455.89 2,457,240 13.97 % 13,111.03 526,301 13.01 % 12,258.18 53,649 13.60 % 13,561.89 4,260,568 16.10 % 13,400.60
ki k i 12,623,562 12.81 % 11,125.08 55 4.95 % 5,666.01 338,126 8.57 % 8,513.65 5,915,959 12.83 % 10,900.55 2,202,562 12.52 % 11,752.15 516,575 12.77 % 12,031.65 55,386 14.04 % 14,000.98 3,594,899 13.58 % 11,306.89
pa p a 11,003,780 11.17 % 9,697.58 37 3.33 % 3,811.68 319,351 8.09 % 8,040.91 5,422,876 11.76 % 9,992.01 1,898,081 10.79 % 10,127.54 320,699 7.93 % 7,469.46 25,847 6.55 % 6,533.84 3,016,889 11.40 % 9,488.90
kot k ot 5,694,030 5.78 % 5,018.12 23 2.07 % 2,369.42 210,691 5.34 % 5,304.97 2,600,526 5.64 % 4,791.64 973,450 5.54 % 5,194.01
231,781 5.73 % 5,398.46 16,409 4.16 % 4,148.02 1,661,150 6.28 % 5,224.75 ko k o 3,180,102 3.23 % 2,802.61 36 3.24 % 3,708.66 225,473 5.71 % 5,677.17 1,386,450 3.01 % 2,554.63 573,281 3.26 % 3,058.84 123,662 3.06 % 2,880.24 9,198 2.33 % 2,325.16 862,002 3.26 % 2,711.22 ali a li 2,999,080 3.04 % 2,643.07 42 3.78 % 4,326.77 115,783 2.94 % 2,915.29 1,285,899 2.79 % 2,369.35 682,175 3.88 % 3,639.86 232,818 5.76 % 5,422.61 34,601 8.77 % 8,746.76 647,762 2.45 % 2,037.38 če č e 2,441,017 2.48 % 2,151.26 27 2.43 % 2,781.50 127,590 3.23 % 3,212.58 1,064,667 2.31 % 1,961.72 547,603 3.11 % 2,921.83 138,698 3.43 % 3,230.44 23,279 5.90 % 5,884.68 539,153 2.04 % 1,695.78 saj s aj 1,922,955 1.95 % 1,694.69 0 0 % 0 52,415 1.33 % 1,319.75 956,035 2.07 % 1,761.56 353,479 2.01 % 1,886.05 45,680 1.13 % 1,063.94 1,897 0.48 % 479.54 513,449 1.94 % 1,614.93 ter t er 1,668,988 1.69 % 1,470.87 100 8.99 % 10,301.84 23,537 0.60 % 592.64 765,500 1.66 % 1,410.48 282,898 1.61 % 1,509.45 66,550 1.65 % 1,550.03 8,191 2.08 % 2,070.60 522,212 1.97 % 1,642.49 a a 1,586,387 1.61 % 1,398.07 2 0.18 % 206.04 65,928 1.67 % 1,660 611,753 1.33 % 1,127.20 263,856 1.50 % 1,407.85 35,869 0.89 % 835.43 2,997 0.76 % 757.61 605,982 2.29 % 1,905.97 ker k er 1,508,421 1.53 % 1,329.36 2 0.18 % 206.04 71,438 1.81 % 1,798.73 727,004 1.58 % 1,339.55 293,861 1.67 % 1,567.95 57,937 1.43 % 1,349.42 5,756 1.46 % 1,455.05 352,423 1.33 % 1,108.46 zato z ato 1,302,554 1.32 % 1,147.93 0 0 % 0 34,925 0.89 % 879.37 637,853 1.38 % 1,175.29 249,506 1.42 % 1,331.28 51,526 1.27 % 1,200.10 3,264 0.83 % 825.10 325,480 1.23 % 1,023.72 kjer k jer 1,147,657 1.17 % 1,011.42 0 0 % 0 30,047 0.76 % 756.55 561,556 1.22 % 1,034.70 188,431 1.07 % 1,005.41 36,017 0.89 % 838.88 2,761 0.70 % 697.95 328,845 1.24 % 1,034.30 vendar v endar 997,194 1.01 % 878.82 0 0 % 0 53,106 1.35 % 1,337.15 515,237 1.12 % 949.36 192,538 1.09 % 1,027.32 48,139 1.19 % 1,121.21 3,173 0.81 % 802.10 185,001 0.70 % 581.88 namreč n amreč 902,134 0.92 % 795.05 0 0 % 0 9,366 
0.24 % 235.83 452,514 0.98 % 833.79 148,837 0.85 % 794.15 17,827 0.44 % 415.21 1,164 0.29 % 294.25 272,426 1.03 % 856.85 čeprav č eprav 679,637 0.69 % 598.96 0 0 % 0 28,778 0.73 % 724.60 333,503 0.72 % 614.50 130,569 0.74 % 696.67 24,686 0.61 % 574.97 1,251 0.32 % 316.24 160,850 0.61 % 505.92 oziroma o ziroma 648,207 0.66 % 571.26 2 0.18 % 206.04 4,180 0.11 % 105.25 332,593 0.72 % 612.82 100,468 0.57 % 536.06 23,365 0.58 % 544.20 10,581 2.68 % 2,674.76 177,018 0.67 % 556.77 toda t oda 588,678 0.60 % 518.80 0 0 % 0 48,129 1.22 % 1,211.84 303,559 0.66 % 559.33 106,097 0.60 % 566.10 26,653 0.66 % 620.78 1,443 0.37 % 364.77 102,797 0.39 % 323.32 ampak a mpak 572,085 0.58 % 504.18 0 0 % 0 58,951 1.49 % 1,484.32 227,969 0.49 % 420.05 115,041 0.65 % 613.82 18,836 0.47 % 438.71 2,969 0.75 % 750.53 148,319 0.56 % 466.50 tako t ako 410,279 0.42 % 361.58 12 1.08 % 1,236.22 11,498 0.29 % 289.51 193,920 0.42 % 357.31 77,933 0.44 % 415.82 17,681 0.44 % 411.81 1,533 0.39 % 387.53 107,702 0.41 % 338.75 naj n aj 389,738 0.40 % 343.47 0 0 % 0 25,019 0.63 % 629.95 187,176 0.41 % 344.88 57,022 0.32 % 304.25 13,726 0.34 % 319.69 1,027 0.26 % 259.61 105,768 0.40 % 332.67 kakor k akor 272,376 0.28 % 240.04 0 0 % 0 40,148 1.02 % 1,010.88 123,199 0.27 % 227 51,656 0.29 % 275.62 24,063 0.59 % 560.46 1,664 0.42 % 420.64 31,646 0.12 % 99.53 sicer s icer 263,692 0.27 % 232.39 0 0 % 0 2,680 0.07 % 67.48 130,258 0.28 % 240.01 36,602 0.21 % 195.30 6,798 0.17 % 158.33 847 0.21 % 214.11 86,507 0.33 % 272.09 temveč t emveč 235,490 0.24 % 207.54 0 0 % 0 7,332 0.19 % 184.61 110,231 0.24 % 203.11 43,698 0.25 % 233.16 16,034 0.40 % 373.45 373 0.10 % 94.29 57,822 0.22 % 181.87 torej t orej 196,891 0.20 % 173.52 0 0 % 0 3,308 0.08 % 83.29 96,806 0.21 % 178.37 39,105 0.22 % 208.65 9,010 0.22 % 209.85 593 0.15 % 149.90 48,069 0.18 % 151.19 kajti k ajti 179,605 0.18 % 158.28 0 0 % 0 11,065 0.28 % 278.60 106,508 0.23 % 196.25 32,173 0.18 % 171.66 7,547 0.19 % 175.78 1,131 0.29 % 285.90 21,181 0.08 % 66.62 
vendarle v endarle 165,832 0.17 % 146.15 0 0 % 0 4,087 0.10 % 102.91 91,249 0.20 % 168.13 23,643 0.13 % 126.15 3,756 0.09 % 87.48 544 0.14 % 137.52 42,553 0.16 % 133.84
preden p reden 148,498 0.15 % 130.87 8 0.72 % 824.15 20,759 0.53 % 522.69 53,160 0.12 % 97.95 32,176 0.18 % 171.68 8,842 0.22 % 205.94 629 0.16 % 159 32,924 0.12 % 103.55
dokler d okler 147,332 0.15 % 129.84 19 1.71 % 1,957.35 16,427 0.42 % 413.61 59,151 0.13 % 108.99 30,010 0.17 % 160.12 9,366 0.23 % 218.15 1,089 0.28 % 275.29 31,270 0.12 % 98.35
kadar k adar 133,016 0.14 % 117.23 1 0.09 % 103.02 13,669 0.35 % 344.17 44,635 0.10 % 82.24 34,877 0.20 % 186.09 16,388 0.41 % 381.70 1,820 0.46 % 460.08 21,626 0.08 % 68.02
kolikor k olikor 105,767 0.11 % 93.21 1 0.09 % 103.02 5,383 0.14 % 135.54 50,897 0.11 % 93.78 18,240 0.10 % 97.32 4,410 0.11 % 102.71 830 0.21 % 209.82 26,006 0.10 % 81.80

1.2.172 List of initial character-level 2-grams from conjunction lower-case word forms in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-word_parts-conjunctions-lowercase_forms-initial-2grams-taxonomy-entire.tsv

Lower-case word form Initial part of the word Rest of the word Total absolute frequency of lower-case word form Percentage of all found lower-case word forms Total relative frequency (per million occurrences) Absolute frequency [SSJ.T.K.N] Percentage [SSJ.T.K.N] Relative frequency [SSJ.T.K.N] Absolute frequency [SSJ.T.K.L] Percentage [SSJ.T.K.L] Relative frequency [SSJ.T.K.L] Absolute frequency [SSJ.T.P.C] Percentage [SSJ.T.P.C] Relative frequency [SSJ.T.P.C] Absolute frequency [SSJ.T.P.R] Percentage [SSJ.T.P.R] Relative frequency [SSJ.T.P.R] Absolute frequency [SSJ.T.K.S] Percentage [SSJ.T.K.S] Relative frequency [SSJ.T.K.S] Absolute frequency [SSJ.T.D] Percentage [SSJ.T.D] Relative frequency [SSJ.T.D] Absolute frequency [SSJ.I] Percentage [SSJ.I] Relative frequency [SSJ.I]
in in 29,117,711 30.03 %
25,661.29 619 55.77 % 63,768.41 1,187,174 30.60 % 29,891.76 13,762,326 30.25 % 25,358 5,256,746 30.34 % 28,048.27 1,338,463 33.39 % 31,174.39 115,835 29.60 % 29,281.84 7,456,548 28.84 % 23,452.78 da da 14,810,352 15.28 % 13,052.29 124 11.17 % 12,774.29 752,394 19.39 % 18,944.47 6,760,076 14.86 % 12,455.89 2,457,240 14.18 % 13,111.03 526,301 13.13 % 12,258.18 53,649 13.71 % 13,561.89 4,260,568 16.48 % 13,400.60 ki ki 12,623,562 13.02 % 11,125.08 55 4.96 % 5,666.01 338,126 8.72 % 8,513.65 5,915,959 13.00 % 10,900.55 2,202,562 12.71 % 11,752.15 516,575 12.89 % 12,031.65 55,386 14.15 % 14,000.98 3,594,899 13.90 % 11,306.89 pa pa 11,003,780 11.35 % 9,697.58 37 3.33 % 3,811.68 319,351 8.23 % 8,040.91 5,422,876 11.92 % 9,992.01 1,898,081 10.96 % 10,127.54 320,699 8.00 % 7,469.46 25,847 6.60 % 6,533.84 3,016,889 11.67 % 9,488.90 kot ko t 5,694,030 5.87 % 5,018.12 23 2.07 % 2,369.42 210,691 5.43 % 5,304.97 2,600,526 5.72 % 4,791.64 973,450 5.62 % 5,194.01 231,781 5.78 % 5,398.46 16,409 4.19 % 4,148.02 1,661,150 6.42 % 5,224.75 ko ko 3,180,102 3.28 % 2,802.61 36 3.24 % 3,708.66 225,473 5.81 % 5,677.17 1,386,450 3.05 % 2,554.63 573,281 3.31 % 3,058.84 123,662 3.08 % 2,880.24 9,198 2.35 % 2,325.16 862,002 3.33 % 2,711.22 ali al i 2,999,080 3.09 % 2,643.07 42 3.78 % 4,326.77 115,783 2.98 % 2,915.29 1,285,899 2.83 % 2,369.35 682,175 3.94 % 3,639.86 232,818 5.81 % 5,422.61 34,601 8.84 % 8,746.76 647,762 2.50 % 2,037.38 če če 2,441,017 2.52 % 2,151.26 27 2.43 % 2,781.50 127,590 3.29 % 3,212.58 1,064,667 2.34 % 1,961.72 547,603 3.16 % 2,921.83 138,698 3.46 % 3,230.44 23,279 5.95 % 5,884.68 539,153 2.08 % 1,695.78 saj sa j 1,922,955 1.98 % 1,694.69 0 0 % 0 52,415 1.35 % 1,319.75 956,035 2.10 % 1,761.56 353,479 2.04 % 1,886.05 45,680 1.14 % 1,063.94 1,897 0.48 % 479.54 513,449 1.99 % 1,614.93 ter te r 1,668,988 1.72 % 1,470.87 100 9.01 % 10,301.84 23,537 0.61 % 592.64 765,500 1.68 % 1,410.48 282,898 1.63 % 1,509.45 66,550 1.66 % 1,550.03 8,191 2.09 % 2,070.60 522,212 2.02 % 1,642.49 
ker ke r 1,508,421 1.56 % 1,329.36 2 0.18 % 206.04 71,438 1.84 % 1,798.73 727,004 1.60 % 1,339.55 293,861 1.70 % 1,567.95 57,937 1.45 % 1,349.42 5,756 1.47 % 1,455.05 352,423 1.36 % 1,108.46 zato za to 1,302,554 1.34 % 1,147.93 0 0 % 0 34,925 0.90 % 879.37 637,853 1.40 % 1,175.29 249,506 1.44 % 1,331.28 51,526 1.28 % 1,200.10 3,264 0.83 % 825.10 325,480 1.26 % 1,023.72 kjer kj er 1,147,657 1.18 % 1,011.42 0 0 % 0 30,047 0.77 % 756.55 561,556 1.23 % 1,034.70 188,431 1.09 % 1,005.41 36,017 0.90 % 838.88 2,761 0.70 % 697.95 328,845 1.27 % 1,034.30 vendar ve ndar 997,194 1.03 % 878.82 0 0 % 0 53,106 1.37 % 1,337.15 515,237 1.13 % 949.36 192,538 1.11 % 1,027.32 48,139 1.20 % 1,121.21 3,173 0.81 % 802.10 185,001 0.71 % 581.88 namreč na mreč 902,134 0.93 % 795.05 0 0 % 0 9,366 0.24 % 235.83 452,514 0.99 % 833.79 148,837 0.86 % 794.15 17,827 0.45 % 415.21 1,164 0.30 % 294.25 272,426 1.05 % 856.85 čeprav če prav 679,637 0.70 % 598.96 0 0 % 0 28,778 0.74 % 724.60 333,503 0.73 % 614.50 130,569 0.75 % 696.67 24,686 0.62 % 574.97 1,251 0.32 % 316.24 160,850 0.62 % 505.92 oziroma oz iroma 648,207 0.67 % 571.26 2 0.18 % 206.04 4,180 0.11 % 105.25 332,593 0.73 % 612.82 100,468 0.58 % 536.06 23,365 0.58 % 544.20 10,581 2.70 % 2,674.76 177,018 0.69 % 556.77 toda to da 588,678 0.61 % 518.80 0 0 % 0 48,129 1.24 % 1,211.84 303,559 0.67 % 559.33 106,097 0.61 % 566.10 26,653 0.67 % 620.78 1,443 0.37 % 364.77 102,797 0.40 % 323.32 ampak am pak 572,085 0.59 % 504.18 0 0 % 0 58,951 1.52 % 1,484.32 227,969 0.50 % 420.05 115,041 0.66 % 613.82 18,836 0.47 % 438.71 2,969 0.76 % 750.53 148,319 0.57 % 466.50 tako ta ko 410,279 0.42 % 361.58 12 1.08 % 1,236.22 11,498 0.30 % 289.51 193,920 0.43 % 357.31 77,933 0.45 % 415.82 17,681 0.44 % 411.81 1,533 0.39 % 387.53 107,702 0.42 % 338.75 naj na j 389,738 0.40 % 343.47 0 0 % 0 25,019 0.65 % 629.95 187,176 0.41 % 344.88 57,022 0.33 % 304.25 13,726 0.34 % 319.69 1,027 0.26 % 259.61 105,768 0.41 % 332.67 kakor ka kor 272,376 0.28 % 240.04 0 0 % 0 40,148 
1.03 % 1,010.88 123,199 0.27 % 227 51,656 0.30 % 275.62 24,063 0.60 % 560.46 1,664 0.42 % 420.64 31,646 0.12 % 99.53
sicer si cer 263,692 0.27 % 232.39 0 0 % 0 2,680 0.07 % 67.48 130,258 0.29 % 240.01 36,602 0.21 % 195.30 6,798 0.17 % 158.33 847 0.22 % 214.11 86,507 0.34 % 272.09
temveč te mveč 235,490 0.24 % 207.54 0 0 % 0 7,332 0.19 % 184.61 110,231 0.24 % 203.11 43,698 0.25 % 233.16 16,034 0.40 % 373.45 373 0.10 % 94.29 57,822 0.22 % 181.87
torej to rej 196,891 0.20 % 173.52 0 0 % 0 3,308 0.09 % 83.29 96,806 0.21 % 178.37 39,105 0.23 % 208.65 9,010 0.23 % 209.85 593 0.15 % 149.90 48,069 0.19 % 151.19
kajti ka jti 179,605 0.18 % 158.28 0 0 % 0 11,065 0.28 % 278.60 106,508 0.23 % 196.25 32,173 0.19 % 171.66 7,547 0.19 % 175.78 1,131 0.29 % 285.90 21,181 0.08 % 66.62
vendarle ve ndarle 165,832 0.17 % 146.15 0 0 % 0 4,087 0.10 % 102.91 91,249 0.20 % 168.13 23,643 0.14 % 126.15 3,756 0.09 % 87.48 544 0.14 % 137.52 42,553 0.17 % 133.84
preden pr eden 148,498 0.15 % 130.87 8 0.72 % 824.15 20,759 0.54 % 522.69 53,160 0.12 % 97.95 32,176 0.19 % 171.68 8,842 0.22 % 205.94 629 0.16 % 159 32,924 0.13 % 103.55
dokler do kler 147,332 0.15 % 129.84 19 1.71 % 1,957.35 16,427 0.42 % 413.61 59,151 0.13 % 108.99 30,010 0.17 % 160.12 9,366 0.23 % 218.15 1,089 0.28 % 275.29 31,270 0.12 % 98.35
kadar ka dar 133,016 0.14 % 117.23 1 0.09 % 103.02 13,669 0.35 % 344.17 44,635 0.10 % 82.24 34,877 0.20 % 186.09 16,388 0.41 % 381.70 1,820 0.47 % 460.08 21,626 0.08 % 68.02
kolikor ko likor 105,767 0.11 % 93.21 1 0.09 % 103.02 5,383 0.14 % 135.54 50,897 0.11 % 93.78 18,240 0.10 % 97.32 4,410 0.11 % 102.71 830 0.21 % 209.82 26,006 0.10 % 81.80
kamor ka mor 78,225 0.08 % 68.94 0 0 % 0 3,973 0.10 % 100.04 37,516 0.08 % 69.13 14,813 0.09 % 79.04 2,953 0.07 % 68.78 176 0.04 % 44.49 18,794 0.07 % 59.11

File at CLARIN.SI
1.2.173 List of initial character-level 3-grams from conjunction lower-case word forms in the
Gigafida 2.0 corpus with text-type distribution
GF2.0-word_parts-conjunctions-lowercase_forms-initial-3grams-taxonomy-entire.tsv

Lower-case word form Initial part of the word Rest of the word Total absolute frequency of lower-case word form Percentage of all found lower-case word forms Total relative frequency (per million occurrences) Absolute frequency [SSJ.T.K.N] Percentage [SSJ.T.K.N] Relative frequency [SSJ.T.K.N] Absolute frequency [SSJ.T.K.L] Percentage [SSJ.T.K.L] Relative frequency [SSJ.T.K.L] Absolute frequency [SSJ.T.P.C] Percentage [SSJ.T.P.C] Relative frequency [SSJ.T.P.C] Absolute frequency [SSJ.T.P.R] Percentage [SSJ.T.P.R] Relative frequency [SSJ.T.P.R] Absolute frequency [SSJ.T.K.S] Percentage [SSJ.T.K.S] Relative frequency [SSJ.T.K.S] Absolute frequency [SSJ.T.D] Percentage [SSJ.T.D] Relative frequency [SSJ.T.D] Absolute frequency [SSJ.I] Percentage [SSJ.I] Relative frequency [SSJ.I]
kot kot 5,694,030 23.98 % 5,018.12 23 10.85 % 2,369.42 210,691 22.73 % 5,304.97 2,600,526 23.29 % 4,791.64 973,450 22.21 % 5,194.01 231,781 22.22 % 5,398.46 16,409 15.19 % 4,148.02 1,661,150 27.14 % 5,224.75
ali ali 2,999,080 12.63 % 2,643.07 42 19.81 % 4,326.77 115,783 12.49 % 2,915.29 1,285,899 11.52 % 2,369.35 682,175 15.57 % 3,639.86 232,818 22.32 % 5,422.61 34,601 32.03 % 8,746.76 647,762 10.58 % 2,037.38
saj saj 1,922,955 8.10 % 1,694.69 0 0 % 0 52,415 5.66 % 1,319.75 956,035 8.56 % 1,761.56 353,479 8.07 % 1,886.05 45,680 4.38 % 1,063.94 1,897 1.76 % 479.54 513,449 8.39 % 1,614.93
ter ter 1,668,988 7.03 % 1,470.87 100 47.17 % 10,301.84 23,537 2.54 % 592.64 765,500 6.86 % 1,410.48 282,898 6.46 % 1,509.45 66,550 6.38 % 1,550.03 8,191 7.58 % 2,070.60 522,212 8.53 % 1,642.49
ker ker 1,508,421 6.35 % 1,329.36 2 0.94 % 206.04 71,438 7.71 % 1,798.73 727,004 6.51 % 1,339.55 293,861 6.71 % 1,567.95 57,937 5.55 % 1,349.42 5,756 5.33 % 1,455.05 352,423 5.76 % 1,108.46
zato zat o 1,302,554 5.49 % 1,147.93 0 0 % 0 34,925 3.77 % 879.37 637,853 5.71 % 1,175.29 249,506 5.69 %
1,331.28 51,526 4.94 % 1,200.10 3,264 3.02 % 825.10 325,480 5.32 % 1,023.72 kjer kje r 1,147,657 4.83 % 1,011.42 0 0 % 0 30,047 3.24 % 756.55 561,556 5.03 % 1,034.70 188,431 4.30 % 1,005.41 36,017 3.45 % 838.88 2,761 2.56 % 697.95 328,845 5.37 % 1,034.30 vendar ven dar 997,194 4.20 % 878.82 0 0 % 0 53,106 5.73 % 1,337.15 515,237 4.62 % 949.36 192,538 4.39 % 1,027.32 48,139 4.62 % 1,121.21 3,173 2.94 % 802.10 185,001 3.02 % 581.88 namreč nam reč 902,134 3.80 % 795.05 0 0 % 0 9,366 1.01 % 235.83 452,514 4.05 % 833.79 148,837 3.40 % 794.15 17,827 1.71 % 415.21 1,164 1.08 % 294.25 272,426 4.45 % 856.85 čeprav čep rav 679,637 2.86 % 598.96 0 0 % 0 28,778 3.10 % 724.60 333,503 2.99 % 614.50 130,569 2.98 % 696.67 24,686 2.37 % 574.97 1,251 1.16 % 316.24 160,850 2.63 % 505.92 oziroma ozi roma 648,207 2.73 % 571.26 2 0.94 % 206.04 4,180 0.45 % 105.25 332,593 2.98 % 612.82 100,468 2.29 % 536.06 23,365 2.24 % 544.20 10,581 9.79 % 2,674.76 177,018 2.89 % 556.77 toda tod a 588,678 2.48 % 518.80 0 0 % 0 48,129 5.19 % 1,211.84 303,559 2.72 % 559.33 106,097 2.42 % 566.10 26,653 2.56 % 620.78 1,443 1.34 % 364.77 102,797 1.68 % 323.32 ampak amp ak 572,085 2.41 % 504.18 0 0 % 0 58,951 6.36 % 1,484.32 227,969 2.04 % 420.05 115,041 2.62 % 613.82 18,836 1.81 % 438.71 2,969 2.75 % 750.53 148,319 2.42 % 466.50 tako tak o 410,279 1.73 % 361.58 12 5.66 % 1,236.22 11,498 1.24 % 289.51 193,920 1.74 % 357.31 77,933 1.78 % 415.82 17,681 1.70 % 411.81 1,533 1.42 % 387.53 107,702 1.76 % 338.75 naj naj 389,738 1.64 % 343.47 0 0 % 0 25,019 2.70 % 629.95 187,176 1.68 % 344.88 57,022 1.30 % 304.25 13,726 1.32 % 319.69 1,027 0.95 % 259.61 105,768 1.73 % 332.67 kakor kak or 272,376 1.15 % 240.04 0 0 % 0 40,148 4.33 % 1,010.88 123,199 1.10 % 227 51,656 1.18 % 275.62 24,063 2.31 % 560.46 1,664 1.54 % 420.64 31,646 0.52 % 99.53 sicer sic er 263,692 1.11 % 232.39 0 0 % 0 2,680 0.29 % 67.48 130,258 1.17 % 240.01 36,602 0.83 % 195.30 6,798 0.65 % 158.33 847 0.78 % 214.11 86,507 1.41 % 272.09 temveč tem več 
235,490 0.99 % 207.54 0 0 % 0 7,332 0.79 % 184.61 110,231 0.99 % 203.11 43,698 1.00 % 233.16 16,034 1.54 % 373.45 373 0.34 % 94.29 57,822 0.94 % 181.87 torej tor ej 196,891 0.83 % 173.52 0 0 % 0 3,308 0.36 % 83.29 96,806 0.87 % 178.37 39,105 0.89 % 208.65 9,010 0.86 % 209.85 593 0.55 % 149.90 48,069 0.79 % 151.19 kajti kaj ti 179,605 0.76 % 158.28 0 0 % 0 11,065 1.19 % 278.60 106,508 0.95 % 196.25 32,173 0.73 % 171.66 7,547 0.72 % 175.78 1,131 1.05 % 285.90 21,181 0.35 % 66.62 vendarle ven darle 165,832 0.70 % 146.15 0 0 % 0 4,087 0.44 % 102.91 91,249 0.82 % 168.13 23,643 0.54 % 126.15 3,756 0.36 % 87.48 544 0.50 % 137.52 42,553 0.69 % 133.84 preden pre den 148,498 0.62 % 130.87 8 3.77 % 824.15 20,759 2.24 % 522.69 53,160 0.48 % 97.95 32,176 0.73 % 171.68 8,842 0.85 % 205.94 629 0.58 % 159 32,924 0.54 % 103.55 dokler dok ler 147,332 0.62 % 129.84 19 8.96 % 1,957.35 16,427 1.77 % 413.61 59,151 0.53 % 108.99 30,010 0.69 % 160.12 9,366 0.90 % 218.15 1,089 1.01 % 275.29 31,270 0.51 % 98.35 kadar kad ar 133,016 0.56 % 117.23 1 0.47 % 103.02 13,669 1.48 % 344.17 44,635 0.40 % 82.24 34,877 0.80 % 186.09 16,388 1.57 % 381.70 1,820 1.69 % 460.08 21,626 0.35 % 68.02 kolikor kol ikor 105,767 0.45 % 93.21 1 0.47 % 103.02 5,383 0.58 % 135.54 50,897 0.46 % 93.78 18,240 0.42 % 97.32 4,410 0.42 % 102.71 830 0.77 % 209.82 26,006 0.42 % 81.80 kamor kam or 78,225 0.33 % 68.94 0 0 % 0 3,973 0.43 % 100.04 37,516 0.34 % 69.13 14,813 0.34 % 79.04 2,953 0.28 % 68.78 176 0.16 % 44.49 18,794 0.31 % 59.11 bodisi bod isi 63,373 0.27 % 55.85 2 0.94 % 206.04 1,759 0.19 % 44.29 27,313 0.24 % 50.33 12,796 0.29 % 68.28 5,707 0.55 % 132.92 413 0.38 % 104.40 15,383 0.25 % 48.38 odkar odk ar 63,190 0.27 % 55.69 0 0 % 0 4,813 0.52 % 121.19 28,914 0.26 % 53.28 10,306 0.23 % 54.99 920 0.09 % 21.43 106 0.10 % 26.80 18,131 0.30 % 57.03 četudi čet udi 62,349 0.26 % 54.95 0 0 % 0 2,741 0.30 % 69.02 28,006 0.25 % 51.60 11,127 0.25 % 59.37 3,231 0.31 % 75.25 135 0.12 % 34.13 17,109 0.28 % 53.81 niti nit i 
56,291 0.24 % 49.61 0 0 % 0 3,040 0.33 % 76.54 27,731 0.25 % 51.10 10,226 0.23 % 54.56 2,181 0.21 % 50.80 222 0.21 % 56.12 12,891 0.21 % 40.55
razen raz en 50,366 0.21 % 44.39 0 0 % 0 2,897 0.31 % 72.94 22,525 0.20 % 41.50 10,083 0.23 % 53.80 3,015 0.29 % 70.22 1,122 1.04 % 283.63 10,724 0.17 % 33.73
koder kod er 33,663 0.14 % 29.67 0 0 % 0 1,770 0.19 % 44.57 17,062 0.15 % 31.44 5,230 0.12 % 27.91 1,410 0.14 % 32.84 49 0.04 % 12.39 8,142 0.13 % 25.61

1.2.174 List of initial character-level 4-grams from conjunction lower-case word forms in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-word_parts-conjunctions-lowercase_forms-initial-4grams-taxonomy-entire.tsv

Lower-case word form Initial part of the word Rest of the word Total absolute frequency of lower-case word form Percentage of all found lower-case word forms Total relative frequency (per million occurrences) Absolute frequency [SSJ.T.K.N] Percentage [SSJ.T.K.N] Relative frequency [SSJ.T.K.N] Absolute frequency [SSJ.T.K.L] Percentage [SSJ.T.K.L] Relative frequency [SSJ.T.K.L] Absolute frequency [SSJ.T.P.C] Percentage [SSJ.T.P.C] Relative frequency [SSJ.T.P.C] Absolute frequency [SSJ.T.P.R] Percentage [SSJ.T.P.R] Relative frequency [SSJ.T.P.R] Absolute frequency [SSJ.T.K.S] Percentage [SSJ.T.K.S] Relative frequency [SSJ.T.K.S] Absolute frequency [SSJ.T.D] Percentage [SSJ.T.D] Relative frequency [SSJ.T.D] Absolute frequency [SSJ.I] Percentage [SSJ.I] Relative frequency [SSJ.I]
zato zato 1,302,554 13.63 % 1,147.93 0 0 % 0 34,925 8.17 % 879.37 637,853 13.75 % 1,175.29 249,506 14.36 % 1,331.28 51,526 13.08 % 1,200.10 3,264 8.14 % 825.10 325,480 14.05 % 1,023.72
kjer kjer 1,147,657 12.01 % 1,011.42 0 0 % 0 30,047 7.03 % 756.55 561,556 12.10 % 1,034.70 188,431 10.85 % 1,005.41 36,017 9.15 % 838.88 2,761 6.88 % 697.95 328,845 14.20 % 1,034.30
vendar vend ar 997,194 10.44 % 878.82 0 0 % 0 53,106 12.43 %
1,337.15 515,237 11.11 % 949.36 192,538 11.08 % 1,027.32 48,139 12.22 % 1,121.21 3,173 7.91 % 802.10 185,001 7.99 % 581.88
namreč namr eč 902,134 9.44 % 795.05 0 0 % 0 9,366 2.19 % 235.83 452,514 9.75 % 833.79 148,837 8.57 % 794.15 17,827 4.53 % 415.21 1,164 2.90 % 294.25 272,426 11.76 % 856.85
čeprav čepr av 679,637 7.11 % 598.96 0 0 % 0 28,778 6.73 % 724.60 333,503 7.19 % 614.50 130,569 7.52 % 696.67 24,686 6.27 % 574.97 1,251 3.12 % 316.24 160,850 6.94 % 505.92
oziroma ozir oma 648,207 6.78 % 571.26 2 4.44 % 206.04 4,180 0.98 % 105.25 332,593 7.17 % 612.82 100,468 5.78 % 536.06 23,365 5.93 % 544.20 10,581 26.38 % 2,674.76 177,018 7.64 % 556.77
toda toda 588,678 6.16 % 518.80 0 0 % 0 48,129 11.26 % 1,211.84 303,559 6.54 % 559.33 106,097 6.11 % 566.10 26,653 6.77 % 620.78 1,443 3.60 % 364.77 102,797 4.44 % 323.32
ampak ampa k 572,085 5.99 % 504.18 0 0 % 0 58,951 13.79 % 1,484.32 227,969 4.91 % 420.05 115,041 6.62 % 613.82 18,836 4.78 % 438.71 2,969 7.40 % 750.53 148,319 6.40 % 466.50
tako tako 410,279 4.29 % 361.58 12 26.67 % 1,236.22 11,498 2.69 % 289.51 193,920 4.18 % 357.31 77,933 4.49 % 415.82 17,681 4.49 % 411.81 1,533 3.82 % 387.53 107,702 4.65 % 338.75
kakor kako r 272,376 2.85 % 240.04 0 0 % 0 40,148 9.39 % 1,010.88 123,199 2.65 % 227 51,656 2.97 % 275.62 24,063 6.11 % 560.46 1,664 4.15 % 420.64 31,646 1.37 % 99.53
sicer sice r 263,692 2.76 % 232.39 0 0 % 0 2,680 0.63 % 67.48 130,258 2.81 % 240.01 36,602 2.11 % 195.30 6,798 1.73 % 158.33 847 2.11 % 214.11 86,507 3.73 % 272.09
temveč temv eč 235,490 2.46 % 207.54 0 0 % 0 7,332 1.72 % 184.61 110,231 2.38 % 203.11 43,698 2.52 % 233.16 16,034 4.07 % 373.45 373 0.93 % 94.29 57,822 2.50 % 181.87
torej tore j 196,891 2.06 % 173.52 0 0 % 0 3,308 0.77 % 83.29 96,806 2.09 % 178.37 39,105 2.25 % 208.65 9,010 2.29 % 209.85 593 1.48 % 149.90 48,069 2.08 % 151.19
kajti kajt i 179,605 1.88 % 158.28 0 0 % 0 11,065 2.59 % 278.60 106,508 2.30 % 196.25 32,173 1.85 % 171.66 7,547 1.92 % 175.78 1,131 2.82 % 285.90 21,181 0.91 % 66.62
vendarle vend arle 165,832 1.74 % 146.15 0 0 % 0 4,087 0.96 % 102.91 91,249 1.97 % 168.13 23,643 1.36 % 126.15 3,756 0.95 % 87.48 544 1.36 % 137.52 42,553 1.84 % 133.84
preden pred en 148,498 1.55 % 130.87 8 17.78 % 824.15 20,759 4.86 % 522.69 53,160 1.15 % 97.95 32,176 1.85 % 171.68 8,842 2.25 % 205.94 629 1.57 % 159 32,924 1.42 % 103.55
dokler dokl er 147,332 1.54 % 129.84 19 42.22 % 1,957.35 16,427 3.84 % 413.61 59,151 1.27 % 108.99 30,010 1.73 % 160.12 9,366 2.38 % 218.15 1,089 2.71 % 275.29 31,270 1.35 % 98.35
kadar kada r 133,016 1.39 % 117.23 1 2.22 % 103.02 13,669 3.20 % 344.17 44,635 0.96 % 82.24 34,877 2.01 % 186.09 16,388 4.16 % 381.70 1,820 4.54 % 460.08 21,626 0.93 % 68.02
kolikor koli kor 105,767 1.11 % 93.21 1 2.22 % 103.02 5,383 1.26 % 135.54 50,897 1.10 % 93.78 18,240 1.05 % 97.32 4,410 1.12 % 102.71 830 2.07 % 209.82 26,006 1.12 % 81.80
kamor kamo r 78,225 0.82 % 68.94 0 0 % 0 3,973 0.93 % 100.04 37,516 0.81 % 69.13 14,813 0.85 % 79.04 2,953 0.75 % 68.78 176 0.44 % 44.49 18,794 0.81 % 59.11
bodisi bodi si 63,373 0.66 % 55.85 2 4.44 % 206.04 1,759 0.41 % 44.29 27,313 0.59 % 50.33 12,796 0.74 % 68.28 5,707 1.45 % 132.92 413 1.03 % 104.40 15,383 0.66 % 48.38
odkar odka r 63,190 0.66 % 55.69 0 0 % 0 4,813 1.13 % 121.19 28,914 0.62 % 53.28 10,306 0.59 % 54.99 920 0.23 % 21.43 106 0.26 % 26.80 18,131 0.78 % 57.03
četudi četu di 62,349 0.65 % 54.95 0 0 % 0 2,741 0.64 % 69.02 28,006 0.60 % 51.60 11,127 0.64 % 59.37 3,231 0.82 % 75.25 135 0.34 % 34.13 17,109 0.74 % 53.81
niti niti 56,291 0.59 % 49.61 0 0 % 0 3,040 0.71 % 76.54 27,731 0.60 % 51.10 10,226 0.59 % 54.56 2,181 0.55 % 50.80 222 0.55 % 56.12 12,891 0.56 % 40.55
razen raze n 50,366 0.53 % 44.39 0 0 % 0 2,897 0.68 % 72.94 22,525 0.49 % 41.50 10,083 0.58 % 53.80 3,015 0.77 % 70.22 1,122 2.80 % 283.63 10,724 0.46 % 33.73
koder kode r 33,663 0.35 % 29.67 0 0 % 0 1,770 0.41 % 44.57 17,062 0.37 % 31.44 5,230 0.30 % 27.91 1,410 0.36 % 32.84 49 0.12 % 12.39 8,142 0.35 % 25.61
marveč marv eč 21,385 0.22 % 18.85 0 0 % 0 595 0.14 % 14.98 12,386 0.27 % 22.82 4,535 0.26 % 24.20 1,702 0.43 % 39.64 47 0.12 % 11.88 2,120 0.09 % 6.67
čeravno čera vno 9,604 0.10 % 8.46 0 0 % 0 600 0.14 % 15.11 5,123 0.11 % 9.44 1,322 0.08 % 7.05 357 0.09 % 8.31 21 0.05 % 5.31 2,181 0.09 % 6.86
zatorej zato rej 4,880 0.05 % 4.30 0 0 % 0 317 0.07 % 7.98 2,133 0.05 % 3.93 1,440 0.08 % 7.68 344 0.09 % 8.01 22 0.06 % 5.56 624 0.03 % 1.96
najsi najs i 4,664 0.05 % 4.11 0 0 % 0 343 0.08 % 8.64 1,902 0.04 % 3.50 1,216 0.07 % 6.49 395 0.10 % 9.20 66 0.17 % 16.68 742 0.03 % 2.33
predno pred no 1,817 0.02 % 1.60 0 0 % 0 143 0.03 % 3.60 944 0.02 % 1.74 274 0.02 % 1.46 223 0.06 % 5.19 42 0.10 % 10.62 191 0.01 % 0.60
odkoder odko der 1,597 0.02 % 1.41 0 0 % 0 109 0.03 % 2.74 878 0.02 % 1.62 300 0.02 % 1.60 96 0.02 % 2.24 7 0.02 % 1.77 207 0.01 % 0.65

1.2.175 List of initial character-level 5-grams from conjunction lower-case word forms in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-word_parts-conjunctions-lowercase_forms-initial-5grams-taxonomy-entire.tsv
Lower-case word form; Initial part of the word; Rest of the word; Total absolute frequency of lower-case word form; Percentage of all found lower-case word forms; Total relative frequency (per million occurrences); Absolute frequency [SSJ.T.K.N]; Percentage [SSJ.T.K.N]; Relative frequency [SSJ.T.K.N]; Absolute frequency [SSJ.T.K.L]; Percentage [SSJ.T.K.L]; Relative frequency [SSJ.T.K.L]; Absolute frequency [SSJ.T.P.C]; Percentage [SSJ.T.P.C]; Relative frequency [SSJ.T.P.C]; Absolute frequency [SSJ.T.P.R]; Percentage [SSJ.T.P.R]; Relative frequency [SSJ.T.P.R]; Absolute frequency [SSJ.T.K.S]; Percentage [SSJ.T.K.S]; Relative frequency [SSJ.T.K.S]; Absolute frequency [SSJ.T.D]; Percentage [SSJ.T.D]; Relative frequency [SSJ.T.D]; Absolute frequency [SSJ.I]; Percentage [SSJ.I]; Relative frequency [SSJ.I]
vendar venda r 997,194 16.49 % 878.82 0 0 % 0 53,106 17.73 % 1,337.15 515,237 17.68 % 949.36 192,538 17.43 % 1,027.32 48,139 18.55 % 1,121.21 3,173 10.28 % 802.10 185,001 12.86 % 581.88
namreč namre č 902,134 14.92 % 795.05 0 0 % 0 9,366 3.13 % 235.83 452,514 15.53 % 833.79 148,837 13.47 % 794.15 17,827 6.87 % 415.21 1,164 3.77 % 294.25 272,426 18.94 % 856.85
čeprav čepra v 679,637 11.24 % 598.96 0 0 % 0 28,778 9.61 % 724.60 333,503 11.44 % 614.50 130,569 11.82 % 696.67 24,686 9.51 % 574.97 1,251 4.05 % 316.24 160,850 11.18 % 505.92
oziroma oziro ma 648,207 10.72 % 571.26 2 6.06 % 206.04 4,180 1.40 % 105.25 332,593 11.41 % 612.82 100,468 9.09 % 536.06 23,365 9.00 % 544.20 10,581 34.27 % 2,674.76 177,018 12.31 % 556.77
ampak ampak 572,085 9.46 % 504.18 0 0 % 0 58,951 19.68 % 1,484.32 227,969 7.82 % 420.05 115,041 10.41 % 613.82 18,836 7.26 % 438.71 2,969 9.62 % 750.53 148,319 10.31 % 466.50
kakor kakor 272,376 4.50 % 240.04 0 0 % 0 40,148 13.40 % 1,010.88 123,199 4.23 % 227 51,656 4.67 % 275.62 24,063 9.27 % 560.46 1,664 5.39 % 420.64 31,646 2.20 % 99.53
sicer sicer 263,692 4.36 % 232.39 0 0 % 0 2,680 0.90 % 67.48 130,258 4.47 % 240.01 36,602 3.31 % 195.30 6,798 2.62 % 158.33 847 2.74 % 214.11 86,507 6.01 % 272.09
temveč temve č 235,490 3.89 % 207.54 0 0 % 0 7,332 2.45 % 184.61 110,231 3.78 % 203.11 43,698 3.96 % 233.16 16,034 6.18 % 373.45 373 1.21 % 94.29 57,822 4.02 % 181.87
torej torej 196,891 3.26 % 173.52 0 0 % 0 3,308 1.10 % 83.29 96,806 3.32 % 178.37 39,105 3.54 % 208.65 9,010 3.47 % 209.85 593 1.92 % 149.90 48,069 3.34 % 151.19
kajti kajti 179,605 2.97 % 158.28 0 0 % 0 11,065 3.69 % 278.60 106,508 3.65 % 196.25 32,173 2.91 % 171.66 7,547 2.91 % 175.78 1,131 3.66 % 285.90 21,181 1.47 % 66.62
vendarle venda rle 165,832 2.74 % 146.15 0 0 % 0 4,087 1.36 % 102.91 91,249 3.13 % 168.13 23,643 2.14 % 126.15 3,756 1.45 % 87.48 544 1.76 % 137.52 42,553 2.96 % 133.84
preden prede n 148,498 2.46 % 130.87 8 24.24 % 824.15 20,759 6.93 % 522.69 53,160 1.82 % 97.95 32,176 2.91 % 171.68 8,842 3.41 % 205.94 629 2.04 % 159 32,924 2.29 % 103.55
dokler dokle r 147,332 2.44 % 129.84 19 57.58 % 1,957.35 16,427 5.48 % 413.61 59,151 2.03 % 108.99 30,010 2.72 % 160.12 9,366 3.61 % 218.15 1,089 3.53 % 275.29 31,270 2.17 % 98.35
kadar kadar 133,016 2.20 % 117.23 1 3.03 % 103.02 13,669 4.56 % 344.17 44,635 1.53 % 82.24 34,877 3.16 % 186.09 16,388 6.31 % 381.70 1,820 5.89 % 460.08 21,626 1.50 % 68.02
kolikor kolik or 105,767 1.75 % 93.21 1 3.03 % 103.02 5,383 1.80 % 135.54 50,897 1.75 % 93.78 18,240 1.65 % 97.32 4,410 1.70 % 102.71 830 2.69 % 209.82 26,006 1.81 % 81.80
kamor kamor 78,225 1.29 % 68.94 0 0 % 0 3,973 1.33 % 100.04 37,516 1.29 % 69.13 14,813 1.34 % 79.04 2,953 1.14 % 68.78 176 0.57 % 44.49 18,794 1.31 % 59.11
bodisi bodis i 63,373 1.05 % 55.85 2 6.06 % 206.04 1,759 0.59 % 44.29 27,313 0.94 % 50.33 12,796 1.16 % 68.28 5,707 2.20 % 132.92 413 1.34 % 104.40 15,383 1.07 % 48.38
odkar odkar 63,190 1.04 % 55.69 0 0 % 0 4,813 1.61 % 121.19 28,914 0.99 % 53.28 10,306 0.93 % 54.99 920 0.35 % 21.43 106 0.34 % 26.80 18,131 1.26 % 57.03
četudi četud i 62,349 1.03 % 54.95 0 0 % 0 2,741 0.92 % 69.02 28,006 0.96 % 51.60 11,127 1.01 % 59.37 3,231 1.25 % 75.25 135 0.44 % 34.13 17,109 1.19 % 53.81
razen razen 50,366 0.83 % 44.39 0 0 % 0 2,897 0.97 % 72.94 22,525 0.77 % 41.50 10,083 0.91 % 53.80 3,015 1.16 % 70.22 1,122 3.63 % 283.63 10,724 0.75 % 33.73
koder koder 33,663 0.56 % 29.67 0 0 % 0 1,770 0.59 % 44.57 17,062 0.58 % 31.44 5,230 0.47 % 27.91 1,410 0.54 % 32.84 49 0.16 % 12.39 8,142 0.57 % 25.61
marveč marve č 21,385 0.35 % 18.85 0 0 % 0 595 0.20 % 14.98 12,386 0.42 % 22.82 4,535 0.41 % 24.20 1,702 0.66 % 39.64 47 0.15 % 11.88 2,120 0.15 % 6.67
čeravno čerav no 9,604 0.16 % 8.46 0 0 % 0 600 0.20 % 15.11 5,123 0.18 % 9.44 1,322 0.12 % 7.05 357 0.14 % 8.31 21 0.07 % 5.31 2,181 0.15 % 6.86
zatorej zator ej 4,880 0.08 % 4.30 0 0 % 0 317 0.11 % 7.98 2,133 0.07 % 3.93 1,440 0.13 % 7.68 344 0.13 % 8.01 22 0.07 % 5.56 624 0.04 % 1.96
najsi najsi 4,664 0.08 % 4.11 0 0 % 0 343 0.11 % 8.64 1,902 0.07 % 3.50 1,216 0.11 % 6.49 395 0.15 % 9.20 66 0.21 % 16.68 742 0.05 % 2.33
predno predn o 1,817 0.03 % 1.60 0 0 % 0 143 0.05 % 3.60 944 0.03 % 1.74 274 0.03 % 1.46 223 0.09 % 5.19 42 0.14 % 10.62 191 0.01 % 0.60
odkoder odkod er 1,597 0.03 % 1.41 0 0 % 0 109 0.04 % 2.74 878 0.03 % 1.62 300 0.03 % 1.60 96 0.04 % 2.24 7 0.02 % 1.77 207 0.01 % 0.65
dasiravno dasir avno 1,435 0.02 % 1.26 0 0 % 0 58 0.02 % 1.46 637 0.02 % 1.17 487 0.04 % 2.60 51 0.02 % 1.19 1 0 % 0.25 201 0.01 % 0.63
dočim dočim 855 0.01 % 0.75 0 0 % 0 15 0.01 % 0.38 73 0 % 0.13 692 0.06 % 3.69 36 0.01 % 0.84 1 0 % 0.25 38 0 % 0.12
super super 633 0.01 % 0.56 0 0 % 0 32 0.01 % 0.81 197 0.01 % 0.36 143 0.01 % 0.76 6 0 % 0.14 4 0.01 % 1.01 251 0.02 % 0.79
akoravno akora vno 207 0 % 0.18 0 0 % 0 56 0.02 % 1.41 44 0 % 0.08 71 0.01 % 0.38 16 0.01 % 0.37 0 0 % 0 20 0 % 0.06
poceni pocen i 179 0 % 0.16 0 0 % 0 4 0 % 0.10 130 0 % 0.24 23 0 % 0.12 5 0 % 0.12 0 0 % 0 17 0 % 0.05

1.2.176 List of final character-level 1-grams from conjunction lower-case word forms in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-word_parts-conjunctions-lowercase_forms-final-1grams-taxonomy-entire.tsv
Lower-case word form; Rest of the word; Final part of the word; Total absolute frequency of lower-case word form; Percentage of all found lower-case word forms; Total relative frequency (per million occurrences); Absolute frequency [SSJ.T.K.N]; Percentage [SSJ.T.K.N]; Relative frequency [SSJ.T.K.N]; Absolute frequency [SSJ.T.K.L]; Percentage [SSJ.T.K.L]; Relative frequency [SSJ.T.K.L]; Absolute frequency [SSJ.T.P.C]; Percentage [SSJ.T.P.C]; Relative frequency [SSJ.T.P.C]; Absolute frequency [SSJ.T.P.R]; Percentage [SSJ.T.P.R]; Relative frequency [SSJ.T.P.R]; Absolute frequency [SSJ.T.K.S]; Percentage [SSJ.T.K.S]; Relative frequency [SSJ.T.K.S]; Absolute frequency [SSJ.T.D]; Percentage [SSJ.T.D]; Relative frequency [SSJ.T.D]; Absolute frequency [SSJ.I]; Percentage [SSJ.I]; Relative frequency [SSJ.I]
in i n 29,117,711 29.55 % 25,661.29 619 55.66 % 63,768.41 1,187,174 30.09 % 29,891.76 13,762,326 29.85 % 25,358 5,256,746 29.89 % 28,048.27 1,338,463 33.09 % 31,174.39 115,835 29.37 % 29,281.84 7,456,548 28.18 % 23,452.78
da d a 14,810,352 15.03 % 13,052.29 124 11.15 % 12,774.29 752,394 19.07 % 18,944.47 6,760,076 14.66 % 12,455.89 2,457,240 13.97 % 13,111.03 526,301 13.01 % 12,258.18 53,649 13.60 % 13,561.89 4,260,568 16.10 % 13,400.60
ki k i 12,623,562 12.81 % 11,125.08 55 4.95 % 5,666.01 338,126 8.57 % 8,513.65 5,915,959 12.83 % 10,900.55 2,202,562 12.52 % 11,752.15 516,575 12.77 % 12,031.65 55,386 14.04 % 14,000.98 3,594,899 13.58 % 11,306.89
pa p a 11,003,780 11.17 % 9,697.58 37 3.33 % 3,811.68 319,351 8.09 % 8,040.91 5,422,876 11.76 % 9,992.01 1,898,081 10.79 % 10,127.54 320,699 7.93 % 7,469.46 25,847 6.55 % 6,533.84 3,016,889 11.40 % 9,488.90
kot ko t 5,694,030 5.78 % 5,018.12 23 2.07 % 2,369.42 210,691 5.34 % 5,304.97 2,600,526 5.64 % 4,791.64 973,450 5.54 % 5,194.01 231,781 5.73 % 5,398.46 16,409 4.16 % 4,148.02 1,661,150 6.28 % 5,224.75
ko k o 3,180,102 3.23 % 2,802.61 36 3.24 % 3,708.66 225,473 5.71 % 5,677.17 1,386,450 3.01 % 2,554.63 573,281 3.26 % 3,058.84 123,662 3.06 % 2,880.24 9,198 2.33 % 2,325.16 862,002 3.26 % 2,711.22
ali al i 2,999,080 3.04 % 2,643.07 42 3.78 % 4,326.77 115,783 2.94 % 2,915.29 1,285,899 2.79 % 2,369.35 682,175 3.88 % 3,639.86 232,818 5.76 % 5,422.61 34,601 8.77 % 8,746.76 647,762 2.45 % 2,037.38
če č e 2,441,017 2.48 % 2,151.26 27 2.43 % 2,781.50 127,590 3.23 % 3,212.58 1,064,667 2.31 % 1,961.72 547,603 3.11 % 2,921.83 138,698 3.43 % 3,230.44 23,279 5.90 % 5,884.68 539,153 2.04 % 1,695.78
saj sa j 1,922,955 1.95 % 1,694.69 0 0 % 0 52,415 1.33 % 1,319.75 956,035 2.07 % 1,761.56 353,479 2.01 % 1,886.05 45,680 1.13 % 1,063.94 1,897 0.48 % 479.54 513,449 1.94 % 1,614.93
ter te r 1,668,988 1.69 % 1,470.87 100 8.99 % 10,301.84 23,537 0.60 % 592.64 765,500 1.66 % 1,410.48 282,898 1.61 % 1,509.45 66,550 1.65 % 1,550.03 8,191 2.08 % 2,070.60 522,212 1.97 % 1,642.49
a a 1,586,387 1.61 % 1,398.07 2 0.18 % 206.04 65,928 1.67 % 1,660 611,753 1.33 % 1,127.20 263,856 1.50 % 1,407.85 35,869 0.89 % 835.43 2,997 0.76 % 757.61 605,982 2.29 % 1,905.97
ker ke r 1,508,421 1.53 % 1,329.36 2 0.18 % 206.04 71,438 1.81 % 1,798.73 727,004 1.58 % 1,339.55 293,861 1.67 % 1,567.95 57,937 1.43 % 1,349.42 5,756 1.46 % 1,455.05 352,423 1.33 % 1,108.46
zato zat o 1,302,554 1.32 % 1,147.93 0 0 % 0 34,925 0.89 % 879.37 637,853 1.38 % 1,175.29 249,506 1.42 % 1,331.28 51,526 1.27 % 1,200.10 3,264 0.83 % 825.10 325,480 1.23 % 1,023.72
kjer kje r 1,147,657 1.17 % 1,011.42 0 0 % 0 30,047 0.76 % 756.55 561,556 1.22 % 1,034.70 188,431 1.07 % 1,005.41 36,017 0.89 % 838.88 2,761 0.70 % 697.95 328,845 1.24 % 1,034.30
vendar venda r 997,194 1.01 % 878.82 0 0 % 0 53,106 1.35 % 1,337.15 515,237 1.12 % 949.36 192,538 1.09 % 1,027.32 48,139 1.19 % 1,121.21 3,173 0.81 % 802.10 185,001 0.70 % 581.88
namreč namre č 902,134 0.92 % 795.05 0 0 % 0 9,366 0.24 % 235.83 452,514 0.98 % 833.79 148,837 0.85 % 794.15 17,827 0.44 % 415.21 1,164 0.29 % 294.25 272,426 1.03 % 856.85
čeprav čepra v 679,637 0.69 % 598.96 0 0 % 0 28,778 0.73 % 724.60 333,503 0.72 % 614.50 130,569 0.74 % 696.67 24,686 0.61 % 574.97 1,251 0.32 % 316.24 160,850 0.61 % 505.92
oziroma ozirom a 648,207 0.66 % 571.26 2 0.18 % 206.04 4,180 0.11 % 105.25 332,593 0.72 % 612.82 100,468 0.57 % 536.06 23,365 0.58 % 544.20 10,581 2.68 % 2,674.76 177,018 0.67 % 556.77
toda tod a 588,678 0.60 % 518.80 0 0 % 0 48,129 1.22 % 1,211.84 303,559 0.66 % 559.33 106,097 0.60 % 566.10 26,653 0.66 % 620.78 1,443 0.37 % 364.77 102,797 0.39 % 323.32
ampak ampa k 572,085 0.58 % 504.18 0 0 % 0 58,951 1.49 % 1,484.32 227,969 0.49 % 420.05 115,041 0.65 % 613.82 18,836 0.47 % 438.71 2,969 0.75 % 750.53 148,319 0.56 % 466.50
tako tak o 410,279 0.42 % 361.58 12 1.08 % 1,236.22 11,498 0.29 % 289.51 193,920 0.42 % 357.31 77,933 0.44 % 415.82 17,681 0.44 % 411.81 1,533 0.39 % 387.53 107,702 0.41 % 338.75
naj na j 389,738 0.40 % 343.47 0 0 % 0 25,019 0.63 % 629.95 187,176 0.41 % 344.88 57,022 0.32 % 304.25 13,726 0.34 % 319.69 1,027 0.26 % 259.61 105,768 0.40 % 332.67
kakor kako r 272,376 0.28 % 240.04 0 0 % 0 40,148 1.02 % 1,010.88 123,199 0.27 % 227 51,656 0.29 % 275.62 24,063 0.59 % 560.46 1,664 0.42 % 420.64 31,646 0.12 % 99.53
sicer sice r 263,692 0.27 % 232.39 0 0 % 0 2,680 0.07 % 67.48 130,258 0.28 % 240.01 36,602 0.21 % 195.30 6,798 0.17 % 158.33 847 0.21 % 214.11 86,507 0.33 % 272.09
temveč temve č 235,490 0.24 % 207.54 0 0 % 0 7,332 0.19 % 184.61 110,231 0.24 % 203.11 43,698 0.25 % 233.16 16,034 0.40 % 373.45 373 0.10 % 94.29 57,822 0.22 % 181.87
torej tore j 196,891 0.20 % 173.52 0 0 % 0 3,308 0.08 % 83.29 96,806 0.21 % 178.37 39,105 0.22 % 208.65 9,010 0.22 % 209.85 593 0.15 % 149.90 48,069 0.18 % 151.19
kajti kajt i 179,605 0.18 % 158.28 0 0 % 0 11,065 0.28 % 278.60 106,508 0.23 % 196.25 32,173 0.18 % 171.66 7,547 0.19 % 175.78 1,131 0.29 % 285.90 21,181 0.08 % 66.62
vendarle vendarl e 165,832 0.17 % 146.15 0 0 % 0 4,087 0.10 % 102.91 91,249 0.20 % 168.13 23,643 0.13 % 126.15 3,756 0.09 % 87.48 544 0.14 % 137.52 42,553 0.16 % 133.84
preden prede n 148,498 0.15 % 130.87 8 0.72 % 824.15 20,759 0.53 % 522.69 53,160 0.12 % 97.95 32,176 0.18 % 171.68 8,842 0.22 % 205.94 629 0.16 % 159 32,924 0.12 % 103.55
dokler dokle r 147,332 0.15 % 129.84 19 1.71 % 1,957.35 16,427 0.42 % 413.61 59,151 0.13 % 108.99 30,010 0.17 % 160.12 9,366 0.23 % 218.15 1,089 0.28 % 275.29 31,270 0.12 % 98.35
kadar kada r 133,016 0.14 % 117.23 1 0.09 % 103.02 13,669 0.35 % 344.17 44,635 0.10 % 82.24 34,877 0.20 % 186.09 16,388 0.41 % 381.70 1,820 0.46 % 460.08 21,626 0.08 % 68.02
kolikor koliko r 105,767 0.11 % 93.21 1 0.09 % 103.02 5,383 0.14 % 135.54 50,897 0.11 % 93.78 18,240 0.10 % 97.32 4,410 0.11 % 102.71 830 0.21 % 209.82 26,006 0.10 % 81.80

1.2.177 List of final character-level 2-grams from conjunction lower-case word forms in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-word_parts-conjunctions-lowercase_forms-final-2grams-taxonomy-entire.tsv
Lower-case word form; Rest of the word; Final part of the word; Total absolute frequency of lower-case word form; Percentage of all found lower-case word forms; Total relative frequency (per million occurrences); Absolute frequency [SSJ.T.K.N]; Percentage [SSJ.T.K.N]; Relative frequency [SSJ.T.K.N]; Absolute frequency [SSJ.T.K.L]; Percentage [SSJ.T.K.L]; Relative frequency [SSJ.T.K.L]; Absolute frequency [SSJ.T.P.C]; Percentage [SSJ.T.P.C]; Relative frequency [SSJ.T.P.C]; Absolute frequency [SSJ.T.P.R]; Percentage [SSJ.T.P.R]; Relative frequency [SSJ.T.P.R]; Absolute frequency [SSJ.T.K.S]; Percentage [SSJ.T.K.S]; Relative frequency [SSJ.T.K.S]; Absolute frequency [SSJ.T.D]; Percentage [SSJ.T.D]; Relative frequency [SSJ.T.D]; Absolute frequency [SSJ.I]; Percentage [SSJ.I]; Relative frequency [SSJ.I]
in in 29,117,711 30.03 % 25,661.29 619 55.77 % 63,768.41 1,187,174 30.60 % 29,891.76 13,762,326 30.25 % 25,358 5,256,746 30.34 % 28,048.27 1,338,463 33.39 % 31,174.39 115,835 29.60 % 29,281.84 7,456,548 28.84 % 23,452.78
da da 14,810,352 15.28 % 13,052.29 124 11.17 % 12,774.29 752,394 19.39 % 18,944.47 6,760,076 14.86 % 12,455.89 2,457,240 14.18 % 13,111.03 526,301 13.13 % 12,258.18 53,649 13.71 % 13,561.89 4,260,568 16.48 % 13,400.60
ki ki 12,623,562 13.02 % 11,125.08 55 4.96 % 5,666.01 338,126 8.72 % 8,513.65 5,915,959 13.00 % 10,900.55 2,202,562 12.71 % 11,752.15 516,575 12.89 % 12,031.65 55,386 14.15 % 14,000.98 3,594,899 13.90 % 11,306.89
pa pa 11,003,780 11.35 % 9,697.58 37 3.33 % 3,811.68 319,351 8.23 % 8,040.91 5,422,876 11.92 % 9,992.01 1,898,081 10.96 % 10,127.54 320,699 8.00 % 7,469.46 25,847 6.60 % 6,533.84 3,016,889 11.67 % 9,488.90
kot k ot 5,694,030 5.87 % 5,018.12 23 2.07 % 2,369.42 210,691 5.43 % 5,304.97 2,600,526 5.72 % 4,791.64 973,450 5.62 % 5,194.01 231,781 5.78 % 5,398.46 16,409 4.19 % 4,148.02 1,661,150 6.42 % 5,224.75
ko ko 3,180,102 3.28 % 2,802.61 36 3.24 % 3,708.66 225,473 5.81 % 5,677.17 1,386,450 3.05 % 2,554.63 573,281 3.31 % 3,058.84 123,662 3.08 % 2,880.24 9,198 2.35 % 2,325.16 862,002 3.33 % 2,711.22
ali a li 2,999,080 3.09 % 2,643.07 42 3.78 % 4,326.77 115,783 2.98 % 2,915.29 1,285,899 2.83 % 2,369.35 682,175 3.94 % 3,639.86 232,818 5.81 % 5,422.61 34,601 8.84 % 8,746.76 647,762 2.50 % 2,037.38
če če 2,441,017 2.52 % 2,151.26 27 2.43 % 2,781.50 127,590 3.29 % 3,212.58 1,064,667 2.34 % 1,961.72 547,603 3.16 % 2,921.83 138,698 3.46 % 3,230.44 23,279 5.95 % 5,884.68 539,153 2.08 % 1,695.78
saj s aj 1,922,955 1.98 % 1,694.69 0 0 % 0 52,415 1.35 % 1,319.75 956,035 2.10 % 1,761.56 353,479 2.04 % 1,886.05 45,680 1.14 % 1,063.94 1,897 0.48 % 479.54 513,449 1.99 % 1,614.93
ter t er 1,668,988 1.72 % 1,470.87 100 9.01 % 10,301.84 23,537 0.61 % 592.64 765,500 1.68 % 1,410.48 282,898 1.63 % 1,509.45 66,550 1.66 % 1,550.03 8,191 2.09 % 2,070.60 522,212 2.02 % 1,642.49
ker k er 1,508,421 1.56 % 1,329.36 2 0.18 % 206.04 71,438 1.84 % 1,798.73 727,004 1.60 % 1,339.55 293,861 1.70 % 1,567.95 57,937 1.45 % 1,349.42 5,756 1.47 % 1,455.05 352,423 1.36 % 1,108.46
zato za to 1,302,554 1.34 % 1,147.93 0 0 % 0 34,925 0.90 % 879.37 637,853 1.40 % 1,175.29 249,506 1.44 % 1,331.28 51,526 1.28 % 1,200.10 3,264 0.83 % 825.10 325,480 1.26 % 1,023.72
kjer kj er 1,147,657 1.18 % 1,011.42 0 0 % 0 30,047 0.77 % 756.55 561,556 1.23 % 1,034.70 188,431 1.09 % 1,005.41 36,017 0.90 % 838.88 2,761 0.70 % 697.95 328,845 1.27 % 1,034.30
vendar vend ar 997,194 1.03 % 878.82 0 0 % 0 53,106 1.37 % 1,337.15 515,237 1.13 % 949.36 192,538 1.11 % 1,027.32 48,139 1.20 % 1,121.21 3,173 0.81 % 802.10 185,001 0.71 % 581.88
namreč namr eč 902,134 0.93 % 795.05 0 0 % 0 9,366 0.24 % 235.83 452,514 0.99 % 833.79 148,837 0.86 % 794.15 17,827 0.45 % 415.21 1,164 0.30 % 294.25 272,426 1.05 % 856.85
čeprav čepr av 679,637 0.70 % 598.96 0 0 % 0 28,778 0.74 % 724.60 333,503 0.73 % 614.50 130,569 0.75 % 696.67 24,686 0.62 % 574.97 1,251 0.32 % 316.24 160,850 0.62 % 505.92
oziroma oziro ma 648,207 0.67 % 571.26 2 0.18 % 206.04 4,180 0.11 % 105.25 332,593 0.73 % 612.82 100,468 0.58 % 536.06 23,365 0.58 % 544.20 10,581 2.70 % 2,674.76 177,018 0.69 % 556.77
toda to da 588,678 0.61 % 518.80 0 0 % 0 48,129 1.24 % 1,211.84 303,559 0.67 % 559.33 106,097 0.61 % 566.10 26,653 0.67 % 620.78 1,443 0.37 % 364.77 102,797 0.40 % 323.32
ampak amp ak 572,085 0.59 % 504.18 0 0 % 0 58,951 1.52 % 1,484.32 227,969 0.50 % 420.05 115,041 0.66 % 613.82 18,836 0.47 % 438.71 2,969 0.76 % 750.53 148,319 0.57 % 466.50
tako ta ko 410,279 0.42 % 361.58 12 1.08 % 1,236.22 11,498 0.30 % 289.51 193,920 0.43 % 357.31 77,933 0.45 % 415.82 17,681 0.44 % 411.81 1,533 0.39 % 387.53 107,702 0.42 % 338.75
naj n aj 389,738 0.40 % 343.47 0 0 % 0 25,019 0.65 % 629.95 187,176 0.41 % 344.88 57,022 0.33 % 304.25 13,726 0.34 % 319.69 1,027 0.26 % 259.61 105,768 0.41 % 332.67
kakor kak or 272,376 0.28 % 240.04 0 0 % 0 40,148 1.03 % 1,010.88 123,199 0.27 % 227 51,656 0.30 % 275.62 24,063 0.60 % 560.46 1,664 0.42 % 420.64 31,646 0.12 % 99.53
sicer sic er 263,692 0.27 % 232.39 0 0 % 0 2,680 0.07 % 67.48 130,258 0.29 % 240.01 36,602 0.21 % 195.30 6,798 0.17 % 158.33 847 0.22 % 214.11 86,507 0.34 % 272.09
temveč temv eč 235,490 0.24 % 207.54 0 0 % 0 7,332 0.19 % 184.61 110,231 0.24 % 203.11 43,698 0.25 % 233.16 16,034 0.40 % 373.45 373 0.10 % 94.29 57,822 0.22 % 181.87
torej tor ej 196,891 0.20 % 173.52 0 0 % 0 3,308 0.09 % 83.29 96,806 0.21 % 178.37 39,105 0.23 % 208.65 9,010 0.23 % 209.85 593 0.15 % 149.90 48,069 0.19 % 151.19
kajti kaj ti 179,605 0.18 % 158.28 0 0 % 0 11,065 0.28 % 278.60 106,508 0.23 % 196.25 32,173 0.19 % 171.66 7,547 0.19 % 175.78 1,131 0.29 % 285.90 21,181 0.08 % 66.62
vendarle vendar le 165,832 0.17 % 146.15 0 0 % 0 4,087 0.10 % 102.91 91,249 0.20 % 168.13 23,643 0.14 % 126.15 3,756 0.09 % 87.48 544 0.14 % 137.52 42,553 0.17 % 133.84
preden pred en 148,498 0.15 % 130.87 8 0.72 % 824.15 20,759 0.54 % 522.69 53,160 0.12 % 97.95 32,176 0.19 % 171.68 8,842 0.22 % 205.94 629 0.16 % 159 32,924 0.13 % 103.55
dokler dokl er 147,332 0.15 % 129.84 19 1.71 % 1,957.35 16,427 0.42 % 413.61 59,151 0.13 % 108.99 30,010 0.17 % 160.12 9,366 0.23 % 218.15 1,089 0.28 % 275.29 31,270 0.12 % 98.35
kadar kad ar 133,016 0.14 % 117.23 1 0.09 % 103.02 13,669 0.35 % 344.17 44,635 0.10 % 82.24 34,877 0.20 % 186.09 16,388 0.41 % 381.70 1,820 0.47 % 460.08 21,626 0.08 % 68.02
kolikor kolik or 105,767 0.11 % 93.21 1 0.09 % 103.02 5,383 0.14 % 135.54 50,897 0.11 % 93.78 18,240 0.10 % 97.32 4,410 0.11 % 102.71 830 0.21 % 209.82 26,006 0.10 % 81.80
kamor kam or 78,225 0.08 % 68.94 0 0 % 0 3,973 0.10 % 100.04 37,516 0.08 % 69.13 14,813 0.09 % 79.04 2,953 0.07 % 68.78 176 0.04 % 44.49 18,794 0.07 % 59.11

1.2.178 List of final character-level 3-grams from conjunction lower-case word forms in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-word_parts-conjunctions-lowercase_forms-final-3grams-taxonomy-entire.tsv
Lower-case word form; Rest of the word; Final part of the word; Total absolute frequency of lower-case word form; Percentage of all found lower-case word forms; Total relative frequency (per million occurrences); Absolute frequency [SSJ.T.K.N]; Percentage [SSJ.T.K.N]; Relative frequency [SSJ.T.K.N]; Absolute frequency [SSJ.T.K.L]; Percentage [SSJ.T.K.L]; Relative frequency [SSJ.T.K.L]; Absolute frequency [SSJ.T.P.C]; Percentage [SSJ.T.P.C]; Relative frequency [SSJ.T.P.C]; Absolute frequency [SSJ.T.P.R]; Percentage [SSJ.T.P.R]; Relative frequency [SSJ.T.P.R]; Absolute frequency [SSJ.T.K.S]; Percentage [SSJ.T.K.S]; Relative frequency [SSJ.T.K.S]; Absolute frequency [SSJ.T.D]; Percentage [SSJ.T.D]; Relative frequency [SSJ.T.D]; Absolute frequency [SSJ.I]; Percentage [SSJ.I]; Relative frequency [SSJ.I]
kot kot 5,694,030 23.98 % 5,018.12 23 10.85 % 2,369.42 210,691 22.73 % 5,304.97 2,600,526 23.29 % 4,791.64 973,450 22.21 % 5,194.01 231,781 22.22 % 5,398.46 16,409 15.19 % 4,148.02 1,661,150 27.14 % 5,224.75
ali ali 2,999,080 12.63 % 2,643.07 42 19.81 % 4,326.77 115,783 12.49 % 2,915.29 1,285,899 11.52 % 2,369.35 682,175 15.57 % 3,639.86 232,818 22.32 % 5,422.61 34,601 32.03 % 8,746.76 647,762 10.58 % 2,037.38
saj saj 1,922,955 8.10 % 1,694.69 0 0 % 0 52,415 5.66 % 1,319.75 956,035 8.56 % 1,761.56 353,479 8.07 % 1,886.05 45,680 4.38 % 1,063.94 1,897 1.76 % 479.54 513,449 8.39 % 1,614.93
ter ter 1,668,988 7.03 % 1,470.87 100 47.17 % 10,301.84 23,537 2.54 % 592.64 765,500 6.86 % 1,410.48 282,898 6.46 % 1,509.45 66,550 6.38 % 1,550.03 8,191 7.58 % 2,070.60 522,212 8.53 % 1,642.49
ker ker 1,508,421 6.35 % 1,329.36 2 0.94 % 206.04 71,438 7.71 % 1,798.73 727,004 6.51 % 1,339.55 293,861 6.71 % 1,567.95 57,937 5.55 % 1,349.42 5,756 5.33 % 1,455.05 352,423 5.76 % 1,108.46
zato z ato 1,302,554 5.49 % 1,147.93 0 0 % 0 34,925 3.77 % 879.37 637,853 5.71 % 1,175.29 249,506 5.69 % 1,331.28 51,526 4.94 % 1,200.10 3,264 3.02 % 825.10 325,480 5.32 % 1,023.72
kjer k jer 1,147,657 4.83 % 1,011.42 0 0 % 0 30,047 3.24 % 756.55 561,556 5.03 % 1,034.70 188,431 4.30 % 1,005.41 36,017 3.45 % 838.88 2,761 2.56 % 697.95 328,845 5.37 % 1,034.30
vendar ven dar 997,194 4.20 % 878.82 0 0 % 0 53,106 5.73 % 1,337.15 515,237 4.62 % 949.36 192,538 4.39 % 1,027.32 48,139 4.62 % 1,121.21 3,173 2.94 % 802.10 185,001 3.02 % 581.88
namreč nam reč 902,134 3.80 % 795.05 0 0 % 0 9,366 1.01 % 235.83 452,514 4.05 % 833.79 148,837 3.40 % 794.15 17,827 1.71 % 415.21 1,164 1.08 % 294.25 272,426 4.45 % 856.85
čeprav čep rav 679,637 2.86 % 598.96 0 0 % 0 28,778 3.10 % 724.60 333,503 2.99 % 614.50 130,569 2.98 % 696.67 24,686 2.37 % 574.97 1,251 1.16 % 316.24 160,850 2.63 % 505.92
oziroma ozir oma 648,207 2.73 % 571.26 2 0.94 % 206.04 4,180 0.45 % 105.25 332,593 2.98 % 612.82 100,468 2.29 % 536.06 23,365 2.24 % 544.20 10,581 9.79 % 2,674.76 177,018 2.89 % 556.77
toda t oda 588,678 2.48 % 518.80 0 0 % 0 48,129 5.19 % 1,211.84 303,559 2.72 % 559.33 106,097 2.42 % 566.10 26,653 2.56 % 620.78 1,443 1.34 % 364.77 102,797 1.68 % 323.32
ampak am pak 572,085 2.41 % 504.18 0 0 % 0 58,951 6.36 % 1,484.32 227,969 2.04 % 420.05 115,041 2.62 % 613.82 18,836 1.81 % 438.71 2,969 2.75 % 750.53 148,319 2.42 % 466.50
tako t ako 410,279 1.73 % 361.58 12 5.66 % 1,236.22 11,498 1.24 % 289.51 193,920 1.74 % 357.31 77,933 1.78 % 415.82 17,681 1.70 % 411.81 1,533 1.42 % 387.53 107,702 1.76 % 338.75
naj naj 389,738 1.64 % 343.47 0 0 % 0 25,019 2.70 % 629.95 187,176 1.68 % 344.88 57,022 1.30 % 304.25 13,726 1.32 % 319.69 1,027 0.95 % 259.61 105,768 1.73 % 332.67
kakor ka kor 272,376 1.15 % 240.04 0 0 % 0 40,148 4.33 % 1,010.88 123,199 1.10 % 227 51,656 1.18 % 275.62 24,063 2.31 % 560.46 1,664 1.54 % 420.64 31,646 0.52 % 99.53
sicer si cer 263,692 1.11 % 232.39 0 0 % 0 2,680 0.29 % 67.48 130,258 1.17 % 240.01 36,602 0.83 % 195.30 6,798 0.65 % 158.33 847 0.78 % 214.11 86,507 1.41 % 272.09
temveč tem več 235,490 0.99 % 207.54 0 0 % 0 7,332 0.79 % 184.61 110,231 0.99 % 203.11 43,698 1.00 % 233.16 16,034 1.54 % 373.45 373 0.34 % 94.29 57,822 0.94 % 181.87
torej to rej 196,891 0.83 % 173.52 0 0 % 0 3,308 0.36 % 83.29 96,806 0.87 % 178.37 39,105 0.89 % 208.65 9,010 0.86 % 209.85 593 0.55 % 149.90 48,069 0.79 % 151.19
kajti ka jti 179,605 0.76 % 158.28 0 0 % 0 11,065 1.19 % 278.60 106,508 0.95 % 196.25 32,173 0.73 % 171.66 7,547 0.72 % 175.78 1,131 1.05 % 285.90 21,181 0.35 % 66.62
vendarle venda rle 165,832 0.70 % 146.15 0 0 % 0 4,087 0.44 % 102.91 91,249 0.82 % 168.13 23,643 0.54 % 126.15 3,756 0.36 % 87.48 544 0.50 % 137.52 42,553 0.69 % 133.84
preden pre den 148,498 0.62 % 130.87 8 3.77 % 824.15 20,759 2.24 % 522.69 53,160 0.48 % 97.95 32,176 0.73 % 171.68 8,842 0.85 % 205.94 629 0.58 % 159 32,924 0.54 % 103.55
dokler dok ler 147,332 0.62 % 129.84 19 8.96 % 1,957.35 16,427 1.77 % 413.61 59,151 0.53 % 108.99 30,010 0.69 % 160.12 9,366 0.90 % 218.15 1,089 1.01 % 275.29 31,270 0.51 % 98.35
kadar ka dar 133,016 0.56 % 117.23 1 0.47 % 103.02 13,669 1.48 % 344.17 44,635 0.40 % 82.24 34,877 0.80 % 186.09 16,388 1.57 % 381.70 1,820 1.69 % 460.08 21,626 0.35 % 68.02
kolikor koli kor 105,767 0.45 % 93.21 1 0.47 % 103.02 5,383 0.58 % 135.54 50,897 0.46 % 93.78 18,240 0.42 % 97.32 4,410 0.42 % 102.71 830 0.77 % 209.82 26,006 0.42 % 81.80
kamor ka mor 78,225 0.33 % 68.94 0 0 % 0 3,973 0.43 % 100.04 37,516 0.34 % 69.13 14,813 0.34 % 79.04 2,953 0.28 % 68.78 176 0.16 % 44.49 18,794 0.31 % 59.11
bodisi bod isi 63,373 0.27 % 55.85 2 0.94 % 206.04 1,759 0.19 % 44.29 27,313 0.24 % 50.33 12,796 0.29 % 68.28 5,707 0.55 % 132.92 413 0.38 % 104.40 15,383 0.25 % 48.38
odkar od kar 63,190 0.27 % 55.69 0 0 % 0 4,813 0.52 % 121.19 28,914 0.26 % 53.28 10,306 0.23 % 54.99 920 0.09 % 21.43 106 0.10 % 26.80 18,131 0.30 % 57.03
četudi čet udi 62,349 0.26 % 54.95 0 0 % 0 2,741 0.30 % 69.02 28,006 0.25 % 51.60 11,127 0.25 % 59.37 3,231 0.31 % 75.25 135 0.12 % 34.13 17,109 0.28 % 53.81
niti n iti 56,291 0.24 % 49.61 0 0 % 0 3,040 0.33 % 76.54 27,731 0.25 % 51.10 10,226 0.23 % 54.56 2,181 0.21 % 50.80 222 0.21 % 56.12 12,891 0.21 % 40.55
razen ra zen 50,366 0.21 % 44.39 0 0 % 0 2,897 0.31 % 72.94 22,525 0.20 % 41.50 10,083 0.23 % 53.80 3,015 0.29 % 70.22 1,122 1.04 % 283.63 10,724 0.17 % 33.73
koder ko der 33,663 0.14 % 29.67 0 0 % 0 1,770 0.19 % 44.57 17,062 0.15 % 31.44 5,230 0.12 % 27.91 1,410 0.14 % 32.84 49 0.04 % 12.39 8,142 0.13 % 25.61

1.2.179 List of final character-level 4-grams from conjunction lower-case word forms in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-word_parts-conjunctions-lowercase_forms-final-4grams-taxonomy-entire.tsv
Lower-case word form; Rest of the word; Final part of the word; Total absolute frequency of lower-case word form; Percentage of all found lower-case word forms; Total relative frequency (per million occurrences); Absolute frequency [SSJ.T.K.N]; Percentage [SSJ.T.K.N]; Relative frequency [SSJ.T.K.N]; Absolute frequency [SSJ.T.K.L]; Percentage [SSJ.T.K.L]; Relative frequency [SSJ.T.K.L]; Absolute frequency [SSJ.T.P.C]; Percentage [SSJ.T.P.C]; Relative frequency [SSJ.T.P.C]; Absolute frequency [SSJ.T.P.R]; Percentage [SSJ.T.P.R]; Relative frequency [SSJ.T.P.R]; Absolute frequency [SSJ.T.K.S]; Percentage [SSJ.T.K.S]; Relative frequency [SSJ.T.K.S]; Absolute frequency [SSJ.T.D]; Percentage [SSJ.T.D]; Relative frequency [SSJ.T.D]; Absolute frequency [SSJ.I]; Percentage [SSJ.I]; Relative frequency [SSJ.I]
zato zato 1,302,554 13.63 % 1,147.93 0 0 % 0 34,925 8.17 % 879.37 637,853 13.75 % 1,175.29 249,506 14.36 % 1,331.28 51,526 13.08 % 1,200.10 3,264 8.14 % 825.10 325,480 14.05 % 1,023.72
kjer kjer 1,147,657 12.01 % 1,011.42 0 0 % 0 30,047 7.03 % 756.55 561,556 12.10 % 1,034.70 188,431 10.85 % 1,005.41 36,017 9.15 % 838.88 2,761 6.88 % 697.95 328,845 14.20 % 1,034.30
vendar ve ndar 997,194 10.44 % 878.82 0 0 % 0 53,106 12.43 % 1,337.15 515,237 11.11 % 949.36 192,538 11.08 % 1,027.32 48,139 12.22 % 1,121.21 3,173 7.91 % 802.10 185,001 7.99 % 581.88
namreč na mreč 902,134 9.44 % 795.05 0 0 % 0 9,366 2.19 % 235.83 452,514 9.75 % 833.79 148,837 8.57 % 794.15 17,827 4.53 % 415.21 1,164 2.90 % 294.25 272,426 11.76 % 856.85
čeprav če prav 679,637 7.11 % 598.96 0 0 % 0 28,778 6.73 % 724.60 333,503 7.19 % 614.50 130,569 7.52 % 696.67 24,686 6.27 % 574.97 1,251 3.12 % 316.24 160,850 6.94 % 505.92
oziroma ozi roma 648,207 6.78 % 571.26 2 4.44 % 206.04 4,180 0.98 % 105.25 332,593 7.17 % 612.82 100,468 5.78 % 536.06 23,365 5.93 % 544.20 10,581 26.38 % 2,674.76 177,018 7.64 % 556.77
toda toda 588,678 6.16 % 518.80 0 0 % 0 48,129 11.26 % 1,211.84 303,559 6.54 % 559.33 106,097 6.11 % 566.10 26,653 6.77 % 620.78 1,443 3.60 % 364.77 102,797 4.44 % 323.32
ampak a mpak 572,085 5.99 % 504.18 0 0 % 0 58,951 13.79 % 1,484.32 227,969 4.91 % 420.05 115,041 6.62 % 613.82 18,836 4.78 % 438.71 2,969 7.40 % 750.53 148,319 6.40 % 466.50
tako tako 410,279 4.29 % 361.58 12 26.67 % 1,236.22 11,498 2.69 % 289.51 193,920 4.18 % 357.31 77,933 4.49 % 415.82 17,681 4.49 % 411.81 1,533 3.82 % 387.53 107,702 4.65 % 338.75 kakor k akor 272,376 2.85 % 240.04 0 0 % 0 40,148 9.39 % 1,010.88 123,199 2.65 % 227 51,656 2.97 % 275.62 24,063 6.11 % 560.46 1,664 4.15 % 420.64 31,646 1.37 % 99.53 sicer s icer 263,692 2.76 % 232.39 0 0 % 0 2,680 0.63 % 67.48 130,258 2.81 % 240.01 36,602 2.11 % 195.30 6,798 1.73 % 158.33 847 2.11 % 214.11 86,507 3.73 % 272.09 temveč te mveč 235,490 2.46 % 207.54 0 0 % 0 7,332 1.72 % 184.61 110,231 2.38 % 203.11 43,698 2.52 % 233.16 16,034 4.07 % 373.45 373 0.93 % 94.29 57,822 2.50 % 181.87 torej t orej 196,891 2.06 % 173.52 0 0 % 0 3,308 0.77 % 83.29 96,806 2.09 % 178.37 39,105 2.25 % 208.65 9,010 2.29 % 209.85 593 1.48 % 149.90 48,069 2.08 % 151.19 kajti k ajti 179,605 1.88 % 158.28 0 0 % 0 11,065 2.59 % 278.60 106,508 2.30 % 196.25 32,173 1.85 % 171.66 7,547 1.92 % 175.78 1,131 2.82 % 285.90 21,181 0.91 % 66.62 vendarle vend arle 165,832 1.74 % 146.15 0 0 % 0 4,087 0.96 % 102.91 91,249 1.97 % 168.13 23,643 1.36 % 126.15 3,756 0.95 % 87.48 544 1.36 % 137.52 42,553 1.84 % 133.84 preden pr eden 148,498 1.55 % 130.87 8 17.78 % 824.15 20,759 4.86 % 522.69 53,160 1.15 % 97.95 32,176 1.85 % 171.68 8,842 2.25 % 205.94 629 1.57 % 159 32,924 1.42 % 103.55 dokler do kler 147,332 1.54 % 129.84 19 42.22 % 1,957.35 16,427 3.84 % 413.61 59,151 1.27 % 108.99 30,010 1.73 % 160.12 9,366 2.38 % 218.15 1,089 2.71 % 275.29 31,270 1.35 % 98.35 kadar k adar 133,016 1.39 % 117.23 1 2.22 % 103.02 13,669 3.20 % 344.17 44,635 0.96 % 82.24 34,877 2.01 % 186.09 16,388 4.16 % 381.70 1,820 4.54 % 460.08 21,626 0.93 % 68.02 kolikor kol ikor 105,767 1.11 % 93.21 1 2.22 % 103.02 5,383 1.26 % 135.54 50,897 1.10 % 93.78 18,240 1.05 % 97.32 4,410 1.12 % 102.71 830 2.07 % 209.82 26,006 1.12 % 81.80 kamor k amor 78,225 0.82 % 68.94 0 0 % 0 3,973 0.93 % 100.04 37,516 0.81 % 69.13 14,813 0.85 % 79.04 2,953 0.75 % 68.78 176 
0.44 % 44.49 18,794 0.81 % 59.11
bodisi bo disi 63,373 0.66 % 55.85 2 4.44 % 206.04 1,759 0.41 % 44.29 27,313 0.59 % 50.33 12,796 0.74 % 68.28 5,707 1.45 % 132.92 413 1.03 % 104.40 15,383 0.66 % 48.38
odkar o dkar 63,190 0.66 % 55.69 0 0 % 0 4,813 1.13 % 121.19 28,914 0.62 % 53.28 10,306 0.59 % 54.99 920 0.23 % 21.43 106 0.26 % 26.80 18,131 0.78 % 57.03
četudi če tudi 62,349 0.65 % 54.95 0 0 % 0 2,741 0.64 % 69.02 28,006 0.60 % 51.60 11,127 0.64 % 59.37 3,231 0.82 % 75.25 135 0.34 % 34.13 17,109 0.74 % 53.81
niti niti 56,291 0.59 % 49.61 0 0 % 0 3,040 0.71 % 76.54 27,731 0.60 % 51.10 10,226 0.59 % 54.56 2,181 0.55 % 50.80 222 0.55 % 56.12 12,891 0.56 % 40.55
razen r azen 50,366 0.53 % 44.39 0 0 % 0 2,897 0.68 % 72.94 22,525 0.49 % 41.50 10,083 0.58 % 53.80 3,015 0.77 % 70.22 1,122 2.80 % 283.63 10,724 0.46 % 33.73
koder k oder 33,663 0.35 % 29.67 0 0 % 0 1,770 0.41 % 44.57 17,062 0.37 % 31.44 5,230 0.30 % 27.91 1,410 0.36 % 32.84 49 0.12 % 12.39 8,142 0.35 % 25.61
marveč ma rveč 21,385 0.22 % 18.85 0 0 % 0 595 0.14 % 14.98 12,386 0.27 % 22.82 4,535 0.26 % 24.20 1,702 0.43 % 39.64 47 0.12 % 11.88 2,120 0.09 % 6.67
čeravno čer avno 9,604 0.10 % 8.46 0 0 % 0 600 0.14 % 15.11 5,123 0.11 % 9.44 1,322 0.08 % 7.05 357 0.09 % 8.31 21 0.05 % 5.31 2,181 0.09 % 6.86
zatorej zat orej 4,880 0.05 % 4.30 0 0 % 0 317 0.07 % 7.98 2,133 0.05 % 3.93 1,440 0.08 % 7.68 344 0.09 % 8.01 22 0.06 % 5.56 624 0.03 % 1.96
najsi n ajsi 4,664 0.05 % 4.11 0 0 % 0 343 0.08 % 8.64 1,902 0.04 % 3.50 1,216 0.07 % 6.49 395 0.10 % 9.20 66 0.17 % 16.68 742 0.03 % 2.33
predno pr edno 1,817 0.02 % 1.60 0 0 % 0 143 0.03 % 3.60 944 0.02 % 1.74 274 0.02 % 1.46 223 0.06 % 5.19 42 0.10 % 10.62 191 0.01 % 0.60
odkoder odk oder 1,597 0.02 % 1.41 0 0 % 0 109 0.03 % 2.74 878 0.02 % 1.62 300 0.02 % 1.60 96 0.02 % 2.24 7 0.02 % 1.77 207 0.01 % 0.65

1.2.180 List of final character-level 5-grams from conjunction lower-case word forms in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-word_parts-conjunctions-lowercase_forms-final-5grams-taxonomy-entire.tsv
Columns: Lower-case word form; Rest of the word; Final part of the word; Total absolute frequency of lower-case word form; Percentage of all found lower-case word forms; Total relative frequency (per million occurrences); Absolute frequency [SSJ.T.K.N]; Percentage [SSJ.T.K.N]; Relative frequency [SSJ.T.K.N]; Absolute frequency [SSJ.T.K.L]; Percentage [SSJ.T.K.L]; Relative frequency [SSJ.T.K.L]; Absolute frequency [SSJ.T.P.C]; Percentage [SSJ.T.P.C]; Relative frequency [SSJ.T.P.C]; Absolute frequency [SSJ.T.P.R]; Percentage [SSJ.T.P.R]; Relative frequency [SSJ.T.P.R]; Absolute frequency [SSJ.T.K.S]; Percentage [SSJ.T.K.S]; Relative frequency [SSJ.T.K.S]; Absolute frequency [SSJ.T.D]; Percentage [SSJ.T.D]; Relative frequency [SSJ.T.D]; Absolute frequency [SSJ.I]; Percentage [SSJ.I]; Relative frequency [SSJ.I]
vendar v endar 997,194 16.49 % 878.82 0 0 % 0 53,106 17.73 % 1,337.15 515,237 17.68 % 949.36 192,538 17.43 % 1,027.32 48,139 18.55 % 1,121.21 3,173 10.28 % 802.10 185,001 12.86 % 581.88
namreč n amreč 902,134 14.92 % 795.05 0 0 % 0 9,366 3.13 % 235.83 452,514 15.53 % 833.79 148,837 13.47 % 794.15 17,827 6.87 % 415.21 1,164 3.77 % 294.25 272,426 18.94 % 856.85
čeprav č eprav 679,637 11.24 % 598.96 0 0 % 0 28,778 9.61 % 724.60 333,503 11.44 % 614.50 130,569 11.82 % 696.67 24,686 9.51 % 574.97 1,251 4.05 % 316.24 160,850 11.18 % 505.92
oziroma oz iroma 648,207 10.72 % 571.26 2 6.06 % 206.04 4,180 1.40 % 105.25 332,593 11.41 % 612.82 100,468 9.09 % 536.06 23,365 9.00 % 544.20 10,581 34.27 % 2,674.76 177,018 12.31 % 556.77
ampak ampak 572,085 9.46 % 504.18 0 0 % 0 58,951 19.68 % 1,484.32 227,969 7.82 % 420.05 115,041 10.41 % 613.82 18,836 7.26 % 438.71 2,969 9.62 % 750.53 148,319 10.31 % 466.50
kakor kakor 272,376 4.50 % 240.04 0 0 % 0 40,148 13.40 % 1,010.88 123,199 4.23 % 227 51,656 4.67 % 275.62 24,063 9.27 % 560.46 1,664 5.39 % 420.64 31,646
2.20 % 99.53 sicer sicer 263,692 4.36 % 232.39 0 0 % 0 2,680 0.90 % 67.48 130,258 4.47 % 240.01 36,602 3.31 % 195.30 6,798 2.62 % 158.33 847 2.74 % 214.11 86,507 6.01 % 272.09 temveč t emveč 235,490 3.89 % 207.54 0 0 % 0 7,332 2.45 % 184.61 110,231 3.78 % 203.11 43,698 3.96 % 233.16 16,034 6.18 % 373.45 373 1.21 % 94.29 57,822 4.02 % 181.87 torej torej 196,891 3.26 % 173.52 0 0 % 0 3,308 1.10 % 83.29 96,806 3.32 % 178.37 39,105 3.54 % 208.65 9,010 3.47 % 209.85 593 1.92 % 149.90 48,069 3.34 % 151.19 kajti kajti 179,605 2.97 % 158.28 0 0 % 0 11,065 3.69 % 278.60 106,508 3.65 % 196.25 32,173 2.91 % 171.66 7,547 2.91 % 175.78 1,131 3.66 % 285.90 21,181 1.47 % 66.62 vendarle ven darle 165,832 2.74 % 146.15 0 0 % 0 4,087 1.36 % 102.91 91,249 3.13 % 168.13 23,643 2.14 % 126.15 3,756 1.45 % 87.48 544 1.76 % 137.52 42,553 2.96 % 133.84 preden p reden 148,498 2.46 % 130.87 8 24.24 % 824.15 20,759 6.93 % 522.69 53,160 1.82 % 97.95 32,176 2.91 % 171.68 8,842 3.41 % 205.94 629 2.04 % 159 32,924 2.29 % 103.55 dokler d okler 147,332 2.44 % 129.84 19 57.58 % 1,957.35 16,427 5.48 % 413.61 59,151 2.03 % 108.99 30,010 2.72 % 160.12 9,366 3.61 % 218.15 1,089 3.53 % 275.29 31,270 2.17 % 98.35 kadar kadar 133,016 2.20 % 117.23 1 3.03 % 103.02 13,669 4.56 % 344.17 44,635 1.53 % 82.24 34,877 3.16 % 186.09 16,388 6.31 % 381.70 1,820 5.89 % 460.08 21,626 1.50 % 68.02 kolikor ko likor 105,767 1.75 % 93.21 1 3.03 % 103.02 5,383 1.80 % 135.54 50,897 1.75 % 93.78 18,240 1.65 % 97.32 4,410 1.70 % 102.71 830 2.69 % 209.82 26,006 1.81 % 81.80 kamor kamor 78,225 1.29 % 68.94 0 0 % 0 3,973 1.33 % 100.04 37,516 1.29 % 69.13 14,813 1.34 % 79.04 2,953 1.14 % 68.78 176 0.57 % 44.49 18,794 1.31 % 59.11 bodisi b odisi 63,373 1.05 % 55.85 2 6.06 % 206.04 1,759 0.59 % 44.29 27,313 0.94 % 50.33 12,796 1.16 % 68.28 5,707 2.20 % 132.92 413 1.34 % 104.40 15,383 1.07 % 48.38 odkar odkar 63,190 1.04 % 55.69 0 0 % 0 4,813 1.61 % 121.19 28,914 0.99 % 53.28 10,306 0.93 % 54.99 920 0.35 % 21.43 106 0.34 % 26.80 
18,131 1.26 % 57.03
četudi č etudi 62,349 1.03 % 54.95 0 0 % 0 2,741 0.92 % 69.02 28,006 0.96 % 51.60 11,127 1.01 % 59.37 3,231 1.25 % 75.25 135 0.44 % 34.13 17,109 1.19 % 53.81
razen razen 50,366 0.83 % 44.39 0 0 % 0 2,897 0.97 % 72.94 22,525 0.77 % 41.50 10,083 0.91 % 53.80 3,015 1.16 % 70.22 1,122 3.63 % 283.63 10,724 0.75 % 33.73
koder koder 33,663 0.56 % 29.67 0 0 % 0 1,770 0.59 % 44.57 17,062 0.58 % 31.44 5,230 0.47 % 27.91 1,410 0.54 % 32.84 49 0.16 % 12.39 8,142 0.57 % 25.61
marveč m arveč 21,385 0.35 % 18.85 0 0 % 0 595 0.20 % 14.98 12,386 0.42 % 22.82 4,535 0.41 % 24.20 1,702 0.66 % 39.64 47 0.15 % 11.88 2,120 0.15 % 6.67
čeravno če ravno 9,604 0.16 % 8.46 0 0 % 0 600 0.20 % 15.11 5,123 0.18 % 9.44 1,322 0.12 % 7.05 357 0.14 % 8.31 21 0.07 % 5.31 2,181 0.15 % 6.86
zatorej za torej 4,880 0.08 % 4.30 0 0 % 0 317 0.11 % 7.98 2,133 0.07 % 3.93 1,440 0.13 % 7.68 344 0.13 % 8.01 22 0.07 % 5.56 624 0.04 % 1.96
najsi najsi 4,664 0.08 % 4.11 0 0 % 0 343 0.11 % 8.64 1,902 0.07 % 3.50 1,216 0.11 % 6.49 395 0.15 % 9.20 66 0.21 % 16.68 742 0.05 % 2.33
predno p redno 1,817 0.03 % 1.60 0 0 % 0 143 0.05 % 3.60 944 0.03 % 1.74 274 0.03 % 1.46 223 0.09 % 5.19 42 0.14 % 10.62 191 0.01 % 0.60
odkoder od koder 1,597 0.03 % 1.41 0 0 % 0 109 0.04 % 2.74 878 0.03 % 1.62 300 0.03 % 1.60 96 0.04 % 2.24 7 0.02 % 1.77 207 0.01 % 0.65
dasiravno dasi ravno 1,435 0.02 % 1.26 0 0 % 0 58 0.02 % 1.46 637 0.02 % 1.17 487 0.04 % 2.60 51 0.02 % 1.19 1 0 % 0.25 201 0.01 % 0.63
dočim dočim 855 0.01 % 0.75 0 0 % 0 15 0.01 % 0.38 73 0 % 0.13 692 0.06 % 3.69 36 0.01 % 0.84 1 0 % 0.25 38 0 % 0.12
super super 633 0.01 % 0.56 0 0 % 0 32 0.01 % 0.81 197 0.01 % 0.36 143 0.01 % 0.76 6 0 % 0.14 4 0.01 % 1.01 251 0.02 % 0.79
akoravno ako ravno 207 0 % 0.18 0 0 % 0 56 0.02 % 1.41 44 0 % 0.08 71 0.01 % 0.38 16 0.01 % 0.37 0 0 % 0 20 0 % 0.06
poceni p oceni 179 0 % 0.16 0 0 % 0 4 0 % 0.10 130 0 % 0.24 23 0 % 0.12 5 0 % 0.12 0 0 % 0 17 0 % 0.05

1.2.181 List of initial character-level 1-grams from particle lemmas in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-word_parts-particles-lemmas-initial-1grams-taxonomy-entire.tsv
Columns: Lemma; Lemma (lower-case); Initial part of the word; Rest of the word; Total absolute frequency of lemma; Percentage of all found lemmas; Total relative frequency (per million occurrences); Absolute frequency [SSJ.T.K.N]; Percentage [SSJ.T.K.N]; Relative frequency [SSJ.T.K.N]; Absolute frequency [SSJ.T.K.L]; Percentage [SSJ.T.K.L]; Relative frequency [SSJ.T.K.L]; Absolute frequency [SSJ.T.P.C]; Percentage [SSJ.T.P.C]; Relative frequency [SSJ.T.P.C]; Absolute frequency [SSJ.T.P.R]; Percentage [SSJ.T.P.R]; Relative frequency [SSJ.T.P.R]; Absolute frequency [SSJ.T.K.S]; Percentage [SSJ.T.K.S]; Relative frequency [SSJ.T.K.S]; Absolute frequency [SSJ.T.D]; Percentage [SSJ.T.D]; Relative frequency [SSJ.T.D]; Absolute frequency [SSJ.I]; Percentage [SSJ.I]; Relative frequency [SSJ.I]
tudi tudi t udi 8,478,793 21.38 % 7,472.32 21 18.75 % 2,163.39 136,549 9.19 % 3,438.16 4,214,730 21.79 % 7,765.92 1,465,007 20.28 % 7,816.80 263,011 21.31 % 6,125.84 23,030 20.53 % 5,821.74 2,376,445 23.18 % 7,474.54
ne ne n e 6,734,994 16.98 % 5,935.52 44 39.29 % 4,532.81 374,116 25.18 % 9,419.84 3,104,942 16.05 % 5,721.06 1,318,655 18.25 % 7,035.91 259,075 20.99 % 6,034.16 30,064 26.80 % 7,599.85 1,648,098 16.08 % 5,183.70
še še š e 5,787,718 14.60 % 5,100.69 24 21.43 % 2,472.44 216,168 14.55 % 5,442.88 2,847,027 14.72 % 5,245.84 962,021 13.32 % 5,133.03 135,437 10.97 % 3,154.49 11,596 10.34 % 2,931.34 1,615,445 15.76 % 5,081
že že ž e 3,735,419 9.42 % 3,292.01 6 5.36 % 618.11 131,252 8.84 % 3,304.78 1,898,346 9.81 % 3,497.83 608,889 8.43 % 3,248.83 77,192 6.25 % 1,797.89 7,450 6.64 % 1,883.28 1,012,284 9.87 % 3,183.90
le le l e 2,165,776 5.46 % 1,908.69 0 0 % 0 58,645 3.95 % 1,476.62 1,056,697 5.46 % 1,947.03 439,209 6.08 % 2,343.47 79,149 6.41 % 1,843.47 4,405 3.93 % 1,113.54 527,671 5.15 %
1,659.66 naj naj n aj 1,598,002 4.03 % 1,408.31 2 1.79 % 206.04 35,826 2.41 % 902.06 847,958 4.38 % 1,562.42 235,452 3.26 % 1,256.29 47,034 3.81 % 1,095.48 3,314 2.95 % 837.74 428,416 4.18 % 1,347.48 prav prav p rav 1,208,290 3.05 % 1,064.86 1 0.89 % 103.02 52,405 3.53 % 1,319.50 578,438 2.99 % 1,065.81 228,510 3.16 % 1,219.25 41,863 3.39 % 975.04 2,593 2.31 % 655.48 304,480 2.97 % 957.67 sicer sicer s icer 1,078,123 2.72 % 950.14 0 0 % 0 15,993 1.08 % 402.69 490,602 2.54 % 903.97 157,744 2.18 % 841.67 18,839 1.53 % 438.78 1,842 1.64 % 465.64 393,103 3.83 % 1,236.41 samo samo s amo 940,204 2.37 % 828.60 6 5.36 % 618.11 76,915 5.18 % 1,936.64 434,798 2.25 % 801.14 194,492 2.69 % 1,037.75 44,828 3.63 % 1,044.10 4,906 4.37 % 1,240.18 184,259 1.80 % 579.54 predvsem predvsem p redvsem 844,382 2.13 % 744.15 0 0 % 0 5,733 0.39 % 144.35 442,044 2.29 % 814.50 160,555 2.22 % 856.67 27,431 2.22 % 638.90 1,744 1.55 % 440.86 206,875 2.02 % 650.68 seveda seveda s eveda 725,852 1.83 % 639.69 0 0 % 0 24,491 1.65 % 616.66 363,766 1.88 % 670.26 184,814 2.56 % 986.11 20,809 1.69 % 484.67 2,801 2.50 % 708.06 129,171 1.26 % 406.28 celo celo c elo 646,664 1.63 % 569.90 0 0 % 0 27,288 1.84 % 687.08 317,242 1.64 % 584.54 135,530 1.88 % 723.14 27,137 2.20 % 632.05 1,415 1.26 % 357.70 138,052 1.35 % 434.21 skoraj skoraj s koraj 528,454 1.33 % 465.72 3 2.68 % 309.06 26,418 1.78 % 665.18 253,737 1.31 % 467.53 93,065 1.29 % 496.56 15,802 1.28 % 368.05 828 0.74 % 209.31 138,601 1.35 % 435.94 več več v eč 520,542 1.31 % 458.75 0 0 % 0 37,136 2.50 % 935.04 242,295 1.25 % 446.44 94,562 1.31 % 504.55 18,310 1.48 % 426.46 1,689 1.51 % 426.96 126,550 1.23 % 398.03 vsaj vsaj v saj 507,625 1.28 % 447.37 0 0 % 0 20,218 1.36 % 509.07 254,444 1.31 % 468.83 102,076 1.41 % 544.64 15,753 1.28 % 366.91 1,432 1.28 % 361.99 113,702 1.11 % 357.62 morda morda m orda 468,348 1.18 % 412.75 0 0 % 0 25,811 1.74 % 649.89 217,144 1.12 % 400.10 99,850 1.38 % 532.77 21,212 1.72 % 494.05 1,177 1.05 % 297.53 103,154 1.01 % 
324.45 niti niti n iti 458,790 1.16 % 404.33 1 0.89 % 103.02 31,303 2.11 % 788.18 220,261 1.14 % 405.85 80,301 1.11 % 428.46 11,945 0.97 % 278.21 1,088 0.97 % 275.03 113,891 1.11 % 358.22 sploh sploh s ploh 419,675 1.06 % 369.86 0 0 % 0 30,505 2.05 % 768.08 192,378 0.99 % 354.47 87,831 1.22 % 468.64 13,018 1.05 % 303.20 1,240 1.10 % 313.46 94,703 0.92 % 297.87 šele šele š ele 388,144 0.98 % 342.07 2 1.79 % 206.04 13,326 0.90 % 335.53 199,240 1.03 % 367.11 69,131 0.96 % 368.86 13,716 1.11 % 319.46 803 0.72 % 202.99 91,926 0.90 % 289.13 pač pač p ač 294,972 0.74 % 259.96 0 0 % 0 13,015 0.88 % 327.70 148,590 0.77 % 273.79 66,321 0.92 % 353.87 8,471 0.69 % 197.30 2,120 1.89 % 535.91 56,455 0.55 % 177.57 zgolj zgolj z golj 254,436 0.64 % 224.23 0 0 % 0 5,389 0.36 % 135.69 111,593 0.58 % 205.62 38,423 0.53 % 205.01 9,377 0.76 % 218.40 546 0.49 % 138.02 89,108 0.87 % 280.27 ravno ravno r avno 247,650 0.62 % 218.25 1 0.89 % 103.02 15,200 1.02 % 382.72 105,439 0.55 % 194.28 58,189 0.81 % 310.48 7,227 0.59 % 168.33 782 0.70 % 197.68 60,812 0.59 % 191.27 zlasti zlasti z lasti 242,203 0.61 % 213.45 0 0 % 0 3,265 0.22 % 82.21 138,036 0.71 % 254.34 36,842 0.51 % 196.58 16,342 1.32 % 380.62 908 0.81 % 229.53 46,810 0.46 % 147.23 pravzaprav pravzaprav p ravzaprav 198,500 0.50 % 174.94 0 0 % 0 11,349 0.76 % 285.76 93,357 0.48 % 172.02 50,059 0.69 % 267.10 7,342 0.59 % 171 774 0.69 % 195.66 35,619 0.35 % 112.03 vsekakor vsekakor v sekakor 159,819 0.40 % 140.85 1 0.89 % 103.02 3,884 0.26 % 97.79 83,779 0.43 % 154.37 38,563 0.53 % 205.76 4,333 0.35 % 100.92 385 0.34 % 97.32 28,874 0.28 % 90.82 no no n o 142,860 0.36 % 125.90 0 0 % 0 19,680 1.32 % 495.52 52,215 0.27 % 96.21 44,047 0.61 % 235.02 3,129 0.25 % 72.88 501 0.45 % 126.65 23,288 0.23 % 73.25 menda menda m enda 132,347 0.33 % 116.64 0 0 % 0 4,569 0.31 % 115.04 76,040 0.39 % 140.11 26,259 0.36 % 140.11 1,353 0.11 % 31.51 130 0.12 % 32.86 23,996 0.23 % 75.47 najbrž najbrž n ajbrž 123,679 0.31 % 109 0 0 % 0 13,632 0.92 % 343.24 
60,518 0.31 % 111.51 24,671 0.34 % 131.64 4,216 0.34 % 98.20 369 0.33 % 93.28 20,273 0.20 % 63.76
koli koli k oli 107,068 0.27 % 94.36 0 0 % 0 6,574 0.44 % 165.53 49,660 0.26 % 91.50 14,368 0.20 % 76.66 5,246 0.42 % 122.19 962 0.86 % 243.18 30,258 0.29 % 95.17
ja ja j a 87,056 0.22 % 76.72 0 0 % 0 19,465 1.31 % 490.11 27,323 0.14 % 50.34 26,187 0.36 % 139.73 1,897 0.15 % 44.18 332 0.30 % 83.93 11,852 0.12 % 37.28
češ češ č eš 87,039 0.22 % 76.71 0 0 % 0 2,631 0.18 % 66.25 51,345 0.27 % 94.61 11,944 0.17 % 63.73 1,723 0.14 % 40.13 343 0.31 % 86.71 19,053 0.19 % 59.93
skorajda skorajda s korajda 58,291 0.15 % 51.37 0 0 % 0 2,478 0.17 % 62.39 29,411 0.15 % 54.19 12,753 0.18 % 68.05 1,690 0.14 % 39.36 70 0.06 % 17.70 11,889 0.12 % 37.39

1.2.182 List of initial character-level 2-grams from particle lemmas in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-word_parts-particles-lemmas-initial-2grams-taxonomy-entire.tsv
Columns: Lemma; Lemma (lower-case); Initial part of the word; Rest of the word; Total absolute frequency of lemma; Percentage of all found lemmas; Total relative frequency (per million occurrences); Absolute frequency [SSJ.T.K.N]; Percentage [SSJ.T.K.N]; Relative frequency [SSJ.T.K.N]; Absolute frequency [SSJ.T.K.L]; Percentage [SSJ.T.K.L]; Relative frequency [SSJ.T.K.L]; Absolute frequency [SSJ.T.P.C]; Percentage [SSJ.T.P.C]; Relative frequency [SSJ.T.P.C]; Absolute frequency [SSJ.T.P.R]; Percentage [SSJ.T.P.R]; Relative frequency [SSJ.T.P.R]; Absolute frequency [SSJ.T.K.S]; Percentage [SSJ.T.K.S]; Relative frequency [SSJ.T.K.S]; Absolute frequency [SSJ.T.D]; Percentage [SSJ.T.D]; Relative frequency [SSJ.T.D]; Absolute frequency [SSJ.I]; Percentage [SSJ.I]; Relative frequency [SSJ.I]
tudi tudi tu di 8,478,793 21.38 % 7,472.32 21 18.75 % 2,163.39 136,549 9.19 % 3,438.16 4,214,730 21.79 % 7,765.92 1,465,007 20.28 % 7,816.80 263,011 21.31 % 6,125.84 23,030 20.53 % 5,821.74 2,376,445
23.18 % 7,474.54 ne ne ne 6,734,994 16.98 % 5,935.52 44 39.29 % 4,532.81 374,116 25.18 % 9,419.84 3,104,942 16.05 % 5,721.06 1,318,655 18.25 % 7,035.91 259,075 20.99 % 6,034.16 30,064 26.80 % 7,599.85 1,648,098 16.08 % 5,183.70 še še še 5,787,718 14.60 % 5,100.69 24 21.43 % 2,472.44 216,168 14.55 % 5,442.88 2,847,027 14.72 % 5,245.84 962,021 13.32 % 5,133.03 135,437 10.97 % 3,154.49 11,596 10.34 % 2,931.34 1,615,445 15.76 % 5,081 že že že 3,735,419 9.42 % 3,292.01 6 5.36 % 618.11 131,252 8.84 % 3,304.78 1,898,346 9.81 % 3,497.83 608,889 8.43 % 3,248.83 77,192 6.25 % 1,797.89 7,450 6.64 % 1,883.28 1,012,284 9.87 % 3,183.90 le le le 2,165,776 5.46 % 1,908.69 0 0 % 0 58,645 3.95 % 1,476.62 1,056,697 5.46 % 1,947.03 439,209 6.08 % 2,343.47 79,149 6.41 % 1,843.47 4,405 3.93 % 1,113.54 527,671 5.15 % 1,659.66 naj naj na j 1,598,002 4.03 % 1,408.31 2 1.79 % 206.04 35,826 2.41 % 902.06 847,958 4.38 % 1,562.42 235,452 3.26 % 1,256.29 47,034 3.81 % 1,095.48 3,314 2.95 % 837.74 428,416 4.18 % 1,347.48 prav prav pr av 1,208,290 3.05 % 1,064.86 1 0.89 % 103.02 52,405 3.53 % 1,319.50 578,438 2.99 % 1,065.81 228,510 3.16 % 1,219.25 41,863 3.39 % 975.04 2,593 2.31 % 655.48 304,480 2.97 % 957.67 sicer sicer si cer 1,078,123 2.72 % 950.14 0 0 % 0 15,993 1.08 % 402.69 490,602 2.54 % 903.97 157,744 2.18 % 841.67 18,839 1.53 % 438.78 1,842 1.64 % 465.64 393,103 3.83 % 1,236.41 samo samo sa mo 940,204 2.37 % 828.60 6 5.36 % 618.11 76,915 5.18 % 1,936.64 434,798 2.25 % 801.14 194,492 2.69 % 1,037.75 44,828 3.63 % 1,044.10 4,906 4.37 % 1,240.18 184,259 1.80 % 579.54 predvsem predvsem pr edvsem 844,382 2.13 % 744.15 0 0 % 0 5,733 0.39 % 144.35 442,044 2.29 % 814.50 160,555 2.22 % 856.67 27,431 2.22 % 638.90 1,744 1.55 % 440.86 206,875 2.02 % 650.68 seveda seveda se veda 725,852 1.83 % 639.69 0 0 % 0 24,491 1.65 % 616.66 363,766 1.88 % 670.26 184,814 2.56 % 986.11 20,809 1.69 % 484.67 2,801 2.50 % 708.06 129,171 1.26 % 406.28 celo celo ce lo 646,664 1.63 % 569.90 0 0 % 0 27,288 1.84 % 
687.08 317,242 1.64 % 584.54 135,530 1.88 % 723.14 27,137 2.20 % 632.05 1,415 1.26 % 357.70 138,052 1.35 % 434.21 skoraj skoraj sk oraj 528,454 1.33 % 465.72 3 2.68 % 309.06 26,418 1.78 % 665.18 253,737 1.31 % 467.53 93,065 1.29 % 496.56 15,802 1.28 % 368.05 828 0.74 % 209.31 138,601 1.35 % 435.94 več več ve č 520,542 1.31 % 458.75 0 0 % 0 37,136 2.50 % 935.04 242,295 1.25 % 446.44 94,562 1.31 % 504.55 18,310 1.48 % 426.46 1,689 1.51 % 426.96 126,550 1.23 % 398.03 vsaj vsaj vs aj 507,625 1.28 % 447.37 0 0 % 0 20,218 1.36 % 509.07 254,444 1.31 % 468.83 102,076 1.41 % 544.64 15,753 1.28 % 366.91 1,432 1.28 % 361.99 113,702 1.11 % 357.62 morda morda mo rda 468,348 1.18 % 412.75 0 0 % 0 25,811 1.74 % 649.89 217,144 1.12 % 400.10 99,850 1.38 % 532.77 21,212 1.72 % 494.05 1,177 1.05 % 297.53 103,154 1.01 % 324.45 niti niti ni ti 458,790 1.16 % 404.33 1 0.89 % 103.02 31,303 2.11 % 788.18 220,261 1.14 % 405.85 80,301 1.11 % 428.46 11,945 0.97 % 278.21 1,088 0.97 % 275.03 113,891 1.11 % 358.22 sploh sploh sp loh 419,675 1.06 % 369.86 0 0 % 0 30,505 2.05 % 768.08 192,378 0.99 % 354.47 87,831 1.22 % 468.64 13,018 1.05 % 303.20 1,240 1.10 % 313.46 94,703 0.92 % 297.87 šele šele še le 388,144 0.98 % 342.07 2 1.79 % 206.04 13,326 0.90 % 335.53 199,240 1.03 % 367.11 69,131 0.96 % 368.86 13,716 1.11 % 319.46 803 0.72 % 202.99 91,926 0.90 % 289.13 pač pač pa č 294,972 0.74 % 259.96 0 0 % 0 13,015 0.88 % 327.70 148,590 0.77 % 273.79 66,321 0.92 % 353.87 8,471 0.69 % 197.30 2,120 1.89 % 535.91 56,455 0.55 % 177.57 zgolj zgolj zg olj 254,436 0.64 % 224.23 0 0 % 0 5,389 0.36 % 135.69 111,593 0.58 % 205.62 38,423 0.53 % 205.01 9,377 0.76 % 218.40 546 0.49 % 138.02 89,108 0.87 % 280.27 ravno ravno ra vno 247,650 0.62 % 218.25 1 0.89 % 103.02 15,200 1.02 % 382.72 105,439 0.55 % 194.28 58,189 0.81 % 310.48 7,227 0.59 % 168.33 782 0.70 % 197.68 60,812 0.59 % 191.27 zlasti zlasti zl asti 242,203 0.61 % 213.45 0 0 % 0 3,265 0.22 % 82.21 138,036 0.71 % 254.34 36,842 0.51 % 196.58 16,342 1.32 % 
380.62 908 0.81 % 229.53 46,810 0.46 % 147.23
pravzaprav pravzaprav pr avzaprav 198,500 0.50 % 174.94 0 0 % 0 11,349 0.76 % 285.76 93,357 0.48 % 172.02 50,059 0.69 % 267.10 7,342 0.59 % 171 774 0.69 % 195.66 35,619 0.35 % 112.03
vsekakor vsekakor vs ekakor 159,819 0.40 % 140.85 1 0.89 % 103.02 3,884 0.26 % 97.79 83,779 0.43 % 154.37 38,563 0.53 % 205.76 4,333 0.35 % 100.92 385 0.34 % 97.32 28,874 0.28 % 90.82
no no no 142,860 0.36 % 125.90 0 0 % 0 19,680 1.32 % 495.52 52,215 0.27 % 96.21 44,047 0.61 % 235.02 3,129 0.25 % 72.88 501 0.45 % 126.65 23,288 0.23 % 73.25
menda menda me nda 132,347 0.33 % 116.64 0 0 % 0 4,569 0.31 % 115.04 76,040 0.39 % 140.11 26,259 0.36 % 140.11 1,353 0.11 % 31.51 130 0.12 % 32.86 23,996 0.23 % 75.47
najbrž najbrž na jbrž 123,679 0.31 % 109 0 0 % 0 13,632 0.92 % 343.24 60,518 0.31 % 111.51 24,671 0.34 % 131.64 4,216 0.34 % 98.20 369 0.33 % 93.28 20,273 0.20 % 63.76
koli koli ko li 107,068 0.27 % 94.36 0 0 % 0 6,574 0.44 % 165.53 49,660 0.26 % 91.50 14,368 0.20 % 76.66 5,246 0.42 % 122.19 962 0.86 % 243.18 30,258 0.29 % 95.17
ja ja ja 87,056 0.22 % 76.72 0 0 % 0 19,465 1.31 % 490.11 27,323 0.14 % 50.34 26,187 0.36 % 139.73 1,897 0.15 % 44.18 332 0.30 % 83.93 11,852 0.12 % 37.28
češ češ če š 87,039 0.22 % 76.71 0 0 % 0 2,631 0.18 % 66.25 51,345 0.27 % 94.61 11,944 0.17 % 63.73 1,723 0.14 % 40.13 343 0.31 % 86.71 19,053 0.19 % 59.93
skorajda skorajda sk orajda 58,291 0.15 % 51.37 0 0 % 0 2,478 0.17 % 62.39 29,411 0.15 % 54.19 12,753 0.18 % 68.05 1,690 0.14 % 39.36 70 0.06 % 17.70 11,889 0.12 % 37.39

1.2.183 List of initial character-level 3-grams from particle lemmas in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-word_parts-particles-lemmas-initial-3grams-taxonomy-entire.tsv
Columns: Lemma; Lemma (lower-case); Initial part of the word; Rest of the word; Total absolute frequency of lemma; Percentage of all found lemmas; Total relative frequency (per million occurrences); Absolute frequency [SSJ.T.K.N]; Percentage [SSJ.T.K.N]; Relative frequency [SSJ.T.K.N]; Absolute frequency [SSJ.T.K.L]; Percentage [SSJ.T.K.L]; Relative frequency [SSJ.T.K.L]; Absolute frequency [SSJ.T.P.C]; Percentage [SSJ.T.P.C]; Relative frequency [SSJ.T.P.C]; Absolute frequency [SSJ.T.P.R]; Percentage [SSJ.T.P.R]; Relative frequency [SSJ.T.P.R]; Absolute frequency [SSJ.T.K.S]; Percentage [SSJ.T.K.S]; Relative frequency [SSJ.T.K.S]; Absolute frequency [SSJ.T.D]; Percentage [SSJ.T.D]; Relative frequency [SSJ.T.D]; Absolute frequency [SSJ.I]; Percentage [SSJ.I]; Relative frequency [SSJ.I]
tudi tudi tud i 8,478,793 40.41 % 7,472.32 21 55.26 % 2,163.39 136,549 20.61 % 3,438.16 4,214,730 40.71 % 7,765.92 1,465,007 38.33 % 7,816.80 263,011 38.81 % 6,125.84 23,030 39.85 % 5,821.74 2,376,445 43.91 % 7,474.54
naj naj naj 1,598,002 7.62 % 1,408.31 2 5.26 % 206.04 35,826 5.41 % 902.06 847,958 8.19 % 1,562.42 235,452 6.16 % 1,256.29 47,034 6.94 % 1,095.48 3,314 5.73 % 837.74 428,416 7.92 % 1,347.48
prav prav pra v 1,208,290 5.76 % 1,064.86 1 2.63 % 103.02 52,405 7.91 % 1,319.50 578,438 5.59 % 1,065.81 228,510 5.98 % 1,219.25 41,863 6.18 % 975.04 2,593 4.49 % 655.48 304,480 5.63 % 957.67
sicer sicer sic er 1,078,123 5.14 % 950.14 0 0 % 0 15,993 2.41 % 402.69 490,602 4.74 % 903.97 157,744 4.13 % 841.67 18,839 2.78 % 438.78 1,842 3.19 % 465.64 393,103 7.26 % 1,236.41
samo samo sam o 940,204 4.48 % 828.60 6 15.79 % 618.11 76,915 11.61 % 1,936.64 434,798 4.20 % 801.14 194,492 5.09 % 1,037.75 44,828 6.61 % 1,044.10 4,906 8.49 % 1,240.18 184,259 3.40 % 579.54
predvsem predvsem pre dvsem 844,382 4.02 % 744.15 0 0 % 0 5,733 0.86 % 144.35 442,044 4.27 % 814.50 160,555 4.20 % 856.67 27,431 4.05 % 638.90 1,744 3.02 % 440.86 206,875 3.82 % 650.68
seveda seveda sev eda 725,852 3.46 % 639.69 0 0 % 0 24,491 3.70 % 616.66 363,766 3.51 % 670.26 184,814 4.84 % 986.11 20,809 3.07 % 484.67 2,801 4.85 % 708.06 129,171 2.39 % 406.28
celo celo cel o 646,664 3.08 % 569.90 0 0 % 0
27,288 4.12 % 687.08 317,242 3.06 % 584.54 135,530 3.55 % 723.14 27,137 4.00 % 632.05 1,415 2.45 % 357.70 138,052 2.55 % 434.21 skoraj skoraj sko raj 528,454 2.52 % 465.72 3 7.89 % 309.06 26,418 3.99 % 665.18 253,737 2.45 % 467.53 93,065 2.44 % 496.56 15,802 2.33 % 368.05 828 1.43 % 209.31 138,601 2.56 % 435.94 več več več 520,542 2.48 % 458.75 0 0 % 0 37,136 5.60 % 935.04 242,295 2.34 % 446.44 94,562 2.47 % 504.55 18,310 2.70 % 426.46 1,689 2.92 % 426.96 126,550 2.34 % 398.03 vsaj vsaj vsa j 507,625 2.42 % 447.37 0 0 % 0 20,218 3.05 % 509.07 254,444 2.46 % 468.83 102,076 2.67 % 544.64 15,753 2.32 % 366.91 1,432 2.48 % 361.99 113,702 2.10 % 357.62 morda morda mor da 468,348 2.23 % 412.75 0 0 % 0 25,811 3.90 % 649.89 217,144 2.10 % 400.10 99,850 2.61 % 532.77 21,212 3.13 % 494.05 1,177 2.04 % 297.53 103,154 1.91 % 324.45 niti niti nit i 458,790 2.19 % 404.33 1 2.63 % 103.02 31,303 4.72 % 788.18 220,261 2.13 % 405.85 80,301 2.10 % 428.46 11,945 1.76 % 278.21 1,088 1.88 % 275.03 113,891 2.10 % 358.22 sploh sploh spl oh 419,675 2.00 % 369.86 0 0 % 0 30,505 4.60 % 768.08 192,378 1.86 % 354.47 87,831 2.30 % 468.64 13,018 1.92 % 303.20 1,240 2.15 % 313.46 94,703 1.75 % 297.87 šele šele šel e 388,144 1.85 % 342.07 2 5.26 % 206.04 13,326 2.01 % 335.53 199,240 1.93 % 367.11 69,131 1.81 % 368.86 13,716 2.02 % 319.46 803 1.39 % 202.99 91,926 1.70 % 289.13 pač pač pač 294,972 1.41 % 259.96 0 0 % 0 13,015 1.96 % 327.70 148,590 1.44 % 273.79 66,321 1.74 % 353.87 8,471 1.25 % 197.30 2,120 3.67 % 535.91 56,455 1.04 % 177.57 zgolj zgolj zgo lj 254,436 1.21 % 224.23 0 0 % 0 5,389 0.81 % 135.69 111,593 1.08 % 205.62 38,423 1.00 % 205.01 9,377 1.38 % 218.40 546 0.94 % 138.02 89,108 1.65 % 280.27 ravno ravno rav no 247,650 1.18 % 218.25 1 2.63 % 103.02 15,200 2.29 % 382.72 105,439 1.02 % 194.28 58,189 1.52 % 310.48 7,227 1.07 % 168.33 782 1.35 % 197.68 60,812 1.12 % 191.27 zlasti zlasti zla sti 242,203 1.15 % 213.45 0 0 % 0 3,265 0.49 % 82.21 138,036 1.33 % 254.34 36,842 0.96 % 196.58 
16,342 2.41 % 380.62 908 1.57 % 229.53 46,810 0.86 % 147.23 pravzaprav pravzaprav pra vzaprav 198,500 0.95 % 174.94 0 0 % 0 11,349 1.71 % 285.76 93,357 0.90 % 172.02 50,059 1.31 % 267.10 7,342 1.08 % 171 774 1.34 % 195.66 35,619 0.66 % 112.03 vsekakor vsekakor vse kakor 159,819 0.76 % 140.85 1 2.63 % 103.02 3,884 0.59 % 97.79 83,779 0.81 % 154.37 38,563 1.01 % 205.76 4,333 0.64 % 100.92 385 0.67 % 97.32 28,874 0.53 % 90.82 menda menda men da 132,347 0.63 % 116.64 0 0 % 0 4,569 0.69 % 115.04 76,040 0.73 % 140.11 26,259 0.69 % 140.11 1,353 0.20 % 31.51 130 0.23 % 32.86 23,996 0.44 % 75.47 najbrž najbrž naj brž 123,679 0.59 % 109 0 0 % 0 13,632 2.06 % 343.24 60,518 0.58 % 111.51 24,671 0.65 % 131.64 4,216 0.62 % 98.20 369 0.64 % 93.28 20,273 0.38 % 63.76 koli koli kol i 107,068 0.51 % 94.36 0 0 % 0 6,574 0.99 % 165.53 49,660 0.48 % 91.50 14,368 0.38 % 76.66 5,246 0.77 % 122.19 962 1.67 % 243.18 30,258 0.56 % 95.17 češ češ češ 87,039 0.41 % 76.71 0 0 % 0 2,631 0.40 % 66.25 51,345 0.50 % 94.61 11,944 0.31 % 63.73 1,723 0.25 % 40.13 343 0.59 % 86.71 19,053 0.35 % 59.93 skorajda skorajda sko rajda 58,291 0.28 % 51.37 0 0 % 0 2,478 0.37 % 62.39 29,411 0.28 % 54.19 12,753 0.33 % 68.05 1,690 0.25 % 39.36 70 0.12 % 17.70 11,889 0.22 % 37.39 bržkone bržkone brž kone 28,045 0.13 % 24.72 0 0 % 0 720 0.11 % 18.13 17,758 0.17 % 32.72 3,128 0.08 % 16.69 773 0.11 % 18 21 0.04 % 5.31 5,645 0.10 % 17.75 nemara nemara nem ara 24,208 0.12 % 21.33 0 0 % 0 2,593 0.39 % 65.29 12,101 0.12 % 22.30 4,123 0.11 % 22 2,031 0.30 % 47.30 37 0.06 % 9.35 3,323 0.06 % 10.45 morebiti morebiti mor ebiti 23,169 0.11 % 20.42 0 0 % 0 1,693 0.26 % 42.63 12,592 0.12 % 23.20 3,410 0.09 % 18.19 787 0.12 % 18.33 33 0.06 % 8.34 4,654 0.09 % 14.64 nikar nikar nik ar 21,998 0.10 % 19.39 0 0 % 0 2,458 0.37 % 61.89 8,133 0.08 % 14.99 7,672 0.20 % 40.94 1,134 0.17 % 26.41 47 0.08 % 11.88 2,554 0.05 % 8.03 kajpak kajpak kaj pak 21,308 0.10 % 18.78 0 0 % 0 482 0.07 % 12.14 14,047 0.14 % 25.88 2,845 0.07 % 15.18 214 
0.03 % 4.98 5 0.01 % 1.26 3,715 0.07 % 11.68
kvečjemu kvečjemu kve čjemu 16,808 0.08 % 14.81 0 0 % 0 733 0.11 % 18.46 8,898 0.09 % 16.40 3,245 0.09 % 17.31 571 0.08 % 13.30 48 0.08 % 12.13 3,313 0.06 % 10.42

1.2.184 List of initial character-level 4-grams from particle lemmas in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-word_parts-particles-lemmas-initial-4grams-taxonomy-entire.tsv
Columns: Lemma; Lemma (lower-case); Initial part of the word; Rest of the word; Total absolute frequency of lemma; Percentage of all found lemmas; Total relative frequency (per million occurrences); Absolute frequency [SSJ.T.K.N]; Percentage [SSJ.T.K.N]; Relative frequency [SSJ.T.K.N]; Absolute frequency [SSJ.T.K.L]; Percentage [SSJ.T.K.L]; Relative frequency [SSJ.T.K.L]; Absolute frequency [SSJ.T.P.C]; Percentage [SSJ.T.P.C]; Relative frequency [SSJ.T.P.C]; Absolute frequency [SSJ.T.P.R]; Percentage [SSJ.T.P.R]; Relative frequency [SSJ.T.P.R]; Absolute frequency [SSJ.T.K.S]; Percentage [SSJ.T.K.S]; Relative frequency [SSJ.T.K.S]; Absolute frequency [SSJ.T.D]; Percentage [SSJ.T.D]; Relative frequency [SSJ.T.D]; Absolute frequency [SSJ.I]; Percentage [SSJ.I]; Relative frequency [SSJ.I]
tudi tudi tudi 8,478,793 45.90 % 7,472.32 21 58.33 % 2,163.39 136,549 23.82 % 3,438.16 4,214,730 46.53 % 7,765.92 1,465,007 42.94 % 7,816.80 263,011 43.71 % 6,125.84 23,030 45.80 % 5,821.74 2,376,445 49.73 % 7,474.54
prav prav prav 1,208,290 6.54 % 1,064.86 1 2.78 % 103.02 52,405 9.14 % 1,319.50 578,438 6.39 % 1,065.81 228,510 6.70 % 1,219.25 41,863 6.96 % 975.04 2,593 5.16 % 655.48 304,480 6.37 % 957.67
sicer sicer sice r 1,078,123 5.84 % 950.14 0 0 % 0 15,993 2.79 % 402.69 490,602 5.42 % 903.97 157,744 4.62 % 841.67 18,839 3.13 % 438.78 1,842 3.66 % 465.64 393,103 8.23 % 1,236.41
samo samo samo 940,204 5.09 % 828.60 6 16.67 % 618.11 76,915 13.42 % 1,936.64 434,798 4.80 % 801.14 194,492 5.70 % 1,037.75 44,828 7.45 % 1,044.10
4,906 9.76 % 1,240.18 184,259 3.86 % 579.54 predvsem predvsem pred vsem 844,382 4.57 % 744.15 0 0 % 0 5,733 1.00 % 144.35 442,044 4.88 % 814.50 160,555 4.71 % 856.67 27,431 4.56 % 638.90 1,744 3.47 % 440.86 206,875 4.33 % 650.68 seveda seveda seve da 725,852 3.93 % 639.69 0 0 % 0 24,491 4.27 % 616.66 363,766 4.02 % 670.26 184,814 5.42 % 986.11 20,809 3.46 % 484.67 2,801 5.57 % 708.06 129,171 2.70 % 406.28 celo celo celo 646,664 3.50 % 569.90 0 0 % 0 27,288 4.76 % 687.08 317,242 3.50 % 584.54 135,530 3.97 % 723.14 27,137 4.51 % 632.05 1,415 2.81 % 357.70 138,052 2.89 % 434.21 skoraj skoraj skor aj 528,454 2.86 % 465.72 3 8.33 % 309.06 26,418 4.61 % 665.18 253,737 2.80 % 467.53 93,065 2.73 % 496.56 15,802 2.63 % 368.05 828 1.65 % 209.31 138,601 2.90 % 435.94 vsaj vsaj vsaj 507,625 2.75 % 447.37 0 0 % 0 20,218 3.53 % 509.07 254,444 2.81 % 468.83 102,076 2.99 % 544.64 15,753 2.62 % 366.91 1,432 2.85 % 361.99 113,702 2.38 % 357.62 morda morda mord a 468,348 2.54 % 412.75 0 0 % 0 25,811 4.50 % 649.89 217,144 2.40 % 400.10 99,850 2.93 % 532.77 21,212 3.52 % 494.05 1,177 2.34 % 297.53 103,154 2.16 % 324.45 niti niti niti 458,790 2.48 % 404.33 1 2.78 % 103.02 31,303 5.46 % 788.18 220,261 2.43 % 405.85 80,301 2.35 % 428.46 11,945 1.99 % 278.21 1,088 2.16 % 275.03 113,891 2.38 % 358.22 sploh sploh splo h 419,675 2.27 % 369.86 0 0 % 0 30,505 5.32 % 768.08 192,378 2.12 % 354.47 87,831 2.58 % 468.64 13,018 2.16 % 303.20 1,240 2.47 % 313.46 94,703 1.98 % 297.87 šele šele šele 388,144 2.10 % 342.07 2 5.56 % 206.04 13,326 2.32 % 335.53 199,240 2.20 % 367.11 69,131 2.03 % 368.86 13,716 2.28 % 319.46 803 1.60 % 202.99 91,926 1.92 % 289.13 zgolj zgolj zgol j 254,436 1.38 % 224.23 0 0 % 0 5,389 0.94 % 135.69 111,593 1.23 % 205.62 38,423 1.13 % 205.01 9,377 1.56 % 218.40 546 1.09 % 138.02 89,108 1.86 % 280.27 ravno ravno ravn o 247,650 1.34 % 218.25 1 2.78 % 103.02 15,200 2.65 % 382.72 105,439 1.16 % 194.28 58,189 1.71 % 310.48 7,227 1.20 % 168.33 782 1.55 % 197.68 60,812 1.27 % 191.27 
zlasti zlasti zlas ti 242,203 1.31 % 213.45 0 0 % 0 3,265 0.57 % 82.21 138,036 1.52 % 254.34 36,842 1.08 % 196.58 16,342 2.72 % 380.62 908 1.81 % 229.53 46,810 0.98 % 147.23 pravzaprav pravzaprav prav zaprav 198,500 1.07 % 174.94 0 0 % 0 11,349 1.98 % 285.76 93,357 1.03 % 172.02 50,059 1.47 % 267.10 7,342 1.22 % 171 774 1.54 % 195.66 35,619 0.74 % 112.03 vsekakor vsekakor vsek akor 159,819 0.86 % 140.85 1 2.78 % 103.02 3,884 0.68 % 97.79 83,779 0.93 % 154.37 38,563 1.13 % 205.76 4,333 0.72 % 100.92 385 0.77 % 97.32 28,874 0.60 % 90.82 menda menda mend a 132,347 0.72 % 116.64 0 0 % 0 4,569 0.80 % 115.04 76,040 0.84 % 140.11 26,259 0.77 % 140.11 1,353 0.23 % 31.51 130 0.26 % 32.86 23,996 0.50 % 75.47 najbrž najbrž najb rž 123,679 0.67 % 109 0 0 % 0 13,632 2.38 % 343.24 60,518 0.67 % 111.51 24,671 0.72 % 131.64 4,216 0.70 % 98.20 369 0.73 % 93.28 20,273 0.42 % 63.76 koli koli koli 107,068 0.58 % 94.36 0 0 % 0 6,574 1.15 % 165.53 49,660 0.55 % 91.50 14,368 0.42 % 76.66 5,246 0.87 % 122.19 962 1.91 % 243.18 30,258 0.63 % 95.17 skorajda skorajda skor ajda 58,291 0.32 % 51.37 0 0 % 0 2,478 0.43 % 62.39 29,411 0.33 % 54.19 12,753 0.37 % 68.05 1,690 0.28 % 39.36 70 0.14 % 17.70 11,889 0.25 % 37.39 bržkone bržkone bržk one 28,045 0.15 % 24.72 0 0 % 0 720 0.13 % 18.13 17,758 0.20 % 32.72 3,128 0.09 % 16.69 773 0.13 % 18 21 0.04 % 5.31 5,645 0.12 % 17.75 nemara nemara nema ra 24,208 0.13 % 21.33 0 0 % 0 2,593 0.45 % 65.29 12,101 0.13 % 22.30 4,123 0.12 % 22 2,031 0.34 % 47.30 37 0.07 % 9.35 3,323 0.07 % 10.45 morebiti morebiti more biti 23,169 0.12 % 20.42 0 0 % 0 1,693 0.29 % 42.63 12,592 0.14 % 23.20 3,410 0.10 % 18.19 787 0.13 % 18.33 33 0.07 % 8.34 4,654 0.10 % 14.64 nikar nikar nika r 21,998 0.12 % 19.39 0 0 % 0 2,458 0.43 % 61.89 8,133 0.09 % 14.99 7,672 0.23 % 40.94 1,134 0.19 % 26.41 47 0.09 % 11.88 2,554 0.05 % 8.03 kajpak kajpak kajp ak 21,308 0.12 % 18.78 0 0 % 0 482 0.08 % 12.14 14,047 0.15 % 25.88 2,845 0.08 % 15.18 214 0.04 % 4.98 5 0.01 % 1.26 3,715 0.08 % 11.68 
kvečjemu kvečjemu kveč jemu 16,808 0.09 % 14.81 0 0 % 0 733 0.13 % 18.46 8,898 0.10 % 16.40 3,245 0.10 % 17.31 571 0.10 % 13.30 48 0.10 % 12.13 3,313 0.07 % 10.42 domala domala doma la 16,383 0.09 % 14.44 0 0 % 0 282 0.05 % 7.10 10,557 0.12 % 19.45 2,066 0.06 % 11.02 352 0.06 % 8.20 10 0.02 % 2.53 3,116 0.07 % 9.80 edino edino edin o 15,218 0.08 % 13.41 0 0 % 0 1,096 0.19 % 27.60 7,644 0.08 % 14.08 2,894 0.09 % 15.44 646 0.11 % 15.05 65 0.13 % 16.43 2,873 0.06 % 9.04 kajne kajne kajn e 13,447 0.07 % 11.85 0 0 % 0 3,958 0.69 % 99.66 3,563 0.04 % 6.57 3,581 0.10 % 19.11 233 0.04 % 5.43 14 0.03 % 3.54 2,098 0.04 % 6.60 bržčas bržčas bržč as 12,496 0.07 % 11.01 0 0 % 0 79 0.01 % 1.99 8,473 0.09 % 15.61 1,322 0.04 % 7.05 139 0.02 % 3.24 0 0 % 0 2,483 0.05 % 7.81 CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 201 File at CLARIN.SI 1.2.185 List of initial character-level 5-grams from particle lemmas in the Gigafida 2.0 corpus with text-type distributionGF2.0-word_parts-particles-lemmas-initial- 5grams-taxonomy-entire.tsvLemma Lemma (lower-case) Initial part of the word Rest of the word Total absolute frequency of lemma Percentage of all found lemmas Total relative frequency (per million occurrences) Absolute frequency [SSJ.T.K.N] Percentage [SSJ.T.K.N] Relative frequency [SSJ.T.K.N] Absolute frequency [SSJ.T.K.L] Percentage [SSJ.T.K.L] Relative frequency [SSJ.T.K.L] Absolute frequency [SSJ.T.P.C] Percentage [SSJ.T.P.C] Relative frequency [SSJ.T.P.C] Absolute frequency [SSJ.T.P.R] Percentage [SSJ.T.P.R] Relative frequency [SSJ.T.P.R] Absolute frequency [SSJ.T.K.S] Percentage [SSJ.T.K.S] Relative frequency [SSJ.T.K.S] Absolute frequency [SSJ.T.D] Percentage [SSJ.T.D] Relative frequency [SSJ.T.D] Absolute frequency [SSJ.I] Percentage [SSJ.I] Relative frequency [SSJ.I] sicer sicer sicer 1,078,123 18.85 % 950.14 0 0 % 0 15,993 7.74 % 402.69 490,602 17.64 % 903.97 157,744 14.14 % 841.67 18,839 10.60 % 438.78 1,842 13.17 % 465.64 393,103 27.62 % 
1,236.41 predvsem predvsem predv sem 844,382 14.77 % 744.15 0 0 % 0 5,733 2.77 % 144.35 442,044 15.89 % 814.50 160,555 14.39 % 856.67 27,431 15.43 % 638.90 1,744 12.47 % 440.86 206,875 14.53 % 650.68 seveda seveda seved a 725,852 12.69 % 639.69 0 0 % 0 24,491 11.85 % 616.66 363,766 13.08 % 670.26 184,814 16.57 % 986.11 20,809 11.71 % 484.67 2,801 20.03 % 708.06 129,171 9.07 % 406.28 skoraj skoraj skora j 528,454 9.24 % 465.72 3 60.00 % 309.06 26,418 12.79 % 665.18 253,737 9.12 % 467.53 93,065 8.34 % 496.56 15,802 8.89 % 368.05 828 5.92 % 209.31 138,601 9.74 % 435.94 morda morda morda 468,348 8.19 % 412.75 0 0 % 0 25,811 12.49 % 649.89 217,144 7.81 % 400.10 99,850 8.95 % 532.77 21,212 11.93 % 494.05 1,177 8.41 % 297.53 103,154 7.25 % 324.45 sploh sploh sploh 419,675 7.34 % 369.86 0 0 % 0 30,505 14.76 % 768.08 192,378 6.92 % 354.47 87,831 7.87 % 468.64 13,018 7.32 % 303.20 1,240 8.87 % 313.46 94,703 6.65 % 297.87 zgolj zgolj zgolj 254,436 4.45 % 224.23 0 0 % 0 5,389 2.61 % 135.69 111,593 4.01 % 205.62 38,423 3.44 % 205.01 9,377 5.28 % 218.40 546 3.90 % 138.02 89,108 6.26 % 280.27 ravno ravno ravno 247,650 4.33 % 218.25 1 20.00 % 103.02 15,200 7.36 % 382.72 105,439 3.79 % 194.28 58,189 5.22 % 310.48 7,227 4.07 % 168.33 782 5.59 % 197.68 60,812 4.27 % 191.27 zlasti zlasti zlast i 242,203 4.24 % 213.45 0 0 % 0 3,265 1.58 % 82.21 138,036 4.96 % 254.34 36,842 3.30 % 196.58 16,342 9.19 % 380.62 908 6.49 % 229.53 46,810 3.29 % 147.23 pravzaprav pravzaprav pravz aprav 198,500 3.47 % 174.94 0 0 % 0 11,349 5.49 % 285.76 93,357 3.36 % 172.02 50,059 4.49 % 267.10 7,342 4.13 % 171 774 5.53 % 195.66 35,619 2.50 % 112.03 vsekakor vsekakor vseka kor 159,819 2.79 % 140.85 1 20.00 % 103.02 3,884 1.88 % 97.79 83,779 3.01 % 154.37 38,563 3.46 % 205.76 4,333 2.44 % 100.92 385 2.75 % 97.32 28,874 2.03 % 90.82 menda menda menda 132,347 2.31 % 116.64 0 0 % 0 4,569 2.21 % 115.04 76,040 2.73 % 140.11 26,259 2.35 % 140.11 1,353 0.76 % 31.51 130 0.93 % 32.86 23,996 1.69 % 75.47 najbrž najbrž 
najbr ž 123,679 2.16 % 109 0 0 % 0 13,632 6.60 % 343.24 60,518 2.18 % 111.51 24,671 2.21 % 131.64 4,216 2.37 % 98.20 369 2.64 % 93.28 20,273 1.42 % 63.76 skorajda skorajda skora jda 58,291 1.02 % 51.37 0 0 % 0 2,478 1.20 % 62.39 29,411 1.06 % 54.19 12,753 1.14 % 68.05 1,690 0.95 % 39.36 70 0.50 % 17.70 11,889 0.83 % 37.39 bržkone bržkone bržko ne 28,045 0.49 % 24.72 0 0 % 0 720 0.35 % 18.13 17,758 0.64 % 32.72 3,128 0.28 % 16.69 773 0.43 % 18 21 0.15 % 5.31 5,645 0.40 % 17.75 nemara nemara nemar a 24,208 0.42 % 21.33 0 0 % 0 2,593 1.25 % 65.29 12,101 0.43 % 22.30 4,123 0.37 % 22 2,031 1.14 % 47.30 37 0.27 % 9.35 3,323 0.23 % 10.45 morebiti morebiti moreb iti 23,169 0.41 % 20.42 0 0 % 0 1,693 0.82 % 42.63 12,592 0.45 % 23.20 3,410 0.31 % 18.19 787 0.44 % 18.33 33 0.24 % 8.34 4,654 0.33 % 14.64 nikar nikar nikar 21,998 0.39 % 19.39 0 0 % 0 2,458 1.19 % 61.89 8,133 0.29 % 14.99 7,672 0.69 % 40.94 1,134 0.64 % 26.41 47 0.34 % 11.88 2,554 0.18 % 8.03 kajpak kajpak kajpa k 21,308 0.37 % 18.78 0 0 % 0 482 0.23 % 12.14 14,047 0.51 % 25.88 2,845 0.26 % 15.18 214 0.12 % 4.98 5 0.04 % 1.26 3,715 0.26 % 11.68 kvečjemu kvečjemu kvečj emu 16,808 0.29 % 14.81 0 0 % 0 733 0.35 % 18.46 8,898 0.32 % 16.40 3,245 0.29 % 17.31 571 0.32 % 13.30 48 0.34 % 12.13 3,313 0.23 % 10.42 domala domala domal a 16,383 0.29 % 14.44 0 0 % 0 282 0.14 % 7.10 10,557 0.38 % 19.45 2,066 0.18 % 11.02 352 0.20 % 8.20 10 0.07 % 2.53 3,116 0.22 % 9.80 edino edino edino 15,218 0.27 % 13.41 0 0 % 0 1,096 0.53 % 27.60 7,644 0.28 % 14.08 2,894 0.26 % 15.44 646 0.36 % 15.05 65 0.47 % 16.43 2,873 0.20 % 9.04 kajne kajne kajne 13,447 0.23 % 11.85 0 0 % 0 3,958 1.92 % 99.66 3,563 0.13 % 6.57 3,581 0.32 % 19.11 233 0.13 % 5.43 14 0.10 % 3.54 2,098 0.15 % 6.60 bržčas bržčas bržča s 12,496 0.22 % 11.01 0 0 % 0 79 0.04 % 1.99 8,473 0.30 % 15.61 1,322 0.12 % 7.05 139 0.08 % 3.24 0 0 % 0 2,483 0.17 % 7.81 bojda bojda bojda 8,067 0.14 % 7.11 0 0 % 0 258 0.12 % 6.50 3,899 0.14 % 7.18 2,303 0.21 % 12.29 66 0.04 % 1.54 3 0.02 
% 0.76 1,538 0.11 % 4.84 kajpada kajpada kajpa da 6,710 0.12 % 5.91 0 0 % 0 432 0.21 % 10.88 3,814 0.14 % 7.03 1,288 0.12 % 6.87 393 0.22 % 9.15 3 0.02 % 0.76 780 0.06 % 2.45 kakopak kakopak kakop ak 6,398 0.11 % 5.64 0 0 % 0 142 0.07 % 3.58 1,685 0.06 % 3.10 3,126 0.28 % 16.68 44 0.03 % 1.02 0 0 % 0 1,401 0.10 % 4.41 malodane malodane malod ane 5,717 0.10 % 5.04 0 0 % 0 498 0.24 % 12.54 2,960 0.11 % 5.45 1,147 0.10 % 6.12 118 0.07 % 2.75 3 0.02 % 0.76 991 0.07 % 3.12 malone malone malon e 4,179 0.07 % 3.68 0 0 % 0 664 0.32 % 16.72 2,087 0.07 % 3.85 555 0.05 % 2.96 313 0.18 % 7.29 21 0.15 % 5.31 539 0.04 % 1.70 takorekoč takorekoč takor ekoč 3,914 0.07 % 3.45 0 0 % 0 114 0.06 % 2.87 1,579 0.06 % 2.91 1,331 0.12 % 7.10 264 0.15 % 6.15 42 0.30 % 10.62 584 0.04 % 1.84 edinole edinole edino le 2,874 0.05 % 2.53 0 0 % 0 444 0.21 % 11.18 1,180 0.04 % 2.17 611 0.06 % 3.26 342 0.19 % 7.97 29 0.21 % 7.33 268 0.02 % 0.84 kratkomalo kratkomalo kratk omalo 1,141 0.02 % 1.01 0 0 % 0 90 0.04 % 2.27 627 0.02 % 1.16 318 0.03 % 1.70 59 0.03 % 1.37 3 0.02 % 0.76 44 0 % 0.14 CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 202 File at CLARIN.SI 1.2.186 List of final character-level 1-grams from particle lemmas in the Gigafida 2.0 corpus with text-type distributionGF2.0-word_parts-particles-lemmas-final- 1grams-taxonomy-entire.tsvLemma Lemma (lower-case) Rest of the word Final part of the word Total absolute frequency of lemma Percentage of all found lemmas Total relative frequency (per million occurrences) Absolute frequency [SSJ.T.K.N] Percentage [SSJ.T.K.N] Relative frequency [SSJ.T.K.N] Absolute frequency [SSJ.T.K.L] Percentage [SSJ.T.K.L] Relative frequency [SSJ.T.K.L] Absolute frequency [SSJ.T.P.C] Percentage [SSJ.T.P.C] Relative frequency [SSJ.T.P.C] Absolute frequency [SSJ.T.P.R] Percentage [SSJ.T.P.R] Relative frequency [SSJ.T.P.R] Absolute frequency [SSJ.T.K.S] Percentage [SSJ.T.K.S] Relative frequency [SSJ.T.K.S] Absolute frequency [SSJ.T.D] 
Percentage [SSJ.T.D] Relative frequency [SSJ.T.D] Absolute frequency [SSJ.I] Percentage [SSJ.I] Relative frequency [SSJ.I] tudi tudi tud i 8,478,793 21.38 % 7,472.32 21 18.75 % 2,163.39 136,549 9.19 % 3,438.16 4,214,730 21.79 % 7,765.92 1,465,007 20.28 % 7,816.80 263,011 21.31 % 6,125.84 23,030 20.53 % 5,821.74 2,376,445 23.18 % 7,474.54 ne ne n e 6,734,994 16.98 % 5,935.52 44 39.29 % 4,532.81 374,116 25.18 % 9,419.84 3,104,942 16.05 % 5,721.06 1,318,655 18.25 % 7,035.91 259,075 20.99 % 6,034.16 30,064 26.80 % 7,599.85 1,648,098 16.08 % 5,183.70 še še š e 5,787,718 14.60 % 5,100.69 24 21.43 % 2,472.44 216,168 14.55 % 5,442.88 2,847,027 14.72 % 5,245.84 962,021 13.32 % 5,133.03 135,437 10.97 % 3,154.49 11,596 10.34 % 2,931.34 1,615,445 15.76 % 5,081 že že ž e 3,735,419 9.42 % 3,292.01 6 5.36 % 618.11 131,252 8.84 % 3,304.78 1,898,346 9.81 % 3,497.83 608,889 8.43 % 3,248.83 77,192 6.25 % 1,797.89 7,450 6.64 % 1,883.28 1,012,284 9.87 % 3,183.90 le le l e 2,165,776 5.46 % 1,908.69 0 0 % 0 58,645 3.95 % 1,476.62 1,056,697 5.46 % 1,947.03 439,209 6.08 % 2,343.47 79,149 6.41 % 1,843.47 4,405 3.93 % 1,113.54 527,671 5.15 % 1,659.66 naj naj na j 1,598,002 4.03 % 1,408.31 2 1.79 % 206.04 35,826 2.41 % 902.06 847,958 4.38 % 1,562.42 235,452 3.26 % 1,256.29 47,034 3.81 % 1,095.48 3,314 2.95 % 837.74 428,416 4.18 % 1,347.48 prav prav pra v 1,208,290 3.05 % 1,064.86 1 0.89 % 103.02 52,405 3.53 % 1,319.50 578,438 2.99 % 1,065.81 228,510 3.16 % 1,219.25 41,863 3.39 % 975.04 2,593 2.31 % 655.48 304,480 2.97 % 957.67 sicer sicer sice r 1,078,123 2.72 % 950.14 0 0 % 0 15,993 1.08 % 402.69 490,602 2.54 % 903.97 157,744 2.18 % 841.67 18,839 1.53 % 438.78 1,842 1.64 % 465.64 393,103 3.83 % 1,236.41 samo samo sam o 940,204 2.37 % 828.60 6 5.36 % 618.11 76,915 5.18 % 1,936.64 434,798 2.25 % 801.14 194,492 2.69 % 1,037.75 44,828 3.63 % 1,044.10 4,906 4.37 % 1,240.18 184,259 1.80 % 579.54 predvsem predvsem predvse m 844,382 2.13 % 744.15 0 0 % 0 5,733 0.39 % 144.35 442,044 2.29 % 814.50 
160,555 2.22 % 856.67 27,431 2.22 % 638.90 1,744 1.55 % 440.86 206,875 2.02 % 650.68 seveda seveda seved a 725,852 1.83 % 639.69 0 0 % 0 24,491 1.65 % 616.66 363,766 1.88 % 670.26 184,814 2.56 % 986.11 20,809 1.69 % 484.67 2,801 2.50 % 708.06 129,171 1.26 % 406.28 celo celo cel o 646,664 1.63 % 569.90 0 0 % 0 27,288 1.84 % 687.08 317,242 1.64 % 584.54 135,530 1.88 % 723.14 27,137 2.20 % 632.05 1,415 1.26 % 357.70 138,052 1.35 % 434.21 skoraj skoraj skora j 528,454 1.33 % 465.72 3 2.68 % 309.06 26,418 1.78 % 665.18 253,737 1.31 % 467.53 93,065 1.29 % 496.56 15,802 1.28 % 368.05 828 0.74 % 209.31 138,601 1.35 % 435.94 več več ve č 520,542 1.31 % 458.75 0 0 % 0 37,136 2.50 % 935.04 242,295 1.25 % 446.44 94,562 1.31 % 504.55 18,310 1.48 % 426.46 1,689 1.51 % 426.96 126,550 1.23 % 398.03 vsaj vsaj vsa j 507,625 1.28 % 447.37 0 0 % 0 20,218 1.36 % 509.07 254,444 1.31 % 468.83 102,076 1.41 % 544.64 15,753 1.28 % 366.91 1,432 1.28 % 361.99 113,702 1.11 % 357.62 morda morda mord a 468,348 1.18 % 412.75 0 0 % 0 25,811 1.74 % 649.89 217,144 1.12 % 400.10 99,850 1.38 % 532.77 21,212 1.72 % 494.05 1,177 1.05 % 297.53 103,154 1.01 % 324.45 niti niti nit i 458,790 1.16 % 404.33 1 0.89 % 103.02 31,303 2.11 % 788.18 220,261 1.14 % 405.85 80,301 1.11 % 428.46 11,945 0.97 % 278.21 1,088 0.97 % 275.03 113,891 1.11 % 358.22 sploh sploh splo h 419,675 1.06 % 369.86 0 0 % 0 30,505 2.05 % 768.08 192,378 0.99 % 354.47 87,831 1.22 % 468.64 13,018 1.05 % 303.20 1,240 1.10 % 313.46 94,703 0.92 % 297.87 šele šele šel e 388,144 0.98 % 342.07 2 1.79 % 206.04 13,326 0.90 % 335.53 199,240 1.03 % 367.11 69,131 0.96 % 368.86 13,716 1.11 % 319.46 803 0.72 % 202.99 91,926 0.90 % 289.13 pač pač pa č 294,972 0.74 % 259.96 0 0 % 0 13,015 0.88 % 327.70 148,590 0.77 % 273.79 66,321 0.92 % 353.87 8,471 0.69 % 197.30 2,120 1.89 % 535.91 56,455 0.55 % 177.57 zgolj zgolj zgol j 254,436 0.64 % 224.23 0 0 % 0 5,389 0.36 % 135.69 111,593 0.58 % 205.62 38,423 0.53 % 205.01 9,377 0.76 % 218.40 546 0.49 % 138.02 
89,108 0.87 % 280.27 ravno ravno ravn o 247,650 0.62 % 218.25 1 0.89 % 103.02 15,200 1.02 % 382.72 105,439 0.55 % 194.28 58,189 0.81 % 310.48 7,227 0.59 % 168.33 782 0.70 % 197.68 60,812 0.59 % 191.27 zlasti zlasti zlast i 242,203 0.61 % 213.45 0 0 % 0 3,265 0.22 % 82.21 138,036 0.71 % 254.34 36,842 0.51 % 196.58 16,342 1.32 % 380.62 908 0.81 % 229.53 46,810 0.46 % 147.23 pravzaprav pravzaprav pravzapra v 198,500 0.50 % 174.94 0 0 % 0 11,349 0.76 % 285.76 93,357 0.48 % 172.02 50,059 0.69 % 267.10 7,342 0.59 % 171 774 0.69 % 195.66 35,619 0.35 % 112.03 vsekakor vsekakor vsekako r 159,819 0.40 % 140.85 1 0.89 % 103.02 3,884 0.26 % 97.79 83,779 0.43 % 154.37 38,563 0.53 % 205.76 4,333 0.35 % 100.92 385 0.34 % 97.32 28,874 0.28 % 90.82 no no n o 142,860 0.36 % 125.90 0 0 % 0 19,680 1.32 % 495.52 52,215 0.27 % 96.21 44,047 0.61 % 235.02 3,129 0.25 % 72.88 501 0.45 % 126.65 23,288 0.23 % 73.25 menda menda mend a 132,347 0.33 % 116.64 0 0 % 0 4,569 0.31 % 115.04 76,040 0.39 % 140.11 26,259 0.36 % 140.11 1,353 0.11 % 31.51 130 0.12 % 32.86 23,996 0.23 % 75.47 najbrž najbrž najbr ž 123,679 0.31 % 109 0 0 % 0 13,632 0.92 % 343.24 60,518 0.31 % 111.51 24,671 0.34 % 131.64 4,216 0.34 % 98.20 369 0.33 % 93.28 20,273 0.20 % 63.76 koli koli kol i 107,068 0.27 % 94.36 0 0 % 0 6,574 0.44 % 165.53 49,660 0.26 % 91.50 14,368 0.20 % 76.66 5,246 0.42 % 122.19 962 0.86 % 243.18 30,258 0.29 % 95.17 ja ja j a 87,056 0.22 % 76.72 0 0 % 0 19,465 1.31 % 490.11 27,323 0.14 % 50.34 26,187 0.36 % 139.73 1,897 0.15 % 44.18 332 0.30 % 83.93 11,852 0.12 % 37.28 češ češ če š 87,039 0.22 % 76.71 0 0 % 0 2,631 0.18 % 66.25 51,345 0.27 % 94.61 11,944 0.17 % 63.73 1,723 0.14 % 40.13 343 0.31 % 86.71 19,053 0.19 % 59.93 skorajda skorajda skorajd a 58,291 0.15 % 51.37 0 0 % 0 2,478 0.17 % 62.39 29,411 0.15 % 54.19 12,753 0.18 % 68.05 1,690 0.14 % 39.36 70 0.06 % 17.70 11,889 0.12 % 37.39 CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 203 File at CLARIN.SI 1.2.187 List of 
final character-level 2-grams from particle lemmas in the Gigafida 2.0 corpus with text-type distributionGF2.0-word_parts-particles-lemmas-final- 2grams-taxonomy-entire.tsvLemma Lemma (lower-case) Rest of the word Final part of the word Total absolute frequency of lemma Percentage of all found lemmas Total relative frequency (per million occurrences) Absolute frequency [SSJ.T.K.N] Percentage [SSJ.T.K.N] Relative frequency [SSJ.T.K.N] Absolute frequency [SSJ.T.K.L] Percentage [SSJ.T.K.L] Relative frequency [SSJ.T.K.L] Absolute frequency [SSJ.T.P.C] Percentage [SSJ.T.P.C] Relative frequency [SSJ.T.P.C] Absolute frequency [SSJ.T.P.R] Percentage [SSJ.T.P.R] Relative frequency [SSJ.T.P.R] Absolute frequency [SSJ.T.K.S] Percentage [SSJ.T.K.S] Relative frequency [SSJ.T.K.S] Absolute frequency [SSJ.T.D] Percentage [SSJ.T.D] Relative frequency [SSJ.T.D] Absolute frequency [SSJ.I] Percentage [SSJ.I] Relative frequency [SSJ.I] tudi tudi tu di 8,478,793 21.38 % 7,472.32 21 18.75 % 2,163.39 136,549 9.19 % 3,438.16 4,214,730 21.79 % 7,765.92 1,465,007 20.28 % 7,816.80 263,011 21.31 % 6,125.84 23,030 20.53 % 5,821.74 2,376,445 23.18 % 7,474.54 ne ne ne 6,734,994 16.98 % 5,935.52 44 39.29 % 4,532.81 374,116 25.18 % 9,419.84 3,104,942 16.05 % 5,721.06 1,318,655 18.25 % 7,035.91 259,075 20.99 % 6,034.16 30,064 26.80 % 7,599.85 1,648,098 16.08 % 5,183.70 še še še 5,787,718 14.60 % 5,100.69 24 21.43 % 2,472.44 216,168 14.55 % 5,442.88 2,847,027 14.72 % 5,245.84 962,021 13.32 % 5,133.03 135,437 10.97 % 3,154.49 11,596 10.34 % 2,931.34 1,615,445 15.76 % 5,081 že že že 3,735,419 9.42 % 3,292.01 6 5.36 % 618.11 131,252 8.84 % 3,304.78 1,898,346 9.81 % 3,497.83 608,889 8.43 % 3,248.83 77,192 6.25 % 1,797.89 7,450 6.64 % 1,883.28 1,012,284 9.87 % 3,183.90 le le le 2,165,776 5.46 % 1,908.69 0 0 % 0 58,645 3.95 % 1,476.62 1,056,697 5.46 % 1,947.03 439,209 6.08 % 2,343.47 79,149 6.41 % 1,843.47 4,405 3.93 % 1,113.54 527,671 5.15 % 1,659.66 naj naj n aj 1,598,002 4.03 % 1,408.31 2 1.79 % 206.04 
35,826 2.41 % 902.06 847,958 4.38 % 1,562.42 235,452 3.26 % 1,256.29 47,034 3.81 % 1,095.48 3,314 2.95 % 837.74 428,416 4.18 % 1,347.48 prav prav pr av 1,208,290 3.05 % 1,064.86 1 0.89 % 103.02 52,405 3.53 % 1,319.50 578,438 2.99 % 1,065.81 228,510 3.16 % 1,219.25 41,863 3.39 % 975.04 2,593 2.31 % 655.48 304,480 2.97 % 957.67 sicer sicer sic er 1,078,123 2.72 % 950.14 0 0 % 0 15,993 1.08 % 402.69 490,602 2.54 % 903.97 157,744 2.18 % 841.67 18,839 1.53 % 438.78 1,842 1.64 % 465.64 393,103 3.83 % 1,236.41 samo samo sa mo 940,204 2.37 % 828.60 6 5.36 % 618.11 76,915 5.18 % 1,936.64 434,798 2.25 % 801.14 194,492 2.69 % 1,037.75 44,828 3.63 % 1,044.10 4,906 4.37 % 1,240.18 184,259 1.80 % 579.54 predvsem predvsem predvs em 844,382 2.13 % 744.15 0 0 % 0 5,733 0.39 % 144.35 442,044 2.29 % 814.50 160,555 2.22 % 856.67 27,431 2.22 % 638.90 1,744 1.55 % 440.86 206,875 2.02 % 650.68 seveda seveda seve da 725,852 1.83 % 639.69 0 0 % 0 24,491 1.65 % 616.66 363,766 1.88 % 670.26 184,814 2.56 % 986.11 20,809 1.69 % 484.67 2,801 2.50 % 708.06 129,171 1.26 % 406.28 celo celo ce lo 646,664 1.63 % 569.90 0 0 % 0 27,288 1.84 % 687.08 317,242 1.64 % 584.54 135,530 1.88 % 723.14 27,137 2.20 % 632.05 1,415 1.26 % 357.70 138,052 1.35 % 434.21 skoraj skoraj skor aj 528,454 1.33 % 465.72 3 2.68 % 309.06 26,418 1.78 % 665.18 253,737 1.31 % 467.53 93,065 1.29 % 496.56 15,802 1.28 % 368.05 828 0.74 % 209.31 138,601 1.35 % 435.94 več več v eč 520,542 1.31 % 458.75 0 0 % 0 37,136 2.50 % 935.04 242,295 1.25 % 446.44 94,562 1.31 % 504.55 18,310 1.48 % 426.46 1,689 1.51 % 426.96 126,550 1.23 % 398.03 vsaj vsaj vs aj 507,625 1.28 % 447.37 0 0 % 0 20,218 1.36 % 509.07 254,444 1.31 % 468.83 102,076 1.41 % 544.64 15,753 1.28 % 366.91 1,432 1.28 % 361.99 113,702 1.11 % 357.62 morda morda mor da 468,348 1.18 % 412.75 0 0 % 0 25,811 1.74 % 649.89 217,144 1.12 % 400.10 99,850 1.38 % 532.77 21,212 1.72 % 494.05 1,177 1.05 % 297.53 103,154 1.01 % 324.45 niti niti ni ti 458,790 1.16 % 404.33 1 0.89 % 103.02 
31,303 2.11 % 788.18 220,261 1.14 % 405.85 80,301 1.11 % 428.46 11,945 0.97 % 278.21 1,088 0.97 % 275.03 113,891 1.11 % 358.22 sploh sploh spl oh 419,675 1.06 % 369.86 0 0 % 0 30,505 2.05 % 768.08 192,378 0.99 % 354.47 87,831 1.22 % 468.64 13,018 1.05 % 303.20 1,240 1.10 % 313.46 94,703 0.92 % 297.87 šele šele še le 388,144 0.98 % 342.07 2 1.79 % 206.04 13,326 0.90 % 335.53 199,240 1.03 % 367.11 69,131 0.96 % 368.86 13,716 1.11 % 319.46 803 0.72 % 202.99 91,926 0.90 % 289.13 pač pač p ač 294,972 0.74 % 259.96 0 0 % 0 13,015 0.88 % 327.70 148,590 0.77 % 273.79 66,321 0.92 % 353.87 8,471 0.69 % 197.30 2,120 1.89 % 535.91 56,455 0.55 % 177.57 zgolj zgolj zgo lj 254,436 0.64 % 224.23 0 0 % 0 5,389 0.36 % 135.69 111,593 0.58 % 205.62 38,423 0.53 % 205.01 9,377 0.76 % 218.40 546 0.49 % 138.02 89,108 0.87 % 280.27 ravno ravno rav no 247,650 0.62 % 218.25 1 0.89 % 103.02 15,200 1.02 % 382.72 105,439 0.55 % 194.28 58,189 0.81 % 310.48 7,227 0.59 % 168.33 782 0.70 % 197.68 60,812 0.59 % 191.27 zlasti zlasti zlas ti 242,203 0.61 % 213.45 0 0 % 0 3,265 0.22 % 82.21 138,036 0.71 % 254.34 36,842 0.51 % 196.58 16,342 1.32 % 380.62 908 0.81 % 229.53 46,810 0.46 % 147.23 pravzaprav pravzaprav pravzapr av 198,500 0.50 % 174.94 0 0 % 0 11,349 0.76 % 285.76 93,357 0.48 % 172.02 50,059 0.69 % 267.10 7,342 0.59 % 171 774 0.69 % 195.66 35,619 0.35 % 112.03 vsekakor vsekakor vsekak or 159,819 0.40 % 140.85 1 0.89 % 103.02 3,884 0.26 % 97.79 83,779 0.43 % 154.37 38,563 0.53 % 205.76 4,333 0.35 % 100.92 385 0.34 % 97.32 28,874 0.28 % 90.82 no no no 142,860 0.36 % 125.90 0 0 % 0 19,680 1.32 % 495.52 52,215 0.27 % 96.21 44,047 0.61 % 235.02 3,129 0.25 % 72.88 501 0.45 % 126.65 23,288 0.23 % 73.25 menda menda men da 132,347 0.33 % 116.64 0 0 % 0 4,569 0.31 % 115.04 76,040 0.39 % 140.11 26,259 0.36 % 140.11 1,353 0.11 % 31.51 130 0.12 % 32.86 23,996 0.23 % 75.47 najbrž najbrž najb rž 123,679 0.31 % 109 0 0 % 0 13,632 0.92 % 343.24 60,518 0.31 % 111.51 24,671 0.34 % 131.64 4,216 0.34 % 98.20 369 
0.33 % 93.28 20,273 0.20 % 63.76 koli koli ko li 107,068 0.27 % 94.36 0 0 % 0 6,574 0.44 % 165.53 49,660 0.26 % 91.50 14,368 0.20 % 76.66 5,246 0.42 % 122.19 962 0.86 % 243.18 30,258 0.29 % 95.17 ja ja ja 87,056 0.22 % 76.72 0 0 % 0 19,465 1.31 % 490.11 27,323 0.14 % 50.34 26,187 0.36 % 139.73 1,897 0.15 % 44.18 332 0.30 % 83.93 11,852 0.12 % 37.28 češ češ č eš 87,039 0.22 % 76.71 0 0 % 0 2,631 0.18 % 66.25 51,345 0.27 % 94.61 11,944 0.17 % 63.73 1,723 0.14 % 40.13 343 0.31 % 86.71 19,053 0.19 % 59.93 skorajda skorajda skoraj da 58,291 0.15 % 51.37 0 0 % 0 2,478 0.17 % 62.39 29,411 0.15 % 54.19 12,753 0.18 % 68.05 1,690 0.14 % 39.36 70 0.06 % 17.70 11,889 0.12 % 37.39 CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 204 File at CLARIN.SI 1.2.188 List of final character-level 3-grams from particle lemmas in the Gigafida 2.0 corpus with text-type distributionGF2.0-word_parts-particles-lemmas-final- 3grams-taxonomy-entire.tsvLemma Lemma (lower-case) Rest of the word Final part of the word Total absolute frequency of lemma Percentage of all found lemmas Total relative frequency (per million occurrences) Absolute frequency [SSJ.T.K.N] Percentage [SSJ.T.K.N] Relative frequency [SSJ.T.K.N] Absolute frequency [SSJ.T.K.L] Percentage [SSJ.T.K.L] Relative frequency [SSJ.T.K.L] Absolute frequency [SSJ.T.P.C] Percentage [SSJ.T.P.C] Relative frequency [SSJ.T.P.C] Absolute frequency [SSJ.T.P.R] Percentage [SSJ.T.P.R] Relative frequency [SSJ.T.P.R] Absolute frequency [SSJ.T.K.S] Percentage [SSJ.T.K.S] Relative frequency [SSJ.T.K.S] Absolute frequency [SSJ.T.D] Percentage [SSJ.T.D] Relative frequency [SSJ.T.D] Absolute frequency [SSJ.I] Percentage [SSJ.I] Relative frequency [SSJ.I] tudi tudi t udi 8,478,793 40.41 % 7,472.32 21 55.26 % 2,163.39 136,549 20.61 % 3,438.16 4,214,730 40.71 % 7,765.92 1,465,007 38.33 % 7,816.80 263,011 38.81 % 6,125.84 23,030 39.85 % 5,821.74 2,376,445 43.91 % 7,474.54 naj naj naj 1,598,002 7.62 % 1,408.31 2 5.26 % 206.04 
35,826 5.41 % 902.06 847,958 8.19 % 1,562.42 235,452 6.16 % 1,256.29 47,034 6.94 % 1,095.48 3,314 5.73 % 837.74 428,416 7.92 % 1,347.48 prav prav p rav 1,208,290 5.76 % 1,064.86 1 2.63 % 103.02 52,405 7.91 % 1,319.50 578,438 5.59 % 1,065.81 228,510 5.98 % 1,219.25 41,863 6.18 % 975.04 2,593 4.49 % 655.48 304,480 5.63 % 957.67 sicer sicer si cer 1,078,123 5.14 % 950.14 0 0 % 0 15,993 2.41 % 402.69 490,602 4.74 % 903.97 157,744 4.13 % 841.67 18,839 2.78 % 438.78 1,842 3.19 % 465.64 393,103 7.26 % 1,236.41 samo samo s amo 940,204 4.48 % 828.60 6 15.79 % 618.11 76,915 11.61 % 1,936.64 434,798 4.20 % 801.14 194,492 5.09 % 1,037.75 44,828 6.61 % 1,044.10 4,906 8.49 % 1,240.18 184,259 3.40 % 579.54 predvsem predvsem predv sem 844,382 4.02 % 744.15 0 0 % 0 5,733 0.86 % 144.35 442,044 4.27 % 814.50 160,555 4.20 % 856.67 27,431 4.05 % 638.90 1,744 3.02 % 440.86 206,875 3.82 % 650.68 seveda seveda sev eda 725,852 3.46 % 639.69 0 0 % 0 24,491 3.70 % 616.66 363,766 3.51 % 670.26 184,814 4.84 % 986.11 20,809 3.07 % 484.67 2,801 4.85 % 708.06 129,171 2.39 % 406.28 celo celo c elo 646,664 3.08 % 569.90 0 0 % 0 27,288 4.12 % 687.08 317,242 3.06 % 584.54 135,530 3.55 % 723.14 27,137 4.00 % 632.05 1,415 2.45 % 357.70 138,052 2.55 % 434.21 skoraj skoraj sko raj 528,454 2.52 % 465.72 3 7.89 % 309.06 26,418 3.99 % 665.18 253,737 2.45 % 467.53 93,065 2.44 % 496.56 15,802 2.33 % 368.05 828 1.43 % 209.31 138,601 2.56 % 435.94 več več več 520,542 2.48 % 458.75 0 0 % 0 37,136 5.60 % 935.04 242,295 2.34 % 446.44 94,562 2.47 % 504.55 18,310 2.70 % 426.46 1,689 2.92 % 426.96 126,550 2.34 % 398.03 vsaj vsaj v saj 507,625 2.42 % 447.37 0 0 % 0 20,218 3.05 % 509.07 254,444 2.46 % 468.83 102,076 2.67 % 544.64 15,753 2.32 % 366.91 1,432 2.48 % 361.99 113,702 2.10 % 357.62 morda morda mo rda 468,348 2.23 % 412.75 0 0 % 0 25,811 3.90 % 649.89 217,144 2.10 % 400.10 99,850 2.61 % 532.77 21,212 3.13 % 494.05 1,177 2.04 % 297.53 103,154 1.91 % 324.45 niti niti n iti 458,790 2.19 % 404.33 1 2.63 % 103.02 
31,303 4.72 % 788.18 220,261 2.13 % 405.85 80,301 2.10 % 428.46 11,945 1.76 % 278.21 1,088 1.88 % 275.03 113,891 2.10 % 358.22 sploh sploh sp loh 419,675 2.00 % 369.86 0 0 % 0 30,505 4.60 % 768.08 192,378 1.86 % 354.47 87,831 2.30 % 468.64 13,018 1.92 % 303.20 1,240 2.15 % 313.46 94,703 1.75 % 297.87 šele šele š ele 388,144 1.85 % 342.07 2 5.26 % 206.04 13,326 2.01 % 335.53 199,240 1.93 % 367.11 69,131 1.81 % 368.86 13,716 2.02 % 319.46 803 1.39 % 202.99 91,926 1.70 % 289.13 pač pač pač 294,972 1.41 % 259.96 0 0 % 0 13,015 1.96 % 327.70 148,590 1.44 % 273.79 66,321 1.74 % 353.87 8,471 1.25 % 197.30 2,120 3.67 % 535.91 56,455 1.04 % 177.57 zgolj zgolj zg olj 254,436 1.21 % 224.23 0 0 % 0 5,389 0.81 % 135.69 111,593 1.08 % 205.62 38,423 1.00 % 205.01 9,377 1.38 % 218.40 546 0.94 % 138.02 89,108 1.65 % 280.27 ravno ravno ra vno 247,650 1.18 % 218.25 1 2.63 % 103.02 15,200 2.29 % 382.72 105,439 1.02 % 194.28 58,189 1.52 % 310.48 7,227 1.07 % 168.33 782 1.35 % 197.68 60,812 1.12 % 191.27 zlasti zlasti zla sti 242,203 1.15 % 213.45 0 0 % 0 3,265 0.49 % 82.21 138,036 1.33 % 254.34 36,842 0.96 % 196.58 16,342 2.41 % 380.62 908 1.57 % 229.53 46,810 0.86 % 147.23 pravzaprav pravzaprav pravzap rav 198,500 0.95 % 174.94 0 0 % 0 11,349 1.71 % 285.76 93,357 0.90 % 172.02 50,059 1.31 % 267.10 7,342 1.08 % 171 774 1.34 % 195.66 35,619 0.66 % 112.03 vsekakor vsekakor vseka kor 159,819 0.76 % 140.85 1 2.63 % 103.02 3,884 0.59 % 97.79 83,779 0.81 % 154.37 38,563 1.01 % 205.76 4,333 0.64 % 100.92 385 0.67 % 97.32 28,874 0.53 % 90.82 menda menda me nda 132,347 0.63 % 116.64 0 0 % 0 4,569 0.69 % 115.04 76,040 0.73 % 140.11 26,259 0.69 % 140.11 1,353 0.20 % 31.51 130 0.23 % 32.86 23,996 0.44 % 75.47 najbrž najbrž naj brž 123,679 0.59 % 109 0 0 % 0 13,632 2.06 % 343.24 60,518 0.58 % 111.51 24,671 0.65 % 131.64 4,216 0.62 % 98.20 369 0.64 % 93.28 20,273 0.38 % 63.76 koli koli k oli 107,068 0.51 % 94.36 0 0 % 0 6,574 0.99 % 165.53 49,660 0.48 % 91.50 14,368 0.38 % 76.66 5,246 0.77 % 122.19 
962 1.67 % 243.18 30,258 0.56 % 95.17 češ češ češ 87,039 0.41 % 76.71 0 0 % 0 2,631 0.40 % 66.25 51,345 0.50 % 94.61 11,944 0.31 % 63.73 1,723 0.25 % 40.13 343 0.59 % 86.71 19,053 0.35 % 59.93 skorajda skorajda skora jda 58,291 0.28 % 51.37 0 0 % 0 2,478 0.37 % 62.39 29,411 0.28 % 54.19 12,753 0.33 % 68.05 1,690 0.25 % 39.36 70 0.12 % 17.70 11,889 0.22 % 37.39 bržkone bržkone bržk one 28,045 0.13 % 24.72 0 0 % 0 720 0.11 % 18.13 17,758 0.17 % 32.72 3,128 0.08 % 16.69 773 0.11 % 18 21 0.04 % 5.31 5,645 0.10 % 17.75 nemara nemara nem ara 24,208 0.12 % 21.33 0 0 % 0 2,593 0.39 % 65.29 12,101 0.12 % 22.30 4,123 0.11 % 22 2,031 0.30 % 47.30 37 0.06 % 9.35 3,323 0.06 % 10.45 morebiti morebiti moreb iti 23,169 0.11 % 20.42 0 0 % 0 1,693 0.26 % 42.63 12,592 0.12 % 23.20 3,410 0.09 % 18.19 787 0.12 % 18.33 33 0.06 % 8.34 4,654 0.09 % 14.64 nikar nikar ni kar 21,998 0.10 % 19.39 0 0 % 0 2,458 0.37 % 61.89 8,133 0.08 % 14.99 7,672 0.20 % 40.94 1,134 0.17 % 26.41 47 0.08 % 11.88 2,554 0.05 % 8.03 kajpak kajpak kaj pak 21,308 0.10 % 18.78 0 0 % 0 482 0.07 % 12.14 14,047 0.14 % 25.88 2,845 0.07 % 15.18 214 0.03 % 4.98 5 0.01 % 1.26 3,715 0.07 % 11.68 kvečjemu kvečjemu kvečj emu 16,808 0.08 % 14.81 0 0 % 0 733 0.11 % 18.46 8,898 0.09 % 16.40 3,245 0.09 % 17.31 571 0.08 % 13.30 48 0.08 % 12.13 3,313 0.06 % 10.42 CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 205 File at CLARIN.SI 1.2.189 List of final character-level 4-grams from particle lemmas in the Gigafida 2.0 corpus with text-type distributionGF2.0-word_parts-particles-lemmas-final- 4grams-taxonomy-entire.tsvLemma Lemma (lower-case) Rest of the word Final part of the word Total absolute frequency of lemma Percentage of all found lemmas Total relative frequency (per million occurrences) Absolute frequency [SSJ.T.K.N] Percentage [SSJ.T.K.N] Relative frequency [SSJ.T.K.N] Absolute frequency [SSJ.T.K.L] Percentage [SSJ.T.K.L] Relative frequency [SSJ.T.K.L] Absolute frequency [SSJ.T.P.C] 
Percentage [SSJ.T.P.C] Relative frequency [SSJ.T.P.C] Absolute frequency [SSJ.T.P.R] Percentage [SSJ.T.P.R] Relative frequency [SSJ.T.P.R] Absolute frequency [SSJ.T.K.S] Percentage [SSJ.T.K.S] Relative frequency [SSJ.T.K.S] Absolute frequency [SSJ.T.D] Percentage [SSJ.T.D] Relative frequency [SSJ.T.D] Absolute frequency [SSJ.I] Percentage [SSJ.I] Relative frequency [SSJ.I] tudi tudi tudi 8,478,793 45.90 % 7,472.32 21 58.33 % 2,163.39 136,549 23.82 % 3,438.16 4,214,730 46.53 % 7,765.92 1,465,007 42.94 % 7,816.80 263,011 43.71 % 6,125.84 23,030 45.80 % 5,821.74 2,376,445 49.73 % 7,474.54 prav prav prav 1,208,290 6.54 % 1,064.86 1 2.78 % 103.02 52,405 9.14 % 1,319.50 578,438 6.39 % 1,065.81 228,510 6.70 % 1,219.25 41,863 6.96 % 975.04 2,593 5.16 % 655.48 304,480 6.37 % 957.67 sicer sicer s icer 1,078,123 5.84 % 950.14 0 0 % 0 15,993 2.79 % 402.69 490,602 5.42 % 903.97 157,744 4.62 % 841.67 18,839 3.13 % 438.78 1,842 3.66 % 465.64 393,103 8.23 % 1,236.41 samo samo samo 940,204 5.09 % 828.60 6 16.67 % 618.11 76,915 13.42 % 1,936.64 434,798 4.80 % 801.14 194,492 5.70 % 1,037.75 44,828 7.45 % 1,044.10 4,906 9.76 % 1,240.18 184,259 3.86 % 579.54 predvsem predvsem pred vsem 844,382 4.57 % 744.15 0 0 % 0 5,733 1.00 % 144.35 442,044 4.88 % 814.50 160,555 4.71 % 856.67 27,431 4.56 % 638.90 1,744 3.47 % 440.86 206,875 4.33 % 650.68 seveda seveda se veda 725,852 3.93 % 639.69 0 0 % 0 24,491 4.27 % 616.66 363,766 4.02 % 670.26 184,814 5.42 % 986.11 20,809 3.46 % 484.67 2,801 5.57 % 708.06 129,171 2.70 % 406.28 celo celo celo 646,664 3.50 % 569.90 0 0 % 0 27,288 4.76 % 687.08 317,242 3.50 % 584.54 135,530 3.97 % 723.14 27,137 4.51 % 632.05 1,415 2.81 % 357.70 138,052 2.89 % 434.21 skoraj skoraj sk oraj 528,454 2.86 % 465.72 3 8.33 % 309.06 26,418 4.61 % 665.18 253,737 2.80 % 467.53 93,065 2.73 % 496.56 15,802 2.63 % 368.05 828 1.65 % 209.31 138,601 2.90 % 435.94 vsaj vsaj vsaj 507,625 2.75 % 447.37 0 0 % 0 20,218 3.53 % 509.07 254,444 2.81 % 468.83 102,076 2.99 % 544.64 15,753 
2.62 % 366.91 1,432 2.85 % 361.99 113,702 2.38 % 357.62 morda morda m orda 468,348 2.54 % 412.75 0 0 % 0 25,811 4.50 % 649.89 217,144 2.40 % 400.10 99,850 2.93 % 532.77 21,212 3.52 % 494.05 1,177 2.34 % 297.53 103,154 2.16 % 324.45 niti niti niti 458,790 2.48 % 404.33 1 2.78 % 103.02 31,303 5.46 % 788.18 220,261 2.43 % 405.85 80,301 2.35 % 428.46 11,945 1.99 % 278.21 1,088 2.16 % 275.03 113,891 2.38 % 358.22 sploh sploh s ploh 419,675 2.27 % 369.86 0 0 % 0 30,505 5.32 % 768.08 192,378 2.12 % 354.47 87,831 2.58 % 468.64 13,018 2.16 % 303.20 1,240 2.47 % 313.46 94,703 1.98 % 297.87 šele šele šele 388,144 2.10 % 342.07 2 5.56 % 206.04 13,326 2.32 % 335.53 199,240 2.20 % 367.11 69,131 2.03 % 368.86 13,716 2.28 % 319.46 803 1.60 % 202.99 91,926 1.92 % 289.13 zgolj zgolj z golj 254,436 1.38 % 224.23 0 0 % 0 5,389 0.94 % 135.69 111,593 1.23 % 205.62 38,423 1.13 % 205.01 9,377 1.56 % 218.40 546 1.09 % 138.02 89,108 1.86 % 280.27 ravno ravno r avno 247,650 1.34 % 218.25 1 2.78 % 103.02 15,200 2.65 % 382.72 105,439 1.16 % 194.28 58,189 1.71 % 310.48 7,227 1.20 % 168.33 782 1.55 % 197.68 60,812 1.27 % 191.27 zlasti zlasti zl asti 242,203 1.31 % 213.45 0 0 % 0 3,265 0.57 % 82.21 138,036 1.52 % 254.34 36,842 1.08 % 196.58 16,342 2.72 % 380.62 908 1.81 % 229.53 46,810 0.98 % 147.23 pravzaprav pravzaprav pravza prav 198,500 1.07 % 174.94 0 0 % 0 11,349 1.98 % 285.76 93,357 1.03 % 172.02 50,059 1.47 % 267.10 7,342 1.22 % 171 774 1.54 % 195.66 35,619 0.74 % 112.03 vsekakor vsekakor vsek akor 159,819 0.86 % 140.85 1 2.78 % 103.02 3,884 0.68 % 97.79 83,779 0.93 % 154.37 38,563 1.13 % 205.76 4,333 0.72 % 100.92 385 0.77 % 97.32 28,874 0.60 % 90.82 menda menda m enda 132,347 0.72 % 116.64 0 0 % 0 4,569 0.80 % 115.04 76,040 0.84 % 140.11 26,259 0.77 % 140.11 1,353 0.23 % 31.51 130 0.26 % 32.86 23,996 0.50 % 75.47 najbrž najbrž na jbrž 123,679 0.67 % 109 0 0 % 0 13,632 2.38 % 343.24 60,518 0.67 % 111.51 24,671 0.72 % 131.64 4,216 0.70 % 98.20 369 0.73 % 93.28 20,273 0.42 % 63.76 koli 
koli koli 107,068 0.58 % 94.36 0 0 % 0 6,574 1.15 % 165.53 49,660 0.55 % 91.50 14,368 0.42 % 76.66 5,246 0.87 % 122.19 962 1.91 % 243.18 30,258 0.63 % 95.17
skorajda skorajda skor ajda 58,291 0.32 % 51.37 0 0 % 0 2,478 0.43 % 62.39 29,411 0.33 % 54.19 12,753 0.37 % 68.05 1,690 0.28 % 39.36 70 0.14 % 17.70 11,889 0.25 % 37.39
bržkone bržkone brž kone 28,045 0.15 % 24.72 0 0 % 0 720 0.13 % 18.13 17,758 0.20 % 32.72 3,128 0.09 % 16.69 773 0.13 % 18 21 0.04 % 5.31 5,645 0.12 % 17.75
nemara nemara ne mara 24,208 0.13 % 21.33 0 0 % 0 2,593 0.45 % 65.29 12,101 0.13 % 22.30 4,123 0.12 % 22 2,031 0.34 % 47.30 37 0.07 % 9.35 3,323 0.07 % 10.45
morebiti morebiti more biti 23,169 0.12 % 20.42 0 0 % 0 1,693 0.29 % 42.63 12,592 0.14 % 23.20 3,410 0.10 % 18.19 787 0.13 % 18.33 33 0.07 % 8.34 4,654 0.10 % 14.64
nikar nikar n ikar 21,998 0.12 % 19.39 0 0 % 0 2,458 0.43 % 61.89 8,133 0.09 % 14.99 7,672 0.23 % 40.94 1,134 0.19 % 26.41 47 0.09 % 11.88 2,554 0.05 % 8.03
kajpak kajpak ka jpak 21,308 0.12 % 18.78 0 0 % 0 482 0.08 % 12.14 14,047 0.15 % 25.88 2,845 0.08 % 15.18 214 0.04 % 4.98 5 0.01 % 1.26 3,715 0.08 % 11.68
kvečjemu kvečjemu kveč jemu 16,808 0.09 % 14.81 0 0 % 0 733 0.13 % 18.46 8,898 0.10 % 16.40 3,245 0.10 % 17.31 571 0.10 % 13.30 48 0.10 % 12.13 3,313 0.07 % 10.42
domala domala do mala 16,383 0.09 % 14.44 0 0 % 0 282 0.05 % 7.10 10,557 0.12 % 19.45 2,066 0.06 % 11.02 352 0.06 % 8.20 10 0.02 % 2.53 3,116 0.07 % 9.80
edino edino e dino 15,218 0.08 % 13.41 0 0 % 0 1,096 0.19 % 27.60 7,644 0.08 % 14.08 2,894 0.09 % 15.44 646 0.11 % 15.05 65 0.13 % 16.43 2,873 0.06 % 9.04
kajne kajne k ajne 13,447 0.07 % 11.85 0 0 % 0 3,958 0.69 % 99.66 3,563 0.04 % 6.57 3,581 0.10 % 19.11 233 0.04 % 5.43 14 0.03 % 3.54 2,098 0.04 % 6.60
bržčas bržčas br žčas 12,496 0.07 % 11.01 0 0 % 0 79 0.01 % 1.99 8,473 0.09 % 15.61 1,322 0.04 % 7.05 139 0.02 % 3.24 0 0 % 0 2,483 0.05 % 7.81

1.2.190 List of final character-level 5-grams from particle lemmas in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-word_parts-particles-lemmas-final-5grams-taxonomy-entire.tsv
Columns: Lemma; Lemma (lower-case); Rest of the word; Final part of the word; Total absolute frequency of lemma; Percentage of all found lemmas; Total relative frequency (per million occurrences); then Absolute frequency, Percentage, and Relative frequency for each of [SSJ.T.K.N], [SSJ.T.K.L], [SSJ.T.P.C], [SSJ.T.P.R], [SSJ.T.K.S], [SSJ.T.D], and [SSJ.I], in that order.

sicer sicer sicer 1,078,123 18.85 % 950.14 0 0 % 0 15,993 7.74 % 402.69 490,602 17.64 % 903.97 157,744 14.14 % 841.67 18,839 10.60 % 438.78 1,842 13.17 % 465.64 393,103 27.62 % 1,236.41
predvsem predvsem pre dvsem 844,382 14.77 % 744.15 0 0 % 0 5,733 2.77 % 144.35 442,044 15.89 % 814.50 160,555 14.39 % 856.67 27,431 15.43 % 638.90 1,744 12.47 % 440.86 206,875 14.53 % 650.68
seveda seveda s eveda 725,852 12.69 % 639.69 0 0 % 0 24,491 11.85 % 616.66 363,766 13.08 % 670.26 184,814 16.57 % 986.11 20,809 11.71 % 484.67 2,801 20.03 % 708.06 129,171 9.07 % 406.28
skoraj skoraj s koraj 528,454 9.24 % 465.72 3 60.00 % 309.06 26,418 12.79 % 665.18 253,737 9.12 % 467.53 93,065 8.34 % 496.56 15,802 8.89 % 368.05 828 5.92 % 209.31 138,601 9.74 % 435.94
morda morda morda 468,348 8.19 % 412.75 0 0 % 0 25,811 12.49 % 649.89 217,144 7.81 % 400.10 99,850 8.95 % 532.77 21,212 11.93 % 494.05 1,177 8.41 % 297.53 103,154 7.25 % 324.45
sploh sploh sploh 419,675 7.34 % 369.86 0 0 % 0 30,505 14.76 % 768.08 192,378 6.92 % 354.47 87,831 7.87 % 468.64 13,018 7.32 % 303.20
1,240 8.87 % 313.46 94,703 6.65 % 297.87 zgolj zgolj zgolj 254,436 4.45 % 224.23 0 0 % 0 5,389 2.61 % 135.69 111,593 4.01 % 205.62 38,423 3.44 % 205.01 9,377 5.28 % 218.40 546 3.90 % 138.02 89,108 6.26 % 280.27 ravno ravno ravno 247,650 4.33 % 218.25 1 20.00 % 103.02 15,200 7.36 % 382.72 105,439 3.79 % 194.28 58,189 5.22 % 310.48 7,227 4.07 % 168.33 782 5.59 % 197.68 60,812 4.27 % 191.27 zlasti zlasti z lasti 242,203 4.24 % 213.45 0 0 % 0 3,265 1.58 % 82.21 138,036 4.96 % 254.34 36,842 3.30 % 196.58 16,342 9.19 % 380.62 908 6.49 % 229.53 46,810 3.29 % 147.23 pravzaprav pravzaprav pravz aprav 198,500 3.47 % 174.94 0 0 % 0 11,349 5.49 % 285.76 93,357 3.36 % 172.02 50,059 4.49 % 267.10 7,342 4.13 % 171 774 5.53 % 195.66 35,619 2.50 % 112.03 vsekakor vsekakor vse kakor 159,819 2.79 % 140.85 1 20.00 % 103.02 3,884 1.88 % 97.79 83,779 3.01 % 154.37 38,563 3.46 % 205.76 4,333 2.44 % 100.92 385 2.75 % 97.32 28,874 2.03 % 90.82 menda menda menda 132,347 2.31 % 116.64 0 0 % 0 4,569 2.21 % 115.04 76,040 2.73 % 140.11 26,259 2.35 % 140.11 1,353 0.76 % 31.51 130 0.93 % 32.86 23,996 1.69 % 75.47 najbrž najbrž n ajbrž 123,679 2.16 % 109 0 0 % 0 13,632 6.60 % 343.24 60,518 2.18 % 111.51 24,671 2.21 % 131.64 4,216 2.37 % 98.20 369 2.64 % 93.28 20,273 1.42 % 63.76 skorajda skorajda sko rajda 58,291 1.02 % 51.37 0 0 % 0 2,478 1.20 % 62.39 29,411 1.06 % 54.19 12,753 1.14 % 68.05 1,690 0.95 % 39.36 70 0.50 % 17.70 11,889 0.83 % 37.39 bržkone bržkone br žkone 28,045 0.49 % 24.72 0 0 % 0 720 0.35 % 18.13 17,758 0.64 % 32.72 3,128 0.28 % 16.69 773 0.43 % 18 21 0.15 % 5.31 5,645 0.40 % 17.75 nemara nemara n emara 24,208 0.42 % 21.33 0 0 % 0 2,593 1.25 % 65.29 12,101 0.43 % 22.30 4,123 0.37 % 22 2,031 1.14 % 47.30 37 0.27 % 9.35 3,323 0.23 % 10.45 morebiti morebiti mor ebiti 23,169 0.41 % 20.42 0 0 % 0 1,693 0.82 % 42.63 12,592 0.45 % 23.20 3,410 0.31 % 18.19 787 0.44 % 18.33 33 0.24 % 8.34 4,654 0.33 % 14.64 nikar nikar nikar 21,998 0.39 % 19.39 0 0 % 0 2,458 1.19 % 61.89 8,133 0.29 % 
14.99 7,672 0.69 % 40.94 1,134 0.64 % 26.41 47 0.34 % 11.88 2,554 0.18 % 8.03 kajpak kajpak k ajpak 21,308 0.37 % 18.78 0 0 % 0 482 0.23 % 12.14 14,047 0.51 % 25.88 2,845 0.26 % 15.18 214 0.12 % 4.98 5 0.04 % 1.26 3,715 0.26 % 11.68 kvečjemu kvečjemu kve čjemu 16,808 0.29 % 14.81 0 0 % 0 733 0.35 % 18.46 8,898 0.32 % 16.40 3,245 0.29 % 17.31 571 0.32 % 13.30 48 0.34 % 12.13 3,313 0.23 % 10.42 domala domala d omala 16,383 0.29 % 14.44 0 0 % 0 282 0.14 % 7.10 10,557 0.38 % 19.45 2,066 0.18 % 11.02 352 0.20 % 8.20 10 0.07 % 2.53 3,116 0.22 % 9.80 edino edino edino 15,218 0.27 % 13.41 0 0 % 0 1,096 0.53 % 27.60 7,644 0.28 % 14.08 2,894 0.26 % 15.44 646 0.36 % 15.05 65 0.47 % 16.43 2,873 0.20 % 9.04 kajne kajne kajne 13,447 0.23 % 11.85 0 0 % 0 3,958 1.92 % 99.66 3,563 0.13 % 6.57 3,581 0.32 % 19.11 233 0.13 % 5.43 14 0.10 % 3.54 2,098 0.15 % 6.60 bržčas bržčas b ržčas 12,496 0.22 % 11.01 0 0 % 0 79 0.04 % 1.99 8,473 0.30 % 15.61 1,322 0.12 % 7.05 139 0.08 % 3.24 0 0 % 0 2,483 0.17 % 7.81 bojda bojda bojda 8,067 0.14 % 7.11 0 0 % 0 258 0.12 % 6.50 3,899 0.14 % 7.18 2,303 0.21 % 12.29 66 0.04 % 1.54 3 0.02 % 0.76 1,538 0.11 % 4.84 kajpada kajpada ka jpada 6,710 0.12 % 5.91 0 0 % 0 432 0.21 % 10.88 3,814 0.14 % 7.03 1,288 0.12 % 6.87 393 0.22 % 9.15 3 0.02 % 0.76 780 0.06 % 2.45 kakopak kakopak ka kopak 6,398 0.11 % 5.64 0 0 % 0 142 0.07 % 3.58 1,685 0.06 % 3.10 3,126 0.28 % 16.68 44 0.03 % 1.02 0 0 % 0 1,401 0.10 % 4.41 malodane malodane mal odane 5,717 0.10 % 5.04 0 0 % 0 498 0.24 % 12.54 2,960 0.11 % 5.45 1,147 0.10 % 6.12 118 0.07 % 2.75 3 0.02 % 0.76 991 0.07 % 3.12 malone malone m alone 4,179 0.07 % 3.68 0 0 % 0 664 0.32 % 16.72 2,087 0.07 % 3.85 555 0.05 % 2.96 313 0.18 % 7.29 21 0.15 % 5.31 539 0.04 % 1.70 takorekoč takorekoč tako rekoč 3,914 0.07 % 3.45 0 0 % 0 114 0.06 % 2.87 1,579 0.06 % 2.91 1,331 0.12 % 7.10 264 0.15 % 6.15 42 0.30 % 10.62 584 0.04 % 1.84 edinole edinole ed inole 2,874 0.05 % 2.53 0 0 % 0 444 0.21 % 11.18 1,180 0.04 % 2.17 611 0.06 % 3.26 342 
0.19 % 7.97 29 0.21 % 7.33 268 0.02 % 0.84
kratkomalo kratkomalo kratk omalo 1,141 0.02 % 1.01 0 0 % 0 90 0.04 % 2.27 627 0.02 % 1.16 318 0.03 % 1.70 59 0.03 % 1.37 3 0.02 % 0.76 44 0 % 0.14

1.2.191 List of initial character-level 1-grams from particle lower-case word forms in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-word_parts-particles-lowercase_forms-initial-1grams-taxonomy-entire.tsv
Columns: Lower-case word form; Initial part of the word; Rest of the word; Total absolute frequency of lower-case word form; Percentage of all found lower-case word forms; Total relative frequency (per million occurrences); then Absolute frequency, Percentage, and Relative frequency for each of [SSJ.T.K.N], [SSJ.T.K.L], [SSJ.T.P.C], [SSJ.T.P.R], [SSJ.T.K.S], [SSJ.T.D], and [SSJ.I], in that order.

tudi t udi 8,478,789 21.38 % 7,472.31 21 18.75 % 2,163.39 136,547 9.19 % 3,438.11 4,214,730 21.79 % 7,765.92 1,465,006 20.28 % 7,816.79 263,011 21.31 % 6,125.84 23,030 20.53 % 5,821.74 2,376,444 23.18 % 7,474.54
ne n e 6,734,994 16.98 % 5,935.52 44 39.29 % 4,532.81 374,116 25.18 % 9,419.84 3,104,942 16.05 % 5,721.06 1,318,655 18.25 % 7,035.91 259,075 20.99 % 6,034.16 30,064 26.80 % 7,599.85 1,648,098 16.08 % 5,183.70
še š e 5,787,718 14.60 % 5,100.69 24 21.43 % 2,472.44 216,168 14.55 % 5,442.88 2,847,027 14.72 % 5,245.84 962,021 13.32 % 5,133.03 135,437 10.97 % 3,154.49 11,596 10.34 % 2,931.34 1,615,445 15.76 % 5,081
že ž e 3,735,419 9.42 % 3,292.01 6 5.36 % 618.11 131,252 8.84 % 3,304.78
1,898,346 9.81 % 3,497.83 608,889 8.43 % 3,248.83 77,192 6.25 % 1,797.89 7,450 6.64 % 1,883.28 1,012,284 9.87 % 3,183.90 le l e 2,165,776 5.46 % 1,908.69 0 0 % 0 58,645 3.95 % 1,476.62 1,056,697 5.46 % 1,947.03 439,209 6.08 % 2,343.47 79,149 6.41 % 1,843.47 4,405 3.93 % 1,113.54 527,671 5.15 % 1,659.66 naj n aj 1,598,002 4.03 % 1,408.31 2 1.79 % 206.04 35,826 2.41 % 902.06 847,958 4.38 % 1,562.42 235,452 3.26 % 1,256.29 47,034 3.81 % 1,095.48 3,314 2.95 % 837.74 428,416 4.18 % 1,347.48 prav p rav 1,208,290 3.05 % 1,064.86 1 0.89 % 103.02 52,405 3.53 % 1,319.50 578,438 2.99 % 1,065.81 228,510 3.16 % 1,219.25 41,863 3.39 % 975.04 2,593 2.31 % 655.48 304,480 2.97 % 957.67 sicer s icer 1,078,123 2.72 % 950.14 0 0 % 0 15,993 1.08 % 402.69 490,602 2.54 % 903.97 157,744 2.18 % 841.67 18,839 1.53 % 438.78 1,842 1.64 % 465.64 393,103 3.83 % 1,236.41 samo s amo 940,469 2.37 % 828.83 6 5.36 % 618.11 76,926 5.18 % 1,936.91 434,924 2.25 % 801.38 194,550 2.69 % 1,038.05 44,843 3.63 % 1,044.45 4,907 4.38 % 1,240.44 184,313 1.80 % 579.71 predvsem p redvsem 844,382 2.13 % 744.15 0 0 % 0 5,733 0.39 % 144.35 442,044 2.29 % 814.50 160,555 2.22 % 856.67 27,431 2.22 % 638.90 1,744 1.55 % 440.86 206,875 2.02 % 650.68 seveda s eveda 725,852 1.83 % 639.69 0 0 % 0 24,491 1.65 % 616.66 363,766 1.88 % 670.26 184,814 2.56 % 986.11 20,809 1.69 % 484.67 2,801 2.50 % 708.06 129,171 1.26 % 406.28 celo c elo 648,010 1.63 % 571.09 0 0 % 0 27,365 1.84 % 689.02 317,838 1.64 % 585.64 135,813 1.88 % 724.65 27,194 2.20 % 633.38 1,417 1.26 % 358.20 138,383 1.35 % 435.25 skoraj s koraj 528,454 1.33 % 465.72 3 2.68 % 309.06 26,418 1.78 % 665.18 253,737 1.31 % 467.53 93,065 1.29 % 496.56 15,802 1.28 % 368.05 828 0.74 % 209.31 138,601 1.35 % 435.94 več v eč 520,542 1.31 % 458.75 0 0 % 0 37,136 2.50 % 935.04 242,295 1.25 % 446.44 94,562 1.31 % 504.55 18,310 1.48 % 426.46 1,689 1.51 % 426.96 126,550 1.23 % 398.03 vsaj v saj 507,625 1.28 % 447.37 0 0 % 0 20,218 1.36 % 509.07 254,444 1.31 % 468.83 102,076 1.41 % 
544.64 15,753 1.28 % 366.91 1,432 1.28 % 361.99 113,702 1.11 % 357.62 morda m orda 468,348 1.18 % 412.75 0 0 % 0 25,811 1.74 % 649.89 217,144 1.12 % 400.10 99,850 1.38 % 532.77 21,212 1.72 % 494.05 1,177 1.05 % 297.53 103,154 1.01 % 324.45 niti n iti 458,790 1.16 % 404.33 1 0.89 % 103.02 31,303 2.11 % 788.18 220,261 1.14 % 405.85 80,301 1.11 % 428.46 11,945 0.97 % 278.21 1,088 0.97 % 275.03 113,891 1.11 % 358.22 sploh s ploh 419,675 1.06 % 369.86 0 0 % 0 30,505 2.05 % 768.08 192,378 0.99 % 354.47 87,831 1.22 % 468.64 13,018 1.05 % 303.20 1,240 1.10 % 313.46 94,703 0.92 % 297.87 šele š ele 388,144 0.98 % 342.07 2 1.79 % 206.04 13,326 0.90 % 335.53 199,240 1.03 % 367.11 69,131 0.96 % 368.86 13,716 1.11 % 319.46 803 0.72 % 202.99 91,926 0.90 % 289.13 pač p ač 294,972 0.74 % 259.96 0 0 % 0 13,015 0.88 % 327.70 148,590 0.77 % 273.79 66,321 0.92 % 353.87 8,471 0.69 % 197.30 2,120 1.89 % 535.91 56,455 0.55 % 177.57 zgolj z golj 254,436 0.64 % 224.23 0 0 % 0 5,389 0.36 % 135.69 111,593 0.58 % 205.62 38,423 0.53 % 205.01 9,377 0.76 % 218.40 546 0.49 % 138.02 89,108 0.87 % 280.27 ravno r avno 247,808 0.62 % 218.39 1 0.89 % 103.02 15,206 1.02 % 382.87 105,504 0.55 % 194.40 58,223 0.81 % 310.66 7,237 0.59 % 168.56 782 0.70 % 197.68 60,855 0.59 % 191.40 zlasti z lasti 242,203 0.61 % 213.45 0 0 % 0 3,265 0.22 % 82.21 138,036 0.71 % 254.34 36,842 0.51 % 196.58 16,342 1.32 % 380.62 908 0.81 % 229.53 46,810 0.46 % 147.23 pravzaprav p ravzaprav 198,500 0.50 % 174.94 0 0 % 0 11,349 0.76 % 285.76 93,357 0.48 % 172.02 50,059 0.69 % 267.10 7,342 0.59 % 171 774 0.69 % 195.66 35,619 0.35 % 112.03 vsekakor v sekakor 159,819 0.40 % 140.85 1 0.89 % 103.02 3,884 0.26 % 97.79 83,779 0.43 % 154.37 38,563 0.53 % 205.76 4,333 0.35 % 100.92 385 0.34 % 97.32 28,874 0.28 % 90.82 no n o 142,860 0.36 % 125.90 0 0 % 0 19,680 1.32 % 495.52 52,215 0.27 % 96.21 44,047 0.61 % 235.02 3,129 0.25 % 72.88 501 0.45 % 126.65 23,288 0.23 % 73.25 menda m enda 132,347 0.33 % 116.64 0 0 % 0 4,569 0.31 % 115.04 
76,040 0.39 % 140.11 26,259 0.36 % 140.11 1,353 0.11 % 31.51 130 0.12 % 32.86 23,996 0.23 % 75.47
najbrž n ajbrž 123,679 0.31 % 109 0 0 % 0 13,632 0.92 % 343.24 60,518 0.31 % 111.51 24,671 0.34 % 131.64 4,216 0.34 % 98.20 369 0.33 % 93.28 20,273 0.20 % 63.76
koli k oli 107,085 0.27 % 94.37 0 0 % 0 6,575 0.44 % 165.55 49,669 0.26 % 91.52 14,370 0.20 % 76.67 5,247 0.42 % 122.21 962 0.86 % 243.18 30,262 0.29 % 95.18
ja j a 87,056 0.22 % 76.72 0 0 % 0 19,465 1.31 % 490.11 27,323 0.14 % 50.34 26,187 0.36 % 139.73 1,897 0.15 % 44.18 332 0.30 % 83.93 11,852 0.12 % 37.28
češ č eš 87,039 0.22 % 76.71 0 0 % 0 2,631 0.18 % 66.25 51,345 0.27 % 94.61 11,944 0.17 % 63.73 1,723 0.14 % 40.13 343 0.31 % 86.71 19,053 0.19 % 59.93
skorajda s korajda 58,291 0.15 % 51.37 0 0 % 0 2,478 0.17 % 62.39 29,411 0.15 % 54.19 12,753 0.18 % 68.05 1,690 0.14 % 39.36 70 0.06 % 17.70 11,889 0.12 % 37.39

1.2.192 List of initial character-level 2-grams from particle lower-case word forms in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-word_parts-particles-lowercase_forms-initial-2grams-taxonomy-entire.tsv
Columns: Lower-case word form; Initial part of the word; Rest of the word; Total absolute frequency of lower-case word form; Percentage of all found lower-case word forms; Total relative frequency (per million occurrences); then Absolute frequency, Percentage, and Relative frequency for each of [SSJ.T.K.N], [SSJ.T.K.L], [SSJ.T.P.C], [SSJ.T.P.R], [SSJ.T.K.S], [SSJ.T.D], and [SSJ.I], in that order.

tudi tu di
8,478,789 21.38 % 7,472.31 21 18.75 % 2,163.39 136,547 9.19 % 3,438.11 4,214,730 21.79 % 7,765.92 1,465,006 20.28 % 7,816.79 263,011 21.31 % 6,125.84 23,030 20.53 % 5,821.74 2,376,444 23.18 % 7,474.54 ne ne 6,734,994 16.98 % 5,935.52 44 39.29 % 4,532.81 374,116 25.18 % 9,419.84 3,104,942 16.05 % 5,721.06 1,318,655 18.25 % 7,035.91 259,075 20.99 % 6,034.16 30,064 26.80 % 7,599.85 1,648,098 16.08 % 5,183.70 še še 5,787,718 14.60 % 5,100.69 24 21.43 % 2,472.44 216,168 14.55 % 5,442.88 2,847,027 14.72 % 5,245.84 962,021 13.32 % 5,133.03 135,437 10.97 % 3,154.49 11,596 10.34 % 2,931.34 1,615,445 15.76 % 5,081 že že 3,735,419 9.42 % 3,292.01 6 5.36 % 618.11 131,252 8.84 % 3,304.78 1,898,346 9.81 % 3,497.83 608,889 8.43 % 3,248.83 77,192 6.25 % 1,797.89 7,450 6.64 % 1,883.28 1,012,284 9.87 % 3,183.90 le le 2,165,776 5.46 % 1,908.69 0 0 % 0 58,645 3.95 % 1,476.62 1,056,697 5.46 % 1,947.03 439,209 6.08 % 2,343.47 79,149 6.41 % 1,843.47 4,405 3.93 % 1,113.54 527,671 5.15 % 1,659.66 naj na j 1,598,002 4.03 % 1,408.31 2 1.79 % 206.04 35,826 2.41 % 902.06 847,958 4.38 % 1,562.42 235,452 3.26 % 1,256.29 47,034 3.81 % 1,095.48 3,314 2.95 % 837.74 428,416 4.18 % 1,347.48 prav pr av 1,208,290 3.05 % 1,064.86 1 0.89 % 103.02 52,405 3.53 % 1,319.50 578,438 2.99 % 1,065.81 228,510 3.16 % 1,219.25 41,863 3.39 % 975.04 2,593 2.31 % 655.48 304,480 2.97 % 957.67 sicer si cer 1,078,123 2.72 % 950.14 0 0 % 0 15,993 1.08 % 402.69 490,602 2.54 % 903.97 157,744 2.18 % 841.67 18,839 1.53 % 438.78 1,842 1.64 % 465.64 393,103 3.83 % 1,236.41 samo sa mo 940,469 2.37 % 828.83 6 5.36 % 618.11 76,926 5.18 % 1,936.91 434,924 2.25 % 801.38 194,550 2.69 % 1,038.05 44,843 3.63 % 1,044.45 4,907 4.38 % 1,240.44 184,313 1.80 % 579.71 predvsem pr edvsem 844,382 2.13 % 744.15 0 0 % 0 5,733 0.39 % 144.35 442,044 2.29 % 814.50 160,555 2.22 % 856.67 27,431 2.22 % 638.90 1,744 1.55 % 440.86 206,875 2.02 % 650.68 seveda se veda 725,852 1.83 % 639.69 0 0 % 0 24,491 1.65 % 616.66 363,766 1.88 % 670.26 184,814 2.56 % 
986.11 20,809 1.69 % 484.67 2,801 2.50 % 708.06 129,171 1.26 % 406.28 celo ce lo 648,010 1.63 % 571.09 0 0 % 0 27,365 1.84 % 689.02 317,838 1.64 % 585.64 135,813 1.88 % 724.65 27,194 2.20 % 633.38 1,417 1.26 % 358.20 138,383 1.35 % 435.25 skoraj sk oraj 528,454 1.33 % 465.72 3 2.68 % 309.06 26,418 1.78 % 665.18 253,737 1.31 % 467.53 93,065 1.29 % 496.56 15,802 1.28 % 368.05 828 0.74 % 209.31 138,601 1.35 % 435.94 več ve č 520,542 1.31 % 458.75 0 0 % 0 37,136 2.50 % 935.04 242,295 1.25 % 446.44 94,562 1.31 % 504.55 18,310 1.48 % 426.46 1,689 1.51 % 426.96 126,550 1.23 % 398.03 vsaj vs aj 507,625 1.28 % 447.37 0 0 % 0 20,218 1.36 % 509.07 254,444 1.31 % 468.83 102,076 1.41 % 544.64 15,753 1.28 % 366.91 1,432 1.28 % 361.99 113,702 1.11 % 357.62 morda mo rda 468,348 1.18 % 412.75 0 0 % 0 25,811 1.74 % 649.89 217,144 1.12 % 400.10 99,850 1.38 % 532.77 21,212 1.72 % 494.05 1,177 1.05 % 297.53 103,154 1.01 % 324.45 niti ni ti 458,790 1.16 % 404.33 1 0.89 % 103.02 31,303 2.11 % 788.18 220,261 1.14 % 405.85 80,301 1.11 % 428.46 11,945 0.97 % 278.21 1,088 0.97 % 275.03 113,891 1.11 % 358.22 sploh sp loh 419,675 1.06 % 369.86 0 0 % 0 30,505 2.05 % 768.08 192,378 0.99 % 354.47 87,831 1.22 % 468.64 13,018 1.05 % 303.20 1,240 1.10 % 313.46 94,703 0.92 % 297.87 šele še le 388,144 0.98 % 342.07 2 1.79 % 206.04 13,326 0.90 % 335.53 199,240 1.03 % 367.11 69,131 0.96 % 368.86 13,716 1.11 % 319.46 803 0.72 % 202.99 91,926 0.90 % 289.13 pač pa č 294,972 0.74 % 259.96 0 0 % 0 13,015 0.88 % 327.70 148,590 0.77 % 273.79 66,321 0.92 % 353.87 8,471 0.69 % 197.30 2,120 1.89 % 535.91 56,455 0.55 % 177.57 zgolj zg olj 254,436 0.64 % 224.23 0 0 % 0 5,389 0.36 % 135.69 111,593 0.58 % 205.62 38,423 0.53 % 205.01 9,377 0.76 % 218.40 546 0.49 % 138.02 89,108 0.87 % 280.27 ravno ra vno 247,808 0.62 % 218.39 1 0.89 % 103.02 15,206 1.02 % 382.87 105,504 0.55 % 194.40 58,223 0.81 % 310.66 7,237 0.59 % 168.56 782 0.70 % 197.68 60,855 0.59 % 191.40 zlasti zl asti 242,203 0.61 % 213.45 0 0 % 0 3,265 0.22 
% 82.21 138,036 0.71 % 254.34 36,842 0.51 % 196.58 16,342 1.32 % 380.62 908 0.81 % 229.53 46,810 0.46 % 147.23
pravzaprav pr avzaprav 198,500 0.50 % 174.94 0 0 % 0 11,349 0.76 % 285.76 93,357 0.48 % 172.02 50,059 0.69 % 267.10 7,342 0.59 % 171 774 0.69 % 195.66 35,619 0.35 % 112.03
vsekakor vs ekakor 159,819 0.40 % 140.85 1 0.89 % 103.02 3,884 0.26 % 97.79 83,779 0.43 % 154.37 38,563 0.53 % 205.76 4,333 0.35 % 100.92 385 0.34 % 97.32 28,874 0.28 % 90.82
no no 142,860 0.36 % 125.90 0 0 % 0 19,680 1.32 % 495.52 52,215 0.27 % 96.21 44,047 0.61 % 235.02 3,129 0.25 % 72.88 501 0.45 % 126.65 23,288 0.23 % 73.25
menda me nda 132,347 0.33 % 116.64 0 0 % 0 4,569 0.31 % 115.04 76,040 0.39 % 140.11 26,259 0.36 % 140.11 1,353 0.11 % 31.51 130 0.12 % 32.86 23,996 0.23 % 75.47
najbrž na jbrž 123,679 0.31 % 109 0 0 % 0 13,632 0.92 % 343.24 60,518 0.31 % 111.51 24,671 0.34 % 131.64 4,216 0.34 % 98.20 369 0.33 % 93.28 20,273 0.20 % 63.76
koli ko li 107,085 0.27 % 94.37 0 0 % 0 6,575 0.44 % 165.55 49,669 0.26 % 91.52 14,370 0.20 % 76.67 5,247 0.42 % 122.21 962 0.86 % 243.18 30,262 0.29 % 95.18
ja ja 87,056 0.22 % 76.72 0 0 % 0 19,465 1.31 % 490.11 27,323 0.14 % 50.34 26,187 0.36 % 139.73 1,897 0.15 % 44.18 332 0.30 % 83.93 11,852 0.12 % 37.28
češ če š 87,039 0.22 % 76.71 0 0 % 0 2,631 0.18 % 66.25 51,345 0.27 % 94.61 11,944 0.17 % 63.73 1,723 0.14 % 40.13 343 0.31 % 86.71 19,053 0.19 % 59.93
skorajda sk orajda 58,291 0.15 % 51.37 0 0 % 0 2,478 0.17 % 62.39 29,411 0.15 % 54.19 12,753 0.18 % 68.05 1,690 0.14 % 39.36 70 0.06 % 17.70 11,889 0.12 % 37.39

1.2.193 List of initial character-level 3-grams from particle lower-case word forms in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-word_parts-particles-lowercase_forms-initial-3grams-taxonomy-entire.tsv
Columns: Lower-case word form; Initial part of the word; Rest of the word; Total absolute frequency of lower-case word form; Percentage of all found lower-case word forms; Total relative frequency (per million occurrences); then Absolute frequency, Percentage, and Relative frequency for each of [SSJ.T.K.N], [SSJ.T.K.L], [SSJ.T.P.C], [SSJ.T.P.R], [SSJ.T.K.S], [SSJ.T.D], and [SSJ.I], in that order.

tudi tud i 8,478,789 40.41 % 7,472.31 21 55.26 % 2,163.39 136,547 20.61 % 3,438.11 4,214,730 40.71 % 7,765.92 1,465,006 38.33 % 7,816.79 263,011 38.81 % 6,125.84 23,030 39.85 % 5,821.74 2,376,444 43.91 % 7,474.54
naj naj 1,598,002 7.62 % 1,408.31 2 5.26 % 206.04 35,826 5.41 % 902.06 847,958 8.19 % 1,562.42 235,452 6.16 % 1,256.29 47,034 6.94 % 1,095.48 3,314 5.73 % 837.74 428,416 7.92 % 1,347.48
prav pra v 1,208,290 5.76 % 1,064.86 1 2.63 % 103.02 52,405 7.91 % 1,319.50 578,438 5.59 % 1,065.81 228,510 5.98 % 1,219.25 41,863 6.18 % 975.04 2,593 4.49 % 655.48 304,480 5.63 % 957.67
sicer sic er 1,078,123 5.14 % 950.14 0 0 % 0 15,993 2.41 % 402.69 490,602 4.74 % 903.97 157,744 4.13 % 841.67 18,839 2.78 % 438.78 1,842 3.19 % 465.64 393,103 7.26 % 1,236.41
samo sam o 940,469 4.48 % 828.83 6 15.79 % 618.11 76,926 11.61 % 1,936.91 434,924 4.20 % 801.38 194,550 5.09 % 1,038.05 44,843 6.62 % 1,044.45 4,907 8.49 % 1,240.44 184,313 3.41 % 579.71
predvsem pre dvsem 844,382 4.02 % 744.15 0 0 % 0 5,733 0.86 % 144.35 442,044 4.27 % 814.50 160,555 4.20 % 856.67 27,431 4.05 % 638.90 1,744 3.02 % 440.86 206,875 3.82 % 650.68
seveda sev eda 725,852 3.46 % 639.69 0 0 % 0 24,491 3.70 % 616.66 363,766 3.51 % 670.26 184,814 4.84 % 986.11 20,809 3.07 % 484.67 2,801 4.85 % 708.06 129,171 2.39 % 406.28
celo cel o 648,010 3.09 % 571.09 0 0 %
0 27,365 4.13 % 689.02 317,838 3.07 % 585.64 135,813 3.55 % 724.65 27,194 4.01 % 633.38 1,417 2.45 % 358.20 138,383 2.56 % 435.25 skoraj sko raj 528,454 2.52 % 465.72 3 7.89 % 309.06 26,418 3.99 % 665.18 253,737 2.45 % 467.53 93,065 2.44 % 496.56 15,802 2.33 % 368.05 828 1.43 % 209.31 138,601 2.56 % 435.94 več več 520,542 2.48 % 458.75 0 0 % 0 37,136 5.60 % 935.04 242,295 2.34 % 446.44 94,562 2.47 % 504.55 18,310 2.70 % 426.46 1,689 2.92 % 426.96 126,550 2.34 % 398.03 vsaj vsa j 507,625 2.42 % 447.37 0 0 % 0 20,218 3.05 % 509.07 254,444 2.46 % 468.83 102,076 2.67 % 544.64 15,753 2.32 % 366.91 1,432 2.48 % 361.99 113,702 2.10 % 357.62 morda mor da 468,348 2.23 % 412.75 0 0 % 0 25,811 3.90 % 649.89 217,144 2.10 % 400.10 99,850 2.61 % 532.77 21,212 3.13 % 494.05 1,177 2.04 % 297.53 103,154 1.91 % 324.45 niti nit i 458,790 2.19 % 404.33 1 2.63 % 103.02 31,303 4.72 % 788.18 220,261 2.13 % 405.85 80,301 2.10 % 428.46 11,945 1.76 % 278.21 1,088 1.88 % 275.03 113,891 2.10 % 358.22 sploh spl oh 419,675 2.00 % 369.86 0 0 % 0 30,505 4.60 % 768.08 192,378 1.86 % 354.47 87,831 2.30 % 468.64 13,018 1.92 % 303.20 1,240 2.15 % 313.46 94,703 1.75 % 297.87 šele šel e 388,144 1.85 % 342.07 2 5.26 % 206.04 13,326 2.01 % 335.53 199,240 1.93 % 367.11 69,131 1.81 % 368.86 13,716 2.02 % 319.46 803 1.39 % 202.99 91,926 1.70 % 289.13 pač pač 294,972 1.41 % 259.96 0 0 % 0 13,015 1.96 % 327.70 148,590 1.44 % 273.79 66,321 1.74 % 353.87 8,471 1.25 % 197.30 2,120 3.67 % 535.91 56,455 1.04 % 177.57 zgolj zgo lj 254,436 1.21 % 224.23 0 0 % 0 5,389 0.81 % 135.69 111,593 1.08 % 205.62 38,423 1.00 % 205.01 9,377 1.38 % 218.40 546 0.94 % 138.02 89,108 1.65 % 280.27 ravno rav no 247,808 1.18 % 218.39 1 2.63 % 103.02 15,206 2.29 % 382.87 105,504 1.02 % 194.40 58,223 1.52 % 310.66 7,237 1.07 % 168.56 782 1.35 % 197.68 60,855 1.12 % 191.40 zlasti zla sti 242,203 1.15 % 213.45 0 0 % 0 3,265 0.49 % 82.21 138,036 1.33 % 254.34 36,842 0.96 % 196.58 16,342 2.41 % 380.62 908 1.57 % 229.53 46,810 0.86 % 147.23 
pravzaprav pra vzaprav 198,500 0.95 % 174.94 0 0 % 0 11,349 1.71 % 285.76 93,357 0.90 % 172.02 50,059 1.31 % 267.10 7,342 1.08 % 171 774 1.34 % 195.66 35,619 0.66 % 112.03 vsekakor vse kakor 159,819 0.76 % 140.85 1 2.63 % 103.02 3,884 0.59 % 97.79 83,779 0.81 % 154.37 38,563 1.01 % 205.76 4,333 0.64 % 100.92 385 0.67 % 97.32 28,874 0.53 % 90.82 menda men da 132,347 0.63 % 116.64 0 0 % 0 4,569 0.69 % 115.04 76,040 0.73 % 140.11 26,259 0.69 % 140.11 1,353 0.20 % 31.51 130 0.23 % 32.86 23,996 0.44 % 75.47 najbrž naj brž 123,679 0.59 % 109 0 0 % 0 13,632 2.06 % 343.24 60,518 0.58 % 111.51 24,671 0.65 % 131.64 4,216 0.62 % 98.20 369 0.64 % 93.28 20,273 0.38 % 63.76 koli kol i 107,085 0.51 % 94.37 0 0 % 0 6,575 0.99 % 165.55 49,669 0.48 % 91.52 14,370 0.38 % 76.67 5,247 0.77 % 122.21 962 1.67 % 243.18 30,262 0.56 % 95.18 češ češ 87,039 0.41 % 76.71 0 0 % 0 2,631 0.40 % 66.25 51,345 0.50 % 94.61 11,944 0.31 % 63.73 1,723 0.25 % 40.13 343 0.59 % 86.71 19,053 0.35 % 59.93 skorajda sko rajda 58,291 0.28 % 51.37 0 0 % 0 2,478 0.37 % 62.39 29,411 0.28 % 54.19 12,753 0.33 % 68.05 1,690 0.25 % 39.36 70 0.12 % 17.70 11,889 0.22 % 37.39 bržkone brž kone 28,045 0.13 % 24.72 0 0 % 0 720 0.11 % 18.13 17,758 0.17 % 32.72 3,128 0.08 % 16.69 773 0.11 % 18 21 0.04 % 5.31 5,645 0.10 % 17.75 nemara nem ara 24,208 0.12 % 21.33 0 0 % 0 2,593 0.39 % 65.29 12,101 0.12 % 22.30 4,123 0.11 % 22 2,031 0.30 % 47.30 37 0.06 % 9.35 3,323 0.06 % 10.45 morebiti mor ebiti 23,169 0.11 % 20.42 0 0 % 0 1,693 0.26 % 42.63 12,592 0.12 % 23.20 3,410 0.09 % 18.19 787 0.12 % 18.33 33 0.06 % 8.34 4,654 0.09 % 14.64 nikar nik ar 21,998 0.10 % 19.39 0 0 % 0 2,458 0.37 % 61.89 8,133 0.08 % 14.99 7,672 0.20 % 40.94 1,134 0.17 % 26.41 47 0.08 % 11.88 2,554 0.05 % 8.03 kajpak kaj pak 21,308 0.10 % 18.78 0 0 % 0 482 0.07 % 12.14 14,047 0.14 % 25.88 2,845 0.07 % 15.18 214 0.03 % 4.98 5 0.01 % 1.26 3,715 0.07 % 11.68 kvečjemu kve čjemu 16,808 0.08 % 14.81 0 0 % 0 733 0.11 % 18.46 8,898 0.09 % 16.40 3,245 0.09 % 17.31 571 
0.08 % 13.30 48 0.08 % 12.13 3,313 0.06 % 10.42

1.2.194 List of initial character-level 4-grams from particle lower-case word forms in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-word_parts-particles-lowercase_forms-initial-4grams-taxonomy-entire.tsv
Columns: Lower-case word form; Initial part of the word; Rest of the word; Total absolute frequency of lower-case word form; Percentage of all found lower-case word forms; Total relative frequency (per million occurrences); then Absolute frequency, Percentage, and Relative frequency for each of [SSJ.T.K.N], [SSJ.T.K.L], [SSJ.T.P.C], [SSJ.T.P.R], [SSJ.T.K.S], [SSJ.T.D], and [SSJ.I], in that order.

tudi tudi 8,478,789 45.89 % 7,472.31 21 58.33 % 2,163.39 136,547 23.81 % 3,438.11 4,214,730 46.53 % 7,765.92 1,465,006 42.94 % 7,816.79 263,011 43.70 % 6,125.84 23,030 45.80 % 5,821.74 2,376,444 49.72 % 7,474.54
prav prav 1,208,290 6.54 % 1,064.86 1 2.78 % 103.02 52,405 9.14 % 1,319.50 578,438 6.39 % 1,065.81 228,510 6.70 % 1,219.25 41,863 6.96 % 975.04 2,593 5.16 % 655.48 304,480 6.37 % 957.67
sicer sice r 1,078,123 5.84 % 950.14 0 0 % 0 15,993 2.79 % 402.69 490,602 5.42 % 903.97 157,744 4.62 % 841.67 18,839 3.13 % 438.78 1,842 3.66 % 465.64 393,103 8.22 % 1,236.41
samo samo 940,469 5.09 % 828.83 6 16.67 % 618.11 76,926 13.42 % 1,936.91 434,924 4.80 % 801.38 194,550 5.70 % 1,038.05 44,843 7.45 % 1,044.45 4,907 9.76 % 1,240.44 184,313 3.86 % 579.71
predvsem pred vsem 844,382 4.57 % 744.15 0 0 % 0 5,733 1.00 % 144.35 442,044 4.88 %
814.50 160,555 4.71 % 856.67 27,431 4.56 % 638.90 1,744 3.47 % 440.86 206,875 4.33 % 650.68 seveda seve da 725,852 3.93 % 639.69 0 0 % 0 24,491 4.27 % 616.66 363,766 4.02 % 670.26 184,814 5.42 % 986.11 20,809 3.46 % 484.67 2,801 5.57 % 708.06 129,171 2.70 % 406.28 celo celo 648,010 3.51 % 571.09 0 0 % 0 27,365 4.77 % 689.02 317,838 3.51 % 585.64 135,813 3.98 % 724.65 27,194 4.52 % 633.38 1,417 2.82 % 358.20 138,383 2.90 % 435.25 skoraj skor aj 528,454 2.86 % 465.72 3 8.33 % 309.06 26,418 4.61 % 665.18 253,737 2.80 % 467.53 93,065 2.73 % 496.56 15,802 2.63 % 368.05 828 1.65 % 209.31 138,601 2.90 % 435.94 vsaj vsaj 507,625 2.75 % 447.37 0 0 % 0 20,218 3.53 % 509.07 254,444 2.81 % 468.83 102,076 2.99 % 544.64 15,753 2.62 % 366.91 1,432 2.85 % 361.99 113,702 2.38 % 357.62 morda mord a 468,348 2.54 % 412.75 0 0 % 0 25,811 4.50 % 649.89 217,144 2.40 % 400.10 99,850 2.93 % 532.77 21,212 3.52 % 494.05 1,177 2.34 % 297.53 103,154 2.16 % 324.45 niti niti 458,790 2.48 % 404.33 1 2.78 % 103.02 31,303 5.46 % 788.18 220,261 2.43 % 405.85 80,301 2.35 % 428.46 11,945 1.99 % 278.21 1,088 2.16 % 275.03 113,891 2.38 % 358.22 sploh splo h 419,675 2.27 % 369.86 0 0 % 0 30,505 5.32 % 768.08 192,378 2.12 % 354.47 87,831 2.57 % 468.64 13,018 2.16 % 303.20 1,240 2.47 % 313.46 94,703 1.98 % 297.87 šele šele 388,144 2.10 % 342.07 2 5.56 % 206.04 13,326 2.32 % 335.53 199,240 2.20 % 367.11 69,131 2.03 % 368.86 13,716 2.28 % 319.46 803 1.60 % 202.99 91,926 1.92 % 289.13 zgolj zgol j 254,436 1.38 % 224.23 0 0 % 0 5,389 0.94 % 135.69 111,593 1.23 % 205.62 38,423 1.13 % 205.01 9,377 1.56 % 218.40 546 1.09 % 138.02 89,108 1.86 % 280.27 ravno ravn o 247,808 1.34 % 218.39 1 2.78 % 103.02 15,206 2.65 % 382.87 105,504 1.17 % 194.40 58,223 1.71 % 310.66 7,237 1.20 % 168.56 782 1.55 % 197.68 60,855 1.27 % 191.40 zlasti zlas ti 242,203 1.31 % 213.45 0 0 % 0 3,265 0.57 % 82.21 138,036 1.52 % 254.34 36,842 1.08 % 196.58 16,342 2.71 % 380.62 908 1.81 % 229.53 46,810 0.98 % 147.23 pravzaprav prav zaprav 
198,500 1.07 % 174.94 0 0 % 0 11,349 1.98 % 285.76 93,357 1.03 % 172.02 50,059 1.47 % 267.10 7,342 1.22 % 171 774 1.54 % 195.66 35,619 0.74 % 112.03 vsekakor vsek akor 159,819 0.86 % 140.85 1 2.78 % 103.02 3,884 0.68 % 97.79 83,779 0.93 % 154.37 38,563 1.13 % 205.76 4,333 0.72 % 100.92 385 0.77 % 97.32 28,874 0.60 % 90.82 menda mend a 132,347 0.72 % 116.64 0 0 % 0 4,569 0.80 % 115.04 76,040 0.84 % 140.11 26,259 0.77 % 140.11 1,353 0.23 % 31.51 130 0.26 % 32.86 23,996 0.50 % 75.47 najbrž najb rž 123,679 0.67 % 109 0 0 % 0 13,632 2.38 % 343.24 60,518 0.67 % 111.51 24,671 0.72 % 131.64 4,216 0.70 % 98.20 369 0.73 % 93.28 20,273 0.42 % 63.76 koli koli 107,085 0.58 % 94.37 0 0 % 0 6,575 1.15 % 165.55 49,669 0.55 % 91.52 14,370 0.42 % 76.67 5,247 0.87 % 122.21 962 1.91 % 243.18 30,262 0.63 % 95.18 skorajda skor ajda 58,291 0.32 % 51.37 0 0 % 0 2,478 0.43 % 62.39 29,411 0.33 % 54.19 12,753 0.37 % 68.05 1,690 0.28 % 39.36 70 0.14 % 17.70 11,889 0.25 % 37.39 bržkone bržk one 28,045 0.15 % 24.72 0 0 % 0 720 0.13 % 18.13 17,758 0.20 % 32.72 3,128 0.09 % 16.69 773 0.13 % 18 21 0.04 % 5.31 5,645 0.12 % 17.75 nemara nema ra 24,208 0.13 % 21.33 0 0 % 0 2,593 0.45 % 65.29 12,101 0.13 % 22.30 4,123 0.12 % 22 2,031 0.34 % 47.30 37 0.07 % 9.35 3,323 0.07 % 10.45 morebiti more biti 23,169 0.12 % 20.42 0 0 % 0 1,693 0.29 % 42.63 12,592 0.14 % 23.20 3,410 0.10 % 18.19 787 0.13 % 18.33 33 0.07 % 8.34 4,654 0.10 % 14.64 nikar nika r 21,998 0.12 % 19.39 0 0 % 0 2,458 0.43 % 61.89 8,133 0.09 % 14.99 7,672 0.23 % 40.94 1,134 0.19 % 26.41 47 0.09 % 11.88 2,554 0.05 % 8.03 kajpak kajp ak 21,308 0.12 % 18.78 0 0 % 0 482 0.08 % 12.14 14,047 0.15 % 25.88 2,845 0.08 % 15.18 214 0.04 % 4.98 5 0.01 % 1.26 3,715 0.08 % 11.68 kvečjemu kveč jemu 16,808 0.09 % 14.81 0 0 % 0 733 0.13 % 18.46 8,898 0.10 % 16.40 3,245 0.10 % 17.31 571 0.10 % 13.30 48 0.10 % 12.13 3,313 0.07 % 10.42 domala doma la 16,383 0.09 % 14.44 0 0 % 0 282 0.05 % 7.10 10,557 0.12 % 19.45 2,066 0.06 % 11.02 352 0.06 % 8.20 10 0.02 % 
2.53 3,116 0.07 % 9.80
edino edin o 15,222 0.08 % 13.42 0 0 % 0 1,096 0.19 % 27.60 7,646 0.08 % 14.09 2,895 0.09 % 15.45 647 0.11 % 15.07 65 0.13 % 16.43 2,873 0.06 % 9.04
kajne kajn e 13,447 0.07 % 11.85 0 0 % 0 3,958 0.69 % 99.66 3,563 0.04 % 6.57 3,581 0.10 % 19.11 233 0.04 % 5.43 14 0.03 % 3.54 2,098 0.04 % 6.60
bržčas bržč as 12,496 0.07 % 11.01 0 0 % 0 79 0.01 % 1.99 8,473 0.09 % 15.61 1,322 0.04 % 7.05 139 0.02 % 3.24 0 0 % 0 2,483 0.05 % 7.81

1.2.195 List of initial character-level 5-grams from particle lower-case word forms in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-word_parts-particles-lowercase_forms-initial-5grams-taxonomy-entire.tsv
Columns: Lower-case word form; Initial part of the word; Rest of the word; Total absolute frequency of lower-case word form; Percentage of all found lower-case word forms; Total relative frequency (per million occurrences); then Absolute frequency, Percentage, and Relative frequency for each of [SSJ.T.K.N], [SSJ.T.K.L], [SSJ.T.P.C], [SSJ.T.P.R], [SSJ.T.K.S], [SSJ.T.D], and [SSJ.I], in that order.

sicer sicer 1,078,123 18.85 % 950.14 0 0 % 0 15,993 7.74 % 402.69 490,602 17.64 % 903.97 157,744 14.14 % 841.67 18,839 10.60 % 438.78 1,842 13.17 % 465.64 393,103 27.62 % 1,236.41
predvsem predv sem 844,382 14.77 % 744.15 0 0 % 0 5,733 2.77 % 144.35 442,044 15.89 % 814.50 160,555 14.39 % 856.67 27,431 15.43 % 638.90 1,744 12.47 % 440.86 206,875 14.53 % 650.68
seveda seved a 725,852 12.69 % 639.69 0 0 % 0 24,491 11.85 % 616.66 363,766
13.08 % 670.26 184,814 16.57 % 986.11 20,809 11.71 % 484.67 2,801 20.03 % 708.06 129,171 9.07 % 406.28 skoraj skora j 528,454 9.24 % 465.72 3 60.00 % 309.06 26,418 12.79 % 665.18 253,737 9.12 % 467.53 93,065 8.34 % 496.56 15,802 8.89 % 368.05 828 5.92 % 209.31 138,601 9.74 % 435.94 morda morda 468,348 8.19 % 412.75 0 0 % 0 25,811 12.49 % 649.89 217,144 7.81 % 400.10 99,850 8.95 % 532.77 21,212 11.93 % 494.05 1,177 8.41 % 297.53 103,154 7.25 % 324.45 sploh sploh 419,675 7.34 % 369.86 0 0 % 0 30,505 14.76 % 768.08 192,378 6.92 % 354.47 87,831 7.87 % 468.64 13,018 7.32 % 303.20 1,240 8.87 % 313.46 94,703 6.65 % 297.87 zgolj zgolj 254,436 4.45 % 224.23 0 0 % 0 5,389 2.61 % 135.69 111,593 4.01 % 205.62 38,423 3.44 % 205.01 9,377 5.28 % 218.40 546 3.90 % 138.02 89,108 6.26 % 280.27 ravno ravno 247,808 4.33 % 218.39 1 20.00 % 103.02 15,206 7.36 % 382.87 105,504 3.79 % 194.40 58,223 5.22 % 310.66 7,237 4.07 % 168.56 782 5.59 % 197.68 60,855 4.28 % 191.40 zlasti zlast i 242,203 4.24 % 213.45 0 0 % 0 3,265 1.58 % 82.21 138,036 4.96 % 254.34 36,842 3.30 % 196.58 16,342 9.19 % 380.62 908 6.49 % 229.53 46,810 3.29 % 147.23 pravzaprav pravz aprav 198,500 3.47 % 174.94 0 0 % 0 11,349 5.49 % 285.76 93,357 3.36 % 172.02 50,059 4.49 % 267.10 7,342 4.13 % 171 774 5.53 % 195.66 35,619 2.50 % 112.03 vsekakor vseka kor 159,819 2.79 % 140.85 1 20.00 % 103.02 3,884 1.88 % 97.79 83,779 3.01 % 154.37 38,563 3.46 % 205.76 4,333 2.44 % 100.92 385 2.75 % 97.32 28,874 2.03 % 90.82 menda menda 132,347 2.31 % 116.64 0 0 % 0 4,569 2.21 % 115.04 76,040 2.73 % 140.11 26,259 2.35 % 140.11 1,353 0.76 % 31.51 130 0.93 % 32.86 23,996 1.69 % 75.47 najbrž najbr ž 123,679 2.16 % 109 0 0 % 0 13,632 6.60 % 343.24 60,518 2.18 % 111.51 24,671 2.21 % 131.64 4,216 2.37 % 98.20 369 2.64 % 93.28 20,273 1.42 % 63.76 skorajda skora jda 58,291 1.02 % 51.37 0 0 % 0 2,478 1.20 % 62.39 29,411 1.06 % 54.19 12,753 1.14 % 68.05 1,690 0.95 % 39.36 70 0.50 % 17.70 11,889 0.83 % 37.39 bržkone bržko ne 28,045 0.49 % 24.72 0 0 
% 0 720 0.35 % 18.13 17,758 0.64 % 32.72 3,128 0.28 % 16.69 773 0.43 % 18 21 0.15 % 5.31 5,645 0.40 % 17.75 nemara nemar a 24,208 0.42 % 21.33 0 0 % 0 2,593 1.25 % 65.29 12,101 0.43 % 22.30 4,123 0.37 % 22 2,031 1.14 % 47.30 37 0.27 % 9.35 3,323 0.23 % 10.45 morebiti moreb iti 23,169 0.41 % 20.42 0 0 % 0 1,693 0.82 % 42.63 12,592 0.45 % 23.20 3,410 0.31 % 18.19 787 0.44 % 18.33 33 0.24 % 8.34 4,654 0.33 % 14.64 nikar nikar 21,998 0.39 % 19.39 0 0 % 0 2,458 1.19 % 61.89 8,133 0.29 % 14.99 7,672 0.69 % 40.94 1,134 0.64 % 26.41 47 0.34 % 11.88 2,554 0.18 % 8.03 kajpak kajpa k 21,308 0.37 % 18.78 0 0 % 0 482 0.23 % 12.14 14,047 0.51 % 25.88 2,845 0.26 % 15.18 214 0.12 % 4.98 5 0.04 % 1.26 3,715 0.26 % 11.68 kvečjemu kvečj emu 16,808 0.29 % 14.81 0 0 % 0 733 0.35 % 18.46 8,898 0.32 % 16.40 3,245 0.29 % 17.31 571 0.32 % 13.30 48 0.34 % 12.13 3,313 0.23 % 10.42 domala domal a 16,383 0.29 % 14.44 0 0 % 0 282 0.14 % 7.10 10,557 0.38 % 19.45 2,066 0.18 % 11.02 352 0.20 % 8.20 10 0.07 % 2.53 3,116 0.22 % 9.80 edino edino 15,222 0.27 % 13.42 0 0 % 0 1,096 0.53 % 27.60 7,646 0.28 % 14.09 2,895 0.26 % 15.45 647 0.36 % 15.07 65 0.47 % 16.43 2,873 0.20 % 9.04 kajne kajne 13,447 0.23 % 11.85 0 0 % 0 3,958 1.92 % 99.66 3,563 0.13 % 6.57 3,581 0.32 % 19.11 233 0.13 % 5.43 14 0.10 % 3.54 2,098 0.15 % 6.60 bržčas bržča s 12,496 0.22 % 11.01 0 0 % 0 79 0.04 % 1.99 8,473 0.30 % 15.61 1,322 0.12 % 7.05 139 0.08 % 3.24 0 0 % 0 2,483 0.17 % 7.81 bojda bojda 8,067 0.14 % 7.11 0 0 % 0 258 0.12 % 6.50 3,899 0.14 % 7.18 2,303 0.21 % 12.29 66 0.04 % 1.54 3 0.02 % 0.76 1,538 0.11 % 4.84 kajpada kajpa da 6,710 0.12 % 5.91 0 0 % 0 432 0.21 % 10.88 3,814 0.14 % 7.03 1,288 0.12 % 6.87 393 0.22 % 9.15 3 0.02 % 0.76 780 0.06 % 2.45 kakopak kakop ak 6,398 0.11 % 5.64 0 0 % 0 142 0.07 % 3.58 1,685 0.06 % 3.10 3,126 0.28 % 16.68 44 0.03 % 1.02 0 0 % 0 1,401 0.10 % 4.41 malodane malod ane 5,717 0.10 % 5.04 0 0 % 0 498 0.24 % 12.54 2,960 0.11 % 5.45 1,147 0.10 % 6.12 118 0.07 % 2.75 3 0.02 % 0.76 991 0.07 % 
3.12
malone malon e 4,179 0.07 % 3.68 0 0 % 0 664 0.32 % 16.72 2,087 0.07 % 3.85 555 0.05 % 2.96 313 0.18 % 7.29 21 0.15 % 5.31 539 0.04 % 1.70
takorekoč takor ekoč 3,914 0.07 % 3.45 0 0 % 0 114 0.06 % 2.87 1,579 0.06 % 2.91 1,331 0.12 % 7.10 264 0.15 % 6.15 42 0.30 % 10.62 584 0.04 % 1.84
edinole edino le 2,874 0.05 % 2.53 0 0 % 0 444 0.21 % 11.18 1,180 0.04 % 2.17 611 0.06 % 3.26 342 0.19 % 7.97 29 0.21 % 7.33 268 0.02 % 0.84
kratkomalo kratk omalo 1,141 0.02 % 1.01 0 0 % 0 90 0.04 % 2.27 627 0.02 % 1.16 318 0.03 % 1.70 59 0.03 % 1.37 3 0.02 % 0.76 44 0 % 0.14

1.2.196 List of final character-level 1-grams from particle lower-case word forms in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-word_parts-particles-lowercase_forms-final-1grams-taxonomy-entire.tsv
Columns: Lower-case word form; Rest of the word; Final part of the word; Total absolute frequency of lower-case word form; Percentage of all found lower-case word forms; Total relative frequency (per million occurrences); Absolute frequency [SSJ.T.K.N]; Percentage [SSJ.T.K.N]; Relative frequency [SSJ.T.K.N]; Absolute frequency [SSJ.T.K.L]; Percentage [SSJ.T.K.L]; Relative frequency [SSJ.T.K.L]; Absolute frequency [SSJ.T.P.C]; Percentage [SSJ.T.P.C]; Relative frequency [SSJ.T.P.C]; Absolute frequency [SSJ.T.P.R]; Percentage [SSJ.T.P.R]; Relative frequency [SSJ.T.P.R]; Absolute frequency [SSJ.T.K.S]; Percentage [SSJ.T.K.S]; Relative frequency [SSJ.T.K.S]; Absolute frequency [SSJ.T.D]; Percentage [SSJ.T.D]; Relative frequency [SSJ.T.D]; Absolute frequency [SSJ.I]; Percentage [SSJ.I]; Relative frequency [SSJ.I]
tudi tud i 8,478,789 21.38 % 7,472.31 21 18.75 % 2,163.39 136,547 9.19 % 3,438.11 4,214,730 21.79 % 7,765.92 1,465,006 20.28 % 7,816.79 263,011 21.31 % 6,125.84 23,030 20.53 % 5,821.74 2,376,444 23.18 % 7,474.54
ne n e 6,734,994 16.98 % 5,935.52 44 39.29 % 4,532.81 374,116 25.18 % 9,419.84 3,104,942 16.05 % 5,721.06 1,318,655
18.25 % 7,035.91 259,075 20.99 % 6,034.16 30,064 26.80 % 7,599.85 1,648,098 16.08 % 5,183.70 še š e 5,787,718 14.60 % 5,100.69 24 21.43 % 2,472.44 216,168 14.55 % 5,442.88 2,847,027 14.72 % 5,245.84 962,021 13.32 % 5,133.03 135,437 10.97 % 3,154.49 11,596 10.34 % 2,931.34 1,615,445 15.76 % 5,081 že ž e 3,735,419 9.42 % 3,292.01 6 5.36 % 618.11 131,252 8.84 % 3,304.78 1,898,346 9.81 % 3,497.83 608,889 8.43 % 3,248.83 77,192 6.25 % 1,797.89 7,450 6.64 % 1,883.28 1,012,284 9.87 % 3,183.90 le l e 2,165,776 5.46 % 1,908.69 0 0 % 0 58,645 3.95 % 1,476.62 1,056,697 5.46 % 1,947.03 439,209 6.08 % 2,343.47 79,149 6.41 % 1,843.47 4,405 3.93 % 1,113.54 527,671 5.15 % 1,659.66 naj na j 1,598,002 4.03 % 1,408.31 2 1.79 % 206.04 35,826 2.41 % 902.06 847,958 4.38 % 1,562.42 235,452 3.26 % 1,256.29 47,034 3.81 % 1,095.48 3,314 2.95 % 837.74 428,416 4.18 % 1,347.48 prav pra v 1,208,290 3.05 % 1,064.86 1 0.89 % 103.02 52,405 3.53 % 1,319.50 578,438 2.99 % 1,065.81 228,510 3.16 % 1,219.25 41,863 3.39 % 975.04 2,593 2.31 % 655.48 304,480 2.97 % 957.67 sicer sice r 1,078,123 2.72 % 950.14 0 0 % 0 15,993 1.08 % 402.69 490,602 2.54 % 903.97 157,744 2.18 % 841.67 18,839 1.53 % 438.78 1,842 1.64 % 465.64 393,103 3.83 % 1,236.41 samo sam o 940,469 2.37 % 828.83 6 5.36 % 618.11 76,926 5.18 % 1,936.91 434,924 2.25 % 801.38 194,550 2.69 % 1,038.05 44,843 3.63 % 1,044.45 4,907 4.38 % 1,240.44 184,313 1.80 % 579.71 predvsem predvse m 844,382 2.13 % 744.15 0 0 % 0 5,733 0.39 % 144.35 442,044 2.29 % 814.50 160,555 2.22 % 856.67 27,431 2.22 % 638.90 1,744 1.55 % 440.86 206,875 2.02 % 650.68 seveda seved a 725,852 1.83 % 639.69 0 0 % 0 24,491 1.65 % 616.66 363,766 1.88 % 670.26 184,814 2.56 % 986.11 20,809 1.69 % 484.67 2,801 2.50 % 708.06 129,171 1.26 % 406.28 celo cel o 648,010 1.63 % 571.09 0 0 % 0 27,365 1.84 % 689.02 317,838 1.64 % 585.64 135,813 1.88 % 724.65 27,194 2.20 % 633.38 1,417 1.26 % 358.20 138,383 1.35 % 435.25 skoraj skora j 528,454 1.33 % 465.72 3 2.68 % 309.06 26,418 1.78 % 665.18 
253,737 1.31 % 467.53 93,065 1.29 % 496.56 15,802 1.28 % 368.05 828 0.74 % 209.31 138,601 1.35 % 435.94 več ve č 520,542 1.31 % 458.75 0 0 % 0 37,136 2.50 % 935.04 242,295 1.25 % 446.44 94,562 1.31 % 504.55 18,310 1.48 % 426.46 1,689 1.51 % 426.96 126,550 1.23 % 398.03 vsaj vsa j 507,625 1.28 % 447.37 0 0 % 0 20,218 1.36 % 509.07 254,444 1.31 % 468.83 102,076 1.41 % 544.64 15,753 1.28 % 366.91 1,432 1.28 % 361.99 113,702 1.11 % 357.62 morda mord a 468,348 1.18 % 412.75 0 0 % 0 25,811 1.74 % 649.89 217,144 1.12 % 400.10 99,850 1.38 % 532.77 21,212 1.72 % 494.05 1,177 1.05 % 297.53 103,154 1.01 % 324.45 niti nit i 458,790 1.16 % 404.33 1 0.89 % 103.02 31,303 2.11 % 788.18 220,261 1.14 % 405.85 80,301 1.11 % 428.46 11,945 0.97 % 278.21 1,088 0.97 % 275.03 113,891 1.11 % 358.22 sploh splo h 419,675 1.06 % 369.86 0 0 % 0 30,505 2.05 % 768.08 192,378 0.99 % 354.47 87,831 1.22 % 468.64 13,018 1.05 % 303.20 1,240 1.10 % 313.46 94,703 0.92 % 297.87 šele šel e 388,144 0.98 % 342.07 2 1.79 % 206.04 13,326 0.90 % 335.53 199,240 1.03 % 367.11 69,131 0.96 % 368.86 13,716 1.11 % 319.46 803 0.72 % 202.99 91,926 0.90 % 289.13 pač pa č 294,972 0.74 % 259.96 0 0 % 0 13,015 0.88 % 327.70 148,590 0.77 % 273.79 66,321 0.92 % 353.87 8,471 0.69 % 197.30 2,120 1.89 % 535.91 56,455 0.55 % 177.57 zgolj zgol j 254,436 0.64 % 224.23 0 0 % 0 5,389 0.36 % 135.69 111,593 0.58 % 205.62 38,423 0.53 % 205.01 9,377 0.76 % 218.40 546 0.49 % 138.02 89,108 0.87 % 280.27 ravno ravn o 247,808 0.62 % 218.39 1 0.89 % 103.02 15,206 1.02 % 382.87 105,504 0.55 % 194.40 58,223 0.81 % 310.66 7,237 0.59 % 168.56 782 0.70 % 197.68 60,855 0.59 % 191.40 zlasti zlast i 242,203 0.61 % 213.45 0 0 % 0 3,265 0.22 % 82.21 138,036 0.71 % 254.34 36,842 0.51 % 196.58 16,342 1.32 % 380.62 908 0.81 % 229.53 46,810 0.46 % 147.23 pravzaprav pravzapra v 198,500 0.50 % 174.94 0 0 % 0 11,349 0.76 % 285.76 93,357 0.48 % 172.02 50,059 0.69 % 267.10 7,342 0.59 % 171 774 0.69 % 195.66 35,619 0.35 % 112.03 vsekakor vsekako r 159,819 
0.40 % 140.85 1 0.89 % 103.02 3,884 0.26 % 97.79 83,779 0.43 % 154.37 38,563 0.53 % 205.76 4,333 0.35 % 100.92 385 0.34 % 97.32 28,874 0.28 % 90.82
no n o 142,860 0.36 % 125.90 0 0 % 0 19,680 1.32 % 495.52 52,215 0.27 % 96.21 44,047 0.61 % 235.02 3,129 0.25 % 72.88 501 0.45 % 126.65 23,288 0.23 % 73.25
menda mend a 132,347 0.33 % 116.64 0 0 % 0 4,569 0.31 % 115.04 76,040 0.39 % 140.11 26,259 0.36 % 140.11 1,353 0.11 % 31.51 130 0.12 % 32.86 23,996 0.23 % 75.47
najbrž najbr ž 123,679 0.31 % 109 0 0 % 0 13,632 0.92 % 343.24 60,518 0.31 % 111.51 24,671 0.34 % 131.64 4,216 0.34 % 98.20 369 0.33 % 93.28 20,273 0.20 % 63.76
koli kol i 107,085 0.27 % 94.37 0 0 % 0 6,575 0.44 % 165.55 49,669 0.26 % 91.52 14,370 0.20 % 76.67 5,247 0.42 % 122.21 962 0.86 % 243.18 30,262 0.29 % 95.18
ja j a 87,056 0.22 % 76.72 0 0 % 0 19,465 1.31 % 490.11 27,323 0.14 % 50.34 26,187 0.36 % 139.73 1,897 0.15 % 44.18 332 0.30 % 83.93 11,852 0.12 % 37.28
češ če š 87,039 0.22 % 76.71 0 0 % 0 2,631 0.18 % 66.25 51,345 0.27 % 94.61 11,944 0.17 % 63.73 1,723 0.14 % 40.13 343 0.31 % 86.71 19,053 0.19 % 59.93
skorajda skorajd a 58,291 0.15 % 51.37 0 0 % 0 2,478 0.17 % 62.39 29,411 0.15 % 54.19 12,753 0.18 % 68.05 1,690 0.14 % 39.36 70 0.06 % 17.70 11,889 0.12 % 37.39

1.2.197 List of final character-level 2-grams from particle lower-case word forms in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-word_parts-particles-lowercase_forms-final-2grams-taxonomy-entire.tsv
Columns: Lower-case word form; Rest of the word; Final part of the word; Total absolute frequency of lower-case word form; Percentage of all found lower-case word forms; Total relative frequency (per million occurrences); Absolute frequency [SSJ.T.K.N]; Percentage [SSJ.T.K.N]; Relative frequency [SSJ.T.K.N]; Absolute frequency [SSJ.T.K.L]; Percentage [SSJ.T.K.L]; Relative frequency [SSJ.T.K.L]; Absolute frequency [SSJ.T.P.C]; Percentage [SSJ.T.P.C]; Relative frequency [SSJ.T.P.C]; Absolute frequency [SSJ.T.P.R]; Percentage [SSJ.T.P.R]; Relative frequency [SSJ.T.P.R]; Absolute frequency [SSJ.T.K.S]; Percentage [SSJ.T.K.S]; Relative frequency [SSJ.T.K.S]; Absolute frequency [SSJ.T.D]; Percentage [SSJ.T.D]; Relative frequency [SSJ.T.D]; Absolute frequency [SSJ.I]; Percentage [SSJ.I]; Relative frequency [SSJ.I]
tudi tu di 8,478,789 21.38 % 7,472.31 21 18.75 % 2,163.39 136,547 9.19 % 3,438.11 4,214,730 21.79 % 7,765.92 1,465,006 20.28 % 7,816.79 263,011 21.31 % 6,125.84 23,030 20.53 % 5,821.74 2,376,444 23.18 % 7,474.54
ne ne 6,734,994 16.98 % 5,935.52 44 39.29 % 4,532.81 374,116 25.18 % 9,419.84 3,104,942 16.05 % 5,721.06 1,318,655 18.25 % 7,035.91 259,075 20.99 % 6,034.16 30,064 26.80 % 7,599.85 1,648,098 16.08 % 5,183.70
še še 5,787,718 14.60 % 5,100.69 24 21.43 % 2,472.44 216,168 14.55 % 5,442.88 2,847,027 14.72 % 5,245.84 962,021 13.32 % 5,133.03 135,437 10.97 % 3,154.49 11,596 10.34 % 2,931.34 1,615,445 15.76 % 5,081
že že 3,735,419 9.42 % 3,292.01 6 5.36 % 618.11 131,252 8.84 % 3,304.78 1,898,346 9.81 % 3,497.83 608,889 8.43 % 3,248.83 77,192 6.25 % 1,797.89 7,450 6.64 % 1,883.28 1,012,284 9.87 % 3,183.90
le le 2,165,776 5.46 % 1,908.69 0 0 % 0 58,645 3.95 % 1,476.62 1,056,697 5.46 % 1,947.03 439,209 6.08 % 2,343.47 79,149 6.41 % 1,843.47 4,405 3.93 % 1,113.54 527,671 5.15 % 1,659.66
naj n aj 1,598,002 4.03 % 1,408.31 2 1.79 % 206.04 35,826 2.41 % 902.06 847,958 4.38 % 1,562.42 235,452 3.26 % 1,256.29 47,034 3.81 % 1,095.48 3,314 2.95 % 837.74 428,416 4.18 % 1,347.48
prav pr av 1,208,290 3.05 % 1,064.86 1 0.89 % 103.02 52,405 3.53 % 1,319.50 578,438 2.99 % 1,065.81 228,510 3.16 % 1,219.25 41,863 3.39 % 975.04 2,593 2.31 % 655.48 304,480 2.97 % 957.67
sicer sic er 1,078,123 2.72 % 950.14 0 0 % 0 15,993 1.08 % 402.69 490,602 2.54 % 903.97 157,744 2.18 % 841.67 18,839 1.53 % 438.78 1,842 1.64 % 465.64 393,103 3.83 % 1,236.41
samo sa mo 940,469 2.37 % 828.83 6 5.36 % 618.11 76,926 5.18 % 1,936.91 434,924 2.25 % 801.38
194,550 2.69 % 1,038.05 44,843 3.63 % 1,044.45 4,907 4.38 % 1,240.44 184,313 1.80 % 579.71 predvsem predvs em 844,382 2.13 % 744.15 0 0 % 0 5,733 0.39 % 144.35 442,044 2.29 % 814.50 160,555 2.22 % 856.67 27,431 2.22 % 638.90 1,744 1.55 % 440.86 206,875 2.02 % 650.68 seveda seve da 725,852 1.83 % 639.69 0 0 % 0 24,491 1.65 % 616.66 363,766 1.88 % 670.26 184,814 2.56 % 986.11 20,809 1.69 % 484.67 2,801 2.50 % 708.06 129,171 1.26 % 406.28 celo ce lo 648,010 1.63 % 571.09 0 0 % 0 27,365 1.84 % 689.02 317,838 1.64 % 585.64 135,813 1.88 % 724.65 27,194 2.20 % 633.38 1,417 1.26 % 358.20 138,383 1.35 % 435.25 skoraj skor aj 528,454 1.33 % 465.72 3 2.68 % 309.06 26,418 1.78 % 665.18 253,737 1.31 % 467.53 93,065 1.29 % 496.56 15,802 1.28 % 368.05 828 0.74 % 209.31 138,601 1.35 % 435.94 več v eč 520,542 1.31 % 458.75 0 0 % 0 37,136 2.50 % 935.04 242,295 1.25 % 446.44 94,562 1.31 % 504.55 18,310 1.48 % 426.46 1,689 1.51 % 426.96 126,550 1.23 % 398.03 vsaj vs aj 507,625 1.28 % 447.37 0 0 % 0 20,218 1.36 % 509.07 254,444 1.31 % 468.83 102,076 1.41 % 544.64 15,753 1.28 % 366.91 1,432 1.28 % 361.99 113,702 1.11 % 357.62 morda mor da 468,348 1.18 % 412.75 0 0 % 0 25,811 1.74 % 649.89 217,144 1.12 % 400.10 99,850 1.38 % 532.77 21,212 1.72 % 494.05 1,177 1.05 % 297.53 103,154 1.01 % 324.45 niti ni ti 458,790 1.16 % 404.33 1 0.89 % 103.02 31,303 2.11 % 788.18 220,261 1.14 % 405.85 80,301 1.11 % 428.46 11,945 0.97 % 278.21 1,088 0.97 % 275.03 113,891 1.11 % 358.22 sploh spl oh 419,675 1.06 % 369.86 0 0 % 0 30,505 2.05 % 768.08 192,378 0.99 % 354.47 87,831 1.22 % 468.64 13,018 1.05 % 303.20 1,240 1.10 % 313.46 94,703 0.92 % 297.87 šele še le 388,144 0.98 % 342.07 2 1.79 % 206.04 13,326 0.90 % 335.53 199,240 1.03 % 367.11 69,131 0.96 % 368.86 13,716 1.11 % 319.46 803 0.72 % 202.99 91,926 0.90 % 289.13 pač p ač 294,972 0.74 % 259.96 0 0 % 0 13,015 0.88 % 327.70 148,590 0.77 % 273.79 66,321 0.92 % 353.87 8,471 0.69 % 197.30 2,120 1.89 % 535.91 56,455 0.55 % 177.57 zgolj zgo lj 254,436 0.64 
% 224.23 0 0 % 0 5,389 0.36 % 135.69 111,593 0.58 % 205.62 38,423 0.53 % 205.01 9,377 0.76 % 218.40 546 0.49 % 138.02 89,108 0.87 % 280.27
ravno rav no 247,808 0.62 % 218.39 1 0.89 % 103.02 15,206 1.02 % 382.87 105,504 0.55 % 194.40 58,223 0.81 % 310.66 7,237 0.59 % 168.56 782 0.70 % 197.68 60,855 0.59 % 191.40
zlasti zlas ti 242,203 0.61 % 213.45 0 0 % 0 3,265 0.22 % 82.21 138,036 0.71 % 254.34 36,842 0.51 % 196.58 16,342 1.32 % 380.62 908 0.81 % 229.53 46,810 0.46 % 147.23
pravzaprav pravzapr av 198,500 0.50 % 174.94 0 0 % 0 11,349 0.76 % 285.76 93,357 0.48 % 172.02 50,059 0.69 % 267.10 7,342 0.59 % 171 774 0.69 % 195.66 35,619 0.35 % 112.03
vsekakor vsekak or 159,819 0.40 % 140.85 1 0.89 % 103.02 3,884 0.26 % 97.79 83,779 0.43 % 154.37 38,563 0.53 % 205.76 4,333 0.35 % 100.92 385 0.34 % 97.32 28,874 0.28 % 90.82
no no 142,860 0.36 % 125.90 0 0 % 0 19,680 1.32 % 495.52 52,215 0.27 % 96.21 44,047 0.61 % 235.02 3,129 0.25 % 72.88 501 0.45 % 126.65 23,288 0.23 % 73.25
menda men da 132,347 0.33 % 116.64 0 0 % 0 4,569 0.31 % 115.04 76,040 0.39 % 140.11 26,259 0.36 % 140.11 1,353 0.11 % 31.51 130 0.12 % 32.86 23,996 0.23 % 75.47
najbrž najb rž 123,679 0.31 % 109 0 0 % 0 13,632 0.92 % 343.24 60,518 0.31 % 111.51 24,671 0.34 % 131.64 4,216 0.34 % 98.20 369 0.33 % 93.28 20,273 0.20 % 63.76
koli ko li 107,085 0.27 % 94.37 0 0 % 0 6,575 0.44 % 165.55 49,669 0.26 % 91.52 14,370 0.20 % 76.67 5,247 0.42 % 122.21 962 0.86 % 243.18 30,262 0.29 % 95.18
ja ja 87,056 0.22 % 76.72 0 0 % 0 19,465 1.31 % 490.11 27,323 0.14 % 50.34 26,187 0.36 % 139.73 1,897 0.15 % 44.18 332 0.30 % 83.93 11,852 0.12 % 37.28
češ č eš 87,039 0.22 % 76.71 0 0 % 0 2,631 0.18 % 66.25 51,345 0.27 % 94.61 11,944 0.17 % 63.73 1,723 0.14 % 40.13 343 0.31 % 86.71 19,053 0.19 % 59.93
skorajda skoraj da 58,291 0.15 % 51.37 0 0 % 0 2,478 0.17 % 62.39 29,411 0.15 % 54.19 12,753 0.18 % 68.05 1,690 0.14 % 39.36 70 0.06 % 17.70 11,889 0.12 % 37.39

1.2.198 List of final character-level 3-grams from particle lower-case word forms in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-word_parts-particles-lowercase_forms-final-3grams-taxonomy-entire.tsv
Columns: Lower-case word form; Rest of the word; Final part of the word; Total absolute frequency of lower-case word form; Percentage of all found lower-case word forms; Total relative frequency (per million occurrences); Absolute frequency [SSJ.T.K.N]; Percentage [SSJ.T.K.N]; Relative frequency [SSJ.T.K.N]; Absolute frequency [SSJ.T.K.L]; Percentage [SSJ.T.K.L]; Relative frequency [SSJ.T.K.L]; Absolute frequency [SSJ.T.P.C]; Percentage [SSJ.T.P.C]; Relative frequency [SSJ.T.P.C]; Absolute frequency [SSJ.T.P.R]; Percentage [SSJ.T.P.R]; Relative frequency [SSJ.T.P.R]; Absolute frequency [SSJ.T.K.S]; Percentage [SSJ.T.K.S]; Relative frequency [SSJ.T.K.S]; Absolute frequency [SSJ.T.D]; Percentage [SSJ.T.D]; Relative frequency [SSJ.T.D]; Absolute frequency [SSJ.I]; Percentage [SSJ.I]; Relative frequency [SSJ.I]
tudi t udi 8,478,789 40.41 % 7,472.31 21 55.26 % 2,163.39 136,547 20.61 % 3,438.11 4,214,730 40.71 % 7,765.92 1,465,006 38.33 % 7,816.79 263,011 38.81 % 6,125.84 23,030 39.85 % 5,821.74 2,376,444 43.91 % 7,474.54
naj naj 1,598,002 7.62 % 1,408.31 2 5.26 % 206.04 35,826 5.41 % 902.06 847,958 8.19 % 1,562.42 235,452 6.16 % 1,256.29 47,034 6.94 % 1,095.48 3,314 5.73 % 837.74 428,416 7.92 % 1,347.48
prav p rav 1,208,290 5.76 % 1,064.86 1 2.63 % 103.02 52,405 7.91 % 1,319.50 578,438 5.59 % 1,065.81 228,510 5.98 % 1,219.25 41,863 6.18 % 975.04 2,593 4.49 % 655.48 304,480 5.63 % 957.67
sicer si cer 1,078,123 5.14 % 950.14 0 0 % 0 15,993 2.41 % 402.69 490,602 4.74 % 903.97 157,744 4.13 % 841.67 18,839 2.78 % 438.78 1,842 3.19 % 465.64 393,103 7.26 % 1,236.41
samo s amo 940,469 4.48 % 828.83 6 15.79 % 618.11 76,926 11.61 % 1,936.91 434,924 4.20 % 801.38 194,550 5.09 % 1,038.05 44,843 6.62 % 1,044.45 4,907 8.49 % 1,240.44 184,313 3.41 % 579.71
predvsem predv sem
844,382 4.02 % 744.15 0 0 % 0 5,733 0.86 % 144.35 442,044 4.27 % 814.50 160,555 4.20 % 856.67 27,431 4.05 % 638.90 1,744 3.02 % 440.86 206,875 3.82 % 650.68 seveda sev eda 725,852 3.46 % 639.69 0 0 % 0 24,491 3.70 % 616.66 363,766 3.51 % 670.26 184,814 4.84 % 986.11 20,809 3.07 % 484.67 2,801 4.85 % 708.06 129,171 2.39 % 406.28 celo c elo 648,010 3.09 % 571.09 0 0 % 0 27,365 4.13 % 689.02 317,838 3.07 % 585.64 135,813 3.55 % 724.65 27,194 4.01 % 633.38 1,417 2.45 % 358.20 138,383 2.56 % 435.25 skoraj sko raj 528,454 2.52 % 465.72 3 7.89 % 309.06 26,418 3.99 % 665.18 253,737 2.45 % 467.53 93,065 2.44 % 496.56 15,802 2.33 % 368.05 828 1.43 % 209.31 138,601 2.56 % 435.94 več več 520,542 2.48 % 458.75 0 0 % 0 37,136 5.60 % 935.04 242,295 2.34 % 446.44 94,562 2.47 % 504.55 18,310 2.70 % 426.46 1,689 2.92 % 426.96 126,550 2.34 % 398.03 vsaj v saj 507,625 2.42 % 447.37 0 0 % 0 20,218 3.05 % 509.07 254,444 2.46 % 468.83 102,076 2.67 % 544.64 15,753 2.32 % 366.91 1,432 2.48 % 361.99 113,702 2.10 % 357.62 morda mo rda 468,348 2.23 % 412.75 0 0 % 0 25,811 3.90 % 649.89 217,144 2.10 % 400.10 99,850 2.61 % 532.77 21,212 3.13 % 494.05 1,177 2.04 % 297.53 103,154 1.91 % 324.45 niti n iti 458,790 2.19 % 404.33 1 2.63 % 103.02 31,303 4.72 % 788.18 220,261 2.13 % 405.85 80,301 2.10 % 428.46 11,945 1.76 % 278.21 1,088 1.88 % 275.03 113,891 2.10 % 358.22 sploh sp loh 419,675 2.00 % 369.86 0 0 % 0 30,505 4.60 % 768.08 192,378 1.86 % 354.47 87,831 2.30 % 468.64 13,018 1.92 % 303.20 1,240 2.15 % 313.46 94,703 1.75 % 297.87 šele š ele 388,144 1.85 % 342.07 2 5.26 % 206.04 13,326 2.01 % 335.53 199,240 1.93 % 367.11 69,131 1.81 % 368.86 13,716 2.02 % 319.46 803 1.39 % 202.99 91,926 1.70 % 289.13 pač pač 294,972 1.41 % 259.96 0 0 % 0 13,015 1.96 % 327.70 148,590 1.44 % 273.79 66,321 1.74 % 353.87 8,471 1.25 % 197.30 2,120 3.67 % 535.91 56,455 1.04 % 177.57 zgolj zg olj 254,436 1.21 % 224.23 0 0 % 0 5,389 0.81 % 135.69 111,593 1.08 % 205.62 38,423 1.00 % 205.01 9,377 1.38 % 218.40 546 0.94 % 
138.02 89,108 1.65 % 280.27 ravno ra vno 247,808 1.18 % 218.39 1 2.63 % 103.02 15,206 2.29 % 382.87 105,504 1.02 % 194.40 58,223 1.52 % 310.66 7,237 1.07 % 168.56 782 1.35 % 197.68 60,855 1.12 % 191.40 zlasti zla sti 242,203 1.15 % 213.45 0 0 % 0 3,265 0.49 % 82.21 138,036 1.33 % 254.34 36,842 0.96 % 196.58 16,342 2.41 % 380.62 908 1.57 % 229.53 46,810 0.86 % 147.23 pravzaprav pravzap rav 198,500 0.95 % 174.94 0 0 % 0 11,349 1.71 % 285.76 93,357 0.90 % 172.02 50,059 1.31 % 267.10 7,342 1.08 % 171 774 1.34 % 195.66 35,619 0.66 % 112.03 vsekakor vseka kor 159,819 0.76 % 140.85 1 2.63 % 103.02 3,884 0.59 % 97.79 83,779 0.81 % 154.37 38,563 1.01 % 205.76 4,333 0.64 % 100.92 385 0.67 % 97.32 28,874 0.53 % 90.82 menda me nda 132,347 0.63 % 116.64 0 0 % 0 4,569 0.69 % 115.04 76,040 0.73 % 140.11 26,259 0.69 % 140.11 1,353 0.20 % 31.51 130 0.23 % 32.86 23,996 0.44 % 75.47 najbrž naj brž 123,679 0.59 % 109 0 0 % 0 13,632 2.06 % 343.24 60,518 0.58 % 111.51 24,671 0.65 % 131.64 4,216 0.62 % 98.20 369 0.64 % 93.28 20,273 0.38 % 63.76 koli k oli 107,085 0.51 % 94.37 0 0 % 0 6,575 0.99 % 165.55 49,669 0.48 % 91.52 14,370 0.38 % 76.67 5,247 0.77 % 122.21 962 1.67 % 243.18 30,262 0.56 % 95.18 češ češ 87,039 0.41 % 76.71 0 0 % 0 2,631 0.40 % 66.25 51,345 0.50 % 94.61 11,944 0.31 % 63.73 1,723 0.25 % 40.13 343 0.59 % 86.71 19,053 0.35 % 59.93 skorajda skora jda 58,291 0.28 % 51.37 0 0 % 0 2,478 0.37 % 62.39 29,411 0.28 % 54.19 12,753 0.33 % 68.05 1,690 0.25 % 39.36 70 0.12 % 17.70 11,889 0.22 % 37.39 bržkone bržk one 28,045 0.13 % 24.72 0 0 % 0 720 0.11 % 18.13 17,758 0.17 % 32.72 3,128 0.08 % 16.69 773 0.11 % 18 21 0.04 % 5.31 5,645 0.10 % 17.75 nemara nem ara 24,208 0.12 % 21.33 0 0 % 0 2,593 0.39 % 65.29 12,101 0.12 % 22.30 4,123 0.11 % 22 2,031 0.30 % 47.30 37 0.06 % 9.35 3,323 0.06 % 10.45 morebiti moreb iti 23,169 0.11 % 20.42 0 0 % 0 1,693 0.26 % 42.63 12,592 0.12 % 23.20 3,410 0.09 % 18.19 787 0.12 % 18.33 33 0.06 % 8.34 4,654 0.09 % 14.64 nikar ni kar 21,998 0.10 % 19.39 0 
0 % 0 2,458 0.37 % 61.89 8,133 0.08 % 14.99 7,672 0.20 % 40.94 1,134 0.17 % 26.41 47 0.08 % 11.88 2,554 0.05 % 8.03
kajpak kaj pak 21,308 0.10 % 18.78 0 0 % 0 482 0.07 % 12.14 14,047 0.14 % 25.88 2,845 0.07 % 15.18 214 0.03 % 4.98 5 0.01 % 1.26 3,715 0.07 % 11.68
kvečjemu kvečj emu 16,808 0.08 % 14.81 0 0 % 0 733 0.11 % 18.46 8,898 0.09 % 16.40 3,245 0.09 % 17.31 571 0.08 % 13.30 48 0.08 % 12.13 3,313 0.06 % 10.42

1.2.199 List of final character-level 4-grams from particle lower-case word forms in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-word_parts-particles-lowercase_forms-final-4grams-taxonomy-entire.tsv
Columns: Lower-case word form; Rest of the word; Final part of the word; Total absolute frequency of lower-case word form; Percentage of all found lower-case word forms; Total relative frequency (per million occurrences); Absolute frequency [SSJ.T.K.N]; Percentage [SSJ.T.K.N]; Relative frequency [SSJ.T.K.N]; Absolute frequency [SSJ.T.K.L]; Percentage [SSJ.T.K.L]; Relative frequency [SSJ.T.K.L]; Absolute frequency [SSJ.T.P.C]; Percentage [SSJ.T.P.C]; Relative frequency [SSJ.T.P.C]; Absolute frequency [SSJ.T.P.R]; Percentage [SSJ.T.P.R]; Relative frequency [SSJ.T.P.R]; Absolute frequency [SSJ.T.K.S]; Percentage [SSJ.T.K.S]; Relative frequency [SSJ.T.K.S]; Absolute frequency [SSJ.T.D]; Percentage [SSJ.T.D]; Relative frequency [SSJ.T.D]; Absolute frequency [SSJ.I]; Percentage [SSJ.I]; Relative frequency [SSJ.I]
tudi tudi 8,478,789 45.89 % 7,472.31 21 58.33 % 2,163.39 136,547 23.81 % 3,438.11 4,214,730 46.53 % 7,765.92 1,465,006 42.94 % 7,816.79 263,011 43.70 % 6,125.84 23,030 45.80 % 5,821.74 2,376,444 49.72 % 7,474.54
prav prav 1,208,290 6.54 % 1,064.86 1 2.78 % 103.02 52,405 9.14 % 1,319.50 578,438 6.39 % 1,065.81 228,510 6.70 % 1,219.25 41,863 6.96 % 975.04 2,593 5.16 % 655.48 304,480 6.37 % 957.67
sicer s icer 1,078,123 5.84 % 950.14 0 0 % 0 15,993 2.79 % 402.69 490,602 5.42 %
903.97 157,744 4.62 % 841.67 18,839 3.13 % 438.78 1,842 3.66 % 465.64 393,103 8.22 % 1,236.41 samo samo 940,469 5.09 % 828.83 6 16.67 % 618.11 76,926 13.42 % 1,936.91 434,924 4.80 % 801.38 194,550 5.70 % 1,038.05 44,843 7.45 % 1,044.45 4,907 9.76 % 1,240.44 184,313 3.86 % 579.71 predvsem pred vsem 844,382 4.57 % 744.15 0 0 % 0 5,733 1.00 % 144.35 442,044 4.88 % 814.50 160,555 4.71 % 856.67 27,431 4.56 % 638.90 1,744 3.47 % 440.86 206,875 4.33 % 650.68 seveda se veda 725,852 3.93 % 639.69 0 0 % 0 24,491 4.27 % 616.66 363,766 4.02 % 670.26 184,814 5.42 % 986.11 20,809 3.46 % 484.67 2,801 5.57 % 708.06 129,171 2.70 % 406.28 celo celo 648,010 3.51 % 571.09 0 0 % 0 27,365 4.77 % 689.02 317,838 3.51 % 585.64 135,813 3.98 % 724.65 27,194 4.52 % 633.38 1,417 2.82 % 358.20 138,383 2.90 % 435.25 skoraj sk oraj 528,454 2.86 % 465.72 3 8.33 % 309.06 26,418 4.61 % 665.18 253,737 2.80 % 467.53 93,065 2.73 % 496.56 15,802 2.63 % 368.05 828 1.65 % 209.31 138,601 2.90 % 435.94 vsaj vsaj 507,625 2.75 % 447.37 0 0 % 0 20,218 3.53 % 509.07 254,444 2.81 % 468.83 102,076 2.99 % 544.64 15,753 2.62 % 366.91 1,432 2.85 % 361.99 113,702 2.38 % 357.62 morda m orda 468,348 2.54 % 412.75 0 0 % 0 25,811 4.50 % 649.89 217,144 2.40 % 400.10 99,850 2.93 % 532.77 21,212 3.52 % 494.05 1,177 2.34 % 297.53 103,154 2.16 % 324.45 niti niti 458,790 2.48 % 404.33 1 2.78 % 103.02 31,303 5.46 % 788.18 220,261 2.43 % 405.85 80,301 2.35 % 428.46 11,945 1.99 % 278.21 1,088 2.16 % 275.03 113,891 2.38 % 358.22 sploh s ploh 419,675 2.27 % 369.86 0 0 % 0 30,505 5.32 % 768.08 192,378 2.12 % 354.47 87,831 2.57 % 468.64 13,018 2.16 % 303.20 1,240 2.47 % 313.46 94,703 1.98 % 297.87 šele šele 388,144 2.10 % 342.07 2 5.56 % 206.04 13,326 2.32 % 335.53 199,240 2.20 % 367.11 69,131 2.03 % 368.86 13,716 2.28 % 319.46 803 1.60 % 202.99 91,926 1.92 % 289.13 zgolj z golj 254,436 1.38 % 224.23 0 0 % 0 5,389 0.94 % 135.69 111,593 1.23 % 205.62 38,423 1.13 % 205.01 9,377 1.56 % 218.40 546 1.09 % 138.02 89,108 1.86 % 280.27 ravno 
r avno 247,808 1.34 % 218.39 1 2.78 % 103.02 15,206 2.65 % 382.87 105,504 1.17 % 194.40 58,223 1.71 % 310.66 7,237 1.20 % 168.56 782 1.55 % 197.68 60,855 1.27 % 191.40 zlasti zl asti 242,203 1.31 % 213.45 0 0 % 0 3,265 0.57 % 82.21 138,036 1.52 % 254.34 36,842 1.08 % 196.58 16,342 2.71 % 380.62 908 1.81 % 229.53 46,810 0.98 % 147.23 pravzaprav pravza prav 198,500 1.07 % 174.94 0 0 % 0 11,349 1.98 % 285.76 93,357 1.03 % 172.02 50,059 1.47 % 267.10 7,342 1.22 % 171 774 1.54 % 195.66 35,619 0.74 % 112.03 vsekakor vsek akor 159,819 0.86 % 140.85 1 2.78 % 103.02 3,884 0.68 % 97.79 83,779 0.93 % 154.37 38,563 1.13 % 205.76 4,333 0.72 % 100.92 385 0.77 % 97.32 28,874 0.60 % 90.82 menda m enda 132,347 0.72 % 116.64 0 0 % 0 4,569 0.80 % 115.04 76,040 0.84 % 140.11 26,259 0.77 % 140.11 1,353 0.23 % 31.51 130 0.26 % 32.86 23,996 0.50 % 75.47 najbrž na jbrž 123,679 0.67 % 109 0 0 % 0 13,632 2.38 % 343.24 60,518 0.67 % 111.51 24,671 0.72 % 131.64 4,216 0.70 % 98.20 369 0.73 % 93.28 20,273 0.42 % 63.76 koli koli 107,085 0.58 % 94.37 0 0 % 0 6,575 1.15 % 165.55 49,669 0.55 % 91.52 14,370 0.42 % 76.67 5,247 0.87 % 122.21 962 1.91 % 243.18 30,262 0.63 % 95.18 skorajda skor ajda 58,291 0.32 % 51.37 0 0 % 0 2,478 0.43 % 62.39 29,411 0.33 % 54.19 12,753 0.37 % 68.05 1,690 0.28 % 39.36 70 0.14 % 17.70 11,889 0.25 % 37.39 bržkone brž kone 28,045 0.15 % 24.72 0 0 % 0 720 0.13 % 18.13 17,758 0.20 % 32.72 3,128 0.09 % 16.69 773 0.13 % 18 21 0.04 % 5.31 5,645 0.12 % 17.75 nemara ne mara 24,208 0.13 % 21.33 0 0 % 0 2,593 0.45 % 65.29 12,101 0.13 % 22.30 4,123 0.12 % 22 2,031 0.34 % 47.30 37 0.07 % 9.35 3,323 0.07 % 10.45 morebiti more biti 23,169 0.12 % 20.42 0 0 % 0 1,693 0.29 % 42.63 12,592 0.14 % 23.20 3,410 0.10 % 18.19 787 0.13 % 18.33 33 0.07 % 8.34 4,654 0.10 % 14.64 nikar n ikar 21,998 0.12 % 19.39 0 0 % 0 2,458 0.43 % 61.89 8,133 0.09 % 14.99 7,672 0.23 % 40.94 1,134 0.19 % 26.41 47 0.09 % 11.88 2,554 0.05 % 8.03 kajpak ka jpak 21,308 0.12 % 18.78 0 0 % 0 482 0.08 % 12.14 14,047 
0.15 % 25.88 2,845 0.08 % 15.18 214 0.04 % 4.98 5 0.01 % 1.26 3,715 0.08 % 11.68
kvečjemu kveč jemu 16,808 0.09 % 14.81 0 0 % 0 733 0.13 % 18.46 8,898 0.10 % 16.40 3,245 0.10 % 17.31 571 0.10 % 13.30 48 0.10 % 12.13 3,313 0.07 % 10.42
domala do mala 16,383 0.09 % 14.44 0 0 % 0 282 0.05 % 7.10 10,557 0.12 % 19.45 2,066 0.06 % 11.02 352 0.06 % 8.20 10 0.02 % 2.53 3,116 0.07 % 9.80
edino e dino 15,222 0.08 % 13.42 0 0 % 0 1,096 0.19 % 27.60 7,646 0.08 % 14.09 2,895 0.09 % 15.45 647 0.11 % 15.07 65 0.13 % 16.43 2,873 0.06 % 9.04
kajne k ajne 13,447 0.07 % 11.85 0 0 % 0 3,958 0.69 % 99.66 3,563 0.04 % 6.57 3,581 0.10 % 19.11 233 0.04 % 5.43 14 0.03 % 3.54 2,098 0.04 % 6.60
bržčas br žčas 12,496 0.07 % 11.01 0 0 % 0 79 0.01 % 1.99 8,473 0.09 % 15.61 1,322 0.04 % 7.05 139 0.02 % 3.24 0 0 % 0 2,483 0.05 % 7.81

1.2.200 List of final character-level 5-grams from particle lower-case word forms in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-word_parts-particles-lowercase_forms-final-5grams-taxonomy-entire.tsv
Columns: Lower-case word form; Rest of the word; Final part of the word; Total absolute frequency of lower-case word form; Percentage of all found lower-case word forms; Total relative frequency (per million occurrences); Absolute frequency [SSJ.T.K.N]; Percentage [SSJ.T.K.N]; Relative frequency [SSJ.T.K.N]; Absolute frequency [SSJ.T.K.L]; Percentage [SSJ.T.K.L]; Relative frequency [SSJ.T.K.L]; Absolute frequency [SSJ.T.P.C]; Percentage [SSJ.T.P.C]; Relative frequency [SSJ.T.P.C]; Absolute frequency [SSJ.T.P.R]; Percentage [SSJ.T.P.R]; Relative frequency [SSJ.T.P.R]; Absolute frequency [SSJ.T.K.S]; Percentage [SSJ.T.K.S]; Relative frequency [SSJ.T.K.S]; Absolute frequency [SSJ.T.D]; Percentage [SSJ.T.D]; Relative frequency [SSJ.T.D]; Absolute frequency [SSJ.I]; Percentage [SSJ.I]; Relative frequency [SSJ.I]
sicer sicer 1,078,123 18.85 % 950.14 0 0 % 0 15,993 7.74 % 402.69 490,602 17.64 %
903.97 157,744 14.14 % 841.67 18,839 10.60 % 438.78 1,842 13.17 % 465.64 393,103 27.62 % 1,236.41 predvsem pre dvsem 844,382 14.77 % 744.15 0 0 % 0 5,733 2.77 % 144.35 442,044 15.89 % 814.50 160,555 14.39 % 856.67 27,431 15.43 % 638.90 1,744 12.47 % 440.86 206,875 14.53 % 650.68 seveda s eveda 725,852 12.69 % 639.69 0 0 % 0 24,491 11.85 % 616.66 363,766 13.08 % 670.26 184,814 16.57 % 986.11 20,809 11.71 % 484.67 2,801 20.03 % 708.06 129,171 9.07 % 406.28 skoraj s koraj 528,454 9.24 % 465.72 3 60.00 % 309.06 26,418 12.79 % 665.18 253,737 9.12 % 467.53 93,065 8.34 % 496.56 15,802 8.89 % 368.05 828 5.92 % 209.31 138,601 9.74 % 435.94 morda morda 468,348 8.19 % 412.75 0 0 % 0 25,811 12.49 % 649.89 217,144 7.81 % 400.10 99,850 8.95 % 532.77 21,212 11.93 % 494.05 1,177 8.41 % 297.53 103,154 7.25 % 324.45 sploh sploh 419,675 7.34 % 369.86 0 0 % 0 30,505 14.76 % 768.08 192,378 6.92 % 354.47 87,831 7.87 % 468.64 13,018 7.32 % 303.20 1,240 8.87 % 313.46 94,703 6.65 % 297.87 zgolj zgolj 254,436 4.45 % 224.23 0 0 % 0 5,389 2.61 % 135.69 111,593 4.01 % 205.62 38,423 3.44 % 205.01 9,377 5.28 % 218.40 546 3.90 % 138.02 89,108 6.26 % 280.27 ravno ravno 247,808 4.33 % 218.39 1 20.00 % 103.02 15,206 7.36 % 382.87 105,504 3.79 % 194.40 58,223 5.22 % 310.66 7,237 4.07 % 168.56 782 5.59 % 197.68 60,855 4.28 % 191.40 zlasti z lasti 242,203 4.24 % 213.45 0 0 % 0 3,265 1.58 % 82.21 138,036 4.96 % 254.34 36,842 3.30 % 196.58 16,342 9.19 % 380.62 908 6.49 % 229.53 46,810 3.29 % 147.23 pravzaprav pravz aprav 198,500 3.47 % 174.94 0 0 % 0 11,349 5.49 % 285.76 93,357 3.36 % 172.02 50,059 4.49 % 267.10 7,342 4.13 % 171 774 5.53 % 195.66 35,619 2.50 % 112.03 vsekakor vse kakor 159,819 2.79 % 140.85 1 20.00 % 103.02 3,884 1.88 % 97.79 83,779 3.01 % 154.37 38,563 3.46 % 205.76 4,333 2.44 % 100.92 385 2.75 % 97.32 28,874 2.03 % 90.82 menda menda 132,347 2.31 % 116.64 0 0 % 0 4,569 2.21 % 115.04 76,040 2.73 % 140.11 26,259 2.35 % 140.11 1,353 0.76 % 31.51 130 0.93 % 32.86 23,996 1.69 % 75.47 najbrž 
n ajbrž 123,679 2.16 % 109 0 0 % 0 13,632 6.60 % 343.24 60,518 2.18 % 111.51 24,671 2.21 % 131.64 4,216 2.37 % 98.20 369 2.64 % 93.28 20,273 1.42 % 63.76 skorajda sko rajda 58,291 1.02 % 51.37 0 0 % 0 2,478 1.20 % 62.39 29,411 1.06 % 54.19 12,753 1.14 % 68.05 1,690 0.95 % 39.36 70 0.50 % 17.70 11,889 0.83 % 37.39 bržkone br žkone 28,045 0.49 % 24.72 0 0 % 0 720 0.35 % 18.13 17,758 0.64 % 32.72 3,128 0.28 % 16.69 773 0.43 % 18 21 0.15 % 5.31 5,645 0.40 % 17.75 nemara n emara 24,208 0.42 % 21.33 0 0 % 0 2,593 1.25 % 65.29 12,101 0.43 % 22.30 4,123 0.37 % 22 2,031 1.14 % 47.30 37 0.27 % 9.35 3,323 0.23 % 10.45 morebiti mor ebiti 23,169 0.41 % 20.42 0 0 % 0 1,693 0.82 % 42.63 12,592 0.45 % 23.20 3,410 0.31 % 18.19 787 0.44 % 18.33 33 0.24 % 8.34 4,654 0.33 % 14.64 nikar nikar 21,998 0.39 % 19.39 0 0 % 0 2,458 1.19 % 61.89 8,133 0.29 % 14.99 7,672 0.69 % 40.94 1,134 0.64 % 26.41 47 0.34 % 11.88 2,554 0.18 % 8.03 kajpak k ajpak 21,308 0.37 % 18.78 0 0 % 0 482 0.23 % 12.14 14,047 0.51 % 25.88 2,845 0.26 % 15.18 214 0.12 % 4.98 5 0.04 % 1.26 3,715 0.26 % 11.68 kvečjemu kve čjemu 16,808 0.29 % 14.81 0 0 % 0 733 0.35 % 18.46 8,898 0.32 % 16.40 3,245 0.29 % 17.31 571 0.32 % 13.30 48 0.34 % 12.13 3,313 0.23 % 10.42 domala d omala 16,383 0.29 % 14.44 0 0 % 0 282 0.14 % 7.10 10,557 0.38 % 19.45 2,066 0.18 % 11.02 352 0.20 % 8.20 10 0.07 % 2.53 3,116 0.22 % 9.80 edino edino 15,222 0.27 % 13.42 0 0 % 0 1,096 0.53 % 27.60 7,646 0.28 % 14.09 2,895 0.26 % 15.45 647 0.36 % 15.07 65 0.47 % 16.43 2,873 0.20 % 9.04 kajne kajne 13,447 0.23 % 11.85 0 0 % 0 3,958 1.92 % 99.66 3,563 0.13 % 6.57 3,581 0.32 % 19.11 233 0.13 % 5.43 14 0.10 % 3.54 2,098 0.15 % 6.60 bržčas b ržčas 12,496 0.22 % 11.01 0 0 % 0 79 0.04 % 1.99 8,473 0.30 % 15.61 1,322 0.12 % 7.05 139 0.08 % 3.24 0 0 % 0 2,483 0.17 % 7.81 bojda bojda 8,067 0.14 % 7.11 0 0 % 0 258 0.12 % 6.50 3,899 0.14 % 7.18 2,303 0.21 % 12.29 66 0.04 % 1.54 3 0.02 % 0.76 1,538 0.11 % 4.84 kajpada ka jpada 6,710 0.12 % 5.91 0 0 % 0 432 0.21 % 10.88 
3,814 0.14 % 7.03 1,288 0.12 % 6.87 393 0.22 % 9.15 3 0.02 % 0.76 780 0.06 % 2.45
kakopak ka kopak 6,398 0.11 % 5.64 0 0 % 0 142 0.07 % 3.58 1,685 0.06 % 3.10 3,126 0.28 % 16.68 44 0.03 % 1.02 0 0 % 0 1,401 0.10 % 4.41
malodane mal odane 5,717 0.10 % 5.04 0 0 % 0 498 0.24 % 12.54 2,960 0.11 % 5.45 1,147 0.10 % 6.12 118 0.07 % 2.75 3 0.02 % 0.76 991 0.07 % 3.12
malone m alone 4,179 0.07 % 3.68 0 0 % 0 664 0.32 % 16.72 2,087 0.07 % 3.85 555 0.05 % 2.96 313 0.18 % 7.29 21 0.15 % 5.31 539 0.04 % 1.70
takorekoč tako rekoč 3,914 0.07 % 3.45 0 0 % 0 114 0.06 % 2.87 1,579 0.06 % 2.91 1,331 0.12 % 7.10 264 0.15 % 6.15 42 0.30 % 10.62 584 0.04 % 1.84
edinole ed inole 2,874 0.05 % 2.53 0 0 % 0 444 0.21 % 11.18 1,180 0.04 % 2.17 611 0.06 % 3.26 342 0.19 % 7.97 29 0.21 % 7.33 268 0.02 % 0.84
kratkomalo kratk omalo 1,141 0.02 % 1.01 0 0 % 0 90 0.04 % 2.27 627 0.02 % 1.16 318 0.03 % 1.70 59 0.03 % 1.37 3 0.02 % 0.76 44 0 % 0.14

1.2.201 List of initial character-level 1-grams from interjection lemmas in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-word_parts-interjections-lemmas-initial-1grams-taxonomy-entire.tsv
Columns: Lemma; Lemma (lower-case); Initial part of the word; Rest of the word; Total absolute frequency of lemma; Percentage of all found lemmas; Total relative frequency (per million occurrences); Absolute frequency [SSJ.T.K.N]; Percentage [SSJ.T.K.N]; Relative frequency [SSJ.T.K.N]; Absolute frequency [SSJ.T.K.L]; Percentage [SSJ.T.K.L]; Relative frequency [SSJ.T.K.L]; Absolute frequency [SSJ.T.P.C]; Percentage [SSJ.T.P.C]; Relative frequency [SSJ.T.P.C]; Absolute frequency [SSJ.T.P.R]; Percentage [SSJ.T.P.R]; Relative frequency [SSJ.T.P.R]; Absolute frequency [SSJ.T.K.S]; Percentage [SSJ.T.K.S]; Relative frequency [SSJ.T.K.S]; Absolute frequency [SSJ.T.D]; Percentage [SSJ.T.D]; Relative frequency [SSJ.T.D]; Absolute frequency [SSJ.I]; Percentage [SSJ.I]; Relative frequency [SSJ.I]
oh oh o h
14,865 10.91 % 13.10 0 0 % 0 5,344 19.95 % 134.56 3,245 7.48 % 5.98 3,990 9.80 % 21.29 760 18.32 % 17.70 25 7.14 % 6.32 1,501 7.18 % 4.72 ah ah a h 13,301 9.76 % 11.72 0 0 % 0 2,696 10.06 % 67.88 4,963 11.44 % 9.14 3,766 9.25 % 20.09 445 10.73 % 10.36 30 8.57 % 7.58 1,401 6.71 % 4.41 ha ha h a 11,172 8.20 % 9.85 0 0 % 0 1,638 6.11 % 41.24 4,524 10.43 % 8.34 3,900 9.58 % 20.81 320 7.71 % 7.45 33 9.43 % 8.34 757 3.62 % 2.38 hm hm h m 8,313 6.10 % 7.33 0 0 % 0 1,142 4.26 % 28.75 3,096 7.13 % 5.70 3,089 7.59 % 16.48 169 4.07 % 3.94 12 3.43 % 3.03 805 3.85 % 2.53 hej hej h ej 7,497 5.50 % 6.61 0 0 % 0 1,515 5.66 % 38.15 1,767 4.07 % 3.26 3,368 8.27 % 17.97 210 5.06 % 4.89 4 1.14 % 1.01 633 3.03 % 1.99 joj joj j oj 6,441 4.73 % 5.68 0 0 % 0 1,074 4.01 % 27.04 2,157 4.97 % 3.97 2,162 5.31 % 11.54 138 3.33 % 3.21 33 9.43 % 8.34 877 4.20 % 2.76 aha aha a ha 6,335 4.65 % 5.58 0 0 % 0 1,184 4.42 % 29.81 1,301 3.00 % 2.40 1,538 3.78 % 8.21 160 3.86 % 3.73 25 7.14 % 6.32 2,127 10.18 % 6.69 fak fak f ak 4,669 3.43 % 4.11 0 0 % 0 160 0.60 % 4.03 190 0.44 % 0.35 98 0.24 % 0.52 2 0.05 % 0.05 0 0 % 0 4,219 20.20 % 13.27 ej ej e j 3,721 2.73 % 3.28 0 0 % 0 1,232 4.60 % 31.02 1,049 2.42 % 1.93 830 2.04 % 4.43 241 5.81 % 5.61 7 2.00 % 1.77 362 1.73 % 1.14 hi hi h i 3,425 2.51 % 3.02 0 0 % 0 169 0.63 % 4.26 1,246 2.87 % 2.30 1,521 3.74 % 8.12 111 2.68 % 2.59 22 6.29 % 5.56 356 1.70 % 1.12 bravo bravo b ravo 3,087 2.27 % 2.72 0 0 % 0 254 0.95 % 6.40 1,220 2.81 % 2.25 848 2.08 % 4.52 37 0.89 % 0.86 4 1.14 % 1.01 724 3.47 % 2.28 uf uf u f 2,808 2.06 % 2.47 0 0 % 0 254 0.95 % 6.40 834 1.92 % 1.54 1,109 2.72 % 5.92 25 0.60 % 0.58 0 0 % 0 586 2.81 % 1.84 zbogom zbogom z bogom 2,474 1.81 % 2.18 0 0 % 0 360 1.34 % 9.06 1,062 2.45 % 1.96 532 1.31 % 2.84 90 2.17 % 2.10 3 0.86 % 0.76 427 2.04 % 1.34 uh uh u h 2,435 1.79 % 2.15 0 0 % 0 336 1.25 % 8.46 830 1.91 % 1.53 983 2.41 % 5.24 48 1.16 % 1.12 2 0.57 % 0.51 236 1.13 % 0.74 adijo adijo a dijo 2,414 1.77 % 2.13 0 0 % 0 416 1.55 % 10.47 961 2.21 % 
1.77 608 1.49 % 3.24 60 1.45 % 1.40 1 0.29 % 0.25 368 1.76 % 1.16 eh eh e h 2,409 1.77 % 2.12 0 0 % 0 399 1.49 % 10.05 655 1.51 % 1.21 1,114 2.74 % 5.94 47 1.13 % 1.09 1 0.29 % 0.25 193 0.92 % 0.61 ho ho h o 2,160 1.58 % 1.90 0 0 % 0 220 0.82 % 5.54 784 1.81 % 1.44 504 1.24 % 2.69 133 3.21 % 3.10 18 5.14 % 4.55 501 2.40 % 1.58 bla bla b la 1,985 1.46 % 1.75 0 0 % 0 325 1.21 % 8.18 824 1.90 % 1.52 512 1.26 % 2.73 54 1.30 % 1.26 11 3.14 % 2.78 259 1.24 % 0.81 la la l a 1,888 1.39 % 1.66 0 0 % 0 90 0.34 % 2.27 919 2.12 % 1.69 450 1.10 % 2.40 72 1.74 % 1.68 3 0.86 % 0.76 354 1.70 % 1.11 haha haha h aha 1,798 1.32 % 1.58 0 0 % 0 86 0.32 % 2.17 448 1.03 % 0.83 915 2.25 % 4.88 5 0.12 % 0.12 0 0 % 0 344 1.65 % 1.08 oj oj o j 1,484 1.09 % 1.31 0 0 % 0 218 0.81 % 5.49 651 1.50 % 1.20 318 0.78 % 1.70 149 3.59 % 3.47 9 2.57 % 2.28 139 0.67 % 0.44 huh huh h uh 1,382 1.01 % 1.22 0 0 % 0 23 0.09 % 0.58 97 0.22 % 0.18 1,180 2.90 % 6.30 10 0.24 % 0.23 0 0 % 0 72 0.34 % 0.23 alo alo a lo 1,291 0.95 % 1.14 0 0 % 0 100 0.37 % 2.52 599 1.38 % 1.10 221 0.54 % 1.18 25 0.60 % 0.58 0 0 % 0 346 1.66 % 1.09 hvalabogu hvalabogu h valabogu 1,286 0.94 % 1.13 0 0 % 0 146 0.55 % 3.68 671 1.55 % 1.24 300 0.74 % 1.60 17 0.41 % 0.40 4 1.14 % 1.01 148 0.71 % 0.47 jebiga jebiga j ebiga 1,175 0.86 % 1.04 0 0 % 0 511 1.91 % 12.87 321 0.74 % 0.59 272 0.67 % 1.45 5 0.12 % 0.12 3 0.86 % 0.76 63 0.30 % 0.20 nasvidenje nasvidenje n asvidenje 1,162 0.85 % 1.02 0 0 % 0 130 0.48 % 3.27 620 1.43 % 1.14 181 0.45 % 0.97 28 0.68 % 0.65 35 10.00 % 8.85 168 0.80 % 0.53 oho oho o ho 1,160 0.85 % 1.02 0 0 % 0 135 0.50 % 3.40 421 0.97 % 0.78 217 0.53 % 1.16 27 0.65 % 0.63 1 0.29 % 0.25 359 1.72 % 1.13 hura hura h ura 1,087 0.80 % 0.96 0 0 % 0 121 0.45 % 3.05 531 1.22 % 0.98 289 0.71 % 1.54 38 0.92 % 0.89 1 0.29 % 0.25 107 0.51 % 0.34 jah jah j ah 1,031 0.76 % 0.91 0 0 % 0 22 0.08 % 0.55 650 1.50 % 1.20 235 0.58 % 1.25 8 0.19 % 0.19 1 0.29 % 0.25 115 0.55 % 0.36 ups ups u ps 972 0.71 % 0.86 0 0 % 0 74 0.28 % 1.86 302 
0.70 % 0.56 468 1.15 % 2.50 8 0.19 % 0.19 5 1.43 % 1.26 115 0.55 % 0.36
hopla hopla h opla 946 0.69 % 0.83 0 0 % 0 42 0.16 % 1.06 338 0.78 % 0.62 535 1.31 % 2.85 5 0.12 % 0.12 3 0.86 % 0.76 23 0.11 % 0.07
av av a v 944 0.69 % 0.83 0 0 % 0 78 0.29 % 1.96 588 1.35 % 1.08 200 0.49 % 1.07 28 0.68 % 0.65 10 2.86 % 2.53 40 0.19 % 0.13

1.2.202 List of initial character-level 2-grams from interjection lemmas in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-word_parts-interjections-lemmas-initial-2grams-taxonomy-entire.tsv
Columns: Lemma; Lemma (lower-case); Initial part of the word; Rest of the word; Total absolute frequency of lemma; Percentage of all found lemmas; Total relative frequency (per million occurrences); Absolute frequency [SSJ.T.K.N]; Percentage [SSJ.T.K.N]; Relative frequency [SSJ.T.K.N]; Absolute frequency [SSJ.T.K.L]; Percentage [SSJ.T.K.L]; Relative frequency [SSJ.T.K.L]; Absolute frequency [SSJ.T.P.C]; Percentage [SSJ.T.P.C]; Relative frequency [SSJ.T.P.C]; Absolute frequency [SSJ.T.P.R]; Percentage [SSJ.T.P.R]; Relative frequency [SSJ.T.P.R]; Absolute frequency [SSJ.T.K.S]; Percentage [SSJ.T.K.S]; Relative frequency [SSJ.T.K.S]; Absolute frequency [SSJ.T.D]; Percentage [SSJ.T.D]; Relative frequency [SSJ.T.D]; Absolute frequency [SSJ.I]; Percentage [SSJ.I]; Relative frequency [SSJ.I]
oh oh oh 14,865 10.91 % 13.10 0 0 % 0 5,344 19.95 % 134.56 3,245 7.48 % 5.98 3,990 9.80 % 21.29 760 18.32 % 17.70 25 7.14 % 6.32 1,501 7.18 % 4.72
ah ah ah 13,301 9.76 % 11.72 0 0 % 0 2,696 10.06 % 67.88 4,963 11.44 % 9.14 3,766 9.25 % 20.09 445 10.73 % 10.36 30 8.57 % 7.58 1,401 6.71 % 4.41
ha ha ha 11,172 8.20 % 9.85 0 0 % 0 1,638 6.11 % 41.24 4,524 10.43 % 8.34 3,900 9.58 % 20.81 320 7.71 % 7.45 33 9.43 % 8.34 757 3.62 % 2.38
hm hm hm 8,313 6.10 % 7.33 0 0 % 0 1,142 4.26 % 28.75 3,096 7.13 % 5.70 3,089 7.59 % 16.48 169 4.07 % 3.94 12 3.43 % 3.03 805 3.85 % 2.53
hej hej he j 7,497 5.50 % 6.61 0 0
% 0 1,515 5.66 % 38.15 1,767 4.07 % 3.26 3,368 8.27 % 17.97 210 5.06 % 4.89 4 1.14 % 1.01 633 3.03 % 1.99 joj joj jo j 6,441 4.73 % 5.68 0 0 % 0 1,074 4.01 % 27.04 2,157 4.97 % 3.97 2,162 5.31 % 11.54 138 3.33 % 3.21 33 9.43 % 8.34 877 4.20 % 2.76 aha aha ah a 6,335 4.65 % 5.58 0 0 % 0 1,184 4.42 % 29.81 1,301 3.00 % 2.40 1,538 3.78 % 8.21 160 3.86 % 3.73 25 7.14 % 6.32 2,127 10.18 % 6.69 fak fak fa k 4,669 3.43 % 4.11 0 0 % 0 160 0.60 % 4.03 190 0.44 % 0.35 98 0.24 % 0.52 2 0.05 % 0.05 0 0 % 0 4,219 20.20 % 13.27 ej ej ej 3,721 2.73 % 3.28 0 0 % 0 1,232 4.60 % 31.02 1,049 2.42 % 1.93 830 2.04 % 4.43 241 5.81 % 5.61 7 2.00 % 1.77 362 1.73 % 1.14 hi hi hi 3,425 2.51 % 3.02 0 0 % 0 169 0.63 % 4.26 1,246 2.87 % 2.30 1,521 3.74 % 8.12 111 2.68 % 2.59 22 6.29 % 5.56 356 1.70 % 1.12 bravo bravo br avo 3,087 2.27 % 2.72 0 0 % 0 254 0.95 % 6.40 1,220 2.81 % 2.25 848 2.08 % 4.52 37 0.89 % 0.86 4 1.14 % 1.01 724 3.47 % 2.28 uf uf uf 2,808 2.06 % 2.47 0 0 % 0 254 0.95 % 6.40 834 1.92 % 1.54 1,109 2.72 % 5.92 25 0.60 % 0.58 0 0 % 0 586 2.81 % 1.84 zbogom zbogom zb ogom 2,474 1.81 % 2.18 0 0 % 0 360 1.34 % 9.06 1,062 2.45 % 1.96 532 1.31 % 2.84 90 2.17 % 2.10 3 0.86 % 0.76 427 2.04 % 1.34 uh uh uh 2,435 1.79 % 2.15 0 0 % 0 336 1.25 % 8.46 830 1.91 % 1.53 983 2.41 % 5.24 48 1.16 % 1.12 2 0.57 % 0.51 236 1.13 % 0.74 adijo adijo ad ijo 2,414 1.77 % 2.13 0 0 % 0 416 1.55 % 10.47 961 2.21 % 1.77 608 1.49 % 3.24 60 1.45 % 1.40 1 0.29 % 0.25 368 1.76 % 1.16 eh eh eh 2,409 1.77 % 2.12 0 0 % 0 399 1.49 % 10.05 655 1.51 % 1.21 1,114 2.74 % 5.94 47 1.13 % 1.09 1 0.29 % 0.25 193 0.92 % 0.61 ho ho ho 2,160 1.58 % 1.90 0 0 % 0 220 0.82 % 5.54 784 1.81 % 1.44 504 1.24 % 2.69 133 3.21 % 3.10 18 5.14 % 4.55 501 2.40 % 1.58 bla bla bl a 1,985 1.46 % 1.75 0 0 % 0 325 1.21 % 8.18 824 1.90 % 1.52 512 1.26 % 2.73 54 1.30 % 1.26 11 3.14 % 2.78 259 1.24 % 0.81 la la la 1,888 1.39 % 1.66 0 0 % 0 90 0.34 % 2.27 919 2.12 % 1.69 450 1.10 % 2.40 72 1.74 % 1.68 3 0.86 % 0.76 354 1.70 % 1.11 haha haha ha ha 
1,798 1.32 % 1.58 0 0 % 0 86 0.32 % 2.17 448 1.03 % 0.83 915 2.25 % 4.88 5 0.12 % 0.12 0 0 % 0 344 1.65 % 1.08
oj oj oj 1,484 1.09 % 1.31 0 0 % 0 218 0.81 % 5.49 651 1.50 % 1.20 318 0.78 % 1.70 149 3.59 % 3.47 9 2.57 % 2.28 139 0.67 % 0.44
huh huh hu h 1,382 1.01 % 1.22 0 0 % 0 23 0.09 % 0.58 97 0.22 % 0.18 1,180 2.90 % 6.30 10 0.24 % 0.23 0 0 % 0 72 0.34 % 0.23
alo alo al o 1,291 0.95 % 1.14 0 0 % 0 100 0.37 % 2.52 599 1.38 % 1.10 221 0.54 % 1.18 25 0.60 % 0.58 0 0 % 0 346 1.66 % 1.09
hvalabogu hvalabogu hv alabogu 1,286 0.94 % 1.13 0 0 % 0 146 0.55 % 3.68 671 1.55 % 1.24 300 0.74 % 1.60 17 0.41 % 0.40 4 1.14 % 1.01 148 0.71 % 0.47
jebiga jebiga je biga 1,175 0.86 % 1.04 0 0 % 0 511 1.91 % 12.87 321 0.74 % 0.59 272 0.67 % 1.45 5 0.12 % 0.12 3 0.86 % 0.76 63 0.30 % 0.20
nasvidenje nasvidenje na svidenje 1,162 0.85 % 1.02 0 0 % 0 130 0.48 % 3.27 620 1.43 % 1.14 181 0.45 % 0.97 28 0.68 % 0.65 35 10.00 % 8.85 168 0.80 % 0.53
oho oho oh o 1,160 0.85 % 1.02 0 0 % 0 135 0.50 % 3.40 421 0.97 % 0.78 217 0.53 % 1.16 27 0.65 % 0.63 1 0.29 % 0.25 359 1.72 % 1.13
hura hura hu ra 1,087 0.80 % 0.96 0 0 % 0 121 0.45 % 3.05 531 1.22 % 0.98 289 0.71 % 1.54 38 0.92 % 0.89 1 0.29 % 0.25 107 0.51 % 0.34
jah jah ja h 1,031 0.76 % 0.91 0 0 % 0 22 0.08 % 0.55 650 1.50 % 1.20 235 0.58 % 1.25 8 0.19 % 0.19 1 0.29 % 0.25 115 0.55 % 0.36
ups ups up s 972 0.71 % 0.86 0 0 % 0 74 0.28 % 1.86 302 0.70 % 0.56 468 1.15 % 2.50 8 0.19 % 0.19 5 1.43 % 1.26 115 0.55 % 0.36
hopla hopla ho pla 946 0.69 % 0.83 0 0 % 0 42 0.16 % 1.06 338 0.78 % 0.62 535 1.31 % 2.85 5 0.12 % 0.12 3 0.86 % 0.76 23 0.11 % 0.07
av av av 944 0.69 % 0.83 0 0 % 0 78 0.29 % 1.96 588 1.35 % 1.08 200 0.49 % 1.07 28 0.68 % 0.65 10 2.86 % 2.53 40 0.19 % 0.13

1.2.203 List of initial character-level 3-grams from interjection lemmas in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-word_parts-interjections-lemmas-initial-3grams-taxonomy-entire.tsv
Columns: Lemma; Lemma (lower-case); Initial part of the word; Rest of the word; Total absolute frequency of lemma; Percentage of all found lemmas; Total relative frequency (per million occurrences); Absolute frequency [SSJ.T.K.N]; Percentage [SSJ.T.K.N]; Relative frequency [SSJ.T.K.N]; Absolute frequency [SSJ.T.K.L]; Percentage [SSJ.T.K.L]; Relative frequency [SSJ.T.K.L]; Absolute frequency [SSJ.T.P.C]; Percentage [SSJ.T.P.C]; Relative frequency [SSJ.T.P.C]; Absolute frequency [SSJ.T.P.R]; Percentage [SSJ.T.P.R]; Relative frequency [SSJ.T.P.R]; Absolute frequency [SSJ.T.K.S]; Percentage [SSJ.T.K.S]; Relative frequency [SSJ.T.K.S]; Absolute frequency [SSJ.T.D]; Percentage [SSJ.T.D]; Relative frequency [SSJ.T.D]; Absolute frequency [SSJ.I]; Percentage [SSJ.I]; Relative frequency [SSJ.I]
hej hej hej 7,497 11.28 % 6.61 0 0 % 0 1,515 11.94 % 38.15 1,767 8.96 % 3.26 3,368 17.95 % 17.97 210 13.38 % 4.89 4 2.31 % 1.01 633 4.66 % 1.99
joj joj joj 6,441 9.69 % 5.68 0 0 % 0 1,074 8.47 % 27.04 2,157 10.94 % 3.97 2,162 11.52 % 11.54 138 8.79 % 3.21 33 19.07 % 8.34 877 6.45 % 2.76
aha aha aha 6,335 9.53 % 5.58 0 0 % 0 1,184 9.34 % 29.81 1,301 6.60 % 2.40 1,538 8.20 % 8.21 160 10.20 % 3.73 25 14.45 % 6.32 2,127 15.65 % 6.69
fak fak fak 4,669 7.02 % 4.11 0 0 % 0 160 1.26 % 4.03 190 0.96 % 0.35 98 0.52 % 0.52 2 0.13 % 0.05 0 0 % 0 4,219 31.04 % 13.27
bravo bravo bra vo 3,087 4.64 % 2.72 0 0 % 0 254 2.00 % 6.40 1,220 6.19 % 2.25 848 4.52 % 4.52 37 2.36 % 0.86 4 2.31 % 1.01 724 5.33 % 2.28
zbogom zbogom zbo gom 2,474 3.72 % 2.18 0 0 % 0 360 2.84 % 9.06 1,062 5.39 % 1.96 532 2.84 % 2.84 90 5.74 % 2.10 3 1.73 % 0.76 427 3.14 % 1.34
adijo adijo adi jo 2,414 3.63 % 2.13 0 0 % 0 416 3.28 % 10.47 961 4.87 % 1.77 608 3.24 % 3.24 60 3.82 % 1.40 1 0.58 % 0.25 368 2.71 % 1.16
bla bla bla 1,985 2.98 % 1.75 0 0 % 0 325 2.56 % 8.18 824 4.18 % 1.52 512 2.73 % 2.73 54 3.44 % 1.26 11 6.36 % 2.78 259 1.91 % 0.81
haha haha hah a 1,798 2.70 % 1.58 0 0 % 0 86 0.68 % 2.17 448 2.27 % 0.83 915 4.88 % 4.88 5 0.32 %
0.12 0 0 % 0 344 2.53 % 1.08 huh huh huh 1,382 2.08 % 1.22 0 0 % 0 23 0.18 % 0.58 97 0.49 % 0.18 1,180 6.29 % 6.30 10 0.64 % 0.23 0 0 % 0 72 0.53 % 0.23 alo alo alo 1,291 1.94 % 1.14 0 0 % 0 100 0.79 % 2.52 599 3.04 % 1.10 221 1.18 % 1.18 25 1.59 % 0.58 0 0 % 0 346 2.55 % 1.09 hvalabogu hvalabogu hva labogu 1,286 1.93 % 1.13 0 0 % 0 146 1.15 % 3.68 671 3.40 % 1.24 300 1.60 % 1.60 17 1.08 % 0.40 4 2.31 % 1.01 148 1.09 % 0.47 jebiga jebiga jeb iga 1,175 1.77 % 1.04 0 0 % 0 511 4.03 % 12.87 321 1.63 % 0.59 272 1.45 % 1.45 5 0.32 % 0.12 3 1.73 % 0.76 63 0.46 % 0.20 nasvidenje nasvidenje nas videnje 1,162 1.75 % 1.02 0 0 % 0 130 1.02 % 3.27 620 3.15 % 1.14 181 0.96 % 0.97 28 1.78 % 0.65 35 20.23 % 8.85 168 1.24 % 0.53 oho oho oho 1,160 1.75 % 1.02 0 0 % 0 135 1.06 % 3.40 421 2.13 % 0.78 217 1.16 % 1.16 27 1.72 % 0.63 1 0.58 % 0.25 359 2.64 % 1.13 hura hura hur a 1,087 1.64 % 0.96 0 0 % 0 121 0.95 % 3.05 531 2.69 % 0.98 289 1.54 % 1.54 38 2.42 % 0.89 1 0.58 % 0.25 107 0.79 % 0.34 jah jah jah 1,031 1.55 % 0.91 0 0 % 0 22 0.17 % 0.55 650 3.30 % 1.20 235 1.25 % 1.25 8 0.51 % 0.19 1 0.58 % 0.25 115 0.85 % 0.36 ups ups ups 972 1.46 % 0.86 0 0 % 0 74 0.58 % 1.86 302 1.53 % 0.56 468 2.50 % 2.50 8 0.51 % 0.19 5 2.89 % 1.26 115 0.85 % 0.36 hopla hopla hop la 946 1.42 % 0.83 0 0 % 0 42 0.33 % 1.06 338 1.71 % 0.62 535 2.85 % 2.85 5 0.32 % 0.12 3 1.73 % 0.76 23 0.17 % 0.07 ojoj ojoj ojo j 890 1.34 % 0.78 0 0 % 0 282 2.22 % 7.10 202 1.02 % 0.37 241 1.28 % 1.29 15 0.96 % 0.35 4 2.31 % 1.01 146 1.07 % 0.46 zaboga zaboga zab oga 831 1.25 % 0.73 0 0 % 0 455 3.59 % 11.46 168 0.85 % 0.31 131 0.70 % 0.70 35 2.23 % 0.82 0 0 % 0 42 0.31 % 0.13 fuj fuj fuj 800 1.20 % 0.71 0 0 % 0 197 1.55 % 4.96 218 1.11 % 0.40 246 1.31 % 1.31 24 1.53 % 0.56 4 2.31 % 1.01 111 0.82 % 0.35 ejga ejga ejg a 756 1.14 % 0.67 0 0 % 0 45 0.35 % 1.13 541 2.74 % 1 61 0.33 % 0.33 0 0 % 0 0 0 % 0 109 0.80 % 0.34 jebenti jebenti jeb enti 753 1.13 % 0.66 0 0 % 0 576 4.54 % 14.50 64 0.33 % 0.12 87 0.46 % 0.46 7 0.45 % 0.16 0 
0 % 0 19 0.14 % 0.06
opa opa opa 673 1.01 % 0.59 0 0 % 0 70 0.55 % 1.76 243 1.23 % 0.45 256 1.36 % 1.37 25 1.59 % 0.58 0 0 % 0 79 0.58 % 0.25
jebemti jebemti jeb emti 643 0.97 % 0.57 0 0 % 0 498 3.93 % 12.54 52 0.26 % 0.10 54 0.29 % 0.29 1 0.06 % 0.02 0 0 % 0 38 0.28 % 0.12
mhm mhm mhm 612 0.92 % 0.54 0 0 % 0 223 1.76 % 5.61 88 0.45 % 0.16 256 1.36 % 1.37 16 1.02 % 0.37 0 0 % 0 29 0.21 % 0.09
ojej ojej oje j 603 0.91 % 0.53 0 0 % 0 291 2.29 % 7.33 107 0.54 % 0.20 129 0.69 % 0.69 19 1.21 % 0.44 0 0 % 0 57 0.42 % 0.18
paf paf paf 520 0.78 % 0.46 0 0 % 0 57 0.45 % 1.44 374 1.90 % 0.69 59 0.31 % 0.31 2 0.13 % 0.05 1 0.58 % 0.25 27 0.20 % 0.08
pst pst pst 436 0.66 % 0.38 0 0 % 0 66 0.52 % 1.66 102 0.52 % 0.19 63 0.34 % 0.34 53 3.38 % 1.23 4 2.31 % 1.01 148 1.09 % 0.47
čao čao čao 412 0.62 % 0.36 0 0 % 0 97 0.77 % 2.44 109 0.55 % 0.20 105 0.56 % 0.56 8 0.51 % 0.19 0 0 % 0 93 0.68 % 0.29
hov hov hov 359 0.54 % 0.32 0 0 % 0 25 0.20 % 0.63 159 0.81 % 0.29 82 0.44 % 0.44 22 1.40 % 0.51 1 0.58 % 0.25 70 0.52 % 0.22

1.2.204 List of initial character-level 4-grams from interjection lemmas in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-word_parts-interjections-lemmas-initial-4grams-taxonomy-entire.tsv
Columns: Lemma; Lemma (lower-case); Initial part of the word; Rest of the word; Total absolute frequency of lemma; Percentage of all found lemmas; Total relative frequency (per million occurrences); Absolute frequency [SSJ.T.K.N]; Percentage [SSJ.T.K.N]; Relative frequency [SSJ.T.K.N]; Absolute frequency [SSJ.T.K.L]; Percentage [SSJ.T.K.L]; Relative frequency [SSJ.T.K.L]; Absolute frequency [SSJ.T.P.C]; Percentage [SSJ.T.P.C]; Relative frequency [SSJ.T.P.C]; Absolute frequency [SSJ.T.P.R]; Percentage [SSJ.T.P.R]; Relative frequency [SSJ.T.P.R]; Absolute frequency [SSJ.T.K.S]; Percentage [SSJ.T.K.S]; Relative frequency [SSJ.T.K.S]; Absolute frequency [SSJ.T.D]; Percentage [SSJ.T.D]; Relative frequency [SSJ.T.D]; Absolute frequency [SSJ.I]; Percentage [SSJ.I]; Relative frequency [SSJ.I]
bravo bravo brav o 3,087 11.12 % 2.72 0 0 % 0 254 3.77 % 6.40 1,220 12.87 % 2.25 848 11.88 % 4.52 37 5.55 % 0.86 4 5.20 % 1.01 724 19.79 % 2.28
zbogom zbogom zbog om 2,474 8.91 % 2.18 0 0 % 0 360 5.34 % 9.06 1,062 11.20 % 1.96 532 7.46 % 2.84 90 13.49 % 2.10 3 3.90 % 0.76 427 11.67 % 1.34
adijo adijo adij o 2,414 8.70 % 2.13 0 0 % 0 416 6.17 % 10.47 961 10.14 % 1.77 608 8.52 % 3.24 60 9.00 % 1.40 1 1.30 % 0.25 368 10.06 % 1.16
haha haha haha 1,798 6.48 % 1.58 0 0 % 0 86 1.27 % 2.17 448 4.73 % 0.83 915 12.82 % 4.88 5 0.75 % 0.12 0 0 % 0 344 9.40 % 1.08
hvalabogu hvalabogu hval abogu 1,286 4.63 % 1.13 0 0 % 0 146 2.17 % 3.68 671 7.08 % 1.24 300 4.21 % 1.60 17 2.55 % 0.40 4 5.20 % 1.01 148 4.05 % 0.47
jebiga jebiga jebi ga 1,175 4.23 % 1.04 0 0 % 0 511 7.58 % 12.87 321 3.39 % 0.59 272 3.81 % 1.45 5 0.75 % 0.12 3 3.90 % 0.76 63 1.72 % 0.20
nasvidenje nasvidenje nasv idenje 1,162 4.19 % 1.02 0 0 % 0 130 1.93 % 3.27 620 6.54 % 1.14 181 2.54 % 0.97 28 4.20 % 0.65 35 45.45 % 8.85 168 4.59 % 0.53
hura hura hura 1,087 3.92 % 0.96 0 0 % 0 121 1.79 % 3.05 531 5.60 % 0.98 289 4.05 % 1.54 38 5.70 % 0.89 1 1.30 % 0.25 107 2.92 % 0.34
hopla hopla hopl a 946 3.41 % 0.83 0 0 % 0 42 0.62 % 1.06 338 3.56 % 0.62 535 7.50 % 2.85 5 0.75 % 0.12 3 3.90 % 0.76 23 0.63 % 0.07
ojoj ojoj ojoj 890 3.21 % 0.78 0 0 % 0 282 4.18 % 7.10 202 2.13 % 0.37 241 3.38 % 1.29 15 2.25 % 0.35 4 5.20 % 1.01 146 3.99 % 0.46
zaboga zaboga zabo ga 831 2.99 % 0.73 0 0 % 0 455 6.75 % 11.46 168 1.77 % 0.31 131 1.84 % 0.70 35 5.25 % 0.82 0 0 % 0 42 1.15 % 0.13
ejga ejga ejga 756 2.72 % 0.67 0 0 % 0 45 0.67 % 1.13 541 5.71 % 1 61 0.85 % 0.33 0 0 % 0 0 0 % 0 109 2.98 % 0.34
jebenti jebenti jebe nti 753 2.71 % 0.66 0 0 % 0 576 8.54 % 14.50 64 0.68 % 0.12 87 1.22 % 0.46 7 1.05 % 0.16 0 0 % 0 19 0.52 % 0.06
jebemti jebemti jebe mti 643 2.32 % 0.57 0 0 % 0 498 7.38 % 12.54 52 0.55 % 0.10 54 0.76 % 0.29 1 0.15 % 0.02 0 0 % 0 38 1.04 % 0.12
ojej ojej ojej 603 2.17 % 0.53 0 0 % 0 291 4.32 % 7.33 107 1.13 % 0.20 129 1.81 % 0.69 19 2.85 % 0.44 0 0 % 0 57 1.56 % 0.18 juhej juhej juhe j 325 1.17 % 0.29 0 0 % 0 17 0.25 % 0.43 127 1.34 % 0.23 126 1.77 % 0.67 6 0.90 % 0.14 0 0 % 0 49 1.34 % 0.15 živijo živijo živi jo 255 0.92 % 0.22 0 0 % 0 134 1.99 % 3.37 43 0.45 % 0.08 47 0.66 % 0.25 12 1.80 % 0.28 1 1.30 % 0.25 18 0.49 % 0.06 juhuhu juhuhu juhu hu 246 0.89 % 0.22 0 0 % 0 33 0.49 % 0.83 103 1.09 % 0.19 79 1.11 % 0.42 5 0.75 % 0.12 3 3.90 % 0.76 23 0.63 % 0.07 živio živio živi o 207 0.75 % 0.18 0 0 % 0 76 1.13 % 1.91 81 0.85 % 0.15 7 0.10 % 0.04 18 2.70 % 0.42 2 2.60 % 0.51 23 0.63 % 0.07 hojla hojla hojl a 194 0.70 % 0.17 0 0 % 0 39 0.58 % 0.98 107 1.13 % 0.20 46 0.65 % 0.25 1 0.15 % 0.02 0 0 % 0 1 0.03 % 0 živjo živjo živj o 191 0.69 % 0.17 0 0 % 0 126 1.87 % 3.17 20 0.21 % 0.04 28 0.39 % 0.15 2 0.30 % 0.05 0 0 % 0 15 0.41 % 0.05 hahaha hahaha haha ha 190 0.68 % 0.17 0 0 % 0 30 0.45 % 0.76 34 0.36 % 0.06 67 0.94 % 0.36 2 0.30 % 0.05 0 0 % 0 57 1.56 % 0.18 žalibog žalibog žali bog 186 0.67 % 0.16 0 0 % 0 13 0.19 % 0.33 78 0.82 % 0.14 35 0.49 % 0.19 13 1.95 % 0.30 1 1.30 % 0.25 46 1.26 % 0.14 ježeš ježeš ježe š 155 0.56 % 0.14 0 0 % 0 41 0.61 % 1.03 34 0.36 % 0.06 75 1.05 % 0.40 0 0 % 0 0 0 % 0 5 0.14 % 0.02 aaaa aaaa aaaa 148 0.53 % 0.13 0 0 % 0 49 0.73 % 1.23 22 0.23 % 0.04 58 0.81 % 0.31 0 0 % 0 0 0 % 0 19 0.52 % 0.06 tralala tralala tral ala 127 0.46 % 0.11 0 0 % 0 17 0.25 % 0.43 57 0.60 % 0.11 35 0.49 % 0.19 1 0.15 % 0.02 1 1.30 % 0.25 16 0.44 % 0.05 heja heja heja 98 0.35 % 0.09 0 0 % 0 7 0.10 % 0.18 30 0.32 % 0.06 22 0.31 % 0.12 3 0.45 % 0.07 0 0 % 0 36 0.98 % 0.11 bumf bumf bumf 80 0.29 % 0.07 0 0 % 0 33 0.49 % 0.83 23 0.24 % 0.04 18 0.25 % 0.10 3 0.45 % 0.07 0 0 % 0 3 0.08 % 0.01 hrsk hrsk hrsk 73 0.26 % 0.06 0 0 % 0 33 0.49 % 0.83 21 0.22 % 0.04 13 0.18 % 0.07 1 0.15 % 0.02 0 0 % 0 5 0.14 % 0.02 jojmene jojmene jojm ene 72 0.26 % 0.06 0 0 % 0 34 0.50 % 0.86 13 0.14 % 0.02 17 0.24 % 0.09 5 0.75 % 
0.12 0 0 % 0 3 0.08 % 0.01
hozana hozana hoza na 66 0.24 % 0.06 0 0 % 0 21 0.31 % 0.53 24 0.25 % 0.04 10 0.14 % 0.05 3 0.45 % 0.07 1 1.30 % 0.25 7 0.19 % 0.02
aaah aaah aaah 63 0.23 % 0.06 0 0 % 0 17 0.25 % 0.43 6 0.06 % 0.01 32 0.45 % 0.17 2 0.30 % 0.05 0 0 % 0 6 0.16 % 0.02

1.2.205 List of initial character-level 5-grams from interjection lemmas in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-word_parts-interjections-lemmas-initial-5grams-taxonomy-entire.tsv
Columns: Lemma; Lemma (lower-case); Initial part of the word; Rest of the word; Total absolute frequency of lemma; Percentage of all found lemmas; Total relative frequency (per million occurrences); Absolute frequency [SSJ.T.K.N]; Percentage [SSJ.T.K.N]; Relative frequency [SSJ.T.K.N]; Absolute frequency [SSJ.T.K.L]; Percentage [SSJ.T.K.L]; Relative frequency [SSJ.T.K.L]; Absolute frequency [SSJ.T.P.C]; Percentage [SSJ.T.P.C]; Relative frequency [SSJ.T.P.C]; Absolute frequency [SSJ.T.P.R]; Percentage [SSJ.T.P.R]; Relative frequency [SSJ.T.P.R]; Absolute frequency [SSJ.T.K.S]; Percentage [SSJ.T.K.S]; Relative frequency [SSJ.T.K.S]; Absolute frequency [SSJ.T.D]; Percentage [SSJ.T.D]; Relative frequency [SSJ.T.D]; Absolute frequency [SSJ.I]; Percentage [SSJ.I]; Relative frequency [SSJ.I]
bravo bravo bravo 3,087 14.65 % 2.72 0 0 % 0 254 4.73 % 6.40 1,220 16.73 % 2.25 848 16.67 % 4.52 37 6.88 % 0.86 4 5.97 % 1.01 724 26.55 % 2.28
zbogom zbogom zbogo m 2,474 11.74 % 2.18 0 0 % 0 360 6.71 % 9.06 1,062 14.57 % 1.96 532 10.46 % 2.84 90 16.73 % 2.10 3 4.48 % 0.76 427 15.66 % 1.34
adijo adijo adijo 2,414 11.45 % 2.13 0 0 % 0 416 7.75 % 10.47 961 13.18 % 1.77 608 11.95 % 3.24 60 11.15 % 1.40 1 1.49 % 0.25 368 13.49 % 1.16
hvalabogu hvalabogu hvala bogu 1,286 6.10 % 1.13 0 0 % 0 146 2.72 % 3.68 671 9.20 % 1.24 300 5.90 % 1.60 17 3.16 % 0.40 4 5.97 % 1.01 148 5.43 % 0.47
jebiga jebiga jebig a 1,175 5.58 % 1.04 0 0 % 0 511 9.52 % 12.87 321 4.40 %
0.59 272 5.35 % 1.45 5 0.93 % 0.12 3 4.48 % 0.76 63 2.31 % 0.20 nasvidenje nasvidenje nasvi denje 1,162 5.51 % 1.02 0 0 % 0 130 2.42 % 3.27 620 8.51 % 1.14 181 3.56 % 0.97 28 5.20 % 0.65 35 52.24 % 8.85 168 6.16 % 0.53 hopla hopla hopla 946 4.49 % 0.83 0 0 % 0 42 0.78 % 1.06 338 4.64 % 0.62 535 10.52 % 2.85 5 0.93 % 0.12 3 4.48 % 0.76 23 0.84 % 0.07 zaboga zaboga zabog a 831 3.94 % 0.73 0 0 % 0 455 8.48 % 11.46 168 2.31 % 0.31 131 2.58 % 0.70 35 6.51 % 0.82 0 0 % 0 42 1.54 % 0.13 jebenti jebenti jeben ti 753 3.57 % 0.66 0 0 % 0 576 10.73 % 14.50 64 0.88 % 0.12 87 1.71 % 0.46 7 1.30 % 0.16 0 0 % 0 19 0.70 % 0.06 jebemti jebemti jebem ti 643 3.05 % 0.57 0 0 % 0 498 9.28 % 12.54 52 0.71 % 0.10 54 1.06 % 0.29 1 0.19 % 0.02 0 0 % 0 38 1.39 % 0.12 juhej juhej juhej 325 1.54 % 0.29 0 0 % 0 17 0.32 % 0.43 127 1.74 % 0.23 126 2.48 % 0.67 6 1.11 % 0.14 0 0 % 0 49 1.80 % 0.15 živijo živijo živij o 255 1.21 % 0.22 0 0 % 0 134 2.50 % 3.37 43 0.59 % 0.08 47 0.92 % 0.25 12 2.23 % 0.28 1 1.49 % 0.25 18 0.66 % 0.06 juhuhu juhuhu juhuh u 246 1.17 % 0.22 0 0 % 0 33 0.61 % 0.83 103 1.41 % 0.19 79 1.55 % 0.42 5 0.93 % 0.12 3 4.48 % 0.76 23 0.84 % 0.07 živio živio živio 207 0.98 % 0.18 0 0 % 0 76 1.42 % 1.91 81 1.11 % 0.15 7 0.14 % 0.04 18 3.35 % 0.42 2 2.98 % 0.51 23 0.84 % 0.07 hojla hojla hojla 194 0.92 % 0.17 0 0 % 0 39 0.73 % 0.98 107 1.47 % 0.20 46 0.90 % 0.25 1 0.19 % 0.02 0 0 % 0 1 0.04 % 0 živjo živjo živjo 191 0.91 % 0.17 0 0 % 0 126 2.35 % 3.17 20 0.27 % 0.04 28 0.55 % 0.15 2 0.37 % 0.05 0 0 % 0 15 0.55 % 0.05 hahaha hahaha hahah a 190 0.90 % 0.17 0 0 % 0 30 0.56 % 0.76 34 0.47 % 0.06 67 1.32 % 0.36 2 0.37 % 0.05 0 0 % 0 57 2.09 % 0.18 žalibog žalibog žalib og 186 0.88 % 0.16 0 0 % 0 13 0.24 % 0.33 78 1.07 % 0.14 35 0.69 % 0.19 13 2.42 % 0.30 1 1.49 % 0.25 46 1.69 % 0.14 ježeš ježeš ježeš 155 0.73 % 0.14 0 0 % 0 41 0.76 % 1.03 34 0.47 % 0.06 75 1.47 % 0.40 0 0 % 0 0 0 % 0 5 0.18 % 0.02 tralala tralala trala la 127 0.60 % 0.11 0 0 % 0 17 0.32 % 0.43 57 0.78 % 0.11 35 0.69 % 
0.19 1 0.19 % 0.02 1 1.49 % 0.25 16 0.59 % 0.05
jojmene jojmene jojme ne 72 0.34 % 0.06 0 0 % 0 34 0.63 % 0.86 13 0.18 % 0.02 17 0.33 % 0.09 5 0.93 % 0.12 0 0 % 0 3 0.11 % 0.01
hozana hozana hozan a 66 0.31 % 0.06 0 0 % 0 21 0.39 % 0.53 24 0.33 % 0.04 10 0.20 % 0.05 3 0.56 % 0.07 1 1.49 % 0.25 7 0.26 % 0.02
mijav mijav mijav 56 0.27 % 0.05 0 0 % 0 11 0.20 % 0.28 17 0.23 % 0.03 19 0.37 % 0.10 5 0.93 % 0.12 0 0 % 0 4 0.15 % 0.01
joooj joooj joooj 53 0.25 % 0.05 0 0 % 0 1 0.02 % 0.03 24 0.33 % 0.04 22 0.43 % 0.12 0 0 % 0 0 0 % 0 6 0.22 % 0.02
Nasvidenje nasvidenje Nasvi denje 52 0.25 % 0.05 0 0 % 0 0 0 % 0 29 0.40 % 0.05 7 0.14 % 0.04 4 0.74 % 0.09 0 0 % 0 12 0.44 % 0.04
zdravo zdravo zdrav o 52 0.25 % 0.05 0 0 % 0 12 0.22 % 0.30 24 0.33 % 0.04 9 0.18 % 0.05 3 0.56 % 0.07 0 0 % 0 4 0.15 % 0.01
jojme jojme jojme 49 0.23 % 0.04 0 0 % 0 16 0.30 % 0.40 10 0.14 % 0.02 15 0.29 % 0.08 6 1.11 % 0.14 1 1.49 % 0.25 1 0.04 % 0
aaaaa aaaaa aaaaa 45 0.21 % 0.04 0 0 % 0 12 0.22 % 0.30 8 0.11 % 0.01 17 0.33 % 0.09 2 0.37 % 0.05 0 0 % 0 6 0.22 % 0.02
mejduš mejduš mejdu š 42 0.20 % 0.04 0 0 % 0 14 0.26 % 0.35 10 0.14 % 0.02 10 0.20 % 0.05 2 0.37 % 0.05 0 0 % 0 6 0.22 % 0.02
ježešna ježešna ježeš na 32 0.15 % 0.03 0 0 % 0 13 0.24 % 0.33 13 0.18 % 0.02 4 0.08 % 0.02 1 0.19 % 0.02 0 0 % 0 1 0.04 % 0
jaaaa jaaaa jaaaa 29 0.14 % 0.03 0 0 % 0 12 0.22 % 0.30 5 0.07 % 0.01 5 0.10 % 0.03 0 0 % 0 0 0 % 0 7 0.26 % 0.02
aaaah aaaah aaaah 28 0.13 % 0.02 0 0 % 0 8 0.15 % 0.20 4 0.06 % 0.01 14 0.28 % 0.07 0 0 % 0 0 0 % 0 2 0.07 % 0.01

1.2.206 List of final character-level 1-grams from interjection lemmas in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-word_parts-interjections-lemmas-final-1grams-taxonomy-entire.tsv
Columns: Lemma; Lemma (lower-case); Rest of the word; Final part of the word; Total absolute frequency of lemma; Percentage of all found lemmas; Total relative frequency (per million occurrences); Absolute frequency [SSJ.T.K.N]; Percentage [SSJ.T.K.N]; Relative frequency [SSJ.T.K.N]; Absolute frequency [SSJ.T.K.L]; Percentage [SSJ.T.K.L]; Relative frequency [SSJ.T.K.L]; Absolute frequency [SSJ.T.P.C]; Percentage [SSJ.T.P.C]; Relative frequency [SSJ.T.P.C]; Absolute frequency [SSJ.T.P.R]; Percentage [SSJ.T.P.R]; Relative frequency [SSJ.T.P.R]; Absolute frequency [SSJ.T.K.S]; Percentage [SSJ.T.K.S]; Relative frequency [SSJ.T.K.S]; Absolute frequency [SSJ.T.D]; Percentage [SSJ.T.D]; Relative frequency [SSJ.T.D]; Absolute frequency [SSJ.I]; Percentage [SSJ.I]; Relative frequency [SSJ.I]
oh oh o h 14,865 10.91 % 13.10 0 0 % 0 5,344 19.95 % 134.56 3,245 7.48 % 5.98 3,990 9.80 % 21.29 760 18.32 % 17.70 25 7.14 % 6.32 1,501 7.18 % 4.72
ah ah a h 13,301 9.76 % 11.72 0 0 % 0 2,696 10.06 % 67.88 4,963 11.44 % 9.14 3,766 9.25 % 20.09 445 10.73 % 10.36 30 8.57 % 7.58 1,401 6.71 % 4.41
ha ha h a 11,172 8.20 % 9.85 0 0 % 0 1,638 6.11 % 41.24 4,524 10.43 % 8.34 3,900 9.58 % 20.81 320 7.71 % 7.45 33 9.43 % 8.34 757 3.62 % 2.38
hm hm h m 8,313 6.10 % 7.33 0 0 % 0 1,142 4.26 % 28.75 3,096 7.13 % 5.70 3,089 7.59 % 16.48 169 4.07 % 3.94 12 3.43 % 3.03 805 3.85 % 2.53
hej hej he j 7,497 5.50 % 6.61 0 0 % 0 1,515 5.66 % 38.15 1,767 4.07 % 3.26 3,368 8.27 % 17.97 210 5.06 % 4.89 4 1.14 % 1.01 633 3.03 % 1.99
joj joj jo j 6,441 4.73 % 5.68 0 0 % 0 1,074 4.01 % 27.04 2,157 4.97 % 3.97 2,162 5.31 % 11.54 138 3.33 % 3.21 33 9.43 % 8.34 877 4.20 % 2.76
aha aha ah a 6,335 4.65 % 5.58 0 0 % 0 1,184 4.42 % 29.81 1,301 3.00 % 2.40 1,538 3.78 % 8.21 160 3.86 % 3.73 25 7.14 % 6.32 2,127 10.18 % 6.69
fak fak fa k 4,669 3.43 % 4.11 0 0 % 0 160 0.60 % 4.03 190 0.44 % 0.35 98 0.24 % 0.52 2 0.05 % 0.05 0 0 % 0 4,219 20.20 % 13.27
ej ej e j 3,721 2.73 % 3.28 0 0 % 0 1,232 4.60 % 31.02 1,049 2.42 % 1.93 830 2.04 % 4.43 241 5.81 % 5.61 7 2.00 % 1.77 362 1.73 % 1.14
hi hi h i 3,425 2.51 % 3.02 0 0 % 0 169 0.63 % 4.26 1,246 2.87 % 2.30 1,521 3.74 % 8.12 111 2.68 % 2.59 22 6.29 % 5.56 356 1.70 % 1.12
bravo
bravo brav o 3,087 2.27 % 2.72 0 0 % 0 254 0.95 % 6.40 1,220 2.81 % 2.25 848 2.08 % 4.52 37 0.89 % 0.86 4 1.14 % 1.01 724 3.47 % 2.28 uf uf u f 2,808 2.06 % 2.47 0 0 % 0 254 0.95 % 6.40 834 1.92 % 1.54 1,109 2.72 % 5.92 25 0.60 % 0.58 0 0 % 0 586 2.81 % 1.84 zbogom zbogom zbogo m 2,474 1.81 % 2.18 0 0 % 0 360 1.34 % 9.06 1,062 2.45 % 1.96 532 1.31 % 2.84 90 2.17 % 2.10 3 0.86 % 0.76 427 2.04 % 1.34 uh uh u h 2,435 1.79 % 2.15 0 0 % 0 336 1.25 % 8.46 830 1.91 % 1.53 983 2.41 % 5.24 48 1.16 % 1.12 2 0.57 % 0.51 236 1.13 % 0.74 adijo adijo adij o 2,414 1.77 % 2.13 0 0 % 0 416 1.55 % 10.47 961 2.21 % 1.77 608 1.49 % 3.24 60 1.45 % 1.40 1 0.29 % 0.25 368 1.76 % 1.16 eh eh e h 2,409 1.77 % 2.12 0 0 % 0 399 1.49 % 10.05 655 1.51 % 1.21 1,114 2.74 % 5.94 47 1.13 % 1.09 1 0.29 % 0.25 193 0.92 % 0.61 ho ho h o 2,160 1.58 % 1.90 0 0 % 0 220 0.82 % 5.54 784 1.81 % 1.44 504 1.24 % 2.69 133 3.21 % 3.10 18 5.14 % 4.55 501 2.40 % 1.58 bla bla bl a 1,985 1.46 % 1.75 0 0 % 0 325 1.21 % 8.18 824 1.90 % 1.52 512 1.26 % 2.73 54 1.30 % 1.26 11 3.14 % 2.78 259 1.24 % 0.81 la la l a 1,888 1.39 % 1.66 0 0 % 0 90 0.34 % 2.27 919 2.12 % 1.69 450 1.10 % 2.40 72 1.74 % 1.68 3 0.86 % 0.76 354 1.70 % 1.11 haha haha hah a 1,798 1.32 % 1.58 0 0 % 0 86 0.32 % 2.17 448 1.03 % 0.83 915 2.25 % 4.88 5 0.12 % 0.12 0 0 % 0 344 1.65 % 1.08 oj oj o j 1,484 1.09 % 1.31 0 0 % 0 218 0.81 % 5.49 651 1.50 % 1.20 318 0.78 % 1.70 149 3.59 % 3.47 9 2.57 % 2.28 139 0.67 % 0.44 huh huh hu h 1,382 1.01 % 1.22 0 0 % 0 23 0.09 % 0.58 97 0.22 % 0.18 1,180 2.90 % 6.30 10 0.24 % 0.23 0 0 % 0 72 0.34 % 0.23 alo alo al o 1,291 0.95 % 1.14 0 0 % 0 100 0.37 % 2.52 599 1.38 % 1.10 221 0.54 % 1.18 25 0.60 % 0.58 0 0 % 0 346 1.66 % 1.09 hvalabogu hvalabogu hvalabog u 1,286 0.94 % 1.13 0 0 % 0 146 0.55 % 3.68 671 1.55 % 1.24 300 0.74 % 1.60 17 0.41 % 0.40 4 1.14 % 1.01 148 0.71 % 0.47 jebiga jebiga jebig a 1,175 0.86 % 1.04 0 0 % 0 511 1.91 % 12.87 321 0.74 % 0.59 272 0.67 % 1.45 5 0.12 % 0.12 3 0.86 % 0.76 63 0.30 % 0.20 
nasvidenje nasvidenje nasvidenj e 1,162 0.85 % 1.02 0 0 % 0 130 0.48 % 3.27 620 1.43 % 1.14 181 0.45 % 0.97 28 0.68 % 0.65 35 10.00 % 8.85 168 0.80 % 0.53
oho oho oh o 1,160 0.85 % 1.02 0 0 % 0 135 0.50 % 3.40 421 0.97 % 0.78 217 0.53 % 1.16 27 0.65 % 0.63 1 0.29 % 0.25 359 1.72 % 1.13
hura hura hur a 1,087 0.80 % 0.96 0 0 % 0 121 0.45 % 3.05 531 1.22 % 0.98 289 0.71 % 1.54 38 0.92 % 0.89 1 0.29 % 0.25 107 0.51 % 0.34
jah jah ja h 1,031 0.76 % 0.91 0 0 % 0 22 0.08 % 0.55 650 1.50 % 1.20 235 0.58 % 1.25 8 0.19 % 0.19 1 0.29 % 0.25 115 0.55 % 0.36
ups ups up s 972 0.71 % 0.86 0 0 % 0 74 0.28 % 1.86 302 0.70 % 0.56 468 1.15 % 2.50 8 0.19 % 0.19 5 1.43 % 1.26 115 0.55 % 0.36
hopla hopla hopl a 946 0.69 % 0.83 0 0 % 0 42 0.16 % 1.06 338 0.78 % 0.62 535 1.31 % 2.85 5 0.12 % 0.12 3 0.86 % 0.76 23 0.11 % 0.07
av av a v 944 0.69 % 0.83 0 0 % 0 78 0.29 % 1.96 588 1.35 % 1.08 200 0.49 % 1.07 28 0.68 % 0.65 10 2.86 % 2.53 40 0.19 % 0.13

1.2.207 List of final character-level 2-grams from interjection lemmas in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-word_parts-interjections-lemmas-final-2grams-taxonomy-entire.tsv
Columns: Lemma; Lemma (lower-case); Rest of the word; Final part of the word; Total absolute frequency of lemma; Percentage of all found lemmas; Total relative frequency (per million occurrences); Absolute frequency [SSJ.T.K.N]; Percentage [SSJ.T.K.N]; Relative frequency [SSJ.T.K.N]; Absolute frequency [SSJ.T.K.L]; Percentage [SSJ.T.K.L]; Relative frequency [SSJ.T.K.L]; Absolute frequency [SSJ.T.P.C]; Percentage [SSJ.T.P.C]; Relative frequency [SSJ.T.P.C]; Absolute frequency [SSJ.T.P.R]; Percentage [SSJ.T.P.R]; Relative frequency [SSJ.T.P.R]; Absolute frequency [SSJ.T.K.S]; Percentage [SSJ.T.K.S]; Relative frequency [SSJ.T.K.S]; Absolute frequency [SSJ.T.D]; Percentage [SSJ.T.D]; Relative frequency [SSJ.T.D]; Absolute frequency [SSJ.I]; Percentage [SSJ.I]; Relative frequency [SSJ.I]
oh
oh oh 14,865 10.91 % 13.10 0 0 % 0 5,344 19.95 % 134.56 3,245 7.48 % 5.98 3,990 9.80 % 21.29 760 18.32 % 17.70 25 7.14 % 6.32 1,501 7.18 % 4.72 ah ah ah 13,301 9.76 % 11.72 0 0 % 0 2,696 10.06 % 67.88 4,963 11.44 % 9.14 3,766 9.25 % 20.09 445 10.73 % 10.36 30 8.57 % 7.58 1,401 6.71 % 4.41 ha ha ha 11,172 8.20 % 9.85 0 0 % 0 1,638 6.11 % 41.24 4,524 10.43 % 8.34 3,900 9.58 % 20.81 320 7.71 % 7.45 33 9.43 % 8.34 757 3.62 % 2.38 hm hm hm 8,313 6.10 % 7.33 0 0 % 0 1,142 4.26 % 28.75 3,096 7.13 % 5.70 3,089 7.59 % 16.48 169 4.07 % 3.94 12 3.43 % 3.03 805 3.85 % 2.53 hej hej h ej 7,497 5.50 % 6.61 0 0 % 0 1,515 5.66 % 38.15 1,767 4.07 % 3.26 3,368 8.27 % 17.97 210 5.06 % 4.89 4 1.14 % 1.01 633 3.03 % 1.99 joj joj j oj 6,441 4.73 % 5.68 0 0 % 0 1,074 4.01 % 27.04 2,157 4.97 % 3.97 2,162 5.31 % 11.54 138 3.33 % 3.21 33 9.43 % 8.34 877 4.20 % 2.76 aha aha a ha 6,335 4.65 % 5.58 0 0 % 0 1,184 4.42 % 29.81 1,301 3.00 % 2.40 1,538 3.78 % 8.21 160 3.86 % 3.73 25 7.14 % 6.32 2,127 10.18 % 6.69 fak fak f ak 4,669 3.43 % 4.11 0 0 % 0 160 0.60 % 4.03 190 0.44 % 0.35 98 0.24 % 0.52 2 0.05 % 0.05 0 0 % 0 4,219 20.20 % 13.27 ej ej ej 3,721 2.73 % 3.28 0 0 % 0 1,232 4.60 % 31.02 1,049 2.42 % 1.93 830 2.04 % 4.43 241 5.81 % 5.61 7 2.00 % 1.77 362 1.73 % 1.14 hi hi hi 3,425 2.51 % 3.02 0 0 % 0 169 0.63 % 4.26 1,246 2.87 % 2.30 1,521 3.74 % 8.12 111 2.68 % 2.59 22 6.29 % 5.56 356 1.70 % 1.12 bravo bravo bra vo 3,087 2.27 % 2.72 0 0 % 0 254 0.95 % 6.40 1,220 2.81 % 2.25 848 2.08 % 4.52 37 0.89 % 0.86 4 1.14 % 1.01 724 3.47 % 2.28 uf uf uf 2,808 2.06 % 2.47 0 0 % 0 254 0.95 % 6.40 834 1.92 % 1.54 1,109 2.72 % 5.92 25 0.60 % 0.58 0 0 % 0 586 2.81 % 1.84 zbogom zbogom zbog om 2,474 1.81 % 2.18 0 0 % 0 360 1.34 % 9.06 1,062 2.45 % 1.96 532 1.31 % 2.84 90 2.17 % 2.10 3 0.86 % 0.76 427 2.04 % 1.34 uh uh uh 2,435 1.79 % 2.15 0 0 % 0 336 1.25 % 8.46 830 1.91 % 1.53 983 2.41 % 5.24 48 1.16 % 1.12 2 0.57 % 0.51 236 1.13 % 0.74 adijo adijo adi jo 2,414 1.77 % 2.13 0 0 % 0 416 1.55 % 10.47 961 2.21 % 
1.77 608 1.49 % 3.24 60 1.45 % 1.40 1 0.29 % 0.25 368 1.76 % 1.16 eh eh eh 2,409 1.77 % 2.12 0 0 % 0 399 1.49 % 10.05 655 1.51 % 1.21 1,114 2.74 % 5.94 47 1.13 % 1.09 1 0.29 % 0.25 193 0.92 % 0.61 ho ho ho 2,160 1.58 % 1.90 0 0 % 0 220 0.82 % 5.54 784 1.81 % 1.44 504 1.24 % 2.69 133 3.21 % 3.10 18 5.14 % 4.55 501 2.40 % 1.58 bla bla b la 1,985 1.46 % 1.75 0 0 % 0 325 1.21 % 8.18 824 1.90 % 1.52 512 1.26 % 2.73 54 1.30 % 1.26 11 3.14 % 2.78 259 1.24 % 0.81 la la la 1,888 1.39 % 1.66 0 0 % 0 90 0.34 % 2.27 919 2.12 % 1.69 450 1.10 % 2.40 72 1.74 % 1.68 3 0.86 % 0.76 354 1.70 % 1.11 haha haha ha ha 1,798 1.32 % 1.58 0 0 % 0 86 0.32 % 2.17 448 1.03 % 0.83 915 2.25 % 4.88 5 0.12 % 0.12 0 0 % 0 344 1.65 % 1.08 oj oj oj 1,484 1.09 % 1.31 0 0 % 0 218 0.81 % 5.49 651 1.50 % 1.20 318 0.78 % 1.70 149 3.59 % 3.47 9 2.57 % 2.28 139 0.67 % 0.44 huh huh h uh 1,382 1.01 % 1.22 0 0 % 0 23 0.09 % 0.58 97 0.22 % 0.18 1,180 2.90 % 6.30 10 0.24 % 0.23 0 0 % 0 72 0.34 % 0.23 alo alo a lo 1,291 0.95 % 1.14 0 0 % 0 100 0.37 % 2.52 599 1.38 % 1.10 221 0.54 % 1.18 25 0.60 % 0.58 0 0 % 0 346 1.66 % 1.09 hvalabogu hvalabogu hvalabo gu 1,286 0.94 % 1.13 0 0 % 0 146 0.55 % 3.68 671 1.55 % 1.24 300 0.74 % 1.60 17 0.41 % 0.40 4 1.14 % 1.01 148 0.71 % 0.47 jebiga jebiga jebi ga 1,175 0.86 % 1.04 0 0 % 0 511 1.91 % 12.87 321 0.74 % 0.59 272 0.67 % 1.45 5 0.12 % 0.12 3 0.86 % 0.76 63 0.30 % 0.20 nasvidenje nasvidenje nasviden je 1,162 0.85 % 1.02 0 0 % 0 130 0.48 % 3.27 620 1.43 % 1.14 181 0.45 % 0.97 28 0.68 % 0.65 35 10.00 % 8.85 168 0.80 % 0.53 oho oho o ho 1,160 0.85 % 1.02 0 0 % 0 135 0.50 % 3.40 421 0.97 % 0.78 217 0.53 % 1.16 27 0.65 % 0.63 1 0.29 % 0.25 359 1.72 % 1.13 hura hura hu ra 1,087 0.80 % 0.96 0 0 % 0 121 0.45 % 3.05 531 1.22 % 0.98 289 0.71 % 1.54 38 0.92 % 0.89 1 0.29 % 0.25 107 0.51 % 0.34 jah jah j ah 1,031 0.76 % 0.91 0 0 % 0 22 0.08 % 0.55 650 1.50 % 1.20 235 0.58 % 1.25 8 0.19 % 0.19 1 0.29 % 0.25 115 0.55 % 0.36 ups ups u ps 972 0.71 % 0.86 0 0 % 0 74 0.28 % 1.86 302 0.70 % 
0.56 468 1.15 % 2.50 8 0.19 % 0.19 5 1.43 % 1.26 115 0.55 % 0.36
hopla hopla hop la 946 0.69 % 0.83 0 0 % 0 42 0.16 % 1.06 338 0.78 % 0.62 535 1.31 % 2.85 5 0.12 % 0.12 3 0.86 % 0.76 23 0.11 % 0.07
av av av 944 0.69 % 0.83 0 0 % 0 78 0.29 % 1.96 588 1.35 % 1.08 200 0.49 % 1.07 28 0.68 % 0.65 10 2.86 % 2.53 40 0.19 % 0.13

1.2.208 List of final character-level 3-grams from interjection lemmas in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-word_parts-interjections-lemmas-final-3grams-taxonomy-entire.tsv
Columns: Lemma; Lemma (lower-case); Rest of the word; Final part of the word; Total absolute frequency of lemma; Percentage of all found lemmas; Total relative frequency (per million occurrences); Absolute frequency [SSJ.T.K.N]; Percentage [SSJ.T.K.N]; Relative frequency [SSJ.T.K.N]; Absolute frequency [SSJ.T.K.L]; Percentage [SSJ.T.K.L]; Relative frequency [SSJ.T.K.L]; Absolute frequency [SSJ.T.P.C]; Percentage [SSJ.T.P.C]; Relative frequency [SSJ.T.P.C]; Absolute frequency [SSJ.T.P.R]; Percentage [SSJ.T.P.R]; Relative frequency [SSJ.T.P.R]; Absolute frequency [SSJ.T.K.S]; Percentage [SSJ.T.K.S]; Relative frequency [SSJ.T.K.S]; Absolute frequency [SSJ.T.D]; Percentage [SSJ.T.D]; Relative frequency [SSJ.T.D]; Absolute frequency [SSJ.I]; Percentage [SSJ.I]; Relative frequency [SSJ.I]
hej hej hej 7,497 11.28 % 6.61 0 0 % 0 1,515 11.94 % 38.15 1,767 8.96 % 3.26 3,368 17.95 % 17.97 210 13.38 % 4.89 4 2.31 % 1.01 633 4.66 % 1.99
joj joj joj 6,441 9.69 % 5.68 0 0 % 0 1,074 8.47 % 27.04 2,157 10.94 % 3.97 2,162 11.52 % 11.54 138 8.79 % 3.21 33 19.07 % 8.34 877 6.45 % 2.76
aha aha aha 6,335 9.53 % 5.58 0 0 % 0 1,184 9.34 % 29.81 1,301 6.60 % 2.40 1,538 8.20 % 8.21 160 10.20 % 3.73 25 14.45 % 6.32 2,127 15.65 % 6.69
fak fak fak 4,669 7.02 % 4.11 0 0 % 0 160 1.26 % 4.03 190 0.96 % 0.35 98 0.52 % 0.52 2 0.13 % 0.05 0 0 % 0 4,219 31.04 % 13.27
bravo bravo br avo 3,087 4.64 % 2.72 0 0 % 0 254 2.00 % 6.40
1,220 6.19 % 2.25 848 4.52 % 4.52 37 2.36 % 0.86 4 2.31 % 1.01 724 5.33 % 2.28 zbogom zbogom zbo gom 2,474 3.72 % 2.18 0 0 % 0 360 2.84 % 9.06 1,062 5.39 % 1.96 532 2.84 % 2.84 90 5.74 % 2.10 3 1.73 % 0.76 427 3.14 % 1.34 adijo adijo ad ijo 2,414 3.63 % 2.13 0 0 % 0 416 3.28 % 10.47 961 4.87 % 1.77 608 3.24 % 3.24 60 3.82 % 1.40 1 0.58 % 0.25 368 2.71 % 1.16 bla bla bla 1,985 2.98 % 1.75 0 0 % 0 325 2.56 % 8.18 824 4.18 % 1.52 512 2.73 % 2.73 54 3.44 % 1.26 11 6.36 % 2.78 259 1.91 % 0.81 haha haha h aha 1,798 2.70 % 1.58 0 0 % 0 86 0.68 % 2.17 448 2.27 % 0.83 915 4.88 % 4.88 5 0.32 % 0.12 0 0 % 0 344 2.53 % 1.08 huh huh huh 1,382 2.08 % 1.22 0 0 % 0 23 0.18 % 0.58 97 0.49 % 0.18 1,180 6.29 % 6.30 10 0.64 % 0.23 0 0 % 0 72 0.53 % 0.23 alo alo alo 1,291 1.94 % 1.14 0 0 % 0 100 0.79 % 2.52 599 3.04 % 1.10 221 1.18 % 1.18 25 1.59 % 0.58 0 0 % 0 346 2.55 % 1.09 hvalabogu hvalabogu hvalab ogu 1,286 1.93 % 1.13 0 0 % 0 146 1.15 % 3.68 671 3.40 % 1.24 300 1.60 % 1.60 17 1.08 % 0.40 4 2.31 % 1.01 148 1.09 % 0.47 jebiga jebiga jeb iga 1,175 1.77 % 1.04 0 0 % 0 511 4.03 % 12.87 321 1.63 % 0.59 272 1.45 % 1.45 5 0.32 % 0.12 3 1.73 % 0.76 63 0.46 % 0.20 nasvidenje nasvidenje nasvide nje 1,162 1.75 % 1.02 0 0 % 0 130 1.02 % 3.27 620 3.15 % 1.14 181 0.96 % 0.97 28 1.78 % 0.65 35 20.23 % 8.85 168 1.24 % 0.53 oho oho oho 1,160 1.75 % 1.02 0 0 % 0 135 1.06 % 3.40 421 2.13 % 0.78 217 1.16 % 1.16 27 1.72 % 0.63 1 0.58 % 0.25 359 2.64 % 1.13 hura hura h ura 1,087 1.64 % 0.96 0 0 % 0 121 0.95 % 3.05 531 2.69 % 0.98 289 1.54 % 1.54 38 2.42 % 0.89 1 0.58 % 0.25 107 0.79 % 0.34 jah jah jah 1,031 1.55 % 0.91 0 0 % 0 22 0.17 % 0.55 650 3.30 % 1.20 235 1.25 % 1.25 8 0.51 % 0.19 1 0.58 % 0.25 115 0.85 % 0.36 ups ups ups 972 1.46 % 0.86 0 0 % 0 74 0.58 % 1.86 302 1.53 % 0.56 468 2.50 % 2.50 8 0.51 % 0.19 5 2.89 % 1.26 115 0.85 % 0.36 hopla hopla ho pla 946 1.42 % 0.83 0 0 % 0 42 0.33 % 1.06 338 1.71 % 0.62 535 2.85 % 2.85 5 0.32 % 0.12 3 1.73 % 0.76 23 0.17 % 0.07 ojoj ojoj o joj 890 1.34 % 
0.78 0 0 % 0 282 2.22 % 7.10 202 1.02 % 0.37 241 1.28 % 1.29 15 0.96 % 0.35 4 2.31 % 1.01 146 1.07 % 0.46
zaboga zaboga zab oga 831 1.25 % 0.73 0 0 % 0 455 3.59 % 11.46 168 0.85 % 0.31 131 0.70 % 0.70 35 2.23 % 0.82 0 0 % 0 42 0.31 % 0.13
fuj fuj fuj 800 1.20 % 0.71 0 0 % 0 197 1.55 % 4.96 218 1.11 % 0.40 246 1.31 % 1.31 24 1.53 % 0.56 4 2.31 % 1.01 111 0.82 % 0.35
ejga ejga e jga 756 1.14 % 0.67 0 0 % 0 45 0.35 % 1.13 541 2.74 % 1 61 0.33 % 0.33 0 0 % 0 0 0 % 0 109 0.80 % 0.34
jebenti jebenti jebe nti 753 1.13 % 0.66 0 0 % 0 576 4.54 % 14.50 64 0.33 % 0.12 87 0.46 % 0.46 7 0.45 % 0.16 0 0 % 0 19 0.14 % 0.06
opa opa opa 673 1.01 % 0.59 0 0 % 0 70 0.55 % 1.76 243 1.23 % 0.45 256 1.36 % 1.37 25 1.59 % 0.58 0 0 % 0 79 0.58 % 0.25
jebemti jebemti jebe mti 643 0.97 % 0.57 0 0 % 0 498 3.93 % 12.54 52 0.26 % 0.10 54 0.29 % 0.29 1 0.06 % 0.02 0 0 % 0 38 0.28 % 0.12
mhm mhm mhm 612 0.92 % 0.54 0 0 % 0 223 1.76 % 5.61 88 0.45 % 0.16 256 1.36 % 1.37 16 1.02 % 0.37 0 0 % 0 29 0.21 % 0.09
ojej ojej o jej 603 0.91 % 0.53 0 0 % 0 291 2.29 % 7.33 107 0.54 % 0.20 129 0.69 % 0.69 19 1.21 % 0.44 0 0 % 0 57 0.42 % 0.18
paf paf paf 520 0.78 % 0.46 0 0 % 0 57 0.45 % 1.44 374 1.90 % 0.69 59 0.31 % 0.31 2 0.13 % 0.05 1 0.58 % 0.25 27 0.20 % 0.08
pst pst pst 436 0.66 % 0.38 0 0 % 0 66 0.52 % 1.66 102 0.52 % 0.19 63 0.34 % 0.34 53 3.38 % 1.23 4 2.31 % 1.01 148 1.09 % 0.47
čao čao čao 412 0.62 % 0.36 0 0 % 0 97 0.77 % 2.44 109 0.55 % 0.20 105 0.56 % 0.56 8 0.51 % 0.19 0 0 % 0 93 0.68 % 0.29
hov hov hov 359 0.54 % 0.32 0 0 % 0 25 0.20 % 0.63 159 0.81 % 0.29 82 0.44 % 0.44 22 1.40 % 0.51 1 0.58 % 0.25 70 0.52 % 0.22

1.2.209 List of final character-level 4-grams from interjection lemmas in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-word_parts-interjections-lemmas-final-4grams-taxonomy-entire.tsv
Columns: Lemma; Lemma (lower-case); Rest of the word; Final part of the word; Total absolute frequency of lemma; Percentage of all found lemmas; Total relative frequency (per million occurrences); Absolute frequency [SSJ.T.K.N]; Percentage [SSJ.T.K.N]; Relative frequency [SSJ.T.K.N]; Absolute frequency [SSJ.T.K.L]; Percentage [SSJ.T.K.L]; Relative frequency [SSJ.T.K.L]; Absolute frequency [SSJ.T.P.C]; Percentage [SSJ.T.P.C]; Relative frequency [SSJ.T.P.C]; Absolute frequency [SSJ.T.P.R]; Percentage [SSJ.T.P.R]; Relative frequency [SSJ.T.P.R]; Absolute frequency [SSJ.T.K.S]; Percentage [SSJ.T.K.S]; Relative frequency [SSJ.T.K.S]; Absolute frequency [SSJ.T.D]; Percentage [SSJ.T.D]; Relative frequency [SSJ.T.D]; Absolute frequency [SSJ.I]; Percentage [SSJ.I]; Relative frequency [SSJ.I]
bravo bravo b ravo 3,087 11.12 % 2.72 0 0 % 0 254 3.77 % 6.40 1,220 12.87 % 2.25 848 11.88 % 4.52 37 5.55 % 0.86 4 5.20 % 1.01 724 19.79 % 2.28
zbogom zbogom zb ogom 2,474 8.91 % 2.18 0 0 % 0 360 5.34 % 9.06 1,062 11.20 % 1.96 532 7.46 % 2.84 90 13.49 % 2.10 3 3.90 % 0.76 427 11.67 % 1.34
adijo adijo a dijo 2,414 8.70 % 2.13 0 0 % 0 416 6.17 % 10.47 961 10.14 % 1.77 608 8.52 % 3.24 60 9.00 % 1.40 1 1.30 % 0.25 368 10.06 % 1.16
haha haha haha 1,798 6.48 % 1.58 0 0 % 0 86 1.27 % 2.17 448 4.73 % 0.83 915 12.82 % 4.88 5 0.75 % 0.12 0 0 % 0 344 9.40 % 1.08
hvalabogu hvalabogu hvala bogu 1,286 4.63 % 1.13 0 0 % 0 146 2.17 % 3.68 671 7.08 % 1.24 300 4.21 % 1.60 17 2.55 % 0.40 4 5.20 % 1.01 148 4.05 % 0.47
jebiga jebiga je biga 1,175 4.23 % 1.04 0 0 % 0 511 7.58 % 12.87 321 3.39 % 0.59 272 3.81 % 1.45 5 0.75 % 0.12 3 3.90 % 0.76 63 1.72 % 0.20
nasvidenje nasvidenje nasvid enje 1,162 4.19 % 1.02 0 0 % 0 130 1.93 % 3.27 620 6.54 % 1.14 181 2.54 % 0.97 28 4.20 % 0.65 35 45.45 % 8.85 168 4.59 % 0.53
hura hura hura 1,087 3.92 % 0.96 0 0 % 0 121 1.79 % 3.05 531 5.60 % 0.98 289 4.05 % 1.54 38 5.70 % 0.89 1 1.30 % 0.25 107 2.92 % 0.34
hopla hopla h opla 946 3.41 % 0.83 0 0 % 0 42 0.62 % 1.06 338 3.56 % 0.62 535 7.50 % 2.85 5 0.75 % 0.12 3 3.90 % 0.76 23 0.63 % 0.07
ojoj ojoj ojoj 890 3.21 % 0.78 0 0 % 0 282 4.18 %
7.10 202 2.13 % 0.37 241 3.38 % 1.29 15 2.25 % 0.35 4 5.20 % 1.01 146 3.99 % 0.46 zaboga zaboga za boga 831 2.99 % 0.73 0 0 % 0 455 6.75 % 11.46 168 1.77 % 0.31 131 1.84 % 0.70 35 5.25 % 0.82 0 0 % 0 42 1.15 % 0.13 ejga ejga ejga 756 2.72 % 0.67 0 0 % 0 45 0.67 % 1.13 541 5.71 % 1 61 0.85 % 0.33 0 0 % 0 0 0 % 0 109 2.98 % 0.34 jebenti jebenti jeb enti 753 2.71 % 0.66 0 0 % 0 576 8.54 % 14.50 64 0.68 % 0.12 87 1.22 % 0.46 7 1.05 % 0.16 0 0 % 0 19 0.52 % 0.06 jebemti jebemti jeb emti 643 2.32 % 0.57 0 0 % 0 498 7.38 % 12.54 52 0.55 % 0.10 54 0.76 % 0.29 1 0.15 % 0.02 0 0 % 0 38 1.04 % 0.12 ojej ojej ojej 603 2.17 % 0.53 0 0 % 0 291 4.32 % 7.33 107 1.13 % 0.20 129 1.81 % 0.69 19 2.85 % 0.44 0 0 % 0 57 1.56 % 0.18 juhej juhej j uhej 325 1.17 % 0.29 0 0 % 0 17 0.25 % 0.43 127 1.34 % 0.23 126 1.77 % 0.67 6 0.90 % 0.14 0 0 % 0 49 1.34 % 0.15 živijo živijo ži vijo 255 0.92 % 0.22 0 0 % 0 134 1.99 % 3.37 43 0.45 % 0.08 47 0.66 % 0.25 12 1.80 % 0.28 1 1.30 % 0.25 18 0.49 % 0.06 juhuhu juhuhu ju huhu 246 0.89 % 0.22 0 0 % 0 33 0.49 % 0.83 103 1.09 % 0.19 79 1.11 % 0.42 5 0.75 % 0.12 3 3.90 % 0.76 23 0.63 % 0.07 živio živio ž ivio 207 0.75 % 0.18 0 0 % 0 76 1.13 % 1.91 81 0.85 % 0.15 7 0.10 % 0.04 18 2.70 % 0.42 2 2.60 % 0.51 23 0.63 % 0.07 hojla hojla h ojla 194 0.70 % 0.17 0 0 % 0 39 0.58 % 0.98 107 1.13 % 0.20 46 0.65 % 0.25 1 0.15 % 0.02 0 0 % 0 1 0.03 % 0 živjo živjo ž ivjo 191 0.69 % 0.17 0 0 % 0 126 1.87 % 3.17 20 0.21 % 0.04 28 0.39 % 0.15 2 0.30 % 0.05 0 0 % 0 15 0.41 % 0.05 hahaha hahaha ha haha 190 0.68 % 0.17 0 0 % 0 30 0.45 % 0.76 34 0.36 % 0.06 67 0.94 % 0.36 2 0.30 % 0.05 0 0 % 0 57 1.56 % 0.18 žalibog žalibog žal ibog 186 0.67 % 0.16 0 0 % 0 13 0.19 % 0.33 78 0.82 % 0.14 35 0.49 % 0.19 13 1.95 % 0.30 1 1.30 % 0.25 46 1.26 % 0.14 ježeš ježeš j ežeš 155 0.56 % 0.14 0 0 % 0 41 0.61 % 1.03 34 0.36 % 0.06 75 1.05 % 0.40 0 0 % 0 0 0 % 0 5 0.14 % 0.02 aaaa aaaa aaaa 148 0.53 % 0.13 0 0 % 0 49 0.73 % 1.23 22 0.23 % 0.04 58 0.81 % 0.31 0 0 % 0 0 0 % 0 19 0.52 % 0.06 
tralala tralala tra lala 127 0.46 % 0.11 0 0 % 0 17 0.25 % 0.43 57 0.60 % 0.11 35 0.49 % 0.19 1 0.15 % 0.02 1 1.30 % 0.25 16 0.44 % 0.05
heja heja heja 98 0.35 % 0.09 0 0 % 0 7 0.10 % 0.18 30 0.32 % 0.06 22 0.31 % 0.12 3 0.45 % 0.07 0 0 % 0 36 0.98 % 0.11
bumf bumf bumf 80 0.29 % 0.07 0 0 % 0 33 0.49 % 0.83 23 0.24 % 0.04 18 0.25 % 0.10 3 0.45 % 0.07 0 0 % 0 3 0.08 % 0.01
hrsk hrsk hrsk 73 0.26 % 0.06 0 0 % 0 33 0.49 % 0.83 21 0.22 % 0.04 13 0.18 % 0.07 1 0.15 % 0.02 0 0 % 0 5 0.14 % 0.02
jojmene jojmene joj mene 72 0.26 % 0.06 0 0 % 0 34 0.50 % 0.86 13 0.14 % 0.02 17 0.24 % 0.09 5 0.75 % 0.12 0 0 % 0 3 0.08 % 0.01
hozana hozana ho zana 66 0.24 % 0.06 0 0 % 0 21 0.31 % 0.53 24 0.25 % 0.04 10 0.14 % 0.05 3 0.45 % 0.07 1 1.30 % 0.25 7 0.19 % 0.02
aaah aaah aaah 63 0.23 % 0.06 0 0 % 0 17 0.25 % 0.43 6 0.06 % 0.01 32 0.45 % 0.17 2 0.30 % 0.05 0 0 % 0 6 0.16 % 0.02

1.2.210 List of final character-level 5-grams from interjection lemmas in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-word_parts-interjections-lemmas-final-5grams-taxonomy-entire.tsv
Columns: Lemma; Lemma (lower-case); Rest of the word; Final part of the word; Total absolute frequency of lemma; Percentage of all found lemmas; Total relative frequency (per million occurrences); Absolute frequency [SSJ.T.K.N]; Percentage [SSJ.T.K.N]; Relative frequency [SSJ.T.K.N]; Absolute frequency [SSJ.T.K.L]; Percentage [SSJ.T.K.L]; Relative frequency [SSJ.T.K.L]; Absolute frequency [SSJ.T.P.C]; Percentage [SSJ.T.P.C]; Relative frequency [SSJ.T.P.C]; Absolute frequency [SSJ.T.P.R]; Percentage [SSJ.T.P.R]; Relative frequency [SSJ.T.P.R]; Absolute frequency [SSJ.T.K.S]; Percentage [SSJ.T.K.S]; Relative frequency [SSJ.T.K.S]; Absolute frequency [SSJ.T.D]; Percentage [SSJ.T.D]; Relative frequency [SSJ.T.D]; Absolute frequency [SSJ.I]; Percentage [SSJ.I]; Relative frequency [SSJ.I]
bravo bravo bravo 3,087 14.65 % 2.72 0 0 % 0 254 4.73 % 6.40 1,220
16.73 % 2.25 848 16.67 % 4.52 37 6.88 % 0.86 4 5.97 % 1.01 724 26.55 % 2.28 zbogom zbogom z bogom 2,474 11.74 % 2.18 0 0 % 0 360 6.71 % 9.06 1,062 14.57 % 1.96 532 10.46 % 2.84 90 16.73 % 2.10 3 4.48 % 0.76 427 15.66 % 1.34 adijo adijo adijo 2,414 11.45 % 2.13 0 0 % 0 416 7.75 % 10.47 961 13.18 % 1.77 608 11.95 % 3.24 60 11.15 % 1.40 1 1.49 % 0.25 368 13.49 % 1.16 hvalabogu hvalabogu hval abogu 1,286 6.10 % 1.13 0 0 % 0 146 2.72 % 3.68 671 9.20 % 1.24 300 5.90 % 1.60 17 3.16 % 0.40 4 5.97 % 1.01 148 5.43 % 0.47 jebiga jebiga j ebiga 1,175 5.58 % 1.04 0 0 % 0 511 9.52 % 12.87 321 4.40 % 0.59 272 5.35 % 1.45 5 0.93 % 0.12 3 4.48 % 0.76 63 2.31 % 0.20 nasvidenje nasvidenje nasvi denje 1,162 5.51 % 1.02 0 0 % 0 130 2.42 % 3.27 620 8.51 % 1.14 181 3.56 % 0.97 28 5.20 % 0.65 35 52.24 % 8.85 168 6.16 % 0.53 hopla hopla hopla 946 4.49 % 0.83 0 0 % 0 42 0.78 % 1.06 338 4.64 % 0.62 535 10.52 % 2.85 5 0.93 % 0.12 3 4.48 % 0.76 23 0.84 % 0.07 zaboga zaboga z aboga 831 3.94 % 0.73 0 0 % 0 455 8.48 % 11.46 168 2.31 % 0.31 131 2.58 % 0.70 35 6.51 % 0.82 0 0 % 0 42 1.54 % 0.13 jebenti jebenti je benti 753 3.57 % 0.66 0 0 % 0 576 10.73 % 14.50 64 0.88 % 0.12 87 1.71 % 0.46 7 1.30 % 0.16 0 0 % 0 19 0.70 % 0.06 jebemti jebemti je bemti 643 3.05 % 0.57 0 0 % 0 498 9.28 % 12.54 52 0.71 % 0.10 54 1.06 % 0.29 1 0.19 % 0.02 0 0 % 0 38 1.39 % 0.12 juhej juhej juhej 325 1.54 % 0.29 0 0 % 0 17 0.32 % 0.43 127 1.74 % 0.23 126 2.48 % 0.67 6 1.11 % 0.14 0 0 % 0 49 1.80 % 0.15 živijo živijo ž ivijo 255 1.21 % 0.22 0 0 % 0 134 2.50 % 3.37 43 0.59 % 0.08 47 0.92 % 0.25 12 2.23 % 0.28 1 1.49 % 0.25 18 0.66 % 0.06 juhuhu juhuhu j uhuhu 246 1.17 % 0.22 0 0 % 0 33 0.61 % 0.83 103 1.41 % 0.19 79 1.55 % 0.42 5 0.93 % 0.12 3 4.48 % 0.76 23 0.84 % 0.07 živio živio živio 207 0.98 % 0.18 0 0 % 0 76 1.42 % 1.91 81 1.11 % 0.15 7 0.14 % 0.04 18 3.35 % 0.42 2 2.98 % 0.51 23 0.84 % 0.07 hojla hojla hojla 194 0.92 % 0.17 0 0 % 0 39 0.73 % 0.98 107 1.47 % 0.20 46 0.90 % 0.25 1 0.19 % 0.02 0 0 % 0 1 0.04 % 0 živjo 
živjo živjo 191 0.91 % 0.17 0 0 % 0 126 2.35 % 3.17 20 0.27 % 0.04 28 0.55 % 0.15 2 0.37 % 0.05 0 0 % 0 15 0.55 % 0.05 hahaha hahaha h ahaha 190 0.90 % 0.17 0 0 % 0 30 0.56 % 0.76 34 0.47 % 0.06 67 1.32 % 0.36 2 0.37 % 0.05 0 0 % 0 57 2.09 % 0.18 žalibog žalibog ža libog 186 0.88 % 0.16 0 0 % 0 13 0.24 % 0.33 78 1.07 % 0.14 35 0.69 % 0.19 13 2.42 % 0.30 1 1.49 % 0.25 46 1.69 % 0.14 ježeš ježeš ježeš 155 0.73 % 0.14 0 0 % 0 41 0.76 % 1.03 34 0.47 % 0.06 75 1.47 % 0.40 0 0 % 0 0 0 % 0 5 0.18 % 0.02 tralala tralala tr alala 127 0.60 % 0.11 0 0 % 0 17 0.32 % 0.43 57 0.78 % 0.11 35 0.69 % 0.19 1 0.19 % 0.02 1 1.49 % 0.25 16 0.59 % 0.05 jojmene jojmene jo jmene 72 0.34 % 0.06 0 0 % 0 34 0.63 % 0.86 13 0.18 % 0.02 17 0.33 % 0.09 5 0.93 % 0.12 0 0 % 0 3 0.11 % 0.01 hozana hozana h ozana 66 0.31 % 0.06 0 0 % 0 21 0.39 % 0.53 24 0.33 % 0.04 10 0.20 % 0.05 3 0.56 % 0.07 1 1.49 % 0.25 7 0.26 % 0.02 mijav mijav mijav 56 0.27 % 0.05 0 0 % 0 11 0.20 % 0.28 17 0.23 % 0.03 19 0.37 % 0.10 5 0.93 % 0.12 0 0 % 0 4 0.15 % 0.01 joooj joooj joooj 53 0.25 % 0.05 0 0 % 0 1 0.02 % 0.03 24 0.33 % 0.04 22 0.43 % 0.12 0 0 % 0 0 0 % 0 6 0.22 % 0.02 Nasvidenje nasvidenje Nasvi denje 52 0.25 % 0.05 0 0 % 0 0 0 % 0 29 0.40 % 0.05 7 0.14 % 0.04 4 0.74 % 0.09 0 0 % 0 12 0.44 % 0.04 zdravo zdravo z dravo 52 0.25 % 0.05 0 0 % 0 12 0.22 % 0.30 24 0.33 % 0.04 9 0.18 % 0.05 3 0.56 % 0.07 0 0 % 0 4 0.15 % 0.01 jojme jojme jojme 49 0.23 % 0.04 0 0 % 0 16 0.30 % 0.40 10 0.14 % 0.02 15 0.29 % 0.08 6 1.11 % 0.14 1 1.49 % 0.25 1 0.04 % 0 aaaaa aaaaa aaaaa 45 0.21 % 0.04 0 0 % 0 12 0.22 % 0.30 8 0.11 % 0.01 17 0.33 % 0.09 2 0.37 % 0.05 0 0 % 0 6 0.22 % 0.02 mejduš mejduš m ejduš 42 0.20 % 0.04 0 0 % 0 14 0.26 % 0.35 10 0.14 % 0.02 10 0.20 % 0.05 2 0.37 % 0.05 0 0 % 0 6 0.22 % 0.02 ježešna ježešna je žešna 32 0.15 % 0.03 0 0 % 0 13 0.24 % 0.33 13 0.18 % 0.02 4 0.08 % 0.02 1 0.19 % 0.02 0 0 % 0 1 0.04 % 0 jaaaa jaaaa jaaaa 29 0.14 % 0.03 0 0 % 0 12 0.22 % 0.30 5 0.07 % 0.01 5 0.10 % 0.03 0 0 % 0 0 0 % 0 7 0.26 % 
0.02
aaaah aaaah aaaah 28 0.13 % 0.02 0 0 % 0 8 0.15 % 0.20 4 0.06 % 0.01 14 0.28 % 0.07 0 0 % 0 0 0 % 0 2 0.07 % 0.01

1.2.211 List of initial character-level 1-grams from interjection lower-case word forms in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-word_parts-interjections-lowercase_forms-initial-1grams-taxonomy-entire.tsv
Columns: Lower-case word form; Initial part of the word; Rest of the word; Total absolute frequency of lower-case word form; Percentage of all found lower-case word forms; Total relative frequency (per million occurrences); Absolute frequency [SSJ.T.K.N]; Percentage [SSJ.T.K.N]; Relative frequency [SSJ.T.K.N]; Absolute frequency [SSJ.T.K.L]; Percentage [SSJ.T.K.L]; Relative frequency [SSJ.T.K.L]; Absolute frequency [SSJ.T.P.C]; Percentage [SSJ.T.P.C]; Relative frequency [SSJ.T.P.C]; Absolute frequency [SSJ.T.P.R]; Percentage [SSJ.T.P.R]; Relative frequency [SSJ.T.P.R]; Absolute frequency [SSJ.T.K.S]; Percentage [SSJ.T.K.S]; Relative frequency [SSJ.T.K.S]; Absolute frequency [SSJ.T.D]; Percentage [SSJ.T.D]; Relative frequency [SSJ.T.D]; Absolute frequency [SSJ.I]; Percentage [SSJ.I]; Relative frequency [SSJ.I]
oh o h 14,865 10.91 % 13.10 0 0 % 0 5,344 19.95 % 134.56 3,245 7.48 % 5.98 3,990 9.80 % 21.29 760 18.32 % 17.70 25 7.14 % 6.32 1,501 7.18 % 4.72
ah a h 13,301 9.76 % 11.72 0 0 % 0 2,696 10.06 % 67.88 4,963 11.44 % 9.14 3,766 9.25 % 20.09 445 10.73 % 10.36 30 8.57 % 7.58 1,401 6.71 % 4.41
ha h a 11,172 8.20 % 9.85 0 0 % 0 1,638 6.11 % 41.24 4,524 10.43 % 8.34 3,900 9.58 % 20.81 320 7.71 % 7.45 33 9.43 % 8.34 757 3.62 % 2.38
hm h m 8,313 6.10 % 7.33 0 0 % 0 1,142 4.26 % 28.75 3,096 7.13 % 5.70 3,089 7.59 % 16.48 169 4.07 % 3.94 12 3.43 % 3.03 805 3.85 % 2.53
hej h ej 7,497 5.50 % 6.61 0 0 % 0 1,515 5.66 % 38.15 1,767 4.07 % 3.26 3,368 8.27 % 17.97 210 5.06 % 4.89 4 1.14 % 1.01 633 3.03 % 1.99
joj j oj 6,441 4.73 % 5.68 0 0 % 0 1,074 4.01 % 27.04 2,157 4.97 %
3.97 2,162 5.31 % 11.54 138 3.33 % 3.21 33 9.43 % 8.34 877 4.20 % 2.76 aha a ha 6,335 4.65 % 5.58 0 0 % 0 1,184 4.42 % 29.81 1,301 3.00 % 2.40 1,538 3.78 % 8.21 160 3.86 % 3.73 25 7.14 % 6.32 2,127 10.18 % 6.69 fak f ak 4,669 3.43 % 4.11 0 0 % 0 160 0.60 % 4.03 190 0.44 % 0.35 98 0.24 % 0.52 2 0.05 % 0.05 0 0 % 0 4,219 20.20 % 13.27 ej e j 3,721 2.73 % 3.28 0 0 % 0 1,232 4.60 % 31.02 1,049 2.42 % 1.93 830 2.04 % 4.43 241 5.81 % 5.61 7 2.00 % 1.77 362 1.73 % 1.14 hi h i 3,425 2.51 % 3.02 0 0 % 0 169 0.63 % 4.26 1,246 2.87 % 2.30 1,521 3.74 % 8.12 111 2.68 % 2.59 22 6.29 % 5.56 356 1.70 % 1.12 bravo b ravo 3,100 2.27 % 2.73 0 0 % 0 255 0.95 % 6.42 1,225 2.82 % 2.26 853 2.10 % 4.55 37 0.89 % 0.86 4 1.14 % 1.01 726 3.48 % 2.28 uf u f 2,808 2.06 % 2.47 0 0 % 0 254 0.95 % 6.40 834 1.92 % 1.54 1,109 2.72 % 5.92 25 0.60 % 0.58 0 0 % 0 586 2.81 % 1.84 zbogom z bogom 2,474 1.81 % 2.18 0 0 % 0 360 1.34 % 9.06 1,062 2.45 % 1.96 532 1.31 % 2.84 90 2.17 % 2.10 3 0.86 % 0.76 427 2.04 % 1.34 uh u h 2,435 1.79 % 2.15 0 0 % 0 336 1.25 % 8.46 830 1.91 % 1.53 983 2.41 % 5.24 48 1.16 % 1.12 2 0.57 % 0.51 236 1.13 % 0.74 adijo a dijo 2,415 1.77 % 2.13 0 0 % 0 416 1.55 % 10.47 961 2.21 % 1.77 609 1.50 % 3.25 60 1.45 % 1.40 1 0.29 % 0.25 368 1.76 % 1.16 eh e h 2,409 1.77 % 2.12 0 0 % 0 399 1.49 % 10.05 655 1.51 % 1.21 1,114 2.74 % 5.94 47 1.13 % 1.09 1 0.29 % 0.25 193 0.92 % 0.61 ho h o 2,160 1.58 % 1.90 0 0 % 0 220 0.82 % 5.54 784 1.81 % 1.44 504 1.24 % 2.69 133 3.21 % 3.10 18 5.14 % 4.55 501 2.40 % 1.58 bla b la 1,985 1.46 % 1.75 0 0 % 0 325 1.21 % 8.18 824 1.90 % 1.52 512 1.26 % 2.73 54 1.30 % 1.26 11 3.14 % 2.78 259 1.24 % 0.81 la l a 1,888 1.39 % 1.66 0 0 % 0 90 0.34 % 2.27 919 2.12 % 1.69 450 1.10 % 2.40 72 1.74 % 1.68 3 0.86 % 0.76 354 1.70 % 1.11 haha h aha 1,798 1.32 % 1.58 0 0 % 0 86 0.32 % 2.17 448 1.03 % 0.83 915 2.25 % 4.88 5 0.12 % 0.12 0 0 % 0 344 1.65 % 1.08 oj o j 1,484 1.09 % 1.31 0 0 % 0 218 0.81 % 5.49 651 1.50 % 1.20 318 0.78 % 1.70 149 3.59 % 3.47 9 2.57 % 2.28 139 
0.67 % 0.44
huh h uh 1,382 1.01 % 1.22 0 0 % 0 23 0.09 % 0.58 97 0.22 % 0.18 1,180 2.90 % 6.30 10 0.24 % 0.23 0 0 % 0 72 0.34 % 0.23
alo a lo 1,291 0.95 % 1.14 0 0 % 0 100 0.37 % 2.52 599 1.38 % 1.10 221 0.54 % 1.18 25 0.60 % 0.58 0 0 % 0 346 1.66 % 1.09
hvalabogu h valabogu 1,286 0.94 % 1.13 0 0 % 0 146 0.55 % 3.68 671 1.55 % 1.24 300 0.74 % 1.60 17 0.41 % 0.40 4 1.14 % 1.01 148 0.71 % 0.47
nasvidenje n asvidenje 1,217 0.89 % 1.07 0 0 % 0 130 0.48 % 3.27 651 1.50 % 1.20 189 0.46 % 1.01 32 0.77 % 0.75 35 10.00 % 8.85 180 0.86 % 0.57
jebiga j ebiga 1,175 0.86 % 1.04 0 0 % 0 511 1.91 % 12.87 321 0.74 % 0.59 272 0.67 % 1.45 5 0.12 % 0.12 3 0.86 % 0.76 63 0.30 % 0.20
oho o ho 1,160 0.85 % 1.02 0 0 % 0 135 0.50 % 3.40 421 0.97 % 0.78 217 0.53 % 1.16 27 0.65 % 0.63 1 0.29 % 0.25 359 1.72 % 1.13
hura h ura 1,087 0.80 % 0.96 0 0 % 0 121 0.45 % 3.05 531 1.22 % 0.98 289 0.71 % 1.54 38 0.92 % 0.89 1 0.29 % 0.25 107 0.51 % 0.34
jah j ah 1,031 0.76 % 0.91 0 0 % 0 22 0.08 % 0.55 650 1.50 % 1.20 235 0.58 % 1.25 8 0.19 % 0.19 1 0.29 % 0.25 115 0.55 % 0.36
ups u ps 972 0.71 % 0.86 0 0 % 0 74 0.28 % 1.86 302 0.70 % 0.56 468 1.15 % 2.50 8 0.19 % 0.19 5 1.43 % 1.26 115 0.55 % 0.36
hopla h opla 946 0.69 % 0.83 0 0 % 0 42 0.16 % 1.06 338 0.78 % 0.62 535 1.31 % 2.85 5 0.12 % 0.12 3 0.86 % 0.76 23 0.11 % 0.07
av a v 944 0.69 % 0.83 0 0 % 0 78 0.29 % 1.96 588 1.35 % 1.08 200 0.49 % 1.07 28 0.68 % 0.65 10 2.86 % 2.53 40 0.19 % 0.13

1.2.212 List of initial character-level 2-grams from interjection lower-case word forms in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-word_parts-interjections-lowercase_forms-initial-2grams-taxonomy-entire.tsv
Columns: Lower-case word form; Initial part of the word; Rest of the word; Total absolute frequency of lower-case word form; Percentage of all found lower-case word forms; Total relative frequency (per million occurrences); Absolute frequency [SSJ.T.K.N]; Percentage [SSJ.T.K.N]; Relative frequency [SSJ.T.K.N]; Absolute frequency [SSJ.T.K.L]; Percentage [SSJ.T.K.L]; Relative frequency [SSJ.T.K.L]; Absolute frequency [SSJ.T.P.C]; Percentage [SSJ.T.P.C]; Relative frequency [SSJ.T.P.C]; Absolute frequency [SSJ.T.P.R]; Percentage [SSJ.T.P.R]; Relative frequency [SSJ.T.P.R]; Absolute frequency [SSJ.T.K.S]; Percentage [SSJ.T.K.S]; Relative frequency [SSJ.T.K.S]; Absolute frequency [SSJ.T.D]; Percentage [SSJ.T.D]; Relative frequency [SSJ.T.D]; Absolute frequency [SSJ.I]; Percentage [SSJ.I]; Relative frequency [SSJ.I]
oh oh 14,865 10.91 % 13.10 0 0 % 0 5,344 19.95 % 134.56 3,245 7.48 % 5.98 3,990 9.80 % 21.29 760 18.32 % 17.70 25 7.14 % 6.32 1,501 7.18 % 4.72
ah ah 13,301 9.76 % 11.72 0 0 % 0 2,696 10.06 % 67.88 4,963 11.44 % 9.14 3,766 9.25 % 20.09 445 10.73 % 10.36 30 8.57 % 7.58 1,401 6.71 % 4.41
ha ha 11,172 8.20 % 9.85 0 0 % 0 1,638 6.11 % 41.24 4,524 10.43 % 8.34 3,900 9.58 % 20.81 320 7.71 % 7.45 33 9.43 % 8.34 757 3.62 % 2.38
hm hm 8,313 6.10 % 7.33 0 0 % 0 1,142 4.26 % 28.75 3,096 7.13 % 5.70 3,089 7.59 % 16.48 169 4.07 % 3.94 12 3.43 % 3.03 805 3.85 % 2.53
hej he j 7,497 5.50 % 6.61 0 0 % 0 1,515 5.66 % 38.15 1,767 4.07 % 3.26 3,368 8.27 % 17.97 210 5.06 % 4.89 4 1.14 % 1.01 633 3.03 % 1.99
joj jo j 6,441 4.73 % 5.68 0 0 % 0 1,074 4.01 % 27.04 2,157 4.97 % 3.97 2,162 5.31 % 11.54 138 3.33 % 3.21 33 9.43 % 8.34 877 4.20 % 2.76
aha ah a 6,335 4.65 % 5.58 0 0 % 0 1,184 4.42 % 29.81 1,301 3.00 % 2.40 1,538 3.78 % 8.21 160 3.86 % 3.73 25 7.14 % 6.32 2,127 10.18 % 6.69
fak fa k 4,669 3.43 % 4.11 0 0 % 0 160 0.60 % 4.03 190 0.44 % 0.35 98 0.24 % 0.52 2 0.05 % 0.05 0 0 % 0 4,219 20.20 % 13.27
ej ej 3,721 2.73 % 3.28 0 0 % 0 1,232 4.60 % 31.02 1,049 2.42 % 1.93 830 2.04 % 4.43 241 5.81 % 5.61 7 2.00 % 1.77 362 1.73 % 1.14
hi hi 3,425 2.51 % 3.02 0 0 % 0 169 0.63 % 4.26 1,246 2.87 % 2.30 1,521 3.74 % 8.12 111 2.68 % 2.59 22 6.29 % 5.56 356 1.70 % 1.12
bravo br avo 3,100 2.27 % 2.73 0 0 % 0 255 0.95 % 6.42 1,225 2.82 % 2.26 853 2.10 % 4.55 37
0.89 % 0.86 4 1.14 % 1.01 726 3.48 % 2.28 uf uf 2,808 2.06 % 2.47 0 0 % 0 254 0.95 % 6.40 834 1.92 % 1.54 1,109 2.72 % 5.92 25 0.60 % 0.58 0 0 % 0 586 2.81 % 1.84 zbogom zb ogom 2,474 1.81 % 2.18 0 0 % 0 360 1.34 % 9.06 1,062 2.45 % 1.96 532 1.31 % 2.84 90 2.17 % 2.10 3 0.86 % 0.76 427 2.04 % 1.34 uh uh 2,435 1.79 % 2.15 0 0 % 0 336 1.25 % 8.46 830 1.91 % 1.53 983 2.41 % 5.24 48 1.16 % 1.12 2 0.57 % 0.51 236 1.13 % 0.74 adijo ad ijo 2,415 1.77 % 2.13 0 0 % 0 416 1.55 % 10.47 961 2.21 % 1.77 609 1.50 % 3.25 60 1.45 % 1.40 1 0.29 % 0.25 368 1.76 % 1.16 eh eh 2,409 1.77 % 2.12 0 0 % 0 399 1.49 % 10.05 655 1.51 % 1.21 1,114 2.74 % 5.94 47 1.13 % 1.09 1 0.29 % 0.25 193 0.92 % 0.61 ho ho 2,160 1.58 % 1.90 0 0 % 0 220 0.82 % 5.54 784 1.81 % 1.44 504 1.24 % 2.69 133 3.21 % 3.10 18 5.14 % 4.55 501 2.40 % 1.58 bla bl a 1,985 1.46 % 1.75 0 0 % 0 325 1.21 % 8.18 824 1.90 % 1.52 512 1.26 % 2.73 54 1.30 % 1.26 11 3.14 % 2.78 259 1.24 % 0.81 la la 1,888 1.39 % 1.66 0 0 % 0 90 0.34 % 2.27 919 2.12 % 1.69 450 1.10 % 2.40 72 1.74 % 1.68 3 0.86 % 0.76 354 1.70 % 1.11 haha ha ha 1,798 1.32 % 1.58 0 0 % 0 86 0.32 % 2.17 448 1.03 % 0.83 915 2.25 % 4.88 5 0.12 % 0.12 0 0 % 0 344 1.65 % 1.08 oj oj 1,484 1.09 % 1.31 0 0 % 0 218 0.81 % 5.49 651 1.50 % 1.20 318 0.78 % 1.70 149 3.59 % 3.47 9 2.57 % 2.28 139 0.67 % 0.44 huh hu h 1,382 1.01 % 1.22 0 0 % 0 23 0.09 % 0.58 97 0.22 % 0.18 1,180 2.90 % 6.30 10 0.24 % 0.23 0 0 % 0 72 0.34 % 0.23 alo al o 1,291 0.95 % 1.14 0 0 % 0 100 0.37 % 2.52 599 1.38 % 1.10 221 0.54 % 1.18 25 0.60 % 0.58 0 0 % 0 346 1.66 % 1.09 hvalabogu hv alabogu 1,286 0.94 % 1.13 0 0 % 0 146 0.55 % 3.68 671 1.55 % 1.24 300 0.74 % 1.60 17 0.41 % 0.40 4 1.14 % 1.01 148 0.71 % 0.47 nasvidenje na svidenje 1,217 0.89 % 1.07 0 0 % 0 130 0.48 % 3.27 651 1.50 % 1.20 189 0.46 % 1.01 32 0.77 % 0.75 35 10.00 % 8.85 180 0.86 % 0.57 jebiga je biga 1,175 0.86 % 1.04 0 0 % 0 511 1.91 % 12.87 321 0.74 % 0.59 272 0.67 % 1.45 5 0.12 % 0.12 3 0.86 % 0.76 63 0.30 % 0.20 oho oh o 1,160 0.85 % 1.02 
0 0 % 0 135 0.50 % 3.40 421 0.97 % 0.78 217 0.53 % 1.16 27 0.65 % 0.63 1 0.29 % 0.25 359 1.72 % 1.13
hura hu ra 1,087 0.80 % 0.96 0 0 % 0 121 0.45 % 3.05 531 1.22 % 0.98 289 0.71 % 1.54 38 0.92 % 0.89 1 0.29 % 0.25 107 0.51 % 0.34
jah ja h 1,031 0.76 % 0.91 0 0 % 0 22 0.08 % 0.55 650 1.50 % 1.20 235 0.58 % 1.25 8 0.19 % 0.19 1 0.29 % 0.25 115 0.55 % 0.36
ups up s 972 0.71 % 0.86 0 0 % 0 74 0.28 % 1.86 302 0.70 % 0.56 468 1.15 % 2.50 8 0.19 % 0.19 5 1.43 % 1.26 115 0.55 % 0.36
hopla ho pla 946 0.69 % 0.83 0 0 % 0 42 0.16 % 1.06 338 0.78 % 0.62 535 1.31 % 2.85 5 0.12 % 0.12 3 0.86 % 0.76 23 0.11 % 0.07
av av 944 0.69 % 0.83 0 0 % 0 78 0.29 % 1.96 588 1.35 % 1.08 200 0.49 % 1.07 28 0.68 % 0.65 10 2.86 % 2.53 40 0.19 % 0.13

1.2.213 List of initial character-level 3-grams from interjection lower-case word forms in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-word_parts-interjections-lowercase_forms-initial-3grams-taxonomy-entire.tsv
Columns: Lower-case word form; Initial part of the word; Rest of the word; Total absolute frequency of lower-case word form; Percentage of all found lower-case word forms; Total relative frequency (per million occurrences); Absolute frequency [SSJ.T.K.N]; Percentage [SSJ.T.K.N]; Relative frequency [SSJ.T.K.N]; Absolute frequency [SSJ.T.K.L]; Percentage [SSJ.T.K.L]; Relative frequency [SSJ.T.K.L]; Absolute frequency [SSJ.T.P.C]; Percentage [SSJ.T.P.C]; Relative frequency [SSJ.T.P.C]; Absolute frequency [SSJ.T.P.R]; Percentage [SSJ.T.P.R]; Relative frequency [SSJ.T.P.R]; Absolute frequency [SSJ.T.K.S]; Percentage [SSJ.T.K.S]; Relative frequency [SSJ.T.K.S]; Absolute frequency [SSJ.T.D]; Percentage [SSJ.T.D]; Relative frequency [SSJ.T.D]; Absolute frequency [SSJ.I]; Percentage [SSJ.I]; Relative frequency [SSJ.I]
hej hej 7,497 11.28 % 6.61 0 0 % 0 1,515 11.94 % 38.15 1,767 8.96 % 3.26 3,368 17.95 % 17.97 210 13.38 % 4.89 4 2.31 % 1.01 633 4.66 % 1.99
joj joj 6,441
9.69 % 5.68 0 0 % 0 1,074 8.47 % 27.04 2,157 10.94 % 3.97 2,162 11.52 % 11.54 138 8.79 % 3.21 33 19.07 % 8.34 877 6.45 % 2.76 aha aha 6,335 9.53 % 5.58 0 0 % 0 1,184 9.34 % 29.81 1,301 6.60 % 2.40 1,538 8.20 % 8.21 160 10.20 % 3.73 25 14.45 % 6.32 2,127 15.65 % 6.69 fak fak 4,669 7.02 % 4.11 0 0 % 0 160 1.26 % 4.03 190 0.96 % 0.35 98 0.52 % 0.52 2 0.13 % 0.05 0 0 % 0 4,219 31.04 % 13.27 bravo bra vo 3,100 4.66 % 2.73 0 0 % 0 255 2.01 % 6.42 1,225 6.21 % 2.26 853 4.55 % 4.55 37 2.36 % 0.86 4 2.31 % 1.01 726 5.34 % 2.28 zbogom zbo gom 2,474 3.72 % 2.18 0 0 % 0 360 2.84 % 9.06 1,062 5.39 % 1.96 532 2.84 % 2.84 90 5.74 % 2.10 3 1.73 % 0.76 427 3.14 % 1.34 adijo adi jo 2,415 3.63 % 2.13 0 0 % 0 416 3.28 % 10.47 961 4.87 % 1.77 609 3.25 % 3.25 60 3.82 % 1.40 1 0.58 % 0.25 368 2.71 % 1.16 bla bla 1,985 2.98 % 1.75 0 0 % 0 325 2.56 % 8.18 824 4.18 % 1.52 512 2.73 % 2.73 54 3.44 % 1.26 11 6.36 % 2.78 259 1.91 % 0.81 haha hah a 1,798 2.70 % 1.58 0 0 % 0 86 0.68 % 2.17 448 2.27 % 0.83 915 4.88 % 4.88 5 0.32 % 0.12 0 0 % 0 344 2.53 % 1.08 huh huh 1,382 2.08 % 1.22 0 0 % 0 23 0.18 % 0.58 97 0.49 % 0.18 1,180 6.29 % 6.30 10 0.64 % 0.23 0 0 % 0 72 0.53 % 0.23 alo alo 1,291 1.94 % 1.14 0 0 % 0 100 0.79 % 2.52 599 3.04 % 1.10 221 1.18 % 1.18 25 1.59 % 0.58 0 0 % 0 346 2.55 % 1.09 hvalabogu hva labogu 1,286 1.93 % 1.13 0 0 % 0 146 1.15 % 3.68 671 3.40 % 1.24 300 1.60 % 1.60 17 1.08 % 0.40 4 2.31 % 1.01 148 1.09 % 0.47 nasvidenje nas videnje 1,217 1.83 % 1.07 0 0 % 0 130 1.02 % 3.27 651 3.30 % 1.20 189 1.01 % 1.01 32 2.04 % 0.75 35 20.23 % 8.85 180 1.32 % 0.57 jebiga jeb iga 1,175 1.77 % 1.04 0 0 % 0 511 4.03 % 12.87 321 1.63 % 0.59 272 1.45 % 1.45 5 0.32 % 0.12 3 1.73 % 0.76 63 0.46 % 0.20 oho oho 1,160 1.75 % 1.02 0 0 % 0 135 1.06 % 3.40 421 2.13 % 0.78 217 1.16 % 1.16 27 1.72 % 0.63 1 0.58 % 0.25 359 2.64 % 1.13 hura hur a 1,087 1.64 % 0.96 0 0 % 0 121 0.95 % 3.05 531 2.69 % 0.98 289 1.54 % 1.54 38 2.42 % 0.89 1 0.58 % 0.25 107 0.79 % 0.34 jah jah 1,031 1.55 % 0.91 0 0 % 0 22 0.17 
% 0.55 650 3.30 % 1.20 235 1.25 % 1.25 8 0.51 % 0.19 1 0.58 % 0.25 115 0.85 % 0.36
ups ups 972 1.46 % 0.86 0 0 % 0 74 0.58 % 1.86 302 1.53 % 0.56 468 2.49 % 2.50 8 0.51 % 0.19 5 2.89 % 1.26 115 0.85 % 0.36
hopla hop la 946 1.42 % 0.83 0 0 % 0 42 0.33 % 1.06 338 1.71 % 0.62 535 2.85 % 2.85 5 0.32 % 0.12 3 1.73 % 0.76 23 0.17 % 0.07
ojoj ojo j 890 1.34 % 0.78 0 0 % 0 282 2.22 % 7.10 202 1.02 % 0.37 241 1.28 % 1.29 15 0.96 % 0.35 4 2.31 % 1.01 146 1.07 % 0.46
zaboga zab oga 831 1.25 % 0.73 0 0 % 0 455 3.59 % 11.46 168 0.85 % 0.31 131 0.70 % 0.70 35 2.23 % 0.82 0 0 % 0 42 0.31 % 0.13
fuj fuj 800 1.20 % 0.71 0 0 % 0 197 1.55 % 4.96 218 1.11 % 0.40 246 1.31 % 1.31 24 1.53 % 0.56 4 2.31 % 1.01 111 0.82 % 0.35
ejga ejg a 756 1.14 % 0.67 0 0 % 0 45 0.35 % 1.13 541 2.74 % 1 61 0.33 % 0.33 0 0 % 0 0 0 % 0 109 0.80 % 0.34
jebenti jeb enti 753 1.13 % 0.66 0 0 % 0 576 4.54 % 14.50 64 0.33 % 0.12 87 0.46 % 0.46 7 0.45 % 0.16 0 0 % 0 19 0.14 % 0.06
opa opa 673 1.01 % 0.59 0 0 % 0 70 0.55 % 1.76 243 1.23 % 0.45 256 1.36 % 1.37 25 1.59 % 0.58 0 0 % 0 79 0.58 % 0.25
jebemti jeb emti 643 0.97 % 0.57 0 0 % 0 498 3.93 % 12.54 52 0.26 % 0.10 54 0.29 % 0.29 1 0.06 % 0.02 0 0 % 0 38 0.28 % 0.12
mhm mhm 612 0.92 % 0.54 0 0 % 0 223 1.76 % 5.61 88 0.45 % 0.16 256 1.36 % 1.37 16 1.02 % 0.37 0 0 % 0 29 0.21 % 0.09
ojej oje j 603 0.91 % 0.53 0 0 % 0 291 2.29 % 7.33 107 0.54 % 0.20 129 0.69 % 0.69 19 1.21 % 0.44 0 0 % 0 57 0.42 % 0.18
paf paf 520 0.78 % 0.46 0 0 % 0 57 0.45 % 1.44 374 1.90 % 0.69 59 0.31 % 0.31 2 0.13 % 0.05 1 0.58 % 0.25 27 0.20 % 0.08
pst pst 436 0.66 % 0.38 0 0 % 0 66 0.52 % 1.66 102 0.52 % 0.19 63 0.34 % 0.34 53 3.38 % 1.23 4 2.31 % 1.01 148 1.09 % 0.47
čao čao 412 0.62 % 0.36 0 0 % 0 97 0.77 % 2.44 109 0.55 % 0.20 105 0.56 % 0.56 8 0.51 % 0.19 0 0 % 0 93 0.68 % 0.29
hov hov 359 0.54 % 0.32 0 0 % 0 25 0.20 % 0.63 159 0.81 % 0.29 82 0.44 % 0.44 22 1.40 % 0.51 1 0.58 % 0.25 70 0.52 % 0.22

1.2.214 List of initial character-level 4-grams from interjection lower-case word forms in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-word_parts-interjections-lowercase_forms-initial-4grams-taxonomy-entire.tsv
Lower-case word form | Initial part of the word | Rest of the word | Total absolute frequency of lower-case word form | Percentage of all found lower-case word forms | Total relative frequency (per million occurrences) | Absolute frequency [SSJ.T.K.N] | Percentage [SSJ.T.K.N] | Relative frequency [SSJ.T.K.N] | Absolute frequency [SSJ.T.K.L] | Percentage [SSJ.T.K.L] | Relative frequency [SSJ.T.K.L] | Absolute frequency [SSJ.T.P.C] | Percentage [SSJ.T.P.C] | Relative frequency [SSJ.T.P.C] | Absolute frequency [SSJ.T.P.R] | Percentage [SSJ.T.P.R] | Relative frequency [SSJ.T.P.R] | Absolute frequency [SSJ.T.K.S] | Percentage [SSJ.T.K.S] | Relative frequency [SSJ.T.K.S] | Absolute frequency [SSJ.T.D] | Percentage [SSJ.T.D] | Relative frequency [SSJ.T.D] | Absolute frequency [SSJ.I] | Percentage [SSJ.I] | Relative frequency [SSJ.I]
bravo brav o 3,100 11.17 % 2.73 0 0 % 0 255 3.78 % 6.42 1,225 12.92 % 2.26 853 11.96 % 4.55 37 5.54 % 0.86 4 5.20 % 1.01 726 19.86 % 2.28
zbogom zbog om 2,474 8.91 % 2.18 0 0 % 0 360 5.34 % 9.06 1,062 11.20 % 1.96 532 7.46 % 2.84 90 13.47 % 2.10 3 3.90 % 0.76 427 11.68 % 1.34
adijo adij o 2,415 8.70 % 2.13 0 0 % 0 416 6.17 % 10.47 961 10.14 % 1.77 609 8.54 % 3.25 60 8.98 % 1.40 1 1.30 % 0.25 368 10.07 % 1.16
haha haha 1,798 6.48 % 1.58 0 0 % 0 86 1.27 % 2.17 448 4.73 % 0.83 915 12.83 % 4.88 5 0.75 % 0.12 0 0 % 0 344 9.41 % 1.08
hvalabogu hval abogu 1,286 4.63 % 1.13 0 0 % 0 146 2.17 % 3.68 671 7.08 % 1.24 300 4.21 % 1.60 17 2.54 % 0.40 4 5.20 % 1.01 148 4.05 % 0.47
nasvidenje nasv idenje 1,217 4.38 % 1.07 0 0 % 0 130 1.93 % 3.27 651 6.87 % 1.20 189 2.65 % 1.01 32 4.79 % 0.75 35 45.45 % 8.85 180 4.92 % 0.57
jebiga jebi ga 1,175 4.23 % 1.04 0 0 % 0 511 7.58 % 12.87 321 3.39 % 0.59 272 3.81 % 1.45 5 0.75 % 0.12 3 3.90 % 0.76 63 1.72 % 0.20
hura hura 1,087 3.92 %
0.96 0 0 % 0 121 1.79 % 3.05 531 5.60 % 0.98 289 4.05 % 1.54 38 5.69 % 0.89 1 1.30 % 0.25 107 2.93 % 0.34
hopla hopl a 946 3.41 % 0.83 0 0 % 0 42 0.62 % 1.06 338 3.57 % 0.62 535 7.50 % 2.85 5 0.75 % 0.12 3 3.90 % 0.76 23 0.63 % 0.07
ojoj ojoj 890 3.21 % 0.78 0 0 % 0 282 4.18 % 7.10 202 2.13 % 0.37 241 3.38 % 1.29 15 2.25 % 0.35 4 5.20 % 1.01 146 4.00 % 0.46
zaboga zabo ga 831 2.99 % 0.73 0 0 % 0 455 6.75 % 11.46 168 1.77 % 0.31 131 1.84 % 0.70 35 5.24 % 0.82 0 0 % 0 42 1.15 % 0.13
ejga ejga 756 2.72 % 0.67 0 0 % 0 45 0.67 % 1.13 541 5.71 % 1 61 0.85 % 0.33 0 0 % 0 0 0 % 0 109 2.98 % 0.34
jebenti jebe nti 753 2.71 % 0.66 0 0 % 0 576 8.54 % 14.50 64 0.68 % 0.12 87 1.22 % 0.46 7 1.05 % 0.16 0 0 % 0 19 0.52 % 0.06
jebemti jebe mti 643 2.32 % 0.57 0 0 % 0 498 7.38 % 12.54 52 0.55 % 0.10 54 0.76 % 0.29 1 0.15 % 0.02 0 0 % 0 38 1.04 % 0.12
ojej ojej 603 2.17 % 0.53 0 0 % 0 291 4.32 % 7.33 107 1.13 % 0.20 129 1.81 % 0.69 19 2.84 % 0.44 0 0 % 0 57 1.56 % 0.18
juhej juhe j 325 1.17 % 0.29 0 0 % 0 17 0.25 % 0.43 127 1.34 % 0.23 126 1.77 % 0.67 6 0.90 % 0.14 0 0 % 0 49 1.34 % 0.15
živijo živi jo 255 0.92 % 0.22 0 0 % 0 134 1.99 % 3.37 43 0.45 % 0.08 47 0.66 % 0.25 12 1.80 % 0.28 1 1.30 % 0.25 18 0.49 % 0.06
juhuhu juhu hu 246 0.89 % 0.22 0 0 % 0 33 0.49 % 0.83 103 1.09 % 0.19 79 1.11 % 0.42 5 0.75 % 0.12 3 3.90 % 0.76 23 0.63 % 0.07
živio živi o 207 0.75 % 0.18 0 0 % 0 76 1.13 % 1.91 81 0.85 % 0.15 7 0.10 % 0.04 18 2.69 % 0.42 2 2.60 % 0.51 23 0.63 % 0.07
hojla hojl a 194 0.70 % 0.17 0 0 % 0 39 0.58 % 0.98 107 1.13 % 0.20 46 0.65 % 0.25 1 0.15 % 0.02 0 0 % 0 1 0.03 % 0
živjo živj o 191 0.69 % 0.17 0 0 % 0 126 1.87 % 3.17 20 0.21 % 0.04 28 0.39 % 0.15 2 0.30 % 0.05 0 0 % 0 15 0.41 % 0.05
hahaha haha ha 190 0.69 % 0.17 0 0 % 0 30 0.45 % 0.76 34 0.36 % 0.06 67 0.94 % 0.36 2 0.30 % 0.05 0 0 % 0 57 1.56 % 0.18
žalibog žali bog 186 0.67 % 0.16 0 0 % 0 13 0.19 % 0.33 78 0.82 % 0.14 35 0.49 % 0.19 13 1.95 % 0.30 1 1.30 % 0.25 46 1.26 % 0.14
ježeš ježe š 155 0.56 % 0.14 0 0 % 0 41 0.61
% 1.03 34 0.36 % 0.06 75 1.05 % 0.40 0 0 % 0 0 0 % 0 5 0.14 % 0.02
aaaa aaaa 148 0.53 % 0.13 0 0 % 0 49 0.73 % 1.23 22 0.23 % 0.04 58 0.81 % 0.31 0 0 % 0 0 0 % 0 19 0.52 % 0.06
tralala tral ala 127 0.46 % 0.11 0 0 % 0 17 0.25 % 0.43 57 0.60 % 0.11 35 0.49 % 0.19 1 0.15 % 0.02 1 1.30 % 0.25 16 0.44 % 0.05
heja heja 98 0.35 % 0.09 0 0 % 0 7 0.10 % 0.18 30 0.32 % 0.06 22 0.31 % 0.12 3 0.45 % 0.07 0 0 % 0 36 0.98 % 0.11
bumf bumf 80 0.29 % 0.07 0 0 % 0 33 0.49 % 0.83 23 0.24 % 0.04 18 0.25 % 0.10 3 0.45 % 0.07 0 0 % 0 3 0.08 % 0.01
hrsk hrsk 73 0.26 % 0.06 0 0 % 0 33 0.49 % 0.83 21 0.22 % 0.04 13 0.18 % 0.07 1 0.15 % 0.02 0 0 % 0 5 0.14 % 0.02
jojmene jojm ene 72 0.26 % 0.06 0 0 % 0 34 0.50 % 0.86 13 0.14 % 0.02 17 0.24 % 0.09 5 0.75 % 0.12 0 0 % 0 3 0.08 % 0.01
hozana hoza na 66 0.24 % 0.06 0 0 % 0 21 0.31 % 0.53 24 0.25 % 0.04 10 0.14 % 0.05 3 0.45 % 0.07 1 1.30 % 0.25 7 0.19 % 0.02
aaah aaah 63 0.23 % 0.06 0 0 % 0 17 0.25 % 0.43 6 0.06 % 0.01 32 0.45 % 0.17 2 0.30 % 0.05 0 0 % 0 6 0.16 % 0.02

1.2.215 List of initial character-level 5-grams from interjection lower-case word forms in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-word_parts-interjections-lowercase_forms-initial-5grams-taxonomy-entire.tsv
Lower-case word form | Initial part of the word | Rest of the word | Total absolute frequency of lower-case word form | Percentage of all found lower-case word forms | Total relative frequency (per million occurrences) | Absolute frequency [SSJ.T.K.N] | Percentage [SSJ.T.K.N] | Relative frequency [SSJ.T.K.N] | Absolute frequency [SSJ.T.K.L] | Percentage [SSJ.T.K.L] | Relative frequency [SSJ.T.K.L] | Absolute frequency [SSJ.T.P.C] | Percentage [SSJ.T.P.C] | Relative frequency [SSJ.T.P.C] | Absolute frequency [SSJ.T.P.R] | Percentage [SSJ.T.P.R] | Relative frequency [SSJ.T.P.R] | Absolute frequency [SSJ.T.K.S] | Percentage [SSJ.T.K.S] | Relative frequency [SSJ.T.K.S] | Absolute frequency [SSJ.T.D] | Percentage [SSJ.T.D] | Relative frequency [SSJ.T.D] | Absolute frequency [SSJ.I] | Percentage [SSJ.I] | Relative frequency [SSJ.I]
bravo bravo 3,100 14.71 % 2.73 0 0 % 0 255 4.75 % 6.42 1,225 16.80 % 2.26 853 16.77 % 4.55 37 6.87 % 0.86 4 5.97 % 1.01 726 26.62 % 2.28
zbogom zbogo m 2,474 11.74 % 2.18 0 0 % 0 360 6.71 % 9.06 1,062 14.57 % 1.96 532 10.46 % 2.84 90 16.70 % 2.10 3 4.48 % 0.76 427 15.66 % 1.34
adijo adijo 2,415 11.46 % 2.13 0 0 % 0 416 7.75 % 10.47 961 13.18 % 1.77 609 11.97 % 3.25 60 11.13 % 1.40 1 1.49 % 0.25 368 13.49 % 1.16
hvalabogu hvala bogu 1,286 6.10 % 1.13 0 0 % 0 146 2.72 % 3.68 671 9.20 % 1.24 300 5.90 % 1.60 17 3.15 % 0.40 4 5.97 % 1.01 148 5.43 % 0.47
nasvidenje nasvi denje 1,217 5.77 % 1.07 0 0 % 0 130 2.42 % 3.27 651 8.93 % 1.20 189 3.71 % 1.01 32 5.94 % 0.75 35 52.24 % 8.85 180 6.60 % 0.57
jebiga jebig a 1,175 5.57 % 1.04 0 0 % 0 511 9.52 % 12.87 321 4.40 % 0.59 272 5.35 % 1.45 5 0.93 % 0.12 3 4.48 % 0.76 63 2.31 % 0.20
hopla hopla 946 4.49 % 0.83 0 0 % 0 42 0.78 % 1.06 338 4.64 % 0.62 535 10.52 % 2.85 5 0.93 % 0.12 3 4.48 % 0.76 23 0.84 % 0.07
zaboga zabog a 831 3.94 % 0.73 0 0 % 0 455 8.48 % 11.46 168 2.30 % 0.31 131 2.58 % 0.70 35 6.49 % 0.82 0 0 % 0 42 1.54 % 0.13
jebenti jeben ti 753 3.57 % 0.66 0 0 % 0 576 10.73 % 14.50 64 0.88 % 0.12 87 1.71 % 0.46 7 1.30 % 0.16 0 0 % 0 19 0.70 % 0.06
jebemti jebem ti 643 3.05 % 0.57 0 0 % 0 498 9.28 % 12.54 52 0.71 % 0.10 54 1.06 % 0.29 1 0.19 % 0.02 0 0 % 0 38 1.39 % 0.12
juhej juhej 325 1.54 % 0.29 0 0 % 0 17 0.32 % 0.43 127 1.74 % 0.23 126 2.48 % 0.67 6 1.11 % 0.14 0 0 % 0 49 1.80 % 0.15
živijo živij o 255 1.21 % 0.22 0 0 % 0 134 2.50 % 3.37 43 0.59 % 0.08 47 0.92 % 0.25 12 2.23 % 0.28 1 1.49 % 0.25 18 0.66 % 0.06
juhuhu juhuh u 246 1.17 % 0.22 0 0 % 0 33 0.61 % 0.83 103 1.41 % 0.19 79 1.55 % 0.42 5 0.93 % 0.12 3 4.48 % 0.76 23 0.84 % 0.07
živio živio 207 0.98 % 0.18 0 0 % 0 76 1.42 % 1.91 81 1.11 % 0.15 7 0.14 % 0.04 18 3.34 % 0.42 2 2.98 % 0.51 23 0.84 % 0.07
hojla hojla 194 0.92 % 0.17 0 0 % 0 39
0.73 % 0.98 107 1.47 % 0.20 46 0.90 % 0.25 1 0.19 % 0.02 0 0 % 0 1 0.04 % 0
živjo živjo 191 0.91 % 0.17 0 0 % 0 126 2.35 % 3.17 20 0.27 % 0.04 28 0.55 % 0.15 2 0.37 % 0.05 0 0 % 0 15 0.55 % 0.05
hahaha hahah a 190 0.90 % 0.17 0 0 % 0 30 0.56 % 0.76 34 0.47 % 0.06 67 1.32 % 0.36 2 0.37 % 0.05 0 0 % 0 57 2.09 % 0.18
žalibog žalib og 186 0.88 % 0.16 0 0 % 0 13 0.24 % 0.33 78 1.07 % 0.14 35 0.69 % 0.19 13 2.41 % 0.30 1 1.49 % 0.25 46 1.69 % 0.14
ježeš ježeš 155 0.73 % 0.14 0 0 % 0 41 0.76 % 1.03 34 0.47 % 0.06 75 1.47 % 0.40 0 0 % 0 0 0 % 0 5 0.18 % 0.02
tralala trala la 127 0.60 % 0.11 0 0 % 0 17 0.32 % 0.43 57 0.78 % 0.11 35 0.69 % 0.19 1 0.19 % 0.02 1 1.49 % 0.25 16 0.59 % 0.05
jojmene jojme ne 72 0.34 % 0.06 0 0 % 0 34 0.63 % 0.86 13 0.18 % 0.02 17 0.33 % 0.09 5 0.93 % 0.12 0 0 % 0 3 0.11 % 0.01
hozana hozan a 66 0.31 % 0.06 0 0 % 0 21 0.39 % 0.53 24 0.33 % 0.04 10 0.20 % 0.05 3 0.56 % 0.07 1 1.49 % 0.25 7 0.26 % 0.02
mijav mijav 56 0.27 % 0.05 0 0 % 0 11 0.20 % 0.28 17 0.23 % 0.03 19 0.37 % 0.10 5 0.93 % 0.12 0 0 % 0 4 0.15 % 0.01
joooj joooj 53 0.25 % 0.05 0 0 % 0 1 0.02 % 0.03 24 0.33 % 0.04 22 0.43 % 0.12 0 0 % 0 0 0 % 0 6 0.22 % 0.02
zdravo zdrav o 52 0.25 % 0.05 0 0 % 0 12 0.22 % 0.30 24 0.33 % 0.04 9 0.18 % 0.05 3 0.56 % 0.07 0 0 % 0 4 0.15 % 0.01
jojme jojme 49 0.23 % 0.04 0 0 % 0 16 0.30 % 0.40 10 0.14 % 0.02 15 0.29 % 0.08 6 1.11 % 0.14 1 1.49 % 0.25 1 0.04 % 0
aaaaa aaaaa 45 0.21 % 0.04 0 0 % 0 12 0.22 % 0.30 8 0.11 % 0.01 17 0.33 % 0.09 2 0.37 % 0.05 0 0 % 0 6 0.22 % 0.02
mejduš mejdu š 42 0.20 % 0.04 0 0 % 0 14 0.26 % 0.35 10 0.14 % 0.02 10 0.20 % 0.05 2 0.37 % 0.05 0 0 % 0 6 0.22 % 0.02
ježešna ježeš na 32 0.15 % 0.03 0 0 % 0 13 0.24 % 0.33 13 0.18 % 0.02 4 0.08 % 0.02 1 0.19 % 0.02 0 0 % 0 1 0.04 % 0
jaaaa jaaaa 29 0.14 % 0.03 0 0 % 0 12 0.22 % 0.30 5 0.07 % 0.01 5 0.10 % 0.03 0 0 % 0 0 0 % 0 7 0.26 % 0.02
aaaah aaaah 28 0.13 % 0.02 0 0 % 0 8 0.15 % 0.20 4 0.06 % 0.01 14 0.28 % 0.07 0 0 % 0 0 0 % 0 2 0.07 % 0.01
prejoj prejo j 28 0.13 % 0.02 0 0 % 0
5 0.09 % 0.13 6 0.08 % 0.01 12 0.24 % 0.06 2 0.37 % 0.05 0 0 % 0 3 0.11 % 0.01

1.2.216 List of final character-level 1-grams from interjection lower-case word forms in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-word_parts-interjections-lowercase_forms-final-1grams-taxonomy-entire.tsv
Lower-case word form | Rest of the word | Final part of the word | Total absolute frequency of lower-case word form | Percentage of all found lower-case word forms | Total relative frequency (per million occurrences) | Absolute frequency [SSJ.T.K.N] | Percentage [SSJ.T.K.N] | Relative frequency [SSJ.T.K.N] | Absolute frequency [SSJ.T.K.L] | Percentage [SSJ.T.K.L] | Relative frequency [SSJ.T.K.L] | Absolute frequency [SSJ.T.P.C] | Percentage [SSJ.T.P.C] | Relative frequency [SSJ.T.P.C] | Absolute frequency [SSJ.T.P.R] | Percentage [SSJ.T.P.R] | Relative frequency [SSJ.T.P.R] | Absolute frequency [SSJ.T.K.S] | Percentage [SSJ.T.K.S] | Relative frequency [SSJ.T.K.S] | Absolute frequency [SSJ.T.D] | Percentage [SSJ.T.D] | Relative frequency [SSJ.T.D] | Absolute frequency [SSJ.I] | Percentage [SSJ.I] | Relative frequency [SSJ.I]
oh o h 14,865 10.91 % 13.10 0 0 % 0 5,344 19.95 % 134.56 3,245 7.48 % 5.98 3,990 9.80 % 21.29 760 18.32 % 17.70 25 7.14 % 6.32 1,501 7.18 % 4.72
ah a h 13,301 9.76 % 11.72 0 0 % 0 2,696 10.06 % 67.88 4,963 11.44 % 9.14 3,766 9.25 % 20.09 445 10.73 % 10.36 30 8.57 % 7.58 1,401 6.71 % 4.41
ha h a 11,172 8.20 % 9.85 0 0 % 0 1,638 6.11 % 41.24 4,524 10.43 % 8.34 3,900 9.58 % 20.81 320 7.71 % 7.45 33 9.43 % 8.34 757 3.62 % 2.38
hm h m 8,313 6.10 % 7.33 0 0 % 0 1,142 4.26 % 28.75 3,096 7.13 % 5.70 3,089 7.59 % 16.48 169 4.07 % 3.94 12 3.43 % 3.03 805 3.85 % 2.53
hej he j 7,497 5.50 % 6.61 0 0 % 0 1,515 5.66 % 38.15 1,767 4.07 % 3.26 3,368 8.27 % 17.97 210 5.06 % 4.89 4 1.14 % 1.01 633 3.03 % 1.99
joj jo j 6,441 4.73 % 5.68 0 0 % 0 1,074 4.01 % 27.04 2,157 4.97 % 3.97 2,162 5.31 % 11.54 138 3.33 % 3.21 33 9.43
% 8.34 877 4.20 % 2.76
aha ah a 6,335 4.65 % 5.58 0 0 % 0 1,184 4.42 % 29.81 1,301 3.00 % 2.40 1,538 3.78 % 8.21 160 3.86 % 3.73 25 7.14 % 6.32 2,127 10.18 % 6.69
fak fa k 4,669 3.43 % 4.11 0 0 % 0 160 0.60 % 4.03 190 0.44 % 0.35 98 0.24 % 0.52 2 0.05 % 0.05 0 0 % 0 4,219 20.20 % 13.27
ej e j 3,721 2.73 % 3.28 0 0 % 0 1,232 4.60 % 31.02 1,049 2.42 % 1.93 830 2.04 % 4.43 241 5.81 % 5.61 7 2.00 % 1.77 362 1.73 % 1.14
hi h i 3,425 2.51 % 3.02 0 0 % 0 169 0.63 % 4.26 1,246 2.87 % 2.30 1,521 3.74 % 8.12 111 2.68 % 2.59 22 6.29 % 5.56 356 1.70 % 1.12
bravo brav o 3,100 2.27 % 2.73 0 0 % 0 255 0.95 % 6.42 1,225 2.82 % 2.26 853 2.10 % 4.55 37 0.89 % 0.86 4 1.14 % 1.01 726 3.48 % 2.28
uf u f 2,808 2.06 % 2.47 0 0 % 0 254 0.95 % 6.40 834 1.92 % 1.54 1,109 2.72 % 5.92 25 0.60 % 0.58 0 0 % 0 586 2.81 % 1.84
zbogom zbogo m 2,474 1.81 % 2.18 0 0 % 0 360 1.34 % 9.06 1,062 2.45 % 1.96 532 1.31 % 2.84 90 2.17 % 2.10 3 0.86 % 0.76 427 2.04 % 1.34
uh u h 2,435 1.79 % 2.15 0 0 % 0 336 1.25 % 8.46 830 1.91 % 1.53 983 2.41 % 5.24 48 1.16 % 1.12 2 0.57 % 0.51 236 1.13 % 0.74
adijo adij o 2,415 1.77 % 2.13 0 0 % 0 416 1.55 % 10.47 961 2.21 % 1.77 609 1.50 % 3.25 60 1.45 % 1.40 1 0.29 % 0.25 368 1.76 % 1.16
eh e h 2,409 1.77 % 2.12 0 0 % 0 399 1.49 % 10.05 655 1.51 % 1.21 1,114 2.74 % 5.94 47 1.13 % 1.09 1 0.29 % 0.25 193 0.92 % 0.61
ho h o 2,160 1.58 % 1.90 0 0 % 0 220 0.82 % 5.54 784 1.81 % 1.44 504 1.24 % 2.69 133 3.21 % 3.10 18 5.14 % 4.55 501 2.40 % 1.58
bla bl a 1,985 1.46 % 1.75 0 0 % 0 325 1.21 % 8.18 824 1.90 % 1.52 512 1.26 % 2.73 54 1.30 % 1.26 11 3.14 % 2.78 259 1.24 % 0.81
la l a 1,888 1.39 % 1.66 0 0 % 0 90 0.34 % 2.27 919 2.12 % 1.69 450 1.10 % 2.40 72 1.74 % 1.68 3 0.86 % 0.76 354 1.70 % 1.11
haha hah a 1,798 1.32 % 1.58 0 0 % 0 86 0.32 % 2.17 448 1.03 % 0.83 915 2.25 % 4.88 5 0.12 % 0.12 0 0 % 0 344 1.65 % 1.08
oj o j 1,484 1.09 % 1.31 0 0 % 0 218 0.81 % 5.49 651 1.50 % 1.20 318 0.78 % 1.70 149 3.59 % 3.47 9 2.57 % 2.28 139 0.67 % 0.44
huh hu h 1,382 1.01 % 1.22 0 0 % 0 23
0.09 % 0.58 97 0.22 % 0.18 1,180 2.90 % 6.30 10 0.24 % 0.23 0 0 % 0 72 0.34 % 0.23
alo al o 1,291 0.95 % 1.14 0 0 % 0 100 0.37 % 2.52 599 1.38 % 1.10 221 0.54 % 1.18 25 0.60 % 0.58 0 0 % 0 346 1.66 % 1.09
hvalabogu hvalabog u 1,286 0.94 % 1.13 0 0 % 0 146 0.55 % 3.68 671 1.55 % 1.24 300 0.74 % 1.60 17 0.41 % 0.40 4 1.14 % 1.01 148 0.71 % 0.47
nasvidenje nasvidenj e 1,217 0.89 % 1.07 0 0 % 0 130 0.48 % 3.27 651 1.50 % 1.20 189 0.46 % 1.01 32 0.77 % 0.75 35 10.00 % 8.85 180 0.86 % 0.57
jebiga jebig a 1,175 0.86 % 1.04 0 0 % 0 511 1.91 % 12.87 321 0.74 % 0.59 272 0.67 % 1.45 5 0.12 % 0.12 3 0.86 % 0.76 63 0.30 % 0.20
oho oh o 1,160 0.85 % 1.02 0 0 % 0 135 0.50 % 3.40 421 0.97 % 0.78 217 0.53 % 1.16 27 0.65 % 0.63 1 0.29 % 0.25 359 1.72 % 1.13
hura hur a 1,087 0.80 % 0.96 0 0 % 0 121 0.45 % 3.05 531 1.22 % 0.98 289 0.71 % 1.54 38 0.92 % 0.89 1 0.29 % 0.25 107 0.51 % 0.34
jah ja h 1,031 0.76 % 0.91 0 0 % 0 22 0.08 % 0.55 650 1.50 % 1.20 235 0.58 % 1.25 8 0.19 % 0.19 1 0.29 % 0.25 115 0.55 % 0.36
ups up s 972 0.71 % 0.86 0 0 % 0 74 0.28 % 1.86 302 0.70 % 0.56 468 1.15 % 2.50 8 0.19 % 0.19 5 1.43 % 1.26 115 0.55 % 0.36
hopla hopl a 946 0.69 % 0.83 0 0 % 0 42 0.16 % 1.06 338 0.78 % 0.62 535 1.31 % 2.85 5 0.12 % 0.12 3 0.86 % 0.76 23 0.11 % 0.07
av a v 944 0.69 % 0.83 0 0 % 0 78 0.29 % 1.96 588 1.35 % 1.08 200 0.49 % 1.07 28 0.68 % 0.65 10 2.86 % 2.53 40 0.19 % 0.13

1.2.217 List of final character-level 2-grams from interjection lower-case word forms in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-word_parts-interjections-lowercase_forms-final-2grams-taxonomy-entire.tsv
Lower-case word form | Rest of the word | Final part of the word | Total absolute frequency of lower-case word form | Percentage of all found lower-case word forms | Total relative frequency (per million occurrences) | Absolute frequency [SSJ.T.K.N] | Percentage [SSJ.T.K.N] | Relative frequency [SSJ.T.K.N] | Absolute frequency [SSJ.T.K.L] | Percentage [SSJ.T.K.L] | Relative frequency [SSJ.T.K.L] | Absolute frequency [SSJ.T.P.C] | Percentage [SSJ.T.P.C] | Relative frequency [SSJ.T.P.C] | Absolute frequency [SSJ.T.P.R] | Percentage [SSJ.T.P.R] | Relative frequency [SSJ.T.P.R] | Absolute frequency [SSJ.T.K.S] | Percentage [SSJ.T.K.S] | Relative frequency [SSJ.T.K.S] | Absolute frequency [SSJ.T.D] | Percentage [SSJ.T.D] | Relative frequency [SSJ.T.D] | Absolute frequency [SSJ.I] | Percentage [SSJ.I] | Relative frequency [SSJ.I]
oh oh 14,865 10.91 % 13.10 0 0 % 0 5,344 19.95 % 134.56 3,245 7.48 % 5.98 3,990 9.80 % 21.29 760 18.32 % 17.70 25 7.14 % 6.32 1,501 7.18 % 4.72
ah ah 13,301 9.76 % 11.72 0 0 % 0 2,696 10.06 % 67.88 4,963 11.44 % 9.14 3,766 9.25 % 20.09 445 10.73 % 10.36 30 8.57 % 7.58 1,401 6.71 % 4.41
ha ha 11,172 8.20 % 9.85 0 0 % 0 1,638 6.11 % 41.24 4,524 10.43 % 8.34 3,900 9.58 % 20.81 320 7.71 % 7.45 33 9.43 % 8.34 757 3.62 % 2.38
hm hm 8,313 6.10 % 7.33 0 0 % 0 1,142 4.26 % 28.75 3,096 7.13 % 5.70 3,089 7.59 % 16.48 169 4.07 % 3.94 12 3.43 % 3.03 805 3.85 % 2.53
hej h ej 7,497 5.50 % 6.61 0 0 % 0 1,515 5.66 % 38.15 1,767 4.07 % 3.26 3,368 8.27 % 17.97 210 5.06 % 4.89 4 1.14 % 1.01 633 3.03 % 1.99
joj j oj 6,441 4.73 % 5.68 0 0 % 0 1,074 4.01 % 27.04 2,157 4.97 % 3.97 2,162 5.31 % 11.54 138 3.33 % 3.21 33 9.43 % 8.34 877 4.20 % 2.76
aha a ha 6,335 4.65 % 5.58 0 0 % 0 1,184 4.42 % 29.81 1,301 3.00 % 2.40 1,538 3.78 % 8.21 160 3.86 % 3.73 25 7.14 % 6.32 2,127 10.18 % 6.69
fak f ak 4,669 3.43 % 4.11 0 0 % 0 160 0.60 % 4.03 190 0.44 % 0.35 98 0.24 % 0.52 2 0.05 % 0.05 0 0 % 0 4,219 20.20 % 13.27
ej ej 3,721 2.73 % 3.28 0 0 % 0 1,232 4.60 % 31.02 1,049 2.42 % 1.93 830 2.04 % 4.43 241 5.81 % 5.61 7 2.00 % 1.77 362 1.73 % 1.14
hi hi 3,425 2.51 % 3.02 0 0 % 0 169 0.63 % 4.26 1,246 2.87 % 2.30 1,521 3.74 % 8.12 111 2.68 % 2.59 22 6.29 % 5.56 356 1.70 % 1.12
bravo bra vo 3,100 2.27 % 2.73 0 0 % 0 255 0.95 % 6.42 1,225 2.82 % 2.26 853 2.10 % 4.55 37 0.89 % 0.86 4 1.14 % 1.01 726 3.48 % 2.28
uf uf 2,808
2.06 % 2.47 0 0 % 0 254 0.95 % 6.40 834 1.92 % 1.54 1,109 2.72 % 5.92 25 0.60 % 0.58 0 0 % 0 586 2.81 % 1.84
zbogom zbog om 2,474 1.81 % 2.18 0 0 % 0 360 1.34 % 9.06 1,062 2.45 % 1.96 532 1.31 % 2.84 90 2.17 % 2.10 3 0.86 % 0.76 427 2.04 % 1.34
uh uh 2,435 1.79 % 2.15 0 0 % 0 336 1.25 % 8.46 830 1.91 % 1.53 983 2.41 % 5.24 48 1.16 % 1.12 2 0.57 % 0.51 236 1.13 % 0.74
adijo adi jo 2,415 1.77 % 2.13 0 0 % 0 416 1.55 % 10.47 961 2.21 % 1.77 609 1.50 % 3.25 60 1.45 % 1.40 1 0.29 % 0.25 368 1.76 % 1.16
eh eh 2,409 1.77 % 2.12 0 0 % 0 399 1.49 % 10.05 655 1.51 % 1.21 1,114 2.74 % 5.94 47 1.13 % 1.09 1 0.29 % 0.25 193 0.92 % 0.61
ho ho 2,160 1.58 % 1.90 0 0 % 0 220 0.82 % 5.54 784 1.81 % 1.44 504 1.24 % 2.69 133 3.21 % 3.10 18 5.14 % 4.55 501 2.40 % 1.58
bla b la 1,985 1.46 % 1.75 0 0 % 0 325 1.21 % 8.18 824 1.90 % 1.52 512 1.26 % 2.73 54 1.30 % 1.26 11 3.14 % 2.78 259 1.24 % 0.81
la la 1,888 1.39 % 1.66 0 0 % 0 90 0.34 % 2.27 919 2.12 % 1.69 450 1.10 % 2.40 72 1.74 % 1.68 3 0.86 % 0.76 354 1.70 % 1.11
haha ha ha 1,798 1.32 % 1.58 0 0 % 0 86 0.32 % 2.17 448 1.03 % 0.83 915 2.25 % 4.88 5 0.12 % 0.12 0 0 % 0 344 1.65 % 1.08
oj oj 1,484 1.09 % 1.31 0 0 % 0 218 0.81 % 5.49 651 1.50 % 1.20 318 0.78 % 1.70 149 3.59 % 3.47 9 2.57 % 2.28 139 0.67 % 0.44
huh h uh 1,382 1.01 % 1.22 0 0 % 0 23 0.09 % 0.58 97 0.22 % 0.18 1,180 2.90 % 6.30 10 0.24 % 0.23 0 0 % 0 72 0.34 % 0.23
alo a lo 1,291 0.95 % 1.14 0 0 % 0 100 0.37 % 2.52 599 1.38 % 1.10 221 0.54 % 1.18 25 0.60 % 0.58 0 0 % 0 346 1.66 % 1.09
hvalabogu hvalabo gu 1,286 0.94 % 1.13 0 0 % 0 146 0.55 % 3.68 671 1.55 % 1.24 300 0.74 % 1.60 17 0.41 % 0.40 4 1.14 % 1.01 148 0.71 % 0.47
nasvidenje nasviden je 1,217 0.89 % 1.07 0 0 % 0 130 0.48 % 3.27 651 1.50 % 1.20 189 0.46 % 1.01 32 0.77 % 0.75 35 10.00 % 8.85 180 0.86 % 0.57
jebiga jebi ga 1,175 0.86 % 1.04 0 0 % 0 511 1.91 % 12.87 321 0.74 % 0.59 272 0.67 % 1.45 5 0.12 % 0.12 3 0.86 % 0.76 63 0.30 % 0.20
oho o ho 1,160 0.85 % 1.02 0 0 % 0 135 0.50 % 3.40 421 0.97 % 0.78 217 0.53 %
1.16 27 0.65 % 0.63 1 0.29 % 0.25 359 1.72 % 1.13
hura hu ra 1,087 0.80 % 0.96 0 0 % 0 121 0.45 % 3.05 531 1.22 % 0.98 289 0.71 % 1.54 38 0.92 % 0.89 1 0.29 % 0.25 107 0.51 % 0.34
jah j ah 1,031 0.76 % 0.91 0 0 % 0 22 0.08 % 0.55 650 1.50 % 1.20 235 0.58 % 1.25 8 0.19 % 0.19 1 0.29 % 0.25 115 0.55 % 0.36
ups u ps 972 0.71 % 0.86 0 0 % 0 74 0.28 % 1.86 302 0.70 % 0.56 468 1.15 % 2.50 8 0.19 % 0.19 5 1.43 % 1.26 115 0.55 % 0.36
hopla hop la 946 0.69 % 0.83 0 0 % 0 42 0.16 % 1.06 338 0.78 % 0.62 535 1.31 % 2.85 5 0.12 % 0.12 3 0.86 % 0.76 23 0.11 % 0.07
av av 944 0.69 % 0.83 0 0 % 0 78 0.29 % 1.96 588 1.35 % 1.08 200 0.49 % 1.07 28 0.68 % 0.65 10 2.86 % 2.53 40 0.19 % 0.13

1.2.218 List of final character-level 3-grams from interjection lower-case word forms in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-word_parts-interjections-lowercase_forms-final-3grams-taxonomy-entire.tsv
Lower-case word form | Rest of the word | Final part of the word | Total absolute frequency of lower-case word form | Percentage of all found lower-case word forms | Total relative frequency (per million occurrences) | Absolute frequency [SSJ.T.K.N] | Percentage [SSJ.T.K.N] | Relative frequency [SSJ.T.K.N] | Absolute frequency [SSJ.T.K.L] | Percentage [SSJ.T.K.L] | Relative frequency [SSJ.T.K.L] | Absolute frequency [SSJ.T.P.C] | Percentage [SSJ.T.P.C] | Relative frequency [SSJ.T.P.C] | Absolute frequency [SSJ.T.P.R] | Percentage [SSJ.T.P.R] | Relative frequency [SSJ.T.P.R] | Absolute frequency [SSJ.T.K.S] | Percentage [SSJ.T.K.S] | Relative frequency [SSJ.T.K.S] | Absolute frequency [SSJ.T.D] | Percentage [SSJ.T.D] | Relative frequency [SSJ.T.D] | Absolute frequency [SSJ.I] | Percentage [SSJ.I] | Relative frequency [SSJ.I]
hej hej 7,497 11.28 % 6.61 0 0 % 0 1,515 11.94 % 38.15 1,767 8.96 % 3.26 3,368 17.95 % 17.97 210 13.38 % 4.89 4 2.31 % 1.01 633 4.66 % 1.99
joj joj 6,441 9.69 % 5.68 0 0 % 0 1,074 8.47 % 27.04 2,157 10.94 % 3.97
2,162 11.52 % 11.54 138 8.79 % 3.21 33 19.07 % 8.34 877 6.45 % 2.76
aha aha 6,335 9.53 % 5.58 0 0 % 0 1,184 9.34 % 29.81 1,301 6.60 % 2.40 1,538 8.20 % 8.21 160 10.20 % 3.73 25 14.45 % 6.32 2,127 15.65 % 6.69
fak fak 4,669 7.02 % 4.11 0 0 % 0 160 1.26 % 4.03 190 0.96 % 0.35 98 0.52 % 0.52 2 0.13 % 0.05 0 0 % 0 4,219 31.04 % 13.27
bravo br avo 3,100 4.66 % 2.73 0 0 % 0 255 2.01 % 6.42 1,225 6.21 % 2.26 853 4.55 % 4.55 37 2.36 % 0.86 4 2.31 % 1.01 726 5.34 % 2.28
zbogom zbo gom 2,474 3.72 % 2.18 0 0 % 0 360 2.84 % 9.06 1,062 5.39 % 1.96 532 2.84 % 2.84 90 5.74 % 2.10 3 1.73 % 0.76 427 3.14 % 1.34
adijo ad ijo 2,415 3.63 % 2.13 0 0 % 0 416 3.28 % 10.47 961 4.87 % 1.77 609 3.25 % 3.25 60 3.82 % 1.40 1 0.58 % 0.25 368 2.71 % 1.16
bla bla 1,985 2.98 % 1.75 0 0 % 0 325 2.56 % 8.18 824 4.18 % 1.52 512 2.73 % 2.73 54 3.44 % 1.26 11 6.36 % 2.78 259 1.91 % 0.81
haha h aha 1,798 2.70 % 1.58 0 0 % 0 86 0.68 % 2.17 448 2.27 % 0.83 915 4.88 % 4.88 5 0.32 % 0.12 0 0 % 0 344 2.53 % 1.08
huh huh 1,382 2.08 % 1.22 0 0 % 0 23 0.18 % 0.58 97 0.49 % 0.18 1,180 6.29 % 6.30 10 0.64 % 0.23 0 0 % 0 72 0.53 % 0.23
alo alo 1,291 1.94 % 1.14 0 0 % 0 100 0.79 % 2.52 599 3.04 % 1.10 221 1.18 % 1.18 25 1.59 % 0.58 0 0 % 0 346 2.55 % 1.09
hvalabogu hvalab ogu 1,286 1.93 % 1.13 0 0 % 0 146 1.15 % 3.68 671 3.40 % 1.24 300 1.60 % 1.60 17 1.08 % 0.40 4 2.31 % 1.01 148 1.09 % 0.47
nasvidenje nasvide nje 1,217 1.83 % 1.07 0 0 % 0 130 1.02 % 3.27 651 3.30 % 1.20 189 1.01 % 1.01 32 2.04 % 0.75 35 20.23 % 8.85 180 1.32 % 0.57
jebiga jeb iga 1,175 1.77 % 1.04 0 0 % 0 511 4.03 % 12.87 321 1.63 % 0.59 272 1.45 % 1.45 5 0.32 % 0.12 3 1.73 % 0.76 63 0.46 % 0.20
oho oho 1,160 1.75 % 1.02 0 0 % 0 135 1.06 % 3.40 421 2.13 % 0.78 217 1.16 % 1.16 27 1.72 % 0.63 1 0.58 % 0.25 359 2.64 % 1.13
hura h ura 1,087 1.64 % 0.96 0 0 % 0 121 0.95 % 3.05 531 2.69 % 0.98 289 1.54 % 1.54 38 2.42 % 0.89 1 0.58 % 0.25 107 0.79 % 0.34
jah jah 1,031 1.55 % 0.91 0 0 % 0 22 0.17 % 0.55 650 3.30 % 1.20 235 1.25 % 1.25 8 0.51 % 0.19 1
0.58 % 0.25 115 0.85 % 0.36
ups ups 972 1.46 % 0.86 0 0 % 0 74 0.58 % 1.86 302 1.53 % 0.56 468 2.49 % 2.50 8 0.51 % 0.19 5 2.89 % 1.26 115 0.85 % 0.36
hopla ho pla 946 1.42 % 0.83 0 0 % 0 42 0.33 % 1.06 338 1.71 % 0.62 535 2.85 % 2.85 5 0.32 % 0.12 3 1.73 % 0.76 23 0.17 % 0.07
ojoj o joj 890 1.34 % 0.78 0 0 % 0 282 2.22 % 7.10 202 1.02 % 0.37 241 1.28 % 1.29 15 0.96 % 0.35 4 2.31 % 1.01 146 1.07 % 0.46
zaboga zab oga 831 1.25 % 0.73 0 0 % 0 455 3.59 % 11.46 168 0.85 % 0.31 131 0.70 % 0.70 35 2.23 % 0.82 0 0 % 0 42 0.31 % 0.13
fuj fuj 800 1.20 % 0.71 0 0 % 0 197 1.55 % 4.96 218 1.11 % 0.40 246 1.31 % 1.31 24 1.53 % 0.56 4 2.31 % 1.01 111 0.82 % 0.35
ejga e jga 756 1.14 % 0.67 0 0 % 0 45 0.35 % 1.13 541 2.74 % 1 61 0.33 % 0.33 0 0 % 0 0 0 % 0 109 0.80 % 0.34
jebenti jebe nti 753 1.13 % 0.66 0 0 % 0 576 4.54 % 14.50 64 0.33 % 0.12 87 0.46 % 0.46 7 0.45 % 0.16 0 0 % 0 19 0.14 % 0.06
opa opa 673 1.01 % 0.59 0 0 % 0 70 0.55 % 1.76 243 1.23 % 0.45 256 1.36 % 1.37 25 1.59 % 0.58 0 0 % 0 79 0.58 % 0.25
jebemti jebe mti 643 0.97 % 0.57 0 0 % 0 498 3.93 % 12.54 52 0.26 % 0.10 54 0.29 % 0.29 1 0.06 % 0.02 0 0 % 0 38 0.28 % 0.12
mhm mhm 612 0.92 % 0.54 0 0 % 0 223 1.76 % 5.61 88 0.45 % 0.16 256 1.36 % 1.37 16 1.02 % 0.37 0 0 % 0 29 0.21 % 0.09
ojej o jej 603 0.91 % 0.53 0 0 % 0 291 2.29 % 7.33 107 0.54 % 0.20 129 0.69 % 0.69 19 1.21 % 0.44 0 0 % 0 57 0.42 % 0.18
paf paf 520 0.78 % 0.46 0 0 % 0 57 0.45 % 1.44 374 1.90 % 0.69 59 0.31 % 0.31 2 0.13 % 0.05 1 0.58 % 0.25 27 0.20 % 0.08
pst pst 436 0.66 % 0.38 0 0 % 0 66 0.52 % 1.66 102 0.52 % 0.19 63 0.34 % 0.34 53 3.38 % 1.23 4 2.31 % 1.01 148 1.09 % 0.47
čao čao 412 0.62 % 0.36 0 0 % 0 97 0.77 % 2.44 109 0.55 % 0.20 105 0.56 % 0.56 8 0.51 % 0.19 0 0 % 0 93 0.68 % 0.29
hov hov 359 0.54 % 0.32 0 0 % 0 25 0.20 % 0.63 159 0.81 % 0.29 82 0.44 % 0.44 22 1.40 % 0.51 1 0.58 % 0.25 70 0.52 % 0.22

1.2.219 List of final character-level 4-grams from interjection lower-case word forms in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-word_parts-interjections-lowercase_forms-final-4grams-taxonomy-entire.tsv
Lower-case word form | Rest of the word | Final part of the word | Total absolute frequency of lower-case word form | Percentage of all found lower-case word forms | Total relative frequency (per million occurrences) | Absolute frequency [SSJ.T.K.N] | Percentage [SSJ.T.K.N] | Relative frequency [SSJ.T.K.N] | Absolute frequency [SSJ.T.K.L] | Percentage [SSJ.T.K.L] | Relative frequency [SSJ.T.K.L] | Absolute frequency [SSJ.T.P.C] | Percentage [SSJ.T.P.C] | Relative frequency [SSJ.T.P.C] | Absolute frequency [SSJ.T.P.R] | Percentage [SSJ.T.P.R] | Relative frequency [SSJ.T.P.R] | Absolute frequency [SSJ.T.K.S] | Percentage [SSJ.T.K.S] | Relative frequency [SSJ.T.K.S] | Absolute frequency [SSJ.T.D] | Percentage [SSJ.T.D] | Relative frequency [SSJ.T.D] | Absolute frequency [SSJ.I] | Percentage [SSJ.I] | Relative frequency [SSJ.I]
bravo b ravo 3,100 11.17 % 2.73 0 0 % 0 255 3.78 % 6.42 1,225 12.92 % 2.26 853 11.96 % 4.55 37 5.54 % 0.86 4 5.20 % 1.01 726 19.86 % 2.28
zbogom zb ogom 2,474 8.91 % 2.18 0 0 % 0 360 5.34 % 9.06 1,062 11.20 % 1.96 532 7.46 % 2.84 90 13.47 % 2.10 3 3.90 % 0.76 427 11.68 % 1.34
adijo a dijo 2,415 8.70 % 2.13 0 0 % 0 416 6.17 % 10.47 961 10.14 % 1.77 609 8.54 % 3.25 60 8.98 % 1.40 1 1.30 % 0.25 368 10.07 % 1.16
haha haha 1,798 6.48 % 1.58 0 0 % 0 86 1.27 % 2.17 448 4.73 % 0.83 915 12.83 % 4.88 5 0.75 % 0.12 0 0 % 0 344 9.41 % 1.08
hvalabogu hvala bogu 1,286 4.63 % 1.13 0 0 % 0 146 2.17 % 3.68 671 7.08 % 1.24 300 4.21 % 1.60 17 2.54 % 0.40 4 5.20 % 1.01 148 4.05 % 0.47
nasvidenje nasvid enje 1,217 4.38 % 1.07 0 0 % 0 130 1.93 % 3.27 651 6.87 % 1.20 189 2.65 % 1.01 32 4.79 % 0.75 35 45.45 % 8.85 180 4.92 % 0.57
jebiga je biga 1,175 4.23 % 1.04 0 0 % 0 511 7.58 % 12.87 321 3.39 % 0.59 272 3.81 % 1.45 5 0.75 % 0.12 3 3.90 % 0.76 63 1.72 % 0.20
hura hura 1,087 3.92 % 0.96 0 0 % 0 121 1.79 % 3.05 531 5.60 % 0.98 289 4.05 %
1.54 38 5.69 % 0.89 1 1.30 % 0.25 107 2.93 % 0.34
hopla h opla 946 3.41 % 0.83 0 0 % 0 42 0.62 % 1.06 338 3.57 % 0.62 535 7.50 % 2.85 5 0.75 % 0.12 3 3.90 % 0.76 23 0.63 % 0.07
ojoj ojoj 890 3.21 % 0.78 0 0 % 0 282 4.18 % 7.10 202 2.13 % 0.37 241 3.38 % 1.29 15 2.25 % 0.35 4 5.20 % 1.01 146 4.00 % 0.46
zaboga za boga 831 2.99 % 0.73 0 0 % 0 455 6.75 % 11.46 168 1.77 % 0.31 131 1.84 % 0.70 35 5.24 % 0.82 0 0 % 0 42 1.15 % 0.13
ejga ejga 756 2.72 % 0.67 0 0 % 0 45 0.67 % 1.13 541 5.71 % 1 61 0.85 % 0.33 0 0 % 0 0 0 % 0 109 2.98 % 0.34
jebenti jeb enti 753 2.71 % 0.66 0 0 % 0 576 8.54 % 14.50 64 0.68 % 0.12 87 1.22 % 0.46 7 1.05 % 0.16 0 0 % 0 19 0.52 % 0.06
jebemti jeb emti 643 2.32 % 0.57 0 0 % 0 498 7.38 % 12.54 52 0.55 % 0.10 54 0.76 % 0.29 1 0.15 % 0.02 0 0 % 0 38 1.04 % 0.12
ojej ojej 603 2.17 % 0.53 0 0 % 0 291 4.32 % 7.33 107 1.13 % 0.20 129 1.81 % 0.69 19 2.84 % 0.44 0 0 % 0 57 1.56 % 0.18
juhej j uhej 325 1.17 % 0.29 0 0 % 0 17 0.25 % 0.43 127 1.34 % 0.23 126 1.77 % 0.67 6 0.90 % 0.14 0 0 % 0 49 1.34 % 0.15
živijo ži vijo 255 0.92 % 0.22 0 0 % 0 134 1.99 % 3.37 43 0.45 % 0.08 47 0.66 % 0.25 12 1.80 % 0.28 1 1.30 % 0.25 18 0.49 % 0.06
juhuhu ju huhu 246 0.89 % 0.22 0 0 % 0 33 0.49 % 0.83 103 1.09 % 0.19 79 1.11 % 0.42 5 0.75 % 0.12 3 3.90 % 0.76 23 0.63 % 0.07
živio ž ivio 207 0.75 % 0.18 0 0 % 0 76 1.13 % 1.91 81 0.85 % 0.15 7 0.10 % 0.04 18 2.69 % 0.42 2 2.60 % 0.51 23 0.63 % 0.07
hojla h ojla 194 0.70 % 0.17 0 0 % 0 39 0.58 % 0.98 107 1.13 % 0.20 46 0.65 % 0.25 1 0.15 % 0.02 0 0 % 0 1 0.03 % 0
živjo ž ivjo 191 0.69 % 0.17 0 0 % 0 126 1.87 % 3.17 20 0.21 % 0.04 28 0.39 % 0.15 2 0.30 % 0.05 0 0 % 0 15 0.41 % 0.05
hahaha ha haha 190 0.69 % 0.17 0 0 % 0 30 0.45 % 0.76 34 0.36 % 0.06 67 0.94 % 0.36 2 0.30 % 0.05 0 0 % 0 57 1.56 % 0.18
žalibog žal ibog 186 0.67 % 0.16 0 0 % 0 13 0.19 % 0.33 78 0.82 % 0.14 35 0.49 % 0.19 13 1.95 % 0.30 1 1.30 % 0.25 46 1.26 % 0.14
ježeš j ežeš 155 0.56 % 0.14 0 0 % 0 41 0.61 % 1.03 34 0.36 % 0.06 75 1.05 % 0.40 0 0 % 0 0 0 % 0 5
0.14 % 0.02 aaaa aaaa 148 0.53 % 0.13 0 0 % 0 49 0.73 % 1.23 22 0.23 % 0.04 58 0.81 % 0.31 0 0 % 0 0 0 % 0 19 0.52 % 0.06 tralala tra lala 127 0.46 % 0.11 0 0 % 0 17 0.25 % 0.43 57 0.60 % 0.11 35 0.49 % 0.19 1 0.15 % 0.02 1 1.30 % 0.25 16 0.44 % 0.05 heja heja 98 0.35 % 0.09 0 0 % 0 7 0.10 % 0.18 30 0.32 % 0.06 22 0.31 % 0.12 3 0.45 % 0.07 0 0 % 0 36 0.98 % 0.11 bumf bumf 80 0.29 % 0.07 0 0 % 0 33 0.49 % 0.83 23 0.24 % 0.04 18 0.25 % 0.10 3 0.45 % 0.07 0 0 % 0 3 0.08 % 0.01 hrsk hrsk 73 0.26 % 0.06 0 0 % 0 33 0.49 % 0.83 21 0.22 % 0.04 13 0.18 % 0.07 1 0.15 % 0.02 0 0 % 0 5 0.14 % 0.02 jojmene joj mene 72 0.26 % 0.06 0 0 % 0 34 0.50 % 0.86 13 0.14 % 0.02 17 0.24 % 0.09 5 0.75 % 0.12 0 0 % 0 3 0.08 % 0.01 hozana ho zana 66 0.24 % 0.06 0 0 % 0 21 0.31 % 0.53 24 0.25 % 0.04 10 0.14 % 0.05 3 0.45 % 0.07 1 1.30 % 0.25 7 0.19 % 0.02 aaah aaah 63 0.23 % 0.06 0 0 % 0 17 0.25 % 0.43 6 0.06 % 0.01 32 0.45 % 0.17 2 0.30 % 0.05 0 0 % 0 6 0.16 % 0.02 CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 236 File at CLARIN.SI 1.2.220 List of final character-level 5-grams from interjection lower-case word forms in the Gigafida 2.0 corpus with text-type distributionGF2.0-word_parts-interjections-lowercase_ forms-final-5grams-taxonomy-entire.tsvLower-case word form Rest of the word Final part of the word Total absolute frequency of lower-case word form Percentage of all found lower-case word forms Total relative frequency (per million occurrences) Absolute frequency [SSJ.T.K.N] Percentage [SSJ.T.K.N] Relative frequency [SSJ.T.K.N] Absolute frequency [SSJ.T.K.L] Percentage [SSJ.T.K.L] Relative frequency [SSJ.T.K.L] Absolute frequency [SSJ.T.P.C] Percentage [SSJ.T.P.C] Relative frequency [SSJ.T.P.C] Absolute frequency [SSJ.T.P.R] Percentage [SSJ.T.P.R] Relative frequency [SSJ.T.P.R] Absolute frequency [SSJ.T.K.S] Percentage [SSJ.T.K.S] Relative frequency [SSJ.T.K.S] Absolute frequency [SSJ.T.D] Percentage [SSJ.T.D] Relative frequency [SSJ.T.D] Absolute 
frequency [SSJ.I] Percentage [SSJ.I] Relative frequency [SSJ.I] bravo bravo 3,100 14.71 % 2.73 0 0 % 0 255 4.75 % 6.42 1,225 16.80 % 2.26 853 16.77 % 4.55 37 6.87 % 0.86 4 5.97 % 1.01 726 26.62 % 2.28 zbogom z bogom 2,474 11.74 % 2.18 0 0 % 0 360 6.71 % 9.06 1,062 14.57 % 1.96 532 10.46 % 2.84 90 16.70 % 2.10 3 4.48 % 0.76 427 15.66 % 1.34 adijo adijo 2,415 11.46 % 2.13 0 0 % 0 416 7.75 % 10.47 961 13.18 % 1.77 609 11.97 % 3.25 60 11.13 % 1.40 1 1.49 % 0.25 368 13.49 % 1.16 hvalabogu hval abogu 1,286 6.10 % 1.13 0 0 % 0 146 2.72 % 3.68 671 9.20 % 1.24 300 5.90 % 1.60 17 3.15 % 0.40 4 5.97 % 1.01 148 5.43 % 0.47 nasvidenje nasvi denje 1,217 5.77 % 1.07 0 0 % 0 130 2.42 % 3.27 651 8.93 % 1.20 189 3.71 % 1.01 32 5.94 % 0.75 35 52.24 % 8.85 180 6.60 % 0.57 jebiga j ebiga 1,175 5.57 % 1.04 0 0 % 0 511 9.52 % 12.87 321 4.40 % 0.59 272 5.35 % 1.45 5 0.93 % 0.12 3 4.48 % 0.76 63 2.31 % 0.20 hopla hopla 946 4.49 % 0.83 0 0 % 0 42 0.78 % 1.06 338 4.64 % 0.62 535 10.52 % 2.85 5 0.93 % 0.12 3 4.48 % 0.76 23 0.84 % 0.07 zaboga z aboga 831 3.94 % 0.73 0 0 % 0 455 8.48 % 11.46 168 2.30 % 0.31 131 2.58 % 0.70 35 6.49 % 0.82 0 0 % 0 42 1.54 % 0.13 jebenti je benti 753 3.57 % 0.66 0 0 % 0 576 10.73 % 14.50 64 0.88 % 0.12 87 1.71 % 0.46 7 1.30 % 0.16 0 0 % 0 19 0.70 % 0.06 jebemti je bemti 643 3.05 % 0.57 0 0 % 0 498 9.28 % 12.54 52 0.71 % 0.10 54 1.06 % 0.29 1 0.19 % 0.02 0 0 % 0 38 1.39 % 0.12 juhej juhej 325 1.54 % 0.29 0 0 % 0 17 0.32 % 0.43 127 1.74 % 0.23 126 2.48 % 0.67 6 1.11 % 0.14 0 0 % 0 49 1.80 % 0.15 živijo ž ivijo 255 1.21 % 0.22 0 0 % 0 134 2.50 % 3.37 43 0.59 % 0.08 47 0.92 % 0.25 12 2.23 % 0.28 1 1.49 % 0.25 18 0.66 % 0.06 juhuhu j uhuhu 246 1.17 % 0.22 0 0 % 0 33 0.61 % 0.83 103 1.41 % 0.19 79 1.55 % 0.42 5 0.93 % 0.12 3 4.48 % 0.76 23 0.84 % 0.07 živio živio 207 0.98 % 0.18 0 0 % 0 76 1.42 % 1.91 81 1.11 % 0.15 7 0.14 % 0.04 18 3.34 % 0.42 2 2.98 % 0.51 23 0.84 % 0.07 hojla hojla 194 0.92 % 0.17 0 0 % 0 39 0.73 % 0.98 107 1.47 % 0.20 46 0.90 % 0.25 1 0.19 % 0.02 0 
0 % 0 1 0.04 % 0
živjo živjo 191 0.91 % 0.17 0 0 % 0 126 2.35 % 3.17 20 0.27 % 0.04 28 0.55 % 0.15 2 0.37 % 0.05 0 0 % 0 15 0.55 % 0.05
hahaha h ahaha 190 0.90 % 0.17 0 0 % 0 30 0.56 % 0.76 34 0.47 % 0.06 67 1.32 % 0.36 2 0.37 % 0.05 0 0 % 0 57 2.09 % 0.18
žalibog ža libog 186 0.88 % 0.16 0 0 % 0 13 0.24 % 0.33 78 1.07 % 0.14 35 0.69 % 0.19 13 2.41 % 0.30 1 1.49 % 0.25 46 1.69 % 0.14
ježeš ježeš 155 0.73 % 0.14 0 0 % 0 41 0.76 % 1.03 34 0.47 % 0.06 75 1.47 % 0.40 0 0 % 0 0 0 % 0 5 0.18 % 0.02
tralala tr alala 127 0.60 % 0.11 0 0 % 0 17 0.32 % 0.43 57 0.78 % 0.11 35 0.69 % 0.19 1 0.19 % 0.02 1 1.49 % 0.25 16 0.59 % 0.05
jojmene jo jmene 72 0.34 % 0.06 0 0 % 0 34 0.63 % 0.86 13 0.18 % 0.02 17 0.33 % 0.09 5 0.93 % 0.12 0 0 % 0 3 0.11 % 0.01
hozana h ozana 66 0.31 % 0.06 0 0 % 0 21 0.39 % 0.53 24 0.33 % 0.04 10 0.20 % 0.05 3 0.56 % 0.07 1 1.49 % 0.25 7 0.26 % 0.02
mijav mijav 56 0.27 % 0.05 0 0 % 0 11 0.20 % 0.28 17 0.23 % 0.03 19 0.37 % 0.10 5 0.93 % 0.12 0 0 % 0 4 0.15 % 0.01
joooj joooj 53 0.25 % 0.05 0 0 % 0 1 0.02 % 0.03 24 0.33 % 0.04 22 0.43 % 0.12 0 0 % 0 0 0 % 0 6 0.22 % 0.02
zdravo z dravo 52 0.25 % 0.05 0 0 % 0 12 0.22 % 0.30 24 0.33 % 0.04 9 0.18 % 0.05 3 0.56 % 0.07 0 0 % 0 4 0.15 % 0.01
jojme jojme 49 0.23 % 0.04 0 0 % 0 16 0.30 % 0.40 10 0.14 % 0.02 15 0.29 % 0.08 6 1.11 % 0.14 1 1.49 % 0.25 1 0.04 % 0
aaaaa aaaaa 45 0.21 % 0.04 0 0 % 0 12 0.22 % 0.30 8 0.11 % 0.01 17 0.33 % 0.09 2 0.37 % 0.05 0 0 % 0 6 0.22 % 0.02
mejduš m ejduš 42 0.20 % 0.04 0 0 % 0 14 0.26 % 0.35 10 0.14 % 0.02 10 0.20 % 0.05 2 0.37 % 0.05 0 0 % 0 6 0.22 % 0.02
ježešna je žešna 32 0.15 % 0.03 0 0 % 0 13 0.24 % 0.33 13 0.18 % 0.02 4 0.08 % 0.02 1 0.19 % 0.02 0 0 % 0 1 0.04 % 0
jaaaa jaaaa 29 0.14 % 0.03 0 0 % 0 12 0.22 % 0.30 5 0.07 % 0.01 5 0.10 % 0.03 0 0 % 0 0 0 % 0 7 0.26 % 0.02
aaaah aaaah 28 0.13 % 0.02 0 0 % 0 8 0.15 % 0.20 4 0.06 % 0.01 14 0.28 % 0.07 0 0 % 0 0 0 % 0 2 0.07 % 0.01
prejoj p rejoj 28 0.13 % 0.02 0 0 % 0 5 0.09 % 0.13 6 0.08 % 0.01 12 0.24 % 0.06 2 0.37 % 0.05 0 0 % 0 3 0.11 % 0.01

1.2.221 List of initial character-level 1-grams from abbreviation lemmas in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-word_parts-abbreviations-lemmas-initial-1grams-taxonomy-entire.tsv
Columns: Lemma | Lemma (lower-case) | Initial part of the word | Rest of the word | Total absolute frequency of lemma | Percentage of all found lemmas | Total relative frequency (per million occurrences) | Absolute frequency [SSJ.T.K.N] | Percentage [SSJ.T.K.N] | Relative frequency [SSJ.T.K.N] | Absolute frequency [SSJ.T.K.L] | Percentage [SSJ.T.K.L] | Relative frequency [SSJ.T.K.L] | Absolute frequency [SSJ.T.P.C] | Percentage [SSJ.T.P.C] | Relative frequency [SSJ.T.P.C] | Absolute frequency [SSJ.T.P.R] | Percentage [SSJ.T.P.R] | Relative frequency [SSJ.T.P.R] | Absolute frequency [SSJ.T.K.S] | Percentage [SSJ.T.K.S] | Relative frequency [SSJ.T.K.S] | Absolute frequency [SSJ.T.D] | Percentage [SSJ.T.D] | Relative frequency [SSJ.T.D] | Absolute frequency [SSJ.I] | Percentage [SSJ.I] | Relative frequency [SSJ.I]

dr, dr, d r, 322,077 10.23 % 283.84 0 0 % 0 2,425 9.78 % 61.06 222,930 11.94 % 410.76 51,452 11.39 % 274.53 4,941 1.97 % 115.08 476 5.17 % 120.33 39,853 7.31 % 125.35
oz, oz, o z, 136,593 4.34 % 120.38 0 0 % 0 226 0.91 % 5.69 40,351 2.16 % 74.35 22,364 4.95 % 119.33 11,031 4.41 % 256.93 329 3.57 % 83.17 62,292 11.43 % 195.92
d, d, d , 133,611 4.24 % 117.75 0 0 % 0 213 0.86 % 5.36 84,570 4.53 % 155.83 20,886 4.62 % 111.44 1,152 0.46 % 26.83 567 6.16 % 143.33 26,223 4.81 % 82.48
o, o, o , 110,895 3.52 % 97.73 0 0 % 0 142 0.57 % 3.58 70,116 3.75 % 129.19 21,336 4.72 % 113.84 1,407 0.56 % 32.77 182 1.98 % 46.01 17,712 3.25 % 55.71
M, m, M , 110,881 3.52 % 97.72 0 0 % 0 522 2.10 % 13.14 85,646 4.59 % 157.81 9,522 2.11 % 50.81 5,916 2.36 % 137.79 61 0.66 % 15.42 9,214 1.69 % 28.98
št, št, š t, 94,645 3.01 % 83.41 0 0 % 0 941 3.79 % 23.69 28,295 1.51 % 52.14 11,553 2.56 % 61.64 7,171 2.87 % 167.02 1,913 20.77 %
483.59 44,772 8.21 % 140.82
npr, npr, n pr, 92,015 2.92 % 81.09 0 0 % 0 314 1.27 % 7.91 24,837 1.33 % 45.76 29,590 6.55 % 157.88 22,440 8.97 % 522.65 311 3.38 % 78.62 14,523 2.66 % 45.68
J, j, J , 84,562 2.69 % 74.52 0 0 % 0 1,071 4.32 % 26.97 59,838 3.21 % 110.26 7,985 1.77 % 42.61 6,885 2.75 % 160.36 69 0.75 % 17.44 8,714 1.60 % 27.41
A, a, A , 81,577 2.59 % 71.89 0 0 % 0 1,198 4.83 % 30.16 49,090 2.63 % 90.45 8,889 1.97 % 47.43 6,513 2.60 % 151.70 191 2.07 % 48.28 15,696 2.88 % 49.37
S, s, S , 80,813 2.57 % 71.22 0 0 % 0 488 1.97 % 12.29 62,235 3.33 % 114.67 7,148 1.58 % 38.14 3,722 1.49 % 86.69 57 0.62 % 14.41 7,163 1.31 % 22.53
t, t, t , 79,373 2.52 % 69.95 0 0 % 0 168 0.68 % 4.23 34,619 1.85 % 63.79 14,460 3.20 % 77.15 4,648 1.86 % 108.26 99 1.07 % 25.03 25,379 4.66 % 79.82
B, b, B , 75,217 2.39 % 66.29 0 0 % 0 1,095 4.42 % 27.57 54,738 2.93 % 100.86 6,177 1.37 % 32.96 3,566 1.43 % 83.06 242 2.63 % 61.17 9,399 1.72 % 29.56
i, i, i , 68,165 2.17 % 60.07 0 0 % 0 158 0.64 % 3.98 27,139 1.45 % 50.01 12,936 2.86 % 69.02 3,525 1.41 % 82.10 90 0.98 % 22.75 24,317 4.46 % 76.48
itd, itd, i td, 65,475 2.08 % 57.70 2 5.71 % 206.04 578 2.33 % 14.55 36,287 1.94 % 66.86 14,609 3.23 % 77.95 5,517 2.21 % 128.50 195 2.12 % 49.29 8,287 1.52 % 26.06
sv, sv, s v, 65,377 2.08 % 57.62 0 0 % 0 688 2.77 % 17.32 44,951 2.41 % 82.83 6,229 1.38 % 33.24 5,754 2.30 % 134.02 57 0.62 % 14.41 7,698 1.41 % 24.21
D, d, D , 63,685 2.02 % 56.13 0 0 % 0 415 1.67 % 10.45 47,370 2.54 % 87.28 5,629 1.25 % 30.03 3,471 1.39 % 80.84 92 1.00 % 23.26 6,708 1.23 % 21.10
P, p, P , 61,583 1.96 % 54.27 0 0 % 0 371 1.50 % 9.34 47,303 2.53 % 87.16 6,019 1.33 % 32.12 2,848 1.14 % 66.33 68 0.74 % 17.19 4,974 0.91 % 15.64
tel, tel, t el, 59,340 1.89 % 52.30 0 0 % 0 5 0.02 % 0.13 47,321 2.53 % 87.19 7,401 1.64 % 39.49 1,487 0.59 % 34.63 12 0.13 % 3.03 3,114 0.57 % 9.79
K, k, K , 49,179 1.56 % 43.34 0 0 % 0 1,046 4.22 % 26.34 37,122 1.99 % 68.40 4,816 1.07 % 25.70 1,780 0.71 % 41.46 48 0.52 % 12.13 4,367 0.80 % 13.74
prof, prof, p rof, 48,612 1.54 % 42.84 0 0 % 0 47 0.19 % 1.18 34,903 1.87 % 64.31 5,305 1.17 % 28.31 741 0.30 % 17.26 26 0.28 % 6.57 7,590 1.39 % 23.87
str, str, s tr, 47,194 1.50 % 41.59 0 0 % 0 1,110 4.48 % 27.95 3,788 0.20 % 6.98 10,527 2.33 % 56.17 28,190 11.27 % 656.58 33 0.36 % 8.34 3,546 0.65 % 11.15
I, i, I , 44,515 1.41 % 39.23 0 0 % 0 441 1.78 % 11.10 30,623 1.64 % 56.42 4,534 1.00 % 24.19 3,184 1.27 % 74.16 137 1.49 % 34.63 5,596 1.03 % 17.60
V, v, V , 43,853 1.39 % 38.65 0 0 % 0 342 1.38 % 8.61 30,139 1.61 % 55.53 5,216 1.16 % 27.83 2,017 0.81 % 46.98 84 0.91 % 21.23 6,055 1.11 % 19.04
R, r, R , 43,194 1.37 % 38.07 0 0 % 0 533 2.15 % 13.42 29,209 1.56 % 53.82 5,154 1.14 % 27.50 3,135 1.25 % 73.02 40 0.43 % 10.11 5,123 0.94 % 16.11
G, g, G , 42,062 1.34 % 37.07 0 0 % 0 555 2.24 % 13.97 30,793 1.65 % 56.74 4,157 0.92 % 22.18 2,731 1.09 % 63.61 44 0.48 % 11.12 3,782 0.69 % 11.90
mag, mag, m ag, 40,143 1.27 % 35.38 0 0 % 0 17 0.07 % 0.43 29,459 1.58 % 54.28 5,620 1.24 % 29.99 471 0.19 % 10.97 65 0.71 % 16.43 4,511 0.83 % 14.19
C, c, C , 37,814 1.20 % 33.33 33 94.29 % 3,399.61 339 1.37 % 8.54 18,868 1.01 % 34.77 7,426 1.64 % 39.62 4,770 1.91 % 111.10 132 1.43 % 33.37 6,246 1.15 % 19.65
L, l, L , 37,102 1.18 % 32.70 0 0 % 0 793 3.20 % 19.97 25,324 1.36 % 46.66 3,661 0.81 % 19.53 3,527 1.41 % 82.15 65 0.71 % 16.43 3,732 0.69 % 11.74
st, st, s t, 36,399 1.16 % 32.08 0 0 % 0 947 3.82 % 23.84 19,248 1.03 % 35.47 3,431 0.76 % 18.31 1,228 0.49 % 28.60 30 0.33 % 7.58 11,515 2.11 % 36.22
p, p, p , 35,939 1.14 % 31.67 0 0 % 0 96 0.39 % 2.42 20,936 1.12 % 38.58 6,478 1.43 % 34.56 703 0.28 % 16.37 68 0.74 % 17.19 7,658 1.41 % 24.09
op, op, o p, 35,928 1.14 % 31.66 0 0 % 0 454 1.83 % 11.43 15,464 0.83 % 28.49 4,702 1.04 % 25.09 3,481 1.39 % 81.08 8 0.09 % 2.02 11,819 2.17 % 37.17
T, t, T , 34,839 1.11 % 30.70 0 0 % 0 350 1.41 % 8.81 23,035 1.23 % 42.44 4,478 0.99 % 23.89 1,987 0.79 % 46.28 35 0.38 % 8.85 4,954 0.91 % 15.58

1.2.222 List of initial character-level 2-grams from abbreviation lemmas in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-word_parts-abbreviations-lemmas-initial-2grams-taxonomy-entire.tsv
Columns: Lemma | Lemma (lower-case) | Initial part of the word | Rest of the word | Total absolute frequency of lemma | Percentage of all found lemmas | Total relative frequency (per million occurrences) | Absolute frequency [SSJ.T.K.L] | Percentage [SSJ.T.K.L] | Relative frequency [SSJ.T.K.L] | Absolute frequency [SSJ.T.K.N] | Percentage [SSJ.T.K.N] | Relative frequency [SSJ.T.K.N] | Absolute frequency [SSJ.T.D] | Percentage [SSJ.T.D] | Relative frequency [SSJ.T.D] | Absolute frequency [SSJ.T.K.S] | Percentage [SSJ.T.K.S] | Relative frequency [SSJ.T.K.S] | Absolute frequency [SSJ.T.P.R] | Percentage [SSJ.T.P.R] | Relative frequency [SSJ.T.P.R] | Absolute frequency [SSJ.T.P.C] | Percentage [SSJ.T.P.C] | Relative frequency [SSJ.T.P.C] | Absolute frequency [SSJ.I] | Percentage [SSJ.I] | Relative frequency [SSJ.I]

dr, dr, dr , 322,077 10.23 % 283.84 2,425 9.78 % 61.06 0 0 % 0 476 5.17 % 120.33 4,941 1.97 % 115.08 51,452 11.39 % 274.53 222,930 11.94 % 410.76 39,853 7.31 % 125.35
oz, oz, oz , 136,593 4.34 % 120.38 226 0.91 % 5.69 0 0 % 0 329 3.57 % 83.17 11,031 4.41 % 256.93 22,364 4.95 % 119.33 40,351 2.16 % 74.35 62,292 11.43 % 195.92
d, d, d, 133,611 4.24 % 117.75 213 0.86 % 5.36 0 0 % 0 567 6.16 % 143.33 1,152 0.46 % 26.83 20,886 4.62 % 111.44 84,570 4.53 % 155.83 26,223 4.81 % 82.48
o, o, o, 110,895 3.52 % 97.73 142 0.57 % 3.58 0 0 % 0 182 1.98 % 46.01 1,407 0.56 % 32.77 21,336 4.72 % 113.84 70,116 3.75 % 129.19 17,712 3.25 % 55.71
M, m, M, 110,881 3.52 % 97.72 522 2.10 % 13.14 0 0 % 0 61 0.66 % 15.42 5,916 2.36 % 137.79 9,522 2.11 % 50.81 85,646 4.59 % 157.81 9,214 1.69 % 28.98
št, št, št , 94,645 3.01 % 83.41 941 3.79 % 23.69 0 0 % 0 1,913 20.77 % 483.59 7,171 2.87 % 167.02 11,553 2.56 % 61.64 28,295 1.51 % 52.14 44,772 8.21 % 140.82
npr, npr, np r, 92,015 2.92 %
81.09 314 1.27 % 7.91 0 0 % 0 311 3.38 % 78.62 22,440 8.97 % 522.65 29,590 6.55 % 157.88 24,837 1.33 % 45.76 14,523 2.66 % 45.68
J, j, J, 84,562 2.69 % 74.52 1,071 4.32 % 26.97 0 0 % 0 69 0.75 % 17.44 6,885 2.75 % 160.36 7,985 1.77 % 42.61 59,838 3.21 % 110.26 8,714 1.60 % 27.41
A, a, A, 81,577 2.59 % 71.89 1,198 4.83 % 30.16 0 0 % 0 191 2.07 % 48.28 6,513 2.60 % 151.70 8,889 1.97 % 47.43 49,090 2.63 % 90.45 15,696 2.88 % 49.37
S, s, S, 80,813 2.57 % 71.22 488 1.97 % 12.29 0 0 % 0 57 0.62 % 14.41 3,722 1.49 % 86.69 7,148 1.58 % 38.14 62,235 3.33 % 114.67 7,163 1.31 % 22.53
t, t, t, 79,373 2.52 % 69.95 168 0.68 % 4.23 0 0 % 0 99 1.07 % 25.03 4,648 1.86 % 108.26 14,460 3.20 % 77.15 34,619 1.85 % 63.79 25,379 4.66 % 79.82
B, b, B, 75,217 2.39 % 66.29 1,095 4.42 % 27.57 0 0 % 0 242 2.63 % 61.17 3,566 1.43 % 83.06 6,177 1.37 % 32.96 54,738 2.93 % 100.86 9,399 1.72 % 29.56
i, i, i, 68,165 2.17 % 60.07 158 0.64 % 3.98 0 0 % 0 90 0.98 % 22.75 3,525 1.41 % 82.10 12,936 2.86 % 69.02 27,139 1.45 % 50.01 24,317 4.46 % 76.48
itd, itd, it d, 65,475 2.08 % 57.70 578 2.33 % 14.55 2 5.71 % 206.04 195 2.12 % 49.29 5,517 2.21 % 128.50 14,609 3.23 % 77.95 36,287 1.94 % 66.86 8,287 1.52 % 26.06
sv, sv, sv , 65,377 2.08 % 57.62 688 2.77 % 17.32 0 0 % 0 57 0.62 % 14.41 5,754 2.30 % 134.02 6,229 1.38 % 33.24 44,951 2.41 % 82.83 7,698 1.41 % 24.21
D, d, D, 63,685 2.02 % 56.13 415 1.67 % 10.45 0 0 % 0 92 1.00 % 23.26 3,471 1.39 % 80.84 5,629 1.25 % 30.03 47,370 2.54 % 87.28 6,708 1.23 % 21.10
P, p, P, 61,583 1.96 % 54.27 371 1.50 % 9.34 0 0 % 0 68 0.74 % 17.19 2,848 1.14 % 66.33 6,019 1.33 % 32.12 47,303 2.53 % 87.16 4,974 0.91 % 15.64
tel, tel, te l, 59,340 1.89 % 52.30 5 0.02 % 0.13 0 0 % 0 12 0.13 % 3.03 1,487 0.59 % 34.63 7,401 1.64 % 39.49 47,321 2.53 % 87.19 3,114 0.57 % 9.79
K, k, K, 49,179 1.56 % 43.34 1,046 4.22 % 26.34 0 0 % 0 48 0.52 % 12.13 1,780 0.71 % 41.46 4,816 1.07 % 25.70 37,122 1.99 % 68.40 4,367 0.80 % 13.74
prof, prof, pr of, 48,612 1.54 % 42.84 47 0.19 % 1.18 0 0 % 0 26 0.28 % 6.57 741 0.30 % 17.26 5,305 1.17 % 28.31 34,903 1.87 % 64.31 7,590 1.39 % 23.87
str, str, st r, 47,194 1.50 % 41.59 1,110 4.48 % 27.95 0 0 % 0 33 0.36 % 8.34 28,190 11.27 % 656.58 10,527 2.33 % 56.17 3,788 0.20 % 6.98 3,546 0.65 % 11.15
I, i, I, 44,515 1.41 % 39.23 441 1.78 % 11.10 0 0 % 0 137 1.49 % 34.63 3,184 1.27 % 74.16 4,534 1.00 % 24.19 30,623 1.64 % 56.42 5,596 1.03 % 17.60
V, v, V, 43,853 1.39 % 38.65 342 1.38 % 8.61 0 0 % 0 84 0.91 % 21.23 2,017 0.81 % 46.98 5,216 1.16 % 27.83 30,139 1.61 % 55.53 6,055 1.11 % 19.04
R, r, R, 43,194 1.37 % 38.07 533 2.15 % 13.42 0 0 % 0 40 0.43 % 10.11 3,135 1.25 % 73.02 5,154 1.14 % 27.50 29,209 1.56 % 53.82 5,123 0.94 % 16.11
G, g, G, 42,062 1.34 % 37.07 555 2.24 % 13.97 0 0 % 0 44 0.48 % 11.12 2,731 1.09 % 63.61 4,157 0.92 % 22.18 30,793 1.65 % 56.74 3,782 0.69 % 11.90
mag, mag, ma g, 40,143 1.27 % 35.38 17 0.07 % 0.43 0 0 % 0 65 0.71 % 16.43 471 0.19 % 10.97 5,620 1.24 % 29.99 29,459 1.58 % 54.28 4,511 0.83 % 14.19
C, c, C, 37,814 1.20 % 33.33 339 1.37 % 8.54 33 94.29 % 3,399.61 132 1.43 % 33.37 4,770 1.91 % 111.10 7,426 1.64 % 39.62 18,868 1.01 % 34.77 6,246 1.15 % 19.65
L, l, L, 37,102 1.18 % 32.70 793 3.20 % 19.97 0 0 % 0 65 0.71 % 16.43 3,527 1.41 % 82.15 3,661 0.81 % 19.53 25,324 1.36 % 46.66 3,732 0.69 % 11.74
st, st, st , 36,399 1.16 % 32.08 947 3.82 % 23.84 0 0 % 0 30 0.33 % 7.58 1,228 0.49 % 28.60 3,431 0.76 % 18.31 19,248 1.03 % 35.47 11,515 2.11 % 36.22
p, p, p, 35,939 1.14 % 31.67 96 0.39 % 2.42 0 0 % 0 68 0.74 % 17.19 703 0.28 % 16.37 6,478 1.43 % 34.56 20,936 1.12 % 38.58 7,658 1.41 % 24.09
op, op, op , 35,928 1.14 % 31.66 454 1.83 % 11.43 0 0 % 0 8 0.09 % 2.02 3,481 1.39 % 81.08 4,702 1.04 % 25.09 15,464 0.83 % 28.49 11,819 2.17 % 37.17
T, t, T, 34,839 1.11 % 30.70 350 1.41 % 8.81 0 0 % 0 35 0.38 % 8.85 1,987 0.79 % 46.28 4,478 0.99 % 23.89 23,035 1.23 % 42.44 4,954 0.91 % 15.58

1.2.223 List of initial character-level 3-grams from abbreviation lemmas in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-word_parts-abbreviations-lemmas-initial-3grams-taxonomy-entire.tsv
Columns: Lemma | Lemma (lower-case) | Initial part of the word | Rest of the word | Total absolute frequency of lemma | Percentage of all found lemmas | Total relative frequency (per million occurrences) | Absolute frequency [SSJ.T.K.L] | Percentage [SSJ.T.K.L] | Relative frequency [SSJ.T.K.L] | Absolute frequency [SSJ.T.K.N] | Percentage [SSJ.T.K.N] | Relative frequency [SSJ.T.K.N] | Absolute frequency [SSJ.T.D] | Percentage [SSJ.T.D] | Relative frequency [SSJ.T.D] | Absolute frequency [SSJ.T.K.S] | Percentage [SSJ.T.K.S] | Relative frequency [SSJ.T.K.S] | Absolute frequency [SSJ.T.P.R] | Percentage [SSJ.T.P.R] | Relative frequency [SSJ.T.P.R] | Absolute frequency [SSJ.T.P.C] | Percentage [SSJ.T.P.C] | Relative frequency [SSJ.T.P.C] | Absolute frequency [SSJ.I] | Percentage [SSJ.I] | Relative frequency [SSJ.I]

dr, dr, dr, 322,077 21.49 % 283.84 2,425 20.54 % 61.06 0 0 % 0 476 9.49 % 120.33 4,941 3.16 % 115.08 51,452 21.42 % 274.53 222,930 28.22 % 410.76 39,853 13.50 % 125.35
oz, oz, oz, 136,593 9.12 % 120.38 226 1.92 % 5.69 0 0 % 0 329 6.56 % 83.17 11,031 7.05 % 256.93 22,364 9.31 % 119.33 40,351 5.11 % 74.35 62,292 21.10 % 195.92
št, št, št, 94,645 6.32 % 83.41 941 7.97 % 23.69 0 0 % 0 1,913 38.13 % 483.59 7,171 4.58 % 167.02 11,553 4.81 % 61.64 28,295 3.58 % 52.14 44,772 15.16 % 140.82
npr, npr, npr , 92,015 6.14 % 81.09 314 2.66 % 7.91 0 0 % 0 311 6.20 % 78.62 22,440 14.34 % 522.65 29,590 12.32 % 157.88 24,837 3.15 % 45.76 14,523 4.92 % 45.68
itd, itd, itd , 65,475 4.37 % 57.70 578 4.90 % 14.55 2 100.00 % 206.04 195 3.89 % 49.29 5,517 3.53 % 128.50 14,609 6.08 % 77.95 36,287 4.59 % 66.86 8,287 2.81 % 26.06
sv, sv, sv, 65,377 4.36 % 57.62 688 5.83 % 17.32 0 0 % 0 57 1.14 % 14.41 5,754 3.68 % 134.02 6,229 2.59 % 33.24 44,951 5.69 % 82.83 7,698 2.61 % 24.21
tel, tel, tel , 59,340 3.96 % 52.30 5 0.04 % 0.13 0 0 % 0 12 0.24 % 3.03 1,487 0.95 %
34.63 7,401 3.08 % 39.49 47,321 5.99 % 87.19 3,114 1.05 % 9.79
prof, prof, pro f, 48,612 3.24 % 42.84 47 0.40 % 1.18 0 0 % 0 26 0.52 % 6.57 741 0.47 % 17.26 5,305 2.21 % 28.31 34,903 4.42 % 64.31 7,590 2.57 % 23.87
str, str, str , 47,194 3.15 % 41.59 1,110 9.40 % 27.95 0 0 % 0 33 0.66 % 8.34 28,190 18.01 % 656.58 10,527 4.38 % 56.17 3,788 0.48 % 6.98 3,546 1.20 % 11.15
mag, mag, mag , 40,143 2.68 % 35.38 17 0.14 % 0.43 0 0 % 0 65 1.30 % 16.43 471 0.30 % 10.97 5,620 2.34 % 29.99 29,459 3.73 % 54.28 4,511 1.53 % 14.19
st, st, st, 36,399 2.43 % 32.08 947 8.02 % 23.84 0 0 % 0 30 0.60 % 7.58 1,228 0.79 % 28.60 3,431 1.43 % 18.31 19,248 2.44 % 35.47 11,515 3.90 % 36.22
op, op, op, 35,928 2.40 % 31.66 454 3.85 % 11.43 0 0 % 0 8 0.16 % 2.02 3,481 2.22 % 81.08 4,702 1.96 % 25.09 15,464 1.96 % 28.49 11,819 4.00 % 37.17
ipd, ipd, ipd , 29,413 1.96 % 25.92 161 1.36 % 4.05 0 0 % 0 178 3.55 % 45 5,227 3.34 % 121.74 7,015 2.92 % 37.43 12,438 1.57 % 22.92 4,394 1.49 % 13.82
itn, itn, itn , 16,421 1.10 % 14.47 132 1.12 % 3.32 0 0 % 0 157 3.13 % 39.69 2,446 1.56 % 56.97 3,546 1.48 % 18.92 8,494 1.07 % 15.65 1,646 0.56 % 5.18
odst, odst, ods t, 16,149 1.08 % 14.23 1 0.01 % 0.03 0 0 % 0 1 0.02 % 0.25 46 0.03 % 1.07 256 0.11 % 1.37 10,780 1.36 % 19.86 5,065 1.72 % 15.93
am, am, am, 10,990 0.73 % 9.69 2 0.02 % 0.05 0 0 % 0 0 0 % 0 95 0.06 % 2.21 74 0.03 % 0.39 10,748 1.36 % 19.80 71 0.02 % 0.22
tj, tj, tj, 10,915 0.73 % 9.62 93 0.79 % 2.34 0 0 % 0 6 0.12 % 1.52 2,778 1.77 % 64.70 2,880 1.20 % 15.37 3,085 0.39 % 5.68 2,073 0.70 % 6.52
med, med, med , 10,326 0.69 % 9.10 7 0.06 % 0.18 0 0 % 0 0 0 % 0 53 0.03 % 1.23 3,586 1.49 % 19.13 5,652 0.72 % 10.41 1,028 0.35 % 3.23
čl, čl, čl, 10,036 0.67 % 8.84 5 0.04 % 0.13 0 0 % 0 12 0.24 % 3.03 183 0.12 % 4.26 282 0.12 % 1.50 1,356 0.17 % 2.50 8,198 2.78 % 25.78
nan, nan, nan , 9,178 0.61 % 8.09 3 0.03 % 0.08 0 0 % 0 0 0 % 0 3 0 % 0.07 18 0.01 % 0.10 9,052 1.15 % 16.68 102 0.04 % 0.32
prim, prim, pri m, 8,714 0.58 % 7.68 94 0.80 % 2.37 0 0 % 0 11 0.22 % 2.78 3,218 2.06 % 74.95 1,535 0.64 % 8.19 2,949 0.37 % 5.43 907 0.31 % 2.85
ur, ur, ur, 8,434 0.56 % 7.43 28 0.24 % 0.71 0 0 % 0 10 0.20 % 2.53 562 0.36 % 13.09 647 0.27 % 3.45 3,021 0.38 % 5.57 4,166 1.41 % 13.10
pr, pr, pr, 8,211 0.55 % 7.24 57 0.48 % 1.44 0 0 % 0 255 5.08 % 64.46 2,655 1.70 % 61.84 2,206 0.92 % 11.77 2,146 0.27 % 3.95 892 0.30 % 2.81
amer, amer, ame r, 8,026 0.54 % 7.07 0 0 % 0 0 0 % 0 0 0 % 0 11 0.01 % 0.26 5 0 % 0.03 8,006 1.01 % 14.75 4 0 % 0.01
idr, idr, idr , 7,791 0.52 % 6.87 33 0.28 % 0.83 0 0 % 0 33 0.66 % 8.34 1,650 1.05 % 38.43 1,322 0.55 % 7.05 3,765 0.48 % 6.94 988 0.34 % 3.11
dok, dok, dok , 7,725 0.52 % 6.81 3 0.03 % 0.08 0 0 % 0 3 0.06 % 0.76 49 0.03 % 1.14 78 0.03 % 0.42 7,494 0.95 % 13.81 98 0.03 % 0.31
nad, nad, nad , 7,233 0.48 % 6.37 3 0.03 % 0.08 0 0 % 0 2 0.04 % 0.51 2 0 % 0.05 23 0.01 % 0.12 7,181 0.91 % 13.23 22 0.01 % 0.07
angl, angl, ang l, 7,232 0.48 % 6.37 8 0.07 % 0.20 0 0 % 0 26 0.52 % 6.57 1,407 0.90 % 32.77 2,817 1.17 % 15.03 2,390 0.30 % 4.40 584 0.20 % 1.84
ul, ul, ul, 6,190 0.41 % 5.46 3 0.03 % 0.08 0 0 % 0 2 0.04 % 0.51 58 0.04 % 1.35 1,201 0.50 % 6.41 4,418 0.56 % 8.14 508 0.17 % 1.60
ml, ml, ml, 5,973 0.40 % 5.26 55 0.47 % 1.38 0 0 % 0 4 0.08 % 1.01 74 0.05 % 1.72 455 0.19 % 2.43 4,939 0.62 % 9.10 446 0.15 % 1.40
doc, doc, doc , 5,756 0.38 % 5.07 1 0.01 % 0.03 0 0 % 0 3 0.06 % 0.76 53 0.03 % 1.23 1,157 0.48 % 6.17 3,086 0.39 % 5.69 1,456 0.49 % 4.58
dipl, dipl, dip l, 5,602 0.37 % 4.94 9 0.08 % 0.23 0 0 % 0 1 0.02 % 0.25 81 0.05 % 1.89 819 0.34 % 4.37 4,280 0.54 % 7.89 412 0.14 % 1.30

1.2.224 List of initial character-level 4-grams from abbreviation lemmas in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-word_parts-abbreviations-lemmas-initial-4grams-taxonomy-entire.tsv
Columns: Lemma | Lemma (lower-case) | Initial part of the word | Rest of the word | Total absolute frequency of lemma | Percentage of all found lemmas | Total relative frequency (per million occurrences) | Absolute frequency [SSJ.T.K.L] | Percentage [SSJ.T.K.L] | Relative frequency [SSJ.T.K.L] | Absolute frequency [SSJ.T.K.N] | Percentage [SSJ.T.K.N] | Relative frequency [SSJ.T.K.N] | Absolute frequency [SSJ.T.D] | Percentage [SSJ.T.D] | Relative frequency [SSJ.T.D] | Absolute frequency [SSJ.T.K.S] | Percentage [SSJ.T.K.S] | Relative frequency [SSJ.T.K.S] | Absolute frequency [SSJ.T.P.R] | Percentage [SSJ.T.P.R] | Relative frequency [SSJ.T.P.R] | Absolute frequency [SSJ.T.P.C] | Percentage [SSJ.T.P.C] | Relative frequency [SSJ.T.P.C] | Absolute frequency [SSJ.I] | Percentage [SSJ.I] | Relative frequency [SSJ.I]

npr, npr, npr, 92,015 13.93 % 81.09 314 6.52 % 7.91 0 0 % 0 311 19.23 % 78.62 22,440 23.11 % 522.65 29,590 25.14 % 157.88 24,837 7.02 % 45.76 14,523 16.99 % 45.68
itd, itd, itd, 65,475 9.91 % 57.70 578 12.01 % 14.55 2 100.00 % 206.04 195 12.06 % 49.29 5,517 5.68 % 128.50 14,609 12.41 % 77.95 36,287 10.25 % 66.86 8,287 9.69 % 26.06
tel, tel, tel, 59,340 8.98 % 52.30 5 0.10 % 0.13 0 0 % 0 12 0.74 % 3.03 1,487 1.53 % 34.63 7,401 6.29 % 39.49 47,321 13.37 % 87.19 3,114 3.64 % 9.79
prof, prof, prof , 48,612 7.36 % 42.84 47 0.98 % 1.18 0 0 % 0 26 1.61 % 6.57 741 0.76 % 17.26 5,305 4.51 % 28.31 34,903 9.86 % 64.31 7,590 8.88 % 23.87
str, str, str, 47,194 7.14 % 41.59 1,110 23.06 % 27.95 0 0 % 0 33 2.04 % 8.34 28,190 29.03 % 656.58 10,527 8.94 % 56.17 3,788 1.07 % 6.98 3,546 4.15 % 11.15
mag, mag, mag, 40,143 6.08 % 35.38 17 0.35 % 0.43 0 0 % 0 65 4.02 % 16.43 471 0.48 % 10.97 5,620 4.78 % 29.99 29,459 8.32 % 54.28 4,511 5.28 % 14.19
ipd, ipd, ipd, 29,413 4.45 % 25.92 161 3.35 % 4.05 0 0 % 0 178 11.01 % 45 5,227 5.38 % 121.74 7,015 5.96 % 37.43 12,438 3.52 % 22.92 4,394 5.14 % 13.82
itn, itn, itn, 16,421 2.49 % 14.47 132 2.74 % 3.32 0 0 % 0 157 9.71 % 39.69 2,446 2.52 % 56.97 3,546 3.01 % 18.92 8,494 2.40 % 15.65 1,646 1.93 % 5.18
odst, odst, odst , 16,149 2.44 % 14.23 1 0.02 % 0.03 0 0 % 0 1 0.06 % 0.25 46 0.05 % 1.07 256 0.22 % 1.37 10,780
3.05 % 19.86 5,065 5.92 % 15.93
med, med, med, 10,326 1.56 % 9.10 7 0.14 % 0.18 0 0 % 0 0 0 % 0 53 0.06 % 1.23 3,586 3.05 % 19.13 5,652 1.60 % 10.41 1,028 1.20 % 3.23
nan, nan, nan, 9,178 1.39 % 8.09 3 0.06 % 0.08 0 0 % 0 0 0 % 0 3 0 % 0.07 18 0.01 % 0.10 9,052 2.56 % 16.68 102 0.12 % 0.32
prim, prim, prim , 8,714 1.32 % 7.68 94 1.95 % 2.37 0 0 % 0 11 0.68 % 2.78 3,218 3.31 % 74.95 1,535 1.30 % 8.19 2,949 0.83 % 5.43 907 1.06 % 2.85
amer, amer, amer , 8,026 1.22 % 7.07 0 0 % 0 0 0 % 0 0 0 % 0 11 0.01 % 0.26 5 0 % 0.03 8,006 2.26 % 14.75 4 0.01 % 0.01
idr, idr, idr, 7,791 1.18 % 6.87 33 0.69 % 0.83 0 0 % 0 33 2.04 % 8.34 1,650 1.70 % 38.43 1,322 1.12 % 7.05 3,765 1.06 % 6.94 988 1.16 % 3.11
dok, dok, dok, 7,725 1.17 % 6.81 3 0.06 % 0.08 0 0 % 0 3 0.19 % 0.76 49 0.05 % 1.14 78 0.07 % 0.42 7,494 2.12 % 13.81 98 0.12 % 0.31
nad, nad, nad, 7,233 1.09 % 6.37 3 0.06 % 0.08 0 0 % 0 2 0.12 % 0.51 2 0 % 0.05 23 0.02 % 0.12 7,181 2.03 % 13.23 22 0.03 % 0.07
angl, angl, angl , 7,232 1.09 % 6.37 8 0.17 % 0.20 0 0 % 0 26 1.61 % 6.57 1,407 1.45 % 32.77 2,817 2.39 % 15.03 2,390 0.68 % 4.40 584 0.68 % 1.84
doc, doc, doc, 5,756 0.87 % 5.07 1 0.02 % 0.03 0 0 % 0 3 0.19 % 0.76 53 0.06 % 1.23 1,157 0.98 % 6.17 3,086 0.87 % 5.69 1,456 1.70 % 4.58
dipl, dipl, dipl , 5,602 0.85 % 4.94 9 0.19 % 0.23 0 0 % 0 1 0.06 % 0.25 81 0.08 % 1.89 819 0.70 % 4.37 4,280 1.21 % 7.89 412 0.48 % 1.30
inž, inž, inž, 5,119 0.78 % 4.51 4 0.08 % 0.10 0 0 % 0 1 0.06 % 0.25 43 0.04 % 1 468 0.40 % 2.50 4,449 1.26 % 8.20 154 0.18 % 0.48
pon, pon, pon, 4,874 0.74 % 4.30 0 0 % 0 0 0 % 0 4 0.25 % 1.01 104 0.11 % 2.42 137 0.12 % 0.73 4,611 1.30 % 8.50 18 0.02 % 0.06
roj, roj, roj, 4,310 0.65 % 3.80 29 0.60 % 0.73 0 0 % 0 3 0.19 % 0.76 901 0.93 % 20.99 116 0.10 % 0.62 2,632 0.74 % 4.85 629 0.74 % 1.98
parc, parc, parc , 3,955 0.60 % 3.49 0 0 % 0 0 0 % 0 4 0.25 % 1.01 19 0.02 % 0.44 9 0.01 % 0.05 1,470 0.41 % 2.71 2,453 2.87 % 7.72
naniz, naniz, nani z, 3,743 0.57 % 3.30 0 0 % 0 0 0 % 0 0 0 % 0 0 0 % 0 0 0 % 0 3,743 1.06 % 6.90 0 0 % 0
nem, nem, nem, 3,739 0.57 % 3.30 43 0.89 % 1.08 0 0 % 0 4 0.25 % 1.01 436 0.45 % 10.15 175 0.15 % 0.93 2,975 0.84 % 5.48 106 0.12 % 0.33
brit, brit, brit , 3,261 0.49 % 2.87 0 0 % 0 0 0 % 0 0 0 % 0 56 0.06 % 1.30 21 0.02 % 0.11 738 0.21 % 1.36 2,446 2.86 % 7.69
msgr, msgr, msgr , 3,221 0.49 % 2.84 16 0.33 % 0.40 0 0 % 0 0 0 % 0 19 0.02 % 0.44 90 0.08 % 0.48 2,980 0.84 % 5.49 116 0.14 % 0.36
opr, opr, opr, 3,179 0.48 % 2.80 0 0 % 0 0 0 % 0 0 0 % 0 357 0.37 % 8.31 22 0.02 % 0.12 204 0.06 % 0.38 2,596 3.04 % 8.17
slov, slov, slov , 2,932 0.44 % 2.58 37 0.77 % 0.93 0 0 % 0 1 0.06 % 0.25 630 0.65 % 14.67 239 0.20 % 1.28 1,789 0.51 % 3.30 236 0.28 % 0.74
univ, univ, univ , 2,922 0.44 % 2.58 5 0.10 % 0.13 0 0 % 0 2 0.12 % 0.51 89 0.09 % 2.07 463 0.39 % 2.47 2,067 0.58 % 3.81 296 0.35 % 0.93
odd, odd, odd, 2,739 0.41 % 2.41 1 0.02 % 0.03 0 0 % 0 0 0 % 0 11 0.01 % 0.26 7 0.01 % 0.04 2,661 0.75 % 4.90 59 0.07 % 0.19
stol, stol, stol , 2,715 0.41 % 2.39 12 0.25 % 0.30 0 0 % 0 0 0 % 0 527 0.54 % 12.27 1,314 1.12 % 7.01 708 0.20 % 1.30 154 0.18 % 0.48

1.2.225 List of initial character-level 5-grams from abbreviation lemmas in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-word_parts-abbreviations-lemmas-initial-5grams-taxonomy-entire.tsv
Columns: Lemma | Lemma (lower-case) | Initial part of the word | Rest of the word | Total absolute frequency of lemma | Percentage of all found lemmas | Total relative frequency (per million occurrences) | Absolute frequency [SSJ.T.K.L] | Percentage [SSJ.T.K.L] | Relative frequency [SSJ.T.K.L] | Absolute frequency [SSJ.T.K.N] | Percentage [SSJ.T.K.N] | Relative frequency [SSJ.T.K.N] | Absolute frequency [SSJ.T.D] | Percentage [SSJ.T.D] | Relative frequency [SSJ.T.D] | Absolute frequency [SSJ.T.K.S] | Percentage [SSJ.T.K.S] | Relative frequency [SSJ.T.K.S] | Absolute frequency [SSJ.T.P.R] | Percentage [SSJ.T.P.R] | Relative frequency [SSJ.T.P.R] | Absolute frequency [SSJ.T.P.C] | Percentage [SSJ.T.P.C] | Relative frequency [SSJ.T.P.C] | Absolute frequency [SSJ.I] | Percentage [SSJ.I] | Relative frequency [SSJ.I]

prof, prof, prof, 48,612 29.71 % 42.84 47 5.12 % 1.18 0 0 % 0 26 10.48 % 6.57 741 4.56 % 17.26 5,305 28.32 % 28.31 34,903 34.39 % 64.31 7,590 29.21 % 23.87
odst, odst, odst, 16,149 9.87 % 14.23 1 0.11 % 0.03 0 0 % 0 1 0.40 % 0.25 46 0.28 % 1.07 256 1.37 % 1.37 10,780 10.62 % 19.86 5,065 19.49 % 15.93
prim, prim, prim, 8,714 5.33 % 7.68 94 10.25 % 2.37 0 0 % 0 11 4.43 % 2.78 3,218 19.80 % 74.95 1,535 8.20 % 8.19 2,949 2.91 % 5.43 907 3.49 % 2.85
amer, amer, amer, 8,026 4.91 % 7.07 0 0 % 0 0 0 % 0 0 0 % 0 11 0.07 % 0.26 5 0.03 % 0.03 8,006 7.89 % 14.75 4 0.01 % 0.01
angl, angl, angl, 7,232 4.42 % 6.37 8 0.87 % 0.20 0 0 % 0 26 10.48 % 6.57 1,407 8.66 % 32.77 2,817 15.04 % 15.03 2,390 2.35 % 4.40 584 2.25 % 1.84
dipl, dipl, dipl, 5,602 3.42 % 4.94 9 0.98 % 0.23 0 0 % 0 1 0.40 % 0.25 81 0.50 % 1.89 819 4.37 % 4.37 4,280 4.22 % 7.89 412 1.59 % 1.30
parc, parc, parc, 3,955 2.42 % 3.49 0 0 % 0 0 0 % 0 4 1.61 % 1.01 19 0.12 % 0.44 9 0.05 % 0.05 1,470 1.45 % 2.71 2,453 9.44 % 7.72
naniz, naniz, naniz , 3,743 2.29 % 3.30 0 0 % 0 0 0 % 0 0 0 % 0 0 0 % 0 0 0 % 0 3,743 3.69 % 6.90 0 0 % 0
brit, brit, brit, 3,261 1.99 % 2.87 0 0 % 0 0 0 % 0 0 0 % 0 56 0.34 % 1.30 21 0.11 % 0.11 738 0.73 % 1.36 2,446 9.41 % 7.69
msgr, msgr, msgr, 3,221 1.97 % 2.84 16 1.75 % 0.40 0 0 % 0 0 0 % 0 19 0.12 % 0.44 90 0.48 % 0.48 2,980 2.94 % 5.49 116 0.45 % 0.36
slov, slov, slov, 2,932 1.79 % 2.58 37 4.04 % 0.93 0 0 % 0 1 0.40 % 0.25 630 3.88 % 14.67 239 1.28 % 1.28 1,789 1.76 % 3.30 236 0.91 % 0.74
univ, univ, univ, 2,922 1.79 % 2.58 5 0.55 % 0.13 0 0 % 0 2 0.81 % 0.51 89 0.55 % 2.07 463 2.47 % 2.47 2,067 2.04 % 3.81 296 1.14 % 0.93
stol, stol, stol, 2,715 1.66 % 2.39 12 1.31 % 0.30 0 0 % 0 0 0 % 0 527 3.24 % 12.27 1,314 7.01 % 7.01 708 0.70 % 1.30 154 0.59 % 0.48
prev, prev, prev, 2,524 1.54 % 2.22 246 26.83 % 6.19 0 0 % 0 5 2.02 % 1.26 1,713 10.54 % 39.90 150
0.80 % 0.80 350 0.34 % 0.64 60 0.23 % 0.19 ibid, ibid, ibid, 2,175 1.33 % 1.92 0 0 % 0 0 0 % 0 2 0.81 % 0.51 1,945 11.97 % 45.30 145 0.77 % 0.77 8 0.01 % 0.01 75 0.29 % 0.24 ponov, ponov, ponov , 1,907 1.17 % 1.68 1 0.11 % 0.03 0 0 % 0 0 0 % 0 0 0 % 0 0 0 % 0 1,906 1.88 % 3.51 0 0 % 0 nadalj, nadalj, nadal j, 1,606 0.98 % 1.42 0 0 % 0 0 0 % 0 1 0.40 % 0.25 2 0.01 % 0.05 2 0.01 % 0.01 1,457 1.44 % 2.68 144 0.55 % 0.45 asist, asist, asist , 1,359 0.83 % 1.20 0 0 % 0 0 0 % 0 0 0 % 0 8 0.05 % 0.19 496 2.65 % 2.65 513 0.51 % 0.95 342 1.32 % 1.08 spec, spec, spec, 1,274 0.78 % 1.12 2 0.22 % 0.05 0 0 % 0 0 0 % 0 20 0.12 % 0.47 336 1.79 % 1.79 660 0.65 % 1.22 256 0.98 % 0.81 film, film, film, 1,206 0.74 % 1.06 0 0 % 0 0 0 % 0 1 0.40 % 0.25 55 0.34 % 1.28 17 0.09 % 0.09 1,000 0.98 % 1.84 133 0.51 % 0.42 štev, štev, štev, 1,168 0.71 % 1.03 16 1.75 % 0.40 0 0 % 0 9 3.63 % 2.28 80 0.49 % 1.86 161 0.86 % 0.86 751 0.74 % 1.38 151 0.58 % 0.47 pribl, pribl, pribl , 1,089 0.67 % 0.96 0 0 % 0 0 0 % 0 20 8.06 % 5.06 140 0.86 % 3.26 335 1.79 % 1.79 325 0.32 % 0.60 269 1.03 % 0.85 nasl, nasl, nasl, 966 0.59 % 0.85 1 0.11 % 0.03 0 0 % 0 1 0.40 % 0.25 131 0.81 % 3.05 52 0.28 % 0.28 12 0.01 % 0.02 769 2.96 % 2.42 franc, franc, franc , 896 0.55 % 0.79 0 0 % 0 0 0 % 0 0 0 % 0 55 0.34 % 1.28 14 0.07 % 0.07 818 0.81 % 1.51 9 0.04 % 0.03 akad, akad, akad, 876 0.54 % 0.77 1 0.11 % 0.03 0 0 % 0 2 0.81 % 0.51 74 0.46 % 1.72 26 0.14 % 0.14 656 0.65 % 1.21 117 0.45 % 0.37 avstral, avstral, avstr al, 818 0.50 % 0.72 0 0 % 0 0 0 % 0 0 0 % 0 8 0.05 % 0.19 1 0.01 % 0.01 807 0.80 % 1.49 2 0.01 % 0.01 pogl, pogl, pogl, 740 0.45 % 0.65 8 0.87 % 0.20 0 0 % 0 1 0.40 % 0.25 670 4.12 % 15.61 45 0.24 % 0.24 5 0.01 % 0.01 11 0.04 % 0.03 pren, pren, pren, 713 0.44 % 0.63 0 0 % 0 0 0 % 0 0 0 % 0 1 0.01 % 0.02 0 0 % 0 693 0.68 % 1.28 19 0.07 % 0.06 alter, alter, alter , 638 0.39 % 0.56 0 0 % 0 0 0 % 0 0 0 % 0 0 0 % 0 12 0.06 % 0.06 624 0.61 % 1.15 2 0.01 % 0.01 šport, šport, šport , 612 0.37 % 0.54 2 0.22 % 0.05 0 
0 % 0 0 0 % 0 3 0.02 % 0.07 67 0.36 % 0.36 530 0.52 % 0.98 10 0.04 % 0.03
posn, posn, posn, 597 0.36 % 0.53 0 0 % 0 0 0 % 0 0 0 % 0 0 0 % 0 1 0.01 % 0.01 596 0.59 % 1.10 0 0 % 0
madž, madž, madž, 552 0.34 % 0.49 3 0.33 % 0.08 0 0 % 0 0 0 % 0 114 0.70 % 2.66 4 0.02 % 0.02 401 0.40 % 0.74 30 0.12 % 0.09

1.2.226 List of final character-level 1-grams from abbreviation lemmas in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-word_parts-abbreviations-lemmas-final-1grams-taxonomy-entire.tsv
Lemma; Lemma (lower-case); Rest of the word; Final part of the word; Total absolute frequency of lemma; Percentage of all found lemmas; Total relative frequency (per million occurrences); Absolute frequency [SSJ.T.K.L]; Percentage [SSJ.T.K.L]; Relative frequency [SSJ.T.K.L]; Absolute frequency [SSJ.T.K.N]; Percentage [SSJ.T.K.N]; Relative frequency [SSJ.T.K.N]; Absolute frequency [SSJ.T.D]; Percentage [SSJ.T.D]; Relative frequency [SSJ.T.D]; Absolute frequency [SSJ.T.K.S]; Percentage [SSJ.T.K.S]; Relative frequency [SSJ.T.K.S]; Absolute frequency [SSJ.T.P.R]; Percentage [SSJ.T.P.R]; Relative frequency [SSJ.T.P.R]; Absolute frequency [SSJ.T.P.C]; Percentage [SSJ.T.P.C]; Relative frequency [SSJ.T.P.C]; Absolute frequency [SSJ.I]; Percentage [SSJ.I]; Relative frequency [SSJ.I]
dr, dr, dr , 322,077 10.23 % 283.84 2,425 9.78 % 61.06 0 0 % 0 476 5.17 % 120.33 4,941 1.97 % 115.08 51,452 11.39 % 274.53 222,930 11.94 % 410.76 39,853 7.31 % 125.35
oz, oz, oz , 136,593 4.34 % 120.38 226 0.91 % 5.69 0 0 % 0 329 3.57 % 83.17 11,031 4.41 % 256.93 22,364 4.95 % 119.33 40,351 2.16 % 74.35 62,292 11.43 % 195.92
d, d, d , 133,611 4.24 % 117.75 213 0.86 % 5.36 0 0 % 0 567 6.16 % 143.33 1,152 0.46 % 26.83 20,886 4.62 % 111.44 84,570 4.53 % 155.83 26,223 4.81 % 82.48
o, o, o , 110,895 3.52 % 97.73 142 0.57 % 3.58 0 0 % 0 182 1.98 % 46.01 1,407 0.56 % 32.77 21,336 4.72 % 113.84 70,116 3.75 % 129.19 17,712 3.25 % 55.71
M, m,
M , 110,881 3.52 % 97.72 522 2.10 % 13.14 0 0 % 0 61 0.66 % 15.42 5,916 2.36 % 137.79 9,522 2.11 % 50.81 85,646 4.59 % 157.81 9,214 1.69 % 28.98
št, št, št , 94,645 3.01 % 83.41 941 3.79 % 23.69 0 0 % 0 1,913 20.77 % 483.59 7,171 2.87 % 167.02 11,553 2.56 % 61.64 28,295 1.51 % 52.14 44,772 8.21 % 140.82
npr, npr, npr , 92,015 2.92 % 81.09 314 1.27 % 7.91 0 0 % 0 311 3.38 % 78.62 22,440 8.97 % 522.65 29,590 6.55 % 157.88 24,837 1.33 % 45.76 14,523 2.66 % 45.68
J, j, J , 84,562 2.69 % 74.52 1,071 4.32 % 26.97 0 0 % 0 69 0.75 % 17.44 6,885 2.75 % 160.36 7,985 1.77 % 42.61 59,838 3.21 % 110.26 8,714 1.60 % 27.41
A, a, A , 81,577 2.59 % 71.89 1,198 4.83 % 30.16 0 0 % 0 191 2.07 % 48.28 6,513 2.60 % 151.70 8,889 1.97 % 47.43 49,090 2.63 % 90.45 15,696 2.88 % 49.37
S, s, S , 80,813 2.57 % 71.22 488 1.97 % 12.29 0 0 % 0 57 0.62 % 14.41 3,722 1.49 % 86.69 7,148 1.58 % 38.14 62,235 3.33 % 114.67 7,163 1.31 % 22.53
t, t, t , 79,373 2.52 % 69.95 168 0.68 % 4.23 0 0 % 0 99 1.07 % 25.03 4,648 1.86 % 108.26 14,460 3.20 % 77.15 34,619 1.85 % 63.79 25,379 4.66 % 79.82
B, b, B , 75,217 2.39 % 66.29 1,095 4.42 % 27.57 0 0 % 0 242 2.63 % 61.17 3,566 1.43 % 83.06 6,177 1.37 % 32.96 54,738 2.93 % 100.86 9,399 1.72 % 29.56
i, i, i , 68,165 2.17 % 60.07 158 0.64 % 3.98 0 0 % 0 90 0.98 % 22.75 3,525 1.41 % 82.10 12,936 2.86 % 69.02 27,139 1.45 % 50.01 24,317 4.46 % 76.48
itd, itd, itd , 65,475 2.08 % 57.70 578 2.33 % 14.55 2 5.71 % 206.04 195 2.12 % 49.29 5,517 2.21 % 128.50 14,609 3.23 % 77.95 36,287 1.94 % 66.86 8,287 1.52 % 26.06
sv, sv, sv , 65,377 2.08 % 57.62 688 2.77 % 17.32 0 0 % 0 57 0.62 % 14.41 5,754 2.30 % 134.02 6,229 1.38 % 33.24 44,951 2.41 % 82.83 7,698 1.41 % 24.21
D, d, D , 63,685 2.02 % 56.13 415 1.67 % 10.45 0 0 % 0 92 1.00 % 23.26 3,471 1.39 % 80.84 5,629 1.25 % 30.03 47,370 2.54 % 87.28 6,708 1.23 % 21.10
P, p, P , 61,583 1.96 % 54.27 371 1.50 % 9.34 0 0 % 0 68 0.74 % 17.19 2,848 1.14 % 66.33 6,019 1.33 % 32.12 47,303 2.53 % 87.16 4,974 0.91 % 15.64
tel, tel, tel ,
59,340 1.89 % 52.30 5 0.02 % 0.13 0 0 % 0 12 0.13 % 3.03 1,487 0.59 % 34.63 7,401 1.64 % 39.49 47,321 2.53 % 87.19 3,114 0.57 % 9.79
K, k, K , 49,179 1.56 % 43.34 1,046 4.22 % 26.34 0 0 % 0 48 0.52 % 12.13 1,780 0.71 % 41.46 4,816 1.07 % 25.70 37,122 1.99 % 68.40 4,367 0.80 % 13.74
prof, prof, prof , 48,612 1.54 % 42.84 47 0.19 % 1.18 0 0 % 0 26 0.28 % 6.57 741 0.30 % 17.26 5,305 1.17 % 28.31 34,903 1.87 % 64.31 7,590 1.39 % 23.87
str, str, str , 47,194 1.50 % 41.59 1,110 4.48 % 27.95 0 0 % 0 33 0.36 % 8.34 28,190 11.27 % 656.58 10,527 2.33 % 56.17 3,788 0.20 % 6.98 3,546 0.65 % 11.15
I, i, I , 44,515 1.41 % 39.23 441 1.78 % 11.10 0 0 % 0 137 1.49 % 34.63 3,184 1.27 % 74.16 4,534 1.00 % 24.19 30,623 1.64 % 56.42 5,596 1.03 % 17.60
V, v, V , 43,853 1.39 % 38.65 342 1.38 % 8.61 0 0 % 0 84 0.91 % 21.23 2,017 0.81 % 46.98 5,216 1.16 % 27.83 30,139 1.61 % 55.53 6,055 1.11 % 19.04
R, r, R , 43,194 1.37 % 38.07 533 2.15 % 13.42 0 0 % 0 40 0.43 % 10.11 3,135 1.25 % 73.02 5,154 1.14 % 27.50 29,209 1.56 % 53.82 5,123 0.94 % 16.11
G, g, G , 42,062 1.34 % 37.07 555 2.24 % 13.97 0 0 % 0 44 0.48 % 11.12 2,731 1.09 % 63.61 4,157 0.92 % 22.18 30,793 1.65 % 56.74 3,782 0.69 % 11.90
mag, mag, mag , 40,143 1.27 % 35.38 17 0.07 % 0.43 0 0 % 0 65 0.71 % 16.43 471 0.19 % 10.97 5,620 1.24 % 29.99 29,459 1.58 % 54.28 4,511 0.83 % 14.19
C, c, C , 37,814 1.20 % 33.33 339 1.37 % 8.54 33 94.29 % 3,399.61 132 1.43 % 33.37 4,770 1.91 % 111.10 7,426 1.64 % 39.62 18,868 1.01 % 34.77 6,246 1.15 % 19.65
L, l, L , 37,102 1.18 % 32.70 793 3.20 % 19.97 0 0 % 0 65 0.71 % 16.43 3,527 1.41 % 82.15 3,661 0.81 % 19.53 25,324 1.36 % 46.66 3,732 0.69 % 11.74
st, st, st , 36,399 1.16 % 32.08 947 3.82 % 23.84 0 0 % 0 30 0.33 % 7.58 1,228 0.49 % 28.60 3,431 0.76 % 18.31 19,248 1.03 % 35.47 11,515 2.11 % 36.22
p, p, p , 35,939 1.14 % 31.67 96 0.39 % 2.42 0 0 % 0 68 0.74 % 17.19 703 0.28 % 16.37 6,478 1.43 % 34.56 20,936 1.12 % 38.58 7,658 1.41 % 24.09
op, op, op , 35,928 1.14 % 31.66 454 1.83 % 11.43 0 0 % 0 8
0.09 % 2.02 3,481 1.39 % 81.08 4,702 1.04 % 25.09 15,464 0.83 % 28.49 11,819 2.17 % 37.17
T, t, T , 34,839 1.11 % 30.70 350 1.41 % 8.81 0 0 % 0 35 0.38 % 8.85 1,987 0.79 % 46.28 4,478 0.99 % 23.89 23,035 1.23 % 42.44 4,954 0.91 % 15.58

1.2.227 List of final character-level 2-grams from abbreviation lemmas in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-word_parts-abbreviations-lemmas-final-2grams-taxonomy-entire.tsv
Lemma; Lemma (lower-case); Rest of the word; Final part of the word; Total absolute frequency of lemma; Percentage of all found lemmas; Total relative frequency (per million occurrences); Absolute frequency [SSJ.T.K.S]; Percentage [SSJ.T.K.S]; Relative frequency [SSJ.T.K.S]; Absolute frequency [SSJ.I]; Percentage [SSJ.I]; Relative frequency [SSJ.I]; Absolute frequency [SSJ.T.K.N]; Percentage [SSJ.T.K.N]; Relative frequency [SSJ.T.K.N]; Absolute frequency [SSJ.T.P.C]; Percentage [SSJ.T.P.C]; Relative frequency [SSJ.T.P.C]; Absolute frequency [SSJ.T.D]; Percentage [SSJ.T.D]; Relative frequency [SSJ.T.D]; Absolute frequency [SSJ.T.P.R]; Percentage [SSJ.T.P.R]; Relative frequency [SSJ.T.P.R]; Absolute frequency [SSJ.T.K.L]; Percentage [SSJ.T.K.L]; Relative frequency [SSJ.T.K.L]
dr, dr, d r, 322,077 10.23 % 283.84 4,941 1.97 % 115.08 39,853 7.31 % 125.35 0 0 % 0 222,930 11.94 % 410.76 476 5.17 % 120.33 51,452 11.39 % 274.53 2,425 9.78 % 61.06
oz, oz, o z, 136,593 4.34 % 120.38 11,031 4.41 % 256.93 62,292 11.43 % 195.92 0 0 % 0 40,351 2.16 % 74.35 329 3.57 % 83.17 22,364 4.95 % 119.33 226 0.91 % 5.69
d, d, d, 133,611 4.24 % 117.75 1,152 0.46 % 26.83 26,223 4.81 % 82.48 0 0 % 0 84,570 4.53 % 155.83 567 6.16 % 143.33 20,886 4.62 % 111.44 213 0.86 % 5.36
o, o, o, 110,895 3.52 % 97.73 1,407 0.56 % 32.77 17,712 3.25 % 55.71 0 0 % 0 70,116 3.75 % 129.19 182 1.98 % 46.01 21,336 4.72 % 113.84 142 0.57 % 3.58
M, m, M, 110,881 3.52 % 97.72 5,916 2.36 % 137.79 9,214 1.69 % 28.98 0 0 %
0 85,646 4.59 % 157.81 61 0.66 % 15.42 9,522 2.11 % 50.81 522 2.10 % 13.14
št, št, š t, 94,645 3.01 % 83.41 7,171 2.87 % 167.02 44,772 8.21 % 140.82 0 0 % 0 28,295 1.51 % 52.14 1,913 20.77 % 483.59 11,553 2.56 % 61.64 941 3.79 % 23.69
npr, npr, np r, 92,015 2.92 % 81.09 22,440 8.97 % 522.65 14,523 2.66 % 45.68 0 0 % 0 24,837 1.33 % 45.76 311 3.38 % 78.62 29,590 6.55 % 157.88 314 1.27 % 7.91
J, j, J, 84,562 2.69 % 74.52 6,885 2.75 % 160.36 8,714 1.60 % 27.41 0 0 % 0 59,838 3.21 % 110.26 69 0.75 % 17.44 7,985 1.77 % 42.61 1,071 4.32 % 26.97
A, a, A, 81,577 2.59 % 71.89 6,513 2.60 % 151.70 15,696 2.88 % 49.37 0 0 % 0 49,090 2.63 % 90.45 191 2.07 % 48.28 8,889 1.97 % 47.43 1,198 4.83 % 30.16
S, s, S, 80,813 2.57 % 71.22 3,722 1.49 % 86.69 7,163 1.31 % 22.53 0 0 % 0 62,235 3.33 % 114.67 57 0.62 % 14.41 7,148 1.58 % 38.14 488 1.97 % 12.29
t, t, t, 79,373 2.52 % 69.95 4,648 1.86 % 108.26 25,379 4.66 % 79.82 0 0 % 0 34,619 1.85 % 63.79 99 1.07 % 25.03 14,460 3.20 % 77.15 168 0.68 % 4.23
B, b, B, 75,217 2.39 % 66.29 3,566 1.43 % 83.06 9,399 1.72 % 29.56 0 0 % 0 54,738 2.93 % 100.86 242 2.63 % 61.17 6,177 1.37 % 32.96 1,095 4.42 % 27.57
i, i, i, 68,165 2.17 % 60.07 3,525 1.41 % 82.10 24,317 4.46 % 76.48 0 0 % 0 27,139 1.45 % 50.01 90 0.98 % 22.75 12,936 2.86 % 69.02 158 0.64 % 3.98
itd, itd, it d, 65,475 2.08 % 57.70 5,517 2.21 % 128.50 8,287 1.52 % 26.06 2 5.71 % 206.04 36,287 1.94 % 66.86 195 2.12 % 49.29 14,609 3.23 % 77.95 578 2.33 % 14.55
sv, sv, s v, 65,377 2.08 % 57.62 5,754 2.30 % 134.02 7,698 1.41 % 24.21 0 0 % 0 44,951 2.41 % 82.83 57 0.62 % 14.41 6,229 1.38 % 33.24 688 2.77 % 17.32
D, d, D, 63,685 2.02 % 56.13 3,471 1.39 % 80.84 6,708 1.23 % 21.10 0 0 % 0 47,370 2.54 % 87.28 92 1.00 % 23.26 5,629 1.25 % 30.03 415 1.67 % 10.45
P, p, P, 61,583 1.96 % 54.27 2,848 1.14 % 66.33 4,974 0.91 % 15.64 0 0 % 0 47,303 2.53 % 87.16 68 0.74 % 17.19 6,019 1.33 % 32.12 371 1.50 % 9.34
tel, tel, te l, 59,340 1.89 % 52.30 1,487 0.59 % 34.63 3,114 0.57 % 9.79 0 0 % 0 47,321 2.53 %
87.19 12 0.13 % 3.03 7,401 1.64 % 39.49 5 0.02 % 0.13
K, k, K, 49,179 1.56 % 43.34 1,780 0.71 % 41.46 4,367 0.80 % 13.74 0 0 % 0 37,122 1.99 % 68.40 48 0.52 % 12.13 4,816 1.07 % 25.70 1,046 4.22 % 26.34
prof, prof, pro f, 48,612 1.54 % 42.84 741 0.30 % 17.26 7,590 1.39 % 23.87 0 0 % 0 34,903 1.87 % 64.31 26 0.28 % 6.57 5,305 1.17 % 28.31 47 0.19 % 1.18
str, str, st r, 47,194 1.50 % 41.59 28,190 11.27 % 656.58 3,546 0.65 % 11.15 0 0 % 0 3,788 0.20 % 6.98 33 0.36 % 8.34 10,527 2.33 % 56.17 1,110 4.48 % 27.95
I, i, I, 44,515 1.41 % 39.23 3,184 1.27 % 74.16 5,596 1.03 % 17.60 0 0 % 0 30,623 1.64 % 56.42 137 1.49 % 34.63 4,534 1.00 % 24.19 441 1.78 % 11.10
V, v, V, 43,853 1.39 % 38.65 2,017 0.81 % 46.98 6,055 1.11 % 19.04 0 0 % 0 30,139 1.61 % 55.53 84 0.91 % 21.23 5,216 1.16 % 27.83 342 1.38 % 8.61
R, r, R, 43,194 1.37 % 38.07 3,135 1.25 % 73.02 5,123 0.94 % 16.11 0 0 % 0 29,209 1.56 % 53.82 40 0.43 % 10.11 5,154 1.14 % 27.50 533 2.15 % 13.42
G, g, G, 42,062 1.34 % 37.07 2,731 1.09 % 63.61 3,782 0.69 % 11.90 0 0 % 0 30,793 1.65 % 56.74 44 0.48 % 11.12 4,157 0.92 % 22.18 555 2.24 % 13.97
mag, mag, ma g, 40,143 1.27 % 35.38 471 0.19 % 10.97 4,511 0.83 % 14.19 0 0 % 0 29,459 1.58 % 54.28 65 0.71 % 16.43 5,620 1.24 % 29.99 17 0.07 % 0.43
C, c, C, 37,814 1.20 % 33.33 4,770 1.91 % 111.10 6,246 1.15 % 19.65 33 94.29 % 3,399.61 18,868 1.01 % 34.77 132 1.43 % 33.37 7,426 1.64 % 39.62 339 1.37 % 8.54
L, l, L, 37,102 1.18 % 32.70 3,527 1.41 % 82.15 3,732 0.69 % 11.74 0 0 % 0 25,324 1.36 % 46.66 65 0.71 % 16.43 3,661 0.81 % 19.53 793 3.20 % 19.97
st, st, s t, 36,399 1.16 % 32.08 1,228 0.49 % 28.60 11,515 2.11 % 36.22 0 0 % 0 19,248 1.03 % 35.47 30 0.33 % 7.58 3,431 0.76 % 18.31 947 3.82 % 23.84
p, p, p, 35,939 1.14 % 31.67 703 0.28 % 16.37 7,658 1.41 % 24.09 0 0 % 0 20,936 1.12 % 38.58 68 0.74 % 17.19 6,478 1.43 % 34.56 96 0.39 % 2.42
op, op, o p, 35,928 1.14 % 31.66 3,481 1.39 % 81.08 11,819 2.17 % 37.17 0 0 % 0 15,464 0.83 % 28.49 8 0.09 % 2.02 4,702 1.04 % 25.09 454 1.83 %
11.43
T, t, T, 34,839 1.11 % 30.70 1,987 0.79 % 46.28 4,954 0.91 % 15.58 0 0 % 0 23,035 1.23 % 42.44 35 0.38 % 8.85 4,478 0.99 % 23.89 350 1.41 % 8.81

1.2.228 List of final character-level 3-grams from abbreviation lemmas in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-word_parts-abbreviations-lemmas-final-3grams-taxonomy-entire.tsv
Lemma; Lemma (lower-case); Rest of the word; Final part of the word; Total absolute frequency of lemma; Percentage of all found lemmas; Total relative frequency (per million occurrences); Absolute frequency [SSJ.T.K.S]; Percentage [SSJ.T.K.S]; Relative frequency [SSJ.T.K.S]; Absolute frequency [SSJ.I]; Percentage [SSJ.I]; Relative frequency [SSJ.I]; Absolute frequency [SSJ.T.K.N]; Percentage [SSJ.T.K.N]; Relative frequency [SSJ.T.K.N]; Absolute frequency [SSJ.T.P.C]; Percentage [SSJ.T.P.C]; Relative frequency [SSJ.T.P.C]; Absolute frequency [SSJ.T.D]; Percentage [SSJ.T.D]; Relative frequency [SSJ.T.D]; Absolute frequency [SSJ.T.P.R]; Percentage [SSJ.T.P.R]; Relative frequency [SSJ.T.P.R]; Absolute frequency [SSJ.T.K.L]; Percentage [SSJ.T.K.L]; Relative frequency [SSJ.T.K.L]
dr, dr, dr, 322,077 21.49 % 283.84 4,941 3.16 % 115.08 39,853 13.50 % 125.35 0 0 % 0 222,930 28.22 % 410.76 476 9.49 % 120.33 51,452 21.42 % 274.53 2,425 20.54 % 61.06
oz, oz, oz, 136,593 9.12 % 120.38 11,031 7.05 % 256.93 62,292 21.10 % 195.92 0 0 % 0 40,351 5.11 % 74.35 329 6.56 % 83.17 22,364 9.31 % 119.33 226 1.92 % 5.69
št, št, št, 94,645 6.32 % 83.41 7,171 4.58 % 167.02 44,772 15.16 % 140.82 0 0 % 0 28,295 3.58 % 52.14 1,913 38.13 % 483.59 11,553 4.81 % 61.64 941 7.97 % 23.69
npr, npr, n pr, 92,015 6.14 % 81.09 22,440 14.34 % 522.65 14,523 4.92 % 45.68 0 0 % 0 24,837 3.15 % 45.76 311 6.20 % 78.62 29,590 12.32 % 157.88 314 2.66 % 7.91
itd, itd, i td, 65,475 4.37 % 57.70 5,517 3.53 % 128.50 8,287 2.81 % 26.06 2 100.00 % 206.04 36,287 4.59 % 66.86 195 3.89 % 49.29 14,609 6.08 %
77.95 578 4.90 % 14.55
sv, sv, sv, 65,377 4.36 % 57.62 5,754 3.68 % 134.02 7,698 2.61 % 24.21 0 0 % 0 44,951 5.69 % 82.83 57 1.14 % 14.41 6,229 2.59 % 33.24 688 5.83 % 17.32
tel, tel, t el, 59,340 3.96 % 52.30 1,487 0.95 % 34.63 3,114 1.05 % 9.79 0 0 % 0 47,321 5.99 % 87.19 12 0.24 % 3.03 7,401 3.08 % 39.49 5 0.04 % 0.13
prof, prof, pr of, 48,612 3.24 % 42.84 741 0.47 % 17.26 7,590 2.57 % 23.87 0 0 % 0 34,903 4.42 % 64.31 26 0.52 % 6.57 5,305 2.21 % 28.31 47 0.40 % 1.18
str, str, s tr, 47,194 3.15 % 41.59 28,190 18.01 % 656.58 3,546 1.20 % 11.15 0 0 % 0 3,788 0.48 % 6.98 33 0.66 % 8.34 10,527 4.38 % 56.17 1,110 9.40 % 27.95
mag, mag, m ag, 40,143 2.68 % 35.38 471 0.30 % 10.97 4,511 1.53 % 14.19 0 0 % 0 29,459 3.73 % 54.28 65 1.30 % 16.43 5,620 2.34 % 29.99 17 0.14 % 0.43
st, st, st, 36,399 2.43 % 32.08 1,228 0.79 % 28.60 11,515 3.90 % 36.22 0 0 % 0 19,248 2.44 % 35.47 30 0.60 % 7.58 3,431 1.43 % 18.31 947 8.02 % 23.84
op, op, op, 35,928 2.40 % 31.66 3,481 2.22 % 81.08 11,819 4.00 % 37.17 0 0 % 0 15,464 1.96 % 28.49 8 0.16 % 2.02 4,702 1.96 % 25.09 454 3.85 % 11.43
ipd, ipd, i pd, 29,413 1.96 % 25.92 5,227 3.34 % 121.74 4,394 1.49 % 13.82 0 0 % 0 12,438 1.57 % 22.92 178 3.55 % 45 7,015 2.92 % 37.43 161 1.36 % 4.05
itn, itn, i tn, 16,421 1.10 % 14.47 2,446 1.56 % 56.97 1,646 0.56 % 5.18 0 0 % 0 8,494 1.07 % 15.65 157 3.13 % 39.69 3,546 1.48 % 18.92 132 1.12 % 3.32
odst, odst, od st, 16,149 1.08 % 14.23 46 0.03 % 1.07 5,065 1.72 % 15.93 0 0 % 0 10,780 1.36 % 19.86 1 0.02 % 0.25 256 0.11 % 1.37 1 0.01 % 0.03
am, am, am, 10,990 0.73 % 9.69 95 0.06 % 2.21 71 0.02 % 0.22 0 0 % 0 10,748 1.36 % 19.80 0 0 % 0 74 0.03 % 0.39 2 0.02 % 0.05
tj, tj, tj, 10,915 0.73 % 9.62 2,778 1.77 % 64.70 2,073 0.70 % 6.52 0 0 % 0 3,085 0.39 % 5.68 6 0.12 % 1.52 2,880 1.20 % 15.37 93 0.79 % 2.34
med, med, m ed, 10,326 0.69 % 9.10 53 0.03 % 1.23 1,028 0.35 % 3.23 0 0 % 0 5,652 0.72 % 10.41 0 0 % 0 3,586 1.49 % 19.13 7 0.06 % 0.18
čl, čl, čl, 10,036 0.67 % 8.84 183 0.12 % 4.26 8,198 2.78 % 25.78
0 0 % 0 1,356 0.17 % 2.50 12 0.24 % 3.03 282 0.12 % 1.50 5 0.04 % 0.13
nan, nan, n an, 9,178 0.61 % 8.09 3 0 % 0.07 102 0.04 % 0.32 0 0 % 0 9,052 1.15 % 16.68 0 0 % 0 18 0.01 % 0.10 3 0.03 % 0.08
prim, prim, pr im, 8,714 0.58 % 7.68 3,218 2.06 % 74.95 907 0.31 % 2.85 0 0 % 0 2,949 0.37 % 5.43 11 0.22 % 2.78 1,535 0.64 % 8.19 94 0.80 % 2.37
ur, ur, ur, 8,434 0.56 % 7.43 562 0.36 % 13.09 4,166 1.41 % 13.10 0 0 % 0 3,021 0.38 % 5.57 10 0.20 % 2.53 647 0.27 % 3.45 28 0.24 % 0.71
pr, pr, pr, 8,211 0.55 % 7.24 2,655 1.70 % 61.84 892 0.30 % 2.81 0 0 % 0 2,146 0.27 % 3.95 255 5.08 % 64.46 2,206 0.92 % 11.77 57 0.48 % 1.44
amer, amer, am er, 8,026 0.54 % 7.07 11 0.01 % 0.26 4 0 % 0.01 0 0 % 0 8,006 1.01 % 14.75 0 0 % 0 5 0 % 0.03 0 0 % 0
idr, idr, i dr, 7,791 0.52 % 6.87 1,650 1.05 % 38.43 988 0.34 % 3.11 0 0 % 0 3,765 0.48 % 6.94 33 0.66 % 8.34 1,322 0.55 % 7.05 33 0.28 % 0.83
dok, dok, d ok, 7,725 0.52 % 6.81 49 0.03 % 1.14 98 0.03 % 0.31 0 0 % 0 7,494 0.95 % 13.81 3 0.06 % 0.76 78 0.03 % 0.42 3 0.03 % 0.08
nad, nad, n ad, 7,233 0.48 % 6.37 2 0 % 0.05 22 0.01 % 0.07 0 0 % 0 7,181 0.91 % 13.23 2 0.04 % 0.51 23 0.01 % 0.12 3 0.03 % 0.08
angl, angl, an gl, 7,232 0.48 % 6.37 1,407 0.90 % 32.77 584 0.20 % 1.84 0 0 % 0 2,390 0.30 % 4.40 26 0.52 % 6.57 2,817 1.17 % 15.03 8 0.07 % 0.20
ul, ul, ul, 6,190 0.41 % 5.46 58 0.04 % 1.35 508 0.17 % 1.60 0 0 % 0 4,418 0.56 % 8.14 2 0.04 % 0.51 1,201 0.50 % 6.41 3 0.03 % 0.08
ml, ml, ml, 5,973 0.40 % 5.26 74 0.05 % 1.72 446 0.15 % 1.40 0 0 % 0 4,939 0.62 % 9.10 4 0.08 % 1.01 455 0.19 % 2.43 55 0.47 % 1.38
doc, doc, d oc, 5,756 0.38 % 5.07 53 0.03 % 1.23 1,456 0.49 % 4.58 0 0 % 0 3,086 0.39 % 5.69 3 0.06 % 0.76 1,157 0.48 % 6.17 1 0.01 % 0.03
dipl, dipl, di pl, 5,602 0.37 % 4.94 81 0.05 % 1.89 412 0.14 % 1.30 0 0 % 0 4,280 0.54 % 7.89 1 0.02 % 0.25 819 0.34 % 4.37 9 0.08 % 0.23

1.2.229 List of final character-level 4-grams from abbreviation lemmas in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-word_parts-abbreviations-lemmas-final-4grams-taxonomy-entire.tsv
Lemma; Lemma (lower-case); Rest of the word; Final part of the word; Total absolute frequency of lemma; Percentage of all found lemmas; Total relative frequency (per million occurrences); Absolute frequency [SSJ.T.K.S]; Percentage [SSJ.T.K.S]; Relative frequency [SSJ.T.K.S]; Absolute frequency [SSJ.I]; Percentage [SSJ.I]; Relative frequency [SSJ.I]; Absolute frequency [SSJ.T.K.N]; Percentage [SSJ.T.K.N]; Relative frequency [SSJ.T.K.N]; Absolute frequency [SSJ.T.P.C]; Percentage [SSJ.T.P.C]; Relative frequency [SSJ.T.P.C]; Absolute frequency [SSJ.T.D]; Percentage [SSJ.T.D]; Relative frequency [SSJ.T.D]; Absolute frequency [SSJ.T.P.R]; Percentage [SSJ.T.P.R]; Relative frequency [SSJ.T.P.R]; Absolute frequency [SSJ.T.K.L]; Percentage [SSJ.T.K.L]; Relative frequency [SSJ.T.K.L]
npr, npr, npr, 92,015 13.93 % 81.09 22,440 23.11 % 522.65 14,523 16.99 % 45.68 0 0 % 0 24,837 7.02 % 45.76 311 19.23 % 78.62 29,590 25.14 % 157.88 314 6.52 % 7.91
itd, itd, itd, 65,475 9.91 % 57.70 5,517 5.68 % 128.50 8,287 9.69 % 26.06 2 100.00 % 206.04 36,287 10.25 % 66.86 195 12.06 % 49.29 14,609 12.41 % 77.95 578 12.01 % 14.55
tel, tel, tel, 59,340 8.98 % 52.30 1,487 1.53 % 34.63 3,114 3.64 % 9.79 0 0 % 0 47,321 13.37 % 87.19 12 0.74 % 3.03 7,401 6.29 % 39.49 5 0.10 % 0.13
prof, prof, p rof, 48,612 7.36 % 42.84 741 0.76 % 17.26 7,590 8.88 % 23.87 0 0 % 0 34,903 9.86 % 64.31 26 1.61 % 6.57 5,305 4.51 % 28.31 47 0.98 % 1.18
str, str, str, 47,194 7.14 % 41.59 28,190 29.03 % 656.58 3,546 4.15 % 11.15 0 0 % 0 3,788 1.07 % 6.98 33 2.04 % 8.34 10,527 8.94 % 56.17 1,110 23.06 % 27.95
mag, mag, mag, 40,143 6.08 % 35.38 471 0.48 % 10.97 4,511 5.28 % 14.19 0 0 % 0 29,459 8.32 % 54.28 65 4.02 % 16.43 5,620 4.78 % 29.99 17 0.35 % 0.43
ipd, ipd, ipd, 29,413 4.45 % 25.92 5,227 5.38 % 121.74 4,394 5.14 % 13.82 0 0 % 0 12,438 3.52 % 22.92 178 11.01 % 45 7,015 5.96 % 37.43 161 3.35 % 4.05
itn, itn, itn,
16,421 2.49 % 14.47 2,446 2.52 % 56.97 1,646 1.93 % 5.18 0 0 % 0 8,494 2.40 % 15.65 157 9.71 % 39.69 3,546 3.01 % 18.92 132 2.74 % 3.32
odst, odst, o dst, 16,149 2.44 % 14.23 46 0.05 % 1.07 5,065 5.92 % 15.93 0 0 % 0 10,780 3.05 % 19.86 1 0.06 % 0.25 256 0.22 % 1.37 1 0.02 % 0.03
med, med, med, 10,326 1.56 % 9.10 53 0.06 % 1.23 1,028 1.20 % 3.23 0 0 % 0 5,652 1.60 % 10.41 0 0 % 0 3,586 3.05 % 19.13 7 0.14 % 0.18
nan, nan, nan, 9,178 1.39 % 8.09 3 0 % 0.07 102 0.12 % 0.32 0 0 % 0 9,052 2.56 % 16.68 0 0 % 0 18 0.01 % 0.10 3 0.06 % 0.08
prim, prim, p rim, 8,714 1.32 % 7.68 3,218 3.31 % 74.95 907 1.06 % 2.85 0 0 % 0 2,949 0.83 % 5.43 11 0.68 % 2.78 1,535 1.30 % 8.19 94 1.95 % 2.37
amer, amer, a mer, 8,026 1.22 % 7.07 11 0.01 % 0.26 4 0.01 % 0.01 0 0 % 0 8,006 2.26 % 14.75 0 0 % 0 5 0 % 0.03 0 0 % 0
idr, idr, idr, 7,791 1.18 % 6.87 1,650 1.70 % 38.43 988 1.16 % 3.11 0 0 % 0 3,765 1.06 % 6.94 33 2.04 % 8.34 1,322 1.12 % 7.05 33 0.69 % 0.83
dok, dok, dok, 7,725 1.17 % 6.81 49 0.05 % 1.14 98 0.12 % 0.31 0 0 % 0 7,494 2.12 % 13.81 3 0.19 % 0.76 78 0.07 % 0.42 3 0.06 % 0.08
nad, nad, nad, 7,233 1.09 % 6.37 2 0 % 0.05 22 0.03 % 0.07 0 0 % 0 7,181 2.03 % 13.23 2 0.12 % 0.51 23 0.02 % 0.12 3 0.06 % 0.08
angl, angl, a ngl, 7,232 1.09 % 6.37 1,407 1.45 % 32.77 584 0.68 % 1.84 0 0 % 0 2,390 0.68 % 4.40 26 1.61 % 6.57 2,817 2.39 % 15.03 8 0.17 % 0.20
doc, doc, doc, 5,756 0.87 % 5.07 53 0.06 % 1.23 1,456 1.70 % 4.58 0 0 % 0 3,086 0.87 % 5.69 3 0.19 % 0.76 1,157 0.98 % 6.17 1 0.02 % 0.03
dipl, dipl, d ipl, 5,602 0.85 % 4.94 81 0.08 % 1.89 412 0.48 % 1.30 0 0 % 0 4,280 1.21 % 7.89 1 0.06 % 0.25 819 0.70 % 4.37 9 0.19 % 0.23
inž, inž, inž, 5,119 0.78 % 4.51 43 0.04 % 1 154 0.18 % 0.48 0 0 % 0 4,449 1.26 % 8.20 1 0.06 % 0.25 468 0.40 % 2.50 4 0.08 % 0.10
pon, pon, pon, 4,874 0.74 % 4.30 104 0.11 % 2.42 18 0.02 % 0.06 0 0 % 0 4,611 1.30 % 8.50 4 0.25 % 1.01 137 0.12 % 0.73 0 0 % 0
roj, roj, roj, 4,310 0.65 % 3.80 901 0.93 % 20.99 629 0.74 % 1.98 0 0 % 0 2,632 0.74 % 4.85 3 0.19 % 0.76
116 0.10 % 0.62 29 0.60 % 0.73
parc, parc, p arc, 3,955 0.60 % 3.49 19 0.02 % 0.44 2,453 2.87 % 7.72 0 0 % 0 1,470 0.41 % 2.71 4 0.25 % 1.01 9 0.01 % 0.05 0 0 % 0
naniz, naniz, na niz, 3,743 0.57 % 3.30 0 0 % 0 0 0 % 0 0 0 % 0 3,743 1.06 % 6.90 0 0 % 0 0 0 % 0 0 0 % 0
nem, nem, nem, 3,739 0.57 % 3.30 436 0.45 % 10.15 106 0.12 % 0.33 0 0 % 0 2,975 0.84 % 5.48 4 0.25 % 1.01 175 0.15 % 0.93 43 0.89 % 1.08
brit, brit, b rit, 3,261 0.49 % 2.87 56 0.06 % 1.30 2,446 2.86 % 7.69 0 0 % 0 738 0.21 % 1.36 0 0 % 0 21 0.02 % 0.11 0 0 % 0
msgr, msgr, m sgr, 3,221 0.49 % 2.84 19 0.02 % 0.44 116 0.14 % 0.36 0 0 % 0 2,980 0.84 % 5.49 0 0 % 0 90 0.08 % 0.48 16 0.33 % 0.40
opr, opr, opr, 3,179 0.48 % 2.80 357 0.37 % 8.31 2,596 3.04 % 8.17 0 0 % 0 204 0.06 % 0.38 0 0 % 0 22 0.02 % 0.12 0 0 % 0
slov, slov, s lov, 2,932 0.44 % 2.58 630 0.65 % 14.67 236 0.28 % 0.74 0 0 % 0 1,789 0.51 % 3.30 1 0.06 % 0.25 239 0.20 % 1.28 37 0.77 % 0.93
univ, univ, u niv, 2,922 0.44 % 2.58 89 0.09 % 2.07 296 0.35 % 0.93 0 0 % 0 2,067 0.58 % 3.81 2 0.12 % 0.51 463 0.39 % 2.47 5 0.10 % 0.13
odd, odd, odd, 2,739 0.41 % 2.41 11 0.01 % 0.26 59 0.07 % 0.19 0 0 % 0 2,661 0.75 % 4.90 0 0 % 0 7 0.01 % 0.04 1 0.02 % 0.03
stol, stol, s tol, 2,715 0.41 % 2.39 527 0.54 % 12.27 154 0.18 % 0.48 0 0 % 0 708 0.20 % 1.30 0 0 % 0 1,314 1.12 % 7.01 12 0.25 % 0.30

1.2.230 List of final character-level 5-grams from abbreviation lemmas in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-word_parts-abbreviations-lemmas-final-5grams-taxonomy-entire.tsv
Lemma; Lemma (lower-case); Rest of the word; Final part of the word; Total absolute frequency of lemma; Percentage of all found lemmas; Total relative frequency (per million occurrences); Absolute frequency [SSJ.T.K.S]; Percentage [SSJ.T.K.S]; Relative frequency [SSJ.T.K.S]; Absolute frequency [SSJ.I]; Percentage [SSJ.I]; Relative frequency [SSJ.I]; Absolute frequency [SSJ.T.K.N]; Percentage [SSJ.T.K.N]; Relative frequency [SSJ.T.K.N]; Absolute frequency [SSJ.T.P.C]; Percentage [SSJ.T.P.C]; Relative frequency [SSJ.T.P.C]; Absolute frequency [SSJ.T.D]; Percentage [SSJ.T.D]; Relative frequency [SSJ.T.D]; Absolute frequency [SSJ.T.P.R]; Percentage [SSJ.T.P.R]; Relative frequency [SSJ.T.P.R]; Absolute frequency [SSJ.T.K.L]; Percentage [SSJ.T.K.L]; Relative frequency [SSJ.T.K.L]
prof, prof, prof, 48,612 29.71 % 42.84 741 4.56 % 17.26 7,590 29.21 % 23.87 0 0 % 0 34,903 34.39 % 64.31 26 10.48 % 6.57 5,305 28.32 % 28.31 47 5.12 % 1.18
odst, odst, odst, 16,149 9.87 % 14.23 46 0.28 % 1.07 5,065 19.49 % 15.93 0 0 % 0 10,780 10.62 % 19.86 1 0.40 % 0.25 256 1.37 % 1.37 1 0.11 % 0.03
prim, prim, prim, 8,714 5.33 % 7.68 3,218 19.80 % 74.95 907 3.49 % 2.85 0 0 % 0 2,949 2.91 % 5.43 11 4.43 % 2.78 1,535 8.20 % 8.19 94 10.25 % 2.37
amer, amer, amer, 8,026 4.91 % 7.07 11 0.07 % 0.26 4 0.01 % 0.01 0 0 % 0 8,006 7.89 % 14.75 0 0 % 0 5 0.03 % 0.03 0 0 % 0
angl, angl, angl, 7,232 4.42 % 6.37 1,407 8.66 % 32.77 584 2.25 % 1.84 0 0 % 0 2,390 2.35 % 4.40 26 10.48 % 6.57 2,817 15.04 % 15.03 8 0.87 % 0.20
dipl, dipl, dipl, 5,602 3.42 % 4.94 81 0.50 % 1.89 412 1.59 % 1.30 0 0 % 0 4,280 4.22 % 7.89 1 0.40 % 0.25 819 4.37 % 4.37 9 0.98 % 0.23
parc, parc, parc, 3,955 2.42 % 3.49 19 0.12 % 0.44 2,453 9.44 % 7.72 0 0 % 0 1,470 1.45 % 2.71 4 1.61 % 1.01 9 0.05 % 0.05 0 0 % 0
naniz, naniz, n aniz, 3,743 2.29 % 3.30 0 0 % 0 0 0 % 0 0 0 % 0 3,743 3.69 % 6.90 0 0 % 0 0 0 % 0 0 0 % 0
brit, brit, brit, 3,261 1.99 % 2.87 56 0.34 % 1.30 2,446 9.41 % 7.69 0 0 % 0 738 0.73 % 1.36 0 0 % 0 21 0.11 % 0.11 0 0 % 0
msgr, msgr, msgr, 3,221 1.97 % 2.84 19 0.12 % 0.44 116 0.45 % 0.36 0 0 % 0 2,980 2.94 % 5.49 0 0 % 0 90 0.48 % 0.48 16 1.75 % 0.40
slov, slov, slov, 2,932 1.79 % 2.58 630 3.88 % 14.67 236 0.91 % 0.74 0 0 % 0 1,789 1.76 % 3.30 1 0.40 % 0.25 239 1.28 % 1.28 37 4.04 % 0.93
univ, univ, univ, 2,922 1.79 % 2.58 89 0.55 % 2.07 296 1.14 % 0.93 0 0 % 0 2,067 2.04 % 3.81 2 0.81 % 0.51 463 2.47 % 2.47 5
0.55 % 0.13
stol, stol, stol, 2,715 1.66 % 2.39 527 3.24 % 12.27 154 0.59 % 0.48 0 0 % 0 708 0.70 % 1.30 0 0 % 0 1,314 7.01 % 7.01 12 1.31 % 0.30
prev, prev, prev, 2,524 1.54 % 2.22 1,713 10.54 % 39.90 60 0.23 % 0.19 0 0 % 0 350 0.34 % 0.64 5 2.02 % 1.26 150 0.80 % 0.80 246 26.83 % 6.19
ibid, ibid, ibid, 2,175 1.33 % 1.92 1,945 11.97 % 45.30 75 0.29 % 0.24 0 0 % 0 8 0.01 % 0.01 2 0.81 % 0.51 145 0.77 % 0.77 0 0 % 0
ponov, ponov, p onov, 1,907 1.17 % 1.68 0 0 % 0 0 0 % 0 0 0 % 0 1,906 1.88 % 3.51 0 0 % 0 0 0 % 0 1 0.11 % 0.03
nadalj, nadalj, na dalj, 1,606 0.98 % 1.42 2 0.01 % 0.05 144 0.55 % 0.45 0 0 % 0 1,457 1.44 % 2.68 1 0.40 % 0.25 2 0.01 % 0.01 0 0 % 0
asist, asist, a sist, 1,359 0.83 % 1.20 8 0.05 % 0.19 342 1.32 % 1.08 0 0 % 0 513 0.51 % 0.95 0 0 % 0 496 2.65 % 2.65 0 0 % 0
spec, spec, spec, 1,274 0.78 % 1.12 20 0.12 % 0.47 256 0.98 % 0.81 0 0 % 0 660 0.65 % 1.22 0 0 % 0 336 1.79 % 1.79 2 0.22 % 0.05
film, film, film, 1,206 0.74 % 1.06 55 0.34 % 1.28 133 0.51 % 0.42 0 0 % 0 1,000 0.98 % 1.84 1 0.40 % 0.25 17 0.09 % 0.09 0 0 % 0
štev, štev, štev, 1,168 0.71 % 1.03 80 0.49 % 1.86 151 0.58 % 0.47 0 0 % 0 751 0.74 % 1.38 9 3.63 % 2.28 161 0.86 % 0.86 16 1.75 % 0.40
pribl, pribl, p ribl, 1,089 0.67 % 0.96 140 0.86 % 3.26 269 1.03 % 0.85 0 0 % 0 325 0.32 % 0.60 20 8.06 % 5.06 335 1.79 % 1.79 0 0 % 0
nasl, nasl, nasl, 966 0.59 % 0.85 131 0.81 % 3.05 769 2.96 % 2.42 0 0 % 0 12 0.01 % 0.02 1 0.40 % 0.25 52 0.28 % 0.28 1 0.11 % 0.03
franc, franc, f ranc, 896 0.55 % 0.79 55 0.34 % 1.28 9 0.04 % 0.03 0 0 % 0 818 0.81 % 1.51 0 0 % 0 14 0.07 % 0.07 0 0 % 0
akad, akad, akad, 876 0.54 % 0.77 74 0.46 % 1.72 117 0.45 % 0.37 0 0 % 0 656 0.65 % 1.21 2 0.81 % 0.51 26 0.14 % 0.14 1 0.11 % 0.03
avstral, avstral, avs tral, 818 0.50 % 0.72 8 0.05 % 0.19 2 0.01 % 0.01 0 0 % 0 807 0.80 % 1.49 0 0 % 0 1 0.01 % 0.01 0 0 % 0
pogl, pogl, pogl, 740 0.45 % 0.65 670 4.12 % 15.61 11 0.04 % 0.03 0 0 % 0 5 0.01 % 0.01 1 0.40 % 0.25 45 0.24 % 0.24 8 0.87 % 0.20
pren, pren, pren, 713 0.44 % 0.63
1 0.01 % 0.02 19 0.07 % 0.06 0 0 % 0 693 0.68 % 1.28 0 0 % 0 0 0 % 0 0 0 % 0
alter, alter, a lter, 638 0.39 % 0.56 0 0 % 0 2 0.01 % 0.01 0 0 % 0 624 0.61 % 1.15 0 0 % 0 12 0.06 % 0.06 0 0 % 0
šport, šport, š port, 612 0.37 % 0.54 3 0.02 % 0.07 10 0.04 % 0.03 0 0 % 0 530 0.52 % 0.98 0 0 % 0 67 0.36 % 0.36 2 0.22 % 0.05
posn, posn, posn, 597 0.36 % 0.53 0 0 % 0 0 0 % 0 0 0 % 0 596 0.59 % 1.10 0 0 % 0 1 0.01 % 0.01 0 0 % 0
madž, madž, madž, 552 0.34 % 0.49 114 0.70 % 2.66 30 0.12 % 0.09 0 0 % 0 401 0.40 % 0.74 0 0 % 0 4 0.02 % 0.02 3 0.33 % 0.08

1.2.231 List of initial character-level 1-grams from abbreviation lower-case word forms in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-word_parts-abbreviations-lowercase_forms-initial-1grams-taxonomy-entire.tsv
Lower-case word form; Initial part of the word; Rest of the word; Total absolute frequency of lower-case word form; Percentage of all found lower-case word forms; Total relative frequency (per million occurrences); Absolute frequency [SSJ.T.K.S]; Percentage [SSJ.T.K.S]; Relative frequency [SSJ.T.K.S]; Absolute frequency [SSJ.I]; Percentage [SSJ.I]; Relative frequency [SSJ.I]; Absolute frequency [SSJ.T.K.N]; Percentage [SSJ.T.K.N]; Relative frequency [SSJ.T.K.N]; Absolute frequency [SSJ.T.P.C]; Percentage [SSJ.T.P.C]; Relative frequency [SSJ.T.P.C]; Absolute frequency [SSJ.T.D]; Percentage [SSJ.T.D]; Relative frequency [SSJ.T.D]; Absolute frequency [SSJ.T.P.R]; Percentage [SSJ.T.P.R]; Relative frequency [SSJ.T.P.R]; Absolute frequency [SSJ.T.K.L]; Percentage [SSJ.T.K.L]; Relative frequency [SSJ.T.K.L]
dr, d r, 322,077 10.23 % 283.84 4,941 1.97 % 115.08 39,853 7.31 % 125.35 0 0 % 0 222,930 11.94 % 410.76 476 5.17 % 120.33 51,452 11.39 % 274.53 2,425 9.78 % 61.06
d, d , 197,296 6.27 % 173.88 4,623 1.85 % 107.68 32,931 6.04 % 103.58 0 0 % 0 131,940 7.07 % 243.11 659 7.16 % 166.59 26,515 5.87 % 141.48 628 2.53 % 15.81
oz, o z, 136,593
4.34 % 120.38 11,031 4.41 % 256.93 62,292 11.43 % 195.92 0 0 % 0 40,351 2.16 % 74.35 329 3.57 % 83.17 22,364 4.95 % 119.33 226 0.91 % 5.69
o, o , 131,455 4.17 % 115.85 2,225 0.89 % 51.82 19,598 3.60 % 61.64 0 0 % 0 84,948 4.55 % 156.52 215 2.33 % 54.35 24,114 5.34 % 128.66 355 1.43 % 8.94
m, m , 124,724 3.96 % 109.92 6,590 2.63 % 153.49 11,023 2.02 % 34.67 0 0 % 0 95,479 5.11 % 175.93 148 1.61 % 37.41 10,946 2.42 % 58.40 538 2.17 % 13.55
t, t , 114,212 3.63 % 100.65 6,635 2.65 % 154.54 30,333 5.56 % 95.41 0 0 % 0 57,654 3.09 % 106.23 134 1.46 % 33.87 18,938 4.19 % 101.05 518 2.09 % 13.04
i, i , 112,680 3.58 % 99.30 6,709 2.68 % 156.26 29,913 5.49 % 94.08 0 0 % 0 57,762 3.09 % 106.43 227 2.46 % 57.38 17,470 3.87 % 93.21 599 2.42 % 15.08
p, p , 97,522 3.10 % 85.95 3,551 1.42 % 82.71 12,632 2.32 % 39.73 0 0 % 0 68,239 3.65 % 125.73 136 1.48 % 34.38 12,497 2.77 % 66.68 467 1.88 % 11.76
s, s , 96,007 3.05 % 84.61 5,029 2.01 % 117.13 9,265 1.70 % 29.14 0 0 % 0 71,070 3.81 % 130.95 72 0.78 % 18.20 9,783 2.17 % 52.20 788 3.18 % 19.84
št, š t, 94,645 3.01 % 83.41 7,171 2.87 % 167.02 44,772 8.21 % 140.82 0 0 % 0 28,295 1.51 % 52.14 1,913 20.77 % 483.59 11,553 2.56 % 61.64 941 3.79 % 23.69
npr, n pr, 92,015 2.92 % 81.09 22,440 8.97 % 522.65 14,523 2.66 % 45.68 0 0 % 0 24,837 1.33 % 45.76 311 3.38 % 78.62 29,590 6.55 % 157.88 314 1.27 % 7.91
a, a , 90,930 2.89 % 80.14 7,199 2.88 % 167.67 20,714 3.80 % 65.15 0 0 % 0 51,507 2.76 % 94.91 278 3.02 % 70.28 10,006 2.21 % 53.39 1,226 4.95 % 30.87
j, j , 88,597 2.81 % 78.08 7,582 3.03 % 176.59 9,608 1.76 % 30.22 0 0 % 0 61,530 3.29 % 113.37 88 0.95 % 22.25 8,701 1.93 % 46.43 1,088 4.39 % 27.39
b, b , 78,541 2.50 % 69.22 3,823 1.53 % 89.04 9,968 1.83 % 31.35 0 0 % 0 56,849 3.04 % 104.75 259 2.81 % 65.47 6,528 1.45 % 34.83 1,114 4.49 % 28.05
itd, i td, 65,475 2.08 % 57.70 5,517 2.21 % 128.50 8,287 1.52 % 26.06 2 5.71 % 206.04 36,287 1.94 % 66.86 195 2.12 % 49.29 14,609 3.23 % 77.95 578 2.33 % 14.55
sv, s v, 65,377 2.08 % 57.62 5,754
2.30 % 134.02 7,698 1.41 % 24.21 0 0 % 0 44,951 2.41 % 82.83 57 0.62 % 14.41 6,229 1.38 % 33.24 688 2.77 % 17.32
tel, t el, 59,340 1.89 % 52.30 1,487 0.59 % 34.63 3,114 0.57 % 9.79 0 0 % 0 47,321 2.53 % 87.19 12 0.13 % 3.03 7,401 1.64 % 39.49 5 0.02 % 0.13
v, v , 59,163 1.88 % 52.14 2,403 0.96 % 55.97 10,850 1.99 % 34.13 0 0 % 0 38,197 2.05 % 70.38 102 1.11 % 25.78 7,237 1.60 % 38.61 374 1.51 % 9.42
k, k , 57,713 1.83 % 50.86 2,031 0.81 % 47.30 8,416 1.54 % 26.47 0 0 % 0 40,899 2.19 % 75.36 58 0.63 % 14.66 5,237 1.16 % 27.94 1,072 4.32 % 26.99
l, l , 55,731 1.77 % 49.12 4,807 1.92 % 111.96 4,983 0.91 % 15.67 0 0 % 0 37,229 1.99 % 68.60 93 1.01 % 23.51 7,783 1.72 % 41.53 836 3.37 % 21.05
prof, p rof, 48,612 1.54 % 42.84 741 0.30 % 17.26 7,590 1.39 % 23.87 0 0 % 0 34,903 1.87 % 64.31 26 0.28 % 6.57 5,305 1.17 % 28.31 47 0.19 % 1.18
g, g , 47,260 1.50 % 41.65 2,978 1.19 % 69.36 4,354 0.80 % 13.69 0 0 % 0 34,400 1.84 % 63.38 69 0.75 % 17.44 4,878 1.08 % 26.03 581 2.34 % 14.63
str, s tr, 47,194 1.50 % 41.59 28,190 11.27 % 656.58 3,546 0.65 % 11.15 0 0 % 0 3,788 0.20 % 6.98 33 0.36 % 8.34 10,527 2.33 % 56.17 1,110 4.48 % 27.95
r, r , 46,116 1.47 % 40.64 3,576 1.43 % 83.29 5,342 0.98 % 16.80 0 0 % 0 30,906 1.66 % 56.95 117 1.27 % 29.58 5,609 1.24 % 29.93 566 2.28 % 14.25
c, c , 42,313 1.34 % 37.29 4,977 1.99 % 115.92 6,573 1.21 % 20.67 33 94.29 % 3,399.61 21,307 1.14 % 39.26 140 1.52 % 35.39 8,931 1.98 % 47.65 352 1.42 % 8.86
mag, m ag, 40,143 1.27 % 35.38 471 0.19 % 10.97 4,511 0.83 % 14.19 0 0 % 0 29,459 1.58 % 54.28 65 0.71 % 16.43 5,620 1.24 % 29.99 17 0.07 % 0.43
n, n , 36,882 1.17 % 32.50 4,774 1.91 % 111.19 3,482 0.64 % 10.95 0 0 % 0 22,698 1.22 % 41.82 353 3.83 % 89.23 5,182 1.15 % 27.65 393 1.58 % 9.90
st, s t, 36,399 1.16 % 32.08 1,228 0.49 % 28.60 11,515 2.11 % 36.22 0 0 % 0 19,248 1.03 % 35.47 30 0.33 % 7.58 3,431 0.76 % 18.31 947 3.82 % 23.84
op, o p, 35,928 1.14 % 31.66 3,481 1.39 % 81.08 11,819 2.17 % 37.17 0 0 % 0 15,464 0.83 % 28.49 8 0.09 % 2.02 4,702
1.04 % 25.09 454 1.83 % 11.43 ipd, i pd, 29,413 0.93 % 25.92 5,227 2.09 % 121.74 4,394 0.81 % 13.82 0 0 % 0 12,438 0.67 % 22.92 178 1.93 % 45 7,015 1.55 % 37.43 161 0.65 % 4.05 f, f , 28,327 0.90 % 24.96 2,349 0.94 % 54.71 3,285 0.60 % 10.33 0 0 % 0 18,757 1.00 % 34.56 362 3.93 % 91.51 3,366 0.74 % 17.96 208 0.84 % 5.24 e, e , 27,090 0.86 % 23.87 2,854 1.14 % 66.47 4,675 0.86 % 14.70 0 0 % 0 14,894 0.80 % 27.44 35 0.38 % 8.85 4,388 0.97 % 23.41 244 0.98 % 6.14 CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 248 File at CLARIN.SI 1.2.232 List of initial character-level 2-grams from abbreviation lower-case word forms in the Gigafida 2.0 corpus with text-type distributionGF2.0-word_parts-abbreviations-lowercase_ forms-initial-2grams-taxonomy-entire.tsvLower-case word form Initial part of the word Rest of the word Total absolute frequency of lower-case word form Percentage of all found lower-case word forms Total relative frequency (per million occurrences) Absolute frequency [SSJ.T.K.S] Percentage [SSJ.T.K.S] Relative frequency [SSJ.T.K.S] Absolute frequency [SSJ.I] Percentage [SSJ.I] Relative frequency [SSJ.I] Absolute frequency [SSJ.T.K.N] Percentage [SSJ.T.K.N] Relative frequency [SSJ.T.K.N] Absolute frequency [SSJ.T.P.C] Percentage [SSJ.T.P.C] Relative frequency [SSJ.T.P.C] Absolute frequency [SSJ.T.D] Percentage [SSJ.T.D] Relative frequency [SSJ.T.D] Absolute frequency [SSJ.T.P.R] Percentage [SSJ.T.P.R] Relative frequency [SSJ.T.P.R] Absolute frequency [SSJ.T.K.L] Percentage [SSJ.T.K.L] Relative frequency [SSJ.T.K.L] dr, dr , 322,077 10.23 % 283.84 4,941 1.97 % 115.08 39,853 7.31 % 125.35 0 0 % 0 222,930 11.94 % 410.76 476 5.17 % 120.33 51,452 11.39 % 274.53 2,425 9.78 % 61.06 d, d, 197,296 6.27 % 173.88 4,623 1.85 % 107.68 32,931 6.04 % 103.58 0 0 % 0 131,940 7.07 % 243.11 659 7.16 % 166.59 26,515 5.87 % 141.48 628 2.53 % 15.81 oz, oz , 136,593 4.34 % 120.38 11,031 4.41 % 256.93 62,292 11.43 % 195.92 0 0 % 0 40,351 2.16 % 74.35 
329 3.57 % 83.17 22,364 4.95 % 119.33 226 0.91 % 5.69
o, o, 131,455 4.17 % 115.85 2,225 0.89 % 51.82 19,598 3.60 % 61.64 0 0 % 0 84,948 4.55 % 156.52 215 2.33 % 54.35 24,114 5.34 % 128.66 355 1.43 % 8.94
m, m, 124,724 3.96 % 109.92 6,590 2.63 % 153.49 11,023 2.02 % 34.67 0 0 % 0 95,479 5.11 % 175.93 148 1.61 % 37.41 10,946 2.42 % 58.40 538 2.17 % 13.55
t, t, 114,212 3.63 % 100.65 6,635 2.65 % 154.54 30,333 5.56 % 95.41 0 0 % 0 57,654 3.09 % 106.23 134 1.46 % 33.87 18,938 4.19 % 101.05 518 2.09 % 13.04
i, i, 112,680 3.58 % 99.30 6,709 2.68 % 156.26 29,913 5.49 % 94.08 0 0 % 0 57,762 3.09 % 106.43 227 2.46 % 57.38 17,470 3.87 % 93.21 599 2.42 % 15.08
p, p, 97,522 3.10 % 85.95 3,551 1.42 % 82.71 12,632 2.32 % 39.73 0 0 % 0 68,239 3.65 % 125.73 136 1.48 % 34.38 12,497 2.77 % 66.68 467 1.88 % 11.76
s, s, 96,007 3.05 % 84.61 5,029 2.01 % 117.13 9,265 1.70 % 29.14 0 0 % 0 71,070 3.81 % 130.95 72 0.78 % 18.20 9,783 2.17 % 52.20 788 3.18 % 19.84
št, št , 94,645 3.01 % 83.41 7,171 2.87 % 167.02 44,772 8.21 % 140.82 0 0 % 0 28,295 1.51 % 52.14 1,913 20.77 % 483.59 11,553 2.56 % 61.64 941 3.79 % 23.69
npr, np r, 92,015 2.92 % 81.09 22,440 8.97 % 522.65 14,523 2.66 % 45.68 0 0 % 0 24,837 1.33 % 45.76 311 3.38 % 78.62 29,590 6.55 % 157.88 314 1.27 % 7.91
a, a, 90,930 2.89 % 80.14 7,199 2.88 % 167.67 20,714 3.80 % 65.15 0 0 % 0 51,507 2.76 % 94.91 278 3.02 % 70.28 10,006 2.21 % 53.39 1,226 4.95 % 30.87
j, j, 88,597 2.81 % 78.08 7,582 3.03 % 176.59 9,608 1.76 % 30.22 0 0 % 0 61,530 3.29 % 113.37 88 0.95 % 22.25 8,701 1.93 % 46.43 1,088 4.39 % 27.39
b, b, 78,541 2.50 % 69.22 3,823 1.53 % 89.04 9,968 1.83 % 31.35 0 0 % 0 56,849 3.04 % 104.75 259 2.81 % 65.47 6,528 1.45 % 34.83 1,114 4.49 % 28.05
itd, it d, 65,475 2.08 % 57.70 5,517 2.21 % 128.50 8,287 1.52 % 26.06 2 5.71 % 206.04 36,287 1.94 % 66.86 195 2.12 % 49.29 14,609 3.23 % 77.95 578 2.33 % 14.55
sv, sv , 65,377 2.08 % 57.62 5,754 2.30 % 134.02 7,698 1.41 % 24.21 0 0 % 0 44,951 2.41 % 82.83 57 0.62 % 14.41 6,229 1.38 % 33.24 688 2.77 % 17.32
tel, te l, 59,340 1.89 % 52.30 1,487 0.59 % 34.63 3,114 0.57 % 9.79 0 0 % 0 47,321 2.53 % 87.19 12 0.13 % 3.03 7,401 1.64 % 39.49 5 0.02 % 0.13
v, v, 59,163 1.88 % 52.14 2,403 0.96 % 55.97 10,850 1.99 % 34.13 0 0 % 0 38,197 2.05 % 70.38 102 1.11 % 25.78 7,237 1.60 % 38.61 374 1.51 % 9.42
k, k, 57,713 1.83 % 50.86 2,031 0.81 % 47.30 8,416 1.54 % 26.47 0 0 % 0 40,899 2.19 % 75.36 58 0.63 % 14.66 5,237 1.16 % 27.94 1,072 4.32 % 26.99
l, l, 55,731 1.77 % 49.12 4,807 1.92 % 111.96 4,983 0.91 % 15.67 0 0 % 0 37,229 1.99 % 68.60 93 1.01 % 23.51 7,783 1.72 % 41.53 836 3.37 % 21.05
prof, pr of, 48,612 1.54 % 42.84 741 0.30 % 17.26 7,590 1.39 % 23.87 0 0 % 0 34,903 1.87 % 64.31 26 0.28 % 6.57 5,305 1.17 % 28.31 47 0.19 % 1.18
g, g, 47,260 1.50 % 41.65 2,978 1.19 % 69.36 4,354 0.80 % 13.69 0 0 % 0 34,400 1.84 % 63.38 69 0.75 % 17.44 4,878 1.08 % 26.03 581 2.34 % 14.63
str, st r, 47,194 1.50 % 41.59 28,190 11.27 % 656.58 3,546 0.65 % 11.15 0 0 % 0 3,788 0.20 % 6.98 33 0.36 % 8.34 10,527 2.33 % 56.17 1,110 4.48 % 27.95
r, r, 46,116 1.47 % 40.64 3,576 1.43 % 83.29 5,342 0.98 % 16.80 0 0 % 0 30,906 1.66 % 56.95 117 1.27 % 29.58 5,609 1.24 % 29.93 566 2.28 % 14.25
c, c, 42,313 1.34 % 37.29 4,977 1.99 % 115.92 6,573 1.21 % 20.67 33 94.29 % 3,399.61 21,307 1.14 % 39.26 140 1.52 % 35.39 8,931 1.98 % 47.65 352 1.42 % 8.86
mag, ma g, 40,143 1.27 % 35.38 471 0.19 % 10.97 4,511 0.83 % 14.19 0 0 % 0 29,459 1.58 % 54.28 65 0.71 % 16.43 5,620 1.24 % 29.99 17 0.07 % 0.43
n, n, 36,882 1.17 % 32.50 4,774 1.91 % 111.19 3,482 0.64 % 10.95 0 0 % 0 22,698 1.22 % 41.82 353 3.83 % 89.23 5,182 1.15 % 27.65 393 1.58 % 9.90
st, st , 36,399 1.16 % 32.08 1,228 0.49 % 28.60 11,515 2.11 % 36.22 0 0 % 0 19,248 1.03 % 35.47 30 0.33 % 7.58 3,431 0.76 % 18.31 947 3.82 % 23.84
op, op , 35,928 1.14 % 31.66 3,481 1.39 % 81.08 11,819 2.17 % 37.17 0 0 % 0 15,464 0.83 % 28.49 8 0.09 % 2.02 4,702 1.04 % 25.09 454 1.83 % 11.43
ipd, ip d, 29,413 0.93 % 25.92 5,227 2.09 % 121.74 4,394 0.81 % 13.82 0 0 % 0 12,438 0.67 % 22.92 178 1.93 % 45 7,015 1.55 % 37.43 161 0.65 % 4.05
f, f, 28,327 0.90 % 24.96 2,349 0.94 % 54.71 3,285 0.60 % 10.33 0 0 % 0 18,757 1.00 % 34.56 362 3.93 % 91.51 3,366 0.74 % 17.96 208 0.84 % 5.24
e, e, 27,090 0.86 % 23.87 2,854 1.14 % 66.47 4,675 0.86 % 14.70 0 0 % 0 14,894 0.80 % 27.44 35 0.38 % 8.85 4,388 0.97 % 23.41 244 0.98 % 6.14

1.2.233 List of initial character-level 3-grams from abbreviation lower-case word forms in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-word_parts-abbreviations-lowercase_forms-initial-3grams-taxonomy-entire.tsv
Lower-case word form; Initial part of the word; Rest of the word; Total absolute frequency of lower-case word form; Percentage of all found lower-case word forms; Total relative frequency (per million occurrences); Absolute frequency [SSJ.T.K.S]; Percentage [SSJ.T.K.S]; Relative frequency [SSJ.T.K.S]; Absolute frequency [SSJ.I]; Percentage [SSJ.I]; Relative frequency [SSJ.I]; Absolute frequency [SSJ.T.K.N]; Percentage [SSJ.T.K.N]; Relative frequency [SSJ.T.K.N]; Absolute frequency [SSJ.T.P.C]; Percentage [SSJ.T.P.C]; Relative frequency [SSJ.T.P.C]; Absolute frequency [SSJ.T.D]; Percentage [SSJ.T.D]; Relative frequency [SSJ.T.D]; Absolute frequency [SSJ.T.P.R]; Percentage [SSJ.T.P.R]; Relative frequency [SSJ.T.P.R]; Absolute frequency [SSJ.T.K.L]; Percentage [SSJ.T.K.L]; Relative frequency [SSJ.T.K.L]
dr, dr, 322,077 21.49 % 283.84 4,941 3.16 % 115.08 39,853 13.50 % 125.35 0 0 % 0 222,930 28.22 % 410.76 476 9.49 % 120.33 51,452 21.42 % 274.53 2,425 20.54 % 61.06
oz, oz, 136,593 9.12 % 120.38 11,031 7.05 % 256.93 62,292 21.10 % 195.92 0 0 % 0 40,351 5.11 % 74.35 329 6.56 % 83.17 22,364 9.31 % 119.33 226 1.92 % 5.69
št, št, 94,645 6.32 % 83.41 7,171 4.58 % 167.02 44,772 15.16 % 140.82 0 0 % 0 28,295 3.58 % 52.14 1,913 38.13 % 483.59 11,553 4.81 % 61.64 941 7.97 % 23.69
npr, npr , 92,015 6.14 % 81.09 22,440 14.34 % 522.65
14,523 4.92 % 45.68 0 0 % 0 24,837 3.15 % 45.76 311 6.20 % 78.62 29,590 12.32 % 157.88 314 2.66 % 7.91
itd, itd , 65,475 4.37 % 57.70 5,517 3.53 % 128.50 8,287 2.81 % 26.06 2 100.00 % 206.04 36,287 4.59 % 66.86 195 3.89 % 49.29 14,609 6.08 % 77.95 578 4.90 % 14.55
sv, sv, 65,377 4.36 % 57.62 5,754 3.68 % 134.02 7,698 2.61 % 24.21 0 0 % 0 44,951 5.69 % 82.83 57 1.14 % 14.41 6,229 2.59 % 33.24 688 5.83 % 17.32
tel, tel , 59,340 3.96 % 52.30 1,487 0.95 % 34.63 3,114 1.05 % 9.79 0 0 % 0 47,321 5.99 % 87.19 12 0.24 % 3.03 7,401 3.08 % 39.49 5 0.04 % 0.13
prof, pro f, 48,612 3.24 % 42.84 741 0.47 % 17.26 7,590 2.57 % 23.87 0 0 % 0 34,903 4.42 % 64.31 26 0.52 % 6.57 5,305 2.21 % 28.31 47 0.40 % 1.18
str, str , 47,194 3.15 % 41.59 28,190 18.01 % 656.58 3,546 1.20 % 11.15 0 0 % 0 3,788 0.48 % 6.98 33 0.66 % 8.34 10,527 4.38 % 56.17 1,110 9.40 % 27.95
mag, mag , 40,143 2.68 % 35.38 471 0.30 % 10.97 4,511 1.53 % 14.19 0 0 % 0 29,459 3.73 % 54.28 65 1.30 % 16.43 5,620 2.34 % 29.99 17 0.14 % 0.43
st, st, 36,399 2.43 % 32.08 1,228 0.79 % 28.60 11,515 3.90 % 36.22 0 0 % 0 19,248 2.44 % 35.47 30 0.60 % 7.58 3,431 1.43 % 18.31 947 8.02 % 23.84
op, op, 35,928 2.40 % 31.66 3,481 2.22 % 81.08 11,819 4.00 % 37.17 0 0 % 0 15,464 1.96 % 28.49 8 0.16 % 2.02 4,702 1.96 % 25.09 454 3.85 % 11.43
ipd, ipd , 29,413 1.96 % 25.92 5,227 3.34 % 121.74 4,394 1.49 % 13.82 0 0 % 0 12,438 1.57 % 22.92 178 3.55 % 45 7,015 2.92 % 37.43 161 1.36 % 4.05
itn, itn , 16,421 1.10 % 14.47 2,446 1.56 % 56.97 1,646 0.56 % 5.18 0 0 % 0 8,494 1.07 % 15.65 157 3.13 % 39.69 3,546 1.48 % 18.92 132 1.12 % 3.32
odst, ods t, 16,149 1.08 % 14.23 46 0.03 % 1.07 5,065 1.72 % 15.93 0 0 % 0 10,780 1.36 % 19.86 1 0.02 % 0.25 256 0.11 % 1.37 1 0.01 % 0.03
am, am, 10,990 0.73 % 9.69 95 0.06 % 2.21 71 0.02 % 0.22 0 0 % 0 10,748 1.36 % 19.80 0 0 % 0 74 0.03 % 0.39 2 0.02 % 0.05
tj, tj, 10,915 0.73 % 9.62 2,778 1.77 % 64.70 2,073 0.70 % 6.52 0 0 % 0 3,085 0.39 % 5.68 6 0.12 % 1.52 2,880 1.20 % 15.37 93 0.79 % 2.34
med, med , 10,326 0.69 % 9.10 53 0.03 % 1.23 1,028 0.35 % 3.23 0 0 % 0 5,652 0.72 % 10.41 0 0 % 0 3,586 1.49 % 19.13 7 0.06 % 0.18
čl, čl, 10,036 0.67 % 8.84 183 0.12 % 4.26 8,198 2.78 % 25.78 0 0 % 0 1,356 0.17 % 2.50 12 0.24 % 3.03 282 0.12 % 1.50 5 0.04 % 0.13
nan, nan , 9,178 0.61 % 8.09 3 0 % 0.07 102 0.04 % 0.32 0 0 % 0 9,052 1.15 % 16.68 0 0 % 0 18 0.01 % 0.10 3 0.03 % 0.08
prim, pri m, 8,714 0.58 % 7.68 3,218 2.06 % 74.95 907 0.31 % 2.85 0 0 % 0 2,949 0.37 % 5.43 11 0.22 % 2.78 1,535 0.64 % 8.19 94 0.80 % 2.37
ur, ur, 8,434 0.56 % 7.43 562 0.36 % 13.09 4,166 1.41 % 13.10 0 0 % 0 3,021 0.38 % 5.57 10 0.20 % 2.53 647 0.27 % 3.45 28 0.24 % 0.71
pr, pr, 8,211 0.55 % 7.24 2,655 1.70 % 61.84 892 0.30 % 2.81 0 0 % 0 2,146 0.27 % 3.95 255 5.08 % 64.46 2,206 0.92 % 11.77 57 0.48 % 1.44
amer, ame r, 8,026 0.54 % 7.07 11 0.01 % 0.26 4 0 % 0.01 0 0 % 0 8,006 1.01 % 14.75 0 0 % 0 5 0 % 0.03 0 0 % 0
idr, idr , 7,791 0.52 % 6.87 1,650 1.05 % 38.43 988 0.34 % 3.11 0 0 % 0 3,765 0.48 % 6.94 33 0.66 % 8.34 1,322 0.55 % 7.05 33 0.28 % 0.83
dok, dok , 7,725 0.52 % 6.81 49 0.03 % 1.14 98 0.03 % 0.31 0 0 % 0 7,494 0.95 % 13.81 3 0.06 % 0.76 78 0.03 % 0.42 3 0.03 % 0.08
nad, nad , 7,233 0.48 % 6.37 2 0 % 0.05 22 0.01 % 0.07 0 0 % 0 7,181 0.91 % 13.23 2 0.04 % 0.51 23 0.01 % 0.12 3 0.03 % 0.08
angl, ang l, 7,232 0.48 % 6.37 1,407 0.90 % 32.77 584 0.20 % 1.84 0 0 % 0 2,390 0.30 % 4.40 26 0.52 % 6.57 2,817 1.17 % 15.03 8 0.07 % 0.20
ul, ul, 6,190 0.41 % 5.46 58 0.04 % 1.35 508 0.17 % 1.60 0 0 % 0 4,418 0.56 % 8.14 2 0.04 % 0.51 1,201 0.50 % 6.41 3 0.03 % 0.08
ml, ml, 5,973 0.40 % 5.26 74 0.05 % 1.72 446 0.15 % 1.40 0 0 % 0 4,939 0.62 % 9.10 4 0.08 % 1.01 455 0.19 % 2.43 55 0.47 % 1.38
doc, doc , 5,756 0.38 % 5.07 53 0.03 % 1.23 1,456 0.49 % 4.58 0 0 % 0 3,086 0.39 % 5.69 3 0.06 % 0.76 1,157 0.48 % 6.17 1 0.01 % 0.03
dipl, dip l, 5,602 0.37 % 4.94 81 0.05 % 1.89 412 0.14 % 1.30 0 0 % 0 4,280 0.54 % 7.89 1 0.02 % 0.25 819 0.34 % 4.37 9 0.08 % 0.23

1.2.234 List of initial character-level 4-grams from abbreviation lower-case word forms in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-word_parts-abbreviations-lowercase_forms-initial-4grams-taxonomy-entire.tsv
Lower-case word form; Initial part of the word; Rest of the word; Total absolute frequency of lower-case word form; Percentage of all found lower-case word forms; Total relative frequency (per million occurrences); Absolute frequency [SSJ.T.K.S]; Percentage [SSJ.T.K.S]; Relative frequency [SSJ.T.K.S]; Absolute frequency [SSJ.I]; Percentage [SSJ.I]; Relative frequency [SSJ.I]; Absolute frequency [SSJ.T.K.N]; Percentage [SSJ.T.K.N]; Relative frequency [SSJ.T.K.N]; Absolute frequency [SSJ.T.P.C]; Percentage [SSJ.T.P.C]; Relative frequency [SSJ.T.P.C]; Absolute frequency [SSJ.T.D]; Percentage [SSJ.T.D]; Relative frequency [SSJ.T.D]; Absolute frequency [SSJ.T.P.R]; Percentage [SSJ.T.P.R]; Relative frequency [SSJ.T.P.R]; Absolute frequency [SSJ.T.K.L]; Percentage [SSJ.T.K.L]; Relative frequency [SSJ.T.K.L]
npr, npr, 92,015 13.93 % 81.09 22,440 23.11 % 522.65 14,523 16.99 % 45.68 0 0 % 0 24,837 7.02 % 45.76 311 19.23 % 78.62 29,590 25.14 % 157.88 314 6.52 % 7.91
itd, itd, 65,475 9.91 % 57.70 5,517 5.68 % 128.50 8,287 9.69 % 26.06 2 100.00 % 206.04 36,287 10.25 % 66.86 195 12.06 % 49.29 14,609 12.41 % 77.95 578 12.01 % 14.55
tel, tel, 59,340 8.98 % 52.30 1,487 1.53 % 34.63 3,114 3.64 % 9.79 0 0 % 0 47,321 13.37 % 87.19 12 0.74 % 3.03 7,401 6.29 % 39.49 5 0.10 % 0.13
prof, prof , 48,612 7.36 % 42.84 741 0.76 % 17.26 7,590 8.88 % 23.87 0 0 % 0 34,903 9.86 % 64.31 26 1.61 % 6.57 5,305 4.51 % 28.31 47 0.98 % 1.18
str, str, 47,194 7.14 % 41.59 28,190 29.03 % 656.58 3,546 4.15 % 11.15 0 0 % 0 3,788 1.07 % 6.98 33 2.04 % 8.34 10,527 8.94 % 56.17 1,110 23.06 % 27.95
mag, mag, 40,143 6.08 % 35.38 471 0.48 % 10.97 4,511 5.28 % 14.19 0 0 % 0 29,459 8.32 % 54.28 65 4.02 % 16.43 5,620 4.78 % 29.99 17 0.35 % 0.43
ipd, ipd, 29,413 4.45 % 25.92
5,227 5.38 % 121.74 4,394 5.14 % 13.82 0 0 % 0 12,438 3.52 % 22.92 178 11.01 % 45 7,015 5.96 % 37.43 161 3.35 % 4.05
itn, itn, 16,421 2.49 % 14.47 2,446 2.52 % 56.97 1,646 1.93 % 5.18 0 0 % 0 8,494 2.40 % 15.65 157 9.71 % 39.69 3,546 3.01 % 18.92 132 2.74 % 3.32
odst, odst , 16,149 2.44 % 14.23 46 0.05 % 1.07 5,065 5.92 % 15.93 0 0 % 0 10,780 3.05 % 19.86 1 0.06 % 0.25 256 0.22 % 1.37 1 0.02 % 0.03
med, med, 10,326 1.56 % 9.10 53 0.06 % 1.23 1,028 1.20 % 3.23 0 0 % 0 5,652 1.60 % 10.41 0 0 % 0 3,586 3.05 % 19.13 7 0.14 % 0.18
nan, nan, 9,178 1.39 % 8.09 3 0 % 0.07 102 0.12 % 0.32 0 0 % 0 9,052 2.56 % 16.68 0 0 % 0 18 0.01 % 0.10 3 0.06 % 0.08
prim, prim , 8,714 1.32 % 7.68 3,218 3.31 % 74.95 907 1.06 % 2.85 0 0 % 0 2,949 0.83 % 5.43 11 0.68 % 2.78 1,535 1.30 % 8.19 94 1.95 % 2.37
amer, amer , 8,026 1.22 % 7.07 11 0.01 % 0.26 4 0.01 % 0.01 0 0 % 0 8,006 2.26 % 14.75 0 0 % 0 5 0 % 0.03 0 0 % 0
idr, idr, 7,791 1.18 % 6.87 1,650 1.70 % 38.43 988 1.16 % 3.11 0 0 % 0 3,765 1.06 % 6.94 33 2.04 % 8.34 1,322 1.12 % 7.05 33 0.69 % 0.83
dok, dok, 7,725 1.17 % 6.81 49 0.05 % 1.14 98 0.12 % 0.31 0 0 % 0 7,494 2.12 % 13.81 3 0.19 % 0.76 78 0.07 % 0.42 3 0.06 % 0.08
nad, nad, 7,233 1.09 % 6.37 2 0 % 0.05 22 0.03 % 0.07 0 0 % 0 7,181 2.03 % 13.23 2 0.12 % 0.51 23 0.02 % 0.12 3 0.06 % 0.08
angl, angl , 7,232 1.09 % 6.37 1,407 1.45 % 32.77 584 0.68 % 1.84 0 0 % 0 2,390 0.68 % 4.40 26 1.61 % 6.57 2,817 2.39 % 15.03 8 0.17 % 0.20
doc, doc, 5,756 0.87 % 5.07 53 0.06 % 1.23 1,456 1.70 % 4.58 0 0 % 0 3,086 0.87 % 5.69 3 0.19 % 0.76 1,157 0.98 % 6.17 1 0.02 % 0.03
dipl, dipl , 5,602 0.85 % 4.94 81 0.08 % 1.89 412 0.48 % 1.30 0 0 % 0 4,280 1.21 % 7.89 1 0.06 % 0.25 819 0.70 % 4.37 9 0.19 % 0.23
inž, inž, 5,119 0.78 % 4.51 43 0.04 % 1 154 0.18 % 0.48 0 0 % 0 4,449 1.26 % 8.20 1 0.06 % 0.25 468 0.40 % 2.50 4 0.08 % 0.10
pon, pon, 4,874 0.74 % 4.30 104 0.11 % 2.42 18 0.02 % 0.06 0 0 % 0 4,611 1.30 % 8.50 4 0.25 % 1.01 137 0.12 % 0.73 0 0 % 0
roj, roj, 4,310 0.65 % 3.80 901 0.93 % 20.99 629 0.74 % 1.98 0 0 % 0 2,632 0.74 % 4.85 3 0.19 % 0.76 116 0.10 % 0.62 29 0.60 % 0.73
parc, parc , 3,955 0.60 % 3.49 19 0.02 % 0.44 2,453 2.87 % 7.72 0 0 % 0 1,470 0.41 % 2.71 4 0.25 % 1.01 9 0.01 % 0.05 0 0 % 0
naniz, nani z, 3,743 0.57 % 3.30 0 0 % 0 0 0 % 0 0 0 % 0 3,743 1.06 % 6.90 0 0 % 0 0 0 % 0 0 0 % 0
nem, nem, 3,739 0.57 % 3.30 436 0.45 % 10.15 106 0.12 % 0.33 0 0 % 0 2,975 0.84 % 5.48 4 0.25 % 1.01 175 0.15 % 0.93 43 0.89 % 1.08
brit, brit , 3,261 0.49 % 2.87 56 0.06 % 1.30 2,446 2.86 % 7.69 0 0 % 0 738 0.21 % 1.36 0 0 % 0 21 0.02 % 0.11 0 0 % 0
msgr, msgr , 3,221 0.49 % 2.84 19 0.02 % 0.44 116 0.14 % 0.36 0 0 % 0 2,980 0.84 % 5.49 0 0 % 0 90 0.08 % 0.48 16 0.33 % 0.40
opr, opr, 3,179 0.48 % 2.80 357 0.37 % 8.31 2,596 3.04 % 8.17 0 0 % 0 204 0.06 % 0.38 0 0 % 0 22 0.02 % 0.12 0 0 % 0
slov, slov , 2,932 0.44 % 2.58 630 0.65 % 14.67 236 0.28 % 0.74 0 0 % 0 1,789 0.51 % 3.30 1 0.06 % 0.25 239 0.20 % 1.28 37 0.77 % 0.93
univ, univ , 2,922 0.44 % 2.58 89 0.09 % 2.07 296 0.35 % 0.93 0 0 % 0 2,067 0.58 % 3.81 2 0.12 % 0.51 463 0.39 % 2.47 5 0.10 % 0.13
odd, odd, 2,739 0.41 % 2.41 11 0.01 % 0.26 59 0.07 % 0.19 0 0 % 0 2,661 0.75 % 4.90 0 0 % 0 7 0.01 % 0.04 1 0.02 % 0.03
stol, stol , 2,715 0.41 % 2.39 527 0.54 % 12.27 154 0.18 % 0.48 0 0 % 0 708 0.20 % 1.30 0 0 % 0 1,314 1.12 % 7.01 12 0.25 % 0.30

1.2.235 List of initial character-level 5-grams from abbreviation lower-case word forms in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-word_parts-abbreviations-lowercase_forms-initial-5grams-taxonomy-entire.tsv
Lower-case word form; Initial part of the word; Rest of the word; Total absolute frequency of lower-case word form; Percentage of all found lower-case word forms; Total relative frequency (per million occurrences); Absolute frequency [SSJ.T.K.S]; Percentage [SSJ.T.K.S]; Relative frequency [SSJ.T.K.S]; Absolute frequency [SSJ.I]; Percentage [SSJ.I]; Relative frequency [SSJ.I]; Absolute frequency [SSJ.T.K.N]; Percentage [SSJ.T.K.N]; Relative frequency [SSJ.T.K.N]; Absolute frequency [SSJ.T.P.C]; Percentage [SSJ.T.P.C]; Relative frequency [SSJ.T.P.C]; Absolute frequency [SSJ.T.D]; Percentage [SSJ.T.D]; Relative frequency [SSJ.T.D]; Absolute frequency [SSJ.T.P.R]; Percentage [SSJ.T.P.R]; Relative frequency [SSJ.T.P.R]; Absolute frequency [SSJ.T.K.L]; Percentage [SSJ.T.K.L]; Relative frequency [SSJ.T.K.L]
prof, prof, 48,612 29.71 % 42.84 741 4.56 % 17.26 7,590 29.21 % 23.87 0 0 % 0 34,903 34.39 % 64.31 26 10.48 % 6.57 5,305 28.32 % 28.31 47 5.12 % 1.18
odst, odst, 16,149 9.87 % 14.23 46 0.28 % 1.07 5,065 19.49 % 15.93 0 0 % 0 10,780 10.62 % 19.86 1 0.40 % 0.25 256 1.37 % 1.37 1 0.11 % 0.03
prim, prim, 8,714 5.33 % 7.68 3,218 19.80 % 74.95 907 3.49 % 2.85 0 0 % 0 2,949 2.91 % 5.43 11 4.43 % 2.78 1,535 8.20 % 8.19 94 10.25 % 2.37
amer, amer, 8,026 4.91 % 7.07 11 0.07 % 0.26 4 0.01 % 0.01 0 0 % 0 8,006 7.89 % 14.75 0 0 % 0 5 0.03 % 0.03 0 0 % 0
angl, angl, 7,232 4.42 % 6.37 1,407 8.66 % 32.77 584 2.25 % 1.84 0 0 % 0 2,390 2.35 % 4.40 26 10.48 % 6.57 2,817 15.04 % 15.03 8 0.87 % 0.20
dipl, dipl, 5,602 3.42 % 4.94 81 0.50 % 1.89 412 1.59 % 1.30 0 0 % 0 4,280 4.22 % 7.89 1 0.40 % 0.25 819 4.37 % 4.37 9 0.98 % 0.23
parc, parc, 3,955 2.42 % 3.49 19 0.12 % 0.44 2,453 9.44 % 7.72 0 0 % 0 1,470 1.45 % 2.71 4 1.61 % 1.01 9 0.05 % 0.05 0 0 % 0
naniz, naniz , 3,743 2.29 % 3.30 0 0 % 0 0 0 % 0 0 0 % 0 3,743 3.69 % 6.90 0 0 % 0 0 0 % 0 0 0 % 0
brit, brit, 3,261 1.99 % 2.87 56 0.34 % 1.30 2,446 9.41 % 7.69 0 0 % 0 738 0.73 % 1.36 0 0 % 0 21 0.11 % 0.11 0 0 % 0
msgr, msgr, 3,221 1.97 % 2.84 19 0.12 % 0.44 116 0.45 % 0.36 0 0 % 0 2,980 2.94 % 5.49 0 0 % 0 90 0.48 % 0.48 16 1.75 % 0.40
slov, slov, 2,932 1.79 % 2.58 630 3.88 % 14.67 236 0.91 % 0.74 0 0 % 0 1,789 1.76 % 3.30 1 0.40 % 0.25 239 1.28 % 1.28 37 4.04 % 0.93
univ, univ, 2,922 1.79 % 2.58 89 0.55 % 2.07 296 1.14 % 0.93 0 0 % 0 2,067 2.04 % 3.81 2 0.81 % 0.51 463 2.47 % 2.47 5 0.55 % 0.13
stol, stol,
2,715 1.66 % 2.39 527 3.24 % 12.27 154 0.59 % 0.48 0 0 % 0 708 0.70 % 1.30 0 0 % 0 1,314 7.01 % 7.01 12 1.31 % 0.30
prev, prev, 2,524 1.54 % 2.22 1,713 10.54 % 39.90 60 0.23 % 0.19 0 0 % 0 350 0.34 % 0.64 5 2.02 % 1.26 150 0.80 % 0.80 246 26.83 % 6.19
ibid, ibid, 2,175 1.33 % 1.92 1,945 11.97 % 45.30 75 0.29 % 0.24 0 0 % 0 8 0.01 % 0.01 2 0.81 % 0.51 145 0.77 % 0.77 0 0 % 0
ponov, ponov , 1,907 1.17 % 1.68 0 0 % 0 0 0 % 0 0 0 % 0 1,906 1.88 % 3.51 0 0 % 0 0 0 % 0 1 0.11 % 0.03
nadalj, nadal j, 1,606 0.98 % 1.42 2 0.01 % 0.05 144 0.55 % 0.45 0 0 % 0 1,457 1.44 % 2.68 1 0.40 % 0.25 2 0.01 % 0.01 0 0 % 0
asist, asist , 1,359 0.83 % 1.20 8 0.05 % 0.19 342 1.32 % 1.08 0 0 % 0 513 0.51 % 0.95 0 0 % 0 496 2.65 % 2.65 0 0 % 0
spec, spec, 1,274 0.78 % 1.12 20 0.12 % 0.47 256 0.98 % 0.81 0 0 % 0 660 0.65 % 1.22 0 0 % 0 336 1.79 % 1.79 2 0.22 % 0.05
film, film, 1,206 0.74 % 1.06 55 0.34 % 1.28 133 0.51 % 0.42 0 0 % 0 1,000 0.98 % 1.84 1 0.40 % 0.25 17 0.09 % 0.09 0 0 % 0
štev, štev, 1,168 0.71 % 1.03 80 0.49 % 1.86 151 0.58 % 0.47 0 0 % 0 751 0.74 % 1.38 9 3.63 % 2.28 161 0.86 % 0.86 16 1.75 % 0.40
pribl, pribl , 1,089 0.67 % 0.96 140 0.86 % 3.26 269 1.03 % 0.85 0 0 % 0 325 0.32 % 0.60 20 8.06 % 5.06 335 1.79 % 1.79 0 0 % 0
nasl, nasl, 966 0.59 % 0.85 131 0.81 % 3.05 769 2.96 % 2.42 0 0 % 0 12 0.01 % 0.02 1 0.40 % 0.25 52 0.28 % 0.28 1 0.11 % 0.03
franc, franc , 896 0.55 % 0.79 55 0.34 % 1.28 9 0.04 % 0.03 0 0 % 0 818 0.81 % 1.51 0 0 % 0 14 0.07 % 0.07 0 0 % 0
akad, akad, 876 0.54 % 0.77 74 0.46 % 1.72 117 0.45 % 0.37 0 0 % 0 656 0.65 % 1.21 2 0.81 % 0.51 26 0.14 % 0.14 1 0.11 % 0.03
avstral, avstr al, 818 0.50 % 0.72 8 0.05 % 0.19 2 0.01 % 0.01 0 0 % 0 807 0.80 % 1.49 0 0 % 0 1 0.01 % 0.01 0 0 % 0
pogl, pogl, 740 0.45 % 0.65 670 4.12 % 15.61 11 0.04 % 0.03 0 0 % 0 5 0.01 % 0.01 1 0.40 % 0.25 45 0.24 % 0.24 8 0.87 % 0.20
pren, pren, 713 0.44 % 0.63 1 0.01 % 0.02 19 0.07 % 0.06 0 0 % 0 693 0.68 % 1.28 0 0 % 0 0 0 % 0 0 0 % 0
alter, alter , 638 0.39 % 0.56 0 0 % 0 2 0.01 % 0.01 0 0 % 0 624 0.61 % 1.15 0 0 % 0 12 0.06 % 0.06 0 0 % 0
šport, šport , 612 0.37 % 0.54 3 0.02 % 0.07 10 0.04 % 0.03 0 0 % 0 530 0.52 % 0.98 0 0 % 0 67 0.36 % 0.36 2 0.22 % 0.05
posn, posn, 597 0.36 % 0.53 0 0 % 0 0 0 % 0 0 0 % 0 596 0.59 % 1.10 0 0 % 0 1 0.01 % 0.01 0 0 % 0
madž, madž, 552 0.34 % 0.49 114 0.70 % 2.66 30 0.12 % 0.09 0 0 % 0 401 0.40 % 0.74 0 0 % 0 4 0.02 % 0.02 3 0.33 % 0.08

1.2.236 List of final character-level 1-grams from abbreviation lower-case word forms in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-word_parts-abbreviations-lowercase_forms-final-1grams-taxonomy-entire.tsv
Lower-case word form; Rest of the word; Final part of the word; Total absolute frequency of lower-case word form; Percentage of all found lower-case word forms; Total relative frequency (per million occurrences); Absolute frequency [SSJ.T.K.S]; Percentage [SSJ.T.K.S]; Relative frequency [SSJ.T.K.S]; Absolute frequency [SSJ.I]; Percentage [SSJ.I]; Relative frequency [SSJ.I]; Absolute frequency [SSJ.T.K.N]; Percentage [SSJ.T.K.N]; Relative frequency [SSJ.T.K.N]; Absolute frequency [SSJ.T.P.C]; Percentage [SSJ.T.P.C]; Relative frequency [SSJ.T.P.C]; Absolute frequency [SSJ.T.D]; Percentage [SSJ.T.D]; Relative frequency [SSJ.T.D]; Absolute frequency [SSJ.T.P.R]; Percentage [SSJ.T.P.R]; Relative frequency [SSJ.T.P.R]; Absolute frequency [SSJ.T.K.L]; Percentage [SSJ.T.K.L]; Relative frequency [SSJ.T.K.L]
dr, dr , 322,077 10.23 % 283.84 4,941 1.97 % 115.08 39,853 7.31 % 125.35 0 0 % 0 222,930 11.94 % 410.76 476 5.17 % 120.33 51,452 11.39 % 274.53 2,425 9.78 % 61.06
d, d , 197,296 6.27 % 173.88 4,623 1.85 % 107.68 32,931 6.04 % 103.58 0 0 % 0 131,940 7.07 % 243.11 659 7.16 % 166.59 26,515 5.87 % 141.48 628 2.53 % 15.81
oz, oz , 136,593 4.34 % 120.38 11,031 4.41 % 256.93 62,292 11.43 % 195.92 0 0 % 0 40,351 2.16 % 74.35 329 3.57 % 83.17 22,364 4.95 % 119.33 226 0.91 % 5.69
o, o , 131,455 4.17 %
115.85 2,225 0.89 % 51.82 19,598 3.60 % 61.64 0 0 % 0 84,948 4.55 % 156.52 215 2.33 % 54.35 24,114 5.34 % 128.66 355 1.43 % 8.94
m, m , 124,724 3.96 % 109.92 6,590 2.63 % 153.49 11,023 2.02 % 34.67 0 0 % 0 95,479 5.11 % 175.93 148 1.61 % 37.41 10,946 2.42 % 58.40 538 2.17 % 13.55
t, t , 114,212 3.63 % 100.65 6,635 2.65 % 154.54 30,333 5.56 % 95.41 0 0 % 0 57,654 3.09 % 106.23 134 1.46 % 33.87 18,938 4.19 % 101.05 518 2.09 % 13.04
i, i , 112,680 3.58 % 99.30 6,709 2.68 % 156.26 29,913 5.49 % 94.08 0 0 % 0 57,762 3.09 % 106.43 227 2.46 % 57.38 17,470 3.87 % 93.21 599 2.42 % 15.08
p, p , 97,522 3.10 % 85.95 3,551 1.42 % 82.71 12,632 2.32 % 39.73 0 0 % 0 68,239 3.65 % 125.73 136 1.48 % 34.38 12,497 2.77 % 66.68 467 1.88 % 11.76
s, s , 96,007 3.05 % 84.61 5,029 2.01 % 117.13 9,265 1.70 % 29.14 0 0 % 0 71,070 3.81 % 130.95 72 0.78 % 18.20 9,783 2.17 % 52.20 788 3.18 % 19.84
št, št , 94,645 3.01 % 83.41 7,171 2.87 % 167.02 44,772 8.21 % 140.82 0 0 % 0 28,295 1.51 % 52.14 1,913 20.77 % 483.59 11,553 2.56 % 61.64 941 3.79 % 23.69
npr, npr , 92,015 2.92 % 81.09 22,440 8.97 % 522.65 14,523 2.66 % 45.68 0 0 % 0 24,837 1.33 % 45.76 311 3.38 % 78.62 29,590 6.55 % 157.88 314 1.27 % 7.91
a, a , 90,930 2.89 % 80.14 7,199 2.88 % 167.67 20,714 3.80 % 65.15 0 0 % 0 51,507 2.76 % 94.91 278 3.02 % 70.28 10,006 2.21 % 53.39 1,226 4.95 % 30.87
j, j , 88,597 2.81 % 78.08 7,582 3.03 % 176.59 9,608 1.76 % 30.22 0 0 % 0 61,530 3.29 % 113.37 88 0.95 % 22.25 8,701 1.93 % 46.43 1,088 4.39 % 27.39
b, b , 78,541 2.50 % 69.22 3,823 1.53 % 89.04 9,968 1.83 % 31.35 0 0 % 0 56,849 3.04 % 104.75 259 2.81 % 65.47 6,528 1.45 % 34.83 1,114 4.49 % 28.05
itd, itd , 65,475 2.08 % 57.70 5,517 2.21 % 128.50 8,287 1.52 % 26.06 2 5.71 % 206.04 36,287 1.94 % 66.86 195 2.12 % 49.29 14,609 3.23 % 77.95 578 2.33 % 14.55
sv, sv , 65,377 2.08 % 57.62 5,754 2.30 % 134.02 7,698 1.41 % 24.21 0 0 % 0 44,951 2.41 % 82.83 57 0.62 % 14.41 6,229 1.38 % 33.24 688 2.77 % 17.32
tel, tel , 59,340 1.89 % 52.30 1,487 0.59 % 34.63 3,114 0.57 % 9.79 0 0 % 0 47,321 2.53 % 87.19 12 0.13 % 3.03 7,401 1.64 % 39.49 5 0.02 % 0.13
v, v , 59,163 1.88 % 52.14 2,403 0.96 % 55.97 10,850 1.99 % 34.13 0 0 % 0 38,197 2.05 % 70.38 102 1.11 % 25.78 7,237 1.60 % 38.61 374 1.51 % 9.42
k, k , 57,713 1.83 % 50.86 2,031 0.81 % 47.30 8,416 1.54 % 26.47 0 0 % 0 40,899 2.19 % 75.36 58 0.63 % 14.66 5,237 1.16 % 27.94 1,072 4.32 % 26.99
l, l , 55,731 1.77 % 49.12 4,807 1.92 % 111.96 4,983 0.91 % 15.67 0 0 % 0 37,229 1.99 % 68.60 93 1.01 % 23.51 7,783 1.72 % 41.53 836 3.37 % 21.05
prof, prof , 48,612 1.54 % 42.84 741 0.30 % 17.26 7,590 1.39 % 23.87 0 0 % 0 34,903 1.87 % 64.31 26 0.28 % 6.57 5,305 1.17 % 28.31 47 0.19 % 1.18
g, g , 47,260 1.50 % 41.65 2,978 1.19 % 69.36 4,354 0.80 % 13.69 0 0 % 0 34,400 1.84 % 63.38 69 0.75 % 17.44 4,878 1.08 % 26.03 581 2.34 % 14.63
str, str , 47,194 1.50 % 41.59 28,190 11.27 % 656.58 3,546 0.65 % 11.15 0 0 % 0 3,788 0.20 % 6.98 33 0.36 % 8.34 10,527 2.33 % 56.17 1,110 4.48 % 27.95
r, r , 46,116 1.47 % 40.64 3,576 1.43 % 83.29 5,342 0.98 % 16.80 0 0 % 0 30,906 1.66 % 56.95 117 1.27 % 29.58 5,609 1.24 % 29.93 566 2.28 % 14.25
c, c , 42,313 1.34 % 37.29 4,977 1.99 % 115.92 6,573 1.21 % 20.67 33 94.29 % 3,399.61 21,307 1.14 % 39.26 140 1.52 % 35.39 8,931 1.98 % 47.65 352 1.42 % 8.86
mag, mag , 40,143 1.27 % 35.38 471 0.19 % 10.97 4,511 0.83 % 14.19 0 0 % 0 29,459 1.58 % 54.28 65 0.71 % 16.43 5,620 1.24 % 29.99 17 0.07 % 0.43
n, n , 36,882 1.17 % 32.50 4,774 1.91 % 111.19 3,482 0.64 % 10.95 0 0 % 0 22,698 1.22 % 41.82 353 3.83 % 89.23 5,182 1.15 % 27.65 393 1.58 % 9.90
st, st , 36,399 1.16 % 32.08 1,228 0.49 % 28.60 11,515 2.11 % 36.22 0 0 % 0 19,248 1.03 % 35.47 30 0.33 % 7.58 3,431 0.76 % 18.31 947 3.82 % 23.84
op, op , 35,928 1.14 % 31.66 3,481 1.39 % 81.08 11,819 2.17 % 37.17 0 0 % 0 15,464 0.83 % 28.49 8 0.09 % 2.02 4,702 1.04 % 25.09 454 1.83 % 11.43
ipd, ipd , 29,413 0.93 % 25.92 5,227 2.09 % 121.74 4,394 0.81 % 13.82 0 0 % 0 12,438 0.67 % 22.92 178 1.93 % 45 7,015 1.55 % 37.43 161 0.65 % 4.05
f, f , 28,327 0.90 % 24.96 2,349 0.94 % 54.71 3,285 0.60 % 10.33 0 0 % 0 18,757 1.00 % 34.56 362 3.93 % 91.51 3,366 0.74 % 17.96 208 0.84 % 5.24
e, e , 27,090 0.86 % 23.87 2,854 1.14 % 66.47 4,675 0.86 % 14.70 0 0 % 0 14,894 0.80 % 27.44 35 0.38 % 8.85 4,388 0.97 % 23.41 244 0.98 % 6.14

1.2.237 List of final character-level 2-grams from abbreviation lower-case word forms in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-word_parts-abbreviations-lowercase_forms-final-2grams-taxonomy-entire.tsv
Lower-case word form; Rest of the word; Final part of the word; Total absolute frequency of lower-case word form; Percentage of all found lower-case word forms; Total relative frequency (per million occurrences); Absolute frequency [SSJ.T.K.S]; Percentage [SSJ.T.K.S]; Relative frequency [SSJ.T.K.S]; Absolute frequency [SSJ.I]; Percentage [SSJ.I]; Relative frequency [SSJ.I]; Absolute frequency [SSJ.T.K.N]; Percentage [SSJ.T.K.N]; Relative frequency [SSJ.T.K.N]; Absolute frequency [SSJ.T.P.C]; Percentage [SSJ.T.P.C]; Relative frequency [SSJ.T.P.C]; Absolute frequency [SSJ.T.D]; Percentage [SSJ.T.D]; Relative frequency [SSJ.T.D]; Absolute frequency [SSJ.T.P.R]; Percentage [SSJ.T.P.R]; Relative frequency [SSJ.T.P.R]; Absolute frequency [SSJ.T.K.L]; Percentage [SSJ.T.K.L]; Relative frequency [SSJ.T.K.L]
dr, d r, 322,077 10.23 % 283.84 4,941 1.97 % 115.08 39,853 7.31 % 125.35 0 0 % 0 222,930 11.94 % 410.76 476 5.17 % 120.33 51,452 11.39 % 274.53 2,425 9.78 % 61.06
d, d, 197,296 6.27 % 173.88 4,623 1.85 % 107.68 32,931 6.04 % 103.58 0 0 % 0 131,940 7.07 % 243.11 659 7.16 % 166.59 26,515 5.87 % 141.48 628 2.53 % 15.81
oz, o z, 136,593 4.34 % 120.38 11,031 4.41 % 256.93 62,292 11.43 % 195.92 0 0 % 0 40,351 2.16 % 74.35 329 3.57 % 83.17 22,364 4.95 % 119.33 226 0.91 % 5.69
o, o, 131,455 4.17 % 115.85 2,225 0.89 % 51.82 19,598 3.60 % 61.64 0 0 % 0 84,948 4.55 % 156.52 215 2.33 % 54.35 24,114
5.34 % 128.66 355 1.43 % 8.94
m, m, 124,724 3.96 % 109.92 6,590 2.63 % 153.49 11,023 2.02 % 34.67 0 0 % 0 95,479 5.11 % 175.93 148 1.61 % 37.41 10,946 2.42 % 58.40 538 2.17 % 13.55
t, t, 114,212 3.63 % 100.65 6,635 2.65 % 154.54 30,333 5.56 % 95.41 0 0 % 0 57,654 3.09 % 106.23 134 1.46 % 33.87 18,938 4.19 % 101.05 518 2.09 % 13.04
i, i, 112,680 3.58 % 99.30 6,709 2.68 % 156.26 29,913 5.49 % 94.08 0 0 % 0 57,762 3.09 % 106.43 227 2.46 % 57.38 17,470 3.87 % 93.21 599 2.42 % 15.08
p, p, 97,522 3.10 % 85.95 3,551 1.42 % 82.71 12,632 2.32 % 39.73 0 0 % 0 68,239 3.65 % 125.73 136 1.48 % 34.38 12,497 2.77 % 66.68 467 1.88 % 11.76
s, s, 96,007 3.05 % 84.61 5,029 2.01 % 117.13 9,265 1.70 % 29.14 0 0 % 0 71,070 3.81 % 130.95 72 0.78 % 18.20 9,783 2.17 % 52.20 788 3.18 % 19.84
št, š t, 94,645 3.01 % 83.41 7,171 2.87 % 167.02 44,772 8.21 % 140.82 0 0 % 0 28,295 1.51 % 52.14 1,913 20.77 % 483.59 11,553 2.56 % 61.64 941 3.79 % 23.69
npr, np r, 92,015 2.92 % 81.09 22,440 8.97 % 522.65 14,523 2.66 % 45.68 0 0 % 0 24,837 1.33 % 45.76 311 3.38 % 78.62 29,590 6.55 % 157.88 314 1.27 % 7.91
a, a, 90,930 2.89 % 80.14 7,199 2.88 % 167.67 20,714 3.80 % 65.15 0 0 % 0 51,507 2.76 % 94.91 278 3.02 % 70.28 10,006 2.21 % 53.39 1,226 4.95 % 30.87
j, j, 88,597 2.81 % 78.08 7,582 3.03 % 176.59 9,608 1.76 % 30.22 0 0 % 0 61,530 3.29 % 113.37 88 0.95 % 22.25 8,701 1.93 % 46.43 1,088 4.39 % 27.39
b, b, 78,541 2.50 % 69.22 3,823 1.53 % 89.04 9,968 1.83 % 31.35 0 0 % 0 56,849 3.04 % 104.75 259 2.81 % 65.47 6,528 1.45 % 34.83 1,114 4.49 % 28.05
itd, it d, 65,475 2.08 % 57.70 5,517 2.21 % 128.50 8,287 1.52 % 26.06 2 5.71 % 206.04 36,287 1.94 % 66.86 195 2.12 % 49.29 14,609 3.23 % 77.95 578 2.33 % 14.55
sv, s v, 65,377 2.08 % 57.62 5,754 2.30 % 134.02 7,698 1.41 % 24.21 0 0 % 0 44,951 2.41 % 82.83 57 0.62 % 14.41 6,229 1.38 % 33.24 688 2.77 % 17.32
tel, te l, 59,340 1.89 % 52.30 1,487 0.59 % 34.63 3,114 0.57 % 9.79 0 0 % 0 47,321 2.53 % 87.19 12 0.13 % 3.03 7,401 1.64 % 39.49 5 0.02 % 0.13
v, v, 59,163 1.88 % 52.14 2,403 0.96 % 55.97 10,850 1.99 % 34.13 0 0 % 0 38,197 2.05 % 70.38 102 1.11 % 25.78 7,237 1.60 % 38.61 374 1.51 % 9.42
k, k, 57,713 1.83 % 50.86 2,031 0.81 % 47.30 8,416 1.54 % 26.47 0 0 % 0 40,899 2.19 % 75.36 58 0.63 % 14.66 5,237 1.16 % 27.94 1,072 4.32 % 26.99
l, l, 55,731 1.77 % 49.12 4,807 1.92 % 111.96 4,983 0.91 % 15.67 0 0 % 0 37,229 1.99 % 68.60 93 1.01 % 23.51 7,783 1.72 % 41.53 836 3.37 % 21.05
prof, pro f, 48,612 1.54 % 42.84 741 0.30 % 17.26 7,590 1.39 % 23.87 0 0 % 0 34,903 1.87 % 64.31 26 0.28 % 6.57 5,305 1.17 % 28.31 47 0.19 % 1.18
g, g, 47,260 1.50 % 41.65 2,978 1.19 % 69.36 4,354 0.80 % 13.69 0 0 % 0 34,400 1.84 % 63.38 69 0.75 % 17.44 4,878 1.08 % 26.03 581 2.34 % 14.63
str, st r, 47,194 1.50 % 41.59 28,190 11.27 % 656.58 3,546 0.65 % 11.15 0 0 % 0 3,788 0.20 % 6.98 33 0.36 % 8.34 10,527 2.33 % 56.17 1,110 4.48 % 27.95
r, r, 46,116 1.47 % 40.64 3,576 1.43 % 83.29 5,342 0.98 % 16.80 0 0 % 0 30,906 1.66 % 56.95 117 1.27 % 29.58 5,609 1.24 % 29.93 566 2.28 % 14.25
c, c, 42,313 1.34 % 37.29 4,977 1.99 % 115.92 6,573 1.21 % 20.67 33 94.29 % 3,399.61 21,307 1.14 % 39.26 140 1.52 % 35.39 8,931 1.98 % 47.65 352 1.42 % 8.86
mag, ma g, 40,143 1.27 % 35.38 471 0.19 % 10.97 4,511 0.83 % 14.19 0 0 % 0 29,459 1.58 % 54.28 65 0.71 % 16.43 5,620 1.24 % 29.99 17 0.07 % 0.43
n, n, 36,882 1.17 % 32.50 4,774 1.91 % 111.19 3,482 0.64 % 10.95 0 0 % 0 22,698 1.22 % 41.82 353 3.83 % 89.23 5,182 1.15 % 27.65 393 1.58 % 9.90
st, s t, 36,399 1.16 % 32.08 1,228 0.49 % 28.60 11,515 2.11 % 36.22 0 0 % 0 19,248 1.03 % 35.47 30 0.33 % 7.58 3,431 0.76 % 18.31 947 3.82 % 23.84
op, o p, 35,928 1.14 % 31.66 3,481 1.39 % 81.08 11,819 2.17 % 37.17 0 0 % 0 15,464 0.83 % 28.49 8 0.09 % 2.02 4,702 1.04 % 25.09 454 1.83 % 11.43
ipd, ip d, 29,413 0.93 % 25.92 5,227 2.09 % 121.74 4,394 0.81 % 13.82 0 0 % 0 12,438 0.67 % 22.92 178 1.93 % 45 7,015 1.55 % 37.43 161 0.65 % 4.05
f, f, 28,327 0.90 % 24.96 2,349 0.94 % 54.71 3,285 0.60 % 10.33 0 0 % 0 18,757 1.00 % 34.56 362 3.93 % 91.51 3,366 0.74 % 17.96 208 0.84 % 5.24
e, e, 27,090 0.86 % 23.87 2,854 1.14 % 66.47 4,675 0.86 % 14.70 0 0 % 0 14,894 0.80 % 27.44 35 0.38 % 8.85 4,388 0.97 % 23.41 244 0.98 % 6.14

1.2.238 List of final character-level 3-grams from abbreviation lower-case word forms in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-word_parts-abbreviations-lowercase_forms-final-3grams-taxonomy-entire.tsv
Lower-case word form; Rest of the word; Final part of the word; Total absolute frequency of lower-case word form; Percentage of all found lower-case word forms; Total relative frequency (per million occurrences); Absolute frequency [SSJ.T.K.S]; Percentage [SSJ.T.K.S]; Relative frequency [SSJ.T.K.S]; Absolute frequency [SSJ.I]; Percentage [SSJ.I]; Relative frequency [SSJ.I]; Absolute frequency [SSJ.T.K.N]; Percentage [SSJ.T.K.N]; Relative frequency [SSJ.T.K.N]; Absolute frequency [SSJ.T.P.C]; Percentage [SSJ.T.P.C]; Relative frequency [SSJ.T.P.C]; Absolute frequency [SSJ.T.D]; Percentage [SSJ.T.D]; Relative frequency [SSJ.T.D]; Absolute frequency [SSJ.T.P.R]; Percentage [SSJ.T.P.R]; Relative frequency [SSJ.T.P.R]; Absolute frequency [SSJ.T.K.L]; Percentage [SSJ.T.K.L]; Relative frequency [SSJ.T.K.L]
dr, dr, 322,077 21.49 % 283.84 4,941 3.16 % 115.08 39,853 13.50 % 125.35 0 0 % 0 222,930 28.22 % 410.76 476 9.49 % 120.33 51,452 21.42 % 274.53 2,425 20.54 % 61.06
oz, oz, 136,593 9.12 % 120.38 11,031 7.05 % 256.93 62,292 21.10 % 195.92 0 0 % 0 40,351 5.11 % 74.35 329 6.56 % 83.17 22,364 9.31 % 119.33 226 1.92 % 5.69
št, št, 94,645 6.32 % 83.41 7,171 4.58 % 167.02 44,772 15.16 % 140.82 0 0 % 0 28,295 3.58 % 52.14 1,913 38.13 % 483.59 11,553 4.81 % 61.64 941 7.97 % 23.69
npr, n pr, 92,015 6.14 % 81.09 22,440 14.34 % 522.65 14,523 4.92 % 45.68 0 0 % 0 24,837 3.15 % 45.76 311 6.20 % 78.62 29,590 12.32 % 157.88 314 2.66 % 7.91
itd, i td, 65,475 4.37 % 57.70 5,517 3.53 % 128.50 8,287 2.81 % 26.06 2 100.00 %
206.04 36,287 4.59 % 66.86 195 3.89 % 49.29 14,609 6.08 % 77.95 578 4.90 % 14.55 sv, sv, 65,377 4.36 % 57.62 5,754 3.68 % 134.02 7,698 2.61 % 24.21 0 0 % 0 44,951 5.69 % 82.83 57 1.14 % 14.41 6,229 2.59 % 33.24 688 5.83 % 17.32 tel, t el, 59,340 3.96 % 52.30 1,487 0.95 % 34.63 3,114 1.05 % 9.79 0 0 % 0 47,321 5.99 % 87.19 12 0.24 % 3.03 7,401 3.08 % 39.49 5 0.04 % 0.13 prof, pr of, 48,612 3.24 % 42.84 741 0.47 % 17.26 7,590 2.57 % 23.87 0 0 % 0 34,903 4.42 % 64.31 26 0.52 % 6.57 5,305 2.21 % 28.31 47 0.40 % 1.18 str, s tr, 47,194 3.15 % 41.59 28,190 18.01 % 656.58 3,546 1.20 % 11.15 0 0 % 0 3,788 0.48 % 6.98 33 0.66 % 8.34 10,527 4.38 % 56.17 1,110 9.40 % 27.95 mag, m ag, 40,143 2.68 % 35.38 471 0.30 % 10.97 4,511 1.53 % 14.19 0 0 % 0 29,459 3.73 % 54.28 65 1.30 % 16.43 5,620 2.34 % 29.99 17 0.14 % 0.43 st, st, 36,399 2.43 % 32.08 1,228 0.79 % 28.60 11,515 3.90 % 36.22 0 0 % 0 19,248 2.44 % 35.47 30 0.60 % 7.58 3,431 1.43 % 18.31 947 8.02 % 23.84 op, op, 35,928 2.40 % 31.66 3,481 2.22 % 81.08 11,819 4.00 % 37.17 0 0 % 0 15,464 1.96 % 28.49 8 0.16 % 2.02 4,702 1.96 % 25.09 454 3.85 % 11.43 ipd, i pd, 29,413 1.96 % 25.92 5,227 3.34 % 121.74 4,394 1.49 % 13.82 0 0 % 0 12,438 1.57 % 22.92 178 3.55 % 45 7,015 2.92 % 37.43 161 1.36 % 4.05 itn, i tn, 16,421 1.10 % 14.47 2,446 1.56 % 56.97 1,646 0.56 % 5.18 0 0 % 0 8,494 1.07 % 15.65 157 3.13 % 39.69 3,546 1.48 % 18.92 132 1.12 % 3.32 odst, od st, 16,149 1.08 % 14.23 46 0.03 % 1.07 5,065 1.72 % 15.93 0 0 % 0 10,780 1.36 % 19.86 1 0.02 % 0.25 256 0.11 % 1.37 1 0.01 % 0.03 am, am, 10,990 0.73 % 9.69 95 0.06 % 2.21 71 0.02 % 0.22 0 0 % 0 10,748 1.36 % 19.80 0 0 % 0 74 0.03 % 0.39 2 0.02 % 0.05 tj, tj, 10,915 0.73 % 9.62 2,778 1.77 % 64.70 2,073 0.70 % 6.52 0 0 % 0 3,085 0.39 % 5.68 6 0.12 % 1.52 2,880 1.20 % 15.37 93 0.79 % 2.34 med, m ed, 10,326 0.69 % 9.10 53 0.03 % 1.23 1,028 0.35 % 3.23 0 0 % 0 5,652 0.72 % 10.41 0 0 % 0 3,586 1.49 % 19.13 7 0.06 % 0.18 čl, čl, 10,036 0.67 % 8.84 183 0.12 % 4.26 8,198 2.78 % 25.78 0 0 % 0 
1,356 0.17 % 2.50 12 0.24 % 3.03 282 0.12 % 1.50 5 0.04 % 0.13
nan. n an. 9,178 0.61 % 8.09 3 0 % 0.07 102 0.04 % 0.32 0 0 % 0 9,052 1.15 % 16.68 0 0 % 0 18 0.01 % 0.10 3 0.03 % 0.08
prim. pr im. 8,714 0.58 % 7.68 3,218 2.06 % 74.95 907 0.31 % 2.85 0 0 % 0 2,949 0.37 % 5.43 11 0.22 % 2.78 1,535 0.64 % 8.19 94 0.80 % 2.37
ur. ur. 8,434 0.56 % 7.43 562 0.36 % 13.09 4,166 1.41 % 13.10 0 0 % 0 3,021 0.38 % 5.57 10 0.20 % 2.53 647 0.27 % 3.45 28 0.24 % 0.71
pr. pr. 8,211 0.55 % 7.24 2,655 1.70 % 61.84 892 0.30 % 2.81 0 0 % 0 2,146 0.27 % 3.95 255 5.08 % 64.46 2,206 0.92 % 11.77 57 0.48 % 1.44
amer. am er. 8,026 0.54 % 7.07 11 0.01 % 0.26 4 0 % 0.01 0 0 % 0 8,006 1.01 % 14.75 0 0 % 0 5 0 % 0.03 0 0 % 0
idr. i dr. 7,791 0.52 % 6.87 1,650 1.05 % 38.43 988 0.34 % 3.11 0 0 % 0 3,765 0.48 % 6.94 33 0.66 % 8.34 1,322 0.55 % 7.05 33 0.28 % 0.83
dok. d ok. 7,725 0.52 % 6.81 49 0.03 % 1.14 98 0.03 % 0.31 0 0 % 0 7,494 0.95 % 13.81 3 0.06 % 0.76 78 0.03 % 0.42 3 0.03 % 0.08
nad. n ad. 7,233 0.48 % 6.37 2 0 % 0.05 22 0.01 % 0.07 0 0 % 0 7,181 0.91 % 13.23 2 0.04 % 0.51 23 0.01 % 0.12 3 0.03 % 0.08
angl. an gl. 7,232 0.48 % 6.37 1,407 0.90 % 32.77 584 0.20 % 1.84 0 0 % 0 2,390 0.30 % 4.40 26 0.52 % 6.57 2,817 1.17 % 15.03 8 0.07 % 0.20
ul. ul. 6,190 0.41 % 5.46 58 0.04 % 1.35 508 0.17 % 1.60 0 0 % 0 4,418 0.56 % 8.14 2 0.04 % 0.51 1,201 0.50 % 6.41 3 0.03 % 0.08
ml. ml. 5,973 0.40 % 5.26 74 0.05 % 1.72 446 0.15 % 1.40 0 0 % 0 4,939 0.62 % 9.10 4 0.08 % 1.01 455 0.19 % 2.43 55 0.47 % 1.38
doc. d oc. 5,756 0.38 % 5.07 53 0.03 % 1.23 1,456 0.49 % 4.58 0 0 % 0 3,086 0.39 % 5.69 3 0.06 % 0.76 1,157 0.48 % 6.17 1 0.01 % 0.03
dipl. di pl. 5,602 0.37 % 4.94 81 0.05 % 1.89 412 0.14 % 1.30 0 0 % 0 4,280 0.54 % 7.89 1 0.02 % 0.25 819 0.34 % 4.37 9 0.08 % 0.23

File at CLARIN.SI
1.2.239 List of final character-level 4-grams from abbreviation lower-case word forms in the Gigafida 2.0 corpus with text-type distribution
GF2.0-word_parts-abbreviations-lowercase_forms-final-4grams-taxonomy-entire.tsv
Columns: Lower-case word form; Rest of the word; Final part of the word; Total absolute frequency of lower-case word form; Percentage of all found lower-case word forms; Total relative frequency (per million occurrences); Absolute frequency [SSJ.T.K.S]; Percentage [SSJ.T.K.S]; Relative frequency [SSJ.T.K.S]; Absolute frequency [SSJ.I]; Percentage [SSJ.I]; Relative frequency [SSJ.I]; Absolute frequency [SSJ.T.K.N]; Percentage [SSJ.T.K.N]; Relative frequency [SSJ.T.K.N]; Absolute frequency [SSJ.T.P.C]; Percentage [SSJ.T.P.C]; Relative frequency [SSJ.T.P.C]; Absolute frequency [SSJ.T.D]; Percentage [SSJ.T.D]; Relative frequency [SSJ.T.D]; Absolute frequency [SSJ.T.P.R]; Percentage [SSJ.T.P.R]; Relative frequency [SSJ.T.P.R]; Absolute frequency [SSJ.T.K.L]; Percentage [SSJ.T.K.L]; Relative frequency [SSJ.T.K.L]
npr. npr. 92,015 13.93 % 81.09 22,440 23.11 % 522.65 14,523 16.99 % 45.68 0 0 % 0 24,837 7.02 % 45.76 311 19.23 % 78.62 29,590 25.14 % 157.88 314 6.52 % 7.91
itd. itd. 65,475 9.91 % 57.70 5,517 5.68 % 128.50 8,287 9.69 % 26.06 2 100.00 % 206.04 36,287 10.25 % 66.86 195 12.06 % 49.29 14,609 12.41 % 77.95 578 12.01 % 14.55
tel. tel. 59,340 8.98 % 52.30 1,487 1.53 % 34.63 3,114 3.64 % 9.79 0 0 % 0 47,321 13.37 % 87.19 12 0.74 % 3.03 7,401 6.29 % 39.49 5 0.10 % 0.13
prof. p rof. 48,612 7.36 % 42.84 741 0.76 % 17.26 7,590 8.88 % 23.87 0 0 % 0 34,903 9.86 % 64.31 26 1.61 % 6.57 5,305 4.51 % 28.31 47 0.98 % 1.18
str. str. 47,194 7.14 % 41.59 28,190 29.03 % 656.58 3,546 4.15 % 11.15 0 0 % 0 3,788 1.07 % 6.98 33 2.04 % 8.34 10,527 8.94 % 56.17 1,110 23.06 % 27.95
mag. mag. 40,143 6.08 % 35.38 471 0.48 % 10.97 4,511 5.28 % 14.19 0 0 % 0 29,459 8.32 % 54.28 65 4.02 % 16.43 5,620 4.78 % 29.99 17 0.35 % 0.43
ipd. ipd. 29,413 4.45 % 25.92 5,227 5.38 % 121.74 4,394 5.14 % 13.82 0 0 % 0 12,438 3.52 % 22.92 178 11.01 % 45 7,015 5.96 % 37.43 161 3.35 % 4.05
itn. itn. 16,421 2.49 % 14.47 2,446 2.52 % 56.97 1,646 1.93 % 5.18 0 0 % 0 8,494 2.40 % 15.65 157 9.71 % 39.69 3,546 3.01 % 18.92 132 2.74 % 3.32
odst. o dst. 16,149 2.44 % 14.23 46 0.05 % 1.07 5,065 5.92 % 15.93 0 0 % 0 10,780 3.05 % 19.86 1 0.06 % 0.25 256 0.22 % 1.37 1 0.02 % 0.03
med. med. 10,326 1.56 % 9.10 53 0.06 % 1.23 1,028 1.20 % 3.23 0 0 % 0 5,652 1.60 % 10.41 0 0 % 0 3,586 3.05 % 19.13 7 0.14 % 0.18
nan. nan. 9,178 1.39 % 8.09 3 0 % 0.07 102 0.12 % 0.32 0 0 % 0 9,052 2.56 % 16.68 0 0 % 0 18 0.01 % 0.10 3 0.06 % 0.08
prim. p rim. 8,714 1.32 % 7.68 3,218 3.31 % 74.95 907 1.06 % 2.85 0 0 % 0 2,949 0.83 % 5.43 11 0.68 % 2.78 1,535 1.30 % 8.19 94 1.95 % 2.37
amer. a mer. 8,026 1.22 % 7.07 11 0.01 % 0.26 4 0.01 % 0.01 0 0 % 0 8,006 2.26 % 14.75 0 0 % 0 5 0 % 0.03 0 0 % 0
idr. idr. 7,791 1.18 % 6.87 1,650 1.70 % 38.43 988 1.16 % 3.11 0 0 % 0 3,765 1.06 % 6.94 33 2.04 % 8.34 1,322 1.12 % 7.05 33 0.69 % 0.83
dok. dok. 7,725 1.17 % 6.81 49 0.05 % 1.14 98 0.12 % 0.31 0 0 % 0 7,494 2.12 % 13.81 3 0.19 % 0.76 78 0.07 % 0.42 3 0.06 % 0.08
nad. nad. 7,233 1.09 % 6.37 2 0 % 0.05 22 0.03 % 0.07 0 0 % 0 7,181 2.03 % 13.23 2 0.12 % 0.51 23 0.02 % 0.12 3 0.06 % 0.08
angl. a ngl. 7,232 1.09 % 6.37 1,407 1.45 % 32.77 584 0.68 % 1.84 0 0 % 0 2,390 0.68 % 4.40 26 1.61 % 6.57 2,817 2.39 % 15.03 8 0.17 % 0.20
doc. doc. 5,756 0.87 % 5.07 53 0.06 % 1.23 1,456 1.70 % 4.58 0 0 % 0 3,086 0.87 % 5.69 3 0.19 % 0.76 1,157 0.98 % 6.17 1 0.02 % 0.03
dipl. d ipl. 5,602 0.85 % 4.94 81 0.08 % 1.89 412 0.48 % 1.30 0 0 % 0 4,280 1.21 % 7.89 1 0.06 % 0.25 819 0.70 % 4.37 9 0.19 % 0.23
inž. inž. 5,119 0.78 % 4.51 43 0.04 % 1 154 0.18 % 0.48 0 0 % 0 4,449 1.26 % 8.20 1 0.06 % 0.25 468 0.40 % 2.50 4 0.08 % 0.10
pon. pon. 4,874 0.74 % 4.30 104 0.11 % 2.42 18 0.02 % 0.06 0 0 % 0 4,611 1.30 % 8.50 4 0.25 % 1.01 137 0.12 % 0.73 0 0 % 0
roj. roj. 4,310 0.65 % 3.80 901 0.93 % 20.99 629 0.74 % 1.98 0 0 % 0 2,632 0.74 % 4.85 3 0.19 % 0.76 116 0.10 % 0.62 29 0.60 % 0.73
parc. p arc. 3,955 0.60 % 3.49 19 0.02 % 0.44 2,453 2.87 % 7.72 0 0 % 0 1,470 0.41 % 2.71 4 0.25 % 1.01 9
0.01 % 0.05 0 0 % 0
naniz. na niz. 3,743 0.57 % 3.30 0 0 % 0 0 0 % 0 0 0 % 0 3,743 1.06 % 6.90 0 0 % 0 0 0 % 0 0 0 % 0
nem. nem. 3,739 0.57 % 3.30 436 0.45 % 10.15 106 0.12 % 0.33 0 0 % 0 2,975 0.84 % 5.48 4 0.25 % 1.01 175 0.15 % 0.93 43 0.89 % 1.08
brit. b rit. 3,261 0.49 % 2.87 56 0.06 % 1.30 2,446 2.86 % 7.69 0 0 % 0 738 0.21 % 1.36 0 0 % 0 21 0.02 % 0.11 0 0 % 0
msgr. m sgr. 3,221 0.49 % 2.84 19 0.02 % 0.44 116 0.14 % 0.36 0 0 % 0 2,980 0.84 % 5.49 0 0 % 0 90 0.08 % 0.48 16 0.33 % 0.40
opr. opr. 3,179 0.48 % 2.80 357 0.37 % 8.31 2,596 3.04 % 8.17 0 0 % 0 204 0.06 % 0.38 0 0 % 0 22 0.02 % 0.12 0 0 % 0
slov. s lov. 2,932 0.44 % 2.58 630 0.65 % 14.67 236 0.28 % 0.74 0 0 % 0 1,789 0.51 % 3.30 1 0.06 % 0.25 239 0.20 % 1.28 37 0.77 % 0.93
univ. u niv. 2,922 0.44 % 2.58 89 0.09 % 2.07 296 0.35 % 0.93 0 0 % 0 2,067 0.58 % 3.81 2 0.12 % 0.51 463 0.39 % 2.47 5 0.10 % 0.13
odd. odd. 2,739 0.41 % 2.41 11 0.01 % 0.26 59 0.07 % 0.19 0 0 % 0 2,661 0.75 % 4.90 0 0 % 0 7 0.01 % 0.04 1 0.02 % 0.03
stol. s tol. 2,715 0.41 % 2.39 527 0.54 % 12.27 154 0.18 % 0.48 0 0 % 0 708 0.20 % 1.30 0 0 % 0 1,314 1.12 % 7.01 12 0.25 % 0.30

File at CLARIN.SI
1.2.240 List of final character-level 5-grams from abbreviation lower-case word forms in the Gigafida 2.0 corpus with text-type distribution
GF2.0-word_parts-abbreviations-lowercase_forms-final-5grams-taxonomy-entire.tsv
Columns: Lower-case word form; Rest of the word; Final part of the word; Total absolute frequency of lower-case word form; Percentage of all found lower-case word forms; Total relative frequency (per million occurrences); Absolute frequency [SSJ.T.K.S]; Percentage [SSJ.T.K.S]; Relative frequency [SSJ.T.K.S]; Absolute frequency [SSJ.I]; Percentage [SSJ.I]; Relative frequency [SSJ.I]; Absolute frequency [SSJ.T.K.N]; Percentage [SSJ.T.K.N]; Relative frequency [SSJ.T.K.N]; Absolute frequency [SSJ.T.P.C]; Percentage [SSJ.T.P.C]; Relative frequency [SSJ.T.P.C]; Absolute frequency [SSJ.T.D]; Percentage [SSJ.T.D]; Relative frequency [SSJ.T.D]; Absolute frequency [SSJ.T.P.R]; Percentage [SSJ.T.P.R]; Relative frequency [SSJ.T.P.R]; Absolute frequency [SSJ.T.K.L]; Percentage [SSJ.T.K.L]; Relative frequency [SSJ.T.K.L]
prof. prof. 48,612 29.71 % 42.84 741 4.56 % 17.26 7,590 29.21 % 23.87 0 0 % 0 34,903 34.39 % 64.31 26 10.48 % 6.57 5,305 28.32 % 28.31 47 5.12 % 1.18
odst. odst. 16,149 9.87 % 14.23 46 0.28 % 1.07 5,065 19.49 % 15.93 0 0 % 0 10,780 10.62 % 19.86 1 0.40 % 0.25 256 1.37 % 1.37 1 0.11 % 0.03
prim. prim. 8,714 5.33 % 7.68 3,218 19.80 % 74.95 907 3.49 % 2.85 0 0 % 0 2,949 2.91 % 5.43 11 4.43 % 2.78 1,535 8.20 % 8.19 94 10.25 % 2.37
amer. amer. 8,026 4.91 % 7.07 11 0.07 % 0.26 4 0.01 % 0.01 0 0 % 0 8,006 7.89 % 14.75 0 0 % 0 5 0.03 % 0.03 0 0 % 0
angl. angl. 7,232 4.42 % 6.37 1,407 8.66 % 32.77 584 2.25 % 1.84 0 0 % 0 2,390 2.35 % 4.40 26 10.48 % 6.57 2,817 15.04 % 15.03 8 0.87 % 0.20
dipl. dipl. 5,602 3.42 % 4.94 81 0.50 % 1.89 412 1.59 % 1.30 0 0 % 0 4,280 4.22 % 7.89 1 0.40 % 0.25 819 4.37 % 4.37 9 0.98 % 0.23
parc. parc. 3,955 2.42 % 3.49 19 0.12 % 0.44 2,453 9.44 % 7.72 0 0 % 0 1,470 1.45 % 2.71 4 1.61 % 1.01 9 0.05 % 0.05 0 0 % 0
naniz. n aniz. 3,743 2.29 % 3.30 0 0 % 0 0 0 % 0 0 0 % 0 3,743 3.69 % 6.90 0 0 % 0 0 0 % 0 0 0 % 0
brit. brit. 3,261 1.99 % 2.87 56 0.34 % 1.30 2,446 9.41 % 7.69 0 0 % 0 738 0.73 % 1.36 0 0 % 0 21 0.11 % 0.11 0 0 % 0
msgr. msgr. 3,221 1.97 % 2.84 19 0.12 % 0.44 116 0.45 % 0.36 0 0 % 0 2,980 2.94 % 5.49 0 0 % 0 90 0.48 % 0.48 16 1.75 % 0.40
slov. slov. 2,932 1.79 % 2.58 630 3.88 % 14.67 236 0.91 % 0.74 0 0 % 0 1,789 1.76 % 3.30 1 0.40 % 0.25 239 1.28 % 1.28 37 4.04 % 0.93
univ. univ. 2,922 1.79 % 2.58 89 0.55 % 2.07 296 1.14 % 0.93 0 0 % 0 2,067 2.04 % 3.81 2 0.81 % 0.51 463 2.47 % 2.47 5 0.55 % 0.13
stol. stol. 2,715 1.66 % 2.39 527 3.24 % 12.27 154 0.59 % 0.48 0 0 % 0 708 0.70 % 1.30 0 0 % 0 1,314 7.01 % 7.01 12 1.31 % 0.30
prev. prev. 2,524 1.54 % 2.22 1,713 10.54 % 39.90 60 0.23 % 0.19 0 0 % 0 350 0.34 % 0.64 5 2.02 % 1.26 150 0.80 % 0.80 246 26.83 % 6.19
ibid. ibid. 2,175 1.33 % 1.92 1,945 11.97 % 45.30 75 0.29 % 0.24 0 0 % 0 8 0.01 % 0.01 2 0.81 % 0.51 145 0.77 % 0.77 0 0 % 0
ponov. p onov. 1,907 1.17 % 1.68 0 0 % 0 0 0 % 0 0 0 % 0 1,906 1.88 % 3.51 0 0 % 0 0 0 % 0 1 0.11 % 0.03
nadalj. na dalj. 1,606 0.98 % 1.42 2 0.01 % 0.05 144 0.55 % 0.45 0 0 % 0 1,457 1.44 % 2.68 1 0.40 % 0.25 2 0.01 % 0.01 0 0 % 0
asist. a sist. 1,359 0.83 % 1.20 8 0.05 % 0.19 342 1.32 % 1.08 0 0 % 0 513 0.51 % 0.95 0 0 % 0 496 2.65 % 2.65 0 0 % 0
spec. spec. 1,274 0.78 % 1.12 20 0.12 % 0.47 256 0.98 % 0.81 0 0 % 0 660 0.65 % 1.22 0 0 % 0 336 1.79 % 1.79 2 0.22 % 0.05
film. film. 1,206 0.74 % 1.06 55 0.34 % 1.28 133 0.51 % 0.42 0 0 % 0 1,000 0.98 % 1.84 1 0.40 % 0.25 17 0.09 % 0.09 0 0 % 0
štev. štev. 1,168 0.71 % 1.03 80 0.49 % 1.86 151 0.58 % 0.47 0 0 % 0 751 0.74 % 1.38 9 3.63 % 2.28 161 0.86 % 0.86 16 1.75 % 0.40
pribl. p ribl. 1,089 0.67 % 0.96 140 0.86 % 3.26 269 1.03 % 0.85 0 0 % 0 325 0.32 % 0.60 20 8.06 % 5.06 335 1.79 % 1.79 0 0 % 0
nasl. nasl. 966 0.59 % 0.85 131 0.81 % 3.05 769 2.96 % 2.42 0 0 % 0 12 0.01 % 0.02 1 0.40 % 0.25 52 0.28 % 0.28 1 0.11 % 0.03
franc. f ranc. 896 0.55 % 0.79 55 0.34 % 1.28 9 0.04 % 0.03 0 0 % 0 818 0.81 % 1.51 0 0 % 0 14 0.07 % 0.07 0 0 % 0
akad. akad. 876 0.54 % 0.77 74 0.46 % 1.72 117 0.45 % 0.37 0 0 % 0 656 0.65 % 1.21 2 0.81 % 0.51 26 0.14 % 0.14 1 0.11 % 0.03
avstral. avs tral. 818 0.50 % 0.72 8 0.05 % 0.19 2 0.01 % 0.01 0 0 % 0 807 0.80 % 1.49 0 0 % 0 1 0.01 % 0.01 0 0 % 0
pogl. pogl. 740 0.45 % 0.65 670 4.12 % 15.61 11 0.04 % 0.03 0 0 % 0 5 0.01 % 0.01 1 0.40 % 0.25 45 0.24 % 0.24 8 0.87 % 0.20
pren. pren. 713 0.44 % 0.63 1 0.01 % 0.02 19 0.07 % 0.06 0 0 % 0 693 0.68 % 1.28 0 0 % 0 0 0 % 0 0 0 % 0
alter. a lter. 638 0.39 % 0.56 0 0 % 0 2 0.01 % 0.01 0 0 % 0 624 0.61 % 1.15 0 0 % 0 12 0.06 % 0.06 0 0 % 0
šport. š port. 612 0.37 % 0.54 3 0.02 % 0.07 10 0.04 % 0.03 0 0 % 0 530 0.52 % 0.98 0 0 % 0 67 0.36 % 0.36 2 0.22 % 0.05
posn. posn. 597
0.36 % 0.53 0 0 % 0 0 0 % 0 0 0 % 0 596 0.59 % 1.10 0 0 % 0 1 0.01 % 0.01 0 0 % 0
madž. madž. 552 0.34 % 0.49 114 0.70 % 2.66 30 0.12 % 0.09 0 0 % 0 401 0.40 % 0.74 0 0 % 0 4 0.02 % 0.02 3 0.33 % 0.08

File at CLARIN.SI
1.2.241 List of initial character-level 1-grams from residual lemmas in the Gigafida 2.0 corpus with text-type distribution
GF2.0-word_parts-residual-lemmas-initial-1grams-taxonomy-entire.tsv
Columns: Lemma; Lemma (lower-case); Initial part of the word; Rest of the word; Total absolute frequency of lemma; Percentage of all found lemmas; Total relative frequency (per million occurrences); Absolute frequency [SSJ.T.K.S]; Percentage [SSJ.T.K.S]; Relative frequency [SSJ.T.K.S]; Absolute frequency [SSJ.I]; Percentage [SSJ.I]; Relative frequency [SSJ.I]; Absolute frequency [SSJ.T.K.N]; Percentage [SSJ.T.K.N]; Relative frequency [SSJ.T.K.N]; Absolute frequency [SSJ.T.P.C]; Percentage [SSJ.T.P.C]; Relative frequency [SSJ.T.P.C]; Absolute frequency [SSJ.T.D]; Percentage [SSJ.T.D]; Relative frequency [SSJ.T.D]; Absolute frequency [SSJ.T.P.R]; Percentage [SSJ.T.P.R]; Relative frequency [SSJ.T.P.R]; Absolute frequency [SSJ.T.K.L]; Percentage [SSJ.T.K.L]; Relative frequency [SSJ.T.K.L]
de de d e 119,443 3.25 % 105.26 12,264 2.48 % 285.64 37,427 4.85 % 117.72 0 0 % 0 47,661 3.62 % 87.82 175 1.66 % 44.24 18,231 1.80 % 97.27 3,685 5.07 % 92.78
of of o f 88,222 2.40 % 77.75 19,500 3.94 % 454.18 19,565 2.54 % 61.54 0 0 % 0 23,377 1.78 % 43.07 120 1.14 % 30.33 24,881 2.46 % 132.76 779 1.07 % 19.61
the the t he 75,221 2.05 % 66.29 15,174 3.07 % 353.42 15,392 2.00 % 48.41 0 0 % 0 19,043 1.45 % 35.09 94 0.89 % 23.76 24,539 2.43 % 130.93 979 1.35 % 24.65
The the T he 66,216 1.80 % 58.36 7,608 1.54 % 177.20 20,382 2.64 % 64.11 0 0 % 0 21,304 1.62 % 39.25 65 0.62 % 16.43 16,340 1.62 % 87.18 517 0.71 % 13.02
and and a nd 48,667 1.32 % 42.89 14,524 2.94 % 338.28 9,262 1.20 % 29.13 0 0 % 0 9,755 0.74 % 17.97 36 0.34 % 9.10 14,531 1.44 % 77.53 559 0.77 % 14.08
la la l a 47,471 1.29 % 41.84 4,772 0.96 % 111.15 12,499 1.62 % 39.31 0 0 % 0 19,464 1.48 % 35.86 53 0.50 % 13.40 9,233 0.91 % 49.26 1,450 2.00 % 36.51
i i i 46,624 1.27 % 41.09 3,283 0.66 % 76.46 12,783 1.66 % 40.21 0 0 % 0 9,281 0.71 % 17.10 327 3.10 % 82.66 20,111 1.99 % 107.31 839 1.16 % 21.13
a a a 32,433 0.88 % 28.58 5,651 1.14 % 131.62 5,850 0.76 % 18.40 0 0 % 0 8,682 0.66 % 16 220 2.09 % 55.61 11,459 1.14 % 61.14 571 0.79 % 14.38
in in i n 22,434 0.61 % 19.77 5,021 1.01 % 116.95 4,201 0.55 % 13.21 0 0 % 0 5,728 0.43 % 10.55 39 0.37 % 9.86 7,168 0.71 % 38.25 277 0.38 % 6.97
di di d i 20,807 0.57 % 18.34 1,481 0.30 % 34.49 5,855 0.76 % 18.42 1 33.33 % 103.02 9,811 0.75 % 18.08 28 0.27 % 7.08 3,363 0.33 % 17.94 268 0.37 % 6.75
sta sta s ta 17,203 0.47 % 15.16 3 0 % 0.07 1,388 0.18 % 4.37 0 0 % 0 15,790 1.20 % 29.09 0 0 % 0 21 0 % 0.11 1 0 % 0.03
van van v an 16,380 0.45 % 14.44 528 0.11 % 12.30 6,171 0.80 % 19.41 0 0 % 0 8,131 0.62 % 14.98 18 0.17 % 4.55 1,246 0.12 % 6.65 286 0.39 % 7.20
el el e l 15,435 0.42 % 13.60 721 0.15 % 16.79 5,592 0.72 % 17.59 0 0 % 0 6,078 0.46 % 11.20 11 0.10 % 2.78 2,615 0.26 % 13.95 418 0.57 % 10.52
to to t o 15,224 0.41 % 13.42 3,150 0.64 % 73.37 2,841 0.37 % 8.94 0 0 % 0 3,583 0.27 % 6.60 18 0.17 % 4.55 5,425 0.54 % 28.95 207 0.28 % 5.21
for for f or 13,275 0.36 % 11.70 2,800 0.57 % 65.22 2,816 0.36 % 8.86 0 0 % 0 3,168 0.24 % 5.84 16 0.15 % 4.04 4,355 0.43 % 23.24 120 0.17 % 3.02
on on o n 12,675 0.34 % 11.17 2,313 0.47 % 53.87 2,767 0.36 % 8.70 0 0 % 0 3,727 0.28 % 6.87 8 0.08 % 2.02 3,678 0.36 % 19.62 182 0.25 % 4.58
pre pre p re 12,183 0.33 % 10.74 379 0.08 % 8.83 2,877 0.37 % 9.05 0 0 % 0 6,154 0.47 % 11.34 34 0.32 % 8.59 2,695 0.27 % 14.38 44 0.06 % 1.11
is is i s 11,863 0.32 % 10.45 1,865 0.38 % 43.44 3,514 0.46 % 11.05 0 0 % 0 2,493 0.19 % 4.59 3 0.03 % 0.76 3,726 0.37 % 19.88 262 0.36 % 6.60
von von v on 11,146 0.30 % 9.82 1,850 0.37 % 43.09 2,532 0.33 % 7.96 0 0 % 0 4,268 0.32 % 7.86 178
1.69 % 45 1,975 0.20 % 10.54 343 0.47 % 8.64
der der d er 10,665 0.29 % 9.40 2,093 0.42 % 48.75 3,064 0.40 % 9.64 0 0 % 0 3,773 0.29 % 6.95 28 0.27 % 7.08 1,459 0.14 % 7.78 248 0.34 % 6.24
et et e t 9,624 0.26 % 8.48 3,354 0.68 % 78.12 1,351 0.17 % 4.25 0 0 % 0 2,923 0.22 % 5.39 26 0.25 % 6.57 1,606 0.16 % 8.57 364 0.50 % 9.17
ta ta t a 9,565 0.26 % 8.43 1,178 0.24 % 27.44 2,441 0.32 % 7.68 0 0 % 0 2,643 0.20 % 4.87 15 0.14 % 3.79 2,979 0.29 % 15.89 309 0.42 % 7.78
da da d a 9,128 0.25 % 8.04 554 0.11 % 12.90 2,267 0.29 % 7.13 0 0 % 0 3,188 0.24 % 5.87 27 0.26 % 6.83 2,083 0.21 % 11.11 1,009 1.39 % 25.41
bin bin b in 9,010 0.24 % 7.94 101 0.02 % 2.35 2,562 0.33 % 8.06 0 0 % 0 5,068 0.39 % 9.34 2 0.02 % 0.51 1,212 0.12 % 6.47 65 0.09 % 1.64
by by b y 8,563 0.23 % 7.55 1,142 0.23 % 26.60 1,719 0.22 % 5.41 0 0 % 0 2,522 0.19 % 4.65 21 0.20 % 5.31 3,090 0.31 % 16.49 69 0.10 % 1.74
an an a n 7,554 0.21 % 6.66 1,400 0.28 % 32.61 1,264 0.16 % 3.98 0 0 % 0 3,059 0.23 % 5.64 11 0.10 % 2.78 1,705 0.17 % 9.10 115 0.16 % 2.90
du du d u 7,024 0.19 % 6.19 1,345 0.27 % 31.33 1,300 0.17 % 4.09 0 0 % 0 3,031 0.23 % 5.58 10 0.10 % 2.53 1,043 0.10 % 5.57 295 0.41 % 7.43
be be b e 6,845 0.19 % 6.03 998 0.20 % 23.24 1,265 0.16 % 3.98 0 0 % 0 2,025 0.15 % 3.73 11 0.10 % 2.78 2,409 0.24 % 12.85 137 0.19 % 3.45
des des d es 6,567 0.18 % 5.79 1,968 0.40 % 45.84 1,067 0.14 % 3.36 0 0 % 0 2,185 0.17 % 4.03 21 0.20 % 5.31 941 0.09 % 5.02 385 0.53 % 9.69
del del d el 6,521 0.18 % 5.75 749 0.15 % 17.45 1,325 0.17 % 4.17 0 0 % 0 2,826 0.21 % 5.21 14 0.13 % 3.54 1,519 0.15 % 8.10 88 0.12 % 2.22
dnevnik.si dnevnik.si d nevnik.si 6,354 0.17 % 5.60 0 0 % 0 278 0.04 % 0.87 0 0 % 0 6,059 0.46 % 11.16 0 0 % 0 17 0 % 0.09 0 0 % 0
with with w ith 5,952 0.16 % 5.25 1,244 0.25 % 28.97 1,145 0.15 % 3.60 0 0 % 0 1,281 0.10 % 2.36 4 0.04 % 1.01 2,175 0.21 % 11.61 103 0.14 % 2.59

File at CLARIN.SI
1.2.242 List of initial character-level 2-grams from residual lemmas in the Gigafida 2.0 corpus with text-type distribution
GF2.0-word_parts-residual-lemmas-initial-2grams-taxonomy-entire.tsv
Columns: Lemma; Lemma (lower-case); Initial part of the word; Rest of the word; Total absolute frequency of lemma; Percentage of all found lemmas; Total relative frequency (per million occurrences); Absolute frequency [SSJ.T.K.S]; Percentage [SSJ.T.K.S]; Relative frequency [SSJ.T.K.S]; Absolute frequency [SSJ.I]; Percentage [SSJ.I]; Relative frequency [SSJ.I]; Absolute frequency [SSJ.T.K.N]; Percentage [SSJ.T.K.N]; Relative frequency [SSJ.T.K.N]; Absolute frequency [SSJ.T.P.C]; Percentage [SSJ.T.P.C]; Relative frequency [SSJ.T.P.C]; Absolute frequency [SSJ.T.D]; Percentage [SSJ.T.D]; Relative frequency [SSJ.T.D]; Absolute frequency [SSJ.T.P.R]; Percentage [SSJ.T.P.R]; Relative frequency [SSJ.T.P.R]; Absolute frequency [SSJ.T.K.L]; Percentage [SSJ.T.K.L]; Relative frequency [SSJ.T.K.L]
de de de 119,443 3.36 % 105.26 12,264 2.55 % 285.64 37,427 5.01 % 117.72 0 0 % 0 47,661 3.69 % 87.82 175 1.76 % 44.24 18,231 1.91 % 97.27 3,685 5.25 % 92.78
of of of 88,222 2.48 % 77.75 19,500 4.06 % 454.18 19,565 2.62 % 61.54 0 0 % 0 23,377 1.81 % 43.07 120 1.21 % 30.33 24,881 2.60 % 132.76 779 1.11 % 19.61
the the th e 75,221 2.12 % 66.29 15,174 3.16 % 353.42 15,392 2.06 % 48.41 0 0 % 0 19,043 1.47 % 35.09 94 0.94 % 23.76 24,539 2.57 % 130.93 979 1.39 % 24.65
The the Th e 66,216 1.86 % 58.36 7,608 1.58 % 177.20 20,382 2.73 % 64.11 0 0 % 0 21,304 1.65 % 39.25 65 0.65 % 16.43 16,340 1.71 % 87.18 517 0.74 % 13.02
and and an d 48,667 1.37 % 42.89 14,524 3.02 % 338.28 9,262 1.24 % 29.13 0 0 % 0 9,755 0.76 % 17.97 36 0.36 % 9.10 14,531 1.52 % 77.53 559 0.80 % 14.08
la la la 47,471 1.33 % 41.84 4,772 0.99 % 111.15 12,499 1.67 % 39.31 0 0 % 0 19,464 1.51 % 35.86 53 0.53 % 13.40 9,233 0.97 % 49.26 1,450 2.06 % 36.51
in in in 22,434 0.63 % 19.77 5,021 1.04 % 116.95 4,201 0.56 % 13.21 0 0 % 0 5,728 0.44 % 10.55 39 0.39 % 9.86 7,168 0.75 % 38.25 277 0.39 % 6.97
di di di 20,807 0.58 % 18.34 1,481 0.31 % 34.49 5,855 0.78 % 18.42 1 33.33 % 103.02 9,811 0.76 % 18.08 28 0.28 % 7.08 3,363 0.35 % 17.94 268 0.38 % 6.75
sta sta st a 17,203 0.48 % 15.16 3 0 % 0.07 1,388 0.19 % 4.37 0 0 % 0 15,790 1.22 % 29.09 0 0 % 0 21 0 % 0.11 1 0 % 0.03
van van va n 16,380 0.46 % 14.44 528 0.11 % 12.30 6,171 0.83 % 19.41 0 0 % 0 8,131 0.63 % 14.98 18 0.18 % 4.55 1,246 0.13 % 6.65 286 0.41 % 7.20
el el el 15,435 0.43 % 13.60 721 0.15 % 16.79 5,592 0.75 % 17.59 0 0 % 0 6,078 0.47 % 11.20 11 0.11 % 2.78 2,615 0.27 % 13.95 418 0.59 % 10.52
to to to 15,224 0.43 % 13.42 3,150 0.66 % 73.37 2,841 0.38 % 8.94 0 0 % 0 3,583 0.28 % 6.60 18 0.18 % 4.55 5,425 0.57 % 28.95 207 0.29 % 5.21
for for fo r 13,275 0.37 % 11.70 2,800 0.58 % 65.22 2,816 0.38 % 8.86 0 0 % 0 3,168 0.24 % 5.84 16 0.16 % 4.04 4,355 0.46 % 23.24 120 0.17 % 3.02
on on on 12,675 0.36 % 11.17 2,313 0.48 % 53.87 2,767 0.37 % 8.70 0 0 % 0 3,727 0.29 % 6.87 8 0.08 % 2.02 3,678 0.39 % 19.62 182 0.26 % 4.58
pre pre pr e 12,183 0.34 % 10.74 379 0.08 % 8.83 2,877 0.39 % 9.05 0 0 % 0 6,154 0.48 % 11.34 34 0.34 % 8.59 2,695 0.28 % 14.38 44 0.06 % 1.11
is is is 11,863 0.33 % 10.45 1,865 0.39 % 43.44 3,514 0.47 % 11.05 0 0 % 0 2,493 0.19 % 4.59 3 0.03 % 0.76 3,726 0.39 % 19.88 262 0.37 % 6.60
von von vo n 11,146 0.31 % 9.82 1,850 0.39 % 43.09 2,532 0.34 % 7.96 0 0 % 0 4,268 0.33 % 7.86 178 1.79 % 45 1,975 0.21 % 10.54 343 0.49 % 8.64
der der de r 10,665 0.30 % 9.40 2,093 0.44 % 48.75 3,064 0.41 % 9.64 0 0 % 0 3,773 0.29 % 6.95 28 0.28 % 7.08 1,459 0.15 % 7.78 248 0.35 % 6.24
et et et 9,624 0.27 % 8.48 3,354 0.70 % 78.12 1,351 0.18 % 4.25 0 0 % 0 2,923 0.23 % 5.39 26 0.26 % 6.57 1,606 0.17 % 8.57 364 0.52 % 9.17
ta ta ta 9,565 0.27 % 8.43 1,178 0.24 % 27.44 2,441 0.33 % 7.68 0 0 % 0 2,643 0.20 % 4.87 15 0.15 % 3.79 2,979 0.31 % 15.89 309 0.44 % 7.78
da da da 9,128 0.26 % 8.04 554 0.12 % 12.90 2,267 0.30 % 7.13 0 0 % 0 3,188 0.25 % 5.87 27 0.27 % 6.83 2,083 0.22 % 11.11 1,009 1.44 % 25.41
bin bin bi n 9,010
0.25 % 7.94 101 0.02 % 2.35 2,562 0.34 % 8.06 0 0 % 0 5,068 0.39 % 9.34 2 0.02 % 0.51 1,212 0.13 % 6.47 65 0.09 % 1.64
by by by 8,563 0.24 % 7.55 1,142 0.24 % 26.60 1,719 0.23 % 5.41 0 0 % 0 2,522 0.20 % 4.65 21 0.21 % 5.31 3,090 0.32 % 16.49 69 0.10 % 1.74
an an an 7,554 0.21 % 6.66 1,400 0.29 % 32.61 1,264 0.17 % 3.98 0 0 % 0 3,059 0.24 % 5.64 11 0.11 % 2.78 1,705 0.18 % 9.10 115 0.16 % 2.90
du du du 7,024 0.20 % 6.19 1,345 0.28 % 31.33 1,300 0.17 % 4.09 0 0 % 0 3,031 0.23 % 5.58 10 0.10 % 2.53 1,043 0.11 % 5.57 295 0.42 % 7.43
be be be 6,845 0.19 % 6.03 998 0.21 % 23.24 1,265 0.17 % 3.98 0 0 % 0 2,025 0.16 % 3.73 11 0.11 % 2.78 2,409 0.25 % 12.85 137 0.20 % 3.45
des des de s 6,567 0.18 % 5.79 1,968 0.41 % 45.84 1,067 0.14 % 3.36 0 0 % 0 2,185 0.17 % 4.03 21 0.21 % 5.31 941 0.10 % 5.02 385 0.55 % 9.69
del del de l 6,521 0.18 % 5.75 749 0.16 % 17.45 1,325 0.18 % 4.17 0 0 % 0 2,826 0.22 % 5.21 14 0.14 % 3.54 1,519 0.16 % 8.10 88 0.12 % 2.22
dnevnik.si dnevnik.si dn evnik.si 6,354 0.18 % 5.60 0 0 % 0 278 0.04 % 0.87 0 0 % 0 6,059 0.47 % 11.16 0 0 % 0 17 0 % 0.09 0 0 % 0
with with wi th 5,952 0.17 % 5.25 1,244 0.26 % 28.97 1,145 0.15 % 3.60 0 0 % 0 1,281 0.10 % 2.36 4 0.04 % 1.01 2,175 0.23 % 11.61 103 0.15 % 2.59
re re re 5,794 0.16 % 5.11 281 0.06 % 6.54 1,286 0.17 % 4.04 0 0 % 0 2,954 0.23 % 5.44 29 0.29 % 7.33 1,102 0.12 % 5.88 142 0.20 % 3.58
at at at 5,740 0.16 % 5.06 847 0.18 % 19.73 1,237 0.17 % 3.89 0 0 % 0 1,685 0.13 % 3.10 68 0.68 % 17.19 1,834 0.19 % 9.79 69 0.10 % 1.74

File at CLARIN.SI
1.2.243 List of initial character-level 3-grams from residual lemmas in the Gigafida 2.0 corpus with text-type distribution
GF2.0-word_parts-residual-lemmas-initial-3grams-taxonomy-entire.tsv
Columns: Lemma; Lemma (lower-case); Initial part of the word; Rest of the word; Total absolute frequency of lemma; Percentage of all found lemmas; Total relative frequency (per million occurrences); Absolute frequency [SSJ.T.K.S]; Percentage [SSJ.T.K.S]; Relative frequency [SSJ.T.K.S]; Absolute frequency [SSJ.I]; Percentage [SSJ.I]; Relative frequency [SSJ.I]; Absolute frequency [SSJ.T.K.N]; Percentage [SSJ.T.K.N]; Relative frequency [SSJ.T.K.N]; Absolute frequency [SSJ.T.P.C]; Percentage [SSJ.T.P.C]; Relative frequency [SSJ.T.P.C]; Absolute frequency [SSJ.T.D]; Percentage [SSJ.T.D]; Relative frequency [SSJ.T.D]; Absolute frequency [SSJ.T.P.R]; Percentage [SSJ.T.P.R]; Relative frequency [SSJ.T.P.R]; Absolute frequency [SSJ.T.K.L]; Percentage [SSJ.T.K.L]; Relative frequency [SSJ.T.K.L]
the the the 75,221 2.72 % 66.29 15,174 3.87 % 353.42 15,392 2.64 % 48.41 0 0 % 0 19,043 2.01 % 35.09 94 1.08 % 23.76 24,539 3.16 % 130.93 979 1.78 % 24.65
The the The 66,216 2.40 % 58.36 7,608 1.94 % 177.20 20,382 3.50 % 64.11 0 0 % 0 21,304 2.25 % 39.25 65 0.75 % 16.43 16,340 2.10 % 87.18 517 0.94 % 13.02
and and and 48,667 1.76 % 42.89 14,524 3.70 % 338.28 9,262 1.59 % 29.13 0 0 % 0 9,755 1.03 % 17.97 36 0.41 % 9.10 14,531 1.87 % 77.53 559 1.02 % 14.08
sta sta sta 17,203 0.62 % 15.16 3 0 % 0.07 1,388 0.24 % 4.37 0 0 % 0 15,790 1.67 % 29.09 0 0 % 0 21 0 % 0.11 1 0 % 0.03
van van van 16,380 0.59 % 14.44 528 0.14 % 12.30 6,171 1.06 % 19.41 0 0 % 0 8,131 0.86 % 14.98 18 0.21 % 4.55 1,246 0.16 % 6.65 286 0.52 % 7.20
for for for 13,275 0.48 % 11.70 2,800 0.71 % 65.22 2,816 0.48 % 8.86 0 0 % 0 3,168 0.33 % 5.84 16 0.18 % 4.04 4,355 0.56 % 23.24 120 0.22 % 3.02
pre pre pre 12,183 0.44 % 10.74 379 0.10 % 8.83 2,877 0.49 % 9.05 0 0 % 0 6,154 0.65 % 11.34 34 0.39 % 8.59 2,695 0.35 % 14.38 44 0.08 % 1.11
von von von 11,146 0.40 % 9.82 1,850 0.47 % 43.09 2,532 0.43 % 7.96 0 0 % 0 4,268 0.45 % 7.86 178 2.05 % 45 1,975 0.25 % 10.54 343 0.62 % 8.64
der der der 10,665 0.39 % 9.40 2,093 0.53 % 48.75 3,064 0.53 % 9.64 0 0 % 0 3,773 0.40 % 6.95 28 0.32 % 7.08 1,459 0.19 % 7.78 248 0.45 % 6.24
bin bin bin 9,010 0.33 % 7.94 101 0.03 % 2.35 2,562 0.44 % 8.06 0 0 % 0 5,068 0.54 % 9.34 2 0.02 % 0.51 1,212 0.16 % 6.47 65 0.12 % 1.64
des des des 6,567 0.24 % 5.79 1,968 0.50 % 45.84 1,067 0.18 % 3.36 0 0 % 0 2,185 0.23 % 4.03 21 0.24 % 5.31 941 0.12 % 5.02 385 0.70 % 9.69
del del del 6,521 0.24 % 5.75 749 0.19 % 17.45 1,325 0.23 % 4.17 0 0 % 0 2,826 0.30 % 5.21 14 0.16 % 3.54 1,519 0.20 % 8.10 88 0.16 % 2.22
dnevnik.si dnevnik.si dne vnik.si 6,354 0.23 % 5.60 0 0 % 0 278 0.05 % 0.87 0 0 % 0 6,059 0.64 % 11.16 0 0 % 0 17 0 % 0.09 0 0 % 0
with with wit h 5,952 0.21 % 5.25 1,244 0.32 % 28.97 1,145 0.20 % 3.60 0 0 % 0 1,281 0.14 % 2.36 4 0.05 % 1.01 2,175 0.28 % 11.61 103 0.19 % 2.59
und und und 5,228 0.19 % 4.61 1,888 0.48 % 43.97 552 0.10 % 1.74 0 0 % 0 1,226 0.13 % 2.26 18 0.21 % 4.55 1,392 0.18 % 7.43 152 0.28 % 3.83
that that tha t 5,042 0.18 % 4.44 1,039 0.27 % 24.20 572 0.10 % 1.80 0 0 % 0 1,159 0.12 % 2.14 5 0.06 % 1.26 2,166 0.28 % 11.56 101 0.18 % 2.54
24ur.com 24ur.com 24u r.com 4,723 0.17 % 4.16 2 0 % 0.05 4,477 0.77 % 14.08 0 0 % 0 158 0.02 % 0.29 0 0 % 0 86 0.01 % 0.46 0 0 % 0
you you you 4,704 0.17 % 4.15 190 0.05 % 4.43 1,140 0.20 % 3.59 0 0 % 0 1,276 0.14 % 2.35 11 0.13 % 2.78 1,714 0.22 % 9.15 373 0.68 % 9.39
from from fro m 4,370 0.16 % 3.85 885 0.23 % 20.61 932 0.16 % 2.93 0 0 % 0 1,010 0.11 % 1.86 2 0.02 % 0.51 1,468 0.19 % 7.83 73 0.13 % 1.84
all all all 3,977 0.14 % 3.50 364 0.09 % 8.48 892 0.15 % 2.81 0 0 % 0 1,403 0.15 % 2.59 3 0.04 % 0.76 1,244 0.16 % 6.64 71 0.13 % 1.79
are are are 3,973 0.14 % 3.50 866 0.22 % 20.17 581 0.10 % 1.83 0 0 % 0 771 0.08 % 1.42 1 0.01 % 0.25 1,659 0.21 % 8.85 95 0.17 % 2.39
siol.net siol.net sio l.net 3,780 0.14 % 3.33 78 0.02 % 1.82 1,852 0.32 % 5.83 0 0 % 0 1,016 0.11 % 1.87 1 0.01 % 0.25 833 0.11 % 4.44 0 0 % 0
per per per 3,776 0.14 % 3.33 406 0.10 % 9.46 319 0.06 % 1 0 0 % 0 2,319 0.24 % 4.27 4 0.05 % 1.01 649 0.08 % 3.46 79 0.14 % 1.99
World world Wor ld 3,772 0.14 % 3.32 434 0.11 % 10.11 1,169 0.20 % 3.68 0 0 % 0 1,222 0.13 % 2.25 8 0.09 % 2.02 884 0.11 % 4.72 55 0.10 % 1.38
Van van Van 3,581 0.13 % 3.16 114 0.03 % 2.66 1,278 0.22 % 4.02 0 0 % 0
1,749 0.18 % 3.22 3 0.04 % 0.76 374 0.05 % 2 63 0.12 % 1.59
press press pre ss 3,361 0.12 % 2.96 92 0.02 % 2.14 104 0.02 % 0.33 0 0 % 0 124 0.01 % 0.23 0 0 % 0 3,037 0.39 % 16.20 4 0.01 % 0.10
stand stand sta nd 3,204 0.12 % 2.82 98 0.03 % 2.28 1,808 0.31 % 5.69 0 0 % 0 780 0.08 % 1.44 0 0 % 0 499 0.06 % 2.66 19 0.04 % 0.48
den den den 3,077 0.11 % 2.71 294 0.07 % 6.85 416 0.07 % 1.31 0 0 % 0 1,799 0.19 % 3.31 5 0.06 % 1.26 516 0.07 % 2.75 47 0.09 % 1.18
was was was 3,024 0.11 % 2.67 438 0.11 % 10.20 406 0.07 % 1.28 0 0 % 0 954 0.10 % 1.76 1 0.01 % 0.25 1,166 0.15 % 6.22 59 0.11 % 1.49
not not not 2,991 0.11 % 2.64 485 0.12 % 11.30 355 0.06 % 1.12 0 0 % 0 820 0.09 % 1.51 5 0.06 % 1.26 1,279 0.17 % 6.82 47 0.09 % 1.18
International international Int ernational 2,897 0.10 % 2.55 414 0.11 % 9.64 809 0.14 % 2.54 0 0 % 0 965 0.10 % 1.78 6 0.07 % 1.52 696 0.09 % 3.71 7 0.01 % 0.18
You you You 2,882 0.10 % 2.54 72 0.02 % 1.68 1,005 0.17 % 3.16 0 0 % 0 831 0.09 % 1.53 1 0.01 % 0.25 904 0.12 % 4.82 69 0.13 % 1.74

1.2.244 List of initial character-level 4-grams from residual lemmas in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-word_parts-residual-lemmas-initial-4grams-taxonomy-entire.tsv
Lemma | Lemma (lower-case) | Initial part of the word | Rest of the word | Total absolute frequency of lemma | Percentage of all found lemmas | Total relative frequency (per million occurrences) | Absolute frequency [SSJ.T.K.S] | Percentage [SSJ.T.K.S] | Relative frequency [SSJ.T.K.S] | Absolute frequency [SSJ.I] | Percentage [SSJ.I] | Relative frequency [SSJ.I] | Absolute frequency [SSJ.T.K.N] | Percentage [SSJ.T.K.N] | Relative frequency [SSJ.T.K.N] | Absolute frequency [SSJ.T.P.C] | Percentage [SSJ.T.P.C] | Relative frequency [SSJ.T.P.C] | Absolute frequency [SSJ.T.D] | Percentage [SSJ.T.D] | Relative frequency [SSJ.T.D] | Absolute frequency [SSJ.T.P.R] | Percentage [SSJ.T.P.R] | Relative frequency [SSJ.T.P.R] | Absolute frequency [SSJ.T.K.L] | Percentage [SSJ.T.K.L] | Relative frequency [SSJ.T.K.L]
dnevnik,si dnevnik,si dnev nik,si 6,354 0.30 % 5.60 0 0 % 0 278 0.06 % 0.87 0 0 % 0 6,059 0.85 % 11.16 0 0 % 0 17 0 % 0.09 0 0 % 0
with with with 5,952 0.28 % 5.25 1,244 0.41 % 28.97 1,145 0.26 % 3.60 0 0 % 0 1,281 0.18 % 2.36 4 0.05 % 1.01 2,175 0.35 % 11.61 103 0.24 % 2.59
that that that 5,042 0.24 % 4.44 1,039 0.34 % 24.20 572 0.13 % 1.80 0 0 % 0 1,159 0.16 % 2.14 5 0.07 % 1.26 2,166 0.35 % 11.56 101 0.23 % 2.54
24ur,com 24ur,com 24ur ,com 4,723 0.22 % 4.16 2 0 % 0.05 4,477 0.99 % 14.08 0 0 % 0 158 0.02 % 0.29 0 0 % 0 86 0.01 % 0.46 0 0 % 0
from from from 4,370 0.20 % 3.85 885 0.29 % 20.61 932 0.21 % 2.93 0 0 % 0 1,010 0.14 % 1.86 2 0.03 % 0.51 1,468 0.24 % 7.83 73 0.17 % 1.84
siol,net siol,net siol ,net 3,780 0.18 % 3.33 78 0.03 % 1.82 1,852 0.41 % 5.83 0 0 % 0 1,016 0.14 % 1.87 1 0.01 % 0.25 833 0.14 % 4.44 0 0 % 0
World world Worl d 3,772 0.18 % 3.32 434 0.14 % 10.11 1,169 0.26 % 3.68 0 0 % 0 1,222 0.17 % 2.25 8 0.11 % 2.02 884 0.14 % 4.72 55 0.13 % 1.38
press press pres s 3,361 0.16 % 2.96 92 0.03 % 2.14 104 0.02 % 0.33 0 0 % 0 124 0.02 % 0.23 0 0 % 0 3,037 0.49 % 16.20 4 0.01 % 0.10
stand stand stan d 3,204 0.15 % 2.82 98 0.03 % 2.28 1,808 0.40 % 5.69 0 0 % 0 780 0.11 % 1.44 0 0 % 0 499 0.08 % 2.66 19 0.04 % 0.48
International international Inte rnational 2,897 0.14 % 2.55 414 0.14 % 9.64 809 0.18 % 2.54 0 0 % 0 965 0.14 % 1.78 6 0.08 % 1.52 696 0.11 % 3.71 7 0.02 % 0.18
European european Euro pean 2,851 0.13 % 2.51 550 0.18 % 12.81 664 0.15 % 2.09 0 0 % 0 870 0.12 % 1.60 3 0.04 % 0.76 763 0.12 % 4.07 1 0 % 0.03
Journal journal Jour nal 2,796 0.13 % 2.46 796 0.26 % 18.54 725 0.16 % 2.28 0 0 % 0 461 0.07 % 0.85 1 0.01 % 0.25 790 0.13 % 4.22 23 0.05 % 0.58
blue blue blue 2,740 0.13 % 2.41 20 0.01 % 0.47 1,397 0.31 % 4.39 0 0 % 0 1,144 0.16 % 2.11 1 0.01 % 0.25 166 0.03 % 0.89 12 0.03 % 0.30
have have have 2,449 0.12 % 2.16 375 0.12 % 8.73 418 0.09 % 1.31 0 0 % 0 592 0.08 % 1.09 4 0.05
% 1.01 976 0.16 % 5.21 84 0.20 % 2.12
facto facto fact o 2,278 0.11 % 2.01 104 0.03 % 2.42 827 0.18 % 2.60 0 0 % 0 918 0.13 % 1.69 15 0.20 % 3.79 398 0.07 % 2.12 16 0.04 % 0.40
this this this 2,261 0.11 % 1.99 403 0.13 % 9.39 407 0.09 % 1.28 0 0 % 0 365 0.05 % 0.67 2 0.03 % 0.51 1,015 0.17 % 5.42 69 0.16 % 1.74
hard hard hard 2,243 0.10 % 1.98 49 0.02 % 1.14 420 0.09 % 1.32 0 0 % 0 776 0.11 % 1.43 2 0.03 % 0.51 972 0.16 % 5.19 24 0.06 % 0.60
ford ford ford 2,196 0.10 % 1.94 2 0 % 0.05 407 0.09 % 1.28 0 0 % 0 1,616 0.23 % 2.98 0 0 % 0 159 0.03 % 0.85 12 0.03 % 0.30
grand grand gran d 2,153 0.10 % 1.90 42 0.01 % 0.98 754 0.17 % 2.37 0 0 % 0 1,090 0.15 % 2.01 2 0.03 % 0.51 253 0.04 % 1.35 12 0.03 % 0.30
which which whic h 2,124 0.10 % 1.87 560 0.18 % 13.04 110 0.02 % 0.35 0 0 % 0 662 0.09 % 1.22 1 0.01 % 0.25 777 0.13 % 4.15 14 0.03 % 0.35
will will will 2,112 0.10 % 1.86 233 0.08 % 5.43 381 0.09 % 1.20 0 0 % 0 878 0.12 % 1.62 1 0.01 % 0.25 575 0.09 % 3.07 44 0.10 % 1.11
della della dell a 2,058 0.10 % 1.81 275 0.09 % 6.41 517 0.12 % 1.63 0 0 % 0 870 0.12 % 1.60 0 0 % 0 361 0.06 % 1.93 35 0.08 % 0.88
Love love Love 2,036 0.10 % 1.79 31 0.01 % 0.72 748 0.17 % 2.35 0 0 % 0 650 0.09 % 1.20 0 0 % 0 584 0.10 % 3.12 23 0.05 % 0.58
University university Univ ersity 2,003 0.09 % 1.77 817 0.27 % 19.03 429 0.10 % 1.35 0 0 % 0 417 0.06 % 0.77 1 0.01 % 0.25 325 0.05 % 1.73 14 0.03 % 0.35
miss miss miss 1,914 0.09 % 1.69 5 0 % 0.12 220 0.05 % 0.69 0 0 % 0 584 0.08 % 1.08 0 0 % 0 1,006 0.16 % 5.37 99 0.23 % 2.49
www,facebook,com www,facebook,com www, facebook,com 1,853 0.09 % 1.63 1 0 % 0.02 1,835 0.41 % 5.77 0 0 % 0 9 0 % 0.02 0 0 % 0 8 0 % 0.04 0 0 % 0
time time time 1,797 0.08 % 1.58 217 0.07 % 5.05 291 0.07 % 0.92 0 0 % 0 488 0.07 % 0.90 4 0.05 % 1.01 765 0.12 % 4.08 32 0.07 % 0.81
twitter,com twitter,com twit ter,com 1,775 0.08 % 1.56 0 0 % 0 1,762 0.39 % 5.54 0 0 % 0 8 0 % 0.01 0 0 % 0 5 0 % 0.03 0 0 % 0
si,mobil si,mobil si,m obil 1,730 0.08 % 1.52 0 0 % 0 337 0.07 % 1.06 0
0 % 0 788 0.11 % 1.45 6 0.08 % 1.52 599 0.10 % 3.20 0 0 % 0
your your your 1,711 0.08 % 1.51 100 0.03 % 2.33 362 0.08 % 1.14 0 0 % 0 413 0.06 % 0.76 0 0 % 0 754 0.12 % 4.02 82 0.19 % 2.06
National national Nati onal 1,702 0.08 % 1.50 292 0.10 % 6.80 417 0.09 % 1.31 0 0 % 0 617 0.09 % 1.14 10 0.13 % 2.53 337 0.06 % 1.80 29 0.07 % 0.73
their their thei r 1,680 0.08 % 1.48 530 0.17 % 12.34 173 0.04 % 0.54 0 0 % 0 247 0.04 % 0.46 3 0.04 % 0.76 724 0.12 % 3.86 3 0.01 % 0.08

1.2.245 List of initial character-level 5-grams from residual lemmas in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-word_parts-residual-lemmas-initial-5grams-taxonomy-entire.tsv
Lemma | Lemma (lower-case) | Initial part of the word | Rest of the word | Total absolute frequency of lemma | Percentage of all found lemmas | Total relative frequency (per million occurrences) | Absolute frequency [SSJ.T.K.S] | Percentage [SSJ.T.K.S] | Relative frequency [SSJ.T.K.S] | Absolute frequency [SSJ.I] | Percentage [SSJ.I] | Relative frequency [SSJ.I] | Absolute frequency [SSJ.T.K.N] | Percentage [SSJ.T.K.N] | Relative frequency [SSJ.T.K.N] | Absolute frequency [SSJ.T.P.C] | Percentage [SSJ.T.P.C] | Relative frequency [SSJ.T.P.C] | Absolute frequency [SSJ.T.D] | Percentage [SSJ.T.D] | Relative frequency [SSJ.T.D] | Absolute frequency [SSJ.T.P.R] | Percentage [SSJ.T.P.R] | Relative frequency [SSJ.T.P.R] | Absolute frequency [SSJ.T.K.L] | Percentage [SSJ.T.K.L] | Relative frequency [SSJ.T.K.L]
dnevnik,si dnevnik,si dnevn ik,si 6,354 0.36 % 5.60 0 0 % 0 278 0.07 % 0.87 0 0 % 0 6,059 1.04 % 11.16 0 0 % 0 17 0 % 0.09 0 0 % 0
24ur,com 24ur,com 24ur, com 4,723 0.27 % 4.16 2 0 % 0.05 4,477 1.22 % 14.08 0 0 % 0 158 0.03 % 0.29 0 0 % 0 86 0.02 % 0.46 0 0 % 0
siol,net siol,net siol, net 3,780 0.21 % 3.33 78 0.03 % 1.82 1,852 0.50 % 5.83 0 0 % 0 1,016 0.17 % 1.87 1 0.01 % 0.25 833 0.17 % 4.44 0 0 % 0
World world World 3,772 0.21 % 3.32 434 0.16 % 10.11 1,169 0.32 % 3.68 0 0 % 0 1,222 0.21 % 2.25 8 0.12 % 2.02 884 0.18 % 4.72 55 0.16 % 1.38
press press press 3,361 0.19 % 2.96 92 0.04 % 2.14 104 0.03 % 0.33 0 0 % 0 124 0.02 % 0.23 0 0 % 0 3,037 0.60 % 16.20 4 0.01 % 0.10
stand stand stand 3,204 0.18 % 2.82 98 0.04 % 2.28 1,808 0.49 % 5.69 0 0 % 0 780 0.13 % 1.44 0 0 % 0 499 0.10 % 2.66 19 0.06 % 0.48
International international Inter national 2,897 0.17 % 2.55 414 0.16 % 9.64 809 0.22 % 2.54 0 0 % 0 965 0.17 % 1.78 6 0.09 % 1.52 696 0.14 % 3.71 7 0.02 % 0.18
European european Europ ean 2,851 0.16 % 2.51 550 0.21 % 12.81 664 0.18 % 2.09 0 0 % 0 870 0.15 % 1.60 3 0.04 % 0.76 763 0.15 % 4.07 1 0 % 0.03
Journal journal Journ al 2,796 0.16 % 2.46 796 0.30 % 18.54 725 0.20 % 2.28 0 0 % 0 461 0.08 % 0.85 1 0.01 % 0.25 790 0.16 % 4.22 23 0.07 % 0.58
facto facto facto 2,278 0.13 % 2.01 104 0.04 % 2.42 827 0.22 % 2.60 0 0 % 0 918 0.16 % 1.69 15 0.22 % 3.79 398 0.08 % 2.12 16 0.05 % 0.40
grand grand grand 2,153 0.12 % 1.90 42 0.02 % 0.98 754 0.20 % 2.37 0 0 % 0 1,090 0.19 % 2.01 2 0.03 % 0.51 253 0.05 % 1.35 12 0.04 % 0.30
which which which 2,124 0.12 % 1.87 560 0.21 % 13.04 110 0.03 % 0.35 0 0 % 0 662 0.11 % 1.22 1 0.01 % 0.25 777 0.15 % 4.15 14 0.04 % 0.35
della della della 2,058 0.12 % 1.81 275 0.10 % 6.41 517 0.14 % 1.63 0 0 % 0 870 0.15 % 1.60 0 0 % 0 361 0.07 % 1.93 35 0.10 % 0.88
University university Unive rsity 2,003 0.11 % 1.77 817 0.31 % 19.03 429 0.12 % 1.35 0 0 % 0 417 0.07 % 0.77 1 0.01 % 0.25 325 0.07 % 1.73 14 0.04 % 0.35
www,facebook,com www,facebook,com www,f acebook,com 1,853 0.10 % 1.63 1 0 % 0.02 1,835 0.50 % 5.77 0 0 % 0 9 0 % 0.02 0 0 % 0 8 0 % 0.04 0 0 % 0
twitter,com twitter,com twitt er,com 1,775 0.10 % 1.56 0 0 % 0 1,762 0.48 % 5.54 0 0 % 0 8 0 % 0.01 0 0 % 0 5 0 % 0.03 0 0 % 0
si,mobil si,mobil si,mo bil 1,730 0.10 % 1.52 0 0 % 0 337 0.09 % 1.06 0 0 % 0 788 0.14 % 1.45 6 0.09 % 1.52 599 0.12 % 3.20 0 0 % 0
National national Natio nal 1,702 0.10 % 1.50 292 0.11 % 6.80 417 0.11 % 1.31 0 0 % 0 617 0.11 % 1.14 10
0.15 % 2.53 337 0.07 % 1.80 29 0.09 % 0.73
their their their 1,680 0.10 % 1.48 530 0.20 % 12.34 173 0.05 % 0.54 0 0 % 0 247 0.04 % 0.46 3 0.04 % 0.76 724 0.14 % 3.86 3 0.01 % 0.08
alias alias alias 1,676 0.10 % 1.48 25 0.01 % 0.58 236 0.06 % 0.74 0 0 % 0 650 0.11 % 1.20 2 0.03 % 0.51 678 0.14 % 3.62 85 0.25 % 2.14
sport sport sport 1,471 0.08 % 1.30 60 0.02 % 1.40 317 0.09 % 1 0 0 % 0 646 0.11 % 1.19 0 0 % 0 448 0.09 % 2.39 0 0 % 0
desk@sta,si desk@sta,si desk@ sta,si 1,435 0.08 % 1.26 0 0 % 0 1,435 0.39 % 4.51 0 0 % 0 0 0 % 0 0 0 % 0 0 0 % 0 0 0 % 0
www,promet,si www,promet,si www,p romet,si 1,393 0.08 % 1.23 0 0 % 0 1,358 0.37 % 4.27 0 0 % 0 28 0.01 % 0.05 0 0 % 0 7 0 % 0.04 0 0 % 0
dello dello dello 1,377 0.08 % 1.21 14 0.01 % 0.33 623 0.17 % 1.96 0 0 % 0 687 0.12 % 1.27 0 0 % 0 48 0.01 % 0.26 5 0.01 % 0.13
American american Ameri can 1,352 0.08 % 1.19 304 0.12 % 7.08 301 0.08 % 0.95 0 0 % 0 365 0.06 % 0.67 1 0.01 % 0.25 371 0.07 % 1.98 10 0.03 % 0.25
www,mtv,si www,mtv,si www,m tv,si 1,337 0.08 % 1.18 0 0 % 0 1,337 0.36 % 4.21 0 0 % 0 0 0 % 0 0 0 % 0 0 0 % 0 0 0 % 0
world world world 1,309 0.07 % 1.15 103 0.04 % 2.40 327 0.09 % 1.03 0 0 % 0 449 0.08 % 0.83 1 0.01 % 0.25 404 0.08 % 2.16 25 0.07 % 0.63
www,mojedelo,com www,mojedelo,com www,m ojedelo,com 1,278 0.07 % 1.13 0 0 % 0 1 0 % 0 0 0 % 0 1,209 0.21 % 2.23 0 0 % 0 68 0.01 % 0.36 0 0 % 0
pripomba@sta,si pripomba@sta,si pripo mba@sta,si 1,262 0.07 % 1.11 0 0 % 0 1,262 0.34 % 3.97 0 0 % 0 0 0 % 0 0 0 % 0 0 0 % 0 0 0 % 0
Social social Socia l 1,240 0.07 % 1.09 736 0.28 % 17.14 214 0.06 % 0.67 0 0 % 0 96 0.02 % 0.18 0 0 % 0 192 0.04 % 1.02 2 0.01 % 0.05
people people peopl e 1,217 0.07 % 1.07 152 0.06 % 3.54 421 0.11 % 1.32 0 0 % 0 198 0.03 % 0.36 0 0 % 0 430 0.09 % 2.29 16 0.05 % 0.40
power power power 1,206 0.07 % 1.06 39 0.01 % 0.91 377 0.10 % 1.19 0 0 % 0 495 0.09 % 0.91 0 0 % 0 283 0.06 % 1.51 12 0.04 % 0.30

1.2.246 List of final character-level 1-grams from residual lemmas in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-word_parts-residual-lemmas-final-1grams-taxonomy-entire.tsv
Lemma | Lemma (lower-case) | Rest of the word | Final part of the word | Total absolute frequency of lemma | Percentage of all found lemmas | Total relative frequency (per million occurrences) | Absolute frequency [SSJ.T.K.S] | Percentage [SSJ.T.K.S] | Relative frequency [SSJ.T.K.S] | Absolute frequency [SSJ.I] | Percentage [SSJ.I] | Relative frequency [SSJ.I] | Absolute frequency [SSJ.T.K.N] | Percentage [SSJ.T.K.N] | Relative frequency [SSJ.T.K.N] | Absolute frequency [SSJ.T.P.C] | Percentage [SSJ.T.P.C] | Relative frequency [SSJ.T.P.C] | Absolute frequency [SSJ.T.D] | Percentage [SSJ.T.D] | Relative frequency [SSJ.T.D] | Absolute frequency [SSJ.T.P.R] | Percentage [SSJ.T.P.R] | Relative frequency [SSJ.T.P.R] | Absolute frequency [SSJ.T.K.L] | Percentage [SSJ.T.K.L] | Relative frequency [SSJ.T.K.L]
de de d e 119,443 3.25 % 105.26 12,264 2.48 % 285.64 37,427 4.85 % 117.72 0 0 % 0 47,661 3.62 % 87.82 175 1.66 % 44.24 18,231 1.80 % 97.27 3,685 5.07 % 92.78
of of o f 88,222 2.40 % 77.75 19,500 3.94 % 454.18 19,565 2.54 % 61.54 0 0 % 0 23,377 1.78 % 43.07 120 1.14 % 30.33 24,881 2.46 % 132.76 779 1.07 % 19.61
the the th e 75,221 2.05 % 66.29 15,174 3.07 % 353.42 15,392 2.00 % 48.41 0 0 % 0 19,043 1.45 % 35.09 94 0.89 % 23.76 24,539 2.43 % 130.93 979 1.35 % 24.65
The the Th e 66,216 1.80 % 58.36 7,608 1.54 % 177.20 20,382 2.64 % 64.11 0 0 % 0 21,304 1.62 % 39.25 65 0.62 % 16.43 16,340 1.62 % 87.18 517 0.71 % 13.02
and and an d 48,667 1.32 % 42.89 14,524 2.94 % 338.28 9,262 1.20 % 29.13 0 0 % 0 9,755 0.74 % 17.97 36 0.34 % 9.10 14,531 1.44 % 77.53 559 0.77 % 14.08
la la l a 47,471 1.29 % 41.84 4,772 0.96 % 111.15 12,499 1.62 % 39.31 0 0 % 0 19,464 1.48 % 35.86 53 0.50 % 13.40 9,233 0.91 % 49.26 1,450 2.00 % 36.51
i i i 46,624 1.27 % 41.09 3,283 0.66 % 76.46 12,783 1.66 % 40.21 0 0 % 0 9,281 0.71 % 17.10 327 3.10 % 82.66 20,111
1.99 % 107.31 839 1.16 % 21.13
a a a 32,433 0.88 % 28.58 5,651 1.14 % 131.62 5,850 0.76 % 18.40 0 0 % 0 8,682 0.66 % 16 220 2.09 % 55.61 11,459 1.14 % 61.14 571 0.79 % 14.38
in in i n 22,434 0.61 % 19.77 5,021 1.01 % 116.95 4,201 0.55 % 13.21 0 0 % 0 5,728 0.43 % 10.55 39 0.37 % 9.86 7,168 0.71 % 38.25 277 0.38 % 6.97
di di d i 20,807 0.57 % 18.34 1,481 0.30 % 34.49 5,855 0.76 % 18.42 1 33.33 % 103.02 9,811 0.75 % 18.08 28 0.27 % 7.08 3,363 0.33 % 17.94 268 0.37 % 6.75
sta sta st a 17,203 0.47 % 15.16 3 0 % 0.07 1,388 0.18 % 4.37 0 0 % 0 15,790 1.20 % 29.09 0 0 % 0 21 0 % 0.11 1 0 % 0.03
van van va n 16,380 0.45 % 14.44 528 0.11 % 12.30 6,171 0.80 % 19.41 0 0 % 0 8,131 0.62 % 14.98 18 0.17 % 4.55 1,246 0.12 % 6.65 286 0.39 % 7.20
el el e l 15,435 0.42 % 13.60 721 0.15 % 16.79 5,592 0.72 % 17.59 0 0 % 0 6,078 0.46 % 11.20 11 0.10 % 2.78 2,615 0.26 % 13.95 418 0.57 % 10.52
to to t o 15,224 0.41 % 13.42 3,150 0.64 % 73.37 2,841 0.37 % 8.94 0 0 % 0 3,583 0.27 % 6.60 18 0.17 % 4.55 5,425 0.54 % 28.95 207 0.28 % 5.21
for for fo r 13,275 0.36 % 11.70 2,800 0.57 % 65.22 2,816 0.36 % 8.86 0 0 % 0 3,168 0.24 % 5.84 16 0.15 % 4.04 4,355 0.43 % 23.24 120 0.17 % 3.02
on on o n 12,675 0.34 % 11.17 2,313 0.47 % 53.87 2,767 0.36 % 8.70 0 0 % 0 3,727 0.28 % 6.87 8 0.08 % 2.02 3,678 0.36 % 19.62 182 0.25 % 4.58
pre pre pr e 12,183 0.33 % 10.74 379 0.08 % 8.83 2,877 0.37 % 9.05 0 0 % 0 6,154 0.47 % 11.34 34 0.32 % 8.59 2,695 0.27 % 14.38 44 0.06 % 1.11
is is i s 11,863 0.32 % 10.45 1,865 0.38 % 43.44 3,514 0.46 % 11.05 0 0 % 0 2,493 0.19 % 4.59 3 0.03 % 0.76 3,726 0.37 % 19.88 262 0.36 % 6.60
von von vo n 11,146 0.30 % 9.82 1,850 0.37 % 43.09 2,532 0.33 % 7.96 0 0 % 0 4,268 0.32 % 7.86 178 1.69 % 45 1,975 0.20 % 10.54 343 0.47 % 8.64
der der de r 10,665 0.29 % 9.40 2,093 0.42 % 48.75 3,064 0.40 % 9.64 0 0 % 0 3,773 0.29 % 6.95 28 0.27 % 7.08 1,459 0.14 % 7.78 248 0.34 % 6.24
et et e t 9,624 0.26 % 8.48 3,354 0.68 % 78.12 1,351 0.17 % 4.25 0 0 % 0 2,923 0.22 % 5.39 26 0.25 % 6.57
1,606 0.16 % 8.57 364 0.50 % 9.17
ta ta t a 9,565 0.26 % 8.43 1,178 0.24 % 27.44 2,441 0.32 % 7.68 0 0 % 0 2,643 0.20 % 4.87 15 0.14 % 3.79 2,979 0.29 % 15.89 309 0.42 % 7.78
da da d a 9,128 0.25 % 8.04 554 0.11 % 12.90 2,267 0.29 % 7.13 0 0 % 0 3,188 0.24 % 5.87 27 0.26 % 6.83 2,083 0.21 % 11.11 1,009 1.39 % 25.41
bin bin bi n 9,010 0.24 % 7.94 101 0.02 % 2.35 2,562 0.33 % 8.06 0 0 % 0 5,068 0.39 % 9.34 2 0.02 % 0.51 1,212 0.12 % 6.47 65 0.09 % 1.64
by by b y 8,563 0.23 % 7.55 1,142 0.23 % 26.60 1,719 0.22 % 5.41 0 0 % 0 2,522 0.19 % 4.65 21 0.20 % 5.31 3,090 0.31 % 16.49 69 0.10 % 1.74
an an a n 7,554 0.21 % 6.66 1,400 0.28 % 32.61 1,264 0.16 % 3.98 0 0 % 0 3,059 0.23 % 5.64 11 0.10 % 2.78 1,705 0.17 % 9.10 115 0.16 % 2.90
du du d u 7,024 0.19 % 6.19 1,345 0.27 % 31.33 1,300 0.17 % 4.09 0 0 % 0 3,031 0.23 % 5.58 10 0.10 % 2.53 1,043 0.10 % 5.57 295 0.41 % 7.43
be be b e 6,845 0.19 % 6.03 998 0.20 % 23.24 1,265 0.16 % 3.98 0 0 % 0 2,025 0.15 % 3.73 11 0.10 % 2.78 2,409 0.24 % 12.85 137 0.19 % 3.45
des des de s 6,567 0.18 % 5.79 1,968 0.40 % 45.84 1,067 0.14 % 3.36 0 0 % 0 2,185 0.17 % 4.03 21 0.20 % 5.31 941 0.09 % 5.02 385 0.53 % 9.69
del del de l 6,521 0.18 % 5.75 749 0.15 % 17.45 1,325 0.17 % 4.17 0 0 % 0 2,826 0.21 % 5.21 14 0.13 % 3.54 1,519 0.15 % 8.10 88 0.12 % 2.22
dnevnik,si dnevnik,si dnevnik,s i 6,354 0.17 % 5.60 0 0 % 0 278 0.04 % 0.87 0 0 % 0 6,059 0.46 % 11.16 0 0 % 0 17 0 % 0.09 0 0 % 0
with with wit h 5,952 0.16 % 5.25 1,244 0.25 % 28.97 1,145 0.15 % 3.60 0 0 % 0 1,281 0.10 % 2.36 4 0.04 % 1.01 2,175 0.21 % 11.61 103 0.14 % 2.59

1.2.247 List of final character-level 2-grams from residual lemmas in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-word_parts-residual-lemmas-final-2grams-taxonomy-entire.tsv
Lemma | Lemma (lower-case) | Rest of the word | Final part of the word | Total absolute frequency of lemma | Percentage of all found lemmas | Total relative frequency (per million occurrences) | Absolute frequency [SSJ.T.D] | Percentage [SSJ.T.D] | Relative frequency [SSJ.T.D] | Absolute frequency [SSJ.T.P.C] | Percentage [SSJ.T.P.C] | Relative frequency [SSJ.T.P.C] | Absolute frequency [SSJ.I] | Percentage [SSJ.I] | Relative frequency [SSJ.I] | Absolute frequency [SSJ.T.K.S] | Percentage [SSJ.T.K.S] | Relative frequency [SSJ.T.K.S] | Absolute frequency [SSJ.T.K.N] | Percentage [SSJ.T.K.N] | Relative frequency [SSJ.T.K.N] | Absolute frequency [SSJ.T.P.R] | Percentage [SSJ.T.P.R] | Relative frequency [SSJ.T.P.R] | Absolute frequency [SSJ.T.K.L] | Percentage [SSJ.T.K.L] | Relative frequency [SSJ.T.K.L]
de de de 119,443 3.36 % 105.26 175 1.76 % 44.24 47,661 3.69 % 87.82 37,427 5.01 % 117.72 12,264 2.55 % 285.64 0 0 % 0 18,231 1.91 % 97.27 3,685 5.25 % 92.78
of of of 88,222 2.48 % 77.75 120 1.21 % 30.33 23,377 1.81 % 43.07 19,565 2.62 % 61.54 19,500 4.06 % 454.18 0 0 % 0 24,881 2.60 % 132.76 779 1.11 % 19.61
the the t he 75,221 2.12 % 66.29 94 0.94 % 23.76 19,043 1.47 % 35.09 15,392 2.06 % 48.41 15,174 3.16 % 353.42 0 0 % 0 24,539 2.57 % 130.93 979 1.39 % 24.65
The the T he 66,216 1.86 % 58.36 65 0.65 % 16.43 21,304 1.65 % 39.25 20,382 2.73 % 64.11 7,608 1.58 % 177.20 0 0 % 0 16,340 1.71 % 87.18 517 0.74 % 13.02
and and a nd 48,667 1.37 % 42.89 36 0.36 % 9.10 9,755 0.76 % 17.97 9,262 1.24 % 29.13 14,524 3.02 % 338.28 0 0 % 0 14,531 1.52 % 77.53 559 0.80 % 14.08
la la la 47,471 1.33 % 41.84 53 0.53 % 13.40 19,464 1.51 % 35.86 12,499 1.67 % 39.31 4,772 0.99 % 111.15 0 0 % 0 9,233 0.97 % 49.26 1,450 2.06 % 36.51
in in in 22,434 0.63 % 19.77 39 0.39 % 9.86 5,728 0.44 % 10.55 4,201 0.56 % 13.21 5,021 1.04 % 116.95 0 0 % 0 7,168 0.75 % 38.25 277 0.39 % 6.97
di di di 20,807 0.58 % 18.34 28 0.28 % 7.08 9,811 0.76 % 18.08 5,855 0.78 % 18.42 1,481 0.31 % 34.49 1 33.33 % 103.02 3,363 0.35 % 17.94 268 0.38 % 6.75
sta sta s ta 17,203 0.48 % 15.16 0 0 % 0 15,790 1.22 % 29.09 1,388 0.19 % 4.37 3 0 % 0.07 0 0 % 0 21 0 % 0.11 1 0 % 0.03
van van v an 16,380 0.46 % 14.44 18
0.18 % 4.55 8,131 0.63 % 14.98 6,171 0.83 % 19.41 528 0.11 % 12.30 0 0 % 0 1,246 0.13 % 6.65 286 0.41 % 7.20 el el el 15,435 0.43 % 13.60 11 0.11 % 2.78 6,078 0.47 % 11.20 5,592 0.75 % 17.59 721 0.15 % 16.79 0 0 % 0 2,615 0.27 % 13.95 418 0.59 % 10.52 to to to 15,224 0.43 % 13.42 18 0.18 % 4.55 3,583 0.28 % 6.60 2,841 0.38 % 8.94 3,150 0.66 % 73.37 0 0 % 0 5,425 0.57 % 28.95 207 0.29 % 5.21 for for f or 13,275 0.37 % 11.70 16 0.16 % 4.04 3,168 0.24 % 5.84 2,816 0.38 % 8.86 2,800 0.58 % 65.22 0 0 % 0 4,355 0.46 % 23.24 120 0.17 % 3.02 on on on 12,675 0.36 % 11.17 8 0.08 % 2.02 3,727 0.29 % 6.87 2,767 0.37 % 8.70 2,313 0.48 % 53.87 0 0 % 0 3,678 0.39 % 19.62 182 0.26 % 4.58 pre pre p re 12,183 0.34 % 10.74 34 0.34 % 8.59 6,154 0.48 % 11.34 2,877 0.39 % 9.05 379 0.08 % 8.83 0 0 % 0 2,695 0.28 % 14.38 44 0.06 % 1.11 is is is 11,863 0.33 % 10.45 3 0.03 % 0.76 2,493 0.19 % 4.59 3,514 0.47 % 11.05 1,865 0.39 % 43.44 0 0 % 0 3,726 0.39 % 19.88 262 0.37 % 6.60 von von v on 11,146 0.31 % 9.82 178 1.79 % 45 4,268 0.33 % 7.86 2,532 0.34 % 7.96 1,850 0.39 % 43.09 0 0 % 0 1,975 0.21 % 10.54 343 0.49 % 8.64 der der d er 10,665 0.30 % 9.40 28 0.28 % 7.08 3,773 0.29 % 6.95 3,064 0.41 % 9.64 2,093 0.44 % 48.75 0 0 % 0 1,459 0.15 % 7.78 248 0.35 % 6.24 et et et 9,624 0.27 % 8.48 26 0.26 % 6.57 2,923 0.23 % 5.39 1,351 0.18 % 4.25 3,354 0.70 % 78.12 0 0 % 0 1,606 0.17 % 8.57 364 0.52 % 9.17 ta ta ta 9,565 0.27 % 8.43 15 0.15 % 3.79 2,643 0.20 % 4.87 2,441 0.33 % 7.68 1,178 0.24 % 27.44 0 0 % 0 2,979 0.31 % 15.89 309 0.44 % 7.78 da da da 9,128 0.26 % 8.04 27 0.27 % 6.83 3,188 0.25 % 5.87 2,267 0.30 % 7.13 554 0.12 % 12.90 0 0 % 0 2,083 0.22 % 11.11 1,009 1.44 % 25.41 bin bin b in 9,010 0.25 % 7.94 2 0.02 % 0.51 5,068 0.39 % 9.34 2,562 0.34 % 8.06 101 0.02 % 2.35 0 0 % 0 1,212 0.13 % 6.47 65 0.09 % 1.64 by by by 8,563 0.24 % 7.55 21 0.21 % 5.31 2,522 0.20 % 4.65 1,719 0.23 % 5.41 1,142 0.24 % 26.60 0 0 % 0 3,090 0.32 % 16.49 69 0.10 % 1.74 an an an 7,554 0.21 % 6.66 11 0.11 % 2.78 3,059 
0.24 % 5.64 1,264 0.17 % 3.98 1,400 0.29 % 32.61 0 0 % 0 1,705 0.18 % 9.10 115 0.16 % 2.90 du du du 7,024 0.20 % 6.19 10 0.10 % 2.53 3,031 0.23 % 5.58 1,300 0.17 % 4.09 1,345 0.28 % 31.33 0 0 % 0 1,043 0.11 % 5.57 295 0.42 % 7.43 be be be 6,845 0.19 % 6.03 11 0.11 % 2.78 2,025 0.16 % 3.73 1,265 0.17 % 3.98 998 0.21 % 23.24 0 0 % 0 2,409 0.25 % 12.85 137 0.20 % 3.45 des des d es 6,567 0.18 % 5.79 21 0.21 % 5.31 2,185 0.17 % 4.03 1,067 0.14 % 3.36 1,968 0.41 % 45.84 0 0 % 0 941 0.10 % 5.02 385 0.55 % 9.69 del del d el 6,521 0.18 % 5.75 14 0.14 % 3.54 2,826 0.22 % 5.21 1,325 0.18 % 4.17 749 0.16 % 17.45 0 0 % 0 1,519 0.16 % 8.10 88 0.12 % 2.22 dnevnik,si dnevnik,si dnevnik, si 6,354 0.18 % 5.60 0 0 % 0 6,059 0.47 % 11.16 278 0.04 % 0.87 0 0 % 0 0 0 % 0 17 0 % 0.09 0 0 % 0 with with wi th 5,952 0.17 % 5.25 4 0.04 % 1.01 1,281 0.10 % 2.36 1,145 0.15 % 3.60 1,244 0.26 % 28.97 0 0 % 0 2,175 0.23 % 11.61 103 0.15 % 2.59 re re re 5,794 0.16 % 5.11 29 0.29 % 7.33 2,954 0.23 % 5.44 1,286 0.17 % 4.04 281 0.06 % 6.54 0 0 % 0 1,102 0.12 % 5.88 142 0.20 % 3.58 at at at 5,740 0.16 % 5.06 68 0.68 % 17.19 1,685 0.13 % 3.10 1,237 0.17 % 3.89 847 0.18 % 19.73 0 0 % 0 1,834 0.19 % 9.79 69 0.10 % 1.74 CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 264 File at CLARIN.SI 1.2.248 List of final character-level 3-grams from residual lemmas in the Gigafida 2.0 corpus with text-type distributionGF2.0-word_parts-residual-lemmas-final- 3grams-taxonomy-entire.tsvLemma Lemma (lower-case) Rest of the word Final part of the word Total absolute frequency of lemma Percentage of all found lemmas Total relative frequency (per million occurrences) Absolute frequency [SSJ.T.D] Percentage [SSJ.T.D] Relative frequency [SSJ.T.D] Absolute frequency [SSJ.T.P.C] Percentage [SSJ.T.P.C] Relative frequency [SSJ.T.P.C] Absolute frequency [SSJ.I] Percentage [SSJ.I] Relative frequency [SSJ.I] Absolute frequency [SSJ.T.K.S] Percentage [SSJ.T.K.S] Relative frequency [SSJ.T.K.S] Absolute 
frequency [SSJ.T.K.N] Percentage [SSJ.T.K.N] Relative frequency [SSJ.T.K.N] Absolute frequency [SSJ.T.P.R] Percentage [SSJ.T.P.R] Relative frequency [SSJ.T.P.R] Absolute frequency [SSJ.T.K.L] Percentage [SSJ.T.K.L] Relative frequency [SSJ.T.K.L] the the the 75,221 2.72 % 66.29 94 1.08 % 23.76 19,043 2.01 % 35.09 15,392 2.64 % 48.41 15,174 3.87 % 353.42 0 0 % 0 24,539 3.16 % 130.93 979 1.78 % 24.65 The the The 66,216 2.40 % 58.36 65 0.75 % 16.43 21,304 2.25 % 39.25 20,382 3.50 % 64.11 7,608 1.94 % 177.20 0 0 % 0 16,340 2.10 % 87.18 517 0.94 % 13.02 and and and 48,667 1.76 % 42.89 36 0.41 % 9.10 9,755 1.03 % 17.97 9,262 1.59 % 29.13 14,524 3.70 % 338.28 0 0 % 0 14,531 1.87 % 77.53 559 1.02 % 14.08 sta sta sta 17,203 0.62 % 15.16 0 0 % 0 15,790 1.67 % 29.09 1,388 0.24 % 4.37 3 0 % 0.07 0 0 % 0 21 0 % 0.11 1 0 % 0.03 van van van 16,380 0.59 % 14.44 18 0.21 % 4.55 8,131 0.86 % 14.98 6,171 1.06 % 19.41 528 0.14 % 12.30 0 0 % 0 1,246 0.16 % 6.65 286 0.52 % 7.20 for for for 13,275 0.48 % 11.70 16 0.18 % 4.04 3,168 0.33 % 5.84 2,816 0.48 % 8.86 2,800 0.71 % 65.22 0 0 % 0 4,355 0.56 % 23.24 120 0.22 % 3.02 pre pre pre 12,183 0.44 % 10.74 34 0.39 % 8.59 6,154 0.65 % 11.34 2,877 0.49 % 9.05 379 0.10 % 8.83 0 0 % 0 2,695 0.35 % 14.38 44 0.08 % 1.11 von von von 11,146 0.40 % 9.82 178 2.05 % 45 4,268 0.45 % 7.86 2,532 0.43 % 7.96 1,850 0.47 % 43.09 0 0 % 0 1,975 0.25 % 10.54 343 0.62 % 8.64 der der der 10,665 0.39 % 9.40 28 0.32 % 7.08 3,773 0.40 % 6.95 3,064 0.53 % 9.64 2,093 0.53 % 48.75 0 0 % 0 1,459 0.19 % 7.78 248 0.45 % 6.24 bin bin bin 9,010 0.33 % 7.94 2 0.02 % 0.51 5,068 0.54 % 9.34 2,562 0.44 % 8.06 101 0.03 % 2.35 0 0 % 0 1,212 0.16 % 6.47 65 0.12 % 1.64 des des des 6,567 0.24 % 5.79 21 0.24 % 5.31 2,185 0.23 % 4.03 1,067 0.18 % 3.36 1,968 0.50 % 45.84 0 0 % 0 941 0.12 % 5.02 385 0.70 % 9.69 del del del 6,521 0.24 % 5.75 14 0.16 % 3.54 2,826 0.30 % 5.21 1,325 0.23 % 4.17 749 0.19 % 17.45 0 0 % 0 1,519 0.20 % 8.10 88 0.16 % 2.22 dnevnik,si dnevnik,si dnevnik ,si 6,354 
0.23 % 5.60 0 0 % 0 6,059 0.64 % 11.16 278 0.05 % 0.87 0 0 % 0 0 0 % 0 17 0 % 0.09 0 0 % 0 with with w ith 5,952 0.21 % 5.25 4 0.05 % 1.01 1,281 0.14 % 2.36 1,145 0.20 % 3.60 1,244 0.32 % 28.97 0 0 % 0 2,175 0.28 % 11.61 103 0.19 % 2.59 und und und 5,228 0.19 % 4.61 18 0.21 % 4.55 1,226 0.13 % 2.26 552 0.10 % 1.74 1,888 0.48 % 43.97 0 0 % 0 1,392 0.18 % 7.43 152 0.28 % 3.83 that that t hat 5,042 0.18 % 4.44 5 0.06 % 1.26 1,159 0.12 % 2.14 572 0.10 % 1.80 1,039 0.27 % 24.20 0 0 % 0 2,166 0.28 % 11.56 101 0.18 % 2.54 24ur,com 24ur,com 24ur, com 4,723 0.17 % 4.16 0 0 % 0 158 0.02 % 0.29 4,477 0.77 % 14.08 2 0 % 0.05 0 0 % 0 86 0.01 % 0.46 0 0 % 0 you you you 4,704 0.17 % 4.15 11 0.13 % 2.78 1,276 0.14 % 2.35 1,140 0.20 % 3.59 190 0.05 % 4.43 0 0 % 0 1,714 0.22 % 9.15 373 0.68 % 9.39 from from f rom 4,370 0.16 % 3.85 2 0.02 % 0.51 1,010 0.11 % 1.86 932 0.16 % 2.93 885 0.23 % 20.61 0 0 % 0 1,468 0.19 % 7.83 73 0.13 % 1.84 all all all 3,977 0.14 % 3.50 3 0.04 % 0.76 1,403 0.15 % 2.59 892 0.15 % 2.81 364 0.09 % 8.48 0 0 % 0 1,244 0.16 % 6.64 71 0.13 % 1.79 are are are 3,973 0.14 % 3.50 1 0.01 % 0.25 771 0.08 % 1.42 581 0.10 % 1.83 866 0.22 % 20.17 0 0 % 0 1,659 0.21 % 8.85 95 0.17 % 2.39 siol,net siol,net siol, net 3,780 0.14 % 3.33 1 0.01 % 0.25 1,016 0.11 % 1.87 1,852 0.32 % 5.83 78 0.02 % 1.82 0 0 % 0 833 0.11 % 4.44 0 0 % 0 per per per 3,776 0.14 % 3.33 4 0.05 % 1.01 2,319 0.24 % 4.27 319 0.06 % 1 406 0.10 % 9.46 0 0 % 0 649 0.08 % 3.46 79 0.14 % 1.99 World world Wo rld 3,772 0.14 % 3.32 8 0.09 % 2.02 1,222 0.13 % 2.25 1,169 0.20 % 3.68 434 0.11 % 10.11 0 0 % 0 884 0.11 % 4.72 55 0.10 % 1.38 Van van Van 3,581 0.13 % 3.16 3 0.04 % 0.76 1,749 0.18 % 3.22 1,278 0.22 % 4.02 114 0.03 % 2.66 0 0 % 0 374 0.05 % 2 63 0.12 % 1.59 press press pr ess 3,361 0.12 % 2.96 0 0 % 0 124 0.01 % 0.23 104 0.02 % 0.33 92 0.02 % 2.14 0 0 % 0 3,037 0.39 % 16.20 4 0.01 % 0.10 stand stand st and 3,204 0.12 % 2.82 0 0 % 0 780 0.08 % 1.44 1,808 0.31 % 5.69 98 0.03 % 2.28 0 0 % 0 499 0.06 % 2.66 
19 0.04 % 0.48 den den den 3,077 0.11 % 2.71 5 0.06 % 1.26 1,799 0.19 % 3.31 416 0.07 % 1.31 294 0.07 % 6.85 0 0 % 0 516 0.07 % 2.75 47 0.09 % 1.18 was was was 3,024 0.11 % 2.67 1 0.01 % 0.25 954 0.10 % 1.76 406 0.07 % 1.28 438 0.11 % 10.20 0 0 % 0 1,166 0.15 % 6.22 59 0.11 % 1.49 not not not 2,991 0.11 % 2.64 5 0.06 % 1.26 820 0.09 % 1.51 355 0.06 % 1.12 485 0.12 % 11.30 0 0 % 0 1,279 0.17 % 6.82 47 0.09 % 1.18 International international Internatio nal 2,897 0.10 % 2.55 6 0.07 % 1.52 965 0.10 % 1.78 809 0.14 % 2.54 414 0.11 % 9.64 0 0 % 0 696 0.09 % 3.71 7 0.01 % 0.18 You you You 2,882 0.10 % 2.54 1 0.01 % 0.25 831 0.09 % 1.53 1,005 0.17 % 3.16 72 0.02 % 1.68 0 0 % 0 904 0.12 % 4.82 69 0.13 % 1.74 CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 265 File at CLARIN.SI 1.2.249 List of final character-level 4-grams from residual lemmas in the Gigafida 2.0 corpus with text-type distributionGF2.0-word_parts-residual-lemmas-final- 4grams-taxonomy-entire.tsvLemma Lemma (lower-case) Rest of the word Final part of the word Total absolute frequency of lemma Percentage of all found lemmas Total relative frequency (per million occurrences) Absolute frequency [SSJ.T.D] Percentage [SSJ.T.D] Relative frequency [SSJ.T.D] Absolute frequency [SSJ.T.P.C] Percentage [SSJ.T.P.C] Relative frequency [SSJ.T.P.C] Absolute frequency [SSJ.I] Percentage [SSJ.I] Relative frequency [SSJ.I] Absolute frequency [SSJ.T.K.S] Percentage [SSJ.T.K.S] Relative frequency [SSJ.T.K.S] Absolute frequency [SSJ.T.K.N] Percentage [SSJ.T.K.N] Relative frequency [SSJ.T.K.N] Absolute frequency [SSJ.T.P.R] Percentage [SSJ.T.P.R] Relative frequency [SSJ.T.P.R] Absolute frequency [SSJ.T.K.L] Percentage [SSJ.T.K.L] Relative frequency [SSJ.T.K.L] dnevnik,si dnevnik,si dnevni k,si 6,354 0.30 % 5.60 0 0 % 0 6,059 0.85 % 11.16 278 0.06 % 0.87 0 0 % 0 0 0 % 0 17 0 % 0.09 0 0 % 0 with with with 5,952 0.28 % 5.25 4 0.05 % 1.01 1,281 0.18 % 2.36 1,145 0.26 % 3.60 1,244 0.41 % 28.97 0 0 % 0 
2,175 0.35 % 11.61 103 0.24 % 2.59 that that that 5,042 0.24 % 4.44 5 0.07 % 1.26 1,159 0.16 % 2.14 572 0.13 % 1.80 1,039 0.34 % 24.20 0 0 % 0 2,166 0.35 % 11.56 101 0.23 % 2.54 24ur,com 24ur,com 24ur ,com 4,723 0.22 % 4.16 0 0 % 0 158 0.02 % 0.29 4,477 0.99 % 14.08 2 0 % 0.05 0 0 % 0 86 0.01 % 0.46 0 0 % 0 from from from 4,370 0.20 % 3.85 2 0.03 % 0.51 1,010 0.14 % 1.86 932 0.21 % 2.93 885 0.29 % 20.61 0 0 % 0 1,468 0.24 % 7.83 73 0.17 % 1.84 siol,net siol,net siol ,net 3,780 0.18 % 3.33 1 0.01 % 0.25 1,016 0.14 % 1.87 1,852 0.41 % 5.83 78 0.03 % 1.82 0 0 % 0 833 0.14 % 4.44 0 0 % 0 World world W orld 3,772 0.18 % 3.32 8 0.11 % 2.02 1,222 0.17 % 2.25 1,169 0.26 % 3.68 434 0.14 % 10.11 0 0 % 0 884 0.14 % 4.72 55 0.13 % 1.38 press press p ress 3,361 0.16 % 2.96 0 0 % 0 124 0.02 % 0.23 104 0.02 % 0.33 92 0.03 % 2.14 0 0 % 0 3,037 0.49 % 16.20 4 0.01 % 0.10 stand stand s tand 3,204 0.15 % 2.82 0 0 % 0 780 0.11 % 1.44 1,808 0.40 % 5.69 98 0.03 % 2.28 0 0 % 0 499 0.08 % 2.66 19 0.04 % 0.48 International international Internati onal 2,897 0.14 % 2.55 6 0.08 % 1.52 965 0.14 % 1.78 809 0.18 % 2.54 414 0.14 % 9.64 0 0 % 0 696 0.11 % 3.71 7 0.02 % 0.18 European european Euro pean 2,851 0.13 % 2.51 3 0.04 % 0.76 870 0.12 % 1.60 664 0.15 % 2.09 550 0.18 % 12.81 0 0 % 0 763 0.12 % 4.07 1 0 % 0.03 Journal journal Jou rnal 2,796 0.13 % 2.46 1 0.01 % 0.25 461 0.07 % 0.85 725 0.16 % 2.28 796 0.26 % 18.54 0 0 % 0 790 0.13 % 4.22 23 0.05 % 0.58 blue blue blue 2,740 0.13 % 2.41 1 0.01 % 0.25 1,144 0.16 % 2.11 1,397 0.31 % 4.39 20 0.01 % 0.47 0 0 % 0 166 0.03 % 0.89 12 0.03 % 0.30 have have have 2,449 0.12 % 2.16 4 0.05 % 1.01 592 0.08 % 1.09 418 0.09 % 1.31 375 0.12 % 8.73 0 0 % 0 976 0.16 % 5.21 84 0.20 % 2.12 facto facto f acto 2,278 0.11 % 2.01 15 0.20 % 3.79 918 0.13 % 1.69 827 0.18 % 2.60 104 0.03 % 2.42 0 0 % 0 398 0.07 % 2.12 16 0.04 % 0.40 this this this 2,261 0.11 % 1.99 2 0.03 % 0.51 365 0.05 % 0.67 407 0.09 % 1.28 403 0.13 % 9.39 0 0 % 0 1,015 0.17 % 5.42 69 0.16 % 1.74 
hard hard hard 2,243 0.10 % 1.98 2 0.03 % 0.51 776 0.11 % 1.43 420 0.09 % 1.32 49 0.02 % 1.14 0 0 % 0 972 0.16 % 5.19 24 0.06 % 0.60 ford ford ford 2,196 0.10 % 1.94 0 0 % 0 1,616 0.23 % 2.98 407 0.09 % 1.28 2 0 % 0.05 0 0 % 0 159 0.03 % 0.85 12 0.03 % 0.30 grand grand g rand 2,153 0.10 % 1.90 2 0.03 % 0.51 1,090 0.15 % 2.01 754 0.17 % 2.37 42 0.01 % 0.98 0 0 % 0 253 0.04 % 1.35 12 0.03 % 0.30 which which w hich 2,124 0.10 % 1.87 1 0.01 % 0.25 662 0.09 % 1.22 110 0.02 % 0.35 560 0.18 % 13.04 0 0 % 0 777 0.13 % 4.15 14 0.03 % 0.35 will will will 2,112 0.10 % 1.86 1 0.01 % 0.25 878 0.12 % 1.62 381 0.09 % 1.20 233 0.08 % 5.43 0 0 % 0 575 0.09 % 3.07 44 0.10 % 1.11 della della d ella 2,058 0.10 % 1.81 0 0 % 0 870 0.12 % 1.60 517 0.12 % 1.63 275 0.09 % 6.41 0 0 % 0 361 0.06 % 1.93 35 0.08 % 0.88 Love love Love 2,036 0.10 % 1.79 0 0 % 0 650 0.09 % 1.20 748 0.17 % 2.35 31 0.01 % 0.72 0 0 % 0 584 0.10 % 3.12 23 0.05 % 0.58 University university Univer sity 2,003 0.09 % 1.77 1 0.01 % 0.25 417 0.06 % 0.77 429 0.10 % 1.35 817 0.27 % 19.03 0 0 % 0 325 0.05 % 1.73 14 0.03 % 0.35 miss miss miss 1,914 0.09 % 1.69 0 0 % 0 584 0.08 % 1.08 220 0.05 % 0.69 5 0 % 0.12 0 0 % 0 1,006 0.16 % 5.37 99 0.23 % 2.49 www,facebook,com www,facebook,com www,facebook ,com 1,853 0.09 % 1.63 0 0 % 0 9 0 % 0.02 1,835 0.41 % 5.77 1 0 % 0.02 0 0 % 0 8 0 % 0.04 0 0 % 0 time time time 1,797 0.08 % 1.58 4 0.05 % 1.01 488 0.07 % 0.90 291 0.07 % 0.92 217 0.07 % 5.05 0 0 % 0 765 0.12 % 4.08 32 0.07 % 0.81 twitter,com twitter,com twitter ,com 1,775 0.08 % 1.56 0 0 % 0 8 0 % 0.01 1,762 0.39 % 5.54 0 0 % 0 0 0 % 0 5 0 % 0.03 0 0 % 0 si,mobil si,mobil si,m obil 1,730 0.08 % 1.52 6 0.08 % 1.52 788 0.11 % 1.45 337 0.07 % 1.06 0 0 % 0 0 0 % 0 599 0.10 % 3.20 0 0 % 0 your your your 1,711 0.08 % 1.51 0 0 % 0 413 0.06 % 0.76 362 0.08 % 1.14 100 0.03 % 2.33 0 0 % 0 754 0.12 % 4.02 82 0.19 % 2.06 National national Nati onal 1,702 0.08 % 1.50 10 0.13 % 2.53 617 0.09 % 1.14 417 0.09 % 1.31 292 0.10 % 6.80 0 0 % 0 337 0.06 
% 1.80 29 0.07 % 0.73 their their t heir 1,680 0.08 % 1.48 3 0.04 % 0.76 247 0.04 % 0.46 173 0.04 % 0.54 530 0.17 % 12.34 0 0 % 0 724 0.12 % 3.86 3 0.01 % 0.08 CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 266 File at CLARIN.SI 1.2.250 List of final character-level 5-grams from residual lemmas in the Gigafida 2.0 corpus with text-type distributionGF2.0-word_parts-residual-lemmas-final- 5grams-taxonomy-entire.tsvLemma Lemma (lower-case) Rest of the word Final part of the word Total absolute frequency of lemma Percentage of all found lemmas Total relative frequency (per million occurrences) Absolute frequency [SSJ.T.K.L] Percentage [SSJ.T.K.L] Relative frequency [SSJ.T.K.L] Absolute frequency [SSJ.T.D] Percentage [SSJ.T.D] Relative frequency [SSJ.T.D] Absolute frequency [SSJ.T.P.C] Percentage [SSJ.T.P.C] Relative frequency [SSJ.T.P.C] Absolute frequency [SSJ.T.P.R] Percentage [SSJ.T.P.R] Relative frequency [SSJ.T.P.R] Absolute frequency [SSJ.I] Percentage [SSJ.I] Relative frequency [SSJ.I] Absolute frequency [SSJ.T.K.S] Percentage [SSJ.T.K.S] Relative frequency [SSJ.T.K.S] Absolute frequency [SSJ.T.K.N] Percentage [SSJ.T.K.N] Relative frequency [SSJ.T.K.N] dnevnik,si dnevnik,si dnevn ik,si 6,354 0.36 % 5.60 0 0 % 0 0 0 % 0 6,059 1.04 % 11.16 17 0 % 0.09 278 0.07 % 0.87 0 0 % 0 0 0 % 0 24ur,com 24ur,com 24u r,com 4,723 0.27 % 4.16 0 0 % 0 0 0 % 0 158 0.03 % 0.29 86 0.02 % 0.46 4,477 1.22 % 14.08 2 0 % 0.05 0 0 % 0 siol,net siol,net sio l,net 3,780 0.21 % 3.33 0 0 % 0 1 0.01 % 0.25 1,016 0.17 % 1.87 833 0.17 % 4.44 1,852 0.50 % 5.83 78 0.03 % 1.82 0 0 % 0 World world World 3,772 0.21 % 3.32 55 0.16 % 1.38 8 0.12 % 2.02 1,222 0.21 % 2.25 884 0.18 % 4.72 1,169 0.32 % 3.68 434 0.16 % 10.11 0 0 % 0 press press press 3,361 0.19 % 2.96 4 0.01 % 0.10 0 0 % 0 124 0.02 % 0.23 3,037 0.60 % 16.20 104 0.03 % 0.33 92 0.04 % 2.14 0 0 % 0 stand stand stand 3,204 0.18 % 2.82 19 0.06 % 0.48 0 0 % 0 780 0.13 % 1.44 499 0.10 % 2.66 1,808 0.49 % 5.69 98 
0.04 % 2.28 0 0 % 0
International international Internat ional 2,897 0.17 % 2.55 7 0.02 % 0.18 6 0.09 % 1.52 965 0.17 % 1.78 696 0.14 % 3.71 809 0.22 % 2.54 414 0.16 % 9.64 0 0 % 0
European european Eur opean 2,851 0.16 % 2.51 1 0 % 0.03 3 0.04 % 0.76 870 0.15 % 1.60 763 0.15 % 4.07 664 0.18 % 2.09 550 0.21 % 12.81 0 0 % 0
Journal journal Jo urnal 2,796 0.16 % 2.46 23 0.07 % 0.58 1 0.01 % 0.25 461 0.08 % 0.85 790 0.16 % 4.22 725 0.20 % 2.28 796 0.30 % 18.54 0 0 % 0
facto facto facto 2,278 0.13 % 2.01 16 0.05 % 0.40 15 0.22 % 3.79 918 0.16 % 1.69 398 0.08 % 2.12 827 0.22 % 2.60 104 0.04 % 2.42 0 0 % 0
grand grand grand 2,153 0.12 % 1.90 12 0.04 % 0.30 2 0.03 % 0.51 1,090 0.19 % 2.01 253 0.05 % 1.35 754 0.20 % 2.37 42 0.02 % 0.98 0 0 % 0
which which which 2,124 0.12 % 1.87 14 0.04 % 0.35 1 0.01 % 0.25 662 0.11 % 1.22 777 0.15 % 4.15 110 0.03 % 0.35 560 0.21 % 13.04 0 0 % 0
della della della 2,058 0.12 % 1.81 35 0.10 % 0.88 0 0 % 0 870 0.15 % 1.60 361 0.07 % 1.93 517 0.14 % 1.63 275 0.10 % 6.41 0 0 % 0
University university Unive rsity 2,003 0.11 % 1.77 14 0.04 % 0.35 1 0.01 % 0.25 417 0.07 % 0.77 325 0.07 % 1.73 429 0.12 % 1.35 817 0.31 % 19.03 0 0 % 0
www,facebook,com www,facebook,com www,faceboo k,com 1,853 0.10 % 1.63 0 0 % 0 0 0 % 0 9 0 % 0.02 8 0 % 0.04 1,835 0.50 % 5.77 1 0 % 0.02 0 0 % 0
twitter,com twitter,com twitte r,com 1,775 0.10 % 1.56 0 0 % 0 0 0 % 0 8 0 % 0.01 5 0 % 0.03 1,762 0.48 % 5.54 0 0 % 0 0 0 % 0
si,mobil si,mobil si, mobil 1,730 0.10 % 1.52 0 0 % 0 6 0.09 % 1.52 788 0.14 % 1.45 599 0.12 % 3.20 337 0.09 % 1.06 0 0 % 0 0 0 % 0
National national Nat ional 1,702 0.10 % 1.50 29 0.09 % 0.73 10 0.15 % 2.53 617 0.11 % 1.14 337 0.07 % 1.80 417 0.11 % 1.31 292 0.11 % 6.80 0 0 % 0
their their their 1,680 0.10 % 1.48 3 0.01 % 0.08 3 0.04 % 0.76 247 0.04 % 0.46 724 0.14 % 3.86 173 0.05 % 0.54 530 0.20 % 12.34 0 0 % 0
alias alias alias 1,676 0.10 % 1.48 85 0.25 % 2.14 2 0.03 % 0.51 650 0.11 % 1.20 678 0.14 % 3.62 236 0.06 % 0.74 25 0.01 % 0.58 0 0 % 0
sport
sport sport 1,471 0.08 % 1.30 0 0 % 0 0 0 % 0 646 0.11 % 1.19 448 0.09 % 2.39 317 0.09 % 1 60 0.02 % 1.40 0 0 % 0
desk@sta,si desk@sta,si desk@s ta,si 1,435 0.08 % 1.26 0 0 % 0 0 0 % 0 0 0 % 0 0 0 % 0 1,435 0.39 % 4.51 0 0 % 0 0 0 % 0
www,promet,si www,promet,si www,prom et,si 1,393 0.08 % 1.23 0 0 % 0 0 0 % 0 28 0.01 % 0.05 7 0 % 0.04 1,358 0.37 % 4.27 0 0 % 0 0 0 % 0
dello dello dello 1,377 0.08 % 1.21 5 0.01 % 0.13 0 0 % 0 687 0.12 % 1.27 48 0.01 % 0.26 623 0.17 % 1.96 14 0.01 % 0.33 0 0 % 0
American american Ame rican 1,352 0.08 % 1.19 10 0.03 % 0.25 1 0.01 % 0.25 365 0.06 % 0.67 371 0.07 % 1.98 301 0.08 % 0.95 304 0.12 % 7.08 0 0 % 0
www,mtv,si www,mtv,si www,m tv,si 1,337 0.08 % 1.18 0 0 % 0 0 0 % 0 0 0 % 0 0 0 % 0 1,337 0.36 % 4.21 0 0 % 0 0 0 % 0
world world world 1,309 0.07 % 1.15 25 0.07 % 0.63 1 0.01 % 0.25 449 0.08 % 0.83 404 0.08 % 2.16 327 0.09 % 1.03 103 0.04 % 2.40 0 0 % 0
www,mojedelo,com www,mojedelo,com www,mojedel o,com 1,278 0.07 % 1.13 0 0 % 0 0 0 % 0 1,209 0.21 % 2.23 68 0.01 % 0.36 1 0 % 0 0 0 % 0 0 0 % 0
pripomba@sta,si pripomba@sta,si pripomba@s ta,si 1,262 0.07 % 1.11 0 0 % 0 0 0 % 0 0 0 % 0 0 0 % 0 1,262 0.34 % 3.97 0 0 % 0 0 0 % 0
Social social S ocial 1,240 0.07 % 1.09 2 0.01 % 0.05 0 0 % 0 96 0.02 % 0.18 192 0.04 % 1.02 214 0.06 % 0.67 736 0.28 % 17.14 0 0 % 0
people people p eople 1,217 0.07 % 1.07 16 0.05 % 0.40 0 0 % 0 198 0.03 % 0.36 430 0.09 % 2.29 421 0.11 % 1.32 152 0.06 % 3.54 0 0 % 0
power power power 1,206 0.07 % 1.06 12 0.04 % 0.30 0 0 % 0 495 0.09 % 0.91 283 0.06 % 1.51 377 0.10 % 1.19 39 0.01 % 0.91 0 0 % 0

1.2.251 List of initial character-level 1-grams from residual lower-case word forms in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-word_parts-residual-lowercase_forms-initial-1grams-taxonomy-entire.tsv
Lower-case word form | Initial part of the word | Rest of the word | Total absolute frequency of lower-case word form | Percentage of all found lower-case word forms | Total relative frequency (per million occurrences) | Absolute frequency [SSJ.T.K.L] | Percentage [SSJ.T.K.L] | Relative frequency [SSJ.T.K.L] | Absolute frequency [SSJ.T.D] | Percentage [SSJ.T.D] | Relative frequency [SSJ.T.D] | Absolute frequency [SSJ.T.P.C] | Percentage [SSJ.T.P.C] | Relative frequency [SSJ.T.P.C] | Absolute frequency [SSJ.T.P.R] | Percentage [SSJ.T.P.R] | Relative frequency [SSJ.T.P.R] | Absolute frequency [SSJ.I] | Percentage [SSJ.I] | Relative frequency [SSJ.I] | Absolute frequency [SSJ.T.K.S] | Percentage [SSJ.T.K.S] | Relative frequency [SSJ.T.K.S] | Absolute frequency [SSJ.T.K.N] | Percentage [SSJ.T.K.N] | Relative frequency [SSJ.T.K.N]
the t he 143,431 3.90 % 126.41 1,518 2.09 % 38.22 165 1.56 % 41.71 40,736 3.10 % 75.06 41,796 4.14 % 223.01 35,895 4.65 % 112.90 23,321 4.71 % 543.17 0 0 % 0
de d e 119,443 3.25 % 105.26 3,685 5.07 % 92.78 175 1.66 % 44.24 47,661 3.62 % 87.82 18,231 1.80 % 97.27 37,427 4.85 % 117.72 12,264 2.48 % 285.64 0 0 % 0
of o f 88,222 2.40 % 77.75 779 1.07 % 19.61 120 1.14 % 30.33 23,377 1.78 % 43.07 24,881 2.46 % 132.76 19,565 2.54 % 61.54 19,500 3.94 % 454.18 0 0 % 0
and a nd 50,052 1.36 % 44.11 612 0.84 % 15.41 38 0.36 % 9.61 10,082 0.77 % 18.58 14,953 1.48 % 79.78 9,499 1.23 % 29.88 14,868 3.00 % 346.29 0 0 % 0
la l a 47,471 1.29 % 41.84 1,450 2.00 % 36.51 53 0.50 % 13.40 19,464 1.48 % 35.86 9,233 0.91 % 49.26 12,499 1.62 % 39.31 4,772 0.96 % 111.15 0 0 % 0
i i 46,596 1.27 % 41.06 839 1.16 % 21.13 327 3.10 % 82.66 9,280 0.71 % 17.10 20,109 1.99 % 107.30 12,783 1.66 % 40.21 3,258 0.66 % 75.88 0 0 % 0
a a 32,433 0.88 % 28.58 571 0.79 % 14.38 220 2.09 % 55.61 8,682 0.66 % 16 11,459 1.14 % 61.14 5,850 0.76 % 18.40 5,651 1.14 % 131.62 0 0 % 0
to t o 23,864 0.65 % 21.03 429 0.59 % 10.80 26 0.25 % 6.57 5,851 0.45 % 10.78 8,178 0.81 % 43.64 5,143 0.67 % 16.18 4,237 0.86 % 98.68 0 0 % 0
in i n 22,418 0.61 % 19.76 277 0.38 % 6.97 39 0.37 % 9.86 5,726 0.43 % 10.55 7,166 0.71 % 38.24 4,199 0.54 % 13.21 5,011
1.01 % 116.71 0 0 % 0
di d i 20,809 0.57 % 18.34 268 0.37 % 6.75 28 0.27 % 7.08 9,812 0.75 % 18.08 3,363 0.33 % 17.94 5,856 0.76 % 18.42 1,481 0.30 % 34.49 1 33.33 % 103.02
van v an 20,097 0.55 % 17.71 349 0.48 % 8.79 21 0.20 % 5.31 9,983 0.76 % 18.39 1,630 0.16 % 8.70 7,452 0.97 % 23.44 662 0.13 % 15.42 0 0 % 0
sta s ta 17,204 0.47 % 15.16 1 0 % 0.03 0 0 % 0 15,791 1.20 % 29.10 21 0 % 0.11 1,388 0.18 % 4.37 3 0 % 0.07 0 0 % 0
el e l 15,435 0.42 % 13.60 418 0.57 % 10.52 11 0.10 % 2.78 6,078 0.46 % 11.20 2,615 0.26 % 13.95 5,592 0.72 % 17.59 721 0.15 % 16.79 0 0 % 0
for f or 14,317 0.39 % 12.62 126 0.17 % 3.17 18 0.17 % 4.55 3,428 0.26 % 6.32 4,714 0.47 % 25.15 3,030 0.39 % 9.53 3,001 0.61 % 69.90 0 0 % 0
on o n 12,619 0.34 % 11.12 178 0.24 % 4.48 8 0.08 % 2.02 3,704 0.28 % 6.82 3,664 0.36 % 19.55 2,757 0.36 % 8.67 2,308 0.47 % 53.76 0 0 % 0
pre p re 12,246 0.33 % 10.79 45 0.06 % 1.13 34 0.32 % 8.59 6,173 0.47 % 11.37 2,724 0.27 % 14.53 2,881 0.37 % 9.06 389 0.08 % 9.06 0 0 % 0
is i s 11,863 0.32 % 10.45 262 0.36 % 6.60 3 0.03 % 0.76 2,493 0.19 % 4.59 3,726 0.37 % 19.88 3,514 0.46 % 11.05 1,865 0.38 % 43.44 0 0 % 0
der d er 11,513 0.31 % 10.15 276 0.38 % 6.95 29 0.28 % 7.33 4,068 0.31 % 7.50 1,653 0.16 % 8.82 3,174 0.41 % 9.98 2,313 0.47 % 53.87 0 0 % 0
von v on 11,326 0.31 % 9.98 349 0.48 % 8.79 178 1.69 % 45 4,312 0.33 % 7.95 2,001 0.20 % 10.68 2,590 0.34 % 8.15 1,896 0.38 % 44.16 0 0 % 0
et e t 9,623 0.26 % 8.48 364 0.50 % 9.17 26 0.25 % 6.57 2,923 0.22 % 5.39 1,605 0.16 % 8.56 1,351 0.17 % 4.25 3,354 0.68 % 78.12 0 0 % 0
bin b in 9,272 0.25 % 8.17 72 0.10 % 1.81 2 0.02 % 0.51 5,214 0.40 % 9.61 1,243 0.12 % 6.63 2,638 0.34 % 8.30 103 0.02 % 2.40 0 0 % 0
da d a 9,128 0.25 % 8.04 1,009 1.39 % 25.41 27 0.26 % 6.83 3,188 0.24 % 5.87 2,083 0.21 % 11.11 2,267 0.29 % 7.13 554 0.11 % 12.90 0 0 % 0
by b y 8,563 0.23 % 7.55 69 0.10 % 1.74 21 0.20 % 5.31 2,522 0.19 % 4.65 3,090 0.31 % 16.49 1,719 0.22 % 5.41 1,142 0.23 % 26.60 0 0 % 0
del d el 7,785 0.21 % 6.86 95 0.13 %
2.39 20 0.19 % 5.06 3,735 0.28 % 6.88 1,642 0.16 % 8.76 1,510 0.20 % 4.75 783 0.16 % 18.24 0 0 % 0
you y ou 7,687 0.21 % 6.77 444 0.61 % 11.18 12 0.11 % 3.03 2,163 0.16 % 3.99 2,650 0.26 % 14.14 2,149 0.28 % 6.76 269 0.05 % 6.27 0 0 % 0
an a n 7,544 0.20 % 6.65 115 0.16 % 2.90 11 0.10 % 2.78 3,059 0.23 % 5.64 1,701 0.17 % 9.08 1,261 0.16 % 3.97 1,397 0.28 % 32.54 0 0 % 0
du d u 7,024 0.19 % 6.19 295 0.41 % 7.43 10 0.10 % 2.53 3,031 0.23 % 5.58 1,043 0.10 % 5.57 1,300 0.17 % 4.09 1,345 0.27 % 31.33 0 0 % 0
be b e 6,845 0.19 % 6.03 137 0.19 % 3.45 11 0.10 % 2.78 2,025 0.15 % 3.73 2,409 0.24 % 12.85 1,265 0.16 % 3.98 998 0.20 % 23.24 0 0 % 0
des d es 6,751 0.18 % 5.95 388 0.53 % 9.77 21 0.20 % 5.31 2,218 0.17 % 4.09 982 0.10 % 5.24 1,095 0.14 % 3.44 2,047 0.41 % 47.68 0 0 % 0
with w ith 6,647 0.18 % 5.86 110 0.15 % 2.77 4 0.04 % 1.01 1,466 0.11 % 2.70 2,355 0.23 % 12.57 1,405 0.18 % 4.42 1,307 0.26 % 30.44 0 0 % 0
dnevnik,si d nevnik,si 6,371 0.17 % 5.61 0 0 % 0 0 0 % 0 6,059 0.46 % 11.16 17 0 % 0.09 295 0.04 % 0.93 0 0 % 0 0 0 % 0
all a ll 5,868 0.16 % 5.17 107 0.15 % 2.69 5 0.05 % 1.26 1,983 0.15 % 3.65 1,773 0.18 % 9.46 1,505 0.20 % 4.73 495 0.10 % 11.53 0 0 % 0

1.2.252 List of initial character-level 2-grams from residual lower-case word forms in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-word_parts-residual-lowercase_forms-initial-2grams-taxonomy-entire.tsv
Lower-case word form | Initial part of the word | Rest of the word | Total absolute frequency of lower-case word form | Percentage of all found lower-case word forms | Total relative frequency (per million occurrences) | Absolute frequency [SSJ.T.K.L] | Percentage [SSJ.T.K.L] | Relative frequency [SSJ.T.K.L] | Absolute frequency [SSJ.T.D] | Percentage [SSJ.T.D] | Relative frequency [SSJ.T.D] | Absolute frequency [SSJ.T.P.C] | Percentage [SSJ.T.P.C] | Relative frequency [SSJ.T.P.C] | Absolute frequency [SSJ.T.P.R] | Percentage [SSJ.T.P.R] | Relative frequency [SSJ.T.P.R] | Absolute frequency [SSJ.I] | Percentage [SSJ.I] | Relative frequency [SSJ.I] | Absolute frequency [SSJ.T.K.S] | Percentage [SSJ.T.K.S] | Relative frequency [SSJ.T.K.S] | Absolute frequency [SSJ.T.K.N] | Percentage [SSJ.T.K.N] | Relative frequency [SSJ.T.K.N]
the th e 143,431 4.03 % 126.41 1,518 2.16 % 38.22 165 1.66 % 41.71 40,736 3.15 % 75.06 41,796 4.37 % 223.01 35,895 4.80 % 112.90 23,321 4.86 % 543.17 0 0 % 0
de de 119,443 3.36 % 105.26 3,685 5.25 % 92.78 175 1.76 % 44.24 47,661 3.69 % 87.82 18,231 1.91 % 97.27 37,427 5.01 % 117.72 12,264 2.55 % 285.64 0 0 % 0
of of 88,222 2.48 % 77.75 779 1.11 % 19.61 120 1.21 % 30.33 23,377 1.81 % 43.07 24,881 2.60 % 132.76 19,565 2.62 % 61.54 19,500 4.06 % 454.18 0 0 % 0
and an d 50,052 1.41 % 44.11 612 0.87 % 15.41 38 0.38 % 9.61 10,082 0.78 % 18.58 14,953 1.56 % 79.78 9,499 1.27 % 29.88 14,868 3.10 % 346.29 0 0 % 0
la la 47,471 1.33 % 41.84 1,450 2.06 % 36.51 53 0.53 % 13.40 19,464 1.51 % 35.86 9,233 0.97 % 49.26 12,499 1.67 % 39.31 4,772 0.99 % 111.15 0 0 % 0
to to 23,864 0.67 % 21.03 429 0.61 % 10.80 26 0.26 % 6.57 5,851 0.45 % 10.78 8,178 0.86 % 43.64 5,143 0.69 % 16.18 4,237 0.88 % 98.68 0 0 % 0
in in 22,418 0.63 % 19.76 277 0.39 % 6.97 39 0.39 % 9.86 5,726 0.44 % 10.55 7,166 0.75 % 38.24 4,199 0.56 % 13.21 5,011 1.04 % 116.71 0 0 % 0
di di 20,809 0.58 % 18.34 268 0.38 % 6.75 28 0.28 % 7.08 9,812 0.76 % 18.08 3,363 0.35 % 17.94 5,856 0.78 % 18.42 1,481 0.31 % 34.49 1 33.33 % 103.02
van va n 20,097 0.56 % 17.71 349 0.50 % 8.79 21 0.21 % 5.31 9,983 0.77 % 18.39 1,630 0.17 % 8.70 7,452 1.00 % 23.44 662 0.14 % 15.42 0 0 % 0
sta st a 17,204 0.48 % 15.16 1 0 % 0.03 0 0 % 0 15,791 1.22 % 29.10 21 0 % 0.11 1,388 0.19 % 4.37 3 0 % 0.07 0 0 % 0
el el 15,435 0.43 % 13.60 418 0.59 % 10.52 11 0.11 % 2.78 6,078 0.47 % 11.20 2,615 0.27 % 13.95 5,592 0.75 % 17.59 721 0.15 % 16.79 0 0 % 0
for fo r 14,317 0.40 % 12.62 126 0.18 % 3.17 18 0.18 % 4.55 3,428 0.27 % 6.32 4,714 0.49 % 25.15 3,030 0.41 % 9.53 3,001
0.62 % 69.90 0 0 % 0
on on 12,619 0.35 % 11.12 178 0.25 % 4.48 8 0.08 % 2.02 3,704 0.29 % 6.82 3,664 0.38 % 19.55 2,757 0.37 % 8.67 2,308 0.48 % 53.76 0 0 % 0
pre pr e 12,246 0.34 % 10.79 45 0.06 % 1.13 34 0.34 % 8.59 6,173 0.48 % 11.37 2,724 0.28 % 14.53 2,881 0.39 % 9.06 389 0.08 % 9.06 0 0 % 0
is is 11,863 0.33 % 10.45 262 0.37 % 6.60 3 0.03 % 0.76 2,493 0.19 % 4.59 3,726 0.39 % 19.88 3,514 0.47 % 11.05 1,865 0.39 % 43.44 0 0 % 0
der de r 11,513 0.32 % 10.15 276 0.39 % 6.95 29 0.29 % 7.33 4,068 0.32 % 7.50 1,653 0.17 % 8.82 3,174 0.42 % 9.98 2,313 0.48 % 53.87 0 0 % 0
von vo n 11,326 0.32 % 9.98 349 0.50 % 8.79 178 1.79 % 45 4,312 0.33 % 7.95 2,001 0.21 % 10.68 2,590 0.35 % 8.15 1,896 0.40 % 44.16 0 0 % 0
et et 9,623 0.27 % 8.48 364 0.52 % 9.17 26 0.26 % 6.57 2,923 0.23 % 5.39 1,605 0.17 % 8.56 1,351 0.18 % 4.25 3,354 0.70 % 78.12 0 0 % 0
bin bi n 9,272 0.26 % 8.17 72 0.10 % 1.81 2 0.02 % 0.51 5,214 0.40 % 9.61 1,243 0.13 % 6.63 2,638 0.35 % 8.30 103 0.02 % 2.40 0 0 % 0
da da 9,128 0.26 % 8.04 1,009 1.44 % 25.41 27 0.27 % 6.83 3,188 0.25 % 5.87 2,083 0.22 % 11.11 2,267 0.30 % 7.13 554 0.12 % 12.90 0 0 % 0
by by 8,563 0.24 % 7.55 69 0.10 % 1.74 21 0.21 % 5.31 2,522 0.20 % 4.65 3,090 0.32 % 16.49 1,719 0.23 % 5.41 1,142 0.24 % 26.60 0 0 % 0
del de l 7,785 0.22 % 6.86 95 0.14 % 2.39 20 0.20 % 5.06 3,735 0.29 % 6.88 1,642 0.17 % 8.76 1,510 0.20 % 4.75 783 0.16 % 18.24 0 0 % 0
you yo u 7,687 0.22 % 6.77 444 0.63 % 11.18 12 0.12 % 3.03 2,163 0.17 % 3.99 2,650 0.28 % 14.14 2,149 0.29 % 6.76 269 0.06 % 6.27 0 0 % 0
an an 7,544 0.21 % 6.65 115 0.16 % 2.90 11 0.11 % 2.78 3,059 0.24 % 5.64 1,701 0.18 % 9.08 1,261 0.17 % 3.97 1,397 0.29 % 32.54 0 0 % 0
du du 7,024 0.20 % 6.19 295 0.42 % 7.43 10 0.10 % 2.53 3,031 0.23 % 5.58 1,043 0.11 % 5.57 1,300 0.17 % 4.09 1,345 0.28 % 31.33 0 0 % 0
be be 6,845 0.19 % 6.03 137 0.20 % 3.45 11 0.11 % 2.78 2,025 0.16 % 3.73 2,409 0.25 % 12.85 1,265 0.17 % 3.98 998 0.21 % 23.24 0 0 % 0
des de s 6,751 0.19 % 5.95 388 0.55 % 9.77 21 0.21 %
5.31 2,218 0.17 % 4.09 982 0.10 % 5.24 1,095 0.15 % 3.44 2,047 0.43 % 47.68 0 0 % 0
with wi th 6,647 0.19 % 5.86 110 0.16 % 2.77 4 0.04 % 1.01 1,466 0.11 % 2.70 2,355 0.25 % 12.57 1,405 0.19 % 4.42 1,307 0.27 % 30.44 0 0 % 0
dnevnik,si dn evnik,si 6,371 0.18 % 5.61 0 0 % 0 0 0 % 0 6,059 0.47 % 11.16 17 0 % 0.09 295 0.04 % 0.93 0 0 % 0 0 0 % 0
all al l 5,868 0.17 % 5.17 107 0.15 % 2.69 5 0.05 % 1.26 1,983 0.15 % 3.65 1,773 0.18 % 9.46 1,505 0.20 % 4.73 495 0.10 % 11.53 0 0 % 0
re re 5,794 0.16 % 5.11 142 0.20 % 3.58 29 0.29 % 7.33 2,954 0.23 % 5.44 1,102 0.12 % 5.88 1,286 0.17 % 4.04 281 0.06 % 6.54 0 0 % 0
at at 5,740 0.16 % 5.06 69 0.10 % 1.74 68 0.68 % 17.19 1,685 0.13 % 3.10 1,834 0.19 % 9.79 1,237 0.17 % 3.89 847 0.18 % 19.73 0 0 % 0

1.2.253 List of initial character-level 3-grams from residual lower-case word forms in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-word_parts-residual-lowercase_forms-initial-3grams-taxonomy-entire.tsv
Lower-case word form | Initial part of the word | Rest of the word | Total absolute frequency of lower-case word form | Percentage of all found lower-case word forms | Total relative frequency (per million occurrences) | Absolute frequency [SSJ.T.K.L] | Percentage [SSJ.T.K.L] | Relative frequency [SSJ.T.K.L] | Absolute frequency [SSJ.T.D] | Percentage [SSJ.T.D] | Relative frequency [SSJ.T.D] | Absolute frequency [SSJ.T.P.C] | Percentage [SSJ.T.P.C] | Relative frequency [SSJ.T.P.C] | Absolute frequency [SSJ.T.P.R] | Percentage [SSJ.T.P.R] | Relative frequency [SSJ.T.P.R] | Absolute frequency [SSJ.I] | Percentage [SSJ.I] | Relative frequency [SSJ.I] | Absolute frequency [SSJ.T.K.S] | Percentage [SSJ.T.K.S] | Relative frequency [SSJ.T.K.S] | Absolute frequency [SSJ.T.K.N] | Percentage [SSJ.T.K.N] | Relative frequency [SSJ.T.K.N]
the the 143,431 5.20 % 126.41 1,518 2.77 % 38.22 165 1.90 % 41.71 40,736 4.30 % 75.06 41,796 5.39 % 223.01 35,895 6.17 % 112.90 23,321 5.95 % 543.17
0 0 % 0
and and 50,052 1.81 % 44.11 612 1.12 % 15.41 38 0.44 % 9.61 10,082 1.06 % 18.58 14,953 1.93 % 79.78 9,499 1.63 % 29.88 14,868 3.79 % 346.29 0 0 % 0
van van 20,097 0.73 % 17.71 349 0.64 % 8.79 21 0.24 % 5.31 9,983 1.05 % 18.39 1,630 0.21 % 8.70 7,452 1.28 % 23.44 662 0.17 % 15.42 0 0 % 0
sta sta 17,204 0.62 % 15.16 1 0 % 0.03 0 0 % 0 15,791 1.67 % 29.10 21 0 % 0.11 1,388 0.24 % 4.37 3 0 % 0.07 0 0 % 0
for for 14,317 0.52 % 12.62 126 0.23 % 3.17 18 0.21 % 4.55 3,428 0.36 % 6.32 4,714 0.61 % 25.15 3,030 0.52 % 9.53 3,001 0.77 % 69.90 0 0 % 0
pre pre 12,246 0.44 % 10.79 45 0.08 % 1.13 34 0.39 % 8.59 6,173 0.65 % 11.37 2,724 0.35 % 14.53 2,881 0.49 % 9.06 389 0.10 % 9.06 0 0 % 0
der der 11,513 0.42 % 10.15 276 0.50 % 6.95 29 0.33 % 7.33 4,068 0.43 % 7.50 1,653 0.21 % 8.82 3,174 0.55 % 9.98 2,313 0.59 % 53.87 0 0 % 0
von von 11,326 0.41 % 9.98 349 0.64 % 8.79 178 2.05 % 45 4,312 0.46 % 7.95 2,001 0.26 % 10.68 2,590 0.45 % 8.15 1,896 0.48 % 44.16 0 0 % 0
bin bin 9,272 0.34 % 8.17 72 0.13 % 1.81 2 0.02 % 0.51 5,214 0.55 % 9.61 1,243 0.16 % 6.63 2,638 0.45 % 8.30 103 0.03 % 2.40 0 0 % 0
del del 7,785 0.28 % 6.86 95 0.17 % 2.39 20 0.23 % 5.06 3,735 0.39 % 6.88 1,642 0.21 % 8.76 1,510 0.26 % 4.75 783 0.20 % 18.24 0 0 % 0
you you 7,687 0.28 % 6.77 444 0.81 % 11.18 12 0.14 % 3.03 2,163 0.23 % 3.99 2,650 0.34 % 14.14 2,149 0.37 % 6.76 269 0.07 % 6.27 0 0 % 0
des des 6,751 0.24 % 5.95 388 0.71 % 9.77 21 0.24 % 5.31 2,218 0.23 % 4.09 982 0.13 % 5.24 1,095 0.19 % 3.44 2,047 0.52 % 47.68 0 0 % 0
with wit h 6,647 0.24 % 5.86 110 0.20 % 2.77 4 0.05 % 1.01 1,466 0.15 % 2.70 2,355 0.30 % 12.57 1,405 0.24 % 4.42 1,307 0.33 % 30.44 0 0 % 0
dnevnik,si dne vnik,si 6,371 0.23 % 5.61 0 0 % 0 0 0 % 0 6,059 0.64 % 11.16 17 0 % 0.09 295 0.05 % 0.93 0 0 % 0 0 0 % 0
all all 5,868 0.21 % 5.17 107 0.20 % 2.69 5 0.06 % 1.26 1,983 0.21 % 3.65 1,773 0.23 % 9.46 1,505 0.26 % 4.73 495 0.13 % 11.53 0 0 % 0
that tha t 5,449 0.20 % 4.80 110 0.20 % 2.77 5 0.06 % 1.26 1,273 0.13 % 2.35 2,290 0.29 %
12.22 678 0.12 % 2.13 1,093 0.28 % 25.46 0 0 % 0
from fro m 5,386 0.20 % 4.75 80 0.15 % 2.01 2 0.02 % 0.51 1,311 0.14 % 2.42 1,730 0.22 % 9.23 1,185 0.20 % 3.73 1,078 0.28 % 25.11 0 0 % 0
und und 5,288 0.19 % 4.66 154 0.28 % 3.88 18 0.21 % 4.55 1,244 0.13 % 2.29 1,418 0.18 % 7.57 556 0.10 % 1.75 1,898 0.48 % 44.21 0 0 % 0
world wor ld 5,151 0.19 % 4.54 81 0.15 % 2.04 9 0.10 % 2.28 1,696 0.18 % 3.12 1,314 0.17 % 7.01 1,499 0.26 % 4.71 552 0.14 % 12.86 0 0 % 0
die die 4,901 0.18 % 4.32 147 0.27 % 3.70 20 0.23 % 5.06 1,433 0.15 % 2.64 1,322 0.17 % 7.05 753 0.13 % 2.37 1,226 0.31 % 28.55 0 0 % 0
are are 4,844 0.17 % 4.27 112 0.20 % 2.82 1 0.01 % 0.25 948 0.10 % 1.75 1,942 0.25 % 10.36 919 0.16 % 2.89 922 0.23 % 21.47 0 0 % 0
24ur,com 24u r,com 4,704 0.17 % 4.15 0 0 % 0 0 0 % 0 151 0.02 % 0.28 80 0.01 % 0.43 4,471 0.77 % 14.06 2 0 % 0.05 0 0 % 0
one one 4,594 0.17 % 4.05 52 0.10 % 1.31 4 0.05 % 1.01 1,471 0.15 % 2.71 1,452 0.19 % 7.75 1,189 0.20 % 3.74 426 0.11 % 9.92 0 0 % 0
press pre ss 4,379 0.16 % 3.86 21 0.04 % 0.53 0 0 % 0 263 0.03 % 0.48 3,187 0.41 % 17 246 0.04 % 0.77 662 0.17 % 15.42 0 0 % 0
les les 3,871 0.14 % 3.41 175 0.32 % 4.41 4 0.05 % 1.01 1,540 0.16 % 2.84 456 0.06 % 2.43 612 0.10 % 1.92 1,084 0.28 % 25.25 0 0 % 0
per per 3,805 0.14 % 3.35 79 0.14 % 1.99 4 0.05 % 1.01 2,321 0.24 % 4.28 662 0.09 % 3.53 327 0.06 % 1.03 412 0.10 % 9.60 0 0 % 0
siol,net sio l,net 3,794 0.14 % 3.34 0 0 % 0 1 0.01 % 0.25 1,018 0.11 % 1.88 833 0.11 % 4.44 1,864 0.32 % 5.86 78 0.02 % 1.82 0 0 % 0
international int ernational 3,611 0.13 % 3.18 8 0.01 % 0.20 7 0.08 % 1.77 1,231 0.13 % 2.27 883 0.11 % 4.71 969 0.17 % 3.05 513 0.13 % 11.95 0 0 % 0
this thi s 3,532 0.13 % 3.11 85 0.15 % 2.14 2 0.02 % 0.51 615 0.07 % 1.13 1,499 0.19 % 8 774 0.13 % 2.43 557 0.14 % 12.97 0 0 % 0
not not 3,492 0.13 % 3.08 51 0.09 % 1.28 5 0.06 % 1.26 963 0.10 % 1.77 1,430 0.18 % 7.63 525 0.09 % 1.65 518 0.13 % 12.06 0 0 % 0
stand sta nd 3,378 0.12 % 2.98 22 0.04 % 0.55 0 0 % 0 824 0.09 % 1.52 562 0.07 %
3 1,862 0.32 % 5.86 108 0.03 % 2.52 0 0 % 0
was was 3,284 0.12 % 2.89 67 0.12 % 1.69 1 0.01 % 0.25 1,034 0.11 % 1.91 1,228 0.16 % 6.55 479 0.08 % 1.51 475 0.12 % 11.06 0 0 % 0

1.2.254 List of initial character-level 4-grams from residual lower-case word forms in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-word_parts-residual-lowercase_forms-initial-4grams-taxonomy-entire.tsv
Lower-case word form | Initial part of the word | Rest of the word | Total absolute frequency of lower-case word form | Percentage of all found lower-case word forms | Total relative frequency (per million occurrences) | Absolute frequency [SSJ.T.K.L] | Percentage [SSJ.T.K.L] | Relative frequency [SSJ.T.K.L] | Absolute frequency [SSJ.T.D] | Percentage [SSJ.T.D] | Relative frequency [SSJ.T.D] | Absolute frequency [SSJ.T.P.C] | Percentage [SSJ.T.P.C] | Relative frequency [SSJ.T.P.C] | Absolute frequency [SSJ.T.P.R] | Percentage [SSJ.T.P.R] | Relative frequency [SSJ.T.P.R] | Absolute frequency [SSJ.I] | Percentage [SSJ.I] | Relative frequency [SSJ.I] | Absolute frequency [SSJ.T.K.S] | Percentage [SSJ.T.K.S] | Relative frequency [SSJ.T.K.S] | Absolute frequency [SSJ.T.K.N] | Percentage [SSJ.T.K.N] | Relative frequency [SSJ.T.K.N]
with with 6,647 0.31 % 5.86 110 0.26 % 2.77 4 0.05 % 1.01 1,466 0.21 % 2.70 2,355 0.38 % 12.57 1,405 0.31 % 4.42 1,307 0.42 % 30.44 0 0 % 0
dnevnik,si dnev nik,si 6,371 0.30 % 5.61 0 0 % 0 0 0 % 0 6,059 0.85 % 11.16 17 0 % 0.09 295 0.07 % 0.93 0 0 % 0 0 0 % 0
that that 5,449 0.26 % 4.80 110 0.26 % 2.77 5 0.07 % 1.26 1,273 0.18 % 2.35 2,290 0.37 % 12.22 678 0.15 % 2.13 1,093 0.35 % 25.46 0 0 % 0
from from 5,386 0.25 % 4.75 80 0.19 % 2.01 2 0.03 % 0.51 1,311 0.18 % 2.42 1,730 0.28 % 9.23 1,185 0.26 % 3.73 1,078 0.35 % 25.11 0 0 % 0
world worl d 5,151 0.24 % 4.54 81 0.19 % 2.04 9 0.12 % 2.28 1,696 0.24 % 3.12 1,314 0.21 % 7.01 1,499 0.33 % 4.71 552 0.18 % 12.86 0 0 % 0
24ur,com 24ur ,com 4,704 0.22 % 4.15 0 0 % 0
0 0 % 0 151 0.02 % 0.28 80 0.01 % 0.43 4,471 0.99 % 14.06 2 0 % 0.05 0 0 % 0
press pres s 4,379 0.20 % 3.86 21 0.05 % 0.53 0 0 % 0 263 0.04 % 0.48 3,187 0.52 % 17 246 0.06 % 0.77 662 0.21 % 15.42 0 0 % 0
siol,net siol ,net 3,794 0.18 % 3.34 0 0 % 0 1 0.01 % 0.25 1,018 0.14 % 1.88 833 0.14 % 4.44 1,864 0.41 % 5.86 78 0.03 % 1.82 0 0 % 0
international inte rnational 3,611 0.17 % 3.18 8 0.02 % 0.20 7 0.09 % 1.77 1,231 0.17 % 2.27 883 0.14 % 4.71 969 0.21 % 3.05 513 0.17 % 11.95 0 0 % 0
this this 3,532 0.17 % 3.11 85 0.20 % 2.14 2 0.03 % 0.51 615 0.09 % 1.13 1,499 0.24 % 8 774 0.17 % 2.43 557 0.18 % 12.97 0 0 % 0
stand stan d 3,378 0.16 % 2.98 22 0.05 % 0.55 0 0 % 0 824 0.12 % 1.52 562 0.09 % 3 1,862 0.41 % 5.86 108 0.04 % 2.52 0 0 % 0
time time 3,126 0.15 % 2.75 42 0.10 % 1.06 4 0.05 % 1.01 926 0.13 % 1.71 1,167 0.19 % 6.23 689 0.15 % 2.17 298 0.10 % 6.94 0 0 % 0
blue blue 3,110 0.14 % 2.74 15 0.04 % 0.38 3 0.04 % 0.76 1,301 0.18 % 2.40 262 0.04 % 1.40 1,500 0.33 % 4.72 29 0.01 % 0.68 0 0 % 0
european euro pean 3,069 0.14 % 2.70 1 0 % 0.03 3 0.04 % 0.76 965 0.14 % 1.78 810 0.13 % 4.32 696 0.15 % 2.19 594 0.19 % 13.83 0 0 % 0
journal jour nal 3,005 0.14 % 2.65 23 0.05 % 0.58 1 0.01 % 0.25 520 0.07 % 0.96 821 0.13 % 4.38 760 0.17 % 2.39 880 0.29 % 20.50 0 0 % 0
love love 2,899 0.14 % 2.55 59 0.14 % 1.49 0 0 % 0 987 0.14 % 1.82 862 0.14 % 4.60 948 0.21 % 2.98 43 0.01 % 1 0 0 % 0
have have 2,868 0.13 % 2.53 95 0.22 % 2.39 4 0.05 % 1.01 720 0.10 % 1.33 1,104 0.18 % 5.89 545 0.12 % 1.71 400 0.13 % 9.32 0 0 % 0
life life 2,822 0.13 % 2.49 39 0.09 % 0.98 2 0.03 % 0.51 763 0.11 % 1.41 912 0.15 % 4.87 721 0.16 % 2.27 385 0.12 % 8.97 0 0 % 0
your your 2,496 0.12 % 2.20 95 0.22 % 2.39 0 0 % 0 621 0.09 % 1.14 993 0.16 % 5.30 621 0.14 % 1.95 166 0.05 % 3.87 0 0 % 0
sport spor t 2,432 0.11 % 2.14 2 0.01 % 0.05 0 0 % 0 1,096 0.15 % 2.02 572 0.09 % 3.05 638 0.14 % 2.01 124 0.04 % 2.89 0 0 % 0
hard hard 2,430 0.11 % 2.14 26 0.06 % 0.65 2 0.03 % 0.51 855 0.12 % 1.58 1,031 0.17 % 5.50
463 0.10 % 1.46 53 0.02 % 1.23 0 0 % 0
best best 2,391 0.11 % 2.11 22 0.05 % 0.55 0 0 % 0 854 0.12 % 1.57 838 0.14 % 4.47 595 0.13 % 1.87 82 0.03 % 1.91 0 0 % 0
della dell a 2,341 0.11 % 2.06 42 0.10 % 1.06 0 0 % 0 956 0.13 % 1.76 442 0.07 % 2.36 557 0.12 % 1.75 344 0.11 % 8.01 0 0 % 0
grand gran d 2,310 0.11 % 2.04 18 0.04 % 0.45 2 0.03 % 0.51 1,148 0.16 % 2.12 282 0.05 % 1.50 798 0.18 % 2.51 62 0.02 % 1.44 0 0 % 0
miss miss 2,297 0.11 % 2.02 108 0.25 % 2.72 1 0.01 % 0.25 721 0.10 % 1.33 1,154 0.19 % 6.16 303 0.07 % 0.95 10 0 % 0.23 0 0 % 0
social soci al 2,280 0.11 % 2.01 11 0.03 % 0.28 0 0 % 0 185 0.03 % 0.34 487 0.08 % 2.60 346 0.08 % 1.09 1,251 0.41 % 29.14 0 0 % 0
facto fact o 2,279 0.11 % 2.01 17 0.04 % 0.43 15 0.20 % 3.79 918 0.13 % 1.69 398 0.06 % 2.12 827 0.18 % 2.60 104 0.03 % 2.42 0 0 % 0
will will 2,215 0.10 % 1.95 46 0.11 % 1.16 1 0.01 % 0.25 907 0.13 % 1.67 596 0.10 % 3.18 425 0.09 % 1.34 240 0.08 % 5.59 0 0 % 0
university univ ersity 2,209 0.10 % 1.95 14 0.03 % 0.35 1 0.01 % 0.25 437 0.06 % 0.81 407 0.07 % 2.17 445 0.10 % 1.40 905 0.29 % 21.08 0 0 % 0
ford ford 2,197 0.10 % 1.94 13 0.03 % 0.33 0 0 % 0 1,605 0.23 % 2.96 166 0.03 % 0.89 411 0.09 % 1.29 2 0 % 0.05 0 0 % 0
national nati onal 2,190 0.10 % 1.93 31 0.07 % 0.78 10 0.13 % 2.53 768 0.11 % 1.42 468 0.08 % 2.50 495 0.11 % 1.56 418 0.14 % 9.74 0 0 % 0
free free 2,174 0.10 % 1.92 33 0.08 % 0.83 2 0.03 % 0.51 843 0.12 % 1.55 731 0.12 % 3.90 403 0.09 % 1.27 162 0.05 % 3.77 0 0 % 0

1.2.255 List of initial character-level 5-grams from residual lower-case word forms in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-word_parts-residual-lowercase_forms-initial-5grams-taxonomy-entire.tsv
Lower-case word form | Initial part of the word | Rest of the word | Total absolute frequency of lower-case word form | Percentage of all found lower-case word forms | Total relative frequency (per million occurrences) | Absolute frequency [SSJ.T.K.L] | Percentage [SSJ.T.K.L] | Relative frequency [SSJ.T.K.L] | Absolute frequency [SSJ.T.D] | Percentage [SSJ.T.D] | Relative frequency [SSJ.T.D] | Absolute frequency [SSJ.T.P.C] | Percentage [SSJ.T.P.C] | Relative frequency [SSJ.T.P.C] | Absolute frequency [SSJ.T.P.R] | Percentage [SSJ.T.P.R] | Relative frequency [SSJ.T.P.R] | Absolute frequency [SSJ.I] | Percentage [SSJ.I] | Relative frequency [SSJ.I] | Absolute frequency [SSJ.T.K.S] | Percentage [SSJ.T.K.S] | Relative frequency [SSJ.T.K.S] | Absolute frequency [SSJ.T.K.N] | Percentage [SSJ.T.K.N] | Relative frequency [SSJ.T.K.N]
dnevnik,si dnevn ik,si 6,371 0.36 % 5.61 0 0 % 0 0 0 % 0 6,059 1.03 % 11.16 17 0 % 0.09 295 0.08 % 0.93 0 0 % 0 0 0 % 0
world world 5,151 0.29 % 4.54 81 0.24 % 2.04 9 0.13 % 2.28 1,696 0.29 % 3.12 1,314 0.26 % 7.01 1,499 0.41 % 4.71 552 0.21 % 12.86 0 0 % 0
24ur,com 24ur, com 4,704 0.27 % 4.15 0 0 % 0 0 0 % 0 151 0.03 % 0.28 80 0.02 % 0.43 4,471 1.21 % 14.06 2 0 % 0.05 0 0 % 0
press press 4,379 0.25 % 3.86 21 0.06 % 0.53 0 0 % 0 263 0.04 % 0.48 3,187 0.63 % 17 246 0.07 % 0.77 662 0.25 % 15.42 0 0 % 0
siol,net siol, net 3,794 0.21 % 3.34 0 0 % 0 1 0.01 % 0.25 1,018 0.17 % 1.88 833 0.17 % 4.44 1,864 0.50 % 5.86 78 0.03 % 1.82 0 0 % 0
international inter national 3,611 0.20 % 3.18 8 0.02 % 0.20 7 0.10 % 1.77 1,231 0.21 % 2.27 883 0.17 % 4.71 969 0.26 % 3.05 513 0.19 % 11.95 0 0 % 0
stand stand 3,378 0.19 % 2.98 22 0.07 % 0.55 0 0 % 0 824 0.14 % 1.52 562 0.11 % 3 1,862 0.50 % 5.86 108 0.04 % 2.52 0 0 % 0
european europ ean 3,069 0.17 % 2.70 1 0 % 0.03 3 0.04 % 0.76 965 0.17 % 1.78 810 0.16 % 4.32 696 0.19 % 2.19 594 0.22 % 13.83 0 0 % 0
journal journ al 3,005 0.17 % 2.65 23 0.07 % 0.58 1 0.01 % 0.25 520 0.09 % 0.96 821 0.16 % 4.38 760 0.20 % 2.39 880 0.33 % 20.50 0 0 % 0
sport sport 2,432 0.14 % 2.14 2 0.01 % 0.05 0 0 % 0 1,096 0.19 % 2.02 572 0.11 % 3.05 638 0.17 % 2.01 124 0.05 % 2.89 0 0 % 0
della della 2,341 0.13 % 2.06 42 0.12 % 1.06 0 0 % 0 956 0.16 % 1.76 442 0.09 % 2.36 557 0.15 % 1.75 344 0.13
% 8.01 0 0 % 0
grand grand 2,310 0.13 % 2.04 18 0.05 % 0.45 2 0.03 % 0.51 1,148 0.20 % 2.12 282 0.06 % 1.50 798 0.22 % 2.51 62 0.02 % 1.44 0 0 % 0
social socia l 2,280 0.13 % 2.01 11 0.03 % 0.28 0 0 % 0 185 0.03 % 0.34 487 0.10 % 2.60 346 0.09 % 1.09 1,251 0.47 % 29.14 0 0 % 0
facto facto 2,279 0.13 % 2.01 17 0.05 % 0.43 15 0.22 % 3.79 918 0.16 % 1.69 398 0.08 % 2.12 827 0.22 % 2.60 104 0.04 % 2.42 0 0 % 0
university unive rsity 2,209 0.12 % 1.95 14 0.04 % 0.35 1 0.01 % 0.25 437 0.07 % 0.81 407 0.08 % 2.17 445 0.12 % 1.40 905 0.34 % 21.08 0 0 % 0
national natio nal 2,190 0.12 % 1.93 31 0.09 % 0.78 10 0.15 % 2.53 768 0.13 % 1.42 468 0.09 % 2.50 495 0.13 % 1.56 418 0.16 % 9.74 0 0 % 0
which which 2,159 0.12 % 1.90 15 0.04 % 0.38 1 0.01 % 0.25 667 0.11 % 1.23 785 0.16 % 4.19 116 0.03 % 0.36 575 0.22 % 13.39 0 0 % 0
si,mobil si,mo bil 1,846 0.10 % 1.63 0 0 % 0 6 0.09 % 1.52 879 0.15 % 1.62 621 0.12 % 3.31 340 0.09 % 1.07 0 0 % 0 0 0 % 0
education educa tion 1,819 0.10 % 1.60 3 0.01 % 0.08 0 0 % 0 116 0.02 % 0.21 778 0.15 % 4.15 158 0.04 % 0.50 764 0.29 % 17.79 0 0 % 0
their their 1,818 0.10 % 1.60 4 0.01 % 0.10 3 0.04 % 0.76 261 0.04 % 0.48 767 0.15 % 4.09 192 0.05 % 0.60 591 0.22 % 13.77 0 0 % 0
people peopl e 1,788 0.10 % 1.58 22 0.07 % 0.55 1 0.01 % 0.25 359 0.06 % 0.66 565 0.11 % 3.01 589 0.16 % 1.85 252 0.10 % 5.87 0 0 % 0
dance dance 1,721 0.10 % 1.52 6 0.02 % 0.15 0 0 % 0 948 0.16 % 1.75 352 0.07 % 1.88 387 0.10 % 1.22 28 0.01 % 0.65 0 0 % 0
business busin ess 1,705 0.10 % 1.50 6 0.02 % 0.15 1 0.01 % 0.25 607 0.10 % 1.12 504 0.10 % 2.69 331 0.09 % 1.04 256 0.10 % 5.96 0 0 % 0
alias alias 1,681 0.10 % 1.48 85 0.25 % 2.14 2 0.03 % 0.51 654 0.11 % 1.21 679 0.14 % 3.62 236 0.06 % 0.74 25 0.01 % 0.58 0 0 % 0
development devel opment 1,636 0.09 % 1.44 0 0 % 0 1 0.01 % 0.25 227 0.04 % 0.42 447 0.09 % 2.39 143 0.04 % 0.45 818 0.31 % 19.05 0 0 % 0
power power 1,617 0.09 % 1.43 17 0.05 % 0.43 1 0.01 % 0.25 597 0.10 % 1.10 401 0.08 % 2.14 478 0.13 % 1.50 123 0.05 % 2.86 0
0 % 0
research resea rch 1,588 0.09 % 1.40 6 0.02 % 0.15 1 0.01 % 0.25 299 0.05 % 0.55 495 0.10 % 2.64 275 0.07 % 0.86 512 0.19 % 11.93 0 0 % 0
night night 1,583 0.09 % 1.40 24 0.07 % 0.60 1 0.01 % 0.25 628 0.11 % 1.16 426 0.08 % 2.27 469 0.13 % 1.48 35 0.01 % 0.82 0 0 % 0
don’t don’t 1,555 0.09 % 1.37 46 0.14 % 1.16 2 0.03 % 0.51 460 0.08 % 0.85 478 0.10 % 2.55 548 0.15 % 1.72 21 0.01 % 0.49 0 0 % 0
american ameri can 1,539 0.09 % 1.36 15 0.04 % 0.38 2 0.03 % 0.51 444 0.08 % 0.82 423 0.08 % 2.26 332 0.09 % 1.04 323 0.12 % 7.52 0 0 % 0
school schoo l 1,531 0.09 % 1.35 12 0.04 % 0.30 2 0.03 % 0.51 425 0.07 % 0.78 328 0.07 % 1.75 320 0.09 % 1.01 444 0.17 % 10.34 0 0 % 0
https://twitter,com/sta_novice https ://twitter,com/sta_novice 1,503 0.09 % 1.32 0 0 % 0 0 0 % 0 0 0 % 0 0 0 % 0 1,503 0.41 % 4.73 0 0 % 0 0 0 % 0

1.2.256 List of final character-level 1-grams from residual lower-case word forms in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-word_parts-residual-lowercase_forms-final-1grams-taxonomy-entire.tsv
Lower-case word form | Rest of the word | Final part of the word | Total absolute frequency of lower-case word form | Percentage of all found lower-case word forms | Total relative frequency (per million occurrences) | Absolute frequency [SSJ.T.K.L] | Percentage [SSJ.T.K.L] | Relative frequency [SSJ.T.K.L] | Absolute frequency [SSJ.T.D] | Percentage [SSJ.T.D] | Relative frequency [SSJ.T.D] | Absolute frequency [SSJ.T.P.C] | Percentage [SSJ.T.P.C] | Relative frequency [SSJ.T.P.C] | Absolute frequency [SSJ.T.P.R] | Percentage [SSJ.T.P.R] | Relative frequency [SSJ.T.P.R] | Absolute frequency [SSJ.I] | Percentage [SSJ.I] | Relative frequency [SSJ.I] | Absolute frequency [SSJ.T.K.S] | Percentage [SSJ.T.K.S] | Relative frequency [SSJ.T.K.S] | Absolute frequency [SSJ.T.K.N] | Percentage [SSJ.T.K.N] | Relative frequency [SSJ.T.K.N]
the th e 143,431 3.90 % 126.41 1,518 2.09 % 38.22 165 1.56 % 41.71 40,736 3.10
% 75.06 41,796 4.14 % 223.01 35,895 4.65 % 112.90 23,321 4.71 % 543.17 0 0 % 0
de d e 119,443 3.25 % 105.26 3,685 5.07 % 92.78 175 1.66 % 44.24 47,661 3.62 % 87.82 18,231 1.80 % 97.27 37,427 4.85 % 117.72 12,264 2.48 % 285.64 0 0 % 0
of o f 88,222 2.40 % 77.75 779 1.07 % 19.61 120 1.14 % 30.33 23,377 1.78 % 43.07 24,881 2.46 % 132.76 19,565 2.54 % 61.54 19,500 3.94 % 454.18 0 0 % 0
and an d 50,052 1.36 % 44.11 612 0.84 % 15.41 38 0.36 % 9.61 10,082 0.77 % 18.58 14,953 1.48 % 79.78 9,499 1.23 % 29.88 14,868 3.00 % 346.29 0 0 % 0
la l a 47,471 1.29 % 41.84 1,450 2.00 % 36.51 53 0.50 % 13.40 19,464 1.48 % 35.86 9,233 0.91 % 49.26 12,499 1.62 % 39.31 4,772 0.96 % 111.15 0 0 % 0
i i 46,596 1.27 % 41.06 839 1.16 % 21.13 327 3.10 % 82.66 9,280 0.71 % 17.10 20,109 1.99 % 107.30 12,783 1.66 % 40.21 3,258 0.66 % 75.88 0 0 % 0
a a 32,433 0.88 % 28.58 571 0.79 % 14.38 220 2.09 % 55.61 8,682 0.66 % 16 11,459 1.14 % 61.14 5,850 0.76 % 18.40 5,651 1.14 % 131.62 0 0 % 0
to t o 23,864 0.65 % 21.03 429 0.59 % 10.80 26 0.25 % 6.57 5,851 0.45 % 10.78 8,178 0.81 % 43.64 5,143 0.67 % 16.18 4,237 0.86 % 98.68 0 0 % 0
in i n 22,418 0.61 % 19.76 277 0.38 % 6.97 39 0.37 % 9.86 5,726 0.43 % 10.55 7,166 0.71 % 38.24 4,199 0.54 % 13.21 5,011 1.01 % 116.71 0 0 % 0
di d i 20,809 0.57 % 18.34 268 0.37 % 6.75 28 0.27 % 7.08 9,812 0.75 % 18.08 3,363 0.33 % 17.94 5,856 0.76 % 18.42 1,481 0.30 % 34.49 1 33.33 % 103.02
van va n 20,097 0.55 % 17.71 349 0.48 % 8.79 21 0.20 % 5.31 9,983 0.76 % 18.39 1,630 0.16 % 8.70 7,452 0.97 % 23.44 662 0.13 % 15.42 0 0 % 0
sta st a 17,204 0.47 % 15.16 1 0 % 0.03 0 0 % 0 15,791 1.20 % 29.10 21 0 % 0.11 1,388 0.18 % 4.37 3 0 % 0.07 0 0 % 0
el e l 15,435 0.42 % 13.60 418 0.57 % 10.52 11 0.10 % 2.78 6,078 0.46 % 11.20 2,615 0.26 % 13.95 5,592 0.72 % 17.59 721 0.15 % 16.79 0 0 % 0
for fo r 14,317 0.39 % 12.62 126 0.17 % 3.17 18 0.17 % 4.55 3,428 0.26 % 6.32 4,714 0.47 % 25.15 3,030 0.39 % 9.53 3,001 0.61 % 69.90 0 0 % 0
on o n 12,619 0.34 % 11.12 178 0.24 % 4.48 8 0.08 %
2.02 3,704 0.28 % 6.82 3,664 0.36 % 19.55 2,757 0.36 % 8.67 2,308 0.47 % 53.76 0 0 % 0 pre pr e 12,246 0.33 % 10.79 45 0.06 % 1.13 34 0.32 % 8.59 6,173 0.47 % 11.37 2,724 0.27 % 14.53 2,881 0.37 % 9.06 389 0.08 % 9.06 0 0 % 0 is i s 11,863 0.32 % 10.45 262 0.36 % 6.60 3 0.03 % 0.76 2,493 0.19 % 4.59 3,726 0.37 % 19.88 3,514 0.46 % 11.05 1,865 0.38 % 43.44 0 0 % 0 der de r 11,513 0.31 % 10.15 276 0.38 % 6.95 29 0.28 % 7.33 4,068 0.31 % 7.50 1,653 0.16 % 8.82 3,174 0.41 % 9.98 2,313 0.47 % 53.87 0 0 % 0 von vo n 11,326 0.31 % 9.98 349 0.48 % 8.79 178 1.69 % 45 4,312 0.33 % 7.95 2,001 0.20 % 10.68 2,590 0.34 % 8.15 1,896 0.38 % 44.16 0 0 % 0 et e t 9,623 0.26 % 8.48 364 0.50 % 9.17 26 0.25 % 6.57 2,923 0.22 % 5.39 1,605 0.16 % 8.56 1,351 0.17 % 4.25 3,354 0.68 % 78.12 0 0 % 0 bin bi n 9,272 0.25 % 8.17 72 0.10 % 1.81 2 0.02 % 0.51 5,214 0.40 % 9.61 1,243 0.12 % 6.63 2,638 0.34 % 8.30 103 0.02 % 2.40 0 0 % 0 da d a 9,128 0.25 % 8.04 1,009 1.39 % 25.41 27 0.26 % 6.83 3,188 0.24 % 5.87 2,083 0.21 % 11.11 2,267 0.29 % 7.13 554 0.11 % 12.90 0 0 % 0 by b y 8,563 0.23 % 7.55 69 0.10 % 1.74 21 0.20 % 5.31 2,522 0.19 % 4.65 3,090 0.31 % 16.49 1,719 0.22 % 5.41 1,142 0.23 % 26.60 0 0 % 0 del de l 7,785 0.21 % 6.86 95 0.13 % 2.39 20 0.19 % 5.06 3,735 0.28 % 6.88 1,642 0.16 % 8.76 1,510 0.20 % 4.75 783 0.16 % 18.24 0 0 % 0 you yo u 7,687 0.21 % 6.77 444 0.61 % 11.18 12 0.11 % 3.03 2,163 0.16 % 3.99 2,650 0.26 % 14.14 2,149 0.28 % 6.76 269 0.05 % 6.27 0 0 % 0 an a n 7,544 0.20 % 6.65 115 0.16 % 2.90 11 0.10 % 2.78 3,059 0.23 % 5.64 1,701 0.17 % 9.08 1,261 0.16 % 3.97 1,397 0.28 % 32.54 0 0 % 0 du d u 7,024 0.19 % 6.19 295 0.41 % 7.43 10 0.10 % 2.53 3,031 0.23 % 5.58 1,043 0.10 % 5.57 1,300 0.17 % 4.09 1,345 0.27 % 31.33 0 0 % 0 be b e 6,845 0.19 % 6.03 137 0.19 % 3.45 11 0.10 % 2.78 2,025 0.15 % 3.73 2,409 0.24 % 12.85 1,265 0.16 % 3.98 998 0.20 % 23.24 0 0 % 0 des de s 6,751 0.18 % 5.95 388 0.53 % 9.77 21 0.20 % 5.31 2,218 0.17 % 4.09 982 0.10 % 5.24 1,095 0.14 % 3.44 2,047 0.41 
% 47.68 0 0 % 0
with wit h 6,647 0.18 % 5.86 110 0.15 % 2.77 4 0.04 % 1.01 1,466 0.11 % 2.70 2,355 0.23 % 12.57 1,405 0.18 % 4.42 1,307 0.26 % 30.44 0 0 % 0
dnevnik,si dnevnik,s i 6,371 0.17 % 5.61 0 0 % 0 0 0 % 0 6,059 0.46 % 11.16 17 0 % 0.09 295 0.04 % 0.93 0 0 % 0 0 0 % 0
all al l 5,868 0.16 % 5.17 107 0.15 % 2.69 5 0.05 % 1.26 1,983 0.15 % 3.65 1,773 0.18 % 9.46 1,505 0.20 % 4.73 495 0.10 % 11.53 0 0 % 0

1.2.257 List of final character-level 2-grams from residual lower-case word forms in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-word_parts-residual-lowercase_forms-final-2grams-taxonomy-entire.tsv
Lower-case word form | Rest of the word | Final part of the word | Total absolute frequency of lower-case word form | Percentage of all found lower-case word forms | Total relative frequency (per million occurrences) | Absolute frequency [SSJ.T.K.L] | Percentage [SSJ.T.K.L] | Relative frequency [SSJ.T.K.L] | Absolute frequency [SSJ.T.D] | Percentage [SSJ.T.D] | Relative frequency [SSJ.T.D] | Absolute frequency [SSJ.T.P.C] | Percentage [SSJ.T.P.C] | Relative frequency [SSJ.T.P.C] | Absolute frequency [SSJ.T.P.R] | Percentage [SSJ.T.P.R] | Relative frequency [SSJ.T.P.R] | Absolute frequency [SSJ.I] | Percentage [SSJ.I] | Relative frequency [SSJ.I] | Absolute frequency [SSJ.T.K.S] | Percentage [SSJ.T.K.S] | Relative frequency [SSJ.T.K.S] | Absolute frequency [SSJ.T.K.N] | Percentage [SSJ.T.K.N] | Relative frequency [SSJ.T.K.N]
the t he 143,431 4.03 % 126.41 1,518 2.16 % 38.22 165 1.66 % 41.71 40,736 3.15 % 75.06 41,796 4.37 % 223.01 35,895 4.80 % 112.90 23,321 4.86 % 543.17 0 0 % 0
de de 119,443 3.36 % 105.26 3,685 5.25 % 92.78 175 1.76 % 44.24 47,661 3.69 % 87.82 18,231 1.91 % 97.27 37,427 5.01 % 117.72 12,264 2.55 % 285.64 0 0 % 0
of of 88,222 2.48 % 77.75 779 1.11 % 19.61 120 1.21 % 30.33 23,377 1.81 % 43.07 24,881 2.60 % 132.76 19,565 2.62 % 61.54 19,500 4.06 % 454.18 0 0 % 0
and a nd 50,052 1.41 %
44.11 612 0.87 % 15.41 38 0.38 % 9.61 10,082 0.78 % 18.58 14,953 1.56 % 79.78 9,499 1.27 % 29.88 14,868 3.10 % 346.29 0 0 % 0 la la 47,471 1.33 % 41.84 1,450 2.06 % 36.51 53 0.53 % 13.40 19,464 1.51 % 35.86 9,233 0.97 % 49.26 12,499 1.67 % 39.31 4,772 0.99 % 111.15 0 0 % 0 to to 23,864 0.67 % 21.03 429 0.61 % 10.80 26 0.26 % 6.57 5,851 0.45 % 10.78 8,178 0.86 % 43.64 5,143 0.69 % 16.18 4,237 0.88 % 98.68 0 0 % 0 in in 22,418 0.63 % 19.76 277 0.39 % 6.97 39 0.39 % 9.86 5,726 0.44 % 10.55 7,166 0.75 % 38.24 4,199 0.56 % 13.21 5,011 1.04 % 116.71 0 0 % 0 di di 20,809 0.58 % 18.34 268 0.38 % 6.75 28 0.28 % 7.08 9,812 0.76 % 18.08 3,363 0.35 % 17.94 5,856 0.78 % 18.42 1,481 0.31 % 34.49 1 33.33 % 103.02 van v an 20,097 0.56 % 17.71 349 0.50 % 8.79 21 0.21 % 5.31 9,983 0.77 % 18.39 1,630 0.17 % 8.70 7,452 1.00 % 23.44 662 0.14 % 15.42 0 0 % 0 sta s ta 17,204 0.48 % 15.16 1 0 % 0.03 0 0 % 0 15,791 1.22 % 29.10 21 0 % 0.11 1,388 0.19 % 4.37 3 0 % 0.07 0 0 % 0 el el 15,435 0.43 % 13.60 418 0.59 % 10.52 11 0.11 % 2.78 6,078 0.47 % 11.20 2,615 0.27 % 13.95 5,592 0.75 % 17.59 721 0.15 % 16.79 0 0 % 0 for f or 14,317 0.40 % 12.62 126 0.18 % 3.17 18 0.18 % 4.55 3,428 0.27 % 6.32 4,714 0.49 % 25.15 3,030 0.41 % 9.53 3,001 0.62 % 69.90 0 0 % 0 on on 12,619 0.35 % 11.12 178 0.25 % 4.48 8 0.08 % 2.02 3,704 0.29 % 6.82 3,664 0.38 % 19.55 2,757 0.37 % 8.67 2,308 0.48 % 53.76 0 0 % 0 pre p re 12,246 0.34 % 10.79 45 0.06 % 1.13 34 0.34 % 8.59 6,173 0.48 % 11.37 2,724 0.28 % 14.53 2,881 0.39 % 9.06 389 0.08 % 9.06 0 0 % 0 is is 11,863 0.33 % 10.45 262 0.37 % 6.60 3 0.03 % 0.76 2,493 0.19 % 4.59 3,726 0.39 % 19.88 3,514 0.47 % 11.05 1,865 0.39 % 43.44 0 0 % 0 der d er 11,513 0.32 % 10.15 276 0.39 % 6.95 29 0.29 % 7.33 4,068 0.32 % 7.50 1,653 0.17 % 8.82 3,174 0.42 % 9.98 2,313 0.48 % 53.87 0 0 % 0 von v on 11,326 0.32 % 9.98 349 0.50 % 8.79 178 1.79 % 45 4,312 0.33 % 7.95 2,001 0.21 % 10.68 2,590 0.35 % 8.15 1,896 0.40 % 44.16 0 0 % 0 et et 9,623 0.27 % 8.48 364 0.52 % 9.17 26 0.26 % 6.57 
2,923 0.23 % 5.39 1,605 0.17 % 8.56 1,351 0.18 % 4.25 3,354 0.70 % 78.12 0 0 % 0
bin b in 9,272 0.26 % 8.17 72 0.10 % 1.81 2 0.02 % 0.51 5,214 0.40 % 9.61 1,243 0.13 % 6.63 2,638 0.35 % 8.30 103 0.02 % 2.40 0 0 % 0
da da 9,128 0.26 % 8.04 1,009 1.44 % 25.41 27 0.27 % 6.83 3,188 0.25 % 5.87 2,083 0.22 % 11.11 2,267 0.30 % 7.13 554 0.12 % 12.90 0 0 % 0
by by 8,563 0.24 % 7.55 69 0.10 % 1.74 21 0.21 % 5.31 2,522 0.20 % 4.65 3,090 0.32 % 16.49 1,719 0.23 % 5.41 1,142 0.24 % 26.60 0 0 % 0
del d el 7,785 0.22 % 6.86 95 0.14 % 2.39 20 0.20 % 5.06 3,735 0.29 % 6.88 1,642 0.17 % 8.76 1,510 0.20 % 4.75 783 0.16 % 18.24 0 0 % 0
you y ou 7,687 0.22 % 6.77 444 0.63 % 11.18 12 0.12 % 3.03 2,163 0.17 % 3.99 2,650 0.28 % 14.14 2,149 0.29 % 6.76 269 0.06 % 6.27 0 0 % 0
an an 7,544 0.21 % 6.65 115 0.16 % 2.90 11 0.11 % 2.78 3,059 0.24 % 5.64 1,701 0.18 % 9.08 1,261 0.17 % 3.97 1,397 0.29 % 32.54 0 0 % 0
du du 7,024 0.20 % 6.19 295 0.42 % 7.43 10 0.10 % 2.53 3,031 0.23 % 5.58 1,043 0.11 % 5.57 1,300 0.17 % 4.09 1,345 0.28 % 31.33 0 0 % 0
be be 6,845 0.19 % 6.03 137 0.20 % 3.45 11 0.11 % 2.78 2,025 0.16 % 3.73 2,409 0.25 % 12.85 1,265 0.17 % 3.98 998 0.21 % 23.24 0 0 % 0
des d es 6,751 0.19 % 5.95 388 0.55 % 9.77 21 0.21 % 5.31 2,218 0.17 % 4.09 982 0.10 % 5.24 1,095 0.15 % 3.44 2,047 0.43 % 47.68 0 0 % 0
with wi th 6,647 0.19 % 5.86 110 0.16 % 2.77 4 0.04 % 1.01 1,466 0.11 % 2.70 2,355 0.25 % 12.57 1,405 0.19 % 4.42 1,307 0.27 % 30.44 0 0 % 0
dnevnik,si dnevnik, si 6,371 0.18 % 5.61 0 0 % 0 0 0 % 0 6,059 0.47 % 11.16 17 0 % 0.09 295 0.04 % 0.93 0 0 % 0 0 0 % 0
all a ll 5,868 0.17 % 5.17 107 0.15 % 2.69 5 0.05 % 1.26 1,983 0.15 % 3.65 1,773 0.18 % 9.46 1,505 0.20 % 4.73 495 0.10 % 11.53 0 0 % 0
re re 5,794 0.16 % 5.11 142 0.20 % 3.58 29 0.29 % 7.33 2,954 0.23 % 5.44 1,102 0.12 % 5.88 1,286 0.17 % 4.04 281 0.06 % 6.54 0 0 % 0
at at 5,740 0.16 % 5.06 69 0.10 % 1.74 68 0.68 % 17.19 1,685 0.13 % 3.10 1,834 0.19 % 9.79 1,237 0.17 % 3.89 847 0.18 % 19.73 0 0 % 0

1.2.258 List of final character-level 3-grams from residual lower-case word forms in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-word_parts-residual-lowercase_forms-final-3grams-taxonomy-entire.tsv
Lower-case word form | Rest of the word | Final part of the word | Total absolute frequency of lower-case word form | Percentage of all found lower-case word forms | Total relative frequency (per million occurrences) | Absolute frequency [SSJ.T.K.L] | Percentage [SSJ.T.K.L] | Relative frequency [SSJ.T.K.L] | Absolute frequency [SSJ.T.D] | Percentage [SSJ.T.D] | Relative frequency [SSJ.T.D] | Absolute frequency [SSJ.T.P.C] | Percentage [SSJ.T.P.C] | Relative frequency [SSJ.T.P.C] | Absolute frequency [SSJ.T.P.R] | Percentage [SSJ.T.P.R] | Relative frequency [SSJ.T.P.R] | Absolute frequency [SSJ.I] | Percentage [SSJ.I] | Relative frequency [SSJ.I] | Absolute frequency [SSJ.T.K.S] | Percentage [SSJ.T.K.S] | Relative frequency [SSJ.T.K.S] | Absolute frequency [SSJ.T.K.N] | Percentage [SSJ.T.K.N] | Relative frequency [SSJ.T.K.N]
the the 143,431 5.20 % 126.41 1,518 2.77 % 38.22 165 1.90 % 41.71 40,736 4.30 % 75.06 41,796 5.39 % 223.01 35,895 6.17 % 112.90 23,321 5.95 % 543.17 0 0 % 0
and and 50,052 1.81 % 44.11 612 1.12 % 15.41 38 0.44 % 9.61 10,082 1.06 % 18.58 14,953 1.93 % 79.78 9,499 1.63 % 29.88 14,868 3.79 % 346.29 0 0 % 0
van van 20,097 0.73 % 17.71 349 0.64 % 8.79 21 0.24 % 5.31 9,983 1.05 % 18.39 1,630 0.21 % 8.70 7,452 1.28 % 23.44 662 0.17 % 15.42 0 0 % 0
sta sta 17,204 0.62 % 15.16 1 0 % 0.03 0 0 % 0 15,791 1.67 % 29.10 21 0 % 0.11 1,388 0.24 % 4.37 3 0 % 0.07 0 0 % 0
for for 14,317 0.52 % 12.62 126 0.23 % 3.17 18 0.21 % 4.55 3,428 0.36 % 6.32 4,714 0.61 % 25.15 3,030 0.52 % 9.53 3,001 0.77 % 69.90 0 0 % 0
pre pre 12,246 0.44 % 10.79 45 0.08 % 1.13 34 0.39 % 8.59 6,173 0.65 % 11.37 2,724 0.35 % 14.53 2,881 0.49 % 9.06 389 0.10 % 9.06 0 0 % 0
der der 11,513 0.42 % 10.15 276 0.50 % 6.95 29 0.33 % 7.33 4,068 0.43 % 7.50 1,653 0.21 %
8.82 3,174 0.55 % 9.98 2,313 0.59 % 53.87 0 0 % 0 von von 11,326 0.41 % 9.98 349 0.64 % 8.79 178 2.05 % 45 4,312 0.46 % 7.95 2,001 0.26 % 10.68 2,590 0.45 % 8.15 1,896 0.48 % 44.16 0 0 % 0 bin bin 9,272 0.34 % 8.17 72 0.13 % 1.81 2 0.02 % 0.51 5,214 0.55 % 9.61 1,243 0.16 % 6.63 2,638 0.45 % 8.30 103 0.03 % 2.40 0 0 % 0 del del 7,785 0.28 % 6.86 95 0.17 % 2.39 20 0.23 % 5.06 3,735 0.39 % 6.88 1,642 0.21 % 8.76 1,510 0.26 % 4.75 783 0.20 % 18.24 0 0 % 0 you you 7,687 0.28 % 6.77 444 0.81 % 11.18 12 0.14 % 3.03 2,163 0.23 % 3.99 2,650 0.34 % 14.14 2,149 0.37 % 6.76 269 0.07 % 6.27 0 0 % 0 des des 6,751 0.24 % 5.95 388 0.71 % 9.77 21 0.24 % 5.31 2,218 0.23 % 4.09 982 0.13 % 5.24 1,095 0.19 % 3.44 2,047 0.52 % 47.68 0 0 % 0 with w ith 6,647 0.24 % 5.86 110 0.20 % 2.77 4 0.05 % 1.01 1,466 0.15 % 2.70 2,355 0.30 % 12.57 1,405 0.24 % 4.42 1,307 0.33 % 30.44 0 0 % 0 dnevnik,si dnevnik ,si 6,371 0.23 % 5.61 0 0 % 0 0 0 % 0 6,059 0.64 % 11.16 17 0 % 0.09 295 0.05 % 0.93 0 0 % 0 0 0 % 0 all all 5,868 0.21 % 5.17 107 0.20 % 2.69 5 0.06 % 1.26 1,983 0.21 % 3.65 1,773 0.23 % 9.46 1,505 0.26 % 4.73 495 0.13 % 11.53 0 0 % 0 that t hat 5,449 0.20 % 4.80 110 0.20 % 2.77 5 0.06 % 1.26 1,273 0.13 % 2.35 2,290 0.29 % 12.22 678 0.12 % 2.13 1,093 0.28 % 25.46 0 0 % 0 from f rom 5,386 0.20 % 4.75 80 0.15 % 2.01 2 0.02 % 0.51 1,311 0.14 % 2.42 1,730 0.22 % 9.23 1,185 0.20 % 3.73 1,078 0.28 % 25.11 0 0 % 0 und und 5,288 0.19 % 4.66 154 0.28 % 3.88 18 0.21 % 4.55 1,244 0.13 % 2.29 1,418 0.18 % 7.57 556 0.10 % 1.75 1,898 0.48 % 44.21 0 0 % 0 world wo rld 5,151 0.19 % 4.54 81 0.15 % 2.04 9 0.10 % 2.28 1,696 0.18 % 3.12 1,314 0.17 % 7.01 1,499 0.26 % 4.71 552 0.14 % 12.86 0 0 % 0 die die 4,901 0.18 % 4.32 147 0.27 % 3.70 20 0.23 % 5.06 1,433 0.15 % 2.64 1,322 0.17 % 7.05 753 0.13 % 2.37 1,226 0.31 % 28.55 0 0 % 0 are are 4,844 0.17 % 4.27 112 0.20 % 2.82 1 0.01 % 0.25 948 0.10 % 1.75 1,942 0.25 % 10.36 919 0.16 % 2.89 922 0.23 % 21.47 0 0 % 0 24ur,com 24ur, com 4,704 0.17 % 4.15 0 0 % 0 0 0 % 0 
151 0.02 % 0.28 80 0.01 % 0.43 4,471 0.77 % 14.06 2 0 % 0.05 0 0 % 0
one one 4,594 0.17 % 4.05 52 0.10 % 1.31 4 0.05 % 1.01 1,471 0.15 % 2.71 1,452 0.19 % 7.75 1,189 0.20 % 3.74 426 0.11 % 9.92 0 0 % 0
press pr ess 4,379 0.16 % 3.86 21 0.04 % 0.53 0 0 % 0 263 0.03 % 0.48 3,187 0.41 % 17 246 0.04 % 0.77 662 0.17 % 15.42 0 0 % 0
les les 3,871 0.14 % 3.41 175 0.32 % 4.41 4 0.05 % 1.01 1,540 0.16 % 2.84 456 0.06 % 2.43 612 0.10 % 1.92 1,084 0.28 % 25.25 0 0 % 0
per per 3,805 0.14 % 3.35 79 0.14 % 1.99 4 0.05 % 1.01 2,321 0.24 % 4.28 662 0.09 % 3.53 327 0.06 % 1.03 412 0.10 % 9.60 0 0 % 0
siol,net siol, net 3,794 0.14 % 3.34 0 0 % 0 1 0.01 % 0.25 1,018 0.11 % 1.88 833 0.11 % 4.44 1,864 0.32 % 5.86 78 0.02 % 1.82 0 0 % 0
international internatio nal 3,611 0.13 % 3.18 8 0.01 % 0.20 7 0.08 % 1.77 1,231 0.13 % 2.27 883 0.11 % 4.71 969 0.17 % 3.05 513 0.13 % 11.95 0 0 % 0
this t his 3,532 0.13 % 3.11 85 0.15 % 2.14 2 0.02 % 0.51 615 0.07 % 1.13 1,499 0.19 % 8 774 0.13 % 2.43 557 0.14 % 12.97 0 0 % 0
not not 3,492 0.13 % 3.08 51 0.09 % 1.28 5 0.06 % 1.26 963 0.10 % 1.77 1,430 0.18 % 7.63 525 0.09 % 1.65 518 0.13 % 12.06 0 0 % 0
stand st and 3,378 0.12 % 2.98 22 0.04 % 0.55 0 0 % 0 824 0.09 % 1.52 562 0.07 % 3 1,862 0.32 % 5.86 108 0.03 % 2.52 0 0 % 0
was was 3,284 0.12 % 2.89 67 0.12 % 1.69 1 0.01 % 0.25 1,034 0.11 % 1.91 1,228 0.16 % 6.55 479 0.08 % 1.51 475 0.12 % 11.06 0 0 % 0

1.2.259 List of final character-level 4-grams from residual lower-case word forms in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-word_parts-residual-lowercase_forms-final-4grams-taxonomy-entire.tsv
Lower-case word form | Rest of the word | Final part of the word | Total absolute frequency of lower-case word form | Percentage of all found lower-case word forms | Total relative frequency (per million occurrences) | Absolute frequency [SSJ.T.K.L] | Percentage [SSJ.T.K.L] | Relative frequency [SSJ.T.K.L] | Absolute frequency [SSJ.T.D] | Percentage [SSJ.T.D] | Relative frequency [SSJ.T.D] | Absolute frequency [SSJ.T.P.C] | Percentage [SSJ.T.P.C] | Relative frequency [SSJ.T.P.C] | Absolute frequency [SSJ.T.P.R] | Percentage [SSJ.T.P.R] | Relative frequency [SSJ.T.P.R] | Absolute frequency [SSJ.I] | Percentage [SSJ.I] | Relative frequency [SSJ.I] | Absolute frequency [SSJ.T.K.S] | Percentage [SSJ.T.K.S] | Relative frequency [SSJ.T.K.S] | Absolute frequency [SSJ.T.K.N] | Percentage [SSJ.T.K.N] | Relative frequency [SSJ.T.K.N]
with with 6,647 0.31 % 5.86 110 0.26 % 2.77 4 0.05 % 1.01 1,466 0.21 % 2.70 2,355 0.38 % 12.57 1,405 0.31 % 4.42 1,307 0.42 % 30.44 0 0 % 0
dnevnik,si dnevni k,si 6,371 0.30 % 5.61 0 0 % 0 0 0 % 0 6,059 0.85 % 11.16 17 0 % 0.09 295 0.07 % 0.93 0 0 % 0 0 0 % 0
that that 5,449 0.26 % 4.80 110 0.26 % 2.77 5 0.07 % 1.26 1,273 0.18 % 2.35 2,290 0.37 % 12.22 678 0.15 % 2.13 1,093 0.35 % 25.46 0 0 % 0
from from 5,386 0.25 % 4.75 80 0.19 % 2.01 2 0.03 % 0.51 1,311 0.18 % 2.42 1,730 0.28 % 9.23 1,185 0.26 % 3.73 1,078 0.35 % 25.11 0 0 % 0
world w orld 5,151 0.24 % 4.54 81 0.19 % 2.04 9 0.12 % 2.28 1,696 0.24 % 3.12 1,314 0.21 % 7.01 1,499 0.33 % 4.71 552 0.18 % 12.86 0 0 % 0
24ur,com 24ur ,com 4,704 0.22 % 4.15 0 0 % 0 0 0 % 0 151 0.02 % 0.28 80 0.01 % 0.43 4,471 0.99 % 14.06 2 0 % 0.05 0 0 % 0
press p ress 4,379 0.20 % 3.86 21 0.05 % 0.53 0 0 % 0 263 0.04 % 0.48 3,187 0.52 % 17 246 0.06 % 0.77 662 0.21 % 15.42 0 0 % 0
siol,net siol ,net 3,794 0.18 % 3.34 0 0 % 0 1 0.01 % 0.25 1,018 0.14 % 1.88 833 0.14 % 4.44 1,864 0.41 % 5.86 78 0.03 % 1.82 0 0 % 0
international internati onal 3,611 0.17 % 3.18 8 0.02 % 0.20 7 0.09 % 1.77 1,231 0.17 % 2.27 883 0.14 % 4.71 969 0.21 % 3.05 513 0.17 % 11.95 0 0 % 0
this this 3,532 0.17 % 3.11 85 0.20 % 2.14 2 0.03 % 0.51 615 0.09 % 1.13 1,499 0.24 % 8 774 0.17 % 2.43 557 0.18 % 12.97 0 0 % 0
stand s tand 3,378 0.16 % 2.98 22 0.05 % 0.55 0 0 % 0 824 0.12 % 1.52 562 0.09 % 3 1,862 0.41 % 5.86 108 0.04 % 2.52 0 0 % 0
time time 3,126 0.15 % 2.75 42 0.10 % 1.06 4
0.05 % 1.01 926 0.13 % 1.71 1,167 0.19 % 6.23 689 0.15 % 2.17 298 0.10 % 6.94 0 0 % 0 blue blue 3,110 0.14 % 2.74 15 0.04 % 0.38 3 0.04 % 0.76 1,301 0.18 % 2.40 262 0.04 % 1.40 1,500 0.33 % 4.72 29 0.01 % 0.68 0 0 % 0 european euro pean 3,069 0.14 % 2.70 1 0 % 0.03 3 0.04 % 0.76 965 0.14 % 1.78 810 0.13 % 4.32 696 0.15 % 2.19 594 0.19 % 13.83 0 0 % 0 journal jou rnal 3,005 0.14 % 2.65 23 0.05 % 0.58 1 0.01 % 0.25 520 0.07 % 0.96 821 0.13 % 4.38 760 0.17 % 2.39 880 0.29 % 20.50 0 0 % 0 love love 2,899 0.14 % 2.55 59 0.14 % 1.49 0 0 % 0 987 0.14 % 1.82 862 0.14 % 4.60 948 0.21 % 2.98 43 0.01 % 1 0 0 % 0 have have 2,868 0.13 % 2.53 95 0.22 % 2.39 4 0.05 % 1.01 720 0.10 % 1.33 1,104 0.18 % 5.89 545 0.12 % 1.71 400 0.13 % 9.32 0 0 % 0 life life 2,822 0.13 % 2.49 39 0.09 % 0.98 2 0.03 % 0.51 763 0.11 % 1.41 912 0.15 % 4.87 721 0.16 % 2.27 385 0.12 % 8.97 0 0 % 0 your your 2,496 0.12 % 2.20 95 0.22 % 2.39 0 0 % 0 621 0.09 % 1.14 993 0.16 % 5.30 621 0.14 % 1.95 166 0.05 % 3.87 0 0 % 0 sport s port 2,432 0.11 % 2.14 2 0.01 % 0.05 0 0 % 0 1,096 0.15 % 2.02 572 0.09 % 3.05 638 0.14 % 2.01 124 0.04 % 2.89 0 0 % 0 hard hard 2,430 0.11 % 2.14 26 0.06 % 0.65 2 0.03 % 0.51 855 0.12 % 1.58 1,031 0.17 % 5.50 463 0.10 % 1.46 53 0.02 % 1.23 0 0 % 0 best best 2,391 0.11 % 2.11 22 0.05 % 0.55 0 0 % 0 854 0.12 % 1.57 838 0.14 % 4.47 595 0.13 % 1.87 82 0.03 % 1.91 0 0 % 0 della d ella 2,341 0.11 % 2.06 42 0.10 % 1.06 0 0 % 0 956 0.13 % 1.76 442 0.07 % 2.36 557 0.12 % 1.75 344 0.11 % 8.01 0 0 % 0 grand g rand 2,310 0.11 % 2.04 18 0.04 % 0.45 2 0.03 % 0.51 1,148 0.16 % 2.12 282 0.05 % 1.50 798 0.18 % 2.51 62 0.02 % 1.44 0 0 % 0 miss miss 2,297 0.11 % 2.02 108 0.25 % 2.72 1 0.01 % 0.25 721 0.10 % 1.33 1,154 0.19 % 6.16 303 0.07 % 0.95 10 0 % 0.23 0 0 % 0 social so cial 2,280 0.11 % 2.01 11 0.03 % 0.28 0 0 % 0 185 0.03 % 0.34 487 0.08 % 2.60 346 0.08 % 1.09 1,251 0.41 % 29.14 0 0 % 0 facto f acto 2,279 0.11 % 2.01 17 0.04 % 0.43 15 0.20 % 3.79 918 0.13 % 1.69 398 0.06 % 2.12 827 0.18 % 2.60 
104 0.03 % 2.42 0 0 % 0
will will 2,215 0.10 % 1.95 46 0.11 % 1.16 1 0.01 % 0.25 907 0.13 % 1.67 596 0.10 % 3.18 425 0.09 % 1.34 240 0.08 % 5.59 0 0 % 0
university univer sity 2,209 0.10 % 1.95 14 0.03 % 0.35 1 0.01 % 0.25 437 0.06 % 0.81 407 0.07 % 2.17 445 0.10 % 1.40 905 0.29 % 21.08 0 0 % 0
ford ford 2,197 0.10 % 1.94 13 0.03 % 0.33 0 0 % 0 1,605 0.23 % 2.96 166 0.03 % 0.89 411 0.09 % 1.29 2 0 % 0.05 0 0 % 0
national nati onal 2,190 0.10 % 1.93 31 0.07 % 0.78 10 0.13 % 2.53 768 0.11 % 1.42 468 0.08 % 2.50 495 0.11 % 1.56 418 0.14 % 9.74 0 0 % 0
free free 2,174 0.10 % 1.92 33 0.08 % 0.83 2 0.03 % 0.51 843 0.12 % 1.55 731 0.12 % 3.90 403 0.09 % 1.27 162 0.05 % 3.77 0 0 % 0

1.2.260 List of final character-level 5-grams from residual lower-case word forms in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-word_parts-residual-lowercase_forms-final-5grams-taxonomy-entire.tsv
Lower-case word form | Rest of the word | Final part of the word | Total absolute frequency of lower-case word form | Percentage of all found lower-case word forms | Total relative frequency (per million occurrences) | Absolute frequency [SSJ.T.K.L] | Percentage [SSJ.T.K.L] | Relative frequency [SSJ.T.K.L] | Absolute frequency [SSJ.T.D] | Percentage [SSJ.T.D] | Relative frequency [SSJ.T.D] | Absolute frequency [SSJ.T.P.C] | Percentage [SSJ.T.P.C] | Relative frequency [SSJ.T.P.C] | Absolute frequency [SSJ.T.P.R] | Percentage [SSJ.T.P.R] | Relative frequency [SSJ.T.P.R] | Absolute frequency [SSJ.I] | Percentage [SSJ.I] | Relative frequency [SSJ.I] | Absolute frequency [SSJ.T.K.S] | Percentage [SSJ.T.K.S] | Relative frequency [SSJ.T.K.S] | Absolute frequency [SSJ.T.K.N] | Percentage [SSJ.T.K.N] | Relative frequency [SSJ.T.K.N]
dnevnik,si dnevn ik,si 6,371 0.36 % 5.61 0 0 % 0 0 0 % 0 6,059 1.03 % 11.16 17 0 % 0.09 295 0.08 % 0.93 0 0 % 0 0 0 % 0
world world 5,151 0.29 % 4.54 81 0.24 % 2.04 9 0.13 % 2.28 1,696 0.29 % 3.12 1,314 0.26 % 7.01
1,499 0.41 % 4.71 552 0.21 % 12.86 0 0 % 0 24ur,com 24u r,com 4,704 0.27 % 4.15 0 0 % 0 0 0 % 0 151 0.03 % 0.28 80 0.02 % 0.43 4,471 1.21 % 14.06 2 0 % 0.05 0 0 % 0 press press 4,379 0.25 % 3.86 21 0.06 % 0.53 0 0 % 0 263 0.04 % 0.48 3,187 0.63 % 17 246 0.07 % 0.77 662 0.25 % 15.42 0 0 % 0 siol,net sio l,net 3,794 0.21 % 3.34 0 0 % 0 1 0.01 % 0.25 1,018 0.17 % 1.88 833 0.17 % 4.44 1,864 0.50 % 5.86 78 0.03 % 1.82 0 0 % 0 international internat ional 3,611 0.20 % 3.18 8 0.02 % 0.20 7 0.10 % 1.77 1,231 0.21 % 2.27 883 0.17 % 4.71 969 0.26 % 3.05 513 0.19 % 11.95 0 0 % 0 stand stand 3,378 0.19 % 2.98 22 0.07 % 0.55 0 0 % 0 824 0.14 % 1.52 562 0.11 % 3 1,862 0.50 % 5.86 108 0.04 % 2.52 0 0 % 0 european eur opean 3,069 0.17 % 2.70 1 0 % 0.03 3 0.04 % 0.76 965 0.17 % 1.78 810 0.16 % 4.32 696 0.19 % 2.19 594 0.22 % 13.83 0 0 % 0 journal jo urnal 3,005 0.17 % 2.65 23 0.07 % 0.58 1 0.01 % 0.25 520 0.09 % 0.96 821 0.16 % 4.38 760 0.20 % 2.39 880 0.33 % 20.50 0 0 % 0 sport sport 2,432 0.14 % 2.14 2 0.01 % 0.05 0 0 % 0 1,096 0.19 % 2.02 572 0.11 % 3.05 638 0.17 % 2.01 124 0.05 % 2.89 0 0 % 0 della della 2,341 0.13 % 2.06 42 0.12 % 1.06 0 0 % 0 956 0.16 % 1.76 442 0.09 % 2.36 557 0.15 % 1.75 344 0.13 % 8.01 0 0 % 0 grand grand 2,310 0.13 % 2.04 18 0.05 % 0.45 2 0.03 % 0.51 1,148 0.20 % 2.12 282 0.06 % 1.50 798 0.22 % 2.51 62 0.02 % 1.44 0 0 % 0 social s ocial 2,280 0.13 % 2.01 11 0.03 % 0.28 0 0 % 0 185 0.03 % 0.34 487 0.10 % 2.60 346 0.09 % 1.09 1,251 0.47 % 29.14 0 0 % 0 facto facto 2,279 0.13 % 2.01 17 0.05 % 0.43 15 0.22 % 3.79 918 0.16 % 1.69 398 0.08 % 2.12 827 0.22 % 2.60 104 0.04 % 2.42 0 0 % 0 university unive rsity 2,209 0.12 % 1.95 14 0.04 % 0.35 1 0.01 % 0.25 437 0.07 % 0.81 407 0.08 % 2.17 445 0.12 % 1.40 905 0.34 % 21.08 0 0 % 0 national nat ional 2,190 0.12 % 1.93 31 0.09 % 0.78 10 0.15 % 2.53 768 0.13 % 1.42 468 0.09 % 2.50 495 0.13 % 1.56 418 0.16 % 9.74 0 0 % 0 which which 2,159 0.12 % 1.90 15 0.04 % 0.38 1 0.01 % 0.25 667 0.11 % 1.23 785 0.16 % 4.19 116 0.03 
% 0.36 575 0.22 % 13.39 0 0 % 0 si,mobil si, mobil 1,846 0.10 % 1.63 0 0 % 0 6 0.09 % 1.52 879 0.15 % 1.62 621 0.12 % 3.31 340 0.09 % 1.07 0 0 % 0 0 0 % 0 education educ ation 1,819 0.10 % 1.60 3 0.01 % 0.08 0 0 % 0 116 0.02 % 0.21 778 0.15 % 4.15 158 0.04 % 0.50 764 0.29 % 17.79 0 0 % 0 their their 1,818 0.10 % 1.60 4 0.01 % 0.10 3 0.04 % 0.76 261 0.04 % 0.48 767 0.15 % 4.09 192 0.05 % 0.60 591 0.22 % 13.77 0 0 % 0 people p eople 1,788 0.10 % 1.58 22 0.07 % 0.55 1 0.01 % 0.25 359 0.06 % 0.66 565 0.11 % 3.01 589 0.16 % 1.85 252 0.10 % 5.87 0 0 % 0 dance dance 1,721 0.10 % 1.52 6 0.02 % 0.15 0 0 % 0 948 0.16 % 1.75 352 0.07 % 1.88 387 0.10 % 1.22 28 0.01 % 0.65 0 0 % 0 business bus iness 1,705 0.10 % 1.50 6 0.02 % 0.15 1 0.01 % 0.25 607 0.10 % 1.12 504 0.10 % 2.69 331 0.09 % 1.04 256 0.10 % 5.96 0 0 % 0 alias alias 1,681 0.10 % 1.48 85 0.25 % 2.14 2 0.03 % 0.51 654 0.11 % 1.21 679 0.14 % 3.62 236 0.06 % 0.74 25 0.01 % 0.58 0 0 % 0 development develo pment 1,636 0.09 % 1.44 0 0 % 0 1 0.01 % 0.25 227 0.04 % 0.42 447 0.09 % 2.39 143 0.04 % 0.45 818 0.31 % 19.05 0 0 % 0 power power 1,617 0.09 % 1.43 17 0.05 % 0.43 1 0.01 % 0.25 597 0.10 % 1.10 401 0.08 % 2.14 478 0.13 % 1.50 123 0.05 % 2.86 0 0 % 0 research res earch 1,588 0.09 % 1.40 6 0.02 % 0.15 1 0.01 % 0.25 299 0.05 % 0.55 495 0.10 % 2.64 275 0.07 % 0.86 512 0.19 % 11.93 0 0 % 0 night night 1,583 0.09 % 1.40 24 0.07 % 0.60 1 0.01 % 0.25 628 0.11 % 1.16 426 0.08 % 2.27 469 0.13 % 1.48 35 0.01 % 0.82 0 0 % 0 don’t don’t 1,555 0.09 % 1.37 46 0.14 % 1.16 2 0.03 % 0.51 460 0.08 % 0.85 478 0.10 % 2.55 548 0.15 % 1.72 21 0.01 % 0.49 0 0 % 0 american ame rican 1,539 0.09 % 1.36 15 0.04 % 0.38 2 0.03 % 0.51 444 0.08 % 0.82 423 0.08 % 2.26 332 0.09 % 1.04 323 0.12 % 7.52 0 0 % 0 school s chool 1,531 0.09 % 1.35 12 0.04 % 0.30 2 0.03 % 0.51 425 0.07 % 0.78 328 0.07 % 1.75 320 0.09 % 1.01 444 0.17 % 10.34 0 0 % 0 https://twitter,com/sta_novice https://twitter,com/sta_n ovice 1,503 0.09 % 1.32 0 0 % 0 0 0 % 0 0 0 % 0 0 0 % 0 
1,503 0.41 % 4.73 0 0 % 0 0 0 % 0

1.3. Frequency lists of words from the Gigafida 2.0 corpus

The frequency lists of words from the Gigafida 2.0 corpus are divided into two groups. The first group contains lower-case word forms, lemmas, or morphosyntactic tags (with their respective frequencies and percentages). The second group contains consonant-vowel structures, i.e. conversions of words obtained by character substitution (e.g. C for consonants and V for vowels, so that the word “tiger” becomes “CVCVC”).

The lists from the first group contain the unit (lower-case form, lemma, or morphosyntactic tag according to the MTE-6 annotation scheme; some lists also contain the part-of-speech and the lower-case lemma), its total absolute frequency ($f_a$), i.e. the sum of all occurrences of the unit in the corpus, and its percentage ($p$), i.e. its share of the total frequency ($N$) of all units in the corpus:

$$p = \frac{f_a}{N} \times 100$$

The total relative frequency ($f_r$) indicates how often the unit occurs per 1,000,000 units. The formula takes into account the total absolute frequency of the unit in the corpus ($f_a$) and the total frequency of all units in the corpus ($N$):

$$f_r = \frac{f_a}{N} \times 1{,}000{,}000$$

The lists also contain the absolute frequencies ($f_{aT}$) of the unit in different taxonomy branches of the corpus, i.e. text types, as well as the percentages of the unit ($p_T$) and its relative frequencies ($f_{rT}$) in different taxonomy branches. $N_T$ represents the total frequency of all units in the subcorpus:

$$p_T = \frac{f_{aT}}{N_T} \times 100 \qquad f_{rT} = \frac{f_{aT}}{N_T} \times 1{,}000{,}000$$

The lists containing morphosyntactic tags also give the individual elements of each tag in separate columns at the end of each line, e.g. “Somei” → “S o m e i”, meaning “noun”, “common”, “masculine gender”, “singular”, “nominative case”. This allows the user to filter the relevant lines in data analysis software in order to sum and compare the relevant frequencies.

The lists in the second group, which cover consonant-vowel structures, contain the unit’s converted form (e.g. “CVCVC”) and the total absolute frequency of all units with the same consonant-vowel structure in the corpus. A separate column contains the actual forms (and their absolute frequencies) that have been annotated with a specific part-of-speech in the corpus and correspond to the consonant-vowel structure in question. The forms are listed in the following format: “form_1~part_of_speech_1~frequency | form_2~part_of_speech_2~frequency | ...”. Another column gives the number of different forms that correspond to the consonant-vowel structure in question.

The consonant-vowel structures in the lists are either robust (e.g. “tiger” → “CVCVC”) or fine-grained (e.g. “tiger” → “KVGVZ”), depending on the symbols used in character substitution. The substitution algorithm uses the following categorization:

C - consonants (in fine-grained consonant-vowel structures, these are instead divided into Z - sonorants, G - voiced obstruents, and K - voiceless obstruents)
V - vowels
X - foreign consonants
Y - foreign vowels
S - symbols
P - punctuation
N - numbers
F - non-Latin-script characters
! - other

A more detailed list of the substitution rules for individual characters is available under the relevant CLARIN.SI repository entry in the file titled “GF2.0_character_categorization.tsv”.

The tables are presented in the following order:
• 1.3.1. → Lemmas with part-of-speech tags and text-type distribution
• 1.3.2. → Lower-case word forms with lemmas, part-of-speech tags, and text-type distribution
• 1.3.3. → Morphosyntactic tags with text-type distribution
• 1.3.4.–1.3.7. → Lists of consonant-vowel structures
• 1.3.8.–1.3.31. → Lists of lemmas and lower-case word forms for individual part-of-speech categories (nouns, adjectives, etc.)

1.3.1 List of lemmas in the Gigafida 2.0 corpus with part-of-speech categories and text-type distribution
File at CLARIN.SI: GF2.0-words-all-lemmas-parts_of_speech-taxonomy-entire.tsv
Lemma | Lemma (lower-case) | Part-of-speech category | Total absolute frequency of lemma | Percentage of all found lemmas | Total relative frequency (per million occurrences) | Absolute frequency [SSJ.T.K.N] | Percentage [SSJ.T.K.N] | Relative frequency [SSJ.T.K.N] | Absolute frequency [SSJ.T.K.L] | Percentage [SSJ.T.K.L] | Relative frequency [SSJ.T.K.L] | Absolute frequency [SSJ.T.D] | Percentage [SSJ.T.D] | Relative frequency [SSJ.T.D] | Absolute frequency [SSJ.T.P.C] | Percentage [SSJ.T.P.C] | Relative frequency [SSJ.T.P.C] | Absolute frequency [SSJ.T.P.R] | Percentage [SSJ.T.P.R] | Relative frequency [SSJ.T.P.R] | Absolute frequency [SSJ.I] | Percentage [SSJ.I] | Relative frequency [SSJ.I] | Absolute frequency [SSJ.T.K.S] | Percentage [SSJ.T.K.S] | Relative frequency [SSJ.T.K.S]
. .
/ 97,531,651 7.32 % 73,147.24 703 6.11 % 61,087.94 3,922,343 8.11 % 81,048.54 359,725 7.70 % 76,951.84 46,835,879 7.36 % 73,559.47 16,487,816 7.46 % 74,577.24 26,124,011 7.04 % 70,387.41 3,801,174 7.40 % 74,036.43
biti biti G 91,521,762 6.86 % 68,639.92 130 1.13 % 11,296.49 5,035,173 10.40 % 104,043.28 232,963 4.98 % 49,835.10 43,157,916 6.78 % 67,782.94 14,021,776 6.34 % 63,422.91 26,234,349 7.07 % 70,684.70 2,839,455 5.53 % 55,304.78
, , / 57,997,263 4.35 % 43,497.06 857 7.45 % 74,469.93 2,879,648 5.95 % 59,503.02 176,879 3.78 % 37,837.69 26,383,435 4.14 % 41,437.28 10,194,668 4.61 % 46,112.24 15,985,760 4.31 % 43,071.34 2,376,016 4.63 % 46,278.27
v v D 31,818,378 2.39 % 23,863.29 299 2.60 % 25,981.93 816,598 1.69 % 16,873.61 103,256 2.21 % 22,088.37 15,532,386 2.44 % 24,394.85 4,674,294 2.11 % 21,142.64 9,553,100 2.57 % 25,739.46 1,138,445 2.22 % 22,173.78
in in V 29,117,711 2.18 % 21,837.84 619 5.38 % 53,788.67 1,187,174 2.45 % 24,530.93 115,835 2.48 % 24,779.25 13,762,326 2.16 % 21,614.83 5,256,746 2.38 % 23,777.17 7,456,548 2.01 % 20,090.60 1,338,463 2.61 % 26,069.58
na na D 19,072,589 1.43 % 14,304.15 248 2.15 % 21,550.23 555,927 1.15 % 11,487.29 56,074 1.20 % 11,995.27 9,072,361 1.43 % 14,248.86 3,009,439 1.36 % 13,612.21 5,747,052 1.55 % 15,484.61 631,488 1.23 % 12,299.65
se se Z 18,656,426 1.40 % 13,992.03 63 0.55 % 5,474.45 1,227,523 2.54 % 25,364.67 73,060 1.56 % 15,628.89 8,331,411 1.31 % 13,085.14 3,444,419 1.56 % 15,579.70 4,822,136 1.30 % 12,992.55 757,814 1.48 % 14,760.14
z z D 15,732,362 1.18 % 11,799.03 403 3.50 % 35,019.12 535,369 1.11 % 11,062.49 58,748 1.26 % 12,567.29 7,267,869 1.14 % 11,414.77 2,841,173 1.28 % 12,851.12 4,402,058 1.19 % 11,860.71 626,742 1.22 % 12,207.21
za za D 15,195,407 1.14 % 11,396.32 88 0.77 % 7,646.85 294,850 0.61 % 6,092.57 67,714 1.45 % 14,485.28 7,527,662 1.18 % 11,822.79 2,303,946 1.04 % 10,421.15 4,564,728 1.23 % 12,299.01 436,419 0.85 % 8,500.24
da da V 14,810,352 1.11 % 11,107.54 124 1.08 % 10,775.11 752,394 1.55 % 15,546.94 53,649 1.15 % 11,476.51 6,760,076 1.06 % 10,617.24 2,457,240 1.11 % 11,114.52 4,260,568 1.15 % 11,479.49 526,301 1.02 % 10,250.90
ki ki V 12,623,562 0.95 % 9,467.48 55 0.48 % 4,779.28 338,126 0.70 % 6,986.80 55,386 1.19 % 11,848.09 5,915,959 0.93 % 9,291.48 2,202,562 1.00 % 9,962.57 3,594,899 0.97 % 9,685.94 516,575 1.01 % 10,061.46
on on Z 11,874,012 0.89 % 8,905.33 189 1.64 % 16,423.36 947,214 1.96 % 19,572.56 38,159 0.82 % 8,162.92 5,248,082 0.82 % 8,242.53 2,238,961 1.01 % 10,127.21 2,870,162 0.77 % 7,733.24 531,245 1.03 % 10,347.19
ta ta Z 11,377,542 0.85 % 8,532.98 17 0.15 % 1,477.23 469,391 0.97 % 9,699.17 74,801 1.60 % 16,001.32 5,253,309 0.82 % 8,250.74 1,998,988 0.90 % 9,041.77 3,084,831 0.83 % 8,311.64 496,205 0.97 % 9,664.71
pa pa V 11,003,780 0.82 % 8,252.67 37 0.32 % 3,215.15 319,351 0.66 % 6,598.84 25,847 0.55 % 5,529.15 5,422,876 0.85 % 8,517.06 1,898,081 0.86 % 8,585.35 3,016,889 0.81 % 8,128.58 320,699 0.62 % 6,246.34
tudi tudi L 8,478,793 0.64 % 6,358.96 21 0.18 % 1,824.82 136,549 0.28 % 2,821.55 23,030 0.49 % 4,926.54 4,214,730 0.66 % 6,619.57 1,465,007 0.66 % 6,626.48 2,376,445 0.64 % 6,402.99 263,011 0.51 % 5,122.73
- - / 7,044,475 0.53 % 5,283.25 0 0 % 0 40,704 0.08 % 841.08 22,550 0.48 % 4,823.86 3,697,425 0.58 % 5,807.10 809,963 0.37 % 3,663.60 2,288,921 0.62 % 6,167.17 184,912 0.36 % 3,601.58
) ) / 6,772,979 0.51 % 5,079.63 57 0.49 % 4,953.08 26,519 0.06 % 547.97 36,924 0.79 % 7,898.73 3,567,558 0.56 % 5,603.13 1,036,176 0.47 % 4,686.80 1,708,178 0.46 % 4,602.44 397,567 0.77 % 7,743.51
( ( / 6,765,109 0.51 % 5,073.73 57 0.49 % 4,953.08 26,178 0.05 % 540.92 33,999 0.73 % 7,273.02 3,585,956 0.56 % 5,632.03 1,028,955 0.47 % 4,654.14 1,703,505 0.46 % 4,589.85 386,459 0.75 % 7,527.16
ne ne L 6,734,994 0.51 % 5,051.14 44 0.38 % 3,823.43 374,116 0.77 % 7,730.47 30,064 0.64 % 6,431.25 3,104,942 0.49 % 4,876.56 1,318,655 0.60 % 5,964.50 1,648,098 0.44 % 4,440.56 259,075 0.51 % 5,046.07
po po D 6,360,566 0.48 % 4,770.33 64 0.56 % 5,561.35 191,555 0.40 % 3,958.16 21,577 0.46 % 4,615.72 3,054,588 0.48 % 4,797.47 936,691 0.42 % 4,236.81 1,959,908 0.53 % 5,280.69 196,183 0.38 % 3,821.11
še še L 5,787,718 0.43 % 4,340.70 24 0.21 % 2,085.51 216,168 0.45 % 4,466.74 11,596 0.25 % 2,480.60 2,847,027 0.45 % 4,471.48 962,021 0.43 % 4,351.39 1,615,445 0.43 % 4,352.59 135,437 0.26 % 2,637.94
kot kot V 5,694,030 0.43 % 4,270.43 23 0.20 % 1,998.61 210,691 0.43 % 4,353.57 16,409 0.35 % 3,510.19 2,600,526 0.41 % 4,084.33 973,450 0.44 % 4,403.08 1,661,150 0.45 % 4,475.73 231,781 0.45 % 4,514.46
/ 5,356,795 0.40 % 4,017.51 0 0 % 0 96,302 0.20 % 1,989.92 16,710 0.36 % 3,574.58 1,475,738 0.23 % 2,317.76 509,263 0.23 % 2,303.48 3,151,731 0.85 % 8,491.89 107,051 0.21 % 2,085.06
leto leto S 4,967,678 0.37 % 3,725.68 0 0 % 0 58,808 0.12 % 1,215.17 11,974 0.26 % 2,561.46 2,501,788 0.39 % 3,929.26 732,438 0.33 % 3,312.94 1,531,105 0.41 % 4,125.34 131,565 0.26 % 2,562.52
ves ves Z 4,233,029 0.32 % 3,174.71 26 0.23 % 2,259.30 207,705 0.43 % 4,291.87 13,137 0.28 % 2,810.25 2,037,592 0.32 % 3,200.20 799,974 0.36 % 3,618.42 1,022,378 0.28 % 2,754.65 152,217 0.30 % 2,964.77
imeti imeti G 4,146,931 0.31 % 3,110.13 12 0.10 % 1,042.75 179,933 0.37 % 3,718.01 14,728 0.32 % 3,150.59 1,929,486 0.30 % 3,030.41 785,754 0.35 % 3,554.10 1,085,001 0.29 % 2,923.38 152,017 0.30 % 2,960.87
iz iz D 4,006,518 0.30 % 3,004.83 37 0.32 % 3,215.15 134,681 0.28 % 2,782.95 23,762 0.51 % 5,083.13 2,017,138 0.32 % 3,168.08 611,339 0.28 % 2,765.19 1,060,540 0.29 % 2,857.47 159,021 0.31 % 3,097.29
pri pri D 3,934,055 0.29 % 2,950.48 47 0.41 % 4,084.12 76,258 0.16 % 1,575.74 15,360 0.33 % 3,285.79 1,888,953 0.30 % 2,966.75 756,619 0.34 % 3,422.32 1,018,794 0.27 % 2,744.99 178,024 0.35 % 3,467.42
od od D 3,855,989 0.29 % 2,891.93 20 0.17 % 1,737.92 141,756 0.29 % 2,929.15 13,534 0.29 % 2,895.17 1,843,270 0.29 % 2,895 638,050 0.29 % 2,886.01 1,061,423 0.29 % 2,859.85 157,936 0.31 % 3,076.16
: : / 3,814,843 0.29 % 2,861.07 2 0.02 % 173.79 118,720 0.24 % 2,453.15 14,473 0.31 % 3,096.04 1,988,615 0.31 % 3,123.28 691,819 0.31 % 3,129.22 781,587 0.21 % 2,105.87 219,627 0.43 % 4,277.73
že že L 3,735,419 0.28 % 2,801.51 6 0.05 % 521.38 131,252 0.27 % 2,712.10 7,450 0.16 % 1,593.69 1,898,346 0.30 % 2,981.50 608,889 0.28 % 2,754.11 1,012,284 0.27 % 2,727.45 77,192 0.15 % 1,503.49
o o D 3,670,387 0.28 % 2,752.73 1 0.01 % 86.90 95,167 0.20 % 1,966.46 24,271 0.52 % 5,192.02 1,828,963 0.29 % 2,872.53 488,092 0.22 % 2,207.72 1,103,026 0.30 % 2,971.95 130,867 0.26 % 2,548.93

1.3.2 List of lower-case word forms in the Gigafida 2.0 corpus with lemmas, part-of-speech categories and text-type distribution
File at CLARIN.SI: GF2.0-words-all-lowercase_forms-lemmas-parts_of_speech-taxonomy-entire.tsv
Columns: Lower-case word form; Lemma; Lemma (lower-case); Part-of-speech category; Total absolute frequency of lower-case word form; Percentage of all found lower-case word forms; Total relative frequency (per million occurrences); Absolute frequency [SSJ.T.K.N]; Percentage [SSJ.T.K.N]; Relative frequency [SSJ.T.K.N]; Absolute frequency [SSJ.T.K.L]; Percentage [SSJ.T.K.L]; Relative frequency [SSJ.T.K.L]; Absolute frequency [SSJ.T.D]; Percentage [SSJ.T.D]; Relative frequency [SSJ.T.D]; Absolute frequency [SSJ.T.P.C]; Percentage [SSJ.T.P.C]; Relative frequency [SSJ.T.P.C]; Absolute frequency [SSJ.T.P.R]; Percentage [SSJ.T.P.R]; Relative frequency [SSJ.T.P.R]; Absolute frequency [SSJ.I]; Percentage [SSJ.I]; Relative frequency [SSJ.I]; Absolute frequency [SSJ.T.K.S]; Percentage [SSJ.T.K.S]; Relative frequency [SSJ.T.K.S]
. . .
/ 97,531,651 7.32 % 73,147.24 703 6.11 % 61,087.94 3,922,343 8.11 % 81,048.54 359,725 7.70 % 76,951.84 46,835,879 7.36 % 73,559.47 16,487,816 7.46 % 74,577.24 26,124,011 7.04 % 70,387.41 3,801,174 7.40 % 74,036.43
, , , / 57,997,263 4.35 % 43,497.06 857 7.45 % 74,469.93 2,879,648 5.95 % 59,503.02 176,879 3.78 % 37,837.69 26,383,435 4.14 % 41,437.28 10,194,668 4.61 % 46,112.24 15,985,760 4.31 % 43,071.34 2,376,016 4.63 % 46,278.27
je biti biti G 39,852,647 2.99 % 29,888.87 53 0.46 % 4,605.49 2,355,179 4.87 % 48,665.76 95,246 2.04 % 20,374.88 18,126,598 2.85 % 28,469.26 6,003,428 2.71 % 27,154.54 11,999,560 3.23 % 32,331.10 1,272,583 2.48 % 24,786.42
v v v D 31,813,825 2.39 % 23,859.88 299 2.60 % 25,981.93 816,315 1.69 % 16,867.76 103,250 2.21 % 22,087.09 15,531,117 2.44 % 24,392.85 4,673,034 2.11 % 21,136.94 9,551,729 2.57 % 25,735.77 1,138,081 2.22 % 22,166.69
in in in V 29,117,711 2.18 % 21,837.84 619 5.38 % 53,788.67 1,187,174 2.45 % 24,530.93 115,835 2.48 % 24,779.25 13,762,326 2.16 % 21,614.83 5,256,746 2.38 % 23,777.17 7,456,548 2.01 % 20,090.60 1,338,463 2.61 % 26,069.58
na na na D 19,072,589 1.43 % 14,304.15 248 2.15 % 21,550.23 555,927 1.15 % 11,487.29 56,074 1.20 % 11,995.27 9,072,361 1.43 % 14,248.86 3,009,439 1.36 % 13,612.21 5,747,052 1.55 % 15,484.61 631,488 1.23 % 12,299.65
se se se Z 16,033,172 1.20 % 12,024.63 61 0.53 % 5,300.66 1,002,189 2.07 % 20,708.53 67,074 1.44 % 14,348.37 7,234,161 1.14 % 11,361.82 2,886,836 1.31 % 13,057.66 4,197,943 1.13 % 11,310.76 644,908 1.26 % 12,561.04
za za za D 15,195,407 1.14 % 11,396.32 88 0.77 % 7,646.85 294,850 0.61 % 6,092.57 67,714 1.45 % 14,485.28 7,527,662 1.18 % 11,822.79 2,303,946 1.04 % 10,421.15 4,564,728 1.23 % 12,299.01 436,419 0.85 % 8,500.24
da da da V 14,810,352 1.11 % 11,107.54 124 1.08 % 10,775.11 752,394 1.55 % 15,546.94 53,649 1.15 % 11,476.51 6,760,076 1.06 % 10,617.24 2,457,240 1.11 % 11,114.52 4,260,568 1.15 % 11,479.49 526,301 1.02 % 10,250.90
so biti biti G 13,322,285 1.00 % 9,991.51 25 0.22 % 2,172.40 381,659 0.79 % 7,886.33 33,146 0.71 % 7,090.54 6,609,847 1.04 % 10,381.29 1,974,260 0.89 % 8,929.92 3,829,327 1.03 % 10,317.57 494,021 0.96 % 9,622.17
ki ki ki V 12,623,562 0.95 % 9,467.48 55 0.48 % 4,779.28 338,126 0.70 % 6,986.80 55,386 1.19 % 11,848.09 5,915,959 0.93 % 9,291.48 2,202,562 1.00 % 9,962.57 3,594,899 0.97 % 9,685.94 516,575 1.01 % 10,061.46
pa pa pa V 11,003,780 0.82 % 8,252.67 37 0.32 % 3,215.15 319,351 0.66 % 6,598.84 25,847 0.55 % 5,529.15 5,422,876 0.85 % 8,517.06 1,898,081 0.86 % 8,585.35 3,016,889 0.81 % 8,128.58 320,699 0.62 % 6,246.34
z z z D 8,670,542 0.65 % 6,502.77 236 2.05 % 20,507.47 312,741 0.65 % 6,462.26 31,792 0.68 % 6,800.90 4,006,375 0.63 % 6,292.33 1,570,819 0.71 % 7,105.09 2,398,026 0.65 % 6,461.14 350,553 0.68 % 6,827.81
tudi tudi tudi L 8,478,789 0.64 % 6,358.96 21 0.18 % 1,824.82 136,547 0.28 % 2,821.51 23,030 0.49 % 4,926.54 4,214,730 0.66 % 6,619.57 1,465,006 0.66 % 6,626.47 2,376,444 0.64 % 6,402.99 263,011 0.51 % 5,122.73
s z z D 7,061,820 0.53 % 5,296.26 167 1.45 % 14,511.64 222,628 0.46 % 4,600.23 26,956 0.58 % 5,766.39 3,261,494 0.51 % 5,122.44 1,270,354 0.57 % 5,746.03 2,004,032 0.54 % 5,399.58 276,189 0.54 % 5,379.40
- - - / 7,044,475 0.53 % 5,283.25 0 0 % 0 40,704 0.08 % 841.08 22,550 0.48 % 4,823.86 3,697,425 0.58 % 5,807.10 809,963 0.37 % 3,663.60 2,288,921 0.62 % 6,167.17 184,912 0.36 % 3,601.58
) ) ) / 6,772,979 0.51 % 5,079.63 57 0.49 % 4,953.08 26,519 0.06 % 547.97 36,924 0.79 % 7,898.73 3,567,558 0.56 % 5,603.13 1,036,176 0.47 % 4,686.80 1,708,178 0.46 % 4,602.44 397,567 0.77 % 7,743.51
( ( ( / 6,765,109 0.51 % 5,073.73 57 0.49 % 4,953.08 26,178 0.05 % 540.92 33,999 0.73 % 7,273.02 3,585,956 0.56 % 5,632.03 1,028,955 0.47 % 4,654.14 1,703,505 0.46 % 4,589.85 386,459 0.75 % 7,527.16
ne ne ne L 6,734,994 0.51 % 5,051.14 44 0.38 % 3,823.43 374,116 0.77 % 7,730.47 30,064 0.64 % 6,431.25 3,104,942 0.49 % 4,876.56 1,318,655 0.60 % 5,964.50 1,648,098 0.44 % 4,440.56 259,075 0.51 % 5,046.07
bi biti biti G 6,493,377 0.49 % 4,869.93 4 0.04 % 347.58 388,499 0.80 % 8,027.67 19,280 0.41 % 4,124.35 3,209,848 0.50 % 5,041.32 1,006,147 0.46 % 4,550.98 1,665,174 0.45 % 4,486.57 204,425 0.40 % 3,981.64
po po po D 6,360,566 0.48 % 4,770.33 64 0.56 % 5,561.35 191,555 0.40 % 3,958.16 21,577 0.46 % 4,615.72 3,054,588 0.48 % 4,797.47 936,691 0.42 % 4,236.81 1,959,908 0.53 % 5,280.69 196,183 0.38 % 3,821.11
bo biti biti G 5,991,968 0.45 % 4,493.88 9 0.08 % 782.06 165,708 0.34 % 3,424.07 10,939 0.23 % 2,340.05 3,054,359 0.48 % 4,797.11 827,835 0.37 % 3,744.44 1,837,272 0.49 % 4,950.27 95,846 0.19 % 1,866.82
še še še L 5,787,718 0.43 % 4,340.70 24 0.21 % 2,085.51 216,168 0.45 % 4,466.74 11,596 0.25 % 2,480.60 2,847,027 0.45 % 4,471.48 962,021 0.43 % 4,351.39 1,615,445 0.43 % 4,352.59 135,437 0.26 % 2,637.94
kot kot kot V 5,694,030 0.43 % 4,270.43 23 0.20 % 1,998.61 210,691 0.43 % 4,353.57 16,409 0.35 % 3,510.19 2,600,526 0.41 % 4,084.33 973,450 0.44 % 4,403.08 1,661,150 0.45 % 4,475.73 231,781 0.45 % 4,514.46
/ 5,356,795 0.40 % 4,017.51 0 0 % 0 96,302 0.20 % 1,989.92 16,710 0.36 % 3,574.58 1,475,738 0.23 % 2,317.76 509,263 0.23 % 2,303.48 3,151,731 0.85 % 8,491.89 107,051 0.21 % 2,085.06
ni biti biti G 4,416,649 0.33 % 3,312.42 0 0 % 0 283,726 0.59 % 5,862.71 15,184 0.33 % 3,248.14 2,061,015 0.32 % 3,236.99 719,621 0.33 % 3,254.97 1,202,738 0.32 % 3,240.61 134,365 0.26 % 2,617.06
to ta ta Z 4,103,550 0.31 % 3,077.60 8 0.07 % 695.17 218,365 0.45 % 4,512.14 20,900 0.45 % 4,470.90 1,845,594 0.29 % 2,898.65 773,014 0.35 % 3,496.48 1,079,804 0.29 % 2,909.38 165,865 0.32 % 3,230.59
iz iz iz D 4,006,518 0.30 % 3,004.83 37 0.32 % 3,215.15 134,681 0.28 % 2,782.95 23,762 0.51 % 5,083.13 2,017,138 0.32 % 3,168.08 611,339 0.28 % 2,765.19 1,060,540 0.29 % 2,857.47 159,021 0.31 % 3,097.29
pri pri pri D 3,934,055 0.29 % 2,950.48 47 0.41 % 4,084.12 76,258 0.16 % 1,575.74 15,360 0.33 % 3,285.79 1,888,953 0.30 % 2,966.75 756,619 0.34 % 3,422.32 1,018,794 0.27 % 2,744.99 178,024 0.35 % 3,467.42
od od od D 3,855,989 0.29 % 2,891.93 20 0.17 % 1,737.92 141,756 0.29 % 2,929.15 13,534 0.29 % 2,895.17 1,843,270 0.29 % 2,895 638,050 0.29 % 2,886.01 1,061,423 0.29 % 2,859.85 157,936 0.31 % 3,076.16
: : : / 3,814,843 0.29 % 2,861.07 2 0.02 % 173.79 118,720 0.24 % 2,453.15 14,473 0.31 % 3,096.04 1,988,615 0.31 % 3,123.28 691,819 0.31 % 3,129.22 781,587 0.21 % 2,105.87 219,627 0.43 % 4,277.73
že že že L 3,735,419 0.28 % 2,801.51 6 0.05 % 521.38 131,252 0.27 % 2,712.10 7,450 0.16 % 1,593.69 1,898,346 0.30 % 2,981.50 608,889 0.28 % 2,754.11 1,012,284 0.27 % 2,727.45 77,192 0.15 % 1,503.49

1.3.3 List of morphosyntactic tags in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-words-all-morphosyntactic_tags-split_MSD-taxonomy-entire.tsv
Columns: Morphosyntactic tag; Total absolute frequency of morphosyntactic tag; Percentage of all found morphosyntactic tags; Total relative frequency (per million occurrences); Absolute frequency [SSJ.T.K.N]; Percentage [SSJ.T.K.N]; Relative frequency [SSJ.T.K.N]; Absolute frequency [SSJ.T.K.L]; Percentage [SSJ.T.K.L]; Relative frequency [SSJ.T.K.L]; Absolute frequency [SSJ.T.D]; Percentage [SSJ.T.D]; Relative frequency [SSJ.T.D]; Absolute frequency [SSJ.T.P.C]; Percentage [SSJ.T.P.C]; Relative frequency [SSJ.T.P.C]; Absolute frequency [SSJ.T.P.R]; Percentage [SSJ.T.P.R]; Relative frequency [SSJ.T.P.R]; Absolute frequency [SSJ.I]; Percentage [SSJ.I]; Relative frequency [SSJ.I]; Absolute frequency [SSJ.T.K.S]; Percentage [SSJ.T.K.S]; Relative frequency [SSJ.T.K.S]; msd01; msd02; msd03; msd04; msd05; msd06; msd07; msd08; msd09
/ 198,666,720 14.90 % 148,996.99 1,801 15.65 % 156,499.83 8,679,221 17.93 % 179,341.33 718,812 15.38 % 153,767.20 93,986,319 14.76 % 147,612.98 33,665,926 15.23 % 152,276.79 53,207,390 14.34 % 143,359.69 8,407,251 16.38 % 163,750.16 /
Rsn 56,322,959 4.22 % 42,241.35 456 3.96 % 39,624.61 2,781,306 5.75 % 57,470.95 169,932 3.63 % 36,351.60 26,046,787 4.09 %
40,908.55 10,848,980 4.91 % 49,071.81 14,340,788 3.86 % 38,639.20 2,134,710 4.16 % 41,578.29 Rsn
Vp 54,993,423 4.12 % 41,244.22 816 7.09 % 70,907.19 2,024,294 4.18 % 41,828.59 218,973 4.68 % 46,842.38 26,112,246 4.10 % 41,011.36 9,948,396 4.50 % 44,998.31 14,386,971 3.88 % 38,763.63 2,301,727 4.48 % 44,831.32 Vp
Dm 51,583,054 3.87 % 38,686.50 390 3.39 % 33,889.47 1,193,801 2.47 % 24,667.87 175,134 3.75 % 37,464.41 25,266,626 3.97 % 39,683.24 7,530,894 3.41 % 34,063.53 15,673,665 4.22 % 42,230.45 1,742,544 3.39 % 33,939.97 Dm
Vd 43,547,870 3.27 % 32,660.23 296 2.57 % 25,721.24 1,921,203 3.97 % 39,698.39 175,390 3.75 % 37,519.17 19,992,450 3.14 % 31,399.73 7,639,062 3.46 % 34,552.79 12,076,501 3.25 % 32,538.40 1,742,968 3.40 % 33,948.23 Vd
Gp-ste-n 39,852,762 2.99 % 29,888.96 53 0.46 % 4,605.49 2,355,200 4.87 % 48,666.20 95,246 2.04 % 20,374.88 18,126,629 2.85 % 28,469.31 6,003,452 2.71 % 27,154.65 11,999,591 3.23 % 32,331.18 1,272,591 2.48 % 24,786.58 Gp-ste-n
L 39,653,775 2.97 % 29,739.72 112 0.97 % 9,732.36 1,485,530 3.07 % 30,695.95 112,172 2.40 % 23,995.67 19,346,131 3.04 % 30,384.64 7,223,559 3.27 % 32,673.40 10,252,147 2.76 % 27,622.94 1,234,124 2.40 % 24,037.35 L
Slmei 33,708,453 2.53 % 25,280.82 9 0.08 % 782.06 623,751 1.29 % 12,888.75 32,133 0.69 % 6,873.84 17,574,320 2.76 % 27,601.87 4,255,650 1.93 % 19,249.04 10,413,344 2.81 % 28,057.26 809,246 1.58 % 15,761.89 Slmei
Dt 31,555,233 2.37 % 23,665.94 368 3.20 % 31,977.75 945,803 1.95 % 19,543.41 120,682 2.58 % 25,816.12 15,232,006 2.39 % 23,923.08 5,082,573 2.30 % 22,989.35 9,106,897 2.45 % 24,537.23 1,066,904 2.08 % 20,780.36 Dt
Somei 27,071,604 2.03 % 20,303.29 215 1.87 % 18,682.66 762,171 1.57 % 15,748.97 109,081 2.33 % 23,334.45 12,965,306 2.04 % 20,363.04 4,934,015 2.23 % 22,317.40 7,182,247 1.94 % 19,351.54 1,118,569 2.18 % 21,786.65 Somei
Sozer 21,380,765 1.60 % 16,035.25 207 1.80 % 17,987.49 436,997 0.90 % 9,029.80 98,090 2.10 % 20,983.27 10,261,559 1.61 % 16,116.59 3,239,400 1.47 % 14,652.36 6,429,486 1.73 % 17,323.33 915,026 1.78 % 17,822.19 Sozer
Do 21,343,433 1.60 % 16,007.25 422 3.67 % 36,670.14 709,582 1.47 % 14,662.30 74,273 1.59 % 15,888.37 9,949,072 1.56 % 15,625.81 3,738,468 1.69 % 16,909.74 6,045,443 1.63 % 16,288.58 826,173 1.61 % 16,091.58 Do
Kag 20,820,597 1.56 % 15,615.13 373 3.24 % 32,412.23 39,487 0.08 % 815.93 69,194 1.48 % 14,801.88 11,037,417 1.73 % 17,335.14 2,615,913 1.18 % 11,832.23 6,204,434 1.67 % 16,716.96 853,779 1.66 % 16,629.27 Kag
Sozei 20,240,717 1.52 % 15,180.23 55 0.48 % 4,779.28 572,043 1.18 % 11,820.29 83,720 1.79 % 17,909.26 9,609,552 1.51 % 15,092.56 3,556,115 1.61 % 16,084.92 5,492,990 1.48 % 14,800.07 926,242 1.80 % 18,040.65 Sozei
Sozet 20,135,892 1.51 % 15,101.61 446 3.88 % 38,755.65 677,430 1.40 % 13,997.94 76,334 1.63 % 16,329.26 9,443,046 1.48 % 14,831.05 3,450,420 1.56 % 15,606.84 5,674,428 1.53 % 15,288.93 813,788 1.58 % 15,850.35 Sozet
Somer 19,429,195 1.46 % 14,571.60 175 1.52 % 15,206.81 411,569 0.85 % 8,504.37 101,991 2.18 % 21,817.76 9,212,571 1.45 % 14,469.07 2,829,295 1.28 % 12,797.39 6,233,660 1.68 % 16,795.70 639,934 1.25 % 12,464.16 Somer
Dr 17,943,294 1.35 % 13,457.19 118 1.02 % 10,253.74 583,892 1.21 % 12,065.13 70,088 1.50 % 14,993.12 8,648,140 1.36 % 13,582.59 2,881,652 1.30 % 13,034.21 5,081,033 1.37 % 13,690.12 678,371 1.32 % 13,212.80 Dr
Zp------k 16,040,375 1.20 % 12,030.03 61 0.53 % 5,300.66 1,002,400 2.07 % 20,712.89 67,082 1.44 % 14,350.08 7,237,212 1.14 % 11,366.62 2,888,141 1.31 % 13,063.56 4,200,430 1.13 % 11,317.46 645,049 1.26 % 12,563.78 Zp------k
Sometn 15,316,125 1.15 % 11,486.86 278 2.42 % 24,157.11 533,024 1.10 % 11,014.03 56,118 1.20 % 12,004.68 7,201,751 1.13 % 11,310.92 2,677,811 1.21 % 12,112.20 4,274,524 1.15 % 11,517.09 572,619 1.11 % 11,153.05 Sometn
Sommr 15,236,675 1.14 % 11,427.27 29 0.25 % 2,519.99 211,119 0.44 % 4,362.41 48,520 1.04 % 10,379.33 7,693,463 1.21 % 12,083.19 2,243,275 1.01 % 10,146.72 4,571,050 1.23 % 12,316.04 469,219 0.91 % 9,139.10 Sommr
Sozem 14,644,589 1.10 % 10,983.22 184 1.60 % 15,988.88 361,824 0.75 % 7,476.48 52,054 1.11 % 11,135.31 7,173,030 1.13 % 11,265.81 2,164,742 0.98 % 9,791.50 4,371,717 1.18 % 11,778.97 521,038 1.01 % 10,148.39 Sozem
Ggdd-em 13,879,307 1.04 % 10,409.27 3 0.03 % 260.69 1,110,378 2.29 % 22,944.07 23,407 0.50 % 5,007.19 6,311,599 0.99 % 9,912.87 1,576,678 0.71 % 7,131.59 4,557,367 1.23 % 12,279.17 299,875 0.58 % 5,840.74 Ggdd-em
Gp-stm-n 13,322,332 1.00 % 9,991.54 25 0.22 % 2,172.40 381,663 0.79 % 7,886.42 33,148 0.71 % 7,090.97 6,609,862 1.04 % 10,381.31 1,974,264 0.89 % 8,929.94 3,829,346 1.03 % 10,317.62 494,024 0.96 % 9,622.23 Gp-stm-n
Ggnste 12,836,948 0.96 % 9,627.51 30 0.26 % 2,606.88 349,156 0.72 % 7,214.71 59,312 1.27 % 12,687.94 5,698,341 0.90 % 8,949.70 2,427,826 1.10 % 10,981.48 3,705,556 1.00 % 9,984.09 596,727 1.16 % 11,622.60 Ggnste
Somem 11,658,323 0.87 % 8,743.56 97 0.84 % 8,428.92 295,865 0.61 % 6,113.55 49,477 1.06 % 10,584.05 5,559,072 0.87 % 8,730.96 1,738,401 0.79 % 7,863.09 3,628,159 0.98 % 9,775.55 387,252 0.75 % 7,542.61 Somem
Soser 11,030,313 0.83 % 8,272.57 91 0.79 % 7,907.54 194,300 0.40 % 4,014.88 49,251 1.05 % 10,535.70 5,300,054 0.83 % 8,324.16 1,693,803 0.77 % 7,661.36 3,255,385 0.88 % 8,771.17 537,429 1.05 % 10,467.64 Soser
Sozmr 10,162,912 0.76 % 7,622.03 157 1.36 % 13,642.68 182,864 0.38 % 3,778.57 41,143 0.88 % 8,801.25 4,888,030 0.77 % 7,677.04 1,696,748 0.77 % 7,674.68 2,924,909 0.79 % 7,880.75 429,061 0.84 % 8,356.93 Sozmr
Slzei 10,018,178 0.75 % 7,513.48 2 0.02 % 173.79 220,000 0.46 % 4,545.93 11,441 0.24 % 2,447.44 5,070,911 0.80 % 7,964.27 1,162,284 0.53 % 5,257.21 3,399,889 0.92 % 9,160.51 153,651 0.30 % 2,992.70 Slzei
Ppnzei 9,972,939 0.75 % 7,479.55 22 0.19 % 1,911.71 274,478 0.57 % 5,671.62 36,054 0.77 % 7,712.62 4,708,624 0.74 % 7,395.27 1,768,998 0.80 % 8,001.48 2,762,890 0.74 % 7,444.21 421,873 0.82 % 8,216.93 Ppnzei
Ggdd-mm 9,968,028 0.75 % 7,475.87 31 0.27 % 2,693.78 214,514 0.44 % 4,432.57 16,155 0.35 % 3,455.85 5,274,440 0.83 % 8,283.93 1,338,559 0.60 % 6,054.53 2,895,138 0.78 % 7,800.53 229,191 0.45 % 4,464.01 Ggdd-mm
Sommi 9,671,683 0.72 % 7,253.61 15 0.13 % 1,303.44 208,518 0.43 % 4,308.67 35,345 0.76 % 7,560.95 4,886,479 0.77 % 7,674.60 1,587,273 0.72 % 7,179.51 2,539,557 0.68 % 6,842.47 414,496 0.81 % 8,073.24 Sommi
Soset 9,236,913 0.69 % 6,927.54 141 1.23 % 12,252.35 232,946 0.48 % 4,813.43 44,055 0.94 % 9,424.18 4,474,078 0.70 % 7,026.89 1,515,736 0.69 % 6,855.94 2,574,261 0.69 % 6,935.98 395,696 0.77 % 7,707.07 Soset

1.3.4 List of lemmas by basic consonant-vowel structure in the Gigafida 2.0 corpus
File at CLARIN.SI: GF2.0_cv_lemmas_robust_entire.tsv
Columns: Structure; …; Frequency (G); Most frequent lemmas (G); Number of all unique lemmas (G); …; Frequency (R); Most frequent lemmas (R); Number of all unique lemmas (R); Frequency (S); Most frequent lemmas (S); Number of all unique lemmas (S); …; Total frequency of structure; Total number of all unique lemmas in structure
CVCVC … 21.406 zadel~G~10.357 | napet~G~1.349 | debel~G~853 | … 596 … 6.595.666 nekaj~R~1.478.636 | danes~R~905.146 | potem~R~814.559 | … 992 25.000.372 konec~S~893.711 | zakon~S~706.952 | teden~S~683.852 | … 50.570 … 51.451.238 67.515
CVCV … 94.132.591 biti~G~91.521.762 | moči~G~829.908 | reči~G~623.230 | … 716 … 8.423.056 tako~R~3.289.402 | zelo~R~1.441.970 | kako~R~1.005.454 | … 864 19.215.933 leto~S~4.967.678 | delo~S~1.611.830 | šola~S~576.848 | ... 15.430 … 136.594.478 25.177
CVC … 7.111 reč~G~4.911 | paz~G~510 | pij~G~262 | … 176 … 5.736.300 več~R~1.895.190 | kar~R~1.071.938 | res~R~565.140 | … 367 14.616.619 čas~S~1.869.767 | dan~S~1.852.418 | del~S~960.422 | ... 5.705 … 59.730.466 10.829
CVCVCV … 16.284.986 morati~G~2.048.887 | začeti~G~1.138.044 | dobiti~G~966.276 | … 2.929 … 2.946.611 veliko~R~945.307 | nikoli~R~342.994 | toliko~R~324.313 | … 1.346 13.421.550 težava~S~621.900 | beseda~S~594.966 | sezona~S~516.392 | ...
50.660 … 38.220.887 67.574
CVCCV … 936.445 najti~G~509.948 | pasti~G~199.420 | rasti~G~69.030 | … 709 … 8.827.834 lahko~R~3.488.074 | vedno~R~1.075.127 | dobro~R~906.394 | … 1.383 12.636.445 mesto~S~1.587.720 | tekma~S~978.012 | točka~S~672.245 | ... 34.120 … 26.144.952 47.802
CVCCVC … 8.936 razvit~G~2.664 | pokrit~G~1.179 | naštet~G~773 | … 365 … 2.800.337 takrat~R~444.872 | povsem~R~426.682 | medtem~R~423.755 | … 650 12.170.590 sistem~S~560.940 | razvoj~S~420.682 | center~S~419.711 | ... 59.998 … 22.841.406 78.290
CVCVCVC … 3.031 zaletel~G~1.392 | posedel~G~141 | zagorel~G~137 | … 370 … 576.538 posebej~R~301.006 | nikakor~R~110.573 | zapored~R~56.237 | … 816 11.138.135 milijon~S~863.041 | podatek~S~587.651 | začetek~S~584.429 | ... 48.905 … 21.507.975 70.448
CCVC … 8.259 glej~G~1.829 | glej~G~1.273 | vroč~G~801 | … 158 … 2.389.495 zdaj~R~1.082.531 | spet~R~394.360 | prej~R~381.934 | … 294 8.840.886 svet~S~1.216.329 | član~S~561.640 | klub~S~426.105 | ... 9.939 … 30.725.594 14.440
CCVCVC … 4.785 predal~G~820 | prebil~G~678 | preživ~G~423 | … 263 … 2.598.431 skupaj~R~614.538 | precej~R~492.127 | včeraj~R~354.001 | … 563 8.309.553 človek~S~1.543.691 | primer~S~905.773 | trener~S~306.400 | ... 21.702 … 18.803.086 30.042
CCVCV … 3.280.635 priti~G~1.046.777 | zdeti~G~373.353 | stati~G~296.462 | … 703 … 2.061.831 treba~R~715.315 | glede~R~400.382 | znova~R~244.774 | … 600 7.490.558 vlada~S~767.169 | zveza~S~506.534 | zmaga~S~477.023 | ... 11.991 … 16.228.123 18.039
CVCCVCV … 4.622.874 postati~G~640.874 | veljati~G~399.008 | misliti~G~389.740 | … 2.507 … 149.476 najraje~R~44.893 | pošteno~R~36.039 | podnevi~R~8.556 | … 931 7.154.580 razmera~S~284.022 | nagrada~S~272.445 | delnica~S~272.190 | ... 42.328 … 12.002.363 55.461
CVCVCCV … 428.773 zasesti~G~69.573 | navesti~G~68.964 | napasti~G~53.832 | … 707 … 1.561.261 pogosto~R~299.878 | ponovno~R~146.746 | posebno~R~126.266 | … 1.488 6.478.905 sodišče~S~546.254 | pogodba~S~368.653 | ponudba~S~261.275 | ... 31.471 … 9.781.237 41.201
CCVCVCV … 7.126.731 praviti~G~720.199 | zgoditi~G~398.852 | slediti~G~305.351 | … 2.322 … 454.487 drugače~R~215.016 | premalo~R~100.905 | globoko~R~65.969 | … 648 5.995.856 skupina~S~857.991 | pravica~S~459.268 | število~S~408.161 | ... 14.941 … 13.597.817 22.411
VCCVC … 3.419 ostal~G~1.534 | oglas~G~449 | odbit~G~242 | … 116 … 128.059 okrog~R~95.620 | odveč~R~25.025 | osmič~R~1.949 | … 111 5.692.845 otrok~S~858.940 | odnos~S~329.539 | okvir~S~265.875 | ... 10.614 … 7.287.524 14.070
CCVCCV … 321.589 znajti~G~161.646 | zrasti~G~50.048 | vnesti~G~23.819 | … 395 … 612.957 skupno~R~75.556 | drugje~R~43.210 | zlahka~R~37.127 | … 632 5.060.655 družba~S~762.071 | mnenje~S~413.250 | služba~S~338.598 | ... 13.123 … 7.548.086 18.144
CVCC … 1.900 past~G~909 | pust~G~528 | nost~G~149 | … 118 … 1.872.113 bolj~R~1.262.289 | manj~R~605.930 | fajn~R~957 | … 264 4.891.060 film~S~522.446 | cilj~S~344.514 | gost~S~275.606 | ... 16.603 … 7.954.815 22.423
CVCVCVCV … 11.283.679 povedati~G~970.784 | pomeniti~G~572.293 | narediti~G~471.071 | … 4.180 … 908.121 nekoliko~R~302.982 | zagotovo~R~181.171 | večinoma~R~127.004 | … 1.086 4.709.331 komisija~S~349.324 | politika~S~342.522 | policija~S~276.483 | ... 34.484 … 17.784.168 48.338
CCVCCVC … 2.309 prevzet~G~425 | preklet~G~326 | prekrit~G~282 | … 168 … 463.595 dvakrat~R~129.369 | zjutraj~R~111.789 | trikrat~R~58.405 | … 228 4.532.268 prostor~S~672.859 | program~S~667.540 | predlog~S~363.182 | ... 16.979 … 6.556.350 23.295
CVCVCCVC … 935 temeljit~G~470 | pogolten~G~173 | pobesnel~G~46 | … 156 … 48.055 naposled~R~24.791 | popoldan~R~9.711 | dopoldan~R~5.038 | … 327 4.310.272 minister~S~468.972 | direktor~S~425.975 | rezultat~S~354.363 | ... 24.250 … 7.431.255 34.438
VCCVCV … 2.543.942 ostati~G~562.250 | igrati~G~507.131 | uspeti~G~416.462 | … 549 … 3.346 igrivo~R~1.197 | igraje~R~508 | ubrano~R~505 | … 198 4.245.978 občina~S~688.467 | evropa~S~391.599 | uprava~S~371.264 | ... 8.812 … 6.807.513 12.102
CCCVC … 3.003 skrit~G~1.461 | vštet~G~498 | trkaj~G~417 | … 64 … 320.517 prvič~R~282.017 | stran~R~35.649 | vznak~R~965 | … 141 3.966.653 stran~S~856.113 | stvar~S~376.437 | sklad~S~347.446 | ... 4.864 … 6.076.523 6.710
CVCCVCVC … 728 zastarel~G~294 | zarjavel~G~28 | pogrezen~G~19 | … 230 … 53.917 dandanes~R~22.628 | naprodaj~R~16.584 | ravnokar~R~8.658 | … 475 3.594.170 postopek~S~434.043 | poslanec~S~229.055 | kandidat~S~228.465 | ... 33.021 … 8.777.880 50.398
CVCCVCCV … 103.483 zaplesti~G~34.400 | razpasti~G~15.182 | zatresti~G~13.524 | … 464 … 933.612 verjetno~R~328.520 | resnično~R~74.682 | potrebno~R~71.774 | … 1.080 3.402.723 podjetje~S~1.001.078 | področje~S~506.606 | poškodba~S~169.848 | ... 21.613 … 5.696.584 28.517
CCVCVCVC … 5.339 prizadet~G~4.208 | preživet~G~269 | preživel~G~174 | … 166 … 12.289 predaleč~R~6.727 | gladoven~R~484 | klasičen~R~443 | … 364 3.255.249 trenutek~S~302.193 | slovenec~S~244.351 | storitev~S~226.842 | ... 13.227 … 8.479.088 20.404
VCCVCVC … 2.381 izrazit~G~871 | okrogel~G~315 | ostarel~G~276 | … 152 … 6.532 izjemen~R~1.009 | ogromen~R~846 | izvozen~R~744 | … 181 2.971.609 igralec~S~463.082 | izdelek~S~293.646 | odgovor~S~221.881 | ... 8.985 … 6.384.541 13.679
CCVCVCCV … 425.606 prinesti~G~197.821 | prenesti~G~84.647 | pripasti~G~27.903 | … 312 … 694.325 trenutno~R~201.729 | pretežno~R~65.503 | pravilno~R~52.975 | … 731 2.960.149 srečanje~S~285.542 | številka~S~248.732 | stoletje~S~242.678 | ... 9.414 … 4.334.480 13.049
VCVCV … 4.950.613 imeti~G~4.146.931 | upati~G~239.933 | oditi~G~167.993 | … 364 … 419.876 okoli~R~277.924 | enako~R~126.110 | obilo~R~12.331 | … 214 2.931.193 ekipa~S~495.141 | oseba~S~339.640 | ocena~S~221.062 | ... 8.115 … 8.516.651 11.172
CCVCVCVCV … 3.802.831 sporočiti~G~245.883 | prihajati~G~231.654 | pridobiti~G~208.657 | … 2.206 … 103.100 praviloma~R~52.432 | pretirano~R~28.829 | slikovito~R~4.437 | … 394 2.847.310 slovenija~S~1.764.516 | zgodovina~S~246.897 | sporočilo~S~149.732 | ... 8.094 … 6.772.674 13.085
CCC … 184 rtv~G~54 | pbz~G~9 | sdh~G~7 | … 88 … 10.555 brž~R~8.484 | črn~R~1677 | trd~R~115 | … 107 2.780.139 trg~S~545.441 | vrh~S~254.042 | vrt~S~124.729 | ... 8.617 … 3.185.177 12.149
CVCVCVCCV … 972 zapopasti~G~492 | nameravti~G~14 | počivajta~G~10 | … 359 … 302.820 dobesedno~R~42.683 | politično~R~30.762 | pozitivno~R~29.809 | … 990 2.643.979 milijarda~S~274.815 | delovanje~S~248.681 | dogajanje~S~148.537 | ... 13.597 … 3.295.685 18.224
CCCVCV … 1.282.675 vrniti~G~326.502 | zbrati~G~198.305 | držati~G~183.377 | … 481 … 392.448 hkrati~R~328.962 | sproti~R~24.706 | strogo~R~23.849 | … 186 2.591.603 država~S~1.502.639 | knjiga~S~403.004 | srbija~S~111.830 | ... 3.792 … 4.283.150 5.698
CVCCVCC … 60 lottonl~G~7 | kikboks~G~2 | končall~G~2 | … 47 … 970.253 najbolj~R~780.725 | najmanj~R~163.555 | zastonj~R~17.366 | … 187 2.461.453 možnost~S~539.410 | javnost~S~306.627 | varnost~S~193.396 | ... 18.354 … 3.477.159 22.190

1.3.5 List of word forms by basic consonant-vowel structure in the Gigafida 2.0 corpus
File at CLARIN.SI: GF2.0_cv_forms_robust_entire.tsv
Columns: Structure; …; Frequency (G); Most frequent forms (G); Number of unique forms (G); …; Frequency (R); Most frequent forms (R); Number of unique forms (R); Frequency (S); Most frequent forms (S); Number of unique forms (S); …; Total frequency of structure; Total number of unique forms in structure
CVCV … 14.534.446 bodo~G~2.658.670 | bilo~G~2.082.016 | bila~G~1.952.416 | … 2.133 … 8.705.186 tako~R~3.289.402 | zelo~R~1.441.970 | kako~R~1.005.454 | … 902 20.580.791 leta~S~1.847.738 | leto~S~721.493 | dela~S~638.910 | … 27.376 … 65.451.074 40.517
CVCVCV … 11.722.936 pomeni~G~439.400 | morali~G~356.788 | začeli~G~247.919 | … 9.071 … 2.898.015 veliko~R~945.305 | nikoli~R~342.994 | toliko~R~324.312 | … 1.414 18.278.733 zakona~S~273.400 | težave~S~235.504 | večina~S~227.790 | … 93.511 … 41.532.169 125.843
CVCCV … 2.810.860 boste~G~424.723 | bosta~G~273.266 | velja~G~249.697 | … 1.980 … 8.110.785 lahko~R~3.362.721 | vedno~R~1.075.127 | dobro~R~575.758 | … 1.593 15.323.635 mesto~S~628.237 | mestu~S~409.942 | koncu~S~386.446 | … 57.876 … 34.728.848 77.141
CVCVC … 4.786.214 dejal~G~467.145 | nisem~G~333.913 | moral~G~287.118 | … 2.873 … 6.535.645 nekaj~R~1.478.635 | danes~R~905.146 | potem~R~814.559 | … 802 14.371.978 letih~S~590.287 | način~S~366.438 | teden~S~282.441 | … 48.713 … 33.370.619 64.357
CCVCV … 2.977.870 pravi~G~517.845 | pride~G~142.308 | zgodi~G~91.020 | … 2.226 … 1.999.186 treba~R~715.315 | glede~R~400.382 | znova~R~244.774 | … 641 11.738.182 ljudi~S~634.017 | svetu~S~452.411 | sveta~S~372.042 | … 20.902 … 27.162.439 30.124
CVCCVCV … 3.544.254 postala~G~143.826 | zahteva~G~112.647 | končala~G~82.661 | … 6.001 … 638.847 pozneje~R~191.439 | kasneje~R~184.115 | hitreje~R~54.808 | … 1.034 10.237.910 podlagi~S~194.698 | sistema~S~150.191 |
razmere~S~137.332 | … 82.044 … 16.414.509 111.767 CVCVCCV … 907.222 dosegli~G~108.685 | dosegla~G~86.064 | temelji~G~46.051 | … 1.934 … 1.498.421 pogosto~R~235.262 | ponovno~R~146.746 | posebno~R~126.265 | … 1.828 9.401.911 začetku~S~297.915 | denarja~S~250.061 | sodišče~S~247.429 | … 57.872 … 17.183.235 76.206 CVCCVC … 1.184.668 postal~G~195.102 | mislim~G~163.430 | končal~G~102.489 | … 2.036 … 2.797.468 takrat~R~444.872 | povsem~R~426.682 | medtem~R~423.755 | … 561 9.085.989 razvoj~S~232.816 | sistem~S~231.613 | metrov~S~219.943 | … 59.858 … 18.809.535 75.910 CVC … 5.724.828 sem~G~2.397.474 | bil~G~2.389.568 | bom~G~366.374 | … 343 … 5.468.743 več~R~1.895.190 | kar~R~1.071.938 | res~R~565.140 | … 342 8.274.348 let~S~871.303 | dan~S~691.333 | del~S~614.034 | … 6.839 … 51.791.278 11.878 CCVCVCV … 4.964.744 zgodilo~G~196.054 | pravijo~G~157.886 | prihaja~G~104.613 | … 7.426 … 442.420 drugače~R~215.016 | premalo~R~100.905 | globoko~R~53.871 | … 671 7.545.220 primeru~S~329.762 | skupine~S~283.298 | število~S~256.416 | … 28.203 … 16.486.707 46.144 CVCVCVC … 2.683.772 povedal~G~470.909 | dosegel~G~148.920 | naredil~G~87.139 | … 4.348 … 567.932 posebej~R~301.006 | nikakor~R~110.573 | zapored~R~56.237 | … 514 7.357.771 besedah~S~232.300 | maribor~S~201.669 | rešitev~S~155.462 | … 53.862 … 13.810.566 72.480 CVCVCVCV … 6.653.192 povedala~G~168.913 | naredili~G~95.216 | narediti~G~90.882 | … 11.653 … 907.544 nekoliko~R~302.982 | zagotovo~R~181.171 | večinoma~R~127.004 | … 1.135 6.662.091 milijona~S~273.670 | komisija~S~138.922 | komisije~S~137.879 | … 65.013 … 17.497.409 106.749 VCCVCV … 2.322.605 uspelo~G~240.892 | ostaja~G~113.186 | ostala~G~104.858 | … 1.754 … 3.463 igrivo~R~1.197 | igraje~R~508 | ubrano~R~505 | … 201 6.388.722 občine~S~218.923 | okviru~S~199.142 | uprave~S~196.099 | … 16.244 … 8.993.892 22.278 CCVCCV … 968.699 prišlo~G~190.940 | prišli~G~159.033 | prišla~G~143.476 | … 1.150 … 672.306 skupno~R~75.556 | drugje~R~43.210 | zlahka~R~37.127 | … 781 6.190.118 
ljudje~S~380.130 | družbe~S~278.827 | mnenju~S~218.944 | … 22.773 … 11.104.323 31.185 CCVCVC … 1.814.727 prišel~G~211.310 | prejel~G~80.982 | zmagal~G~74.398 | … 2.636 … 2.595.370 skupaj~R~614.538 | precej~R~492.127 | včeraj~R~354.001 | … 434 5.422.057 primer~S~372.991 | človek~S~199.070 | trener~S~187.374 | … 22.034 … 14.343.957 30.514 CVCCVCCV … 156.239 poglejte~G~19.405 | poglejmo~G~14.489 | zapletlo~G~8.423 | … 1.167 … 1.012.997 verjetno~R~273.140 | najbolje~R~116.206 | resnično~R~74.664 | … 1.359 5.046.681 podjetja~S~357.753 | možnosti~S~286.651 | področju~S~260.325 | … 43.820 … 10.371.246 57.684 CVCVCCVC … 595.599 pojasnil~G~154.120 | zapustil~G~45.502 | zamenjal~G~42.675 | … 1.471 … 44.211 naposled~R~24.791 | popoldan~R~9.711 | dopoldan~R~5.038 | … 252 4.660.109 tolarjev~S~412.138 | minister~S~255.353 | direktor~S~255.013 | … 29.895 … 7.477.840 40.476 CCVCVCCV … 491.139 spominja~G~58.374 | prinesla~G~46.280 | prineslo~G~22.935 | … 987 … 693.290 trenutno~R~201.729 | pretežno~R~65.503 | pravilno~R~52.228 | … 916 4.402.584 stoletja~S~163.118 | trenutku~S~122.507 | projekta~S~120.657 | … 19.375 … 8.428.857 27.587 CCCVCV … 993.538 vpliva~G~62.113 | zbrali~G~60.359 | vrnila~G~51.813 | … 1.382 … 390.440 hkrati~R~328.962 | sproti~R~24.706 | strogo~R~21.663 | … 203 4.010.397 strani~S~656.325 | države~S~441.257 | država~S~280.757 | … 6.878 … 5.907.323 10.682 CCVC … 371.418 vzel~G~53.596 | stal~G~50.168 | znal~G~34.791 | … 430 … 2.376.185 zdaj~R~1.082.531 | spet~R~394.360 | prej~R~381.934 | … 262 3.894.091 svet~S~342.448 | dneh~S~208.231 | član~S~126.982 | … 9.360 … 17.254.058 13.399 CVCVCVCCV … 81.385 zagovarja~G~11.876 | pogovarja~G~6.983 | nadomesti~G~5.527 | … 976 … 305.196 dobesedno~R~42.683 | politično~R~30.762 | pozitivno~R~29.771 | … 1.193 3.731.538 policisti~S~169.647 | milijarde~S~144.291 | delovanje~S~108.318 | … 27.374 … 5.994.797 38.309 CVCCVCVC … 1.286.649 postavil~G~73.429 | nastopil~G~68.170 | zahteval~G~54.810 | … 2.907 … 50.709 dandanes~R~22.628 | 
naprodaj~R~16.584 | ravnokar~R~8.658 | … 286 3.531.243 podjetij~S~170.110 | postopek~S~130.418 | naslovom~S~105.264 | … 39.657 … 5.553.652 55.153 VCVCV … 2.520.511 imajo~G~589.974 | imeli~G~424.733 | imela~G~312.385 | … 1.230 … 419.939 okoli~R~277.924 | enako~R~126.110 | obilo~R~12.331 | … 225 3.498.335 ekipa~S~128.351 | ekipe~S~118.010 | ekipo~S~105.523 | … 14.255 … 7.469.235 19.311 VCCVCCV … 450.253 ukvarja~G~61.955 | izvedli~G~40.306 | ohranja~G~19.002 | … 759 … 837.179 izjemno~R~131.823 | uspešno~R~129.295 | izredno~R~97.268 | … 381 3.461.295 oktobra~S~181.713 | območju~S~179.720 | obdobju~S~165.198 | … 9.093 … 6.426.329 13.362 VCCVC … 548.023 ostal~G~125.029 | igral~G~119.806 | odšel~G~59.189 | … 366 … 127.949 okrog~R~95.620 | odveč~R~25.025 | osmič~R~1.949 | … 102 3.359.763 evrov~S~773.102 | otrok~S~287.121 | uspeh~S~108.385 | … 9.860 … 5.259.086 12.889 CCVCVCVCV … 2.622.056 sporočili~G~120.071 | pričakuje~G~76.613 | prihajajo~G~63.953 | … 6.043 … 103.231 praviloma~R~52.432 | pretirano~R~28.829 | slikovito~R~4.375 | … 423 3.318.872 slovenije~S~579.630 | sloveniji~S~567.029 | slovenija~S~433.985 | … 15.202 … 6.854.930 31.010 CCVCCVCV … 1.336.633 priznati~G~45.138 | prevzela~G~41.858 | predlaga~G~40.698 | … 3.695 … 171.207 spomladi~R~46.677 | vseskozi~R~33.166 | predlani~R~17.742 | … 435 3.173.310 prostora~S~146.863 | programa~S~139.877 | prostoru~S~89.878 | … 19.784 … 5.832.090 32.654 CVCC … 1.509 nost~G~599 | nost~G~148 | morš~G~62 | … 245 … 1.895.729 bolj~R~1.262.087 | manj~R~605.930 | dalj~R~24.898 | … 257 3.135.036 točk~S~242.169 | film~S~221.668 | cilj~S~155.054 | … 14.426 … 5.631.684 19.768 CVCVCCVCV … 1.240.266 nadaljuje~G~56.322 | pojasnila~G~51.945 | pojasnili~G~44.944 | … 3.949 … 364.859 popolnoma~R~192.438 | mimogrede~R~36.407 | temeljito~R~32.612 | … 580 2.868.047 republike~S~108.300 | rezultati~S~85.053 | raziskave~S~80.468 | … 32.075 … 6.168.624 52.782 CCVCCVC … 492.251 prevzel~G~72.924 | priznal~G~53.126 | spomnil~G~43.614 | … 1.283 … 462.381 
dvakrat~R~129.369 | zjutraj~R~111.789 | trikrat~R~58.405 | … 187 2.740.332 program~S~240.539 | prostor~S~213.701 | predlog~S~187.497 | … 18.348 … 4.987.399 24.621 CVCCCV … 257.624 manjka~G~51.661 | zaprli~G~29.088 | zaprla~G~15.829 | … 718 … 280.591 končno~R~125.543 | nadvse~R~51.341 | žensko~R~11.892 | … 629 2.579.937 ženske~S~174.956 | centra~S~119.042 | ženska~S~89.709 | … 17.668 … 7.412.195 24.841 CVCCVCVCV … 3.359.708 potrebuje~G~97.483 | postavili~G~80.875 | nastopila~G~61.267 | … 7.760 … 129.362 razmeroma~R~48.367 | postopoma~R~30.760 | zasluženo~R~16.048 | … 570 2.202.007 posledica~S~103.590 | posledice~S~83.853 | festivala~S~65.230 | … 40.623 … 6.454.452 72.754

1.3.6 List of lemmas by detailed consonant-vowel structure in the Gigafida 2.0 corpus
File at CLARIN.SI: GF2.0_cv_lemmas_finegrained_entire.tsv

Structure … Frequency (G) Most frequent lemmas (G) Number of all unique lemmas (G) … Frequency (R) Most frequent lemmas (R) Number of all unique lemmas (R) Frequency (S) Most frequent lemmas (S) Number of all unique lemmas (S) … Total frequency of structure Total number of all unique lemmas in structure
ZVKV … 1.499.961 moči~G~829.908 | reči~G~623.230 | meti~G~11.291 | … 118 … 733.773 nato~R~588.918 | lepo~R~141.118 | jako~R~2.069 | … 118 6.886.012 leto~S~4.967.678 | roka~S~456.944 | mati~S~123.143 | … 2.116 … 9.767.677 3.390 KVK … 211 tec~G~134 | pet~G~8 | pas~G~7 | … 31 … 130.986 tik~R~98.185 | peš~R~17.527 | kos~R~14.328 | … 63 4.083.180 čas~S~1.869.767 | pot~S~601.436 | pes~S~113.980 | … 1.159 … 11.620.246 2.113 GVZ … 276 del~G~126 | živ~G~44 | gol~G~19 | … 32 … 221.104 žal~R~168.318 | dol~R~29.840 | gor~R~19.995 | … 37 3.879.936 dan~S~1.852.418 | del~S~960.422 | dom~S~354.503 | … 449 … 4.616.246 903 KVZV … 312 poja~G~43 | pija~G~39 | poje~G~38 | … 92 … 45.462 sila~R~16.669 | čemu~R~13.421 | fino~R~4.008 | … 156 3.337.841 šola~S~576.848 | cena~S~552.731 | telo~S~259.396 | …
2.538 … 5.547.427 4.297 KVZVK … 431 sunit~G~250 | šalit~G~50 | polit~G~45 | … 39 … 4.898 širok~R~4.316 | sijoč~R~218 | pojoč~R~118 | … 75 2.714.493 konec~S~893.711 | pomoč~S~538.416 | korak~S~175.344 | … 4.449 … 3.190.138 5.458 ZVKVZ … 132 rasel~G~26 | lokaj~G~21 | mukaj~G~10 | … 43 … 1.571.663 nekaj~R~1.478.636 | nocoj~R~24.929 | napol~R~18.461 | … 80 2.483.442 način~S~516.472 | meter~S~335.227 | račun~S~214.566 | … 3.323 … 6.282.862 4.872 ZVKKV … 162.319 rasti~G~69.030 | jesti~G~61.864 | vesti~G~12.888 | … 91 … 3.503.454 lahko~R~3.488.074 | mehko~R~5.940 | rusko~R~2.697 | … 135 2.454.887 mesto~S~1.587.720 | moški~S~284.703 | laško~S~83.302 | … 2.803 … 6.808.207 4.034 GVZV … 1.584 dali~G~488 | dale~G~487 | delo~G~107 | … 79 … 1.748.613 zelo~R~1.441.970 | doma~R~261.540 | živo~R~29.929 | … 96 2.400.280 delo~S~1.611.830 | žena~S~171.289 | gora~S~130.967 | … 1.367 … 4.563.161 2.326 KZVZVK … 865 prijet~G~325 | prejet~G~228 | slovit~G~163 | … 26 … 283.435 preveč~R~282.880 | privat~R~125 | smejoč~R~107 | … 44 2.329.477 človek~S~1.543.691 | promet~S~261.144 | prenos~S~98.220 | … 1.533 … 2.690.983 1.989 KZVK … 149 sleč~G~43 | krit~G~36 | snet~G~32 | … 18 … 51.245 krat~R~40.288 | proč~R~10.362 | plus~R~218 | … 48 2.328.323 svet~S~1.216.329 | kmet~S~95.443 | klop~S~84.534 | … 1.507 … 2.760.165 2.147 KVKVZ … 604 popaj~G~221 | pesen~G~69 | hočem~G~51 | … 55 … 1.237.918 potem~R~814.559 | takoj~R~297.170 | tukaj~R~114.792 | … 99 2.125.529 pokal~S~241.804 | pesem~S~201.394 | koper~S~188.356 | … 4.365 … 5.805.741 6.213 ZVK … 5.276 reč~G~4.911 | noč~G~123 | vit~G~62 | … 26 … 2.609.355 več~R~1.895.190 | res~R~565.140 | nič~R~106.823 | … 59 2.076.143 moč~S~313.476 | noč~S~190.190 | vas~S~170.658 | … 824 … 12.379.593 1.547 ZVGV … 77 lezi~G~9 | meda~G~7 | rubi~G~7 | … 37 … 1.887 rado~R~461 | miže~R~379 | meze~R~281 | … 74 1.981.376 voda~S~485.257 | liga~S~342.171 | noga~S~166.994 | … 1.252 … 1.997.175 2.016 ZVZV … 1.121 maja~G~190 | vili~G~157 | rina~G~139 | … 109 … 1.145.033
lani~R~505.184 | malo~R~420.328 | raje~R~132.367 | … 104 1.980.583 meja~S~302.187 | mera~S~184.317 | vino~S~149.318 | … 1.837 … 3.745.958 3.254 KVZVZ … 73 halal~G~18 | coral~G~10 | haral~G~6 | … 32 … 454.078 torej~R~350.715 | komaj~R~100.681 | širom~R~2.183 | … 86 1.889.201 tolar~S~449.918 | pomen~S~173.704 | sejem~S~97.826 | … 4.228 … 2.894.367 6.089 KZVZVZVZV … 1.059 slovenija~G~965 | slovenija~G~60 | slovenija~G~11 | … 17 … 52.518 praviloma~R~52.432 | prevejano~R~21 | prirojeno~R~18 | … 24 1.780.281 slovenija~S~1.764.516 | članarina~S~10.119 | slavonija~S~1.996 | … 392 … 1.834.165 551 KVGVZ … 158 sédel~G~75 | padel~G~59 | podel~G~7 | … 17 … 347.851 sedaj~R~219.877 | tedaj~R~127.360 | kobal~R~231 | … 55 1.699.293 teden~S~683.852 | pogoj~S~259.463 | kazen~S~165.632 | … 2.296 … 2.835.411 3.175 KZVZ … 1.005 smej~G~557 | smel~G~410 | smej~G~6 | … 23 … 464.459 prej~R~381.934 | prav~R~57.387 | slej~R~24.408 | … 43 1.688.414 član~S~561.640 | kraj~S~292.693 | člen~S~216.134 | … 1.012 … 8.181.251 1.584 KVKKV … 246.161 pasti~G~199.420 | sesti~G~46.456 | sésti~G~58 | … 91 … 127.755 čisto~R~121.929 | često~R~2.648 | kasko~R~1.244 | … 108 1.605.647 točka~S~672.245 | cesta~S~450.932 | pošta~S~97.060 | … 3.122 … 3.335.676 4.433 KZVZVZ … 36 snemaj~G~11 | plavaj~G~5 | sviraj~G~3 | … 16 … 580 praven~R~284 | slaven~R~60 | kremen~R~52 | … 37 1.594.092 primer~S~905.773 | trener~S~306.400 | krajan~S~43.597 | … 1.499 … 2.005.107 2.250 VZV … 254 alu~G~73 | oni~G~34 | eva~G~20 | … 48 … 259.961 ali~R~259.351 | ino~R~236 | ilo~R~81 | … 62 1.588.752 ura~S~887.827 | ime~S~450.447 | ana~S~69.629 | … 458 … 4.902.975 1.000 KVGZVKZV … 2 podlipja~G~1 | podjetje~G~1 2 … 15.628 soglasno~R~14.074 | podjetno~R~1.153 | požrešno~R~277 | … 13 1.585.060 podjetje~S~1.001.078 | področje~S~506.606 | soglasje~S~73.687 | … 103 … 1.600.877 159 GVZVZ … 112 delaj~G~39 | bajaj~G~23 | delaj~G~9 | … 34 … 181.215 domov~R~112.483 | zunaj~R~45.254 | zaman~R~22.993 | … 54 1.547.910 denar~S~517.466 | dolar~S~252.401 |
žival~S~167.291 | … 2.090 … 2.167.450 3.078 ZVZVZ … 189 limaj~G~45 | nimen~G~29 | venel~G~18 | … 41 … 993 varen~R~459 | javen~R~136 | rumen~R~126 | … 43 1.545.697 raven~S~266.578 | junij~S~226.175 | namen~S~223.962 | … 2.684 … 3.093.884 4.040 GZGVZV … 5 drživa~G~3 | drživa~G~1 | drzalo~G~1 3 … 10 drzavo~R~6 | držana~R~1 | držovo~R~1 | … 5 1.511.358 država~S~1.502.639 | držalo~S~3.680 | grbina~S~2.458 | … 103 … 1.511.447 140 KVGVZV … 26 sodelo~G~2 | pobero~G~2 | sobili~G~2 | … 22 … 34.773 pozimi~R~34.518 | šegavo~R~181 | cagavo~R~12 | … 48 1.474.125 težava~S~621.900 | sezona~S~516.392 | podaja~S~45.839 | … 2.295 … 1.510.395 2.926 KKZVZ … 6 škval~G~3 | phral~G~1 | splel~G~1 | … 4 … 35.662 stran~R~35.649 | skraj~R~5 | fkrej~R~2 | … 9 1.436.932 stran~S~856.113 | stvar~S~376.437 | strel~S~88.996 | … 345 … 1.474.227 489 KZVG … 6 kroz~G~1 | plaz~G~1 | prad~G~1 | … 6 … 8.941 slab~R~8.801 | svež~R~93 | križ~R~30 | … 16 1.418.387 klub~S~426.105 | krog~S~379.075 | sneg~S~89.132 | … 593 … 4.249.328 860 ZVKVZV … 876 ločilo~G~696 | načelo~G~59 | nočemo~G~19 | … 51 … 443.462 nikoli~R~342.994 | jeseni~R~58.787 | veselo~R~17.991 | … 110 1.388.408 večina~S~448.952 | višina~S~208.860 | rusija~S~158.896 | … 3.578 … 1.836.923 4.724 GVKVZ … 146 basaj~G~81 | dihaj~G~17 | došel~G~9 | … 27 … 474.587 zakaj~R~323.693 | dokaj~R~87.697 | zatem~R~61.788 | … 39 1.383.064 zakon~S~706.952 | župan~S~257.742 | zapor~S~105.666 | … 1.800 … 2.148.380 2.569 ZZVGV … 2 mlodi~G~1 | vlože~G~1 2 … 96.508 mnogo~R~89.142 | ljubo~R~5.439 | mlado~R~1.606 | … 20 1.363.063 vlada~S~767.169 | vloga~S~404.771 | mreža~S~134.354 | … 227 … 1.462.242 369 GZVGV … 16 glade~G~6 | druže~G~4 | bnodo~G~2 | … 7 … 540.226 glede~R~400.382 | blizu~R~107.893 | drago~R~14.627 | … 53 1.337.527 zveza~S~506.534 | zmaga~S~477.023 | blago~S~71.047 | … 606 … 2.216.981 941

1.3.7 List of word forms by detailed consonant-vowel structure in the Gigafida 2.0 corpus
File at CLARIN.SI: GF2.0_cv_forms_finegrained_entire.tsv

Structure ... Frequency (G) Most frequent word forms (G) Number of all unique word forms (G) ... Frequency (R) Most frequent word forms (R) Number of all unique word forms (R) Frequency (S) Most frequent word forms (S) Number of all unique word forms (S) ... Total frequency of structure Total number of all unique word forms in structure
ZVKV ... 1.389.036 niso~G~1.033.011 | reči~G~63.419 | nosi~G~43.473 | ... 246 ... 727.352 nato~R~588.918 | lepo~R~128.216 | više~R~6.216 | ... 119 5.795.533 leta~S~1.847.738 | leto~S~721.493 | letu~S~393.553 | … 3.720 ... 9.865.777 5.307 KVKV ... 516.525 čaka~G~152.986 | piše~G~125.147 | hoče~G~53.903 | ... 321 ... 4.328.080 tako~R~3.289.402 | kako~R~1.005.454 | tiho~R~26.224 | ... 144 2.982.765 času~S~515.482 | časa~S~504.691 | poti~S~274.092 | … 5.006 ... 8.685.155 7.080 KVZV ... 211.853 pove~G~84.266 | širi~G~20.617 | poje~G~15.550 | ... 374 ... 71.299 huje~R~25.133 | sila~R~16.669 | čemu~R~13.421 | … 161 2.927.683 šole~S~178.671 | cene~S~154.615 | cena~S~142.620 | … 4.572 ... 6.637.525 7.064 GVZV ... 6.888.415 bilo~G~2.082.016 | bila~G~1.952.416 | bili~G~949.361 | ... 304 ... 1.750.944 zelo~R~1.441.970 | doma~R~261.540 | živo~R~29.927 | ... 100 2.840.285 dela~S~638.910 | delo~S~591.888 | delu~S~198.415 | … 2.505 ... 11.893.511 3.929 ZVKKV ... 271.774 nista~G~113.901 | veste~G~53.004 | niste~G~42.125 | ... 139 ... 3.387.940 lahko~R~3.362.721 | lepše~R~5.953 | mehko~R~5.725 | … 141 2.397.725 mesto~S~628.237 | mestu~S~409.942 | mesta~S~295.629 | … 4.792 ... 6.628.952 6.325 ZVZV ... 1.230.362 mora~G~437.226 | more~G~224.072 | nima~G~219.786 | ... 344 ... 1.137.392 lani~R~505.184 | malo~R~419.478 | raje~R~123.947 | … 112 2.195.783 maja~S~181.212 | meje~S~101.003 | mama~S~80.070 | … 3.351 ... 7.201.261 5.364 ZVK ... 43.059 veš~G~40.547 | ješ~G~1.344 | maš~G~837 | ... 38 ... 2.605.671 več~R~1.895.190 | res~R~565.140 | nič~R~106.823 | … 59 1.919.193 let~S~871.303 | moč~S~113.032 | noč~S~105.164 | … 1.012 ...
9.610.819 1.744 GVZ ... 2.902.358 bil~G~2.389.568 | bom~G~366.374 | dal~G~110.438 | ... 63 ... 218.963 žal~R~168.318 | dol~R~29.840 | gor~R~19.995 | … 35 1.883.644 dan~S~691.333 | del~S~614.034 | dom~S~113.740 | … 554 ... 5.070.655 997 KVK ... 418 teč~G~349 | kaš~G~6 | pis~G~6 | ... 41 ... 130.622 tik~R~98.185 | peš~R~17.527 | kos~R~14.328 | … 59 1.801.919 čas~S~583.014 | pot~S~289.597 | sit~S~62.994 | … 1.365 ... 8.999.081 2.296 KZVZV ... 833.029 pravi~G~517.845 | traja~G~51.995 | smemo~G~46.248 | ... 294 ... 241.786 kmalu~R~234.301 | krivo~R~2.088 | smelo~R~2.003 | … 80 1.580.029 člani~S~187.220 | člena~S~110.127 | smeri~S~100.560 | … 2.692 ... 5.189.913 4.014 ZVZVZV ... 1.924.598 morali~G~356.788 | morajo~G~238.365 | morala~G~211.284 | ... 675 ... 17.173 nemalo~R~10.527 | rumeno~R~4.248 | nanovo~R~1.180 | … 89 1.551.161 junija~S~196.186 | julija~S~146.102 | narave~S~70.670 | … 6.661 ... 3.865.188 9.602 ZVKVZ ... 807.506 nisem~G~333.913 | rekel~G~178.062 | našel~G~76.313 | ... 280 ... 1.567.368 nekaj~R~1.478.635 | nocoj~R~24.929 | napol~R~18.461 | … 46 1.445.639 način~S~366.438 | račun~S~119.270 | letom~S~93.631 | … 3.416 ... 4.463.550 4.735 GZVGV ... 177.532 gredo~G~43.686 | grozi~G~28.920 | gleda~G~26.851 | ... 119 ... 506.187 glede~R~400.382 | blizu~R~67.448 | bliže~R~15.447 | … 57 1.394.775 zmago~S~198.367 | zvezi~S~182.052 | zveze~S~172.845 | … 1.068 ... 4.262.104 1.588 KVKVZ ... 376.695 hotel~G~106.293 | kupil~G~44.573 | pisal~G~32.760 | ... 352 ... 1.237.527 potem~R~814.559 | takoj~R~297.170 | tukaj~R~114.789 | … 70 1.383.676 koper~S~126.298 | peter~S~109.949 | časom~S~86.315 | … 4.341 ... 4.905.466 5.866 VZV ... 1.074.137 ima~G~1.068.307 | ove~G~2.156 | aja~G~1.344 | ... 109 ... 259.962 ali~R~259.351 | ino~R~236 | ilo~R~81 | … 65 1.124.743 uri~S~325.698 | ure~S~186.752 | ime~S~184.107 | … 820 ... 6.652.904 1.623 KVZ ... 2.502.263 sem~G~2.397.474 | šel~G~92.005 | pel~G~7.219 | ... 79 ... 
2.238.428 kar~R~1.071.938 | tam~R~430.492 | pol~R~268.770 | … 66 877.155 cen~S~77.540 | sin~S~65.497 | šol~S~51.588 | … 992 ... 16.703.615 1.898 KVGV ... 455.462 kaže~G~229.871 | sodi~G~75.149 | pade~G~23.476 | ... 131 ... 70.879 hudo~R~58.235 | teže~R~10.343 | togo~R~1.644 | … 77 672.859 kožo~S~56.683 | fazi~S~44.667 | sobi~S~41.989 | … 2.297 ... 12.131.563 3.268 ZVG ... 2 lub~G~1 | nod~G~1 2 ... 145.174 rad~R~145.150 | ned~R~6 | lad~R~4 | … 14 607.506 mož~S~119.903 | red~S~97.550 | niz~S~44.478 | … 512 ... 4.538.462 824 ZVZ ... 189.881 vem~G~176.059 | jem~G~5.268 | mel~G~2.775 | ... 72 ... 80.223 mar~R~40.907 | ven~R~37.776 | mal~R~850 | … 51 563.858 vir~S~55.345 | mir~S~54.262 | jan~S~38.121 | … 717 ... 4.951.742 1.393 Z ... 0 0 ... 0 0 340.218 m~S~175.770 | l~S~36.410 | m~S~30.363 | … 12 ... 32.162.983 28 G ... 0 0 ... 0 0 276.154 g~S~94.640 | b~S~71.815 | d~S~35.202 | … 10 ... 8.953.950 21 V ... 0 0 ... 285 a~R~247 | a~R~33 | à~R~5 3 275.477 e~S~65.702 | a~S~47.897 | e~S~30.584 | … 35 ... 5.636.079 82 VZ ... 132 il~G~55 | em~G~45 | il~G~11 | ... 13 ... 166 in~R~44 | em~R~30 | aj~R~28 | … 11 261.943 ur~S~133.379 | al~S~33.318 | al~S~19.176 | … 121 ... 29.856.767 279 K ... 0 0 ... 0 0 259.278 c~S~61.714 | h~S~29.544 | p~S~20.785 | … 20 ... 8.220.434 46 KZV ... 2.545.100 smo~G~1.909.803 | sva~G~217.982 | šlo~G~186.950 | ... 109 ... 259.202 kje~R~164.308 | tja~R~94.572 | sno~R~52 | … 48 256.339 tla~S~60.312 | slo~S~49.761 | kri~S~36.909 | … 943 ... 7.578.512 1.669 KV ... 13.434.694 so~G~13.322.285 | si~G~111.723 | šu~G~209 | ... 54 ... 279.382 tu~R~274.975 | ke~R~3.172 | ha~R~732 | … 30 86.238 pu~S~17.228 | ha~S~8.899 | he~S~6.493 | … 244 ... 80.153.006 590 ZV ... 44.416.538 je~G~39.852.647 | ni~G~4.416.649 | ve~G~146.074 | ... 43 ... 743 la~R~406 | li~R~129 | jé~R~58 | … 15 52.703 mo~S~5.628 | li~S~4.883 | la~S~4.683 | … 157 ... 77.935.383 396 VG ... 26 az~G~19 | az~G~3 | id~G~2 | ... 5 ...
86 id~R~56 | ad~R~16 | ed~R~13 | … 4 38.920 ad~S~5.366 | ig~S~5.228 | ag~S~3.920 | … 95 ... 10.446.351 186 GV ... 12.589.686 bi~G~6.493.377 | bo~G~5.991.968 | da~G~99.128 | ... 36 ... 6.035 za~R~5.864 | ze~R~87 | dá~R~29 | … 14 24.115 go~S~6.856 | di~S~3.156 | du~S~2.587 | … 118 ... 52.797.573 273 P ... 0 0 ... 0 0 121 ‘~S~121 1 ... 186.030.725 23 ! ... 0 0 ... 0 0 0 0 ... 6.344.899 392 NN ... 0 0 ... 0 0 0 0 ... 6.196.834 108

1.3.8 List of noun lemmas in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-words-nouns-lemmas-taxonomy-entire.tsv

Lemma Lemma (lower-case) Total absolute frequency of lemma Percentage of all found lemmas Total relative frequency (per million occurrences) Absolute frequency [SSJ.T.K.N] Percentage [SSJ.T.K.N] Relative frequency [SSJ.T.K.N] Absolute frequency [SSJ.T.K.L] Percentage [SSJ.T.K.L] Relative frequency [SSJ.T.K.L] Absolute frequency [SSJ.T.D] Percentage [SSJ.T.D] Relative frequency [SSJ.T.D] Absolute frequency [SSJ.T.P.C] Percentage [SSJ.T.P.C] Relative frequency [SSJ.T.P.C] Absolute frequency [SSJ.T.P.R] Percentage [SSJ.T.P.R] Relative frequency [SSJ.T.P.R] Absolute frequency [SSJ.I] Percentage [SSJ.I] Relative frequency [SSJ.I] Absolute frequency [SSJ.T.K.S] Percentage [SSJ.T.K.S] Relative frequency [SSJ.T.K.S]
leto leto 4,967,678 1.41 % 4,377.99 0 0 % 0 58,808 0.69 % 1,480.72 11,974 0.97 % 3,026.90 2,501,788 1.45 % 4,609.71 732,438 1.33 % 3,908.05 1,531,105 1.49 % 4,815.72 131,565 1.02 % 3,064.31 čas čas 1,869,767 0.53 % 1,647.82 15 0.49 % 1,545.28 65,738 0.78 % 1,655.21 5,627 0.46 % 1,422.44 884,076 0.51 % 1,628.97 353,071 0.64 % 1,883.87 493,125 0.48 % 1,551.01 68,115 0.53 % 1,586.48 dan dan 1,852,418 0.53 % 1,632.53 4 0.13 % 412.07 63,871 0.75 % 1,608.20 7,241 0.58 % 1,830.45 911,593 0.53 % 1,679.67 285,803 0.52 % 1,524.95 544,212 0.53 % 1,711.69 39,694 0.31 % 924.52 Slovenija slovenija 1,764,516 0.50 % 1,555.06 0 0 % 0
1,408 0.02 % 35.45 11,071 0.90 % 2,798.63 894,914 0.52 % 1,648.94 193,536 0.35 % 1,032.64 645,977 0.63 % 2,031.77 17,610 0.14 % 410.16 delo delo 1,611,830 0.46 % 1,420.50 2 0.07 % 206.04 21,312 0.25 % 536.61 6,576 0.53 % 1,662.34 832,012 0.48 % 1,533.04 223,957 0.41 % 1,194.96 455,097 0.44 % 1,431.40 72,874 0.56 % 1,697.32 mesto mesto 1,587,720 0.45 % 1,399.25 0 0 % 0 30,392 0.36 % 765.24 2,457 0.20 % 621.10 831,762 0.48 % 1,532.58 187,507 0.34 % 1,000.48 494,079 0.48 % 1,554.01 41,523 0.32 % 967.12 človek človek 1,543,691 0.44 % 1,360.45 0 0 % 0 64,013 0.76 % 1,611.78 4,089 0.33 % 1,033.66 693,749 0.40 % 1,278.28 286,123 0.52 % 1,526.66 411,720 0.40 % 1,294.97 83,997 0.65 % 1,956.39 država država 1,502,639 0.43 % 1,324.27 0 0 % 0 5,437 0.06 % 136.90 9,324 0.75 % 2,357.01 761,820 0.44 % 1,403.70 149,312 0.27 % 796.68 536,561 0.52 % 1,687.62 40,185 0.31 % 935.96 svet svet 1,216,329 0.34 % 1,071.94 0 0 % 0 27,166 0.32 % 684.01 4,324 0.35 % 1,093.06 611,210 0.35 % 1,126.19 177,328 0.32 % 946.16 351,299 0.34 % 1,104.93 45,002 0.35 % 1,048.15 odstotek odstotek 1,089,884 0.31 % 960.51 0 0 % 0 898 0.01 % 22.61 583 0.05 % 147.38 558,212 0.32 % 1,028.54 82,011 0.15 % 437.58 439,901 0.43 % 1,383.60 8,279 0.06 % 192.83 Ljubljana ljubljana 1,078,589 0.31 % 950.56 0 0 % 0 2,707 0.03 % 68.16 940 0.08 % 237.62 570,072 0.33 % 1,050.40 89,473 0.16 % 477.40 398,878 0.39 % 1,254.58 16,519 0.13 % 384.75 podjetje podjetje 1,001,078 0.28 % 882.24 0 0 % 0 2,879 0.03 % 72.49 2,661 0.21 % 672.67 537,241 0.31 % 989.90 154,968 0.28 % 826.86 283,563 0.28 % 891.88 19,766 0.15 % 460.37 tekma tekma 978,012 0.28 % 861.92 0 0 % 0 1,508 0.02 % 37.97 81 0.01 % 20.48 519,746 0.30 % 957.67 45,417 0.08 % 242.33 409,852 0.40 % 1,289.09 1,408 0.01 % 32.79 del del 960,422 0.27 % 846.42 7 0.23 % 721.13 13,353 0.16 % 336.21 3,660 0.30 % 925.21 460,019 0.27 % 847.62 165,687 0.30 % 884.05 267,226 0.26 % 840.50 50,470 0.39 % 1,175.51 evro evro 952,189 0.27 % 839.16 0 0 % 0 587 0.01 % 14.78 39 0 % 9.86 403,032 
0.23 % 742.61 41,108 0.07 % 219.34 505,459 0.49 % 1,589.80 1,964 0.01 % 45.74 predsednik predsednik 935,429 0.27 % 824.39 0 0 % 0 3,383 0.04 % 85.18 2,020 0.16 % 510.63 520,594 0.30 % 959.23 65,730 0.12 % 350.71 339,004 0.33 % 1,066.26 4,698 0.04 % 109.42 primer primer 905,773 0.26 % 798.25 8 0.26 % 824.15 14,594 0.17 % 367.46 5,373 0.43 % 1,358.24 406,445 0.24 % 748.90 154,197 0.28 % 822.74 266,038 0.26 % 836.76 59,118 0.46 % 1,376.93 konec konec 893,711 0.25 % 787.62 26 0.85 % 2,678.48 29,432 0.35 % 741.07 2,089 0.17 % 528.08 428,007 0.25 % 788.63 131,299 0.24 % 700.57 277,983 0.27 % 874.33 24,875 0.19 % 579.37 ura ura 887,827 0.25 % 782.44 38 1.24 % 3,914.70 28,341 0.33 % 713.60 1,567 0.13 % 396.12 495,541 0.29 % 913.07 115,281 0.21 % 615.10 229,747 0.22 % 722.61 17,312 0.13 % 403.22 milijon milijon 863,041 0.24 % 760.59 0 0 % 0 2,587 0.03 % 65.14 561 0.04 % 141.81 472,592 0.27 % 870.78 73,218 0.13 % 390.67 308,212 0.30 % 969.41 5,871 0.04 % 136.74 otrok otrok 858,940 0.24 % 756.98 0 0 % 0 32,917 0.39 % 828.81 1,735 0.14 % 438.59 386,949 0.23 % 712.98 167,600 0.30 % 894.26 219,308 0.21 % 689.78 50,431 0.39 % 1,174.60 skupina skupina 857,991 0.24 % 756.14 0 0 % 0 6,619 0.08 % 166.66 2,360 0.19 % 596.58 424,903 0.25 % 782.91 115,348 0.21 % 615.46 268,220 0.26 % 843.62 40,541 0.31 % 944.25 stran stran 856,113 0.24 % 754.49 24 0.78 % 2,472.44 28,776 0.34 % 724.55 2,133 0.17 % 539.20 357,367 0.21 % 658.47 172,145 0.31 % 918.51 250,185 0.24 % 786.90 45,483 0.35 % 1,059.35 vlada vlada 767,169 0.22 % 676.10 0 0 % 0 1,902 0.02 % 47.89 5,609 0.45 % 1,417.89 398,609 0.23 % 734.46 56,720 0.10 % 302.64 297,430 0.29 % 935.49 6,899 0.05 % 160.69 življenje življenje 762,139 0.22 % 671.67 0 0 % 0 38,014 0.45 % 957.15 2,198 0.18 % 555.63 319,278 0.18 % 588.29 169,305 0.31 % 903.36 178,445 0.17 % 561.26 54,899 0.42 % 1,278.66 družba družba 762,071 0.22 % 671.61 0 0 % 0 6,818 0.08 % 171.67 4,866 0.39 % 1,230.07 394,846 0.23 % 727.53 79,568 0.14 % 424.55 247,831 0.24 % 779.49 28,142 
0.22 % 655.46 zakon zakon 706,952 0.20 % 623.03 1 0.03 % 103.02 4,285 0.05 % 107.89 20,577 1.66 % 5,201.64 358,907 0.21 % 661.31 70,886 0.13 % 378.22 236,331 0.23 % 743.32 15,965 0.12 % 371.84 občina občina 688,467 0.20 % 606.74 0 0 % 0 430 0.01 % 10.83 2,616 0.21 % 661.30 491,291 0.28 % 905.24 24,281 0.04 % 129.56 167,303 0.16 % 526.21 2,546 0.02 % 59.30 teden teden 683,852 0.19 % 602.68 2 0.07 % 206.04 14,002 0.17 % 352.56 904 0.07 % 228.52 351,497 0.20 % 647.66 84,244 0.15 % 449.50 225,387 0.22 % 708.90 7,816 0.06 % 182.04 prostor prostor 672,859 0.19 % 592.99 0 0 % 0 13,594 0.16 % 342.28 1,932 0.16 % 488.39 359,646 0.21 % 662.67 111,497 0.20 % 594.91 160,130 0.16 % 503.65 26,060 0.20 % 606.97 točka točka 672,245 0.19 % 592.45 0 0 % 0 3,432 0.04 % 86.41 3,686 0.30 % 931.78 305,174 0.18 % 562.30 49,829 0.09 % 265.87 294,805 0.29 % 927.24 15,319 0.12 % 356.80 program program 667,540 0.19 % 588.30 0 0 % 0 1,804 0.02 % 45.42 3,767 0.30 % 952.26 339,935 0.20 % 626.35 138,849 0.25 % 740.85 162,628 0.16 % 511.51 20,557 0.16 % 478.80

1.3.9 List of noun lower-case word forms in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-words-nouns-lowercase_forms-taxonomy-entire.tsv

Lower-case word form Lemma Lemma (lower-case) Morphosyntactic tag Total absolute frequency of lower-case word form Percentage of all found lower-case word forms Total relative frequency (per million occurrences) Absolute frequency [SSJ.T.K.N] Percentage [SSJ.T.K.N] Relative frequency [SSJ.T.K.N] Absolute frequency [SSJ.T.K.L] Percentage [SSJ.T.K.L] Relative frequency [SSJ.T.K.L] Absolute frequency [SSJ.T.D] Percentage [SSJ.T.D] Relative frequency [SSJ.T.D] Absolute frequency [SSJ.T.P.C] Percentage [SSJ.T.P.C] Relative frequency [SSJ.T.P.C] Absolute frequency [SSJ.T.P.R] Percentage [SSJ.T.P.R] Relative frequency [SSJ.T.P.R] Absolute frequency [SSJ.I] Percentage [SSJ.I] Relative frequency [SSJ.I] Absolute frequency [SSJ.T.K.S] Percentage [SSJ.T.K.S] Relative frequency [SSJ.T.K.S] msd01 msd02 msd03 msd04 msd05 msd06
leta leto leto Soser 1,636,722 0.46 % 1,442.43 0 0 % 0 10,983 0.13 % 276.54 4,242 0.34 % 1,072.33 770,441 0.45 % 1,419.59 235,053 0.43 % 1,254.17 553,616 0.54 % 1,741.27 62,387 0.48 % 1,453.07 S o s e r let leto leto Sosmr 867,250 0.25 % 764.30 0 0 % 0 18,868 0.22 % 475.08 2,308 0.19 % 583.44 446,614 0.26 % 822.92 143,528 0.26 % 765.82 235,537 0.23 % 740.83 20,395 0.16 % 475.02 S o s m r evrov evro evro Sommr 772,929 0.22 % 681.18 0 0 % 0 446 0.01 % 11.23 6 0 % 1.52 322,966 0.19 % 595.09 31,313 0.06 % 167.08 417,418 0.41 % 1,312.89 780 0.01 % 18.17 S o m m r leto leto leto Soset 671,131 0.19 % 591.46 0 0 % 0 5,973 0.07 % 150.39 1,356 0.11 % 342.78 368,949 0.21 % 679.81 90,634 0.17 % 483.59 194,845 0.19 % 612.84 9,374 0.07 % 218.33 S o s e t ljubljana Ljubljana ljubljana Slzei 628,735 0.18 % 554.10 0 0 % 0 517 0.01 % 13.02 355 0.03 % 89.74 313,025 0.18 % 576.77 31,900 0.06 % 170.21 272,890 0.27 % 858.31 10,048 0.08 % 234.03 S l z e i slovenije Slovenija slovenija Slzer 579,625 0.16 % 510.82 0 0 % 0 465 0.01 % 11.71 6,843 0.55 % 1,729.84 310,718 0.18 % 572.52 56,762 0.10 % 302.86 198,772 0.19 % 625.19 6,065 0.05 % 141.26 S l z e r odstotkov odstotek odstotek Sommr 558,438 0.16 % 492.15 0 0 % 0 578 0.01 % 14.55 299 0.02 % 75.58 311,093 0.18 % 573.21 53,776 0.10 % 286.93 187,678 0.18 % 590.30 5,014 0.04 % 116.78 S o m m r dan dan dan Sometn 557,484 0.16 % 491.31 0 0 % 0 21,876 0.26 % 550.81 1,510 0.12 % 381.71 266,761 0.15 % 491.52 104,066 0.19 % 555.26 148,971 0.14 % 468.55 14,300 0.11 % 333.06 S o m e t n letih leto leto Sosmm 548,398 0.15 % 483.30 0 0 % 0 6,669 0.08 % 167.92 1,114 0.09 % 281.61 271,679 0.16 % 500.59 82,481 0.15 % 440.09 171,339 0.17 % 538.91 15,116 0.12 % 352.07 S o s m m predsednik predsednik predsednik Somei 545,303 0.15 % 480.57 0 0 % 0 1,964 0.02 % 49.45 1,092 0.09 % 276.05 309,316 0.18 % 569.94 37,606 0.07 % 200.65 192,578
0.19 % 605.71 2,747 0.02 % 63.98 S o m e i sloveniji Slovenija slovenija Slzem 538,018 0.15 % 474.15 0 0 % 0 445 0.01 % 11.20 1,968 0.16 % 497.49 277,889 0.16 % 512.03 75,414 0.14 % 402.38 175,524 0.17 % 552.07 6,778 0.05 % 157.87 S l z e m dela delo delo Soser 521,790 0.15 % 459.85 0 0 % 0 6,757 0.08 % 170.13 2,148 0.17 % 542.99 259,793 0.15 % 478.69 71,635 0.13 % 382.22 157,578 0.15 % 495.62 23,879 0.18 % 556.17 S o s e r času čas čas Somem 508,899 0.14 % 448.49 1 0.03 % 103.02 7,697 0.09 % 193.80 1,590 0.13 % 401.93 244,428 0.14 % 450.37 80,908 0.15 % 431.70 156,353 0.15 % 491.77 17,922 0.14 % 417.42 S o m e m časa čas čas Somer 504,139 0.14 % 444.30 12 0.39 % 1,236.22 24,941 0.29 % 627.99 1,300 0.10 % 328.63 228,289 0.13 % 420.64 104,730 0.19 % 558.80 124,191 0.12 % 390.61 20,676 0.16 % 481.57 S o m e r ljudi človek človek Sommr 495,522 0.14 % 436.70 0 0 % 0 10,996 0.13 % 276.87 980 0.08 % 247.73 221,899 0.13 % 408.86 70,540 0.13 % 376.38 173,552 0.17 % 545.87 17,555 0.14 % 408.88 S o m m r milijonov milijon milijon Sommr 449,520 0.13 % 396.16 0 0 % 0 872 0.01 % 21.96 339 0.03 % 85.70 261,919 0.15 % 482.60 38,906 0.07 % 207.59 144,911 0.14 % 455.78 2,573 0.02 % 59.93 S o m m r slovenija Slovenija slovenija Slzei 433,967 0.12 % 382.45 0 0 % 0 216 0 % 5.44 1,480 0.12 % 374.13 198,662 0.12 % 366.05 38,064 0.07 % 203.10 192,470 0.19 % 605.37 3,075 0.02 % 71.62 S l z e i strani stran stran Sozem 426,530 0.12 % 375.90 5 0.16 % 515.09 13,232 0.16 % 333.17 1,066 0.09 % 269.47 181,912 0.11 % 335.18 80,834 0.15 % 431.30 129,289 0.13 % 406.65 20,192 0.16 % 470.30 S o z e m svetu svet svet Somem 420,992 0.12 % 371.02 0 0 % 0 9,213 0.11 % 231.97 800 0.07 % 202.23 196,955 0.11 % 362.90 72,303 0.13 % 385.79 128,250 0.12 % 403.38 13,471 0.10 % 313.76 S o m e m del del del Somei 416,401 0.12 % 366.97 0 0 % 0 5,866 0.07 % 147.70 1,498 0.12 % 378.68 203,386 0.12 % 374.75 72,893 0.13 % 388.93 112,102 0.11 % 352.59 20,656 0.16 % 481.10 S o m e i tolarjev tolar tolar Sommr 412,126
0.12 % 363.20 0 0 % 0 93 0 % 2.34 1,150 0.09 % 290.71 368,691 0.21 % 679.34 40,400 0.07 % 215.56 1,394 0 % 4.38 398 0 % 9.27 S o m m r delo delo delo Soset 406,221 0.12 % 358 0 0 % 0 5,879 0.07 % 148.03 1,632 0.13 % 412.55 209,805 0.12 % 386.58 56,874 0.10 % 303.46 116,542 0.11 % 366.55 15,489 0.12 % 360.76 S o s e t mestu mesto mesto Sosem 399,823 0.11 % 352.36 0 0 % 0 9,633 0.11 % 242.55 516 0.04 % 130.44 193,436 0.11 % 356.42 52,211 0.10 % 278.58 134,535 0.13 % 423.15 9,492 0.07 % 221.08 S o s e m letu leto leto Sosem 392,161 0.11 % 345.61 0 0 % 0 1,636 0.02 % 41.19 1,266 0.10 % 320.03 202,872 0.12 % 373.81 49,987 0.09 % 266.71 126,949 0.12 % 399.29 9,451 0.07 % 220.13 S o s e m mesto mesto mesto Soset 391,840 0.11 % 345.33 0 0 % 0 7,208 0.09 % 181.49 465 0.04 % 117.55 208,004 0.12 % 383.26 42,758 0.08 % 228.14 125,206 0.12 % 393.81 8,199 0.06 % 190.96 S o s e t odstotka odstotek odstotek Somer 389,774 0.11 % 343.51 0 0 % 0 38 0 % 0.96 59 0.01 % 14.91 172,689 0.10 % 318.19 14,276 0.03 % 76.17 201,759 0.20 % 634.58 953 0.01 % 22.20 S o m e r ljudje človek človek Sommi 380,121 0.11 % 335 0 0 % 0 17,127 0.20 % 431.24 1,076 0.09 % 272 172,766 0.10 % 318.33 75,675 0.14 % 403.78 92,639 0.09 % 291.37 20,838 0.16 % 485.34 S o m m i sveta svet svet Somer 371,312 0.10 % 327.24 0 0 % 0 6,223 0.07 % 156.69 1,493 0.12 % 377.41 197,463 0.12 % 363.84 43,149 0.08 % 230.23 110,129 0.11 % 346.38 12,855 0.10 % 299.41 S o m e r čas čas čas Sometn 363,924 0.10 % 320.72 2 0.07 % 206.04 16,452 0.19 % 414.24 1,309 0.11 % 330.90 168,395 0.10 % 310.28 69,561 0.13 % 371.15 94,911 0.09 % 298.52 13,294 0.10 % 309.63 S o m e t n koncu konec konec Somem 360,848 0.10 % 318.01 18 0.59 % 1,854.33 10,355 0.12 % 260.73 943 0.08 % 238.38 169,617 0.10 % 312.53 55,974 0.10 % 298.66 112,344 0.11 % 353.35 11,597 0.09 % 270.11 S o m e m dni dan dan Sommr 349,250 0.10 % 307.79 2 0.07 % 206.04 9,328 0.11 % 234.87 1,347 0.11 % 340.51 176,256 0.10 % 324.76 52,388 0.10 % 279.53 103,461 0.10 % 325.41 6,468 
0.05 % 150.65 S o m m r primeru primer primer Somem 328,726 0.09 % 289.70 0 0 % 0 4,441 0.05 % 111.82 2,241 0.18 % 566.50 143,637 0.08 % 264.66 49,093 0.09 % 261.94 114,446 0.11 % 359.96 14,868 0.12 % 346.29 S o m e m

1.3.10 List of verb lemmas in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-words-verbs-lemmas-taxonomy-entire.tsv

Lemma Lemma (lower-case) Total absolute frequency of lemma Percentage of all found lemmas Total relative frequency (per million occurrences) Absolute frequency [SSJ.T.K.N] Percentage [SSJ.T.K.N] Relative frequency [SSJ.T.K.N] Absolute frequency [SSJ.T.K.L] Percentage [SSJ.T.K.L] Relative frequency [SSJ.T.K.L] Absolute frequency [SSJ.T.D] Percentage [SSJ.T.D] Relative frequency [SSJ.T.D] Absolute frequency [SSJ.T.P.C] Percentage [SSJ.T.P.C] Relative frequency [SSJ.T.P.C] Absolute frequency [SSJ.T.P.R] Percentage [SSJ.T.P.R] Relative frequency [SSJ.T.P.R] Absolute frequency [SSJ.I] Percentage [SSJ.I] Relative frequency [SSJ.I] Absolute frequency [SSJ.T.K.S] Percentage [SSJ.T.K.S] Relative frequency [SSJ.T.K.S]
biti biti 91,521,762 45.65 % 80,657.66 130 6.79 % 13,392.40 5,035,173 46.44 % 126,780.21 232,963 38.78 % 58,890.53 43,157,916 46.56 % 79,521.31 14,021,776 42.64 % 74,815.59 26,234,349 46.57 % 82,513.86 2,839,455 39.78 % 66,134.28 imeti imeti 4,146,931 2.07 % 3,654.67 12 0.63 % 1,236.22 179,933 1.66 % 4,530.52 14,728 2.45 % 3,723.08 1,929,486 2.08 % 3,555.21 785,754 2.39 % 4,192.53 1,085,001 1.93 % 3,412.61 152,017 2.13 % 3,540.66 morati morati 2,048,887 1.02 % 1,805.67 6 0.31 % 618.11 93,053 0.86 % 2,342.97 10,982 1.83 % 2,776.13 960,889 1.04 % 1,770.50 336,918 1.02 % 1,797.68 558,017 0.99 % 1,755.11 89,022 1.25 % 2,073.43 iti iti 1,422,557 0.71 % 1,253.69 1 0.05 % 103.02 87,499 0.81 % 2,203.13 4,730 0.79 % 1,195.69 635,556 0.69 % 1,171.05 245,607 0.75 % 1,310.48 407,923 0.72 % 1,283.02 41,241 0.58 % 960.55 začeti začeti
1,138,044 0.57 % 1,002.95 5 0.26 % 515.09 48,368 0.45 % 1,217.85 3,141 0.52 % 794.01 542,728 0.59 % 1,000.01 183,809 0.56 % 980.74 324,275 0.58 % 1,019.93 35,718 0.50 % 831.91 priti priti 1,046,777 0.52 % 922.52 2 0.10 % 206.04 71,696 0.66 % 1,805.23 3,067 0.51 % 775.30 476,007 0.51 % 877.07 175,716 0.53 % 937.56 286,526 0.51 % 901.20 33,763 0.47 % 786.38 povedati povedati 970,784 0.48 % 855.55 0 0 % 0 54,069 0.50 % 1,361.40 2,102 0.35 % 531.36 490,903 0.53 % 904.52 114,814 0.35 % 612.61 292,099 0.52 % 918.73 16,797 0.23 % 391.22 dobiti dobiti 966,276 0.48 % 851.57 25 1.31 % 2,575.46 24,925 0.23 % 627.58 2,237 0.37 % 565.49 498,593 0.54 % 918.69 162,196 0.49 % 865.42 253,287 0.45 % 796.65 25,013 0.35 % 582.58 želeti želeti 916,972 0.46 % 808.12 12 0.63 % 1,236.22 26,786 0.25 % 674.44 2,633 0.44 % 665.59 392,752 0.42 % 723.67 153,351 0.47 % 818.23 311,583 0.55 % 980.01 29,855 0.42 % 695.36 vedeti vedeti 869,903 0.43 % 766.64 1 0.05 % 103.02 115,253 1.06 % 2,901.95 3,084 0.51 % 779.60 365,237 0.39 % 672.97 174,422 0.53 % 930.66 183,677 0.33 % 577.71 28,229 0.40 % 657.49 moči moči 829,908 0.41 % 731.39 0 0 % 0 65,444 0.60 % 1,647.81 3,619 0.60 % 914.84 382,779 0.41 % 705.30 143,731 0.44 % 766.90 196,963 0.35 % 619.50 37,372 0.52 % 870.44 videti videti 729,726 0.36 % 643.10 0 0 % 0 94,473 0.87 % 2,378.73 1,702 0.28 % 430.25 273,372 0.29 % 503.71 146,780 0.45 % 783.17 178,717 0.32 % 562.11 34,682 0.49 % 807.79 praviti praviti 720,199 0.36 % 634.71 0 0 % 0 19,047 0.18 % 479.58 1,668 0.28 % 421.65 379,505 0.41 % 699.26 122,357 0.37 % 652.86 180,251 0.32 % 566.94 17,371 0.24 % 404.59 postati postati 640,874 0.32 % 564.80 8 0.42 % 824.15 24,724 0.23 % 622.52 1,981 0.33 % 500.78 280,862 0.30 % 517.51 131,659 0.40 % 702.49 166,830 0.30 % 524.72 34,810 0.49 % 810.77 reči reči 623,230 0.31 % 549.25 0 0 % 0 143,415 1.32 % 3,611.03 3,775 0.63 % 954.28 220,496 0.24 % 406.28 130,052 0.40 % 693.91 102,237 0.18 % 321.56 23,255 0.33 % 541.64 dejati dejati 588,836 0.29 % 518.94 0 0 % 0 
11,125 0.10 % 280.12 129 0.02 % 32.61 238,385 0.26 % 439.24 24,514 0.07 % 130.80 310,945 0.55 % 978 3,738 0.05 % 87.06 pomeniti pomeniti 572,293 0.28 % 504.36 1 0.05 % 103.02 15,176 0.14 % 382.12 2,098 0.35 % 530.35 265,611 0.29 % 489.41 114,402 0.35 % 610.41 146,456 0.26 % 460.64 28,549 0.40 % 664.94 ostati ostati 562,250 0.28 % 495.51 1 0.05 % 103.02 26,621 0.25 % 670.29 1,527 0.25 % 386.01 268,949 0.29 % 495.56 90,209 0.27 % 481.33 157,481 0.28 % 495.32 17,462 0.24 % 406.71 dati dati 532,058 0.27 % 468.90 31 1.62 % 3,193.57 38,456 0.35 % 968.28 3,258 0.54 % 823.59 242,506 0.26 % 446.83 100,968 0.31 % 538.73 125,248 0.22 % 393.94 21,591 0.30 % 502.88 odločiti odločiti 511,266 0.26 % 450.58 0 0 % 0 11,502 0.11 % 289.61 1,705 0.28 % 431.01 253,119 0.27 % 466.39 84,393 0.26 % 450.29 150,035 0.27 % 471.90 10,512 0.15 % 244.84 najti najti 509,948 0.25 % 449.41 1 0.05 % 103.02 30,279 0.28 % 762.39 1,136 0.19 % 287.17 213,191 0.23 % 392.82 113,686 0.35 % 606.59 127,739 0.23 % 401.77 23,916 0.34 % 557.03 igrati igrati 507,131 0.25 % 446.93 0 0 % 0 10,320 0.10 % 259.85 250 0.04 % 63.20 259,647 0.28 % 478.42 72,953 0.22 % 389.25 157,472 0.28 % 495.29 6,489 0.09 % 151.14 doseči doseči 504,161 0.25 % 444.31 2 0.10 % 206.04 5,633 0.05 % 141.83 968 0.16 % 244.70 251,182 0.27 % 462.82 61,834 0.19 % 329.93 166,863 0.30 % 524.83 17,679 0.25 % 411.76 hoteti hoteti 473,951 0.24 % 417.69 4 0.21 % 412.07 66,334 0.61 % 1,670.22 1,878 0.31 % 474.74 213,028 0.23 % 392.52 94,743 0.29 % 505.52 78,178 0.14 % 245.89 19,786 0.28 % 460.84 narediti narediti 471,071 0.23 % 415.15 8 0.42 % 824.15 26,480 0.24 % 666.74 1,294 0.21 % 327.11 200,417 0.22 % 369.28 101,403 0.31 % 541.05 124,061 0.22 % 390.20 17,408 0.24 % 405.45 govoriti govoriti 450,094 0.22 % 396.67 0 0 % 0 37,708 0.35 % 949.45 2,215 0.37 % 559.93 204,344 0.22 % 376.52 77,505 0.24 % 413.54 107,171 0.19 % 337.08 21,151 0.30 % 492.63 delati delati 437,814 0.22 % 385.84 1 0.05 % 103.02 19,899 0.18 % 501.04 1,349 0.23 % 341.01 203,224 
0.22 % 374.45 83,284 0.25 % 444.38 115,508 0.20 % 363.30 14,549 0.20 % 338.86
pričakovati pričakovati 429,229 0.21 % 378.28 0 0 % 0 9,942 0.09 % 250.33 536 0.09 % 135.50 221,371 0.24 % 407.89 54,496 0.17 % 290.77 135,527 0.24 % 426.27 7,357 0.10 % 171.35
pomagati pomagati 425,110 0.21 % 374.65 1 0.05 % 103.02 18,730 0.17 % 471.60 929 0.15 % 234.84 187,285 0.20 % 345.08 86,740 0.26 % 462.82 109,931 0.20 % 345.76 21,494 0.30 % 500.62
uspeti uspeti 416,462 0.21 % 367.03 0 0 % 0 11,945 0.11 % 300.76 441 0.07 % 111.48 202,705 0.22 % 373.50 63,264 0.19 % 337.56 130,990 0.23 % 412 7,117 0.10 % 165.76
kazati kazati 414,236 0.21 % 365.06 1 0.05 % 103.02 9,240 0.09 % 232.65 811 0.14 % 205.01 204,947 0.22 % 377.63 64,708 0.20 % 345.26 114,799 0.20 % 361.07 19,730 0.28 % 459.54
pokazati pokazati 412,679 0.21 % 363.69 0 0 % 0 17,416 0.16 % 438.52 714 0.12 % 180.49 195,963 0.21 % 361.07 69,548 0.21 % 371.09 112,955 0.20 % 355.27 16,083 0.23 % 374.59

1.3.11 List of verb lower-case word forms in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-words-verbs-lowercase_forms-taxonomy-entire.tsv
Columns: Lower-case word form; Lemma; Lemma (lower-case); Morphosyntactic tag; Total absolute frequency of lower-case word form; Percentage of all found lower-case word forms; Total relative frequency (per million occurrences); Absolute frequency [SSJ.T.K.N]; Percentage [SSJ.T.K.N]; Relative frequency [SSJ.T.K.N]; Absolute frequency [SSJ.T.K.L]; Percentage [SSJ.T.K.L]; Relative frequency [SSJ.T.K.L]; Absolute frequency [SSJ.T.D]; Percentage [SSJ.T.D]; Relative frequency [SSJ.T.D]; Absolute frequency [SSJ.T.P.C]; Percentage [SSJ.T.P.C]; Relative frequency [SSJ.T.P.C]; Absolute frequency [SSJ.T.P.R]; Percentage [SSJ.T.P.R]; Relative frequency [SSJ.T.P.R]; Absolute frequency [SSJ.I]; Percentage [SSJ.I]; Relative frequency [SSJ.I]; Absolute frequency [SSJ.T.K.S]; Percentage [SSJ.T.K.S]; Relative frequency [SSJ.T.K.S]; msd01; msd02; msd03; msd04; msd05; msd06; msd07; msd08
je biti biti Gp-ste-n 39,852,647 19.88 % 35,121.94 53 2.77 % 5,459.98 2,355,179 21.72 % 59,300.86 95,246 15.86 % 24,077.16 18,126,598 19.56 % 33,399.46 6,003,428 18.26 % 32,032.32 11,999,560 21.30 % 37,741.74 1,272,583 17.83 % 29,639.97 G p - s t e - n
so biti biti Gp-stm-n 13,322,285 6.64 % 11,740.86 25 1.31 % 2,575.46 381,659 3.52 % 9,609.76 33,146 5.52 % 8,378.95 6,609,847 7.13 % 12,179.08 1,974,260 6.00 % 10,534 3,829,327 6.80 % 12,044.23 494,021 6.92 % 11,506.34 G p - s t m - n
bi biti biti Gp-g 6,493,377 3.24 % 5,722.58 4 0.21 % 412.07 388,499 3.58 % 9,781.98 19,280 3.21 % 4,873.78 3,209,848 3.46 % 5,914.36 1,006,147 3.06 % 5,368.47 1,665,174 2.96 % 5,237.41 204,425 2.86 % 4,761.30 G p - g
bo biti biti Gp-pte-n 5,991,968 2.99 % 5,280.69 9 0.47 % 927.17 165,708 1.53 % 4,172.35 10,939 1.82 % 2,765.26 3,054,359 3.29 % 5,627.86 827,835 2.52 % 4,417.06 1,837,272 3.26 % 5,778.70 95,846 1.34 % 2,232.37 G p - p t e - n
ni biti biti Gp-ste-d 4,416,649 2.20 % 3,892.37 0 0 % 0 283,726 2.62 % 7,143.91 15,184 2.53 % 3,838.35 2,061,015 2.22 % 3,797.56 719,621 2.19 % 3,839.66 1,202,738 2.13 % 3,782.92 134,365 1.88 % 3,129.52 G p - s t e - d
bodo biti biti Gp-ptm-n 2,658,670 1.33 % 2,343.07 4 0.21 % 412.07 31,054 0.29 % 781.91 3,637 0.61 % 919.39 1,471,149 1.59 % 2,710.69 302,865 0.92 % 1,615.99 814,748 1.45 % 2,562.59 35,213 0.49 % 820.15 G p - p t m - n
sem biti biti Gp-spe-n 2,397,474 1.20 % 2,112.88 0 0 % 0 402,492 3.71 % 10,134.31 6,808 1.13 % 1,720.99 928,864 1.00 % 1,711.49 454,648 1.38 % 2,425.85 534,969 0.95 % 1,682.62 69,693 0.98 % 1,623.23 G p - s p e - n
bil biti biti Gp-d-em 2,389,568 1.19 % 2,105.91 0 0 % 0 167,677 1.55 % 4,221.93 6,270 1.04 % 1,584.99 1,107,569 1.20 % 2,040.77 323,764 0.98 % 1,727.50 713,849 1.27 % 2,245.24 70,439 0.99 % 1,640.61 G p - d - e m
bilo biti biti Gp-d-es 2,082,014 1.04 % 1,834.87 1 0.05 % 103.02 149,980 1.38 % 3,776.33 6,314 1.05 % 1,596.11 1,019,431 1.10 % 1,878.37 298,941 0.91 % 1,595.05 544,247 0.97 % 1,711.80 63,100 0.88 % 1,469.67 G p - d - e s
smo biti biti Gp-spm-n 1,909,803 0.95 % 1,683.10 0 0 % 0 44,381 0.41 % 1,117.47 5,673 0.94 % 1,434.07 994,108 1.07 % 1,831.71 348,648 1.06 % 1,860.27 471,393 0.84 % 1,482.65 45,600 0.64 % 1,062.08 G p - s p m - n
sta biti biti Gp-std-n 1,867,979 0.93 % 1,646.24 2 0.10 % 206.04 96,319 0.89 % 2,425.21 2,858 0.48 % 722.47 879,215 0.95 % 1,620.01 274,500 0.83 % 1,464.64 561,951 1.00 % 1,767.48 53,134 0.74 % 1,237.55 G p - s t d - n
bila biti biti Gp-d-ez 1,713,345 0.85 % 1,509.96 0 0 % 0 126,623 1.17 % 3,188.23 5,134 0.85 % 1,297.82 790,142 0.85 % 1,455.89 258,265 0.79 % 1,378.02 476,604 0.85 % 1,499.04 56,577 0.79 % 1,317.75 G p - d - e z
ima imeti imeti Ggnste-n 1,068,307 0.53 % 941.49 5 0.26 % 515.09 28,747 0.27 % 723.82 4,891 0.81 % 1,236.39 479,110 0.52 % 882.79 225,335 0.69 % 1,202.31 286,154 0.51 % 900.03 44,065 0.62 % 1,026.33 G g n s t e - n
niso biti biti Gp-stm-d 1,033,011 0.52 % 910.39 2 0.10 % 206.04 29,827 0.28 % 751.01 3,574 0.59 % 903.47 525,901 0.57 % 969.01 152,951 0.47 % 816.10 285,727 0.51 % 898.69 35,029 0.49 % 815.87 G p - s t m - d
bili biti biti Gp-d-mm 900,631 0.45 % 793.72 1 0.05 % 103.02 36,884 0.34 % 928.70 2,774 0.46 % 701.24 451,245 0.49 % 831.45 132,844 0.40 % 708.81 244,580 0.43 % 769.27 32,303 0.45 % 752.38 G p - d - m m
gre iti iti Ggvste 814,855 0.41 % 718.13 1 0.05 % 103.02 19,458 0.18 % 489.93 2,827 0.47 % 714.64 383,213 0.41 % 706.10 137,971 0.42 % 736.17 247,421 0.44 % 778.20 23,964 0.34 % 558.15 G g v s t e
bomo biti biti Gp-ppm-n 651,882 0.33 % 574.50 0 0 % 0 14,951 0.14 % 376.45 2,964 0.49 % 749.27 340,630 0.37 % 627.63 110,301 0.34 % 588.53 163,942 0.29 % 515.64 19,094 0.27 % 444.72 G p - p p m - n
imajo imeti imeti Ggnstm-n 589,974 0.29 % 519.94 1 0.05 % 103.02 8,325 0.08 % 209.61 2,572 0.43 % 650.17 294,609 0.32 % 542.84 109,963 0.33 % 586.73 145,513 0.26 % 457.68 28,991 0.41 % 675.23 G g n s t m - n
pravi praviti praviti Ggvste 517,845 0.26 % 456.37 0 0 % 0 10,488 0.10 % 264.08 1,229 0.20 % 310.68 278,042 0.30 % 512.31 85,933 0.26 % 458.51 131,247 0.23 % 412.81 10,906 0.15 % 254.01 G g v s t e
biti biti biti Gp-n 505,608 0.25 % 445.59 4 0.21 % 412.07 21,006 0.19 % 528.91 2,919 0.49 % 737.89 223,975 0.24 % 412.69 96,164 0.29 % 513.10 135,919 0.24 % 427.50 25,621 0.36 % 596.74 G p - n
povedal povedati povedati Ggdd-em 470,909 0.23 % 415.01 0 0 % 0 17,662 0.16 % 444.71 699 0.12 % 176.70 256,636 0.28 % 472.87 30,466 0.09 % 162.56 162,162 0.29 % 510.04 3,284 0.05 % 76.49 G g d d - e m
dejal dejati dejati Ggdd-em 467,145 0.23 % 411.69 0 0 % 0 7,833 0.07 % 197.23 81 0.01 % 20.48 196,501 0.21 % 362.07 16,632 0.05 % 88.74 243,317 0.43 % 765.30 2,781 0.04 % 64.77 G g d d - e m
imel imeti imeti Ggnd-em 456,762 0.23 % 402.54 0 0 % 0 42,126 0.39 % 1,060.69 950 0.16 % 240.15 202,873 0.22 % 373.81 66,132 0.20 % 352.86 131,752 0.23 % 414.39 12,929 0.18 % 301.13 G g n d - e m
pomeni pomeniti pomeniti Ggvste 437,749 0.22 % 385.79 1 0.05 % 103.02 9,039 0.08 % 227.59 1,572 0.26 % 397.38 201,302 0.22 % 370.91 91,360 0.28 % 487.47 112,095 0.20 % 352.57 22,380 0.31 % 521.26 G g v s t e
mora morati morati Ggnste 437,226 0.22 % 385.33 3 0.16 % 309.06 13,897 0.13 % 349.91 5,423 0.90 % 1,370.88 196,118 0.21 % 361.36 77,719 0.24 % 414.68 120,038 0.21 % 377.55 24,028 0.34 % 559.64 G g n s t e
boste biti biti Gp-pdm-n 424,723 0.21 % 374.31 8 0.42 % 824.15 11,314 0.10 % 284.87 1,143 0.19 % 288.94 144,008 0.15 % 265.34 176,813 0.54 % 943.42 60,928 0.11 % 191.63 30,509 0.43 % 710.59 G p - p d m - n
imeli imeti imeti Ggnd-mm 419,948 0.21 % 370.10 0 0 % 0 11,505 0.11 % 289.68 984 0.16 % 248.74 224,931 0.24 % 414.45 62,534 0.19 % 333.66 108,880 0.19 % 342.46 11,114 0.16 % 258.86 G g n d - m m
ste biti biti Gp-sdm-n 389,782 0.19 % 343.51 16 0.84 % 1,648.30 23,529 0.22 % 592.43 1,606 0.27 % 405.98 155,988 0.17 % 287.42 111,345 0.34 % 594.10 74,884 0.13 % 235.53 22,414 0.31 % 522.05 G p - s d m - n
bile biti biti Gp-d-mz 375,232 0.19 % 330.69 1 0.05 % 103.02 20,494 0.19 % 516.02 1,483 0.25 % 374.89 178,647 0.19 % 329.17 57,238 0.17 % 305.40 99,724 0.18 % 313.66 17,645 0.25 % 410.97 G p - d - m z
bom biti biti Gp-ppe-n 366,374 0.18 % 322.88 0 0 % 0 59,430 0.55 % 1,496.38 1,893 0.32 % 478.53 150,679 0.16 % 277.64 60,672 0.18 % 323.73 83,377 0.15 % 262.24 10,323 0.14 % 240.43 G p - p p e - n
morali morati morati Ggnd-mm 354,484 0.18 % 312.40 0 0 % 0 7,601 0.07 % 191.38 649 0.11 % 164.06 194,917 0.21 % 359.15 48,161 0.15 % 256.97 93,416 0.17 % 293.82 9,740 0.14 % 226.86 G g n d - m m
nisem biti biti Gp-spe-d 333,913 0.17 % 294.28 0 0 % 0 57,684 0.53 % 1,452.42 857 0.14 % 216.64 129,813 0.14 % 239.19 61,255 0.19 % 326.84 76,320 0.14 % 240.05 7,984 0.11 % 185.96 G p - s p e - d

1.3.12 List of adjective lemmas in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-words-adjectives-lemmas-taxonomy-entire.tsv
Columns: Lemma; Lemma (lower-case); Total absolute frequency of lemma; Percentage of all found lemmas; Total relative frequency (per million occurrences); Absolute frequency [SSJ.T.K.N]; Percentage [SSJ.T.K.N]; Relative frequency [SSJ.T.K.N]; Absolute frequency [SSJ.T.K.L]; Percentage [SSJ.T.K.L]; Relative frequency [SSJ.T.K.L]; Absolute frequency [SSJ.T.D]; Percentage [SSJ.T.D]; Relative frequency [SSJ.T.D]; Absolute frequency [SSJ.T.P.C]; Percentage [SSJ.T.P.C]; Relative frequency [SSJ.T.P.C]; Absolute frequency [SSJ.T.P.R]; Percentage [SSJ.T.P.R]; Relative frequency [SSJ.T.P.R]; Absolute frequency [SSJ.I]; Percentage [SSJ.I]; Relative frequency [SSJ.I]; Absolute frequency [SSJ.T.K.S]; Percentage [SSJ.T.K.S]; Relative frequency [SSJ.T.K.S]
velik velik 2,652,565 2.04 % 2,337.69 5 0.58 % 515.09 51,414 1.72 % 1,294.55 5,866 1.23 % 1,482.86 1,283,822 2.02 % 2,365.53 485,663 2.24 % 2,591.34 724,281 1.99 % 2,278.05 101,514 1.92 % 2,364.38
nov nov 2,584,746 1.99 % 2,277.92 1 0.12 % 103.02 31,228 1.04 % 786.29 5,052 1.06 % 1,277.09 1,343,551 2.12 % 2,475.58 430,154
1.99 % 2,295.16 711,383 1.95 % 2,237.48 63,377 1.20 % 1,476.13 slovenski slovenski 1,880,807 1.44 % 1,657.55 0 0 % 0 3,004 0.10 % 75.64 2,992 0.63 % 756.35 1,056,514 1.67 % 1,946.70 207,876 0.96 % 1,109.16 582,162 1.60 % 1,831.05 28,259 0.54 % 658.19 dober dober 1,837,252 1.41 % 1,619.16 3 0.35 % 309.06 40,204 1.34 % 1,012.29 2,465 0.52 % 623.13 898,379 1.42 % 1,655.32 314,636 1.45 % 1,678.79 540,724 1.49 % 1,700.72 40,841 0.77 % 951.24 zadnji zadnji 1,254,851 0.96 % 1,105.89 3 0.35 % 309.06 26,159 0.87 % 658.66 1,842 0.39 % 465.64 634,771 1.00 % 1,169.61 170,972 0.79 % 912.25 400,657 1.10 % 1,260.17 20,447 0.39 % 476.23 sam sam 1,096,267 0.84 % 966.13 2 0.23 % 206.04 65,317 2.18 % 1,644.61 4,168 0.87 % 1,053.63 485,995 0.77 % 895.48 228,841 1.06 % 1,221.02 258,335 0.71 % 812.53 53,609 1.02 % 1,248.62 evropski evropski 952,492 0.73 % 839.43 0 0 % 0 1,068 0.04 % 26.89 1,646 0.34 % 416.09 476,018 0.75 % 877.09 83,115 0.38 % 443.47 374,614 1.03 % 1,178.26 16,031 0.30 % 373.38 star star 819,239 0.63 % 721.99 0 0 % 0 40,235 1.34 % 1,013.07 1,669 0.35 % 421.91 428,688 0.68 % 789.89 141,731 0.65 % 756.23 175,740 0.48 % 552.75 31,176 0.59 % 726.13 visok visok 785,457 0.60 % 692.22 1 0.12 % 103.02 15,006 0.50 % 377.83 2,003 0.42 % 506.34 386,467 0.61 % 712.09 116,871 0.54 % 623.59 235,448 0.65 % 740.55 29,661 0.56 % 690.84 pomemben pomemben 785,444 0.60 % 692.21 0 0 % 0 12,253 0.41 % 308.52 2,080 0.43 % 525.80 369,088 0.58 % 680.07 142,611 0.66 % 760.93 213,803 0.59 % 672.47 45,609 0.86 % 1,062.29 državen državen 727,558 0.56 % 641.19 0 0 % 0 2,144 0.07 % 53.98 6,431 1.35 % 1,625.69 424,876 0.67 % 782.86 53,346 0.25 % 284.64 231,274 0.64 % 727.42 9,487 0.18 % 220.96 mlad mlad 710,490 0.55 % 626.15 0 0 % 0 19,798 0.66 % 498.49 901 0.19 % 227.76 384,080 0.61 % 707.69 102,434 0.47 % 546.55 187,441 0.52 % 589.55 15,836 0.30 % 368.84 svetoven svetoven 701,704 0.54 % 618.41 0 0 % 0 2,333 0.08 % 58.74 920 0.19 % 232.57 354,281 0.56 % 652.79 71,182 0.33 % 379.80 261,527 0.72 % 
822.57 11,461 0.22 % 266.94 različen različen 601,579 0.46 % 530.17 10 1.16 % 1,030.18 6,101 0.20 % 153.62 1,707 0.36 % 431.51 265,022 0.42 % 488.32 132,380 0.61 % 706.34 147,499 0.41 % 463.92 48,860 0.93 % 1,138.01 javen javen 599,038 0.46 % 527.93 0 0 % 0 2,139 0.07 % 53.86 4,421 0.93 % 1,117.58 285,690 0.45 % 526.40 46,128 0.21 % 246.12 245,425 0.67 % 771.93 15,235 0.29 % 354.84 majhen majhen 586,026 0.45 % 516.46 11 1.28 % 1,133.20 22,105 0.74 % 556.58 1,989 0.42 % 502.80 260,722 0.41 % 480.40 146,093 0.67 % 779.50 121,483 0.33 % 382.10 33,623 0.64 % 783.12 mogoč mogoč 549,842 0.42 % 484.57 0 0 % 0 24,830 0.83 % 625.19 2,511 0.53 % 634.75 266,544 0.42 % 491.12 98,624 0.46 % 526.23 129,413 0.36 % 407.04 27,920 0.53 % 650.29 ameriški ameriški 546,670 0.42 % 481.78 0 0 % 0 2,955 0.10 % 74.40 553 0.12 % 139.79 269,453 0.42 % 496.48 74,371 0.34 % 396.82 191,053 0.53 % 600.91 8,285 0.16 % 192.97 domač domač 514,492 0.40 % 453.42 1 0.12 % 103.02 5,007 0.17 % 126.07 1,010 0.21 % 255.32 290,270 0.46 % 534.84 65,068 0.30 % 347.18 143,722 0.40 % 452.04 9,414 0.18 % 219.26 pravi pravi 508,554 0.39 % 448.19 1 0.12 % 103.02 19,024 0.64 % 479 1,069 0.22 % 270.23 238,097 0.38 % 438.71 115,643 0.53 % 617.03 117,969 0.32 % 371.04 16,751 0.32 % 390.15 glaven glaven 504,817 0.39 % 444.89 0 0 % 0 8,165 0.27 % 205.59 1,532 0.32 % 387.27 252,083 0.40 % 464.48 79,633 0.37 % 424.90 142,430 0.39 % 447.98 20,974 0.40 % 488.51 leten leten 497,066 0.38 % 438.06 0 0 % 0 930 0.03 % 23.42 819 0.17 % 207.03 258,476 0.41 % 476.26 37,560 0.17 % 200.41 195,112 0.54 % 613.68 4,169 0.08 % 97.10 poseben poseben 491,903 0.38 % 433.51 1 0.12 % 103.02 7,888 0.26 % 198.61 3,320 0.69 % 839.26 240,439 0.38 % 443.02 100,029 0.46 % 533.72 117,371 0.32 % 369.16 22,855 0.43 % 532.32 znan znan 482,303 0.37 % 425.05 0 0 % 0 6,258 0.21 % 157.57 949 0.20 % 239.90 239,747 0.38 % 441.75 92,995 0.43 % 496.19 127,889 0.35 % 402.24 14,465 0.27 % 336.91 slab slab 456,819 0.35 % 402.59 0 0 % 0 11,344 0.38 % 285.63 522 
0.11 % 131.96 222,346 0.35 % 409.69 74,623 0.34 % 398.16 136,968 0.38 % 430.80 11,016 0.21 % 256.58
številen številen 440,492 0.34 % 388.20 1 0.12 % 103.02 3,801 0.13 % 95.71 536 0.11 % 135.50 199,765 0.32 % 368.08 74,277 0.34 % 396.32 143,497 0.39 % 451.34 18,615 0.35 % 433.57
mednaroden mednaroden 438,511 0.34 % 386.46 0 0 % 0 889 0.03 % 22.38 2,246 0.47 % 567.76 237,200 0.37 % 437.06 40,842 0.19 % 217.92 150,001 0.41 % 471.79 7,333 0.14 % 170.79
nekdanji nekdanji 438,100 0.34 % 386.10 0 0 % 0 3,153 0.10 % 79.39 321 0.07 % 81.15 223,487 0.35 % 411.79 46,215 0.21 % 246.59 159,504 0.44 % 501.68 5,420 0.10 % 126.24
prihodnji prihodnji 435,770 0.34 % 384.04 0 0 % 0 1,515 0.05 % 38.15 524 0.11 % 132.46 255,001 0.40 % 469.86 35,529 0.16 % 189.57 139,896 0.38 % 440.01 3,305 0.06 % 76.98
političen političen 433,419 0.33 % 381.97 0 0 % 0 2,210 0.07 % 55.65 1,473 0.31 % 372.36 232,367 0.37 % 428.15 50,096 0.23 % 267.30 128,308 0.35 % 403.56 18,965 0.36 % 441.72
lep lep 420,724 0.32 % 370.78 0 0 % 0 23,733 0.79 % 597.57 2,813 0.59 % 711.10 198,767 0.31 % 366.24 101,128 0.47 % 539.59 81,862 0.23 % 257.48 12,421 0.23 % 289.30
letošnji letošnji 419,649 0.32 % 369.83 0 0 % 0 181 0.01 % 4.56 258 0.05 % 65.22 248,400 0.39 % 457.69 43,478 0.20 % 231.98 126,783 0.35 % 398.77 549 0.01 % 12.79

1.3.13 List of adjective lower-case word forms in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-words-adjectives-lowercase_forms-taxonomy-entire.tsv
Columns: Lower-case word form; Lemma; Lemma (lower-case); Morphosyntactic tag; Total absolute frequency of lower-case word form; Percentage of all found lower-case word forms; Total relative frequency (per million occurrences); Absolute frequency [SSJ.T.K.N]; Percentage [SSJ.T.K.N]; Relative frequency [SSJ.T.K.N]; Absolute frequency [SSJ.T.K.L]; Percentage [SSJ.T.K.L]; Relative frequency [SSJ.T.K.L]; Absolute frequency [SSJ.T.D]; Percentage [SSJ.T.D]; Relative frequency [SSJ.T.D]; Absolute frequency [SSJ.T.P.C]; Percentage [SSJ.T.P.C]; Relative frequency [SSJ.T.P.C]; Absolute frequency [SSJ.T.P.R]; Percentage [SSJ.T.P.R]; Relative frequency [SSJ.T.P.R]; Absolute frequency [SSJ.I]; Percentage [SSJ.I]; Relative frequency [SSJ.I]; Absolute frequency [SSJ.T.K.S]; Percentage [SSJ.T.K.S]; Relative frequency [SSJ.T.K.S]; msd01; msd02; msd03; msd04; msd05; msd06; msd07
mogoče mogoč mogoč Ppnsei 489,677 0.38 % 431.55 0 0 % 0 23,105 0.77 % 581.76 2,308 0.48 % 583.44 237,585 0.38 % 437.77 87,471 0.40 % 466.72 113,792 0.31 % 357.91 25,416 0.48 % 591.97 P p n s e i
sam sam sam Ppnmein 384,412 0.29 % 338.78 0 0 % 0 27,439 0.92 % 690.88 1,440 0.30 % 364.02 173,542 0.27 % 319.76 71,878 0.33 % 383.52 94,231 0.26 % 296.38 15,882 0.30 % 369.91 P p n m e i n
slovenske slovenski slovenski Ppnzer 261,712 0.20 % 230.65 0 0 % 0 350 0.01 % 8.81 491 0.10 % 124.12 153,160 0.24 % 282.21 26,734 0.12 % 142.64 77,498 0.21 % 243.75 3,479 0.07 % 81.03 P p n z e r
sami sam sam Ppnmmi 215,205 0.17 % 189.66 1 0.12 % 103.02 5,245 0.17 % 132.06 663 0.14 % 167.60 104,237 0.16 % 192.06 45,813 0.21 % 244.44 48,294 0.13 % 151.90 10,952 0.21 % 255.09 P p n m m i
novo nov nov Ppnzet 208,946 0.16 % 184.14 0 0 % 0 3,685 0.12 % 92.78 344 0.07 % 86.96 103,249 0.16 % 190.24 37,303 0.17 % 199.04 58,665 0.16 % 184.52 5,700 0.11 % 132.76 P p n z e t
nova nov nov Ppnzei 204,970 0.16 % 180.64 0 0 % 0 2,186 0.07 % 55.04 483 0.10 % 122.10 104,189 0.16 % 191.98 33,075 0.15 % 176.48 60,496 0.17 % 190.28 4,541 0.09 % 105.77 P p n z e i
sama sam sam Ppnzei 202,301 0.15 % 178.29 0 0 % 0 17,214 0.57 % 433.43 712 0.15 % 179.99 81,789 0.13 % 150.70 46,150 0.21 % 246.24 48,716 0.13 % 153.22 7,720 0.15 % 179.81 P p n z e i
velika velik velik Ppnzei 193,885 0.15 % 170.87 0 0 % 0 5,032 0.17 % 126.70 502 0.10 % 126.90 90,536 0.14 % 166.82 35,892 0.17 % 191.51 54,551 0.15 % 171.58 7,372 0.14 % 171.70 P p n z e i
slovenska slovenski slovenski Ppnzei 189,204 0.14 % 166.74 0 0 % 0 239 0.01 % 6.02 384 0.08 % 97.07 101,986 0.16 % 187.92 19,696 0.09 % 105.09 64,204 0.18 % 201.94 2,695 0.05 % 62.77 P p n z e i
slovenski slovenski slovenski Ppnmeid 185,733 0.14 % 163.69 0 0 % 0 270 0.01 % 6.80 222 0.05 % 56.12 93,319 0.15 % 171.95 18,174 0.08 % 96.97 71,393 0.20 % 224.55 2,355 0.04 % 54.85 P p n m e i d
prihodnje prihodnji prihodnji Ppnset 182,748 0.14 % 161.05 0 0 % 0 459 0.01 % 11.56 112 0.02 % 28.31 108,703 0.17 % 200.29 17,044 0.08 % 90.94 55,279 0.15 % 173.87 1,151 0.02 % 26.81 P p n s e t
letni leten leten Ppnmeid 181,498 0.14 % 159.95 0 0 % 0 140 0.01 % 3.53 92 0.02 % 23.26 96,530 0.15 % 177.86 9,608 0.04 % 51.27 74,701 0.20 % 234.95 427 0.01 % 9.95 P p n m e i d
pomembno pomemben pomemben Ppnsei 175,802 0.14 % 154.93 0 0 % 0 4,080 0.14 % 102.73 352 0.07 % 88.98 76,780 0.12 % 141.47 36,218 0.17 % 193.25 48,590 0.13 % 152.83 9,782 0.18 % 227.83 P p n s e i
jasno jasen jasen Ppnsei 167,649 0.13 % 147.75 0 0 % 0 8,560 0.29 % 215.53 547 0.11 % 138.28 77,690 0.12 % 143.15 28,052 0.13 % 149.68 48,889 0.13 % 153.77 3,911 0.07 % 91.09 P p n s e i
nove nov nov Ppnzer 163,169 0.12 % 143.80 0 0 % 0 1,138 0.04 % 28.65 319 0.07 % 80.64 88,283 0.14 % 162.67 22,619 0.10 % 120.69 46,597 0.13 % 146.56 4,213 0.08 % 98.13 P p n z e r
evropske evropski evropski Ppnzer 160,221 0.12 % 141.20 0 0 % 0 137 0.01 % 3.45 356 0.07 % 89.99 79,935 0.13 % 147.29 12,746 0.06 % 68.01 64,413 0.18 % 202.60 2,634 0.05 % 61.35 P p n z e r
prepričan prepričan prepričan Pdnmein 153,095 0.12 % 134.92 0 0 % 0 8,311 0.28 % 209.26 301 0.06 % 76.09 78,080 0.12 % 143.87 16,513 0.08 % 88.11 47,577 0.13 % 149.64 2,313 0.04 % 53.87 P d n m e i n
zadnjem zadnji zadnji Ppnmem 143,807 0.11 % 126.74 1 0.12 % 103.02 2,387 0.08 % 60.10 220 0.05 % 55.61 75,772 0.12 % 139.61 20,518 0.10 % 109.48 42,679 0.12 % 134.24 2,230 0.04 % 51.94 P p n m e m
nov nov nov Ppnmetn 141,390 0.11 % 124.61 0 0 % 0 2,213 0.07 % 55.72 276 0.06 % 69.77 63,999 0.10 % 117.92 25,018 0.12 % 133.49 46,282 0.13 % 145.57 3,602 0.07 % 83.89 P p n m e t
n zadnjih zadnji zadnji Ppnsmm 140,490 0.11 % 123.81 0 0 % 0 745 0.03 % 18.76 147 0.03 % 37.16 70,258 0.11 % 129.46 17,326 0.08 % 92.45 49,608 0.14 % 156.03 2,406 0.05 % 56.04 P p n s m m nekdanji nekdanji nekdanji Ppnmeid 135,064 0.10 % 119.03 0 0 % 0 501 0.02 % 12.61 28 0.01 % 7.08 64,324 0.10 % 118.52 11,370 0.05 % 60.67 58,234 0.16 % 183.16 607 0.01 % 14.14 P p n m e i d veliko velik velik Ppnzet 134,774 0.10 % 118.78 0 0 % 0 4,098 0.14 % 103.18 298 0.06 % 75.33 63,594 0.10 % 117.18 24,120 0.11 % 128.70 37,589 0.10 % 118.23 5,075 0.10 % 118.20 P p n z e t slovenskih slovenski slovenski Ppnmmr 129,736 0.10 % 114.34 0 0 % 0 201 0.01 % 5.06 207 0.04 % 52.33 75,429 0.12 % 138.98 14,839 0.07 % 79.18 37,575 0.10 % 118.18 1,485 0.03 % 34.59 P p n m m r novi nov nov Ppnmeid 126,081 0.10 % 111.11 0 0 % 0 1,150 0.04 % 28.96 160 0.03 % 40.45 65,578 0.10 % 120.83 23,898 0.11 % 127.51 33,561 0.09 % 105.56 1,734 0.03 % 40.39 P p n m e i d novega nov nov Ppnmer 125,291 0.10 % 110.42 0 0 % 0 918 0.03 % 23.11 223 0.05 % 56.37 65,585 0.10 % 120.84 19,628 0.09 % 104.73 36,407 0.10 % 114.51 2,530 0.05 % 58.93 P p n m e r velik velik velik Ppnmein 125,004 0.10 % 110.17 0 0 % 0 3,894 0.13 % 98.05 241 0.05 % 60.92 56,803 0.09 % 104.66 24,925 0.12 % 132.99 34,775 0.10 % 109.38 4,366 0.08 % 101.69 P p n m e i n zaposlenih zaposlen zaposlen Pdnmmr 124,000 0.10 % 109.28 0 0 % 0 236 0.01 % 5.94 234 0.05 % 59.15 69,820 0.11 % 128.65 12,469 0.06 % 66.53 38,883 0.11 % 122.30 2,358 0.04 % 54.92 P d n m m r slovenskega slovenski slovenski Ppnmer 120,596 0.09 % 106.28 0 0 % 0 235 0.01 % 5.92 239 0.05 % 60.42 65,838 0.10 % 121.31 13,356 0.06 % 71.26 38,584 0.11 % 121.36 2,344 0.04 % 54.59 P p n m e r znano znan znan Pdnsei 115,873 0.09 % 102.12 0 0 % 0 1,368 0.05 % 34.44 145 0.03 % 36.65 62,748 0.10 % 115.62 15,800 0.07 % 84.30 33,527 0.09 % 105.45 2,285 0.04 % 53.22 P d n s e i glavni glaven glaven Ppnmeid 114,723 0.09 % 101.10 0 0 % 0 1,292 0.04 % 32.53 190 0.04 % 48.03 58,386 0.09 % 107.58 
15,593 0.07 % 83.20 35,998 0.10 % 113.22 3,264 0.06 % 76.02 P p n m e i d
novo nov nov Ppnsei 114,513 0.09 % 100.92 0 0 % 0 1,090 0.04 % 27.45 126 0.03 % 31.85 77,517 0.12 % 142.83 11,610 0.05 % 61.95 22,255 0.06 % 70 1,915 0.04 % 44.60 P p n s e i
državnega državen državen Ppnmer 113,938 0.09 % 100.41 0 0 % 0 182 0.01 % 4.58 1,515 0.32 % 382.98 67,626 0.11 % 124.61 7,342 0.03 % 39.17 36,445 0.10 % 114.63 828 0.02 % 19.29 P p n m e r

1.3.14 List of adverb lemmas in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-words-adverbs-lemmas-taxonomy-entire.tsv
Columns: Lemma; Lemma (lower-case); Total absolute frequency of lemma; Percentage of all found lemmas; Total relative frequency (per million occurrences); Absolute frequency [SSJ.T.K.N]; Percentage [SSJ.T.K.N]; Relative frequency [SSJ.T.K.N]; Absolute frequency [SSJ.T.K.L]; Percentage [SSJ.T.K.L]; Relative frequency [SSJ.T.K.L]; Absolute frequency [SSJ.T.D]; Percentage [SSJ.T.D]; Relative frequency [SSJ.T.D]; Absolute frequency [SSJ.T.P.C]; Percentage [SSJ.T.P.C]; Relative frequency [SSJ.T.P.C]; Absolute frequency [SSJ.T.P.R]; Percentage [SSJ.T.P.R]; Relative frequency [SSJ.T.P.R]; Absolute frequency [SSJ.I]; Percentage [SSJ.I]; Relative frequency [SSJ.I]; Absolute frequency [SSJ.T.K.S]; Percentage [SSJ.T.K.S]; Relative frequency [SSJ.T.K.S]
lahko lahko 3,488,074 5.43 % 3,074.02 44 8.85 % 4,532.81 118,552 3.93 % 2,985.01 17,112 9.02 % 4,325.73 1,472,009 4.94 % 2,712.27 791,767 6.41 % 4,224.61 883,055 5.35 % 2,777.44 205,535 8.48 % 4,787.15
tako tako 3,289,402 5.12 % 2,898.93 31 6.24 % 3,193.57 157,077 5.21 % 3,955.03 11,067 5.84 % 2,797.62 1,532,883 5.14 % 2,824.44 627,155 5.08 % 3,346.29 830,732 5.04 % 2,612.87 130,457 5.38 % 3,038.50
več več 1,895,190 2.95 % 1,670.22 4 0.81 % 412.07 35,879 1.19 % 903.39 3,863 2.04 % 976.52 889,908 2.99 % 1,639.71 302,261 2.45 % 1,612.77 609,107 3.69 % 1,915.80 54,168 2.23 % 1,261.64
nekaj nekaj 1,478,636
2.30 % 1,303.11 17 3.42 % 1,751.31 75,312 2.50 % 1,896.27 2,638 1.39 % 666.86 706,346 2.37 % 1,301.49 292,468 2.37 % 1,560.51 361,517 2.19 % 1,137.07 40,338 1.66 % 939.52 zelo zelo 1,441,970 2.24 % 1,270.80 2 0.40 % 206.04 45,177 1.50 % 1,137.51 3,480 1.83 % 879.71 645,584 2.17 % 1,189.53 323,837 2.62 % 1,727.89 366,622 2.22 % 1,153.12 57,268 2.36 % 1,333.84 bolj bolj 1,262,289 1.96 % 1,112.45 13 2.62 % 1,339.24 42,788 1.42 % 1,077.36 2,639 1.39 % 667.11 596,437 2.00 % 1,098.97 266,960 2.16 % 1,424.41 297,200 1.80 % 934.77 56,252 2.32 % 1,310.18 zdaj zdaj 1,082,531 1.68 % 954.03 0 0 % 0 71,531 2.37 % 1,801.07 2,616 1.38 % 661.30 531,590 1.78 % 979.49 150,379 1.22 % 802.37 308,065 1.87 % 968.94 18,350 0.76 % 427.39 vedno vedno 1,075,127 1.67 % 947.50 1 0.20 % 103.02 54,794 1.82 % 1,379.65 2,037 1.07 % 514.93 473,977 1.59 % 873.33 227,298 1.84 % 1,212.79 277,237 1.68 % 871.98 39,783 1.64 % 926.59 kar kar 1,071,938 1.67 % 944.69 0 0 % 0 39,622 1.31 % 997.64 1,731 0.91 % 437.58 542,754 1.82 % 1,000.06 222,883 1.81 % 1,189.23 244,821 1.48 % 770.03 20,127 0.83 % 468.78 kako kako 1,005,454 1.56 % 886.10 2 0.40 % 206.04 90,504 3.00 % 2,278.79 2,958 1.56 % 747.75 424,401 1.42 % 781.99 198,869 1.61 % 1,061.10 237,842 1.44 % 748.08 50,878 2.10 % 1,185.01 veliko veliko 945,307 1.47 % 833.09 1 0.20 % 103.02 24,019 0.80 % 604.77 1,561 0.82 % 394.60 451,747 1.52 % 832.37 200,120 1.62 % 1,067.77 237,636 1.44 % 747.43 30,223 1.25 % 703.93 dobro dobro 906,394 1.41 % 798.80 17 3.42 % 1,751.31 38,425 1.27 % 967.50 1,435 0.76 % 362.75 419,465 1.41 % 772.89 194,249 1.57 % 1,036.45 219,223 1.33 % 689.51 33,580 1.39 % 782.12 danes danes 905,146 1.41 % 797.70 0 0 % 0 13,186 0.44 % 332.01 2,091 1.10 % 528.58 348,133 1.17 % 641.46 105,919 0.86 % 565.15 416,761 2.53 % 1,310.82 19,056 0.79 % 443.84 potem potem 814,559 1.27 % 717.87 10 2.01 % 1,030.18 74,531 2.47 % 1,876.61 4,470 2.36 % 1,129.97 357,330 1.20 % 658.40 152,779 1.24 % 815.18 198,520 1.20 % 624.40 26,919 1.11 % 626.98 najbolj 
najbolj 780,725 1.22 % 688.05 1 0.20 % 103.02 13,768 0.46 % 346.66 1,191 0.63 % 301.07 377,064 1.26 % 694.77 156,552 1.27 % 835.31 207,955 1.26 % 654.07 24,194 1.00 % 563.51 treba treba 715,315 1.11 % 630.40 2 0.40 % 206.04 22,139 0.73 % 557.44 3,300 1.74 % 834.20 348,254 1.17 % 641.68 131,260 1.06 % 700.36 182,173 1.10 % 572.98 28,187 1.16 % 656.51 skupaj skupaj 614,538 0.96 % 541.59 21 4.22 % 2,163.39 23,547 0.78 % 592.89 1,545 0.81 % 390.56 284,334 0.95 % 523.90 117,286 0.95 % 625.80 170,035 1.03 % 534.80 17,770 0.73 % 413.88 letos letos 606,763 0.94 % 534.74 0 0 % 0 765 0.03 % 19.26 280 0.15 % 70.78 362,999 1.22 % 668.85 61,361 0.50 % 327.40 180,244 1.09 % 566.91 1,114 0.05 % 25.95 manj manj 605,930 0.94 % 534 1 0.20 % 103.02 10,236 0.34 % 257.73 1,249 0.66 % 315.73 293,350 0.98 % 540.52 115,502 0.94 % 616.28 163,127 0.99 % 513.08 22,465 0.93 % 523.24 nato nato 588,918 0.92 % 519.01 52 10.46 % 5,356.96 28,600 0.95 % 720.12 1,074 0.57 % 271.50 236,089 0.79 % 435.01 94,258 0.76 % 502.93 201,513 1.22 % 633.81 27,332 1.13 % 636.59 res res 565,140 0.88 % 498.06 0 0 % 0 43,525 1.44 % 1,095.91 1,548 0.82 % 391.32 259,710 0.87 % 478.53 119,655 0.97 % 638.44 127,911 0.78 % 402.31 12,791 0.53 % 297.92 lani lani 505,184 0.79 % 445.22 0 0 % 0 745 0.03 % 18.76 90 0.05 % 22.75 276,175 0.93 % 508.87 38,722 0.31 % 206.61 188,480 1.14 % 592.82 972 0.04 % 22.64 precej precej 492,127 0.77 % 433.71 2 0.40 % 206.04 12,458 0.41 % 313.68 619 0.33 % 156.48 241,123 0.81 % 444.29 101,408 0.82 % 541.08 122,999 0.75 % 386.86 13,518 0.56 % 314.85 takrat takrat 444,872 0.69 % 392.06 0 0 % 0 20,788 0.69 % 523.42 1,482 0.78 % 374.63 201,503 0.68 % 371.28 88,896 0.72 % 474.32 118,816 0.72 % 373.71 13,387 0.55 % 311.80 tam tam 430,492 0.67 % 379.39 0 0 % 0 37,488 1.24 % 943.91 1,592 0.84 % 402.44 193,210 0.65 % 356 84,595 0.69 % 451.37 98,397 0.60 % 309.48 15,210 0.63 % 354.26 povsem povsem 426,682 0.66 % 376.03 0 0 % 0 14,047 0.47 % 353.69 572 0.30 % 144.60 206,808 0.69 % 381.06 86,791 0.70 % 
463.09 103,440 0.63 % 325.35 15,024 0.62 % 349.93
medtem medtem 423,755 0.66 % 373.45 6 1.21 % 618.11 18,625 0.62 % 468.96 716 0.38 % 181 185,207 0.62 % 341.26 60,849 0.49 % 324.67 146,062 0.89 % 459.40 12,290 0.51 % 286.25
malo malo 420,328 0.65 % 370.43 29 5.83 % 2,987.53 31,608 1.05 % 795.86 1,133 0.60 % 286.41 185,259 0.62 % 341.35 97,676 0.79 % 521.17 89,437 0.54 % 281.30 15,186 0.63 % 353.70
vse vse 415,803 0.65 % 366.45 0 0 % 0 14,050 0.47 % 353.76 827 0.44 % 209.06 199,559 0.67 % 367.70 73,812 0.60 % 393.84 113,290 0.69 % 356.33 14,265 0.59 % 332.25
rad rad 404,452 0.63 % 356.44 0 0 % 0 31,141 1.03 % 784.10 1,256 0.66 % 317.50 176,738 0.59 % 325.65 102,264 0.83 % 545.65 79,765 0.48 % 250.88 13,288 0.55 % 309.49
glede glede 400,382 0.62 % 352.85 1 0.20 % 103.02 4,848 0.16 % 122.07 2,830 1.49 % 715.39 176,820 0.59 % 325.80 67,773 0.55 % 361.61 130,230 0.79 % 409.61 17,880 0.74 % 416.45
dovolj dovolj 400,015 0.62 % 352.53 3 0.60 % 309.06 18,306 0.61 % 460.93 806 0.42 % 203.75 184,495 0.62 % 339.94 90,989 0.74 % 485.49 91,354 0.55 % 287.33 14,062 0.58 % 327.52

1.3.15 List of adverb lower-case word forms in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-words-adverbs-lowercase_forms-taxonomy-entire.tsv
Columns: Lower-case word form; Lemma; Lemma (lower-case); Morphosyntactic tag; Total absolute frequency of lower-case word form; Percentage of all found lower-case word forms; Total relative frequency (per million occurrences); Absolute frequency [SSJ.T.K.N]; Percentage [SSJ.T.K.N]; Relative frequency [SSJ.T.K.N]; Absolute frequency [SSJ.T.K.L]; Percentage [SSJ.T.K.L]; Relative frequency [SSJ.T.K.L]; Absolute frequency [SSJ.T.D]; Percentage [SSJ.T.D]; Relative frequency [SSJ.T.D]; Absolute frequency [SSJ.T.P.C]; Percentage [SSJ.T.P.C]; Relative frequency [SSJ.T.P.C]; Absolute frequency [SSJ.T.P.R]; Percentage [SSJ.T.P.R]; Relative frequency [SSJ.T.P.R]; Absolute frequency [SSJ.I]; Percentage [SSJ.I]; Relative frequency [SSJ.I]; Absolute frequency [SSJ.T.K.S]; Percentage [SSJ.T.K.S]; Relative frequency [SSJ.T.K.S]; msd01; msd02; msd03
lahko lahko lahko Rsn 3,362,721 5.23 % 2,963.55 42 8.45 % 4,326.77 114,980 3.81 % 2,895.07 16,863 8.89 % 4,262.78 1,415,346 4.75 % 2,607.87 764,409 6.19 % 4,078.64 852,161 5.17 % 2,680.27 198,920 8.20 % 4,633.08 R s n
tako tako tako Rsn 3,289,402 5.12 % 2,898.93 31 6.24 % 3,193.57 157,077 5.21 % 3,955.03 11,067 5.84 % 2,797.62 1,532,883 5.14 % 2,824.44 627,155 5.08 % 3,346.29 830,732 5.04 % 2,612.87 130,457 5.38 % 3,038.50 R s n
več več več Rsr 1,895,190 2.95 % 1,670.22 4 0.81 % 412.07 35,879 1.19 % 903.39 3,863 2.04 % 976.52 889,908 2.99 % 1,639.71 302,261 2.45 % 1,612.77 609,107 3.69 % 1,915.80 54,168 2.23 % 1,261.64 R s r
nekaj nekaj nekaj Rsn 1,478,635 2.30 % 1,303.11 17 3.42 % 1,751.31 75,312 2.50 % 1,896.27 2,638 1.39 % 666.86 706,346 2.37 % 1,301.49 292,467 2.37 % 1,560.51 361,517 2.19 % 1,137.07 40,338 1.66 % 939.52 R s n
zelo zelo zelo Rsn 1,441,970 2.24 % 1,270.80 2 0.40 % 206.04 45,177 1.50 % 1,137.51 3,480 1.83 % 879.71 645,584 2.17 % 1,189.53 323,837 2.62 % 1,727.89 366,622 2.22 % 1,153.12 57,268 2.36 % 1,333.84 R s n
bolj bolj bolj Rsr 1,262,087 1.96 % 1,112.27 13 2.62 % 1,339.24 42,774 1.42 % 1,077 2,638 1.39 % 666.86 596,351 2.00 % 1,098.82 266,922 2.16 % 1,424.21 297,149 1.80 % 934.61 56,240 2.32 % 1,309.90 R s r
zdaj zdaj zdaj Rsn 1,082,531 1.68 % 954.03 0 0 % 0 71,531 2.37 % 1,801.07 2,616 1.38 % 661.30 531,590 1.78 % 979.49 150,379 1.22 % 802.37 308,065 1.87 % 968.94 18,350 0.76 % 427.39 R s n
vedno vedno vedno Rsn 1,075,127 1.67 % 947.50 1 0.20 % 103.02 54,794 1.82 % 1,379.65 2,037 1.07 % 514.93 473,977 1.59 % 873.33 227,298 1.84 % 1,212.79 277,237 1.68 % 871.98 39,783 1.64 % 926.59 R s n
kar kar kar Rsn 1,071,938 1.67 % 944.69 0 0 % 0 39,622 1.31 % 997.64 1,731 0.91 % 437.58 542,754 1.82 % 1,000.06 222,883 1.81 % 1,189.23 244,821 1.48 % 770.03 20,127 0.83 % 468.78 R s n
kako kako kako Rsn
1,005,454 1.56 % 886.10 2 0.40 % 206.04 90,504 3.00 % 2,278.79 2,958 1.56 % 747.75 424,401 1.42 % 781.99 198,869 1.61 % 1,061.10 237,842 1.44 % 748.08 50,878 2.10 % 1,185.01 R s n veliko veliko veliko Rsn 945,305 1.47 % 833.09 1 0.20 % 103.02 24,019 0.80 % 604.77 1,561 0.82 % 394.60 451,747 1.52 % 832.37 200,120 1.62 % 1,067.77 237,634 1.44 % 747.42 30,223 1.25 % 703.93 R s n danes danes danes Rsn 905,146 1.41 % 797.70 0 0 % 0 13,186 0.44 % 332.01 2,091 1.10 % 528.58 348,133 1.17 % 641.46 105,919 0.86 % 565.15 416,761 2.53 % 1,310.82 19,056 0.79 % 443.84 R s n potem potem potem Rsn 814,559 1.27 % 717.87 10 2.01 % 1,030.18 74,531 2.47 % 1,876.61 4,470 2.36 % 1,129.97 357,330 1.20 % 658.40 152,779 1.24 % 815.18 198,520 1.20 % 624.40 26,919 1.11 % 626.98 R s n najbolj najbolj najbolj Rss 780,723 1.22 % 688.05 1 0.20 % 103.02 13,768 0.46 % 346.66 1,191 0.63 % 301.07 377,064 1.26 % 694.77 156,552 1.27 % 835.31 207,953 1.26 % 654.07 24,194 1.00 % 563.51 R s s treba treba treba Rsn 715,315 1.11 % 630.40 2 0.40 % 206.04 22,139 0.73 % 557.44 3,300 1.74 % 834.20 348,254 1.17 % 641.68 131,260 1.06 % 700.36 182,173 1.10 % 572.98 28,187 1.16 % 656.51 R s n skupaj skupaj skupaj Rsn 614,538 0.96 % 541.59 21 4.22 % 2,163.39 23,547 0.78 % 592.89 1,545 0.81 % 390.56 284,334 0.95 % 523.90 117,286 0.95 % 625.80 170,035 1.03 % 534.80 17,770 0.73 % 413.88 R s n letos letos letos Rsn 606,763 0.94 % 534.74 0 0 % 0 765 0.03 % 19.26 280 0.15 % 70.78 362,999 1.22 % 668.85 61,361 0.50 % 327.40 180,244 1.09 % 566.91 1,114 0.05 % 25.95 R s n manj manj manj Rsr 605,930 0.94 % 534 1 0.20 % 103.02 10,236 0.34 % 257.73 1,249 0.66 % 315.73 293,350 0.98 % 540.52 115,502 0.94 % 616.28 163,127 0.99 % 513.08 22,465 0.93 % 523.24 R s r nato nato nato Rsn 588,918 0.92 % 519.01 52 10.46 % 5,356.96 28,600 0.95 % 720.12 1,074 0.57 % 271.50 236,089 0.79 % 435.01 94,258 0.76 % 502.93 201,513 1.22 % 633.81 27,332 1.13 % 636.59 R s n dobro dobro dobro Rsn 575,758 0.90 % 507.41 16 3.22 % 1,648.30 25,204 0.84 % 
634.61 972 0.51 % 245.71 268,097 0.90 % 493.99 122,275 0.99 % 652.42 138,567 0.84 % 435.83 20,627 0.85 % 480.43 R s n res res res Rsn 565,140 0.88 % 498.06 0 0 % 0 43,525 1.44 % 1,095.91 1,548 0.82 % 391.32 259,710 0.87 % 478.53 119,655 0.97 % 638.44 127,911 0.78 % 402.31 12,791 0.53 % 297.92 R s n lani lani lani Rsn 505,184 0.79 % 445.22 0 0 % 0 745 0.03 % 18.76 90 0.05 % 22.75 276,175 0.93 % 508.87 38,722 0.31 % 206.61 188,480 1.14 % 592.82 972 0.04 % 22.64 R s n precej precej precej Rsn 492,127 0.77 % 433.71 2 0.40 % 206.04 12,458 0.41 % 313.68 619 0.33 % 156.48 241,123 0.81 % 444.29 101,408 0.82 % 541.08 122,999 0.75 % 386.86 13,518 0.56 % 314.85 R s n takrat takrat takrat Rsn 444,872 0.69 % 392.06 0 0 % 0 20,788 0.69 % 523.42 1,482 0.78 % 374.63 201,503 0.68 % 371.28 88,896 0.72 % 474.32 118,816 0.72 % 373.71 13,387 0.55 % 311.80 R s n tam tam tam Rsn 430,492 0.67 % 379.39 0 0 % 0 37,488 1.24 % 943.91 1,592 0.84 % 402.44 193,210 0.65 % 356 84,595 0.69 % 451.37 98,397 0.60 % 309.48 15,210 0.63 % 354.26 R s n povsem povsem povsem Rsn 426,682 0.66 % 376.03 0 0 % 0 14,047 0.47 % 353.69 572 0.30 % 144.60 206,808 0.69 % 381.06 86,791 0.70 % 463.09 103,440 0.63 % 325.35 15,024 0.62 % 349.93 R s n medtem medtem medtem Rsn 423,755 0.66 % 373.45 6 1.21 % 618.11 18,625 0.62 % 468.96 716 0.38 % 181 185,207 0.62 % 341.26 60,849 0.49 % 324.67 146,062 0.89 % 459.40 12,290 0.51 % 286.25 R s n malo malo malo Rsn 419,478 0.65 % 369.68 29 5.83 % 2,987.53 31,414 1.04 % 790.97 1,114 0.59 % 281.61 185,039 0.62 % 340.95 97,481 0.79 % 520.13 89,278 0.54 % 280.80 15,123 0.62 % 352.23 R s n vse vse vse Rsn 415,803 0.65 % 366.45 0 0 % 0 14,050 0.47 % 353.76 827 0.44 % 209.06 199,559 0.67 % 367.70 73,812 0.60 % 393.84 113,290 0.69 % 356.33 14,265 0.59 % 332.25 R s n glede glede glede Rsn 400,382 0.62 % 352.85 1 0.20 % 103.02 4,848 0.16 % 122.07 2,830 1.49 % 715.39 176,820 0.59 % 325.80 67,773 0.55 % 361.61 130,230 0.79 % 409.61 17,880 0.74 % 416.45 R s n dovolj dovolj dovolj Rsn 400,015 
0.62 % 352.53 3 0.60 % 309.06 18,306 0.61 % 460.93 806 0.42 % 203.75 184,495 0.62 % 339.94 90,989 0.74 % 485.49 91,354 0.55 % 287.33 14,062 0.58 % 327.52 R s n
spet spet spet Rsn 394,360 0.61 % 347.55 0 0 % 0 41,209 1.37 % 1,037.60 1,269 0.67 % 320.79 187,447 0.63 % 345.38 68,686 0.56 % 366.49 83,350 0.51 % 262.16 12,399 0.51 % 288.79 R s n

1.3.16 List of pronoun lemmas in the Gigafida 2.0 corpus with text-type distribution

File at CLARIN.SI: GF2.0-words-pronouns-lemmas-taxonomy-entire.tsv

Columns: Lemma; Lemma (lower-case); Total absolute frequency of lemma; Percentage of all found lemmas; Total relative frequency (per million occurrences); Absolute frequency [SSJ.T.K.N]; Percentage [SSJ.T.K.N]; Relative frequency [SSJ.T.K.N]; Absolute frequency [SSJ.T.K.L]; Percentage [SSJ.T.K.L]; Relative frequency [SSJ.T.K.L]; Absolute frequency [SSJ.T.D]; Percentage [SSJ.T.D]; Relative frequency [SSJ.T.D]; Absolute frequency [SSJ.T.P.C]; Percentage [SSJ.T.P.C]; Relative frequency [SSJ.T.P.C]; Absolute frequency [SSJ.T.P.R]; Percentage [SSJ.T.P.R]; Relative frequency [SSJ.T.P.R]; Absolute frequency [SSJ.I]; Percentage [SSJ.I]; Relative frequency [SSJ.I]; Absolute frequency [SSJ.T.K.S]; Percentage [SSJ.T.K.S]; Relative frequency [SSJ.T.K.S]

se se 18,656,426 24.26 % 16,441.81 63 17.45 % 6,490.16 1,227,523 25.39 % 30,907.70 73,060 23.14 % 18,468.78 8,331,411 23.93 % 15,351.18 3,444,419 23.78 % 18,378.29 4,822,136 25.14 % 15,166.87 757,814 23.16 % 17,650.39
on on 11,874,012 15.44 % 10,464.51 189 52.35 % 19,470.49 947,214 19.59 % 23,849.82 38,159 12.09 % 9,646.18 5,248,082 15.07 % 9,669.94 2,238,961 15.46 % 11,946.36 2,870,162 14.96 % 9,027.41 531,245 16.23 % 12,373.33
ta ta 11,377,542 14.79 % 10,026.97 17 4.71 % 1,751.31 469,391 9.71 % 11,818.76 74,801 23.69 % 18,908.89 5,253,309 15.09 % 9,679.57 1,998,988 13.80 % 10,665.94 3,084,831 16.08 % 9,702.60 496,205 15.16 % 11,557.20
ves ves 4,233,029 5.50 % 3,730.55 26 7.20 % 2,678.48
207,705 4.30 % 5,229.79 13,137 4.16 % 3,320.89 2,037,592 5.85 % 3,754.40 799,974 5.52 % 4,268.40 1,022,378 5.33 % 3,215.64 152,217 4.65 % 3,545.31 jaz jaz 3,481,582 4.53 % 3,068.30 0 0 % 0 403,401 8.35 % 10,157.20 12,057 3.82 % 3,047.88 1,512,005 4.34 % 2,785.97 716,575 4.95 % 3,823.41 719,285 3.75 % 2,262.34 118,259 3.61 % 2,754.39 svoj svoj 3,453,653 4.49 % 3,043.69 3 0.83 % 309.06 151,218 3.13 % 3,807.51 10,998 3.48 % 2,780.18 1,585,205 4.55 % 2,920.85 654,031 4.51 % 3,489.69 896,962 4.68 % 2,821.18 155,236 4.74 % 3,615.63 kateri kateri 2,632,017 3.42 % 2,319.58 6 1.66 % 618.11 67,762 1.40 % 1,706.17 14,555 4.61 % 3,679.35 1,197,868 3.44 % 2,207.15 448,321 3.10 % 2,392.09 781,192 4.07 % 2,457.05 122,313 3.74 % 2,848.82 kar kar 1,932,807 2.51 % 1,703.37 1 0.28 % 103.02 73,171 1.51 % 1,842.37 5,556 1.76 % 1,404.50 882,447 2.54 % 1,625.97 342,677 2.37 % 1,828.41 559,421 2.92 % 1,759.52 69,534 2.12 % 1,619.53 njegov njegov 1,765,365 2.29 % 1,555.81 0 0 % 0 110,706 2.29 % 2,787.46 5,165 1.64 % 1,305.66 781,513 2.25 % 1,439.99 274,494 1.90 % 1,464.61 515,496 2.69 % 1,621.37 77,991 2.38 % 1,816.50 naš naš 1,295,731 1.69 % 1,141.92 0 0 % 0 25,301 0.52 % 637.05 3,839 1.22 % 970.46 733,564 2.11 % 1,351.64 220,741 1.52 % 1,177.80 267,867 1.40 % 842.51 44,419 1.36 % 1,034.57 tisti tisti 1,204,248 1.57 % 1,061.30 1 0.28 % 103.02 85,777 1.77 % 2,159.77 6,616 2.10 % 1,672.45 554,134 1.59 % 1,021.03 243,437 1.68 % 1,298.90 260,462 1.36 % 819.22 53,821 1.65 % 1,253.56 kaj kaj 1,177,693 1.53 % 1,037.89 1 0.28 % 103.02 124,849 2.58 % 3,143.56 3,588 1.14 % 907.01 508,341 1.46 % 936.65 253,464 1.75 % 1,352.40 239,881 1.25 % 754.49 47,569 1.45 % 1,107.94 ti ti 1,132,179 1.47 % 997.78 5 1.39 % 515.09 142,326 2.94 % 3,583.61 4,357 1.38 % 1,101.40 391,026 1.12 % 720.49 346,431 2.39 % 1,848.44 187,485 0.98 % 589.69 60,549 1.85 % 1,410.26 vsak vsak 1,091,923 1.42 % 962.31 8 2.22 % 824.15 44,704 0.93 % 1,125.60 5,188 1.64 % 1,311.47 497,898 1.43 % 917.41 237,701 1.64 % 1,268.29 250,564 
1.31 % 788.09 55,860 1.71 % 1,301.05 njihov njihov 1,022,463 1.33 % 901.09 1 0.28 % 103.02 22,312 0.46 % 561.79 4,322 1.37 % 1,092.55 502,914 1.45 % 926.65 168,419 1.16 % 898.63 272,408 1.42 % 856.79 52,087 1.59 % 1,213.17 njen njen 971,331 1.26 % 856.03 0 0 % 0 81,849 1.69 % 2,060.87 3,157 1.00 % 798.06 394,090 1.13 % 726.14 181,976 1.26 % 970.96 269,379 1.40 % 847.27 40,880 1.25 % 952.14 nekateri nekateri 826,585 1.07 % 728.47 2 0.55 % 206.04 11,748 0.24 % 295.80 2,358 0.75 % 596.08 424,818 1.22 % 782.76 144,522 1.00 % 771.12 201,313 1.05 % 633.18 41,824 1.28 % 974.13 kakšen kakšen 686,530 0.89 % 605.04 2 0.55 % 206.04 38,652 0.80 % 973.22 2,249 0.71 % 568.52 324,671 0.93 % 598.23 143,530 0.99 % 765.83 154,847 0.81 % 487.03 22,579 0.69 % 525.89 takšen takšen 627,943 0.82 % 553.40 0 0 % 0 19,175 0.40 % 482.81 1,988 0.63 % 502.54 300,654 0.86 % 553.97 117,208 0.81 % 625.38 166,954 0.87 % 525.11 21,964 0.67 % 511.57 moj moj 617,627 0.80 % 544.31 0 0 % 0 92,173 1.91 % 2,320.82 1,908 0.60 % 482.32 244,505 0.70 % 450.52 127,017 0.88 % 677.72 128,390 0.67 % 403.82 23,634 0.72 % 550.46 oba oba 542,160 0.70 % 477.80 17 4.71 % 1,751.31 16,851 0.35 % 424.29 1,356 0.43 % 342.78 280,698 0.81 % 517.20 81,296 0.56 % 433.77 140,356 0.73 % 441.46 21,586 0.66 % 502.76 tak tak 535,614 0.70 % 472.03 0 0 % 0 27,946 0.58 % 703.65 4,605 1.46 % 1,164.09 256,456 0.74 % 472.54 111,527 0.77 % 595.07 105,573 0.55 % 332.05 29,507 0.90 % 687.25 nič nič 475,396 0.62 % 418.96 0 0 % 0 53,076 1.10 % 1,336.40 1,400 0.44 % 353.90 214,201 0.61 % 394.68 94,691 0.65 % 505.24 95,921 0.50 % 301.70 16,107 0.49 % 375.15 kdo kdo 456,086 0.59 % 401.95 0 0 % 0 35,045 0.72 % 882.40 2,132 0.68 % 538.95 222,399 0.64 % 409.78 86,567 0.60 % 461.89 96,252 0.50 % 302.74 13,691 0.42 % 318.88 nek nek 393,758 0.51 % 347.02 0 0 % 0 33,855 0.70 % 852.43 2,404 0.76 % 607.71 157,860 0.45 % 290.87 79,718 0.55 % 425.35 95,874 0.50 % 301.55 24,047 0.73 % 560.08 zame zame 390,470 0.51 % 344.12 0 0 % 0 23,376 0.48 % 588.58 971 
0.31 % 245.46 181,725 0.52 % 334.84 68,638 0.47 % 366.23 102,942 0.54 % 323.78 12,818 0.39 % 298.55
noben noben 361,539 0.47 % 318.62 0 0 % 0 24,292 0.50 % 611.65 1,648 0.52 % 416.60 170,441 0.49 % 314.05 59,200 0.41 % 315.87 91,969 0.48 % 289.27 13,989 0.43 % 325.82
vaš vaš 318,812 0.41 % 280.97 0 0 % 0 14,681 0.30 % 369.65 1,057 0.34 % 267.20 117,509 0.34 % 216.52 104,686 0.72 % 558.57 55,625 0.29 % 174.96 25,254 0.77 % 588.20
isti isti 305,329 0.40 % 269.08 5 1.39 % 515.09 14,396 0.30 % 362.48 1,503 0.48 % 379.94 138,929 0.40 % 255.99 59,259 0.41 % 316.19 72,718 0.38 % 228.72 18,519 0.57 % 431.33
nihče nihče 286,212 0.37 % 252.24 0 0 % 0 26,310 0.54 % 662.46 820 0.26 % 207.29 139,188 0.40 % 256.46 46,911 0.32 % 250.30 65,406 0.34 % 205.72 7,577 0.23 % 176.48
enak enak 255,863 0.33 % 225.49 2 0.55 % 206.04 5,846 0.12 % 147.20 1,377 0.44 % 348.09 110,141 0.32 % 202.94 51,176 0.35 % 273.06 70,285 0.37 % 221.06 17,036 0.52 % 396.79
mnog mnog 247,148 0.32 % 217.81 0 0 % 0 3,585 0.07 % 90.27 768 0.24 % 194.14 128,052 0.37 % 235.94 45,853 0.32 % 244.66 54,018 0.28 % 169.90 14,872 0.45 % 346.39

1.3.17 List of pronoun lower-case word forms in the Gigafida 2.0 corpus with text-type distribution

File at CLARIN.SI: GF2.0-words-pronouns-lowercase_forms-taxonomy-entire.tsv

Columns: Lower-case word form; Lemma; Lemma (lower-case); Morphosyntactic tag; Total absolute frequency of lower-case word form; Percentage of all found lower-case word forms; Total relative frequency (per million occurrences); Absolute frequency [SSJ.T.K.N]; Percentage [SSJ.T.K.N]; Relative frequency [SSJ.T.K.N]; Absolute frequency [SSJ.T.K.L]; Percentage [SSJ.T.K.L]; Relative frequency [SSJ.T.K.L]; Absolute frequency [SSJ.T.D]; Percentage [SSJ.T.D]; Relative frequency [SSJ.T.D]; Absolute frequency [SSJ.T.P.C]; Percentage [SSJ.T.P.C]; Relative frequency [SSJ.T.P.C]; Absolute frequency [SSJ.T.P.R]; Percentage [SSJ.T.P.R]; Relative frequency [SSJ.T.P.R]; Absolute frequency [SSJ.I]; Percentage [SSJ.I]; Relative frequency [SSJ.I]; Absolute frequency [SSJ.T.K.S]; Percentage [SSJ.T.K.S]; Relative frequency [SSJ.T.K.S]; msd01; msd02; msd03; msd04; msd05; msd06; msd07; msd08; msd09

se se se Zp------k 16,033,172 20.85 % 14,129.95 61 16.90 % 6,284.12 1,002,189 20.73 % 25,234.03 67,074 21.25 % 16,955.58 7,234,161 20.78 % 13,329.42 2,886,836 19.93 % 15,403.21 4,197,943 21.89 % 13,203.62 644,908 19.70 % 15,020.67 Z p - - - - - - k
to ta ta Zk-sei 2,720,820 3.54 % 2,397.84 4 1.11 % 412.07 147,067 3.04 % 3,702.99 12,681 4.02 % 3,205.62 1,214,806 3.49 % 2,238.36 519,600 3.59 % 2,772.41 719,934 3.75 % 2,264.38 106,728 3.26 % 2,485.82 Z k - s e i
ga on on Zotmet--k 2,644,832 3.44 % 2,330.88 57 15.79 % 5,872.05 204,045 4.22 % 5,137.63 8,917 2.83 % 2,254.12 1,172,956 3.37 % 2,161.25 488,103 3.37 % 2,604.36 654,442 3.41 % 2,058.39 116,312 3.55 % 2,709.04 Z o t m e t - - k
si se se Zp---d--k 2,116,658 2.75 % 1,865.40 0 0 % 0 177,478 3.67 % 4,468.70 4,363 1.38 % 1,102.92 897,813 2.58 % 1,654.28 442,516 3.06 % 2,361.12 514,578 2.68 % 1,618.48 79,910 2.44 % 1,861.20 Z p - - - d - - k
jih on on Zotmmt--k 1,757,890 2.29 % 1,549.22 65 18.01 % 6,696.20 67,762 1.40 % 1,706.17 6,978 2.21 % 1,763.96 822,820 2.36 % 1,516.10 332,169 2.29 % 1,772.34 437,947 2.28 % 1,377.46 90,149 2.75 % 2,099.68 Z o t m m t - - k
jo on on Zotzet--k 1,732,437 2.25 % 1,526.79 46 12.74 % 4,738.85 148,766 3.08 % 3,745.77 5,376 1.70 % 1,358.99 733,243 2.11 % 1,351.05 339,939 2.35 % 1,813.80 423,123 2.21 % 1,330.83 81,944 2.50 % 1,908.57 Z o t z e t - - k
tem ta ta Zk-sem 1,291,375 1.68 % 1,138.08 1 0.28 % 103.02 37,357 0.77 % 940.61 4,898 1.55 % 1,238.16 600,729 1.73 % 1,106.88 200,319 1.38 % 1,068.84 396,564 2.07 % 1,247.30 51,507 1.57 % 1,199.66 Z k - s e m
kar kar kar Zz-sei 1,162,043 1.51 % 1,024.10 1 0.28 % 103.02 36,975 0.77 % 930.99 3,153 1.00 % 797.04 542,083 1.56 % 998.82 210,283 1.45 % 1,122 330,667 1.72 % 1,040.03 38,881 1.19 % 905.58 Z z - s e i
mu on on Zotmed--k
1,099,213 1.43 % 968.73 3 0.83 % 309.06 141,361 2.92 % 3,559.32 3,183 1.01 % 804.63 463,991 1.33 % 854.93 190,693 1.32 % 1,017.48 260,401 1.36 % 819.03 39,581 1.21 % 921.89 Z o t m e d - - k
tega ta ta Zk-ser 1,058,784 1.38 % 933.10 1 0.28 % 103.02 50,333 1.04 % 1,267.33 4,603 1.46 % 1,163.59 494,178 1.42 % 910.56 190,340 1.31 % 1,015.59 277,885 1.45 % 874.02 41,444 1.27 % 965.28 Z k - s e r
to ta ta Zk-set 1,048,902 1.36 % 924.39 0 0 % 0 57,652 1.19 % 1,451.61 5,855 1.85 % 1,480.08 471,831 1.35 % 869.38 196,179 1.35 % 1,046.75 275,247 1.44 % 865.72 42,138 1.29 % 981.44 Z k - s e t
mi jaz jaz Zop-ed--k 780,655 1.01 % 687.99 0 0 % 0 132,176 2.73 % 3,328.05 2,596 0.82 % 656.24 302,841 0.87 % 558 163,646 1.13 % 873.16 155,511 0.81 % 489.12 23,885 0.73 % 556.31 Z o p - e d - - k
jim on on Zotmmd--k 714,609 0.93 % 629.78 1 0.28 % 103.02 25,352 0.52 % 638.34 1,786 0.57 % 451.48 368,927 1.06 % 679.77 117,172 0.81 % 625.19 175,340 0.91 % 551.49 26,031 0.80 % 606.29 Z o t m m d - - k
kaj kaj kaj Zv-set 683,343 0.89 % 602.23 0 0 % 0 72,143 1.49 % 1,816.48 2,045 0.65 % 516.95 298,096 0.86 % 549.26 149,258 1.03 % 796.39 134,906 0.70 % 424.31 26,895 0.82 % 626.42 Z v - s e t
nam jaz jaz Zop-md 660,025 0.86 % 581.68 0 0 % 0 16,117 0.33 % 405.81 1,553 0.49 % 392.58 342,164 0.98 % 630.46 130,942 0.90 % 698.66 144,622 0.75 % 454.87 24,627 0.75 % 573.59 Z o p - m d
tem ta ta Zk-seo 652,158 0.85 % 574.74 4 1.11 % 412.07 16,299 0.34 % 410.39 2,650 0.84 % 669.89 296,545 0.85 % 546.40 110,534 0.76 % 589.77 198,541 1.03 % 624.46 27,585 0.84 % 642.49 Z k - s e o
vsi ves ves Zc-mmi 590,283 0.77 % 520.21 1 0.28 % 103.02 30,468 0.63 % 767.15 1,801 0.57 % 455.27 294,483 0.85 % 542.60 107,195 0.74 % 571.96 137,969 0.72 % 433.95 18,366 0.56 % 427.77 Z c - m m i
ji on on Zotzed--k 560,679 0.73 % 494.12 3 0.83 % 309.06 96,395 1.99 % 2,427.12 1,411 0.45 % 356.69 201,119 0.58 % 370.58 120,392 0.83 % 642.37 123,492 0.64 % 388.41 17,867 0.55 % 416.14 Z o t z e d - - k
me jaz jaz Zop-et--k 539,338 0.70 % 475.32 0 0 % 0 105,292 2.18 % 2,651.14 1,160 0.37 % 293.24 203,767 0.58 % 375.45 114,777 0.79 % 612.41 98,611 0.51 % 310.16 15,731 0.48 % 366.39 Z o p - e t - - k
vse ves ves Zc-sei 538,742 0.70 % 474.79 11 3.05 % 1,133.20 40,685 0.84 % 1,024.40 1,229 0.39 % 310.68 243,807 0.70 % 449.23 111,646 0.77 % 595.71 125,094 0.65 % 393.45 16,270 0.50 % 378.95 Z c - s e i
jih on on Zotzmt--k 505,037 0.66 % 445.09 6 1.66 % 618.11 17,684 0.37 % 445.26 2,438 0.77 % 616.30 228,539 0.66 % 421.10 103,293 0.71 % 551.14 121,852 0.64 % 383.26 31,225 0.95 % 727.27 Z o t z m t - - k
ta ta ta Zk-mei 500,160 0.65 % 440.79 0 0 % 0 20,798 0.43 % 523.67 3,681 1.17 % 930.52 225,320 0.65 % 415.17 88,565 0.61 % 472.55 136,681 0.71 % 429.90 25,115 0.77 % 584.96 Z k - m e i
ta ta ta Zk-zei 475,713 0.62 % 419.24 1 0.28 % 103.02 18,665 0.39 % 469.96 2,946 0.93 % 744.72 214,190 0.61 % 394.66 84,642 0.58 % 451.62 128,182 0.67 % 403.17 27,087 0.83 % 630.89 Z k - z e i
svojo svoj svoj Zp-zet 473,127 0.61 % 416.96 1 0.28 % 103.02 23,701 0.49 % 596.77 1,410 0.45 % 356.43 210,810 0.61 % 388.43 91,695 0.63 % 489.25 124,808 0.65 % 392.55 20,702 0.63 % 482.17 Z p - z e t
vse ves ves Zc-set 457,062 0.59 % 402.81 4 1.11 % 412.07 29,590 0.61 % 745.04 1,096 0.35 % 277.06 204,641 0.59 % 377.06 95,377 0.66 % 508.90 109,557 0.57 % 344.59 16,797 0.51 % 391.22 Z c - s e t
nas jaz jaz Zop-mt 454,912 0.59 % 400.91 0 0 % 0 15,369 0.32 % 386.97 1,155 0.37 % 291.97 220,841 0.63 % 406.91 93,011 0.64 % 496.28 104,140 0.54 % 327.55 20,396 0.62 % 475.05 Z o p - m t
tem ta ta Zk-mem 419,609 0.55 % 369.80 0 0 % 0 10,639 0.22 % 267.88 4,094 1.30 % 1,034.92 203,001 0.58 % 374.04 67,813 0.47 % 361.83 115,268 0.60 % 362.55 18,794 0.57 % 437.73 Z k - m e m
kaj kaj kaj Zv-sei 408,862 0.53 % 360.33 0 0 % 0 44,209 0.92 % 1,113.13 1,271 0.40 % 321.30 174,366 0.50 % 321.28 85,743 0.59 % 457.50 86,946 0.45 % 273.47 16,327 0.50 % 380.28 Z v - s e i
kar kar kar Zz-set 408,261 0.53 % 359.80 0 0 % 0 26,685 0.55 % 671.90 1,351 0.43 % 341.52
179,930 0.52 % 331.53 79,865 0.55 % 426.13 103,059 0.54 % 324.15 17,371 0.53 % 404.59 Z z - s e t
svoj svoj svoj Zp-met 384,873 0.50 % 339.19 0 0 % 0 15,313 0.32 % 385.56 1,259 0.40 % 318.26 177,631 0.51 % 327.30 72,882 0.50 % 388.87 102,755 0.54 % 323.19 15,033 0.46 % 350.14 Z p - m e t
temu ta ta Zk-sed 383,005 0.50 % 337.54 0 0 % 0 11,526 0.24 % 290.21 1,433 0.45 % 362.25 187,443 0.54 % 345.38 71,484 0.49 % 381.42 97,046 0.51 % 305.23 14,073 0.43 % 327.78 Z k - s e d
kateri kateri kateri Zv-zem 380,063 0.49 % 334.95 3 0.83 % 309.06 9,548 0.20 % 240.41 1,695 0.54 % 428.48 174,278 0.50 % 321.12 56,746 0.39 % 302.78 122,971 0.64 % 386.78 14,822 0.45 % 345.22 Z v - z e m

1.3.18 List of numeral lemmas in the Gigafida 2.0 corpus with text-type distribution

File at CLARIN.SI: GF2.0-words-numerals-lemmas-taxonomy-entire.tsv

Columns: Lemma; Lemma (lower-case); Total absolute frequency of lemma; Percentage of all found lemmas; Total relative frequency (per million occurrences); Absolute frequency [SSJ.T.K.N]; Percentage [SSJ.T.K.N]; Relative frequency [SSJ.T.K.N]; Absolute frequency [SSJ.T.K.L]; Percentage [SSJ.T.K.L]; Relative frequency [SSJ.T.K.L]; Absolute frequency [SSJ.T.D]; Percentage [SSJ.T.D]; Relative frequency [SSJ.T.D]; Absolute frequency [SSJ.T.P.C]; Percentage [SSJ.T.P.C]; Relative frequency [SSJ.T.P.C]; Absolute frequency [SSJ.T.P.R]; Percentage [SSJ.T.P.R]; Relative frequency [SSJ.T.P.R]; Absolute frequency [SSJ.I]; Percentage [SSJ.I]; Relative frequency [SSJ.I]; Absolute frequency [SSJ.T.K.S]; Percentage [SSJ.T.K.S]; Relative frequency [SSJ.T.K.S]

drug drug 3,399,068 8.45 % 2,995.58 19 4.38 % 1,957.35 110,912 24.64 % 2,792.64 22,761 14.19 % 5,753.74 1,577,458 7.55 % 2,906.57 551,520 10.49 % 2,942.73 979,054 8.15 % 3,079.38 157,344 10.86 % 3,664.73
prvi prvi 2,198,324 5.46 % 1,937.37 2 0.46 % 206.04 31,830 7.07 % 801.44 9,684 6.04 % 2,448.01 1,053,085 5.04 % 1,940.38 295,479 5.62 % 1,576.58 749,420 6.24 %
2,357.12 58,824 4.06 % 1,370.08 en en 1,710,696 4.25 % 1,507.63 21 4.84 % 2,163.39 67,198 14.93 % 1,691.97 6,327 3.94 % 1,599.40 762,842 3.65 % 1,405.59 317,026 6.03 % 1,691.55 490,332 4.08 % 1,542.22 66,950 4.62 % 1,559.35 dva dva 1,632,916 4.06 % 1,439.08 9 2.07 % 927.17 41,991 9.33 % 1,057.29 3,602 2.25 % 910.55 804,557 3.85 % 1,482.45 255,248 4.86 % 1,361.92 481,702 4.01 % 1,515.08 45,807 3.16 % 1,066.90 trije trije 1,077,597 2.68 % 949.68 4 0.92 % 412.07 24,134 5.36 % 607.67 2,226 1.39 % 562.71 554,125 2.65 % 1,021.01 148,979 2.83 % 794.90 323,142 2.69 % 1,016.37 24,987 1.72 % 581.98 štirje štirje 562,863 1.40 % 496.05 2 0.46 % 206.04 10,338 2.30 % 260.30 649 0.41 % 164.06 287,367 1.38 % 529.49 76,421 1.45 % 407.76 177,095 1.47 % 557.01 10,991 0.76 % 255.99 pet pet 490,799 1.22 % 432.54 1 0.23 % 103.02 9,434 2.10 % 237.54 917 0.57 % 231.81 250,598 1.20 % 461.74 63,012 1.20 % 336.21 159,366 1.33 % 501.25 7,471 0.52 % 174.01 1 1 473,724 1.18 % 417.49 34 7.83 % 3,502.63 1,167 0.26 % 29.38 5,780 3.60 % 1,461.12 224,982 1.08 % 414.54 75,917 1.44 % 405.07 134,125 1.12 % 421.86 31,719 2.19 % 738.77 tretji tretji 454,803 1.13 % 400.82 0 0 % 0 5,843 1.30 % 147.12 2,681 1.67 % 677.73 221,805 1.06 % 408.69 50,869 0.97 % 271.42 162,544 1.35 % 511.24 11,061 0.76 % 257.62 2 2 448,702 1.11 % 395.44 32 7.37 % 3,296.59 870 0.19 % 21.91 5,495 3.43 % 1,389.08 232,497 1.11 % 428.39 74,329 1.41 % 396.60 108,721 0.91 % 341.96 26,758 1.85 % 623.23 deset deset 366,728 0.91 % 323.20 0 0 % 0 9,231 2.05 % 232.43 459 0.29 % 116.03 193,409 0.93 % 356.37 51,610 0.98 % 275.37 106,603 0.89 % 335.29 5,416 0.37 % 126.15 eden eden 339,502 0.84 % 299.20 0 0 % 0 9,711 2.16 % 244.51 694 0.43 % 175.44 157,403 0.75 % 290.03 57,846 1.10 % 308.65 103,487 0.86 % 325.49 10,361 0.71 % 241.32 1, 1, 338,262 0.84 % 298.11 0 0 % 0 729 0.16 % 18.36 3,351 2.09 % 847.10 212,046 1.01 % 390.71 30,858 0.59 % 164.65 80,704 0.67 % 253.84 10,574 0.73 % 246.28 3 3 334,249 0.83 % 294.57 20 4.61 % 2,060.37 684 0.15 % 
17.22 3,802 2.37 % 961.10 178,016 0.85 % 328.01 52,468 1.00 % 279.95 79,212 0.66 % 249.14 20,047 1.38 % 466.92 20 20 318,181 0.79 % 280.41 22 5.07 % 2,266.41 360 0.08 % 9.06 928 0.58 % 234.59 161,967 0.78 % 298.43 43,208 0.82 % 230.54 101,988 0.85 % 320.78 9,708 0.67 % 226.11 tisoč tisoč 305,442 0.76 % 269.18 0 0 % 0 5,271 1.17 % 132.72 270 0.17 % 68.25 195,780 0.94 % 360.74 37,622 0.72 % 200.74 63,057 0.53 % 198.33 3,442 0.24 % 80.17 4 4 303,784 0.76 % 267.72 12 2.77 % 1,236.22 467 0.10 % 11.76 2,293 1.43 % 579.65 165,356 0.79 % 304.68 46,384 0.88 % 247.49 73,009 0.61 % 229.63 16,263 1.12 % 378.78 šest šest 303,010 0.75 % 267.04 0 0 % 0 5,487 1.22 % 138.16 741 0.46 % 187.32 154,063 0.74 % 283.87 36,091 0.69 % 192.57 102,094 0.85 % 321.11 4,534 0.31 % 105.60 10 10 296,783 0.74 % 261.55 21 4.84 % 2,163.39 401 0.09 % 10.10 1,236 0.77 % 312.45 142,870 0.68 % 263.25 49,185 0.94 % 262.43 89,626 0.75 % 281.90 13,444 0.93 % 313.13 5 5 265,501 0.66 % 233.98 14 3.23 % 1,442.26 549 0.12 % 13.82 1,690 1.05 % 427.21 141,453 0.68 % 260.64 45,080 0.86 % 240.53 61,381 0.51 % 193.06 15,334 1.06 % 357.15 15 15 265,493 0.66 % 233.98 19 4.38 % 1,957.35 308 0.07 % 7.76 753 0.47 % 190.35 133,739 0.64 % 246.42 35,793 0.68 % 190.98 86,504 0.72 % 272.08 8,377 0.58 % 195.11 30 30 251,613 0.62 % 221.75 20 4.61 % 2,060.37 282 0.06 % 7.10 1,127 0.70 % 284.89 128,479 0.61 % 236.73 38,641 0.73 % 206.18 75,366 0.63 % 237.05 7,698 0.53 % 179.30 6 6 241,633 0.60 % 212.95 2 0.46 % 206.04 314 0.07 % 7.91 1,004 0.63 % 253.80 144,313 0.69 % 265.91 28,665 0.55 % 152.95 57,268 0.48 % 180.12 10,067 0.69 % 234.47 2, 2, 238,500 0.59 % 210.19 0 0 % 0 327 0.07 % 8.23 3,076 1.92 % 777.58 154,120 0.74 % 283.98 23,331 0.44 % 124.49 48,814 0.41 % 153.53 8,832 0.61 % 205.71 sedem sedem 221,597 0.55 % 195.29 0 0 % 0 4,599 1.02 % 115.80 261 0.16 % 65.98 112,630 0.54 % 207.53 25,309 0.48 % 135.04 75,416 0.63 % 237.20 3,382 0.23 % 78.77 12 12 214,333 0.53 % 188.89 5 1.15 % 515.09 297 0.07 % 7.48 479 0.30 % 121.09 
109,658 0.53 % 202.05 25,719 0.49 % 137.23 71,254 0.59 % 224.11 6,921 0.48 % 161.20
3, 3, 214,002 0.53 % 188.60 0 0 % 0 288 0.06 % 7.25 2,450 1.53 % 619.33 143,636 0.69 % 264.66 19,566 0.37 % 104.40 40,880 0.34 % 128.58 7,182 0.50 % 167.28
osem osem 208,962 0.52 % 184.16 0 0 % 0 3,541 0.79 % 89.16 416 0.26 % 105.16 108,342 0.52 % 199.63 23,456 0.45 % 125.15 70,340 0.58 % 221.24 2,867 0.20 % 66.78
50 50 208,691 0.52 % 183.92 8 1.84 % 824.15 222 0.05 % 5.59 785 0.49 % 198.44 107,858 0.52 % 198.74 32,852 0.62 % 175.29 61,207 0.51 % 192.51 5,759 0.40 % 134.13
40 40 180,779 0.45 % 159.32 5 1.15 % 515.09 198 0.04 % 4.99 445 0.28 % 112.49 93,495 0.45 % 172.27 27,276 0.52 % 145.54 54,421 0.45 % 171.17 4,939 0.34 % 115.04
100 100 180,711 0.45 % 159.26 11 2.54 % 1,133.20 196 0.04 % 4.94 698 0.43 % 176.45 85,233 0.41 % 157.05 36,302 0.69 % 193.70 52,988 0.44 % 166.66 5,283 0.36 % 123.05
25 25 177,837 0.44 % 156.73 9 2.07 % 927.17 224 0.05 % 5.64 417 0.26 % 105.41 93,905 0.45 % 173.03 22,798 0.43 % 121.64 55,312 0.46 % 173.97 5,172 0.36 % 120.46

1.3.19 List of numeral lower-case word forms in the Gigafida 2.0 corpus with text-type distribution

File at CLARIN.SI: GF2.0-words-numerals-lowercase_forms-taxonomy-entire.tsv

Columns: Lower-case word form; Lemma; Lemma (lower-case); Morphosyntactic tag; Total absolute frequency of lower-case word form; Percentage of all found lower-case word forms; Total relative frequency (per million occurrences); Absolute frequency [SSJ.T.K.N]; Percentage [SSJ.T.K.N]; Relative frequency [SSJ.T.K.N]; Absolute frequency [SSJ.T.K.L]; Percentage [SSJ.T.K.L]; Relative frequency [SSJ.T.K.L]; Absolute frequency [SSJ.T.D]; Percentage [SSJ.T.D]; Relative frequency [SSJ.T.D]; Absolute frequency [SSJ.T.P.C]; Percentage [SSJ.T.P.C]; Relative frequency [SSJ.T.P.C]; Absolute frequency [SSJ.T.P.R]; Percentage [SSJ.T.P.R]; Relative frequency [SSJ.T.P.R]; Absolute frequency [SSJ.I]; Percentage [SSJ.I]; Relative frequency [SSJ.I]; Absolute frequency [SSJ.T.K.S]; Percentage [SSJ.T.K.S]; Relative frequency [SSJ.T.K.S]; msd01; msd02; msd03; msd04; msd05; msd06; msd07

1 1 1 Kag 473,721 1.18 % 417.49 34 7.83 % 3,502.63 1,167 0.26 % 29.38 5,780 3.60 % 1,461.12 224,981 1.08 % 414.54 75,915 1.44 % 405.06 134,125 1.12 % 421.86 31,719 2.19 % 738.77 K a g
2 2 2 Kag 448,702 1.11 % 395.44 32 7.37 % 3,296.59 870 0.19 % 21.91 5,495 3.43 % 1,389.08 232,497 1.11 % 428.39 74,329 1.41 % 396.60 108,721 0.91 % 341.96 26,758 1.85 % 623.23 K a g
prvi prvi prvi Kbvmei 396,558 0.99 % 349.48 0 0 % 0 6,100 1.35 % 153.59 1,176 0.73 % 297.28 172,347 0.82 % 317.56 53,793 1.02 % 287.02 152,714 1.27 % 480.33 10,428 0.72 % 242.88 K b v m e i
dva dva dva Kbgmdt 348,135 0.86 % 306.81 2 0.46 % 206.04 10,544 2.34 % 265.49 624 0.39 % 157.74 171,501 0.82 % 316 55,941 1.06 % 298.48 100,399 0.84 % 315.78 9,124 0.63 % 212.51 K b g m d t
ena en en Kbzzei 341,909 0.85 % 301.32 0 0 % 0 11,182 2.48 % 281.55 1,003 0.62 % 253.55 153,868 0.74 % 283.51 64,726 1.23 % 345.36 98,535 0.82 % 309.92 12,595 0.87 % 293.35 K b z z e i
eden eden eden Kbzmei 339,502 0.84 % 299.20 0 0 % 0 9,711 2.16 % 244.51 694 0.43 % 175.44 157,403 0.75 % 290.03 57,846 1.10 % 308.65 103,487 0.86 % 325.49 10,361 0.71 % 241.32 K b z m e i
1, 1, 1, Kav 338,262 0.84 % 298.11 0 0 % 0 729 0.16 % 18.36 3,351 2.09 % 847.10 212,046 1.01 % 390.71 30,858 0.59 % 164.65 80,704 0.67 % 253.84 10,574 0.73 % 246.28 K a v
3 3 3 Kag 334,249 0.83 % 294.57 20 4.61 % 2,060.37 684 0.15 % 17.22 3,802 2.37 % 961.10 178,016 0.85 % 328.01 52,468 1.00 % 279.95 79,212 0.66 % 249.14 20,047 1.38 % 466.92 K a g
dve dva dva Kbgzdt 321,312 0.80 % 283.17 2 0.46 % 206.04 9,104 2.02 % 229.23 613 0.38 % 154.96 155,708 0.74 % 286.90 51,674 0.98 % 275.72 93,971 0.78 % 295.56 10,240 0.71 % 238.50 K b g z d t
drugi drug drug Kbzzem 318,701 0.79 % 280.87 2 0.46 % 206.04 8,283 1.84 % 208.56 1,042 0.65 % 263.41 150,409 0.72 % 277.14 48,575 0.92 % 259.18 97,244 0.81 % 305.86 13,146 0.91 % 306.19 K b z z
e m 20 20 20 Kag 318,178 0.79 % 280.41 22 5.07 % 2,266.41 360 0.08 % 9.06 928 0.58 % 234.59 161,965 0.78 % 298.43 43,208 0.82 % 230.54 101,987 0.85 % 320.78 9,708 0.67 % 226.11 K a g 4 4 4 Kag 303,782 0.76 % 267.72 12 2.77 % 1,236.22 467 0.10 % 11.76 2,293 1.43 % 579.65 165,355 0.79 % 304.68 46,384 0.88 % 247.49 73,009 0.61 % 229.63 16,262 1.12 % 378.76 K a g 10 10 10 Kag 296,783 0.74 % 261.55 21 4.84 % 2,163.39 401 0.09 % 10.10 1,236 0.77 % 312.45 142,870 0.68 % 263.25 49,185 0.94 % 262.43 89,626 0.75 % 281.90 13,444 0.93 % 313.13 K a g drugim drug drug Kbzseo 282,258 0.70 % 248.75 0 0 % 0 966 0.21 % 24.32 223 0.14 % 56.37 123,086 0.59 % 226.79 27,448 0.52 % 146.45 127,023 1.06 % 399.52 3,512 0.24 % 81.80 K b z s e o drugi drug drug Kbzmmi 281,383 0.70 % 247.98 1 0.23 % 103.02 10,822 2.40 % 272.49 1,588 0.99 % 401.43 137,508 0.66 % 253.37 50,488 0.96 % 269.39 67,458 0.56 % 212.17 13,518 0.93 % 314.85 K b z m m i eno en en Kbzzet 278,348 0.69 % 245.31 3 0.69 % 309.06 12,481 2.77 % 314.26 843 0.53 % 213.10 122,051 0.58 % 224.89 52,746 1.00 % 281.44 78,644 0.66 % 247.36 11,580 0.80 % 269.71 K b z z e t pet pet pet Kbg-mt 265,597 0.66 % 234.07 0 0 % 0 5,688 1.26 % 143.22 452 0.28 % 114.26 134,694 0.64 % 248.18 35,088 0.67 % 187.22 85,586 0.71 % 269.19 4,089 0.28 % 95.24 K b g - m t 5 5 5 Kag 265,500 0.66 % 233.98 14 3.23 % 1,442.26 549 0.12 % 13.82 1,690 1.05 % 427.21 141,453 0.68 % 260.64 45,080 0.86 % 240.53 61,381 0.51 % 193.06 15,333 1.06 % 357.12 K a g 15 15 15 Kag 265,493 0.66 % 233.98 19 4.38 % 1,957.35 308 0.07 % 7.76 753 0.47 % 190.35 133,739 0.64 % 246.42 35,793 0.68 % 190.98 86,504 0.72 % 272.08 8,377 0.58 % 195.11 K a g tisoč tisoč tisoč Kbg-mt 256,433 0.64 % 225.99 0 0 % 0 4,189 0.93 % 105.47 223 0.14 % 56.37 169,114 0.81 % 311.60 31,179 0.59 % 166.36 49,029 0.41 % 154.21 2,699 0.19 % 62.86 K b g - m t 30 30 30 Kag 251,610 0.62 % 221.74 20 4.61 % 2,060.37 282 0.06 % 7.10 1,126 0.70 % 284.64 128,477 0.61 % 236.73 38,641 0.73 % 206.18 75,366 0.63 % 237.05 
7,698 0.53 % 179.30 K a g 6 6 6 Kag 241,633 0.60 % 212.95 2 0.46 % 206.04 314 0.07 % 7.91 1,004 0.63 % 253.80 144,313 0.69 % 265.91 28,665 0.55 % 152.95 57,268 0.48 % 180.12 10,067 0.69 % 234.47 K a g 2, 2, 2, Kav 238,500 0.59 % 210.19 0 0 % 0 327 0.07 % 8.23 3,076 1.92 % 777.58 154,120 0.74 % 283.98 23,331 0.44 % 124.49 48,814 0.41 % 153.53 8,832 0.61 % 205.71 K a v tri trije trije Kbgmmt 229,173 0.57 % 201.97 0 0 % 0 6,316 1.40 % 159.03 383 0.24 % 96.82 115,864 0.55 % 213.49 32,880 0.62 % 175.44 68,984 0.57 % 216.97 4,746 0.33 % 110.54 K b g m m t prva prvi prvi Kbvzei 217,410 0.54 % 191.60 0 0 % 0 3,743 0.83 % 94.24 482 0.30 % 121.84 105,876 0.51 % 195.08 34,249 0.65 % 182.74 65,986 0.55 % 207.54 7,074 0.49 % 164.76 K b v z e i 12 12 12 Kag 214,333 0.53 % 188.89 5 1.15 % 515.09 297 0.07 % 7.48 479 0.30 % 121.09 109,658 0.53 % 202.05 25,719 0.49 % 137.23 71,254 0.59 % 224.11 6,921 0.48 % 161.20 K a g 3, 3, 3, Kav 214,002 0.53 % 188.60 0 0 % 0 288 0.06 % 7.25 2,450 1.53 % 619.33 143,636 0.69 % 264.66 19,566 0.37 % 104.40 40,880 0.34 % 128.58 7,182 0.50 % 167.28 K a v deset deset deset Kbg-mt 213,634 0.53 % 188.27 0 0 % 0 5,929 1.32 % 149.29 265 0.17 % 66.99 112,078 0.54 % 206.51 30,541 0.58 % 162.96 61,469 0.51 % 193.34 3,352 0.23 % 78.07 K b g - m t prvi prvi prvi Kbvzem 211,378 0.53 % 186.29 1 0.23 % 103.02 1,813 0.40 % 45.65 616 0.38 % 155.72 105,688 0.51 % 194.74 21,510 0.41 % 114.77 76,604 0.64 % 240.94 5,146 0.35 % 119.86 K b v z e m 50 50 50 Kag 208,688 0.52 % 183.92 8 1.84 % 824.15 222 0.05 % 5.59 785 0.49 % 198.44 107,858 0.52 % 198.74 32,851 0.62 % 175.28 61,206 0.51 % 192.51 5,758 0.40 % 134.11 K a g drugih drug drug Kbzmmr 198,608 0.49 % 175.03 0 0 % 0 4,973 1.10 % 125.21 1,634 1.02 % 413.06 94,590 0.45 % 174.29 37,091 0.70 % 197.91 48,120 0.40 % 151.35 12,200 0.84 % 284.15 K b z m m r tri trije trije Kbgzmt 192,666 0.48 % 169.80 1 0.23 % 103.02 3,745 0.83 % 94.30 214 0.13 % 54.10 97,410 0.47 % 179.48 25,207 0.48 % 134.50 61,039 0.51 % 191.98 5,050 
0.35 % 117.62 K b g z m t

1.3.20 List of preposition lemmas in the Gigafida 2.0 corpus with text-type distribution

File at CLARIN.SI: GF2.0-words-prepositions-lemmas-taxonomy-entire.tsv

Columns: Lemma; Lemma (lower-case); Total absolute frequency of lemma; Percentage of all found lemmas; Total relative frequency (per million occurrences); Absolute frequency [SSJ.T.K.N]; Percentage [SSJ.T.K.N]; Relative frequency [SSJ.T.K.N]; Absolute frequency [SSJ.T.K.L]; Percentage [SSJ.T.K.L]; Relative frequency [SSJ.T.K.L]; Absolute frequency [SSJ.T.D]; Percentage [SSJ.T.D]; Relative frequency [SSJ.T.D]; Absolute frequency [SSJ.T.P.C]; Percentage [SSJ.T.P.C]; Relative frequency [SSJ.T.P.C]; Absolute frequency [SSJ.T.P.R]; Percentage [SSJ.T.P.R]; Relative frequency [SSJ.T.P.R]; Absolute frequency [SSJ.I]; Percentage [SSJ.I]; Relative frequency [SSJ.I]; Absolute frequency [SSJ.T.K.S]; Percentage [SSJ.T.K.S]; Relative frequency [SSJ.T.K.S]

v v 31,818,378 25.51 % 28,041.37 299 22.72 % 30,802.51 816,598 23.02 % 20,561.05 103,256 23.08 % 26,102 15,532,386 25.82 % 28,619.45 4,674,294 23.87 % 24,940.50 9,553,100 26.12 % 30,046.99 1,138,445 25.83 % 26,515.74
na na 19,072,589 15.29 % 16,808.58 248 18.84 % 25,548.57 555,927 15.67 % 13,997.64 56,074 12.53 % 14,174.90 9,072,361 15.08 % 16,716.43 3,009,439 15.37 % 16,057.38 5,747,052 15.72 % 18,075.97 631,488 14.33 % 14,708.11
z z 15,732,362 12.62 % 13,864.85 403 30.62 % 41,516.43 535,369 15.09 % 13,480.01 58,748 13.13 % 14,850.86 7,267,869 12.08 % 13,391.53 2,841,173 14.51 % 15,159.57 4,402,058 12.04 % 13,845.62 626,742 14.22 % 14,597.57
za za 15,195,407 12.18 % 13,391.64 88 6.69 % 9,065.62 294,850 8.31 % 7,424 67,714 15.13 % 17,117.37 7,527,662 12.51 % 13,870.22 2,303,946 11.77 % 12,293.10 4,564,728 12.48 % 14,357.26 436,419 9.90 % 10,164.72
po po 6,360,566 5.10 % 5,605.53 64 4.86 % 6,593.18 191,555 5.40 % 4,823.15 21,577 4.82 % 5,454.43 3,054,588 5.08 % 5,628.28 936,691 4.78 % 4,997.88
1,959,908 5.36 % 6,164.42 196,183 4.45 % 4,569.34 iz iz 4,006,518 3.21 % 3,530.92 37 2.81 % 3,811.68 134,681 3.80 % 3,391.12 23,762 5.31 % 6,006.78 2,017,138 3.35 % 3,716.71 611,339 3.12 % 3,261.90 1,060,540 2.90 % 3,335.67 159,021 3.61 % 3,703.79 pri pri 3,934,055 3.15 % 3,467.06 47 3.57 % 4,841.87 76,258 2.15 % 1,920.09 15,360 3.43 % 3,882.84 1,888,953 3.14 % 3,480.52 756,619 3.86 % 4,037.07 1,018,794 2.79 % 3,204.37 178,024 4.04 % 4,146.39 od od 3,855,989 3.09 % 3,398.26 20 1.52 % 2,060.37 141,756 4.00 % 3,569.26 13,534 3.02 % 3,421.25 1,843,270 3.06 % 3,396.35 638,050 3.26 % 3,404.43 1,061,423 2.90 % 3,338.45 157,936 3.58 % 3,678.52 o o 3,670,387 2.94 % 3,234.69 1 0.08 % 103.02 95,167 2.68 % 2,396.20 24,271 5.42 % 6,135.45 1,828,963 3.04 % 3,369.99 488,092 2.49 % 2,604.30 1,103,026 3.02 % 3,469.30 130,867 2.97 % 3,048.05 do do 3,434,842 2.75 % 3,027.11 16 1.22 % 1,648.30 84,916 2.39 % 2,138.09 13,221 2.96 % 3,342.13 1,686,736 2.80 % 3,107.92 544,963 2.78 % 2,907.74 976,111 2.67 % 3,070.12 128,879 2.92 % 3,001.75 med med 3,005,484 2.41 % 2,648.72 19 1.44 % 1,957.35 59,974 1.69 % 1,510.08 7,107 1.59 % 1,796.57 1,425,352 2.37 % 2,626.31 481,142 2.46 % 2,567.22 906,664 2.48 % 2,851.69 125,226 2.84 % 2,916.66 ob ob 2,529,897 2.03 % 2,229.59 9 0.68 % 927.17 74,696 2.10 % 1,880.76 5,209 1.16 % 1,316.78 1,338,788 2.23 % 2,466.81 342,313 1.75 % 1,826.47 700,467 1.92 % 2,203.15 68,415 1.55 % 1,593.47 pred pred 2,269,453 1.82 % 2,000.06 7 0.53 % 721.13 69,904 1.97 % 1,760.11 5,680 1.27 % 1,435.84 1,115,560 1.85 % 2,055.49 329,208 1.68 % 1,756.55 697,210 1.91 % 2,192.91 51,884 1.18 % 1,208.44 zaradi zaradi 1,792,327 1.44 % 1,579.57 0 0 % 0 35,672 1.00 % 898.18 5,122 1.15 % 1,294.79 861,596 1.43 % 1,587.55 266,969 1.36 % 1,424.46 569,966 1.56 % 1,792.69 53,002 1.20 % 1,234.48 brez brez 895,603 0.72 % 789.29 11 0.84 % 1,133.20 36,652 1.03 % 922.86 2,893 0.65 % 731.32 419,896 0.70 % 773.69 166,614 0.85 % 889 238,407 0.65 % 749.85 31,130 0.71 % 725.05 k k 881,577 0.71 % 776.93 
14 1.06 % 1,442.26 63,599 1.79 % 1,601.35 3,769 0.84 % 952.76 391,830 0.65 % 721.97 146,443 0.75 % 781.37 227,339 0.62 % 715.04 48,583 1.10 % 1,131.56 proti proti 875,538 0.70 % 771.61 4 0.30 % 412.07 39,338 1.11 % 990.49 2,536 0.57 % 641.07 400,159 0.67 % 737.32 105,654 0.54 % 563.74 296,441 0.81 % 932.38 31,406 0.71 % 731.48 pod pod 771,297 0.62 % 679.74 6 0.46 % 618.11 36,869 1.04 % 928.32 3,253 0.73 % 822.32 357,676 0.59 % 659.04 131,152 0.67 % 699.78 211,135 0.58 % 664.07 31,206 0.71 % 726.82 poleg poleg 592,281 0.47 % 521.97 0 0 % 0 12,883 0.36 % 324.38 1,216 0.27 % 307.39 284,760 0.47 % 524.69 118,883 0.61 % 634.32 155,163 0.42 % 488.03 19,376 0.44 % 451.29 nad nad 553,355 0.44 % 487.67 1 0.08 % 103.02 24,854 0.70 % 625.80 1,840 0.41 % 465.13 259,821 0.43 % 478.74 89,316 0.46 % 476.56 152,855 0.42 % 480.77 24,668 0.56 % 574.55 kljub kljub 509,251 0.41 % 448.80 0 0 % 0 11,076 0.31 % 278.88 835 0.19 % 211.08 262,664 0.44 % 483.98 90,789 0.46 % 484.42 130,869 0.36 % 411.62 13,018 0.29 % 303.20 čez čez 342,085 0.27 % 301.48 2 0.15 % 206.04 32,061 0.90 % 807.26 764 0.17 % 193.13 156,143 0.26 % 287.70 63,586 0.33 % 339.27 76,727 0.21 % 241.33 12,802 0.29 % 298.17 glede glede 271,297 0.22 % 239.09 0 0 % 0 3,648 0.10 % 91.85 1,543 0.34 % 390.05 106,244 0.18 % 195.76 30,305 0.15 % 161.70 121,462 0.33 % 382.03 8,095 0.18 % 188.54 skozi skozi 261,306 0.21 % 230.29 2 0.15 % 206.04 26,289 0.74 % 661.93 957 0.21 % 241.92 106,627 0.18 % 196.47 51,512 0.26 % 274.85 59,303 0.16 % 186.52 16,616 0.38 % 387.01 izmed izmed 231,718 0.19 % 204.21 0 0 % 0 6,851 0.19 % 172.50 698 0.16 % 176.45 98,094 0.16 % 180.74 47,539 0.24 % 253.65 68,140 0.19 % 214.32 10,396 0.24 % 242.14 prek prek 230,294 0.18 % 202.96 0 0 % 0 4,070 0.12 % 102.48 543 0.12 % 137.26 100,698 0.17 % 185.54 49,035 0.25 % 261.63 68,452 0.19 % 215.30 7,496 0.17 % 174.59 konec konec 210,897 0.17 % 185.86 0 0 % 0 2,313 0.07 % 58.24 187 0.04 % 47.27 112,615 0.19 % 207.50 25,891 0.13 % 138.15 66,489 0.18 % 209.13 3,402 
0.08 % 79.24
okoli okoli 187,237 0.15 % 165.01 1 0.08 % 103.02 10,192 0.29 % 256.62 320 0.07 % 80.89 81,299 0.14 % 149.80 33,734 0.17 % 179.99 53,728 0.15 % 168.99 7,963 0.18 % 185.47
namesto namesto 180,360 0.14 % 158.95 5 0.38 % 515.09 7,987 0.23 % 201.10 634 0.14 % 160.27 81,373 0.14 % 149.94 38,674 0.20 % 206.35 42,657 0.12 % 134.17 9,030 0.20 % 210.32
sredi sredi 155,626 0.12 % 137.15 0 0 % 0 9,595 0.27 % 241.59 228 0.05 % 57.64 75,945 0.13 % 139.93 25,698 0.13 % 137.12 39,051 0.11 % 122.83 5,109 0.12 % 118.99
preko preko 107,027 0.09 % 94.32 9 0.68 % 927.17 1,773 0.05 % 44.64 563 0.13 % 142.32 43,418 0.07 % 80 20,743 0.11 % 110.68 35,516 0.10 % 111.71 5,005 0.11 % 116.57
zoper zoper 96,349 0.08 % 84.91 0 0 % 0 644 0.02 % 16.22 986 0.22 % 249.25 48,608 0.08 % 89.56 6,457 0.03 % 34.45 37,620 0.10 % 118.32 2,034 0.05 % 47.37

1.3.21 List of preposition lower-case word forms in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-words-prepositions-lowercase_forms-taxonomy-entire.tsv
Columns: Lower-case word form; Lemma; Lemma (lower-case); Morphosyntactic tag; Total absolute frequency of lower-case word form; Percentage of all found lower-case word forms; Total relative frequency (per million occurrences); then Absolute frequency, Percentage, and Relative frequency for each text type in the order [SSJ.T.K.N], [SSJ.T.K.L], [SSJ.T.D], [SSJ.T.P.C], [SSJ.T.P.R], [SSJ.I], [SSJ.T.K.S]; msd01; msd02.

v v v Dm 25,046,467 20.08 % 22,073.32 177 13.45 % 18,234.26 524,683 14.79 % 13,210.95 84,105
18.80 % 21,260.84 12,296,064 20.44 % 22,656.31 3,596,953 18.37 % 19,192.16 7,653,153 20.93 % 24,071.16 891,332 20.22 % 20,760.18 D m za za za Dt 14,767,070 11.84 % 13,014.14 86 6.54 % 8,859.59 258,259 7.28 % 6,502.68 66,736 14.92 % 16,870.14 7,335,761 12.19 % 13,516.62 2,231,808 11.40 % 11,908.19 4,450,678 12.17 % 13,998.54 423,742 9.61 % 9,869.45 D t na na na Dm 10,451,191 8.38 % 9,210.58 96 7.29 % 9,889.77 252,708 7.12 % 6,362.91 25,421 5.68 % 6,426.15 5,083,066 8.45 % 9,365.89 1,476,984 7.54 % 7,880.70 3,323,056 9.09 % 10,451.88 289,860 6.58 % 6,751.18 D m na na na Dt 8,621,398 6.91 % 7,597.99 152 11.55 % 15,658.80 303,219 8.55 % 7,634.73 30,653 6.85 % 7,748.75 3,989,295 6.63 % 7,350.54 1,532,455 7.83 % 8,176.68 2,423,996 6.63 % 7,624.10 341,628 7.75 % 7,956.92 D t z z z Do 8,028,598 6.44 % 7,075.56 221 16.79 % 22,767.08 288,180 8.12 % 7,256.06 30,405 6.80 % 7,686.06 3,707,344 6.16 % 6,831.03 1,482,833 7.57 % 7,911.91 2,189,741 5.99 % 6,887.31 329,874 7.48 % 7,683.16 D o v v v Dt 6,767,349 5.43 % 5,964.03 122 9.27 % 12,568.25 291,632 8.22 % 7,342.98 19,145 4.28 % 4,839.65 3,235,048 5.38 % 5,960.79 1,076,079 5.50 % 5,741.60 1,898,574 5.19 % 5,971.51 246,749 5.60 % 5,747.08 D t s z z Do 6,555,866 5.26 % 5,777.65 165 12.54 % 16,998.04 203,755 5.74 % 5,130.33 25,576 5.72 % 6,465.34 3,021,193 5.02 % 5,566.75 1,200,090 6.13 % 6,403.29 1,844,962 5.04 % 5,802.89 260,125 5.90 % 6,058.62 D o po po po Dm 6,013,433 4.82 % 5,299.61 62 4.71 % 6,387.14 178,753 5.04 % 4,500.81 20,886 4.67 % 5,279.76 2,857,328 4.75 % 5,264.82 882,042 4.50 % 4,706.29 1,887,763 5.16 % 5,937.51 186,599 4.23 % 4,346.11 D m iz iz iz Dr 4,006,515 3.21 % 3,530.92 37 2.81 % 3,811.68 134,681 3.80 % 3,391.12 23,762 5.31 % 6,006.78 2,017,136 3.35 % 3,716.71 611,339 3.12 % 3,261.90 1,060,539 2.90 % 3,335.67 159,021 3.61 % 3,703.79 D r pri pri pri Dm 3,934,007 3.15 % 3,467.02 47 3.57 % 4,841.87 76,256 2.15 % 1,920.04 15,360 3.43 % 3,882.84 1,888,931 3.14 % 3,480.48 756,605 3.86 % 4,037 1,018,784 2.79 % 
3,204.34 178,024 4.04 % 4,146.39 D m od od od Dr 3,855,988 3.09 % 3,398.26 20 1.52 % 2,060.37 141,756 4.00 % 3,569.26 13,534 3.02 % 3,421.25 1,843,270 3.06 % 3,396.35 638,050 3.26 % 3,404.43 1,061,422 2.90 % 3,338.45 157,936 3.58 % 3,678.52 D r o o o Dm 3,670,163 2.94 % 3,234.50 1 0.08 % 103.02 95,165 2.68 % 2,396.15 24,268 5.42 % 6,134.69 1,828,883 3.04 % 3,369.84 488,034 2.49 % 2,603.99 1,102,955 3.02 % 3,469.08 130,857 2.97 % 3,047.82 D m do do do Dr 3,434,841 2.75 % 3,027.11 16 1.22 % 1,648.30 84,916 2.39 % 2,138.09 13,221 2.96 % 3,342.13 1,686,736 2.80 % 3,107.92 544,962 2.78 % 2,907.74 976,111 2.67 % 3,070.12 128,879 2.92 % 3,001.75 D r med med med Do 2,850,292 2.29 % 2,511.95 19 1.44 % 1,957.35 56,286 1.59 % 1,417.22 6,809 1.52 % 1,721.24 1,348,730 2.24 % 2,485.12 450,646 2.30 % 2,404.50 868,444 2.38 % 2,731.48 119,358 2.71 % 2,779.99 D o ob ob ob Dm 2,464,279 1.98 % 2,171.76 7 0.53 % 721.13 66,021 1.86 % 1,662.34 5,089 1.14 % 1,286.44 1,311,400 2.18 % 2,416.34 329,331 1.68 % 1,757.20 686,868 1.88 % 2,160.38 65,563 1.49 % 1,527.04 D m pred pred pred Do 2,238,062 1.79 % 1,972.39 7 0.53 % 721.13 67,927 1.92 % 1,710.33 5,597 1.25 % 1,414.86 1,101,401 1.83 % 2,029.40 324,228 1.66 % 1,729.97 687,960 1.88 % 2,163.81 50,942 1.16 % 1,186.50 D o zaradi zaradi zaradi Dr 1,792,318 1.44 % 1,579.56 0 0 % 0 35,669 1.00 % 898.11 5,122 1.15 % 1,294.79 861,596 1.43 % 1,587.55 266,969 1.36 % 1,424.46 569,960 1.56 % 1,792.67 53,002 1.20 % 1,234.48 D r brez brez brez Dr 895,603 0.72 % 789.29 11 0.84 % 1,133.20 36,652 1.03 % 922.86 2,893 0.65 % 731.32 419,896 0.70 % 773.69 166,614 0.85 % 889 238,407 0.65 % 749.85 31,130 0.71 % 725.05 D r proti proti proti Dd 875,538 0.70 % 771.61 4 0.30 % 412.07 39,338 1.11 % 990.49 2,536 0.57 % 641.07 400,159 0.67 % 737.32 105,654 0.54 % 563.74 296,441 0.81 % 932.38 31,406 0.71 % 731.48 D d k k k Dd 829,462 0.67 % 731 14 1.06 % 1,442.26 59,763 1.69 % 1,504.77 3,567 0.80 % 901.70 368,683 0.61 % 679.32 138,006 0.70 % 736.35 213,857 0.58 % 672.64 
45,572 1.03 % 1,061.43 D d pod pod pod Do 698,970 0.56 % 616 6 0.46 % 618.11 33,099 0.93 % 833.40 3,115 0.70 % 787.44 325,671 0.54 % 600.07 118,355 0.60 % 631.50 189,949 0.52 % 597.44 28,775 0.65 % 670.20 D o z z z Dr 641,940 0.52 % 565.74 15 1.14 % 1,545.28 24,561 0.69 % 618.42 1,387 0.31 % 350.62 299,030 0.50 % 550.98 87,984 0.45 % 469.45 208,284 0.57 % 655.11 20,679 0.47 % 481.64 D r poleg poleg poleg Dr 592,281 0.47 % 521.97 0 0 % 0 12,883 0.36 % 324.38 1,216 0.27 % 307.39 284,760 0.47 % 524.69 118,883 0.61 % 634.32 155,163 0.42 % 488.03 19,376 0.44 % 451.29 D r nad nad nad Do 530,732 0.43 % 467.73 1 0.08 % 103.02 23,418 0.66 % 589.64 1,761 0.39 % 445.16 248,222 0.41 % 457.37 85,792 0.44 % 457.76 147,925 0.41 % 465.26 23,613 0.54 % 549.97 D o kljub kljub kljub Dd 509,251 0.41 % 448.80 0 0 % 0 11,076 0.31 % 278.88 835 0.19 % 211.08 262,664 0.44 % 483.98 90,789 0.46 % 484.42 130,869 0.36 % 411.62 13,018 0.29 % 303.20 D d s z z Dr 505,930 0.41 % 445.87 2 0.15 % 206.04 18,873 0.53 % 475.20 1,380 0.31 % 348.85 240,284 0.40 % 442.74 70,261 0.36 % 374.89 159,066 0.43 % 500.30 16,064 0.36 % 374.15 D r za za za Do 427,846 0.34 % 377.06 2 0.15 % 206.04 36,585 1.03 % 921.17 975 0.22 % 246.47 191,665 0.32 % 353.16 72,048 0.37 % 384.42 113,913 0.31 % 358.29 12,658 0.29 % 294.82 D o po po po Dt 347,132 0.28 % 305.93 2 0.15 % 206.04 12,802 0.36 % 322.34 691 0.15 % 174.68 197,259 0.33 % 363.46 54,649 0.28 % 291.59 72,145 0.20 % 226.91 9,584 0.22 % 223.22 D t čez čez čez Dt 342,085 0.27 % 301.48 2 0.15 % 206.04 32,061 0.90 % 807.26 764 0.17 % 193.13 156,143 0.26 % 287.70 63,586 0.33 % 339.27 76,727 0.21 % 241.33 12,802 0.29 % 298.17 D t glede glede glede Dr 271,297 0.22 % 239.09 0 0 % 0 3,648 0.10 % 91.85 1,543 0.34 % 390.05 106,244 0.18 % 195.76 30,305 0.15 % 161.70 121,462 0.33 % 382.03 8,095 0.18 % 188.54 D r skozi skozi skozi Dt 261,306 0.21 % 230.29 2 0.15 % 206.04 26,289 0.74 % 661.93 957 0.21 % 241.92 106,627 0.18 % 196.47 51,512 0.26 % 274.85 59,303 0.16 % 186.52 16,616 
0.38 % 387.01 D t
izmed izmed izmed Dr 231,712 0.19 % 204.21 0 0 % 0 6,851 0.19 % 172.50 698 0.16 % 176.45 98,092 0.16 % 180.74 47,537 0.24 % 253.64 68,138 0.19 % 214.31 10,396 0.24 % 242.14 D r

1.3.22 List of conjunction lemmas in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-words-conjunctions-lemmas-taxonomy-entire.tsv
Columns: Lemma; Lemma (lower-case); Total absolute frequency of lemma; Percentage of all found lemmas; Total relative frequency (per million occurrences); then Absolute frequency, Percentage, and Relative frequency for each text type in the order [SSJ.T.K.N], [SSJ.T.K.L], [SSJ.T.D], [SSJ.T.P.C], [SSJ.T.P.R], [SSJ.I], [SSJ.T.K.S].

in in 29,117,711 29.55 % 25,661.29 619 55.66 % 63,768.41 1,187,174 30.09 % 29,891.76 115,835 29.37 % 29,281.84 13,762,326 29.85 % 25,358 5,256,746 29.89 % 28,048.27 7,456,548 28.18 % 23,452.78 1,338,463 33.09 % 31,174.39
da da 14,810,352 15.03 % 13,052.29 124 11.15 % 12,774.29 752,394 19.07 % 18,944.47 53,649 13.60 % 13,561.89 6,760,076 14.66 % 12,455.89 2,457,240 13.97 % 13,111.03 4,260,568 16.10 % 13,400.60 526,301 13.01 % 12,258.18
ki ki 12,623,562 12.81 % 11,125.08 55 4.95 % 5,666.01 338,126 8.57 % 8,513.65 55,386 14.04 % 14,000.98 5,915,959 12.83 % 10,900.55 2,202,562 12.52 % 11,752.15 3,594,899 13.58 % 11,306.89 516,575 12.77 % 12,031.65
pa pa 11,003,780 11.17 % 9,697.58 37 3.33 % 3,811.68 319,351 8.09 % 8,040.91 25,847 6.55 % 6,533.84 5,422,876 11.76 % 9,992.01 1,898,081 10.79 % 10,127.54 3,016,889 11.40 % 9,488.90 320,699
7.93 % 7,469.46 kot kot 5,694,030 5.78 % 5,018.12 23 2.07 % 2,369.42 210,691 5.34 % 5,304.97 16,409 4.16 % 4,148.02 2,600,526 5.64 % 4,791.64 973,450 5.54 % 5,194.01 1,661,150 6.28 % 5,224.75 231,781 5.73 % 5,398.46 ko ko 3,180,102 3.23 % 2,802.61 36 3.24 % 3,708.66 225,473 5.71 % 5,677.17 9,198 2.33 % 2,325.16 1,386,450 3.01 % 2,554.63 573,281 3.26 % 3,058.84 862,002 3.26 % 2,711.22 123,662 3.06 % 2,880.24 ali ali 2,999,080 3.04 % 2,643.07 42 3.78 % 4,326.77 115,783 2.94 % 2,915.29 34,601 8.77 % 8,746.76 1,285,899 2.79 % 2,369.35 682,175 3.88 % 3,639.86 647,762 2.45 % 2,037.38 232,818 5.76 % 5,422.61 če če 2,441,017 2.48 % 2,151.26 27 2.43 % 2,781.50 127,590 3.23 % 3,212.58 23,279 5.90 % 5,884.68 1,064,667 2.31 % 1,961.72 547,603 3.11 % 2,921.83 539,153 2.04 % 1,695.78 138,698 3.43 % 3,230.44 saj saj 1,922,955 1.95 % 1,694.69 0 0 % 0 52,415 1.33 % 1,319.75 1,897 0.48 % 479.54 956,035 2.07 % 1,761.56 353,479 2.01 % 1,886.05 513,449 1.94 % 1,614.93 45,680 1.13 % 1,063.94 ter ter 1,668,988 1.69 % 1,470.87 100 8.99 % 10,301.84 23,537 0.60 % 592.64 8,191 2.08 % 2,070.60 765,500 1.66 % 1,410.48 282,898 1.61 % 1,509.45 522,212 1.97 % 1,642.49 66,550 1.65 % 1,550.03 a a 1,586,387 1.61 % 1,398.07 2 0.18 % 206.04 65,928 1.67 % 1,660 2,997 0.76 % 757.61 611,753 1.33 % 1,127.20 263,856 1.50 % 1,407.85 605,982 2.29 % 1,905.97 35,869 0.89 % 835.43 ker ker 1,508,421 1.53 % 1,329.36 2 0.18 % 206.04 71,438 1.81 % 1,798.73 5,756 1.46 % 1,455.05 727,004 1.58 % 1,339.55 293,861 1.67 % 1,567.95 352,423 1.33 % 1,108.46 57,937 1.43 % 1,349.42 zato zato 1,302,554 1.32 % 1,147.93 0 0 % 0 34,925 0.89 % 879.37 3,264 0.83 % 825.10 637,853 1.38 % 1,175.29 249,506 1.42 % 1,331.28 325,480 1.23 % 1,023.72 51,526 1.27 % 1,200.10 kjer kjer 1,147,657 1.17 % 1,011.42 0 0 % 0 30,047 0.76 % 756.55 2,761 0.70 % 697.95 561,556 1.22 % 1,034.70 188,431 1.07 % 1,005.41 328,845 1.24 % 1,034.30 36,017 0.89 % 838.88 vendar vendar 997,194 1.01 % 878.82 0 0 % 0 53,106 1.35 % 1,337.15 3,173 0.81 % 802.10 515,237 
1.12 % 949.36 192,538 1.09 % 1,027.32 185,001 0.70 % 581.88 48,139 1.19 % 1,121.21 namreč namreč 902,134 0.92 % 795.05 0 0 % 0 9,366 0.24 % 235.83 1,164 0.29 % 294.25 452,514 0.98 % 833.79 148,837 0.85 % 794.15 272,426 1.03 % 856.85 17,827 0.44 % 415.21 čeprav čeprav 679,637 0.69 % 598.96 0 0 % 0 28,778 0.73 % 724.60 1,251 0.32 % 316.24 333,503 0.72 % 614.50 130,569 0.74 % 696.67 160,850 0.61 % 505.92 24,686 0.61 % 574.97 oziroma oziroma 648,207 0.66 % 571.26 2 0.18 % 206.04 4,180 0.11 % 105.25 10,581 2.68 % 2,674.76 332,593 0.72 % 612.82 100,468 0.57 % 536.06 177,018 0.67 % 556.77 23,365 0.58 % 544.20 toda toda 588,678 0.60 % 518.80 0 0 % 0 48,129 1.22 % 1,211.84 1,443 0.37 % 364.77 303,559 0.66 % 559.33 106,097 0.60 % 566.10 102,797 0.39 % 323.32 26,653 0.66 % 620.78 ampak ampak 572,085 0.58 % 504.18 0 0 % 0 58,951 1.49 % 1,484.32 2,969 0.75 % 750.53 227,969 0.49 % 420.05 115,041 0.65 % 613.82 148,319 0.56 % 466.50 18,836 0.47 % 438.71 tako tako 410,279 0.42 % 361.58 12 1.08 % 1,236.22 11,498 0.29 % 289.51 1,533 0.39 % 387.53 193,920 0.42 % 357.31 77,933 0.44 % 415.82 107,702 0.41 % 338.75 17,681 0.44 % 411.81 naj naj 389,738 0.40 % 343.47 0 0 % 0 25,019 0.63 % 629.95 1,027 0.26 % 259.61 187,176 0.41 % 344.88 57,022 0.32 % 304.25 105,768 0.40 % 332.67 13,726 0.34 % 319.69 kakor kakor 272,376 0.28 % 240.04 0 0 % 0 40,148 1.02 % 1,010.88 1,664 0.42 % 420.64 123,199 0.27 % 227 51,656 0.29 % 275.62 31,646 0.12 % 99.53 24,063 0.59 % 560.46 sicer sicer 263,692 0.27 % 232.39 0 0 % 0 2,680 0.07 % 67.48 847 0.21 % 214.11 130,258 0.28 % 240.01 36,602 0.21 % 195.30 86,507 0.33 % 272.09 6,798 0.17 % 158.33 temveč temveč 235,490 0.24 % 207.54 0 0 % 0 7,332 0.19 % 184.61 373 0.10 % 94.29 110,231 0.24 % 203.11 43,698 0.25 % 233.16 57,822 0.22 % 181.87 16,034 0.40 % 373.45 torej torej 196,891 0.20 % 173.52 0 0 % 0 3,308 0.08 % 83.29 593 0.15 % 149.90 96,806 0.21 % 178.37 39,105 0.22 % 208.65 48,069 0.18 % 151.19 9,010 0.22 % 209.85 kajti kajti 179,605 0.18 % 158.28 0 0 % 0 
11,065 0.28 % 278.60 1,131 0.29 % 285.90 106,508 0.23 % 196.25 32,173 0.18 % 171.66 21,181 0.08 % 66.62 7,547 0.19 % 175.78
vendarle vendarle 165,832 0.17 % 146.15 0 0 % 0 4,087 0.10 % 102.91 544 0.14 % 137.52 91,249 0.20 % 168.13 23,643 0.13 % 126.15 42,553 0.16 % 133.84 3,756 0.09 % 87.48
preden preden 148,498 0.15 % 130.87 8 0.72 % 824.15 20,759 0.53 % 522.69 629 0.16 % 159 53,160 0.12 % 97.95 32,176 0.18 % 171.68 32,924 0.12 % 103.55 8,842 0.22 % 205.94
dokler dokler 147,332 0.15 % 129.84 19 1.71 % 1,957.35 16,427 0.42 % 413.61 1,089 0.28 % 275.29 59,151 0.13 % 108.99 30,010 0.17 % 160.12 31,270 0.12 % 98.35 9,366 0.23 % 218.15
kadar kadar 133,016 0.14 % 117.23 1 0.09 % 103.02 13,669 0.35 % 344.17 1,820 0.46 % 460.08 44,635 0.10 % 82.24 34,877 0.20 % 186.09 21,626 0.08 % 68.02 16,388 0.41 % 381.70
kolikor kolikor 105,767 0.11 % 93.21 1 0.09 % 103.02 5,383 0.14 % 135.54 830 0.21 % 209.82 50,897 0.11 % 93.78 18,240 0.10 % 97.32 26,006 0.10 % 81.80 4,410 0.11 % 102.71

1.3.23 List of conjunction lower-case word forms in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-words-conjunctions-lowercase_forms-taxonomy-entire.tsv
Columns: Lower-case word form; Lemma; Lemma (lower-case); Morphosyntactic tag; Total absolute frequency of lower-case word form; Percentage of all found lower-case word forms; Total relative frequency (per million occurrences); then Absolute frequency, Percentage, and Relative frequency for each text type in the order [SSJ.T.K.N], [SSJ.T.K.L], [SSJ.T.D], [SSJ.T.P.C], [SSJ.T.P.R], [SSJ.I], [SSJ.T.K.S]; msd01; msd02.

in in in Vp 29,117,711 29.55 % 25,661.29 619 55.66 % 63,768.41 1,187,174 30.09 % 29,891.76 115,835 29.37 % 29,281.84 13,762,326 29.85 % 25,358 5,256,746 29.89 % 28,048.27 7,456,548 28.18 % 23,452.78 1,338,463 33.09 % 31,174.39 V p
da da da Vd 14,810,352 15.03 % 13,052.29 124 11.15 % 12,774.29 752,394 19.07 % 18,944.47 53,649 13.60 % 13,561.89 6,760,076 14.66 % 12,455.89 2,457,240 13.97 % 13,111.03 4,260,568 16.10 % 13,400.60 526,301 13.01 % 12,258.18 V d
ki ki ki Vd 12,623,562 12.81 % 11,125.08 55 4.95 % 5,666.01 338,126 8.57 % 8,513.65 55,386 14.04 % 14,000.98 5,915,959 12.83 % 10,900.55 2,202,562 12.52 % 11,752.15 3,594,899 13.58 % 11,306.89 516,575 12.77 % 12,031.65 V d
pa pa pa Vp 11,003,780 11.17 % 9,697.58 37 3.33 % 3,811.68 319,351 8.09 % 8,040.91 25,847 6.55 % 6,533.84 5,422,876 11.76 % 9,992.01 1,898,081 10.79 % 10,127.54 3,016,889 11.40 % 9,488.90 320,699 7.93 % 7,469.46 V p
kot kot kot Vd 5,694,030 5.78 % 5,018.12 23 2.07 % 2,369.42 210,691 5.34 % 5,304.97 16,409 4.16 % 4,148.02 2,600,526 5.64 % 4,791.64 973,450 5.54 % 5,194.01 1,661,150 6.28 % 5,224.75 231,781 5.73 % 5,398.46 V d
ko ko ko Vd 3,180,102 3.23 % 2,802.61 36 3.24 % 3,708.66 225,473 5.71 % 5,677.17 9,198 2.33 % 2,325.16 1,386,450 3.01 % 2,554.63 573,281 3.26 % 3,058.84 862,002 3.26 % 2,711.22 123,662 3.06 % 2,880.24 V d
ali ali ali Vp 2,999,080 3.04 % 2,643.07 42 3.78 % 4,326.77 115,783 2.94 % 2,915.29 34,601 8.77 % 8,746.76 1,285,899 2.79 % 2,369.35 682,175 3.88 % 3,639.86 647,762 2.45 % 2,037.38 232,818 5.76 % 5,422.61 V p
če če če Vd 2,441,017 2.48 % 2,151.26 27 2.43 % 2,781.50 127,590 3.23 % 3,212.58 23,279 5.90 % 5,884.68 1,064,667 2.31 % 1,961.72 547,603 3.11 % 2,921.83 539,153 2.04 % 1,695.78 138,698 3.43 % 3,230.44 V d
saj saj saj Vp 1,922,955 1.95 % 1,694.69 0 0 % 0 52,415 1.33 % 1,319.75 1,897 0.48 % 479.54 956,035 2.07 % 1,761.56 353,479 2.01 % 1,886.05 513,449 1.94 % 1,614.93 45,680 1.13 % 1,063.94 V p
ter ter ter Vp 1,668,988
1.69 % 1,470.87 100 8.99 % 10,301.84 23,537 0.60 % 592.64 8,191 2.08 % 2,070.60 765,500 1.66 % 1,410.48 282,898 1.61 % 1,509.45 522,212 1.97 % 1,642.49 66,550 1.65 % 1,550.03 V p a a a Vp 1,586,387 1.61 % 1,398.07 2 0.18 % 206.04 65,928 1.67 % 1,660 2,997 0.76 % 757.61 611,753 1.33 % 1,127.20 263,856 1.50 % 1,407.85 605,982 2.29 % 1,905.97 35,869 0.89 % 835.43 V p ker ker ker Vd 1,508,421 1.53 % 1,329.36 2 0.18 % 206.04 71,438 1.81 % 1,798.73 5,756 1.46 % 1,455.05 727,004 1.58 % 1,339.55 293,861 1.67 % 1,567.95 352,423 1.33 % 1,108.46 57,937 1.43 % 1,349.42 V d zato zato zato Vp 1,302,554 1.32 % 1,147.93 0 0 % 0 34,925 0.89 % 879.37 3,264 0.83 % 825.10 637,853 1.38 % 1,175.29 249,506 1.42 % 1,331.28 325,480 1.23 % 1,023.72 51,526 1.27 % 1,200.10 V p kjer kjer kjer Vd 1,147,657 1.17 % 1,011.42 0 0 % 0 30,047 0.76 % 756.55 2,761 0.70 % 697.95 561,556 1.22 % 1,034.70 188,431 1.07 % 1,005.41 328,845 1.24 % 1,034.30 36,017 0.89 % 838.88 V d vendar vendar vendar Vp 997,194 1.01 % 878.82 0 0 % 0 53,106 1.35 % 1,337.15 3,173 0.81 % 802.10 515,237 1.12 % 949.36 192,538 1.09 % 1,027.32 185,001 0.70 % 581.88 48,139 1.19 % 1,121.21 V p namreč namreč namreč Vp 902,134 0.92 % 795.05 0 0 % 0 9,366 0.24 % 235.83 1,164 0.29 % 294.25 452,514 0.98 % 833.79 148,837 0.85 % 794.15 272,426 1.03 % 856.85 17,827 0.44 % 415.21 V p čeprav čeprav čeprav Vd 679,637 0.69 % 598.96 0 0 % 0 28,778 0.73 % 724.60 1,251 0.32 % 316.24 333,503 0.72 % 614.50 130,569 0.74 % 696.67 160,850 0.61 % 505.92 24,686 0.61 % 574.97 V d oziroma oziroma oziroma Vp 648,207 0.66 % 571.26 2 0.18 % 206.04 4,180 0.11 % 105.25 10,581 2.68 % 2,674.76 332,593 0.72 % 612.82 100,468 0.57 % 536.06 177,018 0.67 % 556.77 23,365 0.58 % 544.20 V p toda toda toda Vp 588,678 0.60 % 518.80 0 0 % 0 48,129 1.22 % 1,211.84 1,443 0.37 % 364.77 303,559 0.66 % 559.33 106,097 0.60 % 566.10 102,797 0.39 % 323.32 26,653 0.66 % 620.78 V p ampak ampak ampak Vp 572,085 0.58 % 504.18 0 0 % 0 58,951 1.49 % 1,484.32 2,969 0.75 % 750.53 227,969 
0.49 % 420.05 115,041 0.65 % 613.82 148,319 0.56 % 466.50 18,836 0.47 % 438.71 V p tako tako tako Vp 410,279 0.42 % 361.58 12 1.08 % 1,236.22 11,498 0.29 % 289.51 1,533 0.39 % 387.53 193,920 0.42 % 357.31 77,933 0.44 % 415.82 107,702 0.41 % 338.75 17,681 0.44 % 411.81 V p naj naj naj Vd 389,738 0.40 % 343.47 0 0 % 0 25,019 0.63 % 629.95 1,027 0.26 % 259.61 187,176 0.41 % 344.88 57,022 0.32 % 304.25 105,768 0.40 % 332.67 13,726 0.34 % 319.69 V d kakor kakor kakor Vd 272,374 0.28 % 240.04 0 0 % 0 40,148 1.02 % 1,010.88 1,664 0.42 % 420.64 123,198 0.27 % 227 51,656 0.29 % 275.62 31,646 0.12 % 99.53 24,062 0.59 % 560.43 V d sicer sicer sicer Vp 263,692 0.27 % 232.39 0 0 % 0 2,680 0.07 % 67.48 847 0.21 % 214.11 130,258 0.28 % 240.01 36,602 0.21 % 195.30 86,507 0.33 % 272.09 6,798 0.17 % 158.33 V p temveč temveč temveč Vp 235,490 0.24 % 207.54 0 0 % 0 7,332 0.19 % 184.61 373 0.10 % 94.29 110,231 0.24 % 203.11 43,698 0.25 % 233.16 57,822 0.22 % 181.87 16,034 0.40 % 373.45 V p torej torej torej Vp 196,891 0.20 % 173.52 0 0 % 0 3,308 0.08 % 83.29 593 0.15 % 149.90 96,806 0.21 % 178.37 39,105 0.22 % 208.65 48,069 0.18 % 151.19 9,010 0.22 % 209.85 V p kajti kajti kajti Vp 179,605 0.18 % 158.28 0 0 % 0 11,065 0.28 % 278.60 1,131 0.29 % 285.90 106,508 0.23 % 196.25 32,173 0.18 % 171.66 21,181 0.08 % 66.62 7,547 0.19 % 175.78 V p vendarle vendarle vendarle Vp 165,832 0.17 % 146.15 0 0 % 0 4,087 0.10 % 102.91 544 0.14 % 137.52 91,249 0.20 % 168.13 23,643 0.13 % 126.15 42,553 0.16 % 133.84 3,756 0.09 % 87.48 V p preden preden preden Vd 148,498 0.15 % 130.87 8 0.72 % 824.15 20,759 0.53 % 522.69 629 0.16 % 159 53,160 0.12 % 97.95 32,176 0.18 % 171.68 32,924 0.12 % 103.55 8,842 0.22 % 205.94 V d dokler dokler dokler Vd 147,332 0.15 % 129.84 19 1.71 % 1,957.35 16,427 0.42 % 413.61 1,089 0.28 % 275.29 59,151 0.13 % 108.99 30,010 0.17 % 160.12 31,270 0.12 % 98.35 9,366 0.23 % 218.15 V d kadar kadar kadar Vd 133,016 0.14 % 117.23 1 0.09 % 103.02 13,669 0.35 % 344.17 1,820 0.46 % 460.08 
44,635 0.10 % 82.24 34,877 0.20 % 186.09 21,626 0.08 % 68.02 16,388 0.41 % 381.70 V d
kolikor kolikor kolikor Vd 105,767 0.11 % 93.21 1 0.09 % 103.02 5,383 0.14 % 135.54 830 0.21 % 209.82 50,897 0.11 % 93.78 18,240 0.10 % 97.32 26,006 0.10 % 81.80 4,410 0.11 % 102.71 V d

1.3.24 List of particle lemmas in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-words-particles-lemmas-taxonomy-entire.tsv
Columns: Lemma; Lemma (lower-case); Total absolute frequency of lemma; Percentage of all found lemmas; Total relative frequency (per million occurrences); then Absolute frequency, Percentage, and Relative frequency for each text type in the order [SSJ.T.K.N], [SSJ.T.K.L], [SSJ.T.D], [SSJ.T.P.C], [SSJ.T.P.R], [SSJ.I], [SSJ.T.K.S].

tudi tudi 8,478,793 21.38 % 7,472.32 21 18.75 % 2,163.39 136,549 9.19 % 3,438.16 23,030 20.53 % 5,821.74 4,214,730 21.79 % 7,765.92 1,465,007 20.28 % 7,816.80 2,376,445 23.18 % 7,474.54 263,011 21.31 % 6,125.84
ne ne 6,734,994 16.98 % 5,935.52 44 39.29 % 4,532.81 374,116 25.18 % 9,419.84 30,064 26.80 % 7,599.85 3,104,942 16.05 % 5,721.06 1,318,655 18.25 % 7,035.91 1,648,098 16.08 % 5,183.70 259,075 20.99 % 6,034.16
še še 5,787,718 14.60 % 5,100.69 24 21.43 % 2,472.44 216,168 14.55 % 5,442.88 11,596 10.34 % 2,931.34 2,847,027 14.72 % 5,245.84 962,021 13.32 % 5,133.03 1,615,445 15.76 % 5,081 135,437 10.97 % 3,154.49
že že 3,735,419 9.42 % 3,292.01 6 5.36 % 618.11 131,252 8.84 % 3,304.78 7,450 6.64 % 1,883.28 1,898,346 9.81 % 3,497.83 608,889 8.43 % 3,248.83
1,012,284 9.87 % 3,183.90 77,192 6.25 % 1,797.89 le le 2,165,776 5.46 % 1,908.69 0 0 % 0 58,645 3.95 % 1,476.62 4,405 3.93 % 1,113.54 1,056,697 5.46 % 1,947.03 439,209 6.08 % 2,343.47 527,671 5.15 % 1,659.66 79,149 6.41 % 1,843.47 naj naj 1,598,002 4.03 % 1,408.31 2 1.79 % 206.04 35,826 2.41 % 902.06 3,314 2.95 % 837.74 847,958 4.38 % 1,562.42 235,452 3.26 % 1,256.29 428,416 4.18 % 1,347.48 47,034 3.81 % 1,095.48 prav prav 1,208,290 3.05 % 1,064.86 1 0.89 % 103.02 52,405 3.53 % 1,319.50 2,593 2.31 % 655.48 578,438 2.99 % 1,065.81 228,510 3.16 % 1,219.25 304,480 2.97 % 957.67 41,863 3.39 % 975.04 sicer sicer 1,078,123 2.72 % 950.14 0 0 % 0 15,993 1.08 % 402.69 1,842 1.64 % 465.64 490,602 2.54 % 903.97 157,744 2.18 % 841.67 393,103 3.83 % 1,236.41 18,839 1.53 % 438.78 samo samo 940,204 2.37 % 828.60 6 5.36 % 618.11 76,915 5.18 % 1,936.64 4,906 4.37 % 1,240.18 434,798 2.25 % 801.14 194,492 2.69 % 1,037.75 184,259 1.80 % 579.54 44,828 3.63 % 1,044.10 predvsem predvsem 844,382 2.13 % 744.15 0 0 % 0 5,733 0.39 % 144.35 1,744 1.55 % 440.86 442,044 2.29 % 814.50 160,555 2.22 % 856.67 206,875 2.02 % 650.68 27,431 2.22 % 638.90 seveda seveda 725,852 1.83 % 639.69 0 0 % 0 24,491 1.65 % 616.66 2,801 2.50 % 708.06 363,766 1.88 % 670.26 184,814 2.56 % 986.11 129,171 1.26 % 406.28 20,809 1.69 % 484.67 celo celo 646,664 1.63 % 569.90 0 0 % 0 27,288 1.84 % 687.08 1,415 1.26 % 357.70 317,242 1.64 % 584.54 135,530 1.88 % 723.14 138,052 1.35 % 434.21 27,137 2.20 % 632.05 skoraj skoraj 528,454 1.33 % 465.72 3 2.68 % 309.06 26,418 1.78 % 665.18 828 0.74 % 209.31 253,737 1.31 % 467.53 93,065 1.29 % 496.56 138,601 1.35 % 435.94 15,802 1.28 % 368.05 več več 520,542 1.31 % 458.75 0 0 % 0 37,136 2.50 % 935.04 1,689 1.51 % 426.96 242,295 1.25 % 446.44 94,562 1.31 % 504.55 126,550 1.23 % 398.03 18,310 1.48 % 426.46 vsaj vsaj 507,625 1.28 % 447.37 0 0 % 0 20,218 1.36 % 509.07 1,432 1.28 % 361.99 254,444 1.31 % 468.83 102,076 1.41 % 544.64 113,702 1.11 % 357.62 15,753 1.28 % 366.91 morda morda 
468,348 1.18 % 412.75 0 0 % 0 25,811 1.74 % 649.89 1,177 1.05 % 297.53 217,144 1.12 % 400.10 99,850 1.38 % 532.77 103,154 1.01 % 324.45 21,212 1.72 % 494.05 niti niti 458,790 1.16 % 404.33 1 0.89 % 103.02 31,303 2.11 % 788.18 1,088 0.97 % 275.03 220,261 1.14 % 405.85 80,301 1.11 % 428.46 113,891 1.11 % 358.22 11,945 0.97 % 278.21 sploh sploh 419,675 1.06 % 369.86 0 0 % 0 30,505 2.05 % 768.08 1,240 1.10 % 313.46 192,378 0.99 % 354.47 87,831 1.22 % 468.64 94,703 0.92 % 297.87 13,018 1.05 % 303.20 šele šele 388,144 0.98 % 342.07 2 1.79 % 206.04 13,326 0.90 % 335.53 803 0.72 % 202.99 199,240 1.03 % 367.11 69,131 0.96 % 368.86 91,926 0.90 % 289.13 13,716 1.11 % 319.46 pač pač 294,972 0.74 % 259.96 0 0 % 0 13,015 0.88 % 327.70 2,120 1.89 % 535.91 148,590 0.77 % 273.79 66,321 0.92 % 353.87 56,455 0.55 % 177.57 8,471 0.69 % 197.30 zgolj zgolj 254,436 0.64 % 224.23 0 0 % 0 5,389 0.36 % 135.69 546 0.49 % 138.02 111,593 0.58 % 205.62 38,423 0.53 % 205.01 89,108 0.87 % 280.27 9,377 0.76 % 218.40 ravno ravno 247,650 0.62 % 218.25 1 0.89 % 103.02 15,200 1.02 % 382.72 782 0.70 % 197.68 105,439 0.55 % 194.28 58,189 0.81 % 310.48 60,812 0.59 % 191.27 7,227 0.59 % 168.33 zlasti zlasti 242,203 0.61 % 213.45 0 0 % 0 3,265 0.22 % 82.21 908 0.81 % 229.53 138,036 0.71 % 254.34 36,842 0.51 % 196.58 46,810 0.46 % 147.23 16,342 1.32 % 380.62 pravzaprav pravzaprav 198,500 0.50 % 174.94 0 0 % 0 11,349 0.76 % 285.76 774 0.69 % 195.66 93,357 0.48 % 172.02 50,059 0.69 % 267.10 35,619 0.35 % 112.03 7,342 0.59 % 171 vsekakor vsekakor 159,819 0.40 % 140.85 1 0.89 % 103.02 3,884 0.26 % 97.79 385 0.34 % 97.32 83,779 0.43 % 154.37 38,563 0.53 % 205.76 28,874 0.28 % 90.82 4,333 0.35 % 100.92 no no 142,860 0.36 % 125.90 0 0 % 0 19,680 1.32 % 495.52 501 0.45 % 126.65 52,215 0.27 % 96.21 44,047 0.61 % 235.02 23,288 0.23 % 73.25 3,129 0.25 % 72.88 menda menda 132,347 0.33 % 116.64 0 0 % 0 4,569 0.31 % 115.04 130 0.12 % 32.86 76,040 0.39 % 140.11 26,259 0.36 % 140.11 23,996 0.23 % 75.47 1,353 0.11 % 31.51 
najbrž najbrž 123,679 0.31 % 109 0 0 % 0 13,632 0.92 % 343.24 369 0.33 % 93.28 60,518 0.31 % 111.51 24,671 0.34 % 131.64 20,273 0.20 % 63.76 4,216 0.34 % 98.20
koli koli 107,068 0.27 % 94.36 0 0 % 0 6,574 0.44 % 165.53 962 0.86 % 243.18 49,660 0.26 % 91.50 14,368 0.20 % 76.66 30,258 0.29 % 95.17 5,246 0.42 % 122.19
ja ja 87,056 0.22 % 76.72 0 0 % 0 19,465 1.31 % 490.11 332 0.30 % 83.93 27,323 0.14 % 50.34 26,187 0.36 % 139.73 11,852 0.12 % 37.28 1,897 0.15 % 44.18
češ češ 87,039 0.22 % 76.71 0 0 % 0 2,631 0.18 % 66.25 343 0.31 % 86.71 51,345 0.27 % 94.61 11,944 0.17 % 63.73 19,053 0.19 % 59.93 1,723 0.14 % 40.13
skorajda skorajda 58,291 0.15 % 51.37 0 0 % 0 2,478 0.17 % 62.39 70 0.06 % 17.70 29,411 0.15 % 54.19 12,753 0.18 % 68.05 11,889 0.12 % 37.39 1,690 0.14 % 39.36

1.3.25 List of particle lower-case word forms in the Gigafida 2.0 corpus with text-type distribution
File at CLARIN.SI: GF2.0-words-particles-lowercase_forms-taxonomy-entire.tsv
Columns: Lower-case word form; Lemma; Lemma (lower-case); Morphosyntactic tag; Total absolute frequency of lower-case word form; Percentage of all found lower-case word forms; Total relative frequency (per million occurrences); then Absolute frequency, Percentage, and Relative frequency for each text type in the order [SSJ.T.K.N], [SSJ.T.K.L], [SSJ.T.D], [SSJ.T.P.C], [SSJ.T.P.R], [SSJ.I], [SSJ.T.K.S]; msd01.

tudi tudi tudi L 8,478,789 21.38 % 7,472.31 21 18.75 % 2,163.39 136,547 9.19 % 3,438.11 23,030 20.53 % 5,821.74 4,214,730 21.79 % 7,765.92 1,465,006 20.28 %
7,816.79 2,376,444 23.18 % 7,474.54 263,011 21.31 % 6,125.84 L ne ne ne L 6,734,994 16.98 % 5,935.52 44 39.29 % 4,532.81 374,116 25.18 % 9,419.84 30,064 26.80 % 7,599.85 3,104,942 16.05 % 5,721.06 1,318,655 18.25 % 7,035.91 1,648,098 16.08 % 5,183.70 259,075 20.99 % 6,034.16 L še še še L 5,787,718 14.60 % 5,100.69 24 21.43 % 2,472.44 216,168 14.55 % 5,442.88 11,596 10.34 % 2,931.34 2,847,027 14.72 % 5,245.84 962,021 13.32 % 5,133.03 1,615,445 15.76 % 5,081 135,437 10.97 % 3,154.49 L že že že L 3,735,419 9.42 % 3,292.01 6 5.36 % 618.11 131,252 8.84 % 3,304.78 7,450 6.64 % 1,883.28 1,898,346 9.81 % 3,497.83 608,889 8.43 % 3,248.83 1,012,284 9.87 % 3,183.90 77,192 6.25 % 1,797.89 L le le le L 2,165,776 5.46 % 1,908.69 0 0 % 0 58,645 3.95 % 1,476.62 4,405 3.93 % 1,113.54 1,056,697 5.46 % 1,947.03 439,209 6.08 % 2,343.47 527,671 5.15 % 1,659.66 79,149 6.41 % 1,843.47 L naj naj naj L 1,598,002 4.03 % 1,408.31 2 1.79 % 206.04 35,826 2.41 % 902.06 3,314 2.95 % 837.74 847,958 4.38 % 1,562.42 235,452 3.26 % 1,256.29 428,416 4.18 % 1,347.48 47,034 3.81 % 1,095.48 L prav prav prav L 1,208,290 3.05 % 1,064.86 1 0.89 % 103.02 52,405 3.53 % 1,319.50 2,593 2.31 % 655.48 578,438 2.99 % 1,065.81 228,510 3.16 % 1,219.25 304,480 2.97 % 957.67 41,863 3.39 % 975.04 L sicer sicer sicer L 1,078,123 2.72 % 950.14 0 0 % 0 15,993 1.08 % 402.69 1,842 1.64 % 465.64 490,602 2.54 % 903.97 157,744 2.18 % 841.67 393,103 3.83 % 1,236.41 18,839 1.53 % 438.78 L samo samo samo L 940,204 2.37 % 828.60 6 5.36 % 618.11 76,915 5.18 % 1,936.64 4,906 4.37 % 1,240.18 434,798 2.25 % 801.14 194,492 2.69 % 1,037.75 184,259 1.80 % 579.54 44,828 3.63 % 1,044.10 L predvsem predvsem predvsem L 844,382 2.13 % 744.15 0 0 % 0 5,733 0.39 % 144.35 1,744 1.55 % 440.86 442,044 2.29 % 814.50 160,555 2.22 % 856.67 206,875 2.02 % 650.68 27,431 2.22 % 638.90 L seveda seveda seveda L 725,852 1.83 % 639.69 0 0 % 0 24,491 1.65 % 616.66 2,801 2.50 % 708.06 363,766 1.88 % 670.26 184,814 2.56 % 986.11 129,171 1.26 % 406.28 20,809 
1.69 % 484.67 L celo celo celo L 646,664 1.63 % 569.90 0 0 % 0 27,288 1.84 % 687.08 1,415 1.26 % 357.70 317,242 1.64 % 584.54 135,530 1.88 % 723.14 138,052 1.35 % 434.21 27,137 2.20 % 632.05 L skoraj skoraj skoraj L 528,454 1.33 % 465.72 3 2.68 % 309.06 26,418 1.78 % 665.18 828 0.74 % 209.31 253,737 1.31 % 467.53 93,065 1.29 % 496.56 138,601 1.35 % 435.94 15,802 1.28 % 368.05 L več več več L 520,542 1.31 % 458.75 0 0 % 0 37,136 2.50 % 935.04 1,689 1.51 % 426.96 242,295 1.25 % 446.44 94,562 1.31 % 504.55 126,550 1.23 % 398.03 18,310 1.48 % 426.46 L vsaj vsaj vsaj L 507,625 1.28 % 447.37 0 0 % 0 20,218 1.36 % 509.07 1,432 1.28 % 361.99 254,444 1.31 % 468.83 102,076 1.41 % 544.64 113,702 1.11 % 357.62 15,753 1.28 % 366.91 L morda morda morda L 468,348 1.18 % 412.75 0 0 % 0 25,811 1.74 % 649.89 1,177 1.05 % 297.53 217,144 1.12 % 400.10 99,850 1.38 % 532.77 103,154 1.01 % 324.45 21,212 1.72 % 494.05 L niti niti niti L 458,790 1.16 % 404.33 1 0.89 % 103.02 31,303 2.11 % 788.18 1,088 0.97 % 275.03 220,261 1.14 % 405.85 80,301 1.11 % 428.46 113,891 1.11 % 358.22 11,945 0.97 % 278.21 L sploh sploh sploh L 419,675 1.06 % 369.86 0 0 % 0 30,505 2.05 % 768.08 1,240 1.10 % 313.46 192,378 0.99 % 354.47 87,831 1.22 % 468.64 94,703 0.92 % 297.87 13,018 1.05 % 303.20 L šele šele šele L 388,144 0.98 % 342.07 2 1.79 % 206.04 13,326 0.90 % 335.53 803 0.72 % 202.99 199,240 1.03 % 367.11 69,131 0.96 % 368.86 91,926 0.90 % 289.13 13,716 1.11 % 319.46 L pač pač pač L 294,972 0.74 % 259.96 0 0 % 0 13,015 0.88 % 327.70 2,120 1.89 % 535.91 148,590 0.77 % 273.79 66,321 0.92 % 353.87 56,455 0.55 % 177.57 8,471 0.69 % 197.30 L zgolj zgolj zgolj L 254,436 0.64 % 224.23 0 0 % 0 5,389 0.36 % 135.69 546 0.49 % 138.02 111,593 0.58 % 205.62 38,423 0.53 % 205.01 89,108 0.87 % 280.27 9,377 0.76 % 218.40 L ravno ravno ravno L 247,650 0.62 % 218.25 1 0.89 % 103.02 15,200 1.02 % 382.72 782 0.70 % 197.68 105,439 0.55 % 194.28 58,189 0.81 % 310.48 60,812 0.59 % 191.27 7,227 0.59 % 168.33 L zlasti zlasti 
zlasti L 242,203 0.61 % 213.45 0 0 % 0 3,265 0.22 % 82.21 908 0.81 % 229.53 138,036 0.71 % 254.34 36,842 0.51 % 196.58 46,810 0.46 % 147.23 16,342 1.32 % 380.62 L pravzaprav pravzaprav pravzaprav L 198,500 0.50 % 174.94 0 0 % 0 11,349 0.76 % 285.76 774 0.69 % 195.66 93,357 0.48 % 172.02 50,059 0.69 % 267.10 35,619 0.35 % 112.03 7,342 0.59 % 171 L vsekakor vsekakor vsekakor L 159,819 0.40 % 140.85 1 0.89 % 103.02 3,884 0.26 % 97.79 385 0.34 % 97.32 83,779 0.43 % 154.37 38,563 0.53 % 205.76 28,874 0.28 % 90.82 4,333 0.35 % 100.92 L no no no L 142,860 0.36 % 125.90 0 0 % 0 19,680 1.32 % 495.52 501 0.45 % 126.65 52,215 0.27 % 96.21 44,047 0.61 % 235.02 23,288 0.23 % 73.25 3,129 0.25 % 72.88 L menda menda menda L 132,347 0.33 % 116.64 0 0 % 0 4,569 0.31 % 115.04 130 0.12 % 32.86 76,040 0.39 % 140.11 26,259 0.36 % 140.11 23,996 0.23 % 75.47 1,353 0.11 % 31.51 L najbrž najbrž najbrž L 123,679 0.31 % 109 0 0 % 0 13,632 0.92 % 343.24 369 0.33 % 93.28 60,518 0.31 % 111.51 24,671 0.34 % 131.64 20,273 0.20 % 63.76 4,216 0.34 % 98.20 L koli koli koli L 107,068 0.27 % 94.36 0 0 % 0 6,574 0.44 % 165.53 962 0.86 % 243.18 49,660 0.26 % 91.50 14,368 0.20 % 76.66 30,258 0.29 % 95.17 5,246 0.42 % 122.19 L ja ja ja L 87,056 0.22 % 76.72 0 0 % 0 19,465 1.31 % 490.11 332 0.30 % 83.93 27,323 0.14 % 50.34 26,187 0.36 % 139.73 11,852 0.12 % 37.28 1,897 0.15 % 44.18 L češ češ češ L 87,039 0.22 % 76.71 0 0 % 0 2,631 0.18 % 66.25 343 0.31 % 86.71 51,345 0.27 % 94.61 11,944 0.17 % 63.73 19,053 0.19 % 59.93 1,723 0.14 % 40.13 L skorajda skorajda skorajda L 58,291 0.15 % 51.37 0 0 % 0 2,478 0.17 % 62.39 70 0.06 % 17.70 29,411 0.15 % 54.19 12,753 0.18 % 68.05 11,889 0.12 % 37.39 1,690 0.14 % 39.36 L CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 303 File at CLARIN.SI 1.3.26 List of interjection lemmas in the Gigafida 2.0 corpus with text-type distributionGF2.0-words-interjections-lemmas- taxonomy-entire.tsvLemma Lemma (lower-case) Total absolute frequency of lemma 
Percentage of all found lemmas Total relative frequency (per million occurrences) Absolute frequency [SSJ.T.K.N] Percentage [SSJ.T.K.N] Relative frequency [SSJ.T.K.N] Absolute frequency [SSJ.T.K.L] Percentage [SSJ.T.K.L] Relative frequency [SSJ.T.K.L] Absolute frequency [SSJ.T.D] Percentage [SSJ.T.D] Relative frequency [SSJ.T.D] Absolute frequency [SSJ.T.P.C] Percentage [SSJ.T.P.C] Relative frequency [SSJ.T.P.C] Absolute frequency [SSJ.T.P.R] Percentage [SSJ.T.P.R] Relative frequency [SSJ.T.P.R] Absolute frequency [SSJ.I] Percentage [SSJ.I] Relative frequency [SSJ.I] Absolute frequency [SSJ.T.K.S] Percentage [SSJ.T.K.S] Relative frequency [SSJ.T.K.S] oh oh 14,865 10.91 % 13.10 0 0 % 0 5,344 19.95 % 134.56 25 7.14 % 6.32 3,245 7.48 % 5.98 3,990 9.80 % 21.29 1,501 7.18 % 4.72 760 18.32 % 17.70 ah ah 13,301 9.76 % 11.72 0 0 % 0 2,696 10.06 % 67.88 30 8.57 % 7.58 4,963 11.44 % 9.14 3,766 9.25 % 20.09 1,401 6.71 % 4.41 445 10.73 % 10.36 ha ha 11,172 8.20 % 9.85 0 0 % 0 1,638 6.11 % 41.24 33 9.43 % 8.34 4,524 10.43 % 8.34 3,900 9.58 % 20.81 757 3.62 % 2.38 320 7.71 % 7.45 hm hm 8,313 6.10 % 7.33 0 0 % 0 1,142 4.26 % 28.75 12 3.43 % 3.03 3,096 7.13 % 5.70 3,089 7.59 % 16.48 805 3.85 % 2.53 169 4.07 % 3.94 hej hej 7,497 5.50 % 6.61 0 0 % 0 1,515 5.66 % 38.15 4 1.14 % 1.01 1,767 4.07 % 3.26 3,368 8.27 % 17.97 633 3.03 % 1.99 210 5.06 % 4.89 joj joj 6,441 4.73 % 5.68 0 0 % 0 1,074 4.01 % 27.04 33 9.43 % 8.34 2,157 4.97 % 3.97 2,162 5.31 % 11.54 877 4.20 % 2.76 138 3.33 % 3.21 aha aha 6,335 4.65 % 5.58 0 0 % 0 1,184 4.42 % 29.81 25 7.14 % 6.32 1,301 3.00 % 2.40 1,538 3.78 % 8.21 2,127 10.18 % 6.69 160 3.86 % 3.73 fak fak 4,669 3.43 % 4.11 0 0 % 0 160 0.60 % 4.03 0 0 % 0 190 0.44 % 0.35 98 0.24 % 0.52 4,219 20.20 % 13.27 2 0.05 % 0.05 ej ej 3,721 2.73 % 3.28 0 0 % 0 1,232 4.60 % 31.02 7 2.00 % 1.77 1,049 2.42 % 1.93 830 2.04 % 4.43 362 1.73 % 1.14 241 5.81 % 5.61 hi hi 3,425 2.51 % 3.02 0 0 % 0 169 0.63 % 4.26 22 6.29 % 5.56 1,246 2.87 % 2.30 1,521 3.74 % 8.12 356 1.70 % 1.12 
111 2.68 % 2.59 bravo bravo 3,087 2.27 % 2.72 0 0 % 0 254 0.95 % 6.40 4 1.14 % 1.01 1,220 2.81 % 2.25 848 2.08 % 4.52 724 3.47 % 2.28 37 0.89 % 0.86 uf uf 2,808 2.06 % 2.47 0 0 % 0 254 0.95 % 6.40 0 0 % 0 834 1.92 % 1.54 1,109 2.72 % 5.92 586 2.81 % 1.84 25 0.60 % 0.58 zbogom zbogom 2,474 1.81 % 2.18 0 0 % 0 360 1.34 % 9.06 3 0.86 % 0.76 1,062 2.45 % 1.96 532 1.31 % 2.84 427 2.04 % 1.34 90 2.17 % 2.10 uh uh 2,435 1.79 % 2.15 0 0 % 0 336 1.25 % 8.46 2 0.57 % 0.51 830 1.91 % 1.53 983 2.41 % 5.24 236 1.13 % 0.74 48 1.16 % 1.12 adijo adijo 2,414 1.77 % 2.13 0 0 % 0 416 1.55 % 10.47 1 0.29 % 0.25 961 2.21 % 1.77 608 1.49 % 3.24 368 1.76 % 1.16 60 1.45 % 1.40 eh eh 2,409 1.77 % 2.12 0 0 % 0 399 1.49 % 10.05 1 0.29 % 0.25 655 1.51 % 1.21 1,114 2.74 % 5.94 193 0.92 % 0.61 47 1.13 % 1.09 ho ho 2,160 1.58 % 1.90 0 0 % 0 220 0.82 % 5.54 18 5.14 % 4.55 784 1.81 % 1.44 504 1.24 % 2.69 501 2.40 % 1.58 133 3.21 % 3.10 bla bla 1,985 1.46 % 1.75 0 0 % 0 325 1.21 % 8.18 11 3.14 % 2.78 824 1.90 % 1.52 512 1.26 % 2.73 259 1.24 % 0.81 54 1.30 % 1.26 la la 1,888 1.39 % 1.66 0 0 % 0 90 0.34 % 2.27 3 0.86 % 0.76 919 2.12 % 1.69 450 1.10 % 2.40 354 1.70 % 1.11 72 1.74 % 1.68 haha haha 1,798 1.32 % 1.58 0 0 % 0 86 0.32 % 2.17 0 0 % 0 448 1.03 % 0.83 915 2.25 % 4.88 344 1.65 % 1.08 5 0.12 % 0.12 oj oj 1,484 1.09 % 1.31 0 0 % 0 218 0.81 % 5.49 9 2.57 % 2.28 651 1.50 % 1.20 318 0.78 % 1.70 139 0.67 % 0.44 149 3.59 % 3.47 huh huh 1,382 1.01 % 1.22 0 0 % 0 23 0.09 % 0.58 0 0 % 0 97 0.22 % 0.18 1,180 2.90 % 6.30 72 0.34 % 0.23 10 0.24 % 0.23 alo alo 1,291 0.95 % 1.14 0 0 % 0 100 0.37 % 2.52 0 0 % 0 599 1.38 % 1.10 221 0.54 % 1.18 346 1.66 % 1.09 25 0.60 % 0.58 hvalabogu hvalabogu 1,286 0.94 % 1.13 0 0 % 0 146 0.55 % 3.68 4 1.14 % 1.01 671 1.55 % 1.24 300 0.74 % 1.60 148 0.71 % 0.47 17 0.41 % 0.40 jebiga jebiga 1,175 0.86 % 1.04 0 0 % 0 511 1.91 % 12.87 3 0.86 % 0.76 321 0.74 % 0.59 272 0.67 % 1.45 63 0.30 % 0.20 5 0.12 % 0.12 nasvidenje nasvidenje 1,162 0.85 % 1.02 0 0 % 0 130 0.48 % 3.27 35 
10.00 % 8.85 620 1.43 % 1.14 181 0.45 % 0.97 168 0.80 % 0.53 28 0.68 % 0.65 oho oho 1,160 0.85 % 1.02 0 0 % 0 135 0.50 % 3.40 1 0.29 % 0.25 421 0.97 % 0.78 217 0.53 % 1.16 359 1.72 % 1.13 27 0.65 % 0.63 hura hura 1,087 0.80 % 0.96 0 0 % 0 121 0.45 % 3.05 1 0.29 % 0.25 531 1.22 % 0.98 289 0.71 % 1.54 107 0.51 % 0.34 38 0.92 % 0.89 jah jah 1,031 0.76 % 0.91 0 0 % 0 22 0.08 % 0.55 1 0.29 % 0.25 650 1.50 % 1.20 235 0.58 % 1.25 115 0.55 % 0.36 8 0.19 % 0.19 ups ups 972 0.71 % 0.86 0 0 % 0 74 0.28 % 1.86 5 1.43 % 1.26 302 0.70 % 0.56 468 1.15 % 2.50 115 0.55 % 0.36 8 0.19 % 0.19 hopla hopla 946 0.69 % 0.83 0 0 % 0 42 0.16 % 1.06 3 0.86 % 0.76 338 0.78 % 0.62 535 1.31 % 2.85 23 0.11 % 0.07 5 0.12 % 0.12 av av 944 0.69 % 0.83 0 0 % 0 78 0.29 % 1.96 10 2.86 % 2.53 588 1.35 % 1.08 200 0.49 % 1.07 40 0.19 % 0.13 28 0.68 % 0.65 CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 304 File at CLARIN.SI 1.3.27 List of interjection lower-case word forms in the Gigafida 2.0 corpus with text-type distributionGF2.0-words-interjections-lowercase_ forms-taxonomy-entire.tsvLower-case word form Lemma Lemma (lower-case) Morphosyntactic tag Total absolute frequency of lower-case word form Percentage of all found lower-case word forms Total relative frequency (per million occurrences) Absolute frequency [SSJ.T.K.N] Percentage [SSJ.T.K.N] Relative frequency [SSJ.T.K.N] Absolute frequency [SSJ.T.K.L] Percentage [SSJ.T.K.L] Relative frequency [SSJ.T.K.L] Absolute frequency [SSJ.T.D] Percentage [SSJ.T.D] Relative frequency [SSJ.T.D] Absolute frequency [SSJ.T.P.C] Percentage [SSJ.T.P.C] Relative frequency [SSJ.T.P.C] Absolute frequency [SSJ.T.P.R] Percentage [SSJ.T.P.R] Relative frequency [SSJ.T.P.R] Absolute frequency [SSJ.I] Percentage [SSJ.I] Relative frequency [SSJ.I] Absolute frequency [SSJ.T.K.S] Percentage [SSJ.T.K.S] Relative frequency [SSJ.T.K.S] msd01 oh oh oh M 14,865 10.91 % 13.10 0 0 % 0 5,344 19.95 % 134.56 25 7.14 % 6.32 3,245 7.48 % 5.98 3,990 9.80 % 
21.29 1,501 7.18 % 4.72 760 18.32 % 17.70 M ah ah ah M 13,301 9.76 % 11.72 0 0 % 0 2,696 10.06 % 67.88 30 8.57 % 7.58 4,963 11.44 % 9.14 3,766 9.25 % 20.09 1,401 6.71 % 4.41 445 10.73 % 10.36 M ha ha ha M 11,172 8.20 % 9.85 0 0 % 0 1,638 6.11 % 41.24 33 9.43 % 8.34 4,524 10.43 % 8.34 3,900 9.58 % 20.81 757 3.62 % 2.38 320 7.71 % 7.45 M hm hm hm M 8,313 6.10 % 7.33 0 0 % 0 1,142 4.26 % 28.75 12 3.43 % 3.03 3,096 7.13 % 5.70 3,089 7.59 % 16.48 805 3.85 % 2.53 169 4.07 % 3.94 M hej hej hej M 7,497 5.50 % 6.61 0 0 % 0 1,515 5.66 % 38.15 4 1.14 % 1.01 1,767 4.07 % 3.26 3,368 8.27 % 17.97 633 3.03 % 1.99 210 5.06 % 4.89 M joj joj joj M 6,441 4.73 % 5.68 0 0 % 0 1,074 4.01 % 27.04 33 9.43 % 8.34 2,157 4.97 % 3.97 2,162 5.31 % 11.54 877 4.20 % 2.76 138 3.33 % 3.21 M aha aha aha M 6,335 4.65 % 5.58 0 0 % 0 1,184 4.42 % 29.81 25 7.14 % 6.32 1,301 3.00 % 2.40 1,538 3.78 % 8.21 2,127 10.18 % 6.69 160 3.86 % 3.73 M fak fak fak M 4,669 3.43 % 4.11 0 0 % 0 160 0.60 % 4.03 0 0 % 0 190 0.44 % 0.35 98 0.24 % 0.52 4,219 20.20 % 13.27 2 0.05 % 0.05 M ej ej ej M 3,721 2.73 % 3.28 0 0 % 0 1,232 4.60 % 31.02 7 2.00 % 1.77 1,049 2.42 % 1.93 830 2.04 % 4.43 362 1.73 % 1.14 241 5.81 % 5.61 M hi hi hi M 3,425 2.51 % 3.02 0 0 % 0 169 0.63 % 4.26 22 6.29 % 5.56 1,246 2.87 % 2.30 1,521 3.74 % 8.12 356 1.70 % 1.12 111 2.68 % 2.59 M bravo bravo bravo M 3,087 2.27 % 2.72 0 0 % 0 254 0.95 % 6.40 4 1.14 % 1.01 1,220 2.81 % 2.25 848 2.08 % 4.52 724 3.47 % 2.28 37 0.89 % 0.86 M uf uf uf M 2,808 2.06 % 2.47 0 0 % 0 254 0.95 % 6.40 0 0 % 0 834 1.92 % 1.54 1,109 2.72 % 5.92 586 2.81 % 1.84 25 0.60 % 0.58 M zbogom zbogom zbogom M 2,474 1.81 % 2.18 0 0 % 0 360 1.34 % 9.06 3 0.86 % 0.76 1,062 2.45 % 1.96 532 1.31 % 2.84 427 2.04 % 1.34 90 2.17 % 2.10 M uh uh uh M 2,435 1.79 % 2.15 0 0 % 0 336 1.25 % 8.46 2 0.57 % 0.51 830 1.91 % 1.53 983 2.41 % 5.24 236 1.13 % 0.74 48 1.16 % 1.12 M adijo adijo adijo M 2,414 1.77 % 2.13 0 0 % 0 416 1.55 % 10.47 1 0.29 % 0.25 961 2.21 % 1.77 608 1.49 % 3.24 368 1.76 % 1.16 60 
1.45 % 1.40 M eh eh eh M 2,409 1.77 % 2.12 0 0 % 0 399 1.49 % 10.05 1 0.29 % 0.25 655 1.51 % 1.21 1,114 2.74 % 5.94 193 0.92 % 0.61 47 1.13 % 1.09 M ho ho ho M 2,160 1.58 % 1.90 0 0 % 0 220 0.82 % 5.54 18 5.14 % 4.55 784 1.81 % 1.44 504 1.24 % 2.69 501 2.40 % 1.58 133 3.21 % 3.10 M bla bla bla M 1,985 1.46 % 1.75 0 0 % 0 325 1.21 % 8.18 11 3.14 % 2.78 824 1.90 % 1.52 512 1.26 % 2.73 259 1.24 % 0.81 54 1.30 % 1.26 M la la la M 1,888 1.39 % 1.66 0 0 % 0 90 0.34 % 2.27 3 0.86 % 0.76 919 2.12 % 1.69 450 1.10 % 2.40 354 1.70 % 1.11 72 1.74 % 1.68 M haha haha haha M 1,798 1.32 % 1.58 0 0 % 0 86 0.32 % 2.17 0 0 % 0 448 1.03 % 0.83 915 2.25 % 4.88 344 1.65 % 1.08 5 0.12 % 0.12 M oj oj oj M 1,484 1.09 % 1.31 0 0 % 0 218 0.81 % 5.49 9 2.57 % 2.28 651 1.50 % 1.20 318 0.78 % 1.70 139 0.67 % 0.44 149 3.59 % 3.47 M huh huh huh M 1,382 1.01 % 1.22 0 0 % 0 23 0.09 % 0.58 0 0 % 0 97 0.22 % 0.18 1,180 2.90 % 6.30 72 0.34 % 0.23 10 0.24 % 0.23 M alo alo alo M 1,291 0.95 % 1.14 0 0 % 0 100 0.37 % 2.52 0 0 % 0 599 1.38 % 1.10 221 0.54 % 1.18 346 1.66 % 1.09 25 0.60 % 0.58 M hvalabogu hvalabogu hvalabogu M 1,286 0.94 % 1.13 0 0 % 0 146 0.55 % 3.68 4 1.14 % 1.01 671 1.55 % 1.24 300 0.74 % 1.60 148 0.71 % 0.47 17 0.41 % 0.40 M jebiga jebiga jebiga M 1,175 0.86 % 1.04 0 0 % 0 511 1.91 % 12.87 3 0.86 % 0.76 321 0.74 % 0.59 272 0.67 % 1.45 63 0.30 % 0.20 5 0.12 % 0.12 M nasvidenje nasvidenje nasvidenje M 1,162 0.85 % 1.02 0 0 % 0 130 0.48 % 3.27 35 10.00 % 8.85 620 1.43 % 1.14 181 0.45 % 0.97 168 0.80 % 0.53 28 0.68 % 0.65 M oho oho oho M 1,160 0.85 % 1.02 0 0 % 0 135 0.50 % 3.40 1 0.29 % 0.25 421 0.97 % 0.78 217 0.53 % 1.16 359 1.72 % 1.13 27 0.65 % 0.63 M hura hura hura M 1,087 0.80 % 0.96 0 0 % 0 121 0.45 % 3.05 1 0.29 % 0.25 531 1.22 % 0.98 289 0.71 % 1.54 107 0.51 % 0.34 38 0.92 % 0.89 M jah jah jah M 1,031 0.76 % 0.91 0 0 % 0 22 0.08 % 0.55 1 0.29 % 0.25 650 1.50 % 1.20 235 0.58 % 1.25 115 0.55 % 0.36 8 0.19 % 0.19 M ups ups ups M 972 0.71 % 0.86 0 0 % 0 74 0.28 % 1.86 5 1.43 % 1.26 
302 0.70 % 0.56 468 1.15 % 2.50 115 0.55 % 0.36 8 0.19 % 0.19 M hopla hopla hopla M 946 0.69 % 0.83 0 0 % 0 42 0.16 % 1.06 3 0.86 % 0.76 338 0.78 % 0.62 535 1.31 % 2.85 23 0.11 % 0.07 5 0.12 % 0.12 M av av av M 944 0.69 % 0.83 0 0 % 0 78 0.29 % 1.96 10 2.86 % 2.53 588 1.35 % 1.08 200 0.49 % 1.07 40 0.19 % 0.13 28 0.68 % 0.65 M CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 305 File at CLARIN.SI 1.3.28 List of abbreviation lemmas in the Gigafida 2.0 corpus with text-type distributionGF2.0-words-abbreviations-lemmas- taxonomy-entire.tsvLemma Lemma (lower-case) Total absolute frequency of lemma Percentage of all found lemmas Total relative frequency (per million occurrences) Absolute frequency [SSJ.T.K.N] Percentage [SSJ.T.K.N] Relative frequency [SSJ.T.K.N] Absolute frequency [SSJ.T.K.L] Percentage [SSJ.T.K.L] Relative frequency [SSJ.T.K.L] Absolute frequency [SSJ.T.D] Percentage [SSJ.T.D] Relative frequency [SSJ.T.D] Absolute frequency [SSJ.T.P.C] Percentage [SSJ.T.P.C] Relative frequency [SSJ.T.P.C] Absolute frequency [SSJ.T.P.R] Percentage [SSJ.T.P.R] Relative frequency [SSJ.T.P.R] Absolute frequency [SSJ.I] Percentage [SSJ.I] Relative frequency [SSJ.I] Absolute frequency [SSJ.T.K.S] Percentage [SSJ.T.K.S] Relative frequency [SSJ.T.K.S] dr, dr, 322,077 10.23 % 283.84 0 0 % 0 2,425 9.78 % 61.06 476 5.17 % 120.33 222,930 11.94 % 410.76 51,452 11.39 % 274.53 39,853 7.31 % 125.35 4,941 1.97 % 115.08 oz, oz, 136,593 4.34 % 120.38 0 0 % 0 226 0.91 % 5.69 329 3.57 % 83.17 40,351 2.16 % 74.35 22,364 4.95 % 119.33 62,292 11.43 % 195.92 11,031 4.41 % 256.93 d, d, 133,611 4.24 % 117.75 0 0 % 0 213 0.86 % 5.36 567 6.16 % 143.33 84,570 4.53 % 155.83 20,886 4.62 % 111.44 26,223 4.81 % 82.48 1,152 0.46 % 26.83 o, o, 110,895 3.52 % 97.73 0 0 % 0 142 0.57 % 3.58 182 1.98 % 46.01 70,116 3.75 % 129.19 21,336 4.72 % 113.84 17,712 3.25 % 55.71 1,407 0.56 % 32.77 M, m, 110,881 3.52 % 97.72 0 0 % 0 522 2.10 % 13.14 61 0.66 % 15.42 85,646 4.59 % 157.81 
9,522 2.11 % 50.81 9,214 1.69 % 28.98 5,916 2.36 % 137.79 št, št, 94,645 3.01 % 83.41 0 0 % 0 941 3.79 % 23.69 1,913 20.77 % 483.59 28,295 1.51 % 52.14 11,553 2.56 % 61.64 44,772 8.21 % 140.82 7,171 2.87 % 167.02 npr, npr, 92,015 2.92 % 81.09 0 0 % 0 314 1.27 % 7.91 311 3.38 % 78.62 24,837 1.33 % 45.76 29,590 6.55 % 157.88 14,523 2.66 % 45.68 22,440 8.97 % 522.65 J, j, 84,562 2.69 % 74.52 0 0 % 0 1,071 4.32 % 26.97 69 0.75 % 17.44 59,838 3.21 % 110.26 7,985 1.77 % 42.61 8,714 1.60 % 27.41 6,885 2.75 % 160.36 A, a, 81,577 2.59 % 71.89 0 0 % 0 1,198 4.83 % 30.16 191 2.07 % 48.28 49,090 2.63 % 90.45 8,889 1.97 % 47.43 15,696 2.88 % 49.37 6,513 2.60 % 151.70 S, s, 80,813 2.57 % 71.22 0 0 % 0 488 1.97 % 12.29 57 0.62 % 14.41 62,235 3.33 % 114.67 7,148 1.58 % 38.14 7,163 1.31 % 22.53 3,722 1.49 % 86.69 t, t, 79,373 2.52 % 69.95 0 0 % 0 168 0.68 % 4.23 99 1.07 % 25.03 34,619 1.85 % 63.79 14,460 3.20 % 77.15 25,379 4.66 % 79.82 4,648 1.86 % 108.26 B, b, 75,217 2.39 % 66.29 0 0 % 0 1,095 4.42 % 27.57 242 2.63 % 61.17 54,738 2.93 % 100.86 6,177 1.37 % 32.96 9,399 1.72 % 29.56 3,566 1.43 % 83.06 i, i, 68,165 2.17 % 60.07 0 0 % 0 158 0.64 % 3.98 90 0.98 % 22.75 27,139 1.45 % 50.01 12,936 2.86 % 69.02 24,317 4.46 % 76.48 3,525 1.41 % 82.10 itd, itd, 65,475 2.08 % 57.70 2 5.71 % 206.04 578 2.33 % 14.55 195 2.12 % 49.29 36,287 1.94 % 66.86 14,609 3.23 % 77.95 8,287 1.52 % 26.06 5,517 2.21 % 128.50 sv, sv, 65,377 2.08 % 57.62 0 0 % 0 688 2.77 % 17.32 57 0.62 % 14.41 44,951 2.41 % 82.83 6,229 1.38 % 33.24 7,698 1.41 % 24.21 5,754 2.30 % 134.02 D, d, 63,685 2.02 % 56.13 0 0 % 0 415 1.67 % 10.45 92 1.00 % 23.26 47,370 2.54 % 87.28 5,629 1.25 % 30.03 6,708 1.23 % 21.10 3,471 1.39 % 80.84 P, p, 61,583 1.96 % 54.27 0 0 % 0 371 1.50 % 9.34 68 0.74 % 17.19 47,303 2.53 % 87.16 6,019 1.33 % 32.12 4,974 0.91 % 15.64 2,848 1.14 % 66.33 tel, tel, 59,340 1.89 % 52.30 0 0 % 0 5 0.02 % 0.13 12 0.13 % 3.03 47,321 2.53 % 87.19 7,401 1.64 % 39.49 3,114 0.57 % 9.79 1,487 0.59 % 34.63 K, k, 49,179 
1.56 % 43.34 0 0 % 0 1,046 4.22 % 26.34 48 0.52 % 12.13 37,122 1.99 % 68.40 4,816 1.07 % 25.70 4,367 0.80 % 13.74 1,780 0.71 % 41.46 prof, prof, 48,612 1.54 % 42.84 0 0 % 0 47 0.19 % 1.18 26 0.28 % 6.57 34,903 1.87 % 64.31 5,305 1.17 % 28.31 7,590 1.39 % 23.87 741 0.30 % 17.26 str, str, 47,194 1.50 % 41.59 0 0 % 0 1,110 4.48 % 27.95 33 0.36 % 8.34 3,788 0.20 % 6.98 10,527 2.33 % 56.17 3,546 0.65 % 11.15 28,190 11.27 % 656.58 I, i, 44,515 1.41 % 39.23 0 0 % 0 441 1.78 % 11.10 137 1.49 % 34.63 30,623 1.64 % 56.42 4,534 1.00 % 24.19 5,596 1.03 % 17.60 3,184 1.27 % 74.16 V, v, 43,853 1.39 % 38.65 0 0 % 0 342 1.38 % 8.61 84 0.91 % 21.23 30,139 1.61 % 55.53 5,216 1.16 % 27.83 6,055 1.11 % 19.04 2,017 0.81 % 46.98 R, r, 43,194 1.37 % 38.07 0 0 % 0 533 2.15 % 13.42 40 0.43 % 10.11 29,209 1.56 % 53.82 5,154 1.14 % 27.50 5,123 0.94 % 16.11 3,135 1.25 % 73.02 G, g, 42,062 1.34 % 37.07 0 0 % 0 555 2.24 % 13.97 44 0.48 % 11.12 30,793 1.65 % 56.74 4,157 0.92 % 22.18 3,782 0.69 % 11.90 2,731 1.09 % 63.61 mag, mag, 40,143 1.27 % 35.38 0 0 % 0 17 0.07 % 0.43 65 0.71 % 16.43 29,459 1.58 % 54.28 5,620 1.24 % 29.99 4,511 0.83 % 14.19 471 0.19 % 10.97 C, c, 37,814 1.20 % 33.33 33 94.29 % 3,399.61 339 1.37 % 8.54 132 1.43 % 33.37 18,868 1.01 % 34.77 7,426 1.64 % 39.62 6,246 1.15 % 19.65 4,770 1.91 % 111.10 L, l, 37,102 1.18 % 32.70 0 0 % 0 793 3.20 % 19.97 65 0.71 % 16.43 25,324 1.36 % 46.66 3,661 0.81 % 19.53 3,732 0.69 % 11.74 3,527 1.41 % 82.15 st, st, 36,399 1.16 % 32.08 0 0 % 0 947 3.82 % 23.84 30 0.33 % 7.58 19,248 1.03 % 35.47 3,431 0.76 % 18.31 11,515 2.11 % 36.22 1,228 0.49 % 28.60 p, p, 35,939 1.14 % 31.67 0 0 % 0 96 0.39 % 2.42 68 0.74 % 17.19 20,936 1.12 % 38.58 6,478 1.43 % 34.56 7,658 1.41 % 24.09 703 0.28 % 16.37 op, op, 35,928 1.14 % 31.66 0 0 % 0 454 1.83 % 11.43 8 0.09 % 2.02 15,464 0.83 % 28.49 4,702 1.04 % 25.09 11,819 2.17 % 37.17 3,481 1.39 % 81.08 T, t, 34,839 1.11 % 30.70 0 0 % 0 350 1.41 % 8.81 35 0.38 % 8.85 23,035 1.23 % 42.44 4,478 0.99 % 23.89 4,954 0.91 % 
15.58 1,987 0.79 % 46.28 CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 306 File at CLARIN.SI 1.3.29 List of abbreviation lower-case word forms in the Gigafida 2.0 corpus with text-type distributionGF2.0-words-abbreviations-lowercase_ forms-taxonomy-entire.tsvLower-case word form Lemma Lemma (lower-case) Morphosyntactic tag Total absolute frequency of lower-case word form Percentage of all found lower-case word forms Total relative frequency (per million occurrences) Absolute frequency [SSJ.T.K.N] Percentage [SSJ.T.K.N] Relative frequency [SSJ.T.K.N] Absolute frequency [SSJ.T.K.L] Percentage [SSJ.T.K.L] Relative frequency [SSJ.T.K.L] Absolute frequency [SSJ.T.D] Percentage [SSJ.T.D] Relative frequency [SSJ.T.D] Absolute frequency [SSJ.T.P.C] Percentage [SSJ.T.P.C] Relative frequency [SSJ.T.P.C] Absolute frequency [SSJ.T.P.R] Percentage [SSJ.T.P.R] Relative frequency [SSJ.T.P.R] Absolute frequency [SSJ.I] Percentage [SSJ.I] Relative frequency [SSJ.I] Absolute frequency [SSJ.T.K.S] Percentage [SSJ.T.K.S] Relative frequency [SSJ.T.K.S] msd01 dr, dr, dr, O 322,077 10.23 % 283.84 0 0 % 0 2,425 9.78 % 61.06 476 5.17 % 120.33 222,930 11.94 % 410.76 51,452 11.39 % 274.53 39,853 7.31 % 125.35 4,941 1.97 % 115.08 O oz, oz, oz, O 136,593 4.34 % 120.38 0 0 % 0 226 0.91 % 5.69 329 3.57 % 83.17 40,351 2.16 % 74.35 22,364 4.95 % 119.33 62,292 11.43 % 195.92 11,031 4.41 % 256.93 O d, d, d, O 133,611 4.24 % 117.75 0 0 % 0 213 0.86 % 5.36 567 6.16 % 143.33 84,570 4.53 % 155.83 20,886 4.62 % 111.44 26,223 4.81 % 82.48 1,152 0.46 % 26.83 O o, o, o, O 110,895 3.52 % 97.73 0 0 % 0 142 0.57 % 3.58 182 1.98 % 46.01 70,116 3.75 % 129.19 21,336 4.72 % 113.84 17,712 3.25 % 55.71 1,407 0.56 % 32.77 O m, M, m, O 110,881 3.52 % 97.72 0 0 % 0 522 2.10 % 13.14 61 0.66 % 15.42 85,646 4.59 % 157.81 9,522 2.11 % 50.81 9,214 1.69 % 28.98 5,916 2.36 % 137.79 O št, št, št, O 94,645 3.01 % 83.41 0 0 % 0 941 3.79 % 23.69 1,913 20.77 % 483.59 28,295 1.51 % 52.14 11,553 
2.56 % 61.64 44,772 8.21 % 140.82 7,171 2.87 % 167.02 O npr, npr, npr, O 92,015 2.92 % 81.09 0 0 % 0 314 1.27 % 7.91 311 3.38 % 78.62 24,837 1.33 % 45.76 29,590 6.55 % 157.88 14,523 2.66 % 45.68 22,440 8.97 % 522.65 O j, J, j, O 84,562 2.69 % 74.52 0 0 % 0 1,071 4.32 % 26.97 69 0.75 % 17.44 59,838 3.21 % 110.26 7,985 1.77 % 42.61 8,714 1.60 % 27.41 6,885 2.75 % 160.36 O a, A, a, O 81,577 2.59 % 71.89 0 0 % 0 1,198 4.83 % 30.16 191 2.07 % 48.28 49,090 2.63 % 90.45 8,889 1.97 % 47.43 15,696 2.88 % 49.37 6,513 2.60 % 151.70 O s, S, s, O 80,813 2.57 % 71.22 0 0 % 0 488 1.97 % 12.29 57 0.62 % 14.41 62,235 3.33 % 114.67 7,148 1.58 % 38.14 7,163 1.31 % 22.53 3,722 1.49 % 86.69 O t, t, t, O 79,373 2.52 % 69.95 0 0 % 0 168 0.68 % 4.23 99 1.07 % 25.03 34,619 1.85 % 63.79 14,460 3.20 % 77.15 25,379 4.66 % 79.82 4,648 1.86 % 108.26 O b, B, b, O 75,217 2.39 % 66.29 0 0 % 0 1,095 4.42 % 27.57 242 2.63 % 61.17 54,738 2.93 % 100.86 6,177 1.37 % 32.96 9,399 1.72 % 29.56 3,566 1.43 % 83.06 O i, i, i, O 68,165 2.17 % 60.07 0 0 % 0 158 0.64 % 3.98 90 0.98 % 22.75 27,139 1.45 % 50.01 12,936 2.86 % 69.02 24,317 4.46 % 76.48 3,525 1.41 % 82.10 O itd, itd, itd, O 65,475 2.08 % 57.70 2 5.71 % 206.04 578 2.33 % 14.55 195 2.12 % 49.29 36,287 1.94 % 66.86 14,609 3.23 % 77.95 8,287 1.52 % 26.06 5,517 2.21 % 128.50 O sv, sv, sv, O 65,377 2.08 % 57.62 0 0 % 0 688 2.77 % 17.32 57 0.62 % 14.41 44,951 2.41 % 82.83 6,229 1.38 % 33.24 7,698 1.41 % 24.21 5,754 2.30 % 134.02 O d, D, d, O 63,685 2.02 % 56.13 0 0 % 0 415 1.67 % 10.45 92 1.00 % 23.26 47,370 2.54 % 87.28 5,629 1.25 % 30.03 6,708 1.23 % 21.10 3,471 1.39 % 80.84 O p, P, p, O 61,583 1.96 % 54.27 0 0 % 0 371 1.50 % 9.34 68 0.74 % 17.19 47,303 2.53 % 87.16 6,019 1.33 % 32.12 4,974 0.91 % 15.64 2,848 1.14 % 66.33 O tel, tel, tel, O 59,340 1.89 % 52.30 0 0 % 0 5 0.02 % 0.13 12 0.13 % 3.03 47,321 2.53 % 87.19 7,401 1.64 % 39.49 3,114 0.57 % 9.79 1,487 0.59 % 34.63 O k, K, k, O 49,179 1.56 % 43.34 0 0 % 0 1,046 4.22 % 26.34 48 0.52 % 12.13 37,122 
1.99 % 68.40 4,816 1.07 % 25.70 4,367 0.80 % 13.74 1,780 0.71 % 41.46 O prof, prof, prof, O 48,612 1.54 % 42.84 0 0 % 0 47 0.19 % 1.18 26 0.28 % 6.57 34,903 1.87 % 64.31 5,305 1.17 % 28.31 7,590 1.39 % 23.87 741 0.30 % 17.26 O str, str, str, O 47,194 1.50 % 41.59 0 0 % 0 1,110 4.48 % 27.95 33 0.36 % 8.34 3,788 0.20 % 6.98 10,527 2.33 % 56.17 3,546 0.65 % 11.15 28,190 11.27 % 656.58 O i, I, i, O 44,515 1.41 % 39.23 0 0 % 0 441 1.78 % 11.10 137 1.49 % 34.63 30,623 1.64 % 56.42 4,534 1.00 % 24.19 5,596 1.03 % 17.60 3,184 1.27 % 74.16 O v, V, v, O 43,853 1.39 % 38.65 0 0 % 0 342 1.38 % 8.61 84 0.91 % 21.23 30,139 1.61 % 55.53 5,216 1.16 % 27.83 6,055 1.11 % 19.04 2,017 0.81 % 46.98 O r, R, r, O 43,194 1.37 % 38.07 0 0 % 0 533 2.15 % 13.42 40 0.43 % 10.11 29,209 1.56 % 53.82 5,154 1.14 % 27.50 5,123 0.94 % 16.11 3,135 1.25 % 73.02 O g, G, g, O 42,062 1.34 % 37.07 0 0 % 0 555 2.24 % 13.97 44 0.48 % 11.12 30,793 1.65 % 56.74 4,157 0.92 % 22.18 3,782 0.69 % 11.90 2,731 1.09 % 63.61 O mag, mag, mag, O 40,143 1.27 % 35.38 0 0 % 0 17 0.07 % 0.43 65 0.71 % 16.43 29,459 1.58 % 54.28 5,620 1.24 % 29.99 4,511 0.83 % 14.19 471 0.19 % 10.97 O c, C, c, O 37,814 1.20 % 33.33 33 94.29 % 3,399.61 339 1.37 % 8.54 132 1.43 % 33.37 18,868 1.01 % 34.77 7,426 1.64 % 39.62 6,246 1.15 % 19.65 4,770 1.91 % 111.10 O l, L, l, O 37,102 1.18 % 32.70 0 0 % 0 793 3.20 % 19.97 65 0.71 % 16.43 25,324 1.36 % 46.66 3,661 0.81 % 19.53 3,732 0.69 % 11.74 3,527 1.41 % 82.15 O st, st, st, O 36,399 1.16 % 32.08 0 0 % 0 947 3.82 % 23.84 30 0.33 % 7.58 19,248 1.03 % 35.47 3,431 0.76 % 18.31 11,515 2.11 % 36.22 1,228 0.49 % 28.60 O p, p, p, O 35,939 1.14 % 31.67 0 0 % 0 96 0.39 % 2.42 68 0.74 % 17.19 20,936 1.12 % 38.58 6,478 1.43 % 34.56 7,658 1.41 % 24.09 703 0.28 % 16.37 O op, op, op, O 35,928 1.14 % 31.66 0 0 % 0 454 1.83 % 11.43 8 0.09 % 2.02 15,464 0.83 % 28.49 4,702 1.04 % 25.09 11,819 2.17 % 37.17 3,481 1.39 % 81.08 O t, T, t, O 34,839 1.11 % 30.70 0 0 % 0 350 1.41 % 8.81 35 0.38 % 8.85 23,035 1.23 % 
42.44 4,478 0.99 % 23.89 4,954 0.91 % 15.58 1,987 0.79 % 46.28 O CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 307 File at CLARIN.SI 1.3.30 List of residual lemmas in the Gigafida 2.0 corpus with text-type distributionGF2.0-words-residual-lemmas- taxonomy-entire.tsvLemma Lemma (lower-case) Total absolute frequency of lemma Percentage of all found lemmas Total relative frequency (per million occurrences) Absolute frequency [SSJ.T.K.N] Percentage [SSJ.T.K.N] Relative frequency [SSJ.T.K.N] Absolute frequency [SSJ.T.K.L] Percentage [SSJ.T.K.L] Relative frequency [SSJ.T.K.L] Absolute frequency [SSJ.T.D] Percentage [SSJ.T.D] Relative frequency [SSJ.T.D] Absolute frequency [SSJ.T.P.C] Percentage [SSJ.T.P.C] Relative frequency [SSJ.T.P.C] Absolute frequency [SSJ.T.P.R] Percentage [SSJ.T.P.R] Relative frequency [SSJ.T.P.R] Absolute frequency [SSJ.I] Percentage [SSJ.I] Relative frequency [SSJ.I] Absolute frequency [SSJ.T.K.S] Percentage [SSJ.T.K.S] Relative frequency [SSJ.T.K.S] de de 119,443 3.25 % 105.26 0 0 % 0 3,685 5.07 % 92.78 175 1.66 % 44.24 47,661 3.62 % 87.82 18,231 1.80 % 97.27 37,427 4.85 % 117.72 12,264 2.48 % 285.64 of of 88,222 2.40 % 77.75 0 0 % 0 779 1.07 % 19.61 120 1.14 % 30.33 23,377 1.78 % 43.07 24,881 2.46 % 132.76 19,565 2.54 % 61.54 19,500 3.94 % 454.18 the the 75,221 2.05 % 66.29 0 0 % 0 979 1.35 % 24.65 94 0.89 % 23.76 19,043 1.45 % 35.09 24,539 2.43 % 130.93 15,392 2.00 % 48.41 15,174 3.07 % 353.42 The the 66,216 1.80 % 58.36 0 0 % 0 517 0.71 % 13.02 65 0.62 % 16.43 21,304 1.62 % 39.25 16,340 1.62 % 87.18 20,382 2.64 % 64.11 7,608 1.54 % 177.20 and and 48,667 1.32 % 42.89 0 0 % 0 559 0.77 % 14.08 36 0.34 % 9.10 9,755 0.74 % 17.97 14,531 1.44 % 77.53 9,262 1.20 % 29.13 14,524 2.94 % 338.28 la la 47,471 1.29 % 41.84 0 0 % 0 1,450 2.00 % 36.51 53 0.50 % 13.40 19,464 1.48 % 35.86 9,233 0.91 % 49.26 12,499 1.62 % 39.31 4,772 0.96 % 111.15 i i 46,624 1.27 % 41.09 0 0 % 0 839 1.16 % 21.13 327 3.10 % 82.66 9,281 0.71 % 
17.10 20,111 1.99 % 107.31 12,783 1.66 % 40.21 3,283 0.66 % 76.46 a a 32,433 0.88 % 28.58 0 0 % 0 571 0.79 % 14.38 220 2.09 % 55.61 8,682 0.66 % 16 11,459 1.14 % 61.14 5,850 0.76 % 18.40 5,651 1.14 % 131.62 in in 22,434 0.61 % 19.77 0 0 % 0 277 0.38 % 6.97 39 0.37 % 9.86 5,728 0.43 % 10.55 7,168 0.71 % 38.25 4,201 0.55 % 13.21 5,021 1.01 % 116.95 di di 20,807 0.57 % 18.34 1 33.33 % 103.02 268 0.37 % 6.75 28 0.27 % 7.08 9,811 0.75 % 18.08 3,363 0.33 % 17.94 5,855 0.76 % 18.42 1,481 0.30 % 34.49 sta sta 17,203 0.47 % 15.16 0 0 % 0 1 0 % 0.03 0 0 % 0 15,790 1.20 % 29.09 21 0 % 0.11 1,388 0.18 % 4.37 3 0 % 0.07 van van 16,380 0.45 % 14.44 0 0 % 0 286 0.39 % 7.20 18 0.17 % 4.55 8,131 0.62 % 14.98 1,246 0.12 % 6.65 6,171 0.80 % 19.41 528 0.11 % 12.30 el el 15,435 0.42 % 13.60 0 0 % 0 418 0.57 % 10.52 11 0.10 % 2.78 6,078 0.46 % 11.20 2,615 0.26 % 13.95 5,592 0.72 % 17.59 721 0.15 % 16.79 to to 15,224 0.41 % 13.42 0 0 % 0 207 0.28 % 5.21 18 0.17 % 4.55 3,583 0.27 % 6.60 5,425 0.54 % 28.95 2,841 0.37 % 8.94 3,150 0.64 % 73.37 for for 13,275 0.36 % 11.70 0 0 % 0 120 0.17 % 3.02 16 0.15 % 4.04 3,168 0.24 % 5.84 4,355 0.43 % 23.24 2,816 0.36 % 8.86 2,800 0.57 % 65.22 on on 12,675 0.34 % 11.17 0 0 % 0 182 0.25 % 4.58 8 0.08 % 2.02 3,727 0.28 % 6.87 3,678 0.36 % 19.62 2,767 0.36 % 8.70 2,313 0.47 % 53.87 pre pre 12,183 0.33 % 10.74 0 0 % 0 44 0.06 % 1.11 34 0.32 % 8.59 6,154 0.47 % 11.34 2,695 0.27 % 14.38 2,877 0.37 % 9.05 379 0.08 % 8.83 is is 11,863 0.32 % 10.45 0 0 % 0 262 0.36 % 6.60 3 0.03 % 0.76 2,493 0.19 % 4.59 3,726 0.37 % 19.88 3,514 0.46 % 11.05 1,865 0.38 % 43.44 von von 11,146 0.30 % 9.82 0 0 % 0 343 0.47 % 8.64 178 1.69 % 45 4,268 0.32 % 7.86 1,975 0.20 % 10.54 2,532 0.33 % 7.96 1,850 0.37 % 43.09 der der 10,665 0.29 % 9.40 0 0 % 0 248 0.34 % 6.24 28 0.27 % 7.08 3,773 0.29 % 6.95 1,459 0.14 % 7.78 3,064 0.40 % 9.64 2,093 0.42 % 48.75 et et 9,624 0.26 % 8.48 0 0 % 0 364 0.50 % 9.17 26 0.25 % 6.57 2,923 0.22 % 5.39 1,606 0.16 % 8.57 1,351 0.17 % 4.25 3,354 0.68 % 
78.12 ta ta 9,565 0.26 % 8.43 0 0 % 0 309 0.42 % 7.78 15 0.14 % 3.79 2,643 0.20 % 4.87 2,979 0.29 % 15.89 2,441 0.32 % 7.68 1,178 0.24 % 27.44 da da 9,128 0.25 % 8.04 0 0 % 0 1,009 1.39 % 25.41 27 0.26 % 6.83 3,188 0.24 % 5.87 2,083 0.21 % 11.11 2,267 0.29 % 7.13 554 0.11 % 12.90 bin bin 9,010 0.24 % 7.94 0 0 % 0 65 0.09 % 1.64 2 0.02 % 0.51 5,068 0.39 % 9.34 1,212 0.12 % 6.47 2,562 0.33 % 8.06 101 0.02 % 2.35 by by 8,563 0.23 % 7.55 0 0 % 0 69 0.10 % 1.74 21 0.20 % 5.31 2,522 0.19 % 4.65 3,090 0.31 % 16.49 1,719 0.22 % 5.41 1,142 0.23 % 26.60 an an 7,554 0.21 % 6.66 0 0 % 0 115 0.16 % 2.90 11 0.10 % 2.78 3,059 0.23 % 5.64 1,705 0.17 % 9.10 1,264 0.16 % 3.98 1,400 0.28 % 32.61 du du 7,024 0.19 % 6.19 0 0 % 0 295 0.41 % 7.43 10 0.10 % 2.53 3,031 0.23 % 5.58 1,043 0.10 % 5.57 1,300 0.17 % 4.09 1,345 0.27 % 31.33 be be 6,845 0.19 % 6.03 0 0 % 0 137 0.19 % 3.45 11 0.10 % 2.78 2,025 0.15 % 3.73 2,409 0.24 % 12.85 1,265 0.16 % 3.98 998 0.20 % 23.24 des des 6,567 0.18 % 5.79 0 0 % 0 385 0.53 % 9.69 21 0.20 % 5.31 2,185 0.17 % 4.03 941 0.09 % 5.02 1,067 0.14 % 3.36 1,968 0.40 % 45.84 del del 6,521 0.18 % 5.75 0 0 % 0 88 0.12 % 2.22 14 0.13 % 3.54 2,826 0.21 % 5.21 1,519 0.15 % 8.10 1,325 0.17 % 4.17 749 0.15 % 17.45 dnevnik,si dnevnik,si 6,354 0.17 % 5.60 0 0 % 0 0 0 % 0 0 0 % 0 6,059 0.46 % 11.16 17 0 % 0.09 278 0.04 % 0.87 0 0 % 0 with with 5,952 0.16 % 5.25 0 0 % 0 103 0.14 % 2.59 4 0.04 % 1.01 1,281 0.10 % 2.36 2,175 0.21 % 11.61 1,145 0.15 % 3.60 1,244 0.25 % 28.97 CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 308 File at CLARIN.SI 1.3.31 List of residual lower-case word forms in the Gigafida 2.0 corpus with text-type distributionGF2.0-words-residual-lowercase_forms- taxonomy-entire.tsvLower-case word form Lemma Lemma (lower-case) Morphosyntactic tag Total absolute frequency of lower-case word form Percentage of all found lower-case word forms Total relative frequency (per million occurrences) Absolute frequency [SSJ.T.K.N] Percentage 
[SSJ.T.K.N] Relative frequency [SSJ.T.K.N] Absolute frequency [SSJ.T.K.L] Percentage [SSJ.T.K.L] Relative frequency [SSJ.T.K.L] Absolute frequency [SSJ.T.D] Percentage [SSJ.T.D] Relative frequency [SSJ.T.D] Absolute frequency [SSJ.T.P.C] Percentage [SSJ.T.P.C] Relative frequency [SSJ.T.P.C] Absolute frequency [SSJ.T.P.R] Percentage [SSJ.T.P.R] Relative frequency [SSJ.T.P.R] Absolute frequency [SSJ.I] Percentage [SSJ.I] Relative frequency [SSJ.I] Absolute frequency [SSJ.T.K.S] Percentage [SSJ.T.K.S] Relative frequency [SSJ.T.K.S] msd01 msd02 de de de Nj 113,760 3.10 % 100.26 0 0 % 0 3,345 4.60 % 84.22 161 1.53 % 40.70 45,380 3.45 % 83.62 17,160 1.70 % 91.56 36,135 4.68 % 113.65 11,579 2.34 % 269.69 N j of of of Nj 87,374 2.38 % 77 0 0 % 0 766 1.05 % 19.29 118 1.12 % 29.83 23,009 1.75 % 42.40 24,678 2.44 % 131.67 19,416 2.52 % 61.07 19,387 3.92 % 451.55 N j the the the Nj 73,182 1.99 % 64.49 0 0 % 0 960 1.32 % 24.17 93 0.88 % 23.51 18,465 1.40 % 34.02 24,005 2.38 % 128.08 14,531 1.88 % 45.70 15,128 3.06 % 352.35 N j the The the Nj 66,216 1.80 % 58.36 0 0 % 0 517 0.71 % 13.02 65 0.62 % 16.43 21,304 1.62 % 39.25 16,340 1.62 % 87.18 20,382 2.64 % 64.11 7,608 1.54 % 177.20 N j and and and Nj 46,695 1.27 % 41.15 0 0 % 0 538 0.74 % 13.55 36 0.34 % 9.10 9,165 0.70 % 16.89 13,973 1.38 % 74.56 8,811 1.14 % 27.71 14,172 2.86 % 330.08 N j la la la Nj 46,398 1.26 % 40.89 0 0 % 0 1,431 1.97 % 36.03 50 0.47 % 12.64 19,029 1.45 % 35.06 8,995 0.89 % 47.99 12,156 1.58 % 38.23 4,737 0.96 % 110.33 N j i i i N 25,225 0.69 % 22.23 0 0 % 0 205 0.28 % 5.16 292 2.77 % 73.81 4,791 0.36 % 8.83 11,246 1.11 % 60 6,961 0.90 % 21.89 1,730 0.35 % 40.29 N a a a Nj 23,834 0.65 % 21 0 0 % 0 406 0.56 % 10.22 23 0.22 % 5.81 5,428 0.41 % 10 9,365 0.93 % 49.97 4,121 0.53 % 12.96 4,491 0.91 % 104.60 N j in in in Nj 22,418 0.61 % 19.76 0 0 % 0 277 0.38 % 6.97 39 0.37 % 9.86 5,726 0.43 % 10.55 7,166 0.71 % 38.24 4,199 0.54 % 13.21 5,011 1.01 % 116.71 N j i i i Nj 21,371 0.58 % 18.83 0 0 % 0 634 0.87 % 15.96 
35 0.33 % 8.85 4,489 0.34 % 8.27 8,863 0.88 % 47.29 5,822 0.76 % 18.31 1,528 0.31 % 35.59 N j di di di Nj 18,615 0.51 % 16.41 133.33 % 103.02 241 0.33 % 6.07 21 0.20 % 5.31 8,530 0.65 % 15.72 2,830 0.28 % 15.10 5,584 0.72 % 17.56 1,408 0.28 % 32.79 N j sta sta sta N 17,199 0.47 % 15.16 0 0 % 0 1 0 % 0.03 0 0 % 0 15,786 1.20 % 29.09 21 0 % 0.11 1,388 0.18 % 4.37 3 0 % 0.07 N van van van Nj 16,212 0.44 % 14.29 0 0 % 0 279 0.38 % 7.02 18 0.17 % 4.55 8,068 0.61 % 14.87 1,220 0.12 % 6.51 6,105 0.79 % 19.20 522 0.10 % 12.16 N j to to to Nj 15,223 0.41 % 13.42 0 0 % 0 207 0.28 % 5.21 18 0.17 % 4.55 3,583 0.27 % 6.60 5,425 0.54 % 28.95 2,841 0.37 % 8.94 3,149 0.64 % 73.34 N j el el el Nj 14,670 0.40 % 12.93 0 0 % 0 409 0.56 % 10.30 11 0.10 % 2.78 5,725 0.43 % 10.55 2,556 0.25 % 13.64 5,261 0.68 % 16.55 708 0.14 % 16.49 N j for for for Nj 13,203 0.36 % 11.64 0 0 % 0 117 0.16 % 2.95 16 0.15 % 4.04 3,154 0.24 % 5.81 4,320 0.43 % 23.05 2,808 0.36 % 8.83 2,788 0.56 % 64.94 N j on on on Nj 12,444 0.34 % 10.97 0 0 % 0 178 0.24 % 4.48 8 0.08 % 2.02 3,549 0.27 % 6.54 3,655 0.36 % 19.50 2,748 0.36 % 8.64 2,306 0.47 % 53.71 N j pre pre pre N 11,574 0.32 % 10.20 0 0 % 0 36 0.05 % 0.91 32 0.30 % 8.09 5,831 0.44 % 10.74 2,559 0.25 % 13.65 2,793 0.36 % 8.78 323 0.07 % 7.52 N der der der Nj 10,059 0.27 % 8.86 0 0 % 0 240 0.33 % 6.04 26 0.25 % 6.57 3,517 0.27 % 6.48 1,356 0.13 % 7.24 2,906 0.38 % 9.14 2,014 0.41 % 46.91 N j von von von Nj 10,057 0.27 % 8.86 0 0 % 0 297 0.41 % 7.48 143 1.36 % 36.15 3,879 0.29 % 7.15 1,731 0.17 % 9.24 2,300 0.30 % 7.23 1,707 0.34 % 39.76 N j is is is Nj 9,761 0.27 % 8.60 0 0 % 0 241 0.33 % 6.07 3 0.03 % 0.76 2,275 0.17 % 4.19 3,494 0.35 % 18.64 1,979 0.26 % 6.22 1,769 0.36 % 41.20 N j bin bin bin Nj 8,765 0.24 % 7.72 0 0 % 0 64 0.09 % 1.61 2 0.02 % 0.51 4,933 0.38 % 9.09 1,167 0.12 % 6.23 2,502 0.32 % 7.87 97 0.02 % 2.26 N j to ta ta Nj 8,640 0.23 % 7.61 0 0 % 0 222 0.31 % 5.59 8 0.08 % 2.02 2,268 0.17 % 4.18 2,753 0.27 % 14.69 2,302 0.30 % 7.24 1,087 0.22 % 
25.32 N j
a a a N 8,599 0.23 % 7.58 0 0 % 0 165 0.23 % 4.15 197 1.87 % 49.80 3,254 0.25 % 6 2,094 0.21 % 11.17 1,729 0.22 % 5.44 1,160 0.23 % 27.02 N
et et et Nj 7,947 0.22 % 7 0 0 % 0 348 0.48 % 8.76 23 0.22 % 5.81 2,091 0.16 % 3.85 1,394 0.14 % 7.44 981 0.13 % 3.09 3,110 0.63 % 72.44 N j
by by by Nj 7,363 0.20 % 6.49 0 0 % 0 62 0.09 % 1.56 17 0.16 % 4.30 2,059 0.16 % 3.79 2,610 0.26 % 13.93 1,556 0.20 % 4.89 1,059 0.21 % 24.67 N j
des des des Nj 6,526 0.18 % 5.75 0 0 % 0 378 0.52 % 9.52 21 0.20 % 5.31 2,181 0.17 % 4.02 923 0.09 % 4.92 1,060 0.14 % 3.33 1,963 0.40 % 45.72 N j
dnevnik,si dnevnik,si dnevnik,si N 6,323 0.17 % 5.57 0 0 % 0 0 0 % 0 0 0 % 0 6,028 0.46 % 11.11 17 0 % 0.09 278 0.04 % 0.87 0 0 % 0 N
be be be Nj 6,202 0.17 % 5.47 0 0 % 0 125 0.17 % 3.15 9 0.09 % 2.28 1,700 0.13 % 3.13 2,240 0.22 % 11.95 1,169 0.15 % 3.68 959 0.19 % 22.34 N j
an an an Nj 5,886 0.16 % 5.19 0 0 % 0 111 0.15 % 2.79 9 0.09 % 2.28 1,593 0.12 % 2.94 1,666 0.17 % 8.89 1,128 0.15 % 3.55 1,379 0.28 % 32.12 N j
with with with Nj 5,815 0.16 % 5.12 0 0 % 0 101 0.14 % 2.54 4 0.04 % 1.01 1,231 0.09 % 2.27 2,141 0.21 % 11.42 1,109 0.14 % 3.49 1,229 0.25 % 28.62 N j
de de de N 5,683 0.15 % 5.01 0 0 % 0 340 0.47 % 8.56 14 0.13 % 3.54 2,281 0.17 % 4.20 1,071 0.11 % 5.71 1,292 0.17 % 4.06 685 0.14 % 15.95 N

The frequency lists of word sets contain consecutive sequences of words in the corpus, without skipped words. For example, the string “prevajati roman” is extracted only when the two words are directly adjacent in the text; its frequencies thus do not include cases with one or more intervening words, such as “prevajati italijanski roman” or “prevajati novi italijanski roman”. The length of the extracted word sets ranges from 1 to 5 words. Because of the size of the Gigafida 2.0 corpus, only the word sets with a relative frequency of at least 2 per million were extracted.
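As a rough illustration of the extraction described above, the following Python sketch counts contiguous word sets of 1 to 5 words in a tokenised text and keeps only those whose relative frequency reaches a threshold. It is not the LIST implementation: the function name `extract_word_sets`, the toy token list, and the threshold of 0 used in the example are assumptions made for illustration, and relative frequency is taken here as occurrences per million corpus words.

```python
from collections import Counter


def extract_word_sets(tokens, max_len=5, min_per_million=2.0):
    """Count contiguous word sets (n-grams) of 1..max_len tokens.

    Hypothetical helper for illustration only; not part of LIST.
    Returns {n: {word_set_tuple: (absolute_freq, per_million)}}.
    """
    total_words = len(tokens)
    result = {}
    for n in range(1, max_len + 1):
        # Only directly adjacent tokens form a word set; nothing is skipped.
        counts = Counter(
            tuple(tokens[i:i + n]) for i in range(total_words - n + 1)
        )
        kept = {}
        for word_set, absolute in counts.items():
            per_million = absolute / total_words * 1_000_000
            if per_million >= min_per_million:
                kept[word_set] = (absolute, per_million)
        result[n] = kept
    return result


# Toy example: the threshold is set to 0 so that nothing is filtered out.
tokens = (
    "prevajati roman in prevajati italijanski roman in prevajati roman"
).split()
sets = extract_word_sets(tokens, max_len=2, min_per_million=0)
print(sets[2][("prevajati", "roman")][0])
```

In this toy text, “prevajati roman” is counted only for the two places where the words are adjacent; the occurrence of “prevajati italijanski roman” contributes to other word sets instead.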
Along with the word sets, the lists also contain their total absolute frequencies ($f_{as}$), i.e. the number of their occurrences in the corpus, and their percentages ($p_{sn}$), which are based on the sum of the frequencies of all extracted sets ($N$) of equal length ($n$) in the corpus:

$$p_{sn} = \frac{f_{as}}{N} \times 100$$

The total relative frequency of a word set ($f_{rs}$), expressed per million words, is calculated using the total absolute frequency of the word set ($f_{as}$) and the total sum of the absolute frequencies ($f_{aw}$) of all $m$ words in the corpus:

$$f_{rs} = \frac{f_{as}}{\sum_{i=1}^{m} f_{aw_i}} \times 1{,}000{,}000$$

The same method is used to calculate the absolute frequencies, percentages, and relative frequencies in the subcorpora containing only texts from a specific taxonomy branch. The main difference is that instead of the sums from the entire corpus (e.g. the number of all words or the absolute frequency of a word set), the formulae take into account only partial values extracted from the relevant subcorpus (e.g. the number of all words in fiction texts and the absolute frequency of a word set in fiction texts).

The lists of word sets also contain five different collocability measures that indicate how typical a certain word set is. The available measures are the following: t-score, MI (mutual information), MI3 (mutual information cubed), logDice, and simple LL (simple log-likelihood). The measures are calculated from the observed ($O$) and expected ($E$) frequencies of the word set, using the following quantities:

$O$ ... observed frequency of the word set
$E$ ... expected frequency of the word set
$f_{as}$ ... absolute frequency of the word set in the (sub)corpus
$N$ ... frequency of all word sets in the (sub)corpus
$n$ ... length of the word set (number of words)
$f_w$ ... absolute frequency of the word in the (sub)corpus

Tables 1.4.1. to 1.4.4. contain lists of word sets extracted from lower-case word forms. Tables 1.4.5. to 1.4.8. contain lists of word sets extracted from lower-case word forms, along with their lemmas (e.g. “prevajal roman” → “prevajati roman”). In tables 1.4.9.
to 1.4.12., the lists of word-level n-grams contain lower-case word forms as well as their morphosyntactic tags (e.g. “prevajal roman” → “Ggnd-em Sometn”). Tables 1.4.13. to 1.4.16. contain lists of sets of morphosyntactic tags without word forms or lemmas.

1.4. Frequency lists of word sets from the Gigafida 2.0 corpus

1.4.1 List of word-level 2-grams with relative frequency 2/million or above from lower-case word forms in the Gigafida 2.0 corpus with text-type distribution and collocation measures
File at CLARIN.SI: GF2.0-word_sets-lowercase_forms-2grams-taxonomy-collocativity-entire.tsv
Columns: Lower-case form of string; Total absolute frequency of lower-case word form; Percentage of all found lower-case word forms; Total relative frequency (per million occurrences); Absolute frequency, Percentage, and Relative frequency for each of [SSJ.T.K.N], [SSJ.I], [SSJ.T.K.L], [SSJ.T.D], [SSJ.T.P.R], [SSJ.T.K.S], [SSJ.T.P.C] (in that order); Dice; t-score; MI; MI3; logDice; simple LL
se je 3,555,104 0.95 % 3,133.10 2 0.03 % 206.04 1,027,944 0.49 % 3,233.15 301,539 1.24 % 7,592.43 6,079 0.23 % 1,536.71 497,382 0.44 % 2,653.87 98,216 0.40 % 2,287.57 1,623,942 0.47 % 2,992.22 0.13 1,586.33 2.66 46.18 11.02 -297,364.79
da je 2,685,352 0.72 % 2,366.59 1 0.02 % 103.02 826,940 0.39 % 2,600.94 124,220 0.51 % 3,127.73 8,534 0.33 % 2,157.30 420,760 0.37 % 2,245.04 84,817 0.34 % 1,975.49 1,220,080 0.35 % 2,248.08 0.10 1,318.41 2.36 45.07 10.65 -513,399.16
ki je 2,442,122 0.66 % 2,152.23 6 0.10 % 618.11 781,783 0.37 % 2,458.91 72,793 0.30 % 1,832.85 7,697 0.30 % 1,945.72 350,919 0.31 % 1,872.39 73,861 0.30 % 1,720.31 1,155,063 0.33 % 2,128.28 0.09 1,278.63 2.46 44.90 10.57 -379,932.13
pa je 2,208,372 0.59 % 1,946.23 1 0.02 % 103.02 670,499 0.32 % 2,108.89 52,815 0.22 % 1,329.82 3,006 0.12 % 759.88 328,872 0.29 % 1,754.75 52,321 0.21 % 1,218.62 1,100,858 0.32 % 2,028.40 0.09 1,225.67 2.51 44.66 10.47 -301,933.37
ki so 1,743,293 0.47 % 1,536.36 4 0.07 % 412.07 486,132 0.23 % 1,529.01 50,819 0.21 % 1,279.57 7,766 0.30 % 1,963.16 270,450 0.24 % 1,443.03 74,618 0.30 % 1,737.94 853,504 0.25 % 1,572.64 0.13 1,208.06 3.56 45.02 11.10 541,959.60
je v 1,737,223 0.47 % 1,531.01 1 0.02 % 103.02 599,035 0.28 % 1,884.12 44,337 0.18 % 1,116.36 2,964 0.11 % 749.27 216,723 0.19 % 1,156.36 45,659 0.18 % 1,063.45 828,504 0.24 % 1,526.57 0.05 468.98 0.63 42.09 9.63 -572,675.74
je bil 1,681,246 0.45 % 1,481.67 0 0 % 0 525,712 0.25 % 1,653.50 101,698 0.42 % 2,560.65 4,022 0.15 % 1,016.72 217,258 0.19 % 1,159.22 49,827 0.20 % 1,160.53 782,729 0.23 % 1,442.23 0.08 1,231.82 4.32 45.68 10.35 1,180,862.13
da bi 1,640,637 0.44 % 1,445.89 0 0 % 0 408,727 0.20 % 1,285.55 133,500 0.55 % 3,361.39 5,902 0.23 % 1,491.96 259,491 0.23 % 1,384.56 69,025 0.28 % 1,607.67 763,992 0.22 % 1,407.71 0.15 1,214.15 4.26 45.55 11.29 1,100,184.03
da se 1,584,231 0.42 % 1,396.17 30 0.48 % 3,090.55 452,711 0.21 % 1,423.89 78,405 0.32 % 1,974.15 7,516 0.29 % 1,899.96 274,171 0.24 % 1,462.89 59,773 0.24 % 1,392.18 711,625 0.20 % 1,311.22 0.10 1,091.01 2.91 44.10 10.71 27,576.90
so se 1,266,721 0.34 % 1,116.35 2 0.03 % 206.04 344,051 0.16 % 1,082.13 60,393 0.25 % 1,520.63 2,287 0.09 % 578.13 165,405 0.14 % 882.55 39,128 0.16 % 911.34 655,455 0.19 % 1,207.72 0.09 958.14 2.75 43.29 10.47 -59,757.48
je bilo 1,199,198 0.32 % 1,056.85 0 0 % 0 330,119 0.16 % 1,038.31 80,644 0.33 % 2,030.53 3,236 0.12 % 818.03 165,513 0.14 % 883.12 37,875 0.15 % 882.15 581,811 0.17 % 1,072.03 0.06 1,028.22 4.03 44.42 9.87 660,410.29
naj bi 1,193,529 0.32 % 1,051.85 0 0 % 0 349,915 0.17 % 1,100.57 11,116 0.05 % 279.89 1,238 0.05 % 312.95 143,723 0.13 % 766.86 21,714 0.09 % 505.74 665,823 0.19 % 1,226.82 0.28 1,082.07 6.71 47.09 12.17 2,458,968.33
je bila 1,189,566 0.32 % 1,048.36 0 0 % 0 342,001 0.16 % 1,075.68 79,513 0.33 % 2,002.05 3,492 0.13 % 882.74 171,598 0.15 % 915.59 41,219 0.17 % 960.04 551,743 0.16 % 1,016.62 0.06 1,027.72 4.11 44.48 9.86 705,240.09
da so 1,029,697 0.28 % 907.47 0 0 % 0 302,301 0.14 % 950.82 29,853 0.12 % 751.67 2,610 0.10 % 659.78 150,613 0.13 % 803.62 36,097 0.14 % 840.74 508,223 0.15 % 936.43 0.07 842.03 2.55 42.50 10.22 -125,119.72
ki se 1,014,422 0.27 % 894 3 0.05 % 309.06 287,492 0.14 % 904.24 32,604 0.13 % 820.93 5,816 0.22 % 1,470.22 183,768 0.16 % 980.53 45,698 0.18 % 1,064.36 459,041 0.13 % 845.81 0.07 829.97 2.51 42.41 10.18 -140,886.44
pa so 906,695 0.24 % 799.07 0 0 % 0 256,521 0.12 % 806.83 14,316 0.06 % 360.46 1,242 0.05 % 313.96 124,644 0.11 % 665.06 23,582 0.10 % 549.25 486,390 0.14 % 896.21 0.07 816.52 2.81 42.39 10.25 -20,504.39
ko je 892,721 0.24 % 786.75 9 0.15 % 927.17 284,884 0.14 % 896.03 66,558 0.27 % 1,675.86 2,288 0.09 % 578.38 125,412 0.11 % 669.16 25,434 0.10 % 592.39 388,136 0.11 % 715.17 0.04 826.47 3.00 42.53 9.41 48,898.46
pa se 851,420 0.23 % 750.35 0 0 % 0 231,651 0.11 % 728.60 25,693 0.10 % 646.92 2,016 0.08 % 509.62 151,613 0.13 % 808.96 26,441 0.11 % 615.84 414,006 0.12 % 762.83 0.06 754.13 2.45 41.85 10.01 -134,620.88
je na 837,578 0.23 % 738.15 1 0.02 % 103.02 294,216 0.14 % 925.39 27,550 0.11 % 693.68 1,146 0.04 % 289.70 111,850 0.10 % 596.79 16,675 0.07 % 388.38 386,140 0.11 % 711.49 0.03 182.28 0.32 39.67 8.86 -172,059.51
da bo 786,625 0.21 % 693.25 3 0.05 % 309.06 251,763 0.12 % 791.86 26,224 0.11 % 660.29 1,966 0.08 % 496.98 104,238 0.09 % 556.18 14,627 0.06 % 340.68 387,804 0.11 % 714.55 0.08 798.05 3.32 42.49 10.27 156,253.98
ki jih 735,400 0.20 % 648.10 3 0.05 % 309.06 195,182 0.09 % 613.90 18,218 0.07 % 458.71 4,813 0.18 % 1,216.67 141,488 0.12 % 754.93 40,726 0.16 % 948.56 334,970 0.10 % 617.20 0.10 824.55 4.70 43.68 10.63 666,493.80
bi se 707,845 0.19 % 623.82 1 0.02 % 103.02 178,981 0.09 % 562.94 48,802 0.20 % 1,228.78 2,146 0.08 % 542.49 111,306 0.10 % 593.89 23,491 0.10 % 547.13 343,118 0.10 % 632.22 0.06 732.16 2.95 41.81 10.01 23,496.36
ki ga 706,070 0.19 % 622.26 6 0.10 % 618.11 199,524 0.10 % 627.55 23,967 0.10 % 603.46 3,374 0.13 % 852.91 122,406 0.11 % 653.12 31,710 0.13 % 738.56 325,083 0.09 % 598.99 0.09 802.93 4.49 43.35 10.55 559,970.61
se bo 686,378 0.18 % 604.90 0 0 % 0 227,432 0.11 % 715.33 17,788 0.07 % 447.88 1,059 0.04 % 267.70 88,753 0.08 % 473.56 11,192 0.04 % 260.67 340,154 0.10 % 626.76 0.06 726.23 3.02 41.80 10.00 43,978.21
več kot 673,051 0.18 % 593.16 1 0.02 % 103.02 235,287 0.11 % 740.04 8,026 0.03 % 202.09 1,049 0.04 % 265.18 91,023 0.08 % 485.67 10,873 0.04 % 253.25 326,792 0.09 % 602.14 0.17 805.58 5.79 44.51 11.41 1,024,933.68
ga je 667,351 0.18 % 588.13 0 0 % 0 180,765 0.09 % 568.55 71,865 0.29 % 1,809.48 1,819 0.07 % 459.82 92,374 0.08 % 492.88 22,299 0.09 % 519.37 298,229 0.09 % 549.51 0.03 695.49 2.75 41.45 9.00 -31,362.05
in se 659,805 0.18 % 581.48 1 0.02 % 103.02 154,656 0.07 % 486.43 78,210 0.32 % 1,969.24 2,057 0.08 % 519.99 127,501 0.11 % 680.30 30,049 0.12 % 699.88 267,331 0.08 % 492.58 0.03 305.12 0.68 39.34 8.90 -225,748.57
ne bo 650,516 0.17 % 573.30 0 0 % 0 182,187 0.09 % 573.03 29,369 0.12 % 739.48 1,544 0.06 % 390.31 99,628 0.09 % 531.58 11,939 0.05 % 278.07 325,849 0.09 % 600.40 0.10 762.23 4.19 42.81 10.71 409,864.93
je da 641,874 0.17 % 565.68 0 0 % 0 180,290 0.09 % 567.06 40,510 0.17 % 1,020 1,120 0.04 % 283.12 105,934 0.09 % 565.23 21,884 0.09 % 509.70 292,136 0.08 % 538.28 0.02 146.04 0.29 38.87 8.58 -121,813.67
to je 630,253 0.17 % 555.44 2 0.03 % 206.04 178,917 0.09 % 562.74 39,610 0.16 % 997.34 2,906 0.11 % 734.61 110,954 0.10 % 592.01 25,701 0.10 % 598.61 272,163 0.08 % 501.48 0.03 611.06 2.12 40.65 8.87 -166,362.48
je tudi 626,072 0.17 % 551.75 1 0.02 % 103.02 185,043 0.09 % 582.01 11,577 0.05 % 291.50 1,274 0.05 % 322.05 107,544 0.09 % 573.82 17,904 0.07 % 417.01 302,729 0.09 % 557.80 0.03 414.44 1.07 39.58 8.73 -252,416.34
ne bi 619,614 0.17 % 546.06 0 0 % 0 145,511 0.07 % 457.67 49,666 0.20 % 1,250.54 1,908 0.07 % 482.32 107,185 0.09 % 571.90 18,391 0.07 % 428.35 296,953 0.09 % 547.16 0.09 737.92 4.00 42.48 10.58 330,049.23

1.4.2 List of word-level 3-grams with relative frequency 2/million or above from lower-case word forms in the Gigafida 2.0 corpus with text-type distribution and collocation measures
File at CLARIN.SI: GF2.0-word_sets-lowercase_forms-3grams-taxonomy-collocativity-entire.tsv
Columns: Lower-case form of string; Total absolute frequency of lower-case word form; Percentage of all found lower-case word forms; Total relative frequency (per million occurrences); Absolute frequency, Percentage, and Relative frequency for each of [SSJ.T.K.N], [SSJ.I], [SSJ.T.K.L], [SSJ.T.D], [SSJ.T.P.R], [SSJ.T.K.S], [SSJ.T.P.C] (in that order); Dice; t-score; MI; MI3; logDice; simple LL
ki se je 272,838 0.38 % 240.45 1 0.04 % 103.02 85,076 0.11 % 267.59 14,121 0.17 % 355.55 489 0.05 % 123.61 37,285 0.10 % 198.94 8,072 0.10 % 188.01 127,794 0.10 % 235.47 0.01 522.34 17.91 54.02 7.61 2,395,929.59
ki ga je 267,151 0.37 % 235.44 0 0 % 0 81,848 0.10 % 257.43 13,466 0.16 % 339.06 937 0.09 % 236.86 34,875 0.09 % 186.08 8,902 0.12 % 207.34 127,123 0.10 % 234.23 0.01 516.87 18.94 54.99 7.89 2,511,277.97
pa se je 253,657 0.35 % 223.55 0 0 % 0 78,195 0.10 %
245.94 10,612 0.12 % 267.20 325 0.03 % 82.16 34,458 0.09 % 183.86 5,328 0.07 % 124.10 124,739 0.10 % 229.84 0.01 503.63 15.11 51.02 7.54 1,800,529.85 da se je 232,812 0.33 % 205.18 0 0 % 0 67,872 0.09 % 213.48 16,579 0.19 % 417.44 507 0.05 % 128.16 34,422 0.09 % 183.66 7,023 0.09 % 163.57 106,409 0.09 % 196.07 0.01 482.49 14.99 50.65 7.34 1,635,226.73 da bi se 230,612 0.32 % 203.24 0 0 % 0 54,494 0.07 % 171.40 21,753 0.25 % 547.72 892 0.09 % 225.49 38,615 0.10 % 206.04 10,269 0.13 % 239.18 104,589 0.08 % 192.71 0.02 480.21 15.54 51.17 8.24 1,696,941.38 se je v 217,486 0.30 % 191.67 0 0 % 0 71,008 0.09 % 223.34 7,392 0.09 % 186.12 229 0.02 % 57.89 25,526 0.07 % 136.20 5,288 0.07 % 123.16 108,043 0.09 % 199.08 0.01 466.35 15.88 51.34 6.93 1,643,926.55 ki jo je 206,751 0.29 % 182.21 0 0 % 0 63,716 0.08 % 200.40 10,790 0.13 % 271.68 542 0.05 % 137.01 28,377 0.08 % 151.41 7,016 0.09 % 163.41 96,310 0.08 % 177.46 0.01 454.69 15.45 50.77 7.55 1,510,124.32 ki jih je 204,194 0.28 % 179.96 0 0 % 0 59,646 0.07 % 187.60 9,134 0.11 % 229.98 1,082 0.11 % 273.52 29,355 0.08 % 156.63 8,281 0.11 % 192.87 96,696 0.08 % 178.17 0.01 451.86 14.89 50.17 7.51 1,421,702.23 ki je v 186,710 0.26 % 164.55 1 0.04 % 103.02 66,989 0.08 % 210.70 1,882 0.02 % 47.39 463 0.04 % 117.04 20,847 0.06 % 111.23 3,893 0.05 % 90.67 92,635 0.07 % 170.69 0.01 432.10 16.84 51.86 6.77 1,519,942.00 ki naj bi 173,475 0.24 % 152.88 0 0 % 0 45,720 0.06 % 143.80 2,081 0.02 % 52.40 277 0.03 % 70.02 24,480 0.07 % 130.62 4,471 0.06 % 104.13 96,446 0.08 % 177.71 0.02 416.49 14.56 49.37 8.66 1,174,129.43 ki je bil 159,984 0.22 % 140.99 0 0 % 0 53,428 0.07 % 168.04 6,298 0.07 % 158.58 507 0.05 % 128.16 19,240 0.05 % 102.66 4,452 0.06 % 103.69 76,059 0.06 % 140.14 0.01 399.97 15.40 49.98 7.16 1,163,636.98 ko se je 156,602 0.22 % 138.01 0 0 % 0 43,625 0.05 % 137.21 17,965 0.21 % 452.34 295 0.03 % 74.57 23,582 0.06 % 125.83 4,690 0.06 % 109.24 66,445 0.05 % 122.43 0.01 395.72 14.89 49.40 7.02 1,090,483.59 ki so se 150,086 
0.21 % 132.27 0 0 % 039,890 0.05 % 125.46 7,363 0.09 % 185.39 375 0.04 % 94.80 21,507 0.06 % 114.75 5,975 0.08 % 139.16 74,976 0.06 % 138.15 0.01 387.40 15.23 49.62 7.46 1,075,849.43 da se bo 139,142 0.19 % 122.63 0 0 % 0 42,142 0.05 % 132.55 5,964 0.07 % 150.17 262 0.03 % 66.23 19,776 0.05 % 105.52 2,884 0.04 % 67.17 68,114 0.06 % 125.50 0.01 373.00 14.25 48.42 7.53 915,102.83 naj bi se 133,302 0.19 % 117.48 0 0 % 040,908 0.05 % 128.67 1,268 0.01 % 31.93 165 0.02 % 41.71 15,363 0.04 % 81.97 2,266 0.03 % 52.78 73,332 0.06 % 135.12 0.02 365.10 15.29 49.34 8.06 960,775.49 ki so ga 131,513 0.18 % 115.90 0 0 % 0 38,648 0.05 % 121.56 3,980 0.05 % 100.21 225 0.02 % 56.88 17,107 0.05 % 91.28 4,466 0.06 % 104.02 67,087 0.05 % 123.61 0.01 362.63 14.16 48.17 7.81 858,488.17 pa naj bi 129,124 0.18 % 113.80 0 0 % 0 37,605 0.05 % 118.28 737 0.01 % 18.56 44 0 % 11.12 13,446 0.04 % 71.74 1,146 0.01 % 26.69 76,146 0.06 % 140.30 0.02 359.32 14.14 48.09 8.35 840,837.46 ki so jih 128,399 0.18 % 113.16 0 0 % 0 35,801 0.04 % 112.60 3,476 0.04 % 87.52 375 0.04 % 94.80 17,741 0.05 % 94.66 5,956 0.08 % 138.72 65,050 0.05 % 119.86 0.01 358.32 15.48 49.42 7.79 939,888.09 je da je 119,657 0.17 % 105.45 0 0 % 0 35,171 0.04 % 110.62 7,787 0.09 % 196.07 176 0.02 % 44.49 17,944 0.05 % 95.74 3,546 0.05 % 82.59 55,033 0.04 % 101.40 0.00 345.91 15.25 48.99 5.96 859,229.69 da je bil 117,439 0.16 % 103.50 0 0 % 0 38,932 0.05 % 122.45 6,175 0.07 % 155.48 299 0.03 % 75.58 15,190 0.04 % 81.05 3,182 0.04 % 74.11 53,661 0.04 % 98.87 0.01 342.69 15.75 49.43 6.66 878,402.33 pa je bil 116,129 0.16 % 102.34 0 0 % 0 37,902 0.05 % 119.21 3,652 0.04 % 91.95 130 0.01 % 32.86 12,947 0.04 % 69.08 2,175 0.03 % 50.66 59,323 0.05 % 109.31 0.01 340.76 13.98 47.64 6.74 745,518.01 glede na to 110,418 0.15 % 97.31 0 0 % 0 33,388 0.04 % 105.01 2,473 0.03 % 62.27 620 0.06 % 156.73 21,005 0.06 % 112.08 4,277 0.06 % 99.62 48,655 0.04 % 89.65 0.01 332.27 13.91 47.42 7.83 704,019.09 ne glede na 106,770 0.15 % 94.10 0 0 % 0 
30,284 0.04 % 95.25 2,213 0.03 % 55.72 1,363 0.13 % 344.55 19,692 0.05 % 105.07 5,304 0.07 % 123.54 47,914 0.04 % 88.28 0.01 326.73 13.86 47.27 7.63 677,644.44 in s tem 106,420 0.15 % 93.79 1 0.04 % 103.02 28,447 0.04 % 89.47 1,098 0.01 % 27.65 406 0.04 % 102.63 20,099 0.05 % 107.24 5,111 0.07 % 119.04 51,258 0.04 % 94.45 0.01 326.21 14.45 47.85 7.08 712,975.59 na to da 104,806 0.15 % 92.36 0 0 % 0 31,157 0.04 % 98 2,691 0.03 % 67.76 498 0.05 % 125.89 19,265 0.05 % 102.79 2,951 0.04 % 68.73 48,244 0.04 % 88.89 0.01 323.72 13.84 47.19 7.08 663,489.52 se je na 104,579 0.15 % 92.16 0 0 % 0 36,490 0.04 % 114.77 5,060 0.06 % 127.41 87 0.01 % 21.99 12,307 0.03 % 65.67 1,614 0.02 % 37.59 49,021 0.04 % 90.32 0.00 323.36 13.83 47.18 6.10 661,855.54 ki je bila 102,295 0.14 % 90.15 0 0 % 0 31,096 0.04 % 97.81 4,859 0.06 % 122.34 394 0.04 % 99.60 14,023 0.04 % 74.82 3,717 0.05 % 86.57 48,206 0.04 % 88.82 0.01 319.82 14.45 47.73 6.53 685,306.28 da je v 101,691 0.14 % 89.62 0 0 % 0 31,406 0.04 % 98.78 3,347 0.04 % 84.27 276 0.03 % 69.77 14,707 0.04 % 78.47 2,989 0.04 % 69.62 48,966 0.04 % 90.22 0.00 318.87 13.84 47.11 5.85 644,176.94 ki so jo 98,471 0.14 % 86.78 0 0 % 0 28,375 0.04 % 89.25 2,656 0.03 % 66.88 175 0.02 % 44.24 13,676 0.04 % 72.97 3,364 0.04 % 78.35 50,225 0.04 % 92.54 0.01 313.78 13.75 46.92 7.45 618,053.00 je bil v 96,039 0.13 % 84.64 0 0 % 0 32,188 0.04 % 101.24 3,438 0.04 % 86.57 196 0.02 % 49.55 10,342 0.03 % 55.18 2,379 0.03 % 55.41 47,496 0.04 % 87.51 0.00 309.88 13.73 46.83 5.99 601,742.51 pa so se 95,140 0.13 % 83.85 0 0 % 0 26,670 0.03 % 83.88 2,351 0.03 % 59.20 131 0.01 % 33.12 11,747 0.03 % 62.68 2,011 0.03 % 46.84 52,230 0.04 % 96.24 0.01 308.43 13.74 46.81 6.86 596,490.51 potem ko je 93,216 0.13 % 82.15 0 0 % 0 44,397 0.06 % 139.64 2,258 0.03 % 56.85 169 0.02 % 42.72 6,267 0.02 % 33.44 1,185 0.01 % 27.60 38,940 0.03 % 71.75 0.01 305.29 13.67 46.68 6.71 580,630.36 CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 312 File 
at CLARIN.SI1.4.3 List of word-level 4-grams with relative frequency 2/million or above from lower-case word forms in the Gigafida 2.0 corpus with text-type distribution and collocation measuresGF2.0-word_sets-lowercase_forms-4grams- taxonomy-collocativity-entire.tsvLower-case form of string Total absolute frequency of lower-case word form Percentage of all found lower-case word forms Total relative frequency (per million occurrences) Absolute frequency [SSJ.T.K.N] Percentage [SSJ.T.K.N] Relative frequency [SSJ.T.K.N] Absolute frequency [SSJ.I] Percentage [SSJ.I] Relative frequency [SSJ.I] Absolute frequency [SSJ.T.K.L] Percentage [SSJ.T.K.L] Relative frequency [SSJ.T.K.L] Absolute frequency [SSJ.T.D] Percentage [SSJ.T.D] Relative frequency [SSJ.T.D] Absolute frequency [SSJ.T.P.R] Percentage [SSJ.T.P.R] Relative frequency [SSJ.T.P.R] Absolute frequency [SSJ.T.K.S] Percentage [SSJ.T.K.S] Relative frequency [SSJ.T.K.S] Absolute frequency [SSJ.T.P.C] Percentage [SSJ.T.P.C] Relative frequency [SSJ.T.P.C] Dice t-score MI MI3 logDice simple LL glede na to da 60,171 0.95 % 53.03 0 0 % 0 18,554 0.09 % 58.36 789 0.05 % 19.87 291 0.09 % 73.56 11,024 0.14 % 58.82 1,001 0.06 % 23.31 28,512 0.10 % 52.54 0.01 245.30 43.12 74.87 6.67 1,441,592.91 ne glede na to 47,026 0.75 % 41.44 0 0 % 0 13,610 0.06 % 42.81 1,386 0.08 % 34.90 318 0.10 % 80.39 9,188 0.11 % 49.02 2,606 0.15 % 60.70 19,918 0.07 % 36.70 0.01 216.85 42.80 73.84 6.65 1,117,679.60 po drugi strani pa 38,703 0.61 % 34.11 0 0 % 0 9,551 0.04 % 30.04 1,068 0.06 % 26.89 65 0.02 % 16.43 8,296 0.10 % 44.26 1,874 0.11 % 43.65 17,849 0.06 % 32.89 0.01 196.73 42.48 72.96 7.07 912,422.54 portal mmc rtv slovenija 33,538 0.53 % 29.56 0 0 % 0 33,538 0.15 % 105.49 0 0 % 0 0 0 % 0 0 0 % 0 0 0 % 0 0 0 % 0 0.22 183.13 42.45 72.51 11.84 790,011.60 interaktivni multimedijski portal mmc 33,520 0.53 % 29.54 0 0 % 0 33,520 0.15 % 105.43 0 0 % 0 0 0 % 0 0 0 % 0 0 0 % 0 0 0 % 0 0.83 183.08 44.15 74.22 13.74 823,972.95 multimedijski portal mmc 
rtv 33,520 0.53 % 29.54 0 0 % 0 33,520 0.15 % 105.43 0 0 % 0 0 0 % 0 0 0 % 0 0 0 % 0 0 0 % 0 0.67 183.08 43.09 73.15 13.43 802,542.73 prvi interaktivni multimedijski portal 33,520 0.53 % 29.54 0 0 % 0 33,520 0.15 % 105.43 0 0 % 0 0 0 % 0 0 0 % 0 0 0 % 0 0 0 % 0 0.14 183.08 42.27 72.34 11.21 786,047.38 francoska tiskovna agencija afp 26,156 0.41 % 23.05 0 0 % 0 25,837 0.12 % 81.26 0 0 % 0 0 0 % 0 6 0 % 0.03 0 0 % 0 313 0 % 0.58 0.36 161.73 43.64 72.99 12.52 634,925.99 za okolje in prostor 25,735 0.41 % 22.68 0 0 % 0 5,432 0.03 % 17.09 4 0 % 0.10 85 0.03 % 21.49 1,247 0.01 % 6.65 81 0.01 % 1.89 18,886 0.06 % 34.80 0.00 160.42 42.20 71.50 5.24 602,307.14 na to da je 24,808 0.39 % 21.86 0 0 % 0 7,740 0.04 % 24.34 478 0.03 % 12.04 131 0.04 % 33.12 4,351 0.05 % 23.22 642 0.04 % 14.95 11,466 0.04 % 21.13 0.00 157.51 41.84 71.03 4.38 575,264.69 res pa je da 22,333 0.35 % 19.68 0 0 % 0 3,849 0.02 % 12.11 308 0.02 % 7.76 25 0.01 % 6.32 5,432 0.07 % 28.98 373 0.02 % 8.69 12,346 0.04 % 22.75 0.00 149.44 41.69 70.58 4.46 515,833.95 ki se je v 21,946 0.35 % 19.34 0 0 % 0 7,892 0.04 % 24.82 408 0.02 % 10.27 22 0.01 % 5.56 2,479 0.03 % 13.23 490 0.03 % 11.41 10,655 0.04 % 19.63 0.00 148.14 41.76 70.60 3.84 507,848.33 je v tem da 19,799 0.31 % 17.45 0 0 % 0 3,532 0.02 % 11.11 483 0.03 % 12.16 64 0.02 % 16.18 5,556 0.07 % 29.64 1,377 0.08 % 32.07 8,787 0.03 % 16.19 0.00 140.71 41.51 70.06 3.86 455,234.04 poroča francoska tiskovna agencija 19,422 0.31 % 17.12 0 0 % 0 19,265 0.09 % 60.59 0 0 % 0 0 0 % 0 4 0 % 0.02 0 0 % 0 153 0 % 0.28 0.24 139.36 41.48 69.98 11.94 446,241.44 na drugi strani pa 17,034 0.27 % 15.01 0 0 % 0 5,423 0.03 % 17.06 128 0.01 % 3.22 37 0.01 % 9.35 2,522 0.03 % 13.46 804 0.05 % 18.73 8,120 0.03 % 14.96 0.00 130.51 41.30 69.41 5.14 389,433.46 se je izkazalo da 16,687 0.27 % 14.71 0 0 % 0 4,802 0.02 % 15.10 540 0.03 % 13.60 21 0.01 % 5.31 2,943 0.04 % 15.70 331 0.02 % 7.71 8,050 0.03 % 14.83 0.00 129.18 41.27 69.32 3.95 381,201.99 za to da bi 16,575 0.26 % 14.61 0 
0 % 0 3,939 0.02 % 12.39 606 0.04 % 15.26 93 0.03 % 23.51 2,389 0.03 % 12.75 664 0.04 % 15.47 8,884 0.03 % 16.37 0.00 128.74 42.42 70.45 4.74 390,176.86 pa naj bi se 16,564 0.26 % 14.60 0 0 % 0 5,085 0.02 % 15.99 68 0 % 1.71 10 0 % 2.53 1,743 0.02 % 9.30 150 0.01 % 3.49 9,508 0.03 % 17.52 0.00 128.70 41.25 69.29 4.93 378,285.71 pa je da je 15,779 0.25 % 13.91 0 0 % 0 3,477 0.02 % 10.94 161 0.01 % 4.05 17 0.01 % 4.30 3,034 0.04 % 16.19 336 0.02 % 7.83 8,754 0.03 % 16.13 0.00 125.61 42.13 70.02 3.29 368,677.31 v zvezi s tem 15,762 0.25 % 13.89 0 0 % 0 5,244 0.02 % 16.49 229 0.01 % 5.77 174 0.05 % 43.99 1,579 0.02 % 8.43 577 0.03 % 13.44 7,959 0.03 % 14.66 0.00 125.55 48.78 76.67 4.64 431,359.25 je dejal da je 15,759 0.25 % 13.89 0 0 % 0 7,960 0.04 % 25.04 60 0 % 1.51 1 0 % 0.25 560 0.01 % 2.99 73 0 % 1.70 7,105 0.02 % 13.09 0.00 125.53 41.74 69.63 3.44 364,480.42 ki je bil v 15,455 0.24 % 13.62 0 0 % 0 5,684 0.03 % 17.88 289 0.02 % 7.28 40 0.01 % 10.11 1,575 0.02 % 8.40 296 0.02 % 6.89 7,571 0.03 % 13.95 0.00 124.32 41.76 69.59 3.55 357,685.09 nemška tiskovna agencija dpa 15,445 0.24 % 13.61 0 0 % 0 15,103 0.07 % 47.50 0 0 % 0 0 0 % 0 8 0 % 0.04 0 0 % 0 334 0 % 0.62 0.23 124.28 41.15 68.98 11.86 351,791.83 iz leta v leto 15,311 0.24 % 13.49 0 0 % 0 3,531 0.02 % 11.11 101 0.01 % 2.54 13 0 % 3.29 2,456 0.03 % 13.10 195 0.01 % 4.54 9,015 0.03 % 16.61 0.00 123.74 41.56 69.37 4.71 352,505.62 je povedal da je 15,248 0.24 % 13.44 0 0 % 0 4,379 0.02 % 13.77 454 0.03 % 11.43 8 0 % 2.02 1,085 0.01 % 5.79 100 0.01 % 2.33 9,222 0.03 % 16.99 0.00 123.48 41.14 68.93 3.39 347,134.73 kljub temu da je 15,020 0.24 % 13.24 0 0 % 0 4,259 0.02 % 13.40 268 0.02 % 6.75 29 0.01 % 7.33 2,915 0.04 % 15.55 347 0.02 % 8.08 7,202 0.02 % 13.27 0.00 122.56 42.06 69.81 4.14 350,307.42 po drugi svetovni vojni 14,815 0.23 % 13.06 0 0 % 0 3,683 0.02 % 11.58 65 0 % 1.64 40 0.01 % 10.11 1,828 0.02 % 9.75 916 0.05 % 21.33 8,283 0.03 % 15.26 0.01 121.72 41.88 69.59 7.03 343,930.70 da se je v 14,573 0.23 % 
12.84 0 0 % 0 4,536 0.02 % 14.27 483 0.03 % 12.16 21 0.01 % 5.31 1,954 0.02 % 10.43 430 0.03 % 10.02 7,149 0.02 % 13.17 0.00 120.72 41.07 68.73 3.22 331,194.61
zdi se mi da 14,470 0.23 % 12.75 0 0 % 0 3,140 0.01 % 9.88 1,732 0.10 % 43.61 21 0.01 % 5.31 3,213 0.04 % 17.14 542 0.03 % 12.62 5,822 0.02 % 10.73 0.00 120.29 44.89 72.53 4.88 362,150.48
potem ko se je 14,253 0.23 % 12.56 0 0 % 0 6,018 0.03 % 18.93 533 0.03 % 13.42 23 0.01 % 5.81 1,296 0.02 % 6.92 237 0.01 % 5.52 6,146 0.02 % 11.32 0.00 119.39 43.14 70.73 3.96 341,652.33
iz dneva v dan 13,942 0.22 % 12.29 0 0 % 0 3,311 0.01 % 10.41 402 0.02 % 10.12 25 0.01 % 6.32 2,883 0.04 % 15.38 287 0.02 % 6.68 7,034 0.02 % 12.96 0.00 118.08 41.73 69.26 4.64 322,379.22
se mi zdi da 13,864 0.22 % 12.22 0 0 % 0 3,385 0.02 % 10.65 975 0.06 % 24.55 110 0.03 % 27.81 3,287 0.04 % 17.54 369 0.02 % 8.59 5,738 0.02 % 10.57 0.00 117.75 44.83 72.35 4.82 346,468.53

1.4.4 List of word-level 5-grams with relative frequency 2/million or above from lower-case word forms in the Gigafida 2.0 corpus with text-type distribution and collocation measures
File at CLARIN.SI: GF2.0-word_sets-lowercase_forms-5grams-taxonomy-collocativity-entire.tsv
Columns: Lower-case form of string; Total absolute frequency of lower-case word form; Percentage of all found lower-case word forms; Total relative frequency (per million occurrences); Absolute frequency, Percentage, and Relative frequency for each of [SSJ.T.K.N], [SSJ.I], [SSJ.T.K.L], [SSJ.T.D], [SSJ.T.P.R], [SSJ.T.K.S], [SSJ.T.P.C] (in that order); Dice; t-score; MI; MI3; logDice; simple LL
interaktivni multimedijski portal mmc rtv 33,520 5.97 % 29.54 0 0 % 0 33,520 0.59 % 105.43 0 0 % 0 0 0 % 0 0 0 % 0 0 0 % 0 0 0 % 0 0.72 183.08 72.35 102.42 13.52 1,393,086.45
multimedijski portal mmc rtv slovenija 33,520 5.97 % 29.54 0 0 % 0 33,520 0.59 % 105.43 0 0 % 0 0 0 % 0 0 0 % 0 0 0 % 0 0 0 % 0 0.26 183.08 72.35 102.42 12.08 1,393,086.45
prvi interaktivni multimedijski portal mmc 33,520 5.97 % 29.54 0 0 % 0 33,520 0.59 % 105.43 0 0 % 0 0 0 % 0 0 0 % 0 0 0 % 0 0 0 % 0 0.17 183.08 72.47 102.53 11.47 1,395,451.39
poroča francoska tiskovna agencija afp 19,338 3.44 % 17.04 0 0 % 0 19,185 0.34 % 60.34 0 0 % 0 0 0 % 0 4 0 % 0.02 0 0 % 0 149 0 % 0.27 0.25 139.06 71.87 100.34 12.03 798,028.61
glede na to da je 17,357 3.09 % 15.30 0 0 % 0 5,513 0.10 % 17.34 209 0.09 % 5.26 90 0.08 % 22.75 3,050 0.20 % 16.27 285 0.07 % 6.64 8,210 0.13 % 15.13 0.00 131.75 71.50 99.67 4.18 712,452.87
ne glede na to ali 11,720 2.09 % 10.33 0 0 % 0 3,227 0.06 % 10.15 159 0.07 % 4 125 0.11 % 31.60 2,545 0.17 % 13.58 902 0.23 % 21.01 4,762 0.08 % 8.77 0.00 108.26 70.84 97.87 4.82 476,384.08
delo družino in socialne zadeve 11,192 1.99 % 9.86 0 0 % 0 1,934 0.03 % 6.08 3 0 % 0.08 33 0.03 % 8.34 821 0.05 % 4.38 57 0.01 % 1.33 8,344 0.13 % 15.37 0.00 105.79 72.23 99.13 4.93 464,325.22
za delo družino in socialne 11,072 1.97 % 9.76 0 0 % 0 1,925 0.03 % 6.05 3 0 % 0.08 32 0.03 % 8.09 814 0.05 % 4.34 56 0.01 % 1.30 8,242 0.13 % 15.19 0.00 105.22 70.75 97.62 4.33 449,497.76
poroča nemška tiskovna agencija dpa 10,414 1.85 % 9.18 0 0 % 0 10,257 0.18 % 32.26 0 0 % 0 0 0 % 0 4 0 % 0.02 0 0 % 0 153 0 % 0.28 0.14 102.05 70.84 97.53 11.21 423,321.49
za kmetijstvo gozdarstvo in prehrano 8,904 1.58 % 7.85 0 0 % 0 2,068 0.04 % 6.50 2 0 % 0.05 14 0.01 % 3.54 656 0.04 % 3.50 26 0.01 % 0.61 6,138 0.10 % 11.31 0.00 94.36 73.38 99.62 4.04 375,574.31
ne glede na to da 8,780 1.56 % 7.74 0 0 % 0 2,620 0.05 % 8.24 117 0.05 % 2.95 60 0.05 % 15.17 1,448 0.09 % 7.73 218 0.06 % 5.08 4,317 0.07 % 7.95 0.00 93.70 70.42 96.62 3.98 354,678.98
po drugi strani pa je 8,307 1.48 % 7.32 0 0 % 0 2,050 0.04 % 6.45 284 0.12 % 7.15 14 0.01 % 3.54 1,794 0.12 % 9.57 382 0.10 % 8.90 3,783 0.06 % 6.97 0.00 91.14 70.36 96.40 3.53 335,298.71
kot v enakem obdobju lani 8,131 1.45 % 7.17 0 0 % 0 4,759 0.08 % 14.97 0 0 % 0 0 0 % 0 366 0.02 % 1.95 15 0 % 0.35 2,991 0.05 % 5.51 0.00 90.17 70.31 96.29 4.12 327,919.48
ministrstvo za okolje in prostor 6,619 1.18 % 5.83 0 0 % 0 1,539 0.03 % 4.84 0 0 % 0 33 0.03 % 8.34 358 0.02 % 1.91 42 0.01 % 0.98 4,647 0.07 % 8.56 0.00 81.36 70.01 95.40 3.59 265,758.38
glede na to da so 6,038 1.07 % 5.32 0 0 % 0 1,724 0.03 % 5.42 64 0.03 % 1.61 21 0.02 % 5.31 1,012 0.07 % 5.40 131 0.03 % 3.05 3,086 0.05 % 5.69 0.00 77.70 69.88 95.00 3.25 241,948.92
se je izkazalo da je 5,993 1.07 % 5.28 0 0 % 0 1,752 0.03 % 5.51 235 0.10 % 5.92 7 0.01 % 1.77 1,068 0.07 % 5.70 108 0.03 % 2.52 2,823 0.04 % 5.20 0.00 77.41 71.50 96.59 2.15 245,981.72
po poročanju francoske tiskovne agencije 5,771 1.03 % 5.09 0 0 % 0 5,650 0.10 % 17.77 0 0 % 0 0 0 % 0 1 0 % 0.01 0 0 % 0 120 0 % 0.22 0.00 75.97 69.81 94.80 6.17 231,023.24
poročanju francoske tiskovne agencije afp 5,720 1.02 % 5.04 0 0 % 0 5,602 0.10 % 17.62 0 0 % 0 0 0 % 0 1 0 % 0.01 0 0 % 0 117 0 % 0.22 0.11 75.63 69.80 94.76 10.80 228,937.52
to še ne pomeni da 5,509 0.98 % 4.86 0 0 % 0 1,225 0.02 % 3.85 181 0.08 % 4.56 11 0.01 % 2.78 1,332 0.09 % 7.11 187 0.05 % 4.36 2,573 0.04 % 4.74 0.00 74.22 70.44 95.30 3.82 222,623.99
glede na to da se 5,430 0.97 % 4.79 0 0 % 0 1,686 0.03 % 5.30 67 0.03 % 1.69 24 0.02 % 6.07 992 0.06 % 5.29 101 0.03 % 2.35 2,560 0.04 % 4.72 0.00 73.69 70.84 95.66 3.02 220,738.59
se je odločil da bo 5,122 0.91 % 4.51 0 0 % 0 1,480 0.03 % 4.65 282 0.12 % 7.10 5 0 % 1.26 1,009 0.07 % 5.38 105 0.03 % 2.45 2,241 0.04 % 4.13 0.00 71.57 70.70 95.34 2.45 207,769.25
v skladu z zakonom o 5,094 0.91 % 4.49 0 0 % 0 1,669 0.03 % 5.25 7 0 % 0.18 74 0.06 % 18.71 339 0.02 % 1.81 40 0.01 % 0.93
2,965 0.05 % 5.46 0.00 71.37 69.63 94.26 3.23 203,369.64 je na današnji novinarski konferenci 4,917 0,88 % 4.33 0 0 % 0 4,822 0.09 % 15.17 0 0 % 0 0 0 % 0 14 0 % 0.07 0 0 % 0 81 0 % 0.15 0.00 70.12 70.61 95.13 2.77 199,180.93 res pa je da je 4,756 0,85 % 4.19 0 0 % 0 822 0.01 % 2.59 60 0.03 % 1.51 3 0 % 0.76 1,174 0.08 % 6.26 101 0,03 % 2,35 2,596 0.04 % 4.78 0.00 68.96 69.57 94.00 1.87 189,682.42 ministrstva za okolje in prostor 4,555 0,81 % 4.01 0 0 % 0 937 0.02 % 2.95 2 0 % 0.05 110.01 % 2.78 213 0.01 % 1.14 17 0 % 0,40 3,375 0.05 % 6.22 0.00 67.49 69.47 93.78 3.06 181,408.46 ne glede na to kako 4,527 0,81 % 3.99 0 0 % 0 1,380 0.02 % 4.34 328 0.14 % 8.26 150.01 % 3.79 980 0.06 % 5.23 409 0,10 % 9,53 1,415 0.02 % 2.61 0.00 67.28 69.46 93.75 3.55 180,269.08 za izobraževanje znanost in šport 4,526 0,81 % 3.99 0 0 % 0 4,520 0.08 % 14.22 0 0 % 0 0 0 % 0 1 0 % 0.01 1 0 % 0,02 4 0 % 0.01 0.00 67.28 69.46 93.75 3.06 180,228.40 več kot v enakem obdobju 4,521 0,81 % 3.98 0 0 % 0 2,680 0.05 % 8.43 0 0 % 0 0 0 % 0 188 0.01 % 1 1 0 % 0,02 1,652 0.03 % 3.04 0.00 67.24 76.24 100.53 3.21 198,485.43 za gospodarski razvoj in tehnologijo 4,377 0,78 % 3.86 0 0 % 0 4,377 0.08 % 13.77 0 0 % 0 0 0 % 0 0 0 % 0 0 0 % 0 0 0 % 0 0.00 66.16 71.34 95.53 3.00 179,230.81 za pokojninsko in invalidsko zavarovanje 4,374 0,78 % 3.85 0 0 % 0 1,571 0.03 % 4.94 0 0 % 0 130 0.11 % 32.86 375 0.02 % 2 32 0,01 % 0,75 2,266 0.04 % 4.18 0.00 66.14 69.41 93.60 3.01 174,045.87 državni sekretar na ministrstvu za 4,213 0,75 % 3.71 0 0 % 0 2,028 0.04 % 6.38 0 0 % 0 0 0 % 0 188 0.01 % 1 3 0 % 0,07 1,994 0.03 % 3.67 0.00 64.91 69.36 93.44 3.32 167,502.28 izkazalo se je da je 4,051 0,72 % 3.57 0 0 % 0 1,130 0.02 % 3.55 261 0.11 % 6.57 60.01 % 1.52 737 0.05 % 3.93 138 0,04 % 3,21 1,779 0.03 % 3.28 0.00 63.65 70.93 94.90 1.58 164,894.64 CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 314 File at CLARIN.SI 1.4.5 List of word-level 2-grams with relative frequency 2/million or above 
from lower-case word forms in the Gigafida 2.0 corpus with text-type distribution, lemmas and collocation measuresGF2.0-word_sets-lowercase_forms-lemmas-2grams- taxonomy-collocativity-entire.tsvLower-case form of string Lemma of string Lemma (lower-case) Total absolute frequen- cy of lower-case word form Percentage of all found lower-case word forms Total relative frequency (per million occur - rences) Absolute frequency [SSJ.T.K.N] Percentage [SSJ.T.K.N] Relative frequency [SSJ.T.K.N] Absolute frequency [SSJ.I] Percentage [SSJ.I] Relative frequency [SSJ.I] Absolute frequency [SSJ.T.K.L] Percentage [SSJ.T.K.L] Relative frequency [SSJ.T.K.L] Absolute frequency [SSJ.T.D] Percentage [SSJ.T.D] Relative frequency [SSJ.T.D] Absolute frequency [SSJ.T.P.R] Percentage [SSJ.T.P.R] Relative frequency [SSJ.T.P.R] Absolute frequency [SSJ.T.K.S] Percentage [SSJ.T.K.S] Relative frequency [SSJ.T.K.S] Absolute frequency [SSJ.T.P.C] Percentage [SSJ.T.P.C] Relative frequency [SSJ.T.P.C] Dice t-score MI MI3 logDice simple LL se je se biti se biti 3,551,846 0.96 % 3,130.22 20.03 % 206.04 1,027,099 0.49 % 3,230.49 301,215 1.24 % 7,584.27 6,069 0.23 % 1,534.18 496,718 0.44 % 2,650,32 98,099 0,40 % 2,284.84 1,622,644 0.47 % 2,989.83 0.13 1,585.82 2.66 46.18 11.02 -295,656.33 da je da biti da biti 2,678,705 0.72 % 2,360.73 10.02 % 103.02 825,501 0.40 % 2,596.42 123,117 0.51 % 3,099.95 8,514 0.33 % 2,152.25 419,347 0.37 % 2,237,50 84,545 0,34 % 1,969.15 1,217,680 0.35 % 2,243.66 0.10 1,318.50 2.36 45.07 10.65 -505,215.46 ki je ki biti ki biti 2,429,361 0.66 % 2,140.98 6 0.10 % 618.11 778,858 0.37 % 2,449.71 71,777 0.30 % 1,807.27 7,661 0.30 % 1,936.62 348,186 0.31 % 1,857,81 73,250 0,30 % 1,706.08 1,149,623 0.33 % 2,118.26 0.09 1,274.18 2.45 44.88 10.57 -382,719.89 pa je pa biti pa biti 2,207,049 0.60 % 1,945.06 10.02 % 103.02 670,188 0.32 % 2,107.92 52,742 0.22 % 1,327.99 3,003 0.12 % 759.13 328,589 0.29 % 1,753,24 52,260 0,21 % 1,217.20 1,100,266 0.32 % 2,027.31 0.09 1,225.46 2.51 44.66 
10.47 -301,062.70 ki so ki biti ki biti 1,743,293 0.47 % 1,536.36 4 0.07 % 412.07 486,132 0.23 % 1,529.01 50,819 0.21 % 1,279.57 7,766 0.30 % 1,963.16 270,450 0.24 % 1,443,03 74,618 0,30 % 1,737.94 853,504 0.25 % 1,572.64 0.13 1,208.08 3.56 45.02 11.10 542,171.33 je v biti v biti v 1,737,218 0.47 % 1,531 10.02 % 103.02 599,034 0.29 % 1,884.12 44,336 0.18 % 1,116.33 2,964 0.12 % 749.27 216,723 0.19 % 1,156,36 45,658 0,19 % 1,063.43 828,502 0.24 % 1,526.57 0.05 470.00 0.64 42.09 9.63 -573,553.36 je bil biti biti biti biti 1,681,241 0.46 % 1,481.67 0 0 % 0 525,710 0.25 % 1,653.49 101,698 0.42 % 2,560.65 4,022 0.15 % 1,016.72 217,256 0.19 % 1,159,21 49,827 0,20 % 1,160.53 782,728 0.23 % 1,442.23 0.08 1,231.90 4.32 45.69 10.35 1,182,417.32 da bi da biti da biti 1,640,576 0.44 % 1,445.83 0 0 % 0 408,716 0.20 % 1,285.52 133,488 0.55 % 3,361.08 5,902 0.23 % 1,491.96 259,478 0.23 % 1,384,49 69,020 0,28 % 1,607.56 763,972 0.22 % 1,407.67 0.15 1,214.61 4.27 45.56 11.30 1,109,249.95 da se da se da se 1,583,847 0.43 % 1,395.84 30 0.49 % 3,090.55 452,634 0.22 % 1,423.65 78,380 0.32 % 1,973.52 7,514 0.29 % 1,899.46 274,040 0.24 % 1,462,19 59,755 0,24 % 1,391.76 711,494 0.21 % 1,310.97 0.10 1,092.03 2.92 44.11 10.72 34,135.33 so se biti se biti se 1,266,718 0.34 % 1,116.35 20.03 % 206.04 344,051 0.17 % 1,082.13 60,392 0.25 % 1,520.61 2,287 0.09 % 578.13 165,405 0.15 % 882,55 39,127 0,16 % 911.31 655,454 0.19 % 1,207.72 0.09 958.22 2.75 43.30 10.47 -59,411.03 je bilo biti biti biti biti 1,199,196 0.33 % 1,056.85 0 0 % 0 330,118 0.16 % 1,038.31 80,644 0.33 % 2,030.53 3,236 0.12 % 818.03 165,512 0.15 % 883,12 37,875 0,15 % 882.15 581,811 0.17 % 1,072.03 0.06 1,028.30 4.04 44.42 9.87 661,483.77 naj bi naj biti naj biti 1,193,523 0.32 % 1,051.85 0 0 % 0 349,913 0.17 % 1,100.57 11,115 0.05 % 279.86 1,238 0.05 % 312.95 143,722 0.13 % 766,85 21,714 0,09 % 505.74 665,821 0.19 % 1,226.82 0.28 1,082.07 6.71 47.09 12.17 2,459,620.70 je bila biti biti biti biti 1,189,565 0.32 % 1,048.36 0 0 % 
0 342,001 0.16 % 1,075.68 79,513 0.33 % 2,002.05 3,492 0.14 % 882.74 171,598 0.15 % 915,59 41,219 0,17 % 960.04 551,742 0.16 % 1,016.62 0.06 1,027.80 4.12 44.48 9.86 706,318.92 da so da biti da biti 1,029,617 0.28 % 907.40 0 0 % 0 302,283 0.14 % 950.76 29,850 0.12 % 751.59 2,610 0.10 % 659.78 150,596 0.13 % 803,53 36,095 0,15 % 840.70 508,183 0.15 % 936.36 0.07 843.14 2.56 42.51 10.23 -121,500.94 ki se ki se ki se 1,014,421 0.28 % 894 30.05 % 309.06 287,492 0.14 % 904.24 32,603 0.14 % 820.91 5,816 0.23 % 1,470.22 183,768 0.16 % 980,53 45,698 0,19 % 1,064.36 459,041 0.13 % 845.81 0.07 830.07 2.51 42.41 10.18 -140,577.93 pa so pa biti pa biti 906,695 0.24 % 799.07 0 0 % 0 256,521 0.12 % 806.83 14,316 0.06 % 360.46 1,242 0.05 % 313.96 124,644 0.11 % 665,06 23,582 0,10 % 549.25 486,390 0.14 % 896.21 0.07 816.52 2.81 42.39 10.25 -20,486.84 ko je ko biti ko biti 892,460 0.24 % 786.52 9 0.15 % 927.17 284,843 0.14 % 895.91 66,529 0.28 % 1,675.13 2,288 0.09 % 578.38 125,350 0.11 % 668,83 25,421 0,10 % 592.09 388,020 0.11 % 714.95 0.04 826.46 3.00 42.53 9.41 49,430.17 pa se pa se pa se 851,419 0.23 % 750.35 0 0 % 0 231,651 0.11 % 728.60 25,693 0.11 % 646.92 2,016 0.08 % 509.62 151,613 0.13 % 808,96 26,441 0,11 % 615.84 414,005 0.12 % 762.83 0.06 754.21 2.45 41.85 10.01 -134,429.06 je na biti na biti na 837,578 0.23 % 738.15 10.02 % 103.02 294,216 0.14 % 925.39 27,550 0.11 % 693.68 1,146 0.04 % 289.70 111,850 0.10 % 596,79 16,675 0,07 % 388.38 386,140 0.11 % 711.49 0.03 183.17 0.32 39.67 8.86 -172,803.80 da bo da biti da biti 786,576 0.21 % 693.21 30.05 % 309.06 251,747 0.12 % 791.81 26,221 0.11 % 660.22 1,966 0.08 % 496.98 104,233 0.09 % 556,15 14,625 0,06 % 340.63 387,781 0.11 % 714.51 0.08 798.61 3.33 42.50 10.27 159,735.25 ki jih ki on ki on 735,400 0.20 % 648.10 30.05 % 309.06 195,182 0.09 % 613.90 18,218 0.07 % 458.71 4,813 0.19 % 1,216.67 141,488 0.12 % 754,93 40,726 0,17 % 948.56 334,970 0.10 % 617.20 0.10 824.55 4.70 43.68 10.63 666,575.92 bi se biti se biti se 
707,817 0.19 % 623.80 10.02 % 103.02 178,976 0.09 % 562.93 48,796 0.20 % 1,228.63 2,146 0.08 % 542.49 111,298 0.10 % 593,85 23,491 0,10 % 547.13 343,109 0.10 % 632.20 0.06 732.25 2.95 41.81 10.01 23,942.58 ki ga ki on ki on 706,070 0.19 % 622.26 6 0.10 % 618.11 199,524 0.10 % 627.55 23,967 0.10 % 603.46 3,374 0.13 % 852.91 122,406 0.11 % 653,12 31,710 0,13 % 738.56 325,083 0.10 % 598.99 0.09 802.93 4.49 43.35 10.55 560,048.47 se bo se biti se biti 686,350 0.19 % 604.88 0 0 % 0 227,420 0.11 % 715.30 17,787 0.07 % 447.86 1,059 0.04 % 267.70 88,752 0.08 % 473,55 11,192 0,05 % 260.67 340,140 0.10 % 626.73 0.06 726.26 3.02 41.80 10.00 44,163.43 več kot več kot več kot 673,051 0.18 % 593.16 10.02 % 103.02 235,287 0.11 % 740.04 8,026 0.03 % 202.09 1,049 0.04 % 265.18 91,023 0.08 % 485,67 10,873 0,04 % 253.25 326,792 0.10 % 602.14 0.17 805.59 5.79 44.51 11.41 1,025,033.61 ga je on biti on biti 667,325 0.18 % 588.11 0 0 % 0 180,760 0.09 % 568.54 71,862 0.30 % 1,809.41 1,819 0.07 % 459.82 92,365 0.08 % 492,83 22,298 0,09 % 519.35 298,221 0.09 % 549.49 0.03 695.62 2.75 41.45 9.00 -30,917.17 in se in se in se 659,774 0.18 % 581.46 10.02 % 103.02 154,650 0.07 % 486.41 78,208 0.32 % 1,969.19 2,057 0.08 % 519.99 127,497 0.11 % 680,28 30,047 0,12 % 699.83 267,314 0.08 % 492.54 0.03 305.31 0.68 39.34 8.90 -225,838.53 ne bo ne biti ne biti 650,516 0.18 % 573.30 0 0 % 0 182,187 0.09 % 573.03 29,369 0.12 % 739.48 1,544 0.06 % 390.31 99,628 0.09 % 531,58 11,939 0,05 % 278.07 325,849 0.10 % 600.40 0.10 762.23 4.19 42.81 10.71 409,880.17 je da biti da biti da 641,787 0.17 % 565.60 0 0 % 0 180,251 0.09 % 566.94 40,505 0.17 % 1,019.87 1,119 0.04 % 282.87 105,912 0.09 % 565,11 21,879 0,09 % 509.59 292,121 0.09 % 538.25 0.02 151.08 0.30 38.88 8.59 -125,571.83 to je ta biti ta biti 630,173 0.17 % 555.37 20.03 % 206.04 178,888 0.09 % 562.65 39,600 0.16 % 997.09 2,905 0.11 % 734.35 110,943 0.10 % 591,96 25,701 0,10 % 598.61 272,134 0.08 % 501.42 0.03 611.90 2.13 40.66 8.88 -165,111.13 je tudi 
biti tudi biti tudi 625,665 0.17 % 551.40 10.02 % 103.02 184,971 0.09 % 581.78 11,565 0.05 % 291.19 1,273 0.05 % 321.80 107,464 0.10 % 573,39 17,891 0,07 % 416.70 302,500 0.09 % 557.38 0.03 414.51 1.07 39.58 8.73 -252,281.11 ne bi ne biti ne biti 619,598 0.17 % 546.05 0 0 % 0 145,509 0.07 % 457.66 49,662 0.20 % 1,250.44 1,908 0.07 % 482.32 107,181 0.10 % 571,88 18,390 0,07 % 428.32 296,948 0.09 % 547.15 0.09 737.94 4.00 42.48 10.58 330,318.32 CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 315 File at CLARIN.SI 1.4.6 List of word-level 3-grams with relative frequency 2/million or above from lower-case word forms in the Gigafida 2.0 corpus with text-type distribution, lemmas and collocation measuresGF2.0-word_sets-lowercase_forms-lemmas-3grams- taxonomy-collocativity-entire.tsvLower-case form of string Lemma of string Lemma (lower-case) Total absolute frequen- cy of lower-case word form Percentage of all found lower-case word forms Total relative frequency (per million occur - rences) Absolute frequency [SSJ.T.K.N] Percentage [SSJ.T.K.N] Relative frequency [SSJ.T.K.N] Absolute frequency [SSJ.I] Percentage [SSJ.I] Relative frequency [SSJ.I] Absolute frequency [SSJ.T.K.L] Percentage [SSJ.T.K.L] Relative frequency [SSJ.T.K.L] Absolute frequency [SSJ.T.D] Percentage [SSJ.T.D] Relative frequency [SSJ.T.D] Absolute frequency [SSJ.T.P.R] Percentage [SSJ.T.P.R] Relative frequency [SSJ.T.P.R] Absolute frequency [SSJ.T.K.S] Percentage [SSJ.T.K.S] Relative frequency [SSJ.T.K.S] Absolute frequency [SSJ.T.P.C] Percentage [SSJ.T.P.C] Relative frequency [SSJ.T.P.C] Dice t-score MI MI3 logDice simple LL ki se je ki se biti ki se biti 271,902 0.38 % 239.63 10.04 % 103.02 84,854 0.11 % 266.89 14,048 0.17 % 353.71 488 0.05 % 123.36 37,058 0.10 % 197,73 8,037 0,10 % 187.19 127,416 0.10 % 234.77 0.01 521.43 15.98 52.09 7.61 2,072,266.20 ki ga je ki on biti ki on biti 267,146 0.38 % 235.43 0 0 % 0 81,846 0.10 % 257.43 13,465 0.16 % 339.03 937 0.09 % 236.86 
34,875 0.09 % 186,08 8,902 0,12 % 207.34 127,121 0.10 % 234.23 0.01 516.85 15.19 51.24 7.89 1,908,300.40 pa se je pa se biti pa se biti 253,512 0.36 % 223.42 0 0 % 0 78,163 0.10 % 245.84 10,604 0.12 % 267 324 0.03 % 81.90 34,416 0.09 % 183,63 5,324 0,07 % 124 124,681 0.10 % 229.73 0.01 503.49 16.55 52.45 7.54 2,018,353.68 da se je da se biti da se biti 232,397 0.33 % 204.81 0 0 % 0 67,777 0.09 % 213.18 16,520 0.20 % 415.96 504 0.05 % 127.41 34,338 0.09 % 183,22 7,006 0,09 % 163.18 106,252 0.09 % 195.78 0.01 482.06 14.99 50.64 7.34 1,631,951.73 da bi se da biti se da biti se 230,598 0.32 % 203.22 0 0 % 0 54,493 0.07 % 171.39 21,747 0.26 % 547.57 892 0.09 % 225.49 38,613 0.10 % 206,03 10,268 0,14 % 239.15 104,585 0.09 % 192.70 0.02 480.19 14.97 50.60 8.25 1,617,762.27 se je v se biti v se biti v 217,483 0.30 % 191.67 0 0 % 0 71,005 0.09 % 223.33 7,392 0.09 % 186.12 229 0.02 % 57.89 25,526 0.07 % 136,20 5,288 0,07 % 123.16 108,043 0.09 % 199.08 0.01 466.34 15.13 50.60 6.93 1,546,768.10 ki jo je ki on biti ki on biti 206,746 0.29 % 182.20 0 0 % 0 63,713 0.08 % 200.39 10,790 0.13 % 271.68 542 0.05 % 137.01 28,377 0.08 % 151,41 7,016 0,09 % 163.41 96,308 0.08 % 177.45 0.01 454.69 15.82 51.14 7.55 1,555,772.31 ki jih je ki on biti ki on biti 204,188 0.29 % 179.95 0 0 % 0 59,644 0.07 % 187.60 9,134 0.11 % 229.98 1,082 0.11 % 273.52 29,354 0.08 % 156,62 8,281 0,11 % 192.87 96,693 0.08 % 178.16 0.01 451.86 15.20 50.48 7.51 1,460,812.68 ki je v ki biti v ki biti v 186,710 0.26 % 164.55 10.04 % 103.02 66,989 0.08 % 210.70 1,882 0.02 % 47.39 463 0.05 % 117.04 20,847 0.06 % 111,23 3,893 0,05 % 90.67 92,635 0.07 % 170.69 0.01 432.08 14.67 49.69 6.77 1,275,630.24 ki naj bi ki naj biti ki naj biti 173,475 0.24 % 152.88 0 0 % 0 45,720 0.06 % 143.80 2,081 0.03 % 52.40 277 0.03 % 70.02 24,480 0.07 % 130,62 4,471 0,06 % 104.13 96,446 0.08 % 177.71 0.02 416.49 14.56 49.37 8.66 1,174,129.43 ki je bil ki biti biti ki biti biti 159,982 0.23 % 140.99 0 0 % 0 53,427 0.07 % 168.04 6,298 0.07 
% 158.58 507 0.05 % 128.16 19,240 0.05 % 102,66 4,452 0,06 % 103.69 76,058 0.06 % 140.14 0.01 399.97 16.01 50.58 7.16 1,221,693.74 ko se je ko se biti ko se biti 156,592 0.22 % 138 0 0 % 0 43,622 0.06 % 137.20 17,964 0.21 % 452.31 295 0.03 % 74.57 23,582 0.06 % 125,83 4,689 0,06 % 109.21 66,440 0.05 % 122.42 0.01 395.71 15.97 50.49 7.03 1,192,850.19 ki so se ki biti se ki biti se 150,086 0.21 % 132.27 0 0 % 039,890 0.05 % 125.46 7,363 0.09 % 185.39 375 0.04 % 94.80 21,507 0.06 % 114,75 5,975 0,08 % 139.16 74,976 0.06 % 138.15 0.01 387.40 15.81 50.20 7.46 1,128,681.90 da se bo da se biti da se biti 139,139 0.20 % 122.62 0 0 % 0 42,142 0.05 % 132.55 5,963 0.07 % 150.14 262 0.03 % 66.23 19,776 0.05 % 105,52 2,883 0,04 % 67.15 68,113 0.06 % 125.50 0.01 373.00 15.41 49.58 7.54 1,012,343.75 naj bi se naj biti se naj biti se 133,301 0.19 % 117.48 0 0 % 0 40,907 0.05 % 128.66 1,268 0.01 % 31.93 165 0.02 % 41.71 15,363 0.04 % 81,97 2,266 0,03 % 52.78 73,332 0.06 % 135.12 0.02 365.10 15.37 49.42 8.06 967,159.72 ki so ga ki biti on ki biti on 131,513 0.18 % 115.90 0 0 % 0 38,648 0.05 % 121.56 3,980 0.05 % 100.21 225 0.02 % 56.88 17,107 0.05 % 91,28 4,466 0,06 % 104.02 67,087 0.05 % 123.61 0.01 362.63 14.16 48.17 7.81 858,488.17 pa naj bi pa naj biti pa naj biti 129,124 0.18 % 113.80 0 0 % 0 37,605 0.05 % 118.28 737 0.01 % 18.56 44 0 % 11.12 13,446 0.04 % 71,74 1,146 0,01 % 26.69 76,146 0.06 % 140.30 0.02 359.32 14.14 48.09 8.35 840,837.46 ki so jih ki biti on ki biti on 128,399 0.18 % 113.16 0 0 % 0 35,801 0.04 % 112.60 3,476 0.04 % 87.52 375 0.04 % 94.80 17,741 0.05 % 94,66 5,956 0,08 % 138.72 65,050 0.05 % 119.86 0.01 358.32 15.51 49.45 7.79 942,205.92 je da je biti da biti biti da biti 119,356 0.17 % 105.19 0 0 % 0 35,113 0.04 % 110.44 7,697 0.09 % 193.80 176 0.02 % 44.49 17,896 0.05 % 95,49 3,536 0,05 % 82.36 54,938 0.04 % 101.23 0.00 345.46 14.02 47.75 5.96 769,075.64 da je bil da biti biti da biti biti 117,432 0.17 % 103.49 0 0 % 0 38,928 0.05 % 122.44 6,175 0.07 % 
155.48 299 0.03 % 75.58 15,189 0.04 % 81,04 3,182 0,04 % 74.11 53,659 0.04 % 98.87 0.01 342.66 14.00 47.68 6.66 755,020.87 pa je bil pa biti biti pa biti biti 116,129 0.16 % 102.34 0 0 % 0 37,902 0.05 % 119.21 3,652 0.04 % 91.95 130 0.01 % 32.86 12,947 0.04 % 69,08 2,175 0,03 % 50.66 59,323 0.05 % 109.31 0.01 340.76 13.98 47.64 6.74 745,518.01 glede na to glede na ta glede na ta 110,414 0.15 % 97.31 0 0 % 0 33,387 0.04 % 105.01 2,473 0.03 % 62.27 620 0.06 % 156.73 21,003 0.06 % 112,07 4,277 0,06 % 99.62 48,654 0.04 % 89.65 0.01 332.26 13.91 47.42 7.83 703,990.12 ne glede na ne glede na ne glede na 106,770 0.15 % 94.10 0 0 % 0 30,284 0.04 % 95.25 2,213 0.03 % 55.72 1,363 0.14 % 344.55 19,692 0.05 % 105,07 5,304 0,07 % 123.54 47,914 0.04 % 88.28 0.01 326.73 13.86 47.27 7.63 677,644.44 in s tem in z ta in z ta 106,407 0.15 % 93.78 10.04 % 103.02 28,440 0.04 % 89.45 1,098 0.01 % 27.65 406 0.04 % 102.63 20,097 0.05 % 107,23 5,111 0,07 % 119.04 51,254 0.04 % 94.44 0.01 326.20 17.39 50.79 7.08 901,339.12 na to da na ta da na ta da 104,753 0.15 % 92.32 0 0 % 0 31,131 0.04 % 97.92 2,688 0.03 % 67.68 498 0.05 % 125.89 19,263 0.05 % 102,78 2,951 0,04 % 68.73 48,222 0.04 % 88.85 0.01 323.65 17.25 50.60 7.08 878,364.72 se je na se biti na se biti na 104,577 0.15 % 92.16 0 0 % 0 36,489 0.05 % 114.77 5,060 0.06 % 127.41 87 0.01 % 21.99 12,307 0.03 % 65,67 1,614 0,02 % 37.59 49,020 0.04 % 90.32 0.00 323.36 13.83 47.18 6.10 661,841.15 ki je bila ki biti biti ki biti biti 102,295 0.14 % 90.15 0 0 % 0 31,096 0.04 % 97.81 4,859 0.06 % 122.34 394 0.04 % 99.60 14,023 0.04 % 74,82 3,717 0,05 % 86.57 48,206 0.04 % 88.82 0.01 319.82 14.82 48.10 6.53 707,926.59 da je v da biti v da biti v 101,686 0.14 % 89.62 0 0 % 0 31,404 0.04 % 98.77 3,347 0.04 % 84.27 276 0.03 % 69.77 14,707 0.04 % 78,47 2,989 0,04 % 69.62 48,963 0.04 % 90.22 0.00 318.86 14.04 47.31 5.85 656,403.63 ki so jo ki biti on ki biti on 98,471 0.14 % 86.78 0 0 % 0 28,375 0.04 % 89.25 2,656 0.03 % 66.88 175 0.02 % 44.24 13,676 
0.04 % 72,97 3,364 0,04 % 78.35 50,225 0.04 % 92.54 0.01 313.78 13.75 46.92 7.45 618,053.00 je bil v biti biti v biti biti v 96,038 0.14 % 84.64 0 0 % 0 32,187 0.04 % 101.24 3,438 0.04 % 86.57 196 0.02 % 49.55 10,342 0.03 % 55,18 2,379 0,03 % 55.41 47,496 0.04 % 87.51 0.00 309.89 14.48 47.58 5.99 645,104.52 pa so se pa biti se pa biti se 95,140 0.13 % 83.85 0 0 % 0 26,670 0.03 % 83.88 2,351 0.03 % 59.20 131 0.01 % 33.12 11,747 0.03 % 62,68 2,011 0,03 % 46.84 52,230 0.04 % 96.24 0.01 308.43 13.93 47.01 6.86 607,616.06 potem ko je potem ko biti potem ko biti 93,214 0.13 % 82.15 0 0 % 0 44,397 0.06 % 139.64 2,258 0.03 % 56.85 169 0.02 % 42.72 6,267 0.02 % 33,44 1,185 0,02 % 27.60 38,938 0.03 % 71.75 0.01 305.29 13.67 46.68 6.71 580,616.16 CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 316 File at CLARIN.SI 1.4.7 List of word-level 4-grams with relative frequency 2/million or above from lower-case word forms in the Gigafida 2.0 corpus with text-type distribution, lemmas and collocation measuresGF2.0-word_sets-lowercase_forms-lemmas-4grams- taxonomy-collocativity-entire.tsvLower-case form of string Lemma of string Lemma (lower-case) Total absolute frequen- cy of lower-case word form Percentage of all found lower-case word forms Total relative frequency (per million occur - rences) Absolute frequency [SSJ.T.K.N] Percentage [SSJ.T.K.N] Relative frequency [SSJ.T.K.N] Absolute frequency [SSJ.I] Percentage [SSJ.I] Relative frequency [SSJ.I] Absolute frequency [SSJ.T.K.L] Percentage [SSJ.T.K.L] Relative frequency [SSJ.T.K.L] Absolute frequency [SSJ.T.D] Percentage [SSJ.T.D] Relative frequency [SSJ.T.D] Absolute frequency [SSJ.T.P.R] Percentage [SSJ.T.P.R] Relative frequency [SSJ.T.P.R] Absolute frequency [SSJ.T.K.S] Percentage [SSJ.T.K.S] Relative frequency [SSJ.T.K.S] Absolute frequency [SSJ.T.P.C] Percentage [SSJ.T.P.C] Relative frequency [SSJ.T.P.C] Dice t-score MI MI3 logDice simple LL glede na to da glede na ta da glede na ta da 60,130 
0.96 % 52.99 0 0 % 018,535 0.09 % 58.30 789 0.05 % 19.87 291 0.09 % 73.56 11,021 0.14 % 58,80 1,001 0,06 % 23.31 28,493 0.10 % 52.50 0.01 245.21 43.11 74.87 6.67 1,440,575.02 ne glede na to ne glede na ta ne glede na ta 47,026 0.75 % 41.44 0 0 % 013,610 0.06 % 42.81 1,386 0.08 % 34.90 318 0.10 % 80.39 9,188 0.11 % 49,02 2,606 0,15 % 60.70 19,918 0.07 % 36.70 0.01 216.85 43.33 74.37 6.65 1,132,726.51 po drugi strani pa po drug stran pa po drug stran pa 38,703 0.62 % 34.11 0 0 % 0 9,551 0.04 % 30.04 1,068 0.06 % 26.89 65 0.02 % 16.43 8,296 0.10 % 44,26 1,874 0,11 % 43.65 17,849 0.06 % 32.89 0.01 196.73 42.53 73.01 7.07 913,655.99 portal mmc rtv slovenijaportal MMC RTV Slovenijaportal mmc rtv slovenija33,538 0.53 % 29.56 0 0 % 033,538 0.16 % 105.49 0 0 % 0 0 0 % 0 0 0 % 0 0 0 % 0 0 0 % 0 0.23 183.13 42.64 72.71 11.85 793,951.68 interaktivni mul- timedijski portal mmcinteraktiven multimedijski portal MMCinteraktiven multimedijski portal mmc33,520 0.53 % 29.54 0 0 % 033,520 0.15 % 105.43 0 0 % 0 0 0 % 0 0 0 % 0 0 0 % 0 0 0 % 0 0.84 183.08 44.16 74.22 13.74 824,072.19 multimedijski portal mmc rtvmultimedijski portal MMC RTVmultimedijski portal mmc rtv33,520 0.53 % 29.54 0 0 % 033,520 0.15 % 105.43 0 0 % 0 0 0 % 0 0 0 % 0 0 0 % 0 0 0 % 0 0.68 183.08 43.11 73.18 13.44 802,971.27 prvi interaktivni multimedijski portalprvi interaktiven multimedijski portalprvi interaktiven multimedijski portal33,520 0.53 % 29.54 0 0 % 033,520 0.15 % 105.43 0 0 % 0 0 0 % 0 0 0 % 0 0 0 % 0 0 0 % 0 0.14 183.08 42.27 72.34 11.21 786,047.38 francoska tiskovna agencija afpfrancoski tiskoven agencija AFPfrancoski tiskoven agencija afp26,145 0.42 % 23.04 0 0 % 025,826 0.12 % 81.23 0 0 % 0 0 0 % 0 6 0 % 0,03 0 0 % 0 313 0 % 0.58 0.36 161.69 43.71 73.06 12.52 635,758.32 za okolje in prostor za okolje in prostor za okolje in prostor 25,735 0.41 % 22.68 0 0 % 0 5,432 0.03 % 17.09 4 0 % 0.10 850.03 % 21.49 1,247 0.02 % 6,65 810,01 % 1.89 18,886 0.06 % 34.80 0.00 160.42 41.89 71.19 5.24 597,580.63 na to 
da je na ta da biti na ta da biti 24,786 0.40 % 21.84 0 0 % 0 7,733 0.04 % 24.32 475 0.03 % 11.96 131 0.04 % 33.12 4,345 0.05 % 23,18 641 0,04 % 14.93 11,461 0.04 % 21.12 0.00 157.44 50.28 79.47 4.38 700,683.03 res pa je da res pa biti da res pa biti da 22,309 0.35 % 19.66 0 0 % 0 3,844 0.02 % 12.09 308 0.02 % 7.76 25 0.01 % 6.32 5,423 0.07 % 28,94 372 0,02 % 8.66 12,337 0.04 % 22.73 0.00 149.36 41.80 70.69 4.46 516,840.28 ki se je v ki se biti v ki se biti v 21,946 0.35 % 19.34 0 0 % 0 7,892 0.04 % 24.82 408 0.02 % 10.27 22 0.01 % 5.56 2,479 0.03 % 13,23 490 0,03 % 11.41 10,655 0.04 % 19.63 0.00 148.14 41.66 70.50 3.84 506,562.04 je v tem da biti v ta da biti v ta da 19,792 0.32 % 17.44 0 0 % 0 3,530 0.02 % 11.10 482 0.03 % 12.14 64 0.02 % 16.18 5,555 0.07 % 29,64 1,375 0,08 % 32.03 8,786 0.03 % 16.19 0.00 140.68 41.51 70.06 3.87 455,067.01 poroča francoska tiskovna agencijaporočati francoski tiskoven agencijaporočati francoski tiskoven agencija19,422 0.31 % 17.12 0 0 % 019,265 0.09 % 60.59 0 0 % 0 0 0 % 0 4 0 % 0,02 0 0 % 0 153 0 % 0.28 0.24 139.36 41.48 69.98 11.94 446,241.44 na drugi strani pa na drug stran pa na drug stran pa 17,034 0.27 % 15.01 0 0 % 0 5,423 0.03 % 17.06 128 0.01 % 3.22 370.01 % 9.35 2,522 0.03 % 13,46 804 0,05 % 18.73 8,120 0.03 % 14.96 0.00 130.51 43.87 71.98 5.14 415,812.65 se je izkazalo da se biti izkazati da se biti izkazati da 16,687 0.27 % 14.71 0 0 % 0 4,802 0.02 % 15.10 540 0.03 % 13.60 210.01 % 5.31 2,943 0.04 % 15,70 331 0,02 % 7.71 8,050 0.03 % 14.83 0.00 129.18 41.27 69.32 3.95 381,201.99 za to da bi za ta da biti za ta da biti 16,575 0.26 % 14.61 0 0 % 0 3,939 0.02 % 12.39 606 0.04 % 15.26 930.03 % 23.51 2,389 0.03 % 12,75 664 0,04 % 15.47 8,884 0.03 % 16.37 0.00 128.74 41.71 69.75 4.74 383,110.32 pa naj bi se pa naj biti se pa naj biti se 16,564 0.26 % 14.60 0 0 % 0 5,085 0.02 % 15.99 68 0 % 1.71 10 0 % 2.53 1,743 0.02 % 9,30 150 0,01 % 3.49 9,508 0.03 % 17.52 0.00 128.70 41.47 69.50 4.93 380,393.85 v zvezi s tem v zveza z ta 
v zveza z ta 15,761 0.25 % 13.89 0 0 % 0 5,244 0.02 % 16.49 229 0.01 % 5.77 174 0.05 % 43.99 1,579 0.02 % 8,43 577 0,03 % 13.44 7,958 0.03 % 14.66 0.00 125.54 41.18 69.07 4.64 359,266.64 je dejal da je biti dejati da biti biti dejati da biti 15,757 0.25 % 13.89 0 0 % 0 7,960 0.04 % 25.04 60 0 % 1.51 1 0 % 0.25 560 0.01 % 2,99 73 0 % 1.70 7,103 0.02 % 13.09 0.00 125.53 41.18 69.07 3.44 359,171.99 pa je da je pa biti da biti pa biti da biti 15,754 0.25 % 13.88 0 0 % 0 3,471 0.02 % 10.92 158 0.01 % 3.98 170.01 % 4.30 3,030 0.04 % 16,17 336 0,02 % 7.83 8,742 0.03 % 16.11 0.00 125.51 41.18 69.07 3.29 359,101.00 ki je bil v ki biti biti v ki biti biti v 15,454 0.25 % 13.62 0 0 % 0 5,683 0.03 % 17.87 289 0.02 % 7.28 40 0.01 % 10.11 1,575 0.02 % 8,40 296 0,02 % 6.89 7,571 0.03 % 13.95 0.00 124.31 41.15 68.99 3.55 352,004.64 iz leta v leto iz leto v leto iz leto v leto 15,311 0.24 % 13.49 0 0 % 0 3,531 0.02 % 11.11 101 0.01 % 2.54 13 0 % 3.29 2,456 0.03 % 13,10 195 0,01 % 4.54 9,015 0.03 % 16.61 0.00 123.74 43.18 70.98 4.71 367,381.45 je povedal da je biti povedati da biti biti povedati da biti 15,247 0.24 % 13.44 0 0 % 0 4,379 0.02 % 13.77 454 0.03 % 11.43 8 0 % 2.02 1,084 0.01 % 5,78 100 0,01 % 2.33 9,222 0.03 % 16.99 0.00 123.48 43.02 70.81 3.39 364,434.76 kljub temu da je kljub ta da biti kljub ta da biti 15,016 0.24 % 13.23 0 0 % 0 4,258 0.02 % 13.39 268 0.02 % 6.75 29 0.01 % 7.33 2,915 0.04 % 15,55 347 0,02 % 8.08 7,199 0.02 % 13.26 0.00 122.54 43.01 70.76 4.15 358,845.96 po drugi svetovni vojnipo drug svetoven vojnapo drug svetoven vojna14,793 0.24 % 13.04 0 0 % 0 3,678 0.02 % 11.57 65 0 % 1.64 40 0.01 % 10.11 1,826 0.02 % 9,74 915 0,05 % 21.31 8,269 0.03 % 15.24 0.01 121.63 43.24 70.95 7.03 355,558.03 CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 317 File at CLARIN.SI 1.4.8 List of word-level 5-grams with relative frequency 2/million or above from lower-case word forms in the Gigafida 2.0 corpus with text-type distribution, lemmas 
and collocation measuresGF2.0-word_sets-lowercase_forms-lemmas-5grams- taxonomy-collocativity-entire.tsvLower-case form of string Lemma of string Lemma (lower-case) Total absolute frequen- cy of lower-case word form Percentage of all found lower-case word forms Total relative frequency (per million occur - rences) Absolute frequency [SSJ.T.K.N] Percentage [SSJ.T.K.N] Relative frequency [SSJ.T.K.N] Absolute frequency [SSJ.I] Percentage [SSJ.I] Relative frequency [SSJ.I] Absolute frequency [SSJ.T.K.L] Percentage [SSJ.T.K.L] Relative frequency [SSJ.T.K.L] Absolute frequency [SSJ.T.D] Percentage [SSJ.T.D] Relative frequency [SSJ.T.D] Absolute frequency [SSJ.T.P.R] Percentage [SSJ.T.P.R] Relative frequency [SSJ.T.P.R] Absolute frequency [SSJ.T.K.S] Percentage [SSJ.T.K.S] Relative frequency [SSJ.T.K.S] Absolute frequency [SSJ.T.P.C] Percentage [SSJ.T.P.C] Relative frequency [SSJ.T.P.C] Dice t-score MI MI3 logDice simple LL interaktivni mul- timedijski portal mmc rtvinteraktiven multimedijski portal MMC RTVinteraktiven multimedijski portal mmc rtv33,520 5.98 % 29.54 0 0 % 033,520 0.60 % 105.43 0 0 % 0 0 0 % 0 0 0 % 0 0 0 % 0 0 0 % 0 0.72 183.08 73.11 103.17 13.53 1,408,305.57 multimedijski portal mmc rtv slovenijamultimedijski portal MMC RTV Slovenijamultimedijski portal mmc rtv slovenija33,520 5.98 % 29.54 0 0 % 033,520 0.60 % 105.43 0 0 % 0 0 0 % 0 0 0 % 0 0 0 % 0 0 0 % 0 0.27 183.08 72.35 102.42 12.09 1,393,086.45 prvi interaktivni multimedijski portal mmcprvi interaktiven multimedijski portal MMCprvi interaktiven multimedijski portal mmc33,520 5.98 % 29.54 0 0 % 033,520 0.60 % 105.43 0 0 % 0 0 0 % 0 0 0 % 0 0 0 % 0 0 0 % 0 0.17 183.08 73.66 103.73 11.47 1,419,570.30 poroča francoska tis - kovna agencija afpporočati francoski tiskoven agencija AFPporočati francoski tiskoven agencija afp19,331 3.45 % 17.04 0 0 % 0 19,178 0.34 % 60.32 0 0 % 0 0 0 % 0 4 0 % 0,02 0 0 % 0 149 0 % 0.27 0.25 139.04 71.56 100.03 12.03 794,151.48 glede na to da je glede na ta da biti glede na 
ta da biti 17,352 3.10 % 15.29 0 0 % 0 5,510 0.10 % 17.33 209 0.09 % 5.26 90 0.08 % 22.75 3,048 0.20 % 16,26 285 0,07 % 6.64 8,210 0.13 % 15.13 0.00 131.73 74.90 103.06 4.18 747,753.44 ne glede na to ali ne glede na ta ali ne glede na ta ali 11,720 2.09 % 10.33 0 0 % 0 3,227 0.06 % 10.15 159 0.07 % 4 125 0.11 % 31.60 2,545 0.17 % 13,58 902 0,23 % 21.01 4,762 0.08 % 8.77 0.00 108.26 71.67 98.70 4.82 482,268.02 delo družino in socialne zadevedelo družina in socialen zadevadelo družina in socialen zadeva11,192 2.00 % 9.86 0 0 % 0 1,934 0.04 % 6.08 3 0 % 0.08 330.03 % 8.34 821 0.05 % 4,38 570,01 % 1.33 8,344 0.14 % 15.37 0.00 105.79 71.22 98.12 4.93 457,483.24 za delo družino in socialneza delo družina in socialenza delo družina in socialen11,072 1.98 % 9.76 0 0 % 0 1,925 0.04 % 6.05 3 0 % 0.08 320.03 % 8.09 814 0.05 % 4,34 56 0,01 % 1.30 8,242 0.13 % 15.19 0.00 105.22 70.75 97.62 4.33 449,497.76 poroča nemška tiskovna agencija dpaporočati nemški tiskoven agencija dpaporočati nemški tiskoven agencija dpa9,942 1.77 % 8.76 0 0 % 0 9,869 0.18 % 31.04 0 0 % 0 0 0 % 0 2 0 % 0,01 0 0 % 0 71 0 % 0.13 0.14 99.71 70.60 97.16 11.17 402,692.74 za kmetijstvo goz - darstvo in prehranoza kmetijstvo goz - darstvo in prehranaza kmetijstvo goz - darstvo in prehrana8,902 1.59 % 7.85 0 0 % 0 2,068 0.04 % 6.50 2 0 % 0.05 140.01 % 3.54 656 0.04 % 3,50 26 0,01 % 0.61 6,136 0.10 % 11.31 0.00 94.35 70.44 96.68 4.04 359,714.02 ne glede na to da ne glede na ta da ne glede na ta da 8,776 1.57 % 7.73 0 0 % 0 2,618 0.05 % 8.23 117 0.05 % 2.95 60 0.05 % 15.17 1,448 0.10 % 7,73 218 0,06 % 5.08 4,315 0.07 % 7.95 0.00 93.68 70.42 96.62 3.98 354,513.92 po drugi strani pa je po drug stran pa biti po drug stran pa biti 8,307 1.48 % 7.32 0 0 % 0 2,050 0.04 % 6.45 284 0.12 % 7.15 140.01 % 3.54 1,794 0.12 % 9,57 382 0,10 % 8.90 3,783 0.06 % 6.97 0.00 91.14 72.13 98.17 3.54 344,130.23 kot v enakem obd- obju lanikot v enak obdobje lanikot v enak obdobje lani8,131 1.45 % 7.17 0 0 % 0 4,759 0.09 % 14.97 0 0 % 0 
0 | 0 % | 0 | 366 | 0.02 % | 1,95 | 15 | 0 % | 0.35 | 2,991 | 0.05 % | 5.51 | 0.00 | 90.17 | 70.35 | 96.33 | 4.12 | 328,135.87
ministrstvo za okolje in prostor | ministrstvo za okolje in prostor | ministrstvo za okolje in prostor | 6,619 | 1.18 % | 5.83 | 0 | 0 % | 0 | 1,539 | 0.03 % | 4.84 | 0 | 0 % | 0 | 33 | 0.03 % | 8.34 | 358 | 0.02 % | 1,91 | 42 | 0,01 % | 0.98 | 4,647 | 0.07 % | 8.56 | 0.00 | 81.36 | 70.01 | 95.40 | 3.59 | 265,758.38
glede na to da so | glede na ta da biti | glede na ta da biti | 6,037 | 1.08 % | 5.32 | 0 | 0 % | 0 | 1,724 | 0.03 % | 5.42 | 64 | 0.03 % | 1.61 | 21 | 0.02 % | 5.31 | 1,011 | 0.07 % | 5,39 | 131 | 0,03 % | 3.05 | 3,086 | 0.05 % | 5.69 | 0.00 | 77.70 | 70.33 | 95.45 | 3.25 | 243,538.82
se je izkazalo da je | se biti izkazati da biti | se biti izkazati da biti | 5,991 | 1.07 % | 5.28 | 0 | 0 % | 0 | 1,751 | 0.03 % | 5.51 | 235 | 0.10 % | 5.92 | 7 | 0.01 % | 1.77 | 1,067 | 0.07 % | 5,69 | 108 | 0,03 % | 2.52 | 2,823 | 0.05 % | 5.20 | 0.00 | 77.40 | 69.87 | 94.96 | 2.15 | 240,024.92
po poročanju francoske tiskovne agencije | po poročanje francoski tiskoven agencija | po poročanje francoski tiskoven agencija | 5,771 | 1.03 % | 5.09 | 0 | 0 % | 0 | 5,650 | 0.10 % | 17.77 | 0 | 0 % | 0 | 0 | 0 % | 0 | 1 | 0 % | 0,01 | 0 | 0 % | 0 | 120 | 0 % | 0.22 | 0.00 | 75.97 | 69.81 | 94.80 | 6.17 | 231,023.24
poročanju francoske tiskovne agencije afp | poročanje francoski tiskoven agencija AFP | poročanje francoski tiskoven agencija afp | 5,720 | 1.02 % | 5.04 | 0 | 0 % | 0 | 5,602 | 0.10 % | 17.62 | 0 | 0 % | 0 | 0 | 0 % | 0 | 1 | 0 % | 0,01 | 0 | 0 % | 0 | 117 | 0 % | 0.22 | 0.11 | 75.63 | 69.80 | 94.76 | 10.80 | 228,937.52
to še ne pomeni da | ta še ne pomeniti da | ta še ne pomeniti da | 5,509 | 0.98 % | 4.86 | 0 | 0 % | 0 | 1,225 | 0.02 % | 3.85 | 181 | 0.08 % | 4.56 | 11 | 0.01 % | 2.78 | 1,332 | 0.09 % | 7,11 | 187 | 0,05 % | 4.36 | 2,573 | 0.04 % | 4.74 | 0.00 | 74.22 | 73.60 | 98.45 | 3.82 | 233,093.06
glede na to da se | glede na ta da se | glede na ta da se | 5,430 | 0.97 % | 4.79 | 0 | 0 % | 0 | 1,686 | 0.03 % | 5.30 | 67 | 0.03 % | 1.69 | 24 | 0.02 % | 6.07 | 992 | 0.07 % | 5,29 | 101 | 0,03 % | 2.35 | 2,560 | 0.04 % | 4.72 | 0.00 | 73.69 | 70.65 | 95.47 | 3.02 | 220,122.85
se je odločil da bo | se biti odločiti da biti | se biti odločiti da biti | 5,122 | 0.91 % | 4.51 | 0 | 0 % | 0 | 1,480 | 0.03 % | 4.65 | 282 | 0.12 % | 7.10 | 5 | 0 % | 1.26 | 1,009 | 0.07 % | 5,38 | 105 | 0,03 % | 2.45 | 2,241 | 0.04 % | 4.13 | 0.00 | 71.57 | 70.02 | 94.67 | 2.45 | 205,683.50

1.4.9 List of word-level 2-grams with relative frequency 2/million or above from lower-case word forms in the Gigafida 2.0 corpus with text-type distribution, morphosyntactic tags and collocation measures

File at CLARIN.SI: GF2.0-word_sets-lowercase_forms-morphosyntactic_tags-2grams-taxonomy-collocativity-entire.tsv

Lower-case form of string | Morphosyntactic tag of string | Total absolute frequency of lower-case word form | Percentage of all found lower-case word forms | Total relative frequency (per million occurrences) | Absolute frequency [SSJ.T.K.N] | Percentage [SSJ.T.K.N] | Relative frequency [SSJ.T.K.N] | Absolute frequency [SSJ.I] | Percentage [SSJ.I] | Relative frequency [SSJ.I] | Absolute frequency [SSJ.T.K.L] | Percentage [SSJ.T.K.L] | Relative frequency [SSJ.T.K.L] | Absolute frequency [SSJ.T.D] | Percentage [SSJ.T.D] | Relative frequency [SSJ.T.D] | Absolute frequency [SSJ.T.P.R] | Percentage [SSJ.T.P.R] | Relative frequency [SSJ.T.P.R] | Absolute frequency [SSJ.T.K.S] | Percentage [SSJ.T.K.S] | Relative frequency [SSJ.T.K.S] | Absolute frequency [SSJ.T.P.C] | Percentage [SSJ.T.P.C] | Relative frequency [SSJ.T.P.C] | Dice | t-score | MI | MI3 | logDice | simple LL
se je | Zp------k Gp-ste-n | 3,552,248 | 1.02 % | 3,130.58 | 2 | 0.03 % | 206.04 | 1,027,281 | 0.51 % | 3,231.07 | 301,232 | 1.29 % | 7,584.70 | 6,069 | 0.24 % | 1,534.18 | 496,753 | 0.46 % | 2,650.51 | 98,100 | 0,42 % | 2,284,87 | 1,622,811 | 0.49 % | 2,990.14 | 0.13 | 1,585.83 | 2.66 | 46.18 | 11.02 | -296,217.19
da je | Vd Gp-ste-n | 2,678,006 | 0.77 % | 2,360.11 | 1 | 0.02 % | 103.02 | 825,416 | 0.41 % | 2,596.15 | 122,724 | 0.53 % | 3,090.06 | 8,514 | 0.34 % | 2,152.25 | 419,283 | 0.39 % | 2,237.16 | 84,518 | 0,36 % | 1,968,52 | 1,217,550 | 0.37 % | 2,243.42 | 0.10 | 1,318.60 | 2.36 | 45.07 | 10.65 | -503,974.80
ki je | Vd Gp-ste-n | 2,429,359 | 0.70 % | 2,140.98 | 6 | 0.10 % | 618.11 | 778,856 | 0.39 % | 2,449.70 | 71,777 | 0.31 % | 1,807.27 | 7,661 | 0.31 % | 1,936.62 | 348,186 | 0.32 % | 1,857.81 | 73,250 | 0,31 % | 1,706,08 | 1,149,623 | 0.35 % | 2,118.26 | 0.09 | 1,274.18 | 2.45 | 44.88 | 10.57 | -382,703.00
pa je | Vp Gp-ste-n | 2,207,016 | 0.63 % | 1,945.03 | 1 | 0.02 % | 103.02 | 670,188 | 0.33 % | 2,107.92 | 52,739 | 0.23 % | 1,327.91 | 3,003 | 0.12 % | 759.13 | 328,584 | 0.30 % | 1,753.22 | 52,259 | 0,22 % | 1,217,17 | 1,100,242 | 0.33 % | 2,027.27 | 0.09 | 1,225.46 | 2.51 | 44.66 | 10.47 | -301,051.58
ki so | Vd Gp-stm-n | 1,743,293 | 0.50 % | 1,536.36 | 4 | 0.07 % | 412.07 | 486,132 | 0.24 % | 1,529.01 | 50,819 | 0.22 % | 1,279.57 | 7,766 | 0.31 % | 1,963.16 | 270,450 | 0.25 % | 1,443.03 | 74,618 | 0,32 % | 1,737,94 | 853,504 | 0.26 % | 1,572.64 | 0.13 | 1,208.09 | 3.56 | 45.02 | 11.10 | 542,187.15
je bil | Gp-ste-n Gp-d-em | 1,681,240 | 0.48 % | 1,481.67 | 0 | 0 % | 0 | 525,709 | 0.26 % | 1,653.49 | 101,698 | 0.43 % | 2,560.65 | 4,022 | 0.16 % | 1,016.72 | 217,256 | 0.20 % | 1,159.21 | 49,827 | 0,21 % | 1,160,53 | 782,728 | 0.24 % | 1,442.23 | 0.08 | 1,231.90 | 4.32 | 45.69 | 10.35 | 1,182,419.25
da bi | Vd Gp-g | 1,640,453 | 0.47 % | 1,445.72 | 0 | 0 % | 0 | 408,685 | 0.20 % | 1,285.42 | 133,476 | 0.57 % | 3,360.78 | 5,901 | 0.24 % | 1,491.71 | 259,458 | 0.24 % | 1,384.38 | 69,020 | 0,29 % | 1,607,56 | 763,913 | 0.23 % | 1,407.56 | 0.15 | 1,214.59 | 4.27 | 45.57 | 11.30 | 1,109,709.39
da se | Vd Zp------k | 1,583,717 | 0.45 % | 1,395.72 | 30 | 0.51 % | 3,090.55 | 452,612 | 0.23 % | 1,423.58 | 78,316 | 0.34 % | 1,971.91 | 7,514 | 0.30 % | 1,899.46 | 274,019 | 0.25 % | 1,462.08 | 59,749 | 0,26 % | 1,391,63 | 711,477 | 0.22 % | 1,310.94 | 0.10 | 1,092.09 | 2.92 | 44.11 | 10.72 | 34,760.55
je v | Gp-ste-n Dm | 1,466,676 | 0.42 % | 1,292.57 | 1 | 0.02 % | 103.02 | 498,050 | 0.25 % | 1,566.50 | 33,347 | 0.14 % | 839.64 | 2,772 | 0.11 % | 700.73 | 192,965 | 0.18 % | 1,029.60 | 42,103 | 0,18 % | 980,63 | 697,438 | 0.21 % | 1,285.08 | 0.05 | 484.69 | 0.74 | 41.71 | 9.53 | -522,759.82
so se | Gp-stm-n Zp------k | 1,266,721 | 0.36 % | 1,116.35 | 2 | 0.03 % | 206.04 | 344,051 | 0.17 % | 1,082.13 | 60,393 | 0.26 % | 1,520.63 | 2,287 | 0.09 % | 578.13 | 165,405 | 0.15 % | 882.55 | 39,128 | 0,17 % | 911,34 | 655,455 | 0.20 % | 1,207.72 | 0.09 | 958.16 | 2.75 | 43.30 | 10.47 | -59,687.20
je bilo | Gp-ste-n Gp-d-es | 1,199,196 | 0.34 % | 1,056.85 | 0 | 0 % | 0 | 330,118 | 0.17 % | 1,038.31 | 80,644 | 0.34 % | 2,030.53 | 3,236 | 0.13 % | 818.03 | 165,512 | 0.15 % | 883.12 | 37,875 | 0,16 % | 882,15 | 581,811 | 0.18 % | 1,072.03 | 0.06 | 1,028.30 | 4.04 | 44.42 | 9.87 | 661,490.60
je bila | Gp-ste-n Gp-d-ez | 1,189,476 | 0.34 % | 1,048.28 | 0 | 0 % | 0 | 341,972 | 0.17 % | 1,075.59 | 79,506 | 0.34 % | 2,001.88 | 3,491 | 0.14 % | 882.49 | 171,585 | 0.16 % | 915.52 | 41,218 | 0,18 % | 960,02 | 551,704 | 0.17 % | 1,016.55 | 0.06 | 1,035.46 | 4.30 | 44.67 | 9.87 | 824,357.19
naj bi | L Gp-g | 1,155,137 | 0.33 % | 1,018.02 | 0 | 0 % | 0 | 339,269 | 0.17 % | 1,067.09 | 10,908 | 0.05 % | 274.65 | 1,206 | 0.05 % | 304.86 | 139,643 | 0.13 % | 745.09 | 21,086 | 0,09 % | 491,12 | 643,025 | 0.20 % | 1,184.82 | 0.29 | 1,066.26 | 6.98 | 47.26 | 12.19 | 2,562,359.51
da so | Vd Gp-stm-n | 1,029,570 | 0.29 % | 907.35 | 0 | 0 % | 0 | 302,269 | 0.15 % | 950.71 | 29,850 | 0.13 % | 751.59 | 2,610 | 0.10 % | 659.78 | 150,583 | 0.14 % | 803.46 | 36,095 | 0,15 % | 840,70 | 508,163 | 0.15 % | 936.32 | 0.07 | 843.31 | 2.57 | 42.51 | 10.23 | -120,907.20
ki se | Vd Zp------k | 1,014,422 | 0.29 % | 894 | 3 | 0.05 % | 309.06 | 287,492 | 0.14 % | 904.24 | 32,604 | 0.14 % | 820.93 | 5,816 | 0.23 % | 1,470.22 | 183,768 | 0.17 % | 980.53 | 45,698 | 0,20 % | 1,064,36 | 459,041 | 0.14 % | 845.81 | 0.07 | 830.01 | 2.51 | 42.41 | 10.18 | -140,771.96
pa so | Vp Gp-stm-n | 906,693 | 0.26 % | 799.06 | 0 | 0 % | 0 | 256,521 | 0.13 % | 806.83 | 14,316 | 0.06 % | 360.46 | 1,242 | 0.05 % | 313.96 | 124,643 | 0.12 % | 665.05 | 23,582 | 0,10 % | 549,25 | 486,389 | 0.15 % | 896.20 | 0.07 | 816.53 | 2.81 | 42.39 | 10.25 | -20,477.72
ko je | Vd Gp-ste-n | 892,460 | 0.26 % | 786.52 | 9 | 0.15 % | 927.17 | 284,843 | 0.14 % | 895.91 | 66,529 | 0.28 % | 1,675.13 | 2,288 | 0.09 % | 578.38 | 125,350 | 0.12 % | 668.83 | 25,421 | 0,11 % | 592,09 | 388,020 | 0.12 % | 714.95 | 0.04 | 826.47 | 3.00 | 42.53 | 9.41 | 49,476.56
pa se | Vp Zp------k | 851,419 | 0.24 % | 750.35 | 0 | 0 % | 0 | 231,651 | 0.12 % | 728.60 | 25,693 | 0.11 % | 646.92 | 2,016 | 0.08 % | 509.62 | 151,613 | 0.14 % | 808.96 | 26,441 | 0,11 % | 615,84 | 414,005 | 0.12 % | 762.83 | 0.06 | 754.14 | 2.45 | 41.85 | 10.01 | -134,585.22
da bo | Vd Gp-pte-n | 786,550 | 0.23 % | 693.18 | 3 | 0.05 % | 309.06 | 251,734 | 0.13 % | 791.77 | 26,221 | 0.11 % | 660.22 | 1,966 | 0.08 % | 496.98 | 104,229 | 0.10 % | 556.13 | 14,625 | 0,06 % | 340,63 | 387,772 | 0.12 % | 714.50 | 0.08 | 798.69 | 3.33 | 42.50 | 10.27 | 160,300.93
bi se | Gp-g Zp------k | 707,845 | 0.20 % | 623.82 | 1 | 0.02 % | 103.02 | 178,981 | 0.09 % | 562.94 | 48,802 | 0.21 % | 1,228.78 | 2,146 | 0.09 % | 542.49 | 111,306 | 0.10 % | 593.89 | 23,491 | 0,10 % | 547,13 | 343,118 | 0.10 % | 632.22 | 0.06 | 732.16 | 2.95 | 41.81 | 10.01 | 23,529.87
se bo | Zp------k Gp-pte-n | 686,378 | 0.20 % | 604.90 | 0 | 0 % | 0 | 227,432 | 0.11 % | 715.33 | 17,788 | 0.08 % | 447.88 | 1,059 | 0.04 % | 267.70 | 88,753 | 0.08 % | 473.56 | 11,192 | 0,05 % | 260,67 | 340,154 | 0.10 % | 626.76 | 0.06 | 726.24 | 3.02 | 41.80 | 10.00 | 44,016.64
ki ga | Vd Zotmet--k | 674,645 | 0.19 % | 594.56 | 6 | 0.10 % | 618.11 | 191,815 | 0.10 % | 603.31 | 22,080 | 0.10 % | 555.95 | 3,199 | 0.13 % | 808.67 | 116,553 | 0.11 % | 621.89 | 29,955 | 0,13 % | 697,69 | 311,037 | 0.09 % | 573.11 | 0.09 | 785.54 | 4.52 | 43.25 | 10.50 | 545,096.82
več kot | Rsr Vd | 672,233 | 0.19 % | 592.44 | 1 | 0.02 % | 103.02 | 235,139 | 0.12 % | 739.57 | 7,987 | 0.03 % | 201.10 | 1,048 | 0.04 % | 264.92 | 90,844 | 0.08 % | 484.71 | 10,827 | 0,05 % | 252,17 | 326,387 | 0.10 % | 601.39 | 0.18 | 808.30 | 6.14 | 44.86 | 11.50 | 1,160,910.58
ga je | Zotmet--k Gp-ste-n | 661,965 | 0.19 % | 583.39 | 0 | 0 % | 0 | 179,354 | 0.09 % | 564.12 | 71,458 | 0.31 % | 1,799.24 | 1,810 | 0.07 % | 457.55 | 91,597 | 0.09 % | 488.73 | 22,087 | 0,10 % | 514,43 | 295,659 | 0.09 % | 544.77 | 0.03 | 699.44 | 2.83 | 41.51 | 9.00 | -9,022.00
in se | Vp Zp------k | 659,803 | 0.19 % | 581.48 | 1 | 0.02 % | 103.02 | 154,656 | 0.08 % | 486.43 | 78,210 | 0.34 % | 1,969.24 | 2,057 | 0.08 % | 519.99 | 127,499 | 0.12 % | 680.29 | 30,049 | 0,13 % | 699,88 | 267,331 | 0.08 % | 492.58 | 0.03 | 305.54 | 0.68 | 39.34 | 8.90 | -225,957.49
ne bo | L Gp-pte-n | 650,495 | 0.19 % | 573.28 | 0 | 0 % | 0 | 182,186 | 0.09 % | 573.02 | 29,369 | 0.13 % | 739.48 | 1,544 | 0.06 % | 390.31 | 99,616 | 0.09 % | 531.52 | 11,939 | 0,05 % | 278,07 | 325,841 | 0.10 % | 600.38 | 0.10 | 762.44 | 4.19 | 42.82 | 10.71 | 412,268.68
je da | Gp-ste-n Vd | 641,672 | 0.18 % | 565.50 | 0 | 0 % | 0 | 180,213 | 0.09 % | 566.82 | 40,500 | 0.17 % | 1,019.75 | 1,118 | 0.04 % | 282.62 | 105,887 | 0.10 % | 564.98 | 21,868 | 0,09 % | 509,33 | 292,086 | 0.09 % | 538.19 | 0.02 | 151.68 | 0.30 | 38.89 | 8.59 | -126,005.97
je tudi | Gp-ste-n L | 625,665 | 0.18 % | 551.40 | 1 | 0.02 % | 103.02 | 184,971 | 0.09 % | 581.78 | 11,565 | 0.05 % | 291.19 | 1,273 | 0.05 % | 321.80 | 107,464 | 0.10 % | 573.39 | 17,891 | 0,08 % | 416,70 | 302,500 | 0.09 % | 557.38 | 0.03 | 414.51 | 1.07 | 39.58 | 8.73 | -252,281.21
ne bi | L Gp-g | 619,609 | 0.18 % | 546.06 | 0 | 0 % | 0 | 145,511 | 0.07 % | 457.67 | 49,665 | 0.21 % | 1,250.51 | 1,908 | 0.08 % | 482.32 | 107,185 | 0.10 % | 571.90 | 18,391 | 0,08 % | 428,35 | 296,949 | 0.09 % | 547.15 | 0.09 | 738.16 | 4.01 | 42.49 | 10.58 | 332,308.24
to je | Zk-sei Gp-ste-n | 606,143 | 0.17 % | 534.19 | 2 | 0.03 % | 206.04 | 172,410 | 0.09 % | 542.27 | 38,345 | 0.16 % | 965.49 | 2,851 | 0.12 % | 720.70 | 106,715 | 0.10 %
569.40 | 24,727 | 0,11 % | 575,92 | 261,093 | 0.08 % | 481.08 | 0.03 | 655.81 | 2.67 | 41.08 | 8.87 | -48,551.87
je na | Gp-ste-n Dm | 570,177 | 0.16 % | 502.49 | 0 | 0 % | 0 | 214,892 | 0.11 % | 675.89 | 13,127 | 0.06 % | 330.52 | 661 | 0.03 % | 167.09 | 65,454 | 0.06 % | 349.24 | 9,528 | 0,04 % | 221,92 | 266,515 | 0.08 % | 491.07 | 0.02 | 268.99 | 0.64 | 38.88 | 8.54 | -188,111.52
so bili | Gp-stm-n Gp-d-mm | 542,792 | 0.15 % | 478.36 | 1 | 0.02 % | 103.02 | 151,371 | 0.07 % | 476.10 | 22,259 | 0.10 % | 560.46 | 1,585 | 0.06 % | 400.67 | 75,945 | 0.07 % | 405.22 | 21,278 | 0,09 % | 495,59 | 270,353 | 0.08 % | 498.14 | 0.08 | 722.39 | 5.68 | 43.78 | 10.29 | 792,330.65

1.4.10 List of word-level 3-grams with relative frequency 2/million or above from lower-case word forms in the Gigafida 2.0 corpus with text-type distribution, morphosyntactic tags and collocation measures

File at CLARIN.SI: GF2.0-word_sets-lowercase_forms-morphosyntactic_tags-3grams-taxonomy-collocativity-entire.tsv

Lower-case form of string | Morphosyntactic tag of string | Total absolute frequency of lower-case word form | Percentage of all found lower-case word forms | Total relative frequency (per million occurrences) | Absolute frequency [SSJ.T.K.N] | Percentage [SSJ.T.K.N] | Relative frequency [SSJ.T.K.N] | Absolute frequency [SSJ.I] | Percentage [SSJ.I] | Relative frequency [SSJ.I] | Absolute frequency [SSJ.T.K.L] | Percentage [SSJ.T.K.L] | Relative frequency [SSJ.T.K.L] | Absolute frequency [SSJ.T.D] | Percentage [SSJ.T.D] | Relative frequency [SSJ.T.D] | Absolute frequency [SSJ.T.P.R] | Percentage [SSJ.T.P.R] | Relative frequency [SSJ.T.P.R] | Absolute frequency [SSJ.T.K.S] | Percentage [SSJ.T.K.S] | Relative frequency [SSJ.T.K.S] | Absolute frequency [SSJ.T.P.C] | Percentage [SSJ.T.P.C] | Relative frequency [SSJ.T.P.C] | Dice | t-score | MI | MI3 | logDice | simple LL
ki se je | Vd Zp------k Gp-ste-n | 271,902 | 0.40 % | 239.63 | 1 | 0.04 % | 103.02 | 84,854 | 0.11 % | 266.89 | 14,048 | 0.17 % | 353.71 | 488 | 0.05 % | 123.36 | 37,058 | 0.10 % | 197.73 | 8,037 | 0,11 % | 187,19 | 127,416 | 0.11 % | 234.77 | 0.01 | 521.43 | 15.35 | 51.45 | 7.61 | 1,968,684.08
ki ga je | Vd Zotmet--k Gp-ste-n | 264,623 | 0.39 % | 233.21 | 0 | 0 % | 0 | 81,088 | 0.11 % | 255.04 | 13,331 | 0.16 % | 335.66 | 931 | 0.10 % | 235.35 | 34,581 | 0.10 % | 184.51 | 8,788 | 0,12 % | 204,68 | 125,904 | 0.11 % | 231.99 | 0.01 | 514.41 | 17.11 | 53.14 | 7.88 | 2,197,200.84
pa se je | Vp Zp------k Gp-ste-n | 253,512 | 0.37 % | 223.42 | 0 | 0 % | 0 | 78,163 | 0.10 % | 245.84 | 10,604 | 0.13 % | 267 | 324 | 0.03 % | 81.90 | 34,416 | 0.10 % | 183.63 | 5,324 | 0,07 % | 124 | 124,681 | 0.11 % | 229.73 | 0.01 | 503.49 | 15.74 | 51.64 | 7.54 | 1,895,334.36
da se je | Vd Zp------k Gp-ste-n | 232,308 | 0.34 % | 204.73 | 0 | 0 % | 0 | 67,766 | 0.09 % | 213.14 | 16,463 | 0.20 % | 414.52 | 504 | 0.05 % | 127.41 | 34,329 | 0.10 % | 183.17 | 7,003 | 0,10 % | 163,11 | 106,243 | 0.09 % | 195.76 | 0.01 | 481.98 | 16.07 | 51.72 | 7.34 | 1,782,943.55
da bi se | Vd Gp-g Zp------k | 230,596 | 0.34 % | 203.22 | 0 | 0 % | 0 | 54,493 | 0.07 % | 171.39 | 21,748 | 0.27 % | 547.59 | 892 | 0.09 % | 225.49 | 38,613 | 0.11 % | 206.03 | 10,268 | 0,14 % | 239,15 | 104,582 | 0.09 % | 192.70 | 0.02 | 480.19 | 14.97 | 50.60 | 8.25 | 1,617,746.50
ki jo je | Vd Zotzet--k Gp-ste-n | 206,746 | 0.30 % | 182.20 | 0 | 0 % | 0 | 63,713 | 0.08 % | 200.39 | 10,790 | 0.13 % | 271.68 | 542 | 0.06 % | 137.01 | 28,377 | 0.08 % | 151.41 | 7,016 | 0,10 % | 163,41 | 96,308 | 0.08 % | 177.45 | 0.01 | 454.69 | 15.82 | 51.14 | 7.55 | 1,556,183.27
ki naj bi | Vd L Gp-g | 173,475 | 0.26 % | 152.88 | 0 | 0 % | 0 | 45,720 | 0.06 % | 143.80 | 2,081 | 0.03 % | 52.40 | 277 | 0.03 % | 70.02 | 24,480 | 0.07 % | 130.62 | 4,471 | 0,06 % | 104,13 | 96,446 | 0.08 % | 177.71 | 0.03 | 416.50 | 16.81 | 51.62 | 8.68 | 1,408,879.29
se je v | Zp------k Gp-ste-n Dm | 170,820 | 0.25 % | 150.54 | 0 | 0 % | 0 | 55,845 | 0.07 % | 175.65 | 4,979 | 0.06 % | 125.37 | 204 | 0.02 % | 51.57 | 21,596 | 0.06 % | 115.23 | 4,737 | 0,07 % | 110,33 | 83,459 | 0.07 % | 153.78 | 0.01 | 413.29 | 14.54 | 49.31 | 6.70 | 1,153,871.47
ki je v | Vd Gp-ste-n Dm | 162,937 | 0.24 % | 143.60 | 1 | 0.04 % | 103.02 | 57,178 | 0.07 % | 179.84 | 1,752 | 0.02 % | 44.11 | 453 | 0.05 % | 114.51 | 19,001 | 0.05 % | 101.38 | 3,664 | 0,05 % | 85,34 | 80,888 | 0.07 % | 149.04 | 0.01 | 403.65 | 17.29 | 51.92 | 6.69 | 1,370,055.10
ki je bil | Vd Gp-ste-n Gp-d-em | 159,982 | 0.24 % | 140.99 | 0 | 0 % | 0 | 53,427 | 0.07 % | 168.04 | 6,298 | 0.08 % | 158.58 | 507 | 0.05 % | 128.16 | 19,240 | 0.06 % | 102.66 | 4,452 | 0,06 % | 103,69 | 76,058 | 0.07 % | 140.14 | 0.01 | 399.97 | 16.01 | 50.59 | 7.16 | 1,222,530.11
ki jih je | Vd Zotmmt--k Gp-ste-n | 159,426 | 0.23 % | 140.50 | 0 | 0 % | 0 | 47,322 | 0.06 % | 148.84 | 7,213 | 0.09 % | 181.62 | 778 | 0.08 % | 196.67 | 22,341 | 0.06 % | 119.20 | 6,080 | 0,08 % | 141,61 | 75,692 | 0.07 % | 139.47 | 0.01 | 399.26 | 14.44 | 49.01 | 7.17 | 1,067,348.09
ko se je | Vd Zp------k Gp-ste-n | 156,592 | 0.23 % | 138 | 0 | 0 % | 0 | 43,622 | 0.06 % | 137.20 | 17,964 | 0.22 % | 452.31 | 295 | 0.03 % | 74.57 | 23,582 | 0.07 % | 125.83 | 4,689 | 0,07 % | 109,21 | 66,440 | 0.06 % | 122.42 | 0.01 | 395.71 | 15.72 | 50.23 | 7.03 | 1,168,673.73
ki so se | Vd Gp-stm-n Zp------k | 150,086 | 0.22 % | 132.27 | 0 | 0 % | 0 | 39,890 | 0.05 % | 125.46 | 7,363 | 0.09 % | 185.39 | 375 | 0.04 % | 94.80 | 21,507 | 0.06 % | 114.75 | 5,975 | 0,08 % | 139,16 | 74,976 | 0.06 % | 138.15 | 0.01 | 387.40 | 15.44 | 49.83 | 7.46 | 1,094,693.84
da se bo | Vd Zp------k Gp-pte-n | 139,139 | 0.20 % | 122.62 | 0 | 0 % | 0 | 42,142 | 0.06 % | 132.55 | 5,963 | 0.07 % | 150.14 | 262 | 0.03 % | 66.23 | 19,776 | 0.06 % | 105.52 | 2,883 | 0,04 % | 67,15 | 68,113 | 0.06 % | 125.50 | 0.01 | 373.01 | 15.83 | 50.00 | 7.54 | 1,047,627.10
ki so ga | Vd Gp-stm-n Zotmet--k | 131,420 | 0.19 % | 115.82 | 0 | 0 % | 0 | 38,630 | 0.05 % | 121.50 | 3,979 | 0.05 % | 100.19 | 225 | 0.02 % | 56.88 | 17,091 | 0.05 % | 91.19 | 4,465 | 0,06 % | 104 | 67,030 | 0.06 % | 123.51 | 0.01 | 362.51 | 16.32 | 50.32 | 7.82 | 1,028,234.69
pa naj bi | Vp L Gp-g | 129,114 | 0.19 % | 113.79 | 0 | 0 % | 0 | 37,605 | 0.05 % | 118.28 | 737 | 0.01 % | 18.56 | 44 | 0.01 % | 11.12 | 13,443 | 0.04 % | 71.73 | 1,146 | 0,02 % | 26,69 | 76,139 | 0.07 % | 140.29 | 0.02 | 359.32 | 15.51 | 49.46 | 8.38 | 947,180.10
naj bi se | L Gp-g Zp------k | 128,074 | 0.19 % | 112.87 | 0 | 0 % | 0 | 39,339 | 0.05 % | 123.73 | 1,241 | 0.01 % | 31.25 | 162 | 0.02 % | 40.95 | 14,815 | 0.04 % | 79.05 | 2,192 | 0,03 % | 51,05 | 70,325 | 0.06 % | 129.58 | 0.02 | 357.87 | 18.26 | 52.19 | 8.03 | 1,151,967.03
je da je | Gp-ste-n Vd Gp-ste-n | 119,347 | 0.18 % | 105.18 | 0 | 0 % | 0 | 35,108 | 0.05 % | 110.42 | 7,697 | 0.10 % | 193.80 | 176 | 0.02 % | 44.49 | 17,896 | 0.05 % | 95.49 | 3,535 | 0,05 % | 82,33 | 54,935 | 0.05 % | 101.22 | 0.00 | 345.46 | 15.79 | 49.52 | 5.96 | 896,240.94
da je bil | Vd Gp-ste-n Gp-d-em | 117,422 | 0.17 % | 103.48 | 0 | 0 % | 0 | 38,924 | 0.05 % | 122.43 | 6,174 | 0.08 % | 155.45 | 299 | 0.03 % | 75.58 | 15,186 | 0.04 % | 81.03 | 3,182 | 0,04 % | 74,11 | 53,657 | 0.05 % | 98.87 | 0.01 | 342.65 | 14.13 | 47.81 | 6.66 | 763,914.28
pa je bil | Vp Gp-ste-n Gp-d-em | 116,128 | 0.17 % | 102.34 | 0 | 0 % | 0 | 37,902 | 0.05 % | 119.21 | 3,652 | 0.04 % | 91.95 | 130 | 0.01 % | 32.86 | 12,947 | 0.04 % | 69.08 | 2,175 | 0,03 % | 50,66 | 59,322 | 0.05 % | 109.30 | 0.01 | 340.75 | 13.98 | 47.64 | 6.74 | 745,510.72
glede na to | Rsn Dt Zk-set | 109,986 | 0.16 % | 96.93 | 0 | 0 % | 0 | 33,238 | 0.04 % | 104.54 | 2,471 | 0.03 % | 62.22 | 613 | 0.06 % | 154.96 | 20,947 | 0.06 % | 111.77 | 4,241 | 0,06 % | 98,78 | 48,476 | 0.04 % | 89.32 | 0.03 | 331.63 | 15.26 | 48.75 | 9.07 | 790,212.74
ne glede na | L Rsn Dt | 106,604 | 0.16 % | 93.95 | 0 | 0 % | 0 | 30,249 | 0.04 % | 95.14 | 2,212 | 0.03 % | 55.70 | 1,348 | 0.14 % | 340.76 | 19,670 | 0.06 % | 104.95 | 5,293 | 0,07 % | 123,28 | 47,832 | 0.04 % | 88.13 | 0.02 | 326.49 | 14.80 | 48.21 | 8.38 | 736,887.87
in s tem | Vp Do Zk-seo | 105,296 | 0.15 % | 92.80 | 1 | 0.04 % | 103.02 | 28,141 | 0.04 % | 88.51 | 1,062 | 0.01 % | 26.74 | 393 | 0.04 % | 99.35 | 19,934 | 0.06 % | 106.36 | 5,079 | 0,07 % | 118,30 | 50,686 | 0.04 % | 93.39 | 0.01 | 324.47 | 13.84 | 47.21 | 7.15 | 667,018.08
na to da | Dt Zk-set Vd | 104,745 | 0.15 % | 92.31 | 0 | 0 % | 0 | 31,126 | 0.04 % | 97.90 | 2,688 | 0.03 % | 67.68 | 498 | 0.05 % | 125.89 | 19,261 | 0.06 % | 102.77 | 2,951 | 0,04 % | 68,73 | 48,221 | 0.04 % | 88.85 | 0.01 | 323.63 | 14.78 | 48.13 | 7.72 | 722,411.46
ki je bila | Vd Gp-ste-n Gp-d-ez | 102,285 | 0.15 % | 90.14 | 0 | 0 % | 0 | 31,091 | 0.04 % | 97.79 | 4,859 | 0.06 % | 122.34 | 394 | 0.04 % | 99.60 | 14,022 | 0.04 % | 74.82 | 3,717 | 0,05 % | 86,57 | 48,202 | 0.04 % | 88.82 | 0.01 | 319.80 | 13.80 | 47.09 | 6.54 | 645,367.15
ki so jih | Vd Gp-stm-n Zotmmt--k | 100,663 | 0.15 % | 88.71 | 0 | 0 % | 0 | 28,283 | 0.04 % | 88.96 | 2,859 | 0.04 % | 71.99 | 295 | 0.03 % | 74.57 | 13,830 | 0.04 % | 73.79 | 4,632 | 0,06 % | 107,88 | 50,764 | 0.04 % | 93.54 | 0.01 | 317.27 | 18.03 | 51.27 | 7.48 | 891,301.69
ki so jo | Vd Gp-stm-n Zotzet--k | 98,471 | 0.14 % | 86.78 | 0 | 0 % | 0 | 28,375 | 0.04 % | 89.25 | 2,656 | 0.03 % | 66.88 | 175 | 0.02 % | 44.24 | 13,676 | 0.04 % | 72.97 | 3,364 | 0,05 % | 78,35 | 50,225 | 0.04 % | 92.54 | 0.01 | 313.78 | 13.75 | 46.92 | 7.45 | 618,053.00
pa so se | Vp Gp-stm-n Zp------k | 95,140 | 0.14 % | 83.85 | 0 | 0 % | 0 | 26,670 | 0.04 % | 83.88 | 2,351 | 0.03 % | 59.20 | 131 | 0.01 % | 33.12 | 11,747 | 0.03 % | 62.68 | 2,011 | 0,03 % | 46,84 | 52,230 | 0.04 % | 96.24 | 0.01 | 308.43 | 13.78 | 46.86 | 6.86 | 599,057.72
potem ko je | Rsn Vd Gp-ste-n | 93,214 | 0.14 % | 82.15 | 0 | 0 % | 0 | 44,397 | 0.06 % | 139.64 | 2,258 | 0.03 % | 56.85 | 169 | 0.02 % | 42.72 | 6,267
0.02 % | 33.44 | 1,185 | 0,02 % | 27,60 | 38,938 | 0.03 % | 71.75 | 0.01 | 305.29 | 13.67 | 46.68 | 6.71 | 580,616.16
da je v | Vd Gp-ste-n Dm | 92,790 | 0.14 % | 81.78 | 0 | 0 % | 0 | 28,302 | 0.04 % | 89.02 | 3,067 | 0.04 % | 77.22 | 269 | 0.03 % | 68 | 13,537 | 0.04 % | 72.23 | 2,797 | 0,04 % | 65,15 | 44,818 | 0.04 % | 82.58 | 0.00 | 304.60 | 13.98 | 46.99 | 5.84 | 595,585.78
medtem ko je | Rsn Vd Gp-ste-n | 92,063 | 0.14 % | 81.13 | 0 | 0 % | 0 | 30,129 | 0.04 % | 94.76 | 4,807 | 0.06 % | 121.04 | 95 | 0.01 % | 24.01 | 10,904 | 0.03 % | 58.18 | 2,453 | 0,03 % | 57,13 | 43,675 | 0.04 % | 80.47 | 0.01 | 303.40 | 13.65 | 46.63 | 6.70 | 572,453.38
da je bila | Vd Gp-ste-n Gp-d-ez | 89,888 | 0.13 % | 79.22 | 0 | 0 % | 0 | 30,131 | 0.04 % | 94.77 | 4,418 | 0.05 % | 111.24 | 233 | 0.02 % | 58.90 | 12,189 | 0.04 % | 65.04 | 2,585 | 0,04 % | 60,21 | 40,332 | 0.03 % | 74.31 | 0.00 | 299.79 | 13.62 | 46.53 | 6.29 | 557,062.75

1.4.11 List of word-level 4-grams with relative frequency 2/million or above from lower-case word forms in the Gigafida 2.0 corpus with text-type distribution, morphosyntactic tags and collocation measures

File at CLARIN.SI: GF2.0-word_sets-lowercase_forms-morphosyntactic_tags-4grams-taxonomy-collocativity-entire.tsv

Lower-case form of string | Morphosyntactic tag of string | Total absolute frequency of lower-case word form | Percentage of all found lower-case word forms | Total relative frequency (per million occurrences) | Absolute frequency [SSJ.T.K.N] | Percentage [SSJ.T.K.N] | Relative frequency [SSJ.T.K.N] | Absolute frequency [SSJ.I] | Percentage [SSJ.I] | Relative frequency [SSJ.I] | Absolute frequency [SSJ.T.K.L] | Percentage [SSJ.T.K.L] | Relative frequency [SSJ.T.K.L] | Absolute frequency [SSJ.T.D] | Percentage [SSJ.T.D] | Relative frequency [SSJ.T.D] | Absolute frequency [SSJ.T.P.R] | Percentage [SSJ.T.P.R] | Relative frequency [SSJ.T.P.R] | Absolute frequency [SSJ.T.K.S] | Percentage [SSJ.T.K.S] | Relative frequency [SSJ.T.K.S] | Absolute frequency [SSJ.T.P.C] | Percentage [SSJ.T.P.C] | Relative frequency [SSJ.T.P.C] | Dice | t-score | MI | MI3 | logDice | simple LL
glede na to da | Rsn Dt Zk-set Vd | 60,125 | 1.02 % | 52.99 | 0 | 0 % | 0 | 18,532 | 0.09 % | 58.29 | 789 | 0.05 % | 19.87 | 291 | 0.10 % | 73.56 | 11,020 | 0.14 % | 58.80 | 1,001 | 0,06 % | 23,31 | 28,492 | 0.10 % | 52.50 | 0.01 | 245.20 | 44.45 | 76.20 | 7.31 | 1,488,899.50
ne glede na to | L Rsn Dt Zk-set | 46,869 | 0.80 % | 41.31 | 0 | 0 % | 0 | 13,562 | 0.07 % | 42.66 | 1,385 | 0.09 % | 34.87 | 314 | 0.10 % | 79.38 | 9,170 | 0.12 % | 48.93 | 2,601 | 0,16 % | 60,58 | 19,837 | 0.07 % | 36.55 | 0.01 | 216.49 | 43.13 | 74.16 | 7.51 | 1,123,205.03
po drugi strani pa | Dm Kbzzem Sozem Vp | 38,703 | 0.66 % | 34.11 | 0 | 0 % | 0 | 9,551 | 0.05 % | 30.04 | 1,068 | 0.07 % | 26.89 | 65 | 0.02 % | 16.43 | 8,296 | 0.11 % | 44.26 | 1,874 | 0,12 % | 43,65 | 17,849 | 0.06 % | 32.89 | 0.01 | 196.73 | 42.48 | 72.96 | 7.16 | 912,422.54
portal mmc rtv slovenija | Somei Slmei Slmei Slzei | 33,534 | 0.57 % | 29.55 | 0 | 0 % | 0 | 33,534 | 0.16 % | 105.47 | 0 | 0 % | 0 | 0 | 0 % | 0 | 0 | 0 % | 0 | 0 | 0 % | 0 | 0 | 0 % | 0 | 0.24 | 183.12 | 42.85 | 72.92 | 11.96 | 798,066.34
interaktivni multimedijski portal mmc | Ppnmeid Ppnmeid Somei Slmei | 33,520 | 0.57 % | 29.54 | 0 | 0 % | 0 | 33,520 | 0.16 % | 105.43 | 0 | 0 % | 0 | 0 | 0 % | 0 | 0 | 0 % | 0 | 0 | 0 % | 0 | 0 | 0 % | 0 | 0.90 | 183.08 | 44.54 | 74.61 | 13.84 | 831,829.18
multimedijski portal mmc rtv | Ppnmeid Somei Slmei Slmei | 33,520 | 0.57 % | 29.54 | 0 | 0 % | 0 | 33,520 | 0.16 % | 105.43 | 0 | 0 % | 0 | 0 | 0 % | 0 | 0 | 0 % | 0 | 0 | 0 % | 0 | 0 | 0 % | 0 | 0.89 | 183.08 | 44.49 | 74.56 | 13.83 | 830,892.48
prvi interaktivni multimedijski portal | Kbvmei Ppnmeid Ppnmeid Somei | 33,520 | 0.57 % | 29.54 | 0 | 0 % | 0 | 33,520 | 0.16 % | 105.43 | 0 | 0 % | 0 | 0 | 0 % | 0 | 0 | 0 % | 0 | 0 | 0 % | 0 | 0 | 0 % | 0 | 0.26 | 183.08 | 44.23 | 74.29 | 12.07 | 825,517.14
francoska tiskovna agencija afp | Ppnzei Ppnzei Sozei Slmei | 25,954 | 0.44 % | 22.87 | 0 | 0 % | 0 | 25,644 | 0.12 % | 80.66 | 0 | 0 % | 0 | 0 | 0 % | 0 | 5 | 0 % | 0.03 | 0 | 0 % | 0 | 305 | 0 % | 0.56 | 0.36 | 161.10 | 45.78 | 75.11 | 12.54 | 663,468.13
za okolje in prostor | Dt Soset Vp Sometn | 25,101 | 0.43 % | 22.12 | 0 | 0 % | 0 | 5,279 | 0.03 % | 16.60 | 4 | 0 % | 0.10 | 83 | 0.03 % | 20.98 | 1,214 | 0.02 % | 6.48 | 80 | 0,01 % | 1,86 | 18,441 | 0.07 % | 33.98 | 0.00 | 158.43 | 41.85 | 71.09 | 5.22 | 582,314.96
na to da je | Dt Zk-set Vd Gp-ste-n | 24,782 | 0.42 % | 21.84 | 0 | 0 % | 0 | 7,730 | 0.04 % | 24.31 | 475 | 0.03 % | 11.96 | 131 | 0.04 % | 33.12 | 4,344 | 0.06 % | 23.18 | 641 | 0,04 % | 14,93 | 11,461 | 0.04 % | 21.12 | 0.00 | 157.42 | 41.84 | 71.03 | 4.66 | 574,639.21
res pa je da | Rsn Vp Gp-ste-n Vd | 22,309 | 0.38 % | 19.66 | 0 | 0 % | 0 | 3,844 | 0.02 % | 12.09 | 308 | 0.02 % | 7.76 | 25 | 0.01 % | 6.32 | 5,423 | 0.07 % | 28.94 | 372 | 0,02 % | 8,66 | 12,337 | 0.04 % | 22.73 | 0.00 | 149.36 | 44.67 | 73.56 | 4.46 | 555,348.58
je v tem da | Gp-ste-n Dm Zk-sem Vd | 19,791 | 0.34 % | 17.44 | 0 | 0 % | 0 | 3,529 | 0.02 % | 11.10 | 482 | 0.03 % | 12.14 | 64 | 0.02 % | 16.18 | 5,555 | 0.07 % | 29.64 | 1,375 | 0,09 % | 32,03 | 8,786 | 0.03 % | 16.19 | 0.00 | 140.68 | 41.51 | 70.06 | 4.00 | 455,043.15
poroča francoska tiskovna agencija | Ggnste Ppnzei Ppnzei Sozei | 19,422 | 0.33 % | 17.12 | 0 | 0 % | 0 | 19,265 | 0.09 % | 60.59 | 0 | 0 % | 0 | 0 | 0 % | 0 | 4 | 0 % | 0.02 | 0 | 0 % | 0 | 153 | 0 % | 0.28 | 0.24 | 139.36 | 41.48 | 69.98 | 11.94 | 446,241.44
na drugi strani pa | Dm Kbzzem Sozem Vp | 17,034 | 0.29 % | 15.01 | 0 | 0 % | 0 | 5,423 | 0.03 % | 17.06 | 128 | 0.01 % | 3.22 | 37 | 0.01 % | 9.35 | 2,522 | 0.03 % | 13.46 | 804 | 0,05 % | 18,73 | 8,120 | 0.03 % | 14.96 | 0.00 | 130.51 | 41.49 | 69.60 | 5.65 | 391,429.66
ki se je v | Vd Zp------k Gp-ste-n Dm | 16,954 | 0.29 % | 14.94 | 0 | 0 % | 0 | 5,852 | 0.03 % | 18.41 | 379 | 0.02 % | 9.54 | 22 | 0.01 % | 5.56 | 2,097 | 0.03 % | 11.19 | 455 | 0,03 % | 10,60 | 8,149 | 0.03 % | 15.02 | 0.00 | 130.21 | 41.29 | 69.39 | 3.57 | 387,535.16
se je izkazalo da | Zp------k Gp-ste-n Ggdd-es Vd | 16,687 | 0.28 % | 14.71 | 0 | 0 % | 0 | 4,802 | 0.02 % | 15.10 | 540 | 0.03 % | 13.60 | 21 | 0.01 % | 5.31 | 2,943 | 0.04 % | 15.70 | 331 | 0,02 % | 7,71 | 8,050 | 0.03 % | 14.83 | 0.00 | 129.18 | 41.27 | 69.32 | 3.95 | 381,201.99
za to da bi | Dt Zk-set Vd Gp-g | 16,575 | 0.28 % | 14.61 | 0 | 0 % | 0 | 3,939 | 0.02 % | 12.39 | 606 | 0.04 % | 15.26 | 93 | 0.03 % | 23.51 | 2,389 | 0.03 % | 12.75 | 664 | 0,04 % | 15,47 | 8,884 | 0.03 % | 16.37 | 0.00 | 128.74 | 41.26 | 69.29 | 4.87 | 378,546.48
pa naj bi se | Vp L Gp-g Zp------k | 16,562 | 0.28 % | 14.60 | 0 | 0 % | 0 | 5,085 | 0.03 % | 15.99 | 68 | 0 % | 1.71 | 10 | 0 % | 2.53 | 1,742 | 0.02 % | 9.29 | 150 | 0,01 % | 3,49 | 9,507 | 0.03 % | 17.52 | 0.00 | 128.69 | 41.73 | 69.76 | 4.95 | 382,939.08
je dejal da je | Gp-ste-n Ggdd-em Vd Gp-ste-n | 15,757 | 0.27 % | 13.89 | 0 | 0 % | 0 | 7,960 | 0.04 % | 25.04 | 60 | 0 % | 1.51 | 1 | 0 % | 0.25 | 560 | 0.01 % | 2.99 | 73 | 0,01 % | 1,70 | 7,103 | 0.03 % | 13.09 | 0.00 | 125.53 | 42.85 | 70.73 | 3.44 | 374,949.35
pa je da je | Vp Gp-ste-n Vd Gp-ste-n | 15,754 | 0.27 % | 13.88 | 0 | 0 % | 0 | 3,471 | 0.02 % | 10.92 | 158 | 0.01 % | 3.98 | 17 | 0.01 % | 4.30 | 3,030 | 0.04 % | 16.17 | 336 | 0,02 % | 7,83 | 8,742 | 0.03 % | 16.11 | 0.00 | 125.51 | 41.18 | 69.07 | 3.29 | 359,101.00
iz leta v leto | Dr Soser Dt Soset | 15,311 | 0.26 % | 13.49 | 0 | 0 % | 0 | 3,531 | 0.02 % | 11.11 | 101 | 0.01 % | 2.54 | 13 | 0 % | 3.29 | 2,456 | 0.03 % | 13.10 | 195 | 0,01 % | 4,54 | 9,015 | 0.03 % | 16.61 | 0.00 | 123.74 | 43.12 | 70.92 | 6.26 | 366,841.61
je povedal da je | Gp-ste-n Ggdd-em Vd Gp-ste-n | 15,247 | 0.26 % | 13.44 | 0 | 0 % | 0 | 4,379 | 0.02 % | 13.77 | 454 | 0.03 % | 11.43 | 8 | 0 % | 2.02 | 1,084 | 0.01 % | 5.78 | 100 | 0,01 % | 2,33 | 9,222 | 0.03 % | 16.99 | 0.00 | 123.48 | 41.14 | 68.93 | 3.40 | 347,111.09
kljub temu da je | Dd Zk-sed Vd Gp-ste-n | 15,016 | 0.26 % | 13.23 | 0 | 0 % | 0 | 4,258 | 0.02 % | 13.39 | 268 | 0.02 % | 6.75 | 29 | 0.01 % | 7.33 | 2,915 | 0.04 % | 15.55 | 347 | 0,02 % | 8,08 | 7,199 | 0.03 % | 13.26 | 0.00 | 122.54 | 41.11 | 68.86 | 4.15 | 341,653.06
po drugi svetovni vojni | Dm Kbzzem Ppnzem Sozem | 14,810 | 0.25 % | 13.05 | 0 | 0 % | 0 | 3,680 | 0.02 % | 11.57 | 65 | 0 % | 1.64 | 40 | 0.01 % | 10.11 | 1,828 | 0.02 % | 9.75 | 915 | 0,06 % | 21,31 | 8,282 | 0.03 % | 15.26 | 0.01 | 121.70 | 41.09 | 68.80 | 7.23 | 336,788.33
zdi se mi da | Ggnste Zp------k Zop-ed--k Vd | 14,450 | 0.24 % | 12.73 | 0 | 0 % | 0 | 3,135 | 0.01 % | 9.86 | 1,729 | 0.11 % | 43.53 | 21 | 0.01 % | 5.31 | 3,210 | 0.04 % | 17.13 | 540 | 0,03 % | 12,58 | 5,815 | 0.02 % | 10.71 | 0.00 | 120.21 | 43.11 | 70.75 | 4.89 | 346,143.99
v zvezi s tem | Dm Sozem Do Zk-seo | 14,395 | 0.24 % | 12.69 | 0 | 0 % | 0 | 4,806 | 0.02 % | 15.12 | 205 | 0.01 % | 5.16 | 132 | 0.04 % | 33.37 | 1,456 | 0.02 % | 7.77 | 527 | 0,03 % | 12,27 | 7,269 | 0.03 % | 13.39 | 0.00 | 119.98 | 42.87 | 70.50 | 4.86 | 342,772.98
potem ko se je | Rsn Vd Zp------k Gp-ste-n | 14,253 | 0.24 % | 12.56 | 0 | 0 % | 0 | 6,018 | 0.03 % | 18.93 | 533 | 0.03 % | 13.42 | 23 | 0.01 % | 5.81 | 1,296 | 0.02 % | 6.92 | 237 | 0,01 % | 5,52 | 6,146 | 0.02 % | 11.32 | 0.00 | 119.39 | 44.71 | 72.31 | 3.96 | 355,176.69
ki je bil v | Vd Gp-ste-n Gp-d-em Dm | 14,202 | 0.24 % | 12.52 | 0 | 0 % | 0 | 5,213 | 0.03 % | 16.40 | 271 | 0.02 % | 6.82 | 38 | 0.01 % | 9.61 | 1,508 | 0.02 % | 8.05 | 288 | 0,02 % | 6,71 | 6,884 | 0.02 % | 12.68 | 0.00 | 119.17 | 43.71 | 71.30 | 3.54 | 345,356.86
iz dneva v dan | Dr Somer Dt Sometn | 13,941 | 0.24 % | 12.29 | 0 | 0 % | 0 | 3,311 | 0.02 % | 10.41 | 402 | 0.03 % | 10.12 | 25 | 0.01 % | 6.32 | 2,882 | 0.04 % | 15.38 | 287 | 0,02 % | 6,68 | 7,034 | 0.03 % | 12.96 | 0.00 | 118.07 | 41.01 | 68.54 | 6.32 | 316,294.53
se mi zdi da | Zp------k Zop-ed--k Ggnste Vd | 13,859 | 0.23 % | 12.21 | 0 | 0 % | 0 | 3,382 | 0.02 % | 10.64 | 975 | 0.06 %
24.55 | 109 | 0.04 % | 27.55 | 3,287 | 0.04 % | 17.54 | 369 | 0,02 % | 8,59 | 5,737 | 0.02 % | 10.57 | 0.00 | 117.72 | 43.05 | 70.57 | 4.83 | 331,484.12
kaj se je zgodilo | Zv-sei Zp------k Gp-ste-n Ggdd-es | 13,257 | 0.23 % | 11.68 | 0 | 0 % | 0 | 3,293 | 0.02 % | 10.36 | 2,663 | 0.17 % | 67.05 | 27 | 0.01 % | 6.83 | 2,150 | 0.03 % | 11.47 | 417 | 0,03 % | 9,71 | 4,707 | 0.02 % | 8.67 | 0.00 | 115.14 | 42.08 | 69.47 | 3.94 | 309,321.34
da bi se lahko | Vd Gp-g Zp------k Rsn | 13,205 | 0.22 % | 11.64 | 0 | 0 % | 0 | 4,450 | 0.02 % | 14 | 908 | 0.06 % | 22.86 | 28 | 0.01 % | 7.08 | 1,796 | 0.02 % | 9.58 | 413 | 0,03 % | 9,62 | 5,610 | 0.02 % | 10.34 | 0.00 | 114.91 | 40.93 | 68.31 | 4.41 | 298,974.01

1.4.12 List of word-level 5-grams with relative frequency 2/million or above from lower-case word forms in the Gigafida 2.0 corpus with text-type distribution, morphosyntactic tags and collocation measures

File at CLARIN.SI: GF2.0-word_sets-lowercase_forms-morphosyntactic_tags-5grams-taxonomy-collocativity-entire.tsv

Lower-case form of string | Morphosyntactic tag of string | Total absolute frequency of lower-case word form | Percentage of all found lower-case word forms | Total relative frequency (per million occurrences) | Absolute frequency [SSJ.T.K.N] | Percentage [SSJ.T.K.N] | Relative frequency [SSJ.T.K.N] | Absolute frequency [SSJ.I] | Percentage [SSJ.I] | Relative frequency [SSJ.I] | Absolute frequency [SSJ.T.K.L] | Percentage [SSJ.T.K.L] | Relative frequency [SSJ.T.K.L] | Absolute frequency [SSJ.T.D] | Percentage [SSJ.T.D] | Relative frequency [SSJ.T.D] | Absolute frequency [SSJ.T.P.R] | Percentage [SSJ.T.P.R] | Relative frequency [SSJ.T.P.R] | Absolute frequency [SSJ.T.K.S] | Percentage [SSJ.T.K.S] | Relative frequency [SSJ.T.K.S] | Absolute frequency [SSJ.T.P.C] | Percentage [SSJ.T.P.C] | Relative frequency [SSJ.T.P.C] | Dice | t-score | MI | MI3 | logDice | simple LL
interaktivni multimedijski portal mmc rtv | Ppnmeid Ppnmeid Somei Slmei Slmei | 33,520 | 6.41 % | 29.54 | 0 | 0 % | 0 | 33,520 | 0.63 % | 105.43 | 0 | 0 % | 0 | 0 | 0 % | 0 | 0 | 0 % | 0 | 0 | 0 % | 0 | 0 | 0 % | 0 | 0.91 | 183.08 | 73.58 | 103.65 | 13.86 | 1,417,929.59
multimedijski portal mmc rtv slovenija | Ppnmeid Somei Slmei Slmei Slzei | 33,520 | 6.41 % | 29.54 | 0 | 0 % | 0 | 33,520 | 0.63 % | 105.43 | 0 | 0 % | 0 | 0 | 0 % | 0 | 0 | 0 % | 0 | 0 | 0 % | 0 | 0 | 0 % | 0 | 0.29 | 183.08 | 72.35 | 102.42 | 12.20 | 1,393,086.45
prvi interaktivni multimedijski portal mmc | Kbvmei Ppnmeid Ppnmeid Somei Slmei | 33,520 | 6.41 % | 29.54 | 0 | 0 % | 0 | 33,520 | 0.63 % | 105.43 | 0 | 0 % | 0 | 0 | 0 % | 0 | 0 | 0 % | 0 | 0 | 0 % | 0 | 0 | 0 % | 0 | 0.31 | 183.08 | 72.35 | 102.42 | 12.30 | 1,393,086.45
poroča francoska tiskovna agencija afp | Ggnste Ppnzei Ppnzei Sozei Slmei | 19,326 | 3.69 % | 17.03 | 0 | 0 % | 0 | 19,176 | 0.36 % | 60.31 | 0 | 0 % | 0 | 0 | 0 % | 0 | 4 | 0 % | 0.02 | 0 | 0 % | 0 | 146 | 0 % | 0.27 | 0.26 | 139.02 | 71.56 | 100.03 | 12.05 | 793,941.73
glede na to da je | Rsn Dt Zk-set Vd Gp-ste-n | 17,350 | 3.32 % | 15.29 | 0 | 0 % | 0 | 5,508 | 0.10 % | 17.32 | 209 | 0.10 % | 5.26 | 90 | 0.08 % | 22.75 | 3,048 | 0.21 % | 16.26 | 285 | 0,08 % | 6,64 | 8,210 | 0.14 % | 15.13 | 0.00 | 131.72 | 77.78 | 105.94 | 4.46 | 777,744.72
delo družino in socialne zadeve | Soset Sozet Vp Ppnzmt Sozmt | 9,362 | 1.79 % | 8.25 | 0 | 0 % | 0 | 1,534 | 0.03 % | 4.82 | 3 | 0 % | 0.08 | 32 | 0.03 % | 8.09 | 697 | 0.05 % | 3.72 | 47 | 0,01 % | 1,09 | 7,049 | 0.12 % | 12.99 | 0.00 | 96.76 | 70.86 | 97.24 | 4.69 | 380,664.19
za delo družino in socialne | Dt Soset Sozet Vp Ppnzmt | 9,239 | 1.77 % | 8.14 | 0 | 0 % | 0 | 1,524 | 0.03 % | 4.79 | 3 | 0 % | 0.08 | 31 | 0.03 % | 7.84 | 689 | 0.05 % | 3.68 | 46 | 0,01 % | 1,07 | 6,946 | 0.12 % | 12.80 | 0.00 | 96.12 | 70.54 | 96.89 | 4.09 | 373,891.52
za kmetijstvo gozdarstvo in prehrano | Dt Soset Soset Vp Sozet | 8,898 | 1.70 % | 7.84 | 0 | 0 % | 0 | 2,067 | 0.04 % | 6.50 | 2 | 0 % | 0.05 | 14 | 0.01 % | 3.54 | 656 | 0.05 % | 3.50 | 26 | 0,01 % | 0,61 | 6,133 | 0.10 % | 11.30 | 0.00 | 94.33 | 72.12 | 98.36 | 4.05 | 368,549.27
ne glede na to da | L Rsn Dt Zk-set Vd | 8,774 | 1.68 % | 7.73 | 0 | 0 % | 0 | 2,616 | 0.05 % | 8.23 | 117 | 0.05 % | 2.95 | 60 | 0.05 % | 15.17 | 1,448 | 0.10 % | 7.73 | 218 | 0,06 % | 5,08 | 4,315 | 0.07 % | 7.95 | 0.00 | 93.67 | 70.94 | 97.14 | 4.51 | 357,184.15
po drugi strani pa je | Dm Kbzzem Sozem Vp Gp-ste-n | 8,307 | 1.59 % | 7.32 | 0 | 0 % | 0 | 2,050 | 0.04 % | 6.45 | 284 | 0.13 % | 7.15 | 14 | 0.01 % | 3.54 | 1,794 | 0.12 % | 9.57 | 382 | 0,10 % | 8,90 | 3,783 | 0.07 % | 6.97 | 0.00 | 91.14 | 70.34 | 96.38 | 3.56 | 335,171.99
kot v enakem obdobju lani | Vd Dm Zn-sem Sosem Rsn | 8,131 | 1.55 % | 7.17 | 0 | 0 % | 0 | 4,759 | 0.09 % | 14.97 | 0 | 0 % | 0 | 0 | 0 % | 0 | 366 | 0.03 % | 1.95
15 | 0 % | 0,35 | 2,991 | 0.05 % | 5.51 | 0.00 | 90.17 | 71.92 | 97.89 | 4.41 | 335,793.10
ne glede na to ali | L Rsn Dt Zk-set Vp | 7,421 | 1.42 % | 6.54 | 0 | 0 % | 0 | 2,050 | 0.04 % | 6.45 | 110 | 0.05 % | 2.77 | 90 | 0.08 % | 22.75 | 1,623 | 0.11 % | 8.66 | 613 | 0,17 % | 14,28 | 2,935 | 0.05 % | 5.41 | 0.00 | 86.15 | 70.18 | 95.89 | 4.94 | 298,696.55
poroča nemška tiskovna agencija dpa | Ggnste Ppnzei Ppnzei Sozei Slzei | 7,074 | 1.35 % | 6.23 | 0 | 0 % | 0 | 7,027 | 0.13 % | 22.10 | 0 | 0 % | 0 | 0 | 0 % | 0 | 2 | 0 % | 0.01 | 0 | 0 % | 0 | 45 | 0 % | 0.08 | 0.10 | 84.11 | 70.11 | 95.68 | 10.74 | 284,435.50
glede na to da so | Rsn Dt Zk-set Vd Gp-stm-n | 6,037 | 1.15 % | 5.32 | 0 | 0 % | 0 | 1,724 | 0.03 % | 5.42 | 64 | 0.03 % | 1.61 | 21 | 0.02 % | 5.31 | 1,011 | 0.07 % | 5.39 | 131 | 0,04 % | 3,05 | 3,086 | 0.05 % | 5.69 | 0.00 | 77.70 | 69.90 | 95.02 | 3.69 | 241,985.82
se je izkazalo da je | Zp------k Gp-ste-n Ggdd-es Vd Gp-ste-n | 5,991 | 1.15 % | 5.28 | 0 | 0 % | 0 | 1,751 | 0.03 % | 5.51 | 235 | 0.11 % | 5.92 | 7 | 0.01 % | 1.77 | 1,067 | 0.07 % | 5.69 | 108 | 0,03 % | 2,52 | 2,823 | 0.05 % | 5.20 | 0.00 | 77.40 | 70.06 | 95.16 | 2.15 | 240,731.23
po poročanju francoske tiskovne agencije | Dm Sosem Ppnzer Ppnzer Sozer | 5,771 | 1.10 % | 5.09 | 0 | 0 % | 0 | 5,650 | 0.11 % | 17.77 | 0 | 0 % | 0 | 0 | 0 % | 0 | 1 | 0 % | 0.01 | 0 | 0 % | 0 | 120 | 0 % | 0.22 | 0.00 | 75.97 | 70.15 | 95.14 | 6.26 | 232,201.72
ministrstvo za okolje in prostor | Sosei Dt Soset Vp Sometn | 5,505 | 1.05 % | 4.85 | 0 | 0 % | 0 | 1,327 | 0.03 % | 4.17 | 0 | 0 % | 0 | 31 | 0.03 % | 7.84 | 299 | 0.02 % | 1.60 | 40 | 0,01 % | 0,93 | 3,808 | 0.07 % | 7.02 | 0.00 | 74.20 | 70.13 | 94.98 | 3.35 | 221,418.15
to še ne pomeni da | Zk-sei L L Ggvste Vd | 5,493 | 1.05 % | 4.84 | 0 | 0 % | 0 | 1,221 | 0.02 % | 3.84 | 181 | 0.08 % | 4.56 | 11 | 0.01 % | 2.78 | 1,330 | 0.09 % | 7.10 | 187 | 0,05 % | 4,36 | 2,563 | 0.04 % | 4.72 | 0.00 | 74.11 | 69.74 | 94.59 | 3.88 | 219,658.86
glede na to da se | Rsn Dt Zk-set Vd Zp------k | 5,430 | 1.04 % | 4.79 | 0 | 0 % | 0 | 1,686 | 0.03 % | 5.30 | 67 | 0.03 % | 1.69 | 24 | 0.02 % | 6.07 | 992 | 0.07 % | 5.29 | 101 | 0,03 % | 2,35 | 2,560 | 0.04 % | 4.72 | 0.00 | 73.69 | 69.73 | 94.54 | 3.44 | 217,085.16
se je odločil da bo | Zp------k Gp-ste-n Ggdd-em Vd Gp-pte-n | 5,122 | 0.98 % | 4.51 | 0 | 0 % | 0 | 1,480 | 0.03 % | 4.65 | 282 | 0.13 % | 7.10 | 5 | 0.01 % | 1.26 | 1,009 | 0.07 % | 5.38 | 105 | 0,03 % | 2,45 | 2,241 | 0.04 % | 4.13 | 0.00 | 71.57 | 71.80 | 96.45 | 2.45 | 211,178.33
v skladu z zakonom o | Dm Somem Do Someo Dm | 5,090 | 0.97 % | 4.49 | 0 | 0 % | 0 | 1,667 | 0.03 % | 5.24 | 7 | 0 % | 0.18 | 74 | 0.07 % | 18.71 | 339 | 0.02 % | 1.81 | 40 | 0,01 % | 0,93 | 2,963 | 0.05 % | 5.46 | 0.00 | 71.34 | 69.63 | 94.26 | 3.50 | 203,206.47
je na današnji novinarski konferenci | Gp-ste-n Dm Ppnzem Ppnzem Sozem | 4,917 | 0.94 % | 4.33 | 0 | 0 % | 0 | 4,822 | 0.09 % | 15.17 | 0 | 0 % | 0 | 0 | 0 % | 0 | 14 | 0 % | 0.07 | 0 | 0 % | 0 | 81 | 0 % | 0.15 | 0.00 | 70.12 | 69.58 | 94.11 | 3.00 | 196,152.16
poročanju francoske tiskovne agencije afp | Sosem Ppnzer Ppnzer Sozer Slmei | 4,805 | 0.92 % | 4.23 | 0 | 0 % | 0 | 4,717 | 0.09 % | 14.84 | 0 | 0 % | 0 | 0 | 0 % | 0 | 0 | 0 % | 0 | 0 | 0 % | 0 | 88 | 0 % | 0.16 | 0.11 | 69.32 | 73.93 | 98.39 | 10.77 | 204,265.36
res pa je da je | Rsn Vp Gp-ste-n Vd Gp-ste-n | 4,742 | 0.91 % | 4.18 | 0 | 0 % | 0 | 820 | 0.01 % | 2.58 | 58 | 0.03 % | 1.46 | 3 | 0 % | 0.76 | 1,169 | 0.08 % | 6.24 | 100 | 0,03 % | 2,33 | 2,592 | 0.04 % | 4.78 | 0.00 | 68.86 | 71.06 | 95.49 | 1.87 | 193,403.84
ne glede na to kako | L Rsn Dt Zk-set Rsn | 4,522 | 0.86 % | 3.99 | 0 | 0 % | 0 | 1,379 | 0.03 % | 4.34 | 328 | 0.15 % | 8.26 | 15 | 0.01 % | 3.79 | 978 | 0.07 % | 5.22 | 409 | 0,11 % | 9,53 | 1,413 | 0.02 % | 2.60 | 0.00 | 67.25 | 69.87 | 94.16 | 4.38 | 181,189.44
več kot v enakem obdobju | Rsr Vd Dm Zn-sem Sosem | 4,521 | 0.86 % | 3.98 | 0 | 0 % | 0 | 2,680 | 0.05 % | 8.43 | 0 | 0 % | 0 | 0 | 0 % | 0 | 188 | 0.01 % | 1 | 1 | 0 % | 0,02 | 1,652 | 0.03 % | 3.04 | 0.00 | 67.24 | 69.67 | 93.95 | 3.50 | 180,588.79
ministrstva za okolje in prostor | Soser Dt Soset Vp Sometn | 4,480 | 0.86 % | 3.95 | 0 | 0 % | 0 | 922 | 0.02 % | 2.90 | 2 | 0 % | 0.05 | 9 | 0.01 % | 2.28 | 208 | 0.01 % | 1.11 | 17 | 0,01 % | 0,40 | 3,322 | 0.06 % | 6.12 | 0.00 | 66.93 | 69.45 | 93.71 | 3.05 | 178,356.89
za pokojninsko in invalidsko zavarovanje | Dt Ppnset Vp Ppnset Soset | 4,360 | 0.83 % | 3.84 | 0 | 0 % | 0 | 1,564 | 0.03 % | 4.92 | 0 | 0 % | 0 | 130 | 0.12 % | 32.86 | 375 | 0.03 % | 2 | 32 | 0,01 % | 0,75 | 2,259 | 0.04 % | 4.16 | 0.00 | 66.03 | 70.62 | 94.80 | 3.02 | 176,646.78
ne glede na to ali | L Rsn Dt Zk-set Rsn | 4,298 | 0.82 % | 3.79 | 0 | 0 % | 0 | 1,177 | 0.02 % | 3.70 | 49 | 0.02 % | 1.23 | 35 | 0.03 % | 8.85 | 922 | 0.06 % | 4.92 | 289 | 0,08 % | 6,73 | 1,826 | 0.03 % | 3.36 | 0.00 | 65.56 | 69.39 | 93.53 | 4.37 | 170,956.32
državni sekretar na ministrstvu za | Ppnmeid Somei Dm Sosem Dt | 4,205 | 0.80 % | 3.71 | 0 | 0 % | 0 | 2,022 | 0.04 % | 6.36 | 0 | 0 % | 0 | 0 | 0 % | 0 | 188 | 0.01 % | 1 | 3 | 0 % | 0,07 | 1,992 | 0.03 % | 3.67 | 0.00 | 64.85 | 72.87 | 96.95 | 3.76 | 176,084.23
za gospodarski razvoj in tehnologijo | Dt Ppnmetd Sometn Vp Sozet | 4,184 | 0.80 % | 3.69 | 0 | 0 % | 0 | 4,184 | 0.08 % | 13.16 | 0 | 0 % | 0 | 0 | 0 % | 0 | 0 | 0 % | 0 | 0 | 0 % | 0 | 0 | 0 % | 0 | 0.00 | 64.68 | 69.35 | 93.41 | 2.96 | 166,324.18
izkazalo se je da je | Ggdd-es Zp------k Gp-ste-n Vd Gp-ste-n | 4,050 | 0.77 % | 3.57 | 0 | 0 % | 0 | 1,130 | 0.02 % | 3.55 | 261 | 0.12 % | 6.57 | 6 | 0.01 % | 1.52 | 737 | 0.05 % | 3.93 | 138 | 0,04 % | 3,21 | 1,778 | 0.03 % | 3.28 | 0.00 | 63.64 | 69.50 | 93.47 | 1.58 | 161,360.33

1.4.13 List of word-level 2-grams with relative frequency 2/million or above from morphosyntactic tags in the Gigafida 2.0 corpus with text-type distribution, part-of-speech categories and collocation measures

File at CLARIN.SI: GF2.0-word_sets-morphosyntactic_tags-parts_of_speech-2grams_taksonomija-collocativity-entire.tsv

Morphosyntactic tag of string | Part-of-speech category of string | Total absolute frequency of morphosyntactic tag | Percentage of all found morphosyntactic tags | Total relative frequency (per million occurrences) | Absolute frequency [SSJ.T.K.N] | Percentage [SSJ.T.K.N] | Relative frequency [SSJ.T.K.N] | Absolute frequency [SSJ.I] | Percentage [SSJ.I] | Relative frequency [SSJ.I] | Absolute frequency [SSJ.T.K.L] | Percentage [SSJ.T.K.L] | Relative frequency [SSJ.T.K.L] | Absolute frequency [SSJ.T.D] | Percentage [SSJ.T.D] | Relative frequency [SSJ.T.D] | Absolute frequency [SSJ.T.P.R] | Percentage [SSJ.T.P.R] | Relative frequency [SSJ.T.P.R] | Absolute frequency [SSJ.T.K.S] | Percentage [SSJ.T.K.S] | Relative frequency [SSJ.T.K.S] | Absolute frequency [SSJ.T.P.C] | Percentage [SSJ.T.P.C] | Relative frequency [SSJ.T.P.C] | Dice | t-score | MI | MI3 | logDice | simple LL
Slmei Slmei | S S | 11,468,146 | 1.11 % | 10,106.82 | 0 | 0 % | 0 | 3,606,875 | 1.20 % | 11,344.56 | 77,071 | 0.21 % | 1,940.56 | 8,223 | 0.22 % | 2,078.69 | 1,361,175 | 0.78 % | 7,262.78 | 235,299 | 0,58 % | 5,480,39 | 6,179,503 | 1.21 % | 11,386.14 | 0.34 | 3,090.76 | 3.52 | 50.42 | 12.44 | 3,353,576.91
Dm Sozem | D S | 7,866,532 | 0.76 % | 6,932.73 | 137 | 1.55 % | 14,113.53 | 2,361,102 | 0.79 % | 7,426.28 | 220,881 | 0.61 % | 5,561.54 | 31,781 | 0.85 % | 8,033.89 | 1,229,483 | 0.70 % | 6,560.12 | 292,819 | 0,73 % | 6,820,10 | 3,730,329 | 0.73 % | 6,873.38 | 0.24 | 2,567.37 | 3.56 | 49.38 | 11.93 | 2,471,780.60
Vd Gp-ste-n | V G | 7,668,424 | 0.74 % | 6,758.14 | 25 | 0.28 % | 2,575.46 | 2,362,168 | 0.79 % | 7,429.63 | 334,304 | 0.92 % | 8,417.41 | 27,302 | 0.73 % | 6,901.65 | 1,164,041 | 0.66 % | 6,210.94 | 247,060 | 0,61 % | 5,754,32 | 3,533,524 | 0.69 % | 6,510.75 | 0.18 | 2,216.87 | 2.33 | 48.07 | 11.56 | -1,539,628.28
Dm Somem | D S | 6,739,074 | 0.65 % | 5,939.11 | 62 | 0.70 % | 6,387.14 | 2,165,276 | 0.72 % | 6,810.36 | 184,922 | 0.51 % | 4,656.14 | 27,738 | 0.74 % | 7,011.87 | 1,013,835 | 0.58 % | 5,409.49 | 218,534 | 0,54 % | 5,089,92 | 3,128,707 | 0.61 % | 5,764.85 | 0.21 | 2,391.82 | 3.67 | 49.04 | 11.77 | 2,466,225.68
Ppnzer Sozer | P S | 6,511,302 | 0.63 % | 5,738.38 | 23 | 0.26 % | 2,369.42 | 1,942,404 | 0.65 % | 6,109.37 | 89,917 | 0.25 % | 2,264.01 | 25,338 | 0.68 % | 6,405.17 | 912,446 | 0.52 % | 4,868.51 | 251,486 | 0,62 % | 5,857,41 | 3,289,688 | 0.64 % | 6,061.47 | 0.44 | 2,490.33 | 5.38 | 50.65 | 12.81 | 8,370,432.56
Ppnzei Sozei | P S | 5,817,920 | 0.56 % | 5,127.30 | 4 | 0.04 % | 412.07 | 1,649,613 | 0.55 % | 5,188.46 | 118,381 | 0.33 % | 2,980.71 | 21,433 | 0.57 % | 5,418.03 | 974,963 | 0.56 % | 5,202.08 | 248,258 | 0,62 % | 5,782,22 | 2,805,268 | 0.55 % | 5,168.89 | 0.39 | 2,338.28 | 5.03 | 49.98 | 12.62 | 6,343,568.66
Ppnzet Sozet | P S | 5,468,606 | 0.53 % | 4,819.45 | 45 | 0.51 % | 4,635.83 | 1,537,059 | 0.51 % | 4,834.45 | 134,952 | 0.37 % | 3,397.95 | 17,610 | 0.47 % | 4,451.62 | 929,963 | 0.53 % | 4,961.98 | 216,127 | 0,54 % | 5,033,85 | 2,632,850 | 0.51 % | 4,851.20 | 0.41 | 2,288.85 | 5.56 | 50.32 | 12.71 | 7,592,348.89
Ppnmeid Somei | P S | 5,435,462 | 0.53 % | 4,790.25 | 9 | 0.10 % | 927.17 | 1,636,025 | 0.54 % | 5,145.72 | 90,242 | 0.25 % | 2,272.20 | 20,977 | 0.56 % | 5,302.76 | 780,178 | 0.44 % | 4,162.77 | 172,946 | 0,43 % | 4,028,12 | 2,735,085 | 0.53 % | 5,039.57 | 0.32 | 2,260.03 | 5.03 | 49.78 | 12.35 | 5,921,216.00
L Rsn | L R | 5,377,236 | 0.52 % | 4,738.93 | 10 | 0.11 % | 1,030.18 | 1,327,075 | 0.44 % | 4,174 | 230,322 | 0.63 % | 5,799.26 | 14,067 | 0.38 % | 3,555.99 | 1,036,669 | 0.59 % | 5,531.33 | 163,020 | 0,41 % | 3,796,93 | 2,606,073 | 0.51 % | 4,801.86 | 0.11 | 1,470.07 | 1.45 | 46.17 | 10.84 | -2,123,891.17
Dt Sozet | D S | 5,366,321 | 0.52 % | 4,729.31 | 90 | 1.02 % | 9,271.66 | 1,569,329 | 0.52 % | 4,935.95 | 183,692 | 0.51 % | 4,625.17 | 22,167
0.59 % 5,603.58 855,018 0.49 % 4,562.10 175,856 0,44 % 4,095,90 2,560,169 0.50 % 4,717.28 0.21 2,074.81 3.26 47.97 11.73 921,523.72 Ppnmer Somer P S 4,550,535 0.44 % 4,010.36 18 0.20 % 1,854.33 1,372,750 0.46 % 4,317.66 68,236 0.19 % 1,718.11 22,177 0.59 % 5,606.11 630,609 0.36 % 3,364.72 149,136 0,37 % 3,473,55 2,307,609 0.45 % 4,251.92 0.37 2,089.25 5.60 49.84 12.55 6,431,479.36 Rsn Rsn R R 4,527,727 0.44 % 3,990.26 13 0.15 % 1,339.24 1,104,208 0.37 % 3,473.02 242,051 0.67 % 6,094.58 13,404 0.36 % 3,388.39 942,290 0.54 % 5,027.75 168,962 0,42 % 3,935,33 2,056,799 0.40 % 3,789.79 0.08 813.98 0.70 44.92 10.36 -1,567,929.29 Dt Sometn D S 4,485,712 0.43 % 3,953.24 56 0.64 % 5,769.03 1,289,447 0.43 % 4,055.65 162,894 0.45 % 4,101.49 16,527 0.44 % 4,177.85 734,055 0.42 % 3,916.68 144,857 0,36 % 3,373,89 2,137,876 0.42 % 3,939.18 0.19 1,916.84 3.40 47.59 11.61 1,053,632.30 Gp-ste-n Rsn G R 4,150,709 0.40 % 3,658 40.04 % 412.07 1,142,783 0.38 % 3,594.35 249,803 0.69 % 6,289.77 11,387 0.30 % 2,878.51 727,058 0.41 % 3,879.34 143,012 0,35 % 3,330,92 1,876,662 0.37 % 3,457.87 0.09 1,066.36 1.07 45.04 10.47 -1,673,193.35 Dm Sosem D S 4,119,372 0.40 % 3,630.38 23 0.26 % 2,369.42 1,285,990 0.43 % 4,044.77 84,166 0.23 % 2,119.21 17,468 0.47 % 4,415.72 599,552 0.34 % 3,199.01 156,256 0,39 % 3,639,39 1,975,917 0.39 % 3,640.76 0.14 1,871.04 3.68 47.63 11.17 1,526,515.49 Ppnzem Sozem P S 3,991,747 0.39 % 3,517.91 26 0.29 % 2,678.48 1,197,102 0.40 % 3,765.20 70,311 0.19 % 1,770.35 10,790 0.29 % 2,727.60 545,344 0.31 % 2,909.78 126,946 0,32 % 2,956,72 2,041,228 0.40 % 3,761.10 0.41 1,965.23 5.93 49.79 12.70 6,405,190.28 Vd Gp-stm-n V G 3,959,897 0.38 % 3,489.84 10 0.11 % 1,030.18 1,129,112 0.38 % 3,551.35 117,440 0.32 % 2,957.01 14,013 0.37 % 3,542.34 608,987 0.35 % 3,249.35 155,676 0,39 % 3,625,88 1,934,659 0.38 % 3,564.74 0.14 1,733.01 2.95 46.79 11.16 143,607.93 Vd Zp------k V Z 3,809,020 0.37 % 3,356.87 49 0.56 % 5,047.90 1,037,079 0.34 % 3,261.88 184,021 0.51 % 4,633.45 19,211 
0.51 % 4,856.33 703,671 0.40 % 3,754.56 163,327 0,41 % 3,804,08 1,701,662 0.33 % 3,135.42 0.13 1,636.25 2.63 46.35 11.03 -357,069.37 Dm Ppnzem D P 3,741,866 0.36 % 3,297.69 25 0.28 % 2,575.46 1,131,179 0.38 % 3,557.85 60,806 0.17 % 1,531.03 9,614 0.26 % 2,430.32 496,234 0.28 % 2,647.74 115,833 0,29 % 2,697,89 1,928,175 0.38 % 3,552.79 0.13 1,815.40 4.02 47.69 11.08 2,039,634.50 Gp-ste-n Ggdd-em G G 3,700,432 0.36 % 3,261.17 0 0 % 0 1,263,224 0.42 % 3,973.17 330,837 0.91 % 8,330.12 5,270 0.14 % 1,332.20 371,305 0.21 % 1,981.16 81,025 0,20 % 1,887,17 1,648,771 0.32 % 3,037.97 0.14 1,670.24 2.92 46.56 11.14 89,092.30 Vp Gp-ste-n V G 3,650,722 0.35 % 3,217.36 50.06 % 515.09 1,079,807 0.36 % 3,396.27 106,835 0.29 % 2,689.99 6,522 0.17 % 1,648.69 571,777 0.33 % 3,050.81 102,914 0,26 % 2,396,99 1,782,862 0.35 % 3,285.04 0.08 899.80 0.92 44.52 10.30 -1,419,717.99 Zp------k Gp-ste-n Z G 3,552,256 0.34 % 3,130.59 20.02 % 206.04 1,027,283 0.34 % 3,231.07 301,232 0.83 % 7,584.70 6,069 0.16 % 1,534.18 496,758 0.28 % 2,650.54 98,100 0,24 % 2,284,87 1,622,812 0.32 % 2,990.14 0.13 1,585.83 2.66 46.18 11.02 -296,221.10 Vp Rsn V R 3,453,235 0.33 % 3,043.32 49 0.56 % 5,047.90 808,812 0.27 % 2,543.92 192,450 0.53 % 4,845.68 12,080 0.32 % 3,053.69 737,784 0.42 % 3,936.57 145,297 0,36 % 3,384,14 1,556,763 0.30 % 2,868.44 0.06 389.35 0.34 43.78 9.99 -741,827.44 Vp L V L 3,398,136 0.33 % 2,994.76 70.08 % 721.13 797,858 0.27 % 2,509.47 121,495 0.34 % 3,059.11 10,445 0.28 % 2,640.38 667,647 0.38 % 3,562.34 118,694 0,29 % 2,764,52 1,681,990 0.33 % 3,099.18 0.07 800.86 0.82 44.21 10.20 -1,270,356.15 Ppnmmr Sommr P S 3,327,749 0.32 % 2,932.73 50.06 % 515.09 949,610 0.32 % 2,986.77 41,116 0.11 % 1,035.26 12,521 0.33 % 3,165.17 535,630 0.30 % 2,857.95 120,705 0,30 % 2,811,36 1,668,162 0.33 % 3,073.70 0.34 1,791.15 5.79 49.12 12.43 5,057,327.09 Ppnzmr Sozmr P S 3,297,877 0.32 % 2,906.40 11 0.12 % 1,133.20 928,616 0.31 % 2,920.74 41,781 0.12 % 1,052 13,057 0.35 % 3,300.67 543,134 0.31 % 2,897.98 
143,391 0,36 % 3,339,75 1,627,887 0.32 % 2,999.49 0.46 1,795.80 6.49 49.80 12.89 6,362,893.91
Dt Soset D S 3,034,590 0.29 % 2,674.37 25 0.28 % 2,575.46 878,036 0.29 % 2,761.65 68,983 0.19 % 1,736.92 17,943 0.48 % 4,535.80 475,322 0.27 % 2,536.16 116,532 0,29 % 2,714,17 1,477,749 0.29 % 2,722.85 0.15 1,594.55 3.56 46.63 11.25 953,037.59
Ppnmem Somem P S 2,910,604 0.28 % 2,565.10 27 0.31 % 2,781.50 875,845 0.29 % 2,754.76 51,212 0.14 % 1,289.46 9,272 0.25 % 2,343.86 404,524 0.23 % 2,158.41 88,813 0,22 % 2,068,56 1,480,911 0.29 % 2,728.68 0.39 1,685.36 6.37 49.31 12.63 5,404,840.14
Rsn Vp R V 2,894,588 0.28 % 2,550.99 16 0.18 % 1,648.30 723,494 0.24 % 2,275.58 143,712 0.40 % 3,618.51 8,158 0.22 % 2,062.25 551,050 0.31 % 2,940.22 107,582 0,27 % 2,505,71 1,360,576 0.27 % 2,506.95 0.05 96.91 0.08 43.01 9.73 -182,297.67
Gp-ste-n Dm G D 2,892,007 0.28 % 2,548.71 1 0.01 % 103.02 1,034,256 0.34 % 3,253 67,300 0.18 % 1,694.54 4,688 0.12 % 1,185.08 356,788 0.20 % 1,903.70 68,553 0,17 % 1,596,68 1,360,421 0.27 % 2,506.67 0.06 635.25 0.67 43.60 10.02 -985,804.07
Ppnmetd Sometn P S 2,799,305 0.27 % 2,467.01 18 0.20 % 1,854.33 789,853 0.26 % 2,484.29 57,156 0.16 % 1,439.13 9,564 0.26 % 2,417.68 447,124 0.25 % 2,385.71 93,032 0,23 % 2,166,83 1,402,558 0.27 % 2,584.31 0.30 1,647.90 6.05 48.89 12.28 4,686,110.94
Kag Kag K K 2,763,118 0.27 % 2,435.12 41 0.47 % 4,223.76 766,176 0.26 % 2,409.82 4,019 0.01 % 101.19 15,053 0.40 % 3,805.24 370,748 0.21 % 1,978.19 225,589 0,56 % 5,254,24 1,381,492 0.27 % 2,545.49 0.13 1,432.43 2.85 45.65 11.09 -13,508.81

1.4.14 List of word-level 3-grams with relative frequency 2/million or above from morphosyntactic tags in the Gigafida 2.0 corpus with text-type distribution, part-of-speech categories and collocation measures

File at CLARIN.SI: GF2.0-word_sets-morphosyntactic_tags-parts_of_speech-3grams_taksonomija-collocativity-entire.tsv

Morphosyntactic tag of string | Part-of-speech category of string | Total absolute frequency of morphosyntactic tag | Percentage of all found morphosyntactic tags | Total relative frequency (per million occurrences) | Absolute frequency [SSJ.T.K.N] | Percentage [SSJ.T.K.N] | Relative frequency [SSJ.T.K.N] | Absolute frequency [SSJ.I] | Percentage [SSJ.I] | Relative frequency [SSJ.I] | Absolute frequency [SSJ.T.K.L] | Percentage [SSJ.T.K.L] | Relative frequency [SSJ.T.K.L] | Absolute frequency [SSJ.T.D] | Percentage [SSJ.T.D] | Relative frequency [SSJ.T.D] | Absolute frequency [SSJ.T.P.R] | Percentage [SSJ.T.P.R] | Relative frequency [SSJ.T.P.R] | Absolute frequency [SSJ.T.K.S] | Percentage [SSJ.T.K.S] | Relative frequency [SSJ.T.K.S] | Absolute frequency [SSJ.T.P.C] | Percentage [SSJ.T.P.C] | Relative frequency [SSJ.T.P.C] | Dice | t-score | MI | MI3 | logDice | simple LL

Dm Ppnzem Sozem D P S 2,941,408 0.39 % 2,592.25 24 0.32 % 2,472.44 896,396 0.33 % 2,819.40 50,739 0.17 % 1,277.55 7,916 0.24 % 2,001.08 395,431 0.26 % 2,109.89 92,303 0,26 % 2,149,85 1,498,599 0.33 % 2,761.27 0.12 1,715.05 19.18 62.15 10.99 28,080,760.76
Slmei Slmei Slmei S S S 2,923,324 0.39 % 2,576.31 0 0 % 0 930,905 0.35 % 2,927.94 11,128 0.04 % 280.19 2,513 0.07 % 635.26 370,863 0.24 % 1,978.80 77,098 0,22 % 1,795,70 1,530,817 0.34 % 2,820.63 0.09 1,709.77 19.23 62.19 10.47 27,999,134.21
Dm Ppnmem Somem D P S 2,254,755 0.30 % 1,987.10 25 0.33 % 2,575.46 688,832 0.26 % 2,166.56 37,820 0.12 % 952.27 7,065 0.21 % 1,785.96 305,342 0.20 % 1,629.20 66,885 0,19 % 1,557,83 1,148,786 0.25 % 2,116.71 0.10 1,501.58 18.26 60.47 10.70 20,283,641.93
Somei Slmei Slmei S S S 1,544,994 0.20 % 1,361.60 0 0 % 0 548,184 0.20 % 1,724.18 6,216 0.02 % 156.51 1,294 0.04 % 327.11 147,684 0.10 % 787.99 17,996 0,05 % 419,15 823,620 0.18 % 1,517.57 0.05 1,242.97 17.72 58.84 9.65 13,391,387.44
Dt Ppnzet Sozet D P S 1,443,810 0.19 % 1,272.42 15 0.20 % 1,545.28 414,921 0.15 % 1,305.03 36,931 0.12 % 929.88 4,999 0.15 % 1,263.69 225,467 0.15 % 1,203.02 50,606 0,14 % 1,178,67 710,871 0.16 % 1,309.83 0.07
1,201.58 17.92 58.84 10.25 12,688,946.09 Kag Kag Kag K K K 1,227,700 0.16 % 1,081.97 0 0 % 0 379,112 0.14 % 1,192.41 1,581 0.01 % 39.81 9,153 0.27 % 2,313.78 198,893 0.13 % 1,061.23 133,713 0,38 % 3,114,33 505,248 0.11 % 930.95 0.06 1,108.01 18.21 58.66 9.92 11,004,345.02 Nj Nj Nj N N N 1,190,841 0.16 % 1,049.48 0 0 % 0 217,153 0.08 % 683 24,753 0.08 % 623.25 3,485 0.10 % 880.97 390,846 0.26 % 2,085.43 260,953 0,74 % 6,077,91 293,651 0.06 % 541.07 0.41 1,091.25 17.91 58.27 12.71 10,456,085.00 Dm Ppnsem Sosem D P S 1,187,835 0.16 % 1,046.83 10 0.13 % 1,030.18 359,721 0.13 % 1,131.42 16,442 0.06 % 413.99 4,498 0.14 % 1,137.05 157,167 0.10 % 838.59 41,732 0,12 % 971,99 608,265 0.13 % 1,120.77 0.06 1,089.87 17.34 57.70 9.91 10,024,448.13 Slmei Slmei Vp S S V 983,455 0.13 % 866.71 0 0 % 0 318,757 0.12 % 1,002.57 7,457 0.03 % 187.76 492 0.01 % 124.37 121,661 0.08 % 649.14 14,956 0,04 % 348,34 520,132 0.11 % 958.38 0.02 991.69 17.33 57.14 8.63 8,293,229.64 Do Ppnzeo Sozeo D P S 976,964 0.13 % 860.99 28 0.37 % 2,884.52 265,002 0.10 % 833.50 24,319 0.08 % 612.33 3,853 0.12 % 974 177,150 0.12 % 945.21 39,132 0,11 % 911,43 467,480 0.10 % 861.36 0.11 988.41 18.66 58.46 10.76 9,023,273.48 Dm Ppnzmm Sozmm D P S 865,981 0.12 % 763.18 0 0 % 0250,736 0.09 % 788.63 13,661 0.04 % 343.97 2,651 0.08 % 670.14 133,094 0.09 % 710.15 33,917 0,10 % 789,97 431,922 0.10 % 795.84 0.05 930.57 16.88 56.33 9.54 7,070,534.61 Ppnmeid Somei Slmei P S S 862,717 0.11 % 760.31 0 0 % 0346,414 0.13 % 1,089.56 3,529 0.01 % 88.86 617 0.02 % 155.97 76,172 0.05 % 406.43 10,301 0,03 % 239,92 425,684 0.09 % 784.35 0.04 928.83 20.65 60.08 9.29 8,998,353.81 Slmei Slmei Kag S S K 849,759 0.11 % 748.89 0 0 % 0180,487 0.07 % 567.68 784 0 % 19.74 616 0.02 % 155.72 52,049 0.03 % 277.72 25,820 0,07 % 601,38 590,003 0.13 % 1,087.12 0.03 921.82 16.91 56.30 8.89 6,952,001.20 Ppnzer Ppnzer Sozer P P S 844,313 0.11 % 744.09 30.04 % 309.06 268,069 0.10 % 843.15 7,069 0.02 % 177.99 2,566 0.08 % 648.66 102,275 0.07 % 545.71 
24,647 0,07 % 574,06 439,684 0.10 % 810.15 0.07 918.86 18.89 58.27 10.09 7,914,979.98 Dt Ppnmetd Sometn D P S 840,964 0.11 % 741.14 40.05 % 412.07 249,643 0.09 % 785.19 17,501 0.06 % 440.66 3,248 0.10 % 821.06 123,885 0.08 % 661.01 24,789 0,07 % 577,37 421,894 0.09 % 777.37 0.05 917.03 16.84 56.20 9.69 6,844,864.44 Vd Zp------k Gp-ste-n V Z G 812,855 0.11 % 716.36 20.03 % 206.04 236,787 0.09 % 744.76 59,934 0.20 % 1,509.07 1,734 0.05 % 438.34 118,306 0.08 % 631.24 24,348 0,07 % 567,09 371,744 0.08 % 684.96 0.02 901.58 19.01 58.28 8.65 7,678,223.78 Slmei Slmei Gp-ste-n S S G 796,483 0.11 % 701.94 0 0 % 0 318,713 0.12 % 1,002.44 9,864 0.03 % 248.36 286 0.01 % 72.30 76,972 0.05 % 410.70 8,110 0,02 % 188,89 382,538 0.08 % 704.85 0.02 892.45 16.76 55.97 8.51 6,445,225.25 Ppnzei Ppnzei Sozei P P S 792,635 0.10 % 698.55 10.01 % 103.02 270,181 0.10 % 849.79 11,228 0.04 % 282.71 2,304 0.07 % 582.43 113,989 0.07 % 608.21 27,410 0,08 % 638,41 367,522 0.08 % 677.18 0.06 890.30 18.68 57.87 9.92 7,327,156.59 Dr Ppnzer Sozer D P S 777,910 0.10 % 685.57 40.05 % 412.07 231,452 0.09 % 727.98 19,045 0.06 % 479.53 2,707 0.08 % 684.30 117,784 0.08 % 628.46 29,780 0,09 % 693,61 377,138 0.08 % 694.90 0.05 881.98 16.73 55.87 9.65 6,278,988.16 Dt Ppnset Soset D P S 758,272 0.10 % 668.26 30.04 % 309.06 217,493 0.08 % 684.07 12,907 0.04 % 324.98 2,897 0.09 % 732.33 113,149 0.07 % 603.73 27,682 0,08 % 644,75 384,141 0.08 % 707.81 0.05 870.78 16.69 55.76 9.74 6,103,637.98 Do Ppnmeo Someo D P S 756,563 0.10 % 666.76 14 0.18 % 1,442.26 212,930 0.08 % 669.72 23,242 0.08 % 585.21 2,476 0.07 % 625.91 135,397 0.09 % 722.43 25,930 0,07 % 603,94 356,574 0.08 % 657.01 0.08 869.80 16.69 55.75 10.41 6,088,398.84 Dm Ppnmmm Sommm D P S 754,369 0.10 % 664.82 10.01 % 103.02 220,300 0.08 % 692.90 11,555 0.04 % 290.94 2,727 0.08 % 689.36 119,302 0.08 % 636.56 26,117 0,07 % 608,30 374,367 0.08 % 689.80 0.04 868.54 17.63 56.68 9.37 6,496,918.22 Ppnzer Sozer Vp P S V 720,444 0.10 % 634.92 30.04 % 309.06 199,270 
0.07 % 626.76 11,280 0.04 % 284.02 3,581 0.11 % 905.24 112,294 0.07 % 599.16 32,224 0,09 % 750,54 361,792 0.08 % 666.63 0.03 848.78 16.62 55.53 8.71 5,767,122.39 Slmei Kag Slmei S K S 702,314 0.09 % 618.95 0 0 % 0106,367 0.04 % 334.55 190 0 % 4.78 90 0 % 22.75 18,446 0.01 % 98.42 7,163 0,02 % 166,83 570,058 0.12 % 1,050.37 0.02 838.03 16.64 55.48 8.61 5,629,481.47 Kag Slmei Kag K S K 692,489 0.09 % 610.29 0 0 % 0103,443 0.04 % 325.36 158 0 % 3.98 199 0.01 % 50.31 14,291 0.01 % 76.25 7,470 0,02 % 173,99 566,928 0.12 % 1,044.60 0.03 832.16 18.33 57.14 8.82 6,258,298.43 Vd Gp-ste-n Rsn V G R 678,967 0.09 % 598.37 20.03 % 206.04 190,993 0.07 % 600.72 31,265 0.10 % 787.22 2,577 0.08 % 651.44 116,010 0.08 % 618.99 21,844 0,06 % 508,77 316,276 0.07 % 582.76 0.01 823.99 16.53 55.28 7.90 5,400,132.74 Ppnmeid Ppnmeid Somei P P S 670,204 0.09 % 590.65 0 0 % 0245,228 0.09 % 771.31 5,549 0.02 % 139.72 1,601 0.05 % 404.72 78,654 0.05 % 419.67 14,473 0,04 % 337,09 324,699 0.07 % 598.28 0.05 818.65 16.86 55.57 9.65 5,462,312.27 Ppnzet Sozet Vp P S V 642,920 0.09 % 566.60 10 0.13 % 1,030.18 171,382 0.06 % 539.04 18,420 0.06 % 463.80 2,541 0.08 % 642.34 115,528 0.07 % 616.42 27,250 0,08 % 634,68 307,789 0.07 % 567.12 0.02 801.81 16.45 55.04 8.60 5,082,971.58 Slmei Vp Slmei S V S 627,364 0.08 % 552.89 0 0 % 0182,209 0.07 % 573.09 9,350 0.03 % 235.42 451 0.01 % 114.01 87,894 0.06 % 468.97 18,343 0,05 % 427,23 329,117 0.07 % 606.42 0.02 792.06 16.68 55.20 7.98 5,045,441.08 Vd Gp-ste-n Dm V G D 614,986 0.08 % 541.98 10.01 % 103.02 207,192 0.08 % 651.67 12,535 0.04 % 315.62 1,713 0.05 % 433.03 80,528 0.05 % 429.67 14,661 0,04 % 341,47 298,356 0.07 % 549.74 0.01 784.21 20.62 59.08 7.81 6,405,400.55 Slmei Slmei Vd S S V 612,714 0.08 % 539.98 0 0 % 0224,585 0.08 % 706.38 3,351 0.01 % 84.37 208 0.01 % 52.58 73,662 0.05 % 393.04 5,827 0,02 % 135,72 305,081 0.07 % 562.13 0.02 782.75 16.68 55.13 8.08 4,926,741.85 Ppnzet Ppnzet Sozet P P S 595,485 0.08 % 524.80 20.03 % 206.04 167,958 0.06 % 
528.27 12,508 0.04 % 314.94 1,798 0.05 % 454.52 100,334 0.07 % 535.35 21,706 0,06 % 505,56 291,179 0.06 % 536.52 0.05 771.67 16.34 54.71 9.78 4,668,305.37

1.4.15 List of word-level 4-grams with relative frequency 2/million or above from morphosyntactic tags in the Gigafida 2.0 corpus with text-type distribution, part-of-speech categories and collocation measures

File at CLARIN.SI: GF2.0-word_sets-morphosyntactic_tags-parts_of_speech-4grams_taksonomija-collocativity-entire.tsv

Morphosyntactic tag of string | Part-of-speech category of string | Total absolute frequency of morphosyntactic tag | Percentage of all found morphosyntactic tags | Total relative frequency (per million occurrences) | Absolute frequency [SSJ.T.K.N] | Percentage [SSJ.T.K.N] | Relative frequency [SSJ.T.K.N] | Absolute frequency [SSJ.I] | Percentage [SSJ.I] | Relative frequency [SSJ.I] | Absolute frequency [SSJ.T.K.L] | Percentage [SSJ.T.K.L] | Relative frequency [SSJ.T.K.L] | Absolute frequency [SSJ.T.D] | Percentage [SSJ.T.D] | Relative frequency [SSJ.T.D] | Absolute frequency [SSJ.T.P.R] | Percentage [SSJ.T.P.R] | Relative frequency [SSJ.T.P.R] | Absolute frequency [SSJ.T.K.S] | Percentage [SSJ.T.K.S] | Relative frequency [SSJ.T.K.S] | Absolute frequency [SSJ.T.P.C] | Percentage [SSJ.T.P.C] | Relative frequency [SSJ.T.P.C] | Dice | t-score | MI | MI3 | logDice | simple LL

Slmei Slmei Slmei Slmei S S S S 1,322,238 0.43 % 1,165.28 0 0 % 0 393,121 0.20 % 1,236.47 4,390 0.02 % 110.54 1,276 0.05 % 322.56 170,043 0.17 % 907.29 35,537 0,15 % 827,70 717,871 0.22 % 1,322.72 0.04 1,149.89 47.57 88.24 9.33 35,227,213.17
Nj Nj Nj Nj N N N N 833,890 0.28 % 734.90 0 0 % 0 137,594 0.07 % 432.77 16,451 0.09 % 414.22 2,421 0.10 % 612 291,629 0.28 % 1,556.04 204,983 0,87 % 4,774,30 180,812 0.06 % 333.16 0.29 913.18 50.95 90.29 12.20 23,911,608.85
Kag Kag Kag Kag K K K K 701,671 0.23 % 618.38 0 0 % 0 230,550 0.12 % 725.14 1,200 0.01 % 30.21 7,525 0.32 % 1,902.24 144,678 0.14 % 771.95
107,050 0,45 % 2,493,32 210,668 0.07 % 388.17 0.03 837.66 46.66 85.50 9.11 18,307,831.33 Ppnmeid Somei Slmei Slmei P S S S 628,307 0.21 % 553.72 0 0 % 0272,841 0.14 % 858.16 1,164 0.01 % 29.31 268 0.01 % 67.75 44,261 0.04 % 236.16 4,918 0,02 % 114,55 304,855 0.10 % 561.72 0.02 792.66 47.44 85.96 8.66 16,687,483.45 Slmei Kag Slmei Kag S K S K 422,227 0.14 % 372.11 0 0 % 0 51,850 0.03 % 163.08 64 0 % 1.61 38 0 % 9.61 6,261 0.01 % 33.41 2,752 0,01 % 64,10 361,262 0.11 % 665.65 0.02 649.79 45.93 83.30 7.99 10,830,369.53 Kag Slmei Kag Slmei K S K S 369,891 0.12 % 325.98 0 0 % 0 47,612 0.03 % 149.75 36 0 % 0.91 27 0 % 6.83 4,471 0 % 23.86 1,515 0,01 % 35,29 316,230 0.10 % 582.67 0.01 608.19 45.74 82.73 7.80 9,445,403.59 Slmei Slmei Vp Slmei S S V S 362,766 0.12 % 319.70 0 0 % 0 113,215 0.06 % 356.09 1,999 0.01 % 50.33 188 0.01 % 47.52 44,307 0.04 % 236.41 6,927 0,03 % 161,34 196,130 0.06 % 361.38 0.01 602.30 45.71 82.65 7.25 9,257,333.45 Slmei Vp Slmei Slmei S V S S 326,417 0.11 % 287.67 0 0 % 0106,643 0.06 % 335.42 1,417 0.01 % 35.68 150 0.01 % 37.92 38,103 0.04 % 203.31 5,598 0,02 % 130,38 174,506 0.05 % 321.54 0.01 571.33 45.56 82.19 7.10 8,299,817.62 Dm Ppnzem Ppnzem Sozem D P P S 325,355 0.11 % 286.73 0 0 % 0100,622 0.05 % 316.48 4,377 0.02 % 110.21 595 0.03 % 150.41 37,094 0.04 % 197.92 7,614 0,03 % 177,34 175,053 0.05 % 322.55 0.02 570.40 46.05 82.67 8.13 8,369,252.58 Slmei Slmei Vd Gp-ste-n S S V G 290,057 0.10 % 255.63 0 0 % 0 113,467 0.06 % 356.88 1,386 0.01 % 34.90 87 0 % 21.99 27,362 0.03 % 145.99 2,366 0,01 % 55,11 145,389 0.04 % 267.89 0.01 538.57 48.05 84.34 6.98 7,810,815.07 Slmei Slmei Slmei Kag S S S K 257,629 0.09 % 227.05 0 0 % 0 81,846 0.04 % 257.43 268 0 % 6.75 176 0.01 % 44.49 13,382 0.01 % 71.40 8,953 0,04 % 208,53 153,004 0.05 % 281.92 0.01 507.57 45.21 81.16 7.11 6,497,785.30 Slmei Slmei Slmei Vp S S S V 253,137 0.08 % 223.09 0 0 % 0 83,823 0.04 % 263.65 1,188 0.01 % 29.91 99 0 % 25.03 34,281 0.03 % 182.91 4,376 0,02 % 101,92 129,370 0.04 % 
238.37 0.01 503.13 45.19 81.09 6.73 6,380,622.91 Dm Ppnzem Sozem Vp D P S V 244,745 0.08 % 215.69 40.08 % 412.07 68,877 0.04 % 216.64 5,377 0.03 % 135.39 975 0.04 % 246.47 37,415 0.04 % 199.63 9,078 0,04 % 211,44 123,019 0.04 % 226.67 0.01 494.72 46.40 82.20 6.99 6,347,879.83 Dm Ppnzem Sozem Dm D P S D 235,077 0.08 % 207.17 20.04 % 206.04 74,380 0.04 % 233.94 2,925 0.02 % 73.65 422 0.02 % 106.68 23,232 0.02 % 123.96 4,886 0,02 % 113,80 129,230 0.04 % 238.11 0.01 484.85 45.08 80.77 6.97 5,910,285.53 Slmei Slmei Kag Kag S S K K 223,810 0.07 % 197.24 0 0 % 0 56,413 0.03 % 177.43 150 0 % 3.78 326 0.01 % 82.41 8,586 0.01 % 45.81 7,631 0,03 % 177,74 150,704 0.05 % 277.68 0.01 473.09 45.01 80.55 7.07 5,617,463.57 Dm Ppnmem Ppnmem Somem D P P S 214,687 0.07 % 189.20 0 0 % 0 64,663 0.03 % 203.38 2,640 0.01 % 66.47 533 0.02 % 134.74 26,686 0.03 % 142.39 5,939 0,03 % 138,33 114,226 0.04 % 210.47 0.01 463.34 46.46 81.88 7.65 5,575,744.11 Somei Slmei Slmei Slmei S S S S 213,549 0.07 % 188.20 0 0 % 0 77,642 0.04 % 244.20 589 0 % 14.83 114 0.01 % 28.82 25,561 0.03 % 136.39 3,838 0,02 % 89,39 105,805 0.03 % 194.95 0.01 462.11 44.94 80.35 6.77 5,351,215.06 Slmei Slmei Kag Slmei S S K S 212,802 0.07 % 187.54 0 0 % 0 42,049 0.02 % 132.26 79 0 % 1.99 16 0 % 4.04 6,283 0.01 % 33.52 2,986 0,01 % 69,55 161,389 0.05 % 297.37 0.01 461.30 44.94 80.34 6.84 5,331,848.67 Slmei Kag Slmei Slmei S K S S 202,193 0.07 % 178.19 0 0 % 0 41,679 0.02 % 131.09 63 0 % 1.59 22 0 % 5.56 6,444 0.01 % 34.38 2,686 0,01 % 62,56 151,299 0.05 % 278.78 0.01 449.66 44.86 80.12 6.76 5,057,054.21 Kag Slmei Slmei Kag K S S K 198,386 0.07 % 174.84 0 0 % 0 28,848 0.01 % 90.73 41 0 % 1.03 37 0 % 9.35 4,688 0.01 % 25.01 2,914 0,01 % 67,87 161,858 0.05 % 298.23 0.01 445.41 44.84 80.03 6.90 4,958,561.85 Ppnmeid Ppnmeid Somei Slmei P P S S 196,309 0.07 % 173.01 0 0 % 0104,321 0.05 % 328.12 284 0 % 7.15 57 0 % 14.41 11,870 0.01 % 63.33 1,253 0,01 % 29,18 78,524 0.02 % 144.69 0.01 443.07 44.82 79.99 7.43 4,904,853.66 Gp-ste-n 
Dm Ppnzem Sozem G D P S 184,055 0.06 % 162.21 10.02 % 103.02 68,844 0.04 % 216.53 2,717 0.01 % 68.41 194 0.01 % 49.04 18,425 0.02 % 98.31 3,533 0,01 % 82,29 90,341 0.03 % 166.46 0.01 429.02 45.33 80.31 6.76 4,654,652.47 Dm Ppnmem Somem Vp D P S V 182,654 0.06 % 160.97 20.04 % 206.04 53,044 0.03 % 166.84 3,706 0.02 % 93.31 711 0.03 % 179.73 27,193 0.03 % 145.09 6,110 0,03 % 142,31 91,888 0.03 % 169.31 0.01 427.38 44.72 79.68 6.62 4,552,240.22 Slzei Slzei Slzei Slzei S S S S 182,624 0.06 % 160.95 0 0 % 0 51,314 0.03 % 161.40 303 0 % 7.63 499 0.02 % 126.14 23,933 0.02 % 127.70 5,215 0,02 % 121,46 101,360 0.03 % 186.76 0.02 427.35 44.72 79.67 8.22 4,551,466.48 Somei Slmei Slmei Gp-ste-n S S S G 182,572 0.06 % 160.90 0 0 % 0 78,993 0.04 % 248.45 834 0 % 21 79 0 % 19.97 12,458 0.01 % 66.47 1,476 0,01 % 34,38 88,732 0.03 % 163.49 0.01 427.28 44.97 79.92 6.48 4,577,598.59 Dm Ppnmem Somem Dm D P S D 178,257 0.06 % 157.10 0 0 % 0 56,785 0.03 % 178.60 1,913 0.01 % 48.17 395 0.02 % 99.85 18,381 0.02 % 98.07 3,224 0,01 % 75,09 97,559 0.03 % 179.76 0.01 422.20 45.78 80.67 6.63 4,556,703.01 Dt Ppnzet Sozet Vp D P S V 176,202 0.06 % 155.29 50.10 % 515.09 47,655 0.03 % 149.89 5,240 0.03 % 131.94 796 0.03 % 201.22 29,255 0.03 % 156.10 6,692 0,03 % 155,86 86,559 0.03 % 159.49 0.01 419.76 44.67 79.52 6.67 4,385,934.66 Gp-ste-n Ggdd-em Slmei Slmei G G S S 172,459 0.06 % 151.99 0 0 % 0 51,600 0.03 % 162.30 2,378 0.01 % 59.88 64 0 % 16.18 12,286 0.01 % 65.55 1,681 0,01 % 39,15 104,450 0.03 % 192.46 0.01 415.28 45.16 79.95 6.54 4,344,406.15 Slmei Kag Kag Kag S K K K 167,759 0.06 % 147.85 0 0 % 0 46,626 0.02 % 146.65 92 0 % 2.32 173 0.01 % 43.73 6,623 0.01 % 35.34 6,438 0,03 % 149,95 107,807 0.03 % 198.64 0.01 409.58 47.89 82.60 6.84 4,501,313.42 Kag Kag Slmei Kag K K S K 162,476 0.05 % 143.19 0 0 % 0 32,563 0.02 % 102.42 37 0 % 0.93 49 0 % 12.39 3,023 0 % 16.13 2,106 0,01 % 49,05 124,698 0.04 % 229.76 0.01 403.08 47.84 82.46 6.79 4,355,044.10 Slmei Slmei Gp-ste-n Ggdd-em S S G G 161,946 
0.05 % 142.72 0 0 % 0 65,477 0.03 % 205.94 2,491 0.01 % 62.72 31 0 % 7.84 10,596 0.01 % 56.54 1,267 0,01 % 29,51 82,084 0.03 % 151.25 0.01 402.43 45.07 79.68 6.45 4,070,726.37
Kag Slmei Kag Kag K S K K 159,487 0.05 % 140.56 0 0 % 0 33,377 0.02 % 104.98 51 0 % 1.28 82 0 % 20.73 3,296 0 % 17.59 2,546 0,01 % 59,30 120,135 0.04 % 221.36 0.01 399.36 47.82 82.38 6.76 4,272,354.08

1.4.16 List of word-level 5-grams with relative frequency 2/million or above from morphosyntactic tags in the Gigafida 2.0 corpus with text-type distribution, part-of-speech categories and collocation measures

File at CLARIN.SI: GF2.0-word_sets-morphosyntactic_tags-parts_of_speech-5grams_taksonomija-collocativity-entire.tsv

Morphosyntactic tag of string | Part-of-speech category of string | Total absolute frequency of morphosyntactic tag | Percentage of all found morphosyntactic tags | Total relative frequency (per million occurrences) | Absolute frequency [SSJ.T.K.N] | Percentage [SSJ.T.K.N] | Relative frequency [SSJ.T.K.N] | Absolute frequency [SSJ.I] | Percentage [SSJ.I] | Relative frequency [SSJ.I] | Absolute frequency [SSJ.T.K.L] | Percentage [SSJ.T.K.L] | Relative frequency [SSJ.T.K.L] | Absolute frequency [SSJ.T.D] | Percentage [SSJ.T.D] | Relative frequency [SSJ.T.D] | Absolute frequency [SSJ.T.P.R] | Percentage [SSJ.T.P.R] | Relative frequency [SSJ.T.P.R] | Absolute frequency [SSJ.T.K.S] | Percentage [SSJ.T.K.S] | Relative frequency [SSJ.T.K.S] | Absolute frequency [SSJ.T.P.C] | Percentage [SSJ.T.P.C] | Relative frequency [SSJ.T.P.C] | Dice | t-score | MI | MI3 | logDice | simple LL

Slmei Slmei Slmei Slmei Slmei S S S S S 761,084 1.41 % 670.74 0 0 % 0 224,979 0.25 % 707.62 2,511 0.04 % 63.22 747 0.07 % 188.83 93,209 0.22 % 497.33 18,811 0,19 % 438,13 420,827 0.29 % 775.40 0.02 872.40 76.86 115.93 8.53 33,694,798.82
Nj Nj Nj Nj Nj N N N N N 628,969 1.17 % 554.31 0 0 % 0 94,742 0.11 % 297.99 11,723 0.17 % 295.17 1,783 0.17 % 450.72 232,078 0.56 % 1,238.29
165,200 1,70 % 3,847,70 123,443 0.09 % 227.45 0.22 793.08 77.83 116.35 11.79 28,213,027.24 Kag Kag Kag Kag Kag K K K K K 433,955 0.81 % 382.44 00 % 0 115,858 0.13 % 364.40 999 0.01 % 25.15 6,566 0.63 % 1,659.81 114,535 0.27 % 611.12 91,691 0,94 % 2,135,59 104,306 0.07 % 192.19 0.02 658.75 76.05 113.50 8.42 19,000,346.39 Kag Slmei Kag Slmei Kag K S K S K 296,434 0.55 % 261.25 00 % 0 32,246 0.04 % 101.42 30 0 % 0.76 16 0 % 4.04 3,145 0.01 % 16.78 993 0,01 % 23,13 260,004 0.18 % 479.07 0.01 544.46 75.81 112.16 7.55 12,936,306.40 Slmei Kag Slmei Kag Slmei S K S K S 289,485 0.54 % 255.12 00 % 0 33,951 0.04 % 106.78 30 0 % 0.76 14 0 % 3.54 3,050 0.01 % 16.27 1,016 0,01 % 23,66 251,424 0.17 % 463.27 0.01 538.04 76.18 112.46 7.38 12,697,543.26 Slmei Slmei Vp Slmei Slmei S S V S S 274,189 0.51 % 241.64 00 % 0 90,655 0.10 % 285.13 885 0.01 % 22.28 113 0.01 % 28.57 29,331 0.07 % 156.50 3,705 0,04 % 86,29 149,500 0.10 % 275.46 0.01 523.63 75.38 111.51 6.89 11,895,785.86 Ppnmeid Ppnmeid Somei Slmei Slmei P P S S S 154,483 0.29 % 136.15 00 % 0 87,541 0.10 % 275.34 147 0 % 3.70 22 0 % 5.56 7,408 0.02 % 39.53 673 0,01 % 15,67 58,692 0.04 % 108.14 0.01 393.04 74.56 109.03 6.87 6,625,314.24 Slmei Slmei Slmei Slmei Vp S S S S V 143,574 0.27 % 126.53 00 % 0 45,364 0.05 % 142.68 480 0.01 % 12.09 61 0.01 % 15.42 18,690 0.04 % 99.72 2,102 0,02 % 48,96 76,877 0.05 % 141.65 0.00 378.91 74.45 108.71 5.95 6,148,327.12 Slmei Slmei Slmei Vp Slmei S S S V S 135,552 0.25 % 119.46 00 % 0 43,595 0.05 % 137.12 501 0.01 % 12.61 48 0.01 % 12.13 16,404 0.04 % 87.53 2,409 0,03 % 56,11 72,595 0.05 % 133.76 0.00 368.17 74.37 108.46 5.87 5,798,028.36 Slmei Slmei Kag Slmei Slmei S S K S S 120,063 0.22 % 105.81 00 % 0 24,833 0.03 % 78.11 48 0 % 1.21 6 0 % 1.52 3,355 0.01 % 17.90 1,682 0,02 % 39,18 90,139 0.06 % 166.09 0.00 346.50 74.19 107.94 5.98 5,122,856.34 Slmei Kag Slmei Slmei Kag S K S S K 111,215 0.21 % 98.01 00 % 0 17,155 0.02 % 53.96 13 0 % 0.33 6 0 % 1.52 2,327 0.01 % 12.42 1,118 0,01 % 26,04 
90,596 0.06 % 166.93 0.00 333.49 74.80 108.32 6.00 4,785,759.43 Ppnmeid Somei Slmei Slmei Gp-ste-n P S S S G 102,629 0.19 % 90.45 00 % 0 50,109 0.06 % 157.61 150 0 % 3.78 20 0 % 5.06 5,098 0.01 % 27.20 571 0,01 % 13,30 46,681 0.03 % 86.01 0.00 320.36 73.97 107.26 5.89 4,364,995.14 Kag Kag Slmei Kag Kag K K S K K 102,325 0.19 % 90.18 00 % 0 21,945 0.03 % 69.02 26 0 % 0.65 23 0 % 5.81 1,755 0 % 9.36 1,436 0,01 % 33,45 77,140 0.05 % 142.14 0.00 319.88 74.05 107.33 6.16 4,357,055.30 Slmei Kag Kag Slmei Kag S K K S K 101,584 0.19 % 89.53 00 % 0 24,173 0.03 % 76.03 4 0 % 0.10 7 0 % 1.77 1,246 0 % 6.65 656 0,01 % 15,28 75,498 0.05 % 139.11 0.00 318.72 74.26 107.53 6.00 4,338,606.63 Slzei Slzei Slzei Slzei Slzei S S S S S 96,697 0.18 % 85.22 00 % 0 25,017 0.03 % 78.68 109 0 % 2.74 302 0.03 % 76.34 13,317 0.03 % 71.06 3,595 0,04 % 83,73 54,357 0.04 % 100.16 0.01 310.96 75.44 108.56 7.31 4,198,363.65 Kag Slmei Slmei Kag Slmei K S S K S 94,377 0.17 % 83.17 00 % 0 13,844 0.02 % 43.54 10 0 % 0.25 4 0 % 1.01 1,665 0 % 8.88 922 0,01 % 21,47 77,932 0.05 % 143.59 0.00 307.21 74.56 107.61 5.76 4,047,735.71 Slmei Slmei Slmei Slmei Kag S S S S K 87,343 0.16 % 76.97 00 % 0 25,338 0.03 % 79.69 117 0 % 2.95 35 0 % 8.85 4,444 0.01 % 23.71 4,105 0,04 % 95,61 53,304 0.04 % 98.22 0.00 295.54 73.73 106.56 5.52 3,702,618.83 Kag Slmei Kag Kag Slmei K S K K S 87,298 0.16 % 76.94 00 % 0 21,751 0.02 % 68.41 9 0 % 0.23 11 0 % 2.78 803 0 % 4.28 401 0 % 9,34 64,323 0.04 % 118.52 0.00 295.46 74.04 106.87 5.78 3,716,965.96 Slmei Slmei Slmei Kag Kag S S S K K 82,037 0.15 % 72.30 00 % 0 33,904 0.04 % 106.64 48 0 % 1.21 87 0.01 % 21.99 2,504 0.01 % 13.36 2,303 0,02 % 53,64 43,191 0.03 % 79.58 0.00 286.42 74.36 107.00 5.56 3,508,500.43 Slmei Slmei Kag Kag Kag S S K K K 75,974 0.14 % 66.96 00 % 0 23,327 0.03 % 73.37 32 0 % 0.81 39 0 % 9.86 2,084 0.01 % 11.12 2,055 0,02 % 47,86 48,437 0.03 % 89.25 0.00 275.63 73.84 106.27 5.58 3,225,645.32 Slmei Kag Kag Kag Kag S K K K K 72,479 0.14 % 63.88 00 % 0 33,537 
0.04 % 105.48 64 0 % 1.61 98 0.01 % 24.77 2,952 0.01 % 15.75 2,685 0,03 % 62,54 33,143 0.02 % 61.07 0.00 269.22 73.55 105.84 5.67 3,064,485.76 Slmei Slmei Gp-ste-n Ggdd-em Vd S S G G V 68,753 0.13 % 60.59 00 % 0 28,149 0.03 % 88.54 251 0 % 6.32 10 0 % 2.53 2,426 0.01 % 12.94 227 0 % 5,29 37,690 0.03 % 69.45 0.00 262.21 73.39 105.53 5.10 2,900,265.12 Slmei Slmei Slmei Kag Slmei S S S K S 67,060 0.12 % 59.10 00 % 0 20,172 0.02 % 63.45 34 0 % 0.86 2 0 % 0.51 1,900 0.01 % 10.14 1,090 0,01 % 25,39 43,862 0.03 % 80.82 0.00 258.96 73.35 105.42 5.14 2,827,395.62 Slmei Slmei Kag Kav Slmei S S K K S 66,425 0.12 % 58.54 00 % 0 1,426 0 % 4.49 0 0 % 0 0 0 % 0 1,059 0 % 5.65 15 0 % 0,35 63,925 0.04 % 117.79 0.00 257.73 73.34 105.38 5.43 2,800,073.71 Kag Kav Slmei Slmei Kag K K S S K 64,096 0.12 % 56.49 00 % 0 720 0 % 2.26 0 0 % 0 0 0 % 0 777 0 % 4.15 15 0 % 0,35 62,584 0.04 % 115.32 0.00 253.17 73.29 105.22 5.53 2,699,910.18 Somei Slmei Slmei Vd Gp-ste-n S S S V G 63,459 0.12 % 55.93 00 % 0 25,018 0.03 % 78.69 188 0 % 4.73 24 0 % 6.07 5,110 0.01 % 27.27 495 0,01 % 11,53 32,624 0.02 % 60.11 0.00 251.91 73.27 105.18 4.87 2,672,527.36 Slmei Slmei Kag Slmei Kag S S K S K 62,330 0.12 % 54.93 00 % 0 12,188 0.01 % 38.33 12 0 % 0.30 7 0 % 1.77 1,200 0 % 6.40 620 0,01 % 14,44 48,303 0.03 % 89 0.00 249.66 73.96 105.82 5.16 2,650,811.87 Ppnmeid Somei Slmei Slmei Vd P S S S V 60,970 0.11 % 53.73 00 % 0 25,297 0.03 % 79.57 91 0 % 2.29 14 0 % 3.54 4,357 0.01 % 23.25 342 0 % 7,97 30,869 0.02 % 56.88 0.00 246.92 73.27 105.06 5.11 2,567,594.19 Ggdd-em Ppnmeid Somei Slmei Slmei G P S S S 60,794 0.11 % 53.58 00 % 0 25,176 0.03 % 79.19 81 0 % 2.04 37 0 % 9.35 3,340 0.01 % 17.82 608 0,01 % 14,16 31,552 0.02 % 58.14 0.00 246.56 73.21 104.99 5.43 2,558,027.43 Ppnmeid Somei Slmei Slmei Slmei P S S S S 59,639 0.11 % 52.56 00 % 0 25,190 0.03 % 79.23 118 0 % 2.97 25 0 % 6.32 5,760 0.01 % 30.73 725 0,01 % 16,89 27,821 0.02 % 51.26 0.00 244.21 73.18 104.91 5.18 2,508,434.89 Somei Slmei Slmei Slmei Slmei S S 
S S S 59,610 0.11 % 52.53 00 % 0 17,548 0.02 % 55.19 150 0 % 3.78 48 0.01 % 12.13 8,666 0.02 % 46.24 1,206 0,01 % 28,09 31,992 0.02 % 58.95 0.00 244.15 75.44 107.17 4.91 2,588,237.55 Slmei Kag Slmei Slmei Slmei S K S S S 59,549 0.11 % 52.48 00 % 0 17,243 0.02 % 54.23 30 0 % 0.76 6 0 % 1.52 1,864 0 % 9.95 783 0,01 % 18,24 39,623 0.03 % 73.01 0.00 244.03 73.18 104.90 4.97 2,504,571.35

2. Frequency lists from the GOS 1.0 Corpus of Spoken Slovene

GOS is the reference corpus of spoken Slovene. It contains transcriptions of approximately 120 hours of recorded speech in a range of situations: radio and television shows, school lessons, university lectures, private conversations between friends or family members, work meetings, consultations, conversations during sales or services, etc. The transcriptions contain more than a million words and were made in two versions (standardized and colloquial). The corpus was compiled as part of the Communication in Slovene project.

The corpus serves as a basis for researching spoken Slovene from a linguistic, sociological, or computational perspective. The GOS online interface, which allows access to the speech recordings, can be used in Slovene language lessons, in language courses for non-native speakers, or in similar situations where information on modern spoken Slovene is useful. The corpus is available under open access as a database at the CLARIN.SI repository (Zwitter Vitez et al. 2013).

The speech recordings included in the GOS corpus were collected in the manner that best ensures the representativeness of modern spoken Slovene in the most frequent everyday situations. The second sampling criterion was the representativeness of speakers: the part of the corpus with recordings of private conversations thus contains an adequate share of speakers from different regions and of different genders, ages, and educational backgrounds. More information on the structure and the compilation process of the corpus is available in the monograph by Verdonik and Zwitter Vitez (2011, in Slovene).

The corpus as a database contains speech recordings, colloquial speech transcriptions (e.g. “d bi loh”), and standardized speech transcriptions (e.g. “da bi lahko”). The standardized speech transcriptions have been automatically tagged with lemmas and morphosyntactic descriptors. The recordings include metadata on the situation in which the recording was made and on the speakers included in the recording.

When preparing frequency lists from the GOS corpus, we took into account lemmas, colloquial word forms, and standardized word forms. The tables also contain the following metalabels on discourse type:
• gos.T.J.I – Discourse / Public / Education and Information
• gos.T.J.R – Discourse / Public / Entertainment
• gos.T.N.N – Discourse / Non-public / Non-private
• gos.T.N.Z – Discourse / Non-public / Private

This chapter contains the frequency lists extracted from the GOS 1.0 corpus. The frequency lists are divided into sections by levels, starting with characters (Section 2.1.), word parts (Section 2.2.), and words with consonant-vowel structures (Section 2.3.), and ending with word sets (Section 2.4.).

2.1. Frequency lists of characters from the GOS 1.0 corpus

Tables 2.1.1. to 2.1.5. contain frequency lists of character-level n-grams extracted from lemmas in the GOS 1.0 corpus (from 1-grams, i.e. individual characters, to 5-grams, i.e. sequences of five characters). For instance, from the word “dobro”, the following character 2-grams were extracted: “do”, “ob”, “br”, and “ro”, as well as the following 3-grams: “dob”, “obr”, and “bro”. Each character n-gram entry on the list also contains its absolute and relative frequencies and percentages based on all character n-grams in the corpus (or in the relevant part of the corpus when the frequency refers to a specific taxonomy branch; see below for further details). Tables 2.1.6. to 2.1.10. contain character-level n-grams extracted in the same manner, but from standardized forms in the GOS 1.0 corpus instead of lemmas. The n-grams in Tables 2.1.11. to 2.1.15. were extracted from lower-case word forms.

Total absolute frequencies are the sums of all occurrences of a specific character n-gram in all units (word forms or lemmas) in the corpus. Total relative frequencies indicate how frequently a character n-gram occurs per 1,000,000 occurrences of character n-grams of equal length in the corpus. The relative frequency ($f_r$) is calculated according to the following formula, where $f_a$ is the total absolute frequency of the character n-gram and $N$ is the total absolute frequency of all character n-grams of equal length in the corpus:

$$f_r = \frac{f_a}{N} \times 1{,}000{,}000$$

The percentage ($p$) of a character n-gram represents the share of the n-gram among all extracted character n-grams of equal length in the corpus, and is calculated in the following manner:

$$p = \frac{f_a}{N} \times 100\,\%$$

The character n-grams are also listed with the absolute and relative frequencies within individual taxonomy branches in the GOS 1.0 corpus. These indicate how frequently a certain character n-gram appears in a certain discourse type (e.g. public discourse for information or entertainment, non-public private discourse). In this case, the absolute frequencies represent the sum of all occurrences of a character n-gram in the texts of a specific taxonomy branch. The relative frequencies ($f_{rT}$) and percentages ($p_T$) are calculated using the following formulas, where $f_{aT}$ is the absolute frequency of a character n-gram in a taxonomy branch, and $N_T$ is the total frequency of all extracted character n-grams of equal length within that taxonomy branch:

$$f_{rT} = \frac{f_{aT}}{N_T} \times 1{,}000{,}000 \qquad p_T = \frac{f_{aT}}{N_T} \times 100\,\%$$
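The extraction and the frequency measures described in this section can be sketched in a few lines of Python. This is a minimal illustration and not the implementation used in LIST: the function names and the toy token list are hypothetical, and the sketch assumes that the relative frequency is the absolute frequency divided by the number of all n-grams of the same length, multiplied by 1,000,000, and that the percentage is the same ratio multiplied by 100. This assumption agrees with the values printed in the tables (e.g. 12.96 % alongside 129,637.77 per million).

```python
from collections import Counter


def char_ngrams(unit: str, n: int) -> list[str]:
    """Return all contiguous character n-grams of a unit (lemma or word form)."""
    return [unit[i:i + n] for i in range(len(unit) - n + 1)]


def ngram_frequencies(units, n):
    """Map each n-gram to (absolute frequency, relative frequency per million, percentage)."""
    counts = Counter()
    for unit in units:  # every token occurrence contributes its n-grams
        counts.update(char_ngrams(unit, n))
    total = sum(counts.values())  # N: all extracted n-grams of length n
    return {
        gram: (fa, fa / total * 1_000_000, fa / total * 100)
        for gram, fa in counts.items()
    }


# Hypothetical toy "corpus" of three tokens, for illustration only.
tokens = ["dobro", "dober", "dobro"]
freqs = ngram_frequencies(tokens, 2)
print(char_ngrams("dobro", 2))
print(char_ngrams("dobro", 3))
print(freqs["do"])
```

To obtain the values for a single taxonomy branch under the same assumptions, the same function can be applied to the tokens of that branch only, so that the total plays the role of the branch-level sum of n-grams.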
Frequency lists of characters from the GOS 1.0 corpus

2.1.1 List of character-level 1-grams in lemmas in the GOS 1.0 corpus
File at CLARIN.SI: GOS1.0-characters-lemmas-1grams-taxonomy-entire.tsv
Columns: Character string; Character string (lower case); Total absolute frequency of character string; Percentage of total sum of all found character strings; Total relative frequency (per million occurrences); then Absolute frequency, Percentage, and Relative frequency for each of [gos.T.J.R], [gos.T.N.Z], [gos.T.J.I], and [gos.T.N.N], in that order.
i i 559,478 12.96 % 129,637.77 123,714 12.63 % 126,261.71 160,508 14.36 % 143,558.35 197,237 12.37 % 123,646.77 78,019 12.53 % 125,302.14
a a 515,644 11.95 % 119,480.91 114,631 11.70 % 116,991.66 142,408 12.74 % 127,369.71 180,507 11.32 % 113,158.83 78,098 12.54 % 125,429.02
e e 459,328 10.64 % 106,431.82 103,243 10.54 % 105,369.14 110,515 9.88 % 98,844.61 176,183 11.04 % 110,448.14 69,387 11.14 % 111,438.74
t t 409,037 9.48 % 94,778.79 90,608 9.25 % 92,473.94 115,374 10.32 % 103,190.50 141,993 8.90 % 89,014.62 61,062 9.81 % 98,068.41
o o 285,022 6.60 % 66,043.02 65,516 6.69 % 66,865.21 66,741 5.97 % 59,693.15 113,132 7.09 % 70,921.82 39,633 6.37 % 63,652.44
n n 248,460 5.76 % 57,571.17 57,269 5.84 % 58,448.37 59,621 5.33 % 53,325.02 96,619 6.06 % 60,569.91 34,951 5.61 % 56,132.93
r r 186,105 4.31 % 43,122.76 44,456 4.54 % 45,371.51 40,011 3.58 % 35,785.84 76,482 4.79 % 47,946.14 25,156 4.04 % 40,401.70
k k 164,194 3.81 % 38,045.72 35,910 3.67 % 36,649.51 42,464 3.80 % 37,979.80 62,323 3.91 % 39,069.94 23,497 3.77 % 37,737.27
s s 157,715 3.65 % 36,544.46 36,285 3.70 % 37,032.24 36,825 3.29 % 32,936.28 62,325 3.91 % 39,071.19 22,280 3.58 % 35,782.71
d d 154,824
3.59 % 35,874.58 36,599 3.73 % 37,352.70 37,754 3.38 % 33,767.18 58,628 3.67 % 36,753.56 21,843 3.51 % 35,080.87 j j 152,740 3.54 % 35,391.69 33,436 3.41 % 34,124.57 45,919 4.11 % 41,069.95 50,085 3.14 % 31,398.01 23,300 3.74 % 37,420.88 p p 148,448 3.44 % 34,397.18 31,763 3.24 % 32,417.11 39,127 3.50 % 34,995.19 55,433 3.48 % 34,750.64 22,125 3.55 % 35,533.78 b b 134,023 3.10 % 31,054.74 29,755 3.04 % 30,367.76 40,452 3.62 % 36,180.27 45,363 2.84 % 28,437.81 18,453 2.96 % 29,636.38 v v 133,485 3.09 % 30,930.08 31,193 3.18 % 31,835.37 27,770 2.48 % 24,837.49 57,022 3.58 % 35,746.77 17,500 2.81 % 28,105.81 l l 122,261 2.83 % 28,329.34 28,669 2.93 % 29,259.40 30,103 2.69 % 26,924.12 46,368 2.91 % 29,067.84 17,121 2.75 % 27,497.12 m m 107,535 2.49 % 24,917.15 23,089 2.36 % 23,564.48 29,588 2.65 % 26,463.51 36,375 2.28 % 22,803.28 18,483 2.97 % 29,684.56 z z 92,076 2.13 % 21,335.12 20,201 2.06 % 20,617.01 22,546 2.02 % 20,165.14 35,981 2.26 % 22,556.29 13,348 2.14 % 21,437.51 č č 57,877 1.34 % 13,410.80 12,598 1.29 % 12,857.44 15,257 1.36 % 13,645.86 21,502 1.35 % 13,479.48 8,520 1.37 % 13,683.52 u u 53,109 1.23 % 12,305.99 13,711 1.40 % 13,993.36 13,028 1.17 % 11,652.24 19,377 1.22 % 12,147.33 6,993 1.12 % 11,231.08 g g 36,492 0.85 % 8,455.63 8,940 0.91 % 9,124.11 8,023 0.72 % 7,175.77 14,957 0.94 % 9,376.46 4,572 0.73 % 7,342.84 š š 33,087 0.77 % 7,666.66 8,335 0.85 % 8,506.65 7,640 0.68 % 6,833.22 12,795 0.80 % 8,021.11 4,317 0.69 % 6,933.30 h h 27,070 0.63 % 6,272.44 6,180 0.63 % 6,307.27 8,093 0.72 % 7,238.38 7,509 0.47 % 4,707.35 5,288 0.85 % 8,492.77 c c 24,102 0.56 % 5,584.72 6,362 0.65 % 6,493.02 5,182 0.46 % 4,634.78 9,642 0.60 % 6,044.52 2,916 0.47 % 4,683.23 ž ž 17,678 0.41 % 4,096.20 4,643 0.47 % 4,738.62 4,012 0.36 % 3,588.33 7,082 0.44 % 4,439.67 1,941 0.31 % 3,117.34 f f 8,132 0.19 % 1,884.28 1,790 0.18 % 1,826.86 3,102 0.28 % 2,774.43 2,144 0.13 % 1,344.06 1,096 0.18 % 1,760.23 M m 6,372 0.15 % 1,476.47 1,383 0.14 % 1,411.48 2,165 0.19 % 1,936.38 1,571 
0.10 % 984.85 1,253 0.20 % 2,012.38 S s 2,105 0.05 % 487.75 718 0.07 % 732.79 320 0.03 % 286.21 926 0.06 % 580.50 141 0.02 % 226.45 B b 1,510 0.04 % 349.89 716 0.07 % 730.74 279 0.03 % 249.54 438 0.03 % 274.58 77 0.01 % 123.67 P p 1,353 0.03 % 313.51 513 0.05 % 523.56 262 0.02 % 234.33 441 0.03 % 276.46 137 0.02 % 220.03 A a 1,208 0.03 % 279.91 417 0.04 % 425.59 244 0.02 % 218.23 487 0.03 % 305.30 60 0.01 % 96.36 I i 1,134 0.03 % 262.76 497 0.05 % 507.23 205 0.02 % 183.35 361 0.02 % 226.31 71 0.01 % 114.03 K k 1,127 0.03 % 261.14 425 0.04 % 433.75 226 0.02 % 202.13 405 0.03 % 253.89 71 0.01 % 114.03 CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 329 File at CLARIN.SI 2.1.2 List of character-level 2-grams in lemmas in the GOS 1.0 corpusGOS1.0-characters-lemmas-2grams-taxonomy-entire.tsvCharacter string Character string (lower case) Total absolute frequen- cy of character string Percentage of total sum of all found character strings Total relative frequency (per million occur - rences) Absolute frequency [gos.T.J.R] Percentage [gos.T.J.R] Relative frequency [gos.T.J.R] Absolute frequency [gos.T.N.Z] Percentage [gos.T.N.Z] Relative frequency [gos.T.N.Z] Absolute frequency [gos.T.J.I] Percentage [gos.T.J.I] Relative frequency [gos.T.J.I] Absolute frequency [gos.T.N.N] Percentage [gos.T.N.N] Relative frequency [gos.T.N.N] ti ti 237,623 7.24 % 72,432.72 51,499 6.87 % 68,729.48 75,266 9.15 % 91,505.92 76,789 6.18 % 61,825.75 34,069 7.30 % 72,991.34 it it 135,074 4.12 % 41,173.53 28,646 3.82 % 38,230.35 42,906 5.22 % 52,163.70 44,771 3.60 % 36,046.84 18,751 4.02 % 40,173.20 bi bi 99,646 3.04 % 30,374.29 21,182 2.83 % 28,269.05 32,675 3.97 % 39,725.19 31,787 2.56 % 25,592.92 14,002 3.00 % 29,998.67 ja ja 62,404 1.90 % 19,022.11 12,664 1.69 % 16,901.11 21,748 2.64 % 26,440.50 17,214 1.39 % 13,859.65 10,778 2.31 % 23,091.39 ta ta 62,403 1.90 % 19,021.81 12,900 1.72 % 17,216.07 18,253 2.22 % 22,191.40 20,460 1.65 % 16,473.12 10,790 2.31 % 
23,117.10 at at 57,988 1.77 % 17,676.02 13,011 1.74 % 17,364.21 15,136 1.84 % 18,401.85 22,001 1.77 % 17,713.84 7,840 1.68 % 16,796.86 da da 52,142 1.59 % 15,894.03 12,155 1.62 % 16,221.81 13,510 1.64 % 16,425.01 18,858 1.52 % 15,183.29 7,619 1.63 % 16,323.37 en en 51,001 1.55 % 15,546.23 11,439 1.53 % 15,266.25 9,574 1.16 % 11,639.75 22,972 1.85 % 18,495.63 7,016 1.50 % 15,031.47 ee ee 50,061 1.53 % 15,259.69 9,113 1.22 % 12,162.02 9,941 1.21 % 12,085.94 22,478 1.81 % 18,097.89 8,529 1.83 % 18,273.01 et et 49,277 1.50 % 15,020.71 11,207 1.50 % 14,956.63 15,376 1.87 % 18,693.63 14,968 1.21 % 12,051.31 7,726 1.66 % 16,552.62 ne ne 47,715 1.45 % 14,544.58 10,425 1.39 % 13,912.99 15,138 1.84 % 18,404.28 13,964 1.12 % 11,242.95 8,188 1.75 % 17,542.43 ra ra 44,303 1.35 % 13,504.53 9,917 1.32 % 13,235.02 9,288 1.13 % 11,292.04 19,097 1.54 % 15,375.72 6,001 1.29 % 12,856.88 aj aj 44,270 1.35 % 13,494.47 9,481 1.26 % 12,653.14 14,960 1.82 % 18,187.87 12,835 1.03 % 10,333.95 6,994 1.50 % 14,984.34 ka ka 44,069 1.34 % 13,433.20 9,888 1.32 % 13,196.32 12,385 1.51 % 15,057.27 15,877 1.28 % 12,783.18 5,919 1.27 % 12,681.20 na na 42,961 1.31 % 13,095.46 9,826 1.31 % 13,113.57 9,401 1.14 % 11,429.43 17,832 1.44 % 14,357.22 5,902 1.26 % 12,644.78 pa pa 42,286 1.29 % 12,889.70 8,381 1.12 % 11,185.11 15,709 1.91 % 19,098.48 11,273 0.91 % 9,076.32 6,923 1.48 % 14,832.22 ko ko 41,940 1.28 % 12,784.24 8,828 1.18 % 11,781.66 11,804 1.44 % 14,350.91 14,733 1.19 % 11,862.10 6,575 1.41 % 14,086.65 po po 41,628 1.27 % 12,689.13 8,819 1.18 % 11,769.65 9,663 1.18 % 11,747.96 17,001 1.37 % 13,688.15 6,145 1.32 % 13,165.39 re re 40,797 1.24 % 12,435.82 9,733 1.30 % 12,989.46 8,244 1.00 % 10,022.78 17,082 1.38 % 13,753.37 5,738 1.23 % 12,293.41 pr pr 40,179 1.23 % 12,247.44 8,957 1.20 % 11,953.82 7,323 0.89 % 8,903.06 18,408 1.48 % 14,820.98 5,491 1.18 % 11,764.23 st st 39,141 1.19 % 11,931.04 8,061 1.08 % 10,758.04 8,466 1.03 % 10,292.68 16,994 1.37 % 13,682.52 5,620 1.20 % 12,040.60 ve ve 
38,965 1.19 % 11,877.39 9,083 1.21 % 12,121.98 9,863 1.20 % 11,991.11 14,963 1.21 % 12,047.28 5,056 1.08 % 10,832.26 se se 37,459 1.14 % 11,418.33 9,133 1.22 % 12,188.71 8,463 1.03 % 10,289.04 14,207 1.14 % 11,438.60 5,656 1.21 % 12,117.73 ak ak 34,303 1.05 % 10,456.31 7,290 0.97 % 9,729.08 10,387 1.26 % 12,628.17 11,106 0.89 % 8,941.86 5,520 1.18 % 11,826.36 ed ed 32,010 0.98 % 9,757.35 7,241 0.97 % 9,663.69 8,978 1.09 % 10,915.16 11,267 0.91 % 9,071.49 4,524 0.97 % 9,692.47 in in 30,610 0.93 % 9,330.60 8,118 1.08 % 10,834.11 5,743 0.70 % 6,982.15 13,230 1.06 % 10,651.98 3,519 0.75 % 7,539.30 de de 29,463 0.90 % 8,980.97 6,321 0.84 % 8,435.87 8,941 1.09 % 10,870.17 9,619 0.77 % 7,744.62 4,582 0.98 % 9,816.73 ri ri 26,697 0.81 % 8,137.83 6,507 0.87 % 8,684.11 5,760 0.70 % 7,002.82 10,856 0.87 % 8,740.58 3,574 0.77 % 7,657.14 me me 26,128 0.80 % 7,964.39 5,338 0.71 % 7,123.98 7,174 0.87 % 8,721.91 9,705 0.78 % 7,813.86 3,911 0.84 % 8,379.15 no no 25,981 0.79 % 7,919.58 6,275 0.84 % 8,374.48 5,693 0.69 % 6,921.36 10,234 0.82 % 8,239.78 3,779 0.81 % 8,096.34 li li 25,616 0.78 % 7,808.32 5,362 0.72 % 7,156.01 6,811 0.83 % 8,280.59 9,605 0.77 % 7,733.35 3,838 0.82 % 8,222.75 za za 25,571 0.78 % 7,794.60 5,437 0.73 % 7,256.11 5,533 0.67 % 6,726.84 11,062 0.89 % 8,906.44 3,539 0.76 % 7,582.15 CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 330 File at CLARIN.SI 2.1.3 List of character-level 3-grams in lemmas in the GOS 1.0 corpusGOS1.0-characters-lemmas-3grams-taxonomy-entire.tsvCharacter string Character string (lower case) Total absolute frequen- cy of character string Percentage of total sum of all found character strings Total relative frequency (per million occur - rences) Absolute frequency [gos.T.J.R] Percentage [gos.T.J.R] Relative frequency [gos.T.J.R] Absolute frequency [gos.T.N.Z] Percentage [gos.T.N.Z] Relative frequency [gos.T.N.Z] Absolute frequency [gos.T.J.I] Percentage [gos.T.J.I] Relative frequency [gos.T.J.I] Absolute 
frequency [gos.T.N.N] Percentage [gos.T.N.N] Relative frequency [gos.T.N.N] iti iti 129,348 5.65 % 56,534.25 27,325 5.17 % 51,751.99 41,662 7.75 % 77,446.29 42,285 4.67 % 46,733.10 18,076 5.70 % 56,987.39 bit bit 96,331 4.21 % 42,103.48 20,449 3.87 % 38,729.24 31,912 5.93 % 59,321.83 30,603 3.38 % 33,822.23 13,367 4.21 % 42,141.54 ati ati 46,490 2.03 % 20,319.43 10,448 1.98 % 19,787.92 12,706 2.36 % 23,619.43 16,910 1.87 % 18,688.82 6,426 2.03 % 20,258.96 eti eti 31,863 1.39 % 13,926.39 6,583 1.25 % 12,467.83 11,752 2.19 % 21,846.02 8,435 0.93 % 9,322.31 5,093 1.61 % 16,056.47 eee eee 23,234 1.01 % 10,154.91 4,252 0.81 % 8,053.05 4,533 0.84 % 8,426.48 10,558 1.17 % 11,668.63 3,891 1.23 % 12,266.98 tak tak 15,525 0.68 % 6,785.53 3,142 0.59 % 5,950.77 5,712 1.06 % 10,618.15 3,880 0.43 % 4,288.15 2,791 0.88 % 8,799.06 jaz jaz 15,340 0.67 % 6,704.67 3,171 0.60 % 6,005.69 5,689 1.06 % 10,575.39 3,972 0.44 % 4,389.83 2,508 0.79 % 7,906.86 kaj kaj 15,058 0.66 % 6,581.41 2,804 0.53 % 5,310.62 5,182 0.96 % 9,632.92 4,782 0.53 % 5,285.03 2,290 0.72 % 7,219.58 ako ako 13,763 0.60 % 6,015.41 2,794 0.53 % 5,291.68 4,728 0.88 % 8,788.97 3,871 0.43 % 4,278.20 2,370 0.75 % 7,471.79 met met 13,497 0.59 % 5,899.15 2,605 0.49 % 4,933.72 4,912 0.91 % 9,131.01 3,730 0.41 % 4,122.37 2,250 0.71 % 7,093.47 ved ved 12,750 0.56 % 5,572.65 2,746 0.52 % 5,200.77 4,734 0.88 % 8,800.12 3,408 0.38 % 3,766.50 1,862 0.59 % 5,870.24 pre pre 12,642 0.55 % 5,525.45 2,961 0.56 % 5,607.97 1,862 0.35 % 3,461.31 6,170 0.68 % 6,819.04 1,649 0.52 % 5,198.73 ime ime 12,090 0.53 % 5,284.19 2,165 0.41 % 4,100.39 3,899 0.72 % 7,247.93 3,864 0.43 % 4,270.47 2,162 0.68 % 6,816.04 pri pri 11,614 0.51 % 5,076.14 2,701 0.51 % 5,115.54 2,526 0.47 % 4,695.63 4,784 0.53 % 5,287.25 1,603 0.51 % 5,053.71 ede ede 11,319 0.49 % 4,947.21 2,212 0.42 % 4,189.40 4,845 0.90 % 9,006.46 2,376 0.26 % 2,625.94 1,886 0.59 % 5,945.91 det det 11,264 0.49 % 4,923.17 2,221 0.42 % 4,206.45 5,039 0.94 % 9,367.09 2,097 0.23 % 2,317.59 
1,907 0.60 % 6,012.11 dat dat 10,155 0.44 % 4,438.46 2,374 0.45 % 4,496.22 2,999 0.56 % 5,574.90 3,204 0.35 % 3,541.04 1,578 0.50 % 4,974.89 sti sti 10,127 0.44 % 4,426.22 2,091 0.40 % 3,960.23 3,288 0.61 % 6,112.13 3,338 0.37 % 3,689.14 1,410 0.45 % 4,445.24 rav rav 9,887 0.43 % 4,321.32 2,097 0.40 % 3,971.60 1,776 0.33 % 3,301.44 4,624 0.51 % 5,110.41 1,390 0.44 % 4,382.19 rat rat 9,682 0.42 % 4,231.72 2,152 0.41 % 4,075.77 2,422 0.45 % 4,502.30 3,717 0.41 % 4,108 1,391 0.44 % 4,385.34 pra pra 9,187 0.40 % 4,015.37 1,863 0.35 % 3,528.42 1,724 0.32 % 3,204.78 4,365 0.48 % 4,824.17 1,235 0.39 % 3,893.53 daj daj 9,146 0.40 % 3,997.45 2,200 0.42 % 4,166.67 2,584 0.48 % 4,803.45 2,665 0.29 % 2,945.34 1,697 0.54 % 5,350.06 ist ist 8,692 0.38 % 3,799.02 1,327 0.25 % 2,513.26 2,576 0.48 % 4,788.58 3,334 0.37 % 3,684.71 1,455 0.46 % 4,587.11 eda eda 8,681 0.38 % 3,794.21 2,278 0.43 % 4,314.40 1,986 0.37 % 3,691.81 3,390 0.38 % 3,746.61 1,027 0.32 % 3,237.78 udi udi 8,619 0.38 % 3,767.11 2,262 0.43 % 4,284.10 1,808 0.34 % 3,360.93 3,147 0.35 % 3,478.04 1,402 0.44 % 4,420.02 tud tud 8,405 0.37 % 3,673.58 2,182 0.41 % 4,132.58 1,761 0.33 % 3,273.56 3,040 0.34 % 3,359.79 1,422 0.45 % 4,483.07 ali ali 8,236 0.36 % 3,599.72 1,686 0.32 % 3,193.19 2,446 0.46 % 4,546.92 2,646 0.29 % 2,924.34 1,458 0.46 % 4,596.57 red red 7,618 0.33 % 3,329.61 1,610 0.30 % 3,049.25 1,508 0.28 % 2,803.25 3,319 0.37 % 3,668.14 1,181 0.37 % 3,723.29 kak kak 7,491 0.33 % 3,274.10 1,678 0.32 % 3,178.04 2,046 0.38 % 3,803.35 2,694 0.30 % 2,977.39 1,073 0.34 % 3,382.80 zda zda 7,377 0.32 % 3,224.27 1,650 0.31 % 3,125.01 2,215 0.41 % 4,117.51 2,039 0.23 % 2,253.49 1,473 0.46 % 4,643.86 ost ost 6,994 0.31 % 3,056.87 1,469 0.28 % 2,782.20 949 0.18 % 1,764.11 3,714 0.41 % 4,104.69 862 0.27 % 2,717.59 sta sta 6,850 0.30 % 2,993.94 1,473 0.28 % 2,789.78 1,446 0.27 % 2,688 3,074 0.34 % 3,397.36 857 0.27 % 2,701.83 CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 331 File at 
CLARIN.SI 2.1.4 List of character-level 4-grams in lemmas in the GOS 1.0 corpusGOS1.0-characters-lemmas-4grams-taxonomy-entire.tsvCharacter string Character string (lower case) Total absolute frequen- cy of character string Percentage of total sum of all found character strings Total relative frequency (per million occur - rences) Absolute frequency [gos.T.J.R] Percentage [gos.T.J.R] Relative frequency [gos.T.J.R] Absolute frequency [gos.T.N.Z] Percentage [gos.T.N.Z] Relative frequency [gos.T.N.Z] Absolute frequency [gos.T.J.I] Percentage [gos.T.J.I] Relative frequency [gos.T.J.I] Absolute frequency [gos.T.N.N] Percentage [gos.T.N.N] Relative frequency [gos.T.N.N] biti biti 96,100 6.13 % 61,264.20 20,397 5.59 % 55,928.31 31,879 9.34 % 93,378.36 30,477 4.69 % 46,902.27 13,347 6.27 % 62,743.57 meti meti 12,061 0.77 % 7,688.94 2,236 0.61 % 6,131.08 4,584 1.34 % 13,427.22 3,208 0.49 % 4,936.92 2,033 0.96 % 9,557.03 deti deti 11,140 0.71 % 7,101.80 2,175 0.60 % 5,963.82 5,004 1.47 % 14,657.47 2,068 0.32 % 3,182.53 1,893 0.89 % 8,898.90 tako tako 10,658 0.68 % 6,794.52 2,161 0.59 % 5,925.43 3,924 1.15 % 11,493.98 2,551 0.39 % 3,925.84 2,022 0.95 % 9,505.32 imet imet 10,289 0.66 % 6,559.29 1,879 0.52 % 5,152.19 3,707 1.09 % 10,858.36 2,865 0.44 % 4,409.06 1,838 0.86 % 8,640.34 dati dati 9,628 0.61 % 6,137.89 2,287 0.63 % 6,270.92 2,962 0.87 % 8,676.14 2,916 0.45 % 4,487.55 1,463 0.69 % 6,877.49 tudi tudi 8,187 0.52 % 5,219.25 2,130 0.58 % 5,840.43 1,706 0.50 % 4,997.13 2,992 0.46 % 4,604.51 1,359 0.64 % 6,388.59 edet edet 7,865 0.50 % 5,013.97 1,379 0.38 % 3,781.20 3,986 1.17 % 11,675.59 1,158 0.18 % 1,782.09 1,342 0.63 % 6,308.67 vede vede 7,857 0.50 % 5,008.87 1,353 0.37 % 3,709.91 3,937 1.15 % 11,532.06 1,219 0.19 % 1,875.97 1,348 0.63 % 6,336.88 prav prav 7,199 0.46 % 4,589.40 1,441 0.40 % 3,951.20 1,287 0.38 % 3,769.82 3,498 0.54 % 5,383.21 973 0.46 % 4,574.02 zdaj zdaj 6,888 0.44 % 4,391.13 1,592 0.44 % 4,365.24 1,956 0.57 % 5,729.42 1,882 0.29 % 2,896.28 1,458 0.69 
% 6,853.98 jati jati 6,298 0.40 % 4,015 1,332 0.36 % 3,652.33 1,428 0.42 % 4,182.83 2,716 0.42 % 4,179.76 822 0.39 % 3,864.18 rati rati 5,455 0.35 % 3,477.59 1,151 0.32 % 3,156.03 1,381 0.41 % 4,045.16 2,104 0.32 % 3,237.93 819 0.39 % 3,850.08 edat edat 5,009 0.32 % 3,193.26 1,247 0.34 % 3,419.26 1,417 0.41 % 4,150.61 1,634 0.25 % 2,514.63 711 0.33 % 3,342.37 riti riti 4,969 0.32 % 3,167.76 1,103 0.30 % 3,024.41 1,378 0.40 % 4,036.37 1,862 0.29 % 2,865.51 626 0.29 % 2,942.79 reči reči 4,901 0.31 % 3,124.41 933 0.26 % 2,558.27 1,670 0.49 % 4,891.68 1,560 0.24 % 2,400.75 738 0.35 % 3,469.30 isti isti 4,411 0.28 % 2,812.03 696 0.19 % 1,908.42 1,622 0.47 % 4,751.08 1,418 0.22 % 2,182.22 675 0.32 % 3,173.14 liti liti 4,334 0.28 % 2,762.95 827 0.23 % 2,267.62 1,201 0.35 % 3,517.91 1,611 0.25 % 2,479.23 695 0.33 % 3,267.16 vati vati 4,197 0.27 % 2,675.61 875 0.24 % 2,399.24 551 0.16 % 1,613.96 2,259 0.35 % 3,476.46 512 0.24 % 2,406.89 lahk lahk 4,176 0.27 % 2,662.22 961 0.26 % 2,635.05 904 0.27 % 2,647.95 1,500 0.23 % 2,308.41 811 0.38 % 3,812.47 niti niti 4,171 0.27 % 2,659.03 821 0.23 % 2,251.17 1,031 0.30 % 3,019.95 1,737 0.27 % 2,673.14 582 0.27 % 2,735.95 ahko ahko 4,166 0.27 % 2,655.84 958 0.26 % 2,626.82 903 0.27 % 2,645.02 1,494 0.23 % 2,299.18 811 0.38 % 3,812.47 gled gled 4,159 0.27 % 2,651.38 1,012 0.28 % 2,774.89 1,092 0.32 % 3,198.63 1,379 0.21 % 2,122.20 676 0.32 % 3,177.84 anje anje 4,081 0.26 % 2,601.66 729 0.20 % 1,998.91 398 0.12 % 1,165.80 2,410 0.37 % 3,708.84 544 0.26 % 2,557.32 diti diti 3,944 0.25 % 2,514.32 764 0.21 % 2,094.88 1,219 0.36 % 3,570.63 1,325 0.20 % 2,039.10 636 0.30 % 2,989.80 veda veda 3,710 0.24 % 2,365.14 1,069 0.29 % 2,931.18 615 0.18 % 1,801.43 1,640 0.25 % 2,523.86 386 0.18 % 1,814.57 neka neka 3,704 0.24 % 2,361.32 712 0.20 % 1,952.29 1,145 0.34 % 3,353.88 1,295 0.20 % 1,992.93 552 0.26 % 2,594.92 pote pote 3,650 0.23 % 2,326.89 757 0.21 % 2,075.68 703 0.21 % 2,059.19 1,424 0.22 % 2,191.45 766 0.36 % 3,600.93 ampa ampa 3,593 
0.23 % 2,290.55 818 0.22 % 2,242.95 825 0.24 % 2,416.55 1,281 0.20 % 1,971.38 669 0.31 % 3,144.93 mpak mpak 3,544 0.23 % 2,259.32 816 0.22 % 2,237.46 822 0.24 % 2,407.76 1,239 0.19 % 1,906.75 667 0.31 % 3,135.53 drug drug 3,513 0.22 % 2,239.55 661 0.18 % 1,812.45 891 0.26 % 2,609.87 1,465 0.23 % 2,254.55 496 0.23 % 2,331.67 ravi ravi 3,493 0.22 % 2,226.80 759 0.21 % 2,081.17 678 0.20 % 1,985.96 1,542 0.24 % 2,373.05 514 0.24 % 2,416.29 CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 332 File at CLARIN.SI 2.1.5 List of character-level 5-grams in lemmas in the GOS 1.0 corpusGOS1.0-characters-lemmas-5grams-taxonomy-entire.tsvCharacter string Character string (lower case) Total absolute frequen- cy of character string Percentage of total sum of all found character strings Total relative frequency (per million occur - rences) Absolute frequency [gos.T.J.R] Percentage [gos.T.J.R] Relative frequency [gos.T.J.R] Absolute frequency [gos.T.N.Z] Percentage [gos.T.N.Z] Relative frequency [gos.T.N.Z] Absolute frequency [gos.T.J.I] Percentage [gos.T.J.I] Relative frequency [gos.T.J.I] Absolute frequency [gos.T.N.N] Percentage [gos.T.N.N] Relative frequency [gos.T.N.N] imeti imeti 10,145 1.00 % 10,049.80 1,862 0.80 % 7,961.48 3,694 1.87 % 18,737.67 2,813 0.64 % 6,348 1,776 1.31 % 13,124.25 edeti edeti 7,863 0.78 % 7,789.21 1,378 0.59 % 5,892.01 3,986 2.02 % 20,218.83 1,157 0.26 % 2,610.96 1,342 0.99 % 9,917.09 vedet vedet 7,706 0.76 % 7,633.69 1,341 0.57 % 5,733.81 3,930 1.99 % 19,934.77 1,105 0.25 % 2,493.61 1,330 0.98 % 9,828.41 edati edati 5,008 0.50 % 4,961 1,246 0.53 % 5,327.61 1,417 0.72 % 7,187.68 1,634 0.37 % 3,687.39 711 0.53 % 5,254.13 lahko lahko 4,166 0.41 % 4,126.91 958 0.41 % 4,096.19 903 0.46 % 4,580.43 1,494 0.34 % 3,371.46 811 0.60 % 5,993.11 ampak ampak 3,541 0.35 % 3,507.77 815 0.35 % 3,484.75 821 0.42 % 4,164.49 1,238 0.28 % 2,793.75 667 0.49 % 4,928.98 tisti tisti 3,447 0.34 % 3,414.65 533 0.23 % 2,278.99 1,351 0.69 % 6,852.89 
1,065 0.24 % 2,403.35 498 0.37 % 3,680.11 gleda gleda 3,373 0.33 % 3,341.35 830 0.35 % 3,548.89 1,008 0.51 % 5,113.04 1,025 0.23 % 2,313.08 510 0.38 % 3,768.79 pravi pravi 3,328 0.33 % 3,296.77 721 0.31 % 3,082.83 644 0.33 % 3,266.66 1,466 0.33 % 3,308.27 497 0.37 % 3,672.72 potem potem 3,246 0.32 % 3,215.54 679 0.29 % 2,903.25 649 0.33 % 3,292.03 1,191 0.27 % 2,687.69 727 0.54 % 5,372.37 ledat ledat 3,144 0.31 % 3,114.50 731 0.31 % 3,125.59 998 0.51 % 5,062.32 920 0.21 % 2,076.13 495 0.37 % 3,657.94 deset deset 3,088 0.31 % 3,059.02 863 0.37 % 3,689.99 601 0.30 % 3,048.55 1,049 0.24 % 2,367.24 575 0.42 % 4,249.12 dober dober 3,021 0.30 % 2,992.65 1,244 0.53 % 5,319.06 621 0.32 % 3,150 776 0.17 % 1,751.17 380 0.28 % 2,808.12 sliti sliti 2,915 0.29 % 2,887.65 504 0.21 % 2,154.99 1,032 0.52 % 5,234.78 792 0.18 % 1,787.28 587 0.43 % 4,337.80 misli misli 2,905 0.29 % 2,877.74 507 0.22 % 2,167.82 1,031 0.52 % 5,229.71 778 0.18 % 1,755.68 589 0.43 % 4,352.58 islit islit 2,893 0.29 % 2,865.85 504 0.21 % 2,154.99 1,025 0.52 % 5,199.27 777 0.17 % 1,753.43 587 0.43 % 4,337.80 aviti aviti 2,848 0.28 % 2,821.27 632 0.27 % 2,702.29 688 0.35 % 3,489.85 1,063 0.24 % 2,398.83 465 0.34 % 3,436.25 akšen akšen 2,812 0.28 % 2,785.61 776 0.33 % 3,318 535 0.27 % 2,713.77 1,151 0.26 % 2,597.42 350 0.26 % 2,586.42 velik velik 2,631 0.26 % 2,606.31 622 0.27 % 2,659.53 455 0.23 % 2,307.97 1,271 0.29 % 2,868.22 283 0.21 % 2,091.31 priti priti 2,612 0.26 % 2,587.49 584 0.25 % 2,497.05 990 0.50 % 5,021.74 688 0.15 % 1,552.58 350 0.26 % 2,586.42 nekaj nekaj 2,580 0.26 % 2,555.79 482 0.21 % 2,060.92 928 0.47 % 4,707.24 774 0.17 % 1,746.66 396 0.29 % 2,926.35 kakše kakše 2,348 0.23 % 2,325.97 609 0.26 % 2,603.94 468 0.24 % 2,373.91 963 0.22 % 2,173.17 308 0.23 % 2,276.05 ideti ideti 2,327 0.23 % 2,305.16 610 0.26 % 2,608.22 760 0.39 % 3,855.07 622 0.14 % 1,403.64 335 0.25 % 2,475.58 videt videt 2,325 0.23 % 2,303.18 610 0.26 % 2,608.22 759 0.39 % 3,850 621 0.14 % 1,401.39 335 0.25 % 2,475.58 
orati orati 2,243 0.22 % 2,221.95 386 0.17 % 1,650.45 565 0.29 % 2,865.94 891 0.20 % 2,010.69 401 0.30 % 2,963.30 morat morat 2,233 0.22 % 2,212.05 386 0.17 % 1,650.45 559 0.28 % 2,835.51 889 0.20 % 2,006.17 399 0.29 % 2,948.52 ajati ajati 2,199 0.22 % 2,178.36 453 0.19 % 1,936.92 601 0.30 % 3,048.55 896 0.20 % 2,021.97 249 0.18 % 1,840.06 ljati ljati 2,151 0.21 % 2,130.81 429 0.18 % 1,834.31 490 0.25 % 2,485.51 916 0.21 % 2,067.10 316 0.23 % 2,335.17 poved poved 2,110 0.21 % 2,090.20 561 0.24 % 2,398.71 440 0.22 % 2,231.88 890 0.20 % 2,008.43 219 0.16 % 1,618.36 govor govor 2,079 0.21 % 2,059.49 412 0.18 % 1,761.62 286 0.14 % 1,450.72 1,070 0.24 % 2,414.63 311 0.23 % 2,298.22 ravit ravit 1,988 0.20 % 1,969.34 462 0.20 % 1,975.41 467 0.24 % 2,368.84 747 0.17 % 1,685.73 312 0.23 % 2,305.61 ovati ovati 1,967 0.20 % 1,948.54 377 0.16 % 1,611.97 218 0.11 % 1,105.80 1,132 0.26 % 2,554.54 240 0.18 % 1,773.55 CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 333 File at CLARIN.SI 2.1.6 List of character-level 1-grams in standardized word forms in the GOS 1.0 corpusGOS1.0-characters-standardized_forms- 1grams-taxonomy-entire.tsvCharacter string Total absolute frequen- cy of character string Percentage of total sum of all found character strings Total relative frequency (per million occur - rences) Absolute frequency [gos.T.J.I] Percentage [gos.T.J.I] Relative frequency [gos.T.J.I] Absolute frequency [gos.T.N.Z] Percentage [gos.T.N.Z] Relative frequency [gos.T.N.Z] Absolute frequency [gos.T.J.R] Percentage [gos.T.J.R] Relative frequency [gos.T.J.R] Absolute frequency [gos.T.N.N] Percentage [gos.T.N.N] Relative frequency [gos.T.N.N] e 526,645 12,46 % 124,587.01 200,032 12.63 % 126,269.75 130,310 12.19 % 121,922.69 117,728 12.19 % 121,908.74 78,575 12.91 % 129,136.65 a 504,062 11,92 % 119,244.61 174,483 11.01 % 110,142.01 141,974 13.28 % 132,835.95 112,338 11.63 % 116,327.33 75,267 12.37 % 123,700.01 o 371,506 8,79 % 87,886.19 144,233 9.11 % 
91,046.76 88,971 8.32 % 83,244.45 84,891 8.79 % 87,905.64 53,411 8.78 % 87,780.05 i 336,196 7,95 % 79,533 131,744 8.32 % 83,163.11 81,784 7.65 % 76,520.03 76,737 7.95 % 79,462.07 45,931 7.55 % 75,486.80 n 249,926 5,91 % 59,124.33 96,721 6.11 % 61,054.92 60,057 5.62 % 56,191.48 57,749 5.98 % 59,799.77 35,399 5.82 % 58,177.64 t 218,427 5,17 % 51,672.70 81,699 5.16 % 51,572.31 52,494 4.91 % 49,115.26 50,143 5.19 % 51,923.67 34,091 5.60 % 56,027.97 j 206,168 4,88 % 48,772.62 69,749 4.40 % 44,028.90 61,324 5.74 % 57,376.93 44,908 4.65 % 46,502.77 30,187 4.96 % 49,611.81 r 189,166 4,47 % 44,750.50 77,457 4.89 % 48,894.56 41,277 3.86 % 38,620.24 44,798 4.64 % 46,388.86 25,634 4.21 % 42,129.03 s 187,776 4,44 % 44,421.67 71,658 4.52 % 45,233.95 46,599 4.36 % 43,599.69 43,115 4.46 % 44,646.09 26,404 4.34 % 43,394.51 m 174,387 4,12 % 41,254.27 58,821 3.71 % 37,130.63 49,427 4.62 % 46,245.67 37,118 3.84 % 38,436.13 29,021 4.77 % 47,695.51 l 171,108 4,05 % 40,478.57 61,614 3.89 % 38,893.70 47,398 4.43 % 44,347.26 39,419 4.08 % 40,818.84 22,677 3.73 % 37,269.26 k 163,216 3,86 % 38,611.58 61,178 3.86 % 38,618.48 43,149 4.04 % 40,371.75 35,535 3.68 % 36,796.91 23,354 3.84 % 38,381.89 p 148,489 3,51 % 35,127.65 55,405 3.50 % 34,974.28 39,159 3.66 % 36,638.56 31,781 3.29 % 32,909.60 22,144 3.64 % 36,393.28 d 148,000 3,50 % 35,011.97 57,888 3.65 % 36,541.67 34,170 3.20 % 31,970.67 35,353 3.66 % 36,608.45 20,589 3.38 % 33,837.66 v 140,434 3,32 % 33,222.10 59,051 3.73 % 37,275.81 29,295 2.74 % 27,409.45 33,153 3.43 % 34,330.32 18,935 3.11 % 31,119.34 z 79,428 1,88 % 18,790.07 31,850 2.01 % 20,105.24 18,747 1.75 % 17,540.36 17,395 1.80 % 18,012.73 11,436 1.88 % 18,794.87 b 71,254 1,69 % 16,856.37 25,486 1.61 % 16,087.98 18,945 1.77 % 17,725.62 16,766 1.74 % 17,361.39 10,057 1.65 % 16,528.50 u 68,944 1,63 % 16,309.90 26,530 1.68 % 16,747 15,859 1.48 % 14,838.25 17,406 1.80 % 18,024.12 9,149 1.50 % 15,036.22 č 53,537 1,27 % 12,665.11 20,457 1.29 % 12,913.44 13,269 1.24 % 12,414.95 11,878 
1.23 % 12,299.81 7,933 1.30 % 13,037.75 š 51,329 1,21 % 12,142.77 16,261 1.03 % 10,264.72 15,968 1.49 % 14,940.23 12,064 1.25 % 12,492.41 7,036 1.16 % 11,563.54 g 50,646 1,20 % 11,981.19 20,161 1.27 % 12,726.59 12,160 1.14 % 11,377.33 11,834 1.23 % 12,254.25 6,491 1.07 % 10,667.85 h 39,049 0,92 % 9,237.72 13,528 0.85 % 8,539.52 9,898 0.93 % 9,260.92 8,970 0.93 % 9,288.54 6,653 1.09 % 10,934.09 c 24,883 0,59 % 5,886.51 10,042 0.63 % 6,338.99 5,302 0.50 % 4,960.74 6,449 0.67 % 6,678.02 3,090 0.51 % 5,078.36 ž 18,556 0,44 % 4,389.74 7,405 0.47 % 4,674.39 4,148 0.39 % 3,881.02 4,897 0.51 % 5,070.90 2,106 0.35 % 3,461.17 f 8,157 0,19 % 1,929.68 2,131 0.14 % 1,345.19 3,124 0.29 % 2,922.93 1,808 0.19 % 1,872.21 1,094 0.18 % 1,797.97 S 2,635 0,06 % 623.35 1,164 0.07 % 734.77 358 0.03 % 334.96 954 0.10 % 987.88 159 0.03 % 261.31 M 1,893 0,04 % 447.82 586 0.04 % 369.91 307 0.03 % 287.24 864 0.09 % 894.68 136 0.02 % 223.51 P 1,687 0,04 % 399.09 592 0.04 % 373.70 292 0.03 % 273.21 640 0.07 % 662.73 163 0.03 % 267.89 B 1,570 0,04 % 371.41 492 0.03 % 310.57 230 0.02 % 215.20 768 0.08 % 795.27 80 0.01 % 131.48 K 1,337 0,03 % 316.29 500 0.03 % 315.62 239 0.02 % 223.62 514 0.05 % 532.25 84 0.01 % 138.05 A 1,326 0,03 % 313.69 603 0.04 % 380.64 183 0.02 % 171.22 458 0.05 % 474.26 82 0.01 % 134.77 R 1,124 0,03 % 265.90 344 0.02 % 217.15 110 0.01 % 102.92 620 0.06 % 642.02 50 0.01 % 82.17 CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 334 File at CLARIN.SI 2.1.7 List of character-level 2-grams in standardized word forms in the GOS 1.0 corpusGOS1.0-characters-standardized_forms- 2grams-taxonomy-entire.tsvCharacter string Total absolute frequen- cy of character string Percentage of total sum of all found character strings Total relative frequency (per million occur - rences) Absolute frequency [gos.T.J.I] Percentage [gos.T.J.I] Relative frequency [gos.T.J.I] Absolute frequency [gos.T.N.Z] Percentage [gos.T.N.Z] Relative frequency [gos.T.N.Z] Absolute 
frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N]
je | 63,234 | 1,98 % | 19,809.98 | 24,223 | 1.97 % | 19,677.15 | 17,592 | 2.27 % | 22,750.73 | 13,038 | 1.77 % | 17,734.34 | 8,381 | 1.85 % | 18,518.64
ne | 58,682 | 1,84 % | 18,383.93 | 18,792 | 1.53 % | 15,265.36 | 17,523 | 2.27 % | 22,661.49 | 12,554 | 1.71 % | 17,076 | 9,813 | 2.17 % | 21,682.79
ja | 51,222 | 1,60 % | 16,046.86 | 13,987 | 1.14 % | 11,362.10 | 18,118 | 2.34 % | 23,430.97 | 10,167 | 1.38 % | 13,829.19 | 8,950 | 1.98 % | 19,775.90
aj | 50,307 | 1,58 % | 15,760.21 | 14,663 | 1.19 % | 11,911.24 | 16,732 | 2.16 % | 21,638.54 | 11,175 | 1.52 % | 15,200.28 | 7,737 | 1.71 % | 17,095.66
ee | 50,065 | 1,57 % | 15,684.39 | 22,478 | 1.83 % | 18,259.62 | 9,942 | 1.29 % | 12,857.42 | 9,116 | 1.24 % | 12,399.62 | 8,529 | 1.89 % | 18,845.66
na | 49,506 | 1,55 % | 15,509.27 | 20,370 | 1.66 % | 16,547.23 | 10,848 | 1.40 % | 14,029.10 | 11,571 | 1.57 % | 15,738.92 | 6,717 | 1.48 % | 14,841.87
ko | 49,075 | 1,54 % | 15,374.24 | 17,544 | 1.43 % | 14,251.57 | 13,697 | 1.77 % | 17,713.55 | 10,189 | 1.39 % | 13,859.12 | 7,645 | 1.69 % | 16,892.38
da | 47,565 | 1,49 % | 14,901.19 | 16,849 | 1.37 % | 13,687 | 12,551 | 1.62 % | 16,231.49 | 11,053 | 1.50 % | 15,034.33 | 7,112 | 1.57 % | 15,714.66
re | 45,338 | 1,42 % | 14,203.51 | 18,525 | 1.50 % | 15,048.47 | 9,884 | 1.28 % | 12,782.41 | 10,533 | 1.43 % | 14,327.03 | 6,396 | 1.41 % | 14,132.59
se | 44,548 | 1,40 % | 13,956.02 | 15,424 | 1.25 % | 12,529.43 | 12,002 | 1.55 % | 15,521.50 | 10,499 | 1.43 % | 14,280.78 | 6,623 | 1.46 % | 14,634.17
ra | 42,993 | 1,35 % | 13,468.87 | 18,795 | 1.53 % | 15,267.80 | 8,942 | 1.16 % | 11,564.18 | 9,425 | 1.28 % | 12,819.92 | 5,831 | 1.29 % | 12,884.17
pa | 42,273 | 1,32 % | 13,243.31 | 11,345 | 0.92 % | 9,215.92 | 15,643 | 2.02 % | 20,230.20 | 8,369 | 1.14 % | 11,383.54 | 6,916 | 1.53 % | 15,281.58
ka | 42,123 | 1,32 % | 13,196.32 | 15,089 | 1.23 % | 12,257.30 | 12,032 | 1.56 % | 15,560.30 | 9,290 | 1.26 % | 12,636.29 | 5,712 | 1.26 % | 12,621.22
po | 41,998 | 1,32 % | 13,157.16 | 17,092 | 1.39 % | 13,884.40 | 9,768 | 1.26 % | 12,632.40 | 8,969 | 1.22 % | 12,199.67 | 6,169 | 1.36 % | 13,631.01
st | 40,823 | 1,28 % | 12,789.05 | 17,896 | 1.45 % | 14,537.51 | 8,202 | 1.06 % | 10,607.18 | 8,718 | 1.19 % | 11,858.26 | 6,007 | 1.33 % | 13,273.06
pr | 40,363 | 1,26 % | 12,644.94 | 18,471 | 1.50 % | 15,004.61 | 7,365 | 0.95 % | 9,524.73 | 9,002 | 1.22 % | 12,244.55 | 5,525 | 1.22 % | 12,208.03
em | 39,740 | 1,25 % | 12,449.77 | 14,302 | 1.16 % | 11,617.99 | 11,047 | 1.43 % | 14,286.45 | 8,328 | 1.13 % | 11,327.78 | 6,063 | 1.34 % | 13,396.79
te | 38,735 | 1,21 % | 12,134.92 | 16,057 | 1.30 % | 13,043.63 | 7,009 | 0.91 % | 9,064.34 | 9,300 | 1.26 % | 12,649.89 | 6,369 | 1.41 % | 14,072.93
li | 38,452 | 1,21 % | 12,046.26 | 15,157 | 1.23 % | 12,312.53 | 9,634 | 1.25 % | 12,459.10 | 8,340 | 1.13 % | 11,344.10 | 5,321 | 1.18 % | 11,757.27
ni | 38,393 | 1,20 % | 12,027.78 | 15,671 | 1.27 % | 12,730.07 | 9,468 | 1.22 % | 12,244.42 | 7,890 | 1.07 % | 10,732.01 | 5,364 | 1.19 % | 11,852.28
to | 38,138 | 1,20 % | 11,947.89 | 13,721 | 1.11 % | 11,146.02 | 9,753 | 1.26 % | 12,613 | 8,118 | 1.10 % | 11,042.13 | 6,546 | 1.45 % | 14,464.03
ta | 37,329 | 1,17 % | 11,694.45 | 11,813 | 0.96 % | 9,596.09 | 11,786 | 1.52 % | 15,242.16 | 8,032 | 1.09 % | 10,925.16 | 5,698 | 1.26 % | 12,590.29
ti | 36,178 | 1,13 % | 11,333.86 | 13,241 | 1.08 % | 10,756.10 | 9,834 | 1.27 % | 12,717.75 | 7,977 | 1.08 % | 10,850.34 | 5,126 | 1.13 % | 11,326.40
en | 34,644 | 1,08 % | 10,853.29 | 13,926 | 1.13 % | 11,312.55 | 7,999 | 1.03 % | 10,344.65 | 7,950 | 1.08 % | 10,813.62 | 4,769 | 1.05 % | 10,537.57
ak | 34,287 | 1,07 % | 10,741.45 | 11,103 | 0.90 % | 9,019.34 | 10,384 | 1.34 % | 13,429.03 | 7,284 | 0.99 % | 9,907.72 | 5,516 | 1.22 % | 12,188.14
ve | 34,280 | 1,07 % | 10,739.26 | 13,143 | 1.07 % | 10,676.49 | 8,655 | 1.12 % | 11,193.02 | 8,137 | 1.11 % | 11,067.98 | 4,345 | 0.96 % | 9,600.70
al | 34,084 | 1,07 % | 10,677.85 | 12,342 | 1.00 % | 10,025.82 | 8,835 | 1.14 % | 11,425.80 | 8,191 | 1.11 % | 11,141.43 | 4,716 | 1.04 % | 10,420.46
no | 33,592 | 1,05 % | 10,523.72 | 13,329 | 1.08 % | 10,827.59 | 7,617 | 0.98 % | 9,850.63 | 7,694 | 1.05 % | 10,465.41 | 4,952 | 1.09 % | 10,941.93
la | 32,907 | 1,03 % | 10,309.12 | 10,683 | 0.87 % | 8,678.16 | 10,026 | 1.30 % | 12,966.05 | 7,609 | 1.03 % | 10,349.79 | 4,589 | 1.01 % | 10,139.85
mo | 32,638 | 1,02 % | 10,224.85 | 12,761 | 1.04 % | 10,366.18 | 7,917 | 1.02 % | 10,238.60 | 7,200 | 0.98 % | 9,793.47 | 4,760 | 1.05 % | 10,517.69
in | 30,628 | 0,96 % | 9,595.16 | 13,232 | 1.07 % | 10,748.79 | 5,750 | 0.74 % | 7,436.15 | 8,127 | 1.10 % | 11,054.38 | 3,519 | 0.78 % | 7,775.58
ri | 28,449 | 0,89 % | 8,912.52 | 11,410 | 0.93 % | 9,268.72 | 6,213 | 0.80 % | 8,034.92 | 6,978 | 0.95 % | 9,491.50 | 3,848 | 0.85 % | 8,502.53
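The three summary columns in these lists appear to be related in a simple way: the relative frequency is the absolute frequency scaled to one million occurrences, and the percentage is the same ratio scaled to one hundred. This relation is inferred from the printed values rather than stated by the guide, and the helper names below (`parse_number`, `relative_per_million`, `percentage`) are hypothetical. The sketch also handles the mixed decimal marks in the excerpt, where the total percentage is printed with a comma (`1,98 %`) and the other cells use a comma as a thousands separator.

```python
def parse_number(cell: str) -> float:
    """Parse a table cell such as '63,234', '19,809.98', '1,98 %' or '1.97 %'."""
    text = cell.strip()
    if text.endswith("%"):
        # Percentages in the excerpt use either ',' or '.' as the decimal mark.
        return float(text.rstrip("%").strip().replace(",", "."))
    # Counts and relative frequencies use ',' as a thousands separator.
    return float(text.replace(",", ""))


def relative_per_million(absolute: float, total: float) -> float:
    """Scale an absolute frequency to occurrences per million."""
    return absolute / total * 1_000_000


def percentage(absolute: float, total: float) -> float:
    """Share of the total sum of all found character strings, in percent."""
    return absolute / total * 100


# Values for the 2-gram 'je' as printed in the list above.
absolute = parse_number("63,234")
relative = parse_number("19,809.98")
percent = parse_number("1,98 %")

# The total number of found strings is not printed in the excerpt;
# here it is back-calculated from the row itself (an assumption).
implied_total = absolute / relative * 1_000_000

print(round(percentage(absolute, implied_total), 2))
```

Under this reading, a percentage is simply the per-million value divided by 10,000, which makes a quick consistency check possible when importing the lists into other software.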
2.1.8 List of character-level 3-grams in standardized word forms in the GOS 1.0 corpus
File at CLARIN.SI: GOS1.0-characters-standardized_forms-3grams-taxonomy-entire.tsv
Character string | Total absolute frequency of character string | Percentage of total sum of all found character strings | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N]
eee | 23,234 | 1,06 % | 10,564.04 | 10,558 | 1.18 % | 11,812.41 | 4,533 | 0.93 % | 9,276.33 | 4,252 | 0.83 % | 8,274.32 | 3,891 | 1.28 % | 12,841.63
ako | 16,989 | 0,77 % | 7,724.57 | 4,922 | 0.55 % | 5,506.79 | 5,741 | 1.18 % | 11,748.38 | 3,490 | 0.68 % | 6,791.48 | 2,836 | 0.94 % | 9,359.77
tak | 15,519 | 0,71 % | 7,056.19 | 3,880 | 0.43 % | 4,340.99 | 5,712 | 1.17 % | 11,689.04 | 3,136 | 0.61 % | 6,102.60 | 2,791 | 0.92 % | 9,211.25
kaj | 15,315 | 0,70 % | 6,963.43 | 4,702 | 0.53 % | 5,260.65 | 5,369 | 1.10 % | 10,987.12 | 2,934 | 0.57 % | 5,709.52 | 2,310 | 0.76 % | 7,623.79
pre | 12,678 | 0,58 % | 5,764.44 | 6,185 | 0.69 % | 6,919.85 | 1,846 | 0.38 % | 3,777.65 | 2,986 | 0.58 % | 5,810.71 | 1,661 | 0.55 % | 5,481.87
pri | 11,602 | 0,53 % | 5,275.20 | 4,791 | 0.54 % | 5,360.22 | 2,531 | 0.52 % | 5,179.44 | 2,679 | 0.52 % | 5,213.29 | 1,601 | 0.53 % | 5,283.85
ali | 11,376 | 0,52 % | 5,172.44 | 4,555 | 0.51 % | 5,096.18 | 2,686 | 0.55 % | 5,496.63 | 2,399 | 0.47 % | 4,668.41 | 1,736 | 0.57 % | 5,729.39
bil | 10,500 | 0,48 % | 4,774.14 | 3,076 | 0.34 % | 3,441.46 | 4,069 | 0.83 % | 8,326.80 | 2,315 | 0.45 % | 4,504.95 | 1,040 | 0.34 % | 3,432.35
daj | 10,472 | 0,48 % | 4,761.41 | 2,940 | 0.33 % | 3,289.30 | 3,025 | 0.62 % | 6,190.36 | 2,646 | 0.52 % | 5,149.07 | 1,861 | 0.61 % | 6,141.93
sem | 10,355 | 0,47 % | 4,708.22 | 2,293 | 0.26 % | 2,565.43 | 4,337 | 0.89 % | 8,875.24 | 2,315 | 0.45 % | 4,504.95 | 1,410 | 0.47 % | 4,653.48
rav | 9,939 | 0,45 % | 4,519.07 | 4,625 | 0.52 % | 5,174.50 | 1,798 | 0.37 % | 3,679.43 | 2,111 | 0.41 % | 4,107.97 | 1,405 | 0.46 % | 4,636.98
pra | 9,188 | 0,42 % | 4,177.60 | 4,362 | 0.49 % | 4,880.25 | 1,722 | 0.35 % | 3,523.90 | 1,866 | 0.36 % | 3,631.21 | 1,238 | 0.41 % | 4,085.82
udi | 9,038 | 0,41 % | 4,109.40 | 3,368 | 0.38 % | 3,768.16 | 1,903 | 0.39 % | 3,894.30 | 2,329 | 0.45 % | 4,532.20 | 1,438 | 0.47 % | 4,745.89
ist | 8,871 | 0,40 % | 4,033.47 | 3,416 | 0.38 % | 3,821.86 | 2,605 | 0.53 % | 5,330.87 | 1,372 | 0.27 % | 2,669.89 | 1,478 | 0.49 % | 4,877.90
ega | 8,571 | 0,39 % | 3,897.07 | 3,763 | 0.42 % | 4,210.09 | 1,834 | 0.38 % | 3,753.10 | 1,797 | 0.35 % | 3,496.93 | 1,177 | 0.39 % | 3,884.50
ima | 8,420 | 0,38 % | 3,828.41 | 2,497 | 0.28 % | 2,793.67 | 2,752 | 0.56 % | 5,631.69 | 1,553 | 0.30 % | 3,022.11 | 1,618 | 0.53 % | 5,339.95
tud | 8,405 | 0,38 % | 3,821.59 | 3,040 | 0.34 % | 3,401.19 | 1,761 | 0.36 % | 3,603.71 | 2,182 | 0.42 % | 4,246.14 | 1,422 | 0.47 % | 4,693.08
sta | 7,917 | 0,36 % | 3,599.70 | 3,290 | 0.37 % | 3,680.89 | 1,835 | 0.38 % | 3,755.14 | 1,777 | 0.35 % | 3,458.01 | 1,015 | 0.34 % | 3,349.85
ost | 7,915 | 0,36 % | 3,598.80 | 4,061 | 0.45 % | 4,543.49 | 1,036 | 0.21 % | 2,120.07 | 1,767 | 0.34 % | 3,438.55 | 1,051 | 0.35 % | 3,468.66
amo | 7,794 | 0,35 % | 3,543.78 | 2,855 | 0.32 % | 3,194.21 | 2,288 | 0.47 % | 4,682.16 | 1,360 | 0.27 % | 2,646.54 | 1,291 | 0.43 % | 4,260.74
red | 7,672 | 0,35 % | 3,488.31 | 3,353 | 0.38 % | 3,751.37 | 1,515 | 0.31 % | 3,100.30 | 1,620 | 0.32 % | 3,152.49 | 1,184 | 0.39 % | 3,907.60
kak | 7,472 | 0,34 % | 3,397.37 | 2,689 | 0.30 % | 3,008.48 | 2,038 | 0.42 % | 4,170.56 | 1,676 | 0.33 % | 3,261.47 | 1,069 | 0.35 % | 3,528.06
tem | 7,375 | 0,34 % | 3,353.27 | 3,385 | 0.38 % | 3,787.18 | 1,189 | 0.24 % | 2,433.17 | 1,437 | 0.28 % | 2,796.38 | 1,364 | 0.45 % | 4,501.67
zda | 7,253 | 0,33 % | 3,297.80 | 1,971 | 0.22 % | 2,205.18 | 2,176 | 0.45 % | 4,452.97 | 1,636 | 0.32 % | 3,183.63 | 1,470 | 0.48 % | 4,851.50
sto | 7,084 | 0,32 % | 3,220.96 | 2,753 | 0.31 % | 3,080.09 | 1,840 | 0.38 % | 3,765.38 | 1,362 | 0.27 % | 2,650.43 | 1,129 | 0.37 % | 3,726.08
sti | 6,867 | 0,31 % | 3,122.29 | 2,968 | 0.33 % | 3,320.63 | 1,495 | 0.31 % | 3,059.37 | 1,499 | 0.29 % | 2,917.03 | 905 | 0.30 % | 2,986.81
pol | 6,717 | 0,30 % | 3,054.09 | 1,592 | 0.18 % | 1,781.15 | 2,868 | 0.59 % | 5,869.08 | 1,107 | 0.21 % | 2,154.20 | 1,150 | 0.38 % | 3,795.39
del | 6,635 | 0,30 % | 3,016.80 | 2,516 | 0.28 % | 2,814.93 | 1,858 | 0.38 % | 3,802.21 | 1,319 | 0.26 % | 2,566.75 | 942 | 0.31 % | 3,108.92
jaz | 6,579 | 0,30 % | 2,991.34
1,543 | 0.17 % | 1,726.33 | 2,595 | 0.53 % | 5,310.41 | 1,255 | 0.24 % | 2,442.21 | 1,186 | 0.39 % | 3,914.20
nek | 6,556 | 0,30 % | 2,980.88 | 2,595 | 0.29 % | 2,903.31 | 1,726 | 0.35 % | 3,532.09 | 1,125 | 0.22 % | 2,189.23 | 1,110 | 0.37 % | 3,663.38
eda | 6,074 | 0,28 % | 2,761.73 | 2,412 | 0.27 % | 2,698.57 | 1,395 | 0.28 % | 2,854.73 | 1,578 | 0.31 % | 3,070.76 | 689 | 0.23 % | 2,273.93
nje | 5,937 | 0,27 % | 2,699.44 | 3,008 | 0.34 % | 3,365.38 | 952 | 0.20 % | 1,948.17 | 1,238 | 0.24 % | 2,409.13 | 739 | 0.24 % | 2,438.95

2.1.9 List of character-level 4-grams in standardized word forms in the GOS 1.0 corpus
File at CLARIN.SI: GOS1.0-characters-standardized_forms-4grams-taxonomy-entire.tsv
Character string | Total absolute frequency of character string | Percentage of total sum of all found character strings | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N]
tako | 11,224 | 0,73 % | 7,332.61 | 2,720 | 0.41 % | 4,154.73 | 4,096 | 1.32 % | 13,229.25 | 2,289 | 0.63 % | 6,340.84 | 2,119 | 1.03 % | 10,315.95
tudi | 8,187 | 0,54 % | 5,348.55 | 2,992 | 0.46 % | 4,570.20 | 1,706 | 0.55 % | 5,510.03 | 2,130 | 0.59 % | 5,900.39 | 1,359 | 0.66 % | 6,616.04
prav | 7,197 | 0,47 % | 4,701.78 | 3,497 | 0.53 % | 5,341.57 | 1,286 | 0.41 % | 4,153.52 | 1,441 | 0.40 % | 3,991.77 | 973 | 0.47 % | 4,736.87
zdaj | 6,890 | 0,45 % | 4,501.22 | 1,883 | 0.29 % | 2,876.23 | 1,954 | 0.63 % | 6,311.02 | 1,594 | 0.44 % | 4,415.60 | 1,459 | 0.71 % | 7,102.87
kako | 4,576 | 0,30 % | 2,989.49 | 1,561 | 0.24 % | 2,384.39 | 1,423 | 0.46 % | 4,596 | 964 | 0.27 % | 2,670.41 | 628 | 0.31 % | 3,057.30
samo | 4,320 | 0,28 % | 2,822.25 | 1,069 | 0.16 % | 1,632.87 | 1,798 | 0.58 % | 5,807.17 | 656 | 0.18 % | 1,817.21 | 797 | 0.39 % | 3,880.04
misl | 4,143 | 0,27 % | 2,706.61 | 1,060 | 0.16 % | 1,619.12 | 1,485 | 0.48 % | 4,796.25 | 693 | 0.19 % | 1,919.70 | 905 | 0.44 % | 4,405.82
lahk | 4,100 | 0,27 % | 2,678.52 | 1,480 | 0.23 % | 2,260.66 | 900 | 0.29 % | 2,906.82 | 930 | 0.26 % | 2,576.23 | 790 | 0.39 % | 3,845.97
ahko | 4,068 | 0,27 % | 2,657.61 | 1,464 | 0.22 % | 2,236.22 | 891 | 0.29 % | 2,877.75 | 926 | 0.26 % | 2,565.15 | 787 | 0.38 % | 3,831.36
isli | 4,006 | 0,26 % | 2,617.11 | 1,008 | 0.15 % | 1,539.69 | 1,469 | 0.47 % | 4,744.57 | 680 | 0.19 % | 1,883.69 | 849 | 0.41 % | 4,133.20
neka | 3,909 | 0,26 % | 2,553.74 | 1,385 | 0.21 % | 2,115.55 | 1,209 | 0.39 % | 3,904.82 | 730 | 0.20 % | 2,022.20 | 585 | 0.28 % | 2,847.96
pote | 3,659 | 0,24 % | 2,390.42 | 1,440 | 0.22 % | 2,199.56 | 698 | 0.23 % | 2,254.40 | 760 | 0.21 % | 2,105.30 | 761 | 0.37 % | 3,704.79
ampa | 3,590 | 0,23 % | 2,345.34 | 1,279 | 0.20 % | 1,953.64 | 825 | 0.27 % | 2,664.58 | 818 | 0.23 % | 2,265.97 | 668 | 0.33 % | 3,252.03
mpak | 3,544 | 0,23 % | 2,315.29 | 1,239 | 0.19 % | 1,892.54 | 822 | 0.27 % | 2,654.89 | 816 | 0.23 % | 2,260.43 | 667 | 0.33 % | 3,247.16
drug | 3,513 | 0,23 % | 2,295.03 | 1,465 | 0.22 % | 2,237.75 | 891 | 0.29 % | 2,877.75 | 661 | 0.18 % | 1,831.06 | 496 | 0.24 % | 2,414.68
tist | 3,474 | 0,23 % | 2,269.56 | 1,070 | 0.16 % | 1,634.40 | 1,364 | 0.44 % | 4,405.44 | 539 | 0.15 % | 1,493.10 | 501 | 0.24 % | 2,439.02
ravi | 3,369 | 0,22 % | 2,200.96 | 1,492 | 0.23 % | 2,278.99 | 664 | 0.21 % | 2,144.59 | 706 | 0.20 % | 1,955.72 | 507 | 0.25 % | 2,468.23
bilo | 3,324 | 0,22 % | 2,171.56 | 894 | 0.14 % | 1,365.56 | 1,415 | 0.46 % | 4,570.16 | 594 | 0.17 % | 1,645.46 | 421 | 0.20 % | 2,049.56
otem | 3,258 | 0,21 % | 2,128.44 | 1,198 | 0.18 % | 1,829.91 | 649 | 0.21 % | 2,096.14 | 681 | 0.19 % | 1,886.46 | 730 | 0.35 % | 3,553.87
liko | 3,255 | 0,21 % | 2,126.48 | 1,203 | 0.18 % | 1,837.55 | 914 | 0.29 % | 2,952.03 | 646 | 0.18 % | 1,789.51 | 492 | 0.24 % | 2,395.21
pred | 3,247 | 0,21 % | 2,121.26 | 1,683 | 0.26 % | 2,570.74 | 410 | 0.13 % | 1,324.22 | 794 | 0.22 % | 2,199.49 | 360 | 0.17 % | 1,752.59
krat | 3,235 | 0,21 % | 2,113.42 | 1,207 | 0.18 % | 1,843.66 | 799 | 0.26 % | 2,580.61 | 786 | 0.22 % | 2,177.33 | 443 | 0.22 % | 2,156.66
dobr | 3,204 | 0,21 % | 2,093.17 | 954 | 0.15 % | 1,457.21 | 600 | 0.19 % | 1,937.88 | 1,187 | 0.33 % | 3,288.15 | 463 | 0.23 % | 2,254.03
eset | 3,113 | 0,20 % | 2,033.72 | 1,064 | 0.16 % | 1,625.23 | 604 | 0.20 % | 1,950.80 | 867 | 0.24 % | 2,401.71 | 578 | 0.28 % | 2,813.88
dese | 3,091 | 0,20 % | 2,019.34 | 1,048 | 0.16 % | 1,600.79 | 601 | 0.19 % | 1,941.11 | 864 | 0.24 % | 2,393.40 | 578 | 0.28 % | 2,813.88
bolj | 3,009 | 0,20 % | 1,965.77 | 905 | 0.14 % | 1,382.36
887 | 0.29 % | 2,864.83 | 752 | 0.21 % | 2,083.14 | 465 | 0.23 % | 2,263.77
veda | 2,934 | 0,19 % | 1,916.78 | 1,370 | 0.21 % | 2,092.64 | 452 | 0.15 % | 1,459.87 | 836 | 0.23 % | 2,315.83 | 276 | 0.13 % | 1,343.65
slim | 2,931 | 0,19 % | 1,914.82 | 700 | 0.11 % | 1,069.23 | 1,061 | 0.34 % | 3,426.81 | 474 | 0.13 % | 1,313.04 | 696 | 0.34 % | 3,388.35
isto | 2,910 | 0,19 % | 1,901.10 | 874 | 0.13 % | 1,335.01 | 1,183 | 0.38 % | 3,820.85 | 362 | 0.10 % | 1,002.79 | 491 | 0.24 % | 2,390.34
prej | 2,834 | 0,18 % | 1,851.45 | 1,405 | 0.21 % | 2,146.10 | 522 | 0.17 % | 1,685.95 | 551 | 0.15 % | 1,526.35 | 356 | 0.17 % | 1,733.12
malo | 2,797 | 0,18 % | 1,827.27 | 541 | 0.08 % | 826.36 | 902 | 0.29 % | 2,913.28 | 758 | 0.21 % | 2,099.76 | 596 | 0.29 % | 2,901.51
obro | 2,752 | 0,18 % | 1,797.87 | 805 | 0.12 % | 1,229.62 | 503 | 0.16 % | 1,624.59 | 1,014 | 0.28 % | 2,808.92 | 430 | 0.21 % | 2,093.37

2.1.10 List of character-level 5-grams in standardized word forms in the GOS 1.0 corpus
File at CLARIN.SI: GOS1.0-characters-standardized_forms-5grams-taxonomy-entire.tsv
Character string | Total absolute frequency of character string | Percentage of total sum of all found character strings | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N]
lahko | 4,067 | 0,39 % | 3,924.39 | 1,464 | 0.31 % | 3,142.31 | 891 | 0.47 % | 4,672.95 | 925 | 0.38 % | 3,808.25 | 787 | 0.57 % | 5,749.81
misli | 3,982 | 0,38 % | 3,842.37 | 990 | 0.21 % | 2,124.92 | 1,465 | 0.77 % | 7,683.35 | 678 | 0.28 % | 2,791.34 | 849 | 0.62 % | 6,202.79
ampak | 3,541 | 0,34 % | 3,416.84 | 1,238 | 0.27 % | 2,657.23 | 821 | 0.43 % | 4,305.82 | 815 | 0.34 % | 3,355.37 | 667 | 0.49 % | 4,873.09
potem | 3,256 | 0,31 % | 3,141.83 | 1,198 | 0.26 % | 2,571.37 | 649 | 0.34 % | 3,403.75 | 681 | 0.28 % | 2,803.69 | 728 | 0.53 % | 5,318.76
pravi | 3,155 | 0,30 % | 3,044.37 | 1,392 | 0.30 % | 2,987.77 | 620 | 0.33 % | 3,251.66 | 656 | 0.27 % | 2,700.77 | 487 | 0.36 % | 3,558.02
deset | 3,085 | 0,30 % | 2,976.83 | 1,048 | 0.23 % | 2,249.41 | 600 | 0.32 % | 3,146.77 | 862 | 0.35 % | 3,548.87 | 575 | 0.42 % | 4,200.94
islim | 2,907 | 0,28 % | 2,805.07 | 679 | 0.15 % | 1,457.40 | 1,058 | 0.56 % | 5,548.80 | 474 | 0.20 % | 1,951.47 | 696 | 0.51 % | 5,084.97
dobro | 2,724 | 0,26 % | 2,628.48 | 798 | 0.17 % | 1,712.82 | 499 | 0.26 % | 2,617.06 | 1,010 | 0.42 % | 4,158.19 | 417 | 0.30 % | 3,046.60
nekaj | 2,555 | 0,25 % | 2,465.41 | 758 | 0.16 % | 1,626.96 | 928 | 0.49 % | 4,867 | 480 | 0.20 % | 1,976.17 | 389 | 0.28 % | 2,842.03
oliko | 2,142 | 0,21 % | 2,066.89 | 616 | 0.13 % | 1,322.17 | 751 | 0.39 % | 3,938.70 | 391 | 0.16 % | 1,609.76 | 384 | 0.28 % | 2,805.50
govor | 2,079 | 0,20 % | 2,006.10 | 1,070 | 0.23 % | 2,296.64 | 286 | 0.15 % | 1,499.96 | 412 | 0.17 % | 1,696.21 | 311 | 0.23 % | 2,272.16
gleda | 2,006 | 0,19 % | 1,935.66 | 591 | 0.13 % | 1,268.52 | 644 | 0.34 % | 3,377.53 | 445 | 0.18 % | 1,832.07 | 326 | 0.24 % | 2,381.75
eveda | 1,799 | 0,17 % | 1,735.92 | 899 | 0.19 % | 1,929.60 | 189 | 0.10 % | 991.23 | 543 | 0.22 % | 2,235.54 | 168 | 0.12 % | 1,227.41
seved | 1,795 | 0,17 % | 1,732.06 | 896 | 0.19 % | 1,923.16 | 188 | 0.10 % | 985.99 | 543 | 0.22 % | 2,235.54 | 168 | 0.12 % | 1,227.41
recim | 1,763 | 0,17 % | 1,701.18 | 753 | 0.16 % | 1,616.23 | 277 | 0.14 % | 1,452.76 | 358 | 0.15 % | 1,473.89 | 375 | 0.27 % | 2,739.75
kakšn | 1,762 | 0,17 % | 1,700.22 | 734 | 0.16 % | 1,575.45 | 325 | 0.17 % | 1,704.50 | 465 | 0.19 % | 1,914.42 | 238 | 0.17 % | 1,738.83
ecimo | 1,761 | 0,17 % | 1,699.25 | 751 | 0.16 % | 1,611.94 | 277 | 0.14 % | 1,452.76 | 358 | 0.15 % | 1,473.89 | 375 | 0.27 % | 2,739.75
kater | 1,743 | 0,17 % | 1,681.88 | 931 | 0.20 % | 1,998.29 | 295 | 0.15 % | 1,547.16 | 357 | 0.15 % | 1,469.78 | 160 | 0.12 % | 1,168.96
loven | 1,698 | 0,16 % | 1,638.46 | 1,014 | 0.22 % | 2,176.44 | 159 | 0.08 % | 833.89 | 431 | 0.18 % | 1,774.44 | 94 | 0.07 % | 686.76
torej | 1,698 | 0,16 % | 1,638.46 | 945 | 0.20 % | 2,028.34 | 46 | 0.02 % | 241.25 | 570 | 0.23 % | 2,346.70 | 137 | 0.10 % | 1,000.92
nared | 1,623 | 0,16 % | 1,566.09 | 458 | 0.10 % | 983.05 | 548 | 0.29 % | 2,874.05 | 220 | 0.09 % | 905.74 | 397 | 0.29 % | 2,900.48
aredi | 1,620 | 0,16 % | 1,563.20 | 459 | 0.10 % | 985.19 | 546 | 0.29 % | 2,863.56 | 218 | 0.09 % | 897.51 | 397 | 0.29 % | 2,900.48
velik | 1,545 | 0,15 % | 1,490.82 | 752 | 0.16 % | 1,614.08 | 283 | 0.15 % | 1,484.22 | 376 | 0.15 % | 1,548 | 134 | 0.10 % | 979
danes | 1,501 | 0,14 % | 1,448.37 | 473 | 0.10 % | 1,015.24
271 | 0.14 % | 1,421.29 | 668 | 0.28 % | 2,750.17 | 89 | 0.07 % | 650.23
štiri | 1,413 | 0,14 % | 1,363.45 | 516 | 0.11 % | 1,107.54 | 273 | 0.14 % | 1,431.78 | 400 | 0.17 % | 1,646.81 | 224 | 0.16 % | 1,636.54
rekel | 1,412 | 0,14 % | 1,362.49 | 362 | 0.08 % | 776.99 | 633 | 0.33 % | 3,319.84 | 239 | 0.10 % | 983.97 | 178 | 0.13 % | 1,300.47
kolik | 1,373 | 0,13 % | 1,324.86 | 440 | 0.09 % | 944.41 | 479 | 0.25 % | 2,512.17 | 212 | 0.09 % | 872.81 | 242 | 0.18 % | 1,768.05
poved | 1,342 | 0,13 % | 1,294.94 | 626 | 0.13 % | 1,343.64 | 278 | 0.15 % | 1,458 | 328 | 0.14 % | 1,350.38 | 110 | 0.08 % | 803.66
ovori | 1,315 | 0,13 % | 1,268.89 | 651 | 0.14 % | 1,397.30 | 238 | 0.12 % | 1,248.22 | 278 | 0.11 % | 1,144.53 | 148 | 0.11 % | 1,081.29
tisti | 1,313 | 0,13 % | 1,266.96 | 486 | 0.10 % | 1,043.14 | 438 | 0.23 % | 2,297.14 | 228 | 0.09 % | 938.68 | 161 | 0.12 % | 1,176.26
imamo | 1,304 | 0,13 % | 1,258.28 | 635 | 0.14 % | 1,362.96 | 191 | 0.10 % | 1,001.72 | 268 | 0.11 % | 1,103.36 | 210 | 0.15 % | 1,534.26
najst | 1,286 | 0,12 % | 1,240.91 | 373 | 0.08 % | 800.60 | 291 | 0.15 % | 1,526.18 | 411 | 0.17 % | 1,692.10 | 211 | 0.15 % | 1,541.56

2.1.11 List of character-level 1-grams in lower-case word forms in the GOS 1.0 corpus
File at CLARIN.SI: GOS1.0-characters-lowercase_forms-1grams-taxonomy-entire.tsv
Character string | Total absolute frequency of character string | Percentage of total sum of all found character strings | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N]
e | 529,810 | 12,92 % | 129,218.26 | 119,052 | 12.64 % | 126,437.20 | 128,972 | 12.74 % | 127,422.67 | 202,129 | 12.96 % | 129,564.31 | 79,657 | 13.59 % | 135,863.66
a | 466,433 | 11,38 % | 113,760.90 | 105,222 | 11.18 % | 111,749.28 | 126,545 | 12.50 % | 125,024.82 | 166,090 | 10.65 % | 106,463.38 | 68,576 | 11.70 % | 116,963.81
o | 351,600 | 8,57 % | 85,753.65 | 82,260 | 8.74 % | 87,362.86 | 75,957 | 7.50 % | 75,044.53 | 143,088 | 9.17 % | 91,719.14 | 50,295 | 8.58 % | 85,783.58
i | 285,667 | 6,97 % | 69,672.89 | 68,208 | 7.24 % | 72,439.17 | 60,987 | 6.03 % | 60,254.37 | 119,906 | 7.69 % | 76,859.52 | 36,566 | 6.24 % | 62,367.28
n | 253,143 | 6,17 % | 61,740.43 | 58,563 | 6.22 % | 62,195.86 | 61,565 | 6.08 % | 60,825.42 | 97,228 | 6.23 % | 62,322.96 | 35,787 | 6.10 % | 61,038.61
t | 218,286 | 5,32 % | 53,238.97 | 50,177 | 5.33 % | 53,289.65 | 52,292 | 5.17 % | 51,663.82 | 81,771 | 5.24 % | 52,415.06 | 34,046 | 5.81 % | 58,069.15
j | 200,904 | 4,90 % | 48,999.58 | 45,106 | 4.79 % | 47,904.08 | 58,702 | 5.80 % | 57,996.82 | 68,504 | 4.39 % | 43,910.93 | 28,592 | 4.88 % | 48,766.76
s | 190,692 | 4,65 % | 46,508.92 | 43,836 | 4.66 % | 46,555.29 | 47,350 | 4.68 % | 46,781.19 | 72,853 | 4.67 % | 46,698.64 | 26,653 | 4.55 % | 45,459.58
r | 187,802 | 4,58 % | 45,804.06 | 44,860 | 4.76 % | 47,642.82 | 39,929 | 3.94 % | 39,449.34 | 77,623 | 4.98 % | 49,756.20 | 25,390 | 4.33 % | 43,305.40
m | 173,101 | 4,22 % | 42,218.55 | 37,016 | 3.93 % | 39,312.23 | 48,040 | 4.75 % | 47,462.90 | 59,130 | 3.79 % | 37,902.22 | 28,915 | 4.93 % | 49,317.67
k | 164,634 | 4,01 % | 40,153.49 | 36,202 | 3.85 % | 38,447.73 | 43,030 | 4.25 % | 42,513.08 | 61,960 | 3.97 % | 39,716.24 | 23,442 | 4.00 % | 39,982.88
l | 159,612 | 3,89 % | 38,928.65 | 37,507 | 3.98 % | 39,833.69 | 40,355 | 3.99 % | 39,870.22 | 60,312 | 3.87 % | 38,659.88 | 21,438 | 3.66 % | 36,564.84
p | 149,509 | 3,65 % | 36,464.57 | 32,182 | 3.42 % | 34,178.36 | 39,091 | 3.86 % | 38,621.40 | 55,982 | 3.59 % | 35,884.36 | 22,254 | 3.80 % | 37,956.61
d | 145,103 | 3,54 % | 35,389.97 | 35,373 | 3.76 % | 37,567.31 | 32,492 | 3.21 % | 32,101.68 | 57,544 | 3.69 % | 36,885.60 | 19,694 | 3.36 % | 33,590.25
v | 143,344 | 3,50 % | 34,960.95 | 34,415 | 3.65 % | 36,549.88 | 30,619 | 3.02 % | 30,251.18 | 59,305 | 3.80 % | 38,014.39 | 19,005 | 3.24 % | 32,415.09
u | 81,424 | 1,99 % | 19,858.95 | 18,335 | 1.95 % | 19,472.38 | 24,878 | 2.46 % | 24,579.14 | 28,096 | 1.80 % | 18,009.48 | 10,115 | 1.73 % | 17,252.23
z | 78,503 | 1,92 % | 19,146.53 | 17,321 | 1.84 % | 18,395.48 | 17,926 | 1.77 % | 17,710.66 | 31,874 | 2.04 % | 20,431.17 | 11,382 | 1.94 % | 19,413.24
b | 71,647 | 1,75 % | 17,474.38 | 17,234 | 1.83 % | 18,303.08 | 18,532 | 1.83 % | 18,309.38 | 25,897 | 1.66 % | 16,599.93 | 9,984 | 1.70 % | 17,028.80
č | 54,099 | 1,32 % | 13,194.50 | 12,062 | 1.28 % | 12,810.25 | 13,522 | 1.34 % | 13,359.56 | 20,552 | 1.32 %
13,173.79 | 7,963 | 1.36 % | 13,581.76
š | 52,463 | 1,28 % | 12,795.49 | 12,548 | 1.33 % | 13,326.39 | 16,355 | 1.62 % | 16,158.53 | 16,457 | 1.05 % | 10,548.91 | 7,103 | 1.21 % | 12,114.94
g | 50,092 | 1,22 % | 12,217.21 | 12,015 | 1.28 % | 12,760.33 | 11,492 | 1.14 % | 11,353.95 | 20,222 | 1.30 % | 12,962.26 | 6,363 | 1.08 % | 10,852.79
h | 39,620 | 0,97 % | 9,663.14 | 8,465 | 0.90 % | 8,990.11 | 10,652 | 1.05 % | 10,524.04 | 13,829 | 0.89 % | 8,864.36 | 6,674 | 1.14 % | 11,383.23
c | 23,750 | 0,58 % | 5,792.52 | 5,967 | 0.63 % | 6,337.15 | 4,877 | 0.48 % | 4,818.41 | 9,853 | 0.63 % | 6,315.75 | 3,053 | 0.52 % | 5,207.22
ž | 19,241 | 0,47 % | 4,692.79 | 5,333 | 0.57 % | 5,663.82 | 4,256 | 0.42 % | 4,204.87 | 7,521 | 0.48 % | 4,820.95 | 2,131 | 0.36 % | 3,634.65
f | 9,629 | 0,23 % | 2,348.47 | 2,329 | 0.25 % | 2,473.48 | 3,739 | 0.37 % | 3,694.08 | 2,340 | 0.15 % | 1,499.94 | 1,221 | 0.21 % | 2,082.55
ć | 4 | 0 % | 0.98 | 0 | 0 % | 0 | 1 | 0 % | 0.99 | 1 | 0 % | 0.64 | 2 | 0 % | 3.41
w | 2 | 0 % | 0.49 | 1 | 0 % | 1.06 | 1 | 0 % | 0.99 | 0 | 0 % | 0 | 0 | 0 % | 0
q | 1 | 0 % | 0.24 | 0 | 0 % | 0 | 1 | 0 % | 0.99 | 0 | 0 % | 0 | 0 | 0 % | 0
y | 1 | 0 % | 0.24 | 0 | 0 % | 0 | 1 | 0 % | 0.99 | 0 | 0 % | 0 | 0 | 0 % | 0
đ | 1 | 0 % | 0.24 | 1 | 0 % | 1.06 | 0 | 0 % | 0 | 0 | 0 % | 0 | 0 | 0 % | 0

2.1.12 List of character-level 2-grams in lower-case word forms in the GOS 1.0 corpus
File at CLARIN.SI: GOS1.0-characters-lowercase_forms-2grams-taxonomy-entire.tsv
Character string | Total absolute frequency of character string | Percentage of total sum of all found character strings | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N]
je | 63,298 | 2,06 % | 20,651.81 | 13,056 | 1.84 % | 18,361.14 | 17,819 | 2.49 % | 24,865.44 | 24,153 | 2.00 % | 20,012.13 | 8,270 | 1.92 % | 19,214.33
ne | 57,839 | 1,89 % | 18,870.74 | 12,450 | 1.75 % | 17,508.90 | 16,953 | 2.37 % | 23,656.99 | 18,730 | 1.55 % | 15,518.87 | 9,706 | 2.25 % | 22,550.70
ee | 50,274 | 1,64 % | 16,402.56 | 9,017 | 1.27 % | 12,680.94 | 9,999 | 1.40 % | 13,953.06 | 22,631 | 1.88 % | 18,751.07 | 8,627 | 2.00 % | 20,043.77
na | 49,509 | 1,61 % | 16,152.97 | 11,561 | 1.63 % | 16,258.66 | 11,025 | 1.54 % | 15,384.79 | 20,255 | 1.68 % | 16,782.42 | 6,668 | 1.55 % | 15,492.28
ja | 45,448 | 1,48 % | 14,828.01 | 9,287 | 1.31 % | 13,060.65 | 15,766 | 2.20 % | 22,000.59 | 12,524 | 1.04 % | 10,376.84 | 7,871 | 1.83 % | 18,287.30
se | 45,328 | 1,48 % | 14,788.86 | 10,623 | 1.49 % | 14,939.52 | 12,015 | 1.68 % | 16,766.28 | 15,525 | 1.29 % | 12,863.34 | 7,165 | 1.67 % | 16,647
re | 44,432 | 1,45 % | 14,496.53 | 10,545 | 1.48 % | 14,829.83 | 9,395 | 1.31 % | 13,110.21 | 18,361 | 1.52 % | 15,213.13 | 6,131 | 1.42 % | 14,244.62
pa | 42,127 | 1,37 % | 13,744.49 | 8,345 | 1.17 % | 11,735.88 | 15,495 | 2.16 % | 21,622.43 | 11,424 | 0.95 % | 9,465.43 | 6,863 | 1.59 % | 15,945.34
st | 41,112 | 1,34 % | 13,413.33 | 8,771 | 1.23 % | 12,334.98 | 8,298 | 1.16 % | 11,579.41 | 18,001 | 1.49 % | 14,914.85 | 6,042 | 1.40 % | 14,037.84
po | 40,800 | 1,33 % | 13,311.54 | 8,900 | 1.25 % | 12,516.40 | 8,697 | 1.21 % | 12,136.19 | 17,096 | 1.42 % | 14,165.01 | 6,107 | 1.42 % | 14,188.86
pr | 40,559 | 1,32 % | 13,232.91 | 9,097 | 1.28 % | 12,793.45 | 7,378 | 1.03 % | 10,295.60 | 18,526 | 1.53 % | 15,349.84 | 5,558 | 1.29 % | 12,913.33
ko | 40,359 | 1,32 % | 13,167.66 | 8,932 | 1.26 % | 12,561.40 | 8,300 | 1.16 % | 11,582.20 | 16,822 | 1.39 % | 13,937.98 | 6,305 | 1.47 % | 14,648.89
ra | 39,004 | 1,27 % | 12,725.57 | 8,936 | 1.26 % | 12,567.03 | 7,301 | 1.02 % | 10,188.15 | 17,505 | 1.45 % | 14,503.89 | 5,262 | 1.22 % | 12,225.61
te | 38,097 | 1,24 % | 12,429.65 | 9,286 | 1.31 % | 13,059.25 | 6,654 | 0.93 % | 9,285.29 | 15,914 | 1.32 % | 13,185.65 | 6,243 | 1.45 % | 14,504.84
da | 37,299 | 1,22 % | 12,169.29 | 9,142 | 1.29 % | 12,856.74 | 8,253 | 1.15 % | 11,516.61 | 14,611 | 1.21 % | 12,106.04 | 5,293 | 1.23 % | 12,297.63
to | 36,338 | 1,19 % | 11,855.75 | 8,078 | 1.14 % | 11,360.39 | 8,485 | 1.18 % | 11,840.36 | 13,485 | 1.12 % | 11,173.09 | 6,290 | 1.46 % | 14,614.04
em | 35,850 | 1,17 % | 11,696.54 | 7,488 | 1.05 % | 10,530.65 | 8,731 | 1.22 % | 12,183.64 | 13,879 | 1.15 % | 11,499.54 | 5,752 | 1.34 % | 13,364.06
ni | 34,748 | 1,13 % | 11,336.99 | 7,436 | 1.05 % | 10,457.52 | 7,274 | 1.01 % | 10,150.47 | 15,171 | 1.26 % | 12,570.03 | 4,867 | 1.13 % | 11,307.88
ka | 34,382 | 1,12 % | 11,217.58 | 8,022 | 1.13 % | 11,281.64 | 8,477 | 1.18 % | 11,829.19 | 13,518 | 1.12 %
11,200.43 | 4,365 | 1.01 % | 10,141.54
en | 33,929 | 1,11 % | 11,069.78 | 7,956 | 1.12 % | 11,188.82 | 7,328 | 1.02 % | 10,225.82 | 13,869 | 1.15 % | 11,491.25 | 4,776 | 1.11 % | 11,096.45
ve | 33,738 | 1,10 % | 11,007.47 | 8,143 | 1.15 % | 11,451.80 | 8,014 | 1.12 % | 11,183.10 | 13,218 | 1.09 % | 10,951.86 | 4,363 | 1.01 % | 10,136.89
ta | 32,724 | 1,07 % | 10,676.64 | 7,369 | 1.04 % | 10,363.30 | 9,629 | 1.34 % | 13,436.75 | 11,036 | 0.91 % | 9,143.95 | 4,690 | 1.09 % | 10,896.64
la | 32,645 | 1,06 % | 10,650.86 | 7,572 | 1.06 % | 10,648.79 | 9,457 | 1.32 % | 13,196.73 | 11,024 | 0.91 % | 9,134.01 | 4,592 | 1.07 % | 10,668.95
aj | 32,264 | 1,05 % | 10,526.56 | 8,295 | 1.17 % | 11,665.57 | 8,593 | 1.20 % | 11,991.06 | 10,905 | 0.90 % | 9,035.41 | 4,471 | 1.04 % | 10,387.82
no | 32,071 | 1,05 % | 10,463.59 | 7,593 | 1.07 % | 10,678.32 | 6,487 | 0.91 % | 9,052.26 | 13,250 | 1.10 % | 10,978.38 | 4,741 | 1.10 % | 11,015.13
al | 31,572 | 1,03 % | 10,300.78 | 7,691 | 1.08 % | 10,816.14 | 7,339 | 1.02 % | 10,241.17 | 12,010 | 0.99 % | 9,950.97 | 4,532 | 1.05 % | 10,529.54
mo | 30,472 | 0,99 % | 9,941.89 | 7,055 | 0.99 % | 9,921.71 | 6,486 | 0.91 % | 9,050.86 | 12,568 | 1.04 % | 10,413.30 | 4,363 | 1.01 % | 10,136.89
in | 29,684 | 0,97 % | 9,684.80 | 8,080 | 1.14 % | 11,363.20 | 5,238 | 0.73 % | 7,309.34 | 12,997 | 1.08 % | 10,768.75 | 3,369 | 0.78 % | 7,827.46
ej | 29,365 | 0,96 % | 9,580.72 | 6,719 | 0.94 % | 9,449.18 | 9,520 | 1.33 % | 13,284.64 | 8,916 | 0.74 % | 7,387.41 | 4,210 | 0.98 % | 9,781.42
ak | 27,916 | 0,91 % | 9,107.96 | 6,014 | 0.85 % | 8,457.71 | 7,564 | 1.06 % | 10,555.15 | 10,039 | 0.83 % | 8,317.88 | 4,299 | 1.00 % | 9,988.20
li | 27,231 | 0,89 % | 8,884.47 | 6,547 | 0.92 % | 9,207.29 | 5,102 | 0.71 % | 7,119.56 | 12,559 | 1.04 % | 10,405.84 | 3,023 | 0.70 % | 7,023.57
ti | 26,916 | 0,88 % | 8,781.70 | 6,607 | 0.93 % | 9,291.67 | 6,631 | 0.93 % | 9,253.20 | 10,356 | 0.86 % | 8,580.53 | 3,322 | 0.77 % | 7,718.26

2.1.13 List of character-level 3-grams in lower-case word forms in the GOS 1.0 corpus
File at CLARIN.SI: GOS1.0-characters-lowercase_forms-3grams-taxonomy-entire.tsv
Character string | Total absolute frequency of character string | Percentage of total sum of all found character strings | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N]
eee | 23,236 | 1,12 % | 11,182.45 | 4,252 | 0.87 % | 8,671.02 | 4,535 | 1.04 % | 10,409.18 | 10,558 | 1.21 % | 12,131.98 | 3,891 | 1.38 % | 13,817.72
pre | 12,789 | 0,61 % | 6,154.77 | 3,026 | 0.62 % | 6,170.86 | 1,889 | 0.43 % | 4,335.82 | 6,208 | 0.71 % | 7,133.48 | 1,666 | 0.59 % | 5,916.30
tak | 11,104 | 0,53 % | 5,343.86 | 2,358 | 0.48 % | 4,808.62 | 3,685 | 0.85 % | 8,458.18 | 3,214 | 0.37 % | 3,693.14 | 1,847 | 0.66 % | 6,559.07
pri | 9,251 | 0,45 % | 4,452.09 | 2,310 | 0.47 % | 4,710.74 | 1,326 | 0.30 % | 3,043.57 | 4,368 | 0.50 % | 5,019.18 | 1,247 | 0.44 % | 4,428.35
kaj | 8,522 | 0,41 % | 4,101.26 | 1,936 | 0.40 % | 3,948.05 | 1,830 | 0.42 % | 4,200.40 | 3,450 | 0.40 % | 3,964.32 | 1,306 | 0.46 % | 4,637.87
ist | 8,444 | 0,41 % | 4,063.72 | 1,310 | 0.27 % | 2,671.46 | 2,294 | 0.53 % | 5,265.42 | 3,416 | 0.39 % | 3,925.25 | 1,424 | 0.51 % | 5,056.91
rav | 8,251 | 0,40 % | 3,970.84 | 1,826 | 0.37 % | 3,723.73 | 1,364 | 0.31 % | 3,130.79 | 3,870 | 0.45 % | 4,446.94 | 1,191 | 0.42 % | 4,229.48
tud | 8,156 | 0,39 % | 3,925.12 | 2,107 | 0.43 % | 4,296.76 | 1,623 | 0.37 % | 3,725.27 | 3,031 | 0.35 % | 3,482.86 | 1,395 | 0.49 % | 4,953.92
sta | 7,892 | 0,38 % | 3,798.07 | 1,793 | 0.37 % | 3,656.43 | 1,801 | 0.41 % | 4,133.83 | 3,295 | 0.38 % | 3,786.22 | 1,003 | 0.36 % | 3,561.85
ako | 7,750 | 0,37 % | 3,729.73 | 1,870 | 0.38 % | 3,813.45 | 979 | 0.23 % | 2,247.10 | 3,814 | 0.44 % | 4,382.59 | 1,087 | 0.39 % | 3,860.15
pra | 7,666 | 0,37 % | 3,689.30 | 1,645 | 0.34 % | 3,354.62 | 1,357 | 0.31 % | 3,114.72 | 3,613 | 0.41 % | 4,151.62 | 1,051 | 0.37 % | 3,732.31
sem | 7,660 | 0,37 % | 3,686.42 | 1,757 | 0.36 % | 3,583.02 | 2,694 | 0.62 % | 6,183.54 | 2,039 | 0.23 % | 2,342.97 | 1,170 | 0.41 % | 4,154.90
ost | 7,620 | 0,37 % | 3,667.17 | 1,702 | 0.35 % | 3,470.86 | 942 | 0.22 % | 2,162.17 | 3,993 | 0.46 % | 4,588.27 | 983 | 0.35 % | 3,490.83
ega | 6,820 | 0,33 % | 3,282.16 | 1,523 | 0.31 % | 3,105.82 | 920 | 0.21 % | 2,111.68 | 3,462 | 0.40 % | 3,978.11 | 915 | 0.33 % | 3,249.35
tem | 6,635 | 0,32 % | 3,193.13 | 1,237 | 0.25 % | 2,522.59 | 715 | 0.16 % | 1,641.14 | 3,373
0.39 % | 3,875.84 | 1,310 | 0.47 % | 4,652.07
ali | 6,603 | 0,32 % | 3,177.73 | 1,615 | 0.33 % | 3,293.44 | 839 | 0.19 % | 1,925.76 | 3,436 | 0.40 % | 3,948.24 | 713 | 0.25 % | 2,532.01
red | 6,568 | 0,32 % | 3,160.89 | 1,501 | 0.31 % | 3,060.96 | 1,075 | 0.25 % | 2,467.45 | 3,091 | 0.35 % | 3,551.80 | 901 | 0.32 % | 3,199.63
nek | 6,411 | 0,31 % | 3,085.33 | 1,102 | 0.23 % | 2,247.29 | 1,629 | 0.37 % | 3,739.04 | 2,577 | 0.30 % | 2,961.18 | 1,103 | 0.39 % | 3,916.97
pro | 6,174 | 0,30 % | 2,971.27 | 1,060 | 0.22 % | 2,161.64 | 1,127 | 0.26 % | 2,586.80 | 3,045 | 0.35 % | 3,498.95 | 942 | 0.34 % | 3,345.23
nje | 5,996 | 0,29 % | 2,885.61 | 1,227 | 0.25 % | 2,502.20 | 1,018 | 0.23 % | 2,336.61 | 3,011 | 0.35 % | 3,459.88 | 740 | 0.26 % | 2,627.89
sti | 5,833 | 0,28 % | 2,807.16 | 1,336 | 0.27 % | 2,724.48 | 1,009 | 0.23 % | 2,315.96 | 2,754 | 0.32 % | 3,164.56 | 734 | 0.26 % | 2,606.58
sto | 5,820 | 0,28 % | 2,800.91 | 1,277 | 0.26 % | 2,604.16 | 1,130 | 0.26 % | 2,593.69 | 2,524 | 0.29 % | 2,900.28 | 889 | 0.32 % | 3,157.02
rej | 5,780 | 0,28 % | 2,781.66 | 1,422 | 0.29 % | 2,899.86 | 1,067 | 0.24 % | 2,449.08 | 2,681 | 0.31 % | 3,080.68 | 610 | 0.22 % | 2,166.23
del | 5,721 | 0,28 % | 2,753.26 | 1,078 | 0.22 % | 2,198.34 | 1,363 | 0.31 % | 3,128.49 | 2,410 | 0.28 % | 2,769.28 | 870 | 0.31 % | 3,089.54
pol | 5,686 | 0,27 % | 2,736.42 | 1,157 | 0.24 % | 2,359.45 | 2,000 | 0.46 % | 4,590.60 | 1,553 | 0.18 % | 1,784.52 | 976 | 0.35 % | 3,465.97
sam | 5,655 | 0,27 % | 2,721.50 | 985 | 0.20 % | 2,008.69 | 2,068 | 0.47 % | 4,746.68 | 1,625 | 0.19 % | 1,867.25 | 977 | 0.35 % | 3,469.52
dob | 5,595 | 0,27 % | 2,692.62 | 1,764 | 0.36 % | 3,597.29 | 1,042 | 0.24 % | 2,391.70 | 1,934 | 0.22 % | 2,222.32 | 855 | 0.30 % | 3,036.28
dej | 5,535 | 0,27 % | 2,663.75 | 1,219 | 0.25 % | 2,485.88 | 1,388 | 0.32 % | 3,185.88 | 2,009 | 0.23 % | 2,308.50 | 919 | 0.33 % | 3,263.55
amo | 5,506 | 0,27 % | 2,649.79 | 1,134 | 0.23 % | 2,312.54 | 958 | 0.22 % | 2,198.90 | 2,559 | 0.29 % | 2,940.49 | 855 | 0.30 % | 3,036.28
anj | 5,443 | 0,26 % | 2,619.47 | 1,094 | 0.22 % | 2,230.97 | 534 | 0.12 % | 1,225.69 | 3,125 | 0.36 % | 3,590.87 | 690 | 0.24 % | 2,450.33
kak | 5,421 | 0,26 % | 2,608.89 | 1,205 | 0.25 % | 2,457.33 | 1,158 | 0.27 % | 2,657.96 | 2,253 | 0.26 % | 2,588.88 | 805 | 0.29 % | 2,858.72
ove | 5,417 | 0,26 % | 2,606.96 | 1,350 | 0.28 % | 2,753.03 | 787 | 0.18 % | 1,806.40 | 2,741 | 0.32 % | 3,149.63 | 539 | 0.19 % | 1,914.10

2.1.14 List of character-level 4-grams in lower-case word forms in the GOS 1.0 corpus
File at CLARIN.SI: GOS1.0-characters-lowercase_forms-4grams-taxonomy-entire.tsv
Character string | Total absolute frequency of character string | Percentage of total sum of all found character strings | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N]
prav | 5,556 | 0,39 % | 3,885.59 | 1,168 | 0.34 % | 3,421.54 | 873 | 0.33 % | 3,280.51 | 2,740 | 0.43 % | 4,318.23 | 775 | 0.41 % | 4,124.67
tudi | 4,347 | 0,30 % | 3,040.08 | 1,340 | 0.39 % | 3,925.39 | 367 | 0.14 % | 1,379.09 | 2,187 | 0.34 % | 3,446.71 | 453 | 0.24 % | 2,410.93
tako | 4,210 | 0,29 % | 2,944.27 | 1,061 | 0.31 % | 3,108.09 | 502 | 0.19 % | 1,886.39 | 1,933 | 0.30 % | 3,046.40 | 714 | 0.38 % | 3,800.01
ampa | 3,499 | 0,24 % | 2,447.03 | 801 | 0.23 % | 2,346.45 | 783 | 0.29 % | 2,942.31 | 1,252 | 0.20 % | 1,973.15 | 663 | 0.35 % | 3,528.59
mpak | 3,475 | 0,24 % | 2,430.24 | 799 | 0.23 % | 2,340.59 | 790 | 0.30 % | 2,968.62 | 1,222 | 0.19 % | 1,925.87 | 664 | 0.35 % | 3,533.91
pred | 3,227 | 0,23 % | 2,256.81 | 791 | 0.23 % | 2,317.15 | 395 | 0.15 % | 1,484.31 | 1,684 | 0.27 % | 2,653.98 | 357 | 0.19 % | 1,900.01
lahk | 3,192 | 0,22 % | 2,232.33 | 734 | 0.21 % | 2,150.18 | 463 | 0.17 % | 1,739.84 | 1,360 | 0.21 % | 2,143.36 | 635 | 0.34 % | 3,379.57
zdej | 3,172 | 0,22 % | 2,218.34 | 571 | 0.17 % | 1,672.69 | 858 | 0.32 % | 3,224.15 | 1,168 | 0.18 % | 1,840.76 | 575 | 0.31 % | 3,060.24
tist | 3,163 | 0,22 % | 2,212.05 | 484 | 0.14 % | 1,417.83 | 1,157 | 0.43 % | 4,347.71 | 1,048 | 0.17 % | 1,651.64 | 474 | 0.25 % | 2,522.70
dobr | 3,092 | 0,22 % | 2,162.39 | 1,171 | 0.34 % | 3,430.33 | 529 | 0.20 % | 1,987.85 | 943 | 0.15 % | 1,486.17 | 449 | 0.24 % | 2,389.65
misl | 3,048 | 0,21 % | 2,131.62 | 517 | 0.15 % | 1,514.50 | 997 | 0.38 % | 3,746.47 | 891 | 0.14 % | 1,404.21 | 643 | 0.34 % | 3,422.14
drug | 2,981 | 0,21 % | 2,084.77 | 589 | 0.17 % | 1,725.42 | 574 | 0.22 % | 2,156.95 | 1,390 | 0.22 % | 2,190.64 | 428 | 0.23 % | 2,277.88
pote | 2,968 | 0,21 % | 2,075.67 | 563 | 0.17 % | 1,649.25 | 268 | 0.10 % | 1,007.08 | 1,435 | 0.23 % | 2,261.56 | 702 | 0.37 % | 3,736.15
ahko | 2,945 | 0,21 % | 2,059.59 | 701 | 0.20 % | 2,053.51 | 336 | 0.13 % | 1,262.60 | 1,295 | 0.20 % | 2,040.92 | 613 | 0.33 % | 3,262.48
eset | 2,892 | 0,20 % | 2,022.52 | 842 | 0.25 % | 2,466.55 | 526 | 0.20 % | 1,976.57 | 971 | 0.15 % | 1,530.29 | 553 | 0.29 % | 2,943.15
prej | 2,842 | 0,20 % | 1,987.56 | 564 | 0.17 % | 1,652.18 | 517 | 0.19 % | 1,942.75 | 1,406 | 0.22 % | 2,215.85 | 355 | 0.19 % | 1,889.36
otem | 2,581 | 0,18 % | 1,805.03 | 486 | 0.14 % | 1,423.69 | 230 | 0.09 % | 864.28 | 1,192 | 0.19 % | 1,878.59 | 673 | 0.36 % | 3,581.81
nost | 2,566 | 0,18 % | 1,794.53 | 470 | 0.14 % | 1,376.82 | 162 | 0.06 % | 608.75 | 1,701 | 0.27 % | 2,680.77 | 233 | 0.12 % | 1,240.06
krat | 2,503 | 0,17 % | 1,750.48 | 626 | 0.18 % | 1,833.80 | 483 | 0.18 % | 1,814.99 | 1,018 | 0.16 % | 1,604.36 | 376 | 0.20 % | 2,001.13
veda | 2,493 | 0,17 % | 1,743.48 | 742 | 0.22 % | 2,173.61 | 266 | 0.10 % | 999.56 | 1,247 | 0.20 % | 1,965.27 | 238 | 0.13 % | 1,266.67
anje | 2,473 | 0,17 % | 1,729.50 | 447 | 0.13 % | 1,309.44 | 295 | 0.11 % | 1,108.53 | 1,363 | 0.21 % | 2,148.08 | 368 | 0.20 % | 1,958.55
pove | 2,463 | 0,17 % | 1,722.50 | 605 | 0.18 % | 1,772.29 | 451 | 0.17 % | 1,694.74 | 1,095 | 0.17 % | 1,725.72 | 312 | 0.17 % | 1,660.51
kako | 2,352 | 0,16 % | 1,644.87 | 570 | 0.17 % | 1,669.76 | 269 | 0.10 % | 1,010.83 | 1,230 | 0.19 % | 1,938.48 | 283 | 0.15 % | 1,506.17
gled | 2,310 | 0,16 % | 1,615.50 | 566 | 0.17 % | 1,658.04 | 462 | 0.17 % | 1,736.08 | 873 | 0.14 % | 1,375.85 | 409 | 0.22 % | 2,176.76
tega | 2,310 | 0,16 % | 1,615.50 | 407 | 0.12 % | 1,192.27 | 410 | 0.15 % | 1,540.68 | 1,066 | 0.17 % | 1,680.01 | 427 | 0.23 % | 2,272.56
samo | 2,301 | 0,16 % | 1,609.21 | 466 | 0.14 % | 1,365.10 | 547 | 0.21 % | 2,055.49 | 889 | 0.14 % | 1,401.06 | 399 | 0.21 % | 2,123.54
dese | 2,255 | 0,16 % | 1,577.04 | 791 | 0.23 % | 2,317.15 | 364 | 0.14 % | 1,367.82 | 743 | 0.12 % | 1,170.97 | 357 | 0.19 % | 1,900.01
ravi | 2,245 | 0,16 % | 1,570.04 | 453 | 0.13 % | 1,327.02 | 246 | 0.09 % | 924.41 | 1,224 | 0.19 % | 1,929.02 | 322 | 0.17 % | 1,713.73
slov | 2,233 | 0,16 % | 1,561.65 | 547 | 0.16 % | 1,602.38 | 222 | 0.08 % | 834.22 | 1,260 | 0.20 % | 1,985.76 | 204 | 0.11 % | 1,085.72
neka | 2,165 | 0,15 % | 1,514.10 | 486 | 0.14 % | 1,423.69 | 426 | 0.16 % | 1,600.80 | 952 | 0.15 % | 1,500.35 | 301 | 0.16 % | 1,601.97
obro | 2,150 | 0,15 % | 1,503.60 | 857 | 0.25 % | 2,510.49
239 | 0.09 % | 898.10 | 740 | 0.12 % | 1,166.24 | 314 | 0.17 % | 1,671.16
govo | 2,145 | 0,15 % | 1,500.11 | 428 | 0.12 % | 1,253.78 | 284 | 0.11 % | 1,067.20 | 1,111 | 0.17 % | 1,750.93 | 322 | 0.17 % | 1,713.73

2.1.15 List of character-level 5-grams in lower-case word forms in the GOS 1.0 corpus
File at CLARIN.SI: GOS1.0-characters-lowercase_forms-5grams-taxonomy-entire.tsv
Character string | Total absolute frequency of character string | Percentage of total sum of all found character strings | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N]
ampak | 3,445 | 0,36 % | 3,557.69 | 797 | 0.35 % | 3,472.62 | 778 | 0.48 % | 4,801.97 | 1,208 | 0.27 % | 2,679.04 | 662 | 0.53 % | 5,258.56
lahko | 2,945 | 0,30 % | 3,041.33 | 701 | 0.30 % | 3,054.33 | 336 | 0.21 % | 2,073.86 | 1,295 | 0.29 % | 2,871.98 | 613 | 0.49 % | 4,869.33
potem | 2,566 | 0,27 % | 2,649.94 | 484 | 0.21 % | 2,108.84 | 221 | 0.14 % | 1,364.05 | 1,192 | 0.26 % | 2,643.55 | 669 | 0.53 % | 5,314.16
deset | 2,248 | 0,23 % | 2,321.53 | 787 | 0.34 % | 3,429.04 | 363 | 0.22 % | 2,240.51 | 743 | 0.17 % | 1,647.79 | 355 | 0.28 % | 2,819.92
dobro | 2,121 | 0,22 % | 2,190.38 | 853 | 0.37 % | 3,716.61 | 233 | 0.14 % | 1,438.12 | 733 | 0.16 % | 1,625.61 | 302 | 0.24 % | 2,398.92
govor | 2,069 | 0,21 % | 2,136.68 | 412 | 0.18 % | 1,795.13 | 279 | 0.17 % | 1,722.04 | 1,068 | 0.24 % | 2,368.55 | 310 | 0.25 % | 2,462.47
pravi | 2,066 | 0,21 % | 2,133.58 | 422 | 0.18 % | 1,838.70 | 210 | 0.13 % | 1,296.16 | 1,130 | 0.25 % | 2,506.05 | 304 | 0.24 % | 2,414.81
seved | 1,731 | 0,18 % | 1,787.62 | 528 | 0.23 % | 2,300.55 | 170 | 0.10 % | 1,049.27 | 873 | 0.19 % | 1,936.09 | 160 | 0.13 % | 1,270.95
slove | 1,717 | 0,18 % | 1,773.17 | 440 | 0.19 % | 1,917.13 | 159 | 0.10 % | 981.38 | 1,019 | 0.23 % | 2,259.88 | 99 | 0.08 % | 786.40
loven | 1,695 | 0,17 % | 1,750.45 | 429 | 0.19 % | 1,869.20 | 158 | 0.10 % | 975.21 | 1,014 | 0.23 % | 2,248.80 | 94 | 0.07 % | 746.68
torej | 1,695 | 0,17 % | 1,750.45 | 570 | 0.25 % | 2,483.55 | 45 | 0.03 % | 277.75 | 945 | 0.21 % | 2,095.77 | 135 | 0.11 % | 1,072.36
recim | 1,670 | 0,17 % | 1,724.63 | 347 | 0.15 % | 1,511.92 | 246 | 0.15 % | 1,518.36 | 710 | 0.16 % | 1,574.60 | 367 | 0.29 % | 2,915.24
ecimo | 1,665 | 0,17 % | 1,719.46 | 346 | 0.15 % | 1,507.56 | 245 | 0.15 % | 1,512.19 | 708 | 0.16 % | 1,570.17 | 366 | 0.29 % | 2,907.30
eveda | 1,598 | 0,17 % | 1,650.27 | 500 | 0.22 % | 2,178.55 | 127 | 0.08 % | 783.87 | 819 | 0.18 % | 1,816.34 | 152 | 0.12 % | 1,207.40
gleda | 1,485 | 0,15 % | 1,533.58 | 378 | 0.17 % | 1,646.99 | 320 | 0.20 % | 1,975.10 | 546 | 0.12 % | 1,210.89 | 241 | 0.19 % | 1,914.37
misli | 1,460 | 0,15 % | 1,507.76 | 286 | 0.12 % | 1,246.13 | 394 | 0.24 % | 2,431.84 | 519 | 0.12 % | 1,151.01 | 261 | 0.21 % | 2,073.24
velik | 1,251 | 0,13 % | 1,291.92 | 323 | 0.14 % | 1,407.35 | 130 | 0.08 % | 802.38 | 701 | 0.15 % | 1,554.64 | 97 | 0.08 % | 770.51
poved | 1,239 | 0,13 % | 1,279.53 | 296 | 0.13 % | 1,289.70 | 230 | 0.14 % | 1,419.60 | 613 | 0.14 % | 1,359.48 | 100 | 0.08 % | 794.34
najst | 1,231 | 0,13 % | 1,271.27 | 396 | 0.17 % | 1,725.42 | 265 | 0.16 % | 1,635.63 | 364 | 0.08 % | 807.26 | 206 | 0.16 % | 1,636.35
stran | 1,209 | 0,12 % | 1,248.55 | 211 | 0.09 % | 919.35 | 171 | 0.11 % | 1,055.44 | 652 | 0.14 % | 1,445.97 | 175 | 0.14 % | 1,390.10
kater | 1,194 | 0,12 % | 1,233.06 | 221 | 0.10 % | 962.92 | 40 | 0.03 % | 246.89 | 846 | 0.19 % | 1,876.21 | 87 | 0.07 % | 691.08
mogoč | 1,187 | 0,12 % | 1,225.83 | 271 | 0.12 % | 1,180.78 | 203 | 0.12 % | 1,252.95 | 387 | 0.09 % | 858.27 | 326 | 0.26 % | 2,589.56
stvar | 1,169 | 0,12 % | 1,207.24 | 177 | 0.08 % | 771.21 | 194 | 0.12 % | 1,197.41 | 545 | 0.12 % | 1,208.67 | 253 | 0.20 % | 2,009.69
prime | 1,168 | 0,12 % | 1,206.21 | 155 | 0.07 % | 675.35 | 111 | 0.07 % | 685.11 | 654 | 0.14 % | 1,450.41 | 248 | 0.20 % | 1,969.97
ovori | 1,155 | 0,12 % | 1,192.78 | 242 | 0.10 % | 1,054.42 | 175 | 0.11 % | 1,080.13 | 608 | 0.14 % | 1,348.39 | 130 | 0.10 % | 1,032.65
bistv | 1,141 | 0,12 % | 1,178.32 | 181 | 0.08 % | 788.64 | 242 | 0.15 % | 1,493.67 | 341 | 0.08 % | 756.25 | 377 | 0.30 % | 2,994.68
rimer | 1,121 | 0,12 % | 1,157.67 | 142 | 0.06 % | 618.71 | 100 | 0.06 % | 617.22 | 640 | 0.14 % | 1,419.36 | 239 | 0.19 % | 1,898.48
gospo | 1,112 | 0,12 % | 1,148.37 | 105 | 0.05 % | 457.50 | 15 | 0.01 % | 92.58 | 961 | 0.21 % | 2,131.26 | 31 | 0.03 % | 246.25
ospod | 1,087 | 0,11 % | 1,122.56 | 99 | 0.04 % | 431.35 | 13 | 0.01 % | 80.24 | 950 | 0.21 % | 2,106.86 | 25 | 0.02 %
198.59 pogle 1,060 0,11 % 1,094.67 238 0.10 % 1,036.99 173 0.11 % 1,067.79 469 0.10 % 1,040.12 180 0.14 % 1,429.82 ogoče 1,057 0,11 % 1,091.58 241 0.10 % 1,050.06 181 0.11 % 1,117.17 322 0.07 % 714.11 313 0.25 % 2,486.30 vpraš 1,035 0,11 % 1,068.86 185 0.08 % 806.07 178 0.11 % 1,098.65 531 0.12 % 1,177.62 141 0.11 % 1,120.03 CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 343A total of 390 frequency lists of word parts were extracted from GOS 1.0: 30 for each part- of-speech according to the MTE-6 annotation scheme used to automatically annotate the corpus (nouns, verbs, adjectives, adverbs, pronouns, numerals, conjunctions, prepositions, interjections, particles, abbreviations, and residual words), with an additional 30 lists that contain word parts extracted from all word forms or lemmas regardless of their part-of-speech. Each group of 30 frequency lists is divided into three parts: 10 lists extracted from lemmas, 10 lists extracted from lower-case word forms, and 10 lists extracted from standardized word forms. Each of these three parts is further divided into 5 lists of initial word parts and 5 lists of final word parts (of length 1 to 5). Each line in the list contains the unit (either the word form, the lemma, or the standardized word form, e.g. “Slovenija”; with lemmas, the lower-case lemma is also listed, e.g. “slovenija”), its initial or final word part (e.g. “slo”), and the rest of the word (e.g. “venija”). The numerical data included is comprised of the absolute frequency (fa) of the split word in the corpus, e.g. the total number of its occurrences in the corpus, followed by its percentage (p) according to the total frequency (N) of all units (either lower-case forms, lemmas, or standardized word forms) in the corpus: The list also contains the unit’s total relative frequency (fr), which indicates how frequently per 1,000,000 units the split unit occurs in the corpus. 
It is calculated with the following formula, where fa is the total absolute frequency of the split unit in the corpus, and N is the total frequency of all units in the corpus:The lists containing only the units with a specific part-of-speech also feature numerical data for individual text-type subcorpora (e.g. public discourse for information or entertainment, non- public private discourse). In this case, the absolute frequencies (faT) represent the sum of all occurrences of the split unit in the texts pertaining to a specific taxonomy branch. The relative frequencies (frT) and percentages (pT) are calculated using the following formulas, where faT is the absolute frequency of the split unit in the taxonomy branch, and NT is the absolute frequency of all units in the taxonomy branch: The tables are sorted in the following manner: • 2.2.1–2.2.5 → All parts of speech / lemmas / initial word parts • 2.2.6–2.2.10 → All parts of speech lemmas / final word parts • 2.2.11–2.2.15 → All parts of speech / standardized word forms / initial word parts • 2.2.16–2.2.20 → All parts of speech / standardized word forms / final word parts • 2.2.21–2.2.25 → All parts of speech / lower-case word forms / initial word parts • 2.2.26–2.2.30 → All parts of speech / lower-case word forms / final word parts • 2.2.31–2.2.35 → Nouns / lemmas / initial word parts • 2.2.36–2.2.40 → Nouns / lemmas / final word parts • 2.2.41–2.2.45 → Nouns / standardized word forms / initial word parts • 2.2.46–2.2.50 → Nouns / standardized word forms / final word parts • 2.2.51–2.2.55 → Nouns / lower-case word forms / initial word parts • 2.2.56–2.2.60 → Nouns / lower-case word forms / final word parts • ...2.2. 
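The splitting of a unit into a word part and the rest of the word, together with the percentage and relative-frequency calculations described in this section, can be illustrated with a minimal Python sketch. This is not the LIST implementation (LIST is a Java program); the function names are hypothetical, and returning `None` for units shorter than the requested length is an assumption based on the printed tables, where such units do not appear.

```python
from typing import Optional, Tuple


def split_initial(unit: str, n: int) -> Optional[Tuple[str, str]]:
    """Return (initial part of length n, rest of the word).

    Assumption: units shorter than n are skipped (None).
    """
    if n < 1 or len(unit) < n:
        return None
    return unit[:n], unit[n:]


def split_final(unit: str, n: int) -> Optional[Tuple[str, str]]:
    """Return (rest of the word, final part of length n).

    Assumption: units shorter than n are skipped (None).
    """
    if n < 1 or len(unit) < n:
        return None
    return unit[:-n], unit[-n:]


def percentage(fa: int, total: int) -> float:
    """Percentage: absolute frequency divided by the total, times 100."""
    return fa / total * 100


def relative_frequency(fa: int, total: int) -> float:
    """Relative frequency: occurrences per 1,000,000 units."""
    return fa / total * 1_000_000


if __name__ == "__main__":
    # The example from the text: initial 3-gram of "slovenija".
    print(split_initial("slovenija", 3))
    # A final 2-gram split, as in the lists of final word parts.
    print(split_final("imeti", 2))
    # Hypothetical counts, chosen only to show the arithmetic.
    print(percentage(1, 4), relative_frequency(5, 1_000))
```

The same two frequency functions apply to the taxonomy-branch values: pass the branch-level absolute frequency and the branch-level total instead of the corpus-level ones.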
Frequency lists of word parts from the GOS 1.0 corpus

2.2.1 List of initial character-level 1-grams from all lemmas in the GOS 1.0 corpus
File at CLARIN.SI: GOS1.0-word_parts-all-lemmas-initial-1grams-entire.tsv

| Lemma | Lemma (lower-case) | Initial part of the word | Rest of the word | Total absolute frequency of lemma | Percentage of all found lemmas | Total relative frequency (per million occurrences) |
|---|---|---|---|---|---|---|
| biti | biti | b | iti | 93,407 | 9.02 % | 90,238.98 |
| ne | ne | n | e | 31,861 | 3.08 % | 30,780.39 |
| pa | pa | p | a | 29,385 | 2.84 % | 28,388.37 |
| ta | ta | t | a | 29,315 | 2.83 % | 28,320.74 |
| ja | ja | j | a | 25,571 | 2.47 % | 24,703.73 |
| eee | eee | e | ee | 23,222 | 2.24 % | 22,434.40 |
| da | da | d | a | 19,696 | 1.90 % | 19,027.98 |
| se | se | s | e | 18,168 | 1.75 % | 17,551.81 |
| v | v | v |  | 17,916 | 1.73 % | 17,308.36 |
| on | on | o | n | 17,525 | 1.69 % | 16,930.62 |
| in | in | i | n | 16,244 | 1.57 % | 15,693.06 |
| jaz | jaz | j | az | 15,234 | 1.47 % | 14,717.32 |
| na | na | n | a | 12,064 | 1.17 % | 11,654.83 |
| imeti | imeti | i | meti | 10,131 | 0.98 % | 9,787.39 |
| tako | tako | t | ako | 10,083 | 0.97 % | 9,741.02 |
| kaj | kaj | k | aj | 9,753 | 0.94 % | 9,422.21 |
| ti | ti | t | i | 9,521 | 0.92 % | 9,198.08 |
| z | z | z |  | 8,149 | 0.79 % | 7,872.62 |
| za | za | z | a | 7,980 | 0.77 % | 7,709.35 |
| tudi | tudi | t | udi | 7,947 | 0.77 % | 7,677.47 |
| vedeti | vedeti | v | edeti | 7,560 | 0.73 % | 7,303.59 |
| še | še | š | e | 7,193 | 0.69 % | 6,949.04 |
| a | a | a |  | 6,882 | 0.67 % | 6,648.59 |
| zdaj | zdaj | z | daj | 6,366 | 0.61 % | 6,150.09 |
| če | če | č | e | 6,157 | 0.59 % | 5,948.18 |
| ki | ki | k | i | 5,540 | 0.54 % | 5,352.10 |
| en | en | e | n | 5,311 | 0.51 % | 5,130.87 |
| iti | iti | i | ti | 5,311 | 0.51 % | 5,130.87 |
| ali | ali | a | li | 4,798 | 0.46 % | 4,635.27 |
| ker | ker | k | er | 4,785 | 0.46 % | 4,622.71 |
| no | no | n | o | 4,720 | 0.46 % | 4,559.92 |
| reči | reči | r | eči | 4,690 | 0.45 % | 4,530.93 |

2.2.2 List of initial character-level 2-grams from all lemmas in the GOS 1.0 corpus
File at CLARIN.SI: GOS1.0-word_parts-all-lemmas-initial-2grams-entire.tsv

| Lemma | Lemma (lower-case) | Initial part of the word | Rest of the word | Total absolute frequency of lemma | Percentage of all found lemmas | Total relative frequency (per million occurrences) |
|---|---|---|---|---|---|---|
| biti | biti | bi | ti | 93,407 | 9.41 % | 90,238.98 |
| ne | ne | ne |  | 31,861 | 3.21 % | 30,780.39 |
| pa | pa | pa |  | 29,385 | 2.96 % | 28,388.37 |
| ta | ta | ta |  | 29,315 | 2.95 % | 28,320.74 |
| ja | ja | ja |  | 25,571 | 2.58 % | 24,703.73 |
| eee | eee | ee | e | 23,222 | 2.34 % | 22,434.40 |
| da | da | da |  | 19,696 | 1.98 % | 19,027.98 |
| se | se | se |  | 18,168 | 1.83 % | 17,551.81 |
| on | on | on |  | 17,525 | 1.76 % | 16,930.62 |
| in | in | in |  | 16,244 | 1.64 % | 15,693.06 |
| jaz | jaz | ja | z | 15,234 | 1.53 % | 14,717.32 |
| na | na | na |  | 12,064 | 1.22 % | 11,654.83 |
| imeti | imeti | im | eti | 10,131 | 1.02 % | 9,787.39 |
| tako | tako | ta | ko | 10,083 | 1.02 % | 9,741.02 |
| kaj | kaj | ka | j | 9,753 | 0.98 % | 9,422.21 |
| ti | ti | ti |  | 9,521 | 0.96 % | 9,198.08 |
| za | za | za |  | 7,980 | 0.80 % | 7,709.35 |
| tudi | tudi | tu | di | 7,947 | 0.80 % | 7,677.47 |
| vedeti | vedeti | ve | deti | 7,560 | 0.76 % | 7,303.59 |
| še | še | še |  | 7,193 | 0.72 % | 6,949.04 |
| zdaj | zdaj | zd | aj | 6,366 | 0.64 % | 6,150.09 |
| če | če | če |  | 6,157 | 0.62 % | 5,948.18 |
| ki | ki | ki |  | 5,540 | 0.56 % | 5,352.10 |
| en | en | en |  | 5,311 | 0.54 % | 5,130.87 |
| iti | iti | it | i | 5,311 | 0.54 % | 5,130.87 |
| ali | ali | al | i | 4,798 | 0.48 % | 4,635.27 |
| ker | ker | ke | r | 4,785 | 0.48 % | 4,622.71 |
| no | no | no |  | 4,720 | 0.47 % | 4,559.92 |
| reči | reči | re | či | 4,690 | 0.47 % | 4,530.93 |
| ves | ves | ve | s | 4,617 | 0.47 % | 4,460.41 |
| mhm | mhm | mh | m | 4,477 | 0.45 % | 4,325.16 |
| že | že | že |  | 4,457 | 0.45 % | 4,305.84 |

2.2.3 List of initial character-level 3-grams from all lemmas in the GOS 1.0 corpus
File at CLARIN.SI: GOS1.0-word_parts-all-lemmas-initial-3grams-entire.tsv

| Lemma | Lemma (lower-case) | Initial part of the word | Rest of the word | Total absolute frequency of lemma | Percentage of all found lemmas | Total relative frequency (per million occurrences) |
|---|---|---|---|---|---|---|
| biti | biti | bit | i | 93,407 | 12.98 % | 90,238.98 |
| eee | eee | eee |  | 23,222 | 3.23 % | 22,434.40 |
| jaz | jaz | jaz |  | 15,234 | 2.12 % | 14,717.32 |
| imeti | imeti | ime | ti | 10,131 | 1.41 % | 9,787.39 |
| tako | tako | tak | o | 10,083 | 1.40 % | 9,741.02 |
| kaj | kaj | kaj |  | 9,753 | 1.36 % | 9,422.21 |
| tudi | tudi | tud | i | 7,947 | 1.10 % | 7,677.47 |
| vedeti | vedeti | ved | eti | 7,560 | 1.05 % | 7,303.59 |
| zdaj | zdaj | zda | j | 6,366 | 0.89 % | 6,150.09 |
| iti | iti | iti |  | 5,311 | 0.74 % | 5,130.87 |
| ali | ali | ali |  | 4,798 | 0.67 % | 4,635.27 |
| ker | ker | ker |  | 4,785 | 0.67 % | 4,622.71 |
| reči | reči | reč | i | 4,690 | 0.65 % | 4,530.93 |
| ves | ves | ves |  | 4,617 | 0.64 % | 4,460.41 |
| mhm | mhm | mhm |  | 4,477 | 0.62 % | 4,325.16 |
| pol | pol | pol |  | 4,197 | 0.58 % | 4,054.65 |
| lahko | lahko | lah | ko | 4,140 | 0.58 % | 3,999.59 |
| sam | sam | sam |  | 3,974 | 0.55 % | 3,839.22 |
| dati | dati | dat | i | 3,945 | 0.55 % | 3,811.20 |
| saj | saj | saj |  | 3,943 | 0.55 % | 3,809.27 |
| kar | kar | kar |  | 3,881 | 0.54 % | 3,749.37 |
| ampak | ampak | amp | ak | 3,541 | 0.49 % | 3,420.90 |
| tisti | tisti | tis | ti | 3,384 | 0.47 % | 3,269.23 |
| kot | kot | kot |  | 3,371 | 0.47 % | 3,256.67 |
| potem | potem | pot | em | 3,237 | 0.45 % | 3,127.21 |
| dober | dober | dob | er | 3,021 | 0.42 % | 2,918.54 |
| eem | eem | eem |  | 2,950 | 0.41 % | 2,849.95 |
| pač | pač | pač |  | 2,851 | 0.40 % | 2,754.30 |
| misliti | misliti | mis | liti | 2,710 | 0.38 % | 2,618.09 |
| tam | tam | tam |  | 2,598 | 0.36 % | 2,509.89 |
| nekaj | nekaj | nek | aj | 2,562 | 0.36 % | 2,475.11 |
| priti | priti | pri | ti | 2,530 | 0.35 % | 2,444.19 |

2.2.4 List of initial character-level 4-grams from all lemmas in the GOS 1.0 corpus
File at CLARIN.SI: GOS1.0-word_parts-all-lemmas-initial-4grams-entire.tsv

| Lemma | Lemma (lower-case) | Initial part of the word | Rest of the word | Total absolute frequency of lemma | Percentage of all found lemmas | Total relative frequency (per million occurrences) |
|---|---|---|---|---|---|---|
| biti | biti | biti |  | 93,407 | 16.70 % | 90,238.98 |
| imeti | imeti | imet | i | 10,131 | 1.81 % | 9,787.39 |
| tako | tako | tako |  | 10,083 | 1.80 % | 9,741.02 |
| tudi | tudi | tudi |  | 7,947 | 1.42 % | 7,677.47 |
| vedeti | vedeti | vede | ti | 7,560 | 1.35 % | 7,303.59 |
| zdaj | zdaj | zdaj |  | 6,366 | 1.14 % | 6,150.09 |
| reči | reči | reči |  | 4,690 | 0.84 % | 4,530.93 |
| lahko | lahko | lahk | o | 4,140 | 0.74 % | 3,999.59 |
| dati | dati | dati |  | 3,945 | 0.71 % | 3,811.20 |
| ampak | ampak | ampa | k | 3,541 | 0.63 % | 3,420.90 |
| tisti | tisti | tist | i | 3,384 | 0.60 % | 3,269.23 |
| potem | potem | pote | m | 3,237 | 0.58 % | 3,127.21 |
| dober | dober | dobe | r | 3,021 | 0.54 % | 2,918.54 |
| misliti | misliti | misl | iti | 2,710 | 0.48 % | 2,618.09 |
| nekaj | nekaj | neka | j | 2,562 | 0.46 % | 2,475.11 |
| priti | priti | prit | i | 2,530 | 0.45 % | 2,444.19 |
| drug | drug | drug |  | 2,368 | 0.42 % | 2,287.69 |
| videti | videti | vide | ti | 2,309 | 0.41 % | 2,230.69 |
| kakšen | kakšen | kakš | en | 2,260 | 0.40 % | 2,183.35 |
| morati | morati | mora | ti | 2,233 | 0.40 % | 2,157.26 |
| moči | moči | moči |  | 2,067 | 0.37 % | 1,996.90 |
| zelo | zelo | zelo |  | 1,835 | 0.33 % | 1,772.76 |
| leto | leto | leto |  | 1,828 | 0.33 % | 1,766 |
| seveda | seveda | seve | da | 1,795 | 0.32 % | 1,734.12 |
| gledati | gledati | gled | ati | 1,777 | 0.32 % | 1,716.73 |
| mali | mali | mali |  | 1,760 | 0.32 % | 1,700.31 |
| torej | torej | tore | j | 1,697 | 0.30 % | 1,639.44 |
| povedati | povedati | pove | dati | 1,686 | 0.30 % | 1,628.82 |
| bolj | bolj | bolj |  | 1,654 | 0.30 % | 1,597.90 |
| narediti | narediti | nare | diti | 1,614 | 0.29 % | 1,559.26 |
| delati | delati | dela | ti | 1,515 | 0.27 % | 1,463.62 |
| danes | danes | dane | s | 1,494 | 0.27 % | 1,443.33 |

2.2.5 List of initial character-level 5-grams from all lemmas in the GOS 1.0 corpus
File at CLARIN.SI: GOS1.0-word_parts-all-lemmas-initial-5grams-entire.tsv

| Lemma | Lemma (lower-case) | Initial part of the word | Rest of the word | Total absolute frequency of lemma | Percentage of all found lemmas | Total relative frequency (per million occurrences) |
|---|---|---|---|---|---|---|
| imeti | imeti | imeti |  | 10,131 | 2.77 % | 9,787.39 |
| vedeti | vedeti | vedet | i | 7,560 | 2.07 % | 7,303.59 |
| lahko | lahko | lahko |  | 4,140 | 1.13 % | 3,999.59 |
| ampak | ampak | ampak |  | 3,541 | 0.97 % | 3,420.90 |
| tisti | tisti | tisti |  | 3,384 | 0.93 % | 3,269.23 |
| potem | potem | potem |  | 3,237 | 0.89 % | 3,127.21 |
| dober | dober | dober |  | 3,021 | 0.83 % | 2,918.54 |
| misliti | misliti | misli | ti | 2,710 | 0.74 % | 2,618.09 |
| nekaj | nekaj | nekaj |  | 2,562 | 0.70 % | 2,475.11 |
| priti | priti | priti |  | 2,530 | 0.69 % | 2,444.19 |
| videti | videti | videt | i | 2,309 | 0.63 % | 2,230.69 |
| kakšen | kakšen | kakše | n | 2,260 | 0.62 % | 2,183.35 |
| morati | morati | morat | i | 2,233 | 0.61 % | 2,157.26 |
| seveda | seveda | seved | a | 1,795 | 0.49 % | 1,734.12 |
| gledati | gledati | gleda | ti | 1,777 | 0.49 % | 1,716.73 |
| torej | torej | torej |  | 1,697 | 0.46 % | 1,639.44 |
| povedati | povedati | poved | ati | 1,686 | 0.46 % | 1,628.82 |
| narediti | narediti | nared | iti | 1,614 | 0.44 % | 1,559.26 |
| delati | delati | delat | i | 1,515 | 0.41 % | 1,463.62 |
| danes | danes | danes |  | 1,494 | 0.41 % | 1,443.33 |
| kateri | kateri | kater | i | 1,448 | 0.40 % | 1,398.89 |
| dobro | dobro | dobro |  | 1,390 | 0.38 % | 1,342.86 |
| praviti | praviti | pravi | ti | 1,344 | 0.37 % | 1,298.42 |
| veliko | veliko | velik | o | 1,316 | 0.36 % | 1,271.37 |
| dobiti | dobiti | dobit | i | 1,264 | 0.35 % | 1,221.13 |
| človek | človek | člove | k | 1,248 | 0.34 % | 1,205.67 |
| velik | velik | velik |  | 1,213 | 0.33 % | 1,171.86 |
| trije | trije | trije |  | 1,150 | 0.32 % | 1,111 |
| tukaj | tukaj | tukaj |  | 1,136 | 0.31 % | 1,097.47 |
| misel | misel | misel |  | 1,086 | 0.30 % | 1,049.17 |
| mogoče | mogoče | mogoč | e | 1,056 | 0.29 % | 1,020.18 |
| stvar | stvar | stvar |  | 1,056 | 0.29 % | 1,020.18 |

2.2.6 List of final character-level 1-grams from all lemmas in the GOS 1.0 corpus
File at CLARIN.SI: GOS1.0-word_parts-all-lemmas-final-1grams-entire.tsv

| Lemma | Lemma (lower-case) | Rest of the word | Final part of the word | Total absolute frequency of lemma | Percentage of all found lemmas | Total relative frequency (per million occurrences) |
|---|---|---|---|---|---|---|
| biti | biti | bit | i | 93,407 | 9.02 % | 90,238.98 |
| ne | ne | n | e | 31,861 | 3.08 % | 30,780.39 |
| pa | pa | p | a | 29,385 | 2.84 % | 28,388.37 |
| ta | ta | t | a | 29,315 | 2.83 % | 28,320.74 |
| ja | ja | j | a | 25,571 | 2.47 % | 24,703.73 |
| eee | eee | ee | e | 23,222 | 2.24 % | 22,434.40 |
| da | da | d | a | 19,696 | 1.90 % | 19,027.98 |
| se | se | s | e | 18,168 | 1.75 % | 17,551.81 |
| v | v |  | v | 17,916 | 1.73 % | 17,308.36 |
| on | on | o | n | 17,525 | 1.69 % | 16,930.62 |
| in | in | i | n | 16,244 | 1.57 % | 15,693.06 |
| jaz | jaz | ja | z | 15,234 | 1.47 % | 14,717.32 |
| na | na | n | a | 12,064 | 1.17 % | 11,654.83 |
| imeti | imeti | imet | i | 10,131 | 0.98 % | 9,787.39 |
| tako | tako | tak | o | 10,083 | 0.97 % | 9,741.02 |
| kaj | kaj | ka | j | 9,753 | 0.94 % | 9,422.21 |
| ti | ti | t | i | 9,521 | 0.92 % | 9,198.08 |
| z | z |  | z | 8,149 | 0.79 % | 7,872.62 |
| za | za | z | a | 7,980 | 0.77 % | 7,709.35 |
| tudi | tudi | tud | i | 7,947 | 0.77 % | 7,677.47 |
| vedeti | vedeti | vedet | i | 7,560 | 0.73 % | 7,303.59 |
| še | še | š | e | 7,193 | 0.69 % | 6,949.04 |
| a | a |  | a | 6,882 | 0.67 % | 6,648.59 |
| zdaj | zdaj | zda | j | 6,366 | 0.61 % | 6,150.09 |
| če | če | č | e | 6,157 | 0.59 % | 5,948.18 |
| ki | ki | k | i | 5,540 | 0.54 % | 5,352.10 |
| en | en | e | n | 5,311 | 0.51 % | 5,130.87 |
| iti | iti | it | i | 5,311 | 0.51 % | 5,130.87 |
| ali | ali | al | i | 4,798 | 0.46 % | 4,635.27 |
| ker | ker | ke | r | 4,785 | 0.46 % | 4,622.71 |
| no | no | n | o | 4,720 | 0.46 % | 4,559.92 |
| reči | reči | reč | i | 4,690 | 0.45 % | 4,530.93 |

2.2.7 List of final character-level 2-grams from all lemmas in the GOS 1.0 corpus
File at CLARIN.SI: GOS1.0-word_parts-all-lemmas-final-2grams-entire.tsv

| Lemma | Lemma (lower-case) | Rest of the word | Final part of the word | Total absolute frequency of lemma | Percentage of all found lemmas | Total relative frequency (per million occurrences) |
|---|---|---|---|---|---|---|
| biti | biti | bi | ti | 93,407 | 9.41 % | 90,238.98 |
| ne | ne |  | ne | 31,861 | 3.21 % | 30,780.39 |
| pa | pa |  | pa | 29,385 | 2.96 % | 28,388.37 |
| ta | ta |  | ta | 29,315 | 2.95 % | 28,320.74 |
| ja | ja |  | ja | 25,571 | 2.58 % | 24,703.73 |
| eee | eee | e | ee | 23,222 | 2.34 % | 22,434.40 |
| da | da |  | da | 19,696 | 1.98 % | 19,027.98 |
| se | se |  | se | 18,168 | 1.83 % | 17,551.81 |
| on | on |  | on | 17,525 | 1.76 % | 16,930.62 |
| in | in |  | in | 16,244 | 1.64 % | 15,693.06 |
| jaz | jaz | j | az | 15,234 | 1.53 % | 14,717.32 |
| na | na |  | na | 12,064 | 1.22 % | 11,654.83 |
| imeti | imeti | ime | ti | 10,131 | 1.02 % | 9,787.39 |
| tako | tako | ta | ko | 10,083 | 1.02 % | 9,741.02 |
| kaj | kaj | k | aj | 9,753 | 0.98 % | 9,422.21 |
| ti | ti |  | ti | 9,521 | 0.96 % | 9,198.08 |
| za | za |  | za | 7,980 | 0.80 % | 7,709.35 |
| tudi | tudi | tu | di | 7,947 | 0.80 % | 7,677.47 |
| vedeti | vedeti | vede | ti | 7,560 | 0.76 % | 7,303.59 |
| še | še |  | še | 7,193 | 0.72 % | 6,949.04 |
| zdaj | zdaj | zd | aj | 6,366 | 0.64 % | 6,150.09 |
| če | če |  | če | 6,157 | 0.62 % | 5,948.18 |
| ki | ki |  | ki | 5,540 | 0.56 % | 5,352.10 |
| en | en |  | en | 5,311 | 0.54 % | 5,130.87 |
| iti | iti | i | ti | 5,311 | 0.54 % | 5,130.87 |
| ali | ali | a | li | 4,798 | 0.48 % | 4,635.27 |
| ker | ker | k | er | 4,785 | 0.48 % | 4,622.71 |
| no | no |  | no | 4,720 | 0.47 % | 4,559.92 |
| reči | reči | re | či | 4,690 | 0.47 % | 4,530.93 |
| ves | ves | v | es | 4,617 | 0.47 % | 4,460.41 |
| mhm | mhm | m | hm | 4,477 | 0.45 % | 4,325.16 |
| že | že |  | že | 4,457 | 0.45 % | 4,305.84 |

2.2.8 List of final character-level 3-grams from all lemmas in the GOS 1.0 corpus
File at CLARIN.SI: GOS1.0-word_parts-all-lemmas-final-3grams-entire.tsv

| Lemma | Lemma (lower-case) | Rest of the word | Final part of the word | Total absolute frequency of lemma | Percentage of all found lemmas | Total relative frequency (per million occurrences) |
|---|---|---|---|---|---|---|
| biti | biti | b | iti | 93,407 | 12.98 % | 90,238.98 |
| eee | eee |  | eee | 23,222 | 3.23 % | 22,434.40 |
| jaz | jaz |  | jaz | 15,234 | 2.12 % | 14,717.32 |
| imeti | imeti | im | eti | 10,131 | 1.41 % | 9,787.39 |
| tako | tako | t | ako | 10,083 | 1.40 % | 9,741.02 |
| kaj | kaj |  | kaj | 9,753 | 1.36 % | 9,422.21 |
| tudi | tudi | t | udi | 7,947 | 1.10 % | 7,677.47 |
| vedeti | vedeti | ved | eti | 7,560 | 1.05 % | 7,303.59 |
| zdaj | zdaj | z | daj | 6,366 | 0.89 % | 6,150.09 |
| iti | iti |  | iti | 5,311 | 0.74 % | 5,130.87 |
| ali | ali |  | ali | 4,798 | 0.67 % | 4,635.27 |
| ker | ker |  | ker | 4,785 | 0.67 % | 4,622.71 |
| reči | reči | r | eči | 4,690 | 0.65 % | 4,530.93 |
| ves | ves |  | ves | 4,617 | 0.64 % | 4,460.41 |
| mhm | mhm |  | mhm | 4,477 | 0.62 % | 4,325.16 |
| pol | pol |  | pol | 4,197 | 0.58 % | 4,054.65 |
| lahko | lahko | la | hko | 4,140 | 0.58 % | 3,999.59 |
| sam | sam |  | sam | 3,974 | 0.55 % | 3,839.22 |
| dati | dati | d | ati | 3,945 | 0.55 % | 3,811.20 |
| saj | saj |  | saj | 3,943 | 0.55 % | 3,809.27 |
| kar | kar |  | kar | 3,881 | 0.54 % | 3,749.37 |
| ampak | ampak | am | pak | 3,541 | 0.49 % | 3,420.90 |
| tisti | tisti | ti | sti | 3,384 | 0.47 % | 3,269.23 |
| kot | kot |  | kot | 3,371 | 0.47 % | 3,256.67 |
| potem | potem | po | tem | 3,237 | 0.45 % | 3,127.21 |
| dober | dober | do | ber | 3,021 | 0.42 % | 2,918.54 |
| eem | eem |  | eem | 2,950 | 0.41 % | 2,849.95 |
| pač | pač |  | pač | 2,851 | 0.40 % | 2,754.30 |
| misliti | misliti | misl | iti | 2,710 | 0.38 % | 2,618.09 |
| tam | tam |  | tam | 2,598 | 0.36 % | 2,509.89 |
| nekaj | nekaj | ne | kaj | 2,562 | 0.36 % | 2,475.11 |
| priti | priti | pr | iti | 2,530 | 0.35 % | 2,444.19 |

2.2.9 List of final character-level 4-grams from all lemmas in the GOS 1.0 corpus
File at CLARIN.SI: GOS1.0-word_parts-all-lemmas-final-4grams-entire.tsv

| Lemma | Lemma (lower-case) | Rest of the word | Final part of the word | Total absolute frequency of lemma | Percentage of all found lemmas | Total relative frequency (per million occurrences) |
|---|---|---|---|---|---|---|
| biti | biti |  | biti | 93,407 | 16.70 % | 90,238.98 |
| imeti | imeti | i | meti | 10,131 | 1.81 % | 9,787.39 |
| tako | tako |  | tako | 10,083 | 1.80 % | 9,741.02 |
| tudi | tudi |  | tudi | 7,947 | 1.42 % | 7,677.47 |
| vedeti | vedeti | ve | deti | 7,560 | 1.35 % | 7,303.59 |
| zdaj | zdaj |  | zdaj | 6,366 | 1.14 % | 6,150.09 |
| reči | reči |  | reči | 4,690 | 0.84 % | 4,530.93 |
| lahko | lahko | l | ahko | 4,140 | 0.74 % | 3,999.59 |
| dati | dati |  | dati | 3,945 | 0.71 % | 3,811.20 |
| ampak | ampak | a | mpak | 3,541 | 0.63 % | 3,420.90 |
| tisti | tisti | t | isti | 3,384 | 0.60 % | 3,269.23 |
| potem | potem | p | otem | 3,237 | 0.58 % | 3,127.21 |
| dober | dober | d | ober | 3,021 | 0.54 % | 2,918.54 |
| misliti | misliti | mis | liti | 2,710 | 0.48 % | 2,618.09 |
| nekaj | nekaj | n | ekaj | 2,562 | 0.46 % | 2,475.11 |
| priti | priti | p | riti | 2,530 | 0.45 % | 2,444.19 |
| drug | drug |  | drug | 2,368 | 0.42 % | 2,287.69 |
| videti | videti | vi | deti | 2,309 | 0.41 % | 2,230.69 |
| kakšen | kakšen | ka | kšen | 2,260 | 0.40 % | 2,183.35 |
| morati | morati | mo | rati | 2,233 | 0.40 % | 2,157.26 |
| moči | moči |  | moči | 2,067 | 0.37 % | 1,996.90 |
| zelo | zelo |  | zelo | 1,835 | 0.33 % | 1,772.76 |
| leto | leto |  | leto | 1,828 | 0.33 % | 1,766 |
| seveda | seveda | se | veda | 1,795 | 0.32 % | 1,734.12 |
| gledati | gledati | gle | dati | 1,777 | 0.32 % | 1,716.73 |
| mali | mali |  | mali | 1,760 | 0.32 % | 1,700.31 |
| torej | torej | t | orej | 1,697 | 0.30 % | 1,639.44 |
| povedati | povedati | pove | dati | 1,686 | 0.30 % | 1,628.82 |
| bolj | bolj |  | bolj | 1,654 | 0.30 % | 1,597.90 |
| narediti | narediti | nare | diti | 1,614 | 0.29 % | 1,559.26 |
| delati | delati | de | lati | 1,515 | 0.27 % | 1,463.62 |
| danes | danes | d | anes | 1,494 | 0.27 % | 1,443.33 |

2.2.10 List of final character-level 5-grams from all lemmas in the GOS 1.0 corpus
File at CLARIN.SI: GOS1.0-word_parts-all-lemmas-final-5grams-entire.tsv

| Lemma | Lemma (lower-case) | Rest of the word | Final part of the word | Total absolute frequency of lemma | Percentage of all found lemmas | Total relative frequency (per million occurrences) |
|---|---|---|---|---|---|---|
| imeti | imeti |  | imeti | 10,131 | 2.77 % | 9,787.39 |
| vedeti | vedeti | v | edeti | 7,560 | 2.07 % | 7,303.59 |
| lahko | lahko |  | lahko | 4,140 | 1.13 % | 3,999.59 |
| ampak | ampak |  | ampak | 3,541 | 0.97 % | 3,420.90 |
| tisti | tisti |  | tisti | 3,384 | 0.93 % | 3,269.23 |
| potem | potem |  | potem | 3,237 | 0.89 % | 3,127.21 |
| dober | dober |  | dober | 3,021 | 0.83 % | 2,918.54 |
| misliti | misliti | mi | sliti | 2,710 | 0.74 % | 2,618.09 |
| nekaj | nekaj |  | nekaj | 2,562 | 0.70 % | 2,475.11 |
| priti | priti |  | priti | 2,530 | 0.69 % | 2,444.19 |
| videti | videti | v | ideti | 2,309 | 0.63 % | 2,230.69 |
| kakšen | kakšen | k | akšen | 2,260 | 0.62 % | 2,183.35 |
| morati | morati | m | orati | 2,233 | 0.61 % | 2,157.26 |
| seveda | seveda | s | eveda | 1,795 | 0.49 % | 1,734.12 |
| gledati | gledati | gl | edati | 1,777 | 0.49 % | 1,716.73 |
| torej | torej |  | torej | 1,697 | 0.46 % | 1,639.44 |
| povedati | povedati | pov | edati | 1,686 | 0.46 % | 1,628.82 |
| narediti | narediti | nar | editi | 1,614 | 0.44 % | 1,559.26 |
| delati | delati | d | elati | 1,515 | 0.41 % | 1,463.62 |
| danes | danes |  | danes | 1,494 | 0.41 % | 1,443.33 |
| kateri | kateri | k | ateri | 1,448 | 0.40 % | 1,398.89 |
| dobro | dobro |  | dobro | 1,390 | 0.38 % | 1,342.86 |
| praviti | praviti | pr | aviti | 1,344 | 0.37 % | 1,298.42 |
| veliko | veliko | v | eliko | 1,316 | 0.36 % | 1,271.37 |
| dobiti | dobiti | d | obiti | 1,264 | 0.35 % | 1,221.13 |
| človek | človek | č | lovek | 1,248 | 0.34 % | 1,205.67 |
| velik | velik |  | velik | 1,213 | 0.33 % | 1,171.86 |
| trije | trije |  | trije | 1,150 | 0.32 % | 1,111 |
| tukaj | tukaj |  | tukaj | 1,136 | 0.31 % | 1,097.47 |
| misel | misel |  | misel | 1,086 | 0.30 % | 1,049.17 |
| mogoče | mogoče | m | ogoče | 1,056 | 0.29 % | 1,020.18 |
| stvar | stvar |  | stvar | 1,056 | 0.29 % | 1,020.18 |

2.2.11 List of initial character-level 1-grams from all standardized forms in the GOS 1.0 corpus
File at CLARIN.SI: GOS1.0-word_parts-all-standardized_forms-initial-1grams-entire.tsv

| Standardized form | Initial part of the word | Rest of the word | Total absolute frequency of standardized form | Percentage of all found standardized forms | Total relative frequency (per million occurrences) |
|---|---|---|---|---|---|
| je | j | e | 37,466 | 3.62 % | 36,195.29 |
| ne | n | e | 31,857 | 3.08 % | 30,776.53 |
| pa | p | a | 29,385 | 2.84 % | 28,388.37 |
| ja | j | a | 25,571 | 2.47 % | 24,703.73 |
| eee | e | ee | 23,222 | 2.24 % | 22,434.40 |
| da | d | a | 20,961 | 2.02 % | 20,250.08 |
| to | t | o | 18,473 | 1.78 % | 17,846.46 |
| v | v |  | 17,813 | 1.72 % | 17,208.85 |
| in | i | n | 16,241 | 1.57 % | 15,690.17 |
| se | s | e | 15,921 | 1.54 % | 15,381.02 |
| na | n | a | 12,028 | 1.16 % | 11,620.05 |
| tako | t | ako | 10,404 | 1.00 % | 10,051.13 |
| kaj | k | aj | 9,507 | 0.92 % | 9,184.56 |
| so | s | o | 7,996 | 0.77 % | 7,724.81 |
| za | z | a | 7,976 | 0.77 % | 7,705.48 |
| tudi | t | udi | 7,947 | 0.77 % | 7,677.47 |
| bi | b | i | 7,626 | 0.74 % | 7,367.35 |
| sem | s | em | 7,541 | 0.73 % | 7,285.24 |
| še | š | e | 7,193 | 0.69 % | 6,949.04 |
| a | a |  | 6,705 | 0.65 % | 6,477.59 |
| jaz | j | az | 6,475 | 0.63 % | 6,255.39 |
| zdaj | z | daj | 6,364 | 0.61 % | 6,148.16 |
| če | č | e | 6,274 | 0.61 % | 6,061.21 |
| ki | k | i | 5,540 | 0.54 % | 5,352.10 |
| ni | n | i | 5,239 | 0.51 % | 5,061.31 |
| bo | b | o | 5,099 | 0.49 % | 4,926.06 |
| si | s | i | 4,898 | 0.47 % | 4,731.88 |
| ali | a | li | 4,798 | 0.46 % | 4,635.27 |
| z | z |  | 4,789 | 0.46 % | 4,626.57 |
| ker | k | er | 4,785 | 0.46 % | 4,622.71 |
| no | n | o | 4,711 | 0.46 % | 4,551.22 |
| s | s |  | 4,621 | 0.45 % | 4,464.27 |

2.2.12 List of initial character-level 2-grams from all standardized forms in the GOS 1.0 corpus
File at CLARIN.SI: GOS1.0-word_parts-all-standardized_forms-initial-2grams-entire.tsv

| Standardized form | Initial part of the word | Rest of the word | Total absolute frequency of standardized form | Percentage of all found standardized forms | Total relative frequency (per million occurrences) |
|---|---|---|---|---|---|
| je | je |  | 37,466 | 3.77 % | 36,195.29 |
| ne | ne |  | 31,857 | 3.21 % | 30,776.53 |
| pa | pa |  | 29,385 | 2.96 % | 28,388.37 |
| ja | ja |  | 25,571 | 2.58 % | 24,703.73 |
| eee | ee | e | 23,222 | 2.34 % | 22,434.40 |
| da | da |  | 20,961 | 2.11 % | 20,250.08 |
| to | to |  | 18,473 | 1.86 % | 17,846.46 |
| in | in |  | 16,241 | 1.64 % | 15,690.17 |
| se | se |  | 15,921 | 1.60 % | 15,381.02 |
| na | na |  | 12,028 | 1.21 % | 11,620.05 |
| tako | ta | ko | 10,404 | 1.05 % | 10,051.13 |
| kaj | ka | j | 9,507 | 0.96 % | 9,184.56 |
| so | so |  | 7,996 | 0.81 % | 7,724.81 |
| za | za |  | 7,976 | 0.80 % | 7,705.48 |
| tudi | tu | di | 7,947 | 0.80 % | 7,677.47 |
| bi | bi |  | 7,626 | 0.77 % | 7,367.35 |
| sem | se | m | 7,541 | 0.76 % | 7,285.24 |
| še | še |  | 7,193 | 0.72 % | 6,949.04 |
| jaz | ja | z | 6,475 | 0.65 % | 6,255.39 |
| zdaj | zd | aj | 6,364 | 0.64 % | 6,148.16 |
| če | če |  | 6,274 | 0.63 % | 6,061.21 |
| ki | ki |  | 5,540 | 0.56 % | 5,352.10 |
| ni | ni |  | 5,239 | 0.53 % | 5,061.31 |
| bo | bo |  | 5,099 | 0.51 % | 4,926.06 |
| si | si |  | 4,898 | 0.49 % | 4,731.88 |
| ali | al | i | 4,798 | 0.48 % | 4,635.27 |
| ker | ke | r | 4,785 | 0.48 % | 4,622.71 |
| no | no |  | 4,711 | 0.47 % | 4,551.22 |
| mhm | mh | m | 4,477 | 0.45 % | 4,325.16 |
| že | že |  | 4,457 | 0.45 % | 4,305.84 |
| ti | ti |  | 4,369 | 0.44 % | 4,220.82 |
| ko | ko |  | 4,299 | 0.43 % | 4,153.19 |

2.2.13 List of initial character-level 3-grams from all standardized forms in the GOS 1.0 corpus
File at CLARIN.SI: GOS1.0-word_parts-all-standardized_forms-initial-3grams-entire.tsv

| Standardized form | Initial part of the word | Rest of the word | Total absolute frequency of standardized form | Percentage of all found standardized forms | Total relative frequency (per million occurrences) |
|---|---|---|---|---|---|
| eee | eee |  | 23,222 | 3.47 % | 22,434.40 |
| tako | tak | o | 10,404 | 1.56 % | 10,051.13 |
| kaj | kaj |  | 9,507 | 1.42 % | 9,184.56 |
| tudi | tud | i | 7,947 | 1.19 % | 7,677.47 |
| sem | sem |  | 7,541 | 1.13 % | 7,285.24 |
| jaz | jaz |  | 6,475 | 0.97 % | 6,255.39 |
| zdaj | zda | j | 6,364 | 0.95 % | 6,148.16 |
| ali | ali |  | 4,798 | 0.72 % | 4,635.27 |
| ker | ker |  | 4,785 | 0.72 % | 4,622.71 |
| mhm | mhm |  | 4,477 | 0.67 % | 4,325.16 |
| pol | pol |  | 4,209 | 0.63 % | 4,066.25 |
| samo | sam | o | 4,148 | 0.62 % | 4,007.32 |
| saj | saj |  | 4,043 | 0.60 % | 3,905.88 |
| lahko | lah | ko | 4,041 | 0.60 % | 3,903.94 |
| smo | smo |  | 3,981 | 0.59 % | 3,845.98 |
| kar | kar |  | 3,805 | 0.57 % | 3,675.95 |
| vem | vem |  | 3,558 | 0.53 % | 3,437.33 |
| ampak | amp | ak | 3,541 | 0.53 % | 3,420.90 |
| kot | kot |  | 3,396 | 0.51 % | 3,280.82 |
| potem | pot | em | 3,247 | 0.49 % | 3,136.87 |
| bilo | bil | o | 3,234 | 0.48 % | 3,124.31 |
| kako | kak | o | 3,191 | 0.48 % | 3,082.77 |
| eem | eem |  | 2,950 | 0.44 % | 2,849.95 |
| veš | veš |  | 2,896 | 0.43 % | 2,797.78 |
| mislim | mis | lim | 2,873 | 0.43 % | 2,775.56 |
| pač | pač |  | 2,851 | 0.43 % | 2,754.30 |
| vse | vse |  | 2,786 | 0.42 % | 2,691.51 |
| dobro | dob | ro | 2,672 | 0.40 % | 2,581.38 |
| malo | mal | o | 2,654 | 0.40 % | 2,563.99 |
| tam | tam |  | 2,598 | 0.39 % | 2,509.89 |
| nekaj | nek | aj | 2,537 | 0.38 % | 2,450.95 |
| pri | pri |  | 2,474 | 0.37 % | 2,390.09 |

2.2.14 List of initial character-level 4-grams from all standardized forms in the GOS 1.0 corpus
File at CLARIN.SI: GOS1.0-word_parts-all-standardized_forms-initial-4grams-entire.tsv

| Standardized form | Initial part of the word | Rest of the word | Total absolute frequency of standardized form | Percentage of all found standardized forms | Total relative frequency (per million occurrences) |
|---|---|---|---|---|---|
| tako | tako |  | 10,404 | 2.10 % | 10,051.13 |
| tudi | tudi |  | 7,947 | 1.61 % | 7,677.47 |
| zdaj | zdaj |  | 6,364 | 1.29 % | 6,148.16 |
| samo | samo |  | 4,148 | 0.84 % | 4,007.32 |
| lahko | lahk | o | 4,041 | 0.82 % | 3,903.94 |
| ampak | ampa | k | 3,541 | 0.72 % | 3,420.90 |
| potem | pote | m | 3,247 | 0.66 % | 3,136.87 |
| bilo | bilo |  | 3,234 | 0.65 % | 3,124.31 |
| kako | kako |  | 3,191 | 0.65 % | 3,082.77 |
| mislim | misl | im | 2,873 | 0.58 % | 2,775.56 |
| dobro | dobr | o | 2,672 | 0.54 % | 2,581.38 |
| malo | malo |  | 2,654 | 0.54 % | 2,563.99 |
| nekaj | neka | j | 2,537 | 0.51 % | 2,450.95 |
| bila | bila |  | 2,143 | 0.43 % | 2,070.32 |
| bomo | bomo |  | 2,020 | 0.41 % | 1,951.49 |
| tega | tega |  | 1,978 | 0.40 % | 1,910.91 |
| zelo | zelo |  | 1,835 | 0.37 % | 1,772.76 |
| seveda | seve | da | 1,795 | 0.36 % | 1,734.12 |
| recimo | reci | mo | 1,761 | 0.36 % | 1,701.27 |
| torej | tore | j | 1,697 | 0.34 % | 1,639.44 |
| bolj | bolj |  | 1,654 | 0.34 % | 1,597.90 |
| danes | dane | s | 1,490 | 0.30 % | 1,439.46 |
| rekel | reke | l | 1,401 | 0.28 % | 1,353.48 |
| zato | zato |  | 1,356 | 0.27 % | 1,310.01 |
| pravi | prav | i | 1,330 | 0.27 % | 1,284.89 |
| prav | prav |  | 1,278 | 0.26 % | 1,234.65 |
| imamo | imam | o | 1,172 | 0.24 % | 1,132.25 |
| tisto | tist | o | 1,156 | 0.23 % | 1,116.79 |
| tukaj | tuka | j | 1,135 | 0.23 % | 1,096.50 |
| mogoče | mogo | če | 1,089 | 0.22 % | 1,052.07 |
| rekla | rekl | a | 1,066 | 0.22 % | 1,029.85 |
| imaš | imaš |  | 1,058 | 0.21 % | 1,022.12 |

2.2.15 List of initial character-level 5-grams from all standardized forms in the GOS 1.0 corpus
File at CLARIN.SI: GOS1.0-word_parts-all-standardized_forms-initial-5grams-entire.tsv

| Standardized form | Initial part of the word | Rest of the word | Total absolute frequency of standardized form | Percentage of all found standardized forms | Total relative frequency (per million occurrences) |
|---|---|---|---|---|---|
| lahko | lahko |  | 4,041 | 1.08 % | 3,903.94 |
| ampak | ampak |  | 3,541 | 0.95 % | 3,420.90 |
| potem | potem |  | 3,247 | 0.87 % | 3,136.87 |
| mislim | misli | m | 2,873 | 0.77 % | 2,775.56 |
| dobro | dobro |  | 2,672 | 0.71 % | 2,581.38 |
| nekaj | nekaj |  | 2,537 | 0.68 % | 2,450.95 |
| seveda | seved | a | 1,795 | 0.48 % | 1,734.12 |
| recimo | recim | o | 1,761 | 0.47 % | 1,701.27 |
| torej | torej |  | 1,697 | 0.45 % | 1,639.44 |
| danes | danes |  | 1,490 | 0.40 % | 1,439.46 |
| rekel | rekel |  | 1,401 | 0.38 % | 1,353.48 |
| pravi | pravi |  | 1,330 | 0.36 % | 1,284.89 |
| imamo | imamo |  | 1,172 | 0.31 % | 1,132.25 |
| tisto | tisto |  | 1,156 | 0.31 % | 1,116.79 |
| tukaj | tukaj |  | 1,135 | 0.30 % | 1,096.50 |
| mogoče | mogoč | e | 1,089 | 0.29 % | 1,052.07 |
| rekla | rekla |  | 1,066 | 0.28 % | 1,029.85 |
| koliko | kolik | o | 1,046 | 0.28 % | 1,010.52 |
| čisto | čisto |  | 1,008 | 0.27 % | 973.81 |
| zakaj | zakaj |  | 1,002 | 0.27 % | 968.02 |
| bistvu | bistv | u | 988 | 0.27 % | 954.49 |
| oziroma | oziro | ma | 968 | 0.26 % | 935.17 |
| tisti | tisti |  | 958 | 0.26 % | 925.51 |
| imajo | imajo |  | 942 | 0.25 % | 910.05 |
| naprej | napre | j | 922 | 0.25 % | 890.73 |
| nisem | nisem |  | 916 | 0.24 % | 884.93 |
| imeli | imeli |  | 915 | 0.24 % | 883.97 |
| sploh | sploh |  | 895 | 0.24 % | 864.64 |
| takrat | takra | t | 840 | 0.23 % | 811.51 |
| treba | treba |  | 827 | 0.22 % | 798.95 |
| boste | boste |  | 809 | 0.22 % | 781.56 |
| enkrat | enkra | t | 803 | 0.21 % | 775.77 |

2.2.16 List of final character-level 1-grams from all standardized forms in the GOS 1.0 corpus
File at CLARIN.SI: GOS1.0-word_parts-all-standardized_forms-final-1grams-entire.tsv

| Standardized form | Rest of the word | Final part of the word | Total absolute frequency of standardized form | Percentage of all found standardized forms | Total relative frequency (per million occurrences) |
|---|---|---|---|---|---|
| je | j | e | 37,466 | 3.62 % | 36,195.29 |
| ne | n | e | 31,857 | 3.08 % | 30,776.53 |
| pa | p | a | 29,385 | 2.84 % | 28,388.37 |
| ja | j | a | 25,571 | 2.47 % | 24,703.73 |
| eee | ee | e | 23,222 | 2.24 % | 22,434.40 |
| da | d | a | 20,961 | 2.02 % | 20,250.08 |
| to | t | o | 18,473 | 1.78 % | 17,846.46 |
| v |  | v | 17,813 | 1.72 % | 17,208.85 |
| in | i | n | 16,241 | 1.57 % | 15,690.17 |
| se | s | e | 15,921 | 1.54 % | 15,381.02 |
| na | n | a | 12,028 | 1.16 % | 11,620.05 |
| tako | tak | o | 10,404 | 1.00 % | 10,051.13 |
| kaj | ka | j | 9,507 | 0.92 % | 9,184.56 |
| so | s | o | 7,996 | 0.77 % | 7,724.81 |
| za | z | a | 7,976 | 0.77 % | 7,705.48 |
| tudi | tud | i | 7,947 | 0.77 % | 7,677.47 |
| bi | b | i | 7,626 | 0.74 % | 7,367.35 |
| sem | se | m | 7,541 | 0.73 % | 7,285.24 |
| še | š | e | 7,193 | 0.69 % | 6,949.04 |
| a |  | a | 6,705 | 0.65 % | 6,477.59 |
| jaz | ja | z | 6,475 | 0.63 % | 6,255.39 |
| zdaj | zda | j | 6,364 | 0.61 % | 6,148.16 |
| če | č | e | 6,274 | 0.61 % | 6,061.21 |
| ki | k | i | 5,540 | 0.54 % | 5,352.10 |
| ni | n | i | 5,239 | 0.51 % | 5,061.31 |
| bo | b | o | 5,099 | 0.49 % | 4,926.06 |
| si | s | i | 4,898 | 0.47 % | 4,731.88 |
| ali | al | i | 4,798 | 0.46 % | 4,635.27 |
| z |  | z | 4,789 | 0.46 % | 4,626.57 |
| ker | ke | r | 4,785 | 0.46 % | 4,622.71 |
| no | n | o | 4,711 | 0.46 % | 4,551.22 |
| s |  | s | 4,621 | 0.45 % | 4,464.27 |

2.2.17 List of final character-level 2-grams from all standardized forms in the GOS 1.0 corpus
File at CLARIN.SI: GOS1.0-word_parts-all-standardized_forms-final-2grams-entire.tsv

| Standardized form | Rest of the word | Final part of the word | Total absolute frequency of standardized form | Percentage of all found standardized forms | Total relative frequency (per million occurrences) |
|---|---|---|---|---|---|
| je |  | je | 37,466 | 3.77 % | 36,195.29 |
| ne |  | ne | 31,857 | 3.21 % | 30,776.53 |
| pa |  | pa | 29,385 | 2.96 % | 28,388.37 |
| ja |  | ja | 25,571 | 2.58 % | 24,703.73 |
| eee | e | ee | 23,222 | 2.34 % | 22,434.40 |
| da |  | da | 20,961 | 2.11 % | 20,250.08 |
| to |  | to | 18,473 | 1.86 % | 17,846.46 |
| in |  | in | 16,241 | 1.64 % | 15,690.17 |
| se |  | se | 15,921 | 1.60 % | 15,381.02 |
| na |  | na | 12,028 | 1.21 % | 11,620.05 |
| tako | ta | ko | 10,404 | 1.05 % | 10,051.13 |
| kaj | k | aj | 9,507 | 0.96 % | 9,184.56 |
| so |  | so | 7,996 | 0.81 % | 7,724.81 |
| za |  | za | 7,976 | 0.80 % | 7,705.48 |
| tudi | tu | di | 7,947 | 0.80 % | 7,677.47 |
| bi |  | bi | 7,626 | 0.77 % | 7,367.35 |
| sem | s | em | 7,541 | 0.76 % | 7,285.24 |
| še |  | še | 7,193 | 0.72 % | 6,949.04 |
| jaz | j | az | 6,475 | 0.65 % | 6,255.39 |
| zdaj | zd | aj | 6,364 | 0.64 % | 6,148.16 |
| če |  | če | 6,274 | 0.63 % | 6,061.21 |
| ki |  | ki | 5,540 | 0.56 % | 5,352.10 |
| ni |  | ni | 5,239 | 0.53 % | 5,061.31 |
| bo |  | bo | 5,099 | 0.51 % | 4,926.06 |
| si |  | si | 4,898 | 0.49 % | 4,731.88 |
| ali | a | li | 4,798 | 0.48 % | 4,635.27 |
| ker | k | er | 4,785 | 0.48 % | 4,622.71 |
| no |  | no | 4,711 | 0.47 % | 4,551.22 |
| mhm | m | hm | 4,477 | 0.45 % | 4,325.16 |
| že |  | že | 4,457 | 0.45 % | 4,305.84 |
| ti |  | ti | 4,369 | 0.44 % | 4,220.82 |
| ko |  | ko | 4,299 | 0.43 % | 4,153.19 |

2.2.18 List of final character-level 3-grams from all standardized forms in the GOS 1.0 corpus
File at CLARIN.SI: GOS1.0-word_parts-all-standardized_forms-final-3grams-entire.tsv

| Standardized form | Rest of the word | Final part of the word | Total absolute frequency of standardized form | Percentage of all found standardized forms | Total relative frequency (per million occurrences) |
|---|---|---|---|---|---|
| eee |  | eee | 23,222 | 3.47 % | 22,434.40 |
| tako | t | ako | 10,404 | 1.56 % | 10,051.13 |
| kaj |  | kaj | 9,507 | 1.42 % | 9,184.56 |
| tudi | t | udi | 7,947 | 1.19 % | 7,677.47 |
| sem |  | sem | 7,541 | 1.13 % | 7,285.24 |
| jaz |  | jaz | 6,475 | 0.97 % | 6,255.39 |
| zdaj | z | daj | 6,364 | 0.95 % | 6,148.16 |
| ali |  | ali | 4,798 | 0.72 % | 4,635.27 |
| ker |  | ker | 4,785 | 0.72 % | 4,622.71 |
| mhm |  | mhm | 4,477 | 0.67 % | 4,325.16 |
| pol |  | pol | 4,209 | 0.63 % | 4,066.25 |
| samo | s | amo | 4,148 | 0.62 % | 4,007.32 |
| saj |  | saj | 4,043 | 0.60 % | 3,905.88 |
| lahko | la | hko | 4,041 | 0.60 % | 3,903.94 |
| smo |  | smo | 3,981 | 0.59 % | 3,845.98 |
| kar |  | kar | 3,805 | 0.57 % | 3,675.95 |
| vem |  | vem | 3,558 | 0.53 % | 3,437.33 |
| ampak | am | pak | 3,541 | 0.53 % | 3,420.90 |
| kot |  | kot | 3,396 | 0.51 % | 3,280.82 |
| potem | po | tem | 3,247 | 0.49 % | 3,136.87 |
| bilo | b | ilo | 3,234 | 0.48 % | 3,124.31 |
| kako | k | ako | 3,191 | 0.48 % | 3,082.77 |
| eem |  | eem | 2,950 | 0.44 % | 2,849.95 |
| veš |  | veš | 2,896 | 0.43 % | 2,797.78 |
| mislim | mis | lim | 2,873 | 0.43 % | 2,775.56 |
| pač |  | pač | 2,851 | 0.43 % | 2,754.30 |
| vse |  | vse | 2,786 | 0.42 % | 2,691.51 |
| dobro | do | bro | 2,672 | 0.40 % | 2,581.38 |
| malo | m | alo | 2,654 | 0.40 % | 2,563.99 |
| tam |  | tam | 2,598 | 0.39 % | 2,509.89 |
| nekaj | ne | kaj | 2,537 | 0.38 % | 2,450.95 |
| pri |  | pri | 2,474 | 0.37 % | 2,390.09 |

2.2.19 List of final character-level 4-grams from all standardized forms in the GOS 1.0 corpus
File at CLARIN.SI: GOS1.0-word_parts-all-standardized_forms-final-4grams-entire.tsv

| Standardized form | Rest of the word | Final part of the word | Total absolute frequency of standardized form | Percentage of all found standardized forms | Total relative frequency (per million occurrences) |
|---|---|---|---|---|---|
| tako |  | tako | 10,404 | 2.10 % | 10,051.13 |
| tudi |  | tudi | 7,947 | 1.61 % | 7,677.47 |
| zdaj |  | zdaj | 6,364 | 1.29 % | 6,148.16 |
| samo |  | samo | 4,148 | 0.84 % | 4,007.32 |
| lahko | l | ahko | 4,041 | 0.82 % | 3,903.94 |
| ampak | a | mpak | 3,541 | 0.72 % | 3,420.90 |
| potem | p | otem | 3,247 | 0.66 % | 3,136.87 |
| bilo |  | bilo | 3,234 | 0.65 % | 3,124.31 |
| kako |  | kako | 3,191 | 0.65 % | 3,082.77 |
| mislim | mi | slim | 2,873 | 0.58 % | 2,775.56 |
| dobro | d | obro | 2,672 | 0.54 % | 2,581.38 |
| malo |  | malo | 2,654 | 0.54 % | 2,563.99 |
| nekaj | n | ekaj | 2,537 | 0.51 % | 2,450.95 |
| bila |  | bila | 2,143 | 0.43 % | 2,070.32 |
| bomo |  | bomo | 2,020 | 0.41 % | 1,951.49 |
| tega |  | tega | 1,978 | 0.40 % | 1,910.91 |
| zelo |  | zelo | 1,835 | 0.37 % | 1,772.76 |
| seveda | se | veda | 1,795 | 0.36 % | 1,734.12 |
| recimo | re | cimo | 1,761 | 0.36 % | 1,701.27 |
| torej | t | orej | 1,697 | 0.34 % | 1,639.44 |
| bolj |  | bolj | 1,654 | 0.34 % | 1,597.90 |
| danes | d | anes | 1,490 | 0.30 % | 1,439.46 |
| rekel | r | ekel | 1,401 | 0.28 % | 1,353.48 |
| zato |  | zato | 1,356 | 0.27 % | 1,310.01 |
| pravi | p | ravi | 1,330 | 0.27 % | 1,284.89 |
| prav |  | prav | 1,278 | 0.26 % | 1,234.65 |
| imamo | i | mamo | 1,172 | 0.24 % | 1,132.25 |
| tisto | t | isto | 1,156 | 0.23 % | 1,116.79 |
| tukaj | t | ukaj | 1,135 | 0.23 % | 1,096.50 |
| mogoče | mo | goče | 1,089 | 0.22 % | 1,052.07 |
| rekla | r | ekla | 1,066 | 0.22 % | 1,029.85 |
| imaš |  | imaš | 1,058 | 0.21 % | 1,022.12 |

File at CLARIN.SI 2.2.20 List of final character-level 5-grams from
all standardized forms in the GOS 1.0 corpus
File at CLARIN.SI: GOS1.0-word_parts-all-standardized_forms-final-5grams-entire.tsv

| Standardized form | Rest of the word | Final part of the word | Total absolute frequency of standardized form | Percentage of all found standardized forms | Total relative frequency (per million occurrences) |
|---|---|---|---|---|---|
| lahko | | lahko | 4,041 | 1.08 % | 3,903.94 |
| ampak | | ampak | 3,541 | 0.95 % | 3,420.90 |
| potem | | potem | 3,247 | 0.87 % | 3,136.87 |
| mislim | m | islim | 2,873 | 0.77 % | 2,775.56 |
| dobro | | dobro | 2,672 | 0.71 % | 2,581.38 |
| nekaj | | nekaj | 2,537 | 0.68 % | 2,450.95 |
| seveda | s | eveda | 1,795 | 0.48 % | 1,734.12 |
| recimo | r | ecimo | 1,761 | 0.47 % | 1,701.27 |
| torej | | torej | 1,697 | 0.45 % | 1,639.44 |
| danes | | danes | 1,490 | 0.40 % | 1,439.46 |
| rekel | | rekel | 1,401 | 0.38 % | 1,353.48 |
| pravi | | pravi | 1,330 | 0.36 % | 1,284.89 |
| imamo | | imamo | 1,172 | 0.31 % | 1,132.25 |
| tisto | | tisto | 1,156 | 0.31 % | 1,116.79 |
| tukaj | | tukaj | 1,135 | 0.30 % | 1,096.50 |
| mogoče | m | ogoče | 1,089 | 0.29 % | 1,052.07 |
| rekla | | rekla | 1,066 | 0.28 % | 1,029.85 |
| koliko | k | oliko | 1,046 | 0.28 % | 1,010.52 |
| čisto | | čisto | 1,008 | 0.27 % | 973.81 |
| zakaj | | zakaj | 1,002 | 0.27 % | 968.02 |
| bistvu | b | istvu | 988 | 0.27 % | 954.49 |
| oziroma | oz | iroma | 968 | 0.26 % | 935.17 |
| tisti | | tisti | 958 | 0.26 % | 925.51 |
| imajo | | imajo | 942 | 0.25 % | 910.05 |
| naprej | n | aprej | 922 | 0.25 % | 890.73 |
| nisem | | nisem | 916 | 0.24 % | 884.93 |
| imeli | | imeli | 915 | 0.24 % | 883.97 |
| sploh | | sploh | 895 | 0.24 % | 864.64 |
| takrat | t | akrat | 840 | 0.23 % | 811.51 |
| treba | | treba | 827 | 0.22 % | 798.95 |
| boste | | boste | 809 | 0.22 % | 781.56 |
| enkrat | e | nkrat | 803 | 0.21 % | 775.77 |

2.2.21 List of initial character-level 1-grams from all lower-case word forms in the GOS 1.0 corpus
File at CLARIN.SI: GOS1.0-word_parts-all-lowercase_forms-initial-1grams-entire.tsv

| Lower-case word form | Initial part of the word | Rest of the word | Total absolute frequency of lower-case word form | Percentage of all found lower-case word forms | Total relative frequency (per million occurrences) |
|---|---|---|---|---|---|
| je | j | e | 36,885 | 3.56 % | 35,634 |
| ne | n | e | 30,009 | 2.90 % | 28,991.21 |
| pa | p | a | 29,190 | 2.82 % | 28,199.98 |
| ja | j | a | 25,277 | 2.44 % | 24,419.70 |
| eee | e | ee | 23,222 | 2.24 % | 22,434.40 |
| da | d | a | 18,814 | 1.82 % | 18,175.90 |
| to | t | o | 17,228 | 1.66 % | 16,643.69 |
| v | v | | 17,069 | 1.65 % | 16,490.08 |
| se | s | e | 16,330 | 1.58 % | 15,776.15 |
| in | i | n | 15,952 | 1.54 % | 15,410.97 |
| na | n | a | 12,070 | 1.17 % | 11,660.63 |
| so | s | o | 7,892 | 0.76 % | 7,624.33 |
| za | z | a | 7,866 | 0.76 % | 7,599.21 |
| še | š | e | 7,070 | 0.68 % | 6,830.21 |
| bi | b | i | 7,034 | 0.68 % | 6,795.43 |
| a | a | | 6,960 | 0.67 % | 6,723.94 |
| če | č | e | 6,114 | 0.59 % | 5,906.64 |
| kaj | k | aj | 5,825 | 0.56 % | 5,627.44 |
| sem | s | em | 5,167 | 0.50 % | 4,991.75 |
| s | s | | 5,044 | 0.49 % | 4,872.93 |
| z | z | | 4,905 | 0.47 % | 4,738.64 |
| ni | n | i | 4,830 | 0.47 % | 4,666.18 |
| bo | b | o | 4,802 | 0.46 % | 4,639.13 |
| ki | k | i | 4,676 | 0.45 % | 4,517.41 |
| si | s | i | 4,654 | 0.45 % | 4,496.15 |
| no | n | o | 4,647 | 0.45 % | 4,489.39 |
| mhm | m | hm | 4,478 | 0.43 % | 4,326.12 |
| ti | t | i | 4,308 | 0.42 % | 4,161.89 |
| že | ž | e | 4,277 | 0.41 % | 4,131.94 |
| po | p | o | 4,247 | 0.41 % | 4,102.96 |
| tudi | t | udi | 4,120 | 0.40 % | 3,980.26 |
| mi | m | i | 4,026 | 0.39 % | 3,889.45 |

2.2.22 List of initial character-level 2-grams from all lower-case word forms in the GOS 1.0 corpus
File at CLARIN.SI: GOS1.0-word_parts-all-lowercase_forms-initial-2grams-entire.tsv

| Lower-case word form | Initial part of the word | Rest of the word | Total absolute frequency of lower-case word form | Percentage of all found lower-case word forms | Total relative frequency (per million occurrences) |
|---|---|---|---|---|---|
| je | je | | 36,885 | 3.74 % | 35,634 |
| ne | ne | | 30,009 | 3.04 % | 28,991.21 |
| pa | pa | | 29,190 | 2.96 % | 28,199.98 |
| ja | ja | | 25,277 | 2.56 % | 24,419.70 |
| eee | ee | e | 23,222 | 2.35 % | 22,434.40 |
| da | da | | 18,814 | 1.91 % | 18,175.90 |
| to | to | | 17,228 | 1.75 % | 16,643.69 |
| se | se | | 16,330 | 1.65 % | 15,776.15 |
| in | in | | 15,952 | 1.62 % | 15,410.97 |
| na | na | | 12,070 | 1.22 % | 11,660.63 |
| so | so | | 7,892 | 0.80 % | 7,624.33 |
| za | za | | 7,866 | 0.80 % | 7,599.21 |
| še | še | | 7,070 | 0.72 % | 6,830.21 |
| bi | bi | | 7,034 | 0.71 % | 6,795.43 |
| če | če | | 6,114 | 0.62 % | 5,906.64 |
| kaj | ka | j | 5,825 | 0.59 % | 5,627.44 |
| sem | se | m | 5,167 | 0.52 % | 4,991.75 |
| ni | ni | | 4,830 | 0.49 % | 4,666.18 |
| bo | bo | | 4,802 | 0.49 % | 4,639.13 |
| ki | ki | | 4,676 | 0.47 % | 4,517.41 |
| si | si | | 4,654 | 0.47 % | 4,496.15 |
| no | no | | 4,647 | 0.47 % | 4,489.39 |
| mhm | mh | m | 4,478 | 0.45 % | 4,326.12 |
| ti | ti | | 4,308 | 0.44 % | 4,161.89 |
| že | že | | 4,277 | 0.43 % | 4,131.94 |
| po | po | | 4,247 | 0.43 % | 4,102.96 |
| tudi | tu | di | 4,120 | 0.42 % | 3,980.26 |
| mi | mi | | 4,026 | 0.41 % | 3,889.45 |
| smo | sm | o | 3,927 | 0.40 % | 3,793.81 |
| ko | ko | | 3,873 | 0.39 % | 3,741.64 |
| ta | ta | | 3,789 | 0.38 % | 3,660.49 |
| tako | ta | ko | 3,682 | 0.37 % | 3,557.12 |

2.2.23 List of initial character-level 3-grams from all lower-case word forms in the GOS 1.0 corpus
File at CLARIN.SI: GOS1.0-word_parts-all-lowercase_forms-initial-3grams-entire.tsv

| Lower-case word form | Initial part of the word | Rest of the word | Total absolute frequency of lower-case word form | Percentage of all found lower-case word forms | Total relative frequency (per million occurrences) |
|---|---|---|---|---|---|
| eee | eee | | 23,222 | 3.58 % | 22,434.40 |
| kaj | kaj | | 5,825 | 0.90 % | 5,627.44 |
| sem | sem | | 5,167 | 0.80 % | 4,991.75 |
| mhm | mhm | | 4,478 | 0.69 % | 4,326.12 |
| tudi | tud | i | 4,120 | 0.64 % | 3,980.26 |
| smo | smo | | 3,927 | 0.61 % | 3,793.81 |
| tako | tak | o | 3,682 | 0.57 % | 3,557.12 |
| tud | tud | | 3,571 | 0.55 % | 3,449.88 |
| ampak | amp | ak | 3,445 | 0.53 % | 3,328.16 |
| sej | sej | | 3,426 | 0.53 % | 3,309.80 |
| vem | vem | | 3,297 | 0.51 % | 3,185.18 |
| ker | ker | | 3,264 | 0.50 % | 3,153.30 |
| tak | tak | | 3,226 | 0.50 % | 3,116.59 |
| zdej | zde | j | 3,038 | 0.47 % | 2,934.96 |
| pol | pol | | 2,961 | 0.46 % | 2,860.57 |
| eem | eem | | 2,950 | 0.46 % | 2,849.95 |
| lahko | lah | ko | 2,920 | 0.45 % | 2,820.96 |
| pač | pač | | 2,833 | 0.44 % | 2,736.92 |
| tko | tko | | 2,822 | 0.43 % | 2,726.29 |
| veš | veš | | 2,611 | 0.40 % | 2,522.44 |
| potem | pot | em | 2,556 | 0.39 % | 2,469.31 |
| kot | kot | | 2,538 | 0.39 % | 2,451.92 |
| vse | vse | | 2,520 | 0.39 % | 2,434.53 |
| sam | sam | | 2,352 | 0.36 % | 2,272.23 |
| res | res | | 2,287 | 0.35 % | 2,209.43 |
| blo | blo | | 2,266 | 0.35 % | 2,189.15 |
| tem | tem | | 2,136 | 0.33 % | 2,063.55 |
| samo | sam | o | 2,118 | 0.33 % | 2,046.17 |
| dobro | dob | ro | 2,069 | 0.32 % | 1,998.83 |
| aha | aha | | 2,060 | 0.32 % | 1,990.13 |
| bil | bil | | 2,045 | 0.32 % | 1,975.64 |
| kar | kar | | 1,980 | 0.31 % | 1,912.85 |

2.2.24 List of initial character-level 4-grams from all lower-case word forms in the GOS 1.0 corpus
File at CLARIN.SI: GOS1.0-word_parts-all-lowercase_forms-initial-4grams-entire.tsv

| Lower-case word form | Initial part of the word | Rest of the word | Total absolute frequency of lower-case word form | Percentage of all found lower-case word forms | Total relative frequency (per million occurrences) |
|---|---|---|---|---|---|
| tudi | tudi | | 4,120 | 0.89 % | 3,980.26 |
| tako | tako | | 3,682 | 0.80 % | 3,557.12 |
| ampak | ampa | k | 3,445 | 0.75 % | 3,328.16 |
| zdej | zdej | | 3,038 | 0.66 % | 2,934.96 |
| lahko | lahk | o | 2,920 | 0.63 % | 2,820.96 |
| potem | pote | m | 2,556 | 0.55 % | 2,469.31 |
| samo | samo | | 2,118 | 0.46 % | 2,046.17 |
| dobro | dobr | o | 2,069 | 0.45 % | 1,998.83 |
| tega | tega | | 1,893 | 0.41 % | 1,828.80 |
| bomo | bomo | | 1,819 | 0.39 % | 1,757.31 |
| kako | kako | | 1,699 | 0.37 % | 1,641.38 |
| torej | tore | j | 1,693 | 0.37 % | 1,635.58 |
| recimo | reci | mo | 1,665 | 0.36 % | 1,608.53 |
| seveda | seve | da | 1,593 | 0.34 % | 1,538.97 |
| malo | malo | | 1,204 | 0.26 % | 1,163.16 |
| zdaj | zdaj | | 1,196 | 0.26 % | 1,155.44 |
| neki | neki | | 1,187 | 0.26 % | 1,146.74 |
| zelo | zelo | | 1,128 | 0.24 % | 1,089.74 |
| zato | zato | | 1,122 | 0.24 % | 1,083.95 |
| mogoče | mogo | če | 1,019 | 0.22 % | 984.44 |
| bolj | bolj | | 1,016 | 0.22 % | 981.54 |
| reku | reku | | 996 | 0.22 % | 962.22 |
| danes | dane | s | 992 | 0.21 % | 958.36 |
| bistvu | bist | vu | 965 | 0.21 % | 932.27 |
| pravi | prav | i | 958 | 0.21 % | 925.51 |
| oziroma | ozir | oma | 943 | 0.20 % | 911.02 |
| prav | prav | | 939 | 0.20 % | 907.15 |
| rekla | rekl | a | 931 | 0.20 % | 899.42 |
| mislm | misl | m | 925 | 0.20 % | 893.63 |
| naprej | napr | ej | 912 | 0.20 % | 881.07 |
| mislim | misl | im | 909 | 0.20 % | 878.17 |
| sploh | splo | h | 866 | 0.19 % | 836.63 |

2.2.25 List of initial character-level 5-grams from all lower-case word forms in the GOS 1.0 corpus
File at CLARIN.SI: GOS1.0-word_parts-all-lowercase_forms-initial-5grams-entire.tsv

| Lower-case word form | Initial part of the word | Rest of the word | Total absolute frequency of lower-case word form | Percentage of all found lower-case word forms | Total relative frequency (per million occurrences) |
|---|---|---|---|---|---|
| ampak | ampak | | 3,445 | 0.98 % | 3,328.16 |
| lahko | lahko | | 2,920 | 0.83 % | 2,820.96 |
| potem | potem | | 2,556 | 0.73 % | 2,469.31 |
| dobro | dobro | | 2,069 | 0.59 % | 1,998.83 |
| torej | torej | | 1,693 | 0.48 % | 1,635.58 |
| recimo | recim | o | 1,665 | 0.47 % | 1,608.53 |
| seveda | seved | a | 1,593 | 0.46 % | 1,538.97 |
| mogoče | mogoč | e | 1,019 | 0.29 % | 984.44 |
| danes | danes | | 992 | 0.28 % | 958.36 |
| bistvu | bistv | u | 965 | 0.28 % | 932.27 |
| pravi | pravi | | 958 | 0.27 % | 925.51 |
| oziroma | oziro | ma | 943 | 0.27 % | 911.02 |
| rekla | rekla | | 931 | 0.27 % | 899.42 |
| mislm | mislm | | 925 | 0.26 % | 893.63 |
| naprej | napre | j | 912 | 0.26 % | 881.07 |
| mislim | misli | m | 909 | 0.26 % | 878.17 |
| sploh | sploh | | 866 | 0.25 % | 836.63 |
| zakaj | zakaj | | 821 | 0.23 % | 793.15 |
| nekaj | nekaj | | 804 | 0.23 % | 776.73 |
| hvala | hvala | | 790 | 0.23 % | 763.21 |
| treba | treba | | 751 | 0.21 % | 725.53 |
| vedno | vedno | | 727 | 0.21 % | 702.34 |
| sicer | sicer | | 710 | 0.20 % | 685.92 |
| tisto | tisto | | 699 | 0.20 % | 675.29 |
| gospod | gospo | d | 648 | 0.18 % | 626.02 |
| boste | boste | | 645 | 0.18 % | 623.12 |
| nisem | nisem | | 622 | 0.18 % | 600.90 |
| tisti | tisti | | 620 | 0.18 % | 598.97 |
| tisoč | tisoč | | 614 | 0.17 % | 593.18 |
| enkrat | enkra | t | 596 | 0.17 % | 575.79 |
| stvari | stvar | i | 588 | 0.17 % | 568.06 |
| strani | stran | i | 567 | 0.16 % | 547.77 |

2.2.26 List of final character-level 1-grams from all lower-case word forms in the GOS 1.0 corpus
File at CLARIN.SI: GOS1.0-word_parts-all-lowercase_forms-final-1grams-entire.tsv

| Lower-case word form | Rest of the word | Final part of the word | Total absolute frequency of lower-case word form | Percentage of all found lower-case word forms | Total relative frequency (per million occurrences) |
|---|---|---|---|---|---|
| je | j | e | 36,885 | 3.56 % | 35,634 |
| ne | n | e | 30,009 | 2.90 % | 28,991.21 |
| pa | p | a | 29,190 | 2.82 % | 28,199.98 |
| ja | j | a | 25,277 | 2.44 % | 24,419.70 |
| eee | ee | e | 23,222 | 2.24 % | 22,434.40 |
| da | d | a | 18,814 | 1.82 % | 18,175.90 |
| to | t | o | 17,228 | 1.66 % | 16,643.69 |
| v | | v | 17,069 | 1.65 % | 16,490.08 |
| se | s | e | 16,330 | 1.58 % | 15,776.15 |
| in | i | n | 15,952 | 1.54 % | 15,410.97 |
| na | n | a | 12,070 | 1.17 % | 11,660.63 |
| so | s | o | 7,892 | 0.76 % | 7,624.33 |
| za | z | a | 7,866 | 0.76 % | 7,599.21 |
| še | š | e | 7,070 | 0.68 % | 6,830.21 |
| bi | b | i | 7,034 | 0.68 % | 6,795.43 |
| a | | a | 6,960 | 0.67 % | 6,723.94 |
| če | č | e | 6,114 | 0.59 % | 5,906.64 |
| kaj | ka | j | 5,825 | 0.56 % | 5,627.44 |
| sem | se | m | 5,167 | 0.50 % | 4,991.75 |
| s | | s | 5,044 | 0.49 % | 4,872.93 |
| z | | z | 4,905 | 0.47 % | 4,738.64 |
| ni | n | i | 4,830 | 0.47 % | 4,666.18 |
| bo | b | o | 4,802 | 0.46 % | 4,639.13 |
| ki | k | i | 4,676 | 0.45 % | 4,517.41 |
| si | s | i | 4,654 | 0.45 % | 4,496.15 |
| no | n | o | 4,647 | 0.45 % | 4,489.39 |
| mhm | mh | m | 4,478 | 0.43 % | 4,326.12 |
| ti | t | i | 4,308 | 0.42 % | 4,161.89 |
| že | ž | e | 4,277 | 0.41 % | 4,131.94 |
| po | p | o | 4,247 | 0.41 % | 4,102.96 |
| tudi | tud | i | 4,120 | 0.40 % | 3,980.26 |
| mi | m | i | 4,026 | 0.39 % | 3,889.45 |

2.2.27 List of final character-level 2-grams from all lower-case word forms in the GOS 1.0 corpus
File at CLARIN.SI: GOS1.0-word_parts-all-lowercase_forms-final-2grams-entire.tsv

| Lower-case word form | Rest of the word | Final part of the word | Total absolute frequency of lower-case word form | Percentage of all found lower-case word forms | Total relative frequency (per million occurrences) |
|---|---|---|---|---|---|
| je | | je | 36,885 | 3.74 % | 35,634 |
| ne | | ne | 30,009 | 3.04 % | 28,991.21 |
| pa | | pa | 29,190 | 2.96 % | 28,199.98 |
| ja | | ja | 25,277 | 2.56 % | 24,419.70 |
| eee | e | ee | 23,222 | 2.35 % | 22,434.40 |
| da | | da | 18,814 | 1.91 % | 18,175.90 |
| to | | to | 17,228 | 1.75 % | 16,643.69 |
| se | | se | 16,330 | 1.65 % | 15,776.15 |
| in | | in | 15,952 | 1.62 % | 15,410.97 |
| na | | na | 12,070 | 1.22 % | 11,660.63 |
| so | | so | 7,892 | 0.80 % | 7,624.33 |
| za | | za | 7,866 | 0.80 % | 7,599.21 |
| še | | še | 7,070 | 0.72 % | 6,830.21 |
| bi | | bi | 7,034 | 0.71 % | 6,795.43 |
| če | | če | 6,114 | 0.62 % | 5,906.64 |
| kaj | k | aj | 5,825 | 0.59 % | 5,627.44 |
| sem | s | em | 5,167 | 0.52 % | 4,991.75 |
| ni | | ni | 4,830 | 0.49 % | 4,666.18 |
| bo | | bo | 4,802 | 0.49 % | 4,639.13 |
| ki | | ki | 4,676 | 0.47 % | 4,517.41 |
| si | | si | 4,654 | 0.47 % | 4,496.15 |
| no | | no | 4,647 | 0.47 % | 4,489.39 |
| mhm | m | hm | 4,478 | 0.45 % | 4,326.12 |
| ti | | ti | 4,308 | 0.44 % | 4,161.89 |
| že | | že | 4,277 | 0.43 % | 4,131.94 |
| po | | po | 4,247 | 0.43 % | 4,102.96 |
| tudi | tu | di | 4,120 | 0.42 % | 3,980.26 |
| mi | | mi | 4,026 | 0.41 % | 3,889.45 |
| smo | s | mo | 3,927 | 0.40 % | 3,793.81 |
| ko | | ko | 3,873 | 0.39 % | 3,741.64 |
| ta | | ta | 3,789 | 0.38 % | 3,660.49 |
| tako | ta | ko | 3,682 | 0.37 % | 3,557.12 |

2.2.28 List of final character-level 3-grams from all lower-case word forms in the GOS 1.0 corpus
File at CLARIN.SI: GOS1.0-word_parts-all-lowercase_forms-final-3grams-entire.tsv

| Lower-case word form | Rest of the word | Final part of the word | Total absolute frequency of lower-case word form | Percentage of all found lower-case word forms | Total relative frequency (per million occurrences) |
|---|---|---|---|---|---|
| eee | | eee | 23,222 | 3.58 % | 22,434.40 |
| kaj | | kaj | 5,825 | 0.90 % | 5,627.44 |
| sem | | sem | 5,167 | 0.80 % | 4,991.75 |
| mhm | | mhm | 4,478 | 0.69 % | 4,326.12 |
| tudi | t | udi | 4,120 | 0.64 % | 3,980.26 |
| smo | | smo | 3,927 | 0.61 % | 3,793.81 |
| tako | t | ako | 3,682 | 0.57 % | 3,557.12 |
| tud | | tud | 3,571 | 0.55 % | 3,449.88 |
| ampak | am | pak | 3,445 | 0.53 % | 3,328.16 |
| sej | | sej | 3,426 | 0.53 % | 3,309.80 |
| vem | | vem | 3,297 | 0.51 % | 3,185.18 |
| ker | | ker | 3,264 | 0.50 % | 3,153.30 |
| tak | | tak | 3,226 | 0.50 % | 3,116.59 |
| zdej | z | dej | 3,038 | 0.47 % | 2,934.96 |
| pol | | pol | 2,961 | 0.46 % | 2,860.57 |
| eem | | eem | 2,950 | 0.46 % | 2,849.95 |
| lahko | la | hko | 2,920 | 0.45 % | 2,820.96 |
| pač | | pač | 2,833 | 0.44 % | 2,736.92 |
| tko | | tko | 2,822 | 0.43 % | 2,726.29 |
| veš | | veš | 2,611 | 0.40 % | 2,522.44 |
| potem | po | tem | 2,556 | 0.39 % | 2,469.31 |
| kot | | kot | 2,538 | 0.39 % | 2,451.92 |
| vse | | vse | 2,520 | 0.39 % | 2,434.53 |
| sam | | sam | 2,352 | 0.36 % | 2,272.23 |
| res | | res | 2,287 | 0.35 % | 2,209.43 |
| blo | | blo | 2,266 | 0.35 % | 2,189.15 |
| tem | | tem | 2,136 | 0.33 % | 2,063.55 |
| samo | s | amo | 2,118 | 0.33 % | 2,046.17 |
| dobro | do | bro | 2,069 | 0.32 % | 1,998.83 |
| aha | | aha | 2,060 | 0.32 % | 1,990.13 |
| bil | | bil | 2,045 | 0.32 % | 1,975.64 |
| kar | | kar | 1,980 | 0.31 % | 1,912.85 |

2.2.29 List of final character-level 4-grams from all lower-case word forms in the GOS 1.0 corpus
File at CLARIN.SI: GOS1.0-word_parts-all-lowercase_forms-final-4grams-entire.tsv

| Lower-case word form | Rest of the word | Final part of the word | Total absolute frequency of lower-case word form | Percentage of all found lower-case word forms | Total relative frequency (per million occurrences) |
|---|---|---|---|---|---|
| tudi | | tudi | 4,120 | 0.89 % | 3,980.26 |
| tako | | tako | 3,682 | 0.80 % | 3,557.12 |
| ampak | a | mpak | 3,445 | 0.75 % | 3,328.16 |
| zdej | | zdej | 3,038 | 0.66 % | 2,934.96 |
| lahko | l | ahko | 2,920 | 0.63 % | 2,820.96 |
| potem | p | otem | 2,556 | 0.55 % | 2,469.31 |
| samo | | samo | 2,118 | 0.46 % | 2,046.17 |
| dobro | d | obro | 2,069 | 0.45 % | 1,998.83 |
| tega | | tega | 1,893 | 0.41 % | 1,828.80 |
| bomo | | bomo | 1,819 | 0.39 % | 1,757.31 |
| kako | | kako | 1,699 | 0.37 % | 1,641.38 |
| torej | t | orej | 1,693 | 0.37 % | 1,635.58 |
| recimo | re | cimo | 1,665 | 0.36 % | 1,608.53 |
| seveda | se | veda | 1,593 | 0.34 % | 1,538.97 |
| malo | | malo | 1,204 | 0.26 % | 1,163.16 |
| zdaj | | zdaj | 1,196 | 0.26 % | 1,155.44 |
| neki | | neki | 1,187 | 0.26 % | 1,146.74 |
| zelo | | zelo | 1,128 | 0.24 % | 1,089.74 |
| zato | | zato | 1,122 | 0.24 % | 1,083.95 |
| mogoče | mo | goče | 1,019 | 0.22 % | 984.44 |
| bolj | | bolj | 1,016 | 0.22 % | 981.54 |
| reku | | reku | 996 | 0.22 % | 962.22 |
| danes | d | anes | 992 | 0.21 % | 958.36 |
| bistvu | bi | stvu | 965 | 0.21 % | 932.27 |
| pravi | p | ravi | 958 | 0.21 % | 925.51 |
| oziroma | ozi | roma | 943 | 0.20 % | 911.02 |
| prav | | prav | 939 | 0.20 % | 907.15 |
| rekla | r | ekla | 931 | 0.20 % | 899.42 |
| mislm | m | islm | 925 | 0.20 % | 893.63 |
| naprej | na | prej | 912 | 0.20 % | 881.07 |
| mislim | mi | slim | 909 | 0.20 % | 878.17 |
| sploh | s | ploh | 866 | 0.19 % | 836.63 |

2.2.30 List of final character-level 5-grams from all lower-case word forms in the GOS 1.0 corpus
File at CLARIN.SI: GOS1.0-word_parts-all-lowercase_forms-final-5grams-entire.tsv

| Lower-case word form | Rest of the word | Final part of the word | Total absolute frequency of lower-case word form | Percentage of all found lower-case word forms | Total relative frequency (per million occurrences) |
|---|---|---|---|---|---|
| ampak | | ampak | 3,445 | 0.98 % | 3,328.16 |
| lahko | | lahko | 2,920 | 0.83 % | 2,820.96 |
| potem | | potem | 2,556 | 0.73 % | 2,469.31 |
| dobro | | dobro | 2,069 | 0.59 % | 1,998.83 |
| torej | | torej | 1,693 | 0.48 % | 1,635.58 |
| recimo | r | ecimo | 1,665 | 0.47 % | 1,608.53 |
| seveda | s | eveda | 1,593 | 0.46 % | 1,538.97 |
| mogoče | m | ogoče | 1,019 | 0.29 % | 984.44 |
| danes | | danes | 992 | 0.28 % | 958.36 |
| bistvu | b | istvu | 965 | 0.28 % | 932.27 |
| pravi | | pravi | 958 | 0.27 % | 925.51 |
| oziroma | oz | iroma | 943 | 0.27 % | 911.02 |
| rekla | | rekla | 931 | 0.27 % | 899.42 |
| mislm | | mislm | 925 | 0.26 % | 893.63 |
| naprej | n | aprej | 912 | 0.26 % | 881.07 |
| mislim | m | islim | 909 | 0.26 % | 878.17 |
| sploh | | sploh | 866 | 0.25 % | 836.63 |
| zakaj | | zakaj | 821 | 0.23 % | 793.15 |
| nekaj | | nekaj | 804 | 0.23 % | 776.73 |
| hvala | | hvala | 790 | 0.23 % | 763.21 |
| treba | | treba | 751 | 0.21 % | 725.53 |
| vedno | | vedno | 727 | 0.21 % | 702.34 |
| sicer | | sicer | 710 | 0.20 % | 685.92 |
| tisto | | tisto | 699 | 0.20 % | 675.29 |
| gospod | g | ospod | 648 | 0.18 % | 626.02 |
| boste | | boste | 645 | 0.18 % | 623.12 |
| nisem | | nisem | 622 | 0.18 % | 600.90 |
| tisti | | tisti | 620 | 0.18 % | 598.97 |
| tisoč | | tisoč | 614 | 0.17 % | 593.18 |
| enkrat | e | nkrat | 596 | 0.17 % | 575.79 |
| stvari | s | tvari | 588 | 0.17 % | 568.06 |
| strani | s | trani | 567 | 0.16 % | 547.77 |

2.2.31 List of initial character-level 1-grams from noun lemmas in the GOS 1.0 corpus with text-type distribution
File at CLARIN.SI: GOS1.0-word_parts-nouns-lemmas-initial-1grams-taxonomy-entire.tsv

| Lemma | Lemma (lower-case) | Initial part of the word | Rest of the word | Total absolute frequency of lemma | Percentage of all found lemmas | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N] |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| pol | pol | p | ol | 4,054 | 2.55 % | 3,916.50 | 463 | 1.19 % | 2,008.48 | 2,414 | 7.15 % | 8,168.04 | 333 | 0.50 % | 942.94 | 844 | 4.27 % | 5,413.97 |
| saj | saj | s | aj | 2,321 | 1.46 % | 2,242.28 | 303 | 0.78 % | 1,314.40 | 1,298 | 3.84 % | 4,391.93 | 258 | 0.39 % | 730.57 | 462 | 2.34 % | 2,963.57 |
| leto | leto | l | eto | 1,828 | 1.15 % | 1,766 | 487 | 1.25 % | 2,112.59 | 440 | 1.30 % | 1,488.79 | 767 | 1.15 % | 2,171.89 | 134 | 0.68 % | 859.56 |
| dan | dan | d | an | 1,383 | 0.87 % | 1,336.09 | 446 | 1.14 % | 1,934.73 | 415 | 1.23 % | 1,404.20 | 349 | 0.53 % | 988.25 | 173 | 0.88 % | 1,109.74 |
| človek | človek | č | lovek | 1,248 | 0.78 % | 1,205.67 | 246 | 0.63 % | 1,067.14 | 221 | 0.65 % | 747.78 | 639 | 0.96 % | 1,809.43 | 142 | 0.72 % | 910.88 |
| čas | čas | č | as | 1,155 | 0.73 % | 1,115.83 | 347 | 0.89 % | 1,505.27 | 127 | 0.38 % | 429.72 | 537 | 0.81 % | 1,520.60 | 144 | 0.73 % | 923.71 |
| misel | misel | m | isel | 1,086 | 0.68 % | 1,049.17 | 178 | 0.46 % | 772.16 | 429 | 1.27 % | 1,451.57 | 217 | 0.33 % | 614.47 | 262 | 1.33 % | 1,680.64 |
| stvar | stvar | s | tvar | 1,056 | 0.66 % | 1,020.18 | 157 | 0.40 % | 681.06 | 190 | 0.56 % | 642.89 | 467 | 0.70 % | 1,322.39 | 242 | 1.23 % | 1,552.35 |
| bistvo | bistvo | b | istvo | 1,016 | 0.64 % | 981.54 | 165 | 0.42 % | 715.76 | 239 | 0.71 % | 808.68 | 248 | 0.37 % | 702.25 | 364 | 1.84 % | 2,334.93 |
| red | red | r | ed | 924 | 0.58 % | 892.66 | 196 | 0.50 % | 850.24 | 250 | 0.74 % | 845.90 | 257 | 0.39 % | 727.74 | 221 | 1.12 % | 1,417.64 |
| ura | ura | u | ra | 905 | 0.57 % | 874.31 | 362 | 0.93 % | 1,570.34 | 258 | 0.76 % | 872.97 | 154 | 0.23 % | 436.08 | 131 | 0.66 % | 840.32 |
| del | del | d | el | 883 | 0.56 % | 853.05 | 189 | 0.48 % | 819.87 | 178 | 0.53 % | 602.28 | 413 | 0.62 % | 1,169.48 | 103 | 0.52 % | 660.71 |
| primer | primer | p | rimer | 862 | 0.54 % | 832.76 | 89 | 0.23 % | 386.08 | 79 | 0.23 % | 267.31 | 483 | 0.73 % | 1,367.69 | 211 | 1.07 % | 1,353.49 |
| stran | stran | s | tran | 839 | 0.53 % | 810.54 | 171 | 0.44 % | 741.79 | 157 | 0.47 % | 531.23 | 384 | 0.58 % | 1,087.36 | 127 | 0.64 % | 814.66 |
| hvala | hvala | h | vala | 797 | 0.50 % | 769.97 | 341 | 0.87 % | 1,479.25 | 60 | 0.18 % | 203.02 | 302 | 0.45 % | 855.16 | 94 | 0.48 % | 602.98 |
| konec | konec | k | onec | 784 | 0.49 % | 757.41 | 205 | 0.53 % | 889.28 | 170 | 0.50 % | 575.21 | 304 | 0.46 % | 860.83 | 105 | 0.53 % | 673.54 |
| gospod | gospod | g | ospod | 713 | 0.45 % | 688.82 | 49 | 0.12 % | 212.56 | 6 | 0.02 % | 20.30 | 639 | 0.96 % | 1,809.43 | 19 | 0.10 % | 121.88 |
| Slovenija | slovenija | S | lovenija | 711 | 0.45 % | 686.89 | 167 | 0.43 % | 724.44 | 46 | 0.14 % | 155.65 | 465 | 0.70 % | 1,316.72 | 33 | 0.17 % | 211.68 |
| voda | voda | v | oda | 690 | 0.43 % | 666.60 | 86 | 0.22 % | 373.06 | 119 | 0.35 % | 402.65 | 387 | 0.58 % | 1,095.85 | 98 | 0.50 % | 628.64 |
| evro | evro | e | vro | 664 | 0.42 % | 641.48 | 169 | 0.43 % | 733.12 | 200 | 0.59 % | 676.72 | 192 | 0.29 % | 543.68 | 103 | 0.52 % | 660.71 |
| minuta | minuta | m | inuta | 662 | 0.42 % | 639.55 | 464 | 1.19 % | 2,012.81 | 113 | 0.34 % | 382.35 | 59 | 0.09 % | 167.07 | 26 | 0.13 % | 166.78 |
| jutro | jutro | j | utro | 618 | 0.39 % | 597.04 | 513 | 1.31 % | 2,225.37 | 20 | 0.06 % | 67.67 | 77 | 0.12 % | 218.04 | 8 | 0.04 % | 51.32 |
| vprašanje | vprašanje | v | prašanje | 597 | 0.38 % | 576.75 | 101 | 0.26 % | 438.13 | 72 | 0.21 % | 243.62 | 347 | 0.52 % | 982.59 | 77 | 0.39 % | 493.93 |
| problem | problem | p | roblem | 583 | 0.37 % | 563.23 | 62 | 0.16 % | 268.95 | 123 | 0.36 % | 416.18 | 272 | 0.41 % | 770.21 | 126 | 0.64 % | 808.25 |
| država | država | d | ržava | 538 | 0.34 % | 519.75 | 44 | 0.11 % | 190.87 | 30 | 0.09 % | 101.51 | 446 | 0.67 % | 1,262.92 | 18 | 0.09 % | 115.46 |
| otrok | otrok | o | trok | 538 | 0.34 % | 519.75 | 115 | 0.29 % | 498.87 | 99 | 0.29 % | 334.98 | 266 | 0.40 % | 753.22 | 58 | 0.29 % | 372.05 |
| mesto | mesto | m | esto | 515 | 0.32 % | 497.53 | 247 | 0.63 % | 1,071.48 | 73 | 0.22 % | 247 | 169 | 0.25 % | 478.55 | 26 | 0.13 % | 166.78 |
| način | način | n | ačin | 491 | 0.31 % | 474.35 | 69 | 0.18 % | 299.32 | 34 | 0.10 % | 115.04 | 283 | 0.42 % | 801.36 | 105 | 0.53 % | 673.54 |
| beseda | beseda | b | eseda | 458 | 0.29 % | 442.47 | 71 | 0.18 % | 308 | 25 | 0.07 % | 84.59 | 336 | 0.51 % | 951.44 | 26 | 0.13 % | 166.78 |
| teden | teden | t | eden | 437 | 0.28 % | 422.18 | 127 | 0.33 % | 550.92 | 167 | 0.49 % | 565.06 | 67 | 0.10 % | 189.72 | 76 | 0.39 % | 487.51 |
| šola | šola | š | ola | 432 | 0.27 % | 417.35 | 59 | 0.15 % | 255.94 | 156 | 0.46 % | 527.84 | 166 | 0.25 % | 470.06 | 51 | 0.26 % | 327.15 |
| zadeva | zadeva | z | adeva | 428 | 0.27 % | 413.48 | 73 | 0.19 % | 316.67 | 52 | 0.15 % | 175.95 | 220 | 0.33 % | 622.97 | 83 | 0.42 % | 532.42 |

2.2.32 List of initial character-level 2-grams from noun lemmas in the GOS 1.0 corpus with text-type distribution
File at CLARIN.SI: GOS1.0-word_parts-nouns-lemmas-initial-2grams-taxonomy-entire.tsv

| Lemma | Lemma (lower-case) | Initial part of the word | Rest of the word | Total absolute frequency of lemma | Percentage of all found lemmas | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N] |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| pol | pol | po | l | 4,054 | 2.55 % | 3,916.50 | 463 | 1.19 % | 2,008.48 | 2,414 | 7.18 % | 8,168.04 | 333 | 0.50 % | 942.94 | 844 | 4.28 % | 5,413.97 |
| saj | saj | sa | j | 2,321 | 1.46 % | 2,242.28 | 303 | 0.78 % | 1,314.40 | 1,298 | 3.86 % | 4,391.93 | 258 | 0.39 % | 730.57 | 462 | 2.35 % | 2,963.57 |
| leto | leto | le | to | 1,828 | 1.15 % | 1,766 | 487 | 1.25 % | 2,112.59 | 440 | 1.31 % | 1,488.79 | 767 | 1.16 % | 2,171.89 | 134 | 0.68 % | 859.56 |
| dan | dan | da | n | 1,383 | 0.87 % | 1,336.09 | 446 | 1.14 % | 1,934.73 | 415 | 1.23 % | 1,404.20 | 349 | 0.53 % | 988.25 | 173 | 0.88 % | 1,109.74 |
| človek | človek | čl | ovek | 1,248 | 0.79 % | 1,205.67 | 246 | 0.63 % | 1,067.14 | 221 | 0.66 % | 747.78 | 639 | 0.96 % | 1,809.43 | 142 | 0.72 % | 910.88 |
| čas | čas | ča | s | 1,155 | 0.73 % | 1,115.83 | 347 | 0.89 % | 1,505.27 | 127 | 0.38 % | 429.72 | 537 | 0.81 % | 1,520.60 | 144 | 0.73 % | 923.71 |
| misel | misel | mi | sel | 1,086 | 0.68 % | 1,049.17 | 178 | 0.46 % | 772.16 | 429 | 1.28 % | 1,451.57 | 217 | 0.33 % | 614.47 | 262 | 1.33 % | 1,680.64 |
| stvar | stvar | st | var | 1,056 | 0.67 % | 1,020.18 | 157 | 0.40 % | 681.06 | 190 | 0.56 % | 642.89 | 467 | 0.70 % | 1,322.39 | 242 | 1.23 % | 1,552.35 |
| bistvo | bistvo | bi | stvo | 1,016 | 0.64 % | 981.54 | 165 | 0.42 % | 715.76 | 239 | 0.71 % | 808.68 | 248 | 0.37 % | 702.25 | 364 | 1.85 % | 2,334.93 |
| red | red | re | d | 924 | 0.58 % | 892.66 | 196 | 0.50 % | 850.24 | 250 | 0.74 % | 845.90 | 257 | 0.39 % | 727.74 | 221 | 1.12 % | 1,417.64 |
| ura | ura | ur | a | 905 | 0.57 % | 874.31 | 362 | 0.93 % | 1,570.34 | 258 | 0.77 % | 872.97 | 154 | 0.23 % | 436.08 | 131 | 0.67 % | 840.32 |
| del | del | de | l | 883 | 0.56 % | 853.05 | 189 | 0.48 % | 819.87 | 178 | 0.53 % | 602.28 | 413 | 0.62 % | 1,169.48 | 103 | 0.52 % | 660.71 |
| primer | primer | pr | imer | 862 | 0.54 % | 832.76 | 89 | 0.23 % | 386.08 | 79 | 0.23 % | 267.31 | 483 | 0.73 % | 1,367.69 | 211 | 1.07 % | 1,353.49 |
| stran | stran | st | ran | 839 | 0.53 % | 810.54 | 171 | 0.44 % | 741.79 | 157 | 0.47 % | 531.23 | 384 | 0.58 % | 1,087.36 | 127 | 0.65 % | 814.66 |
| hvala | hvala | hv | ala | 797 | 0.50 % | 769.97 | 341 | 0.87 % | 1,479.25 | 60 | 0.18 % | 203.02 | 302 | 0.46 % | 855.16 | 94 | 0.48 % | 602.98 |
| konec | konec | ko | nec | 784 | 0.49 % | 757.41 | 205 | 0.53 % | 889.28 | 170 | 0.51 % | 575.21 | 304 | 0.46 % | 860.83 | 105 | 0.53 % | 673.54 |
| gospod | gospod | go | spod | 713 | 0.45 % | 688.82 | 49 | 0.13 % | 212.56 | 6 | 0.02 % | 20.30 | 639 | 0.96 % | 1,809.43 | 19 | 0.10 % | 121.88 |
| Slovenija | slovenija | Sl | ovenija | 711 | 0.45 % | 686.89 | 167 | 0.43 % | 724.44 | 46 | 0.14 % | 155.65 | 465 | 0.70 % | 1,316.72 | 33 | 0.17 % | 211.68 |
| voda | voda | vo | da | 690 | 0.43 % | 666.60 | 86 | 0.22 % | 373.06 | 119 | 0.35 % | 402.65 | 387 | 0.58 % | 1,095.85 | 98 | 0.50 % | 628.64 |
| evro | evro | ev | ro | 664 | 0.42 % | 641.48 | 169 | 0.43 % | 733.12 | 200 | 0.59 % | 676.72 | 192 | 0.29 % | 543.68 | 103 | 0.52 % | 660.71 |
| minuta | minuta | mi | nuta | 662 | 0.42 % | 639.55 | 464 | 1.19 % | 2,012.81 | 113 | 0.34 % | 382.35 | 59 | 0.09 % | 167.07 | 26 | 0.13 % | 166.78 |
| jutro | jutro | ju | tro | 618 | 0.39 % | 597.04 | 513 | 1.31 % | 2,225.37 | 20 | 0.06 % | 67.67 | 77 | 0.12 % | 218.04 | 8 | 0.04 % | 51.32 |
| vprašanje | vprašanje | vp | rašanje | 597 | 0.38 % | 576.75 | 101 | 0.26 % | 438.13 | 72 | 0.21 % | 243.62 | 347 | 0.52 % | 982.59 | 77 | 0.39 % | 493.93 |
| problem | problem | pr | oblem | 583 | 0.37 % | 563.23 | 62 | 0.16 % | 268.95 | 123 | 0.37 % | 416.18 | 272 | 0.41 % | 770.21 | 126 | 0.64 % | 808.25 |
| država | država | dr | žava | 538 | 0.34 % | 519.75 | 44 | 0.11 % | 190.87 | 30 | 0.09 % | 101.51 | 446 | 0.67 % | 1,262.92 | 18 | 0.09 % | 115.46 |
| otrok | otrok | ot | rok | 538 | 0.34 % | 519.75 | 115 | 0.29 % | 498.87 | 99 | 0.29 % | 334.98 | 266 | 0.40 % | 753.22 | 58 | 0.29 % | 372.05 |
| mesto | mesto | me | sto | 515 | 0.32 % | 497.53 | 247 | 0.63 % | 1,071.48 | 73 | 0.22 % | 247 | 169 | 0.26 % | 478.55 | 26 | 0.13 % | 166.78 |
| način | način | na | čin | 491 | 0.31 % | 474.35 | 69 | 0.18 % | 299.32 | 34 | 0.10 % | 115.04 | 283 | 0.43 % | 801.36 | 105 | 0.53 % | 673.54 |
| beseda | beseda | be | seda | 458 | 0.29 % | 442.47 | 71 | 0.18 % | 308 | 25 | 0.07 % | 84.59 | 336 | 0.51 % | 951.44 | 26 | 0.13 % | 166.78 |
| teden | teden | te | den | 437 | 0.28 % | 422.18 | 127 | 0.33 % | 550.92 | 167 | 0.50 % | 565.06 | 67 | 0.10 % | 189.72 | 76 | 0.39 % | 487.51 |
| šola | šola | šo | la | 432 | 0.27 % | 417.35 | 59 | 0.15 % | 255.94 | 156 | 0.46 % | 527.84 | 166 | 0.25 % | 470.06 | 51 | 0.26 % | 327.15 |
| zadeva | zadeva | za | deva | 428 | 0.27 % | 413.48 | 73 | 0.19 % | 316.67 | 52 | 0.15 % | 175.95 | 220 | 0.33 % | 622.97 | 83 | 0.42 % | 532.42 |

2.2.33 List of initial character-level 3-grams from noun lemmas in the GOS 1.0 corpus with text-type distribution
File at CLARIN.SI: GOS1.0-word_parts-nouns-lemmas-initial-3grams-taxonomy-entire.tsv

| Lemma | Lemma (lower-case) | Initial part of the word | Rest of the word | Total absolute frequency of lemma | Percentage of all found lemmas | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N] |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| pol | pol | pol | | 4,054 | 2.56 % | 3,916.50 | 463 | 1.19 % | 2,008.48 | 2,414 | 7.23 % | 8,168.04 | 333 | 0.50 % | 942.94 | 844 | 4.31 % | 5,413.97 |
| saj | saj | saj | | 2,321 | 1.47 % | 2,242.28 | 303 | 0.78 % | 1,314.40 | 1,298 | 3.89 % | 4,391.93 | 258 | 0.39 % | 730.57 | 462 | 2.36 % | 2,963.57 |
| leto | leto | let | o | 1,828 | 1.16 % | 1,766 | 487 | 1.25 % | 2,112.59 | 440 | 1.32 % | 1,488.79 | 767 | 1.16 % | 2,171.89 | 134 | 0.68 % | 859.56 |
| dan | dan | dan | | 1,383 | 0.88 % | 1,336.09 | 446 | 1.15 % | 1,934.73 | 415 | 1.24 % | 1,404.20 | 349 | 0.53 % | 988.25 | 173 | 0.88 % | 1,109.74 |
| človek | človek | člo | vek | 1,248 | 0.79 % | 1,205.67 | 246 | 0.63 % | 1,067.14 | 221 | 0.66 % | 747.78 | 639 | 0.96 % | 1,809.43 | 142 | 0.72 % | 910.88 |
| čas | čas | čas | | 1,155 | 0.73 % | 1,115.83 | 347 | 0.89 % | 1,505.27 | 127 | 0.38 % | 429.72 | 537 | 0.81 % | 1,520.60 | 144 | 0.73 % | 923.71 |
| misel | misel | mis | el | 1,086 | 0.69 % | 1,049.17 | 178 | 0.46 % | 772.16 | 429 | 1.28 % | 1,451.57 | 217 | 0.33 % | 614.47 | 262 | 1.34 % | 1,680.64 |
| stvar | stvar | stv | ar | 1,056 | 0.67 % | 1,020.18 | 157 | 0.40 % | 681.06 | 190 | 0.57 % | 642.89 | 467 | 0.70 % | 1,322.39 | 242 | 1.24 % | 1,552.35 |
| bistvo | bistvo | bis | tvo | 1,016 | 0.64 % | 981.54 | 165 | 0.42 % | 715.76 | 239 | 0.72 % | 808.68 | 248 | 0.37 % | 702.25 | 364 | 1.86 % | 2,334.93 |
| red | red | red | | 924 | 0.58 % | 892.66 | 196 | 0.50 % | 850.24 | 250 | 0.75 % | 845.90 | 257 | 0.39 % | 727.74 | 221 | 1.13 % | 1,417.64 |
| ura | ura | ura | | 905 | 0.57 % | 874.31 | 362 | 0.93 % | 1,570.34 | 258 | 0.77 % | 872.97 | 154 | 0.23 % | 436.08 | 131 | 0.67 % | 840.32 |
| del | del | del | | 883 | 0.56 % | 853.05 | 189 | 0.49 % | 819.87 | 178 | 0.53 % | 602.28 | 413 | 0.62 % | 1,169.48 | 103 | 0.53 % | 660.71 |
| primer | primer | pri | mer | 862 | 0.55 % | 832.76 | 89 | 0.23 % | 386.08 | 79 | 0.24 % | 267.31 | 483 | 0.73 % | 1,367.69 | 211 | 1.08 % | 1,353.49 |
| stran | stran | str | an | 839 | 0.53 % | 810.54 | 171 | 0.44 % | 741.79 | 157 | 0.47 % | 531.23 | 384 | 0.58 % | 1,087.36 | 127 | 0.65 % | 814.66 |
| hvala | hvala | hva | la | 797 | 0.50 % | 769.97 | 341 | 0.88 % | 1,479.25 | 60 | 0.18 % | 203.02 | 302 | 0.46 % | 855.16 | 94 | 0.48 % | 602.98 |
| konec | konec | kon | ec | 784 | 0.50 % | 757.41 | 205 | 0.53 % | 889.28 | 170 | 0.51 % | 575.21 | 304 | 0.46 % | 860.83 | 105 | 0.54 % | 673.54 |
| gospod | gospod | gos | pod | 713 | 0.45 % | 688.82 | 49 | 0.13 % | 212.56 | 6 | 0.02 % | 20.30 | 639 | 0.96 % | 1,809.43 | 19 | 0.10 % | 121.88 |
| Slovenija | slovenija | Slo | venija | 711 | 0.45 % | 686.89 | 167 | 0.43 % | 724.44 | 46 | 0.14 % | 155.65 | 465 | 0.70 % | 1,316.72 | 33 | 0.17 % | 211.68 |
| voda | voda | vod | a | 690 | 0.44 % | 666.60 | 86 | 0.22 % | 373.06 | 119 | 0.36 % | 402.65 | 387 | 0.58 % | 1,095.85 | 98 | 0.50 % | 628.64 |
| evro | evro | evr | o | 664 | 0.42 % | 641.48 | 169 | 0.43 % | 733.12 | 200 | 0.60 % | 676.72 | 192 | 0.29 % | 543.68 | 103 | 0.53 % | 660.71 |
| minuta | minuta | min | uta | 662 | 0.42 % | 639.55 | 464 | 1.19 % | 2,012.81 | 113 | 0.34 % | 382.35 | 59 | 0.09 % | 167.07 | 26 | 0.13 % | 166.78 |
| jutro | jutro | jut | ro | 618 | 0.39 % | 597.04 | 513 | 1.32 % | 2,225.37 | 20 | 0.06 % | 67.67 | 77 | 0.12 % | 218.04 | 8 | 0.04 % | 51.32 |
| vprašanje | vprašanje | vpr | ašanje | 597 | 0.38 % | 576.75 | 101 | 0.26 % | 438.13 | 72 | 0.22 % | 243.62 | 347 | 0.52 % | 982.59 | 77 | 0.39 % | 493.93 |
| problem | problem | pro | blem | 583 | 0.37 % | 563.23 | 62 | 0.16 % | 268.95 | 123 | 0.37 % | 416.18 | 272 | 0.41 % | 770.21 | 126 | 0.64 % | 808.25 |
| država | država | drž | ava | 538 | 0.34 % | 519.75 | 44 | 0.11 % | 190.87 | 30 | 0.09 % | 101.51 | 446 | 0.67 % | 1,262.92 | 18 | 0.09 % | 115.46 |
| otrok | otrok | otr | ok | 538 | 0.34 % | 519.75 | 115 | 0.30 % | 498.87 | 99 | 0.30 % | 334.98 | 266 | 0.40 % | 753.22 | 58 | 0.30 % | 372.05 |
| mesto | mesto | mes | to | 515 | 0.33 % | 497.53 | 247 | 0.64 % | 1,071.48 | 73 | 0.22 % | 247 | 169 | 0.26 % | 478.55 | 26 | 0.13 % | 166.78 |
| način | način | nač | in | 491 | 0.31 % | 474.35 | 69 | 0.18 % | 299.32 | 34 | 0.10 % | 115.04 | 283 | 0.43 % | 801.36 | 105 | 0.54 % | 673.54 |
| beseda | beseda | bes | eda | 458 | 0.29 % | 442.47 | 71 | 0.18 % | 308 | 25 | 0.07 % | 84.59 | 336 | 0.51 % | 951.44 | 26 | 0.13 % | 166.78 |
| teden | teden | ted | en | 437 | 0.28 % | 422.18 | 127 | 0.33 % | 550.92 | 167 | 0.50 % | 565.06 | 67 | 0.10 % | 189.72 | 76 | 0.39 % | 487.51 |
| šola | šola | šol | a | 432 | 0.27 % | 417.35 | 59 | 0.15 % | 255.94 | 156 | 0.47 % | 527.84 | 166 | 0.25 % | 470.06 | 51 | 0.26 % | 327.15 |
| zadeva | zadeva | zad | eva | 428 | 0.27 % | 413.48 | 73 | 0.19 % | 316.67 | 52 | 0.16 % | 175.95 | 220 | 0.33 % | 622.97 | 83 | 0.42 % | 532.42 |

2.2.34 List of initial character-level 4-grams from noun lemmas in the GOS 1.0 corpus with text-type distribution
File at CLARIN.SI: GOS1.0-word_parts-nouns-lemmas-initial-4grams-taxonomy-entire.tsv

| Lemma | Lemma (lower-case) | Initial part of the word | Rest of the word | Total absolute frequency of lemma | Percentage of all found lemmas | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N] |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| leto | leto | leto | | 1,828 | 1.33 % | 1,766 | 487 | 1.43 % | 2,112.59 | 440 | 1.69 % | 1,488.79 | 767 | 1.26 % | 2,171.89 | 134 | 0.81 % | 859.56 |
| človek | človek | člov | ek | 1,248 | 0.91 % | 1,205.67 | 246 | 0.72 % | 1,067.14 | 221 | 0.85 % | 747.78 | 639 | 1.05 % | 1,809.43 | 142 | 0.86 % | 910.88 |
| misel | misel | mise | l | 1,086 | 0.79 % | 1,049.17 | 178 | 0.52 % | 772.16 | 429 | 1.65 % | 1,451.57 | 217 | 0.35 % | 614.47 | 262 | 1.59 % | 1,680.64 |
| stvar | stvar | stva | r | 1,056 | 0.77 % | 1,020.18 | 157 | 0.46 % | 681.06 | 190 | 0.73 % | 642.89 | 467 | 0.77 % | 1,322.39 | 242 | 1.47 % | 1,552.35 |
| bistvo | bistvo | bist | vo | 1,016 | 0.74 % | 981.54 | 165 | 0.48 % | 715.76 | 239 | 0.92 % | 808.68 | 248 | 0.41 % | 702.25 | 364 | 2.21 % | 2,334.93 |
| primer | primer | prim | er | 862 | 0.63 % | 832.76 | 89 | 0.26 % | 386.08 | 79 | 0.30 % | 267.31 | 483 | 0.79 % | 1,367.69 | 211 | 1.28 % | 1,353.49 |
| stran | stran | stra | n | 839 | 0.61 % | 810.54 | 171 | 0.50 % | 741.79 | 157 | 0.60 % | 531.23 | 384 | 0.63 % | 1,087.36 | 127 | 0.77 % | 814.66 |
| hvala | hvala | hval | a | 797 | 0.58 % | 769.97 | 341 | 1.00 % | 1,479.25 | 60 | 0.23 % | 203.02 | 302 | 0.49 % | 855.16 | 94 | 0.57 % | 602.98 |
| konec | konec | kone | c | 784 | 0.57 % | 757.41 | 205 | 0.60 % | 889.28 | 170 | 0.65 % | 575.21 | 304 | 0.50 % | 860.83 | 105 | 0.64 % | 673.54 |
| gospod | gospod | gosp | od | 713 | 0.52 % | 688.82 | 49 | 0.14 % | 212.56 | 6 | 0.02 % | 20.30 | 639 | 1.05 % | 1,809.43 | 19 | 0.12 % | 121.88 |
| Slovenija | slovenija | Slov | enija | 711 | 0.52 % | 686.89 | 167 | 0.49 % | 724.44 | 46 | 0.18 % | 155.65 | 465 | 0.76 % | 1,316.72 | 33 | 0.20 % | 211.68 |
| voda | voda | voda | | 690 | 0.50 % | 666.60 | 86 | 0.25 % | 373.06 | 119 | 0.46 % | 402.65 | 387 | 0.63 % | 1,095.85 | 98 | 0.59 % | 628.64 |
| evro | evro | evro | | 664 | 0.48 % | 641.48 | 169 | 0.50 % | 733.12 | 200 | 0.77 % | 676.72 | 192 | 0.32 % | 543.68 | 103 | 0.62 % | 660.71 |
| minuta | minuta | minu | ta | 662 | 0.48 % | 639.55 | 464 | 1.36 % | 2,012.81 | 113 | 0.43 % | 382.35 | 59 | 0.10 % | 167.07 | 26 | 0.16 % | 166.78 |
| jutro | jutro | jutr | o | 618 | 0.45 % | 597.04 | 513 | 1.50 % | 2,225.37 | 20 | 0.08 % | 67.67 | 77 | 0.13 % | 218.04 | 8 | 0.05 % | 51.32 |
| vprašanje | vprašanje | vpra | šanje | 597 | 0.43 % | 576.75 | 101 | 0.30 % | 438.13 | 72 | 0.28 % | 243.62 | 347 | 0.57 % | 982.59 | 77 | 0.47 % | 493.93 |
| problem | problem | prob | lem | 583 | 0.42 % | 563.23 | 62 | 0.18 % | 268.95 | 123 | 0.47 % | 416.18 | 272 | 0.45 % | 770.21 | 126 | 0.77 % | 808.25 |
| država | država | drža | va | 538 | 0.39 % | 519.75 | 44 | 0.13 % | 190.87 | 30 | 0.12 % | 101.51 | 446 | 0.73 % | 1,262.92 | 18 | 0.11 % | 115.46 |
| otrok | otrok | otro | k | 538 | 0.39 % | 519.75 | 115 | 0.34 % | 498.87 | 99 | 0.38 % | 334.98 | 266 | 0.44 % | 753.22 | 58 | 0.35 % | 372.05 |
| mesto | mesto | mest | o | 515 | 0.37 % | 497.53 | 247 | 0.72 % | 1,071.48 | 73 | 0.28 % | 247 | 169 | 0.28 % | 478.55 | 26 | 0.16 % | 166.78 |
| način | način | nači | n | 491 | 0.36 % | 474.35 | 69 | 0.20 % | 299.32 | 34 | 0.13 % | 115.04 | 283 | 0.46 % | 801.36 | 105 | 0.64 % | 673.54 |
| beseda | beseda | bese | da | 458 | 0.33 % | 442.47 | 71 | 0.21 % | 308 | 25 | 0.10 % | 84.59 | 336 | 0.55 % | 951.44 | 26 | 0.16 % | 166.78 |
| teden | teden | tede | n | 437 | 0.32 % | 422.18 | 127 | 0.37 % | 550.92 | 167 | 0.64 % | 565.06 | 67 | 0.11 % | 189.72 | 76 | 0.46 % | 487.51 |
| šola | šola | šola | | 432 | 0.31 % | 417.35 | 59 | 0.17 % | 255.94 | 156 | 0.60 % | 527.84 | 166 | 0.27 % | 470.06 | 51 | 0.31 % | 327.15 |
| zadeva | zadeva | zade | va | 428 | 0.31 % | 413.48 | 73 | 0.21 % | 316.67 | 52 | 0.20 % | 175.95 | 220 | 0.36 % | 622.97 | 83 | 0.50 % | 532.42 |
| delo | delo | delo | | 425 | 0.31 % | 410.59 | 63 | 0.18 % | 273.29 | 68 | 0.26 % | 230.09 | 238 | 0.39 % | 673.94 | 56 | 0.34 % | 359.22 |
| svet | svet | svet | | 424 | 0.31 % | 409.62 | 101 | 0.30 % | 438.13 | 34 | 0.13 % | 115.04 | 271 | 0.44 % | 767.38 | 18 | 0.11 % | 115.46 |
| vlada | vlada | vlad | a | 416 | 0.30 % | 401.89 | 15 | 0.04 % | 65.07 | 1 | 0 % | 3.38 | 399 | 0.65 % | 1,129.83 | 1 | 0.01 % | 6.41 |
| denar | denar | dena | r | 406 | 0.29 % | 392.23 | 55 | 0.16 % | 238.59 | 119 | 0.46 % | 402.65 | 201 | 0.33 % | 569.16 | 31 | 0.19 % | 198.85 |
| kola | kola | kola | | 363 | 0.26 % | 350.69 | 63 | 0.18 % | 273.29 | 85 | 0.33 % | 287.61 | 114 | 0.19 % | 322.81 | 101 | 0.61 % | 647.88 |
| skupina | skupina | skup | ina | 358 | 0.26 % | 345.86 | 81 | 0.24 % | 351.37 | 28 | 0.11 % | 94.74 | 203 | 0.33 % | 574.83 | 46 | 0.28 % | 295.07 |
| reklo | reklo | rekl | o | 356 | 0.26 % | 343.93 | 52 | 0.15 % | 225.57 | 182 | 0.70 % | 615.82 | 75 | 0.12 % | 212.37 | 47 | 0.28 % | 301.49 |

2.2.35 List of initial character-level 5-grams from noun lemmas in the GOS 1.0 corpus with text-type distribution
File at CLARIN.SI: GOS1.0-word_parts-nouns-lemmas-initial-5grams-taxonomy-entire.tsv

Lemma Lemma (lower-case) Initial part of the word Rest of the word Total absolute frequency of lemma Percentage of all found lemmas Total relative frequency (per million occurrences) Absolute frequency [gos.T.J.R] Percentage [gos.T.J.R] Relative frequency [gos.T.J.R] Absolute
frequency [gos.T.N.Z] Percentage [gos.T.N.Z] Relative frequency [gos.T.N.Z] Absolute frequency [gos.T.J.I] Percentage [gos.T.J.I] Relative frequency [gos.T.J.I] Absolute frequency [gos.T.N.N] Percentage [gos.T.N.N] Relative frequency [gos.T.N.N] človek človek člove k 1,248 1.09 % 1,205.67 246 0.88 % 1,067.14 221 1.14 % 747.78 639 1.20 % 1,809.43 142 1.03 % 910.88 misel misel misel 1,086 0.95 % 1,049.17 178 0.63 % 772.16 429 2.21 % 1,451.57 217 0.41 % 614.47 262 1.91 % 1,680.64 stvar stvar stvar 1,056 0.92 % 1,020.18 157 0.56 % 681.06 190 0.98 % 642.89 467 0.88 % 1,322.39 242 1.76 % 1,552.35 bistvo bistvo bistv o 1,016 0.89 % 981.54 165 0.59 % 715.76 239 1.23 % 808.68 248 0.47 % 702.25 364 2.65 % 2,334.93 primer primer prime r 862 0.75 % 832.76 89 0.32 % 386.08 79 0.41 % 267.31 483 0.91 % 1,367.69 211 1.53 % 1,353.49 stran stran stran 839 0.73 % 810.54 171 0.61 % 741.79 157 0.81 % 531.23 384 0.72 % 1,087.36 127 0.92 % 814.66 hvala hvala hvala 797 0.70 % 769.97 341 1.21 % 1,479.25 60 0.31 % 203.02 302 0.57 % 855.16 94 0.68 % 602.98 konec konec konec 784 0.69 % 757.41 205 0.73 % 889.28 170 0.87 % 575.21 304 0.57 % 860.83 105 0.76 % 673.54 gospod gospod gospo d 713 0.62 % 688.82 49 0.17 % 212.56 6 0.03 % 20.30 639 1.20 % 1,809.43 19 0.14 % 121.88 Slovenija slovenija Slove nija 711 0.62 % 686.89 167 0.59 % 724.44 46 0.24 % 155.65 465 0.88 % 1,316.72 33 0.24 % 211.68 minuta minuta minut a 662 0.58 % 639.55 464 1.65 % 2,012.81 113 0.58 % 382.35 59 0.11 % 167.07 26 0.19 % 166.78 jutro jutro jutro 618 0.54 % 597.04 513 1.82 % 2,225.37 20 0.10 % 67.67 77 0.14 % 218.04 8 0.06 % 51.32 vprašanje vprašanje vpraš anje 597 0.52 % 576.75 101 0.36 % 438.13 72 0.37 % 243.62 347 0.65 % 982.59 77 0.56 % 493.93 problem problem probl em 583 0.51 % 563.23 62 0.22 % 268.95 123 0.63 % 416.18 272 0.51 % 770.21 126 0.92 % 808.25 država država držav a 538 0.47 % 519.75 44 0.16 % 190.87 30 0.15 % 101.51 446 0.84 % 1,262.92 18 0.13 % 115.46 otrok otrok otrok 538 0.47 % 519.75 115 0.41 % 498.87 
99 0.51 % 334.98 266 0.50 % 753.22 58 0.42 % 372.05 mesto mesto mesto 515 0.45 % 497.53 247 0.88 % 1,071.48 73 0.38 % 247 169 0.32 % 478.55 26 0.19 % 166.78 način način način 491 0.43 % 474.35 69 0.24 % 299.32 34 0.17 % 115.04 283 0.53 % 801.36 105 0.76 % 673.54 beseda beseda besed a 458 0.40 % 442.47 71 0.25 % 308 25 0.13 % 84.59 336 0.63 % 951.44 26 0.19 % 166.78 teden teden teden 437 0.38 % 422.18 127 0.45 % 550.92 167 0.86 % 565.06 67 0.13 % 189.72 76 0.55 % 487.51 zadeva zadeva zadev a 428 0.37 % 413.48 73 0.26 % 316.67 52 0.27 % 175.95 220 0.41 % 622.97 83 0.60 % 532.42 vlada vlada vlada 416 0.36 % 401.89 15 0.05 % 65.07 1 0.01 % 3.38 399 0.75 % 1,129.83 1 0.01 % 6.41 denar denar denar 406 0.35 % 392.23 55 0.20 % 238.59 119 0.61 % 402.65 201 0.38 % 569.16 31 0.23 % 198.85 skupina skupina skupi na 358 0.31 % 345.86 81 0.29 % 351.37 28 0.14 % 94.74 203 0.38 % 574.83 46 0.34 % 295.07 reklo reklo reklo 356 0.31 % 343.93 52 0.18 % 225.57 182 0.94 % 615.82 75 0.14 % 212.37 47 0.34 % 301.49 cesta cesta cesta 332 0.29 % 320.74 114 0.41 % 494.53 66 0.34 % 223.32 132 0.25 % 373.78 20 0.15 % 128.29 začetek začetek začet ek 331 0.29 % 319.77 95 0.34 % 412.11 64 0.33 % 216.55 128 0.24 % 362.45 44 0.32 % 282.24 življenje življenje življ enje 330 0.29 % 318.81 73 0.26 % 316.67 51 0.26 % 172.56 183 0.34 % 518.19 23 0.17 % 147.54 točka točka točka 322 0.28 % 311.08 56 0.20 % 242.93 22 0.11 % 74.44 206 0.39 % 583.32 38 0.28 % 243.76 mesec mesec mesec 316 0.28 % 305.28 55 0.20 % 238.59 101 0.52 % 341.74 116 0.22 % 328.47 44 0.32 % 282.24 vrsta vrsta vrsta 306 0.27 % 295.62 45 0.16 % 195.21 26 0.13 % 87.97 208 0.39 % 588.99 27 0.20 % 173.20 radio radio radio 299 0.26 % 288.86 257 0.91 % 1,114.86 9 0.05 % 30.45 27 0.05 % 76.45 6 0.04 % 38.49 CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 379 File at CLARIN.SI 2.2.36 List of final character-level 1-grams from noun lemmas in the GOS 1.0 corpus with text-type 
distributionGOS1.0-word_parts-nouns-lemmas- final-1grams-taxonomy-entire.tsvLemma Lemma (lower-case) Rest of the word Final part of the word Total absolute frequen- cy of lemma Percentage of all found lemmas Total relative frequency (per million occur - rences) Absolute frequency [gos.T.J.R] Percentage [gos.T.J.R] Relative frequency [gos.T.J.R] Absolute frequency [gos.T.N.Z] Percentage [gos.T.N.Z] Relative frequency [gos.T.N.Z] Absolute frequency [gos.T.J.I] Percentage [gos.T.J.I] Relative frequency [gos.T.J.I] Absolute frequency [gos.T.N.N] Percentage [gos.T.N.N] Relative frequency [gos.T.N.N] pol pol po l 4,054 2.55 % 3,916.50 463 1.19 % 2,008.48 2,414 7.15 % 8,168.04 333 0.50 % 942.94 844 4.27 % 5,413.97 saj saj sa j 2,321 1.46 % 2,242.28 303 0.78 % 1,314.40 1,298 3.84 % 4,391.93 258 0.39 % 730.57 462 2.34 % 2,963.57 leto leto let o 1,828 1.15 % 1,766 487 1.25 % 2,112.59 440 1.30 % 1,488.79 767 1.15 % 2,171.89 134 0.68 % 859.56 dan dan da n 1,383 0.87 % 1,336.09 446 1.14 % 1,934.73 415 1.23 % 1,404.20 349 0.53 % 988.25 173 0.88 % 1,109.74 človek človek člove k 1,248 0.78 % 1,205.67 246 0.63 % 1,067.14 221 0.65 % 747.78 639 0.96 % 1,809.43 142 0.72 % 910.88 čas čas ča s 1,155 0.73 % 1,115.83 347 0.89 % 1,505.27 127 0.38 % 429.72 537 0.81 % 1,520.60 144 0.73 % 923.71 misel misel mise l 1,086 0.68 % 1,049.17 178 0.46 % 772.16 429 1.27 % 1,451.57 217 0.33 % 614.47 262 1.33 % 1,680.64 stvar stvar stva r 1,056 0.66 % 1,020.18 157 0.40 % 681.06 190 0.56 % 642.89 467 0.70 % 1,322.39 242 1.23 % 1,552.35 bistvo bistvo bistv o 1,016 0.64 % 981.54 165 0.42 % 715.76 239 0.71 % 808.68 248 0.37 % 702.25 364 1.84 % 2,334.93 red red re d 924 0.58 % 892.66 196 0.50 % 850.24 250 0.74 % 845.90 257 0.39 % 727.74 221 1.12 % 1,417.64 ura ura ur a 905 0.57 % 874.31 362 0.93 % 1,570.34 258 0.76 % 872.97 154 0.23 % 436.08 131 0.66 % 840.32 del del de l 883 0.56 % 853.05 189 0.48 % 819.87 178 0.53 % 602.28 413 0.62 % 1,169.48 103 0.52 % 660.71 primer primer prime r 862 0.54 % 832.76 89 
0.23 % 386.08 79 0.23 % 267.31 483 0.73 % 1,367.69 211 1.07 % 1,353.49 stran stran stra n 839 0.53 % 810.54 171 0.44 % 741.79 157 0.47 % 531.23 384 0.58 % 1,087.36 127 0.64 % 814.66 hvala hvala hval a 797 0.50 % 769.97 341 0.87 % 1,479.25 60 0.18 % 203.02 302 0.45 % 855.16 94 0.48 % 602.98 konec konec kone c 784 0.49 % 757.41 205 0.53 % 889.28 170 0.50 % 575.21 304 0.46 % 860.83 105 0.53 % 673.54 gospod gospod gospo d 713 0.45 % 688.82 49 0.12 % 212.56 6 0.02 % 20.30 639 0.96 % 1,809.43 19 0.10 % 121.88 Slovenija slovenija Slovenij a 711 0.45 % 686.89 167 0.43 % 724.44 46 0.14 % 155.65 465 0.70 % 1,316.72 33 0.17 % 211.68 voda voda vod a 690 0.43 % 666.60 86 0.22 % 373.06 119 0.35 % 402.65 387 0.58 % 1,095.85 98 0.50 % 628.64 evro evro evr o 664 0.42 % 641.48 169 0.43 % 733.12 200 0.59 % 676.72 192 0.29 % 543.68 103 0.52 % 660.71 minuta minuta minut a 662 0.42 % 639.55 464 1.19 % 2,012.81 113 0.34 % 382.35 59 0.09 % 167.07 26 0.13 % 166.78 jutro jutro jutr o 618 0.39 % 597.04 513 1.31 % 2,225.37 20 0.06 % 67.67 77 0.12 % 218.04 8 0.04 % 51.32 vprašanje vprašanje vprašanj e 597 0.38 % 576.75 101 0.26 % 438.13 72 0.21 % 243.62 347 0.52 % 982.59 77 0.39 % 493.93 problem problem proble m 583 0.37 % 563.23 62 0.16 % 268.95 123 0.36 % 416.18 272 0.41 % 770.21 126 0.64 % 808.25 država država držav a 538 0.34 % 519.75 44 0.11 % 190.87 30 0.09 % 101.51 446 0.67 % 1,262.92 18 0.09 % 115.46 otrok otrok otro k 538 0.34 % 519.75 115 0.29 % 498.87 99 0.29 % 334.98 266 0.40 % 753.22 58 0.29 % 372.05 mesto mesto mest o 515 0.32 % 497.53 247 0.63 % 1,071.48 73 0.22 % 247 169 0.25 % 478.55 26 0.13 % 166.78 način način nači n 491 0.31 % 474.35 69 0.18 % 299.32 34 0.10 % 115.04 283 0.42 % 801.36 105 0.53 % 673.54 beseda beseda besed a 458 0.29 % 442.47 71 0.18 % 308 25 0.07 % 84.59 336 0.51 % 951.44 26 0.13 % 166.78 teden teden tede n 437 0.28 % 422.18 127 0.33 % 550.92 167 0.49 % 565.06 67 0.10 % 189.72 76 0.39 % 487.51 šola šola šol a 432 0.27 % 417.35 59 0.15 % 255.94 156 0.46 % 
527.84 166 0.25 % 470.06 51 0.26 % 327.15 zadeva zadeva zadev a 428 0.27 % 413.48 73 0.19 % 316.67 52 0.15 % 175.95 220 0.33 % 622.97 83 0.42 % 532.42 CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 380 File at CLARIN.SI 2.2.37 List of final character-level 2-grams from noun lemmas in the GOS 1.0 corpus with text-type distributionGOS1.0-word_parts-nouns-lemmas- final-2grams-taxonomy-entire.tsvLemma Lemma (lower-case) Rest of the word Final part of the word Total absolute frequen- cy of lemma Percentage of all found lemmas Total relative frequency (per million occur - rences) Absolute frequency [gos.T.J.R] Percentage [gos.T.J.R] Relative frequency [gos.T.J.R] Absolute frequency [gos.T.N.Z] Percentage [gos.T.N.Z] Relative frequency [gos.T.N.Z] Absolute frequency [gos.T.J.I] Percentage [gos.T.J.I] Relative frequency [gos.T.J.I] Absolute frequency [gos.T.N.N] Percentage [gos.T.N.N] Relative frequency [gos.T.N.N] pol pol p ol 4,054 2.55 % 3,916.50 463 1.19 % 2,008.48 2,414 7.18 % 8,168.04 333 0.50 % 942.94 844 4.28 % 5,413.97 saj saj s aj 2,321 1.46 % 2,242.28 303 0.78 % 1,314.40 1,298 3.86 % 4,391.93 258 0.39 % 730.57 462 2.35 % 2,963.57 leto leto le to 1,828 1.15 % 1,766 487 1.25 % 2,112.59 440 1.31 % 1,488.79 767 1.16 % 2,171.89 134 0.68 % 859.56 dan dan d an 1,383 0.87 % 1,336.09 446 1.14 % 1,934.73 415 1.23 % 1,404.20 349 0.53 % 988.25 173 0.88 % 1,109.74 človek človek člov ek 1,248 0.79 % 1,205.67 246 0.63 % 1,067.14 221 0.66 % 747.78 639 0.96 % 1,809.43 142 0.72 % 910.88 čas čas č as 1,155 0.73 % 1,115.83 347 0.89 % 1,505.27 127 0.38 % 429.72 537 0.81 % 1,520.60 144 0.73 % 923.71 misel misel mis el 1,086 0.68 % 1,049.17 178 0.46 % 772.16 429 1.28 % 1,451.57 217 0.33 % 614.47 262 1.33 % 1,680.64 stvar stvar stv ar 1,056 0.67 % 1,020.18 157 0.40 % 681.06 190 0.56 % 642.89 467 0.70 % 1,322.39 242 1.23 % 1,552.35 bistvo bistvo bist vo 1,016 0.64 % 981.54 165 0.42 % 715.76 239 0.71 % 808.68 248 0.37 % 702.25 364 1.85 % 2,334.93 red red 
r ed 924 0.58 % 892.66 196 0.50 % 850.24 250 0.74 % 845.90 257 0.39 % 727.74 221 1.12 % 1,417.64 ura ura u ra 905 0.57 % 874.31 362 0.93 % 1,570.34 258 0.77 % 872.97 154 0.23 % 436.08 131 0.67 % 840.32 del del d el 883 0.56 % 853.05 189 0.48 % 819.87 178 0.53 % 602.28 413 0.62 % 1,169.48 103 0.52 % 660.71 primer primer prim er 862 0.54 % 832.76 89 0.23 % 386.08 79 0.23 % 267.31 483 0.73 % 1,367.69 211 1.07 % 1,353.49 stran stran str an 839 0.53 % 810.54 171 0.44 % 741.79 157 0.47 % 531.23 384 0.58 % 1,087.36 127 0.65 % 814.66 hvala hvala hva la 797 0.50 % 769.97 341 0.87 % 1,479.25 60 0.18 % 203.02 302 0.46 % 855.16 94 0.48 % 602.98 konec konec kon ec 784 0.49 % 757.41 205 0.53 % 889.28 170 0.51 % 575.21 304 0.46 % 860.83 105 0.53 % 673.54 gospod gospod gosp od 713 0.45 % 688.82 49 0.13 % 212.56 6 0.02 % 20.30 639 0.96 % 1,809.43 19 0.10 % 121.88 Slovenija slovenija Sloveni ja 711 0.45 % 686.89 167 0.43 % 724.44 46 0.14 % 155.65 465 0.70 % 1,316.72 33 0.17 % 211.68 voda voda vo da 690 0.43 % 666.60 86 0.22 % 373.06 119 0.35 % 402.65 387 0.58 % 1,095.85 98 0.50 % 628.64 evro evro ev ro 664 0.42 % 641.48 169 0.43 % 733.12 200 0.59 % 676.72 192 0.29 % 543.68 103 0.52 % 660.71 minuta minuta minu ta 662 0.42 % 639.55 464 1.19 % 2,012.81 113 0.34 % 382.35 59 0.09 % 167.07 26 0.13 % 166.78 jutro jutro jut ro 618 0.39 % 597.04 513 1.31 % 2,225.37 20 0.06 % 67.67 77 0.12 % 218.04 8 0.04 % 51.32 vprašanje vprašanje vprašan je 597 0.38 % 576.75 101 0.26 % 438.13 72 0.21 % 243.62 347 0.52 % 982.59 77 0.39 % 493.93 problem problem probl em 583 0.37 % 563.23 62 0.16 % 268.95 123 0.37 % 416.18 272 0.41 % 770.21 126 0.64 % 808.25 država država drža va 538 0.34 % 519.75 44 0.11 % 190.87 30 0.09 % 101.51 446 0.67 % 1,262.92 18 0.09 % 115.46 otrok otrok otr ok 538 0.34 % 519.75 115 0.29 % 498.87 99 0.29 % 334.98 266 0.40 % 753.22 58 0.29 % 372.05 mesto mesto mes to 515 0.32 % 497.53 247 0.63 % 1,071.48 73 0.22 % 247 169 0.26 % 478.55 26 0.13 % 166.78 način način nač in 491 0.31 % 
474.35 69 0.18 % 299.32 34 0.10 % 115.04 283 0.43 % 801.36 105 0.53 % 673.54 beseda beseda bese da 458 0.29 % 442.47 71 0.18 % 308 25 0.07 % 84.59 336 0.51 % 951.44 26 0.13 % 166.78 teden teden ted en 437 0.28 % 422.18 127 0.33 % 550.92 167 0.50 % 565.06 67 0.10 % 189.72 76 0.39 % 487.51 šola šola šo la 432 0.27 % 417.35 59 0.15 % 255.94 156 0.46 % 527.84 166 0.25 % 470.06 51 0.26 % 327.15 zadeva zadeva zade va 428 0.27 % 413.48 73 0.19 % 316.67 52 0.15 % 175.95 220 0.33 % 622.97 83 0.42 % 532.42 CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 381 File at CLARIN.SI 2.2.38 List of final character-level 3-grams from noun lemmas in the GOS 1.0 corpus with text-type distributionGOS1.0-word_parts-nouns-lemmas- final-3grams-taxonomy-entire.tsvLemma Lemma (lower-case) Rest of the word Final part of the word Total absolute frequen- cy of lemma Percentage of all found lemmas Total relative frequency (per million occur - rences) Absolute frequency [gos.T.J.R] Percentage [gos.T.J.R] Relative frequency [gos.T.J.R] Absolute frequency [gos.T.N.Z] Percentage [gos.T.N.Z] Relative frequency [gos.T.N.Z] Absolute frequency [gos.T.J.I] Percentage [gos.T.J.I] Relative frequency [gos.T.J.I] Absolute frequency [gos.T.N.N] Percentage [gos.T.N.N] Relative frequency [gos.T.N.N] pol pol pol 4,054 2.56 % 3,916.50 463 1.19 % 2,008.48 2,414 7.23 % 8,168.04 333 0.50 % 942.94 844 4.31 % 5,413.97 saj saj saj 2,321 1.47 % 2,242.28 303 0.78 % 1,314.40 1,298 3.89 % 4,391.93 258 0.39 % 730.57 462 2.36 % 2,963.57 leto leto l eto 1,828 1.16 % 1,766 487 1.25 % 2,112.59 440 1.32 % 1,488.79 767 1.16 % 2,171.89 134 0.68 % 859.56 dan dan dan 1,383 0.88 % 1,336.09 446 1.15 % 1,934.73 415 1.24 % 1,404.20 349 0.53 % 988.25 173 0.88 % 1,109.74 človek človek člo vek 1,248 0.79 % 1,205.67 246 0.63 % 1,067.14 221 0.66 % 747.78 639 0.96 % 1,809.43 142 0.72 % 910.88 čas čas čas 1,155 0.73 % 1,115.83 347 0.89 % 1,505.27 127 0.38 % 429.72 537 0.81 % 1,520.60 144 0.73 % 923.71 misel misel 
mi sel 1,086 0.69 % 1,049.17 178 0.46 % 772.16 429 1.28 % 1,451.57 217 0.33 % 614.47 262 1.34 % 1,680.64 stvar stvar st var 1,056 0.67 % 1,020.18 157 0.40 % 681.06 190 0.57 % 642.89 467 0.70 % 1,322.39 242 1.24 % 1,552.35 bistvo bistvo bis tvo 1,016 0.64 % 981.54 165 0.42 % 715.76 239 0.72 % 808.68 248 0.37 % 702.25 364 1.86 % 2,334.93 red red red 924 0.58 % 892.66 196 0.50 % 850.24 250 0.75 % 845.90 257 0.39 % 727.74 221 1.13 % 1,417.64 ura ura ura 905 0.57 % 874.31 362 0.93 % 1,570.34 258 0.77 % 872.97 154 0.23 % 436.08 131 0.67 % 840.32 del del del 883 0.56 % 853.05 189 0.49 % 819.87 178 0.53 % 602.28 413 0.62 % 1,169.48 103 0.53 % 660.71 primer primer pri mer 862 0.55 % 832.76 89 0.23 % 386.08 79 0.24 % 267.31 483 0.73 % 1,367.69 211 1.08 % 1,353.49 stran stran st ran 839 0.53 % 810.54 171 0.44 % 741.79 157 0.47 % 531.23 384 0.58 % 1,087.36 127 0.65 % 814.66 hvala hvala hv ala 797 0.50 % 769.97 341 0.88 % 1,479.25 60 0.18 % 203.02 302 0.46 % 855.16 94 0.48 % 602.98 konec konec ko nec 784 0.50 % 757.41 205 0.53 % 889.28 170 0.51 % 575.21 304 0.46 % 860.83 105 0.54 % 673.54 gospod gospod gos pod 713 0.45 % 688.82 49 0.13 % 212.56 6 0.02 % 20.30 639 0.96 % 1,809.43 19 0.10 % 121.88 Slovenija slovenija Sloven ija 711 0.45 % 686.89 167 0.43 % 724.44 46 0.14 % 155.65 465 0.70 % 1,316.72 33 0.17 % 211.68 voda voda v oda 690 0.44 % 666.60 86 0.22 % 373.06 119 0.36 % 402.65 387 0.58 % 1,095.85 98 0.50 % 628.64 evro evro e vro 664 0.42 % 641.48 169 0.43 % 733.12 200 0.60 % 676.72 192 0.29 % 543.68 103 0.53 % 660.71 minuta minuta min uta 662 0.42 % 639.55 464 1.19 % 2,012.81 113 0.34 % 382.35 59 0.09 % 167.07 26 0.13 % 166.78 jutro jutro ju tro 618 0.39 % 597.04 513 1.32 % 2,225.37 20 0.06 % 67.67 77 0.12 % 218.04 8 0.04 % 51.32 vprašanje vprašanje vpraša nje 597 0.38 % 576.75 101 0.26 % 438.13 72 0.22 % 243.62 347 0.52 % 982.59 77 0.39 % 493.93 problem problem prob lem 583 0.37 % 563.23 62 0.16 % 268.95 123 0.37 % 416.18 272 0.41 % 770.21 126 0.64 % 808.25 država država 
drž ava 538 0.34 % 519.75 44 0.11 % 190.87 30 0.09 % 101.51 446 0.67 % 1,262.92 18 0.09 % 115.46 otrok otrok ot rok 538 0.34 % 519.75 115 0.30 % 498.87 99 0.30 % 334.98 266 0.40 % 753.22 58 0.30 % 372.05 mesto mesto me sto 515 0.33 % 497.53 247 0.64 % 1,071.48 73 0.22 % 247 169 0.26 % 478.55 26 0.13 % 166.78 način način na čin 491 0.31 % 474.35 69 0.18 % 299.32 34 0.10 % 115.04 283 0.43 % 801.36 105 0.54 % 673.54 beseda beseda bes eda 458 0.29 % 442.47 71 0.18 % 308 25 0.07 % 84.59 336 0.51 % 951.44 26 0.13 % 166.78 teden teden te den 437 0.28 % 422.18 127 0.33 % 550.92 167 0.50 % 565.06 67 0.10 % 189.72 76 0.39 % 487.51 šola šola š ola 432 0.27 % 417.35 59 0.15 % 255.94 156 0.47 % 527.84 166 0.25 % 470.06 51 0.26 % 327.15 zadeva zadeva zad eva 428 0.27 % 413.48 73 0.19 % 316.67 52 0.16 % 175.95 220 0.33 % 622.97 83 0.42 % 532.42 CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 382 File at CLARIN.SI 2.2.39 List of final character-level 4-grams from noun lemmas in the GOS 1.0 corpus with text-type distributionGOS1.0-word_parts-nouns-lemmas-final- 4grams-taxonomy-entire.tsvLemma Lemma (lower-case) Rest of the word Final part of the word Total absolute frequen- cy of lemma Percentage of all found lemmas Total relative frequency (per million occur - rences) Absolute frequency [gos.T.J.R] Percentage [gos.T.J.R] Relative frequency [gos.T.J.R] Absolute frequency [gos.T.N.Z] Percentage [gos.T.N.Z] Relative frequency [gos.T.N.Z] Absolute frequency [gos.T.J.I] Percentage [gos.T.J.I] Relative frequency [gos.T.J.I] Absolute frequency [gos.T.N.N] Percentage [gos.T.N.N] Relative frequency [gos.T.N.N] leto leto leto 1,828 1.33 % 1,766 487 1.43 % 2,112.59 440 1.69 % 1,488.79 767 1.26 % 2,171.89 134 0.81 % 859.56 človek človek čl ovek 1,248 0.91 % 1,205.67 246 0.72 % 1,067.14 221 0.85 % 747.78 639 1.05 % 1,809.43 142 0.86 % 910.88 misel misel m isel 1,086 0.79 % 1,049.17 178 0.52 % 772.16 429 1.65 % 1,451.57 217 0.35 % 614.47 262 1.59 % 1,680.64 stvar 
stvar s tvar 1,056 0.77 % 1,020.18 157 0.46 % 681.06 190 0.73 % 642.89 467 0.77 % 1,322.39 242 1.47 % 1,552.35 bistvo bistvo bi stvo 1,016 0.74 % 981.54 165 0.48 % 715.76 239 0.92 % 808.68 248 0.41 % 702.25 364 2.21 % 2,334.93 primer primer pr imer 862 0.63 % 832.76 89 0.26 % 386.08 79 0.30 % 267.31 483 0.79 % 1,367.69 211 1.28 % 1,353.49 stran stran s tran 839 0.61 % 810.54 171 0.50 % 741.79 157 0.60 % 531.23 384 0.63 % 1,087.36 127 0.77 % 814.66 hvala hvala h vala 797 0.58 % 769.97 341 1.00 % 1,479.25 60 0.23 % 203.02 302 0.49 % 855.16 94 0.57 % 602.98 konec konec k onec 784 0.57 % 757.41 205 0.60 % 889.28 170 0.65 % 575.21 304 0.50 % 860.83 105 0.64 % 673.54 gospod gospod go spod 713 0.52 % 688.82 49 0.14 % 212.56 6 0.02 % 20.30 639 1.05 % 1,809.43 19 0.12 % 121.88 Slovenija slovenija Slove nija 711 0.52 % 686.89 167 0.49 % 724.44 46 0.18 % 155.65 465 0.76 % 1,316.72 33 0.20 % 211.68 voda voda voda 690 0.50 % 666.60 86 0.25 % 373.06 119 0.46 % 402.65 387 0.63 % 1,095.85 98 0.59 % 628.64 evro evro evro 664 0.48 % 641.48 169 0.50 % 733.12 200 0.77 % 676.72 192 0.32 % 543.68 103 0.62 % 660.71 minuta minuta mi nuta 662 0.48 % 639.55 464 1.36 % 2,012.81 113 0.43 % 382.35 59 0.10 % 167.07 26 0.16 % 166.78 jutro jutro j utro 618 0.45 % 597.04 513 1.50 % 2,225.37 20 0.08 % 67.67 77 0.13 % 218.04 8 0.05 % 51.32 vprašanje vprašanje vpraš anje 597 0.43 % 576.75 101 0.30 % 438.13 72 0.28 % 243.62 347 0.57 % 982.59 77 0.47 % 493.93 problem problem pro blem 583 0.42 % 563.23 62 0.18 % 268.95 123 0.47 % 416.18 272 0.45 % 770.21 126 0.77 % 808.25 država država dr žava 538 0.39 % 519.75 44 0.13 % 190.87 30 0.12 % 101.51 446 0.73 % 1,262.92 18 0.11 % 115.46 otrok otrok o trok 538 0.39 % 519.75 115 0.34 % 498.87 99 0.38 % 334.98 266 0.44 % 753.22 58 0.35 % 372.05 mesto mesto m esto 515 0.37 % 497.53 247 0.72 % 1,071.48 73 0.28 % 247 169 0.28 % 478.55 26 0.16 % 166.78 način način n ačin 491 0.36 % 474.35 69 0.20 % 299.32 34 0.13 % 115.04 283 0.46 % 801.36 105 0.64 % 673.54 beseda 
beseda be seda 458 0.33 % 442.47 71 0.21 % 308 25 0.10 % 84.59 336 0.55 % 951.44 26 0.16 % 166.78 teden teden t eden 437 0.32 % 422.18 127 0.37 % 550.92 167 0.64 % 565.06 67 0.11 % 189.72 76 0.46 % 487.51 šola šola šola 432 0.31 % 417.35 59 0.17 % 255.94 156 0.60 % 527.84 166 0.27 % 470.06 51 0.31 % 327.15 zadeva zadeva za deva 428 0.31 % 413.48 73 0.21 % 316.67 52 0.20 % 175.95 220 0.36 % 622.97 83 0.50 % 532.42 delo delo delo 425 0.31 % 410.59 63 0.18 % 273.29 68 0.26 % 230.09 238 0.39 % 673.94 56 0.34 % 359.22 svet svet svet 424 0.31 % 409.62 101 0.30 % 438.13 34 0.13 % 115.04 271 0.44 % 767.38 18 0.11 % 115.46 vlada vlada v lada 416 0.30 % 401.89 15 0.04 % 65.07 1 0 % 3.38 399 0.65 % 1,129.83 1 0.01 % 6.41 denar denar d enar 406 0.29 % 392.23 55 0.16 % 238.59 119 0.46 % 402.65 201 0.33 % 569.16 31 0.19 % 198.85 kola kola kola 363 0.26 % 350.69 63 0.18 % 273.29 85 0.33 % 287.61 114 0.19 % 322.81 101 0.61 % 647.88 skupina skupina sku pina 358 0.26 % 345.86 81 0.24 % 351.37 28 0.11 % 94.74 203 0.33 % 574.83 46 0.28 % 295.07 reklo reklo r eklo 356 0.26 % 343.93 52 0.15 % 225.57 182 0.70 % 615.82 75 0.12 % 212.37 47 0.28 % 301.49 CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 383 File at CLARIN.SI 2.2.40 List of final character-level 5-grams from noun lemmas in the GOS 1.0 corpus with text-type distributionGOS1.0-word_parts-nouns-lemmas-final- 5grams-taxonomy-entire.tsvLemma Lemma (lower-case) Rest of the word Final part of the word Total absolute frequen- cy of lemma Percentage of all found lemmas Total relative frequency (per million occur - rences) Absolute frequency [gos.T.J.R] Percentage [gos.T.J.R] Relative frequency [gos.T.J.R] Absolute frequency [gos.T.N.Z] Percentage [gos.T.N.Z] Relative frequency [gos.T.N.Z] Absolute frequency [gos.T.J.I] Percentage [gos.T.J.I] Relative frequency [gos.T.J.I] Absolute frequency [gos.T.N.N] Percentage [gos.T.N.N] Relative frequency [gos.T.N.N] človek človek č lovek 1,248 1.09 % 1,205.67 246 
0.88 % 1,067.14 221 1.14 % 747.78 639 1.20 % 1,809.43 142 1.03 % 910.88 misel misel misel 1,086 0.95 % 1,049.17 178 0.63 % 772.16 429 2.21 % 1,451.57 217 0.41 % 614.47 262 1.91 % 1,680.64 stvar stvar stvar 1,056 0.92 % 1,020.18 157 0.56 % 681.06 190 0.98 % 642.89 467 0.88 % 1,322.39 242 1.76 % 1,552.35 bistvo bistvo b istvo 1,016 0.89 % 981.54 165 0.59 % 715.76 239 1.23 % 808.68 248 0.47 % 702.25 364 2.65 % 2,334.93 primer primer p rimer 862 0.75 % 832.76 89 0.32 % 386.08 79 0.41 % 267.31 483 0.91 % 1,367.69 211 1.53 % 1,353.49 stran stran stran 839 0.73 % 810.54 171 0.61 % 741.79 157 0.81 % 531.23 384 0.72 % 1,087.36 127 0.92 % 814.66 hvala hvala hvala 797 0.70 % 769.97 341 1.21 % 1,479.25 60 0.31 % 203.02 302 0.57 % 855.16 94 0.68 % 602.98 konec konec konec 784 0.69 % 757.41 205 0.73 % 889.28 170 0.87 % 575.21 304 0.57 % 860.83 105 0.76 % 673.54 gospod gospod g ospod 713 0.62 % 688.82 49 0.17 % 212.56 6 0.03 % 20.30 639 1.20 % 1,809.43 19 0.14 % 121.88 Slovenija slovenija Slov enija 711 0.62 % 686.89 167 0.59 % 724.44 46 0.24 % 155.65 465 0.88 % 1,316.72 33 0.24 % 211.68 minuta minuta m inuta 662 0.58 % 639.55 464 1.65 % 2,012.81 113 0.58 % 382.35 59 0.11 % 167.07 26 0.19 % 166.78 jutro jutro jutro 618 0.54 % 597.04 513 1.82 % 2,225.37 20 0.10 % 67.67 77 0.14 % 218.04 8 0.06 % 51.32 vprašanje vprašanje vpra šanje 597 0.52 % 576.75 101 0.36 % 438.13 72 0.37 % 243.62 347 0.65 % 982.59 77 0.56 % 493.93 problem problem pr oblem 583 0.51 % 563.23 62 0.22 % 268.95 123 0.63 % 416.18 272 0.51 % 770.21 126 0.92 % 808.25 država država d ržava 538 0.47 % 519.75 44 0.16 % 190.87 30 0.15 % 101.51 446 0.84 % 1,262.92 18 0.13 % 115.46 otrok otrok otrok 538 0.47 % 519.75 115 0.41 % 498.87 99 0.51 % 334.98 266 0.50 % 753.22 58 0.42 % 372.05 mesto mesto mesto 515 0.45 % 497.53 247 0.88 % 1,071.48 73 0.38 % 247 169 0.32 % 478.55 26 0.19 % 166.78 način način način 491 0.43 % 474.35 69 0.24 % 299.32 34 0.17 % 115.04 283 0.53 % 801.36 105 0.76 % 673.54 beseda beseda b eseda 458 0.40 % 
442.47 71 0.25 % 308 25 0.13 % 84.59 336 0.63 % 951.44 26 0.19 % 166.78 teden teden teden 437 0.38 % 422.18 127 0.45 % 550.92 167 0.86 % 565.06 67 0.13 % 189.72 76 0.55 % 487.51 zadeva zadeva z adeva 428 0.37 % 413.48 73 0.26 % 316.67 52 0.27 % 175.95 220 0.41 % 622.97 83 0.60 % 532.42 vlada vlada vlada 416 0.36 % 401.89 15 0.05 % 65.07 1 0.01 % 3.38 399 0.75 % 1,129.83 1 0.01 % 6.41 denar denar denar 406 0.35 % 392.23 55 0.20 % 238.59 119 0.61 % 402.65 201 0.38 % 569.16 31 0.23 % 198.85 skupina skupina sk upina 358 0.31 % 345.86 81 0.29 % 351.37 28 0.14 % 94.74 203 0.38 % 574.83 46 0.34 % 295.07 reklo reklo reklo 356 0.31 % 343.93 52 0.18 % 225.57 182 0.94 % 615.82 75 0.14 % 212.37 47 0.34 % 301.49 cesta cesta cesta 332 0.29 % 320.74 114 0.41 % 494.53 66 0.34 % 223.32 132 0.25 % 373.78 20 0.15 % 128.29 začetek začetek za četek 331 0.29 % 319.77 95 0.34 % 412.11 64 0.33 % 216.55 128 0.24 % 362.45 44 0.32 % 282.24 življenje življenje živl jenje 330 0.29 % 318.81 73 0.26 % 316.67 51 0.26 % 172.56 183 0.34 % 518.19 23 0.17 % 147.54 točka točka točka 322 0.28 % 311.08 56 0.20 % 242.93 22 0.11 % 74.44 206 0.39 % 583.32 38 0.28 % 243.76 mesec mesec mesec 316 0.28 % 305.28 55 0.20 % 238.59 101 0.52 % 341.74 116 0.22 % 328.47 44 0.32 % 282.24 vrsta vrsta vrsta 306 0.27 % 295.62 45 0.16 % 195.21 26 0.13 % 87.97 208 0.39 % 588.99 27 0.20 % 173.20 radio radio radio 299 0.26 % 288.86 257 0.91 % 1,114.86 9 0.05 % 30.45 27 0.05 % 76.45 6 0.04 % 38.49 CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 384 File at CLARIN.SI 2.2.41 List of initial character-level 1-grams from noun standardized forms in the GOS 1.0 corpus with text-type distributionGOS1.0-word_parts-nouns-standardized_ forms-initial-1grams-taxonomy-entire.tsvStandardized form Initial part of the word Rest of the word Total absolute frequen- cy of standardized form Percentage of all found standardized forms Total relative frequency (per million occur - rences) Absolute frequency 
[gos.T.J.R] Percentage [gos.T.J.R] Relative frequency [gos.T.J.R] Absolute frequency [gos.T.N.Z] Percentage [gos.T.N.Z] Relative frequency [gos.T.N.Z] Absolute frequency [gos.T.J.I] Percentage [gos.T.J.I] Relative frequency [gos.T.J.I] Absolute frequency [gos.T.N.N] Percentage [gos.T.N.N] Relative frequency [gos.T.N.N] pol p ol 4,066 2.56 % 3,928.10 465 1.19 % 2,017.15 2,421 7.17 % 8,191.73 335 0.50 % 948.61 845 4.28 % 5,420.38 saj s aj 2,421 1.52 % 2,338.89 318 0.81 % 1,379.47 1,348 3.99 % 4,561.11 275 0.41 % 778.71 480 2.43 % 3,079.03 mislim m islim 1,011 0.64 % 976.71 155 0.40 % 672.38 419 1.24 % 1,417.73 186 0.28 % 526.69 251 1.27 % 1,610.08 bistvu b istvu 988 0.62 % 954.49 163 0.42 % 707.09 234 0.69 % 791.77 232 0.35 % 656.95 359 1.82 % 2,302.86 dan d an 868 0.55 % 838.56 320 0.82 % 1,388.15 277 0.82 % 937.26 159 0.24 % 450.23 112 0.57 % 718.44 redu r edu 863 0.54 % 833.73 179 0.46 % 776.50 238 0.70 % 805.30 236 0.35 % 668.27 210 1.06 % 1,347.08 hvala h vala 793 0.50 % 766.10 340 0.87 % 1,474.91 60 0.18 % 203.02 300 0.45 % 849.50 93 0.47 % 596.56 gospod g ospod 645 0.41 % 623.12 31 0.08 % 134.48 5 0.01 % 16.92 595 0.90 % 1,684.84 14 0.07 % 89.81 strani s trani 608 0.38 % 587.38 129 0.33 % 559.60 107 0.32 % 362.05 287 0.43 % 812.69 85 0.43 % 545.25 stvari s tvari 594 0.37 % 573.85 95 0.24 % 412.11 102 0.30 % 345.13 256 0.39 % 724.91 141 0.71 % 904.47 evrov e vrov 548 0.34 % 529.41 148 0.38 % 642.02 155 0.46 % 524.46 164 0.25 % 464.39 81 0.41 % 519.59 jutro j utro 539 0.34 % 520.72 451 1.15 % 1,956.42 9 0.03 % 30.45 76 0.11 % 215.21 3 0.01 % 19.24 minut m inut 467 0.29 % 451.16 318 0.81 % 1,379.47 92 0.27 % 311.29 34 0.05 % 96.28 23 0.12 % 147.54 let l et 466 0.29 % 450.20 143 0.37 % 620.33 117 0.35 % 395.88 182 0.27 % 515.36 24 0.12 % 153.95 primer p rimer 450 0.28 % 434.74 53 0.14 % 229.91 60 0.18 % 203.02 244 0.37 % 690.93 93 0.47 % 596.56 način n ačin 449 0.28 % 433.77 63 0.16 % 273.29 34 0.10 % 115.04 252 0.38 % 713.58 100 0.51 % 641.47 leta l eta 444 0.28 
% 428.94 133 0.34 % 576.95 77 0.23 % 260.54 207 0.31 % 586.15 27 0.14 % 173.20 čas č as 432 0.27 % 417.35 143 0.37 % 620.33 46 0.14 % 155.65 192 0.29 % 543.68 51 0.26 % 327.15 leto l eto 430 0.27 % 415.42 94 0.24 % 407.77 163 0.48 % 551.53 129 0.19 % 365.28 44 0.22 % 282.24 dela d ela 417 0.26 % 402.86 69 0.18 % 299.32 152 0.45 % 514.31 143 0.21 % 404.93 53 0.27 % 339.98 ljudi l judi 394 0.25 % 380.64 59 0.15 % 255.94 89 0.26 % 301.14 212 0.32 % 600.31 34 0.17 % 218.10 vprašanje v prašanje 393 0.25 % 379.67 72 0.18 % 312.33 54 0.16 % 182.72 216 0.33 % 611.64 51 0.26 % 327.15 stvar s tvar 386 0.24 % 372.91 47 0.12 % 203.88 83 0.25 % 280.84 172 0.26 % 487.05 84 0.42 % 538.83 ljudje l judje 376 0.24 % 363.25 73 0.19 % 316.67 58 0.17 % 196.25 201 0.30 % 569.16 44 0.22 % 282.24 problem p roblem 366 0.23 % 353.59 32 0.08 % 138.81 83 0.25 % 280.84 163 0.24 % 461.56 88 0.45 % 564.49 časa č asa 362 0.23 % 349.72 100 0.26 % 433.80 57 0.17 % 192.87 146 0.22 % 413.42 59 0.30 % 378.46 koli k oli 356 0.22 % 343.93 58 0.15 % 251.60 85 0.25 % 287.61 113 0.17 % 319.98 100 0.51 % 641.47 del d el 345 0.22 % 333.30 75 0.19 % 325.35 39 0.12 % 131.96 193 0.29 % 546.51 38 0.19 % 243.76 kot k ot 319 0.20 % 308.18 71 0.18 % 308 49 0.14 % 165.80 158 0.24 % 447.40 41 0.21 % 263 dni d ni 310 0.20 % 299.49 71 0.18 % 308 116 0.34 % 392.50 85 0.13 % 240.69 38 0.19 % 243.76 evo e vo 308 0.19 % 297.55 178 0.46 % 772.16 60 0.18 % 203.02 16 0.02 % 45.31 54 0.27 % 346.39 Sloveniji S loveniji 286 0.18 % 276.30 54 0.14 % 234.25 31 0.09 % 104.89 187 0.28 % 529.52 14 0.07 % 89.81 CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 385 File at CLARIN.SI 2.2.42 List of initial character-level 2-grams from noun standardized forms in the GOS 1.0 corpus with text-type distributionGOS1.0-word_parts-nouns-standardized_ forms-initial-2grams-taxonomy-entire.tsvStandardized form Initial part of the word Rest of the word Total absolute frequen- cy of standardized form Percentage of all 
found standardized forms Total relative frequency (per million occurrences) Absolute frequency [gos.T.J.R] Percentage [gos.T.J.R] Relative frequency [gos.T.J.R] Absolute frequency [gos.T.N.Z] Percentage [gos.T.N.Z] Relative frequency [gos.T.N.Z] Absolute frequency [gos.T.J.I] Percentage [gos.T.J.I] Relative frequency [gos.T.J.I] Absolute frequency [gos.T.N.N] Percentage [gos.T.N.N] Relative frequency [gos.T.N.N]

| pol | po | l | 4,066 | 2.56 % | 3,928.10 | 465 | 1.19 % | 2,017.15 | 2,421 | 7.20 % | 8,191.73 | 335 | 0.51 % | 948.61 | 845 | 4.29 % | 5,420.38 |
| saj | sa | j | 2,421 | 1.52 % | 2,338.89 | 318 | 0.81 % | 1,379.47 | 1,348 | 4.01 % | 4,561.11 | 275 | 0.41 % | 778.71 | 480 | 2.44 % | 3,079.03 |
| mislim | mi | slim | 1,011 | 0.64 % | 976.71 | 155 | 0.40 % | 672.38 | 419 | 1.25 % | 1,417.73 | 186 | 0.28 % | 526.69 | 251 | 1.27 % | 1,610.08 |
| bistvu | bi | stvu | 988 | 0.62 % | 954.49 | 163 | 0.42 % | 707.09 | 234 | 0.70 % | 791.77 | 232 | 0.35 % | 656.95 | 359 | 1.82 % | 2,302.86 |
| dan | da | n | 868 | 0.55 % | 838.56 | 320 | 0.82 % | 1,388.15 | 277 | 0.82 % | 937.26 | 159 | 0.24 % | 450.23 | 112 | 0.57 % | 718.44 |
| redu | re | du | 863 | 0.54 % | 833.73 | 179 | 0.46 % | 776.50 | 238 | 0.71 % | 805.30 | 236 | 0.36 % | 668.27 | 210 | 1.06 % | 1,347.08 |
| hvala | hv | ala | 793 | 0.50 % | 766.10 | 340 | 0.87 % | 1,474.91 | 60 | 0.18 % | 203.02 | 300 | 0.45 % | 849.50 | 93 | 0.47 % | 596.56 |
| gospod | go | spod | 645 | 0.41 % | 623.12 | 31 | 0.08 % | 134.48 | 5 | 0.01 % | 16.92 | 595 | 0.90 % | 1,684.84 | 14 | 0.07 % | 89.81 |
| strani | st | rani | 608 | 0.38 % | 587.38 | 129 | 0.33 % | 559.60 | 107 | 0.32 % | 362.05 | 287 | 0.43 % | 812.69 | 85 | 0.43 % | 545.25 |
| stvari | st | vari | 594 | 0.37 % | 573.85 | 95 | 0.24 % | 412.11 | 102 | 0.30 % | 345.13 | 256 | 0.39 % | 724.91 | 141 | 0.71 % | 904.47 |
| evrov | ev | rov | 548 | 0.34 % | 529.41 | 148 | 0.38 % | 642.02 | 155 | 0.46 % | 524.46 | 164 | 0.25 % | 464.39 | 81 | 0.41 % | 519.59 |
| jutro | ju | tro | 539 | 0.34 % | 520.72 | 451 | 1.16 % | 1,956.42 | 9 | 0.03 % | 30.45 | 76 | 0.12 % | 215.21 | 3 | 0.01 % | 19.24 |
| minut | mi | nut | 467 | 0.29 % | 451.16 | 318 | 0.81 % | 1,379.47 | 92 | 0.27 % | 311.29 | 34 | 0.05 % | 96.28 | 23 | 0.12 % | 147.54 |
| let | le | t | 466 | 0.29 % | 450.20 | 143 | 0.37 % | 620.33 | 117 | 0.35 % | 395.88 | 182 | 0.27 % | 515.36 | 24 | 0.12 % | 153.95 |
| primer | pr | imer | 450 | 0.28 % | 434.74 | 53 | 0.14 % | 229.91 | 60 | 0.18 % | 203.02 | 244 | 0.37 % | 690.93 | 93 | 0.47 % | 596.56 |
| način | na | čin | 449 | 0.28 % | 433.77 | 63 | 0.16 % | 273.29 | 34 | 0.10 % | 115.04 | 252 | 0.38 % | 713.58 | 100 | 0.51 % | 641.47 |
| leta | le | ta | 444 | 0.28 % | 428.94 | 133 | 0.34 % | 576.95 | 77 | 0.23 % | 260.54 | 207 | 0.31 % | 586.15 | 27 | 0.14 % | 173.20 |
| čas | ča | s | 432 | 0.27 % | 417.35 | 143 | 0.37 % | 620.33 | 46 | 0.14 % | 155.65 | 192 | 0.29 % | 543.68 | 51 | 0.26 % | 327.15 |
| leto | le | to | 430 | 0.27 % | 415.42 | 94 | 0.24 % | 407.77 | 163 | 0.48 % | 551.53 | 129 | 0.19 % | 365.28 | 44 | 0.22 % | 282.24 |
| dela | de | la | 417 | 0.26 % | 402.86 | 69 | 0.18 % | 299.32 | 152 | 0.45 % | 514.31 | 143 | 0.22 % | 404.93 | 53 | 0.27 % | 339.98 |
| ljudi | lj | udi | 394 | 0.25 % | 380.64 | 59 | 0.15 % | 255.94 | 89 | 0.27 % | 301.14 | 212 | 0.32 % | 600.31 | 34 | 0.17 % | 218.10 |
| vprašanje | vp | rašanje | 393 | 0.25 % | 379.67 | 72 | 0.18 % | 312.33 | 54 | 0.16 % | 182.72 | 216 | 0.33 % | 611.64 | 51 | 0.26 % | 327.15 |
| stvar | st | var | 386 | 0.24 % | 372.91 | 47 | 0.12 % | 203.88 | 83 | 0.25 % | 280.84 | 172 | 0.26 % | 487.05 | 84 | 0.43 % | 538.83 |
| ljudje | lj | udje | 376 | 0.24 % | 363.25 | 73 | 0.19 % | 316.67 | 58 | 0.17 % | 196.25 | 201 | 0.30 % | 569.16 | 44 | 0.22 % | 282.24 |
| problem | pr | oblem | 366 | 0.23 % | 353.59 | 32 | 0.08 % | 138.81 | 83 | 0.25 % | 280.84 | 163 | 0.25 % | 461.56 | 88 | 0.45 % | 564.49 |
| časa | ča | sa | 362 | 0.23 % | 349.72 | 100 | 0.26 % | 433.80 | 57 | 0.17 % | 192.87 | 146 | 0.22 % | 413.42 | 59 | 0.30 % | 378.46 |
| koli | ko | li | 356 | 0.22 % | 343.93 | 58 | 0.15 % | 251.60 | 85 | 0.25 % | 287.61 | 113 | 0.17 % | 319.98 | 100 | 0.51 % | 641.47 |
| del | de | l | 345 | 0.22 % | 333.30 | 75 | 0.19 % | 325.35 | 39 | 0.12 % | 131.96 | 193 | 0.29 % | 546.51 | 38 | 0.19 % | 243.76 |
| kot | ko | t | 319 | 0.20 % | 308.18 | 71 | 0.18 % | 308 | 49 | 0.15 % | 165.80 | 158 | 0.24 % | 447.40 | 41 | 0.21 % | 263 |
| dni | dn | i | 310 | 0.20 % | 299.49 | 71 | 0.18 % | 308 | 116 | 0.34 % | 392.50 | 85 | 0.13 % | 240.69 | 38 | 0.19 % | 243.76 |
| evo | ev | o | 308 | 0.19 % | 297.55 | 178 | 0.46 % | 772.16 | 60 | 0.18 % | 203.02 | 16 | 0.02 % | 45.31 | 54 | 0.27 % | 346.39 |
| Sloveniji | Sl | oveniji | 286 | 0.18 % | 276.30 | 54 | 0.14 % | 234.25 | 31 | 0.09 % | 104.89 | 187 | 0.28 % | 529.52 | 14 | 0.07 % | 89.81 |

2.2.43 List of initial character-level 3-grams from noun standardized forms in the GOS 1.0 corpus with text-type distribution

File at CLARIN.SI: GOS1.0-word_parts-nouns-standardized_forms-initial-3grams-taxonomy-entire.tsv

| Standardized form | Initial part of the word | Rest of the word | Total absolute frequency of standardized form | Percentage of all found standardized forms | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N] |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| pol | pol | | 4,066 | 2.57 % | 3,928.10 | 465 | 1.20 % | 2,017.15 | 2,421 | 7.27 % | 8,191.73 | 335 | 0.51 % | 948.61 | 845 | 4.32 % | 5,420.38 |
| saj | saj | | 2,421 | 1.53 % | 2,338.89 | 318 | 0.82 % | 1,379.47 | 1,348 | 4.05 % | 4,561.11 | 275 | 0.41 % | 778.71 | 480 | 2.45 % | 3,079.03 |
| mislim | mis | lim | 1,011 | 0.64 % | 976.71 | 155 | 0.40 % | 672.38 | 419 | 1.26 % | 1,417.73 | 186 | 0.28 % | 526.69 | 251 | 1.28 % | 1,610.08 |
| bistvu | bis | tvu | 988 | 0.62 % | 954.49 | 163 | 0.42 % | 707.09 | 234 | 0.70 % | 791.77 | 232 | 0.35 % | 656.95 | 359 | 1.83 % | 2,302.86 |
| dan | dan | | 868 | 0.55 % | 838.56 | 320 | 0.82 % | 1,388.15 | 277 | 0.83 % | 937.26 | 159 | 0.24 % | 450.23 | 112 | 0.57 % | 718.44 |
| redu | red | u | 863 | 0.55 % | 833.73 | 179 | 0.46 % | 776.50 | 238 | 0.71 % | 805.30 | 236 | 0.36 % | 668.27 | 210 | 1.07 % | 1,347.08 |
| hvala | hva | la | 793 | 0.50 % | 766.10 | 340 | 0.87 % | 1,474.91 | 60 | 0.18 % | 203.02 | 300 | 0.45 % | 849.50 | 93 | 0.47 % | 596.56 |
| gospod | gos | pod | 645 | 0.41 % | 623.12 | 31 | 0.08 % | 134.48 | 5 | 0.01 % | 16.92 | 595 | 0.90 % | 1,684.84 | 14 | 0.07 % | 89.81 |
| strani | str | ani | 608 | 0.39 % | 587.38 | 129 | 0.33 % | 559.60 | 107 | 0.32 % | 362.05 | 287 | 0.43 % | 812.69 | 85 | 0.43 % | 545.25 |
| stvari | stv | ari | 594 | 0.38 % | 573.85 | 95 | 0.24 % | 412.11 | 102 | 0.31 % | 345.13 | 256 | 0.39 % | 724.91 | 141 | 0.72 % | 904.47 |
| evrov | evr | ov | 548 | 0.35 % | 529.41 | 148 | 0.38 % | 642.02 | 155 | 0.47 % | 524.46 | 164 | 0.25 % | 464.39 | 81 | 0.41 % | 519.59 |
| jutro | jut | ro | 539 | 0.34 % | 520.72 | 451 | 1.16 % | 1,956.42 | 9 | 0.03 % | 30.45 | 76 | 0.12 % | 215.21 | 3 | 0.01 % | 19.24 |
| minut | min | ut | 467 | 0.30 % | 451.16 | 318 | 0.82 % | 1,379.47 | 92 | 0.28 % | 311.29 | 34 | 0.05 % | 96.28 | 23 | 0.12 % | 147.54 |
| let | let | | 466 | 0.29 % | 450.20 | 143 | 0.37 % | 620.33 | 117 | 0.35 % | 395.88 | 182 | 0.28 % | 515.36 | 24 | 0.12 % | 153.95 |
| primer | pri | mer | 450 | 0.28 % | 434.74 | 53 | 0.14 % | 229.91 | 60 | 0.18 % | 203.02 | 244 | 0.37 % | 690.93 | 93 | 0.47 % | 596.56 |
| način | nač | in | 449 | 0.28 % | 433.77 | 63 | 0.16 % | 273.29 | 34 | 0.10 % | 115.04 | 252 | 0.38 % | 713.58 | 100 | 0.51 % | 641.47 |
| leta | let | a | 444 | 0.28 % | 428.94 | 133 | 0.34 % | 576.95 | 77 | 0.23 % | 260.54 | 207 | 0.31 % | 586.15 | 27 | 0.14 % | 173.20 |
| čas | čas | | 432 | 0.27 % | 417.35 | 143 | 0.37 % | 620.33 | 46 | 0.14 % | 155.65 | 192 | 0.29 % | 543.68 | 51 | 0.26 % | 327.15 |
| leto | let | o | 430 | 0.27 % | 415.42 | 94 | 0.24 % | 407.77 | 163 | 0.49 % | 551.53 | 129 | 0.20 % | 365.28 | 44 | 0.23 % | 282.24 |
| dela | del | a | 417 | 0.26 % | 402.86 | 69 | 0.18 % | 299.32 | 152 | 0.46 % | 514.31 | 143 | 0.22 % | 404.93 | 53 | 0.27 % | 339.98 |
| ljudi | lju | di | 394 | 0.25 % | 380.64 | 59 | 0.15 % | 255.94 | 89 | 0.27 % | 301.14 | 212 | 0.32 % | 600.31 | 34 | 0.17 % | 218.10 |
| vprašanje | vpr | ašanje | 393 | 0.25 % | 379.67 | 72 | 0.18 % | 312.33 | 54 | 0.16 % | 182.72 | 216 | 0.33 % | 611.64 | 51 | 0.26 % | 327.15 |
| stvar | stv | ar | 386 | 0.24 % | 372.91 | 47 | 0.12 % | 203.88 | 83 | 0.25 % | 280.84 | 172 | 0.26 % | 487.05 | 84 | 0.43 % | 538.83 |
| ljudje | lju | dje | 376 | 0.24 % | 363.25 | 73 | 0.19 % | 316.67 | 58 | 0.17 % | 196.25 | 201 | 0.30 % | 569.16 | 44 | 0.23 % | 282.24 |
| problem | pro | blem | 366 | 0.23 % | 353.59 | 32 | 0.08 % | 138.81 | 83 | 0.25 % | 280.84 | 163 | 0.25 % | 461.56 | 88 | 0.45 % | 564.49 |
| časa | čas | a | 362 | 0.23 % | 349.72 | 100 | 0.26 % | 433.80 | 57 | 0.17 % | 192.87 | 146 | 0.22 % | 413.42 | 59 | 0.30 % | 378.46 |
| koli | kol | i | 356 | 0.23 % | 343.93 | 58 | 0.15 % | 251.60 | 85 | 0.26 % | 287.61 | 113 | 0.17 % | 319.98 | 100 | 0.51 % | 641.47 |
| del | del | | 345 | 0.22 % | 333.30 | 75 | 0.19 % | 325.35 | 39 | 0.12 % | 131.96 | 193 | 0.29 % | 546.51 | 38 | 0.19 % | 243.76 |
| kot | kot | | 319 | 0.20 % | 308.18 | 71 | 0.18 % | 308 | 49 | 0.15 % | 165.80 | 158 | 0.24 % | 447.40 | 41 | 0.21 % | 263 |
| dni | dni | | 310 | 0.20 % | 299.49 | 71 | 0.18 % | 308 | 116 | 0.35 % | 392.50 | 85 | 0.13 % | 240.69 | 38 | 0.19 % | 243.76 |
| evo | evo | | 308 | 0.20 % | 297.55 | 178 | 0.46 % | 772.16 | 60 | 0.18 % | 203.02 | 16 | 0.02 % | 45.31 | 54 | 0.28 % | 346.39 |
| Sloveniji | Slo | veniji | 286 | 0.18 % | 276.30 | 54 | 0.14 % | 234.25 | 31 | 0.09 % | 104.89 | 187 | 0.28 % | 529.52 | 14 | 0.07 % | 89.81 |

2.2.44 List of initial character-level 4-grams from noun standardized forms in the GOS 1.0 corpus with text-type distribution

File at CLARIN.SI: GOS1.0-word_parts-nouns-standardized_forms-initial-4grams-taxonomy-entire.tsv

| Standardized form | Initial part of the word | Rest of the word | Total absolute frequency of standardized form | Percentage of all found standardized forms | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N] |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| mislim | misl | im | 1,011 | 0.71 % | 976.71 | 155 | 0.44 % | 672.38 | 419 | 1.57 % | 1,417.73 | 186 | 0.30 % | 526.69 | 251 | 1.47 % | 1,610.08 |
| bistvu | bist | vu | 988 | 0.70 % | 954.49 | 163 | 0.46 % | 707.09 | 234 | 0.88 % | 791.77 | 232 | 0.37 % | 656.95 | 359 | 2.10 % | 2,302.86 |
| redu | redu | | 863 | 0.61 % | 833.73 | 179 | 0.51 % | 776.50 | 238 | 0.89 % | 805.30 | 236 | 0.38 % | 668.27 | 210 | 1.23 % | 1,347.08 |
| hvala | hval | a | 793 | 0.56 % | 766.10 | 340 | 0.97 % | 1,474.91 | 60 | 0.23 % | 203.02 | 300 | 0.48 % | 849.50 | 93 | 0.54 % | 596.56 |
| gospod | gosp | od | 645 | 0.46 % | 623.12 | 31 | 0.09 % | 134.48 | 5 | 0.02 % | 16.92 | 595 | 0.95 % | 1,684.84 | 14 | 0.08 % | 89.81 |
| strani | stra | ni | 608 | 0.43 % | 587.38 | 129 | 0.37 % | 559.60 | 107 | 0.40 % | 362.05 | 287 | 0.46 % | 812.69 | 85 | 0.50 % | 545.25 |
| stvari | stva | ri | 594 | 0.42 % | 573.85 | 95 | 0.27 % | 412.11 | 102 | 0.38 % | 345.13 | 256 | 0.41 % | 724.91 | 141 | 0.82 % | 904.47 |
| evrov | evro | v | 548 | 0.39 % | 529.41 | 148 | 0.42 % | 642.02 | 155 | 0.58 % | 524.46 | 164 | 0.26 % | 464.39 | 81 | 0.47 % | 519.59 |
| jutro | jutr | o | 539 | 0.38 % | 520.72 | 451 | 1.28 % | 1,956.42 | 9 | 0.03 % | 30.45 | 76 | 0.12 % | 215.21 | 3 | 0.02 % | 19.24 |
| minut | minu | t | 467 | 0.33 % | 451.16 | 318 | 0.91 % | 1,379.47 | 92 | 0.34 % | 311.29 | 34 | 0.05 % | 96.28 | 23 | 0.13 % | 147.54 |
| primer | prim | er | 450 | 0.32 % | 434.74 | 53 | 0.15 % | 229.91 | 60 | 0.23 % | 203.02 | 244 | 0.39 % | 690.93 | 93 | 0.54 % | 596.56 |
| način | nači | n | 449 | 0.32 % | 433.77 | 63 | 0.18 % | 273.29 | 34 | 0.13 % | 115.04 | 252 | 0.40 % | 713.58 | 100 | 0.58 % | 641.47 |
| leta | leta | | 444 | 0.31 % | 428.94 | 133 | 0.38 % | 576.95 | 77 | 0.29 % | 260.54 | 207 | 0.33 % | 586.15 | 27 | 0.16 % | 173.20 |
| leto | leto | | 430 | 0.30 % | 415.42 | 94 | 0.27 % | 407.77 | 163 | 0.61 % | 551.53 | 129 | 0.21 % | 365.28 | 44 | 0.26 % | 282.24 |
| dela | dela | | 417 | 0.29 % | 402.86 | 69 | 0.20 % | 299.32 | 152 | 0.57 % | 514.31 | 143 | 0.23 % | 404.93 | 53 | 0.31 % | 339.98 |
| ljudi | ljud | i | 394 | 0.28 % | 380.64 | 59 | 0.17 % | 255.94 | 89 | 0.33 % | 301.14 | 212 | 0.34 % | 600.31 | 34 | 0.20 % | 218.10 |
| vprašanje | vpra | šanje | 393 | 0.28 % | 379.67 | 72 | 0.20 % | 312.33 | 54 | 0.20 % | 182.72 | 216 | 0.34 % | 611.64 | 51 | 0.30 % | 327.15 |
| stvar | stva | r | 386 | 0.27 % | 372.91 | 47 | 0.13 % | 203.88 | 83 | 0.31 % | 280.84 | 172 | 0.27 % | 487.05 | 84 | 0.49 % | 538.83 |
| ljudje | ljud | je | 376 | 0.27 % | 363.25 | 73 | 0.21 % | 316.67 | 58 | 0.22 % | 196.25 | 201 | 0.32 % | 569.16 | 44 | 0.26 % | 282.24 |
| problem | prob | lem | 366 | 0.26 % | 353.59 | 32 | 0.09 % | 138.81 | 83 | 0.31 % | 280.84 | 163 | 0.26 % | 461.56 | 88 | 0.52 % | 564.49 |
| časa | časa | | 362 | 0.26 % | 349.72 | 100 | 0.28 % | 433.80 | 57 | 0.21 % | 192.87 | 146 | 0.23 % | 413.42 | 59 | 0.34 % | 378.46 |
| koli | koli | | 356 | 0.25 % | 343.93 | 58 | 0.17 % | 251.60 | 85 | 0.32 % | 287.61 | 113 | 0.18 % | 319.98 | 100 | 0.58 % | 641.47 |
| Sloveniji | Slov | eniji | 286 | 0.20 % | 276.30 | 54 | 0.15 % | 234.25 | 31 | 0.12 % | 104.89 | 187 | 0.30 % | 529.52 | 14 | 0.08 % | 89.81 |
| koncu | konc | u | 286 | 0.20 % | 276.30 | 73 | 0.21 % | 316.67 | 58 | 0.22 % | 196.25 | 110 | 0.17 % | 311.48 | 45 | 0.26 % | 288.66 |
| človek | člov | ek | 282 | 0.20 % | 272.44 | 72 | 0.20 % | 312.33 | 43 | 0.16 % | 145.50 | 123 | 0.20 % | 348.29 | 44 | 0.26 % | 282.24 |
| teden | tede | n | 273 | 0.19 % | 263.74 | 71 | 0.20 % | 308 | 114 | 0.43 % | 385.73 | 41 | 0.07 % | 116.10 | 47 | 0.28 % | 301.49 |
| delo | delo | | 272 | 0.19 % | 262.77 | 35 | 0.10 % | 151.83 | 37 | 0.14 % | 125.19 | 170 | 0.27 % | 481.38 | 30 | 0.17 % | 192.44 |
| konec | kone | c | 251 | 0.18 % | 242.49 | 66 | 0.19 % | 286.31 | 66 | 0.25 % | 223.32 | 86 | 0.14 % | 243.52 | 33 | 0.19 % | 211.68 |
| vode | vode | | 242 | 0.17 % | 233.79 | 25 | 0.07 % | 108.45 | 26 | 0.10 % | 87.97 | 161 | 0.26 % | 455.90 | 30 | 0.17 % | 192.44 |
| primeru | prim | eru | 240 | 0.17 % | 231.86 | 30 | 0.09 % | 130.14 | 14 | 0.05 % | 47.37 | 130 | 0.21 % | 368.12 | 66 | 0.39 % | 423.37 |
| imam | imam | | 235 | 0.17 % | 227.03 | 51 | 0.14 % | 221.24 | 92 | 0.34 % | 311.29 | 38 | 0.06 % | 107.60 | 54 | 0.32 % | 346.39 |
| času | času | | 220 | 0.15 % | 212.54 | 49 | 0.14 % | 212.56 | 11 | 0.04 % | 37.22 | 137 | 0.22 % | 387.94 | 23 | 0.13 % | 147.54 |

2.2.45 List of initial character-level 5-grams from noun standardized forms in the GOS 1.0 corpus with text-type distribution

File at CLARIN.SI: GOS1.0-word_parts-nouns-standardized_forms-initial-5grams-taxonomy-entire.tsv

| Standardized form | Initial part of the word | Rest of the word | Total absolute frequency of standardized form | Percentage of all found standardized forms | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N] |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| mislim | misli | m | 1,011 | 0.83 % | 976.71 | 155 | 0.52 % | 672.38 | 419 | 2.00 % | 1,417.73 | 186 | 0.33 % | 526.69 | 251 | 1.74 % | 1,610.08 |
| bistvu | bistv | u | 988 | 0.82 % | 954.49 | 163 | 0.54 % | 707.09 | 234 | 1.12 % | 791.77 | 232 | 0.42 % | 656.95 | 359 | 2.49 % | 2,302.86 |
| hvala | hvala | | 793 | 0.66 % | 766.10 | 340 | 1.14 % | 1,474.91 | 60 | 0.29 % | 203.02 | 300 | 0.54 % | 849.50 | 93 | 0.64 % | 596.56 |
| gospod | gospo | d | 645 | 0.53 % | 623.12 | 31 | 0.10 % | 134.48 | 5 | 0.02 % | 16.92 | 595 | 1.07 % | 1,684.84 | 14 | 0.10 % | 89.81 |
| strani | stran | i | 608 | 0.50 % | 587.38 | 129 | 0.43 % | 559.60 | 107 | 0.51 % | 362.05 | 287 | 0.52 % | 812.69 | 85 | 0.59 % | 545.25 |
| stvari | stvar | i | 594 | 0.49 % | 573.85 | 95 | 0.32 % | 412.11 | 102 | 0.49 % | 345.13 | 256 | 0.46 % | 724.91 | 141 | 0.98 % | 904.47 |
| evrov | evrov | | 548 | 0.45 % | 529.41 | 148 | 0.49 % | 642.02 | 155 | 0.74 % | 524.46 | 164 | 0.29 % | 464.39 | 81 | 0.56 % | 519.59 |
| jutro | jutro | | 539 | 0.45 % | 520.72 | 451 | 1.51 % | 1,956.42 | 9 | 0.04 % | 30.45 | 76 | 0.14 % | 215.21 | 3 | 0.02 % | 19.24 |
| minut | minut | | 467 | 0.39 % | 451.16 | 318 | 1.06 % | 1,379.47 | 92 | 0.44 % | 311.29 | 34 | 0.06 % | 96.28 | 23 | 0.16 % | 147.54 |
| primer | prime | r | 450 | 0.37 % | 434.74 | 53 | 0.18 % | 229.91 | 60 | 0.29 % | 203.02 | 244 | 0.44 % | 690.93 | 93 | 0.64 % | 596.56 |
| način | način | | 449 | 0.37 % | 433.77 | 63 | 0.21 % | 273.29 | 34 | 0.16 % | 115.04 | 252 | 0.45 % | 713.58 | 100 | 0.69 % | 641.47 |
| ljudi | ljudi | | 394 | 0.33 % | 380.64 | 59 | 0.20 % | 255.94 | 89 | 0.42 % | 301.14 | 212 | 0.38 % | 600.31 | 34 | 0.23 % | 218.10 |
| vprašanje | vpraš | anje | 393 | 0.33 % | 379.67 | 72 | 0.24 % | 312.33 | 54 | 0.26 % | 182.72 | 216 | 0.39 % | 611.64 | 51 | 0.35 % | 327.15 |
| stvar | stvar | | 386 | 0.32 % | 372.91 | 47 | 0.16 % | 203.88 | 83 | 0.40 % | 280.84 | 172 | 0.31 % | 487.05 | 84 | 0.58 % | 538.83 |
| ljudje | ljudj | e | 376 | 0.31 % | 363.25 | 73 | 0.24 % | 316.67 | 58 | 0.28 % | 196.25 | 201 | 0.36 % | 569.16 | 44 | 0.30 % | 282.24 |
| problem | probl | em | 366 | 0.30 % | 353.59 | 32 | 0.11 % | 138.81 | 83 | 0.40 % | 280.84 | 163 | 0.29 % | 461.56 | 88 | 0.61 % | 564.49 |
| Sloveniji | Slove | niji | 286 | 0.24 % | 276.30 | 54 | 0.18 % | 234.25 | 31 | 0.15 % | 104.89 | 187 | 0.34 % | 529.52 | 14 | 0.10 % | 89.81 |
| koncu | koncu | | 286 | 0.24 % | 276.30 | 73 | 0.24 % | 316.67 | 58 | 0.28 % | 196.25 | 110 | 0.20 % | 311.48 | 45 | 0.31 % | 288.66 |
| človek | člove | k | 282 | 0.23 % | 272.44 | 72 | 0.24 % | 312.33 | 43 | 0.20 % | 145.50 | 123 | 0.22 % | 348.29 | 44 | 0.30 % | 282.24 |
| teden | teden | | 273 | 0.23 % | 263.74 | 71 | 0.24 % | 308 | 114 | 0.55 % | 385.73 | 41 | 0.07 % | 116.10 | 47 | 0.33 % | 301.49 |
| konec | konec | | 251 | 0.21 % | 242.49 | 66 | 0.22 % | 286.31 | 66 | 0.32 % | 223.32 | 86 | 0.15 % | 243.52 | 33 | 0.23 % | 211.68 |
| primeru | prime | ru | 240 | 0.20 % | 231.86 | 30 | 0.10 % | 130.14 | 14 | 0.07 % | 47.37 | 130 | 0.23 % | 368.12 | 66 | 0.46 % | 423.37 |
| stran | stran | | 213 | 0.18 % | 205.78 | 39 | 0.13 % | 169.18 | 50 | 0.24 % | 169.18 | 86 | 0.15 % | 243.52 | 38 | 0.26 % | 243.76 |
| mesto | mesto | | 207 | 0.17 % | 199.98 | 100 | 0.33 % | 433.80 | 35 | 0.17 % | 118.43 | 70 | 0.13 % | 198.22 | 2 | 0.01 % | 12.83 |
| denar | denar | | 204 | 0.17 % | 197.08 | 26 | 0.09 % | 112.79 | 63 | 0.30 % | 213.17 | 105 | 0.19 % | 297.32 | 10 | 0.07 % | 64.15 |
| pizda | pizda | | 201 | 0.17 % | 194.18 | 7 | 0.02 % | 30.37 | 191 | 0.91 % | 646.27 | 0 | 0 % | 0 | 3 | 0.02 % | 19.24 |
| večer | večer | | 193 | 0.16 % | 186.45 | 93 | 0.31 % | 403.43 | 11 | 0.05 % | 37.22 | 84 | 0.15 % | 237.86 | 5 | 0.04 % | 32.07 |
| država | držav | a | 192 | 0.16 % | 185.49 | 7 | 0.02 % | 30.37 | 16 | 0.08 % | 54.14 | 161 | 0.29 % | 455.90 | 8 | 0.06 % | 51.32 |
| Slovenija | Slove | nija | 187 | 0.15 % | 180.66 | 33 | 0.11 % | 143.15 | 6 | 0.03 % | 20.30 | 143 | 0.26 % | 404.93 | 5 | 0.04 % | 32.07 |
| rekla | rekla | | 184 | 0.15 % | 177.76 | 16 | 0.05 % | 69.41 | 117 | 0.56 % | 395.88 | 22 | 0.04 % | 62.30 | 29 | 0.20 % | 186.03 |
| svetu | svetu | | 181 | 0.15 % | 174.86 | 49 | 0.16 % | 212.56 | 14 | 0.07 % | 47.37 | 109 | 0.20 % | 308.65 | 9 | 0.06 % | 57.73 |
| denarja | denar | ja | 176 | 0.14 % | 170.03 | 27 | 0.09 % | 117.12 | 48 | 0.23 % | 162.41 | 84 | 0.15 % | 237.86 | 17 | 0.12 % | 109.05 |

2.2.46 List of final character-level 1-grams from noun standardized forms in the GOS 1.0 corpus with text-type distribution

File at CLARIN.SI: GOS1.0-word_parts-nouns-standardized_forms-final-1grams-taxonomy-entire.tsv

| Standardized form | Rest of the word | Final part of the word | Total absolute frequency of standardized form | Percentage of all found standardized forms | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N] |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| pol | po | l | 4,066 | 2.56 % | 3,928.10 | 465 | 1.19 % | 2,017.15 | 2,421 | 7.17 % | 8,191.73 | 335 | 0.50 % | 948.61 | 845 | 4.28 % | 5,420.38 |
| saj | sa | j | 2,421 | 1.52 % | 2,338.89 | 318 | 0.81 % | 1,379.47 | 1,348 | 3.99 % | 4,561.11 | 275 | 0.41 % | 778.71 | 480 | 2.43 % | 3,079.03 |
| mislim | misli | m | 1,011 | 0.64 % | 976.71 | 155 | 0.40 % | 672.38 | 419 | 1.24 % | 1,417.73 | 186 | 0.28 % | 526.69 | 251 | 1.27 % | 1,610.08 |
| bistvu | bistv | u | 988 | 0.62 % | 954.49 | 163 | 0.42 % | 707.09 | 234 | 0.69 % | 791.77 | 232 | 0.35 % | 656.95 | 359 | 1.82 % | 2,302.86 |
| dan | da | n | 868 | 0.55 % | 838.56 | 320 | 0.82 % | 1,388.15 | 277 | 0.82 % | 937.26 | 159 | 0.24 % | 450.23 | 112 | 0.57 % | 718.44 |
| redu | red | u | 863 | 0.54 % | 833.73 | 179 | 0.46 % | 776.50 | 238 | 0.70 % | 805.30 | 236 | 0.35 % | 668.27 | 210 | 1.06 % | 1,347.08 |
| hvala | hval | a | 793 | 0.50 % | 766.10 | 340 | 0.87 % | 1,474.91 | 60 | 0.18 % | 203.02 | 300 | 0.45 % | 849.50 | 93 | 0.47 % | 596.56 |
| gospod | gospo | d | 645 | 0.41 % | 623.12 | 31 | 0.08 % | 134.48 | 5 | 0.01 % | 16.92 | 595 | 0.90 % | 1,684.84 | 14 | 0.07 % | 89.81 |
| strani | stran | i | 608 | 0.38 % | 587.38 | 129 | 0.33 % | 559.60 | 107 | 0.32 % | 362.05 | 287 | 0.43 % | 812.69 | 85 | 0.43 % | 545.25 |
| stvari | stvar | i | 594 | 0.37 % | 573.85 | 95 | 0.24 % | 412.11 | 102 | 0.30 % | 345.13 | 256 | 0.39 % | 724.91 | 141 | 0.71 % | 904.47 |
| evrov | evro | v | 548 | 0.34 % | 529.41 | 148 | 0.38 % | 642.02 | 155 | 0.46 % | 524.46 | 164 | 0.25 % | 464.39 | 81 | 0.41 % | 519.59 |
| jutro | jutr | o | 539 | 0.34 % | 520.72 | 451 | 1.15 % | 1,956.42 | 9 | 0.03 % | 30.45 | 76 | 0.11 % | 215.21 | 3 | 0.01 % | 19.24 |
| minut | minu | t | 467 | 0.29 % | 451.16 | 318 | 0.81 % | 1,379.47 | 92 | 0.27 % | 311.29 | 34 | 0.05 % | 96.28 | 23 | 0.12 % | 147.54 |
| let | le | t | 466 | 0.29 % | 450.20 | 143 | 0.37 % | 620.33 | 117 | 0.35 % | 395.88 | 182 | 0.27 % | 515.36 | 24 | 0.12 % | 153.95 |
| primer | prime | r | 450 | 0.28 % | 434.74 | 53 | 0.14 % | 229.91 | 60 | 0.18 % | 203.02 | 244 | 0.37 % | 690.93 | 93 | 0.47 % | 596.56 |
| način | nači | n | 449 | 0.28 % | 433.77 | 63 | 0.16 % | 273.29 | 34 | 0.10 % | 115.04 | 252 | 0.38 % | 713.58 | 100 | 0.51 % | 641.47 |
| leta | let | a | 444 | 0.28 % | 428.94 | 133 | 0.34 % | 576.95 | 77 | 0.23 % | 260.54 | 207 | 0.31 % | 586.15 | 27 | 0.14 % | 173.20 |
| čas | ča | s | 432 | 0.27 % | 417.35 | 143 | 0.37 % | 620.33 | 46 | 0.14 % | 155.65 | 192 | 0.29 % | 543.68 | 51 | 0.26 % | 327.15 |
| leto | let | o | 430 | 0.27 % | 415.42 | 94 | 0.24 % | 407.77 | 163 | 0.48 % | 551.53 | 129 | 0.19 % | 365.28 | 44 | 0.22 % | 282.24 |
| dela | del | a | 417 | 0.26 % | 402.86 | 69 | 0.18 % | 299.32 | 152 | 0.45 % | 514.31 | 143 | 0.21 % | 404.93 | 53 | 0.27 % | 339.98 |
| ljudi | ljud | i | 394 | 0.25 % | 380.64 | 59 | 0.15 % | 255.94 | 89 | 0.26 % | 301.14 | 212 | 0.32 % | 600.31 | 34 | 0.17 % | 218.10 |
| vprašanje | vprašanj | e | 393 | 0.25 % | 379.67 | 72 | 0.18 % | 312.33 | 54 | 0.16 % | 182.72 | 216 | 0.33 % | 611.64 | 51 | 0.26 % | 327.15 |
| stvar | stva | r | 386 | 0.24 % | 372.91 | 47 | 0.12 % | 203.88 | 83 | 0.25 % | 280.84 | 172 | 0.26 % | 487.05 | 84 | 0.42 % | 538.83 |
| ljudje | ljudj | e | 376 | 0.24 % | 363.25 | 73 | 0.19 % | 316.67 | 58 | 0.17 % | 196.25 | 201 | 0.30 % | 569.16 | 44 | 0.22 % | 282.24 |
| problem | proble | m | 366 | 0.23 % | 353.59 | 32 | 0.08 % | 138.81 | 83 | 0.25 % | 280.84 | 163 | 0.24 % | 461.56 | 88 | 0.45 % | 564.49 |
| časa | čas | a | 362 | 0.23 % | 349.72 | 100 | 0.26 % | 433.80 | 57 | 0.17 % | 192.87 | 146 | 0.22 % | 413.42 | 59 | 0.30 % | 378.46 |
| koli | kol | i | 356 | 0.22 % | 343.93 | 58 | 0.15 % | 251.60 | 85 | 0.25 % | 287.61 | 113 | 0.17 % | 319.98 | 100 | 0.51 % | 641.47 |
| del | de | l | 345 | 0.22 % | 333.30 | 75 | 0.19 % | 325.35 | 39 | 0.12 % | 131.96 | 193 | 0.29 % | 546.51 | 38 | 0.19 % | 243.76 |
| kot | ko | t | 319 | 0.20 % | 308.18 | 71 | 0.18 % | 308 | 49 | 0.14 % | 165.80 | 158 | 0.24 % | 447.40 | 41 | 0.21 % | 263 |
| dni | dn | i | 310 | 0.20 % | 299.49 | 71 | 0.18 % | 308 | 116 | 0.34 % | 392.50 | 85 | 0.13 % | 240.69 | 38 | 0.19 % | 243.76 |
| evo | ev | o | 308 | 0.19 % | 297.55 | 178 | 0.46 % | 772.16 | 60 | 0.18 % | 203.02 | 16 | 0.02 % | 45.31 | 54 | 0.27 % | 346.39 |
| Sloveniji | Slovenij | i | 286 | 0.18 % | 276.30 | 54 | 0.14 % | 234.25 | 31 | 0.09 % | 104.89 | 187 | 0.28 % | 529.52 | 14 | 0.07 % | 89.81 |

2.2.47 List of final character-level 2-grams from noun standardized forms in the GOS 1.0 corpus with text-type distribution

File at CLARIN.SI: GOS1.0-word_parts-nouns-standardized_forms-final-2grams-taxonomy-entire.tsv

| Standardized form | Rest of the word | Final part of the word | Total absolute frequency of standardized form | Percentage of all found standardized forms | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N] |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| pol | p | ol | 4,066 | 2.56 % | 3,928.10 | 465 | 1.19 % | 2,017.15 | 2,421 | 7.20 % | 8,191.73 | 335 | 0.51 % | 948.61 | 845 | 4.29 % | 5,420.38 |
| saj | s | aj | 2,421 | 1.52 % | 2,338.89 | 318 | 0.81 % | 1,379.47 | 1,348 | 4.01 % | 4,561.11 | 275 | 0.41 % | 778.71 | 480 | 2.44 % | 3,079.03 |
| mislim | misl | im | 1,011 | 0.64 % | 976.71 | 155 | 0.40 % | 672.38 | 419 | 1.25 % | 1,417.73 | 186 | 0.28 % | 526.69 | 251 | 1.27 % | 1,610.08 |
| bistvu | bist | vu | 988 | 0.62 % | 954.49 | 163 | 0.42 % | 707.09 | 234 | 0.70 % | 791.77 | 232 | 0.35 % | 656.95 | 359 | 1.82 % | 2,302.86 |
| dan | d | an | 868 | 0.55 % | 838.56 | 320 | 0.82 % | 1,388.15 | 277 | 0.82 % | 937.26 | 159 | 0.24 % | 450.23 | 112 | 0.57 % | 718.44 |
| redu | re | du | 863 | 0.54 % | 833.73 | 179 | 0.46 % | 776.50 | 238 | 0.71 % | 805.30 | 236 | 0.36 % | 668.27 | 210 | 1.06 % | 1,347.08 |
| hvala | hva | la | 793 | 0.50 % | 766.10 | 340 | 0.87 % | 1,474.91 | 60 | 0.18 % | 203.02 | 300 | 0.45 % | 849.50 | 93 | 0.47 % | 596.56 |
| gospod | gosp | od | 645 | 0.41 % | 623.12 | 31 | 0.08 % | 134.48 | 5 | 0.01 % | 16.92 | 595 | 0.90 % | 1,684.84 | 14 | 0.07 % | 89.81 |
| strani | stra | ni | 608 | 0.38 % | 587.38 | 129 | 0.33 % | 559.60 | 107 | 0.32 % | 362.05 | 287 | 0.43 % | 812.69 | 85 | 0.43 % | 545.25 |
| stvari | stva | ri | 594 | 0.37 % | 573.85 | 95 | 0.24 % | 412.11 | 102 | 0.30 % | 345.13 | 256 | 0.39 % | 724.91 | 141 | 0.71 % | 904.47 |
| evrov | evr | ov | 548 | 0.34 % | 529.41 | 148 | 0.38 % | 642.02 | 155 | 0.46 % | 524.46 | 164 | 0.25 % | 464.39 | 81 | 0.41 % | 519.59 |
| jutro | jut | ro | 539 | 0.34 % | 520.72 | 451 | 1.16 % | 1,956.42 | 9 | 0.03 % | 30.45 | 76 | 0.12 % | 215.21 | 3 | 0.01 % | 19.24 |
| minut | min | ut | 467 | 0.29 % | 451.16 | 318 | 0.81 % | 1,379.47 | 92 | 0.27 % | 311.29 | 34 | 0.05 % | 96.28 | 23 | 0.12 % | 147.54 |
| let | l | et | 466 | 0.29 % | 450.20 | 143 | 0.37 % | 620.33 | 117 | 0.35 % | 395.88 | 182 | 0.27 % | 515.36 | 24 | 0.12 % | 153.95 |
| primer | prim | er | 450 | 0.28 % | 434.74 | 53 | 0.14 % | 229.91 | 60 | 0.18 % | 203.02 | 244 | 0.37 % | 690.93 | 93 | 0.47 % | 596.56 |
| način | nač | in | 449 | 0.28 % | 433.77 | 63 | 0.16 % | 273.29 | 34 | 0.10 % | 115.04 | 252 | 0.38 % | 713.58 | 100 | 0.51 % | 641.47 |
| leta | le | ta | 444 | 0.28 % | 428.94 | 133 | 0.34 % | 576.95 | 77 | 0.23 % | 260.54 | 207 | 0.31 % | 586.15 | 27 | 0.14 % | 173.20 |
| čas | č | as | 432 | 0.27 % | 417.35 | 143 | 0.37 % | 620.33 | 46 | 0.14 % | 155.65 | 192 | 0.29 % | 543.68 | 51 | 0.26 % | 327.15 |
| leto | le | to | 430 | 0.27 % | 415.42 | 94 | 0.24 % | 407.77 | 163 | 0.48 % | 551.53 | 129 | 0.19 % | 365.28 | 44 | 0.22 % | 282.24 |
| dela | de | la | 417 | 0.26 % | 402.86 | 69 | 0.18 % | 299.32 | 152 | 0.45 % | 514.31 | 143 | 0.22 % | 404.93 | 53 | 0.27 % | 339.98 |
| ljudi | lju | di | 394 | 0.25 % | 380.64 | 59 | 0.15 % | 255.94 | 89 | 0.27 % | 301.14 | 212 | 0.32 % | 600.31 | 34 | 0.17 % | 218.10 |
| vprašanje | vprašan | je | 393 | 0.25 % | 379.67 | 72 | 0.18 % | 312.33 | 54 | 0.16 % | 182.72 | 216 | 0.33 % | 611.64 | 51 | 0.26 % | 327.15 |
| stvar | stv | ar | 386 | 0.24 % | 372.91 | 47 | 0.12 % | 203.88 | 83 | 0.25 % | 280.84 | 172 | 0.26 % | 487.05 | 84 | 0.43 % | 538.83 |
| ljudje | ljud | je | 376 | 0.24 % | 363.25 | 73 | 0.19 % | 316.67 | 58 | 0.17 % | 196.25 | 201 | 0.30 % | 569.16 | 44 | 0.22 % | 282.24 |
| problem | probl | em | 366 | 0.23 % | 353.59 | 32 | 0.08 % | 138.81 | 83 | 0.25 % | 280.84 | 163 | 0.25 % | 461.56 | 88 | 0.45 % | 564.49 |
| časa | ča | sa | 362 | 0.23 % | 349.72 | 100 | 0.26 % | 433.80 | 57 | 0.17 % | 192.87 | 146 | 0.22 % | 413.42 | 59 | 0.30 % | 378.46 |
| koli | ko | li | 356 | 0.22 % | 343.93 | 58 | 0.15 % | 251.60 | 85 | 0.25 % | 287.61 | 113 | 0.17 % | 319.98 | 100 | 0.51 % | 641.47 |
| del | d | el | 345 | 0.22 % | 333.30 | 75 | 0.19 % | 325.35 | 39 | 0.12 % | 131.96 | 193 | 0.29 % | 546.51 | 38 | 0.19 % | 243.76 |
| kot | k | ot | 319 | 0.20 % | 308.18 | 71 | 0.18 % | 308 | 49 | 0.15 % | 165.80 | 158 | 0.24 % | 447.40 | 41 | 0.21 % | 263 |
| dni | d | ni | 310 | 0.20 % | 299.49 | 71 | 0.18 % | 308 | 116 | 0.34 % | 392.50 | 85 | 0.13 % | 240.69 | 38 | 0.19 % | 243.76 |
| evo | e | vo | 308 | 0.19 % | 297.55 | 178 | 0.46 % | 772.16 | 60 | 0.18 % | 203.02 | 16 | 0.02 % | 45.31 | 54 | 0.27 % | 346.39 |
| Sloveniji | Sloveni | ji | 286 | 0.18 % | 276.30 | 54 | 0.14 % | 234.25 | 31 | 0.09 % | 104.89 | 187 | 0.28 % | 529.52 | 14 | 0.07 % | 89.81 |

2.2.48 List of final character-level 3-grams from noun standardized forms in the GOS 1.0 corpus with text-type distribution

File at CLARIN.SI: GOS1.0-word_parts-nouns-standardized_forms-final-3grams-taxonomy-entire.tsv

| Standardized form | Rest of the word | Final part of the word | Total absolute frequency of standardized form | Percentage of all found standardized forms | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N] |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| pol | | pol | 4,066 | 2.57 % | 3,928.10 | 465 | 1.20 % | 2,017.15 | 2,421 | 7.27 % | 8,191.73 | 335 | 0.51 % | 948.61 | 845 | 4.32 % | 5,420.38 |
| saj | | saj | 2,421 | 1.53 % | 2,338.89 | 318 | 0.82 % | 1,379.47 | 1,348 | 4.05 % | 4,561.11 | 275 | 0.41 % | 778.71 | 480 | 2.45 % | 3,079.03 |
| mislim | mis | lim | 1,011 | 0.64 % | 976.71 | 155 | 0.40 % | 672.38 | 419 | 1.26 % | 1,417.73 | 186 | 0.28 % | 526.69 | 251 | 1.28 % | 1,610.08 |
| bistvu | bis | tvu | 988 | 0.62 % | 954.49 | 163 | 0.42 % | 707.09 | 234 | 0.70 % | 791.77 | 232 | 0.35 % | 656.95 | 359 | 1.83 % | 2,302.86 |
| dan | | dan | 868 | 0.55 % | 838.56 | 320 | 0.82 % | 1,388.15 | 277 | 0.83 % | 937.26 | 159 | 0.24 % | 450.23 | 112 | 0.57 % | 718.44 |
| redu | r | edu | 863 | 0.55 % | 833.73 | 179 | 0.46 % | 776.50 | 238 | 0.71 % | 805.30 | 236 | 0.36 % | 668.27 | 210 | 1.07 % | 1,347.08 |
| hvala | hv | ala | 793 | 0.50 % | 766.10 | 340 | 0.87 % | 1,474.91 | 60 | 0.18 % | 203.02 | 300 | 0.45 % | 849.50 | 93 | 0.47 % | 596.56 |
| gospod | gos | pod | 645 | 0.41 % | 623.12 | 31 | 0.08 % | 134.48 | 5 | 0.01 % | 16.92 | 595 | 0.90 % | 1,684.84 | 14 | 0.07 % | 89.81 |
| strani | str | ani | 608 | 0.39 % | 587.38 | 129 | 0.33 % | 559.60 | 107 | 0.32 % | 362.05 | 287 | 0.43 % | 812.69 | 85 | 0.43 % | 545.25 |
| stvari | stv | ari | 594 | 0.38 % | 573.85 | 95 | 0.24 % | 412.11 | 102 | 0.31 % | 345.13 | 256 | 0.39 % | 724.91 | 141 | 0.72 % | 904.47 |
| evrov | ev | rov | 548 | 0.35 % | 529.41 | 148 | 0.38 % | 642.02 | 155 | 0.47 % | 524.46 | 164 | 0.25 % | 464.39 | 81 | 0.41 % | 519.59 |
| jutro | ju | tro | 539 | 0.34 % | 520.72 | 451 | 1.16 % | 1,956.42 | 9 | 0.03 % | 30.45 | 76 | 0.12 % | 215.21 | 3 | 0.01 % | 19.24 |
| minut | mi | nut | 467 | 0.30 % | 451.16 | 318 | 0.82 % | 1,379.47 | 92 | 0.28 % | 311.29 | 34 | 0.05 % | 96.28 | 23 | 0.12 % | 147.54 |
| let | | let | 466 | 0.29 % | 450.20 | 143 | 0.37 % | 620.33 | 117 | 0.35 % | 395.88 | 182 | 0.28 % | 515.36 | 24 | 0.12 % | 153.95 |
| primer | pri | mer | 450 | 0.28 % | 434.74 | 53 | 0.14 % | 229.91 | 60 | 0.18 % | 203.02 | 244 | 0.37 % | 690.93 | 93 | 0.47 % | 596.56 |
| način | na | čin | 449 | 0.28 % | 433.77 | 63 | 0.16 % | 273.29 | 34 | 0.10 % | 115.04 | 252 | 0.38 % | 713.58 | 100 | 0.51 % | 641.47 |
| leta | l | eta | 444 | 0.28 % | 428.94 | 133 | 0.34 % | 576.95 | 77 | 0.23 % | 260.54 | 207 | 0.31 % | 586.15 | 27 | 0.14 % | 173.20 |
| čas | | čas | 432 | 0.27 % | 417.35 | 143 | 0.37 % | 620.33 | 46 | 0.14 % | 155.65 | 192 | 0.29 % | 543.68 | 51 | 0.26 % | 327.15 |
| leto | l | eto | 430 | 0.27 % | 415.42 | 94 | 0.24 % | 407.77 | 163 | 0.49 % | 551.53 | 129 | 0.20 % | 365.28 | 44 | 0.23 % | 282.24 |
| dela | d | ela | 417 | 0.26 % | 402.86 | 69 | 0.18 % | 299.32 | 152 | 0.46 % | 514.31 | 143 | 0.22 % | 404.93 | 53 | 0.27 % | 339.98 |
| ljudi | lj | udi | 394 | 0.25 % | 380.64 | 59 | 0.15 % | 255.94 | 89 | 0.27 % | 301.14 | 212 | 0.32 % | 600.31 | 34 | 0.17 % | 218.10 |
| vprašanje | vpraša | nje | 393 | 0.25 % | 379.67 | 72 | 0.18 % | 312.33 | 54 | 0.16 % | 182.72 | 216 | 0.33 % | 611.64 | 51 | 0.26 % | 327.15 |
| stvar | st | var | 386 | 0.24 % | 372.91 | 47 | 0.12 % | 203.88 | 83 | 0.25 % | 280.84 | 172 | 0.26 % | 487.05 | 84 | 0.43 % | 538.83 |
| ljudje | lju | dje | 376 | 0.24 % | 363.25 | 73 | 0.19 % | 316.67 | 58 | 0.17 % | 196.25 | 201 | 0.30 % | 569.16 | 44 | 0.23 % | 282.24 |
| problem | prob | lem | 366 | 0.23 % | 353.59 | 32 | 0.08 % | 138.81 | 83 | 0.25 % | 280.84 | 163 | 0.25 % | 461.56 | 88 | 0.45 % | 564.49 |
| časa | č | asa | 362 | 0.23 % | 349.72 | 100 | 0.26 % | 433.80 | 57 | 0.17 % | 192.87 | 146 | 0.22 % | 413.42 | 59 | 0.30 % | 378.46 |
| koli | k | oli | 356 | 0.23 % | 343.93 | 58 | 0.15 % | 251.60 | 85 | 0.26 % | 287.61 | 113 | 0.17 % | 319.98 | 100 | 0.51 % | 641.47 |
| del | | del | 345 | 0.22 % | 333.30 | 75 | 0.19 % | 325.35 | 39 | 0.12 % | 131.96 | 193 | 0.29 % | 546.51 | 38 | 0.19 % | 243.76 |
| kot | | kot | 319 | 0.20 % | 308.18 | 71 | 0.18 % | 308 | 49 | 0.15 % | 165.80 | 158 | 0.24 % | 447.40 | 41 | 0.21 % | 263 |
| dni | | dni | 310 | 0.20 % | 299.49 | 71 | 0.18 % | 308 | 116 | 0.35 % | 392.50 | 85 | 0.13 % | 240.69 | 38 | 0.19 % | 243.76 |
| evo | | evo | 308 | 0.20 % | 297.55 | 178 | 0.46 % | 772.16 | 60 | 0.18 % | 203.02 | 16 | 0.02 % | 45.31 | 54 | 0.28 % | 346.39 |
| Sloveniji | Sloven | iji | 286 | 0.18 % | 276.30 | 54 | 0.14 % | 234.25 | 31 | 0.09 % | 104.89 | 187 | 0.28 % | 529.52 | 14 | 0.07 % | 89.81 |

2.2.49 List of final character-level 4-grams from noun standardized forms in the GOS 1.0 corpus with text-type distribution

File at CLARIN.SI: GOS1.0-word_parts-nouns-standardized_forms-final-4grams-taxonomy-entire.tsv

| Standardized form | Rest of the word | Final part of the word | Total absolute frequency of standardized form | Percentage of all found standardized forms | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N] |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| mislim | mi | slim | 1,011 | 0.71 % | 976.71 | 155 | 0.44 % | 672.38 | 419 | 1.57 % | 1,417.73 | 186 | 0.30 % | 526.69 | 251 | 1.47 % | 1,610.08 |
| bistvu | bi | stvu | 988 | 0.70 % | 954.49 | 163 | 0.46 % | 707.09 | 234 | 0.88 % | 791.77 | 232 | 0.37 % | 656.95 | 359 | 2.10 % | 2,302.86 |
| redu | | redu | 863 | 0.61 % | 833.73 | 179 | 0.51 % | 776.50 | 238 | 0.89 % | 805.30 | 236 | 0.38 % | 668.27 | 210 | 1.23 % | 1,347.08 |
| hvala | h | vala | 793 | 0.56 % | 766.10 | 340 | 0.97 % | 1,474.91 | 60 | 0.23 % | 203.02 | 300 | 0.48 % | 849.50 | 93 | 0.54 % | 596.56 |
| gospod | go | spod | 645 | 0.46 % | 623.12 | 31 | 0.09 % | 134.48 | 5 | 0.02 % | 16.92 | 595 | 0.95 % | 1,684.84 | 14 | 0.08 % | 89.81 |
| strani | st | rani | 608 | 0.43 % | 587.38 | 129 | 0.37 % | 559.60 | 107 | 0.40 % | 362.05 | 287 | 0.46 % | 812.69 | 85 | 0.50 % | 545.25 |
| stvari | st | vari | 594 | 0.42 % | 573.85 | 95 | 0.27 % | 412.11 | 102 | 0.38 % | 345.13 | 256 | 0.41 % | 724.91 | 141 | 0.82 % | 904.47 |
| evrov | e | vrov | 548 | 0.39 % | 529.41 | 148 | 0.42 % | 642.02 | 155 | 0.58 % | 524.46 | 164 | 0.26 % | 464.39 | 81 | 0.47 % | 519.59 |
| jutro | j | utro | 539 | 0.38 % | 520.72 | 451 | 1.28 % | 1,956.42 | 9 | 0.03 % | 30.45 | 76 | 0.12 % | 215.21 | 3 | 0.02 % | 19.24 |
| minut | m | inut | 467 | 0.33 % | 451.16 | 318 | 0.91 % | 1,379.47 | 92 | 0.34 % | 311.29 | 34 | 0.05 % | 96.28 | 23 | 0.13 % | 147.54 |
| primer | pr | imer | 450 | 0.32 % | 434.74 | 53 | 0.15 % | 229.91 | 60 | 0.23 % | 203.02 | 244 | 0.39 % | 690.93 | 93 | 0.54 % | 596.56 |
| način | n | ačin | 449 | 0.32 % | 433.77 | 63 | 0.18 % | 273.29 | 34 | 0.13 % | 115.04 | 252 | 0.40 % | 713.58 | 100 | 0.58 % | 641.47 |
| leta | | leta | 444 | 0.31 % | 428.94 | 133 | 0.38 % | 576.95 | 77 | 0.29 % | 260.54 | 207 | 0.33 % | 586.15 | 27 | 0.16 % | 173.20 |
| leto | | leto | 430 | 0.30 % | 415.42 | 94 | 0.27 % | 407.77 | 163 | 0.61 % | 551.53 | 129 | 0.21 % | 365.28 | 44 | 0.26 % | 282.24 |
| dela | | dela | 417 | 0.29 % | 402.86 | 69 | 0.20 % | 299.32 | 152 | 0.57 % | 514.31 | 143 | 0.23 % | 404.93 | 53 | 0.31 % | 339.98 |
| ljudi | l | judi | 394 | 0.28 % | 380.64 | 59 | 0.17 % | 255.94 | 89 | 0.33 % | 301.14 | 212 | 0.34 % | 600.31 | 34 | 0.20 % | 218.10 |
| vprašanje | vpraš | anje | 393 | 0.28 % | 379.67 | 72 | 0.20 % | 312.33 | 54 | 0.20 % | 182.72 | 216 | 0.34 % | 611.64 | 51 | 0.30 % | 327.15 |
| stvar | s | tvar | 386 | 0.27 % | 372.91 | 47 | 0.13 % | 203.88 | 83 | 0.31 % | 280.84 | 172 | 0.27 % | 487.05 | 84 | 0.49 % | 538.83 |
| ljudje | lj | udje | 376 | 0.27 % | 363.25 | 73 | 0.21 % | 316.67 | 58 | 0.22 % | 196.25 | 201 | 0.32 % | 569.16 | 44 | 0.26 % | 282.24 |
| problem | pro | blem | 366 | 0.26 % | 353.59 | 32 | 0.09 % | 138.81 | 83 | 0.31 % | 280.84 | 163 | 0.26 % | 461.56 | 88 | 0.52 % | 564.49 |
| časa | | časa | 362 | 0.26 % | 349.72 | 100 | 0.28 % | 433.80 | 57 | 0.21 % | 192.87 | 146 | 0.23 % | 413.42 | 59 | 0.34 % | 378.46 |
| koli | | koli | 356 | 0.25 % | 343.93 | 58 | 0.17 % | 251.60 | 85 | 0.32 % | 287.61 | 113 | 0.18 % | 319.98 | 100 | 0.58 % | 641.47 |
| Sloveniji | Slove | niji | 286 | 0.20 % | 276.30 | 54 | 0.15 % | 234.25 | 31 | 0.12 % | 104.89 | 187 | 0.30 % | 529.52 | 14 | 0.08 % | 89.81 |
| koncu | k | oncu | 286 | 0.20 % | 276.30 | 73 | 0.21 % | 316.67 | 58 | 0.22 % | 196.25 | 110 | 0.17 % | 311.48 | 45 | 0.26 % | 288.66 |
| človek | čl | ovek | 282 | 0.20 % | 272.44 | 72 | 0.20 % | 312.33 | 43 | 0.16 % | 145.50 | 123 | 0.20 % | 348.29 | 44 | 0.26 % | 282.24 |
| teden | t | eden | 273 | 0.19 % | 263.74 | 71 | 0.20 % | 308 | 114 | 0.43 % | 385.73 | 41 | 0.07 % | 116.10 | 47 | 0.28 % | 301.49 |
| delo | | delo | 272 | 0.19 % | 262.77 | 35 | 0.10 % | 151.83 | 37 | 0.14 % | 125.19 | 170 | 0.27 % | 481.38 | 30 | 0.17 % | 192.44 |
| konec | k | onec | 251 | 0.18 % | 242.49 | 66 | 0.19 % | 286.31 | 66 | 0.25 % | 223.32 | 86 | 0.14 % | 243.52 | 33 | 0.19 % | 211.68 |
| vode | | vode | 242 | 0.17 % | 233.79 | 25 | 0.07 % | 108.45 | 26 | 0.10 % | 87.97 | 161 | 0.26 % | 455.90 | 30 | 0.17 % | 192.44 |
| primeru | pri | meru | 240 | 0.17 % | 231.86 | 30 | 0.09 % | 130.14 | 14 | 0.05 % | 47.37 | 130 | 0.21 % | 368.12 | 66 | 0.39 % | 423.37 |
| imam | | imam | 235 | 0.17 % | 227.03 | 51 | 0.14 % | 221.24 | 92 | 0.34 % | 311.29 | 38 | 0.06 % | 107.60 | 54 | 0.32 % | 346.39 |
| času | | času | 220 | 0.15 % | 212.54 | 49 | 0.14 % | 212.56 | 11 | 0.04 % | 37.22 | 137 | 0.22 % | 387.94 | 23 | 0.13 % | 147.54 |

2.2.50 List of final character-level 5-grams from noun standardized forms in the GOS 1.0 corpus with text-type distribution

File at CLARIN.SI: GOS1.0-word_parts-nouns-standardized_forms-final-5grams-taxonomy-entire.tsv

| Standardized form | Rest of the word | Final part of the word | Total absolute frequency of standardized form | Percentage of all found standardized forms | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N] |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| mislim | m | islim | 1,011 | 0.83 % | 976.71 | 155 | 0.52 % | 672.38 | 419 | 2.00 % | 1,417.73 | 186 | 0.33 % | 526.69 | 251 | 1.74 % | 1,610.08 |
| bistvu | b | istvu | 988 | 0.82 % | 954.49 | 163 | 0.54 % | 707.09 | 234 | 1.12 % | 791.77 | 232 | 0.42 % | 656.95 | 359 | 2.49 % | 2,302.86 |
| hvala | | hvala | 793 | 0.66 % | 766.10 | 340 | 1.14 % | 1,474.91 | 60 | 0.29 % | 203.02 | 300 | 0.54 % | 849.50 | 93 | 0.64 % | 596.56 |
| gospod | g | ospod | 645 | 0.53 % | 623.12 | 31 | 0.10 % | 134.48 | 5 | 0.02 % | 16.92 | 595 | 1.07 % | 1,684.84 | 14 | 0.10 % | 89.81 |
| strani | s | trani | 608 | 0.50 % | 587.38 | 129 | 0.43 % | 559.60 | 107 | 0.51 % | 362.05 | 287 | 0.52 % | 812.69 | 85 | 0.59 % | 545.25 |
| stvari | s | tvari | 594 | 0.49 % | 573.85 | 95 | 0.32 % | 412.11 | 102 | 0.49 % | 345.13 | 256 | 0.46 % | 724.91 | 141 | 0.98 % | 904.47 |
| evrov | | evrov | 548 | 0.45 % | 529.41 | 148 | 0.49 % | 642.02 | 155 | 0.74 % | 524.46 | 164 | 0.29 % | 464.39 | 81 | 0.56 % | 519.59 |
| jutro | | jutro | 539 | 0.45 % | 520.72 | 451 | 1.51 % | 1,956.42 | 9 | 0.04 % | 30.45 | 76 | 0.14 % | 215.21 | 3 | 0.02 % | 19.24 |
| minut | | minut | 467 | 0.39 % | 451.16 | 318 | 1.06 % | 1,379.47 | 92 | 0.44 % | 311.29 | 34 | 0.06 % | 96.28 | 23 | 0.16 % | 147.54 |
| primer | p | rimer | 450 | 0.37 % | 434.74 | 53 | 0.18 % | 229.91 | 60 | 0.29 % | 203.02 | 244 | 0.44 % | 690.93 | 93 | 0.64 % | 596.56 |
| način | | način | 449 | 0.37 % | 433.77 | 63 | 0.21 % | 273.29 | 34 | 0.16 % | 115.04 | 252 | 0.45 % | 713.58 | 100 | 0.69 % | 641.47 |
| ljudi | | ljudi | 394 | 0.33 % | 380.64 | 59 | 0.20 % | 255.94 | 89 | 0.42 % | 301.14 | 212 | 0.38 % | 600.31 | 34 | 0.23 % | 218.10 |
| vprašanje | vpra | šanje | 393 | 0.33 % | 379.67 | 72 | 0.24 % | 312.33 | 54 | 0.26 % | 182.72 | 216 | 0.39 % | 611.64 | 51 | 0.35 % | 327.15 |
| stvar | | stvar | 386 | 0.32 % | 372.91 | 47 | 0.16 % | 203.88 | 83 | 0.40 % | 280.84 | 172 | 0.31 % | 487.05 | 84 | 0.58 % | 538.83 |
| ljudje | l | judje | 376 | 0.31 % | 363.25 | 73 | 0.24 % | 316.67 | 58 | 0.28 % | 196.25 | 201 | 0.36 % | 569.16 | 44 | 0.30 % | 282.24 |
| problem | pr | oblem | 366 | 0.30 % | 353.59 | 32 | 0.11 % | 138.81 | 83 | 0.40 % | 280.84 | 163 | 0.29 % | 461.56 | 88 | 0.61 % | 564.49 |
| Sloveniji | Slov | eniji | 286 | 0.24 % | 276.30 | 54 | 0.18 % | 234.25 | 31 | 0.15 % | 104.89 | 187 | 0.34 % | 529.52 | 14 | 0.10 % | 89.81 |
| koncu | | koncu | 286 | 0.24 % | 276.30 | 73 | 0.24 % | 316.67 | 58 | 0.28 % | 196.25 | 110 | 0.20 % | 311.48 | 45 | 0.31 % | 288.66 |
| človek | č | lovek | 282 | 0.23 % | 272.44 | 72 | 0.24 % | 312.33 | 43 | 0.20 % | 145.50 | 123 | 0.22 % | 348.29 | 44 | 0.30 % | 282.24 |
| teden | | teden | 273 | 0.23 % | 263.74 | 71 | 0.24 % | 308 | 114 | 0.55 % | 385.73 | 41 | 0.07 % | 116.10 | 47 | 0.33 % | 301.49 |
| konec | | konec | 251 | 0.21 % | 242.49 | 66 | 0.22 % | 286.31 | 66 | 0.32 % | 223.32 | 86 | 0.15 % | 243.52 | 33 | 0.23 % | 211.68 |
| primeru | pr | imeru | 240 | 0.20 % | 231.86 | 30 | 0.10 % | 130.14 | 14 | 0.07 % | 47.37 | 130 | 0.23 % | 368.12 | 66 | 0.46 % | 423.37 |
| stran | | stran | 213 | 0.18 % | 205.78 | 39 | 0.13 % | 169.18 | 50 | 0.24 % | 169.18 | 86 | 0.15 % | 243.52 | 38 | 0.26 % | 243.76 |
| mesto | | mesto | 207 | 0.17 % | 199.98 | 100 | 0.33 % | 433.80 | 35 | 0.17 % | 118.43 | 70 | 0.13 % | 198.22 | 2 | 0.01 % | 12.83 |
| denar | | denar | 204 | 0.17 % | 197.08 | 26 | 0.09 % | 112.79 | 63 | 0.30 % | 213.17 | 105 | 0.19 % | 297.32 | 10 | 0.07 % | 64.15 |
| pizda | | pizda | 201 | 0.17 % | 194.18 | 7 | 0.02 % | 30.37 | 191 | 0.91 % | 646.27 | 0 | 0 % | 0 | 3 | 0.02 % | 19.24 |
| večer | | večer | 193 | 0.16 % | 186.45 | 93 | 0.31 % | 403.43 | 11 | 0.05 % | 37.22 | 84 | 0.15 % | 237.86 | 5 | 0.04 % | 32.07 |
| država | d | ržava | 192 | 0.16 % | 185.49 | 7 | 0.02 % | 30.37 | 16 | 0.08 % | 54.14 | 161 | 0.29 % | 455.90 | 8 | 0.06 % | 51.32 |
| Slovenija | Slov | enija | 187 | 0.15 % | 180.66 | 33 | 0.11 % | 143.15 | 6 | 0.03 % | 20.30 | 143 | 0.26 % | 404.93 | 5 | 0.04 % | 32.07 |
| rekla | | rekla | 184 | 0.15 % | 177.76 | 16 | 0.05 % | 69.41 | 117 | 0.56 % | 395.88 | 22 | 0.04 % | 62.30 | 29 | 0.20 % | 186.03 |
| svetu | | svetu | 181 | 0.15 % | 174.86 | 49 | 0.16 % | 212.56 | 14 | 0.07 % | 47.37 | 109 | 0.20 % | 308.65 | 9 | 0.06 % | 57.73 |
| denarja | de | narja | 176 | 0.14 % | 170.03 | 27 | 0.09 % | 117.12 | 48 | 0.23 % | 162.41 | 84 | 0.15 % | 237.86 | 17 | 0.12 % | 109.05 |

2.2.51 List of initial character-level 1-grams from noun lower-case word forms in the GOS 1.0 corpus with text-type distribution

File at CLARIN.SI: GOS1.0-word_parts-nouns-lowercase_forms-initial-1grams-taxonomy-entire.tsv

| Lower-case word form | Initial part of the word | Rest of the word | Total absolute frequency of lower-case word form | Percentage of all found lower-case word forms | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N] |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| pol | p | ol | 2,838 | 1.78 % | 2,741.75 | 422 | 1.08 % | 1,830.62 | 1,472 | 4.36 % | 4,980.68 | 277 | 0.42 % | 784.37 | 667 | 3.38 % | 4,278.58 |
| sej | s | ej | 2,070 | 1.30 % | 1,999.79 | 252 | 0.65 % | 1,093.17 | 1,160 | 3.44 % | 3,924.99 | 242 | 0.36 % | 685.26 | 416 | 2.10 % | 2,668.50 |
| bistvu | b | istvu | 965 | 0.61 % | 932.27 | 162 | 0.41 % | 702.75 | 218 | 0.65 % | 737.63 | 229 | 0.34 % | 648.45 | 356 | 1.80 % | 2,283.62 |
| dan | d | an | 825 | 0.52 % | 797.02 | 312 | 0.80 % | 1,353.44 | 241 | 0.71 % | 815.45 | 160 | 0.24 % | 453.07 | 112 | 0.57 % | 718.44 |
| hvala | h | vala | 790 | 0.50 % | 763.21 | 338 | 0.86 % | 1,466.23 | 59 | 0.17 % | 199.63 | 300 | 0.45 % | 849.50 | 93 | 0.47 % | 596.56 |
| po | p | o | 788 | 0.49 % | 761.27 | 44 | 0.11 % | 190.87 | 551 | 1.63 % | 1,864.37 | 39 | 0.06 % | 110.43 | 154 | 0.78 % | 987.86 |
| redu | r | edu | 683 | 0.43 % | 659.84 | 148 | 0.38 % | 642.02 | 114 | 0.34 % | 385.73 | 220 | 0.33 % | 622.97 | 201 | 1.02 % | 1,289.35 |
| gospod | g | ospod | 648 | 0.41 % | 626.02 | 31 | 0.08 % | 134.48 | 5 | 0.01 % | 16.92 | 598 | 0.90 % | 1,693.34 | 14 | 0.07 % | 89.81 |
| stvari | s | tvari | 588 | 0.37 % | 568.06 | 95 | 0.24 % | 412.11 | 97 | 0.29 % | 328.21 | 255 | 0.38 % | 722.07 | 141 | 0.71 % | 904.47 |
| strani | s | trani | 566 | 0.36 % | 546.80 | 128 | 0.33 % | 555.26 | 77 | 0.23 % | 260.54 | 283 | 0.42 % | 801.36 | 78 | 0.40 % | 500.34 |
| jutro | j | utro | 536 | 0.34 % | 517.82 | 451 | 1.15 % | 1,956.42 | 7 | 0.02 % | 23.69 | 75 | 0.11 % | 212.37 | 3 | 0.01 % | 19.24 |
| evrov | e | vrov | 532 | 0.33 % | 513.96 | 148 | 0.38 % | 642.02 | 142 | 0.42 % | 480.47 | 161 | 0.24 % | 455.90 | 81 | 0.41 % | 519.59 |
| let | l | et | 530 | 0.33 % | 512.02 | 136 | 0.35 % | 589.96 | 176 | 0.52 % | 595.52 | 186 | 0.28 % | 526.69 | 32 | 0.16 % | 205.27 |
| minut | m | inut | 461 | 0.29 % | 445.36 | 316 | 0.81 % | 1,370.80 | 88 | 0.26 % | 297.76 | 34 | 0.05 % | 96.28 | 23 | 0.12 % | 147.54 |
| način | n | ačin | 448 | 0.28 % | 432.81 | 63 | 0.16 % | 273.29 | 34 | 0.10 % | 115.04 | 251 | 0.38 % | 710.75 | 100 | 0.51 % | 641.47 |
| primer | p | rimer | 445 | 0.28 % | 429.91 | 52 | 0.13 % | 225.57 | 56 | 0.17 % | 189.48 | 244 | 0.37 % | 690.93 | 93 | 0.47 % | 596.56 |
| leta | l | eta | 427 | 0.27 % | 412.52 | 130 | 0.33 % | 563.94 | 63 | 0.19 % | 213.17 | 207 | 0.31 % | 586.15 | 27 | 0.14 % | 173.20 |
| dela | d | ela | 413 | 0.26 % | 398.99 | 63 | 0.16 % | 273.29 | 146 | 0.43 % | 494.01 | 143 | 0.21 % | 404.93 | 61 | 0.31 % | 391.29 |
| stvar | s | tvar | 387 | 0.24 % | 373.87 | 47 | 0.12 % | 203.88 | 84 | 0.25 % | 284.22 | 172 | 0.26 % | 487.05 | 84 | 0.42 % | 538.83 |
| vprašanje | v | prašanje | 377 | 0.24 % | 364.21 | 72 | 0.18 % | 312.33 | 45 | 0.13 % | 152.26 | 211 | 0.32 % | 597.48 | 49 | 0.25 % | 314.32 |
| čas | č | as | 377 | 0.24 % | 364.21 | 122 | 0.31 % | 529.23 | 44 | 0.13 % | 148.88 | 162 | 0.24 % | 458.73 | 49 | 0.25 % | 314.32 |
| problem | p | roblem | 369 | 0.23 % | 356.48 | 32 | 0.08 % | 138.81 | 86 | 0.26 % | 290.99 | 163 | 0.24 % | 461.56 | 88 | 0.45 % | 564.49 |
| časa | č | asa | 361 | 0.23 % | 348.76 | 100 | 0.26 % | 433.80 | 56 | 0.17 % | 189.48 | 146 | 0.22 % | 413.42 | 59 | 0.30 % | 378.46 |
| del | d | el | 351 | 0.22 % | 339.10 | 76 | 0.20 % | 329.69 | 44 | 0.13 % | 148.88 | 193 | 0.29 % | 546.51 | 38 | 0.19 % | 243.76 |
| mislm | m | islm | 323 | 0.20 % | 312.05 | 44 | 0.11 % | 190.87 | 134 | 0.40 % | 453.40 | 53 | 0.08 % | 150.08 | 92 | 0.47 % | 590.15 |
| leto | l | eto | 320 | 0.20 % | 309.15 | 84 | 0.21 % | 364.39 | 76 | 0.23 % | 257.15 | 125 | 0.19 % | 353.96 | 35 | 0.18 % | 224.51 |
| evo | e | vo | 311 | 0.20 % | 300.45 | 180 | 0.46 % | 780.83 | 60 | 0.18 % | 203.02 | 17 | 0.03 % | 48.14 | 54 | 0.27 % | 346.39 |

mislim m islim 310 0.20 % 299.49 61
0.16 % 264.62 102 0.30 % 345.13 77 0.12 % 218.04 70 0.35 % 449.03 dni d ni 305 0.19 % 294.66 70 0.18 % 303.66 112 0.33 % 378.96 85 0.13 % 240.69 38 0.19 % 243.76 delo d elo 302 0.19 % 291.76 35 0.09 % 151.83 29 0.09 % 98.12 171 0.26 % 484.21 67 0.34 % 429.78 ljudi l judi 293 0.18 % 283.06 40 0.10 % 173.52 49 0.14 % 165.80 181 0.27 % 512.53 23 0.12 % 147.54 ljudje l judje 284 0.18 % 274.37 50 0.13 % 216.90 33 0.10 % 111.66 168 0.25 % 475.72 33 0.17 % 211.68 CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 395 File at CLARIN.SI 2.2.52 List of initial character-level 2-grams from noun lower-case word forms in the GOS 1.0 corpus with text- type distributionGOS1.0-word_parts-nouns-lowercase_ forms-initial-2grams-taxonomy-entire.tsvLower-case word form Initial part of the word Rest of the word Total absolute frequen- cy of lower-case word form Percentage of all found lower-case word forms Total relative frequency (per million occur - rences) Absolute frequency [gos.T.J.R] Percentage [gos.T.J.R] Relative frequency [gos.T.J.R] Absolute frequency [gos.T.N.Z] Percentage [gos.T.N.Z] Relative frequency [gos.T.N.Z] Absolute frequency [gos.T.J.I] Percentage [gos.T.J.I] Relative frequency [gos.T.J.I] Absolute frequency [gos.T.N.N] Percentage [gos.T.N.N] Relative frequency [gos.T.N.N] pol po l 2,838 1.79 % 2,741.75 422 1.08 % 1,830.62 1,472 4.38 % 4,980.68 277 0.42 % 784.37 667 3.38 % 4,278.58 sej se j 2,070 1.30 % 1,999.79 252 0.65 % 1,093.17 1,160 3.45 % 3,924.99 242 0.36 % 685.26 416 2.11 % 2,668.50 bistvu bi stvu 965 0.61 % 932.27 162 0.41 % 702.75 218 0.65 % 737.63 229 0.34 % 648.45 356 1.81 % 2,283.62 dan da n 825 0.52 % 797.02 312 0.80 % 1,353.44 241 0.72 % 815.45 160 0.24 % 453.07 112 0.57 % 718.44 hvala hv ala 790 0.50 % 763.21 338 0.87 % 1,466.23 59 0.17 % 199.63 300 0.45 % 849.50 93 0.47 % 596.56 po po 788 0.50 % 761.27 44 0.11 % 190.87 551 1.64 % 1,864.37 39 0.06 % 110.43 154 0.78 % 987.86 redu re du 683 0.43 % 659.84 148 0.38 % 642.02 
114 0.34 % 385.73 220 0.33 % 622.97 201 1.02 % 1,289.35 gospod go spod 648 0.41 % 626.02 31 0.08 % 134.48 5 0.01 % 16.92 598 0.90 % 1,693.34 14 0.07 % 89.81 stvari st vari 588 0.37 % 568.06 95 0.24 % 412.11 97 0.29 % 328.21 255 0.38 % 722.07 141 0.71 % 904.47 strani st rani 566 0.36 % 546.80 128 0.33 % 555.26 77 0.23 % 260.54 283 0.43 % 801.36 78 0.40 % 500.34 jutro ju tro 536 0.34 % 517.82 451 1.16 % 1,956.42 7 0.02 % 23.69 75 0.11 % 212.37 3 0.01 % 19.24 evrov ev rov 532 0.34 % 513.96 148 0.38 % 642.02 142 0.42 % 480.47 161 0.24 % 455.90 81 0.41 % 519.59 let le t 530 0.33 % 512.02 136 0.35 % 589.96 176 0.52 % 595.52 186 0.28 % 526.69 32 0.16 % 205.27 minut mi nut 461 0.29 % 445.36 316 0.81 % 1,370.80 88 0.26 % 297.76 34 0.05 % 96.28 23 0.12 % 147.54 način na čin 448 0.28 % 432.81 63 0.16 % 273.29 34 0.10 % 115.04 251 0.38 % 710.75 100 0.51 % 641.47 primer pr imer 445 0.28 % 429.91 52 0.13 % 225.57 56 0.17 % 189.48 244 0.37 % 690.93 93 0.47 % 596.56 leta le ta 427 0.27 % 412.52 130 0.33 % 563.94 63 0.19 % 213.17 207 0.31 % 586.15 27 0.14 % 173.20 dela de la 413 0.26 % 398.99 63 0.16 % 273.29 146 0.43 % 494.01 143 0.21 % 404.93 61 0.31 % 391.29 stvar st var 387 0.24 % 373.87 47 0.12 % 203.88 84 0.25 % 284.22 172 0.26 % 487.05 84 0.43 % 538.83 vprašanje vp rašanje 377 0.24 % 364.21 72 0.18 % 312.33 45 0.13 % 152.26 211 0.32 % 597.48 49 0.25 % 314.32 čas ča s 377 0.24 % 364.21 122 0.31 % 529.23 44 0.13 % 148.88 162 0.24 % 458.73 49 0.25 % 314.32 problem pr oblem 369 0.23 % 356.48 32 0.08 % 138.81 86 0.26 % 290.99 163 0.25 % 461.56 88 0.45 % 564.49 časa ča sa 361 0.23 % 348.76 100 0.26 % 433.80 56 0.17 % 189.48 146 0.22 % 413.42 59 0.30 % 378.46 del de l 351 0.22 % 339.10 76 0.20 % 329.69 44 0.13 % 148.88 193 0.29 % 546.51 38 0.19 % 243.76 mislm mi slm 323 0.20 % 312.05 44 0.11 % 190.87 134 0.40 % 453.40 53 0.08 % 150.08 92 0.47 % 590.15 leto le to 320 0.20 % 309.15 84 0.21 % 364.39 76 0.23 % 257.15 125 0.19 % 353.96 35 0.18 % 224.51 evo ev o 311 0.20 % 300.45 180 
0.46 % 780.83 60 0.18 % 203.02 17 0.03 % 48.14 54 0.27 % 346.39 mislim mi slim 310 0.20 % 299.49 61 0.16 % 264.62 102 0.30 % 345.13 77 0.12 % 218.04 70 0.35 % 449.03 dni dn i 305 0.19 % 294.66 70 0.18 % 303.66 112 0.33 % 378.96 85 0.13 % 240.69 38 0.19 % 243.76 delo de lo 302 0.19 % 291.76 35 0.09 % 151.83 29 0.09 % 98.12 171 0.26 % 484.21 67 0.34 % 429.78 ljudi lj udi 293 0.18 % 283.06 40 0.10 % 173.52 49 0.15 % 165.80 181 0.27 % 512.53 23 0.12 % 147.54 ljudje lj udje 284 0.18 % 274.37 50 0.13 % 216.90 33 0.10 % 111.66 168 0.25 % 475.72 33 0.17 % 211.68 CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 396 File at CLARIN.SI 2.2.53 List of initial character-level 3-grams from noun lower-case word forms in the GOS 1.0 corpus with text- type distributionGOS1.0-word_parts-nouns-lowercase_ forms-initial-3grams-taxonomy-entire.tsvLower-case word form Initial part of the word Rest of the word Total absolute frequen- cy of lower-case word form Percentage of all found lower-case word forms Total relative frequency (per million occur - rences) Absolute frequency [gos.T.J.R] Percentage [gos.T.J.R] Relative frequency [gos.T.J.R] Absolute frequency [gos.T.N.Z] Percentage [gos.T.N.Z] Relative frequency [gos.T.N.Z] Absolute frequency [gos.T.J.I] Percentage [gos.T.J.I] Relative frequency [gos.T.J.I] Absolute frequency [gos.T.N.N] Percentage [gos.T.N.N] Relative frequency [gos.T.N.N] pol pol 2,838 1.81 % 2,741.75 422 1.09 % 1,830.62 1,472 4.53 % 4,980.68 277 0.42 % 784.37 667 3.45 % 4,278.58 sej sej 2,070 1.32 % 1,999.79 252 0.65 % 1,093.17 1,160 3.57 % 3,924.99 242 0.37 % 685.26 416 2.15 % 2,668.50 bistvu bis tvu 965 0.62 % 932.27 162 0.42 % 702.75 218 0.67 % 737.63 229 0.35 % 648.45 356 1.84 % 2,283.62 dan dan 825 0.53 % 797.02 312 0.81 % 1,353.44 241 0.74 % 815.45 160 0.24 % 453.07 112 0.58 % 718.44 hvala hva la 790 0.50 % 763.21 338 0.87 % 1,466.23 59 0.18 % 199.63 300 0.45 % 849.50 93 0.48 % 596.56 redu red u 683 0.44 % 659.84 148 0.38 % 642.02 
114 0.35 % 385.73 220 0.33 % 622.97 201 1.04 % 1,289.35 gospod gos pod 648 0.41 % 626.02 31 0.08 % 134.48 5 0.01 % 16.92 598 0.91 % 1,693.34 14 0.07 % 89.81 stvari stv ari 588 0.38 % 568.06 95 0.24 % 412.11 97 0.30 % 328.21 255 0.39 % 722.07 141 0.73 % 904.47 strani str ani 566 0.36 % 546.80 128 0.33 % 555.26 77 0.24 % 260.54 283 0.43 % 801.36 78 0.40 % 500.34 jutro jut ro 536 0.34 % 517.82 451 1.16 % 1,956.42 7 0.02 % 23.69 75 0.11 % 212.37 3 0.02 % 19.24 evrov evr ov 532 0.34 % 513.96 148 0.38 % 642.02 142 0.44 % 480.47 161 0.24 % 455.90 81 0.42 % 519.59 let let 530 0.34 % 512.02 136 0.35 % 589.96 176 0.54 % 595.52 186 0.28 % 526.69 32 0.17 % 205.27 minut min ut 461 0.29 % 445.36 316 0.82 % 1,370.80 88 0.27 % 297.76 34 0.05 % 96.28 23 0.12 % 147.54 način nač in 448 0.29 % 432.81 63 0.16 % 273.29 34 0.10 % 115.04 251 0.38 % 710.75 100 0.52 % 641.47 primer pri mer 445 0.28 % 429.91 52 0.13 % 225.57 56 0.17 % 189.48 244 0.37 % 690.93 93 0.48 % 596.56 leta let a 427 0.27 % 412.52 130 0.34 % 563.94 63 0.19 % 213.17 207 0.31 % 586.15 27 0.14 % 173.20 dela del a 413 0.26 % 398.99 63 0.16 % 273.29 146 0.45 % 494.01 143 0.22 % 404.93 61 0.32 % 391.29 stvar stv ar 387 0.25 % 373.87 47 0.12 % 203.88 84 0.26 % 284.22 172 0.26 % 487.05 84 0.43 % 538.83 vprašanje vpr ašanje 377 0.24 % 364.21 72 0.19 % 312.33 45 0.14 % 152.26 211 0.32 % 597.48 49 0.25 % 314.32 čas čas 377 0.24 % 364.21 122 0.32 % 529.23 44 0.14 % 148.88 162 0.24 % 458.73 49 0.25 % 314.32 problem pro blem 369 0.23 % 356.48 32 0.08 % 138.81 86 0.26 % 290.99 163 0.25 % 461.56 88 0.46 % 564.49 časa čas a 361 0.23 % 348.76 100 0.26 % 433.80 56 0.17 % 189.48 146 0.22 % 413.42 59 0.30 % 378.46 del del 351 0.22 % 339.10 76 0.20 % 329.69 44 0.14 % 148.88 193 0.29 % 546.51 38 0.20 % 243.76 mislm mis lm 323 0.21 % 312.05 44 0.11 % 190.87 134 0.41 % 453.40 53 0.08 % 150.08 92 0.47 % 590.15 leto let o 320 0.20 % 309.15 84 0.22 % 364.39 76 0.23 % 257.15 125 0.19 % 353.96 35 0.18 % 224.51 evo evo 311 0.20 % 300.45 180 0.47 % 
780.83 60 0.18 % 203.02 17 0.03 % 48.14 54 0.28 % 346.39 mislim mis lim 310 0.20 % 299.49 61 0.16 % 264.62 102 0.31 % 345.13 77 0.12 % 218.04 70 0.36 % 449.03 dni dni 305 0.20 % 294.66 70 0.18 % 303.66 112 0.34 % 378.96 85 0.13 % 240.69 38 0.20 % 243.76 delo del o 302 0.19 % 291.76 35 0.09 % 151.83 29 0.09 % 98.12 171 0.26 % 484.21 67 0.35 % 429.78 ljudi lju di 293 0.19 % 283.06 40 0.10 % 173.52 49 0.15 % 165.80 181 0.27 % 512.53 23 0.12 % 147.54 ljudje lju dje 284 0.18 % 274.37 50 0.13 % 216.90 33 0.10 % 111.66 168 0.25 % 475.72 33 0.17 % 211.68 sloveniji slo veniji 283 0.18 % 273.40 54 0.14 % 234.25 29 0.09 % 98.12 186 0.28 % 526.69 14 0.07 % 89.81 CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 397 File at CLARIN.SI 2.2.54 List of initial character-level 4-grams from noun lower-case word forms in the GOS 1.0 corpus with text- type distributionGOS1.0-word_parts-nouns-lowercase_ forms-initial-4grams-taxonomy-entire.tsvLower-case word form Initial part of the word Rest of the word Total absolute frequen- cy of lower-case word form Percentage of all found lower-case word forms Total relative frequency (per million occur - rences) Absolute frequency [gos.T.J.R] Percentage [gos.T.J.R] Relative frequency [gos.T.J.R] Absolute frequency [gos.T.N.Z] Percentage [gos.T.N.Z] Relative frequency [gos.T.N.Z] Absolute frequency [gos.T.J.I] Percentage [gos.T.J.I] Relative frequency [gos.T.J.I] Absolute frequency [gos.T.N.N] Percentage [gos.T.N.N] Relative frequency [gos.T.N.N] bistvu bist vu 965 0.68 % 932.27 162 0.46 % 702.75 218 0.82 % 737.63 229 0.36 % 648.45 356 2.09 % 2,283.62 hvala hval a 790 0.56 % 763.21 338 0.96 % 1,466.23 59 0.22 % 199.63 300 0.48 % 849.50 93 0.55 % 596.56 redu redu 683 0.48 % 659.84 148 0.42 % 642.02 114 0.43 % 385.73 220 0.35 % 622.97 201 1.18 % 1,289.35 gospod gosp od 648 0.46 % 626.02 31 0.09 % 134.48 5 0.02 % 16.92 598 0.95 % 1,693.34 14 0.08 % 89.81 stvari stva ri 588 0.42 % 568.06 95 0.27 % 412.11 97 0.36 % 328.21 
255 0.41 % 722.07 141 0.83 % 904.47 strani stra ni 566 0.40 % 546.80 128 0.37 % 555.26 77 0.29 % 260.54 283 0.45 % 801.36 78 0.46 % 500.34 jutro jutr o 536 0.38 % 517.82 451 1.29 % 1,956.42 7 0.03 % 23.69 75 0.12 % 212.37 3 0.02 % 19.24 evrov evro v 532 0.38 % 513.96 148 0.42 % 642.02 142 0.53 % 480.47 161 0.26 % 455.90 81 0.47 % 519.59 minut minu t 461 0.33 % 445.36 316 0.90 % 1,370.80 88 0.33 % 297.76 34 0.05 % 96.28 23 0.14 % 147.54 način nači n 448 0.32 % 432.81 63 0.18 % 273.29 34 0.13 % 115.04 251 0.40 % 710.75 100 0.59 % 641.47 primer prim er 445 0.32 % 429.91 52 0.15 % 225.57 56 0.21 % 189.48 244 0.39 % 690.93 93 0.55 % 596.56 leta leta 427 0.30 % 412.52 130 0.37 % 563.94 63 0.24 % 213.17 207 0.33 % 586.15 27 0.16 % 173.20 dela dela 413 0.29 % 398.99 63 0.18 % 273.29 146 0.55 % 494.01 143 0.23 % 404.93 61 0.36 % 391.29 stvar stva r 387 0.27 % 373.87 47 0.13 % 203.88 84 0.32 % 284.22 172 0.27 % 487.05 84 0.49 % 538.83 vprašanje vpra šanje 377 0.27 % 364.21 72 0.21 % 312.33 45 0.17 % 152.26 211 0.34 % 597.48 49 0.29 % 314.32 problem prob lem 369 0.26 % 356.48 32 0.09 % 138.81 86 0.32 % 290.99 163 0.26 % 461.56 88 0.52 % 564.49 časa časa 361 0.26 % 348.76 100 0.29 % 433.80 56 0.21 % 189.48 146 0.23 % 413.42 59 0.35 % 378.46 mislm misl m 323 0.23 % 312.05 44 0.13 % 190.87 134 0.50 % 453.40 53 0.09 % 150.08 92 0.54 % 590.15 leto leto 320 0.23 % 309.15 84 0.24 % 364.39 76 0.28 % 257.15 125 0.20 % 353.96 35 0.20 % 224.51 mislim misl im 310 0.22 % 299.49 61 0.17 % 264.62 102 0.38 % 345.13 77 0.12 % 218.04 70 0.41 % 449.03 delo delo 302 0.21 % 291.76 35 0.10 % 151.83 29 0.11 % 98.12 171 0.27 % 484.21 67 0.39 % 429.78 ljudi ljud i 293 0.21 % 283.06 40 0.11 % 173.52 49 0.18 % 165.80 181 0.29 % 512.53 23 0.14 % 147.54 ljudje ljud je 284 0.20 % 274.37 50 0.14 % 216.90 33 0.12 % 111.66 168 0.27 % 475.72 33 0.19 % 211.68 sloveniji slov eniji 283 0.20 % 273.40 54 0.15 % 234.25 29 0.11 % 98.12 186 0.30 % 526.69 14 0.08 % 89.81 primeru prim eru 238 0.17 % 229.93 30 0.09 % 
130.14 14 0.05 % 47.37 128 0.20 % 362.45 66 0.39 % 423.37 stran stra n 229 0.16 % 221.23 38 0.11 % 164.84 71 0.27 % 240.24 82 0.13 % 232.20 38 0.22 % 243.76 vode vode 227 0.16 % 219.30 23 0.07 % 99.77 19 0.07 % 64.29 155 0.25 % 438.91 30 0.18 % 192.44 koncu konc u 226 0.16 % 218.33 66 0.19 % 286.31 27 0.10 % 91.36 104 0.17 % 294.49 29 0.17 % 186.03 času času 215 0.15 % 207.71 47 0.13 % 203.88 9 0.03 % 30.45 136 0.22 % 385.11 23 0.14 % 147.54 človek člov ek 207 0.15 % 199.98 58 0.17 % 251.60 17 0.06 % 57.52 101 0.16 % 286 31 0.18 % 198.85 večer veče r 206 0.15 % 199.01 104 0.30 % 451.15 11 0.04 % 37.22 77 0.12 % 218.04 14 0.08 % 89.81 mesto mest o 197 0.14 % 190.32 100 0.29 % 433.80 27 0.10 % 91.36 68 0.11 % 192.55 2 0.01 % 12.83 CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 398 File at CLARIN.SI 2.2.55 List of initial character-level 5-grams from noun lower-case word forms in the GOS 1.0 corpus with text- type distributionGOS1.0-word_parts-nouns-lowercase_ forms-initial-5grams-taxonomy-entire.tsvLower-case word form Initial part of the word Rest of the word Total absolute frequen- cy of lower-case word form Percentage of all found lower-case word forms Total relative frequency (per million occur - rences) Absolute frequency [gos.T.J.R] Percentage [gos.T.J.R] Relative frequency [gos.T.J.R] Absolute frequency [gos.T.N.Z] Percentage [gos.T.N.Z] Relative frequency [gos.T.N.Z] Absolute frequency [gos.T.J.I] Percentage [gos.T.J.I] Relative frequency [gos.T.J.I] Absolute frequency [gos.T.N.N] Percentage [gos.T.N.N] Relative frequency [gos.T.N.N] bistvu bistv u 965 0.80 % 932.27 162 0.55 % 702.75 218 1.06 % 737.63 229 0.41 % 648.45 356 2.48 % 2,283.62 hvala hvala 790 0.66 % 763.21 338 1.14 % 1,466.23 59 0.29 % 199.63 300 0.54 % 849.50 93 0.65 % 596.56 gospod gospo d 648 0.54 % 626.02 31 0.10 % 134.48 5 0.02 % 16.92 598 1.07 % 1,693.34 14 0.10 % 89.81 stvari stvar i 588 0.49 % 568.06 95 0.32 % 412.11 97 0.47 % 328.21 255 0.46 % 722.07 141 
0.98 % 904.47 strani stran i 566 0.47 % 546.80 128 0.43 % 555.26 77 0.38 % 260.54 283 0.51 % 801.36 78 0.54 % 500.34 jutro jutro 536 0.45 % 517.82 451 1.52 % 1,956.42 7 0.03 % 23.69 75 0.14 % 212.37 3 0.02 % 19.24 evrov evrov 532 0.44 % 513.96 148 0.50 % 642.02 142 0.69 % 480.47 161 0.29 % 455.90 81 0.56 % 519.59 minut minut 461 0.38 % 445.36 316 1.06 % 1,370.80 88 0.43 % 297.76 34 0.06 % 96.28 23 0.16 % 147.54 način način 448 0.37 % 432.81 63 0.21 % 273.29 34 0.17 % 115.04 251 0.45 % 710.75 100 0.70 % 641.47 primer prime r 445 0.37 % 429.91 52 0.17 % 225.57 56 0.27 % 189.48 244 0.44 % 690.93 93 0.65 % 596.56 stvar stvar 387 0.32 % 373.87 47 0.16 % 203.88 84 0.41 % 284.22 172 0.31 % 487.05 84 0.58 % 538.83 vprašanje vpraš anje 377 0.31 % 364.21 72 0.24 % 312.33 45 0.22 % 152.26 211 0.38 % 597.48 49 0.34 % 314.32 problem probl em 369 0.31 % 356.48 32 0.11 % 138.81 86 0.42 % 290.99 163 0.29 % 461.56 88 0.61 % 564.49 mislm mislm 323 0.27 % 312.05 44 0.15 % 190.87 134 0.65 % 453.40 53 0.10 % 150.08 92 0.64 % 590.15 mislim misli m 310 0.26 % 299.49 61 0.21 % 264.62 102 0.50 % 345.13 77 0.14 % 218.04 70 0.49 % 449.03 ljudi ljudi 293 0.24 % 283.06 40 0.14 % 173.52 49 0.24 % 165.80 181 0.33 % 512.53 23 0.16 % 147.54 ljudje ljudj e 284 0.24 % 274.37 50 0.17 % 216.90 33 0.16 % 111.66 168 0.30 % 475.72 33 0.23 % 211.68 sloveniji slove niji 283 0.23 % 273.40 54 0.18 % 234.25 29 0.14 % 98.12 186 0.33 % 526.69 14 0.10 % 89.81 primeru prime ru 238 0.20 % 229.93 30 0.10 % 130.14 14 0.07 % 47.37 128 0.23 % 362.45 66 0.46 % 423.37 stran stran 229 0.19 % 221.23 38 0.13 % 164.84 71 0.35 % 240.24 82 0.15 % 232.20 38 0.26 % 243.76 koncu koncu 226 0.19 % 218.33 66 0.22 % 286.31 27 0.13 % 91.36 104 0.19 % 294.49 29 0.20 % 186.03 človek člove k 207 0.17 % 199.98 58 0.20 % 251.60 17 0.08 % 57.52 101 0.18 % 286 31 0.22 % 198.85 večer večer 206 0.17 % 199.01 104 0.35 % 451.15 11 0.05 % 37.22 77 0.14 % 218.04 14 0.10 % 89.81 mesto mesto 197 0.16 % 190.32 100 0.34 % 433.80 27 0.13 % 91.36 68 
0.12 % 192.55 2 0.01 % 12.83 pizda pizda 197 0.16 % 190.32 7 0.02 % 30.37 187 0.91 % 632.74 0 0 % 0 3 0.02 % 19.24 država držav a 192 0.16 % 185.49 7 0.02 % 30.37 16 0.08 % 54.14 161 0.29 % 455.90 8 0.06 % 51.32 teden teden 187 0.15 % 180.66 53 0.18 % 229.91 58 0.28 % 196.25 38 0.07 % 107.60 38 0.26 % 243.76 slovenija slove nija 186 0.15 % 179.69 33 0.11 % 143.15 5 0.02 % 16.92 143 0.26 % 404.93 5 0.04 % 32.07 gospa gospa 176 0.15 % 170.03 34 0.12 % 147.49 12 0.06 % 40.60 117 0.21 % 331.30 13 0.09 % 83.39 stopinj stopi nj 168 0.14 % 162.30 32 0.11 % 138.81 16 0.08 % 54.14 46 0.08 % 130.26 74 0.52 % 474.68 svetu svetu 168 0.14 % 162.30 41 0.14 % 177.86 8 0.04 % 27.07 111 0.20 % 314.31 8 0.06 % 51.32 misim misim 167 0.14 % 161.34 25 0.08 % 108.45 75 0.36 % 253.77 30 0.05 % 84.95 37 0.26 % 237.34 CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 399 File at CLARIN.SI 2.2.56 List of final character-level 1-grams from noun lower-case word forms in the GOS 1.0 corpus with text- type distributionGOS1.0-word_parts-nouns-lowercase_ forms-final-1grams-taxonomy-entire.tsvLower-case word form Rest of the word Final part of the word Total absolute frequen- cy of lower-case word form Percentage of all found lower-case word forms Total relative frequency (per million occur - rences) Absolute frequency [gos.T.J.R] Percentage [gos.T.J.R] Relative frequency [gos.T.J.R] Absolute frequency [gos.T.N.Z] Percentage [gos.T.N.Z] Relative frequency [gos.T.N.Z] Absolute frequency [gos.T.J.I] Percentage [gos.T.J.I] Relative frequency [gos.T.J.I] Absolute frequency [gos.T.N.N] Percentage [gos.T.N.N] Relative frequency [gos.T.N.N] pol po l 2,838 1.78 % 2,741.75 422 1.08 % 1,830.62 1,472 4.36 % 4,980.68 277 0.42 % 784.37 667 3.38 % 4,278.58 sej se j 2,070 1.30 % 1,999.79 252 0.65 % 1,093.17 1,160 3.44 % 3,924.99 242 0.36 % 685.26 416 2.10 % 2,668.50 bistvu bistv u 965 0.61 % 932.27 162 0.41 % 702.75 218 0.65 % 737.63 229 0.34 % 648.45 356 1.80 % 2,283.62 dan da n 
825 0.52 % 797.02 312 0.80 % 1,353.44 241 0.71 % 815.45 160 0.24 % 453.07 112 0.57 % 718.44 hvala hval a 790 0.50 % 763.21 338 0.86 % 1,466.23 59 0.17 % 199.63 300 0.45 % 849.50 93 0.47 % 596.56 po p o 788 0.49 % 761.27 44 0.11 % 190.87 551 1.63 % 1,864.37 39 0.06 % 110.43 154 0.78 % 987.86 redu red u 683 0.43 % 659.84 148 0.38 % 642.02 114 0.34 % 385.73 220 0.33 % 622.97 201 1.02 % 1,289.35 gospod gospo d 648 0.41 % 626.02 31 0.08 % 134.48 5 0.01 % 16.92 598 0.90 % 1,693.34 14 0.07 % 89.81 stvari stvar i 588 0.37 % 568.06 95 0.24 % 412.11 97 0.29 % 328.21 255 0.38 % 722.07 141 0.71 % 904.47 strani stran i 566 0.36 % 546.80 128 0.33 % 555.26 77 0.23 % 260.54 283 0.42 % 801.36 78 0.40 % 500.34 jutro jutr o 536 0.34 % 517.82 451 1.15 % 1,956.42 7 0.02 % 23.69 75 0.11 % 212.37 3 0.01 % 19.24 evrov evro v 532 0.33 % 513.96 148 0.38 % 642.02 142 0.42 % 480.47 161 0.24 % 455.90 81 0.41 % 519.59 let le t 530 0.33 % 512.02 136 0.35 % 589.96 176 0.52 % 595.52 186 0.28 % 526.69 32 0.16 % 205.27 minut minu t 461 0.29 % 445.36 316 0.81 % 1,370.80 88 0.26 % 297.76 34 0.05 % 96.28 23 0.12 % 147.54 način nači n 448 0.28 % 432.81 63 0.16 % 273.29 34 0.10 % 115.04 251 0.38 % 710.75 100 0.51 % 641.47 primer prime r 445 0.28 % 429.91 52 0.13 % 225.57 56 0.17 % 189.48 244 0.37 % 690.93 93 0.47 % 596.56 leta let a 427 0.27 % 412.52 130 0.33 % 563.94 63 0.19 % 213.17 207 0.31 % 586.15 27 0.14 % 173.20 dela del a 413 0.26 % 398.99 63 0.16 % 273.29 146 0.43 % 494.01 143 0.21 % 404.93 61 0.31 % 391.29 stvar stva r 387 0.24 % 373.87 47 0.12 % 203.88 84 0.25 % 284.22 172 0.26 % 487.05 84 0.42 % 538.83 vprašanje vprašanj e 377 0.24 % 364.21 72 0.18 % 312.33 45 0.13 % 152.26 211 0.32 % 597.48 49 0.25 % 314.32 čas ča s 377 0.24 % 364.21 122 0.31 % 529.23 44 0.13 % 148.88 162 0.24 % 458.73 49 0.25 % 314.32 problem proble m 369 0.23 % 356.48 32 0.08 % 138.81 86 0.26 % 290.99 163 0.24 % 461.56 88 0.45 % 564.49 časa čas a 361 0.23 % 348.76 100 0.26 % 433.80 56 0.17 % 189.48 146 0.22 % 413.42 59 
0.30 % 378.46 del de l 351 0.22 % 339.10 76 0.20 % 329.69 44 0.13 % 148.88 193 0.29 % 546.51 38 0.19 % 243.76 mislm misl m 323 0.20 % 312.05 44 0.11 % 190.87 134 0.40 % 453.40 53 0.08 % 150.08 92 0.47 % 590.15 leto let o 320 0.20 % 309.15 84 0.21 % 364.39 76 0.23 % 257.15 125 0.19 % 353.96 35 0.18 % 224.51 evo ev o 311 0.20 % 300.45 180 0.46 % 780.83 60 0.18 % 203.02 17 0.03 % 48.14 54 0.27 % 346.39 mislim misli m 310 0.20 % 299.49 61 0.16 % 264.62 102 0.30 % 345.13 77 0.12 % 218.04 70 0.35 % 449.03 dni dn i 305 0.19 % 294.66 70 0.18 % 303.66 112 0.33 % 378.96 85 0.13 % 240.69 38 0.19 % 243.76 delo del o 302 0.19 % 291.76 35 0.09 % 151.83 29 0.09 % 98.12 171 0.26 % 484.21 67 0.34 % 429.78 ljudi ljud i 293 0.18 % 283.06 40 0.10 % 173.52 49 0.14 % 165.80 181 0.27 % 512.53 23 0.12 % 147.54 ljudje ljudj e 284 0.18 % 274.37 50 0.13 % 216.90 33 0.10 % 111.66 168 0.25 % 475.72 33 0.17 % 211.68 CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 400 File at CLARIN.SI 2.2.57 List of final character-level 2-grams from noun lower-case word forms in the GOS 1.0 corpus with text- type distributionGOS1.0-word_parts-nouns-lowercase_ forms-final-2grams-taxonomy-entire.tsvLower-case word form Rest of the word Final part of the word Total absolute frequen- cy of lower-case word form Percentage of all found lower-case word forms Total relative frequency (per million occur - rences) Absolute frequency [gos.T.J.R] Percentage [gos.T.J.R] Relative frequency [gos.T.J.R] Absolute frequency [gos.T.N.Z] Percentage [gos.T.N.Z] Relative frequency [gos.T.N.Z] Absolute frequency [gos.T.J.I] Percentage [gos.T.J.I] Relative frequency [gos.T.J.I] Absolute frequency [gos.T.N.N] Percentage [gos.T.N.N] Relative frequency [gos.T.N.N] pol p ol 2,838 1.79 % 2,741.75 422 1.08 % 1,830.62 1,472 4.38 % 4,980.68 277 0.42 % 784.37 667 3.38 % 4,278.58 sej s ej 2,070 1.30 % 1,999.79 252 0.65 % 1,093.17 1,160 3.45 % 3,924.99 242 0.36 % 685.26 416 2.11 % 2,668.50 bistvu bist vu 965 0.61 
% 932.27 162 0.41 % 702.75 218 0.65 % 737.63 229 0.34 % 648.45 356 1.81 % 2,283.62 dan d an 825 0.52 % 797.02 312 0.80 % 1,353.44 241 0.72 % 815.45 160 0.24 % 453.07 112 0.57 % 718.44 hvala hva la 790 0.50 % 763.21 338 0.87 % 1,466.23 59 0.17 % 199.63 300 0.45 % 849.50 93 0.47 % 596.56 po po 788 0.50 % 761.27 44 0.11 % 190.87 551 1.64 % 1,864.37 39 0.06 % 110.43 154 0.78 % 987.86 redu re du 683 0.43 % 659.84 148 0.38 % 642.02 114 0.34 % 385.73 220 0.33 % 622.97 201 1.02 % 1,289.35 gospod gosp od 648 0.41 % 626.02 31 0.08 % 134.48 5 0.01 % 16.92 598 0.90 % 1,693.34 14 0.07 % 89.81 stvari stva ri 588 0.37 % 568.06 95 0.24 % 412.11 97 0.29 % 328.21 255 0.38 % 722.07 141 0.71 % 904.47 strani stra ni 566 0.36 % 546.80 128 0.33 % 555.26 77 0.23 % 260.54 283 0.43 % 801.36 78 0.40 % 500.34 jutro jut ro 536 0.34 % 517.82 451 1.16 % 1,956.42 7 0.02 % 23.69 75 0.11 % 212.37 3 0.01 % 19.24 evrov evr ov 532 0.34 % 513.96 148 0.38 % 642.02 142 0.42 % 480.47 161 0.24 % 455.90 81 0.41 % 519.59 let l et 530 0.33 % 512.02 136 0.35 % 589.96 176 0.52 % 595.52 186 0.28 % 526.69 32 0.16 % 205.27 minut min ut 461 0.29 % 445.36 316 0.81 % 1,370.80 88 0.26 % 297.76 34 0.05 % 96.28 23 0.12 % 147.54 način nač in 448 0.28 % 432.81 63 0.16 % 273.29 34 0.10 % 115.04 251 0.38 % 710.75 100 0.51 % 641.47 primer prim er 445 0.28 % 429.91 52 0.13 % 225.57 56 0.17 % 189.48 244 0.37 % 690.93 93 0.47 % 596.56 leta le ta 427 0.27 % 412.52 130 0.33 % 563.94 63 0.19 % 213.17 207 0.31 % 586.15 27 0.14 % 173.20 dela de la 413 0.26 % 398.99 63 0.16 % 273.29 146 0.43 % 494.01 143 0.21 % 404.93 61 0.31 % 391.29 stvar stv ar 387 0.24 % 373.87 47 0.12 % 203.88 84 0.25 % 284.22 172 0.26 % 487.05 84 0.43 % 538.83 vprašanje vprašan je 377 0.24 % 364.21 72 0.18 % 312.33 45 0.13 % 152.26 211 0.32 % 597.48 49 0.25 % 314.32 čas č as 377 0.24 % 364.21 122 0.31 % 529.23 44 0.13 % 148.88 162 0.24 % 458.73 49 0.25 % 314.32 problem probl em 369 0.23 % 356.48 32 0.08 % 138.81 86 0.26 % 290.99 163 0.25 % 461.56 88 0.45 % 
564.49 časa ča sa 361 0.23 % 348.76 100 0.26 % 433.80 56 0.17 % 189.48 146 0.22 % 413.42 59 0.30 % 378.46 del d el 351 0.22 % 339.10 76 0.20 % 329.69 44 0.13 % 148.88 193 0.29 % 546.51 38 0.19 % 243.76 mislm mis lm 323 0.20 % 312.05 44 0.11 % 190.87 134 0.40 % 453.40 53 0.08 % 150.08 92 0.47 % 590.15 leto le to 320 0.20 % 309.15 84 0.21 % 364.39 76 0.23 % 257.15 125 0.19 % 353.96 35 0.18 % 224.51 evo e vo 311 0.20 % 300.45 180 0.46 % 780.83 60 0.18 % 203.02 17 0.03 % 48.14 54 0.27 % 346.39 mislim misl im 310 0.20 % 299.49 61 0.16 % 264.62 102 0.30 % 345.13 77 0.12 % 218.04 70 0.35 % 449.03 dni d ni 305 0.19 % 294.66 70 0.18 % 303.66 112 0.33 % 378.96 85 0.13 % 240.69 38 0.19 % 243.76 delo de lo 302 0.19 % 291.76 35 0.09 % 151.83 29 0.09 % 98.12 171 0.26 % 484.21 67 0.34 % 429.78 ljudi lju di 293 0.18 % 283.06 40 0.10 % 173.52 49 0.15 % 165.80 181 0.27 % 512.53 23 0.12 % 147.54 ljudje ljud je 284 0.18 % 274.37 50 0.13 % 216.90 33 0.10 % 111.66 168 0.25 % 475.72 33 0.17 % 211.68 CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 401 File at CLARIN.SI 2.2.58 List of final character-level 3-grams from noun lower-case word forms in the GOS 1.0 corpus with text- type distributionGOS1.0-word_parts-nouns-lowercase_ forms-final-3grams-taxonomy-entire.tsvLower-case word form Rest of the word Final part of the word Total absolute frequen- cy of lower-case word form Percentage of all found lower-case word forms Total relative frequency (per million occur - rences) Absolute frequency [gos.T.J.R] Percentage [gos.T.J.R] Relative frequency [gos.T.J.R] Absolute frequency [gos.T.N.Z] Percentage [gos.T.N.Z] Relative frequency [gos.T.N.Z] Absolute frequency [gos.T.J.I] Percentage [gos.T.J.I] Relative frequency [gos.T.J.I] Absolute frequency [gos.T.N.N] Percentage [gos.T.N.N] Relative frequency [gos.T.N.N] pol pol 2,838 1.81 % 2,741.75 422 1.09 % 1,830.62 1,472 4.53 % 4,980.68 277 0.42 % 784.37 667 3.45 % 4,278.58 sej sej 2,070 1.32 % 1,999.79 252 0.65 % 
1,093.17 1,160 3.57 % 3,924.99 242 0.37 % 685.26 416 2.15 % 2,668.50 bistvu bis tvu 965 0.62 % 932.27 162 0.42 % 702.75 218 0.67 % 737.63 229 0.35 % 648.45 356 1.84 % 2,283.62 dan dan 825 0.53 % 797.02 312 0.81 % 1,353.44 241 0.74 % 815.45 160 0.24 % 453.07 112 0.58 % 718.44 hvala hv ala 790 0.50 % 763.21 338 0.87 % 1,466.23 59 0.18 % 199.63 300 0.45 % 849.50 93 0.48 % 596.56 redu r edu 683 0.44 % 659.84 148 0.38 % 642.02 114 0.35 % 385.73 220 0.33 % 622.97 201 1.04 % 1,289.35 gospod gos pod 648 0.41 % 626.02 31 0.08 % 134.48 5 0.01 % 16.92 598 0.91 % 1,693.34 14 0.07 % 89.81 stvari stv ari 588 0.38 % 568.06 95 0.24 % 412.11 97 0.30 % 328.21 255 0.39 % 722.07 141 0.73 % 904.47 strani str ani 566 0.36 % 546.80 128 0.33 % 555.26 77 0.24 % 260.54 283 0.43 % 801.36 78 0.40 % 500.34 jutro ju tro 536 0.34 % 517.82 451 1.16 % 1,956.42 7 0.02 % 23.69 75 0.11 % 212.37 3 0.02 % 19.24 evrov ev rov 532 0.34 % 513.96 148 0.38 % 642.02 142 0.44 % 480.47 161 0.24 % 455.90 81 0.42 % 519.59 let let 530 0.34 % 512.02 136 0.35 % 589.96 176 0.54 % 595.52 186 0.28 % 526.69 32 0.17 % 205.27 minut mi nut 461 0.29 % 445.36 316 0.82 % 1,370.80 88 0.27 % 297.76 34 0.05 % 96.28 23 0.12 % 147.54 način na čin 448 0.29 % 432.81 63 0.16 % 273.29 34 0.10 % 115.04 251 0.38 % 710.75 100 0.52 % 641.47 primer pri mer 445 0.28 % 429.91 52 0.13 % 225.57 56 0.17 % 189.48 244 0.37 % 690.93 93 0.48 % 596.56 leta l eta 427 0.27 % 412.52 130 0.34 % 563.94 63 0.19 % 213.17 207 0.31 % 586.15 27 0.14 % 173.20 dela d ela 413 0.26 % 398.99 63 0.16 % 273.29 146 0.45 % 494.01 143 0.22 % 404.93 61 0.32 % 391.29 stvar st var 387 0.25 % 373.87 47 0.12 % 203.88 84 0.26 % 284.22 172 0.26 % 487.05 84 0.43 % 538.83 vprašanje vpraša nje 377 0.24 % 364.21 72 0.19 % 312.33 45 0.14 % 152.26 211 0.32 % 597.48 49 0.25 % 314.32 čas čas 377 0.24 % 364.21 122 0.32 % 529.23 44 0.14 % 148.88 162 0.24 % 458.73 49 0.25 % 314.32 problem prob lem 369 0.23 % 356.48 32 0.08 % 138.81 86 0.26 % 290.99 163 0.25 % 461.56 88 0.46 % 564.49 
| časa | č | asa | 361 | 0.23 % | 348.76 | 100 | 0.26 % | 433.80 | 56 | 0.17 % | 189.48 | 146 | 0.22 % | 413.42 | 59 | 0.30 % | 378.46 |
| del |  | del | 351 | 0.22 % | 339.10 | 76 | 0.20 % | 329.69 | 44 | 0.14 % | 148.88 | 193 | 0.29 % | 546.51 | 38 | 0.20 % | 243.76 |
| mislm | mi | slm | 323 | 0.21 % | 312.05 | 44 | 0.11 % | 190.87 | 134 | 0.41 % | 453.40 | 53 | 0.08 % | 150.08 | 92 | 0.47 % | 590.15 |
| leto | l | eto | 320 | 0.20 % | 309.15 | 84 | 0.22 % | 364.39 | 76 | 0.23 % | 257.15 | 125 | 0.19 % | 353.96 | 35 | 0.18 % | 224.51 |
| evo |  | evo | 311 | 0.20 % | 300.45 | 180 | 0.47 % | 780.83 | 60 | 0.18 % | 203.02 | 17 | 0.03 % | 48.14 | 54 | 0.28 % | 346.39 |
| mislim | mis | lim | 310 | 0.20 % | 299.49 | 61 | 0.16 % | 264.62 | 102 | 0.31 % | 345.13 | 77 | 0.12 % | 218.04 | 70 | 0.36 % | 449.03 |
| dni |  | dni | 305 | 0.20 % | 294.66 | 70 | 0.18 % | 303.66 | 112 | 0.34 % | 378.96 | 85 | 0.13 % | 240.69 | 38 | 0.20 % | 243.76 |
| delo | d | elo | 302 | 0.19 % | 291.76 | 35 | 0.09 % | 151.83 | 29 | 0.09 % | 98.12 | 171 | 0.26 % | 484.21 | 67 | 0.35 % | 429.78 |
| ljudi | lj | udi | 293 | 0.19 % | 283.06 | 40 | 0.10 % | 173.52 | 49 | 0.15 % | 165.80 | 181 | 0.27 % | 512.53 | 23 | 0.12 % | 147.54 |
| ljudje | lju | dje | 284 | 0.18 % | 274.37 | 50 | 0.13 % | 216.90 | 33 | 0.10 % | 111.66 | 168 | 0.25 % | 475.72 | 33 | 0.17 % | 211.68 |
| sloveniji | sloven | iji | 283 | 0.18 % | 273.40 | 54 | 0.14 % | 234.25 | 29 | 0.09 % | 98.12 | 186 | 0.28 % | 526.69 | 14 | 0.07 % | 89.81 |

2.2.59 List of final character-level 4-grams from noun lower-case word forms in the GOS 1.0 corpus with text-type distribution

File at CLARIN.SI: GOS1.0-word_parts-nouns-lowercase_forms-final-4grams-taxonomy-entire.tsv

| Lower-case word form | Rest of the word | Final part of the word | Total absolute frequency of lower-case word form | Percentage of all found lower-case word forms | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N] |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| bistvu | bi | stvu | 965 | 0.68 % | 932.27 | 162 | 0.46 % | 702.75 | 218 | 0.82 % | 737.63 | 229 | 0.36 % | 648.45 | 356 | 2.09 % | 2,283.62 |
| hvala | h | vala | 790 | 0.56 % | 763.21 | 338 | 0.96 % | 1,466.23 | 59 | 0.22 % | 199.63 | 300 | 0.48 % | 849.50 | 93 | 0.55 % | 596.56 |
| redu |  | redu | 683 | 0.48 % | 659.84 | 148 | 0.42 % | 642.02 | 114 | 0.43 % | 385.73 | 220 | 0.35 % | 622.97 | 201 | 1.18 % | 1,289.35 |
| gospod | go | spod | 648 | 0.46 % | 626.02 | 31 | 0.09 % | 134.48 | 5 | 0.02 % | 16.92 | 598 | 0.95 % | 1,693.34 | 14 | 0.08 % | 89.81 |
| stvari | st | vari | 588 | 0.42 % | 568.06 | 95 | 0.27 % | 412.11 | 97 | 0.36 % | 328.21 | 255 | 0.41 % | 722.07 | 141 | 0.83 % | 904.47 |
| strani | st | rani | 566 | 0.40 % | 546.80 | 128 | 0.37 % | 555.26 | 77 | 0.29 % | 260.54 | 283 | 0.45 % | 801.36 | 78 | 0.46 % | 500.34 |
| jutro | j | utro | 536 | 0.38 % | 517.82 | 451 | 1.29 % | 1,956.42 | 7 | 0.03 % | 23.69 | 75 | 0.12 % | 212.37 | 3 | 0.02 % | 19.24 |
| evrov | e | vrov | 532 | 0.38 % | 513.96 | 148 | 0.42 % | 642.02 | 142 | 0.53 % | 480.47 | 161 | 0.26 % | 455.90 | 81 | 0.47 % | 519.59 |
| minut | m | inut | 461 | 0.33 % | 445.36 | 316 | 0.90 % | 1,370.80 | 88 | 0.33 % | 297.76 | 34 | 0.05 % | 96.28 | 23 | 0.14 % | 147.54 |
| način | n | ačin | 448 | 0.32 % | 432.81 | 63 | 0.18 % | 273.29 | 34 | 0.13 % | 115.04 | 251 | 0.40 % | 710.75 | 100 | 0.59 % | 641.47 |
| primer | pr | imer | 445 | 0.32 % | 429.91 | 52 | 0.15 % | 225.57 | 56 | 0.21 % | 189.48 | 244 | 0.39 % | 690.93 | 93 | 0.55 % | 596.56 |
| leta |  | leta | 427 | 0.30 % | 412.52 | 130 | 0.37 % | 563.94 | 63 | 0.24 % | 213.17 | 207 | 0.33 % | 586.15 | 27 | 0.16 % | 173.20 |
| dela |  | dela | 413 | 0.29 % | 398.99 | 63 | 0.18 % | 273.29 | 146 | 0.55 % | 494.01 | 143 | 0.23 % | 404.93 | 61 | 0.36 % | 391.29 |
| stvar | s | tvar | 387 | 0.27 % | 373.87 | 47 | 0.13 % | 203.88 | 84 | 0.32 % | 284.22 | 172 | 0.27 % | 487.05 | 84 | 0.49 % | 538.83 |
| vprašanje | vpraš | anje | 377 | 0.27 % | 364.21 | 72 | 0.21 % | 312.33 | 45 | 0.17 % | 152.26 | 211 | 0.34 % | 597.48 | 49 | 0.29 % | 314.32 |
| problem | pro | blem | 369 | 0.26 % | 356.48 | 32 | 0.09 % | 138.81 | 86 | 0.32 % | 290.99 | 163 | 0.26 % | 461.56 | 88 | 0.52 % | 564.49 |
| časa |  | časa | 361 | 0.26 % | 348.76 | 100 | 0.29 % | 433.80 | 56 | 0.21 % | 189.48 | 146 | 0.23 % | 413.42 | 59 | 0.35 % | 378.46 |
| mislm | m | islm | 323 | 0.23 % | 312.05 | 44 | 0.13 % | 190.87 | 134 | 0.50 % | 453.40 | 53 | 0.09 % | 150.08 | 92 | 0.54 % | 590.15 |
| leto |  | leto | 320 | 0.23 % | 309.15 | 84 | 0.24 % | 364.39 | 76 | 0.28 % | 257.15 | 125 | 0.20 % | 353.96 | 35 | 0.20 % | 224.51 |
| mislim | mi | slim | 310 | 0.22 % | 299.49 | 61 | 0.17 % | 264.62 | 102 | 0.38 % | 345.13 | 77 | 0.12 % | 218.04 | 70 | 0.41 % | 449.03 |
| delo |  | delo | 302 | 0.21 % | 291.76 | 35 | 0.10 % | 151.83 | 29 | 0.11 % | 98.12 | 171 | 0.27 % | 484.21 | 67 | 0.39 % | 429.78 |
| ljudi | l | judi | 293 | 0.21 % | 283.06 | 40 | 0.11 % | 173.52 | 49 | 0.18 % | 165.80 | 181 | 0.29 % | 512.53 | 23 | 0.14 % | 147.54 |
| ljudje | lj | udje | 284 | 0.20 % | 274.37 | 50 | 0.14 % | 216.90 | 33 | 0.12 % | 111.66 | 168 | 0.27 % | 475.72 | 33 | 0.19 % | 211.68 |
| sloveniji | slove | niji | 283 | 0.20 % | 273.40 | 54 | 0.15 % | 234.25 | 29 | 0.11 % | 98.12 | 186 | 0.30 % | 526.69 | 14 | 0.08 % | 89.81 |
| primeru | pri | meru | 238 | 0.17 % | 229.93 | 30 | 0.09 % | 130.14 | 14 | 0.05 % | 47.37 | 128 | 0.20 % | 362.45 | 66 | 0.39 % | 423.37 |
| stran | s | tran | 229 | 0.16 % | 221.23 | 38 | 0.11 % | 164.84 | 71 | 0.27 % | 240.24 | 82 | 0.13 % | 232.20 | 38 | 0.22 % | 243.76 |
| vode |  | vode | 227 | 0.16 % | 219.30 | 23 | 0.07 % | 99.77 | 19 | 0.07 % | 64.29 | 155 | 0.25 % | 438.91 | 30 | 0.18 % | 192.44 |
| koncu | k | oncu | 226 | 0.16 % | 218.33 | 66 | 0.19 % | 286.31 | 27 | 0.10 % | 91.36 | 104 | 0.17 % | 294.49 | 29 | 0.17 % | 186.03 |
| času |  | času | 215 | 0.15 % | 207.71 | 47 | 0.13 % | 203.88 | 9 | 0.03 % | 30.45 | 136 | 0.22 % | 385.11 | 23 | 0.14 % | 147.54 |
| človek | čl | ovek | 207 | 0.15 % | 199.98 | 58 | 0.17 % | 251.60 | 17 | 0.06 % | 57.52 | 101 | 0.16 % | 286 | 31 | 0.18 % | 198.85 |
| večer | v | ečer | 206 | 0.15 % | 199.01 | 104 | 0.30 % | 451.15 | 11 | 0.04 % | 37.22 | 77 | 0.12 % | 218.04 | 14 | 0.08 % | 89.81 |
| mesto | m | esto | 197 | 0.14 % | 190.32 | 100 | 0.29 % | 433.80 | 27 | 0.10 % | 91.36 | 68 | 0.11 % | 192.55 | 2 | 0.01 % | 12.83 |

2.2.60 List of final character-level 5-grams from noun lower-case word forms in the GOS 1.0 corpus with text-type distribution

File at CLARIN.SI: GOS1.0-word_parts-nouns-lowercase_forms-final-5grams-taxonomy-entire.tsv

| Lower-case word form | Rest of the word | Final part of the word | Total absolute frequency of lower-case word form | Percentage of all found lower-case word forms | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N] |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| bistvu | b | istvu | 965 | 0.80 % | 932.27 | 162 | 0.55 % | 702.75 | 218 | 1.06 % | 737.63 | 229 | 0.41 % | 648.45 | 356 | 2.48 % | 2,283.62 |
| hvala |  | hvala | 790 | 0.66 % | 763.21 | 338 | 1.14 % | 1,466.23 | 59 | 0.29 % | 199.63 | 300 | 0.54 % | 849.50 | 93 | 0.65 % | 596.56 |
| gospod | g | ospod | 648 | 0.54 % | 626.02 | 31 | 0.10 % | 134.48 | 5 | 0.02 % | 16.92 | 598 | 1.07 % | 1,693.34 | 14 | 0.10 % | 89.81 |
| stvari | s | tvari | 588 | 0.49 % | 568.06 | 95 | 0.32 % | 412.11 | 97 | 0.47 % | 328.21 | 255 | 0.46 % | 722.07 | 141 | 0.98 % | 904.47 |
| strani | s | trani | 566 | 0.47 % | 546.80 | 128 | 0.43 % | 555.26 | 77 | 0.38 % | 260.54 | 283 | 0.51 % | 801.36 | 78 | 0.54 % | 500.34 |
| jutro |  | jutro | 536 | 0.45 % | 517.82 | 451 | 1.52 % | 1,956.42 | 7 | 0.03 % | 23.69 | 75 | 0.14 % | 212.37 | 3 | 0.02 % | 19.24 |
| evrov |  | evrov | 532 | 0.44 % | 513.96 | 148 | 0.50 % | 642.02 | 142 | 0.69 % | 480.47 | 161 | 0.29 % | 455.90 | 81 | 0.56 % | 519.59 |
| minut |  | minut | 461 | 0.38 % | 445.36 | 316 | 1.06 % | 1,370.80 | 88 | 0.43 % | 297.76 | 34 | 0.06 % | 96.28 | 23 | 0.16 % | 147.54 |
| način |  | način | 448 | 0.37 % | 432.81 | 63 | 0.21 % | 273.29 | 34 | 0.17 % | 115.04 | 251 | 0.45 % | 710.75 | 100 | 0.70 % | 641.47 |
| primer | p | rimer | 445 | 0.37 % | 429.91 | 52 | 0.17 % | 225.57 | 56 | 0.27 % | 189.48 | 244 | 0.44 % | 690.93 | 93 | 0.65 % | 596.56 |
| stvar |  | stvar | 387 | 0.32 % | 373.87 | 47 | 0.16 % | 203.88 | 84 | 0.41 % | 284.22 | 172 | 0.31 % | 487.05 | 84 | 0.58 % | 538.83 |
| vprašanje | vpra | šanje | 377 | 0.31 % | 364.21 | 72 | 0.24 % | 312.33 | 45 | 0.22 % | 152.26 | 211 | 0.38 % | 597.48 | 49 | 0.34 % | 314.32 |
| problem | pr | oblem | 369 | 0.31 % | 356.48 | 32 | 0.11 % | 138.81 | 86 | 0.42 % | 290.99 | 163 | 0.29 % | 461.56 | 88 | 0.61 % | 564.49 |
| mislm |  | mislm | 323 | 0.27 % | 312.05 | 44 | 0.15 % | 190.87 | 134 | 0.65 % | 453.40 | 53 | 0.10 % | 150.08 | 92 | 0.64 % | 590.15 |
| mislim | m | islim | 310 | 0.26 % | 299.49 | 61 | 0.21 % | 264.62 | 102 | 0.50 % | 345.13 | 77 | 0.14 % | 218.04 | 70 | 0.49 % | 449.03 |
| ljudi |  | ljudi | 293 | 0.24 % | 283.06 | 40 | 0.14 % | 173.52 | 49 | 0.24 % | 165.80 | 181 | 0.33 % | 512.53 | 23 | 0.16 % | 147.54 |
| ljudje | l | judje | 284 | 0.24 % | 274.37 | 50 | 0.17 % | 216.90 | 33 | 0.16 % | 111.66 | 168 | 0.30 % | 475.72 | 33 | 0.23 % | 211.68 |
| sloveniji | slov | eniji | 283 | 0.23 % | 273.40 | 54 | 0.18 % | 234.25 | 29 | 0.14 % | 98.12 | 186 | 0.33 % | 526.69 | 14 | 0.10 % | 89.81 |
| primeru | pr | imeru | 238 | 0.20 % | 229.93 | 30 | 0.10 % | 130.14 | 14 | 0.07 % | 47.37 | 128 | 0.23 % | 362.45 | 66 | 0.46 % | 423.37 |
| stran |  | stran | 229 | 0.19 % | 221.23 | 38 | 0.13 % | 164.84 | 71 | 0.35 % | 240.24 | 82 | 0.15 % | 232.20 | 38 | 0.26 % | 243.76 |
| koncu |  | koncu | 226 | 0.19 % | 218.33 | 66 | 0.22 % | 286.31 | 27 | 0.13 % | 91.36 | 104 | 0.19 % | 294.49 | 29 | 0.20 % | 186.03 |
| človek | č | lovek | 207 | 0.17 % | 199.98 | 58 | 0.20 % | 251.60 | 17 | 0.08 % | 57.52 | 101 | 0.18 % | 286 | 31 | 0.22 % | 198.85 |
| večer |  | večer | 206 | 0.17 % | 199.01 | 104 | 0.35 % | 451.15 | 11 | 0.05 % | 37.22 | 77 | 0.14 % | 218.04 | 14 | 0.10 % | 89.81 |
| mesto |  | mesto | 197 | 0.16 % | 190.32 | 100 | 0.34 % | 433.80 | 27 | 0.13 % | 91.36 | 68 | 0.12 % | 192.55 | 2 | 0.01 % | 12.83 |
| pizda |  | pizda | 197 | 0.16 % | 190.32 | 7 | 0.02 % | 30.37 | 187 | 0.91 % | 632.74 | 0 | 0 % | 0 | 3 | 0.02 % | 19.24 |
| država | d | ržava | 192 | 0.16 % | 185.49 | 7 | 0.02 % | 30.37 | 16 | 0.08 % | 54.14 | 161 | 0.29 % | 455.90 | 8 | 0.06 % | 51.32 |
| teden |  | teden | 187 | 0.15 % | 180.66 | 53 | 0.18 % | 229.91 | 58 | 0.28 % | 196.25 | 38 | 0.07 % | 107.60 | 38 | 0.26 % | 243.76 |
| slovenija | slov | enija | 186 | 0.15 % | 179.69 | 33 | 0.11 % | 143.15 | 5 | 0.02 % | 16.92 | 143 | 0.26 % | 404.93 | 5 | 0.04 % | 32.07 |
| gospa |  | gospa | 176 | 0.15 % | 170.03 | 34 | 0.12 % | 147.49 | 12 | 0.06 % | 40.60 | 117 | 0.21 % | 331.30 | 13 | 0.09 % | 83.39 |
| stopinj | st | opinj | 168 | 0.14 % | 162.30 | 32 | 0.11 % | 138.81 | 16 | 0.08 % | 54.14 | 46 | 0.08 % | 130.26 | 74 | 0.52 % | 474.68 |
| svetu |  | svetu | 168 | 0.14 % | 162.30 | 41 | 0.14 % | 177.86 | 8 | 0.04 % | 27.07 | 111 | 0.20 % | 314.31 | 8 | 0.06 % | 51.32 |
| misim |  | misim | 167 | 0.14 % | 161.34 | 25 | 0.08 % | 108.45 | 75 | 0.36 % | 253.77 | 30 | 0.05 % | 84.95 | 37 | 0.26 % | 237.34 |

2.2.61 List of initial character-level 1-grams from verb lemmas in the GOS 1.0 corpus with text-type distribution

File at CLARIN.SI: GOS1.0-word_parts-verbs-lemmas-initial-1grams-taxonomy-entire.tsv

| Lemma | Lemma (lower-case) | Initial part of the word | Rest of the word | Total absolute frequency of lemma | Percentage of all found lemmas | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N] |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| biti | biti | b | iti | 93,407 | 43.59 % | 90,238.98 | 19,830 | 43.63 % | 86,021.79 | 31,073 | 44.83 % | 105,139.03 | 29,606 | 42.92 % | 83,834.30 | 12,898 | 42.20 % | 82,736.24 |
| imeti | imeti | i | meti | 10,131 | 4.73 % | 9,787.39 | 1,862 | 4.10 % | 8,077.29 | 3,694 | 5.33 % | 12,499.07 | 2,799 | 4.06 % | 7,925.83 | 1,776 | 5.81 % | 11,392.43 |
| vedeti | vedeti | v | edeti | 7,560 | 3.53 % | 7,303.59 | 1,283 | 2.82 % | 5,565.61 | 3,900 | 5.63 % | 13,196.09 | 1,062 | 1.54 % | 3,007.23 | 1,315 | 4.30 % | 8,435.27 |
| iti | iti | i | ti | 5,311 | 2.48 % | 5,130.87 | 1,085 | 2.39 % | 4,706.69 | 2,446 | 3.53 % | 8,276.32 | 1,175 | 1.70 % | 3,327.21 | 605 | 1.98 % | 3,880.87 |
| reči | reči | r | eči | 4,690 | 2.19 % | 4,530.93 | 878 | 1.93 % | 3,808.73 | 1,620 | 2.34 % | 5,481.45 | 1,476 | 2.14 % | 4,179.54 | 716 | 2.34 % | 4,592.89 |
| dati | dati | d | ati | 3,945 | 1.84 % | 3,811.20 | 902 | 1.99 % | 3,912.84 | 1,415 | 2.04 % | 4,787.81 | 956 | 1.39 % | 2,707.07 | 672 | 2.20 % | 4,310.65 |
| misliti | misliti | m | isliti | 2,710 | 1.26 % | 2,618.09 | 470 | 1.03 % | 2,038.84 | 975 | 1.41 % | 3,299.02 | 713 | 1.03 % | 2,018.98 | 552 | 1.81 % | 3,540.89 |
| priti | priti | p | riti | 2,530 | 1.18 % | 2,444.19 | 565 | 1.24 % | 2,450.95 | 968 | 1.40 % | 3,275.34 | 661 | 0.96 % | 1,871.73 | 336 | 1.10 % | 2,155.32 |
| videti | videti | v | ideti | 2,309 | 1.08 % | 2,230.69 | 608 | 1.34 % | 2,637.48 | 759 | 1.09 % | 2,568.16 | 609 | 0.88 % | 1,724.48 | 333 | 1.09 % | 2,136.08 |
| morati | morati | m | orati | 2,233 | 1.04 % | 2,157.26 | 386 | 0.85 % | 1,674.45 | 559 | 0.81 % | 1,891.44 | 889 | 1.29 % | 2,517.35 | 399 | 1.30 % | 2,559.45 |
| moči | moči | m | oči | 2,067 | 0.96 % | 1,996.90 | 345 | 0.76 % | 1,496.60 | 905 | 1.31 % | 3,062.17 | 512 | 0.74 % | 1,449.81 | 305 | 1.00 % | 1,956.47 |
| gledati | gledati | g | ledati | 1,777 | 0.83 % | 1,716.73 | 422 | 0.93 % | 1,830.62 | 702 | 1.01 % | 2,375.30 | 394 | 0.57 % | 1,115.68 | 259 | 0.85 % | 1,661.40 |
| povedati | povedati | p | ovedati | 1,686 | 0.79 % | 1,628.82 | 483 | 1.06 % | 2,095.24 | 389 | 0.56 % | 1,316.23 | 617 | 0.89 % | 1,747.14 | 197 | 0.64 % | 1,263.69 |
| narediti | narediti | n | arediti | 1,614 | 0.75 % | 1,559.26 | 217 | 0.48 % | 941.34 | 543 | 0.78 % | 1,837.30 | 457 | 0.66 % | 1,294.07 | 397 | 1.30 % | 2,546.62 |
| delati | delati | d | elati | 1,515 | 0.71 % | 1,463.62 | 238 | 0.52 % | 1,032.43 | 685 | 0.99 % | 2,317.78 | 313 | 0.45 % | 886.31 | 279 | 0.91 % | 1,789.69 |
| praviti | praviti | p | raviti | 1,344 | 0.63 % | 1,298.42 | 285 | 0.63 % | 1,236.32 | 325 | 0.47 % | 1,099.67 | 527 | 0.76 % | 1,492.29 | 207 | 0.68 % | 1,327.83 |
| dobiti | dobiti | d | obiti | 1,264 | 0.59 % | 1,221.13 | 254 | 0.56 % | 1,101.84 | 401 | 0.58 % | 1,356.83 | 387 | 0.56 % | 1,095.85 | 222 | 0.73 % | 1,424.05 |
| hoteti | hoteti | h | oteti | 1,047 | 0.49 % | 1,011.49 | 180 | 0.40 % | 780.83 | 422 | 0.61 % | 1,427.89 | 267 | 0.39 % | 756.05 | 178 | 0.58 % | 1,141.81 |
| pogledati | pogledati | p | ogledati | 1,034 | 0.48 % | 998.93 | 220 | 0.48 % | 954.35 | 204 | 0.29 % | 690.26 | 441 | 0.64 % | 1,248.76 | 169 | 0.55 % | 1,084.08 |
| meti | meti | m | eti | 959 | 0.45 % | 926.47 | 169 | 0.37 % | 733.12 | 613 | 0.88 % | 2,074.16 | 79 | 0.12 % | 223.70 | 98 | 0.32 % | 628.64 |
| jesti | jesti | j | esti | 922 | 0.43 % | 890.73 | 161 | 0.35 % | 698.41 | 364 | 0.53 % | 1,231.64 | 280 | 0.41 % | 792.87 | 117 | 0.38 % | 750.51 |
| znati | znati | z | nati | 918 | 0.43 % | 886.86 | 158 | 0.35 % | 685.40 | 538 | 0.78 % | 1,820.38 | 142 | 0.21 % | 402.10 | 80 | 0.26 % | 513.17 |
| govoriti | govoriti | g | ovoriti | 906 | 0.42 % | 875.27 | 193 | 0.42 % | 837.23 | 177 | 0.26 % | 598.90 | 456 | 0.66 % | 1,291.24 | 80 | 0.26 % | 513.17 |
| začeti | začeti | z | ačeti | 841 | 0.39 % | 812.48 | 146 | 0.32 % | 633.34 | 260 | 0.38 % | 879.74 | 340 | 0.49 % | 962.77 | 95 | 0.31 % | 609.39 |
| zdeti | zdeti | z | deti | 804 | 0.38 % | 776.73 | 134 | 0.29 % | 581.29 | 235 | 0.34 % | 795.15 | 226 | 0.33 % | 639.96 | 209 | 0.68 % | 1,340.66 |
| čakati | čakati | č | akati | 782 | 0.36 % | 755.48 | 235 | 0.52 % | 1,019.42 | 293 | 0.42 % | 991.40 | 150 | 0.22 % | 424.75 | 104 | 0.34 % | 667.12 |
| ajati | ajati | a | jati | 562 | 0.26 % | 542.94 | 72 | 0.16 % | 312.33 | 373 | 0.54 % | 1,262.09 | 32 | 0.05 % | 90.61 | 85 | 0.28 % | 545.25 |
| vzeti | vzeti | v | zeti | 557 | 0.26 % | 538.11 | 112 | 0.25 % | 485.85 | 200 | 0.29 % | 676.72 | 154 | 0.22 % | 436.08 | 91 | 0.30 % | 583.73 |
| slišati | slišati | s | lišati | 534 | 0.25 % | 515.89 | 189 | 0.42 % | 819.87 | 113 | 0.16 % | 382.35 | 183 | 0.27 % | 518.19 | 49 | 0.16 % | 314.32 |
| pomeniti | pomeniti | p | omeniti | 511 | 0.24 % | 493.67 | 57 | 0.12 % | 247.26 | 40 | 0.06 % | 135.34 | 322 | 0.47 % | 911.80 | 92 | 0.30 % | 590.15 |
| napisati | napisati | n | apisati | 508 | 0.24 % | 490.77 | 51 | 0.11 % | 221.24 | 97 | 0.14 % | 328.21 | 245 | 0.35 % | 693.76 | 115 | 0.38 % | 737.69 |
| razumeti | razumeti | r | azumeti | 499 | 0.23 % | 482.08 | 87 | 0.19 % | 377.40 | 150 | 0.22 % | 507.54 | 145 | 0.21 % | 410.59 | 117 | 0.38 % | 750.51 |

2.2.62 List of initial character-level 2-grams from verb lemmas in the GOS 1.0 corpus with text-type distribution

File at CLARIN.SI: GOS1.0-word_parts-verbs-lemmas-initial-2grams-taxonomy-entire.tsv

| Lemma | Lemma (lower-case) | Initial part of the word | Rest of the word | Total absolute frequency of lemma | Percentage of all found lemmas | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N] |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| biti | biti | bi | ti | 93,407 | 43.59 % | 90,238.98 | 19,830 | 43.63 % | 86,021.79 | 31,073 | 44.83 % | 105,139.03 | 29,606 | 42.92 % | 83,834.30 | 12,898 | 42.20 % | 82,736.24 |
| imeti | imeti | im | eti | 10,131 | 4.73 % | 9,787.39 | 1,862 | 4.10 % | 8,077.29 | 3,694 | 5.33 % | 12,499.07 | 2,799 | 4.06 % | 7,925.83 | 1,776 | 5.81 % | 11,392.43 |
| vedeti | vedeti | ve | deti | 7,560 | 3.53 % | 7,303.59 | 1,283 | 2.82 % | 5,565.61 | 3,900 | 5.63 % | 13,196.09 | 1,062 | 1.54 % | 3,007.23 | 1,315 | 4.30 % | 8,435.27 |
| iti | iti | it | i | 5,311 | 2.48 % | 5,130.87 | 1,085 | 2.39 % | 4,706.69 | 2,446 | 3.53 % | 8,276.32 | 1,175 | 1.70 % | 3,327.21 | 605 | 1.98 % | 3,880.87 |
| reči | reči | re | či | 4,690 | 2.19 % | 4,530.93 | 878 | 1.93 % | 3,808.73 | 1,620 | 2.34 % | 5,481.45 | 1,476 | 2.14 % | 4,179.54 | 716 | 2.34 % | 4,592.89 |
| dati | dati | da | ti | 3,945 | 1.84 % | 3,811.20 | 902 | 1.99 % | 3,912.84 | 1,415 | 2.04 % | 4,787.81 | 956 | 1.39 % | 2,707.07 | 672 | 2.20 % | 4,310.65 |
| misliti | misliti | mi | sliti | 2,710 | 1.26 % | 2,618.09 | 470 | 1.03 % | 2,038.84 | 975 | 1.41 % | 3,299.02 | 713 | 1.03 % | 2,018.98 | 552 | 1.81 % | 3,540.89 |
| priti | priti | pr | iti | 2,530 | 1.18 % | 2,444.19 | 565 | 1.24 % | 2,450.95 | 968 | 1.40 % | 3,275.34 | 661 | 0.96 % | 1,871.73 | 336 | 1.10 % | 2,155.32 |
| videti | videti | vi | deti | 2,309 | 1.08 % | 2,230.69 | 608 | 1.34 % | 2,637.48 | 759 | 1.09 % | 2,568.16 | 609 | 0.88 % | 1,724.48 | 333 | 1.09 % | 2,136.08 |
| morati | morati | mo | rati | 2,233 | 1.04 % | 2,157.26 | 386 | 0.85 % | 1,674.45 | 559 | 0.81 % | 1,891.44 | 889 | 1.29 % | 2,517.35 | 399 | 1.30 % | 2,559.45 |
| moči | moči | mo | či | 2,067 | 0.96 % | 1,996.90 | 345 | 0.76 % | 1,496.60 | 905 | 1.31 % | 3,062.17 | 512 | 0.74 % | 1,449.81 | 305 | 1.00 % | 1,956.47 |
| gledati | gledati | gl | edati | 1,777 | 0.83 % | 1,716.73 | 422 | 0.93 % | 1,830.62 | 702 | 1.01 % | 2,375.30 | 394 | 0.57 % | 1,115.68 | 259 | 0.85 % | 1,661.40 |
| povedati | povedati | po | vedati | 1,686 | 0.79 % | 1,628.82 | 483 | 1.06 % | 2,095.24 | 389 | 0.56 % | 1,316.23 | 617 | 0.89 % | 1,747.14 | 197 | 0.64 % | 1,263.69 |
| narediti | narediti | na | rediti | 1,614 | 0.75 % | 1,559.26 | 217 | 0.48 % | 941.34 | 543 | 0.78 % | 1,837.30 | 457 | 0.66 % | 1,294.07 | 397 | 1.30 % | 2,546.62 |
| delati | delati | de | lati | 1,515 | 0.71 % | 1,463.62 | 238 | 0.52 % | 1,032.43 | 685 | 0.99 % | 2,317.78 | 313 | 0.45 % | 886.31 | 279 | 0.91 % | 1,789.69 |
| praviti | praviti | pr | aviti | 1,344 | 0.63 % | 1,298.42 | 285 | 0.63 % | 1,236.32 | 325 | 0.47 % | 1,099.67 | 527 | 0.76 % | 1,492.29 | 207 | 0.68 % | 1,327.83 |
| dobiti | dobiti | do | biti | 1,264 | 0.59 % | 1,221.13 | 254 | 0.56 % | 1,101.84 | 401 | 0.58 % | 1,356.83 | 387 | 0.56 % | 1,095.85 | 222 | 0.73 % | 1,424.05 |
| hoteti | hoteti | ho | teti | 1,047 | 0.49 % | 1,011.49 | 180 | 0.40 % | 780.83 | 422 | 0.61 % | 1,427.89 | 267 | 0.39 % | 756.05 | 178 | 0.58 % | 1,141.81 |
| pogledati | pogledati | po | gledati | 1,034 | 0.48 % | 998.93 | 220 | 0.48 % | 954.35 | 204 | 0.29 % | 690.26 | 441 | 0.64 % | 1,248.76 | 169 | 0.55 % | 1,084.08 |
| meti | meti | me | ti | 959 | 0.45 % | 926.47 | 169 | 0.37 % | 733.12 | 613 | 0.88 % | 2,074.16 | 79 | 0.12 % | 223.70 | 98 | 0.32 % | 628.64 |
| jesti | jesti | je | sti | 922 | 0.43 % | 890.73 | 161 | 0.35 % | 698.41 | 364 | 0.53 % | 1,231.64 | 280 | 0.41 % | 792.87 | 117 | 0.38 % | 750.51 |
| znati | znati | zn | ati | 918 | 0.43 % | 886.86 | 158 | 0.35 % | 685.40 | 538 | 0.78 % | 1,820.38 | 142 | 0.21 % | 402.10 | 80 | 0.26 % | 513.17 |
| govoriti | govoriti | go | voriti | 906 | 0.42 % | 875.27 | 193 | 0.42 % | 837.23 | 177 | 0.26 % | 598.90 | 456 | 0.66 % | 1,291.24 | 80 | 0.26 % | 513.17 |
| začeti | začeti | za | četi | 841 | 0.39 % | 812.48 | 146 | 0.32 % | 633.34 | 260 | 0.38 % | 879.74 | 340 | 0.49 % | 962.77 | 95 | 0.31 % | 609.39 |
| zdeti | zdeti | zd | eti | 804 | 0.38 % | 776.73 | 134 | 0.29 % | 581.29 | 235 | 0.34 % | 795.15 | 226 | 0.33 % | 639.96 | 209 | 0.68 % | 1,340.66 |
| čakati | čakati | ča | kati | 782 | 0.36 % | 755.48 | 235 | 0.52 % | 1,019.42 | 293 | 0.42 % | 991.40 | 150 | 0.22 % | 424.75 | 104 | 0.34 % | 667.12 |
| ajati | ajati | aj | ati | 562 | 0.26 % | 542.94 | 72 | 0.16 % | 312.33 | 373 | 0.54 % | 1,262.09 | 32 | 0.05 % | 90.61 | 85 | 0.28 % | 545.25 |
| vzeti | vzeti | vz | eti | 557 | 0.26 % | 538.11 | 112 | 0.25 % | 485.85 | 200 | 0.29 % | 676.72 | 154 | 0.22 % | 436.08 | 91 | 0.30 % | 583.73 |
| slišati | slišati | sl | išati | 534 | 0.25 % | 515.89 | 189 | 0.42 % | 819.87 | 113 | 0.16 % | 382.35 | 183 | 0.27 % | 518.19 | 49 | 0.16 % | 314.32 |
| pomeniti | pomeniti | po | meniti | 511 | 0.24 % | 493.67 | 57 | 0.12 % | 247.26 | 40 | 0.06 % | 135.34 | 322 | 0.47 % | 911.80 | 92 | 0.30 % | 590.15 |
| napisati | napisati | na | pisati | 508 | 0.24 % | 490.77 | 51 | 0.11 % | 221.24 | 97 | 0.14 % | 328.21 | 245 | 0.35 % | 693.76 | 115 | 0.38 % | 737.69 |
| razumeti | razumeti | ra | zumeti | 499 | 0.23 % | 482.08 | 87 | 0.19 % | 377.40 | 150 | 0.22 % | 507.54 | 145 | 0.21 % | 410.59 | 117 | 0.38 % | 750.51 |

2.2.63 List of initial character-level 3-grams from verb lemmas in the GOS 1.0 corpus with text-type distribution

File at CLARIN.SI: GOS1.0-word_parts-verbs-lemmas-initial-3grams-taxonomy-entire.tsv

| Lemma | Lemma (lower-case) | Initial part of the word | Rest of the word | Total absolute frequency of lemma | Percentage of all found lemmas | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N] |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| biti | biti | bit | i | 93,407 | 43.59 % | 90,238.98 | 19,830 | 43.63 % | 86,021.79 | 31,073 | 44.83 % | 105,139.03 | 29,606 | 42.92 % | 83,834.30 | 12,898 | 42.20 % | 82,736.24 |
| imeti | imeti | ime | ti | 10,131 | 4.73 % | 9,787.39 | 1,862 | 4.10 % | 8,077.29 | 3,694 | 5.33 % | 12,499.07 | 2,799 | 4.06 % | 7,925.83 | 1,776 | 5.81 % | 11,392.43 |
| vedeti | vedeti | ved | eti | 7,560 | 3.53 % | 7,303.59 | 1,283 | 2.82 % | 5,565.61 | 3,900 | 5.63 % | 13,196.09 | 1,062 | 1.54 % | 3,007.23 | 1,315 | 4.30 % | 8,435.27 |
| iti | iti | iti |  | 5,311 | 2.48 % | 5,130.87 | 1,085 | 2.39 % | 4,706.69 | 2,446 | 3.53 % | 8,276.32 | 1,175 | 1.70 % | 3,327.21 | 605 | 1.98 % | 3,880.87 |
| reči | reči | reč | i | 4,690 | 2.19 % | 4,530.93 | 878 | 1.93 % | 3,808.73 | 1,620 | 2.34 % | 5,481.45 | 1,476 | 2.14 % | 4,179.54 | 716 | 2.34 % | 4,592.89 |
| dati | dati | dat | i | 3,945 | 1.84 % | 3,811.20 | 902 | 1.99 % | 3,912.84 | 1,415 | 2.04 % | 4,787.81 | 956 | 1.39 % | 2,707.07 | 672 | 2.20 % | 4,310.65 |
| misliti | misliti | mis | liti | 2,710 | 1.26 % | 2,618.09 | 470 | 1.03 % | 2,038.84 | 975 | 1.41 % | 3,299.02 | 713 | 1.03 % | 2,018.98 | 552 | 1.81 % | 3,540.89 |
| priti | priti | pri | ti | 2,530 | 1.18 % | 2,444.19 | 565 | 1.24 % | 2,450.95 | 968 | 1.40 % | 3,275.34 | 661 | 0.96 % | 1,871.73 | 336 | 1.10 % | 2,155.32 |
| videti | videti | vid | eti | 2,309 | 1.08 % | 2,230.69 | 608 | 1.34 % | 2,637.48 | 759 | 1.09 % | 2,568.16 | 609 | 0.88 % | 1,724.48 | 333 | 1.09 % | 2,136.08 |
| morati | morati | mor | ati | 2,233 | 1.04 % | 2,157.26 | 386 | 0.85 % | 1,674.45 | 559 | 0.81 % | 1,891.44 | 889 | 1.29 % | 2,517.35 | 399 | 1.30 % | 2,559.45 |
| moči | moči | moč | i | 2,067 | 0.96 % | 1,996.90 | 345 | 0.76 % | 1,496.60 | 905 | 1.31 % | 3,062.17 | 512 | 0.74 % | 1,449.81 | 305 | 1.00 % | 1,956.47 |
| gledati | gledati | gle | dati | 1,777 | 0.83 % | 1,716.73 | 422 | 0.93 % | 1,830.62 | 702 | 1.01 % | 2,375.30 | 394 | 0.57 % | 1,115.68 | 259 | 0.85 % | 1,661.40 |
| povedati | povedati | pov | edati | 1,686 | 0.79 % | 1,628.82 | 483 | 1.06 % | 2,095.24 | 389 | 0.56 % | 1,316.23 | 617 | 0.89 % | 1,747.14 | 197 | 0.64 % | 1,263.69 |
| narediti | narediti | nar | editi | 1,614 | 0.75 % | 1,559.26 | 217 | 0.48 % | 941.34 | 543 | 0.78 % | 1,837.30 | 457 | 0.66 % | 1,294.07 | 397 | 1.30 % | 2,546.62 |
| delati | delati | del | ati | 1,515 | 0.71 % | 1,463.62 | 238 | 0.52 % | 1,032.43 | 685 | 0.99 % | 2,317.78 | 313 | 0.45 % | 886.31 | 279 | 0.91 % | 1,789.69 |
| praviti | praviti | pra | viti | 1,344 | 0.63 % | 1,298.42 | 285 | 0.63 % | 1,236.32 | 325 | 0.47 % | 1,099.67 | 527 | 0.76 % | 1,492.29 | 207 | 0.68 % | 1,327.83 |
| dobiti | dobiti | dob | iti | 1,264 | 0.59 % | 1,221.13 | 254 | 0.56 % | 1,101.84 | 401 | 0.58 % | 1,356.83 | 387 | 0.56 % | 1,095.85 | 222 | 0.73 % | 1,424.05 |
| hoteti | hoteti | hot | eti | 1,047 | 0.49 % | 1,011.49 | 180 | 0.40 % | 780.83 | 422 | 0.61 % | 1,427.89 | 267 | 0.39 % | 756.05 | 178 | 0.58 % | 1,141.81 |
| pogledati | pogledati | pog | ledati | 1,034 | 0.48 % | 998.93 | 220 | 0.48 % | 954.35 | 204 | 0.29 % | 690.26 | 441 | 0.64 % | 1,248.76 | 169 | 0.55 % | 1,084.08 |
| meti | meti | met | i | 959 | 0.45 % | 926.47 | 169 | 0.37 % | 733.12 | 613 | 0.88 % | 2,074.16 | 79 | 0.12 % | 223.70 | 98 | 0.32 % | 628.64 |
| jesti | jesti | jes | ti | 922 | 0.43 % | 890.73 | 161 | 0.35 % | 698.41 | 364 | 0.53 % | 1,231.64 | 280 | 0.41 % | 792.87 | 117 | 0.38 % | 750.51 |
| znati | znati | zna | ti | 918 | 0.43 % | 886.86 | 158 | 0.35 % | 685.40 | 538 | 0.78 % | 1,820.38 | 142 | 0.21 % | 402.10 | 80 | 0.26 % | 513.17 |
| govoriti | govoriti | gov | oriti | 906 | 0.42 % | 875.27 | 193 | 0.42 % | 837.23 | 177 | 0.26 % | 598.90 | 456 | 0.66 % | 1,291.24 | 80 | 0.26 % | 513.17 |
| začeti | začeti | zač | eti | 841 | 0.39 % | 812.48 | 146 | 0.32 % | 633.34 | 260 | 0.38 % | 879.74 | 340 | 0.49 % | 962.77 | 95 | 0.31 % | 609.39 |
| zdeti | zdeti | zde | ti | 804 | 0.38 % | 776.73 | 134 | 0.29 % | 581.29 | 235 | 0.34 % | 795.15 | 226 | 0.33 % | 639.96 | 209 | 0.68 % | 1,340.66 |
| čakati | čakati | čak | ati | 782 | 0.36 % | 755.48 | 235 | 0.52 % | 1,019.42 | 293 | 0.42 % | 991.40 | 150 | 0.22 % | 424.75 | 104 | 0.34 % | 667.12 |
| ajati | ajati | aja | ti | 562 | 0.26 % | 542.94 | 72 | 0.16 % | 312.33 | 373 | 0.54 % | 1,262.09 | 32 | 0.05 % | 90.61 | 85 | 0.28 % | 545.25 |
| vzeti | vzeti | vze | ti | 557 | 0.26 % | 538.11 | 112 | 0.25 % | 485.85 | 200 | 0.29 % | 676.72 | 154 | 0.22 % | 436.08 | 91 | 0.30 % | 583.73 |
| slišati | slišati | sli | šati | 534 | 0.25 % | 515.89 | 189 | 0.42 % | 819.87 | 113 | 0.16 % | 382.35 | 183 | 0.27 % | 518.19 | 49 | 0.16 % | 314.32 |
| pomeniti | pomeniti | pom | eniti | 511 | 0.24 % | 493.67 | 57 | 0.12 % | 247.26 | 40 | 0.06 % | 135.34 | 322 | 0.47 % | 911.80 | 92 | 0.30 % | 590.15 |
| napisati | napisati | nap | isati | 508 | 0.24 % | 490.77 | 51 | 0.11 % | 221.24 | 97 | 0.14 % | 328.21 | 245 | 0.35 % | 693.76 | 115 | 0.38 % | 737.69 |
| razumeti | razumeti | raz | umeti | 499 | 0.23 % | 482.08 | 87 | 0.19 % | 377.40 | 150 | 0.22 % | 507.54 | 145 | 0.21 % | 410.59 | 117 | 0.38 % | 750.51 |

2.2.64 List of initial character-level 4-grams from verb lemmas in the GOS 1.0 corpus with text-type distribution

File at CLARIN.SI: GOS1.0-word_parts-verbs-lemmas-initial-4grams-taxonomy-entire.tsv

| Lemma | Lemma (lower-case) | Initial part of the word | Rest of the word | Total absolute frequency of lemma | Percentage of all found lemmas | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N] |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| biti | biti | biti |  | 93,407 | 44.69 % | 90,238.98 | 19,830 | 44.70 % | 86,021.79 | 31,073 | 46.47 % | 105,139.03 | 29,606 | 43.66 % | 83,834.30 | 12,898 | 43.05 % | 82,736.24 |
| imeti | imeti | imet | i | 10,131 | 4.85 % | 9,787.39 | 1,862 | 4.20 % | 8,077.29 | 3,694 | 5.52 % | 12,499.07 | 2,799 | 4.13 % | 7,925.83 | 1,776 | 5.93 % | 11,392.43 |
| vedeti | vedeti | vede | ti | 7,560 | 3.62 % | 7,303.59 | 1,283 | 2.89 % | 5,565.61 | 3,900 | 5.83 % | 13,196.09 | 1,062 | 1.57 % | 3,007.23 | 1,315 | 4.39 % | 8,435.27 |
| reči | reči | reči |  | 4,690 | 2.24 % | 4,530.93 | 878 | 1.98 % | 3,808.73 | 1,620 | 2.42 % | 5,481.45 | 1,476 | 2.18 % | 4,179.54 | 716 | 2.39 % | 4,592.89 |
| dati | dati | dati |  | 3,945 | 1.89 % | 3,811.20 | 902 | 2.03 % | 3,912.84 | 1,415 | 2.12 % | 4,787.81 | 956 | 1.41 % | 2,707.07 | 672 | 2.24 % | 4,310.65 |
| misliti | misliti | misl | iti | 2,710 | 1.30 % | 2,618.09 | 470 | 1.06 % | 2,038.84 | 975 | 1.46 % | 3,299.02 | 713 | 1.05 % | 2,018.98 | 552 | 1.84 % | 3,540.89 |
| priti | priti | prit | i | 2,530 | 1.21 % | 2,444.19 | 565 | 1.27 % | 2,450.95 | 968 | 1.45 % | 3,275.34 | 661 | 0.97 % | 1,871.73 | 336 | 1.12 % | 2,155.32 |
| videti | videti | vide | ti | 2,309 | 1.10 % | 2,230.69 | 608 | 1.37 % | 2,637.48 | 759 | 1.14 % | 2,568.16 | 609 | 0.90 % | 1,724.48 | 333 | 1.11 % | 2,136.08 |
| morati | morati | mora | ti | 2,233 | 1.07 % | 2,157.26 | 386 | 0.87 % | 1,674.45 | 559 | 0.84 % | 1,891.44 | 889 | 1.31 % | 2,517.35 | 399 | 1.33 % | 2,559.45 |
| moči | moči | moči |  | 2,067 | 0.99 % | 1,996.90 | 345 | 0.78 % | 1,496.60 | 905 | 1.35 % | 3,062.17 | 512 | 0.76 % | 1,449.81 | 305 | 1.02 % | 1,956.47 |
| gledati | gledati | gled | ati | 1,777 | 0.85 % | 1,716.73 | 422 | 0.95 % | 1,830.62 | 702 | 1.05 % | 2,375.30 | 394 | 0.58 % | 1,115.68 | 259 | 0.86 % | 1,661.40 |
| povedati | povedati | pove | dati | 1,686 | 0.81 % | 1,628.82 | 483 | 1.09 % | 2,095.24 | 389 | 0.58 % | 1,316.23 | 617 | 0.91 % | 1,747.14 | 197 | 0.66 % | 1,263.69 |
| narediti | narediti | nare | diti | 1,614 | 0.77 % | 1,559.26 | 217 | 0.49 % | 941.34 | 543 | 0.81 % | 1,837.30 | 457 | 0.67 % | 1,294.07 | 397 | 1.32 % | 2,546.62 |
| delati | delati | dela | ti | 1,515 | 0.72 % | 1,463.62 | 238 | 0.54 % | 1,032.43 | 685 | 1.02 % | 2,317.78 | 313 | 0.46 % | 886.31 | 279 | 0.93 % | 1,789.69 |
| praviti | praviti | prav | iti | 1,344 | 0.64 % | 1,298.42 | 285 | 0.64 % | 1,236.32 | 325 | 0.49 % | 1,099.67 | 527 | 0.78 % | 1,492.29 | 207 | 0.69 % | 1,327.83 |
| dobiti | dobiti | dobi | ti | 1,264 | 0.60 % | 1,221.13 | 254 | 0.57 % | 1,101.84 | 401 | 0.60 % | 1,356.83 | 387 | 0.57 % | 1,095.85 | 222 | 0.74 % | 1,424.05 |
| hoteti | hoteti | hote | ti | 1,047 | 0.50 % | 1,011.49 | 180 | 0.41 % | 780.83 | 422 | 0.63 % | 1,427.89 | 267 | 0.39 % | 756.05 | 178 | 0.59 % | 1,141.81 |
| pogledati | pogledati | pogl | edati | 1,034 | 0.49 % | 998.93 | 220 | 0.50 % | 954.35 | 204 | 0.30 % | 690.26 | 441 | 0.65 % | 1,248.76 | 169 | 0.56 % | 1,084.08 |
| meti | meti | meti |  | 959 | 0.46 % | 926.47 | 169 | 0.38 % | 733.12 | 613 | 0.92 % | 2,074.16 | 79 | 0.12 % | 223.70 | 98 | 0.33 % | 628.64 |
| jesti | jesti | jest | i | 922 | 0.44 % | 890.73 | 161 | 0.36 % | 698.41 | 364 | 0.54 % | 1,231.64 | 280 | 0.41 % | 792.87 | 117 | 0.39 % | 750.51 |
| znati | znati | znat | i | 918 | 0.44 % | 886.86 | 158 | 0.36 % | 685.40 | 538 | 0.81 % | 1,820.38 | 142 | 0.21 % | 402.10 | 80 | 0.27 % | 513.17 |
| govoriti | govoriti | govo | riti | 906 | 0.43 % | 875.27 | 193 | 0.43 % | 837.23 | 177 | 0.27 % | 598.90 | 456 | 0.67 % | 1,291.24 | 80 | 0.27 % | 513.17 |
| začeti | začeti | zače | ti | 841 | 0.40 % | 812.48 | 146 | 0.33 % | 633.34 | 260 | 0.39 % | 879.74 | 340 | 0.50 % | 962.77 | 95 | 0.32 % | 609.39 |
| zdeti | zdeti | zdet | i | 804 | 0.39 % | 776.73 | 134 | 0.30 % | 581.29 | 235 | 0.35 % | 795.15 | 226 | 0.33 % | 639.96 | 209 | 0.70 % | 1,340.66 |
| čakati | čakati | čaka | ti | 782 | 0.37 % | 755.48 | 235 | 0.53 % | 1,019.42 | 293 | 0.44 % | 991.40 | 150 | 0.22 % | 424.75 | 104 | 0.35 % | 667.12 |
| ajati | ajati | ajat | i | 562 | 0.27 % | 542.94 | 72 | 0.16 % | 312.33 | 373 | 0.56 % | 1,262.09 | 32 | 0.05 % | 90.61 | 85 | 0.28 % | 545.25 |
| vzeti | vzeti | vzet | i | 557 | 0.27 % | 538.11 | 112 | 0.25 % | 485.85 | 200 | 0.30 % | 676.72 | 154 | 0.23 % | 436.08 | 91 | 0.30 % | 583.73 |
| slišati | slišati | sliš | ati | 534 | 0.26 % | 515.89 | 189 | 0.43 % | 819.87 | 113 | 0.17 % | 382.35 | 183 | 0.27 % | 518.19 | 49 | 0.16 % | 314.32 |
| pomeniti | pomeniti | pome | niti | 511 | 0.24 % | 493.67 | 57 | 0.13 % | 247.26 | 40 | 0.06 % | 135.34 | 322 | 0.47 % | 911.80 | 92 | 0.31 % | 590.15 |
| napisati | napisati | napi | sati | 508 | 0.24 % | 490.77 | 51 | 0.12 % | 221.24 | 97 | 0.14 % | 328.21 | 245 | 0.36 % | 693.76 | 115 | 0.38 % | 737.69 |
| razumeti | razumeti | razu | meti | 499 | 0.24 % | 482.08 | 87 | 0.20 % | 377.40 | 150 | 0.22 % | 507.54 | 145 | 0.21 % | 410.59 | 117 | 0.39 % | 750.51 |
| poznati | poznati | pozn | ati | 498 | 0.24 % | 481.11 | 121 | 0.27 % | 524.89 | 150 | 0.22 % | 507.54 | 173 | 0.26 % | 489.88 | 54 | 0.18 % | 346.39 |

2.2.65 List of initial character-level 5-grams from verb lemmas in the GOS 1.0 corpus with text-type distribution

File at CLARIN.SI: GOS1.0-word_parts-verbs-lemmas-initial-5grams-taxonomy-entire.tsv

| Lemma | Lemma (lower-case) | Initial part of the word | Rest of the word | Total absolute frequency of lemma | Percentage of all found lemmas | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N] |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| imeti | imeti | imeti |  | 10,131 | 9.84 % | 9,787.39 | 1,862 | 8.47 % | 8,077.29 | 3,694 | 11.98 % | 12,499.07 | 2,799 | 8.01 % | 7,925.83 | 1,776 | 11.70 % | 11,392.43 |
| vedeti | vedeti | vedet | i | 7,560 | 7.34 % | 7,303.59 | 1,283 | 5.83 % | 5,565.61 | 3,900 | 12.65 % | 13,196.09 | 1,062 | 3.04 % | 3,007.23 | 1,315 | 8.66 % | 8,435.27 |
| misliti | misliti | misli | ti | 2,710 | 2.63 % | 2,618.09 | 470 | 2.14 % | 2,038.84 | 975 | 3.16 % | 3,299.02 | 713 | 2.04 % | 2,018.98 | 552 | 3.64 % | 3,540.89 |
| priti | priti | priti |  | 2,530 | 2.46 % | 2,444.19 | 565 | 2.57 % | 2,450.95 | 968 | 3.14 % | 3,275.34 | 661 | 1.89 % | 1,871.73 | 336 | 2.21 % | 2,155.32 |
| videti | videti | videt | i | 2,309 | 2.24 % | 2,230.69 | 608 | 2.77 % | 2,637.48 | 759 | 2.46 % | 2,568.16 | 609 | 1.74 % | 1,724.48 | 333 | 2.19 % | 2,136.08 |
| morati | morati | morat | i | 2,233 | 2.17 % | 2,157.26 | 386 | 1.75 % | 1,674.45 | 559 | 1.81 % | 1,891.44 | 889 | 2.54 % | 2,517.35 | 399 | 2.63 % | 2,559.45 |
| gledati | gledati | gleda | ti | 1,777 | 1.73 % | 1,716.73 | 422 | 1.92 % | 1,830.62 | 702 | 2.28 % | 2,375.30 | 394 | 1.13 % | 1,115.68 | 259 | 1.71 % | 1,661.40 |
| povedati | povedati | poved | ati | 1,686 | 1.64 % | 1,628.82 | 483 | 2.20 % | 2,095.24 | 389 | 1.26 % | 1,316.23 | 617 | 1.77 % | 1,747.14 | 197 | 1.30 % | 1,263.69 |
| narediti | narediti | nared | iti | 1,614 | 1.57 % | 1,559.26 | 217 | 0.99 % | 941.34 | 543 | 1.76 % | 1,837.30 | 457 | 1.31 % | 1,294.07 | 397 | 2.62 % | 2,546.62 |
| delati | delati | delat | i | 1,515 | 1.47 % | 1,463.62 | 238 | 1.08 % | 1,032.43 | 685 | 2.22 % | 2,317.78 | 313 | 0.90 % | 886.31 | 279 | 1.84 % | 1,789.69 |
| praviti | praviti | pravi | ti | 1,344 | 1.31 % | 1,298.42 | 285 | 1.30 % | 1,236.32 | 325 | 1.05 % | 1,099.67 | 527 | 1.51 % | 1,492.29 | 207 | 1.36 % | 1,327.83 |
| dobiti | dobiti | dobit | i | 1,264 | 1.23 % | 1,221.13 | 254 | 1.16 % | 1,101.84 | 401 | 1.30 % | 1,356.83 | 387 | 1.11 % | 1,095.85 | 222 | 1.46 % | 1,424.05 |
| hoteti | hoteti | hotet | i | 1,047 | 1.02 % | 1,011.49 | 180 | 0.82 % | 780.83 | 422 | 1.37 % | 1,427.89 | 267 | 0.76 % | 756.05 | 178 | 1.17 % | 1,141.81 |
| pogledati | pogledati | pogle | dati | 1,034 | 1.00 % | 998.93 | 220 | 1.00 % | 954.35 | 204 | 0.66 % | 690.26 | 441 | 1.26 % | 1,248.76 | 169 | 1.11 % | 1,084.08 |
| jesti | jesti | jesti |  | 922 | 0.90 % | 890.73 | 161 | 0.73 % | 698.41 | 364 | 1.18 % | 1,231.64 | 280 | 0.80 % | 792.87 | 117 | 0.77 % | 750.51 |
| znati | znati | znati |  | 918 | 0.89 % | 886.86 | 158 | 0.72 % | 685.40 | 538 | 1.75 % | 1,820.38 | 142 | 0.41 % | 402.10 | 80 | 0.53 % | 513.17 |
| govoriti | govoriti | govor | iti | 906 | 0.88 % | 875.27 | 193 | 0.88 % | 837.23 | 177 | 0.57 % | 598.90 | 456 | 1.30 % | 1,291.24 | 80 | 0.53 % | 513.17 |
| začeti | začeti | začet | i | 841 | 0.82 % | 812.48 | 146 | 0.66 % | 633.34 | 260 | 0.84 % | 879.74 | 340 | 0.97 % | 962.77 | 95 | 0.63 % | 609.39 |
| zdeti | zdeti | zdeti |  | 804 | 0.78 % | 776.73 | 134 | 0.61 % | 581.29 | 235 | 0.76 % | 795.15 | 226 | 0.65 % | 639.96 | 209 | 1.38 % | 1,340.66 |
| čakati | čakati | čakat | i | 782 | 0.76 % | 755.48 | 235 | 1.07 % | 1,019.42 | 293 | 0.95 % | 991.40 | 150 | 0.43 % | 424.75 | 104 | 0.69 % | 667.12 |
| ajati | ajati | ajati |  | 562 | 0.55 % | 542.94 | 72 | 0.33 % | 312.33 | 373 | 1.21 % | 1,262.09 | 32 | 0.09 % | 90.61 | 85 | 0.56 % | 545.25 |
| vzeti | vzeti | vzeti |  | 557 | 0.54 % | 538.11 | 112 | 0.51 % | 485.85 | 200 | 0.65 % | 676.72 | 154 | 0.44 % | 436.08 | 91 | 0.60 % | 583.73 |
| slišati | slišati | sliša | ti | 534 | 0.52 % | 515.89 | 189 | 0.86 % | 819.87 | 113 | 0.37 % | 382.35 | 183 | 0.52 % | 518.19 | 49 | 0.32 % | 314.32 |
| pomeniti | pomeniti | pomen | iti | 511 | 0.50 % | 493.67 | 57 | 0.26 % | 247.26 | 40 | 0.13 % | 135.34 | 322 | 0.92 % | 911.80 | 92 | 0.61 % | 590.15 |
| napisati | napisati | napis | ati | 508 | 0.49 % | 490.77 | 51 | 0.23 % | 221.24 | 97 | 0.32 % | 328.21 | 245 | 0.70 % | 693.76 | 115 | 0.76 % | 737.69 |
| razumeti | razumeti | razum | eti | 499 | 0.48 % | 482.08 | 87 | 0.40 % | 377.40 | 150 | 0.49 % | 507.54 | 145 | 0.41 % | 410.59 | 117 | 0.77 % | 750.51 |
| poznati | poznati | pozna | ti | 498 | 0.48 % | 481.11 | 121 | 0.55 % | 524.89 | 150 | 0.49 % | 507.54 | 173 | 0.49 % | 489.88 | 54 | 0.36 % | 346.39 |
| pisati | pisati | pisat | i | 492 | 0.48 % | 475.31 | 66 | 0.30 % | 286.31 | 149 | 0.48 % | 504.16 | 183 | 0.52 % | 518.19 | 94 | 0.62 % | 602.98 |
| hoditi | hoditi | hodit | i | 482 | 0.47 % | 465.65 | 99 | 0.45 % | 429.46 | 276 | 0.90 % | 933.88 | 63 | 0.18 % | 178.39 | 44 | 0.29 % | 282.24 |
| želeti | želeti | želet | i | 468 | 0.46 % | 452.13 | 169 | 0.77 % | 733.12 | 26 | 0.08 % | 87.97 | 236 | 0.68 % | 668.27 | 37 | 0.24 % | 237.34 |
| najti | najti | najti |  | 452 | 0.44 % | 436.67 | 108 | 0.49 % | 468.50 | 107 | 0.35 % | 362.05 | 168 | 0.48 % | 475.72 | 69 | 0.46 % | 442.61 |
| vprašati | vprašati | vpraš | ati | 434 | 0.42 % | 419.28 | 83 | 0.38 % | 360.05 | 142 | 0.46 % | 480.47 | 147 | 0.42 % | 416.25 | 62 | 0.41 % | 397.71 |

2.2.66 List of final character-level 1-grams from verb lemmas in the GOS 1.0 corpus with text-type distribution

File at CLARIN.SI: GOS1.0-word_parts-verbs-lemmas-final-1grams-taxonomy-entire.tsv

| Lemma | Lemma (lower-case) | Rest of the word | Final part of the word | Total absolute frequency of lemma | Percentage of all found lemmas | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N] |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| biti | biti | bit | i | 93,407 | 43.59 % | 90,238.98 | 19,830 | 43.63 % | 86,021.79 | 31,073 | 44.83 % | 105,139.03 | 29,606 | 42.92 % | 83,834.30 | 12,898 | 42.20 % | 82,736.24 |
| imeti | imeti | imet | i | 10,131 | 4.73 % | 9,787.39 | 1,862 | 4.10 % | 8,077.29 | 3,694 | 5.33 % | 12,499.07 | 2,799 | 4.06 % | 7,925.83 | 1,776 | 5.81 % | 11,392.43 |
| vedeti | vedeti | vedet | i | 7,560 | 3.53 % | 7,303.59 | 1,283 | 2.82 % | 5,565.61 | 3,900 | 5.63 % | 13,196.09 | 1,062 | 1.54 % | 3,007.23 | 1,315 | 4.30 % | 8,435.27 |
| iti | iti | it | i | 5,311 | 2.48 % | 5,130.87 | 1,085 | 2.39 % | 4,706.69 | 2,446 | 3.53 % | 8,276.32 | 1,175 | 1.70 % | 3,327.21 | 605 | 1.98 % | 3,880.87 |
| reči | reči | reč | i | 4,690 | 2.19 % | 4,530.93 | 878 | 1.93 % | 3,808.73 | 1,620 | 2.34 % | 5,481.45 | 1,476 | 2.14 % | 4,179.54 | 716 | 2.34 % | 4,592.89 |
| dati | dati | dat | i | 3,945 | 1.84 % | 3,811.20 | 902 | 1.99 % | 3,912.84 | 1,415 | 2.04 % | 4,787.81 | 956 | 1.39 % | 2,707.07 | 672 | 2.20 % | 4,310.65 |
| misliti | misliti | mislit | i | 2,710 | 1.26 % | 2,618.09 | 470 | 1.03 % | 2,038.84 | 975 | 1.41 % | 3,299.02 | 713 | 1.03 % | 2,018.98 | 552 | 1.81 % | 3,540.89 |
| priti | priti | prit | i | 2,530 | 1.18 % | 2,444.19 | 565 | 1.24 % | 2,450.95 | 968 | 1.40 % | 3,275.34 | 661 | 0.96 % | 1,871.73 | 336 | 1.10 % | 2,155.32 |
| videti | videti | videt | i | 2,309 | 1.08 % | 2,230.69 | 608 | 1.34 % | 2,637.48 | 759 | 1.09 % | 2,568.16 | 609 | 0.88 % | 1,724.48 | 333 | 1.09 % | 2,136.08 |
| morati | morati | morat | i | 2,233 | 1.04 % | 2,157.26 | 386 | 0.85 % | 1,674.45 | 559 | 0.81 % | 1,891.44 | 889 | 1.29 % | 2,517.35 | 399 | 1.30 % | 2,559.45 |
| moči | moči | moč | i | 2,067 | 0.96 % | 1,996.90 | 345 | 0.76 % | 1,496.60 | 905 | 1.31 % | 3,062.17 | 512 | 0.74 % | 1,449.81 | 305 | 1.00 % | 1,956.47 |
| gledati | gledati | gledat | i | 1,777 | 0.83 % | 1,716.73 | 422 | 0.93 % | 1,830.62 | 702 | 1.01 % | 2,375.30 | 394 | 0.57 % | 1,115.68 | 259 | 0.85 % | 1,661.40 |
| povedati | povedati | povedat | i | 1,686 | 0.79 % | 1,628.82 | 483 | 1.06 % | 2,095.24 | 389 | 0.56 % | 1,316.23 | 617 | 0.89 % | 1,747.14 | 197 | 0.64 % | 1,263.69 |
| narediti | narediti | naredit | i | 1,614 | 0.75 % | 1,559.26 | 217 | 0.48 % | 941.34 | 543 | 0.78 % | 1,837.30 | 457 | 0.66 % | 1,294.07 | 397 | 1.30 % | 2,546.62 |
| delati | delati | delat | i | 1,515 | 0.71 % | 1,463.62 | 238 | 0.52 % | 1,032.43 | 685 | 0.99 % | 2,317.78 | 313 | 0.45 % | 886.31 | 279 | 0.91 % | 1,789.69 |
| praviti | praviti | pravit | i | 1,344 | 0.63 % | 1,298.42 | 285 | 0.63 % | 1,236.32 | 325 | 0.47 % | 1,099.67 | 527 | 0.76 % | 1,492.29 | 207 | 0.68 % | 1,327.83 |
| dobiti | dobiti | dobit | i | 1,264 | 0.59 % | 1,221.13 | 254 | 0.56 % | 1,101.84 | 401 | 0.58 % | 1,356.83 | 387 | 0.56 % | 1,095.85 | 222 | 0.73 % | 1,424.05 |
| hoteti | hoteti | hotet | i | 1,047 | 0.49 % | 1,011.49 | 180 | 0.40 % | 780.83 | 422 | 0.61 % | 1,427.89 | 267 | 0.39 % | 756.05 | 178 | 0.58 % | 1,141.81 |
| pogledati | pogledati | pogledat | i | 1,034 | 0.48 % | 998.93 | 220 | 0.48 % | 954.35 | 204 | 0.29 % | 690.26 | 441 | 0.64 % | 1,248.76 | 169 | 0.55 % | 1,084.08 |
| meti | meti | met | i | 959 | 0.45 % | 926.47 | 169 | 0.37 % | 733.12 | 613 | 0.88 % | 2,074.16 | 79 | 0.12 % | 223.70 | 98 | 0.32 % | 628.64 |
| jesti | jesti | jest | i | 922 | 0.43 % | 890.73 | 161 | 0.35 % | 698.41 | 364 | 0.53 % | 1,231.64 | 280 | 0.41 % | 792.87 | 117 | 0.38 % | 750.51 |
| znati | znati | znat | i | 918 | 0.43 % | 886.86 | 158 | 0.35 % | 685.40 | 538 | 0.78 % | 1,820.38 | 142 | 0.21 % | 402.10 | 80 | 0.26 % | 513.17 |
| govoriti | govoriti | govorit | i | 906 | 0.42 % | 875.27 | 193 | 0.42 % | 837.23 | 177 | 0.26 % | 598.90 | 456 | 0.66 % | 1,291.24 | 80 | 0.26 % | 513.17 |
| začeti | začeti | začet | i | 841 | 0.39 % | 812.48 | 146 | 0.32 % | 633.34 | 260 | 0.38 % | 879.74 | 340 | 0.49 % | 962.77 | 95 | 0.31 % | 609.39 |
| zdeti | zdeti | zdet | i | 804 | 0.38 % | 776.73 | 134 | 0.29 % | 581.29 | 235 | 0.34 % | 795.15 | 226 | 0.33 % | 639.96 | 209 | 0.68 % | 1,340.66 |
| čakati | čakati | čakat | i | 782 | 0.36 % | 755.48 | 235 | 0.52 % | 1,019.42 | 293 | 0.42 % | 991.40 | 150 | 0.22 % | 424.75 | 104 | 0.34 % | 667.12 |
| ajati | ajati | ajat | i | 562 | 0.26 % | 542.94 | 72 | 0.16 % | 312.33 | 373 | 0.54 % | 1,262.09 | 32 | 0.05 % | 90.61 | 85 | 0.28 % | 545.25 |
| vzeti | vzeti | vzet | i | 557 | 0.26 % | 538.11 | 112 | 0.25 % | 485.85 | 200 | 0.29 % | 676.72 | 154 | 0.22 % | 436.08 | 91 | 0.30 % | 583.73 |
| slišati | slišati | slišat | i | 534 | 0.25 % | 515.89 | 189 | 0.42 % | 819.87 | 113 | 0.16 % | 382.35 | 183 | 0.27 % | 518.19 | 49 | 0.16 % | 314.32 |
| pomeniti | pomeniti | pomenit | i | 511 | 0.24 % | 493.67 | 57 | 0.12 % | 247.26 | 40 | 0.06 % | 135.34 | 322 | 0.47 % | 911.80 | 92 | 0.30 % | 590.15 |
| napisati | napisati | napisat | i | 508 | 0.24 % | 490.77 | 51 | 0.11 % | 221.24 | 97 | 0.14 % | 328.21 | 245 | 0.35 % | 693.76 | 115 | 0.38 % | 737.69 |
| razumeti | razumeti | razumet | i | 499 | 0.23 % | 482.08 | 87 | 0.19 % | 377.40 | 150 | 0.22 % | 507.54 | 145 | 0.21 % | 410.59 | 117 | 0.38 % | 750.51 |

2.2.67 List of final character-level 2-grams from verb lemmas in the GOS 1.0 corpus with text-type distribution
File at CLARIN.SI: GOS1.0-word_parts-verbs-lemmas-final-2grams-taxonomy-entire.tsv

| Lemma | Lemma (lower-case) | Rest of the word | Final part of the word | Total absolute frequency of lemma | Percentage of all found lemmas | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N] |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| biti | biti | bi | ti | 93,407 | 43.59 % | 90,238.98 | 19,830 | 43.63 % | 86,021.79 | 31,073 | 44.83 % | 105,139.03 | 29,606 | 42.92 % | 83,834.30 | 12,898 | 42.20 % | 82,736.24 |
| imeti | imeti | ime | ti | 10,131 | 4.73 % | 9,787.39 | 1,862 | 4.10 % | 8,077.29 | 3,694 | 5.33 % | 12,499.07 | 2,799 | 4.06 % | 7,925.83 | 1,776 | 5.81 % | 11,392.43 |
| vedeti | vedeti | vede | ti | 7,560 | 3.53 % | 7,303.59 | 1,283 | 2.82 % | 5,565.61 | 3,900 | 5.63 % | 13,196.09 | 1,062 | 1.54 % | 3,007.23 | 1,315 | 4.30 % | 8,435.27 |
| iti | iti | i | ti | 5,311 | 2.48 % | 5,130.87 | 1,085 | 2.39 % | 4,706.69 | 2,446 | 3.53 % | 8,276.32 | 1,175 | 1.70 % | 3,327.21 | 605 | 1.98 % | 3,880.87 |
| reči | reči | re | či | 4,690 | 2.19 % | 4,530.93 | 878 | 1.93 % | 3,808.73 | 1,620 | 2.34 % | 5,481.45 | 1,476 | 2.14 % | 4,179.54 | 716 | 2.34 % | 4,592.89 |
| dati | dati | da | ti | 3,945 | 1.84 % | 3,811.20 | 902 | 1.99 % | 3,912.84 | 1,415 | 2.04 % | 4,787.81 | 956 | 1.39 % | 2,707.07 | 672 | 2.20 % | 4,310.65 |
| misliti | misliti | misli | ti | 2,710 | 1.26 % | 2,618.09 | 470 | 1.03 % | 2,038.84 | 975 | 1.41 % | 3,299.02 | 713 | 1.03 % | 2,018.98 | 552 | 1.81 % | 3,540.89 |
| priti | priti | pri | ti | 2,530 | 1.18 % | 2,444.19 | 565 | 1.24 % | 2,450.95 | 968 | 1.40 % | 3,275.34 | 661 | 0.96 % | 1,871.73 | 336 | 1.10 % | 2,155.32 |
| videti | videti | vide | ti | 2,309 | 1.08 % | 2,230.69 | 608 | 1.34 % | 2,637.48 | 759 | 1.09 % | 2,568.16 | 609 | 0.88 % | 1,724.48 | 333 | 1.09 % | 2,136.08 |
| morati | morati | mora | ti | 2,233 | 1.04 % | 2,157.26 | 386 | 0.85 % | 1,674.45 | 559 | 0.81 % | 1,891.44 | 889 | 1.29 % | 2,517.35 | 399 | 1.30 % | 2,559.45 |
| moči | moči | mo | či | 2,067 | 0.96 % | 1,996.90 | 345 | 0.76 % | 1,496.60 | 905 | 1.31 % | 3,062.17 | 512 | 0.74 % | 1,449.81 | 305 | 1.00 % | 1,956.47 |
| gledati | gledati | gleda | ti | 1,777 | 0.83 % | 1,716.73 | 422 | 0.93 % | 1,830.62 | 702 | 1.01 % | 2,375.30 | 394 | 0.57 % | 1,115.68 | 259 | 0.85 % | 1,661.40 |
| povedati | povedati | poveda | ti | 1,686 | 0.79 % | 1,628.82 | 483 | 1.06 % | 2,095.24 | 389 | 0.56 % | 1,316.23 | 617 | 0.89 % | 1,747.14 | 197 | 0.64 % | 1,263.69 |
| narediti | narediti | naredi | ti | 1,614 | 0.75 % | 1,559.26 | 217 | 0.48 % | 941.34 | 543 | 0.78 % | 1,837.30 | 457 | 0.66 % | 1,294.07 | 397 | 1.30 % | 2,546.62 |
| delati | delati | dela | ti | 1,515 | 0.71 % | 1,463.62 | 238 | 0.52 % | 1,032.43 | 685 | 0.99 % | 2,317.78 | 313 | 0.45 % | 886.31 | 279 | 0.91 % | 1,789.69 |
| praviti | praviti | pravi | ti | 1,344 | 0.63 % | 1,298.42 | 285 | 0.63 % | 1,236.32 | 325 | 0.47 % | 1,099.67 | 527 | 0.76 % | 1,492.29 | 207 | 0.68 % | 1,327.83 |
| dobiti | dobiti | dobi | ti | 1,264 | 0.59 % | 1,221.13 | 254 | 0.56 % | 1,101.84 | 401 | 0.58 % | 1,356.83 | 387 | 0.56 % | 1,095.85 | 222 | 0.73 % | 1,424.05 |
| hoteti | hoteti | hote | ti | 1,047 | 0.49 % | 1,011.49 | 180 | 0.40 % | 780.83 | 422 | 0.61 % | 1,427.89 | 267 | 0.39 % | 756.05 | 178 | 0.58 % | 1,141.81 |
| pogledati | pogledati | pogleda | ti | 1,034 | 0.48 % | 998.93 | 220 | 0.48 % | 954.35 | 204 | 0.29 % | 690.26 | 441 | 0.64 % | 1,248.76 | 169 | 0.55 % | 1,084.08 |
| meti | meti | me | ti | 959 | 0.45 % | 926.47 | 169 | 0.37 % | 733.12 | 613 | 0.88 % | 2,074.16 | 79 | 0.12 % | 223.70 | 98 | 0.32 % | 628.64 |
| jesti | jesti | jes | ti | 922 | 0.43 % | 890.73 | 161 | 0.35 % | 698.41 | 364 | 0.53 % | 1,231.64 | 280 | 0.41 % | 792.87 | 117 | 0.38 % | 750.51 |
| znati | znati | zna | ti | 918 | 0.43 % | 886.86 | 158 | 0.35 % | 685.40 | 538 | 0.78 % | 1,820.38 | 142 | 0.21 % | 402.10 | 80 | 0.26 % | 513.17 |
| govoriti | govoriti | govori | ti | 906 | 0.42 % | 875.27 | 193 | 0.42 % | 837.23 | 177 | 0.26 % | 598.90 | 456 | 0.66 % | 1,291.24 | 80 | 0.26 % | 513.17 |
| začeti | začeti | zače | ti | 841 | 0.39 % | 812.48 | 146 | 0.32 % | 633.34 | 260 | 0.38 % | 879.74 | 340 | 0.49 % | 962.77 | 95 | 0.31 % | 609.39 |
| zdeti | zdeti | zde | ti | 804 | 0.38 % | 776.73 | 134 | 0.29 % | 581.29 | 235 | 0.34 % | 795.15 | 226 | 0.33 % | 639.96 | 209 | 0.68 % | 1,340.66 |
| čakati | čakati | čaka | ti | 782 | 0.36 % | 755.48 | 235 | 0.52 % | 1,019.42 | 293 | 0.42 % | 991.40 | 150 | 0.22 % | 424.75 | 104 | 0.34 % | 667.12 |
| ajati | ajati | aja | ti | 562 | 0.26 % | 542.94 | 72 | 0.16 % | 312.33 | 373 | 0.54 % | 1,262.09 | 32 | 0.05 % | 90.61 | 85 | 0.28 % | 545.25 |
| vzeti | vzeti | vze | ti | 557 | 0.26 % | 538.11 | 112 | 0.25 % | 485.85 | 200 | 0.29 % | 676.72 | 154 | 0.22 % | 436.08 | 91 | 0.30 % | 583.73 |
| slišati | slišati | sliša | ti | 534 | 0.25 % | 515.89 | 189 | 0.42 % | 819.87 | 113 | 0.16 % | 382.35 | 183 | 0.27 % | 518.19 | 49 | 0.16 % | 314.32 |
| pomeniti | pomeniti | pomeni | ti | 511 | 0.24 % | 493.67 | 57 | 0.12 % | 247.26 | 40 | 0.06 % | 135.34 | 322 | 0.47 % | 911.80 | 92 | 0.30 % | 590.15 |
| napisati | napisati | napisa | ti | 508 | 0.24 % | 490.77 | 51 | 0.11 % | 221.24 | 97 | 0.14 % | 328.21 | 245 | 0.35 % | 693.76 | 115 | 0.38 % | 737.69 |
| razumeti | razumeti | razume | ti | 499 | 0.23 % | 482.08 | 87 | 0.19 % | 377.40 | 150 | 0.22 % | 507.54 | 145 | 0.21 % | 410.59 | 117 | 0.38 % | 750.51 |

2.2.68 List of final character-level 3-grams from verb lemmas in the GOS 1.0 corpus with text-type distribution
File at CLARIN.SI: GOS1.0-word_parts-verbs-lemmas-final-3grams-taxonomy-entire.tsv

| Lemma | Lemma (lower-case) | Rest of the word | Final part of the word | Total absolute frequency of lemma | Percentage of all found lemmas | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N] |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| biti | biti | b | iti | 93,407 | 43.59 % | 90,238.98 | 19,830 | 43.63 % | 86,021.79 | 31,073 | 44.83 % | 105,139.03 | 29,606 | 42.92 % | 83,834.30 | 12,898 | 42.20 % | 82,736.24 |
| imeti | imeti | im | eti | 10,131 | 4.73 % | 9,787.39 | 1,862 | 4.10 % | 8,077.29 | 3,694 | 5.33 % | 12,499.07 | 2,799 | 4.06 % | 7,925.83 | 1,776 | 5.81 % | 11,392.43 |
| vedeti | vedeti | ved | eti | 7,560 | 3.53 % | 7,303.59 | 1,283 | 2.82 % | 5,565.61 | 3,900 | 5.63 % | 13,196.09 | 1,062 | 1.54 % | 3,007.23 | 1,315 | 4.30 % | 8,435.27 |
| iti | iti | | iti | 5,311 | 2.48 % | 5,130.87 | 1,085 | 2.39 % | 4,706.69 | 2,446 | 3.53 % | 8,276.32 | 1,175 | 1.70 % | 3,327.21 | 605 | 1.98 % | 3,880.87 |
| reči | reči | r | eči | 4,690 | 2.19 % | 4,530.93 | 878 | 1.93 % | 3,808.73 | 1,620 | 2.34 % | 5,481.45 | 1,476 | 2.14 % | 4,179.54 | 716 | 2.34 % | 4,592.89 |
| dati | dati | d | ati | 3,945 | 1.84 % | 3,811.20 | 902 | 1.99 % | 3,912.84 | 1,415 | 2.04 % | 4,787.81 | 956 | 1.39 % | 2,707.07 | 672 | 2.20 % | 4,310.65 |
| misliti | misliti | misl | iti | 2,710 | 1.26 % | 2,618.09 | 470 | 1.03 % | 2,038.84 | 975 | 1.41 % | 3,299.02 | 713 | 1.03 % | 2,018.98 | 552 | 1.81 % | 3,540.89 |
| priti | priti | pr | iti | 2,530 | 1.18 % | 2,444.19 | 565 | 1.24 % | 2,450.95 | 968 | 1.40 % | 3,275.34 | 661 | 0.96 % | 1,871.73 | 336 | 1.10 % | 2,155.32 |
| videti | videti | vid | eti | 2,309 | 1.08 % | 2,230.69 | 608 | 1.34 % | 2,637.48 | 759 | 1.09 % | 2,568.16 | 609 | 0.88 % | 1,724.48 | 333 | 1.09 % | 2,136.08 |
| morati | morati | mor | ati | 2,233 | 1.04 % | 2,157.26 | 386 | 0.85 % | 1,674.45 | 559 | 0.81 % | 1,891.44 | 889 | 1.29 % | 2,517.35 | 399 | 1.30 % | 2,559.45 |
| moči | moči | m | oči | 2,067 | 0.96 % | 1,996.90 | 345 | 0.76 % | 1,496.60 | 905 | 1.31 % | 3,062.17 | 512 | 0.74 % | 1,449.81 | 305 | 1.00 % | 1,956.47 |
| gledati | gledati | gled | ati | 1,777 | 0.83 % | 1,716.73 | 422 | 0.93 % | 1,830.62 | 702 | 1.01 % | 2,375.30 | 394 | 0.57 % | 1,115.68 | 259 | 0.85 % | 1,661.40 |
| povedati | povedati | poved | ati | 1,686 | 0.79 % | 1,628.82 | 483 | 1.06 % | 2,095.24 | 389 | 0.56 % | 1,316.23 | 617 | 0.89 % | 1,747.14 | 197 | 0.64 % | 1,263.69 |
| narediti | narediti | nared | iti | 1,614 | 0.75 % | 1,559.26 | 217 | 0.48 % | 941.34 | 543 | 0.78 % | 1,837.30 | 457 | 0.66 % | 1,294.07 | 397 | 1.30 % | 2,546.62 |
| delati | delati | del | ati | 1,515 | 0.71 % | 1,463.62 | 238 | 0.52 % | 1,032.43 | 685 | 0.99 % | 2,317.78 | 313 | 0.45 % | 886.31 | 279 | 0.91 % | 1,789.69 |
| praviti | praviti | prav | iti | 1,344 | 0.63 % | 1,298.42 | 285 | 0.63 % | 1,236.32 | 325 | 0.47 % | 1,099.67 | 527 | 0.76 % | 1,492.29 | 207 | 0.68 % | 1,327.83 |
| dobiti | dobiti | dob | iti | 1,264 | 0.59 % | 1,221.13 | 254 | 0.56 % | 1,101.84 | 401 | 0.58 % | 1,356.83 | 387 | 0.56 % | 1,095.85 | 222 | 0.73 % | 1,424.05 |
| hoteti | hoteti | hot | eti | 1,047 | 0.49 % | 1,011.49 | 180 | 0.40 % | 780.83 | 422 | 0.61 % | 1,427.89 | 267 | 0.39 % | 756.05 | 178 | 0.58 % | 1,141.81 |
| pogledati | pogledati | pogled | ati | 1,034 | 0.48 % | 998.93 | 220 | 0.48 % | 954.35 | 204 | 0.29 % | 690.26 | 441 | 0.64 % | 1,248.76 | 169 | 0.55 % | 1,084.08 |
| meti | meti | m | eti | 959 | 0.45 % | 926.47 | 169 | 0.37 % | 733.12 | 613 | 0.88 % | 2,074.16 | 79 | 0.12 % | 223.70 | 98 | 0.32 % | 628.64 |
| jesti | jesti | je | sti | 922 | 0.43 % | 890.73 | 161 | 0.35 % | 698.41 | 364 | 0.53 % | 1,231.64 | 280 | 0.41 % | 792.87 | 117 | 0.38 % | 750.51 |
| znati | znati | zn | ati | 918 | 0.43 % | 886.86 | 158 | 0.35 % | 685.40 | 538 | 0.78 % | 1,820.38 | 142 | 0.21 % | 402.10 | 80 | 0.26 % | 513.17 |
| govoriti | govoriti | govor | iti | 906 | 0.42 % | 875.27 | 193 | 0.42 % | 837.23 | 177 | 0.26 % | 598.90 | 456 | 0.66 % | 1,291.24 | 80 | 0.26 % | 513.17 |
| začeti | začeti | zač | eti | 841 | 0.39 % | 812.48 | 146 | 0.32 % | 633.34 | 260 | 0.38 % | 879.74 | 340 | 0.49 % | 962.77 | 95 | 0.31 % | 609.39 |
| zdeti | zdeti | zd | eti | 804 | 0.38 % | 776.73 | 134 | 0.29 % | 581.29 | 235 | 0.34 % | 795.15 | 226 | 0.33 % | 639.96 | 209 | 0.68 % | 1,340.66 |
| čakati | čakati | čak | ati | 782 | 0.36 % | 755.48 | 235 | 0.52 % | 1,019.42 | 293 | 0.42 % | 991.40 | 150 | 0.22 % | 424.75 | 104 | 0.34 % | 667.12 |
| ajati | ajati | aj | ati | 562 | 0.26 % | 542.94 | 72 | 0.16 % | 312.33 | 373 | 0.54 % | 1,262.09 | 32 | 0.05 % | 90.61 | 85 | 0.28 % | 545.25 |
| vzeti | vzeti | vz | eti | 557 | 0.26 % | 538.11 | 112 | 0.25 % | 485.85 | 200 | 0.29 % | 676.72 | 154 | 0.22 % | 436.08 | 91 | 0.30 % | 583.73 |
| slišati | slišati | sliš | ati | 534 | 0.25 % | 515.89 | 189 | 0.42 % | 819.87 | 113 | 0.16 % | 382.35 | 183 | 0.27 % | 518.19 | 49 | 0.16 % | 314.32 |
| pomeniti | pomeniti | pomen | iti | 511 | 0.24 % | 493.67 | 57 | 0.12 % | 247.26 | 40 | 0.06 % | 135.34 | 322 | 0.47 % | 911.80 | 92 | 0.30 % | 590.15 |
| napisati | napisati | napis | ati | 508 | 0.24 % | 490.77 | 51 | 0.11 % | 221.24 | 97 | 0.14 % | 328.21 | 245 | 0.35 % | 693.76 | 115 | 0.38 % | 737.69 |
| razumeti | razumeti | razum | eti | 499 | 0.23 % | 482.08 | 87 | 0.19 % | 377.40 | 150 | 0.22 % | 507.54 | 145 | 0.21 % | 410.59 | 117 | 0.38 % | 750.51 |

2.2.69 List of final character-level 4-grams from verb lemmas in the GOS 1.0 corpus with text-type distribution
File at CLARIN.SI: GOS1.0-word_parts-verbs-lemmas-final-4grams-taxonomy-entire.tsv

| Lemma | Lemma (lower-case) | Rest of the word | Final part of the word | Total absolute frequency of lemma | Percentage of all found lemmas | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N] |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| biti | biti | | biti | 93,407 | 44.69 % | 90,238.98 | 19,830 | 44.70 % | 86,021.79 | 31,073 | 46.47 % | 105,139.03 | 29,606 | 43.66 % | 83,834.30 | 12,898 | 43.05 % | 82,736.24 |
| imeti | imeti | i | meti | 10,131 | 4.85 % | 9,787.39 | 1,862 | 4.20 % | 8,077.29 | 3,694 | 5.52 % | 12,499.07 | 2,799 | 4.13 % | 7,925.83 | 1,776 | 5.93 % | 11,392.43 |
| vedeti | vedeti | ve | deti | 7,560 | 3.62 % | 7,303.59 | 1,283 | 2.89 % | 5,565.61 | 3,900 | 5.83 % | 13,196.09 | 1,062 | 1.57 % | 3,007.23 | 1,315 | 4.39 % | 8,435.27 |
| reči | reči | | reči | 4,690 | 2.24 % | 4,530.93 | 878 | 1.98 % | 3,808.73 | 1,620 | 2.42 % | 5,481.45 | 1,476 | 2.18 % | 4,179.54 | 716 | 2.39 % | 4,592.89 |
| dati | dati | | dati | 3,945 | 1.89 % | 3,811.20 | 902 | 2.03 % | 3,912.84 | 1,415 | 2.12 % | 4,787.81 | 956 | 1.41 % | 2,707.07 | 672 | 2.24 % | 4,310.65 |
| misliti | misliti | mis | liti | 2,710 | 1.30 % | 2,618.09 | 470 | 1.06 % | 2,038.84 | 975 | 1.46 % | 3,299.02 | 713 | 1.05 % | 2,018.98 | 552 | 1.84 % | 3,540.89 |
| priti | priti | p | riti | 2,530 | 1.21 % | 2,444.19 | 565 | 1.27 % | 2,450.95 | 968 | 1.45 % | 3,275.34 | 661 | 0.97 % | 1,871.73 | 336 | 1.12 % | 2,155.32 |
| videti | videti | vi | deti | 2,309 | 1.10 % | 2,230.69 | 608 | 1.37 % | 2,637.48 | 759 | 1.14 % | 2,568.16 | 609 | 0.90 % | 1,724.48 | 333 | 1.11 % | 2,136.08 |
| morati | morati | mo | rati | 2,233 | 1.07 % | 2,157.26 | 386 | 0.87 % | 1,674.45 | 559 | 0.84 % | 1,891.44 | 889 | 1.31 % | 2,517.35 | 399 | 1.33 % | 2,559.45 |
| moči | moči | | moči | 2,067 | 0.99 % | 1,996.90 | 345 | 0.78 % | 1,496.60 | 905 | 1.35 % | 3,062.17 | 512 | 0.76 % | 1,449.81 | 305 | 1.02 % | 1,956.47 |
| gledati | gledati | gle | dati | 1,777 | 0.85 % | 1,716.73 | 422 | 0.95 % | 1,830.62 | 702 | 1.05 % | 2,375.30 | 394 | 0.58 % | 1,115.68 | 259 | 0.86 % | 1,661.40 |
| povedati | povedati | pove | dati | 1,686 | 0.81 % | 1,628.82 | 483 | 1.09 % | 2,095.24 | 389 | 0.58 % | 1,316.23 | 617 | 0.91 % | 1,747.14 | 197 | 0.66 % | 1,263.69 |
| narediti | narediti | nare | diti | 1,614 | 0.77 % | 1,559.26 | 217 | 0.49 % | 941.34 | 543 | 0.81 % | 1,837.30 | 457 | 0.67 % | 1,294.07 | 397 | 1.32 % | 2,546.62 |
| delati | delati | de | lati | 1,515 | 0.72 % | 1,463.62 | 238 | 0.54 % | 1,032.43 | 685 | 1.02 % | 2,317.78 | 313 | 0.46 % | 886.31 | 279 | 0.93 % | 1,789.69 |
| praviti | praviti | pra | viti | 1,344 | 0.64 % | 1,298.42 | 285 | 0.64 % | 1,236.32 | 325 | 0.49 % | 1,099.67 | 527 | 0.78 % | 1,492.29 | 207 | 0.69 % | 1,327.83 |
| dobiti | dobiti | do | biti | 1,264 | 0.60 % | 1,221.13 | 254 | 0.57 % | 1,101.84 | 401 | 0.60 % | 1,356.83 | 387 | 0.57 % | 1,095.85 | 222 | 0.74 % | 1,424.05 |
| hoteti | hoteti | ho | teti | 1,047 | 0.50 % | 1,011.49 | 180 | 0.41 % | 780.83 | 422 | 0.63 % | 1,427.89 | 267 | 0.39 % | 756.05 | 178 | 0.59 % | 1,141.81 |
| pogledati | pogledati | pogle | dati | 1,034 | 0.49 % | 998.93 | 220 | 0.50 % | 954.35 | 204 | 0.30 % | 690.26 | 441 | 0.65 % | 1,248.76 | 169 | 0.56 % | 1,084.08 |
| meti | meti | | meti | 959 | 0.46 % | 926.47 | 169 | 0.38 % | 733.12 | 613 | 0.92 % | 2,074.16 | 79 | 0.12 % | 223.70 | 98 | 0.33 % | 628.64 |
| jesti | jesti | j | esti | 922 | 0.44 % | 890.73 | 161 | 0.36 % | 698.41 | 364 | 0.54 % | 1,231.64 | 280 | 0.41 % | 792.87 | 117 | 0.39 % | 750.51 |
| znati | znati | z | nati | 918 | 0.44 % | 886.86 | 158 | 0.36 % | 685.40 | 538 | 0.81 % | 1,820.38 | 142 | 0.21 % | 402.10 | 80 | 0.27 % | 513.17 |
| govoriti | govoriti | govo | riti | 906 | 0.43 % | 875.27 | 193 | 0.43 % | 837.23 | 177 | 0.27 % | 598.90 | 456 | 0.67 % | 1,291.24 | 80 | 0.27 % | 513.17 |
| začeti | začeti | za | četi | 841 | 0.40 % | 812.48 | 146 | 0.33 % | 633.34 | 260 | 0.39 % | 879.74 | 340 | 0.50 % | 962.77 | 95 | 0.32 % | 609.39 |
| zdeti | zdeti | z | deti | 804 | 0.39 % | 776.73 | 134 | 0.30 % | 581.29 | 235 | 0.35 % | 795.15 | 226 | 0.33 % | 639.96 | 209 | 0.70 % | 1,340.66 |
| čakati | čakati | ča | kati | 782 | 0.37 % | 755.48 | 235 | 0.53 % | 1,019.42 | 293 | 0.44 % | 991.40 | 150 | 0.22 % | 424.75 | 104 | 0.35 % | 667.12 |
| ajati | ajati | a | jati | 562 | 0.27 % | 542.94 | 72 | 0.16 % | 312.33 | 373 | 0.56 % | 1,262.09 | 32 | 0.05 % | 90.61 | 85 | 0.28 % | 545.25 |
| vzeti | vzeti | v | zeti | 557 | 0.27 % | 538.11 | 112 | 0.25 % | 485.85 | 200 | 0.30 % | 676.72 | 154 | 0.23 % | 436.08 | 91 | 0.30 % | 583.73 |
| slišati | slišati | sli | šati | 534 | 0.26 % | 515.89 | 189 | 0.43 % | 819.87 | 113 | 0.17 % | 382.35 | 183 | 0.27 % | 518.19 | 49 | 0.16 % | 314.32 |
| pomeniti | pomeniti | pome | niti | 511 | 0.24 % | 493.67 | 57 | 0.13 % | 247.26 | 40 | 0.06 % | 135.34 | 322 | 0.47 % | 911.80 | 92 | 0.31 % | 590.15 |
| napisati | napisati | napi | sati | 508 | 0.24 % | 490.77 | 51 | 0.12 % | 221.24 | 97 | 0.14 % | 328.21 | 245 | 0.36 % | 693.76 | 115 | 0.38 % | 737.69 |
| razumeti | razumeti | razu | meti | 499 | 0.24 % | 482.08 | 87 | 0.20 % | 377.40 | 150 | 0.22 % | 507.54 | 145 | 0.21 % | 410.59 | 117 | 0.39 % | 750.51 |
| poznati | poznati | poz | nati | 498 | 0.24 % | 481.11 | 121 | 0.27 % | 524.89 | 150 | 0.22 % | 507.54 | 173 | 0.26 % | 489.88 | 54 | 0.18 % | 346.39 |

2.2.70 List of final character-level 5-grams from verb lemmas in the GOS 1.0 corpus with text-type distribution
File at CLARIN.SI: GOS1.0-word_parts-verbs-lemmas-final-5grams-taxonomy-entire.tsv

| Lemma | Lemma (lower-case) | Rest of the word | Final part of the word | Total absolute frequency of lemma | Percentage of all found lemmas | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N] |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| imeti | imeti | | imeti | 10,131 | 9.84 % | 9,787.39 | 1,862 | 8.47 % | 8,077.29 | 3,694 | 11.98 % | 12,499.07 | 2,799 | 8.01 % | 7,925.83 | 1,776 | 11.70 % | 11,392.43 |
| vedeti | vedeti | v | edeti | 7,560 | 7.34 % | 7,303.59 | 1,283 | 5.83 % | 5,565.61 | 3,900 | 12.65 % | 13,196.09 | 1,062 | 3.04 % | 3,007.23 | 1,315 | 8.66 % | 8,435.27 |
| misliti | misliti | mi | sliti | 2,710 | 2.63 % | 2,618.09 | 470 | 2.14 % | 2,038.84 | 975 | 3.16 % | 3,299.02 | 713 | 2.04 % | 2,018.98 | 552 | 3.64 % | 3,540.89 |
| priti | priti | | priti | 2,530 | 2.46 % | 2,444.19 | 565 | 2.57 % | 2,450.95 | 968 | 3.14 % | 3,275.34 | 661 | 1.89 % | 1,871.73 | 336 | 2.21 % | 2,155.32 |
| videti | videti | v | ideti | 2,309 | 2.24 % | 2,230.69 | 608 | 2.77 % | 2,637.48 | 759 | 2.46 % | 2,568.16 | 609 | 1.74 % | 1,724.48 | 333 | 2.19 % | 2,136.08 |
| morati | morati | m | orati | 2,233 | 2.17 % | 2,157.26 | 386 | 1.75 % | 1,674.45 | 559 | 1.81 % | 1,891.44 | 889 | 2.54 % | 2,517.35 | 399 | 2.63 % | 2,559.45 |
| gledati | gledati | gl | edati | 1,777 | 1.73 % | 1,716.73 | 422 | 1.92 % | 1,830.62 | 702 | 2.28 % | 2,375.30 | 394 | 1.13 % | 1,115.68 | 259 | 1.71 % | 1,661.40 |
| povedati | povedati | pov | edati | 1,686 | 1.64 % | 1,628.82 | 483 | 2.20 % | 2,095.24 | 389 | 1.26 % | 1,316.23 | 617 | 1.77 % | 1,747.14 | 197 | 1.30 % | 1,263.69 |
| narediti | narediti | nar | editi | 1,614 | 1.57 % | 1,559.26 | 217 | 0.99 % | 941.34 | 543 | 1.76 % | 1,837.30 | 457 | 1.31 % | 1,294.07 | 397 | 2.62 % | 2,546.62 |
| delati | delati | d | elati | 1,515 | 1.47 % | 1,463.62 | 238 | 1.08 % | 1,032.43 | 685 | 2.22 % | 2,317.78 | 313 | 0.90 % | 886.31 | 279 | 1.84 % | 1,789.69 |
| praviti | praviti | pr | aviti | 1,344 | 1.31 % | 1,298.42 | 285 | 1.30 % | 1,236.32 | 325 | 1.05 % | 1,099.67 | 527 | 1.51 % | 1,492.29 | 207 | 1.36 % | 1,327.83 |
| dobiti | dobiti | d | obiti | 1,264 | 1.23 % | 1,221.13 | 254 | 1.16 % | 1,101.84 | 401 | 1.30 % | 1,356.83 | 387 | 1.11 % | 1,095.85 | 222 | 1.46 % | 1,424.05 |
| hoteti | hoteti | h | oteti | 1,047 | 1.02 % | 1,011.49 | 180 | 0.82 % | 780.83 | 422 | 1.37 % | 1,427.89 | 267 | 0.76 % | 756.05 | 178 | 1.17 % | 1,141.81 |
| pogledati | pogledati | pogl | edati | 1,034 | 1.00 % | 998.93 | 220 | 1.00 % | 954.35 | 204 | 0.66 % | 690.26 | 441 | 1.26 % | 1,248.76 | 169 | 1.11 % | 1,084.08 |
| jesti | jesti | | jesti | 922 | 0.90 % | 890.73 | 161 | 0.73 % | 698.41 | 364 | 1.18 % | 1,231.64 | 280 | 0.80 % | 792.87 | 117 | 0.77 % | 750.51 |
| znati | znati | | znati | 918 | 0.89 % | 886.86 | 158 | 0.72 % | 685.40 | 538 | 1.75 % | 1,820.38 | 142 | 0.41 % | 402.10 | 80 | 0.53 % | 513.17 |
| govoriti | govoriti | gov | oriti | 906 | 0.88 % | 875.27 | 193 | 0.88 % | 837.23 | 177 | 0.57 % | 598.90 | 456 | 1.30 % | 1,291.24 | 80 | 0.53 % | 513.17 |
| začeti | začeti | z | ačeti | 841 | 0.82 % | 812.48 | 146 | 0.66 % | 633.34 | 260 | 0.84 % | 879.74 | 340 | 0.97 % | 962.77 | 95 | 0.63 % | 609.39 |
| zdeti | zdeti | | zdeti | 804 | 0.78 % | 776.73 | 134 | 0.61 % | 581.29 | 235 | 0.76 % | 795.15 | 226 | 0.65 % | 639.96 | 209 | 1.38 % | 1,340.66 |
| čakati | čakati | č | akati | 782 | 0.76 % | 755.48 | 235 | 1.07 % | 1,019.42 | 293 | 0.95 % | 991.40 | 150 | 0.43 % | 424.75 | 104 | 0.69 % | 667.12 |
| ajati | ajati | | ajati | 562 | 0.55 % | 542.94 | 72 | 0.33 % | 312.33 | 373 | 1.21 % | 1,262.09 | 32 | 0.09 % | 90.61 | 85 | 0.56 % | 545.25 |
| vzeti | vzeti | | vzeti | 557 | 0.54 % | 538.11 | 112 | 0.51 % | 485.85 | 200 | 0.65 % | 676.72 | 154 | 0.44 % | 436.08 | 91 | 0.60 % | 583.73 |
| slišati | slišati | sl | išati | 534 | 0.52 % | 515.89 | 189 | 0.86 % | 819.87 | 113 | 0.37 % | 382.35 | 183 | 0.52 % | 518.19 | 49 | 0.32 % | 314.32 |
| pomeniti | pomeniti | pom | eniti | 511 | 0.50 % | 493.67 | 57 | 0.26 % | 247.26 | 40 | 0.13 % | 135.34 | 322 | 0.92 % | 911.80 | 92 | 0.61 % | 590.15 |
| napisati | napisati | nap | isati | 508 | 0.49 % | 490.77 | 51 | 0.23 % | 221.24 | 97 | 0.32 % | 328.21 | 245 | 0.70 % | 693.76 | 115 | 0.76 % | 737.69 |
| razumeti | razumeti | raz | umeti | 499 | 0.48 % | 482.08 | 87 | 0.40 % | 377.40 | 150 | 0.49 % | 507.54 | 145 | 0.41 % | 410.59 | 117 | 0.77 % | 750.51 |
| poznati | poznati | po | znati | 498 | 0.48 % | 481.11 | 121 | 0.55 % | 524.89 | 150 | 0.49 % | 507.54 | 173 | 0.49 % | 489.88 | 54 | 0.36 % | 346.39 |
| pisati | pisati | p | isati | 492 | 0.48 % | 475.31 | 66 | 0.30 % | 286.31 | 149 | 0.48 % | 504.16 | 183 | 0.52 % | 518.19 | 94 | 0.62 % | 602.98 |
| hoditi | hoditi | h | oditi | 482 | 0.47 % | 465.65 | 99 | 0.45 % | 429.46 | 276 | 0.90 % | 933.88 | 63 | 0.18 % | 178.39 | 44 | 0.29 % | 282.24 |
| želeti | želeti | ž | eleti | 468 | 0.46 % | 452.13 | 169 | 0.77 % | 733.12 | 26 | 0.08 % | 87.97 | 236 | 0.68 % | 668.27 | 37 | 0.24 % | 237.34 |
| najti | najti | | najti | 452 | 0.44 % | 436.67 | 108 | 0.49 % | 468.50 | 107 | 0.35 % | 362.05 | 168 | 0.48 % | 475.72 | 69 | 0.46 % | 442.61 |
| vprašati | vprašati | vpr | ašati | 434 | 0.42 % | 419.28 | 83 | 0.38 % | 360.05 | 142 | 0.46 % | 480.47 | 147 | 0.42 % | 416.25 | 62 | 0.41 % | 397.71 |

2.2.71 List of initial character-level 1-grams from verb standardized forms in the GOS 1.0 corpus with text-type distribution
File at CLARIN.SI: GOS1.0-word_parts-verbs-standardized_forms-initial-1grams-taxonomy-entire.tsv

| Standardized form | Initial part of the word | Rest of the word | Total absolute frequency of standardized form | Percentage of all found standardized forms | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N] |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| je | j | e | 32,186 | 15.02 % | 31,094.37 | 6,536 | 14.38 % | 28,352.92 | 10,658 | 15.38 % | 36,062.56 | 10,744 | 15.57 % | 30,423.42 | 4,248 | 13.90 % | 27,249.46 |
| so | s | o | 7,976 | 3.72 % | 7,705.48 | 1,528 | 3.36 % | 6,628.41 | 2,364 | 3.41 % | 7,998.86 | 3,012 | 4.37 % | 8,528.98 | 1,072 | 3.51 % | 6,876.51 |
| bi | b | i | 7,466 | 3.48 % | 7,212.78 | 1,387 | 3.05 % | 6,016.75 | 1,982 | 2.86 % | 6,706.32 | 2,746 | 3.98 % | 7,775.75 | 1,351 | 4.42 % | 8,666.20 |
| sem | s | em | 6,222 | 2.90 % | 6,010.97 | 1,191 | 2.62 % | 5,166.51 | 2,952 | 4.26 % | 9,988.43 | 1,188 | 1.72 % | 3,364.02 | 891 | 2.92 % | 5,715.46 |
| ni | n | i | 5,191 | 2.42 % | 5,014.94 | 930 | 2.05 % | 4,034.30 | 1,870 | 2.70 % | 6,327.36 | 1,567 | 2.27 % | 4,437.22 | 824 | 2.70 % | 5,285.68 |
| bo | b | o | 5,064 | 2.36 % | 4,892.25 | 1,301 | 2.86 % | 5,643.69 | 1,258 | 1.81 % | 4,256.59 | 1,777 | 2.58 % | 5,031.87 | 728 | 2.38 % | 4,669.87 |
| smo | s | mo | 3,981 | 1.86 % | 3,845.98 | 899 | 1.98 % | 3,899.83 | 1,148 | 1.66 % | 3,884.39 | 1,544 | 2.24 % | 4,372.09 | 390 | 1.28 % | 2,501.72 |
| vem | v | em | 3,558 | 1.66 % | 3,437.33 | 509 | 1.12 % | 2,208.02 | 1,885 | 2.72 % | 6,378.11 | 502 | 0.73 % | 1,421.50 | 662 | 2.17 % | 4,246.50 |
| bilo | b | ilo | 3,233 | 1.51 % | 3,123.35 | 572 | 1.26 % | 2,481.31 | 1,400 | 2.02 % | 4,737.06 | 850 | 1.23 % | 2,406.92 | 411 | 1.34 % | 2,636.42 |
| si | s | i | 3,196 | 1.49 % | 3,087.60 | 903 | 1.99 % | 3,917.18 | 1,349 | 1.95 % | 4,564.50 | 545 | 0.79 % | 1,543.26 | 399 | 1.30 % | 2,559.45 |
| veš | v | eš | 2,896 | 1.35 % | 2,797.78 | 499 | 1.10 % | 2,164.64 | 1,799 | 2.60 % | 6,087.12 | 98 | 0.14 % | 277.50 | 500 | 1.64 % | 3,207.33 |
| bil | b | il | 2,201 | 1.03 % | 2,126.35 | 550 | 1.21 % | 2,385.88 | 903 | 1.30 % | 3,055.40 | 621 | 0.90 % | 1,758.46 | 127 | 0.41 % | 814.66 |
| bila | b | ila | 2,143 | 1.00 % | 2,070.32 | 508 | 1.12 % | 2,203.68 | 858 | 1.24 % | 2,903.14 | 581 | 0.84 % | 1,645.20 | 196 | 0.64 % | 1,257.27 |
| bomo | b | omo | 2,018 | 0.94 % | 1,949.56 | 544 | 1.20 % | 2,359.85 | 340 | 0.49 % | 1,150.43 | 839 | 1.22 % | 2,375.77 | 295 | 0.96 % | 1,892.32 |
| bom | b | om | 1,975 | 0.92 % | 1,908.02 | 330 | 0.73 % | 1,431.53 | 875 | 1.26 % | 2,960.66 | 379 | 0.55 % | 1,073.20 | 391 | 1.28 % | 2,508.13 |
| mislim | m | islim | 1,862 | 0.87 % | 1,798.85 | 313 | 0.69 % | 1,357.78 | 632 | 0.91 % | 2,138.44 | 475 | 0.69 % | 1,345.04 | 442 | 1.45 % | 2,835.28 |
| ima | i | ma | 1,655 | 0.77 % | 1,598.87 | 275 | 0.60 % | 1,192.94 | 614 | 0.89 % | 2,077.54 | 498 | 0.72 % | 1,410.17 | 268 | 0.88 % | 1,719.13 |
| ste | s | te | 1,629 | 0.76 % | 1,573.75 | 481 | 1.06 % | 2,086.56 | 188 | 0.27 % | 636.12 | 717 | 1.04 % | 2,030.30 | 243 | 0.80 % | 1,558.76 |
| rekel | r | ekel | 1,278 | 0.60 % | 1,234.65 | 213 | 0.47 % | 923.99 | 575 | 0.83 % | 1,945.58 | 328 | 0.47 % | 928.79 | 162 | 0.53 % | 1,039.17 |
| da | d | a | 1,268 | 0.59 % | 1,224.99 | 197 | 0.43 % | 854.58 | 454 | 0.66 % | 1,536.16 | 407 | 0.59 % | 1,152.49 | 210 | 0.69 % | 1,347.08 |
| imamo | i | mamo | 1,172 | 0.55 % | 1,132.25 | 243 | 0.54 % | 1,054.12 | 173 | 0.25 % | 585.37 | 587 | 0.85 % | 1,662.19 | 169 | 0.55 % | 1,084.08 |
| boš | b | oš | 1,151 | 0.54 % | 1,111.96 | 238 | 0.52 % | 1,032.43 | 518 | 0.75 % | 1,752.71 | 197 | 0.29 % | 557.84 | 198 | 0.65 % | 1,270.10 |
| gre | g | re | 1,122 | 0.52 % | 1,083.95 | 200 | 0.44 % | 867.59 | 309 | 0.45 % | 1,045.54 | 457 | 0.66 % | 1,294.07 | 156 | 0.51 % | 1,000.69 |
| imaš | i | maš | 1,058 | 0.49 % | 1,022.12 | 220 | 0.48 % | 954.35 | 535 | 0.77 % | 1,810.23 | 138 | 0.20 % | 390.77 | 165 | 0.54 % | 1,058.42 |
| imajo | i | majo | 942 | 0.44 % | 910.05 | 129 | 0.28 % | 559.60 | 387 | 0.56 % | 1,309.46 | 283 | 0.41 % | 801.36 | 143 | 0.47 % | 917.30 |
| ma | m | a | 939 | 0.44 % | 907.15 | 162 | 0.36 % | 702.75 | 607 | 0.88 % | 2,053.85 | 77 | 0.11 % | 218.04 | 93 | 0.30 % | 596.56 |
| bili | b | ili | 932 | 0.43 % | 900.39 | 245 | 0.54 % | 1,062.80 | 309 | 0.45 % | 1,045.54 | 310 | 0.45 % | 877.82 | 68 | 0.22 % | 436.20 |
| imeli | i | meli | 915 | 0.43 % | 883.97 | 164 | 0.36 % | 711.43 | 337 | 0.49 % | 1,140.28 | 287 | 0.42 % | 812.69 | 127 | 0.41 % | 814.66 |
| nisem | n | isem | 915 | 0.43 % | 883.97 | 172 | 0.38 % | 746.13 | 449 | 0.65 % | 1,519.24 | 155 | 0.23 % | 438.91 | 139 | 0.46 % | 891.64 |
| recimo | r | ecimo | 893 | 0.42 % | 862.71 | 172 | 0.38 % | 746.13 | 146 | 0.21 % | 494.01 | 375 | 0.54 % | 1,061.87 | 200 | 0.65 % | 1,282.93 |
| rekla | r | ekla | 882 | 0.41 % | 852.09 | 101 | 0.22 % | 438.13 | 509 | 0.73 % | 1,722.26 | 125 | 0.18 % | 353.96 | 147 | 0.48 % | 942.95 |
| sta | s | ta | 821 | 0.38 % | 793.15 | 218 | 0.48 % | 945.68 | 261 | 0.38 % | 883.12 | 251 | 0.36 % | 710.75 | 91 | 0.30 % | 583.73 |

2.2.72 List of initial character-level 2-grams from verb standardized forms in the GOS 1.0 corpus with text-type distribution
File at CLARIN.SI: GOS1.0-word_parts-verbs-standardized_forms-initial-2grams-taxonomy-entire.tsv

| Standardized form | Initial part of the word | Rest of the word | Total absolute frequency of standardized form | Percentage of all found standardized forms | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N] |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| je | je | | 32,186 | 15.02 % | 31,094.37 | 6,536 | 14.38 % | 28,352.92 | 10,658 | 15.38 % | 36,062.56 | 10,744 | 15.57 % | 30,423.42 | 4,248 | 13.90 % | 27,249.46 |
| so | so | | 7,976 | 3.72 % | 7,705.48 | 1,528 | 3.36 % | 6,628.41 | 2,364 | 3.41 % | 7,998.86 | 3,012 | 4.37 % | 8,528.98 | 1,072 | 3.51 % | 6,876.51 |
| bi | bi | | 7,466 | 3.48 % | 7,212.78 | 1,387 | 3.05 % | 6,016.75 | 1,982 | 2.86 % | 6,706.32 | 2,746 | 3.98 % | 7,775.75 | 1,351 | 4.42 % | 8,666.20 |
| sem | se | m | 6,222 | 2.90 % | 6,010.97 | 1,191 | 2.62 % | 5,166.51 | 2,952 | 4.26 % | 9,988.43 | 1,188 | 1.72 % | 3,364.02 | 891 | 2.92 % | 5,715.46 |
| ni | ni | | 5,191 | 2.42 % | 5,014.94 | 930 | 2.05 % | 4,034.30 | 1,870 | 2.70 % | 6,327.36 | 1,567 | 2.27 % | 4,437.22 | 824 | 2.70 % | 5,285.68 |
| bo | bo | | 5,064 | 2.36 % | 4,892.25 | 1,301 | 2.86 % | 5,643.69 | 1,258 | 1.81 % | 4,256.59 | 1,777 | 2.58 % | 5,031.87 | 728 | 2.38 % | 4,669.87 |
| smo | sm | o | 3,981 | 1.86 % | 3,845.98 | 899 | 1.98 % | 3,899.83 | 1,148 | 1.66 % | 3,884.39 | 1,544 | 2.24 % | 4,372.09 | 390 | 1.28 % | 2,501.72 |
| vem | ve | m | 3,558 | 1.66 % | 3,437.33 | 509 | 1.12 % | 2,208.02 | 1,885 | 2.72 % | 6,378.11 | 502 | 0.73 % | 1,421.50 | 662 | 2.17 % | 4,246.50 |
| bilo | bi | lo | 3,233 | 1.51 % | 3,123.35 | 572 | 1.26 % | 2,481.31 | 1,400 | 2.02 % | 4,737.06 | 850 | 1.23 % | 2,406.92 | 411 | 1.34 % | 2,636.42 |
| si | si | | 3,196 | 1.49 % | 3,087.60 | 903 | 1.99 % | 3,917.18 | 1,349 | 1.95 % | 4,564.50 | 545 | 0.79 % | 1,543.26 | 399 | 1.30 % | 2,559.45 |
| veš | ve | š | 2,896 | 1.35 % | 2,797.78 | 499 | 1.10 % | 2,164.64 | 1,799 | 2.60 % | 6,087.12 | 98 | 0.14 % | 277.50 | 500 | 1.64 % | 3,207.33 |
| bil | bi | l | 2,201 | 1.03 % | 2,126.35 | 550 | 1.21 % | 2,385.88 | 903 | 1.30 % | 3,055.40 | 621 | 0.90 % | 1,758.46 | 127 | 0.41 % | 814.66 |
| bila | bi | la | 2,143 | 1.00 % | 2,070.32 | 508 | 1.12 % | 2,203.68 | 858 | 1.24 % | 2,903.14 | 581 | 0.84 % | 1,645.20 | 196 | 0.64 % | 1,257.27 |
| bomo | bo | mo | 2,018 | 0.94 % | 1,949.56 | 544 | 1.20 % | 2,359.85 | 340 | 0.49 % | 1,150.43 | 839 | 1.22 % | 2,375.77 | 295 | 0.96 % | 1,892.32 |
| bom | bo | m | 1,975 | 0.92 % | 1,908.02 | 330 | 0.73 % | 1,431.53 | 875 | 1.26 % | 2,960.66 | 379 | 0.55 % | 1,073.20 | 391 | 1.28 % | 2,508.13 |
| mislim | mi | slim | 1,862 | 0.87 % | 1,798.85 | 313 | 0.69 % | 1,357.78 | 632 | 0.91 % | 2,138.44 | 475 | 0.69 % | 1,345.04 | 442 | 1.45 % | 2,835.28 |
| ima | im | a | 1,655 | 0.77 % | 1,598.87 | 275 | 0.60 % | 1,192.94 | 614 | 0.89 % | 2,077.54 | 498 | 0.72 % | 1,410.17 | 268 | 0.88 % | 1,719.13 |
| ste | st | e | 1,629 | 0.76 % | 1,573.75 | 481 | 1.06 % | 2,086.56 | 188 | 0.27 % | 636.12 | 717 | 1.04 % | 2,030.30 | 243 | 0.80 % | 1,558.76 |
| rekel | re | kel | 1,278 | 0.60 % | 1,234.65 | 213 | 0.47 % | 923.99 | 575 | 0.83 % | 1,945.58 | 328 | 0.47 % | 928.79 | 162 | 0.53 % | 1,039.17 |
| da | da | | 1,268 | 0.59 % | 1,224.99 | 197 | 0.43 % | 854.58 | 454 | 0.66 % | 1,536.16 | 407 | 0.59 % | 1,152.49 | 210 | 0.69 % | 1,347.08 |
| imamo | im | amo | 1,172 | 0.55 % | 1,132.25 | 243 | 0.54 % | 1,054.12 | 173 | 0.25 % | 585.37 | 587 | 0.85 % | 1,662.19 | 169 | 0.55 % | 1,084.08 |
| boš | bo | š | 1,151 | 0.54 % | 1,111.96 | 238 | 0.52 % | 1,032.43 | 518 | 0.75 % | 1,752.71 | 197 | 0.29 % | 557.84 | 198 | 0.65 % | 1,270.10 |
| gre | gr | e | 1,122 | 0.52 % | 1,083.95 | 200 | 0.44 % | 867.59 | 309 | 0.45 % | 1,045.54 | 457 | 0.66 % | 1,294.07 | 156 | 0.51 % | 1,000.69 |
| imaš | im | aš | 1,058 | 0.49 % | 1,022.12 | 220 | 0.48 % | 954.35 | 535 | 0.77 % | 1,810.23 | 138 | 0.20 % | 390.77 | 165 | 0.54 % | 1,058.42 |
| imajo | im | ajo | 942 | 0.44 % | 910.05 | 129 | 0.28 % | 559.60 | 387 | 0.56 % | 1,309.46 | 283 | 0.41 % | 801.36 | 143 | 0.47 % | 917.30 |
| ma | ma | | 939 | 0.44 % | 907.15 | 162 | 0.36 % | 702.75 | 607 | 0.88 % | 2,053.85 | 77 | 0.11 % | 218.04 | 93 | 0.30 % | 596.56 |
| bili | bi | li | 932 | 0.43 % | 900.39 | 245 | 0.54 % | 1,062.80 | 309 | 0.45 % | 1,045.54 | 310 | 0.45 % | 877.82 | 68 | 0.22 % | 436.20 |
| imeli | im | eli | 915 | 0.43 % | 883.97 | 164 | 0.36 % | 711.43 | 337 | 0.49 % | 1,140.28 | 287 | 0.42 % | 812.69 | 127 | 0.41 % | 814.66 |
| nisem | ni | sem | 915 | 0.43 % | 883.97 | 172 | 0.38 % | 746.13 | 449 | 0.65 % | 1,519.24 | 155 | 0.23 % | 438.91 | 139 | 0.46 % | 891.64 |
| recimo | re | cimo | 893 | 0.42 % | 862.71 | 172 | 0.38 % | 746.13 | 146 | 0.21 % | 494.01 | 375 | 0.54 % | 1,061.87 | 200 | 0.65 % | 1,282.93 |
| rekla | re | kla | 882 | 0.41 % | 852.09 | 101 | 0.22 % | 438.13 | 509 | 0.73 % | 1,722.26 | 125 | 0.18 % | 353.96 | 147 | 0.48 % | 942.95 |
| sta | st | a | 821 | 0.38 % | 793.15 | 218 | 0.48 % | 945.68 | 261 | 0.38 % | 883.12 | 251 | 0.36 % | 710.75 | 91 | 0.30 % | 583.73 |

2.2.73 List of initial character-level 3-grams from verb standardized forms in the GOS 1.0 corpus with text-type distribution
File at CLARIN.SI: GOS1.0-word_parts-verbs-standardized_forms-initial-3grams-taxonomy-entire.tsv

| Standardized form | Initial part of the word | Rest of the word | Total absolute frequency of standardized form | Percentage of all found standardized forms | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N] |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| sem | sem | | 6,222 | 4.13 % | 6,010.97 | 1,191 | 3.67 % | 5,166.51 | 2,952 | 6.07 % | 9,988.43 | 1,188 | 2.47 % | 3,364.02 | 891 | 4.12 % | 5,715.46 |
| smo | smo | | 3,981 | 2.64 % | 3,845.98 | 899 | 2.77 % | 3,899.83 | 1,148 | 2.36 % | 3,884.39 | 1,544 | 3.21 % | 4,372.09 | 390 | 1.80 % | 2,501.72 |
| vem | vem | | 3,558 | 2.36 % | 3,437.33 | 509 | 1.57 % | 2,208.02 | 1,885 | 3.87 % | 6,378.11 | 502 | 1.04 % | 1,421.50 | 662 | 3.06 % | 4,246.50 |
| bilo | bil | o | 3,233 | 2.15 % | 3,123.35 | 572 | 1.76 % | 2,481.31 | 1,400 | 2.88 % | 4,737.06 | 850 | 1.77 % | 2,406.92 | 411 | 1.90 % | 2,636.42 |
| veš | veš | | 2,896 | 1.92 % | 2,797.78 | 499 | 1.54 % | 2,164.64 | 1,799 | 3.70 % | 6,087.12 | 98 | 0.20 % | 277.50 | 500 | 2.31 % | 3,207.33 |
| bil | bil | | 2,201 | 1.46 % | 2,126.35 | 550 | 1.70 % | 2,385.88 | 903 | 1.85 % | 3,055.40 | 621 | 1.29 % | 1,758.46 | 127 | 0.59 % | 814.66 |
| bila | bil | a | 2,143 | 1.42 % | 2,070.32 | 508 | 1.57 % | 2,203.68 | 858 | 1.76 % | 2,903.14 | 581 | 1.21 % | 1,645.20 | 196 | 0.91 % | 1,257.27 |
| bomo | bom | o | 2,018 | 1.34 % | 1,949.56 | 544 | 1.68 % | 2,359.85 | 340 | 0.70 % | 1,150.43 | 839 | 1.75 % | 2,375.77 | 295 | 1.36 % | 1,892.32 |
| bom | bom | | 1,975 | 1.31 % | 1,908.02 | 330 | 1.02 % | 1,431.53 | 875 | 1.80 % | 2,960.66 | 379 | 0.79 % | 1,073.20 | 391 | 1.81 % | 2,508.13 |
| mislim | mis | lim | 1,862 | 1.24 % | 1,798.85 | 313 | 0.97 % | 1,357.78 | 632 | 1.30 % | 2,138.44 | 475 | 0.99 % | 1,345.04 | 442 | 2.05 % | 2,835.28 |
| ima | ima | | 1,655 | 1.10 % | 1,598.87 | 275 | 0.85 % | 1,192.94 | 614 | 1.26 % | 2,077.54 | 498 | 1.04 % | 1,410.17 | 268 | 1.24 % | 1,719.13 |
| ste | ste | | 1,629 | 1.08 % | 1,573.75 | 481 | 1.48 % | 2,086.56 | 188 | 0.39 % | 636.12 | 717 | 1.49 % | 2,030.30 | 243 | 1.12 % | 1,558.76 |
| rekel | rek | el | 1,278 | 0.85 % | 1,234.65 | 213 | 0.66 % | 923.99 | 575 | 1.18 % | 1,945.58 | 328 | 0.68 % | 928.79 | 162 | 0.75 % | 1,039.17 |
| imamo | ima | mo | 1,172 | 0.78 % | 1,132.25 | 243 | 0.75 % | 1,054.12 | 173 | 0.35 % | 585.37 | 587 | 1.22 % | 1,662.19 | 169 | 0.78 % | 1,084.08 |
| boš | boš | | 1,151 | 0.76 % | 1,111.96 | 238 | 0.73 % | 1,032.43 | 518 | 1.06 % | 1,752.71 | 197 | 0.41 % | 557.84 | 198 | 0.92 % | 1,270.10 |
| gre | gre | | 1,122 | 0.74 % | 1,083.95 | 200 | 0.62 % | 867.59 | 309 | 0.64 % | 1,045.54 | 457 | 0.95 % | 1,294.07 | 156 | 0.72 % | 1,000.69 |
| imaš | ima | š | 1,058 | 0.70 % | 1,022.12 | 220 | 0.68 % | 954.35 | 535 | 1.10 % | 1,810.23 | 138 | 0.29 % | 390.77 | 165 | 0.76 % | 1,058.42 |
| imajo | ima | jo | 942 | 0.62 % | 910.05 | 129 | 0.40 % | 559.60 | 387 | 0.80 % | 1,309.46 | 283 | 0.59 % | 801.36 | 143 | 0.66 % | 917.30 |
| bili | bil | i | 932 | 0.62 % | 900.39 | 245 | 0.76 % | 1,062.80 | 309 | 0.64 % | 1,045.54 | 310 | 0.65 % | 877.82 | 68 | 0.32 % | 436.20 |
| imeli | ime | li | 915 | 0.61 % | 883.97 | 164 | 0.51 % | 711.43 | 337 | 0.69 % | 1,140.28 | 287 | 0.60 % | 812.69 | 127 | 0.59 % | 814.66 |
| nisem | nis | em | 915 | 0.61 % | 883.97 | 172 | 0.53 % | 746.13 | 449 | 0.92 % | 1,519.24 | 155 | 0.32 % | 438.91 | 139 | 0.64 % | 891.64 |
| recimo | rec | imo | 893 | 0.59 % | 862.71 | 172 | 0.53 % | 746.13 | 146 | 0.30 % | 494.01 | 375 | 0.78 % | 1,061.87 | 200 | 0.93 % | 1,282.93 |
| rekla | rek | la | 882 | 0.58 % | 852.09 | 101 | 0.31 % | 438.13 | 509 | 1.05 % | 1,722.26 | 125 | 0.26 % | 353.96 | 147 | 0.68 % | 942.95 |
| sta | sta | | 821 | 0.55 % | 793.15 | 218 | 0.67 % | 945.68 | 261 | 0.54 % | 883.12 | 251 | 0.52 % | 710.75 | 91 | 0.42 % | 583.73 |
| boste | bos | te | 809 | 0.54 % | 781.56 | 258 | 0.80 % | 1,119.19 | 53 | 0.11 % | 179.33 | 318 | 0.66 % | 900.47 | 180 | 0.83 % | 1,154.64 |
| daj | daj | | 772 | 0.51 % | 745.82 | 218 | 0.67 % | 945.68 | 360 | 0.74 % | 1,218.10 | 98 | 0.20 % | 277.50 | 96 | 0.44 % | 615.81 |
| glej | gle | j | 769 | 0.51 % | 742.92 | 241 | 0.74 % | 1,045.45 | 334 | 0.69 % | 1,130.13 | 67 | 0.14 % | 189.72 | 127 | 0.59 % | 814.66 |
| sva | sva | | 767 | 0.51 % | 740.99 | 195 | 0.60 % | 845.90 | 367 | 0.75 % | 1,241.79 | 87 | 0.18 % | 246.35 | 118 | 0.55 % | 756.93 |
| imam | ima | m | 740 | 0.49 % | 714.90 | 127 | 0.39 % | 550.92 | 300 | 0.62 % | 1,015.08 | 157 | 0.33 % | 444.57 | 156 | 0.72 % | 1,000.69 |
| imel | ime | l | 736 | 0.49 % | 711.04 | 126 | 0.39 % | 546.58 | 418 | 0.86 % | 1,414.35 | 123 | 0.26 % | 348.29 | 69 | 0.32 % | 442.61 |
| biti | bit | i | 730 | 0.48 % | 705.24 | 163 | 0.50 % | 707.09 | 211 | 0.43 % | 713.94 | 242 | 0.50 % | 685.26 | 114 | 0.53 % | 731.27 |
| pravi | pra | vi | 704 | 0.47 % | 680.12 | 106 | 0.33 % | 459.82 | 97 | 0.20 % | 328.21 | 367 | 0.76 % | 1,039.22 | 134 | 0.62 % | 859.56 |

2.2.74 List of initial character-level 4-grams from verb standardized forms in the GOS 1.0 corpus with text-type distribution
File at CLARIN.SI: GOS1.0-word_parts-verbs-standardized_forms-initial-4grams-taxonomy-entire.tsv

| Standardized form | Initial part of the word | Rest of the word | Total absolute frequency of standardized form | Percentage of all found standardized forms | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N] |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| bilo | bilo | | 3,233 | 2.77 % | 3,123.35 | 572 | 2.23 % | 2,481.31 | 1,400 | 4.13 % | 4,737.06 | 850 | 2.10 % | 2,406.92 | 411 | 2.45 % | 2,636.42 |
| bila | bila | | 2,143 | 1.83 % | 2,070.32 | 508 | 1.98 % | 2,203.68 | 858 | 2.53 % | 2,903.14 | 581 | 1.44 % | 1,645.20 | 196 | 1.17 % | 1,257.27 |
| bomo | bomo | | 2,018 | 1.73 % | 1,949.56 | 544 | 2.12 % | 2,359.85 | 340 | 1.00 % | 1,150.43 | 839 | 2.08 % | 2,375.77 | 295 | 1.76 % | 1,892.32 |
| mislim | misl | im | 1,862 | 1.59 % | 1,798.85 | 313 | 1.22 % | 1,357.78 | 632 | 1.86 % | 2,138.44 | 475 | 1.18 % | 1,345.04 | 442 | 2.63 % | 2,835.28 |
| rekel | reke | l | 1,278 | 1.09 % | 1,234.65 | 213 | 0.83 % | 923.99 | 575 | 1.70 % | 1,945.58 | 328 | 0.81 % | 928.79 | 162 | 0.96 % | 1,039.17 |
| imamo | imam | o | 1,172 | 1.00 % | 1,132.25 | 243 | 0.95 % | 1,054.12 | 173 | 0.51 % | 585.37 | 587 | 1.45 % | 1,662.19 | 169 | 1.01 % | 1,084.08 |
| imaš | imaš | | 1,058 | 0.91 % | 1,022.12 | 220 | 0.86 % | 954.35 | 535 | 1.58 % | 1,810.23 | 138 | 0.34 % | 390.77 | 165 | 0.98 % | 1,058.42 |
| imajo | imaj | o | 942 | 0.81 % | 910.05 | 129 | 0.50 % | 559.60 | 387 | 1.14 % | 1,309.46 | 283 | 0.70 % | 801.36 | 143 | 0.85 % | 917.30 |
| bili | bili | | 932 | 0.80 % | 900.39 | 245 | 0.95 % | 1,062.80 | 309 | 0.91 % | 1,045.54 | 310 | 0.77 % | 877.82 | 68 | 0.41 % | 436.20 |
| imeli | imel | i | 915 | 0.78 % | 883.97 | 164 | 0.64 % | 711.43 | 337 | 0.99 % | 1,140.28 | 287 | 0.71 % | 812.69 | 127 | 0.76 % | 814.66 |
| nisem | nise | m | 915 | 0.78 % | 883.97 | 172 | 0.67 % | 746.13 | 449 | 1.32 % | 1,519.24 | 155 | 0.38 % | 438.91 | 139 | 0.83 % | 891.64 |
| recimo | reci | mo | 893 | 0.77 % | 862.71 | 172 | 0.67 % | 746.13 | 146 | 0.43 % | 494.01 | 375 | 0.93 % | 1,061.87 | 200 | 1.19 % | 1,282.93 |
| rekla | rekl | a | 882 | 0.76 % | 852.09 | 101 | 0.39 % | 438.13 | 509 | 1.50 % | 1,722.26 | 125 | 0.31 % | 353.96 | 147 | 0.88 % | 942.95 |
| boste | bost | e | 809 | 0.69 % | 781.56 | 258 | 1.00 % | 1,119.19 | 53 | 0.16 % | 179.33 | 318 | 0.79 % | 900.47 | 180 | 1.07 % | 1,154.64 |
| glej | glej | | 769 | 0.66 % | 742.92 | 241 | 0.94 % | 1,045.45 | 334 | 0.98 % | 1,130.13 | 67 | 0.17 % | 189.72 | 127 | 0.76 % | 814.66 |
| imam | imam | | 740 | 0.63 % | 714.90 | 127 | 0.49 % | 550.92 | 300 | 0.89 % | 1,015.08 | 157 | 0.39 % | 444.57 | 156 | 0.93 % | 1,000.69 |
| imel | imel | | 736 | 0.63 % | 711.04 | 126 | 0.49 % | 546.58 | 418 | 1.23 % | 1,414.35 | 123 | 0.30 % | 348.29 | 69 | 0.41 % | 442.61 |
| biti | biti | | 730 | 0.62 % | 705.24 | 163 | 0.64 % | 707.09 | 211 | 0.62 % | 713.94 | 242 | 0.60 % | 685.26 | 114 | 0.68 % | 731.27 |
| pravi | prav | i | 704 | 0.60 % | 680.12 | 106 | 0.41 % | 459.82 | 97 | 0.29 % | 328.21 | 367 | 0.91 % | 1,039.22 | 134 | 0.80 % | 859.56 |
| imela | imel | a | 657 | 0.56 % | 634.72 | 127 | 0.49 % | 550.92 | 303 | 0.89 % | 1,025.23 | 112 | 0.28 % | 317.15 | 115 | 0.69 % | 737.69 |
| bodo | bodo | | 656 | 0.56 % | 633.75 | 199 | 0.78 % | 863.25 | 64 | 0.19 % | 216.55 | 322 | 0.80 % | 911.80 | 71 | 0.42 % | 455.44 |
| niso | niso | | 590 | 0.51 % | 569.99 | 80 | 0.31 % | 347.04 | 162 | 0.48 % | 548.15 | 266 | 0.66 % | 753.22 | 82 | 0.49 % | 526 |
| imate | imat | e | 588 | 0.50 % | 568.06 | 108 | 0.42 % | 468.50 | 50 | 0.15 % | 169.18 | 189 | 0.47 % | 535.18 | 241 | 1.44 % | 1,545.93 |
| pride | prid | e | 547 | 0.47 % | 528.45 | 89 | 0.35 % | 386.08 | 180 | 0.53 % | 609.05 | 174 | 0.43 % | 492.71 | 104 | 0.62 % | 667.12 |
| čakaj | čaka | j | 466 | 0.40 % | 450.20 | 138 | 0.54 % | 598.64 | 222 | 0.66 % | 751.16 | 34 | 0.08 % | 96.28 | 72 | 0.43 % | 461.86 |
| rekli | rekl | i | 463 | 0.40 % | 447.30 | 89 | 0.35 % | 386.08 | 85 | 0.25 % | 287.61 | 241 | 0.60 % | 682.43 | 48 | 0.29 % | 307.90 |
| gremo | grem | o | 441 | 0.38 % | 426.04 | 140 | 0.55 % | 607.31 | 105 | 0.31 % | 355.28 | 139 | 0.34 % | 393.60 | 57 | 0.34 % | 365.64 |
| moraš | mora | š | 441 | 0.38 % | 426.04 | 55 | 0.21 % | 238.59 | 255 | 0.75 % | 862.82 | 54 | 0.13 % | 152.91 | 77 | 0.46 % | 493.93 |
| pomeni | pome | ni | 439 | 0.38 % | 424.11 | 48 | 0.19 % | 208.22 | 33 | 0.10 % | 111.66 | 276 | 0.68 % | 781.54 | 82 | 0.49 % | 526 |
| prišel | priš | el | 424 | 0.36 % | 409.62 | 95 | 0.37 % | 412.11 | 222 | 0.66 % | 751.16 | 72 | 0.18 % | 203.88 | 35 | 0.21 % | 224.51 |
| vidiš | vidi | š | 421 | 0.36 % | 406.72 | 155 | 0.60 % | 672.38 | 186 | 0.55 % | 629.35 | 26 | 0.06 % | 73.62 | 54 | 0.32 % | 346.39 |
| moram | mora | m | 417 | 0.36 % | 402.86 | 113 | 0.44 % | 490.19 | 131 | 0.39 % | 443.25 | 110 | 0.27 % | 311.48 | 63 | 0.38 % | 404.12 |

2.2.75 List of initial character-level 5-grams from verb standardized forms in the GOS 1.0 corpus with text-type distribution
File at CLARIN.SI: GOS1.0-word_parts-verbs-standardized_forms-initial-5grams-taxonomy-entire.tsv

| Standardized form | Initial part of the word | Rest of the word | Total absolute frequency of standardized form | Percentage of all found standardized forms | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N] |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| mislim | misli | m | 1,862 | 2.00 % | 1,798.85 | 313 | 1.53 % | 1,357.78 | 632 | 2.50 % | 2,138.44 | 475 | 1.41 % | 1,345.04 | 442 | 3.25 % | 2,835.28 |
rekel rekel 1,278 1.38 % 1,234.65 213 
1.04 % 923.99 575 2.27 % 1,945.58 328 0.97 % 928.79 162 1.19 % 1,039.17 imamo imamo 1,172 1.26 % 1,132.25 243 1.19 % 1,054.12 173 0.68 % 585.37 587 1.75 % 1,662.19 169 1.24 % 1,084.08 imajo imajo 942 1.01 % 910.05 129 0.63 % 559.60 387 1.53 % 1,309.46 283 0.84 % 801.36 143 1.05 % 917.30 imeli imeli 915 0.98 % 883.97 164 0.80 % 711.43 337 1.33 % 1,140.28 287 0.85 % 812.69 127 0.94 % 814.66 nisem nisem 915 0.98 % 883.97 172 0.84 % 746.13 449 1.78 % 1,519.24 155 0.46 % 438.91 139 1.02 % 891.64 recimo recim o 893 0.96 % 862.71 172 0.84 % 746.13 146 0.58 % 494.01 375 1.11 % 1,061.87 200 1.47 % 1,282.93 rekla rekla 882 0.95 % 852.09 101 0.49 % 438.13 509 2.01 % 1,722.26 125 0.37 % 353.96 147 1.08 % 942.95 boste boste 809 0.87 % 781.56 258 1.26 % 1,119.19 53 0.21 % 179.33 318 0.95 % 900.47 180 1.32 % 1,154.64 pravi pravi 704 0.76 % 680.12 106 0.52 % 459.82 97 0.38 % 328.21 367 1.09 % 1,039.22 134 0.99 % 859.56 imela imela 657 0.71 % 634.72 127 0.62 % 550.92 303 1.20 % 1,025.23 112 0.33 % 317.15 115 0.85 % 737.69 imate imate 588 0.63 % 568.06 108 0.53 % 468.50 50 0.20 % 169.18 189 0.56 % 535.18 241 1.77 % 1,545.93 pride pride 547 0.59 % 528.45 89 0.43 % 386.08 180 0.71 % 609.05 174 0.52 % 492.71 104 0.77 % 667.12 čakaj čakaj 466 0.50 % 450.20 138 0.67 % 598.64 222 0.88 % 751.16 34 0.10 % 96.28 72 0.53 % 461.86 rekli rekli 463 0.50 % 447.30 89 0.43 % 386.08 85 0.34 % 287.61 241 0.72 % 682.43 48 0.35 % 307.90 gremo gremo 441 0.47 % 426.04 140 0.68 % 607.31 105 0.41 % 355.28 139 0.41 % 393.60 57 0.42 % 365.64 moraš moraš 441 0.47 % 426.04 55 0.27 % 238.59 255 1.01 % 862.82 54 0.16 % 152.91 77 0.57 % 493.93 pomeni pomen i 439 0.47 % 424.11 48 0.23 % 208.22 33 0.13 % 111.66 276 0.82 % 781.54 82 0.60 % 526 prišel priše l 424 0.46 % 409.62 95 0.46 % 412.11 222 0.88 % 751.16 72 0.21 % 203.88 35 0.26 % 224.51 vidiš vidiš 421 0.45 % 406.72 155 0.76 % 672.38 186 0.74 % 629.35 26 0.08 % 73.62 54 0.40 % 346.39 moram moram 417 0.45 % 402.86 113 0.55 % 490.19 131 0.52 % 443.25 110 0.33 % 
311.48 63 0.46 % 404.12 veste veste 344 0.37 % 332.33 108 0.53 % 468.50 9 0.04 % 30.45 168 0.50 % 475.72 59 0.43 % 378.46 narediti nared iti 342 0.37 % 330.40 35 0.17 % 151.83 112 0.44 % 378.96 102 0.30 % 288.83 93 0.68 % 596.56 imeti imeti 326 0.35 % 314.94 45 0.22 % 195.21 113 0.45 % 382.35 92 0.27 % 260.51 76 0.56 % 487.51 prišla prišl a 324 0.35 % 313.01 85 0.41 % 368.73 144 0.57 % 487.24 59 0.17 % 167.07 36 0.27 % 230.93 moramo moram o 322 0.35 % 311.08 42 0.20 % 182.19 39 0.15 % 131.96 178 0.53 % 504.04 63 0.46 % 404.12 mogel mogel 318 0.34 % 307.21 51 0.25 % 221.24 191 0.76 % 646.27 42 0.12 % 118.93 34 0.25 % 218.10 povedal poved al 315 0.34 % 304.32 78 0.38 % 338.36 92 0.36 % 311.29 124 0.37 % 351.13 21 0.15 % 134.71 prišli prišl i 308 0.33 % 297.55 74 0.36 % 321.01 101 0.40 % 341.74 108 0.32 % 305.82 25 0.18 % 160.37 morem morem 307 0.33 % 296.59 62 0.30 % 268.95 131 0.52 % 443.25 59 0.17 % 167.07 55 0.41 % 352.81 videli videl i 307 0.33 % 296.59 99 0.48 % 429.46 51 0.20 % 172.56 116 0.34 % 328.47 41 0.30 % 263 naredili nared ili 300 0.32 % 289.83 47 0.23 % 203.88 77 0.30 % 260.54 117 0.35 % 331.30 59 0.43 % 378.46 CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 419 File at CLARIN.SI2.2.76 List of final character-level 1-grams from verb standardized forms in the GOS 1.0 corpus with text-type distributionGOS1.0-word_parts-verbs-standardized_forms- final-1grams-taxonomy-entire.tsvStandardized form Rest of the word Final part of the word Total absolute frequency of standardized form Percentage of all found standardized forms Total relative frequency (per million occurrences) Absolute frequency [gos.T.J.R] Percentage [gos.T.J.R] Relative frequency [gos.T.J.R] Absolute frequency [gos.T.N.Z] Percentage [gos.T.N.Z] Relative frequency [gos.T.N.Z] Absolute frequency [gos.T.J.I] Percentage [gos.T.J.I] Relative frequency [gos.T.J.I] Absolute frequency [gos.T.N.N] Percentage [gos.T.N.N] Relative frequency [gos.T.N.N] je j e 32,186 15.02 
% 31,094.37 6,536 14.38 % 28,352.92 10,658 15.38 % 36,062.56 10,744 15.57 % 30,423.42 4,248 13.90 % 27,249.46 so s o 7,976 3.72 % 7,705.48 1,528 3.36 % 6,628.41 2,364 3.41 % 7,998.86 3,012 4.37 % 8,528.98 1,072 3.51 % 6,876.51 bi b i 7,466 3.48 % 7,212.78 1,387 3.05 % 6,016.75 1,982 2.86 % 6,706.32 2,746 3.98 % 7,775.75 1,351 4.42 % 8,666.20 sem se m 6,222 2.90 % 6,010.97 1,191 2.62 % 5,166.51 2,952 4.26 % 9,988.43 1,188 1.72 % 3,364.02 891 2.92 % 5,715.46 ni n i 5,191 2.42 % 5,014.94 930 2.05 % 4,034.30 1,870 2.70 % 6,327.36 1,567 2.27 % 4,437.22 824 2.70 % 5,285.68 bo b o 5,064 2.36 % 4,892.25 1,301 2.86 % 5,643.69 1,258 1.81 % 4,256.59 1,777 2.58 % 5,031.87 728 2.38 % 4,669.87 smo sm o 3,981 1.86 % 3,845.98 899 1.98 % 3,899.83 1,148 1.66 % 3,884.39 1,544 2.24 % 4,372.09 390 1.28 % 2,501.72 vem ve m 3,558 1.66 % 3,437.33 509 1.12 % 2,208.02 1,885 2.72 % 6,378.11 502 0.73 % 1,421.50 662 2.17 % 4,246.50 bilo bil o 3,233 1.51 % 3,123.35 572 1.26 % 2,481.31 1,400 2.02 % 4,737.06 850 1.23 % 2,406.92 411 1.34 % 2,636.42 si s i 3,196 1.49 % 3,087.60 903 1.99 % 3,917.18 1,349 1.95 % 4,564.50 545 0.79 % 1,543.26 399 1.30 % 2,559.45 veš ve š 2,896 1.35 % 2,797.78 499 1.10 % 2,164.64 1,799 2.60 % 6,087.12 98 0.14 % 277.50 500 1.64 % 3,207.33 bil bi l 2,201 1.03 % 2,126.35 550 1.21 % 2,385.88 903 1.30 % 3,055.40 621 0.90 % 1,758.46 127 0.41 % 814.66 bila bil a 2,143 1.00 % 2,070.32 508 1.12 % 2,203.68 858 1.24 % 2,903.14 581 0.84 % 1,645.20 196 0.64 % 1,257.27 bomo bom o 2,018 0.94 % 1,949.56 544 1.20 % 2,359.85 340 0.49 % 1,150.43 839 1.22 % 2,375.77 295 0.96 % 1,892.32 bom bo m 1,975 0.92 % 1,908.02 330 0.73 % 1,431.53 875 1.26 % 2,960.66 379 0.55 % 1,073.20 391 1.28 % 2,508.13 mislim misli m 1,862 0.87 % 1,798.85 313 0.69 % 1,357.78 632 0.91 % 2,138.44 475 0.69 % 1,345.04 442 1.45 % 2,835.28 ima im a 1,655 0.77 % 1,598.87 275 0.60 % 1,192.94 614 0.89 % 2,077.54 498 0.72 % 1,410.17 268 0.88 % 1,719.13 ste st e 1,629 0.76 % 1,573.75 481 1.06 % 2,086.56 188 0.27 % 636.12 717 
1.04 % 2,030.30 243 0.80 % 1,558.76 rekel reke l 1,278 0.60 % 1,234.65 213 0.47 % 923.99 575 0.83 % 1,945.58 328 0.47 % 928.79 162 0.53 % 1,039.17 da d a 1,268 0.59 % 1,224.99 197 0.43 % 854.58 454 0.66 % 1,536.16 407 0.59 % 1,152.49 210 0.69 % 1,347.08 imamo imam o 1,172 0.55 % 1,132.25 243 0.54 % 1,054.12 173 0.25 % 585.37 587 0.85 % 1,662.19 169 0.55 % 1,084.08 boš bo š 1,151 0.54 % 1,111.96 238 0.52 % 1,032.43 518 0.75 % 1,752.71 197 0.29 % 557.84 198 0.65 % 1,270.10 gre gr e 1,122 0.52 % 1,083.95 200 0.44 % 867.59 309 0.45 % 1,045.54 457 0.66 % 1,294.07 156 0.51 % 1,000.69 imaš ima š 1,058 0.49 % 1,022.12 220 0.48 % 954.35 535 0.77 % 1,810.23 138 0.20 % 390.77 165 0.54 % 1,058.42 imajo imaj o 942 0.44 % 910.05 129 0.28 % 559.60 387 0.56 % 1,309.46 283 0.41 % 801.36 143 0.47 % 917.30 ma m a 939 0.44 % 907.15 162 0.36 % 702.75 607 0.88 % 2,053.85 77 0.11 % 218.04 93 0.30 % 596.56 bili bil i 932 0.43 % 900.39 245 0.54 % 1,062.80 309 0.45 % 1,045.54 310 0.45 % 877.82 68 0.22 % 436.20 imeli imel i 915 0.43 % 883.97 164 0.36 % 711.43 337 0.49 % 1,140.28 287 0.42 % 812.69 127 0.41 % 814.66 nisem nise m 915 0.43 % 883.97 172 0.38 % 746.13 449 0.65 % 1,519.24 155 0.23 % 438.91 139 0.46 % 891.64 recimo recim o 893 0.42 % 862.71 172 0.38 % 746.13 146 0.21 % 494.01 375 0.54 % 1,061.87 200 0.65 % 1,282.93 rekla rekl a 882 0.41 % 852.09 101 0.22 % 438.13 509 0.73 % 1,722.26 125 0.18 % 353.96 147 0.48 % 942.95 sta st a 821 0.38 % 793.15 218 0.48 % 945.68 261 0.38 % 883.12 251 0.36 % 710.75 91 0.30 % 583.73 CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 420 File at CLARIN.SI2.2.77 List of final character-level 2-grams from verb standardized forms in the GOS 1.0 corpus with text-type distributionGOS1.0-word_parts-verbs-standardized_forms- final-2grams-taxonomy-entire.tsvStandardized form Rest of the word Final part of the word Total absolute frequency of standardized form Percentage of all found standardized forms Total relative frequency (per 
million occurrences) Absolute frequency [gos.T.J.R] Percentage [gos.T.J.R] Relative frequency [gos.T.J.R] Absolute frequency [gos.T.N.Z] Percentage [gos.T.N.Z] Relative frequency [gos.T.N.Z] Absolute frequency [gos.T.J.I] Percentage [gos.T.J.I] Relative frequency [gos.T.J.I] Absolute frequency [gos.T.N.N] Percentage [gos.T.N.N] Relative frequency [gos.T.N.N] je je 32,186 15.02 % 31,094.37 6,536 14.38 % 28,352.92 10,658 15.38 % 36,062.56 10,744 15.57 % 30,423.42 4,248 13.90 % 27,249.46 so so 7,976 3.72 % 7,705.48 1,528 3.36 % 6,628.41 2,364 3.41 % 7,998.86 3,012 4.37 % 8,528.98 1,072 3.51 % 6,876.51 bi bi 7,466 3.48 % 7,212.78 1,387 3.05 % 6,016.75 1,982 2.86 % 6,706.32 2,746 3.98 % 7,775.75 1,351 4.42 % 8,666.20 sem s em 6,222 2.90 % 6,010.97 1,191 2.62 % 5,166.51 2,952 4.26 % 9,988.43 1,188 1.72 % 3,364.02 891 2.92 % 5,715.46 ni ni 5,191 2.42 % 5,014.94 930 2.05 % 4,034.30 1,870 2.70 % 6,327.36 1,567 2.27 % 4,437.22 824 2.70 % 5,285.68 bo bo 5,064 2.36 % 4,892.25 1,301 2.86 % 5,643.69 1,258 1.81 % 4,256.59 1,777 2.58 % 5,031.87 728 2.38 % 4,669.87 smo s mo 3,981 1.86 % 3,845.98 899 1.98 % 3,899.83 1,148 1.66 % 3,884.39 1,544 2.24 % 4,372.09 390 1.28 % 2,501.72 vem v em 3,558 1.66 % 3,437.33 509 1.12 % 2,208.02 1,885 2.72 % 6,378.11 502 0.73 % 1,421.50 662 2.17 % 4,246.50 bilo bi lo 3,233 1.51 % 3,123.35 572 1.26 % 2,481.31 1,400 2.02 % 4,737.06 850 1.23 % 2,406.92 411 1.34 % 2,636.42 si si 3,196 1.49 % 3,087.60 903 1.99 % 3,917.18 1,349 1.95 % 4,564.50 545 0.79 % 1,543.26 399 1.30 % 2,559.45 veš v eš 2,896 1.35 % 2,797.78 499 1.10 % 2,164.64 1,799 2.60 % 6,087.12 98 0.14 % 277.50 500 1.64 % 3,207.33 bil b il 2,201 1.03 % 2,126.35 550 1.21 % 2,385.88 903 1.30 % 3,055.40 621 0.90 % 1,758.46 127 0.41 % 814.66 bila bi la 2,143 1.00 % 2,070.32 508 1.12 % 2,203.68 858 1.24 % 2,903.14 581 0.84 % 1,645.20 196 0.64 % 1,257.27 bomo bo mo 2,018 0.94 % 1,949.56 544 1.20 % 2,359.85 340 0.49 % 1,150.43 839 1.22 % 2,375.77 295 0.96 % 1,892.32 bom b om 1,975 0.92 % 1,908.02 330 
0.73 % 1,431.53 875 1.26 % 2,960.66 379 0.55 % 1,073.20 391 1.28 % 2,508.13 mislim misl im 1,862 0.87 % 1,798.85 313 0.69 % 1,357.78 632 0.91 % 2,138.44 475 0.69 % 1,345.04 442 1.45 % 2,835.28 ima i ma 1,655 0.77 % 1,598.87 275 0.60 % 1,192.94 614 0.89 % 2,077.54 498 0.72 % 1,410.17 268 0.88 % 1,719.13 ste s te 1,629 0.76 % 1,573.75 481 1.06 % 2,086.56 188 0.27 % 636.12 717 1.04 % 2,030.30 243 0.80 % 1,558.76 rekel rek el 1,278 0.60 % 1,234.65 213 0.47 % 923.99 575 0.83 % 1,945.58 328 0.47 % 928.79 162 0.53 % 1,039.17 da da 1,268 0.59 % 1,224.99 197 0.43 % 854.58 454 0.66 % 1,536.16 407 0.59 % 1,152.49 210 0.69 % 1,347.08 imamo ima mo 1,172 0.55 % 1,132.25 243 0.54 % 1,054.12 173 0.25 % 585.37 587 0.85 % 1,662.19 169 0.55 % 1,084.08 boš b oš 1,151 0.54 % 1,111.96 238 0.52 % 1,032.43 518 0.75 % 1,752.71 197 0.29 % 557.84 198 0.65 % 1,270.10 gre g re 1,122 0.52 % 1,083.95 200 0.44 % 867.59 309 0.45 % 1,045.54 457 0.66 % 1,294.07 156 0.51 % 1,000.69 imaš im aš 1,058 0.49 % 1,022.12 220 0.48 % 954.35 535 0.77 % 1,810.23 138 0.20 % 390.77 165 0.54 % 1,058.42 imajo ima jo 942 0.44 % 910.05 129 0.28 % 559.60 387 0.56 % 1,309.46 283 0.41 % 801.36 143 0.47 % 917.30 ma ma 939 0.44 % 907.15 162 0.36 % 702.75 607 0.88 % 2,053.85 77 0.11 % 218.04 93 0.30 % 596.56 bili bi li 932 0.43 % 900.39 245 0.54 % 1,062.80 309 0.45 % 1,045.54 310 0.45 % 877.82 68 0.22 % 436.20 imeli ime li 915 0.43 % 883.97 164 0.36 % 711.43 337 0.49 % 1,140.28 287 0.42 % 812.69 127 0.41 % 814.66 nisem nis em 915 0.43 % 883.97 172 0.38 % 746.13 449 0.65 % 1,519.24 155 0.23 % 438.91 139 0.46 % 891.64 recimo reci mo 893 0.42 % 862.71 172 0.38 % 746.13 146 0.21 % 494.01 375 0.54 % 1,061.87 200 0.65 % 1,282.93 rekla rek la 882 0.41 % 852.09 101 0.22 % 438.13 509 0.73 % 1,722.26 125 0.18 % 353.96 147 0.48 % 942.95 sta s ta 821 0.38 % 793.15 218 0.48 % 945.68 261 0.38 % 883.12 251 0.36 % 710.75 91 0.30 % 583.73 CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 421 File at 
CLARIN.SI2.2.78 List of final character-level 3-grams from verb standardized forms in the GOS 1.0 corpus with text-type distributionGOS1.0-word_parts-verbs-standardized_forms- final-3grams-taxonomy-entire.tsvStandardized form Rest of the word Final part of the word Total absolute frequency of standardized form Percentage of all found standardized forms Total relative frequency (per million occurrences) Absolute frequency [gos.T.J.R] Percentage [gos.T.J.R] Relative frequency [gos.T.J.R] Absolute frequency [gos.T.N.Z] Percentage [gos.T.N.Z] Relative frequency [gos.T.N.Z] Absolute frequency [gos.T.J.I] Percentage [gos.T.J.I] Relative frequency [gos.T.J.I] Absolute frequency [gos.T.N.N] Percentage [gos.T.N.N] Relative frequency [gos.T.N.N] sem sem 6,222 4.13 % 6,010.97 1,191 3.67 % 5,166.51 2,952 6.07 % 9,988.43 1,188 2.47 % 3,364.02 891 4.12 % 5,715.46 smo smo 3,981 2.64 % 3,845.98 899 2.77 % 3,899.83 1,148 2.36 % 3,884.39 1,544 3.21 % 4,372.09 390 1.80 % 2,501.72 vem vem 3,558 2.36 % 3,437.33 509 1.57 % 2,208.02 1,885 3.87 % 6,378.11 502 1.04 % 1,421.50 662 3.06 % 4,246.50 bilo b ilo 3,233 2.15 % 3,123.35 572 1.76 % 2,481.31 1,400 2.88 % 4,737.06 850 1.77 % 2,406.92 411 1.90 % 2,636.42 veš veš 2,896 1.92 % 2,797.78 499 1.54 % 2,164.64 1,799 3.70 % 6,087.12 98 0.20 % 277.50 500 2.31 % 3,207.33 bil bil 2,201 1.46 % 2,126.35 550 1.70 % 2,385.88 903 1.85 % 3,055.40 621 1.29 % 1,758.46 127 0.59 % 814.66 bila b ila 2,143 1.42 % 2,070.32 508 1.57 % 2,203.68 858 1.76 % 2,903.14 581 1.21 % 1,645.20 196 0.91 % 1,257.27 bomo b omo 2,018 1.34 % 1,949.56 544 1.68 % 2,359.85 340 0.70 % 1,150.43 839 1.75 % 2,375.77 295 1.36 % 1,892.32 bom bom 1,975 1.31 % 1,908.02 330 1.02 % 1,431.53 875 1.80 % 2,960.66 379 0.79 % 1,073.20 391 1.81 % 2,508.13 mislim mis lim 1,862 1.24 % 1,798.85 313 0.97 % 1,357.78 632 1.30 % 2,138.44 475 0.99 % 1,345.04 442 2.05 % 2,835.28 ima ima 1,655 1.10 % 1,598.87 275 0.85 % 1,192.94 614 1.26 % 2,077.54 498 1.04 % 1,410.17 268 1.24 % 1,719.13 ste ste 1,629 
1.08 % 1,573.75 481 1.48 % 2,086.56 188 0.39 % 636.12 717 1.49 % 2,030.30 243 1.12 % 1,558.76 rekel re kel 1,278 0.85 % 1,234.65 213 0.66 % 923.99 575 1.18 % 1,945.58 328 0.68 % 928.79 162 0.75 % 1,039.17 imamo im amo 1,172 0.78 % 1,132.25 243 0.75 % 1,054.12 173 0.35 % 585.37 587 1.22 % 1,662.19 169 0.78 % 1,084.08 boš boš 1,151 0.76 % 1,111.96 238 0.73 % 1,032.43 518 1.06 % 1,752.71 197 0.41 % 557.84 198 0.92 % 1,270.10 gre gre 1,122 0.74 % 1,083.95 200 0.62 % 867.59 309 0.64 % 1,045.54 457 0.95 % 1,294.07 156 0.72 % 1,000.69 imaš i maš 1,058 0.70 % 1,022.12 220 0.68 % 954.35 535 1.10 % 1,810.23 138 0.29 % 390.77 165 0.76 % 1,058.42 imajo im ajo 942 0.62 % 910.05 129 0.40 % 559.60 387 0.80 % 1,309.46 283 0.59 % 801.36 143 0.66 % 917.30 bili b ili 932 0.62 % 900.39 245 0.76 % 1,062.80 309 0.64 % 1,045.54 310 0.65 % 877.82 68 0.32 % 436.20 imeli im eli 915 0.61 % 883.97 164 0.51 % 711.43 337 0.69 % 1,140.28 287 0.60 % 812.69 127 0.59 % 814.66 nisem ni sem 915 0.61 % 883.97 172 0.53 % 746.13 449 0.92 % 1,519.24 155 0.32 % 438.91 139 0.64 % 891.64 recimo rec imo 893 0.59 % 862.71 172 0.53 % 746.13 146 0.30 % 494.01 375 0.78 % 1,061.87 200 0.93 % 1,282.93 rekla re kla 882 0.58 % 852.09 101 0.31 % 438.13 509 1.05 % 1,722.26 125 0.26 % 353.96 147 0.68 % 942.95 sta sta 821 0.55 % 793.15 218 0.67 % 945.68 261 0.54 % 883.12 251 0.52 % 710.75 91 0.42 % 583.73 boste bo ste 809 0.54 % 781.56 258 0.80 % 1,119.19 53 0.11 % 179.33 318 0.66 % 900.47 180 0.83 % 1,154.64 daj daj 772 0.51 % 745.82 218 0.67 % 945.68 360 0.74 % 1,218.10 98 0.20 % 277.50 96 0.44 % 615.81 glej g lej 769 0.51 % 742.92 241 0.74 % 1,045.45 334 0.69 % 1,130.13 67 0.14 % 189.72 127 0.59 % 814.66 sva sva 767 0.51 % 740.99 195 0.60 % 845.90 367 0.75 % 1,241.79 87 0.18 % 246.35 118 0.55 % 756.93 imam i mam 740 0.49 % 714.90 127 0.39 % 550.92 300 0.62 % 1,015.08 157 0.33 % 444.57 156 0.72 % 1,000.69 imel i mel 736 0.49 % 711.04 126 0.39 % 546.58 418 0.86 % 1,414.35 123 0.26 % 348.29 69 0.32 % 442.61 biti b iti 
730 0.48 % 705.24 163 0.50 % 707.09 211 0.43 % 713.94 242 0.50 % 685.26 114 0.53 % 731.27 pravi pr avi 704 0.47 % 680.12 106 0.33 % 459.82 97 0.20 % 328.21 367 0.76 % 1,039.22 134 0.62 % 859.56 CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 422 File at CLARIN.SI2.2.79 List of final character-level 4-grams from verb standardized forms in the GOS 1.0 corpus with text-type distributionGOS1.0-word_parts-verbs-standardized_forms- final-4grams-taxonomy-entire.tsvStandardized form Rest of the word Final part of the word Total absolute frequency of standardized form Percentage of all found standardized forms Total relative frequency (per million occurrences) Absolute frequency [gos.T.J.R] Percentage [gos.T.J.R] Relative frequency [gos.T.J.R] Absolute frequency [gos.T.N.Z] Percentage [gos.T.N.Z] Relative frequency [gos.T.N.Z] Absolute frequency [gos.T.J.I] Percentage [gos.T.J.I] Relative frequency [gos.T.J.I] Absolute frequency [gos.T.N.N] Percentage [gos.T.N.N] Relative frequency [gos.T.N.N] bilo bilo 3,233 2.77 % 3,123.35 572 2.23 % 2,481.31 1,400 4.13 % 4,737.06 850 2.10 % 2,406.92 411 2.45 % 2,636.42 bila bila 2,143 1.83 % 2,070.32 508 1.98 % 2,203.68 858 2.53 % 2,903.14 581 1.44 % 1,645.20 196 1.17 % 1,257.27 bomo bomo 2,018 1.73 % 1,949.56 544 2.12 % 2,359.85 340 1.00 % 1,150.43 839 2.08 % 2,375.77 295 1.76 % 1,892.32 mislim mi slim 1,862 1.59 % 1,798.85 313 1.22 % 1,357.78 632 1.86 % 2,138.44 475 1.18 % 1,345.04 442 2.63 % 2,835.28 rekel r ekel 1,278 1.09 % 1,234.65 213 0.83 % 923.99 575 1.70 % 1,945.58 328 0.81 % 928.79 162 0.96 % 1,039.17 imamo i mamo 1,172 1.00 % 1,132.25 243 0.95 % 1,054.12 173 0.51 % 585.37 587 1.45 % 1,662.19 169 1.01 % 1,084.08 imaš imaš 1,058 0.91 % 1,022.12 220 0.86 % 954.35 535 1.58 % 1,810.23 138 0.34 % 390.77 165 0.98 % 1,058.42 imajo i majo 942 0.81 % 910.05 129 0.50 % 559.60 387 1.14 % 1,309.46 283 0.70 % 801.36 143 0.85 % 917.30 bili bili 932 0.80 % 900.39 245 0.95 % 1,062.80 309 0.91 % 1,045.54 310 
0.77 % 877.82 68 0.41 % 436.20 imeli i meli 915 0.78 % 883.97 164 0.64 % 711.43 337 0.99 % 1,140.28 287 0.71 % 812.69 127 0.76 % 814.66 nisem n isem 915 0.78 % 883.97 172 0.67 % 746.13 449 1.32 % 1,519.24 155 0.38 % 438.91 139 0.83 % 891.64 recimo re cimo 893 0.77 % 862.71 172 0.67 % 746.13 146 0.43 % 494.01 375 0.93 % 1,061.87 200 1.19 % 1,282.93 rekla r ekla 882 0.76 % 852.09 101 0.39 % 438.13 509 1.50 % 1,722.26 125 0.31 % 353.96 147 0.88 % 942.95 boste b oste 809 0.69 % 781.56 258 1.00 % 1,119.19 53 0.16 % 179.33 318 0.79 % 900.47 180 1.07 % 1,154.64 glej glej 769 0.66 % 742.92 241 0.94 % 1,045.45 334 0.98 % 1,130.13 67 0.17 % 189.72 127 0.76 % 814.66 imam imam 740 0.63 % 714.90 127 0.49 % 550.92 300 0.89 % 1,015.08 157 0.39 % 444.57 156 0.93 % 1,000.69 imel imel 736 0.63 % 711.04 126 0.49 % 546.58 418 1.23 % 1,414.35 123 0.30 % 348.29 69 0.41 % 442.61 biti biti 730 0.62 % 705.24 163 0.64 % 707.09 211 0.62 % 713.94 242 0.60 % 685.26 114 0.68 % 731.27 pravi p ravi 704 0.60 % 680.12 106 0.41 % 459.82 97 0.29 % 328.21 367 0.91 % 1,039.22 134 0.80 % 859.56 imela i mela 657 0.56 % 634.72 127 0.49 % 550.92 303 0.89 % 1,025.23 112 0.28 % 317.15 115 0.69 % 737.69 bodo bodo 656 0.56 % 633.75 199 0.78 % 863.25 64 0.19 % 216.55 322 0.80 % 911.80 71 0.42 % 455.44 niso niso 590 0.51 % 569.99 80 0.31 % 347.04 162 0.48 % 548.15 266 0.66 % 753.22 82 0.49 % 526 imate i mate 588 0.50 % 568.06 108 0.42 % 468.50 50 0.15 % 169.18 189 0.47 % 535.18 241 1.44 % 1,545.93 pride p ride 547 0.47 % 528.45 89 0.35 % 386.08 180 0.53 % 609.05 174 0.43 % 492.71 104 0.62 % 667.12 čakaj č akaj 466 0.40 % 450.20 138 0.54 % 598.64 222 0.66 % 751.16 34 0.08 % 96.28 72 0.43 % 461.86 rekli r ekli 463 0.40 % 447.30 89 0.35 % 386.08 85 0.25 % 287.61 241 0.60 % 682.43 48 0.29 % 307.90 gremo g remo 441 0.38 % 426.04 140 0.55 % 607.31 105 0.31 % 355.28 139 0.34 % 393.60 57 0.34 % 365.64 moraš m oraš 441 0.38 % 426.04 55 0.21 % 238.59 255 0.75 % 862.82 54 0.13 % 152.91 77 0.46 % 493.93 pomeni po meni 439 
0.38 % 424.11 48 0.19 % 208.22 33 0.10 % 111.66 276 0.68 % 781.54 82 0.49 % 526 prišel pr išel 424 0.36 % 409.62 95 0.37 % 412.11 222 0.66 % 751.16 72 0.18 % 203.88 35 0.21 % 224.51 vidiš v idiš 421 0.36 % 406.72 155 0.60 % 672.38 186 0.55 % 629.35 26 0.06 % 73.62 54 0.32 % 346.39 moram m oram 417 0.36 % 402.86 113 0.44 % 490.19 131 0.39 % 443.25 110 0.27 % 311.48 63 0.38 % 404.12 CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 423 File at CLARIN.SI2.2.80 List of final character-level 5-grams from verb standardized forms in the GOS 1.0 corpus with text-type distributionGOS1.0-word_parts-verbs-standardized_forms- final-5grams-taxonomy-entire.tsvStandardized form Rest of the word Final part of the word Total absolute frequency of standardized form Percentage of all found standardized forms Total relative frequency (per million occurrences) Absolute frequency [gos.T.J.R] Percentage [gos.T.J.R] Relative frequency [gos.T.J.R] Absolute frequency [gos.T.N.Z] Percentage [gos.T.N.Z] Relative frequency [gos.T.N.Z] Absolute frequency [gos.T.J.I] Percentage [gos.T.J.I] Relative frequency [gos.T.J.I] Absolute frequency [gos.T.N.N] Percentage [gos.T.N.N] Relative frequency [gos.T.N.N] mislim m islim 1,862 2.00 % 1,798.85 313 1.53 % 1,357.78 632 2.50 % 2,138.44 475 1.41 % 1,345.04 442 3.25 % 2,835.28 rekel rekel 1,278 1.38 % 1,234.65 213 1.04 % 923.99 575 2.27 % 1,945.58 328 0.97 % 928.79 162 1.19 % 1,039.17 imamo imamo 1,172 1.26 % 1,132.25 243 1.19 % 1,054.12 173 0.68 % 585.37 587 1.75 % 1,662.19 169 1.24 % 1,084.08 imajo imajo 942 1.01 % 910.05 129 0.63 % 559.60 387 1.53 % 1,309.46 283 0.84 % 801.36 143 1.05 % 917.30 imeli imeli 915 0.98 % 883.97 164 0.80 % 711.43 337 1.33 % 1,140.28 287 0.85 % 812.69 127 0.94 % 814.66 nisem nisem 915 0.98 % 883.97 172 0.84 % 746.13 449 1.78 % 1,519.24 155 0.46 % 438.91 139 1.02 % 891.64 recimo r ecimo 893 0.96 % 862.71 172 0.84 % 746.13 146 0.58 % 494.01 375 1.11 % 1,061.87 200 1.47 % 1,282.93 rekla rekla 882 
0.95 % 852.09 101 0.49 % 438.13 509 2.01 % 1,722.26 125 0.37 % 353.96 147 1.08 % 942.95 boste boste 809 0.87 % 781.56 258 1.26 % 1,119.19 53 0.21 % 179.33 318 0.95 % 900.47 180 1.32 % 1,154.64 pravi pravi 704 0.76 % 680.12 106 0.52 % 459.82 97 0.38 % 328.21 367 1.09 % 1,039.22 134 0.99 % 859.56 imela imela 657 0.71 % 634.72 127 0.62 % 550.92 303 1.20 % 1,025.23 112 0.33 % 317.15 115 0.85 % 737.69 imate imate 588 0.63 % 568.06 108 0.53 % 468.50 50 0.20 % 169.18 189 0.56 % 535.18 241 1.77 % 1,545.93 pride pride 547 0.59 % 528.45 89 0.43 % 386.08 180 0.71 % 609.05 174 0.52 % 492.71 104 0.77 % 667.12 čakaj čakaj 466 0.50 % 450.20 138 0.67 % 598.64 222 0.88 % 751.16 34 0.10 % 96.28 72 0.53 % 461.86 rekli rekli 463 0.50 % 447.30 89 0.43 % 386.08 85 0.34 % 287.61 241 0.72 % 682.43 48 0.35 % 307.90 gremo gremo 441 0.47 % 426.04 140 0.68 % 607.31 105 0.41 % 355.28 139 0.41 % 393.60 57 0.42 % 365.64 moraš moraš 441 0.47 % 426.04 55 0.27 % 238.59 255 1.01 % 862.82 54 0.16 % 152.91 77 0.57 % 493.93 pomeni p omeni 439 0.47 % 424.11 48 0.23 % 208.22 33 0.13 % 111.66 276 0.82 % 781.54 82 0.60 % 526 prišel p rišel 424 0.46 % 409.62 95 0.46 % 412.11 222 0.88 % 751.16 72 0.21 % 203.88 35 0.26 % 224.51 vidiš vidiš 421 0.45 % 406.72 155 0.76 % 672.38 186 0.74 % 629.35 26 0.08 % 73.62 54 0.40 % 346.39 moram moram 417 0.45 % 402.86 113 0.55 % 490.19 131 0.52 % 443.25 110 0.33 % 311.48 63 0.46 % 404.12 veste veste 344 0.37 % 332.33 108 0.53 % 468.50 9 0.04 % 30.45 168 0.50 % 475.72 59 0.43 % 378.46 narediti nar editi 342 0.37 % 330.40 35 0.17 % 151.83 112 0.44 % 378.96 102 0.30 % 288.83 93 0.68 % 596.56 imeti imeti 326 0.35 % 314.94 45 0.22 % 195.21 113 0.45 % 382.35 92 0.27 % 260.51 76 0.56 % 487.51 prišla p rišla 324 0.35 % 313.01 85 0.41 % 368.73 144 0.57 % 487.24 59 0.17 % 167.07 36 0.27 % 230.93 moramo m oramo 322 0.35 % 311.08 42 0.20 % 182.19 39 0.15 % 131.96 178 0.53 % 504.04 63 0.46 % 404.12 mogel mogel 318 0.34 % 307.21 51 0.25 % 221.24 191 0.76 % 646.27 42 0.12 % 118.93 34 
0.25 % 218.10 povedal po vedal 315 0.34 % 304.32 78 0.38 % 338.36 92 0.36 % 311.29 124 0.37 % 351.13 21 0.15 % 134.71 prišli p rišli 308 0.33 % 297.55 74 0.36 % 321.01 101 0.40 % 341.74 108 0.32 % 305.82 25 0.18 % 160.37 morem morem 307 0.33 % 296.59 62 0.30 % 268.95 131 0.52 % 443.25 59 0.17 % 167.07 55 0.41 % 352.81 videli v ideli 307 0.33 % 296.59 99 0.48 % 429.46 51 0.20 % 172.56 116 0.34 % 328.47 41 0.30 % 263 naredili nar edili 300 0.32 % 289.83 47 0.23 % 203.88 77 0.30 % 260.54 117 0.35 % 331.30 59 0.43 % 378.46 CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 424 File at CLARIN.SI2.2.81 List of initial character-level 1-grams from verb lower-case word forms in the GOS 1.0 corpus with text- type distributionGOS1.0-word_parts-verbs-lowercase_forms- initial-1grams-taxonomy-entire.tsvLower-case word form Initial part of the word Rest of the word Total absolute frequency of lower-case word form Percentage of all found lower-case word forms Total relative frequency (per million occurrences) Absolute frequency [gos.T.J.R] Percentage [gos.T.J.R] Relative frequency [gos.T.J.R] Absolute frequency [gos.T.N.Z] Percentage [gos.T.N.Z] Relative frequency [gos.T.N.Z] Absolute frequency [gos.T.J.I] Percentage [gos.T.J.I] Relative frequency [gos.T.J.I] Absolute frequency [gos.T.N.N] Percentage [gos.T.N.N] Relative frequency [gos.T.N.N] je j e 31,662 14.77 % 30,588.14 6,498 14.30 % 28,188.08 10,223 14.75 % 34,590.68 10,712 15.53 % 30,332.81 4,229 13.84 % 27,127.58 so s o 7,869 3.67 % 7,602.11 1,526 3.36 % 6,619.73 2,277 3.29 % 7,704.49 2,995 4.34 % 8,480.84 1,071 3.50 % 6,870.10 bi b i 6,993 3.26 % 6,755.82 1,319 2.90 % 5,721.77 1,684 2.43 % 5,698.01 2,687 3.90 % 7,608.69 1,303 4.26 % 8,358.30 ni n i 4,775 2.23 % 4,613.05 882 1.94 % 3,826.08 1,507 2.17 % 5,099.11 1,565 2.27 % 4,431.56 821 2.69 % 5,266.43 bo b o 4,762 2.22 % 4,600.49 1,218 2.68 % 5,283.64 1,070 1.54 % 3,620.47 1,764 2.56 % 4,995.06 710 2.32 % 4,554.41 sem s em 4,238 1.98 % 
4,094.26 771 1.70 % 3,344.57 1,729 2.49 % 5,850.27 1,053 1.53 % 2,981.74 685 2.24 % 4,394.04 smo s mo 3,919 1.83 % 3,786.08 881 1.94 % 3,821.74 1,105 1.59 % 3,738.89 1,543 2.24 % 4,369.26 390 1.28 % 2,501.72 vem v em 3,296 1.54 % 3,184.21 467 1.03 % 2,025.83 1,694 2.44 % 5,731.84 500 0.72 % 1,415.83 635 2.08 % 4,073.31 si s i 2,908 1.36 % 2,809.37 857 1.89 % 3,717.63 1,141 1.65 % 3,860.70 528 0.77 % 1,495.12 382 1.25 % 2,450.40 veš v eš 2,611 1.22 % 2,522.44 485 1.07 % 2,103.91 1,534 2.21 % 5,190.46 97 0.14 % 274.67 495 1.62 % 3,175.25 blo b lo 2,262 1.05 % 2,185.28 368 0.81 % 1,596.37 1,011 1.46 % 3,420.83 499 0.72 % 1,413 384 1.26 % 2,463.23 ma m a 2,077 0.97 % 2,006.56 344 0.76 % 1,492.26 1,155 1.67 % 3,908.07 252 0.36 % 713.58 326 1.07 % 2,091.18 bil b il 2,043 0.95 % 1,973.71 506 1.11 % 2,195.01 798 1.15 % 2,700.12 615 0.89 % 1,741.47 124 0.41 % 795.42 bomo b omo 1,819 0.85 % 1,757.31 486 1.07 % 2,108.25 262 0.38 % 886.51 806 1.17 % 2,282.32 265 0.87 % 1,699.88 ste s te 1,621 0.76 % 1,566.02 480 1.06 % 2,082.22 182 0.26 % 615.82 717 1.04 % 2,030.30 242 0.79 % 1,552.35 bla b la 1,591 0.74 % 1,537.04 309 0.68 % 1,340.43 774 1.12 % 2,618.92 322 0.47 % 911.80 186 0.61 % 1,193.13 bom b om 1,583 0.74 % 1,529.31 269 0.59 % 1,166.91 606 0.87 % 2,050.47 360 0.52 % 1,019.40 348 1.14 % 2,232.30 sn s n 1,350 0.63 % 1,304.21 223 0.49 % 967.37 876 1.26 % 2,964.05 72 0.10 % 203.88 179 0.59 % 1,148.22 da d a 1,144 0.53 % 1,105.20 187 0.41 % 811.20 373 0.54 % 1,262.09 393 0.57 % 1,112.84 191 0.62 % 1,225.20 gre g re 1,099 0.51 % 1,061.73 200 0.44 % 867.59 288 0.41 % 974.48 456 0.66 % 1,291.24 155 0.51 % 994.27 boš b oš 951 0.44 % 918.75 207 0.46 % 897.96 381 0.55 % 1,289.16 184 0.27 % 521.03 179 0.59 % 1,148.22 maš m aš 930 0.43 % 898.46 207 0.46 % 897.96 479 0.69 % 1,620.75 91 0.13 % 257.68 153 0.50 % 981.44 reku r eku 897 0.42 % 866.58 143 0.32 % 620.33 380 0.55 % 1,285.77 247 0.36 % 699.42 127 0.41 % 814.66 recimo r ecimo 838 0.39 % 809.58 166 0.36 % 720.10 126 0.18 % 
426.34 351 0.51 % 993.91 195 0.64 % 1,250.86 sta s ta 820 0.38 % 792.19 218 0.48 % 945.68 260 0.38 % 879.74 251 0.36 % 710.75 91 0.30 % 583.73 mel m el 786 0.37 % 759.34 131 0.29 % 568.27 412 0.59 % 1,394.05 134 0.19 % 379.44 109 0.36 % 699.20 rekla r ekla 769 0.36 % 742.92 98 0.22 % 425.12 406 0.59 % 1,373.75 124 0.18 % 351.13 141 0.46 % 904.47 mamo m amo 681 0.32 % 657.90 163 0.36 % 707.09 140 0.20 % 473.71 234 0.34 % 662.61 144 0.47 % 923.71 zdi z di 671 0.31 % 648.24 115 0.25 % 498.87 179 0.26 % 605.67 198 0.29 % 560.67 179 0.59 % 1,148.22 boste b oste 645 0.30 % 623.12 225 0.49 % 976.04 14 0.02 % 47.37 277 0.40 % 784.37 129 0.42 % 827.49 šla š la 631 0.29 % 609.60 75 0.17 % 325.35 412 0.59 % 1,394.05 76 0.11 % 215.21 68 0.22 % 436.20 bli b li 627 0.29 % 605.73 153 0.34 % 663.71 263 0.38 % 889.89 149 0.22 % 421.92 62 0.20 % 397.71 CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 425 File at CLARIN.SI2.2.82 List of initial character-level 2-grams from verb lower-case word forms in the GOS 1.0 corpus with text- type distributionGOS1.0-word_parts-verbs-lowercase_forms- initial-2grams-taxonomy-entire.tsvLower-case word form Initial part of the word Rest of the word Total absolute frequency of lower-case word form Percentage of all found lower-case word forms Total relative frequency (per million occurrences) Absolute frequency [gos.T.J.R] Percentage [gos.T.J.R] Relative frequency [gos.T.J.R] Absolute frequency [gos.T.N.Z] Percentage [gos.T.N.Z] Relative frequency [gos.T.N.Z] Absolute frequency [gos.T.J.I] Percentage [gos.T.J.I] Relative frequency [gos.T.J.I] Absolute frequency [gos.T.N.N] Percentage [gos.T.N.N] Relative frequency [gos.T.N.N] je je 31,662 14.84 % 30,588.14 6,498 14.33 % 28,188.08 10,223 14.89 % 34,590.68 10,712 15.55 % 30,332.81 4,229 13.88 % 27,127.58 so so 7,869 3.69 % 7,602.11 1,526 3.37 % 6,619.73 2,277 3.32 % 7,704.49 2,995 4.35 % 8,480.84 1,071 3.52 % 6,870.10 bi bi 6,993 3.28 % 6,755.82 1,319 2.91 % 5,721.77 
1,684 2.45 % 5,698.01 2,687 3.90 % 7,608.69 1,303 4.28 % 8,358.30 ni ni 4,775 2.24 % 4,613.05 882 1.95 % 3,826.08 1,507 2.20 % 5,099.11 1,565 2.27 % 4,431.56 821 2.69 % 5,266.43 bo bo 4,762 2.23 % 4,600.49 1,218 2.69 % 5,283.64 1,070 1.56 % 3,620.47 1,764 2.56 % 4,995.06 710 2.33 % 4,554.41 sem se m 4,238 1.99 % 4,094.26 771 1.70 % 3,344.57 1,729 2.52 % 5,850.27 1,053 1.53 % 2,981.74 685 2.25 % 4,394.04 smo sm o 3,919 1.84 % 3,786.08 881 1.94 % 3,821.74 1,105 1.61 % 3,738.89 1,543 2.24 % 4,369.26 390 1.28 % 2,501.72 vem ve m 3,296 1.54 % 3,184.21 467 1.03 % 2,025.83 1,694 2.47 % 5,731.84 500 0.73 % 1,415.83 635 2.08 % 4,073.31 si si 2,908 1.36 % 2,809.37 857 1.89 % 3,717.63 1,141 1.66 % 3,860.70 528 0.77 % 1,495.12 382 1.25 % 2,450.40 veš ve š 2,611 1.22 % 2,522.44 485 1.07 % 2,103.91 1,534 2.23 % 5,190.46 97 0.14 % 274.67 495 1.62 % 3,175.25 blo bl o 2,262 1.06 % 2,185.28 368 0.81 % 1,596.37 1,011 1.47 % 3,420.83 499 0.72 % 1,413 384 1.26 % 2,463.23 ma ma 2,077 0.97 % 2,006.56 344 0.76 % 1,492.26 1,155 1.68 % 3,908.07 252 0.37 % 713.58 326 1.07 % 2,091.18 bil bi l 2,043 0.96 % 1,973.71 506 1.12 % 2,195.01 798 1.16 % 2,700.12 615 0.89 % 1,741.47 124 0.41 % 795.42 bomo bo mo 1,819 0.85 % 1,757.31 486 1.07 % 2,108.25 262 0.38 % 886.51 806 1.17 % 2,282.32 265 0.87 % 1,699.88 ste st e 1,621 0.76 % 1,566.02 480 1.06 % 2,082.22 182 0.27 % 615.82 717 1.04 % 2,030.30 242 0.79 % 1,552.35 bla bl a 1,591 0.75 % 1,537.04 309 0.68 % 1,340.43 774 1.13 % 2,618.92 322 0.47 % 911.80 186 0.61 % 1,193.13 bom bo m 1,583 0.74 % 1,529.31 269 0.59 % 1,166.91 606 0.88 % 2,050.47 360 0.52 % 1,019.40 348 1.14 % 2,232.30 sn sn 1,350 0.63 % 1,304.21 223 0.49 % 967.37 876 1.28 % 2,964.05 72 0.10 % 203.88 179 0.59 % 1,148.22 da da 1,144 0.54 % 1,105.20 187 0.41 % 811.20 373 0.54 % 1,262.09 393 0.57 % 1,112.84 191 0.63 % 1,225.20 gre gr e 1,099 0.52 % 1,061.73 200 0.44 % 867.59 288 0.42 % 974.48 456 0.66 % 1,291.24 155 0.51 % 994.27 boš bo š 951 0.45 % 918.75 207 0.46 % 897.96 381 0.56 % 
1,289.16 184 0.27 % 521.03 179 0.59 % 1,148.22 maš ma š 930 0.44 % 898.46 207 0.46 % 897.96 479 0.70 % 1,620.75 91 0.13 % 257.68 153 0.50 % 981.44 reku re ku 897 0.42 % 866.58 143 0.32 % 620.33 380 0.55 % 1,285.77 247 0.36 % 699.42 127 0.42 % 814.66 recimo re cimo 838 0.39 % 809.58 166 0.37 % 720.10 126 0.18 % 426.34 351 0.51 % 993.91 195 0.64 % 1,250.86 sta st a 820 0.38 % 792.19 218 0.48 % 945.68 260 0.38 % 879.74 251 0.36 % 710.75 91 0.30 % 583.73 mel me l 786 0.37 % 759.34 131 0.29 % 568.27 412 0.60 % 1,394.05 134 0.20 % 379.44 109 0.36 % 699.20 rekla re kla 769 0.36 % 742.92 98 0.22 % 425.12 406 0.59 % 1,373.75 124 0.18 % 351.13 141 0.46 % 904.47 mamo ma mo 681 0.32 % 657.90 163 0.36 % 707.09 140 0.20 % 473.71 234 0.34 % 662.61 144 0.47 % 923.71 zdi zd i 671 0.32 % 648.24 115 0.25 % 498.87 179 0.26 % 605.67 198 0.29 % 560.67 179 0.59 % 1,148.22 boste bo ste 645 0.30 % 623.12 225 0.50 % 976.04 14 0.02 % 47.37 277 0.40 % 784.37 129 0.42 % 827.49 šla šl a 631 0.30 % 609.60 75 0.17 % 325.35 412 0.60 % 1,394.05 76 0.11 % 215.21 68 0.22 % 436.20 bli bl i 627 0.29 % 605.73 153 0.34 % 663.71 263 0.38 % 889.89 149 0.22 % 421.92 62 0.20 % 397.71 CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 426 File at CLARIN.SI2.2.83 List of initial character-level 3-grams from verb lower-case word forms in the GOS 1.0 corpus with text- type distributionGOS1.0-word_parts-verbs-lowercase_forms- initial-3grams-taxonomy-entire.tsvLower-case word form Initial part of the word Rest of the word Total absolute frequency of lower-case word form Percentage of all found lower-case word forms Total relative frequency (per million occurrences) Absolute frequency [gos.T.J.R] Percentage [gos.T.J.R] Relative frequency [gos.T.J.R] Absolute frequency [gos.T.N.Z] Percentage [gos.T.N.Z] Relative frequency [gos.T.N.Z] Absolute frequency [gos.T.J.I] Percentage [gos.T.J.I] Relative frequency [gos.T.J.I] Absolute frequency [gos.T.N.N] Percentage [gos.T.N.N] Relative 
frequency [gos.T.N.N] sem sem 4,238 2.88 % 4,094.26 771 2.43 % 3,344.57 1,729 3.69 % 5,850.27 1,053 2.21 % 2,981.74 685 3.25 % 4,394.04 smo smo 3,919 2.66 % 3,786.08 881 2.77 % 3,821.74 1,105 2.35 % 3,738.89 1,543 3.24 % 4,369.26 390 1.85 % 2,501.72 vem vem 3,296 2.24 % 3,184.21 467 1.47 % 2,025.83 1,694 3.61 % 5,731.84 500 1.05 % 1,415.83 635 3.02 % 4,073.31 veš veš 2,611 1.77 % 2,522.44 485 1.53 % 2,103.91 1,534 3.27 % 5,190.46 97 0.20 % 274.67 495 2.35 % 3,175.25 blo blo 2,262 1.53 % 2,185.28 368 1.16 % 1,596.37 1,011 2.15 % 3,420.83 499 1.05 % 1,413 384 1.82 % 2,463.23 bil bil 2,043 1.39 % 1,973.71 506 1.59 % 2,195.01 798 1.70 % 2,700.12 615 1.29 % 1,741.47 124 0.59 % 795.42 bomo bom o 1,819 1.23 % 1,757.31 486 1.53 % 2,108.25 262 0.56 % 886.51 806 1.69 % 2,282.32 265 1.26 % 1,699.88 ste ste 1,621 1.10 % 1,566.02 480 1.51 % 2,082.22 182 0.39 % 615.82 717 1.50 % 2,030.30 242 1.15 % 1,552.35 bla bla 1,591 1.08 % 1,537.04 309 0.97 % 1,340.43 774 1.65 % 2,618.92 322 0.68 % 911.80 186 0.88 % 1,193.13 bom bom 1,583 1.07 % 1,529.31 269 0.85 % 1,166.91 606 1.29 % 2,050.47 360 0.76 % 1,019.40 348 1.65 % 2,232.30 gre gre 1,099 0.75 % 1,061.73 200 0.63 % 867.59 288 0.61 % 974.48 456 0.96 % 1,291.24 155 0.74 % 994.27 boš boš 951 0.65 % 918.75 207 0.65 % 897.96 381 0.81 % 1,289.16 184 0.39 % 521.03 179 0.85 % 1,148.22 maš maš 930 0.63 % 898.46 207 0.65 % 897.96 479 1.02 % 1,620.75 91 0.19 % 257.68 153 0.73 % 981.44 reku rek u 897 0.61 % 866.58 143 0.45 % 620.33 380 0.81 % 1,285.77 247 0.52 % 699.42 127 0.60 % 814.66 recimo rec imo 838 0.57 % 809.58 166 0.52 % 720.10 126 0.27 % 426.34 351 0.74 % 993.91 195 0.93 % 1,250.86 sta sta 820 0.56 % 792.19 218 0.69 % 945.68 260 0.55 % 879.74 251 0.53 % 710.75 91 0.43 % 583.73 mel mel 786 0.53 % 759.34 131 0.41 % 568.27 412 0.88 % 1,394.05 134 0.28 % 379.44 109 0.52 % 699.20 rekla rek la 769 0.52 % 742.92 98 0.31 % 425.12 406 0.86 % 1,373.75 124 0.26 % 351.13 141 0.67 % 904.47 mamo mam o 681 0.46 % 657.90 163 0.51 % 707.09 140 0.30 % 
473.71 234 0.49 % 662.61 144 0.68 % 923.71 zdi zdi 671 0.46 % 648.24 115 0.36 % 498.87 179 0.38 % 605.67 198 0.42 % 560.67 179 0.85 % 1,148.22 boste bos te 645 0.44 % 623.12 225 0.71 % 976.04 14 0.03 % 47.37 277 0.58 % 784.37 129 0.61 % 827.49 šla šla 631 0.43 % 609.60 75 0.24 % 325.35 412 0.88 % 1,394.05 76 0.16 % 215.21 68 0.32 % 436.20 bli bli 627 0.42 % 605.73 153 0.48 % 663.71 263 0.56 % 889.89 149 0.31 % 421.92 62 0.29 % 397.71 nisem nis em 622 0.42 % 600.90 133 0.42 % 576.95 239 0.51 % 808.68 130 0.27 % 368.12 120 0.57 % 769.76 majo maj o 618 0.42 % 597.04 72 0.23 % 312.33 324 0.69 % 1,096.29 97 0.20 % 274.67 125 0.59 % 801.83 mam mam 608 0.41 % 587.38 90 0.28 % 390.42 263 0.56 % 889.89 112 0.23 % 317.15 143 0.68 % 917.30 dej dej 607 0.41 % 586.41 184 0.58 % 798.18 264 0.56 % 893.27 82 0.17 % 232.20 77 0.37 % 493.93 mislm mis lm 602 0.41 % 581.58 86 0.27 % 373.06 206 0.44 % 697.02 155 0.33 % 438.91 155 0.74 % 994.27 bodo bod o 601 0.41 % 580.62 178 0.56 % 772.16 44 0.09 % 148.88 309 0.65 % 874.98 70 0.33 % 449.03 mislim mis lim 599 0.41 % 578.68 110 0.35 % 477.18 160 0.34 % 541.38 203 0.43 % 574.83 126 0.60 % 808.25 aja aja 561 0.38 % 541.97 71 0.22 % 308 373 0.80 % 1,262.09 32 0.07 % 90.61 85 0.40 % 545.25 lej lej 561 0.38 % 541.97 177 0.56 % 767.82 263 0.56 % 889.89 28 0.06 % 79.29 93 0.44 % 596.56 CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 427 File at CLARIN.SI2.2.84 List of initial character-level 4-grams from verb lower-case word forms in the GOS 1.0 corpus with text- type distributionGOS1.0-word_parts-verbs-lowercase_forms- initial-4grams-taxonomy-entire.tsvLower-case word form Initial part of the word Rest of the word Total absolute frequency of lower-case word form Percentage of all found lower-case word forms Total relative frequency (per million occurrences) Absolute frequency [gos.T.J.R] Percentage [gos.T.J.R] Relative frequency [gos.T.J.R] Absolute frequency [gos.T.N.Z] Percentage [gos.T.N.Z] Relative 
frequency [gos.T.N.Z] Absolute frequency [gos.T.J.I] Percentage [gos.T.J.I] Relative frequency [gos.T.J.I] Absolute frequency [gos.T.N.N] Percentage [gos.T.N.N] Relative frequency [gos.T.N.N] bomo bomo 1,819 1.71 % 1,757.31 486 2.04 % 2,108.25 262 0.90 % 886.51 806 2.09 % 2,282.32 265 1.74 % 1,699.88 reku reku 897 0.84 % 866.58 143 0.60 % 620.33 380 1.31 % 1,285.77 247 0.64 % 699.42 127 0.83 % 814.66 recimo reci mo 838 0.79 % 809.58 166 0.70 % 720.10 126 0.43 % 426.34 351 0.91 % 993.91 195 1.28 % 1,250.86 rekla rekl a 769 0.72 % 742.92 98 0.41 % 425.12 406 1.40 % 1,373.75 124 0.32 % 351.13 141 0.93 % 904.47 mamo mamo 681 0.64 % 657.90 163 0.68 % 707.09 140 0.48 % 473.71 234 0.61 % 662.61 144 0.95 % 923.71 boste bost e 645 0.60 % 623.12 225 0.94 % 976.04 14 0.05 % 47.37 277 0.72 % 784.37 129 0.85 % 827.49 nisem nise m 622 0.58 % 600.90 133 0.56 % 576.95 239 0.82 % 808.68 130 0.34 % 368.12 120 0.79 % 769.76 majo majo 618 0.58 % 597.04 72 0.30 % 312.33 324 1.11 % 1,096.29 97 0.25 % 274.67 125 0.82 % 801.83 mislm misl m 602 0.56 % 581.58 86 0.36 % 373.06 206 0.71 % 697.02 155 0.40 % 438.91 155 1.02 % 994.27 bodo bodo 601 0.56 % 580.62 178 0.75 % 772.16 44 0.15 % 148.88 309 0.80 % 874.98 70 0.46 % 449.03 mislim misl im 599 0.56 % 578.68 110 0.46 % 477.18 160 0.55 % 541.38 203 0.53 % 574.83 126 0.83 % 808.25 niso niso 556 0.52 % 537.14 78 0.33 % 338.36 130 0.45 % 439.87 266 0.69 % 753.22 82 0.54 % 526 bilo bilo 552 0.52 % 533.28 127 0.53 % 550.92 68 0.23 % 230.09 337 0.87 % 954.27 20 0.13 % 128.29 pravi prav i 501 0.47 % 484.01 81 0.34 % 351.37 37 0.13 % 125.19 293 0.76 % 829.68 90 0.59 % 577.32 bila bila 500 0.47 % 483.04 177 0.74 % 767.82 54 0.19 % 182.72 259 0.67 % 733.40 10 0.07 % 64.15 pride prid e 483 0.45 % 466.62 84 0.35 % 364.39 136 0.47 % 460.17 167 0.43 % 472.89 96 0.63 % 615.81 imamo imam o 477 0.45 % 460.82 81 0.34 % 351.37 19 0.07 % 64.29 352 0.91 % 996.75 25 0.16 % 160.37 mela mela 466 0.44 % 450.20 88 0.37 % 381.74 231 0.80 % 781.61 40 0.10 % 113.27 107 
0.70 % 686.37 morš morš 444 0.42 % 428.94 57 0.24 % 247.26 241 0.83 % 815.45 53 0.14 % 150.08 93 0.61 % 596.56 gremo grem o 435 0.41 % 420.25 140 0.59 % 607.31 102 0.35 % 345.13 138 0.36 % 390.77 55 0.36 % 352.81 more more 396 0.37 % 382.57 60 0.25 % 260.28 143 0.49 % 483.86 131 0.34 % 370.95 62 0.41 % 397.71 bojo bojo 372 0.35 % 359.38 44 0.18 % 190.87 126 0.43 % 426.34 104 0.27 % 294.49 98 0.64 % 628.64 pomeni pome ni 371 0.35 % 358.42 40 0.17 % 173.52 19 0.07 % 64.29 257 0.67 % 727.74 55 0.36 % 352.81 rekli rekl i 361 0.34 % 348.76 71 0.30 % 308 49 0.17 % 165.80 212 0.55 % 600.31 29 0.19 % 186.03 mate mate 360 0.34 % 347.79 57 0.24 % 247.26 47 0.16 % 159.03 87 0.23 % 246.35 169 1.11 % 1,084.08 misim misi m 307 0.29 % 296.59 61 0.26 % 264.62 118 0.41 % 399.27 60 0.16 % 169.90 68 0.45 % 436.20 veste vest e 307 0.29 % 296.59 90 0.38 % 390.42 4 0.01 % 13.53 162 0.42 % 458.73 51 0.34 % 327.15 greš greš 289 0.27 % 279.20 38 0.16 % 164.84 183 0.63 % 619.20 34 0.09 % 96.28 34 0.22 % 218.10 bili bili 284 0.27 % 274.37 84 0.35 % 364.39 35 0.12 % 118.43 159 0.41 % 450.23 6 0.04 % 38.49 nismo nism o 284 0.27 % 274.37 57 0.24 % 247.26 61 0.21 % 206.40 133 0.34 % 376.61 33 0.22 % 211.68 imajo imaj o 280 0.26 % 270.50 55 0.23 % 238.59 22 0.08 % 74.44 185 0.48 % 523.86 18 0.12 % 115.46 reče reče 276 0.26 % 266.64 72 0.30 % 312.33 95 0.33 % 321.44 78 0.20 % 220.87 31 0.20 % 198.85 CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 428 File at CLARIN.SI2.2.85 List of initial character-level 5-grams from verb lower-case word forms in the GOS 1.0 corpus with text- type distributionGOS1.0-word_parts-verbs-lowercase_forms- initial-5grams-taxonomy-entire.tsvLower-case word form Initial part of the word Rest of the word Total absolute frequency of lower-case word form Percentage of all found lower-case word forms Total relative frequency (per million occurrences) Absolute frequency [gos.T.J.R] Percentage [gos.T.J.R] Relative frequency [gos.T.J.R] Absolute 
frequency [gos.T.N.Z] Percentage [gos.T.N.Z] Relative frequency [gos.T.N.Z] Absolute frequency [gos.T.J.I] Percentage [gos.T.J.I] Relative frequency [gos.T.J.I] Absolute frequency [gos.T.N.N] Percentage [gos.T.N.N] Relative frequency [gos.T.N.N] recimo recim o 838 1.01 % 809.58 166 0.89 % 720.10 126 0.61 % 426.34 351 1.10 % 993.91 195 1.64 % 1,250.86 rekla rekla 769 0.93 % 742.92 98 0.53 % 425.12 406 1.97 % 1,373.75 124 0.39 % 351.13 141 1.19 % 904.47 boste boste 645 0.78 % 623.12 225 1.21 % 976.04 14 0.07 % 47.37 277 0.87 % 784.37 129 1.09 % 827.49 nisem nisem 622 0.75 % 600.90 133 0.71 % 576.95 239 1.16 % 808.68 130 0.41 % 368.12 120 1.01 % 769.76 mislm mislm 602 0.72 % 581.58 86 0.46 % 373.06 206 1.00 % 697.02 155 0.49 % 438.91 155 1.30 % 994.27 mislim misli m 599 0.72 % 578.68 110 0.59 % 477.18 160 0.78 % 541.38 203 0.64 % 574.83 126 1.06 % 808.25 pravi pravi 501 0.60 % 484.01 81 0.43 % 351.37 37 0.18 % 125.19 293 0.92 % 829.68 90 0.76 % 577.32 pride pride 483 0.58 % 466.62 84 0.45 % 364.39 136 0.66 % 460.17 167 0.52 % 472.89 96 0.81 % 615.81 imamo imamo 477 0.57 % 460.82 81 0.43 % 351.37 19 0.09 % 64.29 352 1.10 % 996.75 25 0.21 % 160.37 gremo gremo 435 0.52 % 420.25 140 0.75 % 607.31 102 0.49 % 345.13 138 0.43 % 390.77 55 0.46 % 352.81 pomeni pomen i 371 0.45 % 358.42 40 0.21 % 173.52 19 0.09 % 64.29 257 0.81 % 727.74 55 0.46 % 352.81 rekli rekli 361 0.43 % 348.76 71 0.38 % 308 49 0.24 % 165.80 212 0.67 % 600.31 29 0.24 % 186.03 misim misim 307 0.37 % 296.59 61 0.33 % 264.62 118 0.57 % 399.27 60 0.19 % 169.90 68 0.57 % 436.20 veste veste 307 0.37 % 296.59 90 0.48 % 390.42 4 0.02 % 13.53 162 0.51 % 458.73 51 0.43 % 327.15 nismo nismo 284 0.34 % 274.37 57 0.30 % 247.26 61 0.30 % 206.40 133 0.42 % 376.61 33 0.28 % 211.68 imajo imajo 280 0.34 % 270.50 55 0.29 % 238.59 22 0.11 % 74.44 185 0.58 % 523.86 18 0.15 % 115.46 delat delat 273 0.33 % 263.74 45 0.24 % 195.21 141 0.69 % 477.09 55 0.17 % 155.74 32 0.27 % 205.27 morem morem 263 0.32 % 254.08 57 0.30 % 247.26 
99 0.48 % 334.98 52 0.16 % 147.25 55 0.46 % 352.81 zanima zanim a 243 0.29 % 234.76 62 0.33 % 268.95 33 0.16 % 111.66 78 0.24 % 220.87 70 0.59 % 449.03 imeli imeli 242 0.29 % 233.79 34 0.18 % 147.49 41 0.20 % 138.73 145 0.46 % 410.59 22 0.18 % 141.12 povej povej 242 0.29 % 233.79 84 0.45 % 364.39 62 0.30 % 209.78 64 0.20 % 181.23 32 0.27 % 205.27 moram moram 237 0.28 % 228.96 72 0.39 % 312.33 43 0.21 % 145.50 87 0.27 % 246.35 35 0.29 % 224.51 mormo mormo 229 0.28 % 221.23 27 0.14 % 117.12 34 0.17 % 115.04 112 0.35 % 317.15 56 0.47 % 359.22 vidla vidla 220 0.27 % 212.54 37 0.20 % 160.50 124 0.60 % 419.57 14 0.04 % 39.64 45 0.38 % 288.66 poglejte pogle jte 197 0.24 % 190.32 21 0.11 % 91.10 3 0.01 % 10.15 161 0.51 % 455.90 12 0.10 % 76.98 povedal poved al 197 0.24 % 190.32 47 0.25 % 203.88 28 0.14 % 94.74 109 0.34 % 308.65 13 0.11 % 83.39 imate imate 195 0.23 % 188.39 45 0.24 % 195.21 0 0 % 0 93 0.29 % 263.34 57 0.48 % 365.64 začel začel 182 0.22 % 175.83 32 0.17 % 138.81 58 0.28 % 196.25 76 0.24 % 215.21 16 0.14 % 102.63 nimam nimam 176 0.21 % 170.03 31 0.17 % 134.48 69 0.34 % 233.47 38 0.12 % 107.60 38 0.32 % 243.76 povedat poved at 176 0.21 % 170.03 54 0.29 % 234.25 32 0.15 % 108.28 68 0.21 % 192.55 22 0.18 % 141.12 dobil dobil 175 0.21 % 169.06 55 0.29 % 238.59 29 0.14 % 98.12 73 0.23 % 206.71 18 0.15 % 115.46 mogla mogla 173 0.21 % 167.13 29 0.15 % 125.80 86 0.42 % 290.99 29 0.09 % 82.12 29 0.24 % 186.03 CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 429 File at CLARIN.SI2.2.86 List of final character-level 1-grams from verb lower-case word forms in the GOS 1.0 corpus with text-type distributionGOS1.0-word_parts-verbs-lowercase_forms- final-1grams-taxonomy-entire.tsvLower-case word form Rest of the word Final part of the word Total absolute frequency of lower-case word form Percentage of all found lower-case word forms Total relative frequency (per million occurrences) Absolute frequency [gos.T.J.R] Percentage [gos.T.J.R] Relative 
frequency [gos.T.J.R] Absolute frequency [gos.T.N.Z] Percentage [gos.T.N.Z] Relative frequency [gos.T.N.Z] Absolute frequency [gos.T.J.I] Percentage [gos.T.J.I] Relative frequency [gos.T.J.I] Absolute frequency [gos.T.N.N] Percentage [gos.T.N.N] Relative frequency [gos.T.N.N] je j e 31,662 14.77 % 30,588.14 6,498 14.30 % 28,188.08 10,223 14.75 % 34,590.68 10,712 15.53 % 30,332.81 4,229 13.84 % 27,127.58 so s o 7,869 3.67 % 7,602.11 1,526 3.36 % 6,619.73 2,277 3.29 % 7,704.49 2,995 4.34 % 8,480.84 1,071 3.50 % 6,870.10 bi b i 6,993 3.26 % 6,755.82 1,319 2.90 % 5,721.77 1,684 2.43 % 5,698.01 2,687 3.90 % 7,608.69 1,303 4.26 % 8,358.30 ni n i 4,775 2.23 % 4,613.05 882 1.94 % 3,826.08 1,507 2.17 % 5,099.11 1,565 2.27 % 4,431.56 821 2.69 % 5,266.43 bo b o 4,762 2.22 % 4,600.49 1,218 2.68 % 5,283.64 1,070 1.54 % 3,620.47 1,764 2.56 % 4,995.06 710 2.32 % 4,554.41 sem se m 4,238 1.98 % 4,094.26 771 1.70 % 3,344.57 1,729 2.49 % 5,850.27 1,053 1.53 % 2,981.74 685 2.24 % 4,394.04 smo sm o 3,919 1.83 % 3,786.08 881 1.94 % 3,821.74 1,105 1.59 % 3,738.89 1,543 2.24 % 4,369.26 390 1.28 % 2,501.72 vem ve m 3,296 1.54 % 3,184.21 467 1.03 % 2,025.83 1,694 2.44 % 5,731.84 500 0.72 % 1,415.83 635 2.08 % 4,073.31 si s i 2,908 1.36 % 2,809.37 857 1.89 % 3,717.63 1,141 1.65 % 3,860.70 528 0.77 % 1,495.12 382 1.25 % 2,450.40 veš ve š 2,611 1.22 % 2,522.44 485 1.07 % 2,103.91 1,534 2.21 % 5,190.46 97 0.14 % 274.67 495 1.62 % 3,175.25 blo bl o 2,262 1.05 % 2,185.28 368 0.81 % 1,596.37 1,011 1.46 % 3,420.83 499 0.72 % 1,413 384 1.26 % 2,463.23 ma m a 2,077 0.97 % 2,006.56 344 0.76 % 1,492.26 1,155 1.67 % 3,908.07 252 0.36 % 713.58 326 1.07 % 2,091.18 bil bi l 2,043 0.95 % 1,973.71 506 1.11 % 2,195.01 798 1.15 % 2,700.12 615 0.89 % 1,741.47 124 0.41 % 795.42 bomo bom o 1,819 0.85 % 1,757.31 486 1.07 % 2,108.25 262 0.38 % 886.51 806 1.17 % 2,282.32 265 0.87 % 1,699.88 ste st e 1,621 0.76 % 1,566.02 480 1.06 % 2,082.22 182 0.26 % 615.82 717 1.04 % 2,030.30 242 0.79 % 1,552.35 bla bl a 1,591 
0.74 % 1,537.04 309 0.68 % 1,340.43 774 1.12 % 2,618.92 322 0.47 % 911.80 186 0.61 % 1,193.13 bom bo m 1,583 0.74 % 1,529.31 269 0.59 % 1,166.91 606 0.87 % 2,050.47 360 0.52 % 1,019.40 348 1.14 % 2,232.30 sn s n 1,350 0.63 % 1,304.21 223 0.49 % 967.37 876 1.26 % 2,964.05 72 0.10 % 203.88 179 0.59 % 1,148.22 da d a 1,144 0.53 % 1,105.20 187 0.41 % 811.20 373 0.54 % 1,262.09 393 0.57 % 1,112.84 191 0.62 % 1,225.20 gre gr e 1,099 0.51 % 1,061.73 200 0.44 % 867.59 288 0.41 % 974.48 456 0.66 % 1,291.24 155 0.51 % 994.27 boš bo š 951 0.44 % 918.75 207 0.46 % 897.96 381 0.55 % 1,289.16 184 0.27 % 521.03 179 0.59 % 1,148.22 maš ma š 930 0.43 % 898.46 207 0.46 % 897.96 479 0.69 % 1,620.75 91 0.13 % 257.68 153 0.50 % 981.44 reku rek u 897 0.42 % 866.58 143 0.32 % 620.33 380 0.55 % 1,285.77 247 0.36 % 699.42 127 0.41 % 814.66 recimo recim o 838 0.39 % 809.58 166 0.36 % 720.10 126 0.18 % 426.34 351 0.51 % 993.91 195 0.64 % 1,250.86 sta st a 820 0.38 % 792.19 218 0.48 % 945.68 260 0.38 % 879.74 251 0.36 % 710.75 91 0.30 % 583.73 mel me l 786 0.37 % 759.34 131 0.29 % 568.27 412 0.59 % 1,394.05 134 0.19 % 379.44 109 0.36 % 699.20 rekla rekl a 769 0.36 % 742.92 98 0.22 % 425.12 406 0.59 % 1,373.75 124 0.18 % 351.13 141 0.46 % 904.47 mamo mam o 681 0.32 % 657.90 163 0.36 % 707.09 140 0.20 % 473.71 234 0.34 % 662.61 144 0.47 % 923.71 zdi zd i 671 0.31 % 648.24 115 0.25 % 498.87 179 0.26 % 605.67 198 0.29 % 560.67 179 0.59 % 1,148.22 boste bost e 645 0.30 % 623.12 225 0.49 % 976.04 14 0.02 % 47.37 277 0.40 % 784.37 129 0.42 % 827.49 šla šl a 631 0.29 % 609.60 75 0.17 % 325.35 412 0.59 % 1,394.05 76 0.11 % 215.21 68 0.22 % 436.20 bli bl i 627 0.29 % 605.73 153 0.34 % 663.71 263 0.38 % 889.89 149 0.22 % 421.92 62 0.20 % 397.71 CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 430 File at CLARIN.SI2.2.87 List of final character-level 2-grams from verb lower-case word forms in the GOS 1.0 corpus with text-type 
distributionGOS1.0-word_parts-verbs-lowercase_forms- final-2grams-taxonomy-entire.tsvLower-case word form Rest of the word Final part of the word Total absolute frequency of lower-case word form Percentage of all found lower-case word forms Total relative frequency (per million occurrences) Absolute frequency [gos.T.J.R] Percentage [gos.T.J.R] Relative frequency [gos.T.J.R] Absolute frequency [gos.T.N.Z] Percentage [gos.T.N.Z] Relative frequency [gos.T.N.Z] Absolute frequency [gos.T.J.I] Percentage [gos.T.J.I] Relative frequency [gos.T.J.I] Absolute frequency [gos.T.N.N] Percentage [gos.T.N.N] Relative frequency [gos.T.N.N] je je 31,662 14.84 % 30,588.14 6,498 14.33 % 28,188.08 10,223 14.89 % 34,590.68 10,712 15.55 % 30,332.81 4,229 13.88 % 27,127.58 so so 7,869 3.69 % 7,602.11 1,526 3.37 % 6,619.73 2,277 3.32 % 7,704.49 2,995 4.35 % 8,480.84 1,071 3.52 % 6,870.10 bi bi 6,993 3.28 % 6,755.82 1,319 2.91 % 5,721.77 1,684 2.45 % 5,698.01 2,687 3.90 % 7,608.69 1,303 4.28 % 8,358.30 ni ni 4,775 2.24 % 4,613.05 882 1.95 % 3,826.08 1,507 2.20 % 5,099.11 1,565 2.27 % 4,431.56 821 2.69 % 5,266.43 bo bo 4,762 2.23 % 4,600.49 1,218 2.69 % 5,283.64 1,070 1.56 % 3,620.47 1,764 2.56 % 4,995.06 710 2.33 % 4,554.41 sem s em 4,238 1.99 % 4,094.26 771 1.70 % 3,344.57 1,729 2.52 % 5,850.27 1,053 1.53 % 2,981.74 685 2.25 % 4,394.04 smo s mo 3,919 1.84 % 3,786.08 881 1.94 % 3,821.74 1,105 1.61 % 3,738.89 1,543 2.24 % 4,369.26 390 1.28 % 2,501.72 vem v em 3,296 1.54 % 3,184.21 467 1.03 % 2,025.83 1,694 2.47 % 5,731.84 500 0.73 % 1,415.83 635 2.08 % 4,073.31 si si 2,908 1.36 % 2,809.37 857 1.89 % 3,717.63 1,141 1.66 % 3,860.70 528 0.77 % 1,495.12 382 1.25 % 2,450.40 veš v eš 2,611 1.22 % 2,522.44 485 1.07 % 2,103.91 1,534 2.23 % 5,190.46 97 0.14 % 274.67 495 1.62 % 3,175.25 blo b lo 2,262 1.06 % 2,185.28 368 0.81 % 1,596.37 1,011 1.47 % 3,420.83 499 0.72 % 1,413 384 1.26 % 2,463.23 ma ma 2,077 0.97 % 2,006.56 344 0.76 % 1,492.26 1,155 1.68 % 3,908.07 252 0.37 % 713.58 326 1.07 % 2,091.18 
bil b il 2,043 0.96 % 1,973.71 506 1.12 % 2,195.01 798 1.16 % 2,700.12 615 0.89 % 1,741.47 124 0.41 % 795.42 bomo bo mo 1,819 0.85 % 1,757.31 486 1.07 % 2,108.25 262 0.38 % 886.51 806 1.17 % 2,282.32 265 0.87 % 1,699.88 ste s te 1,621 0.76 % 1,566.02 480 1.06 % 2,082.22 182 0.27 % 615.82 717 1.04 % 2,030.30 242 0.79 % 1,552.35 bla b la 1,591 0.75 % 1,537.04 309 0.68 % 1,340.43 774 1.13 % 2,618.92 322 0.47 % 911.80 186 0.61 % 1,193.13 bom b om 1,583 0.74 % 1,529.31 269 0.59 % 1,166.91 606 0.88 % 2,050.47 360 0.52 % 1,019.40 348 1.14 % 2,232.30 sn sn 1,350 0.63 % 1,304.21 223 0.49 % 967.37 876 1.28 % 2,964.05 72 0.10 % 203.88 179 0.59 % 1,148.22 da da 1,144 0.54 % 1,105.20 187 0.41 % 811.20 373 0.54 % 1,262.09 393 0.57 % 1,112.84 191 0.63 % 1,225.20 gre g re 1,099 0.52 % 1,061.73 200 0.44 % 867.59 288 0.42 % 974.48 456 0.66 % 1,291.24 155 0.51 % 994.27 boš b oš 951 0.45 % 918.75 207 0.46 % 897.96 381 0.56 % 1,289.16 184 0.27 % 521.03 179 0.59 % 1,148.22 maš m aš 930 0.44 % 898.46 207 0.46 % 897.96 479 0.70 % 1,620.75 91 0.13 % 257.68 153 0.50 % 981.44 reku re ku 897 0.42 % 866.58 143 0.32 % 620.33 380 0.55 % 1,285.77 247 0.36 % 699.42 127 0.42 % 814.66 recimo reci mo 838 0.39 % 809.58 166 0.37 % 720.10 126 0.18 % 426.34 351 0.51 % 993.91 195 0.64 % 1,250.86 sta s ta 820 0.38 % 792.19 218 0.48 % 945.68 260 0.38 % 879.74 251 0.36 % 710.75 91 0.30 % 583.73 mel m el 786 0.37 % 759.34 131 0.29 % 568.27 412 0.60 % 1,394.05 134 0.20 % 379.44 109 0.36 % 699.20 rekla rek la 769 0.36 % 742.92 98 0.22 % 425.12 406 0.59 % 1,373.75 124 0.18 % 351.13 141 0.46 % 904.47 mamo ma mo 681 0.32 % 657.90 163 0.36 % 707.09 140 0.20 % 473.71 234 0.34 % 662.61 144 0.47 % 923.71 zdi z di 671 0.32 % 648.24 115 0.25 % 498.87 179 0.26 % 605.67 198 0.29 % 560.67 179 0.59 % 1,148.22 boste bos te 645 0.30 % 623.12 225 0.50 % 976.04 14 0.02 % 47.37 277 0.40 % 784.37 129 0.42 % 827.49 šla š la 631 0.30 % 609.60 75 0.17 % 325.35 412 0.60 % 1,394.05 76 0.11 % 215.21 68 0.22 % 436.20 bli b li 627 0.29 % 
605.73 153 0.34 % 663.71 263 0.38 % 889.89 149 0.22 % 421.92 62 0.20 % 397.71 CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 431 File at CLARIN.SI2.2.88 List of final character-level 3-grams from verb lower-case word forms in the GOS 1.0 corpus with text-type distributionGOS1.0-word_parts-verbs-lowercase_forms- final-3grams-taxonomy-entire.tsvLower-case word form Rest of the word Final part of the word Total absolute frequency of lower-case word form Percentage of all found lower-case word forms Total relative frequency (per million occurrences) Absolute frequency [gos.T.J.R] Percentage [gos.T.J.R] Relative frequency [gos.T.J.R] Absolute frequency [gos.T.N.Z] Percentage [gos.T.N.Z] Relative frequency [gos.T.N.Z] Absolute frequency [gos.T.J.I] Percentage [gos.T.J.I] Relative frequency [gos.T.J.I] Absolute frequency [gos.T.N.N] Percentage [gos.T.N.N] Relative frequency [gos.T.N.N] sem sem 4,238 2.88 % 4,094.26 771 2.43 % 3,344.57 1,729 3.69 % 5,850.27 1,053 2.21 % 2,981.74 685 3.25 % 4,394.04 smo smo 3,919 2.66 % 3,786.08 881 2.77 % 3,821.74 1,105 2.35 % 3,738.89 1,543 3.24 % 4,369.26 390 1.85 % 2,501.72 vem vem 3,296 2.24 % 3,184.21 467 1.47 % 2,025.83 1,694 3.61 % 5,731.84 500 1.05 % 1,415.83 635 3.02 % 4,073.31 veš veš 2,611 1.77 % 2,522.44 485 1.53 % 2,103.91 1,534 3.27 % 5,190.46 97 0.20 % 274.67 495 2.35 % 3,175.25 blo blo 2,262 1.53 % 2,185.28 368 1.16 % 1,596.37 1,011 2.15 % 3,420.83 499 1.05 % 1,413 384 1.82 % 2,463.23 bil bil 2,043 1.39 % 1,973.71 506 1.59 % 2,195.01 798 1.70 % 2,700.12 615 1.29 % 1,741.47 124 0.59 % 795.42 bomo b omo 1,819 1.23 % 1,757.31 486 1.53 % 2,108.25 262 0.56 % 886.51 806 1.69 % 2,282.32 265 1.26 % 1,699.88 ste ste 1,621 1.10 % 1,566.02 480 1.51 % 2,082.22 182 0.39 % 615.82 717 1.50 % 2,030.30 242 1.15 % 1,552.35 bla bla 1,591 1.08 % 1,537.04 309 0.97 % 1,340.43 774 1.65 % 2,618.92 322 0.68 % 911.80 186 0.88 % 1,193.13 bom bom 1,583 1.07 % 1,529.31 269 0.85 % 1,166.91 606 1.29 % 2,050.47 360 0.76 % 
1,019.40 348 1.65 % 2,232.30 gre gre 1,099 0.75 % 1,061.73 200 0.63 % 867.59 288 0.61 % 974.48 456 0.96 % 1,291.24 155 0.74 % 994.27 boš boš 951 0.65 % 918.75 207 0.65 % 897.96 381 0.81 % 1,289.16 184 0.39 % 521.03 179 0.85 % 1,148.22 maš maš 930 0.63 % 898.46 207 0.65 % 897.96 479 1.02 % 1,620.75 91 0.19 % 257.68 153 0.73 % 981.44 reku r eku 897 0.61 % 866.58 143 0.45 % 620.33 380 0.81 % 1,285.77 247 0.52 % 699.42 127 0.60 % 814.66 recimo rec imo 838 0.57 % 809.58 166 0.52 % 720.10 126 0.27 % 426.34 351 0.74 % 993.91 195 0.93 % 1,250.86 sta sta 820 0.56 % 792.19 218 0.69 % 945.68 260 0.55 % 879.74 251 0.53 % 710.75 91 0.43 % 583.73 mel mel 786 0.53 % 759.34 131 0.41 % 568.27 412 0.88 % 1,394.05 134 0.28 % 379.44 109 0.52 % 699.20 rekla re kla 769 0.52 % 742.92 98 0.31 % 425.12 406 0.86 % 1,373.75 124 0.26 % 351.13 141 0.67 % 904.47 mamo m amo 681 0.46 % 657.90 163 0.51 % 707.09 140 0.30 % 473.71 234 0.49 % 662.61 144 0.68 % 923.71 zdi zdi 671 0.46 % 648.24 115 0.36 % 498.87 179 0.38 % 605.67 198 0.42 % 560.67 179 0.85 % 1,148.22 boste bo ste 645 0.44 % 623.12 225 0.71 % 976.04 14 0.03 % 47.37 277 0.58 % 784.37 129 0.61 % 827.49 šla šla 631 0.43 % 609.60 75 0.24 % 325.35 412 0.88 % 1,394.05 76 0.16 % 215.21 68 0.32 % 436.20 bli bli 627 0.42 % 605.73 153 0.48 % 663.71 263 0.56 % 889.89 149 0.31 % 421.92 62 0.29 % 397.71 nisem ni sem 622 0.42 % 600.90 133 0.42 % 576.95 239 0.51 % 808.68 130 0.27 % 368.12 120 0.57 % 769.76 majo m ajo 618 0.42 % 597.04 72 0.23 % 312.33 324 0.69 % 1,096.29 97 0.20 % 274.67 125 0.59 % 801.83 mam mam 608 0.41 % 587.38 90 0.28 % 390.42 263 0.56 % 889.89 112 0.23 % 317.15 143 0.68 % 917.30 dej dej 607 0.41 % 586.41 184 0.58 % 798.18 264 0.56 % 893.27 82 0.17 % 232.20 77 0.37 % 493.93 mislm mi slm 602 0.41 % 581.58 86 0.27 % 373.06 206 0.44 % 697.02 155 0.33 % 438.91 155 0.74 % 994.27 bodo b odo 601 0.41 % 580.62 178 0.56 % 772.16 44 0.09 % 148.88 309 0.65 % 874.98 70 0.33 % 449.03 mislim mis lim 599 0.41 % 578.68 110 0.35 % 477.18 160 0.34 
% 541.38 203 0.43 % 574.83 126 0.60 % 808.25 aja aja 561 0.38 % 541.97 71 0.22 % 308 373 0.80 % 1,262.09 32 0.07 % 90.61 85 0.40 % 545.25 lej lej 561 0.38 % 541.97 177 0.56 % 767.82 263 0.56 % 889.89 28 0.06 % 79.29 93 0.44 % 596.56 CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 432 File at CLARIN.SI2.2.89 List of final character-level 4-grams from verb lower-case word forms in the GOS 1.0 corpus with text-type distributionGOS1.0-word_parts-verbs-lowercase_forms- final-4grams-taxonomy-entire.tsvLower-case word form Rest of the word Final part of the word Total absolute frequency of lower-case word form Percentage of all found lower-case word forms Total relative frequency (per million occurrences) Absolute frequency [gos.T.J.R] Percentage [gos.T.J.R] Relative frequency [gos.T.J.R] Absolute frequency [gos.T.N.Z] Percentage [gos.T.N.Z] Relative frequency [gos.T.N.Z] Absolute frequency [gos.T.J.I] Percentage [gos.T.J.I] Relative frequency [gos.T.J.I] Absolute frequency [gos.T.N.N] Percentage [gos.T.N.N] Relative frequency [gos.T.N.N] bomo bomo 1,819 1.71 % 1,757.31 486 2.04 % 2,108.25 262 0.90 % 886.51 806 2.09 % 2,282.32 265 1.74 % 1,699.88 reku reku 897 0.84 % 866.58 143 0.60 % 620.33 380 1.31 % 1,285.77 247 0.64 % 699.42 127 0.83 % 814.66 recimo re cimo 838 0.79 % 809.58 166 0.70 % 720.10 126 0.43 % 426.34 351 0.91 % 993.91 195 1.28 % 1,250.86 rekla r ekla 769 0.72 % 742.92 98 0.41 % 425.12 406 1.40 % 1,373.75 124 0.32 % 351.13 141 0.93 % 904.47 mamo mamo 681 0.64 % 657.90 163 0.68 % 707.09 140 0.48 % 473.71 234 0.61 % 662.61 144 0.95 % 923.71 boste b oste 645 0.60 % 623.12 225 0.94 % 976.04 14 0.05 % 47.37 277 0.72 % 784.37 129 0.85 % 827.49 nisem n isem 622 0.58 % 600.90 133 0.56 % 576.95 239 0.82 % 808.68 130 0.34 % 368.12 120 0.79 % 769.76 majo majo 618 0.58 % 597.04 72 0.30 % 312.33 324 1.11 % 1,096.29 97 0.25 % 274.67 125 0.82 % 801.83 mislm m islm 602 0.56 % 581.58 86 0.36 % 373.06 206 0.71 % 697.02 155 0.40 % 438.91 155 1.02 
% 994.27 bodo bodo 601 0.56 % 580.62 178 0.75 % 772.16 44 0.15 % 148.88 309 0.80 % 874.98 70 0.46 % 449.03 mislim mi slim 599 0.56 % 578.68 110 0.46 % 477.18 160 0.55 % 541.38 203 0.53 % 574.83 126 0.83 % 808.25 niso niso 556 0.52 % 537.14 78 0.33 % 338.36 130 0.45 % 439.87 266 0.69 % 753.22 82 0.54 % 526 bilo bilo 552 0.52 % 533.28 127 0.53 % 550.92 68 0.23 % 230.09 337 0.87 % 954.27 20 0.13 % 128.29 pravi p ravi 501 0.47 % 484.01 81 0.34 % 351.37 37 0.13 % 125.19 293 0.76 % 829.68 90 0.59 % 577.32 bila bila 500 0.47 % 483.04 177 0.74 % 767.82 54 0.19 % 182.72 259 0.67 % 733.40 10 0.07 % 64.15 pride p ride 483 0.45 % 466.62 84 0.35 % 364.39 136 0.47 % 460.17 167 0.43 % 472.89 96 0.63 % 615.81 imamo i mamo 477 0.45 % 460.82 81 0.34 % 351.37 19 0.07 % 64.29 352 0.91 % 996.75 25 0.16 % 160.37 mela mela 466 0.44 % 450.20 88 0.37 % 381.74 231 0.80 % 781.61 40 0.10 % 113.27 107 0.70 % 686.37 morš morš 444 0.42 % 428.94 57 0.24 % 247.26 241 0.83 % 815.45 53 0.14 % 150.08 93 0.61 % 596.56 gremo g remo 435 0.41 % 420.25 140 0.59 % 607.31 102 0.35 % 345.13 138 0.36 % 390.77 55 0.36 % 352.81 more more 396 0.37 % 382.57 60 0.25 % 260.28 143 0.49 % 483.86 131 0.34 % 370.95 62 0.41 % 397.71 bojo bojo 372 0.35 % 359.38 44 0.18 % 190.87 126 0.43 % 426.34 104 0.27 % 294.49 98 0.64 % 628.64 pomeni po meni 371 0.35 % 358.42 40 0.17 % 173.52 19 0.07 % 64.29 257 0.67 % 727.74 55 0.36 % 352.81 rekli r ekli 361 0.34 % 348.76 71 0.30 % 308 49 0.17 % 165.80 212 0.55 % 600.31 29 0.19 % 186.03 mate mate 360 0.34 % 347.79 57 0.24 % 247.26 47 0.16 % 159.03 87 0.23 % 246.35 169 1.11 % 1,084.08 misim m isim 307 0.29 % 296.59 61 0.26 % 264.62 118 0.41 % 399.27 60 0.16 % 169.90 68 0.45 % 436.20 veste v este 307 0.29 % 296.59 90 0.38 % 390.42 4 0.01 % 13.53 162 0.42 % 458.73 51 0.34 % 327.15 greš greš 289 0.27 % 279.20 38 0.16 % 164.84 183 0.63 % 619.20 34 0.09 % 96.28 34 0.22 % 218.10 bili bili 284 0.27 % 274.37 84 0.35 % 364.39 35 0.12 % 118.43 159 0.41 % 450.23 6 0.04 % 38.49 nismo n ismo 284 
0.27 % | 274.37 | 57 | 0.24 % | 247.26 | 61 | 0.21 % | 206.40 | 133 | 0.34 % | 376.61 | 33 | 0.22 % | 211.68
imajo | i | majo | 280 | 0.26 % | 270.50 | 55 | 0.23 % | 238.59 | 22 | 0.08 % | 74.44 | 185 | 0.48 % | 523.86 | 18 | 0.12 % | 115.46
reče |  | reče | 276 | 0.26 % | 266.64 | 72 | 0.30 % | 312.33 | 95 | 0.33 % | 321.44 | 78 | 0.20 % | 220.87 | 31 | 0.20 % | 198.85

2.2.90 List of final character-level 5-grams from verb lower-case word forms in the GOS 1.0 corpus with text-type distribution
File at CLARIN.SI: GOS1.0-word_parts-verbs-lowercase_forms-final-5grams-taxonomy-entire.tsv
Lower-case word form | Rest of the word | Final part of the word | Total absolute frequency of lower-case word form | Percentage of all found lower-case word forms | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N]
recimo | r | ecimo | 838 | 1.01 % | 809.58 | 166 | 0.89 % | 720.10 | 126 | 0.61 % | 426.34 | 351 | 1.10 % | 993.91 | 195 | 1.64 % | 1,250.86
rekla |  | rekla | 769 | 0.93 % | 742.92 | 98 | 0.53 % | 425.12 | 406 | 1.97 % | 1,373.75 | 124 | 0.39 % | 351.13 | 141 | 1.19 % | 904.47
boste |  | boste | 645 | 0.78 % | 623.12 | 225 | 1.21 % | 976.04 | 14 | 0.07 % | 47.37 | 277 | 0.87 % | 784.37 | 129 | 1.09 % | 827.49
nisem |  | nisem | 622 | 0.75 % | 600.90 | 133 | 0.71 % | 576.95 | 239 | 1.16 % | 808.68 | 130 | 0.41 % | 368.12 | 120 | 1.01 % | 769.76
mislm |  | mislm | 602 | 0.72 % | 581.58 | 86 | 0.46 % | 373.06 | 206 | 1.00 % | 697.02 | 155 | 0.49 % | 438.91 | 155 | 1.30 % | 994.27
mislim | m | islim | 599 | 0.72 % | 578.68 | 110 | 0.59 % | 477.18 | 160 | 0.78 % | 541.38 | 203 | 0.64 % | 574.83 | 126 | 1.06 % | 808.25
pravi |  | pravi | 501 | 0.60 % | 484.01 | 81 | 0.43 % | 351.37 | 37 | 0.18 % | 125.19 | 293 | 0.92 % | 829.68 | 90 | 0.76 % | 577.32
pride |  | pride | 483 | 0.58 % | 466.62 | 84 | 0.45 % | 364.39 | 136 | 0.66 % | 460.17 | 167 | 0.52 % | 472.89 | 96 | 0.81 % | 615.81
imamo |  | imamo | 477 | 0.57 % | 460.82 | 81 | 0.43 % | 351.37 | 19 | 0.09 % | 64.29 | 352 | 1.10 % | 996.75 | 25 | 0.21 % | 160.37
gremo |  | gremo | 435 | 0.52 % | 420.25 | 140 | 0.75 % | 607.31 | 102 | 0.49 % | 345.13 | 138 | 0.43 % | 390.77 | 55 | 0.46 % | 352.81
pomeni | p | omeni | 371 | 0.45 % | 358.42 | 40 | 0.21 % | 173.52 | 19 | 0.09 % | 64.29 | 257 | 0.81 % | 727.74 | 55 | 0.46 % | 352.81
rekli |  | rekli | 361 | 0.43 % | 348.76 | 71 | 0.38 % | 308 | 49 | 0.24 % | 165.80 | 212 | 0.67 % | 600.31 | 29 | 0.24 % | 186.03
misim |  | misim | 307 | 0.37 % | 296.59 | 61 | 0.33 % | 264.62 | 118 | 0.57 % | 399.27 | 60 | 0.19 % | 169.90 | 68 | 0.57 % | 436.20
veste |  | veste | 307 | 0.37 % | 296.59 | 90 | 0.48 % | 390.42 | 4 | 0.02 % | 13.53 | 162 | 0.51 % | 458.73 | 51 | 0.43 % | 327.15
nismo |  | nismo | 284 | 0.34 % | 274.37 | 57 | 0.30 % | 247.26 | 61 | 0.30 % | 206.40 | 133 | 0.42 % | 376.61 | 33 | 0.28 % | 211.68
imajo |  | imajo | 280 | 0.34 % | 270.50 | 55 | 0.29 % | 238.59 | 22 | 0.11 % | 74.44 | 185 | 0.58 % | 523.86 | 18 | 0.15 % | 115.46
delat |  | delat | 273 | 0.33 % | 263.74 | 45 | 0.24 % | 195.21 | 141 | 0.69 % | 477.09 | 55 | 0.17 % | 155.74 | 32 | 0.27 % | 205.27
morem |  | morem | 263 | 0.32 % | 254.08 | 57 | 0.30 % | 247.26 | 99 | 0.48 % | 334.98 | 52 | 0.16 % | 147.25 | 55 | 0.46 % | 352.81
zanima | z | anima | 243 | 0.29 % | 234.76 | 62 | 0.33 % | 268.95 | 33 | 0.16 % | 111.66 | 78 | 0.24 % | 220.87 | 70 | 0.59 % | 449.03
imeli |  | imeli | 242 | 0.29 % | 233.79 | 34 | 0.18 % | 147.49 | 41 | 0.20 % | 138.73 | 145 | 0.46 % | 410.59 | 22 | 0.18 % | 141.12
povej |  | povej | 242 | 0.29 % | 233.79 | 84 | 0.45 % | 364.39 | 62 | 0.30 % | 209.78 | 64 | 0.20 % | 181.23 | 32 | 0.27 % | 205.27
moram |  | moram | 237 | 0.28 % | 228.96 | 72 | 0.39 % | 312.33 | 43 | 0.21 % | 145.50 | 87 | 0.27 % | 246.35 | 35 | 0.29 % | 224.51
mormo |  | mormo | 229 | 0.28 % | 221.23 | 27 | 0.14 % | 117.12 | 34 | 0.17 % | 115.04 | 112 | 0.35 % | 317.15 | 56 | 0.47 % | 359.22
vidla |  | vidla | 220 | 0.27 % | 212.54 | 37 | 0.20 % | 160.50 | 124 | 0.60 % | 419.57 | 14 | 0.04 % | 39.64 | 45 | 0.38 % | 288.66
poglejte | pog | lejte | 197 | 0.24 % | 190.32 | 21 | 0.11 % | 91.10 | 3 | 0.01 % | 10.15 | 161 | 0.51 % | 455.90 | 12 | 0.10 % | 76.98
povedal | po | vedal | 197 | 0.24 % | 190.32 | 47 | 0.25 % | 203.88 | 28 | 0.14 % | 94.74 | 109 | 0.34 % | 308.65 | 13 | 0.11 % | 83.39
imate |  | imate | 195 | 0.23 % | 188.39 | 45 | 0.24 % | 195.21 | 0 | 0 % | 0 | 93 | 0.29 % | 263.34 | 57 | 0.48 % | 365.64
začel |  | začel | 182 | 0.22 % | 175.83 | 32 | 0.17 % | 138.81 | 58 | 0.28 % | 196.25 | 76 | 0.24 % | 215.21 | 16 | 0.14 % | 102.63
nimam |  | nimam | 176 | 0.21 % | 170.03 | 31 | 0.17 % | 134.48 | 69 | 0.34 % | 233.47 | 38 | 0.12 % | 107.60 | 38
0.32 % | 243.76
povedat | po | vedat | 176 | 0.21 % | 170.03 | 54 | 0.29 % | 234.25 | 32 | 0.15 % | 108.28 | 68 | 0.21 % | 192.55 | 22 | 0.18 % | 141.12
dobil |  | dobil | 175 | 0.21 % | 169.06 | 55 | 0.29 % | 238.59 | 29 | 0.14 % | 98.12 | 73 | 0.23 % | 206.71 | 18 | 0.15 % | 115.46
mogla |  | mogla | 173 | 0.21 % | 167.13 | 29 | 0.15 % | 125.80 | 86 | 0.42 % | 290.99 | 29 | 0.09 % | 82.12 | 29 | 0.24 % | 186.03

2.2.91 List of initial character-level 1-grams from adjective lemmas in the GOS 1.0 corpus with text-type distribution
File at CLARIN.SI: GOS1.0-word_parts-adjectives-lemmas-initial-1grams-taxonomy-entire.tsv
Lemma | Lemma (lower-case) | Initial part of the word | Rest of the word | Total absolute frequency of lemma | Percentage of all found lemmas | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N]
sam | sam | s | am | 3,381 | 6.13 % | 3,266.33 | 596 | 4.44 % | 2,585.43 | 1,346 | 13.48 % | 4,554.34 | 817 | 3.31 % | 2,313.47 | 622 | 8.78 % | 3,989.92
dober | dober | d | ober | 3,021 | 5.47 % | 2,918.54 | 1,244 | 9.27 % | 5,396.42 | 621 | 6.22 % | 2,101.22 | 776 | 3.14 % | 2,197.37 | 380 | 5.37 % | 2,437.57
mali | mali | m | ali | 1,760 | 3.19 % | 1,700.31 | 502 | 3.74 % | 2,177.66 | 595 | 5.96 % | 2,013.25 | 305 | 1.24 % | 863.66 | 358 | 5.06 % | 2,296.45
velik | velik | v | elik | 1,212 | 2.20 % | 1,170.89 | 278 | 2.07 % | 1,205.95 | 210 | 2.10 % | 710.56 | 589 | 2.39 % | 1,667.85 | 135 | 1.91 % | 865.98
pravi | pravi | p | ravi | 944 | 1.71 % | 911.98 | 178 | 1.33 % | 772.16 | 144 | 1.44 % | 487.24 | 472 | 1.91 % | 1,336.55 | 150 | 2.12 % | 962.20
lep | lep | l | ep | 929 | 1.68 % | 897.49 | 379 | 2.82 % | 1,644.09 | 140 | 1.40 % | 473.71 | 306 | 1.24 % | 866.49 | 104 | 1.47 % | 667.12
nov | nov | n | ov | 894 | 1.62 % | 863.68 | 282 | 2.10 % | 1,223.31 | 150 | 1.50 % | 507.54 | 360 | 1.46 % | 1,019.40 | 102 | 1.44 % | 654.29
cel | cel | c | el | 849 | 1.54 % | 820.21 | 158 | 1.18 % | 685.40 | 273 | 2.73 % | 923.73 | 290 | 1.18 % | 821.18 | 128 | 1.81 % | 821.08
star | star | s | tar | 719 | 1.30 % | 694.61 | 160 | 1.19 % | 694.07 | 357 | 3.58 % | 1,207.95 | 139 | 0.56 % | 393.60 | 63 | 0.89 % | 404.12
slovenski | slovenski | s | lovenski | 561 | 1.02 % | 541.97 | 144 | 1.07 % | 624.67 | 31 | 0.31 % | 104.89 | 370 | 1.50 % | 1,047.72 | 16 | 0.23 % | 102.63
zadnji | zadnji | z | adnji | 507 | 0.92 % | 489.80 | 166 | 1.24 % | 720.10 | 64 | 0.64 % | 216.55 | 244 | 0.99 % | 690.93 | 33 | 0.47 % | 211.68
glaven | glaven | g | laven | 423 | 0.77 % | 408.65 | 87 | 0.65 % | 377.40 | 176 | 1.76 % | 595.52 | 109 | 0.44 % | 308.65 | 51 | 0.72 % | 327.15
majhen | majhen | m | ajhen | 381 | 0.69 % | 368.08 | 73 | 0.54 % | 316.67 | 89 | 0.89 % | 301.14 | 167 | 0.68 % | 472.89 | 52 | 0.73 % | 333.56
dolg | dolg | d | olg | 331 | 0.60 % | 319.77 | 91 | 0.68 % | 394.75 | 99 | 0.99 % | 334.98 | 80 | 0.32 % | 226.53 | 61 | 0.86 % | 391.29
naslednji | naslednji | n | aslednji | 330 | 0.60 % | 318.81 | 102 | 0.76 % | 442.47 | 53 | 0.53 % | 179.33 | 125 | 0.51 % | 353.96 | 50 | 0.71 % | 320.73
mlad | mlad | m | lad | 329 | 0.60 % | 317.84 | 111 | 0.83 % | 481.51 | 86 | 0.86 % | 290.99 | 111 | 0.45 % | 314.31 | 21 | 0.30 % | 134.71
kratek | kratek | k | ratek | 286 | 0.52 % | 276.30 | 81 | 0.60 % | 351.37 | 29 | 0.29 % | 98.12 | 122 | 0.49 % | 345.46 | 54 | 0.76 % | 346.39
visok | visok | v | isok | 278 | 0.50 % | 268.57 | 53 | 0.40 % | 229.91 | 37 | 0.37 % | 125.19 | 159 | 0.64 % | 450.23 | 29 | 0.41 % | 186.03
pomemben | pomemben | p | omemben | 265 | 0.48 % | 256.01 | 47 | 0.35 % | 203.88 | 17 | 0.17 % | 57.52 | 163 | 0.66 % | 461.56 | 38 | 0.54 % | 243.76
evropski | evropski | e | vropski | 262 | 0.47 % | 253.11 | 33 | 0.25 % | 143.15 | 13 | 0.13 % | 43.99 | 210 | 0.85 % | 594.65 | 6 | 0.09 % | 38.49
različen | različen | r | azličen | 260 | 0.47 % | 251.18 | 28 | 0.21 % | 121.46 | 28 | 0.28 % | 94.74 | 154 | 0.62 % | 436.08 | 50 | 0.71 % | 320.73
slab | slab | s | lab | 241 | 0.44 % | 232.83 | 73 | 0.54 % | 316.67 | 54 | 0.54 % | 182.72 | 98 | 0.40 % | 277.50 | 16 | 0.23 % | 102.63
drag | drag | d | rag | 238 | 0.43 % | 229.93 | 93 | 0.69 % | 403.43 | 56 | 0.56 % | 189.48 | 53 | 0.21 % | 150.08 | 36 | 0.51 % | 230.93
zanimiv | zanimiv | z | animiv | 230 | 0.42 % | 222.20 | 81 | 0.60 % | 351.37 | 50 | 0.50 % | 169.18 | 73 | 0.30 % | 206.71 | 26 | 0.37 % | 166.78
jasen | jasen | j | asen | 213 | 0.39 % | 205.78 | 39 | 0.29 % | 169.18 | 39 | 0.39 % | 131.96 | 99 | 0.40 % | 280.33 | 36 | 0.51 % | 230.93
določen | določen | d | oločen | 212 | 0.38 % | 204.81 | 13 | 0.10 % | 56.39 | 22 | 0.22 % | 74.44 | 128 | 0.52 % | 362.45 | 49 | 0.69 % | 314.32
hud | hud
h | ud | 207 | 0.38 % | 199.98 | 48 | 0.36 % | 208.22 | 63 | 0.63 % | 213.17 | 70 | 0.28 % | 198.22 | 26 | 0.37 % | 166.78
domač | domač | d | omač | 184 | 0.33 % | 177.76 | 71 | 0.53 % | 308 | 18 | 0.18 % | 60.91 | 86 | 0.35 % | 243.52 | 9 | 0.13 % | 57.73
podoben | podoben | p | odoben | 181 | 0.33 % | 174.86 | 47 | 0.35 % | 203.88 | 28 | 0.28 % | 94.74 | 79 | 0.32 % | 223.70 | 27 | 0.38 % | 173.20
deloven | deloven | d | eloven | 177 | 0.32 % | 171 | 21 | 0.16 % | 91.10 | 12 | 0.12 % | 40.60 | 110 | 0.45 % | 311.48 | 34 | 0.48 % | 218.10
osnoven | osnoven | o | snoven | 174 | 0.32 % | 168.10 | 20 | 0.15 % | 86.76 | 27 | 0.27 % | 91.36 | 85 | 0.34 % | 240.69 | 42 | 0.59 % | 269.42
javen | javen | j | aven | 171 | 0.31 % | 165.20 | 5 | 0.04 % | 21.69 | 4 | 0.04 % | 13.53 | 146 | 0.59 % | 413.42 | 16 | 0.23 % | 102.63

2.2.92 List of initial character-level 2-grams from adjective lemmas in the GOS 1.0 corpus with text-type distribution
File at CLARIN.SI: GOS1.0-word_parts-adjectives-lemmas-initial-2grams-taxonomy-entire.tsv
Lemma | Lemma (lower-case) | Initial part of the word | Rest of the word | Total absolute frequency of lemma | Percentage of all found lemmas | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N]
sam | sam | sa | m | 3,381 | 6.13 % | 3,266.33 | 596 | 4.44 % | 2,585.43 | 1,346 | 13.48 % | 4,554.34 | 817 | 3.31 % | 2,313.47 | 622 | 8.78 % | 3,989.92
dober | dober | do | ber | 3,021 | 5.47 % | 2,918.54 | 1,244 | 9.27 % | 5,396.42 | 621 | 6.22 % | 2,101.22 | 776 | 3.14 % | 2,197.37 | 380 | 5.37 % | 2,437.57
mali | mali | ma | li | 1,760 | 3.19 % | 1,700.31 | 502 | 3.74 % | 2,177.66 | 595 | 5.96 % | 2,013.25 | 305 | 1.24 % | 863.66 | 358 | 5.06 % | 2,296.45
velik | velik | ve | lik | 1,212 | 2.20 % | 1,170.89 | 278 | 2.07 % | 1,205.95 | 210 | 2.10 % | 710.56 | 589 | 2.39 % | 1,667.85 | 135 | 1.91 % | 865.98
pravi | pravi | pr | avi | 944 | 1.71 % | 911.98 | 178 | 1.33 % | 772.16 | 144 | 1.44 % | 487.24 | 472 | 1.91 % | 1,336.55 | 150 | 2.12 % | 962.20
lep | lep | le | p | 929 | 1.68 % | 897.49 | 379 | 2.82 % | 1,644.09 | 140 | 1.40 % | 473.71 | 306 | 1.24 % | 866.49 | 104 | 1.47 % | 667.12
nov | nov | no | v | 894 | 1.62 % | 863.68 | 282 | 2.10 % | 1,223.31 | 150 | 1.50 % | 507.54 | 360 | 1.46 % | 1,019.40 | 102 | 1.44 % | 654.29
cel | cel | ce | l | 849 | 1.54 % | 820.21 | 158 | 1.18 % | 685.40 | 273 | 2.73 % | 923.73 | 290 | 1.18 % | 821.18 | 128 | 1.81 % | 821.08
star | star | st | ar | 719 | 1.30 % | 694.61 | 160 | 1.19 % | 694.07 | 357 | 3.58 % | 1,207.95 | 139 | 0.56 % | 393.60 | 63 | 0.89 % | 404.12
slovenski | slovenski | sl | ovenski | 561 | 1.02 % | 541.97 | 144 | 1.07 % | 624.67 | 31 | 0.31 % | 104.89 | 370 | 1.50 % | 1,047.72 | 16 | 0.23 % | 102.63
zadnji | zadnji | za | dnji | 507 | 0.92 % | 489.80 | 166 | 1.24 % | 720.10 | 64 | 0.64 % | 216.55 | 244 | 0.99 % | 690.93 | 33 | 0.47 % | 211.68
glaven | glaven | gl | aven | 423 | 0.77 % | 408.65 | 87 | 0.65 % | 377.40 | 176 | 1.76 % | 595.52 | 109 | 0.44 % | 308.65 | 51 | 0.72 % | 327.15
majhen | majhen | ma | jhen | 381 | 0.69 % | 368.08 | 73 | 0.54 % | 316.67 | 89 | 0.89 % | 301.14 | 167 | 0.68 % | 472.89 | 52 | 0.73 % | 333.56
dolg | dolg | do | lg | 331 | 0.60 % | 319.77 | 91 | 0.68 % | 394.75 | 99 | 0.99 % | 334.98 | 80 | 0.32 % | 226.53 | 61 | 0.86 % | 391.29
naslednji | naslednji | na | slednji | 330 | 0.60 % | 318.81 | 102 | 0.76 % | 442.47 | 53 | 0.53 % | 179.33 | 125 | 0.51 % | 353.96 | 50 | 0.71 % | 320.73
mlad | mlad | ml | ad | 329 | 0.60 % | 317.84 | 111 | 0.83 % | 481.51 | 86 | 0.86 % | 290.99 | 111 | 0.45 % | 314.31 | 21 | 0.30 % | 134.71
kratek | kratek | kr | atek | 286 | 0.52 % | 276.30 | 81 | 0.60 % | 351.37 | 29 | 0.29 % | 98.12 | 122 | 0.49 % | 345.46 | 54 | 0.76 % | 346.39
visok | visok | vi | sok | 278 | 0.50 % | 268.57 | 53 | 0.40 % | 229.91 | 37 | 0.37 % | 125.19 | 159 | 0.64 % | 450.23 | 29 | 0.41 % | 186.03
pomemben | pomemben | po | memben | 265 | 0.48 % | 256.01 | 47 | 0.35 % | 203.88 | 17 | 0.17 % | 57.52 | 163 | 0.66 % | 461.56 | 38 | 0.54 % | 243.76
evropski | evropski | ev | ropski | 262 | 0.47 % | 253.11 | 33 | 0.25 % | 143.15 | 13 | 0.13 % | 43.99 | 210 | 0.85 % | 594.65 | 6 | 0.09 % | 38.49
različen | različen | ra | zličen | 260 | 0.47 % | 251.18 | 28 | 0.21 % | 121.46 | 28 | 0.28 % | 94.74 | 154 | 0.62 % | 436.08 | 50 | 0.71 % | 320.73
slab | slab | sl | ab | 241 | 0.44 % | 232.83 | 73 | 0.54 % | 316.67 | 54 | 0.54 % | 182.72 | 98 | 0.40 % | 277.50 | 16 | 0.23 % | 102.63
drag | drag | dr | ag | 238 | 0.43 % | 229.93 | 93 | 0.69 % | 403.43 | 56 | 0.56 % | 189.48 | 53 | 0.21 % | 150.08 | 36 | 0.51 % | 230.93
zanimiv | zanimiv | za | nimiv
230 | 0.42 % | 222.20 | 81 | 0.60 % | 351.37 | 50 | 0.50 % | 169.18 | 73 | 0.30 % | 206.71 | 26 | 0.37 % | 166.78
jasen | jasen | ja | sen | 213 | 0.39 % | 205.78 | 39 | 0.29 % | 169.18 | 39 | 0.39 % | 131.96 | 99 | 0.40 % | 280.33 | 36 | 0.51 % | 230.93
določen | določen | do | ločen | 212 | 0.38 % | 204.81 | 13 | 0.10 % | 56.39 | 22 | 0.22 % | 74.44 | 128 | 0.52 % | 362.45 | 49 | 0.69 % | 314.32
hud | hud | hu | d | 207 | 0.38 % | 199.98 | 48 | 0.36 % | 208.22 | 63 | 0.63 % | 213.17 | 70 | 0.28 % | 198.22 | 26 | 0.37 % | 166.78
domač | domač | do | mač | 184 | 0.33 % | 177.76 | 71 | 0.53 % | 308 | 18 | 0.18 % | 60.91 | 86 | 0.35 % | 243.52 | 9 | 0.13 % | 57.73
podoben | podoben | po | doben | 181 | 0.33 % | 174.86 | 47 | 0.35 % | 203.88 | 28 | 0.28 % | 94.74 | 79 | 0.32 % | 223.70 | 27 | 0.38 % | 173.20
deloven | deloven | de | loven | 177 | 0.32 % | 171 | 21 | 0.16 % | 91.10 | 12 | 0.12 % | 40.60 | 110 | 0.45 % | 311.48 | 34 | 0.48 % | 218.10
osnoven | osnoven | os | noven | 174 | 0.32 % | 168.10 | 20 | 0.15 % | 86.76 | 27 | 0.27 % | 91.36 | 85 | 0.34 % | 240.69 | 42 | 0.59 % | 269.42
javen | javen | ja | ven | 171 | 0.31 % | 165.20 | 5 | 0.04 % | 21.69 | 4 | 0.04 % | 13.53 | 146 | 0.59 % | 413.42 | 16 | 0.23 % | 102.63

2.2.93 List of initial character-level 3-grams from adjective lemmas in the GOS 1.0 corpus with text-type distribution
File at CLARIN.SI: GOS1.0-word_parts-adjectives-lemmas-initial-3grams-taxonomy-entire.tsv
Lemma | Lemma (lower-case) | Initial part of the word | Rest of the word | Total absolute frequency of lemma | Percentage of all found lemmas | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N]
sam | sam | sam |  | 3,381 | 6.13 % | 3,266.33 | 596 | 4.44 % | 2,585.43 | 1,346 | 13.48 % | 4,554.34 | 817 | 3.31 % | 2,313.47 | 622 | 8.78 % | 3,989.92
dober | dober | dob | er | 3,021 | 5.48 % | 2,918.54 | 1,244 | 9.27 % | 5,396.42 | 621 | 6.22 % | 2,101.22 | 776 | 3.14 % | 2,197.37 | 380 | 5.37 % | 2,437.57
mali | mali | mal | i | 1,760 | 3.19 % | 1,700.31 | 502 | 3.74 % | 2,177.66 | 595 | 5.96 % | 2,013.25 | 305 | 1.24 % | 863.66 | 358 | 5.06 % | 2,296.45
velik | velik | vel | ik | 1,212 | 2.20 % | 1,170.89 | 278 | 2.07 % | 1,205.95 | 210 | 2.10 % | 710.56 | 589 | 2.39 % | 1,667.85 | 135 | 1.91 % | 865.98
pravi | pravi | pra | vi | 944 | 1.71 % | 911.98 | 178 | 1.33 % | 772.16 | 144 | 1.44 % | 487.24 | 472 | 1.91 % | 1,336.55 | 150 | 2.12 % | 962.20
lep | lep | lep |  | 929 | 1.68 % | 897.49 | 379 | 2.82 % | 1,644.09 | 140 | 1.40 % | 473.71 | 306 | 1.24 % | 866.49 | 104 | 1.47 % | 667.12
nov | nov | nov |  | 894 | 1.62 % | 863.68 | 282 | 2.10 % | 1,223.31 | 150 | 1.50 % | 507.54 | 360 | 1.46 % | 1,019.40 | 102 | 1.44 % | 654.29
cel | cel | cel |  | 849 | 1.54 % | 820.21 | 158 | 1.18 % | 685.40 | 273 | 2.73 % | 923.73 | 290 | 1.18 % | 821.18 | 128 | 1.81 % | 821.08
star | star | sta | r | 719 | 1.30 % | 694.61 | 160 | 1.19 % | 694.07 | 357 | 3.58 % | 1,207.95 | 139 | 0.56 % | 393.60 | 63 | 0.89 % | 404.12
slovenski | slovenski | slo | venski | 561 | 1.02 % | 541.97 | 144 | 1.07 % | 624.67 | 31 | 0.31 % | 104.89 | 370 | 1.50 % | 1,047.72 | 16 | 0.23 % | 102.63
zadnji | zadnji | zad | nji | 507 | 0.92 % | 489.80 | 166 | 1.24 % | 720.10 | 64 | 0.64 % | 216.55 | 244 | 0.99 % | 690.93 | 33 | 0.47 % | 211.68
glaven | glaven | gla | ven | 423 | 0.77 % | 408.65 | 87 | 0.65 % | 377.40 | 176 | 1.76 % | 595.52 | 109 | 0.44 % | 308.65 | 51 | 0.72 % | 327.15
majhen | majhen | maj | hen | 381 | 0.69 % | 368.08 | 73 | 0.54 % | 316.67 | 89 | 0.89 % | 301.14 | 167 | 0.68 % | 472.89 | 52 | 0.73 % | 333.56
dolg | dolg | dol | g | 331 | 0.60 % | 319.77 | 91 | 0.68 % | 394.75 | 99 | 0.99 % | 334.98 | 80 | 0.32 % | 226.53 | 61 | 0.86 % | 391.29
naslednji | naslednji | nas | lednji | 330 | 0.60 % | 318.81 | 102 | 0.76 % | 442.47 | 53 | 0.53 % | 179.33 | 125 | 0.51 % | 353.96 | 50 | 0.71 % | 320.73
mlad | mlad | mla | d | 329 | 0.60 % | 317.84 | 111 | 0.83 % | 481.51 | 86 | 0.86 % | 290.99 | 111 | 0.45 % | 314.31 | 21 | 0.30 % | 134.71
kratek | kratek | kra | tek | 286 | 0.52 % | 276.30 | 81 | 0.60 % | 351.37 | 29 | 0.29 % | 98.12 | 122 | 0.49 % | 345.46 | 54 | 0.76 % | 346.39
visok | visok | vis | ok | 278 | 0.50 % | 268.57 | 53 | 0.40 % | 229.91 | 37 | 0.37 % | 125.19 | 159 | 0.64 % | 450.23 | 29 | 0.41 % | 186.03
pomemben | pomemben | pom | emben | 265 | 0.48 % | 256.01 | 47 | 0.35 % | 203.88 | 17 | 0.17 % | 57.52 | 163 | 0.66 % | 461.56 | 38 | 0.54 % | 243.76
evropski | evropski | evr | opski | 262 | 0.47 % | 253.11 | 33 | 0.25 % | 143.15 | 13 | 0.13 % | 43.99 | 210 | 0.85 % | 594.65 | 6 | 0.09 % | 38.49
različen | različen | raz | ličen | 260 | 0.47
% | 251.18 | 28 | 0.21 % | 121.46 | 28 | 0.28 % | 94.74 | 154 | 0.62 % | 436.08 | 50 | 0.71 % | 320.73
slab | slab | sla | b | 241 | 0.44 % | 232.83 | 73 | 0.54 % | 316.67 | 54 | 0.54 % | 182.72 | 98 | 0.40 % | 277.50 | 16 | 0.23 % | 102.63
drag | drag | dra | g | 238 | 0.43 % | 229.93 | 93 | 0.69 % | 403.43 | 56 | 0.56 % | 189.48 | 53 | 0.21 % | 150.08 | 36 | 0.51 % | 230.93
zanimiv | zanimiv | zan | imiv | 230 | 0.42 % | 222.20 | 81 | 0.60 % | 351.37 | 50 | 0.50 % | 169.18 | 73 | 0.30 % | 206.71 | 26 | 0.37 % | 166.78
jasen | jasen | jas | en | 213 | 0.39 % | 205.78 | 39 | 0.29 % | 169.18 | 39 | 0.39 % | 131.96 | 99 | 0.40 % | 280.33 | 36 | 0.51 % | 230.93
določen | določen | dol | očen | 212 | 0.38 % | 204.81 | 13 | 0.10 % | 56.39 | 22 | 0.22 % | 74.44 | 128 | 0.52 % | 362.45 | 49 | 0.69 % | 314.32
hud | hud | hud |  | 207 | 0.38 % | 199.98 | 48 | 0.36 % | 208.22 | 63 | 0.63 % | 213.17 | 70 | 0.28 % | 198.22 | 26 | 0.37 % | 166.78
domač | domač | dom | ač | 184 | 0.33 % | 177.76 | 71 | 0.53 % | 308 | 18 | 0.18 % | 60.91 | 86 | 0.35 % | 243.52 | 9 | 0.13 % | 57.73
podoben | podoben | pod | oben | 181 | 0.33 % | 174.86 | 47 | 0.35 % | 203.88 | 28 | 0.28 % | 94.74 | 79 | 0.32 % | 223.70 | 27 | 0.38 % | 173.20
deloven | deloven | del | oven | 177 | 0.32 % | 171 | 21 | 0.16 % | 91.10 | 12 | 0.12 % | 40.60 | 110 | 0.45 % | 311.48 | 34 | 0.48 % | 218.10
osnoven | osnoven | osn | oven | 174 | 0.32 % | 168.10 | 20 | 0.15 % | 86.76 | 27 | 0.27 % | 91.36 | 85 | 0.34 % | 240.69 | 42 | 0.59 % | 269.42
javen | javen | jav | en | 171 | 0.31 % | 165.20 | 5 | 0.04 % | 21.69 | 4 | 0.04 % | 13.53 | 146 | 0.59 % | 413.42 | 16 | 0.23 % | 102.63

2.2.94 List of initial character-level 4-grams from adjective lemmas in the GOS 1.0 corpus with text-type distribution
File at CLARIN.SI: GOS1.0-word_parts-adjectives-lemmas-initial-4grams-taxonomy-entire.tsv
Lemma | Lemma (lower-case) | Initial part of the word | Rest of the word | Total absolute frequency of lemma | Percentage of all found lemmas | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N]
dober | dober | dobe | r | 3,021 | 6.31 % | 2,918.54 | 1,244 | 10.65 % | 5,396.42 | 621 | 8.10 % | 2,101.22 | 776 | 3.44 % | 2,197.37 | 380 | 6.33 % | 2,437.57
mali | mali | mali |  | 1,760 | 3.67 % | 1,700.31 | 502 | 4.30 % | 2,177.66 | 595 | 7.76 % | 2,013.25 | 305 | 1.35 % | 863.66 | 358 | 5.96 % | 2,296.45
velik | velik | veli | k | 1,212 | 2.53 % | 1,170.89 | 278 | 2.38 % | 1,205.95 | 210 | 2.74 % | 710.56 | 589 | 2.61 % | 1,667.85 | 135 | 2.25 % | 865.98
pravi | pravi | prav | i | 944 | 1.97 % | 911.98 | 178 | 1.52 % | 772.16 | 144 | 1.88 % | 487.24 | 472 | 2.09 % | 1,336.55 | 150 | 2.50 % | 962.20
star | star | star |  | 719 | 1.50 % | 694.61 | 160 | 1.37 % | 694.07 | 357 | 4.66 % | 1,207.95 | 139 | 0.62 % | 393.60 | 63 | 1.05 % | 404.12
slovenski | slovenski | slov | enski | 561 | 1.17 % | 541.97 | 144 | 1.23 % | 624.67 | 31 | 0.40 % | 104.89 | 370 | 1.64 % | 1,047.72 | 16 | 0.27 % | 102.63
zadnji | zadnji | zadn | ji | 507 | 1.06 % | 489.80 | 166 | 1.42 % | 720.10 | 64 | 0.83 % | 216.55 | 244 | 1.08 % | 690.93 | 33 | 0.55 % | 211.68
glaven | glaven | glav | en | 423 | 0.88 % | 408.65 | 87 | 0.74 % | 377.40 | 176 | 2.30 % | 595.52 | 109 | 0.48 % | 308.65 | 51 | 0.85 % | 327.15
majhen | majhen | majh | en | 381 | 0.80 % | 368.08 | 73 | 0.62 % | 316.67 | 89 | 1.16 % | 301.14 | 167 | 0.74 % | 472.89 | 52 | 0.87 % | 333.56
dolg | dolg | dolg |  | 331 | 0.69 % | 319.77 | 91 | 0.78 % | 394.75 | 99 | 1.29 % | 334.98 | 80 | 0.35 % | 226.53 | 61 | 1.02 % | 391.29
naslednji | naslednji | nasl | ednji | 330 | 0.69 % | 318.81 | 102 | 0.87 % | 442.47 | 53 | 0.69 % | 179.33 | 125 | 0.56 % | 353.96 | 50 | 0.83 % | 320.73
mlad | mlad | mlad |  | 329 | 0.69 % | 317.84 | 111 | 0.95 % | 481.51 | 86 | 1.12 % | 290.99 | 111 | 0.49 % | 314.31 | 21 | 0.35 % | 134.71
kratek | kratek | krat | ek | 286 | 0.60 % | 276.30 | 81 | 0.69 % | 351.37 | 29 | 0.38 % | 98.12 | 122 | 0.54 % | 345.46 | 54 | 0.90 % | 346.39
visok | visok | viso | k | 278 | 0.58 % | 268.57 | 53 | 0.45 % | 229.91 | 37 | 0.48 % | 125.19 | 159 | 0.70 % | 450.23 | 29 | 0.48 % | 186.03
pomemben | pomemben | pome | mben | 265 | 0.55 % | 256.01 | 47 | 0.40 % | 203.88 | 17 | 0.22 % | 57.52 | 163 | 0.72 % | 461.56 | 38 | 0.63 % | 243.76
evropski | evropski | evro | pski | 262 | 0.55 % | 253.11 | 33 | 0.28 % | 143.15 | 13 | 0.17 % | 43.99 | 210 | 0.93 % | 594.65 | 6 | 0.10 % | 38.49
različen | različen | razl | ičen | 260 | 0.54 % | 251.18 | 28 | 0.24 % | 121.46 | 28 | 0.36 % | 94.74 | 154 | 0.68 % | 436.08 | 50 | 0.83 % | 320.73
slab | slab | slab |  | 241 | 0.50 % | 232.83 | 73 | 0.62 %
316.67 | 54 | 0.70 % | 182.72 | 98 | 0.43 % | 277.50 | 16 | 0.27 % | 102.63
drag | drag | drag |  | 238 | 0.50 % | 229.93 | 93 | 0.80 % | 403.43 | 56 | 0.73 % | 189.48 | 53 | 0.23 % | 150.08 | 36 | 0.60 % | 230.93
zanimiv | zanimiv | zani | miv | 230 | 0.48 % | 222.20 | 81 | 0.69 % | 351.37 | 50 | 0.65 % | 169.18 | 73 | 0.32 % | 206.71 | 26 | 0.43 % | 166.78
jasen | jasen | jase | n | 213 | 0.45 % | 205.78 | 39 | 0.33 % | 169.18 | 39 | 0.51 % | 131.96 | 99 | 0.44 % | 280.33 | 36 | 0.60 % | 230.93
določen | določen | dolo | čen | 212 | 0.44 % | 204.81 | 13 | 0.11 % | 56.39 | 22 | 0.29 % | 74.44 | 128 | 0.57 % | 362.45 | 49 | 0.82 % | 314.32
domač | domač | doma | č | 184 | 0.38 % | 177.76 | 71 | 0.61 % | 308 | 18 | 0.23 % | 60.91 | 86 | 0.38 % | 243.52 | 9 | 0.15 % | 57.73
podoben | podoben | podo | ben | 181 | 0.38 % | 174.86 | 47 | 0.40 % | 203.88 | 28 | 0.36 % | 94.74 | 79 | 0.35 % | 223.70 | 27 | 0.45 % | 173.20
deloven | deloven | delo | ven | 177 | 0.37 % | 171 | 21 | 0.18 % | 91.10 | 12 | 0.16 % | 40.60 | 110 | 0.49 % | 311.48 | 34 | 0.57 % | 218.10
osnoven | osnoven | osno | ven | 174 | 0.36 % | 168.10 | 20 | 0.17 % | 86.76 | 27 | 0.35 % | 91.36 | 85 | 0.38 % | 240.69 | 42 | 0.70 % | 269.42
javen | javen | jave | n | 171 | 0.36 % | 165.20 | 5 | 0.04 % | 21.69 | 4 | 0.05 % | 13.53 | 146 | 0.65 % | 413.42 | 16 | 0.27 % | 102.63
pripravljen | pripravljen | prip | ravljen | 168 | 0.35 % | 162.30 | 64 | 0.55 % | 277.63 | 12 | 0.16 % | 40.60 | 65 | 0.29 % | 184.06 | 27 | 0.45 % | 173.20
močen | močen | moče | n | 166 | 0.35 % | 160.37 | 52 | 0.45 % | 225.57 | 21 | 0.27 % | 71.06 | 74 | 0.33 % | 209.54 | 19 | 0.32 % | 121.88
današnji | današnji | dana | šnji | 159 | 0.33 % | 153.61 | 79 | 0.68 % | 342.70 | 3 | 0.04 % | 10.15 | 64 | 0.28 % | 181.23 | 13 | 0.22 % | 83.39
poseben | poseben | pose | ben | 156 | 0.33 % | 150.71 | 48 | 0.41 % | 208.22 | 17 | 0.22 % | 57.52 | 62 | 0.28 % | 175.56 | 29 | 0.48 % | 186.03
težek | težek | teže | k | 154 | 0.32 % | 148.78 | 49 | 0.42 % | 212.56 | 24 | 0.31 % | 81.21 | 62 | 0.28 % | 175.56 | 19 | 0.32 % | 121.88

2.2.95 List of initial character-level 5-grams from adjective lemmas in the GOS 1.0 corpus with text-type distribution
File at CLARIN.SI: GOS1.0-word_parts-adjectives-lemmas-initial-5grams-taxonomy-entire.tsv
Lemma | Lemma (lower-case) | Initial part of the word | Rest of the word | Total absolute frequency of lemma | Percentage of all found lemmas | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N]
dober | dober | dober |  | 3,021 | 7.09 % | 2,918.54 | 1,244 | 12.23 % | 5,396.42 | 621 | 10.30 % | 2,101.22 | 776 | 3.67 % | 2,197.37 | 380 | 7.22 % | 2,437.57
velik | velik | velik |  | 1,212 | 2.84 % | 1,170.89 | 278 | 2.73 % | 1,205.95 | 210 | 3.48 % | 710.56 | 589 | 2.78 % | 1,667.85 | 135 | 2.57 % | 865.98
pravi | pravi | pravi |  | 944 | 2.21 % | 911.98 | 178 | 1.75 % | 772.16 | 144 | 2.39 % | 487.24 | 472 | 2.23 % | 1,336.55 | 150 | 2.85 % | 962.20
slovenski | slovenski | slove | nski | 561 | 1.32 % | 541.97 | 144 | 1.42 % | 624.67 | 31 | 0.51 % | 104.89 | 370 | 1.75 % | 1,047.72 | 16 | 0.30 % | 102.63
zadnji | zadnji | zadnj | i | 507 | 1.19 % | 489.80 | 166 | 1.63 % | 720.10 | 64 | 1.06 % | 216.55 | 244 | 1.15 % | 690.93 | 33 | 0.63 % | 211.68
glaven | glaven | glave | n | 423 | 0.99 % | 408.65 | 87 | 0.85 % | 377.40 | 176 | 2.92 % | 595.52 | 109 | 0.52 % | 308.65 | 51 | 0.97 % | 327.15
majhen | majhen | majhe | n | 381 | 0.89 % | 368.08 | 73 | 0.72 % | 316.67 | 89 | 1.48 % | 301.14 | 167 | 0.79 % | 472.89 | 52 | 0.99 % | 333.56
naslednji | naslednji | nasle | dnji | 330 | 0.77 % | 318.81 | 102 | 1.00 % | 442.47 | 53 | 0.88 % | 179.33 | 125 | 0.59 % | 353.96 | 50 | 0.95 % | 320.73
kratek | kratek | krate | k | 286 | 0.67 % | 276.30 | 81 | 0.80 % | 351.37 | 29 | 0.48 % | 98.12 | 122 | 0.58 % | 345.46 | 54 | 1.03 % | 346.39
visok | visok | visok |  | 278 | 0.65 % | 268.57 | 53 | 0.52 % | 229.91 | 37 | 0.61 % | 125.19 | 159 | 0.75 % | 450.23 | 29 | 0.55 % | 186.03
pomemben | pomemben | pomem | ben | 265 | 0.62 % | 256.01 | 47 | 0.46 % | 203.88 | 17 | 0.28 % | 57.52 | 163 | 0.77 % | 461.56 | 38 | 0.72 % | 243.76
evropski | evropski | evrop | ski | 262 | 0.61 % | 253.11 | 33 | 0.32 % | 143.15 | 13 | 0.22 % | 43.99 | 210 | 0.99 % | 594.65 | 6 | 0.11 % | 38.49
različen | različen | razli | čen | 260 | 0.61 % | 251.18 | 28 | 0.28 % | 121.46 | 28 | 0.46 % | 94.74 | 154 | 0.73 % | 436.08 | 50 | 0.95 % | 320.73
zanimiv | zanimiv | zanim | iv | 230 | 0.54 % | 222.20 | 81 | 0.80 % | 351.37 | 50 | 0.83 % | 169.18 | 73 | 0.34 % | 206.71 | 26 | 0.49 % | 166.78
jasen | jasen | jasen |  | 213 | 0.50 % | 205.78 | 39
0.38 % | 169.18 | 39 | 0.65 % | 131.96 | 99 | 0.47 % | 280.33 | 36 | 0.68 % | 230.93
določen | določen | določ | en | 212 | 0.50 % | 204.81 | 13 | 0.13 % | 56.39 | 22 | 0.36 % | 74.44 | 128 | 0.60 % | 362.45 | 49 | 0.93 % | 314.32
domač | domač | domač |  | 184 | 0.43 % | 177.76 | 71 | 0.70 % | 308 | 18 | 0.30 % | 60.91 | 86 | 0.41 % | 243.52 | 9 | 0.17 % | 57.73
podoben | podoben | podob | en | 181 | 0.42 % | 174.86 | 47 | 0.46 % | 203.88 | 28 | 0.46 % | 94.74 | 79 | 0.37 % | 223.70 | 27 | 0.51 % | 173.20
deloven | deloven | delov | en | 177 | 0.41 % | 171 | 21 | 0.21 % | 91.10 | 12 | 0.20 % | 40.60 | 110 | 0.52 % | 311.48 | 34 | 0.65 % | 218.10
osnoven | osnoven | osnov | en | 174 | 0.41 % | 168.10 | 20 | 0.20 % | 86.76 | 27 | 0.45 % | 91.36 | 85 | 0.40 % | 240.69 | 42 | 0.80 % | 269.42
javen | javen | javen |  | 171 | 0.40 % | 165.20 | 5 | 0.05 % | 21.69 | 4 | 0.07 % | 13.53 | 146 | 0.69 % | 413.42 | 16 | 0.30 % | 102.63
pripravljen | pripravljen | pripr | avljen | 168 | 0.39 % | 162.30 | 64 | 0.63 % | 277.63 | 12 | 0.20 % | 40.60 | 65 | 0.31 % | 184.06 | 27 | 0.51 % | 173.20
močen | močen | močen |  | 166 | 0.39 % | 160.37 | 52 | 0.51 % | 225.57 | 21 | 0.35 % | 71.06 | 74 | 0.35 % | 209.54 | 19 | 0.36 % | 121.88
današnji | današnji | današ | nji | 159 | 0.37 % | 153.61 | 79 | 0.78 % | 342.70 | 3 | 0.05 % | 10.15 | 64 | 0.30 % | 181.23 | 13 | 0.25 % | 83.39
poseben | poseben | poseb | en | 156 | 0.37 % | 150.71 | 48 | 0.47 % | 208.22 | 17 | 0.28 % | 57.52 | 62 | 0.29 % | 175.56 | 29 | 0.55 % | 186.03
težek | težek | težek |  | 154 | 0.36 % | 148.78 | 49 | 0.48 % | 212.56 | 24 | 0.40 % | 81.21 | 62 | 0.29 % | 175.56 | 19 | 0.36 % | 121.88
hiter | hiter | hiter |  | 150 | 0.35 % | 144.91 | 78 | 0.77 % | 338.36 | 13 | 0.22 % | 43.99 | 45 | 0.21 % | 127.42 | 14 | 0.27 % | 89.81
državen | državen | držav | en | 147 | 0.34 % | 142.01 | 18 | 0.18 % | 78.08 | 2 | 0.03 % | 6.77 | 125 | 0.59 % | 353.96 | 2 | 0.04 % | 12.83
svetoven | svetoven | sveto | ven | 145 | 0.34 % | 140.08 | 87 | 0.85 % | 377.40 | 5 | 0.08 % | 16.92 | 52 | 0.25 % | 147.25 | 1 | 0.02 % | 6.41
socialen | socialen | socia | len | 141 | 0.33 % | 136.22 | 6 | 0.06 % | 26.03 | 13 | 0.22 % | 43.99 | 118 | 0.56 % | 334.14 | 4 | 0.08 % | 25.66
zunanji | zunanji | zunan | ji | 134 | 0.31 % | 129.46 | 5 | 0.05 % | 21.69 | 5 | 0.08 % | 16.92 | 105 | 0.50 % | 297.32 | 19 | 0.36 % | 121.88
prejšnji | prejšnji | prejš | nji | 133 | 0.31 % | 128.49 | 22 | 0.22 % | 95.44 | 30 | 0.50 % | 101.51 | 73 | 0.34 % | 206.71 | 8 | 0.15 % | 51.32

2.2.96 List of final character-level 1-grams from adjective lemmas in the GOS 1.0 corpus with text-type distribution
File at CLARIN.SI: GOS1.0-word_parts-adjectives-lemmas-final-1grams-taxonomy-entire.tsv
Lemma | Lemma (lower-case) | Rest of the word | Final part of the word | Total absolute frequency of lemma | Percentage of all found lemmas | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N]
sam | sam | sa | m | 3,381 | 6.13 % | 3,266.33 | 596 | 4.44 % | 2,585.43 | 1,346 | 13.48 % | 4,554.34 | 817 | 3.31 % | 2,313.47 | 622 | 8.78 % | 3,989.92
dober | dober | dobe | r | 3,021 | 5.47 % | 2,918.54 | 1,244 | 9.27 % | 5,396.42 | 621 | 6.22 % | 2,101.22 | 776 | 3.14 % | 2,197.37 | 380 | 5.37 % | 2,437.57
mali | mali | mal | i | 1,760 | 3.19 % | 1,700.31 | 502 | 3.74 % | 2,177.66 | 595 | 5.96 % | 2,013.25 | 305 | 1.24 % | 863.66 | 358 | 5.06 % | 2,296.45
velik | velik | veli | k | 1,212 | 2.20 % | 1,170.89 | 278 | 2.07 % | 1,205.95 | 210 | 2.10 % | 710.56 | 589 | 2.39 % | 1,667.85 | 135 | 1.91 % | 865.98
pravi | pravi | prav | i | 944 | 1.71 % | 911.98 | 178 | 1.33 % | 772.16 | 144 | 1.44 % | 487.24 | 472 | 1.91 % | 1,336.55 | 150 | 2.12 % | 962.20
lep | lep | le | p | 929 | 1.68 % | 897.49 | 379 | 2.82 % | 1,644.09 | 140 | 1.40 % | 473.71 | 306 | 1.24 % | 866.49 | 104 | 1.47 % | 667.12
nov | nov | no | v | 894 | 1.62 % | 863.68 | 282 | 2.10 % | 1,223.31 | 150 | 1.50 % | 507.54 | 360 | 1.46 % | 1,019.40 | 102 | 1.44 % | 654.29
cel | cel | ce | l | 849 | 1.54 % | 820.21 | 158 | 1.18 % | 685.40 | 273 | 2.73 % | 923.73 | 290 | 1.18 % | 821.18 | 128 | 1.81 % | 821.08
star | star | sta | r | 719 | 1.30 % | 694.61 | 160 | 1.19 % | 694.07 | 357 | 3.58 % | 1,207.95 | 139 | 0.56 % | 393.60 | 63 | 0.89 % | 404.12
slovenski | slovenski | slovensk | i | 561 | 1.02 % | 541.97 | 144 | 1.07 % | 624.67 | 31 | 0.31 % | 104.89 | 370 | 1.50 % | 1,047.72 | 16 | 0.23 % | 102.63
zadnji | zadnji | zadnj | i | 507 | 0.92 % | 489.80 | 166 | 1.24 % | 720.10 | 64 | 0.64 % | 216.55 | 244 | 0.99 % | 690.93 | 33 | 0.47 % | 211.68
glaven | glaven | glave | n | 423 | 0.77 % | 408.65 | 87
0.65 % | 377.40 | 176 | 1.76 % | 595.52 | 109 | 0.44 % | 308.65 | 51 | 0.72 % | 327.15
majhen | majhen | majhe | n | 381 | 0.69 % | 368.08 | 73 | 0.54 % | 316.67 | 89 | 0.89 % | 301.14 | 167 | 0.68 % | 472.89 | 52 | 0.73 % | 333.56
dolg | dolg | dol | g | 331 | 0.60 % | 319.77 | 91 | 0.68 % | 394.75 | 99 | 0.99 % | 334.98 | 80 | 0.32 % | 226.53 | 61 | 0.86 % | 391.29
naslednji | naslednji | naslednj | i | 330 | 0.60 % | 318.81 | 102 | 0.76 % | 442.47 | 53 | 0.53 % | 179.33 | 125 | 0.51 % | 353.96 | 50 | 0.71 % | 320.73
mlad | mlad | mla | d | 329 | 0.60 % | 317.84 | 111 | 0.83 % | 481.51 | 86 | 0.86 % | 290.99 | 111 | 0.45 % | 314.31 | 21 | 0.30 % | 134.71
kratek | kratek | krate | k | 286 | 0.52 % | 276.30 | 81 | 0.60 % | 351.37 | 29 | 0.29 % | 98.12 | 122 | 0.49 % | 345.46 | 54 | 0.76 % | 346.39
visok | visok | viso | k | 278 | 0.50 % | 268.57 | 53 | 0.40 % | 229.91 | 37 | 0.37 % | 125.19 | 159 | 0.64 % | 450.23 | 29 | 0.41 % | 186.03
pomemben | pomemben | pomembe | n | 265 | 0.48 % | 256.01 | 47 | 0.35 % | 203.88 | 17 | 0.17 % | 57.52 | 163 | 0.66 % | 461.56 | 38 | 0.54 % | 243.76
evropski | evropski | evropsk | i | 262 | 0.47 % | 253.11 | 33 | 0.25 % | 143.15 | 13 | 0.13 % | 43.99 | 210 | 0.85 % | 594.65 | 6 | 0.09 % | 38.49
različen | različen | različe | n | 260 | 0.47 % | 251.18 | 28 | 0.21 % | 121.46 | 28 | 0.28 % | 94.74 | 154 | 0.62 % | 436.08 | 50 | 0.71 % | 320.73
slab | slab | sla | b | 241 | 0.44 % | 232.83 | 73 | 0.54 % | 316.67 | 54 | 0.54 % | 182.72 | 98 | 0.40 % | 277.50 | 16 | 0.23 % | 102.63
drag | drag | dra | g | 238 | 0.43 % | 229.93 | 93 | 0.69 % | 403.43 | 56 | 0.56 % | 189.48 | 53 | 0.21 % | 150.08 | 36 | 0.51 % | 230.93
zanimiv | zanimiv | zanimi | v | 230 | 0.42 % | 222.20 | 81 | 0.60 % | 351.37 | 50 | 0.50 % | 169.18 | 73 | 0.30 % | 206.71 | 26 | 0.37 % | 166.78
jasen | jasen | jase | n | 213 | 0.39 % | 205.78 | 39 | 0.29 % | 169.18 | 39 | 0.39 % | 131.96 | 99 | 0.40 % | 280.33 | 36 | 0.51 % | 230.93
določen | določen | določe | n | 212 | 0.38 % | 204.81 | 13 | 0.10 % | 56.39 | 22 | 0.22 % | 74.44 | 128 | 0.52 % | 362.45 | 49 | 0.69 % | 314.32
hud | hud | hu | d | 207 | 0.38 % | 199.98 | 48 | 0.36 % | 208.22 | 63 | 0.63 % | 213.17 | 70 | 0.28 % | 198.22 | 26 | 0.37 % | 166.78
domač | domač | doma | č | 184 | 0.33 % | 177.76 | 71 | 0.53 % | 308 | 18 | 0.18 % | 60.91 | 86 | 0.35 % | 243.52 | 9 | 0.13 % | 57.73
podoben | podoben | podobe | n | 181 | 0.33 % | 174.86 | 47 | 0.35 % | 203.88 | 28 | 0.28 % | 94.74 | 79 | 0.32 % | 223.70 | 27 | 0.38 % | 173.20
deloven | deloven | delove | n | 177 | 0.32 % | 171 | 21 | 0.16 % | 91.10 | 12 | 0.12 % | 40.60 | 110 | 0.45 % | 311.48 | 34 | 0.48 % | 218.10
osnoven | osnoven | osnove | n | 174 | 0.32 % | 168.10 | 20 | 0.15 % | 86.76 | 27 | 0.27 % | 91.36 | 85 | 0.34 % | 240.69 | 42 | 0.59 % | 269.42
javen | javen | jave | n | 171 | 0.31 % | 165.20 | 5 | 0.04 % | 21.69 | 4 | 0.04 % | 13.53 | 146 | 0.59 % | 413.42 | 16 | 0.23 % | 102.63

2.2.97 List of final character-level 2-grams from adjective lemmas in the GOS 1.0 corpus with text-type distribution
File at CLARIN.SI: GOS1.0-word_parts-adjectives-lemmas-final-2grams-taxonomy-entire.tsv
Lemma | Lemma (lower-case) | Rest of the word | Final part of the word | Total absolute frequency of lemma | Percentage of all found lemmas | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N]
sam | sam | s | am | 3,381 | 6.13 % | 3,266.33 | 596 | 4.44 % | 2,585.43 | 1,346 | 13.48 % | 4,554.34 | 817 | 3.31 % | 2,313.47 | 622 | 8.78 % | 3,989.92
dober | dober | dob | er | 3,021 | 5.47 % | 2,918.54 | 1,244 | 9.27 % | 5,396.42 | 621 | 6.22 % | 2,101.22 | 776 | 3.14 % | 2,197.37 | 380 | 5.37 % | 2,437.57
mali | mali | ma | li | 1,760 | 3.19 % | 1,700.31 | 502 | 3.74 % | 2,177.66 | 595 | 5.96 % | 2,013.25 | 305 | 1.24 % | 863.66 | 358 | 5.06 % | 2,296.45
velik | velik | vel | ik | 1,212 | 2.20 % | 1,170.89 | 278 | 2.07 % | 1,205.95 | 210 | 2.10 % | 710.56 | 589 | 2.39 % | 1,667.85 | 135 | 1.91 % | 865.98
pravi | pravi | pra | vi | 944 | 1.71 % | 911.98 | 178 | 1.33 % | 772.16 | 144 | 1.44 % | 487.24 | 472 | 1.91 % | 1,336.55 | 150 | 2.12 % | 962.20
lep | lep | l | ep | 929 | 1.68 % | 897.49 | 379 | 2.82 % | 1,644.09 | 140 | 1.40 % | 473.71 | 306 | 1.24 % | 866.49 | 104 | 1.47 % | 667.12
nov | nov | n | ov | 894 | 1.62 % | 863.68 | 282 | 2.10 % | 1,223.31 | 150 | 1.50 % | 507.54 | 360 | 1.46 % | 1,019.40 | 102 | 1.44 % | 654.29
cel | cel | c | el | 849 | 1.54 % | 820.21 | 158 | 1.18 % | 685.40 | 273 | 2.73 % | 923.73 | 290 | 1.18 % | 821.18 | 128 | 1.81 % | 821.08
star | star | st | ar | 719 | 1.30 % | 694.61 | 160 | 1.19 % | 694.07 | 357 | 3.58 % | 1,207.95 | 139 | 0.56 % | 393.60 | 63 | 0.89 % | 404.12
slovenski | slovenski | slovens | ki | 561 | 1.02 % | 541.97 | 144 | 1.07 % | 624.67 | 31 | 0.31 % | 104.89 | 370 | 1.50 % | 1,047.72 | 16 | 0.23 % | 102.63
zadnji | zadnji | zadn | ji | 507 | 0.92 % | 489.80 | 166 | 1.24 % | 720.10 | 64 | 0.64 % | 216.55 | 244 | 0.99 % | 690.93 | 33 | 0.47 % | 211.68
glaven | glaven | glav | en | 423 | 0.77 % | 408.65 | 87 | 0.65 % | 377.40 | 176 | 1.76 % | 595.52 | 109 | 0.44 % | 308.65 | 51 | 0.72 % | 327.15
majhen | majhen | majh | en | 381 | 0.69 % | 368.08 | 73 | 0.54 % | 316.67 | 89 | 0.89 % | 301.14 | 167 | 0.68 % | 472.89 | 52 | 0.73 % | 333.56
dolg | dolg | do | lg | 331 | 0.60 % | 319.77 | 91 | 0.68 % | 394.75 | 99 | 0.99 % | 334.98 | 80 | 0.32 % | 226.53 | 61 | 0.86 % | 391.29
naslednji | naslednji | nasledn | ji | 330 | 0.60 % | 318.81 | 102 | 0.76 % | 442.47 | 53 | 0.53 % | 179.33 | 125 | 0.51 % | 353.96 | 50 | 0.71 % | 320.73
mlad | mlad | ml | ad | 329 | 0.60 % | 317.84 | 111 | 0.83 % | 481.51 | 86 | 0.86 % | 290.99 | 111 | 0.45 % | 314.31 | 21 | 0.30 % | 134.71
kratek | kratek | krat | ek | 286 | 0.52 % | 276.30 | 81 | 0.60 % | 351.37 | 29 | 0.29 % | 98.12 | 122 | 0.49 % | 345.46 | 54 | 0.76 % | 346.39
visok | visok | vis | ok | 278 | 0.50 % | 268.57 | 53 | 0.40 % | 229.91 | 37 | 0.37 % | 125.19 | 159 | 0.64 % | 450.23 | 29 | 0.41 % | 186.03
pomemben | pomemben | pomemb | en | 265 | 0.48 % | 256.01 | 47 | 0.35 % | 203.88 | 17 | 0.17 % | 57.52 | 163 | 0.66 % | 461.56 | 38 | 0.54 % | 243.76
evropski | evropski | evrops | ki | 262 | 0.47 % | 253.11 | 33 | 0.25 % | 143.15 | 13 | 0.13 % | 43.99 | 210 | 0.85 % | 594.65 | 6 | 0.09 % | 38.49
različen | različen | različ | en | 260 | 0.47 % | 251.18 | 28 | 0.21 % | 121.46 | 28 | 0.28 % | 94.74 | 154 | 0.62 % | 436.08 | 50 | 0.71 % | 320.73
slab | slab | sl | ab | 241 | 0.44 % | 232.83 | 73 | 0.54 % | 316.67 | 54 | 0.54 % | 182.72 | 98 | 0.40 % | 277.50 | 16 | 0.23 % | 102.63
drag | drag | dr | ag | 238 | 0.43 % | 229.93 | 93 | 0.69 % | 403.43 | 56 | 0.56 % | 189.48 | 53 | 0.21 % | 150.08 | 36 | 0.51 % | 230.93
zanimiv | zanimiv | zanim | iv | 230 | 0.42 % | 222.20 | 81 | 0.60 % | 351.37 | 50 | 0.50 % | 169.18 | 73 | 0.30 % | 206.71 | 26 | 0.37 % | 166.78
jasen | jasen | jas | en | 213 | 0.39 % | 205.78 | 39 | 0.29 % | 169.18 | 39 | 0.39 % | 131.96 | 99 | 0.40 % | 280.33 | 36 | 0.51 % | 230.93
določen | določen | določ | en | 212 | 0.38 % | 204.81 | 13 | 0.10 % | 56.39 | 22 | 0.22 % | 74.44 | 128 | 0.52 % | 362.45 | 49 | 0.69 % | 314.32
hud | hud | h | ud | 207 | 0.38 % | 199.98 | 48 | 0.36 % | 208.22 | 63 | 0.63 % | 213.17 | 70 | 0.28 % | 198.22 | 26 | 0.37 % | 166.78
domač | domač | dom
ač | 184 | 0.33 % | 177.76 | 71 | 0.53 % | 308 | 18 | 0.18 % | 60.91 | 86 | 0.35 % | 243.52 | 9 | 0.13 % | 57.73
podoben | podoben | podob | en | 181 | 0.33 % | 174.86 | 47 | 0.35 % | 203.88 | 28 | 0.28 % | 94.74 | 79 | 0.32 % | 223.70 | 27 | 0.38 % | 173.20
deloven | deloven | delov | en | 177 | 0.32 % | 171 | 21 | 0.16 % | 91.10 | 12 | 0.12 % | 40.60 | 110 | 0.45 % | 311.48 | 34 | 0.48 % | 218.10
osnoven | osnoven | osnov | en | 174 | 0.32 % | 168.10 | 20 | 0.15 % | 86.76 | 27 | 0.27 % | 91.36 | 85 | 0.34 % | 240.69 | 42 | 0.59 % | 269.42
javen | javen | jav | en | 171 | 0.31 % | 165.20 | 5 | 0.04 % | 21.69 | 4 | 0.04 % | 13.53 | 146 | 0.59 % | 413.42 | 16 | 0.23 % | 102.63

2.2.98 List of final character-level 3-grams from adjective lemmas in the GOS 1.0 corpus with text-type distribution
File at CLARIN.SI: GOS1.0-word_parts-adjectives-lemmas-final-3grams-taxonomy-entire.tsv
Lemma | Lemma (lower-case) | Rest of the word | Final part of the word | Total absolute frequency of lemma | Percentage of all found lemmas | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N]
sam | sam |  | sam | 3,381 | 6.13 % | 3,266.33 | 596 | 4.44 % | 2,585.43 | 1,346 | 13.48 % | 4,554.34 | 817 | 3.31 % | 2,313.47 | 622 | 8.78 % | 3,989.92
dober | dober | do | ber | 3,021 | 5.48 % | 2,918.54 | 1,244 | 9.27 % | 5,396.42 | 621 | 6.22 % | 2,101.22 | 776 | 3.14 % | 2,197.37 | 380 | 5.37 % | 2,437.57
mali | mali | m | ali | 1,760 | 3.19 % | 1,700.31 | 502 | 3.74 % | 2,177.66 | 595 | 5.96 % | 2,013.25 | 305 | 1.24 % | 863.66 | 358 | 5.06 % | 2,296.45
velik | velik | ve | lik | 1,212 | 2.20 % | 1,170.89 | 278 | 2.07 % | 1,205.95 | 210 | 2.10 % | 710.56 | 589 | 2.39 % | 1,667.85 | 135 | 1.91 % | 865.98
pravi | pravi | pr | avi | 944 | 1.71 % | 911.98 | 178 | 1.33 % | 772.16 | 144 | 1.44 % | 487.24 | 472 | 1.91 % | 1,336.55 | 150 | 2.12 % | 962.20
lep | lep |  | lep | 929 | 1.68 % | 897.49 | 379 | 2.82 % | 1,644.09 | 140 | 1.40 % | 473.71 | 306 | 1.24 % | 866.49 | 104 | 1.47 % | 667.12
nov | nov |  | nov | 894 | 1.62 % | 863.68 | 282 | 2.10 % | 1,223.31 | 150 | 1.50 % | 507.54 | 360 | 1.46 % | 1,019.40 | 102 | 1.44 % | 654.29
cel | cel |  | cel | 849 | 1.54 % | 820.21 | 158 | 1.18 % | 685.40 | 273 | 2.73 % | 923.73 | 290 | 1.18 % | 821.18 | 128 | 1.81 % | 821.08
star | star | s | tar | 719 | 1.30 % | 694.61 | 160 | 1.19 % | 694.07 | 357 | 3.58 % | 1,207.95 | 139 | 0.56 % | 393.60 | 63 | 0.89 % | 404.12
slovenski | slovenski | sloven | ski | 561 | 1.02 % | 541.97 | 144 | 1.07 % | 624.67 | 31 | 0.31 % | 104.89 | 370 | 1.50 % | 1,047.72 | 16 | 0.23 % | 102.63
zadnji | zadnji | zad | nji | 507 | 0.92 % | 489.80 | 166 | 1.24 % | 720.10 | 64 | 0.64 % | 216.55 | 244 | 0.99 % | 690.93 | 33 | 0.47 % | 211.68
glaven | glaven | gla | ven | 423 | 0.77 % | 408.65 | 87 | 0.65 % | 377.40 | 176 | 1.76 % | 595.52 | 109 | 0.44 % | 308.65 | 51 | 0.72 % | 327.15
majhen | majhen | maj | hen | 381 | 0.69 % | 368.08 | 73 | 0.54 % | 316.67 | 89 | 0.89 % | 301.14 | 167 | 0.68 % | 472.89 | 52 | 0.73 % | 333.56
dolg | dolg | d | olg | 331 | 0.60 % | 319.77 | 91 | 0.68 % | 394.75 | 99 | 0.99 % | 334.98 | 80 | 0.32 % | 226.53 | 61 | 0.86 % | 391.29
naslednji | naslednji | nasled | nji | 330 | 0.60 % | 318.81 | 102 | 0.76 % | 442.47 | 53 | 0.53 % | 179.33 | 125 | 0.51 % | 353.96 | 50 | 0.71 % | 320.73
mlad | mlad | m | lad | 329 | 0.60 % | 317.84 | 111 | 0.83 % | 481.51 | 86 | 0.86 % | 290.99 | 111 | 0.45 % | 314.31 | 21 | 0.30 % | 134.71
kratek | kratek | kra | tek | 286 | 0.52 % | 276.30 | 81 | 0.60 % | 351.37 | 29 | 0.29 % | 98.12 | 122 | 0.49 % | 345.46 | 54 | 0.76 % | 346.39
visok | visok | vi | sok | 278 | 0.50 % | 268.57 | 53 | 0.40 % | 229.91 | 37 | 0.37 % | 125.19 | 159 | 0.64 % | 450.23 | 29 | 0.41 % | 186.03
pomemben | pomemben | pomem | ben | 265 | 0.48 % | 256.01 | 47 | 0.35 % | 203.88 | 17 | 0.17 % | 57.52 | 163 | 0.66 % | 461.56 | 38 | 0.54 % | 243.76
evropski | evropski | evrop | ski | 262 | 0.47 % | 253.11 | 33 | 0.25 % | 143.15 | 13 | 0.13 % | 43.99 | 210 | 0.85 % | 594.65 | 6 | 0.09 % | 38.49
različen | različen | razli | čen | 260 | 0.47 % | 251.18 | 28 | 0.21 % | 121.46 | 28 | 0.28 % | 94.74 | 154 | 0.62 % | 436.08 | 50 | 0.71 % | 320.73
slab | slab | s | lab | 241 | 0.44 % | 232.83 | 73 | 0.54 % | 316.67 | 54 | 0.54 % | 182.72 | 98 | 0.40 % | 277.50 | 16 | 0.23 % | 102.63
drag | drag | d | rag | 238 | 0.43 % | 229.93 | 93 | 0.69 % | 403.43 | 56 | 0.56 % | 189.48 | 53 | 0.21 % | 150.08 | 36 | 0.51 % | 230.93
zanimiv | zanimiv | zani | miv | 230 | 0.42 % | 222.20 | 81 | 0.60 % | 351.37 | 50 | 0.50 % | 169.18 | 73 | 0.30 % | 206.71 | 26 | 0.37 % | 166.78
jasen | jasen | ja | sen | 213 | 0.39 %
205.78 39 0.29 % 169.18 39 0.39 % 131.96 99 0.40 % 280.33 36 0.51 % 230.93 določen določen dolo čen 212 0.38 % 204.81 13 0.10 % 56.39 22 0.22 % 74.44 128 0.52 % 362.45 49 0.69 % 314.32 hud hud hud 207 0.38 % 199.98 48 0.36 % 208.22 63 0.63 % 213.17 70 0.28 % 198.22 26 0.37 % 166.78 domač domač do mač 184 0.33 % 177.76 71 0.53 % 308 18 0.18 % 60.91 86 0.35 % 243.52 9 0.13 % 57.73 podoben podoben podo ben 181 0.33 % 174.86 47 0.35 % 203.88 28 0.28 % 94.74 79 0.32 % 223.70 27 0.38 % 173.20 deloven deloven delo ven 177 0.32 % 171 21 0.16 % 91.10 12 0.12 % 40.60 110 0.45 % 311.48 34 0.48 % 218.10 osnoven osnoven osno ven 174 0.32 % 168.10 20 0.15 % 86.76 27 0.27 % 91.36 85 0.34 % 240.69 42 0.59 % 269.42 javen javen ja ven 171 0.31 % 165.20 5 0.04 % 21.69 4 0.04 % 13.53 146 0.59 % 413.42 16 0.23 % 102.63 CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 442 File at CLARIN.SI2.2.99 List of final character-level 4-grams from adjective lemmas in the GOS 1.0 corpus with text-type distributionGOS1.0-word_parts-adjectives-lemmas- final-4grams-taxonomy-entire.tsvLemma Lemma (lower-case) Rest of the word Final part of the word Total absolute frequency of lemma Percentage of all found lemmas Total relative frequency (per million occurrences) Absolute frequency [gos.T.J.R] Percentage [gos.T.J.R] Relative frequency [gos.T.J.R] Absolute frequency [gos.T.N.Z] Percentage [gos.T.N.Z] Relative frequency [gos.T.N.Z] Absolute frequency [gos.T.J.I] Percentage [gos.T.J.I] Relative frequency [gos.T.J.I] Absolute frequency [gos.T.N.N] Percentage [gos.T.N.N] Relative frequency [gos.T.N.N] dober dober d ober 3,021 6.31 % 2,918.54 1,244 10.65 % 5,396.42 621 8.10 % 2,101.22 776 3.44 % 2,197.37 380 6.33 % 2,437.57 mali mali mali 1,760 3.67 % 1,700.31 502 4.30 % 2,177.66 595 7.76 % 2,013.25 305 1.35 % 863.66 358 5.96 % 2,296.45 velik velik v elik 1,212 2.53 % 1,170.89 278 2.38 % 1,205.95 210 2.74 % 710.56 589 2.61 % 1,667.85 135 2.25 % 865.98 pravi pravi p ravi 944 
1.97 % 911.98 178 1.52 % 772.16 144 1.88 % 487.24 472 2.09 % 1,336.55 150 2.50 % 962.20 star star star 719 1.50 % 694.61 160 1.37 % 694.07 357 4.66 % 1,207.95 139 0.62 % 393.60 63 1.05 % 404.12 slovenski slovenski slove nski 561 1.17 % 541.97 144 1.23 % 624.67 31 0.40 % 104.89 370 1.64 % 1,047.72 16 0.27 % 102.63 zadnji zadnji za dnji 507 1.06 % 489.80 166 1.42 % 720.10 64 0.83 % 216.55 244 1.08 % 690.93 33 0.55 % 211.68 glaven glaven gl aven 423 0.88 % 408.65 87 0.74 % 377.40 176 2.30 % 595.52 109 0.48 % 308.65 51 0.85 % 327.15 majhen majhen ma jhen 381 0.80 % 368.08 73 0.62 % 316.67 89 1.16 % 301.14 167 0.74 % 472.89 52 0.87 % 333.56 dolg dolg dolg 331 0.69 % 319.77 91 0.78 % 394.75 99 1.29 % 334.98 80 0.35 % 226.53 61 1.02 % 391.29 naslednji naslednji nasle dnji 330 0.69 % 318.81 102 0.87 % 442.47 53 0.69 % 179.33 125 0.56 % 353.96 50 0.83 % 320.73 mlad mlad mlad 329 0.69 % 317.84 111 0.95 % 481.51 86 1.12 % 290.99 111 0.49 % 314.31 21 0.35 % 134.71 kratek kratek kr atek 286 0.60 % 276.30 81 0.69 % 351.37 29 0.38 % 98.12 122 0.54 % 345.46 54 0.90 % 346.39 visok visok v isok 278 0.58 % 268.57 53 0.45 % 229.91 37 0.48 % 125.19 159 0.70 % 450.23 29 0.48 % 186.03 pomemben pomemben pome mben 265 0.55 % 256.01 47 0.40 % 203.88 17 0.22 % 57.52 163 0.72 % 461.56 38 0.63 % 243.76 evropski evropski evro pski 262 0.55 % 253.11 33 0.28 % 143.15 13 0.17 % 43.99 210 0.93 % 594.65 6 0.10 % 38.49 različen različen razl ičen 260 0.54 % 251.18 28 0.24 % 121.46 28 0.36 % 94.74 154 0.68 % 436.08 50 0.83 % 320.73 slab slab slab 241 0.50 % 232.83 73 0.62 % 316.67 54 0.70 % 182.72 98 0.43 % 277.50 16 0.27 % 102.63 drag drag drag 238 0.50 % 229.93 93 0.80 % 403.43 56 0.73 % 189.48 53 0.23 % 150.08 36 0.60 % 230.93 zanimiv zanimiv zan imiv 230 0.48 % 222.20 81 0.69 % 351.37 50 0.65 % 169.18 73 0.32 % 206.71 26 0.43 % 166.78 jasen jasen j asen 213 0.45 % 205.78 39 0.33 % 169.18 39 0.51 % 131.96 99 0.44 % 280.33 36 0.60 % 230.93 določen določen dol očen 212 0.44 % 204.81 13 0.11 % 56.39 
22 0.29 % 74.44 128 0.57 % 362.45 49 0.82 % 314.32 domač domač d omač 184 0.38 % 177.76 71 0.61 % 308 18 0.23 % 60.91 86 0.38 % 243.52 9 0.15 % 57.73 podoben podoben pod oben 181 0.38 % 174.86 47 0.40 % 203.88 28 0.36 % 94.74 79 0.35 % 223.70 27 0.45 % 173.20 deloven deloven del oven 177 0.37 % 171 21 0.18 % 91.10 12 0.16 % 40.60 110 0.49 % 311.48 34 0.57 % 218.10 osnoven osnoven osn oven 174 0.36 % 168.10 20 0.17 % 86.76 27 0.35 % 91.36 85 0.38 % 240.69 42 0.70 % 269.42 javen javen j aven 171 0.36 % 165.20 5 0.04 % 21.69 4 0.05 % 13.53 146 0.65 % 413.42 16 0.27 % 102.63 pripravljen pripravljen priprav ljen 168 0.35 % 162.30 64 0.55 % 277.63 12 0.16 % 40.60 65 0.29 % 184.06 27 0.45 % 173.20 močen močen m očen 166 0.35 % 160.37 52 0.45 % 225.57 21 0.27 % 71.06 74 0.33 % 209.54 19 0.32 % 121.88 današnji današnji dana šnji 159 0.33 % 153.61 79 0.68 % 342.70 3 0.04 % 10.15 64 0.28 % 181.23 13 0.22 % 83.39 poseben poseben pos eben 156 0.33 % 150.71 48 0.41 % 208.22 17 0.22 % 57.52 62 0.28 % 175.56 29 0.48 % 186.03 težek težek t ežek 154 0.32 % 148.78 49 0.42 % 212.56 24 0.31 % 81.21 62 0.28 % 175.56 19 0.32 % 121.88 CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 443 File at CLARIN.SI2.2.100 List of final character-level 5-grams from adjective lemmas in the GOS 1.0 corpus with text-type distributionGOS1.0-word_parts-adjectives-lemmas-final- 5grams-taxonomy-entire.tsvLemma Lemma (lower-case) Rest of the word Final part of the word Total absolute frequency of lemma Percentage of all found lemmas Total relative frequency (per million occurrences) Absolute frequency [gos.T.J.R] Percentage [gos.T.J.R] Relative frequency [gos.T.J.R] Absolute frequency [gos.T.N.Z] Percentage [gos.T.N.Z] Relative frequency [gos.T.N.Z] Absolute frequency [gos.T.J.I] Percentage [gos.T.J.I] Relative frequency [gos.T.J.I] Absolute frequency [gos.T.N.N] Percentage [gos.T.N.N] Relative frequency [gos.T.N.N] dober dober dober 3,021 7.09 % 2,918.54 1,244 12.23 % 5,396.42 
621 10.30 % 2,101.22 776 3.67 % 2,197.37 380 7.22 % 2,437.57 velik velik velik 1,212 2.84 % 1,170.89 278 2.73 % 1,205.95 210 3.48 % 710.56 589 2.78 % 1,667.85 135 2.57 % 865.98 pravi pravi pravi 944 2.21 % 911.98 178 1.75 % 772.16 144 2.39 % 487.24 472 2.23 % 1,336.55 150 2.85 % 962.20 slovenski slovenski slov enski 561 1.32 % 541.97 144 1.42 % 624.67 31 0.51 % 104.89 370 1.75 % 1,047.72 16 0.30 % 102.63 zadnji zadnji z adnji 507 1.19 % 489.80 166 1.63 % 720.10 64 1.06 % 216.55 244 1.15 % 690.93 33 0.63 % 211.68 glaven glaven g laven 423 0.99 % 408.65 87 0.85 % 377.40 176 2.92 % 595.52 109 0.52 % 308.65 51 0.97 % 327.15 majhen majhen m ajhen 381 0.89 % 368.08 73 0.72 % 316.67 89 1.48 % 301.14 167 0.79 % 472.89 52 0.99 % 333.56 naslednji naslednji nasl ednji 330 0.77 % 318.81 102 1.00 % 442.47 53 0.88 % 179.33 125 0.59 % 353.96 50 0.95 % 320.73 kratek kratek k ratek 286 0.67 % 276.30 81 0.80 % 351.37 29 0.48 % 98.12 122 0.58 % 345.46 54 1.03 % 346.39 visok visok visok 278 0.65 % 268.57 53 0.52 % 229.91 37 0.61 % 125.19 159 0.75 % 450.23 29 0.55 % 186.03 pomemben pomemben pom emben 265 0.62 % 256.01 47 0.46 % 203.88 17 0.28 % 57.52 163 0.77 % 461.56 38 0.72 % 243.76 evropski evropski evr opski 262 0.61 % 253.11 33 0.32 % 143.15 13 0.22 % 43.99 210 0.99 % 594.65 6 0.11 % 38.49 različen različen raz ličen 260 0.61 % 251.18 28 0.28 % 121.46 28 0.46 % 94.74 154 0.73 % 436.08 50 0.95 % 320.73 zanimiv zanimiv za nimiv 230 0.54 % 222.20 81 0.80 % 351.37 50 0.83 % 169.18 73 0.34 % 206.71 26 0.49 % 166.78 jasen jasen jasen 213 0.50 % 205.78 39 0.38 % 169.18 39 0.65 % 131.96 99 0.47 % 280.33 36 0.68 % 230.93 določen določen do ločen 212 0.50 % 204.81 13 0.13 % 56.39 22 0.36 % 74.44 128 0.60 % 362.45 49 0.93 % 314.32 domač domač domač 184 0.43 % 177.76 71 0.70 % 308 18 0.30 % 60.91 86 0.41 % 243.52 9 0.17 % 57.73 podoben podoben po doben 181 0.42 % 174.86 47 0.46 % 203.88 28 0.46 % 94.74 79 0.37 % 223.70 27 0.51 % 173.20 deloven deloven de loven 177 0.41 % 171 21 0.21 % 91.10 
12 0.20 % 40.60 110 0.52 % 311.48 34 0.65 % 218.10 osnoven osnoven os noven 174 0.41 % 168.10 20 0.20 % 86.76 27 0.45 % 91.36 85 0.40 % 240.69 42 0.80 % 269.42 javen javen javen 171 0.40 % 165.20 5 0.05 % 21.69 4 0.07 % 13.53 146 0.69 % 413.42 16 0.30 % 102.63 pripravljen pripravljen pripra vljen 168 0.39 % 162.30 64 0.63 % 277.63 12 0.20 % 40.60 65 0.31 % 184.06 27 0.51 % 173.20 močen močen močen 166 0.39 % 160.37 52 0.51 % 225.57 21 0.35 % 71.06 74 0.35 % 209.54 19 0.36 % 121.88 današnji današnji dan ašnji 159 0.37 % 153.61 79 0.78 % 342.70 3 0.05 % 10.15 64 0.30 % 181.23 13 0.25 % 83.39 poseben poseben po seben 156 0.37 % 150.71 48 0.47 % 208.22 17 0.28 % 57.52 62 0.29 % 175.56 29 0.55 % 186.03 težek težek težek 154 0.36 % 148.78 49 0.48 % 212.56 24 0.40 % 81.21 62 0.29 % 175.56 19 0.36 % 121.88 hiter hiter hiter 150 0.35 % 144.91 78 0.77 % 338.36 13 0.22 % 43.99 45 0.21 % 127.42 14 0.27 % 89.81 državen državen dr žaven 147 0.34 % 142.01 18 0.18 % 78.08 2 0.03 % 6.77 125 0.59 % 353.96 2 0.04 % 12.83 svetoven svetoven sve toven 145 0.34 % 140.08 87 0.85 % 377.40 5 0.08 % 16.92 52 0.25 % 147.25 1 0.02 % 6.41 socialen socialen soc ialen 141 0.33 % 136.22 6 0.06 % 26.03 13 0.22 % 43.99 118 0.56 % 334.14 4 0.08 % 25.66 zunanji zunanji zu nanji 134 0.31 % 129.46 5 0.05 % 21.69 5 0.08 % 16.92 105 0.50 % 297.32 19 0.36 % 121.88 prejšnji prejšnji pre jšnji 133 0.31 % 128.49 22 0.22 % 95.44 30 0.50 % 101.51 73 0.34 % 206.71 8 0.15 % 51.32 CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 444 File at CLARIN.SI2.2.101 List of initial character-level 1-grams from adjective standardized forms in the GOS 1.0 corpus with text- type distributionGOS1.0-word_parts-adjectives-standardized_ forms-initial-1grams-taxonomy-entire.tsvStandardized form Initial part of the word Rest of the word Total absolute frequency of standardized form Percentage of all found standardized forms Total relative frequency (per million occurrences) Absolute frequency 
[gos.T.J.R] Percentage [gos.T.J.R] Relative frequency [gos.T.J.R] Absolute frequency [gos.T.N.Z] Percentage [gos.T.N.Z] Relative frequency [gos.T.N.Z] Absolute frequency [gos.T.J.I] Percentage [gos.T.J.I] Relative frequency [gos.T.J.I] Absolute frequency [gos.T.N.N] Percentage [gos.T.N.N] Relative frequency [gos.T.N.N] samo s amo 2,475 4.49 % 2,391.06 361 2.69 % 1,566 1,104 11.05 % 3,735.51 508 2.06 % 1,438.49 502 7.09 % 3,220.16 malo m alo 1,518 2.75 % 1,466.52 418 3.11 % 1,813.27 500 5.01 % 1,691.81 267 1.08 % 756.05 333 4.70 % 2,136.08 dobro d obro 1,485 2.69 % 1,434.63 687 5.12 % 2,980.18 283 2.83 % 957.56 315 1.28 % 891.97 200 2.82 % 1,282.93 pravi p ravi 624 1.13 % 602.84 91 0.68 % 394.75 71 0.71 % 240.24 345 1.40 % 976.92 117 1.65 % 750.51 dober d ober 468 0.85 % 452.13 181 1.35 % 785.17 88 0.88 % 297.76 156 0.63 % 441.74 43 0.61 % 275.83 sam s am 366 0.66 % 353.59 81 0.60 % 351.37 121 1.21 % 409.42 117 0.47 % 331.30 47 0.66 % 301.49 lepa l epa 344 0.62 % 332.33 122 0.91 % 529.23 36 0.36 % 121.81 153 0.62 % 433.24 33 0.47 % 211.68 celo c elo 310 0.56 % 299.49 75 0.56 % 325.35 97 0.97 % 328.21 93 0.38 % 263.34 45 0.64 % 288.66 glavnem g lavnem 280 0.51 % 270.50 55 0.41 % 238.59 148 1.48 % 500.77 39 0.16 % 110.43 38 0.54 % 243.76 lep l ep 239 0.43 % 230.89 120 0.89 % 520.56 28 0.28 % 94.74 70 0.28 % 198.22 21 0.30 % 134.71 stari s tari 233 0.42 % 225.10 28 0.21 % 121.46 174 1.74 % 588.75 21 0.09 % 59.46 10 0.14 % 64.15 sama s ama 204 0.37 % 197.08 55 0.41 % 238.59 66 0.66 % 223.32 47 0.19 % 133.09 36 0.51 % 230.93 cela c ela 198 0.36 % 191.28 19 0.14 % 82.42 41 0.41 % 138.73 89 0.36 % 252.02 49 0.69 % 314.32 novo n ovo 196 0.35 % 189.35 58 0.43 % 251.60 40 0.40 % 135.34 63 0.26 % 178.39 35 0.49 % 224.51 boljše b oljše 194 0.35 % 187.42 36 0.27 % 156.17 59 0.59 % 199.63 58 0.23 % 164.24 41 0.58 % 263 velika v elika 170 0.31 % 164.23 33 0.25 % 143.15 47 0.47 % 159.03 67 0.27 % 189.72 23 0.33 % 147.54 jasno j asno 165 0.30 % 159.40 33 0.25 % 143.15 35 0.35 % 
118.43 65 0.26 % 184.06 32 0.45 % 205.27 dobra d obra 162 0.29 % 156.51 69 0.51 % 299.32 38 0.38 % 128.58 35 0.14 % 99.11 20 0.28 % 128.29 cel c el 160 0.29 % 154.57 36 0.27 % 156.17 85 0.85 % 287.61 21 0.09 % 59.46 18 0.25 % 115.46 sami s ami 148 0.27 % 142.98 25 0.19 % 108.45 31 0.31 % 104.89 70 0.28 % 198.22 22 0.31 % 141.12 naslednji n aslednji 144 0.26 % 139.12 48 0.36 % 208.22 36 0.36 % 121.81 37 0.15 % 104.77 23 0.33 % 147.54 velik v elik 134 0.24 % 129.46 30 0.22 % 130.14 23 0.23 % 77.82 70 0.28 % 198.22 11 0.15 % 70.56 velike v elike 128 0.23 % 123.66 23 0.17 % 99.77 33 0.33 % 111.66 55 0.22 % 155.74 17 0.24 % 109.05 slovenski s lovenski 126 0.23 % 121.73 36 0.27 % 156.17 11 0.11 % 37.22 74 0.30 % 209.54 5 0.07 % 32.07 zadnjih z adnjih 125 0.23 % 120.76 37 0.28 % 160.50 5 0.05 % 16.92 76 0.31 % 215.21 7 0.10 % 44.90 nove n ove 124 0.23 % 119.79 25 0.19 % 108.45 20 0.20 % 67.67 71 0.29 % 201.05 8 0.11 % 51.32 dolgo d olgo 122 0.22 % 117.86 30 0.22 % 130.14 54 0.54 % 182.72 27 0.11 % 76.45 11 0.15 % 70.56 zadnji z adnji 120 0.22 % 115.93 32 0.24 % 138.81 33 0.33 % 111.66 49 0.20 % 138.75 6 0.09 % 38.49 nov n ov 117 0.21 % 113.03 65 0.48 % 281.97 12 0.12 % 40.60 31 0.13 % 87.78 9 0.13 % 57.73 novega n ovega 106 0.19 % 102.40 30 0.22 % 130.14 24 0.24 % 81.21 41 0.17 % 116.10 11 0.15 % 70.56 fajn f ajn 105 0.19 % 101.44 40 0.30 % 173.52 44 0.44 % 148.88 7 0.03 % 19.82 14 0.20 % 89.81 stara s tara 104 0.19 % 100.47 31 0.23 % 134.48 43 0.43 % 145.50 16 0.07 % 45.31 14 0.20 % 89.81 CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 445 File at CLARIN.SI2.2.102 List of initial character-level 2-grams from adjective standardized forms in the GOS 1.0 corpus with text- type distributionGOS1.0-word_parts-adjectives-standardized_ forms-initial-2grams-taxonomy-entire.tsvStandardized form Initial part of the word Rest of the word Total absolute frequency of standardized form Percentage of all found standardized forms Total relative frequency 
(per million occurrences) Absolute frequency [gos.T.J.R] Percentage [gos.T.J.R] Relative frequency [gos.T.J.R] Absolute frequency [gos.T.N.Z] Percentage [gos.T.N.Z] Relative frequency [gos.T.N.Z] Absolute frequency [gos.T.J.I] Percentage [gos.T.J.I] Relative frequency [gos.T.J.I] Absolute frequency [gos.T.N.N] Percentage [gos.T.N.N] Relative frequency [gos.T.N.N] samo sa mo 2,475 4.49 % 2,391.06 361 2.69 % 1,566 1,104 11.05 % 3,735.51 508 2.06 % 1,438.49 502 7.09 % 3,220.16 malo ma lo 1,518 2.75 % 1,466.52 418 3.11 % 1,813.27 500 5.01 % 1,691.81 267 1.08 % 756.05 333 4.70 % 2,136.08 dobro do bro 1,485 2.69 % 1,434.63 687 5.12 % 2,980.18 283 2.83 % 957.56 315 1.28 % 891.97 200 2.82 % 1,282.93 pravi pr avi 624 1.13 % 602.84 91 0.68 % 394.75 71 0.71 % 240.24 345 1.40 % 976.92 117 1.65 % 750.51 dober do ber 468 0.85 % 452.13 181 1.35 % 785.17 88 0.88 % 297.76 156 0.63 % 441.74 43 0.61 % 275.83 sam sa m 366 0.66 % 353.59 81 0.60 % 351.37 121 1.21 % 409.42 117 0.47 % 331.30 47 0.66 % 301.49 lepa le pa 344 0.62 % 332.33 122 0.91 % 529.23 36 0.36 % 121.81 153 0.62 % 433.24 33 0.47 % 211.68 celo ce lo 310 0.56 % 299.49 75 0.56 % 325.35 97 0.97 % 328.21 93 0.38 % 263.34 45 0.64 % 288.66 glavnem gl avnem 280 0.51 % 270.50 55 0.41 % 238.59 148 1.48 % 500.77 39 0.16 % 110.43 38 0.54 % 243.76 lep le p 239 0.43 % 230.89 120 0.89 % 520.56 28 0.28 % 94.74 70 0.28 % 198.22 21 0.30 % 134.71 stari st ari 233 0.42 % 225.10 28 0.21 % 121.46 174 1.74 % 588.75 21 0.09 % 59.46 10 0.14 % 64.15 sama sa ma 204 0.37 % 197.08 55 0.41 % 238.59 66 0.66 % 223.32 47 0.19 % 133.09 36 0.51 % 230.93 cela ce la 198 0.36 % 191.28 19 0.14 % 82.42 41 0.41 % 138.73 89 0.36 % 252.02 49 0.69 % 314.32 novo no vo 196 0.35 % 189.35 58 0.43 % 251.60 40 0.40 % 135.34 63 0.26 % 178.39 35 0.49 % 224.51 boljše bo ljše 194 0.35 % 187.42 36 0.27 % 156.17 59 0.59 % 199.63 58 0.23 % 164.24 41 0.58 % 263 velika ve lika 170 0.31 % 164.23 33 0.25 % 143.15 47 0.47 % 159.03 67 0.27 % 189.72 23 0.33 % 147.54 jasno ja sno 165 
0.30 % 159.40 33 0.25 % 143.15 35 0.35 % 118.43 65 0.26 % 184.06 32 0.45 % 205.27 dobra do bra 162 0.29 % 156.51 69 0.51 % 299.32 38 0.38 % 128.58 35 0.14 % 99.11 20 0.28 % 128.29 cel ce l 160 0.29 % 154.57 36 0.27 % 156.17 85 0.85 % 287.61 21 0.09 % 59.46 18 0.25 % 115.46 sami sa mi 148 0.27 % 142.98 25 0.19 % 108.45 31 0.31 % 104.89 70 0.28 % 198.22 22 0.31 % 141.12 naslednji na slednji 144 0.26 % 139.12 48 0.36 % 208.22 36 0.36 % 121.81 37 0.15 % 104.77 23 0.33 % 147.54 velik ve lik 134 0.24 % 129.46 30 0.22 % 130.14 23 0.23 % 77.82 70 0.28 % 198.22 11 0.15 % 70.56 velike ve like 128 0.23 % 123.66 23 0.17 % 99.77 33 0.33 % 111.66 55 0.22 % 155.74 17 0.24 % 109.05 slovenski sl ovenski 126 0.23 % 121.73 36 0.27 % 156.17 11 0.11 % 37.22 74 0.30 % 209.54 5 0.07 % 32.07 zadnjih za dnjih 125 0.23 % 120.76 37 0.28 % 160.50 5 0.05 % 16.92 76 0.31 % 215.21 7 0.10 % 44.90 nove no ve 124 0.23 % 119.79 25 0.19 % 108.45 20 0.20 % 67.67 71 0.29 % 201.05 8 0.11 % 51.32 dolgo do lgo 122 0.22 % 117.86 30 0.22 % 130.14 54 0.54 % 182.72 27 0.11 % 76.45 11 0.15 % 70.56 zadnji za dnji 120 0.22 % 115.93 32 0.24 % 138.81 33 0.33 % 111.66 49 0.20 % 138.75 6 0.09 % 38.49 nov no v 117 0.21 % 113.03 65 0.48 % 281.97 12 0.12 % 40.60 31 0.13 % 87.78 9 0.13 % 57.73 novega no vega 106 0.19 % 102.40 30 0.22 % 130.14 24 0.24 % 81.21 41 0.17 % 116.10 11 0.15 % 70.56 fajn fa jn 105 0.19 % 101.44 40 0.30 % 173.52 44 0.44 % 148.88 7 0.03 % 19.82 14 0.20 % 89.81 stara st ara 104 0.19 % 100.47 31 0.23 % 134.48 43 0.43 % 145.50 16 0.07 % 45.31 14 0.20 % 89.81 CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 446 File at CLARIN.SI2.2.103 List of initial character-level 3-grams from adjective standardized forms in the GOS 1.0 corpus with text- type distributionGOS1.0-word_parts-adjectives-standardized_ forms-initial-3grams-taxonomy-entire.tsvStandardized form Initial part of the word Rest of the word Total absolute frequency of standardized form Percentage of all found 
standardized forms Total relative frequency (per million occurrences) Absolute frequency [gos.T.J.R] Percentage [gos.T.J.R] Relative frequency [gos.T.J.R] Absolute frequency [gos.T.N.Z] Percentage [gos.T.N.Z] Relative frequency [gos.T.N.Z] Absolute frequency [gos.T.J.I] Percentage [gos.T.J.I] Relative frequency [gos.T.J.I] Absolute frequency [gos.T.N.N] Percentage [gos.T.N.N] Relative frequency [gos.T.N.N] samo sam o 2,475 4.49 % 2,391.06 361 2.69 % 1,566 1,104 11.06 % 3,735.51 508 2.06 % 1,438.49 502 7.09 % 3,220.16 malo mal o 1,518 2.75 % 1,466.52 418 3.11 % 1,813.27 500 5.01 % 1,691.81 267 1.08 % 756.05 333 4.70 % 2,136.08 dobro dob ro 1,485 2.69 % 1,434.63 687 5.12 % 2,980.18 283 2.83 % 957.56 315 1.28 % 891.97 200 2.82 % 1,282.93 pravi pra vi 624 1.13 % 602.84 91 0.68 % 394.75 71 0.71 % 240.24 345 1.40 % 976.92 117 1.65 % 750.51 dober dob er 468 0.85 % 452.13 181 1.35 % 785.17 88 0.88 % 297.76 156 0.63 % 441.74 43 0.61 % 275.83 sam sam 366 0.66 % 353.59 81 0.60 % 351.37 121 1.21 % 409.42 117 0.47 % 331.30 47 0.66 % 301.49 lepa lep a 344 0.62 % 332.33 122 0.91 % 529.23 36 0.36 % 121.81 153 0.62 % 433.24 33 0.47 % 211.68 celo cel o 310 0.56 % 299.49 75 0.56 % 325.35 97 0.97 % 328.21 93 0.38 % 263.34 45 0.64 % 288.66 glavnem gla vnem 280 0.51 % 270.50 55 0.41 % 238.59 148 1.48 % 500.77 39 0.16 % 110.43 38 0.54 % 243.76 lep lep 239 0.43 % 230.89 120 0.89 % 520.56 28 0.28 % 94.74 70 0.28 % 198.22 21 0.30 % 134.71 stari sta ri 233 0.42 % 225.10 28 0.21 % 121.46 174 1.74 % 588.75 21 0.09 % 59.46 10 0.14 % 64.15 sama sam a 204 0.37 % 197.08 55 0.41 % 238.59 66 0.66 % 223.32 47 0.19 % 133.09 36 0.51 % 230.93 cela cel a 198 0.36 % 191.28 19 0.14 % 82.42 41 0.41 % 138.73 89 0.36 % 252.02 49 0.69 % 314.32 novo nov o 196 0.35 % 189.35 58 0.43 % 251.60 40 0.40 % 135.34 63 0.26 % 178.39 35 0.49 % 224.51 boljše bol jše 194 0.35 % 187.42 36 0.27 % 156.17 59 0.59 % 199.63 58 0.23 % 164.24 41 0.58 % 263 velika vel ika 170 0.31 % 164.23 33 0.25 % 143.15 47 0.47 % 159.03 67 0.27 % 
189.72 23 0.33 % 147.54 jasno jas no 165 0.30 % 159.40 33 0.25 % 143.15 35 0.35 % 118.43 65 0.26 % 184.06 32 0.45 % 205.27 dobra dob ra 162 0.29 % 156.51 69 0.51 % 299.32 38 0.38 % 128.58 35 0.14 % 99.11 20 0.28 % 128.29 cel cel 160 0.29 % 154.57 36 0.27 % 156.17 85 0.85 % 287.61 21 0.09 % 59.46 18 0.25 % 115.46 sami sam i 148 0.27 % 142.98 25 0.19 % 108.45 31 0.31 % 104.89 70 0.28 % 198.22 22 0.31 % 141.12 naslednji nas lednji 144 0.26 % 139.12 48 0.36 % 208.22 36 0.36 % 121.81 37 0.15 % 104.77 23 0.33 % 147.54 velik vel ik 134 0.24 % 129.46 30 0.22 % 130.14 23 0.23 % 77.82 70 0.28 % 198.22 11 0.15 % 70.56 velike vel ike 128 0.23 % 123.66 23 0.17 % 99.77 33 0.33 % 111.66 55 0.22 % 155.74 17 0.24 % 109.05 slovenski slo venski 126 0.23 % 121.73 36 0.27 % 156.17 11 0.11 % 37.22 74 0.30 % 209.54 5 0.07 % 32.07 zadnjih zad njih 125 0.23 % 120.76 37 0.28 % 160.50 5 0.05 % 16.92 76 0.31 % 215.21 7 0.10 % 44.90 nove nov e 124 0.23 % 119.79 25 0.19 % 108.45 20 0.20 % 67.67 71 0.29 % 201.05 8 0.11 % 51.32 dolgo dol go 122 0.22 % 117.86 30 0.22 % 130.14 54 0.54 % 182.72 27 0.11 % 76.45 11 0.15 % 70.56 zadnji zad nji 120 0.22 % 115.93 32 0.24 % 138.81 33 0.33 % 111.66 49 0.20 % 138.75 6 0.09 % 38.49 nov nov 117 0.21 % 113.03 65 0.48 % 281.97 12 0.12 % 40.60 31 0.13 % 87.78 9 0.13 % 57.73 novega nov ega 106 0.19 % 102.40 30 0.22 % 130.14 24 0.24 % 81.21 41 0.17 % 116.10 11 0.15 % 70.56 fajn faj n 105 0.19 % 101.44 40 0.30 % 173.52 44 0.44 % 148.88 7 0.03 % 19.82 14 0.20 % 89.81 stara sta ra 104 0.19 % 100.47 31 0.23 % 134.48 43 0.43 % 145.50 16 0.07 % 45.31 14 0.20 % 89.81 CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 447 File at CLARIN.SI2.2.104 List of initial character-level 4-grams from adjective standardized forms in the GOS 1.0 corpus with text- type distributionGOS1.0-word_parts-adjectives-standardized_ forms-initial-4grams-taxonomy-entire.tsvStandardized form Initial part of the word Rest of the word Total absolute frequency of 
standardized form Percentage of all found standardized forms Total relative frequency (per million occurrences) Absolute frequency [gos.T.J.R] Percentage [gos.T.J.R] Relative frequency [gos.T.J.R] Absolute frequency [gos.T.N.Z] Percentage [gos.T.N.Z] Relative frequency [gos.T.N.Z] Absolute frequency [gos.T.J.I] Percentage [gos.T.J.I] Relative frequency [gos.T.J.I] Absolute frequency [gos.T.N.N] Percentage [gos.T.N.N] Relative frequency [gos.T.N.N] samo samo 2,475 4.59 % 2,391.06 361 2.77 % 1,566 1,104 11.52 % 3,735.51 508 2.08 % 1,438.49 502 7.21 % 3,220.16 malo malo 1,518 2.81 % 1,466.52 418 3.21 % 1,813.27 500 5.22 % 1,691.81 267 1.09 % 756.05 333 4.78 % 2,136.08 dobro dobr o 1,485 2.75 % 1,434.63 687 5.27 % 2,980.18 283 2.95 % 957.56 315 1.29 % 891.97 200 2.87 % 1,282.93 pravi prav i 624 1.16 % 602.84 91 0.70 % 394.75 71 0.74 % 240.24 345 1.41 % 976.92 117 1.68 % 750.51 dober dobe r 468 0.87 % 452.13 181 1.39 % 785.17 88 0.92 % 297.76 156 0.64 % 441.74 43 0.62 % 275.83 lepa lepa 344 0.64 % 332.33 122 0.94 % 529.23 36 0.38 % 121.81 153 0.63 % 433.24 33 0.47 % 211.68 celo celo 310 0.57 % 299.49 75 0.58 % 325.35 97 1.01 % 328.21 93 0.38 % 263.34 45 0.65 % 288.66 glavnem glav nem 280 0.52 % 270.50 55 0.42 % 238.59 148 1.54 % 500.77 39 0.16 % 110.43 38 0.55 % 243.76 stari star i 233 0.43 % 225.10 28 0.21 % 121.46 174 1.82 % 588.75 21 0.09 % 59.46 10 0.14 % 64.15 sama sama 204 0.38 % 197.08 55 0.42 % 238.59 66 0.69 % 223.32 47 0.19 % 133.09 36 0.52 % 230.93 cela cela 198 0.37 % 191.28 19 0.15 % 82.42 41 0.43 % 138.73 89 0.36 % 252.02 49 0.70 % 314.32 novo novo 196 0.36 % 189.35 58 0.45 % 251.60 40 0.42 % 135.34 63 0.26 % 178.39 35 0.50 % 224.51 boljše bolj še 194 0.36 % 187.42 36 0.28 % 156.17 59 0.62 % 199.63 58 0.24 % 164.24 41 0.59 % 263 velika veli ka 170 0.32 % 164.23 33 0.25 % 143.15 47 0.49 % 159.03 67 0.28 % 189.72 23 0.33 % 147.54 jasno jasn o 165 0.31 % 159.40 33 0.25 % 143.15 35 0.36 % 118.43 65 0.27 % 184.06 32 0.46 % 205.27 dobra dobr a 162 0.30 % 156.51 
69 0.53 % 299.32 38 0.40 % 128.58 35 0.14 % 99.11 20 0.29 % 128.29 sami sami 148 0.27 % 142.98 25 0.19 % 108.45 31 0.32 % 104.89 70 0.29 % 198.22 22 0.32 % 141.12 naslednji nasl ednji 144 0.27 % 139.12 48 0.37 % 208.22 36 0.38 % 121.81 37 0.15 % 104.77 23 0.33 % 147.54 velik veli k 134 0.25 % 129.46 30 0.23 % 130.14 23 0.24 % 77.82 70 0.29 % 198.22 11 0.16 % 70.56 velike veli ke 128 0.24 % 123.66 23 0.18 % 99.77 33 0.34 % 111.66 55 0.23 % 155.74 17 0.24 % 109.05 slovenski slov enski 126 0.23 % 121.73 36 0.28 % 156.17 11 0.12 % 37.22 74 0.30 % 209.54 5 0.07 % 32.07 zadnjih zadn jih 125 0.23 % 120.76 37 0.28 % 160.50 5 0.05 % 16.92 76 0.31 % 215.21 7 0.10 % 44.90 nove nove 124 0.23 % 119.79 25 0.19 % 108.45 20 0.21 % 67.67 71 0.29 % 201.05 8 0.12 % 51.32 dolgo dolg o 122 0.23 % 117.86 30 0.23 % 130.14 54 0.56 % 182.72 27 0.11 % 76.45 11 0.16 % 70.56 zadnji zadn ji 120 0.22 % 115.93 32 0.25 % 138.81 33 0.34 % 111.66 49 0.20 % 138.75 6 0.09 % 38.49 novega nove ga 106 0.20 % 102.40 30 0.23 % 130.14 24 0.25 % 81.21 41 0.17 % 116.10 11 0.16 % 70.56 fajn fajn 105 0.20 % 101.44 40 0.31 % 173.52 44 0.46 % 148.88 7 0.03 % 19.82 14 0.20 % 89.81 stara star a 104 0.19 % 100.47 31 0.24 % 134.48 43 0.45 % 145.50 16 0.07 % 45.31 14 0.20 % 89.81 dobre dobr e 103 0.19 % 99.51 36 0.28 % 156.17 23 0.24 % 77.82 34 0.14 % 96.28 10 0.14 % 64.15 najboljše najb oljše 103 0.19 % 99.51 34 0.26 % 147.49 28 0.29 % 94.74 18 0.07 % 50.97 23 0.33 % 147.54 nova nova 102 0.19 % 98.54 31 0.24 % 134.48 13 0.14 % 43.99 45 0.18 % 127.42 13 0.19 % 83.39 različnih razl ičnih 101 0.19 % 97.57 12 0.09 % 52.06 13 0.14 % 43.99 56 0.23 % 158.57 20 0.29 % 128.29 CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 448 File at CLARIN.SI2.2.105 List of initial character-level 5-grams from adjective standardized forms in the GOS 1.0 corpus with text- type distributionGOS1.0-word_parts-adjectives-standardized_ forms-initial-5grams-taxonomy-entire.tsvStandardized form Initial part of the 
word Rest of the word Total absolute frequency of standardized form Percentage of all found standardized forms Total relative frequency (per million occurrences) Absolute frequency [gos.T.J.R] Percentage [gos.T.J.R] Relative frequency [gos.T.J.R] Absolute frequency [gos.T.N.Z] Percentage [gos.T.N.Z] Relative frequency [gos.T.N.Z] Absolute frequency [gos.T.J.I] Percentage [gos.T.J.I] Relative frequency [gos.T.J.I] Absolute frequency [gos.T.N.N] Percentage [gos.T.N.N] Relative frequency [gos.T.N.N] dobro dobro 1,485 3.21 % 1,434.63 687 6.11 % 2,980.18 283 4.04 % 957.56 315 1.41 % 891.97 200 3.52 % 1,282.93 pravi pravi 624 1.35 % 602.84 91 0.81 % 394.75 71 1.01 % 240.24 345 1.54 % 976.92 117 2.06 % 750.51 dober dober 468 1.01 % 452.13 181 1.61 % 785.17 88 1.26 % 297.76 156 0.70 % 441.74 43 0.76 % 275.83 glavnem glavn em 280 0.60 % 270.50 55 0.49 % 238.59 148 2.11 % 500.77 39 0.17 % 110.43 38 0.67 % 243.76 stari stari 233 0.50 % 225.10 28 0.25 % 121.46 174 2.48 % 588.75 21 0.09 % 59.46 10 0.18 % 64.15 boljše boljš e 194 0.42 % 187.42 36 0.32 % 156.17 59 0.84 % 199.63 58 0.26 % 164.24 41 0.72 % 263 velika velik a 170 0.37 % 164.23 33 0.29 % 143.15 47 0.67 % 159.03 67 0.30 % 189.72 23 0.41 % 147.54 jasno jasno 165 0.36 % 159.40 33 0.29 % 143.15 35 0.50 % 118.43 65 0.29 % 184.06 32 0.56 % 205.27 dobra dobra 162 0.35 % 156.51 69 0.61 % 299.32 38 0.54 % 128.58 35 0.16 % 99.11 20 0.35 % 128.29 naslednji nasle dnji 144 0.31 % 139.12 48 0.43 % 208.22 36 0.51 % 121.81 37 0.17 % 104.77 23 0.41 % 147.54 velik velik 134 0.29 % 129.46 30 0.27 % 130.14 23 0.33 % 77.82 70 0.31 % 198.22 11 0.19 % 70.56 velike velik e 128 0.28 % 123.66 23 0.20 % 99.77 33 0.47 % 111.66 55 0.25 % 155.74 17 0.30 % 109.05 slovenski slove nski 126 0.27 % 121.73 36 0.32 % 156.17 11 0.16 % 37.22 74 0.33 % 209.54 5 0.09 % 32.07 zadnjih zadnj ih 125 0.27 % 120.76 37 0.33 % 160.50 5 0.07 % 16.92 76 0.34 % 215.21 7 0.12 % 44.90 dolgo dolgo 122 0.26 % 117.86 30 0.27 % 130.14 54 0.77 % 182.72 27 0.12 % 76.45 11 
0.19 % 70.56 zadnji zadnj i 120 0.26 % 115.93 32 0.28 % 138.81 33 0.47 % 111.66 49 0.22 % 138.75 6 0.11 % 38.49 novega noveg a 106 0.23 % 102.40 30 0.27 % 130.14 24 0.34 % 81.21 41 0.18 % 116.10 11 0.19 % 70.56 stara stara 104 0.22 % 100.47 31 0.28 % 134.48 43 0.61 % 145.50 16 0.07 % 45.31 14 0.25 % 89.81 dobre dobre 103 0.22 % 99.51 36 0.32 % 156.17 23 0.33 % 77.82 34 0.15 % 96.28 10 0.18 % 64.15 najboljše najbo ljše 103 0.22 % 99.51 34 0.30 % 147.49 28 0.40 % 94.74 18 0.08 % 50.97 23 0.41 % 147.54 različnih razli čnih 101 0.22 % 97.57 12 0.11 % 52.06 13 0.19 % 43.99 56 0.25 % 158.57 20 0.35 % 128.29 veliki velik i 99 0.21 % 95.64 29 0.26 % 125.80 19 0.27 % 64.29 43 0.19 % 121.76 8 0.14 % 51.32 zadnje zadnj e 95 0.20 % 91.78 28 0.25 % 121.46 14 0.20 % 47.37 43 0.19 % 121.76 10 0.18 % 64.15 pravim pravi m 89 0.19 % 85.98 19 0.17 % 82.42 38 0.54 % 128.58 14 0.06 % 39.64 18 0.32 % 115.46 različne razli čne 86 0.19 % 83.08 7 0.06 % 30.37 12 0.17 % 40.60 51 0.23 % 144.41 16 0.28 % 102.63 kratko kratk o 82 0.18 % 79.22 14 0.12 % 60.73 4 0.06 % 13.53 54 0.24 % 152.91 10 0.18 % 64.15 preden prede n 78 0.17 % 75.35 14 0.12 % 60.73 16 0.23 % 54.14 26 0.12 % 73.62 22 0.39 % 141.12 rojstni rojst ni 77 0.17 % 74.39 41 0.36 % 177.86 27 0.39 % 91.36 5 0.02 % 14.16 4 0.07 % 25.66 slovenskih slove nskih 77 0.17 % 74.39 23 0.20 % 99.77 4 0.06 % 13.53 49 0.22 % 138.75 1 0.02 % 6.41 rečeno rečen o 76 0.16 % 73.42 20 0.18 % 86.76 4 0.06 % 13.53 44 0.20 % 124.59 8 0.14 % 51.32 boljši boljš i 75 0.16 % 72.46 19 0.17 % 82.42 26 0.37 % 87.97 22 0.10 % 62.30 8 0.14 % 51.32 največji najve čji 74 0.16 % 71.49 21 0.19 % 91.10 13 0.19 % 43.99 29 0.13 % 82.12 11 0.19 % 70.56 CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 449 File at CLARIN.SI2.2.106 List of final character-level 1-grams from adjective standardized forms in the GOS 1.0 corpus with text- type distributionGOS1.0-word_parts-adjectives-standardized_ forms-final-1grams-taxonomy-entire.tsvStandardized 
| form | Rest of the word | Final part of the word | Total absolute frequency of standardized form | Percentage of all found standardized forms | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N] |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| samo | sam | o | 2,475 | 4.49 % | 2,391.06 | 361 | 2.69 % | 1,566 | 1,104 | 11.05 % | 3,735.51 | 508 | 2.06 % | 1,438.49 | 502 | 7.09 % | 3,220.16 |
| malo | mal | o | 1,518 | 2.75 % | 1,466.52 | 418 | 3.11 % | 1,813.27 | 500 | 5.01 % | 1,691.81 | 267 | 1.08 % | 756.05 | 333 | 4.70 % | 2,136.08 |
| dobro | dobr | o | 1,485 | 2.69 % | 1,434.63 | 687 | 5.12 % | 2,980.18 | 283 | 2.83 % | 957.56 | 315 | 1.28 % | 891.97 | 200 | 2.82 % | 1,282.93 |
| pravi | prav | i | 624 | 1.13 % | 602.84 | 91 | 0.68 % | 394.75 | 71 | 0.71 % | 240.24 | 345 | 1.40 % | 976.92 | 117 | 1.65 % | 750.51 |
| dober | dobe | r | 468 | 0.85 % | 452.13 | 181 | 1.35 % | 785.17 | 88 | 0.88 % | 297.76 | 156 | 0.63 % | 441.74 | 43 | 0.61 % | 275.83 |
| sam | sa | m | 366 | 0.66 % | 353.59 | 81 | 0.60 % | 351.37 | 121 | 1.21 % | 409.42 | 117 | 0.47 % | 331.30 | 47 | 0.66 % | 301.49 |
| lepa | lep | a | 344 | 0.62 % | 332.33 | 122 | 0.91 % | 529.23 | 36 | 0.36 % | 121.81 | 153 | 0.62 % | 433.24 | 33 | 0.47 % | 211.68 |
| celo | cel | o | 310 | 0.56 % | 299.49 | 75 | 0.56 % | 325.35 | 97 | 0.97 % | 328.21 | 93 | 0.38 % | 263.34 | 45 | 0.64 % | 288.66 |
| glavnem | glavne | m | 280 | 0.51 % | 270.50 | 55 | 0.41 % | 238.59 | 148 | 1.48 % | 500.77 | 39 | 0.16 % | 110.43 | 38 | 0.54 % | 243.76 |
| lep | le | p | 239 | 0.43 % | 230.89 | 120 | 0.89 % | 520.56 | 28 | 0.28 % | 94.74 | 70 | 0.28 % | 198.22 | 21 | 0.30 % | 134.71 |
| stari | star | i | 233 | 0.42 % | 225.10 | 28 | 0.21 % | 121.46 | 174 | 1.74 % | 588.75 | 21 | 0.09 % | 59.46 | 10 | 0.14 % | 64.15 |
| sama | sam | a | 204 | 0.37 % | 197.08 | 55 | 0.41 % | 238.59 | 66 | 0.66 % | 223.32 | 47 | 0.19 % | 133.09 | 36 | 0.51 % | 230.93 |
| cela | cel | a | 198 | 0.36 % | 191.28 | 19 | 0.14 % | 82.42 | 41 | 0.41 % | 138.73 | 89 | 0.36 % | 252.02 | 49 | 0.69 % | 314.32 |
| novo | nov | o | 196 | 0.35 % | 189.35 | 58 | 0.43 % | 251.60 | 40 | 0.40 % | 135.34 | 63 | 0.26 % | 178.39 | 35 | 0.49 % | 224.51 |
| boljše | boljš | e | 194 | 0.35 % | 187.42 | 36 | 0.27 % | 156.17 | 59 | 0.59 % | 199.63 | 58 | 0.23 % | 164.24 | 41 | 0.58 % | 263 |
| velika | velik | a | 170 | 0.31 % | 164.23 | 33 | 0.25 % | 143.15 | 47 | 0.47 % | 159.03 | 67 | 0.27 % | 189.72 | 23 | 0.33 % | 147.54 |
| jasno | jasn | o | 165 | 0.30 % | 159.40 | 33 | 0.25 % | 143.15 | 35 | 0.35 % | 118.43 | 65 | 0.26 % | 184.06 | 32 | 0.45 % | 205.27 |
| dobra | dobr | a | 162 | 0.29 % | 156.51 | 69 | 0.51 % | 299.32 | 38 | 0.38 % | 128.58 | 35 | 0.14 % | 99.11 | 20 | 0.28 % | 128.29 |
| cel | ce | l | 160 | 0.29 % | 154.57 | 36 | 0.27 % | 156.17 | 85 | 0.85 % | 287.61 | 21 | 0.09 % | 59.46 | 18 | 0.25 % | 115.46 |
| sami | sam | i | 148 | 0.27 % | 142.98 | 25 | 0.19 % | 108.45 | 31 | 0.31 % | 104.89 | 70 | 0.28 % | 198.22 | 22 | 0.31 % | 141.12 |
| naslednji | naslednj | i | 144 | 0.26 % | 139.12 | 48 | 0.36 % | 208.22 | 36 | 0.36 % | 121.81 | 37 | 0.15 % | 104.77 | 23 | 0.33 % | 147.54 |
| velik | veli | k | 134 | 0.24 % | 129.46 | 30 | 0.22 % | 130.14 | 23 | 0.23 % | 77.82 | 70 | 0.28 % | 198.22 | 11 | 0.15 % | 70.56 |
| velike | velik | e | 128 | 0.23 % | 123.66 | 23 | 0.17 % | 99.77 | 33 | 0.33 % | 111.66 | 55 | 0.22 % | 155.74 | 17 | 0.24 % | 109.05 |
| slovenski | slovensk | i | 126 | 0.23 % | 121.73 | 36 | 0.27 % | 156.17 | 11 | 0.11 % | 37.22 | 74 | 0.30 % | 209.54 | 5 | 0.07 % | 32.07 |
| zadnjih | zadnji | h | 125 | 0.23 % | 120.76 | 37 | 0.28 % | 160.50 | 5 | 0.05 % | 16.92 | 76 | 0.31 % | 215.21 | 7 | 0.10 % | 44.90 |
| nove | nov | e | 124 | 0.23 % | 119.79 | 25 | 0.19 % | 108.45 | 20 | 0.20 % | 67.67 | 71 | 0.29 % | 201.05 | 8 | 0.11 % | 51.32 |
| dolgo | dolg | o | 122 | 0.22 % | 117.86 | 30 | 0.22 % | 130.14 | 54 | 0.54 % | 182.72 | 27 | 0.11 % | 76.45 | 11 | 0.15 % | 70.56 |
| zadnji | zadnj | i | 120 | 0.22 % | 115.93 | 32 | 0.24 % | 138.81 | 33 | 0.33 % | 111.66 | 49 | 0.20 % | 138.75 | 6 | 0.09 % | 38.49 |
| nov | no | v | 117 | 0.21 % | 113.03 | 65 | 0.48 % | 281.97 | 12 | 0.12 % | 40.60 | 31 | 0.13 % | 87.78 | 9 | 0.13 % | 57.73 |
| novega | noveg | a | 106 | 0.19 % | 102.40 | 30 | 0.22 % | 130.14 | 24 | 0.24 % | 81.21 | 41 | 0.17 % | 116.10 | 11 | 0.15 % | 70.56 |
| fajn | faj | n | 105 | 0.19 % | 101.44 | 40 | 0.30 % | 173.52 | 44 | 0.44 % | 148.88 | 7 | 0.03 % | 19.82 | 14 | 0.20 % | 89.81 |
| stara | star | a | 104 | 0.19 % | 100.47 | 31 | 0.23 % | 134.48 | 43 | 0.43 % | 145.50 | 16 | 0.07 % | 45.31 | 14 | 0.20 % | 89.81 |

2.2.107 List of final character-level 2-grams from adjective standardized forms in the GOS 1.0 corpus with text-type distribution

File at CLARIN.SI: GOS1.0-word_parts-adjectives-standardized_forms-final-2grams-taxonomy-entire.tsv

| Standardized form | Rest of the word | Final part of the word | Total absolute frequency of standardized form | Percentage of all found standardized forms | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N] |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| samo | sa | mo | 2,475 | 4.49 % | 2,391.06 | 361 | 2.69 % | 1,566 | 1,104 | 11.05 % | 3,735.51 | 508 | 2.06 % | 1,438.49 | 502 | 7.09 % | 3,220.16 |
| malo | ma | lo | 1,518 | 2.75 % | 1,466.52 | 418 | 3.11 % | 1,813.27 | 500 | 5.01 % | 1,691.81 | 267 | 1.08 % | 756.05 | 333 | 4.70 % | 2,136.08 |
| dobro | dob | ro | 1,485 | 2.69 % | 1,434.63 | 687 | 5.12 % | 2,980.18 | 283 | 2.83 % | 957.56 | 315 | 1.28 % | 891.97 | 200 | 2.82 % | 1,282.93 |
| pravi | pra | vi | 624 | 1.13 % | 602.84 | 91 | 0.68 % | 394.75 | 71 | 0.71 % | 240.24 | 345 | 1.40 % | 976.92 | 117 | 1.65 % | 750.51 |
| dober | dob | er | 468 | 0.85 % | 452.13 | 181 | 1.35 % | 785.17 | 88 | 0.88 % | 297.76 | 156 | 0.63 % | 441.74 | 43 | 0.61 % | 275.83 |
| sam | s | am | 366 | 0.66 % | 353.59 | 81 | 0.60 % | 351.37 | 121 | 1.21 % | 409.42 | 117 | 0.47 % | 331.30 | 47 | 0.66 % | 301.49 |
| lepa | le | pa | 344 | 0.62 % | 332.33 | 122 | 0.91 % | 529.23 | 36 | 0.36 % | 121.81 | 153 | 0.62 % | 433.24 | 33 | 0.47 % | 211.68 |
| celo | ce | lo | 310 | 0.56 % | 299.49 | 75 | 0.56 % | 325.35 | 97 | 0.97 % | 328.21 | 93 | 0.38 % | 263.34 | 45 | 0.64 % | 288.66 |
| glavnem | glavn | em | 280 | 0.51 % | 270.50 | 55 | 0.41 % | 238.59 | 148 | 1.48 % | 500.77 | 39 | 0.16 % | 110.43 | 38 | 0.54 % | 243.76 |
| lep | l | ep | 239 | 0.43 % | 230.89 | 120 | 0.89 % | 520.56 | 28 | 0.28 % | 94.74 | 70 | 0.28 % | 198.22 | 21 | 0.30 % | 134.71 |
| stari | sta | ri | 233 | 0.42 % | 225.10 | 28 | 0.21 % | 121.46 | 174 | 1.74 % | 588.75 | 21 | 0.09 % | 59.46 | 10 | 0.14 % | 64.15 |
| sama | sa | ma | 204 | 0.37 % | 197.08 | 55 | 0.41 % | 238.59 | 66 | 0.66 % | 223.32 | 47 | 0.19 % | 133.09 | 36 | 0.51 % | 230.93 |
| cela | ce | la | 198 | 0.36 % | 191.28 | 19 | 0.14 % | 82.42 | 41 | 0.41 % | 138.73 | 89 | 0.36 % | 252.02 | 49 | 0.69 % | 314.32 |
| novo | no | vo | 196 | 0.35 % | 189.35 | 58 | 0.43 % | 251.60 | 40 | 0.40 % | 135.34 | 63 | 0.26 % | 178.39 | 35 | 0.49 % | 224.51 |
| boljše | bolj | še | 194 | 0.35 % | 187.42 | 36 | 0.27 % | 156.17 | 59 | 0.59 % | 199.63 | 58 | 0.23 % | 164.24 | 41 | 0.58 % | 263 |
| velika | veli | ka | 170 | 0.31 % | 164.23 | 33 | 0.25 % | 143.15 | 47 | 0.47 % | 159.03 | 67 | 0.27 % | 189.72 | 23 | 0.33 % | 147.54 |
| jasno | jas | no | 165 | 0.30 % | 159.40 | 33 | 0.25 % | 143.15 | 35 | 0.35 % | 118.43 | 65 | 0.26 % | 184.06 | 32 | 0.45 % | 205.27 |
| dobra | dob | ra | 162 | 0.29 % | 156.51 | 69 | 0.51 % | 299.32 | 38 | 0.38 % | 128.58 | 35 | 0.14 % | 99.11 | 20 | 0.28 % | 128.29 |
| cel | c | el | 160 | 0.29 % | 154.57 | 36 | 0.27 % | 156.17 | 85 | 0.85 % | 287.61 | 21 | 0.09 % | 59.46 | 18 | 0.25 % | 115.46 |
| sami | sa | mi | 148 | 0.27 % | 142.98 | 25 | 0.19 % | 108.45 | 31 | 0.31 % | 104.89 | 70 | 0.28 % | 198.22 | 22 | 0.31 % | 141.12 |
| naslednji | nasledn | ji | 144 | 0.26 % | 139.12 | 48 | 0.36 % | 208.22 | 36 | 0.36 % | 121.81 | 37 | 0.15 % | 104.77 | 23 | 0.33 % | 147.54 |
| velik | vel | ik | 134 | 0.24 % | 129.46 | 30 | 0.22 % | 130.14 | 23 | 0.23 % | 77.82 | 70 | 0.28 % | 198.22 | 11 | 0.15 % | 70.56 |
| velike | veli | ke | 128 | 0.23 % | 123.66 | 23 | 0.17 % | 99.77 | 33 | 0.33 % | 111.66 | 55 | 0.22 % | 155.74 | 17 | 0.24 % | 109.05 |
| slovenski | slovens | ki | 126 | 0.23 % | 121.73 | 36 | 0.27 % | 156.17 | 11 | 0.11 % | 37.22 | 74 | 0.30 % | 209.54 | 5 | 0.07 % | 32.07 |
| zadnjih | zadnj | ih | 125 | 0.23 % | 120.76 | 37 | 0.28 % | 160.50 | 5 | 0.05 % | 16.92 | 76 | 0.31 % | 215.21 | 7 | 0.10 % | 44.90 |
| nove | no | ve | 124 | 0.23 % | 119.79 | 25 | 0.19 % | 108.45 | 20 | 0.20 % | 67.67 | 71 | 0.29 % | 201.05 | 8 | 0.11 % | 51.32 |
| dolgo | dol | go | 122 | 0.22 % | 117.86 | 30 | 0.22 % | 130.14 | 54 | 0.54 % | 182.72 | 27 | 0.11 % | 76.45 | 11 | 0.15 % | 70.56 |
| zadnji | zadn | ji | 120 | 0.22 % | 115.93 | 32 | 0.24 % | 138.81 | 33 | 0.33 % | 111.66 | 49 | 0.20 % | 138.75 | 6 | 0.09 % | 38.49 |
| nov | n | ov | 117 | 0.21 % | 113.03 | 65 | 0.48 % | 281.97 | 12 | 0.12 % | 40.60 | 31 | 0.13 % | 87.78 | 9 | 0.13 % | 57.73 |
| novega | nove | ga | 106 | 0.19 % | 102.40 | 30 | 0.22 % | 130.14 | 24 | 0.24 % | 81.21 | 41 | 0.17 % | 116.10 | 11 | 0.15 % | 70.56 |
| fajn | fa | jn | 105 | 0.19 % | 101.44 | 40 | 0.30 % | 173.52 | 44 | 0.44 % | 148.88 | 7 | 0.03 % | 19.82 | 14 | 0.20 % | 89.81 |
| stara | sta | ra | 104 | 0.19 % | 100.47 | 31 | 0.23 % | 134.48 | 43 | 0.43 % | 145.50 | 16 | 0.07 % | 45.31 | 14 | 0.20 % | 89.81 |

2.2.108 List of final character-level 3-grams from adjective standardized forms in the GOS 1.0 corpus with text-type distribution

File at CLARIN.SI: GOS1.0-word_parts-adjectives-standardized_forms-final-3grams-taxonomy-entire.tsv

| Standardized form | Rest of the word | Final part of the word | Total absolute frequency of standardized form | Percentage of all found standardized forms | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N] |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| samo | s | amo | 2,475 | 4.49 % | 2,391.06 | 361 | 2.69 % | 1,566 | 1,104 | 11.06 % | 3,735.51 | 508 | 2.06 % | 1,438.49 | 502 | 7.09 % | 3,220.16 |
| malo | m | alo | 1,518 | 2.75 % | 1,466.52 | 418 | 3.11 % | 1,813.27 | 500 | 5.01 % | 1,691.81 | 267 | 1.08 % | 756.05 | 333 | 4.70 % | 2,136.08 |
| dobro | do | bro | 1,485 | 2.69 % | 1,434.63 | 687 | 5.12 % | 2,980.18 | 283 | 2.83 % | 957.56 | 315 | 1.28 % | 891.97 | 200 | 2.82 % | 1,282.93 |
| pravi | pr | avi | 624 | 1.13 % | 602.84 | 91 | 0.68 % | 394.75 | 71 | 0.71 % | 240.24 | 345 | 1.40 % | 976.92 | 117 | 1.65 % | 750.51 |
| dober | do | ber | 468 | 0.85 % | 452.13 | 181 | 1.35 % | 785.17 | 88 | 0.88 % | 297.76 | 156 | 0.63 % | 441.74 | 43 | 0.61 % | 275.83 |
| sam |  | sam | 366 | 0.66 % | 353.59 | 81 | 0.60 % | 351.37 | 121 | 1.21 % | 409.42 | 117 | 0.47 % | 331.30 | 47 | 0.66 % | 301.49 |
| lepa | l | epa | 344 | 0.62 % | 332.33 | 122 | 0.91 % | 529.23 | 36 | 0.36 % | 121.81 | 153 | 0.62 % | 433.24 | 33 | 0.47 % | 211.68 |
| celo | c | elo | 310 | 0.56 % | 299.49 | 75 | 0.56 % | 325.35 | 97 | 0.97 % | 328.21 | 93 | 0.38 % | 263.34 | 45 | 0.64 % | 288.66 |
| glavnem | glav | nem | 280 | 0.51 % | 270.50 | 55 | 0.41 % | 238.59 | 148 | 1.48 % | 500.77 | 39 | 0.16 % | 110.43 | 38 | 0.54 % | 243.76 |
| lep |  | lep | 239 | 0.43 % | 230.89 | 120 | 0.89 % | 520.56 | 28 | 0.28 % | 94.74 | 70 | 0.28 % | 198.22 | 21 | 0.30 % | 134.71 |
| stari | st | ari | 233 | 0.42 % | 225.10 | 28 | 0.21 % | 121.46 | 174 | 1.74 % | 588.75 | 21 | 0.09 % | 59.46 | 10 | 0.14 % | 64.15 |
| sama | s | ama | 204 | 0.37 % | 197.08 | 55 | 0.41 % | 238.59 | 66 | 0.66 % | 223.32 | 47 | 0.19 % | 133.09 | 36 | 0.51 % | 230.93 |
| cela | c | ela | 198 | 0.36 % | 191.28 | 19 | 0.14 % | 82.42 | 41 | 0.41 % | 138.73 | 89 | 0.36 % | 252.02 | 49 | 0.69 % | 314.32 |
| novo | n | ovo | 196 | 0.35 % | 189.35 | 58 | 0.43 % | 251.60 | 40 | 0.40 % | 135.34 | 63 | 0.26 % | 178.39 | 35 | 0.49 % | 224.51 |
| boljše | bol | jše | 194 | 0.35 % | 187.42 | 36 | 0.27 % | 156.17 | 59 | 0.59 % | 199.63 | 58 | 0.23 % | 164.24 | 41 | 0.58 % | 263 |
| velika | vel | ika | 170 | 0.31 % | 164.23 | 33 | 0.25 % | 143.15 | 47 | 0.47 % | 159.03 | 67 | 0.27 % | 189.72 | 23 | 0.33 % | 147.54 |
| jasno | ja | sno | 165 | 0.30 % | 159.40 | 33 | 0.25 % | 143.15 | 35 | 0.35 % | 118.43 | 65 | 0.26 % | 184.06 | 32 | 0.45 % | 205.27 |
| dobra | do | bra | 162 | 0.29 % | 156.51 | 69 | 0.51 % | 299.32 | 38 | 0.38 % | 128.58 | 35 | 0.14 % | 99.11 | 20 | 0.28 % | 128.29 |
| cel |  | cel | 160 | 0.29 % | 154.57 | 36 | 0.27 % | 156.17 | 85 | 0.85 % | 287.61 | 21 | 0.09 % | 59.46 | 18 | 0.25 % | 115.46 |
| sami | s | ami | 148 | 0.27 % | 142.98 | 25 | 0.19 % | 108.45 | 31 | 0.31 % | 104.89 | 70 | 0.28 % | 198.22 | 22 | 0.31 % | 141.12 |
| naslednji | nasled | nji | 144 | 0.26 % | 139.12 | 48 | 0.36 % | 208.22 | 36 | 0.36 % | 121.81 | 37 | 0.15 % | 104.77 | 23 | 0.33 % | 147.54 |
| velik | ve | lik | 134 | 0.24 % | 129.46 | 30 | 0.22 % | 130.14 | 23 | 0.23 % | 77.82 | 70 | 0.28 % | 198.22 | 11 | 0.15 % | 70.56 |
| velike | vel | ike | 128 | 0.23 % | 123.66 | 23 | 0.17 % | 99.77 | 33 | 0.33 % | 111.66 | 55 | 0.22 % | 155.74 | 17 | 0.24 % | 109.05 |
| slovenski | sloven | ski | 126 | 0.23 % | 121.73 | 36 | 0.27 % | 156.17 | 11 | 0.11 % | 37.22 | 74 | 0.30 % | 209.54 | 5 | 0.07 % | 32.07 |
| zadnjih | zadn | jih | 125 | 0.23 % | 120.76 | 37 | 0.28 % | 160.50 | 5 | 0.05 % | 16.92 | 76 | 0.31 % | 215.21 | 7 | 0.10 % | 44.90 |
| nove | n | ove | 124 | 0.23 % | 119.79 | 25 | 0.19 % | 108.45 | 20 | 0.20 % | 67.67 | 71 | 0.29 % | 201.05 | 8 | 0.11 % | 51.32 |
| dolgo | do | lgo | 122 | 0.22 % | 117.86 | 30 | 0.22 % | 130.14 | 54 | 0.54 % | 182.72 | 27 | 0.11 % | 76.45 | 11 | 0.15 % | 70.56 |
| zadnji | zad | nji | 120 | 0.22 % | 115.93 | 32 | 0.24 % | 138.81 | 33 | 0.33 % | 111.66 | 49 | 0.20 % | 138.75 | 6 | 0.09 % | 38.49 |
| nov |  | nov | 117 | 0.21 % | 113.03 | 65 | 0.48 % | 281.97 | 12 | 0.12 % | 40.60 | 31 | 0.13 % | 87.78 | 9 | 0.13 % | 57.73 |
| novega | nov | ega | 106 | 0.19 % | 102.40 | 30 | 0.22 % | 130.14 | 24 | 0.24 % | 81.21 | 41 | 0.17 % | 116.10 | 11 | 0.15 % | 70.56 |
| fajn | f | ajn | 105 | 0.19 % | 101.44 | 40 | 0.30 % | 173.52 | 44 | 0.44 % | 148.88 | 7 | 0.03 % | 19.82 | 14 | 0.20 % | 89.81 |
| stara | st | ara | 104 | 0.19 % | 100.47 | 31 | 0.23 % | 134.48 | 43 | 0.43 % | 145.50 | 16 | 0.07 % | 45.31 | 14 | 0.20 % | 89.81 |

2.2.109 List of final character-level 4-grams from adjective standardized forms in the GOS 1.0 corpus with text-type distribution

File at CLARIN.SI: GOS1.0-word_parts-adjectives-standardized_forms-final-4grams-taxonomy-entire.tsv

| Standardized form | Rest of the word | Final part of the word | Total absolute frequency of standardized form | Percentage of all found standardized forms | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N] |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| samo |  | samo | 2,475 | 4.59 % | 2,391.06 | 361 | 2.77 % | 1,566 | 1,104 | 11.52 % | 3,735.51 | 508 | 2.08 % | 1,438.49 | 502 | 7.21 % | 3,220.16 |
| malo |  | malo | 1,518 | 2.81 % | 1,466.52 | 418 | 3.21 % | 1,813.27 | 500 | 5.22 % | 1,691.81 | 267 | 1.09 % | 756.05 | 333 | 4.78 % | 2,136.08 |
| dobro | d | obro | 1,485 | 2.75 % | 1,434.63 | 687 | 5.27 % | 2,980.18 | 283 | 2.95 % | 957.56 | 315 | 1.29 % | 891.97 | 200 | 2.87 % | 1,282.93 |
| pravi | p | ravi | 624 | 1.16 % | 602.84 | 91 | 0.70 % | 394.75 | 71 | 0.74 % | 240.24 | 345 | 1.41 % | 976.92 | 117 | 1.68 % | 750.51 |
| dober | d | ober | 468 | 0.87 % | 452.13 | 181 | 1.39 % | 785.17 | 88 | 0.92 % | 297.76 | 156 | 0.64 % | 441.74 | 43 | 0.62 % | 275.83 |
| lepa |  | lepa | 344 | 0.64 % | 332.33 | 122 | 0.94 % | 529.23 | 36 | 0.38 % | 121.81 | 153 | 0.63 % | 433.24 | 33 | 0.47 % | 211.68 |
| celo |  | celo | 310 | 0.57 % | 299.49 | 75 | 0.58 % | 325.35 | 97 | 1.01 % | 328.21 | 93 | 0.38 % | 263.34 | 45 | 0.65 % | 288.66 |
| glavnem | gla | vnem | 280 | 0.52 % | 270.50 | 55 | 0.42 % | 238.59 | 148 | 1.54 % | 500.77 | 39 | 0.16 % | 110.43 | 38 | 0.55 % | 243.76 |
| stari | s | tari | 233 | 0.43 % | 225.10 | 28 | 0.21 % | 121.46 | 174 | 1.82 % | 588.75 | 21 | 0.09 % | 59.46 | 10 | 0.14 % | 64.15 |
| sama |  | sama | 204 | 0.38 % | 197.08 | 55 | 0.42 % | 238.59 | 66 | 0.69 % | 223.32 | 47 | 0.19 % | 133.09 | 36 | 0.52 % | 230.93 |
| cela |  | cela | 198 | 0.37 % | 191.28 | 19 | 0.15 % | 82.42 | 41 | 0.43 % | 138.73 | 89 | 0.36 % | 252.02 | 49 | 0.70 % | 314.32 |
| novo |  | novo | 196 | 0.36 % | 189.35 | 58 | 0.45 % | 251.60 | 40 | 0.42 % | 135.34 | 63 | 0.26 % | 178.39 | 35 | 0.50 % | 224.51 |
| boljše | bo | ljše | 194 | 0.36 % | 187.42 | 36 | 0.28 % | 156.17 | 59 | 0.62 % | 199.63 | 58 | 0.24 % | 164.24 | 41 | 0.59 % | 263 |
| velika | ve | lika | 170 | 0.32 % | 164.23 | 33 | 0.25 % | 143.15 | 47 | 0.49 % | 159.03 | 67 | 0.28 % | 189.72 | 23 | 0.33 % | 147.54 |
| jasno | j | asno | 165 | 0.31 % | 159.40 | 33 | 0.25 % | 143.15 | 35 | 0.36 % | 118.43 | 65 | 0.27 % | 184.06 | 32 | 0.46 % | 205.27 |
| dobra | d | obra | 162 | 0.30 % | 156.51 | 69 | 0.53 % | 299.32 | 38 | 0.40 % | 128.58 | 35 | 0.14 % | 99.11 | 20 | 0.29 % | 128.29 |
| sami |  | sami | 148 | 0.27 % | 142.98 | 25 | 0.19 % | 108.45 | 31 | 0.32 % | 104.89 | 70 | 0.29 % | 198.22 | 22 | 0.32 % | 141.12 |
| naslednji | nasle | dnji | 144 | 0.27 % | 139.12 | 48 | 0.37 % | 208.22 | 36 | 0.38 % | 121.81 | 37 | 0.15 % | 104.77 | 23 | 0.33 % | 147.54 |
| velik | v | elik | 134 | 0.25 % | 129.46 | 30 | 0.23 % | 130.14 | 23 | 0.24 % | 77.82 | 70 | 0.29 % | 198.22 | 11 | 0.16 % | 70.56 |
| velike | ve | like | 128 | 0.24 % | 123.66 | 23 | 0.18 % | 99.77 | 33 | 0.34 % | 111.66 | 55 | 0.23 % | 155.74 | 17 | 0.24 % | 109.05 |
| slovenski | slove | nski | 126 | 0.23 % | 121.73 | 36 | 0.28 % | 156.17 | 11 | 0.12 % | 37.22 | 74 | 0.30 % | 209.54 | 5 | 0.07 % | 32.07 |
| zadnjih | zad | njih | 125 | 0.23 % | 120.76 | 37 | 0.28 % | 160.50 | 5 | 0.05 % | 16.92 | 76 | 0.31 % | 215.21 | 7 | 0.10 % | 44.90 |
| nove |  | nove | 124 | 0.23 % | 119.79 | 25 | 0.19 % | 108.45 | 20 | 0.21 % | 67.67 | 71 | 0.29 % | 201.05 | 8 | 0.12 % | 51.32 |
| dolgo | d | olgo | 122 | 0.23 % | 117.86 | 30 | 0.23 % | 130.14 | 54 | 0.56 % | 182.72 | 27 | 0.11 % | 76.45 | 11 | 0.16 % | 70.56 |
| zadnji | za | dnji | 120 | 0.22 % | 115.93 | 32 | 0.25 % | 138.81 | 33 | 0.34 % | 111.66 | 49 | 0.20 % | 138.75 | 6 | 0.09 % | 38.49 |
| novega | no | vega | 106 | 0.20 % | 102.40 | 30 | 0.23 % | 130.14 | 24 | 0.25 % | 81.21 | 41 | 0.17 % | 116.10 | 11 | 0.16 % | 70.56 |
| fajn |  | fajn | 105 | 0.20 % | 101.44 | 40 | 0.31 % | 173.52 | 44 | 0.46 % | 148.88 | 7 | 0.03 % | 19.82 | 14 | 0.20 % | 89.81 |
| stara | s | tara | 104 | 0.19 % | 100.47 | 31 | 0.24 % | 134.48 | 43 | 0.45 % | 145.50 | 16 | 0.07 % | 45.31 | 14 | 0.20 % | 89.81 |
| dobre | d | obre | 103 | 0.19 % | 99.51 | 36 | 0.28 % | 156.17 | 23 | 0.24 % | 77.82 | 34 | 0.14 % | 96.28 | 10 | 0.14 % | 64.15 |
| najboljše | najbo | ljše | 103 | 0.19 % | 99.51 | 34 | 0.26 % | 147.49 | 28 | 0.29 % | 94.74 | 18 | 0.07 % | 50.97 | 23 | 0.33 % | 147.54 |
| nova |  | nova | 102 | 0.19 % | 98.54 | 31 | 0.24 % | 134.48 | 13 | 0.14 % | 43.99 | 45 | 0.18 % | 127.42 | 13 | 0.19 % | 83.39 |
| različnih | razli | čnih | 101 | 0.19 % | 97.57 | 12 | 0.09 % | 52.06 | 13 | 0.14 % | 43.99 | 56 | 0.23 % | 158.57 | 20 | 0.29 % | 128.29 |

2.2.110 List of final character-level 5-grams from adjective standardized forms in the GOS 1.0 corpus with text-type distribution

File at CLARIN.SI: GOS1.0-word_parts-adjectives-standardized_forms-final-5grams-taxonomy-entire.tsv

| Standardized form | Rest of the word | Final part of the word | Total absolute frequency of standardized form | Percentage of all found standardized forms | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N] |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| dobro |  | dobro | 1,485 | 3.21 % | 1,434.63 | 687 | 6.11 % | 2,980.18 | 283 | 4.04 % | 957.56 | 315 | 1.41 % | 891.97 | 200 | 3.52 % | 1,282.93 |
| pravi |  | pravi | 624 | 1.35 % | 602.84 | 91 | 0.81 % | 394.75 | 71 | 1.01 % | 240.24 | 345 | 1.54 % | 976.92 | 117 | 2.06 % | 750.51 |
| dober |  | dober | 468 | 1.01 % | 452.13 | 181 | 1.61 % | 785.17 | 88 | 1.26 % | 297.76 | 156 | 0.70 % | 441.74 | 43 | 0.76 % | 275.83 |
| glavnem | gl | avnem | 280 | 0.60 % | 270.50 | 55 | 0.49 % | 238.59 | 148 | 2.11 % | 500.77 | 39 | 0.17 % | 110.43 | 38 | 0.67 % | 243.76 |
| stari |  | stari | 233 | 0.50 % | 225.10 | 28 | 0.25 % | 121.46 | 174 | 2.48 % | 588.75 | 21 | 0.09 % | 59.46 | 10 | 0.18 % | 64.15 |
| boljše | b | oljše | 194 | 0.42 % | 187.42 | 36 | 0.32 % | 156.17 | 59 | 0.84 % | 199.63 | 58 | 0.26 % | 164.24 | 41 | 0.72 % | 263 |
| velika | v | elika | 170 | 0.37 % | 164.23 | 33 | 0.29 % | 143.15 | 47 | 0.67 % | 159.03 | 67 | 0.30 % | 189.72 | 23 | 0.41 % | 147.54 |
| jasno |  | jasno | 165 | 0.36 % | 159.40 | 33 | 0.29 % | 143.15 | 35 | 0.50 % | 118.43 | 65 | 0.29 % | 184.06 | 32 | 0.56 % | 205.27 |
| dobra |  | dobra | 162 | 0.35 % | 156.51 | 69 | 0.61 % | 299.32 | 38 | 0.54 % | 128.58 | 35 | 0.16 % | 99.11 | 20 | 0.35 % | 128.29 |
| naslednji | nasl | ednji | 144 | 0.31 % | 139.12 | 48 | 0.43 % | 208.22 | 36 | 0.51 % | 121.81 | 37 | 0.17 % | 104.77 | 23 | 0.41 % | 147.54 |
| velik |  | velik | 134 | 0.29 % | 129.46 | 30 | 0.27 % | 130.14 | 23 | 0.33 % | 77.82 | 70 | 0.31 % | 198.22 | 11 | 0.19 % | 70.56 |
| velike | v | elike | 128 | 0.28 % | 123.66 | 23 | 0.20 % | 99.77 | 33 | 0.47 % | 111.66 | 55 | 0.25 % | 155.74 | 17 | 0.30 % | 109.05 |
| slovenski | slov | enski | 126 | 0.27 % | 121.73 | 36 | 0.32 % | 156.17 | 11 | 0.16 % | 37.22 | 74 | 0.33 % | 209.54 | 5 | 0.09 % | 32.07 |
| zadnjih | za | dnjih | 125 | 0.27 % | 120.76 | 37 | 0.33 % | 160.50 | 5 | 0.07 % | 16.92 | 76 | 0.34 % | 215.21 | 7 | 0.12 % | 44.90 |
| dolgo |  | dolgo | 122 | 0.26 % | 117.86 | 30 | 0.27 % | 130.14 | 54 | 0.77 % | 182.72 | 27 | 0.12 % | 76.45 | 11 | 0.19 % | 70.56 |
| zadnji | z | adnji | 120 | 0.26 % | 115.93 | 32 | 0.28 % | 138.81 | 33 | 0.47 % | 111.66 | 49 | 0.22 % | 138.75 | 6 | 0.11 % | 38.49 |
| novega | n | ovega | 106 | 0.23 % | 102.40 | 30 | 0.27 % | 130.14 | 24 | 0.34 % | 81.21 | 41 | 0.18 % | 116.10 | 11 | 0.19 % | 70.56 |
| stara |  | stara | 104 | 0.22 % | 100.47 | 31 | 0.28 % | 134.48 | 43 | 0.61 % | 145.50 | 16 | 0.07 % | 45.31 | 14 | 0.25 % | 89.81 |
| dobre |  | dobre | 103 | 0.22 % | 99.51 | 36 | 0.32 % | 156.17 | 23 | 0.33 % | 77.82 | 34 | 0.15 % | 96.28 | 10 | 0.18 % | 64.15 |
| najboljše | najb | oljše | 103 | 0.22 % | 99.51 | 34 | 0.30 % | 147.49 | 28 | 0.40 % | 94.74 | 18 | 0.08 % | 50.97 | 23 | 0.41 % | 147.54 |
| različnih | razl | ičnih | 101 | 0.22 % | 97.57 | 12 | 0.11 % | 52.06 | 13 | 0.19 % | 43.99 | 56 | 0.25 % | 158.57 | 20 | 0.35 % | 128.29 |
| veliki | v | eliki | 99 | 0.21 % | 95.64 | 29 | 0.26 % | 125.80 | 19 | 0.27 % | 64.29 | 43 | 0.19 % | 121.76 | 8 | 0.14 % | 51.32 |
| zadnje | z | adnje | 95 | 0.20 % | 91.78 | 28 | 0.25 % | 121.46 | 14 | 0.20 % | 47.37 | 43 | 0.19 % | 121.76 | 10 | 0.18 % | 64.15 |
| pravim | p | ravim | 89 | 0.19 % | 85.98 | 19 | 0.17 % | 82.42 | 38 | 0.54 % | 128.58 | 14 | 0.06 % | 39.64 | 18 | 0.32 % | 115.46 |
| različne | raz | lične | 86 | 0.19 % | 83.08 | 7 | 0.06 % | 30.37 | 12 | 0.17 % | 40.60 | 51 | 0.23 % | 144.41 | 16 | 0.28 % | 102.63 |
| kratko | k | ratko | 82 | 0.18 % | 79.22 | 14 | 0.12 % | 60.73 | 4 | 0.06 % | 13.53 | 54 | 0.24 % | 152.91 | 10 | 0.18 % | 64.15 |
| preden | p | reden | 78 | 0.17 % | 75.35 | 14 | 0.12 % | 60.73 | 16 | 0.23 % | 54.14 | 26 | 0.12 % | 73.62 | 22 | 0.39 % | 141.12 |
| rojstni | ro | jstni | 77 | 0.17 % | 74.39 | 41 | 0.36 % | 177.86 | 27 | 0.39 % | 91.36 | 5 | 0.02 % | 14.16 | 4 | 0.07 % | 25.66 |
| slovenskih | slove | nskih | 77 | 0.17 % | 74.39 | 23 | 0.20 % | 99.77 | 4 | 0.06 % | 13.53 | 49 | 0.22 % | 138.75 | 1 | 0.02 % | 6.41 |
| rečeno | r | ečeno | 76 | 0.16 % | 73.42 | 20 | 0.18 % | 86.76 | 4 | 0.06 % | 13.53 | 44 | 0.20 % | 124.59 | 8 | 0.14 % | 51.32 |
| boljši | b | oljši | 75 | 0.16 % | 72.46 | 19 | 0.17 % | 82.42 | 26 | 0.37 % | 87.97 | 22 | 0.10 % | 62.30 | 8 | 0.14 % | 51.32 |
| največji | naj | večji | 74 | 0.16 % | 71.49 | 21 | 0.19 % | 91.10 | 13 | 0.19 % | 43.99 | 29 | 0.13 % | 82.12 | 11 | 0.19 % | 70.56 |

2.2.111 List of initial character-level 1-grams from adjective lower-case word forms in the GOS 1.0 corpus with text-type distribution

File at CLARIN.SI: GOS1.0-word_parts-adjectives-lowercase_forms-initial-1grams-taxonomy-entire.tsv

| Lower-case word form | Initial part of the word | Rest of the word | Total absolute frequency of lower-case word form | Percentage of all found lower-case word forms | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N] |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| sam | s | am | 1,403 | 2.54 % | 1,355.42 | 173 | 1.29 % | 750.47 | 711 | 7.12 % | 2,405.75 | 228 | 0.92 % | 645.62 | 291 | 4.11 % | 1,866.66 |
| samo | s | amo | 1,220 | 2.21 % | 1,178.62 | 251 | 1.87 % | 1,088.83 | 335 | 3.35 % | 1,133.51 | 394 | 1.60 % | 1,115.68 | 240 | 3.39 % | 1,539.52 |
| dobro | d | obro | 1,154 | 2.09 % | 1,114.86 | 603 | 4.49 % | 2,615.79 | 131 | 1.31 % | 443.25 | 282 | 1.14 % | 798.53 | 138 | 1.95 % | 885.22 |
| malo | m | alo | 673 | 1.22 % | 650.17 | 223 | 1.66 % | 967.37 | 164 | 1.64 % | 554.91 | 133 | 0.54 % | 376.61 | 153 | 2.16 % | 981.44 |
| mal | m | al | 659 | 1.19 % | 636.65 | 180 | 1.34 % | 780.83 | 196 | 1.96 % | 663.19 | 123 | 0.50 % | 348.29 | 160 | 2.26 % | 1,026.34 |
| pravi | p | ravi | 455 | 0.82 % | 439.57 | 65 | 0.48 % | 281.97 | 26 | 0.26 % | 87.97 | 281 | 1.14 % | 795.70 | 83 | 1.17 % | 532.42 |
| dober | d | ober | 446 | 0.81 % | 430.87 | 170 | 1.27 % | 737.45 | 70 | 0.70 % | 236.85 | 155 | 0.63 % | 438.91 | 51 | 0.72 % | 327.15 |
| lepa | l | epa | 339 | 0.61 % | 327.50 | 122 | 0.91 % | 529.23 | 31 | 0.31 % | 104.89 | 153 | 0.62 % | 433.24 | 33 | 0.47 % | 211.68 |
| dobr | d | obr | 267 | 0.48 % | 257.94 | 74 | 0.55 % | 321.01 | 117 | 1.17 % | 395.88 | 22 | 0.09 % | 62.30 | 54 | 0.76 % | 346.39 |
| glavnem | g | lavnem | 262 | 0.47 % | 253.11 | 45 | 0.34 % | 195.21 | 141 | 1.41 % | 477.09 | 37 | 0.15 % | 104.77 | 39 | 0.55 % | 250.17 |
| celo | c | elo | 254 | 0.46 % | 245.39 | 70 | 0.52 % | 303.66 | 63 | 0.63 % | 213.17 | 81 | 0.33 % | 229.36 | 40 | 0.56 % | 256.59 |
| stari | s | tari | 236 | 0.43 % | 228 | 32 | 0.24 % | 138.81 | 170 | 1.70 % | 575.21 | 24 | 0.10 % | 67.96 | 10 | 0.14 % | 64.15 |
| lep | l | ep | 235 | 0.43 % | 227.03 | 120 | 0.89 % | 520.56 | 24 | 0.24 % | 81.21 | 70 | 0.28 % | 198.22 | 21 | 0.30 % | 134.71 |
| sama | s | ama | 202 | 0.37 % | 195.15 | 55 | 0.41 % | 238.59 | 64 | 0.64 % | 216.55 | 47 | 0.19 % | 133.09 | 36 | 0.51 % | 230.93 |
| cela | c | ela | 193 | 0.35 % | 186.45 | 19 | 0.14 % | 82.42 | 36 | 0.36 % | 121.81 | 89 | 0.36 % | 252.02 | 49 | 0.69 % | 314.32 |
| novo | n | ovo | 191 | 0.35 % | 184.52 | 59 | 0.44 % | 255.94 | 29 | 0.29 % | 98.12 | 71 | 0.29 % | 201.05 | 32 | 0.45 % | 205.27 |
| cel | c | el | 160 | 0.29 % | 154.57 | 34 | 0.25 % | 147.49 | 82 | 0.82 % | 277.46 | 25 | 0.10 % | 70.79 | 19 | 0.27 % | 121.88 |
| dobra | d | obra | 153 | 0.28 % | 147.81 | 69 | 0.51 % | 299.32 | 30 | 0.30 % | 101.51 | 34 | 0.14 % | 96.28 | 20 | 0.28 % | 128.29 |
| sami | s | ami | 148 | 0.27 % | 142.98 | 24 | 0.18 % | 104.11 | 32 | 0.32 % | 108.28 | 70 | 0.28 % | 198.22 | 22 | 0.31 % | 141.12 |
| mav | m | av | 146 | 0.27 % | 141.05 | 6 | 0.04 % | 26.03 | 116 | 1.16 % | 392.50 | 4 | 0.02 % | 11.33 | 20 | 0.28 % | 128.29 |
| prav | p | rav | 144 | 0.26 % | 139.12 | 8 | 0.06 % | 34.70 | 41 | 0.41 % | 138.73 | 61 | 0.25 % | 172.73 | 34 | 0.48 % | 218.10 |
| jasno | j | asno | 141 | 0.26 % | 136.22 | 32 | 0.24 % | 138.81 | 15 | 0.15 % | 50.75 | 65 | 0.26 % | 184.06 | 29 | 0.41 % | 186.03 |
| slovenski | s | lovenski | 140 | 0.25 % | 135.25 | 41 | 0.30 % | 177.86 | 11 | 0.11 % | 37.22 | 82 | 0.33 % | 232.20 | 6 | 0.09 % | 38.49 |
| nov | n | ov | 134 | 0.24 % | 129.46 | 67 | 0.50 % | 290.64 | 22 | 0.22 % | 74.44 | 32 | 0.13 % | 90.61 | 13 | 0.18 % | 83.39 |
| nove | n | ove | 131 | 0.24 % | 126.56 | 29 | 0.22 % | 125.80 | 17 | 0.17 % | 57.52 | 77 | 0.31 % | 218.04 | 8 | 0.11 % | 51.32 |
| zadnjih | z | adnjih | 120 | 0.22 % | 115.93 | 36 | 0.27 % | 156.17 | 3 | 0.03 % | 10.15 | 76 | 0.31 % | 215.21 | 5 | 0.07 % | 32.07 |
| velik | v | elik | 109 | 0.20 % | 105.30 | 27 | 0.20 % | 117.12 | 11 | 0.11 % | 37.22 | 64 | 0.26 % | 181.23 | 7 | 0.10 % | 44.90 |
| nova | n | ova | 108 | 0.20 % | 104.34 | 32 | 0.24 % | 138.81 | 13 | 0.13 % | 43.99 | 50 | 0.20 % | 141.58 | 13 | 0.18 % | 83.39 |
| slovenske | s | lovenske | 106 | 0.19 % | 102.40 | 27 | 0.20 % | 117.12 | 2 | 0.02 % | 6.77 | 74 | 0.30 % | 209.54 | 3 | 0.04 % | 19.24 |
| stara | s | tara | 106 | 0.19 % | 102.40 | 31 | 0.23 % | 134.48 | 43 | 0.43 % | 145.50 | 18 | 0.07 % | 50.97 | 14 | 0.20 % | 89.81 |
| velika | v | elika | 105 | 0.19 % | 101.44 | 24 | 0.18 % | 104.11 | 15 | 0.15 % | 50.75 | 52 | 0.21 % | 147.25 | 14 | 0.20 % | 89.81 |
| dobre | d | obre | 103 | 0.19 % | 99.51 | 35 | 0.26 % | 151.83 | 24 | 0.24 % | 81.21 | 34 | 0.14 % | 96.28 | 10 | 0.14 % | 64.15 |

2.2.112 List of initial character-level 2-grams from adjective lower-case word forms in the GOS 1.0 corpus with text-type distribution

File at CLARIN.SI: GOS1.0-word_parts-adjectives-lowercase_forms-initial-2grams-taxonomy-entire.tsv

| Lower-case word form | Initial part of the word | Rest of the word | Total absolute frequency of lower-case word form | Percentage of all found lower-case word forms | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N] |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| sam | sa | m | 1,403 | 2.54 % | 1,355.42 | 173 | 1.29 % | 750.47 | 711 | 7.12 % | 2,405.75 | 228 | 0.92 % | 645.62 | 291 | 4.11 % | 1,866.66 |
| samo | sa | mo | 1,220 | 2.21 % | 1,178.62 | 251 | 1.87 % | 1,088.83 | 335 | 3.36 % | 1,133.51 | 394 | 1.60 % | 1,115.68 | 240 | 3.39 % | 1,539.52 |
| dobro | do | bro | 1,154 | 2.09 % | 1,114.86 | 603 | 4.49 % | 2,615.79 | 131 | 1.31 % | 443.25 | 282 | 1.14 % | 798.53 | 138 | 1.95 % | 885.22 |
| malo | ma | lo | 673 | 1.22 % | 650.17 | 223 | 1.66 % | 967.37 | 164 | 1.64 % | 554.91 | 133 | 0.54 % | 376.61 | 153 | 2.16 % | 981.44 |
| mal | ma | l | 659 | 1.19 % | 636.65 | 180 | 1.34 % | 780.83 | 196 | 1.96 % | 663.19 | 123 | 0.50 % | 348.29 | 160 | 2.26 % | 1,026.34 |
| pravi | pr | avi | 455 | 0.82 % | 439.57 | 65 | 0.48 % | 281.97 | 26 | 0.26 % | 87.97 | 281 | 1.14 % | 795.70 | 83 | 1.17 % | 532.42 |
| dober | do | ber | 446 | 0.81 % | 430.87 | 170 | 1.27 % | 737.45 | 70 | 0.70 % | 236.85 | 155 | 0.63 % | 438.91 | 51 | 0.72 % | 327.15 |
| lepa | le | pa | 339 | 0.61 % | 327.50 | 122 | 0.91 % | 529.23 | 31 | 0.31 % | 104.89 | 153 | 0.62 % | 433.24 | 33 | 0.47 % | 211.68 |
| dobr | do | br | 267 | 0.48 % | 257.94 | 74 | 0.55 % | 321.01 | 117 | 1.17 % | 395.88 | 22 | 0.09 % | 62.30 | 54 | 0.76 % | 346.39 |
| glavnem | gl | avnem | 262 | 0.47 % | 253.11 | 45 | 0.34 % | 195.21 | 141 | 1.41 % | 477.09 | 37 | 0.15 % | 104.77 | 39 | 0.55 % | 250.17 |
| celo | ce | lo | 254 | 0.46 % | 245.39 | 70 | 0.52 % | 303.66 | 63 | 0.63 % | 213.17 | 81 | 0.33 % | 229.36 | 40 | 0.56 % | 256.59 |
| stari | st | ari | 236 | 0.43 % | 228 | 32 | 0.24 % | 138.81 | 170 | 1.70 % | 575.21 | 24 | 0.10 % | 67.96 | 10 | 0.14 % | 64.15 |
| lep | le | p | 235 | 0.43 % | 227.03 | 120 | 0.89 % | 520.56 | 24 | 0.24 % | 81.21 | 70 | 0.28 % | 198.22 | 21 | 0.30 % | 134.71 |
| sama | sa | ma | 202 | 0.37 % | 195.15 | 55 | 0.41 % | 238.59 | 64 | 0.64 % | 216.55 | 47 | 0.19 % | 133.09 | 36 | 0.51 % | 230.93 |
| cela | ce | la | 193 | 0.35 % | 186.45 | 19 | 0.14 % | 82.42 | 36 | 0.36 % | 121.81 | 89 | 0.36 % | 252.02 | 49 | 0.69 % | 314.32 |
| novo | no | vo | 191 | 0.35 % | 184.52 | 59 | 0.44 % | 255.94 | 29 | 0.29 % | 98.12 | 71 | 0.29 % | 201.05 | 32 | 0.45 % | 205.27 |
| cel | ce | l | 160 | 0.29 % | 154.57 | 34 | 0.25 % | 147.49 | 82 | 0.82 % | 277.46 | 25 | 0.10 % | 70.79 | 19 | 0.27 % | 121.88 |
| dobra | do | bra | 153 | 0.28 % | 147.81 | 69 | 0.51 % | 299.32 | 30 | 0.30 % | 101.51 | 34 | 0.14 % | 96.28 | 20 | 0.28 % | 128.29 |
| sami | sa | mi | 148 | 0.27 % | 142.98 | 24 | 0.18 % | 104.11 | 32 | 0.32 % | 108.28 | 70 | 0.28 % | 198.22 | 22 | 0.31 % | 141.12 |
| mav | ma | v | 146 | 0.27 % | 141.05 | 6 | 0.04 % | 26.03 | 116 | 1.16 % | 392.50 | 4 | 0.02 % | 11.33 | 20 | 0.28 % | 128.29 |
| prav | pr | av | 144 | 0.26 % | 139.12 | 8 | 0.06 % | 34.70 | 41 | 0.41 % | 138.73 | 61 | 0.25 % | 172.73 | 34 | 0.48 % | 218.10 |
| jasno | ja | sno | 141 | 0.26 % | 136.22 | 32 | 0.24 % | 138.81 | 15 | 0.15 % | 50.75 | 65 | 0.26 % | 184.06 | 29 | 0.41 % | 186.03 |
| slovenski | sl | ovenski | 140 | 0.25 % | 135.25 | 41 | 0.30 % | 177.86 | 11 | 0.11 % | 37.22 | 82 | 0.33 % | 232.20 | 6 | 0.09 % | 38.49 |
| nov | no | v | 134 | 0.24 % | 129.46 | 67 | 0.50 % | 290.64 | 22 | 0.22 % | 74.44 | 32 | 0.13 % | 90.61 | 13 | 0.18 % | 83.39 |
| nove | no | ve | 131 | 0.24 % | 126.56 | 29 | 0.22 % | 125.80 | 17 | 0.17 % | 57.52 | 77 | 0.31 % | 218.04 | 8 | 0.11 % | 51.32 |
| zadnjih | za | dnjih | 120 | 0.22 % | 115.93 | 36 | 0.27 % | 156.17 | 3 | 0.03 % | 10.15 | 76 | 0.31 % | 215.21 | 5 | 0.07 % | 32.07 |
| velik | ve | lik | 109 | 0.20 % | 105.30 | 27 | 0.20 % | 117.12 | 11 | 0.11 % | 37.22 | 64 | 0.26 % | 181.23 | 7 | 0.10 % | 44.90 |
| nova | no | va | 108 | 0.20 % | 104.34 | 32 | 0.24 % | 138.81 | 13 | 0.13 % | 43.99 | 50 | 0.20 % | 141.58 | 13 | 0.18 % | 83.39 |
| slovenske | sl | ovenske | 106 | 0.19 % | 102.40 | 27 | 0.20 % | 117.12 | 2 | 0.02 % | 6.77 | 74 | 0.30 % | 209.54 | 3 | 0.04 % | 19.24 |
| stara | st | ara | 106 | 0.19 % | 102.40 | 31 | 0.23 % | 134.48 | 43 | 0.43 % | 145.50 | 18 | 0.07 % | 50.97 | 14 | 0.20 % | 89.81 |
| velika | ve | lika | 105 | 0.19 % | 101.44 | 24 | 0.18 % | 104.11 | 15 | 0.15 % | 50.75 | 52 | 0.21 % | 147.25 | 14 | 0.20 % | 89.81 |
| dobre | do | bre | 103 | 0.19 % | 99.51 | 35 | 0.26 % | 151.83 | 24 | 0.24 % | 81.21 | 34 | 0.14 % | 96.28 | 10 | 0.14 % | 64.15 |

2.2.113 List of initial character-level 3-grams from adjective lower-case word forms in the GOS 1.0 corpus with text-type distribution

File at CLARIN.SI: GOS1.0-word_parts-adjectives-lowercase_forms-initial-3grams-taxonomy-entire.tsv

| Lower-case word form | Initial part of the word | Rest of the word | Total absolute frequency of lower-case word form | Percentage of all found lower-case word forms | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N] |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| sam | sam |  | 1,403 | 2.55 % | 1,355.42 | 173 | 1.29 % | 750.47 | 711 | 7.17 % | 2,405.75 | 228 | 0.92 % | 645.62 | 291 | 4.12 % | 1,866.66 |
| samo | sam | o | 1,220 | 2.21 % | 1,178.62 | 251 | 1.87 % | 1,088.83 | 335 | 3.38 % | 1,133.51 | 394 | 1.60 % | 1,115.68 | 240 | 3.40 % | 1,539.52 |
| dobro | dob | ro | 1,154 | 2.10 % | 1,114.86 | 603 | 4.49 % | 2,615.79 | 131 | 1.32 % | 443.25 | 282 | 1.14 % | 798.53 | 138 | 1.95 % | 885.22 |
| malo | mal | o | 673 | 1.22 % | 650.17 | 223 | 1.66 % | 967.37 | 164 | 1.65 % | 554.91 | 133 | 0.54 % | 376.61 | 153 | 2.16 % | 981.44 |
| mal | mal |  | 659 | 1.20 % | 636.65 | 180 | 1.34 % | 780.83 | 196 | 1.98 % | 663.19 | 123 | 0.50 % | 348.29 | 160 | 2.26 % | 1,026.34 |
| pravi | pra | vi | 455 | 0.83 % | 439.57 | 65 | 0.48 % | 281.97 | 26 | 0.26 % | 87.97 | 281 | 1.14 % | 795.70 | 83 | 1.17 % | 532.42 |
| dober | dob | er | 446 | 0.81 % | 430.87 | 170 | 1.27 % | 737.45 | 70 | 0.71 % | 236.85 | 155 | 0.63 % | 438.91 | 51 | 0.72 % | 327.15 |
| lepa | lep | a | 339 | 0.61 % | 327.50 | 122 | 0.91 % | 529.23 | 31 | 0.31 % | 104.89 | 153 | 0.62 % | 433.24 | 33 | 0.47 % | 211.68 |
| dobr | dob | r | 267 | 0.48 % | 257.94 | 74 | 0.55 % | 321.01 | 117 | 1.18 % | 395.88 | 22 | 0.09 % | 62.30 | 54 | 0.76 % | 346.39 |
| glavnem | gla | vnem | 262 | 0.48 % | 253.11 | 45 | 0.34 % | 195.21 | 141 | 1.42 % | 477.09 | 37 | 0.15 % | 104.77 | 39 | 0.55 % | 250.17 |
| celo | cel | o | 254 | 0.46 % | 245.39 | 70 | 0.52 % | 303.66 | 63 | 0.64 % | 213.17 | 81 | 0.33 % | 229.36 | 40 | 0.57 % | 256.59 |
| stari | sta | ri | 236 | 0.43 % | 228 | 32 | 0.24 % | 138.81 | 170 | 1.72 % | 575.21 | 24 | 0.10 % | 67.96 | 10 | 0.14 % | 64.15 |
| lep | lep |  | 235 | 0.43 % | 227.03 | 120 | 0.89 % | 520.56 | 24 | 0.24 % | 81.21 | 70 | 0.28 % | 198.22 | 21 | 0.30 % | 134.71 |
| sama | sam | a | 202 | 0.37 % | 195.15 | 55 | 0.41 % | 238.59 | 64 | 0.65 % | 216.55 | 47 | 0.19 % | 133.09 | 36 | 0.51 % | 230.93 |
| cela | cel | a | 193 | 0.35 % | 186.45 | 19 | 0.14 % | 82.42 | 36 | 0.36 % | 121.81 | 89 | 0.36 % | 252.02 | 49 | 0.69 % | 314.32 |
| novo | nov | o | 191 | 0.35 % | 184.52 | 59 | 0.44 % | 255.94 | 29 | 0.29 % | 98.12 | 71 | 0.29 % | 201.05 | 32 | 0.45 % | 205.27 |
| cel | cel |  | 160 | 0.29 % | 154.57 | 34 | 0.25 % | 147.49 | 82 | 0.83 % | 277.46 | 25 | 0.10 % | 70.79 | 19 | 0.27 % | 121.88 |
| dobra | dob | ra | 153 | 0.28 % | 147.81 | 69 | 0.51 % | 299.32 | 30 | 0.30 % | 101.51 | 34 | 0.14 % | 96.28 | 20 | 0.28 % | 128.29 |
| sami | sam | i | 148 | 0.27 % | 142.98 | 24 | 0.18 % | 104.11 | 32 | 0.32 % | 108.28 | 70 | 0.28 % | 198.22 | 22 | 0.31 % | 141.12 |
| mav | mav |  | 146 | 0.27 % | 141.05 | 6 | 0.04 % | 26.03 | 116 | 1.17 % | 392.50 | 4 | 0.02 % | 11.33 | 20 | 0.28 % | 128.29 |
| prav | pra | v | 144 | 0.26 % | 139.12 | 8 | 0.06 % | 34.70 | 41 | 0.41 % | 138.73 | 61 | 0.25 % | 172.73 | 34 | 0.48 % | 218.10 |
| jasno | jas | no | 141 | 0.26 % | 136.22 | 32 | 0.24 % | 138.81 | 15 | 0.15 % | 50.75 | 65 | 0.26 % | 184.06 | 29 | 0.41 % | 186.03 |
| slovenski | slo | venski | 140 | 0.25 % | 135.25 | 41 | 0.31 % | 177.86 | 11 | 0.11 % | 37.22 | 82 | 0.33 % | 232.20 | 6 | 0.09 % | 38.49 |
| nov | nov |  | 134 | 0.24 % | 129.46 | 67 | 0.50 % | 290.64 | 22 | 0.22 % | 74.44 | 32 | 0.13 % | 90.61 | 13 | 0.18 % | 83.39 |
| nove | nov | e | 131 | 0.24 % | 126.56 | 29 | 0.22 % | 125.80 | 17 | 0.17 % | 57.52 | 77 | 0.31 % | 218.04 | 8 | 0.11 % | 51.32 |
| zadnjih | zad | njih | 120 | 0.22 % | 115.93 | 36 | 0.27 % | 156.17 | 3 | 0.03 % | 10.15 | 76 | 0.31 % | 215.21 | 5 | 0.07 % | 32.07 |
| velik | vel | ik | 109 | 0.20 % | 105.30 | 27 | 0.20 % | 117.12 | 11 | 0.11 % | 37.22 | 64 | 0.26 % | 181.23 | 7 | 0.10 % | 44.90 |
| nova | nov | a | 108 | 0.20 % | 104.34 | 32 | 0.24 % | 138.81 | 13 | 0.13 % | 43.99 | 50 | 0.20 % | 141.58 | 13 | 0.18 % | 83.39 |
| slovenske | slo | venske | 106 | 0.19 % | 102.40 | 27 | 0.20 % | 117.12 | 2 | 0.02 % | 6.77 | 74 | 0.30 % | 209.54 | 3 | 0.04 % | 19.24 |
| stara | sta | ra | 106 | 0.19 % | 102.40 | 31 | 0.23 % | 134.48 | 43 | 0.43 % | 145.50 | 18 | 0.07 % | 50.97 | 14 | 0.20 % | 89.81 |
| velika | vel | ika | 105 | 0.19 % | 101.44 | 24 | 0.18 % | 104.11 | 15 | 0.15 % | 50.75 | 52 | 0.21 % | 147.25 | 14 | 0.20 % | 89.81 |
| dobre | dob | re | 103 | 0.19 % | 99.51 | 35 | 0.26 % | 151.83 | 24 | 0.24 % | 81.21 | 34 | 0.14 % | 96.28 | 10 | 0.14 % | 64.15 |

2.2.114 List of initial character-level 4-grams from adjective lower-case word forms in the GOS 1.0 corpus with text-type distribution

File at CLARIN.SI: GOS1.0-word_parts-adjectives-lowercase_forms-initial-4grams-taxonomy-entire.tsv

| Lower-case word form | Initial part of the word | Rest of the word | Total absolute frequency of lower-case word form | Percentage of all found lower-case word forms | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N] |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| samo | samo |  | 1,220 | 2.35 % | 1,178.62 | 251 | 1.98 % | 1,088.83 | 335 | 3.95 % | 1,133.51 | 394 | 1.63 % | 1,115.68 | 240 | 3.69 % | 1,539.52 |
| dobro | dobr | o | 1,154 | 2.23 % | 1,114.86 | 603 | 4.75 % | 2,615.79 | 131 | 1.54 % | 443.25 | 282 | 1.17 % | 798.53 | 138 | 2.12 % | 885.22 |
| malo | malo |  | 673 | 1.30 % | 650.17 | 223 | 1.76 % | 967.37 | 164 | 1.94 % | 554.91 | 133 | 0.55 % | 376.61 | 153 | 2.35 % | 981.44 |
| pravi | prav | i | 455 | 0.88 % | 439.57 | 65 | 0.51 % | 281.97 | 26 | 0.31 % | 87.97 | 281 | 1.16 % | 795.70 | 83 | 1.28 % | 532.42 |
| dober | dobe | r | 446 | 0.86 % | 430.87 | 170 | 1.34 % | 737.45 | 70 | 0.83 % | 236.85 | 155 | 0.64 % | 438.91 | 51 | 0.78 % | 327.15 |
| lepa | lepa |  | 339 | 0.65 % | 327.50 | 122 | 0.96 % | 529.23 | 31 | 0.37 % | 104.89 | 153 | 0.63 % | 433.24 | 33 | 0.51 % | 211.68 |
| dobr | dobr |  | 267 | 0.52 % | 257.94 | 74 | 0.58 % | 321.01 | 117 | 1.38 % | 395.88 | 22 | 0.09 % | 62.30 | 54 | 0.83 % | 346.39 |
| glavnem | glav | nem | 262 | 0.51 % | 253.11 | 45 | 0.35 % | 195.21 | 141 | 1.66 % | 477.09 | 37 | 0.15 % | 104.77 | 39 | 0.60 % | 250.17 |
| celo | celo |  | 254 | 0.49 % | 245.39 | 70 | 0.55 % | 303.66 | 63 | 0.74 % | 213.17 | 81 | 0.34 % | 229.36 | 40 | 0.61 % | 256.59 |
| stari | star | i | 236 | 0.46 % | 228 | 32 | 0.25 % | 138.81 | 170 | 2.00 % | 575.21 | 24 | 0.10 % | 67.96 | 10 | 0.15 % | 64.15 |
| sama | sama |  | 202 | 0.39 % | 195.15 | 55 | 0.43 % | 238.59 | 64 | 0.76 % | 216.55 | 47 | 0.20 % | 133.09 | 36 | 0.55 % | 230.93 |
| cela | cela |  | 193 | 0.37 % | 186.45 | 19 | 0.15 % | 82.42 | 36 | 0.42 % | 121.81 | 89 | 0.37 % | 252.02 | 49 | 0.75 % | 314.32 |
| novo | novo |  | 191 | 0.37 % | 184.52 | 59 | 0.47 % | 255.94 | 29 | 0.34 % | 98.12 | 71 | 0.29 % | 201.05 | 32 | 0.49 % | 205.27 |
| dobra | dobr | a | 153 | 0.29 % | 147.81 | 69 | 0.54 % | 299.32 | 30 | 0.35 % | 101.51 | 34 | 0.14 % | 96.28 | 20 | 0.31 % | 128.29 |
| sami | sami |  | 148 | 0.29 % | 142.98 | 24 | 0.19 % | 104.11 | 32 | 0.38 % | 108.28 | 70 | 0.29 % | 198.22 | 22 | 0.34 % | 141.12 |
| prav | prav |  | 144 | 0.28 % | 139.12 | 8 | 0.06 % | 34.70 | 41 | 0.48 % | 138.73 | 61 | 0.25 % | 172.73 | 34 | 0.52 % | 218.10 |
| jasno | jasn | o | 141 | 0.27 % | 136.22 | 32 | 0.25 % | 138.81 | 15 | 0.18 % | 50.75 | 65 | 0.27 % | 184.06 | 29 | 0.45 % | 186.03 |
| slovenski | slov | enski | 140 | 0.27 % | 135.25 | 41 | 0.32 % | 177.86 | 11 | 0.13 % | 37.22 | 82 | 0.34 % | 232.20 | 6 | 0.09 % | 38.49 |
| nove | nove |  | 131 | 0.25 % | 126.56 | 29 | 0.23 % | 125.80 | 17 | 0.20 % | 57.52 | 77 | 0.32 % | 218.04 | 8 | 0.12 % | 51.32 |
| zadnjih | zadn | jih | 120 | 0.23 % | 115.93 | 36 | 0.28 % | 156.17 | 3 | 0.04 % | 10.15 | 76 | 0.32 % | 215.21 | 5 | 0.08 % | 32.07 |
| velik | veli | k | 109 | 0.21 % | 105.30 | 27 | 0.21 % | 117.12 | 11 | 0.13 % | 37.22 | 64 | 0.27 % | 181.23 | 7 | 0.11 % | 44.90 |
| nova | nova |  | 108 | 0.21 % | 104.34 | 32 | 0.25 % | 138.81 | 13 | 0.15 % | 43.99 | 50 | 0.21 % | 141.58 | 13 | 0.20 % | 83.39 |
| slovenske | slov | enske | 106 | 0.20 % | 102.40 | 27 | 0.21 % | 117.12 | 2 | 0.02 % | 6.77 | 74 | 0.31 % | 209.54 | 3 | 0.05 % | 19.24 |
| stara | star | a | 106 | 0.20 % | 102.40 | 31 | 0.24 % | 134.48 | 43 | 0.51 % | 145.50 | 18 | 0.07 % | 50.97 | 14 | 0.21 % | 89.81 |
| velika | veli | ka | 105 | 0.20 % | 101.44 | 24 | 0.19 % | 104.11 | 15 | 0.18 % | 50.75 | 52 | 0.21 % | 147.25 | 14 | 0.21 % | 89.81 |
| dobre | dobr | e | 103 | 0.20 % | 99.51 | 35 | 0.28 % | 151.83 | 24 | 0.28 % | 81.21 | 34 | 0.14 % | 96.28 | 10 | 0.15 % | 64.15 |
| naslednji | nasl | ednji | 101 | 0.20 % | 97.57 | 41 | 0.32 % | 177.86 | 14 | 0.17 % | 47.37 | 35 | 0.14 % | 99.11 | 11 | 0.17 % | 70.56 |
| fajn | fajn |  | 100 | 0.19 % | 96.61 | 40 | 0.32 % | 173.52 | 40 | 0.47 % | 135.34 | 7 | 0.03 % | 19.82 | 13 | 0.20 % | 83.39 |
| različnih | razl | ičnih | 99 | 0.19 % | 95.64 | 12 | 0.09 % | 52.06 | 11 | 0.13 % | 37.22 | 56 | 0.23 % | 158.57 | 20 | 0.31 % | 128.29 |
| boljše | bolj | še | 92 | 0.18 % | 88.88 | 23 | 0.18 % | 99.77 | 14 | 0.17 % | 47.37 | 34 | 0.14 % | 96.28 | 21 | 0.32 % | 134.71 |
| novi | novi |  | 92 | 0.18 % | 88.88 | 35 | 0.28 % | 151.83 | 17 | 0.20 % | 57.52 | 32 | 0.13 % | 90.61 | 8 | 0.12 % | 51.32 |
| slovenskih | slov | enskih | 89 | 0.17 % | 85.98 | 29 | 0.23 % | 125.80 | 5 | 0.06 % | 16.92 | 53 | 0.22 % | 150.08 | 2 | 0.03 % | 12.83 |

2.2.115 List of initial character-level 5-grams from adjective lower-case word forms in the GOS 1.0 corpus with text-type distribution

File at CLARIN.SI: GOS1.0-word_parts-adjectives-lowercase_forms-initial-5grams-taxonomy-entire.tsv

| Lower-case word form | Initial part of the word | Rest of the word | Total absolute frequency of lower-case word form | Percentage of all found lower-case word forms | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N] |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| dobro | dobro |  | 1,154 | 2.54 % | 1,114.86 | 603 | 5.45 % | 2,615.79 | 131 | 1.99 % | 443.25 | 282 | 1.27 % | 798.53 | 138 | 2.50 % | 885.22 |
| pravi | pravi |  | 455 | 1.00 % | 439.57 | 65 | 0.59 % | 281.97 | 26 | 0.40 % | 87.97 | 281 | 1.26 % | 795.70 | 83 | 1.50 % | 532.42 |
| dober | dober |  | 446 | 0.98 % | 430.87 | 170 | 1.53 % | 737.45 | 70 | 1.06 % | 236.85 | 155 | 0.70 % | 438.91 | 51 | 0.92 % | 327.15 |
| glavnem | glavn | em | 262 | 0.58 % | 253.11 | 45 | 0.41 % | 195.21 | 141 | 2.14 % | 477.09 | 37 | 0.17 % | 104.77 | 39 | 0.71 % | 250.17 |
| stari | stari |  | 236 | 0.52 % | 228 | 32 | 0.29 % | 138.81 | 170 | 2.58 % | 575.21 | 24 | 0.11 % | 67.96 | 10 | 0.18 % | 64.15 |
| dobra | dobra |  | 153 | 0.34 % | 147.81 | 69 | 0.62 % | 299.32 | 30 | 0.46 % | 101.51 | 34 | 0.15 % | 96.28 | 20 | 0.36 % | 128.29 |
| jasno | jasno |  | 141 | 0.31 % | 136.22 | 32 | 0.29 % | 138.81 | 15 | 0.23 % | 50.75 | 65 | 0.29 % | 184.06 | 29 | 0.53 % | 186.03 |
| slovenski | slove | nski | 140 | 0.31 % | 135.25 | 41 | 0.37 % | 177.86 | 11 | 0.17 % | 37.22 | 82 | 0.37 % | 232.20 | 6 | 0.11 % | 38.49 |
| zadnjih | zadnj | ih | 120 | 0.26 % | 115.93 | 36 | 0.33 % | 156.17 | 3 | 0.05 % | 10.15 | 76 | 0.34 % | 215.21 | 5 | 0.09 % | 32.07 |
| velik | velik |  | 109 | 0.24 % | 105.30 | 27 | 0.24 % | 117.12 | 11 | 0.17 % | 37.22 | 64 | 0.29 % | 181.23 | 7 | 0.13 % | 44.90 |
| slovenske | slove | nske | 106 | 0.23 % | 102.40 | 27 | 0.24 % | 117.12 | 2 | 0.03 % | 6.77 | 74 | 0.33 % | 209.54 | 3 | 0.05 % | 19.24 |
| stara | stara |  | 106 | 0.23 % | 102.40 | 31 | 0.28 % | 134.48 | 43 | 0.65 % | 145.50 | 18 | 0.08 % | 50.97 | 14 | 0.25 % | 89.81 |
| velika | velik | a | 105 | 0.23 % | 101.44 | 24 | 0.22 % | 104.11 | 15 | 0.23 % | 50.75 | 52 | 0.23 % | 147.25 | 14 | 0.25 % | 89.81 |
| dobre | dobre |  | 103 | 0.23 % | 99.51 | 35 | 0.32 % | 151.83 | 24 | 0.36 % | 81.21 | 34 | 0.15 % | 96.28 | 10 | 0.18 % | 64.15 |
| naslednji | nasle | dnji | 101 | 0.22 % | 97.57 | 41 | 0.37 % | 177.86 | 14 | 0.21 % | 47.37 | 35 | 0.16 % | 99.11 | 11 | 0.20 % | 70.56 |
| različnih | razli | čnih | 99 | 0.22 % | 95.64 | 12 | 0.11 % | 52.06 | 11 | 0.17 % | 37.22 | 56 | 0.25 % | 158.57 | 20 | 0.36 % | 128.29 |
| boljše | boljš | e | 92 | 0.20 % | 88.88 | 23 | 0.21 % | 99.77 | 14 | 0.21 % | 47.37 | 34 | 0.15 % | 96.28 | 21 | 0.38 % | 134.71 |
| slovenskih | slove | nskih | 89 | 0.20 % | 85.98 | 29 | 0.26 % | 125.80 | 5 | 0.08 % | 16.92 | 53 | 0.24 % | 150.08 | 2 | 0.04 % | 12.83 |
| zadnji | zadnj | i | 87 | 0.19 % | 84.05 | 27 | 0.24 % | 117.12 | 14 | 0.21 % | 47.37 | 41 | 0.18 % | 116.10 | 5 | 0.09 % | 32.07 |
| novega | noveg | a | 86 | 0.19 % | 83.08 | 26 | 0.23 % | 112.79 | 9 | 0.14 % | 30.45 | 42 | 0.19 % | 118.93 | 9 | 0.16 % | 57.73 |
| velike | velik | e | 86 | 0.19 % | 83.08 | 19 | 0.17 % | 82.42 | 9 | 0.14 % | 30.45 | 50 | 0.23 % | 141.58 | 8 | 0.14 % | 51.32 |
| različne | razli | čne | 85 | 0.19 % | 82.12 | 7 | 0.06 % | 30.37 | 11 | 0.17 % | 37.22 | 51 | 0.23 % | 144.41 | 16 | 0.29 % | 102.63 |
| slovenska | slove | nska | 84 | 0.18 % | 81.15 | 21 | 0.19 % | 91.10 | 4 | 0.06 % | 13.53 | 58 | 0.26 % | 164.24 | 1 | 0.02 % | 6.41 |
| kratko | kratk | o | 82 | 0.18 % | 79.22 | 14 | 0.13 % | 60.73 | 4 | 0.06 % | 13.53 | 54 | 0.24 % | 152.91 | 10 | 0.18 % | 64.15 |
| evropske | evrop | ske | 81 | 0.18 % | 78.25 | 7 | 0.06 % | 30.37 | 1 | 0.01 % | 3.38 | 72 | 0.32 % | 203.88 | 1 | 0.02 % | 6.41 |
| najlepša | najle | pša | 73 | 0.16 % | 70.52 | 19 | 0.17 % | 82.42 | 7 | 0.11 % | 23.69 | 25 | 0.11 % | 70.79 | 22 | 0.40 % | 141.12 |
| dolgo | dolgo |  | 72 | 0.16 % | 69.56 | 17 | 0.15 % | 73.75 | 21 | 0.32 % | 71.06 | 25 | 0.11 % | 70.79 | 9 | 0.16 % | 57.73 |
| največji | najve | čji | 72 | 0.16 % | 69.56 | 20 | 0.18 % | 86.76 | 14 | 0.21 % | 47.37 | 28 | 0.13 % | 79.29 | 10 | 0.18 % | 64.15 |
| rečeno | rečen | o | 69 | 0.15 % | 66.66 | 19 | 0.17 % | 82.42 | 3 | 0.05 % | 10.15 | 40 | 0.18 % | 113.27 | 7 | 0.13 % | 44.90 |
| večje | večje |  | 69 | 0.15 % | 66.66 | 12 | 0.11 % | 52.06 | 7 | 0.11 % | 23.69 | 37 | 0.17 % | 104.77 | 13 | 0.23 % | 83.39 |
| zadnje | zadnj | e | 69 | 0.15 % | 66.66 | 24 | 0.22 % | 104.11 | 9 | 0.14 % | 30.45 | 27 | 0.12 % | 76.45 | 9 | 0.16 % | 57.73 |
| prava | prava |  | 66 | 0.14 % | 63.76 | 19 | 0.17 % | 82.42 | 9 | 0.14 % | 30.45 | 32 | 0.14 % | 90.61 | 6 | 0.11 % | 38.49 |

CJVT // A Guide to Frequency Lists from the Gigafida
2.0 and GOS 1.0 Corpora // 459 File at CLARIN.SI2.2.116 List of final character-level 1-grams from adjective lower-case word forms in the GOS 1.0 corpus with text- type distributionGOS1.0-word_parts-adjectives-lowercase_ forms-final-1grams-taxonomy-entire.tsvLower-case word form Rest of the word Final part of the word Total absolute frequency of lower-case word form Percentage of all found lower-case word forms Total relative frequency (per million occurrences) Absolute frequency [gos.T.J.R] Percentage [gos.T.J.R] Relative frequency [gos.T.J.R] Absolute frequency [gos.T.N.Z] Percentage [gos.T.N.Z] Relative frequency [gos.T.N.Z] Absolute frequency [gos.T.J.I] Percentage [gos.T.J.I] Relative frequency [gos.T.J.I] Absolute frequency [gos.T.N.N] Percentage [gos.T.N.N] Relative frequency [gos.T.N.N] sam sa m 1,403 2.54 % 1,355.42 173 1.29 % 750.47 711 7.12 % 2,405.75 228 0.92 % 645.62 291 4.11 % 1,866.66 samo sam o 1,220 2.21 % 1,178.62 251 1.87 % 1,088.83 335 3.35 % 1,133.51 394 1.60 % 1,115.68 240 3.39 % 1,539.52 dobro dobr o 1,154 2.09 % 1,114.86 603 4.49 % 2,615.79 131 1.31 % 443.25 282 1.14 % 798.53 138 1.95 % 885.22 malo mal o 673 1.22 % 650.17 223 1.66 % 967.37 164 1.64 % 554.91 133 0.54 % 376.61 153 2.16 % 981.44 mal ma l 659 1.19 % 636.65 180 1.34 % 780.83 196 1.96 % 663.19 123 0.50 % 348.29 160 2.26 % 1,026.34 pravi prav i 455 0.82 % 439.57 65 0.48 % 281.97 26 0.26 % 87.97 281 1.14 % 795.70 83 1.17 % 532.42 dober dobe r 446 0.81 % 430.87 170 1.27 % 737.45 70 0.70 % 236.85 155 0.63 % 438.91 51 0.72 % 327.15 lepa lep a 339 0.61 % 327.50 122 0.91 % 529.23 31 0.31 % 104.89 153 0.62 % 433.24 33 0.47 % 211.68 dobr dob r 267 0.48 % 257.94 74 0.55 % 321.01 117 1.17 % 395.88 22 0.09 % 62.30 54 0.76 % 346.39 glavnem glavne m 262 0.47 % 253.11 45 0.34 % 195.21 141 1.41 % 477.09 37 0.15 % 104.77 39 0.55 % 250.17 celo cel o 254 0.46 % 245.39 70 0.52 % 303.66 63 0.63 % 213.17 81 0.33 % 229.36 40 0.56 % 256.59 stari star i 236 0.43 % 228 32 0.24 % 138.81 170 1.70 % 575.21 24 
0.10 % 67.96 10 0.14 % 64.15 lep le p 235 0.43 % 227.03 120 0.89 % 520.56 24 0.24 % 81.21 70 0.28 % 198.22 21 0.30 % 134.71 sama sam a 202 0.37 % 195.15 55 0.41 % 238.59 64 0.64 % 216.55 47 0.19 % 133.09 36 0.51 % 230.93 cela cel a 193 0.35 % 186.45 19 0.14 % 82.42 36 0.36 % 121.81 89 0.36 % 252.02 49 0.69 % 314.32 novo nov o 191 0.35 % 184.52 59 0.44 % 255.94 29 0.29 % 98.12 71 0.29 % 201.05 32 0.45 % 205.27 cel ce l 160 0.29 % 154.57 34 0.25 % 147.49 82 0.82 % 277.46 25 0.10 % 70.79 19 0.27 % 121.88 dobra dobr a 153 0.28 % 147.81 69 0.51 % 299.32 30 0.30 % 101.51 34 0.14 % 96.28 20 0.28 % 128.29 sami sam i 148 0.27 % 142.98 24 0.18 % 104.11 32 0.32 % 108.28 70 0.28 % 198.22 22 0.31 % 141.12 mav ma v 146 0.27 % 141.05 6 0.04 % 26.03 116 1.16 % 392.50 4 0.02 % 11.33 20 0.28 % 128.29 prav pra v 144 0.26 % 139.12 8 0.06 % 34.70 41 0.41 % 138.73 61 0.25 % 172.73 34 0.48 % 218.10 jasno jasn o 141 0.26 % 136.22 32 0.24 % 138.81 15 0.15 % 50.75 65 0.26 % 184.06 29 0.41 % 186.03 slovenski slovensk i 140 0.25 % 135.25 41 0.30 % 177.86 11 0.11 % 37.22 82 0.33 % 232.20 6 0.09 % 38.49 nov no v 134 0.24 % 129.46 67 0.50 % 290.64 22 0.22 % 74.44 32 0.13 % 90.61 13 0.18 % 83.39 nove nov e 131 0.24 % 126.56 29 0.22 % 125.80 17 0.17 % 57.52 77 0.31 % 218.04 8 0.11 % 51.32 zadnjih zadnji h 120 0.22 % 115.93 36 0.27 % 156.17 3 0.03 % 10.15 76 0.31 % 215.21 5 0.07 % 32.07 velik veli k 109 0.20 % 105.30 27 0.20 % 117.12 11 0.11 % 37.22 64 0.26 % 181.23 7 0.10 % 44.90 nova nov a 108 0.20 % 104.34 32 0.24 % 138.81 13 0.13 % 43.99 50 0.20 % 141.58 13 0.18 % 83.39 slovenske slovensk e 106 0.19 % 102.40 27 0.20 % 117.12 2 0.02 % 6.77 74 0.30 % 209.54 3 0.04 % 19.24 stara star a 106 0.19 % 102.40 31 0.23 % 134.48 43 0.43 % 145.50 18 0.07 % 50.97 14 0.20 % 89.81 velika velik a 105 0.19 % 101.44 24 0.18 % 104.11 15 0.15 % 50.75 52 0.21 % 147.25 14 0.20 % 89.81 dobre dobr e 103 0.19 % 99.51 35 0.26 % 151.83 24 0.24 % 81.21 34 0.14 % 96.28 10 0.14 % 64.15 CJVT // A Guide to Frequency Lists from 
the Gigafida 2.0 and GOS 1.0 Corpora // 460 File at CLARIN.SI2.2.117 List of final character-level 2-grams from adjective lower-case word forms in the GOS 1.0 corpus with text-type distributionGOS1.0-word_parts-adjectives-lowercase_ forms-final-2grams-taxonomy-entire.tsvLower-case word form Rest of the word Final part of the word Total absolute frequency of lower-case word form Percentage of all found lower-case word forms Total relative frequency (per million occurrences) Absolute frequency [gos.T.J.R] Percentage [gos.T.J.R] Relative frequency [gos.T.J.R] Absolute frequency [gos.T.N.Z] Percentage [gos.T.N.Z] Relative frequency [gos.T.N.Z] Absolute frequency [gos.T.J.I] Percentage [gos.T.J.I] Relative frequency [gos.T.J.I] Absolute frequency [gos.T.N.N] Percentage [gos.T.N.N] Relative frequency [gos.T.N.N] sam s am 1,403 2.54 % 1,355.42 173 1.29 % 750.47 711 7.12 % 2,405.75 228 0.92 % 645.62 291 4.11 % 1,866.66 samo sa mo 1,220 2.21 % 1,178.62 251 1.87 % 1,088.83 335 3.36 % 1,133.51 394 1.60 % 1,115.68 240 3.39 % 1,539.52 dobro dob ro 1,154 2.09 % 1,114.86 603 4.49 % 2,615.79 131 1.31 % 443.25 282 1.14 % 798.53 138 1.95 % 885.22 malo ma lo 673 1.22 % 650.17 223 1.66 % 967.37 164 1.64 % 554.91 133 0.54 % 376.61 153 2.16 % 981.44 mal m al 659 1.19 % 636.65 180 1.34 % 780.83 196 1.96 % 663.19 123 0.50 % 348.29 160 2.26 % 1,026.34 pravi pra vi 455 0.82 % 439.57 65 0.48 % 281.97 26 0.26 % 87.97 281 1.14 % 795.70 83 1.17 % 532.42 dober dob er 446 0.81 % 430.87 170 1.27 % 737.45 70 0.70 % 236.85 155 0.63 % 438.91 51 0.72 % 327.15 lepa le pa 339 0.61 % 327.50 122 0.91 % 529.23 31 0.31 % 104.89 153 0.62 % 433.24 33 0.47 % 211.68 dobr do br 267 0.48 % 257.94 74 0.55 % 321.01 117 1.17 % 395.88 22 0.09 % 62.30 54 0.76 % 346.39 glavnem glavn em 262 0.47 % 253.11 45 0.34 % 195.21 141 1.41 % 477.09 37 0.15 % 104.77 39 0.55 % 250.17 celo ce lo 254 0.46 % 245.39 70 0.52 % 303.66 63 0.63 % 213.17 81 0.33 % 229.36 40 0.56 % 256.59 stari sta ri 236 0.43 % 228 32 0.24 % 138.81 170 1.70 
% 575.21 24 0.10 % 67.96 10 0.14 % 64.15 lep l ep 235 0.43 % 227.03 120 0.89 % 520.56 24 0.24 % 81.21 70 0.28 % 198.22 21 0.30 % 134.71 sama sa ma 202 0.37 % 195.15 55 0.41 % 238.59 64 0.64 % 216.55 47 0.19 % 133.09 36 0.51 % 230.93 cela ce la 193 0.35 % 186.45 19 0.14 % 82.42 36 0.36 % 121.81 89 0.36 % 252.02 49 0.69 % 314.32 novo no vo 191 0.35 % 184.52 59 0.44 % 255.94 29 0.29 % 98.12 71 0.29 % 201.05 32 0.45 % 205.27 cel c el 160 0.29 % 154.57 34 0.25 % 147.49 82 0.82 % 277.46 25 0.10 % 70.79 19 0.27 % 121.88 dobra dob ra 153 0.28 % 147.81 69 0.51 % 299.32 30 0.30 % 101.51 34 0.14 % 96.28 20 0.28 % 128.29 sami sa mi 148 0.27 % 142.98 24 0.18 % 104.11 32 0.32 % 108.28 70 0.28 % 198.22 22 0.31 % 141.12 mav m av 146 0.27 % 141.05 6 0.04 % 26.03 116 1.16 % 392.50 4 0.02 % 11.33 20 0.28 % 128.29 prav pr av 144 0.26 % 139.12 8 0.06 % 34.70 41 0.41 % 138.73 61 0.25 % 172.73 34 0.48 % 218.10 jasno jas no 141 0.26 % 136.22 32 0.24 % 138.81 15 0.15 % 50.75 65 0.26 % 184.06 29 0.41 % 186.03 slovenski slovens ki 140 0.25 % 135.25 41 0.30 % 177.86 11 0.11 % 37.22 82 0.33 % 232.20 6 0.09 % 38.49 nov n ov 134 0.24 % 129.46 67 0.50 % 290.64 22 0.22 % 74.44 32 0.13 % 90.61 13 0.18 % 83.39 nove no ve 131 0.24 % 126.56 29 0.22 % 125.80 17 0.17 % 57.52 77 0.31 % 218.04 8 0.11 % 51.32 zadnjih zadnj ih 120 0.22 % 115.93 36 0.27 % 156.17 3 0.03 % 10.15 76 0.31 % 215.21 5 0.07 % 32.07 velik vel ik 109 0.20 % 105.30 27 0.20 % 117.12 11 0.11 % 37.22 64 0.26 % 181.23 7 0.10 % 44.90 nova no va 108 0.20 % 104.34 32 0.24 % 138.81 13 0.13 % 43.99 50 0.20 % 141.58 13 0.18 % 83.39 slovenske slovens ke 106 0.19 % 102.40 27 0.20 % 117.12 2 0.02 % 6.77 74 0.30 % 209.54 3 0.04 % 19.24 stara sta ra 106 0.19 % 102.40 31 0.23 % 134.48 43 0.43 % 145.50 18 0.07 % 50.97 14 0.20 % 89.81 velika veli ka 105 0.19 % 101.44 24 0.18 % 104.11 15 0.15 % 50.75 52 0.21 % 147.25 14 0.20 % 89.81 dobre dob re 103 0.19 % 99.51 35 0.26 % 151.83 24 0.24 % 81.21 34 0.14 % 96.28 10 0.14 % 64.15 CJVT // A Guide to 
Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 461 File at CLARIN.SI2.2.118 List of final character-level 3-grams from adjective lower-case word forms in the GOS 1.0 corpus with text-type distributionGOS1.0-word_parts-adjectives-lowercase_ forms-final-3grams-taxonomy-entire.tsvLower-case word form Rest of the word Final part of the word Total absolute frequency of lower-case word form Percentage of all found lower-case word forms Total relative frequency (per million occurrences) Absolute frequency [gos.T.J.R] Percentage [gos.T.J.R] Relative frequency [gos.T.J.R] Absolute frequency [gos.T.N.Z] Percentage [gos.T.N.Z] Relative frequency [gos.T.N.Z] Absolute frequency [gos.T.J.I] Percentage [gos.T.J.I] Relative frequency [gos.T.J.I] Absolute frequency [gos.T.N.N] Percentage [gos.T.N.N] Relative frequency [gos.T.N.N] sam sam 1,403 2.55 % 1,355.42 173 1.29 % 750.47 711 7.17 % 2,405.75 228 0.92 % 645.62 291 4.12 % 1,866.66 samo s amo 1,220 2.21 % 1,178.62 251 1.87 % 1,088.83 335 3.38 % 1,133.51 394 1.60 % 1,115.68 240 3.40 % 1,539.52 dobro do bro 1,154 2.10 % 1,114.86 603 4.49 % 2,615.79 131 1.32 % 443.25 282 1.14 % 798.53 138 1.95 % 885.22 malo m alo 673 1.22 % 650.17 223 1.66 % 967.37 164 1.65 % 554.91 133 0.54 % 376.61 153 2.16 % 981.44 mal mal 659 1.20 % 636.65 180 1.34 % 780.83 196 1.98 % 663.19 123 0.50 % 348.29 160 2.26 % 1,026.34 pravi pr avi 455 0.83 % 439.57 65 0.48 % 281.97 26 0.26 % 87.97 281 1.14 % 795.70 83 1.17 % 532.42 dober do ber 446 0.81 % 430.87 170 1.27 % 737.45 70 0.71 % 236.85 155 0.63 % 438.91 51 0.72 % 327.15 lepa l epa 339 0.61 % 327.50 122 0.91 % 529.23 31 0.31 % 104.89 153 0.62 % 433.24 33 0.47 % 211.68 dobr d obr 267 0.48 % 257.94 74 0.55 % 321.01 117 1.18 % 395.88 22 0.09 % 62.30 54 0.76 % 346.39 glavnem glav nem 262 0.48 % 253.11 45 0.34 % 195.21 141 1.42 % 477.09 37 0.15 % 104.77 39 0.55 % 250.17 celo c elo 254 0.46 % 245.39 70 0.52 % 303.66 63 0.64 % 213.17 81 0.33 % 229.36 40 0.57 % 256.59 stari st ari 236 0.43 % 228 32 
0.24 % 138.81 170 1.72 % 575.21 24 0.10 % 67.96 10 0.14 % 64.15 lep lep 235 0.43 % 227.03 120 0.89 % 520.56 24 0.24 % 81.21 70 0.28 % 198.22 21 0.30 % 134.71 sama s ama 202 0.37 % 195.15 55 0.41 % 238.59 64 0.65 % 216.55 47 0.19 % 133.09 36 0.51 % 230.93 cela c ela 193 0.35 % 186.45 19 0.14 % 82.42 36 0.36 % 121.81 89 0.36 % 252.02 49 0.69 % 314.32 novo n ovo 191 0.35 % 184.52 59 0.44 % 255.94 29 0.29 % 98.12 71 0.29 % 201.05 32 0.45 % 205.27 cel cel 160 0.29 % 154.57 34 0.25 % 147.49 82 0.83 % 277.46 25 0.10 % 70.79 19 0.27 % 121.88 dobra do bra 153 0.28 % 147.81 69 0.51 % 299.32 30 0.30 % 101.51 34 0.14 % 96.28 20 0.28 % 128.29 sami s ami 148 0.27 % 142.98 24 0.18 % 104.11 32 0.32 % 108.28 70 0.28 % 198.22 22 0.31 % 141.12 mav mav 146 0.27 % 141.05 6 0.04 % 26.03 116 1.17 % 392.50 4 0.02 % 11.33 20 0.28 % 128.29 prav p rav 144 0.26 % 139.12 8 0.06 % 34.70 41 0.41 % 138.73 61 0.25 % 172.73 34 0.48 % 218.10 jasno ja sno 141 0.26 % 136.22 32 0.24 % 138.81 15 0.15 % 50.75 65 0.26 % 184.06 29 0.41 % 186.03 slovenski sloven ski 140 0.25 % 135.25 41 0.31 % 177.86 11 0.11 % 37.22 82 0.33 % 232.20 6 0.09 % 38.49 nov nov 134 0.24 % 129.46 67 0.50 % 290.64 22 0.22 % 74.44 32 0.13 % 90.61 13 0.18 % 83.39 nove n ove 131 0.24 % 126.56 29 0.22 % 125.80 17 0.17 % 57.52 77 0.31 % 218.04 8 0.11 % 51.32 zadnjih zadn jih 120 0.22 % 115.93 36 0.27 % 156.17 3 0.03 % 10.15 76 0.31 % 215.21 5 0.07 % 32.07 velik ve lik 109 0.20 % 105.30 27 0.20 % 117.12 11 0.11 % 37.22 64 0.26 % 181.23 7 0.10 % 44.90 nova n ova 108 0.20 % 104.34 32 0.24 % 138.81 13 0.13 % 43.99 50 0.20 % 141.58 13 0.18 % 83.39 slovenske sloven ske 106 0.19 % 102.40 27 0.20 % 117.12 2 0.02 % 6.77 74 0.30 % 209.54 3 0.04 % 19.24 stara st ara 106 0.19 % 102.40 31 0.23 % 134.48 43 0.43 % 145.50 18 0.07 % 50.97 14 0.20 % 89.81 velika vel ika 105 0.19 % 101.44 24 0.18 % 104.11 15 0.15 % 50.75 52 0.21 % 147.25 14 0.20 % 89.81 dobre do bre 103 0.19 % 99.51 35 0.26 % 151.83 24 0.24 % 81.21 34 0.14 % 96.28 10 0.14 % 64.15 CJVT // 
A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 462 File at CLARIN.SI2.2.119 List of final character-level 4-grams from adjective lower-case word forms in the GOS 1.0 corpus with text-type distributionGOS1.0-word_parts-adjectives-lowercase_ forms-final-4grams-taxonomy-entire.tsvLower-case word form Rest of the word Final part of the word Total absolute frequency of lower-case word form Percentage of all found lower-case word forms Total relative frequency (per million occurrences) Absolute frequency [gos.T.J.R] Percentage [gos.T.J.R] Relative frequency [gos.T.J.R] Absolute frequency [gos.T.N.Z] Percentage [gos.T.N.Z] Relative frequency [gos.T.N.Z] Absolute frequency [gos.T.J.I] Percentage [gos.T.J.I] Relative frequency [gos.T.J.I] Absolute frequency [gos.T.N.N] Percentage [gos.T.N.N] Relative frequency [gos.T.N.N] samo samo 1,220 2.35 % 1,178.62 251 1.98 % 1,088.83 335 3.95 % 1,133.51 394 1.63 % 1,115.68 240 3.69 % 1,539.52 dobro d obro 1,154 2.23 % 1,114.86 603 4.75 % 2,615.79 131 1.54 % 443.25 282 1.17 % 798.53 138 2.12 % 885.22 malo malo 673 1.30 % 650.17 223 1.76 % 967.37 164 1.94 % 554.91 133 0.55 % 376.61 153 2.35 % 981.44 pravi p ravi 455 0.88 % 439.57 65 0.51 % 281.97 26 0.31 % 87.97 281 1.16 % 795.70 83 1.28 % 532.42 dober d ober 446 0.86 % 430.87 170 1.34 % 737.45 70 0.83 % 236.85 155 0.64 % 438.91 51 0.78 % 327.15 lepa lepa 339 0.65 % 327.50 122 0.96 % 529.23 31 0.37 % 104.89 153 0.63 % 433.24 33 0.51 % 211.68 dobr dobr 267 0.52 % 257.94 74 0.58 % 321.01 117 1.38 % 395.88 22 0.09 % 62.30 54 0.83 % 346.39 glavnem gla vnem 262 0.51 % 253.11 45 0.35 % 195.21 141 1.66 % 477.09 37 0.15 % 104.77 39 0.60 % 250.17 celo celo 254 0.49 % 245.39 70 0.55 % 303.66 63 0.74 % 213.17 81 0.34 % 229.36 40 0.61 % 256.59 stari s tari 236 0.46 % 228 32 0.25 % 138.81 170 2.00 % 575.21 24 0.10 % 67.96 10 0.15 % 64.15 sama sama 202 0.39 % 195.15 55 0.43 % 238.59 64 0.76 % 216.55 47 0.20 % 133.09 36 0.55 % 230.93 cela cela 193 0.37 % 186.45 19 0.15 % 82.42 
36 0.42 % 121.81 89 0.37 % 252.02 49 0.75 % 314.32 novo novo 191 0.37 % 184.52 59 0.47 % 255.94 29 0.34 % 98.12 71 0.29 % 201.05 32 0.49 % 205.27 dobra d obra 153 0.29 % 147.81 69 0.54 % 299.32 30 0.35 % 101.51 34 0.14 % 96.28 20 0.31 % 128.29 sami sami 148 0.29 % 142.98 24 0.19 % 104.11 32 0.38 % 108.28 70 0.29 % 198.22 22 0.34 % 141.12 prav prav 144 0.28 % 139.12 8 0.06 % 34.70 41 0.48 % 138.73 61 0.25 % 172.73 34 0.52 % 218.10 jasno j asno 141 0.27 % 136.22 32 0.25 % 138.81 15 0.18 % 50.75 65 0.27 % 184.06 29 0.45 % 186.03 slovenski slove nski 140 0.27 % 135.25 41 0.32 % 177.86 11 0.13 % 37.22 82 0.34 % 232.20 6 0.09 % 38.49 nove nove 131 0.25 % 126.56 29 0.23 % 125.80 17 0.20 % 57.52 77 0.32 % 218.04 8 0.12 % 51.32 zadnjih zad njih 120 0.23 % 115.93 36 0.28 % 156.17 3 0.04 % 10.15 76 0.32 % 215.21 5 0.08 % 32.07 velik v elik 109 0.21 % 105.30 27 0.21 % 117.12 11 0.13 % 37.22 64 0.27 % 181.23 7 0.11 % 44.90 nova nova 108 0.21 % 104.34 32 0.25 % 138.81 13 0.15 % 43.99 50 0.21 % 141.58 13 0.20 % 83.39 slovenske slove nske 106 0.20 % 102.40 27 0.21 % 117.12 2 0.02 % 6.77 74 0.31 % 209.54 3 0.05 % 19.24 stara s tara 106 0.20 % 102.40 31 0.24 % 134.48 43 0.51 % 145.50 18 0.07 % 50.97 14 0.21 % 89.81 velika ve lika 105 0.20 % 101.44 24 0.19 % 104.11 15 0.18 % 50.75 52 0.21 % 147.25 14 0.21 % 89.81 dobre d obre 103 0.20 % 99.51 35 0.28 % 151.83 24 0.28 % 81.21 34 0.14 % 96.28 10 0.15 % 64.15 naslednji nasle dnji 101 0.20 % 97.57 41 0.32 % 177.86 14 0.17 % 47.37 35 0.14 % 99.11 11 0.17 % 70.56 fajn fajn 100 0.19 % 96.61 40 0.32 % 173.52 40 0.47 % 135.34 7 0.03 % 19.82 13 0.20 % 83.39 različnih razli čnih 99 0.19 % 95.64 12 0.09 % 52.06 11 0.13 % 37.22 56 0.23 % 158.57 20 0.31 % 128.29 boljše bo ljše 92 0.18 % 88.88 23 0.18 % 99.77 14 0.17 % 47.37 34 0.14 % 96.28 21 0.32 % 134.71 novi novi 92 0.18 % 88.88 35 0.28 % 151.83 17 0.20 % 57.52 32 0.13 % 90.61 8 0.12 % 51.32 slovenskih sloven skih 89 0.17 % 85.98 29 0.23 % 125.80 5 0.06 % 16.92 53 0.22 % 150.08 2 0.03 % 12.83 
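
The division of each word form into an initial or final character n-gram and the rest of the word, as shown in the lists above, can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the LIST program's own code (LIST is written in Java): the function names `split_initial` and `split_final` are hypothetical. Judging from the lists, forms shorter than n appear to be left out (for example, sam is absent from the final 4-gram list), and the sketch assumes that behaviour.

```python
from typing import Optional, Tuple


def split_initial(word: str, n: int) -> Optional[Tuple[str, str]]:
    """Return (initial n characters, rest of the word).

    Assumption: words shorter than n are skipped (None), which is what
    the published lists suggest.
    """
    if n < 1 or len(word) < n:
        return None
    return word[:n], word[n:]


def split_final(word: str, n: int) -> Optional[Tuple[str, str]]:
    """Return (rest of the word, final n characters), or None if too short."""
    if n < 1 or len(word) < n:
        return None
    return word[:-n], word[-n:]


if __name__ == "__main__":
    # Word forms taken from the adjective lists above.
    for form in ["slovenski", "samo", "sam"]:
        print(form, split_final(form, 4))
    print("glavnem", split_initial("glavnem", 5))
```

Python slices strings by Unicode code point, so characters such as č and š count as one character each, provided the text uses precomposed (NFC) characters.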
CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 463 File at CLARIN.SI2.2.120 List of final character-level 5-grams from adjective lower-case word forms in the GOS 1.0 corpus with text-type distributionGOS1.0-word_parts-adjectives-lowercase_ forms-final-5grams-taxonomy-entire.tsvLower-case word form Rest of the word Final part of the word Total absolute frequency of lower-case word form Percentage of all found lower-case word forms Total relative frequency (per million occurrences) Absolute frequency [gos.T.J.R] Percentage [gos.T.J.R] Relative frequency [gos.T.J.R] Absolute frequency [gos.T.N.Z] Percentage [gos.T.N.Z] Relative frequency [gos.T.N.Z] Absolute frequency [gos.T.J.I] Percentage [gos.T.J.I] Relative frequency [gos.T.J.I] Absolute frequency [gos.T.N.N] Percentage [gos.T.N.N] Relative frequency [gos.T.N.N] dobro dobro 1,154 2.54 % 1,114.86 603 5.45 % 2,615.79 131 1.99 % 443.25 282 1.27 % 798.53 138 2.50 % 885.22 pravi pravi 455 1.00 % 439.57 65 0.59 % 281.97 26 0.40 % 87.97 281 1.26 % 795.70 83 1.50 % 532.42 dober dober 446 0.98 % 430.87 170 1.53 % 737.45 70 1.06 % 236.85 155 0.70 % 438.91 51 0.92 % 327.15 glavnem gl avnem 262 0.58 % 253.11 45 0.41 % 195.21 141 2.14 % 477.09 37 0.17 % 104.77 39 0.71 % 250.17 stari stari 236 0.52 % 228 32 0.29 % 138.81 170 2.58 % 575.21 24 0.11 % 67.96 10 0.18 % 64.15 dobra dobra 153 0.34 % 147.81 69 0.62 % 299.32 30 0.46 % 101.51 34 0.15 % 96.28 20 0.36 % 128.29 jasno jasno 141 0.31 % 136.22 32 0.29 % 138.81 15 0.23 % 50.75 65 0.29 % 184.06 29 0.53 % 186.03 slovenski slov enski 140 0.31 % 135.25 41 0.37 % 177.86 11 0.17 % 37.22 82 0.37 % 232.20 6 0.11 % 38.49 zadnjih za dnjih 120 0.26 % 115.93 36 0.33 % 156.17 3 0.05 % 10.15 76 0.34 % 215.21 5 0.09 % 32.07 velik velik 109 0.24 % 105.30 27 0.24 % 117.12 11 0.17 % 37.22 64 0.29 % 181.23 7 0.13 % 44.90 slovenske slov enske 106 0.23 % 102.40 27 0.24 % 117.12 2 0.03 % 6.77 74 0.33 % 209.54 3 0.05 % 19.24 stara stara 106 0.23 % 102.40 31 0.28 % 
134.48 43 0.65 % 145.50 18 0.08 % 50.97 14 0.25 % 89.81 velika v elika 105 0.23 % 101.44 24 0.22 % 104.11 15 0.23 % 50.75 52 0.23 % 147.25 14 0.25 % 89.81 dobre dobre 103 0.23 % 99.51 35 0.32 % 151.83 24 0.36 % 81.21 34 0.15 % 96.28 10 0.18 % 64.15 naslednji nasl ednji 101 0.22 % 97.57 41 0.37 % 177.86 14 0.21 % 47.37 35 0.16 % 99.11 11 0.20 % 70.56 različnih razl ičnih 99 0.22 % 95.64 12 0.11 % 52.06 11 0.17 % 37.22 56 0.25 % 158.57 20 0.36 % 128.29 boljše b oljše 92 0.20 % 88.88 23 0.21 % 99.77 14 0.21 % 47.37 34 0.15 % 96.28 21 0.38 % 134.71 slovenskih slove nskih 89 0.20 % 85.98 29 0.26 % 125.80 5 0.08 % 16.92 53 0.24 % 150.08 2 0.04 % 12.83 zadnji z adnji 87 0.19 % 84.05 27 0.24 % 117.12 14 0.21 % 47.37 41 0.18 % 116.10 5 0.09 % 32.07 novega n ovega 86 0.19 % 83.08 26 0.23 % 112.79 9 0.14 % 30.45 42 0.19 % 118.93 9 0.16 % 57.73 velike v elike 86 0.19 % 83.08 19 0.17 % 82.42 9 0.14 % 30.45 50 0.23 % 141.58 8 0.14 % 51.32 različne raz lične 85 0.19 % 82.12 7 0.06 % 30.37 11 0.17 % 37.22 51 0.23 % 144.41 16 0.29 % 102.63 slovenska slov enska 84 0.18 % 81.15 21 0.19 % 91.10 4 0.06 % 13.53 58 0.26 % 164.24 1 0.02 % 6.41 kratko k ratko 82 0.18 % 79.22 14 0.13 % 60.73 4 0.06 % 13.53 54 0.24 % 152.91 10 0.18 % 64.15 evropske evr opske 81 0.18 % 78.25 7 0.06 % 30.37 1 0.01 % 3.38 72 0.32 % 203.88 1 0.02 % 6.41 najlepša naj lepša 73 0.16 % 70.52 19 0.17 % 82.42 7 0.11 % 23.69 25 0.11 % 70.79 22 0.40 % 141.12 dolgo dolgo 72 0.16 % 69.56 17 0.15 % 73.75 21 0.32 % 71.06 25 0.11 % 70.79 9 0.16 % 57.73 največji naj večji 72 0.16 % 69.56 20 0.18 % 86.76 14 0.21 % 47.37 28 0.13 % 79.29 10 0.18 % 64.15 rečeno r ečeno 69 0.15 % 66.66 19 0.17 % 82.42 3 0.05 % 10.15 40 0.18 % 113.27 7 0.13 % 44.90 večje večje 69 0.15 % 66.66 12 0.11 % 52.06 7 0.11 % 23.69 37 0.17 % 104.77 13 0.23 % 83.39 zadnje z adnje 69 0.15 % 66.66 24 0.22 % 104.11 9 0.14 % 30.45 27 0.12 % 76.45 9 0.16 % 57.73 prava prava 66 0.14 % 63.76 19 0.17 % 82.42 9 0.14 % 30.45 32 0.14 % 90.61 6 0.11 % 38.49 CJVT // A 
Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 464 File at CLARIN.SI2.2.121 List of initial character-level 1-grams from adverb lemmas in the GOS 1.0 corpus with text-type distributionGOS1.0-word_parts-adverbs-lemmas-initial- 1grams-taxonomy-entire.tsvLemma Lemma (lower-case) Initial part of the word Rest of the word Total absolute frequency of lemma Percentage of all found lemmas Total relative frequency (per million occurrences) Absolute frequency [gos.T.J.R] Percentage [gos.T.J.R] Relative frequency [gos.T.J.R] Absolute frequency [gos.T.N.Z] Percentage [gos.T.N.Z] Relative frequency [gos.T.N.Z] Absolute frequency [gos.T.J.I] Percentage [gos.T.J.I] Relative frequency [gos.T.J.I] Absolute frequency [gos.T.N.N] Percentage [gos.T.N.N] Relative frequency [gos.T.N.N] tako tako t ako 9,713 10.29 % 9,383.57 1,925 9.09 % 8,350.58 3,661 12.84 % 12,387.41 2,296 7.90 % 6,501.51 1,831 11.68 % 11,745.24 zdaj zdaj z daj 6,358 6.73 % 6,142.36 1,457 6.88 % 6,320.41 1,739 6.10 % 5,884.10 1,764 6.07 % 4,995.06 1,398 8.92 % 8,967.69 lahko lahko l ahko 4,140 4.38 % 3,999.59 947 4.47 % 4,108.05 902 3.16 % 3,052.02 1,484 5.11 % 4,202.19 807 5.15 % 5,176.63 potem potem p otem 3,149 3.33 % 3,042.20 658 3.11 % 2,854.38 631 2.21 % 2,135.06 1,152 3.96 % 3,262.08 708 4.52 % 4,541.58 tam tam t am 2,598 2.75 % 2,509.89 426 2.01 % 1,847.97 1,289 4.52 % 4,361.48 503 1.73 % 1,424.33 380 2.42 % 2,437.57 zelo zelo z elo 1,835 1.94 % 1,772.76 473 2.23 % 2,051.86 182 0.64 % 615.82 896 3.08 % 2,537.17 284 1.81 % 1,821.76 bolj bolj b olj 1,648 1.75 % 1,592.11 338 1.60 % 1,466.23 557 1.95 % 1,884.67 479 1.65 % 1,356.37 274 1.75 % 1,757.62 danes danes d anes 1,418 1.50 % 1,369.91 638 3.01 % 2,767.62 264 0.93 % 893.27 439 1.51 % 1,243.10 77 0.49 % 493.93 tu tu t u 1,406 1.49 % 1,358.31 384 1.81 % 1,665.78 355 1.25 % 1,201.18 343 1.18 % 971.26 324 2.07 % 2,078.35 dobro dobro d obro 1,320 1.40 % 1,275.23 326 1.54 % 1,414.18 259 0.91 % 876.36 501 1.72 % 1,418.66 234 1.49 % 1,501.03 
veliko veliko v eliko 1,316 1.39 % 1,271.37 315 1.49 % 1,366.46 232 0.81 % 785 639 2.20 % 1,809.43 130 0.83 % 833.91 sem sem s em 1,307 1.38 % 1,262.67 232 1.10 % 1,006.41 680 2.38 % 2,300.86 237 0.81 % 671.10 158 1.01 % 1,013.52 malo malo m alo 1,186 1.26 % 1,145.78 316 1.49 % 1,370.80 376 1.32 % 1,272.24 253 0.87 % 716.41 241 1.54 % 1,545.93 tukaj tukaj t ukaj 1,136 1.20 % 1,097.47 134 0.63 % 581.29 164 0.57 % 554.91 597 2.05 % 1,690.50 241 1.54 % 1,545.93 kje kje k je 1,041 1.10 % 1,005.69 225 1.06 % 976.04 416 1.46 % 1,407.58 279 0.96 % 790.03 121 0.77 % 776.17 zato zato z ato 992 1.05 % 958.36 202 0.95 % 876.27 247 0.87 % 835.75 394 1.36 % 1,115.68 149 0.95 % 955.78 čisto čisto č isto 974 1.03 % 940.97 132 0.62 % 572.61 402 1.41 % 1,360.21 231 0.80 % 654.11 209 1.33 % 1,340.66 prav prav p rav 971 1.03 % 938.07 214 1.01 % 928.32 308 1.08 % 1,042.15 317 1.09 % 897.64 132 0.84 % 846.73 naprej naprej n aprej 922 0.98 % 890.73 208 0.98 % 902.30 134 0.47 % 453.40 446 1.53 % 1,262.92 134 0.85 % 859.56 ful ful f ul 916 0.97 % 884.93 52 0.25 % 225.57 697 2.44 % 2,358.38 36 0.12 % 101.94 131 0.84 % 840.32 mogoče mogoče m ogoče 872 0.92 % 842.42 207 0.98 % 897.96 163 0.57 % 551.53 242 0.83 % 685.26 260 1.66 % 1,667.81 takrat takrat t akrat 840 0.89 % 811.51 169 0.80 % 733.12 235 0.82 % 795.15 313 1.08 % 886.31 123 0.79 % 789 prej prej p rej 832 0.88 % 803.78 169 0.80 % 733.12 263 0.92 % 889.89 295 1.01 % 835.34 105 0.67 % 673.54 treba treba t reba 827 0.88 % 798.95 164 0.78 % 711.43 164 0.57 % 554.91 331 1.14 % 937.28 168 1.07 % 1,077.66 gor gor g or 824 0.87 % 796.05 129 0.61 % 559.60 527 1.85 % 1,783.16 64 0.22 % 181.23 104 0.66 % 667.12 lepo lepo l epo 806 0.85 % 778.66 269 1.27 % 1,166.91 254 0.89 % 859.44 205 0.70 % 580.49 78 0.50 % 500.34 enkrat enkrat e nkrat 804 0.85 % 776.73 185 0.87 % 802.52 273 0.96 % 923.73 238 0.82 % 673.94 108 0.69 % 692.78 kako kako k ako 803 0.85 % 775.77 175 0.83 % 759.14 254 0.89 % 859.44 258 0.89 % 730.57 116 0.74 % 744.10 tule tule t 
ule 756 0.80 % 730.36 126 0.59 % 546.58 305 1.07 % 1,032 128 0.44 % 362.45 197 1.26 % 1,263.69 rad rad r ad 739 0.78 % 713.94 244 1.15 % 1,058.46 200 0.70 % 676.72 210 0.72 % 594.65 85 0.54 % 545.25 vedno vedno v edno 737 0.78 % 712 234 1.10 % 1,015.08 106 0.37 % 358.66 291 1.00 % 824.01 106 0.68 % 679.95 skupaj skupaj s kupaj 713 0.76 % 688.82 161 0.76 % 698.41 220 0.77 % 744.40 222 0.76 % 628.63 110 0.70 % 705.61 CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 465 File at CLARIN.SI2.2.122 List of initial character-level 2-grams from adverb lemmas in the GOS 1.0 corpus with text-type distributionGOS1.0-word_parts-adverbs-lemmas- initial-2grams-taxonomy-entire.tsvLemma Lemma (lower-case) Initial part of the word Rest of the word Total absolute frequency of lemma Percentage of all found lemmas Total relative frequency (per million occurrences) Absolute frequency [gos.T.J.R] Percentage [gos.T.J.R] Relative frequency [gos.T.J.R] Absolute frequency [gos.T.N.Z] Percentage [gos.T.N.Z] Relative frequency [gos.T.N.Z] Absolute frequency [gos.T.J.I] Percentage [gos.T.J.I] Relative frequency [gos.T.J.I] Absolute frequency [gos.T.N.N] Percentage [gos.T.N.N] Relative frequency [gos.T.N.N] tako tako ta ko 9,713 10.29 % 9,383.57 1,925 9.10 % 8,350.58 3,661 12.85 % 12,387.41 2,296 7.90 % 6,501.51 1,831 11.69 % 11,745.24 zdaj zdaj zd aj 6,358 6.74 % 6,142.36 1,457 6.88 % 6,320.41 1,739 6.10 % 5,884.10 1,764 6.07 % 4,995.06 1,398 8.92 % 8,967.69 lahko lahko la hko 4,140 4.39 % 3,999.59 947 4.47 % 4,108.05 902 3.17 % 3,052.02 1,484 5.11 % 4,202.19 807 5.15 % 5,176.63 potem potem po tem 3,149 3.34 % 3,042.20 658 3.11 % 2,854.38 631 2.21 % 2,135.06 1,152 3.96 % 3,262.08 708 4.52 % 4,541.58 tam tam ta m 2,598 2.75 % 2,509.89 426 2.01 % 1,847.97 1,289 4.52 % 4,361.48 503 1.73 % 1,424.33 380 2.42 % 2,437.57 zelo zelo ze lo 1,835 1.94 % 1,772.76 473 2.23 % 2,051.86 182 0.64 % 615.82 896 3.08 % 2,537.17 284 1.81 % 1,821.76 bolj bolj bo lj 1,648 1.75 % 
1,592.11 | 338 | 1.60 % | 1,466.23 | 557 | 1.96 % | 1,884.67 | 479 | 1.65 % | 1,356.37 | 274 | 1.75 % | 1,757.62
danes | danes | da | nes | 1,418 | 1.50 % | 1,369.91 | 638 | 3.02 % | 2,767.62 | 264 | 0.93 % | 893.27 | 439 | 1.51 % | 1,243.10 | 77 | 0.49 % | 493.93
tu | tu | tu | | 1,406 | 1.49 % | 1,358.31 | 384 | 1.81 % | 1,665.78 | 355 | 1.25 % | 1,201.18 | 343 | 1.18 % | 971.26 | 324 | 2.07 % | 2,078.35
dobro | dobro | do | bro | 1,320 | 1.40 % | 1,275.23 | 326 | 1.54 % | 1,414.18 | 259 | 0.91 % | 876.36 | 501 | 1.72 % | 1,418.66 | 234 | 1.49 % | 1,501.03
veliko | veliko | ve | liko | 1,316 | 1.39 % | 1,271.37 | 315 | 1.49 % | 1,366.46 | 232 | 0.81 % | 785 | 639 | 2.20 % | 1,809.43 | 130 | 0.83 % | 833.91
sem | sem | se | m | 1,307 | 1.39 % | 1,262.67 | 232 | 1.10 % | 1,006.41 | 680 | 2.39 % | 2,300.86 | 237 | 0.82 % | 671.10 | 158 | 1.01 % | 1,013.52
malo | malo | ma | lo | 1,186 | 1.26 % | 1,145.78 | 316 | 1.49 % | 1,370.80 | 376 | 1.32 % | 1,272.24 | 253 | 0.87 % | 716.41 | 241 | 1.54 % | 1,545.93
tukaj | tukaj | tu | kaj | 1,136 | 1.20 % | 1,097.47 | 134 | 0.63 % | 581.29 | 164 | 0.58 % | 554.91 | 597 | 2.06 % | 1,690.50 | 241 | 1.54 % | 1,545.93
kje | kje | kj | e | 1,041 | 1.10 % | 1,005.69 | 225 | 1.06 % | 976.04 | 416 | 1.46 % | 1,407.58 | 279 | 0.96 % | 790.03 | 121 | 0.77 % | 776.17
zato | zato | za | to | 992 | 1.05 % | 958.36 | 202 | 0.95 % | 876.27 | 247 | 0.87 % | 835.75 | 394 | 1.36 % | 1,115.68 | 149 | 0.95 % | 955.78
čisto | čisto | či | sto | 974 | 1.03 % | 940.97 | 132 | 0.62 % | 572.61 | 402 | 1.41 % | 1,360.21 | 231 | 0.80 % | 654.11 | 209 | 1.33 % | 1,340.66
prav | prav | pr | av | 971 | 1.03 % | 938.07 | 214 | 1.01 % | 928.32 | 308 | 1.08 % | 1,042.15 | 317 | 1.09 % | 897.64 | 132 | 0.84 % | 846.73
naprej | naprej | na | prej | 922 | 0.98 % | 890.73 | 208 | 0.98 % | 902.30 | 134 | 0.47 % | 453.40 | 446 | 1.53 % | 1,262.92 | 134 | 0.85 % | 859.56
ful | ful | fu | l | 916 | 0.97 % | 884.93 | 52 | 0.25 % | 225.57 | 697 | 2.45 % | 2,358.38 | 36 | 0.12 % | 101.94 | 131 | 0.84 % | 840.32
mogoče | mogoče | mo | goče | 872 | 0.92 % | 842.42 | 207 | 0.98 % | 897.96 | 163 | 0.57 % | 551.53 | 242 | 0.83 % | 685.26 | 260 | 1.66 % | 1,667.81
takrat | takrat | ta | krat | 840 | 0.89 % | 811.51 | 169 | 0.80 % | 733.12 | 235 | 0.82 % | 795.15 | 313 | 1.08 % | 886.31 | 123 | 0.79 % | 789
prej | prej | pr | ej | 832 | 0.88 % | 803.78 | 169 | 0.80 % | 733.12 | 263 | 0.92 % | 889.89 | 295 | 1.01 % | 835.34 | 105 | 0.67 % | 673.54
treba | treba | tr | eba | 827 | 0.88 % | 798.95 | 164 | 0.78 % | 711.43 | 164 | 0.58 % | 554.91 | 331 | 1.14 % | 937.28 | 168 | 1.07 % | 1,077.66
gor | gor | go | r | 824 | 0.87 % | 796.05 | 129 | 0.61 % | 559.60 | 527 | 1.85 % | 1,783.16 | 64 | 0.22 % | 181.23 | 104 | 0.66 % | 667.12
lepo | lepo | le | po | 806 | 0.85 % | 778.66 | 269 | 1.27 % | 1,166.91 | 254 | 0.89 % | 859.44 | 205 | 0.71 % | 580.49 | 78 | 0.50 % | 500.34
enkrat | enkrat | en | krat | 804 | 0.85 % | 776.73 | 185 | 0.87 % | 802.52 | 273 | 0.96 % | 923.73 | 238 | 0.82 % | 673.94 | 108 | 0.69 % | 692.78
kako | kako | ka | ko | 803 | 0.85 % | 775.77 | 175 | 0.83 % | 759.14 | 254 | 0.89 % | 859.44 | 258 | 0.89 % | 730.57 | 116 | 0.74 % | 744.10
tule | tule | tu | le | 756 | 0.80 % | 730.36 | 126 | 0.59 % | 546.58 | 305 | 1.07 % | 1,032 | 128 | 0.44 % | 362.45 | 197 | 1.26 % | 1,263.69
rad | rad | ra | d | 739 | 0.78 % | 713.94 | 244 | 1.15 % | 1,058.46 | 200 | 0.70 % | 676.72 | 210 | 0.72 % | 594.65 | 85 | 0.54 % | 545.25
vedno | vedno | ve | dno | 737 | 0.78 % | 712 | 234 | 1.11 % | 1,015.08 | 106 | 0.37 % | 358.66 | 291 | 1.00 % | 824.01 | 106 | 0.68 % | 679.95
skupaj | skupaj | sk | upaj | 713 | 0.76 % | 688.82 | 161 | 0.76 % | 698.41 | 220 | 0.77 % | 744.40 | 222 | 0.76 % | 628.63 | 110 | 0.70 % | 705.61

2.2.123 List of initial character-level 3-grams from adverb lemmas in the GOS 1.0 corpus with text-type distribution

File at CLARIN.SI: GOS1.0-word_parts-adverbs-lemmas-initial-3grams-taxonomy-entire.tsv

Lemma | Lemma (lower-case) | Initial part of the word | Rest of the word | Total absolute frequency of lemma | Percentage of all found lemmas | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N]
tako | tako | tak | o | 9,713 | 10.45 % | 9,383.57 | 1,925 | 9.27 % | 8,350.58 | 3,661 | 13.02 % | 12,387.41 | 2,296 | 8.00 % | 6,501.51 | 1,831 | 11.93 % | 11,745.24
zdaj | zdaj | zda | j | 6,358 | 6.84 % | 6,142.36 | 1,457 | 7.02 % | 6,320.41 | 1,739 | 6.19 % | 5,884.10 | 1,764 | 6.14 % | 4,995.06 | 1,398 | 9.11 % | 8,967.69
lahko | lahko | lah | ko | 4,140 | 4.46 % | 3,999.59 | 947 | 4.56 % | 4,108.05 | 902 | 3.21 % | 3,052.02 | 1,484 | 5.17 % | 4,202.19 | 807 | 5.26 % | 5,176.63
potem | potem | pot | em | 3,149 | 3.39 % | 3,042.20 | 658 | 3.17 % | 2,854.38 | 631 | 2.25 % | 2,135.06 | 1,152 | 4.01 % | 3,262.08 | 708 | 4.61 % | 4,541.58
tam | tam | tam | | 2,598 | 2.80 % | 2,509.89 | 426 | 2.05 % | 1,847.97 | 1,289 | 4.59 % | 4,361.48 | 503 | 1.75 % | 1,424.33 | 380 | 2.48 % | 2,437.57
zelo | zelo | zel | o | 1,835 | 1.98 % | 1,772.76 | 473 | 2.28 % | 2,051.86 | 182 | 0.65 % | 615.82 | 896 | 3.12 % | 2,537.17 | 284 | 1.85 % | 1,821.76
bolj | bolj | bol | j | 1,648 | 1.77 % | 1,592.11 | 338 | 1.63 % | 1,466.23 | 557 | 1.98 % | 1,884.67 | 479 | 1.67 % | 1,356.37 | 274 | 1.79 % | 1,757.62
danes | danes | dan | es | 1,418 | 1.53 % | 1,369.91 | 638 | 3.07 % | 2,767.62 | 264 | 0.94 % | 893.27 | 439 | 1.53 % | 1,243.10 | 77 | 0.50 % | 493.93
dobro | dobro | dob | ro | 1,320 | 1.42 % | 1,275.23 | 326 | 1.57 % | 1,414.18 | 259 | 0.92 % | 876.36 | 501 | 1.75 % | 1,418.66 | 234 | 1.52 % | 1,501.03
veliko | veliko | vel | iko | 1,316 | 1.42 % | 1,271.37 | 315 | 1.52 % | 1,366.46 | 232 | 0.82 % | 785 | 639 | 2.23 % | 1,809.43 | 130 | 0.85 % | 833.91
sem | sem | sem | | 1,307 | 1.41 % | 1,262.67 | 232 | 1.12 % | 1,006.41 | 680 | 2.42 % | 2,300.86 | 237 | 0.82 % | 671.10 | 158 | 1.03 % | 1,013.52
malo | malo | mal | o | 1,186 | 1.28 % | 1,145.78 | 316 | 1.52 % | 1,370.80 | 376 | 1.34 % | 1,272.24 | 253 | 0.88 % | 716.41 | 241 | 1.57 % | 1,545.93
tukaj | tukaj | tuk | aj | 1,136 | 1.22 % | 1,097.47 | 134 | 0.65 % | 581.29 | 164 | 0.58 % | 554.91 | 597 | 2.08 % | 1,690.50 | 241 | 1.57 % | 1,545.93
kje | kje | kje | | 1,041 | 1.12 % | 1,005.69 | 225 | 1.08 % | 976.04 | 416 | 1.48 % | 1,407.58 | 279 | 0.97 % | 790.03 | 121 | 0.79 % | 776.17
zato | zato | zat | o | 992 | 1.07 % | 958.36 | 202 | 0.97 % | 876.27 | 247 | 0.88 % | 835.75 | 394 | 1.37 % | 1,115.68 | 149 | 0.97 % | 955.78
čisto | čisto | čis | to | 974 | 1.05 % | 940.97 | 132 | 0.64 % | 572.61 | 402 | 1.43 % | 1,360.21 | 231 | 0.81 % | 654.11 | 209 | 1.36 % | 1,340.66
prav | prav | pra | v | 971 | 1.04 % | 938.07 | 214 | 1.03 % | 928.32 | 308 | 1.10 % | 1,042.15 | 317 | 1.10 % | 897.64 | 132 | 0.86 % | 846.73
naprej | naprej | nap | rej | 922 | 0.99 % | 890.73 | 208 | 1.00 % | 902.30 | 134 | 0.48 % | 453.40 | 446 | 1.55 % | 1,262.92 | 134 | 0.87 % | 859.56
ful | ful | ful | | 916 | 0.99 % | 884.93 | 52 | 0.25 % | 225.57 | 697 | 2.48 % | 2,358.38 | 36 | 0.12 % | 101.94 | 131 | 0.85 % | 840.32
mogoče | mogoče | mog | oče | 872 | 0.94 % | 842.42 | 207 | 1.00 % | 897.96 | 163 | 0.58 % | 551.53 | 242 | 0.84 % | 685.26 | 260 | 1.69 % | 1,667.81
takrat | takrat | tak
rat | 840 | 0.90 % | 811.51 | 169 | 0.81 % | 733.12 | 235 | 0.84 % | 795.15 | 313 | 1.09 % | 886.31 | 123 | 0.80 % | 789
prej | prej | pre | j | 832 | 0.90 % | 803.78 | 169 | 0.81 % | 733.12 | 263 | 0.94 % | 889.89 | 295 | 1.03 % | 835.34 | 105 | 0.68 % | 673.54
treba | treba | tre | ba | 827 | 0.89 % | 798.95 | 164 | 0.79 % | 711.43 | 164 | 0.58 % | 554.91 | 331 | 1.15 % | 937.28 | 168 | 1.09 % | 1,077.66
gor | gor | gor | | 824 | 0.89 % | 796.05 | 129 | 0.62 % | 559.60 | 527 | 1.88 % | 1,783.16 | 64 | 0.22 % | 181.23 | 104 | 0.68 % | 667.12
lepo | lepo | lep | o | 806 | 0.87 % | 778.66 | 269 | 1.30 % | 1,166.91 | 254 | 0.90 % | 859.44 | 205 | 0.71 % | 580.49 | 78 | 0.51 % | 500.34
enkrat | enkrat | enk | rat | 804 | 0.86 % | 776.73 | 185 | 0.89 % | 802.52 | 273 | 0.97 % | 923.73 | 238 | 0.83 % | 673.94 | 108 | 0.70 % | 692.78
kako | kako | kak | o | 803 | 0.86 % | 775.77 | 175 | 0.84 % | 759.14 | 254 | 0.90 % | 859.44 | 258 | 0.90 % | 730.57 | 116 | 0.76 % | 744.10
tule | tule | tul | e | 756 | 0.81 % | 730.36 | 126 | 0.61 % | 546.58 | 305 | 1.08 % | 1,032 | 128 | 0.45 % | 362.45 | 197 | 1.28 % | 1,263.69
rad | rad | rad | | 739 | 0.80 % | 713.94 | 244 | 1.18 % | 1,058.46 | 200 | 0.71 % | 676.72 | 210 | 0.73 % | 594.65 | 85 | 0.55 % | 545.25
vedno | vedno | ved | no | 737 | 0.79 % | 712 | 234 | 1.13 % | 1,015.08 | 106 | 0.38 % | 358.66 | 291 | 1.01 % | 824.01 | 106 | 0.69 % | 679.95
skupaj | skupaj | sku | paj | 713 | 0.77 % | 688.82 | 161 | 0.78 % | 698.41 | 220 | 0.78 % | 744.40 | 222 | 0.77 % | 628.63 | 110 | 0.72 % | 705.61
kdaj | kdaj | kda | j | 701 | 0.75 % | 677.22 | 177 | 0.85 % | 767.82 | 253 | 0.90 % | 856.05 | 182 | 0.63 % | 515.36 | 89 | 0.58 % | 570.90

2.2.124 List of initial character-level 4-grams from adverb lemmas in the GOS 1.0 corpus with text-type distribution

File at CLARIN.SI: GOS1.0-word_parts-adverbs-lemmas-initial-4grams-taxonomy-entire.tsv

Lemma | Lemma (lower-case) | Initial part of the word | Rest of the word | Total absolute frequency of lemma | Percentage of all found lemmas | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N]
tako | tako | tako | | 9,713 | 11.96 % | 9,383.57 | 1,925 | 10.41 % | 8,350.58 | 3,661 | 16.30 % | 12,387.41 | 2,296 | 8.66 % | 6,501.51 | 1,831 | 13.31 % | 11,745.24
zdaj | zdaj | zdaj | | 6,358 | 7.83 % | 6,142.36 | 1,457 | 7.88 % | 6,320.41 | 1,739 | 7.74 % | 5,884.10 | 1,764 | 6.65 % | 4,995.06 | 1,398 | 10.16 % | 8,967.69
lahko | lahko | lahk | o | 4,140 | 5.10 % | 3,999.59 | 947 | 5.12 % | 4,108.05 | 902 | 4.01 % | 3,052.02 | 1,484 | 5.59 % | 4,202.19 | 807 | 5.87 % | 5,176.63
potem | potem | pote | m | 3,149 | 3.88 % | 3,042.20 | 658 | 3.56 % | 2,854.38 | 631 | 2.81 % | 2,135.06 | 1,152 | 4.34 % | 3,262.08 | 708 | 5.15 % | 4,541.58
zelo | zelo | zelo | | 1,835 | 2.26 % | 1,772.76 | 473 | 2.56 % | 2,051.86 | 182 | 0.81 % | 615.82 | 896 | 3.38 % | 2,537.17 | 284 | 2.06 % | 1,821.76
bolj | bolj | bolj | | 1,648 | 2.03 % | 1,592.11 | 338 | 1.83 % | 1,466.23 | 557 | 2.48 % | 1,884.67 | 479 | 1.81 % | 1,356.37 | 274 | 1.99 % | 1,757.62
danes | danes | dane | s | 1,418 | 1.75 % | 1,369.91 | 638 | 3.45 % | 2,767.62 | 264 | 1.18 % | 893.27 | 439 | 1.66 % | 1,243.10 | 77 | 0.56 % | 493.93
dobro | dobro | dobr | o | 1,320 | 1.62 % | 1,275.23 | 326 | 1.76 % | 1,414.18 | 259 | 1.15 % | 876.36 | 501 | 1.89 % | 1,418.66 | 234 | 1.70 % | 1,501.03
veliko | veliko | veli | ko | 1,316 | 1.62 % | 1,271.37 | 315 | 1.70 % | 1,366.46 | 232 | 1.03 % | 785 | 639 | 2.41 % | 1,809.43 | 130 | 0.94 % | 833.91
malo | malo | malo | | 1,186 | 1.46 % | 1,145.78 | 316 | 1.71 % | 1,370.80 | 376 | 1.67 % | 1,272.24 | 253 | 0.95 % | 716.41 | 241 | 1.75 % | 1,545.93
tukaj | tukaj | tuka | j | 1,136 | 1.40 % | 1,097.47 | 134 | 0.72 % | 581.29 | 164 | 0.73 % | 554.91 | 597 | 2.25 % | 1,690.50 | 241 | 1.75 % | 1,545.93
zato | zato | zato | | 992 | 1.22 % | 958.36 | 202 | 1.09 % | 876.27 | 247 | 1.10 % | 835.75 | 394 | 1.49 % | 1,115.68 | 149 | 1.08 % | 955.78
čisto | čisto | čist | o | 974 | 1.20 % | 940.97 | 132 | 0.71 % | 572.61 | 402 | 1.79 % | 1,360.21 | 231 | 0.87 % | 654.11 | 209 | 1.52 % | 1,340.66
prav | prav | prav | | 971 | 1.20 % | 938.07 | 214 | 1.16 % | 928.32 | 308 | 1.37 % | 1,042.15 | 317 | 1.20 % | 897.64 | 132 | 0.96 % | 846.73
naprej | naprej | napr | ej | 922 | 1.14 % | 890.73 | 208 | 1.12 % | 902.30 | 134 | 0.60 % | 453.40 | 446 | 1.68 % | 1,262.92 | 134 | 0.97 % | 859.56
mogoče | mogoče | mogo | če | 872 | 1.07 % | 842.42 | 207 | 1.12 % | 897.96 | 163 | 0.73 % | 551.53 | 242 | 0.91 % | 685.26 | 260 | 1.89 % | 1,667.81
takrat | takrat | takr | at | 840 | 1.03 % | 811.51
169 | 0.91 % | 733.12 | 235 | 1.05 % | 795.15 | 313 | 1.18 % | 886.31 | 123 | 0.89 % | 789
prej | prej | prej | | 832 | 1.02 % | 803.78 | 169 | 0.91 % | 733.12 | 263 | 1.17 % | 889.89 | 295 | 1.11 % | 835.34 | 105 | 0.76 % | 673.54
treba | treba | treb | a | 827 | 1.02 % | 798.95 | 164 | 0.89 % | 711.43 | 164 | 0.73 % | 554.91 | 331 | 1.25 % | 937.28 | 168 | 1.22 % | 1,077.66
lepo | lepo | lepo | | 806 | 0.99 % | 778.66 | 269 | 1.46 % | 1,166.91 | 254 | 1.13 % | 859.44 | 205 | 0.77 % | 580.49 | 78 | 0.57 % | 500.34
enkrat | enkrat | enkr | at | 804 | 0.99 % | 776.73 | 185 | 1.00 % | 802.52 | 273 | 1.22 % | 923.73 | 238 | 0.90 % | 673.94 | 108 | 0.79 % | 692.78
kako | kako | kako | | 803 | 0.99 % | 775.77 | 175 | 0.95 % | 759.14 | 254 | 1.13 % | 859.44 | 258 | 0.97 % | 730.57 | 116 | 0.84 % | 744.10
tule | tule | tule | | 756 | 0.93 % | 730.36 | 126 | 0.68 % | 546.58 | 305 | 1.36 % | 1,032 | 128 | 0.48 % | 362.45 | 197 | 1.43 % | 1,263.69
vedno | vedno | vedn | o | 737 | 0.91 % | 712 | 234 | 1.26 % | 1,015.08 | 106 | 0.47 % | 358.66 | 291 | 1.10 % | 824.01 | 106 | 0.77 % | 679.95
skupaj | skupaj | skup | aj | 713 | 0.88 % | 688.82 | 161 | 0.87 % | 698.41 | 220 | 0.98 % | 744.40 | 222 | 0.84 % | 628.63 | 110 | 0.80 % | 705.61
kdaj | kdaj | kdaj | | 701 | 0.86 % | 677.22 | 177 | 0.96 % | 767.82 | 253 | 1.13 % | 856.05 | 182 | 0.69 % | 515.36 | 89 | 0.65 % | 570.90
spet | spet | spet | | 647 | 0.80 % | 625.06 | 145 | 0.78 % | 629 | 223 | 0.99 % | 754.55 | 188 | 0.71 % | 532.35 | 91 | 0.66 % | 583.73
nazaj | nazaj | naza | j | 579 | 0.71 % | 559.36 | 136 | 0.73 % | 589.96 | 201 | 0.90 % | 680.11 | 176 | 0.66 % | 498.37 | 66 | 0.48 % | 423.37
noter | noter | note | r | 554 | 0.68 % | 535.21 | 127 | 0.69 % | 550.92 | 256 | 1.14 % | 866.21 | 69 | 0.26 % | 195.38 | 102 | 0.74 % | 654.29
dosti | dosti | dost | i | 516 | 0.64 % | 498.50 | 115 | 0.62 % | 498.87 | 212 | 0.94 % | 717.33 | 82 | 0.31 % | 232.20 | 107 | 0.78 % | 686.37
nekaj | nekaj | neka | j | 507 | 0.62 % | 489.80 | 95 | 0.51 % | 412.11 | 177 | 0.79 % | 598.90 | 173 | 0.65 % | 489.88 | 62 | 0.45 % | 397.71
verjetno | verjetno | verj | etno | 505 | 0.62 % | 487.87 | 111 | 0.60 % | 481.51 | 106 | 0.47 % | 358.66 | 192 | 0.72 % | 543.68 | 96 | 0.70 % | 615.81

2.2.125 List of initial character-level 5-grams from adverb lemmas in the GOS 1.0 corpus with text-type distribution

File at CLARIN.SI: GOS1.0-word_parts-adverbs-lemmas-initial-5grams-taxonomy-entire.tsv

Lemma | Lemma (lower-case) | Initial part of the word | Rest of the word | Total absolute frequency of lemma | Percentage of all found lemmas | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N]
lahko | lahko | lahko | | 4,140 | 8.06 % | 3,999.59 | 947 | 7.98 % | 4,108.05 | 902 | 7.02 % | 3,052.02 | 1,484 | 8.13 % | 4,202.19 | 807 | 9.62 % | 5,176.63
potem | potem | potem | | 3,149 | 6.13 % | 3,042.20 | 658 | 5.54 % | 2,854.38 | 631 | 4.91 % | 2,135.06 | 1,152 | 6.31 % | 3,262.08 | 708 | 8.44 % | 4,541.58
danes | danes | danes | | 1,418 | 2.76 % | 1,369.91 | 638 | 5.37 % | 2,767.62 | 264 | 2.06 % | 893.27 | 439 | 2.40 % | 1,243.10 | 77 | 0.92 % | 493.93
dobro | dobro | dobro | | 1,320 | 2.57 % | 1,275.23 | 326 | 2.75 % | 1,414.18 | 259 | 2.02 % | 876.36 | 501 | 2.74 % | 1,418.66 | 234 | 2.79 % | 1,501.03
veliko | veliko | velik | o | 1,316 | 2.56 % | 1,271.37 | 315 | 2.65 % | 1,366.46 | 232 | 1.81 % | 785 | 639 | 3.50 % | 1,809.43 | 130 | 1.55 % | 833.91
tukaj | tukaj | tukaj | | 1,136 | 2.21 % | 1,097.47 | 134 | 1.13 % | 581.29 | 164 | 1.28 % | 554.91 | 597 | 3.27 % | 1,690.50 | 241 | 2.87 % | 1,545.93
čisto | čisto | čisto | | 974 | 1.90 % | 940.97 | 132 | 1.11 % | 572.61 | 402 | 3.13 % | 1,360.21 | 231 | 1.26 % | 654.11 | 209 | 2.49 % | 1,340.66
naprej | naprej | napre | j | 922 | 1.79 % | 890.73 | 208 | 1.75 % | 902.30 | 134 | 1.04 % | 453.40 | 446 | 2.44 % | 1,262.92 | 134 | 1.60 % | 859.56
mogoče | mogoče | mogoč | e | 872 | 1.70 % | 842.42 | 207 | 1.74 % | 897.96 | 163 | 1.27 % | 551.53 | 242 | 1.33 % | 685.26 | 260 | 3.10 % | 1,667.81
takrat | takrat | takra | t | 840 | 1.64 % | 811.51 | 169 | 1.42 % | 733.12 | 235 | 1.83 % | 795.15 | 313 | 1.71 % | 886.31 | 123 | 1.47 % | 789
treba | treba | treba | | 827 | 1.61 % | 798.95 | 164 | 1.38 % | 711.43 | 164 | 1.28 % | 554.91 | 331 | 1.81 % | 937.28 | 168 | 2.00 % | 1,077.66
enkrat | enkrat | enkra | t | 804 | 1.56 % | 776.73 | 185 | 1.56 % | 802.52 | 273 | 2.13 % | 923.73 | 238 | 1.30 % | 673.94 | 108 | 1.29 % | 692.78
vedno | vedno | vedno | | 737 | 1.44 % | 712 | 234 | 1.97 % | 1,015.08 | 106 | 0.82 % | 358.66 | 291 | 1.59 % | 824.01 | 106 | 1.26 % | 679.95
skupaj | skupaj | skupa | j | 713 | 1.39 % | 688.82 | 161 | 1.36 % | 698.41 | 220 | 1.71 % | 744.40 | 222 | 1.22 % | 628.63 | 110 | 1.31 % | 705.61
nazaj | nazaj | nazaj | | 579 | 1.13 % | 559.36 | 136 | 1.15 % | 589.96 | 201 | 1.56 % | 680.11 | 176 | 0.96 % | 498.37 | 66 | 0.79 % | 423.37
noter | noter | noter | | 554 | 1.08 % | 535.21 | 127 | 1.07 % | 550.92 | 256 | 1.99 % | 866.21 | 69 | 0.38 % | 195.38 | 102 | 1.22 % | 654.29
dosti | dosti | dosti | | 516 | 1.00 % | 498.50 | 115 | 0.97 % | 498.87 | 212 | 1.65 % | 717.33 | 82 | 0.45 % | 232.20 | 107 | 1.27 % | 686.37
nekaj | nekaj | nekaj | | 507 | 0.99 % | 489.80 | 95 | 0.80 % | 412.11 | 177 | 1.38 % | 598.90 | 173 | 0.95 % | 489.88 | 62 | 0.74 % | 397.71
verjetno | verjetno | verje | tno | 505 | 0.98 % | 487.87 | 111 | 0.94 % | 481.51 | 106 | 0.82 % | 358.66 | 192 | 1.05 % | 543.68 | 96 | 1.14 % | 615.81
drugače | drugače | druga | če | 487 | 0.95 % | 470.48 | 69 | 0.58 % | 299.32 | 212 | 1.65 % | 717.33 | 132 | 0.72 % | 373.78 | 74 | 0.88 % | 474.68
točno | točno | točno | | 478 | 0.93 % | 461.79 | 127 | 1.07 % | 550.92 | 132 | 1.03 % | 446.64 | 120 | 0.66 % | 339.80 | 99 | 1.18 % | 635.05
zdajle | zdajle | zdajl | e | 478 | 0.93 % | 461.79 | 127 | 1.07 % | 550.92 | 204 | 1.59 % | 690.26 | 101 | 0.55 % | 286 | 46 | 0.55 % | 295.07
zakaj | zakaj | zakaj | | 475 | 0.93 % | 458.89 | 65 | 0.55 % | 281.97 | 107 | 0.83 % | 362.05 | 262 | 1.44 % | 741.90 | 41 | 0.49 % | 263
najbolj | najbolj | najbo | lj | 462 | 0.90 % | 446.33 | 125 | 1.05 % | 542.25 | 105 | 0.82 % | 355.28 | 177 | 0.97 % | 501.20 | 55 | 0.66 % | 352.81
koliko | koliko | kolik | o | 460 | 0.90 % | 444.40 | 66 | 0.56 % | 286.31 | 168 | 1.31 % | 568.45 | 143 | 0.78 % | 404.93 | 83 | 0.99 % | 532.42
najprej | najprej | najpr | ej | 451 | 0.88 % | 435.70 | 95 | 0.80 % | 412.11 | 51 | 0.40 % | 172.56 | 240 | 1.31 % | 679.60 | 65 | 0.78 % | 416.95
včasih | včasih | včasi | h | 445 | 0.87 % | 429.91 | 86 | 0.72 % | 373.06 | 159 | 1.24 % | 537.99 | 142 | 0.78 % | 402.10 | 58 | 0.69 % | 372.05
super | super | super | | 438 | 0.85 % | 423.14 | 140 | 1.18 % | 607.31 | 115 | 0.90 % | 389.12 | 29 | 0.16 % | 82.12 | 154 | 1.83 % | 987.86
toliko | toliko | tolik | o | 434 | 0.84 % | 419.28 | 99 | 0.83 % | 429.46 | 154 | 1.20 % | 521.08 | 108 | 0.59 % | 305.82 | 73 | 0.87 % | 468.27
dejansko | dejansko | dejan | sko | 406 | 0.79 % | 392.23 | 40 | 0.34 % | 173.52 | 72 | 0.56 % | 243.62 | 184 | 1.01 % | 521.03 | 110 | 1.31 % | 705.61
notri | notri | notri | | 396 | 0.77 % | 382.57 | 37 | 0.31 % | 160.50 | 235 | 1.83 % | 795.15 | 43 | 0.24 % | 121.76 | 81 | 0.96 % | 519.59
nekje | nekje | nekje | | 393 | 0.77 % | 379.67 | 44 | 0.37 % | 190.87 | 125 | 0.97 % | 422.95 | 112 | 0.61 % | 317.15 | 112 | 1.33 % | 718.44

2.2.126 List of final character-level 1-grams from adverb lemmas in the GOS 1.0 corpus with text-type distribution

File at CLARIN.SI: GOS1.0-word_parts-adverbs-lemmas-final-1grams-taxonomy-entire.tsv

Lemma | Lemma (lower-case) | Rest of the word | Final part of the word | Total absolute frequency of lemma | Percentage of all found lemmas | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N]
tako | tako | tak | o | 9,713 | 10.29 % | 9,383.57 | 1,925 | 9.09 % | 8,350.58 | 3,661 | 12.84 % | 12,387.41 | 2,296 | 7.90 % | 6,501.51 | 1,831 | 11.68 % | 11,745.24
zdaj | zdaj | zda | j | 6,358 | 6.73 % | 6,142.36 | 1,457 | 6.88 % | 6,320.41 | 1,739 | 6.10 % | 5,884.10 | 1,764 | 6.07 % | 4,995.06 | 1,398 | 8.92 % | 8,967.69
lahko | lahko | lahk | o | 4,140 | 4.38 % | 3,999.59 | 947 | 4.47 % | 4,108.05 | 902 | 3.16 % | 3,052.02 | 1,484 | 5.11 % | 4,202.19 | 807 | 5.15 % | 5,176.63
potem | potem | pote | m | 3,149 | 3.33 % | 3,042.20 | 658 | 3.11 % | 2,854.38 | 631 | 2.21 % | 2,135.06 | 1,152 | 3.96 % | 3,262.08 | 708 | 4.52 % | 4,541.58
tam | tam | ta | m | 2,598 | 2.75 % | 2,509.89 | 426 | 2.01 % | 1,847.97 | 1,289 | 4.52 % | 4,361.48 | 503 | 1.73 % | 1,424.33 | 380 | 2.42 % | 2,437.57
zelo | zelo | zel | o | 1,835 | 1.94 % | 1,772.76 | 473 | 2.23 % | 2,051.86 | 182 | 0.64 % | 615.82 | 896 | 3.08 % | 2,537.17 | 284 | 1.81 % | 1,821.76
bolj | bolj | bol | j | 1,648 | 1.75 % | 1,592.11 | 338 | 1.60 % | 1,466.23 | 557 | 1.95 % | 1,884.67 | 479 | 1.65 % | 1,356.37 | 274 | 1.75 % | 1,757.62
danes | danes | dane | s | 1,418 | 1.50 % | 1,369.91 | 638 | 3.01 % | 2,767.62 | 264 | 0.93 % | 893.27 | 439 | 1.51 % | 1,243.10 | 77 | 0.49 % | 493.93
tu | tu | t | u | 1,406 | 1.49 % | 1,358.31 | 384 | 1.81 % | 1,665.78 | 355 | 1.25 % | 1,201.18 | 343 | 1.18 % | 971.26 | 324 | 2.07 % | 2,078.35
dobro | dobro | dobr | o | 1,320 | 1.40 % | 1,275.23 | 326 | 1.54 %
1,414.18 | 259 | 0.91 % | 876.36 | 501 | 1.72 % | 1,418.66 | 234 | 1.49 % | 1,501.03
veliko | veliko | velik | o | 1,316 | 1.39 % | 1,271.37 | 315 | 1.49 % | 1,366.46 | 232 | 0.81 % | 785 | 639 | 2.20 % | 1,809.43 | 130 | 0.83 % | 833.91
sem | sem | se | m | 1,307 | 1.38 % | 1,262.67 | 232 | 1.10 % | 1,006.41 | 680 | 2.38 % | 2,300.86 | 237 | 0.81 % | 671.10 | 158 | 1.01 % | 1,013.52
malo | malo | mal | o | 1,186 | 1.26 % | 1,145.78 | 316 | 1.49 % | 1,370.80 | 376 | 1.32 % | 1,272.24 | 253 | 0.87 % | 716.41 | 241 | 1.54 % | 1,545.93
tukaj | tukaj | tuka | j | 1,136 | 1.20 % | 1,097.47 | 134 | 0.63 % | 581.29 | 164 | 0.57 % | 554.91 | 597 | 2.05 % | 1,690.50 | 241 | 1.54 % | 1,545.93
kje | kje | kj | e | 1,041 | 1.10 % | 1,005.69 | 225 | 1.06 % | 976.04 | 416 | 1.46 % | 1,407.58 | 279 | 0.96 % | 790.03 | 121 | 0.77 % | 776.17
zato | zato | zat | o | 992 | 1.05 % | 958.36 | 202 | 0.95 % | 876.27 | 247 | 0.87 % | 835.75 | 394 | 1.36 % | 1,115.68 | 149 | 0.95 % | 955.78
čisto | čisto | čist | o | 974 | 1.03 % | 940.97 | 132 | 0.62 % | 572.61 | 402 | 1.41 % | 1,360.21 | 231 | 0.80 % | 654.11 | 209 | 1.33 % | 1,340.66
prav | prav | pra | v | 971 | 1.03 % | 938.07 | 214 | 1.01 % | 928.32 | 308 | 1.08 % | 1,042.15 | 317 | 1.09 % | 897.64 | 132 | 0.84 % | 846.73
naprej | naprej | napre | j | 922 | 0.98 % | 890.73 | 208 | 0.98 % | 902.30 | 134 | 0.47 % | 453.40 | 446 | 1.53 % | 1,262.92 | 134 | 0.85 % | 859.56
ful | ful | fu | l | 916 | 0.97 % | 884.93 | 52 | 0.25 % | 225.57 | 697 | 2.44 % | 2,358.38 | 36 | 0.12 % | 101.94 | 131 | 0.84 % | 840.32
mogoče | mogoče | mogoč | e | 872 | 0.92 % | 842.42 | 207 | 0.98 % | 897.96 | 163 | 0.57 % | 551.53 | 242 | 0.83 % | 685.26 | 260 | 1.66 % | 1,667.81
takrat | takrat | takra | t | 840 | 0.89 % | 811.51 | 169 | 0.80 % | 733.12 | 235 | 0.82 % | 795.15 | 313 | 1.08 % | 886.31 | 123 | 0.79 % | 789
prej | prej | pre | j | 832 | 0.88 % | 803.78 | 169 | 0.80 % | 733.12 | 263 | 0.92 % | 889.89 | 295 | 1.01 % | 835.34 | 105 | 0.67 % | 673.54
treba | treba | treb | a | 827 | 0.88 % | 798.95 | 164 | 0.78 % | 711.43 | 164 | 0.57 % | 554.91 | 331 | 1.14 % | 937.28 | 168 | 1.07 % | 1,077.66
gor | gor | go | r | 824 | 0.87 % | 796.05 | 129 | 0.61 % | 559.60 | 527 | 1.85 % | 1,783.16 | 64 | 0.22 % | 181.23 | 104 | 0.66 % | 667.12
lepo | lepo | lep | o | 806 | 0.85 % | 778.66 | 269 | 1.27 % | 1,166.91 | 254 | 0.89 % | 859.44 | 205 | 0.70 % | 580.49 | 78 | 0.50 % | 500.34
enkrat | enkrat | enkra | t | 804 | 0.85 % | 776.73 | 185 | 0.87 % | 802.52 | 273 | 0.96 % | 923.73 | 238 | 0.82 % | 673.94 | 108 | 0.69 % | 692.78
kako | kako | kak | o | 803 | 0.85 % | 775.77 | 175 | 0.83 % | 759.14 | 254 | 0.89 % | 859.44 | 258 | 0.89 % | 730.57 | 116 | 0.74 % | 744.10
tule | tule | tul | e | 756 | 0.80 % | 730.36 | 126 | 0.59 % | 546.58 | 305 | 1.07 % | 1,032 | 128 | 0.44 % | 362.45 | 197 | 1.26 % | 1,263.69
rad | rad | ra | d | 739 | 0.78 % | 713.94 | 244 | 1.15 % | 1,058.46 | 200 | 0.70 % | 676.72 | 210 | 0.72 % | 594.65 | 85 | 0.54 % | 545.25
vedno | vedno | vedn | o | 737 | 0.78 % | 712 | 234 | 1.10 % | 1,015.08 | 106 | 0.37 % | 358.66 | 291 | 1.00 % | 824.01 | 106 | 0.68 % | 679.95
skupaj | skupaj | skupa | j | 713 | 0.76 % | 688.82 | 161 | 0.76 % | 698.41 | 220 | 0.77 % | 744.40 | 222 | 0.76 % | 628.63 | 110 | 0.70 % | 705.61

2.2.127 List of final character-level 2-grams from adverb lemmas in the GOS 1.0 corpus with text-type distribution

File at CLARIN.SI: GOS1.0-word_parts-adverbs-lemmas-final-2grams-taxonomy-entire.tsv

Lemma | Lemma (lower-case) | Rest of the word | Final part of the word | Total absolute frequency of lemma | Percentage of all found lemmas | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N]
tako | tako | ta | ko | 9,713 | 10.29 % | 9,383.57 | 1,925 | 9.10 % | 8,350.58 | 3,661 | 12.85 % | 12,387.41 | 2,296 | 7.90 % | 6,501.51 | 1,831 | 11.69 % | 11,745.24
zdaj | zdaj | zd | aj | 6,358 | 6.74 % | 6,142.36 | 1,457 | 6.88 % | 6,320.41 | 1,739 | 6.10 % | 5,884.10 | 1,764 | 6.07 % | 4,995.06 | 1,398 | 8.92 % | 8,967.69
lahko | lahko | lah | ko | 4,140 | 4.39 % | 3,999.59 | 947 | 4.47 % | 4,108.05 | 902 | 3.17 % | 3,052.02 | 1,484 | 5.11 % | 4,202.19 | 807 | 5.15 % | 5,176.63
potem | potem | pot | em | 3,149 | 3.34 % | 3,042.20 | 658 | 3.11 % | 2,854.38 | 631 | 2.21 % | 2,135.06 | 1,152 | 3.96 % | 3,262.08 | 708 | 4.52 % | 4,541.58
tam | tam | t | am | 2,598 | 2.75 % | 2,509.89 | 426 | 2.01 % | 1,847.97 | 1,289 | 4.52 % | 4,361.48 | 503 | 1.73 % | 1,424.33 | 380 | 2.42 % | 2,437.57
zelo | zelo | ze | lo | 1,835 | 1.94 % | 1,772.76 | 473 | 2.23 % | 2,051.86 | 182 | 0.64 % | 615.82 | 896 | 3.08 % | 2,537.17 | 284 | 1.81 % | 1,821.76
bolj | bolj | bo | lj | 1,648 | 1.75 % | 1,592.11 | 338 | 1.60 % | 1,466.23 | 557 | 1.96 % | 1,884.67 | 479 | 1.65 % | 1,356.37 | 274 | 1.75 % | 1,757.62
danes | danes | dan | es | 1,418 | 1.50 % | 1,369.91 | 638 | 3.02 % | 2,767.62 | 264 | 0.93 % | 893.27 | 439 | 1.51 % | 1,243.10 | 77 | 0.49 % | 493.93
tu | tu | | tu | 1,406 | 1.49 % | 1,358.31 | 384 | 1.81 % | 1,665.78 | 355 | 1.25 % | 1,201.18 | 343 | 1.18 % | 971.26 | 324 | 2.07 % | 2,078.35
dobro | dobro | dob | ro | 1,320 | 1.40 % | 1,275.23 | 326 | 1.54 % | 1,414.18 | 259 | 0.91 % | 876.36 | 501 | 1.72 % | 1,418.66 | 234 | 1.49 % | 1,501.03
veliko | veliko | veli | ko | 1,316 | 1.39 % | 1,271.37 | 315 | 1.49 % | 1,366.46 | 232 | 0.81 % | 785 | 639 | 2.20 % | 1,809.43 | 130 | 0.83 % | 833.91
sem | sem | s | em | 1,307 | 1.39 % | 1,262.67 | 232 | 1.10 % | 1,006.41 | 680 | 2.39 % | 2,300.86 | 237 | 0.82 % | 671.10 | 158 | 1.01 % | 1,013.52
malo | malo | ma | lo | 1,186 | 1.26 % | 1,145.78 | 316 | 1.49 % | 1,370.80 | 376 | 1.32 % | 1,272.24 | 253 | 0.87 % | 716.41 | 241 | 1.54 % | 1,545.93
tukaj | tukaj | tuk | aj | 1,136 | 1.20 % | 1,097.47 | 134 | 0.63 % | 581.29 | 164 | 0.58 % | 554.91 | 597 | 2.06 % | 1,690.50 | 241 | 1.54 % | 1,545.93
kje | kje | k | je | 1,041 | 1.10 % | 1,005.69 | 225 | 1.06 % | 976.04 | 416 | 1.46 % | 1,407.58 | 279 | 0.96 % | 790.03 | 121 | 0.77 % | 776.17
zato | zato | za | to | 992 | 1.05 % | 958.36 | 202 | 0.95 % | 876.27 | 247 | 0.87 % | 835.75 | 394 | 1.36 % | 1,115.68 | 149 | 0.95 % | 955.78
čisto | čisto | čis | to | 974 | 1.03 % | 940.97 | 132 | 0.62 % | 572.61 | 402 | 1.41 % | 1,360.21 | 231 | 0.80 % | 654.11 | 209 | 1.33 % | 1,340.66
prav | prav | pr | av | 971 | 1.03 % | 938.07 | 214 | 1.01 % | 928.32 | 308 | 1.08 % | 1,042.15 | 317 | 1.09 % | 897.64 | 132 | 0.84 % | 846.73
naprej | naprej | napr | ej | 922 | 0.98 % | 890.73 | 208 | 0.98 % | 902.30 | 134 | 0.47 % | 453.40 | 446 | 1.53 % | 1,262.92 | 134 | 0.85 % | 859.56
ful | ful | f | ul | 916 | 0.97 % | 884.93 | 52 | 0.25 % | 225.57 | 697 | 2.45 % | 2,358.38 | 36 | 0.12 % | 101.94 | 131 | 0.84 % | 840.32
mogoče | mogoče | mogo | če | 872 | 0.92 % | 842.42 | 207 | 0.98 % | 897.96 | 163 | 0.57 % | 551.53 | 242 | 0.83 % | 685.26 | 260 | 1.66 % | 1,667.81
takrat | takrat | takr | at | 840 | 0.89 % | 811.51 | 169 | 0.80 % | 733.12 | 235 | 0.82 % | 795.15 | 313 | 1.08 % | 886.31 | 123 | 0.79 % | 789
prej | prej | pr | ej | 832 | 0.88 % | 803.78 | 169 | 0.80 % | 733.12 | 263 | 0.92 % | 889.89 | 295 | 1.01 % | 835.34 | 105 | 0.67 % | 673.54
treba | treba | tre | ba | 827 | 0.88 % | 798.95 | 164 | 0.78 % | 711.43 | 164
0.58 % | 554.91 | 331 | 1.14 % | 937.28 | 168 | 1.07 % | 1,077.66
gor | gor | g | or | 824 | 0.87 % | 796.05 | 129 | 0.61 % | 559.60 | 527 | 1.85 % | 1,783.16 | 64 | 0.22 % | 181.23 | 104 | 0.66 % | 667.12
lepo | lepo | le | po | 806 | 0.85 % | 778.66 | 269 | 1.27 % | 1,166.91 | 254 | 0.89 % | 859.44 | 205 | 0.71 % | 580.49 | 78 | 0.50 % | 500.34
enkrat | enkrat | enkr | at | 804 | 0.85 % | 776.73 | 185 | 0.87 % | 802.52 | 273 | 0.96 % | 923.73 | 238 | 0.82 % | 673.94 | 108 | 0.69 % | 692.78
kako | kako | ka | ko | 803 | 0.85 % | 775.77 | 175 | 0.83 % | 759.14 | 254 | 0.89 % | 859.44 | 258 | 0.89 % | 730.57 | 116 | 0.74 % | 744.10
tule | tule | tu | le | 756 | 0.80 % | 730.36 | 126 | 0.59 % | 546.58 | 305 | 1.07 % | 1,032 | 128 | 0.44 % | 362.45 | 197 | 1.26 % | 1,263.69
rad | rad | r | ad | 739 | 0.78 % | 713.94 | 244 | 1.15 % | 1,058.46 | 200 | 0.70 % | 676.72 | 210 | 0.72 % | 594.65 | 85 | 0.54 % | 545.25
vedno | vedno | ved | no | 737 | 0.78 % | 712 | 234 | 1.11 % | 1,015.08 | 106 | 0.37 % | 358.66 | 291 | 1.00 % | 824.01 | 106 | 0.68 % | 679.95
skupaj | skupaj | skup | aj | 713 | 0.76 % | 688.82 | 161 | 0.76 % | 698.41 | 220 | 0.77 % | 744.40 | 222 | 0.76 % | 628.63 | 110 | 0.70 % | 705.61

2.2.128 List of final character-level 3-grams from adverb lemmas in the GOS 1.0 corpus with text-type distribution

File at CLARIN.SI: GOS1.0-word_parts-adverbs-lemmas-final-3grams-taxonomy-entire.tsv

Lemma | Lemma (lower-case) | Rest of the word | Final part of the word | Total absolute frequency of lemma | Percentage of all found lemmas | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N]
tako | tako | t | ako | 9,713 | 10.45 % | 9,383.57 | 1,925 | 9.27 % | 8,350.58 | 3,661 | 13.02 % | 12,387.41 | 2,296 | 8.00 % | 6,501.51 | 1,831 | 11.93 % | 11,745.24
zdaj | zdaj | z | daj | 6,358 | 6.84 % | 6,142.36 | 1,457 | 7.02 % | 6,320.41 | 1,739 | 6.19 % | 5,884.10 | 1,764 | 6.14 % | 4,995.06 | 1,398 | 9.11 % | 8,967.69
lahko | lahko | la | hko | 4,140 | 4.46 % | 3,999.59 | 947 | 4.56 % | 4,108.05 | 902 | 3.21 % | 3,052.02 | 1,484 | 5.17 % | 4,202.19 | 807 | 5.26 % | 5,176.63
potem | potem | po | tem | 3,149 | 3.39 % | 3,042.20 | 658 | 3.17 % | 2,854.38 | 631 | 2.25 % | 2,135.06 | 1,152 | 4.01 % | 3,262.08 | 708 | 4.61 % | 4,541.58
tam | tam | | tam | 2,598 | 2.80 % | 2,509.89 | 426 | 2.05 % | 1,847.97 | 1,289 | 4.59 % | 4,361.48 | 503 | 1.75 % | 1,424.33 | 380 | 2.48 % | 2,437.57
zelo | zelo | z | elo | 1,835 | 1.98 % | 1,772.76 | 473 | 2.28 % | 2,051.86 | 182 | 0.65 % | 615.82 | 896 | 3.12 % | 2,537.17 | 284 | 1.85 % | 1,821.76
bolj | bolj | b | olj | 1,648 | 1.77 % | 1,592.11 | 338 | 1.63 % | 1,466.23 | 557 | 1.98 % | 1,884.67 | 479 | 1.67 % | 1,356.37 | 274 | 1.79 % | 1,757.62
danes | danes | da | nes | 1,418 | 1.53 % | 1,369.91 | 638 | 3.07 % | 2,767.62 | 264 | 0.94 % | 893.27 | 439 | 1.53 % | 1,243.10 | 77 | 0.50 % | 493.93
dobro | dobro | do | bro | 1,320 | 1.42 % | 1,275.23 | 326 | 1.57 % | 1,414.18 | 259 | 0.92 % | 876.36 | 501 | 1.75 % | 1,418.66 | 234 | 1.52 % | 1,501.03
veliko | veliko | vel | iko | 1,316 | 1.42 % | 1,271.37 | 315 | 1.52 % | 1,366.46 | 232 | 0.82 % | 785 | 639 | 2.23 % | 1,809.43 | 130 | 0.85 % | 833.91
sem | sem | | sem | 1,307 | 1.41 % | 1,262.67 | 232 | 1.12 % | 1,006.41 | 680 | 2.42 % | 2,300.86 | 237 | 0.82 % | 671.10 | 158 | 1.03 % | 1,013.52
malo | malo | m | alo | 1,186 | 1.28 % | 1,145.78 | 316 | 1.52 % | 1,370.80 | 376 | 1.34 % | 1,272.24 | 253 | 0.88 % | 716.41 | 241 | 1.57 % | 1,545.93
tukaj | tukaj | tu | kaj | 1,136 | 1.22 % | 1,097.47 | 134 | 0.65 % | 581.29 | 164 | 0.58 % | 554.91 | 597 | 2.08 % | 1,690.50 | 241 | 1.57 % | 1,545.93
kje | kje | | kje | 1,041 | 1.12 % | 1,005.69 | 225 | 1.08 % | 976.04 | 416 | 1.48 % | 1,407.58 | 279 | 0.97 % | 790.03 | 121 | 0.79 % | 776.17
zato | zato | z | ato | 992 | 1.07 % | 958.36 | 202 | 0.97 % | 876.27 | 247 | 0.88 % | 835.75 | 394 | 1.37 % | 1,115.68 | 149 | 0.97 % | 955.78
čisto | čisto | či | sto | 974 | 1.05 % | 940.97 | 132 | 0.64 % | 572.61 | 402 | 1.43 % | 1,360.21 | 231 | 0.81 % | 654.11 | 209 | 1.36 % | 1,340.66
prav | prav | p | rav | 971 | 1.04 % | 938.07 | 214 | 1.03 % | 928.32 | 308 | 1.10 % | 1,042.15 | 317 | 1.10 % | 897.64 | 132 | 0.86 % | 846.73
naprej | naprej | nap | rej | 922 | 0.99 % | 890.73 | 208 | 1.00 % | 902.30 | 134 | 0.48 % | 453.40 | 446 | 1.55 % | 1,262.92 | 134 | 0.87 % | 859.56
ful | ful | | ful | 916 | 0.99 % | 884.93 | 52 | 0.25 % | 225.57 | 697 | 2.48 % | 2,358.38 | 36 | 0.12 % | 101.94 | 131 | 0.85 % | 840.32
mogoče | mogoče | mog | oče | 872 | 0.94 % | 842.42 | 207 | 1.00 % | 897.96 | 163 | 0.58 % | 551.53
242 | 0.84 % | 685.26 | 260 | 1.69 % | 1,667.81
takrat | takrat | tak | rat | 840 | 0.90 % | 811.51 | 169 | 0.81 % | 733.12 | 235 | 0.84 % | 795.15 | 313 | 1.09 % | 886.31 | 123 | 0.80 % | 789
prej | prej | p | rej | 832 | 0.90 % | 803.78 | 169 | 0.81 % | 733.12 | 263 | 0.94 % | 889.89 | 295 | 1.03 % | 835.34 | 105 | 0.68 % | 673.54
treba | treba | tr | eba | 827 | 0.89 % | 798.95 | 164 | 0.79 % | 711.43 | 164 | 0.58 % | 554.91 | 331 | 1.15 % | 937.28 | 168 | 1.09 % | 1,077.66
gor | gor | | gor | 824 | 0.89 % | 796.05 | 129 | 0.62 % | 559.60 | 527 | 1.88 % | 1,783.16 | 64 | 0.22 % | 181.23 | 104 | 0.68 % | 667.12
lepo | lepo | l | epo | 806 | 0.87 % | 778.66 | 269 | 1.30 % | 1,166.91 | 254 | 0.90 % | 859.44 | 205 | 0.71 % | 580.49 | 78 | 0.51 % | 500.34
enkrat | enkrat | enk | rat | 804 | 0.86 % | 776.73 | 185 | 0.89 % | 802.52 | 273 | 0.97 % | 923.73 | 238 | 0.83 % | 673.94 | 108 | 0.70 % | 692.78
kako | kako | k | ako | 803 | 0.86 % | 775.77 | 175 | 0.84 % | 759.14 | 254 | 0.90 % | 859.44 | 258 | 0.90 % | 730.57 | 116 | 0.76 % | 744.10
tule | tule | t | ule | 756 | 0.81 % | 730.36 | 126 | 0.61 % | 546.58 | 305 | 1.08 % | 1,032 | 128 | 0.45 % | 362.45 | 197 | 1.28 % | 1,263.69
rad | rad | | rad | 739 | 0.80 % | 713.94 | 244 | 1.18 % | 1,058.46 | 200 | 0.71 % | 676.72 | 210 | 0.73 % | 594.65 | 85 | 0.55 % | 545.25
vedno | vedno | ve | dno | 737 | 0.79 % | 712 | 234 | 1.13 % | 1,015.08 | 106 | 0.38 % | 358.66 | 291 | 1.01 % | 824.01 | 106 | 0.69 % | 679.95
skupaj | skupaj | sku | paj | 713 | 0.77 % | 688.82 | 161 | 0.78 % | 698.41 | 220 | 0.78 % | 744.40 | 222 | 0.77 % | 628.63 | 110 | 0.72 % | 705.61
kdaj | kdaj | k | daj | 701 | 0.75 % | 677.22 | 177 | 0.85 % | 767.82 | 253 | 0.90 % | 856.05 | 182 | 0.63 % | 515.36 | 89 | 0.58 % | 570.90

2.2.129 List of final character-level 4-grams from adverb lemmas in the GOS 1.0 corpus with text-type distribution

File at CLARIN.SI: GOS1.0-word_parts-adverbs-lemmas-final-4grams-taxonomy-entire.tsv

Lemma | Lemma (lower-case) | Rest of the word | Final part of the word | Total absolute frequency of lemma | Percentage of all found lemmas | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N]
tako | tako | | tako | 9,713 | 11.96 % | 9,383.57 | 1,925 | 10.41 % | 8,350.58 | 3,661 | 16.30 % | 12,387.41 | 2,296 | 8.66 % | 6,501.51 | 1,831 | 13.31 % | 11,745.24
zdaj | zdaj | | zdaj | 6,358 | 7.83 % | 6,142.36 | 1,457 | 7.88 % | 6,320.41 | 1,739 | 7.74 % | 5,884.10 | 1,764 | 6.65 % | 4,995.06 | 1,398 | 10.16 % | 8,967.69
lahko | lahko | l | ahko | 4,140 | 5.10 % | 3,999.59 | 947 | 5.12 % | 4,108.05 | 902 | 4.01 % | 3,052.02 | 1,484 | 5.59 % | 4,202.19 | 807 | 5.87 % | 5,176.63
potem | potem | p | otem | 3,149 | 3.88 % | 3,042.20 | 658 | 3.56 % | 2,854.38 | 631 | 2.81 % | 2,135.06 | 1,152 | 4.34 % | 3,262.08 | 708 | 5.15 % | 4,541.58
zelo | zelo | | zelo | 1,835 | 2.26 % | 1,772.76 | 473 | 2.56 % | 2,051.86 | 182 | 0.81 % | 615.82 | 896 | 3.38 % | 2,537.17 | 284 | 2.06 % | 1,821.76
bolj | bolj | | bolj | 1,648 | 2.03 % | 1,592.11 | 338 | 1.83 % | 1,466.23 | 557 | 2.48 % | 1,884.67 | 479 | 1.81 % | 1,356.37 | 274 | 1.99 % | 1,757.62
danes | danes | d | anes | 1,418 | 1.75 % | 1,369.91 | 638 | 3.45 % | 2,767.62 | 264 | 1.18 % | 893.27 | 439 | 1.66 % | 1,243.10 | 77 | 0.56 % | 493.93
dobro | dobro | d | obro | 1,320 | 1.62 % | 1,275.23 | 326 | 1.76 % | 1,414.18 | 259 | 1.15 % | 876.36 | 501 | 1.89 % | 1,418.66 | 234 | 1.70 % | 1,501.03
veliko | veliko | ve | liko | 1,316 | 1.62 % | 1,271.37 | 315 | 1.70 % | 1,366.46 | 232 | 1.03 % | 785 | 639 | 2.41 % | 1,809.43 | 130 | 0.94 % | 833.91
malo | malo | | malo | 1,186 | 1.46 % | 1,145.78 | 316 | 1.71 % | 1,370.80 | 376 | 1.67 % | 1,272.24 | 253 | 0.95 % | 716.41 | 241 | 1.75 % | 1,545.93
tukaj | tukaj | t | ukaj | 1,136 | 1.40 % | 1,097.47 | 134 | 0.72 % | 581.29 | 164 | 0.73 % | 554.91 | 597 | 2.25 % | 1,690.50 | 241 | 1.75 % | 1,545.93
zato | zato | | zato | 992 | 1.22 % | 958.36 | 202 | 1.09 % | 876.27 | 247 | 1.10 % | 835.75 | 394 | 1.49 % | 1,115.68 | 149 | 1.08 % | 955.78
čisto | čisto | č | isto | 974 | 1.20 % | 940.97 | 132 | 0.71 % | 572.61 | 402 | 1.79 % | 1,360.21 | 231 | 0.87 % | 654.11 | 209 | 1.52 % | 1,340.66
prav | prav | | prav | 971 | 1.20 % | 938.07 | 214 | 1.16 % | 928.32 | 308 | 1.37 % | 1,042.15 | 317 | 1.20 % | 897.64 | 132 | 0.96 % | 846.73
naprej | naprej | na | prej | 922 | 1.14 % | 890.73 | 208 | 1.12 % | 902.30 | 134 | 0.60 % | 453.40 | 446 | 1.68 % | 1,262.92 | 134 | 0.97 % | 859.56
mogoče | mogoče | mo | goče | 872 | 1.07 % | 842.42 | 207 | 1.12 % | 897.96 | 163 | 0.73 % | 551.53 | 242 | 0.91 % | 685.26 | 260
1.89 % | 1,667.81
takrat | takrat | ta | krat | 840 | 1.03 % | 811.51 | 169 | 0.91 % | 733.12 | 235 | 1.05 % | 795.15 | 313 | 1.18 % | 886.31 | 123 | 0.89 % | 789
prej | prej | | prej | 832 | 1.02 % | 803.78 | 169 | 0.91 % | 733.12 | 263 | 1.17 % | 889.89 | 295 | 1.11 % | 835.34 | 105 | 0.76 % | 673.54
treba | treba | t | reba | 827 | 1.02 % | 798.95 | 164 | 0.89 % | 711.43 | 164 | 0.73 % | 554.91 | 331 | 1.25 % | 937.28 | 168 | 1.22 % | 1,077.66
lepo | lepo | | lepo | 806 | 0.99 % | 778.66 | 269 | 1.46 % | 1,166.91 | 254 | 1.13 % | 859.44 | 205 | 0.77 % | 580.49 | 78 | 0.57 % | 500.34
enkrat | enkrat | en | krat | 804 | 0.99 % | 776.73 | 185 | 1.00 % | 802.52 | 273 | 1.22 % | 923.73 | 238 | 0.90 % | 673.94 | 108 | 0.79 % | 692.78
kako | kako | | kako | 803 | 0.99 % | 775.77 | 175 | 0.95 % | 759.14 | 254 | 1.13 % | 859.44 | 258 | 0.97 % | 730.57 | 116 | 0.84 % | 744.10
tule | tule | | tule | 756 | 0.93 % | 730.36 | 126 | 0.68 % | 546.58 | 305 | 1.36 % | 1,032 | 128 | 0.48 % | 362.45 | 197 | 1.43 % | 1,263.69
vedno | vedno | v | edno | 737 | 0.91 % | 712 | 234 | 1.26 % | 1,015.08 | 106 | 0.47 % | 358.66 | 291 | 1.10 % | 824.01 | 106 | 0.77 % | 679.95
skupaj | skupaj | sk | upaj | 713 | 0.88 % | 688.82 | 161 | 0.87 % | 698.41 | 220 | 0.98 % | 744.40 | 222 | 0.84 % | 628.63 | 110 | 0.80 % | 705.61
kdaj | kdaj | | kdaj | 701 | 0.86 % | 677.22 | 177 | 0.96 % | 767.82 | 253 | 1.13 % | 856.05 | 182 | 0.69 % | 515.36 | 89 | 0.65 % | 570.90
spet | spet | | spet | 647 | 0.80 % | 625.06 | 145 | 0.78 % | 629 | 223 | 0.99 % | 754.55 | 188 | 0.71 % | 532.35 | 91 | 0.66 % | 583.73
nazaj | nazaj | n | azaj | 579 | 0.71 % | 559.36 | 136 | 0.73 % | 589.96 | 201 | 0.90 % | 680.11 | 176 | 0.66 % | 498.37 | 66 | 0.48 % | 423.37
noter | noter | n | oter | 554 | 0.68 % | 535.21 | 127 | 0.69 % | 550.92 | 256 | 1.14 % | 866.21 | 69 | 0.26 % | 195.38 | 102 | 0.74 % | 654.29
dosti | dosti | d | osti | 516 | 0.64 % | 498.50 | 115 | 0.62 % | 498.87 | 212 | 0.94 % | 717.33 | 82 | 0.31 % | 232.20 | 107 | 0.78 % | 686.37
nekaj | nekaj | n | ekaj | 507 | 0.62 % | 489.80 | 95 | 0.51 % | 412.11 | 177 | 0.79 % | 598.90 | 173 | 0.65 % | 489.88 | 62 | 0.45 % | 397.71
verjetno | verjetno | verj | etno | 505 | 0.62 % | 487.87 | 111 | 0.60 % | 481.51 | 106 | 0.47 % | 358.66 | 192 | 0.72 % | 543.68 | 96 | 0.70 % | 615.81

2.2.130 List of final character-level 5-grams from adverb lemmas in the GOS 1.0 corpus with text-type distribution

File at CLARIN.SI: GOS1.0-word_parts-adverbs-lemmas-final-5grams-taxonomy-entire.tsv

Lemma | Lemma (lower-case) | Rest of the word | Final part of the word | Total absolute frequency of lemma | Percentage of all found lemmas | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N]
lahko | lahko | | lahko | 4,140 | 8.06 % | 3,999.59 | 947 | 7.98 % | 4,108.05 | 902 | 7.02 % | 3,052.02 | 1,484 | 8.13 % | 4,202.19 | 807 | 9.62 % | 5,176.63
potem | potem | | potem | 3,149 | 6.13 % | 3,042.20 | 658 | 5.54 % | 2,854.38 | 631 | 4.91 % | 2,135.06 | 1,152 | 6.31 % | 3,262.08 | 708 | 8.44 % | 4,541.58
danes | danes | | danes | 1,418 | 2.76 % | 1,369.91 | 638 | 5.37 % | 2,767.62 | 264 | 2.06 % | 893.27 | 439 | 2.40 % | 1,243.10 | 77 | 0.92 % | 493.93
dobro | dobro | | dobro | 1,320 | 2.57 % | 1,275.23 | 326 | 2.75 % | 1,414.18 | 259 | 2.02 % | 876.36 | 501 | 2.74 % | 1,418.66 | 234 | 2.79 % | 1,501.03
veliko | veliko | v | eliko | 1,316 | 2.56 % | 1,271.37 | 315 | 2.65 % | 1,366.46 | 232 | 1.81 % | 785 | 639 | 3.50 % | 1,809.43 | 130 | 1.55 % | 833.91
tukaj | tukaj | | tukaj | 1,136 | 2.21 % | 1,097.47 | 134 | 1.13 % | 581.29 | 164 | 1.28 % | 554.91 | 597 | 3.27 % | 1,690.50 | 241 | 2.87 % | 1,545.93
čisto | čisto | | čisto | 974 | 1.90 % | 940.97 | 132 | 1.11 % | 572.61 | 402 | 3.13 % | 1,360.21 | 231 | 1.26 % | 654.11 | 209 | 2.49 % | 1,340.66
naprej | naprej | n | aprej | 922 | 1.79 % | 890.73 | 208 | 1.75 % | 902.30 | 134 | 1.04 % | 453.40 | 446 | 2.44 % | 1,262.92 | 134 | 1.60 % | 859.56
mogoče | mogoče | m | ogoče | 872 | 1.70 % | 842.42 | 207 | 1.74 % | 897.96 | 163 | 1.27 % | 551.53 | 242 | 1.33 % | 685.26 | 260 | 3.10 % | 1,667.81
takrat | takrat | t | akrat | 840 | 1.64 % | 811.51 | 169 | 1.42 % | 733.12 | 235 | 1.83 % | 795.15 | 313 | 1.71 % | 886.31 | 123 | 1.47 % | 789
treba | treba | | treba | 827 | 1.61 % | 798.95 | 164 | 1.38 % | 711.43 | 164 | 1.28 % | 554.91 | 331 | 1.81 % | 937.28 | 168 | 2.00 % | 1,077.66
enkrat | enkrat | e | nkrat | 804 | 1.56 % | 776.73 | 185 | 1.56 % | 802.52 | 273 | 2.13 % | 923.73 | 238 | 1.30 % | 673.94 | 108 | 1.29 % | 692.78
vedno | vedno | | vedno | 737 | 1.44 % | 712 | 234 | 1.97 % | 1,015.08 | 106 | 0.82 % | 358.66 | 291 | 1.59 % | 824.01 | 106 | 1.26 % | 679.95
skupaj | skupaj | s | kupaj | 713 | 1.39 % | 688.82 | 161 | 1.36 % | 698.41 | 220 | 1.71 % | 744.40 | 222 | 1.22 % | 628.63 | 110 | 1.31 % | 705.61
nazaj | nazaj | | nazaj | 579 | 1.13 % | 559.36 | 136 | 1.15 % | 589.96 | 201 | 1.56 % | 680.11 | 176 | 0.96 % | 498.37 | 66 | 0.79 % | 423.37
noter | noter | | noter | 554 | 1.08 % | 535.21 | 127 | 1.07 % | 550.92 | 256 | 1.99 % | 866.21 | 69 | 0.38 % | 195.38 | 102 | 1.22 % | 654.29
dosti | dosti | | dosti | 516 | 1.00 % | 498.50 | 115 | 0.97 % | 498.87 | 212 | 1.65 % | 717.33 | 82 | 0.45 % | 232.20 | 107 | 1.27 % | 686.37
nekaj | nekaj | | nekaj | 507 | 0.99 % | 489.80 | 95 | 0.80 % | 412.11 | 177 | 1.38 % | 598.90 | 173 | 0.95 % | 489.88 | 62 | 0.74 % | 397.71
verjetno | verjetno | ver | jetno | 505 | 0.98 % | 487.87 | 111 | 0.94 % | 481.51 | 106 | 0.82 % | 358.66 | 192 | 1.05 % | 543.68 | 96 | 1.14 % | 615.81
drugače | drugače | dr | ugače | 487 | 0.95 % | 470.48 | 69 | 0.58 % | 299.32 | 212 | 1.65 % | 717.33 | 132 | 0.72 % | 373.78 | 74 | 0.88 % | 474.68
točno | točno | | točno | 478 | 0.93 % | 461.79 | 127 | 1.07 % | 550.92 | 132 | 1.03 % | 446.64 | 120 | 0.66 % | 339.80 | 99 | 1.18 % | 635.05
zdajle | zdajle | z | dajle | 478 | 0.93 % | 461.79 | 127 | 1.07 % | 550.92 | 204 | 1.59 % | 690.26 | 101 | 0.55 % | 286 | 46 | 0.55 % | 295.07
zakaj | zakaj | | zakaj | 475 | 0.93 % | 458.89 | 65 | 0.55 % | 281.97 | 107 | 0.83 % | 362.05 | 262 | 1.44 % | 741.90 | 41 | 0.49 % | 263
najbolj | najbolj | na | jbolj | 462 | 0.90 % | 446.33 | 125 | 1.05 % | 542.25 | 105 | 0.82 % | 355.28 | 177 | 0.97 % | 501.20 | 55 | 0.66 % | 352.81
koliko | koliko | k | oliko | 460 | 0.90 % | 444.40 | 66 | 0.56 % | 286.31 | 168 | 1.31 % | 568.45 | 143 | 0.78 % | 404.93 | 83 | 0.99 % | 532.42
najprej | najprej | na | jprej | 451 | 0.88 % | 435.70 | 95 | 0.80 % | 412.11 | 51 | 0.40 % | 172.56 | 240 | 1.31 % | 679.60 | 65 | 0.78 % | 416.95
včasih | včasih | v | časih | 445 | 0.87 % | 429.91 | 86 | 0.72 % | 373.06 | 159 | 1.24 % | 537.99 | 142 | 0.78 % | 402.10 | 58 | 0.69 % | 372.05
super | super | | super | 438 | 0.85 % | 423.14 | 140 | 1.18 % | 607.31 | 115 | 0.90 % | 389.12 | 29 | 0.16 % | 82.12 | 154 | 1.83 % | 987.86
toliko | toliko | t | oliko | 434 | 0.84 % | 419.28 | 99 | 0.83 % | 429.46 | 154 | 1.20 % | 521.08 | 108 | 0.59 % | 305.82 | 73 | 0.87 % | 468.27
dejansko | dejansko | dej | ansko | 406 | 0.79 % | 392.23 | 40 | 0.34 % | 173.52 | 72 | 0.56 % | 243.62 | 184 | 1.01 % | 521.03 | 110 | 1.31 % | 705.61
notri | notri | | notri | 396 | 0.77 % | 382.57 | 37 | 0.31 % | 160.50 | 235 | 1.83 % | 795.15 | 43 | 0.24 % | 121.76 | 81 | 0.96 %
519.59
nekje nekje nekje 393 0.77 % 379.67 44 0.37 % 190.87 125 0.97 % 422.95 112 0.61 % 317.15 112 1.33 % 718.44

2.2.131 List of initial character-level 1-grams from adverb standardized forms in the GOS 1.0 corpus with text-type distribution
File at CLARIN.SI: GOS1.0-word_parts-adverbs-standardized_forms-initial-1grams-taxonomy-entire.tsv
Columns: Standardized form | Initial part of the word | Rest of the word | Total absolute frequency of standardized form | Percentage of all found standardized forms | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N]
tako t ako 9,713 10.29 % 9,383.57 1,925 9.09 % 8,350.58 3,661 12.84 % 12,387.41 2,296 7.90 % 6,501.51 1,831 11.68 % 11,745.24
zdaj z daj 6,356 6.73 % 6,140.43 1,457 6.88 % 6,320.41 1,737 6.09 % 5,877.34 1,764 6.07 % 4,995.06 1,398 8.92 % 8,967.69
lahko l ahko 4,000 4.24 % 3,864.33 901 4.26 % 3,908.50 885 3.10 % 2,994.50 1,437 4.94 % 4,069.10 777 4.96 % 4,984.19
potem p otem 3,149 3.33 % 3,042.20 658 3.11 % 2,854.38 631 2.21 % 2,135.06 1,152 3.96 % 3,262.08 708 4.52 % 4,541.58
tam t am 2,598 2.75 % 2,509.89 426 2.01 % 1,847.97 1,289 4.52 % 4,361.48 503 1.73 % 1,424.33 380 2.42 % 2,437.57
zelo z elo 1,835 1.94 % 1,772.76 473 2.23 % 2,051.86 182 0.64 % 615.82 896 3.08 % 2,537.17 284 1.81 % 1,821.76
bolj b olj 1,648 1.75 % 1,592.11 338 1.60 % 1,466.23 557 1.95 % 1,884.67 479 1.65 % 1,356.37 274 1.75 % 1,757.62
danes d anes 1,414 1.50 % 1,366.04 634 3.00 % 2,750.27 264 0.93 % 893.27 439 1.51 % 1,243.10 77 0.49 % 493.93
tu t u 1,406 1.49 % 1,358.31 384 1.81 % 1,665.78 355 1.25 % 1,201.18 343 1.18 % 971.26 324 2.07 % 2,078.35
sem s em 1,307 1.38 % 1,262.67 232 1.10 % 1,006.41 680 2.38 % 2,300.86 237 0.81 % 671.10 158 1.01 % 1,013.52
dobro d obro 1,147 1.22 % 1,108.10 279 1.32 % 1,210.29 205 0.72 % 693.64 455 1.56 % 1,288.41 208 1.33 % 1,334.25
tukaj t ukaj 1,135 1.20 % 1,096.50 133 0.63 % 576.95 164 0.57 % 554.91 597 2.05 % 1,690.50 241 1.54 % 1,545.93
malo m alo 1,131 1.20 % 1,092.64 308 1.46 % 1,336.09 364 1.28 % 1,231.64 222 0.76 % 628.63 237 1.51 % 1,520.27
kje k je 1,039 1.10 % 1,003.76 224 1.06 % 971.70 415 1.46 % 1,404.20 279 0.96 % 790.03 121 0.77 % 776.17
zato z ato 992 1.05 % 958.36 202 0.95 % 876.27 247 0.87 % 835.75 394 1.36 % 1,115.68 149 0.95 % 955.78
čisto č isto 974 1.03 % 940.97 132 0.62 % 572.61 402 1.41 % 1,360.21 231 0.80 % 654.11 209 1.33 % 1,340.66
prav p rav 971 1.03 % 938.07 214 1.01 % 928.32 308 1.08 % 1,042.15 317 1.09 % 897.64 132 0.84 % 846.73
naprej n aprej 922 0.98 % 890.73 208 0.98 % 902.30 134 0.47 % 453.40 446 1.53 % 1,262.92 134 0.85 % 859.56
ful f ul 916 0.97 % 884.93 52 0.25 % 225.57 697 2.44 % 2,358.38 36 0.12 % 101.94 131 0.84 % 840.32
mogoče m ogoče 872 0.92 % 842.42 207 0.98 % 897.96 163 0.57 % 551.53 242 0.83 % 685.26 260 1.66 % 1,667.81
takrat t akrat 840 0.89 % 811.51 169 0.80 % 733.12 235 0.82 % 795.15 313 1.08 % 886.31 123 0.79 % 789
prej p rej 832 0.88 % 803.78 169 0.80 % 733.12 263 0.92 % 889.89 295 1.01 % 835.34 105 0.67 % 673.54
treba t reba 827 0.88 % 798.95 164 0.78 % 711.43 164 0.57 % 554.91 331 1.14 % 937.28 168 1.07 % 1,077.66
gor g or 824 0.87 % 796.05 129 0.61 % 559.60 527 1.85 % 1,783.16 64 0.22 % 181.23 104 0.66 % 667.12
enkrat e nkrat 803 0.85 % 775.77 184 0.87 % 798.18 273 0.96 % 923.73 238 0.82 % 673.94 108 0.69 % 692.78
kako k ako 803 0.85 % 775.77 175 0.83 % 759.14 254 0.89 % 859.44 258 0.89 % 730.57 116 0.74 % 744.10
tule t ule 756 0.80 % 730.36 126 0.59 % 546.58 305 1.07 % 1,032 128 0.44 % 362.45 197 1.26 % 1,263.69
lepo l epo 743 0.79 % 717.80 240 1.13 % 1,041.11 245 0.86 % 828.99 190 0.65 % 538.02 68 0.43 % 436.20
vedno
v edno 733 0.78 % 708.14 230 1.09 % 997.73 106 0.37 % 358.66 291 1.00 % 824.01 106 0.68 % 679.95
veliko v eliko 720 0.76 % 695.58 182 0.86 % 789.51 123 0.43 % 416.18 369 1.27 % 1,044.88 46 0.29 % 295.07
skupaj s kupaj 713 0.76 % 688.82 161 0.76 % 698.41 220 0.77 % 744.40 222 0.76 % 628.63 110 0.70 % 705.61
kdaj k daj 701 0.74 % 677.22 177 0.84 % 767.82 253 0.89 % 856.05 182 0.63 % 515.36 89 0.57 % 570.90

2.2.132 List of initial character-level 2-grams from adverb standardized forms in the GOS 1.0 corpus with text-type distribution
File at CLARIN.SI: GOS1.0-word_parts-adverbs-standardized_forms-initial-2grams-taxonomy-entire.tsv
Columns: Standardized form | Initial part of the word | Rest of the word | Total absolute frequency of standardized form | Percentage of all found standardized forms | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N]
tako ta ko 9,713 10.29 % 9,383.57 1,925 9.10 % 8,350.58 3,661 12.85 % 12,387.41 2,296 7.90 % 6,501.51 1,831 11.69 % 11,745.24
zdaj zd aj 6,356 6.73 % 6,140.43 1,457 6.88 % 6,320.41 1,737 6.10 % 5,877.34 1,764 6.07 % 4,995.06 1,398 8.92 % 8,967.69
lahko la hko 4,000 4.24 % 3,864.33 901 4.26 % 3,908.50 885 3.11 % 2,994.50 1,437 4.95 % 4,069.10 777 4.96 % 4,984.19
potem po tem 3,149 3.34 % 3,042.20 658 3.11 % 2,854.38 631 2.21 % 2,135.06 1,152 3.96 % 3,262.08 708 4.52 % 4,541.58
tam ta m 2,598 2.75 % 2,509.89 426 2.01 % 1,847.97 1,289 4.52 % 4,361.48 503 1.73 % 1,424.33 380 2.42 % 2,437.57
zelo ze lo 1,835 1.94 % 1,772.76 473 2.23 % 2,051.86 182 0.64 % 615.82 896 3.08 % 2,537.17 284 1.81 % 1,821.76
bolj bo lj 1,648 1.75 % 1,592.11 338
1.60 % 1,466.23 557 1.96 % 1,884.67 479 1.65 % 1,356.37 274 1.75 % 1,757.62 danes da nes 1,414 1.50 % 1,366.04 634 3.00 % 2,750.27 264 0.93 % 893.27 439 1.51 % 1,243.10 77 0.49 % 493.93 tu tu 1,406 1.49 % 1,358.31 384 1.81 % 1,665.78 355 1.25 % 1,201.18 343 1.18 % 971.26 324 2.07 % 2,078.35 sem se m 1,307 1.39 % 1,262.67 232 1.10 % 1,006.41 680 2.39 % 2,300.86 237 0.82 % 671.10 158 1.01 % 1,013.52 dobro do bro 1,147 1.22 % 1,108.10 279 1.32 % 1,210.29 205 0.72 % 693.64 455 1.57 % 1,288.41 208 1.33 % 1,334.25 tukaj tu kaj 1,135 1.20 % 1,096.50 133 0.63 % 576.95 164 0.58 % 554.91 597 2.06 % 1,690.50 241 1.54 % 1,545.93 malo ma lo 1,131 1.20 % 1,092.64 308 1.46 % 1,336.09 364 1.28 % 1,231.64 222 0.76 % 628.63 237 1.51 % 1,520.27 kje kj e 1,039 1.10 % 1,003.76 224 1.06 % 971.70 415 1.46 % 1,404.20 279 0.96 % 790.03 121 0.77 % 776.17 zato za to 992 1.05 % 958.36 202 0.95 % 876.27 247 0.87 % 835.75 394 1.36 % 1,115.68 149 0.95 % 955.78 čisto či sto 974 1.03 % 940.97 132 0.62 % 572.61 402 1.41 % 1,360.21 231 0.80 % 654.11 209 1.33 % 1,340.66 prav pr av 971 1.03 % 938.07 214 1.01 % 928.32 308 1.08 % 1,042.15 317 1.09 % 897.64 132 0.84 % 846.73 naprej na prej 922 0.98 % 890.73 208 0.98 % 902.30 134 0.47 % 453.40 446 1.53 % 1,262.92 134 0.85 % 859.56 ful fu l 916 0.97 % 884.93 52 0.25 % 225.57 697 2.45 % 2,358.38 36 0.12 % 101.94 131 0.84 % 840.32 mogoče mo goče 872 0.92 % 842.42 207 0.98 % 897.96 163 0.57 % 551.53 242 0.83 % 685.26 260 1.66 % 1,667.81 takrat ta krat 840 0.89 % 811.51 169 0.80 % 733.12 235 0.82 % 795.15 313 1.08 % 886.31 123 0.79 % 789 prej pr ej 832 0.88 % 803.78 169 0.80 % 733.12 263 0.92 % 889.89 295 1.01 % 835.34 105 0.67 % 673.54 treba tr eba 827 0.88 % 798.95 164 0.78 % 711.43 164 0.58 % 554.91 331 1.14 % 937.28 168 1.07 % 1,077.66 gor go r 824 0.87 % 796.05 129 0.61 % 559.60 527 1.85 % 1,783.16 64 0.22 % 181.23 104 0.66 % 667.12 enkrat en krat 803 0.85 % 775.77 184 0.87 % 798.18 273 0.96 % 923.73 238 0.82 % 673.94 108 0.69 % 692.78 kako ka ko 803 0.85 
% 775.77 175 0.83 % 759.14 254 0.89 % 859.44 258 0.89 % 730.57 116 0.74 % 744.10
tule tu le 756 0.80 % 730.36 126 0.59 % 546.58 305 1.07 % 1,032 128 0.44 % 362.45 197 1.26 % 1,263.69
lepo le po 743 0.79 % 717.80 240 1.13 % 1,041.11 245 0.86 % 828.99 190 0.65 % 538.02 68 0.43 % 436.20
vedno ve dno 733 0.78 % 708.14 230 1.09 % 997.73 106 0.37 % 358.66 291 1.00 % 824.01 106 0.68 % 679.95
veliko ve liko 720 0.76 % 695.58 182 0.86 % 789.51 123 0.43 % 416.18 369 1.27 % 1,044.88 46 0.29 % 295.07
skupaj sk upaj 713 0.76 % 688.82 161 0.76 % 698.41 220 0.77 % 744.40 222 0.76 % 628.63 110 0.70 % 705.61
kdaj kd aj 701 0.74 % 677.22 177 0.84 % 767.82 253 0.89 % 856.05 182 0.63 % 515.36 89 0.57 % 570.90

2.2.133 List of initial character-level 3-grams from adverb standardized forms in the GOS 1.0 corpus with text-type distribution
File at CLARIN.SI: GOS1.0-word_parts-adverbs-standardized_forms-initial-3grams-taxonomy-entire.tsv
Columns: Standardized form | Initial part of the word | Rest of the word | Total absolute frequency of standardized form | Percentage of all found standardized forms | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N]
tako tak o 9,713 10.45 % 9,383.57 1,925 9.27 % 8,350.58 3,661 13.02 % 12,387.41 2,296 8.00 % 6,501.51 1,831 11.93 % 11,745.24
zdaj zda j 6,356 6.84 % 6,140.43 1,457 7.02 % 6,320.41 1,737 6.18 % 5,877.34 1,764 6.14 % 4,995.06 1,398 9.11 % 8,967.69
lahko lah ko 4,000 4.30 % 3,864.33 901 4.34 % 3,908.50 885 3.15 % 2,994.50 1,437 5.00 % 4,069.10 777 5.06 % 4,984.19
potem pot em 3,149 3.39 % 3,042.20 658 3.17 % 2,854.38 631 2.25 % 2,135.06 1,152 4.01
% 3,262.08 708 4.61 % 4,541.58 tam tam 2,598 2.80 % 2,509.89 426 2.05 % 1,847.97 1,289 4.59 % 4,361.48 503 1.75 % 1,424.33 380 2.48 % 2,437.57 zelo zel o 1,835 1.98 % 1,772.76 473 2.28 % 2,051.86 182 0.65 % 615.82 896 3.12 % 2,537.17 284 1.85 % 1,821.76 bolj bol j 1,648 1.77 % 1,592.11 338 1.63 % 1,466.23 557 1.98 % 1,884.67 479 1.67 % 1,356.37 274 1.79 % 1,757.62 danes dan es 1,414 1.52 % 1,366.04 634 3.06 % 2,750.27 264 0.94 % 893.27 439 1.53 % 1,243.10 77 0.50 % 493.93 sem sem 1,307 1.41 % 1,262.67 232 1.12 % 1,006.41 680 2.42 % 2,300.86 237 0.82 % 671.10 158 1.03 % 1,013.52 dobro dob ro 1,147 1.23 % 1,108.10 279 1.34 % 1,210.29 205 0.73 % 693.64 455 1.58 % 1,288.41 208 1.35 % 1,334.25 tukaj tuk aj 1,135 1.22 % 1,096.50 133 0.64 % 576.95 164 0.58 % 554.91 597 2.08 % 1,690.50 241 1.57 % 1,545.93 malo mal o 1,131 1.22 % 1,092.64 308 1.48 % 1,336.09 364 1.29 % 1,231.64 222 0.77 % 628.63 237 1.54 % 1,520.27 kje kje 1,039 1.12 % 1,003.76 224 1.08 % 971.70 415 1.48 % 1,404.20 279 0.97 % 790.03 121 0.79 % 776.17 zato zat o 992 1.07 % 958.36 202 0.97 % 876.27 247 0.88 % 835.75 394 1.37 % 1,115.68 149 0.97 % 955.78 čisto čis to 974 1.05 % 940.97 132 0.64 % 572.61 402 1.43 % 1,360.21 231 0.81 % 654.11 209 1.36 % 1,340.66 prav pra v 971 1.04 % 938.07 214 1.03 % 928.32 308 1.10 % 1,042.15 317 1.10 % 897.64 132 0.86 % 846.73 naprej nap rej 922 0.99 % 890.73 208 1.00 % 902.30 134 0.48 % 453.40 446 1.55 % 1,262.92 134 0.87 % 859.56 ful ful 916 0.99 % 884.93 52 0.25 % 225.57 697 2.48 % 2,358.38 36 0.12 % 101.94 131 0.85 % 840.32 mogoče mog oče 872 0.94 % 842.42 207 1.00 % 897.96 163 0.58 % 551.53 242 0.84 % 685.26 260 1.69 % 1,667.81 takrat tak rat 840 0.90 % 811.51 169 0.81 % 733.12 235 0.84 % 795.15 313 1.09 % 886.31 123 0.80 % 789 prej pre j 832 0.90 % 803.78 169 0.81 % 733.12 263 0.94 % 889.89 295 1.03 % 835.34 105 0.68 % 673.54 treba tre ba 827 0.89 % 798.95 164 0.79 % 711.43 164 0.58 % 554.91 331 1.15 % 937.28 168 1.09 % 1,077.66 gor gor 824 0.89 % 796.05 129 0.62 % 
559.60 527 1.88 % 1,783.16 64 0.22 % 181.23 104 0.68 % 667.12
enkrat enk rat 803 0.86 % 775.77 184 0.89 % 798.18 273 0.97 % 923.73 238 0.83 % 673.94 108 0.70 % 692.78
kako kak o 803 0.86 % 775.77 175 0.84 % 759.14 254 0.90 % 859.44 258 0.90 % 730.57 116 0.76 % 744.10
tule tul e 756 0.81 % 730.36 126 0.61 % 546.58 305 1.08 % 1,032 128 0.45 % 362.45 197 1.28 % 1,263.69
lepo lep o 743 0.80 % 717.80 240 1.16 % 1,041.11 245 0.87 % 828.99 190 0.66 % 538.02 68 0.44 % 436.20
vedno ved no 733 0.79 % 708.14 230 1.11 % 997.73 106 0.38 % 358.66 291 1.01 % 824.01 106 0.69 % 679.95
veliko vel iko 720 0.78 % 695.58 182 0.88 % 789.51 123 0.44 % 416.18 369 1.28 % 1,044.88 46 0.30 % 295.07
skupaj sku paj 713 0.77 % 688.82 161 0.78 % 698.41 220 0.78 % 744.40 222 0.77 % 628.63 110 0.72 % 705.61
kdaj kda j 701 0.75 % 677.22 177 0.85 % 767.82 253 0.90 % 856.05 182 0.63 % 515.36 89 0.58 % 570.90
spet spe t 647 0.70 % 625.06 145 0.70 % 629 223 0.79 % 754.55 188 0.66 % 532.35 91 0.59 % 583.73

2.2.134 List of initial character-level 4-grams from adverb standardized forms in the GOS 1.0 corpus with text-type distribution
File at CLARIN.SI: GOS1.0-word_parts-adverbs-standardized_forms-initial-4grams-taxonomy-entire.tsv
Columns: Standardized form | Initial part of the word | Rest of the word | Total absolute frequency of standardized form | Percentage of all found standardized forms | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N]
tako tako 9,713 11.88 % 9,383.57 1,925 10.34 % 8,350.58 3,661 16.13 % 12,387.41 2,296 8.64 % 6,501.51 1,831 13.23 % 11,745.24
zdaj zdaj 6,356 7.78 % 6,140.43 1,457
7.83 % 6,320.41 1,737 7.65 % 5,877.34 1,764 6.64 % 4,995.06 1,398 10.10 % 8,967.69 lahko lahk o 4,000 4.89 % 3,864.33 901 4.84 % 3,908.50 885 3.90 % 2,994.50 1,437 5.41 % 4,069.10 777 5.62 % 4,984.19 potem pote m 3,149 3.85 % 3,042.20 658 3.54 % 2,854.38 631 2.78 % 2,135.06 1,152 4.33 % 3,262.08 708 5.12 % 4,541.58 zelo zelo 1,835 2.25 % 1,772.76 473 2.54 % 2,051.86 182 0.80 % 615.82 896 3.37 % 2,537.17 284 2.05 % 1,821.76 bolj bolj 1,648 2.02 % 1,592.11 338 1.82 % 1,466.23 557 2.45 % 1,884.67 479 1.80 % 1,356.37 274 1.98 % 1,757.62 danes dane s 1,414 1.73 % 1,366.04 634 3.41 % 2,750.27 264 1.16 % 893.27 439 1.65 % 1,243.10 77 0.56 % 493.93 dobro dobr o 1,147 1.40 % 1,108.10 279 1.50 % 1,210.29 205 0.90 % 693.64 455 1.71 % 1,288.41 208 1.50 % 1,334.25 tukaj tuka j 1,135 1.39 % 1,096.50 133 0.71 % 576.95 164 0.72 % 554.91 597 2.25 % 1,690.50 241 1.74 % 1,545.93 malo malo 1,131 1.38 % 1,092.64 308 1.66 % 1,336.09 364 1.60 % 1,231.64 222 0.83 % 628.63 237 1.71 % 1,520.27 zato zato 992 1.21 % 958.36 202 1.08 % 876.27 247 1.09 % 835.75 394 1.48 % 1,115.68 149 1.08 % 955.78 čisto čist o 974 1.19 % 940.97 132 0.71 % 572.61 402 1.77 % 1,360.21 231 0.87 % 654.11 209 1.51 % 1,340.66 prav prav 971 1.19 % 938.07 214 1.15 % 928.32 308 1.36 % 1,042.15 317 1.19 % 897.64 132 0.95 % 846.73 naprej napr ej 922 1.13 % 890.73 208 1.12 % 902.30 134 0.59 % 453.40 446 1.68 % 1,262.92 134 0.97 % 859.56 mogoče mogo če 872 1.07 % 842.42 207 1.11 % 897.96 163 0.72 % 551.53 242 0.91 % 685.26 260 1.88 % 1,667.81 takrat takr at 840 1.03 % 811.51 169 0.91 % 733.12 235 1.03 % 795.15 313 1.18 % 886.31 123 0.89 % 789 prej prej 832 1.02 % 803.78 169 0.91 % 733.12 263 1.16 % 889.89 295 1.11 % 835.34 105 0.76 % 673.54 treba treb a 827 1.01 % 798.95 164 0.88 % 711.43 164 0.72 % 554.91 331 1.25 % 937.28 168 1.21 % 1,077.66 enkrat enkr at 803 0.98 % 775.77 184 0.99 % 798.18 273 1.20 % 923.73 238 0.90 % 673.94 108 0.78 % 692.78 kako kako 803 0.98 % 775.77 175 0.94 % 759.14 254 1.12 % 859.44 258 0.97 % 
730.57 116 0.84 % 744.10
tule tule 756 0.93 % 730.36 126 0.68 % 546.58 305 1.34 % 1,032 128 0.48 % 362.45 197 1.42 % 1,263.69
lepo lepo 743 0.91 % 717.80 240 1.29 % 1,041.11 245 1.08 % 828.99 190 0.71 % 538.02 68 0.49 % 436.20
vedno vedn o 733 0.90 % 708.14 230 1.24 % 997.73 106 0.47 % 358.66 291 1.09 % 824.01 106 0.77 % 679.95
veliko veli ko 720 0.88 % 695.58 182 0.98 % 789.51 123 0.54 % 416.18 369 1.39 % 1,044.88 46 0.33 % 295.07
skupaj skup aj 713 0.87 % 688.82 161 0.86 % 698.41 220 0.97 % 744.40 222 0.83 % 628.63 110 0.80 % 705.61
kdaj kdaj 701 0.86 % 677.22 177 0.95 % 767.82 253 1.11 % 856.05 182 0.69 % 515.36 89 0.64 % 570.90
spet spet 647 0.79 % 625.06 145 0.78 % 629 223 0.98 % 754.55 188 0.71 % 532.35 91 0.66 % 583.73
nazaj naza j 579 0.71 % 559.36 136 0.73 % 589.96 201 0.89 % 680.11 176 0.66 % 498.37 66 0.48 % 423.37
noter note r 554 0.68 % 535.21 127 0.68 % 550.92 256 1.13 % 866.21 69 0.26 % 195.38 102 0.74 % 654.29
dosti dost i 516 0.63 % 498.50 115 0.62 % 498.87 212 0.93 % 717.33 82 0.31 % 232.20 107 0.77 % 686.37
nekaj neka j 507 0.62 % 489.80 95 0.51 % 412.11 177 0.78 % 598.90 173 0.65 % 489.88 62 0.45 % 397.71
verjetno verj etno 495 0.61 % 478.21 106 0.57 % 459.82 106 0.47 % 358.66 187 0.70 % 529.52 96 0.69 % 615.81

2.2.135 List of initial character-level 5-grams from adverb standardized forms in the GOS 1.0 corpus with text-type distribution
File at CLARIN.SI: GOS1.0-word_parts-adverbs-standardized_forms-initial-5grams-taxonomy-entire.tsv
Columns: Standardized form | Initial part of the word | Rest of the word | Total absolute frequency of standardized form | Percentage of all found standardized forms | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N]
lahko lahko 4,000 7.84 % 3,864.33 901 7.64 % 3,908.50 885 6.91 % 2,994.50 1,437 7.95 % 4,069.10 777 9.31 % 4,984.19
potem potem 3,149 6.17 % 3,042.20 658 5.58 % 2,854.38 631 4.93 % 2,135.06 1,152 6.37 % 3,262.08 708 8.48 % 4,541.58
danes danes 1,414 2.77 % 1,366.04 634 5.38 % 2,750.27 264 2.06 % 893.27 439 2.43 % 1,243.10 77 0.92 % 493.93
dobro dobro 1,147 2.25 % 1,108.10 279 2.37 % 1,210.29 205 1.60 % 693.64 455 2.52 % 1,288.41 208 2.49 % 1,334.25
tukaj tukaj 1,135 2.23 % 1,096.50 133 1.13 % 576.95 164 1.28 % 554.91 597 3.30 % 1,690.50 241 2.89 % 1,545.93
čisto čisto 974 1.91 % 940.97 132 1.12 % 572.61 402 3.14 % 1,360.21 231 1.28 % 654.11 209 2.50 % 1,340.66
naprej napre j 922 1.81 % 890.73 208 1.76 % 902.30 134 1.05 % 453.40 446 2.47 % 1,262.92 134 1.60 % 859.56
mogoče mogoč e 872 1.71 % 842.42 207 1.76 % 897.96 163 1.27 % 551.53 242 1.34 % 685.26 260 3.11 % 1,667.81
takrat takra t 840 1.65 % 811.51 169 1.43 % 733.12 235 1.84 % 795.15 313 1.73 % 886.31 123 1.47 % 789
treba treba 827 1.62 % 798.95 164 1.39 % 711.43 164 1.28 % 554.91 331 1.83 % 937.28 168 2.01 % 1,077.66
enkrat enkra t 803 1.57 % 775.77 184 1.56 % 798.18 273 2.13 % 923.73 238 1.32 % 673.94 108 1.29 % 692.78
vedno vedno 733 1.44 % 708.14 230 1.95 % 997.73 106 0.83 % 358.66 291 1.61 % 824.01 106 1.27 % 679.95
veliko velik o 720 1.41 % 695.58 182 1.54 % 789.51 123 0.96 % 416.18 369 2.04 % 1,044.88 46 0.55 % 295.07
skupaj skupa j 713 1.40 % 688.82 161 1.37 % 698.41 220 1.72 % 744.40 222 1.23 % 628.63 110 1.32 % 705.61
nazaj nazaj 579 1.14 % 559.36 136 1.15 % 589.96 201 1.57 % 680.11 176 0.97 % 498.37 66 0.79 % 423.37
noter noter 554 1.09 % 535.21 127 1.08 % 550.92 256 2.00 % 866.21 69 0.38 % 195.38 102 1.22 % 654.29
dosti dosti 516 1.01 % 498.50 115 0.97 % 498.87 212 1.66 % 717.33 82 0.45 % 232.20 107 1.28 % 686.37
nekaj nekaj 507 0.99 % 489.80 95 0.81 % 412.11 177 1.38 % 598.90 173 0.96 % 489.88
62 0.74 % 397.71
verjetno verje tno 495 0.97 % 478.21 106 0.90 % 459.82 106 0.83 % 358.66 187 1.03 % 529.52 96 1.15 % 615.81
drugače druga če 487 0.95 % 470.48 69 0.58 % 299.32 212 1.66 % 717.33 132 0.73 % 373.78 74 0.89 % 474.68
zdajle zdajl e 478 0.94 % 461.79 127 1.08 % 550.92 204 1.59 % 690.26 101 0.56 % 286 46 0.55 % 295.07
točno točno 476 0.93 % 459.86 125 1.06 % 542.25 132 1.03 % 446.64 120 0.66 % 339.80 99 1.19 % 635.05
zakaj zakaj 475 0.93 % 458.89 65 0.55 % 281.97 107 0.84 % 362.05 262 1.45 % 741.90 41 0.49 % 263
najbolj najbo lj 462 0.91 % 446.33 125 1.06 % 542.25 105 0.82 % 355.28 177 0.98 % 501.20 55 0.66 % 352.81
koliko kolik o 460 0.90 % 444.40 66 0.56 % 286.31 168 1.31 % 568.45 143 0.79 % 404.93 83 0.99 % 532.42
najprej najpr ej 451 0.88 % 435.70 95 0.81 % 412.11 51 0.40 % 172.56 240 1.33 % 679.60 65 0.78 % 416.95
včasih včasi h 445 0.87 % 429.91 86 0.73 % 373.06 159 1.24 % 537.99 142 0.79 % 402.10 58 0.69 % 372.05
super super 438 0.86 % 423.14 140 1.19 % 607.31 115 0.90 % 389.12 29 0.16 % 82.12 154 1.84 % 987.86
toliko tolik o 434 0.85 % 419.28 99 0.84 % 429.46 154 1.20 % 521.08 108 0.60 % 305.82 73 0.87 % 468.27
dejansko dejan sko 406 0.80 % 392.23 40 0.34 % 173.52 72 0.56 % 243.62 184 1.02 % 521.03 110 1.32 % 705.61
notri notri 396 0.78 % 382.57 37 0.31 % 160.50 235 1.84 % 795.15 43 0.24 % 121.76 81 0.97 % 519.59
nekje nekje 393 0.77 % 379.67 44 0.37 % 190.87 125 0.98 % 422.95 112 0.62 % 317.15 112 1.34 % 718.44

2.2.136 List of final character-level 1-grams from adverb standardized forms in the GOS 1.0 corpus with text-type distribution
File at CLARIN.SI: GOS1.0-word_parts-adverbs-standardized_forms-final-1grams-taxonomy-entire.tsv
Columns: Standardized form | Rest of the word | Final part of the word | Total absolute frequency of standardized form | Percentage of all found standardized forms | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N]
tako tak o 9,713 10.29 % 9,383.57 1,925 9.09 % 8,350.58 3,661 12.84 % 12,387.41 2,296 7.90 % 6,501.51 1,831 11.68 % 11,745.24
zdaj zda j 6,356 6.73 % 6,140.43 1,457 6.88 % 6,320.41 1,737 6.09 % 5,877.34 1,764 6.07 % 4,995.06 1,398 8.92 % 8,967.69
lahko lahk o 4,000 4.24 % 3,864.33 901 4.26 % 3,908.50 885 3.10 % 2,994.50 1,437 4.94 % 4,069.10 777 4.96 % 4,984.19
potem pote m 3,149 3.33 % 3,042.20 658 3.11 % 2,854.38 631 2.21 % 2,135.06 1,152 3.96 % 3,262.08 708 4.52 % 4,541.58
tam ta m 2,598 2.75 % 2,509.89 426 2.01 % 1,847.97 1,289 4.52 % 4,361.48 503 1.73 % 1,424.33 380 2.42 % 2,437.57
zelo zel o 1,835 1.94 % 1,772.76 473 2.23 % 2,051.86 182 0.64 % 615.82 896 3.08 % 2,537.17 284 1.81 % 1,821.76
bolj bol j 1,648 1.75 % 1,592.11 338 1.60 % 1,466.23 557 1.95 % 1,884.67 479 1.65 % 1,356.37 274 1.75 % 1,757.62
danes dane s 1,414 1.50 % 1,366.04 634 3.00 % 2,750.27 264 0.93 % 893.27 439 1.51 % 1,243.10 77 0.49 % 493.93
tu t u 1,406 1.49 % 1,358.31 384 1.81 % 1,665.78 355 1.25 % 1,201.18 343 1.18 % 971.26 324 2.07 % 2,078.35
sem se m 1,307 1.38 % 1,262.67 232 1.10 % 1,006.41 680 2.38 % 2,300.86 237 0.81 % 671.10 158 1.01 % 1,013.52
dobro dobr o 1,147 1.22 % 1,108.10 279 1.32 % 1,210.29 205 0.72 % 693.64 455 1.56 % 1,288.41 208 1.33 % 1,334.25
tukaj tuka j 1,135 1.20 % 1,096.50 133 0.63 % 576.95 164 0.57 % 554.91 597 2.05 % 1,690.50 241 1.54 % 1,545.93
malo mal o 1,131 1.20 % 1,092.64 308 1.46 % 1,336.09 364 1.28 % 1,231.64 222 0.76 % 628.63 237 1.51 % 1,520.27
kje kj e 1,039 1.10 % 1,003.76 224 1.06 % 971.70 415 1.46 % 1,404.20 279 0.96 % 790.03 121 0.77 % 776.17
zato zat o 992 1.05 % 958.36 202 0.95 % 876.27 247 0.87 % 835.75 394 1.36 % 1,115.68 149 0.95 % 955.78
čisto čist o
974 1.03 % 940.97 132 0.62 % 572.61 402 1.41 % 1,360.21 231 0.80 % 654.11 209 1.33 % 1,340.66
prav pra v 971 1.03 % 938.07 214 1.01 % 928.32 308 1.08 % 1,042.15 317 1.09 % 897.64 132 0.84 % 846.73
naprej napre j 922 0.98 % 890.73 208 0.98 % 902.30 134 0.47 % 453.40 446 1.53 % 1,262.92 134 0.85 % 859.56
ful fu l 916 0.97 % 884.93 52 0.25 % 225.57 697 2.44 % 2,358.38 36 0.12 % 101.94 131 0.84 % 840.32
mogoče mogoč e 872 0.92 % 842.42 207 0.98 % 897.96 163 0.57 % 551.53 242 0.83 % 685.26 260 1.66 % 1,667.81
takrat takra t 840 0.89 % 811.51 169 0.80 % 733.12 235 0.82 % 795.15 313 1.08 % 886.31 123 0.79 % 789
prej pre j 832 0.88 % 803.78 169 0.80 % 733.12 263 0.92 % 889.89 295 1.01 % 835.34 105 0.67 % 673.54
treba treb a 827 0.88 % 798.95 164 0.78 % 711.43 164 0.57 % 554.91 331 1.14 % 937.28 168 1.07 % 1,077.66
gor go r 824 0.87 % 796.05 129 0.61 % 559.60 527 1.85 % 1,783.16 64 0.22 % 181.23 104 0.66 % 667.12
enkrat enkra t 803 0.85 % 775.77 184 0.87 % 798.18 273 0.96 % 923.73 238 0.82 % 673.94 108 0.69 % 692.78
kako kak o 803 0.85 % 775.77 175 0.83 % 759.14 254 0.89 % 859.44 258 0.89 % 730.57 116 0.74 % 744.10
tule tul e 756 0.80 % 730.36 126 0.59 % 546.58 305 1.07 % 1,032 128 0.44 % 362.45 197 1.26 % 1,263.69
lepo lep o 743 0.79 % 717.80 240 1.13 % 1,041.11 245 0.86 % 828.99 190 0.65 % 538.02 68 0.43 % 436.20
vedno vedn o 733 0.78 % 708.14 230 1.09 % 997.73 106 0.37 % 358.66 291 1.00 % 824.01 106 0.68 % 679.95
veliko velik o 720 0.76 % 695.58 182 0.86 % 789.51 123 0.43 % 416.18 369 1.27 % 1,044.88 46 0.29 % 295.07
skupaj skupa j 713 0.76 % 688.82 161 0.76 % 698.41 220 0.77 % 744.40 222 0.76 % 628.63 110 0.70 % 705.61
kdaj kda j 701 0.74 % 677.22 177 0.84 % 767.82 253 0.89 % 856.05 182 0.63 % 515.36 89 0.57 % 570.90

2.2.137 List of final character-level 2-grams from adverb standardized forms in the GOS 1.0 corpus with text-type distribution
File at CLARIN.SI: GOS1.0-word_parts-adverbs-standardized_forms-final-2grams-taxonomy-entire.tsv
Columns: Standardized form | Rest of the word | Final part of the word | Total absolute frequency of standardized form | Percentage of all found standardized forms | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N]
tako ta ko 9,713 10.29 % 9,383.57 1,925 9.10 % 8,350.58 3,661 12.85 % 12,387.41 2,296 7.90 % 6,501.51 1,831 11.69 % 11,745.24
zdaj zd aj 6,356 6.73 % 6,140.43 1,457 6.88 % 6,320.41 1,737 6.10 % 5,877.34 1,764 6.07 % 4,995.06 1,398 8.92 % 8,967.69
lahko lah ko 4,000 4.24 % 3,864.33 901 4.26 % 3,908.50 885 3.11 % 2,994.50 1,437 4.95 % 4,069.10 777 4.96 % 4,984.19
potem pot em 3,149 3.34 % 3,042.20 658 3.11 % 2,854.38 631 2.21 % 2,135.06 1,152 3.96 % 3,262.08 708 4.52 % 4,541.58
tam t am 2,598 2.75 % 2,509.89 426 2.01 % 1,847.97 1,289 4.52 % 4,361.48 503 1.73 % 1,424.33 380 2.42 % 2,437.57
zelo ze lo 1,835 1.94 % 1,772.76 473 2.23 % 2,051.86 182 0.64 % 615.82 896 3.08 % 2,537.17 284 1.81 % 1,821.76
bolj bo lj 1,648 1.75 % 1,592.11 338 1.60 % 1,466.23 557 1.96 % 1,884.67 479 1.65 % 1,356.37 274 1.75 % 1,757.62
danes dan es 1,414 1.50 % 1,366.04 634 3.00 % 2,750.27 264 0.93 % 893.27 439 1.51 % 1,243.10 77 0.49 % 493.93
tu tu 1,406 1.49 % 1,358.31 384 1.81 % 1,665.78 355 1.25 % 1,201.18 343 1.18 % 971.26 324 2.07 % 2,078.35
sem s em 1,307 1.39 % 1,262.67 232 1.10 % 1,006.41 680 2.39 % 2,300.86 237 0.82 % 671.10 158 1.01 % 1,013.52
dobro dob ro 1,147 1.22 % 1,108.10 279 1.32 % 1,210.29 205 0.72 % 693.64 455 1.57 % 1,288.41 208 1.33 % 1,334.25
tukaj tuk aj 1,135 1.20 % 1,096.50 133 0.63 % 576.95 164 0.58 % 554.91 597 2.06 % 1,690.50 241 1.54 % 1,545.93
malo
ma lo 1,131 1.20 % 1,092.64 308 1.46 % 1,336.09 364 1.28 % 1,231.64 222 0.76 % 628.63 237 1.51 % 1,520.27 kje k je 1,039 1.10 % 1,003.76 224 1.06 % 971.70 415 1.46 % 1,404.20 279 0.96 % 790.03 121 0.77 % 776.17 zato za to 992 1.05 % 958.36 202 0.95 % 876.27 247 0.87 % 835.75 394 1.36 % 1,115.68 149 0.95 % 955.78 čisto čis to 974 1.03 % 940.97 132 0.62 % 572.61 402 1.41 % 1,360.21 231 0.80 % 654.11 209 1.33 % 1,340.66 prav pr av 971 1.03 % 938.07 214 1.01 % 928.32 308 1.08 % 1,042.15 317 1.09 % 897.64 132 0.84 % 846.73 naprej napr ej 922 0.98 % 890.73 208 0.98 % 902.30 134 0.47 % 453.40 446 1.53 % 1,262.92 134 0.85 % 859.56 ful f ul 916 0.97 % 884.93 52 0.25 % 225.57 697 2.45 % 2,358.38 36 0.12 % 101.94 131 0.84 % 840.32 mogoče mogo če 872 0.92 % 842.42 207 0.98 % 897.96 163 0.57 % 551.53 242 0.83 % 685.26 260 1.66 % 1,667.81 takrat takr at 840 0.89 % 811.51 169 0.80 % 733.12 235 0.82 % 795.15 313 1.08 % 886.31 123 0.79 % 789 prej pr ej 832 0.88 % 803.78 169 0.80 % 733.12 263 0.92 % 889.89 295 1.01 % 835.34 105 0.67 % 673.54 treba tre ba 827 0.88 % 798.95 164 0.78 % 711.43 164 0.58 % 554.91 331 1.14 % 937.28 168 1.07 % 1,077.66 gor g or 824 0.87 % 796.05 129 0.61 % 559.60 527 1.85 % 1,783.16 64 0.22 % 181.23 104 0.66 % 667.12 enkrat enkr at 803 0.85 % 775.77 184 0.87 % 798.18 273 0.96 % 923.73 238 0.82 % 673.94 108 0.69 % 692.78 kako ka ko 803 0.85 % 775.77 175 0.83 % 759.14 254 0.89 % 859.44 258 0.89 % 730.57 116 0.74 % 744.10 tule tu le 756 0.80 % 730.36 126 0.59 % 546.58 305 1.07 % 1,032 128 0.44 % 362.45 197 1.26 % 1,263.69 lepo le po 743 0.79 % 717.80 240 1.13 % 1,041.11 245 0.86 % 828.99 190 0.65 % 538.02 68 0.43 % 436.20 vedno ved no 733 0.78 % 708.14 230 1.09 % 997.73 106 0.37 % 358.66 291 1.00 % 824.01 106 0.68 % 679.95 veliko veli ko 720 0.76 % 695.58 182 0.86 % 789.51 123 0.43 % 416.18 369 1.27 % 1,044.88 46 0.29 % 295.07 skupaj skup aj 713 0.76 % 688.82 161 0.76 % 698.41 220 0.77 % 744.40 222 0.76 % 628.63 110 0.70 % 705.61 kdaj kd aj 701 0.74 % 677.22 
177 0.84 % 767.82 253 0.89 % 856.05 182 0.63 % 515.36 89 0.57 % 570.90

2.2.138 List of final character-level 3-grams from adverb standardized forms in the GOS 1.0 corpus with text-type distribution
File at CLARIN.SI: GOS1.0-word_parts-adverbs-standardized_forms-final-3grams-taxonomy-entire.tsv
Columns: Standardized form | Rest of the word | Final part of the word | Total absolute frequency of standardized form | Percentage of all found standardized forms | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N]
tako t ako 9,713 10.45 % 9,383.57 1,925 9.27 % 8,350.58 3,661 13.02 % 12,387.41 2,296 8.00 % 6,501.51 1,831 11.93 % 11,745.24
zdaj z daj 6,356 6.84 % 6,140.43 1,457 7.02 % 6,320.41 1,737 6.18 % 5,877.34 1,764 6.14 % 4,995.06 1,398 9.11 % 8,967.69
lahko la hko 4,000 4.30 % 3,864.33 901 4.34 % 3,908.50 885 3.15 % 2,994.50 1,437 5.00 % 4,069.10 777 5.06 % 4,984.19
potem po tem 3,149 3.39 % 3,042.20 658 3.17 % 2,854.38 631 2.25 % 2,135.06 1,152 4.01 % 3,262.08 708 4.61 % 4,541.58
tam tam 2,598 2.80 % 2,509.89 426 2.05 % 1,847.97 1,289 4.59 % 4,361.48 503 1.75 % 1,424.33 380 2.48 % 2,437.57
zelo z elo 1,835 1.98 % 1,772.76 473 2.28 % 2,051.86 182 0.65 % 615.82 896 3.12 % 2,537.17 284 1.85 % 1,821.76
bolj b olj 1,648 1.77 % 1,592.11 338 1.63 % 1,466.23 557 1.98 % 1,884.67 479 1.67 % 1,356.37 274 1.79 % 1,757.62
danes da nes 1,414 1.52 % 1,366.04 634 3.06 % 2,750.27 264 0.94 % 893.27 439 1.53 % 1,243.10 77 0.50 % 493.93
sem sem 1,307 1.41 % 1,262.67 232 1.12 % 1,006.41 680 2.42 % 2,300.86 237 0.82 % 671.10 158 1.03 % 1,013.52
dobro do bro 1,147 1.23 % 1,108.10 279 1.34 %
1,210.29 205 0.73 % 693.64 455 1.58 % 1,288.41 208 1.35 % 1,334.25 tukaj tu kaj 1,135 1.22 % 1,096.50 133 0.64 % 576.95 164 0.58 % 554.91 597 2.08 % 1,690.50 241 1.57 % 1,545.93 malo m alo 1,131 1.22 % 1,092.64 308 1.48 % 1,336.09 364 1.29 % 1,231.64 222 0.77 % 628.63 237 1.54 % 1,520.27 kje kje 1,039 1.12 % 1,003.76 224 1.08 % 971.70 415 1.48 % 1,404.20 279 0.97 % 790.03 121 0.79 % 776.17 zato z ato 992 1.07 % 958.36 202 0.97 % 876.27 247 0.88 % 835.75 394 1.37 % 1,115.68 149 0.97 % 955.78 čisto či sto 974 1.05 % 940.97 132 0.64 % 572.61 402 1.43 % 1,360.21 231 0.81 % 654.11 209 1.36 % 1,340.66 prav p rav 971 1.04 % 938.07 214 1.03 % 928.32 308 1.10 % 1,042.15 317 1.10 % 897.64 132 0.86 % 846.73 naprej nap rej 922 0.99 % 890.73 208 1.00 % 902.30 134 0.48 % 453.40 446 1.55 % 1,262.92 134 0.87 % 859.56 ful ful 916 0.99 % 884.93 52 0.25 % 225.57 697 2.48 % 2,358.38 36 0.12 % 101.94 131 0.85 % 840.32 mogoče mog oče 872 0.94 % 842.42 207 1.00 % 897.96 163 0.58 % 551.53 242 0.84 % 685.26 260 1.69 % 1,667.81 takrat tak rat 840 0.90 % 811.51 169 0.81 % 733.12 235 0.84 % 795.15 313 1.09 % 886.31 123 0.80 % 789 prej p rej 832 0.90 % 803.78 169 0.81 % 733.12 263 0.94 % 889.89 295 1.03 % 835.34 105 0.68 % 673.54 treba tr eba 827 0.89 % 798.95 164 0.79 % 711.43 164 0.58 % 554.91 331 1.15 % 937.28 168 1.09 % 1,077.66 gor gor 824 0.89 % 796.05 129 0.62 % 559.60 527 1.88 % 1,783.16 64 0.22 % 181.23 104 0.68 % 667.12 enkrat enk rat 803 0.86 % 775.77 184 0.89 % 798.18 273 0.97 % 923.73 238 0.83 % 673.94 108 0.70 % 692.78 kako k ako 803 0.86 % 775.77 175 0.84 % 759.14 254 0.90 % 859.44 258 0.90 % 730.57 116 0.76 % 744.10 tule t ule 756 0.81 % 730.36 126 0.61 % 546.58 305 1.08 % 1,032 128 0.45 % 362.45 197 1.28 % 1,263.69 lepo l epo 743 0.80 % 717.80 240 1.16 % 1,041.11 245 0.87 % 828.99 190 0.66 % 538.02 68 0.44 % 436.20 vedno ve dno 733 0.79 % 708.14 230 1.11 % 997.73 106 0.38 % 358.66 291 1.01 % 824.01 106 0.69 % 679.95 veliko vel iko 720 0.78 % 695.58 182 0.88 % 789.51 123 0.44 % 
416.18 369 1.28 % 1,044.88 46 0.30 % 295.07
skupaj sku paj 713 0.77 % 688.82 161 0.78 % 698.41 220 0.78 % 744.40 222 0.77 % 628.63 110 0.72 % 705.61
kdaj k daj 701 0.75 % 677.22 177 0.85 % 767.82 253 0.90 % 856.05 182 0.63 % 515.36 89 0.58 % 570.90
spet s pet 647 0.70 % 625.06 145 0.70 % 629 223 0.79 % 754.55 188 0.66 % 532.35 91 0.59 % 583.73

2.2.139 List of final character-level 4-grams from adverb standardized forms in the GOS 1.0 corpus with text-type distribution
File at CLARIN.SI: GOS1.0-word_parts-adverbs-standardized_forms-final-4grams-taxonomy-entire.tsv
Standardized form | Rest of the word | Final part of the word | Total absolute frequency of standardized form | Percentage of all found standardized forms | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N]
tako tako 9,713 11.88 % 9,383.57 1,925 10.34 % 8,350.58 3,661 16.13 % 12,387.41 2,296 8.64 % 6,501.51 1,831 13.23 % 11,745.24
zdaj zdaj 6,356 7.78 % 6,140.43 1,457 7.83 % 6,320.41 1,737 7.65 % 5,877.34 1,764 6.64 % 4,995.06 1,398 10.10 % 8,967.69
lahko l ahko 4,000 4.89 % 3,864.33 901 4.84 % 3,908.50 885 3.90 % 2,994.50 1,437 5.41 % 4,069.10 777 5.62 % 4,984.19
potem p otem 3,149 3.85 % 3,042.20 658 3.54 % 2,854.38 631 2.78 % 2,135.06 1,152 4.33 % 3,262.08 708 5.12 % 4,541.58
zelo zelo 1,835 2.25 % 1,772.76 473 2.54 % 2,051.86 182 0.80 % 615.82 896 3.37 % 2,537.17 284 2.05 % 1,821.76
bolj bolj 1,648 2.02 % 1,592.11 338 1.82 % 1,466.23 557 2.45 % 1,884.67 479 1.80 % 1,356.37 274 1.98 % 1,757.62
danes d anes 1,414 1.73 % 1,366.04 634 3.41 % 2,750.27 264 1.16 % 893.27 439 1.65 % 1,243.10 77 0.56 %
493.93
dobro d obro 1,147 1.40 % 1,108.10 279 1.50 % 1,210.29 205 0.90 % 693.64 455 1.71 % 1,288.41 208 1.50 % 1,334.25
tukaj t ukaj 1,135 1.39 % 1,096.50 133 0.71 % 576.95 164 0.72 % 554.91 597 2.25 % 1,690.50 241 1.74 % 1,545.93
malo malo 1,131 1.38 % 1,092.64 308 1.66 % 1,336.09 364 1.60 % 1,231.64 222 0.83 % 628.63 237 1.71 % 1,520.27
zato zato 992 1.21 % 958.36 202 1.08 % 876.27 247 1.09 % 835.75 394 1.48 % 1,115.68 149 1.08 % 955.78
čisto č isto 974 1.19 % 940.97 132 0.71 % 572.61 402 1.77 % 1,360.21 231 0.87 % 654.11 209 1.51 % 1,340.66
prav prav 971 1.19 % 938.07 214 1.15 % 928.32 308 1.36 % 1,042.15 317 1.19 % 897.64 132 0.95 % 846.73
naprej na prej 922 1.13 % 890.73 208 1.12 % 902.30 134 0.59 % 453.40 446 1.68 % 1,262.92 134 0.97 % 859.56
mogoče mo goče 872 1.07 % 842.42 207 1.11 % 897.96 163 0.72 % 551.53 242 0.91 % 685.26 260 1.88 % 1,667.81
takrat ta krat 840 1.03 % 811.51 169 0.91 % 733.12 235 1.03 % 795.15 313 1.18 % 886.31 123 0.89 % 789
prej prej 832 1.02 % 803.78 169 0.91 % 733.12 263 1.16 % 889.89 295 1.11 % 835.34 105 0.76 % 673.54
treba t reba 827 1.01 % 798.95 164 0.88 % 711.43 164 0.72 % 554.91 331 1.25 % 937.28 168 1.21 % 1,077.66
enkrat en krat 803 0.98 % 775.77 184 0.99 % 798.18 273 1.20 % 923.73 238 0.90 % 673.94 108 0.78 % 692.78
kako kako 803 0.98 % 775.77 175 0.94 % 759.14 254 1.12 % 859.44 258 0.97 % 730.57 116 0.84 % 744.10
tule tule 756 0.93 % 730.36 126 0.68 % 546.58 305 1.34 % 1,032 128 0.48 % 362.45 197 1.42 % 1,263.69
lepo lepo 743 0.91 % 717.80 240 1.29 % 1,041.11 245 1.08 % 828.99 190 0.71 % 538.02 68 0.49 % 436.20
vedno v edno 733 0.90 % 708.14 230 1.24 % 997.73 106 0.47 % 358.66 291 1.09 % 824.01 106 0.77 % 679.95
veliko ve liko 720 0.88 % 695.58 182 0.98 % 789.51 123 0.54 % 416.18 369 1.39 % 1,044.88 46 0.33 % 295.07
skupaj sk upaj 713 0.87 % 688.82 161 0.86 % 698.41 220 0.97 % 744.40 222 0.83 % 628.63 110 0.80 % 705.61
kdaj kdaj 701 0.86 % 677.22 177 0.95 % 767.82 253 1.11 % 856.05 182 0.69 % 515.36 89 0.64 % 570.90
spet
spet 647 0.79 % 625.06 145 0.78 % 629 223 0.98 % 754.55 188 0.71 % 532.35 91 0.66 % 583.73
nazaj n azaj 579 0.71 % 559.36 136 0.73 % 589.96 201 0.89 % 680.11 176 0.66 % 498.37 66 0.48 % 423.37
noter n oter 554 0.68 % 535.21 127 0.68 % 550.92 256 1.13 % 866.21 69 0.26 % 195.38 102 0.74 % 654.29
dosti d osti 516 0.63 % 498.50 115 0.62 % 498.87 212 0.93 % 717.33 82 0.31 % 232.20 107 0.77 % 686.37
nekaj n ekaj 507 0.62 % 489.80 95 0.51 % 412.11 177 0.78 % 598.90 173 0.65 % 489.88 62 0.45 % 397.71
verjetno verj etno 495 0.61 % 478.21 106 0.57 % 459.82 106 0.47 % 358.66 187 0.70 % 529.52 96 0.69 % 615.81

2.2.140 List of final character-level 5-grams from adverb standardized forms in the GOS 1.0 corpus with text-type distribution
File at CLARIN.SI: GOS1.0-word_parts-adverbs-standardized_forms-final-5grams-taxonomy-entire.tsv
Standardized form | Rest of the word | Final part of the word | Total absolute frequency of standardized form | Percentage of all found standardized forms | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N]
lahko lahko 4,000 7.84 % 3,864.33 901 7.64 % 3,908.50 885 6.91 % 2,994.50 1,437 7.95 % 4,069.10 777 9.31 % 4,984.19
potem potem 3,149 6.17 % 3,042.20 658 5.58 % 2,854.38 631 4.93 % 2,135.06 1,152 6.37 % 3,262.08 708 8.48 % 4,541.58
danes danes 1,414 2.77 % 1,366.04 634 5.38 % 2,750.27 264 2.06 % 893.27 439 2.43 % 1,243.10 77 0.92 % 493.93
dobro dobro 1,147 2.25 % 1,108.10 279 2.37 % 1,210.29 205 1.60 % 693.64 455 2.52 % 1,288.41 208 2.49 % 1,334.25
tukaj tukaj 1,135 2.23 % 1,096.50 133 1.13 % 576.95 164 1.28 % 554.91 597 3.30 % 1,690.50 241
2.89 % 1,545.93
čisto čisto 974 1.91 % 940.97 132 1.12 % 572.61 402 3.14 % 1,360.21 231 1.28 % 654.11 209 2.50 % 1,340.66
naprej n aprej 922 1.81 % 890.73 208 1.76 % 902.30 134 1.05 % 453.40 446 2.47 % 1,262.92 134 1.60 % 859.56
mogoče m ogoče 872 1.71 % 842.42 207 1.76 % 897.96 163 1.27 % 551.53 242 1.34 % 685.26 260 3.11 % 1,667.81
takrat t akrat 840 1.65 % 811.51 169 1.43 % 733.12 235 1.84 % 795.15 313 1.73 % 886.31 123 1.47 % 789
treba treba 827 1.62 % 798.95 164 1.39 % 711.43 164 1.28 % 554.91 331 1.83 % 937.28 168 2.01 % 1,077.66
enkrat e nkrat 803 1.57 % 775.77 184 1.56 % 798.18 273 2.13 % 923.73 238 1.32 % 673.94 108 1.29 % 692.78
vedno vedno 733 1.44 % 708.14 230 1.95 % 997.73 106 0.83 % 358.66 291 1.61 % 824.01 106 1.27 % 679.95
veliko v eliko 720 1.41 % 695.58 182 1.54 % 789.51 123 0.96 % 416.18 369 2.04 % 1,044.88 46 0.55 % 295.07
skupaj s kupaj 713 1.40 % 688.82 161 1.37 % 698.41 220 1.72 % 744.40 222 1.23 % 628.63 110 1.32 % 705.61
nazaj nazaj 579 1.14 % 559.36 136 1.15 % 589.96 201 1.57 % 680.11 176 0.97 % 498.37 66 0.79 % 423.37
noter noter 554 1.09 % 535.21 127 1.08 % 550.92 256 2.00 % 866.21 69 0.38 % 195.38 102 1.22 % 654.29
dosti dosti 516 1.01 % 498.50 115 0.97 % 498.87 212 1.66 % 717.33 82 0.45 % 232.20 107 1.28 % 686.37
nekaj nekaj 507 0.99 % 489.80 95 0.81 % 412.11 177 1.38 % 598.90 173 0.96 % 489.88 62 0.74 % 397.71
verjetno ver jetno 495 0.97 % 478.21 106 0.90 % 459.82 106 0.83 % 358.66 187 1.03 % 529.52 96 1.15 % 615.81
drugače dr ugače 487 0.95 % 470.48 69 0.58 % 299.32 212 1.66 % 717.33 132 0.73 % 373.78 74 0.89 % 474.68
zdajle z dajle 478 0.94 % 461.79 127 1.08 % 550.92 204 1.59 % 690.26 101 0.56 % 286 46 0.55 % 295.07
točno točno 476 0.93 % 459.86 125 1.06 % 542.25 132 1.03 % 446.64 120 0.66 % 339.80 99 1.19 % 635.05
zakaj zakaj 475 0.93 % 458.89 65 0.55 % 281.97 107 0.84 % 362.05 262 1.45 % 741.90 41 0.49 % 263
najbolj na jbolj 462 0.91 % 446.33 125 1.06 % 542.25 105 0.82 % 355.28 177 0.98 % 501.20 55 0.66 % 352.81
koliko k oliko 460
0.90 % 444.40 66 0.56 % 286.31 168 1.31 % 568.45 143 0.79 % 404.93 83 0.99 % 532.42
najprej na jprej 451 0.88 % 435.70 95 0.81 % 412.11 51 0.40 % 172.56 240 1.33 % 679.60 65 0.78 % 416.95
včasih v časih 445 0.87 % 429.91 86 0.73 % 373.06 159 1.24 % 537.99 142 0.79 % 402.10 58 0.69 % 372.05
super super 438 0.86 % 423.14 140 1.19 % 607.31 115 0.90 % 389.12 29 0.16 % 82.12 154 1.84 % 987.86
toliko t oliko 434 0.85 % 419.28 99 0.84 % 429.46 154 1.20 % 521.08 108 0.60 % 305.82 73 0.87 % 468.27
dejansko dej ansko 406 0.80 % 392.23 40 0.34 % 173.52 72 0.56 % 243.62 184 1.02 % 521.03 110 1.32 % 705.61
notri notri 396 0.78 % 382.57 37 0.31 % 160.50 235 1.84 % 795.15 43 0.24 % 121.76 81 0.97 % 519.59
nekje nekje 393 0.77 % 379.67 44 0.37 % 190.87 125 0.98 % 422.95 112 0.62 % 317.15 112 1.34 % 718.44

2.2.141 List of initial character-level 1-grams from adverb lower-case word forms in the GOS 1.0 corpus with text-type distribution
File at CLARIN.SI: GOS1.0-word_parts-adverbs-lowercase_forms-initial-1grams-taxonomy-entire.tsv
Lower-case word form | Initial part of the word | Rest of the word | Total absolute frequency of lower-case word form | Percentage of all found lower-case word forms | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N]
tako t ako 3,346 3.54 % 3,232.52 829 3.92 % 3,596.17 360 1.26 % 1,218.10 1,611 5.54 % 4,561.81 546 3.48 % 3,502.40
zdej z dej 3,036 3.21 % 2,933.03 515 2.43 % 2,234.05 839 2.94 % 2,838.85 1,118 3.85 % 3,165.80 564 3.60 % 3,617.87
lahko l ahko 2,885 3.06 % 2,787.15 679 3.21 % 2,945.48 335 1.18 % 1,133.51 1,267 4.36 % 3,587.72 604 3.85 %
3,874.45
tko t ko 2,665 2.82 % 2,574.61 435 2.06 % 1,887.01 1,027 3.60 % 3,474.97 472 1.62 % 1,336.55 731 4.66 % 4,689.11
tak t ak 2,543 2.69 % 2,456.75 493 2.33 % 2,138.62 1,452 5.09 % 4,913.01 132 0.45 % 373.78 466 2.97 % 2,989.23
potem p otem 2,473 2.62 % 2,389.12 466 2.20 % 2,021.49 212 0.74 % 717.33 1,145 3.94 % 3,242.26 650 4.15 % 4,169.53
tam t am 1,820 1.93 % 1,758.27 257 1.21 % 1,114.86 876 3.07 % 2,964.05 384 1.32 % 1,087.36 303 1.93 % 1,943.64
tu t u 1,384 1.47 % 1,337.06 377 1.78 % 1,635.41 341 1.20 % 1,153.81 343 1.18 % 971.26 323 2.06 % 2,071.93
zdaj z daj 1,195 1.27 % 1,154.47 427 2.02 % 1,852.31 207 0.73 % 700.41 357 1.23 % 1,010.90 204 1.30 % 1,308.59
zelo z elo 1,127 1.19 % 1,088.78 293 1.38 % 1,271.02 66 0.23 % 223.32 663 2.28 % 1,877.39 105 0.67 % 673.54
bolj b olj 1,015 1.07 % 980.57 236 1.11 % 1,023.76 228 0.80 % 771.46 374 1.29 % 1,059.04 177 1.13 % 1,135.39
danes d anes 933 0.99 % 901.36 541 2.56 % 2,346.84 21 0.07 % 71.06 333 1.15 % 942.94 38 0.24 % 243.76
ful f ul 914 0.97 % 883 52 0.25 % 225.57 695 2.44 % 2,351.61 36 0.12 % 101.94 131 0.84 % 840.32
naprej n aprej 912 0.97 % 881.07 209 0.99 % 906.63 125 0.44 % 422.95 444 1.53 % 1,257.26 134 0.85 % 859.56
sem s em 898 0.95 % 867.54 144 0.68 % 624.67 420 1.47 % 1,421.12 208 0.72 % 588.99 126 0.80 % 808.25
dobro d obro 885 0.94 % 854.98 208 0.98 % 902.30 97 0.34 % 328.21 423 1.46 % 1,197.79 157 1.00 % 1,007.10
mogoče m ogoče 815 0.86 % 787.36 190 0.90 % 824.21 136 0.48 % 460.17 238 0.82 % 673.94 251 1.60 % 1,610.08
zaj z aj 810 0.86 % 782.53 206 0.97 % 893.62 342 1.20 % 1,157.20 49 0.17 % 138.75 213 1.36 % 1,366.32
prej p rej 805 0.85 % 777.70 165 0.78 % 715.76 240 0.84 % 812.07 295 1.01 % 835.34 105 0.67 % 673.54
zato z ato 799 0.85 % 771.90 144 0.68 % 624.67 145 0.51 % 490.62 374 1.29 % 1,059.04 136 0.87 % 872.39
zej z ej 775 0.82 % 748.71 140 0.66 % 607.31 192 0.67 % 649.65 198 0.68 % 560.67 245 1.56 % 1,571.59
kje k je 773 0.82 % 746.78 139 0.66 % 602.98 258 0.91 % 872.97 272 0.94 %
770.21 104 0.66 % 667.12
treba t reba 750 0.79 % 724.56 151 0.71 % 655.03 137 0.48 % 463.56 313 1.08 % 886.31 149 0.95 % 955.78
vedno v edno 727 0.77 % 702.34 233 1.10 % 1,010.75 100 0.35 % 338.36 288 0.99 % 815.52 106 0.68 % 679.95
zlo z lo 688 0.73 % 664.67 178 0.84 % 772.16 101 0.35 % 341.74 230 0.79 % 651.28 179 1.14 % 1,148.22
lepo l epo 652 0.69 % 629.89 232 1.10 % 1,006.41 174 0.61 % 588.75 182 0.63 % 515.36 64 0.41 % 410.54
spet s pet 640 0.68 % 618.29 145 0.69 % 629 218 0.76 % 737.63 186 0.64 % 526.69 91 0.58 % 583.73
te t e 615 0.65 % 594.14 175 0.83 % 759.14 378 1.33 % 1,279.01 6 0.02 % 16.99 56 0.36 % 359.22
tm t m 613 0.65 % 592.21 79 0.37 % 342.70 343 1.20 % 1,160.58 118 0.41 % 334.14 73 0.47 % 468.27
gor g or 601 0.64 % 580.62 112 0.53 % 485.85 326 1.14 % 1,103.06 62 0.21 % 175.56 101 0.64 % 647.88
enkrat e nkrat 596 0.63 % 575.79 143 0.68 % 620.33 168 0.59 % 568.45 195 0.67 % 552.17 90 0.57 % 577.32
čist č ist 585 0.62 % 565.16 59 0.28 % 255.94 265 0.93 % 896.66 121 0.42 % 342.63 140 0.89 % 898.05

2.2.142 List of initial character-level 2-grams from adverb lower-case word forms in the GOS 1.0 corpus with text-type distribution
File at CLARIN.SI: GOS1.0-word_parts-adverbs-lowercase_forms-initial-2grams-taxonomy-entire.tsv
Lower-case word form | Initial part of the word | Rest of the word | Total absolute frequency of lower-case word form | Percentage of all found lower-case word forms | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N]
tako ta ko 3,346 3.55 % 3,232.52 829 3.92 % 3,596.17 360 1.26 % 1,218.10 1,611 5.54 % 4,561.81 546 3.48 %
3,502.40
zdej zd ej 3,036 3.22 % 2,933.03 515 2.44 % 2,234.05 839 2.95 % 2,838.85 1,118 3.85 % 3,165.80 564 3.60 % 3,617.87
lahko la hko 2,885 3.06 % 2,787.15 679 3.21 % 2,945.48 335 1.18 % 1,133.51 1,267 4.36 % 3,587.72 604 3.85 % 3,874.45
tko tk o 2,665 2.82 % 2,574.61 435 2.06 % 1,887.01 1,027 3.61 % 3,474.97 472 1.62 % 1,336.55 731 4.67 % 4,689.11
tak ta k 2,543 2.69 % 2,456.75 493 2.33 % 2,138.62 1,452 5.10 % 4,913.01 132 0.45 % 373.78 466 2.97 % 2,989.23
potem po tem 2,473 2.62 % 2,389.12 466 2.20 % 2,021.49 212 0.74 % 717.33 1,145 3.94 % 3,242.26 650 4.15 % 4,169.53
tam ta m 1,820 1.93 % 1,758.27 257 1.22 % 1,114.86 876 3.08 % 2,964.05 384 1.32 % 1,087.36 303 1.93 % 1,943.64
tu tu 1,384 1.47 % 1,337.06 377 1.78 % 1,635.41 341 1.20 % 1,153.81 343 1.18 % 971.26 323 2.06 % 2,071.93
zdaj zd aj 1,195 1.27 % 1,154.47 427 2.02 % 1,852.31 207 0.73 % 700.41 357 1.23 % 1,010.90 204 1.30 % 1,308.59
zelo ze lo 1,127 1.19 % 1,088.78 293 1.39 % 1,271.02 66 0.23 % 223.32 663 2.28 % 1,877.39 105 0.67 % 673.54
bolj bo lj 1,015 1.08 % 980.57 236 1.12 % 1,023.76 228 0.80 % 771.46 374 1.29 % 1,059.04 177 1.13 % 1,135.39
danes da nes 933 0.99 % 901.36 541 2.56 % 2,346.84 21 0.07 % 71.06 333 1.15 % 942.94 38 0.24 % 243.76
ful fu l 914 0.97 % 883 52 0.25 % 225.57 695 2.44 % 2,351.61 36 0.12 % 101.94 131 0.84 % 840.32
naprej na prej 912 0.97 % 881.07 209 0.99 % 906.63 125 0.44 % 422.95 444 1.53 % 1,257.26 134 0.85 % 859.56
sem se m 898 0.95 % 867.54 144 0.68 % 624.67 420 1.48 % 1,421.12 208 0.72 % 588.99 126 0.80 % 808.25
dobro do bro 885 0.94 % 854.98 208 0.98 % 902.30 97 0.34 % 328.21 423 1.46 % 1,197.79 157 1.00 % 1,007.10
mogoče mo goče 815 0.86 % 787.36 190 0.90 % 824.21 136 0.48 % 460.17 238 0.82 % 673.94 251 1.60 % 1,610.08
zaj za j 810 0.86 % 782.53 206 0.97 % 893.62 342 1.20 % 1,157.20 49 0.17 % 138.75 213 1.36 % 1,366.32
prej pr ej 805 0.85 % 777.70 165 0.78 % 715.76 240 0.84 % 812.07 295 1.01 % 835.34 105 0.67 % 673.54
zato za to 799 0.85 % 771.90 144 0.68 % 624.67 145
0.51 % 490.62 374 1.29 % 1,059.04 136 0.87 % 872.39
zej ze j 775 0.82 % 748.71 140 0.66 % 607.31 192 0.67 % 649.65 198 0.68 % 560.67 245 1.56 % 1,571.59
kje kj e 773 0.82 % 746.78 139 0.66 % 602.98 258 0.91 % 872.97 272 0.94 % 770.21 104 0.66 % 667.12
treba tr eba 750 0.80 % 724.56 151 0.71 % 655.03 137 0.48 % 463.56 313 1.08 % 886.31 149 0.95 % 955.78
vedno ve dno 727 0.77 % 702.34 233 1.10 % 1,010.75 100 0.35 % 338.36 288 0.99 % 815.52 106 0.68 % 679.95
zlo zl o 688 0.73 % 664.67 178 0.84 % 772.16 101 0.35 % 341.74 230 0.79 % 651.28 179 1.14 % 1,148.22
lepo le po 652 0.69 % 629.89 232 1.10 % 1,006.41 174 0.61 % 588.75 182 0.63 % 515.36 64 0.41 % 410.54
spet sp et 640 0.68 % 618.29 145 0.69 % 629 218 0.77 % 737.63 186 0.64 % 526.69 91 0.58 % 583.73
te te 615 0.65 % 594.14 175 0.83 % 759.14 378 1.33 % 1,279.01 6 0.02 % 16.99 56 0.36 % 359.22
tm tm 613 0.65 % 592.21 79 0.37 % 342.70 343 1.20 % 1,160.58 118 0.41 % 334.14 73 0.47 % 468.27
gor go r 601 0.64 % 580.62 112 0.53 % 485.85 326 1.15 % 1,103.06 62 0.21 % 175.56 101 0.65 % 647.88
enkrat en krat 596 0.63 % 575.79 143 0.68 % 620.33 168 0.59 % 568.45 195 0.67 % 552.17 90 0.57 % 577.32
čist či st 585 0.62 % 565.16 59 0.28 % 255.94 265 0.93 % 896.66 121 0.42 % 342.63 140 0.89 % 898.05

2.2.143 List of initial character-level 3-grams from adverb lower-case word forms in the GOS 1.0 corpus with text-type distribution
File at CLARIN.SI: GOS1.0-word_parts-adverbs-lowercase_forms-initial-3grams-taxonomy-entire.tsv
Lower-case word form | Initial part of the word | Rest of the word | Total absolute frequency of lower-case word form | Percentage of all found lower-case word forms | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N]
tako tak o 3,346 3.76 % 3,232.52 829 4.19 % 3,596.17 360 1.38 % 1,218.10 1,611 5.70 % 4,561.81 546 3.68 % 3,502.40
zdej zde j 3,036 3.41 % 2,933.03 515 2.60 % 2,234.05 839 3.22 % 2,838.85 1,118 3.95 % 3,165.80 564 3.80 % 3,617.87
lahko lah ko 2,885 3.24 % 2,787.15 679 3.43 % 2,945.48 335 1.28 % 1,133.51 1,267 4.48 % 3,587.72 604 4.07 % 3,874.45
tko tko 2,665 3.00 % 2,574.61 435 2.20 % 1,887.01 1,027 3.94 % 3,474.97 472 1.67 % 1,336.55 731 4.92 % 4,689.11
tak tak 2,543 2.86 % 2,456.75 493 2.49 % 2,138.62 1,452 5.57 % 4,913.01 132 0.47 % 373.78 466 3.14 % 2,989.23
potem pot em 2,473 2.78 % 2,389.12 466 2.36 % 2,021.49 212 0.81 % 717.33 1,145 4.05 % 3,242.26 650 4.38 % 4,169.53
tam tam 1,820 2.04 % 1,758.27 257 1.30 % 1,114.86 876 3.36 % 2,964.05 384 1.36 % 1,087.36 303 2.04 % 1,943.64
zdaj zda j 1,195 1.34 % 1,154.47 427 2.16 % 1,852.31 207 0.79 % 700.41 357 1.26 % 1,010.90 204 1.37 % 1,308.59
zelo zel o 1,127 1.27 % 1,088.78 293 1.48 % 1,271.02 66 0.25 % 223.32 663 2.34 % 1,877.39 105 0.71 % 673.54
bolj bol j 1,015 1.14 % 980.57 236 1.19 % 1,023.76 228 0.88 % 771.46 374 1.32 % 1,059.04 177 1.19 % 1,135.39
danes dan es 933 1.05 % 901.36 541 2.73 % 2,346.84 21 0.08 % 71.06 333 1.18 % 942.94 38 0.26 % 243.76
ful ful 914 1.03 % 883 52 0.26 % 225.57 695 2.67 % 2,351.61 36 0.13 % 101.94 131 0.88 % 840.32
naprej nap rej 912 1.02 % 881.07 209 1.06 % 906.63 125 0.48 % 422.95 444 1.57 % 1,257.26 134 0.90 % 859.56
sem sem 898 1.01 % 867.54 144 0.73 % 624.67 420 1.61 % 1,421.12 208 0.73 % 588.99 126 0.85 % 808.25
dobro dob ro 885 0.99 % 854.98 208 1.05 % 902.30 97 0.37 % 328.21 423 1.50 % 1,197.79 157 1.06 % 1,007.10
mogoče mog oče 815 0.92 % 787.36 190 0.96 % 824.21 136 0.52 % 460.17 238 0.84 % 673.94 251 1.69 % 1,610.08
zaj zaj 810 0.91 % 782.53 206 1.04 % 893.62 342 1.31 % 1,157.20 49 0.17 % 138.75 213 1.44 % 1,366.32
prej pre j 805 0.91 % 777.70 165 0.83 %
715.76 240 0.92 % 812.07 295 1.04 % 835.34 105 0.71 % 673.54
zato zat o 799 0.90 % 771.90 144 0.73 % 624.67 145 0.56 % 490.62 374 1.32 % 1,059.04 136 0.92 % 872.39
zej zej 775 0.87 % 748.71 140 0.71 % 607.31 192 0.74 % 649.65 198 0.70 % 560.67 245 1.65 % 1,571.59
kje kje 773 0.87 % 746.78 139 0.70 % 602.98 258 0.99 % 872.97 272 0.96 % 770.21 104 0.70 % 667.12
treba tre ba 750 0.84 % 724.56 151 0.76 % 655.03 137 0.53 % 463.56 313 1.11 % 886.31 149 1.00 % 955.78
vedno ved no 727 0.82 % 702.34 233 1.18 % 1,010.75 100 0.38 % 338.36 288 1.02 % 815.52 106 0.71 % 679.95
zlo zlo 688 0.77 % 664.67 178 0.90 % 772.16 101 0.39 % 341.74 230 0.81 % 651.28 179 1.21 % 1,148.22
lepo lep o 652 0.73 % 629.89 232 1.17 % 1,006.41 174 0.67 % 588.75 182 0.64 % 515.36 64 0.43 % 410.54
spet spe t 640 0.72 % 618.29 145 0.73 % 629 218 0.84 % 737.63 186 0.66 % 526.69 91 0.61 % 583.73
gor gor 601 0.68 % 580.62 112 0.57 % 485.85 326 1.25 % 1,103.06 62 0.22 % 175.56 101 0.68 % 647.88
enkrat enk rat 596 0.67 % 575.79 143 0.72 % 620.33 168 0.64 % 568.45 195 0.69 % 552.17 90 0.61 % 577.32
čist čis t 585 0.66 % 565.16 59 0.30 % 255.94 265 1.02 % 896.66 121 0.43 % 342.63 140 0.94 % 898.05
kdaj kda j 584 0.66 % 564.19 136 0.69 % 589.96 186 0.71 % 629.35 178 0.63 % 504.04 84 0.57 % 538.83
nazaj naz aj 555 0.62 % 536.18 130 0.66 % 563.94 184 0.71 % 622.58 176 0.62 % 498.37 65 0.44 % 416.95
tle tle 549 0.62 % 530.38 76 0.38 % 329.69 232 0.89 % 785 80 0.28 % 226.53 161 1.08 % 1,032.76

2.2.144 List of initial character-level 4-grams from adverb lower-case word forms in the GOS 1.0 corpus with text-type distribution
File at CLARIN.SI: GOS1.0-word_parts-adverbs-lowercase_forms-initial-4grams-taxonomy-entire.tsv
Lower-case word form | Initial part of the word | Rest of the word | Total absolute frequency of lower-case word form | Percentage of all found lower-case word forms | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N]
tako tako 3,346 4.90 % 3,232.52 829 5.18 % 3,596.17 360 2.17 % 1,218.10 1,611 6.50 % 4,561.81 546 5.01 % 3,502.40
zdej zdej 3,036 4.45 % 2,933.03 515 3.22 % 2,234.05 839 5.07 % 2,838.85 1,118 4.51 % 3,165.80 564 5.18 % 3,617.87
lahko lahk o 2,885 4.23 % 2,787.15 679 4.24 % 2,945.48 335 2.02 % 1,133.51 1,267 5.11 % 3,587.72 604 5.54 % 3,874.45
potem pote m 2,473 3.62 % 2,389.12 466 2.91 % 2,021.49 212 1.28 % 717.33 1,145 4.62 % 3,242.26 650 5.97 % 4,169.53
zdaj zdaj 1,195 1.75 % 1,154.47 427 2.67 % 1,852.31 207 1.25 % 700.41 357 1.44 % 1,010.90 204 1.87 % 1,308.59
zelo zelo 1,127 1.65 % 1,088.78 293 1.83 % 1,271.02 66 0.40 % 223.32 663 2.67 % 1,877.39 105 0.96 % 673.54
bolj bolj 1,015 1.49 % 980.57 236 1.48 % 1,023.76 228 1.38 % 771.46 374 1.51 % 1,059.04 177 1.62 % 1,135.39
danes dane s 933 1.37 % 901.36 541 3.38 % 2,346.84 21 0.13 % 71.06 333 1.34 % 942.94 38 0.35 % 243.76
naprej napr ej 912 1.34 % 881.07 209 1.31 % 906.63 125 0.76 % 422.95 444 1.79 % 1,257.26 134 1.23 % 859.56
dobro dobr o 885 1.30 % 854.98 208 1.30 % 902.30 97 0.59 % 328.21 423 1.71 % 1,197.79 157 1.44 % 1,007.10
mogoče mogo če 815 1.19 % 787.36 190 1.19 % 824.21 136 0.82 % 460.17 238 0.96 % 673.94 251 2.30 % 1,610.08
prej prej 805 1.18 % 777.70 165 1.03 % 715.76 240 1.45 % 812.07 295 1.19 % 835.34 105 0.96 % 673.54
zato zato 799 1.17 % 771.90 144 0.90 % 624.67 145 0.88 % 490.62 374 1.51 % 1,059.04 136 1.25 % 872.39
treba treb a 750 1.10 % 724.56 151 0.94 % 655.03 137 0.83 % 463.56 313 1.26 % 886.31 149 1.37 % 955.78
vedno vedn o 727 1.06 % 702.34 233 1.46 % 1,010.75 100 0.60 % 338.36 288 1.16 % 815.52 106 0.97 % 679.95
lepo lepo 652 0.96 % 629.89 232 1.45 % 1,006.41 174
1.05 % 588.75 182 0.73 % 515.36 64 0.59 % 410.54
spet spet 640 0.94 % 618.29 145 0.91 % 629 218 1.32 % 737.63 186 0.75 % 526.69 91 0.83 % 583.73
enkrat enkr at 596 0.87 % 575.79 143 0.89 % 620.33 168 1.01 % 568.45 195 0.79 % 552.17 90 0.83 % 577.32
čist čist 585 0.86 % 565.16 59 0.37 % 255.94 265 1.60 % 896.66 121 0.49 % 342.63 140 1.28 % 898.05
kdaj kdaj 584 0.86 % 564.19 136 0.85 % 589.96 186 1.12 % 629.35 178 0.72 % 504.04 84 0.77 % 538.83
nazaj naza j 555 0.81 % 536.18 130 0.81 % 563.94 184 1.11 % 622.58 176 0.71 % 498.37 65 0.60 % 416.95
malo malo 530 0.78 % 512.02 163 1.02 % 707.09 134 0.81 % 453.40 125 0.50 % 353.96 108 0.99 % 692.78
takrat takr at 494 0.72 % 477.25 103 0.64 % 446.81 94 0.57 % 318.06 213 0.86 % 603.14 84 0.77 % 538.83
tukaj tuka j 493 0.72 % 476.28 69 0.43 % 299.32 44 0.27 % 148.88 284 1.15 % 804.19 96 0.88 % 615.81
prov prov 491 0.72 % 474.35 74 0.46 % 321.01 210 1.27 % 710.56 129 0.52 % 365.28 78 0.72 % 500.34
fajn fajn 478 0.70 % 461.79 139 0.87 % 602.98 221 1.33 % 747.78 21 0.09 % 59.46 97 0.89 % 622.22
prav prav 467 0.68 % 451.16 139 0.87 % 602.98 88 0.53 % 297.76 186 0.75 % 526.69 54 0.50 % 346.39
najprej najp rej 442 0.65 % 427.01 92 0.57 % 399.09 49 0.30 % 165.80 237 0.96 % 671.10 64 0.59 % 410.54
veliko veli ko 442 0.65 % 427.01 104 0.65 % 451.15 10 0.06 % 33.84 305 1.23 % 863.66 23 0.21 % 147.54
kako kako 429 0.63 % 414.45 121 0.76 % 524.89 46 0.28 % 155.65 211 0.85 % 597.48 51 0.47 % 327.15
točno točn o 419 0.61 % 404.79 120 0.75 % 520.56 94 0.57 % 318.06 113 0.46 % 319.98 92 0.84 % 590.15
verjetno verj etno 416 0.61 % 401.89 101 0.63 % 438.13 57 0.34 % 192.87 169 0.68 % 478.55 89 0.82 % 570.90

2.2.145 List of initial character-level 5-grams from adverb lower-case word forms in the GOS 1.0 corpus with text-type distribution
File at CLARIN.SI: GOS1.0-word_parts-adverbs-lowercase_forms-initial-5grams-taxonomy-entire.tsv
Lower-case word form | Initial part of the word | Rest of the word | Total absolute frequency of lower-case word form | Percentage of all found lower-case word forms | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N]
lahko lahko 2,885 6.50 % 2,787.15 679 6.40 % 2,945.48 335 3.60 % 1,133.51 1,267 7.38 % 3,587.72 604 8.27 % 3,874.45
potem potem 2,473 5.57 % 2,389.12 466 4.39 % 2,021.49 212 2.28 % 717.33 1,145 6.67 % 3,242.26 650 8.90 % 4,169.53
danes danes 933 2.10 % 901.36 541 5.10 % 2,346.84 21 0.23 % 71.06 333 1.94 % 942.94 38 0.52 % 243.76
naprej napre j 912 2.05 % 881.07 209 1.97 % 906.63 125 1.34 % 422.95 444 2.58 % 1,257.26 134 1.83 % 859.56
dobro dobro 885 1.99 % 854.98 208 1.96 % 902.30 97 1.04 % 328.21 423 2.46 % 1,197.79 157 2.15 % 1,007.10
mogoče mogoč e 815 1.84 % 787.36 190 1.79 % 824.21 136 1.46 % 460.17 238 1.39 % 673.94 251 3.44 % 1,610.08
treba treba 750 1.69 % 724.56 151 1.42 % 655.03 137 1.47 % 463.56 313 1.82 % 886.31 149 2.04 % 955.78
vedno vedno 727 1.64 % 702.34 233 2.19 % 1,010.75 100 1.07 % 338.36 288 1.68 % 815.52 106 1.45 % 679.95
enkrat enkra t 596 1.34 % 575.79 143 1.35 % 620.33 168 1.81 % 568.45 195 1.14 % 552.17 90 1.23 % 577.32
nazaj nazaj 555 1.25 % 536.18 130 1.23 % 563.94 184 1.98 % 622.58 176 1.02 % 498.37 65 0.89 % 416.95
takrat takra t 494 1.11 % 477.25 103 0.97 % 446.81 94 1.01 % 318.06 213 1.24 % 603.14 84 1.15 % 538.83
tukaj tukaj 493 1.11 % 476.28 69 0.65 % 299.32 44 0.47 % 148.88 284 1.65 % 804.19 96 1.31 % 615.81
najprej najpr ej 442 0.99 % 427.01 92 0.87 % 399.09 49 0.53 % 165.80 237 1.38 % 671.10 64 0.88 % 410.54
veliko velik o 442 0.99 % 427.01 104 0.98 % 451.15 10 0.11 % 33.84 305 1.78 % 863.66 23 0.32 % 147.54
točno
točno 419 0.94 % 404.79 120 1.13 % 520.56 94 1.01 % 318.06 113 0.66 % 319.98 92 1.26 % 590.15
verjetno verje tno 416 0.94 % 401.89 101 0.95 % 438.13 57 0.61 % 192.87 169 0.98 % 478.55 89 1.22 % 570.90
super super 407 0.92 % 393.20 134 1.26 % 581.29 90 0.97 % 304.53 29 0.17 % 82.12 154 2.11 % 987.86
dejansko dejan sko 404 0.91 % 390.30 40 0.38 % 173.52 71 0.76 % 240.24 183 1.06 % 518.19 110 1.50 % 705.61
zakaj zakaj 399 0.90 % 385.47 65 0.61 % 281.97 67 0.72 % 226.70 236 1.37 % 668.27 31 0.42 % 198.85
takoj takoj 346 0.78 % 334.26 96 0.91 % 416.44 83 0.89 % 280.84 92 0.54 % 260.51 75 1.03 % 481.10
najbolj najbo lj 341 0.77 % 329.43 104 0.98 % 451.15 43 0.46 % 145.50 155 0.90 % 438.91 39 0.53 % 250.17
glede glede 323 0.73 % 312.05 85 0.80 % 368.73 27 0.29 % 91.36 152 0.89 % 430.41 59 0.81 % 378.46
skupaj skupa j 319 0.72 % 308.18 91 0.86 % 394.75 20 0.21 % 67.67 154 0.90 % 436.08 54 0.74 % 346.39
nekje nekje 313 0.70 % 302.38 42 0.40 % 182.19 57 0.61 % 192.87 109 0.64 % 308.65 105 1.44 % 673.54
včasih včasi h 310 0.70 % 299.49 66 0.62 % 286.31 60 0.65 % 203.02 131 0.76 % 370.95 53 0.72 % 339.98
tukej tukej 289 0.65 % 279.20 36 0.34 % 156.17 22 0.24 % 74.44 179 1.04 % 506.87 52 0.71 % 333.56
noter noter 284 0.64 % 274.37 60 0.56 % 260.28 103 1.11 % 348.51 47 0.27 % 133.09 74 1.01 % 474.68
preveč preve č 284 0.64 % 274.37 66 0.62 % 286.31 88 0.95 % 297.76 66 0.38 % 186.89 64 0.88 % 410.54
težko težko 254 0.57 % 245.39 50 0.47 % 216.90 47 0.51 % 159.03 118 0.69 % 334.14 39 0.53 % 250.17
čisto čisto 233 0.53 % 225.10 51 0.48 % 221.24 48 0.52 % 162.41 90 0.52 % 254.85 44 0.60 % 282.24
hitro hitro 230 0.52 % 222.20 79 0.74 % 342.70 46 0.49 % 155.65 81 0.47 % 229.36 24 0.33 % 153.95
precej prece j 230 0.52 % 222.20 78 0.73 % 338.36 25 0.27 % 84.59 101 0.59 % 286 26 0.36 % 166.78

2.2.146 List of final character-level 1-grams from adverb lower-case word forms in the GOS 1.0 corpus with text-type distribution
File at CLARIN.SI: GOS1.0-word_parts-adverbs-lowercase_forms-final-1grams-taxonomy-entire.tsv
Lower-case word form | Rest of the word | Final part of the word | Total absolute frequency of lower-case word form | Percentage of all found lower-case word forms | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N]
tako tak o 3,346 3.54 % 3,232.52 829 3.92 % 3,596.17 360 1.26 % 1,218.10 1,611 5.54 % 4,561.81 546 3.48 % 3,502.40
zdej zde j 3,036 3.21 % 2,933.03 515 2.43 % 2,234.05 839 2.94 % 2,838.85 1,118 3.85 % 3,165.80 564 3.60 % 3,617.87
lahko lahk o 2,885 3.06 % 2,787.15 679 3.21 % 2,945.48 335 1.18 % 1,133.51 1,267 4.36 % 3,587.72 604 3.85 % 3,874.45
tko tk o 2,665 2.82 % 2,574.61 435 2.06 % 1,887.01 1,027 3.60 % 3,474.97 472 1.62 % 1,336.55 731 4.66 % 4,689.11
tak ta k 2,543 2.69 % 2,456.75 493 2.33 % 2,138.62 1,452 5.09 % 4,913.01 132 0.45 % 373.78 466 2.97 % 2,989.23
potem pote m 2,473 2.62 % 2,389.12 466 2.20 % 2,021.49 212 0.74 % 717.33 1,145 3.94 % 3,242.26 650 4.15 % 4,169.53
tam ta m 1,820 1.93 % 1,758.27 257 1.21 % 1,114.86 876 3.07 % 2,964.05 384 1.32 % 1,087.36 303 1.93 % 1,943.64
tu t u 1,384 1.47 % 1,337.06 377 1.78 % 1,635.41 341 1.20 % 1,153.81 343 1.18 % 971.26 323 2.06 % 2,071.93
zdaj zda j 1,195 1.27 % 1,154.47 427 2.02 % 1,852.31 207 0.73 % 700.41 357 1.23 % 1,010.90 204 1.30 % 1,308.59
zelo zel o 1,127 1.19 % 1,088.78 293 1.38 % 1,271.02 66 0.23 % 223.32 663 2.28 % 1,877.39 105 0.67 % 673.54
bolj bol j 1,015 1.07 % 980.57 236 1.11 % 1,023.76 228 0.80 % 771.46 374 1.29 % 1,059.04 177 1.13 % 1,135.39
danes dane s 933 0.99 % 901.36 541 2.56 % 2,346.84 21 0.07 % 71.06 333 1.15 % 942.94 38 0.24 % 243.76
ful fu l
914 0.97 % 883 52 0.25 % 225.57 695 2.44 % 2,351.61 36 0.12 % 101.94 131 0.84 % 840.32
naprej napre j 912 0.97 % 881.07 209 0.99 % 906.63 125 0.44 % 422.95 444 1.53 % 1,257.26 134 0.85 % 859.56
sem se m 898 0.95 % 867.54 144 0.68 % 624.67 420 1.47 % 1,421.12 208 0.72 % 588.99 126 0.80 % 808.25
dobro dobr o 885 0.94 % 854.98 208 0.98 % 902.30 97 0.34 % 328.21 423 1.46 % 1,197.79 157 1.00 % 1,007.10
mogoče mogoč e 815 0.86 % 787.36 190 0.90 % 824.21 136 0.48 % 460.17 238 0.82 % 673.94 251 1.60 % 1,610.08
zaj za j 810 0.86 % 782.53 206 0.97 % 893.62 342 1.20 % 1,157.20 49 0.17 % 138.75 213 1.36 % 1,366.32
prej pre j 805 0.85 % 777.70 165 0.78 % 715.76 240 0.84 % 812.07 295 1.01 % 835.34 105 0.67 % 673.54
zato zat o 799 0.85 % 771.90 144 0.68 % 624.67 145 0.51 % 490.62 374 1.29 % 1,059.04 136 0.87 % 872.39
zej ze j 775 0.82 % 748.71 140 0.66 % 607.31 192 0.67 % 649.65 198 0.68 % 560.67 245 1.56 % 1,571.59
kje kj e 773 0.82 % 746.78 139 0.66 % 602.98 258 0.91 % 872.97 272 0.94 % 770.21 104 0.66 % 667.12
treba treb a 750 0.79 % 724.56 151 0.71 % 655.03 137 0.48 % 463.56 313 1.08 % 886.31 149 0.95 % 955.78
vedno vedn o 727 0.77 % 702.34 233 1.10 % 1,010.75 100 0.35 % 338.36 288 0.99 % 815.52 106 0.68 % 679.95
zlo zl o 688 0.73 % 664.67 178 0.84 % 772.16 101 0.35 % 341.74 230 0.79 % 651.28 179 1.14 % 1,148.22
lepo lep o 652 0.69 % 629.89 232 1.10 % 1,006.41 174 0.61 % 588.75 182 0.63 % 515.36 64 0.41 % 410.54
spet spe t 640 0.68 % 618.29 145 0.69 % 629 218 0.76 % 737.63 186 0.64 % 526.69 91 0.58 % 583.73
te t e 615 0.65 % 594.14 175 0.83 % 759.14 378 1.33 % 1,279.01 6 0.02 % 16.99 56 0.36 % 359.22
tm t m 613 0.65 % 592.21 79 0.37 % 342.70 343 1.20 % 1,160.58 118 0.41 % 334.14 73 0.47 % 468.27
gor go r 601 0.64 % 580.62 112 0.53 % 485.85 326 1.14 % 1,103.06 62 0.21 % 175.56 101 0.64 % 647.88
enkrat enkra t 596 0.63 % 575.79 143 0.68 % 620.33 168 0.59 % 568.45 195 0.67 % 552.17 90 0.57 % 577.32
čist čis t 585 0.62 % 565.16 59 0.28 % 255.94 265 0.93 % 896.66 121 0.42 % 342.63
140 0.89 % 898.05 CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 490 File at CLARIN.SI2.2.147 List of final character-level 2-grams from adverb lower-case word forms in the GOS 1.0 corpus with text- type distributionGOS1.0-word_parts-adverbs-lowercase_forms- final-2grams-taxonomy-entire.tsvLower-case word form Rest of the word Final part of the word Total absolute frequency of lower-case word form Percentage of all found lower-case word forms Total relative frequency (per million occurrences) Absolute frequency [gos.T.J.R] Percentage [gos.T.J.R] Relative frequency [gos.T.J.R] Absolute frequency [gos.T.N.Z] Percentage [gos.T.N.Z] Relative frequency [gos.T.N.Z] Absolute frequency [gos.T.J.I] Percentage [gos.T.J.I] Relative frequency [gos.T.J.I] Absolute frequency [gos.T.N.N] Percentage [gos.T.N.N] Relative frequency [gos.T.N.N] tako ta ko 3,346 3.55 % 3,232.52 829 3.92 % 3,596.17 360 1.26 % 1,218.10 1,611 5.54 % 4,561.81 546 3.48 % 3,502.40 zdej zd ej 3,036 3.22 % 2,933.03 515 2.44 % 2,234.05 839 2.95 % 2,838.85 1,118 3.85 % 3,165.80 564 3.60 % 3,617.87 lahko lah ko 2,885 3.06 % 2,787.15 679 3.21 % 2,945.48 335 1.18 % 1,133.51 1,267 4.36 % 3,587.72 604 3.85 % 3,874.45 tko t ko 2,665 2.82 % 2,574.61 435 2.06 % 1,887.01 1,027 3.61 % 3,474.97 472 1.62 % 1,336.55 731 4.67 % 4,689.11 tak t ak 2,543 2.69 % 2,456.75 493 2.33 % 2,138.62 1,452 5.10 % 4,913.01 132 0.45 % 373.78 466 2.97 % 2,989.23 potem pot em 2,473 2.62 % 2,389.12 466 2.20 % 2,021.49 212 0.74 % 717.33 1,145 3.94 % 3,242.26 650 4.15 % 4,169.53 tam t am 1,820 1.93 % 1,758.27 257 1.22 % 1,114.86 876 3.08 % 2,964.05 384 1.32 % 1,087.36 303 1.93 % 1,943.64 tu tu 1,384 1.47 % 1,337.06 377 1.78 % 1,635.41 341 1.20 % 1,153.81 343 1.18 % 971.26 323 2.06 % 2,071.93 zdaj zd aj 1,195 1.27 % 1,154.47 427 2.02 % 1,852.31 207 0.73 % 700.41 357 1.23 % 1,010.90 204 1.30 % 1,308.59 zelo ze lo 1,127 1.19 % 1,088.78 293 1.39 % 1,271.02 66 0.23 % 223.32 663 2.28 % 1,877.39 105 0.67 % 673.54 bolj 
bo lj 1,015 1.08 % 980.57 236 1.12 % 1,023.76 228 0.80 % 771.46 374 1.29 % 1,059.04 177 1.13 % 1,135.39 danes dan es 933 0.99 % 901.36 541 2.56 % 2,346.84 21 0.07 % 71.06 333 1.15 % 942.94 38 0.24 % 243.76 ful f ul 914 0.97 % 883 52 0.25 % 225.57 695 2.44 % 2,351.61 36 0.12 % 101.94 131 0.84 % 840.32 naprej napr ej 912 0.97 % 881.07 209 0.99 % 906.63 125 0.44 % 422.95 444 1.53 % 1,257.26 134 0.85 % 859.56 sem s em 898 0.95 % 867.54 144 0.68 % 624.67 420 1.48 % 1,421.12 208 0.72 % 588.99 126 0.80 % 808.25 dobro dob ro 885 0.94 % 854.98 208 0.98 % 902.30 97 0.34 % 328.21 423 1.46 % 1,197.79 157 1.00 % 1,007.10 mogoče mogo če 815 0.86 % 787.36 190 0.90 % 824.21 136 0.48 % 460.17 238 0.82 % 673.94 251 1.60 % 1,610.08 zaj z aj 810 0.86 % 782.53 206 0.97 % 893.62 342 1.20 % 1,157.20 49 0.17 % 138.75 213 1.36 % 1,366.32 prej pr ej 805 0.85 % 777.70 165 0.78 % 715.76 240 0.84 % 812.07 295 1.01 % 835.34 105 0.67 % 673.54 zato za to 799 0.85 % 771.90 144 0.68 % 624.67 145 0.51 % 490.62 374 1.29 % 1,059.04 136 0.87 % 872.39 zej z ej 775 0.82 % 748.71 140 0.66 % 607.31 192 0.67 % 649.65 198 0.68 % 560.67 245 1.56 % 1,571.59 kje k je 773 0.82 % 746.78 139 0.66 % 602.98 258 0.91 % 872.97 272 0.94 % 770.21 104 0.66 % 667.12 treba tre ba 750 0.80 % 724.56 151 0.71 % 655.03 137 0.48 % 463.56 313 1.08 % 886.31 149 0.95 % 955.78 vedno ved no 727 0.77 % 702.34 233 1.10 % 1,010.75 100 0.35 % 338.36 288 0.99 % 815.52 106 0.68 % 679.95 zlo z lo 688 0.73 % 664.67 178 0.84 % 772.16 101 0.35 % 341.74 230 0.79 % 651.28 179 1.14 % 1,148.22 lepo le po 652 0.69 % 629.89 232 1.10 % 1,006.41 174 0.61 % 588.75 182 0.63 % 515.36 64 0.41 % 410.54 spet sp et 640 0.68 % 618.29 145 0.69 % 629 218 0.77 % 737.63 186 0.64 % 526.69 91 0.58 % 583.73 te te 615 0.65 % 594.14 175 0.83 % 759.14 378 1.33 % 1,279.01 6 0.02 % 16.99 56 0.36 % 359.22 tm tm 613 0.65 % 592.21 79 0.37 % 342.70 343 1.20 % 1,160.58 118 0.41 % 334.14 73 0.47 % 468.27 gor g or 601 0.64 % 580.62 112 0.53 % 485.85 326 1.15 % 1,103.06 62 0.21 
% 175.56 101 0.65 % 647.88 enkrat enkr at 596 0.63 % 575.79 143 0.68 % 620.33 168 0.59 % 568.45 195 0.67 % 552.17 90 0.57 % 577.32 čist či st 585 0.62 % 565.16 59 0.28 % 255.94 265 0.93 % 896.66 121 0.42 % 342.63 140 0.89 % 898.05 CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 491 File at CLARIN.SI2.2.148 List of final character-level 3-grams from adverb lower-case word forms in the GOS 1.0 corpus with text- type distributionGOS1.0-word_parts-adverbs-lowercase_forms- final-3grams-taxonomy-entire.tsvLower-case word form Rest of the word Final part of the word Total absolute frequency of lower-case word form Percentage of all found lower-case word forms Total relative frequency (per million occurrences) Absolute frequency [gos.T.J.R] Percentage [gos.T.J.R] Relative frequency [gos.T.J.R] Absolute frequency [gos.T.N.Z] Percentage [gos.T.N.Z] Relative frequency [gos.T.N.Z] Absolute frequency [gos.T.J.I] Percentage [gos.T.J.I] Relative frequency [gos.T.J.I] Absolute frequency [gos.T.N.N] Percentage [gos.T.N.N] Relative frequency [gos.T.N.N] tako t ako 3,346 3.76 % 3,232.52 829 4.19 % 3,596.17 360 1.38 % 1,218.10 1,611 5.70 % 4,561.81 546 3.68 % 3,502.40 zdej z dej 3,036 3.41 % 2,933.03 515 2.60 % 2,234.05 839 3.22 % 2,838.85 1,118 3.95 % 3,165.80 564 3.80 % 3,617.87 lahko la hko 2,885 3.24 % 2,787.15 679 3.43 % 2,945.48 335 1.28 % 1,133.51 1,267 4.48 % 3,587.72 604 4.07 % 3,874.45 tko tko 2,665 3.00 % 2,574.61 435 2.20 % 1,887.01 1,027 3.94 % 3,474.97 472 1.67 % 1,336.55 731 4.92 % 4,689.11 tak tak 2,543 2.86 % 2,456.75 493 2.49 % 2,138.62 1,452 5.57 % 4,913.01 132 0.47 % 373.78 466 3.14 % 2,989.23 potem po tem 2,473 2.78 % 2,389.12 466 2.36 % 2,021.49 212 0.81 % 717.33 1,145 4.05 % 3,242.26 650 4.38 % 4,169.53 tam tam 1,820 2.04 % 1,758.27 257 1.30 % 1,114.86 876 3.36 % 2,964.05 384 1.36 % 1,087.36 303 2.04 % 1,943.64 zdaj z daj 1,195 1.34 % 1,154.47 427 2.16 % 1,852.31 207 0.79 % 700.41 357 1.26 % 1,010.90 204 1.37 % 1,308.59 zelo z elo 
1,127 1.27 % 1,088.78 293 1.48 % 1,271.02 66 0.25 % 223.32 663 2.34 % 1,877.39 105 0.71 % 673.54 bolj b olj 1,015 1.14 % 980.57 236 1.19 % 1,023.76 228 0.88 % 771.46 374 1.32 % 1,059.04 177 1.19 % 1,135.39 danes da nes 933 1.05 % 901.36 541 2.73 % 2,346.84 21 0.08 % 71.06 333 1.18 % 942.94 38 0.26 % 243.76 ful ful 914 1.03 % 883 52 0.26 % 225.57 695 2.67 % 2,351.61 36 0.13 % 101.94 131 0.88 % 840.32 naprej nap rej 912 1.02 % 881.07 209 1.06 % 906.63 125 0.48 % 422.95 444 1.57 % 1,257.26 134 0.90 % 859.56 sem sem 898 1.01 % 867.54 144 0.73 % 624.67 420 1.61 % 1,421.12 208 0.73 % 588.99 126 0.85 % 808.25 dobro do bro 885 0.99 % 854.98 208 1.05 % 902.30 97 0.37 % 328.21 423 1.50 % 1,197.79 157 1.06 % 1,007.10 mogoče mog oče 815 0.92 % 787.36 190 0.96 % 824.21 136 0.52 % 460.17 238 0.84 % 673.94 251 1.69 % 1,610.08 zaj zaj 810 0.91 % 782.53 206 1.04 % 893.62 342 1.31 % 1,157.20 49 0.17 % 138.75 213 1.44 % 1,366.32 prej p rej 805 0.91 % 777.70 165 0.83 % 715.76 240 0.92 % 812.07 295 1.04 % 835.34 105 0.71 % 673.54 zato z ato 799 0.90 % 771.90 144 0.73 % 624.67 145 0.56 % 490.62 374 1.32 % 1,059.04 136 0.92 % 872.39 zej zej 775 0.87 % 748.71 140 0.71 % 607.31 192 0.74 % 649.65 198 0.70 % 560.67 245 1.65 % 1,571.59 kje kje 773 0.87 % 746.78 139 0.70 % 602.98 258 0.99 % 872.97 272 0.96 % 770.21 104 0.70 % 667.12 treba tr eba 750 0.84 % 724.56 151 0.76 % 655.03 137 0.53 % 463.56 313 1.11 % 886.31 149 1.00 % 955.78 vedno ve dno 727 0.82 % 702.34 233 1.18 % 1,010.75 100 0.38 % 338.36 288 1.02 % 815.52 106 0.71 % 679.95 zlo zlo 688 0.77 % 664.67 178 0.90 % 772.16 101 0.39 % 341.74 230 0.81 % 651.28 179 1.21 % 1,148.22 lepo l epo 652 0.73 % 629.89 232 1.17 % 1,006.41 174 0.67 % 588.75 182 0.64 % 515.36 64 0.43 % 410.54 spet s pet 640 0.72 % 618.29 145 0.73 % 629 218 0.84 % 737.63 186 0.66 % 526.69 91 0.61 % 583.73 gor gor 601 0.68 % 580.62 112 0.57 % 485.85 326 1.25 % 1,103.06 62 0.22 % 175.56 101 0.68 % 647.88 enkrat enk rat 596 0.67 % 575.79 143 0.72 % 620.33 168 0.64 % 
568.45 195 0.69 % 552.17 90 0.61 % 577.32 čist č ist 585 0.66 % 565.16 59 0.30 % 255.94 265 1.02 % 896.66 121 0.43 % 342.63 140 0.94 % 898.05 kdaj k daj 584 0.66 % 564.19 136 0.69 % 589.96 186 0.71 % 629.35 178 0.63 % 504.04 84 0.57 % 538.83 nazaj na zaj 555 0.62 % 536.18 130 0.66 % 563.94 184 0.71 % 622.58 176 0.62 % 498.37 65 0.44 % 416.95 tle tle 549 0.62 % 530.38 76 0.38 % 329.69 232 0.89 % 785 80 0.28 % 226.53 161 1.08 % 1,032.76 CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 492 File at CLARIN.SI2.2.149 List of final character-level 4-grams from adverb lower-case word forms in the GOS 1.0 corpus with text- type distributionGOS1.0-word_parts-adverbs-lowercase_forms- final-4grams-taxonomy-entire.tsvLower-case word form Rest of the word Final part of the word Total absolute frequency of lower-case word form Percentage of all found lower-case word forms Total relative frequency (per million occurrences) Absolute frequency [gos.T.J.R] Percentage [gos.T.J.R] Relative frequency [gos.T.J.R] Absolute frequency [gos.T.N.Z] Percentage [gos.T.N.Z] Relative frequency [gos.T.N.Z] Absolute frequency [gos.T.J.I] Percentage [gos.T.J.I] Relative frequency [gos.T.J.I] Absolute frequency [gos.T.N.N] Percentage [gos.T.N.N] Relative frequency [gos.T.N.N] tako tako 3,346 4.90 % 3,232.52 829 5.18 % 3,596.17 360 2.17 % 1,218.10 1,611 6.50 % 4,561.81 546 5.01 % 3,502.40 zdej zdej 3,036 4.45 % 2,933.03 515 3.22 % 2,234.05 839 5.07 % 2,838.85 1,118 4.51 % 3,165.80 564 5.18 % 3,617.87 lahko l ahko 2,885 4.23 % 2,787.15 679 4.24 % 2,945.48 335 2.02 % 1,133.51 1,267 5.11 % 3,587.72 604 5.54 % 3,874.45 potem p otem 2,473 3.62 % 2,389.12 466 2.91 % 2,021.49 212 1.28 % 717.33 1,145 4.62 % 3,242.26 650 5.97 % 4,169.53 zdaj zdaj 1,195 1.75 % 1,154.47 427 2.67 % 1,852.31 207 1.25 % 700.41 357 1.44 % 1,010.90 204 1.87 % 1,308.59 zelo zelo 1,127 1.65 % 1,088.78 293 1.83 % 1,271.02 66 0.40 % 223.32 663 2.67 % 1,877.39 105 0.96 % 673.54 bolj bolj 1,015 1.49 % 980.57 
236 1.48 % 1,023.76 228 1.38 % 771.46 374 1.51 % 1,059.04 177 1.62 % 1,135.39 danes d anes 933 1.37 % 901.36 541 3.38 % 2,346.84 21 0.13 % 71.06 333 1.34 % 942.94 38 0.35 % 243.76 naprej na prej 912 1.34 % 881.07 209 1.31 % 906.63 125 0.76 % 422.95 444 1.79 % 1,257.26 134 1.23 % 859.56 dobro d obro 885 1.30 % 854.98 208 1.30 % 902.30 97 0.59 % 328.21 423 1.71 % 1,197.79 157 1.44 % 1,007.10 mogoče mo goče 815 1.19 % 787.36 190 1.19 % 824.21 136 0.82 % 460.17 238 0.96 % 673.94 251 2.30 % 1,610.08 prej prej 805 1.18 % 777.70 165 1.03 % 715.76 240 1.45 % 812.07 295 1.19 % 835.34 105 0.96 % 673.54 zato zato 799 1.17 % 771.90 144 0.90 % 624.67 145 0.88 % 490.62 374 1.51 % 1,059.04 136 1.25 % 872.39 treba t reba 750 1.10 % 724.56 151 0.94 % 655.03 137 0.83 % 463.56 313 1.26 % 886.31 149 1.37 % 955.78 vedno v edno 727 1.06 % 702.34 233 1.46 % 1,010.75 100 0.60 % 338.36 288 1.16 % 815.52 106 0.97 % 679.95 lepo lepo 652 0.96 % 629.89 232 1.45 % 1,006.41 174 1.05 % 588.75 182 0.73 % 515.36 64 0.59 % 410.54 spet spet 640 0.94 % 618.29 145 0.91 % 629 218 1.32 % 737.63 186 0.75 % 526.69 91 0.83 % 583.73 enkrat en krat 596 0.87 % 575.79 143 0.89 % 620.33 168 1.01 % 568.45 195 0.79 % 552.17 90 0.83 % 577.32 čist čist 585 0.86 % 565.16 59 0.37 % 255.94 265 1.60 % 896.66 121 0.49 % 342.63 140 1.28 % 898.05 kdaj kdaj 584 0.86 % 564.19 136 0.85 % 589.96 186 1.12 % 629.35 178 0.72 % 504.04 84 0.77 % 538.83 nazaj n azaj 555 0.81 % 536.18 130 0.81 % 563.94 184 1.11 % 622.58 176 0.71 % 498.37 65 0.60 % 416.95 malo malo 530 0.78 % 512.02 163 1.02 % 707.09 134 0.81 % 453.40 125 0.50 % 353.96 108 0.99 % 692.78 takrat ta krat 494 0.72 % 477.25 103 0.64 % 446.81 94 0.57 % 318.06 213 0.86 % 603.14 84 0.77 % 538.83 tukaj t ukaj 493 0.72 % 476.28 69 0.43 % 299.32 44 0.27 % 148.88 284 1.15 % 804.19 96 0.88 % 615.81 prov prov 491 0.72 % 474.35 74 0.46 % 321.01 210 1.27 % 710.56 129 0.52 % 365.28 78 0.72 % 500.34 fajn fajn 478 0.70 % 461.79 139 0.87 % 602.98 221 1.33 % 747.78 21 0.09 % 59.46 97 0.89 
% 622.22 prav prav 467 0.68 % 451.16 139 0.87 % 602.98 88 0.53 % 297.76 186 0.75 % 526.69 54 0.50 % 346.39 najprej naj prej 442 0.65 % 427.01 92 0.57 % 399.09 49 0.30 % 165.80 237 0.96 % 671.10 64 0.59 % 410.54 veliko ve liko 442 0.65 % 427.01 104 0.65 % 451.15 10 0.06 % 33.84 305 1.23 % 863.66 23 0.21 % 147.54 kako kako 429 0.63 % 414.45 121 0.76 % 524.89 46 0.28 % 155.65 211 0.85 % 597.48 51 0.47 % 327.15 točno t očno 419 0.61 % 404.79 120 0.75 % 520.56 94 0.57 % 318.06 113 0.46 % 319.98 92 0.84 % 590.15 verjetno verj etno 416 0.61 % 401.89 101 0.63 % 438.13 57 0.34 % 192.87 169 0.68 % 478.55 89 0.82 % 570.90 CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 493 File at CLARIN.SI2.2.150 List of final character-level 5-grams from adverb lower-case word forms in the GOS 1.0 corpus with text- type distributionGOS1.0-word_parts-adverbs-lowercase_forms- final-5grams-taxonomy-entire.tsvLower-case word form Rest of the word Final part of the word Total absolute frequency of lower-case word form Percentage of all found lower-case word forms Total relative frequency (per million occurrences) Absolute frequency [gos.T.J.R] Percentage [gos.T.J.R] Relative frequency [gos.T.J.R] Absolute frequency [gos.T.N.Z] Percentage [gos.T.N.Z] Relative frequency [gos.T.N.Z] Absolute frequency [gos.T.J.I] Percentage [gos.T.J.I] Relative frequency [gos.T.J.I] Absolute frequency [gos.T.N.N] Percentage [gos.T.N.N] Relative frequency [gos.T.N.N] lahko lahko 2,885 6.50 % 2,787.15 679 6.40 % 2,945.48 335 3.60 % 1,133.51 1,267 7.38 % 3,587.72 604 8.27 % 3,874.45 potem potem 2,473 5.57 % 2,389.12 466 4.39 % 2,021.49 212 2.28 % 717.33 1,145 6.67 % 3,242.26 650 8.90 % 4,169.53 danes danes 933 2.10 % 901.36 541 5.10 % 2,346.84 21 0.23 % 71.06 333 1.94 % 942.94 38 0.52 % 243.76 naprej n aprej 912 2.05 % 881.07 209 1.97 % 906.63 125 1.34 % 422.95 444 2.58 % 1,257.26 134 1.83 % 859.56 dobro dobro 885 1.99 % 854.98 208 1.96 % 902.30 97 1.04 % 328.21 423 2.46 % 1,197.79 157 
2.15 % 1,007.10 mogoče m ogoče 815 1.84 % 787.36 190 1.79 % 824.21 136 1.46 % 460.17 238 1.39 % 673.94 251 3.44 % 1,610.08 treba treba 750 1.69 % 724.56 151 1.42 % 655.03 137 1.47 % 463.56 313 1.82 % 886.31 149 2.04 % 955.78 vedno vedno 727 1.64 % 702.34 233 2.19 % 1,010.75 100 1.07 % 338.36 288 1.68 % 815.52 106 1.45 % 679.95 enkrat e nkrat 596 1.34 % 575.79 143 1.35 % 620.33 168 1.81 % 568.45 195 1.14 % 552.17 90 1.23 % 577.32 nazaj nazaj 555 1.25 % 536.18 130 1.23 % 563.94 184 1.98 % 622.58 176 1.02 % 498.37 65 0.89 % 416.95 takrat t akrat 494 1.11 % 477.25 103 0.97 % 446.81 94 1.01 % 318.06 213 1.24 % 603.14 84 1.15 % 538.83 tukaj tukaj 493 1.11 % 476.28 69 0.65 % 299.32 44 0.47 % 148.88 284 1.65 % 804.19 96 1.31 % 615.81 najprej na jprej 442 0.99 % 427.01 92 0.87 % 399.09 49 0.53 % 165.80 237 1.38 % 671.10 64 0.88 % 410.54 veliko v eliko 442 0.99 % 427.01 104 0.98 % 451.15 10 0.11 % 33.84 305 1.78 % 863.66 23 0.32 % 147.54 točno točno 419 0.94 % 404.79 120 1.13 % 520.56 94 1.01 % 318.06 113 0.66 % 319.98 92 1.26 % 590.15 verjetno ver jetno 416 0.94 % 401.89 101 0.95 % 438.13 57 0.61 % 192.87 169 0.98 % 478.55 89 1.22 % 570.90 super super 407 0.92 % 393.20 134 1.26 % 581.29 90 0.97 % 304.53 29 0.17 % 82.12 154 2.11 % 987.86 dejansko dej ansko 404 0.91 % 390.30 40 0.38 % 173.52 71 0.76 % 240.24 183 1.06 % 518.19 110 1.50 % 705.61 zakaj zakaj 399 0.90 % 385.47 65 0.61 % 281.97 67 0.72 % 226.70 236 1.37 % 668.27 31 0.42 % 198.85 takoj takoj 346 0.78 % 334.26 96 0.91 % 416.44 83 0.89 % 280.84 92 0.54 % 260.51 75 1.03 % 481.10 najbolj na jbolj 341 0.77 % 329.43 104 0.98 % 451.15 43 0.46 % 145.50 155 0.90 % 438.91 39 0.53 % 250.17 glede glede 323 0.73 % 312.05 85 0.80 % 368.73 27 0.29 % 91.36 152 0.89 % 430.41 59 0.81 % 378.46 skupaj s kupaj 319 0.72 % 308.18 91 0.86 % 394.75 20 0.21 % 67.67 154 0.90 % 436.08 54 0.74 % 346.39 nekje nekje 313 0.70 % 302.38 42 0.40 % 182.19 57 0.61 % 192.87 109 0.64 % 308.65 105 1.44 % 673.54 včasih v časih 310 0.70 % 299.49 66 0.62 % 
286.31 60 0.65 % 203.02 131 0.76 % 370.95 53 0.72 % 339.98 tukej tukej 289 0.65 % 279.20 36 0.34 % 156.17 22 0.24 % 74.44 179 1.04 % 506.87 52 0.71 % 333.56 noter noter 284 0.64 % 274.37 60 0.56 % 260.28 103 1.11 % 348.51 47 0.27 % 133.09 74 1.01 % 474.68 preveč p reveč 284 0.64 % 274.37 66 0.62 % 286.31 88 0.95 % 297.76 66 0.38 % 186.89 64 0.88 % 410.54 težko težko 254 0.57 % 245.39 50 0.47 % 216.90 47 0.51 % 159.03 118 0.69 % 334.14 39 0.53 % 250.17 čisto čisto 233 0.53 % 225.10 51 0.48 % 221.24 48 0.52 % 162.41 90 0.52 % 254.85 44 0.60 % 282.24 hitro hitro 230 0.52 % 222.20 79 0.74 % 342.70 46 0.49 % 155.65 81 0.47 % 229.36 24 0.33 % 153.95 precej p recej 230 0.52 % 222.20 78 0.73 % 338.36 25 0.27 % 84.59 101 0.59 % 286 26 0.36 % 166.78 CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 494 File at CLARIN.SI2.2.151 List of initial character-level 1-grams from pronoun lemmas in the GOS 1.0 corpus with text-type distributionGOS1.0-word_parts-pronouns-lemmas-initial- 1grams-taxonomy-entire.tsvLemma Lemma (lower-case) Initial part of the word Rest of the word Total absolute frequency of lemma Percentage of all found lemmas Total relative frequency (per million occurrences) Absolute frequency [gos.T.J.R] Percentage [gos.T.J.R] Relative frequency [gos.T.J.R] Absolute frequency [gos.T.N.Z] Percentage [gos.T.N.Z] Relative frequency [gos.T.N.Z] Absolute frequency [gos.T.J.I] Percentage [gos.T.J.I] Relative frequency [gos.T.J.I] Absolute frequency [gos.T.N.N] Percentage [gos.T.N.N] Relative frequency [gos.T.N.N] ta ta t a 29,255 21.02 % 28,262.78 5,551 19.28 % 24,080.03 7,749 18.23 % 26,219.62 10,190 22.50 % 28,854.68 5,765 25.55 % 36,980.49 se se s e 18,112 13.02 % 17,497.71 3,828 13.30 % 16,605.72 4,503 10.60 % 15,236.41 6,799 15.01 % 19,252.50 2,982 13.22 % 19,128.50 on on o n 17,518 12.59 % 16,923.85 3,276 11.38 % 14,211.16 6,792 15.98 % 22,981.51 5,129 11.32 % 14,523.61 2,321 10.29 % 14,888.42 jaz jaz j az 15,037 10.81 % 14,527 3,089 
10.73 % 13,399.96 5,613 13.21 % 18,992.22 3,872 8.55 % 10,964.21 2,463 10.92 % 15,799.30 ti ti t i 9,496 6.82 % 9,173.93 2,673 9.29 % 11,595.37 3,045 7.17 % 10,303.10 2,137 4.72 % 6,051.27 1,641 7.27 % 10,526.45 kaj kaj k aj 9,336 6.71 % 9,019.36 1,740 6.04 % 7,548.05 3,584 8.43 % 12,126.87 2,584 5.70 % 7,317.02 1,428 6.33 % 9,160.13 ves ves v es 4,615 3.32 % 4,458.48 1,105 3.84 % 4,793.45 1,336 3.14 % 4,520.51 1,474 3.25 % 4,173.88 700 3.10 % 4,490.26 tisti tisti t isti 3,373 2.42 % 3,258.60 523 1.82 % 2,268.75 1,334 3.14 % 4,513.74 1,025 2.26 % 2,902.46 491 2.18 % 3,149.60 kar kar k ar 2,985 2.15 % 2,883.76 630 2.19 % 2,732.92 732 1.72 % 2,476.81 1,108 2.45 % 3,137.49 515 2.28 % 3,303.55 tak tak t ak 2,457 1.77 % 2,373.67 460 1.60 % 1,995.46 918 2.16 % 3,106.16 618 1.36 % 1,749.97 461 2.04 % 2,957.16 kakšen kakšen k akšen 2,259 1.62 % 2,182.38 581 2.02 % 2,520.36 435 1.02 % 1,471.87 938 2.07 % 2,656.10 305 1.35 % 1,956.47 kak kak k ak 2,205 1.58 % 2,130.21 451 1.57 % 1,956.42 796 1.87 % 2,693.36 585 1.29 % 1,656.52 373 1.65 % 2,392.67 nekaj nekaj n ekaj 2,045 1.47 % 1,975.64 378 1.31 % 1,639.75 748 1.76 % 2,530.94 591 1.30 % 1,673.51 328 1.45 % 2,104.01 nek nek n ek 1,708 1.23 % 1,650.07 177 0.61 % 767.82 328 0.77 % 1,109.83 850 1.88 % 2,406.92 353 1.56 % 2,264.37 nič nič n ič 1,580 1.14 % 1,526.41 373 1.30 % 1,618.06 607 1.43 % 2,053.85 395 0.87 % 1,118.51 205 0.91 % 1,315 kateri kateri k ateri 1,447 1.04 % 1,397.92 288 1.00 % 1,249.33 279 0.66 % 944.03 739 1.63 % 2,092.60 141 0.62 % 904.47 moj moj m oj 1,367 0.98 % 1,320.64 349 1.21 % 1,513.95 512 1.21 % 1,732.41 325 0.72 % 920.29 181 0.80 % 1,161.05 naš naš n aš 1,327 0.95 % 1,281.99 351 1.22 % 1,522.62 164 0.39 % 554.91 689 1.52 % 1,951.02 123 0.55 % 789 kdo kdo k do 1,238 0.89 % 1,196.01 313 1.09 % 1,357.78 311 0.73 % 1,052.30 450 0.99 % 1,274.25 164 0.73 % 1,052 svoj svoj s voj 1,220 0.88 % 1,178.62 286 0.99 % 1,240.66 149 0.35 % 504.16 684 1.51 % 1,936.86 101 0.45 % 647.88 tale tale t ale 1,215 0.87 % 
1,173.79 296 1.03 % 1,284.04 216 0.51 % 730.86 477 1.05 % 1,350.70 226 1.00 % 1,449.71 vsak vsak v sak 1,084 0.78 % 1,047.23 246 0.85 % 1,067.14 304 0.71 % 1,028.62 373 0.82 % 1,056.21 161 0.71 % 1,032.76 isti isti i sti 605 0.43 % 584.48 75 0.26 % 325.35 208 0.49 % 703.79 211 0.47 % 597.48 111 0.49 % 712.03 vaš vaš v aš 568 0.41 % 548.74 156 0.54 % 676.72 35 0.08 % 118.43 265 0.58 % 750.39 112 0.50 % 718.44 kolik kolik k olik 507 0.36 % 489.80 57 0.20 % 247.26 228 0.54 % 771.46 118 0.26 % 334.14 104 0.46 % 667.12 oni oni o ni 469 0.34 % 453.09 39 0.14 % 169.18 316 0.74 % 1,069.22 44 0.10 % 124.59 70 0.31 % 449.03 takšen takšen t akšen 464 0.33 % 448.26 167 0.58 % 724.44 67 0.16 % 226.70 188 0.41 % 532.35 42 0.19 % 269.42 tvoj tvoj t voj 408 0.29 % 394.16 150 0.52 % 650.69 128 0.30 % 433.10 95 0.21 % 269.01 35 0.15 % 224.51 nekak nekak n ekak 391 0.28 % 377.74 73 0.25 % 316.67 68 0.16 % 230.09 176 0.39 % 498.37 74 0.33 % 474.68 tolik tolik t olik 365 0.26 % 352.62 81 0.28 % 351.37 131 0.31 % 443.25 81 0.18 % 229.36 72 0.32 % 461.86 ostal ostal o stal 355 0.26 % 342.96 100 0.35 % 433.80 53 0.12 % 179.33 150 0.33 % 424.75 52 0.23 % 333.56 njegov njegov n jegov 325 0.23 % 313.98 76 0.26 % 329.69 64 0.15 % 216.55 168 0.37 % 475.72 17 0.07 % 109.05 CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 495 File at CLARIN.SI2.2.152 List of initial character-level 2-grams from pronoun lemmas in the GOS 1.0 corpus with text-type distributionGOS1.0-word_parts-pronouns-lemmas-initial- 2grams-taxonomy-entire.tsvLemma Lemma (lower-case) Initial part of the word Rest of the word Total absolute frequency of lemma Percentage of all found lemmas Total relative frequency (per million occurrences) Absolute frequency [gos.T.J.R] Percentage [gos.T.J.R] Relative frequency [gos.T.J.R] Absolute frequency [gos.T.N.Z] Percentage [gos.T.N.Z] Relative frequency [gos.T.N.Z] Absolute frequency [gos.T.J.I] Percentage [gos.T.J.I] Relative frequency [gos.T.J.I] Absolute 
frequency [gos.T.N.N] Percentage [gos.T.N.N] Relative frequency [gos.T.N.N] ta ta ta 29,255 21.02 % 28,262.78 5,551 19.28 % 24,080.03 7,749 18.23 % 26,219.62 10,190 22.50 % 28,854.68 5,765 25.55 % 36,980.49 se se se 18,112 13.02 % 17,497.71 3,828 13.30 % 16,605.72 4,503 10.60 % 15,236.41 6,799 15.01 % 19,252.50 2,982 13.22 % 19,128.50 on on on 17,518 12.59 % 16,923.85 3,276 11.38 % 14,211.16 6,792 15.98 % 22,981.51 5,129 11.32 % 14,523.61 2,321 10.29 % 14,888.42 jaz jaz ja z 15,037 10.81 % 14,527 3,089 10.73 % 13,399.96 5,613 13.21 % 18,992.22 3,872 8.55 % 10,964.21 2,463 10.92 % 15,799.30 ti ti ti 9,496 6.82 % 9,173.93 2,673 9.29 % 11,595.37 3,045 7.17 % 10,303.10 2,137 4.72 % 6,051.27 1,641 7.27 % 10,526.45 kaj kaj ka j 9,336 6.71 % 9,019.36 1,740 6.04 % 7,548.05 3,584 8.43 % 12,126.87 2,584 5.70 % 7,317.02 1,428 6.33 % 9,160.13 ves ves ve s 4,615 3.32 % 4,458.48 1,105 3.84 % 4,793.45 1,336 3.14 % 4,520.51 1,474 3.25 % 4,173.88 700 3.10 % 4,490.26 tisti tisti ti sti 3,373 2.42 % 3,258.60 523 1.82 % 2,268.75 1,334 3.14 % 4,513.74 1,025 2.26 % 2,902.46 491 2.18 % 3,149.60 kar kar ka r 2,985 2.15 % 2,883.76 630 2.19 % 2,732.92 732 1.72 % 2,476.81 1,108 2.45 % 3,137.49 515 2.28 % 3,303.55 tak tak ta k 2,457 1.77 % 2,373.67 460 1.60 % 1,995.46 918 2.16 % 3,106.16 618 1.36 % 1,749.97 461 2.04 % 2,957.16 kakšen kakšen ka kšen 2,259 1.62 % 2,182.38 581 2.02 % 2,520.36 435 1.02 % 1,471.87 938 2.07 % 2,656.10 305 1.35 % 1,956.47 kak kak ka k 2,205 1.58 % 2,130.21 451 1.57 % 1,956.42 796 1.87 % 2,693.36 585 1.29 % 1,656.52 373 1.65 % 2,392.67 nekaj nekaj ne kaj 2,045 1.47 % 1,975.64 378 1.31 % 1,639.75 748 1.76 % 2,530.94 591 1.30 % 1,673.51 328 1.45 % 2,104.01 nek nek ne k 1,708 1.23 % 1,650.07 177 0.61 % 767.82 328 0.77 % 1,109.83 850 1.88 % 2,406.92 353 1.56 % 2,264.37 nič nič ni č 1,580 1.14 % 1,526.41 373 1.30 % 1,618.06 607 1.43 % 2,053.85 395 0.87 % 1,118.51 205 0.91 % 1,315 kateri kateri ka teri 1,447 1.04 % 1,397.92 288 1.00 % 1,249.33 279 0.66 % 944.03 739 1.63 % 
2,092.60 141 0.62 % 904.47 moj moj mo j 1,367 0.98 % 1,320.64 349 1.21 % 1,513.95 512 1.21 % 1,732.41 325 0.72 % 920.29 181 0.80 % 1,161.05 naš naš na š 1,327 0.95 % 1,281.99 351 1.22 % 1,522.62 164 0.39 % 554.91 689 1.52 % 1,951.02 123 0.55 % 789 kdo kdo kd o 1,238 0.89 % 1,196.01 313 1.09 % 1,357.78 311 0.73 % 1,052.30 450 0.99 % 1,274.25 164 0.73 % 1,052 svoj svoj sv oj 1,220 0.88 % 1,178.62 286 0.99 % 1,240.66 149 0.35 % 504.16 684 1.51 % 1,936.86 101 0.45 % 647.88 tale tale ta le 1,215 0.87 % 1,173.79 296 1.03 % 1,284.04 216 0.51 % 730.86 477 1.05 % 1,350.70 226 1.00 % 1,449.71 vsak vsak vs ak 1,084 0.78 % 1,047.23 246 0.85 % 1,067.14 304 0.71 % 1,028.62 373 0.82 % 1,056.21 161 0.71 % 1,032.76 isti isti is ti 605 0.43 % 584.48 75 0.26 % 325.35 208 0.49 % 703.79 211 0.47 % 597.48 111 0.49 % 712.03 vaš vaš va š 568 0.41 % 548.74 156 0.54 % 676.72 35 0.08 % 118.43 265 0.58 % 750.39 112 0.50 % 718.44 kolik kolik ko lik 507 0.36 % 489.80 57 0.20 % 247.26 228 0.54 % 771.46 118 0.26 % 334.14 104 0.46 % 667.12 oni oni on i 469 0.34 % 453.09 39 0.14 % 169.18 316 0.74 % 1,069.22 44 0.10 % 124.59 70 0.31 % 449.03 takšen takšen ta kšen 464 0.33 % 448.26 167 0.58 % 724.44 67 0.16 % 226.70 188 0.41 % 532.35 42 0.19 % 269.42 tvoj tvoj tv oj 408 0.29 % 394.16 150 0.52 % 650.69 128 0.30 % 433.10 95 0.21 % 269.01 35 0.15 % 224.51 nekak nekak ne kak 391 0.28 % 377.74 73 0.25 % 316.67 68 0.16 % 230.09 176 0.39 % 498.37 74 0.33 % 474.68 tolik tolik to lik 365 0.26 % 352.62 81 0.28 % 351.37 131 0.31 % 443.25 81 0.18 % 229.36 72 0.32 % 461.86 ostal ostal os tal 355 0.26 % 342.96 100 0.35 % 433.80 53 0.12 % 179.33 150 0.33 % 424.75 52 0.23 % 333.56 njegov njegov nj egov 325 0.23 % 313.98 76 0.26 % 329.69 64 0.15 % 216.55 168 0.37 % 475.72 17 0.07 % 109.05 CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 496 File at CLARIN.SI2.2.153 List of initial character-level 3-grams from pronoun lemmas in the GOS 1.0 corpus with text-type 
distributionGOS1.0-word_parts-pronouns-lemmas-initial- 3grams-taxonomy-entire.tsvLemma Lemma (lower-case) Initial part of the word Rest of the word Total absolute frequency of lemma Percentage of all found lemmas Total relative frequency (per million occurrences) Absolute frequency [gos.T.J.R] Percentage [gos.T.J.R] Relative frequency [gos.T.J.R] Absolute frequency [gos.T.N.Z] Percentage [gos.T.N.Z] Relative frequency [gos.T.N.Z] Absolute frequency [gos.T.J.I] Percentage [gos.T.J.I] Relative frequency [gos.T.J.I] Absolute frequency [gos.T.N.N] Percentage [gos.T.N.N] Relative frequency [gos.T.N.N] jaz jaz jaz 15,037 23.29 % 14,527 3,089 22.99 % 13,399.96 5,613 27.60 % 18,992.22 3,872 18.46 % 10,964.21 2,463 25.08 % 15,799.30 kaj kaj kaj 9,336 14.46 % 9,019.36 1,740 12.95 % 7,548.05 3,584 17.62 % 12,126.87 2,584 12.32 % 7,317.02 1,428 14.54 % 9,160.13 ves ves ves 4,615 7.15 % 4,458.48 1,105 8.22 % 4,793.45 1,336 6.57 % 4,520.51 1,474 7.03 % 4,173.88 700 7.13 % 4,490.26 tisti tisti tis ti 3,373 5.22 % 3,258.60 523 3.89 % 2,268.75 1,334 6.56 % 4,513.74 1,025 4.89 % 2,902.46 491 5.00 % 3,149.60 kar kar kar 2,985 4.62 % 2,883.76 630 4.69 % 2,732.92 732 3.60 % 2,476.81 1,108 5.28 % 3,137.49 515 5.24 % 3,303.55 tak tak tak 2,457 3.81 % 2,373.67 460 3.42 % 1,995.46 918 4.51 % 3,106.16 618 2.95 % 1,749.97 461 4.69 % 2,957.16 kakšen kakšen kak šen 2,259 3.50 % 2,182.38 581 4.32 % 2,520.36 435 2.14 % 1,471.87 938 4.47 % 2,656.10 305 3.11 % 1,956.47 kak kak kak 2,205 3.42 % 2,130.21 451 3.36 % 1,956.42 796 3.91 % 2,693.36 585 2.79 % 1,656.52 373 3.80 % 2,392.67 nekaj nekaj nek aj 2,045 3.17 % 1,975.64 378 2.81 % 1,639.75 748 3.68 % 2,530.94 591 2.82 % 1,673.51 328 3.34 % 2,104.01 nek nek nek 1,708 2.65 % 1,650.07 177 1.32 % 767.82 328 1.61 % 1,109.83 850 4.05 % 2,406.92 353 3.59 % 2,264.37 nič nič nič 1,580 2.45 % 1,526.41 373 2.78 % 1,618.06 607 2.98 % 2,053.85 395 1.88 % 1,118.51 205 2.09 % 1,315 kateri kateri kat eri 1,447 2.24 % 1,397.92 288 2.14 % 1,249.33 279 1.37 % 
944.03 739 3.52 % 2,092.60 141 1.44 % 904.47 moj moj moj 1,367 2.12 % 1,320.64 349 2.60 % 1,513.95 512 2.52 % 1,732.41 325 1.55 % 920.29 181 1.84 % 1,161.05 naš naš naš 1,327 2.06 % 1,281.99 351 2.61 % 1,522.62 164 0.81 % 554.91 689 3.29 % 1,951.02 123 1.25 % 789 kdo kdo kdo 1,238 1.92 % 1,196.01 313 2.33 % 1,357.78 311 1.53 % 1,052.30 450 2.15 % 1,274.25 164 1.67 % 1,052 svoj svoj svo j 1,220 1.89 % 1,178.62 286 2.13 % 1,240.66 149 0.73 % 504.16 684 3.26 % 1,936.86 101 1.03 % 647.88 tale tale tal e 1,215 1.88 % 1,173.79 296 2.20 % 1,284.04 216 1.06 % 730.86 477 2.27 % 1,350.70 226 2.30 % 1,449.71 vsak vsak vsa k 1,084 1.68 % 1,047.23 246 1.83 % 1,067.14 304 1.50 % 1,028.62 373 1.78 % 1,056.21 161 1.64 % 1,032.76 isti isti ist i 605 0.94 % 584.48 75 0.56 % 325.35 208 1.02 % 703.79 211 1.01 % 597.48 111 1.13 % 712.03 vaš vaš vaš 568 0.88 % 548.74 156 1.16 % 676.72 35 0.17 % 118.43 265 1.26 % 750.39 112 1.14 % 718.44 kolik kolik kol ik 507 0.79 % 489.80 57 0.42 % 247.26 228 1.12 % 771.46 118 0.56 % 334.14 104 1.06 % 667.12 oni oni oni 469 0.73 % 453.09 39 0.29 % 169.18 316 1.55 % 1,069.22 44 0.21 % 124.59 70 0.71 % 449.03 takšen takšen tak šen 464 0.72 % 448.26 167 1.24 % 724.44 67 0.33 % 226.70 188 0.90 % 532.35 42 0.43 % 269.42 tvoj tvoj tvo j 408 0.63 % 394.16 150 1.12 % 650.69 128 0.63 % 433.10 95 0.45 % 269.01 35 0.36 % 224.51 nekak nekak nek ak 391 0.61 % 377.74 73 0.54 % 316.67 68 0.33 % 230.09 176 0.84 % 498.37 74 0.75 % 474.68 tolik tolik tol ik 365 0.56 % 352.62 81 0.60 % 351.37 131 0.64 % 443.25 81 0.39 % 229.36 72 0.73 % 461.86 ostal ostal ost al 355 0.55 % 342.96 100 0.74 % 433.80 53 0.26 % 179.33 150 0.71 % 424.75 52 0.53 % 333.56 njegov njegov nje gov 325 0.50 % 313.98 76 0.57 % 329.69 64 0.32 % 216.55 168 0.80 % 475.72 17 0.17 % 109.05 nekdo nekdo nek do 313 0.48 % 302.38 63 0.47 % 273.29 67 0.33 % 226.70 140 0.67 % 396.43 43 0.44 % 275.83 takle takle tak le 312 0.48 % 301.42 81 0.60 % 351.37 89 0.44 % 301.14 108 0.52 % 305.82 34 0.35 % 218.10 nobeden 
nobeden nob eden 307 0.47 % 296.59 43 0.32 % 186.53 112 0.55 % 378.96 94 0.45 % 266.18 58 0.59 % 372.05 oba oba oba 259 0.40 % 250.22 70 0.52 % 303.66 32 0.16 % 108.28 128 0.61 % 362.45 29 0.29 % 186.03 CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 497 File at CLARIN.SI2.2.154 List of initial character-level 4-grams from pronoun lemmas in the GOS 1.0 corpus with text-type distributionGOS1.0-word_parts-pronouns-lemmas-initial- 4grams-taxonomy-entire.tsvLemma Lemma (lower-case) Initial part of the word Rest of the word Total absolute frequency of lemma Percentage of all found lemmas Total relative frequency (per million occurrences) Absolute frequency [gos.T.J.R] Percentage [gos.T.J.R] Relative frequency [gos.T.J.R] Absolute frequency [gos.T.N.Z] Percentage [gos.T.N.Z] Relative frequency [gos.T.N.Z] Absolute frequency [gos.T.J.I] Percentage [gos.T.J.I] Relative frequency [gos.T.J.I] Absolute frequency [gos.T.N.N] Percentage [gos.T.N.N] Relative frequency [gos.T.N.N] tisti tisti tist i 3,373 17.37 % 3,258.60 523 12.65 % 2,268.75 1,334 26.40 % 4,513.74 1,025 13.51 % 2,902.46 491 18.57 % 3,149.60 kakšen kakšen kakš en 2,259 11.63 % 2,182.38 581 14.06 % 2,520.36 435 8.61 % 1,471.87 938 12.36 % 2,656.10 305 11.54 % 1,956.47 nekaj nekaj neka j 2,045 10.53 % 1,975.64 378 9.15 % 1,639.75 748 14.80 % 2,530.94 591 7.79 % 1,673.51 328 12.40 % 2,104.01 kateri kateri kate ri 1,447 7.45 % 1,397.92 288 6.97 % 1,249.33 279 5.52 % 944.03 739 9.74 % 2,092.60 141 5.33 % 904.47 svoj svoj svoj 1,220 6.28 % 1,178.62 286 6.92 % 1,240.66 149 2.95 % 504.16 684 9.01 % 1,936.86 101 3.82 % 647.88 tale tale tale 1,215 6.26 % 1,173.79 296 7.16 % 1,284.04 216 4.28 % 730.86 477 6.29 % 1,350.70 226 8.55 % 1,449.71 vsak vsak vsak 1,084 5.58 % 1,047.23 246 5.95 % 1,067.14 304 6.02 % 1,028.62 373 4.92 % 1,056.21 161 6.09 % 1,032.76 isti isti isti 605 3.12 % 584.48 75 1.81 % 325.35 208 4.12 % 703.79 211 2.78 % 597.48 111 4.20 % 712.03 kolik kolik koli k 507 2.61 % 
489.80 57 1.38 % 247.26 228 4.51 % 771.46 118 1.55 % 334.14 104 3.93 % 667.12 takšen takšen takš en 464 2.39 % 448.26 167 4.04 % 724.44 67 1.33 % 226.70 188 2.48 % 532.35 42 1.59 % 269.42 tvoj tvoj tvoj 408 2.10 % 394.16 150 3.63 % 650.69 128 2.53 % 433.10 95 1.25 % 269.01 35 1.32 % 224.51 nekak nekak neka k 391 2.01 % 377.74 73 1.77 % 316.67 68 1.35 % 230.09 176 2.32 % 498.37 74 2.80 % 474.68 tolik tolik toli k 365 1.88 % 352.62 81 1.96 % 351.37 131 2.59 % 443.25 81 1.07 % 229.36 72 2.72 % 461.86 ostal ostal osta l 355 1.83 % 342.96 100 2.42 % 433.80 53 1.05 % 179.33 150 1.98 % 424.75 52 1.97 % 333.56 njegov njegov njeg ov 325 1.67 % 313.98 76 1.84 % 329.69 64 1.27 % 216.55 168 2.21 % 475.72 17 0.64 % 109.05 nekdo nekdo nekd o 313 1.61 % 302.38 63 1.52 % 273.29 67 1.33 % 226.70 140 1.84 % 396.43 43 1.63 % 275.83 takle takle takl e 312 1.61 % 301.42 81 1.96 % 351.37 89 1.76 % 301.14 108 1.42 % 305.82 34 1.29 % 218.10 nobeden nobeden nobe den 307 1.58 % 296.59 43 1.04 % 186.53 112 2.22 % 378.96 94 1.24 % 266.18 58 2.19 % 372.05 nekateri nekateri neka teri 253 1.30 % 244.42 56 1.35 % 242.93 12 0.24 % 40.60 167 2.20 % 472.89 18 0.68 % 115.46 njen njen njen 245 1.26 % 236.69 94 2.27 % 407.77 40 0.79 % 135.34 100 1.32 % 283.17 11 0.42 % 70.56 njihov njihov njih ov 241 1.24 % 232.83 44 1.06 % 190.87 39 0.77 % 131.96 122 1.61 % 345.46 36 1.36 % 230.93 drugačen drugačen drug ačen 199 1.02 % 192.25 40 0.97 % 173.52 29 0.57 % 98.12 106 1.40 % 300.16 24 0.91 % 153.95 enak enak enak 182 0.94 % 175.83 17 0.41 % 73.75 7 0.14 % 23.69 134 1.77 % 379.44 24 0.91 % 153.95 zame zame zame 138 0.71 % 133.32 37 0.90 % 160.50 41 0.81 % 138.73 50 0.66 % 141.58 10 0.38 % 64.15 kolikor kolikor koli kor 128 0.66 % 123.66 24 0.58 % 104.11 26 0.52 % 87.97 58 0.76 % 164.24 20 0.76 % 128.29 nihče nihče nihč e 125 0.64 % 120.76 24 0.58 % 104.11 26 0.52 % 87.97 68 0.90 % 192.55 7 0.27 % 44.90 nobena nobena nobe na 116 0.60 % 112.07 19 0.46 % 82.42 17 0.34 % 57.52 61 0.80 % 172.73 19 0.72 % 121.88 
noben | noben | nobe | n | 101 | 0.52 % | 97.57 | 19 | 0.46 % | 82.42 | 16 | 0.32 % | 54.14 | 50 | 0.66 % | 141.58 | 16 | 0.60 % | 102.63
kakršen | kakršen | kakr | šen | 83 | 0.43 % | 80.18 | 24 | 0.58 % | 104.11 | 4 | 0.08 % | 13.53 | 38 | 0.50 % | 107.60 | 17 | 0.64 % | 109.05
marsikaj | marsikaj | mars | ikaj | 82 | 0.42 % | 79.22 | 23 | 0.56 % | 99.77 | 8 | 0.16 % | 27.07 | 45 | 0.59 % | 127.42 | 6 | 0.23 % | 38.49
nekakšen | nekakšen | neka | kšen | 76 | 0.39 % | 73.42 | 24 | 0.58 % | 104.11 | 29 | 0.57 % | 98.12 | 20 | 0.26 % | 56.63 | 3 | 0.11 % | 19.24
kdor | kdor | kdor |  | 65 | 0.34 % | 62.80 | 14 | 0.34 % | 60.73 | 10 | 0.20 % | 33.84 | 31 | 0.41 % | 87.78 | 10 | 0.38 % | 64.15

2.2.155 List of initial character-level 5-grams from pronoun lemmas in the GOS 1.0 corpus with text-type distribution
File at CLARIN.SI: GOS1.0-word_parts-pronouns-lemmas-initial-5grams-taxonomy-entire.tsv

Lemma | Lemma (lower-case) | Initial part of the word | Rest of the word | Total absolute frequency of lemma | Percentage of all found lemmas | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N]
tisti | tisti | tisti |  | 3,373 | 23.90 % | 3,258.60 | 523 | 18.15 % | 2,268.75 | 1,334 | 34.05 % | 4,513.74 | 1,025 | 19.09 % | 2,902.46 | 491 | 25.26 % | 3,149.60
kakšen | kakšen | kakše | n | 2,259 | 16.01 % | 2,182.38 | 581 | 20.17 % | 2,520.36 | 435 | 11.10 % | 1,471.87 | 938 | 17.47 % | 2,656.10 | 305 | 15.69 % | 1,956.47
nekaj | nekaj | nekaj |  | 2,045 | 14.49 % | 1,975.64 | 378 | 13.12 % | 1,639.75 | 748 | 19.09 % | 2,530.94 | 591 | 11.01 % | 1,673.51 | 328 | 16.87 % | 2,104.01
kateri | kateri | kater | i | 1,447 | 10.25 % | 1,397.92 | 288 | 10.00 % | 1,249.33 | 279 | 7.12 % | 944.03 | 739 | 13.77 % | 2,092.60 | 141 | 7.25 % | 904.47
kolik | kolik | kolik |  | 507 | 3.59 % | 489.80 | 57 | 1.98 % | 247.26 | 228 | 5.82 % | 771.46 | 118 | 2.20 % | 334.14 | 104 | 5.35 % | 667.12
takšen | takšen | takše | n | 464 | 3.29 % | 448.26 | 167 | 5.80 % | 724.44
67 1.71 % 226.70 188 3.50 % 532.35 42 2.16 % 269.42 nekak nekak nekak 391 2.77 % 377.74 73 2.53 % 316.67 68 1.74 % 230.09 176 3.28 % 498.37 74 3.81 % 474.68 tolik tolik tolik 365 2.59 % 352.62 81 2.81 % 351.37 131 3.34 % 443.25 81 1.51 % 229.36 72 3.70 % 461.86 ostal ostal ostal 355 2.52 % 342.96 100 3.47 % 433.80 53 1.35 % 179.33 150 2.79 % 424.75 52 2.67 % 333.56 njegov njegov njego v 325 2.30 % 313.98 76 2.64 % 329.69 64 1.63 % 216.55 168 3.13 % 475.72 17 0.87 % 109.05 nekdo nekdo nekdo 313 2.22 % 302.38 63 2.19 % 273.29 67 1.71 % 226.70 140 2.61 % 396.43 43 2.21 % 275.83 takle takle takle 312 2.21 % 301.42 81 2.81 % 351.37 89 2.27 % 301.14 108 2.01 % 305.82 34 1.75 % 218.10 nobeden nobeden nobed en 307 2.18 % 296.59 43 1.49 % 186.53 112 2.86 % 378.96 94 1.75 % 266.18 58 2.98 % 372.05 nekateri nekateri nekat eri 253 1.79 % 244.42 56 1.94 % 242.93 12 0.31 % 40.60 167 3.11 % 472.89 18 0.93 % 115.46 njihov njihov njiho v 241 1.71 % 232.83 44 1.53 % 190.87 39 0.99 % 131.96 122 2.27 % 345.46 36 1.85 % 230.93 drugačen drugačen druga čen 199 1.41 % 192.25 40 1.39 % 173.52 29 0.74 % 98.12 106 1.98 % 300.16 24 1.24 % 153.95 kolikor kolikor kolik or 128 0.91 % 123.66 24 0.83 % 104.11 26 0.66 % 87.97 58 1.08 % 164.24 20 1.03 % 128.29 nihče nihče nihče 125 0.89 % 120.76 24 0.83 % 104.11 26 0.66 % 87.97 68 1.27 % 192.55 7 0.36 % 44.90 nobena nobena noben a 116 0.82 % 112.07 19 0.66 % 82.42 17 0.43 % 57.52 61 1.14 % 172.73 19 0.98 % 121.88 noben noben noben 101 0.72 % 97.57 19 0.66 % 82.42 16 0.41 % 54.14 50 0.93 % 141.58 16 0.82 % 102.63 kakršen kakršen kakrš en 83 0.59 % 80.18 24 0.83 % 104.11 4 0.10 % 13.53 38 0.71 % 107.60 17 0.87 % 109.05 marsikaj marsikaj marsi kaj 82 0.58 % 79.22 23 0.80 % 99.77 8 0.20 % 27.07 45 0.84 % 127.42 6 0.31 % 38.49 nekakšen nekakšen nekak šen 76 0.54 % 73.42 24 0.83 % 104.11 29 0.74 % 98.12 20 0.37 % 56.63 3 0.15 % 19.24 najin najin najin 47 0.33 % 45.41 23 0.80 % 99.77 7 0.18 % 23.69 10 0.19 % 28.32 7 0.36 % 44.90 marsikdo marsikdo marsi kdo 
39 | 0.28 % | 37.68 | 14 | 0.49 % | 60.73 | 3 | 0.08 % | 10.15 | 20 | 0.37 % | 56.63 | 2 | 0.10 % | 12.83
obadva | obadva | obadv | a | 29 | 0.21 % | 28.02 | 6 | 0.21 % | 26.03 | 10 | 0.26 % | 33.84 | 10 | 0.19 % | 28.32 | 3 | 0.15 % | 19.24
marsikateri | marsikateri | marsi | kateri | 24 | 0.17 % | 23.19 | 9 | 0.31 % | 39.04 | 1 | 0.03 % | 3.38 | 14 | 0.26 % | 39.64 | 0 | 0 % | 0
nikakršen | nikakršen | nikak | ršen | 18 | 0.13 % | 17.39 | 4 | 0.14 % | 17.35 | 3 | 0.08 % | 10.15 | 11 | 0.20 % | 31.15 | 0 | 0 % | 0
vajin | vajin | vajin |  | 14 | 0.10 % | 13.53 | 9 | 0.31 % | 39.04 | 1 | 0.03 % | 3.38 | 4 | 0.07 % | 11.33 | 0 | 0 % | 0
kolikšen | kolikšen | kolik | šen | 13 | 0.09 % | 12.56 | 0 | 0 % | 0 | 1 | 0.03 % | 3.38 | 12 | 0.22 % | 33.98 | 0 | 0 % | 0
vsakdo | vsakdo | vsakd | o | 12 | 0.09 % | 11.59 | 2 | 0.07 % | 8.68 | 1 | 0.03 % | 3.38 | 8 | 0.15 % | 22.65 | 1 | 0.05 % | 6.41
čigav | čigav | čigav |  | 11 | 0.08 % | 10.63 | 2 | 0.07 % | 8.68 | 5 | 0.13 % | 16.92 | 3 | 0.06 % | 8.49 | 1 | 0.05 % | 6.41

2.2.156 List of final character-level 1-grams from pronoun lemmas in the GOS 1.0 corpus with text-type distribution
File at CLARIN.SI: GOS1.0-word_parts-pronouns-lemmas-final-1grams-taxonomy-entire.tsv

Lemma | Lemma (lower-case) | Rest of the word | Final part of the word | Total absolute frequency of lemma | Percentage of all found lemmas | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N]
ta | ta | t | a | 29,255 | 21.02 % | 28,262.78 | 5,551 | 19.28 % | 24,080.03 | 7,749 | 18.23 % | 26,219.62 | 10,190 | 22.50 % | 28,854.68 | 5,765 | 25.55 % | 36,980.49
se | se | s | e | 18,112 | 13.02 % | 17,497.71 | 3,828 | 13.30 % | 16,605.72 | 4,503 | 10.60 % | 15,236.41 | 6,799 | 15.01 % | 19,252.50 | 2,982 | 13.22 % | 19,128.50
on | on | o | n | 17,518 | 12.59 % | 16,923.85 | 3,276 | 11.38 % | 14,211.16 | 6,792 | 15.98 % | 22,981.51 | 5,129 | 11.32 % | 14,523.61 | 2,321 | 10.29 % | 14,888.42
jaz | jaz | ja | z | 15,037 | 10.81 % | 14,527 | 3,089 | 10.73 %
13,399.96 5,613 13.21 % 18,992.22 3,872 8.55 % 10,964.21 2,463 10.92 % 15,799.30 ti ti t i 9,496 6.82 % 9,173.93 2,673 9.29 % 11,595.37 3,045 7.17 % 10,303.10 2,137 4.72 % 6,051.27 1,641 7.27 % 10,526.45 kaj kaj ka j 9,336 6.71 % 9,019.36 1,740 6.04 % 7,548.05 3,584 8.43 % 12,126.87 2,584 5.70 % 7,317.02 1,428 6.33 % 9,160.13 ves ves ve s 4,615 3.32 % 4,458.48 1,105 3.84 % 4,793.45 1,336 3.14 % 4,520.51 1,474 3.25 % 4,173.88 700 3.10 % 4,490.26 tisti tisti tist i 3,373 2.42 % 3,258.60 523 1.82 % 2,268.75 1,334 3.14 % 4,513.74 1,025 2.26 % 2,902.46 491 2.18 % 3,149.60 kar kar ka r 2,985 2.15 % 2,883.76 630 2.19 % 2,732.92 732 1.72 % 2,476.81 1,108 2.45 % 3,137.49 515 2.28 % 3,303.55 tak tak ta k 2,457 1.77 % 2,373.67 460 1.60 % 1,995.46 918 2.16 % 3,106.16 618 1.36 % 1,749.97 461 2.04 % 2,957.16 kakšen kakšen kakše n 2,259 1.62 % 2,182.38 581 2.02 % 2,520.36 435 1.02 % 1,471.87 938 2.07 % 2,656.10 305 1.35 % 1,956.47 kak kak ka k 2,205 1.58 % 2,130.21 451 1.57 % 1,956.42 796 1.87 % 2,693.36 585 1.29 % 1,656.52 373 1.65 % 2,392.67 nekaj nekaj neka j 2,045 1.47 % 1,975.64 378 1.31 % 1,639.75 748 1.76 % 2,530.94 591 1.30 % 1,673.51 328 1.45 % 2,104.01 nek nek ne k 1,708 1.23 % 1,650.07 177 0.61 % 767.82 328 0.77 % 1,109.83 850 1.88 % 2,406.92 353 1.56 % 2,264.37 nič nič ni č 1,580 1.14 % 1,526.41 373 1.30 % 1,618.06 607 1.43 % 2,053.85 395 0.87 % 1,118.51 205 0.91 % 1,315 kateri kateri kater i 1,447 1.04 % 1,397.92 288 1.00 % 1,249.33 279 0.66 % 944.03 739 1.63 % 2,092.60 141 0.62 % 904.47 moj moj mo j 1,367 0.98 % 1,320.64 349 1.21 % 1,513.95 512 1.21 % 1,732.41 325 0.72 % 920.29 181 0.80 % 1,161.05 naš naš na š 1,327 0.95 % 1,281.99 351 1.22 % 1,522.62 164 0.39 % 554.91 689 1.52 % 1,951.02 123 0.55 % 789 kdo kdo kd o 1,238 0.89 % 1,196.01 313 1.09 % 1,357.78 311 0.73 % 1,052.30 450 0.99 % 1,274.25 164 0.73 % 1,052 svoj svoj svo j 1,220 0.88 % 1,178.62 286 0.99 % 1,240.66 149 0.35 % 504.16 684 1.51 % 1,936.86 101 0.45 % 647.88 tale tale tal e 1,215 0.87 % 1,173.79 296 
1.03 % | 1,284.04 | 216 | 0.51 % | 730.86 | 477 | 1.05 % | 1,350.70 | 226 | 1.00 % | 1,449.71
vsak | vsak | vsa | k | 1,084 | 0.78 % | 1,047.23 | 246 | 0.85 % | 1,067.14 | 304 | 0.71 % | 1,028.62 | 373 | 0.82 % | 1,056.21 | 161 | 0.71 % | 1,032.76
isti | isti | ist | i | 605 | 0.43 % | 584.48 | 75 | 0.26 % | 325.35 | 208 | 0.49 % | 703.79 | 211 | 0.47 % | 597.48 | 111 | 0.49 % | 712.03
vaš | vaš | va | š | 568 | 0.41 % | 548.74 | 156 | 0.54 % | 676.72 | 35 | 0.08 % | 118.43 | 265 | 0.58 % | 750.39 | 112 | 0.50 % | 718.44
kolik | kolik | koli | k | 507 | 0.36 % | 489.80 | 57 | 0.20 % | 247.26 | 228 | 0.54 % | 771.46 | 118 | 0.26 % | 334.14 | 104 | 0.46 % | 667.12
oni | oni | on | i | 469 | 0.34 % | 453.09 | 39 | 0.14 % | 169.18 | 316 | 0.74 % | 1,069.22 | 44 | 0.10 % | 124.59 | 70 | 0.31 % | 449.03
takšen | takšen | takše | n | 464 | 0.33 % | 448.26 | 167 | 0.58 % | 724.44 | 67 | 0.16 % | 226.70 | 188 | 0.41 % | 532.35 | 42 | 0.19 % | 269.42
tvoj | tvoj | tvo | j | 408 | 0.29 % | 394.16 | 150 | 0.52 % | 650.69 | 128 | 0.30 % | 433.10 | 95 | 0.21 % | 269.01 | 35 | 0.15 % | 224.51
nekak | nekak | neka | k | 391 | 0.28 % | 377.74 | 73 | 0.25 % | 316.67 | 68 | 0.16 % | 230.09 | 176 | 0.39 % | 498.37 | 74 | 0.33 % | 474.68
tolik | tolik | toli | k | 365 | 0.26 % | 352.62 | 81 | 0.28 % | 351.37 | 131 | 0.31 % | 443.25 | 81 | 0.18 % | 229.36 | 72 | 0.32 % | 461.86
ostal | ostal | osta | l | 355 | 0.26 % | 342.96 | 100 | 0.35 % | 433.80 | 53 | 0.12 % | 179.33 | 150 | 0.33 % | 424.75 | 52 | 0.23 % | 333.56
njegov | njegov | njego | v | 325 | 0.23 % | 313.98 | 76 | 0.26 % | 329.69 | 64 | 0.15 % | 216.55 | 168 | 0.37 % | 475.72 | 17 | 0.07 % | 109.05

2.2.157 List of final character-level 2-grams from pronoun lemmas in the GOS 1.0 corpus with text-type distribution
File at CLARIN.SI: GOS1.0-word_parts-pronouns-lemmas-final-2grams-taxonomy-entire.tsv

Lemma | Lemma (lower-case) | Rest of the word | Final part of the word | Total absolute frequency of lemma | Percentage of all found lemmas | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency
[gos.T.N.N] Percentage [gos.T.N.N] Relative frequency [gos.T.N.N] ta ta ta 29,255 21.02 % 28,262.78 5,551 19.28 % 24,080.03 7,749 18.23 % 26,219.62 10,190 22.50 % 28,854.68 5,765 25.55 % 36,980.49 se se se 18,112 13.02 % 17,497.71 3,828 13.30 % 16,605.72 4,503 10.60 % 15,236.41 6,799 15.01 % 19,252.50 2,982 13.22 % 19,128.50 on on on 17,518 12.59 % 16,923.85 3,276 11.38 % 14,211.16 6,792 15.98 % 22,981.51 5,129 11.32 % 14,523.61 2,321 10.29 % 14,888.42 jaz jaz j az 15,037 10.81 % 14,527 3,089 10.73 % 13,399.96 5,613 13.21 % 18,992.22 3,872 8.55 % 10,964.21 2,463 10.92 % 15,799.30 ti ti ti 9,496 6.82 % 9,173.93 2,673 9.29 % 11,595.37 3,045 7.17 % 10,303.10 2,137 4.72 % 6,051.27 1,641 7.27 % 10,526.45 kaj kaj k aj 9,336 6.71 % 9,019.36 1,740 6.04 % 7,548.05 3,584 8.43 % 12,126.87 2,584 5.70 % 7,317.02 1,428 6.33 % 9,160.13 ves ves v es 4,615 3.32 % 4,458.48 1,105 3.84 % 4,793.45 1,336 3.14 % 4,520.51 1,474 3.25 % 4,173.88 700 3.10 % 4,490.26 tisti tisti tis ti 3,373 2.42 % 3,258.60 523 1.82 % 2,268.75 1,334 3.14 % 4,513.74 1,025 2.26 % 2,902.46 491 2.18 % 3,149.60 kar kar k ar 2,985 2.15 % 2,883.76 630 2.19 % 2,732.92 732 1.72 % 2,476.81 1,108 2.45 % 3,137.49 515 2.28 % 3,303.55 tak tak t ak 2,457 1.77 % 2,373.67 460 1.60 % 1,995.46 918 2.16 % 3,106.16 618 1.36 % 1,749.97 461 2.04 % 2,957.16 kakšen kakšen kakš en 2,259 1.62 % 2,182.38 581 2.02 % 2,520.36 435 1.02 % 1,471.87 938 2.07 % 2,656.10 305 1.35 % 1,956.47 kak kak k ak 2,205 1.58 % 2,130.21 451 1.57 % 1,956.42 796 1.87 % 2,693.36 585 1.29 % 1,656.52 373 1.65 % 2,392.67 nekaj nekaj nek aj 2,045 1.47 % 1,975.64 378 1.31 % 1,639.75 748 1.76 % 2,530.94 591 1.30 % 1,673.51 328 1.45 % 2,104.01 nek nek n ek 1,708 1.23 % 1,650.07 177 0.61 % 767.82 328 0.77 % 1,109.83 850 1.88 % 2,406.92 353 1.56 % 2,264.37 nič nič n ič 1,580 1.14 % 1,526.41 373 1.30 % 1,618.06 607 1.43 % 2,053.85 395 0.87 % 1,118.51 205 0.91 % 1,315 kateri kateri kate ri 1,447 1.04 % 1,397.92 288 1.00 % 1,249.33 279 0.66 % 944.03 739 1.63 % 2,092.60 
141 | 0.62 % | 904.47
moj | moj | m | oj | 1,367 | 0.98 % | 1,320.64 | 349 | 1.21 % | 1,513.95 | 512 | 1.21 % | 1,732.41 | 325 | 0.72 % | 920.29 | 181 | 0.80 % | 1,161.05
naš | naš | n | aš | 1,327 | 0.95 % | 1,281.99 | 351 | 1.22 % | 1,522.62 | 164 | 0.39 % | 554.91 | 689 | 1.52 % | 1,951.02 | 123 | 0.55 % | 789
kdo | kdo | k | do | 1,238 | 0.89 % | 1,196.01 | 313 | 1.09 % | 1,357.78 | 311 | 0.73 % | 1,052.30 | 450 | 0.99 % | 1,274.25 | 164 | 0.73 % | 1,052
svoj | svoj | sv | oj | 1,220 | 0.88 % | 1,178.62 | 286 | 0.99 % | 1,240.66 | 149 | 0.35 % | 504.16 | 684 | 1.51 % | 1,936.86 | 101 | 0.45 % | 647.88
tale | tale | ta | le | 1,215 | 0.87 % | 1,173.79 | 296 | 1.03 % | 1,284.04 | 216 | 0.51 % | 730.86 | 477 | 1.05 % | 1,350.70 | 226 | 1.00 % | 1,449.71
vsak | vsak | vs | ak | 1,084 | 0.78 % | 1,047.23 | 246 | 0.85 % | 1,067.14 | 304 | 0.71 % | 1,028.62 | 373 | 0.82 % | 1,056.21 | 161 | 0.71 % | 1,032.76
isti | isti | is | ti | 605 | 0.43 % | 584.48 | 75 | 0.26 % | 325.35 | 208 | 0.49 % | 703.79 | 211 | 0.47 % | 597.48 | 111 | 0.49 % | 712.03
vaš | vaš | v | aš | 568 | 0.41 % | 548.74 | 156 | 0.54 % | 676.72 | 35 | 0.08 % | 118.43 | 265 | 0.58 % | 750.39 | 112 | 0.50 % | 718.44
kolik | kolik | kol | ik | 507 | 0.36 % | 489.80 | 57 | 0.20 % | 247.26 | 228 | 0.54 % | 771.46 | 118 | 0.26 % | 334.14 | 104 | 0.46 % | 667.12
oni | oni | o | ni | 469 | 0.34 % | 453.09 | 39 | 0.14 % | 169.18 | 316 | 0.74 % | 1,069.22 | 44 | 0.10 % | 124.59 | 70 | 0.31 % | 449.03
takšen | takšen | takš | en | 464 | 0.33 % | 448.26 | 167 | 0.58 % | 724.44 | 67 | 0.16 % | 226.70 | 188 | 0.41 % | 532.35 | 42 | 0.19 % | 269.42
tvoj | tvoj | tv | oj | 408 | 0.29 % | 394.16 | 150 | 0.52 % | 650.69 | 128 | 0.30 % | 433.10 | 95 | 0.21 % | 269.01 | 35 | 0.15 % | 224.51
nekak | nekak | nek | ak | 391 | 0.28 % | 377.74 | 73 | 0.25 % | 316.67 | 68 | 0.16 % | 230.09 | 176 | 0.39 % | 498.37 | 74 | 0.33 % | 474.68
tolik | tolik | tol | ik | 365 | 0.26 % | 352.62 | 81 | 0.28 % | 351.37 | 131 | 0.31 % | 443.25 | 81 | 0.18 % | 229.36 | 72 | 0.32 % | 461.86
ostal | ostal | ost | al | 355 | 0.26 % | 342.96 | 100 | 0.35 % | 433.80 | 53 | 0.12 % | 179.33 | 150 | 0.33 % | 424.75 | 52 | 0.23 % | 333.56
njegov | njegov | njeg | ov | 325 | 0.23 % | 313.98 | 76 | 0.26 % | 329.69 | 64 | 0.15 % | 216.55 | 168 | 0.37 % | 475.72 | 17 | 0.07 % | 109.05

2.2.158 List of final character-level 3-grams from pronoun lemmas in the GOS 1.0 corpus with text-type distribution
File at CLARIN.SI: GOS1.0-word_parts-pronouns-lemmas-final-3grams-taxonomy-entire.tsv

Lemma | Lemma (lower-case) | Rest of the word | Final part of the word | Total absolute frequency of lemma | Percentage of all found lemmas | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N]
jaz | jaz |  | jaz | 15,037 | 23.29 % | 14,527 | 3,089 | 22.99 % | 13,399.96 | 5,613 | 27.60 % | 18,992.22 | 3,872 | 18.46 % | 10,964.21 | 2,463 | 25.08 % | 15,799.30
kaj | kaj |  | kaj | 9,336 | 14.46 % | 9,019.36 | 1,740 | 12.95 % | 7,548.05 | 3,584 | 17.62 % | 12,126.87 | 2,584 | 12.32 % | 7,317.02 | 1,428 | 14.54 % | 9,160.13
ves | ves |  | ves | 4,615 | 7.15 % | 4,458.48 | 1,105 | 8.22 % | 4,793.45 | 1,336 | 6.57 % | 4,520.51 | 1,474 | 7.03 % | 4,173.88 | 700 | 7.13 % | 4,490.26
tisti | tisti | ti | sti | 3,373 | 5.22 % | 3,258.60 | 523 | 3.89 % | 2,268.75 | 1,334 | 6.56 % | 4,513.74 | 1,025 | 4.89 % | 2,902.46 | 491 | 5.00 % | 3,149.60
kar | kar |  | kar | 2,985 | 4.62 % | 2,883.76 | 630 | 4.69 % | 2,732.92 | 732 | 3.60 % | 2,476.81 | 1,108 | 5.28 % | 3,137.49 | 515 | 5.24 % | 3,303.55
tak | tak |  | tak | 2,457 | 3.81 % | 2,373.67 | 460 | 3.42 % | 1,995.46 | 918 | 4.51 % | 3,106.16 | 618 | 2.95 % | 1,749.97 | 461 | 4.69 % | 2,957.16
kakšen | kakšen | kak | šen | 2,259 | 3.50 % | 2,182.38 | 581 | 4.32 % | 2,520.36 | 435 | 2.14 % | 1,471.87 | 938 | 4.47 % | 2,656.10 | 305 | 3.11 % | 1,956.47
kak | kak |  | kak | 2,205 | 3.42 % | 2,130.21 | 451 | 3.36 % | 1,956.42 | 796 | 3.91 % | 2,693.36 | 585 | 2.79 % | 1,656.52 | 373 | 3.80 % | 2,392.67
nekaj | nekaj | ne | kaj | 2,045 | 3.17 % | 1,975.64 | 378 | 2.81 % | 1,639.75 | 748 | 3.68 % | 2,530.94 | 591 | 2.82 % | 1,673.51 | 328 | 3.34 % | 2,104.01
nek | nek |  | nek | 1,708 | 2.65 % | 1,650.07 | 177 | 1.32 % | 767.82 | 328 | 1.61 % | 1,109.83 | 850 | 4.05 % | 2,406.92 | 353 | 3.59 % | 2,264.37
nič | nič |  | nič | 1,580 | 2.45 % | 1,526.41 | 373 | 2.78 % | 1,618.06 | 607 | 2.98 % | 2,053.85 | 395 | 1.88 % | 1,118.51 | 205 | 2.09 % | 1,315
kateri | kateri | kat | eri | 1,447 | 2.24 % | 1,397.92 | 288 | 2.14 % | 1,249.33 | 279 | 1.37 % | 944.03
739 3.52 % 2,092.60 141 1.44 % 904.47 moj moj moj 1,367 2.12 % 1,320.64 349 2.60 % 1,513.95 512 2.52 % 1,732.41 325 1.55 % 920.29 181 1.84 % 1,161.05 naš naš naš 1,327 2.06 % 1,281.99 351 2.61 % 1,522.62 164 0.81 % 554.91 689 3.29 % 1,951.02 123 1.25 % 789 kdo kdo kdo 1,238 1.92 % 1,196.01 313 2.33 % 1,357.78 311 1.53 % 1,052.30 450 2.15 % 1,274.25 164 1.67 % 1,052 svoj svoj s voj 1,220 1.89 % 1,178.62 286 2.13 % 1,240.66 149 0.73 % 504.16 684 3.26 % 1,936.86 101 1.03 % 647.88 tale tale t ale 1,215 1.88 % 1,173.79 296 2.20 % 1,284.04 216 1.06 % 730.86 477 2.27 % 1,350.70 226 2.30 % 1,449.71 vsak vsak v sak 1,084 1.68 % 1,047.23 246 1.83 % 1,067.14 304 1.50 % 1,028.62 373 1.78 % 1,056.21 161 1.64 % 1,032.76 isti isti i sti 605 0.94 % 584.48 75 0.56 % 325.35 208 1.02 % 703.79 211 1.01 % 597.48 111 1.13 % 712.03 vaš vaš vaš 568 0.88 % 548.74 156 1.16 % 676.72 35 0.17 % 118.43 265 1.26 % 750.39 112 1.14 % 718.44 kolik kolik ko lik 507 0.79 % 489.80 57 0.42 % 247.26 228 1.12 % 771.46 118 0.56 % 334.14 104 1.06 % 667.12 oni oni oni 469 0.73 % 453.09 39 0.29 % 169.18 316 1.55 % 1,069.22 44 0.21 % 124.59 70 0.71 % 449.03 takšen takšen tak šen 464 0.72 % 448.26 167 1.24 % 724.44 67 0.33 % 226.70 188 0.90 % 532.35 42 0.43 % 269.42 tvoj tvoj t voj 408 0.63 % 394.16 150 1.12 % 650.69 128 0.63 % 433.10 95 0.45 % 269.01 35 0.36 % 224.51 nekak nekak ne kak 391 0.61 % 377.74 73 0.54 % 316.67 68 0.33 % 230.09 176 0.84 % 498.37 74 0.75 % 474.68 tolik tolik to lik 365 0.56 % 352.62 81 0.60 % 351.37 131 0.64 % 443.25 81 0.39 % 229.36 72 0.73 % 461.86 ostal ostal os tal 355 0.55 % 342.96 100 0.74 % 433.80 53 0.26 % 179.33 150 0.71 % 424.75 52 0.53 % 333.56 njegov njegov nje gov 325 0.50 % 313.98 76 0.57 % 329.69 64 0.32 % 216.55 168 0.80 % 475.72 17 0.17 % 109.05 nekdo nekdo ne kdo 313 0.48 % 302.38 63 0.47 % 273.29 67 0.33 % 226.70 140 0.67 % 396.43 43 0.44 % 275.83 takle takle ta kle 312 0.48 % 301.42 81 0.60 % 351.37 89 0.44 % 301.14 108 0.52 % 305.82 34 0.35 % 218.10 nobeden 
nobeden | nobe | den | 307 | 0.47 % | 296.59 | 43 | 0.32 % | 186.53 | 112 | 0.55 % | 378.96 | 94 | 0.45 % | 266.18 | 58 | 0.59 % | 372.05
oba | oba |  | oba | 259 | 0.40 % | 250.22 | 70 | 0.52 % | 303.66 | 32 | 0.16 % | 108.28 | 128 | 0.61 % | 362.45 | 29 | 0.29 % | 186.03

2.2.159 List of final character-level 4-grams from pronoun lemmas in the GOS 1.0 corpus with text-type distribution
File at CLARIN.SI: GOS1.0-word_parts-pronouns-lemmas-final-4grams-taxonomy-entire.tsv

Lemma | Lemma (lower-case) | Rest of the word | Final part of the word | Total absolute frequency of lemma | Percentage of all found lemmas | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N]
tisti | tisti | t | isti | 3,373 | 17.37 % | 3,258.60 | 523 | 12.65 % | 2,268.75 | 1,334 | 26.40 % | 4,513.74 | 1,025 | 13.51 % | 2,902.46 | 491 | 18.57 % | 3,149.60
kakšen | kakšen | ka | kšen | 2,259 | 11.63 % | 2,182.38 | 581 | 14.06 % | 2,520.36 | 435 | 8.61 % | 1,471.87 | 938 | 12.36 % | 2,656.10 | 305 | 11.54 % | 1,956.47
nekaj | nekaj | n | ekaj | 2,045 | 10.53 % | 1,975.64 | 378 | 9.15 % | 1,639.75 | 748 | 14.80 % | 2,530.94 | 591 | 7.79 % | 1,673.51 | 328 | 12.40 % | 2,104.01
kateri | kateri | ka | teri | 1,447 | 7.45 % | 1,397.92 | 288 | 6.97 % | 1,249.33 | 279 | 5.52 % | 944.03 | 739 | 9.74 % | 2,092.60 | 141 | 5.33 % | 904.47
svoj | svoj |  | svoj | 1,220 | 6.28 % | 1,178.62 | 286 | 6.92 % | 1,240.66 | 149 | 2.95 % | 504.16 | 684 | 9.01 % | 1,936.86 | 101 | 3.82 % | 647.88
tale | tale |  | tale | 1,215 | 6.26 % | 1,173.79 | 296 | 7.16 % | 1,284.04 | 216 | 4.28 % | 730.86 | 477 | 6.29 % | 1,350.70 | 226 | 8.55 % | 1,449.71
vsak | vsak |  | vsak | 1,084 | 5.58 % | 1,047.23 | 246 | 5.95 % | 1,067.14 | 304 | 6.02 % | 1,028.62 | 373 | 4.92 % | 1,056.21 | 161 | 6.09 % | 1,032.76
isti | isti |  | isti | 605 | 3.12 % | 584.48 | 75 | 1.81 % | 325.35 | 208 | 4.12 % | 703.79 | 211 | 2.78 % | 597.48 | 111 | 4.20 % | 712.03
kolik | kolik | k | olik | 507 | 2.61 % | 489.80 | 57
1.38 % 247.26 228 4.51 % 771.46 118 1.55 % 334.14 104 3.93 % 667.12 takšen takšen ta kšen 464 2.39 % 448.26 167 4.04 % 724.44 67 1.33 % 226.70 188 2.48 % 532.35 42 1.59 % 269.42 tvoj tvoj tvoj 408 2.10 % 394.16 150 3.63 % 650.69 128 2.53 % 433.10 95 1.25 % 269.01 35 1.32 % 224.51 nekak nekak n ekak 391 2.01 % 377.74 73 1.77 % 316.67 68 1.35 % 230.09 176 2.32 % 498.37 74 2.80 % 474.68 tolik tolik t olik 365 1.88 % 352.62 81 1.96 % 351.37 131 2.59 % 443.25 81 1.07 % 229.36 72 2.72 % 461.86 ostal ostal o stal 355 1.83 % 342.96 100 2.42 % 433.80 53 1.05 % 179.33 150 1.98 % 424.75 52 1.97 % 333.56 njegov njegov nj egov 325 1.67 % 313.98 76 1.84 % 329.69 64 1.27 % 216.55 168 2.21 % 475.72 17 0.64 % 109.05 nekdo nekdo n ekdo 313 1.61 % 302.38 63 1.52 % 273.29 67 1.33 % 226.70 140 1.84 % 396.43 43 1.63 % 275.83 takle takle t akle 312 1.61 % 301.42 81 1.96 % 351.37 89 1.76 % 301.14 108 1.42 % 305.82 34 1.29 % 218.10 nobeden nobeden nob eden 307 1.58 % 296.59 43 1.04 % 186.53 112 2.22 % 378.96 94 1.24 % 266.18 58 2.19 % 372.05 nekateri nekateri neka teri 253 1.30 % 244.42 56 1.35 % 242.93 12 0.24 % 40.60 167 2.20 % 472.89 18 0.68 % 115.46 njen njen njen 245 1.26 % 236.69 94 2.27 % 407.77 40 0.79 % 135.34 100 1.32 % 283.17 11 0.42 % 70.56 njihov njihov nj ihov 241 1.24 % 232.83 44 1.06 % 190.87 39 0.77 % 131.96 122 1.61 % 345.46 36 1.36 % 230.93 drugačen drugačen drug ačen 199 1.02 % 192.25 40 0.97 % 173.52 29 0.57 % 98.12 106 1.40 % 300.16 24 0.91 % 153.95 enak enak enak 182 0.94 % 175.83 17 0.41 % 73.75 7 0.14 % 23.69 134 1.77 % 379.44 24 0.91 % 153.95 zame zame zame 138 0.71 % 133.32 37 0.90 % 160.50 41 0.81 % 138.73 50 0.66 % 141.58 10 0.38 % 64.15 kolikor kolikor kol ikor 128 0.66 % 123.66 24 0.58 % 104.11 26 0.52 % 87.97 58 0.76 % 164.24 20 0.76 % 128.29 nihče nihče n ihče 125 0.64 % 120.76 24 0.58 % 104.11 26 0.52 % 87.97 68 0.90 % 192.55 7 0.27 % 44.90 nobena nobena no bena 116 0.60 % 112.07 19 0.46 % 82.42 17 0.34 % 57.52 61 0.80 % 172.73 19 0.72 % 121.88 noben noben 
n | oben | 101 | 0.52 % | 97.57 | 19 | 0.46 % | 82.42 | 16 | 0.32 % | 54.14 | 50 | 0.66 % | 141.58 | 16 | 0.60 % | 102.63
kakršen | kakršen | kak | ršen | 83 | 0.43 % | 80.18 | 24 | 0.58 % | 104.11 | 4 | 0.08 % | 13.53 | 38 | 0.50 % | 107.60 | 17 | 0.64 % | 109.05
marsikaj | marsikaj | mars | ikaj | 82 | 0.42 % | 79.22 | 23 | 0.56 % | 99.77 | 8 | 0.16 % | 27.07 | 45 | 0.59 % | 127.42 | 6 | 0.23 % | 38.49
nekakšen | nekakšen | neka | kšen | 76 | 0.39 % | 73.42 | 24 | 0.58 % | 104.11 | 29 | 0.57 % | 98.12 | 20 | 0.26 % | 56.63 | 3 | 0.11 % | 19.24
kdor | kdor |  | kdor | 65 | 0.34 % | 62.80 | 14 | 0.34 % | 60.73 | 10 | 0.20 % | 33.84 | 31 | 0.41 % | 87.78 | 10 | 0.38 % | 64.15

2.2.160 List of final character-level 5-grams from pronoun lemmas in the GOS 1.0 corpus with text-type distribution
File at CLARIN.SI: GOS1.0-word_parts-pronouns-lemmas-final-5grams-taxonomy-entire.tsv

Lemma | Lemma (lower-case) | Rest of the word | Final part of the word | Total absolute frequency of lemma | Percentage of all found lemmas | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N]
tisti | tisti |  | tisti | 3,373 | 23.90 % | 3,258.60 | 523 | 18.15 % | 2,268.75 | 1,334 | 34.05 % | 4,513.74 | 1,025 | 19.09 % | 2,902.46 | 491 | 25.26 % | 3,149.60
kakšen | kakšen | k | akšen | 2,259 | 16.01 % | 2,182.38 | 581 | 20.17 % | 2,520.36 | 435 | 11.10 % | 1,471.87 | 938 | 17.47 % | 2,656.10 | 305 | 15.69 % | 1,956.47
nekaj | nekaj |  | nekaj | 2,045 | 14.49 % | 1,975.64 | 378 | 13.12 % | 1,639.75 | 748 | 19.09 % | 2,530.94 | 591 | 11.01 % | 1,673.51 | 328 | 16.87 % | 2,104.01
kateri | kateri | k | ateri | 1,447 | 10.25 % | 1,397.92 | 288 | 10.00 % | 1,249.33 | 279 | 7.12 % | 944.03 | 739 | 13.77 % | 2,092.60 | 141 | 7.25 % | 904.47
kolik | kolik |  | kolik | 507 | 3.59 % | 489.80 | 57 | 1.98 % | 247.26 | 228 | 5.82 % | 771.46 | 118 | 2.20 % | 334.14 | 104 | 5.35 % | 667.12
takšen | takšen | t | akšen | 464 | 3.29 % | 448.26 | 167 | 5.80 % | 724.44 | 67 | 1.71 % | 226.70
188 3.50 % 532.35 42 2.16 % 269.42 nekak nekak nekak 391 2.77 % 377.74 73 2.53 % 316.67 68 1.74 % 230.09 176 3.28 % 498.37 74 3.81 % 474.68 tolik tolik tolik 365 2.59 % 352.62 81 2.81 % 351.37 131 3.34 % 443.25 81 1.51 % 229.36 72 3.70 % 461.86 ostal ostal ostal 355 2.52 % 342.96 100 3.47 % 433.80 53 1.35 % 179.33 150 2.79 % 424.75 52 2.67 % 333.56 njegov njegov n jegov 325 2.30 % 313.98 76 2.64 % 329.69 64 1.63 % 216.55 168 3.13 % 475.72 17 0.87 % 109.05 nekdo nekdo nekdo 313 2.22 % 302.38 63 2.19 % 273.29 67 1.71 % 226.70 140 2.61 % 396.43 43 2.21 % 275.83 takle takle takle 312 2.21 % 301.42 81 2.81 % 351.37 89 2.27 % 301.14 108 2.01 % 305.82 34 1.75 % 218.10 nobeden nobeden no beden 307 2.18 % 296.59 43 1.49 % 186.53 112 2.86 % 378.96 94 1.75 % 266.18 58 2.98 % 372.05 nekateri nekateri nek ateri 253 1.79 % 244.42 56 1.94 % 242.93 12 0.31 % 40.60 167 3.11 % 472.89 18 0.93 % 115.46 njihov njihov n jihov 241 1.71 % 232.83 44 1.53 % 190.87 39 0.99 % 131.96 122 2.27 % 345.46 36 1.85 % 230.93 drugačen drugačen dru gačen 199 1.41 % 192.25 40 1.39 % 173.52 29 0.74 % 98.12 106 1.98 % 300.16 24 1.24 % 153.95 kolikor kolikor ko likor 128 0.91 % 123.66 24 0.83 % 104.11 26 0.66 % 87.97 58 1.08 % 164.24 20 1.03 % 128.29 nihče nihče nihče 125 0.89 % 120.76 24 0.83 % 104.11 26 0.66 % 87.97 68 1.27 % 192.55 7 0.36 % 44.90 nobena nobena n obena 116 0.82 % 112.07 19 0.66 % 82.42 17 0.43 % 57.52 61 1.14 % 172.73 19 0.98 % 121.88 noben noben noben 101 0.72 % 97.57 19 0.66 % 82.42 16 0.41 % 54.14 50 0.93 % 141.58 16 0.82 % 102.63 kakršen kakršen ka kršen 83 0.59 % 80.18 24 0.83 % 104.11 4 0.10 % 13.53 38 0.71 % 107.60 17 0.87 % 109.05 marsikaj marsikaj mar sikaj 82 0.58 % 79.22 23 0.80 % 99.77 8 0.20 % 27.07 45 0.84 % 127.42 6 0.31 % 38.49 nekakšen nekakšen nek akšen 76 0.54 % 73.42 24 0.83 % 104.11 29 0.74 % 98.12 20 0.37 % 56.63 3 0.15 % 19.24 najin najin najin 47 0.33 % 45.41 23 0.80 % 99.77 7 0.18 % 23.69 10 0.19 % 28.32 7 0.36 % 44.90 marsikdo marsikdo mar sikdo 39 0.28 % 37.68 
14 | 0.49 % | 60.73 | 3 | 0.08 % | 10.15 | 20 | 0.37 % | 56.63 | 2 | 0.10 % | 12.83
obadva | obadva | o | badva | 29 | 0.21 % | 28.02 | 6 | 0.21 % | 26.03 | 10 | 0.26 % | 33.84 | 10 | 0.19 % | 28.32 | 3 | 0.15 % | 19.24
marsikateri | marsikateri | marsik | ateri | 24 | 0.17 % | 23.19 | 9 | 0.31 % | 39.04 | 1 | 0.03 % | 3.38 | 14 | 0.26 % | 39.64 | 0 | 0 % | 0
nikakršen | nikakršen | nika | kršen | 18 | 0.13 % | 17.39 | 4 | 0.14 % | 17.35 | 3 | 0.08 % | 10.15 | 11 | 0.20 % | 31.15 | 0 | 0 % | 0
vajin | vajin |  | vajin | 14 | 0.10 % | 13.53 | 9 | 0.31 % | 39.04 | 1 | 0.03 % | 3.38 | 4 | 0.07 % | 11.33 | 0 | 0 % | 0
kolikšen | kolikšen | kol | ikšen | 13 | 0.09 % | 12.56 | 0 | 0 % | 0 | 1 | 0.03 % | 3.38 | 12 | 0.22 % | 33.98 | 0 | 0 % | 0
vsakdo | vsakdo | v | sakdo | 12 | 0.09 % | 11.59 | 2 | 0.07 % | 8.68 | 1 | 0.03 % | 3.38 | 8 | 0.15 % | 22.65 | 1 | 0.05 % | 6.41
čigav | čigav |  | čigav | 11 | 0.08 % | 10.63 | 2 | 0.07 % | 8.68 | 5 | 0.13 % | 16.92 | 3 | 0.06 % | 8.49 | 1 | 0.05 % | 6.41

2.2.161 List of initial character-level 1-grams from pronoun standardized forms in the GOS 1.0 corpus with text-type distribution
File at CLARIN.SI: GOS1.0-word_parts-pronouns-standardized_forms-initial-1grams-taxonomy-entire.tsv

Standardized form | Initial part of the word | Rest of the word | Total absolute frequency of standardized form | Percentage of all found standardized forms | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N]
to | t | o | 18,439 | 13.25 % | 17,813.62 | 3,620 | 12.57 % | 15,703.42 | 5,469 | 12.87 % | 18,504.98 | 5,489 | 12.12 % | 15,543.01 | 3,861 | 17.11 % | 24,766.99
se | s | e | 15,865 | 11.40 % | 15,326.92 | 3,232 | 11.23 % | 14,020.29 | 3,910 | 9.20 % | 13,229.93 | 6,051 | 13.36 % | 17,134.41 | 2,672 | 11.84 % | 17,139.96
kaj | k | aj | 8,980 | 6.45 % | 8,675.43 | 1,692 | 5.88 % | 7,339.83 | 3,530 | 8.31 % | 11,944.16 | 2,397 | 5.29 % | 6,787.50 | 1,361 | 6.03 % | 8,730.35
jaz | j | az | 6,278 | 4.51 % | 6,065.07 | 1,175 | 4.08
% 5,097.11 2,519 5.93 % 8,523.32 1,443 3.19 % 4,086.09 1,141 5.06 % 7,319.12 je j e 5,254 3.78 % 5,075.80 1,061 3.69 % 4,602.58 1,778 4.18 % 6,016.07 1,682 3.71 % 4,762.86 733 3.25 % 4,701.94 ti t i 4,344 3.12 % 4,196.67 1,094 3.80 % 4,745.73 1,973 4.64 % 6,675.87 617 1.36 % 1,747.14 660 2.92 % 4,233.67 mi m i 4,046 2.91 % 3,908.77 672 2.33 % 2,915.11 1,457 3.43 % 4,929.93 1,249 2.76 % 3,536.75 668 2.96 % 4,284.99 ta t a 3,881 2.79 % 3,749.37 678 2.35 % 2,941.14 1,121 2.64 % 3,793.03 1,386 3.06 % 3,924.69 696 3.08 % 4,464.60 te t e 2,995 2.15 % 2,893.42 791 2.75 % 3,431.33 908 2.14 % 3,072.32 808 1.78 % 2,287.99 488 2.16 % 3,130.35 kar k ar 2,909 2.09 % 2,810.34 617 2.14 % 2,676.52 731 1.72 % 2,473.42 1,058 2.34 % 2,995.90 503 2.23 % 3,226.57 vse v se 2,769 1.99 % 2,675.09 648 2.25 % 2,811 969 2.28 % 3,278.72 689 1.52 % 1,951.02 463 2.05 % 2,969.99 ga g a 2,681 1.93 % 2,590.07 527 1.83 % 2,286.11 1,001 2.35 % 3,387 809 1.79 % 2,290.82 344 1.52 % 2,206.64 tem t em 2,124 1.53 % 2,051.96 399 1.39 % 1,730.85 248 0.58 % 839.14 1,098 2.42 % 3,109.17 379 1.68 % 2,431.15 nekaj n ekaj 2,020 1.45 % 1,951.49 376 1.31 % 1,631.07 748 1.76 % 2,530.94 575 1.27 % 1,628.21 321 1.42 % 2,059.10 tega t ega 1,978 1.42 % 1,910.91 301 1.05 % 1,305.73 382 0.90 % 1,292.54 930 2.05 % 2,633.45 365 1.62 % 2,341.35 jih j ih 1,931 1.39 % 1,865.51 323 1.12 % 1,401.16 516 1.21 % 1,745.94 802 1.77 % 2,271 290 1.28 % 1,860.25 kako k ako 1,853 1.33 % 1,790.15 400 1.39 % 1,735.18 676 1.59 % 2,287.32 513 1.13 % 1,452.64 264 1.17 % 1,693.47 si s i 1,690 1.22 % 1,632.68 449 1.56 % 1,947.74 487 1.15 % 1,647.82 521 1.15 % 1,475.30 233 1.03 % 1,494.61 nič n ič 1,538 1.10 % 1,485.84 370 1.28 % 1,605.05 602 1.42 % 2,036.94 371 0.82 % 1,050.55 195 0.86 % 1,250.86 jo j o 1,333 0.96 % 1,287.79 286 0.99 % 1,240.66 403 0.95 % 1,363.60 512 1.13 % 1,449.81 132 0.58 % 846.73 on o n 1,216 0.87 % 1,174.76 246 0.85 % 1,067.14 585 1.38 % 1,979.41 211 0.47 % 597.48 174 0.77 % 1,116.15 tisto t isto 1,156 0.83 % 1,116.79 
132 | 0.46 % | 572.61 | 566 | 1.33 % | 1,915.13 | 278 | 0.61 % | 787.20 | 180 | 0.80 % | 1,154.64
me | m | e | 1,146 | 0.82 % | 1,107.13 | 327 | 1.14 % | 1,418.51 | 440 | 1.03 % | 1,488.79 | 232 | 0.51 % | 656.95 | 147 | 0.65 % | 942.95
kdo | k | do | 1,004 | 0.72 % | 969.95 | 259 | 0.90 % | 1,123.53 | 231 | 0.54 % | 781.61 | 371 | 0.82 % | 1,050.55 | 143 | 0.63 % | 917.30
oni | o | ni | 1,003 | 0.72 % | 968.98 | 132 | 0.46 % | 572.61 | 599 | 1.41 % | 2,026.78 | 122 | 0.27 % | 345.46 | 150 | 0.67 % | 962.20
tisti | t | isti | 947 | 0.68 % | 914.88 | 168 | 0.58 % | 728.78 | 352 | 0.83 % | 1,191.03 | 314 | 0.69 % | 889.14 | 113 | 0.50 % | 724.86
vi | v | i | 907 | 0.65 % | 876.24 | 192 | 0.67 % | 832.89 | 98 | 0.23 % | 331.59 | 380 | 0.84 % | 1,076.03 | 237 | 1.05 % | 1,520.27
nas | n | as | 886 | 0.64 % | 855.95 | 208 | 0.72 % | 902.30 | 272 | 0.64 % | 920.34 | 314 | 0.69 % | 889.14 | 92 | 0.41 % | 590.15
ona | o | na | 867 | 0.62 % | 837.59 | 133 | 0.46 % | 576.95 | 539 | 1.27 % | 1,823.77 | 76 | 0.17 % | 215.21 | 119 | 0.53 % | 763.34
teh | t | eh | 850 | 0.61 % | 821.17 | 152 | 0.53 % | 659.37 | 103 | 0.24 % | 348.51 | 434 | 0.96 % | 1,228.94 | 161 | 0.71 % | 1,032.76
meni | m | eni | 810 | 0.58 % | 782.53 | 132 | 0.46 % | 572.61 | 405 | 0.95 % | 1,370.36 | 113 | 0.25 % | 319.98 | 160 | 0.71 % | 1,026.34
vsi | v | si | 800 | 0.57 % | 772.87 | 175 | 0.61 % | 759.14 | 205 | 0.48 % | 693.64 | 314 | 0.69 % | 889.14 | 106 | 0.47 % | 679.95

2.2.162 List of initial character-level 2-grams from pronoun standardized forms in the GOS 1.0 corpus with text-type distribution
File at CLARIN.SI: GOS1.0-word_parts-pronouns-standardized_forms-initial-2grams-taxonomy-entire.tsv

Standardized form | Initial part of the word | Rest of the word | Total absolute frequency of standardized form | Percentage of all found standardized forms | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N]
to | to |  | 18,439 | 13.25 % | 17,813.62 | 3,620 | 12.57 %
15,703.42 5,469 12.87 % 18,504.98 5,489 12.12 % 15,543.01 3,861 17.11 % 24,766.99 se se 15,865 11.40 % 15,326.92 3,232 11.23 % 14,020.29 3,910 9.20 % 13,229.93 6,051 13.36 % 17,134.41 2,672 11.84 % 17,139.96 kaj ka j 8,980 6.45 % 8,675.43 1,692 5.88 % 7,339.83 3,530 8.31 % 11,944.16 2,397 5.29 % 6,787.50 1,361 6.03 % 8,730.35 jaz ja z 6,278 4.51 % 6,065.07 1,175 4.08 % 5,097.11 2,519 5.93 % 8,523.32 1,443 3.19 % 4,086.09 1,141 5.06 % 7,319.12 je je 5,254 3.78 % 5,075.80 1,061 3.69 % 4,602.58 1,778 4.18 % 6,016.07 1,682 3.71 % 4,762.86 733 3.25 % 4,701.94 ti ti 4,344 3.12 % 4,196.67 1,094 3.80 % 4,745.73 1,973 4.64 % 6,675.87 617 1.36 % 1,747.14 660 2.92 % 4,233.67 mi mi 4,046 2.91 % 3,908.77 672 2.33 % 2,915.11 1,457 3.43 % 4,929.93 1,249 2.76 % 3,536.75 668 2.96 % 4,284.99 ta ta 3,881 2.79 % 3,749.37 678 2.35 % 2,941.14 1,121 2.64 % 3,793.03 1,386 3.06 % 3,924.69 696 3.08 % 4,464.60 te te 2,995 2.15 % 2,893.42 791 2.75 % 3,431.33 908 2.14 % 3,072.32 808 1.78 % 2,287.99 488 2.16 % 3,130.35 kar ka r 2,909 2.09 % 2,810.34 617 2.14 % 2,676.52 731 1.72 % 2,473.42 1,058 2.34 % 2,995.90 503 2.23 % 3,226.57 vse vs e 2,769 1.99 % 2,675.09 648 2.25 % 2,811 969 2.28 % 3,278.72 689 1.52 % 1,951.02 463 2.05 % 2,969.99 ga ga 2,681 1.93 % 2,590.07 527 1.83 % 2,286.11 1,001 2.35 % 3,387 809 1.79 % 2,290.82 344 1.52 % 2,206.64 tem te m 2,124 1.53 % 2,051.96 399 1.39 % 1,730.85 248 0.58 % 839.14 1,098 2.42 % 3,109.17 379 1.68 % 2,431.15 nekaj ne kaj 2,020 1.45 % 1,951.49 376 1.31 % 1,631.07 748 1.76 % 2,530.94 575 1.27 % 1,628.21 321 1.42 % 2,059.10 tega te ga 1,978 1.42 % 1,910.91 301 1.05 % 1,305.73 382 0.90 % 1,292.54 930 2.05 % 2,633.45 365 1.62 % 2,341.35 jih ji h 1,931 1.39 % 1,865.51 323 1.12 % 1,401.16 516 1.21 % 1,745.94 802 1.77 % 2,271 290 1.28 % 1,860.25 kako ka ko 1,853 1.33 % 1,790.15 400 1.39 % 1,735.18 676 1.59 % 2,287.32 513 1.13 % 1,452.64 264 1.17 % 1,693.47 si si 1,690 1.22 % 1,632.68 449 1.56 % 1,947.74 487 1.15 % 1,647.82 521 1.15 % 1,475.30 233 1.03 % 
1,494.61
nič | ni | č | 1,538 | 1.10 % | 1,485.84 | 370 | 1.28 % | 1,605.05 | 602 | 1.42 % | 2,036.94 | 371 | 0.82 % | 1,050.55 | 195 | 0.86 % | 1,250.86
jo | jo | | 1,333 | 0.96 % | 1,287.79 | 286 | 0.99 % | 1,240.66 | 403 | 0.95 % | 1,363.60 | 512 | 1.13 % | 1,449.81 | 132 | 0.58 % | 846.73
on | on | | 1,216 | 0.87 % | 1,174.76 | 246 | 0.85 % | 1,067.14 | 585 | 1.38 % | 1,979.41 | 211 | 0.47 % | 597.48 | 174 | 0.77 % | 1,116.15
tisto | ti | sto | 1,156 | 0.83 % | 1,116.79 | 132 | 0.46 % | 572.61 | 566 | 1.33 % | 1,915.13 | 278 | 0.61 % | 787.20 | 180 | 0.80 % | 1,154.64
me | me | | 1,146 | 0.82 % | 1,107.13 | 327 | 1.14 % | 1,418.51 | 440 | 1.03 % | 1,488.79 | 232 | 0.51 % | 656.95 | 147 | 0.65 % | 942.95
kdo | kd | o | 1,004 | 0.72 % | 969.95 | 259 | 0.90 % | 1,123.53 | 231 | 0.54 % | 781.61 | 371 | 0.82 % | 1,050.55 | 143 | 0.63 % | 917.30
oni | on | i | 1,003 | 0.72 % | 968.98 | 132 | 0.46 % | 572.61 | 599 | 1.41 % | 2,026.78 | 122 | 0.27 % | 345.46 | 150 | 0.67 % | 962.20
tisti | ti | sti | 947 | 0.68 % | 914.88 | 168 | 0.58 % | 728.78 | 352 | 0.83 % | 1,191.03 | 314 | 0.69 % | 889.14 | 113 | 0.50 % | 724.86
vi | vi | | 907 | 0.65 % | 876.24 | 192 | 0.67 % | 832.89 | 98 | 0.23 % | 331.59 | 380 | 0.84 % | 1,076.03 | 237 | 1.05 % | 1,520.27
nas | na | s | 886 | 0.64 % | 855.95 | 208 | 0.72 % | 902.30 | 272 | 0.64 % | 920.34 | 314 | 0.69 % | 889.14 | 92 | 0.41 % | 590.15
ona | on | a | 867 | 0.62 % | 837.59 | 133 | 0.46 % | 576.95 | 539 | 1.27 % | 1,823.77 | 76 | 0.17 % | 215.21 | 119 | 0.53 % | 763.34
teh | te | h | 850 | 0.61 % | 821.17 | 152 | 0.53 % | 659.37 | 103 | 0.24 % | 348.51 | 434 | 0.96 % | 1,228.94 | 161 | 0.71 % | 1,032.76
meni | me | ni | 810 | 0.58 % | 782.53 | 132 | 0.46 % | 572.61 | 405 | 0.95 % | 1,370.36 | 113 | 0.25 % | 319.98 | 160 | 0.71 % | 1,026.34
vsi | vs | i | 800 | 0.57 % | 772.87 | 175 | 0.61 % | 759.14 | 205 | 0.48 % | 693.64 | 314 | 0.69 % | 889.14 | 106 | 0.47 % | 679.95

2.2.163 List of initial character-level 3-grams from pronoun standardized forms in the GOS 1.0 corpus with text-type distribution
File at CLARIN.SI: GOS1.0-word_parts-pronouns-standardized_forms-initial-3grams-taxonomy-entire.tsv

Standardized form | Initial part of the word | Rest of the word | Total absolute frequency of standardized form | Percentage of all found standardized forms | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N]
kaj | kaj | | 8,980 | 12.14 % | 8,675.43 | 1,692 | 11.07 % | 7,339.83 | 3,530 | 15.81 % | 11,944.16 | 2,397 | 9.60 % | 6,787.50 | 1,361 | 11.95 % | 8,730.35
jaz | jaz | | 6,278 | 8.49 % | 6,065.07 | 1,175 | 7.69 % | 5,097.11 | 2,519 | 11.28 % | 8,523.32 | 1,443 | 5.78 % | 4,086.09 | 1,141 | 10.02 % | 7,319.12
kar | kar | | 2,909 | 3.93 % | 2,810.34 | 617 | 4.04 % | 2,676.52 | 731 | 3.27 % | 2,473.42 | 1,058 | 4.24 % | 2,995.90 | 503 | 4.42 % | 3,226.57
vse | vse | | 2,769 | 3.74 % | 2,675.09 | 648 | 4.24 % | 2,811 | 969 | 4.34 % | 3,278.72 | 689 | 2.76 % | 1,951.02 | 463 | 4.07 % | 2,969.99
tem | tem | | 2,124 | 2.87 % | 2,051.96 | 399 | 2.61 % | 1,730.85 | 248 | 1.11 % | 839.14 | 1,098 | 4.40 % | 3,109.17 | 379 | 3.33 % | 2,431.15
nekaj | nek | aj | 2,020 | 2.73 % | 1,951.49 | 376 | 2.46 % | 1,631.07 | 748 | 3.35 % | 2,530.94 | 575 | 2.30 % | 1,628.21 | 321 | 2.82 % | 2,059.10
tega | teg | a | 1,978 | 2.67 % | 1,910.91 | 301 | 1.97 % | 1,305.73 | 382 | 1.71 % | 1,292.54 | 930 | 3.73 % | 2,633.45 | 365 | 3.21 % | 2,341.35
jih | jih | | 1,931 | 2.61 % | 1,865.51 | 323 | 2.11 % | 1,401.16 | 516 | 2.31 % | 1,745.94 | 802 | 3.21 % | 2,271 | 290 | 2.55 % | 1,860.25
kako | kak | o | 1,853 | 2.51 % | 1,790.15 | 400 | 2.62 % | 1,735.18 | 676 | 3.03 % | 2,287.32 | 513 | 2.06 % | 1,452.64 | 264 | 2.32 % | 1,693.47
nič | nič | | 1,538 | 2.08 % | 1,485.84 | 370 | 2.42 % | 1,605.05 | 602 | 2.70 % | 2,036.94 | 371 | 1.49 % | 1,050.55 | 195 | 1.71 % | 1,250.86
tisto | tis | to | 1,156 | 1.56 % | 1,116.79 | 132 | 0.86 % | 572.61 | 566 | 2.54 % | 1,915.13 | 278 | 1.11 % | 787.20 | 180 | 1.58 % | 1,154.64
kdo | kdo | | 1,004 | 1.36 % | 969.95 | 259 | 1.70 % | 1,123.53 | 231 | 1.03 % | 781.61 | 371 | 1.49 % | 1,050.55 | 143 | 1.26 % | 917.30
oni | oni | | 1,003 | 1.36 % | 968.98 | 132 | 0.86 % | 572.61 | 599 | 2.68 % | 2,026.78 | 122 | 0.49 % | 345.46 | 150 | 1.32 % | 962.20
tisti | tis | ti | 947 | 1.28 % | 914.88 | 168 | 1.10 % | 728.78 | 352 | 1.58 % | 1,191.03 | 314 | 1.26 % | 889.14 | 113 | 0.99 % | 724.86
nas | nas | | 886 | 1.20 % | 855.95 | 208 | 1.36 % | 902.30 | 272 | 1.22 % | 920.34 | 314 | 1.26 % | 889.14 | 92 | 0.81 % | 590.15
ona | ona | | 867 | 1.17 % | 837.59 | 133
0.87 % | 576.95 | 539 | 2.41 % | 1,823.77 | 76 | 0.30 % | 215.21 | 119 | 1.04 % | 763.34
teh | teh | | 850 | 1.15 % | 821.17 | 152 | 0.99 % | 659.37 | 103 | 0.46 % | 348.51 | 434 | 1.74 % | 1,228.94 | 161 | 1.41 % | 1,032.76
meni | men | i | 810 | 1.09 % | 782.53 | 132 | 0.86 % | 572.61 | 405 | 1.81 % | 1,370.36 | 113 | 0.45 % | 319.98 | 160 | 1.41 % | 1,026.34
vsi | vsi | | 800 | 1.08 % | 772.87 | 175 | 1.15 % | 759.14 | 205 | 0.92 % | 693.64 | 314 | 1.26 % | 889.14 | 106 | 0.93 % | 679.95
vam | vam | | 798 | 1.08 % | 770.93 | 231 | 1.51 % | 1,002.07 | 26 | 0.12 % | 87.97 | 304 | 1.22 % | 860.83 | 237 | 2.08 % | 1,520.27
nam | nam | | 657 | 0.89 % | 634.72 | 166 | 1.09 % | 720.10 | 94 | 0.42 % | 318.06 | 315 | 1.26 % | 891.97 | 82 | 0.72 % | 526
vas | vas | | 606 | 0.82 % | 585.45 | 224 | 1.47 % | 971.70 | 61 | 0.27 % | 206.40 | 215 | 0.86 % | 608.81 | 106 | 0.93 % | 679.95
tole | tol | e | 597 | 0.81 % | 576.75 | 153 | 1.00 % | 663.71 | 98 | 0.44 % | 331.59 | 235 | 0.94 % | 665.44 | 111 | 0.97 % | 712.03
tak | tak | | 574 | 0.78 % | 554.53 | 107 | 0.70 % | 464.16 | 218 | 0.98 % | 737.63 | 135 | 0.54 % | 382.27 | 114 | 1.00 % | 731.27
kakšen | kak | šen | 563 | 0.76 % | 543.91 | 137 | 0.90 % | 594.30 | 133 | 0.60 % | 450.02 | 223 | 0.89 % | 631.46 | 70 | 0.61 % | 449.03
vsak | vsa | k | 508 | 0.69 % | 490.77 | 117 | 0.77 % | 507.54 | 140 | 0.63 % | 473.71 | 166 | 0.67 % | 470.06 | 85 | 0.75 % | 545.25
kakšno | kak | šno | 506 | 0.68 % | 488.84 | 156 | 1.02 % | 676.72 | 73 | 0.33 % | 247 | 203 | 0.81 % | 574.83 | 74 | 0.65 % | 474.68
koliko | kol | iko | 490 | 0.66 % | 473.38 | 56 | 0.37 % | 242.93 | 220 | 0.98 % | 744.40 | 113 | 0.45 % | 319.98 | 101 | 0.89 % | 647.88
tej | tej | | 482 | 0.65 % | 465.65 | 110 | 0.72 % | 477.18 | 59 | 0.26 % | 199.63 | 259 | 1.04 % | 733.40 | 54 | 0.47 % | 346.39
neki | nek | i | 476 | 0.64 % | 459.86 | 59 | 0.39 % | 255.94 | 86 | 0.39 % | 290.99 | 232 | 0.93 % | 656.95 | 99 | 0.87 % | 635.05
mene | men | e | 461 | 0.62 % | 445.36 | 89 | 0.58 % | 386.08 | 217 | 0.97 % | 734.24 | 81 | 0.32 % | 229.36 | 74 | 0.65 % | 474.68
ono | ono | | 443 | 0.60 % | 427.98 | 26 | 0.17 % | 112.79 | 339 | 1.52 % | 1,147.05 | 26 | 0.10 % | 73.62 | 52 | 0.46 % | 333.56

2.2.164 List of initial character-level 4-grams from pronoun standardized forms in the GOS 1.0 corpus with text-type distribution
File at CLARIN.SI: GOS1.0-word_parts-pronouns-standardized_forms-initial-4grams-taxonomy-entire.tsv

Standardized form | Initial part of the word | Rest of the word | Total absolute frequency of standardized form | Percentage of all found standardized forms | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N]
nekaj | neka | j | 2,020 | 5.59 % | 1,951.49 | 376 | 4.89 % | 1,631.07 | 748 | 7.66 % | 2,530.94 | 575 | 4.28 % | 1,628.21 | 321 | 6.11 % | 2,059.10
tega | tega | | 1,978 | 5.47 % | 1,910.91 | 301 | 3.91 % | 1,305.73 | 382 | 3.91 % | 1,292.54 | 930 | 6.92 % | 2,633.45 | 365 | 6.94 % | 2,341.35
kako | kako | | 1,853 | 5.13 % | 1,790.15 | 400 | 5.20 % | 1,735.18 | 676 | 6.92 % | 2,287.32 | 513 | 3.82 % | 1,452.64 | 264 | 5.02 % | 1,693.47
tisto | tist | o | 1,156 | 3.20 % | 1,116.79 | 132 | 1.72 % | 572.61 | 566 | 5.80 % | 1,915.13 | 278 | 2.07 % | 787.20 | 180 | 3.42 % | 1,154.64
tisti | tist | i | 947 | 2.62 % | 914.88 | 168 | 2.18 % | 728.78 | 352 | 3.60 % | 1,191.03 | 314 | 2.34 % | 889.14 | 113 | 2.15 % | 724.86
meni | meni | | 810 | 2.24 % | 782.53 | 132 | 1.72 % | 572.61 | 405 | 4.15 % | 1,370.36 | 113 | 0.84 % | 319.98 | 160 | 3.04 % | 1,026.34
tole | tole | | 597 | 1.65 % | 576.75 | 153 | 1.99 % | 663.71 | 98 | 1.00 % | 331.59 | 235 | 1.75 % | 665.44 | 111 | 2.11 % | 712.03
kakšen | kakš | en | 563 | 1.56 % | 543.91 | 137 | 1.78 % | 594.30 | 133 | 1.36 % | 450.02 | 223 | 1.66 % | 631.46 | 70 | 1.33 % | 449.03
vsak | vsak | | 508 | 1.41 % | 490.77 | 117 | 1.52 % | 507.54 | 140 | 1.43 % | 473.71 | 166 | 1.24 % | 470.06 | 85 | 1.62 % | 545.25
kakšno | kakš | no | 506 | 1.40 % | 488.84 | 156 | 2.03 % | 676.72 | 73 | 0.75 % | 247 | 203 | 1.51 % | 574.83 | 74 | 1.41 % | 474.68
koliko | koli | ko | 490 | 1.36 % | 473.38 | 56 | 0.73 % | 242.93 | 220 | 2.25 % | 744.40 | 113 | 0.84 % | 319.98 | 101 | 1.92 % | 647.88
neki | neki | | 476 | 1.32 % | 459.86 | 59 | 0.77 % | 255.94 | 86 | 0.88 % | 290.99 | 232 | 1.73 % | 656.95 | 99 | 1.88 % | 635.05
mene | mene | | 461 | 1.27 % | 445.36 | 89 | 1.16 % | 386.08 | 217 | 2.22 % | 734.24 | 81 | 0.60 % | 229.36 | 74 | 1.41 % | 474.68
take | take | | 437 | 1.21 % | 422.18 | 78 | 1.01 % | 338.36 | 176 | 1.80 % | 595.52 | 101 | 0.75 % | 286 | 82 | 1.56 % | 526
kateri | kate | ri | 414 | 1.15 % | 399.96
104 | 1.35 % | 451.15 | 99 | 1.01 % | 334.98 | 170 | 1.27 % | 481.38 | 41 | 0.78 % | 263
moje | moje | | 396 | 1.09 % | 382.57 | 72 | 0.94 % | 312.33 | 193 | 1.98 % | 653.04 | 73 | 0.54 % | 206.71 | 58 | 1.10 % | 372.05
nekako | neka | ko | 391 | 1.08 % | 377.74 | 73 | 0.95 % | 316.67 | 68 | 0.70 % | 230.09 | 176 | 1.31 % | 498.37 | 74 | 1.41 % | 474.68
tale | tale | | 391 | 1.08 % | 377.74 | 93 | 1.21 % | 403.43 | 77 | 0.79 % | 260.54 | 158 | 1.18 % | 447.40 | 63 | 1.20 % | 404.12
taka | taka | | 390 | 1.08 % | 376.77 | 76 | 0.99 % | 329.69 | 140 | 1.43 % | 473.71 | 100 | 0.74 % | 283.17 | 74 | 1.41 % | 474.68
kakšna | kakš | na | 386 | 1.07 % | 372.91 | 76 | 0.99 % | 329.69 | 72 | 0.74 % | 243.62 | 195 | 1.45 % | 552.17 | 43 | 0.82 % | 275.83
tiste | tist | e | 386 | 1.07 % | 372.91 | 75 | 0.97 % | 325.35 | 140 | 1.43 % | 473.71 | 107 | 0.80 % | 302.99 | 64 | 1.22 % | 410.54
svoje | svoj | e | 382 | 1.06 % | 369.04 | 74 | 0.96 % | 321.01 | 55 | 0.56 % | 186.10 | 218 | 1.62 % | 617.30 | 35 | 0.67 % | 224.51
kakšne | kakš | ne | 381 | 1.05 % | 368.08 | 87 | 1.13 % | 377.40 | 80 | 0.82 % | 270.69 | 147 | 1.09 % | 416.25 | 67 | 1.27 % | 429.78
neke | neke | | 360 | 1.00 % | 347.79 | 37 | 0.48 % | 160.50 | 69 | 0.71 % | 233.47 | 162 | 1.21 % | 458.73 | 92 | 1.75 % | 590.15
toliko | toli | ko | 359 | 0.99 % | 346.82 | 79 | 1.03 % | 342.70 | 127 | 1.30 % | 429.72 | 81 | 0.60 % | 229.36 | 72 | 1.37 % | 461.86
temu | temu | | 348 | 0.96 % | 336.20 | 79 | 1.03 % | 342.70 | 41 | 0.42 % | 138.73 | 189 | 1.41 % | 535.18 | 39 | 0.74 % | 250.17
neko | neko | | 341 | 0.94 % | 329.43 | 32 | 0.42 % | 138.81 | 69 | 0.71 % | 233.47 | 173 | 1.29 % | 489.88 | 67 | 1.27 % | 429.78
takega | take | ga | 324 | 0.90 % | 313.01 | 46 | 0.60 % | 199.55 | 139 | 1.42 % | 470.32 | 60 | 0.45 % | 169.90 | 79 | 1.50 % | 506.76
tako | tako | | 321 | 0.89 % | 310.11 | 53 | 0.69 % | 229.91 | 112 | 1.15 % | 378.96 | 90 | 0.67 % | 254.85 | 66 | 1.25 % | 423.37
vseh | vseh | | 321 | 0.89 % | 310.11 | 75 | 0.97 % | 325.35 | 39 | 0.40 % | 131.96 | 175 | 1.30 % | 495.54 | 32 | 0.61 % | 205.27
tista | tist | a | 298 | 0.82 % | 287.89 | 54 | 0.70 % | 234.25 | 113 | 1.16 % | 382.35 | 87 | 0.65 % | 246.35 | 44 | 0.84 % | 282.24
isto | isto | | 290 | 0.80 % | 280.16 | 29 | 0.38 % | 125.80 | 125 | 1.28 % | 422.95 | 88 | 0.66 % | 249.19 | 48 | 0.91 % | 307.90

2.2.165 List of initial character-level 5-grams from pronoun standardized forms in the GOS 1.0 corpus with text-type distribution
File at CLARIN.SI: GOS1.0-word_parts-pronouns-standardized_forms-initial-5grams-taxonomy-entire.tsv

Standardized form | Initial part of the word | Rest of the word | Total absolute frequency of standardized form | Percentage of all found standardized forms | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N]
nekaj | nekaj | | 2,020 | 10.17 % | 1,951.49 | 376 | 8.96 % | 1,631.07 | 748 | 14.21 % | 2,530.94 | 575 | 7.48 % | 1,628.21 | 321 | 11.81 % | 2,059.10
tisto | tisto | | 1,156 | 5.82 % | 1,116.79 | 132 | 3.15 % | 572.61 | 566 | 10.75 % | 1,915.13 | 278 | 3.62 % | 787.20 | 180 | 6.62 % | 1,154.64
tisti | tisti | | 947 | 4.77 % | 914.88 | 168 | 4.00 % | 728.78 | 352 | 6.69 % | 1,191.03 | 314 | 4.09 % | 889.14 | 113 | 4.16 % | 724.86
kakšen | kakše | n | 563 | 2.83 % | 543.91 | 137 | 3.27 % | 594.30 | 133 | 2.53 % | 450.02 | 223 | 2.90 % | 631.46 | 70 | 2.58 % | 449.03
kakšno | kakšn | o | 506 | 2.55 % | 488.84 | 156 | 3.72 % | 676.72 | 73 | 1.39 % | 247 | 203 | 2.64 % | 574.83 | 74 | 2.72 % | 474.68
koliko | kolik | o | 490 | 2.47 % | 473.38 | 56 | 1.33 % | 242.93 | 220 | 4.18 % | 744.40 | 113 | 1.47 % | 319.98 | 101 | 3.72 % | 647.88
kateri | kater | i | 414 | 2.08 % | 399.96 | 104 | 2.48 % | 451.15 | 99 | 1.88 % | 334.98 | 170 | 2.21 % | 481.38 | 41 | 1.51 % | 263
nekako | nekak | o | 391 | 1.97 % | 377.74 | 73 | 1.74 % | 316.67 | 68 | 1.29 % | 230.09 | 176 | 2.29 % | 498.37 | 74 | 2.72 % | 474.68
kakšna | kakšn | a | 386 | 1.94 % | 372.91 | 76 | 1.81 % | 329.69 | 72 | 1.37 % | 243.62 | 195 | 2.54 % | 552.17 | 43 | 1.58 % | 275.83
tiste | tiste | | 386 | 1.94 % | 372.91 | 75 | 1.79 % | 325.35 | 140 | 2.66 % | 473.71 | 107 | 1.39 % | 302.99 | 64 | 2.35 % | 410.54
svoje | svoje | | 382 | 1.92 % | 369.04 | 74 | 1.76 % | 321.01 | 55 | 1.04 % | 186.10 | 218 | 2.84 % | 617.30 | 35 | 1.29 % | 224.51
kakšne | kakšn | e | 381 | 1.92 % | 368.08 | 87 | 2.07 % | 377.40 | 80 | 1.52 % | 270.69 | 147 | 1.91 % | 416.25 | 67 | 2.46 % | 429.78
toliko | tolik | o | 359 | 1.81 % | 346.82 | 79 | 1.88 % | 342.70 | 127 | 2.41 % | 429.72 | 81 | 1.05 % | 229.36 | 72 | 2.65 % | 461.86
takega | takeg | a | 324
1.63 % | 313.01 | 46 | 1.10 % | 199.55 | 139 | 2.64 % | 470.32 | 60 | 0.78 % | 169.90 | 79 | 2.91 % | 506.76
tista | tista | | 298 | 1.50 % | 287.89 | 54 | 1.29 % | 234.25 | 113 | 2.15 % | 382.35 | 87 | 1.13 % | 246.35 | 44 | 1.62 % | 282.24
takole | takol | e | 241 | 1.21 % | 232.83 | 75 | 1.79 % | 325.35 | 60 | 1.14 % | 203.02 | 79 | 1.03 % | 223.70 | 27 | 0.99 % | 173.20
svojo | svojo | | 232 | 1.17 % | 224.13 | 60 | 1.43 % | 260.28 | 29 | 0.55 % | 98.12 | 127 | 1.65 % | 359.62 | 16 | 0.59 % | 102.63
midva | midva | | 231 | 1.16 % | 223.17 | 98 | 2.34 % | 425.12 | 88 | 1.67 % | 297.76 | 20 | 0.26 % | 56.63 | 25 | 0.92 % | 160.37
nekdo | nekdo | | 208 | 1.05 % | 200.95 | 35 | 0.83 % | 151.83 | 49 | 0.93 % | 165.80 | 94 | 1.22 % | 266.18 | 30 | 1.10 % | 192.44
mojem | mojem | | 206 | 1.04 % | 199.01 | 49 | 1.17 % | 212.56 | 64 | 1.22 % | 216.55 | 48 | 0.62 % | 135.92 | 45 | 1.66 % | 288.66
katera | kater | a | 199 | 1.00 % | 192.25 | 38 | 0.91 % | 164.84 | 50 | 0.95 % | 169.18 | 93 | 1.21 % | 263.34 | 18 | 0.66 % | 115.46
katero | kater | o | 196 | 0.99 % | 189.35 | 43 | 1.02 % | 186.53 | 48 | 0.91 % | 162.41 | 88 | 1.15 % | 249.19 | 17 | 0.62 % | 109.05
katere | kater | e | 195 | 0.98 % | 188.39 | 28 | 0.67 % | 121.46 | 17 | 0.32 % | 57.52 | 125 | 1.63 % | 353.96 | 25 | 0.92 % | 160.37
njega | njega | | 195 | 0.98 % | 188.39 | 41 | 0.98 % | 177.86 | 66 | 1.25 % | 223.32 | 64 | 0.83 % | 181.23 | 24 | 0.88 % | 153.95
tistega | tiste | ga | 179 | 0.90 % | 172.93 | 28 | 0.67 % | 121.46 | 58 | 1.10 % | 196.25 | 62 | 0.81 % | 175.56 | 31 | 1.14 % | 198.85
tistih | tisti | h | 178 | 0.90 % | 171.96 | 32 | 0.76 % | 138.81 | 44 | 0.84 % | 148.88 | 82 | 1.07 % | 232.20 | 20 | 0.74 % | 128.29
vsako | vsako | | 173 | 0.87 % | 167.13 | 38 | 0.91 % | 164.84 | 60 | 1.14 % | 203.02 | 65 | 0.85 % | 184.06 | 10 | 0.37 % | 64.15
kakšni | kakšn | i | 147 | 0.74 % | 142.01 | 30 | 0.71 % | 130.14 | 30 | 0.57 % | 101.51 | 69 | 0.90 % | 195.38 | 18 | 0.66 % | 115.46
katerega | kater | ega | 146 | 0.73 % | 141.05 | 31 | 0.74 % | 134.48 | 37 | 0.70 % | 125.19 | 61 | 0.79 % | 172.73 | 17 | 0.62 % | 109.05
njemu | njemu | | 146 | 0.73 % | 141.05 | 23 | 0.55 % | 99.77 | 75 | 1.43 % | 253.77 | 28 | 0.36 % | 79.29 | 20 | 0.74 % | 128.29
njimi | njimi | | 143 | 0.72 % | 138.15 | 26 | 0.62 % | 112.79 | 31 | 0.59 % | 104.89 | 70 | 0.91 % | 198.22 | 16 | 0.59 % | 102.63
takih | takih | | 137 | 0.69 % | 132.35 | 26 | 0.62 % | 112.79 | 33 | 0.63 % | 111.66 | 56 | 0.73 % | 158.57 | 22 | 0.81 % | 141.12

2.2.166 List of final character-level 1-grams from pronoun standardized forms in the GOS 1.0 corpus with text-type distribution
File at CLARIN.SI: GOS1.0-word_parts-pronouns-standardized_forms-final-1grams-taxonomy-entire.tsv

Standardized form | Rest of the word | Final part of the word | Total absolute frequency of standardized form | Percentage of all found standardized forms | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N]
to | t | o | 18,439 | 13.25 % | 17,813.62 | 3,620 | 12.57 % | 15,703.42 | 5,469 | 12.87 % | 18,504.98 | 5,489 | 12.12 % | 15,543.01 | 3,861 | 17.11 % | 24,766.99
se | s | e | 15,865 | 11.40 % | 15,326.92 | 3,232 | 11.23 % | 14,020.29 | 3,910 | 9.20 % | 13,229.93 | 6,051 | 13.36 % | 17,134.41 | 2,672 | 11.84 % | 17,139.96
kaj | ka | j | 8,980 | 6.45 % | 8,675.43 | 1,692 | 5.88 % | 7,339.83 | 3,530 | 8.31 % | 11,944.16 | 2,397 | 5.29 % | 6,787.50 | 1,361 | 6.03 % | 8,730.35
jaz | ja | z | 6,278 | 4.51 % | 6,065.07 | 1,175 | 4.08 % | 5,097.11 | 2,519 | 5.93 % | 8,523.32 | 1,443 | 3.19 % | 4,086.09 | 1,141 | 5.06 % | 7,319.12
je | j | e | 5,254 | 3.78 % | 5,075.80 | 1,061 | 3.69 % | 4,602.58 | 1,778 | 4.18 % | 6,016.07 | 1,682 | 3.71 % | 4,762.86 | 733 | 3.25 % | 4,701.94
ti | t | i | 4,344 | 3.12 % | 4,196.67 | 1,094 | 3.80 % | 4,745.73 | 1,973 | 4.64 % | 6,675.87 | 617 | 1.36 % | 1,747.14 | 660 | 2.92 % | 4,233.67
mi | m | i | 4,046 | 2.91 % | 3,908.77 | 672 | 2.33 % | 2,915.11 | 1,457 | 3.43 % | 4,929.93 | 1,249 | 2.76 % | 3,536.75 | 668 | 2.96 % | 4,284.99
ta | t | a | 3,881 | 2.79 % | 3,749.37 | 678 | 2.35 % | 2,941.14 | 1,121 | 2.64 % | 3,793.03 | 1,386 | 3.06 % | 3,924.69 | 696 | 3.08 % | 4,464.60
te | t | e | 2,995 | 2.15 % | 2,893.42 | 791 | 2.75 % | 3,431.33 | 908 | 2.14 % | 3,072.32 | 808 | 1.78 % | 2,287.99 | 488 | 2.16 % | 3,130.35
kar | ka | r | 2,909 | 2.09 % | 2,810.34 | 617 | 2.14 % | 2,676.52 | 731 | 1.72 % | 2,473.42 | 1,058 | 2.34 % | 2,995.90 | 503 | 2.23 % | 3,226.57
vse | vs | e | 2,769 | 1.99 % | 2,675.09 | 648 | 2.25 % | 2,811 | 969 | 2.28 % | 3,278.72 | 689 | 1.52 % | 1,951.02 | 463 | 2.05 % | 2,969.99
ga | g | a | 2,681 | 1.93 % | 2,590.07 | 527 | 1.83 % | 2,286.11 | 1,001 | 2.35 % | 3,387 | 809 | 1.79 % | 2,290.82 | 344 | 1.52 % | 2,206.64
tem | te | m | 2,124 | 1.53 % | 2,051.96 | 399 | 1.39 % | 1,730.85 | 248 | 0.58 % | 839.14 | 1,098 | 2.42 % | 3,109.17 | 379 | 1.68 % | 2,431.15
nekaj | neka | j | 2,020 | 1.45 % | 1,951.49 | 376 | 1.31 % | 1,631.07 | 748 | 1.76 % | 2,530.94 | 575 | 1.27 % | 1,628.21 | 321 | 1.42 % | 2,059.10
tega | teg | a | 1,978 | 1.42 % | 1,910.91 | 301 | 1.05 % | 1,305.73 | 382 | 0.90 % | 1,292.54 | 930 | 2.05 % | 2,633.45 | 365 | 1.62 % | 2,341.35
jih | ji | h | 1,931 | 1.39 % | 1,865.51 | 323 | 1.12 % | 1,401.16 | 516 | 1.21 % | 1,745.94 | 802 | 1.77 % | 2,271 | 290 | 1.28 % | 1,860.25
kako | kak | o | 1,853 | 1.33 % | 1,790.15 | 400 | 1.39 % | 1,735.18 | 676 | 1.59 % | 2,287.32 | 513 | 1.13 % | 1,452.64 | 264 | 1.17 % | 1,693.47
si | s | i | 1,690 | 1.22 % | 1,632.68 | 449 | 1.56 % | 1,947.74 | 487 | 1.15 % | 1,647.82 | 521 | 1.15 % | 1,475.30 | 233 | 1.03 % | 1,494.61
nič | ni | č | 1,538 | 1.10 % | 1,485.84 | 370 | 1.28 % | 1,605.05 | 602 | 1.42 % | 2,036.94 | 371 | 0.82 % | 1,050.55 | 195 | 0.86 % | 1,250.86
jo | j | o | 1,333 | 0.96 % | 1,287.79 | 286 | 0.99 % | 1,240.66 | 403 | 0.95 % | 1,363.60 | 512 | 1.13 % | 1,449.81 | 132 | 0.58 % | 846.73
on | o | n | 1,216 | 0.87 % | 1,174.76 | 246 | 0.85 % | 1,067.14 | 585 | 1.38 % | 1,979.41 | 211 | 0.47 % | 597.48 | 174 | 0.77 % | 1,116.15
tisto | tist | o | 1,156 | 0.83 % | 1,116.79 | 132 | 0.46 % | 572.61 | 566 | 1.33 % | 1,915.13 | 278 | 0.61 % | 787.20 | 180 | 0.80 % | 1,154.64
me | m | e | 1,146 | 0.82 % | 1,107.13 | 327 | 1.14 % | 1,418.51 | 440 | 1.03 % | 1,488.79 | 232 | 0.51 % | 656.95 | 147 | 0.65 % | 942.95
kdo | kd | o | 1,004 | 0.72 % | 969.95 | 259 | 0.90 % | 1,123.53 | 231 | 0.54 % | 781.61 | 371 | 0.82 % | 1,050.55 | 143 | 0.63 % | 917.30
oni | on | i | 1,003 | 0.72 % | 968.98 | 132 | 0.46 % | 572.61 | 599 | 1.41 % | 2,026.78 | 122 | 0.27 % | 345.46 | 150 | 0.67 % | 962.20
tisti | tist | i | 947 | 0.68 % | 914.88 | 168 | 0.58 % | 728.78 | 352 | 0.83 % | 1,191.03 | 314 | 0.69 % | 889.14 | 113 | 0.50 % | 724.86
vi | v | i | 907 | 0.65 % | 876.24 | 192 | 0.67 % | 832.89 | 98 | 0.23 % | 331.59 | 380 | 0.84 % | 1,076.03 | 237 | 1.05 % | 1,520.27
nas | na | s | 886 | 0.64 % | 855.95 | 208 | 0.72 % | 902.30 | 272 | 0.64 % | 920.34 | 314 | 0.69 % | 889.14 | 92 | 0.41 % | 590.15
ona | on | a | 867 | 0.62 % | 837.59 | 133 | 0.46 % | 576.95 | 539 | 1.27 % | 1,823.77 | 76 | 0.17 % | 215.21 | 119 | 0.53 % | 763.34
teh | te | h | 850 | 0.61 % | 821.17 | 152 | 0.53 % | 659.37 | 103 | 0.24 % | 348.51 | 434 | 0.96 % | 1,228.94 | 161 | 0.71 % | 1,032.76
meni | men | i | 810 | 0.58 % | 782.53 | 132 | 0.46 % | 572.61 | 405 | 0.95 % | 1,370.36 | 113 | 0.25 % | 319.98 | 160 | 0.71 % | 1,026.34
vsi | vs | i | 800 | 0.57 % | 772.87 | 175 | 0.61 % | 759.14 | 205 | 0.48 % | 693.64 | 314 | 0.69 % | 889.14 | 106 | 0.47 % | 679.95

2.2.167 List of final character-level 2-grams from pronoun standardized forms in the GOS 1.0 corpus with text-type distribution
File at CLARIN.SI: GOS1.0-word_parts-pronouns-standardized_forms-final-2grams-taxonomy-entire.tsv

Standardized form | Rest of the word | Final part of the word | Total absolute frequency of standardized form | Percentage of all found standardized forms | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N]
to | | to | 18,439 | 13.25 % | 17,813.62 | 3,620 | 12.57 % | 15,703.42 | 5,469 | 12.87 % | 18,504.98 | 5,489 | 12.12 % | 15,543.01 | 3,861 | 17.11 % | 24,766.99
se | | se | 15,865 | 11.40 % | 15,326.92 | 3,232 | 11.23 % | 14,020.29 | 3,910 | 9.20 % | 13,229.93 | 6,051 | 13.36 % | 17,134.41 | 2,672 | 11.84 % | 17,139.96
kaj | k | aj | 8,980 | 6.45 % | 8,675.43 | 1,692 | 5.88 % | 7,339.83 | 3,530 | 8.31 % | 11,944.16 | 2,397 | 5.29 % | 6,787.50 | 1,361 | 6.03 % | 8,730.35
jaz | j | az | 6,278 | 4.51 % | 6,065.07 | 1,175 | 4.08 % | 5,097.11 | 2,519 | 5.93 % | 8,523.32 | 1,443 | 3.19 % | 4,086.09 | 1,141 | 5.06 % | 7,319.12
je | | je | 5,254 | 3.78 % | 5,075.80 | 1,061 | 3.69 % | 4,602.58 | 1,778 | 4.18 % | 6,016.07 | 1,682 | 3.71 % | 4,762.86 | 733 | 3.25 % | 4,701.94
ti | | ti | 4,344 | 3.12 % | 4,196.67 | 1,094 | 3.80 % | 4,745.73 | 1,973 | 4.64 % | 6,675.87 | 617 | 1.36 % | 1,747.14 | 660 | 2.92 % | 4,233.67
mi | | mi | 4,046 | 2.91 % | 3,908.77 | 672 | 2.33 % | 2,915.11 | 1,457 | 3.43 % | 4,929.93 | 1,249 | 2.76 % | 3,536.75 | 668 | 2.96 % | 4,284.99
ta | | ta | 3,881 | 2.79 % | 3,749.37 | 678 | 2.35 % | 2,941.14 | 1,121 | 2.64 % | 3,793.03 | 1,386 | 3.06 %
3,924.69 | 696 | 3.08 % | 4,464.60
te | | te | 2,995 | 2.15 % | 2,893.42 | 791 | 2.75 % | 3,431.33 | 908 | 2.14 % | 3,072.32 | 808 | 1.78 % | 2,287.99 | 488 | 2.16 % | 3,130.35
kar | k | ar | 2,909 | 2.09 % | 2,810.34 | 617 | 2.14 % | 2,676.52 | 731 | 1.72 % | 2,473.42 | 1,058 | 2.34 % | 2,995.90 | 503 | 2.23 % | 3,226.57
vse | v | se | 2,769 | 1.99 % | 2,675.09 | 648 | 2.25 % | 2,811 | 969 | 2.28 % | 3,278.72 | 689 | 1.52 % | 1,951.02 | 463 | 2.05 % | 2,969.99
ga | | ga | 2,681 | 1.93 % | 2,590.07 | 527 | 1.83 % | 2,286.11 | 1,001 | 2.35 % | 3,387 | 809 | 1.79 % | 2,290.82 | 344 | 1.52 % | 2,206.64
tem | t | em | 2,124 | 1.53 % | 2,051.96 | 399 | 1.39 % | 1,730.85 | 248 | 0.58 % | 839.14 | 1,098 | 2.42 % | 3,109.17 | 379 | 1.68 % | 2,431.15
nekaj | nek | aj | 2,020 | 1.45 % | 1,951.49 | 376 | 1.31 % | 1,631.07 | 748 | 1.76 % | 2,530.94 | 575 | 1.27 % | 1,628.21 | 321 | 1.42 % | 2,059.10
tega | te | ga | 1,978 | 1.42 % | 1,910.91 | 301 | 1.05 % | 1,305.73 | 382 | 0.90 % | 1,292.54 | 930 | 2.05 % | 2,633.45 | 365 | 1.62 % | 2,341.35
jih | j | ih | 1,931 | 1.39 % | 1,865.51 | 323 | 1.12 % | 1,401.16 | 516 | 1.21 % | 1,745.94 | 802 | 1.77 % | 2,271 | 290 | 1.28 % | 1,860.25
kako | ka | ko | 1,853 | 1.33 % | 1,790.15 | 400 | 1.39 % | 1,735.18 | 676 | 1.59 % | 2,287.32 | 513 | 1.13 % | 1,452.64 | 264 | 1.17 % | 1,693.47
si | | si | 1,690 | 1.22 % | 1,632.68 | 449 | 1.56 % | 1,947.74 | 487 | 1.15 % | 1,647.82 | 521 | 1.15 % | 1,475.30 | 233 | 1.03 % | 1,494.61
nič | n | ič | 1,538 | 1.10 % | 1,485.84 | 370 | 1.28 % | 1,605.05 | 602 | 1.42 % | 2,036.94 | 371 | 0.82 % | 1,050.55 | 195 | 0.86 % | 1,250.86
jo | | jo | 1,333 | 0.96 % | 1,287.79 | 286 | 0.99 % | 1,240.66 | 403 | 0.95 % | 1,363.60 | 512 | 1.13 % | 1,449.81 | 132 | 0.58 % | 846.73
on | | on | 1,216 | 0.87 % | 1,174.76 | 246 | 0.85 % | 1,067.14 | 585 | 1.38 % | 1,979.41 | 211 | 0.47 % | 597.48 | 174 | 0.77 % | 1,116.15
tisto | tis | to | 1,156 | 0.83 % | 1,116.79 | 132 | 0.46 % | 572.61 | 566 | 1.33 % | 1,915.13 | 278 | 0.61 % | 787.20 | 180 | 0.80 % | 1,154.64
me | | me | 1,146 | 0.82 % | 1,107.13 | 327 | 1.14 % | 1,418.51 | 440 | 1.03 % | 1,488.79 | 232 | 0.51 % | 656.95 | 147 | 0.65 % | 942.95
kdo | k | do | 1,004 | 0.72 % | 969.95 | 259 | 0.90 % | 1,123.53 | 231 | 0.54 % | 781.61 | 371 | 0.82 % | 1,050.55 | 143 | 0.63 % | 917.30
oni | o | ni | 1,003 | 0.72 % | 968.98 | 132 | 0.46 % | 572.61 | 599 | 1.41 % | 2,026.78 | 122 | 0.27 % | 345.46 | 150 | 0.67 % | 962.20
tisti | tis | ti | 947 | 0.68 % | 914.88 | 168 | 0.58 % | 728.78 | 352 | 0.83 % | 1,191.03 | 314 | 0.69 % | 889.14 | 113 | 0.50 % | 724.86
vi | | vi | 907 | 0.65 % | 876.24 | 192 | 0.67 % | 832.89 | 98 | 0.23 % | 331.59 | 380 | 0.84 % | 1,076.03 | 237 | 1.05 % | 1,520.27
nas | n | as | 886 | 0.64 % | 855.95 | 208 | 0.72 % | 902.30 | 272 | 0.64 % | 920.34 | 314 | 0.69 % | 889.14 | 92 | 0.41 % | 590.15
ona | o | na | 867 | 0.62 % | 837.59 | 133 | 0.46 % | 576.95 | 539 | 1.27 % | 1,823.77 | 76 | 0.17 % | 215.21 | 119 | 0.53 % | 763.34
teh | t | eh | 850 | 0.61 % | 821.17 | 152 | 0.53 % | 659.37 | 103 | 0.24 % | 348.51 | 434 | 0.96 % | 1,228.94 | 161 | 0.71 % | 1,032.76
meni | me | ni | 810 | 0.58 % | 782.53 | 132 | 0.46 % | 572.61 | 405 | 0.95 % | 1,370.36 | 113 | 0.25 % | 319.98 | 160 | 0.71 % | 1,026.34
vsi | v | si | 800 | 0.57 % | 772.87 | 175 | 0.61 % | 759.14 | 205 | 0.48 % | 693.64 | 314 | 0.69 % | 889.14 | 106 | 0.47 % | 679.95

2.2.168 List of final character-level 3-grams from pronoun standardized forms in the GOS 1.0 corpus with text-type distribution
File at CLARIN.SI: GOS1.0-word_parts-pronouns-standardized_forms-final-3grams-taxonomy-entire.tsv

Standardized form | Rest of the word | Final part of the word | Total absolute frequency of standardized form | Percentage of all found standardized forms | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N]
kaj | | kaj | 8,980 | 12.14 % | 8,675.43 | 1,692 | 11.07 % | 7,339.83 | 3,530 | 15.81 % | 11,944.16 | 2,397 | 9.60 % | 6,787.50 | 1,361 | 11.95 % | 8,730.35
jaz | | jaz | 6,278 | 8.49 % | 6,065.07 | 1,175 | 7.69 % | 5,097.11 | 2,519 | 11.28 % | 8,523.32 | 1,443 | 5.78 % | 4,086.09 | 1,141 | 10.02 % | 7,319.12
kar | | kar | 2,909 | 3.93 % | 2,810.34 | 617 | 4.04 % | 2,676.52 | 731 | 3.27 % | 2,473.42 | 1,058 | 4.24 % | 2,995.90 | 503 | 4.42 % | 3,226.57
vse | | vse | 2,769 | 3.74 % | 2,675.09 | 648 | 4.24 % | 2,811 | 969 | 4.34 % | 3,278.72 | 689 | 2.76 % | 1,951.02 | 463 | 4.07 % | 2,969.99
tem | | tem | 2,124 | 2.87 % | 2,051.96 | 399 | 2.61 % | 1,730.85 | 248 | 1.11 % | 839.14 | 1,098 | 4.40 % | 3,109.17 | 379 | 3.33 %
2,431.15
nekaj | ne | kaj | 2,020 | 2.73 % | 1,951.49 | 376 | 2.46 % | 1,631.07 | 748 | 3.35 % | 2,530.94 | 575 | 2.30 % | 1,628.21 | 321 | 2.82 % | 2,059.10
tega | t | ega | 1,978 | 2.67 % | 1,910.91 | 301 | 1.97 % | 1,305.73 | 382 | 1.71 % | 1,292.54 | 930 | 3.73 % | 2,633.45 | 365 | 3.21 % | 2,341.35
jih | | jih | 1,931 | 2.61 % | 1,865.51 | 323 | 2.11 % | 1,401.16 | 516 | 2.31 % | 1,745.94 | 802 | 3.21 % | 2,271 | 290 | 2.55 % | 1,860.25
kako | k | ako | 1,853 | 2.51 % | 1,790.15 | 400 | 2.62 % | 1,735.18 | 676 | 3.03 % | 2,287.32 | 513 | 2.06 % | 1,452.64 | 264 | 2.32 % | 1,693.47
nič | | nič | 1,538 | 2.08 % | 1,485.84 | 370 | 2.42 % | 1,605.05 | 602 | 2.70 % | 2,036.94 | 371 | 1.49 % | 1,050.55 | 195 | 1.71 % | 1,250.86
tisto | ti | sto | 1,156 | 1.56 % | 1,116.79 | 132 | 0.86 % | 572.61 | 566 | 2.54 % | 1,915.13 | 278 | 1.11 % | 787.20 | 180 | 1.58 % | 1,154.64
kdo | | kdo | 1,004 | 1.36 % | 969.95 | 259 | 1.70 % | 1,123.53 | 231 | 1.03 % | 781.61 | 371 | 1.49 % | 1,050.55 | 143 | 1.26 % | 917.30
oni | | oni | 1,003 | 1.36 % | 968.98 | 132 | 0.86 % | 572.61 | 599 | 2.68 % | 2,026.78 | 122 | 0.49 % | 345.46 | 150 | 1.32 % | 962.20
tisti | ti | sti | 947 | 1.28 % | 914.88 | 168 | 1.10 % | 728.78 | 352 | 1.58 % | 1,191.03 | 314 | 1.26 % | 889.14 | 113 | 0.99 % | 724.86
nas | | nas | 886 | 1.20 % | 855.95 | 208 | 1.36 % | 902.30 | 272 | 1.22 % | 920.34 | 314 | 1.26 % | 889.14 | 92 | 0.81 % | 590.15
ona | | ona | 867 | 1.17 % | 837.59 | 133 | 0.87 % | 576.95 | 539 | 2.41 % | 1,823.77 | 76 | 0.30 % | 215.21 | 119 | 1.04 % | 763.34
teh | | teh | 850 | 1.15 % | 821.17 | 152 | 0.99 % | 659.37 | 103 | 0.46 % | 348.51 | 434 | 1.74 % | 1,228.94 | 161 | 1.41 % | 1,032.76
meni | m | eni | 810 | 1.09 % | 782.53 | 132 | 0.86 % | 572.61 | 405 | 1.81 % | 1,370.36 | 113 | 0.45 % | 319.98 | 160 | 1.41 % | 1,026.34
vsi | | vsi | 800 | 1.08 % | 772.87 | 175 | 1.15 % | 759.14 | 205 | 0.92 % | 693.64 | 314 | 1.26 % | 889.14 | 106 | 0.93 % | 679.95
vam | | vam | 798 | 1.08 % | 770.93 | 231 | 1.51 % | 1,002.07 | 26 | 0.12 % | 87.97 | 304 | 1.22 % | 860.83 | 237 | 2.08 % | 1,520.27
nam | | nam | 657 | 0.89 % | 634.72 | 166 | 1.09 % | 720.10 | 94 | 0.42 % | 318.06 | 315 | 1.26 % | 891.97 | 82 | 0.72 % | 526
vas | | vas | 606 | 0.82 % | 585.45 | 224 | 1.47 % | 971.70 | 61 | 0.27 % | 206.40 | 215 | 0.86 % | 608.81 | 106 | 0.93 % | 679.95
tole | t | ole | 597 | 0.81 % | 576.75 | 153 | 1.00 % | 663.71 | 98 | 0.44 % | 331.59 | 235 | 0.94 % | 665.44 | 111 | 0.97 % | 712.03
tak | | tak | 574 | 0.78 % | 554.53 | 107 | 0.70 % | 464.16 | 218 | 0.98 % | 737.63 | 135 | 0.54 % | 382.27 | 114 | 1.00 % | 731.27
kakšen | kak | šen | 563 | 0.76 % | 543.91 | 137 | 0.90 % | 594.30 | 133 | 0.60 % | 450.02 | 223 | 0.89 % | 631.46 | 70 | 0.61 % | 449.03
vsak | v | sak | 508 | 0.69 % | 490.77 | 117 | 0.77 % | 507.54 | 140 | 0.63 % | 473.71 | 166 | 0.67 % | 470.06 | 85 | 0.75 % | 545.25
kakšno | kak | šno | 506 | 0.68 % | 488.84 | 156 | 1.02 % | 676.72 | 73 | 0.33 % | 247 | 203 | 0.81 % | 574.83 | 74 | 0.65 % | 474.68
koliko | kol | iko | 490 | 0.66 % | 473.38 | 56 | 0.37 % | 242.93 | 220 | 0.98 % | 744.40 | 113 | 0.45 % | 319.98 | 101 | 0.89 % | 647.88
tej | | tej | 482 | 0.65 % | 465.65 | 110 | 0.72 % | 477.18 | 59 | 0.26 % | 199.63 | 259 | 1.04 % | 733.40 | 54 | 0.47 % | 346.39
neki | n | eki | 476 | 0.64 % | 459.86 | 59 | 0.39 % | 255.94 | 86 | 0.39 % | 290.99 | 232 | 0.93 % | 656.95 | 99 | 0.87 % | 635.05
mene | m | ene | 461 | 0.62 % | 445.36 | 89 | 0.58 % | 386.08 | 217 | 0.97 % | 734.24 | 81 | 0.32 % | 229.36 | 74 | 0.65 % | 474.68
ono | | ono | 443 | 0.60 % | 427.98 | 26 | 0.17 % | 112.79 | 339 | 1.52 % | 1,147.05 | 26 | 0.10 % | 73.62 | 52 | 0.46 % | 333.56

2.2.169 List of final character-level 4-grams from pronoun standardized forms in the GOS 1.0 corpus with text-type distribution
File at CLARIN.SI: GOS1.0-word_parts-pronouns-standardized_forms-final-4grams-taxonomy-entire.tsv

Standardized form | Rest of the word | Final part of the word | Total absolute frequency of standardized form | Percentage of all found standardized forms | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N]
nekaj | n | ekaj | 2,020 | 5.59 % | 1,951.49 | 376 | 4.89 % | 1,631.07 | 748 | 7.66 % | 2,530.94 | 575 | 4.28 % | 1,628.21 | 321 | 6.11 % | 2,059.10
tega | | tega | 1,978 | 5.47 % | 1,910.91 | 301 | 3.91 % | 1,305.73 | 382 | 3.91 % | 1,292.54 | 930 | 6.92 % | 2,633.45 | 365 | 6.94 % | 2,341.35
kako | | kako | 1,853 | 5.13 % | 1,790.15 | 400 | 5.20 % | 1,735.18 | 676 | 6.92 % | 2,287.32 | 513 | 3.82 % | 1,452.64 | 264 | 5.02 % | 1,693.47
tisto | t | isto | 1,156 | 3.20 % | 1,116.79
132 | 1.72 % | 572.61 | 566 | 5.80 % | 1,915.13 | 278 | 2.07 % | 787.20 | 180 | 3.42 % | 1,154.64
tisti | t | isti | 947 | 2.62 % | 914.88 | 168 | 2.18 % | 728.78 | 352 | 3.60 % | 1,191.03 | 314 | 2.34 % | 889.14 | 113 | 2.15 % | 724.86
meni | | meni | 810 | 2.24 % | 782.53 | 132 | 1.72 % | 572.61 | 405 | 4.15 % | 1,370.36 | 113 | 0.84 % | 319.98 | 160 | 3.04 % | 1,026.34
tole | | tole | 597 | 1.65 % | 576.75 | 153 | 1.99 % | 663.71 | 98 | 1.00 % | 331.59 | 235 | 1.75 % | 665.44 | 111 | 2.11 % | 712.03
kakšen | ka | kšen | 563 | 1.56 % | 543.91 | 137 | 1.78 % | 594.30 | 133 | 1.36 % | 450.02 | 223 | 1.66 % | 631.46 | 70 | 1.33 % | 449.03
vsak | | vsak | 508 | 1.41 % | 490.77 | 117 | 1.52 % | 507.54 | 140 | 1.43 % | 473.71 | 166 | 1.24 % | 470.06 | 85 | 1.62 % | 545.25
kakšno | ka | kšno | 506 | 1.40 % | 488.84 | 156 | 2.03 % | 676.72 | 73 | 0.75 % | 247 | 203 | 1.51 % | 574.83 | 74 | 1.41 % | 474.68
koliko | ko | liko | 490 | 1.36 % | 473.38 | 56 | 0.73 % | 242.93 | 220 | 2.25 % | 744.40 | 113 | 0.84 % | 319.98 | 101 | 1.92 % | 647.88
neki | | neki | 476 | 1.32 % | 459.86 | 59 | 0.77 % | 255.94 | 86 | 0.88 % | 290.99 | 232 | 1.73 % | 656.95 | 99 | 1.88 % | 635.05
mene | | mene | 461 | 1.27 % | 445.36 | 89 | 1.16 % | 386.08 | 217 | 2.22 % | 734.24 | 81 | 0.60 % | 229.36 | 74 | 1.41 % | 474.68
take | | take | 437 | 1.21 % | 422.18 | 78 | 1.01 % | 338.36 | 176 | 1.80 % | 595.52 | 101 | 0.75 % | 286 | 82 | 1.56 % | 526
kateri | ka | teri | 414 | 1.15 % | 399.96 | 104 | 1.35 % | 451.15 | 99 | 1.01 % | 334.98 | 170 | 1.27 % | 481.38 | 41 | 0.78 % | 263
moje | | moje | 396 | 1.09 % | 382.57 | 72 | 0.94 % | 312.33 | 193 | 1.98 % | 653.04 | 73 | 0.54 % | 206.71 | 58 | 1.10 % | 372.05
nekako | ne | kako | 391 | 1.08 % | 377.74 | 73 | 0.95 % | 316.67 | 68 | 0.70 % | 230.09 | 176 | 1.31 % | 498.37 | 74 | 1.41 % | 474.68
tale | | tale | 391 | 1.08 % | 377.74 | 93 | 1.21 % | 403.43 | 77 | 0.79 % | 260.54 | 158 | 1.18 % | 447.40 | 63 | 1.20 % | 404.12
taka | | taka | 390 | 1.08 % | 376.77 | 76 | 0.99 % | 329.69 | 140 | 1.43 % | 473.71 | 100 | 0.74 % | 283.17 | 74 | 1.41 % | 474.68
kakšna | ka | kšna | 386 | 1.07 % | 372.91 | 76 | 0.99 % | 329.69 | 72 | 0.74 % | 243.62 | 195 | 1.45 % | 552.17 | 43 | 0.82 % | 275.83
tiste | t | iste | 386 | 1.07 % | 372.91 | 75 | 0.97 % | 325.35 | 140 | 1.43 % | 473.71 | 107 | 0.80 % | 302.99 | 64 | 1.22 % | 410.54
svoje | s | voje | 382 | 1.06 % | 369.04 | 74 | 0.96 % | 321.01 | 55 | 0.56 % | 186.10 | 218 | 1.62 % | 617.30 | 35 | 0.67 % | 224.51
kakšne | ka | kšne | 381 | 1.05 % | 368.08 | 87 | 1.13 % | 377.40 | 80 | 0.82 % | 270.69 | 147 | 1.09 % | 416.25 | 67 | 1.27 % | 429.78
neke | | neke | 360 | 1.00 % | 347.79 | 37 | 0.48 % | 160.50 | 69 | 0.71 % | 233.47 | 162 | 1.21 % | 458.73 | 92 | 1.75 % | 590.15
toliko | to | liko | 359 | 0.99 % | 346.82 | 79 | 1.03 % | 342.70 | 127 | 1.30 % | 429.72 | 81 | 0.60 % | 229.36 | 72 | 1.37 % | 461.86
temu | | temu | 348 | 0.96 % | 336.20 | 79 | 1.03 % | 342.70 | 41 | 0.42 % | 138.73 | 189 | 1.41 % | 535.18 | 39 | 0.74 % | 250.17
neko | | neko | 341 | 0.94 % | 329.43 | 32 | 0.42 % | 138.81 | 69 | 0.71 % | 233.47 | 173 | 1.29 % | 489.88 | 67 | 1.27 % | 429.78
takega | ta | kega | 324 | 0.90 % | 313.01 | 46 | 0.60 % | 199.55 | 139 | 1.42 % | 470.32 | 60 | 0.45 % | 169.90 | 79 | 1.50 % | 506.76
tako | | tako | 321 | 0.89 % | 310.11 | 53 | 0.69 % | 229.91 | 112 | 1.15 % | 378.96 | 90 | 0.67 % | 254.85 | 66 | 1.25 % | 423.37
vseh | | vseh | 321 | 0.89 % | 310.11 | 75 | 0.97 % | 325.35 | 39 | 0.40 % | 131.96 | 175 | 1.30 % | 495.54 | 32 | 0.61 % | 205.27
tista | t | ista | 298 | 0.82 % | 287.89 | 54 | 0.70 % | 234.25 | 113 | 1.16 % | 382.35 | 87 | 0.65 % | 246.35 | 44 | 0.84 % | 282.24
isto | | isto | 290 | 0.80 % | 280.16 | 29 | 0.38 % | 125.80 | 125 | 1.28 % | 422.95 | 88 | 0.66 % | 249.19 | 48 | 0.91 % | 307.90

2.2.170 List of final character-level 5-grams from pronoun standardized forms in the GOS 1.0 corpus with text-type distribution
File at CLARIN.SI: GOS1.0-word_parts-pronouns-standardized_forms-final-5grams-taxonomy-entire.tsv

Standardized form | Rest of the word | Final part of the word | Total absolute frequency of standardized form | Percentage of all found standardized forms | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N]
nekaj | | nekaj | 2,020 | 10.17 % | 1,951.49 | 376 | 8.96 % | 1,631.07 | 748 | 14.21 % | 2,530.94 | 575 | 7.48 % | 1,628.21 | 321 | 11.81 % | 2,059.10
tisto | | tisto | 1,156 | 5.82 % | 1,116.79 | 132 | 3.15 % | 572.61 | 566 | 10.75 % | 1,915.13 | 278 | 3.62 % | 787.20 | 180 | 6.62 % | 1,154.64
tisti | | tisti | 947 | 4.77 % | 914.88 | 168 | 4.00 % | 728.78 | 352 | 6.69 %
1,191.03 314 4.09 % 889.14 113 4.16 % 724.86 kakšen k akšen 563 2.83 % 543.91 137 3.27 % 594.30 133 2.53 % 450.02 223 2.90 % 631.46 70 2.58 % 449.03 kakšno k akšno 506 2.55 % 488.84 156 3.72 % 676.72 73 1.39 % 247 203 2.64 % 574.83 74 2.72 % 474.68 koliko k oliko 490 2.47 % 473.38 56 1.33 % 242.93 220 4.18 % 744.40 113 1.47 % 319.98 101 3.72 % 647.88 kateri k ateri 414 2.08 % 399.96 104 2.48 % 451.15 99 1.88 % 334.98 170 2.21 % 481.38 41 1.51 % 263 nekako n ekako 391 1.97 % 377.74 73 1.74 % 316.67 68 1.29 % 230.09 176 2.29 % 498.37 74 2.72 % 474.68 kakšna k akšna 386 1.94 % 372.91 76 1.81 % 329.69 72 1.37 % 243.62 195 2.54 % 552.17 43 1.58 % 275.83 tiste tiste 386 1.94 % 372.91 75 1.79 % 325.35 140 2.66 % 473.71 107 1.39 % 302.99 64 2.35 % 410.54 svoje svoje 382 1.92 % 369.04 74 1.76 % 321.01 55 1.04 % 186.10 218 2.84 % 617.30 35 1.29 % 224.51 kakšne k akšne 381 1.92 % 368.08 87 2.07 % 377.40 80 1.52 % 270.69 147 1.91 % 416.25 67 2.46 % 429.78 toliko t oliko 359 1.81 % 346.82 79 1.88 % 342.70 127 2.41 % 429.72 81 1.05 % 229.36 72 2.65 % 461.86 takega t akega 324 1.63 % 313.01 46 1.10 % 199.55 139 2.64 % 470.32 60 0.78 % 169.90 79 2.91 % 506.76 tista tista 298 1.50 % 287.89 54 1.29 % 234.25 113 2.15 % 382.35 87 1.13 % 246.35 44 1.62 % 282.24 takole t akole 241 1.21 % 232.83 75 1.79 % 325.35 60 1.14 % 203.02 79 1.03 % 223.70 27 0.99 % 173.20 svojo svojo 232 1.17 % 224.13 60 1.43 % 260.28 29 0.55 % 98.12 127 1.65 % 359.62 16 0.59 % 102.63 midva midva 231 1.16 % 223.17 98 2.34 % 425.12 88 1.67 % 297.76 20 0.26 % 56.63 25 0.92 % 160.37 nekdo nekdo 208 1.05 % 200.95 35 0.83 % 151.83 49 0.93 % 165.80 94 1.22 % 266.18 30 1.10 % 192.44 mojem mojem 206 1.04 % 199.01 49 1.17 % 212.56 64 1.22 % 216.55 48 0.62 % 135.92 45 1.66 % 288.66 katera k atera 199 1.00 % 192.25 38 0.91 % 164.84 50 0.95 % 169.18 93 1.21 % 263.34 18 0.66 % 115.46 katero k atero 196 0.99 % 189.35 43 1.02 % 186.53 48 0.91 % 162.41 88 1.15 % 249.19 17 0.62 % 109.05 katere k atere 195 0.98 % 188.39 28 0.67 % 
121.46 | 17 | 0.32 % | 57.52 | 125 | 1.63 % | 353.96 | 25 | 0.92 % | 160.37
njega | | njega | 195 | 0.98 % | 188.39 | 41 | 0.98 % | 177.86 | 66 | 1.25 % | 223.32 | 64 | 0.83 % | 181.23 | 24 | 0.88 % | 153.95
tistega | ti | stega | 179 | 0.90 % | 172.93 | 28 | 0.67 % | 121.46 | 58 | 1.10 % | 196.25 | 62 | 0.81 % | 175.56 | 31 | 1.14 % | 198.85
tistih | t | istih | 178 | 0.90 % | 171.96 | 32 | 0.76 % | 138.81 | 44 | 0.84 % | 148.88 | 82 | 1.07 % | 232.20 | 20 | 0.74 % | 128.29
vsako | | vsako | 173 | 0.87 % | 167.13 | 38 | 0.91 % | 164.84 | 60 | 1.14 % | 203.02 | 65 | 0.85 % | 184.06 | 10 | 0.37 % | 64.15
kakšni | k | akšni | 147 | 0.74 % | 142.01 | 30 | 0.71 % | 130.14 | 30 | 0.57 % | 101.51 | 69 | 0.90 % | 195.38 | 18 | 0.66 % | 115.46
katerega | kat | erega | 146 | 0.73 % | 141.05 | 31 | 0.74 % | 134.48 | 37 | 0.70 % | 125.19 | 61 | 0.79 % | 172.73 | 17 | 0.62 % | 109.05
njemu | | njemu | 146 | 0.73 % | 141.05 | 23 | 0.55 % | 99.77 | 75 | 1.43 % | 253.77 | 28 | 0.36 % | 79.29 | 20 | 0.74 % | 128.29
njimi | | njimi | 143 | 0.72 % | 138.15 | 26 | 0.62 % | 112.79 | 31 | 0.59 % | 104.89 | 70 | 0.91 % | 198.22 | 16 | 0.59 % | 102.63
takih | | takih | 137 | 0.69 % | 132.35 | 26 | 0.62 % | 112.79 | 33 | 0.63 % | 111.66 | 56 | 0.73 % | 158.57 | 22 | 0.81 % | 141.12

2.2.171 List of initial character-level 1-grams from pronoun lower-case word forms in the GOS 1.0 corpus with text-type distribution
File at CLARIN.SI: GOS1.0-word_parts-pronouns-lowercase_forms-initial-1grams-taxonomy-entire.tsv

Lower-case word form | Initial part of the word | Rest of the word | Total absolute frequency of lower-case word form | Percentage of all found lower-case word forms | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N]
to | t | o | 17,178 | 12.35 % | 16,595.39 | 3,440 | 11.95 % | 14,922.59 | 4,556 | 10.72 % | 15,415.74 | 5,429 | 11.98 % | 15,373.11 | 3,753 | 16.63 % | 24,074.20
se | s | e | 15,760 | 11.33 % | 15,225.48 | 3,278 | 11.39 % | 14,219.84 | 3,780 | 8.89 % | 12,790.06 | 6,039 | 13.33 % | 17,100.43 | 2,663 | 11.80 % | 17,082.23
kaj | k | aj | 5,528 | 3.97 % | 5,340.51 | 1,189 | 4.13 % | 5,157.84 | 1,388 | 3.27 % | 4,696.46 | 2,029 | 4.48 % | 5,745.45 | 922 | 4.09 % | 5,914.31
je | j | e | 5,187 | 3.73 % | 5,011.08 | 1,060 | 3.68 % | 4,598.24 | 1,724 | 4.06 % | 5,833.35 | 1,675 | 3.70 % | 4,743.04 | 728 | 3.23 % | 4,669.87
ti | t | i | 4,277 | 3.07 % | 4,131.94 | 1,075 | 3.73 % | 4,663.31 | 1,930 | 4.54 % | 6,530.37 | 612 | 1.35 % | 1,732.98 | 660 | 2.92 % | 4,233.67
mi | m | i | 4,009 | 2.88 % | 3,873.03 | 679 | 2.36 % | 2,945.48 | 1,417 | 3.33 % | 4,794.58 | 1,248 | 2.75 % | 3,533.92 | 665 | 2.95 % | 4,265.75
ta | t | a | 3,549 | 2.55 % | 3,428.63 | 627 | 2.18 % | 2,719.90 | 888 | 2.09 % | 3,004.65 | 1,375 | 3.04 % | 3,893.54 | 659 | 2.92 % | 4,227.26
jz | j | z | 3,351 | 2.41 % | 3,237.35 | 628 | 2.18 % | 2,724.24 | 1,092 | 2.57 % | 3,694.91 | 918 | 2.03 % | 2,599.47 | 713 | 3.16 % | 4,573.65
te | t | e | 2,922 | 2.10 % | 2,822.90 | 771 | 2.68 % | 3,344.57 | 864 | 2.03 % | 2,923.44 | 804 | 1.77 % | 2,276.66 | 483 | 2.14 % | 3,098.28
ga | g | a | 2,631 | 1.89 % | 2,541.77 | 517 | 1.80 % | 2,242.73 | 963 | 2.27 % | 3,258.42 | 808 | 1.78 % | 2,287.99 | 343 | 1.52 % | 2,200.23
vse | v | se | 2,501 | 1.80 % | 2,416.18 | 579 | 2.01 % | 2,511.68 | 822 | 1.93 % | 2,781.33 | 661 | 1.46 % | 1,871.73 | 439 | 1.95 % | 2,816.03
tem | t | em | 2,081 | 1.50 % | 2,010.42 | 390 | 1.35 % | 1,691.81 | 226 | 0.53 % | 764.70 | 1,090 | 2.41 % | 3,086.52 | 375 | 1.66 % | 2,405.50
jih | j | ih | 1,913 | 1.38 % | 1,848.12 | 322 | 1.12 % | 1,396.82 | 503 | 1.18 % | 1,701.96 | 800 | 1.77 % | 2,265.33 | 288 | 1.28 % | 1,847.42
tega | t | ega | 1,893 | 1.36 % | 1,828.80 | 279 | 0.97 % | 1,210.29 | 328 | 0.77 % | 1,109.83 | 927 | 2.05 % | 2,624.95 | 359 | 1.59 % | 2,302.86
si | s | i | 1,727 | 1.24 % | 1,668.43 | 444 | 1.54 % | 1,926.06 | 534 | 1.26 % | 1,806.85 | 515 | 1.14 % | 1,458.31 | 234 | 1.04 % | 1,501.03
kar | k | ar | 1,571 | 1.13 % | 1,517.72 | 331 | 1.15 % | 1,435.87 | 278 | 0.65 % | 940.64 | 740 | 1.63 % | 2,095.43 | 222 | 0.98 % | 1,424.05
jaz | j | az | 1,541 | 1.11 % | 1,488.73 | 339 | 1.18 % | 1,470.57 | 618 | 1.45 % | 2,091.07 | 330 | 0.73 % | 934.45 | 254 | 1.13 % | 1,629.32
kej | k | ej | 1,313 | 0.94 % | 1,268.47 | 217 | 0.75 % | 941.34 | 630 | 1.48 % | 2,131.68 | 230 | 0.51 % | 651.28 | 236 | 1.05 % | 1,513.86
jo | j | o | 1,298 | 0.93 % | 1,253.98 | 274 | 0.95 % | 1,188.60 | 381 | 0.90 % | 1,289.16 | 511 | 1.13 % | 1,446.98 | 132 | 0.58 % | 846.73
ka | k | a | 1,294 | 0.93 % | 1,250.11 | 197 | 0.68 %
854.58 | 825 | 1.94 % | 2,791.48 | 117 | 0.26 % | 331.30 | 155 | 0.69 % | 994.27
on | o | n | 1,156 | 0.83 % | 1,116.79 | 234 | 0.81 % | 1,015.08 | 539 | 1.27 % | 1,823.77 | 210 | 0.46 % | 594.65 | 173 | 0.77 % | 1,109.74
me | m | e | 1,140 | 0.82 % | 1,101.34 | 320 | 1.11 % | 1,388.15 | 442 | 1.04 % | 1,495.56 | 230 | 0.51 % | 651.28 | 148 | 0.66 % | 949.37
kr | k | r | 1,128 | 0.81 % | 1,089.74 | 269 | 0.93 % | 1,166.91 | 352 | 0.83 % | 1,191.03 | 283 | 0.62 % | 801.36 | 224 | 0.99 % | 1,436.88
kako | k | ako | 973 | 0.70 % | 940 | 240 | 0.83 % | 1,041.11 | 156 | 0.37 % | 527.84 | 436 | 0.96 % | 1,234.61 | 141 | 0.62 % | 904.47
neki | n | eki | 956 | 0.69 % | 923.58 | 117 | 0.41 % | 507.54 | 467 | 1.10 % | 1,580.15 | 195 | 0.43 % | 552.17 | 177 | 0.78 % | 1,135.39
vi | v | i | 919 | 0.66 % | 887.83 | 202 | 0.70 % | 876.27 | 100 | 0.23 % | 338.36 | 380 | 0.84 % | 1,076.03 | 237 | 1.05 % | 1,520.27
kdo | k | do | 907 | 0.65 % | 876.24 | 245 | 0.85 % | 1,062.80 | 162 | 0.38 % | 548.15 | 364 | 0.80 % | 1,030.73 | 136 | 0.60 % | 872.39
nič | n | ič | 857 | 0.62 % | 827.93 | 256 | 0.89 % | 1,110.52 | 189 | 0.45 % | 639.50 | 297 | 0.66 % | 841 | 115 | 0.51 % | 737.69
teh | t | eh | 805 | 0.58 % | 777.70 | 143 | 0.50 % | 620.33 | 79 | 0.19 % | 267.31 | 427 | 0.94 % | 1,209.12 | 156 | 0.69 % | 1,000.69
nas | n | as | 767 | 0.55 % | 740.99 | 185 | 0.64 % | 802.52 | 226 | 0.53 % | 764.70 | 281 | 0.62 % | 795.70 | 75 | 0.33 % | 481.10
vsi | v | si | 745 | 0.54 % | 719.73 | 168 | 0.58 % | 728.78 | 177 | 0.42 % | 598.90 | 298 | 0.66 % | 843.84 | 102 | 0.45 % | 654.29
vam | v | am | 731 | 0.53 % | 706.21 | 207 | 0.72 % | 897.96 | 21 | 0.05 % | 71.06 | 271 | 0.60 % | 767.38 | 232 | 1.03 % | 1,488.20

2.2.172 List of initial character-level 2-grams from pronoun lower-case word forms in the GOS 1.0 corpus with text-type distribution
File at CLARIN.SI: GOS1.0-word_parts-pronouns-lowercase_forms-initial-2grams-taxonomy-entire.tsv

Lower-case word form | Initial part of the word | Rest of the word | Total absolute frequency of lower-case word form | Percentage of all found lower-case word forms | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N]
to | to | | 17,178 | 12.37 % | 16,595.39 | 3,440 | 11.96 % | 14,922.59 | 4,556 | 10.78 % | 15,415.74 | 5,429 | 12.00 % | 15,373.11 | 3,753 | 16.65 % | 24,074.20
se | se | | 15,760 | 11.35 % | 15,225.48 | 3,278 | 11.39 % | 14,219.84 | 3,780 | 8.94 % | 12,790.06 | 6,039 | 13.35 % | 17,100.43 | 2,663 | 11.82 % | 17,082.23
kaj | ka | j | 5,528 | 3.98 % | 5,340.51 | 1,189 | 4.13 % | 5,157.84 | 1,388 | 3.28 % | 4,696.46 | 2,029 | 4.48 % | 5,745.45 | 922 | 4.09 % | 5,914.31
je | je | | 5,187 | 3.74 % | 5,011.08 | 1,060 | 3.69 % | 4,598.24 | 1,724 | 4.08 % | 5,833.35 | 1,675 | 3.70 % | 4,743.04 | 728 | 3.23 % | 4,669.87
ti | ti | | 4,277 | 3.08 % | 4,131.94 | 1,075 | 3.74 % | 4,663.31 | 1,930 | 4.57 % | 6,530.37 | 612 | 1.35 % | 1,732.98 | 660 | 2.93 % | 4,233.67
mi | mi | | 4,009 | 2.89 % | 3,873.03 | 679 | 2.36 % | 2,945.48 | 1,417 | 3.35 % | 4,794.58 | 1,248 | 2.76 % | 3,533.92 | 665 | 2.95 % | 4,265.75
ta | ta | | 3,549 | 2.56 % | 3,428.63 | 627 | 2.18 % | 2,719.90 | 888 | 2.10 % | 3,004.65 | 1,375 | 3.04 % | 3,893.54 | 659 | 2.92 % | 4,227.26
jz | jz | | 3,351 | 2.41 % | 3,237.35 | 628 | 2.18 % | 2,724.24 | 1,092 | 2.58 % | 3,694.91 | 918 | 2.03 % | 2,599.47 | 713 | 3.16 % | 4,573.65
te | te | | 2,922 | 2.10 % | 2,822.90 | 771 | 2.68 % | 3,344.57 | 864 | 2.04 % | 2,923.44 | 804 | 1.78 % | 2,276.66 | 483 | 2.14 % | 3,098.28
ga | ga | | 2,631 | 1.90 % | 2,541.77 | 517 | 1.80 % | 2,242.73 | 963 | 2.28 % | 3,258.42 | 808 | 1.79 % | 2,287.99 | 343 | 1.52 % | 2,200.23
vse | vs | e | 2,501 | 1.80 % | 2,416.18 | 579 | 2.01 % | 2,511.68 | 822 | 1.95 % | 2,781.33 | 661 | 1.46 % | 1,871.73 | 439 | 1.95 % | 2,816.03
tem | te | m | 2,081 | 1.50 % | 2,010.42 | 390 | 1.36 % | 1,691.81 | 226 | 0.54 % | 764.70 | 1,090 | 2.41 % | 3,086.52 | 375 | 1.66 % | 2,405.50
jih | ji | h | 1,913 | 1.38 % | 1,848.12 | 322 | 1.12 % | 1,396.82 | 503 | 1.19 % | 1,701.96 | 800 | 1.77 % | 2,265.33 | 288 | 1.28 % | 1,847.42
tega | te | ga | 1,893 | 1.36 % | 1,828.80 | 279 | 0.97 % | 1,210.29 | 328 | 0.78 % | 1,109.83 | 927 | 2.05 % | 2,624.95 | 359 | 1.59 % | 2,302.86
si | si | | 1,727 | 1.24 % | 1,668.43 | 444 | 1.54 % | 1,926.06 | 534 | 1.26 % | 1,806.85 | 515 | 1.14 % | 1,458.31 | 234 | 1.04 % | 1,501.03
kar | ka | r | 1,571 | 1.13 % | 1,517.72 | 331 | 1.15 % | 1,435.87 | 278 | 0.66 % | 940.64 | 740 | 1.64 % | 2,095.43 | 222 | 0.98 % | 1,424.05
jaz | ja | z | 1,541 | 1.11 % | 1,488.73 | 339 | 1.18 %
1,470.57 | 618 | 1.46 % | 2,091.07 | 330 | 0.73 % | 934.45 | 254 | 1.13 % | 1,629.32
kej | ke | j | 1,313 | 0.95 % | 1,268.47 | 217 | 0.75 % | 941.34 | 630 | 1.49 % | 2,131.68 | 230 | 0.51 % | 651.28 | 236 | 1.05 % | 1,513.86
jo | jo | | 1,298 | 0.94 % | 1,253.98 | 274 | 0.95 % | 1,188.60 | 381 | 0.90 % | 1,289.16 | 511 | 1.13 % | 1,446.98 | 132 | 0.59 % | 846.73
ka | ka | | 1,294 | 0.93 % | 1,250.11 | 197 | 0.69 % | 854.58 | 825 | 1.95 % | 2,791.48 | 117 | 0.26 % | 331.30 | 155 | 0.69 % | 994.27
on | on | | 1,156 | 0.83 % | 1,116.79 | 234 | 0.81 % | 1,015.08 | 539 | 1.27 % | 1,823.77 | 210 | 0.46 % | 594.65 | 173 | 0.77 % | 1,109.74
me | me | | 1,140 | 0.82 % | 1,101.34 | 320 | 1.11 % | 1,388.15 | 442 | 1.05 % | 1,495.56 | 230 | 0.51 % | 651.28 | 148 | 0.66 % | 949.37
kr | kr | | 1,128 | 0.81 % | 1,089.74 | 269 | 0.94 % | 1,166.91 | 352 | 0.83 % | 1,191.03 | 283 | 0.62 % | 801.36 | 224 | 0.99 % | 1,436.88
kako | ka | ko | 973 | 0.70 % | 940 | 240 | 0.83 % | 1,041.11 | 156 | 0.37 % | 527.84 | 436 | 0.96 % | 1,234.61 | 141 | 0.63 % | 904.47
neki | ne | ki | 956 | 0.69 % | 923.58 | 117 | 0.41 % | 507.54 | 467 | 1.10 % | 1,580.15 | 195 | 0.43 % | 552.17 | 177 | 0.79 % | 1,135.39
vi | vi | | 919 | 0.66 % | 887.83 | 202 | 0.70 % | 876.27 | 100 | 0.24 % | 338.36 | 380 | 0.84 % | 1,076.03 | 237 | 1.05 % | 1,520.27
kdo | kd | o | 907 | 0.65 % | 876.24 | 245 | 0.85 % | 1,062.80 | 162 | 0.38 % | 548.15 | 364 | 0.80 % | 1,030.73 | 136 | 0.60 % | 872.39
nič | ni | č | 857 | 0.62 % | 827.93 | 256 | 0.89 % | 1,110.52 | 189 | 0.45 % | 639.50 | 297 | 0.66 % | 841 | 115 | 0.51 % | 737.69
teh | te | h | 805 | 0.58 % | 777.70 | 143 | 0.50 % | 620.33 | 79 | 0.19 % | 267.31 | 427 | 0.94 % | 1,209.12 | 156 | 0.69 % | 1,000.69
nas | na | s | 767 | 0.55 % | 740.99 | 185 | 0.64 % | 802.52 | 226 | 0.54 % | 764.70 | 281 | 0.62 % | 795.70 | 75 | 0.33 % | 481.10
vsi | vs | i | 745 | 0.54 % | 719.73 | 168 | 0.58 % | 728.78 | 177 | 0.42 % | 598.90 | 298 | 0.66 % | 843.84 | 102 | 0.45 % | 654.29
vam | va | m | 731 | 0.53 % | 706.21 | 207 | 0.72 % | 897.96 | 21 | 0.05 % | 71.06 | 271 | 0.60 % | 767.38 | 232 | 1.03 % | 1,488.20

2.2.173 List of initial character-level 3-grams from pronoun lower-case word forms in the GOS 1.0 corpus with text-type distribution
File at CLARIN.SI: GOS1.0-word_parts-pronouns-lowercase_forms-initial-3grams-taxonomy-entire.tsv

Lower-case word form | Initial part of the word | Rest of the word | Total absolute frequency of lower-case word form | Percentage of all found lower-case word forms | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N]
kaj | kaj | | 5,528 | 8.22 % | 5,340.51 | 1,189 | 8.46 % | 5,157.84 | 1,388 | 7.09 % | 4,696.46 | 2,029 | 8.66 % | 5,745.45 | 922 | 9.05 % | 5,914.31
vse | vse | | 2,501 | 3.72 % | 2,416.18 | 579 | 4.12 % | 2,511.68 | 822 | 4.20 % | 2,781.33 | 661 | 2.82 % | 1,871.73 | 439 | 4.31 % | 2,816.03
tem | tem | | 2,081 | 3.10 % | 2,010.42 | 390 | 2.77 % | 1,691.81 | 226 | 1.15 % | 764.70 | 1,090 | 4.66 % | 3,086.52 | 375 | 3.68 % | 2,405.50
jih | jih | | 1,913 | 2.85 % | 1,848.12 | 322 | 2.29 % | 1,396.82 | 503 | 2.57 % | 1,701.96 | 800 | 3.42 % | 2,265.33 | 288 | 2.83 % | 1,847.42
tega | teg | a | 1,893 | 2.81 % | 1,828.80 | 279 | 1.98 % | 1,210.29 | 328 | 1.68 % | 1,109.83 | 927 | 3.96 % | 2,624.95 | 359 | 3.53 % | 2,302.86
kar | kar | | 1,571 | 2.34 % | 1,517.72 | 331 | 2.35 % | 1,435.87 | 278 | 1.42 % | 940.64 | 740 | 3.16 % | 2,095.43 | 222 | 2.18 % | 1,424.05
jaz | jaz | | 1,541 | 2.29 % | 1,488.73 | 339 | 2.41 % | 1,470.57 | 618 | 3.16 % | 2,091.07 | 330 | 1.41 % | 934.45 | 254 | 2.50 % | 1,629.32
kej | kej | | 1,313 | 1.95 % | 1,268.47 | 217 | 1.54 % | 941.34 | 630 | 3.22 % | 2,131.68 | 230 | 0.98 % | 651.28 | 236 | 2.32 % | 1,513.86
kako | kak | o | 973 | 1.45 % | 940 | 240 | 1.71 % | 1,041.11 | 156 | 0.80 % | 527.84 | 436 | 1.86 % | 1,234.61 | 141 | 1.39 % | 904.47
neki | nek | i | 956 | 1.42 % | 923.58 | 117 | 0.83 % | 507.54 | 467 | 2.38 % | 1,580.15 | 195 | 0.83 % | 552.17 | 177 | 1.74 % | 1,135.39
kdo | kdo | | 907 | 1.35 % | 876.24 | 245 | 1.74 % | 1,062.80 | 162 | 0.83 % | 548.15 | 364 | 1.55 % | 1,030.73 | 136 | 1.34 % | 872.39
nič | nič | | 857 | 1.27 % | 827.93 | 256 | 1.82 % | 1,110.52 | 189 | 0.96 % | 639.50 | 297 | 1.27 % | 841 | 115 | 1.13 % | 737.69
teh | teh | | 805 | 1.20 % | 777.70 | 143 | 1.02 % | 620.33 | 79 | 0.40 % | 267.31 | 427 | 1.82 % | 1,209.12 | 156 | 1.53 % | 1,000.69
nas | nas | | 767 | 1.14 % | 740.99 | 185 | 1.32 % | 802.52 | 226 | 1.15 % | 764.70 | 281 | 1.20 % | 795.70 | 75 | 0.74 % | 481.10
vsi | vsi | | 745 | 1.11 % | 719.73 | 168 | 1.20 % | 728.78
177 | 0.90 % | 598.90 | 298 | 1.27 % | 843.84 | 102 | 1.00 % | 654.29
vam | vam | | 731 | 1.09 % | 706.21 | 207 | 1.47 % | 897.96 | 21 | 0.11 % | 71.06 | 271 | 1.16 % | 767.38 | 232 | 2.28 % | 1,488.20
tisto | tis | to | 698 | 1.04 % | 674.33 | 94 | 0.67 % | 407.77 | 257 | 1.31 % | 869.59 | 227 | 0.97 % | 642.79 | 120 | 1.18 % | 769.76
ona | ona | | 657 | 0.98 % | 634.72 | 124 | 0.88 % | 537.91 | 371 | 1.90 % | 1,255.32 | 65 | 0.28 % | 184.06 | 97 | 0.95 % | 622.22
nekaj | nek | aj | 648 | 0.96 % | 626.02 | 193 | 1.37 % | 837.23 | 105 | 0.54 % | 355.28 | 265 | 1.13 % | 750.39 | 85 | 0.83 % | 545.25
tisti | tis | ti | 618 | 0.92 % | 597.04 | 129 | 0.92 % | 559.60 | 152 | 0.78 % | 514.31 | 260 | 1.11 % | 736.23 | 77 | 0.76 % | 493.93
tak | tak | | 599 | 0.89 % | 578.68 | 109 | 0.78 % | 472.84 | 235 | 1.20 % | 795.15 | 132 | 0.56 % | 373.78 | 123 | 1.21 % | 789
nam | nam | | 595 | 0.89 % | 574.82 | 139 | 0.99 % | 602.98 | 79 | 0.40 % | 267.31 | 298 | 1.27 % | 843.84 | 79 | 0.78 % | 506.76
kak | kak | | 588 | 0.87 % | 568.06 | 126 | 0.90 % | 546.58 | 297 | 1.52 % | 1,004.93 | 54 | 0.23 % | 152.91 | 111 | 1.09 % | 712.03
tole | tol | e | 571 | 0.85 % | 551.63 | 151 | 1.07 % | 655.03 | 78 | 0.40 % | 263.92 | 232 | 0.99 % | 656.95 | 110 | 1.08 % | 705.61
vas | vas | | 535 | 0.80 % | 516.85 | 210 | 1.49 % | 910.97 | 46 | 0.23 % | 155.65 | 178 | 0.76 % | 504.04 | 101 | 0.99 % | 647.88
tej | tej | | 534 | 0.79 % | 515.89 | 116 | 0.82 % | 503.20 | 109 | 0.56 % | 368.81 | 256 | 1.09 % | 724.91 | 53 | 0.52 % | 339.98
oni | oni | | 508 | 0.76 % | 490.77 | 73 | 0.52 % | 316.67 | 239 | 1.22 % | 808.68 | 95 | 0.41 % | 269.01 | 101 | 0.99 % | 647.88
men | men | | 503 | 0.75 % | 485.94 | 60 | 0.43 % | 260.28 | 272 | 1.39 % | 920.34 | 53 | 0.23 % | 150.08 | 118 | 1.16 % | 756.93
tist | tis | t | 497 | 0.74 % | 480.14 | 30 | 0.21 % | 130.14 | 311 | 1.59 % | 1,052.30 | 85 | 0.36 % | 240.69 | 71 | 0.70 % | 455.44
vsak | vsa | k | 471 | 0.70 % | 455.03 | 118 | 0.84 % | 511.88 | 120 | 0.61 % | 406.03 | 154 | 0.66 % | 436.08 | 79 | 0.78 % | 506.76
mene | men | e | 448 | 0.67 % | 432.81 | 89 | 0.63 % | 386.08 | 205 | 1.05 % | 693.64 | 81 | 0.35 % | 229.36 | 73 | 0.72 % | 468.27
jst | jst | | 424 | 0.63 % | 409.62 | 33 | 0.23 % | 143.15 | 159 | 0.81 % | 537.99 | 134 | 0.57 % | 379.44 | 98 | 0.96 % | 628.64

2.2.174 List of initial character-level 4-grams from pronoun lower-case word forms in the GOS 1.0 corpus with text-type distribution
File at CLARIN.SI: GOS1.0-word_parts-pronouns-lowercase_forms-initial-4grams-taxonomy-entire.tsv

Lower-case word form | Initial part of the word | Rest of the word | Total absolute frequency of lower-case word form | Percentage of all found lower-case word forms | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N]
tega | tega | | 1,893 | 5.47 % | 1,828.80 | 279 | 3.72 % | 1,210.29 | 328 | 3.62 % | 1,109.83 | 927 | 7.07 % | 2,624.95 | 359 | 7.28 % | 2,302.86
kako | kako | | 973 | 2.81 % | 940 | 240 | 3.20 % | 1,041.11 | 156 | 1.72 % | 527.84 | 436 | 3.33 % | 1,234.61 | 141 | 2.86 % | 904.47
neki | neki | | 956 | 2.76 % | 923.58 | 117 | 1.56 % | 507.54 | 467 | 5.15 % | 1,580.15 | 195 | 1.49 % | 552.17 | 177 | 3.59 % | 1,135.39
tisto | tist | o | 698 | 2.02 % | 674.33 | 94 | 1.25 % | 407.77 | 257 | 2.83 % | 869.59 | 227 | 1.73 % | 642.79 | 120 | 2.43 % | 769.76
nekaj | neka | j | 648 | 1.87 % | 626.02 | 193 | 2.58 % | 837.23 | 105 | 1.16 % | 355.28 | 265 | 2.02 % | 750.39 | 85 | 1.72 % | 545.25
tisti | tist | i | 618 | 1.79 % | 597.04 | 129 | 1.72 % | 559.60 | 152 | 1.68 % | 514.31 | 260 | 1.98 % | 736.23 | 77 | 1.56 % | 493.93
tole | tole | | 571 | 1.65 % | 551.63 | 151 | 2.02 % | 655.03 | 78 | 0.86 % | 263.92 | 232 | 1.77 % | 656.95 | 110 | 2.23 % | 705.61
tist | tist | | 497 | 1.44 % | 480.14 | 30 | 0.40 % | 130.14 | 311 | 3.43 % | 1,052.30 | 85 | 0.65 % | 240.69 | 71 | 1.44 % | 455.44
vsak | vsak | | 471 | 1.36 % | 455.03 | 118 | 1.57 % | 511.88 | 120 | 1.32 % | 406.03 | 154 | 1.17 % | 436.08 | 79 | 1.60 % | 506.76
mene | mene | | 448 | 1.29 % | 432.81 | 89 | 1.19 % | 386.08 | 205 | 2.26 % | 693.64 | 81 | 0.62 % | 229.36 | 73 | 1.48 % | 468.27
take | take | | 422 | 1.22 % | 407.69 | 76 | 1.01 % | 329.69 | 165 | 1.82 % | 558.30 | 100 | 0.76 % | 283.17 | 81 | 1.64 % | 519.59
tiste | tist | e | 389 | 1.12 % | 375.81 | 77 | 1.03 % | 334.02 | 139 | 1.53 % | 470.32 | 109 | 0.83 % | 308.65 | 64 | 1.30 % | 410.54
neke | neke | | 384 | 1.11 % | 370.98 | 53 | 0.71 % | 229.91 | 74 | 0.82 % | 250.39 | 161 | 1.23 % | 455.90 | 96 | 1.95 % | 615.81
taka | taka | | 380 | 1.10 % | 367.11 | 76 | 1.01 % | 329.69 | 132 | 1.46 % | 446.64 | 99 | 0.76 % | 280.33 | 73 | 1.48 % | 468.27
svoje | svoj | e | 379 | 1.09 % | 366.15 | 72 | 0.96 % | 312.33 | 53 | 0.58 % | 179.33 | 219 | 1.67 % | 620.13 | 35 | 0.71 % | 224.51
tale | tale | | 379 | 1.09 % | 366.15 | 93 | 1.24 % | 403.43 | 67 | 0.74 % | 226.70 | 158 | 1.21 % | 447.40 | 61 | 1.24 % | 391.29
moje | moje | | 372 | 1.07 % | 359.38 | 72 | 0.96 % | 312.33 | 167 | 1.84 % | 565.06 | 76 | 0.58 % | 215.21 | 57 | 1.16 % | 365.64
temu | temu | | 340 | 0.98 % | 328.47 | 79 | 1.05 % | 342.70 | 34 | 0.38 % | 115.04 | 187 | 1.43 % | 529.52 | 40 | 0.81 % | 256.59
neko | neko | | 335 | 0.97 % | 323.64 | 32 | 0.43 % | 138.81 | 63 | 0.69 % | 213.17 | 172 | 1.31 % | 487.05 | 68 | 1.38 % | 436.20
nekej | neke | j | 321 | 0.93 % | 310.11 | 56 | 0.75 % | 242.93 | 71 | 0.78 % | 240.24 | 147 | 1.12 % | 416.25 | 47 | 0.95 % | 301.49
neka | neka | | 315 | 0.91 % | 304.32 | 27 | 0.36 % | 117.12 | 132 | 1.46 % | 446.64 | 110 | 0.84 % | 311.48 | 46 | 0.93 % | 295.07
vseh | vseh | | 309 | 0.89 % | 298.52 | 71 | 0.95 % | 308 | 35 | 0.39 % | 118.43 | 173 | 1.32 % | 489.88 | 30 | 0.61 % | 192.44
meni | meni | | 282 | 0.81 % | 272.44 | 63 | 0.84 % | 273.29 | 116 | 1.28 % | 392.50 | 60 | 0.46 % | 169.90 | 43 | 0.87 % | 275.83
tista | tist | a | 279 | 0.81 % | 269.54 | 49 | 0.65 % | 212.56 | 99 | 1.09 % | 334.98 | 87 | 0.66 % | 246.35 | 44 | 0.89 % | 282.24
isto | isto | | 270 | 0.78 % | 260.84 | 30 | 0.40 % | 130.14 | 110 | 1.21 % | 372.20 | 85 | 0.65 % | 240.69 | 45 | 0.91 % | 288.66
kakšno | kakš | no | 244 | 0.70 % | 235.72 | 55 | 0.73 % | 238.59 | 16 | 0.18 % | 54.14 | 137 | 1.04 % | 387.94 | 36 | 0.73 % | 230.93
naši | naši | | 244 | 0.70 % | 235.72 | 81 | 1.08 % | 351.37 | 28 | 0.31 % | 94.74 | 113 | 0.86 % | 319.98 | 22 | 0.45 % | 141.12
naše | naše | | 238 | 0.69 % | 229.93 | 63 | 0.84 % | 273.29 | 19 | 0.21 % | 64.29 | 133 | 1.01 % | 376.61 | 23 | 0.47 % | 147.54
naša | naša | | 237 | 0.69 % | 228.96 | 52 | 0.69 % | 225.57 | 41 | 0.45 % | 138.73 | 132 | 1.01 % | 373.78 | 12 | 0.24 % | 76.98
moja | moja | | 231 | 0.67 % | 223.17 | 64 | 0.85 % | 277.63 | 85 | 0.94 % | 287.61 | 59 | 0.45 % | 167.07 | 23 | 0.47 % | 147.54
taki | taki | | 231 | 0.67 % | 223.17 | 54 | 0.72 % | 234.25 | 99 | 1.09 % | 334.98 | 52 | 0.40 % | 147.25 | 26 | 0.53 % | 166.78
kakšen | kakš | en | 229 | 0.66 % | 221.23 | 36 | 0.48 % | 156.17 | 16 | 0.18 % | 54.14 | 155 | 1.18 % | 438.91 | 22 | 0.45 % | 141.12

2.2.175 List of initial character-level 5-grams from pronoun lower-case word forms in the GOS 1.0 corpus with text-type distribution
File at CLARIN.SI: GOS1.0-word_parts-pronouns-lowercase_forms-initial-5grams-taxonomy-entire.tsv

Lower-case word form | Initial part of the word | Rest of the word | Total absolute frequency of lower-case word form | Percentage of all found lower-case word forms | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N]
tisto | tisto | | 698 | 4.07 % | 674.33 | 94 | 2.51 % | 407.77 | 257 | 6.73 % | 869.59 | 227 | 3.11 % | 642.79 | 120 | 5.23 % | 769.76
nekaj | nekaj | | 648 | 3.78 % | 626.02 | 193 | 5.15 % | 837.23 | 105 | 2.75 % | 355.28 | 265 | 3.63 % | 750.39 | 85 | 3.71 % | 545.25
tisti | tisti | | 618 | 3.60 % | 597.04 | 129 | 3.44 % | 559.60 | 152 | 3.98 % | 514.31 | 260 | 3.56 % | 736.23 | 77 | 3.36 % | 493.93
tiste | tiste | | 389 | 2.27 % | 375.81 | 77 | 2.05 % | 334.02 | 139 | 3.64 % | 470.32 | 109 | 1.49 % | 308.65 | 64 | 2.79 % | 410.54
svoje | svoje | | 379 | 2.21 % | 366.15 | 72 | 1.92 % | 312.33 | 53 | 1.39 % | 179.33 | 219 | 3.00 % | 620.13 | 35 | 1.53 % | 224.51
nekej | nekej | | 321 | 1.87 % | 310.11 | 56 | 1.49 % | 242.93 | 71 | 1.86 % | 240.24 | 147 | 2.01 % | 416.25 | 47 | 2.05 % | 301.49
tista | tista | | 279 | 1.63 % | 269.54 | 49 | 1.31 % | 212.56 | 99 | 2.59 % | 334.98 | 87 | 1.19 % | 246.35 | 44 | 1.92 % | 282.24
kakšno | kakšn | o | 244 | 1.42 % | 235.72 | 55 | 1.47 % | 238.59 | 16 | 0.42 % | 54.14 | 137 | 1.88 % | 387.94 | 36 | 1.57 % | 230.93
kakšen | kakše | n | 229 | 1.33 % | 221.23 | 36 | 0.96 % | 156.17 | 16 | 0.42 % | 54.14 | 155 | 2.12 % | 438.91 | 22 | 0.96 % | 141.12
svojo | svojo | | 228 | 1.33 % | 220.27 | 59 | 1.57 % | 255.94 | 28 | 0.73 % | 94.74 | 125 | 1.71 % | 353.96 | 16 | 0.70 % | 102.63
kakšna | kakšn | a | 222 | 1.29 % | 214.47 | 39 | 1.04 % | 169.18 | 14 | 0.37 % | 47.37 | 149 | 2.04 % | 421.92 | 20 | 0.87 % | 128.29
nekak | nekak | | 195 | 1.14 % | 188.39 | 28 | 0.75 % | 121.46 | 52 | 1.36 % | 175.95 | 64 | 0.88 % | 181.23 | 51 | 2.22 % | 327.15
nekdo | nekdo | | 191 | 1.11 % | 184.52 | 35 | 0.93 % | 151.83 | 34 | 0.89 % | 115.04 | 94 | 1.29 % | 266.18 | 28 | 1.22 % | 179.61
kateri | kater | i | 186 | 1.08 % | 179.69 | 43 | 1.15 % | 186.53 | 6 | 0.16 % | 20.30
127 | 1.74 % | 359.62 | 10 | 0.44 % | 64.15
mojem | mojem | | 184 | 1.07 % | 177.76 | 46 | 1.23 % | 199.55 | 52 | 1.36 % | 175.95 | 42 | 0.57 % | 118.93 | 44 | 1.92 % | 282.24
midva | midva | | 180 | 1.05 % | 173.90 | 68 | 1.81 % | 294.98 | 68 | 1.78 % | 230.09 | 20 | 0.27 % | 56.63 | 24 | 1.05 % | 153.95
nekako | nekak | o | 178 | 1.04 % | 171.96 | 38 | 1.01 % | 164.84 | 8 | 0.21 % | 27.07 | 110 | 1.51 % | 311.48 | 22 | 0.96 % | 141.12
kakšne | kakšn | e | 171 | 1.00 % | 165.20 | 30 | 0.80 % | 130.14 | 13 | 0.34 % | 43.99 | 99 | 1.36 % | 280.33 | 29 | 1.26 % | 186.03
tistih | tisti | h | 171 | 1.00 % | 165.20 | 30 | 0.80 % | 130.14 | 41 | 1.07 % | 138.73 | 81 | 1.11 % | 229.36 | 19 | 0.83 % | 121.88
njega | njega | | 167 | 0.97 % | 161.34 | 36 | 0.96 % | 156.17 | 45 | 1.18 % | 152.26 | 62 | 0.85 % | 175.56 | 24 | 1.05 % | 153.95
vsako | vsako | | 150 | 0.87 % | 144.91 | 37 | 0.99 % | 160.50 | 39 | 1.02 % | 131.96 | 64 | 0.88 % | 181.23 | 10 | 0.44 % | 64.15
njimi | njimi | | 143 | 0.83 % | 138.15 | 26 | 0.69 % | 112.79 | 31 | 0.81 % | 104.89 | 70 | 0.96 % | 198.22 | 16 | 0.70 % | 102.63
katere | kater | e | 141 | 0.82 % | 136.22 | 20 | 0.53 % | 86.76 | 2 | 0.05 % | 6.77 | 111 | 1.52 % | 314.31 | 8 | 0.35 % | 51.32
takih | takih | | 135 | 0.79 % | 130.42 | 26 | 0.69 % | 112.79 | 32 | 0.84 % | 108.28 | 56 | 0.77 % | 158.57 | 21 | 0.92 % | 134.71
takega | takeg | a | 134 | 0.78 % | 129.46 | 21 | 0.56 % | 91.10 | 39 | 1.02 % | 131.96 | 39 | 0.53 % | 110.43 | 35 | 1.53 % | 224.51
vsega | vsega | | 119 | 0.69 % | 114.96 | 29 | 0.77 % | 125.80 | 21 | 0.55 % | 71.06 | 41 | 0.56 % | 116.10 | 28 | 1.22 % | 179.61
njemu | njemu | | 115 | 0.67 % | 111.10 | 20 | 0.53 % | 86.76 | 46 | 1.21 % | 155.65 | 29 | 0.40 % | 82.12 | 20 | 0.87 % | 128.29
tazga | tazga | | 111 | 0.65 % | 107.24 | 10 | 0.27 % | 43.38 | 51 | 1.34 % | 172.56 | 13 | 0.18 % | 36.81 | 37 | 1.61 % | 237.34
katero | kater | o | 110 | 0.64 % | 106.27 | 22 | 0.59 % | 95.44 | 2 | 0.05 % | 6.77 | 77 | 1.05 % | 218.04 | 9 | 0.39 % | 57.73
noben | noben | | 110 | 0.64 % | 106.27 | 15 | 0.40 % | 65.07 | 42 | 1.10 % | 142.11 | 31 | 0.42 % | 87.78 | 22 | 0.96 % | 141.12
katera | kater | a | 107 | 0.62 % | 103.37 | 17 | 0.45 % | 73.75 | 4 | 0.10 % | 13.53 | 77 | 1.05 % | 218.04 | 9 | 0.39 % | 57.73
kešne | kešne | | 104 | 0.61 % | 100.47 | 22 | 0.59 % | 95.44 | 43 | 1.13 % | 145.50 | 15 | 0.20 % | 42.47 | 24 | 1.05 % | 153.95

2.2.176 List of final character-level 1-grams from pronoun lower-case word forms in the GOS 1.0 corpus with text-type distribution
File at CLARIN.SI: GOS1.0-word_parts-pronouns-lowercase_forms-final-1grams-taxonomy-entire.tsv

Lower-case word form | Rest of the word | Final part of the word | Total absolute frequency of lower-case word form | Percentage of all found lower-case word forms | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N]
to | t | o | 17,178 | 12.35 % | 16,595.39 | 3,440 | 11.95 % | 14,922.59 | 4,556 | 10.72 % | 15,415.74 | 5,429 | 11.98 % | 15,373.11 | 3,753 | 16.63 % | 24,074.20
se | s | e | 15,760 | 11.33 % | 15,225.48 | 3,278 | 11.39 % | 14,219.84 | 3,780 | 8.89 % | 12,790.06 | 6,039 | 13.33 % | 17,100.43 | 2,663 | 11.80 % | 17,082.23
kaj | ka | j | 5,528 | 3.97 % | 5,340.51 | 1,189 | 4.13 % | 5,157.84 | 1,388 | 3.27 % | 4,696.46 | 2,029 | 4.48 % | 5,745.45 | 922 | 4.09 % | 5,914.31
je | j | e | 5,187 | 3.73 % | 5,011.08 | 1,060 | 3.68 % | 4,598.24 | 1,724 | 4.06 % | 5,833.35 | 1,675 | 3.70 % | 4,743.04 | 728 | 3.23 % | 4,669.87
ti | t | i | 4,277 | 3.07 % | 4,131.94 | 1,075 | 3.73 % | 4,663.31 | 1,930 | 4.54 % | 6,530.37 | 612 | 1.35 % | 1,732.98 | 660 | 2.92 % | 4,233.67
mi | m | i | 4,009 | 2.88 % | 3,873.03 | 679 | 2.36 % | 2,945.48 | 1,417 | 3.33 % | 4,794.58 | 1,248 | 2.75 % | 3,533.92 | 665 | 2.95 % | 4,265.75
ta | t | a | 3,549 | 2.55 % | 3,428.63 | 627 | 2.18 % | 2,719.90 | 888 | 2.09 % | 3,004.65 | 1,375 | 3.04 % | 3,893.54 | 659 | 2.92 % | 4,227.26
jz | j | z | 3,351 | 2.41 % | 3,237.35 | 628 | 2.18 % | 2,724.24 | 1,092 | 2.57 % | 3,694.91 | 918 | 2.03 % | 2,599.47 | 713 | 3.16 % | 4,573.65
te | t | e | 2,922 | 2.10 % | 2,822.90 | 771 | 2.68 % | 3,344.57 | 864 | 2.03 % | 2,923.44 | 804 | 1.77 % | 2,276.66 | 483 | 2.14 % | 3,098.28
ga | g | a | 2,631 | 1.89 % | 2,541.77 | 517 | 1.80 % | 2,242.73 | 963 | 2.27 % | 3,258.42 | 808 | 1.78 % | 2,287.99 | 343 | 1.52 % | 2,200.23
vse | vs | e | 2,501 | 1.80 % | 2,416.18 | 579 | 2.01 % | 2,511.68 | 822 | 1.93 % | 2,781.33 | 661 | 1.46 % | 1,871.73 | 439 | 1.95 % | 2,816.03
tem | te | m | 2,081 | 1.50 % | 2,010.42 | 390 | 1.35 % | 1,691.81 | 226 | 0.53 % | 764.70 | 1,090 | 2.41 %
3,086.52 | 375 | 1.66 % | 2,405.50
jih | ji | h | 1,913 | 1.38 % | 1,848.12 | 322 | 1.12 % | 1,396.82 | 503 | 1.18 % | 1,701.96 | 800 | 1.77 % | 2,265.33 | 288 | 1.28 % | 1,847.42
tega | teg | a | 1,893 | 1.36 % | 1,828.80 | 279 | 0.97 % | 1,210.29 | 328 | 0.77 % | 1,109.83 | 927 | 2.05 % | 2,624.95 | 359 | 1.59 % | 2,302.86
si | s | i | 1,727 | 1.24 % | 1,668.43 | 444 | 1.54 % | 1,926.06 | 534 | 1.26 % | 1,806.85 | 515 | 1.14 % | 1,458.31 | 234 | 1.04 % | 1,501.03
kar | ka | r | 1,571 | 1.13 % | 1,517.72 | 331 | 1.15 % | 1,435.87 | 278 | 0.65 % | 940.64 | 740 | 1.63 % | 2,095.43 | 222 | 0.98 % | 1,424.05
jaz | ja | z | 1,541 | 1.11 % | 1,488.73 | 339 | 1.18 % | 1,470.57 | 618 | 1.45 % | 2,091.07 | 330 | 0.73 % | 934.45 | 254 | 1.13 % | 1,629.32
kej | ke | j | 1,313 | 0.94 % | 1,268.47 | 217 | 0.75 % | 941.34 | 630 | 1.48 % | 2,131.68 | 230 | 0.51 % | 651.28 | 236 | 1.05 % | 1,513.86
jo | j | o | 1,298 | 0.93 % | 1,253.98 | 274 | 0.95 % | 1,188.60 | 381 | 0.90 % | 1,289.16 | 511 | 1.13 % | 1,446.98 | 132 | 0.58 % | 846.73
ka | k | a | 1,294 | 0.93 % | 1,250.11 | 197 | 0.68 % | 854.58 | 825 | 1.94 % | 2,791.48 | 117 | 0.26 % | 331.30 | 155 | 0.69 % | 994.27
on | o | n | 1,156 | 0.83 % | 1,116.79 | 234 | 0.81 % | 1,015.08 | 539 | 1.27 % | 1,823.77 | 210 | 0.46 % | 594.65 | 173 | 0.77 % | 1,109.74
me | m | e | 1,140 | 0.82 % | 1,101.34 | 320 | 1.11 % | 1,388.15 | 442 | 1.04 % | 1,495.56 | 230 | 0.51 % | 651.28 | 148 | 0.66 % | 949.37
kr | k | r | 1,128 | 0.81 % | 1,089.74 | 269 | 0.93 % | 1,166.91 | 352 | 0.83 % | 1,191.03 | 283 | 0.62 % | 801.36 | 224 | 0.99 % | 1,436.88
kako | kak | o | 973 | 0.70 % | 940 | 240 | 0.83 % | 1,041.11 | 156 | 0.37 % | 527.84 | 436 | 0.96 % | 1,234.61 | 141 | 0.62 % | 904.47
neki | nek | i | 956 | 0.69 % | 923.58 | 117 | 0.41 % | 507.54 | 467 | 1.10 % | 1,580.15 | 195 | 0.43 % | 552.17 | 177 | 0.78 % | 1,135.39
vi | v | i | 919 | 0.66 % | 887.83 | 202 | 0.70 % | 876.27 | 100 | 0.23 % | 338.36 | 380 | 0.84 % | 1,076.03 | 237 | 1.05 % | 1,520.27
kdo | kd | o | 907 | 0.65 % | 876.24 | 245 | 0.85 % | 1,062.80 | 162 | 0.38 % | 548.15 | 364 | 0.80 % | 1,030.73 | 136 | 0.60 % | 872.39
nič | ni | č | 857 | 0.62 % | 827.93 | 256 | 0.89 % | 1,110.52 | 189 | 0.45 % | 639.50 | 297 | 0.66 % | 841 | 115 | 0.51 % | 737.69
teh | te | h | 805 | 0.58 % | 777.70 | 143 | 0.50 % | 620.33 | 79 | 0.19 % | 267.31 | 427 | 0.94 % | 1,209.12 | 156 | 0.69 % | 1,000.69
nas | na | s | 767 | 0.55 % | 740.99 | 185 | 0.64 % | 802.52 | 226 | 0.53 % | 764.70 | 281 | 0.62 % | 795.70 | 75 | 0.33 % | 481.10
vsi | vs | i | 745 | 0.54 % | 719.73 | 168 | 0.58 % | 728.78 | 177 | 0.42 % | 598.90 | 298 | 0.66 % | 843.84 | 102 | 0.45 % | 654.29
vam | va | m | 731 | 0.53 % | 706.21 | 207 | 0.72 % | 897.96 | 21 | 0.05 % | 71.06 | 271 | 0.60 % | 767.38 | 232 | 1.03 % | 1,488.20

2.2.177 List of final character-level 2-grams from pronoun lower-case word forms in the GOS 1.0 corpus with text-type distribution
File at CLARIN.SI: GOS1.0-word_parts-pronouns-lowercase_forms-final-2grams-taxonomy-entire.tsv

Lower-case word form | Rest of the word | Final part of the word | Total absolute frequency of lower-case word form | Percentage of all found lower-case word forms | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N]
to | | to | 17,178 | 12.37 % | 16,595.39 | 3,440 | 11.96 % | 14,922.59 | 4,556 | 10.78 % | 15,415.74 | 5,429 | 12.00 % | 15,373.11 | 3,753 | 16.65 % | 24,074.20
se | | se | 15,760 | 11.35 % | 15,225.48 | 3,278 | 11.39 % | 14,219.84 | 3,780 | 8.94 % | 12,790.06 | 6,039 | 13.35 % | 17,100.43 | 2,663 | 11.82 % | 17,082.23
kaj | k | aj | 5,528 | 3.98 % | 5,340.51 | 1,189 | 4.13 % | 5,157.84 | 1,388 | 3.28 % | 4,696.46 | 2,029 | 4.48 % | 5,745.45 | 922 | 4.09 % | 5,914.31
je | | je | 5,187 | 3.74 % | 5,011.08 | 1,060 | 3.69 % | 4,598.24 | 1,724 | 4.08 % | 5,833.35 | 1,675 | 3.70 % | 4,743.04 | 728 | 3.23 % | 4,669.87
ti | | ti | 4,277 | 3.08 % | 4,131.94 | 1,075 | 3.74 % | 4,663.31 | 1,930 | 4.57 % | 6,530.37 | 612 | 1.35 % | 1,732.98 | 660 | 2.93 % | 4,233.67
mi | | mi | 4,009 | 2.89 % | 3,873.03 | 679 | 2.36 % | 2,945.48 | 1,417 | 3.35 % | 4,794.58 | 1,248 | 2.76 % | 3,533.92 | 665 | 2.95 % | 4,265.75
ta | | ta | 3,549 | 2.56 % | 3,428.63 | 627 | 2.18 % | 2,719.90 | 888 | 2.10 % | 3,004.65 | 1,375 | 3.04 % | 3,893.54 | 659 | 2.92 % | 4,227.26
jz | | jz | 3,351 | 2.41 % | 3,237.35 | 628 | 2.18 % | 2,724.24 | 1,092 | 2.58 % | 3,694.91 | 918 | 2.03 % | 2,599.47 | 713 | 3.16 % | 4,573.65
te | | te | 2,922 | 2.10 % | 2,822.90 | 771 | 2.68 % | 3,344.57 | 864 | 2.04 % | 2,923.44 | 804 | 1.78 % | 2,276.66 | 483
2.14 % | 3,098.28
ga | | ga | 2,631 | 1.90 % | 2,541.77 | 517 | 1.80 % | 2,242.73 | 963 | 2.28 % | 3,258.42 | 808 | 1.79 % | 2,287.99 | 343 | 1.52 % | 2,200.23
vse | v | se | 2,501 | 1.80 % | 2,416.18 | 579 | 2.01 % | 2,511.68 | 822 | 1.95 % | 2,781.33 | 661 | 1.46 % | 1,871.73 | 439 | 1.95 % | 2,816.03
tem | t | em | 2,081 | 1.50 % | 2,010.42 | 390 | 1.36 % | 1,691.81 | 226 | 0.54 % | 764.70 | 1,090 | 2.41 % | 3,086.52 | 375 | 1.66 % | 2,405.50
jih | j | ih | 1,913 | 1.38 % | 1,848.12 | 322 | 1.12 % | 1,396.82 | 503 | 1.19 % | 1,701.96 | 800 | 1.77 % | 2,265.33 | 288 | 1.28 % | 1,847.42
tega | te | ga | 1,893 | 1.36 % | 1,828.80 | 279 | 0.97 % | 1,210.29 | 328 | 0.78 % | 1,109.83 | 927 | 2.05 % | 2,624.95 | 359 | 1.59 % | 2,302.86
si | | si | 1,727 | 1.24 % | 1,668.43 | 444 | 1.54 % | 1,926.06 | 534 | 1.26 % | 1,806.85 | 515 | 1.14 % | 1,458.31 | 234 | 1.04 % | 1,501.03
kar | k | ar | 1,571 | 1.13 % | 1,517.72 | 331 | 1.15 % | 1,435.87 | 278 | 0.66 % | 940.64 | 740 | 1.64 % | 2,095.43 | 222 | 0.98 % | 1,424.05
jaz | j | az | 1,541 | 1.11 % | 1,488.73 | 339 | 1.18 % | 1,470.57 | 618 | 1.46 % | 2,091.07 | 330 | 0.73 % | 934.45 | 254 | 1.13 % | 1,629.32
kej | k | ej | 1,313 | 0.95 % | 1,268.47 | 217 | 0.75 % | 941.34 | 630 | 1.49 % | 2,131.68 | 230 | 0.51 % | 651.28 | 236 | 1.05 % | 1,513.86
jo | | jo | 1,298 | 0.94 % | 1,253.98 | 274 | 0.95 % | 1,188.60 | 381 | 0.90 % | 1,289.16 | 511 | 1.13 % | 1,446.98 | 132 | 0.59 % | 846.73
ka | | ka | 1,294 | 0.93 % | 1,250.11 | 197 | 0.69 % | 854.58 | 825 | 1.95 % | 2,791.48 | 117 | 0.26 % | 331.30 | 155 | 0.69 % | 994.27
on | | on | 1,156 | 0.83 % | 1,116.79 | 234 | 0.81 % | 1,015.08 | 539 | 1.27 % | 1,823.77 | 210 | 0.46 % | 594.65 | 173 | 0.77 % | 1,109.74
me | | me | 1,140 | 0.82 % | 1,101.34 | 320 | 1.11 % | 1,388.15 | 442 | 1.05 % | 1,495.56 | 230 | 0.51 % | 651.28 | 148 | 0.66 % | 949.37
kr | | kr | 1,128 | 0.81 % | 1,089.74 | 269 | 0.94 % | 1,166.91 | 352 | 0.83 % | 1,191.03 | 283 | 0.62 % | 801.36 | 224 | 0.99 % | 1,436.88
kako | ka | ko | 973 | 0.70 % | 940 | 240 | 0.83 % | 1,041.11 | 156 | 0.37 % | 527.84 | 436 | 0.96 % | 1,234.61 | 141 | 0.63 % | 904.47
neki | ne | ki | 956 | 0.69 % | 923.58 | 117 | 0.41 % | 507.54 | 467 | 1.10 % | 1,580.15 | 195 | 0.43 % | 552.17 | 177 | 0.79 % | 1,135.39
vi | | vi | 919 | 0.66 % | 887.83 | 202 | 0.70 % | 876.27 | 100 | 0.24 % | 338.36 | 380 | 0.84 % | 1,076.03 | 237 | 1.05 % | 1,520.27
kdo | k | do | 907 | 0.65 % | 876.24 | 245 | 0.85 % | 1,062.80 | 162 | 0.38 % | 548.15 | 364 | 0.80 % | 1,030.73 | 136 | 0.60 % | 872.39
nič | n | ič | 857 | 0.62 % | 827.93 | 256 | 0.89 % | 1,110.52 | 189 | 0.45 % | 639.50 | 297 | 0.66 % | 841 | 115 | 0.51 % | 737.69
teh | t | eh | 805 | 0.58 % | 777.70 | 143 | 0.50 % | 620.33 | 79 | 0.19 % | 267.31 | 427 | 0.94 % | 1,209.12 | 156 | 0.69 % | 1,000.69
nas | n | as | 767 | 0.55 % | 740.99 | 185 | 0.64 % | 802.52 | 226 | 0.54 % | 764.70 | 281 | 0.62 % | 795.70 | 75 | 0.33 % | 481.10
vsi | v | si | 745 | 0.54 % | 719.73 | 168 | 0.58 % | 728.78 | 177 | 0.42 % | 598.90 | 298 | 0.66 % | 843.84 | 102 | 0.45 % | 654.29
vam | v | am | 731 | 0.53 % | 706.21 | 207 | 0.72 % | 897.96 | 21 | 0.05 % | 71.06 | 271 | 0.60 % | 767.38 | 232 | 1.03 % | 1,488.20

2.2.178 List of final character-level 3-grams from pronoun lower-case word forms in the GOS 1.0 corpus with text-type distribution
File at CLARIN.SI: GOS1.0-word_parts-pronouns-lowercase_forms-final-3grams-taxonomy-entire.tsv

Lower-case word form | Rest of the word | Final part of the word | Total absolute frequency of lower-case word form | Percentage of all found lower-case word forms | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N]
kaj | | kaj | 5,528 | 8.22 % | 5,340.51 | 1,189 | 8.46 % | 5,157.84 | 1,388 | 7.09 % | 4,696.46 | 2,029 | 8.66 % | 5,745.45 | 922 | 9.05 % | 5,914.31
vse | | vse | 2,501 | 3.72 % | 2,416.18 | 579 | 4.12 % | 2,511.68 | 822 | 4.20 % | 2,781.33 | 661 | 2.82 % | 1,871.73 | 439 | 4.31 % | 2,816.03
tem | | tem | 2,081 | 3.10 % | 2,010.42 | 390 | 2.77 % | 1,691.81 | 226 | 1.15 % | 764.70 | 1,090 | 4.66 % | 3,086.52 | 375 | 3.68 % | 2,405.50
jih | | jih | 1,913 | 2.85 % | 1,848.12 | 322 | 2.29 % | 1,396.82 | 503 | 2.57 % | 1,701.96 | 800 | 3.42 % | 2,265.33 | 288 | 2.83 % | 1,847.42
tega | t | ega | 1,893 | 2.81 % | 1,828.80 | 279 | 1.98 % | 1,210.29 | 328 | 1.68 % | 1,109.83 | 927 | 3.96 % | 2,624.95 | 359 | 3.53 % | 2,302.86
kar | | kar | 1,571 | 2.34 % | 1,517.72 | 331 | 2.35 % | 1,435.87 | 278 | 1.42 % | 940.64 | 740 | 3.16 % | 2,095.43 | 222 | 2.18 % | 1,424.05
jaz | | jaz | 1,541 | 2.29 % | 1,488.73 | 339 | 2.41
% 1,470.57 618 3.16 % 2,091.07 330 1.41 % 934.45 254 2.50 % 1,629.32 kej kej 1,313 1.95 % 1,268.47 217 1.54 % 941.34 630 3.22 % 2,131.68 230 0.98 % 651.28 236 2.32 % 1,513.86 kako k ako 973 1.45 % 940 240 1.71 % 1,041.11 156 0.80 % 527.84 436 1.86 % 1,234.61 141 1.39 % 904.47 neki n eki 956 1.42 % 923.58 117 0.83 % 507.54 467 2.38 % 1,580.15 195 0.83 % 552.17 177 1.74 % 1,135.39 kdo kdo 907 1.35 % 876.24 245 1.74 % 1,062.80 162 0.83 % 548.15 364 1.55 % 1,030.73 136 1.34 % 872.39 nič nič 857 1.27 % 827.93 256 1.82 % 1,110.52 189 0.96 % 639.50 297 1.27 % 841 115 1.13 % 737.69 teh teh 805 1.20 % 777.70 143 1.02 % 620.33 79 0.40 % 267.31 427 1.82 % 1,209.12 156 1.53 % 1,000.69 nas nas 767 1.14 % 740.99 185 1.32 % 802.52 226 1.15 % 764.70 281 1.20 % 795.70 75 0.74 % 481.10 vsi vsi 745 1.11 % 719.73 168 1.20 % 728.78 177 0.90 % 598.90 298 1.27 % 843.84 102 1.00 % 654.29 vam vam 731 1.09 % 706.21 207 1.47 % 897.96 21 0.11 % 71.06 271 1.16 % 767.38 232 2.28 % 1,488.20 tisto ti sto 698 1.04 % 674.33 94 0.67 % 407.77 257 1.31 % 869.59 227 0.97 % 642.79 120 1.18 % 769.76 ona ona 657 0.98 % 634.72 124 0.88 % 537.91 371 1.90 % 1,255.32 65 0.28 % 184.06 97 0.95 % 622.22 nekaj ne kaj 648 0.96 % 626.02 193 1.37 % 837.23 105 0.54 % 355.28 265 1.13 % 750.39 85 0.83 % 545.25 tisti ti sti 618 0.92 % 597.04 129 0.92 % 559.60 152 0.78 % 514.31 260 1.11 % 736.23 77 0.76 % 493.93 tak tak 599 0.89 % 578.68 109 0.78 % 472.84 235 1.20 % 795.15 132 0.56 % 373.78 123 1.21 % 789 nam nam 595 0.89 % 574.82 139 0.99 % 602.98 79 0.40 % 267.31 298 1.27 % 843.84 79 0.78 % 506.76 kak kak 588 0.87 % 568.06 126 0.90 % 546.58 297 1.52 % 1,004.93 54 0.23 % 152.91 111 1.09 % 712.03 tole t ole 571 0.85 % 551.63 151 1.07 % 655.03 78 0.40 % 263.92 232 0.99 % 656.95 110 1.08 % 705.61 vas vas 535 0.80 % 516.85 210 1.49 % 910.97 46 0.23 % 155.65 178 0.76 % 504.04 101 0.99 % 647.88 tej tej 534 0.79 % 515.89 116 0.82 % 503.20 109 0.56 % 368.81 256 1.09 % 724.91 53 0.52 % 339.98 oni oni 508 0.76 % 490.77 73 0.52 % 
316.67 239 1.22 % 808.68 95 0.41 % 269.01 101 0.99 % 647.88
men men 503 0.75 % 485.94 60 0.43 % 260.28 272 1.39 % 920.34 53 0.23 % 150.08 118 1.16 % 756.93
tist t ist 497 0.74 % 480.14 30 0.21 % 130.14 311 1.59 % 1,052.30 85 0.36 % 240.69 71 0.70 % 455.44
vsak v sak 471 0.70 % 455.03 118 0.84 % 511.88 120 0.61 % 406.03 154 0.66 % 436.08 79 0.78 % 506.76
mene m ene 448 0.67 % 432.81 89 0.63 % 386.08 205 1.05 % 693.64 81 0.35 % 229.36 73 0.72 % 468.27
jst jst 424 0.63 % 409.62 33 0.23 % 143.15 159 0.81 % 537.99 134 0.57 % 379.44 98 0.96 % 628.64

2.2.179 List of final character-level 4-grams from pronoun lower-case word forms in the GOS 1.0 corpus with text-type distribution

File at CLARIN.SI: GOS1.0-word_parts-pronouns-lowercase_forms-final-4grams-taxonomy-entire.tsv

Columns: Lower-case word form; Rest of the word; Final part of the word; Total absolute frequency of lower-case word form; Percentage of all found lower-case word forms; Total relative frequency (per million occurrences); Absolute frequency [gos.T.J.R]; Percentage [gos.T.J.R]; Relative frequency [gos.T.J.R]; Absolute frequency [gos.T.N.Z]; Percentage [gos.T.N.Z]; Relative frequency [gos.T.N.Z]; Absolute frequency [gos.T.J.I]; Percentage [gos.T.J.I]; Relative frequency [gos.T.J.I]; Absolute frequency [gos.T.N.N]; Percentage [gos.T.N.N]; Relative frequency [gos.T.N.N]

tega tega 1,893 5.47 % 1,828.80 279 3.72 % 1,210.29 328 3.62 % 1,109.83 927 7.07 % 2,624.95 359 7.28 % 2,302.86
kako kako 973 2.81 % 940 240 3.20 % 1,041.11 156 1.72 % 527.84 436 3.33 % 1,234.61 141 2.86 % 904.47
neki neki 956 2.76 % 923.58 117 1.56 % 507.54 467 5.15 % 1,580.15 195 1.49 % 552.17 177 3.59 % 1,135.39
tisto t isto 698 2.02 % 674.33 94 1.25 % 407.77 257 2.83 % 869.59 227 1.73 % 642.79 120 2.43 % 769.76
nekaj n ekaj 648 1.87 % 626.02 193 2.58 % 837.23 105 1.16 % 355.28 265 2.02 % 750.39 85 1.72 % 545.25
tisti t isti 618 1.79 % 597.04 129 1.72 % 559.60 152 1.68 % 514.31 260
1.98 % 736.23 77 1.56 % 493.93 tole tole 571 1.65 % 551.63 151 2.02 % 655.03 78 0.86 % 263.92 232 1.77 % 656.95 110 2.23 % 705.61 tist tist 497 1.44 % 480.14 30 0.40 % 130.14 311 3.43 % 1,052.30 85 0.65 % 240.69 71 1.44 % 455.44 vsak vsak 471 1.36 % 455.03 118 1.57 % 511.88 120 1.32 % 406.03 154 1.17 % 436.08 79 1.60 % 506.76 mene mene 448 1.29 % 432.81 89 1.19 % 386.08 205 2.26 % 693.64 81 0.62 % 229.36 73 1.48 % 468.27 take take 422 1.22 % 407.69 76 1.01 % 329.69 165 1.82 % 558.30 100 0.76 % 283.17 81 1.64 % 519.59 tiste t iste 389 1.12 % 375.81 77 1.03 % 334.02 139 1.53 % 470.32 109 0.83 % 308.65 64 1.30 % 410.54 neke neke 384 1.11 % 370.98 53 0.71 % 229.91 74 0.82 % 250.39 161 1.23 % 455.90 96 1.95 % 615.81 taka taka 380 1.10 % 367.11 76 1.01 % 329.69 132 1.46 % 446.64 99 0.76 % 280.33 73 1.48 % 468.27 svoje s voje 379 1.09 % 366.15 72 0.96 % 312.33 53 0.58 % 179.33 219 1.67 % 620.13 35 0.71 % 224.51 tale tale 379 1.09 % 366.15 93 1.24 % 403.43 67 0.74 % 226.70 158 1.21 % 447.40 61 1.24 % 391.29 moje moje 372 1.07 % 359.38 72 0.96 % 312.33 167 1.84 % 565.06 76 0.58 % 215.21 57 1.16 % 365.64 temu temu 340 0.98 % 328.47 79 1.05 % 342.70 34 0.38 % 115.04 187 1.43 % 529.52 40 0.81 % 256.59 neko neko 335 0.97 % 323.64 32 0.43 % 138.81 63 0.69 % 213.17 172 1.31 % 487.05 68 1.38 % 436.20 nekej n ekej 321 0.93 % 310.11 56 0.75 % 242.93 71 0.78 % 240.24 147 1.12 % 416.25 47 0.95 % 301.49 neka neka 315 0.91 % 304.32 27 0.36 % 117.12 132 1.46 % 446.64 110 0.84 % 311.48 46 0.93 % 295.07 vseh vseh 309 0.89 % 298.52 71 0.95 % 308 35 0.39 % 118.43 173 1.32 % 489.88 30 0.61 % 192.44 meni meni 282 0.81 % 272.44 63 0.84 % 273.29 116 1.28 % 392.50 60 0.46 % 169.90 43 0.87 % 275.83 tista t ista 279 0.81 % 269.54 49 0.65 % 212.56 99 1.09 % 334.98 87 0.66 % 246.35 44 0.89 % 282.24 isto isto 270 0.78 % 260.84 30 0.40 % 130.14 110 1.21 % 372.20 85 0.65 % 240.69 45 0.91 % 288.66 kakšno ka kšno 244 0.70 % 235.72 55 0.73 % 238.59 16 0.18 % 54.14 137 1.04 % 387.94 36 0.73 % 230.93 naši 
naši 244 0.70 % 235.72 81 1.08 % 351.37 28 0.31 % 94.74 113 0.86 % 319.98 22 0.45 % 141.12
naše naše 238 0.69 % 229.93 63 0.84 % 273.29 19 0.21 % 64.29 133 1.01 % 376.61 23 0.47 % 147.54
naša naša 237 0.69 % 228.96 52 0.69 % 225.57 41 0.45 % 138.73 132 1.01 % 373.78 12 0.24 % 76.98
moja moja 231 0.67 % 223.17 64 0.85 % 277.63 85 0.94 % 287.61 59 0.45 % 167.07 23 0.47 % 147.54
taki taki 231 0.67 % 223.17 54 0.72 % 234.25 99 1.09 % 334.98 52 0.40 % 147.25 26 0.53 % 166.78
kakšen ka kšen 229 0.66 % 221.23 36 0.48 % 156.17 16 0.18 % 54.14 155 1.18 % 438.91 22 0.45 % 141.12

2.2.180 List of final character-level 5-grams from pronoun lower-case word forms in the GOS 1.0 corpus with text-type distribution

File at CLARIN.SI: GOS1.0-word_parts-pronouns-lowercase_forms-final-5grams-taxonomy-entire.tsv

Columns: Lower-case word form; Rest of the word; Final part of the word; Total absolute frequency of lower-case word form; Percentage of all found lower-case word forms; Total relative frequency (per million occurrences); Absolute frequency [gos.T.J.R]; Percentage [gos.T.J.R]; Relative frequency [gos.T.J.R]; Absolute frequency [gos.T.N.Z]; Percentage [gos.T.N.Z]; Relative frequency [gos.T.N.Z]; Absolute frequency [gos.T.J.I]; Percentage [gos.T.J.I]; Relative frequency [gos.T.J.I]; Absolute frequency [gos.T.N.N]; Percentage [gos.T.N.N]; Relative frequency [gos.T.N.N]

tisto tisto 698 4.07 % 674.33 94 2.51 % 407.77 257 6.73 % 869.59 227 3.11 % 642.79 120 5.23 % 769.76
nekaj nekaj 648 3.78 % 626.02 193 5.15 % 837.23 105 2.75 % 355.28 265 3.63 % 750.39 85 3.71 % 545.25
tisti tisti 618 3.60 % 597.04 129 3.44 % 559.60 152 3.98 % 514.31 260 3.56 % 736.23 77 3.36 % 493.93
tiste tiste 389 2.27 % 375.81 77 2.05 % 334.02 139 3.64 % 470.32 109 1.49 % 308.65 64 2.79 % 410.54
svoje svoje 379 2.21 % 366.15 72 1.92 % 312.33 53 1.39 % 179.33 219 3.00 % 620.13 35 1.53 % 224.51
nekej nekej 321 1.87 % 310.11 56 1.49 % 242.93 71 1.86 % 240.24
147 2.01 % 416.25 47 2.05 % 301.49 tista tista 279 1.63 % 269.54 49 1.31 % 212.56 99 2.59 % 334.98 87 1.19 % 246.35 44 1.92 % 282.24 kakšno k akšno 244 1.42 % 235.72 55 1.47 % 238.59 16 0.42 % 54.14 137 1.88 % 387.94 36 1.57 % 230.93 kakšen k akšen 229 1.33 % 221.23 36 0.96 % 156.17 16 0.42 % 54.14 155 2.12 % 438.91 22 0.96 % 141.12 svojo svojo 228 1.33 % 220.27 59 1.57 % 255.94 28 0.73 % 94.74 125 1.71 % 353.96 16 0.70 % 102.63 kakšna k akšna 222 1.29 % 214.47 39 1.04 % 169.18 14 0.37 % 47.37 149 2.04 % 421.92 20 0.87 % 128.29 nekak nekak 195 1.14 % 188.39 28 0.75 % 121.46 52 1.36 % 175.95 64 0.88 % 181.23 51 2.22 % 327.15 nekdo nekdo 191 1.11 % 184.52 35 0.93 % 151.83 34 0.89 % 115.04 94 1.29 % 266.18 28 1.22 % 179.61 kateri k ateri 186 1.08 % 179.69 43 1.15 % 186.53 6 0.16 % 20.30 127 1.74 % 359.62 10 0.44 % 64.15 mojem mojem 184 1.07 % 177.76 46 1.23 % 199.55 52 1.36 % 175.95 42 0.57 % 118.93 44 1.92 % 282.24 midva midva 180 1.05 % 173.90 68 1.81 % 294.98 68 1.78 % 230.09 20 0.27 % 56.63 24 1.05 % 153.95 nekako n ekako 178 1.04 % 171.96 38 1.01 % 164.84 8 0.21 % 27.07 110 1.51 % 311.48 22 0.96 % 141.12 kakšne k akšne 171 1.00 % 165.20 30 0.80 % 130.14 13 0.34 % 43.99 99 1.36 % 280.33 29 1.26 % 186.03 tistih t istih 171 1.00 % 165.20 30 0.80 % 130.14 41 1.07 % 138.73 81 1.11 % 229.36 19 0.83 % 121.88 njega njega 167 0.97 % 161.34 36 0.96 % 156.17 45 1.18 % 152.26 62 0.85 % 175.56 24 1.05 % 153.95 vsako vsako 150 0.87 % 144.91 37 0.99 % 160.50 39 1.02 % 131.96 64 0.88 % 181.23 10 0.44 % 64.15 njimi njimi 143 0.83 % 138.15 26 0.69 % 112.79 31 0.81 % 104.89 70 0.96 % 198.22 16 0.70 % 102.63 katere k atere 141 0.82 % 136.22 20 0.53 % 86.76 2 0.05 % 6.77 111 1.52 % 314.31 8 0.35 % 51.32 takih takih 135 0.79 % 130.42 26 0.69 % 112.79 32 0.84 % 108.28 56 0.77 % 158.57 21 0.92 % 134.71 takega t akega 134 0.78 % 129.46 21 0.56 % 91.10 39 1.02 % 131.96 39 0.53 % 110.43 35 1.53 % 224.51 vsega vsega 119 0.69 % 114.96 29 0.77 % 125.80 21 0.55 % 71.06 41 0.56 % 116.10 28 1.22 
% 179.61
njemu njemu 115 0.67 % 111.10 20 0.53 % 86.76 46 1.21 % 155.65 29 0.40 % 82.12 20 0.87 % 128.29
tazga tazga 111 0.65 % 107.24 10 0.27 % 43.38 51 1.34 % 172.56 13 0.18 % 36.81 37 1.61 % 237.34
katero k atero 110 0.64 % 106.27 22 0.59 % 95.44 2 0.05 % 6.77 77 1.05 % 218.04 9 0.39 % 57.73
noben noben 110 0.64 % 106.27 15 0.40 % 65.07 42 1.10 % 142.11 31 0.42 % 87.78 22 0.96 % 141.12
katera k atera 107 0.62 % 103.37 17 0.45 % 73.75 4 0.10 % 13.53 77 1.05 % 218.04 9 0.39 % 57.73
kešne kešne 104 0.61 % 100.47 22 0.59 % 95.44 43 1.13 % 145.50 15 0.20 % 42.47 24 1.05 % 153.95

2.2.181 List of initial character-level 1-grams from numeral lemmas in the GOS 1.0 corpus with text-type distribution

File at CLARIN.SI: GOS1.0-word_parts-numerals-lemmas-initial-1grams-taxonomy-entire.tsv

Columns: Lemma; Lemma (lower-case); Initial part of the word; Rest of the word; Total absolute frequency of lemma; Percentage of all found lemmas; Total relative frequency (per million occurrences); Absolute frequency [gos.T.J.R]; Percentage [gos.T.J.R]; Relative frequency [gos.T.J.R]; Absolute frequency [gos.T.N.Z]; Percentage [gos.T.N.Z]; Relative frequency [gos.T.N.Z]; Absolute frequency [gos.T.J.I]; Percentage [gos.T.J.I]; Relative frequency [gos.T.J.I]; Absolute frequency [gos.T.N.N]; Percentage [gos.T.N.N]; Relative frequency [gos.T.N.N]

en en e n 5,114 19.96 % 4,940.55 1,075 15.76 % 4,663.31 1,811 28.00 % 6,127.72 1,472 17.26 % 4,168.21 756 19.91 % 4,849.48
drug drug d rug 2,356 9.20 % 2,276.09 465 6.82 % 2,017.15 488 7.54 % 1,651.20 1,066 12.50 % 3,018.56 337 8.87 % 2,161.74
dva dva d va 2,275 8.88 % 2,197.84 457 6.70 % 1,982.45 603 9.32 % 2,040.32 905 10.61 % 2,562.66 310 8.16 % 1,988.54
MMM mmm M MM 1,522 5.94 % 1,470.38 193 2.83 % 837.23 605 9.35 % 2,047.09 351 4.12 % 993.91 373 9.82 % 2,392.67
prvi prvi p rvi 1,315 5.13 % 1,270.40 411 6.02 % 1,782.90 259 4.00 % 876.36 491 5.76 % 1,390.35 154 4.05 % 987.86
trije trije t rije
1,150 4.49 % 1,111 357 5.23 % 1,548.65 259 4.00 % 876.36 386 4.53 % 1,093.02 148 3.90 % 949.37 pet pet p et 669 2.61 % 646.31 201 2.95 % 871.93 183 2.83 % 619.20 187 2.19 % 529.52 98 2.58 % 628.64 štirje štirje š tirje 667 2.60 % 644.38 189 2.77 % 819.87 158 2.44 % 534.61 240 2.81 % 679.60 80 2.11 % 513.17 tisoč tisoč t isoč 618 2.41 % 597.04 204 2.99 % 884.94 41 0.63 % 138.73 321 3.76 % 908.96 52 1.37 % 333.56 šest šest š est 480 1.87 % 463.72 212 3.11 % 919.65 101 1.56 % 341.74 121 1.42 % 342.63 46 1.21 % 295.07 sto sto s to 478 1.87 % 461.79 61 0.89 % 264.62 107 1.65 % 362.05 176 2.06 % 498.37 134 3.53 % 859.56 deset deset d eset 476 1.86 % 459.86 128 1.88 % 555.26 151 2.33 % 510.93 123 1.44 % 348.29 74 1.95 % 474.68 osem osem o sem 392 1.53 % 378.70 195 2.86 % 845.90 57 0.88 % 192.87 97 1.14 % 274.67 43 1.13 % 275.83 dvajset dvajset d vajset 367 1.43 % 354.55 98 1.44 % 425.12 91 1.41 % 307.91 113 1.32 % 319.98 65 1.71 % 416.95 ena ena e na 316 1.23 % 305.28 67 0.98 % 290.64 104 1.61 % 351.90 96 1.12 % 271.84 49 1.29 % 314.32 tretji tretji t retji 315 1.23 % 304.32 98 1.44 % 425.12 66 1.02 % 223.32 124 1.45 % 351.13 27 0.71 % 173.20 sedem sedem s edem 306 1.19 % 295.62 111 1.63 % 481.51 82 1.27 % 277.46 88 1.03 % 249.19 25 0.66 % 160.37 trideset trideset t rideset 299 1.17 % 288.86 95 1.39 % 412.11 56 0.87 % 189.48 74 0.87 % 209.54 74 1.95 % 474.68 petnajst petnajst p etnajst 265 1.03 % 256.01 70 1.03 % 303.66 65 1.00 % 219.93 62 0.73 % 175.56 68 1.79 % 436.20 petdeset petdeset p etdeset 256 1.00 % 247.32 32 0.47 % 138.81 86 1.33 % 290.99 85 1.00 % 240.69 53 1.40 % 339.98 devet devet d evet 247 0.96 % 238.62 103 1.51 % 446.81 39 0.60 % 131.96 64 0.75 % 181.23 41 1.08 % 263 eden eden e den 219 0.85 % 211.57 57 0.83 % 247.26 54 0.83 % 182.72 87 1.02 % 246.35 21 0.55 % 134.71 šesti šesti š esti 188 0.73 % 181.62 105 1.54 % 455.49 25 0.39 % 84.59 48 0.56 % 135.92 10 0.26 % 64.15 dvesto dvesto d vesto 187 0.73 % 180.66 58 0.85 % 251.60 36 0.56 % 121.81 72 0.84 % 
203.88 21 0.55 % 134.71
osemdeset osemdeset o semdeset 173 0.68 % 167.13 54 0.79 % 234.25 23 0.36 % 77.82 48 0.56 % 135.92 48 1.26 % 307.90
devetdeset devetdeset d evetdeset 167 0.65 % 161.34 89 1.30 % 386.08 20 0.31 % 67.67 32 0.38 % 90.61 26 0.69 % 166.78
dvanajst dvanajst d vanajst 165 0.64 % 159.40 38 0.56 % 164.84 49 0.76 % 165.80 55 0.65 % 155.74 23 0.61 % 147.54
sedmi sedmi s edmi 165 0.64 % 159.40 99 1.45 % 429.46 31 0.48 % 104.89 25 0.29 % 70.79 10 0.26 % 64.15
peti peti p eti 159 0.62 % 153.61 59 0.86 % 255.94 45 0.70 % 152.26 41 0.48 % 116.10 14 0.37 % 89.81
osmi osmi o smi 151 0.59 % 145.88 86 1.26 % 373.06 31 0.48 % 104.89 13 0.15 % 36.81 21 0.55 % 134.71
štirideset štirideset š tirideset 145 0.57 % 140.08 23 0.34 % 99.77 35 0.54 % 118.43 50 0.59 % 141.58 37 0.97 % 237.34
šestdeset šestdeset š estdeset 143 0.56 % 138.15 21 0.31 % 91.10 24 0.37 % 81.21 47 0.55 % 133.09 51 1.34 % 327.15

2.2.182 List of initial character-level 2-grams from numeral lemmas in the GOS 1.0 corpus with text-type distribution

File at CLARIN.SI: GOS1.0-word_parts-numerals-lemmas-initial-2grams-taxonomy-entire.tsv

Columns: Lemma; Lemma (lower-case); Initial part of the word; Rest of the word; Total absolute frequency of lemma; Percentage of all found lemmas; Total relative frequency (per million occurrences); Absolute frequency [gos.T.J.R]; Percentage [gos.T.J.R]; Relative frequency [gos.T.J.R]; Absolute frequency [gos.T.N.Z]; Percentage [gos.T.N.Z]; Relative frequency [gos.T.N.Z]; Absolute frequency [gos.T.J.I]; Percentage [gos.T.J.I]; Relative frequency [gos.T.J.I]; Absolute frequency [gos.T.N.N]; Percentage [gos.T.N.N]; Relative frequency [gos.T.N.N]

en en en 5,114 20.17 % 4,940.55 1,075 15.97 % 4,663.31 1,811 28.16 % 6,127.72 1,472 17.46 % 4,168.21 756 20.10 % 4,849.48
drug drug dr ug 2,356 9.29 % 2,276.09 465 6.91 % 2,017.15 488 7.59 % 1,651.20 1,066 12.65 % 3,018.56 337 8.96 % 2,161.74
dva dva dv a 2,275 8.97 % 2,197.84
457 6.79 % 1,982.45 603 9.38 % 2,040.32 905 10.74 % 2,562.66 310 8.24 % 1,988.54 MMM mmm MM M 1,522 6.00 % 1,470.38 193 2.87 % 837.23 605 9.41 % 2,047.09 351 4.16 % 993.91 373 9.92 % 2,392.67 prvi prvi pr vi 1,315 5.19 % 1,270.40 411 6.11 % 1,782.90 259 4.03 % 876.36 491 5.83 % 1,390.35 154 4.09 % 987.86 trije trije tr ije 1,150 4.54 % 1,111 357 5.30 % 1,548.65 259 4.03 % 876.36 386 4.58 % 1,093.02 148 3.94 % 949.37 pet pet pe t 669 2.64 % 646.31 201 2.99 % 871.93 183 2.85 % 619.20 187 2.22 % 529.52 98 2.61 % 628.64 štirje štirje št irje 667 2.63 % 644.38 189 2.81 % 819.87 158 2.46 % 534.61 240 2.85 % 679.60 80 2.13 % 513.17 tisoč tisoč ti soč 618 2.44 % 597.04 204 3.03 % 884.94 41 0.64 % 138.73 321 3.81 % 908.96 52 1.38 % 333.56 šest šest še st 480 1.89 % 463.72 212 3.15 % 919.65 101 1.57 % 341.74 121 1.44 % 342.63 46 1.22 % 295.07 sto sto st o 478 1.89 % 461.79 61 0.91 % 264.62 107 1.66 % 362.05 176 2.09 % 498.37 134 3.56 % 859.56 deset deset de set 476 1.88 % 459.86 128 1.90 % 555.26 151 2.35 % 510.93 123 1.46 % 348.29 74 1.97 % 474.68 osem osem os em 392 1.55 % 378.70 195 2.90 % 845.90 57 0.89 % 192.87 97 1.15 % 274.67 43 1.14 % 275.83 dvajset dvajset dv ajset 367 1.45 % 354.55 98 1.46 % 425.12 91 1.42 % 307.91 113 1.34 % 319.98 65 1.73 % 416.95 ena ena en a 316 1.25 % 305.28 67 0.99 % 290.64 104 1.62 % 351.90 96 1.14 % 271.84 49 1.30 % 314.32 tretji tretji tr etji 315 1.24 % 304.32 98 1.46 % 425.12 66 1.03 % 223.32 124 1.47 % 351.13 27 0.72 % 173.20 sedem sedem se dem 306 1.21 % 295.62 111 1.65 % 481.51 82 1.27 % 277.46 88 1.04 % 249.19 25 0.67 % 160.37 trideset trideset tr ideset 299 1.18 % 288.86 95 1.41 % 412.11 56 0.87 % 189.48 74 0.88 % 209.54 74 1.97 % 474.68 petnajst petnajst pe tnajst 265 1.04 % 256.01 70 1.04 % 303.66 65 1.01 % 219.93 62 0.74 % 175.56 68 1.81 % 436.20 petdeset petdeset pe tdeset 256 1.01 % 247.32 32 0.47 % 138.81 86 1.34 % 290.99 85 1.01 % 240.69 53 1.41 % 339.98 devet devet de vet 247 0.97 % 238.62 103 1.53 % 446.81 39 0.61 % 131.96 
64 0.76 % 181.23 41 1.09 % 263
eden eden ed en 219 0.86 % 211.57 57 0.85 % 247.26 54 0.84 % 182.72 87 1.03 % 246.35 21 0.56 % 134.71
šesti šesti še sti 188 0.74 % 181.62 105 1.56 % 455.49 25 0.39 % 84.59 48 0.57 % 135.92 10 0.27 % 64.15
dvesto dvesto dv esto 187 0.74 % 180.66 58 0.86 % 251.60 36 0.56 % 121.81 72 0.85 % 203.88 21 0.56 % 134.71
osemdeset osemdeset os emdeset 173 0.68 % 167.13 54 0.80 % 234.25 23 0.36 % 77.82 48 0.57 % 135.92 48 1.28 % 307.90
devetdeset devetdeset de vetdeset 167 0.66 % 161.34 89 1.32 % 386.08 20 0.31 % 67.67 32 0.38 % 90.61 26 0.69 % 166.78
dvanajst dvanajst dv anajst 165 0.65 % 159.40 38 0.56 % 164.84 49 0.76 % 165.80 55 0.65 % 155.74 23 0.61 % 147.54
sedmi sedmi se dmi 165 0.65 % 159.40 99 1.47 % 429.46 31 0.48 % 104.89 25 0.30 % 70.79 10 0.27 % 64.15
peti peti pe ti 159 0.63 % 153.61 59 0.88 % 255.94 45 0.70 % 152.26 41 0.49 % 116.10 14 0.37 % 89.81
osmi osmi os mi 151 0.60 % 145.88 86 1.28 % 373.06 31 0.48 % 104.89 13 0.15 % 36.81 21 0.56 % 134.71
štirideset štirideset št irideset 145 0.57 % 140.08 23 0.34 % 99.77 35 0.54 % 118.43 50 0.59 % 141.58 37 0.98 % 237.34
šestdeset šestdeset še stdeset 143 0.56 % 138.15 21 0.31 % 91.10 24 0.37 % 81.21 47 0.56 % 133.09 51 1.36 % 327.15

2.2.183 List of initial character-level 3-grams from numeral lemmas in the GOS 1.0 corpus with text-type distribution

File at CLARIN.SI: GOS1.0-word_parts-numerals-lemmas-initial-3grams-taxonomy-entire.tsv

Lemma Lemma (lower-case) Initial part of the word Rest of the word Total absolute frequency of lemma Percentage of all found lemmas Total relative frequency (per million occurrences) Absolute frequency [gos.T.J.R] Percentage [gos.T.J.R] Relative frequency [gos.T.J.R] Absolute frequency [gos.T.N.Z] Percentage [gos.T.N.Z] Relative frequency [gos.T.N.Z] Absolute frequency [gos.T.J.I] Percentage [gos.T.J.I] Relative frequency [gos.T.J.I] Absolute frequency [gos.T.N.N] Percentage
[gos.T.N.N] Relative frequency [gos.T.N.N] drug drug dru g 2,356 11.66 % 2,276.09 465 8.26 % 2,017.15 488 10.59 % 1,651.20 1,066 15.32 % 3,018.56 337 11.21 % 2,161.74 dva dva dva 2,275 11.26 % 2,197.84 457 8.11 % 1,982.45 603 13.09 % 2,040.32 905 13.01 % 2,562.66 310 10.32 % 1,988.54 MMM mmm MMM 1,522 7.53 % 1,470.38 193 3.43 % 837.23 605 13.13 % 2,047.09 351 5.05 % 993.91 373 12.41 % 2,392.67 prvi prvi prv i 1,315 6.51 % 1,270.40 411 7.30 % 1,782.90 259 5.62 % 876.36 491 7.06 % 1,390.35 154 5.12 % 987.86 trije trije tri je 1,150 5.69 % 1,111 357 6.34 % 1,548.65 259 5.62 % 876.36 386 5.55 % 1,093.02 148 4.92 % 949.37 pet pet pet 669 3.31 % 646.31 201 3.57 % 871.93 183 3.97 % 619.20 187 2.69 % 529.52 98 3.26 % 628.64 štirje štirje šti rje 667 3.30 % 644.38 189 3.36 % 819.87 158 3.43 % 534.61 240 3.45 % 679.60 80 2.66 % 513.17 tisoč tisoč tis oč 618 3.06 % 597.04 204 3.62 % 884.94 41 0.89 % 138.73 321 4.62 % 908.96 52 1.73 % 333.56 šest šest šes t 480 2.38 % 463.72 212 3.76 % 919.65 101 2.19 % 341.74 121 1.74 % 342.63 46 1.53 % 295.07 sto sto sto 478 2.37 % 461.79 61 1.08 % 264.62 107 2.32 % 362.05 176 2.53 % 498.37 134 4.46 % 859.56 deset deset des et 476 2.36 % 459.86 128 2.27 % 555.26 151 3.28 % 510.93 123 1.77 % 348.29 74 2.46 % 474.68 osem osem ose m 392 1.94 % 378.70 195 3.46 % 845.90 57 1.24 % 192.87 97 1.39 % 274.67 43 1.43 % 275.83 dvajset dvajset dva jset 367 1.82 % 354.55 98 1.74 % 425.12 91 1.98 % 307.91 113 1.62 % 319.98 65 2.16 % 416.95 ena ena ena 316 1.56 % 305.28 67 1.19 % 290.64 104 2.26 % 351.90 96 1.38 % 271.84 49 1.63 % 314.32 tretji tretji tre tji 315 1.56 % 304.32 98 1.74 % 425.12 66 1.43 % 223.32 124 1.78 % 351.13 27 0.90 % 173.20 sedem sedem sed em 306 1.51 % 295.62 111 1.97 % 481.51 82 1.78 % 277.46 88 1.26 % 249.19 25 0.83 % 160.37 trideset trideset tri deset 299 1.48 % 288.86 95 1.69 % 412.11 56 1.22 % 189.48 74 1.06 % 209.54 74 2.46 % 474.68 petnajst petnajst pet najst 265 1.31 % 256.01 70 1.24 % 303.66 65 1.41 % 219.93 62 0.89 % 175.56 
68 2.26 % 436.20
petdeset petdeset pet deset 256 1.27 % 247.32 32 0.57 % 138.81 86 1.87 % 290.99 85 1.22 % 240.69 53 1.76 % 339.98
devet devet dev et 247 1.22 % 238.62 103 1.83 % 446.81 39 0.85 % 131.96 64 0.92 % 181.23 41 1.36 % 263
eden eden ede n 219 1.08 % 211.57 57 1.01 % 247.26 54 1.17 % 182.72 87 1.25 % 246.35 21 0.70 % 134.71
šesti šesti šes ti 188 0.93 % 181.62 105 1.86 % 455.49 25 0.54 % 84.59 48 0.69 % 135.92 10 0.33 % 64.15
dvesto dvesto dve sto 187 0.93 % 180.66 58 1.03 % 251.60 36 0.78 % 121.81 72 1.03 % 203.88 21 0.70 % 134.71
osemdeset osemdeset ose mdeset 173 0.86 % 167.13 54 0.96 % 234.25 23 0.50 % 77.82 48 0.69 % 135.92 48 1.60 % 307.90
devetdeset devetdeset dev etdeset 167 0.83 % 161.34 89 1.58 % 386.08 20 0.43 % 67.67 32 0.46 % 90.61 26 0.86 % 166.78
dvanajst dvanajst dva najst 165 0.82 % 159.40 38 0.68 % 164.84 49 1.06 % 165.80 55 0.79 % 155.74 23 0.77 % 147.54
sedmi sedmi sed mi 165 0.82 % 159.40 99 1.76 % 429.46 31 0.67 % 104.89 25 0.36 % 70.79 10 0.33 % 64.15
peti peti pet i 159 0.79 % 153.61 59 1.05 % 255.94 45 0.98 % 152.26 41 0.59 % 116.10 14 0.47 % 89.81
osmi osmi osm i 151 0.75 % 145.88 86 1.53 % 373.06 31 0.67 % 104.89 13 0.19 % 36.81 21 0.70 % 134.71
štirideset štirideset šti rideset 145 0.72 % 140.08 23 0.41 % 99.77 35 0.76 % 118.43 50 0.72 % 141.58 37 1.23 % 237.34
šestdeset šestdeset šes tdeset 143 0.71 % 138.15 21 0.37 % 91.10 24 0.52 % 81.21 47 0.68 % 133.09 51 1.70 % 327.15
tristo tristo tri sto 141 0.70 % 136.22 15 0.27 % 65.07 28 0.61 % 94.74 78 1.12 % 220.87 20 0.67 % 128.29

2.2.184 List of initial character-level 4-grams from numeral lemmas in the GOS 1.0 corpus with text-type distribution

File at CLARIN.SI: GOS1.0-word_parts-numerals-lemmas-initial-4grams-taxonomy-entire.tsv

Lemma Lemma (lower-case) Initial part of the word Rest of the word Total absolute frequency of lemma Percentage of all found lemmas Total relative frequency (per million
occurrences) Absolute frequency [gos.T.J.R] Percentage [gos.T.J.R] Relative frequency [gos.T.J.R] Absolute frequency [gos.T.N.Z] Percentage [gos.T.N.Z] Relative frequency [gos.T.N.Z] Absolute frequency [gos.T.J.I] Percentage [gos.T.J.I] Relative frequency [gos.T.J.I] Absolute frequency [gos.T.N.N] Percentage [gos.T.N.N] Relative frequency [gos.T.N.N] drug drug drug 2,356 15.88 % 2,276.09 465 10.12 % 2,017.15 488 16.26 % 1,651.20 1,066 20.52 % 3,018.56 337 16.53 % 2,161.74 prvi prvi prvi 1,315 8.87 % 1,270.40 411 8.94 % 1,782.90 259 8.63 % 876.36 491 9.45 % 1,390.35 154 7.55 % 987.86 trije trije trij e 1,150 7.75 % 1,111 357 7.77 % 1,548.65 259 8.63 % 876.36 386 7.43 % 1,093.02 148 7.26 % 949.37 štirje štirje štir je 667 4.50 % 644.38 189 4.11 % 819.87 158 5.26 % 534.61 240 4.62 % 679.60 80 3.92 % 513.17 tisoč tisoč tiso č 618 4.17 % 597.04 204 4.44 % 884.94 41 1.37 % 138.73 321 6.18 % 908.96 52 2.55 % 333.56 šest šest šest 480 3.24 % 463.72 212 4.61 % 919.65 101 3.36 % 341.74 121 2.33 % 342.63 46 2.26 % 295.07 deset deset dese t 476 3.21 % 459.86 128 2.78 % 555.26 151 5.03 % 510.93 123 2.37 % 348.29 74 3.63 % 474.68 osem osem osem 392 2.64 % 378.70 195 4.24 % 845.90 57 1.90 % 192.87 97 1.87 % 274.67 43 2.11 % 275.83 dvajset dvajset dvaj set 367 2.47 % 354.55 98 2.13 % 425.12 91 3.03 % 307.91 113 2.17 % 319.98 65 3.19 % 416.95 tretji tretji tret ji 315 2.12 % 304.32 98 2.13 % 425.12 66 2.20 % 223.32 124 2.39 % 351.13 27 1.32 % 173.20 sedem sedem sede m 306 2.06 % 295.62 111 2.42 % 481.51 82 2.73 % 277.46 88 1.69 % 249.19 25 1.23 % 160.37 trideset trideset trid eset 299 2.02 % 288.86 95 2.07 % 412.11 56 1.86 % 189.48 74 1.42 % 209.54 74 3.63 % 474.68 petnajst petnajst petn ajst 265 1.79 % 256.01 70 1.52 % 303.66 65 2.17 % 219.93 62 1.19 % 175.56 68 3.33 % 436.20 petdeset petdeset petd eset 256 1.73 % 247.32 32 0.70 % 138.81 86 2.87 % 290.99 85 1.64 % 240.69 53 2.60 % 339.98 devet devet deve t 247 1.67 % 238.62 103 2.24 % 446.81 39 1.30 % 131.96 64 1.23 % 181.23 41 
2.01 % 263
eden eden eden 219 1.48 % 211.57 57 1.24 % 247.26 54 1.80 % 182.72 87 1.67 % 246.35 21 1.03 % 134.71
šesti šesti šest i 188 1.27 % 181.62 105 2.28 % 455.49 25 0.83 % 84.59 48 0.92 % 135.92 10 0.49 % 64.15
dvesto dvesto dves to 187 1.26 % 180.66 58 1.26 % 251.60 36 1.20 % 121.81 72 1.39 % 203.88 21 1.03 % 134.71
osemdeset osemdeset osem deset 173 1.17 % 167.13 54 1.18 % 234.25 23 0.77 % 77.82 48 0.92 % 135.92 48 2.35 % 307.90
devetdeset devetdeset deve tdeset 167 1.13 % 161.34 89 1.94 % 386.08 20 0.67 % 67.67 32 0.62 % 90.61 26 1.27 % 166.78
dvanajst dvanajst dvan ajst 165 1.11 % 159.40 38 0.83 % 164.84 49 1.63 % 165.80 55 1.06 % 155.74 23 1.13 % 147.54
sedmi sedmi sedm i 165 1.11 % 159.40 99 2.15 % 429.46 31 1.03 % 104.89 25 0.48 % 70.79 10 0.49 % 64.15
peti peti peti 159 1.07 % 153.61 59 1.28 % 255.94 45 1.50 % 152.26 41 0.79 % 116.10 14 0.69 % 89.81
osmi osmi osmi 151 1.02 % 145.88 86 1.87 % 373.06 31 1.03 % 104.89 13 0.25 % 36.81 21 1.03 % 134.71
štirideset štirideset štir ideset 145 0.98 % 140.08 23 0.50 % 99.77 35 1.17 % 118.43 50 0.96 % 141.58 37 1.81 % 237.34
šestdeset šestdeset šest deset 143 0.96 % 138.15 21 0.46 % 91.10 24 0.80 % 81.21 47 0.91 % 133.09 51 2.50 % 327.15
tristo tristo tris to 141 0.95 % 136.22 15 0.33 % 65.07 28 0.93 % 94.74 78 1.50 % 220.87 20 0.98 % 128.29
četrti četrti četr ti 134 0.90 % 129.46 46 1.00 % 199.55 25 0.83 % 84.59 46 0.89 % 130.26 17 0.83 % 109.05
deveti deveti deve ti 109 0.73 % 105.30 51 1.11 % 221.24 27 0.90 % 91.36 13 0.25 % 36.81 18 0.88 % 115.46
petsto petsto pets to 107 0.72 % 103.37 26 0.57 % 112.79 22 0.73 % 74.44 54 1.04 % 152.91 5 0.24 % 32.07
štirinajst štirinajst štir inajst 103 0.69 % 99.51 33 0.72 % 143.15 30 1.00 % 101.51 31 0.60 % 87.78 9 0.44 % 57.73
petindvajset petindvajset peti ndvajset 99 0.67 % 95.64 13 0.28 % 56.39 25 0.83 % 84.59 40 0.77 % 113.27 21 1.03 % 134.71

2.2.185 List of initial character-level 5-grams from numeral lemmas in the GOS 1.0 corpus with text-type distribution

File at CLARIN.SI: GOS1.0-word_parts-numerals-lemmas-initial-5grams-taxonomy-entire.tsv

Columns: Lemma; Lemma (lower-case); Initial part of the word; Rest of the word; Total absolute frequency of lemma; Percentage of all found lemmas; Total relative frequency (per million occurrences); Absolute frequency [gos.T.J.R]; Percentage [gos.T.J.R]; Relative frequency [gos.T.J.R]; Absolute frequency [gos.T.N.Z]; Percentage [gos.T.N.Z]; Relative frequency [gos.T.N.Z]; Absolute frequency [gos.T.J.I]; Percentage [gos.T.J.I]; Relative frequency [gos.T.J.I]; Absolute frequency [gos.T.N.N]; Percentage [gos.T.N.N]; Relative frequency [gos.T.N.N]

trije trije trije 1,150 11.79 % 1,111 357 11.47 % 1,548.65 259 13.18 % 876.36 386 11.78 % 1,093.02 148 10.59 % 949.37
štirje štirje štirj e 667 6.84 % 644.38 189 6.07 % 819.87 158 8.04 % 534.61 240 7.32 % 679.60 80 5.72 % 513.17
tisoč tisoč tisoč 618 6.34 % 597.04 204 6.55 % 884.94 41 2.09 % 138.73 321 9.79 % 908.96 52 3.72 % 333.56
deset deset deset 476 4.88 % 459.86 128 4.11 % 555.26 151 7.68 % 510.93 123 3.75 % 348.29 74 5.29 % 474.68
dvajset dvajset dvajs et 367 3.76 % 354.55 98 3.15 % 425.12 91 4.63 % 307.91 113 3.45 % 319.98 65 4.65 % 416.95
tretji tretji tretj i 315 3.23 % 304.32 98 3.15 % 425.12 66 3.36 % 223.32 124 3.78 % 351.13 27 1.93 % 173.20
sedem sedem sedem 306 3.14 % 295.62 111 3.57 % 481.51 82 4.17 % 277.46 88 2.69 % 249.19 25 1.79 % 160.37
trideset trideset tride set 299 3.07 % 288.86 95 3.05 % 412.11 56 2.85 % 189.48 74 2.26 % 209.54 74 5.29 % 474.68
petnajst petnajst petna jst 265 2.72 % 256.01 70 2.25 % 303.66 65 3.31 % 219.93 62 1.89 % 175.56 68 4.86 % 436.20
petdeset petdeset petde set 256 2.62 % 247.32 32 1.03 % 138.81 86 4.38 % 290.99 85 2.59 % 240.69 53 3.79 % 339.98
devet devet devet 247 2.53 % 238.62 103 3.31 % 446.81 39 1.99 % 131.96 64 1.95 % 181.23 41 2.93 % 263
šesti šesti šesti 188 1.93 % 181.62 105 3.37 % 455.49 25 1.27 % 84.59 48 1.46 % 135.92 10 0.71 % 64.15
dvesto dvesto dvest o 187 1.92 % 180.66 58 1.86 % 251.60 36 1.83 % 121.81 72 2.20 % 203.88 21 1.50 % 134.71 osemdeset osemdeset osemd eset 173 1.77 % 167.13 54 1.74 % 234.25 23 1.17 % 77.82 48 1.46 % 135.92 48 3.43 % 307.90 devetdeset devetdeset devet deset 167 1.71 % 161.34 89 2.86 % 386.08 20 1.02 % 67.67 32 0.98 % 90.61 26 1.86 % 166.78 dvanajst dvanajst dvana jst 165 1.69 % 159.40 38 1.22 % 164.84 49 2.49 % 165.80 55 1.68 % 155.74 23 1.65 % 147.54 sedmi sedmi sedmi 165 1.69 % 159.40 99 3.18 % 429.46 31 1.58 % 104.89 25 0.76 % 70.79 10 0.71 % 64.15 štirideset štirideset štiri deset 145 1.49 % 140.08 23 0.74 % 99.77 35 1.78 % 118.43 50 1.52 % 141.58 37 2.65 % 237.34 šestdeset šestdeset šestd eset 143 1.47 % 138.15 21 0.68 % 91.10 24 1.22 % 81.21 47 1.43 % 133.09 51 3.65 % 327.15 tristo tristo trist o 141 1.45 % 136.22 15 0.48 % 65.07 28 1.43 % 94.74 78 2.38 % 220.87 20 1.43 % 128.29 četrti četrti četrt i 134 1.37 % 129.46 46 1.48 % 199.55 25 1.27 % 84.59 46 1.40 % 130.26 17 1.22 % 109.05 deveti deveti devet i 109 1.12 % 105.30 51 1.64 % 221.24 27 1.37 % 91.36 13 0.40 % 36.81 18 1.29 % 115.46 petsto petsto petst o 107 1.10 % 103.37 26 0.83 % 112.79 22 1.12 % 74.44 54 1.65 % 152.91 5 0.36 % 32.07 štirinajst štirinajst štiri najst 103 1.06 % 99.51 33 1.06 % 143.15 30 1.53 % 101.51 31 0.95 % 87.78 9 0.64 % 57.73 petindvajset petindvajset petin dvajset 99 1.01 % 95.64 13 0.42 % 56.39 25 1.27 % 84.59 40 1.22 % 113.27 21 1.50 % 134.71 sedemdeset sedemdeset sedem deset 99 1.01 % 95.64 20 0.64 % 86.76 20 1.02 % 67.67 40 1.22 % 113.27 19 1.36 % 121.88 enajst enajst enajs t 94 0.96 % 90.81 27 0.87 % 117.12 28 1.43 % 94.74 27 0.82 % 76.45 12 0.86 % 76.98 deseti deseti deset i 85 0.87 % 82.12 34 1.09 % 147.49 17 0.86 % 57.52 28 0.85 % 79.29 6 0.43 % 38.49 petinštirideset petinštirideset petin štirideset 78 0.80 % 75.35 5 0.16 % 21.69 12 0.61 % 40.60 11 0.34 % 31.15 50 3.58 % 320.73 štiristo štiristo štiri sto 76 0.78 % 73.42 16 0.51 % 69.41 17 0.86 % 57.52 35 1.07 % 99.11 8 
0.57 % 51.32
trinajst trinajst trina jst 71 0.73 % 68.59 23 0.74 % 99.77 7 0.36 % 23.69 36 1.10 % 101.94 5 0.36 % 32.07
sedemsto sedemsto sedem sto 65 0.67 % 62.80 18 0.58 % 78.08 7 0.36 % 23.69 36 1.10 % 101.94 4 0.29 % 25.66

2.2.186 List of final character-level 1-grams from numeral lemmas in the GOS 1.0 corpus with text-type distribution

File at CLARIN.SI: GOS1.0-word_parts-numerals-lemmas-final-1grams-taxonomy-entire.tsv

Columns: Lemma; Lemma (lower-case); Rest of the word; Final part of the word; Total absolute frequency of lemma; Percentage of all found lemmas; Total relative frequency (per million occurrences); Absolute frequency [gos.T.J.R]; Percentage [gos.T.J.R]; Relative frequency [gos.T.J.R]; Absolute frequency [gos.T.N.Z]; Percentage [gos.T.N.Z]; Relative frequency [gos.T.N.Z]; Absolute frequency [gos.T.J.I]; Percentage [gos.T.J.I]; Relative frequency [gos.T.J.I]; Absolute frequency [gos.T.N.N]; Percentage [gos.T.N.N]; Relative frequency [gos.T.N.N]

en en e n 5,114 19.96 % 4,940.55 1,075 15.76 % 4,663.31 1,811 28.00 % 6,127.72 1,472 17.26 % 4,168.21 756 19.91 % 4,849.48
drug drug dru g 2,356 9.20 % 2,276.09 465 6.82 % 2,017.15 488 7.54 % 1,651.20 1,066 12.50 % 3,018.56 337 8.87 % 2,161.74
dva dva dv a 2,275 8.88 % 2,197.84 457 6.70 % 1,982.45 603 9.32 % 2,040.32 905 10.61 % 2,562.66 310 8.16 % 1,988.54
MMM mmm MM M 1,522 5.94 % 1,470.38 193 2.83 % 837.23 605 9.35 % 2,047.09 351 4.12 % 993.91 373 9.82 % 2,392.67
prvi prvi prv i 1,315 5.13 % 1,270.40 411 6.02 % 1,782.90 259 4.00 % 876.36 491 5.76 % 1,390.35 154 4.05 % 987.86
trije trije trij e 1,150 4.49 % 1,111 357 5.23 % 1,548.65 259 4.00 % 876.36 386 4.53 % 1,093.02 148 3.90 % 949.37
pet pet pe t 669 2.61 % 646.31 201 2.95 % 871.93 183 2.83 % 619.20 187 2.19 % 529.52 98 2.58 % 628.64
štirje štirje štirj e 667 2.60 % 644.38 189 2.77 % 819.87 158 2.44 % 534.61 240 2.81 % 679.60 80 2.11 % 513.17
tisoč tisoč tiso č 618 2.41 % 597.04 204 2.99 % 884.94
41 0.63 % 138.73 321 3.76 % 908.96 52 1.37 % 333.56 šest šest šes t 480 1.87 % 463.72 212 3.11 % 919.65 101 1.56 % 341.74 121 1.42 % 342.63 46 1.21 % 295.07 sto sto st o 478 1.87 % 461.79 61 0.89 % 264.62 107 1.65 % 362.05 176 2.06 % 498.37 134 3.53 % 859.56 deset deset dese t 476 1.86 % 459.86 128 1.88 % 555.26 151 2.33 % 510.93 123 1.44 % 348.29 74 1.95 % 474.68 osem osem ose m 392 1.53 % 378.70 195 2.86 % 845.90 57 0.88 % 192.87 97 1.14 % 274.67 43 1.13 % 275.83 dvajset dvajset dvajse t 367 1.43 % 354.55 98 1.44 % 425.12 91 1.41 % 307.91 113 1.32 % 319.98 65 1.71 % 416.95 ena ena en a 316 1.23 % 305.28 67 0.98 % 290.64 104 1.61 % 351.90 96 1.12 % 271.84 49 1.29 % 314.32 tretji tretji tretj i 315 1.23 % 304.32 98 1.44 % 425.12 66 1.02 % 223.32 124 1.45 % 351.13 27 0.71 % 173.20 sedem sedem sede m 306 1.19 % 295.62 111 1.63 % 481.51 82 1.27 % 277.46 88 1.03 % 249.19 25 0.66 % 160.37 trideset trideset tridese t 299 1.17 % 288.86 95 1.39 % 412.11 56 0.87 % 189.48 74 0.87 % 209.54 74 1.95 % 474.68 petnajst petnajst petnajs t 265 1.03 % 256.01 70 1.03 % 303.66 65 1.00 % 219.93 62 0.73 % 175.56 68 1.79 % 436.20 petdeset petdeset petdese t 256 1.00 % 247.32 32 0.47 % 138.81 86 1.33 % 290.99 85 1.00 % 240.69 53 1.40 % 339.98 devet devet deve t 247 0.96 % 238.62 103 1.51 % 446.81 39 0.60 % 131.96 64 0.75 % 181.23 41 1.08 % 263 eden eden ede n 219 0.85 % 211.57 57 0.83 % 247.26 54 0.83 % 182.72 87 1.02 % 246.35 21 0.55 % 134.71 šesti šesti šest i 188 0.73 % 181.62 105 1.54 % 455.49 25 0.39 % 84.59 48 0.56 % 135.92 10 0.26 % 64.15 dvesto dvesto dvest o 187 0.73 % 180.66 58 0.85 % 251.60 36 0.56 % 121.81 72 0.84 % 203.88 21 0.55 % 134.71 osemdeset osemdeset osemdese t 173 0.68 % 167.13 54 0.79 % 234.25 23 0.36 % 77.82 48 0.56 % 135.92 48 1.26 % 307.90 devetdeset devetdeset devetdese t 167 0.65 % 161.34 89 1.30 % 386.08 20 0.31 % 67.67 32 0.38 % 90.61 26 0.69 % 166.78 dvanajst dvanajst dvanajs t 165 0.64 % 159.40 38 0.56 % 164.84 49 0.76 % 165.80 55 0.65 % 155.74 23 0.61 % 
147.54 sedmi sedmi sedm i 165 0.64 % 159.40 99 1.45 % 429.46 31 0.48 % 104.89 25 0.29 % 70.79 10 0.26 % 64.15 peti peti pet i 159 0.62 % 153.61 59 0.86 % 255.94 45 0.70 % 152.26 41 0.48 % 116.10 14 0.37 % 89.81 osmi osmi osm i 151 0.59 % 145.88 86 1.26 % 373.06 31 0.48 % 104.89 13 0.15 % 36.81 21 0.55 % 134.71 štirideset štirideset štiridese t 145 0.57 % 140.08 23 0.34 % 99.77 35 0.54 % 118.43 50 0.59 % 141.58 37 0.97 % 237.34 šestdeset šestdeset šestdese t 143 0.56 % 138.15 21 0.31 % 91.10 24 0.37 % 81.21 47 0.55 % 133.09 51 1.34 % 327.15 CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 530 File at CLARIN.SI2.2.187 List of final character-level 2-grams from numeral lemmas in the GOS 1.0 corpus with text-type distributionGOS1.0-word_parts-numerals-lemmas-final- 2grams-taxonomy-entire.tsvLemma Lemma (lower-case) Rest of the word Final part of the word Total absolute frequency of lemma Percentage of all found lemmas Total relative frequency (per million occurrences) Absolute frequency [gos.T.J.R] Percentage [gos.T.J.R] Relative frequency [gos.T.J.R] Absolute frequency [gos.T.N.Z] Percentage [gos.T.N.Z] Relative frequency [gos.T.N.Z] Absolute frequency [gos.T.J.I] Percentage [gos.T.J.I] Relative frequency [gos.T.J.I] Absolute frequency [gos.T.N.N] Percentage [gos.T.N.N] Relative frequency [gos.T.N.N] en en en 5,114 20.17 % 4,940.55 1,075 15.97 % 4,663.31 1,811 28.16 % 6,127.72 1,472 17.46 % 4,168.21 756 20.10 % 4,849.48 drug drug dr ug 2,356 9.29 % 2,276.09 465 6.91 % 2,017.15 488 7.59 % 1,651.20 1,066 12.65 % 3,018.56 337 8.96 % 2,161.74 dva dva d va 2,275 8.97 % 2,197.84 457 6.79 % 1,982.45 603 9.38 % 2,040.32 905 10.74 % 2,562.66 310 8.24 % 1,988.54 MMM mmm M MM 1,522 6.00 % 1,470.38 193 2.87 % 837.23 605 9.41 % 2,047.09 351 4.16 % 993.91 373 9.92 % 2,392.67 prvi prvi pr vi 1,315 5.19 % 1,270.40 411 6.11 % 1,782.90 259 4.03 % 876.36 491 5.83 % 1,390.35 154 4.09 % 987.86 trije trije tri je 1,150 4.54 % 1,111 357 5.30 % 1,548.65 259 
4.03 % 876.36 386 4.58 % 1,093.02 148 3.94 % 949.37 pet pet p et 669 2.64 % 646.31 201 2.99 % 871.93 183 2.85 % 619.20 187 2.22 % 529.52 98 2.61 % 628.64 štirje štirje štir je 667 2.63 % 644.38 189 2.81 % 819.87 158 2.46 % 534.61 240 2.85 % 679.60 80 2.13 % 513.17 tisoč tisoč tis oč 618 2.44 % 597.04 204 3.03 % 884.94 41 0.64 % 138.73 321 3.81 % 908.96 52 1.38 % 333.56 šest šest še st 480 1.89 % 463.72 212 3.15 % 919.65 101 1.57 % 341.74 121 1.44 % 342.63 46 1.22 % 295.07 sto sto s to 478 1.89 % 461.79 61 0.91 % 264.62 107 1.66 % 362.05 176 2.09 % 498.37 134 3.56 % 859.56 deset deset des et 476 1.88 % 459.86 128 1.90 % 555.26 151 2.35 % 510.93 123 1.46 % 348.29 74 1.97 % 474.68 osem osem os em 392 1.55 % 378.70 195 2.90 % 845.90 57 0.89 % 192.87 97 1.15 % 274.67 43 1.14 % 275.83 dvajset dvajset dvajs et 367 1.45 % 354.55 98 1.46 % 425.12 91 1.42 % 307.91 113 1.34 % 319.98 65 1.73 % 416.95 ena ena e na 316 1.25 % 305.28 67 0.99 % 290.64 104 1.62 % 351.90 96 1.14 % 271.84 49 1.30 % 314.32 tretji tretji tret ji 315 1.24 % 304.32 98 1.46 % 425.12 66 1.03 % 223.32 124 1.47 % 351.13 27 0.72 % 173.20 sedem sedem sed em 306 1.21 % 295.62 111 1.65 % 481.51 82 1.27 % 277.46 88 1.04 % 249.19 25 0.67 % 160.37 trideset trideset trides et 299 1.18 % 288.86 95 1.41 % 412.11 56 0.87 % 189.48 74 0.88 % 209.54 74 1.97 % 474.68 petnajst petnajst petnaj st 265 1.04 % 256.01 70 1.04 % 303.66 65 1.01 % 219.93 62 0.74 % 175.56 68 1.81 % 436.20 petdeset petdeset petdes et 256 1.01 % 247.32 32 0.47 % 138.81 86 1.34 % 290.99 85 1.01 % 240.69 53 1.41 % 339.98 devet devet dev et 247 0.97 % 238.62 103 1.53 % 446.81 39 0.61 % 131.96 64 0.76 % 181.23 41 1.09 % 263 eden eden ed en 219 0.86 % 211.57 57 0.85 % 247.26 54 0.84 % 182.72 87 1.03 % 246.35 21 0.56 % 134.71 šesti šesti šes ti 188 0.74 % 181.62 105 1.56 % 455.49 25 0.39 % 84.59 48 0.57 % 135.92 10 0.27 % 64.15 dvesto dvesto dves to 187 0.74 % 180.66 58 0.86 % 251.60 36 0.56 % 121.81 72 0.85 % 203.88 21 0.56 % 134.71 osemdeset osemdeset 
osemdes et 173 0.68 % 167.13 54 0.80 % 234.25 23 0.36 % 77.82 48 0.57 % 135.92 48 1.28 % 307.90 devetdeset devetdeset devetdes et 167 0.66 % 161.34 89 1.32 % 386.08 20 0.31 % 67.67 32 0.38 % 90.61 26 0.69 % 166.78 dvanajst dvanajst dvanaj st 165 0.65 % 159.40 38 0.56 % 164.84 49 0.76 % 165.80 55 0.65 % 155.74 23 0.61 % 147.54 sedmi sedmi sed mi 165 0.65 % 159.40 99 1.47 % 429.46 31 0.48 % 104.89 25 0.30 % 70.79 10 0.27 % 64.15 peti peti pe ti 159 0.63 % 153.61 59 0.88 % 255.94 45 0.70 % 152.26 41 0.49 % 116.10 14 0.37 % 89.81 osmi osmi os mi 151 0.60 % 145.88 86 1.28 % 373.06 31 0.48 % 104.89 13 0.15 % 36.81 21 0.56 % 134.71 štirideset štirideset štirides et 145 0.57 % 140.08 23 0.34 % 99.77 35 0.54 % 118.43 50 0.59 % 141.58 37 0.98 % 237.34 šestdeset šestdeset šestdes et 143 0.56 % 138.15 21 0.31 % 91.10 24 0.37 % 81.21 47 0.56 % 133.09 51 1.36 % 327.15 CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 531 File at CLARIN.SI2.2.188 List of final character-level 3-grams from numeral lemmas in the GOS 1.0 corpus with text-type distributionGOS1.0-word_parts-numerals-lemmas-final- 3grams-taxonomy-entire.tsvLemma Lemma (lower-case) Rest of the word Final part of the word Total absolute frequency of lemma Percentage of all found lemmas Total relative frequency (per million occurrences) Absolute frequency [gos.T.J.R] Percentage [gos.T.J.R] Relative frequency [gos.T.J.R] Absolute frequency [gos.T.N.Z] Percentage [gos.T.N.Z] Relative frequency [gos.T.N.Z] Absolute frequency [gos.T.J.I] Percentage [gos.T.J.I] Relative frequency [gos.T.J.I] Absolute frequency [gos.T.N.N] Percentage [gos.T.N.N] Relative frequency [gos.T.N.N] drug drug d rug 2,356 11.66 % 2,276.09 465 8.26 % 2,017.15 488 10.59 % 1,651.20 1,066 15.32 % 3,018.56 337 11.21 % 2,161.74 dva dva dva 2,275 11.26 % 2,197.84 457 8.11 % 1,982.45 603 13.09 % 2,040.32 905 13.01 % 2,562.66 310 10.32 % 1,988.54 MMM mmm MMM 1,522 7.53 % 1,470.38 193 3.43 % 837.23 605 13.13 % 2,047.09 351 5.05 % 
993.91 373 12.41 % 2,392.67 prvi prvi p rvi 1,315 6.51 % 1,270.40 411 7.30 % 1,782.90 259 5.62 % 876.36 491 7.06 % 1,390.35 154 5.12 % 987.86 trije trije tr ije 1,150 5.69 % 1,111 357 6.34 % 1,548.65 259 5.62 % 876.36 386 5.55 % 1,093.02 148 4.92 % 949.37 pet pet pet 669 3.31 % 646.31 201 3.57 % 871.93 183 3.97 % 619.20 187 2.69 % 529.52 98 3.26 % 628.64 štirje štirje šti rje 667 3.30 % 644.38 189 3.36 % 819.87 158 3.43 % 534.61 240 3.45 % 679.60 80 2.66 % 513.17 tisoč tisoč ti soč 618 3.06 % 597.04 204 3.62 % 884.94 41 0.89 % 138.73 321 4.62 % 908.96 52 1.73 % 333.56 šest šest š est 480 2.38 % 463.72 212 3.76 % 919.65 101 2.19 % 341.74 121 1.74 % 342.63 46 1.53 % 295.07 sto sto sto 478 2.37 % 461.79 61 1.08 % 264.62 107 2.32 % 362.05 176 2.53 % 498.37 134 4.46 % 859.56 deset deset de set 476 2.36 % 459.86 128 2.27 % 555.26 151 3.28 % 510.93 123 1.77 % 348.29 74 2.46 % 474.68 osem osem o sem 392 1.94 % 378.70 195 3.46 % 845.90 57 1.24 % 192.87 97 1.39 % 274.67 43 1.43 % 275.83 dvajset dvajset dvaj set 367 1.82 % 354.55 98 1.74 % 425.12 91 1.98 % 307.91 113 1.62 % 319.98 65 2.16 % 416.95 ena ena ena 316 1.56 % 305.28 67 1.19 % 290.64 104 2.26 % 351.90 96 1.38 % 271.84 49 1.63 % 314.32 tretji tretji tre tji 315 1.56 % 304.32 98 1.74 % 425.12 66 1.43 % 223.32 124 1.78 % 351.13 27 0.90 % 173.20 sedem sedem se dem 306 1.51 % 295.62 111 1.97 % 481.51 82 1.78 % 277.46 88 1.26 % 249.19 25 0.83 % 160.37 trideset trideset tride set 299 1.48 % 288.86 95 1.69 % 412.11 56 1.22 % 189.48 74 1.06 % 209.54 74 2.46 % 474.68 petnajst petnajst petna jst 265 1.31 % 256.01 70 1.24 % 303.66 65 1.41 % 219.93 62 0.89 % 175.56 68 2.26 % 436.20 petdeset petdeset petde set 256 1.27 % 247.32 32 0.57 % 138.81 86 1.87 % 290.99 85 1.22 % 240.69 53 1.76 % 339.98 devet devet de vet 247 1.22 % 238.62 103 1.83 % 446.81 39 0.85 % 131.96 64 0.92 % 181.23 41 1.36 % 263 eden eden e den 219 1.08 % 211.57 57 1.01 % 247.26 54 1.17 % 182.72 87 1.25 % 246.35 21 0.70 % 134.71 šesti šesti še sti 188 0.93 % 
181.62 105 1.86 % 455.49 25 0.54 % 84.59 48 0.69 % 135.92 10 0.33 % 64.15 dvesto dvesto dve sto 187 0.93 % 180.66 58 1.03 % 251.60 36 0.78 % 121.81 72 1.03 % 203.88 21 0.70 % 134.71 osemdeset osemdeset osemde set 173 0.86 % 167.13 54 0.96 % 234.25 23 0.50 % 77.82 48 0.69 % 135.92 48 1.60 % 307.90 devetdeset devetdeset devetde set 167 0.83 % 161.34 89 1.58 % 386.08 20 0.43 % 67.67 32 0.46 % 90.61 26 0.86 % 166.78 dvanajst dvanajst dvana jst 165 0.82 % 159.40 38 0.68 % 164.84 49 1.06 % 165.80 55 0.79 % 155.74 23 0.77 % 147.54 sedmi sedmi se dmi 165 0.82 % 159.40 99 1.76 % 429.46 31 0.67 % 104.89 25 0.36 % 70.79 10 0.33 % 64.15 peti peti p eti 159 0.79 % 153.61 59 1.05 % 255.94 45 0.98 % 152.26 41 0.59 % 116.10 14 0.47 % 89.81 osmi osmi o smi 151 0.75 % 145.88 86 1.53 % 373.06 31 0.67 % 104.89 13 0.19 % 36.81 21 0.70 % 134.71 štirideset štirideset štiride set 145 0.72 % 140.08 23 0.41 % 99.77 35 0.76 % 118.43 50 0.72 % 141.58 37 1.23 % 237.34 šestdeset šestdeset šestde set 143 0.71 % 138.15 21 0.37 % 91.10 24 0.52 % 81.21 47 0.68 % 133.09 51 1.70 % 327.15 tristo tristo tri sto 141 0.70 % 136.22 15 0.27 % 65.07 28 0.61 % 94.74 78 1.12 % 220.87 20 0.67 % 128.29 CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 532 File at CLARIN.SI2.2.189 List of final character-level 4-grams from numeral lemmas in the GOS 1.0 corpus with text-type distributionGOS1.0-word_parts-numerals-lemmas-final- 4grams-taxonomy-entire.tsvLemma Lemma (lower-case) Rest of the word Final part of the word Total absolute frequency of lemma Percentage of all found lemmas Total relative frequency (per million occurrences) Absolute frequency [gos.T.J.R] Percentage [gos.T.J.R] Relative frequency [gos.T.J.R] Absolute frequency [gos.T.N.Z] Percentage [gos.T.N.Z] Relative frequency [gos.T.N.Z] Absolute frequency [gos.T.J.I] Percentage [gos.T.J.I] Relative frequency [gos.T.J.I] Absolute frequency [gos.T.N.N] Percentage [gos.T.N.N] Relative frequency [gos.T.N.N] drug drug drug 2,356 
15.88 % 2,276.09 465 10.12 % 2,017.15 488 16.26 % 1,651.20 1,066 20.52 % 3,018.56 337 16.53 % 2,161.74 prvi prvi prvi 1,315 8.87 % 1,270.40 411 8.94 % 1,782.90 259 8.63 % 876.36 491 9.45 % 1,390.35 154 7.55 % 987.86 trije trije t rije 1,150 7.75 % 1,111 357 7.77 % 1,548.65 259 8.63 % 876.36 386 7.43 % 1,093.02 148 7.26 % 949.37 štirje štirje št irje 667 4.50 % 644.38 189 4.11 % 819.87 158 5.26 % 534.61 240 4.62 % 679.60 80 3.92 % 513.17 tisoč tisoč t isoč 618 4.17 % 597.04 204 4.44 % 884.94 41 1.37 % 138.73 321 6.18 % 908.96 52 2.55 % 333.56 šest šest šest 480 3.24 % 463.72 212 4.61 % 919.65 101 3.36 % 341.74 121 2.33 % 342.63 46 2.26 % 295.07 deset deset d eset 476 3.21 % 459.86 128 2.78 % 555.26 151 5.03 % 510.93 123 2.37 % 348.29 74 3.63 % 474.68 osem osem osem 392 2.64 % 378.70 195 4.24 % 845.90 57 1.90 % 192.87 97 1.87 % 274.67 43 2.11 % 275.83 dvajset dvajset dva jset 367 2.47 % 354.55 98 2.13 % 425.12 91 3.03 % 307.91 113 2.17 % 319.98 65 3.19 % 416.95 tretji tretji tr etji 315 2.12 % 304.32 98 2.13 % 425.12 66 2.20 % 223.32 124 2.39 % 351.13 27 1.32 % 173.20 sedem sedem s edem 306 2.06 % 295.62 111 2.42 % 481.51 82 2.73 % 277.46 88 1.69 % 249.19 25 1.23 % 160.37 trideset trideset trid eset 299 2.02 % 288.86 95 2.07 % 412.11 56 1.86 % 189.48 74 1.42 % 209.54 74 3.63 % 474.68 petnajst petnajst petn ajst 265 1.79 % 256.01 70 1.52 % 303.66 65 2.17 % 219.93 62 1.19 % 175.56 68 3.33 % 436.20 petdeset petdeset petd eset 256 1.73 % 247.32 32 0.70 % 138.81 86 2.87 % 290.99 85 1.64 % 240.69 53 2.60 % 339.98 devet devet d evet 247 1.67 % 238.62 103 2.24 % 446.81 39 1.30 % 131.96 64 1.23 % 181.23 41 2.01 % 263 eden eden eden 219 1.48 % 211.57 57 1.24 % 247.26 54 1.80 % 182.72 87 1.67 % 246.35 21 1.03 % 134.71 šesti šesti š esti 188 1.27 % 181.62 105 2.28 % 455.49 25 0.83 % 84.59 48 0.92 % 135.92 10 0.49 % 64.15 dvesto dvesto dv esto 187 1.26 % 180.66 58 1.26 % 251.60 36 1.20 % 121.81 72 1.39 % 203.88 21 1.03 % 134.71 osemdeset osemdeset osemd eset 173 1.17 % 167.13 54 
1.18 % 234.25 23 0.77 % 77.82 48 0.92 % 135.92 48 2.35 % 307.90 devetdeset devetdeset devetd eset 167 1.13 % 161.34 89 1.94 % 386.08 20 0.67 % 67.67 32 0.62 % 90.61 26 1.27 % 166.78 dvanajst dvanajst dvan ajst 165 1.11 % 159.40 38 0.83 % 164.84 49 1.63 % 165.80 55 1.06 % 155.74 23 1.13 % 147.54 sedmi sedmi s edmi 165 1.11 % 159.40 99 2.15 % 429.46 31 1.03 % 104.89 25 0.48 % 70.79 10 0.49 % 64.15 peti peti peti 159 1.07 % 153.61 59 1.28 % 255.94 45 1.50 % 152.26 41 0.79 % 116.10 14 0.69 % 89.81 osmi osmi osmi 151 1.02 % 145.88 86 1.87 % 373.06 31 1.03 % 104.89 13 0.25 % 36.81 21 1.03 % 134.71 štirideset štirideset štirid eset 145 0.98 % 140.08 23 0.50 % 99.77 35 1.17 % 118.43 50 0.96 % 141.58 37 1.81 % 237.34 šestdeset šestdeset šestd eset 143 0.96 % 138.15 21 0.46 % 91.10 24 0.80 % 81.21 47 0.91 % 133.09 51 2.50 % 327.15 tristo tristo tr isto 141 0.95 % 136.22 15 0.33 % 65.07 28 0.93 % 94.74 78 1.50 % 220.87 20 0.98 % 128.29 četrti četrti če trti 134 0.90 % 129.46 46 1.00 % 199.55 25 0.83 % 84.59 46 0.89 % 130.26 17 0.83 % 109.05 deveti deveti de veti 109 0.73 % 105.30 51 1.11 % 221.24 27 0.90 % 91.36 13 0.25 % 36.81 18 0.88 % 115.46 petsto petsto pe tsto 107 0.72 % 103.37 26 0.57 % 112.79 22 0.73 % 74.44 54 1.04 % 152.91 5 0.24 % 32.07 štirinajst štirinajst štirin ajst 103 0.69 % 99.51 33 0.72 % 143.15 30 1.00 % 101.51 31 0.60 % 87.78 9 0.44 % 57.73 petindvajset petindvajset petindva jset 99 0.67 % 95.64 13 0.28 % 56.39 25 0.83 % 84.59 40 0.77 % 113.27 21 1.03 % 134.71 CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 533 File at CLARIN.SI2.2.190 List of final character-level 5-grams from numeral lemmas in the GOS 1.0 corpus with text-type distributionGOS1.0-word_parts-numerals-lemmas-final- 5grams-taxonomy-entire.tsvLemma Lemma (lower-case) Rest of the word Final part of the word Total absolute frequency of lemma Percentage of all found lemmas Total relative frequency (per million occurrences) Absolute frequency [gos.T.J.R] 
Percentage [gos.T.J.R] Relative frequency [gos.T.J.R] Absolute frequency [gos.T.N.Z] Percentage [gos.T.N.Z] Relative frequency [gos.T.N.Z] Absolute frequency [gos.T.J.I] Percentage [gos.T.J.I] Relative frequency [gos.T.J.I] Absolute frequency [gos.T.N.N] Percentage [gos.T.N.N] Relative frequency [gos.T.N.N] trije trije trije 1,150 11.79 % 1,111 357 11.47 % 1,548.65 259 13.18 % 876.36 386 11.78 % 1,093.02 148 10.59 % 949.37 štirje štirje š tirje 667 6.84 % 644.38 189 6.07 % 819.87 158 8.04 % 534.61 240 7.32 % 679.60 80 5.72 % 513.17 tisoč tisoč tisoč 618 6.34 % 597.04 204 6.55 % 884.94 41 2.09 % 138.73 321 9.79 % 908.96 52 3.72 % 333.56 deset deset deset 476 4.88 % 459.86 128 4.11 % 555.26 151 7.68 % 510.93 123 3.75 % 348.29 74 5.29 % 474.68 dvajset dvajset dv ajset 367 3.76 % 354.55 98 3.15 % 425.12 91 4.63 % 307.91 113 3.45 % 319.98 65 4.65 % 416.95 tretji tretji t retji 315 3.23 % 304.32 98 3.15 % 425.12 66 3.36 % 223.32 124 3.78 % 351.13 27 1.93 % 173.20 sedem sedem sedem 306 3.14 % 295.62 111 3.57 % 481.51 82 4.17 % 277.46 88 2.69 % 249.19 25 1.79 % 160.37 trideset trideset tri deset 299 3.07 % 288.86 95 3.05 % 412.11 56 2.85 % 189.48 74 2.26 % 209.54 74 5.29 % 474.68 petnajst petnajst pet najst 265 2.72 % 256.01 70 2.25 % 303.66 65 3.31 % 219.93 62 1.89 % 175.56 68 4.86 % 436.20 petdeset petdeset pet deset 256 2.62 % 247.32 32 1.03 % 138.81 86 4.38 % 290.99 85 2.59 % 240.69 53 3.79 % 339.98 devet devet devet 247 2.53 % 238.62 103 3.31 % 446.81 39 1.99 % 131.96 64 1.95 % 181.23 41 2.93 % 263 šesti šesti šesti 188 1.93 % 181.62 105 3.37 % 455.49 25 1.27 % 84.59 48 1.46 % 135.92 10 0.71 % 64.15 dvesto dvesto d vesto 187 1.92 % 180.66 58 1.86 % 251.60 36 1.83 % 121.81 72 2.20 % 203.88 21 1.50 % 134.71 osemdeset osemdeset osem deset 173 1.77 % 167.13 54 1.74 % 234.25 23 1.17 % 77.82 48 1.46 % 135.92 48 3.43 % 307.90 devetdeset devetdeset devet deset 167 1.71 % 161.34 89 2.86 % 386.08 20 1.02 % 67.67 32 0.98 % 90.61 26 1.86 % 166.78 dvanajst dvanajst dva najst 165 
1.69 % 159.40 38 1.22 % 164.84 49 2.49 % 165.80 55 1.68 % 155.74 23 1.65 % 147.54 sedmi sedmi sedmi 165 1.69 % 159.40 99 3.18 % 429.46 31 1.58 % 104.89 25 0.76 % 70.79 10 0.71 % 64.15 štirideset štirideset štiri deset 145 1.49 % 140.08 23 0.74 % 99.77 35 1.78 % 118.43 50 1.52 % 141.58 37 2.65 % 237.34 šestdeset šestdeset šest deset 143 1.47 % 138.15 21 0.68 % 91.10 24 1.22 % 81.21 47 1.43 % 133.09 51 3.65 % 327.15 tristo tristo t risto 141 1.45 % 136.22 15 0.48 % 65.07 28 1.43 % 94.74 78 2.38 % 220.87 20 1.43 % 128.29 četrti četrti č etrti 134 1.37 % 129.46 46 1.48 % 199.55 25 1.27 % 84.59 46 1.40 % 130.26 17 1.22 % 109.05 deveti deveti d eveti 109 1.12 % 105.30 51 1.64 % 221.24 27 1.37 % 91.36 13 0.40 % 36.81 18 1.29 % 115.46 petsto petsto p etsto 107 1.10 % 103.37 26 0.83 % 112.79 22 1.12 % 74.44 54 1.65 % 152.91 5 0.36 % 32.07 štirinajst štirinajst štiri najst 103 1.06 % 99.51 33 1.06 % 143.15 30 1.53 % 101.51 31 0.95 % 87.78 9 0.64 % 57.73 petindvajset petindvajset petindv ajset 99 1.01 % 95.64 13 0.42 % 56.39 25 1.27 % 84.59 40 1.22 % 113.27 21 1.50 % 134.71 sedemdeset sedemdeset sedem deset 99 1.01 % 95.64 20 0.64 % 86.76 20 1.02 % 67.67 40 1.22 % 113.27 19 1.36 % 121.88 enajst enajst e najst 94 0.96 % 90.81 27 0.87 % 117.12 28 1.43 % 94.74 27 0.82 % 76.45 12 0.86 % 76.98 deseti deseti d eseti 85 0.87 % 82.12 34 1.09 % 147.49 17 0.86 % 57.52 28 0.85 % 79.29 6 0.43 % 38.49 petinštirideset petinštirideset petinštiri deset 78 0.80 % 75.35 5 0.16 % 21.69 12 0.61 % 40.60 11 0.34 % 31.15 50 3.58 % 320.73 štiristo štiristo šti risto 76 0.78 % 73.42 16 0.51 % 69.41 17 0.86 % 57.52 35 1.07 % 99.11 8 0.57 % 51.32 trinajst trinajst tri najst 71 0.73 % 68.59 23 0.74 % 99.77 7 0.36 % 23.69 36 1.10 % 101.94 5 0.36 % 32.07 sedemsto sedemsto sed emsto 65 0.67 % 62.80 18 0.58 % 78.08 7 0.36 % 23.69 36 1.10 % 101.94 4 0.29 % 25.66 CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 534 File at CLARIN.SI2.2.191 List of initial character-level 1-grams 
from numeral standardized forms in the GOS 1.0 corpus with text- type distributionGOS1.0-word_parts-numerals-standardized_ forms-initial-1grams-taxonomy-entire.tsvStandardized form Initial part of the word Rest of the word Total absolute frequency of standardized form Percentage of all found standardized forms Total relative frequency (per million occurrences) Absolute frequency [gos.T.J.R] Percentage [gos.T.J.R] Relative frequency [gos.T.J.R] Absolute frequency [gos.T.N.Z] Percentage [gos.T.N.Z] Relative frequency [gos.T.N.Z] Absolute frequency [gos.T.J.I] Percentage [gos.T.J.I] Relative frequency [gos.T.J.I] Absolute frequency [gos.T.N.N] Percentage [gos.T.N.N] Relative frequency [gos.T.N.N] mmm m mm 1,522 5.94 % 1,470.38 193 2.83 % 837.23 605 9.35 % 2,047.09 351 4.12 % 993.91 373 9.82 % 2,392.67 en e n 1,504 5.87 % 1,452.99 302 4.43 % 1,310.06 576 8.90 % 1,948.96 411 4.82 % 1,163.81 215 5.66 % 1,379.15 dva d va 1,278 4.99 % 1,234.65 224 3.28 % 971.70 346 5.35 % 1,170.73 501 5.87 % 1,418.66 207 5.45 % 1,327.83 ena e na 1,178 4.60 % 1,138.05 309 4.53 % 1,340.43 293 4.53 % 991.40 419 4.91 % 1,186.47 157 4.13 % 1,007.10 eno e no 1,154 4.50 % 1,114.86 254 3.72 % 1,101.84 397 6.14 % 1,343.29 332 3.89 % 940.11 171 4.50 % 1,096.91 tri t ri 891 3.48 % 860.78 300 4.40 % 1,301.39 197 3.04 % 666.57 276 3.24 % 781.54 118 3.11 % 756.93 dve d ve 740 2.89 % 714.90 187 2.74 % 811.20 194 3.00 % 656.42 282 3.31 % 798.53 77 2.03 % 493.93 tisoč t isoč 617 2.41 % 596.07 203 2.98 % 880.61 41 0.63 % 138.73 321 3.76 % 908.96 52 1.37 % 333.56 pet p et 581 2.27 % 561.29 164 2.40 % 711.43 153 2.37 % 517.69 175 2.05 % 495.54 89 2.34 % 570.90 ene e ne 519 2.03 % 501.40 75 1.10 % 325.35 265 4.10 % 896.66 100 1.17 % 283.17 79 2.08 % 506.76 štiri š tiri 501 1.96 % 484.01 142 2.08 % 615.99 112 1.73 % 378.96 177 2.08 % 501.20 70 1.84 % 449.03 drugi d rugi 489 1.91 % 472.41 100 1.47 % 433.80 83 1.28 % 280.84 226 2.65 % 639.96 80 2.11 % 513.17 sto s to 470 1.83 % 454.06 61 0.89 % 264.62 105 1.62 % 
355.28 171 2.00 % 484.21 133 3.50 % 853.15 prvi p rvi 451 1.76 % 435.70 148 2.17 % 642.02 96 1.48 % 324.83 162 1.90 % 458.73 45 1.19 % 288.66 drugo d rugo 416 1.62 % 401.89 66 0.97 % 286.31 93 1.44 % 314.68 191 2.24 % 540.85 66 1.74 % 423.37 deset d eset 415 1.62 % 400.92 109 1.60 % 472.84 130 2.01 % 439.87 109 1.28 % 308.65 67 1.76 % 429.78 šest š est 411 1.60 % 397.06 174 2.55 % 754.81 80 1.24 % 270.69 115 1.35 % 325.64 42 1.11 % 269.42 drugega d rugega 397 1.55 % 383.54 90 1.32 % 390.42 123 1.90 % 416.18 127 1.49 % 359.62 57 1.50 % 365.64 enega e nega 375 1.46 % 362.28 68 1.00 % 294.98 140 2.16 % 473.71 104 1.22 % 294.49 63 1.66 % 404.12 dvajset d vajset 354 1.38 % 341.99 95 1.39 % 412.11 88 1.36 % 297.76 106 1.24 % 300.16 65 1.71 % 416.95 eni e ni 338 1.32 % 326.54 54 0.79 % 234.25 123 1.90 % 416.18 97 1.14 % 274.67 64 1.69 % 410.54 osem o sem 333 1.30 % 321.71 165 2.42 % 715.76 34 0.53 % 115.04 91 1.07 % 257.68 43 1.13 % 275.83 trideset t rideset 285 1.11 % 275.33 92 1.35 % 399.09 54 0.83 % 182.72 67 0.79 % 189.72 72 1.90 % 461.86 prvo p rvo 273 1.07 % 263.74 71 1.04 % 308 83 1.28 % 280.84 82 0.96 % 232.20 37 0.97 % 237.34 druga d ruga 252 0.98 % 243.45 60 0.88 % 260.28 45 0.70 % 152.26 111 1.30 % 314.31 36 0.95 % 230.93 petnajst p etnajst 252 0.98 % 243.45 66 0.97 % 286.31 64 0.99 % 216.55 55 0.65 % 155.74 67 1.76 % 429.78 petdeset p etdeset 245 0.96 % 236.69 30 0.44 % 130.14 82 1.27 % 277.46 81 0.95 % 229.36 52 1.37 % 333.56 sedem s edem 240 0.94 % 231.86 76 1.11 % 329.69 57 0.88 % 192.87 84 0.98 % 237.86 23 0.61 % 147.54 druge d ruge 229 0.89 % 221.23 39 0.57 % 169.18 45 0.70 % 152.26 120 1.41 % 339.80 25 0.66 % 160.37 eden e den 219 0.85 % 211.57 57 0.83 % 247.26 54 0.83 % 182.72 87 1.02 % 246.35 21 0.55 % 134.71 devet d evet 203 0.79 % 196.11 87 1.27 % 377.40 18 0.28 % 60.91 59 0.69 % 167.07 39 1.03 % 250.17 prva p rva 186 0.73 % 179.69 50 0.73 % 216.90 17 0.26 % 57.52 91 1.07 % 257.68 28 0.74 % 179.61 CJVT // A Guide to Frequency Lists from the Gigafida 
2.0 and GOS 1.0 Corpora // 535 File at CLARIN.SI2.2.192 List of initial character-level 2-grams from numeral standardized forms in the GOS 1.0 corpus with text- type distributionGOS1.0-word_parts-numerals-standardized_ forms-initial-2grams-taxonomy-entire.tsvStandardized form Initial part of the word Rest of the word Total absolute frequency of standardized form Percentage of all found standardized forms Total relative frequency (per million occurrences) Absolute frequency [gos.T.J.R] Percentage [gos.T.J.R] Relative frequency [gos.T.J.R] Absolute frequency [gos.T.N.Z] Percentage [gos.T.N.Z] Relative frequency [gos.T.N.Z] Absolute frequency [gos.T.J.I] Percentage [gos.T.J.I] Relative frequency [gos.T.J.I] Absolute frequency [gos.T.N.N] Percentage [gos.T.N.N] Relative frequency [gos.T.N.N] mmm mm m 1,522 6.00 % 1,470.38 193 2.87 % 837.23 605 9.41 % 2,047.09 351 4.16 % 993.91 373 9.92 % 2,392.67 en en 1,504 5.93 % 1,452.99 302 4.49 % 1,310.06 576 8.96 % 1,948.96 411 4.88 % 1,163.81 215 5.72 % 1,379.15 dva dv a 1,278 5.04 % 1,234.65 224 3.33 % 971.70 346 5.38 % 1,170.73 501 5.94 % 1,418.66 207 5.50 % 1,327.83 ena en a 1,178 4.65 % 1,138.05 309 4.59 % 1,340.43 293 4.56 % 991.40 419 4.97 % 1,186.47 157 4.17 % 1,007.10 eno en o 1,154 4.55 % 1,114.86 254 3.77 % 1,101.84 397 6.17 % 1,343.29 332 3.94 % 940.11 171 4.55 % 1,096.91 tri tr i 891 3.52 % 860.78 300 4.46 % 1,301.39 197 3.06 % 666.57 276 3.27 % 781.54 118 3.14 % 756.93 dve dv e 740 2.92 % 714.90 187 2.78 % 811.20 194 3.02 % 656.42 282 3.35 % 798.53 77 2.05 % 493.93 tisoč ti soč 617 2.43 % 596.07 203 3.02 % 880.61 41 0.64 % 138.73 321 3.81 % 908.96 52 1.38 % 333.56 pet pe t 581 2.29 % 561.29 164 2.44 % 711.43 153 2.38 % 517.69 175 2.08 % 495.54 89 2.37 % 570.90 ene en e 519 2.05 % 501.40 75 1.11 % 325.35 265 4.12 % 896.66 100 1.19 % 283.17 79 2.10 % 506.76 štiri št iri 501 1.98 % 484.01 142 2.11 % 615.99 112 1.74 % 378.96 177 2.10 % 501.20 70 1.86 % 449.03 drugi dr ugi 489 1.93 % 472.41 100 1.49 % 433.80 83 1.29 % 
280.84 226 2.68 % 639.96 80 2.13 % 513.17 sto st o 470 1.85 % 454.06 61 0.91 % 264.62 105 1.63 % 355.28 171 2.03 % 484.21 133 3.54 % 853.15 prvi pr vi 451 1.78 % 435.70 148 2.20 % 642.02 96 1.49 % 324.83 162 1.92 % 458.73 45 1.20 % 288.66 drugo dr ugo 416 1.64 % 401.89 66 0.98 % 286.31 93 1.45 % 314.68 191 2.27 % 540.85 66 1.75 % 423.37 deset de set 415 1.64 % 400.92 109 1.62 % 472.84 130 2.02 % 439.87 109 1.29 % 308.65 67 1.78 % 429.78 šest še st 411 1.62 % 397.06 174 2.58 % 754.81 80 1.24 % 270.69 115 1.36 % 325.64 42 1.12 % 269.42 drugega dr ugega 397 1.57 % 383.54 90 1.34 % 390.42 123 1.91 % 416.18 127 1.51 % 359.62 57 1.52 % 365.64 enega en ega 375 1.48 % 362.28 68 1.01 % 294.98 140 2.18 % 473.71 104 1.23 % 294.49 63 1.68 % 404.12 dvajset dv ajset 354 1.40 % 341.99 95 1.41 % 412.11 88 1.37 % 297.76 106 1.26 % 300.16 65 1.73 % 416.95 eni en i 338 1.33 % 326.54 54 0.80 % 234.25 123 1.91 % 416.18 97 1.15 % 274.67 64 1.70 % 410.54 osem os em 333 1.31 % 321.71 165 2.45 % 715.76 34 0.53 % 115.04 91 1.08 % 257.68 43 1.14 % 275.83 trideset tr ideset 285 1.12 % 275.33 92 1.37 % 399.09 54 0.84 % 182.72 67 0.80 % 189.72 72 1.91 % 461.86 prvo pr vo 273 1.08 % 263.74 71 1.05 % 308 83 1.29 % 280.84 82 0.97 % 232.20 37 0.98 % 237.34 druga dr uga 252 0.99 % 243.45 60 0.89 % 260.28 45 0.70 % 152.26 111 1.32 % 314.31 36 0.96 % 230.93 petnajst pe tnajst 252 0.99 % 243.45 66 0.98 % 286.31 64 0.99 % 216.55 55 0.65 % 155.74 67 1.78 % 429.78 petdeset pe tdeset 245 0.97 % 236.69 30 0.45 % 130.14 82 1.27 % 277.46 81 0.96 % 229.36 52 1.38 % 333.56 sedem se dem 240 0.95 % 231.86 76 1.13 % 329.69 57 0.89 % 192.87 84 1.00 % 237.86 23 0.61 % 147.54 druge dr uge 229 0.90 % 221.23 39 0.58 % 169.18 45 0.70 % 152.26 120 1.42 % 339.80 25 0.67 % 160.37 eden ed en 219 0.86 % 211.57 57 0.85 % 247.26 54 0.84 % 182.72 87 1.03 % 246.35 21 0.56 % 134.71 devet de vet 203 0.80 % 196.11 87 1.29 % 377.40 18 0.28 % 60.91 59 0.70 % 167.07 39 1.04 % 250.17 prva pr va 186 0.73 % 179.69 50 0.74 % 216.90 17 
0.26 % 57.52 91 1.08 % 257.68 28 0.74 % 179.61 CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 536 File at CLARIN.SI2.2.193 List of initial character-level 3-grams from numeral standardized forms in the GOS 1.0 corpus with text- type distributionGOS1.0-word_parts-numerals-standardized_ forms-initial-3grams-taxonomy-entire.tsvStandardized form Initial part of the word Rest of the word Total absolute frequency of standardized form Percentage of all found standardized forms Total relative frequency (per million occurrences) Absolute frequency [gos.T.J.R] Percentage [gos.T.J.R] Relative frequency [gos.T.J.R] Absolute frequency [gos.T.N.Z] Percentage [gos.T.N.Z] Relative frequency [gos.T.N.Z] Absolute frequency [gos.T.J.I] Percentage [gos.T.J.I] Relative frequency [gos.T.J.I] Absolute frequency [gos.T.N.N] Percentage [gos.T.N.N] Relative frequency [gos.T.N.N] mmm mmm 1,522 6.39 % 1,470.38 193 3.01 % 837.23 605 10.35 % 2,047.09 351 4.38 % 993.91 373 10.52 % 2,392.67 dva dva 1,278 5.37 % 1,234.65 224 3.50 % 971.70 346 5.92 % 1,170.73 501 6.25 % 1,418.66 207 5.84 % 1,327.83 ena ena 1,178 4.95 % 1,138.05 309 4.82 % 1,340.43 293 5.01 % 991.40 419 5.23 % 1,186.47 157 4.43 % 1,007.10 eno eno 1,154 4.85 % 1,114.86 254 3.97 % 1,101.84 397 6.79 % 1,343.29 332 4.14 % 940.11 171 4.82 % 1,096.91 tri tri 891 3.74 % 860.78 300 4.68 % 1,301.39 197 3.37 % 666.57 276 3.44 % 781.54 118 3.33 % 756.93 dve dve 740 3.11 % 714.90 187 2.92 % 811.20 194 3.32 % 656.42 282 3.52 % 798.53 77 2.17 % 493.93 tisoč tis oč 617 2.59 % 596.07 203 3.17 % 880.61 41 0.70 % 138.73 321 4.00 % 908.96 52 1.47 % 333.56 pet pet 581 2.44 % 561.29 164 2.56 % 711.43 153 2.62 % 517.69 175 2.18 % 495.54 89 2.51 % 570.90 ene ene 519 2.18 % 501.40 75 1.17 % 325.35 265 4.54 % 896.66 100 1.25 % 283.17 79 2.23 % 506.76 štiri šti ri 501 2.10 % 484.01 142 2.22 % 615.99 112 1.92 % 378.96 177 2.21 % 501.20 70 1.98 % 449.03 drugi dru gi 489 2.05 % 472.41 100 1.56 % 433.80 83 1.42 % 280.84 226 2.82 
% 639.96 80 2.26 % 513.17 sto sto 470 1.97 % 454.06 61 0.95 % 264.62 105 1.80 % 355.28 171 2.13 % 484.21 133 3.75 % 853.15 prvi prv i 451 1.89 % 435.70 148 2.31 % 642.02 96 1.64 % 324.83 162 2.02 % 458.73 45 1.27 % 288.66 drugo dru go 416 1.75 % 401.89 66 1.03 % 286.31 93 1.59 % 314.68 191 2.38 % 540.85 66 1.86 % 423.37 deset des et 415 1.74 % 400.92 109 1.70 % 472.84 130 2.23 % 439.87 109 1.36 % 308.65 67 1.89 % 429.78 šest šes t 411 1.73 % 397.06 174 2.72 % 754.81 80 1.37 % 270.69 115 1.43 % 325.64 42 1.19 % 269.42 drugega dru gega 397 1.67 % 383.54 90 1.41 % 390.42 123 2.10 % 416.18 127 1.58 % 359.62 57 1.61 % 365.64 enega ene ga 375 1.57 % 362.28 68 1.06 % 294.98 140 2.40 % 473.71 104 1.30 % 294.49 63 1.78 % 404.12 dvajset dva jset 354 1.49 % 341.99 95 1.48 % 412.11 88 1.51 % 297.76 106 1.32 % 300.16 65 1.83 % 416.95 eni eni 338 1.42 % 326.54 54 0.84 % 234.25 123 2.10 % 416.18 97 1.21 % 274.67 64 1.80 % 410.54 osem ose m 333 1.40 % 321.71 165 2.58 % 715.76 34 0.58 % 115.04 91 1.14 % 257.68 43 1.21 % 275.83 trideset tri deset 285 1.20 % 275.33 92 1.44 % 399.09 54 0.92 % 182.72 67 0.84 % 189.72 72 2.03 % 461.86 prvo prv o 273 1.15 % 263.74 71 1.11 % 308 83 1.42 % 280.84 82 1.02 % 232.20 37 1.04 % 237.34 druga dru ga 252 1.06 % 243.45 60 0.94 % 260.28 45 0.77 % 152.26 111 1.39 % 314.31 36 1.02 % 230.93 petnajst pet najst 252 1.06 % 243.45 66 1.03 % 286.31 64 1.09 % 216.55 55 0.69 % 155.74 67 1.89 % 429.78 petdeset pet deset 245 1.03 % 236.69 30 0.47 % 130.14 82 1.40 % 277.46 81 1.01 % 229.36 52 1.47 % 333.56 sedem sed em 240 1.01 % 231.86 76 1.19 % 329.69 57 0.98 % 192.87 84 1.05 % 237.86 23 0.65 % 147.54 druge dru ge 229 0.96 % 221.23 39 0.61 % 169.18 45 0.77 % 152.26 120 1.50 % 339.80 25 0.70 % 160.37 eden ede n 219 0.92 % 211.57 57 0.89 % 247.26 54 0.92 % 182.72 87 1.08 % 246.35 21 0.59 % 134.71 devet dev et 203 0.85 % 196.11 87 1.36 % 377.40 18 0.31 % 60.91 59 0.74 % 167.07 39 1.10 % 250.17 prva prv a 186 0.78 % 179.69 50 0.78 % 216.90 17 0.29 % 57.52 91 1.14 
% 257.68 28 0.79 % 179.61
dvesto | dve | sto | 185 | 0.78 % | 178.73 | 58 | 0.91 % | 251.60 | 36 | 0.62 % | 121.81 | 70 | 0.87 % | 198.22 | 21 | 0.59 % | 134.71

2.2.194 List of initial character-level 4-grams from numeral standardized forms in the GOS 1.0 corpus with text-type distribution
File at CLARIN.SI: GOS1.0-word_parts-numerals-standardized_forms-initial-4grams-taxonomy-entire.tsv

Standardized form | Initial part of the word | Rest of the word | Total absolute frequency of standardized form | Percentage of all found standardized forms | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N]
tisoč | tiso | č | 617 | 4.11 % | 596.07 | 203 | 4.49 % | 880.61 | 41 | 1.30 % | 138.73 | 321 | 6.09 % | 908.96 | 52 | 2.51 % | 333.56
štiri | štir | i | 501 | 3.33 % | 484.01 | 142 | 3.14 % | 615.99 | 112 | 3.55 % | 378.96 | 177 | 3.36 % | 501.20 | 70 | 3.38 % | 449.03
drugi | drug | i | 489 | 3.25 % | 472.41 | 100 | 2.21 % | 433.80 | 83 | 2.63 % | 280.84 | 226 | 4.29 % | 639.96 | 80 | 3.86 % | 513.17
prvi | prvi |  | 451 | 3.00 % | 435.70 | 148 | 3.27 % | 642.02 | 96 | 3.04 % | 324.83 | 162 | 3.08 % | 458.73 | 45 | 2.17 % | 288.66
drugo | drug | o | 416 | 2.77 % | 401.89 | 66 | 1.46 % | 286.31 | 93 | 2.95 % | 314.68 | 191 | 3.63 % | 540.85 | 66 | 3.18 % | 423.37
deset | dese | t | 415 | 2.76 % | 400.92 | 109 | 2.41 % | 472.84 | 130 | 4.12 % | 439.87 | 109 | 2.07 % | 308.65 | 67 | 3.23 % | 429.78
šest | šest |  | 411 | 2.74 % | 397.06 | 174 | 3.85 % | 754.81 | 80 | 2.54 % | 270.69 | 115 | 2.18 % | 325.64 | 42 | 2.02 % | 269.42
drugega | drug | ega | 397 | 2.64 % | 383.54 | 90 | 1.99 % | 390.42 | 123 | 3.90 % | 416.18 | 127 | 2.41 % | 359.62 | 57 | 2.75 % | 365.64
enega | eneg | a | 375 | 2.50 % | 362.28 | 68 | 1.50 % | 294.98 | 140 | 4.44 % | 473.71 | 104 | 1.97 % | 294.49 | 63 | 3.04 % | 404.12
dvajset | dvaj | set | 354 | 2.36 % | 341.99 | 95 | 2.10 % | 412.11 | 88 | 2.79 % | 297.76 | 106 | 2.01 % | 300.16 | 65 | 3.13 % | 416.95
osem | osem |  | 333 | 2.22 % | 321.71 | 165 | 3.65 % | 715.76 | 34 | 1.08 % | 115.04 | 91 | 1.73 % | 257.68 | 43 | 2.07 % | 275.83
trideset | trid | eset | 285 | 1.90 % | 275.33 | 92 | 2.03 % | 399.09 | 54 | 1.71 % | 182.72 | 67 | 1.27 % | 189.72 | 72 | 3.47 % | 461.86
prvo | prvo |  | 273 | 1.82 % | 263.74 | 71 | 1.57 % | 308 | 83 | 2.63 % | 280.84 | 82 | 1.56 % | 232.20 | 37 | 1.78 % | 237.34
druga | drug | a | 252 | 1.68 % | 243.45 | 60 | 1.33 % | 260.28 | 45 | 1.43 % | 152.26 | 111 | 2.11 % | 314.31 | 36 | 1.74 % | 230.93
petnajst | petn | ajst | 252 | 1.68 % | 243.45 | 66 | 1.46 % | 286.31 | 64 | 2.03 % | 216.55 | 55 | 1.04 % | 155.74 | 67 | 3.23 % | 429.78
petdeset | petd | eset | 245 | 1.63 % | 236.69 | 30 | 0.66 % | 130.14 | 82 | 2.60 % | 277.46 | 81 | 1.54 % | 229.36 | 52 | 2.51 % | 333.56
sedem | sede | m | 240 | 1.60 % | 231.86 | 76 | 1.68 % | 329.69 | 57 | 1.81 % | 192.87 | 84 | 1.59 % | 237.86 | 23 | 1.11 % | 147.54
druge | drug | e | 229 | 1.52 % | 221.23 | 39 | 0.86 % | 169.18 | 45 | 1.43 % | 152.26 | 120 | 2.28 % | 339.80 | 25 | 1.21 % | 160.37
eden | eden |  | 219 | 1.46 % | 211.57 | 57 | 1.26 % | 247.26 | 54 | 1.71 % | 182.72 | 87 | 1.65 % | 246.35 | 21 | 1.01 % | 134.71
devet | deve | t | 203 | 1.35 % | 196.11 | 87 | 1.92 % | 377.40 | 18 | 0.57 % | 60.91 | 59 | 1.12 % | 167.07 | 39 | 1.88 % | 250.17
prva | prva |  | 186 | 1.24 % | 179.69 | 50 | 1.10 % | 216.90 | 17 | 0.54 % | 57.52 | 91 | 1.73 % | 257.68 | 28 | 1.35 % | 179.61
dvesto | dves | to | 185 | 1.23 % | 178.73 | 58 | 1.28 % | 251.60 | 36 | 1.14 % | 121.81 | 70 | 1.33 % | 198.22 | 21 | 1.01 % | 134.71
dveh | dveh |  | 178 | 1.19 % | 171.96 | 28 | 0.62 % | 121.46 | 45 | 1.43 % | 152.26 | 86 | 1.63 % | 243.52 | 19 | 0.92 % | 121.88
drugih | drug | ih | 162 | 1.08 % | 156.51 | 19 | 0.42 % | 82.42 | 11 | 0.35 % | 37.22 | 109 | 2.07 % | 308.65 | 23 | 1.11 % | 147.54
drug | drug |  | 156 | 1.04 % | 150.71 | 28 | 0.62 % | 121.46 | 51 | 1.62 % | 172.56 | 52 | 0.99 % | 147.25 | 25 | 1.21 % | 160.37
tristo | tris | to | 141 | 0.94 % | 136.22 | 15 | 0.33 % | 65.07 | 28 | 0.89 % | 94.74 | 78 | 1.48 % | 220.87 | 20 | 0.96 % | 128.29
devetdeset | deve | tdeset | 138 | 0.92 % | 133.32 | 70 | 1.55 % | 303.66 | 17 | 0.54 % | 57.52 | 27 | 0.51 % | 76.45 | 24 | 1.16 % | 153.95
dvanajst | dvan | ajst | 135 | 0.90 % | 130.42 | 30 | 0.66 % | 130.14 | 37 | 1.17 % | 125.19 | 48 | 0.91 % | 135.92 | 20 | 0.96 % | 128.29
štirideset | štir | ideset | 134 | 0.89 % | 129.46 | 20 | 0.44 % | 86.76 | 33 | 1.05 % | 111.66 | 47 | 0.89 % | 133.09 | 34 | 1.64 % | 218.10
šestdeset | šest | deset | 133 | 0.89 % | 128.49 | 18 | 0.40 % | 78.08 | 21 | 0.67 % | 71.06 | 43 | 0.82 % | 121.76 | 51 | 2.46 % | 327.15
enem | enem |  | 131 | 0.87 % | 126.56 | 21 | 0.46 % | 91.10 | 31 | 0.98 % | 104.89 | 51 | 0.97 % | 144.41 | 28 | 1.35 % | 179.61
osemdeset | osem | deset | 130 | 0.86 % | 125.59 | 18 | 0.40 % | 78.08 | 22 | 0.70 % | 74.44 | 42 | 0.80 % | 118.93 | 48 | 2.31 % | 307.90

2.2.195 List of initial character-level 5-grams from numeral standardized forms in the GOS 1.0 corpus with text-type distribution
File at CLARIN.SI: GOS1.0-word_parts-numerals-standardized_forms-initial-5grams-taxonomy-entire.tsv

Standardized form | Initial part of the word | Rest of the word | Total absolute frequency of standardized form | Percentage of all found standardized forms | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N]
tisoč | tisoč |  | 617 | 5.11 % | 596.07 | 203 | 5.70 % | 880.61 | 41 | 1.62 % | 138.73 | 321 | 7.51 % | 908.96 | 52 | 3.03 % | 333.56
štiri | štiri |  | 501 | 4.15 % | 484.01 | 142 | 3.99 % | 615.99 | 112 | 4.43 % | 378.96 | 177 | 4.14 % | 501.20 | 70 | 4.08 % | 449.03
drugi | drugi |  | 489 | 4.05 % | 472.41 | 100 | 2.81 % | 433.80 | 83 | 3.29 % | 280.84 | 226 | 5.29 % | 639.96 | 80 | 4.66 % | 513.17
drugo | drugo |  | 416 | 3.44 % | 401.89 | 66 | 1.85 % | 286.31 | 93 | 3.68 % | 314.68 | 191 | 4.47 % | 540.85 | 66 | 3.85 % | 423.37
deset | deset |  | 415 | 3.44 % | 400.92 | 109 | 3.06 % | 472.84 | 130 | 5.15 % | 439.87 | 109 | 2.55 % | 308.65 | 67 | 3.90 % | 429.78
drugega | druge | ga | 397 | 3.29 % | 383.54 | 90 | 2.53 % | 390.42 | 123 | 4.87 % | 416.18 | 127 | 2.97 % | 359.62 | 57 | 3.32 % | 365.64
enega | enega |  | 375 | 3.10 % | 362.28 | 68 | 1.91 % | 294.98 | 140 | 5.54 % | 473.71 | 104 | 2.43 % | 294.49 | 63 | 3.67 % | 404.12
dvajset | dvajs | et | 354 | 2.93 % | 341.99 | 95 | 2.67 % | 412.11 | 88 | 3.48 % | 297.76 | 106 | 2.48 % | 300.16 | 65 | 3.79 % | 416.95
trideset | tride | set | 285 | 2.36 % | 275.33 | 92 | 2.58 % | 399.09 | 54 | 2.14 % | 182.72 | 67 | 1.57 % | 189.72 | 72 | 4.20 % | 461.86
druga | druga |  | 252 | 2.09 % | 243.45 | 60 | 1.69 % | 260.28 | 45 | 1.78 % | 152.26 | 111 | 2.60 % | 314.31 | 36 | 2.10 % | 230.93
petnajst | petna | jst | 252 | 2.09 % | 243.45 | 66 | 1.85 % | 286.31 | 64 | 2.53 % | 216.55 | 55 | 1.29 % | 155.74 | 67 | 3.90 % | 429.78
petdeset | petde | set | 245 | 2.03 % | 236.69 | 30 | 0.84 % | 130.14 | 82 | 3.25 % | 277.46 | 81 | 1.89 % | 229.36 | 52 | 3.03 % | 333.56
sedem | sedem |  | 240 | 1.99 % | 231.86 | 76 | 2.13 % | 329.69 | 57 | 2.26 % | 192.87 | 84 | 1.96 % | 237.86 | 23 | 1.34 % | 147.54
druge | druge |  | 229 | 1.90 % | 221.23 | 39 | 1.10 % | 169.18 | 45 | 1.78 % | 152.26 | 120 | 2.81 % | 339.80 | 25 | 1.46 % | 160.37
devet | devet |  | 203 | 1.68 % | 196.11 | 87 | 2.44 % | 377.40 | 18 | 0.71 % | 60.91 | 59 | 1.38 % | 167.07 | 39 | 2.27 % | 250.17
dvesto | dvest | o | 185 | 1.53 % | 178.73 | 58 | 1.63 % | 251.60 | 36 | 1.43 % | 121.81 | 70 | 1.64 % | 198.22 | 21 | 1.22 % | 134.71
drugih | drugi | h | 162 | 1.34 % | 156.51 | 19 | 0.53 % | 82.42 | 11 | 0.43 % | 37.22 | 109 | 2.55 % | 308.65 | 23 | 1.34 % | 147.54
tristo | trist | o | 141 | 1.17 % | 136.22 | 15 | 0.42 % | 65.07 | 28 | 1.11 % | 94.74 | 78 | 1.82 % | 220.87 | 20 | 1.17 % | 128.29
devetdeset | devet | deset | 138 | 1.14 % | 133.32 | 70 | 1.97 % | 303.66 | 17 | 0.67 % | 57.52 | 27 | 0.63 % | 76.45 | 24 | 1.40 % | 153.95
dvanajst | dvana | jst | 135 | 1.12 % | 130.42 | 30 | 0.84 % | 130.14 | 37 | 1.47 % | 125.19 | 48 | 1.12 % | 135.92 | 20 | 1.17 % | 128.29
štirideset | štiri | deset | 134 | 1.11 % | 129.46 | 20 | 0.56 % | 86.76 | 33 | 1.31 % | 111.66 | 47 | 1.10 % | 133.09 | 34 | 1.98 % | 218.10
šestdeset | šestd | eset | 133 | 1.10 % | 128.49 | 18 | 0.51 % | 78.08 | 21 | 0.83 % | 71.06 | 43 | 1.01 % | 121.76 | 51 | 2.97 % | 327.15
osemdeset | osemd | eset | 130 | 1.08 % | 125.59 | 18 | 0.51 % | 78.08 | 22 | 0.87 % | 74.44 | 42 | 0.98 % | 118.93 | 48 | 2.80 % | 307.90
prvem | prvem |  | 115 | 0.95 % | 111.10 | 38 | 1.07 % | 164.84 | 12 | 0.47 % | 40.60 | 43 | 1.01 % | 121.76 | 22 | 1.28 % | 141.12
petsto | petst | o | 107 | 0.89 % | 103.37 | 26 | 0.73 % | 112.79 | 22 | 0.87 % | 74.44 | 54 | 1.26 % | 152.91 | 5 | 0.29 % | 32.07
trije | trije |  | 107 | 0.89 % | 103.37 | 21 | 0.59 % | 91.10 | 31 | 1.23 % | 104.89 | 36 | 0.84 % | 101.94 | 19 | 1.11 % | 121.88
prvega | prveg | a | 105 | 0.87 % | 101.44 | 40 | 1.12 % | 173.52 | 24 | 0.95 % | 81.21 | 36 | 0.84 % | 101.94 | 5 | 0.29 % | 32.07
tretji | tretj | i | 98 | 0.81 % | 94.68 | 28 | 0.79 % | 121.46 | 25 | 0.99 % | 84.59 | 36 | 0.84 % | 101.94 | 9 | 0.52 % | 57.73
šestih | šesti | h | 96 | 0.80 % | 92.74 | 52 | 1.46 % | 225.57 | 29 | 1.15 % | 98.12 | 8 | 0.19 % | 22.65 | 7 | 0.41 % | 44.90
sedmih | sedmi | h | 95 | 0.79 % | 91.78 | 49 | 1.38 % | 212.56 | 38 | 1.50 % | 128.58 | 4 | 0.09 % | 11.33 | 4 | 0.23 % | 25.66
sedemdeset | sedem | deset | 93 | 0.77 % | 89.85 | 19 | 0.53 % | 82.42 | 19 | 0.75 % | 64.29 | 36 | 0.84 % | 101.94 | 19 | 1.11 % | 121.88
štirinajst | štiri | najst | 93 | 0.77 % | 89.85 | 29 | 0.81 % | 125.80 | 28 | 1.11 % | 94.74 | 28 | 0.66 % | 79.29 | 8 | 0.47 % | 51.32

2.2.196 List of final character-level 1-grams from numeral standardized forms in the GOS 1.0 corpus with text-type distribution
File at CLARIN.SI: GOS1.0-word_parts-numerals-standardized_forms-final-1grams-taxonomy-entire.tsv

Standardized form | Rest of the word | Final part of the word | Total absolute frequency of standardized form | Percentage of all found standardized forms | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N]
mmm | mm | m | 1,522 | 5.94 % | 1,470.38 | 193 | 2.83 % | 837.23 | 605 | 9.35 % | 2,047.09 | 351 | 4.12 % | 993.91 | 373 | 9.82 % | 2,392.67
en | e | n | 1,504 | 5.87 % | 1,452.99 | 302 | 4.43 % | 1,310.06 | 576 | 8.90 % | 1,948.96 | 411 | 4.82 % | 1,163.81 | 215 | 5.66 % | 1,379.15
dva | dv | a | 1,278 | 4.99 % | 1,234.65 | 224 | 3.28 % | 971.70 | 346 | 5.35 % | 1,170.73 | 501 | 5.87 % | 1,418.66 | 207 | 5.45 % | 1,327.83
ena | en | a | 1,178 | 4.60 % | 1,138.05 | 309 | 4.53 % | 1,340.43 | 293 | 4.53 % | 991.40 | 419 | 4.91 % | 1,186.47 | 157 | 4.13 % | 1,007.10
eno | en | o | 1,154 | 4.50 % | 1,114.86 | 254 | 3.72 % | 1,101.84 | 397 | 6.14 % | 1,343.29 | 332 | 3.89 % | 940.11 | 171 | 4.50 % | 1,096.91
tri | tr | i | 891 | 3.48 % | 860.78 | 300 | 4.40 % | 1,301.39 | 197 | 3.04 % | 666.57 | 276 | 3.24 % | 781.54 | 118 | 3.11 % | 756.93
dve | dv | e | 740 | 2.89 % | 714.90 | 187 | 2.74 % | 811.20 | 194 | 3.00 % | 656.42 | 282 | 3.31 % | 798.53 | 77 | 2.03 % | 493.93
tisoč | tiso | č | 617 | 2.41 % | 596.07 | 203 | 2.98 % | 880.61 | 41 | 0.63 % | 138.73 | 321 | 3.76 % | 908.96 | 52 | 1.37 % | 333.56
pet | pe | t | 581 | 2.27 % | 561.29 | 164 | 2.40 % | 711.43 | 153 | 2.37 % | 517.69 | 175 | 2.05 % | 495.54 | 89 | 2.34 % | 570.90
ene | en | e | 519 | 2.03 % | 501.40 | 75 | 1.10 % | 325.35 | 265 | 4.10 % | 896.66 | 100 | 1.17 % | 283.17 | 79 | 2.08 % | 506.76
štiri | štir | i | 501 | 1.96 % | 484.01 | 142 | 2.08 % | 615.99 | 112 | 1.73 % | 378.96 | 177 | 2.08 % | 501.20 | 70 | 1.84 % | 449.03
drugi | drug | i | 489 | 1.91 % | 472.41 | 100 | 1.47 % | 433.80 | 83 | 1.28 % | 280.84 | 226 | 2.65 % | 639.96 | 80 | 2.11 % | 513.17
sto | st | o | 470 | 1.83 % | 454.06 | 61 | 0.89 % | 264.62 | 105 | 1.62 % | 355.28 | 171 | 2.00 % | 484.21 | 133 | 3.50 % | 853.15
prvi | prv | i | 451 | 1.76 % | 435.70 | 148 | 2.17 % | 642.02 | 96 | 1.48 % | 324.83 | 162 | 1.90 % | 458.73 | 45 | 1.19 % | 288.66
drugo | drug | o | 416 | 1.62 % | 401.89 | 66 | 0.97 % | 286.31 | 93 | 1.44 % | 314.68 | 191 | 2.24 % | 540.85 | 66 | 1.74 % | 423.37
deset | dese | t | 415 | 1.62 % | 400.92 | 109 | 1.60 % | 472.84 | 130 | 2.01 % | 439.87 | 109 | 1.28 % | 308.65 | 67 | 1.76 % | 429.78
šest | šes | t | 411 | 1.60 % | 397.06 | 174 | 2.55 % | 754.81 | 80 | 1.24 % | 270.69 | 115 | 1.35 % | 325.64 | 42 | 1.11 % | 269.42
drugega | drugeg | a | 397 | 1.55 % | 383.54 | 90 | 1.32 % | 390.42 | 123 | 1.90 % | 416.18 | 127 | 1.49 % | 359.62 | 57 | 1.50 % | 365.64
enega | eneg | a | 375 | 1.46 % | 362.28 | 68 | 1.00 % | 294.98 | 140 | 2.16 % | 473.71 | 104 | 1.22 % | 294.49 | 63 | 1.66 % | 404.12
dvajset | dvajse | t | 354 | 1.38 % | 341.99 | 95 | 1.39 % | 412.11 | 88 | 1.36 % | 297.76 | 106 | 1.24 % | 300.16 | 65 | 1.71 % | 416.95
eni | en | i | 338 | 1.32 % | 326.54 | 54 | 0.79 % | 234.25 | 123 | 1.90 % | 416.18 | 97 | 1.14 % | 274.67 | 64 | 1.69 % | 410.54
osem | ose | m | 333 | 1.30 % | 321.71 | 165 | 2.42 % | 715.76 | 34 | 0.53 % | 115.04 | 91 | 1.07 % | 257.68 | 43 | 1.13 % | 275.83
trideset | tridese | t | 285 | 1.11 % | 275.33 | 92 | 1.35 % | 399.09 | 54 | 0.83 % | 182.72 | 67 | 0.79 % | 189.72 | 72 | 1.90 % | 461.86
prvo | prv | o | 273 | 1.07 % | 263.74 | 71 | 1.04 % | 308 | 83 | 1.28 % | 280.84 | 82 | 0.96 % | 232.20 | 37 | 0.97 % | 237.34
druga | drug | a | 252 | 0.98 % | 243.45 | 60 | 0.88 % | 260.28 | 45 | 0.70 % | 152.26 | 111 | 1.30 % | 314.31 | 36 | 0.95 % | 230.93
petnajst | petnajs | t | 252 | 0.98 % | 243.45 | 66 | 0.97 % | 286.31 | 64 | 0.99 % | 216.55 | 55 | 0.65 % | 155.74 | 67 | 1.76 % | 429.78
petdeset | petdese | t | 245 | 0.96 % | 236.69 | 30 | 0.44 % | 130.14 | 82 | 1.27 % | 277.46 | 81 | 0.95 % | 229.36 | 52 | 1.37 % | 333.56
sedem | sede | m | 240 | 0.94 % | 231.86 | 76 | 1.11 % | 329.69 | 57 | 0.88 % | 192.87 | 84 | 0.98 % | 237.86 | 23 | 0.61 % | 147.54
druge | drug | e | 229 | 0.89 % | 221.23 | 39 | 0.57 % | 169.18 | 45 | 0.70 % | 152.26 | 120 | 1.41 % | 339.80 | 25 | 0.66 % | 160.37
eden | ede | n | 219 | 0.85 % | 211.57 | 57 | 0.83 % | 247.26 | 54 | 0.83 % | 182.72 | 87 | 1.02 % | 246.35 | 21 | 0.55 % | 134.71
devet | deve | t | 203 | 0.79 % | 196.11 | 87 | 1.27 % | 377.40 | 18 | 0.28 % | 60.91 | 59 | 0.69 % | 167.07 | 39 | 1.03 % | 250.17
prva | prv | a | 186 | 0.73 % | 179.69 | 50 | 0.73 % | 216.90 | 17 | 0.26 % | 57.52 | 91 | 1.07 % | 257.68 | 28 | 0.74 % | 179.61

2.2.197 List of final character-level 2-grams from numeral standardized forms in the GOS 1.0 corpus with text-type distribution
File at CLARIN.SI: GOS1.0-word_parts-numerals-standardized_forms-final-2grams-taxonomy-entire.tsv

Standardized form | Rest of the word | Final part of the word | Total absolute frequency of standardized form | Percentage of all found standardized forms | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N]
mmm | m | mm | 1,522 | 6.00 % | 1,470.38 | 193 | 2.87 % | 837.23 | 605 | 9.41 % | 2,047.09 | 351 | 4.16 % | 993.91 | 373 | 9.92 % | 2,392.67
en |  | en | 1,504 | 5.93 % | 1,452.99 | 302 | 4.49 % | 1,310.06 | 576 | 8.96 % | 1,948.96 | 411 | 4.88 % | 1,163.81 | 215 | 5.72 % | 1,379.15
dva | d | va | 1,278 | 5.04 % | 1,234.65 | 224 | 3.33 % | 971.70 | 346 | 5.38 % | 1,170.73 | 501 | 5.94 % | 1,418.66 | 207 | 5.50 % | 1,327.83
ena | e | na | 1,178 | 4.65 % | 1,138.05 | 309 | 4.59 % | 1,340.43 | 293 | 4.56 % | 991.40 | 419 | 4.97 % | 1,186.47 | 157 | 4.17 % | 1,007.10
eno | e | no | 1,154 | 4.55 % | 1,114.86 | 254 | 3.77 % | 1,101.84 | 397 | 6.17 % | 1,343.29 | 332 | 3.94 % | 940.11 | 171 | 4.55 % | 1,096.91
tri | t | ri | 891 | 3.52 % | 860.78 | 300 | 4.46 % | 1,301.39 | 197 | 3.06 % | 666.57 | 276 | 3.27 % | 781.54 | 118 | 3.14 % | 756.93
dve | d | ve | 740 | 2.92 % | 714.90 | 187 | 2.78 % | 811.20 | 194 | 3.02 % | 656.42 | 282 | 3.35 % | 798.53 | 77 | 2.05 % | 493.93
tisoč | tis | oč | 617 | 2.43 % | 596.07 | 203 | 3.02 % | 880.61 | 41 | 0.64 % | 138.73 | 321 | 3.81 % | 908.96 | 52 | 1.38 % | 333.56
pet | p | et | 581 | 2.29 % | 561.29 | 164 | 2.44 % | 711.43 | 153 | 2.38 % | 517.69 | 175 | 2.08 % | 495.54 | 89 | 2.37 % | 570.90
ene | e | ne | 519 | 2.05 % | 501.40 | 75 | 1.11 % | 325.35 | 265 | 4.12 % | 896.66 | 100 | 1.19 % | 283.17 | 79 | 2.10 % | 506.76
štiri | šti | ri | 501 | 1.98 % | 484.01 | 142 | 2.11 % | 615.99 | 112 | 1.74 % | 378.96 | 177 | 2.10 % | 501.20 | 70 | 1.86 % | 449.03
drugi | dru | gi | 489 | 1.93 % | 472.41 | 100 | 1.49 % | 433.80 | 83 | 1.29 % | 280.84 | 226 | 2.68 % | 639.96 | 80 | 2.13 % | 513.17
sto | s | to | 470 | 1.85 % | 454.06 | 61 | 0.91 % | 264.62 | 105 | 1.63 % | 355.28 | 171 | 2.03 % | 484.21 | 133 | 3.54 % | 853.15
prvi | pr | vi | 451 | 1.78 % | 435.70 | 148 | 2.20 % | 642.02 | 96 | 1.49 % | 324.83 | 162 | 1.92 % | 458.73 | 45 | 1.20 % | 288.66
drugo | dru | go | 416 | 1.64 % | 401.89 | 66 | 0.98 % | 286.31 | 93 | 1.45 % | 314.68 | 191 | 2.27 % | 540.85 | 66 | 1.75 % | 423.37
deset | des | et | 415 | 1.64 % | 400.92 | 109 | 1.62 % | 472.84 | 130 | 2.02 % | 439.87 | 109 | 1.29 % | 308.65 | 67 | 1.78 % | 429.78
šest | še | st | 411 | 1.62 % | 397.06 | 174 | 2.58 % | 754.81 | 80 | 1.24 % | 270.69 | 115 | 1.36 % | 325.64 | 42 | 1.12 % | 269.42
drugega | druge | ga | 397 | 1.57 % | 383.54 | 90 | 1.34 % | 390.42 | 123 | 1.91 % | 416.18 | 127 | 1.51 % | 359.62 | 57 | 1.52 % | 365.64
enega | ene | ga | 375 | 1.48 % | 362.28 | 68 | 1.01 % | 294.98 | 140 | 2.18 % | 473.71 | 104 | 1.23 % | 294.49 | 63 | 1.68 % | 404.12
dvajset | dvajs | et | 354 | 1.40 % | 341.99 | 95 | 1.41 % | 412.11 | 88 | 1.37 % | 297.76 | 106 | 1.26 % | 300.16 | 65 | 1.73 % | 416.95
eni | e | ni | 338 | 1.33 % | 326.54 | 54 | 0.80 % | 234.25 | 123 | 1.91 % | 416.18 | 97 | 1.15 % | 274.67 | 64 | 1.70 % | 410.54
osem | os | em | 333 | 1.31 % | 321.71 | 165 | 2.45 % | 715.76 | 34 | 0.53 % | 115.04 | 91 | 1.08 % | 257.68 | 43 | 1.14 % | 275.83
trideset | trides | et | 285 | 1.12 % | 275.33 | 92 | 1.37 % | 399.09 | 54 | 0.84 % | 182.72 | 67 | 0.80 % | 189.72 | 72 | 1.91 % | 461.86
prvo | pr | vo | 273 | 1.08 % | 263.74 | 71 | 1.05 % | 308 | 83 | 1.29 % | 280.84 | 82 | 0.97 % | 232.20 | 37 | 0.98 % | 237.34
druga | dru | ga | 252 | 0.99 % | 243.45 | 60 | 0.89 % | 260.28 | 45 | 0.70 % | 152.26 | 111 | 1.32 % | 314.31 | 36 | 0.96 % | 230.93
petnajst | petnaj | st | 252 | 0.99 % | 243.45 | 66 | 0.98 % | 286.31 | 64 | 0.99 % | 216.55 | 55 | 0.65 % | 155.74 | 67 | 1.78 % | 429.78
petdeset | petdes | et | 245 | 0.97 % | 236.69 | 30 | 0.45 % | 130.14 | 82 | 1.27 % | 277.46 | 81 | 0.96 % | 229.36 | 52 | 1.38 % | 333.56
sedem | sed | em | 240 | 0.95 % | 231.86 | 76 | 1.13 % | 329.69 | 57 | 0.89 % | 192.87 | 84 | 1.00 % | 237.86 | 23 | 0.61 % | 147.54
druge | dru | ge | 229 | 0.90 % | 221.23 | 39 | 0.58 % | 169.18 | 45 | 0.70 % | 152.26 | 120 | 1.42 % | 339.80 | 25 | 0.67 % | 160.37
eden | ed | en | 219 | 0.86 % | 211.57 | 57 | 0.85 % | 247.26 | 54 | 0.84 % | 182.72 | 87 | 1.03 % | 246.35 | 21 | 0.56 % | 134.71
devet | dev | et | 203 | 0.80 % | 196.11 | 87 | 1.29 % | 377.40 | 18 | 0.28 % | 60.91 | 59 | 0.70 % | 167.07 | 39 | 1.04 % | 250.17
prva | pr | va | 186 | 0.73 % | 179.69 | 50 | 0.74 % | 216.90 | 17 | 0.26 % | 57.52 | 91 | 1.08 % | 257.68 | 28 | 0.74 % | 179.61

2.2.198 List of final character-level 3-grams from numeral standardized forms in the GOS 1.0 corpus with text-type distribution
File at CLARIN.SI: GOS1.0-word_parts-numerals-standardized_forms-final-3grams-taxonomy-entire.tsv

Standardized form | Rest of the word | Final part of the word | Total absolute frequency of standardized form | Percentage of all found standardized forms | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N]
mmm |  | mmm | 1,522 | 6.39 % | 1,470.38 | 193 | 3.01 % | 837.23 | 605 | 10.35 % | 2,047.09 | 351 | 4.38 % | 993.91 | 373 | 10.52 % | 2,392.67
dva |  | dva | 1,278 | 5.37 % | 1,234.65 | 224 | 3.50 % | 971.70 | 346 | 5.92 % | 1,170.73 | 501 | 6.25 % | 1,418.66 | 207 | 5.84 % | 1,327.83
ena |  | ena | 1,178 | 4.95 % | 1,138.05 | 309 | 4.82 % | 1,340.43 | 293 | 5.01 % | 991.40 | 419 | 5.23 % | 1,186.47 | 157 | 4.43 % | 1,007.10
eno |  | eno | 1,154 | 4.85 % | 1,114.86 | 254 | 3.97 % | 1,101.84 | 397 | 6.79 % | 1,343.29 | 332 | 4.14 % | 940.11 | 171 | 4.82 % | 1,096.91
tri |  | tri | 891 | 3.74 % | 860.78 | 300 | 4.68 % | 1,301.39 | 197 | 3.37 % | 666.57 | 276 | 3.44 % | 781.54 | 118 | 3.33 % | 756.93
dve |  | dve | 740 | 3.11 % | 714.90 | 187 | 2.92 % | 811.20 | 194 | 3.32 % | 656.42 | 282 | 3.52 % | 798.53 | 77 | 2.17 % | 493.93
tisoč | ti | soč | 617 | 2.59 % | 596.07 | 203 | 3.17 % | 880.61 | 41 | 0.70 % | 138.73 | 321 | 4.00 % | 908.96 | 52 | 1.47 % | 333.56
pet |  | pet | 581 | 2.44 % | 561.29 | 164 | 2.56 % | 711.43 | 153 | 2.62 % | 517.69 | 175 | 2.18 % | 495.54 | 89 | 2.51 % | 570.90
ene |  | ene | 519 | 2.18 % | 501.40 | 75 | 1.17 % | 325.35 | 265 | 4.54 % | 896.66 | 100 | 1.25 % | 283.17 | 79 | 2.23 % | 506.76
štiri | št | iri | 501 | 2.10 % | 484.01 | 142 | 2.22 % | 615.99 | 112 | 1.92 % | 378.96 | 177 | 2.21 % | 501.20 | 70 | 1.98 % | 449.03
drugi | dr | ugi | 489 | 2.05 % | 472.41 | 100 | 1.56 % | 433.80 | 83 | 1.42 % | 280.84 | 226 | 2.82 % | 639.96 | 80 | 2.26 % | 513.17
sto |  | sto | 470 | 1.97 % | 454.06 | 61 | 0.95 % | 264.62 | 105 | 1.80 % | 355.28 | 171 | 2.13 % | 484.21 | 133 | 3.75 % | 853.15
prvi | p | rvi | 451 | 1.89 % | 435.70 | 148 | 2.31 % | 642.02 | 96 | 1.64 % | 324.83 | 162 | 2.02 % | 458.73 | 45 | 1.27 % | 288.66
drugo | dr | ugo | 416 | 1.75 % | 401.89 | 66 | 1.03 % | 286.31 | 93 | 1.59 % | 314.68 | 191 | 2.38 % | 540.85 | 66 | 1.86 % | 423.37
deset | de | set | 415 | 1.74 % | 400.92 | 109 | 1.70 % | 472.84 | 130 | 2.23 % | 439.87 | 109 | 1.36 % | 308.65 | 67 | 1.89 % | 429.78
šest | š | est | 411 | 1.73 % | 397.06 | 174 | 2.72 % | 754.81 | 80 | 1.37 % | 270.69 | 115 | 1.43 % | 325.64 | 42 | 1.19 % | 269.42
drugega | drug | ega | 397 | 1.67 % | 383.54 | 90 | 1.41 % | 390.42 | 123 | 2.10 % | 416.18 | 127 | 1.58 % | 359.62 | 57 | 1.61 % | 365.64
enega | en | ega | 375 | 1.57 % | 362.28 | 68 | 1.06 % | 294.98 | 140 | 2.40 % | 473.71 | 104 | 1.30 % | 294.49 | 63 | 1.78 % | 404.12
dvajset | dvaj | set | 354 | 1.49 % | 341.99 | 95 | 1.48 % | 412.11 | 88 | 1.51 % | 297.76 | 106 | 1.32 % | 300.16 | 65 | 1.83 % | 416.95
eni |  | eni | 338 | 1.42 % | 326.54 | 54 | 0.84 % | 234.25 | 123 | 2.10 % | 416.18 | 97 | 1.21 % | 274.67 | 64 | 1.80 % | 410.54
osem | o | sem | 333 | 1.40 % | 321.71 | 165 | 2.58 % | 715.76 | 34 | 0.58 % | 115.04 | 91 | 1.14 % | 257.68 | 43 | 1.21 % | 275.83
trideset | tride | set | 285 | 1.20 % | 275.33 | 92 | 1.44 % | 399.09 | 54 | 0.92 % | 182.72 | 67 | 0.84 % | 189.72 | 72 | 2.03 % | 461.86
prvo | p | rvo | 273 | 1.15 % | 263.74 | 71 | 1.11 % | 308 | 83 | 1.42 % | 280.84 | 82 | 1.02 % | 232.20 | 37 | 1.04 % | 237.34
druga | dr | uga | 252 | 1.06 % | 243.45 | 60 | 0.94 % | 260.28 | 45 | 0.77 % | 152.26 | 111 | 1.39 % | 314.31 | 36 | 1.02 % | 230.93
petnajst | petna | jst | 252 | 1.06 % | 243.45 | 66 | 1.03 % | 286.31 | 64 | 1.09 % | 216.55 | 55 | 0.69 % | 155.74 | 67 | 1.89 % | 429.78
petdeset | petde | set | 245 | 1.03 % | 236.69 | 30 | 0.47 % | 130.14 | 82 | 1.40 % | 277.46 | 81 | 1.01 % | 229.36 | 52 | 1.47 % | 333.56
sedem | se | dem | 240 | 1.01 % | 231.86 | 76 | 1.19 % | 329.69 | 57 | 0.98 % | 192.87 | 84 | 1.05 % | 237.86 | 23 | 0.65 % | 147.54
druge | dr | uge | 229 | 0.96 % | 221.23 | 39 | 0.61 % | 169.18 | 45 | 0.77 % | 152.26 | 120 | 1.50 % | 339.80 | 25 | 0.70 % | 160.37
eden | e | den | 219 | 0.92 % | 211.57 | 57 | 0.89 % | 247.26 | 54 | 0.92 % | 182.72 | 87 | 1.08 % | 246.35 | 21 | 0.59 % | 134.71
devet | de | vet | 203 | 0.85 % | 196.11 | 87 | 1.36 % | 377.40 | 18 | 0.31 % | 60.91 | 59 | 0.74 % | 167.07 | 39 | 1.10 % | 250.17
prva | p | rva | 186 | 0.78 % | 179.69 | 50 | 0.78 % | 216.90 | 17 | 0.29 % | 57.52 | 91 | 1.14 % | 257.68 | 28 | 0.79 % | 179.61
dvesto | dve | sto | 185 | 0.78 % | 178.73 | 58 | 0.91 % | 251.60 | 36 | 0.62 % | 121.81 | 70 | 0.87 % | 198.22 | 21 | 0.59 % | 134.71

2.2.199 List of final character-level 4-grams from numeral standardized forms in the GOS 1.0 corpus with text-type distribution
File at CLARIN.SI: GOS1.0-word_parts-numerals-standardized_forms-final-4grams-taxonomy-entire.tsv

Standardized form | Rest of the word | Final part of the word | Total absolute frequency of standardized form | Percentage of all found standardized forms | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N]
tisoč | t | isoč | 617 | 4.11 % | 596.07 | 203 | 4.49 % | 880.61 | 41 | 1.30 % | 138.73 | 321 | 6.09 % | 908.96 | 52 | 2.51 % | 333.56
štiri | š | tiri | 501 | 3.33 % | 484.01 | 142 | 3.14 % | 615.99 | 112 | 3.55 % | 378.96 | 177 | 3.36 % | 501.20 | 70 | 3.38 % | 449.03
drugi | d | rugi | 489 | 3.25 % | 472.41 | 100 | 2.21 % | 433.80 | 83 | 2.63 % | 280.84 | 226 | 4.29 % | 639.96 | 80 | 3.86 % | 513.17
prvi |  | prvi | 451 | 3.00 % | 435.70 | 148 | 3.27 % | 642.02 | 96 | 3.04 % | 324.83 | 162 | 3.08 % | 458.73 | 45 | 2.17 % | 288.66
drugo | d | rugo | 416 | 2.77 % | 401.89 | 66 | 1.46 % | 286.31 | 93 | 2.95 % | 314.68 | 191 | 3.63 % | 540.85 | 66 | 3.18 % | 423.37
deset | d | eset | 415 | 2.76 % | 400.92 | 109 | 2.41 % | 472.84 | 130 | 4.12 % | 439.87 | 109 | 2.07 % | 308.65 | 67 | 3.23 % | 429.78
šest |  | šest | 411 | 2.74 % | 397.06 | 174 | 3.85 % | 754.81 | 80 | 2.54 % | 270.69 | 115 | 2.18 % | 325.64 | 42 | 2.02 % | 269.42
drugega | dru | gega | 397 | 2.64 % | 383.54 | 90 | 1.99 % | 390.42 | 123 | 3.90 % | 416.18 | 127 | 2.41 % | 359.62 | 57 | 2.75 % | 365.64
enega | e | nega | 375 | 2.50 % | 362.28 | 68 | 1.50 % | 294.98 | 140 | 4.44 % | 473.71 | 104 | 1.97 % | 294.49 | 63 | 3.04 % | 404.12
dvajset | dva | jset | 354 | 2.36 % | 341.99 | 95 | 2.10 % | 412.11 | 88 | 2.79 % | 297.76 | 106 | 2.01 % | 300.16 | 65 | 3.13 % | 416.95
osem |  | osem | 333 | 2.22 % | 321.71 | 165 | 3.65 % | 715.76 | 34 | 1.08 % | 115.04 | 91 | 1.73 % | 257.68 | 43 | 2.07 % | 275.83
trideset | trid | eset | 285 | 1.90 % | 275.33 | 92 | 2.03 % | 399.09 | 54 | 1.71 % | 182.72 | 67 | 1.27 % | 189.72 | 72 | 3.47 % | 461.86
prvo |  | prvo | 273 | 1.82 % | 263.74 | 71 | 1.57 % | 308 | 83 | 2.63 % | 280.84 | 82 | 1.56 % | 232.20 | 37 | 1.78 % | 237.34
druga | d | ruga | 252 | 1.68 % | 243.45 | 60 | 1.33 % | 260.28 | 45 | 1.43 % | 152.26 | 111 | 2.11 % | 314.31 | 36 | 1.74 % | 230.93
petnajst | petn | ajst | 252 | 1.68 % | 243.45 | 66 | 1.46 % | 286.31 | 64 | 2.03 % | 216.55 | 55 | 1.04 % | 155.74 | 67 | 3.23 % | 429.78
petdeset | petd | eset | 245 | 1.63 % | 236.69 | 30 | 0.66 % | 130.14 | 82 | 2.60 % | 277.46 | 81 | 1.54 % | 229.36 | 52 | 2.51 % | 333.56
sedem | s | edem | 240 | 1.60 % | 231.86 | 76 | 1.68 % | 329.69 | 57 | 1.81 % | 192.87 | 84 | 1.59 % | 237.86 | 23 | 1.11 % | 147.54
druge | d | ruge | 229 | 1.52 % | 221.23 | 39 | 0.86 % | 169.18 | 45 | 1.43 % | 152.26 | 120 | 2.28 % | 339.80 | 25 | 1.21 % | 160.37
eden |  | eden | 219 | 1.46 % | 211.57 | 57 | 1.26 % | 247.26 | 54 | 1.71 % | 182.72 | 87 | 1.65 % | 246.35 | 21 | 1.01 % | 134.71
devet | d | evet | 203 | 1.35 % | 196.11 | 87 | 1.92 % | 377.40 | 18 | 0.57 % | 60.91 | 59 | 1.12 % | 167.07 | 39 | 1.88 % | 250.17
prva |  | prva | 186 | 1.24 % | 179.69 | 50 | 1.10 % | 216.90 | 17 | 0.54 % | 57.52 | 91 | 1.73 % | 257.68 | 28 | 1.35 % | 179.61
dvesto | dv | esto | 185 | 1.23 % | 178.73 | 58 | 1.28 % | 251.60 | 36 | 1.14 % | 121.81 | 70 | 1.33 % | 198.22 | 21 | 1.01 % | 134.71
dveh |  | dveh | 178 | 1.19 % | 171.96 | 28 | 0.62 % | 121.46 | 45 | 1.43 % | 152.26 | 86 | 1.63 % | 243.52 | 19 | 0.92 % | 121.88
drugih | dr | ugih | 162 | 1.08 % | 156.51 | 19 | 0.42 % | 82.42 | 11 | 0.35 % | 37.22 | 109 | 2.07 % | 308.65 | 23 | 1.11 % | 147.54
drug |  | drug | 156 | 1.04 % | 150.71 | 28 | 0.62 % | 121.46 | 51 | 1.62 % | 172.56 | 52 | 0.99 % | 147.25 | 25 | 1.21 % | 160.37
tristo | tr | isto | 141 | 0.94 % | 136.22 | 15 | 0.33 % | 65.07 | 28 | 0.89 % | 94.74 | 78 | 1.48 % | 220.87 | 20 | 0.96 % | 128.29
devetdeset | devetd | eset | 138 | 0.92 % | 133.32 | 70 | 1.55 % | 303.66 | 17 | 0.54 % | 57.52 | 27 | 0.51 % | 76.45 | 24 | 1.16 % | 153.95
dvanajst | dvan | ajst | 135 | 0.90 % | 130.42 | 30 | 0.66 % | 130.14 | 37 | 1.17 % | 125.19 | 48 | 0.91 % | 135.92 | 20 | 0.96 % | 128.29
štirideset | štirid | eset | 134 | 0.89 % | 129.46 | 20 | 0.44 % | 86.76 | 33 | 1.05 % | 111.66 | 47 | 0.89 % | 133.09 | 34 | 1.64 % | 218.10
šestdeset | šestd | eset | 133 | 0.89 % | 128.49 | 18 | 0.40 % | 78.08 | 21 | 0.67 % | 71.06 | 43 | 0.82 % | 121.76 | 51 | 2.46 % | 327.15
enem |  | enem | 131 | 0.87 % | 126.56 | 21 | 0.46 % | 91.10 | 31 | 0.98 % | 104.89 | 51 | 0.97 % | 144.41 | 28 | 1.35 % | 179.61
osemdeset | osemd | eset | 130 | 0.86 % | 125.59 | 18 | 0.40 % | 78.08 | 22 | 0.70 % | 74.44 | 42 | 0.80 % | 118.93 | 48 | 2.31 % | 307.90

2.2.200 List of final character-level 5-grams from numeral standardized forms in the GOS 1.0 corpus with text-type distribution
File at CLARIN.SI: GOS1.0-word_parts-numerals-standardized_forms-final-5grams-taxonomy-entire.tsv

Standardized form | Rest of the word | Final part of the word | Total absolute frequency of standardized form | Percentage of all found standardized forms | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N]
tisoč |  | tisoč | 617 | 5.11 % | 596.07 | 203 | 5.70 % | 880.61 | 41 | 1.62 % | 138.73 | 321 | 7.51 % | 908.96 | 52 | 3.03 % | 333.56
štiri |  | štiri | 501 | 4.15 % | 484.01 | 142 | 3.99 % | 615.99 | 112 | 4.43 % | 378.96 | 177 | 4.14 % | 501.20 | 70 | 4.08 % | 449.03
drugi |  | drugi | 489 | 4.05 % | 472.41 | 100 | 2.81 % | 433.80 | 83 | 3.29 % | 280.84 | 226 | 5.29 % | 639.96 | 80 | 4.66 % | 513.17
drugo |  | drugo | 416 | 3.44 % | 401.89 | 66 | 1.85 % | 286.31 | 93 | 3.68 % | 314.68 | 191 | 4.47 % | 540.85 | 66 | 3.85 % | 423.37
deset |  | deset | 415 | 3.44 % | 400.92 | 109 | 3.06 % | 472.84 | 130 | 5.15 % | 439.87 | 109 | 2.55 % | 308.65 | 67 | 3.90 % | 429.78
drugega | dr | ugega | 397 | 3.29 % | 383.54 | 90 | 2.53 % | 390.42 | 123 | 4.87 % | 416.18 | 127 | 2.97 % | 359.62 | 57 | 3.32 % | 365.64
enega |  | enega | 375 | 3.10 % | 362.28 | 68 | 1.91 % | 294.98 | 140 | 5.54 % | 473.71 | 104 | 2.43 % | 294.49 | 63 | 3.67 % | 404.12
dvajset | dv | ajset | 354 | 2.93 % | 341.99 | 95 | 2.67 % | 412.11 | 88 | 3.48 % | 297.76 | 106 | 2.48 % | 300.16 | 65 | 3.79 % | 416.95
trideset | tri | deset | 285 | 2.36 % | 275.33 | 92 | 2.58 % | 399.09 | 54 | 2.14 % | 182.72 | 67 | 1.57 % | 189.72 | 72 | 4.20 % | 461.86
druga |  | druga | 252 | 2.09 % | 243.45 | 60 | 1.69 % | 260.28 | 45 | 1.78 % | 152.26 | 111 | 2.60 % | 314.31 | 36 | 2.10 % | 230.93
petnajst | pet | najst | 252 | 2.09 % | 243.45 | 66 | 1.85 % | 286.31 | 64 | 2.53 % | 216.55 | 55 | 1.29 % | 155.74 | 67 | 3.90 % | 429.78
petdeset | pet | deset | 245 | 2.03 % | 236.69 | 30 | 0.84 % | 130.14 | 82 | 3.25 % | 277.46 | 81 | 1.89 % | 229.36 | 52 | 3.03 % | 333.56
sedem |  | sedem | 240 | 1.99 % | 231.86 | 76 | 2.13 % | 329.69 | 57 | 2.26 % | 192.87 | 84 | 1.96 % | 237.86 | 23 | 1.34 % | 147.54
druge |  | druge | 229 | 1.90 % | 221.23 | 39 | 1.10 % | 169.18 | 45 | 1.78 % | 152.26 | 120 | 2.81 % | 339.80 | 25 | 1.46 % | 160.37
devet |  | devet | 203 | 1.68 % | 196.11 | 87 | 2.44 % | 377.40 | 18 | 0.71 % | 60.91 | 59 | 1.38 % | 167.07 | 39 | 2.27 % | 250.17
dvesto | d | vesto | 185 | 1.53 % | 178.73 | 58 | 1.63 % | 251.60 | 36 | 1.43 % | 121.81 | 70 | 1.64 % | 198.22 | 21 | 1.22 % | 134.71
drugih | d | rugih | 162 | 1.34 % | 156.51 | 19 | 0.53 % | 82.42 | 11 | 0.43 % | 37.22 | 109 | 2.55 % | 308.65 | 23 | 1.34 % | 147.54
tristo | t | risto | 141 | 1.17 % | 136.22 | 15 | 0.42 % | 65.07 | 28 | 1.11 % | 94.74 | 78 | 1.82 % | 220.87 | 20 | 1.17 % | 128.29
devetdeset | devet | deset | 138 | 1.14 % | 133.32 | 70 | 1.97 % | 303.66 | 17 | 0.67 % | 57.52 | 27 | 0.63 % | 76.45 | 24 | 1.40 % | 153.95
dvanajst | dva | najst | 135 | 1.12 % | 130.42 | 30 | 0.84 % | 130.14 | 37 | 1.47 % | 125.19 | 48 | 1.12 % | 135.92 | 20 | 1.17 % | 128.29
štirideset | štiri | deset | 134 | 1.11 % | 129.46 | 20 | 0.56 % | 86.76 | 33 | 1.31 % | 111.66 | 47 | 1.10 % | 133.09 | 34 | 1.98 % | 218.10
šestdeset | šest | deset | 133 | 1.10 % | 128.49 | 18 | 0.51 % | 78.08 | 21 | 0.83 % | 71.06 | 43 | 1.01 % | 121.76 | 51 | 2.97 % | 327.15
osemdeset | osem | deset | 130 | 1.08 % | 125.59 | 18 | 0.51 % | 78.08 | 22 | 0.87 % | 74.44 | 42 | 0.98 % | 118.93 | 48 | 2.80 % | 307.90
prvem |  | prvem | 115 | 0.95 % | 111.10 | 38 | 1.07 % | 164.84 | 12 | 0.47 % | 40.60 | 43 | 1.01 % | 121.76 | 22 | 1.28 % | 141.12
petsto | p | etsto | 107 | 0.89 % | 103.37 | 26 | 0.73 % | 112.79 | 22 | 0.87 % | 74.44 | 54 | 1.26 % | 152.91 | 5 | 0.29 % | 32.07
trije |  | trije | 107 | 0.89 % | 103.37 | 21 | 0.59 % | 91.10 | 31 | 1.23 % | 104.89 | 36 | 0.84 % | 101.94 | 19 | 1.11 % | 121.88
prvega | p | rvega | 105 | 0.87 % | 101.44 | 40 | 1.12 % | 173.52 | 24 | 0.95 % | 81.21 | 36 | 0.84 % | 101.94 | 5 | 0.29 % | 32.07
tretji | t | retji | 98 | 0.81 % | 94.68 | 28 | 0.79 % | 121.46 | 25 | 0.99 % | 84.59 | 36 | 0.84 % | 101.94 | 9 | 0.52 % | 57.73
šestih | š | estih | 96 | 0.80 % | 92.74 | 52 | 1.46 % | 225.57 | 29 | 1.15 % | 98.12 | 8 | 0.19 % | 22.65 | 7 | 0.41 % | 44.90
sedmih | s | edmih | 95 | 0.79 % | 91.78 | 49 | 1.38 % | 212.56 | 38 | 1.50 % | 128.58 | 4 | 0.09 % | 11.33 | 4 | 0.23 % | 25.66
sedemdeset | sedem | deset | 93 | 0.77 % | 89.85 | 19 | 0.53 % | 82.42 | 19 | 0.75 % | 64.29 | 36 | 0.84 % | 101.94 | 19 | 1.11 % | 121.88
štirinajst | štiri | najst | 93 | 0.77 % | 89.85 | 29 | 0.81 % | 125.80 | 28 | 1.11 % | 94.74 | 28 | 0.66 % | 79.29 | 8 | 0.47 % | 51.32

2.2.201 List of initial character-level 1-grams from numeral lower-case word forms in the GOS 1.0 corpus with text-type distribution
File at CLARIN.SI: GOS1.0-word_parts-numerals-lowercase_forms-initial-1grams-taxonomy-entire.tsv

Lower-case word form | Initial part of the word | Rest of the word | Total absolute frequency of lower-case word form | Percentage of all found lower-case word forms | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N]
mmm | m | mm | 1,522 | 5.94 % | 1,470.38 | 193 | 2.83 % | 837.23 | 605 | 9.35 % | 2,047.09 | 351 | 4.12 % | 993.91 | 373 | 9.82 % | 2,392.67
en | e | n | 1,495 | 5.83 % | 1,444.30 | 312 | 4.57 % | 1,353.44 | 557 | 8.61 % | 1,884.67 | 409 | 4.79 % | 1,158.15 | 217 | 5.71 % | 1,391.98
dva | d | va | 1,259 | 4.91 % | 1,216.30 | 221 | 3.24 % | 958.69 | 331 | 5.12 % | 1,119.98 | 500 | 5.86 % | 1,415.83 | 207 | 5.45 % | 1,327.83
ena | e | na | 1,159 | 4.52 % | 1,119.69 | 305 | 4.47 % | 1,323.08 | 277 | 4.28 % | 937.26 | 420 | 4.92 % | 1,189.30 | 157 | 4.13 % | 1,007.10
eno | e | no | 1,070 | 4.18 % | 1,033.71 | 253 | 3.71 % | 1,097.50 | 320 | 4.95 % | 1,082.76 | 328 | 3.85 % | 928.79 | 169 | 4.45 % | 1,084.08
tri | t | ri | 872 | 3.40 % | 842.42 | 292 | 4.28 % | 1,266.68 | 186 | 2.88 % | 629.35 | 276 | 3.24 % | 781.54 | 118 | 3.11 % | 756.93
dve | d | ve | 701 | 2.74 % | 677.22 | 182 | 2.67 % | 789.51 | 163 | 2.52 % | 551.53 | 279 | 3.27 % | 790.03 | 77 | 2.03 % | 493.93
tisoč | t | isoč | 614 | 2.40 % | 593.18 | 200 | 2.93 % | 867.59 | 41 | 0.63 % | 138.73 | 321 | 3.76 % | 908.96 | 52 | 1.37 % | 333.56
pet | p | et | 577 | 2.25 % | 557.43 | 167 | 2.45 % | 724.44 | 151 | 2.33 % | 510.93 | 169 | 1.98 % | 478.55 | 90 | 2.37 % | 577.32
drugi | d | rugi | 455 | 1.78 % | 439.57 | 96 | 1.41 % | 416.44 | 65 | 1.00 % | 219.93 | 217 | 2.54 % | 614.47 | 77 | 2.03 % | 493.93
sto | s | to | 434 | 1.69 % | 419.28 | 60 | 0.88 % | 260.28 | 87 | 1.34 % | 294.37 | 159 | 1.86 % | 450.23 | 128 | 3.37 % | 821.08
prvi | p | rvi | 426 | 1.66 % | 411.55 | 145 | 2.12 % | 629 | 79 | 1.22 % | 267.31 | 160 | 1.88 % | 453.07 | 42 | 1.11 % | 269.42
ene | e | ne | 407 | 1.59 % | 393.20 | 60 | 0.88 % | 260.28 | 191 | 2.95 % | 646.27 | 89 | 1.04 % | 252.02 | 67 | 1.76 % | 429.78
deset | d | eset | 402 | 1.57 % | 388.37 | 107 | 1.57 % | 464.16 | 120 | 1.85 % | 406.03 | 108 | 1.27 % | 305.82 | 67 | 1.76 % | 429.78
šest | š | est | 388 | 1.51 % | 374.84 | 164 | 2.40 % | 711.43 | 77 | 1.19 % | 260.54 | 106 | 1.24 % | 300.16 | 41 | 1.08 % | 263
drugo | d | rugo | 376 | 1.47 % | 363.25 | 63 | 0.92 % | 273.29 | 66 | 1.02 % | 223.32 | 189 | 2.22 % | 535.18 | 58 | 1.53 % | 372.05
štiri | š | tiri | 372 | 1.45 % | 359.38 | 119 | 1.74 % | 516.22 | 64 | 0.99 % | 216.55 | 142 | 1.67 % | 402.10 | 47 | 1.24 % | 301.49
eni | e | ni | 322 | 1.26 % | 311.08 | 52 | 0.76 % | 225.57 | 109 | 1.69 % | 368.81 | 97 | 1.14 % | 274.67 | 64 | 1.69 % | 410.54
osem | o | sem | 315 | 1.23 % | 304.32 | 162 | 2.37 % | 702.75 | 22 | 0.34 % | 74.44 | 90 | 1.05 % | 254.85 | 41 | 1.08 % | 263
druga | d | ruga | 279 | 1.09 % | 269.54 | 69 | 1.01 % | 299.32 | 58 | 0.90 % | 196.25 | 115 | 1.35 % | 325.64 | 37 | 0.97 % | 237.34
prvo | p | rvo | 246 | 0.96 % | 237.66 | 67 | 0.98 % | 290.64 | 65 | 1.00 % | 219.93 | 79 | 0.93 % | 223.70 | 35 | 0.92 % | 224.51
petnajst | p | etnajst | 240 | 0.94 % | 231.86 | 61 | 0.89 % | 264.62 | 59 | 0.91 % | 199.63 | 53 | 0.62 % | 150.08 | 67 | 1.76 % | 429.78
druge | d | ruge | 231 | 0.90 % | 223.17 | 40 | 0.59 % | 173.52 | 46 | 0.71 % | 155.65 | 120 | 1.41 % | 339.80 | 25 | 0.66 % | 160.37
enga | e | nga | 225 | 0.88 % | 217.37 | 30 | 0.44 % | 130.14 | 106 | 1.64 % | 358.66 | 42 | 0.49 % | 118.93 | 47 | 1.24 % | 301.49
devet | d | evet | 202 | 0.79 % | 195.15 | 87 | 1.27 % | 377.40 | 17 | 0.26 % | 57.52 | 59 | 0.69 % | 167.07 | 39 | 1.03 % | 250.17
sedem | s | edem | 201 | 0.79 % | 194.18 | 70 | 1.03 % | 303.66 | 35 | 0.54 % | 118.43 | 80 | 0.94 % | 226.53 | 16 | 0.42 % | 102.63
drugega | d | rugega | 192 | 0.75 % | 185.49 | 48 | 0.70 % | 208.22 | 23 | 0.36 % | 77.82 | 96 | 1.12 % | 271.84 | 25 | 0.66 % | 160.37
prva | p | rva | 190 | 0.74 % | 183.56 | 54 | 0.79 % | 234.25 | 17 | 0.26 % | 57.52 | 91 | 1.07 % | 257.68 | 28 | 0.74 % | 179.61
drug | d | rug | 189 | 0.74 % | 182.59 | 30 | 0.44 % | 130.14 | 69 | 1.07 % | 233.47 | 60 | 0.70 % | 169.90 | 30 | 0.79 % | 192.44
dvajset | d | vajset | 186 | 0.73 % | 179.69 | 74 | 1.08 % | 321.01 | 7 | 0.11 % | 23.69 | 65 | 0.76 % | 184.06 | 40 | 1.05 % | 256.59
eden | e | den | 186 | 0.73 % | 179.69 | 50 | 0.73 % | 216.90 | 35 | 0.54 % | 118.43 | 83 | 0.97 % | 235.03 | 18 | 0.47 % | 115.46
dvesto | d | vesto | 177 | 0.69 % | 171 | 58 | 0.85 % | 251.60 | 32 | 0.49 % | 108.28 | 66 | 0.77 % | 186.89 | 21 | 0.55 % | 134.71

2.2.202 List of initial character-level 2-grams from numeral lower-case word forms in the GOS 1.0 corpus with text-type distribution
File at CLARIN.SI: GOS1.0-word_parts-numerals-lowercase_forms-initial-2grams-taxonomy-entire.tsv

Lower-case word form | Initial part of the word | Rest of the word | Total absolute frequency of lower-case word form | Percentage of all found lower-case word forms | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N]
mmm | mm | m | 1,522 | 5.98 % | 1,470.38 | 193 | 2.85 % | 837.23 | 605 | 9.38 % | 2,047.09 | 351 | 4.15 % | 993.91 | 373 | 9.89 % | 2,392.67
en | en |  | 1,495 | 5.87 % | 1,444.30 | 312 | 4.60 % | 1,353.44 | 557 | 8.64 % | 1,884.67 | 409 | 4.84 % | 1,158.15 | 217 | 5.75 % | 1,391.98
dva | dv | a | 1,259 | 4.95 % | 1,216.30 | 221 | 3.26 % | 958.69 | 331 | 5.13 % | 1,119.98 | 500 | 5.92 % | 1,415.83 | 207 | 5.49 % | 1,327.83
ena | en | a | 1,159 | 4.55 % | 1,119.69 | 305 | 4.50 % | 1,323.08 | 277 | 4.30 % | 937.26 | 420 | 4.97 % | 1,189.30 | 157 | 4.16 % | 1,007.10
eno | en | o | 1,070 | 4.20 % | 1,033.71 | 253 | 3.73 % | 1,097.50 | 320 | 4.96 % | 1,082.76 | 328 | 3.88 % | 928.79 | 169 | 4.48 % | 1,084.08
tri | tr | i | 872 | 3.43 % | 842.42 | 292 | 4.31 % | 1,266.68 | 186 | 2.88 % | 629.35 | 276 | 3.27 % | 781.54 | 118 | 3.13 % | 756.93
dve | dv | e | 701 | 2.75 % | 677.22 | 182 | 2.68 % | 789.51 | 163 | 2.53 % | 551.53 | 279 | 3.30 % | 790.03 | 77 | 2.04 % | 493.93
tisoč | ti | soč | 614 | 2.41 % | 593.18 | 200 | 2.95 % | 867.59 | 41 | 0.64 % | 138.73 | 321 | 3.80 % | 908.96 | 52 | 1.38 % | 333.56
pet | pe | t | 577 | 2.27 % | 557.43 | 167 | 2.46 % | 724.44 | 151 | 2.34 % | 510.93 | 169 | 2.00 % | 478.55 | 90 | 2.38 % | 577.32
drugi | dr | ugi | 455 | 1.79 % | 439.57 | 96 | 1.42 % | 416.44 | 65 | 1.01 % | 219.93 | 217 | 2.57 % | 614.47 | 77 | 2.04 % | 493.93
sto | st | o | 434 | 1.71 % | 419.28 | 60 | 0.89 % | 260.28 | 87 | 1.35 % | 294.37 | 159 | 1.88 % | 450.23 | 128 | 3.39 % | 821.08
prvi | pr | vi | 426 | 1.67 % | 411.55 | 145 | 2.14 % | 629 | 79 | 1.23 % | 267.31 | 160 | 1.89 % | 453.07 | 42 | 1.11 % | 269.42
ene | en | e | 407 | 1.60 % | 393.20 | 60 | 0.89 % | 260.28 | 191 | 2.96 % | 646.27 | 89 | 1.05 % | 252.02 | 67 | 1.78 % | 429.78
deset | de | set | 402 | 1.58 % | 388.37 | 107 | 1.58 % | 464.16 | 120 | 1.86 % | 406.03 | 108 | 1.28 % | 305.82 | 67 | 1.78 % | 429.78
šest | še | st | 388 | 1.52 % | 374.84 | 164 | 2.42 % | 711.43 | 77 | 1.19 % | 260.54 | 106 | 1.25 % | 300.16 | 41 | 1.09 % | 263
drugo | dr | ugo | 376 | 1.48 % | 363.25 | 63 | 0.93 % | 273.29 | 66 | 1.02 % | 223.32 | 189 | 2.24 % | 535.18 | 58 | 1.54 % | 372.05
štiri | št | iri | 372 | 1.46 % | 359.38 | 119 | 1.75 % | 516.22 | 64 | 0.99 % | 216.55 | 142 | 1.68 % | 402.10 | 47 | 1.25 % | 301.49
eni | en | i | 322 | 1.26 % | 311.08 | 52 | 0.77 % | 225.57 | 109 | 1.69 % | 368.81 | 97 | 1.15 % | 274.67 | 64 | 1.70 % | 410.54
osem | os | em | 315 | 1.24 % | 304.32 | 162 | 2.39 % | 702.75 | 22 | 0.34 % | 74.44 | 90 | 1.06 % | 254.85 | 41 | 1.09 % | 263
druga | dr | uga | 279 | 1.10 % | 269.54 | 69 | 1.02 % | 299.32 | 58 | 0.90 % | 196.25 | 115 | 1.36 % | 325.64 | 37 | 0.98 % | 237.34
prvo | pr | vo | 246 | 0.97 % | 237.66 | 67 | 0.99 % | 290.64 | 65 | 1.01 % | 219.93 | 79 | 0.94 % | 223.70 | 35 | 0.93 % | 224.51
petnajst | pe | tnajst | 240 | 0.94 % | 231.86 | 61 | 0.90 % | 264.62 | 59 | 0.92 % | 199.63 | 53 | 0.63 % | 150.08 | 67 | 1.78 % | 429.78
druge | dr | uge | 231 | 0.91 % | 223.17 | 40 | 0.59 % | 173.52 | 46 | 0.71 % | 155.65 | 120 | 1.42 % | 339.80 | 25 | 0.66 % | 160.37
enga | en | ga | 225 | 0.88 % | 217.37 | 30 | 0.44 % | 130.14 | 106 | 1.64 % | 358.66 | 42 | 0.50 % | 118.93 | 47 | 1.25 % | 301.49
devet | de | vet | 202 | 0.79 % | 195.15 | 87 | 1.28 % | 377.40 | 17 | 0.26 % | 57.52 | 59 | 0.70 % | 167.07 | 39 | 1.03 % | 250.17
sedem | se | dem | 201 | 0.79 % | 194.18 | 70 | 1.03 % | 303.66 | 35 | 0.54 % | 118.43 | 80 | 0.95 % | 226.53 | 16 | 0.42 % | 102.63
drugega | dr | ugega | 192 | 0.75 % | 185.49 | 48 | 0.71 % | 208.22 | 23 | 0.36 % | 77.82 | 96 | 1.14 % | 271.84 | 25 | 0.66 % | 160.37
prva | pr | va | 190 | 0.75 % | 183.56 | 54 | 0.80 % | 234.25 | 17 | 0.26 % | 57.52 | 91 | 1.08 % | 257.68 | 28 | 0.74 % | 179.61
drug | dr | ug | 189 | 0.74 % | 182.59 | 30 | 0.44 % | 130.14 | 69 | 1.07 % | 233.47 | 60 | 0.71 % | 169.90 | 30 | 0.80 % | 192.44
dvajset | dv | ajset | 186 | 0.73 % | 179.69 | 74 | 1.09 % | 321.01 | 7 | 0.11 % | 23.69 | 65 | 0.77 % | 184.06 | 40 | 1.06 % | 256.59
eden | ed | en | 186 | 0.73 % | 179.69 | 50 | 0.74 % | 216.90 | 35 | 0.54 % | 118.43 | 83 | 0.98 % | 235.03 | 18 | 0.48 % | 115.46
dvesto | dv | esto | 177 | 0.69 % | 171 | 58 | 0.85 % | 251.60 | 32 | 0.50 % | 108.28 | 66 | 0.78 % | 186.89 | 21 | 0.56 % | 134.71

2.2.203 List of initial character-level 3-grams from numeral lower-case word forms in the GOS 1.0 corpus with text-type distribution
File at CLARIN.SI: GOS1.0-word_parts-numerals-lowercase_forms-initial-3grams-taxonomy-entire.tsv

Lower-case word form | Initial part of the word | Rest of the word | Total absolute frequency of lower-case word form | Percentage of all found lower-case word forms | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N]
mmm | mmm |  | 1,522 | 6.40 % | 1,470.38 | 193 | 3.02 % | 837.23 | 605 | 10.39 % | 2,047.09 | 351 | 4.37 % | 993.91 | 373 | 10.53 % | 2,392.67
dva | dva |  | 1,259 | 5.29 % | 1,216.30 | 221 | 3.45 % | 958.69 | 331 | 5.69 % | 1,119.98 | 500 | 6.23 % | 1,415.83 | 207 | 5.84 % | 1,327.83
ena | ena |  | 1,159 | 4.87 % | 1,119.69 | 305 | 4.77 % | 1,323.08 | 277 | 4.76 % | 937.26 | 420 | 5.23 % | 1,189.30 | 157 | 4.43 % | 1,007.10
eno | eno |  | 1,070 | 4.50 % | 1,033.71 | 253 | 3.95 % | 1,097.50 | 320 | 5.50 % | 1,082.76 | 328 | 4.08 % | 928.79 | 169 | 4.77 % | 1,084.08
tri | tri |  | 872 | 3.67 % | 842.42 | 292 | 4.56 % | 1,266.68 | 186 | 3.20 % | 629.35 | 276 | 3.44 % | 781.54 | 118 | 3.33 % | 756.93
dve | dve |  | 701 | 2.95 % | 677.22 | 182 | 2.85 % | 789.51 | 163 | 2.80 % | 551.53 | 279 | 3.47 % | 790.03 | 77 | 2.17 % | 493.93
tisoč | tis | oč | 614 | 2.58 % | 593.18 | 200 | 3.13 % | 867.59 | 41 | 0.70 % | 138.73 | 321 | 4.00 % | 908.96 | 52 | 1.47 % | 333.56
pet | pet |  | 577 | 2.42 % | 557.43 | 167 | 2.61 % | 724.44 | 151 | 2.60 % | 510.93 | 169 | 2.10 % | 478.55 | 90 | 2.54 % | 577.32
drugi | dru | gi | 455 | 1.91 % | 439.57 | 96 | 1.50 % | 416.44 | 65 | 1.12 % | 219.93 | 217 | 2.70 % | 614.47 | 77 | 2.17 % | 493.93
sto | sto |  | 434 | 1.82 % | 419.28 | 60 | 0.94 % | 260.28 | 87 | 1.50 % | 294.37 | 159 | 1.98 % | 450.23 | 128 | 3.61 % | 821.08
prvi | prv | i | 426 | 1.79 % | 411.55 | 145 | 2.27 % | 629 | 79 | 1.36 % | 267.31 | 160 | 1.99 % | 453.07 | 42 | 1.19 % | 269.42
ene | ene |  | 407 | 1.71 % | 393.20 | 60 | 0.94 % | 260.28 | 191 | 3.28 % | 646.27 | 89 | 1.11 % | 252.02 | 67 | 1.89 % | 429.78
deset | des | et | 402 | 1.69 % | 388.37 | 107 | 1.67 % | 464.16 | 120 | 2.06 % | 406.03 | 108 | 1.34 % | 305.82 | 67 | 1.89 % | 429.78
šest | šes | t | 388 | 1.63 % | 374.84 | 164 | 2.56 % | 711.43 | 77 | 1.32 % | 260.54 | 106 | 1.32 % | 300.16 | 41 | 1.16 % | 263
drugo | dru | go | 376 | 1.58 % | 363.25 | 63 | 0.98 % | 273.29 | 66 | 1.13 % | 223.32 | 189 | 2.35 % | 535.18 | 58 | 1.64 % | 372.05
štiri | šti | ri | 372 | 1.56 % | 359.38 | 119 | 1.86 % | 516.22 | 64 | 1.10 % | 216.55 | 142 | 1.77 % | 402.10 | 47 | 1.33 % | 301.49
eni | eni |  | 322 | 1.35 % | 311.08 | 52 | 0.81 % | 225.57 | 109 | 1.87 % | 368.81 | 97 | 1.21 % | 274.67 | 64 | 1.81 % | 410.54
osem | ose | m | 315 | 1.32 % | 304.32 | 162 | 2.53 % | 702.75 | 22 | 0.38 % | 74.44 | 90 | 1.12 % | 254.85 | 41 | 1.16 % | 263
druga | dru | ga | 279 | 1.17 % | 269.54 | 69 | 1.08 % | 299.32 | 58 | 1.00 % | 196.25 | 115 | 1.43 % | 325.64 | 37 | 1.04 % | 237.34
prvo | prv | o | 246 | 1.03 % | 237.66 | 67 | 1.05 % | 290.64 | 65 | 1.12 % | 219.93 | 79 | 0.98 % | 223.70 | 35 | 0.99 % | 224.51
petnajst | pet | najst | 240 | 1.01 % | 231.86 | 61 | 0.95 % | 264.62 | 59 | 1.01 % | 199.63 | 53 | 0.66 % | 150.08 | 67 | 1.89 % | 429.78
druge | dru | ge | 231 | 0.97 % | 223.17 | 40 | 0.62 % | 173.52 | 46 | 0.79 % | 155.65 | 120 | 1.49 % | 339.80 | 25 | 0.71 % | 160.37
enga eng a 225 0.95 % 217.37 30 0.47 % 130.14 106 1.82 % 358.66 42 0.52 % 118.93 47 1.33
% 301.49
devet dev et 202 0.85 % 195.15 87 1.36 % 377.40 17 0.29 % 57.52 59 0.73 % 167.07 39 1.10 % 250.17
sedem sed em 201 0.84 % 194.18 70 1.09 % 303.66 35 0.60 % 118.43 80 1.00 % 226.53 16 0.45 % 102.63
drugega dru gega 192 0.81 % 185.49 48 0.75 % 208.22 23 0.40 % 77.82 96 1.20 % 271.84 25 0.71 % 160.37
prva prv a 190 0.80 % 183.56 54 0.84 % 234.25 17 0.29 % 57.52 91 1.13 % 257.68 28 0.79 % 179.61
drug dru g 189 0.79 % 182.59 30 0.47 % 130.14 69 1.19 % 233.47 60 0.75 % 169.90 30 0.85 % 192.44
dvajset dva jset 186 0.78 % 179.69 74 1.16 % 321.01 7 0.12 % 23.69 65 0.81 % 184.06 40 1.13 % 256.59
eden ede n 186 0.78 % 179.69 50 0.78 % 216.90 35 0.60 % 118.43 83 1.03 % 235.03 18 0.51 % 115.46
dvesto dve sto 177 0.74 % 171 58 0.91 % 251.60 32 0.55 % 108.28 66 0.82 % 186.89 21 0.59 % 134.71
trideset tri deset 174 0.73 % 168.10 74 1.16 % 321.01 9 0.15 % 30.45 43 0.54 % 121.76 48 1.35 % 307.90

2.2.204 List of initial character-level 4-grams from numeral lower-case word forms in the GOS 1.0 corpus with text-type distribution
File at CLARIN.SI: GOS1.0-word_parts-numerals-lowercase_forms-initial-4grams-taxonomy-entire.tsv
Columns: Lower-case word form; Initial part of the word; Rest of the word; Total absolute frequency of lower-case word form; Percentage of all found lower-case word forms; Total relative frequency (per million occurrences); Absolute frequency [gos.T.J.R]; Percentage [gos.T.J.R]; Relative frequency [gos.T.J.R]; Absolute frequency [gos.T.N.Z]; Percentage [gos.T.N.Z]; Relative frequency [gos.T.N.Z]; Absolute frequency [gos.T.J.I]; Percentage [gos.T.J.I]; Relative frequency [gos.T.J.I]; Absolute frequency [gos.T.N.N]; Percentage [gos.T.N.N]; Relative frequency [gos.T.N.N]
tisoč tiso č 614 4.06 % 593.18 200 4.41 % 867.59 41 1.26 % 138.73 321 6.09 % 908.96 52 2.50 % 333.56
drugi drug i 455 3.01 % 439.57 96 2.12 % 416.44 65 2.00 % 219.93 217 4.12 % 614.47 77 3.71 % 493.93
prvi prvi 426 2.82 % 411.55 145 3.20 %
629 79 2.44 % 267.31 160 3.04 % 453.07 42 2.02 % 269.42 deset dese t 402 2.66 % 388.37 107 2.36 % 464.16 120 3.70 % 406.03 108 2.05 % 305.82 67 3.23 % 429.78 šest šest 388 2.56 % 374.84 164 3.62 % 711.43 77 2.37 % 260.54 106 2.01 % 300.16 41 1.97 % 263 drugo drug o 376 2.49 % 363.25 63 1.39 % 273.29 66 2.04 % 223.32 189 3.59 % 535.18 58 2.79 % 372.05 štiri štir i 372 2.46 % 359.38 119 2.62 % 516.22 64 1.97 % 216.55 142 2.69 % 402.10 47 2.26 % 301.49 osem osem 315 2.08 % 304.32 162 3.57 % 702.75 22 0.68 % 74.44 90 1.71 % 254.85 41 1.97 % 263 druga drug a 279 1.84 % 269.54 69 1.52 % 299.32 58 1.79 % 196.25 115 2.18 % 325.64 37 1.78 % 237.34 prvo prvo 246 1.63 % 237.66 67 1.48 % 290.64 65 2.00 % 219.93 79 1.50 % 223.70 35 1.69 % 224.51 petnajst petn ajst 240 1.59 % 231.86 61 1.34 % 264.62 59 1.82 % 199.63 53 1.01 % 150.08 67 3.23 % 429.78 druge drug e 231 1.53 % 223.17 40 0.88 % 173.52 46 1.42 % 155.65 120 2.28 % 339.80 25 1.20 % 160.37 enga enga 225 1.49 % 217.37 30 0.66 % 130.14 106 3.27 % 358.66 42 0.80 % 118.93 47 2.26 % 301.49 devet deve t 202 1.33 % 195.15 87 1.92 % 377.40 17 0.52 % 57.52 59 1.12 % 167.07 39 1.88 % 250.17 sedem sede m 201 1.33 % 194.18 70 1.54 % 303.66 35 1.08 % 118.43 80 1.52 % 226.53 16 0.77 % 102.63 drugega drug ega 192 1.27 % 185.49 48 1.06 % 208.22 23 0.71 % 77.82 96 1.82 % 271.84 25 1.20 % 160.37 prva prva 190 1.26 % 183.56 54 1.19 % 234.25 17 0.52 % 57.52 91 1.73 % 257.68 28 1.35 % 179.61 drug drug 189 1.25 % 182.59 30 0.66 % 130.14 69 2.13 % 233.47 60 1.14 % 169.90 30 1.44 % 192.44 dvajset dvaj set 186 1.23 % 179.69 74 1.63 % 321.01 7 0.22 % 23.69 65 1.23 % 184.06 40 1.93 % 256.59 eden eden 186 1.23 % 179.69 50 1.10 % 216.90 35 1.08 % 118.43 83 1.57 % 235.03 18 0.87 % 115.46 dvesto dves to 177 1.17 % 171 58 1.28 % 251.60 32 0.99 % 108.28 66 1.25 % 186.89 21 1.01 % 134.71 trideset trid eset 174 1.15 % 168.10 74 1.63 % 321.01 9 0.28 % 30.45 43 0.82 % 121.76 48 2.31 % 307.90 petdeset petd eset 173 1.14 % 167.13 28 0.62 % 121.46 45 1.39 % 
152.26 65 1.23 % 184.06 35 1.69 % 224.51
dveh dveh 164 1.08 % 158.44 28 0.62 % 121.46 31 0.96 % 104.89 86 1.63 % 243.52 19 0.92 % 121.88
drugih drug ih 160 1.06 % 154.57 19 0.42 % 82.42 9 0.28 % 30.45 109 2.07 % 308.65 23 1.11 % 147.54
enih enih 152 1.00 % 146.84 24 0.53 % 104.11 74 2.28 % 250.39 27 0.51 % 76.45 27 1.30 % 173.20
enega eneg a 133 0.88 % 128.49 33 0.73 % 143.15 24 0.74 % 81.21 61 1.16 % 172.73 15 0.72 % 96.22
tristo tris to 132 0.87 % 127.52 13 0.29 % 56.39 25 0.77 % 84.59 77 1.46 % 218.04 17 0.82 % 109.05
druzga druz ga 129 0.85 % 124.62 22 0.48 % 95.44 54 1.67 % 182.72 25 0.47 % 70.79 28 1.35 % 179.61
dvanajst dvan ajst 127 0.84 % 122.69 31 0.68 % 134.48 29 0.89 % 98.12 47 0.89 % 133.09 20 0.96 % 128.29
enem enem 123 0.81 % 118.83 20 0.44 % 86.76 25 0.77 % 84.59 50 0.95 % 141.58 28 1.35 % 179.61
treh treh 113 0.75 % 109.17 27 0.59 % 117.12 21 0.65 % 71.06 54 1.02 % 152.91 11 0.53 % 70.56

2.2.205 List of initial character-level 5-grams from numeral lower-case word forms in the GOS 1.0 corpus with text-type distribution
File at CLARIN.SI: GOS1.0-word_parts-numerals-lowercase_forms-initial-5grams-taxonomy-entire.tsv
Columns: Lower-case word form; Initial part of the word; Rest of the word; Total absolute frequency of lower-case word form; Percentage of all found lower-case word forms; Total relative frequency (per million occurrences); Absolute frequency [gos.T.J.R]; Percentage [gos.T.J.R]; Relative frequency [gos.T.J.R]; Absolute frequency [gos.T.N.Z]; Percentage [gos.T.N.Z]; Relative frequency [gos.T.N.Z]; Absolute frequency [gos.T.J.I]; Percentage [gos.T.J.I]; Relative frequency [gos.T.J.I]; Absolute frequency [gos.T.N.N]; Percentage [gos.T.N.N]; Relative frequency [gos.T.N.N]
tisoč tisoč 614 5.26 % 593.18 200 5.70 % 867.59 41 1.75 % 138.73 321 7.67 % 908.96 52 3.17 % 333.56
drugi drugi 455 3.90 % 439.57 96 2.74 % 416.44 65 2.78 % 219.93 217 5.19 % 614.47 77 4.70 % 493.93
deset deset 402 3.44 %
388.37 107 3.05 % 464.16 120 5.13 % 406.03 108 2.58 % 305.82 67 4.09 % 429.78 drugo drugo 376 3.22 % 363.25 63 1.79 % 273.29 66 2.82 % 223.32 189 4.52 % 535.18 58 3.54 % 372.05 štiri štiri 372 3.19 % 359.38 119 3.39 % 516.22 64 2.74 % 216.55 142 3.40 % 402.10 47 2.87 % 301.49 druga druga 279 2.39 % 269.54 69 1.97 % 299.32 58 2.48 % 196.25 115 2.75 % 325.64 37 2.26 % 237.34 petnajst petna jst 240 2.06 % 231.86 61 1.74 % 264.62 59 2.52 % 199.63 53 1.27 % 150.08 67 4.09 % 429.78 druge druge 231 1.98 % 223.17 40 1.14 % 173.52 46 1.97 % 155.65 120 2.87 % 339.80 25 1.53 % 160.37 devet devet 202 1.73 % 195.15 87 2.48 % 377.40 17 0.73 % 57.52 59 1.41 % 167.07 39 2.38 % 250.17 sedem sedem 201 1.72 % 194.18 70 2.00 % 303.66 35 1.50 % 118.43 80 1.91 % 226.53 16 0.98 % 102.63 drugega druge ga 192 1.65 % 185.49 48 1.37 % 208.22 23 0.98 % 77.82 96 2.29 % 271.84 25 1.53 % 160.37 dvajset dvajs et 186 1.59 % 179.69 74 2.11 % 321.01 7 0.30 % 23.69 65 1.55 % 184.06 40 2.44 % 256.59 dvesto dvest o 177 1.52 % 171 58 1.65 % 251.60 32 1.37 % 108.28 66 1.58 % 186.89 21 1.28 % 134.71 trideset tride set 174 1.49 % 168.10 74 2.11 % 321.01 9 0.39 % 30.45 43 1.03 % 121.76 48 2.93 % 307.90 petdeset petde set 173 1.48 % 167.13 28 0.80 % 121.46 45 1.93 % 152.26 65 1.55 % 184.06 35 2.14 % 224.51 drugih drugi h 160 1.37 % 154.57 19 0.54 % 82.42 9 0.39 % 30.45 109 2.61 % 308.65 23 1.40 % 147.54 enega enega 133 1.14 % 128.49 33 0.94 % 143.15 24 1.03 % 81.21 61 1.46 % 172.73 15 0.92 % 96.22 tristo trist o 132 1.13 % 127.52 13 0.37 % 56.39 25 1.07 % 84.59 77 1.84 % 218.04 17 1.04 % 109.05 druzga druzg a 129 1.11 % 124.62 22 0.63 % 95.44 54 2.31 % 182.72 25 0.60 % 70.79 28 1.71 % 179.61 dvanajst dvana jst 127 1.09 % 122.69 31 0.88 % 134.48 29 1.24 % 98.12 47 1.12 % 133.09 20 1.22 % 128.29 prvem prvem 110 0.94 % 106.27 36 1.03 % 156.17 9 0.39 % 30.45 43 1.03 % 121.76 22 1.34 % 141.12 petsto petst o 106 0.91 % 102.40 26 0.74 % 112.79 22 0.94 % 74.44 53 1.27 % 150.08 5 0.30 % 32.07 trije trije 103 0.88 % 
99.51 18 0.51 % 78.08 30 1.28 % 101.51 36 0.86 % 101.94 19 1.16 % 121.88
devetdeset devet deset 102 0.87 % 98.54 67 1.91 % 290.64 7 0.30 % 23.69 12 0.29 % 33.98 16 0.98 % 102.63
trieset tries et 95 0.81 % 91.78 16 0.46 % 69.41 36 1.54 % 121.81 19 0.45 % 53.80 24 1.47 % 153.95
šestih šesti h 94 0.81 % 90.81 52 1.48 % 225.57 27 1.16 % 91.36 8 0.19 % 22.65 7 0.43 % 44.90
tretji tretj i 90 0.77 % 86.95 27 0.77 % 117.12 18 0.77 % 60.91 36 0.86 % 101.94 9 0.55 % 57.73
petih petih 89 0.76 % 85.98 33 0.94 % 143.15 32 1.37 % 108.28 16 0.38 % 45.31 8 0.49 % 51.32
sedmih sedmi h 88 0.75 % 85.02 49 1.40 % 212.56 31 1.33 % 104.89 4 0.10 % 11.33 4 0.24 % 25.66
drugem druge m 85 0.73 % 82.12 32 0.91 % 138.81 10 0.43 % 33.84 37 0.89 % 104.77 6 0.37 % 38.49
osmih osmih 84 0.72 % 81.15 41 1.17 % 177.86 35 1.50 % 118.43 8 0.19 % 22.65 0 0 % 0
prvega prveg a 84 0.72 % 81.15 32 0.91 % 138.81 13 0.56 % 43.99 35 0.84 % 99.11 4 0.24 % 25.66

2.2.206 List of final character-level 1-grams from numeral lower-case word forms in the GOS 1.0 corpus with text-type distribution
File at CLARIN.SI: GOS1.0-word_parts-numerals-lowercase_forms-final-1grams-taxonomy-entire.tsv
Columns: Lower-case word form; Rest of the word; Final part of the word; Total absolute frequency of lower-case word form; Percentage of all found lower-case word forms; Total relative frequency (per million occurrences); Absolute frequency [gos.T.J.R]; Percentage [gos.T.J.R]; Relative frequency [gos.T.J.R]; Absolute frequency [gos.T.N.Z]; Percentage [gos.T.N.Z]; Relative frequency [gos.T.N.Z]; Absolute frequency [gos.T.J.I]; Percentage [gos.T.J.I]; Relative frequency [gos.T.J.I]; Absolute frequency [gos.T.N.N]; Percentage [gos.T.N.N]; Relative frequency [gos.T.N.N]
mmm mm m 1,522 5.94 % 1,470.38 193 2.83 % 837.23 605 9.35 % 2,047.09 351 4.12 % 993.91 373 9.82 % 2,392.67
en e n 1,495 5.83 % 1,444.30 312 4.57 % 1,353.44 557 8.61 % 1,884.67 409 4.79 % 1,158.15 217 5.71 %
1,391.98 dva dv a 1,259 4.91 % 1,216.30 221 3.24 % 958.69 331 5.12 % 1,119.98 500 5.86 % 1,415.83 207 5.45 % 1,327.83 ena en a 1,159 4.52 % 1,119.69 305 4.47 % 1,323.08 277 4.28 % 937.26 420 4.92 % 1,189.30 157 4.13 % 1,007.10 eno en o 1,070 4.18 % 1,033.71 253 3.71 % 1,097.50 320 4.95 % 1,082.76 328 3.85 % 928.79 169 4.45 % 1,084.08 tri tr i 872 3.40 % 842.42 292 4.28 % 1,266.68 186 2.88 % 629.35 276 3.24 % 781.54 118 3.11 % 756.93 dve dv e 701 2.74 % 677.22 182 2.67 % 789.51 163 2.52 % 551.53 279 3.27 % 790.03 77 2.03 % 493.93 tisoč tiso č 614 2.40 % 593.18 200 2.93 % 867.59 41 0.63 % 138.73 321 3.76 % 908.96 52 1.37 % 333.56 pet pe t 577 2.25 % 557.43 167 2.45 % 724.44 151 2.33 % 510.93 169 1.98 % 478.55 90 2.37 % 577.32 drugi drug i 455 1.78 % 439.57 96 1.41 % 416.44 65 1.00 % 219.93 217 2.54 % 614.47 77 2.03 % 493.93 sto st o 434 1.69 % 419.28 60 0.88 % 260.28 87 1.34 % 294.37 159 1.86 % 450.23 128 3.37 % 821.08 prvi prv i 426 1.66 % 411.55 145 2.12 % 629 79 1.22 % 267.31 160 1.88 % 453.07 42 1.11 % 269.42 ene en e 407 1.59 % 393.20 60 0.88 % 260.28 191 2.95 % 646.27 89 1.04 % 252.02 67 1.76 % 429.78 deset dese t 402 1.57 % 388.37 107 1.57 % 464.16 120 1.85 % 406.03 108 1.27 % 305.82 67 1.76 % 429.78 šest šes t 388 1.51 % 374.84 164 2.40 % 711.43 77 1.19 % 260.54 106 1.24 % 300.16 41 1.08 % 263 drugo drug o 376 1.47 % 363.25 63 0.92 % 273.29 66 1.02 % 223.32 189 2.22 % 535.18 58 1.53 % 372.05 štiri štir i 372 1.45 % 359.38 119 1.74 % 516.22 64 0.99 % 216.55 142 1.67 % 402.10 47 1.24 % 301.49 eni en i 322 1.26 % 311.08 52 0.76 % 225.57 109 1.69 % 368.81 97 1.14 % 274.67 64 1.69 % 410.54 osem ose m 315 1.23 % 304.32 162 2.37 % 702.75 22 0.34 % 74.44 90 1.05 % 254.85 41 1.08 % 263 druga drug a 279 1.09 % 269.54 69 1.01 % 299.32 58 0.90 % 196.25 115 1.35 % 325.64 37 0.97 % 237.34 prvo prv o 246 0.96 % 237.66 67 0.98 % 290.64 65 1.00 % 219.93 79 0.93 % 223.70 35 0.92 % 224.51 petnajst petnajs t 240 0.94 % 231.86 61 0.89 % 264.62 59 0.91 % 199.63 53 0.62 % 150.08 67 
1.76 % 429.78
druge drug e 231 0.90 % 223.17 40 0.59 % 173.52 46 0.71 % 155.65 120 1.41 % 339.80 25 0.66 % 160.37
enga eng a 225 0.88 % 217.37 30 0.44 % 130.14 106 1.64 % 358.66 42 0.49 % 118.93 47 1.24 % 301.49
devet deve t 202 0.79 % 195.15 87 1.27 % 377.40 17 0.26 % 57.52 59 0.69 % 167.07 39 1.03 % 250.17
sedem sede m 201 0.79 % 194.18 70 1.03 % 303.66 35 0.54 % 118.43 80 0.94 % 226.53 16 0.42 % 102.63
drugega drugeg a 192 0.75 % 185.49 48 0.70 % 208.22 23 0.36 % 77.82 96 1.12 % 271.84 25 0.66 % 160.37
prva prv a 190 0.74 % 183.56 54 0.79 % 234.25 17 0.26 % 57.52 91 1.07 % 257.68 28 0.74 % 179.61
drug dru g 189 0.74 % 182.59 30 0.44 % 130.14 69 1.07 % 233.47 60 0.70 % 169.90 30 0.79 % 192.44
dvajset dvajse t 186 0.73 % 179.69 74 1.08 % 321.01 7 0.11 % 23.69 65 0.76 % 184.06 40 1.05 % 256.59
eden ede n 186 0.73 % 179.69 50 0.73 % 216.90 35 0.54 % 118.43 83 0.97 % 235.03 18 0.47 % 115.46
dvesto dvest o 177 0.69 % 171 58 0.85 % 251.60 32 0.49 % 108.28 66 0.77 % 186.89 21 0.55 % 134.71

2.2.207 List of final character-level 2-grams from numeral lower-case word forms in the GOS 1.0 corpus with text-type distribution
File at CLARIN.SI: GOS1.0-word_parts-numerals-lowercase_forms-final-2grams-taxonomy-entire.tsv
Columns: Lower-case word form; Rest of the word; Final part of the word; Total absolute frequency of lower-case word form; Percentage of all found lower-case word forms; Total relative frequency (per million occurrences); Absolute frequency [gos.T.J.R]; Percentage [gos.T.J.R]; Relative frequency [gos.T.J.R]; Absolute frequency [gos.T.N.Z]; Percentage [gos.T.N.Z]; Relative frequency [gos.T.N.Z]; Absolute frequency [gos.T.J.I]; Percentage [gos.T.J.I]; Relative frequency [gos.T.J.I]; Absolute frequency [gos.T.N.N]; Percentage [gos.T.N.N]; Relative frequency [gos.T.N.N]
mmm m mm 1,522 5.98 % 1,470.38 193 2.85 % 837.23 605 9.38 % 2,047.09 351 4.15 % 993.91 373 9.89 % 2,392.67
en en 1,495 5.87 % 1,444.30 312 4.60
% 1,353.44 557 8.64 % 1,884.67 409 4.84 % 1,158.15 217 5.75 % 1,391.98 dva d va 1,259 4.95 % 1,216.30 221 3.26 % 958.69 331 5.13 % 1,119.98 500 5.92 % 1,415.83 207 5.49 % 1,327.83 ena e na 1,159 4.55 % 1,119.69 305 4.50 % 1,323.08 277 4.30 % 937.26 420 4.97 % 1,189.30 157 4.16 % 1,007.10 eno e no 1,070 4.20 % 1,033.71 253 3.73 % 1,097.50 320 4.96 % 1,082.76 328 3.88 % 928.79 169 4.48 % 1,084.08 tri t ri 872 3.43 % 842.42 292 4.31 % 1,266.68 186 2.88 % 629.35 276 3.27 % 781.54 118 3.13 % 756.93 dve d ve 701 2.75 % 677.22 182 2.68 % 789.51 163 2.53 % 551.53 279 3.30 % 790.03 77 2.04 % 493.93 tisoč tis oč 614 2.41 % 593.18 200 2.95 % 867.59 41 0.64 % 138.73 321 3.80 % 908.96 52 1.38 % 333.56 pet p et 577 2.27 % 557.43 167 2.46 % 724.44 151 2.34 % 510.93 169 2.00 % 478.55 90 2.38 % 577.32 drugi dru gi 455 1.79 % 439.57 96 1.42 % 416.44 65 1.01 % 219.93 217 2.57 % 614.47 77 2.04 % 493.93 sto s to 434 1.71 % 419.28 60 0.89 % 260.28 87 1.35 % 294.37 159 1.88 % 450.23 128 3.39 % 821.08 prvi pr vi 426 1.67 % 411.55 145 2.14 % 629 79 1.23 % 267.31 160 1.89 % 453.07 42 1.11 % 269.42 ene e ne 407 1.60 % 393.20 60 0.89 % 260.28 191 2.96 % 646.27 89 1.05 % 252.02 67 1.78 % 429.78 deset des et 402 1.58 % 388.37 107 1.58 % 464.16 120 1.86 % 406.03 108 1.28 % 305.82 67 1.78 % 429.78 šest še st 388 1.52 % 374.84 164 2.42 % 711.43 77 1.19 % 260.54 106 1.25 % 300.16 41 1.09 % 263 drugo dru go 376 1.48 % 363.25 63 0.93 % 273.29 66 1.02 % 223.32 189 2.24 % 535.18 58 1.54 % 372.05 štiri šti ri 372 1.46 % 359.38 119 1.75 % 516.22 64 0.99 % 216.55 142 1.68 % 402.10 47 1.25 % 301.49 eni e ni 322 1.26 % 311.08 52 0.77 % 225.57 109 1.69 % 368.81 97 1.15 % 274.67 64 1.70 % 410.54 osem os em 315 1.24 % 304.32 162 2.39 % 702.75 22 0.34 % 74.44 90 1.06 % 254.85 41 1.09 % 263 druga dru ga 279 1.10 % 269.54 69 1.02 % 299.32 58 0.90 % 196.25 115 1.36 % 325.64 37 0.98 % 237.34 prvo pr vo 246 0.97 % 237.66 67 0.99 % 290.64 65 1.01 % 219.93 79 0.94 % 223.70 35 0.93 % 224.51 petnajst petnaj st 240 0.94 
% 231.86 61 0.90 % 264.62 59 0.92 % 199.63 53 0.63 % 150.08 67 1.78 % 429.78
druge dru ge 231 0.91 % 223.17 40 0.59 % 173.52 46 0.71 % 155.65 120 1.42 % 339.80 25 0.66 % 160.37
enga en ga 225 0.88 % 217.37 30 0.44 % 130.14 106 1.64 % 358.66 42 0.50 % 118.93 47 1.25 % 301.49
devet dev et 202 0.79 % 195.15 87 1.28 % 377.40 17 0.26 % 57.52 59 0.70 % 167.07 39 1.03 % 250.17
sedem sed em 201 0.79 % 194.18 70 1.03 % 303.66 35 0.54 % 118.43 80 0.95 % 226.53 16 0.42 % 102.63
drugega druge ga 192 0.75 % 185.49 48 0.71 % 208.22 23 0.36 % 77.82 96 1.14 % 271.84 25 0.66 % 160.37
prva pr va 190 0.75 % 183.56 54 0.80 % 234.25 17 0.26 % 57.52 91 1.08 % 257.68 28 0.74 % 179.61
drug dr ug 189 0.74 % 182.59 30 0.44 % 130.14 69 1.07 % 233.47 60 0.71 % 169.90 30 0.80 % 192.44
dvajset dvajs et 186 0.73 % 179.69 74 1.09 % 321.01 7 0.11 % 23.69 65 0.77 % 184.06 40 1.06 % 256.59
eden ed en 186 0.73 % 179.69 50 0.74 % 216.90 35 0.54 % 118.43 83 0.98 % 235.03 18 0.48 % 115.46
dvesto dves to 177 0.69 % 171 58 0.85 % 251.60 32 0.50 % 108.28 66 0.78 % 186.89 21 0.56 % 134.71

2.2.208 List of final character-level 3-grams from numeral lower-case word forms in the GOS 1.0 corpus with text-type distribution
File at CLARIN.SI: GOS1.0-word_parts-numerals-lowercase_forms-final-3grams-taxonomy-entire.tsv
Columns: Lower-case word form; Rest of the word; Final part of the word; Total absolute frequency of lower-case word form; Percentage of all found lower-case word forms; Total relative frequency (per million occurrences); Absolute frequency [gos.T.J.R]; Percentage [gos.T.J.R]; Relative frequency [gos.T.J.R]; Absolute frequency [gos.T.N.Z]; Percentage [gos.T.N.Z]; Relative frequency [gos.T.N.Z]; Absolute frequency [gos.T.J.I]; Percentage [gos.T.J.I]; Relative frequency [gos.T.J.I]; Absolute frequency [gos.T.N.N]; Percentage [gos.T.N.N]; Relative frequency [gos.T.N.N]
mmm mmm 1,522 6.40 % 1,470.38 193 3.02 % 837.23 605 10.39 % 2,047.09 351 4.37 %
993.91 373 10.53 % 2,392.67 dva dva 1,259 5.29 % 1,216.30 221 3.45 % 958.69 331 5.69 % 1,119.98 500 6.23 % 1,415.83 207 5.84 % 1,327.83 ena ena 1,159 4.87 % 1,119.69 305 4.77 % 1,323.08 277 4.76 % 937.26 420 5.23 % 1,189.30 157 4.43 % 1,007.10 eno eno 1,070 4.50 % 1,033.71 253 3.95 % 1,097.50 320 5.50 % 1,082.76 328 4.08 % 928.79 169 4.77 % 1,084.08 tri tri 872 3.67 % 842.42 292 4.56 % 1,266.68 186 3.20 % 629.35 276 3.44 % 781.54 118 3.33 % 756.93 dve dve 701 2.95 % 677.22 182 2.85 % 789.51 163 2.80 % 551.53 279 3.47 % 790.03 77 2.17 % 493.93 tisoč ti soč 614 2.58 % 593.18 200 3.13 % 867.59 41 0.70 % 138.73 321 4.00 % 908.96 52 1.47 % 333.56 pet pet 577 2.42 % 557.43 167 2.61 % 724.44 151 2.60 % 510.93 169 2.10 % 478.55 90 2.54 % 577.32 drugi dr ugi 455 1.91 % 439.57 96 1.50 % 416.44 65 1.12 % 219.93 217 2.70 % 614.47 77 2.17 % 493.93 sto sto 434 1.82 % 419.28 60 0.94 % 260.28 87 1.50 % 294.37 159 1.98 % 450.23 128 3.61 % 821.08 prvi p rvi 426 1.79 % 411.55 145 2.27 % 629 79 1.36 % 267.31 160 1.99 % 453.07 42 1.19 % 269.42 ene ene 407 1.71 % 393.20 60 0.94 % 260.28 191 3.28 % 646.27 89 1.11 % 252.02 67 1.89 % 429.78 deset de set 402 1.69 % 388.37 107 1.67 % 464.16 120 2.06 % 406.03 108 1.34 % 305.82 67 1.89 % 429.78 šest š est 388 1.63 % 374.84 164 2.56 % 711.43 77 1.32 % 260.54 106 1.32 % 300.16 41 1.16 % 263 drugo dr ugo 376 1.58 % 363.25 63 0.98 % 273.29 66 1.13 % 223.32 189 2.35 % 535.18 58 1.64 % 372.05 štiri št iri 372 1.56 % 359.38 119 1.86 % 516.22 64 1.10 % 216.55 142 1.77 % 402.10 47 1.33 % 301.49 eni eni 322 1.35 % 311.08 52 0.81 % 225.57 109 1.87 % 368.81 97 1.21 % 274.67 64 1.81 % 410.54 osem o sem 315 1.32 % 304.32 162 2.53 % 702.75 22 0.38 % 74.44 90 1.12 % 254.85 41 1.16 % 263 druga dr uga 279 1.17 % 269.54 69 1.08 % 299.32 58 1.00 % 196.25 115 1.43 % 325.64 37 1.04 % 237.34 prvo p rvo 246 1.03 % 237.66 67 1.05 % 290.64 65 1.12 % 219.93 79 0.98 % 223.70 35 0.99 % 224.51 petnajst petna jst 240 1.01 % 231.86 61 0.95 % 264.62 59 1.01 % 199.63 53 0.66 % 
150.08 67 1.89 % 429.78
druge dr uge 231 0.97 % 223.17 40 0.62 % 173.52 46 0.79 % 155.65 120 1.49 % 339.80 25 0.71 % 160.37
enga e nga 225 0.95 % 217.37 30 0.47 % 130.14 106 1.82 % 358.66 42 0.52 % 118.93 47 1.33 % 301.49
devet de vet 202 0.85 % 195.15 87 1.36 % 377.40 17 0.29 % 57.52 59 0.73 % 167.07 39 1.10 % 250.17
sedem se dem 201 0.84 % 194.18 70 1.09 % 303.66 35 0.60 % 118.43 80 1.00 % 226.53 16 0.45 % 102.63
drugega drug ega 192 0.81 % 185.49 48 0.75 % 208.22 23 0.40 % 77.82 96 1.20 % 271.84 25 0.71 % 160.37
prva p rva 190 0.80 % 183.56 54 0.84 % 234.25 17 0.29 % 57.52 91 1.13 % 257.68 28 0.79 % 179.61
drug d rug 189 0.79 % 182.59 30 0.47 % 130.14 69 1.19 % 233.47 60 0.75 % 169.90 30 0.85 % 192.44
dvajset dvaj set 186 0.78 % 179.69 74 1.16 % 321.01 7 0.12 % 23.69 65 0.81 % 184.06 40 1.13 % 256.59
eden e den 186 0.78 % 179.69 50 0.78 % 216.90 35 0.60 % 118.43 83 1.03 % 235.03 18 0.51 % 115.46
dvesto dve sto 177 0.74 % 171 58 0.91 % 251.60 32 0.55 % 108.28 66 0.82 % 186.89 21 0.59 % 134.71
trideset tride set 174 0.73 % 168.10 74 1.16 % 321.01 9 0.15 % 30.45 43 0.54 % 121.76 48 1.35 % 307.90

2.2.209 List of final character-level 4-grams from numeral lower-case word forms in the GOS 1.0 corpus with text-type distribution
File at CLARIN.SI: GOS1.0-word_parts-numerals-lowercase_forms-final-4grams-taxonomy-entire.tsv
Columns: Lower-case word form; Rest of the word; Final part of the word; Total absolute frequency of lower-case word form; Percentage of all found lower-case word forms; Total relative frequency (per million occurrences); Absolute frequency [gos.T.J.R]; Percentage [gos.T.J.R]; Relative frequency [gos.T.J.R]; Absolute frequency [gos.T.N.Z]; Percentage [gos.T.N.Z]; Relative frequency [gos.T.N.Z]; Absolute frequency [gos.T.J.I]; Percentage [gos.T.J.I]; Relative frequency [gos.T.J.I]; Absolute frequency [gos.T.N.N]; Percentage [gos.T.N.N]; Relative frequency [gos.T.N.N]
tisoč t isoč 614 4.06 % 593.18
200 4.41 % 867.59 41 1.26 % 138.73 321 6.09 % 908.96 52 2.50 % 333.56 drugi d rugi 455 3.01 % 439.57 96 2.12 % 416.44 65 2.00 % 219.93 217 4.12 % 614.47 77 3.71 % 493.93 prvi prvi 426 2.82 % 411.55 145 3.20 % 629 79 2.44 % 267.31 160 3.04 % 453.07 42 2.02 % 269.42 deset d eset 402 2.66 % 388.37 107 2.36 % 464.16 120 3.70 % 406.03 108 2.05 % 305.82 67 3.23 % 429.78 šest šest 388 2.56 % 374.84 164 3.62 % 711.43 77 2.37 % 260.54 106 2.01 % 300.16 41 1.97 % 263 drugo d rugo 376 2.49 % 363.25 63 1.39 % 273.29 66 2.04 % 223.32 189 3.59 % 535.18 58 2.79 % 372.05 štiri š tiri 372 2.46 % 359.38 119 2.62 % 516.22 64 1.97 % 216.55 142 2.69 % 402.10 47 2.26 % 301.49 osem osem 315 2.08 % 304.32 162 3.57 % 702.75 22 0.68 % 74.44 90 1.71 % 254.85 41 1.97 % 263 druga d ruga 279 1.84 % 269.54 69 1.52 % 299.32 58 1.79 % 196.25 115 2.18 % 325.64 37 1.78 % 237.34 prvo prvo 246 1.63 % 237.66 67 1.48 % 290.64 65 2.00 % 219.93 79 1.50 % 223.70 35 1.69 % 224.51 petnajst petn ajst 240 1.59 % 231.86 61 1.34 % 264.62 59 1.82 % 199.63 53 1.01 % 150.08 67 3.23 % 429.78 druge d ruge 231 1.53 % 223.17 40 0.88 % 173.52 46 1.42 % 155.65 120 2.28 % 339.80 25 1.20 % 160.37 enga enga 225 1.49 % 217.37 30 0.66 % 130.14 106 3.27 % 358.66 42 0.80 % 118.93 47 2.26 % 301.49 devet d evet 202 1.33 % 195.15 87 1.92 % 377.40 17 0.52 % 57.52 59 1.12 % 167.07 39 1.88 % 250.17 sedem s edem 201 1.33 % 194.18 70 1.54 % 303.66 35 1.08 % 118.43 80 1.52 % 226.53 16 0.77 % 102.63 drugega dru gega 192 1.27 % 185.49 48 1.06 % 208.22 23 0.71 % 77.82 96 1.82 % 271.84 25 1.20 % 160.37 prva prva 190 1.26 % 183.56 54 1.19 % 234.25 17 0.52 % 57.52 91 1.73 % 257.68 28 1.35 % 179.61 drug drug 189 1.25 % 182.59 30 0.66 % 130.14 69 2.13 % 233.47 60 1.14 % 169.90 30 1.44 % 192.44 dvajset dva jset 186 1.23 % 179.69 74 1.63 % 321.01 7 0.22 % 23.69 65 1.23 % 184.06 40 1.93 % 256.59 eden eden 186 1.23 % 179.69 50 1.10 % 216.90 35 1.08 % 118.43 83 1.57 % 235.03 18 0.87 % 115.46 dvesto dv esto 177 1.17 % 171 58 1.28 % 251.60 32 0.99 % 
108.28 66 1.25 % 186.89 21 1.01 % 134.71
trideset trid eset 174 1.15 % 168.10 74 1.63 % 321.01 9 0.28 % 30.45 43 0.82 % 121.76 48 2.31 % 307.90
petdeset petd eset 173 1.14 % 167.13 28 0.62 % 121.46 45 1.39 % 152.26 65 1.23 % 184.06 35 1.69 % 224.51
dveh dveh 164 1.08 % 158.44 28 0.62 % 121.46 31 0.96 % 104.89 86 1.63 % 243.52 19 0.92 % 121.88
drugih dr ugih 160 1.06 % 154.57 19 0.42 % 82.42 9 0.28 % 30.45 109 2.07 % 308.65 23 1.11 % 147.54
enih enih 152 1.00 % 146.84 24 0.53 % 104.11 74 2.28 % 250.39 27 0.51 % 76.45 27 1.30 % 173.20
enega e nega 133 0.88 % 128.49 33 0.73 % 143.15 24 0.74 % 81.21 61 1.16 % 172.73 15 0.72 % 96.22
tristo tr isto 132 0.87 % 127.52 13 0.29 % 56.39 25 0.77 % 84.59 77 1.46 % 218.04 17 0.82 % 109.05
druzga dr uzga 129 0.85 % 124.62 22 0.48 % 95.44 54 1.67 % 182.72 25 0.47 % 70.79 28 1.35 % 179.61
dvanajst dvan ajst 127 0.84 % 122.69 31 0.68 % 134.48 29 0.89 % 98.12 47 0.89 % 133.09 20 0.96 % 128.29
enem enem 123 0.81 % 118.83 20 0.44 % 86.76 25 0.77 % 84.59 50 0.95 % 141.58 28 1.35 % 179.61
treh treh 113 0.75 % 109.17 27 0.59 % 117.12 21 0.65 % 71.06 54 1.02 % 152.91 11 0.53 % 70.56

2.2.210 List of final character-level 5-grams from numeral lower-case word forms in the GOS 1.0 corpus with text-type distribution
File at CLARIN.SI: GOS1.0-word_parts-numerals-lowercase_forms-final-5grams-taxonomy-entire.tsv
Columns: Lower-case word form; Rest of the word; Final part of the word; Total absolute frequency of lower-case word form; Percentage of all found lower-case word forms; Total relative frequency (per million occurrences); Absolute frequency [gos.T.J.R]; Percentage [gos.T.J.R]; Relative frequency [gos.T.J.R]; Absolute frequency [gos.T.N.Z]; Percentage [gos.T.N.Z]; Relative frequency [gos.T.N.Z]; Absolute frequency [gos.T.J.I]; Percentage [gos.T.J.I]; Relative frequency [gos.T.J.I]; Absolute frequency [gos.T.N.N]; Percentage [gos.T.N.N]; Relative frequency [gos.T.N.N]
tisoč tisoč 614
5.26 % 593.18 200 5.70 % 867.59 41 1.75 % 138.73 321 7.67 % 908.96 52 3.17 % 333.56 drugi drugi 455 3.90 % 439.57 96 2.74 % 416.44 65 2.78 % 219.93 217 5.19 % 614.47 77 4.70 % 493.93 deset deset 402 3.44 % 388.37 107 3.05 % 464.16 120 5.13 % 406.03 108 2.58 % 305.82 67 4.09 % 429.78 drugo drugo 376 3.22 % 363.25 63 1.79 % 273.29 66 2.82 % 223.32 189 4.52 % 535.18 58 3.54 % 372.05 štiri štiri 372 3.19 % 359.38 119 3.39 % 516.22 64 2.74 % 216.55 142 3.40 % 402.10 47 2.87 % 301.49 druga druga 279 2.39 % 269.54 69 1.97 % 299.32 58 2.48 % 196.25 115 2.75 % 325.64 37 2.26 % 237.34 petnajst pet najst 240 2.06 % 231.86 61 1.74 % 264.62 59 2.52 % 199.63 53 1.27 % 150.08 67 4.09 % 429.78 druge druge 231 1.98 % 223.17 40 1.14 % 173.52 46 1.97 % 155.65 120 2.87 % 339.80 25 1.53 % 160.37 devet devet 202 1.73 % 195.15 87 2.48 % 377.40 17 0.73 % 57.52 59 1.41 % 167.07 39 2.38 % 250.17 sedem sedem 201 1.72 % 194.18 70 2.00 % 303.66 35 1.50 % 118.43 80 1.91 % 226.53 16 0.98 % 102.63 drugega dr ugega 192 1.65 % 185.49 48 1.37 % 208.22 23 0.98 % 77.82 96 2.29 % 271.84 25 1.53 % 160.37 dvajset dv ajset 186 1.59 % 179.69 74 2.11 % 321.01 7 0.30 % 23.69 65 1.55 % 184.06 40 2.44 % 256.59 dvesto d vesto 177 1.52 % 171 58 1.65 % 251.60 32 1.37 % 108.28 66 1.58 % 186.89 21 1.28 % 134.71 trideset tri deset 174 1.49 % 168.10 74 2.11 % 321.01 9 0.39 % 30.45 43 1.03 % 121.76 48 2.93 % 307.90 petdeset pet deset 173 1.48 % 167.13 28 0.80 % 121.46 45 1.93 % 152.26 65 1.55 % 184.06 35 2.14 % 224.51 drugih d rugih 160 1.37 % 154.57 19 0.54 % 82.42 9 0.39 % 30.45 109 2.61 % 308.65 23 1.40 % 147.54 enega enega 133 1.14 % 128.49 33 0.94 % 143.15 24 1.03 % 81.21 61 1.46 % 172.73 15 0.92 % 96.22 tristo t risto 132 1.13 % 127.52 13 0.37 % 56.39 25 1.07 % 84.59 77 1.84 % 218.04 17 1.04 % 109.05 druzga d ruzga 129 1.11 % 124.62 22 0.63 % 95.44 54 2.31 % 182.72 25 0.60 % 70.79 28 1.71 % 179.61 dvanajst dva najst 127 1.09 % 122.69 31 0.88 % 134.48 29 1.24 % 98.12 47 1.12 % 133.09 20 1.22 % 128.29 prvem prvem 
110 0.94 % 106.27 36 1.03 % 156.17 9 0.39 % 30.45 43 1.03 % 121.76 22 1.34 % 141.12
petsto p etsto 106 0.91 % 102.40 26 0.74 % 112.79 22 0.94 % 74.44 53 1.27 % 150.08 5 0.30 % 32.07
trije trije 103 0.88 % 99.51 18 0.51 % 78.08 30 1.28 % 101.51 36 0.86 % 101.94 19 1.16 % 121.88
devetdeset devet deset 102 0.87 % 98.54 67 1.91 % 290.64 7 0.30 % 23.69 12 0.29 % 33.98 16 0.98 % 102.63
trieset tr ieset 95 0.81 % 91.78 16 0.46 % 69.41 36 1.54 % 121.81 19 0.45 % 53.80 24 1.47 % 153.95
šestih š estih 94 0.81 % 90.81 52 1.48 % 225.57 27 1.16 % 91.36 8 0.19 % 22.65 7 0.43 % 44.90
tretji t retji 90 0.77 % 86.95 27 0.77 % 117.12 18 0.77 % 60.91 36 0.86 % 101.94 9 0.55 % 57.73
petih petih 89 0.76 % 85.98 33 0.94 % 143.15 32 1.37 % 108.28 16 0.38 % 45.31 8 0.49 % 51.32
sedmih s edmih 88 0.75 % 85.02 49 1.40 % 212.56 31 1.33 % 104.89 4 0.10 % 11.33 4 0.24 % 25.66
drugem d rugem 85 0.73 % 82.12 32 0.91 % 138.81 10 0.43 % 33.84 37 0.89 % 104.77 6 0.37 % 38.49
osmih osmih 84 0.72 % 81.15 41 1.17 % 177.86 35 1.50 % 118.43 8 0.19 % 22.65 0 0 % 0
prvega p rvega 84 0.72 % 81.15 32 0.91 % 138.81 13 0.56 % 43.99 35 0.84 % 99.11 4 0.24 % 25.66

2.2.211 List of initial character-level 1-grams from preposition lemmas in the GOS 1.0 corpus with text-type distribution
File at CLARIN.SI: GOS1.0-word_parts-prepositions-lemmas-initial-1grams-taxonomy-entire.tsv
Columns: Lemma; Lemma (lower-case); Initial part of the word; Rest of the word; Total absolute frequency of lemma; Percentage of all found lemmas; Total relative frequency (per million occurrences); Absolute frequency [gos.T.J.R]; Percentage [gos.T.J.R]; Relative frequency [gos.T.J.R]; Absolute frequency [gos.T.N.Z]; Percentage [gos.T.N.Z]; Relative frequency [gos.T.N.Z]; Absolute frequency [gos.T.J.I]; Percentage [gos.T.J.I]; Relative frequency [gos.T.J.I]; Absolute frequency [gos.T.N.N]; Percentage [gos.T.N.N]; Relative frequency [gos.T.N.N]
v v v 17,583 25.92 % 16,986.65 4,042 25.13 %
17,534.04 3,793 24.31 % 12,834.05 7,400 27.40 % 20,954.33 2,348 25.69 % 15,061.61
| na | na | n | a | 11,921 | 17.57 % | 11,516.68 | 2,989 | 18.58 % | 12,966.17 | 2,882 | 18.47 % | 9,751.58 | 4,348 | 16.10 % | 12,312.08 | 1,702 | 18.62 % | 10,917.74 |
| za | za | z | a | 7,872 | 11.60 % | 7,605.01 | 1,776 | 11.04 % | 7,704.22 | 1,787 | 11.45 % | 6,046.52 | 3,129 | 11.59 % | 8,860.28 | 1,180 | 12.91 % | 7,569.29 |
| z | z | z | | 7,663 | 11.30 % | 7,403.10 | 1,894 | 11.77 % | 8,216.10 | 1,564 | 10.02 % | 5,291.97 | 3,134 | 11.60 % | 8,874.44 | 1,071 | 11.72 % | 6,870.10 |
| po | po | p | o | 3,337 | 4.92 % | 3,223.82 | 779 | 4.84 % | 3,379.27 | 981 | 6.29 % | 3,319.33 | 1,116 | 4.13 % | 3,160.14 | 461 | 5.04 % | 2,957.16 |
| od | od | o | d | 2,510 | 3.70 % | 2,424.87 | 594 | 3.69 % | 2,576.75 | 674 | 4.32 % | 2,280.56 | 942 | 3.49 % | 2,667.43 | 300 | 3.28 % | 1,924.40 |
| pri | pri | p | ri | 2,371 | 3.50 % | 2,290.58 | 443 | 2.75 % | 1,921.72 | 596 | 3.82 % | 2,016.63 | 929 | 3.44 % | 2,630.62 | 403 | 4.41 % | 2,585.11 |
| do | do | d | o | 2,126 | 3.13 % | 2,053.89 | 481 | 2.99 % | 2,086.56 | 448 | 2.87 % | 1,515.86 | 879 | 3.25 % | 2,489.03 | 318 | 3.48 % | 2,039.86 |
| o | o | o | | 2,093 | 3.08 % | 2,022.01 | 483 | 3.00 % | 2,095.24 | 321 | 2.06 % | 1,086.14 | 1,076 | 3.98 % | 3,046.87 | 213 | 2.33 % | 1,366.32 |
| iz | iz | i | z | 1,944 | 2.87 % | 1,878.07 | 455 | 2.83 % | 1,973.77 | 436 | 2.79 % | 1,475.26 | 888 | 3.29 % | 2,514.52 | 165 | 1.80 % | 1,058.42 |
| ob | ob | o | b | 896 | 1.32 % | 865.61 | 291 | 1.81 % | 1,262.35 | 259 | 1.66 % | 876.36 | 282 | 1.04 % | 798.53 | 64 | 0.70 % | 410.54 |
| pred | pred | p | red | 789 | 1.16 % | 762.24 | 284 | 1.77 % | 1,231.98 | 148 | 0.95 % | 500.77 | 311 | 1.15 % | 880.65 | 46 | 0.50 % | 295.07 |
| zaradi | zaradi | z | aradi | 786 | 1.16 % | 759.34 | 135 | 0.84 % | 585.62 | 143 | 0.92 % | 483.86 | 398 | 1.47 % | 1,127 | 110 | 1.20 % | 705.61 |
| med | med | m | ed | 781 | 1.15 % | 754.51 | 192 | 1.19 % | 832.89 | 100 | 0.64 % | 338.36 | 409 | 1.51 % | 1,158.15 | 80 | 0.88 % | 513.17 |
| k | k | k | | 615 | 0.91 % | 594.14 | 127 | 0.79 % | 550.92 | 176 | 1.13 % | 595.52 | 227 | 0.84 % | 642.79 | 85 | 0.93 % | 545.25 |
| čez | čez | č | ez | 587 | 0.86 % | 567.09 | 211 | 1.31 % | 915.31 | 198 | 1.27 % | 669.96 | 125 | 0.46 % | 353.96 | 53 | 0.58 % | 339.98 |
| brez | brez | b | rez | 463 | 0.68 % | 447.30 | 133 | 0.83 % | 576.95 | 118 | 0.76 % | 399.27 | 143 | 0.53 % | 404.93 | 69 | 0.76 % | 442.61 |
| pod | pod | p | od | 340 | 0.50 % | 328.47 | 61 | 0.38 % | 264.62 | 93 | 0.60 % | 314.68 | 147 | 0.54 % | 416.25 | 39 | 0.43 % | 250.17 |
| proti | proti | p | roti | 323 | 0.48 % | 312.05 | 123 | 0.77 % | 533.57 | 63 | 0.40 % | 213.17 | 114 | 0.42 % | 322.81 | 23 | 0.25 % | 147.54 |
| zraven | zraven | z | raven | 297 | 0.44 % | 286.93 | 54 | 0.34 % | 234.25 | 130 | 0.83 % | 439.87 | 48 | 0.18 % | 135.92 | 65 | 0.71 % | 416.95 |
| skoz | skoz | s | koz | 213 | 0.31 % | 205.78 | 15 | 0.09 % | 65.07 | 130 | 0.83 % | 439.87 | 14 | 0.05 % | 39.64 | 54 | 0.59 % | 346.39 |
| skozi | skozi | s | kozi | 212 | 0.31 % | 204.81 | 50 | 0.31 % | 216.90 | 60 | 0.39 % | 203.02 | 92 | 0.34 % | 260.51 | 10 | 0.11 % | 64.15 |
| okoli | okoli | o | koli | 183 | 0.27 % | 176.79 | 38 | 0.24 % | 164.84 | 89 | 0.57 % | 301.14 | 41 | 0.15 % | 116.10 | 15 | 0.16 % | 96.22 |
| nad | nad | n | ad | 162 | 0.24 % | 156.51 | 41 | 0.26 % | 177.86 | 24 | 0.15 % | 81.21 | 84 | 0.31 % | 237.86 | 13 | 0.14 % | 83.39 |
| preko | preko | p | reko | 145 | 0.21 % | 140.08 | 43 | 0.27 % | 186.53 | 14 | 0.09 % | 47.37 | 73 | 0.27 % | 206.71 | 15 | 0.16 % | 96.22 |
| glede | glede | g | lede | 132 | 0.20 % | 127.52 | 17 | 0.11 % | 73.75 | 18 | 0.12 % | 60.91 | 48 | 0.18 % | 135.92 | 49 | 0.54 % | 314.32 |
| okrog | okrog | o | krog | 125 | 0.18 % | 120.76 | 19 | 0.12 % | 82.42 | 28 | 0.18 % | 94.74 | 55 | 0.20 % | 155.74 | 23 | 0.25 % | 147.54 |
| mimo | mimo | m | imo | 117 | 0.17 % | 113.03 | 42 | 0.26 % | 182.19 | 38 | 0.24 % | 128.58 | 28 | 0.10 % | 79.29 | 9 | 0.10 % | 57.73 |
| poleg | poleg | p | oleg | 112 | 0.17 % | 108.20 | 29 | 0.18 % | 125.80 | 20 | 0.13 % | 67.67 | 53 | 0.20 % | 150.08 | 10 | 0.11 % | 64.15 |
| kljub | kljub | k | ljub | 99 | 0.15 % | 95.64 | 16 | 0.10 % | 69.41 | 4 | 0.03 % | 13.53 | 67 | 0.25 % | 189.72 | 12 | 0.13 % | 76.98 |
| razen | razen | r | azen | 98 | 0.14 % | 94.68 | 14 | 0.09 % | 60.73 | 29 | 0.19 % | 98.12 | 35 | 0.13 % | 99.11 | 20 | 0.22 % | 128.29 |
| prek | prek | p | rek | 97 | 0.14 % | 93.71 | 15 | 0.09 % | 65.07 | 43 | 0.28 % | 145.50 | 26 | 0.10 % | 73.62 | 13 | 0.14 % | 83.39 |

2.2.212 List of initial character-level 2-grams from preposition lemmas in the GOS 1.0 corpus with text-type distribution

File at CLARIN.SI: GOS1.0-word_parts-prepositions-lemmas-initial-2grams-taxonomy-entire.tsv

| Lemma | Lemma (lower-case) | Initial part of the word | Rest of the word | Total absolute frequency of lemma | Percentage of all found lemmas | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N] |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| na | na | na | | 11,921 | 29.89 % | 11,516.68 | 2,989 | 31.33 % | 12,966.17 | 2,882 | 29.57 % | 9,751.58 | 4,348 | 28.67 % | 12,312.08 | 1,702 | 31.38 % | 10,917.74 |
| za | za | za | | 7,872 | 19.74 % | 7,605.01 | 1,776 | 18.62 % | 7,704.22 | 1,787 | 18.34 % | 6,046.52 | 3,129 | 20.63 % | 8,860.28 | 1,180 | 21.75 % | 7,569.29 |
| po | po | po | | 3,337 | 8.37 % | 3,223.82 | 779 | 8.17 % | 3,379.27 | 981 | 10.07 % | 3,319.33 | 1,116 | 7.36 % | 3,160.14 | 461 | 8.50 % | 2,957.16 |
| od | od | od | | 2,510 | 6.29 % | 2,424.87 | 594 | 6.23 % | 2,576.75 | 674 | 6.92 % | 2,280.56 | 942 | 6.21 % | 2,667.43 | 300 | 5.53 % | 1,924.40 |
| pri | pri | pr | i | 2,371 | 5.95 % | 2,290.58 | 443 | 4.64 % | 1,921.72 | 596 | 6.12 % | 2,016.63 | 929 | 6.12 % | 2,630.62 | 403 | 7.43 % | 2,585.11 |
| do | do | do | | 2,126 | 5.33 % | 2,053.89 | 481 | 5.04 % | 2,086.56 | 448 | 4.60 % | 1,515.86 | 879 | 5.79 % | 2,489.03 | 318 | 5.86 % | 2,039.86 |
| iz | iz | iz | | 1,944 | 4.88 % | 1,878.07 | 455 | 4.77 % | 1,973.77 | 436 | 4.47 % | 1,475.26 | 888 | 5.86 % | 2,514.52 | 165 | 3.04 % | 1,058.42 |
| ob | ob | ob | | 896 | 2.25 % | 865.61 | 291 | 3.05 % | 1,262.35 | 259 | 2.66 % | 876.36 | 282 | 1.86 % | 798.53 | 64 | 1.18 % | 410.54 |
| pred | pred | pr | ed | 789 | 1.98 % | 762.24 | 284 | 2.98 % | 1,231.98 | 148 | 1.52 % | 500.77 | 311 | 2.05 % | 880.65 | 46 | 0.85 % | 295.07 |
| zaradi | zaradi | za | radi | 786 | 1.97 % | 759.34 | 135 | 1.42 % | 585.62 | 143 | 1.47 % | 483.86 | 398 | 2.62 % | 1,127 | 110 | 2.03 % | 705.61 |
| med | med | me | d | 781 | 1.96 % | 754.51 | 192 | 2.01 % | 832.89 | 100 | 1.03 % | 338.36 | 409 | 2.70 % | 1,158.15 | 80 | 1.48 % | 513.17 |
| čez | čez | če | z | 587 | 1.47 % | 567.09 | 211 | 2.21 % | 915.31 | 198 | 2.03 % | 669.96 | 125 | 0.82 % | 353.96 | 53 | 0.98 % | 339.98 |
| brez | brez | br | ez | 463 | 1.16 % | 447.30 | 133 | 1.39 % | 576.95 | 118 | 1.21 % | 399.27 | 143 | 0.94 % | 404.93 | 69 | 1.27 % | 442.61 |
| pod | pod | po | d | 340 | 0.85 % | 328.47 | 61 | 0.64 % | 264.62 | 93 | 0.95 % | 314.68 | 147 | 0.97 % | 416.25 | 39 | 0.72 % | 250.17 |
| proti | proti | pr | oti | 323 | 0.81 % | 312.05 | 123 | 1.29 % | 533.57 | 63 | 0.65 % | 213.17 | 114 | 0.75 % | 322.81 | 23 | 0.42 % | 147.54 |
| zraven | zraven | zr | aven | 297 | 0.74 % | 286.93 | 54 | 0.57 % | 234.25 | 130 | 1.33 % | 439.87 | 48 | 0.32 % | 135.92 | 65 | 1.20 % | 416.95 |
| skoz | skoz | sk | oz | 213 | 0.53 % | 205.78 | 15 | 0.16 % | 65.07 | 130 | 1.33 % | 439.87 | 14 | 0.09 % | 39.64 | 54 | 1.00 % | 346.39 |
| skozi | skozi | sk | ozi | 212 | 0.53 % | 204.81 | 50 | 0.52 % | 216.90 | 60 | 0.62 % | 203.02 | 92 | 0.61 % | 260.51 | 10 | 0.18 % | 64.15 |
| okoli | okoli | ok | oli | 183 | 0.46 % | 176.79 | 38 | 0.40 % | 164.84 | 89 | 0.91 % | 301.14 | 41 | 0.27 % | 116.10 | 15 | 0.28 % | 96.22 |
| nad | nad | na | d | 162 | 0.41 % | 156.51 | 41 | 0.43 % | 177.86 | 24 | 0.25 % | 81.21 | 84 | 0.55 % | 237.86 | 13 | 0.24 % | 83.39 |
| preko | preko | pr | eko | 145 | 0.36 % | 140.08 | 43 | 0.45 % | 186.53 | 14 | 0.14 % | 47.37 | 73 | 0.48 % | 206.71 | 15 | 0.28 % | 96.22 |
| glede | glede | gl | ede | 132 | 0.33 % | 127.52 | 17 | 0.18 % | 73.75 | 18 | 0.18 % | 60.91 | 48 | 0.32 % | 135.92 | 49 | 0.90 % | 314.32 |
| okrog | okrog | ok | rog | 125 | 0.31 % | 120.76 | 19 | 0.20 % | 82.42 | 28 | 0.29 % | 94.74 | 55 | 0.36 % | 155.74 | 23 | 0.42 % | 147.54 |
| mimo | mimo | mi | mo | 117 | 0.29 % | 113.03 | 42 | 0.44 % | 182.19 | 38 | 0.39 % | 128.58 | 28 | 0.18 % | 79.29 | 9 | 0.17 % | 57.73 |
| poleg | poleg | po | leg | 112 | 0.28 % | 108.20 | 29 | 0.30 % | 125.80 | 20 | 0.20 % | 67.67 | 53 | 0.35 % | 150.08 | 10 | 0.18 % | 64.15 |
| kljub | kljub | kl | jub | 99 | 0.25 % | 95.64 | 16 | 0.17 % | 69.41 | 4 | 0.04 % | 13.53 | 67 | 0.44 % | 189.72 | 12 | 0.22 % | 76.98 |
| razen | razen | ra | zen | 98 | 0.25 % | 94.68 | 14 | 0.15 % | 60.73 | 29 | 0.30 % | 98.12 | 35 | 0.23 % | 99.11 | 20 | 0.37 % | 128.29 |
| prek | prek | pr | ek | 97 | 0.24 % | 93.71 | 15 | 0.16 % | 65.07 | 43 | 0.44 % | 145.50 | 26 | 0.17 % | 73.62 | 13 | 0.24 % | 83.39 |
| blizu | blizu | bl | izu | 96 | 0.24 % | 92.74 | 19 | 0.20 % | 82.42 | 31 | 0.32 % | 104.89 | 36 | 0.24 % | 101.94 | 10 | 0.18 % | 64.15 |
| zunaj | zunaj | zu | naj | 90 | 0.23 % | 86.95 | 14 | 0.15 % | 60.73 | 30 | 0.31 % | 101.51 | 25 | 0.17 % | 70.79 | 21 | 0.39 % | 134.71 |
| izmed | izmed | iz | med | 89 | 0.22 % | 85.98 | 46 | 0.48 % | 199.55 | 5 | 0.05 % | 16.92 | 37 | 0.24 % | 104.77 | 1 | 0.02 % | 6.41 |
| namesto | namesto | na | mesto | 84 | 0.21 % | 81.15 | 15 | 0.16 % | 65.07 | 19 | 0.20 % | 64.29 | 35 | 0.23 % | 99.11 | 15 | 0.28 % | 96.22 |

2.2.213 List of initial character-level 3-grams from preposition lemmas in the GOS 1.0 corpus with text-type distribution

File at CLARIN.SI: GOS1.0-word_parts-prepositions-lemmas-initial-3grams-taxonomy-entire.tsv

| Lemma | Lemma (lower-case) | Initial part of the word | Rest of the word | Total absolute frequency of lemma | Percentage of all found lemmas | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N] |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| pri | pri | pri | | 2,371 | 25.57 % | 2,290.58 | 443 | 20.37 % | 1,921.72 | 596 | 26.15 % | 2,016.63 | 929 | 25.93 % | 2,630.62 | 403 | 32.66 % | 2,585.11 |
| pred | pred | pre | d | 789 | 8.51 % | 762.24 | 284 | 13.06 % | 1,231.98 | 148 | 6.49 % | 500.77 | 311 | 8.68 % | 880.65 | 46 | 3.73 % | 295.07 |
| zaradi | zaradi | zar | adi | 786 | 8.48 % | 759.34 | 135 | 6.21 % | 585.62 | 143 | 6.28 % | 483.86 | 398 | 11.11 % | 1,127 | 110 | 8.91 % | 705.61 |
| med | med | med | | 781 | 8.42 % | 754.51 | 192 | 8.83 % | 832.89 | 100 | 4.39 % | 338.36 | 409 | 11.41 % | 1,158.15 | 80 | 6.48 % | 513.17 |
| čez | čez | čez | | 587 | 6.33 % | 567.09 | 211 | 9.70 % | 915.31 | 198 | 8.69 % | 669.96 | 125 | 3.49 % | 353.96 | 53 | 4.29 % | 339.98 |
| brez | brez | bre | z | 463 | 4.99 % | 447.30 | 133 | 6.12 % | 576.95 | 118 | 5.18 % | 399.27 | 143 | 3.99 % | 404.93 | 69 | 5.59 % | 442.61 |
| pod | pod | pod | | 340 | 3.67 % | 328.47 | 61 | 2.81 % | 264.62 | 93 | 4.08 % | 314.68 | 147 | 4.10 % | 416.25 | 39 | 3.16 % | 250.17 |
| proti | proti | pro | ti | 323 | 3.48 % | 312.05 | 123 | 5.66 % | 533.57 | 63 | 2.76 % | 213.17 | 114 | 3.18 % | 322.81 | 23 | 1.86 % | 147.54 |
| zraven | zraven | zra | ven | 297 | 3.20 % | 286.93 | 54 | 2.48 % | 234.25 | 130 | 5.70 % | 439.87 | 48 | 1.34 % | 135.92 | 65 | 5.27 % | 416.95 |
| skoz | skoz | sko | z | 213 | 2.30 % | 205.78 | 15 | 0.69 % | 65.07 | 130 | 5.70 % | 439.87 | 14 | 0.39 % | 39.64 | 54 | 4.38 % | 346.39 |
| skozi | skozi | sko | zi | 212 | 2.29 % | 204.81 | 50 | 2.30 % | 216.90 | 60 | 2.63 % | 203.02 | 92 | 2.57 % | 260.51 | 10 | 0.81 % | 64.15 |
| okoli | okoli | oko | li | 183 | 1.97 % | 176.79 | 38 | 1.75 % | 164.84 | 89 | 3.90 % | 301.14 | 41 | 1.14 % | 116.10 | 15 | 1.22 % | 96.22 |
| nad | nad | nad | | 162 | 1.75 % | 156.51 | 41 | 1.89 % | 177.86 | 24 | 1.05 % | 81.21 | 84 | 2.34 % | 237.86 | 13 | 1.05 % | 83.39 |
| preko | preko | pre | ko | 145 | 1.56 % | 140.08 | 43 | 1.98 % | 186.53 | 14 | 0.61 % | 47.37 | 73 | 2.04 % | 206.71 | 15 | 1.22 % | 96.22 |
| glede | glede | gle | de | 132 | 1.42 % | 127.52 | 17 | 0.78 % | 73.75 | 18 | 0.79 % | 60.91 | 48 | 1.34 % | 135.92 | 49 | 3.97 % | 314.32 |
| okrog | okrog | okr | og | 125 | 1.35 % | 120.76 | 19 | 0.87 % | 82.42 | 28 | 1.23 % | 94.74 | 55 | 1.53 % | 155.74 | 23 | 1.86 % | 147.54 |
| mimo | mimo | mim | o | 117 | 1.26 % | 113.03 | 42 | 1.93 % | 182.19 | 38 | 1.67 % | 128.58 | 28 | 0.78 % | 79.29 | 9 | 0.73 % | 57.73 |
| poleg | poleg | pol | eg | 112 | 1.21 % | 108.20 | 29 | 1.33 % | 125.80 | 20 | 0.88 % | 67.67 | 53 | 1.48 % | 150.08 | 10 | 0.81 % | 64.15 |
| kljub | kljub | klj | ub | 99 | 1.07 % | 95.64 | 16 | 0.74 % | 69.41 | 4 | 0.18 % | 13.53 | 67 | 1.87 % | 189.72 | 12 | 0.97 % | 76.98 |
| razen | razen | raz | en | 98 | 1.06 % | 94.68 | 14 | 0.64 % | 60.73 | 29 | 1.27 % | 98.12 | 35 | 0.98 % | 99.11 | 20 | 1.62 % | 128.29 |
| prek | prek | pre | k | 97 | 1.05 % | 93.71 | 15 | 0.69 % | 65.07 | 43 | 1.89 % | 145.50 | 26 | 0.73 % | 73.62 | 13 | 1.05 % | 83.39 |
| blizu | blizu | bli | zu | 96 | 1.03 % | 92.74 | 19 | 0.87 % | 82.42 | 31 | 1.36 % | 104.89 | 36 | 1.00 % | 101.94 | 10 | 0.81 % | 64.15 |
| zunaj | zunaj | zun | aj | 90 | 0.97 % | 86.95 | 14 | 0.64 % | 60.73 | 30 | 1.32 % | 101.51 | 25 | 0.70 % | 70.79 | 21 | 1.70 % | 134.71 |
| izmed | izmed | izm | ed | 89 | 0.96 % | 85.98 | 46 | 2.12 % | 199.55 | 5 | 0.22 % | 16.92 | 37 | 1.03 % | 104.77 | 1 | 0.08 % | 6.41 |
| namesto | namesto | nam | esto | 84 | 0.91 % | 81.15 | 15 | 0.69 % | 65.07 | 19 | 0.83 % | 64.29 | 35 | 0.98 % | 99.11 | 15 | 1.22 % | 96.22 |
| kasneje | kasneje | kas | neje | 75 | 0.81 % | 72.46 | 25 | 1.15 % | 108.45 | 11 | 0.48 % | 37.22 | 34 | 0.95 % | 96.28 | 5 | 0.41 % | 32.07 |
| sredi | sredi | sre | di | 58 | 0.63 % | 56.03 | 10 | 0.46 % | 43.38 | 27 | 1.19 % | 91.36 | 14 | 0.39 % | 39.64 | 7 | 0.57 % | 44.90 |
| znotraj | znotraj | zno | traj | 53 | 0.57 % | 51.20 | 3 | 0.14 % | 13.01 | 2 | 0.09 % | 6.77 | 40 | 1.12 % | 113.27 | 8 | 0.65 % | 51.32 |
| vrhu | vrhu | vrh | u | 46 | 0.50 % | 44.44 | 21 | 0.97 % | 91.10 | 7 | 0.31 % | 23.69 | 16 | 0.45 % | 45.31 | 2 | 0.16 % | 12.83 |
| vrh | vrh | vrh | | 40 | 0.43 % | 38.64 | 13 | 0.60 % | 56.39 | 9 | 0.40 % | 30.45 | 8 | 0.22 % | 22.65 | 10 | 0.81 % | 64.15 |
| krog | krog | kro | g | 29 | 0.31 % | 28.02 | 5 | 0.23 % | 21.69 | 3 | 0.13 % | 10.15 | 16 | 0.45 % | 45.31 | 5 | 0.41 % | 32.07 |
| izven | izven | izv | en | 21 | 0.23 % | 20.29 | 5 | 0.23 % | 21.69 | 3 | 0.13 % | 10.15 | 9 | 0.25 % | 25.48 | 4 | 0.32 % | 25.66 |

2.2.214 List of initial character-level 4-grams from preposition lemmas in the GOS 1.0 corpus with text-type distribution

File at CLARIN.SI: GOS1.0-word_parts-prepositions-lemmas-initial-4grams-taxonomy-entire.tsv

| Lemma | Lemma (lower-case) | Initial part of the word | Rest of the word | Total absolute frequency of lemma | Percentage of all found lemmas | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N] |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| pred | pred | pred | | 789 | 15.84 % | 762.24 | 284 | 23.47 % | 1,231.98 | 148 | 11.80 % | 500.77 | 311 | 16.53 % | 880.65 | 46 | 7.23 % | 295.07 |
| zaradi | zaradi | zara | di | 786 | 15.78 % | 759.34 | 135 | 11.16 % | 585.62 | 143 | 11.40 % | 483.86 | 398 | 21.16 % | 1,127 | 110 | 17.30 % | 705.61 |
| brez | brez | brez | | 463 | 9.29 % | 447.30 | 133 | 10.99 % | 576.95 | 118 | 9.41 % | 399.27 | 143 | 7.60 % | 404.93 | 69 | 10.85 % | 442.61 |
| proti | proti | prot | i | 323 | 6.49 % | 312.05 | 123 | 10.16 % | 533.57 | 63 | 5.02 % | 213.17 | 114 | 6.06 % | 322.81 | 23 | 3.62 % | 147.54 |
| zraven | zraven | zrav | en | 297 | 5.96 % | 286.93 | 54 | 4.46 % | 234.25 | 130 | 10.37 % | 439.87 | 48 | 2.55 % | 135.92 | 65 | 10.22 % | 416.95 |
| skoz | skoz | skoz | | 213 | 4.28 % | 205.78 | 15 | 1.24 % | 65.07 | 130 | 10.37 % | 439.87 | 14 | 0.74 % | 39.64 | 54 | 8.49 % | 346.39 |
| skozi | skozi | skoz | i | 212 | 4.26 % | 204.81 | 50 | 4.13 % | 216.90 | 60 | 4.79 % | 203.02 | 92 | 4.89 % | 260.51 | 10 | 1.57 % | 64.15 |
| okoli | okoli | okol | i | 183 | 3.67 % | 176.79 | 38 | 3.14 % | 164.84 | 89 | 7.10 % | 301.14 | 41 | 2.18 % | 116.10 | 15 | 2.36 % | 96.22 |
| preko | preko | prek | o | 145 | 2.91 % | 140.08 | 43 | 3.55 % | 186.53 | 14 | 1.12 % | 47.37 | 73 | 3.88 % | 206.71 | 15 | 2.36 % | 96.22 |
| glede | glede | gled | e | 132 | 2.65 % | 127.52 | 17 | 1.41 % | 73.75 | 18 | 1.44 % | 60.91 | 48 | 2.55 % | 135.92 | 49 | 7.70 % | 314.32 |
| okrog | okrog | okro | g | 125 | 2.51 % | 120.76 | 19 | 1.57 % | 82.42 | 28 | 2.23 % | 94.74 | 55 | 2.92 % | 155.74 | 23 | 3.62 % | 147.54 |
| mimo | mimo | mimo | | 117 | 2.35 % | 113.03 | 42 | 3.47 % | 182.19 | 38 | 3.03 % | 128.58 | 28 | 1.49 % | 79.29 | 9 | 1.42 % | 57.73 |
| poleg | poleg | pole | g | 112 | 2.25 % | 108.20 | 29 | 2.40 % | 125.80 | 20 | 1.59 % | 67.67 | 53 | 2.82 % | 150.08 | 10 | 1.57 % | 64.15 |
| kljub | kljub | klju | b | 99 | 1.99 % | 95.64 | 16 | 1.32 % | 69.41 | 4 | 0.32 % | 13.53 | 67 | 3.56 % | 189.72 | 12 | 1.89 % | 76.98 |
| razen | razen | raze | n | 98 | 1.97 % | 94.68 | 14 | 1.16 % | 60.73 | 29 | 2.31 % | 98.12 | 35 | 1.86 % | 99.11 | 20 | 3.15 % | 128.29 |
| prek | prek | prek | | 97 | 1.95 % | 93.71 | 15 | 1.24 % | 65.07 | 43 | 3.43 % | 145.50 | 26 | 1.38 % | 73.62 | 13 | 2.04 % | 83.39 |
| blizu | blizu | bliz | u | 96 | 1.93 % | 92.74 | 19 | 1.57 % | 82.42 | 31 | 2.47 % | 104.89 | 36 | 1.91 % | 101.94 | 10 | 1.57 % | 64.15 |
| zunaj | zunaj | zuna | j | 90 | 1.81 % | 86.95 | 14 | 1.16 % | 60.73 | 30 | 2.39 % | 101.51 | 25 | 1.33 % | 70.79 | 21 | 3.30 % | 134.71 |
| izmed | izmed | izme | d | 89 | 1.79 % | 85.98 | 46 | 3.80 % | 199.55 | 5 | 0.40 % | 16.92 | 37 | 1.97 % | 104.77 | 1 | 0.16 % | 6.41 |
| namesto | namesto | name | sto | 84 | 1.69 % | 81.15 | 15 | 1.24 % | 65.07 | 19 | 1.51 % | 64.29 | 35 | 1.86 % | 99.11 | 15 | 2.36 % | 96.22 |
| kasneje | kasneje | kasn | eje | 75 | 1.51 % | 72.46 | 25 | 2.07 % | 108.45 | 11 | 0.88 % | 37.22 | 34 | 1.81 % | 96.28 | 5 | 0.79 % | 32.07 |
| sredi | sredi | sred | i | 58 | 1.16 % | 56.03 | 10 | 0.83 % | 43.38 | 27 | 2.15 % | 91.36 | 14 | 0.74 % | 39.64 | 7 | 1.10 % | 44.90 |
| znotraj | znotraj | znot | raj | 53 | 1.06 % | 51.20 | 3 | 0.25 % | 13.01 | 2 | 0.16 % | 6.77 | 40 | 2.13 % | 113.27 | 8 | 1.26 % | 51.32 |
| vrhu | vrhu | vrhu | | 46 | 0.92 % | 44.44 | 21 | 1.74 % | 91.10 | 7 | 0.56 % | 23.69 | 16 | 0.85 % | 45.31 | 2 | 0.31 % | 12.83 |
| krog | krog | krog | | 29 | 0.58 % | 28.02 | 5 | 0.41 % | 21.69 | 3 | 0.24 % | 10.15 | 16 | 0.85 % | 45.31 | 5 | 0.79 % | 32.07 |
| izven | izven | izve | n | 21 | 0.42 % | 20.29 | 5 | 0.41 % | 21.69 | 3 | 0.24 % | 10.15 | 9 | 0.48 % | 25.48 | 4 | 0.63 % | 25.66 |
| konec | konec | kone | c | 21 | 0.42 % | 20.29 | 0 | 0 % | 0 | 5 | 0.40 % | 16.92 | 14 | 0.74 % | 39.64 | 2 | 0.31 % | 12.83 |
| nasproti | nasproti | nasp | roti | 21 | 0.42 % | 20.29 | 1 | 0.08 % | 4.34 | 12 | 0.96 % | 40.60 | 5 | 0.27 % | 14.16 | 3 | 0.47 % | 19.24 |
| izpred | izpred | izpr | ed | 16 | 0.32 % | 15.46 | 2 | 0.17 % | 8.68 | 5 | 0.40 % | 16.92 | 9 | 0.48 % | 25.48 | 0 | 0 % | 0 |
| bližje | bližje | bliž | je | 12 | 0.24 % | 11.59 | 2 | 0.17 % | 8.68 | 4 | 0.32 % | 13.53 | 5 | 0.27 % | 14.16 | 1 | 0.16 % | 6.41 |
| zoper | zoper | zope | r | 11 | 0.22 % | 10.63 | 0 | 0 % | 0 | 0 | 0 % | 0 | 11 | 0.58 % | 31.15 | 0 | 0 % | 0 |
| bliže | bliže | bliž | e | 10 | 0.20 % | 9.66 | 6 | 0.50 % | 26.03 | 0 | 0 % | 0 | 4 | 0.21 % | 11.33 | 0 | 0 % | 0 |

2.2.215 List of initial character-level 5-grams from preposition lemmas in the GOS 1.0 corpus with text-type distribution

File at CLARIN.SI: GOS1.0-word_parts-prepositions-lemmas-initial-5grams-taxonomy-entire.tsv

| Lemma | Lemma (lower-case) | Initial part of the word | Rest of the word | Total absolute frequency of lemma | Percentage of all found lemmas | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N] |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| zaradi | zaradi | zarad | i | 786 | 24.43 % | 759.34 | 135 | 19.42 % | 585.62 | 143 | 18.67 % | 483.86 | 398 | 30.02 % | 1,127 | 110 | 25.52 % | 705.61 |
| proti | proti | proti | | 323 | 10.04 % | 312.05 | 123 | 17.70 % | 533.57 | 63 | 8.22 % | 213.17 | 114 | 8.60 % | 322.81 | 23 | 5.34 % | 147.54 |
| zraven | zraven | zrave | n | 297 | 9.23 % | 286.93 | 54 | 7.77 % | 234.25 | 130 | 16.97 % | 439.87 | 48 | 3.62 % | 135.92 | 65 | 15.08 % | 416.95 |
| skozi | skozi | skozi | | 212 | 6.59 % | 204.81 | 50 | 7.19 % | 216.90 | 60 | 7.83 % | 203.02 | 92 | 6.94 % | 260.51 | 10 | 2.32 % | 64.15 |
| okoli | okoli | okoli | | 183 | 5.69 % | 176.79 | 38 | 5.47 % | 164.84 | 89 | 11.62 % | 301.14 | 41 | 3.09 % | 116.10 | 15 | 3.48 % | 96.22 |
| preko | preko | preko | | 145 | 4.51 % | 140.08 | 43 | 6.19 % | 186.53 | 14 | 1.83 % | 47.37 | 73 | 5.50 % | 206.71 | 15 | 3.48 % | 96.22 |
| glede | glede | glede | | 132 | 4.10 % | 127.52 | 17 | 2.45 % | 73.75 | 18 | 2.35 % | 60.91 | 48 | 3.62 % | 135.92 | 49 | 11.37 % | 314.32 |
| okrog | okrog | okrog | | 125 | 3.88 % | 120.76 | 19 | 2.73 % | 82.42 | 28 | 3.65 % | 94.74 | 55 | 4.15 % | 155.74 | 23 | 5.34 % | 147.54 |
| poleg | poleg | poleg | | 112 | 3.48 % | 108.20 | 29 | 4.17 % | 125.80 | 20 | 2.61 % | 67.67 | 53 | 4.00 % | 150.08 | 10 | 2.32 % | 64.15 |
| kljub | kljub | kljub | | 99 | 3.08 % | 95.64 | 16 | 2.30 % | 69.41 | 4 | 0.52 % | 13.53 | 67 | 5.05 % | 189.72 | 12 | 2.78 % | 76.98 |
| razen | razen | razen | | 98 | 3.04 % | 94.68 | 14 | 2.01 % | 60.73 | 29 | 3.79 % | 98.12 | 35 | 2.64 % | 99.11 | 20 | 4.64 % | 128.29 |
| blizu | blizu | blizu | | 96 | 2.98 % | 92.74 | 19 | 2.73 % | 82.42 | 31 | 4.05 % | 104.89 | 36 | 2.71 % | 101.94 | 10 | 2.32 % | 64.15 |
| zunaj | zunaj | zunaj | | 90 | 2.80 % | 86.95 | 14 | 2.01 % | 60.73 | 30 | 3.92 % | 101.51 | 25 | 1.89 % | 70.79 | 21 | 4.87 % | 134.71 |
| izmed | izmed | izmed | | 89 | 2.77 % | 85.98 | 46 | 6.62 % | 199.55 | 5 | 0.65 % | 16.92 | 37 | 2.79 % | 104.77 | 1 | 0.23 % | 6.41 |
| namesto | namesto | names | to | 84 | 2.61 % | 81.15 | 15 | 2.16 % | 65.07 | 19 | 2.48 % | 64.29 | 35 | 2.64 % | 99.11 | 15 | 3.48 % | 96.22 |
| kasneje | kasneje | kasne | je | 75 | 2.33 % | 72.46 | 25 | 3.60 % | 108.45 | 11 | 1.44 % | 37.22 | 34 | 2.56 % | 96.28 | 5 | 1.16 % | 32.07 |
| sredi | sredi | sredi | | 58 | 1.80 % | 56.03 | 10 | 1.44 % | 43.38 | 27 | 3.52 % | 91.36 | 14 | 1.06 % | 39.64 | 7 | 1.62 % | 44.90 |
| znotraj | znotraj | znotr | aj | 53 | 1.65 % | 51.20 | 3 | 0.43 % | 13.01 | 2 | 0.26 % | 6.77 | 40 | 3.02 % | 113.27 | 8 | 1.86 % | 51.32 |
| izven | izven | izven | | 21 | 0.65 % | 20.29 | 5 | 0.72 % | 21.69 | 3 | 0.39 % | 10.15 | 9 | 0.68 % | 25.48 | 4 | 0.93 % | 25.66 |
| konec | konec | konec | | 21 | 0.65 % | 20.29 | 0 | 0 % | 0 | 5 | 0.65 % | 16.92 | 14 | 1.06 % | 39.64 | 2 | 0.46 % | 12.83 |
| nasproti | nasproti | naspr | oti | 21 | 0.65 % | 20.29 | 1 | 0.14 % | 4.34 | 12 | 1.57 % | 40.60 | 5 | 0.38 % | 14.16 | 3 | 0.70 % | 19.24 |
| izpred | izpred | izpre | d | 16 | 0.50 % | 15.46 | 2 | 0.29 % | 8.68 | 5 | 0.65 % | 16.92 | 9 | 0.68 % | 25.48 | 0 | 0 % | 0 |
| bližje | bližje | bližj | e | 12 | 0.37 % | 11.59 | 2 | 0.29 % | 8.68 | 4 | 0.52 % | 13.53 | 5 | 0.38 % | 14.16 | 1 | 0.23 % | 6.41 |
| zoper | zoper | zoper | | 11 | 0.34 % | 10.63 | 0 | 0 % | 0 | 0 | 0 % | 0 | 11 | 0.83 % | 31.15 | 0 | 0 % | 0 |
| bliže | bliže | bliže | | 10 | 0.31 % | 9.66 | 6 | 0.86 % | 26.03 | 0 | 0 % | 0 | 4 | 0.30 % | 11.33 | 0 | 0 % | 0 |
| zavoljo | zavoljo | zavol | jo | 7 | 0.22 % | 6.76 | 1 | 0.14 % | 4.34 | 1 | 0.13 % | 3.38 | 5 | 0.38 % | 14.16 | 0 | 0 % | 0 |
| najbližje | najbližje | najbl | ižje | 6 | 0.19 % | 5.80 | 3 | 0.43 % | 13.01 | 1 | 0.13 % | 3.38 | 1 | 0.07 % | 2.83 | 1 | 0.23 % | 6.41 |
| naokoli | naokoli | naoko | li | 5 | 0.15 % | 4.83 | 0 | 0 % | 0 | 4 | 0.52 % | 13.53 | 1 | 0.07 % | 2.83 | 0 | 0 % | 0 |
| napram | napram | napra | m | 5 | 0.15 % | 4.83 | 0 | 0 % | 0 | 3 | 0.39 % | 10.15 | 2 | 0.15 % | 5.66 | 0 | 0 % | 0 |
| spričo | spričo | sprič | o | 5 | 0.15 % | 4.83 | 1 | 0.14 % | 4.34 | 0 | 0 % | 0 | 4 | 0.30 % | 11.33 | 0 | 0 % | 0 |
| vzdolž | vzdolž | vzdol | ž | 4 | 0.12 % | 3.86 | 0 | 0 % | 0 | 0 | 0 % | 0 | 4 | 0.30 % | 11.33 | 0 | 0 % | 0 |
| tekom | tekom | tekom | | 3 | 0.09 % | 2.90 | 1 | 0.14 % | 4.34 | 0 | 0 % | 0 | 1 | 0.07 % | 2.83 | 1 | 0.23 % | 6.41 |

2.2.216 List of final character-level 1-grams from preposition lemmas in the GOS 1.0 corpus with text-type distribution

File at CLARIN.SI: GOS1.0-word_parts-prepositions-lemmas-final-1grams-taxonomy-entire.tsv

| Lemma | Lemma (lower-case) | Rest of the word | Final part of the word | Total absolute frequency of lemma | Percentage of all found lemmas | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N] |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| v | v | | v | 17,583 | 25.92 % | 16,986.65 | 4,042 | 25.13 % | 17,534.04 | 3,793 | 24.31 % | 12,834.05 | 7,400 | 27.40 % | 20,954.33 | 2,348 | 25.69 % | 15,061.61 |
| na | na | n | a | 11,921 | 17.57 % | 11,516.68 | 2,989 | 18.58 % | 12,966.17 | 2,882 | 18.47 % | 9,751.58 | 4,348 | 16.10 % | 12,312.08 | 1,702 | 18.62 % | 10,917.74 |
| za | za | z | a | 7,872 | 11.60 % | 7,605.01 | 1,776 | 11.04 % | 7,704.22 | 1,787 | 11.45 % | 6,046.52 | 3,129 | 11.59 % | 8,860.28 | 1,180 | 12.91 % | 7,569.29 |
| z | z | | z | 7,663 | 11.30 % | 7,403.10 | 1,894 | 11.77 % | 8,216.10 | 1,564 | 10.02 % | 5,291.97 | 3,134 | 11.60 % | 8,874.44 | 1,071 | 11.72 % | 6,870.10 |
| po | po | p | o | 3,337 | 4.92 % | 3,223.82 | 779 | 4.84 % | 3,379.27 | 981 | 6.29 % | 3,319.33 | 1,116 | 4.13 % | 3,160.14 | 461 | 5.04 % | 2,957.16 |
| od | od | o | d | 2,510 | 3.70 % | 2,424.87 | 594 | 3.69 % | 2,576.75 | 674 | 4.32 % | 2,280.56 | 942 | 3.49 % | 2,667.43 | 300 | 3.28 % | 1,924.40 |
| pri | pri | pr | i | 2,371 | 3.50 % | 2,290.58 | 443 | 2.75 % | 1,921.72 | 596 | 3.82 % | 2,016.63 | 929 | 3.44 % | 2,630.62 | 403 | 4.41 % | 2,585.11 |
| do | do | d | o | 2,126 | 3.13 % | 2,053.89 | 481 | 2.99 % | 2,086.56 | 448 | 2.87 % | 1,515.86 | 879 | 3.25 % | 2,489.03 | 318 | 3.48 % | 2,039.86 |
| o | o | | o | 2,093 | 3.08 % | 2,022.01 | 483 | 3.00 % | 2,095.24 | 321 | 2.06 % | 1,086.14 | 1,076 | 3.98 % | 3,046.87 | 213 | 2.33 % | 1,366.32 |
| iz | iz | i | z | 1,944 | 2.87 % | 1,878.07 | 455 | 2.83 % | 1,973.77 | 436 | 2.79 % | 1,475.26 | 888 | 3.29 % | 2,514.52 | 165 | 1.80 % | 1,058.42 |
| ob | ob | o | b | 896 | 1.32 % | 865.61 | 291 | 1.81 % | 1,262.35 | 259 | 1.66 % | 876.36 | 282 | 1.04 % | 798.53 | 64 | 0.70 % | 410.54 |
| pred | pred | pre | d | 789 | 1.16 % | 762.24 | 284 | 1.77 % | 1,231.98 | 148 | 0.95 % | 500.77 | 311 | 1.15 % | 880.65 | 46 | 0.50 % | 295.07 |
| zaradi | zaradi | zarad | i | 786 | 1.16 % | 759.34 | 135 | 0.84 % | 585.62 | 143 | 0.92 % | 483.86 | 398 | 1.47 % | 1,127 | 110 | 1.20 % | 705.61 |
| med | med | me | d | 781 | 1.15 % | 754.51 | 192 | 1.19 % | 832.89 | 100 | 0.64 % | 338.36 | 409 | 1.51 % | 1,158.15 | 80 | 0.88 % | 513.17 |
| k | k | | k | 615 | 0.91 % | 594.14 | 127 | 0.79 % | 550.92 | 176 | 1.13 % | 595.52 | 227 | 0.84 % | 642.79 | 85 | 0.93 % | 545.25 |
| čez | čez | če | z | 587 | 0.86 % | 567.09 | 211 | 1.31 % | 915.31 | 198 | 1.27 % | 669.96 | 125 | 0.46 % | 353.96 | 53 | 0.58 % | 339.98 |
| brez | brez | bre | z | 463 | 0.68 % | 447.30 | 133 | 0.83 % | 576.95 | 118 | 0.76 % | 399.27 | 143 | 0.53 % | 404.93 | 69 | 0.76 % | 442.61 |
| pod | pod | po | d | 340 | 0.50 % | 328.47 | 61 | 0.38 % | 264.62 | 93 | 0.60 % | 314.68 | 147 | 0.54 % | 416.25 | 39 | 0.43 % | 250.17 |
| proti | proti | prot | i | 323 | 0.48 % | 312.05 | 123 | 0.77 % | 533.57 | 63 | 0.40 % | 213.17 | 114 | 0.42 % | 322.81 | 23 | 0.25 % | 147.54 |
| zraven | zraven | zrave | n | 297 | 0.44 % | 286.93 | 54 | 0.34 % | 234.25 | 130 | 0.83 % | 439.87 | 48 | 0.18 % | 135.92 | 65 | 0.71 % | 416.95 |
| skoz | skoz | sko | z | 213 | 0.31 % | 205.78 | 15 | 0.09 % | 65.07 | 130 | 0.83 % | 439.87 | 14 | 0.05 % | 39.64 | 54 | 0.59 % | 346.39 |
| skozi | skozi | skoz | i | 212 | 0.31 % | 204.81 | 50 | 0.31 % | 216.90 | 60 | 0.39 % | 203.02 | 92 | 0.34 % | 260.51 | 10 | 0.11 % | 64.15 |
| okoli | okoli | okol | i | 183 | 0.27 % | 176.79 | 38 | 0.24 % | 164.84 | 89 | 0.57 % | 301.14 | 41 | 0.15 % | 116.10 | 15 | 0.16 % | 96.22 |
| nad | nad | na | d | 162 | 0.24 % | 156.51 | 41 | 0.26 % | 177.86 | 24 | 0.15 % | 81.21 | 84 | 0.31 % | 237.86 | 13 | 0.14 % | 83.39 |
| preko | preko | prek | o | 145 | 0.21 % | 140.08 | 43 | 0.27 % | 186.53 | 14 | 0.09 % | 47.37 | 73 | 0.27 % | 206.71 | 15 | 0.16 % | 96.22 |
| glede | glede | gled | e | 132 | 0.20 % | 127.52 | 17 | 0.11 % | 73.75 | 18 | 0.12 % | 60.91 | 48 | 0.18 % | 135.92 | 49 | 0.54 % | 314.32 |
| okrog | okrog | okro | g | 125 | 0.18 % | 120.76 | 19 | 0.12 % | 82.42 | 28 | 0.18 % | 94.74 | 55 | 0.20 % | 155.74 | 23 | 0.25 % | 147.54 |
| mimo | mimo | mim | o | 117 | 0.17 % | 113.03 | 42 | 0.26 % | 182.19 | 38 | 0.24 % | 128.58 | 28 | 0.10 % | 79.29 | 9 | 0.10 % | 57.73 |
| poleg | poleg | pole | g | 112 | 0.17 % | 108.20 | 29 | 0.18 % | 125.80 | 20 | 0.13 % | 67.67 | 53 | 0.20 % | 150.08 | 10 | 0.11 % | 64.15 |
| kljub | kljub | klju | b | 99 | 0.15 % | 95.64 | 16 | 0.10 % | 69.41 | 4 | 0.03 % | 13.53 | 67 | 0.25 % | 189.72 | 12 | 0.13 % | 76.98 |
| razen | razen | raze | n | 98 | 0.14 % | 94.68 | 14 | 0.09 % | 60.73 | 29 | 0.19 % | 98.12 | 35 | 0.13 % | 99.11 | 20 | 0.22 % | 128.29 |
| prek | prek | pre | k | 97 | 0.14 % | 93.71 | 15 | 0.09 % | 65.07 | 43 | 0.28 % | 145.50 | 26 | 0.10 % | 73.62 | 13 | 0.14 % | 83.39 |

2.2.217 List of final character-level 2-grams from preposition lemmas in the GOS 1.0 corpus with text-type distribution

File at CLARIN.SI: GOS1.0-word_parts-prepositions-lemmas-final-2grams-taxonomy-entire.tsv

| Lemma | Lemma (lower-case) | Rest of the word | Final part of the word | Total absolute frequency of lemma | Percentage of all found lemmas | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N] |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| na | na | | na | 11,921 | 29.89 % | 11,516.68 | 2,989 | 31.33 % | 12,966.17 | 2,882 | 29.57 % | 9,751.58 | 4,348 | 28.67 % | 12,312.08 | 1,702 | 31.38 % | 10,917.74 |
| za | za | | za | 7,872 | 19.74 % | 7,605.01 | 1,776 | 18.62 % | 7,704.22 | 1,787 | 18.34 % | 6,046.52 | 3,129 | 20.63 % | 8,860.28 | 1,180 | 21.75 % | 7,569.29 |
| po | po | | po | 3,337 | 8.37 % | 3,223.82 | 779 | 8.17 % | 3,379.27 | 981 | 10.07 % | 3,319.33 | 1,116 | 7.36 % | 3,160.14 | 461 | 8.50 % | 2,957.16 |
| od | od | | od | 2,510 | 6.29 % | 2,424.87 | 594 | 6.23 % | 2,576.75 | 674 | 6.92 % | 2,280.56 | 942 | 6.21 % | 2,667.43 | 300 | 5.53 % | 1,924.40 |
| pri | pri | p | ri | 2,371 | 5.95 % | 2,290.58 | 443 | 4.64 % | 1,921.72 | 596 | 6.12 % | 2,016.63 | 929 | 6.12 % | 2,630.62 | 403 | 7.43 % | 2,585.11 |
| do | do | | do | 2,126 | 5.33 % | 2,053.89 | 481 | 5.04 % | 2,086.56 | 448 | 4.60 % | 1,515.86 | 879 | 5.79 % | 2,489.03 | 318 | 5.86 % | 2,039.86 |
| iz | iz | | iz | 1,944 | 4.88 % | 1,878.07 | 455 | 4.77 % | 1,973.77 | 436 | 4.47 % | 1,475.26 | 888 | 5.86 % | 2,514.52 | 165 | 3.04 % | 1,058.42 |
| ob | ob | | ob | 896 | 2.25 % | 865.61 | 291 | 3.05 % | 1,262.35 | 259 | 2.66 % | 876.36 | 282 | 1.86 % | 798.53 | 64 | 1.18 % | 410.54 |
| pred | pred | pr | ed | 789 | 1.98 % | 762.24 | 284 | 2.98 % | 1,231.98 | 148 | 1.52 % | 500.77 | 311 | 2.05 % | 880.65 | 46 | 0.85 % | 295.07 |
| zaradi | zaradi | zara | di | 786 | 1.97 % | 759.34 | 135 | 1.42 % | 585.62 | 143 | 1.47 % | 483.86 | 398 | 2.62 % | 1,127 | 110 | 2.03 % | 705.61 |
| med | med | m | ed | 781 | 1.96 % | 754.51 | 192 | 2.01 % | 832.89 | 100 | 1.03 % | 338.36 | 409 | 2.70 % | 1,158.15 | 80 | 1.48 % | 513.17 |
| čez | čez | č | ez | 587 | 1.47 % | 567.09 | 211 | 2.21 % | 915.31 | 198 | 2.03 % | 669.96 | 125 | 0.82 % | 353.96 | 53 | 0.98 % | 339.98 |
| brez | brez | br | ez | 463 | 1.16 % | 447.30 | 133 | 1.39 % | 576.95 | 118 | 1.21 % | 399.27 | 143 | 0.94 % | 404.93 | 69 | 1.27 % | 442.61 |
| pod | pod | p | od | 340 | 0.85 % | 328.47 | 61 | 0.64 % | 264.62 | 93 | 0.95 % | 314.68 | 147 | 0.97 % | 416.25 | 39 | 0.72 % | 250.17 |
| proti | proti | pro | ti | 323 | 0.81 % | 312.05 | 123 | 1.29 % | 533.57 | 63 | 0.65 % | 213.17 | 114 | 0.75 % | 322.81 | 23 | 0.42 % | 147.54 |
| zraven | zraven | zrav | en | 297 | 0.74 % | 286.93 | 54 | 0.57 % | 234.25 | 130 | 1.33 % | 439.87 | 48 | 0.32 % | 135.92 | 65 | 1.20 % | 416.95 |
| skoz | skoz | sk | oz | 213 | 0.53 % | 205.78 | 15 | 0.16 % | 65.07 | 130 | 1.33 % | 439.87 | 14 | 0.09 % | 39.64 | 54 | 1.00 % | 346.39 |
| skozi | skozi | sko | zi | 212 | 0.53 % | 204.81 | 50 | 0.52 % | 216.90 | 60 | 0.62 % | 203.02 | 92 | 0.61 % | 260.51 | 10 | 0.18 % | 64.15 |
| okoli | okoli | oko | li | 183 | 0.46 % | 176.79 | 38 | 0.40 % | 164.84 | 89 | 0.91 % | 301.14 | 41 | 0.27 % | 116.10 | 15 | 0.28 % | 96.22 |
| nad | nad | n | ad | 162 | 0.41 % | 156.51 | 41 | 0.43 % | 177.86 | 24 | 0.25 % | 81.21 | 84 | 0.55 % | 237.86 | 13 | 0.24 % | 83.39 |
| preko | preko | pre | ko | 145 | 0.36 % | 140.08 | 43 | 0.45 % | 186.53 | 14 | 0.14 % | 47.37 | 73 | 0.48 % | 206.71 | 15 | 0.28 % | 96.22 |
| glede | glede | gle | de | 132 | 0.33 % | 127.52 | 17 | 0.18 % | 73.75 | 18 | 0.18 % | 60.91 | 48 | 0.32 % | 135.92 | 49 | 0.90 % | 314.32 |
| okrog | okrog | okr | og | 125 | 0.31 % | 120.76 | 19 | 0.20 % | 82.42 | 28 | 0.29 % | 94.74 | 55 | 0.36 % | 155.74 | 23 | 0.42 % | 147.54 |
| mimo | mimo | mi | mo | 117 | 0.29 % | 113.03 | 42 | 0.44 % | 182.19 | 38 | 0.39 % | 128.58 | 28 | 0.18 % | 79.29 | 9 | 0.17 % | 57.73 |
| poleg | poleg | pol | eg | 112 | 0.28 % | 108.20 | 29 | 0.30 % | 125.80 | 20 | 0.20 % | 67.67 | 53 | 0.35 % | 150.08 | 10 | 0.18 % | 64.15 |
| kljub | kljub | klj | ub | 99 | 0.25 % | 95.64 | 16 | 0.17 % | 69.41 | 4 | 0.04 % | 13.53 | 67 | 0.44 % | 189.72 | 12 | 0.22 % | 76.98 |
| razen | razen | raz | en | 98 | 0.25 % | 94.68 | 14 | 0.15 % | 60.73 | 29 | 0.30 % | 98.12 | 35 | 0.23 % | 99.11 | 20 | 0.37 % | 128.29 |
| prek | prek | pr | ek | 97 | 0.24 % | 93.71 | 15 | 0.16 % | 65.07 | 43 | 0.44 % | 145.50 | 26 | 0.17 % | 73.62 | 13 | 0.24 % | 83.39 |
| blizu | blizu | bli | zu | 96 | 0.24 % | 92.74 | 19 | 0.20 % | 82.42 | 31 | 0.32 % | 104.89 | 36 | 0.24 % | 101.94 | 10 | 0.18 % | 64.15 |
| zunaj | zunaj | zun | aj | 90 | 0.23 % | 86.95 | 14 | 0.15 % | 60.73 | 30 | 0.31 % | 101.51 | 25 | 0.17 % | 70.79 | 21 | 0.39 % | 134.71 |
| izmed | izmed | izm | ed | 89 | 0.22 % | 85.98 | 46 | 0.48 % | 199.55 | 5 | 0.05 % | 16.92 | 37 | 0.24 % | 104.77 | 1 | 0.02 % | 6.41 |
| namesto | namesto | names | to | 84 | 0.21 % | 81.15 | 15 | 0.16 % | 65.07 | 19 | 0.20 % | 64.29 | 35 | 0.23 % | 99.11 | 15 | 0.28 % | 96.22 |

2.2.218 List of final character-level 3-grams from preposition lemmas in the GOS 1.0 corpus with text-type distribution

File at CLARIN.SI: GOS1.0-word_parts-prepositions-lemmas-final-3grams-taxonomy-entire.tsv

| Lemma | Lemma (lower-case) | Rest of the word | Final part of the word | Total absolute frequency of lemma | Percentage of all found lemmas | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N] |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| pri | pri | | pri | 2,371 | 25.57 % | 2,290.58 | 443 | 20.37 % | 1,921.72 | 596 | 26.15 % | 2,016.63 | 929 | 25.93 % | 2,630.62 | 403 | 32.66 % | 2,585.11 |
| pred | pred | p | red | 789 | 8.51 % | 762.24 | 284 | 13.06 % | 1,231.98 | 148 | 6.49 % | 500.77 | 311 | 8.68 % | 880.65 | 46 | 3.73 % | 295.07 |
| zaradi | zaradi | zar | adi | 786 | 8.48 % | 759.34 | 135 | 6.21 % | 585.62 | 143 | 6.28 % | 483.86 | 398 | 11.11 % | 1,127 | 110 | 8.91 % | 705.61 |
| med | med | | med | 781 | 8.42 % | 754.51 | 192 | 8.83 % | 832.89 | 100 | 4.39 % | 338.36 | 409 | 11.41 % | 1,158.15 | 80 | 6.48 % | 513.17 |
| čez | čez | | čez | 587 | 6.33 % | 567.09 | 211 | 9.70 % | 915.31 | 198 | 8.69 % | 669.96 | 125 | 3.49 % | 353.96 | 53 | 4.29 % | 339.98 |
| brez | brez | b | rez | 463 | 4.99 % | 447.30 | 133 | 6.12 % | 576.95 | 118 | 5.18 % | 399.27 | 143 | 3.99 % | 404.93 | 69 | 5.59 % | 442.61 |
| pod | pod | | pod | 340 | 3.67 % | 328.47 | 61 | 2.81 % | 264.62 | 93 | 4.08 % | 314.68 | 147 | 4.10 % | 416.25 | 39 | 3.16 % | 250.17 |
| proti | proti | pr | oti | 323 | 3.48 % | 312.05 | 123 | 5.66 % | 533.57 | 63 | 2.76 % | 213.17 | 114 | 3.18 % | 322.81 | 23 | 1.86 % | 147.54 |
| zraven | zraven | zra | ven | 297 | 3.20 % | 286.93 | 54 | 2.48 % | 234.25 | 130 | 5.70 % | 439.87 | 48 | 1.34 % | 135.92 | 65 | 5.27 % | 416.95 |
| skoz | skoz | s | koz | 213 | 2.30 % | 205.78 | 15 | 0.69 % | 65.07 | 130 | 5.70 % | 439.87 | 14 | 0.39 % | 39.64 | 54 | 4.38 % | 346.39 |
| skozi | skozi | sk | ozi | 212 | 2.29 % | 204.81 | 50 | 2.30 % | 216.90 | 60 | 2.63 % | 203.02 | 92 | 2.57 % | 260.51 | 10 | 0.81 % | 64.15 |
| okoli | okoli | ok | oli | 183 | 1.97 % | 176.79 | 38 | 1.75 % | 164.84 | 89 | 3.90 % | 301.14 | 41 | 1.14 % | 116.10 | 15 | 1.22 % | 96.22 |
| nad | nad | | nad | 162 | 1.75 % | 156.51 | 41 | 1.89 % | 177.86 | 24 | 1.05 % | 81.21 | 84 | 2.34 % | 237.86 | 13 | 1.05 % | 83.39 |
| preko | preko | pr | eko | 145 | 1.56 % | 140.08 | 43 | 1.98 % | 186.53 | 14 | 0.61 % | 47.37 | 73 | 2.04 % | 206.71 | 15 | 1.22 % | 96.22 |
| glede | glede | gl | ede | 132 | 1.42 % | 127.52 | 17 | 0.78 % | 73.75 | 18 | 0.79 % | 60.91 | 48 | 1.34 % | 135.92 | 49 | 3.97 % | 314.32 |
| okrog | okrog | ok | rog | 125 | 1.35 % | 120.76 | 19 | 0.87 % | 82.42 | 28 | 1.23 % | 94.74 | 55 | 1.53 % | 155.74 | 23 | 1.86 % | 147.54 |
| mimo | mimo | m | imo | 117 | 1.26 % | 113.03 | 42 | 1.93 % | 182.19 | 38 | 1.67 % | 128.58 | 28 | 0.78 % | 79.29 | 9 | 0.73 % | 57.73 |
| poleg | poleg | po | leg | 112 | 1.21 % | 108.20 | 29 | 1.33 % | 125.80 | 20 | 0.88 % | 67.67 | 53 | 1.48 % | 150.08 | 10 | 0.81 % | 64.15 |
| kljub | kljub | kl | jub | 99 | 1.07 % | 95.64 | 16 | 0.74 % | 69.41 | 4 | 0.18 % | 13.53 | 67 | 1.87 % | 189.72 | 12 | 0.97 % | 76.98 |
| razen | razen | ra | zen | 98 | 1.06 % | 94.68 | 14 | 0.64 % | 60.73 | 29 | 1.27 % | 98.12 | 35 | 0.98 % | 99.11 | 20 | 1.62 % | 128.29 |
| prek | prek | p | rek | 97 | 1.05 % | 93.71 | 15 | 0.69 % | 65.07 | 43 | 1.89 % | 145.50 | 26 | 0.73 % | 73.62 | 13 | 1.05 % | 83.39 |
| blizu | blizu | bl | izu | 96 | 1.03 % | 92.74 | 19 | 0.87 % | 82.42 | 31 | 1.36 % | 104.89 | 36 | 1.00 % | 101.94 | 10 | 0.81 % | 64.15 |
| zunaj | zunaj | zu | naj | 90 | 0.97 % | 86.95 | 14 | 0.64 % | 60.73 | 30 | 1.32 % | 101.51 | 25 | 0.70 % | 70.79 | 21 | 1.70 % | 134.71 |
| izmed | izmed | iz | med | 89 | 0.96 % | 85.98 | 46 | 2.12 % | 199.55 | 5 | 0.22 % | 16.92 | 37 | 1.03 % | 104.77 | 1 | 0.08 % | 6.41 |
| namesto | namesto | name | sto | 84 | 0.91 % | 81.15 | 15 | 0.69 % | 65.07 | 19 | 0.83 % | 64.29 | 35 | 0.98 % | 99.11 | 15 | 1.22 % | 96.22 |
| kasneje | kasneje | kasn | eje | 75 | 0.81 % | 72.46 | 25 | 1.15 % | 108.45 | 11 | 0.48 % | 37.22 | 34 | 0.95 % | 96.28 | 5 | 0.41 % | 32.07 |
| sredi | sredi | sr | edi | 58 | 0.63 % | 56.03 | 10 | 0.46 % | 43.38 | 27 | 1.19 % | 91.36 | 14 | 0.39 % | 39.64 | 7 | 0.57 % | 44.90 |
| znotraj | znotraj | znot | raj | 53 | 0.57 % | 51.20 | 3 | 0.14 % | 13.01 | 2 | 0.09 % | 6.77 | 40 | 1.12 % | 113.27 | 8 | 0.65 % | 51.32 |
| vrhu | vrhu | v | rhu | 46 | 0.50 % | 44.44 | 21 | 0.97 % | 91.10 | 7 | 0.31 % | 23.69 | 16 | 0.45 % | 45.31 | 2 | 0.16 % | 12.83 |
| vrh | vrh | | vrh | 40 | 0.43 % | 38.64 | 13 | 0.60 % | 56.39 | 9 | 0.40 % | 30.45 | 8 | 0.22 % | 22.65 | 10 | 0.81 % | 64.15 |
| krog | krog | k | rog | 29 | 0.31 % | 28.02 | 5 | 0.23 % | 21.69 | 3 | 0.13 % | 10.15 | 16 | 0.45 % | 45.31 | 5 | 0.41 % | 32.07 |
| izven | izven | iz | ven | 21 | 0.23 % | 20.29 | 5 | 0.23 % | 21.69 | 3 | 0.13 % | 10.15 | 9 | 0.25 % | 25.48 | 4 | 0.32 % | 25.66 |

2.2.219 List of final character-level 4-grams from preposition lemmas in the GOS 1.0 corpus with text-type distribution

File at CLARIN.SI: GOS1.0-word_parts-prepositions-lemmas-final-4grams-taxonomy-entire.tsv

| Lemma | Lemma (lower-case) | Rest of the word | Final part of the word | Total absolute frequency of lemma | Percentage of all found lemmas | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N] |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| pred | pred | | pred | 789 | 15.84 % | 762.24 | 284 | 23.47 % | 1,231.98 | 148 | 11.80 % | 500.77 | 311 | 16.53 % | 880.65 | 46 | 7.23 % | 295.07 |
| zaradi | zaradi | za | radi | 786 | 15.78 % | 759.34 | 135 | 11.16 % | 585.62 | 143 | 11.40 % | 483.86 | 398 | 21.16 % | 1,127 | 110 | 17.30 % | 705.61 |
| brez | brez | | brez | 463 | 9.29 % | 447.30 | 133 | 10.99 % | 576.95 | 118 | 9.41 % | 399.27 | 143 | 7.60 % | 404.93 | 69 | 10.85 % | 442.61 |
| proti | proti | p | roti | 323 | 6.49 % | 312.05 | 123 | 10.16 % | 533.57 | 63 | 5.02 % | 213.17 | 114 | 6.06 % | 322.81 | 23 | 3.62 % | 147.54 |
| zraven | zraven | zr | aven | 297 | 5.96 % | 286.93 | 54 | 4.46 % | 234.25 | 130 | 10.37 % | 439.87 | 48 | 2.55 % | 135.92 | 65 | 10.22 % | 416.95 |
| skoz | skoz | | skoz | 213 | 4.28 % | 205.78 | 15 | 1.24 % | 65.07 | 130 | 10.37 % | 439.87 | 14 | 0.74 % | 39.64 | 54 | 8.49 % | 346.39 |
| skozi | skozi | s | kozi | 212 | 4.26 % | 204.81 | 50 | 4.13 % | 216.90 | 60 | 4.79 % | 203.02 | 92 | 4.89 % | 260.51 | 10 | 1.57 % | 64.15 |
| okoli | okoli | o | koli | 183 | 3.67 % | 176.79 | 38 | 3.14 % | 164.84 | 89 | 7.10 % | 301.14 | 41 | 2.18 % | 116.10 | 15 | 2.36 % | 96.22 |
| preko | preko | p | reko | 145 | 2.91 % | 140.08 | 43 | 3.55 % | 186.53 | 14 | 1.12 % | 47.37 | 73 | 3.88 % | 206.71 | 15 | 2.36 % | 96.22 |
| glede | glede | g | lede | 132 | 2.65 % | 127.52 | 17 | 1.41 % | 73.75 | 18 | 1.44 % | 60.91 | 48 | 2.55 % | 135.92 | 49 | 7.70 % | 314.32 |
| okrog | okrog | o | krog | 125 | 2.51 % | 120.76 | 19 | 1.57 % | 82.42 | 28 | 2.23 % | 94.74 | 55 | 2.92 % | 155.74 | 23 | 3.62 % | 147.54 |
| mimo | mimo | | mimo | 117 | 2.35 % | 113.03 | 42 | 3.47 % | 182.19 | 38 | 3.03 % | 128.58 | 28 | 1.49 % | 79.29 | 9 | 1.42 % | 57.73 |
| poleg | poleg | p | oleg | 112 | 2.25 % | 108.20 | 29 | 2.40 % | 125.80 | 20 | 1.59 % | 67.67 | 53 | 2.82 % | 150.08 | 10 | 1.57 % | 64.15 |
| kljub | kljub | k | ljub | 99 | 1.99 % | 95.64 | 16 | 1.32 % | 69.41 | 4 | 0.32 % | 13.53 | 67 | 3.56 % | 189.72 | 12 | 1.89 % | 76.98 |
| razen | razen | r | azen | 98 | 1.97 % | 94.68 | 14 | 1.16 % | 60.73 | 29 | 2.31 % | 98.12 | 35 | 1.86 % | 99.11 | 20 | 3.15 % | 128.29 |
| prek | prek | | prek | 97 | 1.95 % | 93.71 | 15 | 1.24 % | 65.07 | 43 | 3.43 % | 145.50 | 26 | 1.38 % | 73.62 | 13 | 2.04 % | 83.39 |
| blizu | blizu | b | lizu | 96 | 1.93 % | 92.74 | 19 | 1.57 % | 82.42 | 31 | 2.47 % | 104.89 | 36 | 1.91 % | 101.94 | 10 | 1.57 % | 64.15 |
| zunaj | zunaj | z | unaj | 90 | 1.81 % | 86.95 | 14 | 1.16 % | 60.73 | 30 | 2.39 % | 101.51 | 25 | 1.33 % | 70.79 | 21 | 3.30 % | 134.71 |
| izmed | izmed | i | zmed | 89 | 1.79 % | 85.98 | 46 | 3.80 % | 199.55 | 5 | 0.40 % | 16.92 | 37 | 1.97 % | 104.77 | 1 | 0.16 % | 6.41 |
| namesto | namesto | nam | esto | 84 | 1.69 % | 81.15 | 15 | 1.24 % | 65.07 | 19 | 1.51 % | 64.29 | 35 | 1.86 % | 99.11 | 15 | 2.36 % | 96.22 |
| kasneje | kasneje | kas | neje | 75 | 1.51 % | 72.46 | 25 | 2.07 % | 108.45 | 11 | 0.88 % | 37.22 | 34 | 1.81 % | 96.28 | 5 | 0.79 % | 32.07 |
| sredi | sredi | s | redi | 58 | 1.16 % | 56.03 | 10 | 0.83 % | 43.38 | 27 | 2.15 % | 91.36 | 14 | 0.74 % | 39.64 | 7 | 1.10 % | 44.90 |
| znotraj | znotraj | zno | traj | 53 | 1.06 % | 51.20 | 3 | 0.25 % | 13.01 | 2 | 0.16 % | 6.77 | 40 | 2.13 % | 113.27 | 8 | 1.26 % | 51.32 |
| vrhu | vrhu | | vrhu | 46 | 0.92 % | 44.44 | 21 | 1.74 % | 91.10 | 7 | 0.56 % | 23.69 | 16 | 0.85 % | 45.31 | 2 | 0.31 % | 12.83 |
| krog | krog | | krog | 29 | 0.58 % | 28.02 | 5 | 0.41 % | 21.69 | 3 | 0.24 % | 10.15 | 16 | 0.85 % | 45.31 | 5 | 0.79 % | 32.07 |
| izven | izven | i | zven | 21 | 0.42 % | 20.29 | 5 | 0.41 % | 21.69 | 3 | 0.24 % | 10.15 | 9 | 0.48 % | 25.48 | 4 | 0.63 % | 25.66 |
| konec | konec | k | onec | 21 | 0.42 % | 20.29 | 0 | 0 % | 0 | 5 | 0.40 % | 16.92 | 14 | 0.74 % | 39.64 | 2 | 0.31 % | 12.83 |
| nasproti | nasproti | nasp | roti | 21 | 0.42 % | 20.29 | 1 | 0.08 % | 4.34 | 12 | 0.96 % | 40.60 | 5 | 0.27 % | 14.16 | 3 | 0.47 % | 19.24 |
| izpred | izpred | iz | pred | 16 | 0.32 % | 15.46 | 2 | 0.17 % | 8.68 | 5 | 0.40 % | 16.92 | 9 | 0.48 % | 25.48 | 0 | 0 % | 0 |
| bližje | bližje | bl | ižje | 12 | 0.24 % | 11.59 | 2 | 0.17 % | 8.68 | 4 | 0.32 % | 13.53 | 5 | 0.27 % | 14.16 | 1 | 0.16 % | 6.41 |
| zoper | zoper | z | oper | 11 | 0.22 % | 10.63 | 0 | 0 % | 0 | 0 | 0 % | 0 | 11 | 0.58 % | 31.15 | 0 | 0 % | 0 |
| bliže | bliže | b | liže | 10 | 0.20 % | 9.66 | 6 | 0.50 % | 26.03 | 0 | 0 % | 0 | 4 | 0.21 % | 11.33 | 0 | 0 % | 0 |

2.2.220 List of final character-level 5-grams from preposition lemmas in the GOS 1.0 corpus with text-type distribution

File at CLARIN.SI: GOS1.0-word_parts-prepositions-lemmas-final-5grams-taxonomy-entire.tsv

| Lemma | Lemma (lower-case) | Rest of the word | Final part of the word | Total absolute frequency of lemma | Percentage of all found lemmas | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N] |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| zaradi | zaradi | z | aradi | 786 | 24.43 % | 759.34 | 135 | 19.42 % | 585.62 | 143 | 18.67 % | 483.86 | 398 | 30.02 % | 1,127 | 110 | 25.52 % | 705.61 |
| proti | proti | | proti | 323 | 10.04 % | 312.05 | 123 | 17.70 % | 533.57 | 63 | 8.22 % | 213.17 | 114 | 8.60 % | 322.81 | 23 | 5.34 % | 147.54 |
| zraven | zraven | z | raven | 297 | 9.23 % | 286.93 | 54 | 7.77 % | 234.25 | 130 | 16.97 % | 439.87 | 48 | 3.62 % | 135.92 | 65 | 15.08 % | 416.95 |
| skozi | skozi | | skozi | 212 | 6.59 % | 204.81 | 50 | 7.19 % | 216.90 | 60 | 7.83 % | 203.02 | 92 | 6.94 % | 260.51 | 10 | 2.32 % | 64.15 |
| okoli | okoli | | okoli | 183 | 5.69 % | 176.79 | 38 | 5.47 % | 164.84 | 89 | 11.62 % | 301.14 | 41 | 3.09 % | 116.10 | 15 | 3.48 % | 96.22 |
| preko | preko | | preko | 145 | 4.51 % | 140.08 | 43 | 6.19 % | 186.53 | 14 | 1.83 % | 47.37 | 73 | 5.50 % | 206.71 | 15 | 3.48 % | 96.22 |
| glede | glede | | glede | 132 | 4.10 % | 127.52 | 17 | 2.45 % | 73.75 | 18 | 2.35 % | 60.91 | 48 | 3.62 % | 135.92 | 49 | 11.37 % | 314.32 |
| okrog | okrog | | okrog | 125 | 3.88 % | 120.76 | 19 | 2.73 % | 82.42 | 28 | 3.65 % | 94.74 | 55 | 4.15 % | 155.74 | 23 | 5.34 % | 147.54 |
| poleg | poleg | | poleg | 112 | 3.48 % | 108.20 | 29 | 4.17 % | 125.80 | 20 | 2.61 % | 67.67 | 53 | 4.00 % | 150.08 | 10 | 2.32 % | 64.15 |
| kljub | kljub | | kljub | 99 | 3.08 % | 95.64 | 16 | 2.30 % | 69.41 | 4 | 0.52 % | 13.53 | 67 | 5.05 % | 189.72 | 12 | 2.78 % | 76.98 |
razen razen razen 98 3.04 % 94.68
14 2.01 % 60.73 29 3.79 % 98.12 35 2.64 % 99.11 20 4.64 % 128.29 blizu blizu blizu 96 2.98 % 92.74 19 2.73 % 82.42 31 4.05 % 104.89 36 2.71 % 101.94 10 2.32 % 64.15 zunaj zunaj zunaj 90 2.80 % 86.95 14 2.01 % 60.73 30 3.92 % 101.51 25 1.89 % 70.79 21 4.87 % 134.71 izmed izmed izmed 89 2.77 % 85.98 46 6.62 % 199.55 5 0.65 % 16.92 37 2.79 % 104.77 1 0.23 % 6.41 namesto namesto na mesto 84 2.61 % 81.15 15 2.16 % 65.07 19 2.48 % 64.29 35 2.64 % 99.11 15 3.48 % 96.22 kasneje kasneje ka sneje 75 2.33 % 72.46 25 3.60 % 108.45 11 1.44 % 37.22 34 2.56 % 96.28 5 1.16 % 32.07 sredi sredi sredi 58 1.80 % 56.03 10 1.44 % 43.38 27 3.52 % 91.36 14 1.06 % 39.64 7 1.62 % 44.90 znotraj znotraj zn otraj 53 1.65 % 51.20 3 0.43 % 13.01 2 0.26 % 6.77 40 3.02 % 113.27 8 1.86 % 51.32 izven izven izven 21 0.65 % 20.29 5 0.72 % 21.69 3 0.39 % 10.15 9 0.68 % 25.48 4 0.93 % 25.66 konec konec konec 21 0.65 % 20.29 0 0 % 0 5 0.65 % 16.92 14 1.06 % 39.64 2 0.46 % 12.83 nasproti nasproti nas proti 21 0.65 % 20.29 1 0.14 % 4.34 12 1.57 % 40.60 5 0.38 % 14.16 3 0.70 % 19.24 izpred izpred i zpred 16 0.50 % 15.46 2 0.29 % 8.68 5 0.65 % 16.92 9 0.68 % 25.48 0 0 % 0 bližje bližje b ližje 12 0.37 % 11.59 2 0.29 % 8.68 4 0.52 % 13.53 5 0.38 % 14.16 1 0.23 % 6.41 zoper zoper zoper 11 0.34 % 10.63 0 0 % 0 0 0 % 0 11 0.83 % 31.15 0 0 % 0 bliže bliže bliže 10 0.31 % 9.66 6 0.86 % 26.03 0 0 % 0 4 0.30 % 11.33 0 0 % 0 zavoljo zavoljo za voljo 7 0.22 % 6.76 1 0.14 % 4.34 1 0.13 % 3.38 5 0.38 % 14.16 0 0 % 0 najbližje najbližje najb ližje 6 0.19 % 5.80 3 0.43 % 13.01 1 0.13 % 3.38 1 0.07 % 2.83 1 0.23 % 6.41 naokoli naokoli na okoli 5 0.15 % 4.83 0 0 % 0 4 0.52 % 13.53 1 0.07 % 2.83 0 0 % 0 napram napram n apram 5 0.15 % 4.83 0 0 % 0 3 0.39 % 10.15 2 0.15 % 5.66 0 0 % 0 spričo spričo s pričo 5 0.15 % 4.83 1 0.14 % 4.34 0 0 % 0 4 0.30 % 11.33 0 0 % 0 vzdolž vzdolž v zdolž 4 0.12 % 3.86 0 0 % 0 0 0 % 0 4 0.30 % 11.33 0 0 % 0 tekom tekom tekom 3 0.09 % 2.90 1 0.14 % 4.34 0 0 % 0 1 0.07 % 2.83 1 0.23 % 6.41 CJVT // 
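As a reading aid for the word-part columns in these lists: the "Initial part of the word", "Final part of the word" and "Rest of the word" values behave like plain string splits at n characters, and forms shorter than n do not appear among the listed rows (for example, v and na are absent from the 3-gram lists). The Python sketch below illustrates that reading only. It is not the LIST implementation, the helper name `split_ngram` is hypothetical, and the handling of short forms is an assumption inferred from the tables.

```python
from typing import Optional, Tuple


def split_ngram(word: str, n: int, position: str = "initial") -> Optional[Tuple[str, str]]:
    """Return (n-gram, rest of the word), or None if the word is shorter than n.

    Hypothetical helper for illustration; not part of LIST.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    if position not in ("initial", "final"):
        raise ValueError("position must be 'initial' or 'final'")
    if len(word) < n:
        # Assumption: forms shorter than n are left out of the n-gram list.
        return None
    if position == "initial":
        return word[:n], word[n:]
    # Final n-gram: last n characters, with everything before them as the rest.
    return word[-n:], word[:-n]


# Examples that agree with rows in the preposition lists
print(split_ngram("zaradi", 2, "initial"))  # ('za', 'radi')
print(split_ngram("zaradi", 4, "final"))    # ('radi', 'za')
print(split_ngram("proti", 5, "final"))     # ('proti', '')
print(split_ngram("na", 3, "initial"))      # None
```

An empty rest of the word corresponds to rows in which the n-gram covers the whole form, such as proti in the 5-gram lists.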
2.2.221 List of initial character-level 1-grams from preposition standardized forms in the GOS 1.0 corpus with text-type distribution

File at CLARIN.SI: GOS1.0-word_parts-prepositions-standardized_forms-initial-1grams-taxonomy-entire.tsv

| Standardized form | Initial part of the word | Rest of the word | Total absolute frequency of standardized form | Percentage of all found standardized forms | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N] |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| v | v | | 17,481 | 25.77 % | 16,888.11 | 3,989 | 24.80 % | 17,304.13 | 3,778 | 24.21 % | 12,783.29 | 7,378 | 27.32 % | 20,892.03 | 2,336 | 25.55 % | 14,984.64 |
| na | n | a | 11,885 | 17.52 % | 11,481.90 | 2,972 | 18.48 % | 12,892.42 | 2,881 | 18.46 % | 9,748.19 | 4,330 | 16.03 % | 12,261.11 | 1,702 | 18.62 % | 10,917.74 |
| za | z | a | 7,868 | 11.60 % | 7,601.15 | 1,772 | 11.02 % | 7,686.87 | 1,787 | 11.45 % | 6,046.52 | 3,129 | 11.59 % | 8,860.28 | 1,180 | 12.91 % | 7,569.29 |
| z | z | | 4,303 | 6.34 % | 4,157.06 | 1,119 | 6.96 % | 4,854.18 | 887 | 5.68 % | 3,001.27 | 1,740 | 6.44 % | 4,927.10 | 557 | 6.09 % | 3,572.96 |
| po | p | o | 3,337 | 4.92 % | 3,223.82 | 779 | 4.84 % | 3,379.27 | 981 | 6.29 % | 3,319.33 | 1,116 | 4.13 % | 3,160.14 | 461 | 5.04 % | 2,957.16 |
| s | s | | 3,259 | 4.80 % | 3,148.47 | 759 | 4.72 % | 3,292.51 | 669 | 4.29 % | 2,263.64 | 1,323 | 4.90 % | 3,746.29 | 508 | 5.56 % | 3,258.65 |
| od | o | d | 2,502 | 3.69 % | 2,417.14 | 589 | 3.66 % | 2,555.06 | 674 | 4.32 % | 2,280.56 | 940 | 3.48 % | 2,661.77 | 299 | 3.27 % | 1,917.98 |
| pri | p | ri | 2,371 | 3.50 % | 2,290.58 | 443 | 2.75 % | 1,921.72 | 596 | 3.82 % | 2,016.63 | 929 | 3.44 % | 2,630.62 | 403 | 4.41 % | 2,585.11 |
| do | d | o | 2,124 | 3.13 % | 2,051.96 | 479 | 2.98 % | 2,077.88 | 448 | 2.87 % | 1,515.86 | 879 | 3.25 % | 2,489.03 | 318 | 3.48 % | 2,039.86 |
| o | o | | 2,070 | 3.05 % | 1,999.79 | 480 | 2.98 % | 2,082.22 | 319 | 2.04 % | 1,079.37 | 1,060 | 3.92 % | 3,001.57 | 211 | 2.31 % | 1,353.49 |
| iz | i | z | 1,942 | 2.86 % | 1,876.13 | 453 | 2.82 % | 1,965.10 | 436 | 2.79 % | 1,475.26 | 888 | 3.29 % | 2,514.52 | 165 | 1.80 % | 1,058.42 |
| ob | o | b | 896 | 1.32 % | 865.61 | 291 | 1.81 % | 1,262.35 | 259 | 1.66 % | 876.36 | 282 | 1.04 % | 798.53 | 64 | 0.70 % | 410.54 |
| pred | p | red | 788 | 1.16 % | 761.27 | 284 | 1.77 % | 1,231.98 | 148 | 0.95 % | 500.77 | 310 | 1.15 % | 877.82 | 46 | 0.50 % | 295.07 |
| zaradi | z | aradi | 786 | 1.16 % | 759.34 | 135 | 0.84 % | 585.62 | 143 | 0.92 % | 483.86 | 398 | 1.47 % | 1,127 | 110 | 1.20 % | 705.61 |
| med | m | ed | 781 | 1.15 % | 754.51 | 192 | 1.19 % | 832.89 | 100 | 0.64 % | 338.36 | 409 | 1.51 % | 1,158.15 | 80 | 0.88 % | 513.17 |
| čez | č | ez | 587 | 0.86 % | 567.09 | 211 | 1.31 % | 915.31 | 198 | 1.27 % | 669.96 | 125 | 0.46 % | 353.96 | 53 | 0.58 % | 339.98 |
| k | k | | 468 | 0.69 % | 452.13 | 103 | 0.64 % | 446.81 | 116 | 0.74 % | 392.50 | 186 | 0.69 % | 526.69 | 63 | 0.69 % | 404.12 |
| brez | b | rez | 463 | 0.68 % | 447.30 | 133 | 0.83 % | 576.95 | 118 | 0.76 % | 399.27 | 143 | 0.53 % | 404.93 | 69 | 0.76 % | 442.61 |
| pod | p | od | 337 | 0.50 % | 325.57 | 60 | 0.37 % | 260.28 | 93 | 0.60 % | 314.68 | 145 | 0.54 % | 410.59 | 39 | 0.43 % | 250.17 |
| proti | p | roti | 323 | 0.48 % | 312.05 | 123 | 0.77 % | 533.57 | 63 | 0.40 % | 213.17 | 114 | 0.42 % | 322.81 | 23 | 0.25 % | 147.54 |
| zraven | z | raven | 297 | 0.44 % | 286.93 | 54 | 0.34 % | 234.25 | 130 | 0.83 % | 439.87 | 48 | 0.18 % | 135.92 | 65 | 0.71 % | 416.95 |
| skoz | s | koz | 213 | 0.31 % | 205.78 | 15 | 0.09 % | 65.07 | 130 | 0.83 % | 439.87 | 14 | 0.05 % | 39.64 | 54 | 0.59 % | 346.39 |
| skozi | s | kozi | 212 | 0.31 % | 204.81 | 50 | 0.31 % | 216.90 | 60 | 0.39 % | 203.02 | 92 | 0.34 % | 260.51 | 10 | 0.11 % | 64.15 |
| okoli | o | koli | 183 | 0.27 % | 176.79 | 38 | 0.24 % | 164.84 | 89 | 0.57 % | 301.14 | 41 | 0.15 % | 116.10 | 15 | 0.16 % | 96.22 |
| nad | n | ad | 162 | 0.24 % | 156.51 | 41 | 0.26 % | 177.86 | 24 | 0.15 % | 81.21 | 84 | 0.31 % | 237.86 | 13 | 0.14 % | 83.39 |
| preko | p | reko | 145 | 0.21 % | 140.08 | 43 | 0.27 % | 186.53 | 14 | 0.09 % | 47.37 | 73 | 0.27 % | 206.71 | 15 | 0.16 % | 96.22 |
| glede | g | lede | 132 | 0.20 % | 127.52 | 17 | 0.11 % | 73.75 | 18 | 0.12 % | 60.91 | 48 | 0.18 % | 135.92 | 49 | 0.54 % | 314.32 |
| okrog | o | krog | 125 | 0.18 % | 120.76 | 19 | 0.12 % | 82.42 | 28 | 0.18 % | 94.74 | 55 | 0.20 % | 155.74 | 23 | 0.25 % | 147.54 |
| mimo | m | imo | 117 | 0.17 % | 113.03 | 42 | 0.26 % | 182.19 | 38 | 0.24 % | 128.58 | 28 | 0.10 % | 79.29 | 9 | 0.10 % | 57.73 |
| h | h | | 112 | 0.17 % | 108.20 | 19 | 0.12 % | 82.42 | 55 | 0.35 % | 186.10 | 23 | 0.09 % | 65.13 | 15 | 0.16 % | 96.22 |
| poleg | p | oleg | 112 | 0.17 % | 108.20 | 29 | 0.18 % | 125.80 | 20 | 0.13 % | 67.67 | 53 | 0.20 % | 150.08 | 10 | 0.11 % | 64.15 |
| kljub | k | ljub | 99 | 0.15 % | 95.64 | 16 | 0.10 % | 69.41 | 4 | 0.03 % | 13.53 | 67 | 0.25 % | 189.72 | 12 | 0.13 % | 76.98 |

2.2.222 List of initial character-level 2-grams from preposition standardized forms in the GOS 1.0 corpus with text-type distribution

File at CLARIN.SI: GOS1.0-word_parts-prepositions-standardized_forms-initial-2grams-taxonomy-entire.tsv

| Standardized form | Initial part of the word | Rest of the word | Total absolute frequency of standardized form | Percentage of all found standardized forms | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N] |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| na | na | | 11,885 | 29.80 % | 11,481.90 | 2,972 | 31.15 % | 12,892.42 | 2,881 | 29.56 % | 9,748.19 | 4,330 | 28.55 % | 12,261.11 | 1,702 | 31.38 % | 10,917.74 |
| za | za | | 7,868 | 19.73 % | 7,601.15 | 1,772 | 18.57 % | 7,686.87 | 1,787 | 18.34 % | 6,046.52 | 3,129 | 20.63 % | 8,860.28 | 1,180 | 21.75 % | 7,569.29 |
| po | po | | 3,337 | 8.37 % | 3,223.82 | 779 | 8.17 % | 3,379.27 | 981 | 10.07 % | 3,319.33 | 1,116 | 7.36 % | 3,160.14 | 461 | 8.50 % | 2,957.16 |
| od | od | | 2,502 | 6.27 % | 2,417.14 | 589 | 6.17 % | 2,555.06 | 674 | 6.92 % | 2,280.56 | 940 | 6.20 % | 2,661.77 | 299 | 5.51 % | 1,917.98 |
| pri | pr | i | 2,371 | 5.95 % | 2,290.58 | 443 | 4.64 % | 1,921.72 | 596 | 6.12 % | 2,016.63 | 929 | 6.12 % | 2,630.62 | 403 | 7.43 % | 2,585.11 |
| do | do | | 2,124 | 5.33 % | 2,051.96 | 479 | 5.02 % | 2,077.88 | 448 | 4.60 % | 1,515.86 | 879 | 5.79 % | 2,489.03 | 318 | 5.86 % | 2,039.86 |
| iz | iz | | 1,942 | 4.87 % | 1,876.13 | 453 | 4.75 % | 1,965.10 | 436 | 4.47 % | 1,475.26 | 888 | 5.86 % | 2,514.52 | 165 | 3.04 % | 1,058.42 |
| ob | ob | | 896 | 2.25 % | 865.61 | 291 | 3.05 % | 1,262.35 | 259 | 2.66 % | 876.36 | 282 | 1.86 % | 798.53 | 64 | 1.18 % | 410.54 |
| pred | pr | ed | 788 | 1.98 % | 761.27 | 284 | 2.98 % | 1,231.98 | 148 | 1.52 % | 500.77 | 310 | 2.04 % | 877.82 | 46 | 0.85 % | 295.07 |
| zaradi | za | radi | 786 | 1.97 % | 759.34 | 135 | 1.42 % | 585.62 | 143 | 1.47 % | 483.86 | 398 | 2.62 % | 1,127 | 110 | 2.03 % | 705.61 |
| med | me | d | 781 | 1.96 % | 754.51 | 192 | 2.01 % | 832.89 | 100 | 1.03 % | 338.36 | 409 | 2.70 % | 1,158.15 | 80 | 1.48 % | 513.17 |
| čez | če | z | 587 | 1.47 % | 567.09 | 211 | 2.21 % | 915.31 | 198 | 2.03 % | 669.96 | 125 | 0.82 % | 353.96 | 53 | 0.98 % | 339.98 |
| brez | br | ez | 463 | 1.16 % | 447.30 | 133 | 1.39 % | 576.95 | 118 | 1.21 % | 399.27 | 143 | 0.94 % | 404.93 | 69 | 1.27 % | 442.61 |
| pod | po | d | 337 | 0.84 % | 325.57 | 60 | 0.63 % | 260.28 | 93 | 0.95 % | 314.68 | 145 | 0.96 % | 410.59 | 39 | 0.72 % | 250.17 |
| proti | pr | oti | 323 | 0.81 % | 312.05 | 123 | 1.29 % | 533.57 | 63 | 0.65 % | 213.17 | 114 | 0.75 % | 322.81 | 23 | 0.42 % | 147.54 |
| zraven | zr | aven | 297 | 0.74 % | 286.93 | 54 | 0.57 % | 234.25 | 130 | 1.33 % | 439.87 | 48 | 0.32 % | 135.92 | 65 | 1.20 % | 416.95 |
| skoz | sk | oz | 213 | 0.53 % | 205.78 | 15 | 0.16 % | 65.07 | 130 | 1.33 % | 439.87 | 14 | 0.09 % | 39.64 | 54 | 1.00 % | 346.39 |
| skozi | sk | ozi | 212 | 0.53 % | 204.81 | 50 | 0.52 % | 216.90 | 60 | 0.62 % | 203.02 | 92 | 0.61 % | 260.51 | 10 | 0.18 % | 64.15 |
| okoli | ok | oli | 183 | 0.46 % | 176.79 | 38 | 0.40 % | 164.84 | 89 | 0.91 % | 301.14 | 41 | 0.27 % | 116.10 | 15 | 0.28 % | 96.22 |
| nad | na | d | 162 | 0.41 % | 156.51 | 41 | 0.43 % | 177.86 | 24 | 0.25 % | 81.21 | 84 | 0.55 % | 237.86 | 13 | 0.24 % | 83.39 |
| preko | pr | eko | 145 | 0.36 % | 140.08 | 43 | 0.45 % | 186.53 | 14 | 0.14 % | 47.37 | 73 | 0.48 % | 206.71 | 15 | 0.28 % | 96.22 |
| glede | gl | ede | 132 | 0.33 % | 127.52 | 17 | 0.18 % | 73.75 | 18 | 0.18 % | 60.91 | 48 | 0.32 % | 135.92 | 49 | 0.90 % | 314.32 |
| okrog | ok | rog | 125 | 0.31 % | 120.76 | 19 | 0.20 % | 82.42 | 28 | 0.29 % | 94.74 | 55 | 0.36 % | 155.74 | 23 | 0.42 % | 147.54 |
| mimo | mi | mo | 117 | 0.29 % | 113.03 | 42 | 0.44 % | 182.19 | 38 | 0.39 % | 128.58 | 28 | 0.18 % | 79.29 | 9 | 0.17 % | 57.73 |
| poleg | po | leg | 112 | 0.28 % | 108.20 | 29 | 0.30 % | 125.80 | 20 | 0.20 % | 67.67 | 53 | 0.35 % | 150.08 | 10 | 0.18 % | 64.15 |
| kljub | kl | jub | 99 | 0.25 % | 95.64 | 16 | 0.17 % | 69.41 | 4 | 0.04 % | 13.53 | 67 | 0.44 % | 189.72 | 12 | 0.22 % | 76.98 |
| razen | ra | zen | 98 | 0.25 % | 94.68 | 14 | 0.15 % | 60.73 | 29 | 0.30 % | 98.12 | 35 | 0.23 % | 99.11 | 20 | 0.37 % | 128.29 |
| prek | pr | ek | 97 | 0.24 % | 93.71 | 15 | 0.16 % | 65.07 | 43 | 0.44 % | 145.50 | 26 | 0.17 % | 73.62 | 13 | 0.24 % | 83.39 |
| blizu | bl | izu | 96 | 0.24 % | 92.74 | 19 | 0.20 % | 82.42 | 31 | 0.32 % | 104.89 | 36 | 0.24 % | 101.94 | 10 | 0.18 % | 64.15 |
| zunaj | zu | naj | 90 | 0.23 % | 86.95 | 14 | 0.15 % | 60.73 | 30 | 0.31 % | 101.51 | 25 | 0.17 % | 70.79 | 21 | 0.39 % | 134.71 |
| izmed | iz | med | 89 | 0.22 % | 85.98 | 46 | 0.48 % | 199.55 | 5 | 0.05 % | 16.92 | 37 | 0.24 % | 104.77 | 1 | 0.02 % | 6.41 |
| namesto | na | mesto | 84 | 0.21 % | 81.15 | 15 | 0.16 % | 65.07 | 19 | 0.20 % | 64.29 | 35 | 0.23 % | 99.11 | 15 | 0.28 % | 96.22 |

2.2.223 List of initial character-level 3-grams from preposition standardized forms in the GOS 1.0 corpus with text-type distribution

File at CLARIN.SI: GOS1.0-word_parts-prepositions-standardized_forms-initial-3grams-taxonomy-entire.tsv

| Standardized form | Initial part of the word | Rest of the word | Total absolute frequency of standardized form | Percentage of all found standardized forms | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N] |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| pri | pri | | 2,371 | 25.57 % | 2,290.58 | 443 | 20.37 % | 1,921.72 | 596 | 26.15 % | 2,016.63 | 929 | 25.93 % | 2,630.62 | 403 | 32.66 % | 2,585.11 |
| pred | pre | d | 788 | 8.50 % | 761.27 | 284 | 13.06 % | 1,231.98 | 148 | 6.49 % | 500.77 | 310 | 8.65 % | 877.82 | 46 | 3.73 % | 295.07 |
| zaradi | zar | adi | 786 | 8.48 % | 759.34 | 135 | 6.21 % | 585.62 | 143 | 6.28 % | 483.86 | 398 | 11.11 % | 1,127 | 110 | 8.91 % | 705.61 |
| med | med | | 781 | 8.42 % | 754.51 | 192 | 8.83 % | 832.89 | 100 | 4.39 % | 338.36 | 409 | 11.41 % | 1,158.15 | 80 | 6.48 % | 513.17 |
| čez | čez | | 587 | 6.33 % | 567.09 | 211 | 9.70 % | 915.31 | 198 | 8.69 % | 669.96 | 125 | 3.49 % | 353.96 | 53 | 4.29 % | 339.98 |
| brez | bre | z | 463 | 4.99 % | 447.30 | 133 | 6.12 % | 576.95 | 118 | 5.18 % | 399.27 | 143 | 3.99 % | 404.93 | 69 | 5.59 % | 442.61 |
| pod | pod | | 337 | 3.63 % | 325.57 | 60 | 2.76 % | 260.28 | 93 | 4.08 % | 314.68 | 145 | 4.05 % | 410.59 | 39 | 3.16 % | 250.17 |
| proti | pro | ti | 323 | 3.48 % | 312.05 | 123 | 5.66 % | 533.57 | 63 | 2.76 % | 213.17 | 114 | 3.18 % | 322.81 | 23 | 1.86 % | 147.54 |
| zraven | zra | ven | 297 | 3.20 % | 286.93 | 54 | 2.48 % | 234.25 | 130 | 5.70 % | 439.87 | 48 | 1.34 % | 135.92 | 65 | 5.27 % | 416.95 |
| skoz | sko | z | 213 | 2.30 % | 205.78 | 15 | 0.69 % | 65.07 | 130 | 5.70 % | 439.87 | 14 | 0.39 % | 39.64 | 54 | 4.38 % | 346.39 |
| skozi | sko | zi | 212 | 2.29 % | 204.81 | 50 | 2.30 % | 216.90 | 60 | 2.63 % | 203.02 | 92 | 2.57 % | 260.51 | 10 | 0.81 % | 64.15 |
| okoli | oko | li | 183 | 1.97 % | 176.79 | 38 | 1.75 % | 164.84 | 89 | 3.90 % | 301.14 | 41 | 1.14 % | 116.10 | 15 | 1.22 % | 96.22 |
| nad | nad | | 162 | 1.75 % | 156.51 | 41 | 1.89 % | 177.86 | 24 | 1.05 % | 81.21 | 84 | 2.34 % | 237.86 | 13 | 1.05 % | 83.39 |
| preko | pre | ko | 145 | 1.56 % | 140.08 | 43 | 1.98 % | 186.53 | 14 | 0.61 % | 47.37 | 73 | 2.04 % | 206.71 | 15 | 1.22 % | 96.22 |
| glede | gle | de | 132 | 1.42 % | 127.52 | 17 | 0.78 % | 73.75 | 18 | 0.79 % | 60.91 | 48 | 1.34 % | 135.92 | 49 | 3.97 % | 314.32 |
| okrog | okr | og | 125 | 1.35 % | 120.76 | 19 | 0.87 % | 82.42 | 28 | 1.23 % | 94.74 | 55 | 1.53 % | 155.74 | 23 | 1.86 % | 147.54 |
| mimo | mim | o | 117 | 1.26 % | 113.03 | 42 | 1.93 % | 182.19 | 38 | 1.67 % | 128.58 | 28 | 0.78 % | 79.29 | 9 | 0.73 % | 57.73 |
| poleg | pol | eg | 112 | 1.21 % | 108.20 | 29 | 1.33 % | 125.80 | 20 | 0.88 % | 67.67 | 53 | 1.48 % | 150.08 | 10 | 0.81 % | 64.15 |
| kljub | klj | ub | 99 | 1.07 % | 95.64 | 16 | 0.74 % | 69.41 | 4 | 0.18 % | 13.53 | 67 | 1.87 % | 189.72 | 12 | 0.97 % | 76.98 |
| razen | raz | en | 98 | 1.06 % | 94.68 | 14 | 0.64 % | 60.73 | 29 | 1.27 % | 98.12 | 35 | 0.98 % | 99.11 | 20 | 1.62 % | 128.29 |
| prek | pre | k | 97 | 1.05 % | 93.71 | 15 | 0.69 % | 65.07 | 43 | 1.89 % | 145.50 | 26 | 0.73 % | 73.62 | 13 | 1.05 % | 83.39 |
| blizu | bli | zu | 96 | 1.03 % | 92.74 | 19 | 0.87 % | 82.42 | 31 | 1.36 % | 104.89 | 36 | 1.00 % | 101.94 | 10 | 0.81 % | 64.15 |
| zunaj | zun | aj | 90 | 0.97 % | 86.95 | 14 | 0.64 % | 60.73 | 30 | 1.32 % | 101.51 | 25 | 0.70 % | 70.79 | 21 | 1.70 % | 134.71 |
| izmed | izm | ed | 89 | 0.96 % | 85.98 | 46 | 2.12 % | 199.55 | 5 | 0.22 % | 16.92 | 37 | 1.03 % | 104.77 | 1 | 0.08 % | 6.41 |
| namesto | nam | esto | 84 | 0.91 % | 81.15 | 15 | 0.69 % | 65.07 | 19 | 0.83 % | 64.29 | 35 | 0.98 % | 99.11 | 15 | 1.22 % | 96.22 |
| kasneje | kas | neje | 75 | 0.81 % | 72.46 | 25 | 1.15 % | 108.45 | 11 | 0.48 % | 37.22 | 34 | 0.95 % | 96.28 | 5 | 0.41 % | 32.07 |
| sredi | sre | di | 58 | 0.63 % | 56.03 | 10 | 0.46 % | 43.38 | 27 | 1.19 % | 91.36 | 14 | 0.39 % | 39.64 | 7 | 0.57 % | 44.90 |
| znotraj | zno | traj | 53 | 0.57 % | 51.20 | 3 | 0.14 % | 13.01 | 2 | 0.09 % | 6.77 | 40 | 1.12 % | 113.27 | 8 | 0.65 % | 51.32 |
| vrhu | vrh | u | 46 | 0.50 % | 44.44 | 21 | 0.97 % | 91.10 | 7 | 0.31 % | 23.69 | 16 | 0.45 % | 45.31 | 2 | 0.16 % | 12.83 |
| vrh | vrh | | 40 | 0.43 % | 38.64 | 13 | 0.60 % | 56.39 | 9 | 0.40 % | 30.45 | 8 | 0.22 % | 22.65 | 10 | 0.81 % | 64.15 |
| krog | kro | g | 29 | 0.31 % | 28.02 | 5 | 0.23 % | 21.69 | 3 | 0.13 % | 10.15 | 16 | 0.45 % | 45.31 | 5 | 0.41 % | 32.07 |
| izven | izv | en | 21 | 0.23 % | 20.29 | 5 | 0.23 % | 21.69 | 3 | 0.13 % | 10.15 | 9 | 0.25 % | 25.48 | 4 | 0.32 % | 25.66 |

2.2.224 List of initial character-level 4-grams from preposition standardized forms in the GOS 1.0 corpus with text-type distribution

File at CLARIN.SI: GOS1.0-word_parts-prepositions-standardized_forms-initial-4grams-taxonomy-entire.tsv

| Standardized form | Initial part of the word | Rest of the word | Total absolute frequency of standardized form | Percentage of all found standardized forms | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N] |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| pred | pred | | 788 | 15.82 % | 761.27 | 284 | 23.47 % | 1,231.98 | 148 | 11.80 % | 500.77 | 310 | 16.48 % | 877.82 | 46 | 7.23 % | 295.07 |
| zaradi | zara | di | 786 | 15.78 % | 759.34 | 135 | 11.16 % | 585.62 | 143 | 11.40 % | 483.86 | 398 | 21.16 % | 1,127 | 110 | 17.30 % | 705.61 |
| brez | brez | | 463 | 9.29 % | 447.30 | 133 | 10.99 % | 576.95 | 118 | 9.41 % | 399.27 | 143 | 7.60 % | 404.93 | 69 | 10.85 % | 442.61 |
| proti | prot | i | 323 | 6.49 % | 312.05 | 123 | 10.16 % | 533.57 | 63 | 5.02 % | 213.17 | 114 | 6.06 % | 322.81 | 23 | 3.62 % | 147.54 |
| zraven | zrav | en | 297 | 5.96 % | 286.93 | 54 | 4.46 % | 234.25 | 130 | 10.37 % | 439.87 | 48 | 2.55 % | 135.92 | 65 | 10.22 % | 416.95 |
| skoz | skoz | | 213 | 4.28 % | 205.78 | 15 | 1.24 % | 65.07 | 130 | 10.37 % | 439.87 | 14 | 0.74 % | 39.64 | 54 | 8.49 % | 346.39 |
| skozi | skoz | i | 212 | 4.26 % | 204.81 | 50 | 4.13 % | 216.90 | 60 | 4.79 % | 203.02 | 92 | 4.89 % | 260.51 | 10 | 1.57 % | 64.15 |
| okoli | okol | i | 183 | 3.67 % | 176.79 | 38 | 3.14 % | 164.84 | 89 | 7.10 % | 301.14 | 41 | 2.18 % | 116.10 | 15 | 2.36 % | 96.22 |
| preko | prek | o | 145 | 2.91 % | 140.08 | 43 | 3.55 % | 186.53 | 14 | 1.12 % | 47.37 | 73 | 3.88 % | 206.71 | 15 | 2.36 % | 96.22 |
| glede | gled | e | 132 | 2.65 % | 127.52 | 17 | 1.41 % | 73.75 | 18 | 1.44 % | 60.91 | 48 | 2.55 % | 135.92 | 49 | 7.70 % | 314.32 |
| okrog | okro | g | 125 | 2.51 % | 120.76 | 19 | 1.57 % | 82.42 | 28 | 2.23 % | 94.74 | 55 | 2.92 % | 155.74 | 23 | 3.62 % | 147.54 |
| mimo | mimo | | 117 | 2.35 % | 113.03 | 42 | 3.47 % | 182.19 | 38 | 3.03 % | 128.58 | 28 | 1.49 % | 79.29 | 9 | 1.42 % | 57.73 |
| poleg | pole | g | 112 | 2.25 % | 108.20 | 29 | 2.40 % | 125.80 | 20 | 1.59 % | 67.67 | 53 | 2.82 % | 150.08 | 10 | 1.57 % | 64.15 |
| kljub | klju | b | 99 | 1.99 % | 95.64 | 16 | 1.32 % | 69.41 | 4 | 0.32 % | 13.53 | 67 | 3.56 % | 189.72 | 12 | 1.89 % | 76.98 |
| razen | raze | n | 98 | 1.97 % | 94.68 | 14 | 1.16 % | 60.73 | 29 | 2.31 % | 98.12 | 35 | 1.86 % | 99.11 | 20 | 3.15 % | 128.29 |
| prek | prek | | 97 | 1.95 % | 93.71 | 15 | 1.24 % | 65.07 | 43 | 3.43 % | 145.50 | 26 | 1.38 % | 73.62 | 13 | 2.04 % | 83.39 |
| blizu | bliz | u | 96 | 1.93 % | 92.74 | 19 | 1.57 % | 82.42 | 31 | 2.47 % | 104.89 | 36 | 1.91 % | 101.94 | 10 | 1.57 % | 64.15 |
| zunaj | zuna | j | 90 | 1.81 % | 86.95 | 14 | 1.16 % | 60.73 | 30 | 2.39 % | 101.51 | 25 | 1.33 % | 70.79 | 21 | 3.30 % | 134.71 |
| izmed | izme | d | 89 | 1.79 % | 85.98 | 46 | 3.80 % | 199.55 | 5 | 0.40 % | 16.92 | 37 | 1.97 % | 104.77 | 1 | 0.16 % | 6.41 |
| namesto | name | sto | 84 | 1.69 % | 81.15 | 15 | 1.24 % | 65.07 | 19 | 1.51 % | 64.29 | 35 | 1.86 % | 99.11 | 15 | 2.36 % | 96.22 |
| kasneje | kasn | eje | 75 | 1.51 % | 72.46 | 25 | 2.07 % | 108.45 | 11 | 0.88 % | 37.22 | 34 | 1.81 % | 96.28 | 5 | 0.79 % | 32.07 |
| sredi | sred | i | 58 | 1.16 % | 56.03 | 10 | 0.83 % | 43.38 | 27 | 2.15 % | 91.36 | 14 | 0.74 % | 39.64 | 7 | 1.10 % | 44.90 |
| znotraj | znot | raj | 53 | 1.06 % | 51.20 | 3 | 0.25 % | 13.01 | 2 | 0.16 % | 6.77 | 40 | 2.13 % | 113.27 | 8 | 1.26 % | 51.32 |
| vrhu | vrhu | | 46 | 0.92 % | 44.44 | 21 | 1.74 % | 91.10 | 7 | 0.56 % | 23.69 | 16 | 0.85 % | 45.31 | 2 | 0.31 % | 12.83 |
| krog | krog | | 29 | 0.58 % | 28.02 | 5 | 0.41 % | 21.69 | 3 | 0.24 % | 10.15 | 16 | 0.85 % | 45.31 | 5 | 0.79 % | 32.07 |
| izven | izve | n | 21 | 0.42 % | 20.29 | 5 | 0.41 % | 21.69 | 3 | 0.24 % | 10.15 | 9 | 0.48 % | 25.48 | 4 | 0.63 % | 25.66 |
| konec | kone | c | 21 | 0.42 % | 20.29 | 0 | 0 % | 0 | 5 | 0.40 % | 16.92 | 14 | 0.74 % | 39.64 | 2 | 0.31 % | 12.83 |
| nasproti | nasp | roti | 21 | 0.42 % | 20.29 | 1 | 0.08 % | 4.34 | 12 | 0.96 % | 40.60 | 5 | 0.27 % | 14.16 | 3 | 0.47 % | 19.24 |
| izpred | izpr | ed | 16 | 0.32 % | 15.46 | 2 | 0.17 % | 8.68 | 5 | 0.40 % | 16.92 | 9 | 0.48 % | 25.48 | 0 | 0 % | 0 |
| bližje | bliž | je | 12 | 0.24 % | 11.59 | 2 | 0.17 % | 8.68 | 4 | 0.32 % | 13.53 | 5 | 0.27 % | 14.16 | 1 | 0.16 % | 6.41 |
| zoper | zope | r | 11 | 0.22 % | 10.63 | 0 | 0 % | 0 | 0 | 0 % | 0 | 11 | 0.58 % | 31.15 | 0 | 0 % | 0 |
| bliže | bliž | e | 10 | 0.20 % | 9.66 | 6 | 0.50 % | 26.03 | 0 | 0 % | 0 | 4 | 0.21 % | 11.33 | 0 | 0 % | 0 |

2.2.225 List of initial character-level 5-grams from preposition standardized forms in the GOS 1.0 corpus with text-type distribution

File at CLARIN.SI: GOS1.0-word_parts-prepositions-standardized_forms-initial-5grams-taxonomy-entire.tsv

| Standardized form | Initial part of the word | Rest of the word | Total absolute frequency of standardized form | Percentage of all found standardized forms | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N] |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| zaradi | zarad | i | 786 | 24.43 % | 759.34 | 135 | 19.42 % | 585.62 | 143 | 18.67 % | 483.86 | 398 | 30.02 % | 1,127 | 110 | 25.52 % | 705.61 |
| proti | proti | | 323 | 10.04 % | 312.05 | 123 | 17.70 % | 533.57 | 63 | 8.22 % | 213.17 | 114 | 8.60 % | 322.81 | 23 | 5.34 % | 147.54 |
| zraven | zrave | n | 297 | 9.23 % | 286.93 | 54 | 7.77 % | 234.25 | 130 | 16.97 % | 439.87 | 48 | 3.62 % | 135.92 | 65 | 15.08 % | 416.95 |
| skozi | skozi | | 212 | 6.59 % | 204.81 | 50 | 7.19 % | 216.90 | 60 | 7.83 % | 203.02 | 92 | 6.94 % | 260.51 | 10 | 2.32 % | 64.15 |
| okoli | okoli | | 183 | 5.69 % | 176.79 | 38 | 5.47 % | 164.84 | 89 | 11.62 % | 301.14 | 41 | 3.09 % | 116.10 | 15 | 3.48 % | 96.22 |
| preko | preko | | 145 | 4.51 % | 140.08 | 43 | 6.19 % | 186.53 | 14 | 1.83 % | 47.37 | 73 | 5.50 % | 206.71 | 15 | 3.48 % | 96.22 |
| glede | glede | | 132 | 4.10 % | 127.52 | 17 | 2.45 % | 73.75 | 18 | 2.35 % | 60.91 | 48 | 3.62 % | 135.92 | 49 | 11.37 % | 314.32 |
| okrog | okrog | | 125 | 3.88 % | 120.76 | 19 | 2.73 % | 82.42 | 28 | 3.65 % | 94.74 | 55 | 4.15 % | 155.74 | 23 | 5.34 % | 147.54 |
| poleg | poleg | | 112 | 3.48 % | 108.20 | 29 | 4.17 % | 125.80 | 20 | 2.61 % | 67.67 | 53 | 4.00 % | 150.08 | 10 | 2.32 % | 64.15 |
| kljub | kljub | | 99 | 3.08 % | 95.64 | 16 | 2.30 % | 69.41 | 4 | 0.52 % | 13.53 | 67 | 5.05 % | 189.72 | 12 | 2.78 % | 76.98 |
| razen | razen | | 98 | 3.04 % | 94.68 | 14 | 2.01 % | 60.73 | 29 | 3.79 % | 98.12 | 35 | 2.64 % | 99.11 | 20 | 4.64 % | 128.29 |
| blizu | blizu | | 96 | 2.98 % | 92.74 | 19 | 2.73 % | 82.42 | 31 | 4.05 % | 104.89 | 36 | 2.71 % | 101.94 | 10 | 2.32 % | 64.15 |
| zunaj | zunaj | | 90 | 2.80 % | 86.95 | 14 | 2.01 % | 60.73 | 30 | 3.92 % | 101.51 | 25 | 1.89 % | 70.79 | 21 | 4.87 % | 134.71 |
| izmed | izmed | | 89 | 2.77 % | 85.98 | 46 | 6.62 % | 199.55 | 5 | 0.65 % | 16.92 | 37 | 2.79 % | 104.77 | 1 | 0.23 % | 6.41 |
| namesto | names | to | 84 | 2.61 % | 81.15 | 15 | 2.16 % | 65.07 | 19 | 2.48 % | 64.29 | 35 | 2.64 % | 99.11 | 15 | 3.48 % | 96.22 |
| kasneje | kasne | je | 75 | 2.33 % | 72.46 | 25 | 3.60 % | 108.45 | 11 | 1.44 % | 37.22 | 34 | 2.56 % | 96.28 | 5 | 1.16 % | 32.07 |
| sredi | sredi | | 58 | 1.80 % | 56.03 | 10 | 1.44 % | 43.38 | 27 | 3.52 % | 91.36 | 14 | 1.06 % | 39.64 | 7 | 1.62 % | 44.90 |
| znotraj | znotr | aj | 53 | 1.65 % | 51.20 | 3 | 0.43 % | 13.01 | 2 | 0.26 % | 6.77 | 40 | 3.02 % | 113.27 | 8 | 1.86 % | 51.32 |
| izven | izven | | 21 | 0.65 % | 20.29 | 5 | 0.72 % | 21.69 | 3 | 0.39 % | 10.15 | 9 | 0.68 % | 25.48 | 4 | 0.93 % | 25.66 |
| konec | konec | | 21 | 0.65 % | 20.29 | 0 | 0 % | 0 | 5 | 0.65 % | 16.92 | 14 | 1.06 % | 39.64 | 2 | 0.46 % | 12.83 |
| nasproti | naspr | oti | 21 | 0.65 % | 20.29 | 1 | 0.14 % | 4.34 | 12 | 1.57 % | 40.60 | 5 | 0.38 % | 14.16 | 3 | 0.70 % | 19.24 |
| izpred | izpre | d | 16 | 0.50 % | 15.46 | 2 | 0.29 % | 8.68 | 5 | 0.65 % | 16.92 | 9 | 0.68 % | 25.48 | 0 | 0 % | 0 |
| bližje | bližj | e | 12 | 0.37 % | 11.59 | 2 | 0.29 % | 8.68 | 4 | 0.52 % | 13.53 | 5 | 0.38 % | 14.16 | 1 | 0.23 % | 6.41 |
| zoper | zoper | | 11 | 0.34 % | 10.63 | 0 | 0 % | 0 | 0 | 0 % | 0 | 11 | 0.83 % | 31.15 | 0 | 0 % | 0 |
| bliže | bliže | | 10 | 0.31 % | 9.66 | 6 | 0.86 % | 26.03 | 0 | 0 % | 0 | 4 | 0.30 % | 11.33 | 0 | 0 % | 0 |
| zavoljo | zavol | jo | 7 | 0.22 % | 6.76 | 1 | 0.14 % | 4.34 | 1 | 0.13 % | 3.38 | 5 | 0.38 % | 14.16 | 0 | 0 % | 0 |
| najbližje | najbl | ižje | 6 | 0.19 % | 5.80 | 3 | 0.43 % | 13.01 | 1 | 0.13 % | 3.38 | 1 | 0.07 % | 2.83 | 1 | 0.23 % | 6.41 |
| naokoli | naoko | li | 5 | 0.15 % | 4.83 | 0 | 0 % | 0 | 4 | 0.52 % | 13.53 | 1 | 0.07 % | 2.83 | 0 | 0 % | 0 |
| napram | napra | m | 5 | 0.15 % | 4.83 | 0 | 0 % | 0 | 3 | 0.39 % | 10.15 | 2 | 0.15 % | 5.66 | 0 | 0 % | 0 |
| spričo | sprič | o | 5 | 0.15 % | 4.83 | 1 | 0.14 % | 4.34 | 0 | 0 % | 0 | 4 | 0.30 % | 11.33 | 0 | 0 % | 0 |
| vzdolž | vzdol | ž | 4 | 0.12 % | 3.86 | 0 | 0 % | 0 | 0 | 0 % | 0 | 4 | 0.30 % | 11.33 | 0 | 0 % | 0 |
| tekom | tekom | | 3 | 0.09 % | 2.90 | 1 | 0.14 % | 4.34 | 0 | 0 % | 0 | 1 | 0.07 % | 2.83 | 1 | 0.23 % | 6.41 |

2.2.226 List of final character-level 1-grams from preposition standardized forms in the GOS 1.0 corpus with text-type distribution

File at CLARIN.SI: GOS1.0-word_parts-prepositions-standardized_forms-final-1grams-taxonomy-entire.tsv

| Standardized form | Rest of the word | Final part of the word | Total absolute frequency of standardized form | Percentage of all found standardized forms | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N] |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| v | | v | 17,481 | 25.77 % | 16,888.11 | 3,989 | 24.80 % | 17,304.13 | 3,778 | 24.21 % | 12,783.29 | 7,378 | 27.32 % | 20,892.03 | 2,336 | 25.55 % | 14,984.64 |
| na | n | a | 11,885 | 17.52 % | 11,481.90 | 2,972 | 18.48 % | 12,892.42 | 2,881 | 18.46 % | 9,748.19 | 4,330 | 16.03 % | 12,261.11 | 1,702 | 18.62 % | 10,917.74 |
| za | z | a | 7,868 | 11.60 % | 7,601.15 | 1,772 | 11.02 % | 7,686.87 | 1,787 | 11.45 % | 6,046.52 | 3,129 | 11.59 % | 8,860.28 | 1,180 | 12.91 % | 7,569.29 |
| z | | z | 4,303 | 6.34 % | 4,157.06 | 1,119 | 6.96 % | 4,854.18 | 887 | 5.68 % | 3,001.27 | 1,740 | 6.44 % | 4,927.10 | 557 | 6.09 % | 3,572.96 |
| po | p | o | 3,337 | 4.92 % | 3,223.82 | 779 | 4.84 % | 3,379.27 | 981 | 6.29 % | 3,319.33 | 1,116 | 4.13 % | 3,160.14 | 461 | 5.04 % | 2,957.16 |
| s | | s | 3,259 | 4.80 % | 3,148.47 | 759 | 4.72 % | 3,292.51 | 669 | 4.29 % | 2,263.64 | 1,323 | 4.90 % | 3,746.29 | 508 | 5.56 % | 3,258.65 |
| od | o | d | 2,502 | 3.69 % | 2,417.14 | 589 | 3.66 % | 2,555.06 | 674 | 4.32 % | 2,280.56 | 940 | 3.48 % | 2,661.77 | 299 | 3.27 % | 1,917.98 |
| pri | pr | i | 2,371 | 3.50 % | 2,290.58 | 443 | 2.75 % | 1,921.72 | 596 | 3.82 % | 2,016.63 | 929 | 3.44 % | 2,630.62 | 403 | 4.41 % | 2,585.11 |
| do | d | o | 2,124 | 3.13 % | 2,051.96 | 479 | 2.98 % | 2,077.88 | 448 | 2.87 % | 1,515.86 | 879 | 3.25 % | 2,489.03 | 318 | 3.48 % | 2,039.86 |
| o | | o | 2,070 | 3.05 % | 1,999.79 | 480 | 2.98 % | 2,082.22 | 319 | 2.04 % | 1,079.37 | 1,060 | 3.92 % | 3,001.57 | 211 | 2.31 % | 1,353.49 |
| iz | i | z | 1,942 | 2.86 % | 1,876.13 | 453 | 2.82 % | 1,965.10 | 436 | 2.79 % | 1,475.26 | 888 | 3.29 % | 2,514.52 | 165 | 1.80 % | 1,058.42 |
| ob | o | b | 896 | 1.32 % | 865.61 | 291 | 1.81 % | 1,262.35 | 259 | 1.66 % | 876.36 | 282 | 1.04 % | 798.53 | 64 | 0.70 % | 410.54 |
| pred | pre | d | 788 | 1.16 % | 761.27 | 284 | 1.77 % | 1,231.98 | 148 | 0.95 % | 500.77 | 310 | 1.15 % | 877.82 | 46 | 0.50 % | 295.07 |
| zaradi | zarad | i | 786 | 1.16 % | 759.34 | 135 | 0.84 % | 585.62 | 143 | 0.92 % | 483.86 | 398 | 1.47 % | 1,127 | 110 | 1.20 % | 705.61 |
| med | me | d | 781 | 1.15 % | 754.51 | 192 | 1.19 % | 832.89 | 100 | 0.64 % | 338.36 | 409 | 1.51 % | 1,158.15 | 80 | 0.88 % | 513.17 |
| čez | če | z | 587 | 0.86 % | 567.09 | 211 | 1.31 % | 915.31 | 198 | 1.27 % | 669.96 | 125 | 0.46 % | 353.96 | 53 | 0.58 % | 339.98 |
| k | | k | 468 | 0.69 % | 452.13 | 103 | 0.64 % | 446.81 | 116 | 0.74 % | 392.50 | 186 | 0.69 % | 526.69 | 63 | 0.69 % | 404.12 |
| brez | bre | z | 463 | 0.68 % | 447.30 | 133 | 0.83 % | 576.95 | 118 | 0.76 % | 399.27 | 143 | 0.53 % | 404.93 | 69 | 0.76 % | 442.61 |
| pod | po | d | 337 | 0.50 % | 325.57 | 60 | 0.37 % | 260.28 | 93 | 0.60 % | 314.68 | 145 | 0.54 % | 410.59 | 39 | 0.43 % | 250.17 |
| proti | prot | i | 323 | 0.48 % | 312.05 | 123 | 0.77 % | 533.57 | 63 | 0.40 % | 213.17 | 114 | 0.42 % | 322.81 | 23 | 0.25 % | 147.54 |
| zraven | zrave | n | 297 | 0.44 % | 286.93 | 54 | 0.34 % | 234.25 | 130 | 0.83 % | 439.87 | 48 | 0.18 % | 135.92 | 65 | 0.71 % | 416.95 |
| skoz | sko | z | 213 | 0.31 % | 205.78 | 15 | 0.09 % | 65.07 | 130 | 0.83 % | 439.87 | 14 | 0.05 % | 39.64 | 54 | 0.59 % | 346.39 |
| skozi | skoz | i | 212 | 0.31 % | 204.81 | 50 | 0.31 % | 216.90 | 60 | 0.39 % | 203.02 | 92 | 0.34 % | 260.51 | 10 | 0.11 % | 64.15 |
| okoli | okol | i | 183 | 0.27 % | 176.79 | 38 | 0.24 % | 164.84 | 89 | 0.57 % | 301.14 | 41 | 0.15 % | 116.10 | 15 | 0.16 % | 96.22 |
| nad | na | d | 162 | 0.24 % | 156.51 | 41 | 0.26 % | 177.86 | 24 | 0.15 % | 81.21 | 84 | 0.31 % | 237.86 | 13 | 0.14 % | 83.39 |
| preko | prek | o | 145 | 0.21 % | 140.08 | 43 | 0.27 % | 186.53 | 14 | 0.09 % | 47.37 | 73 | 0.27 % | 206.71 | 15 | 0.16 % | 96.22 |
| glede | gled | e | 132 | 0.20 % | 127.52 | 17 | 0.11 % | 73.75 | 18 | 0.12 % | 60.91 | 48 | 0.18 % | 135.92 | 49 | 0.54 % | 314.32 |
| okrog | okro | g | 125 | 0.18 % | 120.76 | 19 | 0.12 % | 82.42 | 28 | 0.18 % | 94.74 | 55 | 0.20 % | 155.74 | 23 | 0.25 % | 147.54 |
| mimo | mim | o | 117 | 0.17 % | 113.03 | 42 | 0.26 % | 182.19 | 38 | 0.24 % | 128.58 | 28 | 0.10 % | 79.29 | 9 | 0.10 % | 57.73 |
| h | | h | 112 | 0.17 % | 108.20 | 19 | 0.12 % | 82.42 | 55 | 0.35 % | 186.10 | 23 | 0.09 % | 65.13 | 15 | 0.16 % | 96.22 |
| poleg | pole | g | 112 | 0.17 % | 108.20 | 29 | 0.18 % | 125.80 | 20 | 0.13 % | 67.67 | 53 | 0.20 % | 150.08 | 10 | 0.11 % | 64.15 |
| kljub | klju | b | 99 | 0.15 % | 95.64 | 16 | 0.10 % | 69.41 | 4 | 0.03 % | 13.53 | 67 | 0.25 % | 189.72 | 12 | 0.13 % | 76.98 |
2.2.227 List of final character-level 2-grams from preposition standardized forms in the GOS 1.0 corpus with text-type distribution

File at CLARIN.SI: GOS1.0-word_parts-prepositions-standardized_forms-final-2grams-taxonomy-entire.tsv

| Standardized form | Rest of the word | Final part of the word | Total absolute frequency of standardized form | Percentage of all found standardized forms | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N] |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| na | | na | 11,885 | 29.80 % | 11,481.90 | 2,972 | 31.15 % | 12,892.42 | 2,881 | 29.56 % | 9,748.19 | 4,330 | 28.55 % | 12,261.11 | 1,702 | 31.38 % | 10,917.74 |
| za | | za | 7,868 | 19.73 % | 7,601.15 | 1,772 | 18.57 % | 7,686.87 | 1,787 | 18.34 % | 6,046.52 | 3,129 | 20.63 % | 8,860.28 | 1,180 | 21.75 % | 7,569.29 |
| po | | po | 3,337 | 8.37 % | 3,223.82 | 779 | 8.17 % | 3,379.27 | 981 | 10.07 % | 3,319.33 | 1,116 | 7.36 % | 3,160.14 | 461 | 8.50 % | 2,957.16 |
| od | | od | 2,502 | 6.27 % | 2,417.14 | 589 | 6.17 % | 2,555.06 | 674 | 6.92 % | 2,280.56 | 940 | 6.20 % | 2,661.77 | 299 | 5.51 % | 1,917.98 |
| pri | p | ri | 2,371 | 5.95 % | 2,290.58 | 443 | 4.64 % | 1,921.72 | 596 | 6.12 % | 2,016.63 | 929 | 6.12 % | 2,630.62 | 403 | 7.43 % | 2,585.11 |
| do | | do | 2,124 | 5.33 % | 2,051.96 | 479 | 5.02 % | 2,077.88 | 448 | 4.60 % | 1,515.86 | 879 | 5.79 % | 2,489.03 | 318 | 5.86 % | 2,039.86 |
| iz | | iz | 1,942 | 4.87 % | 1,876.13 | 453 | 4.75 % | 1,965.10 | 436 | 4.47 % | 1,475.26 | 888 | 5.86 % | 2,514.52 | 165 | 3.04 % | 1,058.42 |
| ob | | ob | 896 | 2.25 % | 865.61 | 291 | 3.05 % | 1,262.35 | 259 | 2.66 % | 876.36 | 282 | 1.86 % | 798.53 | 64 | 1.18 % | 410.54 |
| pred | pr | ed | 788 | 1.98 % | 761.27 | 284 | 2.98 % | 1,231.98 | 148 | 1.52 % | 500.77 | 310 | 2.04 % | 877.82 | 46 | 0.85 % | 295.07 |
| zaradi | zara | di | 786 | 1.97 % | 759.34 | 135 | 1.42 % | 585.62 | 143 | 1.47 % | 483.86 | 398 | 2.62 % | 1,127 | 110 | 2.03 % | 705.61 |
| med | m | ed | 781 | 1.96 % | 754.51 | 192 | 2.01 % | 832.89 | 100 | 1.03 % | 338.36 | 409 | 2.70 % | 1,158.15 | 80 | 1.48 % | 513.17 |
| čez | č | ez | 587 | 1.47 % | 567.09 | 211 | 2.21 % | 915.31 | 198 | 2.03 % | 669.96 | 125 | 0.82 % | 353.96 | 53 | 0.98 % | 339.98 |
| brez | br | ez | 463 | 1.16 % | 447.30 | 133 | 1.39 % | 576.95 | 118 | 1.21 % | 399.27 | 143 | 0.94 % | 404.93 | 69 | 1.27 % | 442.61 |
| pod | p | od | 337 | 0.84 % | 325.57 | 60 | 0.63 % | 260.28 | 93 | 0.95 % | 314.68 | 145 | 0.96 % | 410.59 | 39 | 0.72 % | 250.17 |
| proti | pro | ti | 323 | 0.81 % | 312.05 | 123 | 1.29 % | 533.57 | 63 | 0.65 % | 213.17 | 114 | 0.75 % | 322.81 | 23 | 0.42 % | 147.54 |
| zraven | zrav | en | 297 | 0.74 % | 286.93 | 54 | 0.57 % | 234.25 | 130 | 1.33 % | 439.87 | 48 | 0.32 % | 135.92 | 65 | 1.20 % | 416.95 |
| skoz | sk | oz | 213 | 0.53 % | 205.78 | 15 | 0.16 % | 65.07 | 130 | 1.33 % | 439.87 | 14 | 0.09 % | 39.64 | 54 | 1.00 % | 346.39 |
| skozi | sko | zi | 212 | 0.53 % | 204.81 | 50 | 0.52 % | 216.90 | 60 | 0.62 % | 203.02 | 92 | 0.61 % | 260.51 | 10 | 0.18 % | 64.15 |
| okoli | oko | li | 183 | 0.46 % | 176.79 | 38 | 0.40 % | 164.84 | 89 | 0.91 % | 301.14 | 41 | 0.27 % | 116.10 | 15 | 0.28 % | 96.22 |
| nad | n | ad | 162 | 0.41 % | 156.51 | 41 | 0.43 % | 177.86 | 24 | 0.25 % | 81.21 | 84 | 0.55 % | 237.86 | 13 | 0.24 % | 83.39 |
| preko | pre | ko | 145 | 0.36 % | 140.08 | 43 | 0.45 % | 186.53 | 14 | 0.14 % | 47.37 | 73 | 0.48 % | 206.71 | 15 | 0.28 % | 96.22 |
| glede | gle | de | 132 | 0.33 % | 127.52 | 17 | 0.18 % | 73.75 | 18 | 0.18 % | 60.91 | 48 | 0.32 % | 135.92 | 49 | 0.90 % | 314.32 |
| okrog | okr | og | 125 | 0.31 % | 120.76 | 19 | 0.20 % | 82.42 | 28 | 0.29 % | 94.74 | 55 | 0.36 % | 155.74 | 23 | 0.42 % | 147.54 |
| mimo | mi | mo | 117 | 0.29 % | 113.03 | 42 | 0.44 % | 182.19 | 38 | 0.39 % | 128.58 | 28 | 0.18 % | 79.29 | 9 | 0.17 % | 57.73 |
| poleg | pol | eg | 112 | 0.28 % | 108.20 | 29 | 0.30 % | 125.80 | 20 | 0.20 % | 67.67 | 53 | 0.35 % | 150.08 | 10 | 0.18 % | 64.15 |
| kljub | klj | ub | 99 | 0.25 % | 95.64 | 16 | 0.17 % | 69.41 | 4 | 0.04 % | 13.53 | 67 | 0.44 % | 189.72 | 12 | 0.22 % | 76.98 |
| razen | raz | en | 98 | 0.25 % | 94.68 | 14 | 0.15 % | 60.73 | 29 | 0.30 % | 98.12 | 35 | 0.23 % | 99.11 | 20 | 0.37 % | 128.29 |
| prek | pr | ek | 97 | 0.24 % | 93.71 | 15 | 0.16 % | 65.07 | 43 | 0.44 % | 145.50 | 26 | 0.17 % | 73.62 | 13 | 0.24 % | 83.39 |
| blizu | bli | zu | 96 | 0.24 % | 92.74 | 19 | 0.20 % | 82.42 | 31 | 0.32 % | 104.89 | 36 | 0.24 % | 101.94 | 10 | 0.18 % | 64.15 |
| zunaj | zun | aj | 90 | 0.23 % | 86.95 | 14 | 0.15 % | 60.73 | 30 | 0.31 % | 101.51 | 25 | 0.17 % | 70.79 | 21 | 0.39 % | 134.71 |
| izmed | izm | ed | 89 | 0.22 % | 85.98 | 46 | 0.48 % | 199.55 | 5 | 0.05 % | 16.92 | 37 | 0.24 % | 104.77 | 1 | 0.02 % | 6.41 |
| namesto | names | to | 84 | 0.21 % | 81.15 | 15 | 0.16 % | 65.07 | 19 | 0.20 % | 64.29 | 35 | 0.23 % | 99.11 | 15 | 0.28 % | 96.22 |
2.2.228 List of final character-level 3-grams from preposition standardized forms in the GOS 1.0 corpus with text-type distribution

File at CLARIN.SI: GOS1.0-word_parts-prepositions-standardized_forms-final-3grams-taxonomy-entire.tsv

| Standardized form | Rest of the word | Final part of the word | Total absolute frequency of standardized form | Percentage of all found standardized forms | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N] |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| pri | | pri | 2,371 | 25.57 % | 2,290.58 | 443 | 20.37 % | 1,921.72 | 596 | 26.15 % | 2,016.63 | 929 | 25.93 % | 2,630.62 | 403 | 32.66 % | 2,585.11 |
| pred | p | red | 788 | 8.50 % | 761.27 | 284 | 13.06 % | 1,231.98 | 148 | 6.49 % | 500.77 | 310 | 8.65 % | 877.82 | 46 | 3.73 % | 295.07 |
| zaradi | zar | adi | 786 | 8.48 % | 759.34 | 135 | 6.21 % | 585.62 | 143 | 6.28 % | 483.86 | 398 | 11.11 % | 1,127 | 110 | 8.91 % | 705.61 |
| med | | med | 781 | 8.42 % | 754.51 | 192 | 8.83 % | 832.89 | 100 | 4.39 % | 338.36 | 409 | 11.41 % | 1,158.15 | 80 | 6.48 % | 513.17 |
| čez | | čez | 587 | 6.33 % | 567.09 | 211 | 9.70 % | 915.31 | 198 | 8.69 % | 669.96 | 125 | 3.49 % | 353.96 | 53 | 4.29 % | 339.98 |
| brez | b | rez | 463 | 4.99 % | 447.30 | 133 | 6.12 % | 576.95 | 118 | 5.18 % | 399.27 | 143 | 3.99 % | 404.93 | 69 | 5.59 % | 442.61 |
| pod | | pod | 337 | 3.63 % | 325.57 | 60 | 2.76 % | 260.28 | 93 | 4.08 % | 314.68 | 145 | 4.05 % | 410.59 | 39 | 3.16 % | 250.17 |
| proti | pr | oti | 323 | 3.48 % | 312.05 | 123 | 5.66 % | 533.57 | 63 | 2.76 % | 213.17 | 114 | 3.18 % | 322.81 | 23 | 1.86 % | 147.54 |
| zraven | zra | ven | 297 | 3.20 % | 286.93 | 54 | 2.48 % | 234.25 | 130 | 5.70 % | 439.87 | 48 | 1.34 % | 135.92 | 65 | 5.27 % | 416.95 |
| skoz | s | koz | 213 | 2.30 % | 205.78 | 15 | 0.69 % | 65.07 | 130 | 5.70 % | 439.87 | 14 | 0.39 % | 39.64 | 54 | 4.38 % | 346.39 |
| skozi | sk | ozi | 212 | 2.29 % | 204.81 | 50 | 2.30 % | 216.90 | 60 | 2.63 % | 203.02 | 92 | 2.57 % | 260.51 | 10 | 0.81 % | 64.15 |
| okoli | ok | oli | 183 | 1.97 % | 176.79 | 38 | 1.75 % | 164.84 | 89 | 3.90 % | 301.14 | 41 | 1.14 % | 116.10 | 15 | 1.22 % | 96.22 |
| nad | | nad | 162 | 1.75 % | 156.51 | 41 | 1.89 % | 177.86 | 24 | 1.05 % | 81.21 | 84 | 2.34 % | 237.86 | 13 | 1.05 % | 83.39 |
| preko | pr | eko | 145 | 1.56 % | 140.08 | 43 | 1.98 % | 186.53 | 14 | 0.61 % | 47.37 | 73 | 2.04 % | 206.71 | 15 | 1.22 % | 96.22 |
| glede | gl | ede | 132 | 1.42 % | 127.52 | 17 | 0.78 % | 73.75 | 18 | 0.79 % | 60.91 | 48 | 1.34 % | 135.92 | 49 | 3.97 % | 314.32 |
| okrog | ok | rog | 125 | 1.35 % | 120.76 | 19 | 0.87 % | 82.42 | 28 | 1.23 % | 94.74 | 55 | 1.53 % | 155.74 | 23 | 1.86 % | 147.54 |
| mimo | m | imo | 117 | 1.26 % | 113.03 | 42 | 1.93 % | 182.19 | 38 | 1.67 % | 128.58 | 28 | 0.78 % | 79.29 | 9 | 0.73 % | 57.73 |
| poleg | po | leg | 112 | 1.21 % | 108.20 | 29 | 1.33 % | 125.80 | 20 | 0.88 % | 67.67 | 53 | 1.48 % | 150.08 | 10 | 0.81 % | 64.15 |
| kljub | kl | jub | 99 | 1.07 % | 95.64 | 16 | 0.74 % | 69.41 | 4 | 0.18 % | 13.53 | 67 | 1.87 % | 189.72 | 12 | 0.97 % | 76.98 |
| razen | ra | zen | 98 | 1.06 % | 94.68 | 14 | 0.64 % | 60.73 | 29 | 1.27 % | 98.12 | 35 | 0.98 % | 99.11 | 20 | 1.62 % | 128.29 |
| prek | p | rek | 97 | 1.05 % | 93.71 | 15 | 0.69 % | 65.07 | 43 | 1.89 % | 145.50 | 26 | 0.73 % | 73.62 | 13 | 1.05 % | 83.39 |
| blizu | bl | izu | 96 | 1.03 % | 92.74 | 19 | 0.87 % | 82.42 | 31 | 1.36 % | 104.89 | 36 | 1.00 % | 101.94 | 10 | 0.81 % | 64.15 |
| zunaj | zu | naj | 90 | 0.97 % | 86.95 | 14 | 0.64 % | 60.73 | 30 | 1.32 % | 101.51 | 25 | 0.70 % | 70.79 | 21 | 1.70 % | 134.71 |
| izmed | iz | med | 89 | 0.96 % | 85.98 | 46 | 2.12 % | 199.55 | 5 | 0.22 % | 16.92 | 37 | 1.03 % | 104.77 | 1 | 0.08 % | 6.41 |
| namesto | name | sto | 84 | 0.91 % | 81.15 | 15 | 0.69 % | 65.07 | 19 | 0.83 % | 64.29 | 35 | 0.98 % | 99.11 | 15 | 1.22 % | 96.22 |
| kasneje | kasn | eje | 75 | 0.81 % | 72.46 | 25 | 1.15 % | 108.45 | 11 | 0.48 % | 37.22 | 34 | 0.95 % | 96.28 | 5 | 0.41 % | 32.07 |
| sredi | sr | edi | 58 | 0.63 % | 56.03 | 10 | 0.46 % | 43.38 | 27 | 1.19 % | 91.36 | 14 | 0.39 % | 39.64 | 7 | 0.57 % | 44.90 |
| znotraj | znot | raj | 53 | 0.57 % | 51.20 | 3 | 0.14 % | 13.01 | 2 | 0.09 % | 6.77 | 40 | 1.12 % | 113.27 | 8 | 0.65 % | 51.32 |
| vrhu | v | rhu | 46 | 0.50 % | 44.44 | 21 | 0.97 % | 91.10 | 7 | 0.31 % | 23.69 | 16 | 0.45 % | 45.31 | 2 | 0.16 % | 12.83 |
| vrh | | vrh | 40 | 0.43 % | 38.64 | 13 | 0.60 % | 56.39 | 9 | 0.40 % | 30.45 | 8 | 0.22 % | 22.65 | 10 | 0.81 % | 64.15 |
| krog | k | rog | 29 | 0.31 % | 28.02 | 5 | 0.23 % | 21.69 | 3 | 0.13 % | 10.15 | 16 | 0.45 % | 45.31 | 5 | 0.41 % | 32.07 |
| izven | iz | ven | 21 | 0.23 % | 20.29 | 5 | 0.23 % | 21.69 | 3 | 0.13 % | 10.15 | 9 | 0.25 % | 25.48 | 4 | 0.32 % | 25.66 |
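As a rough illustration of how the absolute and relative frequency columns in list 2.2.228 relate, the sketch below parses cells formatted as in the list (thousands separators, a trailing percent sign) and back-computes the normalization base implied by a row. It assumes that a relative frequency is the absolute frequency divided by a fixed total and multiplied by one million, which is what the "per million occurrences" column header suggests but which the guide does not spell out here. The function names `parse_cell` and `implied_base` are hypothetical and are not part of LIST.

```python
def parse_cell(cell: str) -> float:
    """Convert a cell such as '2,371', '25.57 %' or '1,127' to a float."""
    return float(cell.replace(",", "").replace("%", "").strip())


def implied_base(absolute_cell: str, relative_cell: str) -> float:
    """Back-compute the total that a per-million relative frequency implies.

    Assumption: relative = absolute / total * 1,000,000.
    """
    return parse_cell(absolute_cell) / parse_cell(relative_cell) * 1_000_000


# Two rows copied from list 2.2.228:
# (standardized form, total absolute frequency, total relative frequency)
rows = [
    ("pri", "2,371", "2,290.58"),
    ("pred", "788", "761.27"),
]

for form, absolute, relative in rows:
    # Under the stated assumption, every row should imply about the same total.
    print(form, round(implied_base(absolute, relative)))
```

Both sample rows imply a total of roughly 1.035 million, so the assumption is at least consistent with these two rows. The same check can be repeated on the per-text-type columns to estimate the size of each subcorpus.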
2.2.229 List of final character-level 4-grams from preposition standardized forms in the GOS 1.0 corpus with text-type distribution

File at CLARIN.SI: GOS1.0-word_parts-prepositions-standardized_forms-final-4grams-taxonomy-entire.tsv

| Standardized form | Rest of the word | Final part of the word | Total absolute frequency of standardized form | Percentage of all found standardized forms | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N] |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| pred | | pred | 788 | 15.82 % | 761.27 | 284 | 23.47 % | 1,231.98 | 148 | 11.80 % | 500.77 | 310 | 16.48 % | 877.82 | 46 | 7.23 % | 295.07 |
| zaradi | za | radi | 786 | 15.78 % | 759.34 | 135 | 11.16 % | 585.62 | 143 | 11.40 % | 483.86 | 398 | 21.16 % | 1,127 | 110 | 17.30 % | 705.61 |
| brez | | brez | 463 | 9.29 % | 447.30 | 133 | 10.99 % | 576.95 | 118 | 9.41 % | 399.27 | 143 | 7.60 % | 404.93 | 69 | 10.85 % | 442.61 |
| proti | p | roti | 323 | 6.49 % | 312.05 | 123 | 10.16 % | 533.57 | 63 | 5.02 % | 213.17 | 114 | 6.06 % | 322.81 | 23 | 3.62 % | 147.54 |
| zraven | zr | aven | 297 | 5.96 % | 286.93 | 54 | 4.46 % | 234.25 | 130 | 10.37 % | 439.87 | 48 | 2.55 % | 135.92 | 65 | 10.22 % | 416.95 |
| skoz | | skoz | 213 | 4.28 % | 205.78 | 15 | 1.24 % | 65.07 | 130 | 10.37 % | 439.87 | 14 | 0.74 % | 39.64 | 54 | 8.49 % | 346.39 |
| skozi | s | kozi | 212 | 4.26 % | 204.81 | 50 | 4.13 % | 216.90 | 60 | 4.79 % | 203.02 | 92 | 4.89 % | 260.51 | 10 | 1.57 % | 64.15 |
| okoli | o | koli | 183 | 3.67 % | 176.79 | 38 | 3.14 % | 164.84 | 89 | 7.10 % | 301.14 | 41 | 2.18 % | 116.10 | 15 | 2.36 % | 96.22 |
| preko | p | reko | 145 | 2.91 % | 140.08 | 43 | 3.55 % | 186.53 | 14 | 1.12 % | 47.37 | 73 | 3.88 % | 206.71 | 15 | 2.36 % | 96.22 |
| glede | g | lede | 132 | 2.65 % | 127.52 | 17 | 1.41 % | 73.75 | 18 | 1.44 % | 60.91 | 48 | 2.55 % | 135.92 | 49 | 7.70 % | 314.32 |
| okrog | o | krog | 125 | 2.51 % | 120.76 | 19 | 1.57 % | 82.42 | 28 | 2.23 % | 94.74 | 55 | 2.92 % | 155.74 | 23 | 3.62 % | 147.54 |
| mimo | | mimo | 117 | 2.35 % | 113.03 | 42 | 3.47 % | 182.19 | 38 | 3.03 % | 128.58 | 28 | 1.49 % | 79.29 | 9 | 1.42 % | 57.73 |
| poleg | p | oleg | 112 | 2.25 % | 108.20 | 29 | 2.40 % | 125.80 | 20 | 1.59 % | 67.67 | 53 | 2.82 % | 150.08 | 10 | 1.57 % | 64.15 |
| kljub | k | ljub | 99 | 1.99 % | 95.64 | 16 | 1.32 % | 69.41 | 4 | 0.32 % | 13.53 | 67 | 3.56 % | 189.72 | 12 | 1.89 % | 76.98 |
| razen | r | azen | 98 | 1.97 % | 94.68 | 14 | 1.16 % | 60.73 | 29 | 2.31 % | 98.12 | 35 | 1.86 % | 99.11 | 20 | 3.15 % | 128.29 |
| prek | | prek | 97 | 1.95 % | 93.71 | 15 | 1.24 % | 65.07 | 43 | 3.43 % | 145.50 | 26 | 1.38 % | 73.62 | 13 | 2.04 % | 83.39 |
| blizu | b | lizu | 96 | 1.93 % | 92.74 | 19 | 1.57 % | 82.42 | 31 | 2.47 % | 104.89 | 36 | 1.91 % | 101.94 | 10 | 1.57 % | 64.15 |
| zunaj | z | unaj | 90 | 1.81 % | 86.95 | 14 | 1.16 % | 60.73 | 30 | 2.39 % | 101.51 | 25 | 1.33 % | 70.79 | 21 | 3.30 % | 134.71 |
| izmed | i | zmed | 89 | 1.79 % | 85.98 | 46 | 3.80 % | 199.55 | 5 | 0.40 % | 16.92 | 37 | 1.97 % | 104.77 | 1 | 0.16 % | 6.41 |
| namesto | nam | esto | 84 | 1.69 % | 81.15 | 15 | 1.24 % | 65.07 | 19 | 1.51 % | 64.29 | 35 | 1.86 % | 99.11 | 15 | 2.36 % | 96.22 |
| kasneje | kas | neje | 75 | 1.51 % | 72.46 | 25 | 2.07 % | 108.45 | 11 | 0.88 % | 37.22 | 34 | 1.81 % | 96.28 | 5 | 0.79 % | 32.07 |
| sredi | s | redi | 58 | 1.16 % | 56.03 | 10 | 0.83 % | 43.38 | 27 | 2.15 % | 91.36 | 14 | 0.74 % | 39.64 | 7 | 1.10 % | 44.90 |
| znotraj | zno | traj | 53 | 1.06 % | 51.20 | 3 | 0.25 % | 13.01 | 2 | 0.16 % | 6.77 | 40 | 2.13 % | 113.27 | 8 | 1.26 % | 51.32 |
| vrhu | | vrhu | 46 | 0.92 % | 44.44 | 21 | 1.74 % | 91.10 | 7 | 0.56 % | 23.69 | 16 | 0.85 % | 45.31 | 2 | 0.31 % | 12.83 |
| krog | | krog | 29 | 0.58 % | 28.02 | 5 | 0.41 % | 21.69 | 3 | 0.24 % | 10.15 | 16 | 0.85 % | 45.31 | 5 | 0.79 % | 32.07 |
| izven | i | zven | 21 | 0.42 % | 20.29 | 5 | 0.41 % | 21.69 | 3 | 0.24 % | 10.15 | 9 | 0.48 % | 25.48 | 4 | 0.63 % | 25.66 |
| konec | k | onec | 21 | 0.42 % | 20.29 | 0 | 0 % | 0 | 5 | 0.40 % | 16.92 | 14 | 0.74 % | 39.64 | 2 | 0.31 % | 12.83 |
| nasproti | nasp | roti | 21 | 0.42 % | 20.29 | 1 | 0.08 % | 4.34 | 12 | 0.96 % | 40.60 | 5 | 0.27 % | 14.16 | 3 | 0.47 % | 19.24 |
| izpred | iz | pred | 16 | 0.32 % | 15.46 | 2 | 0.17 % | 8.68 | 5 | 0.40 % | 16.92 | 9 | 0.48 % | 25.48 | 0 | 0 % | 0 |
| bližje | bl | ižje | 12 | 0.24 % | 11.59 | 2 | 0.17 % | 8.68 | 4 | 0.32 % | 13.53 | 5 | 0.27 % | 14.16 | 1 | 0.16 % | 6.41 |
| zoper | z | oper | 11 | 0.22 % | 10.63 | 0 | 0 % | 0 | 0 | 0 % | 0 | 11 | 0.58 % | 31.15 | 0 | 0 % | 0 |
| bliže | b | liže | 10 | 0.20 % | 9.66 | 6 | 0.50 % | 26.03 | 0 | 0 % | 0 | 4 | 0.21 % | 11.33 | 0 | 0 % | 0 |

2.2.230 List of final character-level 5-grams from preposition standardized forms in the GOS 1.0 corpus with text-type distribution

File at CLARIN.SI: GOS1.0-word_parts-prepositions-standardized_forms-final-5grams-taxonomy-entire.tsv

| Standardized form | Rest of the word | Final part of the word | Total absolute frequency of standardized form | Percentage of all found standardized forms | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N] |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| zaradi | z | aradi | 786 | 24.43 % | 759.34 | 135 | 19.42 % | 585.62 | 143 | 18.67 % | 483.86 | 398 | 30.02 % | 1,127 | 110 | 25.52 % | 705.61 |
| proti | | proti | 323 | 10.04 % | 312.05 | 123 | 17.70 % | 533.57 | 63 | 8.22 % | 213.17 | 114 | 8.60 % | 322.81 | 23 | 5.34 % | 147.54 |
| zraven | z | raven | 297 | 9.23 % | 286.93 | 54 | 7.77 % | 234.25 | 130 | 16.97 % | 439.87 | 48 | 3.62 % | 135.92 | 65 | 15.08 % | 416.95 |
| skozi | | skozi | 212 | 6.59 % | 204.81 | 50 | 7.19 % | 216.90 | 60 | 7.83 % | 203.02 | 92 | 6.94 % | 260.51 | 10 | 2.32 % | 64.15 |
| okoli | | okoli | 183 | 5.69 % | 176.79 | 38 | 5.47 % | 164.84 | 89 | 11.62 % | 301.14 | 41 | 3.09 % | 116.10 | 15 | 3.48 % | 96.22 |
| preko | | preko | 145 | 4.51 % | 140.08 | 43 | 6.19 % | 186.53 | 14 | 1.83 % | 47.37 | 73 | 5.50 % | 206.71 | 15 | 3.48 % | 96.22 |
| glede | | glede | 132 | 4.10 % | 127.52 | 17 | 2.45 % | 73.75 | 18 | 2.35 % | 60.91 | 48 | 3.62 % | 135.92 | 49 | 11.37 % | 314.32 |
| okrog | | okrog | 125 | 3.88 % | 120.76 | 19 | 2.73 % | 82.42 | 28 | 3.65 % | 94.74 | 55 | 4.15 % | 155.74 | 23 | 5.34 % | 147.54 |
| poleg | | poleg | 112 | 3.48 % | 108.20 | 29 | 4.17 % | 125.80 | 20 | 2.61 % | 67.67 | 53 | 4.00 % | 150.08 | 10 | 2.32 % | 64.15 |
| kljub | | kljub | 99 | 3.08 % | 95.64 | 16 | 2.30 % | 69.41 | 4 | 0.52 % | 13.53 | 67 | 5.05 % | 189.72 | 12 | 2.78 % | 76.98 |
| razen | | razen | 98 | 3.04 % | 94.68 | 14 | 2.01 % | 60.73 | 29 | 3.79 % | 98.12 | 35 | 2.64 % | 99.11 | 20 | 4.64 % | 128.29 |
| blizu | | blizu | 96 | 2.98 % | 92.74 | 19 | 2.73 % | 82.42 | 31 | 4.05 % | 104.89 | 36 | 2.71 % | 101.94 | 10 | 2.32 % | 64.15 |
| zunaj | | zunaj | 90 | 2.80 % | 86.95 | 14 | 2.01 % | 60.73 | 30 | 3.92 % | 101.51 | 25 | 1.89 % | 70.79 | 21 | 4.87 % | 134.71 |
| izmed | | izmed | 89 | 2.77 % | 85.98 | 46 | 6.62 % | 199.55 | 5 | 0.65 % | 16.92 | 37 | 2.79 % | 104.77 | 1 | 0.23 % | 6.41 |
| namesto | na | mesto | 84 | 2.61 % | 81.15 | 15 | 2.16 % | 65.07 | 19 | 2.48 % | 64.29 | 35 | 2.64 % | 99.11 | 15 | 3.48 % | 96.22 |
| kasneje | ka | sneje | 75 | 2.33 % | 72.46 | 25 | 3.60 % | 108.45 | 11 | 1.44 % | 37.22 | 34 | 2.56 % | 96.28 | 5 | 1.16 % | 32.07 |
| sredi | | sredi | 58 | 1.80 % | 56.03 | 10 | 1.44 % | 43.38 | 27 | 3.52 % | 91.36 | 14 | 1.06 % | 39.64 | 7 | 1.62 % | 44.90 |
| znotraj | zn | otraj | 53 | 1.65 % | 51.20 | 3 | 0.43 % | 13.01 | 2 | 0.26 % | 6.77 | 40 | 3.02 % | 113.27 | 8 | 1.86 % | 51.32 |
| izven | | izven | 21 | 0.65 % | 20.29 | 5 | 0.72 % | 21.69 | 3 | 0.39 % | 10.15 | 9 | 0.68 % | 25.48 | 4 | 0.93 % | 25.66 |
| konec | | konec | 21 | 0.65 % | 20.29 | 0 | 0 % | 0 | 5 | 0.65 % | 16.92 | 14 | 1.06 % | 39.64 | 2 | 0.46 % | 12.83 |
| nasproti | nas | proti | 21 | 0.65 % | 20.29 | 1 | 0.14 % | 4.34 | 12 | 1.57 % | 40.60 | 5 | 0.38 % | 14.16 | 3 | 0.70 % | 19.24 |
| izpred | i | zpred | 16 | 0.50 % | 15.46 | 2 | 0.29 % | 8.68 | 5 | 0.65 % | 16.92 | 9 | 0.68 % | 25.48 | 0 | 0 % | 0 |
| bližje | b | ližje | 12 | 0.37 % | 11.59 | 2 | 0.29 % | 8.68 | 4 | 0.52 % | 13.53 | 5 | 0.38 % | 14.16 | 1 | 0.23 % | 6.41 |
| zoper | | zoper | 11 | 0.34 % | 10.63 | 0 | 0 % | 0 | 0 | 0 % | 0 | 11 | 0.83 % | 31.15 | 0 | 0 % | 0 |
| bliže | | bliže | 10 | 0.31 % | 9.66 | 6 | 0.86 % | 26.03 | 0 | 0 % | 0 | 4 | 0.30 % | 11.33 | 0 | 0 % | 0 |
| zavoljo | za | voljo | 7 | 0.22 % | 6.76 | 1 | 0.14 % | 4.34 | 1 | 0.13 % | 3.38 | 5 | 0.38 % | 14.16 | 0 | 0 % | 0 |
| najbližje | najb | ližje | 6 | 0.19 % | 5.80 | 3 | 0.43 % | 13.01 | 1 | 0.13 % | 3.38 | 1 | 0.07 % | 2.83 | 1 | 0.23 % | 6.41 |
| naokoli | na | okoli | 5 | 0.15 % | 4.83 | 0 | 0 % | 0 | 4 | 0.52 % | 13.53 | 1 | 0.07 % | 2.83 | 0 | 0 % | 0 |
| napram | n | apram | 5 | 0.15 % | 4.83 | 0 | 0 % | 0 | 3 | 0.39 % | 10.15 | 2 | 0.15 % | 5.66 | 0 | 0 % | 0 |
| spričo | s | pričo | 5 | 0.15 % | 4.83 | 1 | 0.14 % | 4.34 | 0 | 0 % | 0 | 4 | 0.30 % | 11.33 | 0 | 0 % | 0 |
| vzdolž | v | zdolž | 4 | 0.12 % | 3.86 | 0 | 0 % | 0 | 0 | 0 % | 0 | 4 | 0.30 % | 11.33 | 0 | 0 % | 0 |
| tekom | | tekom | 3 | 0.09 % | 2.90 | 1 | 0.14 % | 4.34 | 0 | 0 % | 0 | 1 | 0.07 % | 2.83 | 1 | 0.23 % | 6.41 |

2.2.231 List of initial character-level 1-grams from preposition lower-case word forms in the GOS 1.0 corpus with text-type distribution

File at CLARIN.SI: GOS1.0-word_parts-prepositions-lowercase_forms-initial-1grams-taxonomy-entire.tsv

Lower-case word form Initial part of the word Rest of the word Total absolute frequency of lower-case word form Percentage of all found lower-case word forms Total relative frequency (per
million occurrences) Absolute frequency [gos.T.J.R] Percentage [gos.T.J.R] Relative frequency [gos.T.J.R] Absolute frequency [gos.T.N.Z] Percentage [gos.T.N.Z] Relative frequency [gos.T.N.Z] Absolute frequency [gos.T.J.I] Percentage [gos.T.J.I] Relative frequency [gos.T.J.I] Absolute frequency [gos.T.N.N] Percentage [gos.T.N.N] Relative frequency [gos.T.N.N] v v 16,790 24.75 % 16,220.55 3,902 24.26 % 16,926.73 3,331 21.35 % 11,270.82 7,264 26.90 % 20,569.22 2,293 25.09 % 14,708.81 na n a 11,774 17.36 % 11,374.67 2,981 18.53 % 12,931.46 2,781 17.82 % 9,409.83 4,317 15.98 % 12,224.30 1,695 18.54 % 10,872.84 za z a 7,734 11.40 % 7,471.69 1,771 11.01 % 7,682.53 1,671 10.71 % 5,654.02 3,119 11.55 % 8,831.97 1,173 12.83 % 7,524.39 z z 4,412 6.50 % 4,262.36 1,130 7.03 % 4,901.90 947 6.07 % 3,204.28 1,766 6.54 % 5,000.72 569 6.22 % 3,649.94 s s 3,288 4.85 % 3,176.48 752 4.67 % 3,262.15 682 4.37 % 2,307.62 1,345 4.98 % 3,808.59 509 5.57 % 3,265.06 po p o 3,236 4.77 % 3,126.25 770 4.79 % 3,340.23 897 5.75 % 3,035.10 1,109 4.11 % 3,140.32 460 5.03 % 2,950.74 od o d 2,431 3.58 % 2,348.55 586 3.64 % 2,542.05 611 3.92 % 2,067.39 936 3.47 % 2,650.44 298 3.26 % 1,911.57 o o 2,094 3.09 % 2,022.98 485 3.02 % 2,103.91 322 2.06 % 1,089.52 1,074 3.98 % 3,041.21 213 2.33 % 1,366.32 do d o 2,089 3.08 % 2,018.15 474 2.95 % 2,056.19 418 2.68 % 1,414.35 879 3.25 % 2,489.03 318 3.48 % 2,039.86 iz i z 1,885 2.78 % 1,821.07 438 2.72 % 1,900.03 408 2.62 % 1,380.51 879 3.25 % 2,489.03 160 1.75 % 1,026.34 pri p ri 1,559 2.30 % 1,506.12 340 2.11 % 1,474.91 163 1.04 % 551.53 798 2.96 % 2,259.67 258 2.82 % 1,654.98 ob o b 864 1.27 % 834.70 291 1.81 % 1,262.35 234 1.50 % 791.77 279 1.03 % 790.03 60 0.66 % 384.88 pr p r 783 1.15 % 756.44 103 0.64 % 446.81 407 2.61 % 1,377.13 129 0.48 % 365.28 144 1.57 % 923.71 pred p red 779 1.15 % 752.58 282 1.75 % 1,223.31 142 0.91 % 480.47 309 1.14 % 874.98 46 0.50 % 295.07 med m ed 777 1.15 % 750.65 192 1.19 % 832.89 99 0.63 % 334.98 407 1.51 % 1,152.49 79 0.86 % 
506.76 čez č ez 575 0.85 % 555.50 210 1.30 % 910.97 187 1.20 % 632.74 125 0.46 % 353.96 53 0.58 % 339.98 k k 463 0.68 % 447.30 105 0.65 % 455.49 107 0.69 % 362.05 187 0.69 % 529.52 64 0.70 % 410.54 brez b rez 457 0.67 % 441.50 133 0.83 % 576.95 113 0.72 % 382.35 143 0.53 % 404.93 68 0.74 % 436.20 u u 417 0.61 % 402.86 87 0.54 % 377.40 214 1.37 % 724.09 97 0.36 % 274.67 19 0.21 % 121.88 zarad z arad 386 0.57 % 372.91 59 0.37 % 255.94 82 0.53 % 277.46 172 0.64 % 487.05 73 0.80 % 468.27 pod p od 332 0.49 % 320.74 60 0.37 % 260.28 87 0.56 % 294.37 147 0.54 % 416.25 38 0.42 % 243.76 zaradi z aradi 312 0.46 % 301.42 69 0.43 % 299.32 20 0.13 % 67.67 195 0.72 % 552.17 28 0.31 % 179.61 proti p roti 274 0.40 % 264.71 119 0.74 % 516.22 40 0.26 % 135.34 95 0.35 % 269.01 20 0.22 % 128.29 skoz s koz 232 0.34 % 224.13 42 0.26 % 182.19 110 0.70 % 372.20 26 0.10 % 73.62 54 0.59 % 346.39 f f 228 0.34 % 220.27 25 0.15 % 108.45 147 0.94 % 497.39 32 0.12 % 90.61 24 0.26 % 153.95 zraven z raven 207 0.30 % 199.98 39 0.24 % 169.18 70 0.45 % 236.85 41 0.15 % 116.10 57 0.62 % 365.64 nad n ad 159 0.23 % 153.61 40 0.25 % 173.52 22 0.14 % 74.44 84 0.31 % 237.86 13 0.14 % 83.39 h h 139 0.20 % 134.29 20 0.12 % 86.76 73 0.47 % 247 30 0.11 % 84.95 16 0.17 % 102.63 preko p reko 138 0.20 % 133.32 36 0.22 % 156.17 13 0.08 % 43.99 74 0.27 % 209.54 15 0.16 % 96.22 glede g lede 132 0.20 % 127.52 17 0.11 % 73.75 18 0.12 % 60.91 48 0.18 % 135.92 49 0.54 % 314.32 n n 127 0.19 % 122.69 1 0.01 % 4.34 95 0.61 % 321.44 29 0.11 % 82.12 2 0.02 % 12.83 okrog o krog 122 0.18 % 117.86 17 0.11 % 73.75 27 0.17 % 91.36 55 0.20 % 155.74 23 0.25 % 147.54 CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 575 File at CLARIN.SI2.2.232 List of initial character-level 2-grams from preposition lower-case word forms in the GOS 1.0 corpus with text-type distributionGOS1.0-word_parts-prepositions-lowercase_ forms-initial-2grams-taxonomy-entire.tsvLower-case word form Initial part of the word Rest of 
the word Total absolute frequency of lower-case word form Percentage of all found lower-case word forms Total relative frequency (per million occurrences) Absolute frequency [gos.T.J.R] Percentage [gos.T.J.R] Relative frequency [gos.T.J.R] Absolute frequency [gos.T.N.Z] Percentage [gos.T.N.Z] Relative frequency [gos.T.N.Z] Absolute frequency [gos.T.J.I] Percentage [gos.T.J.I] Relative frequency [gos.T.J.I] Absolute frequency [gos.T.N.N] Percentage [gos.T.N.N] Relative frequency [gos.T.N.N] na na 11,774 29.55 % 11,374.67 2,981 31.14 % 12,931.46 2,781 28.76 % 9,409.83 4,317 28.45 % 12,224.30 1,695 31.21 % 10,872.84 za za 7,734 19.41 % 7,471.69 1,771 18.50 % 7,682.53 1,671 17.28 % 5,654.02 3,119 20.55 % 8,831.97 1,173 21.60 % 7,524.39 po po 3,236 8.12 % 3,126.25 770 8.04 % 3,340.23 897 9.28 % 3,035.10 1,109 7.31 % 3,140.32 460 8.47 % 2,950.74 od od 2,431 6.10 % 2,348.55 586 6.12 % 2,542.05 611 6.32 % 2,067.39 936 6.17 % 2,650.44 298 5.49 % 1,911.57 do do 2,089 5.24 % 2,018.15 474 4.95 % 2,056.19 418 4.32 % 1,414.35 879 5.79 % 2,489.03 318 5.86 % 2,039.86 iz iz 1,885 4.73 % 1,821.07 438 4.58 % 1,900.03 408 4.22 % 1,380.51 879 5.79 % 2,489.03 160 2.95 % 1,026.34 pri pr i 1,559 3.91 % 1,506.12 340 3.55 % 1,474.91 163 1.69 % 551.53 798 5.26 % 2,259.67 258 4.75 % 1,654.98 ob ob 864 2.17 % 834.70 291 3.04 % 1,262.35 234 2.42 % 791.77 279 1.84 % 790.03 60 1.10 % 384.88 pr pr 783 1.97 % 756.44 103 1.08 % 446.81 407 4.21 % 1,377.13 129 0.85 % 365.28 144 2.65 % 923.71 pred pr ed 779 1.96 % 752.58 282 2.95 % 1,223.31 142 1.47 % 480.47 309 2.04 % 874.98 46 0.85 % 295.07 med me d 777 1.95 % 750.65 192 2.01 % 832.89 99 1.02 % 334.98 407 2.68 % 1,152.49 79 1.46 % 506.76 čez če z 575 1.44 % 555.50 210 2.19 % 910.97 187 1.93 % 632.74 125 0.82 % 353.96 53 0.98 % 339.98 brez br ez 457 1.15 % 441.50 133 1.39 % 576.95 113 1.17 % 382.35 143 0.94 % 404.93 68 1.25 % 436.20 zarad za rad 386 0.97 % 372.91 59 0.62 % 255.94 82 0.85 % 277.46 172 1.13 % 487.05 73 1.34 % 468.27 pod po d 332 0.83 % 
320.74 60 0.63 % 260.28 87 0.90 % 294.37 147 0.97 % 416.25 38 0.70 % 243.76 zaradi za radi 312 0.78 % 301.42 69 0.72 % 299.32 20 0.21 % 67.67 195 1.28 % 552.17 28 0.52 % 179.61 proti pr oti 274 0.69 % 264.71 119 1.24 % 516.22 40 0.41 % 135.34 95 0.63 % 269.01 20 0.37 % 128.29 skoz sk oz 232 0.58 % 224.13 42 0.44 % 182.19 110 1.14 % 372.20 26 0.17 % 73.62 54 0.99 % 346.39 zraven zr aven 207 0.52 % 199.98 39 0.41 % 169.18 70 0.72 % 236.85 41 0.27 % 116.10 57 1.05 % 365.64 nad na d 159 0.40 % 153.61 40 0.42 % 173.52 22 0.23 % 74.44 84 0.55 % 237.86 13 0.24 % 83.39 preko pr eko 138 0.35 % 133.32 36 0.38 % 156.17 13 0.13 % 43.99 74 0.49 % 209.54 15 0.28 % 96.22 glede gl ede 132 0.33 % 127.52 17 0.18 % 73.75 18 0.19 % 60.91 48 0.32 % 135.92 49 0.90 % 314.32 okrog ok rog 122 0.31 % 117.86 17 0.18 % 73.75 27 0.28 % 91.36 55 0.36 % 155.74 23 0.42 % 147.54 poleg po leg 110 0.28 % 106.27 28 0.29 % 121.46 19 0.20 % 64.29 53 0.35 % 150.08 10 0.18 % 64.15 skozi sk ozi 106 0.27 % 102.40 16 0.17 % 69.41 10 0.10 % 33.84 75 0.49 % 212.37 5 0.09 % 32.07 kljub kl jub 92 0.23 % 88.88 15 0.16 % 65.07 3 0.03 % 10.15 62 0.41 % 175.56 12 0.22 % 76.98 izmed iz med 89 0.22 % 85.98 46 0.48 % 199.55 5 0.05 % 16.92 37 0.24 % 104.77 1 0.02 % 6.41 razen ra zen 87 0.22 % 84.05 14 0.15 % 60.73 21 0.22 % 71.06 34 0.22 % 96.28 18 0.33 % 115.46 prek pr ek 84 0.21 % 81.15 12 0.12 % 52.06 35 0.36 % 118.43 24 0.16 % 67.96 13 0.24 % 83.39 hu hu 77 0.19 % 74.39 0 0 % 0 77 0.80 % 260.54 0 0 % 0 0 0 % 0 mimo mi mo 77 0.19 % 74.39 33 0.34 % 143.15 11 0.11 % 37.22 26 0.17 % 73.62 7 0.13 % 44.90 okol ok ol 75 0.19 % 72.46 13 0.14 % 56.39 41 0.42 % 138.73 9 0.06 % 25.48 12 0.22 % 76.98 CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 576 File at CLARIN.SI2.2.233 List of initial character-level 3-grams from preposition lower-case word forms in the GOS 1.0 corpus with text-type distributionGOS1.0-word_parts-prepositions-lowercase_ forms-initial-3grams-taxonomy-entire.tsvLower-case 
word form Initial part of the word Rest of the word Total absolute frequency of lower-case word form Percentage of all found lower-case word forms Total relative frequency (per million occurrences) Absolute frequency [gos.T.J.R] Percentage [gos.T.J.R] Relative frequency [gos.T.J.R] Absolute frequency [gos.T.N.Z] Percentage [gos.T.N.Z] Relative frequency [gos.T.N.Z] Absolute frequency [gos.T.J.I] Percentage [gos.T.J.I] Relative frequency [gos.T.J.I] Absolute frequency [gos.T.N.N] Percentage [gos.T.N.N] Relative frequency [gos.T.N.N] pri pri 1,559 18.12 % 1,506.12 340 16.27 % 1,474.91 163 8.35 % 551.53 798 23.02 % 2,259.67 258 23.58 % 1,654.98 pred pre d 779 9.05 % 752.58 282 13.49 % 1,223.31 142 7.27 % 480.47 309 8.91 % 874.98 46 4.21 % 295.07 med med 777 9.03 % 750.65 192 9.19 % 832.89 99 5.07 % 334.98 407 11.74 % 1,152.49 79 7.22 % 506.76 čez čez 575 6.68 % 555.50 210 10.05 % 910.97 187 9.57 % 632.74 125 3.60 % 353.96 53 4.84 % 339.98 brez bre z 457 5.31 % 441.50 133 6.36 % 576.95 113 5.79 % 382.35 143 4.12 % 404.93 68 6.22 % 436.20 zarad zar ad 386 4.49 % 372.91 59 2.82 % 255.94 82 4.20 % 277.46 172 4.96 % 487.05 73 6.67 % 468.27 pod pod 332 3.86 % 320.74 60 2.87 % 260.28 87 4.46 % 294.37 147 4.24 % 416.25 38 3.47 % 243.76 zaradi zar adi 312 3.63 % 301.42 69 3.30 % 299.32 20 1.02 % 67.67 195 5.62 % 552.17 28 2.56 % 179.61 proti pro ti 274 3.19 % 264.71 119 5.69 % 516.22 40 2.05 % 135.34 95 2.74 % 269.01 20 1.83 % 128.29 skoz sko z 232 2.70 % 224.13 42 2.01 % 182.19 110 5.63 % 372.20 26 0.75 % 73.62 54 4.94 % 346.39 zraven zra ven 207 2.41 % 199.98 39 1.87 % 169.18 70 3.58 % 236.85 41 1.18 % 116.10 57 5.21 % 365.64 nad nad 159 1.85 % 153.61 40 1.91 % 173.52 22 1.13 % 74.44 84 2.42 % 237.86 13 1.19 % 83.39 preko pre ko 138 1.60 % 133.32 36 1.72 % 156.17 13 0.67 % 43.99 74 2.13 % 209.54 15 1.37 % 96.22 glede gle de 132 1.53 % 127.52 17 0.81 % 73.75 18 0.92 % 60.91 48 1.38 % 135.92 49 4.48 % 314.32 okrog okr og 122 1.42 % 117.86 17 0.81 % 73.75 27 1.38 % 91.36 55 
1.59 % 155.74 23 2.10 % 147.54 poleg pol eg 110 1.28 % 106.27 28 1.34 % 121.46 19 0.97 % 64.29 53 1.53 % 150.08 10 0.91 % 64.15 skozi sko zi 106 1.23 % 102.40 16 0.77 % 69.41 10 0.51 % 33.84 75 2.16 % 212.37 5 0.46 % 32.07 kljub klj ub 92 1.07 % 88.88 15 0.72 % 65.07 3 0.15 % 10.15 62 1.79 % 175.56 12 1.10 % 76.98 izmed izm ed 89 1.03 % 85.98 46 2.20 % 199.55 5 0.26 % 16.92 37 1.07 % 104.77 1 0.09 % 6.41 razen raz en 87 1.01 % 84.05 14 0.67 % 60.73 21 1.07 % 71.06 34 0.98 % 96.28 18 1.65 % 115.46 prek pre k 84 0.98 % 81.15 12 0.57 % 52.06 35 1.79 % 118.43 24 0.69 % 67.96 13 1.19 % 83.39 mimo mim o 77 0.90 % 74.39 33 1.58 % 143.15 11 0.56 % 37.22 26 0.75 % 73.62 7 0.64 % 44.90 okol oko l 75 0.87 % 72.46 13 0.62 % 56.39 41 2.10 % 138.73 9 0.26 % 25.48 12 1.10 % 76.98 blizu bli zu 68 0.79 % 65.69 17 0.81 % 73.75 10 0.51 % 33.84 31 0.89 % 87.78 10 0.91 % 64.15 zravn zra vn 66 0.77 % 63.76 9 0.43 % 39.04 42 2.15 % 142.11 7 0.20 % 19.82 8 0.73 % 51.32 okoli oko li 65 0.76 % 62.80 22 1.05 % 95.44 10 0.51 % 33.84 31 0.89 % 87.78 2 0.18 % 12.83 namesto nam esto 55 0.64 % 53.13 11 0.53 % 47.72 7 0.36 % 23.69 29 0.84 % 82.12 8 0.73 % 51.32 kasneje kas neje 54 0.63 % 52.17 20 0.96 % 86.76 2 0.10 % 6.77 31 0.89 % 87.78 1 0.09 % 6.41 vrh vrh 42 0.49 % 40.58 13 0.62 % 56.39 10 0.51 % 33.84 8 0.23 % 22.65 11 1.00 % 70.56 vrhu vrh u 42 0.49 % 40.58 21 1.00 % 91.10 4 0.20 % 13.53 16 0.46 % 45.31 1 0.09 % 6.41 prot pro t 39 0.45 % 37.68 2 0.10 % 8.68 17 0.87 % 57.52 18 0.52 % 50.97 2 0.18 % 12.83 zard zar d 39 0.45 % 37.68 0 0 % 0 17 0.87 % 57.52 15 0.43 % 42.47 7 0.64 % 44.90 CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 577 File at CLARIN.SI2.2.234 List of initial character-level 4-grams from preposition lower-case word forms in the GOS 1.0 corpus with text-type distributionGOS1.0-word_parts-prepositions-lowercase_ forms-initial-4grams-taxonomy-entire.tsvLower-case word form Initial part of the word Rest of the word Total absolute frequency of 
lower-case word form Percentage of all found lower-case word forms Total relative frequency (per million occurrences) Absolute frequency [gos.T.J.R] Percentage [gos.T.J.R] Relative frequency [gos.T.J.R] Absolute frequency [gos.T.N.Z] Percentage [gos.T.N.Z] Relative frequency [gos.T.N.Z] Absolute frequency [gos.T.J.I] Percentage [gos.T.J.I] Relative frequency [gos.T.J.I] Absolute frequency [gos.T.N.N] Percentage [gos.T.N.N] Relative frequency [gos.T.N.N] pred pred 779 15.52 % 752.58 282 23.21 % 1,223.31 142 11.13 % 480.47 309 16.36 % 874.98 46 7.21 % 295.07 brez brez 457 9.11 % 441.50 133 10.95 % 576.95 113 8.86 % 382.35 143 7.57 % 404.93 68 10.66 % 436.20 zarad zara d 386 7.69 % 372.91 59 4.86 % 255.94 82 6.43 % 277.46 172 9.11 % 487.05 73 11.44 % 468.27 zaradi zara di 312 6.22 % 301.42 69 5.68 % 299.32 20 1.57 % 67.67 195 10.32 % 552.17 28 4.39 % 179.61 proti prot i 274 5.46 % 264.71 119 9.79 % 516.22 40 3.13 % 135.34 95 5.03 % 269.01 20 3.13 % 128.29 skoz skoz 232 4.62 % 224.13 42 3.46 % 182.19 110 8.62 % 372.20 26 1.38 % 73.62 54 8.46 % 346.39 zraven zrav en 207 4.12 % 199.98 39 3.21 % 169.18 70 5.49 % 236.85 41 2.17 % 116.10 57 8.93 % 365.64 preko prek o 138 2.75 % 133.32 36 2.96 % 156.17 13 1.02 % 43.99 74 3.92 % 209.54 15 2.35 % 96.22 glede gled e 132 2.63 % 127.52 17 1.40 % 73.75 18 1.41 % 60.91 48 2.54 % 135.92 49 7.68 % 314.32 okrog okro g 122 2.43 % 117.86 17 1.40 % 73.75 27 2.12 % 91.36 55 2.91 % 155.74 23 3.60 % 147.54 poleg pole g 110 2.19 % 106.27 28 2.31 % 121.46 19 1.49 % 64.29 53 2.81 % 150.08 10 1.57 % 64.15 skozi skoz i 106 2.11 % 102.40 16 1.32 % 69.41 10 0.78 % 33.84 75 3.97 % 212.37 5 0.78 % 32.07 kljub klju b 92 1.83 % 88.88 15 1.24 % 65.07 3 0.23 % 10.15 62 3.28 % 175.56 12 1.88 % 76.98 izmed izme d 89 1.77 % 85.98 46 3.79 % 199.55 5 0.39 % 16.92 37 1.96 % 104.77 1 0.16 % 6.41 razen raze n 87 1.73 % 84.05 14 1.15 % 60.73 21 1.65 % 71.06 34 1.80 % 96.28 18 2.82 % 115.46 prek prek 84 1.67 % 81.15 12 0.99 % 52.06 35 2.74 % 118.43 24 1.27 % 
67.96 13 2.04 % 83.39 mimo mimo 77 1.53 % 74.39 33 2.72 % 143.15 11 0.86 % 37.22 26 1.38 % 73.62 7 1.10 % 44.90 okol okol 75 1.50 % 72.46 13 1.07 % 56.39 41 3.21 % 138.73 9 0.48 % 25.48 12 1.88 % 76.98 blizu bliz u 68 1.35 % 65.69 17 1.40 % 73.75 10 0.78 % 33.84 31 1.64 % 87.78 10 1.57 % 64.15 zravn zrav n 66 1.31 % 63.76 9 0.74 % 39.04 42 3.29 % 142.11 7 0.37 % 19.82 8 1.25 % 51.32 okoli okol i 65 1.29 % 62.80 22 1.81 % 95.44 10 0.78 % 33.84 31 1.64 % 87.78 2 0.31 % 12.83 namesto name sto 55 1.10 % 53.13 11 0.91 % 47.72 7 0.55 % 23.69 29 1.53 % 82.12 8 1.25 % 51.32 kasneje kasn eje 54 1.08 % 52.17 20 1.65 % 86.76 2 0.16 % 6.77 31 1.64 % 87.78 1 0.16 % 6.41 vrhu vrhu 42 0.84 % 40.58 21 1.73 % 91.10 4 0.31 % 13.53 16 0.85 % 45.31 1 0.16 % 6.41 prot prot 39 0.78 % 37.68 2 0.17 % 8.68 17 1.33 % 57.52 18 0.95 % 50.97 2 0.31 % 12.83 zard zard 39 0.78 % 37.68 0 0 % 0 17 1.33 % 57.52 15 0.79 % 42.47 7 1.10 % 44.90 znotraj znot raj 39 0.78 % 37.68 2 0.17 % 8.68 0 0 % 0 32 1.69 % 90.61 5 0.78 % 32.07 zunaj zuna j 36 0.72 % 34.78 10 0.82 % 43.38 7 0.55 % 23.69 14 0.74 % 39.64 5 0.78 % 32.07 skos skos 29 0.58 % 28.02 3 0.25 % 13.01 20 1.57 % 67.67 3 0.16 % 8.49 3 0.47 % 19.24 krog krog 28 0.56 % 27.05 5 0.41 % 21.69 2 0.16 % 6.77 16 0.85 % 45.31 5 0.78 % 32.07 sredi sred i 28 0.56 % 27.05 9 0.74 % 39.04 5 0.39 % 16.92 11 0.58 % 31.15 3 0.47 % 19.24 skuz skuz 24 0.48 % 23.19 1 0.08 % 4.34 22 1.72 % 74.44 0 0 % 0 1 0.16 % 6.41 CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 578 File at CLARIN.SI2.2.235 List of initial character-level 5-grams from preposition lower-case word forms in the GOS 1.0 corpus with text-type distributionGOS1.0-word_parts-prepositions-lowercase_ forms-initial-5grams-taxonomy-entire.tsvLower-case word form Initial part of the word Rest of the word Total absolute frequency of lower-case word form Percentage of all found lower-case word forms Total relative frequency (per million occurrences) Absolute frequency [gos.T.J.R] 
Percentage [gos.T.J.R] Relative frequency [gos.T.J.R] Absolute frequency [gos.T.N.Z] Percentage [gos.T.N.Z] Relative frequency [gos.T.N.Z] Absolute frequency [gos.T.J.I] Percentage [gos.T.J.I] Relative frequency [gos.T.J.I] Absolute frequency [gos.T.N.N] Percentage [gos.T.N.N] Relative frequency [gos.T.N.N] zarad zarad 386 13.31 % 372.91 59 9.01 % 255.94 82 13.69 % 277.46 172 13.74 % 487.05 73 18.57 % 468.27 zaradi zarad i 312 10.76 % 301.42 69 10.53 % 299.32 20 3.34 % 67.67 195 15.57 % 552.17 28 7.12 % 179.61 proti proti 274 9.45 % 264.71 119 18.17 % 516.22 40 6.68 % 135.34 95 7.59 % 269.01 20 5.09 % 128.29 zraven zrave n 207 7.14 % 199.98 39 5.95 % 169.18 70 11.69 % 236.85 41 3.27 % 116.10 57 14.50 % 365.64 preko preko 138 4.76 % 133.32 36 5.50 % 156.17 13 2.17 % 43.99 74 5.91 % 209.54 15 3.82 % 96.22 glede glede 132 4.55 % 127.52 17 2.60 % 73.75 18 3.00 % 60.91 48 3.83 % 135.92 49 12.47 % 314.32 okrog okrog 122 4.21 % 117.86 17 2.60 % 73.75 27 4.51 % 91.36 55 4.39 % 155.74 23 5.85 % 147.54 poleg poleg 110 3.79 % 106.27 28 4.28 % 121.46 19 3.17 % 64.29 53 4.23 % 150.08 10 2.54 % 64.15 skozi skozi 106 3.66 % 102.40 16 2.44 % 69.41 10 1.67 % 33.84 75 5.99 % 212.37 5 1.27 % 32.07 kljub kljub 92 3.17 % 88.88 15 2.29 % 65.07 3 0.50 % 10.15 62 4.95 % 175.56 12 3.05 % 76.98 izmed izmed 89 3.07 % 85.98 46 7.02 % 199.55 5 0.83 % 16.92 37 2.96 % 104.77 1 0.25 % 6.41 razen razen 87 3.00 % 84.05 14 2.14 % 60.73 21 3.51 % 71.06 34 2.72 % 96.28 18 4.58 % 115.46 blizu blizu 68 2.35 % 65.69 17 2.60 % 73.75 10 1.67 % 33.84 31 2.48 % 87.78 10 2.54 % 64.15 zravn zravn 66 2.28 % 63.76 9 1.37 % 39.04 42 7.01 % 142.11 7 0.56 % 19.82 8 2.04 % 51.32 okoli okoli 65 2.24 % 62.80 22 3.36 % 95.44 10 1.67 % 33.84 31 2.48 % 87.78 2 0.51 % 12.83 namesto names to 55 1.90 % 53.13 11 1.68 % 47.72 7 1.17 % 23.69 29 2.32 % 82.12 8 2.04 % 51.32 kasneje kasne je 54 1.86 % 52.17 20 3.05 % 86.76 2 0.33 % 6.77 31 2.48 % 87.78 1 0.25 % 6.41 znotraj znotr aj 39 1.34 % 37.68 2 0.30 % 8.68 0 0 % 0 32 2.56 % 
90.61 5 1.27 % 32.07 zunaj zunaj 36 1.24 % 34.78 10 1.53 % 43.38 7 1.17 % 23.69 14 1.12 % 39.64 5 1.27 % 32.07 sredi sredi 28 0.97 % 27.05 9 1.37 % 39.04 5 0.83 % 16.92 11 0.88 % 31.15 3 0.76 % 19.24 izven izven 21 0.72 % 20.29 5 0.76 % 21.69 3 0.50 % 10.15 9 0.72 % 25.48 4 1.02 % 25.66 namest names t 21 0.72 % 20.29 3 0.46 % 13.01 10 1.67 % 33.84 2 0.16 % 5.66 6 1.53 % 38.49 pomoje pomoj e 20 0.69 % 19.32 2 0.30 % 8.68 18 3.00 % 60.91 0 0 % 0 0 0 % 0 zunej zunej 19 0.66 % 18.36 1 0.15 % 4.34 4 0.67 % 13.53 6 0.48 % 16.99 8 2.04 % 51.32 kasnej kasne j 18 0.62 % 17.39 5 0.76 % 21.69 6 1.00 % 20.30 3 0.24 % 8.49 4 1.02 % 25.66 prejk prejk 17 0.59 % 16.42 9 1.37 % 39.04 7 1.17 % 23.69 1 0.08 % 2.83 0 0 % 0 izpred izpre d 16 0.55 % 15.46 2 0.30 % 8.68 5 0.83 % 16.92 9 0.72 % 25.48 0 0 % 0 nasproti naspr oti 14 0.48 % 13.53 1 0.15 % 4.34 7 1.17 % 23.69 3 0.24 % 8.49 3 0.76 % 19.24 nasvidenje nasvi denje 11 0.38 % 10.63 5 0.76 % 21.69 0 0 % 0 2 0.16 % 5.66 4 1.02 % 25.66 zoper zoper 11 0.38 % 10.63 0 0 % 0 0 0 % 0 11 0.88 % 31.15 0 0 % 0 bližje bližj e 10 0.34 % 9.66 2 0.30 % 8.68 3 0.50 % 10.15 4 0.32 % 11.33 1 0.25 % 6.41 konec konec 9 0.31 % 8.69 0 0 % 0 0 0 % 0 8 0.64 % 22.65 1 0.25 % 6.41 CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 579 File at CLARIN.SI2.2.236 List of final character-level 1-grams from preposition lower-case word forms in the GOS 1.0 corpus with text-type distributionGOS1.0-word_parts-prepositions-lowercase_ forms-final-1grams-taxonomy-entire.tsvLower-case word form Rest of the word Final part of the word Total absolute frequency of lower-case word form Percentage of all found lower-case word forms Total relative frequency (per million occurrences) Absolute frequency [gos.T.J.R] Percentage [gos.T.J.R] Relative frequency [gos.T.J.R] Absolute frequency [gos.T.N.Z] Percentage [gos.T.N.Z] Relative frequency [gos.T.N.Z] Absolute frequency [gos.T.J.I] Percentage [gos.T.J.I] Relative frequency [gos.T.J.I] Absolute 
frequency [gos.T.N.N] Percentage [gos.T.N.N] Relative frequency [gos.T.N.N]
| v |  | v | 16,790 | 24.75 % | 16,220.55 | 3,902 | 24.26 % | 16,926.73 | 3,331 | 21.35 % | 11,270.82 | 7,264 | 26.90 % | 20,569.22 | 2,293 | 25.09 % | 14,708.81 |
| na | n | a | 11,774 | 17.36 % | 11,374.67 | 2,981 | 18.53 % | 12,931.46 | 2,781 | 17.82 % | 9,409.83 | 4,317 | 15.98 % | 12,224.30 | 1,695 | 18.54 % | 10,872.84 |
| za | z | a | 7,734 | 11.40 % | 7,471.69 | 1,771 | 11.01 % | 7,682.53 | 1,671 | 10.71 % | 5,654.02 | 3,119 | 11.55 % | 8,831.97 | 1,173 | 12.83 % | 7,524.39 |
| z |  | z | 4,412 | 6.50 % | 4,262.36 | 1,130 | 7.03 % | 4,901.90 | 947 | 6.07 % | 3,204.28 | 1,766 | 6.54 % | 5,000.72 | 569 | 6.22 % | 3,649.94 |
| s |  | s | 3,288 | 4.85 % | 3,176.48 | 752 | 4.67 % | 3,262.15 | 682 | 4.37 % | 2,307.62 | 1,345 | 4.98 % | 3,808.59 | 509 | 5.57 % | 3,265.06 |
| po | p | o | 3,236 | 4.77 % | 3,126.25 | 770 | 4.79 % | 3,340.23 | 897 | 5.75 % | 3,035.10 | 1,109 | 4.11 % | 3,140.32 | 460 | 5.03 % | 2,950.74 |
| od | o | d | 2,431 | 3.58 % | 2,348.55 | 586 | 3.64 % | 2,542.05 | 611 | 3.92 % | 2,067.39 | 936 | 3.47 % | 2,650.44 | 298 | 3.26 % | 1,911.57 |
| o |  | o | 2,094 | 3.09 % | 2,022.98 | 485 | 3.02 % | 2,103.91 | 322 | 2.06 % | 1,089.52 | 1,074 | 3.98 % | 3,041.21 | 213 | 2.33 % | 1,366.32 |
| do | d | o | 2,089 | 3.08 % | 2,018.15 | 474 | 2.95 % | 2,056.19 | 418 | 2.68 % | 1,414.35 | 879 | 3.25 % | 2,489.03 | 318 | 3.48 % | 2,039.86 |
| iz | i | z | 1,885 | 2.78 % | 1,821.07 | 438 | 2.72 % | 1,900.03 | 408 | 2.62 % | 1,380.51 | 879 | 3.25 % | 2,489.03 | 160 | 1.75 % | 1,026.34 |
| pri | pr | i | 1,559 | 2.30 % | 1,506.12 | 340 | 2.11 % | 1,474.91 | 163 | 1.04 % | 551.53 | 798 | 2.96 % | 2,259.67 | 258 | 2.82 % | 1,654.98 |
| ob | o | b | 864 | 1.27 % | 834.70 | 291 | 1.81 % | 1,262.35 | 234 | 1.50 % | 791.77 | 279 | 1.03 % | 790.03 | 60 | 0.66 % | 384.88 |
| pr | p | r | 783 | 1.15 % | 756.44 | 103 | 0.64 % | 446.81 | 407 | 2.61 % | 1,377.13 | 129 | 0.48 % | 365.28 | 144 | 1.57 % | 923.71 |
| pred | pre | d | 779 | 1.15 % | 752.58 | 282 | 1.75 % | 1,223.31 | 142 | 0.91 % | 480.47 | 309 | 1.14 % | 874.98 | 46 | 0.50 % | 295.07 |
| med | me | d | 777 | 1.15 % | 750.65 | 192 | 1.19 % | 832.89 | 99 | 0.63 % | 334.98 | 407 | 1.51 % | 1,152.49 | 79 | 0.86 % | 506.76 |
| čez | če | z | 575 | 0.85 % | 555.50 | 210 | 1.30 % | 910.97 | 187 | 1.20 % | 632.74 | 125 | 0.46 % | 353.96 | 53 | 0.58 % | 339.98 |
| k |  | k | 463 | 0.68 % | 447.30 | 105 | 0.65 % | 455.49 | 107 | 0.69 % | 362.05 | 187 | 0.69 % | 529.52 | 64 | 0.70 % | 410.54 |
| brez | bre | z | 457 | 0.67 % | 441.50 | 133 | 0.83 % | 576.95 | 113 | 0.72 % | 382.35 | 143 | 0.53 % | 404.93 | 68 | 0.74 % | 436.20 |
| u |  | u | 417 | 0.61 % | 402.86 | 87 | 0.54 % | 377.40 | 214 | 1.37 % | 724.09 | 97 | 0.36 % | 274.67 | 19 | 0.21 % | 121.88 |
| zarad | zara | d | 386 | 0.57 % | 372.91 | 59 | 0.37 % | 255.94 | 82 | 0.53 % | 277.46 | 172 | 0.64 % | 487.05 | 73 | 0.80 % | 468.27 |
| pod | po | d | 332 | 0.49 % | 320.74 | 60 | 0.37 % | 260.28 | 87 | 0.56 % | 294.37 | 147 | 0.54 % | 416.25 | 38 | 0.42 % | 243.76 |
| zaradi | zarad | i | 312 | 0.46 % | 301.42 | 69 | 0.43 % | 299.32 | 20 | 0.13 % | 67.67 | 195 | 0.72 % | 552.17 | 28 | 0.31 % | 179.61 |
| proti | prot | i | 274 | 0.40 % | 264.71 | 119 | 0.74 % | 516.22 | 40 | 0.26 % | 135.34 | 95 | 0.35 % | 269.01 | 20 | 0.22 % | 128.29 |
| skoz | sko | z | 232 | 0.34 % | 224.13 | 42 | 0.26 % | 182.19 | 110 | 0.70 % | 372.20 | 26 | 0.10 % | 73.62 | 54 | 0.59 % | 346.39 |
| f |  | f | 228 | 0.34 % | 220.27 | 25 | 0.15 % | 108.45 | 147 | 0.94 % | 497.39 | 32 | 0.12 % | 90.61 | 24 | 0.26 % | 153.95 |
| zraven | zrave | n | 207 | 0.30 % | 199.98 | 39 | 0.24 % | 169.18 | 70 | 0.45 % | 236.85 | 41 | 0.15 % | 116.10 | 57 | 0.62 % | 365.64 |
| nad | na | d | 159 | 0.23 % | 153.61 | 40 | 0.25 % | 173.52 | 22 | 0.14 % | 74.44 | 84 | 0.31 % | 237.86 | 13 | 0.14 % | 83.39 |
| h |  | h | 139 | 0.20 % | 134.29 | 20 | 0.12 % | 86.76 | 73 | 0.47 % | 247 | 30 | 0.11 % | 84.95 | 16 | 0.17 % | 102.63 |
| preko | prek | o | 138 | 0.20 % | 133.32 | 36 | 0.22 % | 156.17 | 13 | 0.08 % | 43.99 | 74 | 0.27 % | 209.54 | 15 | 0.16 % | 96.22 |
| glede | gled | e | 132 | 0.20 % | 127.52 | 17 | 0.11 % | 73.75 | 18 | 0.12 % | 60.91 | 48 | 0.18 % | 135.92 | 49 | 0.54 % | 314.32 |
| n |  | n | 127 | 0.19 % | 122.69 | 1 | 0.01 % | 4.34 | 95 | 0.61 % | 321.44 | 29 | 0.11 % | 82.12 | 2 | 0.02 % | 12.83 |
| okrog | okro | g | 122 | 0.18 % | 117.86 | 17 | 0.11 % | 73.75 | 27 | 0.17 % | 91.36 | 55 | 0.20 % | 155.74 | 23 | 0.25 % | 147.54 |

2.2.237 List of final character-level 2-grams from preposition lower-case word forms in the GOS 1.0 corpus with text-type distribution

File at CLARIN.SI: GOS1.0-word_parts-prepositions-lowercase_forms-final-2grams-taxonomy-entire.tsv

| Lower-case word form | Rest of the word | Final part of the word | Total absolute frequency of lower-case word form | Percentage of all found lower-case word forms | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N] |
| na |  | na | 11,774 | 29.55 % | 11,374.67 | 2,981 | 31.14 % | 12,931.46 | 2,781 | 28.76 % | 9,409.83 | 4,317 | 28.45 % | 12,224.30 | 1,695 | 31.21 % | 10,872.84 |
| za |  | za | 7,734 | 19.41 % | 7,471.69 | 1,771 | 18.50 % | 7,682.53 | 1,671 | 17.28 % | 5,654.02 | 3,119 | 20.55 % | 8,831.97 | 1,173 | 21.60 % | 7,524.39 |
| po |  | po | 3,236 | 8.12 % | 3,126.25 | 770 | 8.04 % | 3,340.23 | 897 | 9.28 % | 3,035.10 | 1,109 | 7.31 % | 3,140.32 | 460 | 8.47 % | 2,950.74 |
| od |  | od | 2,431 | 6.10 % | 2,348.55 | 586 | 6.12 % | 2,542.05 | 611 | 6.32 % | 2,067.39 | 936 | 6.17 % | 2,650.44 | 298 | 5.49 % | 1,911.57 |
| do |  | do | 2,089 | 5.24 % | 2,018.15 | 474 | 4.95 % | 2,056.19 | 418 | 4.32 % | 1,414.35 | 879 | 5.79 % | 2,489.03 | 318 | 5.86 % | 2,039.86 |
| iz |  | iz | 1,885 | 4.73 % | 1,821.07 | 438 | 4.58 % | 1,900.03 | 408 | 4.22 % | 1,380.51 | 879 | 5.79 % | 2,489.03 | 160 | 2.95 % | 1,026.34 |
| pri | p | ri | 1,559 | 3.91 % | 1,506.12 | 340 | 3.55 % | 1,474.91 | 163 | 1.69 % | 551.53 | 798 | 5.26 % | 2,259.67 | 258 | 4.75 % | 1,654.98 |
| ob |  | ob | 864 | 2.17 % | 834.70 | 291 | 3.04 % | 1,262.35 | 234 | 2.42 % | 791.77 | 279 | 1.84 % | 790.03 | 60 | 1.10 % | 384.88 |
| pr |  | pr | 783 | 1.97 % | 756.44 | 103 | 1.08 % | 446.81 | 407 | 4.21 % | 1,377.13 | 129 | 0.85 % | 365.28 | 144 | 2.65 % | 923.71 |
| pred | pr | ed | 779 | 1.96 % | 752.58 | 282 | 2.95 % | 1,223.31 | 142 | 1.47 % | 480.47 | 309 | 2.04 % | 874.98 | 46 | 0.85 % | 295.07 |
| med | m | ed | 777 | 1.95 % | 750.65 | 192 | 2.01 % | 832.89 | 99 | 1.02 % | 334.98 | 407 | 2.68 % | 1,152.49 | 79 | 1.46 % | 506.76 |
| čez | č | ez | 575 | 1.44 % | 555.50 | 210 | 2.19 % | 910.97 | 187 | 1.93 % | 632.74 | 125 | 0.82 % | 353.96 | 53 | 0.98 % | 339.98 |
| brez | br | ez | 457 | 1.15 % | 441.50 | 133 | 1.39 % | 576.95 | 113 | 1.17 % | 382.35 | 143 | 0.94 % | 404.93 | 68 | 1.25 % | 436.20 |
| zarad | zar | ad | 386 | 0.97 % | 372.91 | 59 | 0.62 % | 255.94 | 82 | 0.85 % | 277.46 | 172 | 1.13 % | 487.05 | 73 | 1.34 % | 468.27 |
| pod | p | od | 332 | 0.83 % | 320.74 | 60 | 0.63 % | 260.28 | 87 | 0.90 % | 294.37 | 147 | 0.97 % | 416.25 | 38 | 0.70 % | 243.76 |
| zaradi | zara | di | 312 | 0.78 % | 301.42 | 69 | 0.72 % | 299.32 | 20 | 0.21 % | 67.67 | 195 | 1.28 % | 552.17 | 28 | 0.52 % | 179.61 |
| proti | pro | ti | 274 | 0.69 % | 264.71 | 119 | 1.24 % | 516.22 | 40 | 0.41 % | 135.34 | 95 | 0.63 % | 269.01 | 20 | 0.37 % | 128.29 |
| skoz | sk | oz | 232 | 0.58 % | 224.13 | 42 | 0.44 % | 182.19 | 110 | 1.14 % | 372.20 | 26 | 0.17 % | 73.62 | 54 | 0.99 % | 346.39 |
| zraven | zrav | en | 207 | 0.52 % | 199.98 | 39 | 0.41 % | 169.18 | 70 | 0.72 % | 236.85 | 41 | 0.27 % | 116.10 | 57 | 1.05 % | 365.64 |
| nad | n | ad | 159 | 0.40 % | 153.61 | 40 | 0.42 % | 173.52 | 22 | 0.23 % | 74.44 | 84 | 0.55 % | 237.86 | 13 | 0.24 % | 83.39 |
| preko | pre | ko | 138 | 0.35 % | 133.32 | 36 | 0.38 % | 156.17 | 13 | 0.13 % | 43.99 | 74 | 0.49 % | 209.54 | 15 | 0.28 % | 96.22 |
| glede | gle | de | 132 | 0.33 % | 127.52 | 17 | 0.18 % | 73.75 | 18 | 0.19 % | 60.91 | 48 | 0.32 % | 135.92 | 49 | 0.90 % | 314.32 |
| okrog | okr | og | 122 | 0.31 % | 117.86 | 17 | 0.18 % | 73.75 | 27 | 0.28 % | 91.36 | 55 | 0.36 % | 155.74 | 23 | 0.42 % | 147.54 |
| poleg | pol | eg | 110 | 0.28 % | 106.27 | 28 | 0.29 % | 121.46 | 19 | 0.20 % | 64.29 | 53 | 0.35 % | 150.08 | 10 | 0.18 % | 64.15 |
| skozi | sko | zi | 106 | 0.27 % | 102.40 | 16 | 0.17 % | 69.41 | 10 | 0.10 % | 33.84 | 75 | 0.49 % | 212.37 | 5 | 0.09 % | 32.07 |
| kljub | klj | ub | 92 | 0.23 % | 88.88 | 15 | 0.16 % | 65.07 | 3 | 0.03 % | 10.15 | 62 | 0.41 % | 175.56 | 12 | 0.22 % | 76.98 |
| izmed | izm | ed | 89 | 0.22 % | 85.98 | 46 | 0.48 % | 199.55 | 5 | 0.05 % | 16.92 | 37 | 0.24 % | 104.77 | 1 | 0.02 % | 6.41 |
| razen | raz | en | 87 | 0.22 % | 84.05 | 14 | 0.15 % | 60.73 | 21 | 0.22 % | 71.06 | 34 | 0.22 % | 96.28 | 18 | 0.33 % | 115.46 |
| prek | pr | ek | 84 | 0.21 % | 81.15 | 12 | 0.12 % | 52.06 | 35 | 0.36 % | 118.43 | 24 | 0.16 % | 67.96 | 13 | 0.24 % | 83.39 |
| hu |  | hu | 77 | 0.19 % | 74.39 | 0 | 0 % | 0 | 77 | 0.80 % | 260.54 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| mimo | mi | mo | 77 | 0.19 % | 74.39 | 33 | 0.34 % | 143.15 | 11 | 0.11 % | 37.22 | 26 | 0.17 % | 73.62 | 7 | 0.13 % | 44.90 |
| okol | ok | ol | 75 | 0.19 % | 72.46 | 13 | 0.14 % | 56.39 | 41 | 0.42 % | 138.73 | 9 | 0.06 % | 25.48 | 12 | 0.22 % | 76.98 |

2.2.238 List of final character-level 3-grams from preposition lower-case word forms in the GOS 1.0 corpus with text-type distribution

File at CLARIN.SI: GOS1.0-word_parts-prepositions-lowercase_forms-final-3grams-taxonomy-entire.tsv

| Lower-case word form | Rest of the word | Final part of the word | Total absolute frequency of lower-case word form | Percentage of all found lower-case word forms | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N] |
| pri |  | pri | 1,559 | 18.12 % | 1,506.12 | 340 | 16.27 % | 1,474.91 | 163 | 8.35 % | 551.53 | 798 | 23.02 % | 2,259.67 | 258 | 23.58 % | 1,654.98 |
| pred | p | red | 779 | 9.05 % | 752.58 | 282 | 13.49 % | 1,223.31 | 142 | 7.27 % | 480.47 | 309 | 8.91 % | 874.98 | 46 | 4.21 % | 295.07 |
| med |  | med | 777 | 9.03 % | 750.65 | 192 | 9.19 % | 832.89 | 99 | 5.07 % | 334.98 | 407 | 11.74 % | 1,152.49 | 79 | 7.22 % | 506.76 |
| čez |  | čez | 575 | 6.68 % | 555.50 | 210 | 10.05 % | 910.97 | 187 | 9.57 % | 632.74 | 125 | 3.60 % | 353.96 | 53 | 4.84 % | 339.98 |
| brez | b | rez | 457 | 5.31 % | 441.50 | 133 | 6.36 % | 576.95 | 113 | 5.79 % | 382.35 | 143 | 4.12 % | 404.93 | 68 | 6.22 % | 436.20 |
| zarad | za | rad | 386 | 4.49 % | 372.91 | 59 | 2.82 % | 255.94 | 82 | 4.20 % | 277.46 | 172 | 4.96 % | 487.05 | 73 | 6.67 % | 468.27 |
| pod |  | pod | 332 | 3.86 % | 320.74 | 60 | 2.87 % | 260.28 | 87 | 4.46 % | 294.37 | 147 | 4.24 % | 416.25 | 38 | 3.47 % | 243.76 |
| zaradi | zar | adi | 312 | 3.63 % | 301.42 | 69 | 3.30 % | 299.32 | 20 | 1.02 % | 67.67 | 195 | 5.62 % | 552.17 | 28 | 2.56 % | 179.61 |
| proti | pr | oti | 274 | 3.19 % | 264.71 | 119 | 5.69 % | 516.22 | 40 | 2.05 % | 135.34 | 95 | 2.74 % | 269.01 | 20 | 1.83 % | 128.29 |
| skoz | s | koz | 232 | 2.70 % | 224.13 | 42 | 2.01 % | 182.19 | 110 | 5.63 % | 372.20 | 26 | 0.75 % | 73.62 | 54 | 4.94 % | 346.39 |
| zraven | zra | ven | 207 | 2.41 % | 199.98 | 39 | 1.87 % | 169.18 | 70 | 3.58 % | 236.85 | 41 | 1.18 % | 116.10 | 57 | 5.21 % | 365.64 |
| nad |  | nad | 159 | 1.85 % | 153.61 | 40 | 1.91 % | 173.52 | 22 | 1.13 % | 74.44 | 84 | 2.42 % | 237.86 | 13 | 1.19 % | 83.39 |
| preko | pr | eko | 138 | 1.60 % | 133.32 | 36 | 1.72 % | 156.17 | 13 | 0.67 % | 43.99 | 74 | 2.13 % | 209.54 | 15 | 1.37 % | 96.22 |
| glede | gl | ede | 132 | 1.53 % | 127.52 | 17 | 0.81 % | 73.75 | 18 | 0.92 % | 60.91 | 48 | 1.38 % | 135.92 | 49 | 4.48 % | 314.32 |
| okrog | ok | rog | 122 | 1.42 % | 117.86 | 17 | 0.81 % | 73.75 | 27 | 1.38 % | 91.36 | 55 | 1.59 % | 155.74 | 23 | 2.10 % | 147.54 |
| poleg | po | leg | 110 | 1.28 % | 106.27 | 28 | 1.34 % | 121.46 | 19 | 0.97 % | 64.29 | 53 | 1.53 % | 150.08 | 10 | 0.91 % | 64.15 |
| skozi | sk | ozi | 106 | 1.23 % | 102.40 | 16 | 0.77 % | 69.41 | 10 | 0.51 % | 33.84 | 75 | 2.16 % | 212.37 | 5 | 0.46 % | 32.07 |
| kljub | kl | jub | 92 | 1.07 % | 88.88 | 15 | 0.72 % | 65.07 | 3 | 0.15 % | 10.15 | 62 | 1.79 % | 175.56 | 12 | 1.10 % | 76.98 |
| izmed | iz | med | 89 | 1.03 % | 85.98 | 46 | 2.20 % | 199.55 | 5 | 0.26 % | 16.92 | 37 | 1.07 % | 104.77 | 1 | 0.09 % | 6.41 |
| razen | ra | zen | 87 | 1.01 % | 84.05 | 14 | 0.67 % | 60.73 | 21 | 1.07 % | 71.06 | 34 | 0.98 % | 96.28 | 18 | 1.65 % | 115.46 |
| prek | p | rek | 84 | 0.98 % | 81.15 | 12 | 0.57 % | 52.06 | 35 | 1.79 % | 118.43 | 24 | 0.69 % | 67.96 | 13 | 1.19 % | 83.39 |
| mimo | m | imo | 77 | 0.90 % | 74.39 | 33 | 1.58 % | 143.15 | 11 | 0.56 % | 37.22 | 26 | 0.75 % | 73.62 | 7 | 0.64 % | 44.90 |
| okol | o | kol | 75 | 0.87 % | 72.46 | 13 | 0.62 % | 56.39 | 41 | 2.10 % | 138.73 | 9 | 0.26 % | 25.48 | 12 | 1.10 % | 76.98 |
| blizu | bl | izu | 68 | 0.79 % | 65.69 | 17 | 0.81 % | 73.75 | 10 | 0.51 % | 33.84 | 31 | 0.89 % | 87.78 | 10 | 0.91 % | 64.15 |
| zravn | zr | avn | 66 | 0.77 % | 63.76 | 9 | 0.43 % | 39.04 | 42 | 2.15 % | 142.11 | 7 | 0.20 % | 19.82 | 8 | 0.73 % | 51.32 |
| okoli | ok | oli | 65 | 0.76 % | 62.80 | 22 | 1.05 % | 95.44 | 10 | 0.51 % | 33.84 | 31 | 0.89 % | 87.78 | 2 | 0.18 % | 12.83 |
| namesto | name | sto | 55 | 0.64 % | 53.13 | 11 | 0.53 % | 47.72 | 7 | 0.36 % | 23.69 | 29 | 0.84 % | 82.12 | 8 | 0.73 % | 51.32 |
| kasneje | kasn | eje | 54 | 0.63 % | 52.17 | 20 | 0.96 % | 86.76 | 2 | 0.10 % | 6.77 | 31 | 0.89 % | 87.78 | 1 | 0.09 % | 6.41 |
| vrh |  | vrh | 42 | 0.49 % | 40.58 | 13 | 0.62 % | 56.39 | 10 | 0.51 % | 33.84 | 8 | 0.23 % | 22.65 | 11 | 1.00 % | 70.56 |
| vrhu | v | rhu | 42 | 0.49 % | 40.58 | 21 | 1.00 % | 91.10 | 4 | 0.20 % | 13.53 | 16 | 0.46 % | 45.31 | 1 | 0.09 % | 6.41 |
| prot | p | rot | 39 | 0.45 % | 37.68 | 2 | 0.10 % | 8.68 | 17 | 0.87 % | 57.52 | 18 | 0.52 % | 50.97 | 2 | 0.18 % | 12.83 |
| zard | z | ard | 39 | 0.45 % | 37.68 | 0 | 0 % | 0 | 17 | 0.87 % | 57.52 | 15 | 0.43 % | 42.47 | 7 | 0.64 % | 44.90 |

2.2.239 List of final character-level 4-grams from preposition lower-case word forms in the GOS 1.0 corpus with text-type distribution

File at CLARIN.SI: GOS1.0-word_parts-prepositions-lowercase_forms-final-4grams-taxonomy-entire.tsv

| Lower-case word form | Rest of the word | Final part of the word | Total absolute frequency of lower-case word form | Percentage of all found lower-case word forms | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N] |
| pred |  | pred | 779 | 15.52 % | 752.58 | 282 | 23.21 % | 1,223.31 | 142 | 11.13 % | 480.47 | 309 | 16.36 % | 874.98 | 46 | 7.21 % | 295.07 |
| brez |  | brez | 457 | 9.11 % | 441.50 | 133 | 10.95 % | 576.95 | 113 | 8.86 % | 382.35 | 143 | 7.57 % | 404.93 | 68 | 10.66 % | 436.20 |
| zarad | z | arad | 386 | 7.69 % | 372.91 | 59 | 4.86 % | 255.94 | 82 | 6.43 % | 277.46 | 172 | 9.11 % | 487.05 | 73 | 11.44 % | 468.27 |
| zaradi | za | radi | 312 | 6.22 % | 301.42 | 69 | 5.68 % | 299.32 | 20 | 1.57 % | 67.67 | 195 | 10.32 % | 552.17 | 28 | 4.39 % | 179.61 |
| proti | p | roti | 274 | 5.46 % | 264.71 | 119 | 9.79 % | 516.22 | 40 | 3.13 % | 135.34 | 95 | 5.03 % | 269.01 | 20 | 3.13 % | 128.29 |
| skoz |  | skoz | 232 | 4.62 % | 224.13 | 42 | 3.46 % | 182.19 | 110 | 8.62 % | 372.20 | 26 | 1.38 % | 73.62 | 54 | 8.46 % | 346.39 |
| zraven | zr | aven | 207 | 4.12 % | 199.98 | 39 | 3.21 % | 169.18 | 70 | 5.49 % | 236.85 | 41 | 2.17 % | 116.10 | 57 | 8.93 % | 365.64 |
| preko | p | reko | 138 | 2.75 % | 133.32 | 36 | 2.96 % | 156.17 | 13 | 1.02 % | 43.99 | 74 | 3.92 % | 209.54 | 15 | 2.35 % | 96.22 |
| glede | g | lede | 132 | 2.63 % | 127.52 | 17 | 1.40 % | 73.75 | 18 | 1.41 % | 60.91 | 48 | 2.54 % | 135.92 | 49 | 7.68 % | 314.32 |
| okrog | o | krog | 122 | 2.43 % | 117.86 | 17 | 1.40 % | 73.75 | 27 | 2.12 % | 91.36 | 55 | 2.91 % | 155.74 | 23 | 3.60 % | 147.54 |
| poleg | p | oleg | 110 | 2.19 % | 106.27 | 28 | 2.31 % | 121.46 | 19 | 1.49 % | 64.29 | 53 | 2.81 % | 150.08 | 10 | 1.57 % | 64.15 |
| skozi | s | kozi | 106 | 2.11 % | 102.40 | 16 | 1.32 % | 69.41 | 10 | 0.78 % | 33.84 | 75 | 3.97 % | 212.37 | 5 | 0.78 % | 32.07 |
| kljub | k | ljub | 92 | 1.83 % | 88.88 | 15 | 1.24 % | 65.07 | 3 | 0.23 % | 10.15 | 62 | 3.28 % | 175.56 | 12 | 1.88 % | 76.98 |
| izmed | i | zmed | 89 | 1.77 % | 85.98 | 46 | 3.79 % | 199.55 | 5 | 0.39 % | 16.92 | 37 | 1.96 % | 104.77 | 1 | 0.16 % | 6.41 |
| razen | r | azen | 87 | 1.73 % | 84.05 | 14 | 1.15 % | 60.73 | 21 | 1.65 % | 71.06 | 34 | 1.80 % | 96.28 | 18 | 2.82 % | 115.46 |
| prek |  | prek | 84 | 1.67 % | 81.15 | 12 | 0.99 % | 52.06 | 35 | 2.74 % | 118.43 | 24 | 1.27 % | 67.96 | 13 | 2.04 % | 83.39 |
| mimo |  | mimo | 77 | 1.53 % | 74.39 | 33 | 2.72 % | 143.15 | 11 | 0.86 % | 37.22 | 26 | 1.38 % | 73.62 | 7 | 1.10 % | 44.90 |
| okol |  | okol | 75 | 1.50 % | 72.46 | 13 | 1.07 % | 56.39 | 41 | 3.21 % | 138.73 | 9 | 0.48 % | 25.48 | 12 | 1.88 % | 76.98 |
| blizu | b | lizu | 68 | 1.35 % | 65.69 | 17 | 1.40 % | 73.75 | 10 | 0.78 % | 33.84 | 31 | 1.64 % | 87.78 | 10 | 1.57 % | 64.15 |
| zravn | z | ravn | 66 | 1.31 % | 63.76 | 9 | 0.74 % | 39.04 | 42 | 3.29 % | 142.11 | 7 | 0.37 % | 19.82 | 8 | 1.25 % | 51.32 |
| okoli | o | koli | 65 | 1.29 % | 62.80 | 22 | 1.81 % | 95.44 | 10 | 0.78 % | 33.84 | 31 | 1.64 % | 87.78 | 2 | 0.31 % | 12.83 |
| namesto | nam | esto | 55 | 1.10 % | 53.13 | 11 | 0.91 % | 47.72 | 7 | 0.55 % | 23.69 | 29 | 1.53 % | 82.12 | 8 | 1.25 % | 51.32 |
| kasneje | kas | neje | 54 | 1.08 % | 52.17 | 20 | 1.65 % | 86.76 | 2 | 0.16 % | 6.77 | 31 | 1.64 % | 87.78 | 1 | 0.16 % | 6.41 |
| vrhu |  | vrhu | 42 | 0.84 % | 40.58 | 21 | 1.73 % | 91.10 | 4 | 0.31 % | 13.53 | 16 | 0.85 % | 45.31 | 1 | 0.16 % | 6.41 |
| prot |  | prot | 39 | 0.78 % | 37.68 | 2 | 0.17 % | 8.68 | 17 | 1.33 % | 57.52 | 18 | 0.95 % | 50.97 | 2 | 0.31 % | 12.83 |
| zard |  | zard | 39 | 0.78 % | 37.68 | 0 | 0 % | 0 | 17 | 1.33 % | 57.52 | 15 | 0.79 % | 42.47 | 7 | 1.10 % | 44.90 |
| znotraj | zno | traj | 39 | 0.78 % | 37.68 | 2 | 0.17 % | 8.68 | 0 | 0 % | 0 | 32 | 1.69 % | 90.61 | 5 | 0.78 % | 32.07 |
| zunaj | z | unaj | 36 | 0.72 % | 34.78 | 10 | 0.82 % | 43.38 | 7 | 0.55 % | 23.69 | 14 | 0.74 % | 39.64 | 5 | 0.78 % | 32.07 |
| skos |  | skos | 29 | 0.58 % | 28.02 | 3 | 0.25 % | 13.01 | 20 | 1.57 % | 67.67 | 3 | 0.16 % | 8.49 | 3 | 0.47 % | 19.24 |
| krog |  | krog | 28 | 0.56 % | 27.05 | 5 | 0.41 % | 21.69 | 2 | 0.16 % | 6.77 | 16 | 0.85 % | 45.31 | 5 | 0.78 % | 32.07 |
| sredi | s | redi | 28 | 0.56 % | 27.05 | 9 | 0.74 % | 39.04 | 5 | 0.39 % | 16.92 | 11 | 0.58 % | 31.15 | 3 | 0.47 % | 19.24 |
| skuz |  | skuz | 24 | 0.48 % | 23.19 | 1 | 0.08 % | 4.34 | 22 | 1.72 % | 74.44 | 0 | 0 % | 0 | 1 | 0.16 % | 6.41 |

2.2.240 List of final character-level 5-grams from preposition lower-case word forms in the GOS 1.0 corpus with text-type distribution

File at CLARIN.SI: GOS1.0-word_parts-prepositions-lowercase_forms-final-5grams-taxonomy-entire.tsv

| Lower-case word form | Rest of the word | Final part of the word | Total absolute frequency of lower-case word form | Percentage of all found lower-case word forms | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N] |
zarad zarad 386 13.31 % 372.91 59 9.01 % 255.94 82 13.69 % 277.46 172 13.74 % 487.05 73 18.57 % 468.27 zaradi z aradi 312 10.76 % 301.42 69 10.53 % 299.32 20 3.34 % 67.67 195 15.57 % 552.17 28 7.12 % 179.61 proti proti 274 9.45 % 264.71 119 18.17 % 516.22 40 6.68 % 135.34 95 7.59 % 269.01 20 5.09 % 128.29 zraven z raven 207 7.14 % 199.98 39 5.95 % 169.18 70 11.69 % 236.85 41 3.27 % 116.10 57 14.50 % 365.64 preko preko 138 4.76 % 133.32 36 5.50 % 156.17 13 2.17 % 43.99 74 5.91 % 209.54 15 3.82 % 96.22 glede glede 132 4.55 % 127.52 17 2.60 % 73.75 18 3.00 % 60.91 48 3.83 % 135.92 49 12.47 % 314.32 okrog okrog 122 4.21 % 117.86 17 2.60 % 73.75 27 4.51 % 91.36 55 4.39 % 155.74 23 5.85 % 147.54 poleg poleg 110 3.79 % 106.27 28 4.28 % 121.46 19 3.17 % 64.29 53 4.23 % 150.08 10 2.54 % 64.15 skozi skozi 106 3.66 % 102.40 16 2.44 % 69.41 10 1.67 % 33.84 75 5.99 % 212.37 5 1.27 % 32.07 kljub kljub 92 3.17 % 88.88 15 2.29 % 65.07 3 0.50 % 10.15 62 4.95 % 175.56 12 3.05 % 76.98 izmed izmed 89 3.07 % 85.98 46 7.02 % 199.55 5 0.83 % 16.92 37 2.96 % 104.77 1 0.25 % 6.41 razen razen 87 3.00 % 84.05 14 2.14 % 60.73 21 3.51 % 71.06 34 2.72 % 96.28 18 4.58 % 115.46 blizu blizu 68 2.35 % 65.69 17 2.60 % 73.75 10 1.67 % 33.84 31 2.48 % 87.78 10 2.54 % 64.15 zravn zravn 66 2.28 % 63.76 9 1.37 % 39.04 42 7.01 % 142.11 7 0.56 % 19.82 8 2.04 % 51.32 okoli okoli 65 2.24 % 62.80 22 3.36 % 95.44 10 1.67 % 33.84 31 2.48 % 87.78 2 0.51 % 12.83 namesto na mesto 55 1.90 % 53.13 11 1.68 % 47.72 7 1.17 % 23.69 29 2.32 % 82.12 8 2.04 % 51.32 kasneje ka sneje 54 1.86 % 52.17 20 3.05 % 86.76 2 0.33 % 6.77 31 2.48 % 87.78 1 0.25 % 6.41 znotraj zn otraj 39 1.34 % 37.68 2 0.30 % 8.68 0 0 % 0 32 2.56 % 90.61 5 1.27 % 32.07 zunaj zunaj 36 1.24 % 34.78 10 1.53 % 43.38 7 1.17 % 23.69 14 1.12 % 39.64 5 1.27 % 32.07 sredi sredi 28 0.97 % 27.05 9 1.37 % 39.04 5 0.83 % 16.92 11 0.88 % 31.15 3 0.76 % 19.24 izven izven 21 0.72 % 20.29 5 0.76 % 21.69 3 0.50 % 10.15 9 0.72 % 25.48 4 1.02 % 25.66 namest n amest 21 
0.72 % 20.29 3 0.46 % 13.01 10 1.67 % 33.84 2 0.16 % 5.66 6 1.53 % 38.49 pomoje p omoje 20 0.69 % 19.32 2 0.30 % 8.68 18 3.00 % 60.91 0 0 % 0 0 0 % 0 zunej zunej 19 0.66 % 18.36 1 0.15 % 4.34 4 0.67 % 13.53 6 0.48 % 16.99 8 2.04 % 51.32 kasnej k asnej 18 0.62 % 17.39 5 0.76 % 21.69 6 1.00 % 20.30 3 0.24 % 8.49 4 1.02 % 25.66 prejk prejk 17 0.59 % 16.42 9 1.37 % 39.04 7 1.17 % 23.69 1 0.08 % 2.83 0 0 % 0 izpred i zpred 16 0.55 % 15.46 2 0.30 % 8.68 5 0.83 % 16.92 9 0.72 % 25.48 0 0 % 0 nasproti nas proti 14 0.48 % 13.53 1 0.15 % 4.34 7 1.17 % 23.69 3 0.24 % 8.49 3 0.76 % 19.24 nasvidenje nasvi denje 11 0.38 % 10.63 5 0.76 % 21.69 0 0 % 0 2 0.16 % 5.66 4 1.02 % 25.66 zoper zoper 11 0.38 % 10.63 0 0 % 0 0 0 % 0 11 0.88 % 31.15 0 0 % 0 bližje b ližje 10 0.34 % 9.66 2 0.30 % 8.68 3 0.50 % 10.15 4 0.32 % 11.33 1 0.25 % 6.41 konec konec 9 0.31 % 8.69 0 0 % 0 0 0 % 0 8 0.64 % 22.65 1 0.25 % 6.41 CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 584 File at CLARIN.SI2.2.241 List of initial character-level 1-grams from conjunction lemmas in the GOS 1.0 corpus with text-type distributionGOS1.0-word_parts-conjunctions-lemmas- initial-1grams-taxonomy-entire.tsvLemma Lemma (lower-case) Initial part of the word Rest of the word Total absolute frequency of lemma Percentage of all found lemmas Total relative frequency (per million occurrences) Absolute frequency [gos.T.J.R] Percentage [gos.T.J.R] Relative frequency [gos.T.J.R] Absolute frequency [gos.T.N.Z] Percentage [gos.T.N.Z] Relative frequency [gos.T.N.Z] Absolute frequency [gos.T.J.I] Percentage [gos.T.J.I] Relative frequency [gos.T.J.I] Absolute frequency [gos.T.N.N] Percentage [gos.T.N.N] Relative frequency [gos.T.N.N] pa pa p a 29,360 25.89 % 28,364.22 5,601 23.95 % 24,296.92 12,111 35.20 % 40,978.95 6,825 18.02 % 19,326.12 4,823 27.20 % 30,937.89 da da d a 19,011 16.77 % 18,366.22 3,598 15.38 % 15,607.99 5,141 14.94 % 17,395.16 7,140 18.86 % 20,218.09 3,132 17.66 % 20,090.70 in in i n 16,238 
14.32 % 15,687.27 4,184 17.89 % 18,150.03 3,199 9.30 % 10,824.18 7,068 18.67 % 20,014.21 1,787 10.08 % 11,462.99 a a a 6,753 5.96 % 6,523.96 1,247 5.33 % 5,409.44 2,515 7.31 % 8,509.79 1,771 4.68 % 5,014.88 1,220 6.88 % 7,825.88 če če č e 6,137 5.41 % 5,928.86 1,120 4.79 % 4,858.52 1,882 5.47 % 6,367.96 1,902 5.02 % 5,385.83 1,233 6.95 % 7,909.27 ki ki k i 5,537 4.88 % 5,349.21 1,257 5.38 % 5,452.82 882 2.56 % 2,984.35 2,839 7.50 % 8,039.10 559 3.15 % 3,585.79 ker ker k er 4,783 4.22 % 4,620.78 824 3.52 % 3,574.48 1,628 4.73 % 5,508.52 1,294 3.42 % 3,664.18 1,037 5.85 % 6,652 ali ali a li 4,757 4.20 % 4,595.66 818 3.50 % 3,548.45 1,536 4.46 % 5,197.23 1,501 3.96 % 4,250.33 902 5.09 % 5,786.02 ko ko k o 4,253 3.75 % 4,108.75 739 3.16 % 3,205.75 1,692 4.92 % 5,725.07 1,215 3.21 % 3,440.47 607 3.42 % 3,893.70 ampak ampak a mpak 3,540 3.12 % 3,419.94 815 3.48 % 3,535.44 821 2.39 % 2,777.95 1,237 3.27 % 3,502.77 667 3.76 % 4,278.58 kot kot k ot 3,075 2.71 % 2,970.71 737 3.15 % 3,197.08 550 1.60 % 1,860.99 1,363 3.60 % 3,859.56 425 2.40 % 2,726.23 saj saj s aj 1,621 1.43 % 1,566.02 231 0.99 % 1,002.07 824 2.40 % 2,788.10 209 0.55 % 591.82 357 2.01 % 2,290.03 torej torej t orej 1,223 1.08 % 1,181.52 436 1.86 % 1,891.35 37 0.11 % 125.19 648 1.71 % 1,834.92 102 0.57 % 654.29 oziroma oziroma o ziroma 903 0.80 % 872.37 259 1.11 % 1,123.53 105 0.30 % 355.28 417 1.10 % 1,180.80 122 0.69 % 782.59 kjer kjer k jer 652 0.57 % 629.89 122 0.52 % 529.23 121 0.35 % 409.42 306 0.81 % 866.49 103 0.58 % 660.71 sicer sicer s icer 595 0.53 % 574.82 195 0.83 % 845.90 74 0.21 % 250.39 243 0.64 % 688.09 83 0.47 % 532.42 kako kako k ako 533 0.47 % 514.92 126 0.54 % 546.58 135 0.39 % 456.79 223 0.59 % 631.46 49 0.28 % 314.32 zakaj zakaj z akaj 528 0.47 % 510.09 87 0.37 % 377.40 147 0.43 % 497.39 238 0.63 % 673.94 56 0.32 % 359.22 kakor kakor k akor 523 0.46 % 505.26 100 0.43 % 433.80 229 0.67 % 774.85 134 0.35 % 379.44 60 0.34 % 384.88 zato zato z ato 363 0.32 % 350.69 59 0.25 % 255.94 75 0.22 % 
253.77 167 0.44 % 472.89 62 0.35 % 397.71 tako tako t ako 352 0.31 % 340.06 56 0.24 % 242.93 114 0.33 % 385.73 94 0.25 % 266.18 88 0.50 % 564.49 niti niti n iti 314 0.28 % 303.35 60 0.26 % 260.28 95 0.28 % 321.44 105 0.28 % 297.32 54 0.30 % 346.39 čeprav čeprav č eprav 302 0.27 % 291.76 85 0.36 % 368.73 71 0.21 % 240.24 106 0.28 % 300.16 40 0.23 % 256.59 vendar vendar v endar 284 0.25 % 274.37 85 0.36 % 368.73 6 0.02 % 20.30 184 0.49 % 521.03 9 0.05 % 57.73 kadar kadar k adar 229 0.20 % 221.23 47 0.20 % 203.88 98 0.28 % 331.59 60 0.16 % 169.90 24 0.14 % 153.95 namreč namreč n amreč 225 0.20 % 217.37 71 0.30 % 308 14 0.04 % 47.37 126 0.33 % 356.79 14 0.08 % 89.81 drugače drugače d rugače 202 0.18 % 195.15 40 0.17 % 173.52 92 0.27 % 311.29 38 0.10 % 107.60 32 0.18 % 205.27 kajti kajti k ajti 174 0.15 % 168.10 104 0.45 % 451.15 0 0 % 0 65 0.17 % 184.06 5 0.03 % 32.07 dokler dokler d okler 143 0.13 % 138.15 21 0.09 % 91.10 57 0.17 % 192.87 44 0.12 % 124.59 21 0.12 % 134.71 vendarle vendarle v endarle 108 0.10 % 104.34 25 0.11 % 108.45 0 0 % 0 76 0.20 % 215.21 7 0.04 % 44.90 naj naj n aj 106 0.09 % 102.40 26 0.11 % 112.79 31 0.09 % 104.89 30 0.08 % 84.95 19 0.11 % 121.88 toda toda t oda 87 0.08 % 84.05 71 0.30 % 308 0 0 % 0 15 0.04 % 42.47 1 0.01 % 6.41 CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 585 File at CLARIN.SI2.2.242 List of initial character-level 2-grams from conjunction lemmas in the GOS 1.0 corpus with text-type distributionGOS1.0-word_parts-conjunctions-lemmas- initial-2grams-taxonomy-entire.tsvLemma Lemma (lower-case) Initial part of the word Rest of the word Total absolute frequency of lemma Percentage of all found lemmas Total relative frequency (per million occurrences) Absolute frequency [gos.T.J.R] Percentage [gos.T.J.R] Relative frequency [gos.T.J.R] Absolute frequency [gos.T.N.Z] Percentage [gos.T.N.Z] Relative frequency [gos.T.N.Z] Absolute frequency [gos.T.J.I] Percentage [gos.T.J.I] Relative frequency 
[gos.T.J.I] Absolute frequency [gos.T.N.N] Percentage [gos.T.N.N] Relative frequency [gos.T.N.N] pa pa pa 29,360 27.53 % 28,364.22 5,601 25.30 % 24,296.92 12,111 37.98 % 40,978.95 6,825 18.91 % 19,326.12 4,823 29.21 % 30,937.89 da da da 19,011 17.83 % 18,366.22 3,598 16.25 % 15,607.99 5,141 16.12 % 17,395.16 7,140 19.78 % 20,218.09 3,132 18.97 % 20,090.70 in in in 16,238 15.23 % 15,687.27 4,184 18.90 % 18,150.03 3,199 10.03 % 10,824.18 7,068 19.58 % 20,014.21 1,787 10.82 % 11,462.99 če če če 6,137 5.75 % 5,928.86 1,120 5.06 % 4,858.52 1,882 5.90 % 6,367.96 1,902 5.27 % 5,385.83 1,233 7.47 % 7,909.27 ki ki ki 5,537 5.19 % 5,349.21 1,257 5.68 % 5,452.82 882 2.77 % 2,984.35 2,839 7.87 % 8,039.10 559 3.39 % 3,585.79 ker ker ke r 4,783 4.49 % 4,620.78 824 3.72 % 3,574.48 1,628 5.11 % 5,508.52 1,294 3.58 % 3,664.18 1,037 6.28 % 6,652 ali ali al i 4,757 4.46 % 4,595.66 818 3.69 % 3,548.45 1,536 4.82 % 5,197.23 1,501 4.16 % 4,250.33 902 5.46 % 5,786.02 ko ko ko 4,253 3.99 % 4,108.75 739 3.34 % 3,205.75 1,692 5.31 % 5,725.07 1,215 3.37 % 3,440.47 607 3.68 % 3,893.70 ampak ampak am pak 3,540 3.32 % 3,419.94 815 3.68 % 3,535.44 821 2.58 % 2,777.95 1,237 3.43 % 3,502.77 667 4.04 % 4,278.58 kot kot ko t 3,075 2.88 % 2,970.71 737 3.33 % 3,197.08 550 1.73 % 1,860.99 1,363 3.78 % 3,859.56 425 2.57 % 2,726.23 saj saj sa j 1,621 1.52 % 1,566.02 231 1.04 % 1,002.07 824 2.58 % 2,788.10 209 0.58 % 591.82 357 2.16 % 2,290.03 torej torej to rej 1,223 1.15 % 1,181.52 436 1.97 % 1,891.35 37 0.12 % 125.19 648 1.79 % 1,834.92 102 0.62 % 654.29 oziroma oziroma oz iroma 903 0.85 % 872.37 259 1.17 % 1,123.53 105 0.33 % 355.28 417 1.16 % 1,180.80 122 0.74 % 782.59 kjer kjer kj er 652 0.61 % 629.89 122 0.55 % 529.23 121 0.38 % 409.42 306 0.85 % 866.49 103 0.62 % 660.71 sicer sicer si cer 595 0.56 % 574.82 195 0.88 % 845.90 74 0.23 % 250.39 243 0.67 % 688.09 83 0.50 % 532.42 kako kako ka ko 533 0.50 % 514.92 126 0.57 % 546.58 135 0.42 % 456.79 223 0.62 % 631.46 49 0.30 % 314.32 zakaj zakaj za kaj 
528 0.49 % 510.09 87 0.39 % 377.40 147 0.46 % 497.39 238 0.66 % 673.94 56 0.34 % 359.22 kakor kakor ka kor 523 0.49 % 505.26 100 0.45 % 433.80 229 0.72 % 774.85 134 0.37 % 379.44 60 0.36 % 384.88 zato zato za to 363 0.34 % 350.69 59 0.27 % 255.94 75 0.23 % 253.77 167 0.46 % 472.89 62 0.38 % 397.71 tako tako ta ko 352 0.33 % 340.06 56 0.25 % 242.93 114 0.36 % 385.73 94 0.26 % 266.18 88 0.53 % 564.49 niti niti ni ti 314 0.29 % 303.35 60 0.27 % 260.28 95 0.30 % 321.44 105 0.29 % 297.32 54 0.33 % 346.39 čeprav čeprav če prav 302 0.28 % 291.76 85 0.38 % 368.73 71 0.22 % 240.24 106 0.29 % 300.16 40 0.24 % 256.59 vendar vendar ve ndar 284 0.27 % 274.37 85 0.38 % 368.73 6 0.02 % 20.30 184 0.51 % 521.03 9 0.06 % 57.73 kadar kadar ka dar 229 0.21 % 221.23 47 0.21 % 203.88 98 0.31 % 331.59 60 0.17 % 169.90 24 0.14 % 153.95 namreč namreč na mreč 225 0.21 % 217.37 71 0.32 % 308 14 0.04 % 47.37 126 0.35 % 356.79 14 0.09 % 89.81 drugače drugače dr ugače 202 0.19 % 195.15 40 0.18 % 173.52 92 0.29 % 311.29 38 0.10 % 107.60 32 0.19 % 205.27 kajti kajti ka jti 174 0.16 % 168.10 104 0.47 % 451.15 0 0 % 0 65 0.18 % 184.06 5 0.03 % 32.07 dokler dokler do kler 143 0.13 % 138.15 21 0.10 % 91.10 57 0.18 % 192.87 44 0.12 % 124.59 21 0.13 % 134.71 vendarle vendarle ve ndarle 108 0.10 % 104.34 25 0.11 % 108.45 0 0 % 0 76 0.21 % 215.21 7 0.04 % 44.90 naj naj na j 106 0.10 % 102.40 26 0.12 % 112.79 31 0.10 % 104.89 30 0.08 % 84.95 19 0.12 % 121.88 toda toda to da 87 0.08 % 84.05 71 0.32 % 308 0 0 % 0 15 0.04 % 42.47 1 0.01 % 6.41 ter ter te r 73 0.07 % 70.52 45 0.20 % 195.21 1 0 % 3.38 27 0.07 % 76.45 0 0 % 0 CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 586 File at CLARIN.SI2.2.243 List of initial character-level 3-grams from conjunction lemmas in the GOS 1.0 corpus with text-type distributionGOS1.0-word_parts-conjunctions-lemmas- initial-3grams-taxonomy-entire.tsvLemma Lemma (lower-case) Initial part of the word Rest of the word Total absolute frequency of 
lemma Percentage of all found lemmas Total relative frequency (per million occurrences) Absolute frequency [gos.T.J.R] Percentage [gos.T.J.R] Relative frequency [gos.T.J.R] Absolute frequency [gos.T.N.Z] Percentage [gos.T.N.Z] Relative frequency [gos.T.N.Z] Absolute frequency [gos.T.J.I] Percentage [gos.T.J.I] Relative frequency [gos.T.J.I] Absolute frequency [gos.T.N.N] Percentage [gos.T.N.N] Relative frequency [gos.T.N.N] ker ker ker 4,783 18.36 % 4,620.78 824 14.62 % 3,574.48 1,628 23.36 % 5,508.52 1,294 14.24 % 3,664.18 1,037 23.74 % 6,652 ali ali ali 4,757 18.26 % 4,595.66 818 14.52 % 3,548.45 1,536 22.04 % 5,197.23 1,501 16.52 % 4,250.33 902 20.65 % 5,786.02 ampak ampak amp ak 3,540 13.59 % 3,419.94 815 14.47 % 3,535.44 821 11.78 % 2,777.95 1,237 13.61 % 3,502.77 667 15.27 % 4,278.58 kot kot kot 3,075 11.80 % 2,970.71 737 13.08 % 3,197.08 550 7.89 % 1,860.99 1,363 15.00 % 3,859.56 425 9.73 % 2,726.23 saj saj saj 1,621 6.22 % 1,566.02 231 4.10 % 1,002.07 824 11.82 % 2,788.10 209 2.30 % 591.82 357 8.17 % 2,290.03 torej torej tor ej 1,223 4.69 % 1,181.52 436 7.74 % 1,891.35 37 0.53 % 125.19 648 7.13 % 1,834.92 102 2.33 % 654.29 oziroma oziroma ozi roma 903 3.47 % 872.37 259 4.60 % 1,123.53 105 1.51 % 355.28 417 4.59 % 1,180.80 122 2.79 % 782.59 kjer kjer kje r 652 2.50 % 629.89 122 2.17 % 529.23 121 1.74 % 409.42 306 3.37 % 866.49 103 2.36 % 660.71 sicer sicer sic er 595 2.28 % 574.82 195 3.46 % 845.90 74 1.06 % 250.39 243 2.67 % 688.09 83 1.90 % 532.42 kako kako kak o 533 2.05 % 514.92 126 2.24 % 546.58 135 1.94 % 456.79 223 2.45 % 631.46 49 1.12 % 314.32 zakaj zakaj zak aj 528 2.03 % 510.09 87 1.54 % 377.40 147 2.11 % 497.39 238 2.62 % 673.94 56 1.28 % 359.22 kakor kakor kak or 523 2.01 % 505.26 100 1.77 % 433.80 229 3.29 % 774.85 134 1.48 % 379.44 60 1.37 % 384.88 zato zato zat o 363 1.39 % 350.69 59 1.05 % 255.94 75 1.08 % 253.77 167 1.84 % 472.89 62 1.42 % 397.71 tako tako tak o 352 1.35 % 340.06 56 0.99 % 242.93 114 1.64 % 385.73 94 1.03 % 266.18 88 2.02 % 
564.49 niti niti nit i 314 1.21 % 303.35 60 1.06 % 260.28 95 1.36 % 321.44 105 1.16 % 297.32 54 1.24 % 346.39 čeprav čeprav čep rav 302 1.16 % 291.76 85 1.51 % 368.73 71 1.02 % 240.24 106 1.17 % 300.16 40 0.92 % 256.59 vendar vendar ven dar 284 1.09 % 274.37 85 1.51 % 368.73 6 0.09 % 20.30 184 2.02 % 521.03 9 0.21 % 57.73 kadar kadar kad ar 229 0.88 % 221.23 47 0.83 % 203.88 98 1.41 % 331.59 60 0.66 % 169.90 24 0.55 % 153.95 namreč namreč nam reč 225 0.86 % 217.37 71 1.26 % 308 14 0.20 % 47.37 126 1.39 % 356.79 14 0.32 % 89.81 drugače drugače dru gače 202 0.78 % 195.15 40 0.71 % 173.52 92 1.32 % 311.29 38 0.42 % 107.60 32 0.73 % 205.27 kajti kajti kaj ti 174 0.67 % 168.10 104 1.85 % 451.15 0 0 % 0 65 0.71 % 184.06 5 0.11 % 32.07 dokler dokler dok ler 143 0.55 % 138.15 21 0.37 % 91.10 57 0.82 % 192.87 44 0.48 % 124.59 21 0.48 % 134.71 vendarle vendarle ven darle 108 0.41 % 104.34 25 0.44 % 108.45 0 0 % 0 76 0.84 % 215.21 7 0.16 % 44.90 naj naj naj 106 0.41 % 102.40 26 0.46 % 112.79 31 0.45 % 104.89 30 0.33 % 84.95 19 0.43 % 121.88 toda toda tod a 87 0.33 % 84.05 71 1.26 % 308 0 0 % 0 15 0.17 % 42.47 1 0.02 % 6.41 ter ter ter 73 0.28 % 70.52 45 0.80 % 195.21 1 0.01 % 3.38 27 0.30 % 76.45 0 0 % 0 kamor kamor kam or 53 0.20 % 51.20 10 0.18 % 43.38 23 0.33 % 77.82 14 0.15 % 39.64 6 0.14 % 38.49 preden preden pre den 45 0.17 % 43.47 11 0.20 % 47.72 11 0.16 % 37.22 17 0.19 % 48.14 6 0.14 % 38.49 and and and 42 0.16 % 40.58 25 0.44 % 108.45 10 0.14 % 33.84 5 0.06 % 14.16 2 0.05 % 12.83 odkar odkar odk ar 35 0.13 % 33.81 19 0.34 % 82.42 7 0.10 % 23.69 6 0.07 % 16.99 3 0.07 % 19.24 tedaj tedaj ted aj 29 0.11 % 28.02 1 0.02 % 4.34 25 0.36 % 84.59 3 0.03 % 8.49 0 0 % 0 bodisi bodisi bod isi 28 0.11 % 27.05 0 0 % 0 0 0 % 0 24 0.26 % 67.96 4 0.09 % 25.66 CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 587 File at CLARIN.SI2.2.244 List of initial character-level 4-grams from conjunction lemmas in the GOS 1.0 corpus with text-type 
distributionGOS1.0-word_parts-conjunctions-lemmas- initial-4grams-taxonomy-entire.tsvLemma Lemma (lower-case) Initial part of the word Rest of the word Total absolute frequency of lemma Percentage of all found lemmas Total relative frequency (per million occurrences) Absolute frequency [gos.T.J.R] Percentage [gos.T.J.R] Relative frequency [gos.T.J.R] Absolute frequency [gos.T.N.Z] Percentage [gos.T.N.Z] Relative frequency [gos.T.N.Z] Absolute frequency [gos.T.J.I] Percentage [gos.T.J.I] Relative frequency [gos.T.J.I] Absolute frequency [gos.T.N.N] Percentage [gos.T.N.N] Relative frequency [gos.T.N.N] ampak ampak ampa k 3,540 30.55 % 3,419.94 815 27.85 % 3,535.44 821 34.54 % 2,777.95 1,237 26.56 % 3,502.77 667 41.02 % 4,278.58 torej torej tore j 1,223 10.56 % 1,181.52 436 14.90 % 1,891.35 37 1.56 % 125.19 648 13.91 % 1,834.92 102 6.27 % 654.29 oziroma oziroma ozir oma 903 7.79 % 872.37 259 8.85 % 1,123.53 105 4.42 % 355.28 417 8.95 % 1,180.80 122 7.50 % 782.59 kjer kjer kjer 652 5.63 % 629.89 122 4.17 % 529.23 121 5.09 % 409.42 306 6.57 % 866.49 103 6.33 % 660.71 sicer sicer sice r 595 5.14 % 574.82 195 6.66 % 845.90 74 3.11 % 250.39 243 5.22 % 688.09 83 5.11 % 532.42 kako kako kako 533 4.60 % 514.92 126 4.31 % 546.58 135 5.68 % 456.79 223 4.79 % 631.46 49 3.01 % 314.32 zakaj zakaj zaka j 528 4.56 % 510.09 87 2.97 % 377.40 147 6.18 % 497.39 238 5.11 % 673.94 56 3.44 % 359.22 kakor kakor kako r 523 4.51 % 505.26 100 3.42 % 433.80 229 9.63 % 774.85 134 2.88 % 379.44 60 3.69 % 384.88 zato zato zato 363 3.13 % 350.69 59 2.02 % 255.94 75 3.15 % 253.77 167 3.59 % 472.89 62 3.81 % 397.71 tako tako tako 352 3.04 % 340.06 56 1.91 % 242.93 114 4.80 % 385.73 94 2.02 % 266.18 88 5.41 % 564.49 niti niti niti 314 2.71 % 303.35 60 2.05 % 260.28 95 4.00 % 321.44 105 2.25 % 297.32 54 3.32 % 346.39 čeprav čeprav čepr av 302 2.61 % 291.76 85 2.90 % 368.73 71 2.99 % 240.24 106 2.28 % 300.16 40 2.46 % 256.59 vendar vendar vend ar 284 2.45 % 274.37 85 2.90 % 368.73 6 0.25 % 20.30 184 
3.95 % 521.03 9 0.55 % 57.73 kadar kadar kada r 229 1.98 % 221.23 47 1.61 % 203.88 98 4.12 % 331.59 60 1.29 % 169.90 24 1.48 % 153.95 namreč namreč namr eč 225 1.94 % 217.37 71 2.43 % 308 14 0.59 % 47.37 126 2.71 % 356.79 14 0.86 % 89.81 drugače drugače drug ače 202 1.74 % 195.15 40 1.37 % 173.52 92 3.87 % 311.29 38 0.82 % 107.60 32 1.97 % 205.27 kajti kajti kajt i 174 1.50 % 168.10 104 3.55 % 451.15 0 0 % 0 65 1.40 % 184.06 5 0.31 % 32.07 dokler dokler dokl er 143 1.23 % 138.15 21 0.72 % 91.10 57 2.40 % 192.87 44 0.94 % 124.59 21 1.29 % 134.71 vendarle vendarle vend arle 108 0.93 % 104.34 25 0.85 % 108.45 0 0 % 0 76 1.63 % 215.21 7 0.43 % 44.90 toda toda toda 87 0.75 % 84.05 71 2.43 % 308 0 0 % 0 15 0.32 % 42.47 1 0.06 % 6.41 kamor kamor kamo r 53 0.46 % 51.20 10 0.34 % 43.38 23 0.97 % 77.82 14 0.30 % 39.64 6 0.37 % 38.49 preden preden pred en 45 0.39 % 43.47 11 0.38 % 47.72 11 0.46 % 37.22 17 0.36 % 48.14 6 0.37 % 38.49 odkar odkar odka r 35 0.30 % 33.81 19 0.65 % 82.42 7 0.29 % 23.69 6 0.13 % 16.99 3 0.18 % 19.24 tedaj tedaj teda j 29 0.25 % 28.02 1 0.03 % 4.34 25 1.05 % 84.59 3 0.06 % 8.49 0 0 % 0 bodisi bodisi bodi si 28 0.24 % 27.05 0 0 % 0 0 0 % 0 24 0.52 % 67.96 4 0.25 % 25.66 temveč temveč temv eč 25 0.22 % 24.15 2 0.07 % 8.68 0 0 % 0 23 0.49 % 65.13 0 0 % 0 kolikor kolikor koli kor 17 0.15 % 16.42 3 0.10 % 13.01 3 0.13 % 10.15 7 0.15 % 19.82 4 0.25 % 25.66 četudi četudi četu di 14 0.12 % 13.53 2 0.07 % 8.68 1 0.04 % 3.38 11 0.24 % 31.15 0 0 % 0 nato nato nato 13 0.11 % 12.56 6 0.20 % 26.03 0 0 % 0 7 0.15 % 19.82 0 0 % 0 magari magari maga ri 11 0.10 % 10.63 0 0 % 0 9 0.38 % 30.45 0 0 % 0 2 0.12 % 12.83 bodi bodi bodi 10 0.09 % 9.66 3 0.10 % 13.01 3 0.13 % 10.15 3 0.06 % 8.49 1 0.06 % 6.41 potemtakem potemtakem pote mtakem 8 0.07 % 7.73 2 0.07 % 8.68 0 0 % 0 6 0.13 % 16.99 0 0 % 0 CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 588 File at CLARIN.SI2.2.245 List of initial character-level 5-grams from conjunction lemmas in 
the GOS 1.0 corpus with text-type distributionGOS1.0-word_parts-conjunctions-lemmas- initial-5grams-taxonomy-entire.tsvLemma Lemma (lower-case) Initial part of the word Rest of the word Total absolute frequency of lemma Percentage of all found lemmas Total relative frequency (per million occurrences) Absolute frequency [gos.T.J.R] Percentage [gos.T.J.R] Relative frequency [gos.T.J.R] Absolute frequency [gos.T.N.Z] Percentage [gos.T.N.Z] Relative frequency [gos.T.N.Z] Absolute frequency [gos.T.J.I] Percentage [gos.T.J.I] Relative frequency [gos.T.J.I] Absolute frequency [gos.T.N.N] Percentage [gos.T.N.N] Relative frequency [gos.T.N.N] ampak ampak ampak 3,540 38.23 % 3,419.94 815 33.65 % 3,535.44 821 44.81 % 2,777.95 1,237 33.10 % 3,502.77 667 52.60 % 4,278.58 torej torej torej 1,223 13.21 % 1,181.52 436 18.00 % 1,891.35 37 2.02 % 125.19 648 17.34 % 1,834.92 102 8.04 % 654.29 oziroma oziroma oziro ma 903 9.75 % 872.37 259 10.69 % 1,123.53 105 5.73 % 355.28 417 11.16 % 1,180.80 122 9.62 % 782.59 sicer sicer sicer 595 6.43 % 574.82 195 8.05 % 845.90 74 4.04 % 250.39 243 6.50 % 688.09 83 6.55 % 532.42 zakaj zakaj zakaj 528 5.70 % 510.09 87 3.59 % 377.40 147 8.02 % 497.39 238 6.37 % 673.94 56 4.42 % 359.22 kakor kakor kakor 523 5.65 % 505.26 100 4.13 % 433.80 229 12.50 % 774.85 134 3.59 % 379.44 60 4.73 % 384.88 čeprav čeprav čepra v 302 3.26 % 291.76 85 3.51 % 368.73 71 3.88 % 240.24 106 2.84 % 300.16 40 3.15 % 256.59 vendar vendar venda r 284 3.07 % 274.37 85 3.51 % 368.73 6 0.33 % 20.30 184 4.92 % 521.03 9 0.71 % 57.73 kadar kadar kadar 229 2.47 % 221.23 47 1.94 % 203.88 98 5.35 % 331.59 60 1.61 % 169.90 24 1.89 % 153.95 namreč namreč namre č 225 2.43 % 217.37 71 2.93 % 308 14 0.76 % 47.37 126 3.37 % 356.79 14 1.10 % 89.81 drugače drugače druga če 202 2.18 % 195.15 40 1.65 % 173.52 92 5.02 % 311.29 38 1.02 % 107.60 32 2.52 % 205.27 kajti kajti kajti 174 1.88 % 168.10 104 4.29 % 451.15 0 0 % 0 65 1.74 % 184.06 5 0.39 % 32.07 dokler dokler dokle r 143 1.54 % 138.15 21 
0.87 % 91.10 57 3.11 % 192.87 44 1.18 % 124.59 21 1.66 % 134.71
vendarle vendarle venda rle 108 1.17 % 104.34 25 1.03 % 108.45 0 0 % 0 76 2.03 % 215.21 7 0.55 % 44.90
kamor kamor kamor 53 0.57 % 51.20 10 0.41 % 43.38 23 1.25 % 77.82 14 0.38 % 39.64 6 0.47 % 38.49
preden preden prede n 45 0.49 % 43.47 11 0.45 % 47.72 11 0.60 % 37.22 17 0.46 % 48.14 6 0.47 % 38.49
odkar odkar odkar 35 0.38 % 33.81 19 0.78 % 82.42 7 0.38 % 23.69 6 0.16 % 16.99 3 0.24 % 19.24
tedaj tedaj tedaj 29 0.31 % 28.02 1 0.04 % 4.34 25 1.36 % 84.59 3 0.08 % 8.49 0 0 % 0
bodisi bodisi bodis i 28 0.30 % 27.05 0 0 % 0 0 0 % 0 24 0.64 % 67.96 4 0.32 % 25.66
temveč temveč temve č 25 0.27 % 24.15 2 0.08 % 8.68 0 0 % 0 23 0.61 % 65.13 0 0 % 0
kolikor kolikor kolik or 17 0.18 % 16.42 3 0.12 % 13.01 3 0.16 % 10.15 7 0.19 % 19.82 4 0.32 % 25.66
četudi četudi četud i 14 0.15 % 13.53 2 0.08 % 8.68 1 0.06 % 3.38 11 0.29 % 31.15 0 0 % 0
magari magari magar i 11 0.12 % 10.63 0 0 % 0 9 0.49 % 30.45 0 0 % 0 2 0.16 % 12.83
potemtakem potemtakem potem takem 8 0.09 % 7.73 2 0.08 % 8.68 0 0 % 0 6 0.16 % 16.99 0 0 % 0
dočim dočim dočim 7 0.08 % 6.76 1 0.04 % 4.34 1 0.06 % 3.38 4 0.11 % 11.33 1 0.08 % 6.41
koder koder koder 5 0.05 % 4.83 0 0 % 0 1 0.06 % 3.38 4 0.11 % 11.33 0 0 % 0
čeravno čeravno čerav no 2 0.02 % 1.93 0 0 % 0 0 0 % 0 2 0.05 % 5.66 0 0 % 0
zatorej zatorej zator ej 1 0.01 % 0.97 1 0.04 % 4.34 0 0 % 0 0 0 % 0 0 0 % 0

2.2.246 List of final character-level 1-grams from conjunction lemmas in the GOS 1.0 corpus with text-type distribution
File at CLARIN.SI: GOS1.0-word_parts-conjunctions-lemmas-final-1grams-taxonomy-entire.tsv
Columns: Lemma; Lemma (lower-case); Rest of the word; Final part of the word; Total absolute frequency of lemma; Percentage of all found lemmas; Total relative frequency (per million occurrences); Absolute frequency [gos.T.J.R]; Percentage [gos.T.J.R]; Relative frequency [gos.T.J.R]; Absolute frequency [gos.T.N.Z]; Percentage [gos.T.N.Z]; Relative frequency [gos.T.N.Z]; Absolute frequency [gos.T.J.I]; Percentage [gos.T.J.I]; Relative frequency [gos.T.J.I]; Absolute frequency [gos.T.N.N]; Percentage [gos.T.N.N]; Relative frequency [gos.T.N.N]
pa pa p a 29,360 25.89 % 28,364.22 5,601 23.95 % 24,296.92 12,111 35.20 % 40,978.95 6,825 18.02 % 19,326.12 4,823 27.20 % 30,937.89
da da d a 19,011 16.77 % 18,366.22 3,598 15.38 % 15,607.99 5,141 14.94 % 17,395.16 7,140 18.86 % 20,218.09 3,132 17.66 % 20,090.70
in in i n 16,238 14.32 % 15,687.27 4,184 17.89 % 18,150.03 3,199 9.30 % 10,824.18 7,068 18.67 % 20,014.21 1,787 10.08 % 11,462.99
a a a 6,753 5.96 % 6,523.96 1,247 5.33 % 5,409.44 2,515 7.31 % 8,509.79 1,771 4.68 % 5,014.88 1,220 6.88 % 7,825.88
če če č e 6,137 5.41 % 5,928.86 1,120 4.79 % 4,858.52 1,882 5.47 % 6,367.96 1,902 5.02 % 5,385.83 1,233 6.95 % 7,909.27
ki ki k i 5,537 4.88 % 5,349.21 1,257 5.38 % 5,452.82 882 2.56 % 2,984.35 2,839 7.50 % 8,039.10 559 3.15 % 3,585.79
ker ker ke r 4,783 4.22 % 4,620.78 824 3.52 % 3,574.48 1,628 4.73 % 5,508.52 1,294 3.42 % 3,664.18 1,037 5.85 % 6,652
ali ali al i 4,757 4.20 % 4,595.66 818 3.50 % 3,548.45 1,536 4.46 % 5,197.23 1,501 3.96 % 4,250.33 902 5.09 % 5,786.02
ko ko k o 4,253 3.75 % 4,108.75 739 3.16 % 3,205.75 1,692 4.92 % 5,725.07 1,215 3.21 % 3,440.47 607 3.42 % 3,893.70
ampak ampak ampa k 3,540 3.12 % 3,419.94 815 3.48 % 3,535.44 821 2.39 % 2,777.95 1,237 3.27 % 3,502.77 667 3.76 % 4,278.58
kot kot ko t 3,075 2.71 % 2,970.71 737 3.15 % 3,197.08 550 1.60 % 1,860.99 1,363 3.60 % 3,859.56 425 2.40 % 2,726.23
saj saj sa j 1,621 1.43 % 1,566.02 231 0.99 % 1,002.07 824 2.40 % 2,788.10 209 0.55 % 591.82 357 2.01 % 2,290.03
torej torej tore j 1,223 1.08 % 1,181.52 436 1.86 % 1,891.35 37 0.11 % 125.19 648 1.71 % 1,834.92 102 0.57 % 654.29
oziroma oziroma ozirom a 903 0.80 % 872.37 259 1.11 % 1,123.53 105 0.30 % 355.28 417 1.10 % 1,180.80 122 0.69 % 782.59
kjer kjer kje r 652 0.57 % 629.89 122 0.52 % 529.23 121 0.35 % 409.42 306 0.81 % 866.49
103 0.58 % 660.71
sicer sicer sice r 595 0.53 % 574.82 195 0.83 % 845.90 74 0.21 % 250.39 243 0.64 % 688.09 83 0.47 % 532.42
kako kako kak o 533 0.47 % 514.92 126 0.54 % 546.58 135 0.39 % 456.79 223 0.59 % 631.46 49 0.28 % 314.32
zakaj zakaj zaka j 528 0.47 % 510.09 87 0.37 % 377.40 147 0.43 % 497.39 238 0.63 % 673.94 56 0.32 % 359.22
kakor kakor kako r 523 0.46 % 505.26 100 0.43 % 433.80 229 0.67 % 774.85 134 0.35 % 379.44 60 0.34 % 384.88
zato zato zat o 363 0.32 % 350.69 59 0.25 % 255.94 75 0.22 % 253.77 167 0.44 % 472.89 62 0.35 % 397.71
tako tako tak o 352 0.31 % 340.06 56 0.24 % 242.93 114 0.33 % 385.73 94 0.25 % 266.18 88 0.50 % 564.49
niti niti nit i 314 0.28 % 303.35 60 0.26 % 260.28 95 0.28 % 321.44 105 0.28 % 297.32 54 0.30 % 346.39
čeprav čeprav čepra v 302 0.27 % 291.76 85 0.36 % 368.73 71 0.21 % 240.24 106 0.28 % 300.16 40 0.23 % 256.59
vendar vendar venda r 284 0.25 % 274.37 85 0.36 % 368.73 6 0.02 % 20.30 184 0.49 % 521.03 9 0.05 % 57.73
kadar kadar kada r 229 0.20 % 221.23 47 0.20 % 203.88 98 0.28 % 331.59 60 0.16 % 169.90 24 0.14 % 153.95
namreč namreč namre č 225 0.20 % 217.37 71 0.30 % 308 14 0.04 % 47.37 126 0.33 % 356.79 14 0.08 % 89.81
drugače drugače drugač e 202 0.18 % 195.15 40 0.17 % 173.52 92 0.27 % 311.29 38 0.10 % 107.60 32 0.18 % 205.27
kajti kajti kajt i 174 0.15 % 168.10 104 0.45 % 451.15 0 0 % 0 65 0.17 % 184.06 5 0.03 % 32.07
dokler dokler dokle r 143 0.13 % 138.15 21 0.09 % 91.10 57 0.17 % 192.87 44 0.12 % 124.59 21 0.12 % 134.71
vendarle vendarle vendarl e 108 0.10 % 104.34 25 0.11 % 108.45 0 0 % 0 76 0.20 % 215.21 7 0.04 % 44.90
naj naj na j 106 0.09 % 102.40 26 0.11 % 112.79 31 0.09 % 104.89 30 0.08 % 84.95 19 0.11 % 121.88
toda toda tod a 87 0.08 % 84.05 71 0.30 % 308 0 0 % 0 15 0.04 % 42.47 1 0.01 % 6.41

2.2.247 List of final character-level 2-grams from conjunction lemmas in the GOS 1.0 corpus with text-type distribution
File at CLARIN.SI: GOS1.0-word_parts-conjunctions-lemmas-final-2grams-taxonomy-entire.tsv
Columns: Lemma; Lemma (lower-case); Rest of the word; Final part of the word; Total absolute frequency of lemma; Percentage of all found lemmas; Total relative frequency (per million occurrences); Absolute frequency [gos.T.J.R]; Percentage [gos.T.J.R]; Relative frequency [gos.T.J.R]; Absolute frequency [gos.T.N.Z]; Percentage [gos.T.N.Z]; Relative frequency [gos.T.N.Z]; Absolute frequency [gos.T.J.I]; Percentage [gos.T.J.I]; Relative frequency [gos.T.J.I]; Absolute frequency [gos.T.N.N]; Percentage [gos.T.N.N]; Relative frequency [gos.T.N.N]
pa pa pa 29,360 27.53 % 28,364.22 5,601 25.30 % 24,296.92 12,111 37.98 % 40,978.95 6,825 18.91 % 19,326.12 4,823 29.21 % 30,937.89
da da da 19,011 17.83 % 18,366.22 3,598 16.25 % 15,607.99 5,141 16.12 % 17,395.16 7,140 19.78 % 20,218.09 3,132 18.97 % 20,090.70
in in in 16,238 15.23 % 15,687.27 4,184 18.90 % 18,150.03 3,199 10.03 % 10,824.18 7,068 19.58 % 20,014.21 1,787 10.82 % 11,462.99
če če če 6,137 5.75 % 5,928.86 1,120 5.06 % 4,858.52 1,882 5.90 % 6,367.96 1,902 5.27 % 5,385.83 1,233 7.47 % 7,909.27
ki ki ki 5,537 5.19 % 5,349.21 1,257 5.68 % 5,452.82 882 2.77 % 2,984.35 2,839 7.87 % 8,039.10 559 3.39 % 3,585.79
ker ker k er 4,783 4.49 % 4,620.78 824 3.72 % 3,574.48 1,628 5.11 % 5,508.52 1,294 3.58 % 3,664.18 1,037 6.28 % 6,652
ali ali a li 4,757 4.46 % 4,595.66 818 3.69 % 3,548.45 1,536 4.82 % 5,197.23 1,501 4.16 % 4,250.33 902 5.46 % 5,786.02
ko ko ko 4,253 3.99 % 4,108.75 739 3.34 % 3,205.75 1,692 5.31 % 5,725.07 1,215 3.37 % 3,440.47 607 3.68 % 3,893.70
ampak ampak amp ak 3,540 3.32 % 3,419.94 815 3.68 % 3,535.44 821 2.58 % 2,777.95 1,237 3.43 % 3,502.77 667 4.04 % 4,278.58
kot kot k ot 3,075 2.88 % 2,970.71 737 3.33 % 3,197.08 550 1.73 % 1,860.99 1,363 3.78 % 3,859.56 425 2.57 % 2,726.23
saj saj s aj 1,621 1.52 % 1,566.02 231 1.04 % 1,002.07 824 2.58 % 2,788.10 209 0.58 % 591.82 357 2.16 % 2,290.03
torej torej tor ej 1,223 1.15 % 1,181.52 436 1.97 % 1,891.35 37
0.12 % 125.19 648 1.79 % 1,834.92 102 0.62 % 654.29
oziroma oziroma oziro ma 903 0.85 % 872.37 259 1.17 % 1,123.53 105 0.33 % 355.28 417 1.16 % 1,180.80 122 0.74 % 782.59
kjer kjer kj er 652 0.61 % 629.89 122 0.55 % 529.23 121 0.38 % 409.42 306 0.85 % 866.49 103 0.62 % 660.71
sicer sicer sic er 595 0.56 % 574.82 195 0.88 % 845.90 74 0.23 % 250.39 243 0.67 % 688.09 83 0.50 % 532.42
kako kako ka ko 533 0.50 % 514.92 126 0.57 % 546.58 135 0.42 % 456.79 223 0.62 % 631.46 49 0.30 % 314.32
zakaj zakaj zak aj 528 0.49 % 510.09 87 0.39 % 377.40 147 0.46 % 497.39 238 0.66 % 673.94 56 0.34 % 359.22
kakor kakor kak or 523 0.49 % 505.26 100 0.45 % 433.80 229 0.72 % 774.85 134 0.37 % 379.44 60 0.36 % 384.88
zato zato za to 363 0.34 % 350.69 59 0.27 % 255.94 75 0.23 % 253.77 167 0.46 % 472.89 62 0.38 % 397.71
tako tako ta ko 352 0.33 % 340.06 56 0.25 % 242.93 114 0.36 % 385.73 94 0.26 % 266.18 88 0.53 % 564.49
niti niti ni ti 314 0.29 % 303.35 60 0.27 % 260.28 95 0.30 % 321.44 105 0.29 % 297.32 54 0.33 % 346.39
čeprav čeprav čepr av 302 0.28 % 291.76 85 0.38 % 368.73 71 0.22 % 240.24 106 0.29 % 300.16 40 0.24 % 256.59
vendar vendar vend ar 284 0.27 % 274.37 85 0.38 % 368.73 6 0.02 % 20.30 184 0.51 % 521.03 9 0.06 % 57.73
kadar kadar kad ar 229 0.21 % 221.23 47 0.21 % 203.88 98 0.31 % 331.59 60 0.17 % 169.90 24 0.14 % 153.95
namreč namreč namr eč 225 0.21 % 217.37 71 0.32 % 308 14 0.04 % 47.37 126 0.35 % 356.79 14 0.09 % 89.81
drugače drugače druga če 202 0.19 % 195.15 40 0.18 % 173.52 92 0.29 % 311.29 38 0.10 % 107.60 32 0.19 % 205.27
kajti kajti kaj ti 174 0.16 % 168.10 104 0.47 % 451.15 0 0 % 0 65 0.18 % 184.06 5 0.03 % 32.07
dokler dokler dokl er 143 0.13 % 138.15 21 0.10 % 91.10 57 0.18 % 192.87 44 0.12 % 124.59 21 0.13 % 134.71
vendarle vendarle vendar le 108 0.10 % 104.34 25 0.11 % 108.45 0 0 % 0 76 0.21 % 215.21 7 0.04 % 44.90
naj naj n aj 106 0.10 % 102.40 26 0.12 % 112.79 31 0.10 % 104.89 30 0.08 % 84.95 19 0.12 % 121.88
toda toda to da 87 0.08 % 84.05 71 0.32 % 308 0 0
% 0 15 0.04 % 42.47 1 0.01 % 6.41
ter ter t er 73 0.07 % 70.52 45 0.20 % 195.21 1 0 % 3.38 27 0.07 % 76.45 0 0 % 0

2.2.248 List of final character-level 3-grams from conjunction lemmas in the GOS 1.0 corpus with text-type distribution
File at CLARIN.SI: GOS1.0-word_parts-conjunctions-lemmas-final-3grams-taxonomy-entire.tsv
Columns: Lemma; Lemma (lower-case); Rest of the word; Final part of the word; Total absolute frequency of lemma; Percentage of all found lemmas; Total relative frequency (per million occurrences); Absolute frequency [gos.T.J.R]; Percentage [gos.T.J.R]; Relative frequency [gos.T.J.R]; Absolute frequency [gos.T.N.Z]; Percentage [gos.T.N.Z]; Relative frequency [gos.T.N.Z]; Absolute frequency [gos.T.J.I]; Percentage [gos.T.J.I]; Relative frequency [gos.T.J.I]; Absolute frequency [gos.T.N.N]; Percentage [gos.T.N.N]; Relative frequency [gos.T.N.N]
ker ker ker 4,783 18.36 % 4,620.78 824 14.62 % 3,574.48 1,628 23.36 % 5,508.52 1,294 14.24 % 3,664.18 1,037 23.74 % 6,652
ali ali ali 4,757 18.26 % 4,595.66 818 14.52 % 3,548.45 1,536 22.04 % 5,197.23 1,501 16.52 % 4,250.33 902 20.65 % 5,786.02
ampak ampak am pak 3,540 13.59 % 3,419.94 815 14.47 % 3,535.44 821 11.78 % 2,777.95 1,237 13.61 % 3,502.77 667 15.27 % 4,278.58
kot kot kot 3,075 11.80 % 2,970.71 737 13.08 % 3,197.08 550 7.89 % 1,860.99 1,363 15.00 % 3,859.56 425 9.73 % 2,726.23
saj saj saj 1,621 6.22 % 1,566.02 231 4.10 % 1,002.07 824 11.82 % 2,788.10 209 2.30 % 591.82 357 8.17 % 2,290.03
torej torej to rej 1,223 4.69 % 1,181.52 436 7.74 % 1,891.35 37 0.53 % 125.19 648 7.13 % 1,834.92 102 2.33 % 654.29
oziroma oziroma ozir oma 903 3.47 % 872.37 259 4.60 % 1,123.53 105 1.51 % 355.28 417 4.59 % 1,180.80 122 2.79 % 782.59
kjer kjer k jer 652 2.50 % 629.89 122 2.17 % 529.23 121 1.74 % 409.42 306 3.37 % 866.49 103 2.36 % 660.71
sicer sicer si cer 595 2.28 % 574.82 195 3.46 % 845.90 74 1.06 % 250.39 243 2.67 % 688.09 83 1.90 % 532.42
kako kako k ako 533
2.05 % 514.92 126 2.24 % 546.58 135 1.94 % 456.79 223 2.45 % 631.46 49 1.12 % 314.32
zakaj zakaj za kaj 528 2.03 % 510.09 87 1.54 % 377.40 147 2.11 % 497.39 238 2.62 % 673.94 56 1.28 % 359.22
kakor kakor ka kor 523 2.01 % 505.26 100 1.77 % 433.80 229 3.29 % 774.85 134 1.48 % 379.44 60 1.37 % 384.88
zato zato z ato 363 1.39 % 350.69 59 1.05 % 255.94 75 1.08 % 253.77 167 1.84 % 472.89 62 1.42 % 397.71
tako tako t ako 352 1.35 % 340.06 56 0.99 % 242.93 114 1.64 % 385.73 94 1.03 % 266.18 88 2.02 % 564.49
niti niti n iti 314 1.21 % 303.35 60 1.06 % 260.28 95 1.36 % 321.44 105 1.16 % 297.32 54 1.24 % 346.39
čeprav čeprav čep rav 302 1.16 % 291.76 85 1.51 % 368.73 71 1.02 % 240.24 106 1.17 % 300.16 40 0.92 % 256.59
vendar vendar ven dar 284 1.09 % 274.37 85 1.51 % 368.73 6 0.09 % 20.30 184 2.02 % 521.03 9 0.21 % 57.73
kadar kadar ka dar 229 0.88 % 221.23 47 0.83 % 203.88 98 1.41 % 331.59 60 0.66 % 169.90 24 0.55 % 153.95
namreč namreč nam reč 225 0.86 % 217.37 71 1.26 % 308 14 0.20 % 47.37 126 1.39 % 356.79 14 0.32 % 89.81
drugače drugače drug ače 202 0.78 % 195.15 40 0.71 % 173.52 92 1.32 % 311.29 38 0.42 % 107.60 32 0.73 % 205.27
kajti kajti ka jti 174 0.67 % 168.10 104 1.85 % 451.15 0 0 % 0 65 0.71 % 184.06 5 0.11 % 32.07
dokler dokler dok ler 143 0.55 % 138.15 21 0.37 % 91.10 57 0.82 % 192.87 44 0.48 % 124.59 21 0.48 % 134.71
vendarle vendarle venda rle 108 0.41 % 104.34 25 0.44 % 108.45 0 0 % 0 76 0.84 % 215.21 7 0.16 % 44.90
naj naj naj 106 0.41 % 102.40 26 0.46 % 112.79 31 0.45 % 104.89 30 0.33 % 84.95 19 0.43 % 121.88
toda toda t oda 87 0.33 % 84.05 71 1.26 % 308 0 0 % 0 15 0.17 % 42.47 1 0.02 % 6.41
ter ter ter 73 0.28 % 70.52 45 0.80 % 195.21 1 0.01 % 3.38 27 0.30 % 76.45 0 0 % 0
kamor kamor ka mor 53 0.20 % 51.20 10 0.18 % 43.38 23 0.33 % 77.82 14 0.15 % 39.64 6 0.14 % 38.49
preden preden pre den 45 0.17 % 43.47 11 0.20 % 47.72 11 0.16 % 37.22 17 0.19 % 48.14 6 0.14 % 38.49
and and and 42 0.16 % 40.58 25 0.44 % 108.45 10 0.14 % 33.84 5 0.06 % 14.16 2 0.05 %
12.83
odkar odkar od kar 35 0.13 % 33.81 19 0.34 % 82.42 7 0.10 % 23.69 6 0.07 % 16.99 3 0.07 % 19.24
tedaj tedaj te daj 29 0.11 % 28.02 1 0.02 % 4.34 25 0.36 % 84.59 3 0.03 % 8.49 0 0 % 0
bodisi bodisi bod isi 28 0.11 % 27.05 0 0 % 0 0 0 % 0 24 0.26 % 67.96 4 0.09 % 25.66

2.2.249 List of final character-level 4-grams from conjunction lemmas in the GOS 1.0 corpus with text-type distribution
File at CLARIN.SI: GOS1.0-word_parts-conjunctions-lemmas-final-4grams-taxonomy-entire.tsv
Columns: Lemma; Lemma (lower-case); Rest of the word; Final part of the word; Total absolute frequency of lemma; Percentage of all found lemmas; Total relative frequency (per million occurrences); Absolute frequency [gos.T.J.R]; Percentage [gos.T.J.R]; Relative frequency [gos.T.J.R]; Absolute frequency [gos.T.N.Z]; Percentage [gos.T.N.Z]; Relative frequency [gos.T.N.Z]; Absolute frequency [gos.T.J.I]; Percentage [gos.T.J.I]; Relative frequency [gos.T.J.I]; Absolute frequency [gos.T.N.N]; Percentage [gos.T.N.N]; Relative frequency [gos.T.N.N]
ampak ampak a mpak 3,540 30.55 % 3,419.94 815 27.85 % 3,535.44 821 34.54 % 2,777.95 1,237 26.56 % 3,502.77 667 41.02 % 4,278.58
torej torej t orej 1,223 10.56 % 1,181.52 436 14.90 % 1,891.35 37 1.56 % 125.19 648 13.91 % 1,834.92 102 6.27 % 654.29
oziroma oziroma ozi roma 903 7.79 % 872.37 259 8.85 % 1,123.53 105 4.42 % 355.28 417 8.95 % 1,180.80 122 7.50 % 782.59
kjer kjer kjer 652 5.63 % 629.89 122 4.17 % 529.23 121 5.09 % 409.42 306 6.57 % 866.49 103 6.33 % 660.71
sicer sicer s icer 595 5.14 % 574.82 195 6.66 % 845.90 74 3.11 % 250.39 243 5.22 % 688.09 83 5.11 % 532.42
kako kako kako 533 4.60 % 514.92 126 4.31 % 546.58 135 5.68 % 456.79 223 4.79 % 631.46 49 3.01 % 314.32
zakaj zakaj z akaj 528 4.56 % 510.09 87 2.97 % 377.40 147 6.18 % 497.39 238 5.11 % 673.94 56 3.44 % 359.22
kakor kakor k akor 523 4.51 % 505.26 100 3.42 % 433.80 229 9.63 % 774.85 134 2.88 % 379.44 60 3.69 % 384.88
zato zato zato
363 3.13 % 350.69 59 2.02 % 255.94 75 3.15 % 253.77 167 3.59 % 472.89 62 3.81 % 397.71
tako tako tako 352 3.04 % 340.06 56 1.91 % 242.93 114 4.80 % 385.73 94 2.02 % 266.18 88 5.41 % 564.49
niti niti niti 314 2.71 % 303.35 60 2.05 % 260.28 95 4.00 % 321.44 105 2.25 % 297.32 54 3.32 % 346.39
čeprav čeprav če prav 302 2.61 % 291.76 85 2.90 % 368.73 71 2.99 % 240.24 106 2.28 % 300.16 40 2.46 % 256.59
vendar vendar ve ndar 284 2.45 % 274.37 85 2.90 % 368.73 6 0.25 % 20.30 184 3.95 % 521.03 9 0.55 % 57.73
kadar kadar k adar 229 1.98 % 221.23 47 1.61 % 203.88 98 4.12 % 331.59 60 1.29 % 169.90 24 1.48 % 153.95
namreč namreč na mreč 225 1.94 % 217.37 71 2.43 % 308 14 0.59 % 47.37 126 2.71 % 356.79 14 0.86 % 89.81
drugače drugače dru gače 202 1.74 % 195.15 40 1.37 % 173.52 92 3.87 % 311.29 38 0.82 % 107.60 32 1.97 % 205.27
kajti kajti k ajti 174 1.50 % 168.10 104 3.55 % 451.15 0 0 % 0 65 1.40 % 184.06 5 0.31 % 32.07
dokler dokler do kler 143 1.23 % 138.15 21 0.72 % 91.10 57 2.40 % 192.87 44 0.94 % 124.59 21 1.29 % 134.71
vendarle vendarle vend arle 108 0.93 % 104.34 25 0.85 % 108.45 0 0 % 0 76 1.63 % 215.21 7 0.43 % 44.90
toda toda toda 87 0.75 % 84.05 71 2.43 % 308 0 0 % 0 15 0.32 % 42.47 1 0.06 % 6.41
kamor kamor k amor 53 0.46 % 51.20 10 0.34 % 43.38 23 0.97 % 77.82 14 0.30 % 39.64 6 0.37 % 38.49
preden preden pr eden 45 0.39 % 43.47 11 0.38 % 47.72 11 0.46 % 37.22 17 0.36 % 48.14 6 0.37 % 38.49
odkar odkar o dkar 35 0.30 % 33.81 19 0.65 % 82.42 7 0.29 % 23.69 6 0.13 % 16.99 3 0.18 % 19.24
tedaj tedaj t edaj 29 0.25 % 28.02 1 0.03 % 4.34 25 1.05 % 84.59 3 0.06 % 8.49 0 0 % 0
bodisi bodisi bo disi 28 0.24 % 27.05 0 0 % 0 0 0 % 0 24 0.52 % 67.96 4 0.25 % 25.66
temveč temveč te mveč 25 0.22 % 24.15 2 0.07 % 8.68 0 0 % 0 23 0.49 % 65.13 0 0 % 0
kolikor kolikor kol ikor 17 0.15 % 16.42 3 0.10 % 13.01 3 0.13 % 10.15 7 0.15 % 19.82 4 0.25 % 25.66
četudi četudi če tudi 14 0.12 % 13.53 2 0.07 % 8.68 1 0.04 % 3.38 11 0.24 % 31.15 0 0 % 0
nato nato nato 13 0.11 % 12.56 6 0.20 %
26.03 0 0 % 0 7 0.15 % 19.82 0 0 % 0
magari magari ma gari 11 0.10 % 10.63 0 0 % 0 9 0.38 % 30.45 0 0 % 0 2 0.12 % 12.83
bodi bodi bodi 10 0.09 % 9.66 3 0.10 % 13.01 3 0.13 % 10.15 3 0.06 % 8.49 1 0.06 % 6.41
potemtakem potemtakem potemt akem 8 0.07 % 7.73 2 0.07 % 8.68 0 0 % 0 6 0.13 % 16.99 0 0 % 0

2.2.250 List of final character-level 5-grams from conjunction lemmas in the GOS 1.0 corpus with text-type distribution
File at CLARIN.SI: GOS1.0-word_parts-conjunctions-lemmas-final-5grams-taxonomy-entire.tsv
Columns: Lemma; Lemma (lower-case); Rest of the word; Final part of the word; Total absolute frequency of lemma; Percentage of all found lemmas; Total relative frequency (per million occurrences); Absolute frequency [gos.T.J.R]; Percentage [gos.T.J.R]; Relative frequency [gos.T.J.R]; Absolute frequency [gos.T.N.Z]; Percentage [gos.T.N.Z]; Relative frequency [gos.T.N.Z]; Absolute frequency [gos.T.J.I]; Percentage [gos.T.J.I]; Relative frequency [gos.T.J.I]; Absolute frequency [gos.T.N.N]; Percentage [gos.T.N.N]; Relative frequency [gos.T.N.N]
ampak ampak ampak 3,540 38.23 % 3,419.94 815 33.65 % 3,535.44 821 44.81 % 2,777.95 1,237 33.10 % 3,502.77 667 52.60 % 4,278.58
torej torej torej 1,223 13.21 % 1,181.52 436 18.00 % 1,891.35 37 2.02 % 125.19 648 17.34 % 1,834.92 102 8.04 % 654.29
oziroma oziroma oz iroma 903 9.75 % 872.37 259 10.69 % 1,123.53 105 5.73 % 355.28 417 11.16 % 1,180.80 122 9.62 % 782.59
sicer sicer sicer 595 6.43 % 574.82 195 8.05 % 845.90 74 4.04 % 250.39 243 6.50 % 688.09 83 6.55 % 532.42
zakaj zakaj zakaj 528 5.70 % 510.09 87 3.59 % 377.40 147 8.02 % 497.39 238 6.37 % 673.94 56 4.42 % 359.22
kakor kakor kakor 523 5.65 % 505.26 100 4.13 % 433.80 229 12.50 % 774.85 134 3.59 % 379.44 60 4.73 % 384.88
čeprav čeprav č eprav 302 3.26 % 291.76 85 3.51 % 368.73 71 3.88 % 240.24 106 2.84 % 300.16 40 3.15 % 256.59
vendar vendar v endar 284 3.07 % 274.37 85 3.51 % 368.73 6 0.33 % 20.30 184 4.92 % 521.03
9 0.71 % 57.73
kadar kadar kadar 229 2.47 % 221.23 47 1.94 % 203.88 98 5.35 % 331.59 60 1.61 % 169.90 24 1.89 % 153.95
namreč namreč n amreč 225 2.43 % 217.37 71 2.93 % 308 14 0.76 % 47.37 126 3.37 % 356.79 14 1.10 % 89.81
drugače drugače dr ugače 202 2.18 % 195.15 40 1.65 % 173.52 92 5.02 % 311.29 38 1.02 % 107.60 32 2.52 % 205.27
kajti kajti kajti 174 1.88 % 168.10 104 4.29 % 451.15 0 0 % 0 65 1.74 % 184.06 5 0.39 % 32.07
dokler dokler d okler 143 1.54 % 138.15 21 0.87 % 91.10 57 3.11 % 192.87 44 1.18 % 124.59 21 1.66 % 134.71
vendarle vendarle ven darle 108 1.17 % 104.34 25 1.03 % 108.45 0 0 % 0 76 2.03 % 215.21 7 0.55 % 44.90
kamor kamor kamor 53 0.57 % 51.20 10 0.41 % 43.38 23 1.25 % 77.82 14 0.38 % 39.64 6 0.47 % 38.49
preden preden p reden 45 0.49 % 43.47 11 0.45 % 47.72 11 0.60 % 37.22 17 0.46 % 48.14 6 0.47 % 38.49
odkar odkar odkar 35 0.38 % 33.81 19 0.78 % 82.42 7 0.38 % 23.69 6 0.16 % 16.99 3 0.24 % 19.24
tedaj tedaj tedaj 29 0.31 % 28.02 1 0.04 % 4.34 25 1.36 % 84.59 3 0.08 % 8.49 0 0 % 0
bodisi bodisi b odisi 28 0.30 % 27.05 0 0 % 0 0 0 % 0 24 0.64 % 67.96 4 0.32 % 25.66
temveč temveč t emveč 25 0.27 % 24.15 2 0.08 % 8.68 0 0 % 0 23 0.61 % 65.13 0 0 % 0
kolikor kolikor ko likor 17 0.18 % 16.42 3 0.12 % 13.01 3 0.16 % 10.15 7 0.19 % 19.82 4 0.32 % 25.66
četudi četudi č etudi 14 0.15 % 13.53 2 0.08 % 8.68 1 0.06 % 3.38 11 0.29 % 31.15 0 0 % 0
magari magari m agari 11 0.12 % 10.63 0 0 % 0 9 0.49 % 30.45 0 0 % 0 2 0.16 % 12.83
potemtakem potemtakem potem takem 8 0.09 % 7.73 2 0.08 % 8.68 0 0 % 0 6 0.16 % 16.99 0 0 % 0
dočim dočim dočim 7 0.08 % 6.76 1 0.04 % 4.34 1 0.06 % 3.38 4 0.11 % 11.33 1 0.08 % 6.41
koder koder koder 5 0.05 % 4.83 0 0 % 0 1 0.06 % 3.38 4 0.11 % 11.33 0 0 % 0
čeravno čeravno če ravno 2 0.02 % 1.93 0 0 % 0 0 0 % 0 2 0.05 % 5.66 0 0 % 0
zatorej zatorej za torej 1 0.01 % 0.97 1 0.04 % 4.34 0 0 % 0 0 0 % 0 0 0 % 0

2.2.251 List of initial character-level 1-grams from conjunction standardized forms in the GOS 1.0 corpus with text-type distribution
File at CLARIN.SI: GOS1.0-word_parts-conjunctions-standardized_forms-initial-1grams-taxonomy-entire.tsv
Columns: Standardized form; Initial part of the word; Rest of the word; Total absolute frequency of standardized form; Percentage of all found standardized forms; Total relative frequency (per million occurrences); Absolute frequency [gos.T.J.R]; Percentage [gos.T.J.R]; Relative frequency [gos.T.J.R]; Absolute frequency [gos.T.N.Z]; Percentage [gos.T.N.Z]; Relative frequency [gos.T.N.Z]; Absolute frequency [gos.T.J.I]; Percentage [gos.T.J.I]; Relative frequency [gos.T.J.I]; Absolute frequency [gos.T.N.N]; Percentage [gos.T.N.N]; Relative frequency [gos.T.N.N]
pa p a 29,360 25.89 % 28,364.22 5,601 23.95 % 24,296.92 12,111 35.20 % 40,978.95 6,825 18.02 % 19,326.12 4,823 27.20 % 30,937.89
da d a 19,008 16.76 % 18,363.32 3,596 15.38 % 15,599.31 5,141 14.94 % 17,395.16 7,139 18.85 % 20,215.26 3,132 17.66 % 20,090.70
in i n 16,235 14.32 % 15,684.37 4,182 17.88 % 18,141.36 3,198 9.30 % 10,820.80 7,068 18.67 % 20,014.21 1,787 10.08 % 11,462.99
a a 6,576 5.80 % 6,352.97 1,222 5.22 % 5,300.99 2,499 7.26 % 8,455.65 1,665 4.40 % 4,714.72 1,190 6.71 % 7,633.44
če č e 6,137 5.41 % 5,928.86 1,120 4.79 % 4,858.52 1,882 5.47 % 6,367.96 1,902 5.02 % 5,385.83 1,233 6.95 % 7,909.27
ki k i 5,537 4.88 % 5,349.21 1,257 5.38 % 5,452.82 882 2.56 % 2,984.35 2,839 7.50 % 8,039.10 559 3.15 % 3,585.79
ker k er 4,783 4.22 % 4,620.78 824 3.52 % 3,574.48 1,628 4.73 % 5,508.52 1,294 3.42 % 3,664.18 1,037 5.85 % 6,652
ali a li 4,757 4.20 % 4,595.66 818 3.50 % 3,548.45 1,536 4.46 % 5,197.23 1,501 3.96 % 4,250.33 902 5.09 % 5,786.02
ko k o 4,247 3.75 % 4,102.96 734 3.14 % 3,184.06 1,691 4.92 % 5,721.69 1,215 3.21 % 3,440.47 607 3.42 % 3,893.70
ampak a mpak 3,540 3.12 % 3,419.94 815 3.48 % 3,535.44 821 2.39 % 2,777.95 1,237 3.27 % 3,502.77 667 3.76 % 4,278.58
kot k ot 3,075 2.71 % 2,970.71 737 3.15 % 3,197.08 550 1.60 % 1,860.99 1,363
3.60 % 3,859.56 425 2.40 % 2,726.23
saj s aj 1,621 1.43 % 1,566.02 231 0.99 % 1,002.07 824 2.40 % 2,788.10 209 0.55 % 591.82 357 2.01 % 2,290.03
torej t orej 1,223 1.08 % 1,181.52 436 1.86 % 1,891.35 37 0.11 % 125.19 648 1.71 % 1,834.92 102 0.57 % 654.29
oziroma o ziroma 903 0.80 % 872.37 259 1.11 % 1,123.53 105 0.30 % 355.28 417 1.10 % 1,180.80 122 0.69 % 782.59
kjer k jer 652 0.57 % 629.89 122 0.52 % 529.23 121 0.35 % 409.42 306 0.81 % 866.49 103 0.58 % 660.71
sicer s icer 595 0.53 % 574.82 195 0.83 % 845.90 74 0.21 % 250.39 243 0.64 % 688.09 83 0.47 % 532.42
kako k ako 533 0.47 % 514.92 126 0.54 % 546.58 135 0.39 % 456.79 223 0.59 % 631.46 49 0.28 % 314.32
zakaj z akaj 527 0.47 % 509.13 87 0.37 % 377.40 147 0.43 % 497.39 237 0.63 % 671.10 56 0.32 % 359.22
kakor k akor 523 0.46 % 505.26 100 0.43 % 433.80 229 0.67 % 774.85 134 0.35 % 379.44 60 0.34 % 384.88
zato z ato 363 0.32 % 350.69 59 0.25 % 255.94 75 0.22 % 253.77 167 0.44 % 472.89 62 0.35 % 397.71
tako t ako 352 0.31 % 340.06 56 0.24 % 242.93 114 0.33 % 385.73 94 0.25 % 266.18 88 0.50 % 564.49
niti n iti 314 0.28 % 303.35 60 0.26 % 260.28 95 0.28 % 321.44 105 0.28 % 297.32 54 0.30 % 346.39
čeprav č eprav 302 0.27 % 291.76 85 0.36 % 368.73 71 0.21 % 240.24 106 0.28 % 300.16 40 0.23 % 256.59
vendar v endar 284 0.25 % 274.37 85 0.36 % 368.73 6 0.02 % 20.30 184 0.49 % 521.03 9 0.05 % 57.73
kadar k adar 229 0.20 % 221.23 47 0.20 % 203.88 98 0.28 % 331.59 60 0.16 % 169.90 24 0.14 % 153.95
namreč n amreč 225 0.20 % 217.37 71 0.30 % 308 14 0.04 % 47.37 126 0.33 % 356.79 14 0.08 % 89.81
drugače d rugače 202 0.18 % 195.15 40 0.17 % 173.52 92 0.27 % 311.29 38 0.10 % 107.60 32 0.18 % 205.27
A A 177 0.16 % 171 25 0.11 % 108.45 16 0.05 % 54.14 106 0.28 % 300.16 30 0.17 % 192.44
kajti k ajti 174 0.15 % 168.10 104 0.45 % 451.15 0 0 % 0 65 0.17 % 184.06 5 0.03 % 32.07
dokler d okler 143 0.13 % 138.15 21 0.09 % 91.10 57 0.17 % 192.87 44 0.12 % 124.59 21 0.12 % 134.71
vendarle v endarle 108 0.10 % 104.34 25 0.11 % 108.45 0 0 %
0 76 0.20 % 215.21 7 0.04 % 44.90
naj n aj 105 0.09 % 101.44 25 0.11 % 108.45 31 0.09 % 104.89 30 0.08 % 84.95 19 0.11 % 121.88

2.2.252 List of initial character-level 2-grams from conjunction standardized forms in the GOS 1.0 corpus with text-type distribution
File at CLARIN.SI: GOS1.0-word_parts-conjunctions-standardized_forms-initial-2grams-taxonomy-entire.tsv
Columns: Standardized form; Initial part of the word; Rest of the word; Total absolute frequency of standardized form; Percentage of all found standardized forms; Total relative frequency (per million occurrences); Absolute frequency [gos.T.J.R]; Percentage [gos.T.J.R]; Relative frequency [gos.T.J.R]; Absolute frequency [gos.T.N.Z]; Percentage [gos.T.N.Z]; Relative frequency [gos.T.N.Z]; Absolute frequency [gos.T.J.I]; Percentage [gos.T.J.I]; Relative frequency [gos.T.J.I]; Absolute frequency [gos.T.N.N]; Percentage [gos.T.N.N]; Relative frequency [gos.T.N.N]
pa pa 29,360 27.53 % 28,364.22 5,601 25.30 % 24,296.92 12,111 37.98 % 40,978.95 6,825 18.91 % 19,326.12 4,823 29.21 % 30,937.89
da da 19,008 17.83 % 18,363.32 3,596 16.24 % 15,599.31 5,141 16.12 % 17,395.16 7,139 19.78 % 20,215.26 3,132 18.97 % 20,090.70
in in 16,235 15.22 % 15,684.37 4,182 18.89 % 18,141.36 3,198 10.03 % 10,820.80 7,068 19.58 % 20,014.21 1,787 10.82 % 11,462.99
če če 6,137 5.75 % 5,928.86 1,120 5.06 % 4,858.52 1,882 5.90 % 6,367.96 1,902 5.27 % 5,385.83 1,233 7.47 % 7,909.27
ki ki 5,537 5.19 % 5,349.21 1,257 5.68 % 5,452.82 882 2.77 % 2,984.35 2,839 7.87 % 8,039.10 559 3.39 % 3,585.79
ker ke r 4,783 4.49 % 4,620.78 824 3.72 % 3,574.48 1,628 5.11 % 5,508.52 1,294 3.58 % 3,664.18 1,037 6.28 % 6,652
ali al i 4,757 4.46 % 4,595.66 818 3.69 % 3,548.45 1,536 4.82 % 5,197.23 1,501 4.16 % 4,250.33 902 5.46 % 5,786.02
ko ko 4,247 3.98 % 4,102.96 734 3.31 % 3,184.06 1,691 5.30 % 5,721.69 1,215 3.37 % 3,440.47 607 3.68 % 3,893.70
ampak am pak 3,540 3.32 % 3,419.94 815 3.68 % 3,535.44 821
2.58 % 2,777.95 1,237 3.43 % 3,502.77 667 4.04 % 4,278.58
kot ko t 3,075 2.88 % 2,970.71 737 3.33 % 3,197.08 550 1.73 % 1,860.99 1,363 3.78 % 3,859.56 425 2.57 % 2,726.23
saj sa j 1,621 1.52 % 1,566.02 231 1.04 % 1,002.07 824 2.58 % 2,788.10 209 0.58 % 591.82 357 2.16 % 2,290.03
torej to rej 1,223 1.15 % 1,181.52 436 1.97 % 1,891.35 37 0.12 % 125.19 648 1.79 % 1,834.92 102 0.62 % 654.29
oziroma oz iroma 903 0.85 % 872.37 259 1.17 % 1,123.53 105 0.33 % 355.28 417 1.16 % 1,180.80 122 0.74 % 782.59
kjer kj er 652 0.61 % 629.89 122 0.55 % 529.23 121 0.38 % 409.42 306 0.85 % 866.49 103 0.62 % 660.71
sicer si cer 595 0.56 % 574.82 195 0.88 % 845.90 74 0.23 % 250.39 243 0.67 % 688.09 83 0.50 % 532.42
kako ka ko 533 0.50 % 514.92 126 0.57 % 546.58 135 0.42 % 456.79 223 0.62 % 631.46 49 0.30 % 314.32
zakaj za kaj 527 0.49 % 509.13 87 0.39 % 377.40 147 0.46 % 497.39 237 0.66 % 671.10 56 0.34 % 359.22
kakor ka kor 523 0.49 % 505.26 100 0.45 % 433.80 229 0.72 % 774.85 134 0.37 % 379.44 60 0.36 % 384.88
zato za to 363 0.34 % 350.69 59 0.27 % 255.94 75 0.23 % 253.77 167 0.46 % 472.89 62 0.38 % 397.71
tako ta ko 352 0.33 % 340.06 56 0.25 % 242.93 114 0.36 % 385.73 94 0.26 % 266.18 88 0.53 % 564.49
niti ni ti 314 0.29 % 303.35 60 0.27 % 260.28 95 0.30 % 321.44 105 0.29 % 297.32 54 0.33 % 346.39
čeprav če prav 302 0.28 % 291.76 85 0.38 % 368.73 71 0.22 % 240.24 106 0.29 % 300.16 40 0.24 % 256.59
vendar ve ndar 284 0.27 % 274.37 85 0.38 % 368.73 6 0.02 % 20.30 184 0.51 % 521.03 9 0.06 % 57.73
kadar ka dar 229 0.21 % 221.23 47 0.21 % 203.88 98 0.31 % 331.59 60 0.17 % 169.90 24 0.14 % 153.95
namreč na mreč 225 0.21 % 217.37 71 0.32 % 308 14 0.04 % 47.37 126 0.35 % 356.79 14 0.09 % 89.81
drugače dr ugače 202 0.19 % 195.15 40 0.18 % 173.52 92 0.29 % 311.29 38 0.10 % 107.60 32 0.19 % 205.27
kajti ka jti 174 0.16 % 168.10 104 0.47 % 451.15 0 0 % 0 65 0.18 % 184.06 5 0.03 % 32.07
dokler do kler 143 0.13 % 138.15 21 0.10 % 91.10 57 0.18 % 192.87 44 0.12 % 124.59 21 0.13 % 134.71
vendarle ve
ndarle 108 0.10 % 104.34 25 0.11 % 108.45 0 0 % 0 76 0.21 % 215.21 7 0.04 % 44.90
naj na j 105 0.10 % 101.44 25 0.11 % 108.45 31 0.10 % 104.89 30 0.08 % 84.95 19 0.12 % 121.88
toda to da 87 0.08 % 84.05 71 0.32 % 308 0 0 % 0 15 0.04 % 42.47 1 0.01 % 6.41
ter te r 73 0.07 % 70.52 45 0.20 % 195.21 1 0 % 3.38 27 0.07 % 76.45 0 0 % 0

2.2.253 List of initial character-level 3-grams from conjunction standardized forms in the GOS 1.0 corpus with text-type distribution
File at CLARIN.SI: GOS1.0-word_parts-conjunctions-standardized_forms-initial-3grams-taxonomy-entire.tsv
Columns: Standardized form; Initial part of the word; Rest of the word; Total absolute frequency of standardized form; Percentage of all found standardized forms; Total relative frequency (per million occurrences); Absolute frequency [gos.T.J.R]; Percentage [gos.T.J.R]; Relative frequency [gos.T.J.R]; Absolute frequency [gos.T.N.Z]; Percentage [gos.T.N.Z]; Relative frequency [gos.T.N.Z]; Absolute frequency [gos.T.J.I]; Percentage [gos.T.J.I]; Relative frequency [gos.T.J.I]; Absolute frequency [gos.T.N.N]; Percentage [gos.T.N.N]; Relative frequency [gos.T.N.N]
ker ker 4,783 18.36 % 4,620.78 824 14.62 % 3,574.48 1,628 23.36 % 5,508.52 1,294 14.24 % 3,664.18 1,037 23.74 % 6,652
ali ali 4,757 18.26 % 4,595.66 818 14.52 % 3,548.45 1,536 22.04 % 5,197.23 1,501 16.52 % 4,250.33 902 20.65 % 5,786.02
ampak amp ak 3,540 13.59 % 3,419.94 815 14.47 % 3,535.44 821 11.78 % 2,777.95 1,237 13.61 % 3,502.77 667 15.27 % 4,278.58
kot kot 3,075 11.80 % 2,970.71 737 13.08 % 3,197.08 550 7.89 % 1,860.99 1,363 15.00 % 3,859.56 425 9.73 % 2,726.23
saj saj 1,621 6.22 % 1,566.02 231 4.10 % 1,002.07 824 11.82 % 2,788.10 209 2.30 % 591.82 357 8.17 % 2,290.03
torej tor ej 1,223 4.69 % 1,181.52 436 7.74 % 1,891.35 37 0.53 % 125.19 648 7.13 % 1,834.92 102 2.33 % 654.29
oziroma ozi roma 903 3.47 % 872.37 259 4.60 % 1,123.53 105 1.51 % 355.28 417 4.59 % 1,180.80 122 2.79 % 782.59
kjer
kje r 652 2.50 % 629.89 122 2.17 % 529.23 121 1.74 % 409.42 306 3.37 % 866.49 103 2.36 % 660.71
sicer sic er 595 2.28 % 574.82 195 3.46 % 845.90 74 1.06 % 250.39 243 2.67 % 688.09 83 1.90 % 532.42
kako kak o 533 2.05 % 514.92 126 2.24 % 546.58 135 1.94 % 456.79 223 2.45 % 631.46 49 1.12 % 314.32
zakaj zak aj 527 2.02 % 509.13 87 1.54 % 377.40 147 2.11 % 497.39 237 2.61 % 671.10 56 1.28 % 359.22
kakor kak or 523 2.01 % 505.26 100 1.77 % 433.80 229 3.29 % 774.85 134 1.48 % 379.44 60 1.37 % 384.88
zato zat o 363 1.39 % 350.69 59 1.05 % 255.94 75 1.08 % 253.77 167 1.84 % 472.89 62 1.42 % 397.71
tako tak o 352 1.35 % 340.06 56 0.99 % 242.93 114 1.64 % 385.73 94 1.03 % 266.18 88 2.02 % 564.49
niti nit i 314 1.21 % 303.35 60 1.06 % 260.28 95 1.36 % 321.44 105 1.16 % 297.32 54 1.24 % 346.39
čeprav čep rav 302 1.16 % 291.76 85 1.51 % 368.73 71 1.02 % 240.24 106 1.17 % 300.16 40 0.92 % 256.59
vendar ven dar 284 1.09 % 274.37 85 1.51 % 368.73 6 0.09 % 20.30 184 2.02 % 521.03 9 0.21 % 57.73
kadar kad ar 229 0.88 % 221.23 47 0.83 % 203.88 98 1.41 % 331.59 60 0.66 % 169.90 24 0.55 % 153.95
namreč nam reč 225 0.86 % 217.37 71 1.26 % 308 14 0.20 % 47.37 126 1.39 % 356.79 14 0.32 % 89.81
drugače dru gače 202 0.78 % 195.15 40 0.71 % 173.52 92 1.32 % 311.29 38 0.42 % 107.60 32 0.73 % 205.27
kajti kaj ti 174 0.67 % 168.10 104 1.85 % 451.15 0 0 % 0 65 0.71 % 184.06 5 0.11 % 32.07
dokler dok ler 143 0.55 % 138.15 21 0.37 % 91.10 57 0.82 % 192.87 44 0.48 % 124.59 21 0.48 % 134.71
vendarle ven darle 108 0.41 % 104.34 25 0.44 % 108.45 0 0 % 0 76 0.84 % 215.21 7 0.16 % 44.90
naj naj 105 0.40 % 101.44 25 0.44 % 108.45 31 0.45 % 104.89 30 0.33 % 84.95 19 0.43 % 121.88
toda tod a 87 0.33 % 84.05 71 1.26 % 308 0 0 % 0 15 0.17 % 42.47 1 0.02 % 6.41
ter ter 73 0.28 % 70.52 45 0.80 % 195.21 1 0.01 % 3.38 27 0.30 % 76.45 0 0 % 0
kamor kam or 53 0.20 % 51.20 10 0.18 % 43.38 23 0.33 % 77.82 14 0.15 % 39.64 6 0.14 % 38.49
preden pre den 45 0.17 % 43.47 11 0.20 % 47.72 11 0.16 % 37.22 17 0.19 % 48.14 6
0.14 % 38.49 and and 42 0.16 % 40.58 25 0.44 % 108.45 10 0.14 % 33.84 5 0.06 % 14.16 2 0.05 % 12.83 odkar odk ar 35 0.13 % 33.81 19 0.34 % 82.42 7 0.10 % 23.69 6 0.07 % 16.99 3 0.07 % 19.24 tedaj ted aj 29 0.11 % 28.02 1 0.02 % 4.34 25 0.36 % 84.59 3 0.03 % 8.49 0 0 % 0 bodisi bod isi 28 0.11 % 27.05 0 0 % 0 0 0 % 0 24 0.26 % 67.96 4 0.09 % 25.66 CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 597 File at CLARIN.SI2.2.254 List of initial character-level 4-grams from conjunction standardized forms in the GOS 1.0 corpus with text-type distributionGOS1.0-word_parts-conjunctions-standardized_ forms-initial-4grams-taxonomy-entire.tsvStandardized form Initial part of the word Rest of the word Total absolute frequency of standardized form Percentage of all found standardized forms Total relative frequency (per million occurrences) Absolute frequency [gos.T.J.R] Percentage [gos.T.J.R] Relative frequency [gos.T.J.R] Absolute frequency [gos.T.N.Z] Percentage [gos.T.N.Z] Relative frequency [gos.T.N.Z] Absolute frequency [gos.T.J.I] Percentage [gos.T.J.I] Relative frequency [gos.T.J.I] Absolute frequency [gos.T.N.N] Percentage [gos.T.N.N] Relative frequency [gos.T.N.N] ampak ampa k 3,540 30.55 % 3,419.94 815 27.85 % 3,535.44 821 34.54 % 2,777.95 1,237 26.56 % 3,502.77 667 41.02 % 4,278.58 torej tore j 1,223 10.56 % 1,181.52 436 14.90 % 1,891.35 37 1.56 % 125.19 648 13.91 % 1,834.92 102 6.27 % 654.29 oziroma ozir oma 903 7.79 % 872.37 259 8.85 % 1,123.53 105 4.42 % 355.28 417 8.95 % 1,180.80 122 7.50 % 782.59 kjer kjer 652 5.63 % 629.89 122 4.17 % 529.23 121 5.09 % 409.42 306 6.57 % 866.49 103 6.33 % 660.71 sicer sice r 595 5.14 % 574.82 195 6.66 % 845.90 74 3.11 % 250.39 243 5.22 % 688.09 83 5.11 % 532.42 kako kako 533 4.60 % 514.92 126 4.31 % 546.58 135 5.68 % 456.79 223 4.79 % 631.46 49 3.01 % 314.32 zakaj zaka j 527 4.55 % 509.13 87 2.97 % 377.40 147 6.18 % 497.39 237 5.09 % 671.10 56 3.44 % 359.22 kakor kako r 523 4.51 % 505.26 100 3.42 % 
433.80 229 9.63 % 774.85 134 2.88 % 379.44 60 3.69 % 384.88 zato zato 363 3.13 % 350.69 59 2.02 % 255.94 75 3.15 % 253.77 167 3.59 % 472.89 62 3.81 % 397.71 tako tako 352 3.04 % 340.06 56 1.91 % 242.93 114 4.80 % 385.73 94 2.02 % 266.18 88 5.41 % 564.49 niti niti 314 2.71 % 303.35 60 2.05 % 260.28 95 4.00 % 321.44 105 2.25 % 297.32 54 3.32 % 346.39 čeprav čepr av 302 2.61 % 291.76 85 2.90 % 368.73 71 2.99 % 240.24 106 2.28 % 300.16 40 2.46 % 256.59 vendar vend ar 284 2.45 % 274.37 85 2.90 % 368.73 6 0.25 % 20.30 184 3.95 % 521.03 9 0.55 % 57.73 kadar kada r 229 1.98 % 221.23 47 1.61 % 203.88 98 4.12 % 331.59 60 1.29 % 169.90 24 1.48 % 153.95 namreč namr eč 225 1.94 % 217.37 71 2.43 % 308 14 0.59 % 47.37 126 2.71 % 356.79 14 0.86 % 89.81 drugače drug ače 202 1.74 % 195.15 40 1.37 % 173.52 92 3.87 % 311.29 38 0.82 % 107.60 32 1.97 % 205.27 kajti kajt i 174 1.50 % 168.10 104 3.55 % 451.15 0 0 % 0 65 1.40 % 184.06 5 0.31 % 32.07 dokler dokl er 143 1.23 % 138.15 21 0.72 % 91.10 57 2.40 % 192.87 44 0.94 % 124.59 21 1.29 % 134.71 vendarle vend arle 108 0.93 % 104.34 25 0.85 % 108.45 0 0 % 0 76 1.63 % 215.21 7 0.43 % 44.90 toda toda 87 0.75 % 84.05 71 2.43 % 308 0 0 % 0 15 0.32 % 42.47 1 0.06 % 6.41 kamor kamo r 53 0.46 % 51.20 10 0.34 % 43.38 23 0.97 % 77.82 14 0.30 % 39.64 6 0.37 % 38.49 preden pred en 45 0.39 % 43.47 11 0.38 % 47.72 11 0.46 % 37.22 17 0.36 % 48.14 6 0.37 % 38.49 odkar odka r 35 0.30 % 33.81 19 0.65 % 82.42 7 0.29 % 23.69 6 0.13 % 16.99 3 0.18 % 19.24 tedaj teda j 29 0.25 % 28.02 1 0.03 % 4.34 25 1.05 % 84.59 3 0.06 % 8.49 0 0 % 0 bodisi bodi si 28 0.24 % 27.05 0 0 % 0 0 0 % 0 24 0.52 % 67.96 4 0.25 % 25.66 temveč temv eč 25 0.22 % 24.15 2 0.07 % 8.68 0 0 % 0 23 0.49 % 65.13 0 0 % 0 kolikor koli kor 17 0.15 % 16.42 3 0.10 % 13.01 3 0.13 % 10.15 7 0.15 % 19.82 4 0.25 % 25.66 četudi četu di 14 0.12 % 13.53 2 0.07 % 8.68 1 0.04 % 3.38 11 0.24 % 31.15 0 0 % 0 nato nato 13 0.11 % 12.56 6 0.20 % 26.03 0 0 % 0 7 0.15 % 19.82 0 0 % 0 magari maga ri 11 0.10 % 
10.63 0 0 % 0 9 0.38 % 30.45 0 0 % 0 2 0.12 % 12.83 bodi bodi 10 0.09 % 9.66 3 0.10 % 13.01 3 0.13 % 10.15 3 0.06 % 8.49 1 0.06 % 6.41 potemtakem pote mtakem 8 0.07 % 7.73 2 0.07 % 8.68 0 0 % 0 6 0.13 % 16.99 0 0 % 0 CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 598 File at CLARIN.SI2.2.255 List of initial character-level 5-grams from conjunction standardized forms in the GOS 1.0 corpus with text-type distributionGOS1.0-word_parts-conjunctions-standardized_ forms-initial-5grams-taxonomy-entire.tsvStandardized form Initial part of the word Rest of the word Total absolute frequency of standardized form Percentage of all found standardized forms Total relative frequency (per million occurrences) Absolute frequency [gos.T.J.R] Percentage [gos.T.J.R] Relative frequency [gos.T.J.R] Absolute frequency [gos.T.N.Z] Percentage [gos.T.N.Z] Relative frequency [gos.T.N.Z] Absolute frequency [gos.T.J.I] Percentage [gos.T.J.I] Relative frequency [gos.T.J.I] Absolute frequency [gos.T.N.N] Percentage [gos.T.N.N] Relative frequency [gos.T.N.N] ampak ampak 3,540 38.23 % 3,419.94 815 33.65 % 3,535.44 821 44.81 % 2,777.95 1,237 33.10 % 3,502.77 667 52.60 % 4,278.58 torej torej 1,223 13.21 % 1,181.52 436 18.00 % 1,891.35 37 2.02 % 125.19 648 17.34 % 1,834.92 102 8.04 % 654.29 oziroma oziro ma 903 9.75 % 872.37 259 10.69 % 1,123.53 105 5.73 % 355.28 417 11.16 % 1,180.80 122 9.62 % 782.59 sicer sicer 595 6.43 % 574.82 195 8.05 % 845.90 74 4.04 % 250.39 243 6.50 % 688.09 83 6.55 % 532.42 zakaj zakaj 527 5.69 % 509.13 87 3.59 % 377.40 147 8.02 % 497.39 237 6.34 % 671.10 56 4.42 % 359.22 kakor kakor 523 5.65 % 505.26 100 4.13 % 433.80 229 12.50 % 774.85 134 3.59 % 379.44 60 4.73 % 384.88 čeprav čepra v 302 3.26 % 291.76 85 3.51 % 368.73 71 3.88 % 240.24 106 2.84 % 300.16 40 3.15 % 256.59 vendar venda r 284 3.07 % 274.37 85 3.51 % 368.73 6 0.33 % 20.30 184 4.92 % 521.03 9 0.71 % 57.73 kadar kadar 229 2.47 % 221.23 47 1.94 % 203.88 98 5.35 % 331.59 60 1.61 % 
169.90 24 1.89 % 153.95 namreč namre č 225 2.43 % 217.37 71 2.93 % 308 14 0.76 % 47.37 126 3.37 % 356.79 14 1.10 % 89.81 drugače druga če 202 2.18 % 195.15 40 1.65 % 173.52 92 5.02 % 311.29 38 1.02 % 107.60 32 2.52 % 205.27 kajti kajti 174 1.88 % 168.10 104 4.29 % 451.15 0 0 % 0 65 1.74 % 184.06 5 0.39 % 32.07 dokler dokle r 143 1.54 % 138.15 21 0.87 % 91.10 57 3.11 % 192.87 44 1.18 % 124.59 21 1.66 % 134.71 vendarle venda rle 108 1.17 % 104.34 25 1.03 % 108.45 0 0 % 0 76 2.03 % 215.21 7 0.55 % 44.90 kamor kamor 53 0.57 % 51.20 10 0.41 % 43.38 23 1.25 % 77.82 14 0.38 % 39.64 6 0.47 % 38.49 preden prede n 45 0.49 % 43.47 11 0.45 % 47.72 11 0.60 % 37.22 17 0.46 % 48.14 6 0.47 % 38.49 odkar odkar 35 0.38 % 33.81 19 0.78 % 82.42 7 0.38 % 23.69 6 0.16 % 16.99 3 0.24 % 19.24 tedaj tedaj 29 0.31 % 28.02 1 0.04 % 4.34 25 1.36 % 84.59 3 0.08 % 8.49 0 0 % 0 bodisi bodis i 28 0.30 % 27.05 0 0 % 0 0 0 % 0 24 0.64 % 67.96 4 0.32 % 25.66 temveč temve č 25 0.27 % 24.15 2 0.08 % 8.68 0 0 % 0 23 0.61 % 65.13 0 0 % 0 kolikor kolik or 17 0.18 % 16.42 3 0.12 % 13.01 3 0.16 % 10.15 7 0.19 % 19.82 4 0.32 % 25.66 četudi četud i 14 0.15 % 13.53 2 0.08 % 8.68 1 0.06 % 3.38 11 0.29 % 31.15 0 0 % 0 magari magar i 11 0.12 % 10.63 0 0 % 0 9 0.49 % 30.45 0 0 % 0 2 0.16 % 12.83 potemtakem potem takem 8 0.09 % 7.73 2 0.08 % 8.68 0 0 % 0 6 0.16 % 16.99 0 0 % 0 dočim dočim 7 0.08 % 6.76 1 0.04 % 4.34 1 0.06 % 3.38 4 0.11 % 11.33 1 0.08 % 6.41 koder koder 5 0.05 % 4.83 0 0 % 0 1 0.06 % 3.38 4 0.11 % 11.33 0 0 % 0 čeravno čerav no 2 0.02 % 1.93 0 0 % 0 0 0 % 0 2 0.05 % 5.66 0 0 % 0 Zakaj Zakaj 1 0.01 % 0.97 0 0 % 0 0 0 % 0 1 0.03 % 2.83 0 0 % 0 zatorej zator ej 1 0.01 % 0.97 1 0.04 % 4.34 0 0 % 0 0 0 % 0 0 0 % 0 CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 599 File at CLARIN.SI2.2.256 List of final character-level 1-grams from conjunction standardized forms in the GOS 1.0 corpus with text- type distributionGOS1.0-word_parts-conjunctions-standardized_ 
forms-final-1grams-taxonomy-entire.tsvStandardized form Rest of the word Final part of the word Total absolute frequency of standardized form Percentage of all found standardized forms Total relative frequency (per million occurrences) Absolute frequency [gos.T.J.R] Percentage [gos.T.J.R] Relative frequency [gos.T.J.R] Absolute frequency [gos.T.N.Z] Percentage [gos.T.N.Z] Relative frequency [gos.T.N.Z] Absolute frequency [gos.T.J.I] Percentage [gos.T.J.I] Relative frequency [gos.T.J.I] Absolute frequency [gos.T.N.N] Percentage [gos.T.N.N] Relative frequency [gos.T.N.N] pa p a 29,360 25.89 % 28,364.22 5,601 23.95 % 24,296.92 12,111 35.20 % 40,978.95 6,825 18.02 % 19,326.12 4,823 27.20 % 30,937.89 da d a 19,008 16.76 % 18,363.32 3,596 15.38 % 15,599.31 5,141 14.94 % 17,395.16 7,139 18.85 % 20,215.26 3,132 17.66 % 20,090.70 in i n 16,235 14.32 % 15,684.37 4,182 17.88 % 18,141.36 3,198 9.30 % 10,820.80 7,068 18.67 % 20,014.21 1,787 10.08 % 11,462.99 a a 6,576 5.80 % 6,352.97 1,222 5.22 % 5,300.99 2,499 7.26 % 8,455.65 1,665 4.40 % 4,714.72 1,190 6.71 % 7,633.44 če č e 6,137 5.41 % 5,928.86 1,120 4.79 % 4,858.52 1,882 5.47 % 6,367.96 1,902 5.02 % 5,385.83 1,233 6.95 % 7,909.27 ki k i 5,537 4.88 % 5,349.21 1,257 5.38 % 5,452.82 882 2.56 % 2,984.35 2,839 7.50 % 8,039.10 559 3.15 % 3,585.79 ker ke r 4,783 4.22 % 4,620.78 824 3.52 % 3,574.48 1,628 4.73 % 5,508.52 1,294 3.42 % 3,664.18 1,037 5.85 % 6,652 ali al i 4,757 4.20 % 4,595.66 818 3.50 % 3,548.45 1,536 4.46 % 5,197.23 1,501 3.96 % 4,250.33 902 5.09 % 5,786.02 ko k o 4,247 3.75 % 4,102.96 734 3.14 % 3,184.06 1,691 4.92 % 5,721.69 1,215 3.21 % 3,440.47 607 3.42 % 3,893.70 ampak ampa k 3,540 3.12 % 3,419.94 815 3.48 % 3,535.44 821 2.39 % 2,777.95 1,237 3.27 % 3,502.77 667 3.76 % 4,278.58 kot ko t 3,075 2.71 % 2,970.71 737 3.15 % 3,197.08 550 1.60 % 1,860.99 1,363 3.60 % 3,859.56 425 2.40 % 2,726.23 saj sa j 1,621 1.43 % 1,566.02 231 0.99 % 1,002.07 824 2.40 % 2,788.10 209 0.55 % 591.82 357 2.01 % 2,290.03 torej tore j 
1,223 1.08 % 1,181.52 436 1.86 % 1,891.35 37 0.11 % 125.19 648 1.71 % 1,834.92 102 0.57 % 654.29 oziroma ozirom a 903 0.80 % 872.37 259 1.11 % 1,123.53 105 0.30 % 355.28 417 1.10 % 1,180.80 122 0.69 % 782.59 kjer kje r 652 0.57 % 629.89 122 0.52 % 529.23 121 0.35 % 409.42 306 0.81 % 866.49 103 0.58 % 660.71 sicer sice r 595 0.53 % 574.82 195 0.83 % 845.90 74 0.21 % 250.39 243 0.64 % 688.09 83 0.47 % 532.42 kako kak o 533 0.47 % 514.92 126 0.54 % 546.58 135 0.39 % 456.79 223 0.59 % 631.46 49 0.28 % 314.32 zakaj zaka j 527 0.47 % 509.13 87 0.37 % 377.40 147 0.43 % 497.39 237 0.63 % 671.10 56 0.32 % 359.22 kakor kako r 523 0.46 % 505.26 100 0.43 % 433.80 229 0.67 % 774.85 134 0.35 % 379.44 60 0.34 % 384.88 zato zat o 363 0.32 % 350.69 59 0.25 % 255.94 75 0.22 % 253.77 167 0.44 % 472.89 62 0.35 % 397.71 tako tak o 352 0.31 % 340.06 56 0.24 % 242.93 114 0.33 % 385.73 94 0.25 % 266.18 88 0.50 % 564.49 niti nit i 314 0.28 % 303.35 60 0.26 % 260.28 95 0.28 % 321.44 105 0.28 % 297.32 54 0.30 % 346.39 čeprav čepra v 302 0.27 % 291.76 85 0.36 % 368.73 71 0.21 % 240.24 106 0.28 % 300.16 40 0.23 % 256.59 vendar venda r 284 0.25 % 274.37 85 0.36 % 368.73 6 0.02 % 20.30 184 0.49 % 521.03 9 0.05 % 57.73 kadar kada r 229 0.20 % 221.23 47 0.20 % 203.88 98 0.28 % 331.59 60 0.16 % 169.90 24 0.14 % 153.95 namreč namre č 225 0.20 % 217.37 71 0.30 % 308 14 0.04 % 47.37 126 0.33 % 356.79 14 0.08 % 89.81 drugače drugač e 202 0.18 % 195.15 40 0.17 % 173.52 92 0.27 % 311.29 38 0.10 % 107.60 32 0.18 % 205.27 A A 177 0.16 % 171 25 0.11 % 108.45 16 0.05 % 54.14 106 0.28 % 300.16 30 0.17 % 192.44 kajti kajt i 174 0.15 % 168.10 104 0.45 % 451.15 0 0 % 0 65 0.17 % 184.06 5 0.03 % 32.07 dokler dokle r 143 0.13 % 138.15 21 0.09 % 91.10 57 0.17 % 192.87 44 0.12 % 124.59 21 0.12 % 134.71 vendarle vendarl e 108 0.10 % 104.34 25 0.11 % 108.45 0 0 % 0 76 0.20 % 215.21 7 0.04 % 44.90 naj na j 105 0.09 % 101.44 25 0.11 % 108.45 31 0.09 % 104.89 30 0.08 % 84.95 19 0.11 % 121.88 CJVT // A Guide to Frequency 
Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 600 File at CLARIN.SI2.2.257 List of final character-level 2-grams from conjunction standardized forms in the GOS 1.0 corpus with text- type distributionGOS1.0-word_parts-conjunctions-standardized_ forms-final-2grams-taxonomy-entire.tsvStandardized form Rest of the word Final part of the word Total absolute frequency of standardized form Percentage of all found standardized forms Total relative frequency (per million occurrences) Absolute frequency [gos.T.J.R] Percentage [gos.T.J.R] Relative frequency [gos.T.J.R] Absolute frequency [gos.T.N.Z] Percentage [gos.T.N.Z] Relative frequency [gos.T.N.Z] Absolute frequency [gos.T.J.I] Percentage [gos.T.J.I] Relative frequency [gos.T.J.I] Absolute frequency [gos.T.N.N] Percentage [gos.T.N.N] Relative frequency [gos.T.N.N] pa pa 29,360 27.53 % 28,364.22 5,601 25.30 % 24,296.92 12,111 37.98 % 40,978.95 6,825 18.91 % 19,326.12 4,823 29.21 % 30,937.89 da da 19,008 17.83 % 18,363.32 3,596 16.24 % 15,599.31 5,141 16.12 % 17,395.16 7,139 19.78 % 20,215.26 3,132 18.97 % 20,090.70 in in 16,235 15.22 % 15,684.37 4,182 18.89 % 18,141.36 3,198 10.03 % 10,820.80 7,068 19.58 % 20,014.21 1,787 10.82 % 11,462.99 če če 6,137 5.75 % 5,928.86 1,120 5.06 % 4,858.52 1,882 5.90 % 6,367.96 1,902 5.27 % 5,385.83 1,233 7.47 % 7,909.27 ki ki 5,537 5.19 % 5,349.21 1,257 5.68 % 5,452.82 882 2.77 % 2,984.35 2,839 7.87 % 8,039.10 559 3.39 % 3,585.79 ker k er 4,783 4.49 % 4,620.78 824 3.72 % 3,574.48 1,628 5.11 % 5,508.52 1,294 3.58 % 3,664.18 1,037 6.28 % 6,652 ali a li 4,757 4.46 % 4,595.66 818 3.69 % 3,548.45 1,536 4.82 % 5,197.23 1,501 4.16 % 4,250.33 902 5.46 % 5,786.02 ko ko 4,247 3.98 % 4,102.96 734 3.31 % 3,184.06 1,691 5.30 % 5,721.69 1,215 3.37 % 3,440.47 607 3.68 % 3,893.70 ampak amp ak 3,540 3.32 % 3,419.94 815 3.68 % 3,535.44 821 2.58 % 2,777.95 1,237 3.43 % 3,502.77 667 4.04 % 4,278.58 kot k ot 3,075 2.88 % 2,970.71 737 3.33 % 3,197.08 550 1.73 % 1,860.99 1,363 3.78 % 3,859.56 425 2.57 % 
2,726.23 saj s aj 1,621 1.52 % 1,566.02 231 1.04 % 1,002.07 824 2.58 % 2,788.10 209 0.58 % 591.82 357 2.16 % 2,290.03 torej tor ej 1,223 1.15 % 1,181.52 436 1.97 % 1,891.35 37 0.12 % 125.19 648 1.79 % 1,834.92 102 0.62 % 654.29 oziroma oziro ma 903 0.85 % 872.37 259 1.17 % 1,123.53 105 0.33 % 355.28 417 1.16 % 1,180.80 122 0.74 % 782.59 kjer kj er 652 0.61 % 629.89 122 0.55 % 529.23 121 0.38 % 409.42 306 0.85 % 866.49 103 0.62 % 660.71 sicer sic er 595 0.56 % 574.82 195 0.88 % 845.90 74 0.23 % 250.39 243 0.67 % 688.09 83 0.50 % 532.42 kako ka ko 533 0.50 % 514.92 126 0.57 % 546.58 135 0.42 % 456.79 223 0.62 % 631.46 49 0.30 % 314.32 zakaj zak aj 527 0.49 % 509.13 87 0.39 % 377.40 147 0.46 % 497.39 237 0.66 % 671.10 56 0.34 % 359.22 kakor kak or 523 0.49 % 505.26 100 0.45 % 433.80 229 0.72 % 774.85 134 0.37 % 379.44 60 0.36 % 384.88 zato za to 363 0.34 % 350.69 59 0.27 % 255.94 75 0.23 % 253.77 167 0.46 % 472.89 62 0.38 % 397.71 tako ta ko 352 0.33 % 340.06 56 0.25 % 242.93 114 0.36 % 385.73 94 0.26 % 266.18 88 0.53 % 564.49 niti ni ti 314 0.29 % 303.35 60 0.27 % 260.28 95 0.30 % 321.44 105 0.29 % 297.32 54 0.33 % 346.39 čeprav čepr av 302 0.28 % 291.76 85 0.38 % 368.73 71 0.22 % 240.24 106 0.29 % 300.16 40 0.24 % 256.59 vendar vend ar 284 0.27 % 274.37 85 0.38 % 368.73 6 0.02 % 20.30 184 0.51 % 521.03 9 0.06 % 57.73 kadar kad ar 229 0.21 % 221.23 47 0.21 % 203.88 98 0.31 % 331.59 60 0.17 % 169.90 24 0.14 % 153.95 namreč namr eč 225 0.21 % 217.37 71 0.32 % 308 14 0.04 % 47.37 126 0.35 % 356.79 14 0.09 % 89.81 drugače druga če 202 0.19 % 195.15 40 0.18 % 173.52 92 0.29 % 311.29 38 0.10 % 107.60 32 0.19 % 205.27 kajti kaj ti 174 0.16 % 168.10 104 0.47 % 451.15 0 0 % 0 65 0.18 % 184.06 5 0.03 % 32.07 dokler dokl er 143 0.13 % 138.15 21 0.10 % 91.10 57 0.18 % 192.87 44 0.12 % 124.59 21 0.13 % 134.71 vendarle vendar le 108 0.10 % 104.34 25 0.11 % 108.45 0 0 % 0 76 0.21 % 215.21 7 0.04 % 44.90 naj n aj 105 0.10 % 101.44 25 0.11 % 108.45 31 0.10 % 104.89 30 0.08 % 84.95 19 
0.12 % 121.88 toda to da 87 0.08 % 84.05 71 0.32 % 308 0 0 % 0 15 0.04 % 42.47 1 0.01 % 6.41 ter t er 73 0.07 % 70.52 45 0.20 % 195.21 1 0 % 3.38 27 0.07 % 76.45 0 0 % 0 CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 601 File at CLARIN.SI2.2.258 List of final character-level 3-grams from conjunction standardized forms in the GOS 1.0 corpus with text- type distributionGOS1.0-word_parts-conjunctions-standardized_ forms-final-3grams-taxonomy-entire.tsvStandardized form Rest of the word Final part of the word Total absolute frequency of standardized form Percentage of all found standardized forms Total relative frequency (per million occurrences) Absolute frequency [gos.T.J.R] Percentage [gos.T.J.R] Relative frequency [gos.T.J.R] Absolute frequency [gos.T.N.Z] Percentage [gos.T.N.Z] Relative frequency [gos.T.N.Z] Absolute frequency [gos.T.J.I] Percentage [gos.T.J.I] Relative frequency [gos.T.J.I] Absolute frequency [gos.T.N.N] Percentage [gos.T.N.N] Relative frequency [gos.T.N.N] ker ker 4,783 18.36 % 4,620.78 824 14.62 % 3,574.48 1,628 23.36 % 5,508.52 1,294 14.24 % 3,664.18 1,037 23.74 % 6,652 ali ali 4,757 18.26 % 4,595.66 818 14.52 % 3,548.45 1,536 22.04 % 5,197.23 1,501 16.52 % 4,250.33 902 20.65 % 5,786.02 ampak am pak 3,540 13.59 % 3,419.94 815 14.47 % 3,535.44 821 11.78 % 2,777.95 1,237 13.61 % 3,502.77 667 15.27 % 4,278.58 kot kot 3,075 11.80 % 2,970.71 737 13.08 % 3,197.08 550 7.89 % 1,860.99 1,363 15.00 % 3,859.56 425 9.73 % 2,726.23 saj saj 1,621 6.22 % 1,566.02 231 4.10 % 1,002.07 824 11.82 % 2,788.10 209 2.30 % 591.82 357 8.17 % 2,290.03 torej to rej 1,223 4.69 % 1,181.52 436 7.74 % 1,891.35 37 0.53 % 125.19 648 7.13 % 1,834.92 102 2.33 % 654.29 oziroma ozir oma 903 3.47 % 872.37 259 4.60 % 1,123.53 105 1.51 % 355.28 417 4.59 % 1,180.80 122 2.79 % 782.59 kjer k jer 652 2.50 % 629.89 122 2.17 % 529.23 121 1.74 % 409.42 306 3.37 % 866.49 103 2.36 % 660.71 sicer si cer 595 2.28 % 574.82 195 3.46 % 845.90 74 1.06 % 250.39 243 
2.67 % 688.09 83 1.90 % 532.42 kako k ako 533 2.05 % 514.92 126 2.24 % 546.58 135 1.94 % 456.79 223 2.45 % 631.46 49 1.12 % 314.32 zakaj za kaj 527 2.02 % 509.13 87 1.54 % 377.40 147 2.11 % 497.39 237 2.61 % 671.10 56 1.28 % 359.22 kakor ka kor 523 2.01 % 505.26 100 1.77 % 433.80 229 3.29 % 774.85 134 1.48 % 379.44 60 1.37 % 384.88 zato z ato 363 1.39 % 350.69 59 1.05 % 255.94 75 1.08 % 253.77 167 1.84 % 472.89 62 1.42 % 397.71 tako t ako 352 1.35 % 340.06 56 0.99 % 242.93 114 1.64 % 385.73 94 1.03 % 266.18 88 2.02 % 564.49 niti n iti 314 1.21 % 303.35 60 1.06 % 260.28 95 1.36 % 321.44 105 1.16 % 297.32 54 1.24 % 346.39 čeprav čep rav 302 1.16 % 291.76 85 1.51 % 368.73 71 1.02 % 240.24 106 1.17 % 300.16 40 0.92 % 256.59 vendar ven dar 284 1.09 % 274.37 85 1.51 % 368.73 6 0.09 % 20.30 184 2.02 % 521.03 9 0.21 % 57.73 kadar ka dar 229 0.88 % 221.23 47 0.83 % 203.88 98 1.41 % 331.59 60 0.66 % 169.90 24 0.55 % 153.95 namreč nam reč 225 0.86 % 217.37 71 1.26 % 308 14 0.20 % 47.37 126 1.39 % 356.79 14 0.32 % 89.81 drugače drug ače 202 0.78 % 195.15 40 0.71 % 173.52 92 1.32 % 311.29 38 0.42 % 107.60 32 0.73 % 205.27 kajti ka jti 174 0.67 % 168.10 104 1.85 % 451.15 0 0 % 0 65 0.71 % 184.06 5 0.11 % 32.07 dokler dok ler 143 0.55 % 138.15 21 0.37 % 91.10 57 0.82 % 192.87 44 0.48 % 124.59 21 0.48 % 134.71 vendarle venda rle 108 0.41 % 104.34 25 0.44 % 108.45 0 0 % 0 76 0.84 % 215.21 7 0.16 % 44.90 naj naj 105 0.40 % 101.44 25 0.44 % 108.45 31 0.45 % 104.89 30 0.33 % 84.95 19 0.43 % 121.88 toda t oda 87 0.33 % 84.05 71 1.26 % 308 0 0 % 0 15 0.17 % 42.47 1 0.02 % 6.41 ter ter 73 0.28 % 70.52 45 0.80 % 195.21 1 0.01 % 3.38 27 0.30 % 76.45 0 0 % 0 kamor ka mor 53 0.20 % 51.20 10 0.18 % 43.38 23 0.33 % 77.82 14 0.15 % 39.64 6 0.14 % 38.49 preden pre den 45 0.17 % 43.47 11 0.20 % 47.72 11 0.16 % 37.22 17 0.19 % 48.14 6 0.14 % 38.49 and and 42 0.16 % 40.58 25 0.44 % 108.45 10 0.14 % 33.84 5 0.06 % 14.16 2 0.05 % 12.83 odkar od kar 35 0.13 % 33.81 19 0.34 % 82.42 7 0.10 % 23.69 6 
0.07 % 16.99 3 0.07 % 19.24 tedaj te daj 29 0.11 % 28.02 1 0.02 % 4.34 25 0.36 % 84.59 3 0.03 % 8.49 0 0 % 0 bodisi bod isi 28 0.11 % 27.05 0 0 % 0 0 0 % 0 24 0.26 % 67.96 4 0.09 % 25.66 CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 602 File at CLARIN.SI2.2.259 List of final character-level 4-grams from conjunction standardized forms in the GOS 1.0 corpus with text- type distributionGOS1.0-word_parts-conjunctions-standardized_ forms-final-4grams-taxonomy-entire.tsvStandardized form Rest of the word Final part of the word Total absolute frequency of standardized form Percentage of all found standardized forms Total relative frequency (per million occurrences) Absolute frequency [gos.T.J.R] Percentage [gos.T.J.R] Relative frequency [gos.T.J.R] Absolute frequency [gos.T.N.Z] Percentage [gos.T.N.Z] Relative frequency [gos.T.N.Z] Absolute frequency [gos.T.J.I] Percentage [gos.T.J.I] Relative frequency [gos.T.J.I] Absolute frequency [gos.T.N.N] Percentage [gos.T.N.N] Relative frequency [gos.T.N.N] ampak a mpak 3,540 30.55 % 3,419.94 815 27.85 % 3,535.44 821 34.54 % 2,777.95 1,237 26.56 % 3,502.77 667 41.02 % 4,278.58 torej t orej 1,223 10.56 % 1,181.52 436 14.90 % 1,891.35 37 1.56 % 125.19 648 13.91 % 1,834.92 102 6.27 % 654.29 oziroma ozi roma 903 7.79 % 872.37 259 8.85 % 1,123.53 105 4.42 % 355.28 417 8.95 % 1,180.80 122 7.50 % 782.59 kjer kjer 652 5.63 % 629.89 122 4.17 % 529.23 121 5.09 % 409.42 306 6.57 % 866.49 103 6.33 % 660.71 sicer s icer 595 5.14 % 574.82 195 6.66 % 845.90 74 3.11 % 250.39 243 5.22 % 688.09 83 5.11 % 532.42 kako kako 533 4.60 % 514.92 126 4.31 % 546.58 135 5.68 % 456.79 223 4.79 % 631.46 49 3.01 % 314.32 zakaj z akaj 527 4.55 % 509.13 87 2.97 % 377.40 147 6.18 % 497.39 237 5.09 % 671.10 56 3.44 % 359.22 kakor k akor 523 4.51 % 505.26 100 3.42 % 433.80 229 9.63 % 774.85 134 2.88 % 379.44 60 3.69 % 384.88 zato zato 363 3.13 % 350.69 59 2.02 % 255.94 75 3.15 % 253.77 167 3.59 % 472.89 62 3.81 % 397.71 tako tako 
352 3.04 % 340.06 56 1.91 % 242.93 114 4.80 % 385.73 94 2.02 % 266.18 88 5.41 % 564.49 niti niti 314 2.71 % 303.35 60 2.05 % 260.28 95 4.00 % 321.44 105 2.25 % 297.32 54 3.32 % 346.39 čeprav če prav 302 2.61 % 291.76 85 2.90 % 368.73 71 2.99 % 240.24 106 2.28 % 300.16 40 2.46 % 256.59 vendar ve ndar 284 2.45 % 274.37 85 2.90 % 368.73 6 0.25 % 20.30 184 3.95 % 521.03 9 0.55 % 57.73 kadar k adar 229 1.98 % 221.23 47 1.61 % 203.88 98 4.12 % 331.59 60 1.29 % 169.90 24 1.48 % 153.95 namreč na mreč 225 1.94 % 217.37 71 2.43 % 308 14 0.59 % 47.37 126 2.71 % 356.79 14 0.86 % 89.81 drugače dru gače 202 1.74 % 195.15 40 1.37 % 173.52 92 3.87 % 311.29 38 0.82 % 107.60 32 1.97 % 205.27 kajti k ajti 174 1.50 % 168.10 104 3.55 % 451.15 0 0 % 0 65 1.40 % 184.06 5 0.31 % 32.07 dokler do kler 143 1.23 % 138.15 21 0.72 % 91.10 57 2.40 % 192.87 44 0.94 % 124.59 21 1.29 % 134.71 vendarle vend arle 108 0.93 % 104.34 25 0.85 % 108.45 0 0 % 0 76 1.63 % 215.21 7 0.43 % 44.90 toda toda 87 0.75 % 84.05 71 2.43 % 308 0 0 % 0 15 0.32 % 42.47 1 0.06 % 6.41 kamor k amor 53 0.46 % 51.20 10 0.34 % 43.38 23 0.97 % 77.82 14 0.30 % 39.64 6 0.37 % 38.49 preden pr eden 45 0.39 % 43.47 11 0.38 % 47.72 11 0.46 % 37.22 17 0.36 % 48.14 6 0.37 % 38.49 odkar o dkar 35 0.30 % 33.81 19 0.65 % 82.42 7 0.29 % 23.69 6 0.13 % 16.99 3 0.18 % 19.24 tedaj t edaj 29 0.25 % 28.02 1 0.03 % 4.34 25 1.05 % 84.59 3 0.06 % 8.49 0 0 % 0 bodisi bo disi 28 0.24 % 27.05 0 0 % 0 0 0 % 0 24 0.52 % 67.96 4 0.25 % 25.66 temveč te mveč 25 0.22 % 24.15 2 0.07 % 8.68 0 0 % 0 23 0.49 % 65.13 0 0 % 0 kolikor kol ikor 17 0.15 % 16.42 3 0.10 % 13.01 3 0.13 % 10.15 7 0.15 % 19.82 4 0.25 % 25.66 četudi če tudi 14 0.12 % 13.53 2 0.07 % 8.68 1 0.04 % 3.38 11 0.24 % 31.15 0 0 % 0 nato nato 13 0.11 % 12.56 6 0.20 % 26.03 0 0 % 0 7 0.15 % 19.82 0 0 % 0 magari ma gari 11 0.10 % 10.63 0 0 % 0 9 0.38 % 30.45 0 0 % 0 2 0.12 % 12.83 bodi bodi 10 0.09 % 9.66 3 0.10 % 13.01 3 0.13 % 10.15 3 0.06 % 8.49 1 0.06 % 6.41 potemtakem potemt akem 8 0.07 % 
7.73 2 0.07 % 8.68 0 0 % 0 6 0.13 % 16.99 0 0 % 0 CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 603 File at CLARIN.SI2.2.260 List of final character-level 5-grams from conjunction standardized forms in the GOS 1.0 corpus with text- type distributionGOS1.0-word_parts-conjunctions-standardized_ forms-final-5grams-taxonomy-entire.tsvStandardized form Rest of the word Final part of the word Total absolute frequency of standardized form Percentage of all found standardized forms Total relative frequency (per million occurrences) Absolute frequency [gos.T.J.R] Percentage [gos.T.J.R] Relative frequency [gos.T.J.R] Absolute frequency [gos.T.N.Z] Percentage [gos.T.N.Z] Relative frequency [gos.T.N.Z] Absolute frequency [gos.T.J.I] Percentage [gos.T.J.I] Relative frequency [gos.T.J.I] Absolute frequency [gos.T.N.N] Percentage [gos.T.N.N] Relative frequency [gos.T.N.N] ampak ampak 3,540 38.23 % 3,419.94 815 33.65 % 3,535.44 821 44.81 % 2,777.95 1,237 33.10 % 3,502.77 667 52.60 % 4,278.58 torej torej 1,223 13.21 % 1,181.52 436 18.00 % 1,891.35 37 2.02 % 125.19 648 17.34 % 1,834.92 102 8.04 % 654.29 oziroma oz iroma 903 9.75 % 872.37 259 10.69 % 1,123.53 105 5.73 % 355.28 417 11.16 % 1,180.80 122 9.62 % 782.59 sicer sicer 595 6.43 % 574.82 195 8.05 % 845.90 74 4.04 % 250.39 243 6.50 % 688.09 83 6.55 % 532.42 zakaj zakaj 527 5.69 % 509.13 87 3.59 % 377.40 147 8.02 % 497.39 237 6.34 % 671.10 56 4.42 % 359.22 kakor kakor 523 5.65 % 505.26 100 4.13 % 433.80 229 12.50 % 774.85 134 3.59 % 379.44 60 4.73 % 384.88 čeprav č eprav 302 3.26 % 291.76 85 3.51 % 368.73 71 3.88 % 240.24 106 2.84 % 300.16 40 3.15 % 256.59 vendar v endar 284 3.07 % 274.37 85 3.51 % 368.73 6 0.33 % 20.30 184 4.92 % 521.03 9 0.71 % 57.73 kadar kadar 229 2.47 % 221.23 47 1.94 % 203.88 98 5.35 % 331.59 60 1.61 % 169.90 24 1.89 % 153.95 namreč n amreč 225 2.43 % 217.37 71 2.93 % 308 14 0.76 % 47.37 126 3.37 % 356.79 14 1.10 % 89.81 drugače dr ugače 202 2.18 % 195.15 40 1.65 % 173.52 
92 5.02 % 311.29 38 1.02 % 107.60 32 2.52 % 205.27 kajti kajti 174 1.88 % 168.10 104 4.29 % 451.15 0 0 % 0 65 1.74 % 184.06 5 0.39 % 32.07 dokler d okler 143 1.54 % 138.15 21 0.87 % 91.10 57 3.11 % 192.87 44 1.18 % 124.59 21 1.66 % 134.71 vendarle ven darle 108 1.17 % 104.34 25 1.03 % 108.45 0 0 % 0 76 2.03 % 215.21 7 0.55 % 44.90 kamor kamor 53 0.57 % 51.20 10 0.41 % 43.38 23 1.25 % 77.82 14 0.38 % 39.64 6 0.47 % 38.49 preden p reden 45 0.49 % 43.47 11 0.45 % 47.72 11 0.60 % 37.22 17 0.46 % 48.14 6 0.47 % 38.49 odkar odkar 35 0.38 % 33.81 19 0.78 % 82.42 7 0.38 % 23.69 6 0.16 % 16.99 3 0.24 % 19.24 tedaj tedaj 29 0.31 % 28.02 1 0.04 % 4.34 25 1.36 % 84.59 3 0.08 % 8.49 0 0 % 0 bodisi b odisi 28 0.30 % 27.05 0 0 % 0 0 0 % 0 24 0.64 % 67.96 4 0.32 % 25.66 temveč t emveč 25 0.27 % 24.15 2 0.08 % 8.68 0 0 % 0 23 0.61 % 65.13 0 0 % 0 kolikor ko likor 17 0.18 % 16.42 3 0.12 % 13.01 3 0.16 % 10.15 7 0.19 % 19.82 4 0.32 % 25.66 četudi č etudi 14 0.15 % 13.53 2 0.08 % 8.68 1 0.06 % 3.38 11 0.29 % 31.15 0 0 % 0 magari m agari 11 0.12 % 10.63 0 0 % 0 9 0.49 % 30.45 0 0 % 0 2 0.16 % 12.83 potemtakem potem takem 8 0.09 % 7.73 2 0.08 % 8.68 0 0 % 0 6 0.16 % 16.99 0 0 % 0 dočim dočim 7 0.08 % 6.76 1 0.04 % 4.34 1 0.06 % 3.38 4 0.11 % 11.33 1 0.08 % 6.41 koder koder 5 0.05 % 4.83 0 0 % 0 1 0.06 % 3.38 4 0.11 % 11.33 0 0 % 0 čeravno če ravno 2 0.02 % 1.93 0 0 % 0 0 0 % 0 2 0.05 % 5.66 0 0 % 0 Zakaj Zakaj 1 0.01 % 0.97 0 0 % 0 0 0 % 0 1 0.03 % 2.83 0 0 % 0 zatorej za torej 1 0.01 % 0.97 1 0.04 % 4.34 0 0 % 0 0 0 % 0 0 0 % 0 CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 604 File at CLARIN.SI2.2.261 List of initial character-level 1-grams from conjunction lower-case word forms in the GOS 1.0 corpus with text-type distributionGOS1.0-word_parts-conjunctions-lowercase_ forms-initial-1grams-taxonomy-entire.tsvLower-case word form Initial part of the word Rest of the word Total absolute frequency of lower-case word form Percentage of all found lower-case 
word forms Total relative frequency (per million occurrences) Absolute frequency [gos.T.J.R] Percentage [gos.T.J.R] Relative frequency [gos.T.J.R] Absolute frequency [gos.T.N.Z] Percentage [gos.T.N.Z] Relative frequency [gos.T.N.Z] Absolute frequency [gos.T.J.I] Percentage [gos.T.J.I] Relative frequency [gos.T.J.I] Absolute frequency [gos.T.N.N] Percentage [gos.T.N.N] Relative frequency [gos.T.N.N] pa p a 29,155 25.71 % 28,166.17 5,585 23.88 % 24,227.52 11,945 34.72 % 40,417.27 6,807 17.98 % 19,275.15 4,818 27.17 % 30,905.81 da d a 17,252 15.21 % 16,666.88 3,484 14.90 % 15,113.46 3,889 11.30 % 13,158.87 6,960 18.38 % 19,708.40 2,919 16.46 % 18,724.38 in i n 15,945 14.06 % 15,404.20 4,161 17.79 % 18,050.26 2,958 8.60 % 10,008.73 7,054 18.63 % 19,974.57 1,772 9.99 % 11,366.77 a a 6,728 5.93 % 6,499.81 1,233 5.27 % 5,348.71 2,505 7.28 % 8,475.95 1,772 4.68 % 5,017.71 1,218 6.87 % 7,813.05 če č e 5,945 5.24 % 5,743.37 1,107 4.73 % 4,802.12 1,744 5.07 % 5,901.02 1,891 4.99 % 5,354.68 1,203 6.79 % 7,716.83 ki k i 4,526 3.99 % 4,372.49 1,128 4.82 % 4,893.22 367 1.07 % 1,241.79 2,663 7.03 % 7,540.73 368 2.08 % 2,360.59 ko k o 3,703 3.27 % 3,577.41 705 3.01 % 3,058.26 1,288 3.74 % 4,358.09 1,157 3.06 % 3,276.24 553 3.12 % 3,547.30 ampak a mpak 3,444 3.04 % 3,327.19 796 3.40 % 3,453.02 778 2.26 % 2,632.45 1,208 3.19 % 3,420.65 662 3.73 % 4,246.50 al a l 3,231 2.85 % 3,121.42 444 1.90 % 1,926.06 1,388 4.04 % 4,696.46 614 1.62 % 1,738.64 785 4.43 % 5,035.51 ker k er 3,191 2.81 % 3,082.77 567 2.42 % 2,459.62 703 2.04 % 2,378.68 1,159 3.06 % 3,281.90 762 4.30 % 4,887.97 k k 2,979 2.63 % 2,877.96 345 1.48 % 1,496.60 1,729 5.03 % 5,850.27 371 0.98 % 1,050.55 534 3.01 % 3,425.43 kot k ot 2,279 2.01 % 2,201.70 586 2.51 % 2,542.05 125 0.36 % 422.95 1,274 3.37 % 3,607.54 294 1.66 % 1,885.91 ali a li 1,482 1.31 % 1,431.74 352 1.50 % 1,526.96 129 0.38 % 436.49 885 2.34 % 2,506.02 116 0.65 % 744.10 sej s ej 1,341 1.18 % 1,295.52 184 0.79 % 798.18 678 1.97 % 2,294.09 177 0.47 % 501.20 302 
1.70 % 1,937.23 torej t orej 1,221 1.08 % 1,179.59 436 1.86 % 1,891.35 36 0.10 % 121.81 648 1.71 % 1,834.92 101 0.57 % 647.88 de d e 1,169 1.03 % 1,129.35 87 0.37 % 377.40 797 2.32 % 2,696.74 106 0.28 % 300.16 179 1.01 % 1,148.22 oziroma o ziroma 881 0.78 % 851.12 258 1.10 % 1,119.19 104 0.30 % 351.90 400 1.06 % 1,132.67 119 0.67 % 763.34 sicer s icer 593 0.52 % 572.89 195 0.83 % 845.90 74 0.21 % 250.39 242 0.64 % 685.26 82 0.46 % 526 kjer k jer 484 0.43 % 467.58 111 0.47 % 481.51 23 0.07 % 77.82 283 0.75 % 801.36 67 0.38 % 429.78 zakaj z akaj 422 0.37 % 407.69 70 0.30 % 303.66 82 0.24 % 277.46 224 0.59 % 634.29 46 0.26 % 295.07 d d 358 0.32 % 345.86 24 0.10 % 104.11 249 0.72 % 842.52 68 0.18 % 192.55 17 0.10 % 109.05 ka k a 353 0.31 % 341.03 123 0.53 % 533.57 211 0.61 % 713.94 4 0.01 % 11.33 15 0.09 % 96.22 zato z ato 322 0.28 % 311.08 58 0.25 % 251.60 54 0.16 % 182.72 153 0.40 % 433.24 57 0.32 % 365.64 kr k r 318 0.28 % 307.21 33 0.14 % 143.15 151 0.44 % 510.93 48 0.13 % 135.92 86 0.48 % 551.66 niti n iti 306 0.27 % 295.62 59 0.25 % 255.94 89 0.26 % 301.14 105 0.28 % 297.32 53 0.30 % 339.98 kako k ako 296 0.26 % 285.96 71 0.30 % 308 21 0.06 % 71.06 189 0.50 % 535.18 15 0.09 % 96.22 vendar v endar 276 0.24 % 266.64 84 0.36 % 364.39 1 0 % 3.38 182 0.48 % 515.36 9 0.05 % 57.73 ku k u 256 0.23 % 247.32 14 0.06 % 60.73 220 0.64 % 744.40 14 0.04 % 39.64 8 0.04 % 51.32 i i 253 0.22 % 244.42 10 0.04 % 43.38 224 0.65 % 757.93 9 0.02 % 25.48 10 0.06 % 64.15 kak k ak 245 0.22 % 236.69 80 0.34 % 347.04 112 0.33 % 378.96 13 0.03 % 36.81 40 0.23 % 256.59 namreč n amreč 223 0.20 % 215.44 71 0.30 % 308 14 0.04 % 47.37 124 0.33 % 351.13 14 0.08 % 89.81 se s e 185 0.16 % 178.73 21 0.09 % 91.10 112 0.33 % 378.96 13 0.03 % 36.81 39 0.22 % 250.17 CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 605 File at CLARIN.SI2.2.262 List of initial character-level 2-grams from conjunction lower-case word forms in the GOS 1.0 corpus with text-type 
distribution

GOS1.0-word_parts-conjunctions-lowercase_forms-initial-2grams-taxonomy-entire.tsv

| Lower-case word form | Initial part of the word | Rest of the word | Total absolute frequency of lower-case word form | Percentage of all found lower-case word forms | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N] |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| pa | pa | | 29,155 | 28.31 % | 28,166.17 | 5,585 | 25.67 % | 24,227.52 | 11,945 | 40.28 % | 40,417.27 | 6,807 | 19.10 % | 19,275.15 | 4,818 | 30.22 % | 30,905.81 |
| da | da | | 17,252 | 16.75 % | 16,666.88 | 3,484 | 16.01 % | 15,113.46 | 3,889 | 13.11 % | 13,158.87 | 6,960 | 19.53 % | 19,708.40 | 2,919 | 18.31 % | 18,724.38 |
| in | in | | 15,945 | 15.48 % | 15,404.20 | 4,161 | 19.12 % | 18,050.26 | 2,958 | 9.97 % | 10,008.73 | 7,054 | 19.80 % | 19,974.57 | 1,772 | 11.11 % | 11,366.77 |
| če | če | | 5,945 | 5.77 % | 5,743.37 | 1,107 | 5.09 % | 4,802.12 | 1,744 | 5.88 % | 5,901.02 | 1,891 | 5.31 % | 5,354.68 | 1,203 | 7.54 % | 7,716.83 |
| ki | ki | | 4,526 | 4.39 % | 4,372.49 | 1,128 | 5.18 % | 4,893.22 | 367 | 1.24 % | 1,241.79 | 2,663 | 7.47 % | 7,540.73 | 368 | 2.31 % | 2,360.59 |
| ko | ko | | 3,703 | 3.60 % | 3,577.41 | 705 | 3.24 % | 3,058.26 | 1,288 | 4.34 % | 4,358.09 | 1,157 | 3.25 % | 3,276.24 | 553 | 3.47 % | 3,547.30 |
| ampak | am | pak | 3,444 | 3.34 % | 3,327.19 | 796 | 3.66 % | 3,453.02 | 778 | 2.62 % | 2,632.45 | 1,208 | 3.39 % | 3,420.65 | 662 | 4.15 % | 4,246.50 |
| al | al | | 3,231 | 3.14 % | 3,121.42 | 444 | 2.04 % | 1,926.06 | 1,388 | 4.68 % | 4,696.46 | 614 | 1.72 % | 1,738.64 | 785 | 4.92 % | 5,035.51 |
| ker | ke | r | 3,191 | 3.10 % | 3,082.77 | 567 | 2.61 % | 2,459.62 | 703 | 2.37 % | 2,378.68 | 1,159 | 3.25 % | 3,281.90 | 762 | 4.78 % | 4,887.97 |
| kot | ko | t | 2,279 | 2.21 % | 2,201.70 | 586 | 2.69 % | 2,542.05 | 125 | 0.42 % | 422.95 | 1,274 | 3.58 % | 3,607.54 | 294 | 1.84 % | 1,885.91 |
| ali | al | i | 1,482 | 1.44 % | 1,431.74 | 352 | 1.62 % | 1,526.96 | 129 | 0.43 % | 436.49 | 885 | 2.48 % | 2,506.02 | 116 | 0.73 % | 744.10 |
| sej | se | j | 1,341 | 1.30 % | 1,295.52 | 184 | 0.85 % | 798.18 | 678 | 2.29 % | 2,294.09 | 177 | 0.50 % | 501.20 | 302 | 1.89 % | 1,937.23 |
| torej | to | rej | 1,221 | 1.19 % | 1,179.59 | 436 | 2.00 % | 1,891.35 | 36 | 0.12 % | 121.81 | 648 | 1.82 % | 1,834.92 | 101 | 0.63 % | 647.88 |
| de | de | | 1,169 | 1.14 % | 1,129.35 | 87 | 0.40 % | 377.40 | 797 | 2.69 % | 2,696.74 | 106 | 0.30 % | 300.16 | 179 | 1.12 % | 1,148.22 |
| oziroma | oz | iroma | 881 | 0.85 % | 851.12 | 258 | 1.19 % | 1,119.19 | 104 | 0.35 % | 351.90 | 400 | 1.12 % | 1,132.67 | 119 | 0.75 % | 763.34 |
| sicer | si | cer | 593 | 0.58 % | 572.89 | 195 | 0.90 % | 845.90 | 74 | 0.25 % | 250.39 | 242 | 0.68 % | 685.26 | 82 | 0.51 % | 526 |
| kjer | kj | er | 484 | 0.47 % | 467.58 | 111 | 0.51 % | 481.51 | 23 | 0.08 % | 77.82 | 283 | 0.79 % | 801.36 | 67 | 0.42 % | 429.78 |
| zakaj | za | kaj | 422 | 0.41 % | 407.69 | 70 | 0.32 % | 303.66 | 82 | 0.28 % | 277.46 | 224 | 0.63 % | 634.29 | 46 | 0.29 % | 295.07 |
| ka | ka | | 353 | 0.34 % | 341.03 | 123 | 0.56 % | 533.57 | 211 | 0.71 % | 713.94 | 4 | 0.01 % | 11.33 | 15 | 0.09 % | 96.22 |
| zato | za | to | 322 | 0.31 % | 311.08 | 58 | 0.27 % | 251.60 | 54 | 0.18 % | 182.72 | 153 | 0.43 % | 433.24 | 57 | 0.36 % | 365.64 |
| kr | kr | | 318 | 0.31 % | 307.21 | 33 | 0.15 % | 143.15 | 151 | 0.51 % | 510.93 | 48 | 0.14 % | 135.92 | 86 | 0.54 % | 551.66 |
| niti | ni | ti | 306 | 0.30 % | 295.62 | 59 | 0.27 % | 255.94 | 89 | 0.30 % | 301.14 | 105 | 0.29 % | 297.32 | 53 | 0.33 % | 339.98 |
| kako | ka | ko | 296 | 0.29 % | 285.96 | 71 | 0.33 % | 308 | 21 | 0.07 % | 71.06 | 189 | 0.53 % | 535.18 | 15 | 0.09 % | 96.22 |
| vendar | ve | ndar | 276 | 0.27 % | 266.64 | 84 | 0.39 % | 364.39 | 1 | 0 % | 3.38 | 182 | 0.51 % | 515.36 | 9 | 0.06 % | 57.73 |
| ku | ku | | 256 | 0.25 % | 247.32 | 14 | 0.06 % | 60.73 | 220 | 0.74 % | 744.40 | 14 | 0.04 % | 39.64 | 8 | 0.05 % | 51.32 |
| kak | ka | k | 245 | 0.24 % | 236.69 | 80 | 0.37 % | 347.04 | 112 | 0.38 % | 378.96 | 13 | 0.04 % | 36.81 | 40 | 0.25 % | 256.59 |
| namreč | na | mreč | 223 | 0.22 % | 215.44 | 71 | 0.33 % | 308 | 14 | 0.05 % | 47.37 | 124 | 0.35 % | 351.13 | 14 | 0.09 % | 89.81 |
| se | se | | 185 | 0.18 % | 178.73 | 21 | 0.10 % | 91.10 | 112 | 0.38 % | 378.96 | 13 | 0.04 % | 36.81 | 39 | 0.24 % | 250.17 |
| tako | ta | ko | 183 | 0.18 % | 176.79 | 30 | 0.14 % | 130.14 | 29 | 0.10 % | 98.12 | 72 | 0.20 % | 203.88 | 52 | 0.33 % | 333.56 |
| kajti | ka | jti | 174 | 0.17 % | 168.10 | 104 | 0.48 % | 451.15 | 0 | 0 % | 0 | 65 | 0.18 % | 184.06 | 5 | 0.03 % | 32.07 |
| čeprav | če | prav | 160 | 0.15 % | 154.57 | 62 | 0.28 % | 268.95 | 16 | 0.05 % | 54.14 | 66 | 0.18 % | 186.89 | 16 | 0.10 % | 102.63 |
| ke | ke | | 139 | 0.14 % | 134.29 | 25 | 0.12 % | 108.45 | 104 | 0.35 % | 351.90 | 1 | 0 % | 2.83 | 9 | 0.06 % | 57.73 |

File at CLARIN.SI
2.2.263 List of initial character-level 3-grams from conjunction lower-case word forms in the GOS 1.0 corpus with text-type distribution

GOS1.0-word_parts-conjunctions-lowercase_forms-initial-3grams-taxonomy-entire.tsv

| Lower-case word form | Initial part of the word | Rest of the word | Total absolute frequency of lower-case word form | Percentage of all found lower-case word forms | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N] |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| ampak | amp | ak | 3,444 | 16.95 % | 3,327.19 | 796 | 16.75 % | 3,453.02 | 778 | 18.48 % | 2,632.45 | 1,208 | 14.67 % | 3,420.65 | 662 | 21.16 % | 4,246.50 |
| ker | ker | | 3,191 | 15.70 % | 3,082.77 | 567 | 11.93 % | 2,459.62 | 703 | 16.70 % | 2,378.68 | 1,159 | 14.08 % | 3,281.90 | 762 | 24.36 % | 4,887.97 |
| kot | kot | | 2,279 | 11.21 % | 2,201.70 | 586 | 12.33 % | 2,542.05 | 125 | 2.97 % | 422.95 | 1,274 | 15.47 % | 3,607.54 | 294 | 9.40 % | 1,885.91 |
| ali | ali | | 1,482 | 7.29 % | 1,431.74 | 352 | 7.41 % | 1,526.96 | 129 | 3.06 % | 436.49 | 885 | 10.75 % | 2,506.02 | 116 | 3.71 % | 744.10 |
| sej | sej | | 1,341 | 6.60 % | 1,295.52 | 184 | 3.87 % | 798.18 | 678 | 16.11 % | 2,294.09 | 177 | 2.15 % | 501.20 | 302 | 9.65 % | 1,937.23 |
| torej | tor | ej | 1,221 | 6.01 % | 1,179.59 | 436 | 9.18 % | 1,891.35 | 36 | 0.85 % | 121.81 | 648 | 7.87 % | 1,834.92 | 101 | 3.23 % | 647.88 |
| oziroma | ozi | roma | 881 | 4.33 % | 851.12 | 258 | 5.43 % | 1,119.19 | 104 | 2.47 % | 351.90 | 400 | 4.86 % | 1,132.67 | 119 | 3.80 % | 763.34 |
| sicer | sic | er | 593 | 2.92 % | 572.89 | 195 | 4.10 % | 845.90 | 74 | 1.76 % | 250.39 | 242 | 2.94 % | 685.26 | 82 | 2.62 % | 526 |
| kjer | kje | r | 484 | 2.38 % | 467.58 | 111 | 2.34 % | 481.51 | 23 | 0.55 % | 77.82 | 283 | 3.44 % | 801.36 | 67 | 2.14 % | 429.78 |
| zakaj | zak | aj | 422 | 2.08 % | 407.69 | 70 | 1.47 % | 303.66 | 82 | 1.95 % | 277.46 | 224 | 2.72 % | 634.29 | 46 | 1.47 % | 295.07 |
| zato | zat | o | 322 | 1.58 % | 311.08 | 58 | 1.22 % | 251.60 | 54 | 1.28 % | 182.72 | 153 | 1.86 % | 433.24 | 57 | 1.82 % | 365.64 |
| niti | nit | i | 306 | 1.51 % | 295.62 | 59 | 1.24 % | 255.94 | 89 | 2.12 % | 301.14 | 105 | 1.27 % | 297.32 | 53 | 1.69 % | 339.98 |
| kako | kak | o | 296 | 1.46 % | 285.96 | 71 | 1.49 % | 308 | 21 | 0.50 % | 71.06 | 189 | 2.30 % | 535.18 | 15 | 0.48 % | 96.22 |
| vendar | ven | dar | 276 | 1.36 % | 266.64 | 84 | 1.77 % | 364.39 | 1 | 0.02 % | 3.38 | 182 | 2.21 % | 515.36 | 9 | 0.29 % | 57.73 |
| kak | kak | | 245 | 1.21 % | 236.69 | 80 | 1.68 % | 347.04 | 112 | 2.66 % | 378.96 | 13 | 0.16 % | 36.81 | 40 | 1.28 % | 256.59 |
| namreč | nam | reč | 223 | 1.10 % | 215.44 | 71 | 1.49 % | 308 | 14 | 0.33 % | 47.37 | 124 | 1.51 % | 351.13 | 14 | 0.45 % | 89.81 |
| tako | tak | o | 183 | 0.90 % | 176.79 | 30 | 0.63 % | 130.14 | 29 | 0.69 % | 98.12 | 72 | 0.88 % | 203.88 | 52 | 1.66 % | 333.56 |
| kajti | kaj | ti | 174 | 0.86 % | 168.10 | 104 | 2.19 % | 451.15 | 0 | 0 % | 0 | 65 | 0.79 % | 184.06 | 5 | 0.16 % | 32.07 |
| čeprav | čep | rav | 160 | 0.79 % | 154.57 | 62 | 1.30 % | 268.95 | 16 | 0.38 % | 54.14 | 66 | 0.80 % | 186.89 | 16 | 0.51 % | 102.63 |
| čeprov | čep | rov | 133 | 0.65 % | 128.49 | 23 | 0.48 % | 99.77 | 47 | 1.12 % | 159.03 | 40 | 0.49 % | 113.27 | 23 | 0.73 % | 147.54 |
| kokr | kok | r | 123 | 0.60 % | 118.83 | 15 | 0.32 % | 65.07 | 64 | 1.52 % | 216.55 | 35 | 0.42 % | 99.11 | 9 | 0.29 % | 57.73 |
| dokler | dok | ler | 111 | 0.55 % | 107.24 | 19 | 0.40 % | 82.42 | 31 | 0.74 % | 104.89 | 41 | 0.50 % | 116.10 | 20 | 0.64 % | 128.29 |
| vendarle | ven | darle | 106 | 0.52 % | 102.40 | 25 | 0.53 % | 108.45 | 0 | 0 % | 0 | 74 | 0.90 % | 209.54 | 7 | 0.22 % | 44.90 |
| toda | tod | a | 87 | 0.43 % | 84.05 | 71 | 1.49 % | 308 | 0 | 0 % | 0 | 15 | 0.18 % | 42.47 | 1 | 0.03 % | 6.41 |
| tko | tko | | 86 | 0.42 % | 83.08 | 11 | 0.23 % | 47.72 | 38 | 0.90 % | 128.58 | 15 | 0.18 % | 42.47 | 22 | 0.70 % | 141.12 |
| kadar | kad | ar | 84 | 0.41 % | 81.15 | 24 | 0.51 % | 104.11 | 7 | 0.17 % | 23.69 | 37 | 0.45 % | 104.77 | 16 | 0.51 % | 102.63 |
| kukr | kuk | r | 81 | 0.40 % | 78.25 | 7 | 0.15 % | 30.37 | 56 | 1.33 % | 189.48 | 14 | 0.17 % | 39.64 | 4 | 0.13 % | 25.66 |
| ter | ter | | 73 | 0.36 % | 70.52 | 45 | 0.95 % | 195.21 | 1 | 0.02 % | 3.38 | 27 | 0.33 % | 76.45 | 0 | 0 % | 0 |
| kakor | kak | or | 72 | 0.35 % | 69.56 | 10 | 0.21 % | 43.38 | 4 | 0.10 % | 13.53 | 51 | 0.62 % | 144.41 | 7 | 0.22 % | 44.90 |
| saj | saj | | 68 | 0.34 % | 65.69 | 17 | 0.36 % | 73.75 | 19 | 0.45 % | 64.29 | 18 | 0.22 % | 50.97 | 14 | 0.45 % | 89.81 |
| drgač | drg | ač | 65 | 0.32 % | 62.80 | 14 | 0.29 % | 60.73 | 38 | 0.90 % | 128.58 | 5 | 0.06 % | 14.16 | 8 | 0.26 % | 51.32 |
| drugač | dru | gač | 63 | 0.31 % | 60.86 | 12 | 0.25 % | 52.06 | 32 | 0.76 % | 108.28 | 9 | 0.11 % | 25.48 | 10 | 0.32 % | 64.15 |

File at CLARIN.SI
2.2.264 List of initial character-level 4-grams from conjunction lower-case word forms in the GOS 1.0 corpus with text-type distribution

GOS1.0-word_parts-conjunctions-lowercase_forms-initial-4grams-taxonomy-entire.tsv

| Lower-case word form | Initial part of the word | Rest of the word | Total absolute frequency of lower-case word form | Percentage of all found lower-case word forms | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N] |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| ampak | ampa | k | 3,444 | 31.69 % | 3,327.19 | 796 | 28.59 % | 3,453.02 | 778 | 39.29 % | 2,632.45 | 1,208 | 26.35 % | 3,420.65 | 662 | 43.61 % | 4,246.50 |
| torej | tore | j | 1,221 | 11.24 % | 1,179.59 | 436 | 15.66 % | 1,891.35 | 36 | 1.82 % | 121.81 | 648 | 14.13 % | 1,834.92 | 101 | 6.65 % | 647.88 |
| oziroma | ozir | oma | 881 | 8.11 % | 851.12 | 258 | 9.27 % | 1,119.19 | 104 | 5.25 % | 351.90 | 400 | 8.72 % | 1,132.67 | 119 | 7.84 % | 763.34 |
| sicer | sice | r | 593 | 5.46 % | 572.89 | 195 | 7.00 % | 845.90 | 74 | 3.74 % | 250.39 | 242 | 5.28 % | 685.26 | 82 | 5.40 % | 526 |
| kjer | kjer | | 484 | 4.45 % | 467.58 | 111 | 3.99 % | 481.51 | 23 | 1.16 % | 77.82 | 283 | 6.17 % | 801.36 | 67 | 4.41 % | 429.78 |
| zakaj | zaka | j | 422 | 3.88 % | 407.69 | 70 | 2.51 % | 303.66 | 82 | 4.14 % | 277.46 | 224 | 4.88 % | 634.29 | 46 | 3.03 % | 295.07 |
| zato | zato | | 322 | 2.96 % | 311.08 | 58 | 2.08 % | 251.60 | 54 | 2.73 % | 182.72 | 153 | 3.34 % | 433.24 | 57 | 3.75 % | 365.64 |
| niti | niti | | 306 | 2.82 % | 295.62 | 59 | 2.12 % | 255.94 | 89 | 4.50 % | 301.14 | 105 | 2.29 % | 297.32 | 53 | 3.49 % | 339.98 |
| kako | kako | | 296 | 2.72 % | 285.96 | 71 | 2.55 % | 308 | 21 | 1.06 % | 71.06 | 189 | 4.12 % | 535.18 | 15 | 0.99 % | 96.22 |
| vendar | vend | ar | 276 | 2.54 % | 266.64 | 84 | 3.02 % | 364.39 | 1 | 0.05 % | 3.38 | 182 | 3.97 % | 515.36 | 9 | 0.59 % | 57.73 |
| namreč | namr | eč | 223 | 2.05 % | 215.44 | 71 | 2.55 % | 308 | 14 | 0.71 % | 47.37 | 124 | 2.70 % | 351.13 | 14 | 0.92 % | 89.81 |
| tako | tako | | 183 | 1.68 % | 176.79 | 30 | 1.08 % | 130.14 | 29 | 1.47 % | 98.12 | 72 | 1.57 % | 203.88 | 52 | 3.43 % | 333.56 |
| kajti | kajt | i | 174 | 1.60 % | 168.10 | 104 | 3.74 % | 451.15 | 0 | 0 % | 0 | 65 | 1.42 % | 184.06 | 5 | 0.33 % | 32.07 |
| čeprav | čepr | av | 160 | 1.47 % | 154.57 | 62 | 2.23 % | 268.95 | 16 | 0.81 % | 54.14 | 66 | 1.44 % | 186.89 | 16 | 1.05 % | 102.63 |
| čeprov | čepr | ov | 133 | 1.22 % | 128.49 | 23 | 0.83 % | 99.77 | 47 | 2.37 % | 159.03 | 40 | 0.87 % | 113.27 | 23 | 1.51 % | 147.54 |
| kokr | kokr | | 123 | 1.13 % | 118.83 | 15 | 0.54 % | 65.07 | 64 | 3.23 % | 216.55 | 35 | 0.76 % | 99.11 | 9 | 0.59 % | 57.73 |
| dokler | dokl | er | 111 | 1.02 % | 107.24 | 19 | 0.68 % | 82.42 | 31 | 1.57 % | 104.89 | 41 | 0.89 % | 116.10 | 20 | 1.32 % | 128.29 |
| vendarle | vend | arle | 106 | 0.97 % | 102.40 | 25 | 0.90 % | 108.45 | 0 | 0 % | 0 | 74 | 1.61 % | 209.54 | 7 | 0.46 % | 44.90 |
| toda | toda | | 87 | 0.80 % | 84.05 | 71 | 2.55 % | 308 | 0 | 0 % | 0 | 15 | 0.33 % | 42.47 | 1 | 0.07 % | 6.41 |
| kadar | kada | r | 84 | 0.77 % | 81.15 | 24 | 0.86 % | 104.11 | 7 | 0.35 % | 23.69 | 37 | 0.81 % | 104.77 | 16 | 1.05 % | 102.63 |
| kukr | kukr | | 81 | 0.74 % | 78.25 | 7 | 0.25 % | 30.37 | 56 | 2.83 % | 189.48 | 14 | 0.30 % | 39.64 | 4 | 0.26 % | 25.66 |
| kakor | kako | r | 72 | 0.66 % | 69.56 | 10 | 0.36 % | 43.38 | 4 | 0.20 % | 13.53 | 51 | 1.11 % | 144.41 | 7 | 0.46 % | 44.90 |
| drgač | drga | č | 65 | 0.60 % | 62.80 | 14 | 0.50 % | 60.73 | 38 | 1.92 % | 128.58 | 5 | 0.11 % | 14.16 | 8 | 0.53 % | 51.32 |
| drugač | drug | ač | 63 | 0.58 % | 60.86 | 12 | 0.43 % | 52.06 | 32 | 1.62 % | 108.28 | 9 | 0.20 % | 25.48 | 10 | 0.66 % | 64.15 |
| drugače | drug | ače | 62 | 0.57 % | 59.90 | 11 | 0.40 % | 47.72 | 13 | 0.66 % | 43.99 | 24 | 0.52 % | 67.96 | 14 | 0.92 % | 89.81 |
| kedr | kedr | | 42 | 0.39 % | 40.58 | 4 | 0.14 % | 17.35 | 23 | 1.16 % | 77.82 | 14 | 0.30 % | 39.64 | 1 | 0.07 % | 6.41 |
| zaka | zaka | | 42 | 0.39 % | 40.58 | 9 | 0.32 % | 39.04 | 25 | 1.26 % | 84.59 | 4 | 0.09 % | 11.33 | 4 | 0.26 % | 25.66 |
| odkar | odka | r | 30 | 0.28 % | 28.98 | 17 | 0.61 % | 73.75 | 7 | 0.35 % | 23.69 | 4 | 0.09 % | 11.33 | 2 | 0.13 % | 12.83 |
| preden | pred | en | 30 | 0.28 % | 28.98 | 7 | 0.25 % | 30.37 | 7 | 0.35 % | 23.69 | 11 | 0.24 % | 31.15 | 5 | 0.33 % | 32.07 |
| kakorkoli | kako | rkoli | 29 | 0.27 % | 28.02 | 10 | 0.36 % | 43.38 | 5 | 0.25 % | 16.92 | 9 | 0.20 % | 25.48 | 5 | 0.33 % | 32.07 |
| bodisi | bodi | si | 27 | 0.25 % | 26.08 | 0 | 0 % | 0 | 0 | 0 % | 0 | 23 | 0.50 % | 65.13 | 4 | 0.26 % | 25.66 |
| mpak | mpak | | 26 | 0.24 % | 25.12 | 1 | 0.04 % | 4.34 | 11 | 0.56 % | 37.22 | 12 | 0.26 % | 33.98 | 2 | 0.13 % | 12.83 |

File at CLARIN.SI
2.2.265 List of
initial character-level 5-grams from conjunction lower-case word forms in the GOS 1.0 corpus with text-type distribution

GOS1.0-word_parts-conjunctions-lowercase_forms-initial-5grams-taxonomy-entire.tsv

| Lower-case word form | Initial part of the word | Rest of the word | Total absolute frequency of lower-case word form | Percentage of all found lower-case word forms | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N] |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| ampak | ampak | | 3,444 | 39.88 % | 3,327.19 | 796 | 34.49 % | 3,453.02 | 778 | 53.14 % | 2,632.45 | 1,208 | 33.23 % | 3,420.65 | 662 | 53.87 % | 4,246.50 |
| torej | torej | | 1,221 | 14.14 % | 1,179.59 | 436 | 18.89 % | 1,891.35 | 36 | 2.46 % | 121.81 | 648 | 17.83 % | 1,834.92 | 101 | 8.22 % | 647.88 |
| oziroma | oziro | ma | 881 | 10.20 % | 851.12 | 258 | 11.18 % | 1,119.19 | 104 | 7.10 % | 351.90 | 400 | 11.00 % | 1,132.67 | 119 | 9.68 % | 763.34 |
| sicer | sicer | | 593 | 6.87 % | 572.89 | 195 | 8.45 % | 845.90 | 74 | 5.05 % | 250.39 | 242 | 6.66 % | 685.26 | 82 | 6.67 % | 526 |
| zakaj | zakaj | | 422 | 4.89 % | 407.69 | 70 | 3.03 % | 303.66 | 82 | 5.60 % | 277.46 | 224 | 6.16 % | 634.29 | 46 | 3.74 % | 295.07 |
| vendar | venda | r | 276 | 3.20 % | 266.64 | 84 | 3.64 % | 364.39 | 1 | 0.07 % | 3.38 | 182 | 5.01 % | 515.36 | 9 | 0.73 % | 57.73 |
| namreč | namre | č | 223 | 2.58 % | 215.44 | 71 | 3.08 % | 308 | 14 | 0.96 % | 47.37 | 124 | 3.41 % | 351.13 | 14 | 1.14 % | 89.81 |
| kajti | kajti | | 174 | 2.02 % | 168.10 | 104 | 4.51 % | 451.15 | 0 | 0 % | 0 | 65 | 1.79 % | 184.06 | 5 | 0.41 % | 32.07 |
| čeprav | čepra | v | 160 | 1.85 % | 154.57 | 62 | 2.69 % | 268.95 | 16 | 1.09 % | 54.14 | 66 | 1.82 % | 186.89 | 16 | 1.30 % | 102.63 |
| čeprov | čepro | v | 133 | 1.54 % | 128.49 | 23 | 1.00 % | 99.77 | 47 | 3.21 % | 159.03 | 40 | 1.10 % | 113.27 | 23 | 1.87 % | 147.54 |
| dokler | dokle | r | 111 | 1.28 % | 107.24 | 19 | 0.82 % | 82.42 | 31 | 2.12 % | 104.89 | 41 | 1.13 % | 116.10 | 20 | 1.63 % | 128.29 |
| vendarle | venda | rle | 106 | 1.23 % | 102.40 | 25 | 1.08 % | 108.45 | 0 | 0 % | 0 | 74 | 2.04 % | 209.54 | 7 | 0.57 % | 44.90 |
| kadar | kadar | | 84 | 0.97 % | 81.15 | 24 | 1.04 % | 104.11 | 7 | 0.48 % | 23.69 | 37 | 1.02 % | 104.77 | 16 | 1.30 % | 102.63 |
| kakor | kakor | | 72 | 0.83 % | 69.56 | 10 | 0.43 % | 43.38 | 4 | 0.27 % | 13.53 | 51 | 1.40 % | 144.41 | 7 | 0.57 % | 44.90 |
| drgač | drgač | | 65 | 0.75 % | 62.80 | 14 | 0.61 % | 60.73 | 38 | 2.60 % | 128.58 | 5 | 0.14 % | 14.16 | 8 | 0.65 % | 51.32 |
| drugač | druga | č | 63 | 0.73 % | 60.86 | 12 | 0.52 % | 52.06 | 32 | 2.19 % | 108.28 | 9 | 0.25 % | 25.48 | 10 | 0.81 % | 64.15 |
| drugače | druga | če | 62 | 0.72 % | 59.90 | 11 | 0.48 % | 47.72 | 13 | 0.89 % | 43.99 | 24 | 0.66 % | 67.96 | 14 | 1.14 % | 89.81 |
| odkar | odkar | | 30 | 0.35 % | 28.98 | 17 | 0.74 % | 73.75 | 7 | 0.48 % | 23.69 | 4 | 0.11 % | 11.33 | 2 | 0.16 % | 12.83 |
| preden | prede | n | 30 | 0.35 % | 28.98 | 7 | 0.30 % | 30.37 | 7 | 0.48 % | 23.69 | 11 | 0.30 % | 31.15 | 5 | 0.41 % | 32.07 |
| kakorkoli | kakor | koli | 29 | 0.34 % | 28.02 | 10 | 0.43 % | 43.38 | 5 | 0.34 % | 16.92 | 9 | 0.25 % | 25.48 | 5 | 0.41 % | 32.07 |
| bodisi | bodis | i | 27 | 0.31 % | 26.08 | 0 | 0 % | 0 | 0 | 0 % | 0 | 23 | 0.63 % | 65.13 | 4 | 0.33 % | 25.66 |
| koker | koker | | 25 | 0.29 % | 24.15 | 6 | 0.26 % | 26.03 | 6 | 0.41 % | 20.30 | 4 | 0.11 % | 11.33 | 9 | 0.73 % | 57.73 |
| temveč | temve | č | 25 | 0.29 % | 24.15 | 2 | 0.09 % | 8.68 | 0 | 0 % | 0 | 23 | 0.63 % | 65.13 | 0 | 0 % | 0 |
| kamor | kamor | | 21 | 0.24 % | 20.29 | 7 | 0.30 % | 30.37 | 2 | 0.14 % | 6.77 | 11 | 0.30 % | 31.15 | 1 | 0.08 % | 6.41 |
| zakva | zakva | | 21 | 0.24 % | 20.29 | 1 | 0.04 % | 4.34 | 16 | 1.09 % | 54.14 | 0 | 0 % | 0 | 4 | 0.33 % | 25.66 |
| kjerkoli | kjerk | oli | 17 | 0.20 % | 16.42 | 2 | 0.09 % | 8.68 | 3 | 0.20 % | 10.15 | 9 | 0.25 % | 25.48 | 3 | 0.24 % | 19.24 |
| ozirma | ozirm | a | 13 | 0.15 % | 12.56 | 0 | 0 % | 0 | 0 | 0 % | 0 | 12 | 0.33 % | 33.98 | 1 | 0.08 % | 6.41 |
| četudi | četud | i | 13 | 0.15 % | 12.56 | 2 | 0.09 % | 8.68 | 1 | 0.07 % | 3.38 | 10 | 0.28 % | 28.32 | 0 | 0 % | 0 |
| keder | keder | | 11 | 0.13 % | 10.63 | 1 | 0.04 % | 4.34 | 9 | 0.61 % | 30.45 | 1 | 0.03 % | 2.83 | 0 | 0 % | 0 |
| kuker | kuker | | 11 | 0.13 % | 10.63 | 0 | 0 % | 0 | 9 | 0.61 % | 30.45 | 2 | 0.06 % | 5.66 | 0 | 0 % | 0 |
| magari | magar | i | 11 | 0.13 % | 10.63 | 0 | 0 % | 0 | 9 | 0.61 % | 30.45 | 0 | 0 % | 0 | 2 | 0.16 % | 12.83 |
| kamorkoli | kamor | koli | 10 | 0.12 % | 9.66 | 3 | 0.13 % | 13.01 | 3 | 0.20 % | 10.15 | 2 | 0.06 % | 5.66 | 2 | 0.16 % | 12.83 |

File at CLARIN.SI
2.2.266 List of final character-level 1-grams from conjunction lower-case word forms in the GOS 1.0 corpus with text-type distribution

GOS1.0-word_parts-conjunctions-lowercase_forms-final-1grams-taxonomy-entire.tsv

| Lower-case word form | Rest of the word | Final part of the word | Total absolute frequency of lower-case word form | Percentage of all found lower-case word forms | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N] |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| pa | p | a | 29,155 | 25.71 % | 28,166.17 | 5,585 | 23.88 % | 24,227.52 | 11,945 | 34.72 % | 40,417.27 | 6,807 | 17.98 % | 19,275.15 | 4,818 | 27.17 % | 30,905.81 |
| da | d | a | 17,252 | 15.21 % | 16,666.88 | 3,484 | 14.90 % | 15,113.46 | 3,889 | 11.30 % | 13,158.87 | 6,960 | 18.38 % | 19,708.40 | 2,919 | 16.46 % | 18,724.38 |
| in | i | n | 15,945 | 14.06 % | 15,404.20 | 4,161 | 17.79 % | 18,050.26 | 2,958 | 8.60 % | 10,008.73 | 7,054 | 18.63 % | 19,974.57 | 1,772 | 9.99 % | 11,366.77 |
| a | | a | 6,728 | 5.93 % | 6,499.81 | 1,233 | 5.27 % | 5,348.71 | 2,505 | 7.28 % | 8,475.95 | 1,772 | 4.68 % | 5,017.71 | 1,218 | 6.87 % | 7,813.05 |
| če | č | e | 5,945 | 5.24 % | 5,743.37 | 1,107 | 4.73 % | 4,802.12 | 1,744 | 5.07 % | 5,901.02 | 1,891 | 4.99 % | 5,354.68 | 1,203 | 6.79 % | 7,716.83 |
| ki | k | i | 4,526 | 3.99 % | 4,372.49 | 1,128 | 4.82 % | 4,893.22 | 367 | 1.07 % | 1,241.79 | 2,663 | 7.03 % | 7,540.73 | 368 | 2.08 % | 2,360.59 |
| ko | k | o | 3,703 | 3.27 % | 3,577.41 | 705 | 3.01 % | 3,058.26 | 1,288 | 3.74 % | 4,358.09 | 1,157 | 3.06 % | 3,276.24 | 553 | 3.12 % | 3,547.30 |
| ampak | ampa | k | 3,444 | 3.04 % | 3,327.19 | 796 | 3.40 % | 3,453.02 | 778 | 2.26 % | 2,632.45 | 1,208 | 3.19 % | 3,420.65 | 662 | 3.73 % | 4,246.50 |
| al | a | l | 3,231 | 2.85 % | 3,121.42 | 444 | 1.90 % | 1,926.06 | 1,388 | 4.04 % | 4,696.46 | 614 | 1.62 % | 1,738.64 | 785 | 4.43 % | 5,035.51 |
| ker | ke | r | 3,191 | 2.81 % | 3,082.77 | 567 | 2.42 % | 2,459.62 | 703 | 2.04 % | 2,378.68 | 1,159 | 3.06 % | 3,281.90 | 762 | 4.30 % | 4,887.97 |
| k | | k | 2,979 | 2.63 % | 2,877.96 | 345 | 1.48 % | 1,496.60 | 1,729 | 5.03 % | 5,850.27 | 371 | 0.98 % | 1,050.55 | 534 | 3.01 % | 3,425.43 |
| kot | ko | t | 2,279 | 2.01 % | 2,201.70 | 586 | 2.51 % | 2,542.05 | 125 | 0.36 % | 422.95 | 1,274 | 3.37 % | 3,607.54 | 294 | 1.66 % | 1,885.91 |
| ali | al | i | 1,482 | 1.31 % | 1,431.74 | 352 | 1.50 % | 1,526.96 | 129 | 0.38 % | 436.49 | 885 | 2.34 % | 2,506.02 | 116 | 0.65 % | 744.10 |
| sej | se | j | 1,341 | 1.18 % | 1,295.52 | 184 | 0.79 % | 798.18 | 678 | 1.97 % | 2,294.09 | 177 | 0.47 % | 501.20 | 302 | 1.70 % | 1,937.23 |
| torej | tore | j | 1,221 | 1.08 % | 1,179.59 | 436 | 1.86 % | 1,891.35 | 36 | 0.10 % | 121.81 | 648 | 1.71 % | 1,834.92 | 101 | 0.57 % | 647.88 |
| de | d | e | 1,169 | 1.03 % | 1,129.35 | 87 | 0.37 % | 377.40 | 797 | 2.32 % | 2,696.74 | 106 | 0.28 % | 300.16 | 179 | 1.01 % | 1,148.22 |
| oziroma | ozirom | a | 881 | 0.78 % | 851.12 | 258 | 1.10 % | 1,119.19 | 104 | 0.30 % | 351.90 | 400 | 1.06 % | 1,132.67 | 119 | 0.67 % | 763.34 |
| sicer | sice | r | 593 | 0.52 % | 572.89 | 195 | 0.83 % | 845.90 | 74 | 0.21 % | 250.39 | 242 | 0.64 % | 685.26 | 82 | 0.46 % | 526 |
| kjer | kje | r | 484 | 0.43 % | 467.58 | 111 | 0.47 % | 481.51 | 23 | 0.07 % | 77.82 | 283 | 0.75 % | 801.36 | 67 | 0.38 % | 429.78 |
| zakaj | zaka | j | 422 | 0.37 % | 407.69 | 70 | 0.30 % | 303.66 | 82 | 0.24 % | 277.46 | 224 | 0.59 % | 634.29 | 46 | 0.26 % | 295.07 |
| d | | d | 358 | 0.32 % | 345.86 | 24 | 0.10 % | 104.11 | 249 | 0.72 % | 842.52 | 68 | 0.18 % | 192.55 | 17 | 0.10 % | 109.05 |
| ka | k | a | 353 | 0.31 % | 341.03 | 123 | 0.53 % | 533.57 | 211 | 0.61 % | 713.94 | 4 | 0.01 % | 11.33 | 15 | 0.09 % | 96.22 |
| zato | zat | o | 322 | 0.28 % | 311.08 | 58 | 0.25 % | 251.60 | 54 | 0.16 % | 182.72 | 153 | 0.40 % | 433.24 | 57 | 0.32 % | 365.64 |
| kr | k | r | 318 | 0.28 % | 307.21 | 33 | 0.14 % | 143.15 | 151 | 0.44 % | 510.93 | 48 | 0.13 % | 135.92 | 86 | 0.48 % | 551.66 |
| niti | nit | i | 306 | 0.27 % | 295.62 | 59 | 0.25 % | 255.94 | 89 | 0.26 % | 301.14 | 105 | 0.28 % | 297.32 | 53 | 0.30 % | 339.98 |
| kako | kak | o | 296 | 0.26 % | 285.96 | 71 | 0.30 % | 308 | 21 | 0.06 % | 71.06 | 189 | 0.50 % | 535.18 | 15 | 0.09 % | 96.22 |
| vendar | venda | r | 276 | 0.24 % | 266.64 | 84 | 0.36 % | 364.39 | 1 | 0 % | 3.38 | 182 | 0.48 % | 515.36 | 9 | 0.05 % | 57.73 |
| ku | k | u | 256 | 0.23 % | 247.32 | 14 | 0.06 % | 60.73 | 220 | 0.64 % | 744.40 | 14 | 0.04 % | 39.64 | 8 | 0.04 % | 51.32 |
| i | | i | 253 | 0.22 % | 244.42 | 10 | 0.04 % | 43.38 | 224 | 0.65 % | 757.93 | 9 | 0.02 % | 25.48 | 10 | 0.06 % | 64.15 |
| kak | ka | k | 245 | 0.22 % | 236.69 | 80 | 0.34 % | 347.04 | 112 | 0.33 % | 378.96 | 13 | 0.03 % | 36.81 | 40 | 0.23 % | 256.59 |
| namreč | namre | č | 223 | 0.20 % | 215.44 | 71 | 0.30 % | 308 | 14 | 0.04 % | 47.37 | 124 | 0.33 % | 351.13 | 14 | 0.08 % | 89.81 |
| se | s | e | 185 | 0.16 % | 178.73 | 21 | 0.09 % | 91.10 | 112 | 0.33 % | 378.96 | 13 | 0.03 % | 36.81 | 39 | 0.22 % | 250.17 |

File at CLARIN.SI
2.2.267 List of final character-level 2-grams from conjunction lower-case word forms in the GOS 1.0 corpus with text-type distribution

GOS1.0-word_parts-conjunctions-lowercase_forms-final-2grams-taxonomy-entire.tsv

| Lower-case word form | Rest of the word | Final part of the word | Total absolute frequency of lower-case word form | Percentage of all found lower-case word forms | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N] |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| pa | | pa | 29,155 | 28.31 % | 28,166.17 | 5,585 | 25.67 % | 24,227.52 | 11,945 | 40.28 % | 40,417.27 | 6,807 | 19.10 % | 19,275.15 | 4,818 | 30.22 % | 30,905.81 |
| da | | da | 17,252 | 16.75 % | 16,666.88 | 3,484 | 16.01 % | 15,113.46 | 3,889 | 13.11 % | 13,158.87 | 6,960 | 19.53 % | 19,708.40 | 2,919 | 18.31 % | 18,724.38 |
| in | | in | 15,945 | 15.48 % | 15,404.20 | 4,161 | 19.12 % | 18,050.26 | 2,958 | 9.97 % | 10,008.73 | 7,054 | 19.80 % | 19,974.57 | 1,772 | 11.11 % | 11,366.77 |
| če | | če | 5,945 | 5.77 % | 5,743.37 | 1,107 | 5.09 % | 4,802.12 | 1,744 | 5.88 % | 5,901.02 | 1,891 | 5.31 % | 5,354.68 | 1,203 | 7.54 % | 7,716.83 |
| ki | | ki | 4,526 | 4.39 % | 4,372.49 | 1,128 | 5.18 % | 4,893.22 | 367 | 1.24 % | 1,241.79 | 2,663 | 7.47 % | 7,540.73 | 368 | 2.31 % | 2,360.59 |
| ko | | ko | 3,703 | 3.60 % | 3,577.41 | 705 | 3.24 % | 3,058.26 | 1,288 | 4.34 % | 4,358.09 | 1,157 | 3.25 % | 3,276.24 | 553 | 3.47 % | 3,547.30 |
| ampak | amp | ak | 3,444 | 3.34 % | 3,327.19 | 796 | 3.66 % | 3,453.02 | 778 | 2.62 % | 2,632.45 | 1,208 | 3.39 % | 3,420.65 | 662 | 4.15 % | 4,246.50 |
| al | | al | 3,231 | 3.14 % | 3,121.42 | 444 | 2.04 % | 1,926.06 | 1,388 | 4.68 % | 4,696.46 | 614 | 1.72 % | 1,738.64 | 785 | 4.92 % | 5,035.51 |
| ker | k | er | 3,191 | 3.10 % | 3,082.77 | 567 | 2.61 % | 2,459.62 | 703 | 2.37 % | 2,378.68 | 1,159 | 3.25 % | 3,281.90 | 762 | 4.78 % | 4,887.97 |
| kot | k | ot | 2,279 | 2.21 % | 2,201.70 | 586 | 2.69 % | 2,542.05 | 125 | 0.42 % | 422.95 | 1,274 | 3.58 % | 3,607.54 | 294 | 1.84 % | 1,885.91 |
| ali | a | li | 1,482 | 1.44 % | 1,431.74 | 352 | 1.62 % | 1,526.96 | 129 | 0.43 % | 436.49 | 885 | 2.48 % | 2,506.02 | 116 | 0.73 % | 744.10 |
| sej | s | ej | 1,341 | 1.30 % | 1,295.52 | 184 | 0.85 % | 798.18 | 678 | 2.29 % | 2,294.09 | 177 | 0.50 % | 501.20 | 302 | 1.89 % | 1,937.23 |
| torej | tor | ej | 1,221 | 1.19 % | 1,179.59 | 436 | 2.00 % | 1,891.35 | 36 | 0.12 % | 121.81 | 648 | 1.82 % | 1,834.92 | 101 | 0.63 % | 647.88 |
| de | | de | 1,169 | 1.14 % | 1,129.35 | 87 | 0.40 % | 377.40 | 797 | 2.69 % | 2,696.74 | 106 | 0.30 % | 300.16 | 179 | 1.12 % | 1,148.22 |
| oziroma | oziro | ma | 881 | 0.85 % | 851.12 | 258 | 1.19 % | 1,119.19 | 104 | 0.35 % | 351.90 | 400 | 1.12 % | 1,132.67 | 119 | 0.75 % | 763.34 |
| sicer | sic | er | 593 | 0.58 % | 572.89 | 195 | 0.90 % | 845.90 | 74 | 0.25 % | 250.39 | 242 | 0.68 % | 685.26 | 82 | 0.51 % | 526 |
| kjer | kj | er | 484 | 0.47 % | 467.58 | 111 | 0.51 % | 481.51 | 23 | 0.08 % | 77.82 | 283 | 0.79 % | 801.36 | 67 | 0.42 % | 429.78 |
| zakaj | zak | aj | 422 | 0.41 % | 407.69 | 70 | 0.32 % | 303.66 | 82 | 0.28 % | 277.46 | 224 | 0.63 % | 634.29 | 46 | 0.29 % | 295.07 |
| ka | | ka | 353 | 0.34 % | 341.03 | 123 | 0.56 % | 533.57 | 211 | 0.71 % | 713.94 | 4 | 0.01 % | 11.33 | 15 | 0.09 % | 96.22 |
| zato | za | to | 322 | 0.31 % | 311.08 | 58 | 0.27 % | 251.60 | 54 | 0.18 % | 182.72 | 153 | 0.43 % | 433.24 | 57 | 0.36 % | 365.64 |
| kr | | kr | 318 | 0.31 % | 307.21 | 33 | 0.15 % | 143.15 | 151 | 0.51 % | 510.93 | 48 | 0.14 % | 135.92 | 86 | 0.54 % | 551.66 |
| niti | ni | ti | 306 | 0.30 % | 295.62 | 59 | 0.27 % | 255.94 | 89 | 0.30 % | 301.14 | 105 | 0.29 % | 297.32 | 53 | 0.33 % | 339.98 |
| kako | ka | ko | 296 | 0.29 % | 285.96 | 71 | 0.33 % | 308 | 21 | 0.07 % | 71.06 | 189 | 0.53 % | 535.18 | 15 | 0.09 % | 96.22 |
| vendar | vend | ar | 276 | 0.27 % | 266.64 | 84 | 0.39 % | 364.39 | 1 | 0 % | 3.38 | 182 | 0.51 % | 515.36 | 9 | 0.06 % | 57.73 |
| ku | | ku | 256 | 0.25 % | 247.32 | 14 | 0.06 % | 60.73 | 220 | 0.74 % | 744.40 | 14 | 0.04 % | 39.64 | 8 | 0.05 % | 51.32 |
| kak | k | ak | 245 | 0.24 % | 236.69 | 80 | 0.37 % | 347.04 | 112 | 0.38 % | 378.96 | 13 | 0.04 % | 36.81 | 40 | 0.25 % | 256.59 |
| namreč | namr | eč | 223 | 0.22 % | 215.44 | 71 | 0.33 % | 308 | 14 | 0.05 % | 47.37 | 124 | 0.35 % | 351.13 | 14 | 0.09 % | 89.81 |
| se | | se | 185 | 0.18 % | 178.73 | 21 | 0.10 % | 91.10 | 112 | 0.38 % | 378.96 | 13 | 0.04 % | 36.81 | 39 | 0.24 % | 250.17 |
| tako | ta | ko | 183 | 0.18 % | 176.79 | 30 | 0.14 % | 130.14 | 29 | 0.10 % | 98.12 | 72 | 0.20 % | 203.88 | 52 | 0.33 % | 333.56 |
| kajti | kaj | ti | 174 | 0.17 % | 168.10 | 104 | 0.48 % | 451.15 | 0 | 0 % | 0 | 65 | 0.18 % | 184.06 | 5 | 0.03 % | 32.07 |
| čeprav | čepr | av | 160 | 0.15 % | 154.57 | 62 | 0.28 % | 268.95 | 16 | 0.05 % | 54.14 | 66 | 0.18 % | 186.89 | 16 | 0.10 % | 102.63 |
| ke | | ke | 139 | 0.14 % | 134.29 | 25 | 0.12 % | 108.45 | 104 | 0.35 % | 351.90 | 1 | 0 % | 2.83 | 9 | 0.06 % | 57.73 |

File at CLARIN.SI
2.2.268 List of final character-level 3-grams from conjunction lower-case word forms in the GOS 1.0 corpus with text-type distribution

GOS1.0-word_parts-conjunctions-lowercase_forms-final-3grams-taxonomy-entire.tsv

| Lower-case word form | Rest of the word | Final part of the word | Total absolute frequency of lower-case word form | Percentage of all found lower-case word forms | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N] |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| ampak | am | pak | 3,444 | 16.95 % | 3,327.19 | 796 | 16.75 % | 3,453.02 | 778 | 18.48 % | 2,632.45 | 1,208 | 14.67 % | 3,420.65 | 662 | 21.16 % | 4,246.50 |
| ker | | ker | 3,191 | 15.70 % | 3,082.77 | 567 | 11.93 % | 2,459.62 | 703 | 16.70 % | 2,378.68 | 1,159 | 14.08 % | 3,281.90 | 762 | 24.36 % | 4,887.97 |
| kot | | kot | 2,279 | 11.21 % | 2,201.70 | 586 | 12.33 % | 2,542.05 | 125 | 2.97 % | 422.95 | 1,274 | 15.47 % | 3,607.54 | 294 | 9.40 % | 1,885.91 |
| ali | | ali | 1,482 | 7.29 % | 1,431.74 | 352 | 7.41 % | 1,526.96 | 129 | 3.06 % | 436.49 | 885 | 10.75 % | 2,506.02 | 116 | 3.71 % | 744.10 |
| sej | | sej | 1,341 | 6.60 % | 1,295.52 | 184 | 3.87 % | 798.18 | 678 | 16.11 % | 2,294.09 | 177 | 2.15 % | 501.20 | 302 | 9.65 % | 1,937.23 |
| torej | to | rej | 1,221 | 6.01 % | 1,179.59 | 436 | 9.18 % | 1,891.35 | 36 | 0.85 % | 121.81 | 648 | 7.87 % | 1,834.92 | 101 | 3.23 % | 647.88 |
| oziroma | ozir | oma | 881 | 4.33 % | 851.12 | 258 | 5.43 % | 1,119.19 | 104 | 2.47 % | 351.90 | 400 | 4.86 % | 1,132.67 | 119 | 3.80 % | 763.34 |
| sicer | si | cer | 593 | 2.92 % | 572.89 | 195 | 4.10 % | 845.90 | 74 | 1.76 % | 250.39 | 242 | 2.94 % | 685.26 | 82 | 2.62 % | 526 |
| kjer | k | jer | 484 | 2.38 % | 467.58 | 111 | 2.34 % | 481.51 | 23 | 0.55 % | 77.82 | 283 | 3.44 % | 801.36 | 67 | 2.14 % | 429.78 |
| zakaj | za | kaj | 422 | 2.08 % | 407.69 | 70 | 1.47 % | 303.66 | 82 | 1.95 % | 277.46 | 224 | 2.72 % | 634.29 | 46 | 1.47 % | 295.07 |
| zato | z | ato | 322 | 1.58 % | 311.08 | 58 | 1.22 % | 251.60 | 54 | 1.28 % | 182.72 | 153 | 1.86 % | 433.24 | 57 | 1.82 % | 365.64 |
| niti | n | iti | 306 | 1.51 % | 295.62 | 59 | 1.24 % | 255.94 | 89 | 2.12 % | 301.14 | 105 | 1.27 % | 297.32 | 53 | 1.69 % | 339.98 |
| kako | k | ako | 296 | 1.46 % | 285.96 | 71 | 1.49 % | 308 | 21 | 0.50 % | 71.06 | 189 | 2.30 % | 535.18 | 15 | 0.48 % | 96.22 |
| vendar | ven | dar | 276 | 1.36 % | 266.64 | 84 | 1.77 % | 364.39 | 1 | 0.02 % | 3.38 | 182 | 2.21 % | 515.36 | 9 | 0.29 % | 57.73 |
| kak | | kak | 245 | 1.21 % | 236.69 | 80 | 1.68 % | 347.04 | 112 | 2.66 % | 378.96 | 13 | 0.16 % | 36.81 | 40 | 1.28 % | 256.59 |
| namreč | nam | reč | 223 | 1.10 % | 215.44 | 71 | 1.49 % | 308 | 14 | 0.33 % | 47.37 | 124 | 1.51 % | 351.13 | 14 | 0.45 % | 89.81 |
| tako | t | ako | 183 | 0.90 % | 176.79 | 30 | 0.63 % | 130.14 | 29 | 0.69 % | 98.12 | 72 | 0.88 % | 203.88 | 52 | 1.66 % | 333.56 |
| kajti | ka | jti | 174 | 0.86 % | 168.10 | 104 | 2.19 % | 451.15 | 0 | 0 % | 0 | 65 | 0.79 % | 184.06 | 5 | 0.16 % | 32.07 |
| čeprav | čep | rav | 160 | 0.79 % | 154.57 | 62 | 1.30 % | 268.95 | 16 | 0.38 % | 54.14 | 66 | 0.80 % | 186.89 | 16 | 0.51 % | 102.63 |
| čeprov | čep | rov | 133 | 0.65 % | 128.49 | 23 | 0.48 % | 99.77 | 47 | 1.12 % | 159.03 | 40 | 0.49 % | 113.27 | 23 | 0.73 % | 147.54 |
| kokr | k | okr | 123 | 0.60 % | 118.83 | 15 | 0.32 % | 65.07 | 64 | 1.52 % | 216.55 | 35 | 0.42 % | 99.11 | 9 | 0.29 % | 57.73 |
| dokler | dok | ler | 111 | 0.55 % | 107.24 | 19 | 0.40 % | 82.42 | 31 | 0.74 % | 104.89 | 41 | 0.50 % | 116.10 | 20 | 0.64 % | 128.29 |
| vendarle | venda | rle | 106 | 0.52 % | 102.40 | 25 | 0.53 % | 108.45 | 0 | 0 % | 0 | 74 | 0.90 % | 209.54 | 7 | 0.22 % | 44.90 |
| toda | t | oda | 87 | 0.43 % | 84.05 | 71 | 1.49 % | 308 | 0 | 0 % | 0 | 15 | 0.18 % | 42.47 | 1 | 0.03 % | 6.41 |
| tko | | tko | 86 | 0.42 % | 83.08 | 11 | 0.23 % | 47.72 | 38 | 0.90 % | 128.58 | 15 | 0.18 % | 42.47 | 22 | 0.70 % | 141.12 |
| kadar | ka | dar | 84 | 0.41 % | 81.15 | 24 | 0.51 % | 104.11 | 7 | 0.17 % | 23.69 | 37 | 0.45 % | 104.77 | 16 | 0.51 % | 102.63 |
| kukr | k | ukr | 81 | 0.40 % | 78.25 | 7 | 0.15 % | 30.37 | 56 | 1.33 % | 189.48 | 14 | 0.17 % | 39.64 | 4 | 0.13 % | 25.66 |
| ter | | ter | 73 | 0.36 % | 70.52 | 45 | 0.95 % | 195.21 | 1 | 0.02 % | 3.38 | 27 | 0.33 % | 76.45 | 0 | 0 % | 0 |
| kakor | ka | kor | 72 | 0.35 % | 69.56 | 10 | 0.21 % | 43.38 | 4 | 0.10 % | 13.53 | 51 | 0.62 % | 144.41 | 7 | 0.22 % | 44.90 |
| saj | | saj | 68 | 0.34 % | 65.69 | 17 | 0.36 % | 73.75 | 19 | 0.45 % | 64.29 | 18 | 0.22 % | 50.97 | 14 | 0.45 % | 89.81 |
| drgač | dr | gač | 65 | 0.32 % | 62.80 | 14 | 0.29 % | 60.73 | 38 | 0.90 % | 128.58 | 5 | 0.06 % | 14.16 | 8 | 0.26 % | 51.32 |
| drugač | dru | gač | 63 | 0.31 % | 60.86 | 12 | 0.25 % | 52.06 | 32 | 0.76 % | 108.28 | 9 | 0.11 % | 25.48 | 10 | 0.32 % | 64.15 |

File at CLARIN.SI
2.2.269 List of final character-level 4-grams from conjunction lower-case word forms in the GOS 1.0 corpus with text-type distribution

GOS1.0-word_parts-conjunctions-lowercase_forms-final-4grams-taxonomy-entire.tsv

| Lower-case word form | Rest of the word | Final part of the word | Total absolute frequency of lower-case word form | Percentage of all found lower-case word forms | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N] |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| ampak | a | mpak | 3,444 | 31.69 % | 3,327.19 | 796 | 28.59 % | 3,453.02 | 778 | 39.29 % | 2,632.45 | 1,208 | 26.35 % | 3,420.65 | 662 | 43.61 % | 4,246.50 |
| torej | t | orej | 1,221 | 11.24 % | 1,179.59 | 436 | 15.66 % | 1,891.35 | 36 | 1.82 % | 121.81 | 648 | 14.13 % | 1,834.92 | 101 | 6.65 % | 647.88 |
| oziroma | ozi | roma | 881 | 8.11 % | 851.12 | 258 | 9.27 % | 1,119.19 | 104 | 5.25 % | 351.90 | 400 | 8.72 % | 1,132.67 | 119 | 7.84 % | 763.34 |
| sicer | s | icer | 593 | 5.46 % | 572.89 | 195 | 7.00 % | 845.90 | 74 | 3.74 % | 250.39 | 242 | 5.28 % | 685.26 | 82 | 5.40 % | 526 |
| kjer | | kjer | 484 | 4.45 % | 467.58 | 111 | 3.99 % | 481.51 | 23 | 1.16 % | 77.82 | 283 | 6.17 % | 801.36 | 67 | 4.41 % | 429.78 |
| zakaj | z | akaj | 422 | 3.88 % | 407.69 | 70 | 2.51 % | 303.66 | 82 | 4.14 % | 277.46 | 224 | 4.88 % | 634.29 | 46 | 3.03 % | 295.07 |
| zato | | zato | 322 | 2.96 % | 311.08 | 58 | 2.08 % | 251.60 | 54 | 2.73 % | 182.72 | 153 | 3.34 % | 433.24 | 57 | 3.75 % | 365.64 |
| niti | | niti | 306 | 2.82 % | 295.62 | 59 | 2.12 % | 255.94 | 89 | 4.50 % | 301.14 | 105 | 2.29 % | 297.32 | 53 | 3.49 % | 339.98 |
| kako | | kako | 296 | 2.72 % | 285.96 | 71 | 2.55 % | 308 | 21 | 1.06 % | 71.06 | 189 | 4.12 % | 535.18 | 15 | 0.99 % | 96.22 |
| vendar | ve | ndar | 276 | 2.54 % | 266.64 | 84 | 3.02 % | 364.39 | 1 | 0.05 % | 3.38 | 182 | 3.97 % | 515.36 | 9 | 0.59 % | 57.73 |
| namreč | na | mreč | 223 | 2.05 % | 215.44 | 71 | 2.55 % | 308 | 14 | 0.71 % | 47.37 | 124 | 2.70 % | 351.13 | 14 | 0.92 % | 89.81 |
| tako | | tako | 183 | 1.68 % | 176.79 | 30 | 1.08 % | 130.14 | 29 | 1.47 % | 98.12 | 72 | 1.57 % | 203.88 | 52 | 3.43 % | 333.56 |
| kajti | k | ajti | 174 | 1.60 % | 168.10 | 104 | 3.74 % | 451.15 | 0 | 0 % | 0 | 65 | 1.42 % | 184.06 | 5 | 0.33 % | 32.07 |
| čeprav | če | prav | 160 | 1.47 % | 154.57 | 62 | 2.23 % | 268.95 | 16 | 0.81 % | 54.14 | 66 | 1.44 % | 186.89 | 16 | 1.05 % | 102.63 |
| čeprov | če | prov | 133 | 1.22 % | 128.49 | 23 | 0.83 % | 99.77 | 47 | 2.37 % | 159.03 | 40 | 0.87 % | 113.27 | 23 | 1.51 % | 147.54 |
| kokr | | kokr | 123 | 1.13 % | 118.83 | 15 | 0.54 % | 65.07 | 64 | 3.23 % | 216.55 | 35 | 0.76 % | 99.11 | 9 | 0.59 % | 57.73 |
| dokler | do | kler | 111 | 1.02 % | 107.24 | 19 | 0.68 % | 82.42 | 31 | 1.57 % | 104.89 | 41 | 0.89 % | 116.10 | 20 | 1.32 % | 128.29 |
| vendarle | vend | arle | 106 | 0.97 % | 102.40 | 25 | 0.90 % | 108.45 | 0 | 0 % | 0 | 74 | 1.61 % | 209.54 | 7 | 0.46 % | 44.90 |
| toda | | toda | 87 | 0.80 % | 84.05 | 71 | 2.55 % | 308 | 0 | 0 % | 0 | 15 | 0.33 % | 42.47 | 1 | 0.07 % | 6.41 |
| kadar | k | adar | 84 | 0.77 % | 81.15 | 24 | 0.86 % | 104.11 | 7 | 0.35 % | 23.69 | 37 | 0.81 % | 104.77 | 16 | 1.05 % | 102.63 |
| kukr | | kukr | 81 | 0.74 % | 78.25 | 7 | 0.25 % | 30.37 | 56 | 2.83 % | 189.48 | 14 | 0.30 % | 39.64 | 4 | 0.26 % | 25.66 |
| kakor | k | akor | 72 | 0.66 % | 69.56 | 10 | 0.36 % | 43.38 | 4 | 0.20 % | 13.53 | 51 | 1.11 % | 144.41 | 7 | 0.46 % | 44.90 |
| drgač | d | rgač | 65 | 0.60 % | 62.80 | 14 | 0.50 % | 60.73 | 38 | 1.92 % | 128.58 | 5 | 0.11 % | 14.16 | 8 | 0.53 % | 51.32 |
| drugač | dr | ugač | 63 | 0.58 % | 60.86 | 12 | 0.43 % | 52.06 | 32 | 1.62 % | 108.28 | 9 | 0.20 % | 25.48 | 10 | 0.66 % | 64.15 |
| drugače | dru | gače | 62 | 0.57 % | 59.90 | 11 | 0.40 % | 47.72 | 13 | 0.66 % | 43.99 | 24 | 0.52 % | 67.96 | 14 | 0.92 % | 89.81 |
| kedr | | kedr | 42 | 0.39 % | 40.58 | 4 | 0.14 % | 17.35 | 23 | 1.16 % | 77.82 | 14 | 0.30 % | 39.64 | 1 | 0.07 % | 6.41 |
| zaka | | zaka | 42 | 0.39 % | 40.58 | 9 | 0.32 % | 39.04 | 25 | 1.26 % | 84.59 | 4 | 0.09 % | 11.33 | 4 | 0.26 % | 25.66 |
| odkar | o | dkar | 30 | 0.28 % | 28.98 | 17 | 0.61 % | 73.75 | 7 | 0.35 % | 23.69 | 4 | 0.09 % | 11.33 | 2 | 0.13 % | 12.83 |
| preden | pr | eden | 30 | 0.28 % | 28.98 | 7 | 0.25 % | 30.37 | 7 | 0.35 % | 23.69 | 11 | 0.24 % | 31.15 | 5 | 0.33 % | 32.07 |
| kakorkoli | kakor | koli | 29 | 0.27 % | 28.02 | 10 | 0.36 % | 43.38 | 5 | 0.25 % | 16.92 | 9 | 0.20 % | 25.48 | 5 | 0.33 % | 32.07 |
| bodisi | bo | disi | 27 | 0.25 % | 26.08 | 0 | 0 % | 0 | 0 | 0 % | 0 | 23 | 0.50 % | 65.13 | 4 | 0.26 % | 25.66 |
| mpak | | mpak | 26 | 0.24 % | 25.12 | 1 | 0.04 % | 4.34 | 11 | 0.56 % | 37.22 | 12 | 0.26 % | 33.98 | 2 | 0.13 % | 12.83 |

File at CLARIN.SI
2.2.270 List of final character-level 5-grams from conjunction lower-case word forms in the GOS 1.0 corpus with text-type distribution

GOS1.0-word_parts-conjunctions-lowercase_forms-final-5grams-taxonomy-entire.tsv

| Lower-case word form | Rest of the word | Final part of the word | Total absolute frequency of lower-case word form | Percentage of all found lower-case word forms | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N] |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| ampak | | ampak | 3,444 | 39.88 % | 3,327.19 | 796 | 34.49 % | 3,453.02 | 778 | 53.14 % | 2,632.45 | 1,208 | 33.23 % | 3,420.65 | 662 | 53.87 % | 4,246.50 |
| torej | | torej | 1,221 | 14.14 % | 1,179.59 | 436 | 18.89 % | 1,891.35 | 36 | 2.46 % | 121.81 | 648 | 17.83 % | 1,834.92 | 101 | 8.22 % | 647.88 |
| oziroma | oz | iroma | 881 | 10.20 % | 851.12 | 258 | 11.18 % | 1,119.19 | 104 | 7.10 % | 351.90 | 400 | 11.00 % | 1,132.67 | 119 | 9.68 % | 763.34 |
| sicer | | sicer | 593 | 6.87 % | 572.89 | 195 | 8.45 % | 845.90 | 74 | 5.05 % | 250.39 | 242 | 6.66 % | 685.26 | 82 | 6.67 % | 526 |
| zakaj | | zakaj | 422 | 4.89 % | 407.69 | 70 | 3.03 % | 303.66 | 82 | 5.60 % | 277.46 | 224 | 6.16 % | 634.29 | 46 | 3.74 % | 295.07 |
| vendar | v | endar | 276 | 3.20 % | 266.64 | 84 | 3.64 % | 364.39 | 1 | 0.07 % | 3.38 | 182 | 5.01 % | 515.36 | 9 | 0.73 % | 57.73 |
| namreč | n | amreč | 223 | 2.58 % | 215.44 | 71 | 3.08 % | 308 | 14 | 0.96 % | 47.37 | 124 | 3.41 % | 351.13 | 14 | 1.14 % | 89.81 |
| kajti | | kajti | 174 | 2.02 % | 168.10 | 104 | 4.51 % | 451.15 | 0 | 0 % | 0 | 65 | 1.79 % | 184.06 | 5 | 0.41 % | 32.07 |
| čeprav | č | eprav | 160 | 1.85 % | 154.57 | 62 | 2.69 % | 268.95 | 16 | 1.09 % | 54.14 | 66 | 1.82 % | 186.89 | 16 | 1.30 % | 102.63 |
| čeprov | č | eprov | 133 | 1.54 % | 128.49 | 23 | 1.00 % | 99.77 | 47 | 3.21 % | 159.03 | 40 | 1.10 % | 113.27 | 23 | 1.87 % | 147.54 |
| dokler | d | okler | 111 | 1.28 % | 107.24 | 19 | 0.82 % | 82.42 | 31 | 2.12 % | 104.89 | 41 | 1.13 % | 116.10 | 20 | 1.63 % | 128.29 |
| vendarle | ven | darle | 106 | 1.23 % | 102.40 | 25 | 1.08 % | 108.45 | 0 | 0 % | 0 | 74 | 2.04 % | 209.54 | 7 | 0.57 % | 44.90 |
| kadar | | kadar | 84 | 0.97 % | 81.15 | 24 | 1.04 % | 104.11 | 7 | 0.48 % | 23.69 | 37 | 1.02 % | 104.77 | 16 | 1.30 % | 102.63 |
| kakor | | kakor | 72 | 0.83 % | 69.56 | 10 | 0.43 % | 43.38 | 4 | 0.27 % | 13.53 | 51 | 1.40 % | 144.41 | 7 | 0.57 % | 44.90 |
| drgač | | drgač | 65 | 0.75 % | 62.80 | 14 | 0.61 % | 60.73 | 38 | 2.60 % | 128.58 | 5 | 0.14 % | 14.16 | 8 | 0.65 % | 51.32 |
| drugač | d | rugač | 63 | 0.73 % | 60.86 | 12 | 0.52 % | 52.06 | 32 | 2.19 % | 108.28 | 9 | 0.25 % | 25.48 | 10 | 0.81 % | 64.15 |
| drugače | dr | ugače | 62 | 0.72 % | 59.90 | 11 | 0.48 % | 47.72 | 13 | 0.89 % | 43.99 | 24 | 0.66 % | 67.96 | 14 | 1.14 % | 89.81 |
| odkar | | odkar | 30 | 0.35 % | 28.98 | 17 | 0.74 % | 73.75 | 7 | 0.48 % | 23.69 | 4 | 0.11 % | 11.33 | 2 | 0.16 % | 12.83 |
| preden | p | reden | 30 | 0.35 % | 28.98 | 7 | 0.30 % | 30.37 | 7 | 0.48 % | 23.69 | 11 | 0.30 % | 31.15 | 5 | 0.41 % | 32.07 |
| kakorkoli | kako | rkoli | 29 | 0.34 % | 28.02 | 10 | 0.43 % | 43.38 | 5 | 0.34 % | 16.92 | 9 | 0.25 % | 25.48 | 5 | 0.41 % | 32.07 |
| bodisi | b | odisi | 27 | 0.31 % | 26.08 | 0 | 0 % | 0 | 0 | 0 % | 0 | 23 | 0.63 % | 65.13 | 4 | 0.33 % | 25.66 |
| koker | | koker | 25 | 0.29 % | 24.15 | 6 | 0.26 % | 26.03 | 6 | 0.41 % | 20.30 | 4 | 0.11 % | 11.33 | 9 | 0.73 % | 57.73 |
| temveč | t | emveč | 25 | 0.29 % | 24.15 | 2 | 0.09 % | 8.68 | 0 | 0 % | 0 | 23 | 0.63 % | 65.13 | 0 | 0 % | 0 |
| kamor | | kamor | 21 | 0.24 % | 20.29 | 7 | 0.30 % | 30.37 | 2 | 0.14 % | 6.77 | 11 | 0.30 % | 31.15 | 1 | 0.08 % | 6.41 |
| zakva | | zakva | 21 | 0.24 % | 20.29 | 1 | 0.04 % | 4.34 | 16 | 1.09 % | 54.14 | 0 | 0 % | 0 | 4 | 0.33 % | 25.66 |
| kjerkoli | kje | rkoli | 17 | 0.20 % | 16.42 | 2 | 0.09 % | 8.68 | 3 | 0.20 % | 10.15 | 9 | 0.25 % | 25.48 | 3 | 0.24 % | 19.24 |
| ozirma | o | zirma | 13 | 0.15 % | 12.56 | 0 | 0 % | 0 | 0 | 0 % | 0 | 12 | 0.33 % | 33.98 | 1 | 0.08 % | 6.41 |
| četudi | č | etudi | 13 | 0.15 % | 12.56 | 2 | 0.09 % | 8.68 | 1 | 0.07 % | 3.38 | 10 | 0.28 % | 28.32 | 0 | 0 % | 0 |
| keder | | keder | 11 | 0.13 % | 10.63 | 1 | 0.04 % | 4.34 | 9 | 0.61 % | 30.45 | 1 | 0.03 % | 2.83 | 0 | 0 % | 0 |
| kuker | | kuker | 11 | 0.13 % | 10.63 | 0 | 0 % | 0 | 9 | 0.61 % | 30.45 | 2 | 0.06 % | 5.66 | 0 | 0 % | 0 |
| magari | m | agari | 11 | 0.13 % | 10.63 | 0 | 0 % | 0 | 9 | 0.61 % | 30.45 | 0 | 0 % | 0 | 2 | 0.16 % | 12.83 |
| kamorkoli | kamo | rkoli | 10 | 0.12 % | 9.66 | 3 | 0.13 % | 13.01 | 3 | 0.20 % | 10.15 | 2 | 0.06 % | 5.66 | 2 | 0.16 % | 12.83 |

File at CLARIN.SI
2.2.271 List of initial character-level
1-grams from particle lemmas in the GOS 1.0 corpus with text-type distributionGOS1.0-word_parts-particles-lemmas-initial- 1grams-taxonomy-entire.tsvLemma Lemma (lower-case) Initial part of the word Rest of the word Total absolute frequency of lemma Percentage of all found lemmas Total relative frequency (per million occurrences) Absolute frequency [gos.T.J.R] Percentage [gos.T.J.R] Relative frequency [gos.T.J.R] Absolute frequency [gos.T.N.Z] Percentage [gos.T.N.Z] Relative frequency [gos.T.N.Z] Absolute frequency [gos.T.J.I] Percentage [gos.T.J.I] Relative frequency [gos.T.J.I] Absolute frequency [gos.T.N.N] Percentage [gos.T.N.N] Relative frequency [gos.T.N.N] ne ne n e 31,737 31.64 % 30,660.60 6,439 29.97 % 27,932.14 11,588 32.66 % 39,209.32 7,780 30.90 % 22,030.36 5,930 32.62 % 38,038.91 ja ja j a 25,555 25.47 % 24,688.27 4,720 21.97 % 20,475.18 11,541 32.53 % 39,050.29 3,821 15.18 % 10,819.80 5,473 30.10 % 35,107.41 tudi tudi t udi 7,945 7.92 % 7,675.53 2,076 9.66 % 9,005.61 1,644 4.63 % 5,562.66 2,890 11.48 % 8,183.51 1,335 7.34 % 8,563.57 še še š e 7,184 7.16 % 6,940.35 1,782 8.29 % 7,730.25 2,116 5.96 % 7,159.73 2,127 8.45 % 6,022.95 1,159 6.38 % 7,434.59 no no n o 4,700 4.68 % 4,540.59 1,298 6.04 % 5,630.67 1,566 4.41 % 5,298.74 1,114 4.42 % 3,154.48 722 3.97 % 4,631.38 že že ž e 4,447 4.43 % 4,296.17 1,071 4.98 % 4,645.96 1,498 4.22 % 5,068.65 1,191 4.73 % 3,372.51 687 3.78 % 4,406.87 pač pač p ač 2,851 2.84 % 2,754.30 402 1.87 % 1,743.86 960 2.71 % 3,248.27 830 3.30 % 2,350.28 659 3.62 % 4,227.26 seveda seveda s eveda 1,795 1.79 % 1,734.12 543 2.53 % 2,355.51 188 0.53 % 636.12 896 3.56 % 2,537.17 168 0.92 % 1,077.66 res res r es 1,699 1.69 % 1,641.38 467 2.17 % 2,025.83 577 1.63 % 1,952.35 360 1.43 % 1,019.40 295 1.62 % 1,892.32 samo samo s amo 1,469 1.46 % 1,419.18 254 1.18 % 1,101.84 565 1.59 % 1,911.74 419 1.66 % 1,186.47 231 1.27 % 1,481.79 več več v eč 1,332 1.33 % 1,286.82 287 1.34 % 1,245 442 1.25 % 1,495.56 445 1.77 % 1,260.09 158 0.87 % 1,013.52 
sploh sploh s ploh 895 0.89 % 864.64 161 0.75 % 698.41 354 1.00 % 1,197.80 219 0.87 % 620.13 161 0.89 % 1,032.76 kar kar k ar 793 0.79 % 766.10 183 0.85 % 793.85 160 0.45 % 541.38 336 1.33 % 951.44 114 0.63 % 731.27 naj naj n aj 792 0.79 % 765.14 180 0.84 % 780.83 179 0.51 % 605.67 331 1.31 % 937.28 102 0.56 % 654.29 okej okej o kej 656 0.65 % 633.75 191 0.89 % 828.55 175 0.49 % 592.13 96 0.38 % 271.84 194 1.07 % 1,244.44 pravzaprav pravzaprav p ravzaprav 579 0.58 % 559.36 73 0.34 % 316.67 18 0.05 % 60.91 439 1.74 % 1,243.10 49 0.27 % 314.32 glih glih g lih 542 0.54 % 523.62 74 0.34 % 321.01 373 1.05 % 1,262.09 36 0.14 % 101.94 59 0.33 % 378.46 itak itak i tak 529 0.53 % 511.06 50 0.23 % 216.90 391 1.10 % 1,322.99 29 0.12 % 82.12 59 0.33 % 378.46 torej torej t orej 474 0.47 % 457.92 133 0.62 % 576.95 9 0.03 % 30.45 297 1.18 % 841 35 0.19 % 224.51 vsaj vsaj v saj 448 0.45 % 432.81 113 0.53 % 490.19 119 0.34 % 402.65 128 0.51 % 362.45 88 0.48 % 564.49 da da d a 354 0.35 % 341.99 51 0.24 % 221.24 162 0.46 % 548.15 91 0.36 % 257.68 50 0.28 % 320.73 predvsem predvsem p redvsem 335 0.33 % 323.64 107 0.50 % 464.16 10 0.03 % 33.84 187 0.74 % 529.52 31 0.17 % 198.85 morda morda m orda 333 0.33 % 321.71 138 0.64 % 598.64 27 0.08 % 91.36 151 0.60 % 427.58 17 0.09 % 109.05 skoraj skoraj s koraj 295 0.29 % 284.99 76 0.35 % 329.69 100 0.28 % 338.36 83 0.33 % 235.03 36 0.20 % 230.93 prav prav p rav 249 0.25 % 240.55 50 0.23 % 216.90 87 0.24 % 294.37 84 0.33 % 237.86 28 0.15 % 179.61 le le l e 240 0.24 % 231.86 98 0.46 % 425.12 33 0.09 % 111.66 91 0.36 % 257.68 18 0.10 % 115.46 najbrž najbrž n ajbrž 217 0.22 % 209.64 44 0.20 % 190.87 63 0.18 % 213.17 64 0.25 % 181.23 46 0.25 % 295.07 mogoče mogoče m ogoče 184 0.18 % 177.76 39 0.18 % 169.18 36 0.10 % 121.81 60 0.24 % 169.90 49 0.27 % 314.32 celo celo c elo 173 0.17 % 167.13 47 0.22 % 203.88 32 0.09 % 108.28 75 0.30 % 212.37 19 0.10 % 121.88 menda menda m enda 149 0.15 % 143.95 12 0.06 % 52.06 96 0.27 % 324.83 28 0.11 % 79.29 13 
0.07 % 83.39 šele šele š ele 130 0.13 % 125.59 29 0.14 % 125.80 46 0.13 % 155.65 39 0.15 % 110.43 16 0.09 % 102.63 baje baje b aje 124 0.12 % 119.79 35 0.16 % 151.83 72 0.20 % 243.62 4 0.02 % 11.33 13 0.07 % 83.39 CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 615 File at CLARIN.SI2.2.272 List of initial character-level 2-grams from particle lemmas in the GOS 1.0 corpus with text-type distributionGOS1.0-word_parts-particles-lemmas-initial- 2grams-taxonomy-entire.tsvLemma Lemma (lower-case) Initial part of the word Rest of the word Total absolute frequency of lemma Percentage of all found lemmas Total relative frequency (per million occurrences) Absolute frequency [gos.T.J.R] Percentage [gos.T.J.R] Relative frequency [gos.T.J.R] Absolute frequency [gos.T.N.Z] Percentage [gos.T.N.Z] Relative frequency [gos.T.N.Z] Absolute frequency [gos.T.J.I] Percentage [gos.T.J.I] Relative frequency [gos.T.J.I] Absolute frequency [gos.T.N.N] Percentage [gos.T.N.N] Relative frequency [gos.T.N.N] ne ne ne 31,737 31.64 % 30,660.60 6,439 29.97 % 27,932.14 11,588 32.66 % 39,209.32 7,780 30.90 % 22,030.36 5,930 32.62 % 38,038.91 ja ja ja 25,555 25.47 % 24,688.27 4,720 21.97 % 20,475.18 11,541 32.53 % 39,050.29 3,821 15.18 % 10,819.80 5,473 30.10 % 35,107.41 tudi tudi tu di 7,945 7.92 % 7,675.53 2,076 9.66 % 9,005.61 1,644 4.63 % 5,562.66 2,890 11.48 % 8,183.51 1,335 7.34 % 8,563.57 še še še 7,184 7.16 % 6,940.35 1,782 8.29 % 7,730.25 2,116 5.96 % 7,159.73 2,127 8.45 % 6,022.95 1,159 6.38 % 7,434.59 no no no 4,700 4.68 % 4,540.59 1,298 6.04 % 5,630.67 1,566 4.41 % 5,298.74 1,114 4.42 % 3,154.48 722 3.97 % 4,631.38 že že že 4,447 4.43 % 4,296.17 1,071 4.98 % 4,645.96 1,498 4.22 % 5,068.65 1,191 4.73 % 3,372.51 687 3.78 % 4,406.87 pač pač pa č 2,851 2.84 % 2,754.30 402 1.87 % 1,743.86 960 2.71 % 3,248.27 830 3.30 % 2,350.28 659 3.62 % 4,227.26 seveda seveda se veda 1,795 1.79 % 1,734.12 543 2.53 % 2,355.51 188 0.53 % 636.12 896 3.56 % 2,537.17 168 0.92 % 
1,077.66 res res re s 1,699 1.69 % 1,641.38 467 2.17 % 2,025.83 577 1.63 % 1,952.35 360 1.43 % 1,019.40 295 1.62 % 1,892.32 samo samo sa mo 1,469 1.46 % 1,419.18 254 1.18 % 1,101.84 565 1.59 % 1,911.74 419 1.66 % 1,186.47 231 1.27 % 1,481.79 več več ve č 1,332 1.33 % 1,286.82 287 1.34 % 1,245 442 1.25 % 1,495.56 445 1.77 % 1,260.09 158 0.87 % 1,013.52 sploh sploh sp loh 895 0.89 % 864.64 161 0.75 % 698.41 354 1.00 % 1,197.80 219 0.87 % 620.13 161 0.89 % 1,032.76 kar kar ka r 793 0.79 % 766.10 183 0.85 % 793.85 160 0.45 % 541.38 336 1.33 % 951.44 114 0.63 % 731.27 naj naj na j 792 0.79 % 765.14 180 0.84 % 780.83 179 0.51 % 605.67 331 1.31 % 937.28 102 0.56 % 654.29 okej okej ok ej 656 0.65 % 633.75 191 0.89 % 828.55 175 0.49 % 592.13 96 0.38 % 271.84 194 1.07 % 1,244.44 pravzaprav pravzaprav pr avzaprav 579 0.58 % 559.36 73 0.34 % 316.67 18 0.05 % 60.91 439 1.74 % 1,243.10 49 0.27 % 314.32 glih glih gl ih 542 0.54 % 523.62 74 0.34 % 321.01 373 1.05 % 1,262.09 36 0.14 % 101.94 59 0.33 % 378.46 itak itak it ak 529 0.53 % 511.06 50 0.23 % 216.90 391 1.10 % 1,322.99 29 0.12 % 82.12 59 0.33 % 378.46 torej torej to rej 474 0.47 % 457.92 133 0.62 % 576.95 9 0.03 % 30.45 297 1.18 % 841 35 0.19 % 224.51 vsaj vsaj vs aj 448 0.45 % 432.81 113 0.53 % 490.19 119 0.34 % 402.65 128 0.51 % 362.45 88 0.48 % 564.49 da da da 354 0.35 % 341.99 51 0.24 % 221.24 162 0.46 % 548.15 91 0.36 % 257.68 50 0.28 % 320.73 predvsem predvsem pr edvsem 335 0.33 % 323.64 107 0.50 % 464.16 10 0.03 % 33.84 187 0.74 % 529.52 31 0.17 % 198.85 morda morda mo rda 333 0.33 % 321.71 138 0.64 % 598.64 27 0.08 % 91.36 151 0.60 % 427.58 17 0.09 % 109.05 skoraj skoraj sk oraj 295 0.29 % 284.99 76 0.35 % 329.69 100 0.28 % 338.36 83 0.33 % 235.03 36 0.20 % 230.93 prav prav pr av 249 0.25 % 240.55 50 0.23 % 216.90 87 0.24 % 294.37 84 0.33 % 237.86 28 0.15 % 179.61 le le le 240 0.24 % 231.86 98 0.46 % 425.12 33 0.09 % 111.66 91 0.36 % 257.68 18 0.10 % 115.46 najbrž najbrž na jbrž 217 0.22 % 209.64 44 0.20 % 190.87 
63 0.18 % 213.17 64 0.25 % 181.23 46 0.25 % 295.07 mogoče mogoče mo goče 184 0.18 % 177.76 39 0.18 % 169.18 36 0.10 % 121.81 60 0.24 % 169.90 49 0.27 % 314.32 celo celo ce lo 173 0.17 % 167.13 47 0.22 % 203.88 32 0.09 % 108.28 75 0.30 % 212.37 19 0.10 % 121.88 menda menda me nda 149 0.15 % 143.95 12 0.06 % 52.06 96 0.27 % 324.83 28 0.11 % 79.29 13 0.07 % 83.39 šele šele še le 130 0.13 % 125.59 29 0.14 % 125.80 46 0.13 % 155.65 39 0.15 % 110.43 16 0.09 % 102.63 baje baje ba je 124 0.12 % 119.79 35 0.16 % 151.83 72 0.20 % 243.62 4 0.02 % 11.33 13 0.07 % 83.39 CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 616 File at CLARIN.SI2.2.273 List of initial character-level 3-grams from particle lemmas in the GOS 1.0 corpus with text-type distributionGOS1.0-word_parts-particles-lemmas-initial- 3grams-taxonomy-entire.tsvLemma Lemma (lower-case) Initial part of the word Rest of the word Total absolute frequency of lemma Percentage of all found lemmas Total relative frequency (per million occurrences) Absolute frequency [gos.T.J.R] Percentage [gos.T.J.R] Relative frequency [gos.T.J.R] Absolute frequency [gos.T.N.Z] Percentage [gos.T.N.Z] Relative frequency [gos.T.N.Z] Absolute frequency [gos.T.J.I] Percentage [gos.T.J.I] Relative frequency [gos.T.J.I] Absolute frequency [gos.T.N.N] Percentage [gos.T.N.N] Relative frequency [gos.T.N.N] tudi tudi tud i 7,945 30.45 % 7,675.53 2,076 34.45 % 9,005.61 1,644 23.58 % 5,562.66 2,890 32.28 % 8,183.51 1,335 32.26 % 8,563.57 pač pač pač 2,851 10.93 % 2,754.30 402 6.67 % 1,743.86 960 13.77 % 3,248.27 830 9.27 % 2,350.28 659 15.93 % 4,227.26 seveda seveda sev eda 1,795 6.88 % 1,734.12 543 9.01 % 2,355.51 188 2.70 % 636.12 896 10.01 % 2,537.17 168 4.06 % 1,077.66 res res res 1,699 6.51 % 1,641.38 467 7.75 % 2,025.83 577 8.28 % 1,952.35 360 4.02 % 1,019.40 295 7.13 % 1,892.32 samo samo sam o 1,469 5.63 % 1,419.18 254 4.21 % 1,101.84 565 8.10 % 1,911.74 419 4.68 % 1,186.47 231 5.58 % 1,481.79 več več več 1,332 
5.11 % 1,286.82 287 4.76 % 1,245 442 6.34 % 1,495.56 445 4.97 % 1,260.09 158 3.82 % 1,013.52 sploh sploh spl oh 895 3.43 % 864.64 161 2.67 % 698.41 354 5.08 % 1,197.80 219 2.45 % 620.13 161 3.89 % 1,032.76 kar kar kar 793 3.04 % 766.10 183 3.04 % 793.85 160 2.29 % 541.38 336 3.75 % 951.44 114 2.75 % 731.27 naj naj naj 792 3.04 % 765.14 180 2.99 % 780.83 179 2.57 % 605.67 331 3.70 % 937.28 102 2.46 % 654.29 okej okej oke j 656 2.51 % 633.75 191 3.17 % 828.55 175 2.51 % 592.13 96 1.07 % 271.84 194 4.69 % 1,244.44 pravzaprav pravzaprav pra vzaprav 579 2.22 % 559.36 73 1.21 % 316.67 18 0.26 % 60.91 439 4.90 % 1,243.10 49 1.18 % 314.32 glih glih gli h 542 2.08 % 523.62 74 1.23 % 321.01 373 5.35 % 1,262.09 36 0.40 % 101.94 59 1.43 % 378.46 itak itak ita k 529 2.03 % 511.06 50 0.83 % 216.90 391 5.61 % 1,322.99 29 0.32 % 82.12 59 1.43 % 378.46 torej torej tor ej 474 1.82 % 457.92 133 2.21 % 576.95 9 0.13 % 30.45 297 3.32 % 841 35 0.85 % 224.51 vsaj vsaj vsa j 448 1.72 % 432.81 113 1.88 % 490.19 119 1.71 % 402.65 128 1.43 % 362.45 88 2.13 % 564.49 predvsem predvsem pre dvsem 335 1.28 % 323.64 107 1.78 % 464.16 10 0.14 % 33.84 187 2.09 % 529.52 31 0.75 % 198.85 morda morda mor da 333 1.28 % 321.71 138 2.29 % 598.64 27 0.39 % 91.36 151 1.69 % 427.58 17 0.41 % 109.05 skoraj skoraj sko raj 295 1.13 % 284.99 76 1.26 % 329.69 100 1.43 % 338.36 83 0.93 % 235.03 36 0.87 % 230.93 prav prav pra v 249 0.95 % 240.55 50 0.83 % 216.90 87 1.25 % 294.37 84 0.94 % 237.86 28 0.68 % 179.61 najbrž najbrž naj brž 217 0.83 % 209.64 44 0.73 % 190.87 63 0.90 % 213.17 64 0.71 % 181.23 46 1.11 % 295.07 mogoče mogoče mog oče 184 0.70 % 177.76 39 0.65 % 169.18 36 0.52 % 121.81 60 0.67 % 169.90 49 1.18 % 314.32 celo celo cel o 173 0.66 % 167.13 47 0.78 % 203.88 32 0.46 % 108.28 75 0.84 % 212.37 19 0.46 % 121.88 menda menda men da 149 0.57 % 143.95 12 0.20 % 52.06 96 1.38 % 324.83 28 0.31 % 79.29 13 0.31 % 83.39 šele šele šel e 130 0.50 % 125.59 29 0.48 % 125.80 46 0.66 % 155.65 39 0.44 % 110.43 16 0.39 
% 102.63 baje baje baj e 124 0.47 % 119.79 35 0.58 % 151.83 72 1.03 % 243.62 4 0.04 % 11.33 13 0.31 % 83.39 valjda valjda val jda 116 0.45 % 112.07 24 0.40 % 104.11 70 1.00 % 236.85 5 0.06 % 14.16 17 0.41 % 109.05 žal žal žal 92 0.35 % 88.88 34 0.56 % 147.49 15 0.21 % 50.75 31 0.35 % 87.78 12 0.29 % 76.98 potem potem pot em 87 0.33 % 84.05 19 0.32 % 82.42 16 0.23 % 54.14 33 0.37 % 93.44 19 0.46 % 121.88 ravno ravno rav no 87 0.33 % 84.05 20 0.33 % 86.76 10 0.14 % 33.84 44 0.49 % 124.59 13 0.31 % 83.39 komaj komaj kom aj 79 0.30 % 76.32 22 0.36 % 95.44 41 0.59 % 138.73 7 0.08 % 19.82 9 0.22 % 57.73 verjetno verjetno ver jetno 69 0.26 % 66.66 7 0.12 % 30.37 11 0.16 % 37.22 37 0.41 % 104.77 14 0.34 % 89.81 morebiti morebiti mor ebiti 49 0.19 % 47.34 2 0.03 % 8.68 17 0.24 % 57.52 18 0.20 % 50.97 12 0.29 % 76.98 CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 617 File at CLARIN.SI2.2.274 List of initial character-level 4-grams from particle lemmas in the GOS 1.0 corpus with text-type distributionGOS1.0-word_parts-particles-lemmas-initial- 4grams-taxonomy-entire.tsvLemma Lemma (lower-case) Initial part of the word Rest of the word Total absolute frequency of lemma Percentage of all found lemmas Total relative frequency (per million occurrences) Absolute frequency [gos.T.J.R] Percentage [gos.T.J.R] Relative frequency [gos.T.J.R] Absolute frequency [gos.T.N.Z] Percentage [gos.T.N.Z] Relative frequency [gos.T.N.Z] Absolute frequency [gos.T.J.I] Percentage [gos.T.J.I] Relative frequency [gos.T.J.I] Absolute frequency [gos.T.N.N] Percentage [gos.T.N.N] Relative frequency [gos.T.N.N] tudi tudi tudi 7,945 42.89 % 7,675.53 2,076 46.42 % 9,005.61 1,644 35.46 % 5,562.66 2,890 43.67 % 8,183.51 1,335 47.71 % 8,563.57 seveda seveda seve da 1,795 9.69 % 1,734.12 543 12.14 % 2,355.51 188 4.05 % 636.12 896 13.54 % 2,537.17 168 6.00 % 1,077.66 samo samo samo 1,469 7.93 % 1,419.18 254 5.68 % 1,101.84 565 12.19 % 1,911.74 419 6.33 % 1,186.47 231 8.26 % 
1,481.79 sploh sploh splo h 895 4.83 % 864.64 161 3.60 % 698.41 354 7.64 % 1,197.80 219 3.31 % 620.13 161 5.75 % 1,032.76 okej okej okej 656 3.54 % 633.75 191 4.27 % 828.55 175 3.77 % 592.13 96 1.45 % 271.84 194 6.93 % 1,244.44 pravzaprav pravzaprav prav zaprav 579 3.13 % 559.36 73 1.63 % 316.67 18 0.39 % 60.91 439 6.63 % 1,243.10 49 1.75 % 314.32 glih glih glih 542 2.93 % 523.62 74 1.66 % 321.01 373 8.05 % 1,262.09 36 0.54 % 101.94 59 2.11 % 378.46 itak itak itak 529 2.86 % 511.06 50 1.12 % 216.90 391 8.43 % 1,322.99 29 0.44 % 82.12 59 2.11 % 378.46 torej torej tore j 474 2.56 % 457.92 133 2.97 % 576.95 9 0.19 % 30.45 297 4.49 % 841 35 1.25 % 224.51 vsaj vsaj vsaj 448 2.42 % 432.81 113 2.53 % 490.19 119 2.57 % 402.65 128 1.93 % 362.45 88 3.15 % 564.49 predvsem predvsem pred vsem 335 1.81 % 323.64 107 2.39 % 464.16 10 0.22 % 33.84 187 2.83 % 529.52 31 1.11 % 198.85 morda morda mord a 333 1.80 % 321.71 138 3.09 % 598.64 27 0.58 % 91.36 151 2.28 % 427.58 17 0.61 % 109.05 skoraj skoraj skor aj 295 1.59 % 284.99 76 1.70 % 329.69 100 2.16 % 338.36 83 1.25 % 235.03 36 1.29 % 230.93 prav prav prav 249 1.34 % 240.55 50 1.12 % 216.90 87 1.88 % 294.37 84 1.27 % 237.86 28 1.00 % 179.61 najbrž najbrž najb rž 217 1.17 % 209.64 44 0.98 % 190.87 63 1.36 % 213.17 64 0.97 % 181.23 46 1.64 % 295.07 mogoče mogoče mogo če 184 0.99 % 177.76 39 0.87 % 169.18 36 0.78 % 121.81 60 0.91 % 169.90 49 1.75 % 314.32 celo celo celo 173 0.93 % 167.13 47 1.05 % 203.88 32 0.69 % 108.28 75 1.13 % 212.37 19 0.68 % 121.88 menda menda mend a 149 0.80 % 143.95 12 0.27 % 52.06 96 2.07 % 324.83 28 0.42 % 79.29 13 0.47 % 83.39 šele šele šele 130 0.70 % 125.59 29 0.65 % 125.80 46 0.99 % 155.65 39 0.59 % 110.43 16 0.57 % 102.63 baje baje baje 124 0.67 % 119.79 35 0.78 % 151.83 72 1.55 % 243.62 4 0.06 % 11.33 13 0.47 % 83.39 valjda valjda valj da 116 0.63 % 112.07 24 0.54 % 104.11 70 1.51 % 236.85 5 0.08 % 14.16 17 0.61 % 109.05 potem potem pote m 87 0.47 % 84.05 19 0.42 % 82.42 16 0.34 % 54.14 33 0.50 % 
93.44 19 0.68 % 121.88 ravno ravno ravn o 87 0.47 % 84.05 20 0.45 % 86.76 10 0.22 % 33.84 44 0.67 % 124.59 13 0.47 % 83.39 komaj komaj koma j 79 0.43 % 76.32 22 0.49 % 95.44 41 0.88 % 138.73 7 0.11 % 19.82 9 0.32 % 57.73 verjetno verjetno verj etno 69 0.37 % 66.66 7 0.16 % 30.37 11 0.24 % 37.22 37 0.56 % 104.77 14 0.50 % 89.81 morebiti morebiti more biti 49 0.27 % 47.34 2 0.04 % 8.68 17 0.37 % 57.52 18 0.27 % 50.97 12 0.43 % 76.98 zgolj zgolj zgol j 45 0.24 % 43.47 11 0.25 % 47.72 3 0.07 % 10.15 30 0.45 % 84.95 1 0.04 % 6.41 vendarle vendarle vend arle 44 0.24 % 42.51 4 0.09 % 17.35 0 0 % 0 39 0.59 % 110.43 1 0.04 % 6.41 skorajda skorajda skor ajda 42 0.23 % 40.58 15 0.34 % 65.07 1 0.02 % 3.38 25 0.38 % 70.79 1 0.04 % 6.41 edino edino edin o 30 0.16 % 28.98 6 0.13 % 26.03 13 0.28 % 43.99 4 0.06 % 11.33 7 0.25 % 44.90 vsekakor vsekakor vsek akor 28 0.15 % 27.05 8 0.18 % 34.70 0 0 % 0 18 0.27 % 50.97 2 0.07 % 12.83 minus minus minu s 22 0.12 % 21.25 0 0 % 0 0 0 % 0 10 0.15 % 28.32 12 0.43 % 76.98 CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 618 File at CLARIN.SI2.2.275 List of initial character-level 5-grams from particle lemmas in the GOS 1.0 corpus with text-type distributionGOS1.0-word_parts-particles-lemmas-initial- 5grams-taxonomy-entire.tsvLemma Lemma (lower-case) Initial part of the word Rest of the word Total absolute frequency of lemma Percentage of all found lemmas Total relative frequency (per million occurrences) Absolute frequency [gos.T.J.R] Percentage [gos.T.J.R] Relative frequency [gos.T.J.R] Absolute frequency [gos.T.N.Z] Percentage [gos.T.N.Z] Relative frequency [gos.T.N.Z] Absolute frequency [gos.T.J.I] Percentage [gos.T.J.I] Relative frequency [gos.T.J.I] Absolute frequency [gos.T.N.N] Percentage [gos.T.N.N] Relative frequency [gos.T.N.N] seveda seveda seved a 1,795 29.02 % 1,734.12 543 35.19 % 2,355.51 188 16.94 % 636.12 896 32.07 % 2,537.17 168 22.76 % 1,077.66 sploh sploh sploh 895 14.47 % 864.64 161 10.43 % 
698.41 354 31.89 % 1,197.80 219 7.84 % 620.13 161 21.82 % 1,032.76 pravzaprav pravzaprav pravz aprav 579 9.36 % 559.36 73 4.73 % 316.67 18 1.62 % 60.91 439 15.71 % 1,243.10 49 6.64 % 314.32 torej torej torej 474 7.66 % 457.92 133 8.62 % 576.95 9 0.81 % 30.45 297 10.63 % 841 35 4.74 % 224.51 predvsem predvsem predv sem 335 5.42 % 323.64 107 6.93 % 464.16 10 0.90 % 33.84 187 6.69 % 529.52 31 4.20 % 198.85 morda morda morda 333 5.38 % 321.71 138 8.94 % 598.64 27 2.43 % 91.36 151 5.40 % 427.58 17 2.30 % 109.05 skoraj skoraj skora j 295 4.77 % 284.99 76 4.92 % 329.69 100 9.01 % 338.36 83 2.97 % 235.03 36 4.88 % 230.93 najbrž najbrž najbr ž 217 3.51 % 209.64 44 2.85 % 190.87 63 5.68 % 213.17 64 2.29 % 181.23 46 6.23 % 295.07 mogoče mogoče mogoč e 184 2.98 % 177.76 39 2.53 % 169.18 36 3.24 % 121.81 60 2.15 % 169.90 49 6.64 % 314.32 menda menda menda 149 2.41 % 143.95 12 0.78 % 52.06 96 8.65 % 324.83 28 1.00 % 79.29 13 1.76 % 83.39 valjda valjda valjd a 116 1.88 % 112.07 24 1.55 % 104.11 70 6.31 % 236.85 5 0.18 % 14.16 17 2.30 % 109.05 potem potem potem 87 1.41 % 84.05 19 1.23 % 82.42 16 1.44 % 54.14 33 1.18 % 93.44 19 2.58 % 121.88 ravno ravno ravno 87 1.41 % 84.05 20 1.30 % 86.76 10 0.90 % 33.84 44 1.57 % 124.59 13 1.76 % 83.39 komaj komaj komaj 79 1.28 % 76.32 22 1.43 % 95.44 41 3.69 % 138.73 7 0.25 % 19.82 9 1.22 % 57.73 verjetno verjetno verje tno 69 1.12 % 66.66 7 0.45 % 30.37 11 0.99 % 37.22 37 1.32 % 104.77 14 1.90 % 89.81 morebiti morebiti moreb iti 49 0.79 % 47.34 2 0.13 % 8.68 17 1.53 % 57.52 18 0.64 % 50.97 12 1.63 % 76.98 zgolj zgolj zgolj 45 0.73 % 43.47 11 0.71 % 47.72 3 0.27 % 10.15 30 1.07 % 84.95 1 0.14 % 6.41 vendarle vendarle venda rle 44 0.71 % 42.51 4 0.26 % 17.35 0 0 % 0 39 1.40 % 110.43 1 0.14 % 6.41 skorajda skorajda skora jda 42 0.68 % 40.58 15 0.97 % 65.07 1 0.09 % 3.38 25 0.90 % 70.79 1 0.14 % 6.41 edino edino edino 30 0.48 % 28.98 6 0.39 % 26.03 13 1.17 % 43.99 4 0.14 % 11.33 7 0.95 % 44.90 vsekakor vsekakor vseka kor 28 0.45 % 27.05 8 0.52 % 
34.70 0 0 % 0 18 0.64 % 50.97 2 0.27 % 12.83 minus minus minus 22 0.36 % 21.25 0 0 % 0 0 0 % 0 10 0.36 % 28.32 12 1.63 % 76.98 največ največ najve č 21 0.34 % 20.29 7 0.45 % 30.37 0 0 % 0 12 0.43 % 33.98 2 0.27 % 12.83 bržkone bržkone bržko ne 20 0.32 % 19.32 20 1.30 % 86.76 0 0 % 0 0 0 % 0 0 0 % 0 natanko natanko natan ko 19 0.31 % 18.36 10 0.65 % 43.38 0 0 % 0 9 0.32 % 25.48 0 0 % 0 bojda bojda bojda 16 0.26 % 15.46 8 0.52 % 34.70 8 0.72 % 27.07 0 0 % 0 0 0 % 0 najmanj najmanj najma nj 16 0.26 % 15.46 2 0.13 % 8.68 2 0.18 % 6.77 10 0.36 % 28.32 2 0.27 % 12.83 približno približno pribl ižno 14 0.23 % 13.53 7 0.45 % 30.37 0 0 % 0 6 0.21 % 16.99 1 0.14 % 6.41 končno končno končn o 13 0.21 % 12.56 3 0.19 % 13.01 1 0.09 % 3.38 9 0.32 % 25.48 0 0 % 0 razen razen razen 13 0.21 % 12.56 2 0.13 % 8.68 5 0.45 % 16.92 4 0.14 % 11.33 2 0.27 % 12.83 kvečjemu kvečjemu kvečj emu 12 0.19 % 11.59 1 0.07 % 4.34 2 0.18 % 6.77 5 0.18 % 14.16 4 0.54 % 25.66 okrog okrog okrog 11 0.18 % 10.63 1 0.07 % 4.34 0 0 % 0 5 0.18 % 14.16 5 0.68 % 32.07 CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 619 File at CLARIN.SI2.2.276 List of final character-level 1-grams from particle lemmas in the GOS 1.0 corpus with text-type distributionGOS1.0-word_parts-particles-lemmas-final- 1grams-taxonomy-entire.tsvLemma Lemma (lower-case) Rest of the word Final part of the word Total absolute frequency of lemma Percentage of all found lemmas Total relative frequency (per million occurrences) Absolute frequency [gos.T.J.R] Percentage [gos.T.J.R] Relative frequency [gos.T.J.R] Absolute frequency [gos.T.N.Z] Percentage [gos.T.N.Z] Relative frequency [gos.T.N.Z] Absolute frequency [gos.T.J.I] Percentage [gos.T.J.I] Relative frequency [gos.T.J.I] Absolute frequency [gos.T.N.N] Percentage [gos.T.N.N] Relative frequency [gos.T.N.N] ne ne n e 31,737 31.64 % 30,660.60 6,439 29.97 % 27,932.14 11,588 32.66 % 39,209.32 7,780 30.90 % 22,030.36 5,930 32.62 % 38,038.91 ja ja j a 25,555 25.47 
% 24,688.27 4,720 21.97 % 20,475.18 11,541 32.53 % 39,050.29 3,821 15.18 % 10,819.80 5,473 30.10 % 35,107.41 tudi tudi tud i 7,945 7.92 % 7,675.53 2,076 9.66 % 9,005.61 1,644 4.63 % 5,562.66 2,890 11.48 % 8,183.51 1,335 7.34 % 8,563.57 še še š e 7,184 7.16 % 6,940.35 1,782 8.29 % 7,730.25 2,116 5.96 % 7,159.73 2,127 8.45 % 6,022.95 1,159 6.38 % 7,434.59 no no n o 4,700 4.68 % 4,540.59 1,298 6.04 % 5,630.67 1,566 4.41 % 5,298.74 1,114 4.42 % 3,154.48 722 3.97 % 4,631.38 že že ž e 4,447 4.43 % 4,296.17 1,071 4.98 % 4,645.96 1,498 4.22 % 5,068.65 1,191 4.73 % 3,372.51 687 3.78 % 4,406.87 pač pač pa č 2,851 2.84 % 2,754.30 402 1.87 % 1,743.86 960 2.71 % 3,248.27 830 3.30 % 2,350.28 659 3.62 % 4,227.26 seveda seveda seved a 1,795 1.79 % 1,734.12 543 2.53 % 2,355.51 188 0.53 % 636.12 896 3.56 % 2,537.17 168 0.92 % 1,077.66 res res re s 1,699 1.69 % 1,641.38 467 2.17 % 2,025.83 577 1.63 % 1,952.35 360 1.43 % 1,019.40 295 1.62 % 1,892.32 samo samo sam o 1,469 1.46 % 1,419.18 254 1.18 % 1,101.84 565 1.59 % 1,911.74 419 1.66 % 1,186.47 231 1.27 % 1,481.79 več več ve č 1,332 1.33 % 1,286.82 287 1.34 % 1,245 442 1.25 % 1,495.56 445 1.77 % 1,260.09 158 0.87 % 1,013.52 sploh sploh splo h 895 0.89 % 864.64 161 0.75 % 698.41 354 1.00 % 1,197.80 219 0.87 % 620.13 161 0.89 % 1,032.76 kar kar ka r 793 0.79 % 766.10 183 0.85 % 793.85 160 0.45 % 541.38 336 1.33 % 951.44 114 0.63 % 731.27 naj naj na j 792 0.79 % 765.14 180 0.84 % 780.83 179 0.51 % 605.67 331 1.31 % 937.28 102 0.56 % 654.29 okej okej oke j 656 0.65 % 633.75 191 0.89 % 828.55 175 0.49 % 592.13 96 0.38 % 271.84 194 1.07 % 1,244.44 pravzaprav pravzaprav pravzapra v 579 0.58 % 559.36 73 0.34 % 316.67 18 0.05 % 60.91 439 1.74 % 1,243.10 49 0.27 % 314.32 glih glih gli h 542 0.54 % 523.62 74 0.34 % 321.01 373 1.05 % 1,262.09 36 0.14 % 101.94 59 0.33 % 378.46 itak itak ita k 529 0.53 % 511.06 50 0.23 % 216.90 391 1.10 % 1,322.99 29 0.12 % 82.12 59 0.33 % 378.46 torej torej tore j 474 0.47 % 457.92 133 0.62 % 576.95 9 0.03 % 
30.45 297 1.18 % 841 35 0.19 % 224.51 vsaj vsaj vsa j 448 0.45 % 432.81 113 0.53 % 490.19 119 0.34 % 402.65 128 0.51 % 362.45 88 0.48 % 564.49 da da d a 354 0.35 % 341.99 51 0.24 % 221.24 162 0.46 % 548.15 91 0.36 % 257.68 50 0.28 % 320.73 predvsem predvsem predvse m 335 0.33 % 323.64 107 0.50 % 464.16 10 0.03 % 33.84 187 0.74 % 529.52 31 0.17 % 198.85 morda morda mord a 333 0.33 % 321.71 138 0.64 % 598.64 27 0.08 % 91.36 151 0.60 % 427.58 17 0.09 % 109.05 skoraj skoraj skora j 295 0.29 % 284.99 76 0.35 % 329.69 100 0.28 % 338.36 83 0.33 % 235.03 36 0.20 % 230.93 prav prav pra v 249 0.25 % 240.55 50 0.23 % 216.90 87 0.24 % 294.37 84 0.33 % 237.86 28 0.15 % 179.61 le le l e 240 0.24 % 231.86 98 0.46 % 425.12 33 0.09 % 111.66 91 0.36 % 257.68 18 0.10 % 115.46 najbrž najbrž najbr ž 217 0.22 % 209.64 44 0.20 % 190.87 63 0.18 % 213.17 64 0.25 % 181.23 46 0.25 % 295.07 mogoče mogoče mogoč e 184 0.18 % 177.76 39 0.18 % 169.18 36 0.10 % 121.81 60 0.24 % 169.90 49 0.27 % 314.32 celo celo cel o 173 0.17 % 167.13 47 0.22 % 203.88 32 0.09 % 108.28 75 0.30 % 212.37 19 0.10 % 121.88 menda menda mend a 149 0.15 % 143.95 12 0.06 % 52.06 96 0.27 % 324.83 28 0.11 % 79.29 13 0.07 % 83.39 šele šele šel e 130 0.13 % 125.59 29 0.14 % 125.80 46 0.13 % 155.65 39 0.15 % 110.43 16 0.09 % 102.63 baje baje baj e 124 0.12 % 119.79 35 0.16 % 151.83 72 0.20 % 243.62 4 0.02 % 11.33 13 0.07 % 83.39 CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 620 File at CLARIN.SI2.2.277 List of final character-level 2-grams from particle lemmas in the GOS 1.0 corpus with text-type distributionGOS1.0-word_parts-particles-lemmas-final- 2grams-taxonomy-entire.tsvLemma Lemma (lower-case) Rest of the word Final part of the word Total absolute frequency of lemma Percentage of all found lemmas Total relative frequency (per million occurrences) Absolute frequency [gos.T.J.R] Percentage [gos.T.J.R] Relative frequency [gos.T.J.R] Absolute frequency [gos.T.N.Z] Percentage [gos.T.N.Z] 
Relative frequency [gos.T.N.Z] Absolute frequency [gos.T.J.I] Percentage [gos.T.J.I] Relative frequency [gos.T.J.I] Absolute frequency [gos.T.N.N] Percentage [gos.T.N.N] Relative frequency [gos.T.N.N] ne ne ne 31,737 31.64 % 30,660.60 6,439 29.97 % 27,932.14 11,588 32.66 % 39,209.32 7,780 30.90 % 22,030.36 5,930 32.62 % 38,038.91 ja ja ja 25,555 25.47 % 24,688.27 4,720 21.97 % 20,475.18 11,541 32.53 % 39,050.29 3,821 15.18 % 10,819.80 5,473 30.10 % 35,107.41 tudi tudi tu di 7,945 7.92 % 7,675.53 2,076 9.66 % 9,005.61 1,644 4.63 % 5,562.66 2,890 11.48 % 8,183.51 1,335 7.34 % 8,563.57 še še še 7,184 7.16 % 6,940.35 1,782 8.29 % 7,730.25 2,116 5.96 % 7,159.73 2,127 8.45 % 6,022.95 1,159 6.38 % 7,434.59 no no no 4,700 4.68 % 4,540.59 1,298 6.04 % 5,630.67 1,566 4.41 % 5,298.74 1,114 4.42 % 3,154.48 722 3.97 % 4,631.38 že že že 4,447 4.43 % 4,296.17 1,071 4.98 % 4,645.96 1,498 4.22 % 5,068.65 1,191 4.73 % 3,372.51 687 3.78 % 4,406.87 pač pač p ač 2,851 2.84 % 2,754.30 402 1.87 % 1,743.86 960 2.71 % 3,248.27 830 3.30 % 2,350.28 659 3.62 % 4,227.26 seveda seveda seve da 1,795 1.79 % 1,734.12 543 2.53 % 2,355.51 188 0.53 % 636.12 896 3.56 % 2,537.17 168 0.92 % 1,077.66 res res r es 1,699 1.69 % 1,641.38 467 2.17 % 2,025.83 577 1.63 % 1,952.35 360 1.43 % 1,019.40 295 1.62 % 1,892.32 samo samo sa mo 1,469 1.46 % 1,419.18 254 1.18 % 1,101.84 565 1.59 % 1,911.74 419 1.66 % 1,186.47 231 1.27 % 1,481.79 več več v eč 1,332 1.33 % 1,286.82 287 1.34 % 1,245 442 1.25 % 1,495.56 445 1.77 % 1,260.09 158 0.87 % 1,013.52 sploh sploh spl oh 895 0.89 % 864.64 161 0.75 % 698.41 354 1.00 % 1,197.80 219 0.87 % 620.13 161 0.89 % 1,032.76 kar kar k ar 793 0.79 % 766.10 183 0.85 % 793.85 160 0.45 % 541.38 336 1.33 % 951.44 114 0.63 % 731.27 naj naj n aj 792 0.79 % 765.14 180 0.84 % 780.83 179 0.51 % 605.67 331 1.31 % 937.28 102 0.56 % 654.29 okej okej ok ej 656 0.65 % 633.75 191 0.89 % 828.55 175 0.49 % 592.13 96 0.38 % 271.84 194 1.07 % 1,244.44 pravzaprav pravzaprav pravzapr av 579 0.58 % 
559.36 73 0.34 % 316.67 18 0.05 % 60.91 439 1.74 % 1,243.10 49 0.27 % 314.32 glih glih gl ih 542 0.54 % 523.62 74 0.34 % 321.01 373 1.05 % 1,262.09 36 0.14 % 101.94 59 0.33 % 378.46 itak itak it ak 529 0.53 % 511.06 50 0.23 % 216.90 391 1.10 % 1,322.99 29 0.12 % 82.12 59 0.33 % 378.46 torej torej tor ej 474 0.47 % 457.92 133 0.62 % 576.95 9 0.03 % 30.45 297 1.18 % 841 35 0.19 % 224.51 vsaj vsaj vs aj 448 0.45 % 432.81 113 0.53 % 490.19 119 0.34 % 402.65 128 0.51 % 362.45 88 0.48 % 564.49 da da da 354 0.35 % 341.99 51 0.24 % 221.24 162 0.46 % 548.15 91 0.36 % 257.68 50 0.28 % 320.73 predvsem predvsem predvs em 335 0.33 % 323.64 107 0.50 % 464.16 10 0.03 % 33.84 187 0.74 % 529.52 31 0.17 % 198.85 morda morda mor da 333 0.33 % 321.71 138 0.64 % 598.64 27 0.08 % 91.36 151 0.60 % 427.58 17 0.09 % 109.05 skoraj skoraj skor aj 295 0.29 % 284.99 76 0.35 % 329.69 100 0.28 % 338.36 83 0.33 % 235.03 36 0.20 % 230.93 prav prav pr av 249 0.25 % 240.55 50 0.23 % 216.90 87 0.24 % 294.37 84 0.33 % 237.86 28 0.15 % 179.61 le le le 240 0.24 % 231.86 98 0.46 % 425.12 33 0.09 % 111.66 91 0.36 % 257.68 18 0.10 % 115.46 najbrž najbrž najb rž 217 0.22 % 209.64 44 0.20 % 190.87 63 0.18 % 213.17 64 0.25 % 181.23 46 0.25 % 295.07 mogoče mogoče mogo če 184 0.18 % 177.76 39 0.18 % 169.18 36 0.10 % 121.81 60 0.24 % 169.90 49 0.27 % 314.32 celo celo ce lo 173 0.17 % 167.13 47 0.22 % 203.88 32 0.09 % 108.28 75 0.30 % 212.37 19 0.10 % 121.88 menda menda men da 149 0.15 % 143.95 12 0.06 % 52.06 96 0.27 % 324.83 28 0.11 % 79.29 13 0.07 % 83.39 šele šele še le 130 0.13 % 125.59 29 0.14 % 125.80 46 0.13 % 155.65 39 0.15 % 110.43 16 0.09 % 102.63 baje baje ba je 124 0.12 % 119.79 35 0.16 % 151.83 72 0.20 % 243.62 4 0.02 % 11.33 13 0.07 % 83.39 CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 621 File at CLARIN.SI2.2.278 List of final character-level 3-grams from particle lemmas in the GOS 1.0 corpus with text-type distributionGOS1.0-word_parts-particles-lemmas-final- 
3grams-taxonomy-entire.tsv
Lemma | Lemma (lower-case) | Rest of the word | Final part of the word | Total absolute frequency of lemma | Percentage of all found lemmas | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N]
tudi | tudi | t | udi | 7,945 | 30.45 % | 7,675.53 | 2,076 | 34.45 % | 9,005.61 | 1,644 | 23.58 % | 5,562.66 | 2,890 | 32.28 % | 8,183.51 | 1,335 | 32.26 % | 8,563.57
pač | pač | | pač | 2,851 | 10.93 % | 2,754.30 | 402 | 6.67 % | 1,743.86 | 960 | 13.77 % | 3,248.27 | 830 | 9.27 % | 2,350.28 | 659 | 15.93 % | 4,227.26
seveda | seveda | sev | eda | 1,795 | 6.88 % | 1,734.12 | 543 | 9.01 % | 2,355.51 | 188 | 2.70 % | 636.12 | 896 | 10.01 % | 2,537.17 | 168 | 4.06 % | 1,077.66
res | res | | res | 1,699 | 6.51 % | 1,641.38 | 467 | 7.75 % | 2,025.83 | 577 | 8.28 % | 1,952.35 | 360 | 4.02 % | 1,019.40 | 295 | 7.13 % | 1,892.32
samo | samo | s | amo | 1,469 | 5.63 % | 1,419.18 | 254 | 4.21 % | 1,101.84 | 565 | 8.10 % | 1,911.74 | 419 | 4.68 % | 1,186.47 | 231 | 5.58 % | 1,481.79
več | več | | več | 1,332 | 5.11 % | 1,286.82 | 287 | 4.76 % | 1,245 | 442 | 6.34 % | 1,495.56 | 445 | 4.97 % | 1,260.09 | 158 | 3.82 % | 1,013.52
sploh | sploh | sp | loh | 895 | 3.43 % | 864.64 | 161 | 2.67 % | 698.41 | 354 | 5.08 % | 1,197.80 | 219 | 2.45 % | 620.13 | 161 | 3.89 % | 1,032.76
kar | kar | | kar | 793 | 3.04 % | 766.10 | 183 | 3.04 % | 793.85 | 160 | 2.29 % | 541.38 | 336 | 3.75 % | 951.44 | 114 | 2.75 % | 731.27
naj | naj | | naj | 792 | 3.04 % | 765.14 | 180 | 2.99 % | 780.83 | 179 | 2.57 % | 605.67 | 331 | 3.70 % | 937.28 | 102 | 2.46 % | 654.29
okej | okej | o | kej | 656 | 2.51 % | 633.75 | 191 | 3.17 % | 828.55 | 175 | 2.51 % | 592.13 | 96 | 1.07 % | 271.84 | 194 | 4.69 % | 1,244.44
pravzaprav | pravzaprav | pravzap | rav | 579 | 2.22 % | 559.36 | 73 | 1.21 % | 316.67 | 18 | 0.26 % | 60.91 | 439 | 4.90 % | 1,243.10 | 49 | 1.18 % | 314.32
glih | glih | g | lih | 542 | 2.08 % | 523.62 | 74 | 1.23 % | 321.01 | 373 | 5.35 % | 1,262.09 | 36 | 0.40 % | 101.94 | 59 | 1.43 % | 378.46
itak | itak | i | tak | 529 | 2.03 % | 511.06 | 50 | 0.83 % | 216.90 | 391 | 5.61 % | 1,322.99 | 29 | 0.32 % | 82.12 | 59 | 1.43 % | 378.46
torej | torej | to | rej | 474 | 1.82 % | 457.92 | 133 | 2.21 % | 576.95 | 9 | 0.13 % | 30.45 | 297 | 3.32 % | 841 | 35 | 0.85 % | 224.51
vsaj | vsaj | v | saj | 448 | 1.72 % | 432.81 | 113 | 1.88 % | 490.19 | 119 | 1.71 % | 402.65 | 128 | 1.43 % | 362.45 | 88 | 2.13 % | 564.49
predvsem | predvsem | predv | sem | 335 | 1.28 % | 323.64 | 107 | 1.78 % | 464.16 | 10 | 0.14 % | 33.84 | 187 | 2.09 % | 529.52 | 31 | 0.75 % | 198.85
morda | morda | mo | rda | 333 | 1.28 % | 321.71 | 138 | 2.29 % | 598.64 | 27 | 0.39 % | 91.36 | 151 | 1.69 % | 427.58 | 17 | 0.41 % | 109.05
skoraj | skoraj | sko | raj | 295 | 1.13 % | 284.99 | 76 | 1.26 % | 329.69 | 100 | 1.43 % | 338.36 | 83 | 0.93 % | 235.03 | 36 | 0.87 % | 230.93
prav | prav | p | rav | 249 | 0.95 % | 240.55 | 50 | 0.83 % | 216.90 | 87 | 1.25 % | 294.37 | 84 | 0.94 % | 237.86 | 28 | 0.68 % | 179.61
najbrž | najbrž | naj | brž | 217 | 0.83 % | 209.64 | 44 | 0.73 % | 190.87 | 63 | 0.90 % | 213.17 | 64 | 0.71 % | 181.23 | 46 | 1.11 % | 295.07
mogoče | mogoče | mog | oče | 184 | 0.70 % | 177.76 | 39 | 0.65 % | 169.18 | 36 | 0.52 % | 121.81 | 60 | 0.67 % | 169.90 | 49 | 1.18 % | 314.32
celo | celo | c | elo | 173 | 0.66 % | 167.13 | 47 | 0.78 % | 203.88 | 32 | 0.46 % | 108.28 | 75 | 0.84 % | 212.37 | 19 | 0.46 % | 121.88
menda | menda | me | nda | 149 | 0.57 % | 143.95 | 12 | 0.20 % | 52.06 | 96 | 1.38 % | 324.83 | 28 | 0.31 % | 79.29 | 13 | 0.31 % | 83.39
šele | šele | š | ele | 130 | 0.50 % | 125.59 | 29 | 0.48 % | 125.80 | 46 | 0.66 % | 155.65 | 39 | 0.44 % | 110.43 | 16 | 0.39 % | 102.63
baje | baje | b | aje | 124 | 0.47 % | 119.79 | 35 | 0.58 % | 151.83 | 72 | 1.03 % | 243.62 | 4 | 0.04 % | 11.33 | 13 | 0.31 % | 83.39
valjda | valjda | val | jda | 116 | 0.45 % | 112.07 | 24 | 0.40 % | 104.11 | 70 | 1.00 % | 236.85 | 5 | 0.06 % | 14.16 | 17 | 0.41 % | 109.05
žal | žal | | žal | 92 | 0.35 % | 88.88 | 34 | 0.56 % | 147.49 | 15 | 0.21 % | 50.75 | 31 | 0.35 % | 87.78 | 12 | 0.29 % | 76.98
potem | potem | po | tem | 87 | 0.33 % | 84.05 | 19 | 0.32 % | 82.42 | 16 | 0.23 % | 54.14 | 33 | 0.37 % | 93.44 | 19 | 0.46 % | 121.88
ravno | ravno | ra | vno | 87 | 0.33 % | 84.05 | 20 | 0.33 % | 86.76 | 10 | 0.14 % | 33.84 | 44 | 0.49 % | 124.59 | 13 | 0.31 % | 83.39
komaj | komaj | ko | maj | 79 | 0.30 % | 76.32 | 22 | 0.36 % | 95.44 | 41 | 0.59 % | 138.73 | 7 | 0.08 % | 19.82 | 9 | 0.22 % | 57.73
verjetno | verjetno | verje | tno | 69 | 0.26 % | 66.66 | 7 | 0.12 % | 30.37 | 11 | 0.16 % | 37.22 | 37 | 0.41 % | 104.77 | 14 | 0.34 % | 89.81
morebiti | morebiti | moreb | iti | 49 | 0.19 % | 47.34 | 2 | 0.03 % | 8.68 | 17 | 0.24 % | 57.52 | 18 | 0.20 % | 50.97 | 12 | 0.29 % | 76.98

2.2.279 List of final character-level 4-grams from particle lemmas in the GOS 1.0 corpus with text-type distribution
File at CLARIN.SI: GOS1.0-word_parts-particles-lemmas-final-4grams-taxonomy-entire.tsv
Lemma | Lemma (lower-case) | Rest of the word | Final part of the word | Total absolute frequency of lemma | Percentage of all found lemmas | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N]
tudi | tudi | | tudi | 7,945 | 42.89 % | 7,675.53 | 2,076 | 46.42 % | 9,005.61 | 1,644 | 35.46 % | 5,562.66 | 2,890 | 43.67 % | 8,183.51 | 1,335 | 47.71 % | 8,563.57
seveda | seveda | se | veda | 1,795 | 9.69 % | 1,734.12 | 543 | 12.14 % | 2,355.51 | 188 | 4.05 % | 636.12 | 896 | 13.54 % | 2,537.17 | 168 | 6.00 % | 1,077.66
samo | samo | | samo | 1,469 | 7.93 % | 1,419.18 | 254 | 5.68 % | 1,101.84 | 565 | 12.19 % | 1,911.74 | 419 | 6.33 % | 1,186.47 | 231 | 8.26 % | 1,481.79
sploh | sploh | s | ploh | 895 | 4.83 % | 864.64 | 161 | 3.60 % | 698.41 | 354 | 7.64 % | 1,197.80 | 219 | 3.31 % | 620.13 | 161 | 5.75 % | 1,032.76
okej | okej | | okej | 656 | 3.54 % | 633.75 | 191 | 4.27 % | 828.55 | 175 | 3.77 % | 592.13 | 96 | 1.45 % | 271.84 | 194 | 6.93 % | 1,244.44
pravzaprav | pravzaprav | pravza | prav | 579 | 3.13 % | 559.36 | 73 | 1.63 % | 316.67 | 18 | 0.39 % | 60.91 | 439 | 6.63 % | 1,243.10 | 49 | 1.75 % | 314.32
glih | glih | | glih | 542 | 2.93 % | 523.62 | 74 | 1.66 % | 321.01 | 373 | 8.05 % | 1,262.09 | 36 | 0.54 % | 101.94 | 59 | 2.11 % | 378.46
itak | itak | | itak | 529 | 2.86 % | 511.06 | 50 | 1.12 % | 216.90 | 391 | 8.43 % | 1,322.99 | 29 | 0.44 % | 82.12 | 59 | 2.11 % | 378.46
torej | torej | t | orej | 474 | 2.56 % | 457.92 | 133 | 2.97 % | 576.95 | 9 | 0.19 % | 30.45 | 297 | 4.49 % | 841 | 35 | 1.25 % | 224.51
vsaj | vsaj | | vsaj | 448 | 2.42 % | 432.81 | 113 | 2.53 % | 490.19 | 119 | 2.57 % | 402.65 | 128 | 1.93 % | 362.45 | 88 | 3.15 % | 564.49
predvsem | predvsem | pred | vsem | 335 | 1.81 % | 323.64 | 107 | 2.39 % | 464.16 | 10 | 0.22 %
33.84 | 187 | 2.83 % | 529.52 | 31 | 1.11 % | 198.85
morda | morda | m | orda | 333 | 1.80 % | 321.71 | 138 | 3.09 % | 598.64 | 27 | 0.58 % | 91.36 | 151 | 2.28 % | 427.58 | 17 | 0.61 % | 109.05
skoraj | skoraj | sk | oraj | 295 | 1.59 % | 284.99 | 76 | 1.70 % | 329.69 | 100 | 2.16 % | 338.36 | 83 | 1.25 % | 235.03 | 36 | 1.29 % | 230.93
prav | prav | | prav | 249 | 1.34 % | 240.55 | 50 | 1.12 % | 216.90 | 87 | 1.88 % | 294.37 | 84 | 1.27 % | 237.86 | 28 | 1.00 % | 179.61
najbrž | najbrž | na | jbrž | 217 | 1.17 % | 209.64 | 44 | 0.98 % | 190.87 | 63 | 1.36 % | 213.17 | 64 | 0.97 % | 181.23 | 46 | 1.64 % | 295.07
mogoče | mogoče | mo | goče | 184 | 0.99 % | 177.76 | 39 | 0.87 % | 169.18 | 36 | 0.78 % | 121.81 | 60 | 0.91 % | 169.90 | 49 | 1.75 % | 314.32
celo | celo | | celo | 173 | 0.93 % | 167.13 | 47 | 1.05 % | 203.88 | 32 | 0.69 % | 108.28 | 75 | 1.13 % | 212.37 | 19 | 0.68 % | 121.88
menda | menda | m | enda | 149 | 0.80 % | 143.95 | 12 | 0.27 % | 52.06 | 96 | 2.07 % | 324.83 | 28 | 0.42 % | 79.29 | 13 | 0.47 % | 83.39
šele | šele | | šele | 130 | 0.70 % | 125.59 | 29 | 0.65 % | 125.80 | 46 | 0.99 % | 155.65 | 39 | 0.59 % | 110.43 | 16 | 0.57 % | 102.63
baje | baje | | baje | 124 | 0.67 % | 119.79 | 35 | 0.78 % | 151.83 | 72 | 1.55 % | 243.62 | 4 | 0.06 % | 11.33 | 13 | 0.47 % | 83.39
valjda | valjda | va | ljda | 116 | 0.63 % | 112.07 | 24 | 0.54 % | 104.11 | 70 | 1.51 % | 236.85 | 5 | 0.08 % | 14.16 | 17 | 0.61 % | 109.05
potem | potem | p | otem | 87 | 0.47 % | 84.05 | 19 | 0.42 % | 82.42 | 16 | 0.34 % | 54.14 | 33 | 0.50 % | 93.44 | 19 | 0.68 % | 121.88
ravno | ravno | r | avno | 87 | 0.47 % | 84.05 | 20 | 0.45 % | 86.76 | 10 | 0.22 % | 33.84 | 44 | 0.67 % | 124.59 | 13 | 0.47 % | 83.39
komaj | komaj | k | omaj | 79 | 0.43 % | 76.32 | 22 | 0.49 % | 95.44 | 41 | 0.88 % | 138.73 | 7 | 0.11 % | 19.82 | 9 | 0.32 % | 57.73
verjetno | verjetno | verj | etno | 69 | 0.37 % | 66.66 | 7 | 0.16 % | 30.37 | 11 | 0.24 % | 37.22 | 37 | 0.56 % | 104.77 | 14 | 0.50 % | 89.81
morebiti | morebiti | more | biti | 49 | 0.27 % | 47.34 | 2 | 0.04 % | 8.68 | 17 | 0.37 % | 57.52 | 18 | 0.27 % | 50.97 | 12 | 0.43 % | 76.98
zgolj | zgolj | z | golj | 45 | 0.24 % | 43.47 | 11 | 0.25 % | 47.72 | 3 | 0.07 % | 10.15 | 30 | 0.45 % | 84.95 | 1 | 0.04 % | 6.41
vendarle | vendarle | vend | arle | 44 | 0.24 % | 42.51 | 4 | 0.09 % | 17.35 | 0 | 0 % | 0 | 39 | 0.59 % | 110.43 | 1 | 0.04 % | 6.41
skorajda | skorajda | skor | ajda | 42 | 0.23 % | 40.58 | 15 | 0.34 % | 65.07 | 1 | 0.02 % | 3.38 | 25 | 0.38 % | 70.79 | 1 | 0.04 % | 6.41
edino | edino | e | dino | 30 | 0.16 % | 28.98 | 6 | 0.13 % | 26.03 | 13 | 0.28 % | 43.99 | 4 | 0.06 % | 11.33 | 7 | 0.25 % | 44.90
vsekakor | vsekakor | vsek | akor | 28 | 0.15 % | 27.05 | 8 | 0.18 % | 34.70 | 0 | 0 % | 0 | 18 | 0.27 % | 50.97 | 2 | 0.07 % | 12.83
minus | minus | m | inus | 22 | 0.12 % | 21.25 | 0 | 0 % | 0 | 0 | 0 % | 0 | 10 | 0.15 % | 28.32 | 12 | 0.43 % | 76.98

2.2.280 List of final character-level 5-grams from particle lemmas in the GOS 1.0 corpus with text-type distribution
File at CLARIN.SI: GOS1.0-word_parts-particles-lemmas-final-5grams-taxonomy-entire.tsv
Lemma | Lemma (lower-case) | Rest of the word | Final part of the word | Total absolute frequency of lemma | Percentage of all found lemmas | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N]
seveda | seveda | s | eveda | 1,795 | 29.02 % | 1,734.12 | 543 | 35.19 % | 2,355.51 | 188 | 16.94 % | 636.12 | 896 | 32.07 % | 2,537.17 | 168 | 22.76 % | 1,077.66
sploh | sploh | | sploh | 895 | 14.47 % | 864.64 | 161 | 10.43 % | 698.41 | 354 | 31.89 % | 1,197.80 | 219 | 7.84 % | 620.13 | 161 | 21.82 % | 1,032.76
pravzaprav | pravzaprav | pravz | aprav | 579 | 9.36 % | 559.36 | 73 | 4.73 % | 316.67 | 18 | 1.62 % | 60.91 | 439 | 15.71 % | 1,243.10 | 49 | 6.64 % | 314.32
torej | torej | | torej | 474 | 7.66 % | 457.92 | 133 | 8.62 % | 576.95 | 9 | 0.81 % | 30.45 | 297 | 10.63 % | 841 | 35 | 4.74 % | 224.51
predvsem | predvsem | pre | dvsem | 335 | 5.42 % | 323.64 | 107 | 6.93 % | 464.16 | 10 | 0.90 % | 33.84 | 187 | 6.69 % | 529.52 | 31 | 4.20 % | 198.85
morda | morda | | morda | 333 | 5.38 % | 321.71 | 138 | 8.94 % | 598.64 | 27 | 2.43 % | 91.36 | 151 | 5.40 % | 427.58 | 17 | 2.30 % | 109.05
skoraj | skoraj | s | koraj | 295 | 4.77 % | 284.99 | 76 | 4.92 % | 329.69 | 100 | 9.01 % | 338.36 | 83 | 2.97 % | 235.03 | 36 | 4.88 % | 230.93
najbrž | najbrž | n | ajbrž | 217 | 3.51 % | 209.64 | 44 | 2.85 % | 190.87 | 63 | 5.68 % | 213.17 | 64 | 2.29 % | 181.23 | 46 | 6.23 % | 295.07
mogoče | mogoče | m | ogoče | 184 | 2.98 % | 177.76 | 39 | 2.53 % | 169.18 | 36 | 3.24 % | 121.81 | 60 | 2.15 % | 169.90 | 49 | 6.64 % | 314.32
menda
menda | | menda | 149 | 2.41 % | 143.95 | 12 | 0.78 % | 52.06 | 96 | 8.65 % | 324.83 | 28 | 1.00 % | 79.29 | 13 | 1.76 % | 83.39
valjda | valjda | v | aljda | 116 | 1.88 % | 112.07 | 24 | 1.55 % | 104.11 | 70 | 6.31 % | 236.85 | 5 | 0.18 % | 14.16 | 17 | 2.30 % | 109.05
potem | potem | | potem | 87 | 1.41 % | 84.05 | 19 | 1.23 % | 82.42 | 16 | 1.44 % | 54.14 | 33 | 1.18 % | 93.44 | 19 | 2.58 % | 121.88
ravno | ravno | | ravno | 87 | 1.41 % | 84.05 | 20 | 1.30 % | 86.76 | 10 | 0.90 % | 33.84 | 44 | 1.57 % | 124.59 | 13 | 1.76 % | 83.39
komaj | komaj | | komaj | 79 | 1.28 % | 76.32 | 22 | 1.43 % | 95.44 | 41 | 3.69 % | 138.73 | 7 | 0.25 % | 19.82 | 9 | 1.22 % | 57.73
verjetno | verjetno | ver | jetno | 69 | 1.12 % | 66.66 | 7 | 0.45 % | 30.37 | 11 | 0.99 % | 37.22 | 37 | 1.32 % | 104.77 | 14 | 1.90 % | 89.81
morebiti | morebiti | mor | ebiti | 49 | 0.79 % | 47.34 | 2 | 0.13 % | 8.68 | 17 | 1.53 % | 57.52 | 18 | 0.64 % | 50.97 | 12 | 1.63 % | 76.98
zgolj | zgolj | | zgolj | 45 | 0.73 % | 43.47 | 11 | 0.71 % | 47.72 | 3 | 0.27 % | 10.15 | 30 | 1.07 % | 84.95 | 1 | 0.14 % | 6.41
vendarle | vendarle | ven | darle | 44 | 0.71 % | 42.51 | 4 | 0.26 % | 17.35 | 0 | 0 % | 0 | 39 | 1.40 % | 110.43 | 1 | 0.14 % | 6.41
skorajda | skorajda | sko | rajda | 42 | 0.68 % | 40.58 | 15 | 0.97 % | 65.07 | 1 | 0.09 % | 3.38 | 25 | 0.90 % | 70.79 | 1 | 0.14 % | 6.41
edino | edino | | edino | 30 | 0.48 % | 28.98 | 6 | 0.39 % | 26.03 | 13 | 1.17 % | 43.99 | 4 | 0.14 % | 11.33 | 7 | 0.95 % | 44.90
vsekakor | vsekakor | vse | kakor | 28 | 0.45 % | 27.05 | 8 | 0.52 % | 34.70 | 0 | 0 % | 0 | 18 | 0.64 % | 50.97 | 2 | 0.27 % | 12.83
minus | minus | | minus | 22 | 0.36 % | 21.25 | 0 | 0 % | 0 | 0 | 0 % | 0 | 10 | 0.36 % | 28.32 | 12 | 1.63 % | 76.98
največ | največ | n | ajveč | 21 | 0.34 % | 20.29 | 7 | 0.45 % | 30.37 | 0 | 0 % | 0 | 12 | 0.43 % | 33.98 | 2 | 0.27 % | 12.83
bržkone | bržkone | br | žkone | 20 | 0.32 % | 19.32 | 20 | 1.30 % | 86.76 | 0 | 0 % | 0 | 0 | 0 % | 0 | 0 | 0 % | 0
natanko | natanko | na | tanko | 19 | 0.31 % | 18.36 | 10 | 0.65 % | 43.38 | 0 | 0 % | 0 | 9 | 0.32 % | 25.48 | 0 | 0 % | 0
bojda | bojda | | bojda | 16 | 0.26 % | 15.46 | 8 | 0.52 % | 34.70 | 8 | 0.72 % | 27.07 | 0 | 0 % | 0 | 0 | 0 % | 0
najmanj | najmanj | na | jmanj | 16 | 0.26 % | 15.46 | 2 | 0.13 % | 8.68 | 2 | 0.18 % | 6.77 | 10 | 0.36 % | 28.32 | 2 | 0.27 % | 12.83
približno | približno | prib | ližno | 14 | 0.23 % | 13.53 | 7 | 0.45 % | 30.37 | 0 | 0 % | 0 | 6 | 0.21 % | 16.99 | 1 | 0.14 % | 6.41
končno | končno | k | ončno | 13 | 0.21 % | 12.56 | 3 | 0.19 % | 13.01 | 1 | 0.09 % | 3.38 | 9 | 0.32 % | 25.48 | 0 | 0 % | 0
razen | razen | | razen | 13 | 0.21 % | 12.56 | 2 | 0.13 % | 8.68 | 5 | 0.45 % | 16.92 | 4 | 0.14 % | 11.33 | 2 | 0.27 % | 12.83
kvečjemu | kvečjemu | kve | čjemu | 12 | 0.19 % | 11.59 | 1 | 0.07 % | 4.34 | 2 | 0.18 % | 6.77 | 5 | 0.18 % | 14.16 | 4 | 0.54 % | 25.66
okrog | okrog | | okrog | 11 | 0.18 % | 10.63 | 1 | 0.07 % | 4.34 | 0 | 0 % | 0 | 5 | 0.18 % | 14.16 | 5 | 0.68 % | 32.07

2.2.281 List of initial character-level 1-grams from particle standardized forms in the GOS 1.0 corpus with text-type distribution
File at CLARIN.SI: GOS1.0-word_parts-particles-standardized_forms-initial-1grams-taxonomy-entire.tsv
Standardized form | Initial part of the word | Rest of the word | Total absolute frequency of standardized form | Percentage of all found standardized forms | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N]
ne | n | e | 31,733 | 31.63 % | 30,656.73 | 6,435 | 29.95 % | 27,914.79 | 11,588 | 32.66 % | 39,209.32 | 7,780 | 30.90 % | 22,030.36 | 5,930 | 32.62 % | 38,038.91
ja | j | a | 25,555 | 25.47 % | 24,688.27 | 4,720 | 21.97 % | 20,475.18 | 11,541 | 32.53 % | 39,050.29 | 3,821 | 15.18 % | 10,819.80 | 5,473 | 30.10 % | 35,107.41
tudi | t | udi | 7,945 | 7.92 % | 7,675.53 | 2,076 | 9.66 % | 9,005.61 | 1,644 | 4.63 % | 5,562.66 | 2,890 | 11.48 % | 8,183.51 | 1,335 | 7.34 % | 8,563.57
še | š | e | 7,184 | 7.16 % | 6,940.35 | 1,782 | 8.29 % | 7,730.25 | 2,116 | 5.96 % | 7,159.73 | 2,127 | 8.45 % | 6,022.95 | 1,159 | 6.38 % | 7,434.59
no | n | o | 4,691 | 4.68 % | 4,531.90 | 1,289 | 6.00 % | 5,591.63 | 1,566 | 4.41 % | 5,298.74 | 1,114 | 4.42 % | 3,154.48 | 722 | 3.97 % | 4,631.38
že | ž | e | 4,447 | 4.43 % | 4,296.17 | 1,071 | 4.98 % | 4,645.96 | 1,498 | 4.22 % | 5,068.65 | 1,191 | 4.73 % | 3,372.51 | 687 | 3.78 % | 4,406.87
pač | p | ač | 2,851 | 2.84 % | 2,754.30 | 402 | 1.87 % | 1,743.86 | 960 | 2.71 % | 3,248.27 | 830 | 3.30 % | 2,350.28 | 659 | 3.62 % | 4,227.26
seveda | s | eveda | 1,795 | 1.79 % | 1,734.12 | 543 | 2.53 % | 2,355.51 | 188 | 0.53 % | 636.12 | 896 | 3.56 % | 2,537.17 | 168 | 0.92 %
1,077.66
res | r | es | 1,699 | 1.69 % | 1,641.38 | 467 | 2.17 % | 2,025.83 | 577 | 1.63 % | 1,952.35 | 360 | 1.43 % | 1,019.40 | 295 | 1.62 % | 1,892.32
samo | s | amo | 1,469 | 1.46 % | 1,419.18 | 254 | 1.18 % | 1,101.84 | 565 | 1.59 % | 1,911.74 | 419 | 1.66 % | 1,186.47 | 231 | 1.27 % | 1,481.79
več | v | eč | 1,332 | 1.33 % | 1,286.82 | 287 | 1.34 % | 1,245 | 442 | 1.25 % | 1,495.56 | 445 | 1.77 % | 1,260.09 | 158 | 0.87 % | 1,013.52
sploh | s | ploh | 895 | 0.89 % | 864.64 | 161 | 0.75 % | 698.41 | 354 | 1.00 % | 1,197.80 | 219 | 0.87 % | 620.13 | 161 | 0.89 % | 1,032.76
kar | k | ar | 793 | 0.79 % | 766.10 | 183 | 0.85 % | 793.85 | 160 | 0.45 % | 541.38 | 336 | 1.33 % | 951.44 | 114 | 0.63 % | 731.27
naj | n | aj | 792 | 0.79 % | 765.14 | 180 | 0.84 % | 780.83 | 179 | 0.51 % | 605.67 | 331 | 1.31 % | 937.28 | 102 | 0.56 % | 654.29
okej | o | kej | 656 | 0.65 % | 633.75 | 191 | 0.89 % | 828.55 | 175 | 0.49 % | 592.13 | 96 | 0.38 % | 271.84 | 194 | 1.07 % | 1,244.44
pravzaprav | p | ravzaprav | 579 | 0.58 % | 559.36 | 73 | 0.34 % | 316.67 | 18 | 0.05 % | 60.91 | 439 | 1.74 % | 1,243.10 | 49 | 0.27 % | 314.32
glih | g | lih | 542 | 0.54 % | 523.62 | 74 | 0.34 % | 321.01 | 373 | 1.05 % | 1,262.09 | 36 | 0.14 % | 101.94 | 59 | 0.33 % | 378.46
itak | i | tak | 527 | 0.53 % | 509.13 | 50 | 0.23 % | 216.90 | 389 | 1.10 % | 1,316.23 | 29 | 0.12 % | 82.12 | 59 | 0.33 % | 378.46
torej | t | orej | 474 | 0.47 % | 457.92 | 133 | 0.62 % | 576.95 | 9 | 0.03 % | 30.45 | 297 | 1.18 % | 841 | 35 | 0.19 % | 224.51
vsaj | v | saj | 448 | 0.45 % | 432.81 | 113 | 0.53 % | 490.19 | 119 | 0.34 % | 402.65 | 128 | 0.51 % | 362.45 | 88 | 0.48 % | 564.49
da | d | a | 354 | 0.35 % | 341.99 | 51 | 0.24 % | 221.24 | 162 | 0.46 % | 548.15 | 91 | 0.36 % | 257.68 | 50 | 0.28 % | 320.73
predvsem | p | redvsem | 335 | 0.33 % | 323.64 | 107 | 0.50 % | 464.16 | 10 | 0.03 % | 33.84 | 187 | 0.74 % | 529.52 | 31 | 0.17 % | 198.85
morda | m | orda | 333 | 0.33 % | 321.71 | 138 | 0.64 % | 598.64 | 27 | 0.08 % | 91.36 | 151 | 0.60 % | 427.58 | 17 | 0.09 % | 109.05
skoraj | s | koraj | 295 | 0.29 % | 284.99 | 76 | 0.35 % | 329.69 | 100 | 0.28 % | 338.36 | 83 | 0.33 % | 235.03 | 36 | 0.20 % | 230.93
prav | p | rav | 249 | 0.25 % | 240.55 | 50 | 0.23 % | 216.90 | 87 | 0.24 % | 294.37 | 84 | 0.33 % | 237.86 | 28 | 0.15 % | 179.61
le | l | e | 240 | 0.24 % | 231.86 | 98 | 0.46 % | 425.12 | 33 | 0.09 % | 111.66 | 91 | 0.36 % | 257.68 | 18 | 0.10 % | 115.46
najbrž | n | ajbrž | 217 | 0.22 % | 209.64 | 44 | 0.20 % | 190.87 | 63 | 0.18 % | 213.17 | 64 | 0.25 % | 181.23 | 46 | 0.25 % | 295.07
mogoče | m | ogoče | 184 | 0.18 % | 177.76 | 39 | 0.18 % | 169.18 | 36 | 0.10 % | 121.81 | 60 | 0.24 % | 169.90 | 49 | 0.27 % | 314.32
celo | c | elo | 173 | 0.17 % | 167.13 | 47 | 0.22 % | 203.88 | 32 | 0.09 % | 108.28 | 75 | 0.30 % | 212.37 | 19 | 0.10 % | 121.88
menda | m | enda | 149 | 0.15 % | 143.95 | 12 | 0.06 % | 52.06 | 96 | 0.27 % | 324.83 | 28 | 0.11 % | 79.29 | 13 | 0.07 % | 83.39
šele | š | ele | 130 | 0.13 % | 125.59 | 29 | 0.14 % | 125.80 | 46 | 0.13 % | 155.65 | 39 | 0.15 % | 110.43 | 16 | 0.09 % | 102.63
baje | b | aje | 124 | 0.12 % | 119.79 | 35 | 0.16 % | 151.83 | 72 | 0.20 % | 243.62 | 4 | 0.02 % | 11.33 | 13 | 0.07 % | 83.39

2.2.282 List of initial character-level 2-grams from particle standardized forms in the GOS 1.0 corpus with text-type distribution
File at CLARIN.SI: GOS1.0-word_parts-particles-standardized_forms-initial-2grams-taxonomy-entire.tsv
Standardized form | Initial part of the word | Rest of the word | Total absolute frequency of standardized form | Percentage of all found standardized forms | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N]
ne | ne | | 31,733 | 31.63 % | 30,656.73 | 6,435 | 29.95 % | 27,914.79 | 11,588 | 32.66 % | 39,209.32 | 7,780 | 30.90 % | 22,030.36 | 5,930 | 32.62 % | 38,038.91
ja | ja | | 25,555 | 25.47 % | 24,688.27 | 4,720 | 21.97 % | 20,475.18 | 11,541 | 32.53 % | 39,050.29 | 3,821 | 15.18 % | 10,819.80 | 5,473 | 30.10 % | 35,107.41
tudi | tu | di | 7,945 | 7.92 % | 7,675.53 | 2,076 | 9.66 % | 9,005.61 | 1,644 | 4.63 % | 5,562.66 | 2,890 | 11.48 % | 8,183.51 | 1,335 | 7.34 % | 8,563.57
še | še | | 7,184 | 7.16 % | 6,940.35 | 1,782 | 8.29 % | 7,730.25 | 2,116 | 5.96 % | 7,159.73 | 2,127 | 8.45 % | 6,022.95 | 1,159 | 6.38 % | 7,434.59
no | no | | 4,691 | 4.68 % | 4,531.90 | 1,289 | 6.00 % | 5,591.63 | 1,566 | 4.41 % | 5,298.74 | 1,114 | 4.42 % | 3,154.48 | 722 | 3.97 % | 4,631.38
že | že | | 4,447 | 4.43 % | 4,296.17 | 1,071 | 4.98 % | 4,645.96 | 1,498 | 4.22 % | 5,068.65 | 1,191 | 4.73 % | 3,372.51 | 687 | 3.78 %
4,406.87
pač | pa | č | 2,851 | 2.84 % | 2,754.30 | 402 | 1.87 % | 1,743.86 | 960 | 2.71 % | 3,248.27 | 830 | 3.30 % | 2,350.28 | 659 | 3.62 % | 4,227.26
seveda | se | veda | 1,795 | 1.79 % | 1,734.12 | 543 | 2.53 % | 2,355.51 | 188 | 0.53 % | 636.12 | 896 | 3.56 % | 2,537.17 | 168 | 0.92 % | 1,077.66
res | re | s | 1,699 | 1.69 % | 1,641.38 | 467 | 2.17 % | 2,025.83 | 577 | 1.63 % | 1,952.35 | 360 | 1.43 % | 1,019.40 | 295 | 1.62 % | 1,892.32
samo | sa | mo | 1,469 | 1.46 % | 1,419.18 | 254 | 1.18 % | 1,101.84 | 565 | 1.59 % | 1,911.74 | 419 | 1.66 % | 1,186.47 | 231 | 1.27 % | 1,481.79
več | ve | č | 1,332 | 1.33 % | 1,286.82 | 287 | 1.34 % | 1,245 | 442 | 1.25 % | 1,495.56 | 445 | 1.77 % | 1,260.09 | 158 | 0.87 % | 1,013.52
sploh | sp | loh | 895 | 0.89 % | 864.64 | 161 | 0.75 % | 698.41 | 354 | 1.00 % | 1,197.80 | 219 | 0.87 % | 620.13 | 161 | 0.89 % | 1,032.76
kar | ka | r | 793 | 0.79 % | 766.10 | 183 | 0.85 % | 793.85 | 160 | 0.45 % | 541.38 | 336 | 1.33 % | 951.44 | 114 | 0.63 % | 731.27
naj | na | j | 792 | 0.79 % | 765.14 | 180 | 0.84 % | 780.83 | 179 | 0.51 % | 605.67 | 331 | 1.31 % | 937.28 | 102 | 0.56 % | 654.29
okej | ok | ej | 656 | 0.65 % | 633.75 | 191 | 0.89 % | 828.55 | 175 | 0.49 % | 592.13 | 96 | 0.38 % | 271.84 | 194 | 1.07 % | 1,244.44
pravzaprav | pr | avzaprav | 579 | 0.58 % | 559.36 | 73 | 0.34 % | 316.67 | 18 | 0.05 % | 60.91 | 439 | 1.74 % | 1,243.10 | 49 | 0.27 % | 314.32
glih | gl | ih | 542 | 0.54 % | 523.62 | 74 | 0.34 % | 321.01 | 373 | 1.05 % | 1,262.09 | 36 | 0.14 % | 101.94 | 59 | 0.33 % | 378.46
itak | it | ak | 527 | 0.53 % | 509.13 | 50 | 0.23 % | 216.90 | 389 | 1.10 % | 1,316.23 | 29 | 0.12 % | 82.12 | 59 | 0.33 % | 378.46
torej | to | rej | 474 | 0.47 % | 457.92 | 133 | 0.62 % | 576.95 | 9 | 0.03 % | 30.45 | 297 | 1.18 % | 841 | 35 | 0.19 % | 224.51
vsaj | vs | aj | 448 | 0.45 % | 432.81 | 113 | 0.53 % | 490.19 | 119 | 0.34 % | 402.65 | 128 | 0.51 % | 362.45 | 88 | 0.48 % | 564.49
da | da | | 354 | 0.35 % | 341.99 | 51 | 0.24 % | 221.24 | 162 | 0.46 % | 548.15 | 91 | 0.36 % | 257.68 | 50 | 0.28 % | 320.73
predvsem | pr | edvsem | 335 | 0.33 % | 323.64 | 107 | 0.50 % | 464.16 | 10 | 0.03 % | 33.84 | 187 | 0.74 % | 529.52 | 31 | 0.17 % | 198.85
morda | mo | rda | 333 | 0.33 % | 321.71 | 138 | 0.64 % | 598.64 | 27 | 0.08 % | 91.36 | 151 | 0.60 % | 427.58 | 17 | 0.09 % | 109.05
skoraj | sk | oraj | 295 | 0.29 % | 284.99 | 76 | 0.35 % | 329.69 | 100 | 0.28 % | 338.36 | 83 | 0.33 % | 235.03 | 36 | 0.20 % | 230.93
prav | pr | av | 249 | 0.25 % | 240.55 | 50 | 0.23 % | 216.90 | 87 | 0.24 % | 294.37 | 84 | 0.33 % | 237.86 | 28 | 0.15 % | 179.61
le | le | | 240 | 0.24 % | 231.86 | 98 | 0.46 % | 425.12 | 33 | 0.09 % | 111.66 | 91 | 0.36 % | 257.68 | 18 | 0.10 % | 115.46
najbrž | na | jbrž | 217 | 0.22 % | 209.64 | 44 | 0.20 % | 190.87 | 63 | 0.18 % | 213.17 | 64 | 0.25 % | 181.23 | 46 | 0.25 % | 295.07
mogoče | mo | goče | 184 | 0.18 % | 177.76 | 39 | 0.18 % | 169.18 | 36 | 0.10 % | 121.81 | 60 | 0.24 % | 169.90 | 49 | 0.27 % | 314.32
celo | ce | lo | 173 | 0.17 % | 167.13 | 47 | 0.22 % | 203.88 | 32 | 0.09 % | 108.28 | 75 | 0.30 % | 212.37 | 19 | 0.10 % | 121.88
menda | me | nda | 149 | 0.15 % | 143.95 | 12 | 0.06 % | 52.06 | 96 | 0.27 % | 324.83 | 28 | 0.11 % | 79.29 | 13 | 0.07 % | 83.39
šele | še | le | 130 | 0.13 % | 125.59 | 29 | 0.14 % | 125.80 | 46 | 0.13 % | 155.65 | 39 | 0.15 % | 110.43 | 16 | 0.09 % | 102.63
baje | ba | je | 124 | 0.12 % | 119.79 | 35 | 0.16 % | 151.83 | 72 | 0.20 % | 243.62 | 4 | 0.02 % | 11.33 | 13 | 0.07 % | 83.39

2.2.283 List of initial character-level 3-grams from particle standardized forms in the GOS 1.0 corpus with text-type distribution
File at CLARIN.SI: GOS1.0-word_parts-particles-standardized_forms-initial-3grams-taxonomy-entire.tsv
Standardized form | Initial part of the word | Rest of the word | Total absolute frequency of standardized form | Percentage of all found standardized forms | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N]
tudi | tud | i | 7,945 | 30.45 % | 7,675.53 | 2,076 | 34.45 % | 9,005.61 | 1,644 | 23.58 % | 5,562.66 | 2,890 | 32.28 % | 8,183.51 | 1,335 | 32.26 % | 8,563.57
pač | pač | | 2,851 | 10.93 % | 2,754.30 | 402 | 6.67 % | 1,743.86 | 960 | 13.77 % | 3,248.27 | 830 | 9.27 % | 2,350.28 | 659 | 15.93 % | 4,227.26
seveda | sev | eda | 1,795 | 6.88 % | 1,734.12 | 543 | 9.01 % | 2,355.51 | 188 | 2.70 % | 636.12 | 896 | 10.01 % | 2,537.17 | 168 | 4.06 % | 1,077.66
res | res | | 1,699 | 6.51 % | 1,641.38 | 467 | 7.75 % | 2,025.83 | 577 | 8.28 % | 1,952.35 | 360 | 4.02 % | 1,019.40 | 295 | 7.13 % | 1,892.32
samo | sam | o | 1,469
5.63 % | 1,419.18 | 254 | 4.21 % | 1,101.84 | 565 | 8.10 % | 1,911.74 | 419 | 4.68 % | 1,186.47 | 231 | 5.58 % | 1,481.79
več | več | | 1,332 | 5.11 % | 1,286.82 | 287 | 4.76 % | 1,245 | 442 | 6.34 % | 1,495.56 | 445 | 4.97 % | 1,260.09 | 158 | 3.82 % | 1,013.52
sploh | spl | oh | 895 | 3.43 % | 864.64 | 161 | 2.67 % | 698.41 | 354 | 5.08 % | 1,197.80 | 219 | 2.45 % | 620.13 | 161 | 3.89 % | 1,032.76
kar | kar | | 793 | 3.04 % | 766.10 | 183 | 3.04 % | 793.85 | 160 | 2.29 % | 541.38 | 336 | 3.75 % | 951.44 | 114 | 2.75 % | 731.27
naj | naj | | 792 | 3.04 % | 765.14 | 180 | 2.99 % | 780.83 | 179 | 2.57 % | 605.67 | 331 | 3.70 % | 937.28 | 102 | 2.46 % | 654.29
okej | oke | j | 656 | 2.51 % | 633.75 | 191 | 3.17 % | 828.55 | 175 | 2.51 % | 592.13 | 96 | 1.07 % | 271.84 | 194 | 4.69 % | 1,244.44
pravzaprav | pra | vzaprav | 579 | 2.22 % | 559.36 | 73 | 1.21 % | 316.67 | 18 | 0.26 % | 60.91 | 439 | 4.90 % | 1,243.10 | 49 | 1.18 % | 314.32
glih | gli | h | 542 | 2.08 % | 523.62 | 74 | 1.23 % | 321.01 | 373 | 5.35 % | 1,262.09 | 36 | 0.40 % | 101.94 | 59 | 1.43 % | 378.46
itak | ita | k | 527 | 2.02 % | 509.13 | 50 | 0.83 % | 216.90 | 389 | 5.58 % | 1,316.23 | 29 | 0.32 % | 82.12 | 59 | 1.43 % | 378.46
torej | tor | ej | 474 | 1.82 % | 457.92 | 133 | 2.21 % | 576.95 | 9 | 0.13 % | 30.45 | 297 | 3.32 % | 841 | 35 | 0.85 % | 224.51
vsaj | vsa | j | 448 | 1.72 % | 432.81 | 113 | 1.88 % | 490.19 | 119 | 1.71 % | 402.65 | 128 | 1.43 % | 362.45 | 88 | 2.13 % | 564.49
predvsem | pre | dvsem | 335 | 1.28 % | 323.64 | 107 | 1.78 % | 464.16 | 10 | 0.14 % | 33.84 | 187 | 2.09 % | 529.52 | 31 | 0.75 % | 198.85
morda | mor | da | 333 | 1.28 % | 321.71 | 138 | 2.29 % | 598.64 | 27 | 0.39 % | 91.36 | 151 | 1.69 % | 427.58 | 17 | 0.41 % | 109.05
skoraj | sko | raj | 295 | 1.13 % | 284.99 | 76 | 1.26 % | 329.69 | 100 | 1.43 % | 338.36 | 83 | 0.93 % | 235.03 | 36 | 0.87 % | 230.93
prav | pra | v | 249 | 0.95 % | 240.55 | 50 | 0.83 % | 216.90 | 87 | 1.25 % | 294.37 | 84 | 0.94 % | 237.86 | 28 | 0.68 % | 179.61
najbrž | naj | brž | 217 | 0.83 % | 209.64 | 44 | 0.73 % | 190.87 | 63 | 0.90 % | 213.17 | 64 | 0.71 % | 181.23 | 46 | 1.11 % | 295.07
mogoče | mog | oče | 184 | 0.70 % | 177.76 | 39 | 0.65 % | 169.18 | 36 | 0.52 % | 121.81 | 60 | 0.67 % | 169.90 | 49 | 1.18 % | 314.32
celo | cel | o | 173 | 0.66 % | 167.13 | 47 | 0.78 % | 203.88 | 32 | 0.46 % | 108.28 | 75 | 0.84 % | 212.37 | 19 | 0.46 % | 121.88
menda | men | da | 149 | 0.57 % | 143.95 | 12 | 0.20 % | 52.06 | 96 | 1.38 % | 324.83 | 28 | 0.31 % | 79.29 | 13 | 0.31 % | 83.39
šele | šel | e | 130 | 0.50 % | 125.59 | 29 | 0.48 % | 125.80 | 46 | 0.66 % | 155.65 | 39 | 0.44 % | 110.43 | 16 | 0.39 % | 102.63
baje | baj | e | 124 | 0.47 % | 119.79 | 35 | 0.58 % | 151.83 | 72 | 1.03 % | 243.62 | 4 | 0.04 % | 11.33 | 13 | 0.31 % | 83.39
valjda | val | jda | 116 | 0.45 % | 112.07 | 24 | 0.40 % | 104.11 | 70 | 1.00 % | 236.85 | 5 | 0.06 % | 14.16 | 17 | 0.41 % | 109.05
žal | žal | | 92 | 0.35 % | 88.88 | 34 | 0.56 % | 147.49 | 15 | 0.21 % | 50.75 | 31 | 0.35 % | 87.78 | 12 | 0.29 % | 76.98
potem | pot | em | 87 | 0.33 % | 84.05 | 19 | 0.32 % | 82.42 | 16 | 0.23 % | 54.14 | 33 | 0.37 % | 93.44 | 19 | 0.46 % | 121.88
ravno | rav | no | 87 | 0.33 % | 84.05 | 20 | 0.33 % | 86.76 | 10 | 0.14 % | 33.84 | 44 | 0.49 % | 124.59 | 13 | 0.31 % | 83.39
komaj | kom | aj | 79 | 0.30 % | 76.32 | 22 | 0.36 % | 95.44 | 41 | 0.59 % | 138.73 | 7 | 0.08 % | 19.82 | 9 | 0.22 % | 57.73
verjetno | ver | jetno | 69 | 0.26 % | 66.66 | 7 | 0.12 % | 30.37 | 11 | 0.16 % | 37.22 | 37 | 0.41 % | 104.77 | 14 | 0.34 % | 89.81
morebiti | mor | ebiti | 49 | 0.19 % | 47.34 | 2 | 0.03 % | 8.68 | 17 | 0.24 % | 57.52 | 18 | 0.20 % | 50.97 | 12 | 0.29 % | 76.98

2.2.284 List of initial character-level 4-grams from particle standardized forms in the GOS 1.0 corpus with text-type distribution
File at CLARIN.SI: GOS1.0-word_parts-particles-standardized_forms-initial-4grams-taxonomy-entire.tsv
Standardized form | Initial part of the word | Rest of the word | Total absolute frequency of standardized form | Percentage of all found standardized forms | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N]
tudi | tudi | | 7,945 | 42.89 % | 7,675.53 | 2,076 | 46.42 % | 9,005.61 | 1,644 | 35.46 % | 5,562.66 | 2,890 | 43.67 % | 8,183.51 | 1,335 | 47.71 % | 8,563.57
seveda | seve | da | 1,795 | 9.69 % | 1,734.12 | 543 | 12.14 % | 2,355.51 | 188 | 4.05 % | 636.12 | 896 | 13.54 % | 2,537.17 | 168 | 6.00 % | 1,077.66
samo | samo | | 1,469 | 7.93 % | 1,419.18 | 254 | 5.68 % | 1,101.84 | 565 | 12.19 % | 1,911.74 | 419 | 6.33 % | 1,186.47 | 231 | 8.26 % | 1,481.79
sploh | splo | h | 895
4.83 % | 864.64 | 161 | 3.60 % | 698.41 | 354 | 7.64 % | 1,197.80 | 219 | 3.31 % | 620.13 | 161 | 5.75 % | 1,032.76
okej | okej | | 656 | 3.54 % | 633.75 | 191 | 4.27 % | 828.55 | 175 | 3.77 % | 592.13 | 96 | 1.45 % | 271.84 | 194 | 6.93 % | 1,244.44
pravzaprav | prav | zaprav | 579 | 3.13 % | 559.36 | 73 | 1.63 % | 316.67 | 18 | 0.39 % | 60.91 | 439 | 6.63 % | 1,243.10 | 49 | 1.75 % | 314.32
glih | glih | | 542 | 2.93 % | 523.62 | 74 | 1.66 % | 321.01 | 373 | 8.05 % | 1,262.09 | 36 | 0.54 % | 101.94 | 59 | 2.11 % | 378.46
itak | itak | | 527 | 2.85 % | 509.13 | 50 | 1.12 % | 216.90 | 389 | 8.39 % | 1,316.23 | 29 | 0.44 % | 82.12 | 59 | 2.11 % | 378.46
torej | tore | j | 474 | 2.56 % | 457.92 | 133 | 2.97 % | 576.95 | 9 | 0.19 % | 30.45 | 297 | 4.49 % | 841 | 35 | 1.25 % | 224.51
vsaj | vsaj | | 448 | 2.42 % | 432.81 | 113 | 2.53 % | 490.19 | 119 | 2.57 % | 402.65 | 128 | 1.93 % | 362.45 | 88 | 3.15 % | 564.49
predvsem | pred | vsem | 335 | 1.81 % | 323.64 | 107 | 2.39 % | 464.16 | 10 | 0.22 % | 33.84 | 187 | 2.83 % | 529.52 | 31 | 1.11 % | 198.85
morda | mord | a | 333 | 1.80 % | 321.71 | 138 | 3.09 % | 598.64 | 27 | 0.58 % | 91.36 | 151 | 2.28 % | 427.58 | 17 | 0.61 % | 109.05
skoraj | skor | aj | 295 | 1.59 % | 284.99 | 76 | 1.70 % | 329.69 | 100 | 2.16 % | 338.36 | 83 | 1.25 % | 235.03 | 36 | 1.29 % | 230.93
prav | prav | | 249 | 1.34 % | 240.55 | 50 | 1.12 % | 216.90 | 87 | 1.88 % | 294.37 | 84 | 1.27 % | 237.86 | 28 | 1.00 % | 179.61
najbrž | najb | rž | 217 | 1.17 % | 209.64 | 44 | 0.98 % | 190.87 | 63 | 1.36 % | 213.17 | 64 | 0.97 % | 181.23 | 46 | 1.64 % | 295.07
mogoče | mogo | če | 184 | 0.99 % | 177.76 | 39 | 0.87 % | 169.18 | 36 | 0.78 % | 121.81 | 60 | 0.91 % | 169.90 | 49 | 1.75 % | 314.32
celo | celo | | 173 | 0.93 % | 167.13 | 47 | 1.05 % | 203.88 | 32 | 0.69 % | 108.28 | 75 | 1.13 % | 212.37 | 19 | 0.68 % | 121.88
menda | mend | a | 149 | 0.80 % | 143.95 | 12 | 0.27 % | 52.06 | 96 | 2.07 % | 324.83 | 28 | 0.42 % | 79.29 | 13 | 0.47 % | 83.39
šele | šele | | 130 | 0.70 % | 125.59 | 29 | 0.65 % | 125.80 | 46 | 0.99 % | 155.65 | 39 | 0.59 % | 110.43 | 16 | 0.57 % | 102.63
baje | baje | | 124 | 0.67 % | 119.79 | 35 | 0.78 % | 151.83 | 72 | 1.55 % | 243.62 | 4 | 0.06 % | 11.33 | 13 | 0.47 % | 83.39
valjda | valj | da | 116 | 0.63 % | 112.07 | 24 | 0.54 % | 104.11 | 70 | 1.51 % | 236.85 | 5 | 0.08 % | 14.16 | 17 | 0.61 % | 109.05
potem | pote | m | 87 | 0.47 % | 84.05 | 19 | 0.42 % | 82.42 | 16 | 0.34 % | 54.14 | 33 | 0.50 % | 93.44 | 19 | 0.68 % | 121.88
ravno | ravn | o | 87 | 0.47 % | 84.05 | 20 | 0.45 % | 86.76 | 10 | 0.22 % | 33.84 | 44 | 0.67 % | 124.59 | 13 | 0.47 % | 83.39
komaj | koma | j | 79 | 0.43 % | 76.32 | 22 | 0.49 % | 95.44 | 41 | 0.88 % | 138.73 | 7 | 0.11 % | 19.82 | 9 | 0.32 % | 57.73
verjetno | verj | etno | 69 | 0.37 % | 66.66 | 7 | 0.16 % | 30.37 | 11 | 0.24 % | 37.22 | 37 | 0.56 % | 104.77 | 14 | 0.50 % | 89.81
morebiti | more | biti | 49 | 0.27 % | 47.34 | 2 | 0.04 % | 8.68 | 17 | 0.37 % | 57.52 | 18 | 0.27 % | 50.97 | 12 | 0.43 % | 76.98
zgolj | zgol | j | 45 | 0.24 % | 43.47 | 11 | 0.25 % | 47.72 | 3 | 0.07 % | 10.15 | 30 | 0.45 % | 84.95 | 1 | 0.04 % | 6.41
vendarle | vend | arle | 44 | 0.24 % | 42.51 | 4 | 0.09 % | 17.35 | 0 | 0 % | 0 | 39 | 0.59 % | 110.43 | 1 | 0.04 % | 6.41
skorajda | skor | ajda | 42 | 0.23 % | 40.58 | 15 | 0.34 % | 65.07 | 1 | 0.02 % | 3.38 | 25 | 0.38 % | 70.79 | 1 | 0.04 % | 6.41
edino | edin | o | 30 | 0.16 % | 28.98 | 6 | 0.13 % | 26.03 | 13 | 0.28 % | 43.99 | 4 | 0.06 % | 11.33 | 7 | 0.25 % | 44.90
vsekakor | vsek | akor | 28 | 0.15 % | 27.05 | 8 | 0.18 % | 34.70 | 0 | 0 % | 0 | 18 | 0.27 % | 50.97 | 2 | 0.07 % | 12.83
minus | minu | s | 22 | 0.12 % | 21.25 | 0 | 0 % | 0 | 0 | 0 % | 0 | 10 | 0.15 % | 28.32 | 12 | 0.43 % | 76.98

2.2.285 List of initial character-level 5-grams from particle standardized forms in the GOS 1.0 corpus with text-type distribution
File at CLARIN.SI: GOS1.0-word_parts-particles-standardized_forms-initial-5grams-taxonomy-entire.tsv
Standardized form | Initial part of the word | Rest of the word | Total absolute frequency of standardized form | Percentage of all found standardized forms | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N]
seveda | seved | a | 1,795 | 29.02 % | 1,734.12 | 543 | 35.19 % | 2,355.51 | 188 | 16.94 % | 636.12 | 896 | 32.07 % | 2,537.17 | 168 | 22.76 % | 1,077.66
sploh | sploh | | 895 | 14.47 % | 864.64 | 161 | 10.43 % | 698.41 | 354 | 31.89 % | 1,197.80 | 219 | 7.84 % | 620.13 | 161 | 21.82 % | 1,032.76
pravzaprav | pravz | aprav | 579 | 9.36 % | 559.36 | 73 | 4.73 % | 316.67 | 18 | 1.62 % | 60.91 | 439 | 15.71 % | 1,243.10 | 49 | 6.64 % | 314.32
torej | torej | | 474
7.66 % | 457.92 | 133 | 8.62 % | 576.95 | 9 | 0.81 % | 30.45 | 297 | 10.63 % | 841 | 35 | 4.74 % | 224.51
predvsem | predv | sem | 335 | 5.42 % | 323.64 | 107 | 6.93 % | 464.16 | 10 | 0.90 % | 33.84 | 187 | 6.69 % | 529.52 | 31 | 4.20 % | 198.85
morda | morda | | 333 | 5.38 % | 321.71 | 138 | 8.94 % | 598.64 | 27 | 2.43 % | 91.36 | 151 | 5.40 % | 427.58 | 17 | 2.30 % | 109.05
skoraj | skora | j | 295 | 4.77 % | 284.99 | 76 | 4.92 % | 329.69 | 100 | 9.01 % | 338.36 | 83 | 2.97 % | 235.03 | 36 | 4.88 % | 230.93
najbrž | najbr | ž | 217 | 3.51 % | 209.64 | 44 | 2.85 % | 190.87 | 63 | 5.68 % | 213.17 | 64 | 2.29 % | 181.23 | 46 | 6.23 % | 295.07
mogoče | mogoč | e | 184 | 2.98 % | 177.76 | 39 | 2.53 % | 169.18 | 36 | 3.24 % | 121.81 | 60 | 2.15 % | 169.90 | 49 | 6.64 % | 314.32
menda | menda | | 149 | 2.41 % | 143.95 | 12 | 0.78 % | 52.06 | 96 | 8.65 % | 324.83 | 28 | 1.00 % | 79.29 | 13 | 1.76 % | 83.39
valjda | valjd | a | 116 | 1.88 % | 112.07 | 24 | 1.55 % | 104.11 | 70 | 6.31 % | 236.85 | 5 | 0.18 % | 14.16 | 17 | 2.30 % | 109.05
potem | potem | | 87 | 1.41 % | 84.05 | 19 | 1.23 % | 82.42 | 16 | 1.44 % | 54.14 | 33 | 1.18 % | 93.44 | 19 | 2.58 % | 121.88
ravno | ravno | | 87 | 1.41 % | 84.05 | 20 | 1.30 % | 86.76 | 10 | 0.90 % | 33.84 | 44 | 1.57 % | 124.59 | 13 | 1.76 % | 83.39
komaj | komaj | | 79 | 1.28 % | 76.32 | 22 | 1.43 % | 95.44 | 41 | 3.69 % | 138.73 | 7 | 0.25 % | 19.82 | 9 | 1.22 % | 57.73
verjetno | verje | tno | 69 | 1.12 % | 66.66 | 7 | 0.45 % | 30.37 | 11 | 0.99 % | 37.22 | 37 | 1.32 % | 104.77 | 14 | 1.90 % | 89.81
morebiti | moreb | iti | 49 | 0.79 % | 47.34 | 2 | 0.13 % | 8.68 | 17 | 1.53 % | 57.52 | 18 | 0.64 % | 50.97 | 12 | 1.63 % | 76.98
zgolj | zgolj | | 45 | 0.73 % | 43.47 | 11 | 0.71 % | 47.72 | 3 | 0.27 % | 10.15 | 30 | 1.07 % | 84.95 | 1 | 0.14 % | 6.41
vendarle | venda | rle | 44 | 0.71 % | 42.51 | 4 | 0.26 % | 17.35 | 0 | 0 % | 0 | 39 | 1.40 % | 110.43 | 1 | 0.14 % | 6.41
skorajda | skora | jda | 42 | 0.68 % | 40.58 | 15 | 0.97 % | 65.07 | 1 | 0.09 % | 3.38 | 25 | 0.90 % | 70.79 | 1 | 0.14 % | 6.41
edino | edino | | 30 | 0.48 % | 28.98 | 6 | 0.39 % | 26.03 | 13 | 1.17 % | 43.99 | 4 | 0.14 % | 11.33 | 7 | 0.95 % | 44.90
vsekakor | vseka | kor | 28 | 0.45 % | 27.05 | 8 | 0.52 % | 34.70 | 0 | 0 % | 0 | 18 | 0.64 % | 50.97 | 2 | 0.27 % | 12.83
minus | minus | | 22 | 0.36 % | 21.25 | 0 | 0 % | 0 | 0 | 0 % | 0 | 10 | 0.36 % | 28.32 | 12 | 1.63 % | 76.98
največ | najve | č | 21 | 0.34 % | 20.29 | 7 | 0.45 % | 30.37 | 0 | 0 % | 0 | 12 | 0.43 % | 33.98 | 2 | 0.27 % | 12.83
bržkone | bržko | ne | 20 | 0.32 % | 19.32 | 20 | 1.30 % | 86.76 | 0 | 0 % | 0 | 0 | 0 % | 0 | 0 | 0 % | 0
natanko | natan | ko | 19 | 0.31 % | 18.36 | 10 | 0.65 % | 43.38 | 0 | 0 % | 0 | 9 | 0.32 % | 25.48 | 0 | 0 % | 0
bojda | bojda | | 16 | 0.26 % | 15.46 | 8 | 0.52 % | 34.70 | 8 | 0.72 % | 27.07 | 0 | 0 % | 0 | 0 | 0 % | 0
najmanj | najma | nj | 16 | 0.26 % | 15.46 | 2 | 0.13 % | 8.68 | 2 | 0.18 % | 6.77 | 10 | 0.36 % | 28.32 | 2 | 0.27 % | 12.83
približno | pribl | ižno | 14 | 0.23 % | 13.53 | 7 | 0.45 % | 30.37 | 0 | 0 % | 0 | 6 | 0.21 % | 16.99 | 1 | 0.14 % | 6.41
končno | končn | o | 13 | 0.21 % | 12.56 | 3 | 0.19 % | 13.01 | 1 | 0.09 % | 3.38 | 9 | 0.32 % | 25.48 | 0 | 0 % | 0
razen | razen | | 13 | 0.21 % | 12.56 | 2 | 0.13 % | 8.68 | 5 | 0.45 % | 16.92 | 4 | 0.14 % | 11.33 | 2 | 0.27 % | 12.83
kvečjemu | kvečj | emu | 12 | 0.19 % | 11.59 | 1 | 0.07 % | 4.34 | 2 | 0.18 % | 6.77 | 5 | 0.18 % | 14.16 | 4 | 0.54 % | 25.66
okrog | okrog | | 11 | 0.18 % | 10.63 | 1 | 0.07 % | 4.34 | 0 | 0 % | 0 | 5 | 0.18 % | 14.16 | 5 | 0.68 % | 32.07

2.2.286 List of final character-level 1-grams from particle standardized forms in the GOS 1.0 corpus with text-type distribution
File at CLARIN.SI: GOS1.0-word_parts-particles-standardized_forms-final-1grams-taxonomy-entire.tsv
Standardized form | Rest of the word | Final part of the word | Total absolute frequency of standardized form | Percentage of all found standardized forms | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N]
ne | n | e | 31,733 | 31.63 % | 30,656.73 | 6,435 | 29.95 % | 27,914.79 | 11,588 | 32.66 % | 39,209.32 | 7,780 | 30.90 % | 22,030.36 | 5,930 | 32.62 % | 38,038.91
ja | j | a | 25,555 | 25.47 % | 24,688.27 | 4,720 | 21.97 % | 20,475.18 | 11,541 | 32.53 % | 39,050.29 | 3,821 | 15.18 % | 10,819.80 | 5,473 | 30.10 % | 35,107.41
tudi | tud | i | 7,945 | 7.92 % | 7,675.53 | 2,076 | 9.66 % | 9,005.61 | 1,644 | 4.63 % | 5,562.66 | 2,890 | 11.48 % | 8,183.51 | 1,335 | 7.34 % | 8,563.57
še | š | e | 7,184 | 7.16 % | 6,940.35 | 1,782 | 8.29 % | 7,730.25 | 2,116 | 5.96 % | 7,159.73 | 2,127 | 8.45 % | 6,022.95 | 1,159 | 6.38 % | 7,434.59
no | n | o | 4,691 | 4.68 % | 4,531.90 | 1,289
6.00 % | 5,591.63 | 1,566 | 4.41 % | 5,298.74 | 1,114 | 4.42 % | 3,154.48 | 722 | 3.97 % | 4,631.38
že | ž | e | 4,447 | 4.43 % | 4,296.17 | 1,071 | 4.98 % | 4,645.96 | 1,498 | 4.22 % | 5,068.65 | 1,191 | 4.73 % | 3,372.51 | 687 | 3.78 % | 4,406.87
pač | pa | č | 2,851 | 2.84 % | 2,754.30 | 402 | 1.87 % | 1,743.86 | 960 | 2.71 % | 3,248.27 | 830 | 3.30 % | 2,350.28 | 659 | 3.62 % | 4,227.26
seveda | seved | a | 1,795 | 1.79 % | 1,734.12 | 543 | 2.53 % | 2,355.51 | 188 | 0.53 % | 636.12 | 896 | 3.56 % | 2,537.17 | 168 | 0.92 % | 1,077.66
res | re | s | 1,699 | 1.69 % | 1,641.38 | 467 | 2.17 % | 2,025.83 | 577 | 1.63 % | 1,952.35 | 360 | 1.43 % | 1,019.40 | 295 | 1.62 % | 1,892.32
samo | sam | o | 1,469 | 1.46 % | 1,419.18 | 254 | 1.18 % | 1,101.84 | 565 | 1.59 % | 1,911.74 | 419 | 1.66 % | 1,186.47 | 231 | 1.27 % | 1,481.79
več | ve | č | 1,332 | 1.33 % | 1,286.82 | 287 | 1.34 % | 1,245 | 442 | 1.25 % | 1,495.56 | 445 | 1.77 % | 1,260.09 | 158 | 0.87 % | 1,013.52
sploh | splo | h | 895 | 0.89 % | 864.64 | 161 | 0.75 % | 698.41 | 354 | 1.00 % | 1,197.80 | 219 | 0.87 % | 620.13 | 161 | 0.89 % | 1,032.76
kar | ka | r | 793 | 0.79 % | 766.10 | 183 | 0.85 % | 793.85 | 160 | 0.45 % | 541.38 | 336 | 1.33 % | 951.44 | 114 | 0.63 % | 731.27
naj | na | j | 792 | 0.79 % | 765.14 | 180 | 0.84 % | 780.83 | 179 | 0.51 % | 605.67 | 331 | 1.31 % | 937.28 | 102 | 0.56 % | 654.29
okej | oke | j | 656 | 0.65 % | 633.75 | 191 | 0.89 % | 828.55 | 175 | 0.49 % | 592.13 | 96 | 0.38 % | 271.84 | 194 | 1.07 % | 1,244.44
pravzaprav | pravzapra | v | 579 | 0.58 % | 559.36 | 73 | 0.34 % | 316.67 | 18 | 0.05 % | 60.91 | 439 | 1.74 % | 1,243.10 | 49 | 0.27 % | 314.32
glih | gli | h | 542 | 0.54 % | 523.62 | 74 | 0.34 % | 321.01 | 373 | 1.05 % | 1,262.09 | 36 | 0.14 % | 101.94 | 59 | 0.33 % | 378.46
itak | ita | k | 527 | 0.53 % | 509.13 | 50 | 0.23 % | 216.90 | 389 | 1.10 % | 1,316.23 | 29 | 0.12 % | 82.12 | 59 | 0.33 % | 378.46
torej | tore | j | 474 | 0.47 % | 457.92 | 133 | 0.62 % | 576.95 | 9 | 0.03 % | 30.45 | 297 | 1.18 % | 841 | 35 | 0.19 % | 224.51
vsaj | vsa | j | 448 | 0.45 % | 432.81 | 113 | 0.53 % | 490.19 | 119 | 0.34 % | 402.65 | 128 | 0.51 % | 362.45 | 88 | 0.48 % | 564.49
da | d | a | 354 | 0.35 % | 341.99 | 51 | 0.24 % | 221.24 | 162 | 0.46 % | 548.15 | 91 | 0.36 % | 257.68 | 50 | 0.28 % | 320.73
predvsem | predvse | m | 335 | 0.33 % | 323.64 | 107 | 0.50 % | 464.16 | 10 | 0.03 % | 33.84 | 187 | 0.74 % | 529.52 | 31 | 0.17 % | 198.85
morda | mord | a | 333 | 0.33 % | 321.71 | 138 | 0.64 % | 598.64 | 27 | 0.08 % | 91.36 | 151 | 0.60 % | 427.58 | 17 | 0.09 % | 109.05
skoraj | skora | j | 295 | 0.29 % | 284.99 | 76 | 0.35 % | 329.69 | 100 | 0.28 % | 338.36 | 83 | 0.33 % | 235.03 | 36 | 0.20 % | 230.93
prav | pra | v | 249 | 0.25 % | 240.55 | 50 | 0.23 % | 216.90 | 87 | 0.24 % | 294.37 | 84 | 0.33 % | 237.86 | 28 | 0.15 % | 179.61
le | l | e | 240 | 0.24 % | 231.86 | 98 | 0.46 % | 425.12 | 33 | 0.09 % | 111.66 | 91 | 0.36 % | 257.68 | 18 | 0.10 % | 115.46
najbrž | najbr | ž | 217 | 0.22 % | 209.64 | 44 | 0.20 % | 190.87 | 63 | 0.18 % | 213.17 | 64 | 0.25 % | 181.23 | 46 | 0.25 % | 295.07
mogoče | mogoč | e | 184 | 0.18 % | 177.76 | 39 | 0.18 % | 169.18 | 36 | 0.10 % | 121.81 | 60 | 0.24 % | 169.90 | 49 | 0.27 % | 314.32
celo | cel | o | 173 | 0.17 % | 167.13 | 47 | 0.22 % | 203.88 | 32 | 0.09 % | 108.28 | 75 | 0.30 % | 212.37 | 19 | 0.10 % | 121.88
menda | mend | a | 149 | 0.15 % | 143.95 | 12 | 0.06 % | 52.06 | 96 | 0.27 % | 324.83 | 28 | 0.11 % | 79.29 | 13 | 0.07 % | 83.39
šele | šel | e | 130 | 0.13 % | 125.59 | 29 | 0.14 % | 125.80 | 46 | 0.13 % | 155.65 | 39 | 0.15 % | 110.43 | 16 | 0.09 % | 102.63
baje | baj | e | 124 | 0.12 % | 119.79 | 35 | 0.16 % | 151.83 | 72 | 0.20 % | 243.62 | 4 | 0.02 % | 11.33 | 13 | 0.07 % | 83.39

2.2.287 List of final character-level 2-grams from particle standardized forms in the GOS 1.0 corpus with text-type distribution
File at CLARIN.SI: GOS1.0-word_parts-particles-standardized_forms-final-2grams-taxonomy-entire.tsv
Standardized form | Rest of the word | Final part of the word | Total absolute frequency of standardized form | Percentage of all found standardized forms | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N]
ne | | ne | 31,733 | 31.63 % | 30,656.73 | 6,435 | 29.95 % | 27,914.79 | 11,588 | 32.66 % | 39,209.32 | 7,780 | 30.90 % | 22,030.36 | 5,930 | 32.62 % | 38,038.91
ja | | ja | 25,555 | 25.47 % | 24,688.27 | 4,720 | 21.97 % | 20,475.18 | 11,541 | 32.53 % | 39,050.29 | 3,821 | 15.18 % | 10,819.80 | 5,473 | 30.10 % | 35,107.41
tudi | tu | di | 7,945 | 7.92 % | 7,675.53 | 2,076 | 9.66 %
9,005.61 1,644 4.63 % 5,562.66 2,890 11.48 % 8,183.51 1,335 7.34 % 8,563.57 še še 7,184 7.16 % 6,940.35 1,782 8.29 % 7,730.25 2,116 5.96 % 7,159.73 2,127 8.45 % 6,022.95 1,159 6.38 % 7,434.59 no no 4,691 4.68 % 4,531.90 1,289 6.00 % 5,591.63 1,566 4.41 % 5,298.74 1,114 4.42 % 3,154.48 722 3.97 % 4,631.38 že že 4,447 4.43 % 4,296.17 1,071 4.98 % 4,645.96 1,498 4.22 % 5,068.65 1,191 4.73 % 3,372.51 687 3.78 % 4,406.87 pač p ač 2,851 2.84 % 2,754.30 402 1.87 % 1,743.86 960 2.71 % 3,248.27 830 3.30 % 2,350.28 659 3.62 % 4,227.26 seveda seve da 1,795 1.79 % 1,734.12 543 2.53 % 2,355.51 188 0.53 % 636.12 896 3.56 % 2,537.17 168 0.92 % 1,077.66 res r es 1,699 1.69 % 1,641.38 467 2.17 % 2,025.83 577 1.63 % 1,952.35 360 1.43 % 1,019.40 295 1.62 % 1,892.32 samo sa mo 1,469 1.46 % 1,419.18 254 1.18 % 1,101.84 565 1.59 % 1,911.74 419 1.66 % 1,186.47 231 1.27 % 1,481.79 več v eč 1,332 1.33 % 1,286.82 287 1.34 % 1,245 442 1.25 % 1,495.56 445 1.77 % 1,260.09 158 0.87 % 1,013.52 sploh spl oh 895 0.89 % 864.64 161 0.75 % 698.41 354 1.00 % 1,197.80 219 0.87 % 620.13 161 0.89 % 1,032.76 kar k ar 793 0.79 % 766.10 183 0.85 % 793.85 160 0.45 % 541.38 336 1.33 % 951.44 114 0.63 % 731.27 naj n aj 792 0.79 % 765.14 180 0.84 % 780.83 179 0.51 % 605.67 331 1.31 % 937.28 102 0.56 % 654.29 okej ok ej 656 0.65 % 633.75 191 0.89 % 828.55 175 0.49 % 592.13 96 0.38 % 271.84 194 1.07 % 1,244.44 pravzaprav pravzapr av 579 0.58 % 559.36 73 0.34 % 316.67 18 0.05 % 60.91 439 1.74 % 1,243.10 49 0.27 % 314.32 glih gl ih 542 0.54 % 523.62 74 0.34 % 321.01 373 1.05 % 1,262.09 36 0.14 % 101.94 59 0.33 % 378.46 itak it ak 527 0.53 % 509.13 50 0.23 % 216.90 389 1.10 % 1,316.23 29 0.12 % 82.12 59 0.33 % 378.46 torej tor ej 474 0.47 % 457.92 133 0.62 % 576.95 9 0.03 % 30.45 297 1.18 % 841 35 0.19 % 224.51 vsaj vs aj 448 0.45 % 432.81 113 0.53 % 490.19 119 0.34 % 402.65 128 0.51 % 362.45 88 0.48 % 564.49 da da 354 0.35 % 341.99 51 0.24 % 221.24 162 0.46 % 548.15 91 0.36 % 257.68 50 0.28 % 320.73 predvsem predvs 
em 335 0.33 % 323.64 107 0.50 % 464.16 10 0.03 % 33.84 187 0.74 % 529.52 31 0.17 % 198.85 morda mor da 333 0.33 % 321.71 138 0.64 % 598.64 27 0.08 % 91.36 151 0.60 % 427.58 17 0.09 % 109.05 skoraj skor aj 295 0.29 % 284.99 76 0.35 % 329.69 100 0.28 % 338.36 83 0.33 % 235.03 36 0.20 % 230.93 prav pr av 249 0.25 % 240.55 50 0.23 % 216.90 87 0.24 % 294.37 84 0.33 % 237.86 28 0.15 % 179.61 le le 240 0.24 % 231.86 98 0.46 % 425.12 33 0.09 % 111.66 91 0.36 % 257.68 18 0.10 % 115.46 najbrž najb rž 217 0.22 % 209.64 44 0.20 % 190.87 63 0.18 % 213.17 64 0.25 % 181.23 46 0.25 % 295.07 mogoče mogo če 184 0.18 % 177.76 39 0.18 % 169.18 36 0.10 % 121.81 60 0.24 % 169.90 49 0.27 % 314.32 celo ce lo 173 0.17 % 167.13 47 0.22 % 203.88 32 0.09 % 108.28 75 0.30 % 212.37 19 0.10 % 121.88 menda men da 149 0.15 % 143.95 12 0.06 % 52.06 96 0.27 % 324.83 28 0.11 % 79.29 13 0.07 % 83.39 šele še le 130 0.13 % 125.59 29 0.14 % 125.80 46 0.13 % 155.65 39 0.15 % 110.43 16 0.09 % 102.63 baje ba je 124 0.12 % 119.79 35 0.16 % 151.83 72 0.20 % 243.62 4 0.02 % 11.33 13 0.07 % 83.39 CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 631 File at CLARIN.SI2.2.288 List of final character-level 3-grams from particle standardized forms in the GOS 1.0 corpus with text-type distributionGOS1.0-word_parts-particles-standardized_forms- final-3grams-taxonomy-entire.tsvStandardized form Rest of the word Final part of the word Total absolute frequency of standardized form Percentage of all found standardized forms Total relative frequency (per million occurrences) Absolute frequency [gos.T.J.R] Percentage [gos.T.J.R] Relative frequency [gos.T.J.R] Absolute frequency [gos.T.N.Z] Percentage [gos.T.N.Z] Relative frequency [gos.T.N.Z] Absolute frequency [gos.T.J.I] Percentage [gos.T.J.I] Relative frequency [gos.T.J.I] Absolute frequency [gos.T.N.N] Percentage [gos.T.N.N] Relative frequency [gos.T.N.N] tudi t udi 7,945 30.45 % 7,675.53 2,076 34.45 % 9,005.61 1,644 23.58 % 5,562.66 2,890 
32.28 % 8,183.51 1,335 32.26 % 8,563.57 pač pač 2,851 10.93 % 2,754.30 402 6.67 % 1,743.86 960 13.77 % 3,248.27 830 9.27 % 2,350.28 659 15.93 % 4,227.26 seveda sev eda 1,795 6.88 % 1,734.12 543 9.01 % 2,355.51 188 2.70 % 636.12 896 10.01 % 2,537.17 168 4.06 % 1,077.66 res res 1,699 6.51 % 1,641.38 467 7.75 % 2,025.83 577 8.28 % 1,952.35 360 4.02 % 1,019.40 295 7.13 % 1,892.32 samo s amo 1,469 5.63 % 1,419.18 254 4.21 % 1,101.84 565 8.10 % 1,911.74 419 4.68 % 1,186.47 231 5.58 % 1,481.79 več več 1,332 5.11 % 1,286.82 287 4.76 % 1,245 442 6.34 % 1,495.56 445 4.97 % 1,260.09 158 3.82 % 1,013.52 sploh sp loh 895 3.43 % 864.64 161 2.67 % 698.41 354 5.08 % 1,197.80 219 2.45 % 620.13 161 3.89 % 1,032.76 kar kar 793 3.04 % 766.10 183 3.04 % 793.85 160 2.29 % 541.38 336 3.75 % 951.44 114 2.75 % 731.27 naj naj 792 3.04 % 765.14 180 2.99 % 780.83 179 2.57 % 605.67 331 3.70 % 937.28 102 2.46 % 654.29 okej o kej 656 2.51 % 633.75 191 3.17 % 828.55 175 2.51 % 592.13 96 1.07 % 271.84 194 4.69 % 1,244.44 pravzaprav pravzap rav 579 2.22 % 559.36 73 1.21 % 316.67 18 0.26 % 60.91 439 4.90 % 1,243.10 49 1.18 % 314.32 glih g lih 542 2.08 % 523.62 74 1.23 % 321.01 373 5.35 % 1,262.09 36 0.40 % 101.94 59 1.43 % 378.46 itak i tak 527 2.02 % 509.13 50 0.83 % 216.90 389 5.58 % 1,316.23 29 0.32 % 82.12 59 1.43 % 378.46 torej to rej 474 1.82 % 457.92 133 2.21 % 576.95 9 0.13 % 30.45 297 3.32 % 841 35 0.85 % 224.51 vsaj v saj 448 1.72 % 432.81 113 1.88 % 490.19 119 1.71 % 402.65 128 1.43 % 362.45 88 2.13 % 564.49 predvsem predv sem 335 1.28 % 323.64 107 1.78 % 464.16 10 0.14 % 33.84 187 2.09 % 529.52 31 0.75 % 198.85 morda mo rda 333 1.28 % 321.71 138 2.29 % 598.64 27 0.39 % 91.36 151 1.69 % 427.58 17 0.41 % 109.05 skoraj sko raj 295 1.13 % 284.99 76 1.26 % 329.69 100 1.43 % 338.36 83 0.93 % 235.03 36 0.87 % 230.93 prav p rav 249 0.95 % 240.55 50 0.83 % 216.90 87 1.25 % 294.37 84 0.94 % 237.86 28 0.68 % 179.61 najbrž naj brž 217 0.83 % 209.64 44 0.73 % 190.87 63 0.90 % 213.17 64 0.71 % 181.23 
46 1.11 % 295.07 mogoče mog oče 184 0.70 % 177.76 39 0.65 % 169.18 36 0.52 % 121.81 60 0.67 % 169.90 49 1.18 % 314.32 celo c elo 173 0.66 % 167.13 47 0.78 % 203.88 32 0.46 % 108.28 75 0.84 % 212.37 19 0.46 % 121.88 menda me nda 149 0.57 % 143.95 12 0.20 % 52.06 96 1.38 % 324.83 28 0.31 % 79.29 13 0.31 % 83.39 šele š ele 130 0.50 % 125.59 29 0.48 % 125.80 46 0.66 % 155.65 39 0.44 % 110.43 16 0.39 % 102.63 baje b aje 124 0.47 % 119.79 35 0.58 % 151.83 72 1.03 % 243.62 4 0.04 % 11.33 13 0.31 % 83.39 valjda val jda 116 0.45 % 112.07 24 0.40 % 104.11 70 1.00 % 236.85 5 0.06 % 14.16 17 0.41 % 109.05 žal žal 92 0.35 % 88.88 34 0.56 % 147.49 15 0.21 % 50.75 31 0.35 % 87.78 12 0.29 % 76.98 potem po tem 87 0.33 % 84.05 19 0.32 % 82.42 16 0.23 % 54.14 33 0.37 % 93.44 19 0.46 % 121.88 ravno ra vno 87 0.33 % 84.05 20 0.33 % 86.76 10 0.14 % 33.84 44 0.49 % 124.59 13 0.31 % 83.39 komaj ko maj 79 0.30 % 76.32 22 0.36 % 95.44 41 0.59 % 138.73 7 0.08 % 19.82 9 0.22 % 57.73 verjetno verje tno 69 0.26 % 66.66 7 0.12 % 30.37 11 0.16 % 37.22 37 0.41 % 104.77 14 0.34 % 89.81 morebiti moreb iti 49 0.19 % 47.34 2 0.03 % 8.68 17 0.24 % 57.52 18 0.20 % 50.97 12 0.29 % 76.98 CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 632 File at CLARIN.SI2.2.289 List of final character-level 4-grams from particle standardized forms in the GOS 1.0 corpus with text-type distributionGOS1.0-word_parts-particles-standardized_forms- final-4grams-taxonomy-entire.tsvStandardized form Rest of the word Final part of the word Total absolute frequency of standardized form Percentage of all found standardized forms Total relative frequency (per million occurrences) Absolute frequency [gos.T.J.R] Percentage [gos.T.J.R] Relative frequency [gos.T.J.R] Absolute frequency [gos.T.N.Z] Percentage [gos.T.N.Z] Relative frequency [gos.T.N.Z] Absolute frequency [gos.T.J.I] Percentage [gos.T.J.I] Relative frequency [gos.T.J.I] Absolute frequency [gos.T.N.N] Percentage [gos.T.N.N] Relative 
frequency [gos.T.N.N] tudi tudi 7,945 42.89 % 7,675.53 2,076 46.42 % 9,005.61 1,644 35.46 % 5,562.66 2,890 43.67 % 8,183.51 1,335 47.71 % 8,563.57 seveda se veda 1,795 9.69 % 1,734.12 543 12.14 % 2,355.51 188 4.05 % 636.12 896 13.54 % 2,537.17 168 6.00 % 1,077.66 samo samo 1,469 7.93 % 1,419.18 254 5.68 % 1,101.84 565 12.19 % 1,911.74 419 6.33 % 1,186.47 231 8.26 % 1,481.79 sploh s ploh 895 4.83 % 864.64 161 3.60 % 698.41 354 7.64 % 1,197.80 219 3.31 % 620.13 161 5.75 % 1,032.76 okej okej 656 3.54 % 633.75 191 4.27 % 828.55 175 3.77 % 592.13 96 1.45 % 271.84 194 6.93 % 1,244.44 pravzaprav pravza prav 579 3.13 % 559.36 73 1.63 % 316.67 18 0.39 % 60.91 439 6.63 % 1,243.10 49 1.75 % 314.32 glih glih 542 2.93 % 523.62 74 1.66 % 321.01 373 8.05 % 1,262.09 36 0.54 % 101.94 59 2.11 % 378.46 itak itak 527 2.85 % 509.13 50 1.12 % 216.90 389 8.39 % 1,316.23 29 0.44 % 82.12 59 2.11 % 378.46 torej t orej 474 2.56 % 457.92 133 2.97 % 576.95 9 0.19 % 30.45 297 4.49 % 841 35 1.25 % 224.51 vsaj vsaj 448 2.42 % 432.81 113 2.53 % 490.19 119 2.57 % 402.65 128 1.93 % 362.45 88 3.15 % 564.49 predvsem pred vsem 335 1.81 % 323.64 107 2.39 % 464.16 10 0.22 % 33.84 187 2.83 % 529.52 31 1.11 % 198.85 morda m orda 333 1.80 % 321.71 138 3.09 % 598.64 27 0.58 % 91.36 151 2.28 % 427.58 17 0.61 % 109.05 skoraj sk oraj 295 1.59 % 284.99 76 1.70 % 329.69 100 2.16 % 338.36 83 1.25 % 235.03 36 1.29 % 230.93 prav prav 249 1.34 % 240.55 50 1.12 % 216.90 87 1.88 % 294.37 84 1.27 % 237.86 28 1.00 % 179.61 najbrž na jbrž 217 1.17 % 209.64 44 0.98 % 190.87 63 1.36 % 213.17 64 0.97 % 181.23 46 1.64 % 295.07 mogoče mo goče 184 0.99 % 177.76 39 0.87 % 169.18 36 0.78 % 121.81 60 0.91 % 169.90 49 1.75 % 314.32 celo celo 173 0.93 % 167.13 47 1.05 % 203.88 32 0.69 % 108.28 75 1.13 % 212.37 19 0.68 % 121.88 menda m enda 149 0.80 % 143.95 12 0.27 % 52.06 96 2.07 % 324.83 28 0.42 % 79.29 13 0.47 % 83.39 šele šele 130 0.70 % 125.59 29 0.65 % 125.80 46 0.99 % 155.65 39 0.59 % 110.43 16 0.57 % 102.63 baje baje 124 
0.67 % 119.79 35 0.78 % 151.83 72 1.55 % 243.62 4 0.06 % 11.33 13 0.47 % 83.39 valjda va ljda 116 0.63 % 112.07 24 0.54 % 104.11 70 1.51 % 236.85 5 0.08 % 14.16 17 0.61 % 109.05 potem p otem 87 0.47 % 84.05 19 0.42 % 82.42 16 0.34 % 54.14 33 0.50 % 93.44 19 0.68 % 121.88 ravno r avno 87 0.47 % 84.05 20 0.45 % 86.76 10 0.22 % 33.84 44 0.67 % 124.59 13 0.47 % 83.39 komaj k omaj 79 0.43 % 76.32 22 0.49 % 95.44 41 0.88 % 138.73 7 0.11 % 19.82 9 0.32 % 57.73 verjetno verj etno 69 0.37 % 66.66 7 0.16 % 30.37 11 0.24 % 37.22 37 0.56 % 104.77 14 0.50 % 89.81 morebiti more biti 49 0.27 % 47.34 2 0.04 % 8.68 17 0.37 % 57.52 18 0.27 % 50.97 12 0.43 % 76.98 zgolj z golj 45 0.24 % 43.47 11 0.25 % 47.72 3 0.07 % 10.15 30 0.45 % 84.95 1 0.04 % 6.41 vendarle vend arle 44 0.24 % 42.51 4 0.09 % 17.35 0 0 % 0 39 0.59 % 110.43 1 0.04 % 6.41 skorajda skor ajda 42 0.23 % 40.58 15 0.34 % 65.07 1 0.02 % 3.38 25 0.38 % 70.79 1 0.04 % 6.41 edino e dino 30 0.16 % 28.98 6 0.13 % 26.03 13 0.28 % 43.99 4 0.06 % 11.33 7 0.25 % 44.90 vsekakor vsek akor 28 0.15 % 27.05 8 0.18 % 34.70 0 0 % 0 18 0.27 % 50.97 2 0.07 % 12.83 minus m inus 22 0.12 % 21.25 0 0 % 0 0 0 % 0 10 0.15 % 28.32 12 0.43 % 76.98 CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 633 File at CLARIN.SI2.2.290 List of final character-level 5-grams from particle standardized forms in the GOS 1.0 corpus with text-type distributionGOS1.0-word_parts-particles-standardized_forms- final-5grams-taxonomy-entire.tsvStandardized form Rest of the word Final part of the word Total absolute frequency of standardized form Percentage of all found standardized forms Total relative frequency (per million occurrences) Absolute frequency [gos.T.J.R] Percentage [gos.T.J.R] Relative frequency [gos.T.J.R] Absolute frequency [gos.T.N.Z] Percentage [gos.T.N.Z] Relative frequency [gos.T.N.Z] Absolute frequency [gos.T.J.I] Percentage [gos.T.J.I] Relative frequency [gos.T.J.I] Absolute frequency [gos.T.N.N] Percentage [gos.T.N.N] 
Relative frequency [gos.T.N.N] seveda s eveda 1,795 29.02 % 1,734.12 543 35.19 % 2,355.51 188 16.94 % 636.12 896 32.07 % 2,537.17 168 22.76 % 1,077.66 sploh sploh 895 14.47 % 864.64 161 10.43 % 698.41 354 31.89 % 1,197.80 219 7.84 % 620.13 161 21.82 % 1,032.76 pravzaprav pravz aprav 579 9.36 % 559.36 73 4.73 % 316.67 18 1.62 % 60.91 439 15.71 % 1,243.10 49 6.64 % 314.32 torej torej 474 7.66 % 457.92 133 8.62 % 576.95 9 0.81 % 30.45 297 10.63 % 841 35 4.74 % 224.51 predvsem pre dvsem 335 5.42 % 323.64 107 6.93 % 464.16 10 0.90 % 33.84 187 6.69 % 529.52 31 4.20 % 198.85 morda morda 333 5.38 % 321.71 138 8.94 % 598.64 27 2.43 % 91.36 151 5.40 % 427.58 17 2.30 % 109.05 skoraj s koraj 295 4.77 % 284.99 76 4.92 % 329.69 100 9.01 % 338.36 83 2.97 % 235.03 36 4.88 % 230.93 najbrž n ajbrž 217 3.51 % 209.64 44 2.85 % 190.87 63 5.68 % 213.17 64 2.29 % 181.23 46 6.23 % 295.07 mogoče m ogoče 184 2.98 % 177.76 39 2.53 % 169.18 36 3.24 % 121.81 60 2.15 % 169.90 49 6.64 % 314.32 menda menda 149 2.41 % 143.95 12 0.78 % 52.06 96 8.65 % 324.83 28 1.00 % 79.29 13 1.76 % 83.39 valjda v aljda 116 1.88 % 112.07 24 1.55 % 104.11 70 6.31 % 236.85 5 0.18 % 14.16 17 2.30 % 109.05 potem potem 87 1.41 % 84.05 19 1.23 % 82.42 16 1.44 % 54.14 33 1.18 % 93.44 19 2.58 % 121.88 ravno ravno 87 1.41 % 84.05 20 1.30 % 86.76 10 0.90 % 33.84 44 1.57 % 124.59 13 1.76 % 83.39 komaj komaj 79 1.28 % 76.32 22 1.43 % 95.44 41 3.69 % 138.73 7 0.25 % 19.82 9 1.22 % 57.73 verjetno ver jetno 69 1.12 % 66.66 7 0.45 % 30.37 11 0.99 % 37.22 37 1.32 % 104.77 14 1.90 % 89.81 morebiti mor ebiti 49 0.79 % 47.34 2 0.13 % 8.68 17 1.53 % 57.52 18 0.64 % 50.97 12 1.63 % 76.98 zgolj zgolj 45 0.73 % 43.47 11 0.71 % 47.72 3 0.27 % 10.15 30 1.07 % 84.95 1 0.14 % 6.41 vendarle ven darle 44 0.71 % 42.51 4 0.26 % 17.35 0 0 % 0 39 1.40 % 110.43 1 0.14 % 6.41 skorajda sko rajda 42 0.68 % 40.58 15 0.97 % 65.07 1 0.09 % 3.38 25 0.90 % 70.79 1 0.14 % 6.41 edino edino 30 0.48 % 28.98 6 0.39 % 26.03 13 1.17 % 43.99 4 0.14 % 11.33 7 0.95 
% 44.90 vsekakor vse kakor 28 0.45 % 27.05 8 0.52 % 34.70 0 0 % 0 18 0.64 % 50.97 2 0.27 % 12.83 minus minus 22 0.36 % 21.25 0 0 % 0 0 0 % 0 10 0.36 % 28.32 12 1.63 % 76.98 največ n ajveč 21 0.34 % 20.29 7 0.45 % 30.37 0 0 % 0 12 0.43 % 33.98 2 0.27 % 12.83 bržkone br žkone 20 0.32 % 19.32 20 1.30 % 86.76 0 0 % 0 0 0 % 0 0 0 % 0 natanko na tanko 19 0.31 % 18.36 10 0.65 % 43.38 0 0 % 0 9 0.32 % 25.48 0 0 % 0 bojda bojda 16 0.26 % 15.46 8 0.52 % 34.70 8 0.72 % 27.07 0 0 % 0 0 0 % 0 najmanj na jmanj 16 0.26 % 15.46 2 0.13 % 8.68 2 0.18 % 6.77 10 0.36 % 28.32 2 0.27 % 12.83 približno prib ližno 14 0.23 % 13.53 7 0.45 % 30.37 0 0 % 0 6 0.21 % 16.99 1 0.14 % 6.41 končno k ončno 13 0.21 % 12.56 3 0.19 % 13.01 1 0.09 % 3.38 9 0.32 % 25.48 0 0 % 0 razen razen 13 0.21 % 12.56 2 0.13 % 8.68 5 0.45 % 16.92 4 0.14 % 11.33 2 0.27 % 12.83 kvečjemu kve čjemu 12 0.19 % 11.59 1 0.07 % 4.34 2 0.18 % 6.77 5 0.18 % 14.16 4 0.54 % 25.66 okrog okrog 11 0.18 % 10.63 1 0.07 % 4.34 0 0 % 0 5 0.18 % 14.16 5 0.68 % 32.07 CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 634 File at CLARIN.SI2.2.291 List of initial character-level 1-grams from particle lower-case word forms in the GOS 1.0 corpus with text- type distributionGOS1.0-word_parts-particles-lowercase_forms- initial-1grams-taxonomy-entire.tsvLower-case word form Initial part of the word Rest of the word Total absolute frequency of lower-case word form Percentage of all found lower-case word forms Total relative frequency (per million occurrences) Absolute frequency [gos.T.J.R] Percentage [gos.T.J.R] Relative frequency [gos.T.J.R] Absolute frequency [gos.T.N.Z] Percentage [gos.T.N.Z] Relative frequency [gos.T.N.Z] Absolute frequency [gos.T.J.I] Percentage [gos.T.J.I] Relative frequency [gos.T.J.I] Absolute frequency [gos.T.N.N] Percentage [gos.T.N.N] Relative frequency [gos.T.N.N] ne n e 29,863 29.77 % 28,850.16 6,135 28.55 % 26,613.40 10,304 29.04 % 34,864.76 7,592 30.16 % 21,498.01 5,832 32.08 % 
37,410.27 ja j a 25,105 25.02 % 24,253.53 4,628 21.54 % 20,076.09 11,249 31.71 % 38,062.27 3,787 15.04 % 10,723.52 5,441 29.93 % 34,902.14 še š e 7,058 7.04 % 6,818.62 1,756 8.17 % 7,617.46 2,021 5.70 % 6,838.28 2,126 8.45 % 6,020.12 1,155 6.35 % 7,408.93 no n o 4,623 4.61 % 4,466.20 1,282 5.97 % 5,561.27 1,511 4.26 % 5,112.64 1,111 4.41 % 3,145.98 719 3.96 % 4,612.14 že ž e 4,273 4.26 % 4,128.08 1,044 4.86 % 4,528.83 1,357 3.83 % 4,591.56 1,185 4.71 % 3,355.52 687 3.78 % 4,406.87 tudi t udi 4,119 4.11 % 3,979.30 1,289 6.00 % 5,591.63 315 0.89 % 1,065.84 2,086 8.29 % 5,906.86 429 2.36 % 2,751.89 tud t ud 3,570 3.56 % 3,448.92 709 3.30 % 3,075.62 1,187 3.35 % 4,016.35 795 3.16 % 2,251.17 879 4.83 % 5,638.48 pač p ač 2,833 2.82 % 2,736.92 401 1.87 % 1,739.52 954 2.69 % 3,227.97 826 3.28 % 2,338.96 652 3.59 % 4,182.36 res r es 1,625 1.62 % 1,569.89 457 2.13 % 1,982.45 521 1.47 % 1,762.86 355 1.41 % 1,005.24 292 1.61 % 1,873.08 seveda s eveda 1,593 1.59 % 1,538.97 499 2.32 % 2,164.64 126 0.35 % 426.34 816 3.24 % 2,310.64 152 0.84 % 975.03 več v eč 1,312 1.31 % 1,267.50 287 1.34 % 1,245 422 1.19 % 1,427.89 445 1.77 % 1,260.09 158 0.87 % 1,013.52 sploh s ploh 866 0.86 % 836.63 147 0.68 % 637.68 340 0.96 % 1,150.43 218 0.87 % 617.30 161 0.89 % 1,032.76 nej n ej 801 0.80 % 773.83 230 1.07 % 997.73 408 1.15 % 1,380.51 104 0.41 % 294.49 59 0.33 % 378.46 samo s amo 800 0.80 % 772.87 174 0.81 % 754.81 156 0.44 % 527.84 349 1.39 % 988.25 121 0.67 % 776.17 okej o kej 638 0.64 % 616.36 190 0.88 % 824.21 163 0.46 % 551.53 96 0.38 % 271.84 189 1.04 % 1,212.37 sam s am 556 0.55 % 537.14 76 0.35 % 329.69 314 0.89 % 1,062.45 66 0.26 % 186.89 100 0.55 % 641.47 itak i tak 529 0.53 % 511.06 50 0.23 % 216.90 391 1.10 % 1,322.99 29 0.12 % 82.12 59 0.33 % 378.46 torej t orej 472 0.47 % 455.99 133 0.62 % 576.95 8 0.02 % 27.07 297 1.18 % 841 34 0.19 % 218.10 naj n aj 469 0.47 % 453.09 112 0.52 % 485.85 81 0.23 % 274.07 230 0.91 % 651.28 46 0.25 % 295.07 kr k r 408 0.41 % 394.16 125 0.58 % 
542.25 99 0.28 % 334.98 113 0.45 % 319.98 71 0.39 % 455.44 vsaj v saj 382 0.38 % 369.04 99 0.46 % 429.46 93 0.26 % 314.68 113 0.45 % 319.98 77 0.42 % 493.93 kar k ar 373 0.37 % 360.35 55 0.26 % 238.59 56 0.16 % 189.48 221 0.88 % 625.80 41 0.23 % 263 glih g lih 360 0.36 % 347.79 60 0.28 % 260.28 233 0.66 % 788.38 24 0.10 % 67.96 43 0.24 % 275.83 nje n je 341 0.34 % 329.43 21 0.10 % 91.10 198 0.56 % 669.96 106 0.42 % 300.16 16 0.09 % 102.63 da d a 324 0.32 % 313.01 49 0.23 % 212.56 141 0.40 % 477.09 86 0.34 % 243.52 48 0.26 % 307.90 provzaprov p rovzaprov 320 0.32 % 309.15 34 0.16 % 147.49 8 0.02 % 27.07 245 0.97 % 693.76 33 0.18 % 211.68 morda m orda 306 0.30 % 295.62 137 0.64 % 594.30 6 0.02 % 20.30 148 0.59 % 419.09 15 0.08 % 96.22 predvsem p redvsem 300 0.30 % 289.83 97 0.45 % 420.78 10 0.03 % 33.84 163 0.65 % 461.56 30 0.17 % 192.44 le l e 240 0.24 % 231.86 98 0.46 % 425.12 33 0.09 % 111.66 91 0.36 % 257.68 18 0.10 % 115.46 n n 232 0.23 % 224.13 11 0.05 % 47.72 175 0.49 % 592.13 38 0.15 % 107.60 8 0.04 % 51.32 pravzaprav p ravzaprav 214 0.21 % 206.74 32 0.15 % 138.81 9 0.03 % 30.45 159 0.63 % 450.23 14 0.08 % 89.81 najbrž n ajbrž 210 0.21 % 202.88 41 0.19 % 177.86 59 0.17 % 199.63 64 0.25 % 181.23 46 0.25 % 295.07 CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 635 File at CLARIN.SI2.2.292 List of initial character-level 2-grams from particle lower-case word forms in the GOS 1.0 corpus with text-type distributionGOS1.0-word_parts-particles-lowercase_forms- initial-2grams-taxonomy-entire.tsvLower-case word form Initial part of the word Rest of the word Total absolute frequency of lower-case word form Percentage of all found lower-case word forms Total relative frequency (per million occurrences) Absolute frequency [gos.T.J.R] Percentage [gos.T.J.R] Relative frequency [gos.T.J.R] Absolute frequency [gos.T.N.Z] Percentage [gos.T.N.Z] Relative frequency [gos.T.N.Z] Absolute frequency [gos.T.J.I] Percentage [gos.T.J.I] Relative 
frequency [gos.T.J.I] Absolute frequency [gos.T.N.N] Percentage [gos.T.N.N] Relative frequency [gos.T.N.N] ne ne 29,863 29.85 % 28,850.16 6,135 28.57 % 26,613.40 10,304 29.20 % 34,864.76 7,592 30.21 % 21,498.01 5,832 32.10 % 37,410.27 ja ja 25,105 25.09 % 24,253.53 4,628 21.55 % 20,076.09 11,249 31.88 % 38,062.27 3,787 15.07 % 10,723.52 5,441 29.95 % 34,902.14 še še 7,058 7.05 % 6,818.62 1,756 8.18 % 7,617.46 2,021 5.73 % 6,838.28 2,126 8.46 % 6,020.12 1,155 6.36 % 7,408.93 no no 4,623 4.62 % 4,466.20 1,282 5.97 % 5,561.27 1,511 4.28 % 5,112.64 1,111 4.42 % 3,145.98 719 3.96 % 4,612.14 že že 4,273 4.27 % 4,128.08 1,044 4.86 % 4,528.83 1,357 3.85 % 4,591.56 1,185 4.72 % 3,355.52 687 3.78 % 4,406.87 tudi tu di 4,119 4.12 % 3,979.30 1,289 6.00 % 5,591.63 315 0.89 % 1,065.84 2,086 8.30 % 5,906.86 429 2.36 % 2,751.89 tud tu d 3,570 3.57 % 3,448.92 709 3.30 % 3,075.62 1,187 3.36 % 4,016.35 795 3.16 % 2,251.17 879 4.84 % 5,638.48 pač pa č 2,833 2.83 % 2,736.92 401 1.87 % 1,739.52 954 2.70 % 3,227.97 826 3.29 % 2,338.96 652 3.59 % 4,182.36 res re s 1,625 1.62 % 1,569.89 457 2.13 % 1,982.45 521 1.48 % 1,762.86 355 1.41 % 1,005.24 292 1.61 % 1,873.08 seveda se veda 1,593 1.59 % 1,538.97 499 2.32 % 2,164.64 126 0.36 % 426.34 816 3.25 % 2,310.64 152 0.84 % 975.03 več ve č 1,312 1.31 % 1,267.50 287 1.34 % 1,245 422 1.20 % 1,427.89 445 1.77 % 1,260.09 158 0.87 % 1,013.52 sploh sp loh 866 0.87 % 836.63 147 0.69 % 637.68 340 0.96 % 1,150.43 218 0.87 % 617.30 161 0.89 % 1,032.76 nej ne j 801 0.80 % 773.83 230 1.07 % 997.73 408 1.16 % 1,380.51 104 0.41 % 294.49 59 0.33 % 378.46 samo sa mo 800 0.80 % 772.87 174 0.81 % 754.81 156 0.44 % 527.84 349 1.39 % 988.25 121 0.67 % 776.17 okej ok ej 638 0.64 % 616.36 190 0.89 % 824.21 163 0.46 % 551.53 96 0.38 % 271.84 189 1.04 % 1,212.37 sam sa m 556 0.56 % 537.14 76 0.35 % 329.69 314 0.89 % 1,062.45 66 0.26 % 186.89 100 0.55 % 641.47 itak it ak 529 0.53 % 511.06 50 0.23 % 216.90 391 1.11 % 1,322.99 29 0.12 % 82.12 59 0.33 % 378.46 torej to 
rej 472 0.47 % 455.99 133 0.62 % 576.95 8 0.02 % 27.07 297 1.18 % 841 34 0.19 % 218.10 naj na j 469 0.47 % 453.09 112 0.52 % 485.85 81 0.23 % 274.07 230 0.92 % 651.28 46 0.25 % 295.07 kr kr 408 0.41 % 394.16 125 0.58 % 542.25 99 0.28 % 334.98 113 0.45 % 319.98 71 0.39 % 455.44 vsaj vs aj 382 0.38 % 369.04 99 0.46 % 429.46 93 0.26 % 314.68 113 0.45 % 319.98 77 0.42 % 493.93 kar ka r 373 0.37 % 360.35 55 0.26 % 238.59 56 0.16 % 189.48 221 0.88 % 625.80 41 0.23 % 263 glih gl ih 360 0.36 % 347.79 60 0.28 % 260.28 233 0.66 % 788.38 24 0.10 % 67.96 43 0.24 % 275.83 nje nj e 341 0.34 % 329.43 21 0.10 % 91.10 198 0.56 % 669.96 106 0.42 % 300.16 16 0.09 % 102.63 da da 324 0.32 % 313.01 49 0.23 % 212.56 141 0.40 % 477.09 86 0.34 % 243.52 48 0.26 % 307.90 provzaprov pr ovzaprov 320 0.32 % 309.15 34 0.16 % 147.49 8 0.02 % 27.07 245 0.97 % 693.76 33 0.18 % 211.68 morda mo rda 306 0.31 % 295.62 137 0.64 % 594.30 6 0.02 % 20.30 148 0.59 % 419.09 15 0.08 % 96.22 predvsem pr edvsem 300 0.30 % 289.83 97 0.45 % 420.78 10 0.03 % 33.84 163 0.65 % 461.56 30 0.17 % 192.44 le le 240 0.24 % 231.86 98 0.46 % 425.12 33 0.09 % 111.66 91 0.36 % 257.68 18 0.10 % 115.46 pravzaprav pr avzaprav 214 0.21 % 206.74 32 0.15 % 138.81 9 0.03 % 30.45 159 0.63 % 450.23 14 0.08 % 89.81 najbrž na jbrž 210 0.21 % 202.88 41 0.19 % 177.86 59 0.17 % 199.63 64 0.26 % 181.23 46 0.25 % 295.07 jah ja h 193 0.19 % 186.45 56 0.26 % 242.93 108 0.31 % 365.43 15 0.06 % 42.47 14 0.08 % 89.81 CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 636 File at CLARIN.SI2.2.293 List of initial character-level 3-grams from particle lower-case word forms in the GOS 1.0 corpus with text-type distributionGOS1.0-word_parts-particles-lowercase_forms- initial-3grams-taxonomy-entire.tsvLower-case word form Initial part of the word Rest of the word Total absolute frequency of lower-case word form Percentage of all found lower-case word forms Total relative frequency (per million occurrences) Absolute 
frequency [gos.T.J.R] Percentage [gos.T.J.R] Relative frequency [gos.T.J.R] Absolute frequency [gos.T.N.Z] Percentage [gos.T.N.Z] Relative frequency [gos.T.N.Z] Absolute frequency [gos.T.J.I] Percentage [gos.T.J.I] Relative frequency [gos.T.J.I] Absolute frequency [gos.T.N.N] Percentage [gos.T.N.N] Relative frequency [gos.T.N.N] tudi tud i 4,119 14.96 % 3,979.30 1,289 20.67 % 5,591.63 315 3.87 % 1,065.84 2,086 23.16 % 5,906.86 429 10.33 % 2,751.89 tud tud 3,570 12.97 % 3,448.92 709 11.37 % 3,075.62 1,187 14.59 % 4,016.35 795 8.82 % 2,251.17 879 21.16 % 5,638.48 pač pač 2,833 10.29 % 2,736.92 401 6.43 % 1,739.52 954 11.73 % 3,227.97 826 9.17 % 2,338.96 652 15.70 % 4,182.36 res res 1,625 5.90 % 1,569.89 457 7.33 % 1,982.45 521 6.41 % 1,762.86 355 3.94 % 1,005.24 292 7.03 % 1,873.08 seveda sev eda 1,593 5.79 % 1,538.97 499 8.00 % 2,164.64 126 1.55 % 426.34 816 9.06 % 2,310.64 152 3.66 % 975.03 več več 1,312 4.76 % 1,267.50 287 4.60 % 1,245 422 5.19 % 1,427.89 445 4.94 % 1,260.09 158 3.80 % 1,013.52 sploh spl oh 866 3.15 % 836.63 147 2.36 % 637.68 340 4.18 % 1,150.43 218 2.42 % 617.30 161 3.88 % 1,032.76 nej nej 801 2.91 % 773.83 230 3.69 % 997.73 408 5.02 % 1,380.51 104 1.15 % 294.49 59 1.42 % 378.46 samo sam o 800 2.90 % 772.87 174 2.79 % 754.81 156 1.92 % 527.84 349 3.87 % 988.25 121 2.91 % 776.17 okej oke j 638 2.32 % 616.36 190 3.05 % 824.21 163 2.00 % 551.53 96 1.07 % 271.84 189 4.55 % 1,212.37 sam sam 556 2.02 % 537.14 76 1.22 % 329.69 314 3.86 % 1,062.45 66 0.73 % 186.89 100 2.41 % 641.47 itak ita k 529 1.92 % 511.06 50 0.80 % 216.90 391 4.81 % 1,322.99 29 0.32 % 82.12 59 1.42 % 378.46 torej tor ej 472 1.71 % 455.99 133 2.13 % 576.95 8 0.10 % 27.07 297 3.30 % 841 34 0.82 % 218.10 naj naj 469 1.70 % 453.09 112 1.80 % 485.85 81 1.00 % 274.07 230 2.55 % 651.28 46 1.11 % 295.07 vsaj vsa j 382 1.39 % 369.04 99 1.59 % 429.46 93 1.14 % 314.68 113 1.25 % 319.98 77 1.85 % 493.93 kar kar 373 1.35 % 360.35 55 0.88 % 238.59 56 0.69 % 189.48 221 2.45 % 625.80 41 0.99 % 263 
glih | gli | h | 360 | 1.31 % | 347.79 | 60 | 0.96 % | 260.28 | 233 | 2.87 % | 788.38 | 24 | 0.27 % | 67.96 | 43 | 1.03 % | 275.83
nje | nje |  | 341 | 1.24 % | 329.43 | 21 | 0.34 % | 91.10 | 198 | 2.43 % | 669.96 | 106 | 1.18 % | 300.16 | 16 | 0.39 % | 102.63
provzaprov | pro | vzaprov | 320 | 1.16 % | 309.15 | 34 | 0.55 % | 147.49 | 8 | 0.10 % | 27.07 | 245 | 2.72 % | 693.76 | 33 | 0.79 % | 211.68
morda | mor | da | 306 | 1.11 % | 295.62 | 137 | 2.20 % | 594.30 | 6 | 0.07 % | 20.30 | 148 | 1.64 % | 419.09 | 15 | 0.36 % | 96.22
predvsem | pre | dvsem | 300 | 1.09 % | 289.83 | 97 | 1.55 % | 420.78 | 10 | 0.12 % | 33.84 | 163 | 1.81 % | 461.56 | 30 | 0.72 % | 192.44
pravzaprav | pra | vzaprav | 214 | 0.78 % | 206.74 | 32 | 0.51 % | 138.81 | 9 | 0.11 % | 30.45 | 159 | 1.76 % | 450.23 | 14 | 0.34 % | 89.81
najbrž | naj | brž | 210 | 0.76 % | 202.88 | 41 | 0.66 % | 177.86 | 59 | 0.72 % | 199.63 | 64 | 0.71 % | 181.23 | 46 | 1.11 % | 295.07
jah | jah |  | 193 | 0.70 % | 186.45 | 56 | 0.90 % | 242.93 | 108 | 1.33 % | 365.43 | 15 | 0.17 % | 42.47 | 14 | 0.34 % | 89.81
mogoče | mog | oče | 174 | 0.63 % | 168.10 | 37 | 0.59 % | 160.50 | 29 | 0.36 % | 98.12 | 59 | 0.66 % | 167.07 | 49 | 1.18 % | 314.32
lih | lih |  | 170 | 0.62 % | 164.23 | 14 | 0.22 % | 60.73 | 129 | 1.59 % | 436.49 | 12 | 0.13 % | 33.98 | 15 | 0.36 % | 96.22
celo | cel | o | 155 | 0.56 % | 149.74 | 42 | 0.67 % | 182.19 | 24 | 0.29 % | 81.21 | 71 | 0.79 % | 201.05 | 18 | 0.43 % | 115.46
kna | kna |  | 147 | 0.53 % | 142.01 | 0 | 0 % | 0 | 147 | 1.81 % | 497.39 | 0 | 0 % | 0 | 0 | 0 % | 0
sevede | sev | ede | 138 | 0.50 % | 133.32 | 29 | 0.47 % | 125.80 | 44 | 0.54 % | 148.88 | 57 | 0.63 % | 161.40 | 8 | 0.19 % | 51.32
skoraj | sko | raj | 138 | 0.50 % | 133.32 | 41 | 0.66 % | 177.86 | 18 | 0.22 % | 60.91 | 63 | 0.70 % | 178.39 | 16 | 0.39 % | 102.63
prov | pro | v | 130 | 0.47 % | 125.59 | 18 | 0.29 % | 78.08 | 58 | 0.71 % | 196.25 | 34 | 0.38 % | 96.28 | 20 | 0.48 % | 128.29
baje | baj | e | 123 | 0.45 % | 118.83 | 34 | 0.55 % | 147.49 | 72 | 0.89 % | 243.62 | 4 | 0.04 % | 11.33 | 13 | 0.31 % | 83.39

2.2.294 List of initial character-level 4-grams from particle lower-case word forms in the GOS 1.0 corpus with text-type distribution

File at CLARIN.SI: GOS1.0-word_parts-particles-lowercase_forms-initial-4grams-taxonomy-entire.tsv

Lower-case word form | Initial part of the word | Rest of the word | Total absolute frequency of lower-case word form | Percentage of all found lower-case word forms | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N]
tudi | tudi |  | 4,119 | 29.12 % | 3,979.30 | 1,289 | 35.23 % | 5,591.63 | 315 | 10.57 % | 1,065.84 | 2,086 | 36.36 % | 5,906.86 | 429 | 24.24 % | 2,751.89
seveda | seve | da | 1,593 | 11.26 % | 1,538.97 | 499 | 13.64 % | 2,164.64 | 126 | 4.23 % | 426.34 | 816 | 14.22 % | 2,310.64 | 152 | 8.59 % | 975.03
sploh | splo | h | 866 | 6.12 % | 836.63 | 147 | 4.02 % | 637.68 | 340 | 11.41 % | 1,150.43 | 218 | 3.80 % | 617.30 | 161 | 9.10 % | 1,032.76
samo | samo |  | 800 | 5.66 % | 772.87 | 174 | 4.75 % | 754.81 | 156 | 5.23 % | 527.84 | 349 | 6.08 % | 988.25 | 121 | 6.84 % | 776.17
okej | okej |  | 638 | 4.51 % | 616.36 | 190 | 5.19 % | 824.21 | 163 | 5.47 % | 551.53 | 96 | 1.67 % | 271.84 | 189 | 10.68 % | 1,212.37
itak | itak |  | 529 | 3.74 % | 511.06 | 50 | 1.37 % | 216.90 | 391 | 13.12 % | 1,322.99 | 29 | 0.51 % | 82.12 | 59 | 3.33 % | 378.46
torej | tore | j | 472 | 3.34 % | 455.99 | 133 | 3.63 % | 576.95 | 8 | 0.27 % | 27.07 | 297 | 5.18 % | 841 | 34 | 1.92 % | 218.10
vsaj | vsaj |  | 382 | 2.70 % | 369.04 | 99 | 2.71 % | 429.46 | 93 | 3.12 % | 314.68 | 113 | 1.97 % | 319.98 | 77 | 4.35 % | 493.93
glih | glih |  | 360 | 2.54 % | 347.79 | 60 | 1.64 % | 260.28 | 233 | 7.82 % | 788.38 | 24 | 0.42 % | 67.96 | 43 | 2.43 % | 275.83
provzaprov | prov | zaprov | 320 | 2.26 % | 309.15 | 34 | 0.93 % | 147.49 | 8 | 0.27 % | 27.07 | 245 | 4.27 % | 693.76 | 33 | 1.86 % | 211.68
morda | mord | a | 306 | 2.16 % | 295.62 | 137 | 3.74 % | 594.30 | 6 | 0.20 % | 20.30 | 148 | 2.58 % | 419.09 | 15 | 0.85 % | 96.22
predvsem | pred | vsem | 300 | 2.12 % | 289.83 | 97 | 2.65 % | 420.78 | 10 | 0.34 % | 33.84 | 163 | 2.84 % | 461.56 | 30 | 1.70 % | 192.44
pravzaprav | prav | zaprav | 214 | 1.51 % | 206.74 | 32 | 0.88 % | 138.81 | 9 | 0.30 % | 30.45 | 159 | 2.77 % | 450.23 | 14 | 0.79 % | 89.81
najbrž | najb | rž | 210 | 1.48 % | 202.88 | 41 | 1.12 % | 177.86 | 59 | 1.98 % | 199.63 | 64 | 1.12 % | 181.23 | 46 | 2.60 % | 295.07
mogoče | mogo | če | 174 | 1.23 % | 168.10 | 37 | 1.01 % | 160.50 | 29 | 0.97 % | 98.12 | 59 | 1.03 % | 167.07 | 49 | 2.77 % | 314.32
celo | celo |  | 155 | 1.10 % | 149.74 | 42 | 1.15 % | 182.19 | 24 | 0.81 % | 81.21 | 71 | 1.24 % | 201.05 | 18 | 1.02 % | 115.46
sevede | seve | de | 138 | 0.97 % | 133.32 | 29 | 0.79 % | 125.80 | 44 | 1.48 % | 148.88 | 57 | 0.99 % | 161.40 | 8 | 0.45 % | 51.32
skoraj | skor | aj | 138 | 0.97 % | 133.32 | 41 | 1.12 % | 177.86 | 18 | 0.60 % | 60.91 | 63 | 1.10 % | 178.39 | 16 | 0.90 % | 102.63
prov | prov |  | 130 | 0.92 % | 125.59 | 18 | 0.49 % | 78.08 | 58 | 1.95 % | 196.25 | 34 | 0.59 % | 96.28 | 20 | 1.13 % | 128.29
baje | baje |  | 123 | 0.87 % | 118.83 | 34 | 0.93 % | 147.49 | 72 | 2.42 % | 243.62 | 4 | 0.07 % | 11.33 | 13 | 0.73 % | 83.39
šele | šele |  | 120 | 0.85 % | 115.93 | 29 | 0.79 % | 125.80 | 36 | 1.21 % | 121.81 | 39 | 0.68 % | 110.43 | 16 | 0.90 % | 102.63
prav | prav |  | 115 | 0.81 % | 111.10 | 32 | 0.88 % | 138.81 | 25 | 0.84 % | 84.59 | 50 | 0.87 % | 141.58 | 8 | 0.45 % | 51.32
skor | skor |  | 115 | 0.81 % | 111.10 | 26 | 0.71 % | 112.79 | 58 | 1.95 % | 196.25 | 14 | 0.24 % | 39.64 | 17 | 0.96 % | 109.05
ravno | ravn | o | 81 | 0.57 % | 78.25 | 19 | 0.52 % | 82.42 | 8 | 0.27 % | 27.07 | 43 | 0.75 % | 121.76 | 11 | 0.62 % | 70.56
menda | mend | a | 79 | 0.56 % | 76.32 | 7 | 0.19 % | 30.37 | 39 | 1.31 % | 131.96 | 23 | 0.40 % | 65.13 | 10 | 0.56 % | 64.15
valda | vald | a | 74 | 0.52 % | 71.49 | 23 | 0.63 % | 99.77 | 31 | 1.04 % | 104.89 | 4 | 0.07 % | 11.33 | 16 | 0.90 % | 102.63
potem | pote | m | 71 | 0.50 % | 68.59 | 15 | 0.41 % | 65.07 | 7 | 0.23 % | 23.69 | 32 | 0.56 % | 90.61 | 17 | 0.96 % | 109.05
verjetno | verj | etno | 60 | 0.42 % | 57.97 | 7 | 0.19 % | 30.37 | 9 | 0.30 % | 30.45 | 34 | 0.59 % | 96.28 | 10 | 0.56 % | 64.15
zgolj | zgol | j | 45 | 0.32 % | 43.47 | 11 | 0.30 % | 47.72 | 3 | 0.10 % | 10.15 | 30 | 0.52 % | 84.95 | 1 | 0.06 % | 6.41
vendarle | vend | arle | 44 | 0.31 % | 42.51 | 4 | 0.11 % | 17.35 | 0 | 0 % | 0 | 39 | 0.68 % | 110.43 | 1 | 0.06 % | 6.41
valjda | valj | da | 42 | 0.30 % | 40.58 | 1 | 0.03 % | 4.34 | 39 | 1.31 % | 131.96 | 1 | 0.02 % | 2.83 | 1 | 0.06 % | 6.41
rejs | rejs |  | 41 | 0.29 % | 39.61 | 0 | 0 % | 0 | 35 | 1.17 % | 118.43 | 5 | 0.09 % | 14.16 | 1 | 0.06 % | 6.41

2.2.295 List of initial character-level 5-grams from particle lower-case word forms in the GOS 1.0 corpus with text-type distribution

File at CLARIN.SI: GOS1.0-word_parts-particles-lowercase_forms-initial-5grams-taxonomy-entire.tsv

Lower-case word form | Initial part of the word | Rest of the word | Total absolute frequency of lower-case word form | Percentage of all found lower-case word forms | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N]
seveda | seved | a | 1,593 | 26.49 % | 1,538.97 | 499 | 32.96 % | 2,164.64 | 126 | 12.51 % | 426.34 | 816 | 29.41 % | 2,310.64 | 152 | 21.20 % | 975.03
sploh | sploh |  | 866 | 14.40 % | 836.63 | 147 | 9.71 % | 637.68 | 340 | 33.76 % | 1,150.43 | 218 | 7.86 % | 617.30 | 161 | 22.45 % | 1,032.76
torej | torej |  | 472 | 7.85 % | 455.99 | 133 | 8.79 % | 576.95 | 8 | 0.79 % | 27.07 | 297 | 10.70 % | 841 | 34 | 4.74 % | 218.10
provzaprov | provz | aprov | 320 | 5.32 % | 309.15 | 34 | 2.25 % | 147.49 | 8 | 0.79 % | 27.07 | 245 | 8.83 % | 693.76 | 33 | 4.60 % | 211.68
morda | morda |  | 306 | 5.09 % | 295.62 | 137 | 9.05 % | 594.30 | 6 | 0.60 % | 20.30 | 148 | 5.33 % | 419.09 | 15 | 2.09 % | 96.22
predvsem | predv | sem | 300 | 4.99 % | 289.83 | 97 | 6.41 % | 420.78 | 10 | 0.99 % | 33.84 | 163 | 5.87 % | 461.56 | 30 | 4.18 % | 192.44
pravzaprav | pravz | aprav | 214 | 3.56 % | 206.74 | 32 | 2.11 % | 138.81 | 9 | 0.89 % | 30.45 | 159 | 5.73 % | 450.23 | 14 | 1.95 % | 89.81
najbrž | najbr | ž | 210 | 3.49 % | 202.88 | 41 | 2.71 % | 177.86 | 59 | 5.86 % | 199.63 | 64 | 2.31 % | 181.23 | 46 | 6.42 % | 295.07
mogoče | mogoč | e | 174 | 2.89 % | 168.10 | 37 | 2.44 % | 160.50 | 29 | 2.88 % | 98.12 | 59 | 2.13 % | 167.07 | 49 | 6.83 % | 314.32
sevede | seved | e | 138 | 2.29 % | 133.32 | 29 | 1.92 % | 125.80 | 44 | 4.37 % | 148.88 | 57 | 2.05 % | 161.40 | 8 | 1.12 % | 51.32
skoraj | skora | j | 138 | 2.29 % | 133.32 | 41 | 2.71 % | 177.86 | 18 | 1.79 % | 60.91 | 63 | 2.27 % | 178.39 | 16 | 2.23 % | 102.63
ravno | ravno |  | 81 | 1.35 % | 78.25 | 19 | 1.25 % | 82.42 | 8 | 0.79 % | 27.07 | 43 | 1.55 % | 121.76 | 11 | 1.53 % | 70.56
menda | menda |  | 79 | 1.31 % | 76.32 | 7 | 0.46 % | 30.37 | 39 | 3.87 % | 131.96 | 23 | 0.83 % | 65.13 | 10 | 1.40 % | 64.15
valda | valda |  | 74 | 1.23 % | 71.49 | 23 | 1.52 % | 99.77 | 31 | 3.08 % | 104.89 | 4 | 0.14 % | 11.33 | 16 | 2.23 % | 102.63
potem | potem |  | 71 | 1.18 % | 68.59 | 15 | 0.99 % | 65.07 | 7 | 0.69 % | 23.69 | 32 | 1.15 % | 90.61 | 17 | 2.37 % | 109.05
verjetno | verje | tno | 60 | 1.00 % | 57.97 | 7 | 0.46 % | 30.37 | 9 | 0.89 % | 30.45 | 34 | 1.23 % | 96.28 | 10 | 1.40 % | 64.15
zgolj | zgolj |  | 45 | 0.75 % | 43.47 | 11 | 0.73 % | 47.72 | 3 | 0.30 % | 10.15 | 30 | 1.08 % | 84.95 | 1 | 0.14 % | 6.41
vendarle | venda | rle | 44 | 0.73 % | 42.51 | 4 | 0.26 % | 17.35 | 0 | 0 % | 0 | 39 | 1.41 % | 110.43 | 1 | 0.14 % | 6.41
valjda | valjd | a | 42 | 0.70 % | 40.58 | 1 | 0.07 % | 4.34 | 39 | 3.87 % | 131.96 | 1 | 0.04 % | 2.83 | 1 | 0.14 % | 6.41
skorajda | skora | jda | 38 | 0.63 % | 36.71 | 14 | 0.93 % | 60.73 | 0 | 0 % | 0 | 23 | 0.83 % | 65.13 | 1 | 0.14 % | 6.41
komaj | komaj |  | 35 | 0.58 % | 33.81 | 14 | 0.93 % | 60.73 | 11 | 1.09 % | 37.22 | 5 | 0.18 % | 14.16 | 5 | 0.70 % | 32.07
mende | mende |  | 35 | 0.58 % | 33.81 | 1 | 0.07 % | 4.34 | 32 | 3.18 % | 108.28 | 2 | 0.07 % | 5.66 | 0 | 0 % | 0
predsem | preds | em | 33 | 0.55 % | 31.88 | 10 | 0.66 % | 43.38 | 0 | 0 % | 0 | 22 | 0.79 % | 62.30 | 1 | 0.14 % | 6.41
sevea | sevea |  | 28 | 0.47 % | 27.05 | 7 | 0.46 % | 30.37 | 4 | 0.40 % | 13.53 | 9 | 0.32 % | 25.48 | 8 | 1.12 % | 51.32
vsekakor | vseka | kor | 26 | 0.43 % | 25.12 | 8 | 0.53 % | 34.70 | 0 | 0 % | 0 | 16 | 0.58 % | 45.31 | 2 | 0.28 % | 12.83
edino | edino |  | 23 | 0.38 % | 22.22 | 4 | 0.26 % | 17.35 | 10 | 0.99 % | 33.84 | 4 | 0.14 % | 11.33 | 5 | 0.70 % | 32.07
minus | minus |  | 22 | 0.37 % | 21.25 | 0 | 0 % | 0 | 0 | 0 % | 0 | 10 | 0.36 % | 28.32 | 12 | 1.67 % | 76.98
sploj | sploj |  | 22 | 0.37 % | 21.25 | 10 | 0.66 % | 43.38 | 12 | 1.19 % | 40.60 | 0 | 0 % | 0 | 0 | 0 % | 0
največ | najve | č | 21 | 0.35 % | 20.29 | 7 | 0.46 % | 30.37 | 0 | 0 % | 0 | 12 | 0.43 % | 33.98 | 2 | 0.28 % | 12.83
bržkone | bržko | ne | 20 | 0.33 % | 19.32 | 20 | 1.32 % | 86.76 | 0 | 0 % | 0 | 0 | 0 % | 0 | 0 | 0 % | 0
morebiti | moreb | iti | 19 | 0.32 % | 18.36 | 1 | 0.07 % | 4.34 | 0 | 0 % | 0 | 18 | 0.65 % | 50.97 | 0 | 0 % | 0
natanko | natan | ko | 19 | 0.32 % | 18.36 | 10 | 0.66 % | 43.38 | 0 | 0 % | 0 | 9 | 0.32 % | 25.48 | 0 | 0 % | 0

2.2.296 List of final character-level 1-grams from particle lower-case word forms in the GOS 1.0 corpus with text-type distribution

File at CLARIN.SI: GOS1.0-word_parts-particles-lowercase_forms-final-1grams-taxonomy-entire.tsv

Lower-case word form | Rest of the word | Final part of the word | Total absolute frequency of lower-case word form | Percentage of all found lower-case word forms | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N]
ne | n | e | 29,863 | 29.77 % | 28,850.16 | 6,135 | 28.55 % | 26,613.40 | 10,304 | 29.04 % | 34,864.76 | 7,592 | 30.16 % | 21,498.01 | 5,832 | 32.08 % | 37,410.27
ja | j | a | 25,105 | 25.02 % | 24,253.53 | 4,628 | 21.54 % | 20,076.09 | 11,249 | 31.71 % | 38,062.27 | 3,787 | 15.04 % | 10,723.52 | 5,441 | 29.93 % | 34,902.14
še | š | e | 7,058 | 7.04 % | 6,818.62 | 1,756 | 8.17 % | 7,617.46 | 2,021 | 5.70 % | 6,838.28 | 2,126 | 8.45 % | 6,020.12 | 1,155 | 6.35 % | 7,408.93
no | n | o | 4,623 | 4.61 % | 4,466.20 | 1,282 | 5.97 % | 5,561.27 | 1,511 | 4.26 % | 5,112.64 | 1,111 | 4.41 % | 3,145.98 | 719 | 3.96 % | 4,612.14
že | ž | e | 4,273 | 4.26 % | 4,128.08 | 1,044 | 4.86 % | 4,528.83 | 1,357 | 3.83 % | 4,591.56 | 1,185 | 4.71 % | 3,355.52 | 687 | 3.78 % | 4,406.87
tudi | tud | i | 4,119 | 4.11 % | 3,979.30 | 1,289 | 6.00 % | 5,591.63 | 315 | 0.89 % | 1,065.84 | 2,086 | 8.29 % | 5,906.86 | 429 | 2.36 % | 2,751.89
tud | tu | d | 3,570 | 3.56 % | 3,448.92 | 709 | 3.30 % | 3,075.62 | 1,187 | 3.35 % | 4,016.35 | 795 | 3.16 % | 2,251.17 | 879 | 4.83 % | 5,638.48
pač | pa | č | 2,833 | 2.82 % | 2,736.92 | 401 | 1.87 % | 1,739.52 | 954 | 2.69 % | 3,227.97 | 826 | 3.28 % | 2,338.96 | 652 | 3.59 % | 4,182.36
res | re | s | 1,625 | 1.62 % | 1,569.89 | 457 | 2.13 % | 1,982.45 | 521 | 1.47 % | 1,762.86 | 355 | 1.41 % | 1,005.24 | 292 | 1.61 % | 1,873.08
seveda | seved | a | 1,593 | 1.59 % | 1,538.97 | 499 | 2.32 % | 2,164.64 | 126 | 0.35 % | 426.34 | 816 | 3.24 % | 2,310.64 | 152 | 0.84 % | 975.03
več | ve | č | 1,312 | 1.31 % | 1,267.50 | 287 | 1.34 % | 1,245 | 422 | 1.19 % | 1,427.89 | 445 | 1.77 % | 1,260.09 | 158 | 0.87 % | 1,013.52
sploh | splo | h | 866 | 0.86 % | 836.63 | 147 | 0.68 % | 637.68 | 340 | 0.96 % | 1,150.43 | 218 | 0.87 % | 617.30 | 161 | 0.89 % | 1,032.76
nej | ne | j | 801 | 0.80 % | 773.83 | 230 | 1.07 % | 997.73 | 408 | 1.15 % | 1,380.51 | 104 | 0.41 % | 294.49 | 59 | 0.33 % | 378.46
samo | sam | o | 800 | 0.80 % | 772.87 | 174 | 0.81 % | 754.81 | 156 | 0.44 % | 527.84 | 349 | 1.39 % | 988.25 | 121 | 0.67 % | 776.17
okej | oke | j | 638 | 0.64 % | 616.36 | 190 | 0.88 % | 824.21 | 163 | 0.46 % | 551.53 | 96 | 0.38 % | 271.84 | 189 | 1.04 % | 1,212.37
sam sa m 556 0.55 % 537.14 76 0.35 % 329.69 314 0.89 % 1,062.45
66 0.26 % 186.89 100 0.55 % 641.47 itak ita k 529 0.53 % 511.06 50 0.23 % 216.90 391 1.10 % 1,322.99 29 0.12 % 82.12 59 0.33 % 378.46 torej tore j 472 0.47 % 455.99 133 0.62 % 576.95 8 0.02 % 27.07 297 1.18 % 841 34 0.19 % 218.10 naj na j 469 0.47 % 453.09 112 0.52 % 485.85 81 0.23 % 274.07 230 0.91 % 651.28 46 0.25 % 295.07 kr k r 408 0.41 % 394.16 125 0.58 % 542.25 99 0.28 % 334.98 113 0.45 % 319.98 71 0.39 % 455.44 vsaj vsa j 382 0.38 % 369.04 99 0.46 % 429.46 93 0.26 % 314.68 113 0.45 % 319.98 77 0.42 % 493.93 kar ka r 373 0.37 % 360.35 55 0.26 % 238.59 56 0.16 % 189.48 221 0.88 % 625.80 41 0.23 % 263 glih gli h 360 0.36 % 347.79 60 0.28 % 260.28 233 0.66 % 788.38 24 0.10 % 67.96 43 0.24 % 275.83 nje nj e 341 0.34 % 329.43 21 0.10 % 91.10 198 0.56 % 669.96 106 0.42 % 300.16 16 0.09 % 102.63 da d a 324 0.32 % 313.01 49 0.23 % 212.56 141 0.40 % 477.09 86 0.34 % 243.52 48 0.26 % 307.90 provzaprov provzapro v 320 0.32 % 309.15 34 0.16 % 147.49 8 0.02 % 27.07 245 0.97 % 693.76 33 0.18 % 211.68 morda mord a 306 0.30 % 295.62 137 0.64 % 594.30 6 0.02 % 20.30 148 0.59 % 419.09 15 0.08 % 96.22 predvsem predvse m 300 0.30 % 289.83 97 0.45 % 420.78 10 0.03 % 33.84 163 0.65 % 461.56 30 0.17 % 192.44 le l e 240 0.24 % 231.86 98 0.46 % 425.12 33 0.09 % 111.66 91 0.36 % 257.68 18 0.10 % 115.46 n n 232 0.23 % 224.13 11 0.05 % 47.72 175 0.49 % 592.13 38 0.15 % 107.60 8 0.04 % 51.32 pravzaprav pravzapra v 214 0.21 % 206.74 32 0.15 % 138.81 9 0.03 % 30.45 159 0.63 % 450.23 14 0.08 % 89.81 najbrž najbr ž 210 0.21 % 202.88 41 0.19 % 177.86 59 0.17 % 199.63 64 0.25 % 181.23 46 0.25 % 295.07 CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 640 File at CLARIN.SI2.2.297 List of final character-level 2-grams from particle lower-case word forms in the GOS 1.0 corpus with text- type distributionGOS1.0-word_parts-particles-lowercase_forms- final-2grams-taxonomy-entire.tsvLower-case word form Rest of the word Final part of the word Total absolute frequency of 
lower-case word form Percentage of all found lower-case word forms Total relative frequency (per million occurrences) Absolute frequency [gos.T.J.R] Percentage [gos.T.J.R] Relative frequency [gos.T.J.R] Absolute frequency [gos.T.N.Z] Percentage [gos.T.N.Z] Relative frequency [gos.T.N.Z] Absolute frequency [gos.T.J.I] Percentage [gos.T.J.I] Relative frequency [gos.T.J.I] Absolute frequency [gos.T.N.N] Percentage [gos.T.N.N] Relative frequency [gos.T.N.N] ne ne 29,863 29.85 % 28,850.16 6,135 28.57 % 26,613.40 10,304 29.20 % 34,864.76 7,592 30.21 % 21,498.01 5,832 32.10 % 37,410.27 ja ja 25,105 25.09 % 24,253.53 4,628 21.55 % 20,076.09 11,249 31.88 % 38,062.27 3,787 15.07 % 10,723.52 5,441 29.95 % 34,902.14 še še 7,058 7.05 % 6,818.62 1,756 8.18 % 7,617.46 2,021 5.73 % 6,838.28 2,126 8.46 % 6,020.12 1,155 6.36 % 7,408.93 no no 4,623 4.62 % 4,466.20 1,282 5.97 % 5,561.27 1,511 4.28 % 5,112.64 1,111 4.42 % 3,145.98 719 3.96 % 4,612.14 že že 4,273 4.27 % 4,128.08 1,044 4.86 % 4,528.83 1,357 3.85 % 4,591.56 1,185 4.72 % 3,355.52 687 3.78 % 4,406.87 tudi tu di 4,119 4.12 % 3,979.30 1,289 6.00 % 5,591.63 315 0.89 % 1,065.84 2,086 8.30 % 5,906.86 429 2.36 % 2,751.89 tud t ud 3,570 3.57 % 3,448.92 709 3.30 % 3,075.62 1,187 3.36 % 4,016.35 795 3.16 % 2,251.17 879 4.84 % 5,638.48 pač p ač 2,833 2.83 % 2,736.92 401 1.87 % 1,739.52 954 2.70 % 3,227.97 826 3.29 % 2,338.96 652 3.59 % 4,182.36 res r es 1,625 1.62 % 1,569.89 457 2.13 % 1,982.45 521 1.48 % 1,762.86 355 1.41 % 1,005.24 292 1.61 % 1,873.08 seveda seve da 1,593 1.59 % 1,538.97 499 2.32 % 2,164.64 126 0.36 % 426.34 816 3.25 % 2,310.64 152 0.84 % 975.03 več v eč 1,312 1.31 % 1,267.50 287 1.34 % 1,245 422 1.20 % 1,427.89 445 1.77 % 1,260.09 158 0.87 % 1,013.52 sploh spl oh 866 0.87 % 836.63 147 0.69 % 637.68 340 0.96 % 1,150.43 218 0.87 % 617.30 161 0.89 % 1,032.76 nej n ej 801 0.80 % 773.83 230 1.07 % 997.73 408 1.16 % 1,380.51 104 0.41 % 294.49 59 0.33 % 378.46 samo sa mo 800 0.80 % 772.87 174 0.81 % 754.81 156 0.44 % 
527.84 349 1.39 % 988.25 121 0.67 % 776.17 okej ok ej 638 0.64 % 616.36 190 0.89 % 824.21 163 0.46 % 551.53 96 0.38 % 271.84 189 1.04 % 1,212.37 sam s am 556 0.56 % 537.14 76 0.35 % 329.69 314 0.89 % 1,062.45 66 0.26 % 186.89 100 0.55 % 641.47 itak it ak 529 0.53 % 511.06 50 0.23 % 216.90 391 1.11 % 1,322.99 29 0.12 % 82.12 59 0.33 % 378.46 torej tor ej 472 0.47 % 455.99 133 0.62 % 576.95 8 0.02 % 27.07 297 1.18 % 841 34 0.19 % 218.10 naj n aj 469 0.47 % 453.09 112 0.52 % 485.85 81 0.23 % 274.07 230 0.92 % 651.28 46 0.25 % 295.07 kr kr 408 0.41 % 394.16 125 0.58 % 542.25 99 0.28 % 334.98 113 0.45 % 319.98 71 0.39 % 455.44 vsaj vs aj 382 0.38 % 369.04 99 0.46 % 429.46 93 0.26 % 314.68 113 0.45 % 319.98 77 0.42 % 493.93 kar k ar 373 0.37 % 360.35 55 0.26 % 238.59 56 0.16 % 189.48 221 0.88 % 625.80 41 0.23 % 263 glih gl ih 360 0.36 % 347.79 60 0.28 % 260.28 233 0.66 % 788.38 24 0.10 % 67.96 43 0.24 % 275.83 nje n je 341 0.34 % 329.43 21 0.10 % 91.10 198 0.56 % 669.96 106 0.42 % 300.16 16 0.09 % 102.63 da da 324 0.32 % 313.01 49 0.23 % 212.56 141 0.40 % 477.09 86 0.34 % 243.52 48 0.26 % 307.90 provzaprov provzapr ov 320 0.32 % 309.15 34 0.16 % 147.49 8 0.02 % 27.07 245 0.97 % 693.76 33 0.18 % 211.68 morda mor da 306 0.31 % 295.62 137 0.64 % 594.30 6 0.02 % 20.30 148 0.59 % 419.09 15 0.08 % 96.22 predvsem predvs em 300 0.30 % 289.83 97 0.45 % 420.78 10 0.03 % 33.84 163 0.65 % 461.56 30 0.17 % 192.44 le le 240 0.24 % 231.86 98 0.46 % 425.12 33 0.09 % 111.66 91 0.36 % 257.68 18 0.10 % 115.46 pravzaprav pravzapr av 214 0.21 % 206.74 32 0.15 % 138.81 9 0.03 % 30.45 159 0.63 % 450.23 14 0.08 % 89.81 najbrž najb rž 210 0.21 % 202.88 41 0.19 % 177.86 59 0.17 % 199.63 64 0.26 % 181.23 46 0.25 % 295.07 jah j ah 193 0.19 % 186.45 56 0.26 % 242.93 108 0.31 % 365.43 15 0.06 % 42.47 14 0.08 % 89.81 CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 641 File at CLARIN.SI2.2.298 List of final character-level 3-grams from particle lower-case word forms in 
the GOS 1.0 corpus with text- type distributionGOS1.0-word_parts-particles-lowercase_forms- final-3grams-taxonomy-entire.tsvLower-case word form Rest of the word Final part of the word Total absolute frequency of lower-case word form Percentage of all found lower-case word forms Total relative frequency (per million occurrences) Absolute frequency [gos.T.J.R] Percentage [gos.T.J.R] Relative frequency [gos.T.J.R] Absolute frequency [gos.T.N.Z] Percentage [gos.T.N.Z] Relative frequency [gos.T.N.Z] Absolute frequency [gos.T.J.I] Percentage [gos.T.J.I] Relative frequency [gos.T.J.I] Absolute frequency [gos.T.N.N] Percentage [gos.T.N.N] Relative frequency [gos.T.N.N] tudi t udi 4,119 14.96 % 3,979.30 1,289 20.67 % 5,591.63 315 3.87 % 1,065.84 2,086 23.16 % 5,906.86 429 10.33 % 2,751.89 tud tud 3,570 12.97 % 3,448.92 709 11.37 % 3,075.62 1,187 14.59 % 4,016.35 795 8.82 % 2,251.17 879 21.16 % 5,638.48 pač pač 2,833 10.29 % 2,736.92 401 6.43 % 1,739.52 954 11.73 % 3,227.97 826 9.17 % 2,338.96 652 15.70 % 4,182.36 res res 1,625 5.90 % 1,569.89 457 7.33 % 1,982.45 521 6.41 % 1,762.86 355 3.94 % 1,005.24 292 7.03 % 1,873.08 seveda sev eda 1,593 5.79 % 1,538.97 499 8.00 % 2,164.64 126 1.55 % 426.34 816 9.06 % 2,310.64 152 3.66 % 975.03 več več 1,312 4.76 % 1,267.50 287 4.60 % 1,245 422 5.19 % 1,427.89 445 4.94 % 1,260.09 158 3.80 % 1,013.52 sploh sp loh 866 3.15 % 836.63 147 2.36 % 637.68 340 4.18 % 1,150.43 218 2.42 % 617.30 161 3.88 % 1,032.76 nej nej 801 2.91 % 773.83 230 3.69 % 997.73 408 5.02 % 1,380.51 104 1.15 % 294.49 59 1.42 % 378.46 samo s amo 800 2.90 % 772.87 174 2.79 % 754.81 156 1.92 % 527.84 349 3.87 % 988.25 121 2.91 % 776.17 okej o kej 638 2.32 % 616.36 190 3.05 % 824.21 163 2.00 % 551.53 96 1.07 % 271.84 189 4.55 % 1,212.37 sam sam 556 2.02 % 537.14 76 1.22 % 329.69 314 3.86 % 1,062.45 66 0.73 % 186.89 100 2.41 % 641.47 itak i tak 529 1.92 % 511.06 50 0.80 % 216.90 391 4.81 % 1,322.99 29 0.32 % 82.12 59 1.42 % 378.46 torej to rej 472 1.71 % 455.99 133 2.13 % 
576.95 8 0.10 % 27.07 297 3.30 % 841 34 0.82 % 218.10 naj naj 469 1.70 % 453.09 112 1.80 % 485.85 81 1.00 % 274.07 230 2.55 % 651.28 46 1.11 % 295.07 vsaj v saj 382 1.39 % 369.04 99 1.59 % 429.46 93 1.14 % 314.68 113 1.25 % 319.98 77 1.85 % 493.93 kar kar 373 1.35 % 360.35 55 0.88 % 238.59 56 0.69 % 189.48 221 2.45 % 625.80 41 0.99 % 263 glih g lih 360 1.31 % 347.79 60 0.96 % 260.28 233 2.87 % 788.38 24 0.27 % 67.96 43 1.03 % 275.83 nje nje 341 1.24 % 329.43 21 0.34 % 91.10 198 2.43 % 669.96 106 1.18 % 300.16 16 0.39 % 102.63 provzaprov provzap rov 320 1.16 % 309.15 34 0.55 % 147.49 8 0.10 % 27.07 245 2.72 % 693.76 33 0.79 % 211.68 morda mo rda 306 1.11 % 295.62 137 2.20 % 594.30 6 0.07 % 20.30 148 1.64 % 419.09 15 0.36 % 96.22 predvsem predv sem 300 1.09 % 289.83 97 1.55 % 420.78 10 0.12 % 33.84 163 1.81 % 461.56 30 0.72 % 192.44 pravzaprav pravzap rav 214 0.78 % 206.74 32 0.51 % 138.81 9 0.11 % 30.45 159 1.76 % 450.23 14 0.34 % 89.81 najbrž naj brž 210 0.76 % 202.88 41 0.66 % 177.86 59 0.72 % 199.63 64 0.71 % 181.23 46 1.11 % 295.07 jah jah 193 0.70 % 186.45 56 0.90 % 242.93 108 1.33 % 365.43 15 0.17 % 42.47 14 0.34 % 89.81 mogoče mog oče 174 0.63 % 168.10 37 0.59 % 160.50 29 0.36 % 98.12 59 0.66 % 167.07 49 1.18 % 314.32 lih lih 170 0.62 % 164.23 14 0.22 % 60.73 129 1.59 % 436.49 12 0.13 % 33.98 15 0.36 % 96.22 celo c elo 155 0.56 % 149.74 42 0.67 % 182.19 24 0.29 % 81.21 71 0.79 % 201.05 18 0.43 % 115.46 kna kna 147 0.53 % 142.01 0 0 % 0 147 1.81 % 497.39 0 0 % 0 0 0 % 0 sevede sev ede 138 0.50 % 133.32 29 0.47 % 125.80 44 0.54 % 148.88 57 0.63 % 161.40 8 0.19 % 51.32 skoraj sko raj 138 0.50 % 133.32 41 0.66 % 177.86 18 0.22 % 60.91 63 0.70 % 178.39 16 0.39 % 102.63 prov p rov 130 0.47 % 125.59 18 0.29 % 78.08 58 0.71 % 196.25 34 0.38 % 96.28 20 0.48 % 128.29 baje b aje 123 0.45 % 118.83 34 0.55 % 147.49 72 0.89 % 243.62 4 0.04 % 11.33 13 0.31 % 83.39 CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 642 File at CLARIN.SI2.2.299 
List of final character-level 4-grams from particle lower-case word forms in the GOS 1.0 corpus with text- type distributionGOS1.0-word_parts-particles-lowercase_forms- final-4grams-taxonomy-entire.tsvLower-case word form Rest of the word Final part of the word Total absolute frequency of lower-case word form Percentage of all found lower-case word forms Total relative frequency (per million occurrences) Absolute frequency [gos.T.J.R] Percentage [gos.T.J.R] Relative frequency [gos.T.J.R] Absolute frequency [gos.T.N.Z] Percentage [gos.T.N.Z] Relative frequency [gos.T.N.Z] Absolute frequency [gos.T.J.I] Percentage [gos.T.J.I] Relative frequency [gos.T.J.I] Absolute frequency [gos.T.N.N] Percentage [gos.T.N.N] Relative frequency [gos.T.N.N] tudi tudi 4,119 29.12 % 3,979.30 1,289 35.23 % 5,591.63 315 10.57 % 1,065.84 2,086 36.36 % 5,906.86 429 24.24 % 2,751.89 seveda se veda 1,593 11.26 % 1,538.97 499 13.64 % 2,164.64 126 4.23 % 426.34 816 14.22 % 2,310.64 152 8.59 % 975.03 sploh s ploh 866 6.12 % 836.63 147 4.02 % 637.68 340 11.41 % 1,150.43 218 3.80 % 617.30 161 9.10 % 1,032.76 samo samo 800 5.66 % 772.87 174 4.75 % 754.81 156 5.23 % 527.84 349 6.08 % 988.25 121 6.84 % 776.17 okej okej 638 4.51 % 616.36 190 5.19 % 824.21 163 5.47 % 551.53 96 1.67 % 271.84 189 10.68 % 1,212.37 itak itak 529 3.74 % 511.06 50 1.37 % 216.90 391 13.12 % 1,322.99 29 0.51 % 82.12 59 3.33 % 378.46 torej t orej 472 3.34 % 455.99 133 3.63 % 576.95 8 0.27 % 27.07 297 5.18 % 841 34 1.92 % 218.10 vsaj vsaj 382 2.70 % 369.04 99 2.71 % 429.46 93 3.12 % 314.68 113 1.97 % 319.98 77 4.35 % 493.93 glih glih 360 2.54 % 347.79 60 1.64 % 260.28 233 7.82 % 788.38 24 0.42 % 67.96 43 2.43 % 275.83 provzaprov provza prov 320 2.26 % 309.15 34 0.93 % 147.49 8 0.27 % 27.07 245 4.27 % 693.76 33 1.86 % 211.68 morda m orda 306 2.16 % 295.62 137 3.74 % 594.30 6 0.20 % 20.30 148 2.58 % 419.09 15 0.85 % 96.22 predvsem pred vsem 300 2.12 % 289.83 97 2.65 % 420.78 10 0.34 % 33.84 163 2.84 % 461.56 30 1.70 % 192.44 
pravzaprav pravza prav 214 1.51 % 206.74 32 0.88 % 138.81 9 0.30 % 30.45 159 2.77 % 450.23 14 0.79 % 89.81 najbrž na jbrž 210 1.48 % 202.88 41 1.12 % 177.86 59 1.98 % 199.63 64 1.12 % 181.23 46 2.60 % 295.07 mogoče mo goče 174 1.23 % 168.10 37 1.01 % 160.50 29 0.97 % 98.12 59 1.03 % 167.07 49 2.77 % 314.32 celo celo 155 1.10 % 149.74 42 1.15 % 182.19 24 0.81 % 81.21 71 1.24 % 201.05 18 1.02 % 115.46 sevede se vede 138 0.97 % 133.32 29 0.79 % 125.80 44 1.48 % 148.88 57 0.99 % 161.40 8 0.45 % 51.32 skoraj sk oraj 138 0.97 % 133.32 41 1.12 % 177.86 18 0.60 % 60.91 63 1.10 % 178.39 16 0.90 % 102.63 prov prov 130 0.92 % 125.59 18 0.49 % 78.08 58 1.95 % 196.25 34 0.59 % 96.28 20 1.13 % 128.29 baje baje 123 0.87 % 118.83 34 0.93 % 147.49 72 2.42 % 243.62 4 0.07 % 11.33 13 0.73 % 83.39 šele šele 120 0.85 % 115.93 29 0.79 % 125.80 36 1.21 % 121.81 39 0.68 % 110.43 16 0.90 % 102.63 prav prav 115 0.81 % 111.10 32 0.88 % 138.81 25 0.84 % 84.59 50 0.87 % 141.58 8 0.45 % 51.32 skor skor 115 0.81 % 111.10 26 0.71 % 112.79 58 1.95 % 196.25 14 0.24 % 39.64 17 0.96 % 109.05 ravno r avno 81 0.57 % 78.25 19 0.52 % 82.42 8 0.27 % 27.07 43 0.75 % 121.76 11 0.62 % 70.56 menda m enda 79 0.56 % 76.32 7 0.19 % 30.37 39 1.31 % 131.96 23 0.40 % 65.13 10 0.56 % 64.15 valda v alda 74 0.52 % 71.49 23 0.63 % 99.77 31 1.04 % 104.89 4 0.07 % 11.33 16 0.90 % 102.63 potem p otem 71 0.50 % 68.59 15 0.41 % 65.07 7 0.23 % 23.69 32 0.56 % 90.61 17 0.96 % 109.05 verjetno verj etno 60 0.42 % 57.97 7 0.19 % 30.37 9 0.30 % 30.45 34 0.59 % 96.28 10 0.56 % 64.15 zgolj z golj 45 0.32 % 43.47 11 0.30 % 47.72 3 0.10 % 10.15 30 0.52 % 84.95 1 0.06 % 6.41 vendarle vend arle 44 0.31 % 42.51 4 0.11 % 17.35 0 0 % 0 39 0.68 % 110.43 1 0.06 % 6.41 valjda va ljda 42 0.30 % 40.58 1 0.03 % 4.34 39 1.31 % 131.96 1 0.02 % 2.83 1 0.06 % 6.41 rejs rejs 41 0.29 % 39.61 0 0 % 0 35 1.17 % 118.43 5 0.09 % 14.16 1 0.06 % 6.41 CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 643 File at 
CLARIN.SI2.2.300 List of final character-level 5-grams from particle lower-case word forms in the GOS 1.0 corpus with text- type distributionGOS1.0-word_parts-particles-lowercase_forms- final-5grams-taxonomy-entire.tsvLower-case word form Rest of the word Final part of the word Total absolute frequency of lower-case word form Percentage of all found lower-case word forms Total relative frequency (per million occurrences) Absolute frequency [gos.T.J.R] Percentage [gos.T.J.R] Relative frequency [gos.T.J.R] Absolute frequency [gos.T.N.Z] Percentage [gos.T.N.Z] Relative frequency [gos.T.N.Z] Absolute frequency [gos.T.J.I] Percentage [gos.T.J.I] Relative frequency [gos.T.J.I] Absolute frequency [gos.T.N.N] Percentage [gos.T.N.N] Relative frequency [gos.T.N.N] seveda s eveda 1,593 26.49 % 1,538.97 499 32.96 % 2,164.64 126 12.51 % 426.34 816 29.41 % 2,310.64 152 21.20 % 975.03 sploh sploh 866 14.40 % 836.63 147 9.71 % 637.68 340 33.76 % 1,150.43 218 7.86 % 617.30 161 22.45 % 1,032.76 torej torej 472 7.85 % 455.99 133 8.79 % 576.95 8 0.79 % 27.07 297 10.70 % 841 34 4.74 % 218.10 provzaprov provz aprov 320 5.32 % 309.15 34 2.25 % 147.49 8 0.79 % 27.07 245 8.83 % 693.76 33 4.60 % 211.68 morda morda 306 5.09 % 295.62 137 9.05 % 594.30 6 0.60 % 20.30 148 5.33 % 419.09 15 2.09 % 96.22 predvsem pre dvsem 300 4.99 % 289.83 97 6.41 % 420.78 10 0.99 % 33.84 163 5.87 % 461.56 30 4.18 % 192.44 pravzaprav pravz aprav 214 3.56 % 206.74 32 2.11 % 138.81 9 0.89 % 30.45 159 5.73 % 450.23 14 1.95 % 89.81 najbrž n ajbrž 210 3.49 % 202.88 41 2.71 % 177.86 59 5.86 % 199.63 64 2.31 % 181.23 46 6.42 % 295.07 mogoče m ogoče 174 2.89 % 168.10 37 2.44 % 160.50 29 2.88 % 98.12 59 2.13 % 167.07 49 6.83 % 314.32 sevede s evede 138 2.29 % 133.32 29 1.92 % 125.80 44 4.37 % 148.88 57 2.05 % 161.40 8 1.12 % 51.32 skoraj s koraj 138 2.29 % 133.32 41 2.71 % 177.86 18 1.79 % 60.91 63 2.27 % 178.39 16 2.23 % 102.63 ravno ravno 81 1.35 % 78.25 19 1.25 % 82.42 8 0.79 % 27.07 43 1.55 % 121.76 11 1.53 % 70.56 
menda menda 79 1.31 % 76.32 7 0.46 % 30.37 39 3.87 % 131.96 23 0.83 % 65.13 10 1.40 % 64.15 valda valda 74 1.23 % 71.49 23 1.52 % 99.77 31 3.08 % 104.89 4 0.14 % 11.33 16 2.23 % 102.63 potem potem 71 1.18 % 68.59 15 0.99 % 65.07 7 0.69 % 23.69 32 1.15 % 90.61 17 2.37 % 109.05 verjetno ver jetno 60 1.00 % 57.97 7 0.46 % 30.37 9 0.89 % 30.45 34 1.23 % 96.28 10 1.40 % 64.15 zgolj zgolj 45 0.75 % 43.47 11 0.73 % 47.72 3 0.30 % 10.15 30 1.08 % 84.95 1 0.14 % 6.41 vendarle ven darle 44 0.73 % 42.51 4 0.26 % 17.35 0 0 % 0 39 1.41 % 110.43 1 0.14 % 6.41 valjda v aljda 42 0.70 % 40.58 1 0.07 % 4.34 39 3.87 % 131.96 1 0.04 % 2.83 1 0.14 % 6.41 skorajda sko rajda 38 0.63 % 36.71 14 0.93 % 60.73 0 0 % 0 23 0.83 % 65.13 1 0.14 % 6.41 komaj komaj 35 0.58 % 33.81 14 0.93 % 60.73 11 1.09 % 37.22 5 0.18 % 14.16 5 0.70 % 32.07 mende mende 35 0.58 % 33.81 1 0.07 % 4.34 32 3.18 % 108.28 2 0.07 % 5.66 0 0 % 0 predsem pr edsem 33 0.55 % 31.88 10 0.66 % 43.38 0 0 % 0 22 0.79 % 62.30 1 0.14 % 6.41 sevea sevea 28 0.47 % 27.05 7 0.46 % 30.37 4 0.40 % 13.53 9 0.32 % 25.48 8 1.12 % 51.32 vsekakor vse kakor 26 0.43 % 25.12 8 0.53 % 34.70 0 0 % 0 16 0.58 % 45.31 2 0.28 % 12.83 edino edino 23 0.38 % 22.22 4 0.26 % 17.35 10 0.99 % 33.84 4 0.14 % 11.33 5 0.70 % 32.07 minus minus 22 0.37 % 21.25 0 0 % 0 0 0 % 0 10 0.36 % 28.32 12 1.67 % 76.98 sploj sploj 22 0.37 % 21.25 10 0.66 % 43.38 12 1.19 % 40.60 0 0 % 0 0 0 % 0 največ n ajveč 21 0.35 % 20.29 7 0.46 % 30.37 0 0 % 0 12 0.43 % 33.98 2 0.28 % 12.83 bržkone br žkone 20 0.33 % 19.32 20 1.32 % 86.76 0 0 % 0 0 0 % 0 0 0 % 0 morebiti mor ebiti 19 0.32 % 18.36 1 0.07 % 4.34 0 0 % 0 18 0.65 % 50.97 0 0 % 0 natanko na tanko 19 0.32 % 18.36 10 0.66 % 43.38 0 0 % 0 9 0.32 % 25.48 0 0 % 0 CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 644 File at CLARIN.SI2.2.301 List of initial character-level 1-grams from interjection lemmas in the GOS 1.0 corpus with text-type distributionGOS1.0-word_parts-interjections-lemmas-initial- 
1grams-taxonomy-entire.tsvLemma Lemma (lower-case) Initial part of the word Rest of the word Total absolute frequency of lemma Percentage of all found lemmas Total relative frequency (per million occurrences) Absolute frequency [gos.T.J.R] Percentage [gos.T.J.R] Relative frequency [gos.T.J.R] Absolute frequency [gos.T.N.Z] Percentage [gos.T.N.Z] Relative frequency [gos.T.N.Z] Absolute frequency [gos.T.J.I] Percentage [gos.T.J.I] Relative frequency [gos.T.J.I] Absolute frequency [gos.T.N.N] Percentage [gos.T.N.N] Relative frequency [gos.T.N.N] mhm mhm m hm 4,476 38.80 % 4,324.19 432 21.74 % 1,874 1,261 30.90 % 4,266.74 975 44.68 % 2,760.87 1,808 55.01 % 11,597.70 aha aha a ha 2,063 17.88 % 1,993.03 377 18.97 % 1,635.41 627 15.36 % 2,121.53 219 10.04 % 620.13 840 25.55 % 5,388.31 recimo recimo r ecimo 868 7.52 % 838.56 186 9.36 % 806.86 131 3.21 % 443.25 376 17.23 % 1,064.71 175 5.32 % 1,122.56 aja aja a ja 762 6.61 % 736.16 66 3.32 % 286.31 512 12.55 % 1,732.41 84 3.85 % 237.86 100 3.04 % 641.47 ej ej e j 616 5.34 % 595.11 91 4.58 % 394.75 427 10.46 % 1,444.80 21 0.96 % 59.46 77 2.34 % 493.93 joj joj j oj 436 3.78 % 421.21 99 4.98 % 429.46 257 6.30 % 869.59 32 1.47 % 90.61 48 1.46 % 307.90 prosim prosim p rosim 413 3.58 % 398.99 62 3.12 % 268.95 37 0.91 % 125.19 259 11.87 % 733.40 55 1.67 % 352.81 eh eh e h 215 1.86 % 207.71 26 1.31 % 112.79 168 4.12 % 568.45 9 0.41 % 25.48 12 0.36 % 76.98 hm hm h m 209 1.81 % 201.91 34 1.71 % 147.49 89 2.18 % 301.14 50 2.29 % 141.58 36 1.09 % 230.93 ha ha h a 205 1.78 % 198.05 77 3.88 % 334.02 102 2.50 % 345.13 10 0.46 % 28.32 16 0.49 % 102.63 čao čao č ao 177 1.53 % 171 114 5.74 % 494.53 34 0.83 % 115.04 6 0.28 % 16.99 23 0.70 % 147.54 ah ah a h 149 1.29 % 143.95 28 1.41 % 121.46 90 2.21 % 304.53 16 0.73 % 45.31 15 0.46 % 96.22 bravo bravo b ravo 119 1.03 % 114.96 88 4.43 % 381.74 5 0.12 % 16.92 23 1.05 % 65.13 3 0.09 % 19.24 adijo adijo a dijo 72 0.62 % 69.56 37 1.86 % 160.50 13 0.32 % 43.99 13 0.60 % 36.81 9 0.27 % 57.73 zdravo 
zdravo z dravo 70 0.61 % 67.63 35 1.76 % 151.83 13 0.32 % 43.99 13 0.60 % 36.81 9 0.27 % 57.73 he he h e 64 0.56 % 61.83 5 0.25 % 21.69 52 1.27 % 175.95 3 0.14 % 8.49 4 0.12 % 25.66 oh oh o h 57 0.49 % 55.07 12 0.60 % 52.06 25 0.61 % 84.59 10 0.46 % 28.32 10 0.30 % 64.15 la la l a 55 0.48 % 53.13 32 1.61 % 138.81 11 0.27 % 37.22 0 0 % 0 12 0.36 % 76.98 fak fak f ak 53 0.46 % 51.20 0 0 % 0 53 1.30 % 179.33 0 0 % 0 0 0 % 0 halo halo h alo 42 0.36 % 40.58 15 0.76 % 65.07 20 0.49 % 67.67 5 0.23 % 14.16 2 0.06 % 12.83 ho ho h o 41 0.35 % 39.61 35 1.76 % 151.83 5 0.12 % 16.92 1 0.05 % 2.83 0 0 % 0 hop hop h op 36 0.31 % 34.78 29 1.46 % 125.80 5 0.12 % 16.92 1 0.05 % 2.83 1 0.03 % 6.41 bla bla b la 34 0.29 % 32.85 2 0.10 % 8.68 23 0.56 % 77.82 5 0.23 % 14.16 4 0.12 % 25.66 hej hej h ej 25 0.22 % 24.15 14 0.70 % 60.73 7 0.17 % 23.69 3 0.14 % 8.49 1 0.03 % 6.41 ojej ojej o jej 23 0.20 % 22.22 4 0.20 % 17.35 13 0.32 % 43.99 2 0.09 % 5.66 4 0.12 % 25.66 opa opa o pa 23 0.20 % 22.22 14 0.70 % 60.73 6 0.15 % 20.30 1 0.05 % 2.83 2 0.06 % 12.83 uh uh u h 23 0.20 % 22.22 7 0.35 % 30.37 10 0.24 % 33.84 2 0.09 % 5.66 4 0.12 % 25.66 nasvidenje nasvidenje n asvidenje 19 0.17 % 18.36 12 0.60 % 52.06 0 0 % 0 7 0.32 % 19.82 0 0 % 0 šit šit š it 19 0.17 % 18.36 2 0.10 % 8.68 15 0.37 % 50.75 0 0 % 0 2 0.06 % 12.83 alo alo a lo 17 0.15 % 16.42 7 0.35 % 30.37 7 0.17 % 23.69 3 0.14 % 8.49 0 0 % 0 jebemti jebemti j ebemti 16 0.14 % 15.46 5 0.25 % 21.69 11 0.27 % 37.22 0 0 % 0 0 0 % 0 oj oj o j 14 0.12 % 13.53 1 0.05 % 4.34 12 0.29 % 40.60 0 0 % 0 1 0.03 % 6.41 CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 645 File at CLARIN.SI2.2.302 List of initial character-level 2-grams from interjection lemmas in the GOS 1.0 corpus with text-type distributionGOS1.0-word_parts-interjections-lemmas-initial- 2grams-taxonomy-entire.tsvLemma Lemma (lower-case) Initial part of the word Rest of the word Total absolute frequency of lemma Percentage of all found lemmas Total 
relative frequency (per million occurrences) Absolute frequency [gos.T.J.R] Percentage [gos.T.J.R] Relative frequency [gos.T.J.R] Absolute frequency [gos.T.N.Z] Percentage [gos.T.N.Z] Relative frequency [gos.T.N.Z] Absolute frequency [gos.T.J.I] Percentage [gos.T.J.I] Relative frequency [gos.T.J.I] Absolute frequency [gos.T.N.N] Percentage [gos.T.N.N] Relative frequency [gos.T.N.N] mhm mhm mh m 4,476 38.80 % 4,324.19 432 21.74 % 1,874 1,261 30.90 % 4,266.74 975 44.68 % 2,760.87 1,808 55.01 % 11,597.70 aha aha ah a 2,063 17.88 % 1,993.03 377 18.97 % 1,635.41 627 15.36 % 2,121.53 219 10.04 % 620.13 840 25.55 % 5,388.31 recimo recimo re cimo 868 7.52 % 838.56 186 9.36 % 806.86 131 3.21 % 443.25 376 17.23 % 1,064.71 175 5.32 % 1,122.56 aja aja aj a 762 6.61 % 736.16 66 3.32 % 286.31 512 12.55 % 1,732.41 84 3.85 % 237.86 100 3.04 % 641.47 ej ej ej 616 5.34 % 595.11 91 4.58 % 394.75 427 10.46 % 1,444.80 21 0.96 % 59.46 77 2.34 % 493.93 joj joj jo j 436 3.78 % 421.21 99 4.98 % 429.46 257 6.30 % 869.59 32 1.47 % 90.61 48 1.46 % 307.90 prosim prosim pr osim 413 3.58 % 398.99 62 3.12 % 268.95 37 0.91 % 125.19 259 11.87 % 733.40 55 1.67 % 352.81 eh eh eh 215 1.86 % 207.71 26 1.31 % 112.79 168 4.12 % 568.45 9 0.41 % 25.48 12 0.36 % 76.98 hm hm hm 209 1.81 % 201.91 34 1.71 % 147.49 89 2.18 % 301.14 50 2.29 % 141.58 36 1.09 % 230.93 ha ha ha 205 1.78 % 198.05 77 3.88 % 334.02 102 2.50 % 345.13 10 0.46 % 28.32 16 0.49 % 102.63 čao čao ča o 177 1.53 % 171 114 5.74 % 494.53 34 0.83 % 115.04 6 0.28 % 16.99 23 0.70 % 147.54 ah ah ah 149 1.29 % 143.95 28 1.41 % 121.46 90 2.21 % 304.53 16 0.73 % 45.31 15 0.46 % 96.22 bravo bravo br avo 119 1.03 % 114.96 88 4.43 % 381.74 5 0.12 % 16.92 23 1.05 % 65.13 3 0.09 % 19.24 adijo adijo ad ijo 72 0.62 % 69.56 37 1.86 % 160.50 13 0.32 % 43.99 13 0.60 % 36.81 9 0.27 % 57.73 zdravo zdravo zd ravo 70 0.61 % 67.63 35 1.76 % 151.83 13 0.32 % 43.99 13 0.60 % 36.81 9 0.27 % 57.73 he he he 64 0.56 % 61.83 5 0.25 % 21.69 52 1.27 % 175.95 3 0.14 % 8.49 4 
0.12 % 25.66 oh oh oh 57 0.49 % 55.07 12 0.60 % 52.06 25 0.61 % 84.59 10 0.46 % 28.32 10 0.30 % 64.15 la la la 55 0.48 % 53.13 32 1.61 % 138.81 11 0.27 % 37.22 0 0 % 0 12 0.36 % 76.98 fak fak fa k 53 0.46 % 51.20 0 0 % 0 53 1.30 % 179.33 0 0 % 0 0 0 % 0 halo halo ha lo 42 0.36 % 40.58 15 0.76 % 65.07 20 0.49 % 67.67 5 0.23 % 14.16 2 0.06 % 12.83 ho ho ho 41 0.35 % 39.61 35 1.76 % 151.83 5 0.12 % 16.92 1 0.05 % 2.83 0 0 % 0 hop hop ho p 36 0.31 % 34.78 29 1.46 % 125.80 5 0.12 % 16.92 1 0.05 % 2.83 1 0.03 % 6.41 bla bla bl a 34 0.29 % 32.85 2 0.10 % 8.68 23 0.56 % 77.82 5 0.23 % 14.16 4 0.12 % 25.66 hej hej he j 25 0.22 % 24.15 14 0.70 % 60.73 7 0.17 % 23.69 3 0.14 % 8.49 1 0.03 % 6.41 ojej ojej oj ej 23 0.20 % 22.22 4 0.20 % 17.35 13 0.32 % 43.99 2 0.09 % 5.66 4 0.12 % 25.66 opa opa op a 23 0.20 % 22.22 14 0.70 % 60.73 6 0.15 % 20.30 1 0.05 % 2.83 2 0.06 % 12.83 uh uh uh 23 0.20 % 22.22 7 0.35 % 30.37 10 0.24 % 33.84 2 0.09 % 5.66 4 0.12 % 25.66 nasvidenje nasvidenje na svidenje 19 0.17 % 18.36 12 0.60 % 52.06 0 0 % 0 7 0.32 % 19.82 0 0 % 0 šit šit ši t 19 0.17 % 18.36 2 0.10 % 8.68 15 0.37 % 50.75 0 0 % 0 2 0.06 % 12.83 alo alo al o 17 0.15 % 16.42 7 0.35 % 30.37 7 0.17 % 23.69 3 0.14 % 8.49 0 0 % 0 jebemti jebemti je bemti 16 0.14 % 15.46 5 0.25 % 21.69 11 0.27 % 37.22 0 0 % 0 0 0 % 0 oj oj oj 14 0.12 % 13.53 1 0.05 % 4.34 12 0.29 % 40.60 0 0 % 0 1 0.03 % 6.41 CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 646 File at CLARIN.SI2.2.303 List of initial character-level 3-grams from interjection lemmas in the GOS 1.0 corpus with text-type distributionGOS1.0-word_parts-interjections-lemmas-initial- 3grams-taxonomy-entire.tsvLemma Lemma (lower-case) Initial part of the word Rest of the word Total absolute frequency of lemma Percentage of all found lemmas Total relative frequency (per million occurrences) Absolute frequency [gos.T.J.R] Percentage [gos.T.J.R] Relative frequency [gos.T.J.R] Absolute frequency [gos.T.N.Z] Percentage 
[gos.T.N.Z] Relative frequency [gos.T.N.Z] Absolute frequency [gos.T.J.I] Percentage [gos.T.J.I] Relative frequency [gos.T.J.I] Absolute frequency [gos.T.N.N] Percentage [gos.T.N.N] Relative frequency [gos.T.N.N] mhm mhm mhm 4,476 45.32 % 4,324.19 432 26.41 % 1,874 1,261 40.89 % 4,266.74 975 47.35 % 2,760.87 1,808 58.36 % 11,597.70 aha aha aha 2,063 20.89 % 1,993.03 377 23.04 % 1,635.41 627 20.33 % 2,121.53 219 10.64 % 620.13 840 27.11 % 5,388.31 recimo recimo rec imo 868 8.79 % 838.56 186 11.37 % 806.86 131 4.25 % 443.25 376 18.26 % 1,064.71 175 5.65 % 1,122.56 aja aja aja 762 7.71 % 736.16 66 4.03 % 286.31 512 16.60 % 1,732.41 84 4.08 % 237.86 100 3.23 % 641.47 joj joj joj 436 4.41 % 421.21 99 6.05 % 429.46 257 8.33 % 869.59 32 1.55 % 90.61 48 1.55 % 307.90 prosim prosim pro sim 413 4.18 % 398.99 62 3.79 % 268.95 37 1.20 % 125.19 259 12.58 % 733.40 55 1.77 % 352.81 čao čao čao 177 1.79 % 171 114 6.97 % 494.53 34 1.10 % 115.04 6 0.29 % 16.99 23 0.74 % 147.54 bravo bravo bra vo 119 1.21 % 114.96 88 5.38 % 381.74 5 0.16 % 16.92 23 1.12 % 65.13 3 0.10 % 19.24 adijo adijo adi jo 72 0.73 % 69.56 37 2.26 % 160.50 13 0.42 % 43.99 13 0.63 % 36.81 9 0.29 % 57.73 zdravo zdravo zdr avo 70 0.71 % 67.63 35 2.14 % 151.83 13 0.42 % 43.99 13 0.63 % 36.81 9 0.29 % 57.73 fak fak fak 53 0.54 % 51.20 0 0 % 0 53 1.72 % 179.33 0 0 % 0 0 0 % 0 halo halo hal o 42 0.42 % 40.58 15 0.92 % 65.07 20 0.65 % 67.67 5 0.24 % 14.16 2 0.07 % 12.83 hop hop hop 36 0.36 % 34.78 29 1.77 % 125.80 5 0.16 % 16.92 1 0.05 % 2.83 1 0.03 % 6.41 bla bla bla 34 0.34 % 32.85 2 0.12 % 8.68 23 0.75 % 77.82 5 0.24 % 14.16 4 0.13 % 25.66 hej hej hej 25 0.25 % 24.15 14 0.86 % 60.73 7 0.23 % 23.69 3 0.15 % 8.49 1 0.03 % 6.41 ojej ojej oje j 23 0.23 % 22.22 4 0.24 % 17.35 13 0.42 % 43.99 2 0.10 % 5.66 4 0.13 % 25.66 opa opa opa 23 0.23 % 22.22 14 0.86 % 60.73 6 0.20 % 20.30 1 0.05 % 2.83 2 0.07 % 12.83 nasvidenje nasvidenje nas videnje 19 0.19 % 18.36 12 0.73 % 52.06 0 0 % 0 7 0.34 % 19.82 0 0 % 0 šit šit šit 19 0.19 % 
18.36 2 0.12 % 8.68 15 0.49 % 50.75 0 0 % 0 2 0.07 % 12.83 alo alo alo 17 0.17 % 16.42 7 0.43 % 30.37 7 0.23 % 23.69 3 0.15 % 8.49 0 0 % 0 jebemti jebemti jeb emti 16 0.16 % 15.46 5 0.31 % 21.69 11 0.36 % 37.22 0 0 % 0 0 0 % 0 ups ups ups 14 0.14 % 13.53 3 0.18 % 13.01 6 0.20 % 20.30 3 0.15 % 8.49 2 0.07 % 12.83 živijo živijo živ ijo 14 0.14 % 13.53 0 0 % 0 5 0.16 % 16.92 9 0.44 % 25.48 0 0 % 0 ojoj ojoj ojo j 12 0.12 % 11.59 3 0.18 % 13.01 4 0.13 % 13.53 2 0.10 % 5.66 3 0.10 % 19.24 jej jej jej 8 0.08 % 7.73 5 0.31 % 21.69 2 0.07 % 6.77 1 0.05 % 2.83 0 0 % 0 pardon pardon par don 8 0.08 % 7.73 3 0.18 % 13.01 2 0.07 % 6.77 3 0.15 % 8.49 0 0 % 0 hopla hopla hop la 7 0.07 % 6.76 2 0.12 % 8.68 0 0 % 0 0 0 % 0 5 0.16 % 32.07 oho oho oho 7 0.07 % 6.76 4 0.24 % 17.35 3 0.10 % 10.15 0 0 % 0 0 0 % 0 fuj fuj fuj 6 0.06 % 5.80 0 0 % 0 6 0.20 % 20.30 0 0 % 0 0 0 % 0 huh huh huh 5 0.05 % 4.83 2 0.12 % 8.68 1 0.03 % 3.38 1 0.05 % 2.83 1 0.03 % 6.41 jah jah jah 5 0.05 % 4.83 1 0.06 % 4.34 0 0 % 0 4 0.19 % 11.33 0 0 % 0 ops ops ops 5 0.05 % 4.83 0 0 % 0 3 0.10 % 10.15 1 0.05 % 2.83 1 0.03 % 6.41 CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 647 File at CLARIN.SI2.2.304 List of initial character-level 4-grams from interjection lemmas in the GOS 1.0 corpus with text-type distributionGOS1.0-word_parts-interjections-lemmas-initial- 4grams-taxonomy-entire.tsvLemma Lemma (lower-case) Initial part of the word Rest of the word Total absolute frequency of lemma Percentage of all found lemmas Total relative frequency (per million occurrences) Absolute frequency [gos.T.J.R] Percentage [gos.T.J.R] Relative frequency [gos.T.J.R] Absolute frequency [gos.T.N.Z] Percentage [gos.T.N.Z] Relative frequency [gos.T.N.Z] Absolute frequency [gos.T.J.I] Percentage [gos.T.J.I] Relative frequency [gos.T.J.I] Absolute frequency [gos.T.N.N] Percentage [gos.T.N.N] Relative frequency [gos.T.N.N] recimo recimo reci mo 868 50.88 % 838.56 186 40.00 % 806.86 131 51.17 % 443.25 
376 52.22 % 1,064.71 175 66.04 % 1,122.56 prosim prosim pros im 413 24.21 % 398.99 62 13.33 % 268.95 37 14.45 % 125.19 259 35.97 % 733.40 55 20.75 % 352.81 bravo bravo brav o 119 6.97 % 114.96 88 18.93 % 381.74 5 1.95 % 16.92 23 3.19 % 65.13 3 1.13 % 19.24 adijo adijo adij o 72 4.22 % 69.56 37 7.96 % 160.50 13 5.08 % 43.99 13 1.81 % 36.81 9 3.40 % 57.73 zdravo zdravo zdra vo 70 4.10 % 67.63 35 7.53 % 151.83 13 5.08 % 43.99 13 1.81 % 36.81 9 3.40 % 57.73 halo halo halo 42 2.46 % 40.58 15 3.23 % 65.07 20 7.81 % 67.67 5 0.69 % 14.16 2 0.76 % 12.83 ojej ojej ojej 23 1.35 % 22.22 4 0.86 % 17.35 13 5.08 % 43.99 2 0.28 % 5.66 4 1.51 % 25.66 nasvidenje nasvidenje nasv idenje 19 1.11 % 18.36 12 2.58 % 52.06 0 0 % 0 7 0.97 % 19.82 0 0 % 0 jebemti jebemti jebe mti 16 0.94 % 15.46 5 1.07 % 21.69 11 4.30 % 37.22 0 0 % 0 0 0 % 0 živijo živijo živi jo 14 0.82 % 13.53 0 0 % 0 5 1.95 % 16.92 9 1.25 % 25.48 0 0 % 0 ojoj ojoj ojoj 12 0.70 % 11.59 3 0.65 % 13.01 4 1.56 % 13.53 2 0.28 % 5.66 3 1.13 % 19.24 pardon pardon pard on 8 0.47 % 7.73 3 0.65 % 13.01 2 0.78 % 6.77 3 0.42 % 8.49 0 0 % 0 hopla hopla hopl a 7 0.41 % 6.76 2 0.43 % 8.68 0 0 % 0 0 0 % 0 5 1.89 % 32.07 juhuhu juhuhu juhu hu 4 0.23 % 3.86 3 0.65 % 13.01 0 0 % 0 1 0.14 % 2.83 0 0 % 0 ajej ajej ajej 3 0.18 % 2.90 0 0 % 0 1 0.39 % 3.38 2 0.28 % 5.66 0 0 % 0 hojla hojla hojl a 3 0.18 % 2.90 0 0 % 0 0 0 % 0 3 0.42 % 8.49 0 0 % 0 aleluja aleluja alel uja 2 0.12 % 1.93 2 0.43 % 8.68 0 0 % 0 0 0 % 0 0 0 % 0 juhej juhej juhe j 2 0.12 % 1.93 2 0.43 % 8.68 0 0 % 0 0 0 % 0 0 0 % 0 pozor pozor pozo r 2 0.12 % 1.93 1 0.21 % 4.34 0 0 % 0 1 0.14 % 2.83 0 0 % 0 tralala tralala tral ala 2 0.12 % 1.93 2 0.43 % 8.68 0 0 % 0 0 0 % 0 0 0 % 0 hopsasa hopsasa hops asa 1 0.06 % 0.97 1 0.21 % 4.34 0 0 % 0 0 0 % 0 0 0 % 0 jebenti jebenti jebe nti 1 0.06 % 0.97 1 0.21 % 4.34 0 0 % 0 0 0 % 0 0 0 % 0 jebiga jebiga jebi ga 1 0.06 % 0.97 0 0 % 0 1 0.39 % 3.38 0 0 % 0 0 0 % 0 ježeš ježeš ježe š 1 0.06 % 0.97 1 0.21 % 4.34 0 0 % 0 0 0 % 0 0 0 % 0 zbogom 
zbogom zbog om 1 0.06 % 0.97 0 0 % 0 0 0 % 0 1 0.14 % 2.83 0 0 % 0

2.2.305 List of initial character-level 5-grams from interjection lemmas in the GOS 1.0 corpus with text-type distribution
File at CLARIN.SI: GOS1.0-word_parts-interjections-lemmas-initial-5grams-taxonomy-entire.tsv

| Lemma | Lemma (lower-case) | Initial part of the word | Rest of the word | Total absolute frequency of lemma | Percentage of all found lemmas | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N] |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| recimo | recimo | recim | o | 868 | 53.38 % | 838.56 | 186 | 41.99 % | 806.86 | 131 | 60.09 % | 443.25 | 376 | 53.03 % | 1,064.71 | 175 | 68.36 % | 1,122.56 |
| prosim | prosim | prosi | m | 413 | 25.40 % | 398.99 | 62 | 13.99 % | 268.95 | 37 | 16.97 % | 125.19 | 259 | 36.53 % | 733.40 | 55 | 21.48 % | 352.81 |
| bravo | bravo | bravo | | 119 | 7.32 % | 114.96 | 88 | 19.86 % | 381.74 | 5 | 2.29 % | 16.92 | 23 | 3.24 % | 65.13 | 3 | 1.17 % | 19.24 |
| adijo | adijo | adijo | | 72 | 4.43 % | 69.56 | 37 | 8.35 % | 160.50 | 13 | 5.96 % | 43.99 | 13 | 1.83 % | 36.81 | 9 | 3.52 % | 57.73 |
| zdravo | zdravo | zdrav | o | 70 | 4.30 % | 67.63 | 35 | 7.90 % | 151.83 | 13 | 5.96 % | 43.99 | 13 | 1.83 % | 36.81 | 9 | 3.52 % | 57.73 |
| nasvidenje | nasvidenje | nasvi | denje | 19 | 1.17 % | 18.36 | 12 | 2.71 % | 52.06 | 0 | 0 % | 0 | 7 | 0.99 % | 19.82 | 0 | 0 % | 0 |
| jebemti | jebemti | jebem | ti | 16 | 0.98 % | 15.46 | 5 | 1.13 % | 21.69 | 11 | 5.05 % | 37.22 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| živijo | živijo | živij | o | 14 | 0.86 % | 13.53 | 0 | 0 % | 0 | 5 | 2.29 % | 16.92 | 9 | 1.27 % | 25.48 | 0 | 0 % | 0 |
| pardon | pardon | pardo | n | 8 | 0.49 % | 7.73 | 3 | 0.68 % | 13.01 | 2 | 0.92 % | 6.77 | 3 | 0.42 % | 8.49 | 0 | 0 % | 0 |
| hopla | hopla | hopla | | 7 | 0.43 % | 6.76 | 2 | 0.45 % | 8.68 | 0 | 0 % | 0 | 0 | 0 % | 0 | 5 | 1.95 % | 32.07 |
| juhuhu | juhuhu | juhuh | u | 4 | 0.25 % | 3.86 | 3 | 0.68 % | 13.01 | 0 | 0 % | 0 | 1 | 0.14 % | 2.83 | 0 | 0 % | 0 |
| hojla | hojla | hojla | | 3 | 0.18 % | 2.90 | 0 | 0 % | 0 | 0 | 0 % | 0 | 3 | 0.42 % | 8.49 | 0 | 0 % | 0 |
| aleluja | aleluja | alelu | ja | 2 | 0.12 % | 1.93 | 2 | 0.45 % | 8.68 | 0 | 0 % | 0 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| juhej | juhej | juhej | | 2 | 0.12 % | 1.93 | 2 | 0.45 % | 8.68 | 0 | 0 % | 0 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| pozor | pozor | pozor | | 2 | 0.12 % | 1.93 | 1 | 0.23 % | 4.34 | 0 | 0 % | 0 | 1 | 0.14 % | 2.83 | 0 | 0 % | 0 |
| tralala | tralala | trala | la | 2 | 0.12 % | 1.93 | 2 | 0.45 % | 8.68 | 0 | 0 % | 0 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| hopsasa | hopsasa | hopsa | sa | 1 | 0.06 % | 0.97 | 1 | 0.23 % | 4.34 | 0 | 0 % | 0 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| jebenti | jebenti | jeben | ti | 1 | 0.06 % | 0.97 | 1 | 0.23 % | 4.34 | 0 | 0 % | 0 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| jebiga | jebiga | jebig | a | 1 | 0.06 % | 0.97 | 0 | 0 % | 0 | 1 | 0.46 % | 3.38 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| ježeš | ježeš | ježeš | | 1 | 0.06 % | 0.97 | 1 | 0.23 % | 4.34 | 0 | 0 % | 0 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| zbogom | zbogom | zbogo | m | 1 | 0.06 % | 0.97 | 0 | 0 % | 0 | 0 | 0 % | 0 | 1 | 0.14 % | 2.83 | 0 | 0 % | 0 |

2.2.306 List of final character-level 1-grams from interjection lemmas in the GOS 1.0 corpus with text-type distribution
File at CLARIN.SI: GOS1.0-word_parts-interjections-lemmas-final-1grams-taxonomy-entire.tsv

| Lemma | Lemma (lower-case) | Rest of the word | Final part of the word | Total absolute frequency of lemma | Percentage of all found lemmas | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N] |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| mhm | mhm | mh | m | 4,476 | 38.80 % | 4,324.19 | 432 | 21.74 % | 1,874 | 1,261 | 30.90 % | 4,266.74 | 975 | 44.68 % | 2,760.87 | 1,808 | 55.01 % | 11,597.70 |
| aha | aha | ah | a | 2,063 | 17.88 % | 1,993.03 | 377 | 18.97 % | 1,635.41 | 627 | 15.36 % | 2,121.53 | 219 | 10.04 % | 620.13 | 840 | 25.55 % | 5,388.31 |
| recimo | recimo | recim | o | 868 | 7.52 % | 838.56 | 186 | 9.36 % | 806.86 | 131 | 3.21 % | 443.25 | 376 | 17.23 % | 1,064.71 | 175 | 5.32 % | 1,122.56 |
| aja | aja | aj | a | 762 | 6.61 % | 736.16 | 66 | 3.32 % | 286.31 | 512 | 12.55 % | 1,732.41 | 84 | 3.85 % | 237.86 | 100 | 3.04 % | 641.47 |

ej ej e j 616 5.34 % 595.11 91 4.58 % 394.75
427 10.46 % 1,444.80 21 0.96 % 59.46 77 2.34 % 493.93 joj joj jo j 436 3.78 % 421.21 99 4.98 % 429.46 257 6.30 % 869.59 32 1.47 % 90.61 48 1.46 % 307.90 prosim prosim prosi m 413 3.58 % 398.99 62 3.12 % 268.95 37 0.91 % 125.19 259 11.87 % 733.40 55 1.67 % 352.81 eh eh e h 215 1.86 % 207.71 26 1.31 % 112.79 168 4.12 % 568.45 9 0.41 % 25.48 12 0.36 % 76.98 hm hm h m 209 1.81 % 201.91 34 1.71 % 147.49 89 2.18 % 301.14 50 2.29 % 141.58 36 1.09 % 230.93 ha ha h a 205 1.78 % 198.05 77 3.88 % 334.02 102 2.50 % 345.13 10 0.46 % 28.32 16 0.49 % 102.63 čao čao ča o 177 1.53 % 171 114 5.74 % 494.53 34 0.83 % 115.04 6 0.28 % 16.99 23 0.70 % 147.54 ah ah a h 149 1.29 % 143.95 28 1.41 % 121.46 90 2.21 % 304.53 16 0.73 % 45.31 15 0.46 % 96.22 bravo bravo brav o 119 1.03 % 114.96 88 4.43 % 381.74 5 0.12 % 16.92 23 1.05 % 65.13 3 0.09 % 19.24 adijo adijo adij o 72 0.62 % 69.56 37 1.86 % 160.50 13 0.32 % 43.99 13 0.60 % 36.81 9 0.27 % 57.73 zdravo zdravo zdrav o 70 0.61 % 67.63 35 1.76 % 151.83 13 0.32 % 43.99 13 0.60 % 36.81 9 0.27 % 57.73 he he h e 64 0.56 % 61.83 5 0.25 % 21.69 52 1.27 % 175.95 3 0.14 % 8.49 4 0.12 % 25.66 oh oh o h 57 0.49 % 55.07 12 0.60 % 52.06 25 0.61 % 84.59 10 0.46 % 28.32 10 0.30 % 64.15 la la l a 55 0.48 % 53.13 32 1.61 % 138.81 11 0.27 % 37.22 0 0 % 0 12 0.36 % 76.98 fak fak fa k 53 0.46 % 51.20 0 0 % 0 53 1.30 % 179.33 0 0 % 0 0 0 % 0 halo halo hal o 42 0.36 % 40.58 15 0.76 % 65.07 20 0.49 % 67.67 5 0.23 % 14.16 2 0.06 % 12.83 ho ho h o 41 0.35 % 39.61 35 1.76 % 151.83 5 0.12 % 16.92 1 0.05 % 2.83 0 0 % 0 hop hop ho p 36 0.31 % 34.78 29 1.46 % 125.80 5 0.12 % 16.92 1 0.05 % 2.83 1 0.03 % 6.41 bla bla bl a 34 0.29 % 32.85 2 0.10 % 8.68 23 0.56 % 77.82 5 0.23 % 14.16 4 0.12 % 25.66 hej hej he j 25 0.22 % 24.15 14 0.70 % 60.73 7 0.17 % 23.69 3 0.14 % 8.49 1 0.03 % 6.41 ojej ojej oje j 23 0.20 % 22.22 4 0.20 % 17.35 13 0.32 % 43.99 2 0.09 % 5.66 4 0.12 % 25.66 opa opa op a 23 0.20 % 22.22 14 0.70 % 60.73 6 0.15 % 20.30 1 0.05 % 2.83 2 0.06 % 12.83 uh uh u h 
23 0.20 % 22.22 7 0.35 % 30.37 10 0.24 % 33.84 2 0.09 % 5.66 4 0.12 % 25.66 nasvidenje nasvidenje nasvidenj e 19 0.17 % 18.36 12 0.60 % 52.06 0 0 % 0 7 0.32 % 19.82 0 0 % 0 šit šit ši t 19 0.17 % 18.36 2 0.10 % 8.68 15 0.37 % 50.75 0 0 % 0 2 0.06 % 12.83 alo alo al o 17 0.15 % 16.42 7 0.35 % 30.37 7 0.17 % 23.69 3 0.14 % 8.49 0 0 % 0 jebemti jebemti jebemt i 16 0.14 % 15.46 5 0.25 % 21.69 11 0.27 % 37.22 0 0 % 0 0 0 % 0 oj oj o j 14 0.12 % 13.53 1 0.05 % 4.34 12 0.29 % 40.60 0 0 % 0 1 0.03 % 6.41 CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 650 File at CLARIN.SI2.2.307 List of final character-level 2-grams from interjection lemmas in the GOS 1.0 corpus with text-type distributionGOS1.0-word_parts-interjections-lemmas- final-2grams-taxonomy-entire.tsvLemma Lemma (lower-case) Rest of the word Final part of the word Total absolute frequency of lemma Percentage of all found lemmas Total relative frequency (per million occurrences) Absolute frequency [gos.T.J.R] Percentage [gos.T.J.R] Relative frequency [gos.T.J.R] Absolute frequency [gos.T.N.Z] Percentage [gos.T.N.Z] Relative frequency [gos.T.N.Z] Absolute frequency [gos.T.J.I] Percentage [gos.T.J.I] Relative frequency [gos.T.J.I] Absolute frequency [gos.T.N.N] Percentage [gos.T.N.N] Relative frequency [gos.T.N.N] mhm mhm m hm 4,476 38.80 % 4,324.19 432 21.74 % 1,874 1,261 30.90 % 4,266.74 975 44.68 % 2,760.87 1,808 55.01 % 11,597.70 aha aha a ha 2,063 17.88 % 1,993.03 377 18.97 % 1,635.41 627 15.36 % 2,121.53 219 10.04 % 620.13 840 25.55 % 5,388.31 recimo recimo reci mo 868 7.52 % 838.56 186 9.36 % 806.86 131 3.21 % 443.25 376 17.23 % 1,064.71 175 5.32 % 1,122.56 aja aja a ja 762 6.61 % 736.16 66 3.32 % 286.31 512 12.55 % 1,732.41 84 3.85 % 237.86 100 3.04 % 641.47 ej ej ej 616 5.34 % 595.11 91 4.58 % 394.75 427 10.46 % 1,444.80 21 0.96 % 59.46 77 2.34 % 493.93 joj joj j oj 436 3.78 % 421.21 99 4.98 % 429.46 257 6.30 % 869.59 32 1.47 % 90.61 48 1.46 % 307.90 prosim prosim pros im 413 
3.58 % 398.99 62 3.12 % 268.95 37 0.91 % 125.19 259 11.87 % 733.40 55 1.67 % 352.81 eh eh eh 215 1.86 % 207.71 26 1.31 % 112.79 168 4.12 % 568.45 9 0.41 % 25.48 12 0.36 % 76.98 hm hm hm 209 1.81 % 201.91 34 1.71 % 147.49 89 2.18 % 301.14 50 2.29 % 141.58 36 1.09 % 230.93 ha ha ha 205 1.78 % 198.05 77 3.88 % 334.02 102 2.50 % 345.13 10 0.46 % 28.32 16 0.49 % 102.63 čao čao č ao 177 1.53 % 171 114 5.74 % 494.53 34 0.83 % 115.04 6 0.28 % 16.99 23 0.70 % 147.54 ah ah ah 149 1.29 % 143.95 28 1.41 % 121.46 90 2.21 % 304.53 16 0.73 % 45.31 15 0.46 % 96.22 bravo bravo bra vo 119 1.03 % 114.96 88 4.43 % 381.74 5 0.12 % 16.92 23 1.05 % 65.13 3 0.09 % 19.24 adijo adijo adi jo 72 0.62 % 69.56 37 1.86 % 160.50 13 0.32 % 43.99 13 0.60 % 36.81 9 0.27 % 57.73 zdravo zdravo zdra vo 70 0.61 % 67.63 35 1.76 % 151.83 13 0.32 % 43.99 13 0.60 % 36.81 9 0.27 % 57.73 he he he 64 0.56 % 61.83 5 0.25 % 21.69 52 1.27 % 175.95 3 0.14 % 8.49 4 0.12 % 25.66 oh oh oh 57 0.49 % 55.07 12 0.60 % 52.06 25 0.61 % 84.59 10 0.46 % 28.32 10 0.30 % 64.15 la la la 55 0.48 % 53.13 32 1.61 % 138.81 11 0.27 % 37.22 0 0 % 0 12 0.36 % 76.98 fak fak f ak 53 0.46 % 51.20 0 0 % 0 53 1.30 % 179.33 0 0 % 0 0 0 % 0 halo halo ha lo 42 0.36 % 40.58 15 0.76 % 65.07 20 0.49 % 67.67 5 0.23 % 14.16 2 0.06 % 12.83 ho ho ho 41 0.35 % 39.61 35 1.76 % 151.83 5 0.12 % 16.92 1 0.05 % 2.83 0 0 % 0 hop hop h op 36 0.31 % 34.78 29 1.46 % 125.80 5 0.12 % 16.92 1 0.05 % 2.83 1 0.03 % 6.41 bla bla b la 34 0.29 % 32.85 2 0.10 % 8.68 23 0.56 % 77.82 5 0.23 % 14.16 4 0.12 % 25.66 hej hej h ej 25 0.22 % 24.15 14 0.70 % 60.73 7 0.17 % 23.69 3 0.14 % 8.49 1 0.03 % 6.41 ojej ojej oj ej 23 0.20 % 22.22 4 0.20 % 17.35 13 0.32 % 43.99 2 0.09 % 5.66 4 0.12 % 25.66 opa opa o pa 23 0.20 % 22.22 14 0.70 % 60.73 6 0.15 % 20.30 1 0.05 % 2.83 2 0.06 % 12.83 uh uh uh 23 0.20 % 22.22 7 0.35 % 30.37 10 0.24 % 33.84 2 0.09 % 5.66 4 0.12 % 25.66 nasvidenje nasvidenje nasviden je 19 0.17 % 18.36 12 0.60 % 52.06 0 0 % 0 7 0.32 % 19.82 0 0 % 0 šit šit š it 
19 0.17 % 18.36 2 0.10 % 8.68 15 0.37 % 50.75 0 0 % 0 2 0.06 % 12.83 alo alo a lo 17 0.15 % 16.42 7 0.35 % 30.37 7 0.17 % 23.69 3 0.14 % 8.49 0 0 % 0 jebemti jebemti jebem ti 16 0.14 % 15.46 5 0.25 % 21.69 11 0.27 % 37.22 0 0 % 0 0 0 % 0 oj oj oj 14 0.12 % 13.53 1 0.05 % 4.34 12 0.29 % 40.60 0 0 % 0 1 0.03 % 6.41 CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 651 File at CLARIN.SI2.2.308 List of final character-level 3-grams from interjection lemmas in the GOS 1.0 corpus with text-type distributionGOS1.0-word_parts-interjections-lemmas- final-3grams-taxonomy-entire.tsvLemma Lemma (lower-case) Rest of the word Final part of the word Total absolute frequency of lemma Percentage of all found lemmas Total relative frequency (per million occurrences) Absolute frequency [gos.T.J.R] Percentage [gos.T.J.R] Relative frequency [gos.T.J.R] Absolute frequency [gos.T.N.Z] Percentage [gos.T.N.Z] Relative frequency [gos.T.N.Z] Absolute frequency [gos.T.J.I] Percentage [gos.T.J.I] Relative frequency [gos.T.J.I] Absolute frequency [gos.T.N.N] Percentage [gos.T.N.N] Relative frequency [gos.T.N.N] mhm mhm mhm 4,476 45.32 % 4,324.19 432 26.41 % 1,874 1,261 40.89 % 4,266.74 975 47.35 % 2,760.87 1,808 58.36 % 11,597.70 aha aha aha 2,063 20.89 % 1,993.03 377 23.04 % 1,635.41 627 20.33 % 2,121.53 219 10.64 % 620.13 840 27.11 % 5,388.31 recimo recimo rec imo 868 8.79 % 838.56 186 11.37 % 806.86 131 4.25 % 443.25 376 18.26 % 1,064.71 175 5.65 % 1,122.56 aja aja aja 762 7.71 % 736.16 66 4.03 % 286.31 512 16.60 % 1,732.41 84 4.08 % 237.86 100 3.23 % 641.47 joj joj joj 436 4.41 % 421.21 99 6.05 % 429.46 257 8.33 % 869.59 32 1.55 % 90.61 48 1.55 % 307.90 prosim prosim pro sim 413 4.18 % 398.99 62 3.79 % 268.95 37 1.20 % 125.19 259 12.58 % 733.40 55 1.77 % 352.81 čao čao čao 177 1.79 % 171 114 6.97 % 494.53 34 1.10 % 115.04 6 0.29 % 16.99 23 0.74 % 147.54 bravo bravo br avo 119 1.21 % 114.96 88 5.38 % 381.74 5 0.16 % 16.92 23 1.12 % 65.13 3 0.10 % 19.24 adijo 
adijo ad ijo 72 0.73 % 69.56 37 2.26 % 160.50 13 0.42 % 43.99 13 0.63 % 36.81 9 0.29 % 57.73 zdravo zdravo zdr avo 70 0.71 % 67.63 35 2.14 % 151.83 13 0.42 % 43.99 13 0.63 % 36.81 9 0.29 % 57.73 fak fak fak 53 0.54 % 51.20 0 0 % 0 53 1.72 % 179.33 0 0 % 0 0 0 % 0 halo halo h alo 42 0.42 % 40.58 15 0.92 % 65.07 20 0.65 % 67.67 5 0.24 % 14.16 2 0.07 % 12.83 hop hop hop 36 0.36 % 34.78 29 1.77 % 125.80 5 0.16 % 16.92 1 0.05 % 2.83 1 0.03 % 6.41 bla bla bla 34 0.34 % 32.85 2 0.12 % 8.68 23 0.75 % 77.82 5 0.24 % 14.16 4 0.13 % 25.66 hej hej hej 25 0.25 % 24.15 14 0.86 % 60.73 7 0.23 % 23.69 3 0.15 % 8.49 1 0.03 % 6.41 ojej ojej o jej 23 0.23 % 22.22 4 0.24 % 17.35 13 0.42 % 43.99 2 0.10 % 5.66 4 0.13 % 25.66 opa opa opa 23 0.23 % 22.22 14 0.86 % 60.73 6 0.20 % 20.30 1 0.05 % 2.83 2 0.07 % 12.83 nasvidenje nasvidenje nasvide nje 19 0.19 % 18.36 12 0.73 % 52.06 0 0 % 0 7 0.34 % 19.82 0 0 % 0 šit šit šit 19 0.19 % 18.36 2 0.12 % 8.68 15 0.49 % 50.75 0 0 % 0 2 0.07 % 12.83 alo alo alo 17 0.17 % 16.42 7 0.43 % 30.37 7 0.23 % 23.69 3 0.15 % 8.49 0 0 % 0 jebemti jebemti jebe mti 16 0.16 % 15.46 5 0.31 % 21.69 11 0.36 % 37.22 0 0 % 0 0 0 % 0 ups ups ups 14 0.14 % 13.53 3 0.18 % 13.01 6 0.20 % 20.30 3 0.15 % 8.49 2 0.07 % 12.83 živijo živijo živ ijo 14 0.14 % 13.53 0 0 % 0 5 0.16 % 16.92 9 0.44 % 25.48 0 0 % 0 ojoj ojoj o joj 12 0.12 % 11.59 3 0.18 % 13.01 4 0.13 % 13.53 2 0.10 % 5.66 3 0.10 % 19.24 jej jej jej 8 0.08 % 7.73 5 0.31 % 21.69 2 0.07 % 6.77 1 0.05 % 2.83 0 0 % 0 pardon pardon par don 8 0.08 % 7.73 3 0.18 % 13.01 2 0.07 % 6.77 3 0.15 % 8.49 0 0 % 0 hopla hopla ho pla 7 0.07 % 6.76 2 0.12 % 8.68 0 0 % 0 0 0 % 0 5 0.16 % 32.07 oho oho oho 7 0.07 % 6.76 4 0.24 % 17.35 3 0.10 % 10.15 0 0 % 0 0 0 % 0 fuj fuj fuj 6 0.06 % 5.80 0 0 % 0 6 0.20 % 20.30 0 0 % 0 0 0 % 0 huh huh huh 5 0.05 % 4.83 2 0.12 % 8.68 1 0.03 % 3.38 1 0.05 % 2.83 1 0.03 % 6.41 jah jah jah 5 0.05 % 4.83 1 0.06 % 4.34 0 0 % 0 4 0.19 % 11.33 0 0 % 0 ops ops ops 5 0.05 % 4.83 0 0 % 0 3 0.10 % 10.15 1 0.05 % 
2.83 1 0.03 % 6.41

2.2.309 List of final character-level 4-grams from interjection lemmas in the GOS 1.0 corpus with text-type distribution
File at CLARIN.SI: GOS1.0-word_parts-interjections-lemmas-final-4grams-taxonomy-entire.tsv

| Lemma | Lemma (lower-case) | Rest of the word | Final part of the word | Total absolute frequency of lemma | Percentage of all found lemmas | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N] |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| recimo | recimo | re | cimo | 868 | 50.88 % | 838.56 | 186 | 40.00 % | 806.86 | 131 | 51.17 % | 443.25 | 376 | 52.22 % | 1,064.71 | 175 | 66.04 % | 1,122.56 |
| prosim | prosim | pr | osim | 413 | 24.21 % | 398.99 | 62 | 13.33 % | 268.95 | 37 | 14.45 % | 125.19 | 259 | 35.97 % | 733.40 | 55 | 20.75 % | 352.81 |
| bravo | bravo | b | ravo | 119 | 6.97 % | 114.96 | 88 | 18.93 % | 381.74 | 5 | 1.95 % | 16.92 | 23 | 3.19 % | 65.13 | 3 | 1.13 % | 19.24 |
| adijo | adijo | a | dijo | 72 | 4.22 % | 69.56 | 37 | 7.96 % | 160.50 | 13 | 5.08 % | 43.99 | 13 | 1.81 % | 36.81 | 9 | 3.40 % | 57.73 |
| zdravo | zdravo | zd | ravo | 70 | 4.10 % | 67.63 | 35 | 7.53 % | 151.83 | 13 | 5.08 % | 43.99 | 13 | 1.81 % | 36.81 | 9 | 3.40 % | 57.73 |
| halo | halo | | halo | 42 | 2.46 % | 40.58 | 15 | 3.23 % | 65.07 | 20 | 7.81 % | 67.67 | 5 | 0.69 % | 14.16 | 2 | 0.76 % | 12.83 |
| ojej | ojej | | ojej | 23 | 1.35 % | 22.22 | 4 | 0.86 % | 17.35 | 13 | 5.08 % | 43.99 | 2 | 0.28 % | 5.66 | 4 | 1.51 % | 25.66 |
| nasvidenje | nasvidenje | nasvid | enje | 19 | 1.11 % | 18.36 | 12 | 2.58 % | 52.06 | 0 | 0 % | 0 | 7 | 0.97 % | 19.82 | 0 | 0 % | 0 |
| jebemti | jebemti | jeb | emti | 16 | 0.94 % | 15.46 | 5 | 1.07 % | 21.69 | 11 | 4.30 % | 37.22 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| živijo | živijo | ži | vijo | 14 | 0.82 % | 13.53 | 0 | 0 % | 0 | 5 | 1.95 % | 16.92 | 9 | 1.25 % | 25.48 | 0 | 0 % | 0 |
| ojoj | ojoj | | ojoj | 12 | 0.70 % | 11.59 | 3 | 0.65 % | 13.01 | 4 | 1.56 % | 13.53 | 2 | 0.28 % | 5.66 | 3 | 1.13 % | 19.24 |
| pardon | pardon | pa | rdon | 8 | 0.47 % | 7.73 | 3 | 0.65 % | 13.01 | 2 | 0.78 % | 6.77 | 3 | 0.42 % | 8.49 | 0 | 0 % | 0 |
| hopla | hopla | h | opla | 7 | 0.41 % | 6.76 | 2 | 0.43 % | 8.68 | 0 | 0 % | 0 | 0 | 0 % | 0 | 5 | 1.89 % | 32.07 |
| juhuhu | juhuhu | ju | huhu | 4 | 0.23 % | 3.86 | 3 | 0.65 % | 13.01 | 0 | 0 % | 0 | 1 | 0.14 % | 2.83 | 0 | 0 % | 0 |
| ajej | ajej | | ajej | 3 | 0.18 % | 2.90 | 0 | 0 % | 0 | 1 | 0.39 % | 3.38 | 2 | 0.28 % | 5.66 | 0 | 0 % | 0 |
| hojla | hojla | h | ojla | 3 | 0.18 % | 2.90 | 0 | 0 % | 0 | 0 | 0 % | 0 | 3 | 0.42 % | 8.49 | 0 | 0 % | 0 |
| aleluja | aleluja | ale | luja | 2 | 0.12 % | 1.93 | 2 | 0.43 % | 8.68 | 0 | 0 % | 0 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| juhej | juhej | j | uhej | 2 | 0.12 % | 1.93 | 2 | 0.43 % | 8.68 | 0 | 0 % | 0 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| pozor | pozor | p | ozor | 2 | 0.12 % | 1.93 | 1 | 0.21 % | 4.34 | 0 | 0 % | 0 | 1 | 0.14 % | 2.83 | 0 | 0 % | 0 |
| tralala | tralala | tra | lala | 2 | 0.12 % | 1.93 | 2 | 0.43 % | 8.68 | 0 | 0 % | 0 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| hopsasa | hopsasa | hop | sasa | 1 | 0.06 % | 0.97 | 1 | 0.21 % | 4.34 | 0 | 0 % | 0 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| jebenti | jebenti | jeb | enti | 1 | 0.06 % | 0.97 | 1 | 0.21 % | 4.34 | 0 | 0 % | 0 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| jebiga | jebiga | je | biga | 1 | 0.06 % | 0.97 | 0 | 0 % | 0 | 1 | 0.39 % | 3.38 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| ježeš | ježeš | j | ežeš | 1 | 0.06 % | 0.97 | 1 | 0.21 % | 4.34 | 0 | 0 % | 0 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| zbogom | zbogom | zb | ogom | 1 | 0.06 % | 0.97 | 0 | 0 % | 0 | 0 | 0 % | 0 | 1 | 0.14 % | 2.83 | 0 | 0 % | 0 |

2.2.310 List of final character-level 5-grams from interjection lemmas in the GOS 1.0 corpus with text-type distribution
File at CLARIN.SI: GOS1.0-word_parts-interjections-lemmas-final-5grams-taxonomy-entire.tsv

| Lemma | Lemma (lower-case) | Rest of the word | Final part of the word | Total absolute frequency of lemma | Percentage of all found lemmas | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N] |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| recimo | recimo | r | ecimo | 868 | 53.38 % | 838.56 | 186 | 41.99 % | 806.86 | 131 | 60.09 % | 443.25 | 376 | 53.03 % | 1,064.71 | 175 | 68.36 % | 1,122.56 |
| prosim | prosim | p | rosim | 413 | 25.40 % | 398.99 | 62 | 13.99 % | 268.95 | 37 | 16.97 % | 125.19 | 259 | 36.53 % | 733.40 | 55 | 21.48 % | 352.81 |
| bravo | bravo | | bravo | 119 | 7.32 % | 114.96 | 88 | 19.86 % | 381.74 | 5 | 2.29 % | 16.92 | 23 | 3.24 % | 65.13 | 3 | 1.17 % | 19.24 |
| adijo | adijo | | adijo | 72 | 4.43 % | 69.56 | 37 | 8.35 % | 160.50 | 13 | 5.96 % | 43.99 | 13 | 1.83 % | 36.81 | 9 | 3.52 % | 57.73 |
| zdravo | zdravo | z | dravo | 70 | 4.30 % | 67.63 | 35 | 7.90 % | 151.83 | 13 | 5.96 % | 43.99 | 13 | 1.83 % | 36.81 | 9 | 3.52 % | 57.73 |
| nasvidenje | nasvidenje | nasvi | denje | 19 | 1.17 % | 18.36 | 12 | 2.71 % | 52.06 | 0 | 0 % | 0 | 7 | 0.99 % | 19.82 | 0 | 0 % | 0 |
| jebemti | jebemti | je | bemti | 16 | 0.98 % | 15.46 | 5 | 1.13 % | 21.69 | 11 | 5.05 % | 37.22 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| živijo | živijo | ž | ivijo | 14 | 0.86 % | 13.53 | 0 | 0 % | 0 | 5 | 2.29 % | 16.92 | 9 | 1.27 % | 25.48 | 0 | 0 % | 0 |
| pardon | pardon | p | ardon | 8 | 0.49 % | 7.73 | 3 | 0.68 % | 13.01 | 2 | 0.92 % | 6.77 | 3 | 0.42 % | 8.49 | 0 | 0 % | 0 |
| hopla | hopla | | hopla | 7 | 0.43 % | 6.76 | 2 | 0.45 % | 8.68 | 0 | 0 % | 0 | 0 | 0 % | 0 | 5 | 1.95 % | 32.07 |
| juhuhu | juhuhu | j | uhuhu | 4 | 0.25 % | 3.86 | 3 | 0.68 % | 13.01 | 0 | 0 % | 0 | 1 | 0.14 % | 2.83 | 0 | 0 % | 0 |
| hojla | hojla | | hojla | 3 | 0.18 % | 2.90 | 0 | 0 % | 0 | 0 | 0 % | 0 | 3 | 0.42 % | 8.49 | 0 | 0 % | 0 |
| aleluja | aleluja | al | eluja | 2 | 0.12 % | 1.93 | 2 | 0.45 % | 8.68 | 0 | 0 % | 0 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| juhej | juhej | | juhej | 2 | 0.12 % | 1.93 | 2 | 0.45 % | 8.68 | 0 | 0 % | 0 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| pozor | pozor | | pozor | 2 | 0.12 % | 1.93 | 1 | 0.23 % | 4.34 | 0 | 0 % | 0 | 1 | 0.14 % | 2.83 | 0 | 0 % | 0 |
| tralala | tralala | tr | alala | 2 | 0.12 % | 1.93 | 2 | 0.45 % | 8.68 | 0 | 0 % | 0 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| hopsasa | hopsasa | ho | psasa | 1 | 0.06 % | 0.97 | 1 | 0.23 % | 4.34 | 0 | 0 % | 0 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| jebenti | jebenti | je | benti | 1 | 0.06 % | 0.97 | 1 | 0.23 % | 4.34 | 0 | 0 % | 0 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| jebiga | jebiga | j | ebiga | 1 | 0.06 % | 0.97 | 0 | 0 % | 0 | 1 | 0.46 % | 3.38 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| ježeš | ježeš | | ježeš | 1 | 0.06 % | 0.97 | 1 | 0.23 % | 4.34 | 0 | 0 % | 0 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| zbogom | zbogom | z | bogom | 1 | 0.06 % | 0.97 | 0 | 0 % | 0 | 0 | 0 % | 0 | 1 | 0.14 % | 2.83 | 0 | 0 % | 0 |

2.2.311 List of initial character-level 1-grams from interjection standardized forms in the GOS 1.0 corpus with text-type distribution
File at CLARIN.SI: GOS1.0-word_parts-interjections-standardized_forms-initial-1grams-taxonomy-entire.tsv

Standardized form Initial part of the word Rest of the word Total absolute frequency of
standardized form Percentage of all found standardized forms Total relative frequency (per million occurrences) Absolute frequency [gos.T.J.R] Percentage [gos.T.J.R] Relative frequency [gos.T.J.R] Absolute frequency [gos.T.N.Z] Percentage [gos.T.N.Z] Relative frequency [gos.T.N.Z] Absolute frequency [gos.T.J.I] Percentage [gos.T.J.I] Relative frequency [gos.T.J.I] Absolute frequency [gos.T.N.N] Percentage [gos.T.N.N] Relative frequency [gos.T.N.N] mhm m hm 4,476 38.80 % 4,324.19 432 21.74 % 1,874 1,261 30.90 % 4,266.74 975 44.68 % 2,760.87 1,808 55.01 % 11,597.70 aha a ha 2,055 17.81 % 1,985.30 369 18.57 % 1,600.71 627 15.36 % 2,121.53 219 10.04 % 620.13 840 25.55 % 5,388.31 recimo r ecimo 868 7.52 % 838.56 186 9.36 % 806.86 131 3.21 % 443.25 376 17.23 % 1,064.71 175 5.32 % 1,122.56 aja a ja 762 6.61 % 736.16 66 3.32 % 286.31 512 12.55 % 1,732.41 84 3.85 % 237.86 100 3.04 % 641.47 ej e j 616 5.34 % 595.11 91 4.58 % 394.75 427 10.46 % 1,444.80 21 0.96 % 59.46 77 2.34 % 493.93 joj j oj 436 3.78 % 421.21 99 4.98 % 429.46 257 6.30 % 869.59 32 1.47 % 90.61 48 1.46 % 307.90 prosim p rosim 413 3.58 % 398.99 62 3.12 % 268.95 37 0.91 % 125.19 259 11.87 % 733.40 55 1.67 % 352.81 eh e h 215 1.86 % 207.71 26 1.31 % 112.79 168 4.12 % 568.45 9 0.41 % 25.48 12 0.36 % 76.98 hm h m 209 1.81 % 201.91 34 1.71 % 147.49 89 2.18 % 301.14 50 2.29 % 141.58 36 1.09 % 230.93 ha h a 203 1.76 % 196.11 75 3.77 % 325.35 102 2.50 % 345.13 10 0.46 % 28.32 16 0.49 % 102.63 čao č ao 177 1.53 % 171 114 5.74 % 494.53 34 0.83 % 115.04 6 0.28 % 16.99 23 0.70 % 147.54 ah a h 145 1.26 % 140.08 28 1.41 % 121.46 90 2.21 % 304.53 12 0.55 % 33.98 15 0.46 % 96.22 bravo b ravo 117 1.01 % 113.03 88 4.43 % 381.74 3 0.07 % 10.15 23 1.05 % 65.13 3 0.09 % 19.24 adijo a dijo 72 0.62 % 69.56 37 1.86 % 160.50 13 0.32 % 43.99 13 0.60 % 36.81 9 0.27 % 57.73 zdravo z dravo 70 0.61 % 67.63 35 1.76 % 151.83 13 0.32 % 43.99 13 0.60 % 36.81 9 0.27 % 57.73 he h e 64 0.56 % 61.83 5 0.25 % 21.69 52 1.27 % 175.95 3 0.14 % 8.49 4 
0.12 % 25.66 oh o h 57 0.49 % 55.07 12 0.60 % 52.06 25 0.61 % 84.59 10 0.46 % 28.32 10 0.30 % 64.15 la l a 55 0.48 % 53.13 32 1.61 % 138.81 11 0.27 % 37.22 0 0 % 0 12 0.36 % 76.98 fak f ak 53 0.46 % 51.20 0 0 % 0 53 1.30 % 179.33 0 0 % 0 0 0 % 0 halo h alo 42 0.36 % 40.58 15 0.76 % 65.07 20 0.49 % 67.67 5 0.23 % 14.16 2 0.06 % 12.83 ho h o 41 0.35 % 39.61 35 1.76 % 151.83 5 0.12 % 16.92 1 0.05 % 2.83 0 0 % 0 hop h op 36 0.31 % 34.78 29 1.46 % 125.80 5 0.12 % 16.92 1 0.05 % 2.83 1 0.03 % 6.41 bla b la 34 0.29 % 32.85 2 0.10 % 8.68 23 0.56 % 77.82 5 0.23 % 14.16 4 0.12 % 25.66 hej h ej 23 0.20 % 22.22 12 0.60 % 52.06 7 0.17 % 23.69 3 0.14 % 8.49 1 0.03 % 6.41 ojej o jej 23 0.20 % 22.22 4 0.20 % 17.35 13 0.32 % 43.99 2 0.09 % 5.66 4 0.12 % 25.66 opa o pa 23 0.20 % 22.22 14 0.70 % 60.73 6 0.15 % 20.30 1 0.05 % 2.83 2 0.06 % 12.83 uh u h 23 0.20 % 22.22 7 0.35 % 30.37 10 0.24 % 33.84 2 0.09 % 5.66 4 0.12 % 25.66 nasvidenje n asvidenje 19 0.17 % 18.36 12 0.60 % 52.06 0 0 % 0 7 0.32 % 19.82 0 0 % 0 šit š it 19 0.17 % 18.36 2 0.10 % 8.68 15 0.37 % 50.75 0 0 % 0 2 0.06 % 12.83 alo a lo 17 0.15 % 16.42 7 0.35 % 30.37 7 0.17 % 23.69 3 0.14 % 8.49 0 0 % 0 jebemti j ebemti 16 0.14 % 15.46 5 0.25 % 21.69 11 0.27 % 37.22 0 0 % 0 0 0 % 0 oj o j 14 0.12 % 13.53 1 0.05 % 4.34 12 0.29 % 40.60 0 0 % 0 1 0.03 % 6.41 CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 655 File at CLARIN.SI2.2.312 List of initial character-level 2-grams from interjection standardized forms in the GOS 1.0 corpus with text-type distributionGOS1.0-word_parts-interjections-standardized_ forms-initial-2grams-taxonomy-entire.tsvStandardized form Initial part of the word Rest of the word Total absolute frequency of standardized form Percentage of all found standardized forms Total relative frequency (per million occurrences) Absolute frequency [gos.T.J.R] Percentage [gos.T.J.R] Relative frequency [gos.T.J.R] Absolute frequency [gos.T.N.Z] Percentage [gos.T.N.Z] Relative frequency 
[gos.T.N.Z] Absolute frequency [gos.T.J.I] Percentage [gos.T.J.I] Relative frequency [gos.T.J.I] Absolute frequency [gos.T.N.N] Percentage [gos.T.N.N] Relative frequency [gos.T.N.N] mhm mh m 4,476 38.80 % 4,324.19 432 21.74 % 1,874 1,261 30.90 % 4,266.74 975 44.68 % 2,760.87 1,808 55.01 % 11,597.70 aha ah a 2,055 17.81 % 1,985.30 369 18.57 % 1,600.71 627 15.36 % 2,121.53 219 10.04 % 620.13 840 25.55 % 5,388.31 recimo re cimo 868 7.52 % 838.56 186 9.36 % 806.86 131 3.21 % 443.25 376 17.23 % 1,064.71 175 5.32 % 1,122.56 aja aj a 762 6.61 % 736.16 66 3.32 % 286.31 512 12.55 % 1,732.41 84 3.85 % 237.86 100 3.04 % 641.47 ej ej 616 5.34 % 595.11 91 4.58 % 394.75 427 10.46 % 1,444.80 21 0.96 % 59.46 77 2.34 % 493.93 joj jo j 436 3.78 % 421.21 99 4.98 % 429.46 257 6.30 % 869.59 32 1.47 % 90.61 48 1.46 % 307.90 prosim pr osim 413 3.58 % 398.99 62 3.12 % 268.95 37 0.91 % 125.19 259 11.87 % 733.40 55 1.67 % 352.81 eh eh 215 1.86 % 207.71 26 1.31 % 112.79 168 4.12 % 568.45 9 0.41 % 25.48 12 0.36 % 76.98 hm hm 209 1.81 % 201.91 34 1.71 % 147.49 89 2.18 % 301.14 50 2.29 % 141.58 36 1.09 % 230.93 ha ha 203 1.76 % 196.11 75 3.77 % 325.35 102 2.50 % 345.13 10 0.46 % 28.32 16 0.49 % 102.63 čao ča o 177 1.53 % 171 114 5.74 % 494.53 34 0.83 % 115.04 6 0.28 % 16.99 23 0.70 % 147.54 ah ah 145 1.26 % 140.08 28 1.41 % 121.46 90 2.21 % 304.53 12 0.55 % 33.98 15 0.46 % 96.22 bravo br avo 117 1.01 % 113.03 88 4.43 % 381.74 3 0.07 % 10.15 23 1.05 % 65.13 3 0.09 % 19.24 adijo ad ijo 72 0.62 % 69.56 37 1.86 % 160.50 13 0.32 % 43.99 13 0.60 % 36.81 9 0.27 % 57.73 zdravo zd ravo 70 0.61 % 67.63 35 1.76 % 151.83 13 0.32 % 43.99 13 0.60 % 36.81 9 0.27 % 57.73 he he 64 0.56 % 61.83 5 0.25 % 21.69 52 1.27 % 175.95 3 0.14 % 8.49 4 0.12 % 25.66 oh oh 57 0.49 % 55.07 12 0.60 % 52.06 25 0.61 % 84.59 10 0.46 % 28.32 10 0.30 % 64.15 la la 55 0.48 % 53.13 32 1.61 % 138.81 11 0.27 % 37.22 0 0 % 0 12 0.36 % 76.98 fak fa k 53 0.46 % 51.20 0 0 % 0 53 1.30 % 179.33 0 0 % 0 0 0 % 0 halo ha lo 42 0.36 % 40.58 15 
0.76 % 65.07 20 0.49 % 67.67 5 0.23 % 14.16 2 0.06 % 12.83 ho ho 41 0.35 % 39.61 35 1.76 % 151.83 5 0.12 % 16.92 1 0.05 % 2.83 0 0 % 0 hop ho p 36 0.31 % 34.78 29 1.46 % 125.80 5 0.12 % 16.92 1 0.05 % 2.83 1 0.03 % 6.41 bla bl a 34 0.29 % 32.85 2 0.10 % 8.68 23 0.56 % 77.82 5 0.23 % 14.16 4 0.12 % 25.66 hej he j 23 0.20 % 22.22 12 0.60 % 52.06 7 0.17 % 23.69 3 0.14 % 8.49 1 0.03 % 6.41 ojej oj ej 23 0.20 % 22.22 4 0.20 % 17.35 13 0.32 % 43.99 2 0.09 % 5.66 4 0.12 % 25.66 opa op a 23 0.20 % 22.22 14 0.70 % 60.73 6 0.15 % 20.30 1 0.05 % 2.83 2 0.06 % 12.83 uh uh 23 0.20 % 22.22 7 0.35 % 30.37 10 0.24 % 33.84 2 0.09 % 5.66 4 0.12 % 25.66 nasvidenje na svidenje 19 0.17 % 18.36 12 0.60 % 52.06 0 0 % 0 7 0.32 % 19.82 0 0 % 0 šit ši t 19 0.17 % 18.36 2 0.10 % 8.68 15 0.37 % 50.75 0 0 % 0 2 0.06 % 12.83 alo al o 17 0.15 % 16.42 7 0.35 % 30.37 7 0.17 % 23.69 3 0.14 % 8.49 0 0 % 0 jebemti je bemti 16 0.14 % 15.46 5 0.25 % 21.69 11 0.27 % 37.22 0 0 % 0 0 0 % 0 oj oj 14 0.12 % 13.53 1 0.05 % 4.34 12 0.29 % 40.60 0 0 % 0 1 0.03 % 6.41 CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 656 File at CLARIN.SI2.2.313 List of initial character-level 3-grams from interjection standardized forms in the GOS 1.0 corpus with text-type distributionGOS1.0-word_parts-interjections-standardized_ forms-initial-3grams-taxonomy-entire.tsvStandardized form Initial part of the word Rest of the word Total absolute frequency of standardized form Percentage of all found standardized forms Total relative frequency (per million occurrences) Absolute frequency [gos.T.J.R] Percentage [gos.T.J.R] Relative frequency [gos.T.J.R] Absolute frequency [gos.T.N.Z] Percentage [gos.T.N.Z] Relative frequency [gos.T.N.Z] Absolute frequency [gos.T.J.I] Percentage [gos.T.J.I] Relative frequency [gos.T.J.I] Absolute frequency [gos.T.N.N] Percentage [gos.T.N.N] Relative frequency [gos.T.N.N] mhm mhm 4,476 45.32 % 4,324.19 432 26.41 % 1,874 1,261 40.89 % 4,266.74 975 47.35 % 2,760.87 1,808 
58.36 % 11,597.70
| aha | aha | | 2,055 | 20.81 % | 1,985.30 | 369 | 22.55 % | 1,600.71 | 627 | 20.33 % | 2,121.53 | 219 | 10.64 % | 620.13 | 840 | 27.11 % | 5,388.31 |
| recimo | rec | imo | 868 | 8.79 % | 838.56 | 186 | 11.37 % | 806.86 | 131 | 4.25 % | 443.25 | 376 | 18.26 % | 1,064.71 | 175 | 5.65 % | 1,122.56 |
| aja | aja | | 762 | 7.71 % | 736.16 | 66 | 4.03 % | 286.31 | 512 | 16.60 % | 1,732.41 | 84 | 4.08 % | 237.86 | 100 | 3.23 % | 641.47 |
| joj | joj | | 436 | 4.41 % | 421.21 | 99 | 6.05 % | 429.46 | 257 | 8.33 % | 869.59 | 32 | 1.55 % | 90.61 | 48 | 1.55 % | 307.90 |
| prosim | pro | sim | 413 | 4.18 % | 398.99 | 62 | 3.79 % | 268.95 | 37 | 1.20 % | 125.19 | 259 | 12.58 % | 733.40 | 55 | 1.77 % | 352.81 |
| čao | čao | | 177 | 1.79 % | 171 | 114 | 6.97 % | 494.53 | 34 | 1.10 % | 115.04 | 6 | 0.29 % | 16.99 | 23 | 0.74 % | 147.54 |
| bravo | bra | vo | 117 | 1.19 % | 113.03 | 88 | 5.38 % | 381.74 | 3 | 0.10 % | 10.15 | 23 | 1.12 % | 65.13 | 3 | 0.10 % | 19.24 |
| adijo | adi | jo | 72 | 0.73 % | 69.56 | 37 | 2.26 % | 160.50 | 13 | 0.42 % | 43.99 | 13 | 0.63 % | 36.81 | 9 | 0.29 % | 57.73 |
| zdravo | zdr | avo | 70 | 0.71 % | 67.63 | 35 | 2.14 % | 151.83 | 13 | 0.42 % | 43.99 | 13 | 0.63 % | 36.81 | 9 | 0.29 % | 57.73 |
| fak | fak | | 53 | 0.54 % | 51.20 | 0 | 0 % | 0 | 53 | 1.72 % | 179.33 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| halo | hal | o | 42 | 0.42 % | 40.58 | 15 | 0.92 % | 65.07 | 20 | 0.65 % | 67.67 | 5 | 0.24 % | 14.16 | 2 | 0.07 % | 12.83 |
| hop | hop | | 36 | 0.36 % | 34.78 | 29 | 1.77 % | 125.80 | 5 | 0.16 % | 16.92 | 1 | 0.05 % | 2.83 | 1 | 0.03 % | 6.41 |
| bla | bla | | 34 | 0.34 % | 32.85 | 2 | 0.12 % | 8.68 | 23 | 0.75 % | 77.82 | 5 | 0.24 % | 14.16 | 4 | 0.13 % | 25.66 |
| hej | hej | | 23 | 0.23 % | 22.22 | 12 | 0.73 % | 52.06 | 7 | 0.23 % | 23.69 | 3 | 0.15 % | 8.49 | 1 | 0.03 % | 6.41 |
| ojej | oje | j | 23 | 0.23 % | 22.22 | 4 | 0.24 % | 17.35 | 13 | 0.42 % | 43.99 | 2 | 0.10 % | 5.66 | 4 | 0.13 % | 25.66 |
| opa | opa | | 23 | 0.23 % | 22.22 | 14 | 0.86 % | 60.73 | 6 | 0.20 % | 20.30 | 1 | 0.05 % | 2.83 | 2 | 0.07 % | 12.83 |
| nasvidenje | nas | videnje | 19 | 0.19 % | 18.36 | 12 | 0.73 % | 52.06 | 0 | 0 % | 0 | 7 | 0.34 % | 19.82 | 0 | 0 % | 0 |
| šit | šit | | 19 | 0.19 % | 18.36 | 2 | 0.12 % | 8.68 | 15 | 0.49 % | 50.75 | 0 | 0 % | 0 | 2 | 0.07 % | 12.83 |
| alo | alo | | 17 | 0.17 % | 16.42 | 7 | 0.43 % | 30.37 | 7 | 0.23 % | 23.69 | 3 | 0.15 % | 8.49 | 0 | 0 % | 0 |
| jebemti | jeb | emti | 16 | 0.16 % | 15.46 | 5 | 0.31 % | 21.69 | 11 | 0.36 % | 37.22 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| ups | ups | | 14 | 0.14 % | 13.53 | 3 | 0.18 % | 13.01 | 6 | 0.20 % | 20.30 | 3 | 0.15 % | 8.49 | 2 | 0.07 % | 12.83 |
| živijo | živ | ijo | 14 | 0.14 % | 13.53 | 0 | 0 % | 0 | 5 | 0.16 % | 16.92 | 9 | 0.44 % | 25.48 | 0 | 0 % | 0 |
| ojoj | ojo | j | 12 | 0.12 % | 11.59 | 3 | 0.18 % | 13.01 | 4 | 0.13 % | 13.53 | 2 | 0.10 % | 5.66 | 3 | 0.10 % | 19.24 |
| Aha | Aha | | 8 | 0.08 % | 7.73 | 8 | 0.49 % | 34.70 | 0 | 0 % | 0 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| jej | jej | | 8 | 0.08 % | 7.73 | 5 | 0.31 % | 21.69 | 2 | 0.07 % | 6.77 | 1 | 0.05 % | 2.83 | 0 | 0 % | 0 |
| pardon | par | don | 8 | 0.08 % | 7.73 | 3 | 0.18 % | 13.01 | 2 | 0.07 % | 6.77 | 3 | 0.15 % | 8.49 | 0 | 0 % | 0 |
| oho | oho | | 7 | 0.07 % | 6.76 | 4 | 0.24 % | 17.35 | 3 | 0.10 % | 10.15 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| Hopla | Hop | la | 6 | 0.06 % | 5.80 | 1 | 0.06 % | 4.34 | 0 | 0 % | 0 | 0 | 0 % | 0 | 5 | 0.16 % | 32.07 |
| fuj | fuj | | 6 | 0.06 % | 5.80 | 0 | 0 % | 0 | 6 | 0.20 % | 20.30 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| huh | huh | | 5 | 0.05 % | 4.83 | 2 | 0.12 % | 8.68 | 1 | 0.03 % | 3.38 | 1 | 0.05 % | 2.83 | 1 | 0.03 % | 6.41 |
| jah | jah | | 5 | 0.05 % | 4.83 | 1 | 0.06 % | 4.34 | 0 | 0 % | 0 | 4 | 0.19 % | 11.33 | 0 | 0 % | 0 |

2.2.314 List of initial character-level 4-grams from interjection standardized forms in the GOS 1.0 corpus with text-type distribution

File at CLARIN.SI: GOS1.0-word_parts-interjections-standardized_forms-initial-4grams-taxonomy-entire.tsv

| Standardized form | Initial part of the word | Rest of the word | Total absolute frequency of standardized form | Percentage of all found standardized forms | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N] |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| recimo | reci | mo | 868 | 50.88 % | 838.56 | 186 | 40.00 % | 806.86 | 131 | 51.17 % | 443.25 | 376 | 52.22 % | 1,064.71 | 175 | 66.04 % | 1,122.56 |
| prosim | pros | im | 413 | 24.21 % | 398.99 | 62 | 13.33 % | 268.95 | 37 | 14.45 % | 125.19 | 259 | 35.97 % | 733.40 | 55 | 20.75 % | 352.81 |
| bravo | brav | o | 117 | 6.86 % | 113.03 | 88 | 18.93 % | 381.74 | 3 | 1.17 % | 10.15 | 23 | 3.19 % | 65.13 | 3 | 1.13 % | 19.24 |
| adijo | adij | o | 72 | 4.22 % | 69.56 | 37 | 7.96 % | 160.50 | 13 | 5.08 % | 43.99 | 13 | 1.81 % | 36.81 | 9 | 3.40 % | 57.73 |
| zdravo | zdra | vo | 70 | 4.10 % | 67.63 | 35 | 7.53 % | 151.83 | 13 | 5.08 % | 43.99 | 13 | 1.81 % | 36.81 | 9 | 3.40 % | 57.73 |
| halo | halo | | 42 | 2.46 % | 40.58 | 15 | 3.23 % | 65.07 | 20 | 7.81 % | 67.67 | 5 | 0.69 % | 14.16 | 2 | 0.76 % | 12.83 |
| ojej | ojej | | 23 | 1.35 % | 22.22 | 4 | 0.86 % | 17.35 | 13 | 5.08 % | 43.99 | 2 | 0.28 % | 5.66 | 4 | 1.51 % | 25.66 |
| nasvidenje | nasv | idenje | 19 | 1.11 % | 18.36 | 12 | 2.58 % | 52.06 | 0 | 0 % | 0 | 7 | 0.97 % | 19.82 | 0 | 0 % | 0 |
| jebemti | jebe | mti | 16 | 0.94 % | 15.46 | 5 | 1.07 % | 21.69 | 11 | 4.30 % | 37.22 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| živijo | živi | jo | 14 | 0.82 % | 13.53 | 0 | 0 % | 0 | 5 | 1.95 % | 16.92 | 9 | 1.25 % | 25.48 | 0 | 0 % | 0 |
| ojoj | ojoj | | 12 | 0.70 % | 11.59 | 3 | 0.65 % | 13.01 | 4 | 1.56 % | 13.53 | 2 | 0.28 % | 5.66 | 3 | 1.13 % | 19.24 |
| pardon | pard | on | 8 | 0.47 % | 7.73 | 3 | 0.65 % | 13.01 | 2 | 0.78 % | 6.77 | 3 | 0.42 % | 8.49 | 0 | 0 % | 0 |
| Hopla | Hopl | a | 6 | 0.35 % | 5.80 | 1 | 0.21 % | 4.34 | 0 | 0 % | 0 | 0 | 0 % | 0 | 5 | 1.89 % | 32.07 |
| juhuhu | juhu | hu | 4 | 0.23 % | 3.86 | 3 | 0.65 % | 13.01 | 0 | 0 % | 0 | 1 | 0.14 % | 2.83 | 0 | 0 % | 0 |
| ajej | ajej | | 3 | 0.18 % | 2.90 | 0 | 0 % | 0 | 1 | 0.39 % | 3.38 | 2 | 0.28 % | 5.66 | 0 | 0 % | 0 |
| hojla | hojl | a | 3 | 0.18 % | 2.90 | 0 | 0 % | 0 | 0 | 0 % | 0 | 3 | 0.42 % | 8.49 | 0 | 0 % | 0 |
| Bravo | Brav | o | 2 | 0.12 % | 1.93 | 0 | 0 % | 0 | 2 | 0.78 % | 6.77 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| aleluja | alel | uja | 2 | 0.12 % | 1.93 | 2 | 0.43 % | 8.68 | 0 | 0 % | 0 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| juhej | juhe | j | 2 | 0.12 % | 1.93 | 2 | 0.43 % | 8.68 | 0 | 0 % | 0 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| pozor | pozo | r | 2 | 0.12 % | 1.93 | 1 | 0.21 % | 4.34 | 0 | 0 % | 0 | 1 | 0.14 % | 2.83 | 0 | 0 % | 0 |
| tralala | tral | ala | 2 | 0.12 % | 1.93 | 2 | 0.43 % | 8.68 | 0 | 0 % | 0 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| hopla | hopl | a | 1 | 0.06 % | 0.97 | 1 | 0.21 % | 4.34 | 0 | 0 % | 0 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| hopsasa | hops | asa | 1 | 0.06 % | 0.97 | 1 | 0.21 % | 4.34 | 0 | 0 % | 0 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| jebenti | jebe | nti | 1 | 0.06 % | 0.97 | 1 | 0.21 % | 4.34 | 0 | 0 % | 0 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| jebiga | jebi | ga | 1 | 0.06 % | 0.97 | 0 | 0 % | 0 | 1 | 0.39 % | 3.38 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| ježeš | ježe | š | 1 | 0.06 % | 0.97 | 1 | 0.21 % | 4.34 | 0 | 0 % | 0 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| zbogom | zbog | om | 1 | 0.06 % | 0.97 | 0 | 0 % | 0 | 0 | 0 % | 0 | 1 | 0.14 % | 2.83 | 0 | 0 % | 0 |

2.2.315 List of initial character-level 5-grams from interjection standardized forms in the GOS 1.0 corpus with text-type distribution

File at CLARIN.SI: GOS1.0-word_parts-interjections-standardized_forms-initial-5grams-taxonomy-entire.tsv

| Standardized form | Initial part of the word | Rest of the word | Total absolute frequency of standardized form | Percentage of all found standardized forms | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N] |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| recimo | recim | o | 868 | 53.38 % | 838.56 | 186 | 41.99 % | 806.86 | 131 | 60.09 % | 443.25 | 376 | 53.03 % | 1,064.71 | 175 | 68.36 % | 1,122.56 |
| prosim | prosi | m | 413 | 25.40 % | 398.99 | 62 | 13.99 % | 268.95 | 37 | 16.97 % | 125.19 | 259 | 36.53 % | 733.40 | 55 | 21.48 % | 352.81 |
| bravo | bravo | | 117 | 7.20 % | 113.03 | 88 | 19.86 % | 381.74 | 3 | 1.38 % | 10.15 | 23 | 3.24 % | 65.13 | 3 | 1.17 % | 19.24 |
| adijo | adijo | | 72 | 4.43 % | 69.56 | 37 | 8.35 % | 160.50 | 13 | 5.96 % | 43.99 | 13 | 1.83 % | 36.81 | 9 | 3.52 % | 57.73 |
| zdravo | zdrav | o | 70 | 4.30 % | 67.63 | 35 | 7.90 % | 151.83 | 13 | 5.96 % | 43.99 | 13 | 1.83 % | 36.81 | 9 | 3.52 % | 57.73 |
| nasvidenje | nasvi | denje | 19 | 1.17 % | 18.36 | 12 | 2.71 % | 52.06 | 0 | 0 % | 0 | 7 | 0.99 % | 19.82 | 0 | 0 % | 0 |
| jebemti | jebem | ti | 16 | 0.98 % | 15.46 | 5 | 1.13 % | 21.69 | 11 | 5.05 % | 37.22 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| živijo | živij | o | 14 | 0.86 % | 13.53 | 0 | 0 % | 0 | 5 | 2.29 % | 16.92 | 9 | 1.27 % | 25.48 | 0 | 0 % | 0 |
| pardon | pardo | n | 8 | 0.49 % | 7.73 | 3 | 0.68 % | 13.01 | 2 | 0.92 % | 6.77 | 3 | 0.42 % | 8.49 | 0 | 0 % | 0 |
| Hopla | Hopla | | 6 | 0.37 % | 5.80 | 1 | 0.23 % | 4.34 | 0 | 0 % | 0 | 0 | 0 % | 0 | 5 | 1.95 % | 32.07 |
| juhuhu | juhuh | u | 4 | 0.25 % | 3.86 | 3 | 0.68 % | 13.01 | 0 | 0 % | 0 | 1 | 0.14 % | 2.83 | 0 | 0 % | 0 |
| hojla | hojla | | 3 | 0.18 % | 2.90 | 0 | 0 % | 0 | 0 | 0 % | 0 | 3 | 0.42 % | 8.49 | 0 | 0 % | 0 |
| Bravo | Bravo | | 2 | 0.12 % | 1.93 | 0 | 0 % | 0 | 2 | 0.92 % | 6.77 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| aleluja | alelu | ja | 2 | 0.12 % | 1.93 | 2 | 0.45 % | 8.68 | 0 | 0 % | 0 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| juhej | juhej | | 2 | 0.12 % | 1.93 | 2 | 0.45 % | 8.68 | 0 | 0 % | 0 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| pozor | pozor | | 2 | 0.12 % | 1.93 | 1 | 0.23 % | 4.34 | 0 | 0 % | 0 | 1 | 0.14 % | 2.83 | 0 | 0 % | 0 |
| tralala | trala | la | 2 | 0.12 % | 1.93 | 2 | 0.45 % | 8.68 | 0 | 0 % | 0 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| hopla | hopla | | 1 | 0.06 % | 0.97 | 1 | 0.23 % | 4.34 | 0 | 0 % | 0 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| hopsasa | hopsa | sa | 1 | 0.06 % | 0.97 | 1 | 0.23 % | 4.34 | 0 | 0 % | 0 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| jebenti | jeben | ti | 1 | 0.06 % | 0.97 | 1 | 0.23 % | 4.34 | 0 | 0 % | 0 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| jebiga | jebig | a | 1 | 0.06 % | 0.97 | 0 | 0 % | 0 | 1 | 0.46 % | 3.38 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| ježeš | ježeš | | 1 | 0.06 % | 0.97 | 1 | 0.23 % | 4.34 | 0 | 0 % | 0 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| zbogom | zbogo | m | 1 | 0.06 % | 0.97 | 0 | 0 % | 0 | 0 | 0 % | 0 | 1 | 0.14 % | 2.83 | 0 | 0 % | 0 |

2.2.316 List of final character-level 1-grams from interjection standardized forms in the GOS 1.0 corpus with text-type distribution

File at CLARIN.SI: GOS1.0-word_parts-interjections-standardized_forms-final-1grams-taxonomy-entire.tsv

| Standardized form | Rest of the word | Final part of the word | Total absolute frequency of standardized form | Percentage of all found standardized forms | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N] |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| mhm | mh | m | 4,476 | 38.80 % | 4,324.19 | 432 | 21.74 % | 1,874 | 1,261 | 30.90 % | 4,266.74 | 975 | 44.68 % | 2,760.87 | 1,808 | 55.01 % | 11,597.70 |
| aha | ah | a | 2,055 | 17.81 % | 1,985.30 | 369 | 18.57 % | 1,600.71 | 627 | 15.36 % | 2,121.53 | 219 | 10.04 % | 620.13 | 840 | 25.55 % | 5,388.31 |
| recimo | recim | o | 868 | 7.52 % | 838.56 | 186 | 9.36 % | 806.86 | 131 | 3.21 % | 443.25 | 376 | 17.23 % | 1,064.71 | 175 | 5.32 % | 1,122.56 |
| aja | aj | a | 762 | 6.61 % | 736.16 | 66 | 3.32 % | 286.31 | 512 | 12.55 % | 1,732.41 | 84 | 3.85 % | 237.86 | 100 | 3.04 % | 641.47 |
| ej | e | j | 616 | 5.34 % | 595.11 | 91 | 4.58 % | 394.75 | 427 | 10.46 % | 1,444.80 | 21 | 0.96 % | 59.46 | 77 | 2.34 % | 493.93 |
| joj | jo | j | 436 | 3.78 % | 421.21 | 99 | 4.98 % | 429.46 | 257 | 6.30 % | 869.59 | 32 | 1.47 % | 90.61 | 48 | 1.46 % | 307.90 |
| prosim | prosi | m | 413 | 3.58 % | 398.99 | 62 | 3.12 % | 268.95 | 37 | 0.91 % | 125.19 | 259 | 11.87 % | 733.40 | 55 | 1.67 % | 352.81 |
| eh | e | h | 215 | 1.86 % | 207.71 | 26 | 1.31 % | 112.79 | 168 | 4.12 % | 568.45 | 9 | 0.41 % | 25.48 | 12 | 0.36 % | 76.98 |
| hm | h | m | 209 | 1.81 % | 201.91 | 34 | 1.71 % | 147.49 | 89 | 2.18 % | 301.14 | 50 | 2.29 % | 141.58 | 36 | 1.09 % | 230.93 |
| ha | h | a | 203 | 1.76 % | 196.11 | 75 | 3.77 % | 325.35 | 102 | 2.50 % | 345.13 | 10 | 0.46 % | 28.32 | 16 | 0.49 % | 102.63 |
| čao | ča | o | 177 | 1.53 % | 171 | 114 | 5.74 % | 494.53 | 34 | 0.83 % | 115.04 | 6 | 0.28 % | 16.99 | 23 | 0.70 % | 147.54 |
| ah | a | h | 145 | 1.26 % | 140.08 | 28 | 1.41 % | 121.46 | 90 | 2.21 % | 304.53 | 12 | 0.55 % | 33.98 | 15 | 0.46 % | 96.22 |
| bravo | brav | o | 117 | 1.01 % | 113.03 | 88 | 4.43 % | 381.74 | 3 | 0.07 % | 10.15 | 23 | 1.05 % | 65.13 | 3 | 0.09 % | 19.24 |
| adijo | adij | o | 72 | 0.62 % | 69.56 | 37 | 1.86 % | 160.50 | 13 | 0.32 % | 43.99 | 13 | 0.60 % | 36.81 | 9 | 0.27 % | 57.73 |
| zdravo | zdrav | o | 70 | 0.61 % | 67.63 | 35 | 1.76 % | 151.83 | 13 | 0.32 % | 43.99 | 13 | 0.60 % | 36.81 | 9 | 0.27 % | 57.73 |
| he | h | e | 64 | 0.56 % | 61.83 | 5 | 0.25 % | 21.69 | 52 | 1.27 % | 175.95 | 3 | 0.14 % | 8.49 | 4 | 0.12 % | 25.66 |
| oh | o | h | 57 | 0.49 % | 55.07 | 12 | 0.60 % | 52.06 | 25 | 0.61 % | 84.59 | 10 | 0.46 % | 28.32 | 10 | 0.30 % | 64.15 |
| la | l | a | 55 | 0.48 % | 53.13 | 32 | 1.61 % | 138.81 | 11 | 0.27 % | 37.22 | 0 | 0 % | 0 | 12 | 0.36 % | 76.98 |
| fak | fa | k | 53 | 0.46 % | 51.20 | 0 | 0 % | 0 | 53 | 1.30 % | 179.33 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| halo | hal | o | 42 | 0.36 % | 40.58 | 15 | 0.76 % | 65.07 | 20 | 0.49 % | 67.67 | 5 | 0.23 % | 14.16 | 2 | 0.06 % | 12.83 |
| ho | h | o | 41 | 0.35 % | 39.61 | 35 | 1.76 % | 151.83 | 5 | 0.12 % | 16.92 | 1 | 0.05 % | 2.83 | 0 | 0 % | 0 |
| hop | ho | p | 36 | 0.31 % | 34.78 | 29 | 1.46 % | 125.80 | 5 | 0.12 % | 16.92 | 1 | 0.05 % | 2.83 | 1 | 0.03 % | 6.41 |
| bla | bl | a | 34 | 0.29 % | 32.85 | 2 | 0.10 % | 8.68 | 23 | 0.56 % | 77.82 | 5 | 0.23 % | 14.16 | 4 | 0.12 % | 25.66 |
| hej | he | j | 23 | 0.20 % | 22.22 | 12 | 0.60 % | 52.06 | 7 | 0.17 % | 23.69 | 3 | 0.14 % | 8.49 | 1 | 0.03 % | 6.41 |
| ojej | oje | j | 23 | 0.20 % | 22.22 | 4 | 0.20 % | 17.35 | 13 | 0.32 % | 43.99 | 2 | 0.09 % | 5.66 | 4 | 0.12 % | 25.66 |
| opa | op | a | 23 | 0.20 % | 22.22 | 14 | 0.70 % | 60.73 | 6 | 0.15 % | 20.30 | 1 | 0.05 % | 2.83 | 2 | 0.06 % | 12.83 |
| uh | u | h | 23 | 0.20 % | 22.22 | 7 | 0.35 % | 30.37 | 10 | 0.24 % | 33.84 | 2 | 0.09 % | 5.66 | 4 | 0.12 % | 25.66 |
| nasvidenje | nasvidenj | e | 19 | 0.17 % | 18.36 | 12 | 0.60 % | 52.06 | 0 | 0 % | 0 | 7 | 0.32 % | 19.82 | 0 | 0 % | 0 |
| šit | ši | t | 19 | 0.17 % | 18.36 | 2 | 0.10 % | 8.68 | 15 | 0.37 % | 50.75 | 0 | 0 % | 0 | 2 | 0.06 % | 12.83 |
| alo | al | o | 17 | 0.15 % | 16.42 | 7 | 0.35 % | 30.37 | 7 | 0.17 % | 23.69 | 3 | 0.14 % | 8.49 | 0 | 0 % | 0 |
| jebemti | jebemt | i | 16 | 0.14 % | 15.46 | 5 | 0.25 % | 21.69 | 11 | 0.27 % | 37.22 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| oj | o | j | 14 | 0.12 % | 13.53 | 1 | 0.05 % | 4.34 | 12 | 0.29 % | 40.60 | 0 | 0 % | 0 | 1 | 0.03 % | 6.41 |

2.2.317 List of final character-level 2-grams from interjection standardized forms in the GOS 1.0 corpus with text-type distribution

File at CLARIN.SI: GOS1.0-word_parts-interjections-standardized_forms-final-2grams-taxonomy-entire.tsv

| Standardized form | Rest of the word | Final part of the word | Total absolute frequency of standardized form | Percentage of all found standardized forms | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N] |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| mhm | m | hm | 4,476 | 38.80 % | 4,324.19 | 432 | 21.74 % | 1,874 | 1,261 | 30.90 % | 4,266.74 | 975 | 44.68 % | 2,760.87 | 1,808 | 55.01 % | 11,597.70 |
| aha | a | ha | 2,055 | 17.81 % | 1,985.30 | 369 | 18.57 % | 1,600.71 | 627 | 15.36 % | 2,121.53 | 219 | 10.04 % | 620.13 | 840 | 25.55 % | 5,388.31 |
| recimo | reci | mo | 868 | 7.52 % | 838.56 | 186 | 9.36 % | 806.86 | 131 | 3.21 % | 443.25 | 376 | 17.23 % | 1,064.71 | 175 | 5.32 % | 1,122.56 |
| aja | a | ja | 762 | 6.61 % | 736.16 | 66 | 3.32 % | 286.31 | 512 | 12.55 % | 1,732.41 | 84 | 3.85 % | 237.86 | 100 | 3.04 % | 641.47 |
| ej | | ej | 616 | 5.34 % | 595.11 | 91 | 4.58 % | 394.75 | 427 | 10.46 % | 1,444.80 | 21 | 0.96 % | 59.46 | 77 | 2.34 % | 493.93 |
| joj | j | oj | 436 | 3.78 % | 421.21 | 99 | 4.98 % | 429.46 | 257 | 6.30 % | 869.59 | 32 | 1.47 % | 90.61 | 48 | 1.46 % | 307.90 |
| prosim | pros | im | 413 | 3.58 % | 398.99 | 62 | 3.12 % | 268.95 | 37 | 0.91 % | 125.19 | 259 | 11.87 % | 733.40 | 55 | 1.67 % | 352.81 |
| eh | | eh | 215 | 1.86 % | 207.71 | 26 | 1.31 % | 112.79 | 168 | 4.12 % | 568.45 | 9 | 0.41 % | 25.48 | 12 | 0.36 % | 76.98 |
| hm | | hm | 209 | 1.81 % | 201.91 | 34 | 1.71 % | 147.49 | 89 | 2.18 % | 301.14 | 50 | 2.29 % | 141.58 | 36 | 1.09 % | 230.93 |
| ha | | ha | 203 | 1.76 % | 196.11 | 75 | 3.77 % | 325.35 | 102 | 2.50 % | 345.13 | 10 | 0.46 % | 28.32 | 16 | 0.49 % | 102.63 |
| čao | č | ao | 177 | 1.53 % | 171 | 114 | 5.74 % | 494.53 | 34 | 0.83 % | 115.04 | 6 | 0.28 % | 16.99 | 23 | 0.70 % | 147.54 |
| ah | | ah | 145 | 1.26 % | 140.08 | 28 | 1.41 % | 121.46 | 90 | 2.21 % | 304.53 | 12 | 0.55 % | 33.98 | 15 | 0.46 % | 96.22 |
| bravo | bra | vo | 117 | 1.01 % | 113.03 | 88 | 4.43 % | 381.74 | 3 | 0.07 % | 10.15 | 23 | 1.05 % | 65.13 | 3 | 0.09 % | 19.24 |
| adijo | adi | jo | 72 | 0.62 % | 69.56 | 37 | 1.86 % | 160.50 | 13 | 0.32 % | 43.99 | 13 | 0.60 % | 36.81 | 9 | 0.27 % | 57.73 |
| zdravo | zdra | vo | 70 | 0.61 % | 67.63 | 35 | 1.76 % | 151.83 | 13 | 0.32 % | 43.99 | 13 | 0.60 % | 36.81 | 9 | 0.27 % | 57.73 |
| he | | he | 64 | 0.56 % | 61.83 | 5 | 0.25 % | 21.69 | 52 | 1.27 % | 175.95 | 3 | 0.14 % | 8.49 | 4 | 0.12 % | 25.66 |
| oh | | oh | 57 | 0.49 % | 55.07 | 12 | 0.60 % | 52.06 | 25 | 0.61 % | 84.59 | 10 | 0.46 % | 28.32 | 10 | 0.30 % | 64.15 |
| la | | la | 55 | 0.48 % | 53.13 | 32 | 1.61 % | 138.81 | 11 | 0.27 % | 37.22 | 0 | 0 % | 0 | 12 | 0.36 % | 76.98 |
| fak | f | ak | 53 | 0.46 % | 51.20 | 0 | 0 % | 0 | 53 | 1.30 % | 179.33 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| halo | ha | lo | 42 | 0.36 % | 40.58 | 15 | 0.76 % | 65.07 | 20 | 0.49 % | 67.67 | 5 | 0.23 % | 14.16 | 2 | 0.06 % | 12.83 |
| ho | | ho | 41 | 0.35 % | 39.61 | 35 | 1.76 % | 151.83 | 5 | 0.12 % | 16.92 | 1 | 0.05 % | 2.83 | 0 | 0 % | 0 |
| hop | h | op | 36 | 0.31 % | 34.78 | 29 | 1.46 % | 125.80 | 5 | 0.12 % | 16.92 | 1 | 0.05 % | 2.83 | 1 | 0.03 % | 6.41 |
| bla | b | la | 34 | 0.29 % | 32.85 | 2 | 0.10 % | 8.68 | 23 | 0.56 % | 77.82 | 5 | 0.23 % | 14.16 | 4 | 0.12 % | 25.66 |
| hej | h | ej | 23 | 0.20 % | 22.22 | 12 | 0.60 % | 52.06 | 7 | 0.17 % | 23.69 | 3 | 0.14 % | 8.49 | 1 | 0.03 % | 6.41 |
| ojej | oj | ej | 23 | 0.20 % | 22.22 | 4 | 0.20 % | 17.35 | 13 | 0.32 % | 43.99 | 2 | 0.09 % | 5.66 | 4 | 0.12 % | 25.66 |
| opa | o | pa | 23 | 0.20 % | 22.22 | 14 | 0.70 % | 60.73 | 6 | 0.15 % | 20.30 | 1 | 0.05 % | 2.83 | 2 | 0.06 % | 12.83 |
| uh | | uh | 23 | 0.20 % | 22.22 | 7 | 0.35 % | 30.37 | 10 | 0.24 % | 33.84 | 2 | 0.09 % | 5.66 | 4 | 0.12 % | 25.66 |
| nasvidenje | nasviden | je | 19 | 0.17 % | 18.36 | 12 | 0.60 % | 52.06 | 0 | 0 % | 0 | 7 | 0.32 % | 19.82 | 0 | 0 % | 0 |
| šit | š | it | 19 | 0.17 % | 18.36 | 2 | 0.10 % | 8.68 | 15 | 0.37 % | 50.75 | 0 | 0 % | 0 | 2 | 0.06 % | 12.83 |
| alo | a | lo | 17 | 0.15 % | 16.42 | 7 | 0.35 % | 30.37 | 7 | 0.17 % | 23.69 | 3 | 0.14 % | 8.49 | 0 | 0 % | 0 |
| jebemti | jebem | ti | 16 | 0.14 % | 15.46 | 5 | 0.25 % | 21.69 | 11 | 0.27 % | 37.22 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| oj | | oj | 14 | 0.12 % | 13.53 | 1 | 0.05 % | 4.34 | 12 | 0.29 % | 40.60 | 0 | 0 % | 0 | 1 | 0.03 % | 6.41 |

2.2.318 List of final character-level 3-grams from interjection standardized forms in the GOS 1.0 corpus with text-type distribution

File at CLARIN.SI: GOS1.0-word_parts-interjections-standardized_forms-final-3grams-taxonomy-entire.tsv

| Standardized form | Rest of the word | Final part of the word | Total absolute frequency of standardized form | Percentage of all found standardized forms | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N] |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| mhm | | mhm | 4,476 | 45.32 % | 4,324.19 | 432 | 26.41 % | 1,874 | 1,261 | 40.89 % | 4,266.74 | 975 | 47.35 % | 2,760.87 | 1,808 | 58.36 % | 11,597.70 |
| aha | | aha | 2,055 | 20.81 % | 1,985.30 | 369 | 22.55 % | 1,600.71 | 627 | 20.33 % | 2,121.53 | 219 | 10.64 % | 620.13 | 840 | 27.11 % | 5,388.31 |
| recimo | rec | imo | 868 | 8.79 % | 838.56 | 186 | 11.37 % | 806.86 | 131 | 4.25 % | 443.25 | 376 | 18.26 % | 1,064.71 | 175 | 5.65 % | 1,122.56 |
| aja | | aja | 762 | 7.71 % | 736.16 | 66 | 4.03 % | 286.31 | 512 | 16.60 % | 1,732.41 | 84 | 4.08 % | 237.86 | 100 | 3.23 % | 641.47 |
| joj | | joj | 436 | 4.41 % | 421.21 | 99 | 6.05 % | 429.46 | 257 | 8.33 % | 869.59 | 32 | 1.55 % | 90.61 | 48 | 1.55 % | 307.90 |
| prosim | pro | sim | 413 | 4.18 % | 398.99 | 62 | 3.79 % | 268.95 | 37 | 1.20 % | 125.19 | 259 | 12.58 % | 733.40 | 55 | 1.77 % | 352.81 |
| čao | | čao | 177 | 1.79 % | 171 | 114 | 6.97 % | 494.53 | 34 | 1.10 % | 115.04 | 6 | 0.29 % | 16.99 | 23 | 0.74 % | 147.54 |
| bravo | br | avo | 117 | 1.19 % | 113.03 | 88 | 5.38 % | 381.74 | 3 | 0.10 % | 10.15 | 23 | 1.12 % | 65.13 | 3 | 0.10 % | 19.24 |
| adijo | ad | ijo | 72 | 0.73 % | 69.56 | 37 | 2.26 % | 160.50 | 13 | 0.42 % | 43.99 | 13 | 0.63 % | 36.81 | 9 | 0.29 % | 57.73 |
| zdravo | zdr | avo | 70 | 0.71 % | 67.63 | 35 | 2.14 % | 151.83 | 13 | 0.42 % | 43.99 | 13 | 0.63 % | 36.81 | 9 | 0.29 % | 57.73 |
| fak | | fak | 53 | 0.54 % | 51.20 | 0 | 0 % | 0 | 53 | 1.72 % | 179.33 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| halo | h | alo | 42 | 0.42 % | 40.58 | 15 | 0.92 % | 65.07 | 20 | 0.65 % | 67.67 | 5 | 0.24 % | 14.16 | 2 | 0.07 % | 12.83 |
| hop | | hop | 36 | 0.36 % | 34.78 | 29 | 1.77 % | 125.80 | 5 | 0.16 % | 16.92 | 1 | 0.05 % | 2.83 | 1 | 0.03 % | 6.41 |
| bla | | bla | 34 | 0.34 % | 32.85 | 2 | 0.12 % | 8.68 | 23 | 0.75 % | 77.82 | 5 | 0.24 % | 14.16 | 4 | 0.13 % | 25.66 |
| hej | | hej | 23 | 0.23 % | 22.22 | 12 | 0.73 % | 52.06 | 7 | 0.23 % | 23.69 | 3 | 0.15 % | 8.49 | 1 | 0.03 % | 6.41 |
| ojej | o | jej | 23 | 0.23 % | 22.22 | 4 | 0.24 % | 17.35 | 13 | 0.42 % | 43.99 | 2 | 0.10 % | 5.66 | 4 | 0.13 % | 25.66 |
| opa | | opa | 23 | 0.23 % | 22.22 | 14 | 0.86 % | 60.73 | 6 | 0.20 % | 20.30 | 1 | 0.05 % | 2.83 | 2 | 0.07 % | 12.83 |
| nasvidenje | nasvide | nje | 19 | 0.19 % | 18.36 | 12 | 0.73 % | 52.06 | 0 | 0 % | 0 | 7 | 0.34 % | 19.82 | 0 | 0 % | 0 |
| šit | | šit | 19 | 0.19 % | 18.36 | 2 | 0.12 % | 8.68 | 15 | 0.49 % | 50.75 | 0 | 0 % | 0 | 2 | 0.07 % | 12.83 |
| alo | | alo | 17 | 0.17 % | 16.42 | 7 | 0.43 % | 30.37 | 7 | 0.23 % | 23.69 | 3 | 0.15 % | 8.49 | 0 | 0 % | 0 |
| jebemti | jebe | mti | 16 | 0.16 % | 15.46 | 5 | 0.31 % | 21.69 | 11 | 0.36 % | 37.22 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| ups | | ups | 14 | 0.14 % | 13.53 | 3 | 0.18 % | 13.01 | 6 | 0.20 % | 20.30 | 3 | 0.15 % | 8.49 | 2 | 0.07 % | 12.83 |
| živijo | živ | ijo | 14 | 0.14 % | 13.53 | 0 | 0 % | 0 | 5 | 0.16 % | 16.92 | 9 | 0.44 % | 25.48 | 0 | 0 % | 0 |
| ojoj | o | joj | 12 | 0.12 % | 11.59 | 3 | 0.18 % | 13.01 | 4 | 0.13 % | 13.53 | 2 | 0.10 % | 5.66 | 3 | 0.10 % | 19.24 |
| Aha | | Aha | 8 | 0.08 % | 7.73 | 8 | 0.49 % | 34.70 | 0 | 0 % | 0 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| jej | | jej | 8 | 0.08 % | 7.73 | 5 | 0.31 % | 21.69 | 2 | 0.07 % | 6.77 | 1 | 0.05 % | 2.83 | 0 | 0 % | 0 |
| pardon | par | don | 8 | 0.08 % | 7.73 | 3 | 0.18 % | 13.01 | 2 | 0.07 % | 6.77 | 3 | 0.15 % | 8.49 | 0 | 0 % | 0 |
| oho | | oho | 7 | 0.07 % | 6.76 | 4 | 0.24 % | 17.35 | 3 | 0.10 % | 10.15 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| Hopla | Ho | pla | 6 | 0.06 % | 5.80 | 1 | 0.06 % | 4.34 | 0 | 0 % | 0 | 0 | 0 % | 0 | 5 | 0.16 % | 32.07 |
| fuj | | fuj | 6 | 0.06 % | 5.80 | 0 | 0 % | 0 | 6 | 0.20 % | 20.30 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| huh | | huh | 5 | 0.05 % | 4.83 | 2 | 0.12 % | 8.68 | 1 | 0.03 % | 3.38 | 1 | 0.05 % | 2.83 | 1 | 0.03 % | 6.41 |
| jah | | jah | 5 | 0.05 % | 4.83 | 1 | 0.06 % | 4.34 | 0 | 0 % | 0 | 4 | 0.19 % | 11.33 | 0 | 0 % | 0 |

2.2.319 List of final character-level 4-grams from interjection standardized forms in the GOS 1.0 corpus with text-type distribution

File at CLARIN.SI: GOS1.0-word_parts-interjections-standardized_forms-final-4grams-taxonomy-entire.tsv

| Standardized form | Rest of the word | Final part of the word | Total absolute frequency of standardized form | Percentage of all found standardized forms | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N] |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| recimo | re | cimo | 868 | 50.88 % | 838.56 | 186 | 40.00 % | 806.86 | 131 | 51.17 % | 443.25 | 376 | 52.22 % | 1,064.71 | 175 | 66.04 % | 1,122.56 |
| prosim | pr | osim | 413 | 24.21 % | 398.99 | 62 | 13.33 % | 268.95 | 37 | 14.45 % | 125.19 | 259 | 35.97 % | 733.40 | 55 | 20.75 % | 352.81 |
| bravo | b | ravo | 117 | 6.86 % | 113.03 | 88 | 18.93 % | 381.74 | 3 | 1.17 % | 10.15 | 23 | 3.19 % | 65.13 | 3 | 1.13 % | 19.24 |
| adijo | a | dijo | 72 | 4.22 % | 69.56 | 37 | 7.96 % | 160.50 | 13 | 5.08 % | 43.99 | 13 | 1.81 % | 36.81 | 9 | 3.40 % | 57.73 |
| zdravo | zd | ravo | 70 | 4.10 % | 67.63 | 35 | 7.53 % | 151.83 | 13 | 5.08 % | 43.99 | 13 | 1.81 % | 36.81 | 9 | 3.40 % | 57.73 |
| halo | | halo | 42 | 2.46 % | 40.58 | 15 | 3.23 % | 65.07 | 20 | 7.81 % | 67.67 | 5 | 0.69 % | 14.16 | 2 | 0.76 % | 12.83 |
| ojej | | ojej | 23 | 1.35 % | 22.22 | 4 | 0.86 % | 17.35 | 13 | 5.08 % | 43.99 | 2 | 0.28 % | 5.66 | 4 | 1.51 % | 25.66 |
| nasvidenje | nasvid | enje | 19 | 1.11 % | 18.36 | 12 | 2.58 % | 52.06 | 0 | 0 % | 0 | 7 | 0.97 % | 19.82 | 0 | 0 % | 0 |
| jebemti | jeb | emti | 16 | 0.94 % | 15.46 | 5 | 1.07 % | 21.69 | 11 | 4.30 % | 37.22 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| živijo | ži | vijo | 14 | 0.82 % | 13.53 | 0 | 0 % | 0 | 5 | 1.95 % | 16.92 | 9 | 1.25 % | 25.48 | 0 | 0 % | 0 |
| ojoj | | ojoj | 12 | 0.70 % | 11.59 | 3 | 0.65 % | 13.01 | 4 | 1.56 % | 13.53 | 2 | 0.28 % | 5.66 | 3 | 1.13 % | 19.24 |
| pardon | pa | rdon | 8 | 0.47 % | 7.73 | 3 | 0.65 % | 13.01 | 2 | 0.78 % | 6.77 | 3 | 0.42 % | 8.49 | 0 | 0 % | 0 |
| Hopla | H | opla | 6 | 0.35 % | 5.80 | 1 | 0.21 % | 4.34 | 0 | 0 % | 0 | 0 | 0 % | 0 | 5 | 1.89 % | 32.07 |
| juhuhu | ju | huhu | 4 | 0.23 % | 3.86 | 3 | 0.65 % | 13.01 | 0 | 0 % | 0 | 1 | 0.14 % | 2.83 | 0 | 0 % | 0 |
| ajej | | ajej | 3 | 0.18 % | 2.90 | 0 | 0 % | 0 | 1 | 0.39 % | 3.38 | 2 | 0.28 % | 5.66 | 0 | 0 % | 0 |
| hojla | h | ojla | 3 | 0.18 % | 2.90 | 0 | 0 % | 0 | 0 | 0 % | 0 | 3 | 0.42 % | 8.49 | 0 | 0 % | 0 |
| Bravo | B | ravo | 2 | 0.12 % | 1.93 | 0 | 0 % | 0 | 2 | 0.78 % | 6.77 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| aleluja | ale | luja | 2 | 0.12 % | 1.93 | 2 | 0.43 % | 8.68 | 0 | 0 % | 0 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| juhej | j | uhej | 2 | 0.12 % | 1.93 | 2 | 0.43 % | 8.68 | 0 | 0 % | 0 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| pozor | p | ozor | 2 | 0.12 % | 1.93 | 1 | 0.21 % | 4.34 | 0 | 0 % | 0 | 1 | 0.14 % | 2.83 | 0 | 0 % | 0 |
| tralala | tra | lala | 2 | 0.12 % | 1.93 | 2 | 0.43 % | 8.68 | 0 | 0 % | 0 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| hopla | h | opla | 1 | 0.06 % | 0.97 | 1 | 0.21 % | 4.34 | 0 | 0 % | 0 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| hopsasa | hop | sasa | 1 | 0.06 % | 0.97 | 1 | 0.21 % | 4.34 | 0 | 0 % | 0 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| jebenti | jeb | enti | 1 | 0.06 % | 0.97 | 1 | 0.21 % | 4.34 | 0 | 0 % | 0 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| jebiga | je | biga | 1 | 0.06 % | 0.97 | 0 | 0 % | 0 | 1 | 0.39 % | 3.38 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| ježeš | j | ežeš | 1 | 0.06 % | 0.97 | 1 | 0.21 % | 4.34 | 0 | 0 % | 0 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| zbogom | zb | ogom | 1 | 0.06 % | 0.97 | 0 | 0 % | 0 | 0 | 0 % | 0 | 1 | 0.14 % | 2.83 | 0 | 0 % | 0 |

2.2.320 List of final character-level 5-grams from interjection standardized forms in the GOS 1.0 corpus with text-type distribution

File at CLARIN.SI: GOS1.0-word_parts-interjections-standardized_forms-final-5grams-taxonomy-entire.tsv

| Standardized form | Rest of the word | Final part of the word | Total absolute frequency of standardized form | Percentage of all found standardized forms | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N] |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| recimo | r | ecimo | 868 | 53.38 % | 838.56 | 186 | 41.99 % | 806.86 | 131 | 60.09 % | 443.25 | 376 | 53.03 % | 1,064.71 | 175 | 68.36 % | 1,122.56 |
| prosim | p | rosim | 413 | 25.40 % | 398.99 | 62 | 13.99 % | 268.95 | 37 | 16.97 % | 125.19 | 259 | 36.53 % | 733.40 | 55 | 21.48 % | 352.81 |
| bravo | | bravo | 117 | 7.20 % | 113.03 | 88 | 19.86 % | 381.74 | 3 | 1.38 % | 10.15 | 23 | 3.24 % | 65.13 | 3 | 1.17 % | 19.24 |
| adijo | | adijo | 72 | 4.43 % | 69.56 | 37 | 8.35 % | 160.50 | 13 | 5.96 % | 43.99 | 13 | 1.83 % | 36.81 | 9 | 3.52 % | 57.73 |
| zdravo | z | dravo | 70 | 4.30 % | 67.63 | 35 | 7.90 % | 151.83 | 13 | 5.96 % | 43.99 | 13 | 1.83 % | 36.81 | 9 | 3.52 % | 57.73 |
| nasvidenje | nasvi | denje | 19 | 1.17 % | 18.36 | 12 | 2.71 % | 52.06 | 0 | 0 % | 0 | 7 | 0.99 % | 19.82 | 0 | 0 % | 0 |
| jebemti | je | bemti | 16 | 0.98 % | 15.46 | 5 | 1.13 % | 21.69 | 11 | 5.05 % | 37.22 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| živijo | ž | ivijo | 14 | 0.86 % | 13.53 | 0 | 0 % | 0 | 5 | 2.29 % | 16.92 | 9 | 1.27 % | 25.48 | 0 | 0 % | 0 |
| pardon | p | ardon | 8 | 0.49 % | 7.73 | 3 | 0.68 % | 13.01 | 2 | 0.92 % | 6.77 | 3 | 0.42 % | 8.49 | 0 | 0 % | 0 |
| Hopla | | Hopla | 6 | 0.37 % | 5.80 | 1 | 0.23 % | 4.34 | 0 | 0 % | 0 | 0 | 0 % | 0 | 5 | 1.95 % | 32.07 |
| juhuhu | j | uhuhu | 4 | 0.25 % | 3.86 | 3 | 0.68 % | 13.01 | 0 | 0 % | 0 | 1 | 0.14 % | 2.83 | 0 | 0 % | 0 |
| hojla | | hojla | 3 | 0.18 % | 2.90 | 0 | 0 % | 0 | 0 | 0 % | 0 | 3 | 0.42 % | 8.49 | 0 | 0 % | 0 |
| Bravo | | Bravo | 2 | 0.12 % | 1.93 | 0 | 0 % | 0 | 2 | 0.92 % | 6.77 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| aleluja | al | eluja | 2 | 0.12 % | 1.93 | 2 | 0.45 % | 8.68 | 0 | 0 % | 0 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| juhej | | juhej | 2 | 0.12 % | 1.93 | 2 | 0.45 % | 8.68 | 0 | 0 % | 0 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| pozor | | pozor | 2 | 0.12 % | 1.93 | 1 | 0.23 % | 4.34 | 0 | 0 % | 0 | 1 | 0.14 % | 2.83 | 0 | 0 % | 0 |
| tralala | tr | alala | 2 | 0.12 % | 1.93 | 2 | 0.45 % | 8.68 | 0 | 0 % | 0 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| hopla | | hopla | 1 | 0.06 % | 0.97 | 1 | 0.23 % | 4.34 | 0 | 0 % | 0 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| hopsasa | ho | psasa | 1 | 0.06 % | 0.97 | 1 | 0.23 % | 4.34 | 0 | 0 % | 0 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| jebenti | je | benti | 1 | 0.06 % | 0.97 | 1 | 0.23 % | 4.34 | 0 | 0 % | 0 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| jebiga | j | ebiga | 1 | 0.06 % | 0.97 | 0 | 0 % | 0 | 1 | 0.46 % | 3.38 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| ježeš | | ježeš | 1 | 0.06 % | 0.97 | 1 | 0.23 % | 4.34 | 0 | 0 % | 0 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| zbogom | z | bogom | 1 | 0.06 % | 0.97 | 0 | 0 % | 0 | 0 | 0 % | 0 | 1 | 0.14 % | 2.83 | 0 | 0 % | 0 |

2.2.321 List of initial character-level 1-grams from interjection lower-case word forms in the GOS 1.0 corpus with text-type distribution

File at CLARIN.SI: GOS1.0-word_parts-interjections-lowercase_forms-initial-1grams-taxonomy-entire.tsv

| Lower-case word form | Initial part of the word | Rest of the word | Total absolute frequency of lower-case word form | Percentage of all found lower-case word forms | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N] |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| mhm | m | hm | 4,475 | 38.79 % | 4,323.22 | 431 | 21.69 % | 1,869.66 | 1,261 | 30.90 % | 4,266.74 | 975 | 44.68 % | 2,760.87 | 1,808 | 55.01 % | 11,597.70 |
| aha | a | ha | 2,059 | 17.85 % | 1,989.17 | 376 | 18.92 % | 1,631.07 | 626 | 15.34 % | 2,118.14 | 217 | 9.95 % | 614.47 | 840 | 25.55 % | 5,388.31 |
| recimo | r | ecimo | 827 | 7.17 % | 798.95 | 180 | 9.06 % | 780.83 | 119 | 2.92 % | 402.65 | 357 | 16.36 % | 1,010.90 | 171 | 5.20 % | 1,096.91 |
| aja | a | ja | 758 | 6.57 % | 732.29 | 66 | 3.32 % | 286.31 | 508 | 12.45 % | 1,718.88 | 84 | 3.85 % | 237.86 | 100 | 3.04 % | 641.47 |
| ej | e | j | 615 | 5.33 % | 594.14 | 91 | 4.58 % | 394.75 | 426 | 10.44 % | 1,441.42 | 21 | 0.96 % | 59.46 | 77 | 2.34 % | 493.93 |
| joj | j | oj | 364 | 3.15 % | 351.65 | 86 | 4.33 % | 373.06 | 211 | 5.17 % | 713.94 | 32 | 1.47 % | 90.61 | 35 | 1.06 % | 224.51 |
| prosim | p | rosim | 290 | 2.51 % | 280.16 | 42 | 2.11 % | 182.19 | 21 | 0.52 % | 71.06 | 187 | 8.57 % | 529.52 | 40 | 1.22 % | 256.59 |
| eh | e | h | 215 | 1.86 % | 207.71 | 26 | 1.31 % | 112.79 | 168 | 4.12 % | 568.45 | 9 | 0.41 % | 25.48 | 12 | 0.36 % | 76.98 |
| hm | h | m | 208 | 1.80 % | 200.95 | 34 | 1.71 % | 147.49 | 88 | 2.16 % | 297.76 | 50 | 2.29 % | 141.58 | 36 | 1.09 % | 230.93 |
| ha | h | a | 205 | 1.78 % | 198.05 | 77 | 3.88 % | 334.02 | 102 | 2.50 % | 345.13 | 10 | 0.46 % | 28.32 | 16 | 0.49 % | 102.63 |
| čav | č | av | 150 | 1.30 % | 144.91 | 102 | 5.13 % | 442.47 | 30 | 0.73 % | 101.51 | 0 | 0 % | 0 | 18 | 0.55 % | 115.46 |
| ah | a | h | 149 | 1.29 % | 143.95 | 28 | 1.41 % | 121.46 | 90 | 2.21 % | 304.53 | 16 | 0.73 % | 45.31 | 15 | 0.46 % | 96.22 |
| bravo | b | ravo | 118 | 1.02 % | 114 | 88 | 4.43 % | 381.74 | 5 | 0.12 % | 16.92 | 22 | 1.01 % | 62.30 | 3 | 0.09 % | 19.24 |
| prosm | p | rosm | 104 | 0.90 % | 100.47 | 18 | 0.91 % | 78.08 | 8 | 0.20 % | 27.07 | 65 | 2.98 % | 184.06 | 13 | 0.40 % | 83.39 |
| zdravo | z | dravo | 69 | 0.60 % | 66.66 | 35 | 1.76 % | 151.83 | 12 | 0.29 % | 40.60 | 13 | 0.60 % | 36.81 | 9 | 0.27 % | 57.73 |
| he | h | e | 61 | 0.53 % | 58.93 | 4 | 0.20 % | 17.35 | 50 | 1.23 % | 169.18 | 3 | 0.14 % | 8.49 | 4 | 0.12 % | 25.66 |
| la | l | a | 55 | 0.48 % | 53.13 | 32 | 1.61 % | 138.81 | 11 | 0.27 % | 37.22 | 0 | 0 % | 0 | 12 | 0.36 % | 76.98 |
| oh | o | h | 54 | 0.47 % | 52.17 | 10 | 0.50 % | 43.38 | 24 | 0.59 % | 81.21 | 10 | 0.46 % | 28.32 | 10 | 0.30 % | 64.15 |
| fak | f | ak | 53 | 0.46 % | 51.20 | 0 | 0 % | 0 | 53 | 1.30 % | 179.33 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| adijo | a | dijo | 51 | 0.44 % | 49.27 | 29 | 1.46 % | 125.80 | 8 | 0.20 % | 27.07 | 9 | 0.41 % | 25.48 | 5 | 0.15 % | 32.07 |
| halo | h | alo | 42 | 0.36 % | 40.58 | 15 | 0.76 % | 65.07 | 20 | 0.49 % | 67.67 | 5 | 0.23 % | 14.16 | 2 | 0.06 % | 12.83 |
| ho | h | o | 38 | 0.33 % | 36.71 | 32 | 1.61 % | 138.81 | 5 | 0.12 % | 16.92 | 1 | 0.05 % | 2.83 | 0 | 0 % | 0 |
| hop | h | op | 35 | 0.30 % | 33.81 | 29 | 1.46 % | 125.80 | 4 | 0.10 % | 13.53 | 1 | 0.05 % | 2.83 | 1 | 0.03 % | 6.41 |
| bla | b | la | 31 | 0.27 % | 29.95 | 2 | 0.10 % | 8.68 | 20 | 0.49 % | 67.67 | 5 | 0.23 % | 14.16 | 4 | 0.12 % | 25.66 |
| jo | j | o | 30 | 0.26 % | 28.98 | 2 | 0.10 % | 8.68 | 23 | 0.56 % | 77.82 | 0 | 0 % | 0 | 5 | 0.15 % | 32.07 |
| hej | h | ej | 25 | 0.22 % | 24.15 | 14 | 0.70 % | 60.73 | 7 | 0.17 % | 23.69 | 3 | 0.14 % | 8.49 | 1 | 0.03 % | 6.41 |
| čao | č | ao | 24 | 0.21 % | 23.19 | 11 | 0.55 % | 47.72 | 3 | 0.07 % | 10.15 | 6 | 0.28 % | 16.99 | 4 | 0.12 % | 25.66 |
| opa | o | pa | 23 | 0.20 % | 22.22 | 14 | 0.70 % | 60.73 | 6 | 0.15 % | 20.30 | 1 | 0.05 % | 2.83 | 2 | 0.06 % | 12.83 |
| uh | u | h | 23 | 0.20 % | 22.22 | 7 | 0.35 % | 30.37 | 10 | 0.24 % | 33.84 | 2 | 0.09 % | 5.66 | 4 | 0.12 % | 25.66 |
| rečmo | r | ečmo | 22 | 0.19 % | 21.25 | 2 | 0.10 % | 8.68 | 5 | 0.12 % | 16.92 | 14 | 0.64 % | 39.64 | 1 | 0.03 % | 6.41 |
| ojej | o | jej | 21 | 0.18 % | 20.29 | 3 | 0.15 % | 13.01 | 12 | 0.29 % | 40.60 | 2 | 0.09 % | 5.66 | 4 | 0.12 % | 25.66 |
| dijo | d | ijo | 20 | 0.17 % | 19.32 | 8 | 0.40 % | 34.70 | 4 | 0.10 % | 13.53 | 4 | 0.18 % | 11.33 | 4 | 0.12 % | 25.66 |

2.2.322 List of initial character-level 2-grams from interjection lower-case word forms in the GOS 1.0 corpus with text-type distribution

File at CLARIN.SI: GOS1.0-word_parts-interjections-lowercase_forms-initial-2grams-taxonomy-entire.tsv

| Lower-case word form | Initial part of the word | Rest of the word | Total absolute frequency of lower-case word form | Percentage of all found lower-case word forms | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N] |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| mhm | mh | m | 4,475 | 38.79 % | 4,323.22 | 431 | 21.69 % | 1,869.66 | 1,261 | 30.91 % | 4,266.74 | 975 | 44.68 % | 2,760.87 | 1,808 | 55.01 % | 11,597.70 |
| aha | ah | a | 2,059 | 17.85 % | 1,989.17 | 376 | 18.92 % | 1,631.07 | 626 | 15.34 % | 2,118.14 | 217 | 9.95 % | 614.47 | 840 | 25.55 % | 5,388.31 |
| recimo | re | cimo | 827 | 7.17 % | 798.95 | 180 | 9.06 % | 780.83 | 119 | 2.92 % | 402.65 | 357 | 16.36 % | 1,010.90 | 171 | 5.20 % | 1,096.91 |
| aja | aj | a | 758 | 6.57 % | 732.29 | 66 | 3.32 % | 286.31 | 508 | 12.45 % | 1,718.88 | 84 | 3.85 % | 237.86 | 100 | 3.04 % | 641.47 |
| ej | ej | | 615 | 5.33 % | 594.14 | 91 | 4.58 % | 394.75 | 426 | 10.44 % | 1,441.42 | 21 | 0.96 % | 59.46 | 77 | 2.34 % | 493.93 |
| joj | jo | j | 364 | 3.15 % | 351.65 | 86 | 4.33 % | 373.06 | 211 | 5.17 % | 713.94 | 32 | 1.47 % | 90.61 | 35 | 1.06 % | 224.51 |
| prosim | pr | osim | 290 | 2.51 % | 280.16 | 42 | 2.11 % | 182.19 | 21 | 0.52 % | 71.06 | 187 | 8.57 % | 529.52 | 40 | 1.22 % | 256.59 |
| eh | eh | | 215 | 1.86 % | 207.71 | 26 | 1.31 % | 112.79 | 168 | 4.12 % | 568.45 | 9 | 0.41 % | 25.48 | 12 | 0.36 % | 76.98 |
| hm | hm | | 208 | 1.80 % | 200.95 | 34 | 1.71 % | 147.49 | 88 | 2.16 % | 297.76 | 50 | 2.29 % | 141.58 | 36 | 1.09 % | 230.93 |
| ha | ha | | 205 | 1.78 % | 198.05 | 77 | 3.88 % | 334.02 | 102 | 2.50 % | 345.13 | 10 | 0.46 % | 28.32 | 16 | 0.49 % | 102.63 |
| čav | ča | v | 150 | 1.30 % | 144.91 | 102 | 5.13 % | 442.47 | 30 | 0.73 % | 101.51 | 0 | 0 % | 0 | 18 | 0.55 % | 115.46 |
| ah | ah | | 149 | 1.29 % | 143.95 | 28 | 1.41 % | 121.46 | 90 | 2.21 % | 304.53 | 16 | 0.73 % | 45.31 | 15 | 0.46 % | 96.22 |
| bravo | br | avo | 118 | 1.02 % | 114 | 88 | 4.43 % | 381.74 | 5 | 0.12 % | 16.92 | 22 | 1.01 % | 62.30 | 3 | 0.09 % | 19.24 |
| prosm | pr | osm | 104 | 0.90 % | 100.47 | 18 | 0.91 % | 78.08 | 8 | 0.20 % | 27.07 | 65 | 2.98 % | 184.06 | 13 | 0.40 % | 83.39 |
| zdravo | zd | ravo | 69 | 0.60 % | 66.66 | 35 | 1.76 % | 151.83 | 12 | 0.29 % | 40.60 | 13 | 0.60 % | 36.81 | 9 | 0.27 % | 57.73 |
| he | he | | 61 | 0.53 % | 58.93 | 4 | 0.20 % | 17.35 | 50 | 1.23 % | 169.18 | 3 | 0.14 % | 8.49 | 4 | 0.12 % | 25.66 |
| la | la | | 55 | 0.48 % | 53.13 | 32 | 1.61 % | 138.81 | 11 | 0.27 % | 37.22 | 0 | 0 % | 0 | 12 | 0.36 % | 76.98 |
| oh | oh | | 54 | 0.47 % | 52.17 | 10 | 0.50 % | 43.38 | 24 | 0.59 % | 81.21 | 10 | 0.46 % | 28.32 | 10 | 0.30 % | 64.15 |
| fak | fa | k | 53 | 0.46 % | 51.20 | 0 | 0 % | 0 | 53 | 1.30 % | 179.33 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| adijo | ad | ijo | 51 | 0.44 % | 49.27 | 29 | 1.46 % | 125.80 | 8 | 0.20 % | 27.07 | 9 | 0.41 % | 25.48 | 5 | 0.15 % | 32.07 |
| halo | ha | lo | 42 | 0.36 % | 40.58 | 15 | 0.76 % | 65.07 | 20 | 0.49 % | 67.67 | 5 | 0.23 % | 14.16 | 2 | 0.06 % | 12.83 |
| ho | ho | | 38 | 0.33 % | 36.71 | 32 | 1.61 % | 138.81 | 5 | 0.12 % | 16.92 | 1 | 0.05 % | 2.83 | 0 | 0 % | 0 |
| hop | ho | p | 35 | 0.30 % | 33.81 | 29 | 1.46 % | 125.80 | 4 | 0.10 % | 13.53 | 1 | 0.05 % | 2.83 | 1 | 0.03 % | 6.41 |
| bla | bl | a | 31 | 0.27 % | 29.95 | 2 | 0.10 % | 8.68 | 20 | 0.49 % | 67.67 | 5 | 0.23 % | 14.16 | 4 | 0.12 % | 25.66 |
| jo | jo | | 30 | 0.26 % | 28.98 | 2 | 0.10 % | 8.68 | 23 | 0.56 % | 77.82 | 0 | 0 % | 0 | 5 | 0.15 % | 32.07 |
| hej | he | j | 25 | 0.22 % | 24.15 | 14 | 0.70 % | 60.73 | 7 | 0.17 % | 23.69 | 3 | 0.14 % | 8.49 | 1 | 0.03 % | 6.41 |
| čao | ča | o | 24 | 0.21 % | 23.19 | 11 | 0.55 % | 47.72 | 3 | 0.07 % | 10.15 | 6 | 0.28 % | 16.99 | 4 | 0.12 % | 25.66 |
| opa | op | a | 23 | 0.20 % | 22.22 | 14 | 0.70 % | 60.73 | 6 | 0.15 % | 20.30 | 1 | 0.05 % | 2.83 | 2 | 0.06 % | 12.83 |
| uh | uh | | 23 | 0.20 % | 22.22 | 7 | 0.35 % | 30.37 | 10 | 0.24 % | 33.84 | 2 | 0.09 % | 5.66 | 4 | 0.12 % | 25.66 |
| rečmo | re | čmo | 22 | 0.19 % | 21.25 | 2 | 0.10 % | 8.68 | 5 | 0.12 % | 16.92 | 14 | 0.64 % | 39.64 | 1 | 0.03 % | 6.41 |
| ojej | oj | ej | 21 | 0.18 % | 20.29 | 3 | 0.15 % | 13.01 | 12 | 0.29 % | 40.60 | 2 | 0.09 % | 5.66 | 4 | 0.12 % | 25.66 |
| dijo | di | jo | 20 | 0.17 % | 19.32 | 8 | 0.40 % | 34.70 | 4 | 0.10 % | 13.53 | 4 | 0.18 % | 11.33 | 4 | 0.12 % | 25.66 |

2.2.323 List of initial character-level 3-grams from interjection lower-case word forms in the GOS 1.0 corpus with text-type distribution

File at CLARIN.SI: GOS1.0-word_parts-interjections-lowercase_forms-initial-3grams-taxonomy-entire.tsv

| Lower-case word form | Initial part of the word | Rest of the word | Total absolute frequency of lower-case word form | Percentage of all found lower-case word forms | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N] |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| mhm | mhm | | 4,475 | 45.42 % | 4,323.22 | 431 | 26.33 % | 1,869.66 | 1,261 | 41.17 % | 4,266.74 | 975 | 47.33 % | 2,760.87 | 1,808 | 58.47 % | 11,597.70 |
| aha | aha | | 2,059 | 20.90 % | 1,989.17 | 376 | 22.97 % | 1,631.07 | 626 | 20.44 % | 2,118.14 | 217 | 10.53 % | 614.47 | 840 | 27.17 % | 5,388.31 |
| recimo | rec | imo | 827 | 8.39 % | 798.95 | 180 | 11.00 % | 780.83 | 119 | 3.88 % | 402.65 | 357 | 17.33 % | 1,010.90 | 171 | 5.53 % | 1,096.91 |
| aja | aja | | 758 | 7.69 % | 732.29 | 66 | 4.03 % | 286.31 | 508 | 16.59 % | 1,718.88 | 84 | 4.08 % | 237.86 | 100 | 3.23 % | 641.47 |
| joj | joj | | 364 | 3.69 % | 351.65 | 86 | 5.25 % | 373.06 | 211 | 6.89 % | 713.94 | 32 | 1.55 % | 90.61 | 35 | 1.13 % | 224.51 |
| prosim | pro | sim | 290 | 2.94 % | 280.16 | 42 | 2.57 % | 182.19 | 21 | 0.69 % | 71.06 | 187 | 9.08 % | 529.52 | 40 | 1.29 % | 256.59 |
| čav | čav | | 150 | 1.52 % | 144.91 | 102 | 6.23 % | 442.47 | 30 | 0.98 % | 101.51 | 0 | 0 % | 0 | 18 | 0.58 % | 115.46 |
| bravo | bra | vo | 118 | 1.20 % | 114 | 88 | 5.38 % | 381.74 | 5 | 0.16 % | 16.92 | 22 | 1.07 % | 62.30 | 3 | 0.10 % | 19.24 |
| prosm | pro | sm | 104 | 1.06 % | 100.47 | 18 | 1.10 % | 78.08 | 8 | 0.26 % | 27.07 | 65 | 3.15 % | 184.06 | 13 | 0.42 % | 83.39 |
| zdravo | zdr | avo | 69 | 0.70 % | 66.66 | 35 | 2.14 % | 151.83 | 12 | 0.39 % | 40.60 | 13 | 0.63 % | 36.81 | 9 | 0.29 % | 57.73 |
| fak | fak | | 53 | 0.54 % | 51.20 | 0 | 0 % | 0 | 53 | 1.73 % | 179.33 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| adijo | adi | jo | 51 | 0.52 % | 49.27 | 29 | 1.77 % | 125.80 | 8 | 0.26 % | 27.07 | 9 | 0.44 % | 25.48 | 5 | 0.16 % | 32.07 |
| halo | hal | o | 42 | 0.43 % | 40.58 | 15 | 0.92 % | 65.07 | 20 | 0.65 % | 67.67 | 5 | 0.24 % | 14.16 | 2 | 0.07 % | 12.83 |
| hop | hop | | 35 | 0.35 % | 33.81 | 29 | 1.77 % | 125.80 | 4 | 0.13 % | 13.53 | 1 | 0.05 % | 2.83 | 1 | 0.03 % | 6.41 |
| bla | bla | | 31 | 0.32 % | 29.95 | 2 | 0.12 % | 8.68 | 20 | 0.65 % | 67.67 | 5 | 0.24 % | 14.16 | 4 | 0.13 % | 25.66 |
| hej | hej | | 25 | 0.25 % | 24.15 | 14 | 0.85 % | 60.73 | 7 | 0.23 % | 23.69 | 3 | 0.15 % | 8.49 | 1 | 0.03 % | 6.41 |
| čao | čao | | 24 | 0.24 % | 23.19 | 11 | 0.67 % | 47.72 | 3 | 0.10 % | 10.15 | 6 | 0.29 % | 16.99 | 4 | 0.13 % | 25.66 |
| opa | opa | | 23 | 0.23 % | 22.22 | 14 | 0.85 % | 60.73 | 6 | 0.20 % | 20.30 | 1 | 0.05 % | 2.83 | 2 | 0.07 % | 12.83 |
| rečmo | reč | mo | 22 | 0.22 % | 21.25 | 2 | 0.12 % | 8.68 | 5 | 0.16 % | 16.92 | 14 | 0.68 % | 39.64 | 1 | 0.03 % | 6.41 |
| ojej | oje | j | 21 | 0.21 % | 20.29 | 3 | 0.18 % | 13.01 | 12 | 0.39 % | 40.60 | 2 | 0.10 % | 5.66 | 4 | 0.13 % | 25.66 |
| dijo | dij | o | 20 | 0.20 % | 19.32 | 8 | 0.49 % | 34.70 | 4 | 0.13 % | 13.53 | 4 | 0.19 % | 11.33 | 4 | 0.13 % | 25.66 |
| šit | šit | | 19 | 0.19 % | 18.36 | 2 | 0.12 % | 8.68 | 15 | 0.49 % | 50.75 | 0 | 0 % | 0 | 2 | 0.07 % | 12.83 |
| alo | alo | | 17 | 0.17 % | 16.42 | 7 | 0.43 % | 30.37 | 7 | 0.23 % | 23.69 | 3 | 0.15 % | 8.49 | 0 | 0 % | 0 |
| nasvidenje | nas | videnje | 17 | 0.17 % | 16.42 | 11 | 0.67 % | 47.72 | 0 | 0 % | 0 | 6 | 0.29 % | 16.99 | 0 | 0 % | 0 |
| živijo | živ | ijo | 14 | 0.14 % | 13.53 | 0 | 0 % | 0 | 5 | 0.16 % | 16.92 | 9 | 0.44 % | 25.48 | 0 | 0 % | 0 |
| ups | ups | | 13 | 0.13 % | 12.56 | 3 | 0.18 % | 13.01 | 5 | 0.16 % | 16.92 | 3 | 0.15 % | 8.49 | 2 | 0.07 % | 12.83 |
| ojoj | ojo | j | 11 | 0.11 % | 10.63 | 3 | 0.18 % | 13.01 | 3 | 0.10 % | 10.15 | 2 | 0.10 % | 5.66 | 3 | 0.10 % | 19.24 |
| prosem | pro | sem | 9 | 0.09 % | 8.69 | 0 | 0 % | 0 | 4 | 0.13 % | 13.53 | 3 | 0.15 % | 8.49 | 2 | 0.07 % | 12.83 |
| recmo | rec | mo | 9 | 0.09 % | 8.69 | 2 | 0.12 % | 8.68 | 5 | 0.16 % | 16.92 | 2 | 0.10 % | 5.66 | 0 | 0 % | 0 |
| ijoj | ijo | j | 8 | 0.08 % | 7.73 | 4 | 0.24 % | 17.35 | 4 | 0.13 % | 13.53 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| pardon | par | don | 8 | 0.08 % | 7.73 | 3 | 0.18 % | 13.01 | 2 | 0.07 % | 6.77 | 3 | 0.15 % | 8.49 | 0 | 0 % | 0 |
| hopla | hop | la | 7 | 0.07 % | 6.76 | 2 | 0.12 % | 8.68 | 0 | 0 % | 0 | 0 | 0 % | 0 | 5 | 0.16 % | 32.07 |

2.2.324 List of initial character-level 4-grams from interjection lower-case word forms in the GOS 1.0 corpus with text-type distribution

File at CLARIN.SI: GOS1.0-word_parts-interjections-lowercase_forms-initial-4grams-taxonomy-entire.tsv

| Lower-case word form | Initial part of the word | Rest of the word | Total absolute frequency of lower-case word form | Percentage of all found lower-case word forms | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N] |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| recimo | reci | mo | 827 | 47.72 % | 798.95 | 180 | 38.14 % | 780.83 | 119 | 43.59 % | 402.65 | 357 | 49.52 % | 1,010.90 | 171 | 64.05 % | 1,096.91 |
| prosim | pros | im | 290 | 16.73 % | 280.16 | 42 | 8.90 % | 182.19 | 21 | 7.69 % | 71.06 | 187 | 25.94 % | 529.52 | 40 | 14.98 % | 256.59 |
| bravo | brav | o | 118 | 6.81 % | 114 | 88 | 18.64 % | 381.74 | 5 | 1.83 % | 16.92 | 22 | 3.05 % | 62.30 | 3 | 1.12 % | 19.24 |
| prosm | pros | m | 104 | 6.00 % | 100.47 | 18 | 3.81 % | 78.08 | 8 | 2.93 % | 27.07 | 65 | 9.02 % | 184.06 | 13 | 4.87 % | 83.39 |
| zdravo | zdra | vo | 69 | 3.98 % | 66.66 | 35 | 7.42 % | 151.83 | 12 | 4.40 % | 40.60 | 13 | 1.80 % | 36.81 | 9 | 3.37 % | 57.73 |
| adijo | adij | o | 51 | 2.94 % | 49.27 | 29 | 6.14 % | 125.80 | 8 | 2.93 % | 27.07 | 9 | 1.25 % | 25.48 | 5 | 1.87 % | 32.07 |
| halo | halo | | 42 | 2.42 % | 40.58 | 15 | 3.18 % | 65.07 | 20 | 7.33 % | 67.67 | 5 | 0.69 % | 14.16 | 2 | 0.75 % | 12.83 |
| rečmo | rečm | o | 22 | 1.27 % | 21.25 | 2 | 0.42 % | 8.68 | 5 | 1.83 % | 16.92 | 14 | 1.94 % | 39.64 | 1 | 0.38 % | 6.41 |
| ojej | ojej | | 21 | 1.21 % | 20.29 | 3 | 0.64 % | 13.01 | 12 | 4.40 % | 40.60 | 2 | 0.28 % | 5.66 | 4 | 1.50 % | 25.66 |
| dijo | dijo | | 20 | 1.15 % | 19.32 | 8 | 1.70 % | 34.70 | 4 | 1.47 % | 13.53 | 4 | 0.56 % | 11.33 | 4 | 1.50 % | 25.66 |
| nasvidenje | nasv | idenje | 17 | 0.98 % | 16.42 | 11 | 2.33 % | 47.72 | 0 | 0 % | 0 | 6 | 0.83 % | 16.99 | 0 | 0 % | 0 |
| živijo | živi | jo | 14 | 0.81 % | 13.53 | 0 | 0 % | 0 | 5 | 1.83 % | 16.92 | 9 | 1.25 % | 25.48 | 0 | 0 % | 0 |
| ojoj | ojoj | | 11 | 0.64 % | 10.63 | 3 | 0.64 % | 13.01 | 3 | 1.10 % | 10.15 | 2 | 0.28 % | 5.66 | 3 | 1.12 % | 19.24 |
| prosem | pros | em | 9 | 0.52 % | 8.69 | 0 | 0 % | 0 | 4 | 1.47 % | 13.53 | 3 | 0.42 % | 8.49 | 2 | 0.75 % | 12.83 |
| recmo | recm | o | 9 | 0.52 % | 8.69 | 2 | 0.42 % | 8.68 | 5 | 1.83 % | 16.92 | 2 | 0.28 % | 5.66 | 0 | 0 % | 0 |
| ijoj | ijoj | | 8 | 0.46 % | 7.73 | 4 | 0.85 % | 17.35 | 4 | 1.47 % | 13.53 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| pardon | pard | on | 8 | 0.46 % | 7.73 | 3 | 0.64 % | 13.01 | 2 | 0.73 % | 6.77 | 3 | 0.42 % | 8.49 | 0 | 0 % | 0 |
| hopla | hopl | a | 7 | 0.40 % | 6.76 | 2 | 0.42 % | 8.68 | 0 | 0 % | 0 | 0 | 0 % | 0 | 5 | 1.87 % | 32.07 |
| jebemti | jebe | mti | 6 | 0.35 % | 5.80 | 3 | 0.64 % | 13.01 | 3 | 1.10 % | 10.15 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| jebenti | jebe | nti | 6 | 0.35 % | 5.80 | 2 | 0.42 % | 8.68 | 4 | 1.47 % | 13.53 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| cimo | cimo | | 4 | 0.23 % | 3.86 | 1 | 0.21 % | 4.34 | 0 | 0 % | 0 | 3 | 0.42 % | 8.49 | 0 | 0 % | 0 |
| jebemtiš | jebe | mtiš | 4 | 0.23 % | 3.86 | 0 | 0 % | 0 | 4 | 1.47 % | 13.53 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| pruesm | prue | sm | 4 | 0.23 % | 3.86 | 0 | 0 % | 0 | 1 | 0.37 % | 3.38 | 3 | 0.42 % | 8.49 | 0 | 0 % | 0 |
| ajej | ajej | | 3 | 0.17 % | 2.90 | 0 | 0 % | 0 | 1 | 0.37 % | 3.38 | 2 | 0.28 % | 5.66 | 0 | 0 % | 0 |
| bila | bila | | 3 | 0.17 % | 2.90 | 0 | 0 % | 0 | 3 | 1.10 % | 10.15 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| hjoj | hjoj | | 3 | 0.17 % | 2.90 | 2 | 0.42 % | 8.68 | 1 | 0.37 % | 3.38 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| hojla | hojl | a | 3 | 0.17 % | 2.90 | 0 | 0 % | 0 | 0 | 0 % | 0 | 3 | 0.42 % | 8.49 | 0 | 0 % | 0 |
| jojojoj | jojo | joj | 3 | 0.17 % | 2.90 | 0 | 0 % | 0 | 3 | 1.10 % | 10.15 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| juhuhu | juhu | hu | 3 | 0.17 % | 2.90 | 2 | 0.42 % | 8.68 | 0 | 0 % | 0 | 1 | 0.14 % | 2.83 | 0 | 0 % | 0 |
| aleluja | alel | uja | 2 | 0.12 % | 1.93 | 2 | 0.42 % | 8.68 | 0 | 0 % | 0 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| juhej | juhe | j | 2 | 0.12 % | 1.93 | 2 | 0.42 % | 8.68 | 0 | 0 % | 0 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| naha | naha | | 2 | 0.12 % | 1.93 | 1 | 0.21 % | 4.34 | 1 | 0.37 % | 3.38 | 0 | 0 % | 0 | 0 | 0 % | 0 |

2.2.325 List of initial character-level 5-grams from interjection lower-case word forms in the GOS 1.0 corpus with text-type distribution

File at CLARIN.SI: GOS1.0-word_parts-interjections-lowercase_forms-initial-5grams-taxonomy-entire.tsv

Lower-case word form Initial part of the word Rest of the word Total absolute frequency of lower-case word form Percentage of all found lower-case word forms Total relative frequency (per million occurrences) Absolute frequency [gos.T.J.R] Percentage
[gos.T.J.R] Relative frequency [gos.T.J.R] Absolute frequency [gos.T.N.Z] Percentage [gos.T.N.Z] Relative frequency [gos.T.N.Z] Absolute frequency [gos.T.J.I] Percentage [gos.T.J.I] Relative frequency [gos.T.J.I] Absolute frequency [gos.T.N.N] Percentage [gos.T.N.N] Relative frequency [gos.T.N.N] recimo recim o 827 51.46 % 798.95 180 41.48 % 780.83 119 54.34 % 402.65 357 50.93 % 1,010.90 171 67.59 % 1,096.91 prosim prosi m 290 18.05 % 280.16 42 9.68 % 182.19 21 9.59 % 71.06 187 26.68 % 529.52 40 15.81 % 256.59 bravo bravo 118 7.34 % 114 88 20.28 % 381.74 5 2.28 % 16.92 22 3.14 % 62.30 3 1.19 % 19.24 prosm prosm 104 6.47 % 100.47 18 4.15 % 78.08 8 3.65 % 27.07 65 9.27 % 184.06 13 5.14 % 83.39 zdravo zdrav o 69 4.29 % 66.66 35 8.06 % 151.83 12 5.48 % 40.60 13 1.85 % 36.81 9 3.56 % 57.73 adijo adijo 51 3.17 % 49.27 29 6.68 % 125.80 8 3.65 % 27.07 9 1.28 % 25.48 5 1.98 % 32.07 rečmo rečmo 22 1.37 % 21.25 2 0.46 % 8.68 5 2.28 % 16.92 14 2.00 % 39.64 1 0.40 % 6.41 nasvidenje nasvi denje 17 1.06 % 16.42 11 2.54 % 47.72 0 0 % 0 6 0.86 % 16.99 0 0 % 0 živijo živij o 14 0.87 % 13.53 0 0 % 0 5 2.28 % 16.92 9 1.28 % 25.48 0 0 % 0 prosem prose m 9 0.56 % 8.69 0 0 % 0 4 1.83 % 13.53 3 0.43 % 8.49 2 0.79 % 12.83 recmo recmo 9 0.56 % 8.69 2 0.46 % 8.68 5 2.28 % 16.92 2 0.28 % 5.66 0 0 % 0 pardon pardo n 8 0.50 % 7.73 3 0.69 % 13.01 2 0.91 % 6.77 3 0.43 % 8.49 0 0 % 0 hopla hopla 7 0.44 % 6.76 2 0.46 % 8.68 0 0 % 0 0 0 % 0 5 1.98 % 32.07 jebemti jebem ti 6 0.37 % 5.80 3 0.69 % 13.01 3 1.37 % 10.15 0 0 % 0 0 0 % 0 jebenti jeben ti 6 0.37 % 5.80 2 0.46 % 8.68 4 1.83 % 13.53 0 0 % 0 0 0 % 0 jebemtiš jebem tiš 4 0.25 % 3.86 0 0 % 0 4 1.83 % 13.53 0 0 % 0 0 0 % 0 pruesm prues m 4 0.25 % 3.86 0 0 % 0 1 0.46 % 3.38 3 0.43 % 8.49 0 0 % 0 hojla hojla 3 0.19 % 2.90 0 0 % 0 0 0 % 0 3 0.43 % 8.49 0 0 % 0 jojojoj jojoj oj 3 0.19 % 2.90 0 0 % 0 3 1.37 % 10.15 0 0 % 0 0 0 % 0 juhuhu juhuh u 3 0.19 % 2.90 2 0.46 % 8.68 0 0 % 0 1 0.14 % 2.83 0 0 % 0 aleluja alelu ja 2 0.12 % 1.93 2 0.46 % 8.68 0 0 
% 0 0 0 % 0 0 0 % 0 juhej juhej 2 0.12 % 1.93 2 0.46 % 8.68 0 0 % 0 0 0 % 0 0 0 % 0 pozor pozor 2 0.12 % 1.93 1 0.23 % 4.34 0 0 % 0 1 0.14 % 2.83 0 0 % 0 prosin prosi n 2 0.12 % 1.93 2 0.46 % 8.68 0 0 % 0 0 0 % 0 0 0 % 0 pruosm pruos m 2 0.12 % 1.93 0 0 % 0 1 0.46 % 3.38 1 0.14 % 2.83 0 0 % 0 tralala trala la 2 0.12 % 1.93 2 0.46 % 8.68 0 0 % 0 0 0 % 0 0 0 % 0 adijos adijo s 1 0.06 % 0.97 0 0 % 0 1 0.46 % 3.38 0 0 % 0 0 0 % 0 bemti bemti 1 0.06 % 0.97 1 0.23 % 4.34 0 0 % 0 0 0 % 0 0 0 % 0 hopsasa hopsa sa 1 0.06 % 0.97 1 0.23 % 4.34 0 0 % 0 0 0 % 0 0 0 % 0 hujoj hujoj 1 0.06 % 0.97 0 0 % 0 0 0 % 0 0 0 % 0 1 0.40 % 6.41 iouhuhu iouhu hu 1 0.06 % 0.97 1 0.23 % 4.34 0 0 % 0 0 0 % 0 0 0 % 0 jebga jebga 1 0.06 % 0.97 0 0 % 0 1 0.46 % 3.38 0 0 % 0 0 0 % 0 CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 669 File at CLARIN.SI2.2.326 List of final character-level 1-grams from interjection lower-case word forms in the GOS 1.0 corpus with text-type distributionGOS1.0-word_parts-interjections-lowercase_forms- final-1grams-taxonomy-entire.tsvLower-case word form Rest of the word Final part of the word Total absolute frequency of lower-case word form Percentage of all found lower-case word forms Total relative frequency (per million occurrences) Absolute frequency [gos.T.J.R] Percentage [gos.T.J.R] Relative frequency [gos.T.J.R] Absolute frequency [gos.T.N.Z] Percentage [gos.T.N.Z] Relative frequency [gos.T.N.Z] Absolute frequency [gos.T.J.I] Percentage [gos.T.J.I] Relative frequency [gos.T.J.I] Absolute frequency [gos.T.N.N] Percentage [gos.T.N.N] Relative frequency [gos.T.N.N] mhm mh m 4,475 38.79 % 4,323.22 431 21.69 % 1,869.66 1,261 30.90 % 4,266.74 975 44.68 % 2,760.87 1,808 55.01 % 11,597.70 aha ah a 2,059 17.85 % 1,989.17 376 18.92 % 1,631.07 626 15.34 % 2,118.14 217 9.95 % 614.47 840 25.55 % 5,388.31 recimo recim o 827 7.17 % 798.95 180 9.06 % 780.83 119 2.92 % 402.65 357 16.36 % 1,010.90 171 5.20 % 1,096.91 aja aj a 758 6.57 % 732.29 66 
3.32 % 286.31 508 12.45 % 1,718.88 84 3.85 % 237.86 100 3.04 % 641.47 ej e j 615 5.33 % 594.14 91 4.58 % 394.75 426 10.44 % 1,441.42 21 0.96 % 59.46 77 2.34 % 493.93 joj jo j 364 3.15 % 351.65 86 4.33 % 373.06 211 5.17 % 713.94 32 1.47 % 90.61 35 1.06 % 224.51 prosim prosi m 290 2.51 % 280.16 42 2.11 % 182.19 21 0.52 % 71.06 187 8.57 % 529.52 40 1.22 % 256.59 eh e h 215 1.86 % 207.71 26 1.31 % 112.79 168 4.12 % 568.45 9 0.41 % 25.48 12 0.36 % 76.98 hm h m 208 1.80 % 200.95 34 1.71 % 147.49 88 2.16 % 297.76 50 2.29 % 141.58 36 1.09 % 230.93 ha h a 205 1.78 % 198.05 77 3.88 % 334.02 102 2.50 % 345.13 10 0.46 % 28.32 16 0.49 % 102.63 čav ča v 150 1.30 % 144.91 102 5.13 % 442.47 30 0.73 % 101.51 0 0 % 0 18 0.55 % 115.46 ah a h 149 1.29 % 143.95 28 1.41 % 121.46 90 2.21 % 304.53 16 0.73 % 45.31 15 0.46 % 96.22 bravo brav o 118 1.02 % 114 88 4.43 % 381.74 5 0.12 % 16.92 22 1.01 % 62.30 3 0.09 % 19.24 prosm pros m 104 0.90 % 100.47 18 0.91 % 78.08 8 0.20 % 27.07 65 2.98 % 184.06 13 0.40 % 83.39 zdravo zdrav o 69 0.60 % 66.66 35 1.76 % 151.83 12 0.29 % 40.60 13 0.60 % 36.81 9 0.27 % 57.73 he h e 61 0.53 % 58.93 4 0.20 % 17.35 50 1.23 % 169.18 3 0.14 % 8.49 4 0.12 % 25.66 la l a 55 0.48 % 53.13 32 1.61 % 138.81 11 0.27 % 37.22 0 0 % 0 12 0.36 % 76.98 oh o h 54 0.47 % 52.17 10 0.50 % 43.38 24 0.59 % 81.21 10 0.46 % 28.32 10 0.30 % 64.15 fak fa k 53 0.46 % 51.20 0 0 % 0 53 1.30 % 179.33 0 0 % 0 0 0 % 0 adijo adij o 51 0.44 % 49.27 29 1.46 % 125.80 8 0.20 % 27.07 9 0.41 % 25.48 5 0.15 % 32.07 halo hal o 42 0.36 % 40.58 15 0.76 % 65.07 20 0.49 % 67.67 5 0.23 % 14.16 2 0.06 % 12.83 ho h o 38 0.33 % 36.71 32 1.61 % 138.81 5 0.12 % 16.92 1 0.05 % 2.83 0 0 % 0 hop ho p 35 0.30 % 33.81 29 1.46 % 125.80 4 0.10 % 13.53 1 0.05 % 2.83 1 0.03 % 6.41 bla bl a 31 0.27 % 29.95 2 0.10 % 8.68 20 0.49 % 67.67 5 0.23 % 14.16 4 0.12 % 25.66 jo j o 30 0.26 % 28.98 2 0.10 % 8.68 23 0.56 % 77.82 0 0 % 0 5 0.15 % 32.07 hej he j 25 0.22 % 24.15 14 0.70 % 60.73 7 0.17 % 23.69 3 0.14 % 8.49 1 0.03 % 
6.41 čao ča o 24 0.21 % 23.19 11 0.55 % 47.72 3 0.07 % 10.15 6 0.28 % 16.99 4 0.12 % 25.66 opa op a 23 0.20 % 22.22 14 0.70 % 60.73 6 0.15 % 20.30 1 0.05 % 2.83 2 0.06 % 12.83 uh u h 23 0.20 % 22.22 7 0.35 % 30.37 10 0.24 % 33.84 2 0.09 % 5.66 4 0.12 % 25.66 rečmo rečm o 22 0.19 % 21.25 2 0.10 % 8.68 5 0.12 % 16.92 14 0.64 % 39.64 1 0.03 % 6.41 ojej oje j 21 0.18 % 20.29 3 0.15 % 13.01 12 0.29 % 40.60 2 0.09 % 5.66 4 0.12 % 25.66 dijo dij o 20 0.17 % 19.32 8 0.40 % 34.70 4 0.10 % 13.53 4 0.18 % 11.33 4 0.12 % 25.66 CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 670 File at CLARIN.SI2.2.327 List of final character-level 2-grams from interjection lower-case word forms in the GOS 1.0 corpus with text-type distributionGOS1.0-word_parts-interjections-lowercase_forms- final-2grams-taxonomy-entire.tsvLower-case word form Rest of the word Final part of the word Total absolute frequency of lower-case word form Percentage of all found lower-case word forms Total relative frequency (per million occurrences) Absolute frequency [gos.T.J.R] Percentage [gos.T.J.R] Relative frequency [gos.T.J.R] Absolute frequency [gos.T.N.Z] Percentage [gos.T.N.Z] Relative frequency [gos.T.N.Z] Absolute frequency [gos.T.J.I] Percentage [gos.T.J.I] Relative frequency [gos.T.J.I] Absolute frequency [gos.T.N.N] Percentage [gos.T.N.N] Relative frequency [gos.T.N.N] mhm m hm 4,475 38.79 % 4,323.22 431 21.69 % 1,869.66 1,261 30.91 % 4,266.74 975 44.68 % 2,760.87 1,808 55.01 % 11,597.70 aha a ha 2,059 17.85 % 1,989.17 376 18.92 % 1,631.07 626 15.34 % 2,118.14 217 9.95 % 614.47 840 25.55 % 5,388.31 recimo reci mo 827 7.17 % 798.95 180 9.06 % 780.83 119 2.92 % 402.65 357 16.36 % 1,010.90 171 5.20 % 1,096.91 aja a ja 758 6.57 % 732.29 66 3.32 % 286.31 508 12.45 % 1,718.88 84 3.85 % 237.86 100 3.04 % 641.47 ej ej 615 5.33 % 594.14 91 4.58 % 394.75 426 10.44 % 1,441.42 21 0.96 % 59.46 77 2.34 % 493.93 joj j oj 364 3.15 % 351.65 86 4.33 % 373.06 211 5.17 % 713.94 32 1.47 % 
90.61 35 1.06 % 224.51 prosim pros im 290 2.51 % 280.16 42 2.11 % 182.19 21 0.52 % 71.06 187 8.57 % 529.52 40 1.22 % 256.59 eh eh 215 1.86 % 207.71 26 1.31 % 112.79 168 4.12 % 568.45 9 0.41 % 25.48 12 0.36 % 76.98 hm hm 208 1.80 % 200.95 34 1.71 % 147.49 88 2.16 % 297.76 50 2.29 % 141.58 36 1.09 % 230.93 ha ha 205 1.78 % 198.05 77 3.88 % 334.02 102 2.50 % 345.13 10 0.46 % 28.32 16 0.49 % 102.63 čav č av 150 1.30 % 144.91 102 5.13 % 442.47 30 0.73 % 101.51 0 0 % 0 18 0.55 % 115.46 ah ah 149 1.29 % 143.95 28 1.41 % 121.46 90 2.21 % 304.53 16 0.73 % 45.31 15 0.46 % 96.22 bravo bra vo 118 1.02 % 114 88 4.43 % 381.74 5 0.12 % 16.92 22 1.01 % 62.30 3 0.09 % 19.24 prosm pro sm 104 0.90 % 100.47 18 0.91 % 78.08 8 0.20 % 27.07 65 2.98 % 184.06 13 0.40 % 83.39 zdravo zdra vo 69 0.60 % 66.66 35 1.76 % 151.83 12 0.29 % 40.60 13 0.60 % 36.81 9 0.27 % 57.73 he he 61 0.53 % 58.93 4 0.20 % 17.35 50 1.23 % 169.18 3 0.14 % 8.49 4 0.12 % 25.66 la la 55 0.48 % 53.13 32 1.61 % 138.81 11 0.27 % 37.22 0 0 % 0 12 0.36 % 76.98 oh oh 54 0.47 % 52.17 10 0.50 % 43.38 24 0.59 % 81.21 10 0.46 % 28.32 10 0.30 % 64.15 fak f ak 53 0.46 % 51.20 0 0 % 0 53 1.30 % 179.33 0 0 % 0 0 0 % 0 adijo adi jo 51 0.44 % 49.27 29 1.46 % 125.80 8 0.20 % 27.07 9 0.41 % 25.48 5 0.15 % 32.07 halo ha lo 42 0.36 % 40.58 15 0.76 % 65.07 20 0.49 % 67.67 5 0.23 % 14.16 2 0.06 % 12.83 ho ho 38 0.33 % 36.71 32 1.61 % 138.81 5 0.12 % 16.92 1 0.05 % 2.83 0 0 % 0 hop h op 35 0.30 % 33.81 29 1.46 % 125.80 4 0.10 % 13.53 1 0.05 % 2.83 1 0.03 % 6.41 bla b la 31 0.27 % 29.95 2 0.10 % 8.68 20 0.49 % 67.67 5 0.23 % 14.16 4 0.12 % 25.66 jo jo 30 0.26 % 28.98 2 0.10 % 8.68 23 0.56 % 77.82 0 0 % 0 5 0.15 % 32.07 hej h ej 25 0.22 % 24.15 14 0.70 % 60.73 7 0.17 % 23.69 3 0.14 % 8.49 1 0.03 % 6.41 čao č ao 24 0.21 % 23.19 11 0.55 % 47.72 3 0.07 % 10.15 6 0.28 % 16.99 4 0.12 % 25.66 opa o pa 23 0.20 % 22.22 14 0.70 % 60.73 6 0.15 % 20.30 1 0.05 % 2.83 2 0.06 % 12.83 uh uh 23 0.20 % 22.22 7 0.35 % 30.37 10 0.24 % 33.84 2 0.09 % 5.66 4 0.12 
% 25.66 rečmo reč mo 22 0.19 % 21.25 2 0.10 % 8.68 5 0.12 % 16.92 14 0.64 % 39.64 1 0.03 % 6.41 ojej oj ej 21 0.18 % 20.29 3 0.15 % 13.01 12 0.29 % 40.60 2 0.09 % 5.66 4 0.12 % 25.66 dijo di jo 20 0.17 % 19.32 8 0.40 % 34.70 4 0.10 % 13.53 4 0.18 % 11.33 4 0.12 % 25.66 CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 671 File at CLARIN.SI2.2.328 List of final character-level 3-grams from interjection lower-case word forms in the GOS 1.0 corpus with text-type distributionGOS1.0-word_parts-interjections-lowercase_forms- final-3grams-taxonomy-entire.tsvLower-case word form Rest of the word Final part of the word Total absolute frequency of lower-case word form Percentage of all found lower-case word forms Total relative frequency (per million occurrences) Absolute frequency [gos.T.J.R] Percentage [gos.T.J.R] Relative frequency [gos.T.J.R] Absolute frequency [gos.T.N.Z] Percentage [gos.T.N.Z] Relative frequency [gos.T.N.Z] Absolute frequency [gos.T.J.I] Percentage [gos.T.J.I] Relative frequency [gos.T.J.I] Absolute frequency [gos.T.N.N] Percentage [gos.T.N.N] Relative frequency [gos.T.N.N] mhm mhm 4,475 45.42 % 4,323.22 431 26.33 % 1,869.66 1,261 41.17 % 4,266.74 975 47.33 % 2,760.87 1,808 58.47 % 11,597.70 aha aha 2,059 20.90 % 1,989.17 376 22.97 % 1,631.07 626 20.44 % 2,118.14 217 10.53 % 614.47 840 27.17 % 5,388.31 recimo rec imo 827 8.39 % 798.95 180 11.00 % 780.83 119 3.88 % 402.65 357 17.33 % 1,010.90 171 5.53 % 1,096.91 aja aja 758 7.69 % 732.29 66 4.03 % 286.31 508 16.59 % 1,718.88 84 4.08 % 237.86 100 3.23 % 641.47 joj joj 364 3.69 % 351.65 86 5.25 % 373.06 211 6.89 % 713.94 32 1.55 % 90.61 35 1.13 % 224.51 prosim pro sim 290 2.94 % 280.16 42 2.57 % 182.19 21 0.69 % 71.06 187 9.08 % 529.52 40 1.29 % 256.59 čav čav 150 1.52 % 144.91 102 6.23 % 442.47 30 0.98 % 101.51 0 0 % 0 18 0.58 % 115.46 bravo br avo 118 1.20 % 114 88 5.38 % 381.74 5 0.16 % 16.92 22 1.07 % 62.30 3 0.10 % 19.24 prosm pr osm 104 1.06 % 100.47 18 1.10 % 78.08 8 
0.26 % 27.07 65 3.15 % 184.06 13 0.42 % 83.39 zdravo zdr avo 69 0.70 % 66.66 35 2.14 % 151.83 12 0.39 % 40.60 13 0.63 % 36.81 9 0.29 % 57.73 fak fak 53 0.54 % 51.20 0 0 % 0 53 1.73 % 179.33 0 0 % 0 0 0 % 0 adijo ad ijo 51 0.52 % 49.27 29 1.77 % 125.80 8 0.26 % 27.07 9 0.44 % 25.48 5 0.16 % 32.07 halo h alo 42 0.43 % 40.58 15 0.92 % 65.07 20 0.65 % 67.67 5 0.24 % 14.16 2 0.07 % 12.83 hop hop 35 0.35 % 33.81 29 1.77 % 125.80 4 0.13 % 13.53 1 0.05 % 2.83 1 0.03 % 6.41 bla bla 31 0.32 % 29.95 2 0.12 % 8.68 20 0.65 % 67.67 5 0.24 % 14.16 4 0.13 % 25.66 hej hej 25 0.25 % 24.15 14 0.85 % 60.73 7 0.23 % 23.69 3 0.15 % 8.49 1 0.03 % 6.41 čao čao 24 0.24 % 23.19 11 0.67 % 47.72 3 0.10 % 10.15 6 0.29 % 16.99 4 0.13 % 25.66 opa opa 23 0.23 % 22.22 14 0.85 % 60.73 6 0.20 % 20.30 1 0.05 % 2.83 2 0.07 % 12.83 rečmo re čmo 22 0.22 % 21.25 2 0.12 % 8.68 5 0.16 % 16.92 14 0.68 % 39.64 1 0.03 % 6.41 ojej o jej 21 0.21 % 20.29 3 0.18 % 13.01 12 0.39 % 40.60 2 0.10 % 5.66 4 0.13 % 25.66 dijo d ijo 20 0.20 % 19.32 8 0.49 % 34.70 4 0.13 % 13.53 4 0.19 % 11.33 4 0.13 % 25.66 šit šit 19 0.19 % 18.36 2 0.12 % 8.68 15 0.49 % 50.75 0 0 % 0 2 0.07 % 12.83 alo alo 17 0.17 % 16.42 7 0.43 % 30.37 7 0.23 % 23.69 3 0.15 % 8.49 0 0 % 0 nasvidenje nasvide nje 17 0.17 % 16.42 11 0.67 % 47.72 0 0 % 0 6 0.29 % 16.99 0 0 % 0 živijo živ ijo 14 0.14 % 13.53 0 0 % 0 5 0.16 % 16.92 9 0.44 % 25.48 0 0 % 0 ups ups 13 0.13 % 12.56 3 0.18 % 13.01 5 0.16 % 16.92 3 0.15 % 8.49 2 0.07 % 12.83 ojoj o joj 11 0.11 % 10.63 3 0.18 % 13.01 3 0.10 % 10.15 2 0.10 % 5.66 3 0.10 % 19.24 prosem pro sem 9 0.09 % 8.69 0 0 % 0 4 0.13 % 13.53 3 0.15 % 8.49 2 0.07 % 12.83 recmo re cmo 9 0.09 % 8.69 2 0.12 % 8.68 5 0.16 % 16.92 2 0.10 % 5.66 0 0 % 0 ijoj i joj 8 0.08 % 7.73 4 0.24 % 17.35 4 0.13 % 13.53 0 0 % 0 0 0 % 0 pardon par don 8 0.08 % 7.73 3 0.18 % 13.01 2 0.07 % 6.77 3 0.15 % 8.49 0 0 % 0 hopla ho pla 7 0.07 % 6.76 2 0.12 % 8.68 0 0 % 0 0 0 % 0 5 0.16 % 32.07 CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 
1.0 Corpora // 672 File at CLARIN.SI2.2.329 List of final character-level 4-grams from interjection lower-case word forms in the GOS 1.0 corpus with text-type distributionGOS1.0-word_parts-interjections-lowercase_forms- final-4grams-taxonomy-entire.tsvLower-case word form Rest of the word Final part of the word Total absolute frequency of lower-case word form Percentage of all found lower-case word forms Total relative frequency (per million occurrences) Absolute frequency [gos.T.J.R] Percentage [gos.T.J.R] Relative frequency [gos.T.J.R] Absolute frequency [gos.T.N.Z] Percentage [gos.T.N.Z] Relative frequency [gos.T.N.Z] Absolute frequency [gos.T.J.I] Percentage [gos.T.J.I] Relative frequency [gos.T.J.I] Absolute frequency [gos.T.N.N] Percentage [gos.T.N.N] Relative frequency [gos.T.N.N] recimo re cimo 827 47.72 % 798.95 180 38.14 % 780.83 119 43.59 % 402.65 357 49.52 % 1,010.90 171 64.05 % 1,096.91 prosim pr osim 290 16.73 % 280.16 42 8.90 % 182.19 21 7.69 % 71.06 187 25.94 % 529.52 40 14.98 % 256.59 bravo b ravo 118 6.81 % 114 88 18.64 % 381.74 5 1.83 % 16.92 22 3.05 % 62.30 3 1.12 % 19.24 prosm p rosm 104 6.00 % 100.47 18 3.81 % 78.08 8 2.93 % 27.07 65 9.02 % 184.06 13 4.87 % 83.39 zdravo zd ravo 69 3.98 % 66.66 35 7.42 % 151.83 12 4.40 % 40.60 13 1.80 % 36.81 9 3.37 % 57.73 adijo a dijo 51 2.94 % 49.27 29 6.14 % 125.80 8 2.93 % 27.07 9 1.25 % 25.48 5 1.87 % 32.07 halo halo 42 2.42 % 40.58 15 3.18 % 65.07 20 7.33 % 67.67 5 0.69 % 14.16 2 0.75 % 12.83 rečmo r ečmo 22 1.27 % 21.25 2 0.42 % 8.68 5 1.83 % 16.92 14 1.94 % 39.64 1 0.38 % 6.41 ojej ojej 21 1.21 % 20.29 3 0.64 % 13.01 12 4.40 % 40.60 2 0.28 % 5.66 4 1.50 % 25.66 dijo dijo 20 1.15 % 19.32 8 1.70 % 34.70 4 1.47 % 13.53 4 0.56 % 11.33 4 1.50 % 25.66 nasvidenje nasvid enje 17 0.98 % 16.42 11 2.33 % 47.72 0 0 % 0 6 0.83 % 16.99 0 0 % 0 živijo ži vijo 14 0.81 % 13.53 0 0 % 0 5 1.83 % 16.92 9 1.25 % 25.48 0 0 % 0 ojoj ojoj 11 0.64 % 10.63 3 0.64 % 13.01 3 1.10 % 10.15 2 0.28 % 5.66 3 1.12 % 19.24 prosem pr 
osem 9 0.52 % 8.69 0 0 % 0 4 1.47 % 13.53 3 0.42 % 8.49 2 0.75 % 12.83 recmo r ecmo 9 0.52 % 8.69 2 0.42 % 8.68 5 1.83 % 16.92 2 0.28 % 5.66 0 0 % 0 ijoj ijoj 8 0.46 % 7.73 4 0.85 % 17.35 4 1.47 % 13.53 0 0 % 0 0 0 % 0 pardon pa rdon 8 0.46 % 7.73 3 0.64 % 13.01 2 0.73 % 6.77 3 0.42 % 8.49 0 0 % 0 hopla h opla 7 0.40 % 6.76 2 0.42 % 8.68 0 0 % 0 0 0 % 0 5 1.87 % 32.07 jebemti jeb emti 6 0.35 % 5.80 3 0.64 % 13.01 3 1.10 % 10.15 0 0 % 0 0 0 % 0 jebenti jeb enti 6 0.35 % 5.80 2 0.42 % 8.68 4 1.47 % 13.53 0 0 % 0 0 0 % 0 cimo cimo 4 0.23 % 3.86 1 0.21 % 4.34 0 0 % 0 3 0.42 % 8.49 0 0 % 0 jebemtiš jebe mtiš 4 0.23 % 3.86 0 0 % 0 4 1.47 % 13.53 0 0 % 0 0 0 % 0 pruesm pr uesm 4 0.23 % 3.86 0 0 % 0 1 0.37 % 3.38 3 0.42 % 8.49 0 0 % 0 ajej ajej 3 0.17 % 2.90 0 0 % 0 1 0.37 % 3.38 2 0.28 % 5.66 0 0 % 0 bila bila 3 0.17 % 2.90 0 0 % 0 3 1.10 % 10.15 0 0 % 0 0 0 % 0 hjoj hjoj 3 0.17 % 2.90 2 0.42 % 8.68 1 0.37 % 3.38 0 0 % 0 0 0 % 0 hojla h ojla 3 0.17 % 2.90 0 0 % 0 0 0 % 0 3 0.42 % 8.49 0 0 % 0 jojojoj joj ojoj 3 0.17 % 2.90 0 0 % 0 3 1.10 % 10.15 0 0 % 0 0 0 % 0 juhuhu ju huhu 3 0.17 % 2.90 2 0.42 % 8.68 0 0 % 0 1 0.14 % 2.83 0 0 % 0 aleluja ale luja 2 0.12 % 1.93 2 0.42 % 8.68 0 0 % 0 0 0 % 0 0 0 % 0 juhej j uhej 2 0.12 % 1.93 2 0.42 % 8.68 0 0 % 0 0 0 % 0 0 0 % 0 naha naha 2 0.12 % 1.93 1 0.21 % 4.34 1 0.37 % 3.38 0 0 % 0 0 0 % 0 CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 673 File at CLARIN.SI2.2.330 List of final character-level 5-grams from interjection lower-case word forms in the GOS 1.0 corpus with text-type distributionGOS1.0-word_parts-interjections-lowercase_forms- final-5grams-taxonomy-entire.tsvLower-case word form Rest of the word Final part of the word Total absolute frequency of lower-case word form Percentage of all found lower-case word forms Total relative frequency (per million occurrences) Absolute frequency [gos.T.J.R] Percentage [gos.T.J.R] Relative frequency [gos.T.J.R] Absolute frequency [gos.T.N.Z] Percentage 
[gos.T.N.Z] Relative frequency [gos.T.N.Z] Absolute frequency [gos.T.J.I] Percentage [gos.T.J.I] Relative frequency [gos.T.J.I] Absolute frequency [gos.T.N.N] Percentage [gos.T.N.N] Relative frequency [gos.T.N.N] recimo r ecimo 827 51.46 % 798.95 180 41.48 % 780.83 119 54.34 % 402.65 357 50.93 % 1,010.90 171 67.59 % 1,096.91 prosim p rosim 290 18.05 % 280.16 42 9.68 % 182.19 21 9.59 % 71.06 187 26.68 % 529.52 40 15.81 % 256.59 bravo bravo 118 7.34 % 114 88 20.28 % 381.74 5 2.28 % 16.92 22 3.14 % 62.30 3 1.19 % 19.24 prosm prosm 104 6.47 % 100.47 18 4.15 % 78.08 8 3.65 % 27.07 65 9.27 % 184.06 13 5.14 % 83.39 zdravo z dravo 69 4.29 % 66.66 35 8.06 % 151.83 12 5.48 % 40.60 13 1.85 % 36.81 9 3.56 % 57.73 adijo adijo 51 3.17 % 49.27 29 6.68 % 125.80 8 3.65 % 27.07 9 1.28 % 25.48 5 1.98 % 32.07 rečmo rečmo 22 1.37 % 21.25 2 0.46 % 8.68 5 2.28 % 16.92 14 2.00 % 39.64 1 0.40 % 6.41 nasvidenje nasvi denje 17 1.06 % 16.42 11 2.54 % 47.72 0 0 % 0 6 0.86 % 16.99 0 0 % 0 živijo ž ivijo 14 0.87 % 13.53 0 0 % 0 5 2.28 % 16.92 9 1.28 % 25.48 0 0 % 0 prosem p rosem 9 0.56 % 8.69 0 0 % 0 4 1.83 % 13.53 3 0.43 % 8.49 2 0.79 % 12.83 recmo recmo 9 0.56 % 8.69 2 0.46 % 8.68 5 2.28 % 16.92 2 0.28 % 5.66 0 0 % 0 pardon p ardon 8 0.50 % 7.73 3 0.69 % 13.01 2 0.91 % 6.77 3 0.43 % 8.49 0 0 % 0 hopla hopla 7 0.44 % 6.76 2 0.46 % 8.68 0 0 % 0 0 0 % 0 5 1.98 % 32.07 jebemti je bemti 6 0.37 % 5.80 3 0.69 % 13.01 3 1.37 % 10.15 0 0 % 0 0 0 % 0 jebenti je benti 6 0.37 % 5.80 2 0.46 % 8.68 4 1.83 % 13.53 0 0 % 0 0 0 % 0 jebemtiš jeb emtiš 4 0.25 % 3.86 0 0 % 0 4 1.83 % 13.53 0 0 % 0 0 0 % 0 pruesm p ruesm 4 0.25 % 3.86 0 0 % 0 1 0.46 % 3.38 3 0.43 % 8.49 0 0 % 0 hojla hojla 3 0.19 % 2.90 0 0 % 0 0 0 % 0 3 0.43 % 8.49 0 0 % 0 jojojoj jo jojoj 3 0.19 % 2.90 0 0 % 0 3 1.37 % 10.15 0 0 % 0 0 0 % 0 juhuhu j uhuhu 3 0.19 % 2.90 2 0.46 % 8.68 0 0 % 0 1 0.14 % 2.83 0 0 % 0 aleluja al eluja 2 0.12 % 1.93 2 0.46 % 8.68 0 0 % 0 0 0 % 0 0 0 % 0 juhej juhej 2 0.12 % 1.93 2 0.46 % 8.68 0 0 % 0 0 0 % 0 0 0 % 0 
| pozor |  | pozor | 2 | 0.12 % | 1.93 | 1 | 0.23 % | 4.34 | 0 | 0 % | 0 | 1 | 0.14 % | 2.83 | 0 | 0 % | 0 |
| prosin | p | rosin | 2 | 0.12 % | 1.93 | 2 | 0.46 % | 8.68 | 0 | 0 % | 0 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| pruosm | p | ruosm | 2 | 0.12 % | 1.93 | 0 | 0 % | 0 | 1 | 0.46 % | 3.38 | 1 | 0.14 % | 2.83 | 0 | 0 % | 0 |
| tralala | tr | alala | 2 | 0.12 % | 1.93 | 2 | 0.46 % | 8.68 | 0 | 0 % | 0 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| adijos | a | dijos | 1 | 0.06 % | 0.97 | 0 | 0 % | 0 | 1 | 0.46 % | 3.38 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| bemti |  | bemti | 1 | 0.06 % | 0.97 | 1 | 0.23 % | 4.34 | 0 | 0 % | 0 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| hopsasa | ho | psasa | 1 | 0.06 % | 0.97 | 1 | 0.23 % | 4.34 | 0 | 0 % | 0 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| hujoj |  | hujoj | 1 | 0.06 % | 0.97 | 0 | 0 % | 0 | 0 | 0 % | 0 | 0 | 0 % | 0 | 1 | 0.40 % | 6.41 |
| iouhuhu | io | uhuhu | 1 | 0.06 % | 0.97 | 1 | 0.23 % | 4.34 | 0 | 0 % | 0 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| jebga |  | jebga | 1 | 0.06 % | 0.97 | 0 | 0 % | 0 | 1 | 0.46 % | 3.38 | 0 | 0 % | 0 | 0 | 0 % | 0 |

2.2.331 List of initial character-level 1-grams from abbreviation lemmas in the GOS 1.0 corpus with text-type distribution

File at CLARIN.SI: GOS1.0-word_parts-abbreviations-lemmas-initial-1grams-taxonomy-entire.tsv

| Lemma | Lemma (lower-case) | Initial part of the word | Rest of the word | Total absolute frequency of lemma | Percentage of all found lemmas | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N] |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| čim | čim | č | im | 64 | 58.72 % | 61.83 | 6 | 60.00 % | 26.03 | 17 | 94.44 % | 57.52 | 28 | 42.42 % | 79.29 | 13 | 86.67 % | 83.39 |
| pH | ph | p | H | 38 | 34.86 % | 36.71 | 0 | 0 % | 0 | 1 | 5.56 % | 3.38 | 37 | 56.06 % | 104.77 | 0 | 0 % | 0 |
| W | w | W |  | 3 | 2.75 % | 2.90 | 2 | 20.00 % | 8.68 | 0 | 0 % | 0 | 0 | 0 % | 0 | 1 | 6.67 % | 6.41 |
| Al | al | A | l | 1 | 0.92 % | 0.97 | 0 | 0 % | 0 | 0 | 0 % | 0 | 1 | 1.51 % | 2.83 | 0 | 0 % | 0 |
| Mg | mg | M | g | 1 | 0.92 % | 0.97 | 0 | 0 % | 0 | 0 | 0 % | 0 | 0 | 0 % | 0 | 1 | 6.67 % | 6.41 |
| WWW | www | W | WW | 1 | 0.92 % | 0.97 | 1 | 10.00 % | 4.34 | 0 | 0 % | 0 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| najsi | najsi | n | ajsi | 1 | 0.92 % | 0.97 | 1 | 10.00 % | 4.34 | 0 | 0 % | 0 | 0 | 0 % | 0 | 0 | 0 % | 0 |

2.2.332 List of initial character-level 2-grams from abbreviation lemmas in the GOS 1.0 corpus with text-type distribution

File at CLARIN.SI: GOS1.0-word_parts-abbreviations-lemmas-initial-2grams-taxonomy-entire.tsv

| Lemma | Lemma (lower-case) | Initial part of the word | Rest of the word | Total absolute frequency of lemma | Percentage of all found lemmas | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N] |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| čim | čim | či | m | 64 | 60.38 % | 61.83 | 6 | 75.00 % | 26.03 | 17 | 94.44 % | 57.52 | 28 | 42.42 % | 79.29 | 13 | 92.86 % | 83.39 |
| pH | ph | pH |  | 38 | 35.85 % | 36.71 | 0 | 0 % | 0 | 1 | 5.56 % | 3.38 | 37 | 56.06 % | 104.77 | 0 | 0 % | 0 |
| Al | al | Al |  | 1 | 0.94 % | 0.97 | 0 | 0 % | 0 | 0 | 0 % | 0 | 1 | 1.51 % | 2.83 | 0 | 0 % | 0 |
| Mg | mg | Mg |  | 1 | 0.94 % | 0.97 | 0 | 0 % | 0 | 0 | 0 % | 0 | 0 | 0 % | 0 | 1 | 7.14 % | 6.41 |
| WWW | www | WW | W | 1 | 0.94 % | 0.97 | 1 | 12.50 % | 4.34 | 0 | 0 % | 0 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| najsi | najsi | na | jsi | 1 | 0.94 % | 0.97 | 1 | 12.50 % | 4.34 | 0 | 0 % | 0 | 0 | 0 % | 0 | 0 | 0 % | 0 |

2.2.333 List of initial character-level 3-grams from abbreviation lemmas in the GOS 1.0 corpus with text-type distribution

File at CLARIN.SI: GOS1.0-word_parts-abbreviations-lemmas-initial-3grams-taxonomy-entire.tsv

| Lemma | Lemma (lower-case) | Initial part of the word | Rest of the word | Total absolute frequency of lemma | Percentage of all found lemmas | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N] |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| čim | čim | čim |  | 64 | 96.97 % | 61.83 | 6 | 75.00 % | 26.03 | 17 | 100.00 % | 57.52 | 28 | 100.00 % | 79.29 | 13 | 100.00 % | 83.39 |
| WWW | www | WWW |  | 1 | 1.51 % | 0.97 | 1 | 12.50 % | 4.34 | 0 | 0 % | 0 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| najsi | najsi | naj | si | 1 | 1.51 % | 0.97 | 1 | 12.50 % | 4.34 | 0 | 0 % | 0 | 0 | 0 % | 0 | 0 | 0 % | 0 |

2.2.334 List of initial character-level 4-grams from abbreviation lemmas in the GOS 1.0 corpus with text-type distribution

File at CLARIN.SI: GOS1.0-word_parts-abbreviations-lemmas-initial-4grams-taxonomy-entire.tsv

| Lemma | Lemma (lower-case) | Initial part of the word | Rest of the word | Total absolute frequency of lemma | Percentage of all found lemmas | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N] |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| najsi | najsi | najs | i | 1 | 100.00 % | 0.97 | 1 | 100.00 % | 4.34 | 0 | 0 % | 0 | 0 | 0 % | 0 | 0 | 0 % | 0 |

2.2.335 List of initial character-level 5-grams from abbreviation lemmas in the GOS 1.0 corpus with text-type distribution

File at CLARIN.SI: GOS1.0-word_parts-abbreviations-lemmas-initial-5grams-taxonomy-entire.tsv

| Lemma | Lemma (lower-case) | Initial part of the word | Rest of the word | Total absolute frequency of lemma | Percentage of all found lemmas | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N] |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| najsi | najsi | najsi |  | 1 | 100.00 % | 0.97 | 1 | 100.00 % | 4.34 | 0 | 0 % | 0 | 0 | 0 % | 0 | 0 | 0 % | 0 |

2.2.336 List of final character-level 1-grams from abbreviation lemmas in the GOS 1.0 corpus with text-type distribution

File at CLARIN.SI: GOS1.0-word_parts-abbreviations-lemmas-final-1grams-taxonomy-entire.tsv

| Lemma | Lemma (lower-case) | Rest of the word | Final part of the word | Total absolute frequency of lemma | Percentage of all found lemmas | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N] |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| čim | čim | či | m | 64 | 58.72 % | 61.83 | 6 | 60.00 % | 26.03 | 17 | 94.44 % | 57.52 | 28 | 42.42 % | 79.29 | 13 | 86.67 % | 83.39 |
| pH | ph | p | H | 38 | 34.86 % | 36.71 | 0 | 0 % | 0 | 1 | 5.56 % | 3.38 | 37 | 56.06 % | 104.77 | 0 | 0 % | 0 |
| W | w |  | W | 3 | 2.75 % | 2.90 | 2 | 20.00 % | 8.68 | 0 | 0 % | 0 | 0 | 0 % | 0 | 1 | 6.67 % | 6.41 |
| Al | al | A | l | 1 | 0.92 % | 0.97 | 0 | 0 % | 0 | 0 | 0 % | 0 | 1 | 1.51 % | 2.83 | 0 | 0 % | 0 |
| Mg | mg | M | g | 1 | 0.92 % | 0.97 | 0 | 0 % | 0 | 0 | 0 % | 0 | 0 | 0 % | 0 | 1 | 6.67 % | 6.41 |
| WWW | www | WW | W | 1 | 0.92 % | 0.97 | 1 | 10.00 % | 4.34 | 0 | 0 % | 0 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| najsi | najsi | najs | i | 1 | 0.92 % | 0.97 | 1 | 10.00 % | 4.34 | 0 | 0 % | 0 | 0 | 0 % | 0 | 0 | 0 % | 0 |

2.2.337 List of final character-level 2-grams from abbreviation lemmas in the GOS 1.0 corpus with text-type distribution

File at CLARIN.SI: GOS1.0-word_parts-abbreviations-lemmas-final-2grams-taxonomy-entire.tsv

| Lemma | Lemma (lower-case) | Rest of the word | Final part of the word | Total absolute frequency of lemma | Percentage of all found lemmas | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N] |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| čim | čim | č | im | 64 | 60.38 % | 61.83 | 6 | 75.00 % | 26.03 | 17 | 94.44 % | 57.52 | 28 | 42.42 % | 79.29 | 13 | 92.86 % | 83.39 |
| pH | ph |  | pH | 38 | 35.85 % | 36.71 | 0 | 0 % | 0 | 1 | 5.56 % | 3.38 | 37 | 56.06 % | 104.77 | 0 | 0 % | 0 |
| Al | al |  | Al | 1 | 0.94 % | 0.97 | 0 | 0 % | 0 | 0 | 0 % | 0 | 1 | 1.51 % | 2.83 | 0 | 0 % | 0 |
| Mg | mg |  | Mg | 1 | 0.94 % | 0.97 | 0 | 0 % | 0 | 0 | 0 % | 0 | 0 | 0 % | 0 | 1 | 7.14 % | 6.41 |
| WWW | www | W | WW | 1 | 0.94 % | 0.97 | 1 | 12.50 % | 4.34 | 0 | 0 % | 0 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| najsi | najsi | naj | si | 1 | 0.94 % | 0.97 | 1 | 12.50 % | 4.34 | 0 | 0 % | 0 | 0 | 0 % | 0 | 0 | 0 % | 0 |

2.2.338 List of final character-level 3-grams from abbreviation lemmas in the GOS 1.0 corpus with text-type distribution

File at CLARIN.SI: GOS1.0-word_parts-abbreviations-lemmas-final-3grams-taxonomy-entire.tsv

| Lemma | Lemma (lower-case) | Rest of the word | Final part of the word | Total absolute frequency of lemma | Percentage of all found lemmas | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N] |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| čim | čim |  | čim | 64 | 96.97 % | 61.83 | 6 | 75.00 % | 26.03 | 17 | 100.00 % | 57.52 | 28 | 100.00 % | 79.29 | 13 | 100.00 % | 83.39 |
| WWW | www |  | WWW | 1 | 1.51 % | 0.97 | 1 | 12.50 % | 4.34 | 0 | 0 % | 0 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| najsi | najsi | na | jsi | 1 | 1.51 % | 0.97 | 1 | 12.50 % | 4.34 | 0 | 0 % | 0 | 0 | 0 % | 0 | 0 | 0 % | 0 |

2.2.339 List of final character-level 4-grams from abbreviation lemmas in the GOS 1.0 corpus with text-type distribution

File at CLARIN.SI: GOS1.0-word_parts-abbreviations-lemmas-final-4grams-taxonomy-entire.tsv

| Lemma | Lemma (lower-case) | Rest of the word | Final part of the word | Total absolute frequency of lemma | Percentage of all found lemmas | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N] |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| najsi | najsi | n | ajsi | 1 | 100.00 % | 0.97 | 1 | 100.00 % | 4.34 | 0 | 0 % | 0 | 0 | 0 % | 0 | 0 | 0 % | 0 |

2.2.340 List of final character-level 5-grams from abbreviation lemmas in the GOS 1.0 corpus with text-type distribution

File at CLARIN.SI: GOS1.0-word_parts-abbreviations-lemmas-final-5grams-taxonomy-entire.tsv

| Lemma | Lemma (lower-case) | Rest of the word | Final part of the word | Total absolute frequency of lemma | Percentage of all found lemmas | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N] |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| najsi | najsi |  | najsi | 1 | 100.00 % | 0.97 | 1 | 100.00 % | 4.34 | 0 | 0 % | 0 | 0 | 0 % | 0 | 0 | 0 % | 0 |

2.2.341 List of initial character-level 1-grams from abbreviation standardized forms in the GOS 1.0 corpus with text-type distribution

File at CLARIN.SI: GOS1.0-word_parts-abbreviations-standardized_forms-initial-1grams-taxonomy-entire.tsv

| Standardized form | Initial part of the word | Rest of the word | Total absolute frequency of standardized form | Percentage of all found standardized forms | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N] |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| čim | č | im | 64 | 58.72 % | 61.83 | 6 | 60.00 % | 26.03 | 17 | 94.44 % | 57.52 | 28 | 42.42 % | 79.29 | 13 | 86.67 % | 83.39 |
| PH | P | H | 37 | 33.95 % | 35.75 | 0 | 0 % | 0 | 0 | 0 % | 0 | 37 | 56.06 % | 104.77 | 0 | 0 % | 0 |
| W | W |  | 3 | 2.75 % | 2.90 | 2 | 20.00 % | 8.68 | 0 | 0 % | 0 | 0 | 0 % | 0 | 1 | 6.67 % | 6.41 |
| MG | M | G | 1 | 0.92 % | 0.97 | 0 | 0 % | 0 | 0 | 0 % | 0 | 0 | 0 % | 0 | 1 | 6.67 % | 6.41 |
| WWW | W | WW | 1 | 0.92 % | 0.97 | 1 | 10.00 % | 4.34 | 0 | 0 % | 0 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| al | a | l | 1 | 0.92 % | 0.97 | 0 | 0 % | 0 | 0 | 0 % | 0 | 1 | 1.51 % | 2.83 | 0 | 0 % | 0 |
| najsi | n | ajsi | 1 | 0.92 % | 0.97 | 1 | 10.00 % | 4.34 | 0 | 0 % | 0 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| ph | p | h | 1 | 0.92 % | 0.97 | 0 | 0 % | 0 | 1 | 5.56 % | 3.38 | 0 | 0 % | 0 | 0 | 0 % | 0 |

2.2.342 List of initial character-level 2-grams from abbreviation standardized forms in the GOS 1.0 corpus with text-type distribution

File at CLARIN.SI: GOS1.0-word_parts-abbreviations-standardized_forms-initial-2grams-taxonomy-entire.tsv

| Standardized form | Initial part of the word | Rest of the word | Total absolute frequency of standardized form | Percentage of all found standardized forms | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N] |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| čim | či | m | 64 | 60.38 % | 61.83 | 6 | 75.00 % | 26.03 | 17 | 94.44 % | 57.52 | 28 | 42.42 % | 79.29 | 13 | 92.86 % | 83.39 |
| PH | PH |  | 37 | 34.91 % | 35.75 | 0 | 0 % | 0 | 0 | 0 % | 0 | 37 | 56.06 % | 104.77 | 0 | 0 % | 0 |
| MG | MG |  | 1 | 0.94 % | 0.97 | 0 | 0 % | 0 | 0 | 0 % | 0 | 0 | 0 % | 0 | 1 | 7.14 % | 6.41 |

WWW WW W 1
0.94 % 0.97 1 12.50 % 4.34 0 0 % 0 0 0 % 0 0 0 % 0 al al 1 0.94 % 0.97 0 0 % 0 0 0 % 0 1 1.51 % 2.83 0 0 % 0 najsi na jsi 1 0.94 % 0.97 1 12.50 % 4.34 0 0 % 0 0 0 % 0 0 0 % 0 ph ph 1 0.94 % 0.97 0 0 % 0 1 5.56 % 3.38 0 0 % 0 0 0 % 0 CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 686 File at CLARIN.SI2.2.343 List of initial character-level 3-grams from abbreviation standardized forms in the GOS 1.0 corpus with text-type distributionGOS1.0-word_parts-abbreviations-standardized_ forms-initial-3grams-taxonomy-entire.tsvStandardized form Initial part of the word Rest of the word Total absolute frequency of standardized form Percentage of all found standardized forms Total relative frequency (per million occurrences) Absolute frequency [gos.T.J.R] Percentage [gos.T.J.R] Relative frequency [gos.T.J.R] Absolute frequency [gos.T.N.Z] Percentage [gos.T.N.Z] Relative frequency [gos.T.N.Z] Absolute frequency [gos.T.J.I] Percentage [gos.T.J.I] Relative frequency [gos.T.J.I] Absolute frequency [gos.T.N.N] Percentage [gos.T.N.N] Relative frequency [gos.T.N.N] čim čim 64 96.97 % 61.83 6 75.00 % 26.03 17 100.00 % 57.52 28 100.00 % 79.29 13 100.00 % 83.39 WWW WWW 1 1.51 % 0.97 1 12.50 % 4.34 0 0 % 0 0 0 % 0 0 0 % 0 najsi naj si 1 1.51 % 0.97 1 12.50 % 4.34 0 0 % 0 0 0 % 0 0 0 % 0 CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 687 File at CLARIN.SI2.2.344 List of initial character-level 4-grams from abbreviation standardized forms in the GOS 1.0 corpus with text-type distributionGOS1.0-word_parts-abbreviations-standardized_ forms-initial-4grams-taxonomy-entire.tsvStandardized form Initial part of the word Rest of the word Total absolute frequency of standardized form Percentage of all found standardized forms Total relative frequency (per million occurrences) Absolute frequency [gos.T.J.R] Percentage [gos.T.J.R] Relative frequency [gos.T.J.R] Absolute frequency [gos.T.N.Z] Percentage [gos.T.N.Z] Relative frequency 
[gos.T.N.Z] Absolute frequency [gos.T.J.I] Percentage [gos.T.J.I] Relative frequency [gos.T.J.I] Absolute frequency [gos.T.N.N] Percentage [gos.T.N.N] Relative frequency [gos.T.N.N] najsi najs i 1 100.00 % 0.97 1 100.00 % 4.34 0 0 % 0 0 0 % 0 0 0 % 0 CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 688 File at CLARIN.SI2.2.345 List of initial character-level 5-grams from abbreviation standardized forms in the GOS 1.0 corpus with text-type distributionGOS1.0-word_parts-abbreviations-standardized_ forms-initial-5grams-taxonomy-entire.tsvStandardized form Initial part of the word Rest of the word Total absolute frequency of standardized form Percentage of all found standardized forms Total relative frequency (per million occurrences) Absolute frequency [gos.T.J.R] Percentage [gos.T.J.R] Relative frequency [gos.T.J.R] Absolute frequency [gos.T.N.Z] Percentage [gos.T.N.Z] Relative frequency [gos.T.N.Z] Absolute frequency [gos.T.J.I] Percentage [gos.T.J.I] Relative frequency [gos.T.J.I] Absolute frequency [gos.T.N.N] Percentage [gos.T.N.N] Relative frequency [gos.T.N.N] najsi najsi 1 100.00 % 0.97 1 100.00 % 4.34 0 0 % 0 0 0 % 0 0 0 % 0 CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 689 File at CLARIN.SI2.2.346 List of final character-level 1-grams from abbreviation standardized forms in the GOS 1.0 corpus with text- type distributionGOS1.0-word_parts-abbreviations-standardized_ forms-final-1grams-taxonomy-entire.tsvStandardized form Rest of the word Final part of the word Total absolute frequency of standardized form Percentage of all found standardized forms Total relative frequency (per million occurrences) Absolute frequency [gos.T.J.R] Percentage [gos.T.J.R] Relative frequency [gos.T.J.R] Absolute frequency [gos.T.N.Z] Percentage [gos.T.N.Z] Relative frequency [gos.T.N.Z] Absolute frequency [gos.T.J.I] Percentage [gos.T.J.I] Relative frequency [gos.T.J.I] Absolute frequency [gos.T.N.N] Percentage 
[gos.T.N.N] Relative frequency [gos.T.N.N] čim či m 64 58.72 % 61.83 6 60.00 % 26.03 17 94.44 % 57.52 28 42.42 % 79.29 13 86.67 % 83.39 PH P H 37 33.95 % 35.75 0 0 % 0 0 0 % 0 37 56.06 % 104.77 0 0 % 0 W W 3 2.75 % 2.90 2 20.00 % 8.68 0 0 % 0 0 0 % 0 1 6.67 % 6.41 MG M G 1 0.92 % 0.97 0 0 % 0 0 0 % 0 0 0 % 0 1 6.67 % 6.41 WWW WW W 1 0.92 % 0.97 1 10.00 % 4.34 0 0 % 0 0 0 % 0 0 0 % 0 al a l 1 0.92 % 0.97 0 0 % 0 0 0 % 0 1 1.51 % 2.83 0 0 % 0 najsi najs i 1 0.92 % 0.97 1 10.00 % 4.34 0 0 % 0 0 0 % 0 0 0 % 0 ph p h 1 0.92 % 0.97 0 0 % 0 1 5.56 % 3.38 0 0 % 0 0 0 % 0 CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 690 File at CLARIN.SI2.2.347 List of final character-level 2-grams from abbreviation standardized forms in the GOS 1.0 corpus with text-type distributionGOS1.0-word_parts-abbreviations-standardized_ forms-final-2grams-taxonomy-entire.tsvStandardized form Rest of the word Final part of the word Total absolute frequency of standardized form Percentage of all found standardized forms Total relative frequency (per million occurrences) Absolute frequency [gos.T.J.R] Percentage [gos.T.J.R] Relative frequency [gos.T.J.R] Absolute frequency [gos.T.N.Z] Percentage [gos.T.N.Z] Relative frequency [gos.T.N.Z] Absolute frequency [gos.T.J.I] Percentage [gos.T.J.I] Relative frequency [gos.T.J.I] Absolute frequency [gos.T.N.N] Percentage [gos.T.N.N] Relative frequency [gos.T.N.N] čim č im 64 60.38 % 61.83 6 75.00 % 26.03 17 94.44 % 57.52 28 42.42 % 79.29 13 92.86 % 83.39 PH PH 37 34.91 % 35.75 0 0 % 0 0 0 % 0 37 56.06 % 104.77 0 0 % 0 MG MG 1 0.94 % 0.97 0 0 % 0 0 0 % 0 0 0 % 0 1 7.14 % 6.41 WWW W WW 1 0.94 % 0.97 1 12.50 % 4.34 0 0 % 0 0 0 % 0 0 0 % 0 al al 1 0.94 % 0.97 0 0 % 0 0 0 % 0 1 1.51 % 2.83 0 0 % 0 najsi naj si 1 0.94 % 0.97 1 12.50 % 4.34 0 0 % 0 0 0 % 0 0 0 % 0 ph ph 1 0.94 % 0.97 0 0 % 0 1 5.56 % 3.38 0 0 % 0 0 0 % 0 CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 691 File at CLARIN.SI2.2.348 List 
of final character-level 3-grams from abbreviation standardized forms in the GOS 1.0 corpus with text-type distributionGOS1.0-word_parts-abbreviations-standardized_ forms-final-3grams-taxonomy-entire.tsvStandardized form Rest of the word Final part of the word Total absolute frequency of standardized form Percentage of all found standardized forms Total relative frequency (per million occurrences) Absolute frequency [gos.T.J.R] Percentage [gos.T.J.R] Relative frequency [gos.T.J.R] Absolute frequency [gos.T.N.Z] Percentage [gos.T.N.Z] Relative frequency [gos.T.N.Z] Absolute frequency [gos.T.J.I] Percentage [gos.T.J.I] Relative frequency [gos.T.J.I] Absolute frequency [gos.T.N.N] Percentage [gos.T.N.N] Relative frequency [gos.T.N.N] čim čim 64 96.97 % 61.83 6 75.00 % 26.03 17 100.00 % 57.52 28 100.00 % 79.29 13 100.00 % 83.39 WWW WWW 1 1.51 % 0.97 1 12.50 % 4.34 0 0 % 0 0 0 % 0 0 0 % 0 najsi na jsi 1 1.51 % 0.97 1 12.50 % 4.34 0 0 % 0 0 0 % 0 0 0 % 0 CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 692 File at CLARIN.SI2.2.349 List of final character-level 4-grams from abbreviation standardized forms in the GOS 1.0 corpus with text-type distributionGOS1.0-word_parts-abbreviations-standardized_ forms-final-4grams-taxonomy-entire.tsvStandardized form Rest of the word Final part of the word Total absolute frequency of standardized form Percentage of all found standardized forms Total relative frequency (per million occurrences) Absolute frequency [gos.T.J.R] Percentage [gos.T.J.R] Relative frequency [gos.T.J.R] Absolute frequency [gos.T.N.Z] Percentage [gos.T.N.Z] Relative frequency [gos.T.N.Z] Absolute frequency [gos.T.J.I] Percentage [gos.T.J.I] Relative frequency [gos.T.J.I] Absolute frequency [gos.T.N.N] Percentage [gos.T.N.N] Relative frequency [gos.T.N.N] najsi n ajsi 1 100.00 % 0.97 1 100.00 % 4.34 0 0 % 0 0 0 % 0 0 0 % 0 CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 693 File at CLARIN.SI2.2.350 
List of final character-level 5-grams from abbreviation standardized forms in the GOS 1.0 corpus with text-type distribution

File at CLARIN.SI: GOS1.0-word_parts-abbreviations-standardized_forms-final-5grams-taxonomy-entire.tsv

| Standardized form | Rest of the word | Final part of the word | Total absolute frequency of standardized form | Percentage of all found standardized forms | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N] |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| najsi |  | najsi | 1 | 100.00 % | 0.97 | 1 | 100.00 % | 4.34 | 0 | 0 % | 0 | 0 | 0 % | 0 | 0 | 0 % | 0 |

2.2.351 List of initial character-level 1-grams from abbreviation lower-case word forms in the GOS 1.0 corpus with text-type distribution

File at CLARIN.SI: GOS1.0-word_parts-abbreviations-lowercase_forms-initial-1grams-taxonomy-entire.tsv

| Lower-case word form | Initial part of the word | Rest of the word | Total absolute frequency of lower-case word form | Percentage of all found lower-case word forms | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N] |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| čim | č | im | 63 | 57.80 % | 60.86 | 6 | 60.00 % | 26.03 | 16 | 88.89 % | 54.14 | 28 | 42.42 % | 79.29 | 13 | 86.67 % | 83.39 |
| peha | p | eha | 37 | 33.95 % | 35.75 | 0 | 0 % | 0 | 0 | 0 % | 0 | 37 | 56.06 % | 104.77 | 0 | 0 % | 0 |
| ve | v | e | 3 | 2.75 % | 2.90 | 2 | 20.00 % | 8.68 | 0 | 0 % | 0 | 0 | 0 % | 0 | 1 | 6.67 % | 6.41 |
| al | a | l | 1 | 0.92 % | 0.97 | 0 | 0 % | 0 | 0 | 0 % | 0 | 1 | 1.51 % | 2.83 | 0 | 0 % | 0 |
| emge | e | mge | 1 | 0.92 % | 0.97 | 0 | 0 % | 0 | 0 | 0 % | 0 | 0 | 0 % | 0 | 1 | 6.67 % | 6.41 |
| najsi | n | ajsi | 1 | 0.92 % | 0.97 | 1 | 10.00 % | 4.34 | 0 | 0 % | 0 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| ph | p | h | 1 | 0.92 % | 0.97 | 0 | 0 % | 0 | 1 | 5.56 % | 3.38 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| veveve | v | eveve | 1 | 0.92 % | 0.97 | 1 | 10.00 % | 4.34 | 0 | 0 % | 0 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| čimveč | č | imveč | 1 | 0.92 % | 0.97 | 0 | 0 % | 0 | 1 | 5.56 % | 3.38 | 0 | 0 % | 0 | 0 | 0 % | 0 |

2.2.352 List of initial character-level 2-grams from abbreviation lower-case word forms in the GOS 1.0 corpus with text-type distribution

File at CLARIN.SI: GOS1.0-word_parts-abbreviations-lowercase_forms-initial-2grams-taxonomy-entire.tsv

| Lower-case word form | Initial part of the word | Rest of the word | Total absolute frequency of lower-case word form | Percentage of all found lower-case word forms | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N] |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| čim | či | m | 63 | 57.80 % | 60.86 | 6 | 60.00 % | 26.03 | 16 | 88.89 % | 54.14 | 28 | 42.42 % | 79.29 | 13 | 86.67 % | 83.39 |
| peha | pe | ha | 37 | 33.95 % | 35.75 | 0 | 0 % | 0 | 0 | 0 % | 0 | 37 | 56.06 % | 104.77 | 0 | 0 % | 0 |
| ve | ve |  | 3 | 2.75 % | 2.90 | 2 | 20.00 % | 8.68 | 0 | 0 % | 0 | 0 | 0 % | 0 | 1 | 6.67 % | 6.41 |
| al | al |  | 1 | 0.92 % | 0.97 | 0 | 0 % | 0 | 0 | 0 % | 0 | 1 | 1.51 % | 2.83 | 0 | 0 % | 0 |
| emge | em | ge | 1 | 0.92 % | 0.97 | 0 | 0 % | 0 | 0 | 0 % | 0 | 0 | 0 % | 0 | 1 | 6.67 % | 6.41 |
| najsi | na | jsi | 1 | 0.92 % | 0.97 | 1 | 10.00 % | 4.34 | 0 | 0 % | 0 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| ph | ph |  | 1 | 0.92 % | 0.97 | 0 | 0 % | 0 | 1 | 5.56 % | 3.38 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| veveve | ve | veve | 1 | 0.92 % | 0.97 | 1 | 10.00 % | 4.34 | 0 | 0 % | 0 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| čimveč | či | mveč | 1 | 0.92 % | 0.97 | 0 | 0 % | 0 | 1 | 5.56 % | 3.38 | 0 | 0 % | 0 | 0 | 0 % | 0 |

2.2.353 List of initial character-level 3-grams from abbreviation lower-case word forms in the GOS 1.0 corpus with text-type distribution

File at CLARIN.SI: GOS1.0-word_parts-abbreviations-lowercase_forms-initial-3grams-taxonomy-entire.tsv

| Lower-case word form | Initial part of the word | Rest of the word | Total absolute frequency of lower-case word form | Percentage of all found lower-case word forms | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N] |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| čim | čim |  | 63 | 60.58 % | 60.86 | 6 | 75.00 % | 26.03 | 16 | 94.12 % | 54.14 | 28 | 43.08 % | 79.29 | 13 | 92.86 % | 83.39 |
| peha | peh | a | 37 | 35.58 % | 35.75 | 0 | 0 % | 0 | 0 | 0 % | 0 | 37 | 56.92 % | 104.77 | 0 | 0 % | 0 |
| emge | emg | e | 1 | 0.96 % | 0.97 | 0 | 0 % | 0 | 0 | 0 % | 0 | 0 | 0 % | 0 | 1 | 7.14 % | 6.41 |
| najsi | naj | si | 1 | 0.96 % | 0.97 | 1 | 12.50 % | 4.34 | 0 | 0 % | 0 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| veveve | vev | eve | 1 | 0.96 % | 0.97 | 1 | 12.50 % | 4.34 | 0 | 0 % | 0 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| čimveč | čim | več | 1 | 0.96 % | 0.97 | 0 | 0 % | 0 | 1 | 5.88 % | 3.38 | 0 | 0 % | 0 | 0 | 0 % | 0 |

2.2.354 List of initial character-level 4-grams from abbreviation lower-case word forms in the GOS 1.0 corpus with text-type distribution

File at CLARIN.SI: GOS1.0-word_parts-abbreviations-lowercase_forms-initial-4grams-taxonomy-entire.tsv

| Lower-case word form | Initial part of the word | Rest of the word | Total absolute frequency of lower-case word form | Percentage of all found lower-case word forms | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N] |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| peha | peha |  | 37 | 90.24 % | 35.75 | 0 | 0 % | 0 | 0 | 0 % | 0 | 37 | 100.00 % | 104.77 | 0 | 0 % | 0 |
| emge | emge |  | 1 | 2.44 % | 0.97 | 0 | 0 % | 0 | 0 | 0 % | 0 | 0 | 0 % | 0 | 1 | 100.00 % | 6.41 |
| najsi | najs | i | 1 | 2.44 % | 0.97 | 1 | 50.00 % | 4.34 | 0 | 0 % | 0 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| veveve | veve | ve | 1 | 2.44 % | 0.97 | 1 | 50.00 % | 4.34 | 0 | 0 % | 0 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| čimveč | čimv | eč | 1 | 2.44 % | 0.97 | 0 | 0 % | 0 | 1 | 100.00 % | 3.38 | 0 | 0 % | 0 | 0 | 0 % | 0 |

2.2.355 List of initial character-level 5-grams from abbreviation lower-case word forms in the GOS 1.0 corpus with text-type distribution

File at CLARIN.SI: GOS1.0-word_parts-abbreviations-lowercase_forms-initial-5grams-taxonomy-entire.tsv

| Lower-case word form | Initial part of the word | Rest of the word | Total absolute frequency of lower-case word form | Percentage of all found lower-case word forms | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N] |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| najsi | najsi |  | 1 | 33.33 % | 0.97 | 1 | 50.00 % | 4.34 | 0 | 0 % | 0 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| veveve | vevev | e | 1 | 33.33 % | 0.97 | 1 | 50.00 % | 4.34 | 0 | 0 % | 0 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| čimveč | čimve | č | 1 | 33.33 % | 0.97 | 0 | 0 % | 0 | 1 | 100.00 % | 3.38 | 0 | 0 % | 0 | 0 | 0 % | 0 |

2.2.356 List of final character-level 1-grams from abbreviation lower-case word forms in the GOS 1.0 corpus with text-type distribution

File at CLARIN.SI: GOS1.0-word_parts-abbreviations-lowercase_forms-final-1grams-taxonomy-entire.tsv

| Lower-case word form | Rest of the word | Final part of the word | Total absolute frequency of lower-case word form | Percentage of all found lower-case word forms | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N] |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| čim | či | m | 63 | 57.80 % | 60.86 | 6 | 60.00 % | 26.03 | 16 | 88.89 % | 54.14 | 28 | 42.42 % | 79.29 | 13 | 86.67 % | 83.39 |
| peha | peh | a | 37 | 33.95 % | 35.75 | 0 | 0 % | 0 | 0 | 0 % | 0 | 37 | 56.06 % | 104.77 | 0 | 0 % | 0 |
| ve | v | e | 3 | 2.75 % | 2.90 | 2 | 20.00 % | 8.68 | 0 | 0 % | 0 | 0 | 0 % | 0 | 1 | 6.67 % | 6.41 |
| al | a | l | 1 | 0.92 % | 0.97 | 0 | 0 % | 0 | 0 | 0 % | 0 | 1 | 1.51 % | 2.83 | 0 | 0 % | 0 |
| emge | emg | e | 1 | 0.92 % | 0.97 | 0 | 0 % | 0 | 0 | 0 % | 0 | 0 | 0 % | 0 | 1 | 6.67 % | 6.41 |
| najsi | najs | i | 1 | 0.92 % | 0.97 | 1 | 10.00 % | 4.34 | 0 | 0 % | 0 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| ph | p | h | 1 | 0.92 % | 0.97 | 0 | 0 % | 0 | 1 | 5.56 % | 3.38 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| veveve | vevev | e | 1 | 0.92 % | 0.97 | 1 | 10.00 % | 4.34 | 0 | 0 % | 0 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| čimveč | čimve | č | 1 | 0.92 % | 0.97 | 0 | 0 % | 0 | 1 | 5.56 % | 3.38 | 0 | 0 % | 0 | 0 | 0 % | 0 |

2.2.357 List of final character-level 2-grams from abbreviation lower-case word forms in the GOS 1.0 corpus with text-type distribution

File at CLARIN.SI: GOS1.0-word_parts-abbreviations-lowercase_forms-final-2grams-taxonomy-entire.tsv

| Lower-case word form | Rest of the word | Final part of the word | Total absolute frequency of lower-case word form | Percentage of all found lower-case word forms | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N] |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| čim | č | im | 63 | 57.80 % | 60.86 | 6 | 60.00 % | 26.03 | 16 | 88.89 % | 54.14 | 28 | 42.42 % | 79.29 | 13 | 86.67 % | 83.39 |
| peha | pe | ha | 37 | 33.95 % | 35.75 | 0 | 0 % | 0 | 0 | 0 % | 0 | 37 | 56.06 % | 104.77 | 0 | 0 % | 0 |
| ve |  | ve | 3 | 2.75 % | 2.90 | 2 | 20.00 % | 8.68 | 0 | 0 % | 0 | 0 | 0 % | 0 | 1 | 6.67 % | 6.41 |
| al |  | al | 1 | 0.92 % | 0.97 | 0 | 0 % | 0 | 0 | 0 % | 0 | 1 | 1.51 % | 2.83 | 0 | 0 % | 0 |
| emge | em | ge | 1 | 0.92 % | 0.97 | 0 | 0 % | 0 | 0 | 0 % | 0 | 0 | 0 % | 0 | 1 | 6.67 % | 6.41 |
| najsi | naj | si | 1 | 0.92 % | 0.97 | 1 | 10.00 % | 4.34 | 0 | 0 % | 0 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| ph |  | ph | 1 | 0.92 % | 0.97 | 0 | 0 % | 0 | 1 | 5.56 % | 3.38 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| veveve | veve | ve | 1 | 0.92 % | 0.97 | 1 | 10.00 % | 4.34 | 0 | 0 % | 0 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| čimveč | čimv | eč | 1 | 0.92 % | 0.97 | 0 | 0 % | 0 | 1 | 5.56 % | 3.38 | 0 | 0 % | 0 | 0 | 0 % | 0 |

2.2.358 List of final character-level 3-grams from abbreviation lower-case word forms in the GOS 1.0 corpus with text-type distribution

File at CLARIN.SI: GOS1.0-word_parts-abbreviations-lowercase_forms-final-3grams-taxonomy-entire.tsv

| Lower-case word form | Rest of the word | Final part of the word | Total absolute frequency of lower-case word form | Percentage of all found lower-case word forms | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N] |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| čim |  | čim | 63 | 60.58 % | 60.86 | 6 | 75.00 % | 26.03 | 16 | 94.12 % | 54.14 | 28 | 43.08 % | 79.29 | 13 | 92.86 % | 83.39 |
| peha | p | eha | 37 | 35.58 % | 35.75 | 0 | 0 % | 0 | 0 | 0 % | 0 | 37 | 56.92 % | 104.77 | 0 | 0 % | 0 |
| emge | e | mge | 1 | 0.96 % | 0.97 | 0 | 0 % | 0 | 0 | 0 % | 0 | 0 | 0 % | 0 | 1 | 7.14 % | 6.41 |
| najsi | na | jsi | 1 | 0.96 % | 0.97 | 1 | 12.50 % | 4.34 | 0 | 0 % | 0 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| veveve | vev | eve | 1 | 0.96 % | 0.97 | 1 | 12.50 % | 4.34 | 0 | 0 % | 0 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| čimveč | čim | več | 1 | 0.96 % | 0.97 | 0 | 0 % | 0 | 1 | 5.88 % | 3.38 | 0 | 0 % | 0 | 0 | 0 % | 0 |

2.2.359 List of final character-level 4-grams from abbreviation lower-case word forms in the GOS 1.0 corpus with text-type distribution

File at CLARIN.SI: GOS1.0-word_parts-abbreviations-lowercase_forms-final-4grams-taxonomy-entire.tsv

| Lower-case word form | Rest of the word | Final part of the word | Total absolute frequency of lower-case word form | Percentage of all found lower-case word forms | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N] |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| peha |  | peha | 37 | 90.24 % | 35.75 | 0 | 0 % | 0 | 0 | 0 % | 0 | 37 | 100.00 % | 104.77 | 0 | 0 % | 0 |
| emge |  | emge | 1 | 2.44 % | 0.97 | 0 | 0 % | 0 | 0 | 0 % | 0 | 0 | 0 % | 0 | 1 | 100.00 % | 6.41 |
| najsi | n | ajsi | 1 | 2.44 % | 0.97 | 1 | 50.00 % | 4.34 | 0 | 0 % | 0 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| veveve | ve | veve | 1 | 2.44 % | 0.97 | 1 | 50.00 % | 4.34 | 0 | 0 % | 0 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| čimveč | či | mveč | 1 | 2.44 % | 0.97 | 0 | 0 % | 0 | 1 | 100.00 % | 3.38 | 0 | 0 % | 0 | 0 | 0 % | 0 |

2.2.360 List of final character-level 5-grams from abbreviation lower-case word forms in the GOS 1.0 corpus with text-type distribution

File at CLARIN.SI: GOS1.0-word_parts-abbreviations-lowercase_forms-final-5grams-taxonomy-entire.tsv

| Lower-case word form | Rest of the word | Final part of the word | Total absolute frequency of lower-case word form | Percentage of all found lower-case word forms | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N] |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| najsi |  | najsi | 1 | 33.33 % | 0.97 | 1 | 50.00 % | 4.34 | 0 | 0 % | 0 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| veveve | v | eveve | 1 | 33.33 % | 0.97 | 1 | 50.00 % | 4.34 | 0 | 0 % | 0 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| čimveč | č | imveč | 1 | 33.33 % | 0.97 | 0 | 0 % | 0 | 1 | 100.00 % | 3.38 | 0 | 0 % | 0 | 0 | 0 % | 0 |

2.2.361 List of
initial character-level 1-grams from residual lemmas in the GOS 1.0 corpus with text-type distribution

File at CLARIN.SI: GOS1.0-word_parts-residual-lemmas-initial-1grams-taxonomy-entire.tsv

| Lemma | Lemma (lower-case) | Initial part of the word | Rest of the word | Total absolute frequency of lemma | Percentage of all found lemmas | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N] |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| eee | eee | e | ee | 23,222 | 42.92 % | 22,434.40 | 4,248 | 33.09 % | 18,427.66 | 4,527 | 29.40 % | 15,317.62 | 10,556 | 59.40 % | 29,891.07 | 3,891 | 48.07 % | 24,959.43 |
| eem | eem | e | em | 2,950 | 5.45 % | 2,849.95 | 392 | 3.05 % | 1,700.48 | 713 | 4.63 % | 2,412.52 | 1,210 | 6.81 % | 3,426.32 | 635 | 7.84 % | 4,073.31 |
| s | s | s |  | 1,327 | 2.45 % | 1,281.99 | 213 | 1.66 % | 923.99 | 488 | 3.17 % | 1,651.20 | 373 | 2.10 % | 1,056.21 | 253 | 3.12 % | 1,622.91 |
| ka | ka | k | a | 820 | 1.52 % | 792.19 | 466 | 3.63 % | 2,021.49 | 280 | 1.82 % | 947.41 | 42 | 0.24 % | 118.93 | 32 | 0.40 % | 205.27 |
| n | n | n |  | 752 | 1.39 % | 726.49 | 107 | 0.83 % | 464.16 | 299 | 1.94 % | 1,011.70 | 193 | 1.09 % | 546.51 | 153 | 1.89 % | 981.44 |
| p | p | p |  | 497 | 0.92 % | 480.14 | 81 | 0.63 % | 351.37 | 167 | 1.08 % | 565.06 | 141 | 0.79 % | 399.26 | 108 | 1.33 % | 692.78 |
| z | z | z |  | 486 | 0.90 % | 469.52 | 79 | 0.61 % | 342.70 | 159 | 1.03 % | 537.99 | 150 | 0.84 % | 424.75 | 98 | 1.21 % | 628.64 |
| t | t | t |  | 463 | 0.86 % | 447.30 | 76 | 0.59 % | 329.69 | 186 | 1.21 % | 629.35 | 91 | 0.51 % | 257.68 | 110 | 1.36 % | 705.61 |
| m | m | m |  | 422 | 0.78 % | 407.69 | 79 | 0.61 % | 342.70 | 160 | 1.04 % | 541.38 | 93 | 0.52 % | 263.34 | 90 | 1.11 % | 577.32 |
| j | j | j |  | 364 | 0.67 % | 351.65 | 89 | 0.69 % | 386.08 | 136 | 0.88 % | 460.17 | 75 | 0.42 % | 212.37 | 64 | 0.79 % | 410.54 |
| k | k | k |  | 363 | 0.67 % | 350.69 | 72 | 0.56 % | 312.33 | 123 | 0.80 % | 416.18 | 92 | 0.52 % | 260.51 | 76 | 0.94 % | 487.51 |
| v | v | v |  | 332 | 0.61 % | 320.74 | 67 | 0.52 % | 290.64 | 92 | 0.60 % | 311.29 | 119 | 0.67 % | 336.97 | 54 | 0.67 % | 346.39 |
| da | da | d | a | 331 | 0.61 % | 319.77 | 61 | 0.47 % | 264.62 | 129 | 0.84 % | 436.49 | 90 | 0.51 % | 254.85 | 51 | 0.63 % | 327.15 |
| aaa | aaa | a | aa | 288 | 0.53 % | 278.23 | 67 | 0.52 % | 290.64 | 95 | 0.62 % | 321.44 | 95 | 0.54 % | 269.01 | 31 | 0.38 % | 198.85 |
| nnn | nnn | n | nn | 240 | 0.44 % | 231.86 | 30 | 0.23 % | 130.14 | 99 | 0.64 % | 334.98 | 59 | 0.33 % | 167.07 | 52 | 0.64 % | 333.56 |
| e | e | e |  | 224 | 0.41 % | 216.40 | 63 | 0.49 % | 273.29 | 96 | 0.62 % | 324.83 | 40 | 0.23 % | 113.27 | 25 | 0.31 % | 160.37 |
| po | po | p | o | 210 | 0.39 % | 202.88 | 35 | 0.27 % | 151.83 | 55 | 0.36 % | 186.10 | 80 | 0.45 % | 226.53 | 40 | 0.49 % | 256.59 |
| o | o | o |  | 199 | 0.37 % | 192.25 | 32 | 0.25 % | 138.81 | 52 | 0.34 % | 175.95 | 69 | 0.39 % | 195.38 | 46 | 0.57 % | 295.07 |
| kao | kao | k | ao | 198 | 0.37 % | 191.28 | 9 | 0.07 % | 39.04 | 165 | 1.07 % | 558.30 | 5 | 0.03 % | 14.16 | 19 | 0.23 % | 121.88 |
| š | š | š |  | 173 | 0.32 % | 167.13 | 34 | 0.27 % | 147.49 | 65 | 0.42 % | 219.93 | 37 | 0.21 % | 104.77 | 37 | 0.46 % | 237.34 |
| ooo | ooo | o | oo | 166 | 0.31 % | 160.37 | 83 | 0.65 % | 360.05 | 57 | 0.37 % | 192.87 | 10 | 0.06 % | 28.32 | 16 | 0.20 % | 102.63 |
| b | b | b |  | 165 | 0.30 % | 159.40 | 50 | 0.39 % | 216.90 | 38 | 0.25 % | 128.58 | 71 | 0.40 % | 201.05 | 6 | 0.07 % | 38.49 |
| bi | bi | b | i | 160 | 0.30 % | 154.57 | 21 | 0.16 % | 91.10 | 74 | 0.48 % | 250.39 | 18 | 0.10 % | 50.97 | 47 | 0.58 % | 301.49 |
| i | i | i |  | 154 | 0.28 % | 148.78 | 25 | 0.20 % | 108.45 | 50 | 0.33 % | 169.18 | 46 | 0.26 % | 130.26 | 33 | 0.41 % | 211.68 |
| živjo | živjo | ž | ivjo | 149 | 0.28 % | 143.95 | 74 | 0.58 % | 321.01 | 22 | 0.14 % | 74.44 | 35 | 0.20 % | 99.11 | 18 | 0.22 % | 115.46 |
| tipo | tipo | t | ipo | 145 | 0.27 % | 140.08 | 1 | 0.01 % | 4.34 | 144 | 0.94 % | 487.24 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| na | na | n | a | 143 | 0.26 % | 138.15 | 18 | 0.14 % | 78.08 | 49 | 0.32 % | 165.80 | 46 | 0.26 % | 130.26 | 30 | 0.37 % | 192.44 |
| the | the | t | he | 127 | 0.23 % | 122.69 | 104 | 0.81 % | 451.15 | 16 | 0.10 % | 54.14 | 1 | 0.01 % | 2.83 | 6 | 0.07 % | 38.49 |
| B | b | B |  | 112 | 0.21 % | 108.20 | 23 | 0.18 % | 99.77 | 3 | 0.02 % | 10.15 | 68 | 0.38 % | 192.55 | 18 | 0.22 % | 115.46 |
| za | za | z | a | 108 | 0.20 % | 104.34 | 23 | 0.18 % | 99.77 | 29 | 0.19 % | 98.12 | 34 | 0.19 % | 96.28 | 22 | 0.27 % | 141.12 |
| pre | pre | p | re | 107 | 0.20 % | 103.37 | 27 | 0.21 % | 117.12 | 24 | 0.16 % | 81.21 | 34 | 0.19 % | 96.28 | 22 | 0.27 % | 141.12 |
| uuu | uuu | u | uu | 104 | 0.19 % | 100.47 | 58 | 0.45 % | 251.60 | 32 | 0.21 % | 108.28 | 8 | 0.04 % | 22.65 | 6 | 0.07 % | 38.49 |

2.2.362 List of initial character-level 2-grams from residual lemmas in the GOS 1.0 corpus with
text-type distribution

File at CLARIN.SI: GOS1.0-word_parts-residual-lemmas-initial-2grams-taxonomy-entire.tsv

| Lemma | Lemma (lower-case) | Initial part of the word | Rest of the word | Total absolute frequency of lemma | Percentage of all found lemmas | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N] |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| eee | eee | ee | e | 23,222 | 49.31 % | 22,434.40 | 4,248 | 36.78 % | 18,427.66 | 4,527 | 34.77 % | 15,317.62 | 10,556 | 67.11 % | 29,891.07 | 3,891 | 57.24 % | 24,959.43 |
| eem | eem | ee | m | 2,950 | 6.26 % | 2,849.95 | 392 | 3.39 % | 1,700.48 | 713 | 5.48 % | 2,412.52 | 1,210 | 7.69 % | 3,426.32 | 635 | 9.34 % | 4,073.31 |
| ka | ka | ka |  | 820 | 1.74 % | 792.19 | 466 | 4.04 % | 2,021.49 | 280 | 2.15 % | 947.41 | 42 | 0.27 % | 118.93 | 32 | 0.47 % | 205.27 |
| da | da | da |  | 331 | 0.70 % | 319.77 | 61 | 0.53 % | 264.62 | 129 | 0.99 % | 436.49 | 90 | 0.57 % | 254.85 | 51 | 0.75 % | 327.15 |
| aaa | aaa | aa | a | 288 | 0.61 % | 278.23 | 67 | 0.58 % | 290.64 | 95 | 0.73 % | 321.44 | 95 | 0.60 % | 269.01 | 31 | 0.46 % | 198.85 |
| nnn | nnn | nn | n | 240 | 0.51 % | 231.86 | 30 | 0.26 % | 130.14 | 99 | 0.76 % | 334.98 | 59 | 0.38 % | 167.07 | 52 | 0.77 % | 333.56 |
| po | po | po |  | 210 | 0.45 % | 202.88 | 35 | 0.30 % | 151.83 | 55 | 0.42 % | 186.10 | 80 | 0.51 % | 226.53 | 40 | 0.59 % | 256.59 |
| kao | kao | ka | o | 198 | 0.42 % | 191.28 | 9 | 0.08 % | 39.04 | 165 | 1.27 % | 558.30 | 5 | 0.03 % | 14.16 | 19 | 0.28 % | 121.88 |
| ooo | ooo | oo | o | 166 | 0.35 % | 160.37 | 83 | 0.72 % | 360.05 | 57 | 0.44 % | 192.87 | 10 | 0.06 % | 28.32 | 16 | 0.23 % | 102.63 |
| bi | bi | bi |  | 160 | 0.34 % | 154.57 | 21 | 0.18 % | 91.10 | 74 | 0.57 % | 250.39 | 18 | 0.11 % | 50.97 | 47 | 0.69 % | 301.49 |
| živjo | živjo | ži | vjo | 149 | 0.32 % | 143.95 | 74 | 0.64 % | 321.01 | 22 | 0.17 % | 74.44 | 35 | 0.22 % | 99.11 | 18 | 0.27 % | 115.46 |
| tipo | tipo | ti | po | 145 | 0.31 % | 140.08 | 1 | 0.01 % | 4.34 | 144 | 1.11 % | 487.24 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| na | na | na |  | 143 | 0.30 % | 138.15 | 18 | 0.16 % | 78.08 | 49 | 0.38 % | 165.80 | 46 | 0.29 % | 130.26 | 30 | 0.44 % | 192.44 |
| the | the | th | e | 127 | 0.27 % | 122.69 | 104 | 0.90 % | 451.15 | 16 | 0.12 % | 54.14 | 1 | 0.01 % | 2.83 | 6 | 0.09 % | 38.49 |
| za | za | za |  | 108 | 0.23 % | 104.34 | 23 | 0.20 % | 99.77 | 29 | 0.22 % | 98.12 | 34 | 0.22 % | 96.28 | 22 | 0.32 % | 141.12 |
| pre | pre | pr | e | 107 | 0.23 % | 103.37 | 27 | 0.23 % | 117.12 | 24 | 0.18 % | 81.21 | 34 | 0.22 % | 96.28 | 22 | 0.32 % | 141.12 |
| uuu | uuu | uu | u | 104 | 0.22 % | 100.47 | 58 | 0.50 % | 251.60 | 32 | 0.25 % | 108.28 | 8 | 0.05 % | 22.65 | 6 | 0.09 % | 38.49 |
| pri | pri | pr | i | 103 | 0.22 % | 99.51 | 17 | 0.15 % | 73.75 | 31 | 0.24 % | 104.89 | 35 | 0.22 % | 99.11 | 20 | 0.29 % | 128.29 |
| ne | ne | ne |  | 88 | 0.19 % | 85.02 | 23 | 0.20 % | 99.77 | 23 | 0.18 % | 77.82 | 29 | 0.18 % | 82.12 | 13 | 0.19 % | 83.39 |
| re | re | re |  | 76 | 0.16 % | 73.42 | 20 | 0.17 % | 86.76 | 21 | 0.16 % | 71.06 | 22 | 0.14 % | 62.30 | 13 | 0.19 % | 83.39 |
| do | do | do |  | 68 | 0.14 % | 65.69 | 18 | 0.16 % | 78.08 | 20 | 0.15 % | 67.67 | 17 | 0.11 % | 48.14 | 13 | 0.19 % | 83.39 |
| een | een | ee | n | 64 | 0.14 % | 61.83 | 10 | 0.09 % | 43.38 | 10 | 0.08 % | 33.84 | 36 | 0.23 % | 101.94 | 8 | 0.12 % | 51.32 |
| ma | ma | ma |  | 64 | 0.14 % | 61.83 | 19 | 0.17 % | 82.42 | 29 | 0.22 % | 98.12 | 8 | 0.05 % | 22.65 | 8 | 0.12 % | 51.32 |
| yo | yo | yo |  | 64 | 0.14 % | 61.83 | 62 | 0.54 % | 268.95 | 0 | 0 % | 0 | 0 | 0 % | 0 | 2 | 0.03 % | 12.83 |
| ta | ta | ta |  | 60 | 0.13 % | 57.97 | 13 | 0.11 % | 56.39 | 27 | 0.21 % | 91.36 | 12 | 0.08 % | 33.98 | 8 | 0.12 % | 51.32 |
| City | city | Ci | ty | 57 | 0.12 % | 55.07 | 55 | 0.48 % | 238.59 | 1 | 0.01 % | 3.38 | 1 | 0.01 % | 2.83 | 0 | 0 % | 0 |
| te | te | te |  | 57 | 0.12 % | 55.07 | 8 | 0.07 % | 34.70 | 19 | 0.15 % | 64.29 | 20 | 0.13 % | 56.63 | 10 | 0.15 % | 64.15 |
| se | se | se |  | 56 | 0.12 % | 54.10 | 14 | 0.12 % | 60.73 | 24 | 0.18 % | 81.21 | 10 | 0.06 % | 28.32 | 8 | 0.12 % | 51.32 |
| tavžent | tavžent | ta | vžent | 56 | 0.12 % | 54.10 | 18 | 0.16 % | 78.08 | 32 | 0.25 % | 108.28 | 3 | 0.02 % | 8.49 | 3 | 0.04 % | 19.24 |
| you | you | yo | u | 55 | 0.12 % | 53.13 | 45 | 0.39 % | 195.21 | 9 | 0.07 % | 30.45 | 1 | 0.01 % | 2.83 | 0 | 0 % | 0 |
| mo | mo | mo |  | 53 | 0.11 % | 51.20 | 12 | 0.10 % | 52.06 | 18 | 0.14 % | 60.91 | 10 | 0.06 % | 28.32 | 13 | 0.19 % | 83.39 |
| ko | ko | ko |  | 52 | 0.11 % | 50.24 | 15 | 0.13 % | 65.07 | 12 | 0.09 % | 40.60 | 14 | 0.09 % | 39.64 | 11 | 0.16 % | 70.56 |

2.2.363 List of initial character-level 3-grams from residual lemmas in the GOS 1.0 corpus with text-type distribution

File at CLARIN.SI: GOS1.0-word_parts-residual-lemmas-initial-3grams-taxonomy-entire.tsv

| Lemma | Lemma (lower-case) | Initial part of the word | Rest of the word | Total absolute frequency of lemma | Percentage of all found lemmas | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N] |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| eee | eee | eee |  | 23,222 | 54.38 % | 22,434.40 | 4,248 | 41.49 % | 18,427.66 | 4,527 | 39.41 % | 15,317.62 | 10,556 | 71.44 % | 29,891.07 | 3,891 | 62.74 % | 24,959.43 |
| eem | eem | eem |  | 2,950 | 6.91 % | 2,849.95 | 392 | 3.83 % | 1,700.48 | 713 | 6.21 % | 2,412.52 | 1,210 | 8.19 % | 3,426.32 | 635 | 10.24 % | 4,073.31 |
| aaa | aaa | aaa |  | 288 | 0.67 % | 278.23 | 67 | 0.65 % | 290.64 | 95 | 0.83 % | 321.44 | 95 | 0.64 % | 269.01 | 31 | 0.50 % | 198.85 |
| nnn | nnn | nnn |  | 240 | 0.56 % | 231.86 | 30 | 0.29 % | 130.14 | 99 | 0.86 % | 334.98 | 59 | 0.40 % | 167.07 | 52 | 0.84 % | 333.56 |
| kao | kao | kao |  | 198 | 0.46 % | 191.28 | 9 | 0.09 % | 39.04 | 165 | 1.44 % | 558.30 | 5 | 0.03 % | 14.16 | 19 | 0.31 % | 121.88 |
| ooo | ooo | ooo |  | 166 | 0.39 % | 160.37 | 83 | 0.81 % | 360.05 | 57 | 0.50 % | 192.87 | 10 | 0.07 % | 28.32 | 16 | 0.26 % | 102.63 |
| živjo | živjo | živ | jo | 149 | 0.35 % | 143.95 | 74 | 0.72 % | 321.01 | 22 | 0.19 % | 74.44 | 35 | 0.24 % | 99.11 | 18 | 0.29 % | 115.46 |
| tipo | tipo | tip | o | 145 | 0.34 % | 140.08 | 1 | 0.01 % | 4.34 | 144 | 1.25 % | 487.24 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| the | the | the |  | 127 | 0.30 % | 122.69 | 104 | 1.02 % | 451.15 | 16 | 0.14 % | 54.14 | 1 | 0.01 % | 2.83 | 6 | 0.10 % | 38.49 |
| pre | pre | pre |  | 107 | 0.25 % | 103.37 | 27 | 0.26 % | 117.12 | 24 | 0.21 % | 81.21 | 34 | 0.23 % | 96.28 | 22 | 0.35 % | 141.12 |
| uuu | uuu | uuu |  | 104 | 0.24 % | 100.47 | 58 | 0.57 % | 251.60 | 32 | 0.28 % | 108.28 | 8 | 0.05 % | 22.65 | 6 | 0.10 % | 38.49 |
| pri | pri | pri |  | 103 | 0.24 % | 99.51 | 17 | 0.17 % | 73.75 | 31 | 0.27 % | 104.89 | 35 | 0.24 % | 99.11 | 20 | 0.32 % | 128.29 |
| een | een | een |  | 64 | 0.15 % | 61.83 | 10 | 0.10 % | 43.38 | 10 | 0.09 % | 33.84 | 36 | 0.24 % | 101.94 | 8 | 0.13 % | 51.32 |
| City | city | Cit | y | 57 | 0.13 % | 55.07 | 55 | 0.54 % | 238.59 | 1 | 0.01 % | 3.38 | 1 | 0.01 % | 2.83 | 0 | 0 % | 0 |
| tavžent | tavžent | tav | žent | 56 | 0.13 % | 54.10 | 18 | 0.18 % | 78.08 | 32 | 0.28 % | 108.28 | 3 | 0.02 % | 8.49 | 3 | 0.05 % | 19.24 |
| you | you | you |  | 55 | 0.13 % | 53.13 | 45 | 0.44 % | 195.21 | 9 | 0.08 % | 30.45 | 1 | 0.01 % | 2.83 | 0 | 0 % | 0 |
| Capris | capris | Cap | ris | 48 | 0.11 % | 46.37 | 48 | 0.47 % | 208.22 | 0 | 0 % | 0 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| aam | aam | aam |  | 43 | 0.10 % | 41.54 | 5 | 0.05 % | 21.69 | 15 | 0.13 % | 50.75 | 18 | 0.12 % | 50.97 | 5 | 0.08 % | 32.07 |
| bot | bot | bot |  | 43 | 0.10 % | 41.54 | 4 | 0.04 % | 17.35 | 38 | 0.33 % | 128.58 | 0 | 0 % | 0 | 1 | 0.02 % | 6.41 |
| hmm | hmm | hmm |  | 43 | 0.10 % | 41.54 | 14 | 0.14 % | 60.73 | 25 | 0.22 % | 84.59 | 0 | 0 % | 0 | 4 | 0.06 % | 25.66 |
| ovi | ovi | ovi |  | 42 | 0.10 % | 40.58 | 2 | 0.02 % | 8.68 | 37 | 0.32 % | 125.19 | 3 | 0.02 % | 8.49 | 0 | 0 % | 0 |
| Healy | healy | Hea | ly | 41 | 0.10 % | 39.61 | 41 | 0.40 % | 177.86 | 0 | 0 % | 0 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| brezveze | brezveze | bre | zveze | 37 | 0.09 % | 35.75 | 3 | 0.03 % | 13.01 | 14 | 0.12 % | 47.37 | 9 | 0.06 % | 25.48 | 11 | 0.18 % | 70.56 |
| komot | komot | kom | ot | 37 | 0.09 % | 35.75 | 5 | 0.05 % | 21.69 | 26 | 0.23 % | 87.97 | 1 | 0.01 % | 2.83 | 5 | 0.08 % | 32.07 |
| anche | anche | anc | he | 34 | 0.08 % | 32.85 | 9 | 0.09 % | 39.04 | 25 | 0.22 % | 84.59 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| Poutiainen | poutiainen | Pou | tiainen | 33 | 0.08 % | 31.88 | 33 | 0.32 % | 143.15 | 0 | 0 % | 0 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| eeh | eeh | eeh |  | 33 | 0.08 % | 31.88 | 0 | 0 % | 0 | 19 | 0.17 % | 64.29 | 13 | 0.09 % | 36.81 | 1 | 0.02 % | 6.41 |
| hambrt | hambrt | ham | brt | 32 | 0.07 % | 30.91 | 0 | 0 % | 0 | 32 | 0.28 % | 108.28 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| direkt | direkt | dir | ekt | 31 | 0.07 % | 29.95 | 3 | 0.03 % | 13.01 | 23 | 0.20 % | 77.82 | 3 | 0.02 % | 8.49 | 2 | 0.03 % | 12.83 |
| Belvi | belvi | Bel | vi | 29 | 0.07 % | 28.02 | 29 | 0.28 % | 125.80 | 0 | 0 % | 0 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| ach | ach | ach |  | 29 | 0.07 % | 28.02 | 0 | 0 % | 0 | 29 | 0.25 % | 98.12 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| Feeney | feeney | Fee | ney | 27 | 0.06 % | 26.08 | 27 | 0.26 % | 117.12 | 0 | 0 % | 0 | 0 | 0 % | 0 | 0 | 0 % | 0 |

2.2.364 List of initial character-level 4-grams from residual lemmas in the GOS 1.0 corpus with text-type distribution

File at CLARIN.SI: GOS1.0-word_parts-residual-lemmas-initial-4grams-taxonomy-entire.tsv

| Lemma | Lemma (lower-case) | Initial part of the word | Rest of the word | Total absolute frequency of lemma | Percentage of all found lemmas | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N] |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| živjo | živjo | živj | o | 149 | 1.21 % | 143.95 | 74 | 1.68 % | 321.01 | 22 | 0.47 % | 74.44 | 35 | 1.68 % | 99.11 | 18 | 1.51 % | 115.46 |
| tipo | tipo | tipo |  | 145 | 1.17 % | 140.08 | 1 | 0.02 % | 4.34 | 144 | 3.08 % | 487.24 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| City | city | City |  | 57 | 0.46 % | 55.07 | 55 | 1.25 % | 238.59 | 1 | 0.02 % | 3.38 | 1 | 0.05 % | 2.83 | 0 | 0 % | 0 |
| tavžent | tavžent | tavž | ent | 56 | 0.45 % | 54.10 | 18 | 0.41 % | 78.08 | 32 | 0.69 % | 108.28 | 3 | 0.14 % | 8.49 | 3 | 0.25 % | 19.24 |
| Capris | capris | Capr | is | 48 | 0.39 % | 46.37 | 48 | 1.09 % | 208.22 | 0 | 0 % | 0 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| Healy | healy | Heal | y | 41 | 0.33 % | 39.61 | 41 | 0.93 % | 177.86 | 0 | 0 % | 0 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| brezveze | brezveze | brez | veze | 37 | 0.30 % | 35.75 | 3 | 0.07 % | 13.01 | 14 | 0.30 % | 47.37 | 9 | 0.43 % | 25.48 | 11 | 0.92 % | 70.56 |
| komot | komot | komo | t | 37 | 0.30 % | 35.75 | 5 | 0.11 % | 21.69 | 26 | 0.56 % | 87.97 | 1 | 0.05 % | 2.83 | 5 | 0.42 % | 32.07 |
| anche | anche | anch | e | 34 | 0.28 % | 32.85 | 9 | 0.20 % | 39.04 | 25 | 0.54 % | 84.59 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| Poutiainen | poutiainen | Pout | iainen | 33 | 0.27 % | 31.88 | 33 | 0.75 % | 143.15 | 0 | 0 % | 0 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| hambrt | hambrt | hamb | rt | 32 | 0.26 % | 30.91 | 0 | 0 % | 0 | 32 | 0.69 % | 108.28 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| direkt | direkt | dire | kt | 31 | 0.25 % | 29.95 | 3 | 0.07 % | 13.01 | 23 | 0.49 % | 77.82 | 3 | 0.14 % | 8.49 | 2 | 0.17 % | 12.83 |
| Belvi | belvi | Belv | i | 29 | 0.23 % | 28.02 | 29 | 0.66 % | 125.80 | 0 | 0 % | 0 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| Feeney | feeney | Feen | ey | 27 | 0.22 % | 26.08 | 27 | 0.61 % | 117.12 | 0 | 0 % | 0 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| frej | frej | frej |  | 27 | 0.22 % | 26.08 | 3 | 0.07 % | 13.01 | 24 | 0.51 % | 81.21 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| orenk | orenk | oren | k | 27 | 0.22 % | 26.08 | 11 | 0.25 % | 47.72 | 14 | 0.30 % | 47.37 | 0 | 0 % | 0 | 2 | 0.17 % | 12.83 |
| servus | servus | serv | us | 27 | 0.22 % | 26.08 | 9 | 0.20 % | 39.04 | 18 | 0.39 % | 60.91 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| alora | alora | alor | a | 26 | 0.21 % | 25.12 | 12 | 0.27 % | 52.06 | 14 | 0.30 % | 47.37 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| trebalo | trebalo | treb | alo | 25 | 0.20 % | 24.15 | 17 | 0.39 % | 73.75 | 8 | 0.17 % | 27.07 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| rajtam | rajtam | rajt | am | 24 | 0.19 % | 23.19 | 0 | 0 % | 0 | 24 | 0.51 % | 81.21 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| Boruc | boruc | Boru | c | 23 | 0.19 % | 22.22 | 23 | 0.52 % | 99.77 | 0 | 0 % | 0 | 0 | 0 % | 0 | 0 | 0 % | 0 |
genau genau gena u
23 0.19 % 22.22 0 0 % 0 23 0.49 % 77.82 0 0 % 0 0 0 % 0 mega mega mega 22 0.18 % 21.25 4 0.09 % 17.35 2 0.04 % 6.77 0 0 % 0 16 1.34 % 102.63 takvida takvida takv ida 22 0.18 % 21.25 0 0 % 0 22 0.47 % 74.44 0 0 % 0 0 0 % 0 Brant brant Bran t 21 0.17 % 20.29 21 0.48 % 91.10 0 0 % 0 0 0 % 0 0 0 % 0 ukvarja ukvarja ukva rja 21 0.17 % 20.29 6 0.14 % 26.03 2 0.04 % 6.77 8 0.38 % 22.65 5 0.42 % 32.07 fraj fraj fraj 20 0.16 % 19.32 6 0.14 % 26.03 13 0.28 % 43.99 0 0 % 0 1 0.08 % 6.41 Belviju belviju Belv iju 19 0.15 % 18.36 19 0.43 % 82.42 0 0 % 0 0 0 % 0 0 0 % 0 Zettel zettel Zett el 19 0.15 % 18.36 19 0.43 % 82.42 0 0 % 0 0 0 % 0 0 0 % 0 metiloranž metiloranž meti loranž 19 0.15 % 18.36 0 0 % 0 0 0 % 0 19 0.91 % 53.80 0 0 % 0 cumbaya cumbaya cumb aya 18 0.15 % 17.39 18 0.41 % 78.08 0 0 % 0 0 0 % 0 0 0 % 0 žijala žijala žija la 18 0.15 % 17.39 0 0 % 0 18 0.39 % 60.91 0 0 % 0 0 0 % 0 CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 708 File at CLARIN.SI2.2.365 List of initial character-level 5-grams from residual lemmas in the GOS 1.0 corpus with text-type distributionGOS1.0-word_parts-residual-lemmas- initial-5grams-taxonomy-entire.tsvLemma Lemma (lower-case) Initial part of the word Rest of the word Total absolute frequency of lemma Percentage of all found lemmas Total relative frequency (per million occurrences) Absolute frequency [gos.T.J.R] Percentage [gos.T.J.R] Relative frequency [gos.T.J.R] Absolute frequency [gos.T.N.Z] Percentage [gos.T.N.Z] Relative frequency [gos.T.N.Z] Absolute frequency [gos.T.J.I] Percentage [gos.T.J.I] Relative frequency [gos.T.J.I] Absolute frequency [gos.T.N.N] Percentage [gos.T.N.N] Relative frequency [gos.T.N.N] živjo živjo živjo 149 1.51 % 143.95 74 2.08 % 321.01 22 0.59 % 74.44 35 2.12 % 99.11 18 1.89 % 115.46 tavžent tavžent tavže nt 56 0.57 % 54.10 18 0.51 % 78.08 32 0.86 % 108.28 3 0.18 % 8.49 3 0.32 % 19.24 Capris capris Capri s 48 0.49 % 46.37 48 1.35 % 208.22 0 0 % 0 0 0 % 0 0 0 % 0 Healy healy 
Healy 41 0.42 % 39.61 41 1.15 % 177.86 0 0 % 0 0 0 % 0 0 0 % 0 brezveze brezveze brezv eze 37 0.38 % 35.75 3 0.08 % 13.01 14 0.38 % 47.37 9 0.55 % 25.48 11 1.16 % 70.56 komot komot komot 37 0.38 % 35.75 5 0.14 % 21.69 26 0.70 % 87.97 1 0.06 % 2.83 5 0.53 % 32.07 anche anche anche 34 0.34 % 32.85 9 0.25 % 39.04 25 0.68 % 84.59 0 0 % 0 0 0 % 0 Poutiainen poutiainen Pouti ainen 33 0.34 % 31.88 33 0.93 % 143.15 0 0 % 0 0 0 % 0 0 0 % 0 hambrt hambrt hambr t 32 0.32 % 30.91 0 0 % 0 32 0.86 % 108.28 0 0 % 0 0 0 % 0 direkt direkt direk t 31 0.31 % 29.95 3 0.08 % 13.01 23 0.62 % 77.82 3 0.18 % 8.49 2 0.21 % 12.83 Belvi belvi Belvi 29 0.29 % 28.02 29 0.81 % 125.80 0 0 % 0 0 0 % 0 0 0 % 0 Feeney feeney Feene y 27 0.27 % 26.08 27 0.76 % 117.12 0 0 % 0 0 0 % 0 0 0 % 0 orenk orenk orenk 27 0.27 % 26.08 11 0.31 % 47.72 14 0.38 % 47.37 0 0 % 0 2 0.21 % 12.83 servus servus servu s 27 0.27 % 26.08 9 0.25 % 39.04 18 0.49 % 60.91 0 0 % 0 0 0 % 0 alora alora alora 26 0.26 % 25.12 12 0.34 % 52.06 14 0.38 % 47.37 0 0 % 0 0 0 % 0 trebalo trebalo treba lo 25 0.25 % 24.15 17 0.48 % 73.75 8 0.22 % 27.07 0 0 % 0 0 0 % 0 rajtam rajtam rajta m 24 0.24 % 23.19 0 0 % 0 24 0.65 % 81.21 0 0 % 0 0 0 % 0 Boruc boruc Boruc 23 0.23 % 22.22 23 0.65 % 99.77 0 0 % 0 0 0 % 0 0 0 % 0 genau genau genau 23 0.23 % 22.22 0 0 % 0 23 0.62 % 77.82 0 0 % 0 0 0 % 0 takvida takvida takvi da 22 0.22 % 21.25 0 0 % 0 22 0.59 % 74.44 0 0 % 0 0 0 % 0 Brant brant Brant 21 0.21 % 20.29 21 0.59 % 91.10 0 0 % 0 0 0 % 0 0 0 % 0 ukvarja ukvarja ukvar ja 21 0.21 % 20.29 6 0.17 % 26.03 2 0.05 % 6.77 8 0.48 % 22.65 5 0.53 % 32.07 Belviju belviju Belvi ju 19 0.19 % 18.36 19 0.53 % 82.42 0 0 % 0 0 0 % 0 0 0 % 0 Zettel zettel Zette l 19 0.19 % 18.36 19 0.53 % 82.42 0 0 % 0 0 0 % 0 0 0 % 0 metiloranž metiloranž metil oranž 19 0.19 % 18.36 0 0 % 0 0 0 % 0 19 1.15 % 53.80 0 0 % 0 cumbaya cumbaya cumba ya 18 0.18 % 17.39 18 0.51 % 78.08 0 0 % 0 0 0 % 0 0 0 % 0 žijala žijala žijal a 18 0.18 % 17.39 0 0 % 0 18 0.49 % 60.91 0 0 % 0 0 0 % 0 
| invece | invece | invec | e | 17 | 0.17 % | 16.42 | 0 | 0 % | 0 | 17 | 0.46 % | 57.52 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| devetnajststo | devetnajststo | devet | najststo | 16 | 0.16 % | 15.46 | 2 | 0.06 % | 8.68 | 0 | 0 % | 0 | 14 | 0.85 % | 39.64 | 0 | 0 % | 0 |
| luškan | luškan | luška | n | 16 | 0.16 % | 15.46 | 9 | 0.25 % | 39.04 | 1 | 0.03 % | 3.38 | 1 | 0.06 % | 2.83 | 5 | 0.53 % | 32.07 |
| magar | magar | magar | | 16 | 0.16 % | 15.46 | 2 | 0.06 % | 8.68 | 11 | 0.30 % | 37.22 | 1 | 0.06 % | 2.83 | 2 | 0.21 % | 12.83 |
| žijaš | žijaš | žijaš | | 16 | 0.16 % | 15.46 | 0 | 0 % | 0 | 16 | 0.43 % | 54.14 | 0 | 0 % | 0 | 0 | 0 % | 0 |

2.2.366 List of final character-level 1-grams from residual lemmas in the GOS 1.0 corpus with text-type distribution

File at CLARIN.SI: GOS1.0-word_parts-residual-lemmas-final-1grams-taxonomy-entire.tsv

| Lemma | Lemma (lower-case) | Rest of the word | Final part of the word | Total absolute frequency of lemma | Percentage of all found lemmas | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N] |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| eee | eee | ee | e | 23,222 | 42.92 % | 22,434.40 | 4,248 | 33.09 % | 18,427.66 | 4,527 | 29.40 % | 15,317.62 | 10,556 | 59.40 % | 29,891.07 | 3,891 | 48.07 % | 24,959.43 |
| eem | eem | ee | m | 2,950 | 5.45 % | 2,849.95 | 392 | 3.05 % | 1,700.48 | 713 | 4.63 % | 2,412.52 | 1,210 | 6.81 % | 3,426.32 | 635 | 7.84 % | 4,073.31 |
| s | s | | s | 1,327 | 2.45 % | 1,281.99 | 213 | 1.66 % | 923.99 | 488 | 3.17 % | 1,651.20 | 373 | 2.10 % | 1,056.21 | 253 | 3.12 % | 1,622.91 |
| ka | ka | k | a | 820 | 1.52 % | 792.19 | 466 | 3.63 % | 2,021.49 | 280 | 1.82 % | 947.41 | 42 | 0.24 % | 118.93 | 32 | 0.40 % | 205.27 |
| n | n | | n | 752 | 1.39 % | 726.49 | 107 | 0.83 % | 464.16 | 299 | 1.94 % | 1,011.70 | 193 | 1.09 % | 546.51 | 153 | 1.89 % | 981.44 |
| p | p | | p | 497 | 0.92 % | 480.14 | 81 | 0.63 % | 351.37 | 167 | 1.08 % | 565.06 | 141 | 0.79 % | 399.26 | 108 | 1.33 % | 692.78 |
| z | z | | z | 486 | 0.90 % | 469.52 | 79 | 0.61 % | 342.70 | 159 | 1.03 % | 537.99 | 150 | 0.84 % | 424.75 | 98 | 1.21 % | 628.64 |
| t | t | | t | 463 | 0.86 % | 447.30 | 76 | 0.59 % | 329.69 | 186 | 1.21 % | 629.35 | 91 | 0.51 % | 257.68 | 110 | 1.36 % | 705.61 |
| m | m | | m | 422 | 0.78 % | 407.69 | 79 | 0.61 % | 342.70 | 160 | 1.04 % | 541.38 | 93 | 0.52 % | 263.34 | 90 | 1.11 % | 577.32 |
| j | j | | j | 364 | 0.67 % | 351.65 | 89 | 0.69 % | 386.08 | 136 | 0.88 % | 460.17 | 75 | 0.42 % | 212.37 | 64 | 0.79 % | 410.54 |
| k | k | | k | 363 | 0.67 % | 350.69 | 72 | 0.56 % | 312.33 | 123 | 0.80 % | 416.18 | 92 | 0.52 % | 260.51 | 76 | 0.94 % | 487.51 |
| v | v | | v | 332 | 0.61 % | 320.74 | 67 | 0.52 % | 290.64 | 92 | 0.60 % | 311.29 | 119 | 0.67 % | 336.97 | 54 | 0.67 % | 346.39 |
| da | da | d | a | 331 | 0.61 % | 319.77 | 61 | 0.47 % | 264.62 | 129 | 0.84 % | 436.49 | 90 | 0.51 % | 254.85 | 51 | 0.63 % | 327.15 |
| aaa | aaa | aa | a | 288 | 0.53 % | 278.23 | 67 | 0.52 % | 290.64 | 95 | 0.62 % | 321.44 | 95 | 0.54 % | 269.01 | 31 | 0.38 % | 198.85 |
| nnn | nnn | nn | n | 240 | 0.44 % | 231.86 | 30 | 0.23 % | 130.14 | 99 | 0.64 % | 334.98 | 59 | 0.33 % | 167.07 | 52 | 0.64 % | 333.56 |
| e | e | | e | 224 | 0.41 % | 216.40 | 63 | 0.49 % | 273.29 | 96 | 0.62 % | 324.83 | 40 | 0.23 % | 113.27 | 25 | 0.31 % | 160.37 |
| po | po | p | o | 210 | 0.39 % | 202.88 | 35 | 0.27 % | 151.83 | 55 | 0.36 % | 186.10 | 80 | 0.45 % | 226.53 | 40 | 0.49 % | 256.59 |
| o | o | | o | 199 | 0.37 % | 192.25 | 32 | 0.25 % | 138.81 | 52 | 0.34 % | 175.95 | 69 | 0.39 % | 195.38 | 46 | 0.57 % | 295.07 |
| kao | kao | ka | o | 198 | 0.37 % | 191.28 | 9 | 0.07 % | 39.04 | 165 | 1.07 % | 558.30 | 5 | 0.03 % | 14.16 | 19 | 0.23 % | 121.88 |
| š | š | | š | 173 | 0.32 % | 167.13 | 34 | 0.27 % | 147.49 | 65 | 0.42 % | 219.93 | 37 | 0.21 % | 104.77 | 37 | 0.46 % | 237.34 |
| ooo | ooo | oo | o | 166 | 0.31 % | 160.37 | 83 | 0.65 % | 360.05 | 57 | 0.37 % | 192.87 | 10 | 0.06 % | 28.32 | 16 | 0.20 % | 102.63 |
| b | b | | b | 165 | 0.30 % | 159.40 | 50 | 0.39 % | 216.90 | 38 | 0.25 % | 128.58 | 71 | 0.40 % | 201.05 | 6 | 0.07 % | 38.49 |
| bi | bi | b | i | 160 | 0.30 % | 154.57 | 21 | 0.16 % | 91.10 | 74 | 0.48 % | 250.39 | 18 | 0.10 % | 50.97 | 47 | 0.58 % | 301.49 |
| i | i | | i | 154 | 0.28 % | 148.78 | 25 | 0.20 % | 108.45 | 50 | 0.33 % | 169.18 | 46 | 0.26 % | 130.26 | 33 | 0.41 % | 211.68 |
| živjo | živjo | živj | o | 149 | 0.28 % | 143.95 | 74 | 0.58 % | 321.01 | 22 | 0.14 % | 74.44 | 35 | 0.20 % | 99.11 | 18 | 0.22 % | 115.46 |
| tipo | tipo | tip | o | 145 | 0.27 % | 140.08 | 1 | 0.01 % | 4.34 | 144 | 0.94 % | 487.24 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| na | na | n | a | 143 | 0.26 % | 138.15 | 18 | 0.14 % | 78.08 | 49 | 0.32 % | 165.80 | 46 | 0.26 % | 130.26 | 30 | 0.37 % | 192.44 |
| the | the | th | e | 127 | 0.23 % | 122.69 | 104 | 0.81 % | 451.15 | 16 | 0.10 % | 54.14 | 1 | 0.01 % | 2.83 | 6 | 0.07 % | 38.49 |
| B | b | | B | 112 | 0.21 % | 108.20 | 23 | 0.18 % | 99.77 | 3 | 0.02 % | 10.15 | 68 | 0.38 % | 192.55 | 18 | 0.22 % | 115.46 |
| za | za | z | a | 108 | 0.20 % | 104.34 | 23 | 0.18 % | 99.77 | 29 | 0.19 % | 98.12 | 34 | 0.19 % | 96.28 | 22 | 0.27 % | 141.12 |
| pre | pre | pr | e | 107 | 0.20 % | 103.37 | 27 | 0.21 % | 117.12 | 24 | 0.16 % | 81.21 | 34 | 0.19 % | 96.28 | 22 | 0.27 % | 141.12 |
| uuu | uuu | uu | u | 104 | 0.19 % | 100.47 | 58 | 0.45 % | 251.60 | 32 | 0.21 % | 108.28 | 8 | 0.04 % | 22.65 | 6 | 0.07 % | 38.49 |

2.2.367 List of final character-level 2-grams from residual lemmas in the GOS 1.0 corpus with text-type distribution

File at CLARIN.SI: GOS1.0-word_parts-residual-lemmas-final-2grams-taxonomy-entire.tsv

| Lemma | Lemma (lower-case) | Rest of the word | Final part of the word | Total absolute frequency of lemma | Percentage of all found lemmas | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N] |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| eee | eee | e | ee | 23,222 | 49.31 % | 22,434.40 | 4,248 | 36.78 % | 18,427.66 | 4,527 | 34.77 % | 15,317.62 | 10,556 | 67.11 % | 29,891.07 | 3,891 | 57.24 % | 24,959.43 |
| eem | eem | e | em | 2,950 | 6.26 % | 2,849.95 | 392 | 3.39 % | 1,700.48 | 713 | 5.48 % | 2,412.52 | 1,210 | 7.69 % | 3,426.32 | 635 | 9.34 % | 4,073.31 |
| ka | ka | | ka | 820 | 1.74 % | 792.19 | 466 | 4.04 % | 2,021.49 | 280 | 2.15 % | 947.41 | 42 | 0.27 % | 118.93 | 32 | 0.47 % | 205.27 |
| da | da | | da | 331 | 0.70 % | 319.77 | 61 | 0.53 % | 264.62 | 129 | 0.99 % | 436.49 | 90 | 0.57 % | 254.85 | 51 | 0.75 % | 327.15 |
| aaa | aaa | a | aa | 288 | 0.61 % | 278.23 | 67 | 0.58 % | 290.64 | 95 | 0.73 % | 321.44 | 95 | 0.60 % | 269.01 | 31 | 0.46 % | 198.85 |
| nnn | nnn | n | nn | 240 | 0.51 % | 231.86 | 30 | 0.26 % | 130.14 | 99 | 0.76 % | 334.98 | 59 | 0.38 % | 167.07 | 52 | 0.77 % | 333.56 |
| po | po | | po | 210 | 0.45 % | 202.88 | 35 | 0.30 % | 151.83 | 55 | 0.42 % | 186.10 | 80 | 0.51 % | 226.53 | 40 | 0.59 % | 256.59 |
| kao | kao | k | ao | 198 | 0.42 % | 191.28 | 9 | 0.08 % | 39.04 | 165 | 1.27 % | 558.30 | 5 | 0.03 % | 14.16 | 19 | 0.28 % | 121.88 |
| ooo | ooo | o | oo | 166 | 0.35 % | 160.37 | 83 | 0.72 % | 360.05 | 57 | 0.44 % | 192.87 | 10 | 0.06 % | 28.32 | 16 | 0.23 % | 102.63 |
| bi | bi | | bi | 160 | 0.34 % | 154.57 | 21 | 0.18 % | 91.10 | 74 | 0.57 % | 250.39 | 18 | 0.11 % | 50.97 | 47 | 0.69 % | 301.49 |
| živjo | živjo | živ | jo | 149 | 0.32 % | 143.95 | 74 | 0.64 % | 321.01 | 22 | 0.17 % | 74.44 | 35 | 0.22 % | 99.11 | 18 | 0.27 % | 115.46 |
| tipo | tipo | ti | po | 145 | 0.31 % | 140.08 | 1 | 0.01 % | 4.34 | 144 | 1.11 % | 487.24 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| na | na | | na | 143 | 0.30 % | 138.15 | 18 | 0.16 % | 78.08 | 49 | 0.38 % | 165.80 | 46 | 0.29 % | 130.26 | 30 | 0.44 % | 192.44 |
| the | the | t | he | 127 | 0.27 % | 122.69 | 104 | 0.90 % | 451.15 | 16 | 0.12 % | 54.14 | 1 | 0.01 % | 2.83 | 6 | 0.09 % | 38.49 |
| za | za | | za | 108 | 0.23 % | 104.34 | 23 | 0.20 % | 99.77 | 29 | 0.22 % | 98.12 | 34 | 0.22 % | 96.28 | 22 | 0.32 % | 141.12 |
| pre | pre | p | re | 107 | 0.23 % | 103.37 | 27 | 0.23 % | 117.12 | 24 | 0.18 % | 81.21 | 34 | 0.22 % | 96.28 | 22 | 0.32 % | 141.12 |
| uuu | uuu | u | uu | 104 | 0.22 % | 100.47 | 58 | 0.50 % | 251.60 | 32 | 0.25 % | 108.28 | 8 | 0.05 % | 22.65 | 6 | 0.09 % | 38.49 |
| pri | pri | p | ri | 103 | 0.22 % | 99.51 | 17 | 0.15 % | 73.75 | 31 | 0.24 % | 104.89 | 35 | 0.22 % | 99.11 | 20 | 0.29 % | 128.29 |
| ne | ne | | ne | 88 | 0.19 % | 85.02 | 23 | 0.20 % | 99.77 | 23 | 0.18 % | 77.82 | 29 | 0.18 % | 82.12 | 13 | 0.19 % | 83.39 |
| re | re | | re | 76 | 0.16 % | 73.42 | 20 | 0.17 % | 86.76 | 21 | 0.16 % | 71.06 | 22 | 0.14 % | 62.30 | 13 | 0.19 % | 83.39 |
| do | do | | do | 68 | 0.14 % | 65.69 | 18 | 0.16 % | 78.08 | 20 | 0.15 % | 67.67 | 17 | 0.11 % | 48.14 | 13 | 0.19 % | 83.39 |
| een | een | e | en | 64 | 0.14 % | 61.83 | 10 | 0.09 % | 43.38 | 10 | 0.08 % | 33.84 | 36 | 0.23 % | 101.94 | 8 | 0.12 % | 51.32 |
| ma | ma | | ma | 64 | 0.14 % | 61.83 | 19 | 0.17 % | 82.42 | 29 | 0.22 % | 98.12 | 8 | 0.05 % | 22.65 | 8 | 0.12 % | 51.32 |
| yo | yo | | yo | 64 | 0.14 % | 61.83 | 62 | 0.54 % | 268.95 | 0 | 0 % | 0 | 0 | 0 % | 0 | 2 | 0.03 % | 12.83 |
| ta | ta | | ta | 60 | 0.13 % | 57.97 | 13 | 0.11 % | 56.39 | 27 | 0.21 % | 91.36 | 12 | 0.08 % | 33.98 | 8 | 0.12 % | 51.32 |
| City | city | Ci | ty | 57 | 0.12 % | 55.07 | 55 | 0.48 % | 238.59 | 1 | 0.01 % | 3.38 | 1 | 0.01 % | 2.83 | 0 | 0 % | 0 |
| te | te | | te | 57 | 0.12 % | 55.07 | 8 | 0.07 % | 34.70 | 19 | 0.15 % | 64.29 | 20 | 0.13 % | 56.63 | 10 | 0.15 % | 64.15 |
| se | se | | se | 56 | 0.12 % | 54.10 | 14 | 0.12 % | 60.73 | 24 | 0.18 % | 81.21 | 10 | 0.06 % | 28.32 | 8 | 0.12 % | 51.32 |
| tavžent | tavžent | tavže | nt | 56 | 0.12 % | 54.10 | 18 | 0.16 % | 78.08 | 32 | 0.25 % | 108.28 | 3 | 0.02 % | 8.49 | 3 | 0.04 % | 19.24 |
| you | you | y | ou | 55 | 0.12 % | 53.13 | 45 | 0.39 % | 195.21 | 9 | 0.07 % | 30.45 | 1 | 0.01 % | 2.83 | 0 | 0 % | 0 |
| mo | mo | | mo | 53 | 0.11 % | 51.20 | 12 | 0.10 % | 52.06 | 18 | 0.14 % | 60.91 | 10 | 0.06 % | 28.32 | 13 | 0.19 % | 83.39 |
| ko | ko | | ko | 52 | 0.11 % | 50.24 | 15 | 0.13 % | 65.07 | 12 | 0.09 % | 40.60 | 14 | 0.09 % | 39.64 | 11 | 0.16 % | 70.56 |

2.2.368 List of final character-level 3-grams from residual lemmas in the GOS 1.0 corpus with text-type distribution

File at CLARIN.SI: GOS1.0-word_parts-residual-lemmas-final-3grams-taxonomy-entire.tsv

| Lemma | Lemma (lower-case) | Rest of the word | Final part of the word | Total absolute frequency of lemma | Percentage of all found lemmas | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N] |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| eee | eee | | eee | 23,222 | 54.38 % | 22,434.40 | 4,248 | 41.49 % | 18,427.66 | 4,527 | 39.41 % | 15,317.62 | 10,556 | 71.44 % | 29,891.07 | 3,891 | 62.74 % | 24,959.43 |
| eem | eem | | eem | 2,950 | 6.91 % | 2,849.95 | 392 | 3.83 % | 1,700.48 | 713 | 6.21 % | 2,412.52 | 1,210 | 8.19 % | 3,426.32 | 635 | 10.24 % | 4,073.31 |
| aaa | aaa | | aaa | 288 | 0.67 % | 278.23 | 67 | 0.65 % | 290.64 | 95 | 0.83 % | 321.44 | 95 | 0.64 % | 269.01 | 31 | 0.50 % | 198.85 |
| nnn | nnn | | nnn | 240 | 0.56 % | 231.86 | 30 | 0.29 % | 130.14 | 99 | 0.86 % | 334.98 | 59 | 0.40 % | 167.07 | 52 | 0.84 % | 333.56 |
| kao | kao | | kao | 198 | 0.46 % | 191.28 | 9 | 0.09 % | 39.04 | 165 | 1.44 % | 558.30 | 5 | 0.03 % | 14.16 | 19 | 0.31 % | 121.88 |
| ooo | ooo | | ooo | 166 | 0.39 % | 160.37 | 83 | 0.81 % | 360.05 | 57 | 0.50 % | 192.87 | 10 | 0.07 % | 28.32 | 16 | 0.26 % | 102.63 |
| živjo | živjo | ži | vjo | 149 | 0.35 % | 143.95 | 74 | 0.72 % | 321.01 | 22 | 0.19 % | 74.44 | 35 | 0.24 % | 99.11 | 18 | 0.29 % | 115.46 |
| tipo | tipo | t | ipo | 145 | 0.34 % | 140.08 | 1 | 0.01 % | 4.34 | 144 | 1.25 % | 487.24 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| the | the | | the | 127 | 0.30 % | 122.69 | 104 | 1.02 % | 451.15 | 16 | 0.14 % | 54.14 | 1 | 0.01 % | 2.83 | 6 | 0.10 % | 38.49 |
| pre | pre | | pre | 107 | 0.25 % | 103.37 | 27 | 0.26 % | 117.12 | 24 | 0.21 % | 81.21 | 34 | 0.23 % | 96.28 | 22 | 0.35 % | 141.12 |
| uuu | uuu | | uuu | 104 | 0.24 % | 100.47 | 58 | 0.57 % | 251.60 | 32 | 0.28 % | 108.28 | 8 | 0.05 % | 22.65 | 6 | 0.10 % | 38.49 |
| pri | pri | | pri | 103 | 0.24 % | 99.51 | 17 | 0.17 % | 73.75 | 31 | 0.27 % | 104.89 | 35 | 0.24 % | 99.11 | 20 | 0.32 % | 128.29 |
| een | een | | een | 64 | 0.15 % | 61.83 | 10 | 0.10 % | 43.38 | 10 | 0.09 % | 33.84 | 36 | 0.24 % | 101.94 | 8 | 0.13 % | 51.32 |
| City | city | C | ity | 57 | 0.13 % | 55.07 | 55 | 0.54 % | 238.59 | 1 | 0.01 % | 3.38 | 1 | 0.01 % | 2.83 | 0 | 0 % | 0 |
| tavžent | tavžent | tavž | ent | 56 | 0.13 % | 54.10 | 18 | 0.18 % | 78.08 | 32 | 0.28 % | 108.28 | 3 | 0.02 % | 8.49 | 3 | 0.05 % | 19.24 |
| you | you | | you | 55 | 0.13 % | 53.13 | 45 | 0.44 % | 195.21 | 9 | 0.08 % | 30.45 | 1 | 0.01 % | 2.83 | 0 | 0 % | 0 |
| Capris | capris | Cap | ris | 48 | 0.11 % | 46.37 | 48 | 0.47 % | 208.22 | 0 | 0 % | 0 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| aam | aam | | aam | 43 | 0.10 % | 41.54 | 5 | 0.05 % | 21.69 | 15 | 0.13 % | 50.75 | 18 | 0.12 % | 50.97 | 5 | 0.08 % | 32.07 |
| bot | bot | | bot | 43 | 0.10 % | 41.54 | 4 | 0.04 % | 17.35 | 38 | 0.33 % | 128.58 | 0 | 0 % | 0 | 1 | 0.02 % | 6.41 |
| hmm | hmm | | hmm | 43 | 0.10 % | 41.54 | 14 | 0.14 % | 60.73 | 25 | 0.22 % | 84.59 | 0 | 0 % | 0 | 4 | 0.06 % | 25.66 |
| ovi | ovi | | ovi | 42 | 0.10 % | 40.58 | 2 | 0.02 % | 8.68 | 37 | 0.32 % | 125.19 | 3 | 0.02 % | 8.49 | 0 | 0 % | 0 |
| Healy | healy | He | aly | 41 | 0.10 % | 39.61 | 41 | 0.40 % | 177.86 | 0 | 0 % | 0 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| brezveze | brezveze | brezv | eze | 37 | 0.09 % | 35.75 | 3 | 0.03 % | 13.01 | 14 | 0.12 % | 47.37 | 9 | 0.06 % | 25.48 | 11 | 0.18 % | 70.56 |
| komot | komot | ko | mot | 37 | 0.09 % | 35.75 | 5 | 0.05 % | 21.69 | 26 | 0.23 % | 87.97 | 1 | 0.01 % | 2.83 | 5 | 0.08 % | 32.07 |
| anche | anche | an | che | 34 | 0.08 % | 32.85 | 9 | 0.09 % | 39.04 | 25 | 0.22 % | 84.59 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| Poutiainen | poutiainen | Poutiai | nen | 33 | 0.08 % | 31.88 | 33 | 0.32 % | 143.15 | 0 | 0 % | 0 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| eeh | eeh | | eeh | 33 | 0.08 % | 31.88 | 0 | 0 % | 0 | 19 | 0.17 % | 64.29 | 13 | 0.09 % | 36.81 | 1 | 0.02 % | 6.41 |
| hambrt | hambrt | ham | brt | 32 | 0.07 % | 30.91 | 0 | 0 % | 0 | 32 | 0.28 % | 108.28 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| direkt | direkt | dir | ekt | 31 | 0.07 % | 29.95 | 3 | 0.03 % | 13.01 | 23 | 0.20 % | 77.82 | 3 | 0.02 % | 8.49 | 2 | 0.03 % | 12.83 |
| Belvi | belvi | Be | lvi | 29 | 0.07 % | 28.02 | 29 | 0.28 % | 125.80 | 0 | 0 % | 0 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| ach | ach | | ach | 29 | 0.07 % | 28.02 | 0 | 0 % | 0 | 29 | 0.25 % | 98.12 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| Feeney | feeney | Fee | ney | 27 | 0.06 % | 26.08 | 27 | 0.26 % | 117.12 | 0 | 0 % | 0 | 0 | 0 % | 0 | 0 | 0 % | 0 |

2.2.369 List of final character-level 4-grams from residual lemmas in the GOS 1.0 corpus with text-type distribution

File at CLARIN.SI: GOS1.0-word_parts-residual-lemmas-final-4grams-taxonomy-entire.tsv

| Lemma | Lemma (lower-case) | Rest of the word | Final part of the word | Total absolute frequency of lemma | Percentage of all found lemmas | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N] |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| živjo | živjo | ž | ivjo | 149 | 1.21 % | 143.95 | 74 | 1.68 % | 321.01 | 22 | 0.47 % | 74.44 | 35 | 1.68 % | 99.11 | 18 | 1.51 % | 115.46 |
| tipo | tipo | | tipo | 145 | 1.17 % | 140.08 | 1 | 0.02 % | 4.34 | 144 | 3.08 % | 487.24 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| City | city | | City | 57 | 0.46 % | 55.07 | 55 | 1.25 % | 238.59 | 1 | 0.02 % | 3.38 | 1 | 0.05 % | 2.83 | 0 | 0 % | 0 |
| tavžent | tavžent | tav | žent | 56 | 0.45 % | 54.10 | 18 | 0.41 % | 78.08 | 32 | 0.69 % | 108.28 | 3 | 0.14 % | 8.49 | 3 | 0.25 % | 19.24 |
| Capris | capris | Ca | pris | 48 | 0.39 % | 46.37 | 48 | 1.09 % | 208.22 | 0 | 0 % | 0 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| Healy | healy | H | ealy | 41 | 0.33 % | 39.61 | 41 | 0.93 % | 177.86 | 0 | 0 % | 0 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| brezveze | brezveze | brez | veze | 37 | 0.30 % | 35.75 | 3 | 0.07 % | 13.01 | 14 | 0.30 % | 47.37 | 9 | 0.43 % | 25.48 | 11 | 0.92 % | 70.56 |
| komot | komot | k | omot | 37 | 0.30 % | 35.75 | 5 | 0.11 % | 21.69 | 26 | 0.56 % | 87.97 | 1 | 0.05 % | 2.83 | 5 | 0.42 % | 32.07 |
| anche | anche | a | nche | 34 | 0.28 % | 32.85 | 9 | 0.20 % | 39.04 | 25 | 0.54 % | 84.59 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| Poutiainen | poutiainen | Poutia | inen | 33 | 0.27 % | 31.88 | 33 | 0.75 % | 143.15 | 0 | 0 % | 0 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| hambrt | hambrt | ha | mbrt | 32 | 0.26 % | 30.91 | 0 | 0 % | 0 | 32 | 0.69 % | 108.28 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| direkt | direkt | di | rekt | 31 | 0.25 % | 29.95 | 3 | 0.07 % | 13.01 | 23 | 0.49 % | 77.82 | 3 | 0.14 % | 8.49 | 2 | 0.17 % | 12.83 |
| Belvi | belvi | B | elvi | 29 | 0.23 % | 28.02 | 29 | 0.66 % | 125.80 | 0 | 0 % | 0 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| Feeney | feeney | Fe | eney | 27 | 0.22 % | 26.08 | 27 | 0.61 % | 117.12 | 0 | 0 % | 0 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| frej | frej | | frej | 27 | 0.22 % | 26.08 | 3 | 0.07 % | 13.01 | 24 | 0.51 % | 81.21 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| orenk | orenk | o | renk | 27 | 0.22 % | 26.08 | 11 | 0.25 % | 47.72 | 14 | 0.30 % | 47.37 | 0 | 0 % | 0 | 2 | 0.17 % | 12.83 |
| servus | servus | se | rvus | 27 | 0.22 % | 26.08 | 9 | 0.20 % | 39.04 | 18 | 0.39 % | 60.91 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| alora | alora | a | lora | 26 | 0.21 % | 25.12 | 12 | 0.27 % | 52.06 | 14 | 0.30 % | 47.37 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| trebalo | trebalo | tre | balo | 25 | 0.20 % | 24.15 | 17 | 0.39 % | 73.75 | 8 | 0.17 % | 27.07 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| rajtam | rajtam | ra | jtam | 24 | 0.19 % | 23.19 | 0 | 0 % | 0 | 24 | 0.51 % | 81.21 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| Boruc | boruc | B | oruc | 23 | 0.19 % | 22.22 | 23 | 0.52 % | 99.77 | 0 | 0 % | 0 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| genau | genau | g | enau | 23 | 0.19 % | 22.22 | 0 | 0 % | 0 | 23 | 0.49 % | 77.82 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| mega | mega | | mega | 22 | 0.18 % | 21.25 | 4 | 0.09 % | 17.35 | 2 | 0.04 % | 6.77 | 0 | 0 % | 0 | 16 | 1.34 % | 102.63 |
| takvida | takvida | tak | vida | 22 | 0.18 % | 21.25 | 0 | 0 % | 0 | 22 | 0.47 % | 74.44 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| Brant | brant | B | rant | 21 | 0.17 % | 20.29 | 21 | 0.48 % | 91.10 | 0 | 0 % | 0 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| ukvarja | ukvarja | ukv | arja | 21 | 0.17 % | 20.29 | 6 | 0.14 % | 26.03 | 2 | 0.04 % | 6.77 | 8 | 0.38 % | 22.65 | 5 | 0.42 % | 32.07 |
| fraj | fraj | | fraj | 20 | 0.16 % | 19.32 | 6 | 0.14 % | 26.03 | 13 | 0.28 % | 43.99 | 0 | 0 % | 0 | 1 | 0.08 % | 6.41 |
| Belviju | belviju | Bel | viju | 19 | 0.15 % | 18.36 | 19 | 0.43 % | 82.42 | 0 | 0 % | 0 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| Zettel | zettel | Ze | ttel | 19 | 0.15 % | 18.36 | 19 | 0.43 % | 82.42 | 0 | 0 % | 0 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| metiloranž | metiloranž | metilo | ranž | 19 | 0.15 % | 18.36 | 0 | 0 % | 0 | 0 | 0 % | 0 | 19 | 0.91 % | 53.80 | 0 | 0 % | 0 |
| cumbaya | cumbaya | cum | baya | 18 | 0.15 % | 17.39 | 18 | 0.41 % | 78.08 | 0 | 0 % | 0 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| žijala | žijala | ži | jala | 18 | 0.15 % | 17.39 | 0 | 0 % | 0 | 18 | 0.39 % | 60.91 | 0 | 0 % | 0 | 0 | 0 % | 0 |

2.2.370 List of final character-level 5-grams from residual lemmas in the GOS 1.0 corpus with text-type distribution

File at CLARIN.SI: GOS1.0-word_parts-residual-lemmas-final-5grams-taxonomy-entire.tsv

| Lemma | Lemma (lower-case) | Rest of the word | Final part of the word | Total absolute frequency of lemma | Percentage of all found lemmas | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N] |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| živjo | živjo | | živjo | 149 | 1.51 % | 143.95 | 74 | 2.08 % | 321.01 | 22 | 0.59 % | 74.44 | 35 | 2.12 % | 99.11 | 18 | 1.89 % | 115.46 |
| tavžent | tavžent | ta | vžent | 56 | 0.57 % | 54.10 | 18 | 0.51 % | 78.08 | 32 | 0.86 % | 108.28 | 3 | 0.18 % | 8.49 | 3 | 0.32 % | 19.24 |
| Capris | capris | C | apris | 48 | 0.49 % | 46.37 | 48 | 1.35 % | 208.22 | 0 | 0 % | 0 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| Healy | healy | | Healy | 41 | 0.42 % | 39.61 | 41 | 1.15 % | 177.86 | 0 | 0 % | 0 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| brezveze | brezveze | bre | zveze | 37 | 0.38 % | 35.75 | 3 | 0.08 % | 13.01 | 14 | 0.38 % | 47.37 | 9 | 0.55 % | 25.48 | 11 | 1.16 % | 70.56 |
| komot | komot | | komot | 37 | 0.38 % | 35.75 | 5 | 0.14 % | 21.69 | 26 | 0.70 % | 87.97 | 1 | 0.06 % | 2.83 | 5 | 0.53 % | 32.07 |
| anche | anche | | anche | 34 | 0.34 % | 32.85 | 9 | 0.25 % | 39.04 | 25 | 0.68 % | 84.59 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| Poutiainen | poutiainen | Pouti | ainen | 33 | 0.34 % | 31.88 | 33 | 0.93 % | 143.15 | 0 | 0 % | 0 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| hambrt | hambrt | h | ambrt | 32 | 0.32 % | 30.91 | 0 | 0 % | 0 | 32 | 0.86 % | 108.28 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| direkt | direkt | d | irekt | 31 | 0.31 % | 29.95 | 3 | 0.08 % | 13.01 | 23 | 0.62 % | 77.82 | 3 | 0.18 % | 8.49 | 2 | 0.21 % | 12.83 |
| Belvi | belvi | | Belvi | 29 | 0.29 % | 28.02 | 29 | 0.81 % | 125.80 | 0 | 0 % | 0 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| Feeney | feeney | F | eeney | 27 | 0.27 % | 26.08 | 27 | 0.76 % | 117.12 | 0 | 0 % | 0 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| orenk | orenk | | orenk | 27 | 0.27 % | 26.08 | 11 | 0.31 % | 47.72 | 14 | 0.38 % | 47.37 | 0 | 0 % | 0 | 2 | 0.21 % | 12.83 |
| servus | servus | s | ervus | 27 | 0.27 % | 26.08 | 9 | 0.25 % | 39.04 | 18 | 0.49 % | 60.91 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| alora | alora | | alora | 26 | 0.26 % | 25.12 | 12 | 0.34 % | 52.06 | 14 | 0.38 % | 47.37 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| trebalo | trebalo | tr | ebalo | 25 | 0.25 % | 24.15 | 17 | 0.48 % | 73.75 | 8 | 0.22 % | 27.07 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| rajtam | rajtam | r | ajtam | 24 | 0.24 % | 23.19 | 0 | 0 % | 0 | 24 | 0.65 % | 81.21 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| Boruc | boruc | | Boruc | 23 | 0.23 % | 22.22 | 23 | 0.65 % | 99.77 | 0 | 0 % | 0 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| genau | genau | | genau | 23 | 0.23 % | 22.22 | 0 | 0 % | 0 | 23 | 0.62 % | 77.82 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| takvida | takvida | ta | kvida | 22 | 0.22 % | 21.25 | 0 | 0 % | 0 | 22 | 0.59 % | 74.44 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| Brant | brant | | Brant | 21 | 0.21 % | 20.29 | 21 | 0.59 % | 91.10 | 0 | 0 % | 0 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| ukvarja | ukvarja | uk | varja | 21 | 0.21 % | 20.29 | 6 | 0.17 % | 26.03 | 2 | 0.05 % | 6.77 | 8 | 0.48 % | 22.65 | 5 | 0.53 % | 32.07 |
| Belviju | belviju | Be | lviju | 19 | 0.19 % | 18.36 | 19 | 0.53 % | 82.42 | 0 | 0 % | 0 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| Zettel | zettel | Z | ettel | 19 | 0.19 % | 18.36 | 19 | 0.53 % | 82.42 | 0 | 0 % | 0 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| metiloranž | metiloranž | metil | oranž | 19 | 0.19 % | 18.36 | 0 | 0 % | 0 | 0 | 0 % | 0 | 19 | 1.15 % | 53.80 | 0 | 0 % | 0 |
| cumbaya | cumbaya | cu | mbaya | 18 | 0.18 % | 17.39 | 18 | 0.51 % | 78.08 | 0 | 0 % | 0 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| žijala | žijala | ž | ijala | 18 | 0.18 % | 17.39 | 0 | 0 % | 0 | 18 | 0.49 % | 60.91 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| invece | invece | i | nvece | 17 | 0.17 % | 16.42 | 0 | 0 % | 0 | 17 | 0.46 % | 57.52 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| devetnajststo | devetnajststo | devetnaj | ststo | 16 | 0.16 % | 15.46 | 2 | 0.06 % | 8.68 | 0 | 0 % | 0 | 14 | 0.85 % | 39.64 | 0 | 0 % | 0 |
| luškan | luškan | l | uškan | 16 | 0.16 % | 15.46 | 9 | 0.25 % | 39.04 | 1 | 0.03 % | 3.38 | 1 | 0.06 % | 2.83 | 5 | 0.53 % | 32.07 |
| magar | magar | | magar | 16 | 0.16 % | 15.46 | 2 | 0.06 % | 8.68 | 11 | 0.30 % | 37.22 | 1 | 0.06 % | 2.83 | 2 | 0.21 % | 12.83 |
| žijaš | žijaš | | žijaš | 16 | 0.16 % | 15.46 | 0 | 0 % | 0 | 16 | 0.43 % | 54.14 | 0 | 0 % | 0 | 0 | 0 % | 0 |

2.2.371 List of initial character-level 1-grams from residual standardized forms in the GOS 1.0 corpus with text-type distribution

File at CLARIN.SI: GOS1.0-word_parts-residual-standardized_forms-initial-1grams-taxonomy-entire.tsv

| Standardized form | Initial part of the word | Rest of the word | Total absolute frequency of standardized form | Percentage of all found standardized forms | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N] |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| eee | e | ee | 23,222 | 42.92 % | 22,434.40 | 4,248 | 33.09 % | 18,427.66 | 4,527 | 29.40 % | 15,317.62 | 10,556 | 59.40 % | 29,891.07 | 3,891 | 48.07 % | 24,959.43 |
| eem | e | em | 2,950 | 5.45 % | 2,849.95 | 392 | 3.05 % | 1,700.48 | 713 | 4.63 % | 2,412.52 | 1,210 | 6.81 % | 3,426.32 | 635 | 7.84 % | 4,073.31 |
| s | s | | 1,327 | 2.45 % | 1,281.99 | 213 | 1.66 % | 923.99 | 488 | 3.17 % | 1,651.20 | 373 | 2.10 % | 1,056.21 | 253 | 3.12 % | 1,622.91 |
| ka | k | a | 820 | 1.52 % | 792.19 | 466 | 3.63 % | 2,021.49 | 280 | 1.82 % | 947.41 | 42 | 0.24 % | 118.93 | 32 | 0.40 % | 205.27 |
| n | n | | 752 | 1.39 % | 726.49 | 107 | 0.83 % | 464.16 | 299 | 1.94 % | 1,011.70 | 193 | 1.09 % | 546.51 | 153 | 1.89 % | 981.44 |
| p | p | | 497 | 0.92 % | 480.14 | 81 | 0.63 % | 351.37 | 167 | 1.08 % | 565.06 | 141 | 0.79 % | 399.26 | 108 | 1.33 % | 692.78 |
| z | z | | 486 | 0.90 % | 469.52 | 79 | 0.61 % | 342.70 | 159 | 1.03 % | 537.99 | 150 | 0.84 % | 424.75 | 98 | 1.21 % | 628.64 |
| t | t | | 463 | 0.86 % | 447.30 | 76 | 0.59 % | 329.69 | 186 | 1.21 % | 629.35 | 91 | 0.51 % | 257.68 | 110 | 1.36 % | 705.61 |
| m | m | | 422 | 0.78 % | 407.69 | 79 | 0.61 % | 342.70 | 160 | 1.04 % | 541.38 | 93 | 0.52 % | 263.34 | 90 | 1.11 % | 577.32 |
| j | j | | 364 | 0.67 % | 351.65 | 89 | 0.69 % | 386.08 | 136 | 0.88 % | 460.17 | 75 | 0.42 % | 212.37 | 64 | 0.79 % | 410.54 |
| k | k | | 363 | 0.67 % | 350.69 | 72 | 0.56 % | 312.33 | 123 | 0.80 % | 416.18 | 92 | 0.52 % | 260.51 | 76 | 0.94 % | 487.51 |
| v | v | | 332 | 0.61 % | 320.74 | 67 | 0.52 % | 290.64 | 92 | 0.60 % | 311.29 | 119 | 0.67 % | 336.97 | 54 | 0.67 % | 346.39 |
| da | d | a | 331 | 0.61 % | 319.77 | 61 | 0.47 % | 264.62 | 129 | 0.84 % | 436.49 | 90 | 0.51 % | 254.85 | 51 | 0.63 % | 327.15 |
| aaa | a | aa | 288 | 0.53 % | 278.23 | 67 | 0.52 % | 290.64 | 95 | 0.62 % | 321.44 | 95 | 0.54 % | 269.01 | 31 | 0.38 % | 198.85 |
| nnn | n | nn | 240 | 0.44 % | 231.86 | 30 | 0.23 % | 130.14 | 99 | 0.64 % | 334.98 | 59 | 0.33 % | 167.07 | 52 | 0.64 % | 333.56 |
| e | e | | 224 | 0.41 % | 216.40 | 63 | 0.49 % | 273.29 | 96 | 0.62 % | 324.83 | 40 | 0.23 % | 113.27 | 25 | 0.31 % | 160.37 |
| po | p | o | 210 | 0.39 % | 202.88 | 35 | 0.27 % | 151.83 | 55 | 0.36 % | 186.10 | 80 | 0.45 % | 226.53 | 40 | 0.49 % | 256.59 |
| o | o | | 199 | 0.37 % | 192.25 | 32 | 0.25 % | 138.81 | 52 | 0.34 % | 175.95 | 69 | 0.39 % | 195.38 | 46 | 0.57 % | 295.07 |
| kao | k | ao | 198 | 0.37 % | 191.28 | 9 | 0.07 % | 39.04 | 165 | 1.07 % | 558.30 | 5 | 0.03 % | 14.16 | 19 | 0.23 % | 121.88 |
| š | š | | 173 | 0.32 % | 167.13 | 34 | 0.27 % | 147.49 | 65 | 0.42 % | 219.93 | 37 | 0.21 % | 104.77 | 37 | 0.46 % | 237.34 |
| ooo | o | oo | 166 | 0.31 % | 160.37 | 83 | 0.65 % | 360.05 | 57 | 0.37 % | 192.87 | 10 | 0.06 % | 28.32 | 16 | 0.20 % | 102.63 |
| b | b | | 165 | 0.30 % | 159.40 | 50 | 0.39 % | 216.90 | 38 | 0.25 % | 128.58 | 71 | 0.40 % | 201.05 | 6 | 0.07 % | 38.49 |
| bi | b | i | 160 | 0.30 % | 154.57 | 21 | 0.16 % | 91.10 | 74 | 0.48 % | 250.39 | 18 | 0.10 % | 50.97 | 47 | 0.58 % | 301.49 |
| i | i | | 154 | 0.28 % | 148.78 | 25 | 0.20 % | 108.45 | 50 | 0.33 % | 169.18 | 46 | 0.26 % | 130.26 | 33 | 0.41 % | 211.68 |
| živjo | ž | ivjo | 149 | 0.28 % | 143.95 | 74 | 0.58 % | 321.01 | 22 | 0.14 % | 74.44 | 35 | 0.20 % | 99.11 | 18 | 0.22 % | 115.46 |
| tipo | t | ipo | 145 | 0.27 % | 140.08 | 1 | 0.01 % | 4.34 | 144 | 0.94 % | 487.24 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| na | n | a | 143 | 0.26 % | 138.15 | 18 | 0.14 % | 78.08 | 49 | 0.32 % | 165.80 | 46 | 0.26 % | 130.26 | 30 | 0.37 % | 192.44 |
| B | B | | 112 | 0.21 % | 108.20 | 23 | 0.18 % | 99.77 | 3 | 0.02 % | 10.15 | 68 | 0.38 % | 192.55 | 18 | 0.22 % | 115.46 |
| za | z | a | 108 | 0.20 % | 104.34 | 23 | 0.18 % | 99.77 | 29 | 0.19 % | 98.12 | 34 | 0.19 % | 96.28 | 22 | 0.27 % | 141.12 |
| pre | p | re | 107 | 0.20 % | 103.37 | 27 | 0.21 % | 117.12 | 24 | 0.16 % | 81.21 | 34 | 0.19 % | 96.28 | 22 | 0.27 % | 141.12 |
| uuu | u | uu | 104 | 0.19 % | 100.47 | 58 | 0.45 % | 251.60 | 32 | 0.21 % | 108.28 | 8 | 0.04 % | 22.65 | 6 | 0.07 % | 38.49 |
| pri | p | ri | 103 | 0.19 % | 99.51 | 17 | 0.13 % | 73.75 | 31 | 0.20 % | 104.89 | 35 | 0.20 % | 99.11 | 20 | 0.25 % | 128.29 |

2.2.372 List of initial character-level 2-grams from residual standardized forms in the GOS 1.0 corpus with text-type distribution

File at CLARIN.SI: GOS1.0-word_parts-residual-standardized_forms-initial-2grams-taxonomy-entire.tsv

| Standardized form | Initial part of the word | Rest of the word | Total absolute frequency of standardized form | Percentage of all found standardized forms | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N] |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| eee | ee | e | 23,222 | 49.31 % | 22,434.40 | 4,248 | 36.78 % | 18,427.66 | 4,527 | 34.77 % | 15,317.62 | 10,556 | 67.11 % | 29,891.07 | 3,891 | 57.24 % | 24,959.43 |
| eem | ee | m | 2,950 | 6.26 % | 2,849.95 | 392 | 3.39 % | 1,700.48 | 713 | 5.48 % | 2,412.52 | 1,210 | 7.69 % | 3,426.32 | 635 | 9.34 % | 4,073.31 |
| ka | ka | | 820 | 1.74 % | 792.19 | 466 | 4.04 % | 2,021.49 | 280 | 2.15 % | 947.41 | 42 | 0.27 % | 118.93 | 32 | 0.47 % | 205.27 |
| da | da | | 331 | 0.70 % | 319.77 | 61 | 0.53 % | 264.62 | 129 | 0.99 % | 436.49 | 90 | 0.57 % | 254.85 | 51 | 0.75 % | 327.15 |
| aaa | aa | a | 288 | 0.61 % | 278.23 | 67 | 0.58 % | 290.64 | 95 | 0.73 % | 321.44 | 95 | 0.60 % | 269.01 | 31 | 0.46 % | 198.85 |
| nnn | nn | n | 240 | 0.51 % | 231.86 | 30 | 0.26 % | 130.14 | 99 | 0.76 % | 334.98 | 59 | 0.38 % | 167.07 | 52 | 0.77 % | 333.56 |
| po | po | | 210 | 0.45 % | 202.88 | 35 | 0.30 % | 151.83 | 55 | 0.42 % | 186.10 | 80 | 0.51 % | 226.53 | 40 | 0.59 % | 256.59 |
| kao | ka | o | 198 | 0.42 % | 191.28 | 9 | 0.08 % | 39.04 | 165 | 1.27 % | 558.30 | 5 | 0.03 % | 14.16 | 19 | 0.28 % | 121.88 |
| ooo | oo | o | 166 | 0.35 % | 160.37 | 83 | 0.72 % | 360.05 | 57 | 0.44 % | 192.87 | 10 | 0.06 % | 28.32 | 16 | 0.23 % | 102.63 |
| bi | bi | | 160 | 0.34 % | 154.57 | 21 | 0.18 % | 91.10 | 74 | 0.57 % | 250.39 | 18 | 0.11 % | 50.97 | 47 | 0.69 % | 301.49 |
| živjo | ži | vjo | 149 | 0.32 % | 143.95 | 74 | 0.64 % | 321.01 | 22 | 0.17 % | 74.44 | 35 | 0.22 % | 99.11 | 18 | 0.27 % | 115.46 |
| tipo | ti | po | 145 | 0.31 % | 140.08 | 1 | 0.01 % | 4.34 | 144 | 1.11 % | 487.24 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| na | na | | 143 | 0.30 % | 138.15 | 18 | 0.16 % | 78.08 | 49 | 0.38 % | 165.80 | 46 | 0.29 % | 130.26 | 30 | 0.44 % | 192.44 |
| za | za | | 108 | 0.23 % | 104.34 | 23 | 0.20 % | 99.77 | 29 | 0.22 % | 98.12 | 34 | 0.22 % | 96.28 | 22 | 0.32 % | 141.12 |
| pre | pr | e | 107 | 0.23 % | 103.37 | 27 | 0.23 % | 117.12 | 24 | 0.18 % | 81.21 | 34 | 0.22 % | 96.28 | 22 | 0.32 % | 141.12 |
| uuu | uu | u | 104 | 0.22 % | 100.47 | 58 | 0.50 % | 251.60 | 32 | 0.25 % | 108.28 | 8 | 0.05 % | 22.65 | 6 | 0.09 % | 38.49 |
| pri | pr | i | 103 | 0.22 % | 99.51 | 17 | 0.15 % | 73.75 | 31 | 0.24 % | 104.89 | 35 | 0.22 % | 99.11 | 20 | 0.29 % | 128.29 |
| ne | ne | | 88 | 0.19 % | 85.02 | 23 | 0.20 % | 99.77 | 23 | 0.18 % | 77.82 | 29 | 0.18 % | 82.12 | 13 | 0.19 % | 83.39 |
| re | re | | 76 | 0.16 % | 73.42 | 20 | 0.17 % | 86.76 | 21 | 0.16 % | 71.06 | 22 | 0.14 % | 62.30 | 13 | 0.19 % | 83.39 |
| the | th | e | 71 | 0.15 % | 68.59 | 50 | 0.43 % | 216.90 | 14 | 0.11 % | 47.37 | 1 | 0.01 % | 2.83 | 6 | 0.09 % | 38.49 |
| do | do | | 68 | 0.14 % | 65.69 | 18 | 0.16 % | 78.08 | 20 | 0.15 % | 67.67 | 17 | 0.11 % | 48.14 | 13 | 0.19 % | 83.39 |
| een | ee | n | 64 | 0.14 % | 61.83 | 10 | 0.09 % | 43.38 | 10 | 0.08 % | 33.84 | 36 | 0.23 % | 101.94 | 8 | 0.12 % | 51.32 |
| ma | ma | | 64 | 0.14 % | 61.83 | 19 | 0.17 % | 82.42 | 29 | 0.22 % | 98.12 | 8 | 0.05 % | 22.65 | 8 | 0.12 % | 51.32 |
| yo | yo | | 64 | 0.14 % | 61.83 | 62 | 0.54 % | 268.95 | 0 | 0 % | 0 | 0 | 0 % | 0 | 2 | 0.03 % | 12.83 |
| ta | ta | | 60 | 0.13 % | 57.97 | 13 | 0.11 % | 56.39 | 27 | 0.21 % | 91.36 | 12 | 0.08 % | 33.98 | 8 | 0.12 % | 51.32 |
| City | Ci | ty | 57 | 0.12 % | 55.07 | 55 | 0.48 % | 238.59 | 1 | 0.01 % | 3.38 | 1 | 0.01 % | 2.83 | 0 | 0 % | 0 |
| te | te | | 57 | 0.12 % | 55.07 | 8 | 0.07 % | 34.70 | 19 | 0.15 % | 64.29 | 20 | 0.13 % | 56.63 | 10 | 0.15 % | 64.15 |
| The | Th | e | 56 | 0.12 % | 54.10 | 54 | 0.47 % | 234.25 | 2 | 0.01 % | 6.77 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| se | se | | 56 | 0.12 % | 54.10 | 14 | 0.12 % | 60.73 | 24 | 0.18 % | 81.21 | 10 | 0.06 % | 28.32 | 8 | 0.12 % | 51.32 |
| tavžent | ta | vžent | 56 | 0.12 % | 54.10 | 18 | 0.16 % | 78.08 | 32 | 0.25 % | 108.28 | 3 | 0.02 % | 8.49 | 3 | 0.04 % | 19.24 |
| you | yo | u | 55 | 0.12 % | 53.13 | 45 | 0.39 % | 195.21 | 9 | 0.07 % | 30.45 | 1 | 0.01 % | 2.83 | 0 | 0 % | 0 |
| mo | mo | | 53 | 0.11 % | 51.20 | 12 | 0.10 % | 52.06 | 18 | 0.14 % | 60.91 | 10 | 0.06 % | 28.32 | 13 | 0.19 % | 83.39 |

2.2.373 List of initial character-level 3-grams from residual standardized forms in the GOS 1.0 corpus with text-type distribution

File at CLARIN.SI: GOS1.0-word_parts-residual-standardized_forms-initial-3grams-taxonomy-entire.tsv

| Standardized form | Initial part of the word | Rest of the word | Total absolute frequency of standardized form | Percentage of all found standardized forms | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N] |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| eee | eee | | 23,222 | 54.38 % | 22,434.40 | 4,248 | 41.49 % | 18,427.66 | 4,527 | 39.41 % | 15,317.62 | 10,556 | 71.44 % | 29,891.07 | 3,891 | 62.74 % | 24,959.43 |
| eem | eem | | 2,950 | 6.91 % | 2,849.95 | 392 | 3.83 % | 1,700.48 | 713 | 6.21 % | 2,412.52 | 1,210 | 8.19 % | 3,426.32 | 635 | 10.24 % | 4,073.31 |
| aaa | aaa | | 288 | 0.67 % | 278.23 | 67 | 0.65 % | 290.64 | 95 | 0.83 % | 321.44 | 95 | 0.64 % | 269.01 | 31 | 0.50 % | 198.85 |
| nnn | nnn | | 240 | 0.56 % | 231.86 | 30 | 0.29 % | 130.14 | 99 | 0.86 % | 334.98 | 59 | 0.40 % | 167.07 | 52 | 0.84 % | 333.56 |
| kao | kao | | 198 | 0.46 % | 191.28 | 9 | 0.09 % | 39.04 | 165 | 1.44 % | 558.30 | 5 | 0.03 % | 14.16 | 19 | 0.31 % | 121.88 |
| ooo | ooo | | 166 | 0.39 % | 160.37 | 83 | 0.81 % | 360.05 | 57 | 0.50 % | 192.87 | 10 | 0.07 % | 28.32 | 16 | 0.26 % | 102.63 |
| živjo | živ | jo | 149 | 0.35 % | 143.95 | 74 | 0.72 % | 321.01 | 22 | 0.19 % | 74.44 | 35 | 0.24 % | 99.11 | 18 | 0.29 % | 115.46 |
| tipo | tip | o | 145 | 0.34 % | 140.08 | 1 | 0.01 % | 4.34 | 144 | 1.25 % | 487.24 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| pre | pre | | 107 | 0.25 % | 103.37 | 27 | 0.26 % | 117.12 | 24 | 0.21 % | 81.21 | 34 | 0.23 % | 96.28 | 22 | 0.35 % | 141.12 |
| uuu | uuu | | 104 | 0.24 % | 100.47 | 58 | 0.57 % | 251.60 | 32 | 0.28 % | 108.28 | 8 | 0.05 % | 22.65 | 6 | 0.10 % | 38.49 |
| pri | pri | | 103 | 0.24 % | 99.51 | 17 | 0.17 % | 73.75 | 31 | 0.27 % | 104.89 | 35 | 0.24 % | 99.11 | 20 | 0.32 % | 128.29 |
| the | the | | 71 | 0.17 % | 68.59 | 50 | 0.49 % | 216.90 | 14 | 0.12 % | 47.37 | 1 | 0.01 % | 2.83 | 6 | 0.10 % | 38.49 |
| een | een | | 64 | 0.15 % | 61.83 | 10 | 0.10 % | 43.38 | 10 | 0.09 % | 33.84 | 36 | 0.24 % | 101.94 | 8 | 0.13 % | 51.32 |
| City | Cit | y | 57 | 0.13 % | 55.07 | 55 | 0.54 % | 238.59 | 1 | 0.01 % | 3.38 | 1 | 0.01 % | 2.83 | 0 | 0 % | 0 |
| The | The | | 56 | 0.13 % | 54.10 | 54 | 0.53 % | 234.25 | 2 | 0.02 % | 6.77 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| tavžent | tav | žent | 56 | 0.13 % | 54.10 | 18 | 0.18 % | 78.08 | 32 | 0.28 % | 108.28 | 3 | 0.02 % | 8.49 | 3 | 0.05 % | 19.24 |
| you | you | | 55 | 0.13 % | 53.13 | 45 | 0.44 % | 195.21 | 9 | 0.08 % | 30.45 | 1 | 0.01 % | 2.83 | 0 | 0 % | 0 |
| Capris | Cap | ris | 48 | 0.11 % | 46.37 | 48 | 0.47 % | 208.22 | 0 | 0 % | 0 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| aam | aam | | 43 | 0.10 % | 41.54 | 5 | 0.05 % | 21.69 | 15 | 0.13 % | 50.75 | 18 | 0.12 % | 50.97 | 5 | 0.08 % | 32.07 |
| bot | bot | | 43 | 0.10 % | 41.54 | 4 | 0.04 % | 17.35 | 38 | 0.33 % | 128.58 | 0 | 0 % | 0 | 1 | 0.02 % | 6.41 |
| hmm | hmm | | 43 | 0.10 % | 41.54 | 14 | 0.14 % | 60.73 | 25 | 0.22 % | 84.59 | 0 | 0 % | 0 | 4 | 0.06 % | 25.66 |
| ovi | ovi | | 42 | 0.10 % | 40.58 | 2 | 0.02 % | 8.68 | 37 | 0.32 % | 125.19 | 3 | 0.02 % | 8.49 | 0 | 0 % | 0 |
| Healy | Hea | ly | 41 | 0.10 % | 39.61 | 41 | 0.40 % | 177.86 | 0 | 0 % | 0 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| brezveze | bre | zveze | 37 | 0.09 % | 35.75 | 3 | 0.03 % | 13.01 | 14 | 0.12 % | 47.37 | 9 | 0.06 % | 25.48 | 11 | 0.18 % | 70.56 |
| komot | kom | ot | 37 | 0.09 % | 35.75 | 5 | 0.05 % | 21.69 | 26 | 0.23 % | 87.97 | 1 | 0.01 % | 2.83 | 5 | 0.08 % | 32.07 |
| anche | anc | he | 34 | 0.08 % | 32.85 | 9 | 0.09 % | 39.04 | 25 | 0.22 % | 84.59 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| Poutiainen | Pou | tiainen | 33 | 0.08 % | 31.88 | 33 | 0.32 % | 143.15 | 0 | 0 % | 0 | 0 | 0 % | 0 | 0 | 0 % | 0 |
| eeh | eeh | | 33 | 0.08 % | 31.88 | 0 | 0 % | 0 | 19 | 0.17 % | 64.29 | 13 | 0.09 % | 36.81 | 1 | 0.02 % | 6.41 |
| hambrt | ham | brt | 32 | 0.07 % | 30.91 | 0 | 0 % | 0 | 32 | 0.28 % | 108.28 | 0 | 0 % | 0 | 0 | 0 % | 0 |
direkt dir ekt 31 0.07 % 29.95 3 0.03 % 13.01 23 0.20 % 77.82 3
0.02 % 8.49 2 0.03 % 12.83 Belvi Bel vi 29 0.07 % 28.02 29 0.28 % 125.80 0 0 % 0 0 0 % 0 0 0 % 0 ach ach 29 0.07 % 28.02 0 0 % 0 29 0.25 % 98.12 0 0 % 0 0 0 % 0 CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 717 File at CLARIN.SI2.2.374 List of initial character-level 4-grams from residual standardized forms in the GOS 1.0 corpus with text- type distributionGOS1.0-word_parts-residual-standardized_ forms-initial-4grams-taxonomy-entire.tsvStandardized form Initial part of the word Rest of the word Total absolute frequency of standardized form Percentage of all found standardized forms Total relative frequency (per million occurrences) Absolute frequency [gos.T.J.R] Percentage [gos.T.J.R] Relative frequency [gos.T.J.R] Absolute frequency [gos.T.N.Z] Percentage [gos.T.N.Z] Relative frequency [gos.T.N.Z] Absolute frequency [gos.T.J.I] Percentage [gos.T.J.I] Relative frequency [gos.T.J.I] Absolute frequency [gos.T.N.N] Percentage [gos.T.N.N] Relative frequency [gos.T.N.N] živjo živj o 149 1.21 % 143.95 74 1.68 % 321.01 22 0.47 % 74.44 35 1.68 % 99.11 18 1.51 % 115.46 tipo tipo 145 1.17 % 140.08 1 0.02 % 4.34 144 3.08 % 487.24 0 0 % 0 0 0 % 0 City City 57 0.46 % 55.07 55 1.25 % 238.59 1 0.02 % 3.38 1 0.05 % 2.83 0 0 % 0 tavžent tavž ent 56 0.45 % 54.10 18 0.41 % 78.08 32 0.69 % 108.28 3 0.14 % 8.49 3 0.25 % 19.24 Capris Capr is 48 0.39 % 46.37 48 1.09 % 208.22 0 0 % 0 0 0 % 0 0 0 % 0 Healy Heal y 41 0.33 % 39.61 41 0.93 % 177.86 0 0 % 0 0 0 % 0 0 0 % 0 brezveze brez veze 37 0.30 % 35.75 3 0.07 % 13.01 14 0.30 % 47.37 9 0.43 % 25.48 11 0.92 % 70.56 komot komo t 37 0.30 % 35.75 5 0.11 % 21.69 26 0.56 % 87.97 1 0.05 % 2.83 5 0.42 % 32.07 anche anch e 34 0.28 % 32.85 9 0.20 % 39.04 25 0.54 % 84.59 0 0 % 0 0 0 % 0 Poutiainen Pout iainen 33 0.27 % 31.88 33 0.75 % 143.15 0 0 % 0 0 0 % 0 0 0 % 0 hambrt hamb rt 32 0.26 % 30.91 0 0 % 0 32 0.69 % 108.28 0 0 % 0 0 0 % 0 direkt dire kt 31 0.25 % 29.95 3 0.07 % 13.01 23 0.49 % 77.82 3 0.14 % 8.49 2 0.17 
% 12.83 Belvi Belv i 29 0.23 % 28.02 29 0.66 % 125.80 0 0 % 0 0 0 % 0 0 0 % 0 Feeney Feen ey 27 0.22 % 26.08 27 0.61 % 117.12 0 0 % 0 0 0 % 0 0 0 % 0 frej frej 27 0.22 % 26.08 3 0.07 % 13.01 24 0.51 % 81.21 0 0 % 0 0 0 % 0 orenk oren k 27 0.22 % 26.08 11 0.25 % 47.72 14 0.30 % 47.37 0 0 % 0 2 0.17 % 12.83 servus serv us 27 0.22 % 26.08 9 0.20 % 39.04 18 0.39 % 60.91 0 0 % 0 0 0 % 0 alora alor a 26 0.21 % 25.12 12 0.27 % 52.06 14 0.30 % 47.37 0 0 % 0 0 0 % 0 trebalo treb alo 25 0.20 % 24.15 17 0.39 % 73.75 8 0.17 % 27.07 0 0 % 0 0 0 % 0 rajtam rajt am 24 0.19 % 23.19 0 0 % 0 24 0.51 % 81.21 0 0 % 0 0 0 % 0 Boruc Boru c 23 0.19 % 22.22 23 0.52 % 99.77 0 0 % 0 0 0 % 0 0 0 % 0 genau gena u 23 0.19 % 22.22 0 0 % 0 23 0.49 % 77.82 0 0 % 0 0 0 % 0 mega mega 22 0.18 % 21.25 4 0.09 % 17.35 2 0.04 % 6.77 0 0 % 0 16 1.34 % 102.63 takvida takv ida 22 0.18 % 21.25 0 0 % 0 22 0.47 % 74.44 0 0 % 0 0 0 % 0 Brant Bran t 21 0.17 % 20.29 21 0.48 % 91.10 0 0 % 0 0 0 % 0 0 0 % 0 ukvarja ukva rja 21 0.17 % 20.29 6 0.14 % 26.03 2 0.04 % 6.77 8 0.38 % 22.65 5 0.42 % 32.07 fraj fraj 20 0.16 % 19.32 6 0.14 % 26.03 13 0.28 % 43.99 0 0 % 0 1 0.08 % 6.41 Belviju Belv iju 19 0.15 % 18.36 19 0.43 % 82.42 0 0 % 0 0 0 % 0 0 0 % 0 Zettel Zett el 19 0.15 % 18.36 19 0.43 % 82.42 0 0 % 0 0 0 % 0 0 0 % 0 metiloranž meti loranž 19 0.15 % 18.36 0 0 % 0 0 0 % 0 19 0.91 % 53.80 0 0 % 0 cumbaya cumb aya 18 0.15 % 17.39 18 0.41 % 78.08 0 0 % 0 0 0 % 0 0 0 % 0 žijala žija la 18 0.15 % 17.39 0 0 % 0 18 0.39 % 60.91 0 0 % 0 0 0 % 0 CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 718 File at CLARIN.SI2.2.375 List of initial character-level 5-grams from residual standardized forms in the GOS 1.0 corpus with text- type distributionGOS1.0-word_parts-residual-standardized_ forms-initial-5grams-taxonomy-entire.tsvStandardized form Initial part of the word Rest of the word Total absolute frequency of standardized form Percentage of all found standardized forms Total relative frequency 
(per million occurrences) Absolute frequency [gos.T.J.R] Percentage [gos.T.J.R] Relative frequency [gos.T.J.R] Absolute frequency [gos.T.N.Z] Percentage [gos.T.N.Z] Relative frequency [gos.T.N.Z] Absolute frequency [gos.T.J.I] Percentage [gos.T.J.I] Relative frequency [gos.T.J.I] Absolute frequency [gos.T.N.N] Percentage [gos.T.N.N] Relative frequency [gos.T.N.N] živjo živjo 149 1.51 % 143.95 74 2.08 % 321.01 22 0.59 % 74.44 35 2.12 % 99.11 18 1.89 % 115.46 tavžent tavže nt 56 0.57 % 54.10 18 0.51 % 78.08 32 0.86 % 108.28 3 0.18 % 8.49 3 0.32 % 19.24 Capris Capri s 48 0.49 % 46.37 48 1.35 % 208.22 0 0 % 0 0 0 % 0 0 0 % 0 Healy Healy 41 0.42 % 39.61 41 1.15 % 177.86 0 0 % 0 0 0 % 0 0 0 % 0 brezveze brezv eze 37 0.38 % 35.75 3 0.08 % 13.01 14 0.38 % 47.37 9 0.55 % 25.48 11 1.16 % 70.56 komot komot 37 0.38 % 35.75 5 0.14 % 21.69 26 0.70 % 87.97 1 0.06 % 2.83 5 0.53 % 32.07 anche anche 34 0.34 % 32.85 9 0.25 % 39.04 25 0.68 % 84.59 0 0 % 0 0 0 % 0 Poutiainen Pouti ainen 33 0.34 % 31.88 33 0.93 % 143.15 0 0 % 0 0 0 % 0 0 0 % 0 hambrt hambr t 32 0.32 % 30.91 0 0 % 0 32 0.86 % 108.28 0 0 % 0 0 0 % 0 direkt direk t 31 0.31 % 29.95 3 0.08 % 13.01 23 0.62 % 77.82 3 0.18 % 8.49 2 0.21 % 12.83 Belvi Belvi 29 0.29 % 28.02 29 0.81 % 125.80 0 0 % 0 0 0 % 0 0 0 % 0 Feeney Feene y 27 0.27 % 26.08 27 0.76 % 117.12 0 0 % 0 0 0 % 0 0 0 % 0 orenk orenk 27 0.27 % 26.08 11 0.31 % 47.72 14 0.38 % 47.37 0 0 % 0 2 0.21 % 12.83 servus servu s 27 0.27 % 26.08 9 0.25 % 39.04 18 0.49 % 60.91 0 0 % 0 0 0 % 0 alora alora 26 0.26 % 25.12 12 0.34 % 52.06 14 0.38 % 47.37 0 0 % 0 0 0 % 0 trebalo treba lo 25 0.25 % 24.15 17 0.48 % 73.75 8 0.22 % 27.07 0 0 % 0 0 0 % 0 rajtam rajta m 24 0.24 % 23.19 0 0 % 0 24 0.65 % 81.21 0 0 % 0 0 0 % 0 Boruc Boruc 23 0.23 % 22.22 23 0.65 % 99.77 0 0 % 0 0 0 % 0 0 0 % 0 genau genau 23 0.23 % 22.22 0 0 % 0 23 0.62 % 77.82 0 0 % 0 0 0 % 0 takvida takvi da 22 0.22 % 21.25 0 0 % 0 22 0.59 % 74.44 0 0 % 0 0 0 % 0 Brant Brant 21 0.21 % 20.29 21 0.59 % 91.10 0 0 % 0 0 0 % 0 
0 0 % 0 ukvarja ukvar ja 21 0.21 % 20.29 6 0.17 % 26.03 2 0.05 % 6.77 8 0.48 % 22.65 5 0.53 % 32.07 Belviju Belvi ju 19 0.19 % 18.36 19 0.53 % 82.42 0 0 % 0 0 0 % 0 0 0 % 0 Zettel Zette l 19 0.19 % 18.36 19 0.53 % 82.42 0 0 % 0 0 0 % 0 0 0 % 0 metiloranž metil oranž 19 0.19 % 18.36 0 0 % 0 0 0 % 0 19 1.15 % 53.80 0 0 % 0 cumbaya cumba ya 18 0.18 % 17.39 18 0.51 % 78.08 0 0 % 0 0 0 % 0 0 0 % 0 žijala žijal a 18 0.18 % 17.39 0 0 % 0 18 0.49 % 60.91 0 0 % 0 0 0 % 0 invece invec e 17 0.17 % 16.42 0 0 % 0 17 0.46 % 57.52 0 0 % 0 0 0 % 0 devetnajststo devet najststo 16 0.16 % 15.46 2 0.06 % 8.68 0 0 % 0 14 0.85 % 39.64 0 0 % 0 luškan luška n 16 0.16 % 15.46 9 0.25 % 39.04 1 0.03 % 3.38 1 0.06 % 2.83 5 0.53 % 32.07 magar magar 16 0.16 % 15.46 2 0.06 % 8.68 11 0.30 % 37.22 1 0.06 % 2.83 2 0.21 % 12.83 žijaš žijaš 16 0.16 % 15.46 0 0 % 0 16 0.43 % 54.14 0 0 % 0 0 0 % 0 CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 719 File at CLARIN.SI2.2.376 List of final character-level 1-grams from residual standardized forms in the GOS 1.0 corpus with text-type distributionGOS1.0-word_parts-residual-standardized_ forms-final-1grams-taxonomy-entire.tsvStandardized form Rest of the word Final part of the word Total absolute frequency of standardized form Percentage of all found standardized forms Total relative frequency (per million occurrences) Absolute frequency [gos.T.J.R] Percentage [gos.T.J.R] Relative frequency [gos.T.J.R] Absolute frequency [gos.T.N.Z] Percentage [gos.T.N.Z] Relative frequency [gos.T.N.Z] Absolute frequency [gos.T.J.I] Percentage [gos.T.J.I] Relative frequency [gos.T.J.I] Absolute frequency [gos.T.N.N] Percentage [gos.T.N.N] Relative frequency [gos.T.N.N] eee ee e 23,222 42.92 % 22,434.40 4,248 33.09 % 18,427.66 4,527 29.40 % 15,317.62 10,556 59.40 % 29,891.07 3,891 48.07 % 24,959.43 eem ee m 2,950 5.45 % 2,849.95 392 3.05 % 1,700.48 713 4.63 % 2,412.52 1,210 6.81 % 3,426.32 635 7.84 % 4,073.31 s s 1,327 2.45 % 1,281.99 213 1.66 % 
923.99 488 3.17 % 1,651.20 373 2.10 % 1,056.21 253 3.12 % 1,622.91 ka k a 820 1.52 % 792.19 466 3.63 % 2,021.49 280 1.82 % 947.41 42 0.24 % 118.93 32 0.40 % 205.27 n n 752 1.39 % 726.49 107 0.83 % 464.16 299 1.94 % 1,011.70 193 1.09 % 546.51 153 1.89 % 981.44 p p 497 0.92 % 480.14 81 0.63 % 351.37 167 1.08 % 565.06 141 0.79 % 399.26 108 1.33 % 692.78 z z 486 0.90 % 469.52 79 0.61 % 342.70 159 1.03 % 537.99 150 0.84 % 424.75 98 1.21 % 628.64 t t 463 0.86 % 447.30 76 0.59 % 329.69 186 1.21 % 629.35 91 0.51 % 257.68 110 1.36 % 705.61 m m 422 0.78 % 407.69 79 0.61 % 342.70 160 1.04 % 541.38 93 0.52 % 263.34 90 1.11 % 577.32 j j 364 0.67 % 351.65 89 0.69 % 386.08 136 0.88 % 460.17 75 0.42 % 212.37 64 0.79 % 410.54 k k 363 0.67 % 350.69 72 0.56 % 312.33 123 0.80 % 416.18 92 0.52 % 260.51 76 0.94 % 487.51 v v 332 0.61 % 320.74 67 0.52 % 290.64 92 0.60 % 311.29 119 0.67 % 336.97 54 0.67 % 346.39 da d a 331 0.61 % 319.77 61 0.47 % 264.62 129 0.84 % 436.49 90 0.51 % 254.85 51 0.63 % 327.15 aaa aa a 288 0.53 % 278.23 67 0.52 % 290.64 95 0.62 % 321.44 95 0.54 % 269.01 31 0.38 % 198.85 nnn nn n 240 0.44 % 231.86 30 0.23 % 130.14 99 0.64 % 334.98 59 0.33 % 167.07 52 0.64 % 333.56 e e 224 0.41 % 216.40 63 0.49 % 273.29 96 0.62 % 324.83 40 0.23 % 113.27 25 0.31 % 160.37 po p o 210 0.39 % 202.88 35 0.27 % 151.83 55 0.36 % 186.10 80 0.45 % 226.53 40 0.49 % 256.59 o o 199 0.37 % 192.25 32 0.25 % 138.81 52 0.34 % 175.95 69 0.39 % 195.38 46 0.57 % 295.07 kao ka o 198 0.37 % 191.28 9 0.07 % 39.04 165 1.07 % 558.30 5 0.03 % 14.16 19 0.23 % 121.88 š š 173 0.32 % 167.13 34 0.27 % 147.49 65 0.42 % 219.93 37 0.21 % 104.77 37 0.46 % 237.34 ooo oo o 166 0.31 % 160.37 83 0.65 % 360.05 57 0.37 % 192.87 10 0.06 % 28.32 16 0.20 % 102.63 b b 165 0.30 % 159.40 50 0.39 % 216.90 38 0.25 % 128.58 71 0.40 % 201.05 6 0.07 % 38.49 bi b i 160 0.30 % 154.57 21 0.16 % 91.10 74 0.48 % 250.39 18 0.10 % 50.97 47 0.58 % 301.49 i i 154 0.28 % 148.78 25 0.20 % 108.45 50 0.33 % 169.18 46 0.26 % 130.26 33 0.41 % 
211.68 živjo živj o 149 0.28 % 143.95 74 0.58 % 321.01 22 0.14 % 74.44 35 0.20 % 99.11 18 0.22 % 115.46 tipo tip o 145 0.27 % 140.08 1 0.01 % 4.34 144 0.94 % 487.24 0 0 % 0 0 0 % 0 na n a 143 0.26 % 138.15 18 0.14 % 78.08 49 0.32 % 165.80 46 0.26 % 130.26 30 0.37 % 192.44 B B 112 0.21 % 108.20 23 0.18 % 99.77 3 0.02 % 10.15 68 0.38 % 192.55 18 0.22 % 115.46 za z a 108 0.20 % 104.34 23 0.18 % 99.77 29 0.19 % 98.12 34 0.19 % 96.28 22 0.27 % 141.12 pre pr e 107 0.20 % 103.37 27 0.21 % 117.12 24 0.16 % 81.21 34 0.19 % 96.28 22 0.27 % 141.12 uuu uu u 104 0.19 % 100.47 58 0.45 % 251.60 32 0.21 % 108.28 8 0.04 % 22.65 6 0.07 % 38.49 pri pr i 103 0.19 % 99.51 17 0.13 % 73.75 31 0.20 % 104.89 35 0.20 % 99.11 20 0.25 % 128.29 CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 720 File at CLARIN.SI2.2.377 List of final character-level 2-grams from residual standardized forms in the GOS 1.0 corpus with text-type distributionGOS1.0-word_parts-residual-standardized_ forms-final-2grams-taxonomy-entire.tsvStandardized form Rest of the word Final part of the word Total absolute frequency of standardized form Percentage of all found standardized forms Total relative frequency (per million occurrences) Absolute frequency [gos.T.J.R] Percentage [gos.T.J.R] Relative frequency [gos.T.J.R] Absolute frequency [gos.T.N.Z] Percentage [gos.T.N.Z] Relative frequency [gos.T.N.Z] Absolute frequency [gos.T.J.I] Percentage [gos.T.J.I] Relative frequency [gos.T.J.I] Absolute frequency [gos.T.N.N] Percentage [gos.T.N.N] Relative frequency [gos.T.N.N] eee e ee 23,222 49.31 % 22,434.40 4,248 36.78 % 18,427.66 4,527 34.77 % 15,317.62 10,556 67.11 % 29,891.07 3,891 57.24 % 24,959.43 eem e em 2,950 6.26 % 2,849.95 392 3.39 % 1,700.48 713 5.48 % 2,412.52 1,210 7.69 % 3,426.32 635 9.34 % 4,073.31 ka ka 820 1.74 % 792.19 466 4.04 % 2,021.49 280 2.15 % 947.41 42 0.27 % 118.93 32 0.47 % 205.27 da da 331 0.70 % 319.77 61 0.53 % 264.62 129 0.99 % 436.49 90 0.57 % 254.85 51 0.75 % 
327.15 aaa a aa 288 0.61 % 278.23 67 0.58 % 290.64 95 0.73 % 321.44 95 0.60 % 269.01 31 0.46 % 198.85 nnn n nn 240 0.51 % 231.86 30 0.26 % 130.14 99 0.76 % 334.98 59 0.38 % 167.07 52 0.77 % 333.56 po po 210 0.45 % 202.88 35 0.30 % 151.83 55 0.42 % 186.10 80 0.51 % 226.53 40 0.59 % 256.59 kao k ao 198 0.42 % 191.28 9 0.08 % 39.04 165 1.27 % 558.30 5 0.03 % 14.16 19 0.28 % 121.88 ooo o oo 166 0.35 % 160.37 83 0.72 % 360.05 57 0.44 % 192.87 10 0.06 % 28.32 16 0.23 % 102.63 bi bi 160 0.34 % 154.57 21 0.18 % 91.10 74 0.57 % 250.39 18 0.11 % 50.97 47 0.69 % 301.49 živjo živ jo 149 0.32 % 143.95 74 0.64 % 321.01 22 0.17 % 74.44 35 0.22 % 99.11 18 0.27 % 115.46 tipo ti po 145 0.31 % 140.08 1 0.01 % 4.34 144 1.11 % 487.24 0 0 % 0 0 0 % 0 na na 143 0.30 % 138.15 18 0.16 % 78.08 49 0.38 % 165.80 46 0.29 % 130.26 30 0.44 % 192.44 za za 108 0.23 % 104.34 23 0.20 % 99.77 29 0.22 % 98.12 34 0.22 % 96.28 22 0.32 % 141.12 pre p re 107 0.23 % 103.37 27 0.23 % 117.12 24 0.18 % 81.21 34 0.22 % 96.28 22 0.32 % 141.12 uuu u uu 104 0.22 % 100.47 58 0.50 % 251.60 32 0.25 % 108.28 8 0.05 % 22.65 6 0.09 % 38.49 pri p ri 103 0.22 % 99.51 17 0.15 % 73.75 31 0.24 % 104.89 35 0.22 % 99.11 20 0.29 % 128.29 ne ne 88 0.19 % 85.02 23 0.20 % 99.77 23 0.18 % 77.82 29 0.18 % 82.12 13 0.19 % 83.39 re re 76 0.16 % 73.42 20 0.17 % 86.76 21 0.16 % 71.06 22 0.14 % 62.30 13 0.19 % 83.39 the t he 71 0.15 % 68.59 50 0.43 % 216.90 14 0.11 % 47.37 1 0.01 % 2.83 6 0.09 % 38.49 do do 68 0.14 % 65.69 18 0.16 % 78.08 20 0.15 % 67.67 17 0.11 % 48.14 13 0.19 % 83.39 een e en 64 0.14 % 61.83 10 0.09 % 43.38 10 0.08 % 33.84 36 0.23 % 101.94 8 0.12 % 51.32 ma ma 64 0.14 % 61.83 19 0.17 % 82.42 29 0.22 % 98.12 8 0.05 % 22.65 8 0.12 % 51.32 yo yo 64 0.14 % 61.83 62 0.54 % 268.95 0 0 % 0 0 0 % 0 2 0.03 % 12.83 ta ta 60 0.13 % 57.97 13 0.11 % 56.39 27 0.21 % 91.36 12 0.08 % 33.98 8 0.12 % 51.32 City Ci ty 57 0.12 % 55.07 55 0.48 % 238.59 1 0.01 % 3.38 1 0.01 % 2.83 0 0 % 0 te te 57 0.12 % 55.07 8 0.07 % 34.70 19 0.15 % 
64.29 20 0.13 % 56.63 10 0.15 % 64.15 The T he 56 0.12 % 54.10 54 0.47 % 234.25 2 0.01 % 6.77 0 0 % 0 0 0 % 0 se se 56 0.12 % 54.10 14 0.12 % 60.73 24 0.18 % 81.21 10 0.06 % 28.32 8 0.12 % 51.32 tavžent tavže nt 56 0.12 % 54.10 18 0.16 % 78.08 32 0.25 % 108.28 3 0.02 % 8.49 3 0.04 % 19.24 you y ou 55 0.12 % 53.13 45 0.39 % 195.21 9 0.07 % 30.45 1 0.01 % 2.83 0 0 % 0 mo mo 53 0.11 % 51.20 12 0.10 % 52.06 18 0.14 % 60.91 10 0.06 % 28.32 13 0.19 % 83.39 CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 721 File at CLARIN.SI2.2.378 List of final character-level 3-grams from residual standardized forms in the GOS 1.0 corpus with text-type distributionGOS1.0-word_parts-residual-standardized_ forms-final-3grams-taxonomy-entire.tsvStandardized form Rest of the word Final part of the word Total absolute frequency of standardized form Percentage of all found standardized forms Total relative frequency (per million occurrences) Absolute frequency [gos.T.J.R] Percentage [gos.T.J.R] Relative frequency [gos.T.J.R] Absolute frequency [gos.T.N.Z] Percentage [gos.T.N.Z] Relative frequency [gos.T.N.Z] Absolute frequency [gos.T.J.I] Percentage [gos.T.J.I] Relative frequency [gos.T.J.I] Absolute frequency [gos.T.N.N] Percentage [gos.T.N.N] Relative frequency [gos.T.N.N] eee eee 23,222 54.38 % 22,434.40 4,248 41.49 % 18,427.66 4,527 39.41 % 15,317.62 10,556 71.44 % 29,891.07 3,891 62.74 % 24,959.43 eem eem 2,950 6.91 % 2,849.95 392 3.83 % 1,700.48 713 6.21 % 2,412.52 1,210 8.19 % 3,426.32 635 10.24 % 4,073.31 aaa aaa 288 0.67 % 278.23 67 0.65 % 290.64 95 0.83 % 321.44 95 0.64 % 269.01 31 0.50 % 198.85 nnn nnn 240 0.56 % 231.86 30 0.29 % 130.14 99 0.86 % 334.98 59 0.40 % 167.07 52 0.84 % 333.56 kao kao 198 0.46 % 191.28 9 0.09 % 39.04 165 1.44 % 558.30 5 0.03 % 14.16 19 0.31 % 121.88 ooo ooo 166 0.39 % 160.37 83 0.81 % 360.05 57 0.50 % 192.87 10 0.07 % 28.32 16 0.26 % 102.63 živjo ži vjo 149 0.35 % 143.95 74 0.72 % 321.01 22 0.19 % 74.44 35 0.24 % 99.11 18 
0.29 % 115.46 tipo t ipo 145 0.34 % 140.08 1 0.01 % 4.34 144 1.25 % 487.24 0 0 % 0 0 0 % 0 pre pre 107 0.25 % 103.37 27 0.26 % 117.12 24 0.21 % 81.21 34 0.23 % 96.28 22 0.35 % 141.12 uuu uuu 104 0.24 % 100.47 58 0.57 % 251.60 32 0.28 % 108.28 8 0.05 % 22.65 6 0.10 % 38.49 pri pri 103 0.24 % 99.51 17 0.17 % 73.75 31 0.27 % 104.89 35 0.24 % 99.11 20 0.32 % 128.29 the the 71 0.17 % 68.59 50 0.49 % 216.90 14 0.12 % 47.37 1 0.01 % 2.83 6 0.10 % 38.49 een een 64 0.15 % 61.83 10 0.10 % 43.38 10 0.09 % 33.84 36 0.24 % 101.94 8 0.13 % 51.32 City C ity 57 0.13 % 55.07 55 0.54 % 238.59 1 0.01 % 3.38 1 0.01 % 2.83 0 0 % 0 The The 56 0.13 % 54.10 54 0.53 % 234.25 2 0.02 % 6.77 0 0 % 0 0 0 % 0 tavžent tavž ent 56 0.13 % 54.10 18 0.18 % 78.08 32 0.28 % 108.28 3 0.02 % 8.49 3 0.05 % 19.24 you you 55 0.13 % 53.13 45 0.44 % 195.21 9 0.08 % 30.45 1 0.01 % 2.83 0 0 % 0 Capris Cap ris 48 0.11 % 46.37 48 0.47 % 208.22 0 0 % 0 0 0 % 0 0 0 % 0 aam aam 43 0.10 % 41.54 5 0.05 % 21.69 15 0.13 % 50.75 18 0.12 % 50.97 5 0.08 % 32.07 bot bot 43 0.10 % 41.54 4 0.04 % 17.35 38 0.33 % 128.58 0 0 % 0 1 0.02 % 6.41 hmm hmm 43 0.10 % 41.54 14 0.14 % 60.73 25 0.22 % 84.59 0 0 % 0 4 0.06 % 25.66 ovi ovi 42 0.10 % 40.58 2 0.02 % 8.68 37 0.32 % 125.19 3 0.02 % 8.49 0 0 % 0 Healy He aly 41 0.10 % 39.61 41 0.40 % 177.86 0 0 % 0 0 0 % 0 0 0 % 0 brezveze brezv eze 37 0.09 % 35.75 3 0.03 % 13.01 14 0.12 % 47.37 9 0.06 % 25.48 11 0.18 % 70.56 komot ko mot 37 0.09 % 35.75 5 0.05 % 21.69 26 0.23 % 87.97 1 0.01 % 2.83 5 0.08 % 32.07 anche an che 34 0.08 % 32.85 9 0.09 % 39.04 25 0.22 % 84.59 0 0 % 0 0 0 % 0 Poutiainen Poutiai nen 33 0.08 % 31.88 33 0.32 % 143.15 0 0 % 0 0 0 % 0 0 0 % 0 eeh eeh 33 0.08 % 31.88 0 0 % 0 19 0.17 % 64.29 13 0.09 % 36.81 1 0.02 % 6.41 hambrt ham brt 32 0.07 % 30.91 0 0 % 0 32 0.28 % 108.28 0 0 % 0 0 0 % 0 direkt dir ekt 31 0.07 % 29.95 3 0.03 % 13.01 23 0.20 % 77.82 3 0.02 % 8.49 2 0.03 % 12.83 Belvi Be lvi 29 0.07 % 28.02 29 0.28 % 125.80 0 0 % 0 0 0 % 0 0 0 % 0 ach ach 29 0.07 % 
28.02 0 0 % 0 29 0.25 % 98.12 0 0 % 0 0 0 % 0 CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 722 File at CLARIN.SI2.2.379 List of final character-level 4-grams from residual standardized forms in the GOS 1.0 corpus with text-type distributionGOS1.0-word_parts-residual-standardized_ forms-final-4grams-taxonomy-entire.tsvStandardized form Rest of the word Final part of the word Total absolute frequency of standardized form Percentage of all found standardized forms Total relative frequency (per million occurrences) Absolute frequency [gos.T.J.R] Percentage [gos.T.J.R] Relative frequency [gos.T.J.R] Absolute frequency [gos.T.N.Z] Percentage [gos.T.N.Z] Relative frequency [gos.T.N.Z] Absolute frequency [gos.T.J.I] Percentage [gos.T.J.I] Relative frequency [gos.T.J.I] Absolute frequency [gos.T.N.N] Percentage [gos.T.N.N] Relative frequency [gos.T.N.N] živjo ž ivjo 149 1.21 % 143.95 74 1.68 % 321.01 22 0.47 % 74.44 35 1.68 % 99.11 18 1.51 % 115.46 tipo tipo 145 1.17 % 140.08 1 0.02 % 4.34 144 3.08 % 487.24 0 0 % 0 0 0 % 0 City City 57 0.46 % 55.07 55 1.25 % 238.59 1 0.02 % 3.38 1 0.05 % 2.83 0 0 % 0 tavžent tav žent 56 0.45 % 54.10 18 0.41 % 78.08 32 0.69 % 108.28 3 0.14 % 8.49 3 0.25 % 19.24 Capris Ca pris 48 0.39 % 46.37 48 1.09 % 208.22 0 0 % 0 0 0 % 0 0 0 % 0 Healy H ealy 41 0.33 % 39.61 41 0.93 % 177.86 0 0 % 0 0 0 % 0 0 0 % 0 brezveze brez veze 37 0.30 % 35.75 3 0.07 % 13.01 14 0.30 % 47.37 9 0.43 % 25.48 11 0.92 % 70.56 komot k omot 37 0.30 % 35.75 5 0.11 % 21.69 26 0.56 % 87.97 1 0.05 % 2.83 5 0.42 % 32.07 anche a nche 34 0.28 % 32.85 9 0.20 % 39.04 25 0.54 % 84.59 0 0 % 0 0 0 % 0 Poutiainen Poutia inen 33 0.27 % 31.88 33 0.75 % 143.15 0 0 % 0 0 0 % 0 0 0 % 0 hambrt ha mbrt 32 0.26 % 30.91 0 0 % 0 32 0.69 % 108.28 0 0 % 0 0 0 % 0 direkt di rekt 31 0.25 % 29.95 3 0.07 % 13.01 23 0.49 % 77.82 3 0.14 % 8.49 2 0.17 % 12.83 Belvi B elvi 29 0.23 % 28.02 29 0.66 % 125.80 0 0 % 0 0 0 % 0 0 0 % 0 Feeney Fe eney 27 0.22 % 26.08 27 0.61 % 
117.12 0 0 % 0 0 0 % 0 0 0 % 0 frej frej 27 0.22 % 26.08 3 0.07 % 13.01 24 0.51 % 81.21 0 0 % 0 0 0 % 0 orenk o renk 27 0.22 % 26.08 11 0.25 % 47.72 14 0.30 % 47.37 0 0 % 0 2 0.17 % 12.83 servus se rvus 27 0.22 % 26.08 9 0.20 % 39.04 18 0.39 % 60.91 0 0 % 0 0 0 % 0 alora a lora 26 0.21 % 25.12 12 0.27 % 52.06 14 0.30 % 47.37 0 0 % 0 0 0 % 0 trebalo tre balo 25 0.20 % 24.15 17 0.39 % 73.75 8 0.17 % 27.07 0 0 % 0 0 0 % 0 rajtam ra jtam 24 0.19 % 23.19 0 0 % 0 24 0.51 % 81.21 0 0 % 0 0 0 % 0 Boruc B oruc 23 0.19 % 22.22 23 0.52 % 99.77 0 0 % 0 0 0 % 0 0 0 % 0 genau g enau 23 0.19 % 22.22 0 0 % 0 23 0.49 % 77.82 0 0 % 0 0 0 % 0 mega mega 22 0.18 % 21.25 4 0.09 % 17.35 2 0.04 % 6.77 0 0 % 0 16 1.34 % 102.63 takvida tak vida 22 0.18 % 21.25 0 0 % 0 22 0.47 % 74.44 0 0 % 0 0 0 % 0 Brant B rant 21 0.17 % 20.29 21 0.48 % 91.10 0 0 % 0 0 0 % 0 0 0 % 0 ukvarja ukv arja 21 0.17 % 20.29 6 0.14 % 26.03 2 0.04 % 6.77 8 0.38 % 22.65 5 0.42 % 32.07 fraj fraj 20 0.16 % 19.32 6 0.14 % 26.03 13 0.28 % 43.99 0 0 % 0 1 0.08 % 6.41 Belviju Bel viju 19 0.15 % 18.36 19 0.43 % 82.42 0 0 % 0 0 0 % 0 0 0 % 0 Zettel Ze ttel 19 0.15 % 18.36 19 0.43 % 82.42 0 0 % 0 0 0 % 0 0 0 % 0 metiloranž metilo ranž 19 0.15 % 18.36 0 0 % 0 0 0 % 0 19 0.91 % 53.80 0 0 % 0 cumbaya cum baya 18 0.15 % 17.39 18 0.41 % 78.08 0 0 % 0 0 0 % 0 0 0 % 0 žijala ži jala 18 0.15 % 17.39 0 0 % 0 18 0.39 % 60.91 0 0 % 0 0 0 % 0 CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 723 File at CLARIN.SI2.2.380 List of final character-level 5-grams from residual standardized forms in the GOS 1.0 corpus with text-type distributionGOS1.0-word_parts-residual-standardized_ forms-final-5grams-taxonomy-entire.tsvStandardized form Rest of the word Final part of the word Total absolute frequency of standardized form Percentage of all found standardized forms Total relative frequency (per million occurrences) Absolute frequency [gos.T.J.R] Percentage [gos.T.J.R] Relative frequency [gos.T.J.R] Absolute 
frequency [gos.T.N.Z] Percentage [gos.T.N.Z] Relative frequency [gos.T.N.Z] Absolute frequency [gos.T.J.I] Percentage [gos.T.J.I] Relative frequency [gos.T.J.I] Absolute frequency [gos.T.N.N] Percentage [gos.T.N.N] Relative frequency [gos.T.N.N] živjo živjo 149 1.51 % 143.95 74 2.08 % 321.01 22 0.59 % 74.44 35 2.12 % 99.11 18 1.89 % 115.46 tavžent ta vžent 56 0.57 % 54.10 18 0.51 % 78.08 32 0.86 % 108.28 3 0.18 % 8.49 3 0.32 % 19.24 Capris C apris 48 0.49 % 46.37 48 1.35 % 208.22 0 0 % 0 0 0 % 0 0 0 % 0 Healy Healy 41 0.42 % 39.61 41 1.15 % 177.86 0 0 % 0 0 0 % 0 0 0 % 0 brezveze bre zveze 37 0.38 % 35.75 3 0.08 % 13.01 14 0.38 % 47.37 9 0.55 % 25.48 11 1.16 % 70.56 komot komot 37 0.38 % 35.75 5 0.14 % 21.69 26 0.70 % 87.97 1 0.06 % 2.83 5 0.53 % 32.07 anche anche 34 0.34 % 32.85 9 0.25 % 39.04 25 0.68 % 84.59 0 0 % 0 0 0 % 0 Poutiainen Pouti ainen 33 0.34 % 31.88 33 0.93 % 143.15 0 0 % 0 0 0 % 0 0 0 % 0 hambrt h ambrt 32 0.32 % 30.91 0 0 % 0 32 0.86 % 108.28 0 0 % 0 0 0 % 0 direkt d irekt 31 0.31 % 29.95 3 0.08 % 13.01 23 0.62 % 77.82 3 0.18 % 8.49 2 0.21 % 12.83 Belvi Belvi 29 0.29 % 28.02 29 0.81 % 125.80 0 0 % 0 0 0 % 0 0 0 % 0 Feeney F eeney 27 0.27 % 26.08 27 0.76 % 117.12 0 0 % 0 0 0 % 0 0 0 % 0 orenk orenk 27 0.27 % 26.08 11 0.31 % 47.72 14 0.38 % 47.37 0 0 % 0 2 0.21 % 12.83 servus s ervus 27 0.27 % 26.08 9 0.25 % 39.04 18 0.49 % 60.91 0 0 % 0 0 0 % 0 alora alora 26 0.26 % 25.12 12 0.34 % 52.06 14 0.38 % 47.37 0 0 % 0 0 0 % 0 trebalo tr ebalo 25 0.25 % 24.15 17 0.48 % 73.75 8 0.22 % 27.07 0 0 % 0 0 0 % 0 rajtam r ajtam 24 0.24 % 23.19 0 0 % 0 24 0.65 % 81.21 0 0 % 0 0 0 % 0 Boruc Boruc 23 0.23 % 22.22 23 0.65 % 99.77 0 0 % 0 0 0 % 0 0 0 % 0 genau genau 23 0.23 % 22.22 0 0 % 0 23 0.62 % 77.82 0 0 % 0 0 0 % 0 takvida ta kvida 22 0.22 % 21.25 0 0 % 0 22 0.59 % 74.44 0 0 % 0 0 0 % 0 Brant Brant 21 0.21 % 20.29 21 0.59 % 91.10 0 0 % 0 0 0 % 0 0 0 % 0 ukvarja uk varja 21 0.21 % 20.29 6 0.17 % 26.03 2 0.05 % 6.77 8 0.48 % 22.65 5 0.53 % 32.07 Belviju Be lviju 19 
0.19 % 18.36 19 0.53 % 82.42 0 0 % 0 0 0 % 0 0 0 % 0 Zettel Z ettel 19 0.19 % 18.36 19 0.53 % 82.42 0 0 % 0 0 0 % 0 0 0 % 0 metiloranž metil oranž 19 0.19 % 18.36 0 0 % 0 0 0 % 0 19 1.15 % 53.80 0 0 % 0 cumbaya cu mbaya 18 0.18 % 17.39 18 0.51 % 78.08 0 0 % 0 0 0 % 0 0 0 % 0 žijala ž ijala 18 0.18 % 17.39 0 0 % 0 18 0.49 % 60.91 0 0 % 0 0 0 % 0 invece i nvece 17 0.17 % 16.42 0 0 % 0 17 0.46 % 57.52 0 0 % 0 0 0 % 0 devetnajststo devetnaj ststo 16 0.16 % 15.46 2 0.06 % 8.68 0 0 % 0 14 0.85 % 39.64 0 0 % 0 luškan l uškan 16 0.16 % 15.46 9 0.25 % 39.04 1 0.03 % 3.38 1 0.06 % 2.83 5 0.53 % 32.07 magar magar 16 0.16 % 15.46 2 0.06 % 8.68 11 0.30 % 37.22 1 0.06 % 2.83 2 0.21 % 12.83 žijaš žijaš 16 0.16 % 15.46 0 0 % 0 16 0.43 % 54.14 0 0 % 0 0 0 % 0 CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 724 File at CLARIN.SI2.2.381 List of initial character-level 1-grams from residual lower-case word forms in the GOS 1.0 corpus with text-type distributionGOS1.0-word_parts-residual-lowercase_ forms-initial-1grams-taxonomy-entire.tsvLower-case word form Initial part of the word Rest of the word Total absolute frequency of lower-case word form Percentage of all found lower-case word forms Total relative frequency (per million occurrences) Absolute frequency [gos.T.J.R] Percentage [gos.T.J.R] Relative frequency [gos.T.J.R] Absolute frequency [gos.T.N.Z] Percentage [gos.T.N.Z] Relative frequency [gos.T.N.Z] Absolute frequency [gos.T.J.I] Percentage [gos.T.J.I] Relative frequency [gos.T.J.I] Absolute frequency [gos.T.N.N] Percentage [gos.T.N.N] Relative frequency [gos.T.N.N] eee e ee 23,222 42.92 % 22,434.40 4,248 33.09 % 18,427.66 4,527 29.40 % 15,317.62 10,556 59.40 % 29,891.07 3,891 48.07 % 24,959.43 eem e em 2,950 5.45 % 2,849.95 392 3.05 % 1,700.48 713 4.63 % 2,412.52 1,210 6.81 % 3,426.32 635 7.84 % 4,073.31 s s 1,341 2.48 % 1,295.52 221 1.72 % 958.69 492 3.19 % 1,664.74 375 2.11 % 1,061.87 253 3.12 % 1,622.91 ka k a 772 1.43 % 745.82 424 3.30 % 
1,839.30 271 1.76 % 916.96 45 0.25 % 127.42 32 0.40 % 205.27 n n 766 1.42 % 740.02 108 0.84 % 468.50 298 1.94 % 1,008.32 205 1.15 % 580.49 155 1.92 % 994.27 p p 519 0.96 % 501.40 87 0.68 % 377.40 168 1.09 % 568.45 155 0.87 % 438.91 109 1.35 % 699.20 z z 490 0.91 % 473.38 79 0.61 % 342.70 161 1.04 % 544.76 152 0.85 % 430.41 98 1.21 % 628.64 t t 465 0.86 % 449.23 77 0.60 % 334.02 186 1.21 % 629.35 92 0.52 % 260.51 110 1.36 % 705.61 m m 424 0.78 % 409.62 79 0.61 % 342.70 161 1.04 % 544.76 93 0.52 % 263.34 91 1.12 % 583.73 d d 412 0.76 % 398.03 149 1.16 % 646.36 126 0.82 % 426.34 94 0.53 % 266.18 43 0.53 % 275.83 b b 377 0.70 % 364.21 76 0.59 % 329.69 111 0.72 % 375.58 133 0.75 % 376.61 57 0.70 % 365.64 j j 368 0.68 % 355.52 90 0.70 % 390.42 133 0.86 % 450.02 81 0.46 % 229.36 64 0.79 % 410.54 k k 367 0.68 % 354.55 75 0.58 % 325.35 124 0.81 % 419.57 92 0.52 % 260.51 76 0.94 % 487.51 aaa a aa 288 0.53 % 278.23 67 0.52 % 290.64 95 0.62 % 321.44 95 0.54 % 269.01 31 0.38 % 198.85 v v 269 0.50 % 259.88 59 0.46 % 255.94 67 0.43 % 226.70 107 0.60 % 302.99 36 0.45 % 230.93 e e 249 0.46 % 240.55 61 0.47 % 264.62 102 0.66 % 345.13 59 0.33 % 167.07 27 0.33 % 173.20 nnn n nn 240 0.44 % 231.86 30 0.23 % 130.14 99 0.64 % 334.98 59 0.33 % 167.07 52 0.64 % 333.56 po p o 206 0.38 % 199.01 35 0.27 % 151.83 51 0.33 % 172.56 80 0.45 % 226.53 40 0.49 % 256.59 o o 202 0.37 % 195.15 34 0.27 % 147.49 53 0.34 % 179.33 69 0.39 % 195.38 46 0.57 % 295.07 kao k ao 196 0.36 % 189.35 8 0.06 % 34.70 164 1.06 % 554.91 5 0.03 % 14.16 19 0.23 % 121.88 u u 184 0.34 % 177.76 50 0.39 % 216.90 49 0.32 % 165.80 60 0.34 % 169.90 25 0.31 % 160.37 š š 175 0.32 % 169.06 34 0.27 % 147.49 66 0.43 % 223.32 38 0.21 % 107.60 37 0.46 % 237.34 ooo o oo 166 0.31 % 160.37 83 0.65 % 360.05 57 0.37 % 192.87 10 0.06 % 28.32 16 0.20 % 102.63 i i 162 0.30 % 156.51 29 0.23 % 125.80 52 0.34 % 175.95 46 0.26 % 130.26 35 0.43 % 224.51 tipo t ipo 145 0.27 % 140.08 1 0.01 % 4.34 144 0.94 % 487.24 0 0 % 0 0 0 % 0 na n a 144 0.27 % 
139.12 18 0.14 % 78.08 49 0.32 % 165.80 46 0.26 % 130.26 31 0.38 % 198.85 živjo ž ivjo 134 0.25 % 129.46 66 0.51 % 286.31 18 0.12 % 60.91 33 0.19 % 93.44 17 0.21 % 109.05 a a 118 0.22 % 114 27 0.21 % 117.12 61 0.40 % 206.40 18 0.10 % 50.97 12 0.15 % 76.98 za z a 108 0.20 % 104.34 23 0.18 % 99.77 29 0.19 % 98.12 34 0.19 % 96.28 22 0.27 % 141.12 pre p re 107 0.20 % 103.37 27 0.21 % 117.12 24 0.16 % 81.21 34 0.19 % 96.28 22 0.27 % 141.12 g g 92 0.17 % 88.88 26 0.20 % 112.79 22 0.14 % 74.44 30 0.17 % 84.95 14 0.17 % 89.81 r r 87 0.16 % 84.05 15 0.12 % 65.07 25 0.16 % 84.59 33 0.19 % 93.44 14 0.17 % 89.81 CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 725 File at CLARIN.SI2.2.382 List of initial character-level 2-grams from residual lower-case word forms in the GOS 1.0 corpus with text-type distributionGOS1.0-word_parts-residual-lowercase_ forms-initial-2grams-taxonomy-entire.tsvLower-case word form Initial part of the word Rest of the word Total absolute frequency of lower-case word form Percentage of all found lower-case word forms Total relative frequency (per million occurrences) Absolute frequency [gos.T.J.R] Percentage [gos.T.J.R] Relative frequency [gos.T.J.R] Absolute frequency [gos.T.N.Z] Percentage [gos.T.N.Z] Relative frequency [gos.T.N.Z] Absolute frequency [gos.T.J.I] Percentage [gos.T.J.I] Relative frequency [gos.T.J.I] Absolute frequency [gos.T.N.N] Percentage [gos.T.N.N] Relative frequency [gos.T.N.N] eee ee e 23,222 49.74 % 22,434.40 4,248 37.28 % 18,427.66 4,527 35.34 % 15,317.62 10,556 66.97 % 29,891.07 3,891 57.87 % 24,959.43 eem ee m 2,950 6.32 % 2,849.95 392 3.44 % 1,700.48 713 5.57 % 2,412.52 1,210 7.68 % 3,426.32 635 9.44 % 4,073.31 ka ka 772 1.65 % 745.82 424 3.72 % 1,839.30 271 2.12 % 916.96 45 0.29 % 127.42 32 0.48 % 205.27 aaa aa a 288 0.62 % 278.23 67 0.59 % 290.64 95 0.74 % 321.44 95 0.60 % 269.01 31 0.46 % 198.85 nnn nn n 240 0.51 % 231.86 30 0.26 % 130.14 99 0.77 % 334.98 59 0.37 % 167.07 52 0.77 % 333.56 
po po 206 0.44 % 199.01 35 0.31 % 151.83 51 0.40 % 172.56 80 0.51 % 226.53 40 0.59 % 256.59 kao ka o 196 0.42 % 189.35 8 0.07 % 34.70 164 1.28 % 554.91 5 0.03 % 14.16 19 0.28 % 121.88 ooo oo o 166 0.36 % 160.37 83 0.73 % 360.05 57 0.45 % 192.87 10 0.06 % 28.32 16 0.24 % 102.63 tipo ti po 145 0.31 % 140.08 1 0.01 % 4.34 144 1.12 % 487.24 0 0 % 0 0 0 % 0 na na 144 0.31 % 139.12 18 0.16 % 78.08 49 0.38 % 165.80 46 0.29 % 130.26 31 0.46 % 198.85 živjo ži vjo 134 0.29 % 129.46 66 0.58 % 286.31 18 0.14 % 60.91 33 0.21 % 93.44 17 0.25 % 109.05 za za 108 0.23 % 104.34 23 0.20 % 99.77 29 0.23 % 98.12 34 0.22 % 96.28 22 0.33 % 141.12 pre pr e 107 0.23 % 103.37 27 0.24 % 117.12 24 0.19 % 81.21 34 0.22 % 96.28 22 0.33 % 141.12 ne ne 85 0.18 % 82.12 22 0.19 % 95.44 21 0.16 % 71.06 29 0.18 % 82.12 13 0.19 % 83.39 ju ju 76 0.16 % 73.42 56 0.49 % 242.93 17 0.13 % 57.52 3 0.02 % 8.49 0 0 % 0 re re 76 0.16 % 73.42 20 0.18 % 86.76 21 0.16 % 71.06 22 0.14 % 62.30 13 0.19 % 83.39 uuu uu u 74 0.16 % 71.49 35 0.31 % 151.83 27 0.21 % 91.36 6 0.04 % 16.99 6 0.09 % 38.49 ma ma 72 0.15 % 69.56 22 0.19 % 95.44 33 0.26 % 111.66 8 0.05 % 22.65 9 0.13 % 57.73 do do 71 0.15 % 68.59 21 0.18 % 91.10 20 0.16 % 67.67 17 0.11 % 48.14 13 0.19 % 83.39 pr pr 69 0.15 % 66.66 9 0.08 % 39.04 24 0.19 % 81.21 20 0.13 % 56.63 16 0.24 % 102.63 de de 68 0.15 % 65.69 31 0.27 % 134.48 11 0.09 % 37.22 17 0.11 % 48.14 9 0.13 % 57.73 een ee n 64 0.14 % 61.83 10 0.09 % 43.38 10 0.08 % 33.84 36 0.23 % 101.94 8 0.12 % 51.32 jov jo v 61 0.13 % 58.93 56 0.49 % 242.93 0 0 % 0 3 0.02 % 8.49 2 0.03 % 12.83 be be 58 0.12 % 56.03 15 0.13 % 65.07 6 0.05 % 20.30 21 0.13 % 59.46 16 0.24 % 102.63 ta ta 58 0.12 % 56.03 13 0.11 % 56.39 25 0.20 % 84.59 12 0.08 % 33.98 8 0.12 % 51.32 se se 57 0.12 % 55.07 14 0.12 % 60.73 25 0.20 % 84.59 10 0.06 % 28.32 8 0.12 % 51.32 siti si ti 57 0.12 % 55.07 56 0.49 % 242.93 0 0 % 0 1 0.01 % 2.83 0 0 % 0 te te 56 0.12 % 54.10 9 0.08 % 39.04 17 0.13 % 57.52 20 0.13 % 56.63 10 0.15 % 64.15 ko ko 55 
0.12 % 53.13 15 0.13 % 65.07 14 0.11 % 47.37 15 0.10 % 42.47 11 0.16 % 70.56 mo mo 54 0.12 % 52.17 13 0.11 % 56.39 17 0.13 % 57.52 10 0.06 % 28.32 14 0.21 % 89.81 da da 53 0.11 % 51.20 17 0.15 % 73.75 27 0.21 % 91.36 4 0.03 % 11.33 5 0.07 % 32.07 er er 53 0.11 % 51.20 1 0.01 % 4.34 0 0 % 0 52 0.33 % 147.25 0 0 % 0 CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 726 File at CLARIN.SI2.2.383 List of initial character-level 3-grams from residual lower-case word forms in the GOS 1.0 corpus with text-type distributionGOS1.0-word_parts-residual-lowercase_ forms-initial-3grams-taxonomy-entire.tsvLower-case word form Initial part of the word Rest of the word Total absolute frequency of lower-case word form Percentage of all found lower-case word forms Total relative frequency (per million occurrences) Absolute frequency [gos.T.J.R] Percentage [gos.T.J.R] Relative frequency [gos.T.J.R] Absolute frequency [gos.T.N.Z] Percentage [gos.T.N.Z] Relative frequency [gos.T.N.Z] Absolute frequency [gos.T.J.I] Percentage [gos.T.J.I] Relative frequency [gos.T.J.I] Absolute frequency [gos.T.N.N] Percentage [gos.T.N.N] Relative frequency [gos.T.N.N] eee eee 23,222 54.79 % 22,434.40 4,248 42.05 % 18,427.66 4,527 39.89 % 15,317.62 10,556 71.52 % 29,891.07 3,891 63.06 % 24,959.43 eem eem 2,950 6.96 % 2,849.95 392 3.88 % 1,700.48 713 6.28 % 2,412.52 1,210 8.20 % 3,426.32 635 10.29 % 4,073.31 aaa aaa 288 0.68 % 278.23 67 0.66 % 290.64 95 0.84 % 321.44 95 0.64 % 269.01 31 0.50 % 198.85 nnn nnn 240 0.57 % 231.86 30 0.30 % 130.14 99 0.87 % 334.98 59 0.40 % 167.07 52 0.84 % 333.56 kao kao 196 0.46 % 189.35 8 0.08 % 34.70 164 1.45 % 554.91 5 0.03 % 14.16 19 0.31 % 121.88 ooo ooo 166 0.39 % 160.37 83 0.82 % 360.05 57 0.50 % 192.87 10 0.07 % 28.32 16 0.26 % 102.63 tipo tip o 145 0.34 % 140.08 1 0.01 % 4.34 144 1.27 % 487.24 0 0 % 0 0 0 % 0 živjo živ jo 134 0.32 % 129.46 66 0.65 % 286.31 18 0.16 % 60.91 33 0.22 % 93.44 17 0.28 % 109.05 pre pre 107 0.25 % 103.37 27 0.27 
% 117.12 24 0.21 % 81.21 34 0.23 % 96.28 22 0.36 % 141.12 uuu uuu 74 0.17 % 71.49 35 0.35 % 151.83 27 0.24 % 91.36 6 0.04 % 16.99 6 0.10 % 38.49 een een 64 0.15 % 61.83 10 0.10 % 43.38 10 0.09 % 33.84 36 0.24 % 101.94 8 0.13 % 51.32 jov jov 61 0.14 % 58.93 56 0.55 % 242.93 0 0 % 0 3 0.02 % 8.49 2 0.03 % 12.83 siti sit i 57 0.13 % 55.07 56 0.55 % 242.93 0 0 % 0 1 0.01 % 2.83 0 0 % 0 kapris kap ris 48 0.11 % 46.37 48 0.47 % 208.22 0 0 % 0 0 0 % 0 0 0 % 0 aam aam 44 0.10 % 42.51 5 0.05 % 21.69 16 0.14 % 54.14 18 0.12 % 50.97 5 0.08 % 32.07 hili hil i 43 0.10 % 41.54 42 0.42 % 182.19 1 0.01 % 3.38 0 0 % 0 0 0 % 0 hmm hmm 43 0.10 % 41.54 14 0.14 % 60.73 25 0.22 % 84.59 0 0 % 0 4 0.07 % 25.66 pri pri 40 0.09 % 38.64 9 0.09 % 39.04 8 0.07 % 27.07 19 0.13 % 53.80 4 0.07 % 25.66 ovi ovi 38 0.09 % 36.71 1 0.01 % 4.34 34 0.30 % 115.04 3 0.02 % 8.49 0 0 % 0 brezveze bre zveze 37 0.09 % 35.75 3 0.03 % 13.01 14 0.12 % 47.37 9 0.06 % 25.48 11 0.18 % 70.56 eeh eeh 33 0.08 % 31.88 0 0 % 0 19 0.17 % 64.29 13 0.09 % 36.81 1 0.02 % 6.41 poutiajnen pou tiajnen 33 0.08 % 31.88 33 0.33 % 143.15 0 0 % 0 0 0 % 0 0 0 % 0 anka ank a 30 0.07 % 28.98 8 0.08 % 34.70 22 0.19 % 74.44 0 0 % 0 0 0 % 0 komot kom ot 30 0.07 % 28.98 5 0.05 % 21.69 21 0.18 % 71.06 1 0.01 % 2.83 3 0.05 % 19.24 belvi bel vi 29 0.07 % 28.02 29 0.29 % 125.80 0 0 % 0 0 0 % 0 0 0 % 0 fraj fra j 29 0.07 % 28.02 8 0.08 % 34.70 19 0.17 % 64.29 1 0.01 % 2.83 1 0.02 % 6.41 bot bot 28 0.07 % 27.05 4 0.04 % 17.35 23 0.20 % 77.82 0 0 % 0 1 0.02 % 6.41 pro pro 28 0.07 % 27.05 4 0.04 % 17.35 4 0.04 % 13.53 16 0.11 % 45.31 4 0.07 % 25.66 direkt dir ekt 27 0.06 % 26.08 3 0.03 % 13.01 19 0.17 % 64.29 3 0.02 % 8.49 2 0.03 % 12.83 fini fin i 27 0.06 % 26.08 27 0.27 % 117.12 0 0 % 0 0 0 % 0 0 0 % 0 mis mis 27 0.06 % 26.08 15 0.15 % 65.07 7 0.06 % 23.69 3 0.02 % 8.49 2 0.03 % 12.83 alora alo ra 26 0.06 % 25.12 12 0.12 % 52.06 14 0.12 % 47.37 0 0 % 0 0 0 % 0 CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 727 
File at CLARIN.SI2.2.384 List of initial character-level 4-grams from residual lower-case word forms in the GOS 1.0 corpus with text-type distributionGOS1.0-word_parts-residual-lowercase_ forms-initial-4grams-taxonomy-entire.tsvLower-case word form Initial part of the word Rest of the word Total absolute frequency of lower-case word form Percentage of all found lower-case word forms Total relative frequency (per million occurrences) Absolute frequency [gos.T.J.R] Percentage [gos.T.J.R] Relative frequency [gos.T.J.R] Absolute frequency [gos.T.N.Z] Percentage [gos.T.N.Z] Relative frequency [gos.T.N.Z] Absolute frequency [gos.T.J.I] Percentage [gos.T.J.I] Relative frequency [gos.T.J.I] Absolute frequency [gos.T.N.N] Percentage [gos.T.N.N] Relative frequency [gos.T.N.N] tipo tipo 145 1.22 % 140.08 1 0.02 % 4.34 144 3.18 % 487.24 0 0 % 0 0 0 % 0 živjo živj o 134 1.12 % 129.46 66 1.58 % 286.31 18 0.40 % 60.91 33 1.60 % 93.44 17 1.47 % 109.05 siti siti 57 0.48 % 55.07 56 1.34 % 242.93 0 0 % 0 1 0.05 % 2.83 0 0 % 0 kapris kapr is 48 0.40 % 46.37 48 1.15 % 208.22 0 0 % 0 0 0 % 0 0 0 % 0 hili hili 43 0.36 % 41.54 42 1.01 % 182.19 1 0.02 % 3.38 0 0 % 0 0 0 % 0 brezveze brez veze 37 0.31 % 35.75 3 0.07 % 13.01 14 0.31 % 47.37 9 0.44 % 25.48 11 0.95 % 70.56 poutiajnen pout iajnen 33 0.28 % 31.88 33 0.79 % 143.15 0 0 % 0 0 0 % 0 0 0 % 0 anka anka 30 0.25 % 28.98 8 0.19 % 34.70 22 0.49 % 74.44 0 0 % 0 0 0 % 0 komot komo t 30 0.25 % 28.98 5 0.12 % 21.69 21 0.46 % 71.06 1 0.05 % 2.83 3 0.26 % 19.24 belvi belv i 29 0.24 % 28.02 29 0.70 % 125.80 0 0 % 0 0 0 % 0 0 0 % 0 fraj fraj 29 0.24 % 28.02 8 0.19 % 34.70 19 0.42 % 64.29 1 0.05 % 2.83 1 0.09 % 6.41 direkt dire kt 27 0.23 % 26.08 3 0.07 % 13.01 19 0.42 % 64.29 3 0.14 % 8.49 2 0.17 % 12.83 fini fini 27 0.23 % 26.08 27 0.65 % 117.12 0 0 % 0 0 0 % 0 0 0 % 0 alora alor a 26 0.22 % 25.12 12 0.29 % 52.06 14 0.31 % 47.37 0 0 % 0 0 0 % 0 hambrt hamb rt 26 0.22 % 25.12 0 0 % 0 26 0.57 % 87.97 0 0 % 0 0 0 % 0 tavžnt tavž nt 25 0.21 % 24.15 
14 0.34 % 60.73 11 0.24 % 37.22 0 0 % 0 0 0 % 0 rajtam rajt am 24 0.20 % 23.19 0 0 % 0 24 0.53 % 81.21 0 0 % 0 0 0 % 0 boruc boru c 23 0.19 % 22.22 23 0.55 % 99.77 0 0 % 0 0 0 % 0 0 0 % 0 frej frej 22 0.18 % 21.25 4 0.10 % 17.35 18 0.40 % 60.91 0 0 % 0 0 0 % 0 mega mega 22 0.18 % 21.25 4 0.10 % 17.35 2 0.04 % 6.77 0 0 % 0 16 1.38 % 102.63 brant bran t 21 0.18 % 20.29 21 0.50 % 91.10 0 0 % 0 0 0 % 0 0 0 % 0 ukvarja ukva rja 21 0.18 % 20.29 6 0.14 % 26.03 2 0.04 % 6.77 8 0.39 % 22.65 5 0.43 % 32.07 belviju belv iju 19 0.16 % 18.36 19 0.46 % 82.42 0 0 % 0 0 0 % 0 0 0 % 0 metiloranž meti loranž 19 0.16 % 18.36 0 0 % 0 0 0 % 0 19 0.92 % 53.80 0 0 % 0 trbelo trbe lo 19 0.16 % 18.36 14 0.34 % 60.73 5 0.11 % 16.92 0 0 % 0 0 0 % 0 cetel cete l 18 0.15 % 17.39 18 0.43 % 78.08 0 0 % 0 0 0 % 0 0 0 % 0 kumbaja kumb aja 18 0.15 % 17.39 18 0.43 % 78.08 0 0 % 0 0 0 % 0 0 0 % 0 orng orng 18 0.15 % 17.39 5 0.12 % 21.69 10 0.22 % 33.84 1 0.05 % 2.83 2 0.17 % 12.83 pink pink 17 0.14 % 16.42 10 0.24 % 43.38 6 0.13 % 20.30 0 0 % 0 1 0.09 % 6.41 takvida takv ida 17 0.14 % 16.42 0 0 % 0 17 0.38 % 57.52 0 0 % 0 0 0 % 0 žijava žija va 17 0.14 % 16.42 0 0 % 0 17 0.38 % 57.52 0 0 % 0 0 0 % 0 lajf lajf 16 0.13 % 15.46 10 0.24 % 43.38 6 0.13 % 20.30 0 0 % 0 0 0 % 0 CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 728 File at CLARIN.SI2.2.385 List of initial character-level 5-grams from residual lower-case word forms in the GOS 1.0 corpus with text-type distributionGOS1.0-word_parts-residual-lowercase_ forms-initial-5grams-taxonomy-entire.tsvLower-case word form Initial part of the word Rest of the word Total absolute frequency of lower-case word form Percentage of all found lower-case word forms Total relative frequency (per million occurrences) Absolute frequency [gos.T.J.R] Percentage [gos.T.J.R] Relative frequency [gos.T.J.R] Absolute frequency [gos.T.N.Z] Percentage [gos.T.N.Z] Relative frequency [gos.T.N.Z] Absolute frequency [gos.T.J.I] Percentage 
[gos.T.J.I] Relative frequency [gos.T.J.I] Absolute frequency [gos.T.N.N] Percentage [gos.T.N.N] Relative frequency [gos.T.N.N] živjo živjo 134 1.46 % 129.46 66 2.09 % 286.31 18 0.52 % 60.91 33 2.04 % 93.44 17 1.85 % 109.05 kapris kapri s 48 0.52 % 46.37 48 1.52 % 208.22 0 0 % 0 0 0 % 0 0 0 % 0 brezveze brezv eze 37 0.40 % 35.75 3 0.10 % 13.01 14 0.40 % 47.37 9 0.56 % 25.48 11 1.20 % 70.56 poutiajnen pouti ajnen 33 0.36 % 31.88 33 1.05 % 143.15 0 0 % 0 0 0 % 0 0 0 % 0 komot komot 30 0.33 % 28.98 5 0.16 % 21.69 21 0.60 % 71.06 1 0.06 % 2.83 3 0.33 % 19.24 belvi belvi 29 0.32 % 28.02 29 0.92 % 125.80 0 0 % 0 0 0 % 0 0 0 % 0 direkt direk t 27 0.29 % 26.08 3 0.10 % 13.01 19 0.55 % 64.29 3 0.18 % 8.49 2 0.22 % 12.83 alora alora 26 0.28 % 25.12 12 0.38 % 52.06 14 0.40 % 47.37 0 0 % 0 0 0 % 0 hambrt hambr t 26 0.28 % 25.12 0 0 % 0 26 0.75 % 87.97 0 0 % 0 0 0 % 0 tavžnt tavžn t 25 0.27 % 24.15 14 0.44 % 60.73 11 0.32 % 37.22 0 0 % 0 0 0 % 0 rajtam rajta m 24 0.26 % 23.19 0 0 % 0 24 0.69 % 81.21 0 0 % 0 0 0 % 0 boruc boruc 23 0.25 % 22.22 23 0.73 % 99.77 0 0 % 0 0 0 % 0 0 0 % 0 brant brant 21 0.23 % 20.29 21 0.67 % 91.10 0 0 % 0 0 0 % 0 0 0 % 0 ukvarja ukvar ja 21 0.23 % 20.29 6 0.19 % 26.03 2 0.06 % 6.77 8 0.49 % 22.65 5 0.55 % 32.07 belviju belvi ju 19 0.21 % 18.36 19 0.60 % 82.42 0 0 % 0 0 0 % 0 0 0 % 0 metiloranž metil oranž 19 0.21 % 18.36 0 0 % 0 0 0 % 0 19 1.17 % 53.80 0 0 % 0 trbelo trbel o 19 0.21 % 18.36 14 0.44 % 60.73 5 0.14 % 16.92 0 0 % 0 0 0 % 0 cetel cetel 18 0.20 % 17.39 18 0.57 % 78.08 0 0 % 0 0 0 % 0 0 0 % 0 kumbaja kumba ja 18 0.20 % 17.39 18 0.57 % 78.08 0 0 % 0 0 0 % 0 0 0 % 0 takvida takvi da 17 0.19 % 16.42 0 0 % 0 17 0.49 % 57.52 0 0 % 0 0 0 % 0 žijava žijav a 17 0.19 % 16.42 0 0 % 0 17 0.49 % 57.52 0 0 % 0 0 0 % 0 luškan luška n 16 0.17 % 15.46 9 0.28 % 39.04 1 0.03 % 3.38 1 0.06 % 2.83 5 0.55 % 32.07 magar magar 16 0.17 % 15.46 2 0.06 % 8.68 11 0.32 % 37.22 1 0.06 % 2.83 2 0.22 % 12.83 porka porka 16 0.17 % 15.46 15 0.48 % 65.07 1 0.03 % 3.38 0 0 
% 0 0 0 % 0 tavžent tavže nt 16 0.17 % 15.46 4 0.13 % 17.35 8 0.23 % 27.07 3 0.18 % 8.49 1 0.11 % 6.41 žijaš žijaš 16 0.17 % 15.46 0 0 % 0 16 0.46 % 54.14 0 0 % 0 0 0 % 0 abada abada 15 0.16 % 14.49 0 0 % 0 15 0.43 % 50.75 0 0 % 0 0 0 % 0 furja furja 15 0.16 % 14.49 15 0.48 % 65.07 0 0 % 0 0 0 % 0 0 0 % 0 inveče inveč e 15 0.16 % 14.49 0 0 % 0 15 0.43 % 50.75 0 0 % 0 0 0 % 0 žinovek žinov ek 15 0.16 % 14.49 15 0.48 % 65.07 0 0 % 0 0 0 % 0 0 0 % 0 gereiro gerei ro 14 0.15 % 13.53 14 0.44 % 60.73 0 0 % 0 0 0 % 0 0 0 % 0 ireneuš irene uš 14 0.15 % 13.53 14 0.44 % 60.73 0 0 % 0 0 0 % 0 0 0 % 0 CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 729 File at CLARIN.SI2.2.386 List of final character-level 1-grams from residual lower-case word forms in the GOS 1.0 corpus with text- type distributionGOS1.0-word_parts-residual-lowercase_ forms-final-1grams-taxonomy-entire.tsvLower-case word form Rest of the word Final part of the word Total absolute frequency of lower-case word form Percentage of all found lower-case word forms Total relative frequency (per million occurrences) Absolute frequency [gos.T.J.R] Percentage [gos.T.J.R] Relative frequency [gos.T.J.R] Absolute frequency [gos.T.N.Z] Percentage [gos.T.N.Z] Relative frequency [gos.T.N.Z] Absolute frequency [gos.T.J.I] Percentage [gos.T.J.I] Relative frequency [gos.T.J.I] Absolute frequency [gos.T.N.N] Percentage [gos.T.N.N] Relative frequency [gos.T.N.N] eee ee e 23,222 42.92 % 22,434.40 4,248 33.09 % 18,427.66 4,527 29.40 % 15,317.62 10,556 59.40 % 29,891.07 3,891 48.07 % 24,959.43 eem ee m 2,950 5.45 % 2,849.95 392 3.05 % 1,700.48 713 4.63 % 2,412.52 1,210 6.81 % 3,426.32 635 7.84 % 4,073.31 s s 1,341 2.48 % 1,295.52 221 1.72 % 958.69 492 3.19 % 1,664.74 375 2.11 % 1,061.87 253 3.12 % 1,622.91 ka k a 772 1.43 % 745.82 424 3.30 % 1,839.30 271 1.76 % 916.96 45 0.25 % 127.42 32 0.40 % 205.27 n n 766 1.42 % 740.02 108 0.84 % 468.50 298 1.94 % 1,008.32 205 1.15 % 580.49 155 1.92 % 994.27 p p 
519 0.96 % 501.40 87 0.68 % 377.40 168 1.09 % 568.45 155 0.87 % 438.91 109 1.35 % 699.20 z z 490 0.91 % 473.38 79 0.61 % 342.70 161 1.04 % 544.76 152 0.85 % 430.41 98 1.21 % 628.64 t t 465 0.86 % 449.23 77 0.60 % 334.02 186 1.21 % 629.35 92 0.52 % 260.51 110 1.36 % 705.61 m m 424 0.78 % 409.62 79 0.61 % 342.70 161 1.04 % 544.76 93 0.52 % 263.34 91 1.12 % 583.73 d d 412 0.76 % 398.03 149 1.16 % 646.36 126 0.82 % 426.34 94 0.53 % 266.18 43 0.53 % 275.83 b b 377 0.70 % 364.21 76 0.59 % 329.69 111 0.72 % 375.58 133 0.75 % 376.61 57 0.70 % 365.64 j j 368 0.68 % 355.52 90 0.70 % 390.42 133 0.86 % 450.02 81 0.46 % 229.36 64 0.79 % 410.54 k k 367 0.68 % 354.55 75 0.58 % 325.35 124 0.81 % 419.57 92 0.52 % 260.51 76 0.94 % 487.51 aaa aa a 288 0.53 % 278.23 67 0.52 % 290.64 95 0.62 % 321.44 95 0.54 % 269.01 31 0.38 % 198.85 v v 269 0.50 % 259.88 59 0.46 % 255.94 67 0.43 % 226.70 107 0.60 % 302.99 36 0.45 % 230.93 e e 249 0.46 % 240.55 61 0.47 % 264.62 102 0.66 % 345.13 59 0.33 % 167.07 27 0.33 % 173.20 nnn nn n 240 0.44 % 231.86 30 0.23 % 130.14 99 0.64 % 334.98 59 0.33 % 167.07 52 0.64 % 333.56 po p o 206 0.38 % 199.01 35 0.27 % 151.83 51 0.33 % 172.56 80 0.45 % 226.53 40 0.49 % 256.59 o o 202 0.37 % 195.15 34 0.27 % 147.49 53 0.34 % 179.33 69 0.39 % 195.38 46 0.57 % 295.07 kao ka o 196 0.36 % 189.35 8 0.06 % 34.70 164 1.06 % 554.91 5 0.03 % 14.16 19 0.23 % 121.88 u u 184 0.34 % 177.76 50 0.39 % 216.90 49 0.32 % 165.80 60 0.34 % 169.90 25 0.31 % 160.37 š š 175 0.32 % 169.06 34 0.27 % 147.49 66 0.43 % 223.32 38 0.21 % 107.60 37 0.46 % 237.34 ooo oo o 166 0.31 % 160.37 83 0.65 % 360.05 57 0.37 % 192.87 10 0.06 % 28.32 16 0.20 % 102.63 i i 162 0.30 % 156.51 29 0.23 % 125.80 52 0.34 % 175.95 46 0.26 % 130.26 35 0.43 % 224.51 tipo tip o 145 0.27 % 140.08 1 0.01 % 4.34 144 0.94 % 487.24 0 0 % 0 0 0 % 0 na n a 144 0.27 % 139.12 18 0.14 % 78.08 49 0.32 % 165.80 46 0.26 % 130.26 31 0.38 % 198.85 živjo živj o 134 0.25 % 129.46 66 0.51 % 286.31 18 0.12 % 60.91 33 0.19 % 93.44 17 0.21 % 
109.05 a a 118 0.22 % 114 27 0.21 % 117.12 61 0.40 % 206.40 18 0.10 % 50.97 12 0.15 % 76.98 za z a 108 0.20 % 104.34 23 0.18 % 99.77 29 0.19 % 98.12 34 0.19 % 96.28 22 0.27 % 141.12 pre pr e 107 0.20 % 103.37 27 0.21 % 117.12 24 0.16 % 81.21 34 0.19 % 96.28 22 0.27 % 141.12 g g 92 0.17 % 88.88 26 0.20 % 112.79 22 0.14 % 74.44 30 0.17 % 84.95 14 0.17 % 89.81 r r 87 0.16 % 84.05 15 0.12 % 65.07 25 0.16 % 84.59 33 0.19 % 93.44 14 0.17 % 89.81 CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 730 File at CLARIN.SI2.2.387 List of final character-level 2-grams from residual lower-case word forms in the GOS 1.0 corpus with text- type distributionGOS1.0-word_parts-residual-lowercase_ forms-final-2grams-taxonomy-entire.tsvLower-case word form Rest of the word Final part of the word Total absolute frequency of lower-case word form Percentage of all found lower-case word forms Total relative frequency (per million occurrences) Absolute frequency [gos.T.J.R] Percentage [gos.T.J.R] Relative frequency [gos.T.J.R] Absolute frequency [gos.T.N.Z] Percentage [gos.T.N.Z] Relative frequency [gos.T.N.Z] Absolute frequency [gos.T.J.I] Percentage [gos.T.J.I] Relative frequency [gos.T.J.I] Absolute frequency [gos.T.N.N] Percentage [gos.T.N.N] Relative frequency [gos.T.N.N] eee e ee 23,222 49.74 % 22,434.40 4,248 37.28 % 18,427.66 4,527 35.34 % 15,317.62 10,556 66.97 % 29,891.07 3,891 57.87 % 24,959.43 eem e em 2,950 6.32 % 2,849.95 392 3.44 % 1,700.48 713 5.57 % 2,412.52 1,210 7.68 % 3,426.32 635 9.44 % 4,073.31 ka ka 772 1.65 % 745.82 424 3.72 % 1,839.30 271 2.12 % 916.96 45 0.29 % 127.42 32 0.48 % 205.27 aaa a aa 288 0.62 % 278.23 67 0.59 % 290.64 95 0.74 % 321.44 95 0.60 % 269.01 31 0.46 % 198.85 nnn n nn 240 0.51 % 231.86 30 0.26 % 130.14 99 0.77 % 334.98 59 0.37 % 167.07 52 0.77 % 333.56 po po 206 0.44 % 199.01 35 0.31 % 151.83 51 0.40 % 172.56 80 0.51 % 226.53 40 0.59 % 256.59 kao k ao 196 0.42 % 189.35 8 0.07 % 34.70 164 1.28 % 554.91 5 0.03 % 14.16 19 
0.28 % 121.88 ooo o oo 166 0.36 % 160.37 83 0.73 % 360.05 57 0.45 % 192.87 10 0.06 % 28.32 16 0.24 % 102.63 tipo ti po 145 0.31 % 140.08 1 0.01 % 4.34 144 1.12 % 487.24 0 0 % 0 0 0 % 0 na na 144 0.31 % 139.12 18 0.16 % 78.08 49 0.38 % 165.80 46 0.29 % 130.26 31 0.46 % 198.85 živjo živ jo 134 0.29 % 129.46 66 0.58 % 286.31 18 0.14 % 60.91 33 0.21 % 93.44 17 0.25 % 109.05 za za 108 0.23 % 104.34 23 0.20 % 99.77 29 0.23 % 98.12 34 0.22 % 96.28 22 0.33 % 141.12 pre p re 107 0.23 % 103.37 27 0.24 % 117.12 24 0.19 % 81.21 34 0.22 % 96.28 22 0.33 % 141.12 ne ne 85 0.18 % 82.12 22 0.19 % 95.44 21 0.16 % 71.06 29 0.18 % 82.12 13 0.19 % 83.39 ju ju 76 0.16 % 73.42 56 0.49 % 242.93 17 0.13 % 57.52 3 0.02 % 8.49 0 0 % 0 re re 76 0.16 % 73.42 20 0.18 % 86.76 21 0.16 % 71.06 22 0.14 % 62.30 13 0.19 % 83.39 uuu u uu 74 0.16 % 71.49 35 0.31 % 151.83 27 0.21 % 91.36 6 0.04 % 16.99 6 0.09 % 38.49 ma ma 72 0.15 % 69.56 22 0.19 % 95.44 33 0.26 % 111.66 8 0.05 % 22.65 9 0.13 % 57.73 do do 71 0.15 % 68.59 21 0.18 % 91.10 20 0.16 % 67.67 17 0.11 % 48.14 13 0.19 % 83.39 pr pr 69 0.15 % 66.66 9 0.08 % 39.04 24 0.19 % 81.21 20 0.13 % 56.63 16 0.24 % 102.63 de de 68 0.15 % 65.69 31 0.27 % 134.48 11 0.09 % 37.22 17 0.11 % 48.14 9 0.13 % 57.73 een e en 64 0.14 % 61.83 10 0.09 % 43.38 10 0.08 % 33.84 36 0.23 % 101.94 8 0.12 % 51.32 jov j ov 61 0.13 % 58.93 56 0.49 % 242.93 0 0 % 0 3 0.02 % 8.49 2 0.03 % 12.83 be be 58 0.12 % 56.03 15 0.13 % 65.07 6 0.05 % 20.30 21 0.13 % 59.46 16 0.24 % 102.63 ta ta 58 0.12 % 56.03 13 0.11 % 56.39 25 0.20 % 84.59 12 0.08 % 33.98 8 0.12 % 51.32 se se 57 0.12 % 55.07 14 0.12 % 60.73 25 0.20 % 84.59 10 0.06 % 28.32 8 0.12 % 51.32 siti si ti 57 0.12 % 55.07 56 0.49 % 242.93 0 0 % 0 1 0.01 % 2.83 0 0 % 0 te te 56 0.12 % 54.10 9 0.08 % 39.04 17 0.13 % 57.52 20 0.13 % 56.63 10 0.15 % 64.15 ko ko 55 0.12 % 53.13 15 0.13 % 65.07 14 0.11 % 47.37 15 0.10 % 42.47 11 0.16 % 70.56 mo mo 54 0.12 % 52.17 13 0.11 % 56.39 17 0.13 % 57.52 10 0.06 % 28.32 14 0.21 % 89.81 da da 53 
0.11 % 51.20 17 0.15 % 73.75 27 0.21 % 91.36 4 0.03 % 11.33 5 0.07 % 32.07 er er 53 0.11 % 51.20 1 0.01 % 4.34 0 0 % 0 52 0.33 % 147.25 0 0 % 0 CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 731 File at CLARIN.SI2.2.388 List of final character-level 3-grams from residual lower-case word forms in the GOS 1.0 corpus with text- type distributionGOS1.0-word_parts-residual-lowercase_ forms-final-3grams-taxonomy-entire.tsvLower-case word form Rest of the word Final part of the word Total absolute frequency of lower-case word form Percentage of all found lower-case word forms Total relative frequency (per million occurrences) Absolute frequency [gos.T.J.R] Percentage [gos.T.J.R] Relative frequency [gos.T.J.R] Absolute frequency [gos.T.N.Z] Percentage [gos.T.N.Z] Relative frequency [gos.T.N.Z] Absolute frequency [gos.T.J.I] Percentage [gos.T.J.I] Relative frequency [gos.T.J.I] Absolute frequency [gos.T.N.N] Percentage [gos.T.N.N] Relative frequency [gos.T.N.N] eee eee 23,222 54.79 % 22,434.40 4,248 42.05 % 18,427.66 4,527 39.89 % 15,317.62 10,556 71.52 % 29,891.07 3,891 63.06 % 24,959.43 eem eem 2,950 6.96 % 2,849.95 392 3.88 % 1,700.48 713 6.28 % 2,412.52 1,210 8.20 % 3,426.32 635 10.29 % 4,073.31 aaa aaa 288 0.68 % 278.23 67 0.66 % 290.64 95 0.84 % 321.44 95 0.64 % 269.01 31 0.50 % 198.85 nnn nnn 240 0.57 % 231.86 30 0.30 % 130.14 99 0.87 % 334.98 59 0.40 % 167.07 52 0.84 % 333.56 kao kao 196 0.46 % 189.35 8 0.08 % 34.70 164 1.45 % 554.91 5 0.03 % 14.16 19 0.31 % 121.88 ooo ooo 166 0.39 % 160.37 83 0.82 % 360.05 57 0.50 % 192.87 10 0.07 % 28.32 16 0.26 % 102.63 tipo t ipo 145 0.34 % 140.08 1 0.01 % 4.34 144 1.27 % 487.24 0 0 % 0 0 0 % 0 živjo ži vjo 134 0.32 % 129.46 66 0.65 % 286.31 18 0.16 % 60.91 33 0.22 % 93.44 17 0.28 % 109.05 pre pre 107 0.25 % 103.37 27 0.27 % 117.12 24 0.21 % 81.21 34 0.23 % 96.28 22 0.36 % 141.12 uuu uuu 74 0.17 % 71.49 35 0.35 % 151.83 27 0.24 % 91.36 6 0.04 % 16.99 6 0.10 % 38.49 een een 64 0.15 % 61.83 10 0.10 
% 43.38 10 0.09 % 33.84 36 0.24 % 101.94 8 0.13 % 51.32 jov jov 61 0.14 % 58.93 56 0.55 % 242.93 0 0 % 0 3 0.02 % 8.49 2 0.03 % 12.83 siti s iti 57 0.13 % 55.07 56 0.55 % 242.93 0 0 % 0 1 0.01 % 2.83 0 0 % 0 kapris kap ris 48 0.11 % 46.37 48 0.47 % 208.22 0 0 % 0 0 0 % 0 0 0 % 0 aam aam 44 0.10 % 42.51 5 0.05 % 21.69 16 0.14 % 54.14 18 0.12 % 50.97 5 0.08 % 32.07 hili h ili 43 0.10 % 41.54 42 0.42 % 182.19 1 0.01 % 3.38 0 0 % 0 0 0 % 0 hmm hmm 43 0.10 % 41.54 14 0.14 % 60.73 25 0.22 % 84.59 0 0 % 0 4 0.07 % 25.66 pri pri 40 0.09 % 38.64 9 0.09 % 39.04 8 0.07 % 27.07 19 0.13 % 53.80 4 0.07 % 25.66 ovi ovi 38 0.09 % 36.71 1 0.01 % 4.34 34 0.30 % 115.04 3 0.02 % 8.49 0 0 % 0 brezveze brezv eze 37 0.09 % 35.75 3 0.03 % 13.01 14 0.12 % 47.37 9 0.06 % 25.48 11 0.18 % 70.56 eeh eeh 33 0.08 % 31.88 0 0 % 0 19 0.17 % 64.29 13 0.09 % 36.81 1 0.02 % 6.41 poutiajnen poutiaj nen 33 0.08 % 31.88 33 0.33 % 143.15 0 0 % 0 0 0 % 0 0 0 % 0 anka a nka 30 0.07 % 28.98 8 0.08 % 34.70 22 0.19 % 74.44 0 0 % 0 0 0 % 0 komot ko mot 30 0.07 % 28.98 5 0.05 % 21.69 21 0.18 % 71.06 1 0.01 % 2.83 3 0.05 % 19.24 belvi be lvi 29 0.07 % 28.02 29 0.29 % 125.80 0 0 % 0 0 0 % 0 0 0 % 0 fraj f raj 29 0.07 % 28.02 8 0.08 % 34.70 19 0.17 % 64.29 1 0.01 % 2.83 1 0.02 % 6.41 bot bot 28 0.07 % 27.05 4 0.04 % 17.35 23 0.20 % 77.82 0 0 % 0 1 0.02 % 6.41 pro pro 28 0.07 % 27.05 4 0.04 % 17.35 4 0.04 % 13.53 16 0.11 % 45.31 4 0.07 % 25.66 direkt dir ekt 27 0.06 % 26.08 3 0.03 % 13.01 19 0.17 % 64.29 3 0.02 % 8.49 2 0.03 % 12.83 fini f ini 27 0.06 % 26.08 27 0.27 % 117.12 0 0 % 0 0 0 % 0 0 0 % 0 mis mis 27 0.06 % 26.08 15 0.15 % 65.07 7 0.06 % 23.69 3 0.02 % 8.49 2 0.03 % 12.83 alora al ora 26 0.06 % 25.12 12 0.12 % 52.06 14 0.12 % 47.37 0 0 % 0 0 0 % 0 CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 732 File at CLARIN.SI2.2.389 List of final character-level 4-grams from residual lower-case word forms in the GOS 1.0 corpus with text- type 
distributionGOS1.0-word_parts-residual-lowercase_ forms-final-4grams-taxonomy-entire.tsvLower-case word form Rest of the word Final part of the word Total absolute frequency of lower-case word form Percentage of all found lower-case word forms Total relative frequency (per million occurrences) Absolute frequency [gos.T.J.R] Percentage [gos.T.J.R] Relative frequency [gos.T.J.R] Absolute frequency [gos.T.N.Z] Percentage [gos.T.N.Z] Relative frequency [gos.T.N.Z] Absolute frequency [gos.T.J.I] Percentage [gos.T.J.I] Relative frequency [gos.T.J.I] Absolute frequency [gos.T.N.N] Percentage [gos.T.N.N] Relative frequency [gos.T.N.N] tipo tipo 145 1.22 % 140.08 1 0.02 % 4.34 144 3.18 % 487.24 0 0 % 0 0 0 % 0 živjo ž ivjo 134 1.12 % 129.46 66 1.58 % 286.31 18 0.40 % 60.91 33 1.60 % 93.44 17 1.47 % 109.05 siti siti 57 0.48 % 55.07 56 1.34 % 242.93 0 0 % 0 1 0.05 % 2.83 0 0 % 0 kapris ka pris 48 0.40 % 46.37 48 1.15 % 208.22 0 0 % 0 0 0 % 0 0 0 % 0 hili hili 43 0.36 % 41.54 42 1.01 % 182.19 1 0.02 % 3.38 0 0 % 0 0 0 % 0 brezveze brez veze 37 0.31 % 35.75 3 0.07 % 13.01 14 0.31 % 47.37 9 0.44 % 25.48 11 0.95 % 70.56 poutiajnen poutia jnen 33 0.28 % 31.88 33 0.79 % 143.15 0 0 % 0 0 0 % 0 0 0 % 0 anka anka 30 0.25 % 28.98 8 0.19 % 34.70 22 0.49 % 74.44 0 0 % 0 0 0 % 0 komot k omot 30 0.25 % 28.98 5 0.12 % 21.69 21 0.46 % 71.06 1 0.05 % 2.83 3 0.26 % 19.24 belvi b elvi 29 0.24 % 28.02 29 0.70 % 125.80 0 0 % 0 0 0 % 0 0 0 % 0 fraj fraj 29 0.24 % 28.02 8 0.19 % 34.70 19 0.42 % 64.29 1 0.05 % 2.83 1 0.09 % 6.41 direkt di rekt 27 0.23 % 26.08 3 0.07 % 13.01 19 0.42 % 64.29 3 0.14 % 8.49 2 0.17 % 12.83 fini fini 27 0.23 % 26.08 27 0.65 % 117.12 0 0 % 0 0 0 % 0 0 0 % 0 alora a lora 26 0.22 % 25.12 12 0.29 % 52.06 14 0.31 % 47.37 0 0 % 0 0 0 % 0 hambrt ha mbrt 26 0.22 % 25.12 0 0 % 0 26 0.57 % 87.97 0 0 % 0 0 0 % 0 tavžnt ta vžnt 25 0.21 % 24.15 14 0.34 % 60.73 11 0.24 % 37.22 0 0 % 0 0 0 % 0 rajtam ra jtam 24 0.20 % 23.19 0 0 % 0 24 0.53 % 81.21 0 0 % 0 0 0 % 0 boruc b oruc 23 0.19 % 
22.22 23 0.55 % 99.77 0 0 % 0 0 0 % 0 0 0 % 0 frej frej 22 0.18 % 21.25 4 0.10 % 17.35 18 0.40 % 60.91 0 0 % 0 0 0 % 0 mega mega 22 0.18 % 21.25 4 0.10 % 17.35 2 0.04 % 6.77 0 0 % 0 16 1.38 % 102.63 brant b rant 21 0.18 % 20.29 21 0.50 % 91.10 0 0 % 0 0 0 % 0 0 0 % 0 ukvarja ukv arja 21 0.18 % 20.29 6 0.14 % 26.03 2 0.04 % 6.77 8 0.39 % 22.65 5 0.43 % 32.07 belviju bel viju 19 0.16 % 18.36 19 0.46 % 82.42 0 0 % 0 0 0 % 0 0 0 % 0 metiloranž metilo ranž 19 0.16 % 18.36 0 0 % 0 0 0 % 0 19 0.92 % 53.80 0 0 % 0 trbelo tr belo 19 0.16 % 18.36 14 0.34 % 60.73 5 0.11 % 16.92 0 0 % 0 0 0 % 0 cetel c etel 18 0.15 % 17.39 18 0.43 % 78.08 0 0 % 0 0 0 % 0 0 0 % 0 kumbaja kum baja 18 0.15 % 17.39 18 0.43 % 78.08 0 0 % 0 0 0 % 0 0 0 % 0 orng orng 18 0.15 % 17.39 5 0.12 % 21.69 10 0.22 % 33.84 1 0.05 % 2.83 2 0.17 % 12.83 pink pink 17 0.14 % 16.42 10 0.24 % 43.38 6 0.13 % 20.30 0 0 % 0 1 0.09 % 6.41 takvida tak vida 17 0.14 % 16.42 0 0 % 0 17 0.38 % 57.52 0 0 % 0 0 0 % 0 žijava ži java 17 0.14 % 16.42 0 0 % 0 17 0.38 % 57.52 0 0 % 0 0 0 % 0 lajf lajf 16 0.13 % 15.46 10 0.24 % 43.38 6 0.13 % 20.30 0 0 % 0 0 0 % 0 CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 733 File at CLARIN.SI2.2.390 List of final character-level 5-grams from residual lower-case word forms in the GOS 1.0 corpus with text- type distributionGOS1.0-word_parts-residual-lowercase_ forms-final-5grams-taxonomy-entire.tsvLower-case word form Rest of the word Final part of the word Total absolute frequency of lower-case word form Percentage of all found lower-case word forms Total relative frequency (per million occurrences) Absolute frequency [gos.T.J.R] Percentage [gos.T.J.R] Relative frequency [gos.T.J.R] Absolute frequency [gos.T.N.Z] Percentage [gos.T.N.Z] Relative frequency [gos.T.N.Z] Absolute frequency [gos.T.J.I] Percentage [gos.T.J.I] Relative frequency [gos.T.J.I] Absolute frequency [gos.T.N.N] Percentage [gos.T.N.N] Relative frequency [gos.T.N.N] živjo živjo 134 1.46 % 129.46 
66 2.09 % 286.31 18 0.52 % 60.91 33 2.04 % 93.44 17 1.85 % 109.05 kapris k apris 48 0.52 % 46.37 48 1.52 % 208.22 0 0 % 0 0 0 % 0 0 0 % 0 brezveze bre zveze 37 0.40 % 35.75 3 0.10 % 13.01 14 0.40 % 47.37 9 0.56 % 25.48 11 1.20 % 70.56 poutiajnen pouti ajnen 33 0.36 % 31.88 33 1.05 % 143.15 0 0 % 0 0 0 % 0 0 0 % 0 komot komot 30 0.33 % 28.98 5 0.16 % 21.69 21 0.60 % 71.06 1 0.06 % 2.83 3 0.33 % 19.24 belvi belvi 29 0.32 % 28.02 29 0.92 % 125.80 0 0 % 0 0 0 % 0 0 0 % 0 direkt d irekt 27 0.29 % 26.08 3 0.10 % 13.01 19 0.55 % 64.29 3 0.18 % 8.49 2 0.22 % 12.83 alora alora 26 0.28 % 25.12 12 0.38 % 52.06 14 0.40 % 47.37 0 0 % 0 0 0 % 0 hambrt h ambrt 26 0.28 % 25.12 0 0 % 0 26 0.75 % 87.97 0 0 % 0 0 0 % 0 tavžnt t avžnt 25 0.27 % 24.15 14 0.44 % 60.73 11 0.32 % 37.22 0 0 % 0 0 0 % 0 rajtam r ajtam 24 0.26 % 23.19 0 0 % 0 24 0.69 % 81.21 0 0 % 0 0 0 % 0 boruc boruc 23 0.25 % 22.22 23 0.73 % 99.77 0 0 % 0 0 0 % 0 0 0 % 0 brant brant 21 0.23 % 20.29 21 0.67 % 91.10 0 0 % 0 0 0 % 0 0 0 % 0 ukvarja uk varja 21 0.23 % 20.29 6 0.19 % 26.03 2 0.06 % 6.77 8 0.49 % 22.65 5 0.55 % 32.07 belviju be lviju 19 0.21 % 18.36 19 0.60 % 82.42 0 0 % 0 0 0 % 0 0 0 % 0 metiloranž metil oranž 19 0.21 % 18.36 0 0 % 0 0 0 % 0 19 1.17 % 53.80 0 0 % 0 trbelo t rbelo 19 0.21 % 18.36 14 0.44 % 60.73 5 0.14 % 16.92 0 0 % 0 0 0 % 0 cetel cetel 18 0.20 % 17.39 18 0.57 % 78.08 0 0 % 0 0 0 % 0 0 0 % 0 kumbaja ku mbaja 18 0.20 % 17.39 18 0.57 % 78.08 0 0 % 0 0 0 % 0 0 0 % 0 takvida ta kvida 17 0.19 % 16.42 0 0 % 0 17 0.49 % 57.52 0 0 % 0 0 0 % 0 žijava ž ijava 17 0.19 % 16.42 0 0 % 0 17 0.49 % 57.52 0 0 % 0 0 0 % 0 luškan l uškan 16 0.17 % 15.46 9 0.28 % 39.04 1 0.03 % 3.38 1 0.06 % 2.83 5 0.55 % 32.07 magar magar 16 0.17 % 15.46 2 0.06 % 8.68 11 0.32 % 37.22 1 0.06 % 2.83 2 0.22 % 12.83 porka porka 16 0.17 % 15.46 15 0.48 % 65.07 1 0.03 % 3.38 0 0 % 0 0 0 % 0 tavžent ta vžent 16 0.17 % 15.46 4 0.13 % 17.35 8 0.23 % 27.07 3 0.18 % 8.49 1 0.11 % 6.41 žijaš žijaš 16 0.17 % 15.46 0 0 % 0 16 0.46 % 54.14 0 0 
% 0 0 0 % 0
abada abada 15 0.16 % 14.49 0 0 % 0 15 0.43 % 50.75 0 0 % 0 0 0 % 0
furja furja 15 0.16 % 14.49 15 0.48 % 65.07 0 0 % 0 0 0 % 0 0 0 % 0
inveče i nveče 15 0.16 % 14.49 0 0 % 0 15 0.43 % 50.75 0 0 % 0 0 0 % 0
žinovek ži novek 15 0.16 % 14.49 15 0.48 % 65.07 0 0 % 0 0 0 % 0 0 0 % 0
gereiro ge reiro 14 0.15 % 13.53 14 0.44 % 60.73 0 0 % 0 0 0 % 0 0 0 % 0
ireneuš ir eneuš 14 0.15 % 13.53 14 0.44 % 60.73 0 0 % 0 0 0 % 0 0 0 % 0

The frequency lists of words from the GOS 1.0 corpus are divided into two groups. The first group contains lower-case word forms, lemmas, standardized word forms, or morphosyntactic tags (with their respective frequencies and percentages). The second group contains consonant-vowel structures, i.e. conversions of words by character substitution (e.g. with C for consonants and V for vowels, the word “tiger” becomes “CVCVC”).

The lists from the first group contain the unit (lower-case form, lemma, standardized word form, or morphosyntactic tag according to the MTE-6 annotation scheme; some lists also contain the part-of-speech and lower-case lemma), its total absolute frequency ($f_a$), i.e. the sum of all occurrences of the unit in the corpus, and its percentage ($p$), i.e. its share of the total frequency ($N$) of all units in the corpus:

$$p = \frac{f_a}{N} \times 100\,\%$$

The total relative frequency ($f_r$) indicates how many times the unit occurs per 1,000,000 units in the corpus. The formula takes into account the total absolute frequency of the unit in the corpus ($f_a$) and the total frequency of all units in the corpus ($N$):

$$f_r = \frac{f_a}{N} \times 1{,}000{,}000$$

The lists also contain the absolute frequencies ($f_{aT}$) of the unit in different taxonomy branches of the corpus, i.e. text types, as well as the percentages of the unit ($p_T$) and its relative frequencies ($f_{rT}$) in different taxonomy branches.
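As a rough illustration of the percentage and relative frequency defined above, the following Python sketch computes both values from an absolute frequency and a total. The function names and the sample counts are assumptions made for this example; they are not part of the LIST program or of the published lists.

```python
def percentage(f_a: int, n: int) -> float:
    """Share of a unit among all units, in percent: f_a / N * 100."""
    if n <= 0:
        raise ValueError("total frequency N must be positive")
    return f_a / n * 100


def relative_frequency(f_a: int, n: int, per: int = 1_000_000) -> float:
    """Occurrences of a unit per `per` units: f_a / N * 1,000,000."""
    if n <= 0:
        raise ValueError("total frequency N must be positive")
    return f_a / n * per


# Illustrative counts only; these are not figures from the corpus.
f_a, n = 250, 2_000_000
print(f"{percentage(f_a, n):.4f} %")        # 0.0125 %
print(f"{relative_frequency(f_a, n):.2f}")  # 125.00
```

The same two functions can be applied to a single text type by passing the unit's count in that taxonomy branch and the branch's own total instead of the corpus-wide values.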
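The character substitution mentioned above for the second group of lists can be sketched in its simplest form, with only C and V as output symbols. The two character sets below are assumptions made for this example (a plain Slovene alphabet split into vowels and consonants); the substitution rules actually used for the published lists may assign characters differently.

```python
# Assumed character sets for this sketch only; the corpus's actual
# categorization may assign characters differently.
VOWELS = set("aeiou")
CONSONANTS = set("bcčdfghjklmnprsštvzž")


def cv_structure(word: str) -> str:
    """Replace each consonant with C and each vowel with V."""
    out = []
    for ch in word.lower():
        if ch in VOWELS:
            out.append("V")
        elif ch in CONSONANTS:
            out.append("C")
        else:
            out.append(ch)  # not categorized in this sketch, passed through
    return "".join(out)


print(cv_structure("tiger"))  # CVCVC
```

Characters outside both sets are passed through unchanged here, which keeps the sketch short but means it does not cover digits, punctuation, or foreign letters.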
For the text-type values, $N_T$ represents the total frequency of all units in the respective subcorpus:

$$p_T = \frac{f_{aT}}{N_T} \times 100\,\%, \qquad f_{rT} = \frac{f_{aT}}{N_T} \times 1{,}000{,}000$$

The lists containing morphosyntactic tags also give the individual elements of each tag in separate columns at the end of each line, e.g. “Somei” → “S o m e i”, meaning “noun”, “common”, “masculine gender”, “singular”, “nominative case”. This allows the user to filter the relevant lines in data analysis software in order to sum and compare relevant frequencies.

The lists in the second group, which cover consonant-vowel structures, contain the unit’s converted form (e.g. “CVCVC”) and the total absolute frequency of all units with the same consonant-vowel structure in the corpus. A separate column contains the actual forms (and their absolute frequencies) that have been annotated with a specific part-of-speech in the corpus and pertain to the consonant-vowel structure in question. The forms are listed in the following format: “form_1~part_of_speech_1~frequency | form_2~part_of_speech_2~frequency | ...”. A separate column indicates the number of all different forms pertaining to the consonant-vowel structure in question.

The consonant-vowel structures in the lists are either robust (e.g. “tiger” → “CVCVC”) or fine-grained (e.g. “tiger” → “KVGVZ”), depending on the symbols used in character substitution. The substitution algorithm uses the following categorization:

C - consonants (in fine-grained consonant-vowel structures, these are instead divided into Z - sonorants, G - voiced obstruents, and K - voiceless obstruents)
V - vowels
X - foreign consonants
Y - foreign vowels
S - symbols
P - punctuation
N - numbers
F - non-Latin-script characters
! - other

A more detailed list determining the substitution rules for characters is available under the relevant CLARIN.SI repository entry in the file titled “GOS1.0_character_categorization.tsv”.

2.3.
2.3. Frequency lists of words from the GOS 1.0 corpus

2.3.1 List of all lemmas in the GOS 1.0 corpus with part-of-speech categories and text-type distribution
File at CLARIN.SI: GOS1.0-words-all-lemmas-parts_of_speech-taxonomy-entire.tsv
Columns: Lemma; Lemma (lower-case); Part-of-speech category; Total absolute frequency of lemma; Percentage of all found lemmas; Total relative frequency (per million occurrences); Absolute frequency [gos.T.J.R]; Percentage [gos.T.J.R]; Relative frequency [gos.T.J.R]; Absolute frequency [gos.T.N.Z]; Percentage [gos.T.N.Z]; Relative frequency [gos.T.N.Z]; Absolute frequency [gos.T.J.I]; Percentage [gos.T.J.I]; Relative frequency [gos.T.J.I]; Absolute frequency [gos.T.N.N]; Percentage [gos.T.N.N]; Relative frequency [gos.T.N.N]
biti biti G 93,407 9.02 % 90,238.98 19,830 8.60 % 86,021.79 31,073 10.51 % 105,139.03 29,606 8.38 % 83,834.30 12,898 8.27 % 82,736.24
ne ne L 31,737 3.07 % 30,660.60 6,439 2.79 % 27,932.14 11,588 3.92 % 39,209.32 7,780 2.20 % 22,030.36 5,930 3.80 % 38,038.91
pa pa V 29,360 2.84 % 28,364.22 5,601 2.43 % 24,296.92 12,111 4.10 % 40,978.95 6,825 1.93 % 19,326.12 4,823 3.09 % 30,937.89
ta ta Z 29,255 2.83 % 28,262.78 5,551 2.41 % 24,080.03 7,749 2.62 % 26,219.62 10,190 2.88 % 28,854.68 5,765 3.70 % 36,980.49
ja ja L 25,555 2.47 % 24,688.27 4,720 2.05 % 20,475.18 11,541 3.90 % 39,050.29 3,821 1.08 % 10,819.80 5,473 3.51 % 35,107.41
eee eee N 23,222 2.24 % 22,434.40 4,248 1.84 % 18,427.66 4,527 1.53 % 15,317.62 10,556 2.99 % 29,891.07 3,891 2.50 % 24,959.43
da da V 19,011 1.84 % 18,366.22 3,598 1.56 % 15,607.99 5,141 1.74 % 17,395.16 7,140 2.02 % 20,218.09 3,132 2.01 % 20,090.70
se se Z 18,112 1.75 % 17,497.71 3,828 1.66 % 16,605.72 4,503 1.52 % 15,236.41 6,799 1.93 % 19,252.50 2,982 1.91 % 19,128.50
v v D 17,583 1.70 % 16,986.65 4,042 1.75 % 17,534.04 3,793 1.28 % 12,834.05 7,400 2.10 % 20,954.33 2,348 1.51 % 15,061.61
on on Z 17,518 1.69 % 16,923.85 3,276 1.42 %
14,211.16 6,792 2.30 % 22,981.51 5,129 1.45 % 14,523.61 2,321 1.49 % 14,888.42 in in V 16,238 1.57 % 15,687.27 4,184 1.81 % 18,150.03 3,199 1.08 % 10,824.18 7,068 2.00 % 20,014.21 1,787 1.15 % 11,462.99 jaz jaz Z 15,037 1.45 % 14,527 3,089 1.34 % 13,399.96 5,613 1.90 % 18,992.22 3,872 1.10 % 10,964.21 2,463 1.58 % 15,799.30 na na D 11,921 1.15 % 11,516.68 2,989 1.30 % 12,966.17 2,882 0.97 % 9,751.58 4,348 1.23 % 12,312.08 1,702 1.09 % 10,917.74 imeti imeti G 10,131 0.98 % 9,787.39 1,862 0.81 % 8,077.29 3,694 1.25 % 12,499.07 2,799 0.79 % 7,925.83 1,776 1.14 % 11,392.43 tako tako R 9,713 0.94 % 9,383.57 1,925 0.83 % 8,350.58 3,661 1.24 % 12,387.41 2,296 0.65 % 6,501.51 1,831 1.18 % 11,745.24 ti ti Z 9,496 0.92 % 9,173.93 2,673 1.16 % 11,595.37 3,045 1.03 % 10,303.10 2,137 0.60 % 6,051.27 1,641 1.05 % 10,526.45 kaj kaj Z 9,336 0.90 % 9,019.36 1,740 0.76 % 7,548.05 3,584 1.21 % 12,126.87 2,584 0.73 % 7,317.02 1,428 0.92 % 9,160.13 tudi tudi L 7,945 0.77 % 7,675.53 2,076 0.90 % 9,005.61 1,644 0.56 % 5,562.66 2,890 0.82 % 8,183.51 1,335 0.86 % 8,563.57 za za D 7,872 0.76 % 7,605.01 1,776 0.77 % 7,704.22 1,787 0.60 % 6,046.52 3,129 0.89 % 8,860.28 1,180 0.76 % 7,569.29 z z D 7,663 0.74 % 7,403.10 1,894 0.82 % 8,216.10 1,564 0.53 % 5,291.97 3,134 0.89 % 8,874.44 1,071 0.69 % 6,870.10 vedeti vedeti G 7,560 0.73 % 7,303.59 1,283 0.56 % 5,565.61 3,900 1.32 % 13,196.09 1,062 0.30 % 3,007.23 1,315 0.84 % 8,435.27 še še L 7,184 0.69 % 6,940.35 1,782 0.77 % 7,730.25 2,116 0.72 % 7,159.73 2,127 0.60 % 6,022.95 1,159 0.74 % 7,434.59 a a V 6,753 0.65 % 6,523.96 1,247 0.54 % 5,409.44 2,515 0.85 % 8,509.79 1,771 0.50 % 5,014.88 1,220 0.78 % 7,825.88 zdaj zdaj R 6,358 0.61 % 6,142.36 1,457 0.63 % 6,320.41 1,739 0.59 % 5,884.10 1,764 0.50 % 4,995.06 1,398 0.90 % 8,967.69 če če V 6,137 0.59 % 5,928.86 1,120 0.49 % 4,858.52 1,882 0.64 % 6,367.96 1,902 0.54 % 5,385.83 1,233 0.79 % 7,909.27 ki ki V 5,537 0.54 % 5,349.21 1,257 0.55 % 5,452.82 882 0.30 % 2,984.35 2,839 0.80 % 8,039.10 559 
0.36 % 3,585.79
iti iti G 5,311 0.51 % 5,130.87 1,085 0.47 % 4,706.69 2,446 0.83 % 8,276.32 1,175 0.33 % 3,327.21 605 0.39 % 3,880.87
en en K 5,114 0.49 % 4,940.55 1,075 0.47 % 4,663.31 1,811 0.61 % 6,127.72 1,472 0.42 % 4,168.21 756 0.48 % 4,849.48
ker ker V 4,783 0.46 % 4,620.78 824 0.36 % 3,574.48 1,628 0.55 % 5,508.52 1,294 0.37 % 3,664.18 1,037 0.67 % 6,652
ali ali V 4,757 0.46 % 4,595.66 818 0.35 % 3,548.45 1,536 0.52 % 5,197.23 1,501 0.42 % 4,250.33 902 0.58 % 5,786.02
no no L 4,700 0.45 % 4,540.59 1,298 0.56 % 5,630.67 1,566 0.53 % 5,298.74 1,114 0.32 % 3,154.48 722 0.46 % 4,631.38
reči reči G 4,690 0.45 % 4,530.93 878 0.38 % 3,808.73 1,620 0.55 % 5,481.45 1,476 0.42 % 4,179.54 716 0.46 % 4,592.89

2.3.2 List of all lower-case word forms in the GOS 1.0 corpus with lemmas, part-of-speech categories and text-type distribution
File at CLARIN.SI: GOS1.0-words-all-lowercase_forms-lemmas-parts_of_speech-taxonomy-entire.tsv
Columns: Lower-case word form; Lemma; Lemma (lower-case); Part-of-speech category; Total absolute frequency of lower-case word form; Percentage of all found lower-case word forms; Total relative frequency (per million occurrences); Absolute frequency [gos.T.J.R]; Percentage [gos.T.J.R]; Relative frequency [gos.T.J.R]; Absolute frequency [gos.T.N.Z]; Percentage [gos.T.N.Z]; Relative frequency [gos.T.N.Z]; Absolute frequency [gos.T.J.I]; Percentage [gos.T.J.I]; Relative frequency [gos.T.J.I]; Absolute frequency [gos.T.N.N]; Percentage [gos.T.N.N]; Relative frequency [gos.T.N.N]
je biti biti G 30,945 2.99 % 29,895.46 6,372 2.76 % 27,641.49 9,993 3.38 % 33,812.45 10,457 2.96 % 29,610.73 4,123 2.65 % 26,447.63
ne ne ne L 29,855 2.88 % 28,842.43 6,131 2.66 % 26,596.04 10,303 3.49 % 34,861.37 7,590 2.15 % 21,492.34 5,831 3.74 % 37,403.86
pa pa pa V 29,153 2.82 % 28,164.24 5,585 2.42 % 24,227.52 11,944 4.04 % 40,413.88 6,807 1.93 % 19,275.15 4,817 3.09 % 30,899.40
ja ja ja L 25,105 2.42 % 24,253.53
4,628 2.01 % 20,076.09 11,249 3.81 % 38,062.27 3,787 1.07 % 10,723.52 5,441 3.49 % 34,902.14 eee eee eee N 23,222 2.24 % 22,434.40 4,248 1.84 % 18,427.66 4,527 1.53 % 15,317.62 10,556 2.99 % 29,891.07 3,891 2.50 % 24,959.43 da da da V 17,232 1.67 % 16,647.55 3,471 1.51 % 15,057.07 3,882 1.31 % 13,135.19 6,960 1.97 % 19,708.40 2,919 1.87 % 18,724.38 to ta ta Z 17,176 1.66 % 16,593.45 3,440 1.49 % 14,922.59 4,555 1.54 % 15,412.36 5,429 1.54 % 15,373.11 3,752 2.41 % 24,067.79 v v v D 16,785 1.62 % 16,215.71 3,900 1.69 % 16,918.05 3,330 1.13 % 11,267.43 7,262 2.06 % 20,563.56 2,293 1.47 % 14,708.81 in in in V 15,945 1.54 % 15,404.20 4,161 1.80 % 18,050.26 2,958 1.00 % 10,008.73 7,054 2.00 % 19,974.57 1,772 1.14 % 11,366.77 se se se Z 15,699 1.52 % 15,166.55 3,229 1.40 % 14,007.28 3,772 1.28 % 12,762.99 6,038 1.71 % 17,097.60 2,660 1.71 % 17,062.99 na na na D 11,773 1.14 % 11,373.70 2,981 1.29 % 12,931.46 2,780 0.94 % 9,406.45 4,317 1.22 % 12,224.30 1,695 1.09 % 10,872.84 so biti biti G 7,869 0.76 % 7,602.11 1,526 0.66 % 6,619.73 2,277 0.77 % 7,704.49 2,995 0.85 % 8,480.84 1,071 0.69 % 6,870.10 za za za D 7,728 0.75 % 7,465.89 1,769 0.77 % 7,673.85 1,669 0.56 % 5,647.25 3,117 0.88 % 8,826.30 1,173 0.75 % 7,524.39 še še še L 7,056 0.68 % 6,816.69 1,756 0.76 % 7,617.46 2,021 0.68 % 6,838.28 2,124 0.60 % 6,014.46 1,155 0.74 % 7,408.93 bi biti biti G 6,993 0.68 % 6,755.82 1,319 0.57 % 5,721.77 1,684 0.57 % 5,698.01 2,687 0.76 % 7,608.69 1,303 0.84 % 8,358.30 a a a V 6,691 0.65 % 6,464.07 1,232 0.53 % 5,344.37 2,474 0.84 % 8,371.06 1,770 0.50 % 5,012.05 1,215 0.78 % 7,793.81 če če če V 5,945 0.57 % 5,743.37 1,107 0.48 % 4,802.12 1,744 0.59 % 5,901.02 1,891 0.54 % 5,354.68 1,203 0.77 % 7,716.83 kaj kaj kaj Z 5,522 0.53 % 5,334.71 1,187 0.52 % 5,149.16 1,386 0.47 % 4,689.69 2,029 0.57 % 5,745.45 920 0.59 % 5,901.48 je on on Z 5,180 0.50 % 5,004.31 1,058 0.46 % 4,589.56 1,720 0.58 % 5,819.82 1,674 0.47 % 4,740.21 728 0.47 % 4,669.87 ni biti biti G 4,775 0.46 % 4,613.05 882 0.38 
% 3,826.08 1,507 0.51 % 5,099.11 1,565 0.44 % 4,431.56 821 0.53 % 5,266.43
bo biti biti G 4,762 0.46 % 4,600.49 1,218 0.53 % 5,283.64 1,070 0.36 % 3,620.47 1,764 0.50 % 4,995.06 710 0.46 % 4,554.41
no no no L 4,622 0.45 % 4,465.24 1,282 0.56 % 5,561.27 1,511 0.51 % 5,112.64 1,111 0.32 % 3,145.98 718 0.46 % 4,605.72
mhm mhm mhm M 4,475 0.43 % 4,323.22 431 0.19 % 1,869.66 1,261 0.43 % 4,266.74 975 0.28 % 2,760.87 1,808 1.16 % 11,597.70
ki ki ki V 4,454 0.43 % 4,302.94 1,122 0.49 % 4,867.19 318 0.11 % 1,075.99 2,661 0.75 % 7,535.06 353 0.23 % 2,264.37
z z z D 4,287 0.41 % 4,141.60 1,114 0.48 % 4,832.49 862 0.29 % 2,916.68 1,753 0.50 % 4,963.91 558 0.36 % 3,579.38
že že že L 4,273 0.41 % 4,128.08 1,044 0.45 % 4,528.83 1,357 0.46 % 4,591.56 1,185 0.34 % 3,355.52 687 0.44 % 4,406.87
sem biti biti G 4,238 0.41 % 4,094.26 771 0.33 % 3,344.57 1,729 0.58 % 5,850.27 1,053 0.30 % 2,981.74 685 0.44 % 4,394.04
tudi tudi tudi L 4,119 0.40 % 3,979.30 1,289 0.56 % 5,591.63 315 0.11 % 1,065.84 2,086 0.59 % 5,906.86 429 0.28 % 2,751.89
mi jaz jaz Z 4,009 0.39 % 3,873.03 679 0.29 % 2,945.48 1,417 0.48 % 4,794.58 1,248 0.35 % 3,533.92 665 0.43 % 4,265.75
smo biti biti G 3,919 0.38 % 3,786.08 881 0.38 % 3,821.74 1,105 0.37 % 3,738.89 1,543 0.44 % 4,369.26 390 0.25 % 2,501.72
ti ti ti Z 3,900 0.38 % 3,767.73 1,021 0.44 % 4,429.06 1,750 0.59 % 5,921.32 532 0.15 % 1,506.45 597 0.38 % 3,829.55
tud tudi tudi L 3,570 0.34 % 3,448.92 709 0.31 % 3,075.62 1,187 0.40 % 4,016.35 795 0.23 % 2,251.17 879 0.56 % 5,638.48

2.3.3 List of all lower-case word forms in the GOS 1.0 corpus with standardized word forms, lemmas, part-of-speech categories and text-type distribution
File at CLARIN.SI: GOS1.0-words-all-lowercase_forms-standardized_forms-lemmas-parts_of_speech-taxonomy-entire.tsv
Columns: Lower-case word form; Lemma; Lemma (lower-case); Part-of-speech category; Standardized form; Total absolute frequency of lower-case word form; Percentage of all found lower-case word forms; Total relative frequency (per million occurrences); Absolute frequency [gos.T.J.R]; Percentage [gos.T.J.R]; Relative frequency [gos.T.J.R]; Absolute frequency [gos.T.N.Z]; Percentage [gos.T.N.Z]; Relative frequency [gos.T.N.Z]; Absolute frequency [gos.T.J.I]; Percentage [gos.T.J.I]; Relative frequency [gos.T.J.I]; Absolute frequency [gos.T.N.N]; Percentage [gos.T.N.N]; Relative frequency [gos.T.N.N]
je biti biti G je 30,945 2.99 % 29,895.46 6,372 2.76 % 27,641.49 9,993 3.38 % 33,812.45 10,457 2.96 % 29,610.73 4,123 2.65 % 26,447.63
ne ne ne L ne 29,853 2.88 % 28,840.50 6,129 2.66 % 26,587.37 10,303 3.49 % 34,861.37 7,590 2.15 % 21,492.34 5,831 3.74 % 37,403.86
pa pa pa V pa 29,153 2.82 % 28,164.24 5,585 2.42 % 24,227.52 11,944 4.04 % 40,413.88 6,807 1.93 % 19,275.15 4,817 3.09 % 30,899.40
ja ja ja L ja 25,105 2.42 % 24,253.53 4,628 2.01 % 20,076.09 11,249 3.81 % 38,062.27 3,787 1.07 % 10,723.52 5,441 3.49 % 34,902.14
eee eee eee N eee 23,222 2.24 % 22,434.40 4,248 1.84 % 18,427.66 4,527 1.53 % 15,317.62 10,556 2.99 % 29,891.07 3,891 2.50 % 24,959.43
da da da V da 17,231 1.67 % 16,646.59 3,471 1.51 % 15,057.07 3,882 1.31 % 13,135.19 6,959 1.97 % 19,705.56 2,919 1.87 % 18,724.38
to ta ta Z to 17,174 1.66 % 16,591.52 3,440 1.49 % 14,922.59 4,554 1.54 % 15,408.98 5,429 1.54 % 15,373.11 3,751 2.41 % 24,061.38
v v v D v 16,774 1.62 % 16,205.09 3,895 1.69 % 16,896.36 3,329 1.13 % 11,264.05 7,257 2.06 % 20,549.40 2,293 1.47 % 14,708.81
in in in V in 15,942 1.54 % 15,401.31 4,159 1.80 % 18,041.58 2,957 1.00 % 10,005.35 7,054 2.00 % 19,974.57 1,772 1.14 % 11,366.77
se se se Z se 15,678 1.51 % 15,146.26 3,224 1.40 % 13,985.59 3,764 1.27 % 12,735.92 6,030 1.71 % 17,074.95 2,660 1.71 % 17,062.99
na na na D na 11,737 1.13 % 11,338.92 2,964 1.29 % 12,857.72 2,779 0.94 % 9,403.06 4,299 1.22 % 12,173.33 1,695 1.09 % 10,872.84
so biti biti G so 7,856 0.76 % 7,589.55 1,514 0.66 % 6,567.67 2,276 0.77 % 7,701.11 2,995 0.85 % 8,480.84 1,071 0.69 %
6,870.10 za za za D za 7,724 0.75 % 7,462.03 1,765 0.77 % 7,656.50 1,669 0.56 % 5,647.25 3,117 0.88 % 8,826.30 1,173 0.75 % 7,524.39 še še še L še 7,056 0.68 % 6,816.69 1,756 0.76 % 7,617.46 2,021 0.68 % 6,838.28 2,124 0.60 % 6,014.46 1,155 0.74 % 7,408.93 bi biti biti G bi 6,988 0.68 % 6,750.99 1,318 0.57 % 5,717.43 1,682 0.57 % 5,691.24 2,685 0.76 % 7,603.02 1,303 0.84 % 8,358.30 a a a V a 6,517 0.63 % 6,295.97 1,210 0.53 % 5,248.93 2,458 0.83 % 8,316.92 1,664 0.47 % 4,711.89 1,185 0.76 % 7,601.37 če če če V če 5,945 0.57 % 5,743.37 1,107 0.48 % 4,802.12 1,744 0.59 % 5,901.02 1,891 0.54 % 5,354.68 1,203 0.77 % 7,716.83 kaj kaj kaj Z kaj 5,522 0.53 % 5,334.71 1,187 0.52 % 5,149.16 1,386 0.47 % 4,689.69 2,029 0.57 % 5,745.45 920 0.59 % 5,901.48 je on on Z je 5,166 0.50 % 4,990.79 1,052 0.46 % 4,563.54 1,714 0.58 % 5,799.51 1,672 0.47 % 4,734.55 728 0.47 % 4,669.87 ni biti biti G ni 4,775 0.46 % 4,613.05 882 0.38 % 3,826.08 1,507 0.51 % 5,099.11 1,565 0.44 % 4,431.56 821 0.53 % 5,266.43 bo biti biti G bo 4,751 0.46 % 4,589.86 1,218 0.53 % 5,283.64 1,065 0.36 % 3,603.55 1,761 0.50 % 4,986.56 707 0.45 % 4,535.16 no no no L no 4,622 0.45 % 4,465.24 1,282 0.56 % 5,561.27 1,511 0.51 % 5,112.64 1,111 0.32 % 3,145.98 718 0.46 % 4,605.72 mhm mhm mhm M mhm 4,475 0.43 % 4,323.22 431 0.19 % 1,869.66 1,261 0.43 % 4,266.74 975 0.28 % 2,760.87 1,808 1.16 % 11,597.70 ki ki ki V ki 4,454 0.43 % 4,302.94 1,122 0.49 % 4,867.19 318 0.11 % 1,075.99 2,661 0.75 % 7,535.06 353 0.23 % 2,264.37 že že že L že 4,273 0.41 % 4,128.08 1,044 0.45 % 4,528.83 1,357 0.46 % 4,591.56 1,185 0.34 % 3,355.52 687 0.44 % 4,406.87 z z z D z 4,254 0.41 % 4,109.72 1,106 0.48 % 4,797.79 860 0.29 % 2,909.91 1,732 0.49 % 4,904.45 556 0.36 % 3,566.55 sem biti biti G sem 4,224 0.41 % 4,080.74 768 0.33 % 3,331.55 1,720 0.58 % 5,819.82 1,052 0.30 % 2,978.91 684 0.44 % 4,387.62 tudi tudi tudi L tudi 4,119 0.40 % 3,979.30 1,289 0.56 % 5,591.63 315 0.11 % 1,065.84 2,086 0.59 % 5,906.86 429 0.28 % 2,751.89 mi jaz jaz Z 
mi 3,984 0.39 % 3,848.88 660 0.29 % 2,863.05 1,412 0.48 % 4,777.66 1,247 0.35 % 3,531.09 665 0.43 % 4,265.75
smo biti biti G smo 3,917 0.38 % 3,784.15 881 0.38 % 3,821.74 1,104 0.37 % 3,735.51 1,543 0.44 % 4,369.26 389 0.25 % 2,495.30
ti ti ti Z ti 3,881 0.38 % 3,749.37 1,012 0.44 % 4,390.02 1,744 0.59 % 5,901.02 532 0.15 % 1,506.45 593 0.38 % 3,803.89
tud tudi tudi L tudi 3,570 0.34 % 3,448.92 709 0.31 % 3,075.62 1,187 0.40 % 4,016.35 795 0.23 % 2,251.17 879 0.56 % 5,638.48

2.3.4 List of all morphosyntactic tags in the GOS 1.0 corpus with text-type distribution
File at CLARIN.SI: GOS1.0-words-all-morphosyntactic_tags-split_MSD-taxonomy-entire.tsv
Columns: Morphosyntactic tag; Total absolute frequency of morphosyntactic tag; Percentage of all found morphosyntactic tags; Total relative frequency (per million occurrences); Absolute frequency [gos.T.N.Z]; Percentage [gos.T.N.Z]; Relative frequency [gos.T.N.Z]; Absolute frequency [gos.T.J.I]; Percentage [gos.T.J.I]; Relative frequency [gos.T.J.I]; Absolute frequency [gos.T.N.N]; Percentage [gos.T.N.N]; Relative frequency [gos.T.N.N]; Absolute frequency [gos.T.J.R]; Percentage [gos.T.J.R]; Relative frequency [gos.T.J.R]; msd01; msd02; msd03; msd04; msd05; msd06; msd07; msd08; msd09
L 100,322 9.69 % 96,919.45 35,478 12.00 % 120,043.85 25,175 7.13 % 71,287.19 18,181 11.66 % 116,624.86 21,488 9.32 % 93,214.13 L
Rsn 92,983 8.98 % 89,829.36 28,218 9.55 % 95,478.82 28,515 8.07 % 80,744.95 15,476 9.93 % 99,273.22 20,774 9.01 % 90,116.82 R s n
Vp 67,900 6.56 % 65,597.08 21,816 7.38 % 73,816.92 21,148 5.99 % 59,884.07 10,398 6.67 % 66,699.60 14,538 6.31 % 63,065.29 V p
N 46,541 4.50 % 44,962.50 11,752 3.98 % 39,764.23 16,958 4.80 % 48,019.39 7,441 4.77 % 47,731.46 10,390 4.51 % 45,071.42 N
Vd 45,485 4.39 % 43,942.32 12,586 4.26 % 42,586.16 16,716 4.73 % 47,334.13 7,333 4.70 % 47,038.67 8,850 3.84 % 38,390.96 V d
Gp-ste-n 31,468 3.04 % 30,400.72 10,422 3.53 % 35,264.02 10,489 2.97 %
29,701.34 4,142 2.66 % 26,569.51 6,415 2.78 % 27,828.03 G p - s t e - n Dm 29,362 2.84 % 28,366.15 6,276 2.12 % 21,235.56 12,056 3.41 % 34,138.56 4,002 2.57 % 25,671.45 7,028 3.05 % 30,487.20 D m Somei 27,340 2.64 % 26,412.73 8,210 2.78 % 27,779.47 9,320 2.64 % 26,391.13 3,861 2.48 % 24,766.99 5,949 2.58 % 25,806.54 S o m e i Dt 17,676 1.71 % 17,076.50 4,695 1.59 % 15,886.07 6,462 1.83 % 18,298.23 2,486 1.59 % 15,946.84 4,033 1.75 % 17,495 D t Zk-sei 17,668 1.71 % 17,068.77 5,595 1.89 % 18,931.32 4,976 1.41 % 14,090.37 3,702 2.38 % 23,747.06 3,395 1.47 % 14,727.38 Z k - s e i Sozei 16,226 1.57 % 15,675.67 3,165 1.07 % 10,709.14 7,159 2.03 % 20,271.90 1,934 1.24 % 12,405.95 3,968 1.72 % 17,213.03 S o z e i Zp------k 15,864 1.53 % 15,325.95 3,909 1.32 % 13,226.55 6,053 1.71 % 17,140.07 2,670 1.71 % 17,127.13 3,232 1.40 % 14,020.29 Z p - - - - - - k M 11,537 1.11 % 11,145.71 4,081 1.38 % 13,808.53 2,182 0.62 % 6,178.70 3,287 2.11 % 21,084.97 1,987 0.86 % 8,619.53 M Sozet 11,381 1.10 % 10,995 2,716 0.92 % 9,189.90 4,551 1.29 % 12,886.91 1,511 0.97 % 9,692.55 2,603 1.13 % 11,291.71 S o z e t Sozer 11,260 1.09 % 10,878.10 2,013 0.68 % 6,811.21 5,288 1.50 % 14,973.85 1,461 0.94 % 9,371.81 2,498 1.08 % 10,836.23 S o z e r Dr 10,700 1.03 % 10,337.10 2,625 0.89 % 8,881.99 4,234 1.20 % 11,989.27 1,393 0.89 % 8,935.62 2,448 1.06 % 10,619.33 D r Ggnste 9,953 0.96 % 9,615.43 2,606 0.88 % 8,817.70 3,771 1.07 % 10,678.21 1,424 0.91 % 9,134.47 2,152 0.93 % 9,335.29 G g n s t e Do 9,036 0.87 % 8,729.53 1,753 0.59 % 5,931.48 3,839 1.09 % 10,870.77 1,137 0.73 % 7,293.46 2,307 1.00 % 10,007.68 D o Ggnspe 8,609 0.83 % 8,317.01 3,582 1.21 % 12,120.10 1,880 0.53 % 5,323.53 1,570 1.01 % 10,071.01 1,577 0.68 % 6,840.97 G g n s p e Ppnsei 8,560 0.83 % 8,269.68 2,568 0.87 % 8,689.12 2,416 0.68 % 6,841.30 1,554 1.00 % 9,968.38 2,022 0.88 % 8,771.36 P p n s e i Gp-stm-n 7,983 0.77 % 7,712.25 2,364 0.80 % 7,998.86 3,012 0.85 % 8,528.98 1,072 0.69 % 6,876.51 1,535 0.67 % 6,658.77 G p - s t m - n 
Zv-sei 7,983 0.77 % 7,712.25 3,186 1.08 % 10,780.19 2,065 0.58 % 5,847.39 1,193 0.77 % 7,652.68 1,539 0.67 % 6,676.12 Z v - s e i
Ggdd-em 7,780 0.75 % 7,516.13 2,756 0.93 % 9,325.24 2,491 0.70 % 7,053.68 718 0.46 % 4,605.72 1,815 0.79 % 7,873.40 G g d d - e m
Gp-g 7,466 0.72 % 7,212.78 1,982 0.67 % 6,706.32 2,746 0.78 % 7,775.75 1,351 0.87 % 8,666.20 1,387 0.60 % 6,016.75 G p - g
Nt 7,416 0.72 % 7,164.48 3,629 1.23 % 12,279.13 800 0.23 % 2,265.33 646 0.41 % 4,143.87 2,341 1.02 % 10,155.17 N t
Kbg-mi 7,025 0.68 % 6,786.74 1,356 0.46 % 4,588.18 2,475 0.70 % 7,008.37 1,220 0.78 % 7,825.88 1,974 0.86 % 8,563.14 K b g - m i
Somer 6,343 0.61 % 6,127.87 1,297 0.44 % 4,388.55 2,781 0.79 % 7,874.86 790 0.51 % 5,067.58 1,475 0.64 % 6,398.49 S o m e r
Ppnzei 6,298 0.61 % 6,084.40 1,132 0.38 % 3,830.25 2,919 0.83 % 8,265.63 734 0.47 % 4,708.36 1,513 0.66 % 6,563.34 P p n z e i
Zop-ei 6,293 0.61 % 6,079.56 2,531 0.86 % 8,563.93 1,444 0.41 % 4,088.93 1,141 0.73 % 7,319.12 1,177 0.51 % 5,105.78 Z o p - e i
Gp-spe-n 6,222 0.60 % 6,010.97 2,952 1.00 % 9,988.43 1,188 0.34 % 3,364.02 891 0.57 % 5,715.46 1,191 0.52 % 5,166.51 G p - s p e - n
Ggdste 5,994 0.58 % 5,790.71 1,581 0.54 % 5,349.49 2,248 0.64 % 6,365.59 1,062 0.68 % 6,812.36 1,103 0.48 % 4,784.77 G g d s t e
Sometn 5,943 0.57 % 5,741.44 1,543 0.52 % 5,220.92 2,219 0.63 % 6,283.47 902 0.58 % 5,786.02 1,279 0.56 % 5,548.25 S o m e t n

2.3.5 List of lemmas by basic consonant-vowel structure in the GOS 1.0 corpus
File at CLARIN.SI: GOS1.0_cv_lemmas_robust_entire.tsv
Columns: Structure; …; Frequency (G); Most frequent lemmas (G); Number of all unique lemmas (G); …; Frequency (R); Most frequent lemmas (R); Number of all unique lemmas (R); Frequency (S); Most frequent lemmas (S); Number of all unique lemmas (S); …; Total frequency of structure; Total number of all unique lemmas in structure
CVC … 0 0 … 10.270 tam~R~2.598 | sem~R~1.307 | ful~R~916 | … 29 17.039 pol~S~4.054 |
saj~S~2.321 | dan~S~1.383 | … 332 … 99.386 931 CVCV … 106.048 biti~G~93.407 | reči~G~4.690 | dati~G~3.945 | … 18 … 17.305 tako~R~9.713 | zelo~R~1.835 | malo~R~1.186 | … 47 12.857 leto~S~1.828 | voda~S~690 | šola~S~432 | ... 524 … 152.366 1.061 CVCVC … 0 0… 10.531 potem~R~3.149 | danes~R~1.418 | tukaj~R~1.136 | … 46 12.437 misel~S~1.086 | konec~S~784 | način~S~491 | … 694 … 41.620 1.361 CVCVCV … 26.033 vedeti~G~7.560 | videti~G~2.309 | morati~G~2.233 | … 334 … 4.737 veliko~R~1.316 | mogoče~R~872 | koliko~R~460 | … 44 7.093 minuta~S~662 | beseda~S~458 | zadeva~S~428 | … 460 … 43.803 1.157 CVCCV … 1.935 jesti~G~922 | najti~G~452 | pasti~G~204 | … 12 … 11.823 lahko~R~4.140 | dobro~R~1.320 | čisto~R~974 | … 95 7.010 jutro~S~618 | mesto~S~515 | reklo~S~356 | … 404 … 27.230 933 CCVCV … 6.792 priti~G~2.530 | znati~G~918 | zdeti~G~804 | … 53 … 1.877 treba~R~827 | glede~R~324 | slabo~R~167 | … 44 5.074 hvala~S~797 | vlada~S~416 | glava~S~258 | … 207 … 17.215 493 CVCCVC … 0 0… 2.266 naprej~R~922 | takrat~R~840 | medtem~R~88 | … 18 4.837 gospod~S~713 | center~S~238 | sistem~S~233 | … 476 … 12.672 872 CCVCVC … 0 0… 3.054 skupaj~R~713 | včasih~R~445 | zmeraj~R~328 | … 23 4.312 človek~S~1.248 | primer~S~862 | smisel~S~149 | … 269 … 11.483 603 CCVC … 0 0… 10.055 zdaj~R~6358 | prav~R~971 | prej~R~832 | … 17 4.146 svet~S~424 | brat~S~163 | snov~S~136 | … 239 … 25.886 512 CVCVCVC … 0 0… 348 posebej~R~175 | nikakor~R~53 | nikamor~R~42 | … 14 3.617 začetek~S~331 | podatek~S~191 | telefon~S~186 | … 328 … 8.381 843 CVCCVCV … 6.379 misliti~G~2.710 | poznati~G~498 | pustiti~G~318 | … 192 … 112 najraje~R~37 | pošteno~R~16 | kajneda~R~13 | … 22 3.014 resnica~S~249 | razlika~S~175 | kultura~S~115 | … 282 … 9.870 664 CCCVC … 0 0… 206 prvič~R~185 | stran~R~21 2 2.859 stvar~S~1.056 | stran~S~839 | strah~S~83 | … 60 … 4.384 118 CCVCCV … 170 znajti~G~55 | zrasti~G~33 | znesti~G~15 | … 15 … 1.167 zdajle~R~478 | grozno~R~77 | smešno~R~68 | … 58 2.524 služba~S~285 | zgodba~S~208 | mnenje~S~197 | … 
140 … 5.444 339 CVCC … 0 0… 2.443 bolj~R~1648 | fajn~R~493 | manj~R~302 3 2.410 cajt~S~209 | film~S~147 | fant~S~146 | … 166 … 6.387 392 CCVCCVC … 0 0… 585 zjutraj~R~318 | dvakrat~R~107 | trikrat~R~59 | … 11 2.393 problem~S~583 | program~S~197 | prostor~S~182 | … 145 … 3.988 298 VCV … 5.311 iti~G~5.311 1… 32 ali~R~32 1 2.244 ura~S~905 | eva~S~332 | ime~S~256 | … 38 … 16.515 133 CVCVCCV … 248 pojesti~G~68 | dopasti~G~28 | zavesti~G~25 | … 23 … 1.288 sigurno~R~214 | tukajle~R~107 | podobno~R~89 | … 101 2.234 sekunda~S~134 | nedelja~S~101 | naselje~S~85 | … 275 … 4.435 583 CCVCVCV … 8.460 gledati~G~1.777 | praviti~G~1.344 | slišati~G~534 | … 282 … 599 drugače~R~487 | premalo~R~64 | globoko~R~25 | … 14 2.229 skupina~S~358 | družina~S~205 | število~S~144 | … 172 … 11.658 576 VCCVC … 0 0… 54 okrog~R~38 | odveč~R~12 | odkod~R~4 3 2.197 otrok~S~538 | izpit~S~144 | izraz~S~144 | … 131 … 6.604 200 CVCVCVCV … 11.079 povedati~G~1.686 | narediti~G~1.614 | govoriti~G~906 | … 403 … 806 zanimivo~R~212 | ponavadi~R~176 | nekoliko~R~118 | … 25 2.158 politika~S~231 | besedilo~S~186 | polovica~S~95 | … 202 … 14.484 747 CVCCCV … 55 zebsti~G~36 | tepsti~G~12 | molsti~G~7 3… 168 končno~R~38 | luštno~R~20 | majhno~R~12 | … 30 2.098 bistvo~S~1.016 | ženska~S~295 | vojska~S~121 | ... 
54 … 3.737 201
VCCV … 0 0 … 19 obče~R~6 | ozko~R~6 | učno~R~3 | … 6 1.722 evro~S~664 | avto~S~224 | igra~S~207 | … 45 … 2.656 145
CCVCVCCV … 397 prinesti~G~297 | prenesti~G~51 | prevesti~G~24 | … 11 … 587 trenutno~R~120 | slučajno~R~99 | pravilno~R~50 | … 64 1.563 številka~S~256 | stopinja~S~193 | stališče~S~108 | … 144 … 2.749 263
VCCVCV … 1.465 igrati~G~290 | ostati~G~288 | iskati~G~266 | … 42 … 1 ihtavo~R~1 1 1.471 oddaja~S~175 | oblika~S~158 | osnova~S~101 | … 73 … 3.046 167
VCVCV … 11.577 imeti~G~10.131 | ajati~G~562 | upati~G~386 | … 23 … 288 edino~R~135 | enako~R~63 | okoli~R~56 | … 6 1.252 ideja~S~159 | oseba~S~129 | ekipa~S~127 | … 60 … 13.700 159
CCVCVCVC … 0 0 … 64 vsekakor~R~51 | predaleč~R~13 2 1.232 trenutek~S~270 | slovenec~S~149 | profesor~S~124 | … 99 … 3.175 301
CVCCVCVC … 0 0 … 33 ravnokar~R~20 | dandanes~R~11 | naprodaj~R~2 3 1.226 mercator~S~181 | postopek~S~120 | sestanek~S~97 | … 172 … 3.167 471
VCCVCVC … 0 0 … 81 odzadaj~R~58 | odzunaj~R~23 2 1.150 interes~S~167 | odgovor~S~144 | občutek~S~139 | … 74 … 2.256 199
VCVC … 0 0 … 0 0 1.150 imam~S~235 | irak~S~100 | ozir~S~73 | … 82 … 3.551 192
CVCVCCVC … 0 0 … 20 popoldan~R~15 | malokrat~R~4 | količkaj~R~1 3 1.131 rezultat~S~157 | minister~S~153 | direktor~S~57 | … 163 … 2.243 312
CVCCVCCV … 53 zaplesti~G~15 | razpasti~G~13 | pokrasti~G~7 | … 10 … 1.030 verjetno~R~505 | potrebno~R~148 | normalno~R~96 | … 56 1.115 podjetje~S~249 | področje~S~173 | postelja~S~64 | … 117 … 2.773 292
CCCVCV … 825 držati~G~215 | vrniti~G~143 | zbrati~G~78 | … 45 … 159 hkrati~R~83 | sproti~R~46 | zdravo~R~14 | … 5 1.038 država~S~538 | knjiga~S~204 | črtica~S~47 | … 49 … 2.134 130

2.3.6 List of word forms by basic consonant-vowel structure in the GOS 1.0 corpus
File at CLARIN.SI: GOS1.0_cv_forms_robust_entire.tsv
Columns: Structure; …; Frequency (G); Most frequent forms (G); Number of unique forms (G); …; Frequency (R); Most frequent forms (R); Number of unique forms (R); Frequency (S); Most frequent forms (S); Number of unique forms (S); …; Total frequency of structure; Total number of unique forms in structure
CVCV … 16.929 bomo~G~1.819 | reku~G~897 | mamo~G~673 | … 744 … 10.589 tako~R~3.345 | zelo~R~1.127 | zato~R~799 | ... 236 13.644 redu~S~683 | leta~S~420 | časa~S~361 | … 1.567 … 68.033 3.653
CVC … 23.041 sem~G~4.238 | vem~G~3.296 | veš~G~2.611 | ... 280 … 15.205 tak~R~2.543 | tam~R~1.817 | ful~R~914 | ... 198 12.384 pol~S~2.815 | sej~S~1.978 | dan~S~814 | … 612 … 103.227 2.255
CVCCV … 7.567 rekla~G~769 | boste~G~645 | rekli~G~361 | … 655 … 9.015 lahko~R~2.885 | dobro~R~885 | vedno~R~727 | … 270 9.238 jutro~S~536 | koncu~S~226 | mesto~S~197 | … 1.478 … 36.434 3.591
CVCVCV … 9.680 recimo~G~838 | pomeni~G~371 | zanima~G~243 | … 1.361 … 2.486 mogoče~R~815 | veliko~R~442 | nekako~R~157 | … 87 8.492 zadeve~S~153 | zadeva~S~140 | besede~S~125 | … 1.742 … 27.883 4.005
CVCVC … 8.279 nisem~G~622 | misim~G~307 | delat~G~273 | … 758 … 9.021 potem~R~2.473 | danes~R~933 | nazaj~R~555 | ...
193 7.647 minut~S~461 | način~S~448 | teden~S~187 | … 1.034 … 35.313 2.780 CCVCV … 4.769 pravi~G~501 | pride~G~483 | gremo~G~435 | … 556 … 2.349 treba~R~750 | glede~R~323 | zdele~R~184 | … 102 6.840 hvala~S~790 | ljudi~S~293 | svetu~S~168 | … 875 … 21.487 2.018 CVCCVC … 2.596 mislim~G~599 | nardit~G~168 | misliš~G~146 | … 386 … 2.394 naprej~R~909 | takrat~R~494 | takret~R~171 | … 96 4.387 gospod~S~589 | mislim~S~310 | center~S~125 | … 786 … 12.666 1.878 CVCCVCV … 2.842 nardila~G~141 | mislite~G~75 | nardili~G~71 | … 635 … 186 hitreje~R~27 | kasneje~R~20 | pozneje~R~16 | … 51 4.068 resnici~S~89 | karkoli~S~62 | razmere~S~60 | … 1.034 … 8.481 2.250 CCVCCV … 1.667 prišli~G~167 | prišla~G~161 | glejte~G~115 | … 373 … 921 zdejle~R~122 | zjutri~R~58 | srečno~R~53 | … 106 3.614 ljudje~S~284 | službo~S~108 | stanje~S~108 | … 588 … 8.711 1.587 CVCVCCV … 1.080 povejte~G~85 | naredli~G~54 | naredla~G~38 | … 363 … 1.148 sigurno~R~151 | podobno~R~81 | ponovno~R~78 | … 135 3.300 začetku~S~153 | denarja~S~108 | začetka~S~68 | … 914 … 7.938 2.124 CCVCVCV … 3.384 zgodilo~G~149 | slišali~G~142 | pridejo~G~121 | … 827 … 266 drugače~R~178 | premalo~R~40 | globoko~R~16 | … 18 2.968 primeru~S~238 | skupina~S~117 | skupine~S~92 | … 604 … 7.971 1.730 CVCVCVCV … 3.550 govorimo~G~155 | povedala~G~130 | govorili~G~105 | … 870 … 678 zanimivo~R~172 | zagotovo~R~111 | ponavadi~R~107 | … 27 2.690 mariboru~S~73 | besedilo~S~71 | besedila~S~65 | … 628 … 8.415 1.918 CCVC … 2.382 greš~G~289 | grem~G~272 | znam~G~206 | … 179 … 8.321 zdej~R~3.036 | zdaj~R~1.195 | prej~R~804 | … 101 2.583 svet~S~136 | plus~S~115 | brat~S~83 | … 344 … 20.351 1.053 CCCVCV … 847 zgleda~G~73 | splača~G~54 | prnesu~G~31 | … 239 … 163 hkrati~R~82 | sproti~R~25 | zdravo~R~13 | … 13 2.474 stvari~S~588 | strani~S~566 | država~S~192 | … 216 … 3.822 557 CVCVCVC … 2.782 povedal~G~197 | povedat~G~176 | razumem~G~152 | … 631 … 400 posebej~R~132 | ponavad~R~67 | nikakor~R~43 | … 30 2.456 rešitev~S~99 | začetek~S~77 | telefon~S~65 | … 
642 … 7.045 1.681 CVCCCV … 415 manjka~G~72 | mislte~G~20 | mislna~G~16 | … 139 … 192 končno~R~35 | boljše~R~27 | pestro~R~11 | … 58 2.407 bistvu~S~965 | ženske~S~105 | ženska~S~87 | … 269 … 6.075 863 CCVCVC … 2.583 prideš~G~95 | gledam~G~79 | prišel~G~79 | … 543 … 2.313 skupaj~R~319 | včasih~R~310 | preveč~R~263 | … 80 2.361 primer~S~441 | človek~S~207 | spomin~S~44 | … 467 … 9.687 1.415 VCCVCV … 1.317 ostane~G~92 | izvoli~G~61 | uspelo~G~54 | … 260 … 39 odzadi~R~11 | odzuni~R~10 | udzadi~R~10 | … 10 2.317 otroci~S~107 | otroke~S~90 | okviru~S~77 | … 341 … 4.131 723 CVCC … 2.510 morš~G~322 | mism~G~257 | nisn~G~167 | … 179 … 4.351 bolj~R~1.013 | čist~R~585 | fajn~R~478 | … 118 2.233 konc~S~187 | mism~S~159 | cajt~S~89 | … 336 … 12.297 1.149 CCVCVCCV … 244 prinesla~G~19 | spominja~G~19 | premakne~G~8 | … 113 … 558 trenutno~R~120 | slučajno~R~85 | pravilno~R~47 | … 70 1.969 trenutku~S~129 | številka~S~101 | slovenci~S~76 | … 416 … 3.972 919 CVCCVCCV … 618 poglejte~G~197 | poglejmo~G~130 | postavli~G~10 | … 168 … 1.036 verjetno~R~416 | potrebno~R~147 | normalno~R~86 | … 87 1.877 možnosti~S~106 | podjetja~S~88 | področju~S~85 | … 476 … 5.212 1.146 VCV … 1.336 aja~G~561 | ima~G~535 | iti~G~42 | … 39 … 36 ali~R~21 | upi~R~11 | uli~R~1 | … 6 1.754 evo~S~310 | ura~S~211 | ure~S~186 | … 109 … 13.338 341 VCCVC … 454 igral~G~65 | iskat~G~61 | uspel~G~35 | … 90 … 241 enkat~R~154 | okrog~R~37 | odveč~R~12 | … 17 1.652 evrov~S~532 | otrok~S~160 | izraz~S~101 | … 171 … 6.342 370 VCVCV … 1.879 imamo~G~477 | imajo~G~280 | imeli~G~242 | … 142 … 225 edino~R~87 | enako~R~61 | oboje~R~28 | … 15 1.606 ideja~S~79 | adijo~S~71 | ekipa~S~45 | … 252 … 4.624 547 CCVCCVC … 693 spomnim~G~62 | spomnem~G~39 | prebral~G~32 | … 182 … 412 zjutraj~R~141 | dvakrat~R~86 | zjutrej~R~39 | … 27 1.565 problem~S~369 | program~S~79 | predmet~S~75 | … 255 … 3.800 713 CCVCCVCV … 828 prebrala~G~40 | prebrali~G~38 | spomnite~G~29 | … 278 … 73 glasbeno~R~19 | družbeno~R~15 | spomladi~R~5 | … 20 1.293 
problema~S~68 | prostora~S~57 | problemi~S~43 | … 341 … 2.786 866
CV … 63.680 je~G~30.945 | so~G~7.869 | bi~G~6.993 | … 86 … 3.236 tu~R~1384 | te~R~593 | ze~R~414 | … 77 1.269 po~S~783 | se~S~206 | ma~S~56 | … 48 … 296.450 688
CCCVC … 370 pršov~G~45 | vrnil~G~22 | držal~G~20 | … 112 … 429 prvič~R~183 | drgač~R~162 | stran~R~21 | … 23 1.189 stvar~S~387 | stran~S~229 | strah~S~68 | … 108 … 3.400 340
VCCV … 206 išče~G~35 | igra~G~26 | afna~G~19 | … 41 … 31 uhka~R~7 | obče~R~6 | ozko~R~4 | … 14 1.152 avto~S~107 | ajde~S~95 | igra~S~85 | … 126 … 2.628 332
CVCCVCVC … 1.269 verjamem~G~95 | pogledat~G~82 | pogledam~G~37 | … 357 … 42 ravnokar~R~19 | dandanes~R~11 | verjeten~R~6 | … 7 1.145 merkator~S~76 | postopek~S~54 | posnetek~S~40 | … 351 … 3.103 915
CVCVCCVC … 298 zamenjat~G~17 | zamenjal~G~12 | pufarbat~G~8 | … 154 … 48 velikrat~R~14 | popoldan~R~11 | pomemben~R~5 | … 14 1.107 minister~S~81 | rezultat~S~75 | podatkov~S~37 | … 355 … 2.181 823
CCVCVCVCV … 922 prihajajo~G~34 | prodajajo~G~20 | pričakuje~G~17 | … 389 … 13 prepočasi~R~5 | previsoko~R~3 | dragoceno~R~1 | … 7 1.089 sloveniji~S~283 | slovenija~S~186 | slovenije~S~136 | … 153 … 2.267 669

2.3.7 List of standardized forms by basic consonant-vowel structure in the GOS 1.0 corpus
File at CLARIN.SI: GOS1.0_cv_stand_robust_entire.tsv
Columns: Structure; …; Frequency (G); Most frequent forms (G); Number of unique forms (G); …; Frequency (R); Most frequent forms (R); Number of unique forms (R); Frequency (S); Most frequent forms (S); Number of unique forms (S); …; Total frequency of structure; Total number of unique forms in structure
CVCV … 18.001 bilo~G~2.262 | bomo~G~1.818 | bila~G~1.587 | ... 703 … 18.039 tako~R~3.345 | tako~R~2.665 | tako~R~2.543 | … 237 14.044 redu~S~683 | leta~S~420 | časa~S~361 | … 1.879 … 83.255 3.962
CVC … 20.309 sem~G~4.224 | vem~G~3.296 | veš~G~2.611 | ...
267 … 9.778 tam~R~1.817 | ful~R~914 | sem~R~898 | … 202 13.542 pol~S~2.815 | saj~S~1.978 | dan~S~814 | … 677 … 96.498 2.138 CVCCV … 5.684 rekla~G~769 | boste~G~644 | rekli~G~360 | … 464 … 11.735 lahko~R~2.884 | dobro~R~885 | vedno~R~723 | … 359 8.910 jutro~S~536 | koncu~S~221 | mesto~S~197 | … 1.464 … 36.547 3.375 CVCVCV … 13.040 recimo~G~838 | pomeni~G~371 | zanima~G~243 | … 1.973 … 4.068 mogoče~R~815 | veliko~R~442 | veliko~R~195 | … 208 8.765 zadeve~S~153 | zadeva~S~140 | besede~S~125 | … 1.875 … 35.588 5.106 CVCVC … 10.845 rekel~G~896 | nisem~G~622 | moraš~G~322 | … 985 … 10.482 potem~R~2.473 | danes~R~929 | potem~R~593 | … 247 7.971 minut~S~461 | način~S~447 | teden~S~186 | … 1.103 … 40.662 3.232 CCVCV … 4.809 pravi~G~500 | pride~G~483 | gremo~G~435 | … 505 … 1.829 treba~R~750 | glede~R~323 | slabo~R~116 | … 91 6.878 hvala~S~790 | ljudi~S~291 | svetu~S~166 | … 941 … 21.405 2.118 CVCCVC … 3.345 mislim~G~602 | mislim~G~599 | mislim~G~307 | … 311 … 2.404 naprej~R~909 | takrat~R~494 | takrat~R~171 | … 80 4.869 gospod~S~582 | mislim~S~323 | mislim~S~310 | … 782 … 13.939 1.725 CVCCVCV … 2.959 mislila~G~105 | mislite~G~75 | poznate~G~61 | … 722 … 193 hitreje~R~27 | kasneje~R~20 | pozneje~R~16 | … 52 4.000 resnici~S~89 | razmere~S~60 | razlika~S~58 | … 1.060 … 8.390 2.327 CCVCCV … 1.436 prišli~G~167 | prišla~G~160 | prišla~G~143 | … 159 … 1.195 zdajle~R~184 | zdajle~R~122 | srečno~R~53 | … 112 3.498 ljudje~S~284 | službo~S~108 | stanje~S~108 | … 522 … 8.242 1.232 CVCVCCV … 614 povejte~G~85 | čakajte~G~35 | dosegla~G~25 | … 193 … 1.277 sigurno~R~151 | podobno~R~81 | ponovno~R~78 | … 157 3.443 začetku~S~153 | denarja~S~108 | začetka~S~67 | … 953 … 7.586 1.978 CCVCVCV … 4.688 zgodilo~G~149 | slišali~G~142 | pridejo~G~121 | … 1.199 … 604 drugače~R~178 | drugače~R~162 | drugače~R~96 | … 40 3.044 primeru~S~238 | skupina~S~116 | skupine~S~92 | … 667 … 10.192 2.264 CVCVCVCV … 5.770 povedati~G~168 | narediti~G~167 | govorimo~G~155 | … 1.325 … 806 zanimivo~R~172 | 
zagotovo~R~111 | ponavadi~R~107 | … 39 2.762 mariboru~S~73 | besedilo~S~71 | besedila~S~65 | … 675 … 11.237 2.540 CVCVCVC … 2.817 povedal~G~167 | razumem~G~152 | naredil~G~151 | … 684 … 348 posebej~R~132 | nikakor~R~43 | ponekod~R~36 | … 35 2.520 rešitev~S~99 | začetek~S~77 | telefon~S~64 | … 670 … 7.186 1.816 CCCVCV … 657 splača~G~54 | držijo~G~22 | skriva~G~21 | … 226 … 156 hkrati~R~82 | sproti~R~25 | sproti~R~17 | … 11 2.450 stvari~S~588 | strani~S~566 | država~S~192 | … 221 … 3.586 542 CCVCVC … 2.862 prišel~G~164 | prideš~G~95 | pravim~G~86 | … 583 … 3.075 skupaj~R~319 | včasih~R~310 | preveč~R~263 | … 174 2.446 primer~S~441 | človek~S~207 | človek~S~69 | … 475 … 11.151 1.601 CVCCCV … 228 manjka~G~72 | zaprli~G~12 | pošlje~G~9 | … 69 … 226 končno~R~35 | boljše~R~27 | boljše~R~13 | … 58 2.394 bistvu~S~965 | ženske~S~105 | ženska~S~87 | … 252 … 6.622 857 VCCVCV … 1.384 ostane~G~92 | izvoli~G~61 | uspelo~G~54 | … 270 … 1ihtavo~R~1 1 2.341 otroci~S~103 | otroke~S~90 | okviru~S~77 | … 357 … 4.193 738 CCVC … 2.407 glej~G~561 | greš~G~289 | grem~G~272 | … 113 … 10.055 zdaj~R~3.036 | zdaj~R~1.195 | zdaj~R~808 | … 107 2.288 svet~S~121 | plus~S~114 | glas~S~82 | … 313 … 21.389 955 CCVCVCCV … 288 prinesla~G~43 | prinesla~G~19 | spominja~G~19 | … 99 … 587 trenutno~R~120 | slučajno~R~85 | pravilno~R~47 | … 81 1.966 trenutku~S~129 | številka~S~101 | slovenci~S~76 | … 424 … 4.087 927 CVCCVCCV … 416 poglejte~G~197 | poglejmo~G~130 | končajmo~G~7 | … 56 … 1.129 verjetno~R~416 | potrebno~R~147 | normalno~R~86 | … 108 1.808 možnosti~S~106 | področju~S~85 | podjetja~S~84 | … 462 … 4.981 1.029 VCV … 2.647 ima~G~1.079 | aja~G~561 | ima~G~534 | … 33 … 32 ali~R~21 | ali~R~11 2 1.747 evo~S~308 | ura~S~209 | ure~S~185 | … 118 … 18.339 407 VCVCV … 5.344 imamo~G~672 | imajo~G~618 | imamo~G~476 | … 234 … 288 edino~R~87 | enako~R~61 | edino~R~45 | … 19 1.672 ideja~S~79 | adijo~S~71 | ekipa~S~45 | … 273 … 8.777 712 VCCVC … 303 igral~G~48 | iskat~G~34 | uspel~G~20 | … 66 … 54 okrog~R~37 | 
odveč~R~12 | odkod~R~4 | … 4 1.644 evrov~S~532 | otrok~S~159 | izraz~S~101 | … 181 … 6.063 348
CCVCCVC … 641 spomnim~G~62 | spomnim~G~39 | spomnim~G~30 | … 175 … 585 zjutraj~R~141 | dvakrat~R~86 | zjutraj~R~58 | … 46 1.568 problem~S~366 | program~S~79 | predmet~S~75 | … 257 … 3.955 709
CVCC … 41 mest~G~16 | rast~G~14 | past~G~3 | … 7 … 2.490 bolj~R~1.013 | fajn~R~478 | bolj~R~395 | … 39 1.378 cajt~S~85 | film~S~66 | točk~S~63 | … 243 … 4.948 557
CCVCCVCV … 1.047 prebrala~G~40 | prebrali~G~38 | spomnite~G~29 | … 349 … 83 glasbeno~R~19 | družbeno~R~15 | spomladi~R~8 | … 23 1.254 problema~S~68 | prostora~S~57 | probleme~S~41 | … 349 … 2.977 960
CCCVC … 195 vrnil~G~21 | držal~G~13 | vrgel~G~13 | … 62 … 206 prvič~R~183 | stran~R~21 | prvič~R~1 | … 4 1.178 stvar~S~385 | stran~S~204 | strah~S~68 | … 105 … 2.876 234
CVCCVCVC … 1.197 verjamem~G~94 | pogledam~G~37 | pogledat~G~34 | … 383 … 33 ravnokar~R~19 | dandanes~R~11 | naprodaj~R~2 | … 4 1.138 mercator~S~76 | postopek~S~54 | posnetek~S~40 | … 363 … 3.010 934
VCCV … 159 išče~G~35 | igra~G~26 | afna~G~19 | … 21 … 19 obče~R~6 | ozko~R~4 | učno~R~3 | … 8 1.109 avto~S~105 | ajde~S~95 | igra~S~85 | … 129 … 2.119 286
CCVCVCVCV … 1.199 prihajajo~G~34 | prodajajo~G~20 | pričakuje~G~17 | … 548 … 12 prepočasi~R~5 | previsoko~R~3 | dragoceno~R~1 | … 6 1.105 sloveniji~S~283 | slovenija~S~186 | slovenije~S~136 | … 167 … 2.587 864
CVCVCCVC … 281 potegnil~G~9 | zamenjal~G~9 | zamenjaš~G~7 | … 148 … 19 popoldan~R~11 | malokrat~R~2 | količkaj~R~1 | … 8 1.100 minister~S~81 | rezultat~S~75 | podatkov~S~37 | … 365 … 2.077 796
CVCVCVCCV … 63 nagovarja~G~9 | pogovarja~G~8 | počakajte~G~7 | … 28 … 269 pozitivno~R~36 | relativno~R~36 | dobesedno~R~24 | … 52 1.062 policisti~S~30 | bivališča~S~25 | potovanje~S~25 | … 390 … 2.078 706

2.3.8 List of lemmas by detailed consonant-vowel structure in the GOS 1.0 corpus
File at CLARIN.SI: GOS1.0_cv_lemmas_finegrained_entire.tsv
Structure … Frequency (G) Most frequent lemmas (G) Number of all unique lemmas (G) … Frequency (R) Most frequent lemmas (R) Number of all unique lemmas (R) Frequency (S) Most frequent lemmas (S) Number of all unique lemmas (S) … Total frequency of structure Total number of all unique lemmas in structure
KVZ … 0 0 … 6.414 tam~R~2.598 | sem~R~1.307 | ful~R~916 | … 13 7.047 pol~S~4.054 | saj~S~2.321 | par~S~201 | … 49 … 37.791 173
KVZV … 0 0 … 836 tule~R~756 | fino~R~55 | solo~R~10 | … 7 2.986 šola~S~432 | kola~S~363 | kama~S~240 | … 99 … 7.032 199
ZVKV … 7.737 reči~G~4.690 | moči~G~2.067 | meti~G~959 | … 7 … 865 lepo~R~806 | nato~R~25 | jako~R~24 | … 6 2.967 leto~S~1.828 | roka~S~295 | mati~S~167 | … 87 … 12.042 156
KVK … 0 0 … 76 peš~R~42 | tik~R~31 | kos~R~2 | … 4 2.894 čas~S~1.155 | kot~S~294 | pot~S~283 | … 74 … 14.695 180
GVZ … 0 0 … 1.391 gor~R~824 | dol~R~530 | žal~R~37 3 2.831 dan~S~1.383 | del~S~883 | dom~S~163 | … 31 … 4.784 103
ZVKVZ … 0 0 … 1.240 noter~R~554 | nekaj~R~507 | nekam~R~113 | … 6 2.534 misel~S~1.086 | način~S~491 | večer~S~217 | … 58 … 6.805 125
ZVZV … 0 0 … 1.429 malo~R~1186 | lani~R~130 | noro~R~30 | … 11 2.131 mora~S~267 | mama~S~257 | vino~S~207 | … 86 … 5.698 150
KKZVZ … 0 0 … 21 stran~R~21 1 1.999 stvar~S~1.056 | stran~S~839 | stroj~S~50 | … 8 … 2.025 14
VZV … 0 0 … 32 ali~R~32 1 1.646 ura~S~905 | eva~S~332 | ime~S~256 | … 15 … 8.162 67
ZVG … 0 0 … 739 rad~R~739 1 1.633 red~S~924 | jaz~S~196 | mož~S~93 | … 30 … 18.422 63
KVZVK … 0 0 … 0 0 1.592 konec~S~784 | pomoč~S~166 | torek~S~115 | … 68 … 2.613 100
ZVGV … 0 0 … 6 luže~R~2 | vede~R~2 | miže~R~1 | … 4 1.493 voda~S~690 | noga~S~134 | miza~S~119 | … 59 … 1.584 99
KZVZVK … 0 0 … 290 preveč~R~278 | privat~R~12 2 1.485 človek~S~1.248 | planet~S~61 | promet~S~58 | … 26 … 1.807 48
KZVZV … 0 0 … 119 kmalu~R~89 | pravo~R~8 | smeje~R~7 | … 8 1.264 hvala~S~797 | hrana~S~126 | trava~S~76 | … 28 … 3.537 73
KZVK … 0 0 … 79 krat~R~50 | proč~R~17 | prec~R~12 3 1.236
svet~S~424 | plus~S~117 | klic~S~74 | … 64 … 1.578 117
ZVK … 0 0 … 937 res~R~566 | nič~R~329 | moč~R~42 3 1.186 noč~S~143 | rok~S~110 | vas~S~108 | … 56 … 16.183 141
GVZV … 0 0 … 2.415 zelo~R~1.835 | doma~R~461 | gori~R~47 | … 8 1.114 delo~S~425 | gora~S~151 | žena~S~94 | … 49 … 3.871 112
ZVKZV … 0 0 … 1.366 notri~R~396 | nekje~R~393 | jutri~R~283 | … 15 1.069 jutro~S~618 | reklo~S~356 | jutri~S~24 | … 17 … 2.489 50
KVKV … 751 peti~G~211 | čuti~G~205 | piti~G~143 | … 6 … 10.627 tako~R~9713 | kako~R~803 | tiho~R~81 | … 5 1.063 hiša~S~311 | kota~S~103 | kača~S~98 | … 70 … 13.762 163
GVKKZV … 0 0 … 1 bistro~R~1 1 1.039 bistvo~S~1.016 | gostja~S~22 | bistra~S~1 3 … 1.043 7
KZVZVZ … 0 0 … 0 0 1.013 primer~S~862 | slalom~S~29 | slovar~S~22 | … 25 … 1.150 46
ZVKKV … 1.206 jesti~G~922 | nesti~G~99 | rasti~G~87 | … 6 … 4.174 lahko~R~4.140 | maksi~R~9 | rusko~R~9 | … 6 972 mesto~S~515 | moški~S~141 | mačka~S~72 | … 34 … 6.583 78
ZVZVKV … 2.948 morati~G~2.233 | meniti~G~201 | meriti~G~82 | … 36 … 1.316 veliko~R~1.316 1 961 minuta~S~662 | novica~S~97 | vejica~S~56 | … 29 … 5.234 75
KVKKV … 257 pasti~G~204 | sesti~G~53 2 … 977 čisto~R~974 | pusto~R~2 | češko~R~1 3 950 cesta~S~332 | točka~S~322 | pošta~S~44 | … 40 … 5.828 82
KVKVZ … 0 0 … 5.217 potem~R~3.149 | tukaj~R~1.136 | super~R~438 | … 5 949 pesem~S~237 | pahor~S~75 | hotel~S~59 | … 59 … 8.008 138
GVZVZ … 0 0 … 341 domov~R~293 | zunaj~R~42 | zaman~R~6 3 905 denar~S~406 | govor~S~146 | žival~S~114 | … 28 … 1.570 70
KVGVZ … 0 0 … 370 sedaj~R~315 | tedaj~R~55 2 849 teden~S~437 | kazen~S~85 | pogoj~S~81 | … 35 … 2.044 67
KZVGZVZ … 0 0 … 0 0 835 problem~S~583 | program~S~197 | triglav~S~53 | … 5 … 854 8
VZZV … 0 0 … 1 urno~R~1 1 793 evro~S~664 | olje~S~82 | emma~S~24 | … 10 … 814 21
KVZVZ … 0 0 … 0 0 755 pomen~S~198 | pojem~S~116 | kamen~S~69 | … 61 … 2.772 123
GVKKVG … 0 0 … 0 0 713 gospod~S~713 1 … 713 1
KZVZVZVZV … 0 0 … 1 praviloma~R~1 1 713 slovenija~S~711 | preminula~S~1 | članarina~S~1 3 … 715 5

2.3.9 List of word forms by detailed consonant-vowel structure in the GOS 1.0 corpus
File at CLARIN.SI: GOS1.0_cv_forms_finegrained_entire.tsv
Structure … Frequency (G) Most frequent forms (G) Number of unique forms (G) … Frequency (R) Most frequent forms (R) Number of unique forms (R) Frequency (S) Most frequent forms (S) Number of unique forms (S) … Total frequency of structure Total number of unique forms in structure
KVZ … 5.260 sem~G~4.238 | šov~G~276 | šel~G~164 | … 71 … 5.132 tam~R~1.817 | ful~R~914 | sem~R~898 | … 70 6.147 pol~S~2.815 | sej~S~1.978 | kam~S~218 | … 121 … 36.548 561
ZVGV … 1.197 mogu~G~206 | vidu~G~157 | vidi~G~155 | … 81 … 405 rada~R~170 | radi~R~149 | laže~R~25 | … 21 2.423 redu~S~683 | vode~S~225 | voda~S~187 | … 177 … 4.157 342
ZVKV … 3.271 reku~G~897 | niso~G~556 | mate~G~356 | … 114 … 1.124 lepo~R~652 | neki~R~205 | lepu~R~60 | … 38 2.339 leta~S~420 | leto~S~320 | leti~S~178 | … 244 … 10.936 541
KVZV … 516 pove~G~127 | poje~G~38 | pije~G~21 | … 91 … 598 tule~R~135 | sama~R~96 | sami~R~94 | … 43 2.080 pole~S~127 | šolo~S~112 | šoli~S~108 | … 259 … 9.049 609
KVKV … 1.275 piše~G~187 | čaki~G~185 | tiče~G~142 | … 120 … 4.600 tako~R~3.345 | kako~R~429 | tuki~R~293 | … 33 2.047 časa~S~361 | času~S~215 | poti~S~111 | … 266 … 12.188 637
ZVZV … 3.704 mamo~G~673 | majo~G~618 | mela~G~466 | … 126 … 800 malo~R~530 | lani~R~110 | raje~R~33 | … 25 1.795 mama~S~160 | vino~S~100 | more~S~80 | … 220 … 10.184 534
GVZV … 5.216 bomo~G~1.819 | bilo~G~552 | bila~G~500 | … 115 … 1.945 zelo~R~1.127 | doma~R~411 | domu~R~60 | … 45 1.682 dela~S~325 | delo~S~271 | delu~S~110 | … 156 … 9.387 426
GVZ … 5.158 bil~G~2.042 | bom~G~1.583 | dej~G~604 | …
41 … 2.955 zaj~R~809 | zej~R~774 | gor~R~601 | … 35 1.610 dan~S~814 | del~S~332 | dol~S~60 | … 65 … 10.206 261 KZVZV … 1.104 pravi~G~501 | tlele~G~163 | traja~G~57 | … 82 … 88 kmalu~R~52 | pravo~R~7 | smeje~R~7 | … 14 1.540 hvala~S~790 | snovi~S~67 | hrano~S~63 | … 138 … 4.579 333 KVK … 362 čak~G~173 | češ~G~74 | pet~G~38 | … 25 … 3.611 tak~R~2.543 | tok~R~199 | kak~R~168 | ... 29 1.526 čas~S~377 | kot~S~190 | pot~S~151 | … 98 … 14.683 371 ZVKVZ … 1.516 nisem~G~622 | misim~G~307 | rečem~G~142 | … 82 … 918 noter~R~234 | nekaj~R~156 | nikol~R~136 | … 30 1.450 način~S~448 | večer~S~187 | misim~S~167 | … 113 … 5.343 291 ZVK … 4.328 veš~G~2.611 | maš~G~930 | met~G~216 | … 44 … 2.038 res~R~547 | več~R~426 | loh~R~295 | … 26 1.337 let~S~526 | res~S~113 | noč~S~92 | … 92 … 16.841 314 KVZKV … 98 konča~G~25 | pejte~G~7 | čujte~G~7 | … 35 … 236 tolko~R~81 | kolko~R~68 | širše~R~17 | … 27 1.321 koncu~S~226 | konca~S~153 | cajta~S~73 | … 177 … 2.202 343 VZV … 1.146 aja~G~561 | ima~G~535 | imu~G~12 | … 12 … 22 ali~R~21 | uli~R~1 2 1.311 evo~S~310 | ura~S~211 | ure~S~186 | … 53 … 10.215 178 KKZVZV … 64 skriva~G~21 | skrije~G~10 | spravu~G~5 | … 19 … 0 0 1.223 stvari~S~588 | strani~S~566 | strela~S~8 | … 33 … 1.295 59 KV … 11.349 so~G~7.869 | si~G~2.908 | šu~G~170 | ... 28 … 2.602 tu~R~1.384 | te~R~593 | ki~R~121 | ... 51 1.165 po~S~783 | se~S~206 | te~S~43 | … 23 … 119.936 339 KZVKV … 436 prišo~G~46 | prišu~G~39 | kliče~G~38 | … 74 … 78 proti~R~50 | preko~R~8 | cvete~R~6 | … 11 1.112 svetu~S~168 | plače~S~92 | srečo~S~78 | … 154 … 2.134 302 ZVKZV … 2.132 rekla~G~769 | rekli~G~361 | nismo~G~284 | … 76 … 891 nekje~R~313 | jutri~R~174 | resno~R~80 | … 41 1.072 jutro~S~536 | rekla~S~161 | jutra~S~29 | … 72 … 4.917 289 ZVKKV … 633 veste~G~307 | niste~G~119 | nista~G~61 | … 38 … 3.175 lahko~R~2.885 | lehko~R~66 | lahku~R~42 | ... 
20 1.056 mesto~S~197 | mestu~S~153 | mesta~S~95 | … 116 … 5.177 249 GVKKZV … 6zastre~G~6 1… 1bistro~R~1 1 1.047 bistvu~S~965 | bistvo~S~28 | gostja~S~16 | … 10 … 1.057 15 KVKKV … 217 pusti~G~39 | čakte~G~35 | pustu~G~27 | … 38 … 248 čisto~R~233 | čista~R~6 | čistu~R~3 | … 8 984 ceste~S~92 | točko~S~88 | točke~S~77 | … 150 … 3.943 269 ZZVGV … 23 ljubi~G~8 | vlaga~G~6 | vloži~G~2 | … 8… 21 mnogo~R~21 1 900 ljudi~S~293 | vlada~S~153 | vlade~S~139 | … 41 … 1.316 73 ZVZ … 5.937 vem~G~3.296 | mel~G~786 | mam~G~608 | ... 58 … 1.026 mal~R~499 | ven~R~309 | mav~R~76 | … 17 773 mam~S~187 | men~S~73 | mir~S~69 | … 84 … 13.554 326 ZVZVK … 512 moreš~G~151 | nimaš~G~141 | moraš~G~69 | … 38 … 200 velik~R~195 | nalaš~R~1 | naleš~R~1 | … 6 773 minut~S~461 | minus~S~162 | junak~S~26 | … 49 … 1.749 123 KZVZVZVZV … 98 preverili~G~16 | preverimo~G~15 | prevajala~G~7 | … 39 … 1praviloma~R~1 1 740 sloveniji~S~283 | slovenija~S~186 | slovenije~S~136 | … 13 … 848 58 KVZVZV … 693 pomeni~G~371 | povemo~G~32 | pelala~G~24 | … 92 … 18 ceneje~R~13 | horile~R~1 | kameno~R~1 | … 6 729 pomeni~S~98 | tujini~S~27 | koleno~S~22 | … 182 … 1.594 357 KZVK … 173 prit~G~63 | smeš~G~37 | sliš~G~18 | … 21 … 144 krat~R~49 | proč~R~16 | prot~R~14 | … 17 726 svet~S~136 | plus~S~115 | klic~S~45 | … 77… 1.485 203 ZVKVZV … 1.454 recimo~G~838 | rečemo~G~146 | rečejo~G~55 | … 156 … 145 nikoli~R~118 | veselo~R~11 | ločeno~R~8 | … 9 706 večina~S~72 | večino~S~31 | višina~S~31 | … 142 … 3.509 397 KVZZV … 115 pejva~G~15 | pejmo~G~13 | pelje~G~13 | … 46 … 165 tamle~R~53 | polno~R~50 | kajne~R~14 | … 17 704 pojma~S~64 | firma~S~33 | firme~S~28 | … 133 … 1.261 303 KKZVZ … 8skril~G~4 | fkral~G~1 | shran~G~1 | … 5… 21 stran~R~21 1 682 stvar~S~387 | stran~S~229 | stroj~S~25 | … 12 … 746 30 KZVGV … 670 pride~G~483 | pridi~G~40 | sneži~G~21 | … 34 … 918 treba~R~750 | slabo~R~116 | slabe~R~25 | … 10 664 sredo~S~88 | krize~S~72 | kriza~S~69 | … 83 … 2.473 162 GVKKVG … 0 0… 0 0 649 gospod~S~589 | gospod~S~59 | gespud~S~1 3… 650 4 
2.3.10 List of standardized forms by detailed consonant-vowel structure in the GOS 1.0 corpus
File at CLARIN.SI: GOS1.0_cv_stand_finegrained_entire.tsv
Structure … Frequency (G) Most frequent forms (G) Number of unique forms (G) … Frequency (R) Most frequent forms (R) Number of unique forms (R) Frequency (S) Most frequent forms (S) Number of unique forms (S) … Total frequency of structure Total number of unique forms in structure
KVZ … 7.042 sem~G~4.224 | sem~G~1.348 | sem~G~288 | … 51 … 6.004 tam~R~1.817 | ful~R~914 | sem~R~898 | … 80 7.505 pol~S~2.815 | saj~S~1.978 | pol~S~783 | … 140 … 43.486 545
ZVGV … 550 vidi~G~154 | vidi~G~53 | vodi~G~46 | … 58 … 391 rada~R~170 | radi~R~149 | laže~R~24 | … 19 2.451 redu~S~683 | vode~S~222 | voda~S~186 | … 227 … 3.482 345
ZVKV … 1.895 niso~G~555 | reče~G~275 | nisi~G~229 | … 105 … 809 lepo~R~650 | lepo~R~60 | nato~R~25 | … 19 2.428 leta~S~420 | leto~S~319 | leti~S~174 | … 296 … 8.687 557
KVZV … 418 pove~G~125 | poje~G~38 | pije~G~21 | … 62 … 1.252 tule~R~549 | tule~R~135 | samo~R~97 | … 37 2.290 šolo~S~112 | šoli~S~108 | šole~S~94 | … 360 … 11.266 695
KVKV … 877 piše~G~185 | tiče~G~142 | čaka~G~83 | … 90 … 10.626 tako~R~3.345 | tako~R~2.665 | tako~R~2.543 | … 62 2.058 časa~S~361 | času~S~215 | poti~S~110 | … 294 … 18.354 667
ZVZV … 1.372 more~G~379 | mora~G~257 | nima~G~231 | … 77 … 1.449 malo~R~530 | malo~R~499 | lani~R~110 | … 38 1.852 mama~S~158 | vino~S~100 | more~S~77 | … 258 … 9.930 577
GVZV … 10.704 bilo~G~2.262 | bomo~G~1.818 | bila~G~1.587 | …
199 … 2.415 zelo~R~1.127 | zelo~R~688 | doma~R~411 | … 37 1.658 dela~S~323 | delo~S~261 | delu~S~110 | … 174 … 15.235 499 GVZ … 5.390 bil~G~2.035 | bom~G~1.575 | daj~G~604 | … 110 … 1.391 gor~R~601 | dol~R~394 | gor~R~80 | … 57 1.613 dan~S~814 | del~S~332 | dol~S~60 | … 76 … 8.699 317 KVK … 66 pet~G~35 | češ~G~22 | pit~G~8 | … 4… 76 tik~R~31 | peš~R~29 | peš~R~10 | … 7 1.592 čas~S~375 | kot~S~190 | pot~S~150 | … 113 … 10.186 297 KZVZV … 1.218 pravi~G~500 | pravi~G~179 | tlele~G~163 | … 70 … 119 kmalu~R~52 | kmalu~R~17 | kmalu~R~13 | … 15 1.513 hvala~S~790 | snovi~S~67 | hrano~S~63 | … 141 … 4.843 333 VZV … 2.234 ima~G~1.079 | aja~G~561 | ima~G~534 | … 16 … 32 ali~R~21 | ali~R~11 2 1.341 evo~S~308 | ura~S~209 | ure~S~185 | … 58 … 15.173 277 ZVKVZ … 2.729 rekel~G~896 | nisem~G~622 | rekel~G~195 | … 161 … 1.240 noter~R~234 | nekaj~R~205 | nekaj~R~156 | … 57 1.340 način~S~447 | večer~S~183 | rekel~S~96 | … 111 … 7.824 423 KVZKV … 52 konča~G~25 | čujte~G~7 | karta~G~4 | … 13 … 30 širše~R~17 | hujše~R~8 | hujše~R~2 | … 6 1.330 koncu~S~221 | konca~S~153 | cajta~S~71 | … 182 … 1.738 275 ZVKZVZ … 2.048 mislim~G~602 | mislim~G~599 | mislim~G~307 | … 31 … 980 naprej~R~909 | nikjer~R~45 | naprej~R~6 | … 16 1.286 mislim~S~323 | mislim~S~310 | mislim~S~167 | … 45 … 4.362 114 KKZVZV … 50 skriva~G~21 | skrije~G~10 | skrila~G~3 | … 13 … 0 0 1.257 stvari~S~588 | strani~S~566 | strani~S~25 | … 35 … 1.314 54 ZVK … 2.927 veš~G~2.611 | veš~G~146 | veš~G~57 | … 30 … 1.369 res~R~547 | več~R~426 | nič~R~163 | … 19 1.252 let~S~427 | res~S~113 | noč~S~92 | … 100 … 14.458 291 KZVKV … 385 priti~G~62 | kliče~G~38 | priti~G~33 | … 68 … 83 proti~R~50 | proti~R~14 | preko~R~8 | … 8 1.110 svetu~S~166 | plače~S~92 | srečo~S~78 | … 169 … 2.125 310 ZVKZV … 2.008 rekla~G~769 | rekli~G~360 | nismo~G~284 | … 92 … 1.391 nekje~R~313 | jutri~R~174 | notri~R~107 | … 80 1.082 jutro~S~536 | rekla~S~161 | rekli~S~29 | … 79 … 5.291 360 GVKKZV … 6zastre~G~6 1… 1bistro~R~1 1 1.052 bistvu~S~965 | bistvo~S~24 | 
gostja~S~16 | … 14 … 1.063 20 KVKKV … 128 pusti~G~39 | košta~G~14 | pusti~G~12 | … 25 … 977 čisto~R~585 | čisto~R~233 | čisto~R~137 | … 14 1.030 ceste~S~92 | točko~S~88 | točke~S~77 | … 174 … 5.212 314 ZVKKV … 678 veste~G~307 | niste~G~119 | nista~G~61 | … 44 … 4.074 lahko~R~2884 | lahko~R~295 | lahko~R~205 | … 52 1.004 mesto~S~197 | mestu~S~152 | mesta~S~94 | … 126 … 6.045 291 ZZVGV … 27 ljubi~G~8 | vlaga~G~6 | ljubi~G~5 | … 8… 21 mnogo~R~21 1 975 ljudi~S~291 | vlada~S~150 | vlade~S~138 | … 57 … 1.414 98 ZVZVK … 959 moraš~G~322 | moreš~G~151 | nimaš~G~141 | … 40 … 1veječ~R~1 1 822 minut~S~461 | minus~S~162 | james~S~24 | … 62 … 2.030 130 KVZVK … 117 poveš~G~45 | čuješ~G~31 | piješ~G~14 | … 17… 1parih~R~1 1 779 konec~S~132 | konec~S~116 | pomoč~S~81 | … 74 … 1.076 128 KZVZVZVZV … 139 preverili~G~16 | preverimo~G~15 | preverili~G~10 | … 56 … 1praviloma~R~1 1 744 sloveniji~S~283 | slovenija~S~186 | slovenije~S~136 | … 17… 892 78 ZVKVZV … 1.643 recimo~G~838 | rečemo~G~146 | rečejo~G~55 | … 225 … 349 nikoli~R~136 | nikoli~R~118 | nikoli~R~27 | … 19 721 večina~S~72 | večino~S~31 | višina~S~31 | … 153 … 3.952 489 KVZVZV … 705 pomeni~G~371 | pomeni~G~64 | povemo~G~32 | … 93 … 30 ceneje~R~13 | ceneje~R~12 | ceneje~R~3 | … 5 717 pomeni~S~98 | pomeni~S~28 | tujini~S~27 | … 166 … 1.537 324 KVZVKV … 65 poveča~G~7 | furati~G~4 | poroča~G~4 | … 41 … 1.014 toliko~R~140 | koliko~R~121 | koliko~R~93 | … 72 706 pomoči~S~41 | panike~S~40 | celoti~S~29 | … 159 … 2.688 363 KZVK … 48 smeš~G~37 | smeš~G~4 | klet~G~2 | … 7… 79 krat~R~49 | proč~R~16 | prec~R~12 | … 5 685 svet~S~121 | plus~S~114 | klic~S~45 | … 80 … 1.148 152 KZVGV … 733 pride~G~483 | pridi~G~35 | pride~G~34 | … 39 … 990 treba~R~750 | slabo~R~116 | treba~R~46 | … 19 673 sredo~S~88 | krize~S~72 | kriza~S~68 | … 96 … 2.610 189 KVZZV … 84 pelje~G~13 | pelje~G~11 | kamna~G~6 | … 31 … 216 tamle~R~53 | polno~R~50 | tamle~R~28 | … 28 664 pojma~S~64 | firma~S~33 | firme~S~28 | … 140 … 1.197 284 KKZVZ … 6skril~G~3 | skril~G~1 | 
skril~G~1 | … 4 … 21 stran~R~21 1 662 stvar~S~385 | stran~S~204 | stroj~S~25 | … 14 … 694 24

2.3.11 List of noun lemmas in the GOS 1.0 corpus with text-type distribution
File at CLARIN.SI: GOS1.0-words-nouns-lemmas-taxonomy-entire.tsv
Lemma Lemma (lower-case) Lemma Lemma (lower-case) Part-of-speech category Standardized form Total absolute frequency of lemma Percentage of all found lemmas Total relative frequency (per million occurrences) Absolute frequency [gos.T.J.R] Percentage [gos.T.J.R] Relative frequency [gos.T.J.R] Absolute frequency [gos.T.N.Z] Percentage [gos.T.N.Z] Relative frequency [gos.T.N.Z] Absolute frequency [gos.T.J.I] Percentage [gos.T.J.I] Relative frequency [gos.T.J.I] Absolute frequency [gos.T.N.N] Percentage [gos.T.N.N] Relative frequency [gos.T.N.N]
pol pol pol pol S pol 4,053 2.55 % 3,915.54 463 1.19 % 2,008.48 2,413 7.14 % 8,164.66 333 0.50 % 942.94 844 4.27 % 5,413.97
saj saj saj saj S saj 2,321 1.46 % 2,242.28 303 0.78 % 1,314.40 1,298 3.84 % 4,391.93 258 0.39 % 730.57 462 2.34 % 2,963.57
misel misel misel misel S mislim 1,011 0.64 % 976.71 155 0.40 % 672.38 419 1.24 % 1,417.73 186 0.28 % 526.69 251 1.27 % 1,610.08
bistvo bistvo bistvo bistvo S bistvu 988 0.62 % 954.49 163 0.42 % 707.09 234 0.69 % 791.77 232 0.35 % 656.95 359 1.82 % 2,302.86
red red red red S redu 863 0.54 % 833.73 179 0.46 % 776.50 238 0.70 % 805.30 236 0.35 % 668.27 210 1.06 % 1,347.08
dan dan dan dan S dan 861 0.54 % 831.80 319 0.82 % 1,383.81 273 0.81 % 923.73 158 0.24 % 447.40 111 0.56 % 712.03
hvala hvala hvala hvala S hvala 793 0.50 % 766.10 340 0.87 % 1,474.91 60 0.18 % 203.02 300 0.45 % 849.50 93 0.47 % 596.56
stran stran stran stran S strani 608 0.38 % 587.38 129 0.33 % 559.60 107 0.32 % 362.05 287 0.43 % 812.69 85 0.43 % 545.25
stvar stvar stvar stvar S stvari 594 0.37 % 573.85 95 0.24 % 412.11 102 0.30 % 345.13 256 0.39 % 724.91 141 0.71 % 904.47
gospod gospod gospod gospod S gospod
586 0.37 % 566.13 24 0.06 % 104.11 5 0.01 % 16.92 543 0.82 % 1,537.59 14 0.07 % 89.81 evro evro evro evro S evrov 548 0.34 % 529.41 148 0.38 % 642.02 155 0.46 % 524.46 164 0.25 % 464.39 81 0.41 % 519.59 jutro jutro jutro jutro S jutro 539 0.34 % 520.72 451 1.15 % 1,956.42 9 0.03 % 30.45 76 0.11 % 215.21 3 0.01 % 19.24 minuta minuta minuta minuta S minut 467 0.29 % 451.16 318 0.81 % 1,379.47 92 0.27 % 311.29 34 0.05 % 96.28 23 0.12 % 147.54 leto leto leto leto S let 462 0.29 % 446.33 141 0.36 % 611.65 116 0.34 % 392.50 181 0.27 % 512.53 24 0.12 % 153.95 način način način način S način 449 0.28 % 433.77 63 0.16 % 273.29 34 0.10 % 115.04 252 0.38 % 713.58 100 0.51 % 641.47 primer primer primer primer S primer 446 0.28 % 430.87 53 0.14 % 229.91 60 0.18 % 203.02 241 0.36 % 682.43 92 0.47 % 590.15 leto leto leto leto S leta 437 0.28 % 422.18 130 0.33 % 563.94 76 0.23 % 257.15 204 0.31 % 577.66 27 0.14 % 173.20 čas čas čas čas S čas 432 0.27 % 417.35 143 0.37 % 620.33 46 0.14 % 155.65 192 0.29 % 543.68 51 0.26 % 327.15 leto leto leto leto S leto 430 0.27 % 415.42 94 0.24 % 407.77 163 0.48 % 551.53 129 0.19 % 365.28 44 0.22 % 282.24 človek človek človek človek S ljudi 394 0.25 % 380.64 59 0.15 % 255.94 89 0.26 % 301.14 212 0.32 % 600.31 34 0.17 % 218.10 vprašanje vprašanje vprašanje vprašanje S vprašanje 393 0.25 % 379.67 72 0.18 % 312.33 54 0.16 % 182.72 216 0.33 % 611.64 51 0.26 % 327.15 stvar stvar stvar stvar S stvar 386 0.24 % 372.91 47 0.12 % 203.88 83 0.25 % 280.84 172 0.26 % 487.05 84 0.42 % 538.83 človek človek človek človek S ljudje 376 0.24 % 363.25 73 0.19 % 316.67 58 0.17 % 196.25 201 0.30 % 569.16 44 0.22 % 282.24 problem problem problem problem S problem 366 0.23 % 353.59 32 0.08 % 138.81 83 0.25 % 280.84 163 0.24 % 461.56 88 0.45 % 564.49 čas čas čas čas S časa 362 0.23 % 349.72 100 0.26 % 433.80 57 0.17 % 192.87 146 0.22 % 413.42 59 0.30 % 378.46 kola kola kola kola S koli 352 0.22 % 340.06 57 0.15 % 247.26 82 0.24 % 277.46 113 0.17 % 319.98 100 0.51 % 
641.47
del del del del S dela 335 0.21 % 323.64 53 0.14 % 229.91 123 0.36 % 416.18 116 0.17 % 328.47 43 0.22 % 275.83
del del del del S del 332 0.21 % 320.74 72 0.18 % 312.33 38 0.11 % 128.58 187 0.28 % 529.52 35 0.18 % 224.51
Eva eva Eva eva S evo 308 0.19 % 297.55 178 0.46 % 772.16 60 0.18 % 203.02 16 0.02 % 45.31 54 0.27 % 346.39
dan dan dan dan S dni 307 0.19 % 296.59 69 0.18 % 299.32 115 0.34 % 389.12 85 0.13 % 240.69 38 0.19 % 243.76
Slovenija slovenija Slovenija slovenija S Sloveniji 286 0.18 % 276.30 54 0.14 % 234.25 31 0.09 % 104.89 187 0.28 % 529.52 14 0.07 % 89.81
konec konec konec konec S koncu 286 0.18 % 276.30 73 0.19 % 316.67 58 0.17 % 196.25 110 0.17 % 311.48 45 0.23 % 288.66

2.3.12 List of noun lower-case word forms in the GOS 1.0 corpus with standardized word forms, lemmas, morphosyntactic tags and text-type distribution
File at CLARIN.SI: GOS1.0-words-nouns-lowercase_forms-standardized_forms-lemmas-morphosyntactic_tags-taxonomy-entire.tsv
Lower-case word form Lemma Lemma (lower-case) Morphosyntactic tag Standardized form Total absolute frequency of lower-case word form Percentage of all found lower-case word forms Total relative frequency (per million occurrences) Absolute frequency [gos.T.J.R] Percentage [gos.T.J.R] Relative frequency [gos.T.J.R] Absolute frequency [gos.T.N.Z] Percentage [gos.T.N.Z] Relative frequency [gos.T.N.Z] Absolute frequency [gos.T.J.I] Percentage [gos.T.J.I] Relative frequency [gos.T.J.I] Absolute frequency [gos.T.N.N] Percentage [gos.T.N.N] Relative frequency [gos.T.N.N] msd01 msd02 msd03 msd04 msd05 msd06
pol pol pol Somei pol 2,214 1.39 % 2,138.91 292 0.75 % 1,266.68 1,175 3.48 % 3,975.75 206 0.31 % 583.32 541 2.74 % 3,470.33 S o m e i
sej saj saj Somei saj 1,978 1.24 % 1,910.91 239 0.61 % 1,036.77 1,119 3.31 % 3,786.26 218 0.33 % 617.30 402 2.04 % 2,578.69 S o m e i
hvala hvala hvala Sozei hvala 790 0.50 % 763.21 338 0.86 % 1,466.23 59
0.17 % 199.63 300 0.45 % 849.50 93 0.47 % 596.56 S o z e i bistvu bistvo bistvo Sosed bistvu 749 0.47 % 723.60 121 0.31 % 524.89 181 0.54 % 612.43 164 0.25 % 464.39 283 1.43 % 1,815.35 S o s e d po pol pol Somei pol 675 0.42 % 652.11 39 0.10 % 169.18 465 1.38 % 1,573.38 34 0.05 % 96.28 137 0.69 % 878.81 S o m e i gospod gospod gospod Somei gospod 582 0.37 % 562.26 24 0.06 % 104.11 4 0.01 % 13.53 540 0.81 % 1,529.10 14 0.07 % 89.81 S o m e i dan dan dan Somei dan 573 0.36 % 553.57 219 0.56 % 950.01 165 0.49 % 558.30 112 0.17 % 317.15 77 0.39 % 493.93 S o m e i evrov evro evro Sommr evrov 532 0.33 % 513.96 148 0.38 % 642.02 142 0.42 % 480.47 161 0.24 % 455.90 81 0.41 % 519.59 S o m m r jutro jutro jutro Sosei jutro 471 0.30 % 455.03 394 1.01 % 1,709.16 5 0.01 % 16.92 71 0.11 % 201.05 1 0.01 % 6.41 S o s e i minut minuta minuta Sozmr minut 461 0.29 % 445.36 316 0.81 % 1,370.80 88 0.26 % 297.76 34 0.05 % 96.28 23 0.12 % 147.54 S o z m r let leto leto Sosmr let 425 0.27 % 410.59 126 0.32 % 546.58 95 0.28 % 321.44 180 0.27 % 509.70 24 0.12 % 153.95 S o s m r stvari stvar stvar Sozer stvari 374 0.23 % 361.32 62 0.16 % 268.95 68 0.20 % 230.09 150 0.23 % 424.75 94 0.48 % 602.98 S o z e r redu red red Somem redu 368 0.23 % 355.52 77 0.20 % 334.02 46 0.14 % 155.65 139 0.21 % 393.60 106 0.54 % 679.95 S o m e m časa čas čas Somer časa 347 0.22 % 335.23 97 0.25 % 420.78 52 0.15 % 175.95 143 0.21 % 404.93 55 0.28 % 352.81 S o m e r problem problem problem Somei problem 342 0.21 % 330.40 31 0.08 % 134.48 80 0.24 % 270.69 147 0.22 % 416.25 84 0.42 % 538.83 S o m e i stvar stvar stvar Sozei stvar 340 0.21 % 328.47 41 0.10 % 177.86 73 0.22 % 247 153 0.23 % 433.24 73 0.37 % 468.27 S o z e i leta leto leto Soser leta 326 0.20 % 314.94 104 0.27 % 451.15 40 0.12 % 135.34 165 0.25 % 467.22 17 0.09 % 109.05 S o s e r mislm misel misel Sozmd mislim 323 0.20 % 312.05 44 0.11 % 190.87 134 0.40 % 453.40 53 0.08 % 150.08 92 0.47 % 590.15 S o z m d dela del del Somer dela 321 0.20 % 310.11 47 
0.12 % 203.88 116 0.34 % 392.50 115 0.17 % 325.64 43 0.22 % 275.83 S o m e r
vprašanje vprašanje vprašanje Sosei vprašanje 320 0.20 % 309.15 55 0.14 % 238.59 41 0.12 % 138.73 182 0.27 % 515.36 42 0.21 % 269.42 S o s e i
redu red red Somed redu 315 0.20 % 304.32 71 0.18 % 308 68 0.20 % 230.09 81 0.12 % 229.36 95 0.48 % 609.39 S o m e d
mislim misel misel Sozmd mislim 310 0.20 % 299.49 61 0.16 % 264.62 102 0.30 % 345.13 77 0.12 % 218.04 70 0.35 % 449.03 S o z m d
evo Eva eva Slzet evo 308 0.19 % 297.55 178 0.46 % 772.16 60 0.18 % 203.02 16 0.02 % 45.31 54 0.27 % 346.39 S l z e t
primer primer primer Sometn primer 308 0.19 % 297.55 46 0.12 % 199.55 48 0.14 % 162.41 144 0.22 % 407.76 70 0.35 % 449.03 S o m e t n
del del del Somei del 293 0.18 % 283.06 70 0.18 % 303.66 35 0.10 % 118.43 159 0.24 % 450.23 29 0.15 % 186.03 S o m e i
čas čas čas Somei čas 290 0.18 % 280.16 100 0.26 % 433.80 35 0.10 % 118.43 123 0.18 % 348.29 32 0.16 % 205.27 S o m e i
ljudje človek človek Sommi ljudje 284 0.18 % 274.37 50 0.13 % 216.90 33 0.10 % 111.66 168 0.25 % 475.72 33 0.17 % 211.68 S o m m i
sloveniji Slovenija slovenija Slzem Sloveniji 263 0.17 % 254.08 49 0.12 % 212.56 27 0.08 % 91.36 173 0.26 % 489.88 14 0.07 % 89.81 S l z e m
strani stran stran Sozer strani 263 0.17 % 254.08 62 0.16 % 268.95 40 0.12 % 135.34 123 0.18 % 348.29 38 0.19 % 243.76 S o z e r
dan dan dan Sometn dan 241 0.15 % 232.83 89 0.23 % 386.08 72 0.21 % 243.62 46 0.07 % 130.26 34 0.17 % 218.10 S o m e t n
način način način Sometn način 227 0.14 % 219.30 31 0.08 % 134.48 18 0.05 % 60.91 137 0.21 % 387.94 41 0.21 % 263 S o m e t n
način način način Somei način 220 0.14 % 212.54 32 0.08 % 138.81 16 0.05 % 54.14 113 0.17 % 319.98 59 0.30 % 378.46 S o m e i

2.3.13 List of verb lemmas in the GOS 1.0 corpus with text-type distribution
File at CLARIN.SI: GOS1.0-words-verbs-lemmas-taxonomy-entire.tsv
Lemma Lemma (lower-case) Lemma Lemma (lower-case) Part-of-speech category Standardized form Total absolute frequency of lemma Percentage of all found lemmas Total relative frequency (per million occurrences) Absolute frequency [gos.T.J.R] Percentage [gos.T.J.R] Relative frequency [gos.T.J.R] Absolute frequency [gos.T.N.Z] Percentage [gos.T.N.Z] Relative frequency [gos.T.N.Z] Absolute frequency [gos.T.J.I] Percentage [gos.T.J.I] Relative frequency [gos.T.J.I] Absolute frequency [gos.T.N.N] Percentage [gos.T.N.N] Relative frequency [gos.T.N.N]
biti biti biti biti G je 31,463 14.68 % 30,395.89 6,410 14.11 % 27,806.34 10,422 15.04 % 35,264.02 10,489 15.21 % 29,701.34 4,142 13.55 % 26,569.51
biti biti biti biti G so 7,976 3.72 % 7,705.48 1,528 3.36 % 6,628.41 2,364 3.41 % 7,998.86 3,012 4.37 % 8,528.98 1,072 3.51 % 6,876.51
biti biti biti biti G bi 7,466 3.48 % 7,212.78 1,387 3.05 % 6,016.75 1,982 2.86 % 6,706.32 2,746 3.98 % 7,775.75 1,351 4.42 % 8,666.20
biti biti biti biti G sem 6,222 2.90 % 6,010.97 1,191 2.62 % 5,166.51 2,952 4.26 % 9,988.43 1,188 1.72 % 3,364.02 891 2.92 % 5,715.46
biti biti biti biti G ni 5,191 2.42 % 5,014.94 930 2.05 % 4,034.30 1,870 2.70 % 6,327.36 1,567 2.27 % 4,437.22 824 2.70 % 5,285.68
biti biti biti biti G bo 5,064 2.36 % 4,892.25 1,301 2.86 % 5,643.69 1,258 1.81 % 4,256.59 1,777 2.58 % 5,031.87 728 2.38 % 4,669.87
biti biti biti biti G smo 3,981 1.86 % 3,845.98 899 1.98 % 3,899.83 1,148 1.66 % 3,884.39 1,544 2.24 % 4,372.09 390 1.28 % 2,501.72
vedeti vedeti vedeti vedeti G vem 3,558 1.66 % 3,437.33 509 1.12 % 2,208.02 1,885 2.72 % 6,378.11 502 0.73 % 1,421.50 662 2.17 % 4,246.50
biti biti biti biti G bilo 3,233 1.51 % 3,123.35 572 1.26 % 2,481.31 1,400 2.02 % 4,737.06 850 1.23 % 2,406.92 411 1.34 % 2,636.42
biti biti biti biti G si 3,196 1.49 % 3,087.60 903 1.99 % 3,917.18 1,349 1.95 % 4,564.50 545 0.79 % 1,543.26 399 1.30 % 2,559.45
vedeti vedeti vedeti vedeti G veš 2,896 1.35 % 2,797.78 499 1.10 % 2,164.64 1,799 2.60 % 6,087.12 98 0.14 % 277.50 500 1.64 % 3,207.33
biti
biti biti biti G bil 2,201 1.03 % 2,126.35 550 1.21 % 2,385.88 903 1.30 % 3,055.40 621 0.90 % 1,758.46 127 0.41 % 814.66 biti biti biti biti G bila 2,143 1.00 % 2,070.32 508 1.12 % 2,203.68 858 1.24 % 2,903.14 581 0.84 % 1,645.20 196 0.64 % 1,257.27 biti biti biti biti G bomo 2,018 0.94 % 1,949.56 544 1.20 % 2,359.85 340 0.49 % 1,150.43 839 1.22 % 2,375.77 295 0.96 % 1,892.32 biti biti biti biti G bom 1,975 0.92 % 1,908.02 330 0.73 % 1,431.53 875 1.26 % 2,960.66 379 0.55 % 1,073.20 391 1.28 % 2,508.13 misliti misliti misliti misliti G mislim 1,862 0.87 % 1,798.85 313 0.69 % 1,357.78 632 0.91 % 2,138.44 475 0.69 % 1,345.04 442 1.45 % 2,835.28 imeti imeti imeti imeti G ima 1,655 0.77 % 1,598.87 275 0.60 % 1,192.94 614 0.89 % 2,077.54 498 0.72 % 1,410.17 268 0.88 % 1,719.13 biti biti biti biti G ste 1,629 0.76 % 1,573.75 481 1.06 % 2,086.56 188 0.27 % 636.12 717 1.04 % 2,030.30 243 0.80 % 1,558.76 reči reči reči reči G rekel 1,278 0.60 % 1,234.65 213 0.47 % 923.99 575 0.83 % 1,945.58 328 0.47 % 928.79 162 0.53 % 1,039.17 dati dati dati dati G da 1,268 0.59 % 1,224.99 197 0.43 % 854.58 454 0.66 % 1,536.16 407 0.59 % 1,152.49 210 0.69 % 1,347.08 imeti imeti imeti imeti G imamo 1,172 0.55 % 1,132.25 243 0.54 % 1,054.12 173 0.25 % 585.37 587 0.85 % 1,662.19 169 0.55 % 1,084.08 biti biti biti biti G boš 1,151 0.54 % 1,111.96 238 0.52 % 1,032.43 518 0.75 % 1,752.71 197 0.29 % 557.84 198 0.65 % 1,270.10 iti iti iti iti G gre 1,122 0.52 % 1,083.95 200 0.44 % 867.59 309 0.45 % 1,045.54 457 0.66 % 1,294.07 156 0.51 % 1,000.69 imeti imeti imeti imeti G imaš 1,058 0.49 % 1,022.12 220 0.48 % 954.35 535 0.77 % 1,810.23 138 0.20 % 390.77 165 0.54 % 1,058.42 imeti imeti imeti imeti G imajo 942 0.44 % 910.05 129 0.28 % 559.60 387 0.56 % 1,309.46 283 0.41 % 801.36 143 0.47 % 917.30 meti meti meti meti G ma 939 0.44 % 907.15 162 0.36 % 702.75 607 0.88 % 2,053.85 77 0.11 % 218.04 93 0.30 % 596.56 biti biti biti biti G bili 932 0.43 % 900.39 245 0.54 % 1,062.80 309 0.45 % 1,045.54 310 
0.45 % 877.82 68 0.22 % 436.20 biti biti biti biti G nisem 915 0.43 % 883.97 172 0.38 % 746.13 449 0.65 % 1,519.24 155 0.23 % 438.91 139 0.46 % 891.64 imeti imeti imeti imeti G imeli 915 0.43 % 883.97 164 0.36 % 711.43 337 0.49 % 1,140.28 287 0.42 % 812.69 127 0.41 % 814.66 reči reči reči reči G recimo 893 0.42 % 862.71 172 0.38 % 746.13 146 0.21 % 494.01 375 0.54 % 1,061.87 200 0.65 % 1,282.93 reči reči reči reči G rekla 882 0.41 % 852.09 101 0.22 % 438.13 509 0.73 % 1,722.26 125 0.18 % 353.96 147 0.48 % 942.95 biti biti biti biti G sta 821 0.38 % 793.15 218 0.48 % 945.68 261 0.38 % 883.12 251 0.36 % 710.75 91 0.30 % 583.73 CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 748 File at CLARIN.SI2.3.14 List of verb lower-case word forms in the GOS 1.0 corpus with standardized word forms, lemmas, morphosyntactic tags and text-type distributionGOS1.0-words-verbs-lowercase_forms-standardized_forms- lemmas-morphosyntactic_tags-taxonomy-entire.tsvLower-case word form Lemma Lemma (lower-case) Morphosyntactic tag Standardized form Total absolute frequency of lower-case word form Percentage of all found lower-case word forms Total relative frequency (per million occurrences) Absolute frequency [gos.T.J.R] Percentage [gos.T.J.R] Relative frequency [gos.T.J.R] Absolute frequency [gos.T.N.Z] Percentage [gos.T.N.Z] Relative frequency [gos.T.N.Z] Absolute frequency [gos.T.J.I] Percentage [gos.T.J.I] Relative frequency [gos.T.J.I] Absolute frequency [gos.T.N.N] Percentage [gos.T.N.N] Relative frequency [gos.T.N.N] msd01 msd02 msd03 msd04 msd05 msd06 msd07 msd08 je biti biti Gp-ste-n je 30,945 14.44 % 29,895.46 6,372 14.02 % 27,641.49 9,993 14.42 % 33,812.45 10,457 15.16 % 29,610.73 4,123 13.49 % 26,447.63 G p - s t e - n so biti biti Gp-stm-n so 7,856 3.67 % 7,589.55 1,514 3.33 % 6,567.67 2,276 3.28 % 7,701.11 2,995 4.34 % 8,480.84 1,071 3.50 % 6,870.10 G p - s t m - n bi biti biti Gp-g bi 6,988 3.26 % 6,750.99 1,318 2.90 % 5,717.43 1,682 2.43 % 
5,691.24 2,685 3.89 % 7,603.02 1,303 4.26 % 8,358.30 G p - g ni biti biti Gp-ste-d ni 4,775 2.23 % 4,613.05 882 1.94 % 3,826.08 1,507 2.17 % 5,099.11 1,565 2.27 % 4,431.56 821 2.69 % 5,266.43 G p - s t e - d bo biti biti Gp-pte-n bo 4,751 2.22 % 4,589.86 1,218 2.68 % 5,283.64 1,065 1.54 % 3,603.55 1,761 2.55 % 4,986.56 707 2.31 % 4,535.16 G p - p t e - n sem biti biti Gp-spe-n sem 4,224 1.97 % 4,080.74 768 1.69 % 3,331.55 1,720 2.48 % 5,819.82 1,052 1.52 % 2,978.91 684 2.24 % 4,387.62 G p - s p e - n smo biti biti Gp-spm-n smo 3,917 1.83 % 3,784.15 881 1.94 % 3,821.74 1,104 1.59 % 3,735.51 1,543 2.24 % 4,369.26 389 1.27 % 2,495.30 G p - s p m - n vem vedeti vedeti Ggnspe vem 3,296 1.54 % 3,184.21 467 1.03 % 2,025.83 1,694 2.44 % 5,731.84 500 0.72 % 1,415.83 635 2.08 % 4,073.31 G g n s p e si biti biti Gp-sde-n si 2,898 1.35 % 2,799.71 849 1.87 % 3,682.93 1,139 1.64 % 3,853.94 528 0.77 % 1,495.12 382 1.25 % 2,450.40 G p - s d e - n veš vedeti vedeti Ggnsde veš 2,611 1.22 % 2,522.44 485 1.07 % 2,103.91 1,534 2.21 % 5,190.46 97 0.14 % 274.67 495 1.62 % 3,175.25 G g n s d e blo biti biti Gp-d-es bilo 2,262 1.05 % 2,185.28 368 0.81 % 1,596.37 1,011 1.46 % 3,420.83 499 0.72 % 1,413 384 1.26 % 2,463.23 G p - d - e s bil biti biti Gp-d-em bil 2,035 0.95 % 1,965.98 505 1.11 % 2,190.67 793 1.14 % 2,683.21 613 0.89 % 1,735.81 124 0.41 % 795.42 G p - d - e m bomo biti biti Gp-ppm-n bomo 1,818 0.85 % 1,756.34 486 1.07 % 2,108.25 261 0.38 % 883.12 806 1.17 % 2,282.32 265 0.87 % 1,699.88 G p - p p m - n ste biti biti Gp-sdm-n ste 1,621 0.76 % 1,566.02 480 1.06 % 2,082.22 182 0.26 % 615.82 717 1.04 % 2,030.30 242 0.79 % 1,552.35 G p - s d m - n bom biti biti Gp-ppe-n bom 1,575 0.73 % 1,521.58 263 0.58 % 1,140.88 605 0.87 % 2,047.09 360 0.52 % 1,019.40 347 1.14 % 2,225.89 G p - p p e - n bla biti biti Gp-d-ez bila 1,455 0.68 % 1,405.65 273 0.60 % 1,184.26 700 1.01 % 2,368.53 303 0.44 % 857.99 179 0.59 % 1,148.22 G p - d - e z sn biti biti Gp-spe-n sem 1,348 0.63 % 1,302.28 222 0.49 
% 963.03 875 1.26 % 2,960.66 72 0.10 % 203.88 179 0.59 % 1,148.22 G p - s p e - n da dati dati Ggdste da 1,138 0.53 % 1,099.40 185 0.41 % 802.52 369 0.53 % 1,248.55 393 0.57 % 1,112.84 191 0.62 % 1,225.20 G g d s t e gre iti iti Ggvste gre 1,098 0.51 % 1,060.76 200 0.44 % 867.59 287 0.41 % 971.10 456 0.66 % 1,291.24 155 0.51 % 994.27 G g v s t e ma imeti imeti Ggnste-n ima 1,079 0.50 % 1,042.40 157 0.34 % 681.06 525 0.76 % 1,776.40 176 0.26 % 498.37 221 0.72 % 1,417.64 G g n s t e - n boš biti biti Gp-pde-n boš 951 0.44 % 918.75 207 0.46 % 897.96 381 0.55 % 1,289.16 184 0.27 % 521.03 179 0.59 % 1,148.22 G p - p d e - n maš imeti imeti Ggnsde-n imaš 929 0.43 % 897.49 206 0.45 % 893.62 479 0.69 % 1,620.75 91 0.13 % 257.68 153 0.50 % 981.44 G g n s d e - n ma meti meti Ggvste ma 927 0.43 % 895.56 161 0.35 % 698.41 597 0.86 % 2,020.02 76 0.11 % 215.21 93 0.30 % 596.56 G g v s t e reku reči reči Ggdd-em rekel 896 0.42 % 865.61 143 0.32 % 620.33 380 0.55 % 1,285.77 246 0.36 % 696.59 127 0.41 % 814.66 G g d d - e m recimo reči reči Ggdvpm recimo 838 0.39 % 809.58 166 0.36 % 720.10 126 0.18 % 426.34 351 0.51 % 993.91 195 0.64 % 1,250.86 G g d v p m rekla reči reči Ggdd-ez rekla 759 0.35 % 733.26 94 0.21 % 407.77 404 0.58 % 1,366.98 121 0.17 % 342.63 140 0.46 % 898.05 G g d d - e z sta biti biti Gp-std-n sta 728 0.34 % 703.31 197 0.43 % 854.58 235 0.34 % 795.15 220 0.32 % 622.97 76 0.25 % 487.51 G p - s t d - n je jesti jesti Ggnste je 716 0.33 % 691.72 126 0.28 % 546.58 230 0.33 % 778.23 254 0.37 % 719.24 106 0.35 % 679.95 G g n s t e mamo imeti imeti Ggnspm-n imamo 672 0.31 % 649.21 158 0.35 % 685.40 138 0.20 % 466.94 233 0.34 % 659.78 143 0.47 % 917.30 G g n s p m - n zdi zdeti zdeti Ggnste zdi 671 0.31 % 648.24 115 0.25 % 498.87 179 0.26 % 605.67 198 0.29 % 560.67 179 0.59 % 1,148.22 G g n s t e boste biti biti Gp-pdm-n boste 644 0.30 % 622.16 225 0.49 % 976.04 13 0.02 % 43.99 277 0.40 % 784.37 129 0.42 % 827.49 G p - p d m - n nisem biti biti Gp-spe-d nisem 622 0.29 % 
600.90 133 0.29 % 576.95 239 0.34 % 808.68 130 0.19 % 368.12 120 0.39 % 769.76 G p - s p e - d CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 749 File at CLARIN.SI2.3.15 List of adjective lemmas in the GOS 1.0 corpus with text-type distributionGOS1.0-words-adjectives-lemmas-taxonomy-entire.tsvLemma Lemma (lower-case) Lemma Lemma (lower-case) Part-of-speech category Standardized form Total absolute frequency of lemma Percentage of all found lemmas Total relative frequency (per million occurrences) Absolute frequency [gos.T.J.R] Percentage [gos.T.J.R] Relative frequency [gos.T.J.R] Absolute frequency [gos.T.N.Z] Percentage [gos.T.N.Z] Relative frequency [gos.T.N.Z] Absolute frequency [gos.T.J.I] Percentage [gos.T.J.I] Relative frequency [gos.T.J.I] Absolute frequency [gos.T.N.N] Percentage [gos.T.N.N] Relative frequency [gos.T.N.N] sam sam sam sam P samo 2,475 4.49 % 2,391.06 361 2.69 % 1,566 1,104 11.05 % 3,735.51 508 2.06 % 1,438.49 502 7.09 % 3,220.16 mali mali mali mali P malo 1,518 2.75 % 1,466.52 418 3.11 % 1,813.27 500 5.01 % 1,691.81 267 1.08 % 756.05 333 4.70 % 2,136.08 dober dober dober dober P dobro 1,485 2.69 % 1,434.63 687 5.12 % 2,980.18 283 2.83 % 957.56 315 1.28 % 891.97 200 2.82 % 1,282.93 pravi pravi pravi pravi P pravi 624 1.13 % 602.84 91 0.68 % 394.75 71 0.71 % 240.24 345 1.40 % 976.92 117 1.65 % 750.51 dober dober dober dober P dober 468 0.85 % 452.13 181 1.35 % 785.17 88 0.88 % 297.76 156 0.63 % 441.74 43 0.61 % 275.83 sam sam sam sam P sam 366 0.66 % 353.59 81 0.60 % 351.37 121 1.21 % 409.42 117 0.47 % 331.30 47 0.66 % 301.49 lep lep lep lep P lepa 344 0.62 % 332.33 122 0.91 % 529.23 36 0.36 % 121.81 153 0.62 % 433.24 33 0.47 % 211.68 cel cel cel cel P celo 310 0.56 % 299.49 75 0.56 % 325.35 97 0.97 % 328.21 93 0.38 % 263.34 45 0.64 % 288.66 glaven glaven glaven glaven P glavnem 280 0.51 % 270.50 55 0.41 % 238.59 148 1.48 % 500.77 39 0.16 % 110.43 38 0.54 % 243.76 lep lep lep lep P lep 239 0.43 % 230.89 120 
0.89 % 520.56 28 0.28 % 94.74 70 0.28 % 198.22 21 0.30 % 134.71 star star star star P stari 233 0.42 % 225.10 28 0.21 % 121.46 174 1.74 % 588.75 21 0.09 % 59.46 10 0.14 % 64.15 sam sam sam sam P sama 204 0.37 % 197.08 55 0.41 % 238.59 66 0.66 % 223.32 47 0.19 % 133.09 36 0.51 % 230.93 cel cel cel cel P cela 198 0.36 % 191.28 19 0.14 % 82.42 41 0.41 % 138.73 89 0.36 % 252.02 49 0.69 % 314.32 nov nov nov nov P novo 196 0.35 % 189.35 58 0.43 % 251.60 40 0.40 % 135.34 63 0.26 % 178.39 35 0.49 % 224.51 dober dober dober dober P boljše 194 0.35 % 187.42 36 0.27 % 156.17 59 0.59 % 199.63 58 0.23 % 164.24 41 0.58 % 263 velik velik velik velik P velika 170 0.31 % 164.23 33 0.25 % 143.15 47 0.47 % 159.03 67 0.27 % 189.72 23 0.33 % 147.54 jasen jasen jasen jasen P jasno 165 0.30 % 159.40 33 0.25 % 143.15 35 0.35 % 118.43 65 0.26 % 184.06 32 0.45 % 205.27 dober dober dober dober P dobra 162 0.29 % 156.51 69 0.51 % 299.32 38 0.38 % 128.58 35 0.14 % 99.11 20 0.28 % 128.29 cel cel cel cel P cel 160 0.29 % 154.57 36 0.27 % 156.17 85 0.85 % 287.61 21 0.09 % 59.46 18 0.25 % 115.46 sam sam sam sam P sami 148 0.27 % 142.98 25 0.19 % 108.45 31 0.31 % 104.89 70 0.28 % 198.22 22 0.31 % 141.12 naslednji naslednji naslednji naslednji P naslednji 144 0.26 % 139.12 48 0.36 % 208.22 36 0.36 % 121.81 37 0.15 % 104.77 23 0.33 % 147.54 velik velik velik velik P velik 134 0.24 % 129.46 30 0.22 % 130.14 23 0.23 % 77.82 70 0.28 % 198.22 11 0.15 % 70.56 velik velik velik velik P velike 128 0.23 % 123.66 23 0.17 % 99.77 33 0.33 % 111.66 55 0.22 % 155.74 17 0.24 % 109.05 slovenski slovenski slovenski slovenski P slovenski 126 0.23 % 121.73 36 0.27 % 156.17 11 0.11 % 37.22 74 0.30 % 209.54 5 0.07 % 32.07 zadnji zadnji zadnji zadnji P zadnjih 125 0.23 % 120.76 37 0.28 % 160.50 5 0.05 % 16.92 76 0.31 % 215.21 7 0.10 % 44.90 nov nov nov nov P nove 124 0.23 % 119.79 25 0.19 % 108.45 20 0.20 % 67.67 71 0.29 % 201.05 8 0.11 % 51.32 dolg dolg dolg dolg P dolgo 122 0.22 % 117.86 30 0.22 % 130.14 54 0.54 % 
182.72 27 0.11 % 76.45 11 0.15 % 70.56 zadnji zadnji zadnji zadnji P zadnji 120 0.22 % 115.93 32 0.24 % 138.81 33 0.33 % 111.66 49 0.20 % 138.75 6 0.09 % 38.49 nov nov nov nov P nov 117 0.21 % 113.03 65 0.48 % 281.97 12 0.12 % 40.60 31 0.13 % 87.78 9 0.13 % 57.73 nov nov nov nov P novega 106 0.19 % 102.40 30 0.22 % 130.14 24 0.24 % 81.21 41 0.17 % 116.10 11 0.15 % 70.56 fajn fajn fajn fajn P fajn 105 0.19 % 101.44 40 0.30 % 173.52 44 0.44 % 148.88 7 0.03 % 19.82 14 0.20 % 89.81 star star star star P stara 104 0.19 % 100.47 31 0.23 % 134.48 43 0.43 % 145.50 16 0.07 % 45.31 14 0.20 % 89.81 CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 750 File at CLARIN.SI2.3.16 List of adjective lower-case word forms in the GOS 1.0 corpus with standardized word forms, lemmas, morphosyntactic tags and text-type distributionGOS1.0-words-adjectives-lowercase_forms-standardized_ forms-lemmas-morphosyntactic_tags-taxonomy-entire.tsvLower-case word form Lemma Lemma (lower-case) Morphosyntactic tag Standardized form Total absolute frequency of lower-case word form Percentage of all found lower-case word forms Total relative frequency (per million occurrences) Absolute frequency [gos.T.J.R] Percentage [gos.T.J.R] Relative frequency [gos.T.J.R] Absolute frequency [gos.T.N.Z] Percentage [gos.T.N.Z] Relative frequency [gos.T.N.Z] Absolute frequency [gos.T.J.I] Percentage [gos.T.J.I] Relative frequency [gos.T.J.I] Absolute frequency [gos.T.N.N] Percentage [gos.T.N.N] Relative frequency [gos.T.N.N] msd01 msd02 msd03 msd04 msd05 msd06 msd07 samo sam sam Ppnsei samo 1,151 2.09 % 1,111.96 237 1.76 % 1,028.10 321 3.21 % 1,086.14 366 1.48 % 1,036.39 227 3.21 % 1,456.13 P p n s e i dobro dober dober Ppnsei dobro 1,032 1.87 % 997 522 3.89 % 2,264.42 123 1.23 % 416.18 255 1.03 % 722.07 132 1.86 % 846.73 P p n s e i sam sam sam Ppnsei samo 1,009 1.83 % 974.78 94 0.70 % 407.77 571 5.72 % 1,932.04 107 0.43 % 302.99 237 3.35 % 1,520.27 P p n s e i malo mali mali Ppnsei malo 
641 1.16 % 619.26 217 1.62 % 941.34 156 1.56 % 527.84 124 0.50 % 351.13 144 2.03 % 923.71 P p n s e i mal mali mali Ppnsei malo 620 1.12 % 598.97 170 1.27 % 737.45 183 1.83 % 619.20 115 0.47 % 325.64 152 2.15 % 975.03 P p n s e i dober dober dober Ppnmein dober 394 0.71 % 380.64 154 1.15 % 668.05 56 0.56 % 189.48 146 0.59 % 413.42 38 0.54 % 243.76 P p n m e i n sam sam sam Ppnmein sam 341 0.62 % 329.43 71 0.53 % 308 116 1.16 % 392.50 108 0.44 % 305.82 46 0.65 % 295.07 P p n m e i n pravi pravi pravi Ppnmeid pravi 336 0.61 % 324.60 48 0.36 % 208.22 18 0.18 % 60.91 203 0.82 % 574.83 67 0.95 % 429.78 P p n m e i d lepa lep lep Ppnzei lepa 333 0.60 % 321.71 117 0.87 % 507.54 30 0.30 % 101.51 153 0.62 % 433.24 33 0.47 % 211.68 P p n z e i dobr dober dober Ppnsei dobro 236 0.43 % 228 61 0.45 % 264.62 101 1.01 % 341.74 20 0.08 % 56.63 54 0.76 % 346.39 P p n s e i lep lep lep Ppnmein lep 217 0.39 % 209.64 110 0.82 % 477.18 21 0.21 % 71.06 67 0.27 % 189.72 19 0.27 % 121.88 P p n m e i n stari star star Ppnmeid stari 197 0.36 % 190.32 20 0.15 % 86.76 159 1.59 % 537.99 12 0.05 % 33.98 6 0.09 % 38.49 P p n m e i d sama sam sam Ppnzei sama 190 0.34 % 183.56 50 0.37 % 216.90 62 0.62 % 209.78 46 0.19 % 130.26 32 0.45 % 205.27 P p n z e i glavnem glaven glaven Ppnmem glavnem 175 0.32 % 169.06 27 0.20 % 117.12 93 0.93 % 314.68 25 0.10 % 70.79 30 0.42 % 192.44 P p n m e m celo cel cel Ppnsei celo 171 0.31 % 165.20 47 0.35 % 203.88 35 0.35 % 118.43 61 0.25 % 172.73 28 0.40 % 179.61 P p n s e i cela cel cel Ppnzei cela 169 0.31 % 163.27 19 0.14 % 82.42 33 0.33 % 111.66 71 0.29 % 201.05 46 0.65 % 295.07 P p n z e i dobra dober dober Ppnzei dobra 139 0.25 % 134.29 63 0.47 % 273.29 28 0.28 % 94.74 28 0.11 % 79.29 20 0.28 % 128.29 P p n z e i mav mali mali Ppnsei malo 137 0.25 % 132.35 6 0.04 % 26.03 109 1.09 % 368.81 4 0.02 % 11.33 18 0.25 % 115.46 P p n s e i jasno jasen jasen Ppnsei jasno 132 0.24 % 127.52 32 0.24 % 138.81 15 0.15 % 50.75 56 0.23 % 158.57 29 0.41 % 186.03 P p n s e i 
prav | pravi | pravi | Ppnmeid | pravi | 123 | 0.22 % | 118.83 | 7 | 0.05 % | 30.37 | 34 | 0.34 % | 115.04 | 54 | 0.22 % | 152.91 | 28 | 0.40 % | 179.61 | P | p | n | m | e | i | d
cel | cel | cel | Ppnmein | cel | 117 | 0.21 % | 113.03 | 26 | 0.19 % | 112.79 | 63 | 0.63 % | 213.17 | 16 | 0.07 % | 45.31 | 12 | 0.17 % | 76.98 | P | p | n | m | e | i | n
sami | sam | sam | Ppnzed | sami | 109 | 0.20 % | 105.30 | 13 | 0.10 % | 56.39 | 23 | 0.23 % | 77.82 | 51 | 0.21 % | 144.41 | 22 | 0.31 % | 141.12 | P | p | n | z | e | d
stara | star | star | Ppnzei | stara | 99 | 0.18 % | 95.64 | 29 | 0.22 % | 125.80 | 41 | 0.41 % | 138.73 | 16 | 0.07 % | 45.31 | 13 | 0.18 % | 83.39 | P | p | n | z | e | i
velika | velik | velik | Ppnzei | velika | 97 | 0.18 % | 93.71 | 23 | 0.17 % | 99.77 | 13 | 0.13 % | 43.99 | 48 | 0.19 % | 135.92 | 13 | 0.18 % | 83.39 | P | p | n | z | e | i
nov | nov | nov | Ppnmein | nov | 95 | 0.17 % | 91.78 | 56 | 0.42 % | 242.93 | 11 | 0.11 % | 37.22 | 21 | 0.09 % | 59.46 | 7 | 0.10 % | 44.90 | P | p | n | m | e | i | n
velik | velik | velik | Ppnmein | velik | 91 | 0.17 % | 87.91 | 22 | 0.16 % | 95.44 | 8 | 0.08 % | 27.07 | 57 | 0.23 % | 161.40 | 4 | 0.06 % | 25.66 | P | p | n | m | e | i | n
nova | nov | nov | Ppnzei | nova | 89 | 0.16 % | 85.98 | 28 | 0.21 % | 121.46 | 11 | 0.11 % | 37.22 | 38 | 0.15 % | 107.60 | 12 | 0.17 % | 76.98 | P | p | n | z | e | i
glavnem | glaven | glaven | Ppnsem | glavnem | 86 | 0.16 % | 83.08 | 18 | 0.13 % | 78.08 | 48 | 0.48 % | 162.41 | 12 | 0.05 % | 33.98 | 8 | 0.11 % | 51.32 | P | p | n | s | e | m
novo | nov | nov | Ppnzet | novo | 77 | 0.14 % | 74.39 | 23 | 0.17 % | 99.77 | 9 | 0.09 % | 30.45 | 29 | 0.12 % | 82.12 | 16 | 0.23 % | 102.63 | P | p | n | z | e | t
nove | nov | nov | Ppnzer | nove | 75 | 0.14 % | 72.46 | 17 | 0.13 % | 73.75 | 9 | 0.09 % | 30.45 | 45 | 0.18 % | 127.42 | 4 | 0.06 % | 25.66 | P | p | n | z | e | r
star | star | star | Ppnmein | star | 75 | 0.14 % | 72.46 | 26 | 0.19 % | 112.79 | 35 | 0.35 % | 118.43 | 9 | 0.04 % | 25.48 | 5 | 0.07 % | 32.07 | P | p | n | m | e | i | n
celo | cel | cel | Ppnzet | celo | 73 | 0.13 % | 70.52 | 20 | 0.15 % | 86.76 | 26 | 0.26 % | 87.97 | 16 | 0.07 % | 45.31 | 11 | 0.15 % | 70.56 | P | p | n | z | e | t

2.3.17 List of adverb lemmas in the GOS 1.0 corpus with text-type distribution
File at CLARIN.SI: GOS1.0-words-adverbs-lemmas-taxonomy-entire.tsv

Lemma | Lemma (lower-case) | Lemma | Lemma (lower-case) | Part-of-speech category | Standardized form | Total absolute frequency of lemma | Percentage of all found lemmas | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N]
tako | tako | tako | tako | R | tako | 9,713 | 10.29 % | 9,383.57 | 1,925 | 9.09 % | 8,350.58 | 3,661 | 12.84 % | 12,387.41 | 2,296 | 7.90 % | 6,501.51 | 1,831 | 11.68 % | 11,745.24
zdaj | zdaj | zdaj | zdaj | R | zdaj | 6,356 | 6.73 % | 6,140.43 | 1,457 | 6.88 % | 6,320.41 | 1,737 | 6.09 % | 5,877.34 | 1,764 | 6.07 % | 4,995.06 | 1,398 | 8.92 % | 8,967.69
lahko | lahko | lahko | lahko | R | lahko | 4,000 | 4.24 % | 3,864.33 | 901 | 4.26 % | 3,908.50 | 885 | 3.10 % | 2,994.50 | 1,437 | 4.94 % | 4,069.10 | 777 | 4.96 % | 4,984.19
potem | potem | potem | potem | R | potem | 3,149 | 3.33 % | 3,042.20 | 658 | 3.11 % | 2,854.38 | 631 | 2.21 % | 2,135.06 | 1,152 | 3.96 % | 3,262.08 | 708 | 4.52 % | 4,541.58
tam | tam | tam | tam | R | tam | 2,598 | 2.75 % | 2,509.89 | 426 | 2.01 % | 1,847.97 | 1,289 | 4.52 % | 4,361.48 | 503 | 1.73 % | 1,424.33 | 380 | 2.42 % | 2,437.57
zelo | zelo | zelo | zelo | R | zelo | 1,835 | 1.94 % | 1,772.76 | 473 | 2.23 % | 2,051.86 | 182 | 0.64 % | 615.82 | 896 | 3.08 % | 2,537.17 | 284 | 1.81 % | 1,821.76
bolj | bolj | bolj | bolj | R | bolj | 1,648 | 1.75 % | 1,592.11 | 338 | 1.60 % | 1,466.23 | 557 | 1.95 % | 1,884.67 | 479 | 1.65 % | 1,356.37 | 274 | 1.75 % | 1,757.62
danes | danes | danes | danes | R | danes | 1,414 | 1.50 % | 1,366.04 | 634 | 3.00 % | 2,750.27 | 264 | 0.93 % | 893.27 | 439 | 1.51 % | 1,243.10 | 77 | 0.49 % | 493.93
tu | tu | tu | tu | R | tu | 1,406 | 1.49 % | 1,358.31 | 384 | 1.81 % | 1,665.78 | 355 | 1.25 % | 1,201.18 | 343 | 1.18 % | 971.26 | 324 | 2.07 % | 2,078.35
sem | sem | sem | sem | R | sem | 1,307 | 1.38 % | 1,262.67 | 232 | 1.10 % | 1,006.41 | 680 | 2.38 % | 2,300.86 | 237 | 0.81 % | 671.10 | 158 | 1.01 % | 1,013.52
dobro | dobro | dobro | dobro | R | dobro | 1,147 | 1.22 % | 1,108.10 | 279 | 1.32 % | 1,210.29 | 205 | 0.72 % | 693.64 | 455 | 1.56 % | 1,288.41 | 208 | 1.33 % | 1,334.25
tukaj | tukaj | tukaj | tukaj | R | tukaj | 1,135 | 1.20 % | 1,096.50 | 133 | 0.63 % | 576.95 | 164 | 0.57 % | 554.91 | 597 | 2.05 % | 1,690.50 | 241 | 1.54 % | 1,545.93
malo | malo | malo | malo | R | malo | 1,131 | 1.20 % | 1,092.64 | 308 | 1.46 % | 1,336.09 | 364 | 1.28 % | 1,231.64 | 222 | 0.76 % | 628.63 | 237 | 1.51 % | 1,520.27
kje | kje | kje | kje | R | kje | 1,039 | 1.10 % | 1,003.76 | 224 | 1.06 % | 971.70 | 415 | 1.46 % | 1,404.20 | 279 | 0.96 % | 790.03 | 121 | 0.77 % | 776.17
zato | zato | zato | zato | R | zato | 992 | 1.05 % | 958.36 | 202 | 0.95 % | 876.27 | 247 | 0.87 % | 835.75 | 394 | 1.36 % | 1,115.68 | 149 | 0.95 % | 955.78
čisto | čisto | čisto | čisto | R | čisto | 974 | 1.03 % | 940.97 | 132 | 0.62 % | 572.61 | 402 | 1.41 % | 1,360.21 | 231 | 0.80 % | 654.11 | 209 | 1.33 % | 1,340.66
prav | prav | prav | prav | R | prav | 971 | 1.03 % | 938.07 | 214 | 1.01 % | 928.32 | 308 | 1.08 % | 1,042.15 | 317 | 1.09 % | 897.64 | 132 | 0.84 % | 846.73
naprej | naprej | naprej | naprej | R | naprej | 922 | 0.98 % | 890.73 | 208 | 0.98 % | 902.30 | 134 | 0.47 % | 453.40 | 446 | 1.53 % | 1,262.92 | 134 | 0.85 % | 859.56
ful | ful | ful | ful | R | ful | 916 | 0.97 % | 884.93 | 52 | 0.25 % | 225.57 | 697 | 2.44 % | 2,358.38 | 36 | 0.12 % | 101.94 | 131 | 0.84 % | 840.32
mogoče | mogoče | mogoče | mogoče | R | mogoče | 872 | 0.92 % | 842.42 | 207 | 0.98 % | 897.96 | 163 | 0.57 % | 551.53 | 242 | 0.83 % | 685.26 | 260 | 1.66 % | 1,667.81
takrat | takrat | takrat | takrat | R | takrat | 840 | 0.89 % | 811.51 | 169 | 0.80 % | 733.12 | 235 | 0.82 % | 795.15 | 313 | 1.08 % | 886.31 | 123 | 0.79 % | 789
prej | prej | prej | prej | R | prej | 832 | 0.88 % | 803.78 | 169 | 0.80 % | 733.12 | 263 | 0.92 % | 889.89 | 295 | 1.01 % | 835.34 | 105 | 0.67 % | 673.54
treba | treba | treba | treba | R | treba | 827 | 0.88 % | 798.95 | 164 | 0.78 % | 711.43 | 164 | 0.57 % | 554.91 | 331 | 1.14 % | 937.28 | 168 | 1.07 % | 1,077.66
gor | gor | gor | gor | R | gor | 824 | 0.87 % | 796.05 | 129 | 0.61 % | 559.60 | 527 | 1.85 % | 1,783.16 | 64 | 0.22 % | 181.23 | 104 | 0.66 % | 667.12
enkrat | enkrat | enkrat | enkrat | R | enkrat | 803 | 0.85 % | 775.77 | 184 | 0.87 % | 798.18 | 273 | 0.96 % | 923.73 | 238 | 0.82 % | 673.94 | 108 | 0.69 % | 692.78
kako | kako | kako | kako | R | kako | 803 | 0.85 % | 775.77 | 175 | 0.83 % | 759.14 | 254 | 0.89 % | 859.44 | 258 | 0.89 % | 730.57 | 116 | 0.74 % | 744.10
tule | tule | tule | tule | R | tule | 756 | 0.80 % | 730.36 | 126 | 0.59 % | 546.58 | 305 | 1.07 % | 1,032 | 128 | 0.44 % | 362.45 | 197 | 1.26 % | 1,263.69
lepo | lepo | lepo | lepo | R | lepo | 743 | 0.79 % | 717.80 | 240 | 1.13 % | 1,041.11 | 245 | 0.86 % | 828.99 | 190 | 0.65 % | 538.02 | 68 | 0.43 % | 436.20
vedno | vedno | vedno | vedno | R | vedno | 733 | 0.78 % | 708.14 | 230 | 1.09 % | 997.73 | 106 | 0.37 % | 358.66 | 291 | 1.00 % | 824.01 | 106 | 0.68 % | 679.95
veliko | veliko | veliko | veliko | R | veliko | 720 | 0.76 % | 695.58 | 182 | 0.86 % | 789.51 | 123 | 0.43 % | 416.18 | 369 | 1.27 % | 1,044.88 | 46 | 0.29 % | 295.07
skupaj | skupaj | skupaj | skupaj | R | skupaj | 713 | 0.76 % | 688.82 | 161 | 0.76 % | 698.41 | 220 | 0.77 % | 744.40 | 222 | 0.76 % | 628.63 | 110 | 0.70 % | 705.61
kdaj | kdaj | kdaj | kdaj | R | kdaj | 701 | 0.74 % | 677.22 | 177 | 0.84 % | 767.82 | 253 | 0.89 % | 856.05 | 182 | 0.63 % | 515.36 | 89 | 0.57 % | 570.90

2.3.18 List of adverb lower-case word forms in the GOS 1.0 corpus with standardized word forms, lemmas, morphosyntactic tags and text-type distribution
File at CLARIN.SI: GOS1.0-words-adverbs-lowercase_forms-standardized_forms-lemmas-morphosyntactic_tags-taxonomy-entire.tsv

Lower-case word form | Lemma | Lemma (lower-case) | Morphosyntactic tag | Standardized form | Total absolute frequency of lower-case word form | Percentage of all found lower-case word forms | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N] | msd01 | msd02 | msd03
tako | tako | tako | Rsn | tako | 3,345 | 3.54 % | 3,231.55 | 828 | 3.91 % | 3,591.83 | 360 | 1.26 % | 1,218.10 | 1,611 | 5.54 % | 4,561.81 | 546 | 3.48 % | 3,502.40 | R | s | n
zdej | zdaj | zdaj | Rsn | zdaj | 3,036 | 3.21 % | 2,933.03 | 515 | 2.43 % | 2,234.05 | 839 | 2.94 % | 2,838.85 | 1,118 | 3.85 % | 3,165.80 | 564 | 3.60 % | 3,617.87 | R | s | n
lahko | lahko | lahko | Rsn | lahko | 2,884 | 3.05 % | 2,786.19 | 678 | 3.20 % | 2,941.14 | 335 | 1.18 % | 1,133.51 | 1,267 | 4.36 % | 3,587.72 | 604 | 3.85 % | 3,874.45 | R | s | n
tko | tako | tako | Rsn | tako | 2,665 | 2.82 % | 2,574.61 | 435 | 2.06 % | 1,887.01 | 1,027 | 3.60 % | 3,474.97 | 472 | 1.62 % | 1,336.55 | 731 | 4.66 % | 4,689.11 | R | s | n
tak | tako | tako | Rsn | tako | 2,543 | 2.69 % | 2,456.75 | 493 | 2.33 % | 2,138.62 | 1,452 | 5.09 % | 4,913.01 | 132 | 0.45 % | 373.78 | 466 | 2.97 % | 2,989.23 | R | s | n
potem | potem | potem | Rsn | potem | 2,473 | 2.62 % | 2,389.12 | 466 | 2.20 % | 2,021.49 | 212 | 0.74 % | 717.33 | 1,145 | 3.94 % | 3,242.26 | 650 | 4.15 % | 4,169.53 | R | s | n
tam | tam | tam | Rsn | tam | 1,817 | 1.92 % | 1,755.37 | 257 | 1.21 % | 1,114.86 | 873 | 3.06 % | 2,953.89 | 384 | 1.32 % | 1,087.36 | 303 | 1.93 % | 1,943.64 | R | s | n
tu | tu | tu | Rsn | tu | 1,384 | 1.47 % | 1,337.06 | 377 | 1.78 % | 1,635.41 | 341 | 1.20 % | 1,153.81 | 343 | 1.18 % | 971.26 | 323 | 2.06 % | 2,071.93 | R | s | n
zdaj | zdaj | zdaj | Rsn | zdaj | 1,195 | 1.27 % | 1,154.47 | 427 | 2.02 % | 1,852.31 | 207 | 0.73 % | 700.41 | 357 | 1.23 % | 1,010.90 | 204 | 1.30 % | 1,308.59 | R | s | n
zelo | zelo | zelo | Rsn | zelo | 1,127 | 1.19 % | 1,088.78 | 293 | 1.38 % | 1,271.02 | 66 | 0.23 % | 223.32 | 663 | 2.28 % | 1,877.39 | 105 | 0.67 % | 673.54 | R | s | n
bolj | bolj | bolj | Rsn | bolj | 1,013 | 1.07 % | 978.64 | 236 | 1.11 % | 1,023.76 | 227 | 0.80 % | 768.08 | 373 | 1.28 % | 1,056.21 | 177 | 1.13 % | 1,135.39 | R | s | n
danes | danes | danes | Rsn | danes | 929 | 0.98 % | 897.49 | 537 | 2.54 % | 2,329.49 | 21 | 0.07 % | 71.06 | 333 | 1.15 % | 942.94 | 38 | 0.24 % | 243.76 | R | s | n
ful | ful | ful | Rsn | ful | 914 | 0.97 % | 883 | 52 | 0.25 % | 225.57 | 695 | 2.44 % | 2,351.61 | 36 | 0.12 % | 101.94 | 131 | 0.84 % | 840.32 | R | s | n
naprej | naprej | naprej | Rsn | naprej | 909 | 0.96 % | 878.17 | 208 | 0.98 % | 902.30 | 125 | 0.44 % | 422.95 | 443 | 1.52 % | 1,254.43 | 133 | 0.85 % | 853.15 | R | s | n
sem | sem | sem | Rsn | sem | 898 | 0.95 % | 867.54 | 144 | 0.68 % | 624.67 | 420 | 1.47 % | 1,421.12 | 208 | 0.72 % | 588.99 | 126 | 0.80 % | 808.25 | R | s | n
dobro | dobro | dobro | Rsn | dobro | 885 | 0.94 % | 854.98 | 208 | 0.98 % | 902.30 | 97 | 0.34 % | 328.21 | 423 | 1.46 % | 1,197.79 | 157 | 1.00 % | 1,007.10 | R | s | n
mogoče | mogoče | mogoče | Rsn | mogoče | 815 | 0.86 % | 787.36 | 190 | 0.90 % | 824.21 | 136 | 0.48 % | 460.17 | 238 | 0.82 % | 673.94 | 251 | 1.60 % | 1,610.08 | R | s | n
zaj | zdaj | zdaj | Rsn | zdaj | 808 | 0.86 % | 780.60 | 205 | 0.97 % | 889.28 | 341 | 1.20 % | 1,153.81 | 49 | 0.17 % | 138.75 | 213 | 1.36 % | 1,366.32 | R | s | n
prej | prej | prej | Rsn | prej | 804 | 0.85 % | 776.73 | 165 | 0.78 % | 715.76 | 240 | 0.84 % | 812.07 | 295 | 1.01 % | 835.34 | 104 | 0.66 % | 667.12 | R | s | n
zato | zato | zato | Rsn | zato | 799 | 0.85 % | 771.90 | 144 | 0.68 % | 624.67 | 145 | 0.51 % | 490.62 | 374 | 1.29 % | 1,059.04 | 136 | 0.87 % | 872.39 | R | s | n
zej | zdaj | zdaj | Rsn | zdaj | 773 | 0.82 % | 746.78 | 139 | 0.66 % | 602.98 | 191 | 0.67 % | 646.27 | 198 | 0.68 % | 560.67 | 245 | 1.56 % | 1,571.59 | R | s | n
kje | kje | kje | Rsn | kje | 757 | 0.80 % | 731.33 | 138 | 0.65 % | 598.64 | 245 | 0.86 % | 828.99 | 270 | 0.93 % | 764.55 | 104 | 0.66 % | 667.12 | R | s | n
treba | treba | treba | Rsn | treba | 750 | 0.79 % | 724.56 | 151 | 0.71 % | 655.03 | 137 | 0.48 % | 463.56 | 313 | 1.08 % | 886.31 | 149 | 0.95 % | 955.78 | R | s | n
vedno | vedno | vedno | Rsn | vedno | 723 | 0.77 % | 698.48 | 229 | 1.08 % | 993.39 | 100 | 0.35 % | 338.36 | 288 | 0.99 % | 815.52 | 106 | 0.68 % | 679.95 | R | s | n
zlo | zelo | zelo | Rsn | zelo | 688 | 0.73 % | 664.67 | 178 | 0.84 % | 772.16 | 101 | 0.35 % | 341.74 | 230 | 0.79 % | 651.28 | 179 | 1.14 % | 1,148.22 | R | s | n
lepo | lepo | lepo | Rsn | lepo | 650 | 0.69 % | 627.95 | 230 | 1.09 % | 997.73 | 174 | 0.61 % | 588.75 | 182 | 0.63 % | 515.36 | 64 | 0.41 % | 410.54 | R | s | n
spet | spet | spet | Rsn | spet | 640 | 0.68 % | 618.29 | 145 | 0.69 % | 629 | 218 | 0.76 % | 737.63 | 186 | 0.64 % | 526.69 | 91 | 0.58 % | 583.73 | R | s | n
tm | tam | tam | Rsn | tam | 613 | 0.65 % | 592.21 | 79 | 0.37 % | 342.70 | 343 | 1.20 % | 1,160.58 | 118 | 0.41 % | 334.14 | 73 | 0.47 % | 468.27 | R | s | n
gor | gor | gor | Rsn | gor | 601 | 0.64 % | 580.62 | 112 | 0.53 % | 485.85 | 326 | 1.14 % | 1,103.06 | 62 | 0.21 % | 175.56 | 101 | 0.64 % | 647.88 | R | s | n
enkrat | enkrat | enkrat | Rsn | enkrat | 595 | 0.63 % | 574.82 | 142 | 0.67 % | 615.99 | 168 | 0.59 % | 568.45 | 195 | 0.67 % | 552.17 | 90 | 0.57 % | 577.32 | R | s | n
te | potem | potem | Rsn | potem | 593 | 0.63 % | 572.89 | 173 | 0.82 % | 750.47 | 360 | 1.26 % | 1,218.10 | 5 | 0.02 % | 14.16 | 55 | 0.35 % | 352.81 | R | s | n
čist | čisto | čisto | Rsn | čisto | 585 | 0.62 % | 565.16 | 59 | 0.28 % | 255.94 | 265 | 0.93 % | 896.66 | 121 | 0.42 % | 342.63 | 140 | 0.89 % | 898.05 | R | s | n

2.3.19 List of pronoun lemmas in the GOS 1.0 corpus with text-type distribution
File at CLARIN.SI: GOS1.0-words-pronouns-lemmas-taxonomy-entire.tsv

Lemma | Lemma (lower-case) | Lemma | Lemma (lower-case) | Part-of-speech category | Standardized form | Total absolute frequency of lemma | Percentage of all found lemmas | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N]
ta | ta | ta | ta | Z | to | 18,438 | 13.25 % | 17,812.65 | 3,620 | 12.57 % | 15,703.42 | 5,468 | 12.87 % | 18,501.60 | 5,489 | 12.12 % | 15,543.01 | 3,861 | 17.11 % | 24,766.99
se | se | se | se | Z | se | 15,865 | 11.40 % | 15,326.92 | 3,232 | 11.23 % | 14,020.29 | 3,910 | 9.20 % | 13,229.93 | 6,051 | 13.36 % | 17,134.41 | 2,672 | 11.84 % | 17,139.96
kaj | kaj | kaj | kaj | Z | kaj | 8,980 | 6.45 % | 8,675.43 | 1,692 | 5.88 % | 7,339.83 | 3,530 | 8.31 % | 11,944.16 | 2,397 | 5.29 % | 6,787.50 | 1,361 | 6.03 % | 8,730.35
jaz | jaz | jaz | jaz | Z | jaz | 6,278 | 4.51 % | 6,065.07 | 1,175 | 4.08 % | 5,097.11 | 2,519 | 5.93 % | 8,523.32 | 1,443 | 3.19 % | 4,086.09 | 1,141 | 5.06 % | 7,319.12
on | on | on | on | Z | je | 5,254 | 3.78 % | 5,075.80 | 1,061 | 3.69 % | 4,602.58 | 1,778 | 4.18 % | 6,016.07 | 1,682 | 3.71 % | 4,762.86 | 733 | 3.25 % | 4,701.94
jaz | jaz | jaz | jaz | Z | mi | 4,046 | 2.91 % | 3,908.77 | 672 | 2.33 % | 2,915.11 | 1,457 | 3.43 % | 4,929.93 | 1,249 | 2.76 % | 3,536.75 | 668 | 2.96 % | 4,284.99
ti | ti | ti | ti | Z | ti | 4,006 | 2.88 % | 3,870.13 | 1,039 | 3.61 % | 4,507.14 | 1,835 | 4.32 % | 6,208.93 | 536 | 1.18 % | 1,517.77 | 596 | 2.64 % | 3,823.14
ta | ta | ta | ta | Z | ta | 3,881 | 2.79 % | 3,749.37 | 678 | 2.35 % | 2,941.14 | 1,121 | 2.64 % | 3,793.03 | 1,386 | 3.06 % | 3,924.69 | 696 | 3.08 % | 4,464.60
kar | kar | kar | kar | Z | kar | 2,909 | 2.09 % | 2,810.34 | 617 | 2.14 % | 2,676.52 | 731 | 1.72 % | 2,473.42 | 1,058 | 2.34 % | 2,995.90 | 503 | 2.23 % | 3,226.57
ves | ves | ves | ves | Z | vse | 2,769 | 1.99 % | 2,675.09 | 648 | 2.25 % | 2,811 | 969 | 2.28 % | 3,278.72 | 689 | 1.52 % | 1,951.02 | 463 | 2.05 % | 2,969.99
on | on | on | on | Z | ga | 2,681 | 1.93 % | 2,590.07 | 527 | 1.83 % | 2,286.11 | 1,001 | 2.35 % | 3,387 | 809 | 1.79 % | 2,290.82 | 344 | 1.52 % | 2,206.64
ti | ti | ti | ti | Z | te | 2,326 | 1.67 % | 2,247.11 | 660 | 2.29 % | 2,863.05 | 746 | 1.75 % | 2,524.18 | 548 | 1.21 % | 1,551.75 | 372 | 1.65 % | 2,386.25
ta | ta | ta | ta | Z | tem | 2,124 | 1.53 % | 2,051.96 | 399 | 1.39 % | 1,730.85 | 248 | 0.58 % | 839.14 | 1,098 | 2.42 % | 3,109.17 | 379 | 1.68 % | 2,431.15
nekaj | nekaj | nekaj | nekaj | Z | nekaj | 2,020 | 1.45 % | 1,951.49 | 376 | 1.31 % | 1,631.07 | 748 | 1.76 % | 2,530.94 | 575 | 1.27 % | 1,628.21 | 321 | 1.42 % | 2,059.10
ta | ta | ta | ta | Z | tega | 1,961 | 1.41 % | 1,894.49 | 298 | 1.03 % | 1,292.71 | 382 | 0.90 % | 1,292.54 | 920 | 2.03 % | 2,605.13 | 361 | 1.60 % | 2,315.69
on | on | on | on | Z | jih | 1,931 | 1.39 % | 1,865.51 | 323 | 1.12 % | 1,401.16 | 516 | 1.21 % | 1,745.94 | 802 | 1.77 % | 2,271 | 290 | 1.28 % | 1,860.25
kak | kak | kak | kak | Z | kako | 1,853 | 1.33 % | 1,790.15 | 400 | 1.39 % | 1,735.18 | 676 | 1.59 % | 2,287.32 | 513 | 1.13 % | 1,452.64 | 264 | 1.17 % | 1,693.47
se | se | se | se | Z | si | 1,690 | 1.22 % | 1,632.68 | 449 | 1.56 % | 1,947.74 | 487 | 1.15 % | 1,647.82 | 521 | 1.15 % | 1,475.30 | 233 | 1.03 % | 1,494.61
nič | nič | nič | nič | Z | nič | 1,538 | 1.10 % | 1,485.84 | 370 | 1.28 % | 1,605.05 | 602 | 1.42 % | 2,036.94 | 371 | 0.82 % | 1,050.55 | 195 | 0.86 % | 1,250.86
on | on | on | on | Z | jo | 1,333 | 0.96 % | 1,287.79 | 286 | 0.99 % | 1,240.66 | 403 | 0.95 % | 1,363.60 | 512 | 1.13 % | 1,449.81 | 132 | 0.58 % | 846.73
on | on | on | on | Z | on | 1,216 | 0.87 % | 1,174.76 | 246 | 0.85 % | 1,067.14 | 585 | 1.38 % | 1,979.41 | 211 | 0.47 % | 597.48 | 174 | 0.77 % | 1,116.15
tisti | tisti | tisti | tisti | Z | tisto | 1,156 | 0.83 % | 1,116.79 | 132 | 0.46 % | 572.61 | 566 | 1.33 % | 1,915.13 | 278 | 0.61 % | 787.20 | 180 | 0.80 % | 1,154.64
jaz | jaz | jaz | jaz | Z | me | 1,146 | 0.82 % | 1,107.13 | 327 | 1.14 % | 1,418.51 | 440 | 1.03 % | 1,488.79 | 232 | 0.51 % | 656.95 | 147 | 0.65 % | 942.95
kdo | kdo | kdo | kdo | Z | kdo | 1,004 | 0.72 % | 969.95 | 259 | 0.90 % | 1,123.53 | 231 | 0.54 % | 781.61 | 371 | 0.82 % | 1,050.55 | 143 | 0.63 % | 917.30
tisti | tisti | tisti | tisti | Z | tisti | 947 | 0.68 % | 914.88 | 168 | 0.58 % | 728.78 | 352 | 0.83 % | 1,191.03 | 314 | 0.69 % | 889.14 | 113 | 0.50 % | 724.86
ti | ti | ti | ti | Z | vi | 907 | 0.65 % | 876.24 | 192 | 0.67 % | 832.89 | 98 | 0.23 % | 331.59 | 380 | 0.84 % | 1,076.03 | 237 | 1.05 % | 1,520.27
jaz | jaz | jaz | jaz | Z | nas | 886 | 0.64 % | 855.95 | 208 | 0.72 % | 902.30 | 272 | 0.64 % | 920.34 | 314 | 0.69 % | 889.14 | 92 | 0.41 % | 590.15
on | on | on | on | Z | oni | 881 | 0.63 % | 851.12 | 119 | 0.41 % | 516.22 | 516 | 1.21 % | 1,745.94 | 112 | 0.25 % | 317.15 | 134 | 0.59 % | 859.56
ta | ta | ta | ta | Z | teh | 850 | 0.61 % | 821.17 | 152 | 0.53 % | 659.37 | 103 | 0.24 % | 348.51 | 434 | 0.96 % | 1,228.94 | 161 | 0.71 % | 1,032.76
on | on | on | on | Z | ona | 816 | 0.59 % | 788.32 | 126 | 0.44 % | 546.58 | 508 | 1.20 % | 1,718.88 | 71 | 0.16 % | 201.05 | 111 | 0.49 % | 712.03
jaz | jaz | jaz | jaz | Z | meni | 810 | 0.58 % | 782.53 | 132 | 0.46 % | 572.61 | 405 | 0.95 % | 1,370.36 | 113 | 0.25 % | 319.98 | 160 | 0.71 % | 1,026.34
ves | ves | ves | ves | Z | vsi | 800 | 0.57 % | 772.87 | 175 | 0.61 % | 759.14 | 205 | 0.48 % | 693.64 | 314 | 0.69 % | 889.14 | 106 | 0.47 % | 679.95

2.3.20 List of pronoun lower-case word forms in the GOS 1.0 corpus with standardized word forms, lemmas, morphosyntactic tags and text-type distribution
File at CLARIN.SI: GOS1.0-words-pronouns-lowercase_forms-standardized_forms-lemmas-morphosyntactic_tags-taxonomy-entire.tsv

Lower-case word form | Lemma | Lemma (lower-case) | Morphosyntactic tag | Standardized form | Total absolute frequency of lower-case word form | Percentage of all found lower-case word forms | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N] | msd01 | msd02 | msd03 | msd04 | msd05 | msd06 | msd07 | msd08 | msd09
se | se | se | Zp------k | se | 15,674 | 11.26 % | 15,142.40 | 3,224 | 11.20 % | 13,985.59 | 3,763 | 8.86 % | 12,732.54 | 6,029 | 13.31 % | 17,072.11 | 2,658 | 11.78 % | 17,050.16 | Z | p | - | - | - | - | - | - | k
to | ta | ta | Zk-sei | to | 14,824 | 10.65 % | 14,321.22 | 2,950 | 10.25 % | 12,796.99 | 4,100 | 9.65 % | 13,872.82 | 4,450 | 9.82 % | 12,600.91 | 3,324 | 14.73 % | 21,322.32 | Z | k | - | s | e | i
je | on | on | Zotzer--k | je | 5,166 | 3.71 % | 4,990.79 | 1,052 | 3.65 % | 4,563.54 | 1,714 | 4.03 % | 5,799.51 | 1,672 | 3.69 % | 4,734.55 | 728 | 3.23 % | 4,669.87 | Z | o | t | z | e | r | - | - | k
kaj | kaj | kaj | Zv-sei | kaj | 4,710 | 3.38 % | 4,550.25 | 1,016 | 3.53 % | 4,407.37 | 1,240 | 2.92 % | 4,195.68 | 1,670 | 3.69 % | 4,728.88 | 784 | 3.48 % | 5,029.09 | Z | v | - | s | e | i
ti | ti | ti | Zod-ei | ti | 3,578 | 2.57 % | 3,456.65 | 935 | 3.25 % | 4,055.99 | 1,608 | 3.78 % | 5,440.85 | 493 | 1.09 % | 1,396.01 | 542 | 2.40 % | 3,476.74 | Z | o | d | - | e | i
mi | jaz | jaz | Zop-ed--k | mi | 3,392 | 2.44 % | 3,276.96 | 587 | 2.04 % | 2,546.38 | 1,266 | 2.98 % | 4,283.66 | 947 | 2.09 % | 2,681.59 | 592 | 2.62 % | 3,797.48 | Z | o | p | - | e | d | - | - | k
jz | jaz | jaz | Zop-ei | jaz | 3,351 | 2.41 % | 3,237.35 | 628 | 2.18 % | 2,724.24 | 1,092 | 2.57 % | 3,694.91 | 918 | 2.03 % | 2,599.47 | 713 | 3.16 % | 4,573.65 | Z | o | p | - | e | i
ta | ta | ta | Zk-mei | ta | 2,783 | 2.00 % | 2,688.61 | 479 | 1.66 % | 2,077.88 | 758 | 1.78 % | 2,564.78 | 1,006 | 2.22 % | 2,848.66 | 540 | 2.39 % | 3,463.91 | Z | k | - | m | e | i
ga | on | on | Zotmet--k | ga | 2,472 | 1.78 % | 2,388.16 | 493 | 1.71 % | 2,138.62 | 903 | 2.12 % | 3,055.40 | 757 | 1.67 % | 2,143.57 | 319 | 1.41 % | 2,046.28 | Z | o | t | m | e | t | - | - | k
te | ti | ti | Zod-et--k | te | 2,096 | 1.51 % | 2,024.91 | 609 | 2.12 % | 2,641.82 | 636 | 1.50 % | 2,151.98 | 505 | 1.11 % | 1,429.99 | 346 | 1.53 % | 2,219.47 | Z | o | d | - | e | t | - | - | k
to | ta | ta | Zk-set | to | 2,026 | 1.46 % | 1,957.29 | 420 | 1.46 % | 1,821.94 | 412 | 0.97 % | 1,394.05 | 814 | 1.80 % | 2,304.98 | 380 | 1.68 % | 2,437.57 | Z | k | - | s | e | t
vse | ves | ves | Zc-sei | vse | 1,982 | 1.42 % | 1,914.78 | 459 | 1.59 % | 1,991.12 | 703 | 1.65 % | 2,378.68 | 485 | 1.07 % | 1,373.36 | 335 | 1.49 % | 2,148.91 | Z | c | - | s | e | i
jih | on | on | Zotmmt--k | jih | 1,804 | 1.30 % | 1,742.81 | 306 | 1.06 % | 1,327.42 | 470 | 1.11 % | 1,590.30 | 766 | 1.69 % | 2,169.06 | 262 | 1.16 % | 1,680.64 | Z | o | t | m | m | t | - | - | k
si | se | se | Zp---d--k | si | 1,633 | 1.17 % | 1,577.61 | 441 | 1.53 % | 1,913.04 | 444 | 1.04 % | 1,502.32 | 515 | 1.14 % | 1,458.31 | 233 | 1.03 % | 1,494.61 | Z | p | - | - | - | d | - | - | k
jaz | jaz | jaz | Zop-ei | jaz | 1,541 | 1.11 % | 1,488.73 | 339 | 1.18 % | 1,470.57 | 618 | 1.45 % | 2,091.07 | 330 | 0.73 % | 934.45 | 254 | 1.13 % | 1,629.32 | Z | o | p | - | e | i
kar | kar | kar | Zz-sei | kar | 1,330 | 0.96 % | 1,284.89 | 277 | 0.96 % | 1,201.62 | 238 | 0.56 % | 805.30 | 626 | 1.38 % | 1,772.62 | 189 | 0.84 % | 1,212.37 | Z | z | - | s | e | i
jo | on | on | Zotzet--k | jo | 1,294 | 0.93 % | 1,250.11 | 273 | 0.95 % | 1,184.26 | 378 | 0.89 % | 1,279.01 | 511 | 1.13 % | 1,446.98 | 132 | 0.58 % | 846.73 | Z | o | t | z | e | t | - | - | k
tem | ta | ta | Zk-sem | tem | 1,236 | 0.89 % | 1,194.08 | 243 | 0.84 % | 1,054.12 | 124 | 0.29 % | 419.57 | 661 | 1.46 % | 1,871.73 | 208 | 0.92 % | 1,334.25 | Z | k | - | s | e | m
on | on | on | Zotmei | on | 1,148 | 0.82 % | 1,109.06 | 232 | 0.81 % | 1,006.41 | 534 | 1.26 % | 1,806.85 | 210 | 0.46 % | 594.65 | 172 | 0.76 % | 1,103.32 | Z | o | t | m | e | i
kej | kaj | kaj | Zv-sei | kaj | 1,108 | 0.80 % | 1,070.42 | 181 | 0.63 % | 785.17 | 550 | 1.29 % | 1,860.99 | 181 | 0.40 % | 512.53 | 196 | 0.87 % | 1,257.27 | Z | v | - | s | e | i
ka | kaj | kaj | Zv-sei | kaj | 1,088 | 0.78 % | 1,051.10 | 166 | 0.58 % | 720.10 | 705 | 1.66 % | 2,385.45 | 91 | 0.20 % | 257.68 | 126 | 0.56 % | 808.25 | Z | v | - | s | e | i
me | jaz | jaz | Zop-et--k | me | 1,025 | 0.74 % | 990.24 | 281 | 0.98 % | 1,218.97 | 405 | 0.95 % | 1,370.36 | 204 | 0.45 % | 577.66 | 135 | 0.60 % | 865.98
Z o p - e t - - k
kr kar kar Zz-sei kar 995 0.71 % 961.25 235 0.82 % 1,019.42 313 0.74 % 1,059.07 254 0.56 % 719.24 193 0.85 % 1,238.03 Z z - s e i
tega ta ta Zk-met tega 938 0.67 % 906.19 138 0.48 % 598.64 178 0.42 % 602.28 417 0.92 % 1,180.80 205 0.91 % 1,315 Z k - m e t
kdo kdo kdo Zv-mei kdo 906 0.65 % 875.27 245 0.85 % 1,062.80 161 0.38 % 544.76 364 0.80 % 1,030.73 136 0.60 % 872.39 Z v - m e i
vi ti ti Zodmmi vi 894 0.64 % 863.68 191 0.66 % 828.55 94 0.22 % 318.06 374 0.83 % 1,059.04 235 1.04 % 1,507.44 Z o d m m i
kako kak kak Zn-sei kako 860 0.62 % 830.83 207 0.72 % 897.96 120 0.28 % 406.03 414 0.91 % 1,172.31 119 0.53 % 763.34 Z n - s e i
tega ta ta Zk-ser tega 842 0.60 % 813.44 117 0.41 % 507.54 144 0.34 % 487.24 443 0.98 % 1,254.43 138 0.61 % 885.22 Z k - s e r
nič nič nič Zl-sei nič 840 0.60 % 811.51 248 0.86 % 1,075.81 188 0.44 % 636.12 291 0.64 % 824.01 113 0.50 % 724.86 Z l - s e i
kaj kaj kaj Zv-set kaj 812 0.58 % 784.46 171 0.59 % 741.79 146 0.34 % 494.01 359 0.79 % 1,016.57 136 0.60 % 872.39 Z v - s e t
vsi ves ves Zc-mmi vsi 739 0.53 % 713.94 167 0.58 % 724.44 175 0.41 % 592.13 296 0.65 % 838.17 101 0.45 % 647.88 Z c - m m i
vam ti ti Zod-md vam 729 0.52 % 704.28 206 0.72 % 893.62 20 0.05 % 67.67 271 0.60 % 767.38 232 1.03 % 1,488.20 Z o d - m d

2.3.21 List of numeral lemmas in the GOS 1.0 corpus with text-type distribution
File at CLARIN.SI: GOS1.0-words-numerals-lemmas-taxonomy-entire.tsv
Columns: Lemma; Lemma (lower-case); Lemma; Lemma (lower-case); Part-of-speech category; Standardized form; Total absolute frequency of lemma; Percentage of all found lemmas; Total relative frequency (per million occurrences); Absolute frequency [gos.T.J.R]; Percentage [gos.T.J.R]; Relative frequency [gos.T.J.R]; Absolute frequency [gos.T.N.Z]; Percentage [gos.T.N.Z]; Relative frequency [gos.T.N.Z]; Absolute frequency [gos.T.J.I]; Percentage [gos.T.J.I]; Relative frequency [gos.T.J.I]; Absolute frequency [gos.T.N.N]; Percentage [gos.T.N.N]; Relative frequency [gos.T.N.N]
MMM mmm MMM mmm K mmm 1,522 5.94 % 1,470.38 193 2.83 % 837.23 605 9.35 % 2,047.09 351 4.12 % 993.91 373 9.82 % 2,392.67
en en en en K en 1,504 5.87 % 1,452.99 302 4.43 % 1,310.06 576 8.90 % 1,948.96 411 4.82 % 1,163.81 215 5.66 % 1,379.15
dva dva dva dva K dva 1,278 4.99 % 1,234.65 224 3.28 % 971.70 346 5.35 % 1,170.73 501 5.87 % 1,418.66 207 5.45 % 1,327.83
en en en en K eno 1,033 4.03 % 997.96 229 3.36 % 993.39 346 5.35 % 1,170.73 304 3.56 % 860.83 154 4.05 % 987.86
en en en en K ena 983 3.84 % 949.66 267 3.91 % 1,158.24 240 3.71 % 812.07 351 4.12 % 993.91 125 3.29 % 801.83
trije trije trije trije K tri 891 3.48 % 860.78 300 4.40 % 1,301.39 197 3.04 % 666.57 276 3.24 % 781.54 118 3.11 % 756.93
dva dva dva dva K dve 740 2.89 % 714.90 187 2.74 % 811.20 194 3.00 % 656.42 282 3.31 % 798.53 77 2.03 % 493.93
tisoč tisoč tisoč tisoč K tisoč 617 2.41 % 596.07 203 2.98 % 880.61 41 0.63 % 138.73 321 3.76 % 908.96 52 1.37 % 333.56
pet pet pet pet K pet 581 2.27 % 561.29 164 2.40 % 711.43 153 2.37 % 517.69 175 2.05 % 495.54 89 2.34 % 570.90
en en en en K ene 519 2.03 % 501.40 75 1.10 % 325.35 265 4.10 % 896.66 100 1.17 % 283.17 79 2.08 % 506.76
štirje štirje štirje štirje K štiri 501 1.96 % 484.01 142 2.08 % 615.99 112 1.73 % 378.96 177 2.08 % 501.20 70 1.84 % 449.03
drug drug drug drug K drugi 489 1.91 % 472.41 100 1.47 % 433.80 83 1.28 % 280.84 226 2.65 % 639.96 80 2.11 % 513.17
sto sto sto sto K sto 470 1.83 % 454.06 61 0.89 % 264.62 105 1.62 % 355.28 171 2.00 % 484.21 133 3.50 % 853.15
prvi prvi prvi prvi K prvi 451 1.76 % 435.70 148 2.17 % 642.02 96 1.48 % 324.83 162 1.90 % 458.73 45 1.19 % 288.66
drug drug drug drug K drugo 416 1.62 % 401.89 66 0.97 % 286.31 93 1.44 % 314.68 191 2.24 % 540.85 66 1.74 % 423.37
deset deset deset deset K deset 415 1.62 % 400.92 109 1.60 % 472.84 130 2.01 % 439.87 109 1.28 % 308.65 67 1.76 % 429.78
šest šest šest šest K šest 411 1.60 % 397.06 174 2.55 % 754.81 80 1.24 % 270.69 115 1.35 % 325.64 42 1.11 % 269.42
drug drug drug drug K drugega 397 1.55 % 383.54 90 1.32 % 390.42 123 1.90 % 416.18 127 1.49 % 359.62 57 1.50 % 365.64
en en en en K enega 375 1.46 % 362.28 68 1.00 % 294.98 140 2.16 % 473.71 104 1.22 % 294.49 63 1.66 % 404.12
dvajset dvajset dvajset dvajset K dvajset 354 1.38 % 341.99 95 1.39 % 412.11 88 1.36 % 297.76 106 1.24 % 300.16 65 1.71 % 416.95
en en en en K eni 338 1.32 % 326.54 54 0.79 % 234.25 123 1.90 % 416.18 97 1.14 % 274.67 64 1.69 % 410.54
osem osem osem osem K osem 333 1.30 % 321.71 165 2.42 % 715.76 34 0.53 % 115.04 91 1.07 % 257.68 43 1.13 % 275.83
trideset trideset trideset trideset K trideset 285 1.11 % 275.33 92 1.35 % 399.09 54 0.83 % 182.72 67 0.79 % 189.72 72 1.90 % 461.86
prvi prvi prvi prvi K prvo 273 1.07 % 263.74 71 1.04 % 308 83 1.28 % 280.84 82 0.96 % 232.20 37 0.97 % 237.34
drug drug drug drug K druga 252 0.98 % 243.45 60 0.88 % 260.28 45 0.70 % 152.26 111 1.30 % 314.31 36 0.95 % 230.93
petnajst petnajst petnajst petnajst K petnajst 252 0.98 % 243.45 66 0.97 % 286.31 64 0.99 % 216.55 55 0.65 % 155.74 67 1.76 % 429.78
petdeset petdeset petdeset petdeset K petdeset 245 0.96 % 236.69 30 0.44 % 130.14 82 1.27 % 277.46 81 0.95 % 229.36 52 1.37 % 333.56
sedem sedem sedem sedem K sedem 240 0.94 % 231.86 76 1.11 % 329.69 57 0.88 % 192.87 84 0.98 % 237.86 23 0.61 % 147.54
drug drug drug drug K druge 229 0.89 % 221.23 39 0.57 % 169.18 45 0.70 % 152.26 120 1.41 % 339.80 25 0.66 % 160.37
eden eden eden eden K eden 219 0.85 % 211.57 57 0.83 % 247.26 54 0.83 % 182.72 87 1.02 % 246.35 21 0.55 % 134.71
devet devet devet devet K devet 203 0.79 % 196.11 87 1.27 % 377.40 18 0.28 % 60.91 59 0.69 % 167.07 39 1.03 % 250.17
ena ena ena ena K ena 195 0.76 % 188.39 42 0.62 % 182.19 53 0.82 % 179.33 68 0.80 % 192.55 32 0.84 % 205.27

2.3.22 List of numeral lower-case word forms in the GOS 1.0 corpus with standardized word forms, lemmas, morphosyntactic tags and text-type distribution
File at CLARIN.SI: GOS1.0-words-numerals-lowercase_forms-standardized_forms-lemmas-morphosyntactic_tags-taxonomy-entire.tsv
Columns: Lower-case word form; Lemma; Lemma (lower-case); Morphosyntactic tag; Standardized form; Total absolute frequency of lower-case word form; Percentage of all found lower-case word forms; Total relative frequency (per million occurrences); Absolute frequency [gos.T.J.R]; Percentage [gos.T.J.R]; Relative frequency [gos.T.J.R]; Absolute frequency [gos.T.N.Z]; Percentage [gos.T.N.Z]; Relative frequency [gos.T.N.Z]; Absolute frequency [gos.T.J.I]; Percentage [gos.T.J.I]; Relative frequency [gos.T.J.I]; Absolute frequency [gos.T.N.N]; Percentage [gos.T.N.N]; Relative frequency [gos.T.N.N]; msd01; msd02; msd03; msd04; msd05; msd06; msd07
mmm MMM mmm Krg mmm 1,522 5.94 % 1,470.38 193 2.83 % 837.23 605 9.35 % 2,047.09 351 4.12 % 993.91 373 9.82 % 2,392.67 K r g
en en en Kbzmei en 1,241 4.84 % 1,198.91 258 3.78 % 1,119.19 446 6.89 % 1,509.09 345 4.04 % 976.92 192 5.05 % 1,231.61 K b z m e i
dva dva dva Kbgmdi dva 1,087 4.24 % 1,050.13 197 2.89 % 854.58 266 4.11 % 900.04 449 5.26 % 1,271.42 175 4.61 % 1,122.56 K b g m d i
ena en en Kbzsmi ena 952 3.72 % 919.71 262 3.84 % 1,136.55 217 3.35 % 734.24 349 4.09 % 988.25 124 3.27 % 795.42 K b z s m i
eno en en Kbzsei eno 811 3.17 % 783.49 192 2.81 % 832.89 235 3.63 % 795.15 249 2.92 % 705.08 135 3.56 % 865.98 K b z s e i
tri trije trije Kbgmmt tri 612 2.39 % 591.24 200 2.93 % 867.59 138 2.13 % 466.94 196 2.30 % 555.01 78 2.05 % 500.34 K b g m m t
tisoč tisoč tisoč Kbg-mi tisoč 552 2.15 % 533.28 179 2.62 % 776.50 35 0.54 % 118.43 289 3.39 % 818.35 49 1.29 % 314.32 K b g - m i
pet pet pet Kbg-mi pet 496 1.94 % 479.18 135 1.98 % 585.62 126 1.95 % 426.34 156 1.83 % 441.74 79 2.08 % 506.76 K b g - m i
sto sto sto Kbg-mi sto 369 1.44 % 356.48 51 0.75 % 221.24 75 1.16 % 253.77 133 1.56 % 376.61 110 2.90 % 705.61 K b g - m i
šest šest šest Kbg-mi šest 362 1.41 % 349.72 155 2.27 % 672.38 68 1.05 % 230.09 98 1.15 % 277.50 41 1.08 % 263 K b g - m i
deset deset deset Kbg-mi deset 353 1.38 % 341.03 90 1.32 % 390.42 105 1.62 % 355.28 98 1.15 % 277.50 60 1.58 % 384.88 K b g - m i
dve dva dva Kbgsdi dve 337 1.31 % 325.57 81 1.19 % 351.37 77 1.19 % 260.54 142 1.67 % 402.10 37 0.97 % 237.34 K b g s d i
osem osem osem Kbg-mi osem 296 1.16 % 285.96 153 2.24 % 663.71 18 0.28 % 60.91 85 1.00 % 240.69 40 1.05 % 256.59 K b g - m i
ene en en Kbzmmt ene 288 1.12 % 278.23 44 0.65 % 190.87 150 2.32 % 507.54 48 0.56 % 135.92 46 1.21 % 295.07 K b z m m t
prvi prvi prvi Kbvmei prvi 253 0.99 % 244.42 86 1.26 % 373.06 57 0.88 % 192.87 84 0.98 % 237.86 26 0.69 % 166.78 K b v m e i
dve dva dva Kbgzdi dve 243 0.95 % 234.76 71 1.04 % 308 52 0.80 % 175.95 90 1.05 % 254.85 30 0.79 % 192.44 K b g z d i
štiri štirje štirje Kbgmmt štiri 225 0.88 % 217.37 46 0.67 % 199.55 48 0.74 % 162.41 98 1.15 % 277.50 33 0.87 % 211.68 K b g m m t
drugo drug drug Kbzsei drugo 216 0.84 % 208.67 35 0.51 % 151.83 44 0.68 % 148.88 97 1.14 % 274.67 40 1.05 % 256.59 K b z s e i
druga drug drug Kbzzei druga 213 0.83 % 205.78 53 0.78 % 229.91 36 0.56 % 121.81 91 1.07 % 257.68 33 0.87 % 211.68 K b z z e i
petnajst petnajst petnajst Kbg-mi petnajst 211 0.82 % 203.84 49 0.72 % 212.56 51 0.79 % 172.56 48 0.56 % 135.92 63 1.66 % 404.12 K b g - m i
en en en Kbzmet en 199 0.78 % 192.25 44 0.65 % 190.87 71 1.10 % 240.24 62 0.73 % 175.56 22 0.58 % 141.12 K b z m e t
ena ena ena Kbg-mi ena 190 0.74 % 183.56 42 0.62 % 182.19 48 0.74 % 162.41 68 0.80 % 192.55 32 0.84 % 205.27 K b g - m i
enga en en Kbzmer enega 190 0.74 % 183.56 26 0.38 % 112.79 90 1.39 % 304.53 39 0.46 % 110.43 35 0.92 % 224.51 K b z m e r
devet devet devet Kbg-mi devet 188 0.73 % 181.62 78 1.14 % 338.36 15 0.23 % 50.75 56 0.66 % 158.57 39 1.03 % 250.17 K b g - m i
eden eden eden Kbzmei eden 185 0.72 % 178.73 50 0.73 % 216.90 35 0.54 % 118.43 82 0.96 % 232.20 18 0.47 % 115.46 K b z m e i
sedem sedem sedem Kbg-mi sedem 184 0.72 % 177.76 64 0.94 % 277.63 30
0.46 % 101.51 76 0.89 % 215.21 14 0.37 % 89.81 K b g - m i
dva dva dva Kbgmdt dva 172 0.67 % 166.17 24 0.35 % 104.11 65 1.00 % 219.93 51 0.60 % 144.41 32 0.84 % 205.27 K b g m d t
eni en en Kbzmmi eni 168 0.66 % 162.30 33 0.48 % 143.15 54 0.83 % 182.72 48 0.56 % 135.92 33 0.87 % 211.68 K b z m m i
petdeset petdeset petdeset Kbg-mi petdeset 162 0.63 % 156.51 27 0.40 % 117.12 42 0.65 % 142.11 59 0.69 % 167.07 34 0.90 % 218.10 K b g - m i
trideset trideset trideset Kbg-mi trideset 154 0.60 % 148.78 61 0.89 % 264.62 9 0.14 % 30.45 38 0.45 % 107.60 46 1.21 % 295.07 K b g - m i
dvajset dvajset dvajset Kbg-mi dvajset 151 0.59 % 145.88 57 0.83 % 247.26 7 0.11 % 23.69 51 0.60 % 144.41 36 0.95 % 230.93 K b g - m i
dvesto dvesto dvesto Kbg-mi dvesto 149 0.58 % 143.95 52 0.76 % 225.57 26 0.40 % 87.97 53 0.62 % 150.08 18 0.47 % 115.46 K b g - m i

2.3.23 List of preposition lemmas in the GOS 1.0 corpus with text-type distribution
File at CLARIN.SI: GOS1.0-words-prepositions-lemmas-taxonomy-entire.tsv
Columns: Lemma; Lemma (lower-case); Lemma; Lemma (lower-case); Part-of-speech category; Standardized form; Total absolute frequency of lemma; Percentage of all found lemmas; Total relative frequency (per million occurrences); Absolute frequency [gos.T.J.R]; Percentage [gos.T.J.R]; Relative frequency [gos.T.J.R]; Absolute frequency [gos.T.N.Z]; Percentage [gos.T.N.Z]; Relative frequency [gos.T.N.Z]; Absolute frequency [gos.T.J.I]; Percentage [gos.T.J.I]; Relative frequency [gos.T.J.I]; Absolute frequency [gos.T.N.N]; Percentage [gos.T.N.N]; Relative frequency [gos.T.N.N]
v v v v D v 17,481 25.77 % 16,888.11 3,989 24.80 % 17,304.13 3,778 24.21 % 12,783.29 7,378 27.32 % 20,892.03 2,336 25.55 % 14,984.64
na na na na D na 11,885 17.52 % 11,481.90 2,972 18.48 % 12,892.42 2,881 18.46 % 9,748.19 4,330 16.03 % 12,261.11 1,702 18.62 % 10,917.74
za za za za D za 7,868 11.60 % 7,601.15 1,772 11.02 % 7,686.87 1,787 11.45 % 6,046.52 3,129 11.59 % 8,860.28 1,180 12.91 % 7,569.29
z z z z D z 4,303 6.34 % 4,157.06 1,119 6.96 % 4,854.18 887 5.68 % 3,001.27 1,740 6.44 % 4,927.10 557 6.09 % 3,572.96
po po po po D po 3,337 4.92 % 3,223.82 779 4.84 % 3,379.27 981 6.29 % 3,319.33 1,116 4.13 % 3,160.14 461 5.04 % 2,957.16
z z z z D s 3,259 4.80 % 3,148.47 759 4.72 % 3,292.51 669 4.29 % 2,263.64 1,323 4.90 % 3,746.29 508 5.56 % 3,258.65
od od od od D od 2,502 3.69 % 2,417.14 589 3.66 % 2,555.06 674 4.32 % 2,280.56 940 3.48 % 2,661.77 299 3.27 % 1,917.98
pri pri pri pri D pri 2,371 3.50 % 2,290.58 443 2.75 % 1,921.72 596 3.82 % 2,016.63 929 3.44 % 2,630.62 403 4.41 % 2,585.11
do do do do D do 2,124 3.13 % 2,051.96 479 2.98 % 2,077.88 448 2.87 % 1,515.86 879 3.25 % 2,489.03 318 3.48 % 2,039.86
o o o o D o 2,070 3.05 % 1,999.79 480 2.98 % 2,082.22 319 2.04 % 1,079.37 1,060 3.92 % 3,001.57 211 2.31 % 1,353.49
iz iz iz iz D iz 1,942 2.86 % 1,876.13 453 2.82 % 1,965.10 436 2.79 % 1,475.26 888 3.29 % 2,514.52 165 1.80 % 1,058.42
ob ob ob ob D ob 896 1.32 % 865.61 291 1.81 % 1,262.35 259 1.66 % 876.36 282 1.04 % 798.53 64 0.70 % 410.54
pred pred pred pred D pred 788 1.16 % 761.27 284 1.77 % 1,231.98 148 0.95 % 500.77 310 1.15 % 877.82 46 0.50 % 295.07
zaradi zaradi zaradi zaradi D zaradi 786 1.16 % 759.34 135 0.84 % 585.62 143 0.92 % 483.86 398 1.47 % 1,127 110 1.20 % 705.61
med med med med D med 781 1.15 % 754.51 192 1.19 % 832.89 100 0.64 % 338.36 409 1.51 % 1,158.15 80 0.88 % 513.17
čez čez čez čez D čez 587 0.86 % 567.09 211 1.31 % 915.31 198 1.27 % 669.96 125 0.46 % 353.96 53 0.58 % 339.98
k k k k D k 468 0.69 % 452.13 103 0.64 % 446.81 116 0.74 % 392.50 186 0.69 % 526.69 63 0.69 % 404.12
brez brez brez brez D brez 463 0.68 % 447.30 133 0.83 % 576.95 118 0.76 % 399.27 143 0.53 % 404.93 69 0.76 % 442.61
pod pod pod pod D pod 337 0.50 % 325.57 60 0.37 % 260.28 93 0.60 % 314.68 145 0.54 % 410.59 39 0.43 % 250.17
proti proti proti proti D proti 323 0.48 % 312.05 123 0.77 % 533.57 63 0.40 % 213.17 114 0.42 % 322.81 23 0.25 %
147.54
zraven zraven zraven zraven D zraven 297 0.44 % 286.93 54 0.34 % 234.25 130 0.83 % 439.87 48 0.18 % 135.92 65 0.71 % 416.95
skoz skoz skoz skoz D skoz 213 0.31 % 205.78 15 0.09 % 65.07 130 0.83 % 439.87 14 0.05 % 39.64 54 0.59 % 346.39
skozi skozi skozi skozi D skozi 212 0.31 % 204.81 50 0.31 % 216.90 60 0.39 % 203.02 92 0.34 % 260.51 10 0.11 % 64.15
okoli okoli okoli okoli D okoli 183 0.27 % 176.79 38 0.24 % 164.84 89 0.57 % 301.14 41 0.15 % 116.10 15 0.16 % 96.22
nad nad nad nad D nad 162 0.24 % 156.51 41 0.26 % 177.86 24 0.15 % 81.21 84 0.31 % 237.86 13 0.14 % 83.39
preko preko preko preko D preko 145 0.21 % 140.08 43 0.27 % 186.53 14 0.09 % 47.37 73 0.27 % 206.71 15 0.16 % 96.22
glede glede glede glede D glede 132 0.20 % 127.52 17 0.11 % 73.75 18 0.12 % 60.91 48 0.18 % 135.92 49 0.54 % 314.32
okrog okrog okrog okrog D okrog 125 0.18 % 120.76 19 0.12 % 82.42 28 0.18 % 94.74 55 0.20 % 155.74 23 0.25 % 147.54
mimo mimo mimo mimo D mimo 117 0.17 % 113.03 42 0.26 % 182.19 38 0.24 % 128.58 28 0.10 % 79.29 9 0.10 % 57.73
k k k k D h 112 0.17 % 108.20 19 0.12 % 82.42 55 0.35 % 186.10 23 0.09 % 65.13 15 0.16 % 96.22
poleg poleg poleg poleg D poleg 112 0.17 % 108.20 29 0.18 % 125.80 20 0.13 % 67.67 53 0.20 % 150.08 10 0.11 % 64.15
kljub kljub kljub kljub D kljub 99 0.15 % 95.64 16 0.10 % 69.41 4 0.03 % 13.53 67 0.25 % 189.72 12 0.13 % 76.98

2.3.24 List of preposition lower-case word forms in the GOS 1.0 corpus with standardized word forms, lemmas, morphosyntactic tags and text-type distribution
File at CLARIN.SI: GOS1.0-words-prepositions-lowercase_forms-standardized_forms-lemmas-morphosyntactic_tags-taxonomy-entire.tsv
Columns: Lower-case word form; Lemma; Lemma (lower-case); Morphosyntactic tag; Standardized form; Total absolute frequency of lower-case word form; Percentage of all found lower-case word forms; Total relative frequency (per million occurrences); Absolute frequency [gos.T.J.R]; Percentage [gos.T.J.R]; Relative frequency [gos.T.J.R]; Absolute frequency [gos.T.N.Z]; Percentage [gos.T.N.Z]; Relative frequency [gos.T.N.Z]; Absolute frequency [gos.T.J.I]; Percentage [gos.T.J.I]; Relative frequency [gos.T.J.I]; Absolute frequency [gos.T.N.N]; Percentage [gos.T.N.N]; Relative frequency [gos.T.N.N]; msd01; msd02
v v v Dm v 13,416 19.78 % 12,960.98 3,116 19.37 % 13,517.09 2,361 15.13 % 7,988.71 5,996 22.20 % 16,978.67 1,943 21.26 % 12,463.68 D m
za za za Dt za 7,420 10.94 % 7,168.34 1,678 10.43 % 7,279.10 1,600 10.25 % 5,413.78 2,997 11.10 % 8,486.50 1,145 12.53 % 7,344.78 D t
na na na Dm na 7,086 10.45 % 6,845.67 1,891 11.76 % 8,203.09 1,644 10.54 % 5,562.66 2,590 9.59 % 7,334.01 961 10.51 % 6,164.48 D m
na na na Dt na 4,651 6.86 % 4,493.26 1,073 6.67 % 4,654.63 1,135 7.27 % 3,840.40 1,709 6.33 % 4,839.32 734 8.03 % 4,708.36 D t
z z z Do z 3,873 5.71 % 3,741.64 995 6.19 % 4,316.27 768 4.92 % 2,598.62 1,607 5.95 % 4,550.49 503 5.50 % 3,226.57 D o
v v v Dt v 3,358 4.95 % 3,244.11 779 4.84 % 3,379.27 968 6.20 % 3,275.34 1,261 4.67 % 3,570.73 350 3.83 % 2,245.13 D t
s z z Do s 2,840 4.19 % 2,743.68 669 4.16 % 2,902.10 546 3.50 % 1,847.45 1,176 4.35 % 3,330.04 449 4.91 % 2,880.18 D o
po po po Dm po 2,769 4.08 % 2,675.09 685 4.26 % 2,971.50 679 4.35 % 2,297.47 1,031 3.82 % 2,919.45 374 4.09 % 2,399.08 D m
od od od Dr od 2,423 3.57 % 2,340.82 581 3.61 % 2,520.36 611 3.92 % 2,067.39 934 3.46 % 2,644.78 297 3.25 % 1,905.15 D r
do do do Dr do 2,089 3.08 % 2,018.15 474 2.95 % 2,056.19 418 2.68 % 1,414.35 879 3.25 % 2,489.03 318 3.48 % 2,039.86 D r
o o o Dm o 2,063 3.04 % 1,993.03 478 2.97 % 2,073.55 318 2.04 % 1,075.99 1,058 3.92 % 2,995.90 209 2.29 % 1,340.66 D m
iz iz iz Dr iz 1,876 2.77 % 1,812.37 435 2.70 % 1,887.01 402 2.58 % 1,360.21 879 3.25 % 2,489.03 160 1.75 % 1,026.34 D r
pri pri pri Dm pri 1,559 2.30 % 1,506.12 340 2.11 % 1,474.91 163 1.04 % 551.53 798 2.96 % 2,259.67 258 2.82 % 1,654.98 D m
ob ob ob Dm ob 833 1.23 % 804.75 279 1.73 % 1,210.29 228 1.46 % 771.46 269 1.00 % 761.72 57 0.62 % 365.64 D m
pr pri pri Dm pri 783 1.15 % 756.44 103 0.64 % 446.81 407 2.61 % 1,377.13 129 0.48 % 365.28 144 1.57 % 923.71 D m
pred pred pred Do pred 749 1.10 % 723.60 273 1.70 % 1,184.26 135 0.86 % 456.79 297 1.10 % 841 44 0.48 % 282.24 D o
med med med Do med 707 1.04 % 683.02 170 1.06 % 737.45 91 0.58 % 307.91 373 1.38 % 1,056.21 73 0.80 % 468.27 D o
čez čez čez Dt čez 575 0.85 % 555.50 210 1.30 % 910.97 187 1.20 % 632.74 125 0.46 % 353.96 53 0.58 % 339.98 D t
po po po Dt po 464 0.68 % 448.26 84 0.52 % 364.39 218 1.40 % 737.63 78 0.29 % 220.87 84 0.92 % 538.83 D t
brez brez brez Dr brez 457 0.67 % 441.50 133 0.83 % 576.95 113 0.72 % 382.35 143 0.53 % 404.93 68 0.74 % 436.20 D r
k k k Dd k 449 0.66 % 433.77 101 0.63 % 438.13 104 0.67 % 351.90 181 0.67 % 512.53 63 0.69 % 404.12 D d
zarad zaradi zaradi Dr zaradi 386 0.57 % 372.91 59 0.37 % 255.94 82 0.53 % 277.46 172 0.64 % 487.05 73 0.80 % 468.27 D r
z z z Dr z 381 0.56 % 368.08 111 0.69 % 481.51 92 0.59 % 311.29 125 0.46 % 353.96 53 0.58 % 339.98 D r
s z z Dr s 371 0.55 % 358.42 75 0.47 % 325.35 109 0.70 % 368.81 130 0.48 % 368.12 57 0.62 % 365.64 D r
zaradi zaradi zaradi Dr zaradi 312 0.46 % 301.42 69 0.43 % 299.32 20 0.13 % 67.67 195 0.72 % 552.17 28 0.31 % 179.61 D r
proti proti proti Dd proti 274 0.40 % 264.71 119 0.74 % 516.22 40 0.26 % 135.34 95 0.35 % 269.01 20 0.22 % 128.29 D d
u v v Dm v 265 0.39 % 256.01 40 0.25 % 173.52 144 0.92 % 487.24 69 0.26 % 195.38 12 0.13 % 76.98 D m
pod pod pod Do pod 261 0.39 % 252.15 49 0.30 % 212.56 75 0.48 % 253.77 114 0.42 % 322.81 23 0.25 % 147.54 D o
za za za Do za 244 0.36 % 235.72 69 0.43 % 299.32 48 0.31 % 162.41 106 0.39 % 300.16 21 0.23 % 134.71 D o
zraven zraven zraven Dr zraven 207 0.30 % 199.98 39 0.24 % 169.18 70 0.45 % 236.85 41 0.15 % 116.10 57 0.62 % 365.64 D r
skoz skoz skoz Dt skoz 158 0.23 % 152.64 14 0.09 % 60.73 83 0.53 % 280.84 11 0.04 % 31.15 50 0.55 % 320.73 D t
nad nad nad Do nad 147 0.22 % 142.01 37 0.23 % 160.50 21
0.14 % 71.06 77 0.28 % 218.04 12 0.13 % 76.98 D o

2.3.25 List of conjunction lemmas in the GOS 1.0 corpus with text-type distribution
File at CLARIN.SI: GOS1.0-words-conjunctions-lemmas-taxonomy-entire.tsv
Columns: Lemma; Lemma (lower-case); Lemma; Lemma (lower-case); Part-of-speech category; Standardized form; Total absolute frequency of lemma; Percentage of all found lemmas; Total relative frequency (per million occurrences); Absolute frequency [gos.T.J.R]; Percentage [gos.T.J.R]; Relative frequency [gos.T.J.R]; Absolute frequency [gos.T.N.Z]; Percentage [gos.T.N.Z]; Relative frequency [gos.T.N.Z]; Absolute frequency [gos.T.J.I]; Percentage [gos.T.J.I]; Relative frequency [gos.T.J.I]; Absolute frequency [gos.T.N.N]; Percentage [gos.T.N.N]; Relative frequency [gos.T.N.N]
pa pa pa pa V pa 29,360 25.89 % 28,364.22 5,601 23.95 % 24,296.92 12,111 35.20 % 40,978.95 6,825 18.02 % 19,326.12 4,823 27.20 % 30,937.89
da da da da V da 19,008 16.76 % 18,363.32 3,596 15.38 % 15,599.31 5,141 14.94 % 17,395.16 7,139 18.85 % 20,215.26 3,132 17.66 % 20,090.70
in in in in V in 16,235 14.32 % 15,684.37 4,182 17.88 % 18,141.36 3,198 9.30 % 10,820.80 7,068 18.67 % 20,014.21 1,787 10.08 % 11,462.99
a a a a V a 6,576 5.80 % 6,352.97 1,222 5.22 % 5,300.99 2,499 7.26 % 8,455.65 1,665 4.40 % 4,714.72 1,190 6.71 % 7,633.44
če če če če V če 6,137 5.41 % 5,928.86 1,120 4.79 % 4,858.52 1,882 5.47 % 6,367.96 1,902 5.02 % 5,385.83 1,233 6.95 % 7,909.27
ki ki ki ki V ki 5,537 4.88 % 5,349.21 1,257 5.38 % 5,452.82 882 2.56 % 2,984.35 2,839 7.50 % 8,039.10 559 3.15 % 3,585.79
ker ker ker ker V ker 4,783 4.22 % 4,620.78 824 3.52 % 3,574.48 1,628 4.73 % 5,508.52 1,294 3.42 % 3,664.18 1,037 5.85 % 6,652
ali ali ali ali V ali 4,757 4.20 % 4,595.66 818 3.50 % 3,548.45 1,536 4.46 % 5,197.23 1,501 3.96 % 4,250.33 902 5.09 % 5,786.02
ko ko ko ko V ko 4,247 3.75 % 4,102.96 734 3.14 % 3,184.06 1,691 4.92 % 5,721.69 1,215 3.21 % 3,440.47 607 3.42 % 3,893.70
ampak ampak ampak ampak V ampak 3,540 3.12 % 3,419.94 815 3.48 % 3,535.44 821 2.39 % 2,777.95 1,237 3.27 % 3,502.77 667 3.76 % 4,278.58
kot kot kot kot V kot 3,075 2.71 % 2,970.71 737 3.15 % 3,197.08 550 1.60 % 1,860.99 1,363 3.60 % 3,859.56 425 2.40 % 2,726.23
saj saj saj saj V saj 1,621 1.43 % 1,566.02 231 0.99 % 1,002.07 824 2.40 % 2,788.10 209 0.55 % 591.82 357 2.01 % 2,290.03
torej torej torej torej V torej 1,223 1.08 % 1,181.52 436 1.86 % 1,891.35 37 0.11 % 125.19 648 1.71 % 1,834.92 102 0.57 % 654.29
oziroma oziroma oziroma oziroma V oziroma 903 0.80 % 872.37 259 1.11 % 1,123.53 105 0.30 % 355.28 417 1.10 % 1,180.80 122 0.69 % 782.59
kjer kjer kjer kjer V kjer 652 0.57 % 629.89 122 0.52 % 529.23 121 0.35 % 409.42 306 0.81 % 866.49 103 0.58 % 660.71
sicer sicer sicer sicer V sicer 595 0.53 % 574.82 195 0.83 % 845.90 74 0.21 % 250.39 243 0.64 % 688.09 83 0.47 % 532.42
kako kako kako kako V kako 533 0.47 % 514.92 126 0.54 % 546.58 135 0.39 % 456.79 223 0.59 % 631.46 49 0.28 % 314.32
zakaj zakaj zakaj zakaj V zakaj 527 0.47 % 509.13 87 0.37 % 377.40 147 0.43 % 497.39 237 0.63 % 671.10 56 0.32 % 359.22
kakor kakor kakor kakor V kakor 523 0.46 % 505.26 100 0.43 % 433.80 229 0.67 % 774.85 134 0.35 % 379.44 60 0.34 % 384.88
zato zato zato zato V zato 363 0.32 % 350.69 59 0.25 % 255.94 75 0.22 % 253.77 167 0.44 % 472.89 62 0.35 % 397.71
tako tako tako tako V tako 352 0.31 % 340.06 56 0.24 % 242.93 114 0.33 % 385.73 94 0.25 % 266.18 88 0.50 % 564.49
niti niti niti niti V niti 314 0.28 % 303.35 60 0.26 % 260.28 95 0.28 % 321.44 105 0.28 % 297.32 54 0.30 % 346.39
čeprav čeprav čeprav čeprav V čeprav 302 0.27 % 291.76 85 0.36 % 368.73 71 0.21 % 240.24 106 0.28 % 300.16 40 0.23 % 256.59
vendar vendar vendar vendar V vendar 284 0.25 % 274.37 85 0.36 % 368.73 6 0.02 % 20.30 184 0.49 % 521.03 9 0.05 % 57.73
kadar kadar kadar kadar V kadar 229 0.20 % 221.23 47 0.20 % 203.88 98 0.28 % 331.59 60 0.16 % 169.90 24 0.14 % 153.95
namreč namreč namreč namreč V namreč 225 0.20 % 217.37 71 0.30 % 308 14 0.04 % 47.37 126 0.33 % 356.79 14 0.08 % 89.81
drugače drugače drugače drugače V drugače 202 0.18 % 195.15 40 0.17 % 173.52 92 0.27 % 311.29 38 0.10 % 107.60 32 0.18 % 205.27
a a a a V A 177 0.16 % 171 25 0.11 % 108.45 16 0.05 % 54.14 106 0.28 % 300.16 30 0.17 % 192.44
kajti kajti kajti kajti V kajti 174 0.15 % 168.10 104 0.45 % 451.15 0 0 % 0 65 0.17 % 184.06 5 0.03 % 32.07
dokler dokler dokler dokler V dokler 143 0.13 % 138.15 21 0.09 % 91.10 57 0.17 % 192.87 44 0.12 % 124.59 21 0.12 % 134.71
vendarle vendarle vendarle vendarle V vendarle 108 0.10 % 104.34 25 0.11 % 108.45 0 0 % 0 76 0.20 % 215.21 7 0.04 % 44.90
naj naj naj naj V naj 105 0.09 % 101.44 25 0.11 % 108.45 31 0.09 % 104.89 30 0.08 % 84.95 19 0.11 % 121.88

2.3.26 List of conjunction lower-case word forms in the GOS 1.0 corpus with standardized word forms, lemmas, morphosyntactic tags and text-type distribution
File at CLARIN.SI: GOS1.0-words-conjunctions-lowercase_forms-standardized_forms-lemmas-morphosyntactic_tags-taxonomy-entire.tsv
Columns: Lower-case word form; Lemma; Lemma (lower-case); Morphosyntactic tag; Standardized form; Total absolute frequency of lower-case word form; Percentage of all found lower-case word forms; Total relative frequency (per million occurrences); Absolute frequency [gos.T.J.R]; Percentage [gos.T.J.R]; Relative frequency [gos.T.J.R]; Absolute frequency [gos.T.N.Z]; Percentage [gos.T.N.Z]; Relative frequency [gos.T.N.Z]; Absolute frequency [gos.T.J.I]; Percentage [gos.T.J.I]; Relative frequency [gos.T.J.I]; Absolute frequency [gos.T.N.N]; Percentage [gos.T.N.N]; Relative frequency [gos.T.N.N]; msd01; msd02
pa pa pa Vp pa 29,153 25.71 % 28,164.24 5,585 23.88 % 24,227.52 11,944 34.72 % 40,413.88 6,807 17.98 % 19,275.15 4,817 27.17 % 30,899.40 V p
da da da Vd da 17,231 15.20 % 16,646.59 3,471 14.84 % 15,057.07 3,882 11.28 % 13,135.19 6,959 18.38 % 19,705.56 2,919 16.46 % 18,724.38 V d
in in in Vp in 15,942 14.06 % 15,401.31 4,159 17.78 % 18,041.58 2,957 8.60 % 10,005.35 7,054 18.63 % 19,974.57 1,772 9.99 % 11,366.77 V p
a a a Vp a 6,517 5.75 % 6,295.97 1,210 5.17 % 5,248.93 2,458 7.14 % 8,316.92 1,664 4.39 % 4,711.89 1,185 6.68 % 7,601.37 V p
če če če Vd če 5,945 5.24 % 5,743.37 1,107 4.73 % 4,802.12 1,744 5.07 % 5,901.02 1,891 4.99 % 5,354.68 1,203 6.79 % 7,716.83 V d
ki ki ki Vd ki 4,454 3.93 % 4,302.94 1,122 4.80 % 4,867.19 318 0.92 % 1,075.99 2,661 7.03 % 7,535.06 353 1.99 % 2,264.37 V d
ampak ampak ampak Vp ampak 3,444 3.04 % 3,327.19 796 3.40 % 3,453.02 778 2.26 % 2,632.45 1,208 3.19 % 3,420.65 662 3.73 % 4,246.50 V p
al ali ali Vp ali 3,231 2.85 % 3,121.42 444 1.90 % 1,926.06 1,388 4.04 % 4,696.46 614 1.62 % 1,738.64 785 4.43 % 5,035.51 V p
ker ker ker Vd ker 3,186 2.81 % 3,077.94 567 2.42 % 2,459.62 702 2.04 % 2,375.30 1,158 3.06 % 3,279.07 759 4.28 % 4,868.72 V d
ko ko ko Vd ko 3,128 2.76 % 3,021.91 597 2.55 % 2,589.76 974 2.83 % 3,295.64 1,111 2.93 % 3,145.98 446 2.52 % 2,860.94 V d
kot kot kot Vd kot 2,267 2.00 % 2,190.11 585 2.50 % 2,537.71 125 0.36 % 422.95 1,263 3.34 % 3,576.39 294 1.66 % 1,885.91 V d
ali ali ali Vp ali 1,482 1.31 % 1,431.74 352 1.50 % 1,526.96 129 0.38 % 436.49 885 2.34 % 2,506.02 116 0.65 % 744.10 V p
sej saj saj Vp saj 1,341 1.18 % 1,295.52 184 0.79 % 798.18 678 1.97 % 2,294.09 177 0.47 % 501.20 302 1.70 % 1,937.23 V p
torej torej torej Vp torej 1,221 1.08 % 1,179.59 436 1.86 % 1,891.35 36 0.10 % 121.81 648 1.71 % 1,834.92 101 0.57 % 647.88 V p
de da da Vd da 1,167 1.03 % 1,127.42 85 0.36 % 368.73 797 2.32 % 2,696.74 106 0.28 % 300.16 179 1.01 % 1,148.22 V d
k ker ker Vd ker 926 0.82 % 894.59 99 0.42 % 429.46 587 1.71 % 1,986.18 81 0.21 % 229.36 159 0.90 % 1,019.93 V d
k ko ko Vd ko 923 0.81 % 891.70 104 0.45 % 451.15 567 1.65 % 1,918.51 98 0.26 % 277.50 154 0.87 % 987.86 V d
oziroma oziroma oziroma Vp oziroma 881 0.78 % 851.12 258 1.10 % 1,119.19 104 0.30 % 351.90 400 1.06 % 1,132.67 119 0.67 % 763.34 V p
k ki ki Vd ki 781
0.69 % 754.51 94 0.40 % 407.77 367 1.07 % 1,241.79 163 0.43 % 461.56 157 0.89 % 1,007.10 V d
sicer sicer sicer Vp sicer 593 0.52 % 572.89 195 0.83 % 845.90 74 0.21 % 250.39 242 0.64 % 685.26 82 0.46 % 526 V p
kjer kjer kjer Vd kjer 484 0.43 % 467.58 111 0.47 % 481.51 23 0.07 % 77.82 283 0.75 % 801.36 67 0.38 % 429.78 V d
zakaj zakaj zakaj Vp zakaj 421 0.37 % 406.72 70 0.30 % 303.66 82 0.24 % 277.46 223 0.59 % 631.46 46 0.26 % 295.07 V p
d da da Vd da 356 0.31 % 343.93 22 0.09 % 95.44 249 0.72 % 842.52 68 0.18 % 192.55 17 0.10 % 109.05 V d
zato zato zato Vp zato 322 0.28 % 311.08 58 0.25 % 251.60 54 0.16 % 182.72 153 0.40 % 433.24 57 0.32 % 365.64 V p
ko kot kot Vd kot 319 0.28 % 308.18 74 0.32 % 321.01 164 0.48 % 554.91 32 0.09 % 90.61 49 0.28 % 314.32 V d
kr ker ker Vd ker 315 0.28 % 304.32 33 0.14 % 143.15 149 0.43 % 504.16 47 0.12 % 133.09 86 0.48 % 551.66 V d
niti niti niti Vp niti 306 0.27 % 295.62 59 0.25 % 255.94 89 0.26 % 301.14 105 0.28 % 297.32 53 0.30 % 339.98 V p
kako kako kako Vd kako 295 0.26 % 284.99 71 0.30 % 308 20 0.06 % 67.67 189 0.50 % 535.18 15 0.09 % 96.22 V d
vendar vendar vendar Vp vendar 276 0.24 % 266.64 84 0.36 % 364.39 1 0 % 3.38 182 0.48 % 515.36 9 0.05 % 57.73 V p
i in in Vp in 253 0.22 % 244.42 10 0.04 % 43.38 224 0.65 % 757.93 9 0.02 % 25.48 10 0.06 % 64.15 V p
k kot kot Vd kot 252 0.22 % 243.45 44 0.19 % 190.87 147 0.43 % 497.39 18 0.05 % 50.97 43 0.24 % 275.83 V d
namreč namreč namreč Vp namreč 223 0.20 % 215.44 71 0.30 % 308 14 0.04 % 47.37 124 0.33 % 351.13 14 0.08 % 89.81 V p

2.3.27 List of particle lemmas in the GOS 1.0 corpus with text-type distribution
File at CLARIN.SI: GOS1.0-words-particles-lemmas-taxonomy-entire.tsv
Columns: Lemma; Lemma (lower-case); Lemma; Lemma (lower-case); Part-of-speech category; Standardized form; Total absolute frequency of lemma; Percentage of all found lemmas; Total relative frequency (per million occurrences); Absolute frequency [gos.T.J.R]; Percentage [gos.T.J.R]; Relative frequency [gos.T.J.R]; Absolute frequency [gos.T.N.Z]; Percentage [gos.T.N.Z]; Relative frequency [gos.T.N.Z]; Absolute frequency [gos.T.J.I]; Percentage [gos.T.J.I]; Relative frequency [gos.T.J.I]; Absolute frequency [gos.T.N.N]; Percentage [gos.T.N.N]; Relative frequency [gos.T.N.N]
ne ne ne ne L ne 31,733 31.63 % 30,656.73 6,435 29.95 % 27,914.79 11,588 32.66 % 39,209.32 7,780 30.90 % 22,030.36 5,930 32.62 % 38,038.91
ja ja ja ja L ja 25,555 25.47 % 24,688.27 4,720 21.97 % 20,475.18 11,541 32.53 % 39,050.29 3,821 15.18 % 10,819.80 5,473 30.10 % 35,107.41
tudi tudi tudi tudi L tudi 7,945 7.92 % 7,675.53 2,076 9.66 % 9,005.61 1,644 4.63 % 5,562.66 2,890 11.48 % 8,183.51 1,335 7.34 % 8,563.57
še še še še L še 7,184 7.16 % 6,940.35 1,782 8.29 % 7,730.25 2,116 5.96 % 7,159.73 2,127 8.45 % 6,022.95 1,159 6.38 % 7,434.59
no no no no L no 4,691 4.68 % 4,531.90 1,289 6.00 % 5,591.63 1,566 4.41 % 5,298.74 1,114 4.42 % 3,154.48 722 3.97 % 4,631.38
že že že že L že 4,447 4.43 % 4,296.17 1,071 4.98 % 4,645.96 1,498 4.22 % 5,068.65 1,191 4.73 % 3,372.51 687 3.78 % 4,406.87
pač pač pač pač L pač 2,851 2.84 % 2,754.30 402 1.87 % 1,743.86 960 2.71 % 3,248.27 830 3.30 % 2,350.28 659 3.62 % 4,227.26
seveda seveda seveda seveda L seveda 1,795 1.79 % 1,734.12 543 2.53 % 2,355.51 188 0.53 % 636.12 896 3.56 % 2,537.17 168 0.92 % 1,077.66
res res res res L res 1,699 1.69 % 1,641.38 467 2.17 % 2,025.83 577 1.63 % 1,952.35 360 1.43 % 1,019.40 295 1.62 % 1,892.32
samo samo samo samo L samo 1,469 1.46 % 1,419.18 254 1.18 % 1,101.84 565 1.59 % 1,911.74 419 1.66 % 1,186.47 231 1.27 % 1,481.79
več več več več L več 1,332 1.33 % 1,286.82 287 1.34 % 1,245 442 1.25 % 1,495.56 445 1.77 % 1,260.09 158 0.87 % 1,013.52
sploh sploh sploh sploh L sploh 895 0.89 % 864.64 161 0.75 % 698.41 354 1.00 % 1,197.80 219 0.87 % 620.13 161 0.89 % 1,032.76
kar kar kar kar L kar 793 0.79 % 766.10 183 0.85 % 793.85 160 0.45 % 541.38 336 1.33 % 951.44 114 0.63 % 731.27
naj naj naj naj L naj 792 0.79 % 765.14 180 0.84 % 780.83 179 0.51 % 605.67 331 1.31 % 937.28 102 0.56 % 654.29
okej okej okej okej L okej 656 0.65 % 633.75 191 0.89 % 828.55 175 0.49 % 592.13 96 0.38 % 271.84 194 1.07 % 1,244.44
pravzaprav pravzaprav pravzaprav pravzaprav L pravzaprav 579 0.58 % 559.36 73 0.34 % 316.67 18 0.05 % 60.91 439 1.74 % 1,243.10 49 0.27 % 314.32
glih glih glih glih L glih 542 0.54 % 523.62 74 0.34 % 321.01 373 1.05 % 1,262.09 36 0.14 % 101.94 59 0.33 % 378.46
itak itak itak itak L itak 527 0.53 % 509.13 50 0.23 % 216.90 389 1.10 % 1,316.23 29 0.12 % 82.12 59 0.33 % 378.46
torej torej torej torej L torej 474 0.47 % 457.92 133 0.62 % 576.95 9 0.03 % 30.45 297 1.18 % 841 35 0.19 % 224.51
vsaj vsaj vsaj vsaj L vsaj 448 0.45 % 432.81 113 0.53 % 490.19 119 0.34 % 402.65 128 0.51 % 362.45 88 0.48 % 564.49
da da da da L da 354 0.35 % 341.99 51 0.24 % 221.24 162 0.46 % 548.15 91 0.36 % 257.68 50 0.28 % 320.73
predvsem predvsem predvsem predvsem L predvsem 335 0.33 % 323.64 107 0.50 % 464.16 10 0.03 % 33.84 187 0.74 % 529.52 31 0.17 % 198.85
morda morda morda morda L morda 333 0.33 % 321.71 138 0.64 % 598.64 27 0.08 % 91.36 151 0.60 % 427.58 17 0.09 % 109.05
skoraj skoraj skoraj skoraj L skoraj 295 0.29 % 284.99 76 0.35 % 329.69 100 0.28 % 338.36 83 0.33 % 235.03 36 0.20 % 230.93
prav prav prav prav L prav 249 0.25 % 240.55 50 0.23 % 216.90 87 0.24 % 294.37 84 0.33 % 237.86 28 0.15 % 179.61
le le le le L le 240 0.24 % 231.86 98 0.46 % 425.12 33 0.09 % 111.66 91 0.36 % 257.68 18 0.10 % 115.46
najbrž najbrž najbrž najbrž L najbrž 217 0.22 % 209.64 44 0.20 % 190.87 63 0.18 % 213.17 64 0.25 % 181.23 46 0.25 % 295.07
mogoče mogoče mogoče mogoče L mogoče 184 0.18 % 177.76 39 0.18 % 169.18 36 0.10 % 121.81 60 0.24 % 169.90 49 0.27 % 314.32
celo celo celo celo L celo 173 0.17 % 167.13 47 0.22 % 203.88 32 0.09 % 108.28 75 0.30 % 212.37 19 0.10 % 121.88
menda menda menda menda L menda 149 0.15 % 143.95 12 0.06 % 52.06 96 0.27 % 324.83 28 0.11 % 79.29 13 0.07 % 83.39
šele
šele šele šele L šele 130 0.13 % 125.59 29 0.14 % 125.80 46 0.13 % 155.65 39 0.15 % 110.43 16 0.09 % 102.63 baje baje baje baje L baje 124 0.12 % 119.79 35 0.16 % 151.83 72 0.20 % 243.62 4 0.02 % 11.33 13 0.07 % 83.39 CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 762 File at CLARIN.SI2.3.28 List of particle lower-case word forms in the GOS 1.0 corpus with standardized word forms, lemmas, morphosyntactic tags and text-type distributionGOS1.0-words-particles-lowercase_forms-standardized_ forms-lemmas-morphosyntactic_tags-taxonomy-entire.tsvLower-case word form Lemma Lemma (lower-case) Morphosyntactic tag Standardized form Total absolute frequency of lower-case word form Percentage of all found lower-case word forms Total relative frequency (per million occurrences) Absolute frequency [gos.T.J.R] Percentage [gos.T.J.R] Relative frequency [gos.T.J.R] Absolute frequency [gos.T.N.Z] Percentage [gos.T.N.Z] Relative frequency [gos.T.N.Z] Absolute frequency [gos.T.J.I] Percentage [gos.T.J.I] Relative frequency [gos.T.J.I] Absolute frequency [gos.T.N.N] Percentage [gos.T.N.N] Relative frequency [gos.T.N.N] msd01 ne ne ne L ne 29,853 29.76 % 28,840.50 6,129 28.52 % 26,587.37 10,303 29.04 % 34,861.37 7,590 30.15 % 21,492.34 5,831 32.07 % 37,403.86 L ja ja ja L ja 25,105 25.02 % 24,253.53 4,628 21.54 % 20,076.09 11,249 31.71 % 38,062.27 3,787 15.04 % 10,723.52 5,441 29.93 % 34,902.14 L še še še L še 7,056 7.03 % 6,816.69 1,756 8.17 % 7,617.46 2,021 5.70 % 6,838.28 2,124 8.44 % 6,014.46 1,155 6.35 % 7,408.93 L no no no L no 4,622 4.61 % 4,465.24 1,282 5.97 % 5,561.27 1,511 4.26 % 5,112.64 1,111 4.41 % 3,145.98 718 3.95 % 4,605.72 L že že že L že 4,273 4.26 % 4,128.08 1,044 4.86 % 4,528.83 1,357 3.83 % 4,591.56 1,185 4.71 % 3,355.52 687 3.78 % 4,406.87 L tudi tudi tudi L tudi 4,119 4.11 % 3,979.30 1,289 6.00 % 5,591.63 315 0.89 % 1,065.84 2,086 8.29 % 5,906.86 429 2.36 % 2,751.89 L tud tudi tudi L tudi 3,570 3.56 % 3,448.92 709 3.30 % 3,075.62 
1,187 3.35 % 4,016.35 795 3.16 % 2,251.17 879 4.83 % 5,638.48 L pač pač pač L pač 2,833 2.82 % 2,736.92 401 1.87 % 1,739.52 954 2.69 % 3,227.97 826 3.28 % 2,338.96 652 3.59 % 4,182.36 L res res res L res 1,625 1.62 % 1,569.89 457 2.13 % 1,982.45 521 1.47 % 1,762.86 355 1.41 % 1,005.24 292 1.61 % 1,873.08 L seveda seveda seveda L seveda 1,593 1.59 % 1,538.97 499 2.32 % 2,164.64 126 0.35 % 426.34 816 3.24 % 2,310.64 152 0.84 % 975.03 L več več več L več 1,312 1.31 % 1,267.50 287 1.34 % 1,245 422 1.19 % 1,427.89 445 1.77 % 1,260.09 158 0.87 % 1,013.52 L sploh sploh sploh L sploh 866 0.86 % 836.63 147 0.68 % 637.68 340 0.96 % 1,150.43 218 0.87 % 617.30 161 0.89 % 1,032.76 L samo samo samo L samo 800 0.80 % 772.87 174 0.81 % 754.81 156 0.44 % 527.84 349 1.39 % 988.25 121 0.67 % 776.17 L okej okej okej L okej 638 0.64 % 616.36 190 0.88 % 824.21 163 0.46 % 551.53 96 0.38 % 271.84 189 1.04 % 1,212.37 L sam samo samo L samo 556 0.55 % 537.14 76 0.35 % 329.69 314 0.89 % 1,062.45 66 0.26 % 186.89 100 0.55 % 641.47 L itak itak itak L itak 527 0.53 % 509.13 50 0.23 % 216.90 389 1.10 % 1,316.23 29 0.12 % 82.12 59 0.33 % 378.46 L nej ne ne L ne 496 0.49 % 479.18 180 0.84 % 780.83 309 0.87 % 1,045.54 4 0.02 % 11.33 3 0.02 % 19.24 L torej torej torej L torej 472 0.47 % 455.99 133 0.62 % 576.95 8 0.02 % 27.07 297 1.18 % 841 34 0.19 % 218.10 L naj naj naj L naj 466 0.47 % 450.20 112 0.52 % 485.85 79 0.22 % 267.31 229 0.91 % 648.45 46 0.25 % 295.07 L kr kar kar L kar 408 0.41 % 394.16 125 0.58 % 542.25 99 0.28 % 334.98 113 0.45 % 319.98 71 0.39 % 455.44 L vsaj vsaj vsaj L vsaj 382 0.38 % 369.04 99 0.46 % 429.46 93 0.26 % 314.68 113 0.45 % 319.98 77 0.42 % 493.93 L kar kar kar L kar 373 0.37 % 360.35 55 0.26 % 238.59 56 0.16 % 189.48 221 0.88 % 625.80 41 0.23 % 263 L glih glih glih L glih 360 0.36 % 347.79 60 0.28 % 260.28 233 0.66 % 788.38 24 0.10 % 67.96 43 0.24 % 275.83 L nje ne ne L ne 340 0.34 % 328.47 21 0.10 % 91.10 197 0.56 % 666.57 106 0.42 % 300.16 16 0.09 % 102.63 L da da da 
L da 324 0.32 % 313.01 49 0.23 % 212.56 141 0.40 % 477.09 86 0.34 % 243.52 48 0.26 % 307.90 L provzaprov pravzaprav pravzaprav L pravzaprav 320 0.32 % 309.15 34 0.16 % 147.49 8 0.02 % 27.07 245 0.97 % 693.76 33 0.18 % 211.68 L morda morda morda L morda 306 0.30 % 295.62 137 0.64 % 594.30 6 0.02 % 20.30 148 0.59 % 419.09 15 0.08 % 96.22 L nej naj naj L naj 305 0.30 % 294.66 50 0.23 % 216.90 99 0.28 % 334.98 100 0.40 % 283.17 56 0.31 % 359.22 L predvsem predvsem predvsem L predvsem 300 0.30 % 289.83 97 0.45 % 420.78 10 0.03 % 33.84 163 0.65 % 461.56 30 0.17 % 192.44 L le le le L le 240 0.24 % 231.86 98 0.46 % 425.12 33 0.09 % 111.66 91 0.36 % 257.68 18 0.10 % 115.46 L pravzaprav pravzaprav pravzaprav L pravzaprav 214 0.21 % 206.74 32 0.15 % 138.81 9 0.03 % 30.45 159 0.63 % 450.23 14 0.08 % 89.81 L najbrž najbrž najbrž L najbrž 210 0.21 % 202.88 41 0.19 % 177.86 59 0.17 % 199.63 64 0.25 % 181.23 46 0.25 % 295.07 L CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 763 File at CLARIN.SI2.3.29 List of interjection lemmas in the GOS 1.0 corpus with text-type distributionGOS1.0-words-interjections-lemmas-taxonomy-entire.tsvLemma Lemma (lower-case) Lemma Lemma (lower-case) Part-of-speech category Standardized form Total absolute frequency of lemma Percentage of all found lemmas Total relative frequency (per million occurrences) Absolute frequency [gos.T.J.R] Percentage [gos.T.J.R] Relative frequency [gos.T.J.R] Absolute frequency [gos.T.N.Z] Percentage [gos.T.N.Z] Relative frequency [gos.T.N.Z] Absolute frequency [gos.T.J.I] Percentage [gos.T.J.I] Relative frequency [gos.T.J.I] Absolute frequency [gos.T.N.N] Percentage [gos.T.N.N] Relative frequency [gos.T.N.N] mhm mhm mhm mhm M mhm 4,476 38.80 % 4,324.19 432 21.74 % 1,874 1,261 30.90 % 4,266.74 975 44.68 % 2,760.87 1,808 55.01 % 11,597.70 aha aha aha aha M aha 2,055 17.81 % 1,985.30 369 18.57 % 1,600.71 627 15.36 % 2,121.53 219 10.04 % 620.13 840 25.55 % 5,388.31 recimo recimo recimo recimo M 
recimo 868 7.52 % 838.56 186 9.36 % 806.86 131 3.21 % 443.25 376 17.23 % 1,064.71 175 5.32 % 1,122.56 aja aja aja aja M aja 762 6.61 % 736.16 66 3.32 % 286.31 512 12.55 % 1,732.41 84 3.85 % 237.86 100 3.04 % 641.47 ej ej ej ej M ej 616 5.34 % 595.11 91 4.58 % 394.75 427 10.46 % 1,444.80 21 0.96 % 59.46 77 2.34 % 493.93 joj joj joj joj M joj 436 3.78 % 421.21 99 4.98 % 429.46 257 6.30 % 869.59 32 1.47 % 90.61 48 1.46 % 307.90 prosim prosim prosim prosim M prosim 413 3.58 % 398.99 62 3.12 % 268.95 37 0.91 % 125.19 259 11.87 % 733.40 55 1.67 % 352.81 eh eh eh eh M eh 215 1.86 % 207.71 26 1.31 % 112.79 168 4.12 % 568.45 9 0.41 % 25.48 12 0.36 % 76.98 hm hm hm hm M hm 209 1.81 % 201.91 34 1.71 % 147.49 89 2.18 % 301.14 50 2.29 % 141.58 36 1.09 % 230.93 ha ha ha ha M ha 203 1.76 % 196.11 75 3.77 % 325.35 102 2.50 % 345.13 10 0.46 % 28.32 16 0.49 % 102.63 čao čao čao čao M čao 177 1.53 % 171 114 5.74 % 494.53 34 0.83 % 115.04 6 0.28 % 16.99 23 0.70 % 147.54 ah ah ah ah M ah 145 1.26 % 140.08 28 1.41 % 121.46 90 2.21 % 304.53 12 0.55 % 33.98 15 0.46 % 96.22 bravo bravo bravo bravo M bravo 117 1.01 % 113.03 88 4.43 % 381.74 3 0.07 % 10.15 23 1.05 % 65.13 3 0.09 % 19.24 adijo adijo adijo adijo M adijo 72 0.62 % 69.56 37 1.86 % 160.50 13 0.32 % 43.99 13 0.60 % 36.81 9 0.27 % 57.73 zdravo zdravo zdravo zdravo M zdravo 70 0.61 % 67.63 35 1.76 % 151.83 13 0.32 % 43.99 13 0.60 % 36.81 9 0.27 % 57.73 he he he he M he 64 0.56 % 61.83 5 0.25 % 21.69 52 1.27 % 175.95 3 0.14 % 8.49 4 0.12 % 25.66 oh oh oh oh M oh 57 0.49 % 55.07 12 0.60 % 52.06 25 0.61 % 84.59 10 0.46 % 28.32 10 0.30 % 64.15 la la la la M la 55 0.48 % 53.13 32 1.61 % 138.81 11 0.27 % 37.22 0 0 % 0 12 0.36 % 76.98 fak fak fak fak M fak 53 0.46 % 51.20 0 0 % 0 53 1.30 % 179.33 0 0 % 0 0 0 % 0 halo halo halo halo M halo 42 0.36 % 40.58 15 0.76 % 65.07 20 0.49 % 67.67 5 0.23 % 14.16 2 0.06 % 12.83 ho ho ho ho M ho 41 0.35 % 39.61 35 1.76 % 151.83 5 0.12 % 16.92 1 0.05 % 2.83 0 0 % 0 hop hop hop hop M hop 36 0.31 % 34.78 
29 1.46 % 125.80 5 0.12 % 16.92 1 0.05 % 2.83 1 0.03 % 6.41 bla bla bla bla M bla 34 0.29 % 32.85 2 0.10 % 8.68 23 0.56 % 77.82 5 0.23 % 14.16 4 0.12 % 25.66 hej hej hej hej M hej 23 0.20 % 22.22 12 0.60 % 52.06 7 0.17 % 23.69 3 0.14 % 8.49 1 0.03 % 6.41 ojej ojej ojej ojej M ojej 23 0.20 % 22.22 4 0.20 % 17.35 13 0.32 % 43.99 2 0.09 % 5.66 4 0.12 % 25.66 opa opa opa opa M opa 23 0.20 % 22.22 14 0.70 % 60.73 6 0.15 % 20.30 1 0.05 % 2.83 2 0.06 % 12.83 uh uh uh uh M uh 23 0.20 % 22.22 7 0.35 % 30.37 10 0.24 % 33.84 2 0.09 % 5.66 4 0.12 % 25.66 nasvidenje nasvidenje nasvidenje nasvidenje M nasvidenje 19 0.17 % 18.36 12 0.60 % 52.06 0 0 % 0 7 0.32 % 19.82 0 0 % 0 šit šit šit šit M šit 19 0.17 % 18.36 2 0.10 % 8.68 15 0.37 % 50.75 0 0 % 0 2 0.06 % 12.83 alo alo alo alo M alo 17 0.15 % 16.42 7 0.35 % 30.37 7 0.17 % 23.69 3 0.14 % 8.49 0 0 % 0 jebemti jebemti jebemti jebemti M jebemti 16 0.14 % 15.46 5 0.25 % 21.69 11 0.27 % 37.22 0 0 % 0 0 0 % 0 oj oj oj oj M oj 14 0.12 % 13.53 1 0.05 % 4.34 12 0.29 % 40.60 0 0 % 0 1 0.03 % 6.41 CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 764 File at CLARIN.SI2.3.30 List of interjection lower-case word forms in the GOS 1.0 corpus with standardized word forms, lemmas, morphosyntactic tags and text-type distributionGOS1.0-words-interjections-lowercase_forms-standardized_ forms-lemmas-morphosyntactic_tags-taxonomy-entire.tsvLower-case word form Lemma Lemma (lower-case) Morphosyntactic tag Standardized form Total absolute frequency of lower-case word form Percentage of all found lower-case word forms Total relative frequency (per million occurrences) Absolute frequency [gos.T.J.R] Percentage [gos.T.J.R] Relative frequency [gos.T.J.R] Absolute frequency [gos.T.N.Z] Percentage [gos.T.N.Z] Relative frequency [gos.T.N.Z] Absolute frequency [gos.T.J.I] Percentage [gos.T.J.I] Relative frequency [gos.T.J.I] Absolute frequency [gos.T.N.N] Percentage [gos.T.N.N] Relative frequency [gos.T.N.N] msd01 mhm mhm mhm M 
mhm 4,475 38.79 % 4,323.22 431 21.69 % 1,869.66 1,261 30.90 % 4,266.74 975 44.68 % 2,760.87 1,808 55.01 % 11,597.70 M aha aha aha M aha 2,051 17.78 % 1,981.44 368 18.52 % 1,596.37 626 15.34 % 2,118.14 217 9.95 % 614.47 840 25.55 % 5,388.31 M recimo recimo recimo M recimo 827 7.17 % 798.95 180 9.06 % 780.83 119 2.92 % 402.65 357 16.36 % 1,010.90 171 5.20 % 1,096.91 M aja aja aja M aja 758 6.57 % 732.29 66 3.32 % 286.31 508 12.45 % 1,718.88 84 3.85 % 237.86 100 3.04 % 641.47 M ej ej ej M ej 615 5.33 % 594.14 91 4.58 % 394.75 426 10.44 % 1,441.42 21 0.96 % 59.46 77 2.34 % 493.93 M joj joj joj M joj 364 3.15 % 351.65 86 4.33 % 373.06 211 5.17 % 713.94 32 1.47 % 90.61 35 1.06 % 224.51 M prosim prosim prosim M prosim 290 2.51 % 280.16 42 2.11 % 182.19 21 0.52 % 71.06 187 8.57 % 529.52 40 1.22 % 256.59 M eh eh eh M eh 215 1.86 % 207.71 26 1.31 % 112.79 168 4.12 % 568.45 9 0.41 % 25.48 12 0.36 % 76.98 M hm hm hm M hm 208 1.80 % 200.95 34 1.71 % 147.49 88 2.16 % 297.76 50 2.29 % 141.58 36 1.09 % 230.93 M ha ha ha M ha 203 1.76 % 196.11 75 3.77 % 325.35 102 2.50 % 345.13 10 0.46 % 28.32 16 0.49 % 102.63 M čav čao čao M čao 150 1.30 % 144.91 102 5.13 % 442.47 30 0.73 % 101.51 0 0 % 0 18 0.55 % 115.46 M ah ah ah M ah 145 1.26 % 140.08 28 1.41 % 121.46 90 2.21 % 304.53 12 0.55 % 33.98 15 0.46 % 96.22 M bravo bravo bravo M bravo 116 1.00 % 112.07 88 4.43 % 381.74 3 0.07 % 10.15 22 1.01 % 62.30 3 0.09 % 19.24 M prosm prosim prosim M prosim 104 0.90 % 100.47 18 0.91 % 78.08 8 0.20 % 27.07 65 2.98 % 184.06 13 0.40 % 83.39 M zdravo zdravo zdravo M zdravo 69 0.60 % 66.66 35 1.76 % 151.83 12 0.29 % 40.60 13 0.60 % 36.81 9 0.27 % 57.73 M he he he M he 61 0.53 % 58.93 4 0.20 % 17.35 50 1.23 % 169.18 3 0.14 % 8.49 4 0.12 % 25.66 M la la la M la 55 0.48 % 53.13 32 1.61 % 138.81 11 0.27 % 37.22 0 0 % 0 12 0.36 % 76.98 M oh oh oh M oh 54 0.47 % 52.17 10 0.50 % 43.38 24 0.59 % 81.21 10 0.46 % 28.32 10 0.30 % 64.15 M fak fak fak M fak 53 0.46 % 51.20 0 0 % 0 53 1.30 % 179.33 0 0 % 0 0 0 % 0M 
adijo adijo adijo M adijo 51 0.44 % 49.27 29 1.46 % 125.80 8 0.20 % 27.07 9 0.41 % 25.48 5 0.15 % 32.07 M halo halo halo M halo 42 0.36 % 40.58 15 0.76 % 65.07 20 0.49 % 67.67 5 0.23 % 14.16 2 0.06 % 12.83 M ho ho ho M ho 38 0.33 % 36.71 32 1.61 % 138.81 5 0.12 % 16.92 1 0.05 % 2.83 0 0 % 0M hop hop hop M hop 35 0.30 % 33.81 29 1.46 % 125.80 4 0.10 % 13.53 1 0.05 % 2.83 1 0.03 % 6.41 M bla bla bla M bla 31 0.27 % 29.95 2 0.10 % 8.68 20 0.49 % 67.67 5 0.23 % 14.16 4 0.12 % 25.66 M jo joj joj M joj 30 0.26 % 28.98 2 0.10 % 8.68 23 0.56 % 77.82 0 0 % 0 5 0.15 % 32.07 M čao čao čao M čao 24 0.21 % 23.19 11 0.55 % 47.72 3 0.07 % 10.15 6 0.28 % 16.99 4 0.12 % 25.66 M hej hej hej M hej 23 0.20 % 22.22 12 0.60 % 52.06 7 0.17 % 23.69 3 0.14 % 8.49 1 0.03 % 6.41 M opa opa opa M opa 23 0.20 % 22.22 14 0.70 % 60.73 6 0.15 % 20.30 1 0.05 % 2.83 2 0.06 % 12.83 M uh uh uh M uh 23 0.20 % 22.22 7 0.35 % 30.37 10 0.24 % 33.84 2 0.09 % 5.66 4 0.12 % 25.66 M rečmo recimo recimo M recimo 22 0.19 % 21.25 2 0.10 % 8.68 5 0.12 % 16.92 14 0.64 % 39.64 1 0.03 % 6.41 M ojej ojej ojej M ojej 21 0.18 % 20.29 3 0.15 % 13.01 12 0.29 % 40.60 2 0.09 % 5.66 4 0.12 % 25.66 M dijo adijo adijo M adijo 20 0.17 % 19.32 8 0.40 % 34.70 4 0.10 % 13.53 4 0.18 % 11.33 4 0.12 % 25.66 M CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 765 File at CLARIN.SI2.3.31 List of abbreviation lemmas in the GOS 1.0 corpus with text-type distributionGOS1.0-words-abbreviations-lemmas-taxonomy-entire.tsvLemma Lemma (lower-case) Lemma Lemma (lower-case) Part-of-speech category Standardized form Total absolute frequency of lemma Percentage of all found lemmas Total relative frequency (per million occurrences) Absolute frequency [gos.T.J.R] Percentage [gos.T.J.R] Relative frequency [gos.T.J.R] Absolute frequency [gos.T.N.Z] Percentage [gos.T.N.Z] Relative frequency [gos.T.N.Z] Absolute frequency [gos.T.J.I] Percentage [gos.T.J.I] Relative frequency [gos.T.J.I] Absolute frequency [gos.T.N.N] 
Percentage [gos.T.N.N] Relative frequency [gos.T.N.N]
čim čim čim čim O čim 64 58.72 % 61.83 6 60.00 % 26.03 17 94.44 % 57.52 28 42.42 % 79.29 13 86.67 % 83.39
pH ph pH ph O PH 37 33.95 % 35.75 0 0 % 0 0 0 % 0 37 56.06 % 104.77 0 0 % 0
W w W w O W 3 2.75 % 2.90 2 20.00 % 8.68 0 0 % 0 0 0 % 0 1 6.67 % 6.41
Al al Al al O al 1 0.92 % 0.97 0 0 % 0 0 0 % 0 1 1.51 % 2.83 0 0 % 0
Mg mg Mg mg O MG 1 0.92 % 0.97 0 0 % 0 0 0 % 0 0 0 % 0 1 6.67 % 6.41
WWW www WWW www O WWW 1 0.92 % 0.97 1 10.00 % 4.34 0 0 % 0 0 0 % 0 0 0 % 0
najsi najsi najsi najsi O najsi 1 0.92 % 0.97 1 10.00 % 4.34 0 0 % 0 0 0 % 0 0 0 % 0
pH ph pH ph O ph 1 0.92 % 0.97 0 0 % 0 1 5.56 % 3.38 0 0 % 0 0 0 % 0

2.3.32 List of abbreviation lower-case word forms in the GOS 1.0 corpus with standardized word forms, lemmas, morphosyntactic tags and text-type distribution
File at CLARIN.SI: GOS1.0-words-abbreviations-lowercase_forms-standardized_forms-lemmas-morphosyntactic_tags-taxonomy-entire.tsv

Lower-case word form | Lemma | Lemma (lower-case) | Morphosyntactic tag | Standardized form | Total absolute frequency of lower-case word form | Percentage of all found lower-case word forms | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N] | msd01
čim čim čim O čim 63 57.80 % 60.86 6 60.00 % 26.03 16 88.89 % 54.14 28 42.42 % 79.29 13 86.67 % 83.39 O
peha pH ph O PH 37 33.95 % 35.75 0 0 % 0 0 0 % 0 37 56.06 % 104.77 0 0 % 0 O
ve W w O W 3 2.75 % 2.90 2 20.00 % 8.68 0 0 % 0 0 0 % 0 1 6.67 % 6.41 O
al Al al O al 1 0.92 % 0.97 0 0 % 0 0 0 % 0 1 1.51 % 2.83 0 0 % 0 O
emge Mg mg O MG 1 0.92 % 0.97 0 0 % 0 0 0 % 0 0 0 % 0 1 6.67 % 6.41 O
najsi najsi najsi O najsi 1 0.92 % 0.97 1 10.00 % 4.34 0 0 % 0 0 0 % 0 0 0 % 0 O
ph pH ph O ph 1 0.92 % 0.97 0 0 % 0 1 5.56 % 3.38 0 0 % 0 0 0 % 0 O
veveve WWW www O WWW 1 0.92 % 0.97 1 10.00 % 4.34 0 0 % 0 0 0 % 0 0 0 % 0 O
čimveč čim čim O čim 1 0.92 % 0.97 0 0 % 0 1 5.56 % 3.38 0 0 % 0 0 0 % 0 O

2.3.33 List of residual lemmas in the GOS 1.0 corpus with text-type distribution
File at CLARIN.SI: GOS1.0-words-residual-lemmas-taxonomy-entire.tsv

Lemma | Lemma (lower-case) | Lemma | Lemma (lower-case) | Part-of-speech category | Standardized form | Total absolute frequency of lemma | Percentage of all found lemmas | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N]
eee eee eee eee N eee 23,222 42.92 % 22,434.40 4,248 33.09 % 18,427.66 4,527 29.40 % 15,317.62 10,556 59.40 % 29,891.07 3,891 48.07 % 24,959.43
eem eem eem eem N eem 2,950 5.45 % 2,849.95 392 3.05 % 1,700.48 713 4.63 % 2,412.52 1,210 6.81 % 3,426.32 635 7.84 % 4,073.31
s s s s N s 1,327 2.45 % 1,281.99 213 1.66 % 923.99 488 3.17 % 1,651.20 373 2.10 % 1,056.21 253 3.12 % 1,622.91
ka ka ka ka N ka 820 1.52 % 792.19 466 3.63 % 2,021.49 280 1.82 % 947.41 42 0.24 % 118.93 32 0.40 % 205.27
n n n n N n 752 1.39 % 726.49 107 0.83 % 464.16 299 1.94 % 1,011.70 193 1.09 % 546.51 153 1.89 % 981.44
p p p p N p 497 0.92 % 480.14 81 0.63 % 351.37 167 1.08 % 565.06 141 0.79 % 399.26 108 1.33 % 692.78
z z z z N z 486 0.90 % 469.52 79 0.61 % 342.70 159 1.03 % 537.99 150 0.84 % 424.75 98 1.21 % 628.64
t t t t N t 463 0.86 % 447.30 76 0.59 % 329.69 186 1.21 % 629.35 91 0.51 % 257.68 110 1.36 % 705.61
m m m m N m 422 0.78 %
407.69 79 0.61 % 342.70 160 1.04 % 541.38 93 0.52 % 263.34 90 1.11 % 577.32 j j j j N j 364 0.67 % 351.65 89 0.69 % 386.08 136 0.88 % 460.17 75 0.42 % 212.37 64 0.79 % 410.54 k k k k N k 363 0.67 % 350.69 72 0.56 % 312.33 123 0.80 % 416.18 92 0.52 % 260.51 76 0.94 % 487.51 v v v v N v 332 0.61 % 320.74 67 0.52 % 290.64 92 0.60 % 311.29 119 0.67 % 336.97 54 0.67 % 346.39 da da da da N da 331 0.61 % 319.77 61 0.47 % 264.62 129 0.84 % 436.49 90 0.51 % 254.85 51 0.63 % 327.15 aaa aaa aaa aaa N aaa 288 0.53 % 278.23 67 0.52 % 290.64 95 0.62 % 321.44 95 0.54 % 269.01 31 0.38 % 198.85 nnn nnn nnn nnn N nnn 240 0.44 % 231.86 30 0.23 % 130.14 99 0.64 % 334.98 59 0.33 % 167.07 52 0.64 % 333.56 e e e e N e 224 0.41 % 216.40 63 0.49 % 273.29 96 0.62 % 324.83 40 0.23 % 113.27 25 0.31 % 160.37 po po po po N po 210 0.39 % 202.88 35 0.27 % 151.83 55 0.36 % 186.10 80 0.45 % 226.53 40 0.49 % 256.59 o o o o N o 199 0.37 % 192.25 32 0.25 % 138.81 52 0.34 % 175.95 69 0.39 % 195.38 46 0.57 % 295.07 kao kao kao kao N kao 198 0.37 % 191.28 9 0.07 % 39.04 165 1.07 % 558.30 5 0.03 % 14.16 19 0.23 % 121.88 š š š š N š 173 0.32 % 167.13 34 0.27 % 147.49 65 0.42 % 219.93 37 0.21 % 104.77 37 0.46 % 237.34 ooo ooo ooo ooo N ooo 166 0.31 % 160.37 83 0.65 % 360.05 57 0.37 % 192.87 10 0.06 % 28.32 16 0.20 % 102.63 b b b b N b 165 0.30 % 159.40 50 0.39 % 216.90 38 0.25 % 128.58 71 0.40 % 201.05 6 0.07 % 38.49 bi bi bi bi N bi 160 0.30 % 154.57 21 0.16 % 91.10 74 0.48 % 250.39 18 0.10 % 50.97 47 0.58 % 301.49 i i i i N i 154 0.28 % 148.78 25 0.20 % 108.45 50 0.33 % 169.18 46 0.26 % 130.26 33 0.41 % 211.68 živjo živjo živjo živjo N živjo 149 0.28 % 143.95 74 0.58 % 321.01 22 0.14 % 74.44 35 0.20 % 99.11 18 0.22 % 115.46 tipo tipo tipo tipo N tipo 145 0.27 % 140.08 1 0.01 % 4.34 144 0.94 % 487.24 0 0 % 0 0 0 % 0 na na na na N na 143 0.26 % 138.15 18 0.14 % 78.08 49 0.32 % 165.80 46 0.26 % 130.26 30 0.37 % 192.44 B b B b N B 112 0.21 % 108.20 23 0.18 % 99.77 3 0.02 % 10.15 68 0.38 % 192.55 18 0.22 % 
115.46 za za za za N za 108 0.20 % 104.34 23 0.18 % 99.77 29 0.19 % 98.12 34 0.19 % 96.28 22 0.27 % 141.12 pre pre pre pre N pre 107 0.20 % 103.37 27 0.21 % 117.12 24 0.16 % 81.21 34 0.19 % 96.28 22 0.27 % 141.12 uuu uuu uuu uuu N uuu 104 0.19 % 100.47 58 0.45 % 251.60 32 0.21 % 108.28 8 0.04 % 22.65 6 0.07 % 38.49 pri pri pri pri N pri 103 0.19 % 99.51 17 0.13 % 73.75 31 0.20 % 104.89 35 0.20 % 99.11 20 0.25 % 128.29 CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 768 File at CLARIN.SI2.3.34 List of residual lower-case word forms in the GOS 1.0 corpus with standardized word forms, lemmas, morphosyntactic tags and text-type distributionGOS1.0-words-residual-lowercase_forms-standardized_ forms-lemmas-morphosyntactic_tags-taxonomy-entire.tsvLower-case word form Lemma Lemma (lower-case) Morphosyntactic tag Standardized form Total absolute frequency of lower-case word form Percentage of all found lower-case word forms Total relative frequency (per million occurrences) Absolute frequency [gos.T.J.R] Percentage [gos.T.J.R] Relative frequency [gos.T.J.R] Absolute frequency [gos.T.N.Z] Percentage [gos.T.N.Z] Relative frequency [gos.T.N.Z] Absolute frequency [gos.T.J.I] Percentage [gos.T.J.I] Relative frequency [gos.T.J.I] Absolute frequency [gos.T.N.N] Percentage [gos.T.N.N] Relative frequency [gos.T.N.N] msd01 msd02 eee eee eee N eee 23,222 42.92 % 22,434.40 4,248 33.09 % 18,427.66 4,527 29.40 % 15,317.62 10,556 59.40 % 29,891.07 3,891 48.07 % 24,959.43 N eem eem eem N eem 2,950 5.45 % 2,849.95 392 3.05 % 1,700.48 713 4.63 % 2,412.52 1,210 6.81 % 3,426.32 635 7.84 % 4,073.31 N s s s N s 1,327 2.45 % 1,281.99 213 1.66 % 923.99 488 3.17 % 1,651.20 373 2.10 % 1,056.21 253 3.12 % 1,622.91 N ka ka ka N ka 762 1.41 % 736.16 423 3.29 % 1,834.96 265 1.72 % 896.66 42 0.24 % 118.93 32 0.40 % 205.27 N n n n N n 750 1.39 % 724.56 107 0.83 % 464.16 297 1.93 % 1,004.93 193 1.09 % 546.51 153 1.89 % 981.44 N p p p N p 497 0.92 % 480.14 81 0.63 % 351.37 167 
1.08 % 565.06 141 0.79 % 399.26 108 1.33 % 692.78 N z z z N z 486 0.90 % 469.52 79 0.61 % 342.70 159 1.03 % 537.99 150 0.84 % 424.75 98 1.21 % 628.64 N t t t N t 462 0.85 % 446.33 76 0.59 % 329.69 185 1.20 % 625.97 91 0.51 % 257.68 110 1.36 % 705.61 N m m m N m 422 0.78 % 407.69 79 0.61 % 342.70 160 1.04 % 541.38 93 0.52 % 263.34 90 1.11 % 577.32 N k k k N k 363 0.67 % 350.69 72 0.56 % 312.33 123 0.80 % 416.18 92 0.52 % 260.51 76 0.94 % 487.51 N j j j N j 359 0.66 % 346.82 89 0.69 % 386.08 132 0.86 % 446.64 74 0.42 % 209.54 64 0.79 % 410.54 N aaa aaa aaa N aaa 288 0.53 % 278.23 67 0.52 % 290.64 95 0.62 % 321.44 95 0.54 % 269.01 31 0.38 % 198.85 N d da da N da 273 0.51 % 263.74 56 0.44 % 242.93 108 0.70 % 365.43 72 0.41 % 203.88 37 0.46 % 237.34 N v v v N v 267 0.49 % 257.94 58 0.45 % 251.60 66 0.43 % 223.32 107 0.60 % 302.99 36 0.45 % 230.93 N nnn nnn nnn N nnn 240 0.44 % 231.86 30 0.23 % 130.14 99 0.64 % 334.98 59 0.33 % 167.07 52 0.64 % 333.56 N e e e N e 221 0.41 % 213.50 60 0.47 % 260.28 96 0.62 % 324.83 40 0.23 % 113.27 25 0.31 % 160.37 N po po po N po 206 0.38 % 199.01 35 0.27 % 151.83 51 0.33 % 172.56 80 0.45 % 226.53 40 0.49 % 256.59 N o o o N o 197 0.36 % 190.32 32 0.25 % 138.81 51 0.33 % 172.56 68 0.38 % 192.55 46 0.57 % 295.07 N kao kao kao Nt kao 196 0.36 % 189.35 8 0.06 % 34.70 164 1.06 % 554.91 5 0.03 % 14.16 19 0.23 % 121.88 N t š š š N š 173 0.32 % 167.13 34 0.27 % 147.49 65 0.42 % 219.93 37 0.21 % 104.77 37 0.46 % 237.34 N ooo ooo ooo N ooo 166 0.31 % 160.37 83 0.65 % 360.05 57 0.37 % 192.87 10 0.06 % 28.32 16 0.20 % 102.63 N b b b N b 165 0.30 % 159.40 50 0.39 % 216.90 38 0.25 % 128.58 71 0.40 % 201.05 6 0.07 % 38.49 N i i i N i 154 0.28 % 148.78 25 0.20 % 108.45 50 0.33 % 169.18 46 0.26 % 130.26 33 0.41 % 211.68 N b bi bi N bi 147 0.27 % 142.01 18 0.14 % 78.08 71 0.46 % 240.24 12 0.07 % 33.98 46 0.57 % 295.07 N tipo tipo tipo Nt tipo 145 0.27 % 140.08 1 0.01 % 4.34 144 0.94 % 487.24 0 0 % 0 0 0 % 0N t na na na N na 143 0.26 % 138.15 18 0.14 % 
78.08 49 0.32 % 165.80 46 0.26 % 130.26 30 0.37 % 192.44 N
živjo živjo živjo Nt živjo 134 0.25 % 129.46 66 0.51 % 286.31 18 0.12 % 60.91 33 0.19 % 93.44 17 0.21 % 109.05 N t
za za za N za 108 0.20 % 104.34 23 0.18 % 99.77 29 0.19 % 98.12 34 0.19 % 96.28 22 0.27 % 141.12 N
pre pre pre N pre 100 0.18 % 96.61 20 0.16 % 86.76 24 0.16 % 81.21 34 0.19 % 96.28 22 0.27 % 141.12 N
g g g N g 90 0.17 % 86.95 25 0.20 % 108.45 21 0.14 % 71.06 30 0.17 % 84.95 14 0.17 % 89.81 N
u u u N u 90 0.17 % 86.95 18 0.14 % 78.08 20 0.13 % 67.67 45 0.25 % 127.42 7 0.09 % 44.90 N
ne ne ne N ne 85 0.16 % 82.12 22 0.17 % 95.44 21 0.14 % 71.06 29 0.16 % 82.12 13 0.16 % 83.39 N

2.4. Frequency lists of word sets from the GOS 1.0 corpus

The frequency lists of word sets from the GOS 1.0 corpus contain consecutive sequences of words, with no skipped words. For example, the string “prevajati roman” is extracted only when the two words are directly adjacent in the text; its frequency therefore does not include cases with one or more intervening words, such as “prevajati italijanski roman” or “prevajati novi italijanski roman”. The length of the extracted word sets ranges from 1 to 5 words.

Along with the word sets, the lists contain their total absolute frequencies (fas), i.e. the number of their occurrences in the corpus, and their percentages (psn), based on the sum of the frequencies of all extracted sets (N) of equal length (n) in the corpus:

$$ \mathit{psn} = \frac{\mathit{fas}}{N} \times 100 $$

The total relative frequency of a word set (frs), expressed per million occurrences, is calculated from the total absolute frequency of the word set (fas) and the sum of the absolute frequencies (faw) of all m words in the corpus:

$$ \mathit{frs} = \frac{\mathit{fas}}{\sum_{i=1}^{m} \mathit{faw}_i} \times 1{,}000{,}000 $$

The same method is used to calculate the absolute frequencies, percentages, and relative frequencies in the subcorpora containing only texts from a specific taxonomy branch. The main difference is that, instead of the sums from the entire corpus (e.g. the number of all words or the absolute frequency of a word set), the formulae take into account only the partial values extracted from the relevant subcorpus (e.g. the number of all words in public discourse texts and the absolute frequency of a word set in public discourse texts).

The lists of word sets also contain five collocability measures that indicate how typical a certain word set is: t-score, MI (mutual information), MI3 (mutual information cubed), logDice, and simple LL (simple log-likelihood). The measures are calculated using the following formulae, based on the observed (O) and expected (E) frequencies of the word set:

$$ O = \mathit{fas}, \qquad E = \frac{\prod_{i=1}^{n} \mathit{fw}_i}{N^{\,n-1}} $$

$$ \text{t-score} = \frac{O - E}{\sqrt{O}} $$

$$ \text{MI} = \log_2 \frac{O}{E} $$

$$ \text{MI3} = \log_2 \frac{O^3}{E} $$

$$ \text{logDice} = 14 + \log_2 \frac{n \cdot O}{\sum_{i=1}^{n} \mathit{fw}_i} $$

$$ \text{simple LL} = 2 \left( O \cdot \log_{10} \frac{O}{E} - (O - E) \right) $$

O ... observed frequency of the word set
E ... expected frequency of the word set
fas ... absolute frequency of the word set in the (sub)corpus
N ... frequency of all word sets in the (sub)corpus
n ... length of the word set (number of words)
fw ... absolute frequency of the word in the (sub)corpus (fw_i for the i-th word of the set)

Tables 2.4.1 to 2.4.4 contain lists of word sets extracted from standardized word forms. Tables 2.4.5 to 2.4.8 contain lists of word sets extracted from lower-case word forms, along with their lemmas (e.g. “prevajal roman” → “prevajati roman”). Tables 2.4.9 to 2.4.12 contain word-level n-grams consisting of morphosyntactic tags only, without word forms, standardized word forms, or lemmas.
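The calculations described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the code used by LIST: all function and parameter names are hypothetical, the measures follow their commonly used definitions, and the expected frequency is assumed to be the product of the word frequencies divided by N^(n-1). The base-10 logarithm in simple LL is likewise an assumption, chosen because it can produce negative values such as those listed for some word sets in the tables (with a natural logarithm the measure is never negative).

```python
import math
from typing import Dict, Sequence


def percentage(f_as: int, n_sets: int) -> float:
    """Share of one word set among all extracted sets of the same length (psn), in percent."""
    return f_as / n_sets * 100


def relative_frequency(f_as: int, n_words: int) -> float:
    """Occurrences of the word set per million words (frs)."""
    return f_as / n_words * 1_000_000


def collocation_measures(f_as: int, word_freqs: Sequence[int], n_total: int) -> Dict[str, float]:
    """Collocability measures for one word set.

    f_as       -- absolute frequency of the word set (observed frequency O)
    word_freqs -- absolute frequencies fw of the individual words in the set
    n_total    -- N from the legend (assumed to be the normalizing total)
    """
    n = len(word_freqs)
    observed = float(f_as)
    # Assumption: expected frequency under independence of the words.
    expected = math.prod(word_freqs) / n_total ** (n - 1)
    dice = n * observed / sum(word_freqs)
    return {
        "t_score": (observed - expected) / math.sqrt(observed),
        "MI": math.log2(observed / expected),
        "MI3": math.log2(observed ** 3 / expected),
        "logDice": 14 + math.log2(dice),
        # Assumption: base-10 logarithm (see the note above the code).
        "simple_LL": 2 * (observed * math.log10(observed / expected) - (observed - expected)),
    }


if __name__ == "__main__":
    # Invented numbers for illustration only; they are not taken from GOS 1.0.
    example = collocation_measures(f_as=100, word_freqs=[1000, 500], n_total=100_000)
    for name, value in example.items():
        print(f"{name}: {value:.2f}")
```

For a subcorpus, the same functions would be called with the frequencies and totals extracted from that taxonomy branch only.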
Frequency lists of word sets from the GOS 1.0 corpus

2.4.1 List of all word-level 2-grams from standardized word forms in the GOS 1.0 corpus with text-type distribution and collocation measures
File at CLARIN.SI: GOS1.0-word_sets-standardized_forms-2grams-taxonomy-collocativity-entire.tsv

Standardized form of string | Total absolute frequency of standardized form | Percentage of all found standardized forms | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N] | Dice | t-score | MI | MI3 | logDice | simple LL
ja ja 3,926 0.42 % 3,792.84 655 0.32 % 2,841.37 1,899 0.73 % 6,425.48 166 0.05 % 470.06 1,206 0.87 % 7,736.08 0.15 52.58 2.64 26.51 11.30 -358.51
to je 3,889 0.42 % 3,757.10 889 0.43 % 3,856.45 1,176 0.45 % 3,979.13 1,116 0.35 % 3,160.14 708 0.51 % 4,541.58 0.14 51.64 2.54 26.39 11.15 -493.29
da je 2,732 0.30 % 2,639.34 395 0.19 % 1,713.49 720 0.28 % 2,436.20 1,248 0.39 % 3,533.92 369 0.27 % 2,367.01 0.09 37.75 1.85 24.68 10.58 -906.36
ne vem 2,731 0.30 % 2,638.37 362 0.18 % 1,570.34 1,440 0.56 % 4,872.40 385 0.12 % 1,090.19 544 0.39 % 3,489.57 0.15 50.16 4.64 27.47 11.30 2,386.85
je pa 2,033 0.22 % 1,964.05 302 0.15 % 1,310.06 872 0.34 % 2,950.51 523 0.16 % 1,480.96 336 0.24 % 2,155.32 0.06 21.50 0.93 22.91 9.96 -794.79
je to 1,889 0.20 % 1,824.93 321 0.16 % 1,392.49 536 0.21 % 1,813.62 627 0.20 % 1,775.45 405 0.29 % 2,597.94 0.07 28.08 1.50 23.27 10.11 -736.69
a ne 1,839 0.20 % 1,776.63 171 0.08 % 741.79 425 0.16 % 1,438.04 856 0.27 % 2,423.91 387 0.28 % 2,482.47 0.10 38.07 3.16 24.85 10.61 228.68
tako da 1,810 0.20 % 1,748.61 390 0.19 % 1,691.81 556 0.21 % 1,881.29 427 0.13 % 1,209.12 437 0.32 % 2,803.20 0.12 37.59 3.10 24.75 10.88 182.63
je bilo 1,682 0.18 % 1,624.95 316 0.15 % 1,370.80 761 0.29 % 2,574.93 445 0.14 % 1,260.09 160 0.12 % 1,026.34 0.08 38.16 3.84 25.28 10.40 763.72
pa je 1,556 0.17 % 1,503.23 368 0.18 % 1,596.37 591 0.23 % 1,999.72 402 0.13 % 1,138.33 195 0.14 % 1,250.86 0.05 12.48 0.55 21.76 9.57 -470.60
da se 1,553 0.17 % 1,500.33 296 0.14 % 1,284.04 314 0.12 % 1,062.45 656 0.20 % 1,857.57 287 0.21 % 1,841.01 0.08 31.23 2.27 23.47 10.43 -340.50
da bi 1,459 0.16 % 1,409.52 257 0.12 % 1,114.86 396 0.15 % 1,339.91 521 0.16 % 1,475.30 285 0.20 % 1,828.18 0.10 34.15 3.24 24.26 10.71 236.87
se je 1,366 0.15 % 1,319.67 309 0.15 % 1,340.43 348 0.13 % 1,177.50 612 0.19 % 1,732.98 97 0.07 % 622.22 0.05 21.37 1.25 22.08 9.71 -555.44
ali pa 1,359 0.15 % 1,312.91 228 0.11 % 989.06 450 0.17 % 1,522.63 399 0.12 % 1,129.83 282 0.20 % 1,808.93 0.08 33.17 3.32 24.14 10.35 269.75
jaz sem 1,316 0.14 % 1,271.37 229 0.11 % 993.39 630 0.24 % 2,131.68 280 0.09 % 792.87 177 0.13 % 1,135.39 0.19 34.98 4.80 25.53 11.59 1,267.09
ne ne 1,234 0.13 % 1,192.15 349 0.17 % 1,513.95 462 0.18 % 1,563.23 184 0.06 % 521.03 239 0.17 % 1,533.10 0.04 7.22 0.33 20.87 9.31 -260.57
je bil 1,180 0.13 % 1,139.98 294 0.14 % 1,275.36 483 0.19 % 1,634.29 341 0.11 % 965.60 62 0.04 % 397.71 0.06 32.03 3.89 24.30 9.93 561.58
ki je 1,069 0.12 % 1,032.74 262 0.13 % 1,136.55 232 0.09 % 785 496 0.15 % 1,404.51 79 0.06 % 506.76 0.05 26.56 2.41 22.54 9.67 -183.02
kaj pa 1,044 0.11 % 1,008.59 181 0.09 % 785.17 357 0.14 % 1,207.95 321 0.10 % 908.96 185 0.13 % 1,186.71 0.05 23.96 1.95 22.01 9.78 -321.49
mislim da 1,044 0.11 % 1,008.59 218 0.11 % 945.68 187 0.07 % 632.74 458 0.14 % 1,296.90 181 0.13 % 1,161.05 0.09 30.51 4.17 24.22 10.49 646.58
je bila 1,033 0.11 % 997.96 240 0.12 % 1,041.11 425 0.16 % 1,438.04 295 0.09 % 835.34 73 0.05 % 468.27 0.05 29.73 3.74 23.76 9.74 412.20
pa ne 986 0.11 % 952.56 153 0.07 % 663.71 444 0.17 % 1,502.32 214 0.07 % 605.98 175 0.13 % 1,122.56 0.03 2.60 0.12 20.02 9.04 -89.25
v bistvu 985 0.11 % 951.59 163 0.08 % 707.09 232 0.09 % 785 231 0.07 % 654.11 359 0.26 % 2,302.86 0.10 30.84 5.86 25.74 10.75 1,536.97
kaj je 945 0.10 % 912.95 134 0.07 % 581.29 414 0.16 % 1,400.82 285 0.09 % 807.02 112 0.08 % 718.44 0.04 19.55 1.46 21.23 9.36 -372.57
pa to 927 0.10 % 895.56 152 0.07 % 659.37 472 0.18 % 1,597.07 141 0.04 % 399.26 162 0.12 % 1,039.17 0.04 13.22 0.82 20.53 9.31 -346.48
v redu 854 0.09 % 825.04 179 0.09 % 776.50 236 0.09 % 798.53 230 0.07 % 651.28 209 0.15 % 1,340.66 0.09 28.72 5.85 25.32 10.55 1,327.26
in eee 848 0.09 % 819.24 246 0.12 % 1,067.14 155 0.06 % 524.46 357 0.11 % 1,010.90 90 0.07 % 577.32 0.04 16.61 1.22 20.67 9.46 -345.08
a veš 838 0.09 % 809.58 126 0.06 % 546.58 450 0.17 % 1,522.63 16 0.01 % 45.31 246 0.18 % 1,578.01 0.17 28.30 5.48 24.90 11.48 1,126.97
se pravi 832 0.09 % 803.78 55 0.03 % 238.59 37 0.01 % 125.19 517 0.16 % 1,463.97 223 0.16 % 1,430.47 0.10 28.14 5.35 24.75 10.63 1,054.77
pa še 784 0.09 % 757.41 159 0.08 % 689.74 319 0.12 % 1,079.37 181 0.06 % 512.53 125 0.09 % 801.83 0.04 20.71 1.94 21.17 9.46 -243.48
ker je 783 0.09 % 756.44 123 0.06 % 533.57 261 0.10 % 883.12 255 0.08 % 722.07 144 0.10 % 923.71 0.04 21.79 2.18 21.40 9.25 -193.52
da ne 782 0.09 % 755.48 179 0.09 % 776.50 279 0.11 % 944.03 201 0.06 % 569.16 123 0.09 % 789 0.03 4.90 0.28 19.50 8.92 -143.07

2.4.2 List of all word-level 3-grams from standardized word forms in the GOS 1.0 corpus with text-type distribution and collocation measures
File at CLARIN.SI: GOS1.0-word_sets-standardized_forms-3grams-taxonomy-collocativity-entire.tsv

Standardized form of string | Total absolute frequency of standardized form | Percentage of all found standardized forms | Total relative frequency (per million occurrences) | Absolute frequency [gos.T.J.R] | Percentage [gos.T.J.R] | Relative frequency [gos.T.J.R] | Absolute frequency [gos.T.N.Z] | Percentage [gos.T.N.Z] | Relative frequency [gos.T.N.Z] | Absolute frequency [gos.T.J.I] | Percentage [gos.T.J.I] | Relative frequency [gos.T.J.I] | Absolute frequency [gos.T.N.N] | Percentage [gos.T.N.N] | Relative frequency [gos.T.N.N] | Dice | t-score | MI | MI3 | logDice | simple LL
ja ja ja 1,269 0.15 % 1,225.96 194 0.10 % 841.56 607 0.27 % 2,053.85 23 0.01 % 65.13 445 0.35 % 2,854.52 0.05 35.18 6.35 26.96 9.67 2,341.27
se mi zdi 356 0.04 % 343.93 74 0.04 % 321.01 91 0.04 % 307.91 91 0.03 % 257.68 100 0.08 % 641.47 0.05 18.87 13.07 30.02 9.73 2,089.82
ne ne ne 320 0.04 % 309.15 101 0.06 % 438.13 116 0.05 % 392.50 43 0.01 % 121.76 60 0.05 % 384.88 0.01 16.20 3.41 20.05 7.36 76.67
to je to 316 0.04 % 305.28 55 0.03 % 238.59 111 0.05 % 375.58 50 0.02 % 141.58 100 0.08 % 641.47 0.01 17.11 4.73 21.33 7.71 291.17
jaz mislim da 264 0.03 % 255.05 32 0.02 % 138.81 25 0.01 % 84.59 165 0.06 % 467.22 42 0.03 % 269.42 0.03 16.23 9.50 25.59 8.74 983.12
pa ne vem 254 0.03 % 245.39 24 0.01 % 104.11 150 0.07 % 507.54 22 0.01 % 62.30 58 0.05 % 372.05 0.01 15.74 6.35 22.33 7.59 469.65
da je to 250 0.03 % 241.52 39 0.02 % 169.18 39 0.02 % 131.96 126 0.04 % 356.79 46 0.04 % 295.07 0.01 14.96 4.21 20.14 7.32 160.24
to je pa 248 0.03 % 239.59 55 0.03 % 238.59 86 0.04 % 290.99 60 0.02 % 169.90 47 0.04 % 301.49 0.01 14.54 3.71 19.62 7.16 95.56
mislim da je 244 0.03 % 235.72 43 0.02 % 186.53 34 0.01 % 115.04 126 0.04 % 356.79 41 0.03 % 263 0.01 15.49 6.86 22.72 7.61 523.43
ne vem kaj 244 0.03 % 235.72 27 0.01 % 117.12 143 0.06 % 483.86 27 0.01 % 76.45 47 0.04 % 301.49 0.02 15.56 7.92 23.78 8.06 677.85
ja to je 237 0.03 % 228.96 45 0.02 % 195.21 86 0.04 % 290.99 47 0.02 % 133.09 59 0.05 % 378.46 0.01 14.32 3.84 19.62 7.16 107.36
in tako naprej 235 0.03 % 227.03 41 0.02 % 177.86 11 0.01 % 37.22 152 0.05 % 430.41 31 0.03 % 198.85 0.03 15.32 10.66 26.41 8.71 1,038.28
in to je 213 0.03 % 205.78 46 0.03 % 199.55 43 0.02 % 145.50 103 0.04 % 291.66 21 0.02 % 134.71 0.01 13.88 4.34 19.81 7.18 152.00
to je bilo 197 0.02
% 190.32 52 0.03 % 225.57 68 0.03 % 230.09 60 0.02 % 169.90 17 0.01 % 109.05 0.01 13.89 6.56 21.80 7.35 388.14 zaradi tega ker 188 0.02 % 181.62 27 0.01 % 117.12 28 0.01 % 94.74 108 0.04 % 305.82 25 0.02 % 160.37 0.07 13.71 14.72 29.83 10.26 1,290.46 ne vem če 182 0.02 % 175.83 30 0.02 % 130.14 78 0.03 % 263.92 33 0.01 % 93.44 41 0.03 % 263 0.01 13.44 8.10 23.11 7.75 524.79 glede na to 178 0.02 % 171.96 67 0.04 % 290.64 15 0.01 % 50.75 64 0.02 % 181.23 32 0.03 % 205.27 0.02 13.33 10.88 25.83 8.14 809.98 na to da 170 0.02 % 164.23 69 0.04 % 299.32 10 0 % 33.84 68 0.02 % 192.55 23 0.02 % 147.54 0.01 12.71 5.29 20.11 7.34 210.07 ne to je 154 0.02 % 148.78 36 0.02 % 156.17 47 0.02 % 159.03 40 0.01 % 113.27 31 0.03 % 198.85 0.01 10.75 2.90 17.44 6.43 2.38 to se pravi 152 0.02 % 146.84 9 0.01 % 39.04 9 0 % 30.45 104 0.04 % 294.49 30 0.02 % 192.44 0.01 12.30 8.70 23.20 7.71 493.04 mi zdi da 149 0.02 % 143.95 42 0.02 % 182.19 35 0.01 % 118.43 34 0.01 % 96.28 38 0.03 % 243.76 0.02 12.20 11.42 25.86 8.15 726.43 da bi se 146 0.02 % 141.05 20 0.01 % 86.76 30 0.01 % 101.51 64 0.02 % 181.23 32 0.03 % 205.27 0.01 11.89 5.94 20.32 7.33 235.03 mhm mhm mhm 139 0.02 % 134.29 7 0 % 30.37 36 0.02 % 121.81 15 0.01 % 42.47 81 0.07 % 519.59 0.03 11.78 10.70 24.93 8.99 617.33 ja ne vem 137 0.02 % 132.35 21 0.01 % 91.10 86 0.04 % 290.99 18 0.01 % 50.97 12 0.01 % 76.98 0.01 11.47 5.66 19.86 6.79 198.45 saj to je 131 0.02 % 126.56 13 0.01 % 56.39 70 0.03 % 236.85 19 0.01 % 53.80 29 0.02 % 186.03 0.01 11.22 5.65 19.72 6.75 188.72 je v bistvu 130 0.02 % 125.59 19 0.01 % 82.42 27 0.01 % 91.36 40 0.01 % 113.27 44 0.04 % 282.24 0.01 11.35 7.72 21.77 6.83 345.67 to pa je 130 0.02 % 125.59 45 0.02 % 195.21 27 0.01 % 91.36 33 0.01 % 93.44 25 0.02 % 160.37 0.00 9.74 2.78 16.82 6.23 -4.78 kaj je to 129 0.02 % 124.62 23 0.01 % 99.77 53 0.02 % 179.33 42 0.01 % 118.93 11 0.01 % 70.56 0.01 10.82 4.39 18.42 6.60 95.45 tako da je 124 0.01 % 119.79 21 0.01 % 91.10 40 0.02 % 135.34 35 0.01 % 99.11 28 0.02 % 
179.61 0.01 10.45 4.02 17.93 6.47 67.61 na neki način 122 0.01 % 117.86 12 0.01 % 52.06 4 0 % 13.53 69 0.02 % 195.38 37 0.03 % 237.34 0.03 11.05 15.57 29.43 8.85 899.76 jaz ne vem 118 0.01 % 114 18 0.01 % 78.08 45 0.02 % 152.26 21 0.01 % 59.46 34 0.03 % 218.10 0.01 10.80 7.43 21.19 7.11 293.11 je to je 118 0.01 % 114 18 0.01 % 78.08 39 0.02 % 131.96 34 0.01 % 96.28 27 0.02 % 173.20 0.00 8.63 2.29 16.05 5.96 -25.22 CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 772 File at CLARIN.SI2.4.3 List of all word-level 4-grams from standardized word forms in the GOS 1.0 corpus with text-type distribution and collocation measuresGOS1.0-word_sets-standardized_forms- 4grams-taxonomy-collocativity-entire.tsvStandardized form of string Total absolute frequency of standardized form Percentage of all found standardized forms Total relative frequency (per million occurrences) Absolute frequency [gos.T.J.R] Percentage [gos.T.J.R] Relative frequency [gos.T.J.R] Absolute frequency [gos.T.N.Z] Percentage [gos.T.N.Z] Relative frequency [gos.T.N.Z] Absolute frequency [gos.T.J.I] Percentage [gos.T.J.I] Relative frequency [gos.T.J.I] Absolute frequency [gos.T.N.N] Percentage [gos.T.N.N] Relative frequency [gos.T.N.N] Dice t-score MI MI3 logDice simple LL ja ja ja ja 501 0.07 % 484.01 69 0.04 % 299.32 245 0.12 % 828.99 6 0 % 16.99 181 0.16 % 1,161.05 0.02 22.37 10.34 28.28 8.33 2,118.80 se mi zdi da 149 0.02 % 143.95 42 0.03 % 182.19 35 0.02 % 118.43 34 0.01 % 96.28 38 0.03 % 243.76 0.01 12.21 17.44 31.88 7.87 1,266.60 glede na to da 122 0.02 % 117.86 56 0.03 % 242.93 8 0 % 27.07 41 0.01 % 116.10 17 0.01 % 109.05 0.01 11.05 15.96 29.82 7.27 928.23 ne ne ne ne 116 0.02 % 112.07 44 0.03 % 190.87 40 0.02 % 135.34 17 0.01 % 48.14 15 0.01 % 96.22 0.00 10.68 6.96 20.68 5.90 256.27 to je to je 93 0.01 % 89.85 14 0.01 % 60.73 35 0.02 % 118.43 19 0.01 % 53.80 25 0.02 % 160.37 0.00 9.60 7.75 20.83 5.77 248.82 jaz mislim da je 60 0.01 % 57.97 5 0 % 21.69 5 0 % 16.92 40 
0.01 % 113.27 10 0.01 % 64.15 0.00 7.74 12.15 23.97 5.86 319.04 meni se zdi da 49 0.01 % 47.34 5 0 % 21.69 29 0.01 % 98.12 6 0 % 16.99 9 0.01 % 57.73 0.01 7.00 17.91 29.14 6.38 430.50 in to je to 46 0.01 % 44.44 6 0 % 26.03 21 0.01 % 71.06 9 0 % 25.48 10 0.01 % 64.15 0.00 6.75 7.94 18.99 5.06 128.29 se mi je zdelo 46 0.01 % 44.44 11 0.01 % 47.72 15 0.01 % 50.75 8 0 % 22.65 12 0.01 % 76.98 0.00 6.78 17.76 28.80 5.71 399.80 mhm mhm mhm mhm 42 0.01 % 40.58 1 0 % 4.34 13 0.01 % 43.99 5 0 % 14.16 23 0.02 % 147.54 0.01 6.48 16.82 27.61 7.26 341.40 ali pa kaj takega 41 0.01 % 39.61 2 0 % 8.68 17 0.01 % 57.52 6 0 % 16.99 16 0.01 % 102.63 0.00 6.40 16.68 27.39 5.93 329.64 pa ne vem kaj 40 0.01 % 38.64 2 0 % 8.68 23 0.01 % 77.82 5 0 % 14.16 10 0.01 % 64.15 0.00 6.32 10.45 21.10 5.14 171.77 ne glede na to 38 0.01 % 36.71 7 0 % 30.37 2 0 % 6.77 22 0.01 % 62.30 7 0.01 % 44.90 0.00 6.16 13.67 24.17 5.31 236.81 mislim da je to 36 0.01 % 34.78 6 0 % 26.03 3 0 % 10.15 22 0.01 % 62.30 5 0 % 32.07 0.00 5.99 9.90 20.24 4.89 142.73 na to da je 36 0.01 % 34.78 12 0.01 % 52.06 1 0 % 3.38 17 0.01 % 48.14 6 0.01 % 38.49 0.00 5.97 7.84 18.18 4.73 98.20 nič osem nič nič 36 0.01 % 34.78 36 0.02 % 156.17 0 0 % 0 0 0 % 0 0 0 % 0 0.02 6.00 23.67 34.01 8.53 440.94 osem nič osem nič 36 0.01 % 34.78 36 0.02 % 156.17 0 0 % 0 0 0 % 0 0 0 % 0 0.03 6.00 26.08 36.42 8.96 493.28 zaradi tega ker je 36 0.01 % 34.78 6 0 % 26.03 6 0 % 20.30 20 0.01 % 56.63 4 0 % 25.66 0.00 6.00 17.13 27.47 5.71 299.20 šest osem nič osem 36 0.01 % 34.78 36 0.02 % 156.17 0 0 % 0 0 0 % 0 0 0 % 0 0.05 6.00 28.35 38.69 9.54 542.53 šest šest osem nič 35 0.01 % 33.81 35 0.02 % 151.83 0 0 % 0 0 0 % 0 0 0 % 0 0.04 5.92 28.17 38.43 9.48 523.60 je pa res da 34 0.01 % 32.85 17 0.01 % 73.75 0 0 % 0 8 0 % 22.65 9 0.01 % 57.73 0.00 5.82 9.42 19.59 4.63 124.89 ne vem kaj je 34 0.01 % 32.85 3 0 % 13.01 26 0.01 % 87.97 2 0 % 5.66 3 0 % 19.24 0.00 5.82 9.87 20.04 4.76 134.06 ja saj to je 33 0 % 31.88 6 0 % 26.03 20 0.01 % 67.67 1 0 % 2.83 6 
0.01 % 38.49 0.00 5.73 9.00 19.09 4.66 112.91 mi zdi da je 33 0 % 31.88 8 0.01 % 34.70 9 0 % 30.45 10 0 % 28.32 6 0.01 % 38.49 0.00 5.74 14.03 24.12 5.10 212.79 in tako naprej ne 32 0 % 30.91 6 0 % 26.03 5 0 % 16.92 14 0.01 % 39.64 7 0.01 % 44.90 0.00 5.66 12.80 22.80 5.14 182.69 aha ja ja ja 30 0 % 28.98 2 0 % 8.68 15 0.01 % 50.75 0 0 % 0 13 0.01 % 83.39 0.00 5.47 9.92 19.73 4.64 119.21 da je da je 30 0 % 28.98 1 0 % 4.34 13 0.01 % 43.99 6 0 % 16.99 10 0.01 % 64.15 0.00 5.38 5.75 15.57 4.07 45.03 ne vem ne vem 30 0 % 28.98 4 0 % 17.35 19 0.01 % 64.29 4 0 % 11.33 3 0 % 19.24 0.00 5.48 11.34 21.15 4.79 144.82 ne ja ja ja 29 0 % 28.02 2 0 % 8.68 11 0.01 % 37.22 0 0 % 0 16 0.01 % 102.63 0.00 5.30 5.92 15.63 4.13 46.25 aha aha aha aha 28 0 % 27.05 10 0.01 % 43.38 7 0 % 23.69 0 0 % 0 11 0.01 % 70.56 0.01 5.29 20.73 30.34 7.80 293.44 ja to je pa 28 0 % 27.05 6 0 % 26.03 10 0.01 % 33.84 8 0 % 22.65 4 0 % 25.66 0.00 5.20 5.90 15.51 4.05 44.40 ne vem če je 28 0 % 27.05 2 0 % 8.68 13 0.01 % 43.99 7 0 % 19.82 6 0.01 % 38.49 0.00 5.29 10.19 19.80 4.53 115.77 CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 773 File at CLARIN.SI2.4.4 List of all word-level 5-grams from standardized word forms in the GOS 1.0 corpus with text-type distribution and collocation measuresGOS1.0-word_sets-standardized_forms- 5grams-taxonomy-collocativity-entire.tsvStandardized form of string Total absolute frequency of standardized form Percentage of all found standardized forms Total relative frequency (per million occurrences) Absolute frequency [gos.T.J.R] Percentage [gos.T.J.R] Relative frequency [gos.T.J.R] Absolute frequency [gos.T.N.Z] Percentage [gos.T.N.Z] Relative frequency [gos.T.N.Z] Absolute frequency [gos.T.J.I] Percentage [gos.T.J.I] Relative frequency [gos.T.J.I] Absolute frequency [gos.T.N.N] Percentage [gos.T.N.N] Relative frequency [gos.T.N.N] Dice t-score MI MI3 logDice simple LL ja ja ja ja ja 216 0.03 % 208.67 23 0.02 % 99.77 120 0.07 % 406.03 0 0 % 
0 73 0.07 % 468.27 0.01 14.70 24.68 40.19 7.11 2,777.55 ne ne ne ne ne 48 0.01 % 46.37 20 0.01 % 86.76 17 0.01 % 57.52 7 0 % 19.82 4 0 % 25.66 0.00 6.93 22.51 33.68 4.63 554.52 osem nič osem nič nič 36 0.01 % 34.78 36 0.02 % 156.17 0 0 % 0 0 0 % 0 0 0 % 0 0.03 6 35.10 45.44 8.77 688.69 šest osem nič osem nič 36 0.01 % 34.78 36 0.02 % 156.17 0 0 % 0 0 0 % 0 0 0 % 0 0.03 6 37.37 47.71 9.16 737.94 šest šest osem nič osem 35 0.01 % 33.81 35 0.02 % 151.83 0 0 % 0 0 0 % 0 0 0 % 0 0.05 5.92 39.60 49.86 9.65 764.46 se mi zdi da je 33 0.01 % 31.88 8 0.01 % 34.70 9 0.01 % 30.45 10 0 % 28.32 6 0.01 % 38.49 0.00 5.74 21.97 32.06 5.10 370.50 glede na to da je 28 0 % 27.05 10 0.01 % 43.38 0 0 % 0 14 0.01 % 39.64 4 0 % 25.66 0.00 5.29 22.41 32.02 4.68 321.72 nič osem nič trinajst nič 23 0 % 22.22 0 0 % 0 0 0 % 0 23 0.01 % 65.13 0 0 % 0 0.02 4.80 36.89 45.94 8.19 464.86 osem nič trinajst nič ena 23 0 % 22.22 0 0 % 0 0 0 % 0 23 0.01 % 65.13 0 0 % 0 0.02 4.80 37.58 46.63 8.37 474.36 aha ja ja ja ja 17 0 % 16.42 1 0 % 4.34 7 0 % 23.69 0 0 % 0 9 0.01 % 57.73 0.00 4.12 21.01 29.19 3.74 181.07 s hiti na Radiu City 17 0 % 16.42 17 0.01 % 73.75 0 0 % 0 0 0 % 0 0 0 % 0 0.01 4.12 40.84 49.02 6.37 384.04 se mi je zdelo da 17 0 % 16.42 5 0 % 21.69 5 0 % 16.92 4 0 % 11.33 3 0 % 19.24 0.00 4.12 21.95 30.12 4.15 190.63 iz osemdesetih devetdesetih in danes 16 0 % 15.46 16 0.01 % 69.41 0 0 % 0 0 0 % 0 0 0 % 0 0.00 4 37.99 45.99 6.05 333.93 ker se mi zdi da 16 0 % 15.46 4 0 % 17.35 4 0 % 13.53 2 0 % 5.66 6 0.01 % 38.49 0.00 4.00 21.98 29.98 4.82 179.72 mhm mhm mhm mhm mhm 16 0 % 15.46 0 0 % 0 7 0 % 23.69 2 0 % 5.66 7 0.01 % 44.90 0.00 4.00 23.28 31.28 5.87 192.29 glasbenem miksu za vso Slovenijo 15 0 % 14.49 15 0.01 % 65.07 0 0 % 0 0 0 % 0 0 0 % 0 0.01 3.87 49.03 56.84 7.23 412.76 najboljšem glasbenem miksu za vso 15 0 % 14.49 15 0.01 % 65.07 0 0 % 0 0 0 % 0 0 0 % 0 0.01 3.87 51.39 59.21 7.24 434.11 ne glede na to da 15 0 % 14.49 5 0 % 21.69 1 0 % 3.38 8 0 % 22.65 1 0 % 6.41 0.00 3.87 20.83 28.65 
3.87 158.13 v najboljšem glasbenem miksu za 15 0 % 14.49 15 0.01 % 65.07 0 0 % 0 0 0 % 0 0 0 % 0 0.00 3.87 43.71 51.53 5.57 364.78 hitov iz osemdesetih devetdesetih in 14 0 % 13.53 14 0.01 % 60.73 0 0 % 0 0 0 % 0 0 0 % 0 0.00 3.74 43.63 51.25 5.97 339.79 jaz mislim da je to 13 0 % 12.56 1 0 % 4.34 3 0 % 10.15 6 0 % 16.99 3 0 % 19.24 0.00 3.61 20.63 28.03 3.63 135.43 ne vem kaj še vse 13 0 % 12.56 0 0 % 0 7 0 % 23.69 2 0 % 5.66 4 0 % 25.66 0.00 3.61 22.18 29.58 4.28 147.57 to je to je to 13 0 % 12.56 1 0 % 4.34 5 0 % 16.92 1 0 % 2.83 6 0.01 % 38.49 0.00 3.61 20.63 28.03 3.03 135.43 devet nič devet tri tri 12 0 % 11.59 12 0.01 % 52.06 0 0 % 0 0 0 % 0 0 0 % 0 0.01 3.46 37.01 44.18 7.74 243.41 devet tri tri nič ena 12 0 % 11.59 12 0.01 % 52.06 0 0 % 0 0 0 % 0 0 0 % 0 0.01 3.46 34.41 41.58 7.45 224.63 meni se zdi da je 12 0 % 11.59 0 0 % 0 8 0 % 27.07 2 0 % 5.66 2 0 % 12.83 0.00 3.46 20.67 27.84 3.69 125.36 nič devet nič devet tri 12 0 % 11.59 12 0.01 % 52.06 0 0 % 0 0 0 % 0 0 0 % 0 0.01 3.46 36.13 43.30 7.48 237.06 nič devet tri tri nič 12 0 % 11.59 12 0.01 % 52.06 0 0 % 0 0 0 % 0 0 0 % 0 0.01 3.46 33.73 40.90 7.27 219.67 pa ne vem kaj še 12 0 % 11.59 0 0 % 0 7 0 % 23.69 1 0 % 2.83 4 0 % 25.66 0.00 3.46 21.04 28.21 3.59 127.98 V pika Radio Capris pika 11 0 % 10.63 11 0.01 % 47.72 0 0 % 0 0 0 % 0 0 0 % 0 0.14 3.32 52.47 59.39 11.15 325.49 Zaslužite s hiti na Radiu 11 0 % 10.63 11 0.01 % 47.72 0 0 % 0 0 0 % 0 0 0 % 0 0.00 3.32 41.96 48.88 5.75 255.90 aha aha aha aha aha 11 0 % 10.63 4 0 % 17.35 3 0 % 10.15 0 0 % 0 4 0 % 25.66 0.01 3.32 28.36 35.28 6.45 165.80 CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 774 File at CLARIN.SI2.4.5 List of all word-level 2-grams from lower-case word forms in the GOS 1.0 corpus with text-type distribution and collocation measuresGOS1.0-word_sets-lowercase_forms- 2grams-taxonomy-collocativity-entire.tsvLower-case form of string Total absolute frequency of lower-case word form Percentage of all found 
lower-case word forms Total relative frequency (per million occurrences) Absolute frequency [gos.T.J.R] Percentage [gos.T.J.R] Relative frequency [gos.T.J.R] Absolute frequency [gos.T.N.Z] Percentage [gos.T.N.Z] Relative frequency [gos.T.N.Z] Absolute frequency [gos.T.J.I] Percentage [gos.T.J.I] Relative frequency [gos.T.J.I] Absolute frequency [gos.T.N.N] Percentage [gos.T.N.N] Relative frequency [gos.T.N.N] Dice t-score MI MI3 logDice simple LL ja ja 3,909 0.42 % 3,776.42 649 0.32 % 2,815.34 1,889 0.73 % 6,391.65 165 0.05 % 467.22 1,206 0.87 % 7,736.08 0.15 52.65 2.66 26.53 11.31 -316.58 to je 3,611 0.39 % 3,488.53 875 0.43 % 3,795.72 950 0.37 % 3,214.43 1,102 0.34 % 3,120.50 684 0.49 % 4,387.62 0.13 49.88 2.56 26.19 11.09 -436.66 ne vem 2,427 0.26 % 2,344.69 338 0.16 % 1,466.23 1,199 0.46 % 4,056.95 378 0.12 % 1,070.37 512 0.37 % 3,284.30 0.15 47.32 4.67 27.16 11.22 2,155.51 da je 2,425 0.26 % 2,342.75 385 0.19 % 1,670.12 481 0.18 % 1,627.52 1,218 0.38 % 3,448.97 341 0.24 % 2,187.40 0.09 35.63 1.85 24.34 10.48 -801.09 je pa 1,958 0.21 % 1,891.59 299 0.14 % 1,297.05 805 0.31 % 2,723.81 521 0.16 % 1,475.30 333 0.24 % 2,136.08 0.06 20.74 0.91 22.78 9.92 -759.91 a ne 1,824 0.20 % 1,762.14 168 0.08 % 728.78 409 0.16 % 1,383.90 859 0.27 % 2,432.40 388 0.28 % 2,488.89 0.10 37.98 3.18 24.84 10.66 243.59 je to 1,753 0.19 % 1,693.54 313 0.15 % 1,357.78 424 0.16 % 1,434.65 623 0.20 % 1,764.13 393 0.28 % 2,520.96 0.06 27.21 1.51 23.06 10.05 -680.57 pa je 1,520 0.17 % 1,468.45 368 0.18 % 1,596.37 556 0.21 % 1,881.29 401 0.12 % 1,135.50 195 0.14 % 1,250.86 0.05 12.31 0.55 21.69 9.56 -458.86 da se 1,426 0.15 % 1,377.64 285 0.14 % 1,236.32 237 0.09 % 801.92 637 0.20 % 1,803.77 267 0.19 % 1,712.71 0.08 29.90 2.26 23.22 10.38 -314.35 se je 1,377 0.15 % 1,330.30 317 0.15 % 1,375.13 351 0.14 % 1,187.65 613 0.19 % 1,735.81 96 0.07 % 615.81 0.05 21.43 1.24 22.10 9.73 -559.97 je blo 1,239 0.13 % 1,196.98 244 0.12 % 1,058.46 589 0.23 % 1,992.95 260 0.08 % 736.23 146 0.10 % 936.54 0.06 
32.91 3.94 24.49 10.02 622.27 da bi 1,232 0.13 % 1,190.22 237 0.12 % 1,028.10 249 0.10 % 842.52 494 0.15 % 1,398.84 252 0.18 % 1,616.49 0.10 31.46 3.27 23.80 10.61 216.06 ne ne 1,135 0.12 % 1,096.50 334 0.16 % 1,448.88 392 0.15 % 1,326.38 178 0.06 % 504.04 231 0.17 % 1,481.79 0.04 7.87 0.38 20.68 9.28 -267.87 je bil 1,128 0.12 % 1,089.74 287 0.14 % 1,245 444 0.17 % 1,502.32 338 0.11 % 957.10 59 0.04 % 378.46 0.06 31.42 3.95 24.23 9.89 573.82 al pa 1,073 0.12 % 1,036.61 136 0.07 % 589.96 404 0.16 % 1,366.98 275 0.09 % 778.71 258 0.19 % 1,654.98 0.07 29.94 3.54 23.68 10.08 326.33 v bistvu 959 0.10 % 926.47 162 0.08 % 702.75 213 0.08 % 720.71 228 0.07 % 645.62 356 0.26 % 2,283.62 0.11 30.45 5.91 25.72 10.77 1,528.00 pa ne 885 0.10 % 854.98 137 0.07 % 594.30 366 0.14 % 1,238.40 210 0.07 % 594.65 172 0.12 % 1,103.32 0.03 1.30 0.06 19.64 8.94 -43.08 ki je 860 0.09 % 830.83 236 0.12 % 1,023.76 109 0.04 % 368.81 462 0.14 % 1,308.23 53 0.04 % 339.98 0.04 23.64 2.37 21.86 9.41 -160.80 in eee 846 0.09 % 817.31 246 0.12 % 1,067.14 153 0.06 % 517.69 357 0.11 % 1,010.90 90 0.07 % 577.32 0.04 16.78 1.24 20.69 9.47 -344.05 pa to 811 0.09 % 783.49 136 0.07 % 589.96 377 0.14 % 1,275.62 141 0.04 % 399.26 157 0.11 % 1,007.10 0.03 11.42 0.74 20.07 9.16 -289.39 a veš 810 0.09 % 782.53 127 0.06 % 550.92 421 0.16 % 1,424.50 16 0.01 % 45.31 246 0.18 % 1,578.01 0.17 27.84 5.53 24.85 11.44 1,110.88 je bla 772 0.08 % 745.82 141 0.07 % 611.65 389 0.15 % 1,316.23 170 0.05 % 481.38 72 0.05 % 461.86 0.04 25.70 3.74 22.92 9.36 309.34 eee eee 767 0.08 % 740.99 137 0.07 % 594.30 190 0.07 % 642.89 315 0.10 % 891.97 125 0.09 % 801.83 0.03 8.88 0.56 19.72 9.08 -234.37 pa še 760 0.08 % 734.22 158 0.08 % 685.40 299 0.12 % 1,011.70 180 0.06 % 509.70 123 0.09 % 789 0.04 20.34 1.93 21.07 9.42 -237.91 mi je 747 0.08 % 721.66 104 0.05 % 451.15 391 0.15 % 1,322.99 138 0.04 % 390.77 114 0.08 % 731.27 0.04 22.08 2.38 21.47 9.22 -136.50 je v 726 0.08 % 701.38 132 0.06 % 572.61 145 0.06 % 490.62 315 0.10 % 891.97 
134 0.10 % 859.56 0.03 4.37 0.26 19.26 8.78 -123.92 če bi 665 0.07 % 642.45 126 0.06 % 546.58 147 0.06 % 497.39 255 0.08 % 722.07 137 0.10 % 878.81 0.10 24.18 4.00 22.75 10.69 354.79 v redu 664 0.07 % 641.48 144 0.07 % 624.67 107 0.04 % 362.05 213 0.07 % 603.14 200 0.14 % 1,282.93 0.07 25.33 5.88 24.63 10.26 1,045.78 ne bi 663 0.07 % 640.51 155 0.07 % 672.38 152 0.06 % 514.31 253 0.08 % 716.41 103 0.07 % 660.71 0.04 17.83 1.70 20.45 9.20 -239.18 se mi 653 0.07 % 630.85 122 0.06 % 529.23 212 0.08 % 717.33 172 0.05 % 487.05 147 0.11 % 942.95 0.06 23.07 3.36 22.06 10.04 142.75 ja sej 649 0.07 % 626.99 66 0.03 % 286.31 396 0.15 % 1,339.91 62 0.02 % 175.56 125 0.09 % 801.83 0.05 22.19 2.96 21.64 9.53 24.18 kaj je 637 0.07 % 615.40 115 0.06 % 498.87 172 0.07 % 581.98 266 0.08 % 753.22 84 0.06 % 538.83 0.03 17.01 1.62 20.25 8.93 -238.45 CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 775 File at CLARIN.SI2.4.6 List of all word-level 3-grams from lower-case word forms in the GOS 1.0 corpus with text-type distribution and collocation measuresGOS1.0-word_sets-lowercase_forms- 3grams-taxonomy-collocativity-entire.tsvLower-case form of string Total absolute frequency of lower-case word form Percentage of all found lower-case word forms Total relative frequency (per million occurrences) Absolute frequency [gos.T.J.R] Percentage [gos.T.J.R] Relative frequency [gos.T.J.R] Absolute frequency [gos.T.N.Z] Percentage [gos.T.N.Z] Relative frequency [gos.T.N.Z] Absolute frequency [gos.T.J.I] Percentage [gos.T.J.I] Relative frequency [gos.T.J.I] Absolute frequency [gos.T.N.N] Percentage [gos.T.N.N] Relative frequency [gos.T.N.N] Dice t-score MI MI3 logDice simple LL ja ja ja 1,267 0.15 % 1,224.03 193 0.10 % 837.23 606 0.26 % 2,050.47 23 0.01 % 65.13 445 0.35 % 2,854.52 0.05 35.17 6.39 27.01 9.68 2,373.01 se mi zdi 347 0.04 % 335.23 73 0.04 % 316.67 84 0.04 % 284.22 91 0.03 % 257.68 99 0.08 % 635.05 0.05 18.63 13.03 29.91 9.66 2,029.19 ne ne ne 291 0.04 % 
281.13 98 0.05 % 425.12 97 0.04 % 328.21 39 0.01 % 110.43 57 0.05 % 365.64 0.01 15.58 3.53 19.90 7.31 86.59 to je to 291 0.04 % 281.13 54 0.03 % 234.25 94 0.04 % 318.06 45 0.01 % 127.42 98 0.08 % 628.64 0.01 16.46 4.83 21.20 7.65 284.98 da je to 235 0.03 % 227.03 37 0.02 % 160.50 29 0.01 % 98.12 123 0.04 % 348.29 46 0.04 % 295.07 0.01 14.60 4.40 20.15 7.31 174.35 to je pa 222 0.03 % 214.47 54 0.03 % 234.25 64 0.03 % 216.55 59 0.02 % 167.07 45 0.04 % 288.66 0.01 13.74 3.68 19.27 7.03 82.58 pa ne vem 220 0.03 % 212.54 20 0.01 % 86.76 121 0.05 % 409.42 22 0.01 % 62.30 57 0.05 % 365.64 0.01 14.65 6.35 21.91 7.43 406.58 ja to je 215 0.03 % 207.71 44 0.02 % 190.87 69 0.03 % 233.47 46 0.02 % 130.26 56 0.04 % 359.22 0.01 13.64 3.84 19.34 7.06 97.32 in to je 201 0.02 % 194.18 46 0.03 % 199.55 31 0.01 % 104.89 103 0.04 % 291.66 21 0.02 % 134.71 0.01 13.51 4.41 19.71 7.14 150.48 glede na to 174 0.02 % 168.10 67 0.04 % 290.64 11 0.01 % 37.22 64 0.02 % 181.23 32 0.03 % 205.27 0.02 13.18 10.94 25.83 8.17 798.69 na to da 163 0.02 % 157.47 68 0.04 % 294.98 7 0 % 23.69 67 0.02 % 189.72 21 0.02 % 134.71 0.01 12.48 5.48 20.18 7.38 219.12 ne vem če 148 0.02 % 142.98 29 0.02 % 125.80 49 0.02 % 165.80 33 0.01 % 93.44 37 0.03 % 237.34 0.01 12.12 8.03 22.45 7.53 421.02 ne to je 146 0.02 % 141.05 34 0.02 % 147.49 41 0.02 % 138.73 40 0.01 % 113.27 31 0.03 % 198.85 0.01 10.61 3.04 17.42 6.41 10.48 to je blo 142 0.02 % 137.18 42 0.02 % 182.19 43 0.02 % 145.50 43 0.01 % 121.76 14 0.01 % 89.81 0.01 11.80 6.72 21.02 6.95 293.48 mhm mhm mhm 139 0.02 % 134.29 7 0 % 30.37 36 0.02 % 121.81 15 0.01 % 42.47 81 0.07 % 519.59 0.03 11.78 10.70 24.93 8.99 617.25 in tako naprej 130 0.02 % 125.59 22 0.01 % 95.44 1 0 % 3.38 102 0.04 % 288.83 5 0 % 32.07 0.02 11.40 11.34 25.39 8.28 628.01 ne vem kaj 129 0.02 % 124.62 17 0.01 % 73.75 55 0.02 % 186.10 23 0.01 % 65.13 34 0.03 % 218.10 0.01 11.31 7.91 21.93 7.34 357.09 na nek način 128 0.01 % 123.66 12 0.01 % 52.06 3 0 % 10.15 74 0.03 % 209.54 39 0.03 % 250.17 
0.03 11.31 15.74 29.74 8.92 956.61 to pa je 127 0.01 % 122.69 45 0.02 % 195.21 25 0.01 % 84.59 33 0.01 % 93.44 24 0.02 % 153.95 0.00 9.73 2.87 16.85 6.23 0.45 mi zdi da 126 0.01 % 121.73 41 0.02 % 177.86 19 0.01 % 64.29 32 0.01 % 90.61 34 0.03 % 218.10 0.02 11.22 11.37 25.32 8.04 610.52 je v bistvu 125 0.01 % 120.76 18 0.01 % 78.08 25 0.01 % 84.59 38 0.01 % 107.60 44 0.04 % 282.24 0.01 11.13 7.78 21.72 6.81 336.96 da bi se 124 0.01 % 119.79 20 0.01 % 86.76 15 0.01 % 50.75 60 0.02 % 169.90 29 0.02 % 186.03 0.01 10.95 5.94 19.85 7.17 199.64 to je res 114 0.01 % 110.13 35 0.02 % 151.83 39 0.02 % 131.96 20 0.01 % 56.63 20 0.02 % 128.29 0.01 10.55 6.39 20.06 6.63 213.50 je to je 112 0.01 % 108.20 18 0.01 % 78.08 36 0.02 % 121.81 31 0.01 % 87.78 27 0.02 % 173.20 0.00 8.52 2.36 15.97 5.92 -21.38 ja ne vem 111 0.01 % 107.24 18 0.01 % 78.08 68 0.03 % 230.09 14 0.01 % 39.64 11 0.01 % 70.56 0.01 10.31 5.57 19.16 6.54 155.01 eee to je 105 0.01 % 101.44 19 0.01 % 82.42 17 0.01 % 57.52 50 0.02 % 141.58 19 0.01 % 121.88 0.00 8.90 2.93 16.36 6.06 2.80 mislim da je 101 0.01 % 97.57 19 0.01 % 82.42 11 0.01 % 37.22 52 0.02 % 147.25 19 0.01 % 121.88 0.01 9.99 7.42 20.74 6.45 250.53 se mi je 99 0.01 % 95.64 24 0.01 % 104.11 38 0.02 % 128.58 18 0.01 % 50.97 19 0.01 % 121.88 0.01 9.72 5.45 18.71 6.41 131.42 za to da 98 0.01 % 94.68 17 0.01 % 73.75 2 0 % 6.77 72 0.03 % 203.88 7 0.01 % 44.90 0.01 9.66 5.36 18.59 6.78 125.25 sej to je 96 0.01 % 92.74 10 0.01 % 43.38 48 0.02 % 162.41 15 0.01 % 42.47 23 0.02 % 147.54 0.01 9.59 5.56 18.73 6.36 133.54 in tko naprej 95 0.01 % 91.78 18 0.01 % 78.08 9 0 % 30.45 45 0.01 % 127.42 23 0.02 % 147.54 0.01 9.74 11.28 24.42 7.89 455.00 to da je 94 0.01 % 90.81 21 0.01 % 91.10 3 0 % 10.15 56 0.02 % 158.57 14 0.01 % 89.81 0.00 8.54 3.07 16.18 5.99 8.32 CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 776 File at CLARIN.SI2.4.7 List of all word-level 4-grams from lower-case word forms in the GOS 1.0 corpus with text-type 
distribution and collocation measuresGOS1.0-word_sets-lowercase_forms- 4grams-taxonomy-collocativity-entire.tsvLower-case form of string Total absolute frequency of lower-case word form Percentage of all found lower-case word forms Total relative frequency (per million occurrences) Absolute frequency [gos.T.J.R] Percentage [gos.T.J.R] Relative frequency [gos.T.J.R] Absolute frequency [gos.T.N.Z] Percentage [gos.T.N.Z] Relative frequency [gos.T.N.Z] Absolute frequency [gos.T.J.I] Percentage [gos.T.J.I] Relative frequency [gos.T.J.I] Absolute frequency [gos.T.N.N] Percentage [gos.T.N.N] Relative frequency [gos.T.N.N] Dice t-score MI MI3 logDice simple LL ja ja ja ja 501 0.07 % 484.01 69 0.04 % 299.32 245 0.12 % 828.99 6 0 % 16.99 181 0.16 % 1,161.05 0.02 22.37 10.41 28.35 8.34 2,138.90 se mi zdi da 126 0.02 % 121.73 41 0.03 % 177.86 19 0.01 % 64.29 32 0.01 % 90.61 34 0.03 % 218.10 0.01 11.22 17.35 31.31 7.70 1,064.53 glede na to da 117 0.02 % 113.03 55 0.03 % 238.59 5 0 % 16.92 41 0.01 % 116.10 16 0.01 % 102.63 0.01 10.82 16.15 29.89 7.30 903.88 ne ne ne ne 104 0.01 % 100.47 43 0.03 % 186.53 33 0.02 % 111.66 15 0.01 % 42.47 13 0.01 % 83.39 0.00 10.13 7.15 20.55 5.83 241.28 to je to je 88 0.01 % 85.02 14 0.01 % 60.73 32 0.02 % 108.28 17 0.01 % 48.14 25 0.02 % 160.37 0.00 9.34 7.92 20.84 5.74 244.18 in to je to 43 0.01 % 41.54 6 0 % 26.03 18 0.01 % 60.91 9 0 % 25.48 10 0.01 % 64.15 0.00 6.53 8.09 18.95 5.01 123.84 mhm mhm mhm mhm 42 0.01 % 40.58 1 0 % 4.34 13 0.01 % 43.99 5 0 % 14.16 23 0.02 % 147.54 0.01 6.48 16.82 27.61 7.26 341.37 men se zdi da 37 0.01 % 35.75 4 0 % 17.35 21 0.01 % 71.06 5 0 % 14.16 7 0.01 % 44.90 0.00 6.08 18.30 28.72 6.06 333.60 ne glede na to 36 0.01 % 34.78 7 0 % 30.37 0 0 % 0 22 0.01 % 62.30 7 0.01 % 44.90 0.00 6.00 13.78 24.12 5.30 226.66 nič osem nič nič 36 0.01 % 34.78 36 0.02 % 156.17 0 0 % 0 0 0 % 0 0 0 % 0 0.04 6.00 26.21 36.55 9.29 495.99 osem nič osem nič 36 0.01 % 34.78 36 0.02 % 156.17 0 0 % 0 0 0 % 0 0 0 % 0 0.05 6.00 27.91 38.25 
9.63 532.86 šest osem nič osem 36 0.01 % 34.78 36 0.02 % 156.17 0 0 % 0 0 0 % 0 0 0 % 0 0.06 6.00 29.45 39.79 10.05 566.27 je pa res da 34 0.01 % 32.85 17 0.01 % 73.75 0 0 % 0 8 0 % 22.65 9 0.01 % 57.73 0.00 5.82 9.67 19.84 4.68 130.00 na to da je 34 0.01 % 32.85 12 0.01 % 52.06 0 0 % 0 16 0.01 % 45.31 6 0.01 % 38.49 0.00 5.81 8.03 18.20 4.71 96.63 aha ja ja ja 30 0 % 28.98 2 0 % 8.68 15 0.01 % 50.75 0 0 % 0 13 0.01 % 83.39 0.00 5.47 9.97 19.78 4.66 120.06 mi zdi da je 29 0 % 28.02 8 0.01 % 34.70 6 0 % 20.30 10 0 % 28.32 5 0 % 32.07 0.00 5.38 14.06 23.78 4.98 187.49 ne ja ja ja 29 0 % 28.02 2 0 % 8.68 11 0.01 % 37.22 0 0 % 0 16 0.01 % 102.63 0.00 5.30 6.05 15.77 4.17 48.55 aha aha aha aha 28 0 % 27.05 10 0.01 % 43.38 7 0 % 23.69 0 0 % 0 11 0.01 % 70.56 0.01 5.29 20.72 30.33 7.80 293.25 se mi je zdelo 28 0 % 27.05 9 0.01 % 39.04 10 0.01 % 33.84 4 0 % 11.33 5 0 % 32.07 0.00 5.29 17.73 27.34 5.00 242.85 šest šest osem nič 28 0 % 27.05 28 0.02 % 121.46 0 0 % 0 0 0 % 0 0 0 % 0 0.05 5.29 28.93 38.54 9.66 431.62 to se mi zdi 27 0 % 26.08 2 0 % 8.68 3 0 % 10.15 9 0 % 25.48 13 0.01 % 83.39 0.00 5.20 15.26 24.77 5.53 194.05 ne vem ne vem 26 0 % 25.12 4 0 % 17.35 15 0.01 % 50.75 4 0 % 11.33 3 0 % 19.24 0.00 5.10 11.52 20.93 4.68 128.42 ja to je pa 25 0 % 24.15 5 0 % 21.69 8 0 % 27.07 8 0 % 22.65 4 0 % 25.66 0.00 4.92 5.89 15.17 3.92 39.44 da je da je 24 0 % 23.19 1 0 % 4.34 10 0.01 % 33.84 4 0 % 11.33 9 0.01 % 57.73 0.00 4.81 5.79 14.96 3.82 36.51 ne vem če je 24 0 % 23.19 2 0 % 8.68 9 0 % 30.45 7 0 % 19.82 6 0.01 % 38.49 0.00 4.89 10.22 19.39 4.37 99.72 gre za to da 23 0 % 22.22 6 0 % 26.03 1 0 % 3.38 15 0.01 % 42.47 1 0 % 6.41 0.00 4.80 13.14 22.19 5.07 135.98 ja sej to je 23 0 % 22.22 4 0 % 17.35 14 0.01 % 47.37 1 0 % 2.83 4 0 % 25.66 0.00 4.79 8.86 17.90 4.19 76.74 nič osem nič trinajst 23 0 % 22.22 0 0 % 0 0 0 % 0 23 0.01 % 65.13 0 0 % 0 0.03 4.80 29.62 38.67 9.13 364.22 nič trinajst nič ena 23 0 % 22.22 0 0 % 0 0 0 % 0 23 0.01 % 65.13 0 0 % 0 0.03 4.80 27.82 36.87 8.72 
339.21 osem nič trinajst nič 23 0 % 22.22 0 0 % 0 0 0 % 0 23 0.01 % 65.13 0 0 % 0 0.03 4.80 29.62 38.67 9.13 364.22 to je pa res 23 0 % 22.22 10 0.01 % 43.38 9 0 % 30.45 1 0 % 2.83 3 0 % 19.24 0.00 4.79 9.23 18.28 4.14 81.92 ja to je to 22 0 % 21.25 7 0 % 30.37 10 0.01 % 33.84 2 0 % 5.66 3 0 % 19.24 0.00 4.64 6.46 15.38 3.90 42.09 CJVT // A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora // 777 File at CLARIN.SI2.4.8 List of all word-level 5-grams from lower-case word forms in the GOS 1.0 corpus with text-type distribution and collocation measuresGOS1.0-word_sets-lowercase_forms- 5grams-taxonomy-collocativity-entire.tsvLower-case form of string Total absolute frequency of lower-case word form Percentage of all found lower-case word forms Total relative frequency (per million occurrences) Absolute frequency [gos.T.J.R] Percentage [gos.T.J.R] Relative frequency [gos.T.J.R] Absolute frequency [gos.T.N.Z] Percentage [gos.T.N.Z] Relative frequency [gos.T.N.Z] Absolute frequency [gos.T.J.I] Percentage [gos.T.J.I] Relative frequency [gos.T.J.I] Absolute frequency [gos.T.N.N] Percentage [gos.T.N.N] Relative frequency [gos.T.N.N] Dice t-score MI MI3 logDice simple LL ja ja ja ja ja 216 0.03 % 208.67 23 0.02 % 99.77 120 0.07 % 406.03 0 0 % 0 73 0.07 % 468.27 0.01 14.70 25.07 40.58 7.13 2,828.41 ne ne ne ne ne 43 0.01 % 41.54 20 0.01 % 86.76 14 0.01 % 47.37 6 0 % 16.99 3 0 % 19.24 0.00 6.56 23.17 34.03 4.55 513.94 osem nič osem nič nič 36 0.01 % 34.78 36 0.02 % 156.17 0 0 % 0 0 0 % 0 0 0 % 0 0.04 6 37.74 48.08 9.49 745.90 šest osem nič osem nič 36 0.01 % 34.78 36 0.02 % 156.17 0 0 % 0 0 0 % 0 0 0 % 0 0.05 6 39.28 49.62 9.77 779.30 se mi zdi da je 29 0 % 28.02 8 0.01 % 34.70 6 0 % 20.30 10 0 % 28.32 5 0.01 % 32.07 0.00 5.39 21.78 31.50 4.95 322.33 glede na to da je 28 0 % 27.05 10 0.01 % 43.38 0 0 % 0 14 0.01 % 39.64 4 0 % 25.66 0.00 5.29 21.73 31.35 4.75 310.36 šest šest osem nič osem 28 0 % 27.05 28 0.02 % 121.46 0 0 % 0 0 0 % 0 0 0 % 0 0.05 5.29 40.46 
50.07 9.77 625.99
nič osem nič trinajst nič 23 0 % 22.22 0 0 % 0 0 0 % 0 23 0.01 % 65.13 0 0 % 0 0.03 4.80 39.45 48.50 8.94 500.33
osem nič trinajst nič ena 23 0 % 22.22 0 0 % 0 0 0 % 0 23 0.01 % 65.13 0 0 % 0 0.03 4.80 39.35 48.40 8.91 498.87
aha ja ja ja ja 17 0 % 16.42 1 0 % 4.34 7 0 % 23.69 0 0 % 0 9 0.01 % 57.73 0.00 4.12 21.01 29.19 3.75 181.07
s hiti na radiu siti 17 0 % 16.42 17 0.01 % 73.75 0 0 % 0 0 0 % 0 0 0 % 0 0.00 4.12 40.27 48.45 6.33 378.18
iz osemdesetih devetdesetih in danes 16 0 % 15.46 16 0.01 % 69.41 0 0 % 0 0 0 % 0 0 0 % 0 0.00 4 39.36 47.36 6.11 347.11
mhm mhm mhm mhm mhm 16 0 % 15.46 0 0 % 0 7 0 % 23.69 2 0 % 5.66 7 0.01 % 44.90 0.00 4.00 23.28 31.28 5.87 192.28
glasbenem miksu za vso slovenijo 15 0 % 14.49 15 0.01 % 65.07 0 0 % 0 0 0 % 0 0 0 % 0 0.01 3.87 49.09 56.91 7.25 413.35
ve pika radio kapris pika 15 0 % 14.49 15 0.01 % 65.07 0 0 % 0 0 0 % 0 0 0 % 0 0.11 3.87 49.44 57.26 10.79 416.53
zaslužite s hiti na radiu 15 0 % 14.49 15 0.01 % 65.07 0 0 % 0 0 0 % 0 0 0 % 0 0.00 3.87 40.98 48.79 6.15 340.06
dvojni ve pika radio kapris 14 0 % 13.53 14 0.01 % 60.73 0 0 % 0 0 0 % 0 0 0 % 0 0.12 3.74 51.57 59.18 10.89 406.65
hitov iz osemdesetih devetdesetih in 14 0 % 13.53 14 0.01 % 60.73 0 0 % 0 0 0 % 0 0 0 % 0 0.00 3.74 44.42 52.03 6.00 346.38
najboljšem glasbenem miksu za vso 14 0 % 13.53 14 0.01 % 60.73 0 0 % 0 0 0 % 0 0 0 % 0 0.01 3.74 51.40 59.02 7.16 405.27
v najboljšem glasbenem miksu za 14 0 % 13.53 14 0.01 % 60.73 0 0 % 0 0 0 % 0 0 0 % 0 0.00 3.74 43.77 51.38 5.52 340.93
ne glede na to da 13 0 % 12.56 5 0 % 21.69 0 0 % 0 8 0 % 22.65 0 0 % 0 0.00 3.61 20.63 28.03 3.76 135.43
to je to je to 13 0 % 12.56 1 0 % 4.34 5 0 % 16.92 1 0 % 2.83 6 0.01 % 38.49 0.00 3.61 22.63 30.04 3.09 151.16
trikat dvojni ve pika radio 13 0 % 12.56 13 0.01 % 56.39 0 0 % 0 0 0 % 0 0 0 % 0 0.11 3.61 52.52 59.92 10.84 385.07
devet nič devet tri tri 12 0 % 11.59 12 0.01 % 52.06 0 0 % 0 0 0 % 0 0 0 % 0 0.02 3.46 37.89 45.06 8.06 249.76
devet tri tri nič ena 12 0 % 11.59 12 0.01 % 52.06 0 0 % 0 0 0 % 0 0 0 % 0 0.01 3.46 35.31 42.48 7.71 231.08
nič devet nič devet tri 12 0 % 11.59 12 0.01 % 52.06 0 0 % 0 0 0 % 0 0 0 % 0 0.02 3.46 37.80 44.97 8.03 249.10
nič devet tri tri nič 12 0 % 11.59 12 0.01 % 52.06 0 0 % 0 0 0 % 0 0 0 % 0 0.01 3.46 35.41 42.58 7.73 231.84
v popolnem miksu hitov iz 12 0 % 11.59 12 0.01 % 52.06 0 0 % 0 0 0 % 0 0 0 % 0 0.00 3.46 45.33 52.50 5.69 303.52
aha aha aha aha aha 11 0 % 10.63 4 0 % 17.35 3 0 % 10.15 0 0 % 0 4 0 % 25.66 0.01 3.32 28.34 35.26 6.45 165.70
men se zdi da je 11 0 % 10.63 0 0 % 0 7 0 % 23.69 2 0 % 5.66 2 0 % 12.83 0.00 3.32 21.36 28.28 3.62 119.45
miksu hitov iz osemdesetih devetdesetih 11 0 % 10.63 11 0.01 % 47.72 0 0 % 0 0 0 % 0 0 0 % 0 0.03 3.32 53.22 60.14 8.79 330.47
ne ja ja ja ja 11 0 % 10.63 0 0 % 0 5 0 % 16.92 0 0 % 0 6 0.01 % 38.49 0.00 3.32 22.70 29.62 2.78 128.32

2.4.9 List of all 2-grams of morphosyntactic tags in the GOS 1.0 corpus with their part-of-speech categories and text-type distribution

File at CLARIN.SI: GOS1.0-word_sets-morphosyntactic_tags-2grams-taxonomy-collocativity-entire.tsv

Columns: Morphosyntactic tag of string; Part-of-speech category of string; Total absolute frequency of morphosyntactic tag; Percentage of all found morphosyntactic tags; Total relative frequency (per million occurrences); Absolute frequency [gos.T.J.R]; Percentage [gos.T.J.R]; Relative frequency [gos.T.J.R]; Absolute frequency [gos.T.N.Z]; Percentage [gos.T.N.Z]; Relative frequency [gos.T.N.Z]; Absolute frequency [gos.T.N.N]; Percentage [gos.T.N.N]; Relative frequency [gos.T.N.N]; Absolute frequency [gos.T.J.I]; Percentage [gos.T.J.I]; Relative frequency [gos.T.J.I]; Dice; t-score; MI; MI3; logDice; simple LL

L L L L 11,105 1.20 % 10,728.36 2,405 1.17 % 10,432.80 4,730 1.82 % 16,004.49 2,471 1.78 % 15,850.62 1,499 0.47 % 4,244.67 0.11 13.11 0.19 27.07 10.82 -1,481.92
Rsn Rsn R R 9,949 1.08 % 9,611.57 2,224 1.08 % 9,647.63 3,111 1.20 % 10,526.42 1,731 1.25 % 11,103.77 2,883 0.90 % 8,163.69 0.11 16.00 0.25 26.81 10.78 -1,681.39
L Rsn L R 9,211 1.00 % 8,898.60 2,172 1.06 % 9,422.05 2,873 1.11 % 9,721.12 1,645 1.18 % 10,552.11 2,521 0.79 % 7,138.63 0.10 2.07 0.03 26.37 10.61 -223.41
Vp L V L 8,379 0.91 % 8,094.82 1,655 0.81 % 7,179.33 2,562 0.99 % 8,668.82 1,431 1.03 % 9,179.37 2,731 0.85 % 7,733.28 0.10 19.64 0.35 26.41 10.67 -1,838.23
Vp Rsn V R 7,706 0.83 % 7,444.64 1,693 0.82 % 7,344.17 2,448 0.94 % 8,283.09 1,246 0.90 % 7,992.66 2,319 0.72 % 6,566.63 0.10 18.30 0.34 26.16 10.62 -1,648.22
Rsn L R L 6,732 0.73 % 6,503.68 1,366 0.66 % 5,925.66 2,224 0.86 % 7,525.16 1,423 1.02 % 9,128.06 1,719 0.54 % 4,867.63 0.07 -27.79 -0.42 25.01 10.16 2,854.23
Rsn Vd R V 5,629 0.61 % 5,438.09 1,109 0.54 % 4,810.80 1,702 0.66 % 5,758.91 1,147 0.82 % 7,357.61 1,671 0.52 % 4,731.71 0.08 20.57 0.46 25.38 10.38 -1,519.73
Vd Gp-ste-n V G 5,580 0.60 % 5,390.75 951 0.46 % 4,125.40 1,604 0.62 % 5,427.32 719 0.52 % 4,612.14 2,306 0.72 % 6,529.82 0.15 56.19 2.01 26.90 11.21 -1,632.80
Rsn Vp R V 5,136 0.56 % 4,961.81 1,123 0.55 % 4,871.53 1,635 0.63 % 5,532.21 877 0.63 % 5,625.65 1,501 0.47 % 4,250.33 0.06 -13.44 -0.25 24.40 10.03 1,159.89
N N N N 4,735 0.51 % 4,574.41 1,123 0.55 % 4,871.53 1,189 0.46 % 4,023.12 813 0.58 % 5,215.12 1,610 0.50 % 4,558.98 0.10 38.40 1.18 25.60 10.70 -1,926.42
Gp-ste-n Rsn G R 4,245 0.46 % 4,101.03 826 0.40 % 3,583.16 1,312 0.51 % 4,439.30 675 0.49 % 4,329.89 1,432 0.45 % 4,054.95 0.07 21.77 0.59 24.69 10.13 -1,337.25
L Vp L V 4,175 0.45 % 4,033.40 825 0.40 % 3,578.82 1,582 0.61 % 5,352.88 790 0.57 % 5,067.58 978 0.31 % 2,769.37 0.05 -37.23 -0.66 23.40 9.67 3,161.50
L Ggnspe L G 4,020 0.43 % 3,883.66 610 0.30 % 2,646.16 2,013 0.78 % 6,811.21 747 0.54 % 4,791.75 650 0.20 % 1,840.58 0.07 50.24 2.27 26.21 10.24 -881.03
N Rsn N R 3,937 0.43 % 3,803.47 803 0.39 % 3,483.38 993 0.38 % 3,359.93 723 0.52 % 4,637.80 1,418 0.44 % 4,015.30 0.06 -3.88 -0.09 23.80 9.85 282.07
N L N L 3,552 0.38 % 3,431.53 757 0.37 % 3,283.84 1,088 0.42 % 3,681.37 628 0.45 % 4,028.40 1,079 0.34 % 3,055.37 0.05 -16.09 -0.34 23.24 9.63 1,180.25
Zk-sei Gp-ste-n Z G 3,445 0.37 % 3,328.16 769 0.37 % 3,335.89 1,093 0.42 % 3,698.29 630 0.45 % 4,041.23 953 0.30 % 2,698.58 0.14 49.54 2.68 26.18 11.17 -254.72
Vp Vp V V 3,370 0.36 % 3,255.70 706 0.34 % 3,062.60 1,042 0.40 % 3,525.73 578 0.42 % 3,707.67 1,044 0.33 % 2,956.26 0.05 -18.67 -0.40 23.03 9.67 1,351.71
Dm Sozem D S 3,269 0.35 % 3,158.13 753 0.37 % 3,266.49 697 0.27 % 2,358.38 386 0.28 % 2,476.06 1,433 0.45 % 4,057.78 0.19 54.82 4.60 27.95 11.62 2,789.88
L Vd L V 3,222 0.35 % 3,112.72 588 0.29 % 2,550.72 1,126 0.43 % 3,809.95 670 0.48 % 4,297.82 838 0.26 % 2,372.94 0.04 -20.90 -0.45 22.86 9.50 1,495.39
L N L N 3,195 0.35 % 3,086.64 629 0.31 % 2,728.58 974 0.38 % 3,295.64 582 0.42 % 3,733.33 1,010 0.32 % 2,859.98 0.04 -23.28 -0.50 22.79 9.48 1,674.39
Dm Somem D S 3,127 0.34 % 3,020.94 794 0.39 % 3,444.34 607 0.23 % 2,053.85 472 0.34 % 3,027.72 1,254 0.39 % 3,550.91 0.18 53.61 4.60 27.82 11.56 2,661.10
Vp N V N 3,094 0.34 % 2,989.06 787 0.38 % 3,413.98 819 0.32 % 2,771.18 438 0.32 % 2,809.62 1,050 0.33 % 2,973.25 0.05 0.74 0.02 23.21 9.79 -46.20
Rsn N R N 3,081 0.33 % 2,976.50 645 0.31 % 2,797.99 854 0.33 % 2,889.61 574 0.41 % 3,682.01 1,008 0.32 % 2,854.32 0.04 -19.81 -0.44 22.74 9.50 1,382.65
Vd Zp------k V Z 3,078 0.33 % 2,973.61 626 0.30 % 2,715.56 629 0.24 % 2,128.29 519 0.37 % 3,329.21 1,304 0.41 % 3,692.49 0.10 42.91 2.14 25.32 10.68 -791.34
Vp Gp-ste-n V G 2,986 0.32 % 2,884.73 605 0.29 % 2,624.47 1,068 0.41 % 3,613.70 395 0.28 % 2,533.79 918 0.29 % 2,599.47 0.06 16.87 0.53 23.62 9.94 -886.05
Rsn Dm R D 2,940 0.32 % 2,840.29 691 0.34 % 2,997.53 838 0.32 % 2,835.47 462 0.33 % 2,963.57 949 0.30 % 2,687.25 0.05 5.58 0.16 23.20 9.62 -327.66
Rsn Gp-ste-n R G 2,785 0.30 % 2,690.54 586 0.28 % 2,542.05 913 0.35 % 3,089.24 409 0.29 % 2,623.59 877 0.27 % 2,483.37 0.04 -0.79 -0.02 22.87 9.52 47.51
Gp-ste-n L G L 2,759 0.30 % 2,665.42 606 0.29 % 2,628.80 925 0.36 % 3,129.84 414 0.30 % 2,655.67 814 0.26 % 2,304.98 0.04 -5.54 -0.14 22.72 9.42 341.53
Vd L V L 2,624 0.28 % 2,535 560 0.27 % 2,429.26 831 0.32 % 2,811.78 455 0.33 % 2,918.67 778 0.24 % 2,203.04 0.04 -34.83 -0.75 21.97 9.20 2,386.31
Ppnzei Sozei P S 2,607 0.28 % 2,518.58 640 0.31 % 2,776.30 307 0.12 % 1,038.77 257 0.18 % 1,648.57 1,403 0.44 % 3,972.83 0.23 49.13 4.72 27.42 11.89 2,396.26
Vd Gp-g V G 2,549 0.28 % 2,462.55 454 0.22 % 1,969.43 661 0.26 % 2,236.57 471 0.34 % 3,021.30 963 0.30 % 2,726.89 0.10 43.99 2.96 25.59 10.62 97.40
Dt Sozet D S 2,533 0.27 % 2,447.09 569 0.28 % 2,468.30 744 0.29 % 2,517.41 295 0.21 % 1,892.32 925 0.29 % 2,619.29 0.17 46.47 3.70 26.32 11.48 971.57

2.4.10 List of all 3-grams of morphosyntactic tags in the GOS 1.0 corpus with their part-of-speech categories and text-type distribution

File at CLARIN.SI: GOS1.0-word_sets-morphosyntactic_tags-3grams-taxonomy-collocativity-entire.tsv

Columns: Morphosyntactic tag of string; Part-of-speech category of string; Total absolute frequency of morphosyntactic tag; Percentage of all found morphosyntactic tags; Total relative frequency (per million occurrences); Absolute frequency [gos.T.J.R]; Percentage [gos.T.J.R]; Relative frequency [gos.T.J.R]; Absolute frequency [gos.T.N.Z]; Percentage [gos.T.N.Z]; Relative frequency [gos.T.N.Z]; Absolute frequency [gos.T.N.N]; Percentage [gos.T.N.N]; Relative frequency [gos.T.N.N]; Absolute frequency [gos.T.J.I]; Percentage [gos.T.J.I]; Relative frequency [gos.T.J.I]; Dice; t-score; MI; MI3; logDice; simple LL

L L L L L L 2,505 0.30 % 2,420.04 492 0.27 % 2,134.28 1,110 0.48 % 3,755.81 721 0.58 % 4,624.97 182 0.06 % 515.36 0.02 31.22 1.41 23.99 8.68 -998.08
Rsn Rsn Rsn R R R 1,057 0.13 % 1,021.15 231 0.12 % 1,002.07 352 0.15 % 1,191.03 210 0.17 % 1,347.08 264 0.09 % 747.56 0.01 9.43 0.49 20.59 7.54 -298.75
Vp Rsn Rsn V R R 1,020 0.12 % 985.41 211 0.12 % 915.31 284 0.12 % 960.95 173 0.14 % 1,109.74 352 0.12 % 996.75 0.01 14.78 0.90 20.89 7.63 -393.60
L Rsn Rsn L R R 973 0.12 % 940 272 0.15 % 1,179.93 307 0.13 % 1,038.77 176 0.14 % 1,128.98 218 0.07 % 617.30 0.01 5.24 0.27 20.12 7.38 -171.49
Rsn Vp Rsn R V R 804 0.10 % 776.73 178 0.10 % 772.16 230 0.10 % 778.23 141 0.11 % 904.47 255 0.09 % 722.07 0.01 9.03 0.55 19.86 7.28 -244.38
Vp L Rsn V L R 795 0.10 % 768.04 175 0.10 % 759.14 232 0.10 % 785 133 0.11 % 853.15 255 0.09 % 722.07 0.01 7.23 0.43 19.70 7.22 -203.11
Rsn Rsn L R R L 783 0.09 % 756.44 167 0.09 % 724.44 265 0.12 % 896.66 180 0.14 % 1,154.64 171 0.06 % 484.21 0.01 -0.95 -0.05 19.18 7.07 30.40
N N N N N N 771 0.09 % 744.85 207 0.11 % 897.96 181 0.08 % 612.43 130 0.10 % 833.91 253 0.09 % 716.41 0.02 24.38 3.03 22.22 8.08 54.82
Vd Gp-ste-n Rsn V G R 757 0.09 % 731.33 120 0.07 % 520.56 200 0.09 % 676.72 112 0.09 % 718.44 325 0.11 % 920.29 0.01 23.00 2.61 21.74 7.77 -77.19
Rsn L Rsn R L R 696 0.08 % 672.39 153 0.08 % 663.71 212 0.09 % 717.33 141 0.11 % 904.47 190 0.07 % 538.02 0.01 -4.30 -0.22 18.67 6.90 135.71
Rsn Vp L R V L 684 0.08 % 660.80 146 0.08 % 633.34 195 0.09 % 659.80 125 0.10 % 801.83 218 0.07 % 617.30 0.01 3.55 0.21 19.05 7.01 -99.02
L Rsn Vd L R V 658 0.08 % 635.68 133 0.07 % 576.95 218 0.10 % 737.63 145 0.12 % 930.13 162 0.06 % 458.73 0.01 10.21 0.73 19.46 7.08 -233.78
L L Rsn L L R 656 0.08 % 633.75 152 0.08 % 659.37 257 0.11 % 869.59 134 0.11 % 859.56 113 0.04 % 319.98 0.01 -8.49 -0.41 18.30 6.78 271.74
Vp Rsn L V R L 637 0.08 % 615.40 137 0.07 % 594.30 203 0.09 % 686.87 121 0.10 % 776.17 176 0.06 % 498.37 0.01 1.82 0.11 18.74 6.91 -50.37
Vp L L V L L 635 0.08 % 613.46 147 0.08 % 637.68 224 0.10 % 757.93 120 0.10 % 769.76 144 0.05 % 407.76 0.01 -0.11 -0.01 18.61 6.86 3.19
Gp-ste-n Rsn Rsn G R R 621 0.07 % 599.94 93 0.05 % 403.43 212 0.09 % 717.33 112 0.09 % 718.44 204 0.07 % 577.66 0.01 14.73 1.29 19.85 7.13 -251.77
L Rsn L L R L 614 0.07 % 593.18 134 0.07 % 581.29 224 0.10 % 757.93 140 0.11 % 898.05 116 0.04 % 328.47 0.01 -10.47 -0.51 18.02 6.68 330.90
Dm Ppnzem Sozem D P S 599 0.07 % 578.68 186 0.10 % 806.86 39 0.02 % 131.96 51 0.04 % 327.15 323 0.11 % 914.63 0.05 24.47 12.08 30.53 9.71 3,157.48
Rsn Rsn Vp R R V 548 0.07 % 529.41 126 0.07 % 546.58 161 0.07 % 544.76 89 0.07 % 570.90 172 0.06 % 487.05 0.01 0.00 0.00 18.20 6.73 -0.11
Zk-sei Gp-ste-n Rsn Z G R 537 0.07 % 518.79 102 0.06 % 442.47 196 0.09 % 663.19 116 0.09 % 744.10 123 0.04 % 348.29 0.01 21.09 3.48 21.61 7.54 146.42
L Zk-sei Gp-ste-n L Z G 535 0.06 % 516.85 122 0.07 % 529.23 178 0.08 % 602.28 125 0.10 % 801.83 110 0.04 % 311.48 0.01 20.88 3.36 21.49 7.46 116.82
Rsn Vd Gp-ste-n R V G 530 0.06 % 512.02 78 0.04 % 338.36 178 0.08 % 602.28 101 0.08 % 647.88 173 0.06 % 489.88 0.01 17.63 2.09 20.19 7.26 -143.66
Rsn L L R L L 524 0.06 % 506.23 114 0.06 % 494.53 177 0.08 % 598.90 141 0.11 % 904.47 92 0.03 % 260.51 0.01 -15.26 -0.74 17.33 6.45 466.30
L L Ggnspe L L G 523 0.06 % 505.26 94 0.05 % 407.77 279 0.12 % 944.03 85 0.07 % 545.25 65 0.02 % 184.06 0.01 19.33 2.69 20.75 6.94 -36.24
Dm Ppnmem Somem D P S 516 0.06 % 498.50 169 0.09 % 733.12 20 0.01 % 67.67 36 0.03 % 230.93 291 0.10 % 824.01 0.04 22.71 11.51 29.54 9.49 2,545.55
L Rsn Vp L R V 516 0.06 % 498.50 108 0.06 % 468.50 176 0.08 % 595.52 92 0.07 % 590.15 140 0.05 % 396.43 0.01 -3.31 -0.20 17.83 6.60 89.36
Rsn Rsn Vd R R V 508 0.06 % 490.77 87 0.05 % 377.40 157 0.07 % 531.23 100 0.08 % 641.47 164 0.06 % 464.39 0.01 6.25 0.47 18.45 6.75 -138.52
L Vp L L V L 505 0.06 % 487.87 113 0.06 % 490.19 175 0.08 % 592.13 100 0.08 % 641.47 117 0.04 % 331.30 0.01 -5.91 -0.34 17.62 6.53 163.21
L Vp Rsn L V R 499 0.06 % 482.08 108 0.06 % 468.50 181 0.08 % 612.43 85 0.07 % 545.25 125 0.04 % 353.96 0.01 -4.13 -0.24 17.68 6.55 110.85
Vp L Ggnspe V L G 484 0.06 % 467.58 60 0.03 % 260.28 249 0.11 % 842.52 106 0.09 % 679.95 69 0.02 % 195.38 0.01 19.51 3.14 20.98 7.07 57.77
Vp Zk-sei Gp-ste-n V Z G 477 0.06 % 460.82 104 0.06 % 451.15 147 0.06 % 497.39 68 0.05 % 436.20 158 0.05 % 447.40 0.01 20.23 3.76 21.55 7.65 195.98
Gp-ste-n L Rsn G L R 445 0.05 % 429.91 96 0.05 % 416.44 146 0.06 % 494.01 68 0.05 % 436.20 135 0.05 % 382.27 0.01 8.11 0.70 18.30 6.60 -154.58

2.4.11 List of all 4-grams of morphosyntactic tags in the GOS 1.0 corpus with their part-of-speech categories and text-type distribution

File at CLARIN.SI: GOS1.0-word_sets-morphosyntactic_tags-4grams-taxonomy-collocativity-entire.tsv

Columns: Morphosyntactic tag of string; Part-of-speech category of string; Total absolute frequency of morphosyntactic tag; Percentage of all found morphosyntactic tags; Total relative frequency (per million occurrences); Absolute frequency [gos.T.J.R]; Percentage [gos.T.J.R]; Relative frequency [gos.T.J.R]; Absolute frequency [gos.T.N.Z]; Percentage [gos.T.N.Z]; Relative frequency [gos.T.N.Z]; Absolute frequency [gos.T.N.N]; Percentage [gos.T.N.N]; Relative frequency [gos.T.N.N]; Absolute frequency [gos.T.J.I]; Percentage [gos.T.J.I]; Relative frequency [gos.T.J.I]; Dice; t-score; MI; MI3; logDice; simple LL

L L L L L L L L 891 0.12 % 860.78 176 0.11 % 763.48 381 0.19 % 1,289.16 281 0.25 % 1,802.52 53 0.02 % 150.08 0.01 29.58 6.77 26.37 7.19 1,865.49
N N N N N N N N 166 0.02 % 160.37 54 0.03 % 234.25 44 0.02 % 148.88 24 0.02 % 153.95 44 0.02 % 124.59 0.00 12.56 5.29 20.04 5.87 205.58
Zp------k Zop-ed--k Ggnste Vd Z Z G V 158 0.02 % 152.64 43 0.03 % 186.53 37 0.02 % 125.19 42 0.04 % 269.42 36 0.01 % 101.94 0.01 12.57 12.79 27.40 7.11 900.51
Rsn Rsn Rsn Rsn R R R R 146 0.02 % 141.05 39 0.02 % 169.18 44 0.02 % 148.88 33 0.03 % 211.68 30 0.01 % 84.95 0.00 12.01 7.39 21.77 4.69 359.57
Kbg-mi Kbg-mi Kbg-mi Kbg-mi K K K K 142 0.02 % 137.18 115 0.07 % 498.87 5 0 % 16.92 7 0.01 % 44.90 15 0.01 % 42.47 0.02 11.92 15.98 30.28 8.37 1,082.23
M M M M M M M M 136 0.02 % 131.39 33 0.02 % 143.15 46 0.02 % 155.65 50 0.04 % 320.73 7 0 % 19.82 0.01 11.66 13.06 27.23 7.59 797.03
L Rsn Rsn Rsn L R R R 115 0.01 % 111.10 29 0.02 % 125.80 38 0.02 % 128.58 29 0.03 % 186.03 19 0.01 % 53.80 0.00 10.15 4.22 17.91 4.31 74.27
Vp Rsn Rsn L V R R L 112 0.01 % 108.20 22 0.01 % 95.44 34 0.02 % 115.04 28 0.03 % 179.61 28 0.01 % 79.29 0.00 10.28 5.13 18.74 4.37 128.14
L L L Rsn L L L R 110 0.01 % 106.27 20 0.01 % 86.76 49 0.02 % 165.80 31 0.03 % 198.85 10 0 % 28.32 0.00 10.35 6.21 19.77 4.19 194.10
Vd Gp-ste-n Rsn Rsn V G R R 105 0.01 % 101.44 8 0.01 % 34.70 33 0.02 % 111.66 17 0.01 % 109.05 47 0.02 % 133.09 0.00 9.44 3.66 17.09 4.71 37.90
Vp L Rsn Rsn V L R R 103 0.01 % 99.51 27 0.02 % 117.12 34 0.02 % 115.04 20 0.02 % 128.29 22 0.01 % 62.30 0.00 9.83 5.01 18.38 4.25 110.87
Rsn Rsn Vp L R R V L 97 0.01 % 93.71 22 0.01 % 95.44 28 0.01 % 94.74 17 0.01 % 109.05 30 0.01 % 84.95 0.00 9.52 4.92 18.12 4.17 99.72
Zk-sei Gp-ste-n Rsn Rsn Z G R R 93 0.01 % 89.85 15 0.01 % 65.07 35 0.02 % 118.43 26 0.02 % 166.78 17 0.01 % 48.14 0.00 9.19 4.42 17.50 4.70 70.34
Rsn Rsn Vp Rsn R R V R 91 0.01 % 87.91 19 0.01 % 82.42 24 0.01 % 81.21 11 0.01 % 70.56 37 0.01 % 104.77 0.00 8.67 3.45 16.47 4.10 23.75
Vp Rsn Rsn Rsn V R R R 90 0.01 % 86.95 20 0.01 % 86.76 29 0.01 % 98.12 15 0.01 % 96.22 26 0.01 % 73.62 0.00 8.61 3.44 16.42 4.09 22.81
L Zk-sei Gp-ste-n L L Z G L 89 0.01 % 85.98 24 0.01 % 104.11 23 0.01 % 77.82 29 0.03 % 186.03 13 0.01 % 36.81 0.00 8.90 4.14 17.09 4.55 53.97
Rsn Vp Rsn Rsn R V R R 89 0.01 % 85.98 18 0.01 % 78.08 21 0.01 % 71.06 20 0.02 % 128.29 30 0.01 % 84.95 0.00 8.55 3.42 16.37 4.07 21.88
Rsn L L L R L L L 86 0.01 % 83.08 19 0.01 % 82.42 33 0.02 % 111.66 27 0.02 % 173.20 7 0 % 19.82 0.00 9.11 5.85 18.71 3.84 134.02
Rsn Rsn Rsn L R R R L 85 0.01 % 82.12 12 0.01 % 52.06 28 0.01 % 94.74 28 0.03 % 179.61 17 0.01 % 48.14 0.00 8.55 3.78 16.60 3.88 35.81
Gp-ste-n Gp-d-es Rsn Rsn G G R R 84 0.01 % 81.15 12 0.01 % 52.06 40 0.02 % 135.34 14 0.01 % 89.81 18 0.01 % 50.97 0.00 9.08 6.73 19.51 4.64 173.78
Rsn Dt Zk-set Vd R D Z V 83 0.01 % 80.18 34 0.02 % 147.49 4 0 % 13.53 7 0.01 % 44.90 38 0.01 % 107.60 0.00 9.09 8.99 21.74 5.10 283.73
L Rsn Rsn L L R R L 80 0.01 % 77.29 18 0.01 % 78.08 30 0.01 % 101.51 19 0.02 % 121.88 13 0.01 % 36.81 0.00 8.01 3.27 15.91 3.76 13.94
L Zk-sei Gp-ste-n Rsn L Z G R 80 0.01 % 77.29 8 0.01 % 34.70 32 0.02 % 108.28 22 0.02 % 141.12 18 0.01 % 50.97 0.00 8.42 4.10 16.74 4.43 46.66
Zk-sei Gp-ste-n Zk-sei Gp-ste-n Z G Z G 78 0.01 % 75.35 11 0.01 % 47.72 31 0.01 % 104.89 23 0.02 % 147.54 13 0.01 % 36.81 0.00 8.80 8.13 20.70 5.70 226.28
L Rsn Vp Rsn L R V R 77 0.01 % 74.39 17 0.01 % 73.75 29 0.01 % 98.12 9 0.01 % 57.73 22 0.01 % 62.30 0.00 8.41 4.59 17.12 3.83 65.04
L L Rsn Rsn L L R R 75 0.01 % 72.46 20 0.01 % 86.76 28 0.01 % 94.74 15 0.01 % 96.22 12 0.01 % 33.98 0.00 7.70 3.17 15.63 3.67 9.90
Rsn Vp L Rsn R V L R 75 0.01 % 72.46 20 0.01 % 86.76 16 0.01 % 54.14 11 0.01 % 70.56 28 0.01 % 79.29 0.00 8.29 4.55 17.01 3.79 61.80
Vp Rsn L Rsn V R L R 75 0.01 % 72.46 23 0.01 % 99.77 25 0.01 % 84.59 9 0.01 % 57.73 18 0.01 % 50.97 0.00 8.29 4.55 17.01 3.79 61.80
Vd Gp-ste-n L Rsn V G L R 74 0.01 % 71.49 19 0.01 % 82.42 25 0.01 % 84.59 10 0.01 % 64.15 20 0.01 % 56.63 0.00 7.64 3.15 15.57 4.17 9.13
N L L L N L L L 73 0.01 % 70.52 15 0.01 % 65.07 32 0.02 % 108.28 18 0.02 % 115.46 8 0 % 22.65 0.00 7.57 3.13 15.51 3.78 8.37
Vd Gp-ste-n Gp-d-es Rsn V G G R 73 0.01 % 70.52 14 0.01 % 60.73 30 0.01 % 101.51 5 0 % 32.07 24 0.01 % 67.96 0.00 8.50 7.56 19.94 4.79 186.86
Vp L L L V L L L 72 0.01 % 69.56 15 0.01 % 65.07 25 0.01 % 84.59 21 0.02 % 134.71 11 0 % 31.15 0.00 7.51 3.11 15.45 3.68 7.62

2.4.12 List of all 5-grams of morphosyntactic tags in the GOS 1.0 corpus with their part-of-speech categories and text-type distribution

File at CLARIN.SI: GOS1.0-word_sets-morphosyntactic_tags-5grams-taxonomy-collocativity-entire.tsv

Columns: Morphosyntactic tag of string; Part-of-speech category of string; Total absolute frequency of morphosyntactic tag; Percentage of all found morphosyntactic tags; Total relative frequency (per million occurrences); Absolute frequency [gos.T.J.R]; Percentage [gos.T.J.R]; Relative frequency [gos.T.J.R]; Absolute frequency [gos.T.N.Z]; Percentage [gos.T.N.Z]; Relative frequency [gos.T.N.Z]; Absolute frequency [gos.T.N.N]; Percentage [gos.T.N.N]; Relative frequency [gos.T.N.N]; Absolute frequency [gos.T.J.I]; Percentage [gos.T.J.I]; Relative frequency [gos.T.J.I]; Dice; t-score; MI; MI3; logDice; simple LL

L L L L L L L L L L 374 0.06 % 361.32 65 0.04 % 281.97 178 0.10 % 602.28 115 0.11 % 737.69 16 0.01 % 45.31 0.00 19.34 26.20 43.29 5.93 5,151.16
M M M M M M M M M M 62 0.01 % 59.90 13 0.01 % 56.39 25 0.01 % 84.59 21 0.02 % 134.71 3 0 % 8.49 0.01 7.87 25.52 37.43 6.46 828.65
Kbg-mi Kbg-mi Kbg-mi Kbg-mi Kbg-mi K K K K K 53 0.01 % 51.20 47 0.03 % 203.88 0 0 % 0 3 0 % 19.24 3 0 % 8.49 0.01 7.28 22.65 34.11 6.95 616.85
N N N N N N N N N N 53 0.01 % 51.20 18 0.01 % 78.08 19 0.01 % 64.29 7 0.01 % 44.90 9 0 % 25.48 0.00 7.28 22.75 34.20 4.22 619.86
Kbg-mi Kbg-mi Kbg-mi Kbg-mi Zl-sei K K K K Z 36 0.01 % 34.78 36 0.02 % 156.17 0 0 % 0 0 0 % 0 0 0 % 0 0.01 6.00 23.42 33.76 6.64 435.61
Kbg-mi Kbg-mi Kbg-mi Zl-sei Zl-sei K K K Z Z 36 0.01 % 34.78 36 0.02 % 156.17 0 0 % 0 0 0 % 0 0 0 % 0 0.01 6.00 25.64 35.98 6.94 483.64
L L L L Rsn L L L L R 31 0.01 % 29.95 5 0 % 21.69 9 0.01 % 30.45 15 0.01 % 96.22 2 0 % 5.66 0.00 5.57 21.88 31.79 2.36 346.36
M L L L L M L L L L 31 0.01 % 29.95 3 0 % 13.01 8 0 % 27.07 20 0.02 % 128.29 0 0 % 0 0.00 5.57 21.88 31.79 2.62 346.36
Zp------k Zop-ed--k Ggnste Vd Gp-ste-n Z Z G V G 31 0.01 % 29.95 7 0.01 % 30.37 7 0 % 23.69 7 0.01 % 44.90 10 0 % 28.32 0.00 5.57 22.78 32.69 4.58 363.23
Slmei Slmei Slmei Slmei Slmei S S S S S 30 0 % 28.98 17 0.01 % 73.75 5 0 % 16.92 0 0 % 0 8 0 % 22.65 0.01 5.48 24.63 34.44 6.87 384.88
Dm Kbgmmm Vp Kbg-mt Sozmr D K V K S 27 0 % 26.08 27 0.02 % 117.12 0 0 % 0 0 0 % 0 0 0 % 0 0.00 5.20 22.87 32.38 4.42 317.80
L L L L Vp L L L L V 27 0 % 26.08 5 0 % 21.69 7 0 % 23.69 12 0.01 % 76.98 3 0 % 8.49 0.00 5.20 21.68 31.19 2.24 298.43
Vp Zp------k Zop-ed--k Ggnste Vd V Z Z G V 26 0 % 25.12 8 0.01 % 34.70 4 0 % 13.53 6 0.01 % 38.49 8 0 % 22.65 0.00 5.10 22.93 32.33 3.90 307.00
Rsn L L L L R L L L L 24 0 % 23.19 7 0.01 % 30.37 9 0.01 % 30.45 7 0.01 % 44.90 1 0 % 2.83 0.00 4.90 21.51 30.68 1.99 262.81
Rsn Rsn Rsn Rsn Rsn R R R R R 24 0 % 23.19 11 0.01 % 47.72 3 0 % 10.15 6 0.01 % 38.49 4 0 % 11.33 0.00 4.90 21.51 30.68 2.08 262.81
Kbg-mi Zl-sei Kbg-mi Kbg-mi Kbzsmi K Z K K K 23 0 % 22.22 0 0 % 0 0 0 % 0 0 0 % 0 23 0.01 % 65.13 0.00 4.80 25.58 34.62 6.32 308.18
Zl-sei Kbg-mi Zl-sei Kbg-mi Kbg-mi Z K Z K K 23 0 % 22.22 0 0 % 0 0 0 % 0 0 0 % 0 23 0.01 % 65.13 0.00 4.80 24.99 34.04 6.29 300.04
L Zk-sei Gp-ste-n L L L Z G L L 22 0 % 21.25 6 0 % 26.03 6 0 % 20.30 8 0.01 % 51.32 2 0 % 5.66 0.00 4.69 21.38 30.30 2.36 239.25
Rsn Dt Zk-set Vd Gp-ste-n R D Z V G 22 0 % 21.25 8 0.01 % 34.70 1 0 % 3.38 1 0 % 6.41 12 0.01 % 33.98 0.00 4.69 23.60 32.52 3.25 268.64
Vd Zp------k Zop-ed--k Ggnste Vd V Z Z G V 22 0 % 21.25 5 0 % 21.69 6 0 % 20.30 9 0.01 % 57.73 2 0 % 5.66 0.00 4.69 23.79 32.71 3.91 271.07
L L L L Zk-sei L L L L Z 20 0 % 19.32 4 0 % 17.35 5 0 % 16.92 10 0.01 % 64.15 1 0 % 2.83 0.00 4.47 26.11 34.75 1.97 274.40
Vp L L L L V L L L L 20 0 % 19.32 5 0 % 21.69 5 0 % 16.92 8 0.01 % 51.32 2 0 % 5.66 0.00 4.47 21.25 29.89 1.80 215.84
L L L L M L L L L M 19 0 % 18.36 1 0 % 4.34 8 0 % 27.07 10 0.01 % 64.15 0 0 % 0 0.00 4.36 21.17 29.67 1.91 204.20
L Zk-sei Gp-ste-n Rsn Rsn L Z G R R 19 0 % 18.36 1 0 % 4.34 6 0 % 20.30 5 0.01 % 32.07 7 0 % 19.82 0.00 4.36 23.86 32.36 2.21 234.95
Zp------k Zop-ed--k Gp-ste-n Ggnd-es Vd Z Z G G V 19 0 % 18.36 7 0.01 % 30.37 5 0 % 16.92 3 0 % 19.24 4 0 % 11.33 0.00 4.36 21.17 29.67 4.00 204.20
Do Sommo Dm Somem N D S D S N 18 0 % 17.39 17 0.01 % 73.75 0 0 % 0 0 0 % 0 1 0 % 2.83 0.00 4.24 21.10 29.44 4.03 192.61
L L L L N L L L L N 18 0 % 17.39 6 0 % 26.03 8 0 % 27.07 2 0 % 12.83 2 0 % 5.66 0.00 4.24 21.10 29.44 1.72 192.61
Vp L Ggnspe Zv-sei L V L G Z L 18 0 % 17.39 0 0 % 0 10 0.01 % 33.84 5 0.01 % 32.07 3 0 % 8.49 0.00 4.24 26.00 34.34 2.37 245.73
Ggdsdm Do Sommo Dm Somem G D S D S 17 0 % 16.42 17 0.01 % 73.75 0 0 % 0 0 0 % 0 0 0 % 0 0.00 4.12 24.67 32.84 4.97 218.46
Rsn Zp------k Zop-ed--k Ggnste Vd R Z Z G V 17 0 % 16.42 3 0 % 13.01 5 0 % 16.92 7 0.01 % 44.90 2 0 % 5.66 0.00 4.12 21.01 29.19 3.05 181.07
Zk-sei Gp-ste-n N Zk-sei Gp-ste-n Z G N Z G 17 0 % 16.42 5 0 % 21.69 3 0 % 10.15 2 0 % 12.83 7 0 % 19.82 0.00 4.12 21.01 29.19 3.27 181.07
Dr Soser Kbgmdi Kbg-mi Kbg-mi D S K K K 16 0 % 15.46 1 0 % 4.34 0 0 % 0 1 0 % 6.41 14 0.01 % 39.64 0.00 4.00 23.23 31.23 5.50 191.77
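
The lists of morphosyntactic tag n-grams above are sorted by total absolute frequency, while a user may prefer to rank them by one of the collocativity measures. The following minimal Python sketch illustrates one possible way of doing this. It rests on several assumptions that should be checked against the downloaded files: that the file is tab-separated, that its first row contains the column names as printed in this guide, and that numbers may contain thousands separators or a percent sign as they do in the excerpts. The names `to_number`, `rank_by` and `SAMPLE` are hypothetical helpers introduced for illustration only, and `SAMPLE` is an abbreviated three-column excerpt of the 2-gram list rather than the full file layout.

```python
import csv
import io

# Column names as printed in this guide (assumed to match the file header).
TAG = "Morphosyntactic tag of string"
FREQ = "Total absolute frequency of morphosyntactic tag"
LOGDICE = "logDice"


def to_number(cell: str) -> float:
    """Convert cells such as '11,105', '1.20 %' or '-1,481.92' to a float."""
    return float(cell.replace(",", "").replace("%", "").strip())


def rank_by(tsv_text: str, column: str, top: int = 3):
    """Return the `top` (tag string, value) pairs with the highest value in `column`."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    rows = [(row[TAG], to_number(row[column])) for row in reader]
    return sorted(rows, key=lambda pair: pair[1], reverse=True)[:top]


# Abbreviated excerpt (three columns only) of the 2-gram list shown in 2.4.9.
SAMPLE = "\t".join([TAG, FREQ, LOGDICE]) + "\n" + "\n".join([
    "L L\t11,105\t10.82",
    "Rsn Rsn\t9,949\t10.78",
    "Dm Sozem\t3,269\t11.62",
])

if __name__ == "__main__":
    for tag, score in rank_by(SAMPLE, LOGDICE):
        print(tag, score)
```

To process a downloaded list instead of the excerpt, the contents of the file could be read into a string and passed to `rank_by` in place of `SAMPLE`. For the entire versions of the lists, which can be quite large, reading the file row by row may be preferable to loading it at once.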