Acta Linguistica Asiatica, 10(2), 2020.  
ISSN: 2232-3317, http://revije.ff.uni-lj.si/ala/ 
DOI: 10.4312/ala.10.2.127-142  
EXAMINING THE PART-OF-SPEECH FEATURES IN ASSESSING THE READABILITY  
OF VIETNAMESE TEXTS 
An-Vinh LUONG 
Computational Linguistics Center, University of Science, Ho Chi Minh City, Vietnam 
anvinhluong@gmail.com 
Diep NGUYEN 
Department of Linguistics, University of Social Sciences & Humanities, Ho Chi Minh City, Vietnam 
nhudiep2004@gmail.com 
Dien DINH 
Computational Linguistics Center, University of Science, Ho Chi Minh City, Vietnam 
ddien@fit.hcmus.edu.vn 
Abstract 
The readability of the text plays a very important role in selecting appropriate materials for the level 
of the reader. Text readability in Vietnamese language has received a lot of attention in recent years, 
however, studies have mainly been limited to simple statistics at the level of a sentence length, word 
length, etc. In this article, we investigate the role of word-level grammatical characteristics in assessing 
the difficulty of texts in Vietnamese textbooks. We have used machine learning models (for instance, 
Decision Tree, K-nearest neighbor, Support Vector Machines, etc.) to evaluate the accuracy of 
classifying texts according to readability, using grammatical features in word level along with other 
statistical characteristics. Empirical results show that the presence of POS-level characteristics 
increases the accuracy of the classification by 2-4%. 
Keywords: text readability; text difficulty; Vietnamese text readability; text classification; 
school textbooks 
Povzetek 
Berljivost besedila ima zelo pomembno vlogo pri izbiri ustreznih gradiv za raven bralca. Berljivost 
besedil v vietnamskem jeziku pridobiva pozornost šele v zadnjih letih in dosedanje študije so omejene 
na preproste ocene na osnovi statističnih podatkov za dolžino stavka, dolžino besed in podobnih 
značilnosti. V tem članku raziskujemo vlogo slovničnih značilnosti na besedni ravni pri ocenjevanju 
težavnosti besedil v vietnamskih učbenikih. Za oceno natančnosti razvrščanja besedila glede na 
berljivost smo uporabili modele strojnega učenja (na primer drevo odločitve, K-najbližji sosed, 
podporni vektorski stroji itd.) Empirični rezultati kažejo, da upoštevanje različnih značilnosti na nivoju 
besednih vrst poveča natančnost klasifikacije za 2-4%. 
Ključne besede: berljivost besedila; raven enostavnosti; berljivost vietnamskih tekstov; 
klasifikacija tekstov; šolski učbeniki 
128 An-Vinh LUONG, Diep NGUYEN, Dien DINH 
1 Introduction 
In today’s era of information explosion, thousands of documents with different 
contents and in different languages get released every second. Such documents have 
different levels of readability; some are easy to read and understand while others are 
more difficult and demand larger amount of time and knowledge to get through. It is 
generally known that the best way to assess whether a text is easy or difficult is to ask 
readers to read or skim that text, though this can be time-consuming for the readers. 
Therefore, we suppose that there exists some kind of a method that assists a reader to 
determine the readability of the text, upon which they make further decision on 
whether they would continue reading or not. Recently gaining a lot of focus is 
readability, which is one of such methods. 
Brown and colleagues state that  
readability is a concept that describes the degree to which a text is easy or 
difficult to read. A readability index is a numerical scale that estimates the 
readability or degree reading difficulty that native speakers are likely to have in 
reading a particular text. (Brown et al., 2012). 
Determining the readability index of a document is to determine how difficult the 
text is, which gives a reader information on whether the document is suitable for them 
to read to understand it in a reasonable amount of time. Information on readability is 
useful in many different fields of science as well as in everyday life. It can be used when 
assisting scientists at publishing articles, helping text editors (writers, journalists, etc.) 
to create documents suitable specific audience, or else for manufacturers to produce 
readable manuals. Above all, information on readability is of most importance in 
education, especially in second language education. It is used when textbooks are 
compiled, or for educators to make decisions on appropriate texts are made.  
Research on the difficulty of texts originates back to the late 19th century when 
Lucius Adelno Sherman wrote that “the average length of sentences has been 
decreasing over time” (Sherman, 1893). Many books on readability have been 
published since, however, they mostly applied for English, such the work of Dale and 
Chall (1948), Si and Callan (2001), Schwarm and Ostendorf (2005), Chall and Dale (1995), 
Chen and Meurers (2018), etc., and some languages that were treated as lingua franca 
at some point in the history or some part of the world. Thus we find works on French 
(François (2014), François & Fairon (2012), etc.), Chinese (Chen et al. (2013); Jiang et al. 
(2018); Sun et al. (2014), etc.), Spanish (Coco et al. (2017); I. Parkeret al. (2001); 
Spaulding (1956), etc.), Arabic (Al-Ajlan et al. (2008); Al-Tamimi et al. (2014); Al Khalil 
et al. (2018); Saddiki et al. (2015); Saddiki et al. (2018), etc.), and other languages.  
For less-resource languages, studies on the readability of texts are still limited, and 
Vietnamese is one of such languages. In Vietnamese, there some publications that date 
 Examining the Part-of-speech Features in Assessing the Readability … 129 
back to 1980ies (Nguyen & Henkin 1982, 1985), and recent studies of Luong et al. (2017, 
2018a, 2018b), Điệp (2019) and Luong & Tran (2019). These studies have shown some 
valuable features for assessing text readability in Vietnamese, but the results are 
limited and further research on the topic is necessary. 
In this research, we examine features on the level of parts of speech (POS-level) 
and assess the readability of literary texts based on them. The texts were taken from 
literary textbooks for school students from grade 2 to grade 12, which corresponds to 
the students’ age 7 to age 17. This method inherits the results of the Luong et al. (2017, 
2018b) with the addition of a number of grammatical features at word-level to build a 
text-based classifier based on readability through some machine learning methods like 
Decision Tree, K-nearest neighbor, Support Vector Machines, etc.  
The article is thus organized as follows. Section 2 presents some ground works on 
text readability and previous literature on Vietnamese text readability. It also 
introduces the features that we surveyed and used in this study to develop models for 
assessing the readability of Vietnamese texts, using some classification algorithms 
along with experimental results. Section 0 presents the results of the study and 
discusses them, and the final Section 5 offers an overall conclusion to the topic. 
2 Related works 
2.1 Different approaches in previous studies 
Previous studies of text readability can be grouped into two groups based on either 
they undertake traditional approach or corpus-based approach. 
Traditional approach uses conventional statistical methods on the documents to 
select high correlation factors with the readability of texts and then use regression 
analysis to create formulas for measuring the readability. The factors examined are 
typically shallow features, also called easy-to-extract features, such as average 
sentence length, average word length, percentage of difficult words in the documents. 
Representative researches with this approach produces the Dale-Chall formula (Dale & 
Chall, 1948), the Gunning Fog Index (Robert, 1952), the SMOG formula (Mc Laughlin, 
1969), the Flesch-Kincaid grade level readability (Kincaid, Fishburne, Rogers, & Chissom, 
1975), the new Dale-Chall formula (Chall & Dale, 1995), and others. 
On the other hand, corpus-based approach approach has been developed in recent 
years due to the fast development computer science and machine learning algorithms. 
Studies in this approach see the problem of assessing the readability of text as a 
classification problem, and use machine learning models to classify text by layer of 
readability based on extracted features. Representative works are those of Si & Callan 
(2001); Collins-Thompson & Callan (2005); Schwarm & Ostendorf (2005); Heilman et al. 
130 An-Vinh LUONG, Diep NGUYEN, Dien DINH 
(2007); Pitler & Nenkova (2008); Feng et al. (2010); Vajjala & Meurers (2012); Jiang et 
al. (2015); Wang & Andersen (2016); Chen & Meurers (2018), and others.  
2.2 Studies on text readability in Vietnamese 
The research on the readability of the text in Vietnamese is still quite small and their 
results are limited. Nguyen et al., have introduced two formulas to measure the 
readability of Vietnamese texts (Nguyen & Henkin, 1982, 1985). These two formulas 
base on features such as the average length of sentences or words, and the ratio of 
difficult words in texts. The weak point in these works is that the two formulas were 
surveyed and evaluated on a relatively small amount of data; on 20 documents in 
Nguyen & Henkin, 1982 and 54 documents in Nguyen & Henkin, 1985. 
Luong et al. (2017) conducted a survey of texts extracted from literary textbooks 
for Vietnamese high school students and suggested to use the feature of text length to 
classify texts according to readability. Experimental results show that the length of 
texts has a great influence on the classification results, and is to be used to evaluate 
texts in Vietnamese textbooks. 
Luong et al. (2018a) introduced a new formula for measuring the readability of 
Vietnamese texts. This formula is based on a survey of 1,200 documents classified into 
3 levels of difficulty (easy, medium and difficult). The features of the average length of 
the sentence, of the word and the ratio of difficult words in the text have been chosen 
formulas criteria. 
In addition, Luong et al. (2018b) published another study on the readability of texts 
using the proportional features of proper nouns and Vietnamese specific 
characteristics such as Sino-Vietnamese words ratio, borrowed words ratio, and dialect 
words ratio within documents. Experimental results show the contribution of these 
features in improving the accuracy of classification processes. 
Điệp et al. (2019) presented the statistical analyses on the frequency of POS tags 
in Vietnamese texts. They conducted a survey of 209 texts extracted from Vietnamese 
textbooks from grade 2 to grade 5 (corresponding to age 7 to 10) for primary students 
according to the general curriculum in Vietnam. Their results showed that words such 
as common nouns and volatile verbs were common in the examined documents. In 
addition, through the correlation analysis between the frequencies of POS tags and the 
readability of the surveyed documents, they also proved a high correlation between 
the ratio of common nouns and the ratio of prepositions with the readability level in 
the examined texts. 
Furthermore, Luong & Tran (2019) introduced a method of evaluating the 
readability of documents by comparing difficulty correlation between different 
documents. They built a set of 30 texts – which were graded the readability level – as 
the standard.  
 Examining the Part-of-speech Features in Assessing the Readability … 131 
The documents in this research will be compared to the standard texts proposed 
by Luong & Tran (2019) to determine the readability level. 
2.3 Features 
In this section, we will introduce features that used to classify texts with readability 
level. These features include the so-called traditional or grammatical features at the 
word level (features (1) – (4)), and features (5) – (10) proposed by Luong et al. (2017, 
2018b) that have been defined relevant in Vietnamese literary texts and are the focus 
of this research. 
(1) Average sentence length. The average length of sentences is a common factor 
in most studies of text readability. The length of sentences is very important in the 
process of documenting and reading texts. If a text has too many long sentences, it may 
make it difficult for the reader to fully understand meanings of its long sentences. On 
the other hand, using only short sentences may make the text discrete and incoherent, 
which could make the reader experience difficulties. Therefore, the length of sentences 
is a very important factor in assessing the readability of text. The average sentence 
length features commonly used are average sentence length in words (ASLW), in 
syllables (ASLS), and in characters (ASLC). 
(2) Average word length. Carver (1976) showed a linear correlation between the 
length of words and the readability of text, which is a commonly used factor in studies 
on text readability. The average word length in syllables (AWLS) and in characters 
(AWLC) are commonly used with length features. 
(3) Percentage of difficult words. In many studies, the percentage of difficult 
words is one of the most valuable factors when assessing the readability of texts. These 
studies often use a list of easy vs. difficult words in a language as the base for 
calculation. However, building such a list takes a lot of effort, and therefore many 
studies used statistical lists of words according to their frequency of use instead; they 
chose most commonly used words in a particular language and treated them as a list 
of easy words.  
In this study, we use 3,000 most popular words extracted from the statistical list of 
words in the Vietnamese texts of the researching group of Dinh et al. (Dinh, Nguyen, & 
Ho, 2018) as the basis for calculating difficult word rate. The features we surveyed are 
Percentage of Difficult Words (PDW), Percentage of Unique Difficult Words (PDDW). 
(4) Percentage of difficult syllables. Vietnamese writing is monosyllabic in nature. 
Every “syllable” is written as though it were a separate dictation-unit with a space 
before and after. Such a unit is called morphosyllable or “tiếng” in Vietnamese. Each 
morphosyllable tends to have its own meaning and consequently a strong identity. 
However, these morphosyllables are not automatically combined into ‘words’ as the 
linguistic notion of word commonly applies for European languages (Tran et el., 2007), 
132 An-Vinh LUONG, Diep NGUYEN, Dien DINH 
which leads to difficulties for readers, especially those with low reading skills, to 
distinguish the boundaries between words.  
For this reason, we consider syllables as an important language unit of Vietnamese 
to make statistics and use as a characteristic for examination. In this work, we use 3,000 
most popular syllables of Dinh et al. (2018) to extract the following two features: 
percentage of difficult syllables (PDS) and percentage of distinct difficult syllables 
(PDDS). 
(5) Text length features. In the study by Luong et al. (2017), results showed the 
essential role of the text length features in assessing readability. The features that 
Luong et al. surveyed and are relevant to this study are as follows. The total number of 
sentences (NSen), the total number of words (NWo), the total number of syllables 
(NSyl), the total number of characters (NCha), the total number of distinct words 
(NDWo), and finally the total number of distinct syllables (NDSyl). 
In the article published in 2018, Luong et al. introduced some additional features 
for assessing readability of Vietnamese texts: 
(6) Percentage of Sino-Vietnamese words. Vietnam has spent more than 1,000 
years of domination by the Chinese feudal dynasties (111BC - 905AC). During that 
period, the Vietnamese language was strongly influenced by Chinese culture and 
language, and those influences continue to this day. Vocabulary of the Vietnamese 
language consists of more than 60% of words of Chinese-origin, the called as Sino-
Vietnamese words (DeFrancis, 1977). These Sino-Vietnamese words are often used in 
the official and ceremonial language and are therefore considered as more difficult 
compared to originally Vietnamese words of the same meaning.  
In this study, we examined features such as percentage of Sino-Vietnamese Words 
(PSVW), percentage of distinct Sino-Vietnamese words (PDSVW), and the proportion 
of distinct Sino-Vietnamese within all distinct words (DSVW/DW). 
(7) Percentage of borrowed words. Similar to Sino-Vietnamese words, many 
words from other languages entered Vietnamese. This foreign influence was especially 
strong during the French invasion of Vietnam in the middle of the 19th century. Such 
words of French, English and other origin undertook Vietnamese phonetic 
transcriptions (Alves, 2009) and are nowadays used in the official and scientific 
language. It is estimated that they influence the readability and are therefore taken 
into account as the percentage of borrowed words (PBW), the percentage of distinct 
borrowed words (PDBW), and the proportion of distinct borrowed words within all 
distinct words (DBW/DW).  
(8) Percentage of dialectal words. Vietnamese territory has many different regions 
with different cultural and linguistic characteristics. Regions tend to localize general 
Vietnamese language and use it as their own regional language. Consequently, such 
dialectal vocabulary, which is not available in the standard Vietnamese language, is 
 Examining the Part-of-speech Features in Assessing the Readability … 133 
thought to be difficult for readers. In assessing text readability the feature appears 
either as the percentage of dialect words (PDiaW), the percentage of distinct dialect 
words (PDDiaW), or the proportion of distinct dialect words within all distinct words 
(DDiaW/DW). 
(9) Percentage of proper nouns. According to Luong et al. (2018b), the more 
proper nouns in the text, the more effort the reader will have to memorize those 
objects, and therefore the text is considered more difficult. For this reason we have 
decided to take into account the characteristics of proper nouns in this experiment. 
The features are defines in the following way. Nr/Sen is the abbreviation for the 
proportion of proper nouns within sentences. Nr/W is the abbreviation for the number 
of proper nouns in comparison to all words. Nr/DW stands for the number of proper 
nouns that is divided by the number of distinct words. DNr/Sen points to the proportion 
of distinct proper nouns within the overall number of sentences. DNr/W stands for the 
number of distinct proper nouns divided by the number of words. Finally, DNr/DW is 
the abbreviation for the number of distinct proper nouns divided by the number of 
distinct words. 
(10) Other parts of speech and their elements. In this study, we also used other 
POS tags such as countable noun, directional verb, parallel association, etc. to 
experiment with the model. Table 1 is a list of tags used. These POS tags are derived 
from the CLC_VN_Toolkit tool, which has been developed by the Computational 
Linguistics Center, Ho Chi Minh City University of Science1 . This is a tool for pre-
processing, sentences segmentation, words segmentation, part-of-speech tagging 
(POS), named entity labeling, etc. for Vietnamese texts. Similar to proper nouns, we use 
features with abbreviated symbols for each POS element. The number POS divided by 
the number of sentences is POS/Sen. The number of POS divided by the number of 
words is POS/Wo. The number of POS divided by the number of distinct words is 
POS/DWo. The number of distinct POS divided by the number of sentences is DPOS/Sen. 
The number of Distinct POS divided by the number of words is DPOS/Wo. The number 
of Distinct POS divided by the number of distinct words is DPOS/DWo. The abbreviation 
‘POS’ is generally replaced by POS tags as shown in Table 1 except for proper nouns (Nr) 
already presented in the work of Luong et al. (2018b). 
 
Table 1: List of Vietnamese POS tags used in CLC_VN_Toolkit 
POS Tag POS Tag 
Countable nouns Nc Quality adjectives Aa 
Concrete nouns Nu Demonstrative pronouns Pd 
Temporal nouns Nt Personal pronouns Pp 
                                                        
1 CLC website: http://clc.hcmus.edu.vn 
134 An-Vinh LUONG, Diep NGUYEN, Dien DINH 
POS Tag POS Tag 
Numerals Nq Adverbs R 
Common nouns Nn Prepositions Cm 
Proper nouns Nr Parallel conjunctions Cp 
Directional verbs Vd Subordinating 
conjunctions 
Cs 
State verbs Ve Modifiers M 
Comparative verbs Vc Emotion words E 
Volatile verbs Vv Foreign words FW 
Directional co-verb D Onomatopoeia ON 
Quantity Adjectives An Idioms ID 
3 Experiment 
In this study, we used the corpus of 371 literary texts of Luong et al. (2018b) for 
experimentation. These documents were taken from Vietnamese textbooks for 
primary, middle and high school students in Vietnam. We divided the texts into groups 
based on: 
(1) grade level (from grade 2 to grade 12);  
(2) level of education (Primary, Middle and High school). 
Table 2 presents the basic statistics of the corpus. The features mentioned in 
Section 2.3 are used to build the classification models for text readability. 
 
Table 2: The statistics of the corpus of 371 literary documents of Luong et al. (2018b) 
Grade 2 3 4 5 6 7 8 9 10 11 12 
Number of texts 67 62 40 40 28 13 17 21 15 19 49 
Average number  
of sentences 
18.34 19.63 21.53 21.43 54.75 46.38 65.76 107.33 60.67 105.16 111.65 
Average number  
of words 
158.06 192.31 231.28 244.4 679.54 676.92 969.24 1447.4 861.73 1359.9 1710.3 
Average number  
of distinct words 
100.63 125.58 144.3 152.78 304.86 329.69 394.29 526.29 368.4 510 576 
Average number  
of syllables 
178.48 221.98 276.1 288 784.11 820.85 1131.5 1709.7 1006.5 1579.1 2179.4 
Average number  
of distinct syllables 
111.36 141.53 164.78 173.35 327.54 372.46 428.35 555.52 390.07 534.95 594.2 
Average number  
of characters 
826.8 1065.4 1335 1395.9 3709 3942.3 5401.9 8160 4860 7535.1 10761 
 Examining the Part-of-speech Features in Assessing the Readability … 135 
Grade 2 3 4 5 6 7 8 9 10 11 12 
Average sentence  
length in words 
9.14 10.61 11.59 12.69 14.01 17.99 17.78 18.23 15.04 15.67 16.68 
Average sentence  
length in syllables 
10.36 12.34 14.08 15.21 16.14 22.3 21.34 22.07 17.72 18.72 22.17 
Average sentence  
length in characters 
48.3 59.57 68.61 74.37 76.67 108.69 103.38 106.62 85.79 90.66 111.21 
Average word  
length in syllables 
1.13 1.16 1.2 1.19 1.15 1.23 1.19 1.2 1.17 1.18 1.32 
Average word  
length in characters 
5.25 5.61 5.84 5.77 5.43 5.98 5.74 5.78 5.67 5.7 6.59 
 
We conducted experiments by using several classification algorithms such as 
Decision Tree (denoted as D-TREE), K-nearest neighbor (K-NN), Multi-layer Perceptron 
(MLP), Random Forest (RND-FRST), and Support Vector Machines (SVM). In this study, 
we used Scikit-learn, a machine-learning library for the python programming language 
for the experiments. With D-TREE and RND-FRST, we used two common impurity 
measures: Entropy and Gini index. In order to avoid overfitting we used k-fold cross 
validation during training and testing; randomly dividing the corpus into 5 parts (4 parts 
for training and 1 for testing). The best features combinations of Luong et al. (2017) 
and Luong et al. (2018b) are used as the baselines for the experimental process. Tables 
3 and 4 show the best practices on 4 metrics: accuracy (Acc), precision (P), recall (R), 
and F1-score (F1). 
 
Table 3: Classification results performed on grade-level documents 
Feature set Acc P R F1 
  D-TREE (ENTROPY) 
Luong2017 0.3828 0.2893 0.2772 0.2728 
Luong2018 0.4206 0.3488 0.3276 0.3142 
Luong2017 + PSVW, Nq/Sen, DMSen 0.4449 0.3749 0.364 0.3552 
Luong2017 + PSVW, Aa/Sen, DCm/Wo 0.4341 0.3714 0.3523 0.3492 
Luong2017 + PSVW, DNr/DWo, Cp/DWo 0.4204 0.3709 0.3467 0.3359 
  D-TREE (GINI) 
Luong2017 0.3855 0.3049 0.2973 0.289 
Luong2018 0.3909 0.3038 0.2959 0.2888 
Luong2017 + PSVW, Aa/Sen, Cm/Wo 0.4448 0.3506 0.3829 0.3538 
Luong2017 + PSVW, DNr/DWo, Nq/Wo, 
DAa/Sen, Nq/DWo, Cm/Wo 
0.4368 0.3849 0.3472 0.3409 
Luong2017 + PSVW, Nq/Wo, DCm/Sen 0.4502 0.3649 0.3473 0.3375 
136 An-Vinh LUONG, Diep NGUYEN, Dien DINH 
Feature set Acc P R F1 
  KNN 
Luong2017 0.4556 0.2996 0.3097 0.2928 
Luong2018 0.4475 0.2929 0.3038 0.2877 
Luong2017 + PSVW, Aa/Sen, Cm/Sen 0.4609 0.3236 0.3283 0.3069 
Luong2017 + PSVW, Nr/Sen, Cm/Sen 0.4584 0.3075 0.3232 0.3035 
Luong2017 + PSVW, Cp/Sen, Cm/DWo 0.4476 0.3134 0.3182 0.3025 
  MLP 
Luong2017 0.3882 0.2916 0.3034 0.2696 
Luong2018 0.3883 0.3246 0.2845 0.2724 
Luong2017 + PSVW, DNr/DWo, DAa/Sen, 
Cm/Wo, DNr/Sen 
0.4286 0.3258 0.3404 0.3058 
Luong2017 + PSVW, DNr/DWo, DCp/Sen 0.4421 0.3128 0.3447 0.2993 
Luong2017 + PSVW, Aa/Sen, DD/Sen 0.38 0.3488 0.3098 0.2955 
  RND-FRST (ENTROPY) 
Luong2017 0.4529 0.3569 0.3577 0.3403 
Luong2018 0.4689 0.3952 0.3503 0.3477 
Luong2017 + PSVW, DNr/DWo, Aa/DWo 0.5041 0.4291 0.4029 0.3897 
Luong2017 + PSVW, Nr/Sen, DM/Wo 0.4772 0.4629 0.382 0.3889 
Luong2017 + PSVW, Nn/Sen, Aa/DWo 0.4826 0.4178 0.4011 0.3811 
  RND-FRST (GINI) 
Luong2017 0.4392 0.3191 0.3206 0.3071 
Luong2018 0.4636 0.345 0.3365 0.3195 
Luong2017 + PSVW, Nr/Sen, Cp/Wo 0.523 0.4256 0.4089 0.4051 
Luong2017 + PSVW, DNr/DWo, DPp/Sen 0.4989 0.402 0.3883 0.3766 
Luong2017 + PSVW, Nq/Sen, Nn/Sen 0.4852 0.4078 0.3809 0.371 
  SVM (LINEAR) 
Luong2017 0.4446 0.3402 0.3257 0.3177 
Luong2018 0.477 0.3892 0.3611 0.3538 
Luong2017 + PSVW, DNr/DWo, Nq/Wo, 
DAa/Sen, Cm/Wo 
0.5148 0.4657 0.4219 0.418 
Luong2017 + PSVW, DNr/DWo, Nq/Wo, 
Nq/DWo, Cm/Wo 
0.5068 0.4479 0.4099 0.4134 
Luong2017 + PSVW, DNr/DWo, Nq/Wo 0.5069 0.4464 0.4132 0.408 
 
 
 
 
 Examining the Part-of-speech Features in Assessing the Readability … 137 
Table 4: Classification results performed on school-level documents 
Feature set Acc P R F1 
  D-TREE ENTROPY 
Luong2017 0.7845 0.7268 0.7006 0.7021 
Luong2018 0.8167 0.7594 0.7489 0.7511 
Luong2018 + Nc/DWo, DPp/DWo 0.8329 0.7881 0.7729 0.7761 
Luong2018 + DVc/Sen, DID/Wo 0.8221 0.7792 0.7584 0.7623 
Luong2018 + FW/Wo, DNn/Sen 0.8221 0.7739 0.7569 0.7607 
  D-TREE GINI 
Luong2017 0.7925 0.7234 0.6985 0.7008 
Luong2018 0.7925 0.7174 0.7033 0.7049 
Luong2018 + Nu/Sen, D/Wo 0.8169 0.7531 0.7429 0.743 
Luong2018 + D/Sen, DNn/Sen 0.8087 0.7568 0.7341 0.7322 
Luong2018 + Nc/Sen, DCp/DWo 0.8114 0.7441 0.7338 0.7316 
  KNN 
Luong2017 0.7708 0.6687 0.656 0.6594 
Luong2018 0.7708 0.6687 0.656 0.6594 
Luong2018 + Vv/Wo, DVv/Sen 0.7815 0.6846 0.6688 0.6746 
Luong2018 + Aa/Sen, DNu/Wo 0.7762 0.6759 0.6612 0.6655 
Luong2018 + Aa/Sen 0.7762 0.6759 0.6612 0.6655 
  MLP 
Luong2017 0.6589 0.4973 0.5855 0.5169 
Luong2018 0.6846 0.5701 0.631 0.5666 
Luong2018 + Nr/Wo, DPp/Wo 0.7954 0.7124 0.7029 0.6723 
Luong2018 + Aa/Sen, FW/Wo 0.7707 0.7555 0.7005 0.6652 
  RND-FRST ENTROPY 
Luong2017 0.8221 0.757 0.7368 0.7367 
Luong2018 0.8355 0.7743 0.7547 0.7596 
Luong2018 + M/Wo, DNq/Sen 0.8599 0.8138 0.7956 0.802 
Luong2018 + Nc/Sen, Nq/Sen 0.8544 0.8126 0.789 0.7939 
Luong2018 + Nq/Sen, Aa/Sen 0.8571 0.8169 0.7824 0.7903 
  RND-FRST GINI 
Luong2017 0.8222 0.7569 0.7411 0.7439 
Luong2018 0.8302 0.7735 0.7528 0.7573 
Luong2018 + Nq/Wo, DON/DWo 0.8653 0.8182 0.8004 0.8062 
Luong2018 + Nc/Sen, Nq/Sen 0.8652 0.8173 0.7973 0.8031 
Luong2018 + Nq/Wo 0.8491 0.7978 0.7837 0.7879 
138 An-Vinh LUONG, Diep NGUYEN, Dien DINH 
Feature set Acc P R F1 
  SVM LINEAR 
Luong2017 0.8274 0.785 0.7626 0.7644 
Luong2018 0.8517 0.8107 0.7842 0.7903 
Luong2018 + Aa/Sen, DPp/Wo 0.8787 0.8462 0.8206 0.8231 
Luong2018 + Aa/Sen, DFW/Wo 0.8733 0.8326 0.8153 0.8182 
Luong2018 + D/Sen, DNn/Sen 0.8706 0.833 0.8163 0.8162 
 
From the results presented in Table 3 and Table 4 we can see that, when adding 
POS features, some features have helped improve the performance of the classification 
model. 
With the experiments in grade-level grouping, accuracy increased from the value 
0.4770 of the work of Luong2017 to the value 0.5148 when adding the features PSVW, 
DNr/DWo, Nq/Wo, DAa/Sen, Cm/Wo with the SVM classifier. Similarly, precision, recall 
and F1-score also increased from 0.3892, 0.361, and 0.3538 respectively in Luong2017 
to 0.4657, 0.4219, and 0.4180 respectively with the SVM classifier. In experimental 
results, the most accurate features combination is the combination (Luong2017 + 
PSVW, Nr/Sen, Cp/Wo), implemented on the Random Forest classifier (Gini index). 
However, the combination that yield the highest precision and F1-score is the 
combination (Luong2017 + PSVW, DNr/DWo, Nq/Wo, DAa/Sen, Cm/Wo). Among the 
POS features surveyed, the feature DNr/Dwo (Number of Distinct Proper Nouns divided 
by number of Distinct Words) feature appears the most in high performing experiments 
(appears 9 times in Table 3). This shows that the DNr/Dwo feature is a good feature for 
evaluating the readability of Vietnamese texts Besides, some other POS features also 
appear several times in the Table 3, such as Cm/Wo (5 times), Nq/Wo (5 times), Aa/Sen 
(4 times), etc. These POS features are also valuable for classifying Vietnamese texts 
according to difficulty level. 
With school-level grouping, the highest experimental results belong to the feature 
combination (Luong2018 + Aa/Sen, DPp/Wo), implemented on the SVM classifier: the 
Accuracy, Precision, Recall and F1-score increased from 0.8517 (Luong2018) to 0.8787; 
from 0.8107 (Luong2018) to 0.8462; from 0.7842 (Luong2018) to 0.8206; and from 
0.7903 (Luong2018) to 0.8231 respectively. The feature Aa/Sen (Number of Quality 
Adjectives divided by number of Sentences) appears the most (6 times) in the Table 4, 
therefore, this is a valuable feature for assessing the readability of Vietnamese texts. 
Similarly, features like DNn/Sen (appears 3 times), Nc/Sen (appears 3 times) or Nq/Sen 
(appears 3 times) are also good features for automatic classification of Vietnamese 
texts according to the difficulty level. 
Experimental results also show that SVM classifier performs best on overall 
Accuracy, Precision, Recall and F1-score for most feature sets on both school and 
grade-level. The Random Forest classifier (Gini impurity) archives the best accuracy in 
 Examining the Part-of-speech Features in Assessing the Readability … 139 
grade-level with the feature set of (Luong2017 + PSVW, Nr/Sen, Cp/Wo). The other 
classifiers do not seem suitable for the problem of evaluating the readability of 
Vietnamese text. 
4 Discussion and conclusion 
Text readability is an important factor affecting the selection and understanding of 
documents. Numerous studies on text readability have been conducted for English and 
some other resource-rich languages, while for Vietnamese research results are rare and 
limited. In this study, we investigated the role of word-level grammatical characteristics 
in assessing the difficulty of texts in Vietnamese textbooks. We conducted empirical 
assessments of text readability in 371 literary texts extracted from Vietnamese 
textbooks primary school students and the literary textbooks for middle and high 
school students in Vietnam. Some machine learning algorithms for automatic text 
classification like Decision Tree, K-nearest neighbor, Support Vector Machines, etc. 
were used to classify the texts. 
The experimental results presented in Table 3 and Table 4 show that some POS 
features such as DNr/Dwo, Cm/Wo, Nq/Wo, or Aa/Sen also contribute to the efficiency 
of classification. Comparing the results to the Luong 2017 results we can conclude that, 
he feature set (DNr/DWo, Nq/Wo, DAa/Sen, Cm/Wo), and the feature PSVW help 
increase precision value with SVM classifier in case of the group-by-grade-level corpus. 
On the other hand, the case of the group-by-school-level corpus, the feature set 
(Aa/Sen, DPp/Wo) helped the classification process to achieve the highest results for 
all measurements. 
Experiments in this study only used those machine learning classification 
algorithms that assess whether a feature is valuable for the classification or not. For 
that reason it is not possible to discuss the potential influence that increasing or 
decreasing the use of a certain POS would have on the difficulty of the text. Such studies 
on the correlation of the extracted features with the text readability level are planned 
to be conducted in the upcoming investigations. 
For the future works, we will proceed to collect additional corpora on different 
domains to look for features that could be useful for evaluating the readability of texts 
in the responding domains. Deeper features such as sentence-level grammar (syntax, 
coherence, cohesion, and others) should also be surveyed to find a better combination 
of features for assessing the readability of Vietnamese texts. 
140 An-Vinh LUONG, Diep NGUYEN, Dien DINH 
References 
Al-Tamimi, A. K., Jaradat, M., Aljarrah, N., & Ghanim, S. (2014). AARI: Automatic Arabic 
readability index. International Arab Journal of Information Technology, 11(4), 370-378.  
Alves, M. J. (2009). Loanwords in Vietnamese. Loanwords in the world’s language: A 
Comparative Handbook, 617-637.  
Brown, J. D., Janssen, G., Trace, J., & Kozhevnikova, L. (2012). A preliminary study of cloze 
procedure as a tool for estimating English readability for Russian students. In Second 
Language Studies Paper (pp. 1-22): University of Hawai'i at Manoa. 
Carver, R. P. (1976). Word Length, Prose Difficulty, and Reading Rate. Journal of Reading 
Behavior, 8(2), 193-203.  
Chall, J. S., & Dale, E. (1995). Readability Revisited: The New Dale-Chall Readability Formula. 
Northampton, Massachusetts: Brookline Books. 
Chen, X., & Meurers, D. (2018). Word frequency and readability: Predicting the text-level 
readability with a lexical-level attribute. Journal of Research in Reading, 41(3), 486-510.  
Chen, Y.-T., Chen, Y.-H., & Cheng, Y.-C. (2013). Assessing Chinese Readability using Term 
Frequency and Lexical Chain. IJCLCLP, 18(2), 1-18.  
Coco, L., Colina, S., Atcherson, S. R., & Marrone, N. (2017). Readability Level of Spanish-
Language Patient-Reported Outcome Measures in Audiology and Otolaryngology. 
American journal of audiology, 26(3), 309-317. doi:10.1044/2017_AJA-17-0018 
Collins-Thompson, K., & Callan, J. (2005). Predicting Reading Difficulty with Statistical Language 
Models. J. Am. Soc. Inf. Sci. Technol., 56(13), 1448-1462.  
Dale, E., & Chall, J. S. (1948). A Formula for Predicting Readability. Educational Research Bulletin, 
11-28.  
DeFrancis, J. (1977). Colonialism and language policy in Viet Nam. The Hague: Mouton. 
Dinh, D., Nguyen, T. N., & Ho, H. T. (2018). Building a corpus-based frequency dictionary of 
Vietnamese. In (pp. 72-98). 
Nguyễn Điệp T. N., Lươnga.-V., & Đinh Điền. (2019). Affection of the part of speech elements in 
Vietnamese text readability. Acta Linguistica Asiatica, 9(1), 105-118. 
https://doi.org/10.4312/ala.9.1.105-118 
Feng, L., Jansche, M., Huenerfauth, M., & Elhadad, N. e., mie. (2010). A Comparison of Features 
for Automatic Readability Assessment. Paper presented at the Proceedings of the 23rd 
International Conference on Computational Linguistics: Posters, Stroudsburg, PA, USA. 
François, T., & Fairon, C. (2012). An AI readability formula for French as a foreign language. 
Paper presented at the Proceedings of the 2012 Joint Conference on Empirical Methods in 
Natural Language Processing and Computational Natural Language Learning. 
Gunning, R. (1952). The technique of clear writing. New York: McGraw-Hill Book Co. 
Heilman, M., Collins-Thompson, K., Callan, J., & Eskenazi, M. (2007, April). Combining Lexical 
and Grammatical Features to Improve Readability Measures for First and Second Language 
Texts. Paper presented at the Human Language Technologies 2007: The Conference of the 
North American Chapter of the Association for Computational Linguistics; Proceedings of 
the Main Conference, Rochester, New York. 
 Examining the Part-of-speech Features in Assessing the Readability … 141 
Parker, R. I., Hasbrouck, J. E., & Weaver, L. R. (2001). Spanish readability formulas for 
elementary-level texts: A validation study. Reading & Writing Quarterly, 17(4), 307-322. 
doi:10.1080/105735601317095052 
Jiang, Z., Sun, G., Gu, Q., Yu, L., & Chen, D. (2015). An Extended Graph-Based Label Propagation 
Method for Readability Assessment. Paper presented at the Web Technologies and 
Applications, Cham. 
Kincaid, J. P., Fishburne, R. P., Rogers, R. L., & Chissom, B. S. (1975). Derivation of New 
Readability Formulas (Automated Readability Index, Fog Count and Flesch Reading Ease 
Formula) for Navy Enlisted Personnel. Technical Training, Research B(February), 49.  
Luong, A.-V., Nguyen, D., & Dinh, D. (2018a, November). A New Formula for Vietnamese Text 
Readability Assessment. Paper presented at the 2018 10th International Conference on 
Knowledge and Systems Engineering (KSE). 
Luong, A.-V., Nguyen, D., & Dinh, D. (2018b, November). Assessing the Readability of Literary 
Texts in Vietnamese Textbooks. Paper presented at the 2018 5th NAFOSTED Conference 
on Information and Computer Science (NICS). 
Mc Laughlin, G. H. (1969). SMOG grading-a new readability formula. Journal of Reading, 12(8), 
639-646.  
Nguyen, L. T., & Henkin, A. B. (1982). A Readability Formula for Vietnamese. Journal of Reading, 
26(3), 243-251.  
Nguyen, L. T., & Henkin, A. B. (1985). A Second Generation Readability Formula for Vietnamese. 
Journal of Reading, 29(3), 219-225.  
Pitler, E., & Nenkova, A. (2008). Revisiting readability: A unified framework for predicting text 
quality. Paper presented at the Proceedings of the conference on empirical methods in 
natural language processing. 
Saddiki, H., Bouzoubaa, K., & Cavalli-Sforza, V. (2015). Text readability for Arabic as a foreign 
language. Paper presented at the Computer Systems and Applications (AICCSA), 2015 
IEEE/ACS 12th International Conference of. 
Saddiki, H., Habash, N., Cavalli-Sforza, V., & Al Khalil, M. (2018, July). Feature Optimization for 
Predicting Readability of Arabic L1 and L2. Paper presented at the Proceedings of the 5th 
Workshop on Natural Language Processing Techniques for Educational Applications, 
Melbourne, Australia. 
Schwarm, S. E., & Ostendorf, M. (2005). Reading Level Assessment Using Support Vector 
Machines and Statistical Language Models. Paper presented at the Proceedings of the 43rd 
Annual Meeting on Association for Computational Linguistics, Stroudsburg, PA, USA. 
Sherman, L. A. (1893). Analytics of literature: a manual for the objective study of English prose 
and poetry. Boston, England: Ginn. 
Si, L., & Callan, J. (2001). A Statistical Model for Scientific Readability. Paper presented at the 
Proceedings of the Tenth International Conference on Information and Knowledge 
Management, New York, NY, USA. 
Spaulding, S. (1956). A Spanish Readability Formula. The Modern Language Journal, 40(8), 433-
441. doi:10.1111/j.1540-4781.1956.tb02145.x 
Tran, T., Pham, T., Ngo, H., Dien, D., & Collier, N. (2007). Named Entity Recognition in 
Vietnamese documents. Progress in Informatics, 5-13. doi:10.2201/NiiPi.2007.4.2 
142 An-Vinh LUONG, Diep NGUYEN, Dien DINH 
Al-Ajlan, A. A., Al-Khalifa, H. S., & Al-Salman, A. S. (2008, November). Towards the development 
of an automatic readability measurements for arabic language. Paper presented at the 
2008 Third International Conference on Digital Information Management, University of 
East London, London, UK. 
Vajjala, S., & Meurers, D. (2012, June). On Improving the Accuracy of Readability Classification 
using Insights from Second Language Acquisition. Paper presented at the Proceedings of 
the Seventh Workshop on Building Educational Applications Using NLP, Montréal, Canada. 
Sun, G., Jiang, Z., Gu, Q., & Chen, D. (2014, September). Linear model incorporating feature 
ranking for Chinese documents readability. Paper presented at the The 9th International 
Symposium on Chinese Spoken Language Processing, Singapore. 
François, T. (2014, November). An analysis of a French as a Foreign Language Corpus for 
Readability Assessment. Paper presented at the Proceedings of the third workshop on NLP 
for computer-assisted language learning, Uppsala, Sweden. 
Wang, S., & Andersen, E. (2016, December). Grammatical Templates: Improving Text Difficulty 
Evaluation for Language Learners. Paper presented at the Proceedings of COLING 2016, 
the 26th International Conference on Computational Linguistics: Technical Papers, Osaka, 
Japan. 
Luong, A.-V., Nguyen, D., & Dinh, D. (2017, October). Examining the text-length factor in 
evaluating the readability of literary texts in Vietnamese textbooks. Paper presented at the 
2017 9th International Conference on Knowledge and Systems Engineering (KSE). 
Al Khalil, M., Saddiki, H., Habash, N., & Alfalasi, L. (2018, May). A Leveled Reading Corpus of 
Modern Standard Arabic. Paper presented at the Proceedings of the 11th Language 
Resources and Evaluation Conference, Miyazaki, Japan. 
Jiang, Z., Gu, Q., Yin, Y., & Chen, D. (2018, August). Enriching Word Embeddings with Domain 
Knowledge for Readability Assessment. Paper presented at the Proceedings of the 27th 
International Conference on Computational Linguistics, Santa Fe, New Mexico, USA. 
Luong, A.-V., & Tran, P. (2019, November). Assessing the Readability of Vietnamese Texts 
Through Comparison. Paper presented at the Computational Data and Social Networks, 
Cham.