Volume 7, Issue 1, 2017
Acta Linguistica Asiatica 

Editors: Andrej Bekeš, Nina Golob, Mateja Petrovčič
Editorial Board: Bi Yanli (China), Cao Hongquan (China), Luka Culiberg (Slovenia), Tamara Ditrich (Slovenia), Kristina Hmeljak Sangawa (Slovenia), Ichimiya Yufuko (Japan), Terry Andrew Joyce (Japan), Jens Karlsson (Sweden), Lee Yong (Korea), Lin Ming-Chang (Taiwan), Arun Prakash Mishra (India), Nagisa Moritoki Škof (Slovenia), Nishina Kikuko (Japan), Sawada Hiroko (Japan), Chikako Shigemori Bučar (Slovenia), Irena Srdanović (Croatia).
© University of Ljubljana, Faculty of Arts, 2017. All rights reserved.
Published by: Znanstvena založba Filozofske fakultete Univerze v Ljubljani (Ljubljana University Press, Faculty of Arts)
Issued by: Department of Asian Studies
For the publisher: Dr. Branka Kalenić Ramšak, Dean of the Faculty of Arts
The journal is licensed under a 
Creative Commons Attribution-ShareAlike 4.0 International License. 
Journal's web page: 
http://revije.ff.uni-lj.si/ala/ 
The journal is published in the scope of Open Journal Systems 

ISSN: 2232-3317 

Abstracting and Indexing Services: 
COBISS, dLib, Directory of Open Access Journals, MLA International Bibliography, 
Open J-Gate, Google Scholar and ERIH PLUS. 

Publication is free of charge. 

Address: 
University of Ljubljana, Faculty of Arts, Department of Asian Studies, Aškerčeva 2, SI-1000 Ljubljana, Slovenia
E-mail: nina.golob@ff.uni-lj.si 
TABLE OF CONTENTS
Foreword

RESEARCH ARTICLES

Understanding Reference: Morphological Marking in Japanese
Shinichi SHOJI

Functional Significance of Contextual Distribution: Discourse Particle ar in Bangla
Soumya Sankar GHOSH, Samir KARMAKAR, Arka BANERJEE

Japanese n deshita in Discourse: Past Form of n desu
Hironori NISHI

“Two sides of the same coin”: Yokohama Pidgin Japanese and Japanese Pidgin English
Andrei A. AVRAM

Chinese Legal Texts – Quantitative Description
Ľuboš GAJDOŠ

Perceptual Errors in Chinese Language Processing: A Case Study of Czech Learners
Tereza SLAMĚNÍKOVÁ

FOREWORD 
The first, summer issue of the seventh volume of the ALA journal comprises six academic articles, of which the first three share pragmatic concepts in discourse analysis in Japanese and Bangla, while the latter three deploy different language varieties of Japanese and Chinese to provide overviews of their linguistic characteristics. 
Shinichi SHOJI is the author of the first article “Understanding Reference: Morphological Marking in Japanese”, which investigated the anaphor-antecedent relationship in Japanese, particularly in cases with repeated-name anaphors. He has come to the conclusion that the topic postposition wa after an anaphor determines the anaphor’s topic-hood and as such plays an important role in facilitating the realization of antecedents.
Conversational discourse was also used by Soumya Sankar GHOSH, Samir KARMAKAR and Arka BANERJEE to explore the occurrence and role of the indeclinable ar in Bangla. The authors point out that the multiple interpretations of ar, which in addition have a large semantic and pragmatic scope, can be systematized by the use of phonological context.
Hironori NISHI in his article was interested in the use of the Japanese past form n deshita/n datta in discourse. His results, based on the analyses of a large corpus, show that approximately two-thirds of the n deshita/n datta cases are not grammatically constrained, and that those cases exhibit either the speaker’s recollection of previously held knowledge, or confirmation-seeking utterances for their previously held knowledge.
Andrei A. AVRAM explored phonological, morphological, syntactic and lexical aspects of Yokohama Pidgin Japanese and Japanese Pidgin English, and found that the bi-directional approach detected several common features typical of pre-pidgins, though the two languages differ considerably in the circumstances of their emergence and the context of use.
Yet another language variety, namely the legal Chinese sociolect, was analyzed quantitatively by Ľuboš GAJDOŠ. Referring to the Chinese monolingual corpus Hanku, the author touched on several statistical parameters, including the length of sentences and the proportion of parts of speech, and additionally discussed issues of statistical data processing.
Last but not least, Tereza SLAMĚNÍKOVÁ turned to Czech speakers of Chinese to examine and discuss L1 influences on L2 perception at the early stages of the learning process. The author focused on segmental and suprasegmental features, and found that while some perceptual mistakes are language-independent, others are language-specific, and stressed the importance of the latter.
Editors and Editorial Board thank all the contributors to this volume, and wish the regular and new readers of the ALA journal a pleasant read full of inspiration. 
Nina Golob 


RESEARCH ARTICLES 

UNDERSTANDING REFERENCE: MORPHOLOGICAL MARKING IN JAPANESE 
Shinichi SHOJI 
Organization for the Development of Higher Education and Regional Human Resources, Mie University, Japan 
Abstract 
This study investigates reference resolution with repeated-name anaphors in Japanese, particularly focusing on (i) subject anaphor with the nominative postposition ga, (ii) topic-subject anaphor with the topic postposition wa, (iii) scrambled object anaphor with the accusative postposition o, and (iv) topic-object anaphor with the topic postposition wa. A self-paced sentence-by-sentence reading experiment was conducted using two-sentence discourse items followed by comprehension questions, aiming to examine which type of anaphor would trigger a faster realization of the anaphor-antecedent relationship. The discourse items included antecedents in the first sentence and anaphors in the second sentence, and the comprehension questions asked about the antecedents in the first sentence. Results showed that the comprehension questions for the discourses that included topic anaphors (topic-subject-wa and topic-object-wa) were responded to faster than those for the discourses that included non-topic anaphors (subject-ga and scrambled object-o). The results indicate that anaphors’ topic-hood given by wa facilitates the realization of antecedents.
Keywords: reference; subject; object; topic; scrambling 
Abstract in Slovene (Povzetek)
This study of reference resolution with repeated-name anaphors in Japanese focuses on the following four points: (i) the subject anaphor with the nominative particle ga, (ii) the topic-subject anaphor with the topic particle wa, (iii) the scrambled object anaphor with the accusative particle o, and (iv) the topic-object anaphor with the topic particle wa. A self-paced sentence-by-sentence reading experiment used two-sentence discourses, each followed by a question through which the author determined which type of anaphor triggers the fastest linking between anaphor and antecedent. Antecedents always appeared in the first sentence and anaphors in the second, and the questions asked for information from the first sentence, i.e., about the antecedents. The results showed that responses to the questions were faster for topic anaphors (the topic-subject anaphor with the topic particle wa and the topic-object anaphor with the topic particle wa) than for non-topic anaphors (the subject anaphor with the nominative particle ga and the scrambled object anaphor with the accusative particle o), from which it can be concluded that topic-hood, as defined by the topic particle wa, enables the realization of the antecedent.
Keywords (Ključne besede): reference; subject; object; topic; scrambling
Acta Linguistica Asiatica, 7(1), 2017. 
ISSN: 2232-3317, http://revije.ff.uni-lj.si/ala/ 

DOI: 10.4312/ala.7.1.9-21 


1 Introduction
Reference resolution has been widely discussed in the field of psycholinguistics, particularly with regard to the forms of anaphors. It is well known that anaphors in pronoun forms are preferred to those in repeated-name forms when antecedents are prominent, namely, when they are grammatical subjects or first-mentioned entities in a sentence (Ariel, 1990; Gundel, Hedberg, & Zacharski, 1993; Gordon, Grosz & Gilliom, 1993; among others). According to Gordon and Hendrick (1998, p. 390, 393, 416), in English, pronouns are immediately interpreted as anaphors, which leads readers to look for their referents, whereas repeated-name anaphors tend not to be initially interpreted as anaphors. Readers of a repeated-name anaphor first establish a general concept of the entity indicated by the repeated name, and later realize the referential relationship between the entity and its referent, resulting in a slower identification of the referents. This relatively slower realization of antecedents is reflected in slower processing times for sentences with repeated-name anaphors compared with sentences with pronoun anaphors where the anaphors are both grammatical subjects (Gordon, Grosz & Gilliom, 1993, among others), e.g., ‘Harry is a member of a track team. Harry/He recruited Fred for the team because he is the fastest runner in the school.’
In Japanese, repeated-name anaphors can appear in several variations. One is grammatical subject anaphors marked with the nominative postposition ga (e.g., HARRY-GA Fred-o sasotta ‘Harry recruited Fred.’), which would be the closest equivalent to the repeated-name anaphors in other languages that elicited slower processing in earlier studies. However, instead of ga, grammatical subjects can be marked with the topic postposition wa (e.g., HARRY-WA Fred-o sasotta ‘Harry recruited Fred.’), which explicitly shows that the grammatical subjects are discourse topics (i.e., topic-subject).1 In addition, in scrambled sentences, non-subjects such as grammatical objects with the accusative postposition o can be positioned at the beginning of a sentence, similar to a subject’s position (e.g., HARRY-O Fred-ga sasotta ‘Fred recruited Harry.’). Moreover, the scrambled/fronted non-subject can be a topic (e.g., HARRY-WA Fred-ga sasotta ‘Fred recruited Harry.’). The present study investigated the processing of these different types of sentence-initial repeated-name anaphors in Japanese, aiming to examine whether any particular types of repeated-name anaphors would trigger a faster realization of their referents. For non-subjects, this study used grammatical objects, and thus the following four variations of repeated-name anaphors were tested: (non-topic) subject-ga, topic-subject-wa, (non-topic) scrambled object-o, and topic-object-wa.
1 The term topic-subject is sometimes called topicalized subject in linguistic articles. However, in the author’s view, topics are base-generated, and they are not moved from the grammatical subject position. Thus, the term topicalized is not used in this article.

2 Topic and Scrambling
It is widely acknowledged that a topic refers to information that has been presented earlier (Chafe, 1987; Kuno, 1973, p. 38, among others). Prince (1978) states that a speaker marks an entity as a topic when the listener already recognizes the entity. In response, the entity marked as the topic leads the listener to search for its antecedent in the preceding context, the ongoing situation, or the listener’s long-term memory (Haviland & Clark, 1974, p. 512, 513). This process seems similar to that for English pronouns, where Gordon and Hendrick (1998, p. 390, 393, 416) argue that pronouns are immediately interpreted as anaphors and readers look for antecedents. In short, a topic is essentially anaphoric, as Halliday (1967, p. 199) argues: a topic is concerned with the relation of what is currently being said to what was said earlier in the discourse. According to the above arguments, in Japanese, topic anaphors with the topic postposition wa should be interpreted as anaphors more quickly, and should trigger antecedent-realization sooner, than non-topic anaphors (with ga, o, etc.). This difference would be reflected in different processing times, even if the anaphors are all repeated-name anaphors. This prediction is summarized below.
(1) topic-subject-wa, topic-object-wa < (faster to process than) subject-ga, scrambled object-o

However, a possible problem of processing a sentence with topic-wa is that the topic postposition wa does not by itself indicate the grammatical role of the phrase it marks, such as topic-subject or topic-object. In fact, topic-wa anaphors in Japanese are much more frequently topic-subjects than other types of topics (Martin, 1975; Nishimura, 1989, p. 374). Due to this frequency difference, any type of topic-wa might be initially interpreted as a topic-subject. This possibility is further supported by the fact that any topic-wa tends to be positioned at the beginning of a sentence, which is similar to the grammatical subject position. Therefore, readers of a topic-object (or any non-subject topic) would have to reanalyze it after they initially misinterpret it as a topic-subject. Accordingly, sentences with topic-objects could be processed more slowly than sentences with topic-subjects, as summarized below.
(2) 	topic-subject-wa < topic-object-wa 
Note that the above prediction is not related to the realization of the anaphor-antecedent relationship. Rather, this is an issue of realizing the grammatical role of a topic within a sentence, i.e., topic-subjects and topic-objects do not have to be anaphors to elicit this processing-time difference.
Another sentence-level issue is word order. In Japanese, the default word order of the argument nouns and a verb is ‘subject – object – verb (SOV).’ Therefore, a scrambled word order such as OSV could assign readers a heavier processing load than the default SOV order. Sentences with subjects and topic-subjects are in the SOV order, which may be faster to process than sentences in the OSV order, including scrambled sentences with surface-initial objects and sentences with surface-initial topic-objects. This prediction is summarized below.
(3) topic-subject-wa, subject-ga < scrambled object-o, topic-object-wa 
Moreover, when an object in an OSV sentence is a topic-object, its sentence might be slower to process than a sentence with a scrambled non-topic object. As mentioned earlier, a topic-object may require readers to reanalyze it while processing the sentence because a topic-object might be initially misinterpreted as a topic-subject. This is different from scrambled objects marked with the accusative postposition o, which explicitly shows that they are grammatical objects. Thus, sentences with scrambled objects with o might be processed faster than sentences with topic-objects with wa, as summarized below.
(4) scrambled object-o < topic-object-wa
In sum, four predictions are presented above. Prediction (1) is related to the realization of the referential relationship between anaphors and their antecedents, which is the main objective of this study. The other predictions, (2), (3) and (4), relate to the processing of sentences with subjects, topic-subjects, scrambled objects and topic-objects, regardless of whether these are anaphoric. A self-paced sentence-by-sentence reading experiment was conducted in order to test prediction (1), but the effects described in (2), (3) and (4) were also predicted to appear in the results.

3 Experiment 
3.1 Participants 
Twenty-four Japanese speakers, consisting of students of the University of South Carolina and residents of South Carolina, North Carolina and Georgia, served as participants in this experiment. They were all native speakers of Japanese, raised in Japan until they were at least 15 years old. The participants consisted of 8 males and 16 females, and their ages ranged from 18 to 52.
3.2 Items 
The basic design of the experiment followed that of Gordon, Grosz and Gilliom (1993). A self-paced sentence-by-sentence reading experiment was conducted, in which participants read two-sentence discourse items.2 The first sentence included an antecedent, which was always prominent as it was a grammatical subject and was first-mentioned in the sentence. The second sentence included a repeated-name anaphor, which was one of the following four types: (i) (non-topic) subject-ga, (ii) topic-subject-wa, (iii) (non-topic) scrambled object-o, or (iv) topic-object-wa. Eight items for each of the four types were prepared, and thus there were 32 experimental items in total. The anaphors and antecedents were always persons’ names, and no other proper names were used in the items. The processing times of the second sentences that included anaphors were measured as an indication of how fast participants realized the anaphor-antecedent referential relationship, which tested prediction (1), or an indication of how fast they processed the sentences independently from the preceding sentences, which tested predictions (2), (3) and (4).
2 Gordon et al. (1993) used four-sentence discourse items.
A yes-no comprehension question followed each discourse item. In order to test prediction (1), it was important that the comprehension questions asked about the antecedents found in the first sentences. According to Gernsbacher (1989, p. 107), questions about the clauses that include antecedents that appear before anaphors ensure that readers understand the anaphor-antecedent relationship. The author of this study considers that response times for this type of comprehension question would indicate how fast participants successfully realized the referential relationship between anaphors and antecedents. For example, when participants read a discourse such as Taro-ga toshokan-ni itta. Taro-wa yoru osoku made benkyooshita. ‘Taro went to the library. Taro studied until late at night.’, if they successfully interpret ‘Taro’ in the second sentence as an anaphor and realize its antecedent (interpreting the discourse like, ‘Taro, who went to the library, studied there until late at night’), then they would have little trouble in answering a comprehension question about the antecedent, e.g., ‘Did Taro go to a library?’ (the answer is ‘yes’). On the other hand, if the participants do not recognize ‘Taro’ in the second sentence as an anaphor, they have interpreted the second ‘Taro’ independently from the ‘Taro’ in the first sentence, building no referential relationship between the two Taros. As a result, they would experience a temporary difficulty in answering the comprehension question that asks about the prior ‘Taro’.
This method regarding comprehension questions is a modification of the probe-recognition task such as the one in Nakayama’s (1990) study. He conducted an experiment with items like the following (p. 15):
(i) Machi-o aruiteita obasan-gaᵢ keisatsu-ni kanojo-gaᵢ doroboo-o mita to denwashita.
‘A woman who was walking on the street telephoned the police that she saw the thief.’
(ii) Toshokan-de benkyooshiteiru gakusei-ga tomodachi-ni Øᵢ shukudai-o shiteoita to tsutaeta.
‘The student who was studying at the library informed his friend that [null] did homework.’
Antecedents (obasan ‘woman’ and gakusei ‘student’) of anaphors (kanojo ‘she’ and null pronoun) were shown as probe words after participants read the sentences, and the participants were asked whether the probe words appeared in the sentence that they had just read. The response times to the probe words were the indication of how fast they realized the anaphor-antecedent referential relationships. Nakayama found that the response times for (i) were slower than those for (ii), indicating that the referential relationships were realized faster with null pronouns than with overt pronouns. The present study could not replicate Nakayama’s probe-recognition task: antecedents could not be used as probe words because the antecedents were repeated in the second sentences. Thus, instead of using antecedents as probe words, this study prepared comprehension questions that asked about antecedents.3
All items were presented in Japanese script. The 32 experimental items (8 items for each of the four conditions) were mixed among 68 distractors, and thus there were 100 items in total. Example experimental items for each condition are shown below. (Anaphors in each condition are italicized.)
(i) Subject anaphor
Two-sentence discourse:
(5) Taro-ga toshokan-ni itta. Taro-ga yoru osoku-made benkyooshita.
Taro-NOM library-DIR went Taro-NOM until late at night studied
‘Taro went to a library. Taro studied until late at night.’
Comprehension question:
(5Q) Taro-wa toshokan-ni ikimashita ka.
‘Did Taro go to a library?’
3 Gernsbacher (1989) used the antecedents of repeated-name anaphors as probes in her study, which resulted in comparatively faster response times for the repeated-name anaphors. Gernsbacher also suggested that repeated names facilitated faster antecedent-realization compared with pronouns. However, as Gordon et al. (1993, p. 323) highlight, the response times are unlikely to reflect how fast readers realized the anaphor-antecedent relationship; they may have simply retrieved the anaphors unrelated to the antecedents.
(ii) Topic-subject anaphor
Two-sentence discourse:
(6) Jiro-ga resutoran-de shokuji-o shita. Jiro-wa pasuta-o tabeta.
Jiro-NOM restaurant-LOC meal-ACC did Jiro-TOP pasta-ACC ate
‘Jiro ate a meal at a restaurant. Jiro ate pasta.’
Comprehension question:
(6Q) Jiro-wa resutoran-de tabemashita ka.
‘Did Jiro eat at a restaurant?’
(iii) Scrambled object anaphor
Two-sentence discourse:
(7) Saburo-ga kooen-de asonde-ita. Saburo-o okaasan-ga mukaenikita.
Saburo-NOM park-LOC was playing Saburo-ACC mother-NOM came to pick up
‘Saburo was playing at a park. Mother came to pick up Saburo.’
Comprehension question:
(7Q) Saburo-wa kooen-de asobimashita ka.
‘Did Saburo play at a park?’
(iv) Topic-object anaphor
Two-sentence discourse:
(8) Shiro-ga paatii-ni shusseki-shita. Shiro-wa itoko-ga shootaishita.
Shiro-NOM party-DIR attended Shiro-TOP cousin-NOM invited
‘Shiro attended a party. His cousin invited Shiro.’
Comprehension question:
(8Q) Shiro-wa paatii-ni shusseki shimashita ka.
‘Did Shiro attend a party?’

4 Procedure 
The discourse items in the experiment were presented using E-Prime. Participants read two-sentence discourses sentence by sentence, in a self-paced reading fashion. The experiment was carried out with each participant viewing the sentences on a computer. During the experiment, the participants first received the welcome message and instructions on the computer screen and proceeded to the practice block by hitting the space bar. The practice block provided four practice questions to familiarize the participants with the sentence-by-sentence reading task. After the participants finished the practice questions, they received the end-of-practice message, and they were allowed to proceed to the actual experiment by hitting the space bar. In the practice block and actual experiment, the first sentence of each experimental discourse appeared after the fixation sign, ‘+’. After participants read each discourse, a yes-no comprehension question was given, which could be answered by hitting ‘1 (yes)’ or ‘2 (no)’. After the comprehension question, the fixation ‘+’ appeared, which was followed by the first sentence of the next discourse. The experimental and distractor discourses were given in random order. A session lasted approximately 20 minutes.
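The composition and randomization of the presentation list described above can be sketched in a few lines. This is only an illustration in Python, not the E-Prime script used in the experiment; the function name `build_trial_list`, the dictionary layout, and the `seed` parameter are all hypothetical, and the sketch assumes a single fully shuffled list with no ordering constraints (the text only states that the order was random).

```python
import random

# Counts taken from the text: 4 anaphor conditions x 8 items, plus 68 distractors.
CONDITIONS = ["subject-ga", "topic-subject-wa", "scrambled object-o", "topic-object-wa"]
ITEMS_PER_CONDITION = 8
N_DISTRACTORS = 68


def build_trial_list(seed=None):
    """Return a shuffled list of experimental and distractor trials (hypothetical layout)."""
    rng = random.Random(seed)
    trials = [
        {"kind": "experimental", "condition": condition, "item": number}
        for condition in CONDITIONS
        for number in range(1, ITEMS_PER_CONDITION + 1)
    ]
    trials += [
        {"kind": "distractor", "condition": None, "item": number}
        for number in range(1, N_DISTRACTORS + 1)
    ]
    rng.shuffle(trials)  # assumption: unconstrained random order
    return trials


trials = build_trial_list(seed=0)
print(len(trials), "trials in total")
```

Passing a fixed seed makes the order reproducible, which is convenient for checking a list but is not something the text says was done.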

5 Data Analysis
The independent variable of the experiment was the anaphor type: repeated-name subject-ga, repeated-name topic-subject-wa, repeated-name scrambled object-o and repeated-name topic-object-wa. The measured dependent variables were reading times of the second sentences with anaphors and response times to the comprehension questions asking about antecedents. Linear Mixed Effects analyses using SPSS compared these dependent variables across conditions. The data with the participants’ wrong answers on the comprehension questions were removed from the analysis, affecting 4.95% of the data, as the wrong answers indicate that participants did not accurately comprehend the given discourses. When analyzing reading times, an additional 0.26% of the data with reading times greater than 15,000 milliseconds were removed as outliers. In addition, the reading times that were three standard deviations (SDs) away from each participant’s mean were removed, affecting 2.73% of the data. In total, 7.94% of the data were removed. Likewise, when analyzing question-response times, 0.39% of the data with response times greater than 15,000 milliseconds were removed as outliers. Also, response times that were three SDs away from each participant’s mean were removed, affecting 3.16% of the data. In total, 8.5% of the data were removed.
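The three exclusion steps can be written out as a short filter. The following Python sketch is an illustration only (the analysis itself was run in SPSS); the function name `trim`, the trial dictionary keys, and the order of the steps are assumptions. In particular, it assumes that each participant's mean and SD are computed after the wrong answers and the 15,000 ms outliers have been dropped, and that "three SDs away" means a deviation strictly greater than three SDs; the text does not specify either point.

```python
from statistics import mean, stdev

RT_CEILING_MS = 15_000  # absolute cutoff stated in the text
SD_CUTOFF = 3           # per-participant cutoff stated in the text


def trim(trials, rt_key):
    """Apply the three exclusion steps to a list of trial dicts.

    Each trial is assumed to have the keys 'participant', 'correct', and rt_key.
    """
    # Step 1: drop trials whose comprehension question was answered incorrectly.
    kept = [t for t in trials if t["correct"]]
    # Step 2: drop trials slower than the absolute ceiling.
    kept = [t for t in kept if t[rt_key] <= RT_CEILING_MS]
    # Step 3: drop trials more than 3 SDs from that participant's own mean.
    times_by_participant = {}
    for t in kept:
        times_by_participant.setdefault(t["participant"], []).append(t[rt_key])
    result = []
    for t in kept:
        times = times_by_participant[t["participant"]]
        if len(times) < 2:  # an SD needs at least two observations
            result.append(t)
            continue
        centre, spread = mean(times), stdev(times)
        if abs(t[rt_key] - centre) <= SD_CUTOFF * spread:
            result.append(t)
    return result


# The reported exclusion percentages add up to the reported totals.
reading_total = round(4.95 + 0.26 + 2.73, 2)
response_total = round(4.95 + 0.39 + 3.16, 2)
print(reading_total, response_total)
```

The last lines only confirm that the per-step percentages given in the text sum to the stated totals of 7.94% and 8.5%.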

6 Results
The table and figures below show the mean reading times of the second sentences that included anaphors and response times for the comprehension questions asking about antecedents.4 
Table 1: Reading Times for Anaphoric Sentences and Response Times for Comprehension Questions
Anaphors              Reading times, ms (SD)    Response times, ms (SD)
Subject-ga            2358.81 (1281.56)         1947.83 (903.87)
Topic-subject-wa      2306.44 (1292.57)         1757.32 (772.44)
Scrambled object-o    2843.71 (1474.71)         2027.02 (954.06)
Topic-object-wa       3055.87 (1549.55)         1789.57 (811.28)


Figure 1: Reading times for anaphoric sentences (second sentences) 
4 It was also observed that the accuracy rates for the comprehension questions did not significantly differ between conditions: the accuracy rates were 94% in the subject condition, 95% in the topic-subject condition, 94% in the scrambled object condition, and 97% in the topic-object condition.

Figure 2: Response times for comprehension questions 
The results of the reading times of the second sentences (that included anaphors) showed that sentences with subject-ga anaphors and those with topic-subject-wa anaphors did not significantly differ [β = 52.376, SE = 135.878, t = .385, p = .700]. Also, the reading times for scrambled object-o anaphors and topic-object-wa anaphors did not significantly differ [β = 212.167, SE = 161.662, t = 1.312, p = .190]. However, sentences with scrambled object-o anaphors were processed significantly more slowly than those with subject-ga anaphors [β = -484.894, SE = 147.085, t = -3.297, p = .001] and than those with topic-subject-wa anaphors [β = -537.270, SE = 146.468, t = -3.688, p < .001]. Likewise, sentences with topic-object-wa anaphors were processed significantly more slowly than those with subject-ga anaphors [β = -697.060, SE = 152.105, t = -4.583, p < .001] and than those with topic-subject-wa anaphors [β = -749.436, SE = 151.402, t = -4.950, p < .001]. In short, the reading-time results indicate that SOV sentences with subject-type anaphors (i.e., subject-ga and topic-subject-wa) were processed faster than OSV sentences with object-type anaphors (i.e., scrambled object-o and topic-object-wa).
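The reported reading-time statistics can be related back to Table 1 with a small check. The Python sketch below is not the original SPSS analysis; it only compares each reported estimate with the difference between the corresponding Table 1 means and recomputes the ratio of estimate to standard error. The direction of each contrast (first condition minus second) is an assumption inferred from the signs of the reported values, and the function name `check_contrasts` is hypothetical.

```python
# Mean reading times (ms) from Table 1.
reading_means = {
    "subject-ga": 2358.81,
    "topic-subject-wa": 2306.44,
    "scrambled object-o": 2843.71,
    "topic-object-wa": 3055.87,
}

# (condition A, condition B, reported estimate, reported SE, reported t)
# Assumption: each estimate corresponds to mean(A) - mean(B).
contrasts = [
    ("subject-ga", "topic-subject-wa", 52.376, 135.878, 0.385),
    ("topic-object-wa", "scrambled object-o", 212.167, 161.662, 1.312),
    ("subject-ga", "scrambled object-o", -484.894, 147.085, -3.297),
    ("topic-subject-wa", "scrambled object-o", -537.270, 146.468, -3.688),
    ("subject-ga", "topic-object-wa", -697.060, 152.105, -4.583),
    ("topic-subject-wa", "topic-object-wa", -749.436, 151.402, -4.950),
]


def check_contrasts(means, rows):
    """Return (A, B, mean difference, estimate / SE) for every contrast."""
    checked = []
    for a, b, estimate, se, _reported_t in rows:
        checked.append((a, b, means[a] - means[b], estimate / se))
    return checked


for a, b, difference, ratio in check_contrasts(reading_means, contrasts):
    print(f"{a} vs. {b}: mean difference = {difference:.2f} ms, estimate/SE = {ratio:.3f}")
```

Under these assumptions the estimates track the raw mean differences closely, which suggests that they can be read directly as differences in milliseconds between conditions.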
The results of the response times to the comprehension questions (that asked about antecedents) showed different outcomes. Comprehension questions for the items with topic-subject-wa anaphors were responded to significantly faster than those for the items with subject-ga anaphors [β = 190.508, SE = 90.030, t = 2.116, p = .035] and than those for the items with scrambled object-o anaphors [β = -269.701, SE = 92.583, t = -2.931, p = .004]. Similarly, question-response times for the items with topic-object-wa anaphors were significantly or marginally significantly faster than for those with subject-ga anaphors [β = 158.251, SE = 91.176, t = 1.736, p = .083] and for those with scrambled object-o anaphors [β = -237.443, SE = 93.575, t = -2.537, p = .012]. There was no significant difference between the response times for the items with topic-subject-wa anaphors and topic-object-wa anaphors [β = -32.257, SE = 83.540, t = -.386, p = .700]. Also, there was no significant difference between the response times for the items with subject-ga anaphors and scrambled object-o anaphors [β = -79.193, SE = 100.091, t = -.791, p = .429]. In short, question-response times were faster for the discourse items with topic-type anaphors (i.e., topic-subject-wa and topic-object-wa) than for items with non-topic-type anaphors (i.e., subject-ga and scrambled object-o).
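As a descriptive aid, the four predictions (1) to (4) can be encoded as "faster than" orderings and compared with the Table 1 means for each measure. The Python sketch below is a hypothetical illustration: the names `PREDICTIONS`, `ordering_holds`, and `check` are not from the study, and the comparison looks only at the numerical order of the means. It does not replace the significance tests reported above, so an ordering can "hold" here even where the reported difference is not significant.

```python
# Means (ms) from Table 1.
reading_ms = {
    "subject-ga": 2358.81,
    "topic-subject-wa": 2306.44,
    "scrambled object-o": 2843.71,
    "topic-object-wa": 3055.87,
}
response_ms = {
    "subject-ga": 1947.83,
    "topic-subject-wa": 1757.32,
    "scrambled object-o": 2027.02,
    "topic-object-wa": 1789.57,
}

# Each prediction: (conditions expected to be faster, conditions expected to be slower).
PREDICTIONS = {
    1: (["topic-subject-wa", "topic-object-wa"], ["subject-ga", "scrambled object-o"]),
    2: (["topic-subject-wa"], ["topic-object-wa"]),
    3: (["topic-subject-wa", "subject-ga"], ["scrambled object-o", "topic-object-wa"]),
    4: (["scrambled object-o"], ["topic-object-wa"]),
}


def ordering_holds(means, faster, slower):
    """True if every 'faster' condition has a smaller mean than every 'slower' one."""
    return all(means[f] < means[s] for f in faster for s in slower)


def check(means):
    """Map each prediction number to whether its ordering holds in the given means."""
    return {
        number: ordering_holds(means, faster, slower)
        for number, (faster, slower) in PREDICTIONS.items()
    }


print("reading times: ", check(reading_ms))
print("response times:", check(response_ms))
```

Read this way, the ordering in (3) shows up in the reading-time means and the ordering in (1) in the response-time means, which matches the pattern summarized in the text.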

7 Discussion
The results of the reading times of the second sentences with anaphors can be attributed to word order, as indicated by prediction (3). The sentences with subject anaphors and topic-subject anaphors are in the default SOV order, and those with scrambled object anaphors and topic-object anaphors are in the OSV order. The results showed that SOV sentences were faster to process than OSV sentences. Also, the reading-time results for topic-subject anaphors and topic-object anaphors can be attributed to prediction (2). Readers might have misinterpreted topic-objects as topic-subjects, and later they had to reanalyze the interpretation. Thus, sentences with topic-objects were processed more slowly than those with topic-subjects.
The effect predicted in (4), that scrambled objects should be processed faster than topic-objects, did not appear in this experiment. Scrambled objects with the accusative postposition o should have been immediately realized as objects, while topic-objects with wa should have been initially misinterpreted as topic-subjects. Nevertheless, the two were read at speeds that did not significantly differ. This result may mean that the reanalysis of topic-objects does not incur enough processing cost to significantly slow down overall processing. If this possibility is true, there should be no reading-time difference attributable to reanalysis between topic-subjects and topic-objects either, and thus prediction (2) should not account for the overall results for reading times. Thus, prediction (3), SOV vs. OSV, solely accounts for the reading-time results.
The effect predicted in (1), an advantage of topic anaphors in realizing the anaphor-antecedent relationship, did not appear in the reading-time results. This outcome implies that the reading-time differences only reflect the word-order effects, which may have overridden the possible effect predicted by (1). On the other hand, prediction (1) was supported by the results of response times to comprehension questions. The question-response times for the items with topic-type anaphors (i.e., topic-subject and topic-object) were faster than for non-topic-type anaphors (i.e., subject and scrambled object) at significant or marginally significant levels. The results indicate that, when participants read the second sentences with topic-type anaphors, the topic postposition wa signaled to them that the topic entities overlapped the antecedent entities, resulting in immediate realization of their referential relationship. In other words, the participants possibly interpreted the given two sentences as one continuous discourse that described one person whose name appeared in the discourse. Thus, they quickly responded when they were asked about the antecedents, the person, in the first sentences.
In contrast, the slower question-response times for the items with non-topic-type anaphors may indicate that the readers processed the second sentences with the anaphors independently of, and discontinuously from, the first sentences with the antecedents. Readers seem to have shifted their attention away from the first sentence when they processed the second sentence. In other words, processing non-topic-type anaphors did not initially trigger realization of the referential relationship between the anaphors and antecedents.5 The participants were likely reminded of the antecedents only when the comprehension questions asked about them, resulting in slower responses.

8 Limitations 
A possible limitation of this study is that the experimental items were all different rather than distributed in a Latin-square design. Had the same discourse items been used with only the anaphors varied, the results would have been more reliable. Another limitation could be found in the familiarity of the persons' names that were used as anaphors and as antecedents in the experimental items. Common names could be processed faster than relatively rare names. A familiarity-check survey with native Japanese speakers could have been conducted, and only common names should have been used. Future research with the same objective as the present study can be conducted with modifications that address the above problematic factors in the experimental design.

9 Conclusion 
This study investigated referential resolution in Japanese using four different types of repeated-name anaphors. The results showed that word order affects sentence processing and that, more importantly, topic-hood of anaphors (indicated by the topic postposition wa) contributes to building referential relationships between anaphors and antecedents. The fact that topic anaphors and non-topic anaphors elicited different outcomes provides an implication for future studies that examine referential expressions. That is, while most existing studies focus on anaphors' forms such as pronouns vs. repeated names, morphological markings such as wa or ga in Japanese should also be considered, which will contribute to cross-language understanding of referential resolution. The present study is one such study that examines the effects of morphological markings. Similar research in other morpheme-marking languages could be conducted to verify the replicability of this study. Such follow-up studies may find it universal that morphological topic-marking helps readers realize referential relationships between anaphors and antecedents.

5 One of the participants gave a comment to the author after the experiment: "I experienced a weird feeling that the images that I created from the sentences did not quickly connect." This comment might express that the images from non-topic anaphors without wa were not quickly connected to the images from antecedents.

References 
Ariel, M. (1990). Accessing noun-phrase antecedents. New York: Routledge. Chafe, W. (1987). Cognitive constraints on information flow. In R. Tomlin (Ed.), Coherence and grounding (pp. 21-52) Amserdam: John Benjamins. Gernsbacher, M. A. (1989). Mechanisms that improve referential access. Cognition, 32(2), 99­
156. Retrieved from: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4467536/ Gordon, P. C., Grosz, B. J. & Gilliom, L. A. (1993). Pronouns, names, and the centering of attention in discourse. Cognitive Science, 17, 311-347. Gordon, P. C., & Hendrick, R. (1998). The representation and processing of coreference in discourse. Cognitive Science, 22, 389-424. Gundel, J. K., Hedberg, N., & Zacharski, R. (1993). Cognitive status and the form of referring expressions in discourse. Language, 69, 274-307. 
Halliday, M. A. K. (1967). Notes on transitivity and theme in English. Journal of Linguistics, 19, 377-417. 
HoðT|o.u, S/ E/, & C|o.., H/ H/ (1974)/ Wtoo.. .ùE? !šuÆT.T.o .ùE T.a..ooTa. o. o u.ašù.. T. 
comprehension. Journal of Verbal Learning and Verbal Behavior, 13, 512-521. Kuno, S. (1973). The structure of Japanese language. Cambridge, MA: MIT Press. Martin, S. (1975). A reference grammar of Japanese. New Haven, CT: Yale University Press. Nakayama, M. (1990). Accessibility to the antecedents in Japanese sentence comprehension. 
Unpublished manuscript. Nishimura, M. (1989). The topic-comment construction in Japanese-English code-switching. World Englishes, 8, 365-377. Prince, E. F. (1978). On the function of existential presupposition in discourse. Papers from the Fourteenth Regional Meeting, Chicago, IL: Chicago Linguistics Society. 
FUNCTIONAL SIGNIFICANCE OF CONTEXTUAL DISTRIBUTION: DISCOURSE PARTICLE AR IN BANGLA 
Soumya Sankar GHOSH
ghosh.soumya73@yahoo.com
Jadavpur University, India

Samir KARMAKAR
samirkrmkr@yahoo.co.in
Jadavpur University, India

Arka BANERJEE
banerjeesoumyo29@gmail.com
Jadavpur University, India
Abstract 
This paper deals with the Bangla indeclinable ar to explore its role in conversational discourse. In doing so, the paper provides a detailed study of ar in the Bangla language. This, in turn, helps to conceptualize how the occurrences of ar motivate a conversation, in the pragmatic domain in particular. More specifically, the multiple interpretations of ar pose a particular challenge to semantics and pragmatics, which can be addressed through the incorporation of phonological context. Phonological context contains information about speakers' intention and speakers' approach to their utterance. The paper discusses several criteria, namely the traditional and polysemous nature, intonational pattern, evidentiality etc., which are crucial in determining its role in structuring a conversation.
Keywords: particle ar; intonation; evidentiality 
Povzetek 
È|o.ù. .ù a..ùuaoao .o .ù..|a.|.TðT ar ð Šù.oo|GT.T T. u.aÆÆ.ù ..ùoaða ð|aoa ð oaða..ù.ù. diskurzu ter s tem ponudi podrobno analizo o njegovi uporabi. To.Gù. u.T.oau ua.|ùuT.a a.aoao oÆuT zo.|.Æ.ù v obratni smeri in sicer o tem, kako pojavnost izraza ar .o u.oo.ooT.T .oð.T .uauŠÆ.o pogovor/ ŠoùðT|.a.o ..ùoaðTt T.où.u.ùoošT. u.ùu.ooð|.o TzzTð oo.a ð .ù.o.oT.T .ao ð u.oo.ooT.T, .o. uo .ù .a uaù.a.ooðToT z ÆuaGoùðo..ù. a.a|aG.ùoo .a.où..oo/ S|ùu..T .o..ù ð.ùŠÆ.ù T.a..ošT.ù a oaða.ùðTt .o.ù.ot T. oaða.ùðù.Æ au.a.Æ ua Tz.ùù.ùoo/ !ðoa. oo.a .ozu.oð|.o a a.a|aG.Tt kriterijih kot so o.ouTšTa.o|.o T. ðùua.ù...o .o.oðo Tz.ù.ù, T.oa.ošT...T ðza.šT, ua.oz|.Tða.o Tu./, .T so au|aT|.T u.T Æoaooð|janju njegove vloge v pogovoru. 
Ključne besede: členek ar; intonacija; dokazljivost
Acta Linguistica Asiatica, 7(1), 2017. 
ISSN: 2232-3317, http://revije.ff.uni-lj.si/ala/ 

DOI: 10.4312/ala.7.1.23-39 



1 Introduction
This paper seeks to investigate the role of the Bangla indeclinable ar in discourse, with particular emphasis on its usage pattern at the utterance level. A brief survey of its usage in Bangla discourse reveals that ar functions in two different ways: it can occur either (i) as a conjunctive indeclinable or (ii) as a non-conjunctive indeclinable. At the level of utterance, the non-conjunctive indeclinable carries multiple senses depending on its contextual behavior. Therefore, the objective of this research paper is to disambiguate the various senses of ar with special reference to the contexts of its use. For further elaboration, please consider the following examples:
(1)
... .. .... ....
ram ar rohim o.-b-e
Ram-NOM INDL Rahim-NOM come-FUT-3
'Ram and Rahim will come.'

(2)
... .. .... ..
ram ar o.-b-e na
Ram-NOM INDL come-FUT-3 NEG
'Ram will never come.'

(3)
.. .... ... ......
ar b....or-e ram ù.-e...-il-o
INDL year-Loctemp Ram-NOM come-PRF-PST-3
'Ram came in the previous year.'

It is evident from these examples that the indeclinable ar has different roles in all its occurrences. As in (1) it occurs as a conjunctive particle having the sense of 'and'; additionally, its scope is restricted within the NP (e.g. ram ar rohim). As a conjunctive particle it connects two NPs (ram and rohim). Contrarily, in (3) ar modifies the following NP (b....or-e) and carries the sense of 'previous'. Being modified with ar, the complex NP (e.g. ar b....or-e) appears VP-internally and finally moves to the sentence-initial position. In (2) ar means 'never', and it appears within the scope of VP (e.g. ... .. .... .).
To understand the way ar brings different colors of interpretation simply by overriding the truth conditional content of the utterance within which it is embedded, this paper will concentrate primarily on the constructions like (2) and (3). 

2 Research Question
In dealing with ar, this paper analyzes the role of ar in a communicative context. In particular, the pragmatic function and evidential nature of this particle in conversational discourse will be explored. To attain these goals, the discussion is distributed over the following major sections: Section 3 provides a short overview of the Bangla language, and Section 4 discusses how ar is dealt with in traditional Bangla grammar. This is further augmented with a discussion of the polysemous nature of ar. Subsequently, in Section 5 an effort is made to integrate the traditional findings with the phonological ones to explore how communicative intention is crucial in determining the way the meaning-construing capacity of ar varies from one context to another.

3 Bangla: An Overview
Bengali, also known by its endonym Bangla, is an Indo-European language mainly spoken in eastern South Asia. It is the national language of Bangladesh and one of the official languages of India, spoken mainly in West Bengal and parts of Assam, Bihar, Jharkhand, Mizoram, Tripura, North Western Burma, and the Andaman and Nicobar Islands. With over 250 million speakers it is the seventh most spoken native language in the world.
Following Gordon (2005), Khan (2008) points out that Grierson's (1928) survey of Bangla dialects is still used as the basic classification of the language's variants. Grierson divides the Bangla language into two branches, i.e. Eastern and Western. It is important to mention here that this division does not follow any national or geographic boundary. The paper summarizes the said branches in the following manner.
I. Western Branch
  a. Central Bangla
     i. In Indian West Bengal: Nadia (Standard Bengali), Kolkata, Haora, Tamluk, Medinipur, Murshidabad, Barddhaman
     ii. In Bangladesh: Kushtia
  b. Northern Bangla
     i. In Indian West Bengal: East Malda, Koch Bihar
     ii. In Bangladesh: Rajshahi, Dinajpur, Bogra, Pabna
  c. Western Bangla
     i. In Indian West Bengal: Kharia Thar, Mal Paharia, Manbhum
     ii. In Indian Bihar: Saraki
  d. Southwestern Bangla

II. Eastern Branch
  a. Eastern Bangla
     i. In Bangladesh: Dhaka, southeastern Faridpur, Mymensingh, Comilla, Bakerganj, Sylhet, Hajong, Sandwip Island
     ii. In Indian Assam: Cachar
  b. East-Central Bangla
     i. In Bangladesh: Jessore, Khulna, Faridpur
  c. Southeastern Bangla
     i. In Bangladesh: Noakhali, Chittagong, Chakma, Tangchangya
     ii. In Myanmar: Sittwe
  d. Rajbanshi
     i. In Bangladesh: Rangpur
     ii. In Indian West Bengal: Siripuria, Jalpaiguri, Bahe
     iii. In Indian Assam: Goalpara
Based on this division, Khan (2008) further points out that the dialects differ not only at the syntactic level; a major difference is also observable at the phonological and morphological levels. The distinction between oral and nasal vowels, /./ and /./, /./ and /./, vowel rounding harmony, and voicing harmony are some of the noteworthy differences (Chatterjee, 1939; Grierson, 1928). Thus, even though the speakers of all dialects are familiar with the standard form (for India it is the 'Kolkata Standard Bengali' and for Bangladesh it is the 'Bangladeshi Standard Bengali'), the influence of regional dialects on this standardized form is significant. Having said this, in the next section the paper will focus on the behavior of ar in detail. In doing this, the paper will concentrate on the 'Kolkata Standard Bengali' in terms of data and for the analysis part.

4 Indeclinable ar in Bangla
Traditionally, ar is classified as an indeclinable mainly because it is insensitive to declension. Its significance lies in its capacity to change the overall sense of an utterance. Compare (4) with (5): example (4) says nothing specific about the span of time for which the articulation holds true. In other words, it works in an indefinite manner, leaving open the scope of Ram's coming back at some future time. Example (5), by contrast, holds true for a longer period of time: in fact, depending on the context, it may sometimes indicate that 'Ram will never come'. More formally, the latter indicates that the coming of Ram will never hold true for all future time.
(4)
... .... ..
ram o.-b-e na
Ram-NOM come-FUT-3 NEG
'Ram will not come.'

(5)
... .. .... ..
ram ar o.-b-e na
Ram-NOM INDL come-FUT-3 NEG
'Ram will not come again.'
These two utterances form a minimal pair by virtue of (not) having ar in the utterance body. From their respective literal translations it is also clear that whatever difference they possess in their meaning is due to the (non-)appearance of ar: in fact, the appearance of ar remains extremely crucial in implicating a particular type of inference in the presence of a pretext, as is illustrated below:
(6)
(..... .. .... ......) ... .. .... ..
(ram-er ..a mon-er .bostha) ram ar o.-b-e na
ram-GEN PRT mind-GEN situation Ram-NOM INDL come-FUT-3 NEG
'(The state of mind in which Ram is now,) Ram will not come again.'

(7)
... .. ...... .... ..
ram ar k.khonoi o.-b-e na
Ram-NOM INDL ever come-FUT-3 NEG
'Ram will not ever come again.'


Being an implicature of (6), (7) satisfies the features of conversational implicature, namely defeasibility, non-detachability, calculability, and non-conventionality. When an utterance containing no ar is substituted in (6), a different implication is licensed.
(8)  (.....  ..  ....  ......)  ...  ....  ..  
(ram-er  ..a  mon-er  .bostha)  ram  o.-b-e  na  
ram-GEN  PRT  mind-GEN  situation  Ram-NOM  come-FUT-3  NEG  

›.(The state of mind in which Ram is now,) Ram will not come again.. 
Therefore, (4) and (5), as the members of a minimal pair, reflect contrastive distribution, resulting in a kind of paradigmatic arrangement.
4.1 Nature of ar 
As stated earlier, this paper mainly focuses on the non-conjunctive ar, and this section in particular concentrates on how ar in Bangla behaves as a discourse particle. Unlike other particles such as to and na, the particle ar does not bring about any kind of semantic change by its presence in the initial, medial, or final position of an utterance, as shown below:
(9) 	
... .. .... .. 
ram ar o.-b-e na 
Ram-NOM PRT come-FUT-3 NEG 
.Ram will not come again.. 


(10) 	
... .... .. .. 
ram o.-b-e na ar 
Ram-NOM come-FUT-3 NEG PRT 
.Ram will not come again.. 


(11) 	
.. ... .... .. 
ar ram o.-b-e na 
PRT Ram-NOM come-FUT-3 NEG
.Ram will not come again.. 



However this is not the case with other Bangla particles to and na. Consider the examples of (12)-(15). 
(12) 	
... . . .... 
ram to o.-b-e 
Ram-NOM PRT come-FUT-3 
.Ram will come.. 


(13) 	
... .. .... 
ram na o.-b-e 
Ram-NOM PRT come-FUT-3 
.Ram will come.. 



The appearance of to and na as modifiers of the preceding noun in the subject position of (12)-(13) plays a crucial role in emphasizing the respective assertions unambiguously. Therefore, phonological cues are not significant in interpreting these utterances. Contrariwise (14)-(15) need phonological cues to get interpreted unambiguously. In the absence of the phonological context, as is the case here, each of them can be interpreted either as an assertion or as a question. 
(14) 	
... .... . . 
ram o.-b-e to 
Ram-NOM come-FUT-3 PRT 
.Ram will come./Will Ram come?. 


(15) 	
... .... .. 
ram o.-b-e na 
Ram-NOM come-FUT-3 NEG 
.Ram will not come./Will Ram not come?. 



Chatterjee (1939) identifies ar as one of the members of the fundamental indeclinable class1 in Bangla. At the sentential level it can co-occur with other particles. Consider the following:

(16)
... . . .. .... ..
ram to ar o.-b-e na
Ram-NOM PRT PRT come-FUT-3 NEG
'Ram will not come again.'

(17)
... .. .. .... ..
ram na ar o.-b-e na
Ram-NOM PRT PRT come-FUT-3 NEG
'Ram will not come again.'

(18)
... .. .. .... ..?
ram ki ar o.-b-e na
Ram-NOM PRT PRT come-FUT-3 NEG
'Ram will not come again?'

In (16)-(18) ar is used in all examples to emphasize the predicate of the utterance, but it is also interesting to observe the role of the other particles in these utterances. In (16) the speaker's intention was to add a pragmatic force both to the subject and predicate, as the occurrence of to after the subject and the occurrence of ar before the predicate emphasize the subject and the predicate respectively. By doing this the speaker wants to imply that 'it is Ram who will not come again'. Similarly, in (17) the insertion of na induces politeness to the whole information, and in (18) the question particle ki adds a notion of polarity to the utterance.
4.2 Polysemous Nature of ar 
The discussion of the nature of ar leads the paper to focus on another unique feature. In some situations, the meaning-construing capacity of ar, at both the sentence and the utterance level, largely depends on the words with which it co-occurs. As a consequence, in these cases ar functions not as a particle but as the grammatical category with which it collocates, as in the following:
(19)
.. ..... ..
ar ªk-bar ù.-o
ADJ one-time come-PRS-2
'Come once again.'
1 Chatterjee (1939) identifies na, ba, ki, ar, and to as fundamental indeclinables in Bangla. At the utterance level these indeclinables, having the nature of particles, work as a functional category. Therefore, we can form a class containing these particles as its members.
(20)
... .. .... ... ..?
ram ar ki...u bol-l-o na?
Ram-NOM ADJ some say-PST-3 NEG
'Ram didn't say anything else?'

(21)
...... .. ...... .. ....... .. ...
iskul-e uoÆ...(o)-l-am ar Š.T..i o.a.Š.a ho-l-o
school-LocTEMP reach-PST-1 CONJ rain begin be-PST-3
'As soon as I reached the school rain started.'

(22)
... .. .... .. .... . ...
tumi b.l-o ar na b.l-o ami to bol-b-o
you-NOM speak-PRS-2 INDL NEG speak-PRS-2 I PRT speak-FUT-1
'You speak or not, I will speak.'

(23)
.. ..... ...... ..
ar kau-ke bol-o na
ADJ someone-ACC say-PRS-2 NEG
'Don't say this to anyone else.'

It is not hard to show from examples (19)-(23) that ar in discourse creates various types of meanings depending on its use. In example (19) ar carries the sense of 'again', which leads the speaker to request the hearer to come one more time. In the next example, (20), ar indicates the sense of 'more', and by saying this utterance the speaker expresses his expectation in an emphatic manner. In example (21) ar adds information of time, having the sense of 'as-soon-as'. Similarly, examples (22) and (23) also contain two different meanings of ar, i.e. a sense of 'or' and a sense of 'another'. It is important to mention here that, unlike in the other examples, ar in (21) functions not as a particle but as a conjunctive indeclinable.
These examples establish our line of argument that the meaning-construing force of ar is very much dependent on the neighboring words. That is why the resulting utterances are interpreted in distinct ways. In the case of (19), the incorporation of ar with the quantifier . .... brings the sense of 'again' to the sentence. In (20) ar generates the sense of 'more' as it appears after the pronoun ki ...u. In the rest of the examples, ar forms senses like 'as-soon-as', 'or', and 'another' in line with the mentioned claim. This line of argument can be further cemented by three of the examples above: (3), (19) and (23). In all these cases ar occurs initially but creates three different interpretations depending on its different co-occurrences, with the sense of 'year', 'once' and 'person'.
A syntactic line of thought will specify that the occurrence of ar in utterance-initial position projects ar not as a particle but as different lexical categories, regardless of its non-conjunctive nature. In (19), (20) and (23) it functions like an adjective, in (21) as a conjunctive and in (22) as an indeclinable.

5 Discussion
What follows, therefore, is an emerging necessity to explore the significance of ar in inducing a particular illocutionary force in an utterance. To address this newly evolved concern one needs to consider the phonological make-up of an utterance, because the appearance of a particle in an utterance influences the meaning in two distinct ways: (a) it influences the meaning of the utterance in terms of those pragmatic behaviors which are pertinent from the viewpoint of its discrete segmental appearance, as discussed in detail in Section 4; and (b) beyond its discrete reality it also participates in the non-discrete supra-segmental make-up of the utterance. Though the syntactic, semantic and pragmatic behavior of particles is discussed in the existing literature, very few studies in any true sense try to explore the way communicative intention is captured through the characteristic interactions holding between the segmental and supra-segmental layers of linguistic representation. This is exactly the point at which the current investigation departs from the rest of the studies on particles.
At the level of the prosodic hierarchy, an utterance is denoted as an Intonational Phrase (henceforth IP), which is comprised of Phonological Phrases (henceforth PP or P). A PP is further decomposed into Prosodic Words (henceforth Pwd), containing information about the supra-segmental aspects associated with the lexical words (lex)2. Furthermore, function words3 (henceforth Fnc) are categorized either as a Prosodic Word or as a Prosodic clitic4 (henceforth Pcl) (Selkirk, 2003).
In the phonological phrase the pitch accents are tones, high (H) or low (L), that get linked to stressed syllables, which is formally represented as H* and L*. At the boundary level, for both the phonological phrase and the intonational phrase, the phrase accent is identified as the Phonological phrase boundary or TP and the Intonational phrase boundary or TI (Hayes & Lahiri, 1991). Having said this, the intonational pattern of sentence (4) can be represented as in Figure 1:
2 Selkirk (2003) discusses two structures at the level of utterance: i) S-structure and ii) P-structure. S-structure contains the lexical words (Lex), whereas P-structure contains the sequence of Prosodic words (Pwd) in the phonological representation.
3 Function words (Fnc) are the members of a class in which membership is largely fixed, such as in the cases with determiners, prepositions, conjunctions and particles. Lexical words (Lex), on the other hand, constitute the open class expressions having the unlimited numbers as the new items are continually being added. 
4 Prosodic clitics (Pcl) are those morpho-syntactic words which are not themselves Pwds.

Figure 1: Pitch Pattern of Example (4) 
Figure 1 shows that the initial syllable of a word often carries the stress marker as the syllable .. gets the stress and it also receives a high pitch accent (H*). Additionally, the graph falls down around the final syllable of the I-phrase boundary (LI ) just after the high pitch accent (H*). The corresponding metrical grid representation is given in Figure 2 to show the distribution of the stresses over the utterance: 

Figure 2: Metrical Grid of Example (4) 
In Figure 2, ram and .... .. are the two P-phrases that constitute IP/Utt. Note that in a neutral situation like this declarative sentence, the rightmost P-phrase within the I-phrase receives the main prominence, as the I-phrase stress rule assigns stress to the rightmost P-phrase of the I-phrase. In this metrical grid, na functions as a prosodic clitic, more precisely as an internal clitic5, and because of this it further adds extra stress to the leftmost non-clitic word (Hayes & Lahiri, 1991).
5 Internal clitic is another branch of prosodic clitic which is dominated by the same Pwd that on the other hand dominates its sister lexical word lex 
The insertion of ar in the utterance does not alter the meaning of the sentence; rather, it adds a higher degree of negativity to it. This higher degree of negativity can only be achieved through the performance of some inferential task, as already discussed towards the beginning of Section 4. A nearly similar situation can be grasped through (24) and (25) from a different perspective.
(24)
..... ....?
.Æ.T| o.-b-e
Susil-NOM come-FUT-3
'Will Susil come?'

(25)
..... .. ....?
.Æ.T| ar o.-b-e
Susil-NOM PRT come-FUT-3
'Will Susil come again?'


A close look at these two examples indicates that, in (24) and (25), the speaker is giving the hearer license to draw the inference that the speaker is in doubt about Susil's coming. Additionally, a careful analysis shows that these two examples vary from each other in terms of their respective implicational capacities: (24) implicates that the speaker is concerned about Susil's coming in a (future) span of time of which the lower bound is the utterance time associated with it. More explicitly, Susil's coming could be either true or false in the presupposed span of time; as a consequence, (24) will not implicate the falsity of Susil's coming for all future time. Here, in this case, the speaker's psychological state is severely restricted by a temporal constraint by virtue of not having a hidden sense of 'never'. Contrariwise, in (25) Susil's coming is in doubt for all future time. In other words, the speaker's psychological state presupposes the presence of 'never' in the underlying representation. In fact, the speaker is asking the hearer to confirm whether Susil will ever come. Due to its characteristic implicational pattern, (25) can be further augmented with the following lexicalized context as its pretext:
(26)  ..  .....  ....  ........  ..  . ....  ..  .....  
.ù  din-er  Šou.zù  byabohar-er  por  tomar  ki  monehoy  
that  day-GEN  bad  behavior-GEN  then  you-NOM  Q-PRT  think  
.....  .. ....?  
.Æ.T|  ar  o.-b-e?  
Susil-NOM  PRT  come-FUT-3  

.Do you think that Susil will come again after the behaviour you have shown to him on the day of the accident?. 
In the domain of conversation the particle ar captures more emotional coloring through the intonational pattern. Consider the following: 
(27) Speaker 1:
..... .... ....
dekh-b-e ..bai o.-b-e
see-FUT-3 all come-FUT-3
'Don't worry, everyone will come.'

Speaker 2:
(.... ....) ... .. .... ..
(..bai ele-o) ram ar o.-b-e na
(all come-COND-EMP) Ram-NOM PRT come-FUT-3 NEG
'(Although everyone will come) Ram will not come again.'

(28) Speaker 1:
... .. .... ..?
ram ar o.-b-e na
Ram-NOM PRT come-FUT-3 NEG
'Ram will not come again?'

Speaker 2:
..
na
no
'No.'
In (27), speaker 2's incorporation of ar in his reply to speaker 1's question establishes the fact that speaker 2 is more or less certain that the person called 'Ram' will never come. In (28), on the other hand, the situation is a little different. Here, the speaker uses ar in his utterance as a negative polarity particle by making a change in the intonation. In situations like the two mentioned in (27) and (28), the question then arises of how the distinctive intonations associated with the utterances are selected. A little attention reveals that the selection of intonation patterns is not bound to the selection of the discrete lexical units of the utterances; rather, it is bound to the context of the communication.
The intonation pattern of the example in (27), as evident in Figure 4, does not show much change compared to Figure 1. The interesting fact here is that ar as a clitic particle brings ram under a narrow focus situation, as illustrated in Figure 3:

Figure 3: Metrical Grid of Speaker_2's Utterance in Example (27)
In Figure 3, under the narrow focus situation, there is a low pitch accent on the syllable ram, and for this reason the phrase ram-ar receives a high tone at the P-phrase boundary (HP); the pitch map shows this fact in a more elaborate manner.

Figure 4: Pitch Pattern of Speaker_2's Utterance in Example (27)
It is important to note here that ar appears not only as a clitic particle but, additionally, as an internal clitic. This further implies that the lex-fnc combination displays a phonological behavior identical to that of a Pwd which is constituted of a single lex alone. In Bangla, this particular combination is possible only because it fulfills the criterion that the left edge of any Pwd is required to coincide with the left edge of a Foot (McCarthy & Prince, 1993).
(29) Align (Pwd, L; Ft, L) 
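
As a rough illustration of what the alignment constraint in (29) demands, the following Python sketch checks whether the left edge of every prosodic word coincides with the left edge of some foot. The span representation, the function name `align_left_violations`, and the toy index values are all assumptions made for this illustration; they are not taken from the paper's analysis or from McCarthy & Prince (1993).

```python
from typing import List, Tuple

# A constituent is a (left, right) pair of syllable indices, right-exclusive.
# This encoding is an assumption made for the sketch.
Span = Tuple[int, int]


def align_left_violations(pwds: List[Span], feet: List[Span]) -> List[Span]:
    """Return every Pwd whose left edge matches no Foot's left edge."""
    foot_left_edges = {left for left, _ in feet}
    return [pwd for pwd in pwds if pwd[0] not in foot_left_edges]


# Toy parse 1 (hypothetical): one Pwd over syllables 0-1, one Foot starting at 0.
aligned = align_left_violations([(0, 2)], [(0, 2)])

# Toy parse 2 (hypothetical): the only Foot starts at syllable 1, so the Pwd
# beginning at syllable 0 violates the constraint.
misaligned = align_left_violations([(0, 2)], [(1, 2)])

print(aligned)
print(misaligned)
```

An empty list means the constraint is satisfied for that parse; any returned span marks a prosodic word whose left edge is not foot-initial.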
The transformation of this declarative sentence to an interrogative sentence through the change in intonation brings change not only in the metrical grid but also in the pitch map. 

Figure 5: Metrical Grid of Speaker_1's Utterance in Example (28)

Figure 6: Pitch Pattern of Speaker_1's Utterance in Example (28)
The utterance, as a question, has its narrow focus on .... na, the first syllable of this word, i.e. .., gets the main stress of the utterance. In the yes/no situation, the main stressed syllable often gets the low pitch and then this pitch rises smoothly to the last syllable na, and afterwards it falls again, as is shown in Figure 6. The presence of HI and LI sequence in the pitch map indicates that in the IP boundary, high peak is followed by a final low value. 
The paper has already argued that intonation is a very context-dependent phenomenon. Keeping this in mind, we can also say that the particle ar can occur not only as a clitic particle but also as a focus particle in an utterance like ram .. .... ... Jackendoff (1972) argued on a similar occasion that if a P-phrase comes in the focus position of an utterance (U), the highest stress in U will be on the syllable of that P-phrase. Thus the pragmatic domain of the focus is also its phonological domain. As a consequence the focus is defined in the following way:
(30) 	Focus: If F is a Focus and DF is its domain then the highest prominence in DF 
will be within F. 
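
The condition in (30) can be read as a simple check over a metrical grid. The sketch below is a minimal illustration under stated assumptions: the prominence values are invented grid heights, the labels `"verb"` and `"na"` are placeholders for the predicate material, and the function name `focus_rule_holds` is hypothetical. None of these come from the paper's figures.

```python
from typing import Dict, List


def focus_rule_holds(prominence: Dict[str, int],
                     domain: List[str],
                     focus: List[str]) -> bool:
    """True if every most-prominent item of the domain DF lies inside the focus F."""
    peak = max(prominence[item] for item in domain)
    peak_items = [item for item in domain if prominence[item] == peak]
    return all(item in focus for item in peak_items)


# Invented grid heights, for illustration only.
domain = ["ram-ar", "verb", "na"]
subject_prominent = {"ram-ar": 3, "verb": 2, "na": 1}
predicate_prominent = {"ram-ar": 1, "verb": 3, "na": 2}

subject_focus_ok = focus_rule_holds(subject_prominent, domain, ["ram-ar"])
subject_focus_bad = focus_rule_holds(predicate_prominent, domain, ["ram-ar"])
predicate_focus_ok = focus_rule_holds(predicate_prominent, domain, ["verb", "na"])

print(subject_focus_ok, subject_focus_bad, predicate_focus_ok)
```

Under these toy values, a grid whose peak falls on ram-ar is compatible with focus on ram-ar, while a grid whose peak falls on the predicate is compatible only with focus on the predicate.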

In the conversational discourse the domain of focus (DF) is defined as a sector from which the scope of the focus can be determined. This domain is both phonologically and semantically relevant. This pragmatic-phonological interface can be illustrated in Figure 7 and Figure 8: 

Figure 7: Phonology Pragmatics Interface with Focused Subject Containing Discourse Particle 

Figure 8: Phonology Pragmatics Interface with Non-Focused Subject Containing Discourse Particle 
Figure 7 suggests that ar as a clitic particle emphasizes the constituent to which it is attached; in this case ram is stressed. Due to this, the entire clause, being a domain of focus, selects ram-ar as the focus constituent. By contrast, the interrogative utterance in Figure 8 implies that the speaker is uttering this question out of disbelief. Thus in the entire focus domain ..be na gets the relative prominence compared to ram-ar, and as a consequence it falls in the focus part.

Evidential Nature of ar 
The above discussion establishes that ar not only generates the speaker's intention but also marks the source and reliability of the knowledge behind a particular assertion. It specifies the source of evidence on which statements are based and the degrees of precision, probability and expectation. So, in simple words, we can say that ar as an evidential shows what kind of justification for a factual claim is available to the person making the claim. In order to grasp evidentiality in a better way, the machinery of Grice's theory becomes important, as it explains not only what is conversationally implied but also what is said. Therefore, by applying the maxims of Quality, the utterances (P) we have mentioned above can be reinterpreted as (a) the speaker believes that 'P', and also (b) the speaker has adequate evidence of 'P'.
At the level of the utterance, this evidential nature is expressed not only through linguistic items (in this paper, ar) but also through extralinguistic elements such as intonation. In (27) and (28), the sentence ... .. .... .. is uttered with two different intonational patterns: (27) with declarative intonation and (28) with question intonation. This implies that in (27) the speaker believes, on the basis of some evidence, that Ram will not come, whereas in (28) the speaker is not sure about Ram's arrival and is therefore trying to confirm the fact. Thus, if the speaker is asserting/claiming/declaring that P, (s)he must believe that P; if the speaker is suggesting/guessing/questioning that P, (s)he must believe that there is not sufficient reason to believe that P, which is the weakest degree of commitment (Bach & Harnish 1979). 
To sum up, ar plays a very significant role in construing pragmatic meaning in Bangla discourse. It further addresses the questions of 'what constitutes the knowledge of language' and 'how this knowledge is put to use'. More thorough research along these lines, together with a comparative study of some Bangla discourse particles, will help us to build the structure of the conversation in a more concrete manner. 

References 
Bayer, J., Dasgupta, P., Mukhopadhyay, S., & Ghosh, R. (2014, February 06-08). Functional Structure and the Bangla Discourse Particle to. Retrieved from http://ling.unikonstanz.de/pages/StructureUtterance/web/Publications_&_Talks_files/Bayer_Dasgupta_MukhopadhyayGhosh_SALA.pdf 
Bach, K., & Harnish, R. M. (1979). Linguistic Communication and Speech Acts. MIT Press, Cambridge, Mass. 
Chatterjee, S. K. (1939). Bhasa-Prokash Bangla Byakaron. Rupa Publication India Limited. 
Dastidar, R. G., & Mukhopadhyay, S. (2013). Utterance Discourse and Meaning: A Pragmatic Journey with the Bangla Discourse Particle /na/. In: Mining Intelligence and Knowledge Exploration (pp. 814-822). Springer International Publishing. 
Gordon, R. G., Jr. (ed.) (2005). Ethnologue: Languages of the World, Fifteenth edition. Dallas, Tex: SIL International. Online version: http://www.ethnologue.com/. 
Grierson, A. (1928). Linguistic Survey of India. Calcutta, British India. Available online: http://joao-roiz.jp/LSI/ 
Hayes, B., & Lahiri, A. (1991). Bengali Intonational Phonology. Natural Language & Linguistic Theory, 47-96. Springer International Publishing. 
Ifantidou, E. (2001). Evidentials and Relevance. John Benjamins Publishing Company. 
Jackendoff, R. S. (1972). Semantic Interpretation in Generative Grammar. MIT Press, Cambridge, Mass. 
Khan, S. (2008). Intonational Phonology and Focus Prosody in Bengali. PhD thesis, University of California, Los Angeles. 
McCarthy, J., & Prince, A. (1993). Generalized Alignment. In G. Booij & J. van Marle (Eds.), Yearbook of Morphology 1993. Dordrecht: Kluwer. 
McHugh, B. (1990). Cyclicity in the Phrasal Phonology of Kivunjo Chaga. Ph.D. dissertation, UCLA. 
Selkirk, E. (2003). The Prosodic Structure of Function Words. In J. McCarthy (Ed.), Optimality Theory in Phonology: A Reader. Blackwell Publishing. 
JAPANESE N DESHITA IN DISCOURSE: PAST FORM OF N DESU 
Hironori NISHI 
University of Memphis, USA hnishi1@memphis.edu 
Abstract 
N deshita/datta, which is the past-tense form of n desu/da, has not been explored in depth in the field of Japanese linguistics. By using the Balanced Corpus of Contemporary Written Japanese (BCCWJ) as a database, the present study explores the cases of n deshita/datta used for past events and situations. The findings of the present study show that approximately one-third of the cases of n deshita/datta used for past events and situations in the corpus co-occurred with grammatical elements that require past-tense connections such as the sentential ending particle kke, the tara structure, and the tari structure. For the cases of n deshita/datta that co-occurred with kke, tara, or tari, it was concluded that the grammatical restrictions arising from these elements triggered the occurrences of n deshita/datta. On the other hand, about two-thirds of the cases of n deshita/datta occurred without any grammatical elements that require past-tense connections. These cases of n deshita/datta were used to express the speaker's recollection of previously held knowledge, or as part of confirmation-seeking utterances for previously held knowledge. 
Keywords: Japanese linguistics; discourse analyses; past tense; n desu; n deshita; n datta 
Povzetek 
N deshita/datta, ki je pretekla oblika strukture n desu/da, v japonskem jezikoslovju ni nikoli dobila pozornosti. Z vpogledom v korpus BCCWJ (Balanced Corpus of Contemporary Written Japanese) tokratna raziskava razkriva uporabo te oblike za pretekle dogodke ali .oz.ù.ù/ RùzÆ|oooT .ožù.a, uo se u.TŠ|Tž.a ù.o o.ùo.T.o ð.ùt u.T.ù.að n deshita/datta, .T .ožù.a .o u.ùoù.|ù dogodke ali razmere, ua.oð|.o ..Æuo. . .ooð.T. |ù..a. kke,v tara strukturi ali pa v tari strukturi. Za omenjene tri primere .ù .a .ùT, uo .ù ua.oð.a.o u.ùoù.|ù aŠ|T.ù n deshita/datta ua.|ùuTšo .|að.T.Tt u.oðT|/ Ta uo .ù ðù|.o za preostali dve tretjini primerov z obliko n deshita/datta, preko katerih govorec izrazi njemu žù znane dogodke ali razmere azT.a.o a ..TtaðT u.oðT|.a.oT au .aoaða.šo u.To.Æ.ù uao.uT|a/ 
Ključne besede: japonsko jezikoslovje; diskurzivna analiza; preteklik; n desu; n deshita; n datta 
Acta Linguistica Asiatica, 7(1), 2017. 
ISSN: 2232-3317, http://revije.ff.uni-lj.si/ala/ 

DOI: 10.4312/ala.7.1.41-56 



1 Introduction 
The Japanese sentential ending n desu structure has been discussed and explored in depth in the field of Japanese linguistics. However, previous studies focus mainly on its present-tense form n desu, and the past-form n deshita has not been included in the scope of these analyses. By examining a large corpus, the present paper will explore the usages of n deshita used for past events and situations in discourse, and discuss the factors that trigger the usage of n deshita. 

2 Japanese n desu structure 
The Japanese n desu structure has been the focus of linguistic inquiries by various scholars (Jorden, 1963; Alfonso, 1966; Kuno, 1973; McGloin, 1980, 1981, 1984, 1989; Aoki, 1986; Tanomura, 1990; Takatsu, 1991; Maynard, 1992, 2005; Noda, 1997; Ijima, 2010; among many others). The n desu structure consists of the nominalizer n and the copula desu, and the structure is believed to create various interactional effects when it is added to the end of a sentence. Compare the a. sentences with the b. sentences in (1) and (2). 
(1)  a. ...........  
Watashi  wa  hirugohan  o  taberu.  
I  TP  lunch  O  eat  
'I eat lunch.'  
b. .............  
Watashi  wa  hirugohan  o  taberu  n  desu.  
I  TP  lunch  O  eat  N  CP  
'(It is that) I eat lunch.'  
(2)  a. ........  
Kaban  wa  ōkii.  
bag  TP  large  
'The bag is large.'  
b. ..........  
Kaban  wa  ōkii  n  da.  
bag  TP  large  N  CP  
'(It is that) the bag is large.'  

(1a) and (2a) simply express the semantic information included in the sentences, but (1b) includes the n desu structure, and (2b) includes n da, which is a non-polite variant of n desu. The n desu structure in Japanese is typically translated as 'it is that' in English (Jorden and Noda, 1987; McGloin, 1980, 1989; Lammers, 2005; McGloin et al., 2013; etc.), but it is recognized as one of the most obscure and difficult-to-conceptualize grammatical structures in Japanese. Due to its wide range of usages and versatile interactional effects, various arguments have been formulated on the interactional functions of the n desu structure. For example, McGloin (1989) argues that by using the n desu structure, the speaker can 'present information which is known only to the speaker or the hearer as if it were shared information' (p. 89), and it has interactional functions such as explanation, rapport building, and providing background information. On the other hand, some discuss the n desu structure from the perspective of evidentiality. Aoki (1986) argues that the n desu structure has an evidential function of marking 'nonspecific evidential statements' (p. 223), which does not explicitly indicate or specify the source of information for the stated proposition though treating the information as factual. 
For the description of past events or situations with the n desu structure, the tense of the component that precedes n desu is modified into the past tense, but n desu itself typically remains unchanged. Examples (3) and (4) show the usages of n desu for a past event. 
(3)	
............. 
Watashi wa hirugohan o tabeta n desu. 
I TP lunch O ate N CP 


'(It is that) I ate lunch.' 

(4)	
............ 
Kaban wa ōkikatta n da. 
bag TP was large N CP 



'(It is that) the bag was large.' 
In (3), tabeta, which is the past form of taberu 'to eat,' is used before n desu. In (4), ōkikatta, which is the past form of ōkii 'to be big,' is used before n desu. In both (3) and (4), the copula component in the n desu structure stays in the present form and is not affected by the tense of the propositional content that precedes the n desu structure. 
As mentioned in the introduction, past studies on the n desu structure primarily focus on the present-tense n desu, and the past-tense n deshita has not been explored in depth. Examples (5) and (6) include the past-tense forms of n desu and its non-polite variation n da, respectively. 
(5)	
.............. 
Watashi wa hirugohan o tabeta n deshita. 
I TP lunch O ate N CP 


'(It was that) I ate lunch.' 

(6)	
.............. 
Kaban wa ōkikatta n datta. 
bag TP was large N CP 



'(It was that) the bag was large.' 
Even though it is not grammatically unacceptable to use the past-tense n deshita/datta instead of the present-tense n desu/da, some speakers of Japanese may find (5) and (6) unnatural unless a very specific context is given, which might be the reason why the past-tense n deshita/datta is left out in previous studies. Also, in the field of teaching Japanese as a second language, the present-tense n desu/da is introduced in early stages of learning in many Japanese language textbooks, but no explanation is provided for the past-tense n deshita/datta (Jorden and Noda, 1987; Lammers, 2005; Banno et al., 2011; Hatasa et al., 2015; etc.). In addition, many intermediate to advanced level textbooks also do not include any information on the usage of n deshita/datta (Miura & McGloin, 2008; Oka et al., 2009; etc.). Furthermore, as far as the author is aware, no studies have been conducted on the usage of n deshita/datta by L2 speakers of Japanese. 

3 Present study 
N deshita, which is the past form of n desu, has not been explored in depth in previous studies on Japanese linguistics or second language acquisition. In order to explore the usages of n deshita in discourse, the present study examined cases of n deshita in a large corpus. By using the corpus as a database, the present study explored the usages of n deshita quantitatively and qualitatively, and analyzed in which kinds of contextual situations n deshita is used and how its interactional properties are utilized by speakers of Japanese. 
The corpus used in the present study was the Balanced Corpus of Contemporary Written Japanese (BCCWJ), which is a balanced language database for written Japanese that was created by the National Institute for Japanese Language and Linguistics (Maekawa, 2008). The data in the BCCWJ comprise approximately 104.3 million words, and cover text genres such as general books, magazines, newspapers, business reports, blogs, internet forums, textbooks, and legal documents, among others. The search for the linguistic data in the database was conducted through the Chūnagon search portal, which was also developed by the National Institute for Japanese Language and Linguistics, and has a user interface similar to an internet search engine. 
The scope of the present study was limited to cases of n deshita that follow another component in the past form in order to highlight the difference between using the present-tense n desu and the past-tense n deshita for past events, and also to limit the number of cases to be examined due to the large size of the BCCWJ. As for the selection of examples from the database, n deshita and n datta that follow the past marker –ta or –da were searched on the Chūnagon search portal. The search results were examined quantitatively and qualitatively. 

4 Results and discussion 
In order to identify individual examples of the past-tense form of n desu used for past events or situations, the four possible hiragana sequences for the combination of the past morpheme –ta/da and n deshita/n datta, which are ta n deshita (たんでした), ta n datta (たんだった), da n deshita (だんでした), and da n datta (だんだった), were input into the Chūnagon search portal. The search yielded 180 cases of the four possible hiragana sequences for the –ta/da + n deshita/datta combination, but 13 cases were coincidental matches such as kantan datta 'it was easy,' which are irrelevant to the scope of the present study. After eliminating the matching but irrelevant cases, 167 cases were available for further analysis. The following table summarizes the breakdown of the 167 cases of –ta/da + n deshita/datta found in the corpus. 
Table 1: -ta/da + n deshita/datta in the BCCWJ 
Hiragana sequence  # of cases  
-ta/da n deshita  61  
-ta/da n datta  106  
Total  167  
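
The string-matching step behind these counts can be illustrated with a minimal Python sketch. It is not the Chūnagon interface and does not reproduce the actual search: it only shows how the four hiragana sequences can be matched as plain substrings, and why a word such as kantan datta (かんたんだった) matches coincidentally. The sample sentences, the function names, and the small stoplist are hypothetical and were constructed for illustration; a real study would need morphological analysis or manual inspection rather than a stoplist.

```python
# Hypothetical sketch: plain substring matching, not the Chūnagon search portal.
SEQUENCES = ("たんでした", "たんだった", "だんでした", "だんだった")

def find_sequences(sentence):
    """Return the target hiragana sequences that occur in the sentence."""
    return [seq for seq in SEQUENCES if seq in sentence]

def is_coincidental(sentence, stoplist=("かんたん",)):
    """Crude filter: flag sentences containing a listed word.

    かんたんだった (kantan datta) contains たんだった without containing
    the n deshita/datta structure. The stoplist is an assumption.
    """
    return any(word in sentence for word in stoplist)

# Constructed sample sentences (not corpus data).
samples = [
    "ひるごはんをたべたんでした",  # tabeta n deshita: relevant hit
    "しけんはかんたんだった",      # kantan datta: coincidental hit
    "ひるごはんをたべた",          # no n deshita/datta at all
]

hits = [s for s in samples if find_sequences(s)]
relevant = [s for s in hits if not is_coincidental(s)]

print(len(hits), len(relevant))
```

In this toy data two sentences match one of the four sequences, and one of the two is discarded as a coincidental match, which mirrors the reduction from matching cases to relevant cases described above.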

4.1 -ta/da n deshita/datta co-occurring with kke, tara, or tari 
The 167 cases of -ta/da + n deshita/datta in the corpus were examined qualitatively. Out of the 167 cases of -ta/da + n deshita/datta, 63 cases (37.7%) co-occurred with the sentence final particle kke, the tara structure, or the tari structure. Kke, tara, and tari all require a past-tense connection for the preceding grammatical item. The cases of -ta/da + n deshita/datta that co-occurred with kke, tara, or tari will be analyzed in this section. 
Out of the 167 cases of -ta/da + n deshita/datta found in the corpus, 35 cases (21.0%) co-occurred with the sentence final particle kke. More precisely, 30 cases were -ta/da n deshita co-occurring with kke, and 5 cases were –ta/da n datta co-occurring with kke. The sentence final particle kke requires the past-tense form before the particle when it follows a verb or an i-adjective. Kke can also follow a predicate that includes a noun or a na-adjective, but the tense of the predicate can be either present or past, depending on the type of copula used at the end of the predicate (Martin, 1975; Kosaka, 2004; McGloin et al, 2013; etc.). 
Example (7) includes a case of n deshita used with kke found in the corpus. (7) is from an internet discussion board included in the BCCWJ, on which its users ask and answer questions about topics related to everyday life. 
(7)	....—...—...........—..—..... 
Rindobāgu no bōkaru no Watase Maki tte gitā ka bēsu yatte ta 
Rindobāgu LK lead singer LK Watase Maki QT guitar or bass played

................. 
kami no nagai hito to kekkon shita n deshita kke? 
hair LK long person to got married N CP FP 

'Am I right that Maki Watase, who was the lead singer of Rindobāgu (name of a rock band), married the person who was playing the guitar or bass?' 
According to Martin (1975), kke-marked utterances are used to indicate 'thinking back, recollecting to oneself, or questioning oneself about some situations to be recalled' (p. 937). However, as Hayashi (2010, 2012) claims, kke is also commonly used in utterances addressed to another person. With regard to the usage of kke in interactional situations, Hayashi (2010) argues that 'unlike ka and no, kke makes implicit reference to knowledge or information previously held by the speaker and shared with the addressee, but which the speaker has somehow forgotten or is unsure about' (p. 2687). Example (7) is a question about Maki Watase, who is a well-known musician in Japan. The person who asked the question used to have the information but is not sure as of now, and this uncertainty is marked with kke. As for the usage of n deshita, the particle kke requires the past form for the preceding item when it follows the long-form copula desu, and this grammatical constraint seems to be the main factor that triggers the usage of n deshita here. The two forms of the Japanese copula, da and desu, mark different levels of politeness, and generally speaking, desu is considered to be more polite than da. When kke follows the less polite copula da, the copula can be either the present-tense da or the past-tense datta, and neither of them is grammatically incorrect. Examples (8) and (9) demonstrate the acceptability of using datta and da directly before kke, respectively. 
(8)	
.............. 
Ano hito, Tanaka-san datta kke? 
that person Tanaka Mr. CP FP 


'Am I right that that person is Mr. Tanaka?' 

(9)	
............ 
Ano hito, Tanaka-san da kke? 
that person Tanaka Mr. CP FP 



'Am I right that that person is Mr. Tanaka?' 
Even though the tense of the copula component is different in (8) and (9), there are no semantic or communicative differences between (8) and (9). However, as Kosaka (2004) points out, when the long-form copula desu is used before kke, it must be modified into the past-tense deshita, and the present-tense desu cannot precede kke. Observe (10) and (11). 
(10) .—............. 
Ēto, dochira san deshita kke? 
well who CP FP 
'Well, (I used to know but) who are you?' 
(11) *.—............ 
*Ēto, dochira san desu kke? 
well who CP FP 
'Well, (I used to know but) who are you?' 
(Kosaka, 2004, p. 139)  

In (10), deshita, which is the past form of desu, is used directly before kke, and it is an acceptable sentence. On the other hand, the present-tense desu is used in (11), and the sentence is not acceptable. 
The above-mentioned explanation is also applicable to the usage of n deshita in (7), which has already been examined. (12) provides a hypothetical example in which n desu is used instead of n deshita in (7). 
(12) 	*....—...—...........—..—..... 
*Rindobāgu no bōkaru no Watase Maki tte gitā ka bēsu yatte ta 
Rindobāgu LK lead singer LK Watase Maki QT guitar or bass played
................ 
kami no nagai hito to kekkon shita n desu kke? 
hair LK long person to got married N CP Q 
'Am I right that Maki Watase, who was the lead singer of Rindobāgu (name of a rock band), married the person who was playing the guitar or bass?' 
As demonstrated by (12), the present-tense of the copula desu cannot precede kke due to the grammatical constraint imposed on the usage of kke. Therefore, in order for the speaker to use kke after the n desu structure, and if the speaker also wants to preserve the politeness level marked with the long-form copula desu, the speaker has no choice other than to use the past-tense deshita with kke. There are many examples of n deshita co-occurring with kke similar to (7) in the corpus, and the usages of n deshita in those cases appear to be resulting from the grammatical constraint discussed above.1 
1 As demonstrated in the comparison between (8) and (9), the non-polite da and datta are interchangeable before kke and the meaning of the sentence does not change regardless of the choice. The five cases of –ta/da n datta kke in the corpus seem to be resulting from the flexibility of using da or datta directly before kke. 
Another grammatical form that frequently co-occurred with -ta/da n deshita/datta in the corpus was tara. Out of the 167 cases of –ta/da n deshita/datta in the corpus, 26 cases (15.6%) co-occurred with the tara structure. Generally speaking, the Japanese tara structure is considered to express conditional meaning similar to 'if' or 'when' in English. Tara indicates that 'the action/state expressed by the main clause in a sentence takes place after the action/state expressed by the subordinate clause' (Makino and Tsutsui, 1989, p. 452). The structure is typically labeled as the 'tara' structure in linguistic research, but technically the tara structure consists of the past form of a verb, an i-adjective or a copula, and ra that follows it. For example, the tara structure for the verb taberu 'to eat' is tabeta ra, which consists of the past form tabeta 'ate' and ra. When ra follows a copula, datta ra or deshita ra is formed depending on the intended politeness level. 
The following example is from the BCCWJ, and it was uttered by a character in a novel. (13) contains a case of -ta n datta that co-occurs with the tara structure. 
(13)	......................... 
Dō mo kō mo hikikaesu shika nai nā. Byōin ni kaette kara 
anyway go back have to FP hospital to return then

...................... 
kiga tsuita n datta ra, ashita ni mawashi chau kedo. 
notice N CP if tomorrow until wait FP 

'Anyway, I have to go back. If (it was that) I noticed it after I return to the hospital, I would have waited until tomorrow.' 
As mentioned earlier, in order for the tara structure to be formulated, the grammatical unit that directly precedes ra must be in the past form. Therefore, whenever the tara structure is used with a predicate that ends with the n desu structure, the copula component must be converted to the past-tense deshita or datta. This is very similar to what was observed for the sentence final particle kke earlier, since the usage of the past-tense n deshita is triggered by the grammatical restriction caused by a grammatical component that directly follows n desu for both kke and the tara structure. 
The third type of grammatical element that requires a past-tense connection co-occurring with -ta/da + n deshita/datta is the tari structure. In the examined corpus, 2 cases of -ta/da + n deshita/datta co-occurred with the tari structure. The tari structure is used to express 'inexhaustive listing of actions or states' (Makino and Tsutsui, 1989, p. 458), and it is typically used with verbs as in utatta ri odotta ri suru 'do things like singing and dancing,' but it can also be used with nouns and adjectives. As for the formation of the structure, tari consists of a past-tense form of a predicate + ri and suru 'to do,' forming structures such as tabeta ri nonda ri suru 'to do things like eating and drinking,' ōkikatta ri omokatta ri suru 'to be big, heavy, etc.,' and tsukue datta ri isu datta ri suru 'desks, chairs, etc.' In addition, the tari structure is sometimes used as a sentential ending expression that marks uncertainty. This usage of the tari structure usually co-occurs with the gerund form ending shite, forming expressions such as ōkikatta ri shite '(something) might be big.' 
The following example, (14), is one of the cases of -ta/da n deshita/datta that co-occurs with tari found in the corpus. It is taken from a scene in a novel where the protagonist recalls his childhood memories. 
(14)	........................... 
Are kara, shibaraku shite boku to nīsan wa issho ni ofuro ni hairu 
that since after a while me and older brother TP with take a bath

......................... 
koto wa nakunatte shimatta n da kedo, moshika shite, boku wa 
N TP stopped N CP but perhaps I TP
.................... 
mada nīsan to hairitakatta n datta ri shite. 
still older brother with wanted to take N CP might 

'A little after that, my older brother stopped taking a bath with me, but perhaps, I still wanted to take a bath with him.' 
In the above example, the tari structure is used to express uncertainty at the end of the sentence. Similar to the tara structure discussed earlier, in order for the tari structure to be formulated, the grammatical element directly before ri must be in the past form. Therefore, the copula da in (14) must be in the past form for the sentence to be grammatically acceptable. 
In this section, the usages of kke, the tara structure, and the tari structure with -ta/da n deshita/datta were qualitatively examined. These three grammatical elements require a past-tense connection for the preceding item, and this grammatical restriction seems to trigger occurrences of -ta/da n deshita/datta. The next section will explore the cases of -ta/da n deshita/datta that occurred without any grammatical elements which would require past-tense connections. 
4.2 -ta/da n deshita/datta without required past-tense connection 
4.2.1 -ta/da n deshita/datta for recollection of previously held knowledge 
Out of the 167 cases of -ta/da + n deshita/datta in the examined corpus, 104 cases (62.3%) were -ta/da + n deshita/datta that did not precede any grammatical elements that require past-tense connections. After examining each case of -ta/da + n deshita/datta, it was found that there are several ways in which -ta/da + n deshita/datta is used in discourse. 
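
The proportions reported for the two groups of cases can be re-derived from the raw counts given in the text. The following short Python sketch does only that arithmetic; the variable and function names are illustrative and are not part of the original study.

```python
# Counts as reported in the text; names are illustrative only.
TOTAL = 167
by_copula = {"n deshita": 61, "n datta": 106}
by_trigger = {"kke": 35, "tara": 26, "tari": 2}

def share(count, whole=TOTAL):
    """Percentage of the whole, rounded to one decimal place."""
    return round(100 * count / whole, 1)

# Cases followed by an element that requires a past-tense connection.
constrained = sum(by_trigger.values())
# Cases without such an element.
unconstrained = TOTAL - constrained

print(constrained, share(constrained))
print(unconstrained, share(unconstrained))
for name, count in by_trigger.items():
    print(name, count, share(count))
```

With these counts, the three constrained types add up to 63 cases (37.7%), and the remaining 104 cases correspond to 62.3% of the 167 cases.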
The first type of usage of -ta/da + n deshita/datta without being followed by grammatical elements that require a past-tense connection was expressing the speaker's recollection of previously held knowledge. As Jorden and Noda (1987) explain, Japanese past-tense forms can be used for currently continuing actions or conditions, and it may refer to the speaker's recalled knowledge. Observe the past-tense copula deshita in B's utterance in (15). 
(15) 	A:.............. 
Amerika taishikan, doko desu ka. 
America embassy where CP Q 

'Where's the American Embassy?' 
B:	.—.......... 
Eto, Toranomon deshita ne. 
uh Toranomon CP FP 

'Uh, it was Toranomon, wasn't it?' (i.e., as I recall it) 
(Jorden and Noda, 1987, p. 196) 
In response to A's question, B uses the past-tense deshita, but this does not necessarily mean that the American Embassy was located in Toranomon in the past and now it has moved to a new location. The usage of the past-tense form here indicates that the speaker has just recalled his/her previously held knowledge, and the relocation of the American Embassy is not being implied or indicated. 
In the examined corpus, there were many cases of -ta/da + n deshita/datta that were used to indicate the speaker's recollections of previously held knowledge. The next example, (16), is from a blog entry about taking pictures of rare birds. 
(16)	..................... 
Tashika kyonen mo kare ni satsuē o jama sareta n datta. 
perhaps last year also him by photo shoot O got interrupted N CP 

.I I .ù.ù.Šù. .Toto (.aE I .ùšo|| otoo) tù (.. tÆ.Šo.u) o|.a T.où..Æuoùu .. utaoa .taao |o.o .ùo./. 
In this part of the blog, the writer recalls that her husband interrupted her photo shoot last year, and the recollection of the information is indicated by the past-tense datta oo otù ù.u a otù .ù.où.šù/ Sa.ù .ùouù.. .o. ùù| otoo otù .uùo.ù... .ùša||ùšoion is also expressed by tashika .T I .ù.ù.Šù. .Toto. T. (16), ŠÆo ùðù. Etù. tashika is .ù.aðùu, otù T.uTšooTa. a otù .uùo.ù... .ùša||ùšoTa. uaù. .ao što.où/ OŠ.ù.ðù (17)/ 
(17)	.................. 
Kyonen mo kare ni satsuē o jama sareta n datta. 
last year also him by photo shoot O got interrupted N CP 

'(Now I recall that) he (my husband) also interrupted my photo shoot last year.' 
However, for this particular example, if the present-tense da was used instead of datta at the end of the sentence, the sentence would give the impression that the writer has just come to realize what she stated. In (18), the past-tense datta in (17) is modified into the present-tense da. 
(18)	................ 
Kyonen mo kare ni satsuē o jama sareta n da. 
last year also him by photo shoot O got interrupted N CP 
'He (my husband) also interrupted my photo shoot last year.' 
As demonstrated by (18), if the sentence ended with the present-tense da, it would give the impression that the writer has just realized that her husband interrupted her photo shoot last year, and the speaker's recollection of previously held knowledge is not expressed. In addition, as Sadanobu (2004) argues, speaker recollection can be marked by using a past-tense ending only when the sentence is about stative situations, and it cannot be marked when the sentence is about dynamic actions. 
(19)	.............. 
Kyonen mo kare ni satsuē o jama sareta. 
last year also him by photo shoot O got interrupted 
'He (my husband) also interrupted my photo shoot last year.' 
N datta in (17) is removed in (19). As demonstrated by (19), since jama sareta 'got interrupted' is a dynamic action, simply using the past tense for the action does not indicate that the speaker just recalled previously held knowledge. However, as we observed in (16) and (17), the speaker can indicate recollection of previously held knowledge for dynamic actions when -ta/da + n deshita/datta is used. 
The examined corpus included many other cases of -ta/da + n deshita/datta similar to (16). The following are some of the examples of -ta/da + n deshita/datta that were found in the corpus, and they appear to be indicating speaker recollection of previously held knowledge. 
(20)	..................... 
Sō ieba, kyonen mo pinku no shikuramen o ni hachi 
speaking of which last year also pink LK cyclamen O two pots 
........... 
kureta no o omoidashita. 
gave me LK O recalled 

'Speaking of which, I recalled that I also received two pots of pink cyclamens last year.' 
................... 
Hito hachi wa umaku saki tsuzuketa kedo, mō hito hachi wa 
one pot TP well kept blooming but another pot TP 
................ 
sugu ni dame ni nacchatta n datta. 
quickly bad became N CP 

'(Now I recall that) the cyclamens in one pot kept blooming well, but the ones in another pot went bad quickly.' 
(21)	.....................—.....—.. 
A, sō da. Anmari ni hannō ga warui node, Gūguru Ado Wāzu o 
oh so CP extremely response SB bad because Google AdWords O
............. 
tēshi ni shite oita n datta. 
turn off set N CP 
'Oh, yes. (Now I recall that) I turned off Google AdWords because the response was very bad.' 
Both (20) and (21) are sentences about dynamic actions that happened in the past, and –ta n datta is used at the end of the sentence. The past-tense datta in each sentence seems to be indicating speaker recollection of previously held knowledge. 
4.2.2 -ta/da + n deshita/datta in confirmation-seeking utterances 
The corpus also included cases of -ta/da + n deshita/datta used in sentences for seeking confirmation and agreement. This type of usage seems to be derived from -ta/da + n deshita/datta that indicates the speaker's recollections, especially when sentence final particles such as ne and yone are added to the sentence. According to Izuhara (2003), both ne and yone have the interactional function of establishing shared recognition between the speaker and the addressee, and this function of ne and yone seems to be contributing to the formation of the interactional effect.2 
Example (22) is from an article based on an interview with a victim of aerial bombing during World War II. The utterance is made by the interviewer. 
2 Technically speaking, Izuhara (2003) categorizes ne as a confirmation seeker, and yone as an agreement seeker. However, since the focus of the present study is not the difference between ne and yone, the difference between the two particles is not fully discussed here. For more details, see Izuhara (1993, 2001, 2003). 
(22)	......................... 
Kūshū no toki wa Hongō made aruite irasshatta n deshita ne. 
bombing LK when TP Hongo to walked N CP FP 

'You walked to Hongo when the bombing happened, right?' 
In (22), it appears that the interviewer had already held the stated information when the utterance was made, and the sentence final particle ne is used to indicate the whole utterance was made as a confirmation seeking utterance. 
In addition to ne, yone was also used with -ta/da + n deshita/datta in several confirmation seeking sentences in the corpus. (23) is a question utterance by an interviewer in an interview with a musician. 
(23)	ツアー自体は広島から始まったんでしたよね。 
Tsuā jitai wa Hiroshima kara hajimatta n deshita yone. 
tour itself TP Hiroshima in started N CP FP 

‘The (concert) tour itself started in Hiroshima, right?’ 
In (23), -ta n deshita is followed by yone. Similar to the example that included ne, (23) appears to be uttered as a confirmation seeking utterance for the propositional information that was previously held by the speaker. The examined corpus also included many other examples that were similar to (22) and (23). Based on the abundant usage of these cases in the corpus, the combination of -ta/da + n deshita/datta and ne or yone seems to be a commonly recognized way of seeking confirmation for previously held knowledge. 


Conclusion 
The present paper has explored the usages of -ta/da + n deshita/datta in discourse by examining a large corpus. The findings have shown that 37.8% of -ta/da + n deshita/datta in the corpus co-occurred with either kke, the tara structure, or the tari structure. Kke, tara, and tari all require past-tense connections for the preceding grammatical element. The analysis has shown that the occurrences of -ta/da + n deshita/datta with kke, tara, or tari are triggered by the grammatical constraints arising from those sentential ending expressions or connections. In addition, in the examined corpus, 62.2% of the occurrences of -ta/da + n deshita/datta were not accompanied by any grammatical elements that require past-tense connections. Those cases of -ta/da + n deshita/datta are used to indicate the speaker's recollection of previously held knowledge, or, when used with the sentence-final ne or yone, as part of a confirmation-seeking utterance for previously held knowledge. 
The author of the present study is aware of the limitations in the present study. The present study only focused on the past-form of the n desu structure that is used for past events and situations. Needless to say, it is possible for the speaker to use n deshita/datta for ongoing or future events and situations as long as the information was previously recognized in the past. Further analysis of those cases may contribute to expanding our understanding of the usages of the past-tense n deshita/datta. In addition, no de atta, which is the past-tense form of no de aru, was not explored in the present study. No de aru is a variant of n desu, and it is predominantly used in formal-style written texts, especially in narrative texts such as the main body of novels. The relationship between tense, aspect, and point of recognition seems to be operating on a different system in those narrative texts, and communicative functions of no de aru and n desu in colloquial utterances also seem to be highly differentiated. Conducting a comparative study on n deshita/datta and no de atta may further reveal the interactional effects created by using the past form of the n desu structure. 
Finally, in the field of Japanese language pedagogy, explicit instruction on the usages of n deshita/datta is usually not included in the curriculum, even though the n desu structure itself is introduced in elementary-level textbooks. Due to the complexity around the usages of n desu, not including n deshita/datta may be reasonable in order to avoid overwhelming beginning-level learners. However, it may be beneficial for learners of Japanese to include instruction on n deshita/datta in intermediate to upper level courses as part of activities to fine-tune their usage of the n desu structure. 

References 
Alfonso, A. (1966). Japanese language patterns: A structural approach (Vol. 2). Sophia University LL Center of Applied Linguistics. 
Aoki, H. (1986). Evidentials in Japanese. In W. Chafe & J. Nichols (Eds.), Evidentiality: The linguistic coding of epistemology (pp. 223-238). Norwood: Ablex. 
Banno, E., Ikeda, Y., Ohno, Y., Shinagawa, C., & Tokashiki, K. (2011). Genki I: An integrated course in elementary Japanese. Tokyo: The Japan Times. 
Hatasa, Y., Hatasa, K., & Makino, S. (2015). Nakama 1, introductory Japanese: Communication, culture, context. Stamford: Cengage Learning. 
Hayashi, M. (2010). An overview of the question–response system in Japanese. Journal of Pragmatics, 42(10), 2685-2702. 
Hayashi, M. (2012). Claiming uncertainty in recollection: A study of kke-marked utterances in Japanese conversation. Discourse Processes, 49(5), 391-425. 
Ijima, M. (2010). Noda bun no kinō to kōzō. ........... ....... ., 75-117. 
Izuhara, E. (1993). Shūjoshi “yo” “yone” “ne” no sōgōteki kōsatsu: “yone” no komyunikēshon kinō no kōsatsu o jiku ni. ...... D...... ........ ..... .... ....... ., 21-34. 
Izuhara, E. (2001). “Ne” to “yo” sai saikō. !.... G..... D...... `...... `.... ..(1), 35-49. 
Izuhara, E. (2003). Shūjoshi “yo” “yone” “ne” saikō. !.... G..... D...... `...... `.... ..(2), 1-15. 
Jorden, E. H. (1963). Beginning Japanese. New Haven: Yale University Press. 
Jorden, E. H., & Noda, M. (1987). Japanese: the spoken language. New Haven: Yale University Press. 
Ka.o.o, K/ (2004)/ .Kù. ŠÆ. .a T.T .uzu/ Kotoba no Kagaku, 47, 139-158. 
Kuno, S. (1973). The structure of the Japanese language. Cambridge: MIT Press. 
Lammers, W. P. (2005). Japanese the manga way: An illustrated guide to grammar & structure. Berkeley: Stone Bridge Press. 
Maekawa, K. (2008). Balanced corpus of contemporary written Japanese. IJCNLP 2008, 101. 
Martin, S. E. (1975). A reference grammar of Japanese. New Haven/London: Yale University Press. 
Maynard, S. K. (1992). Cognitive and pragmatic messages of a syntactic choice: The case of the Japanese commentary predicate n(o) da. Text-Interdisciplinary Journal for the Study of Discourse, 12(4), 563-614. 
Maynard, S. K. (2005). Expressive Japanese: A reference guide to sharing emotion and empathy. Honolulu: University of Hawaii Press. 
McGloin, N. H. (1980). Some observations concerning no desu expressions. The Journal of the Association of Teachers of Japanese, 15(2), 117-149. 
McGloin, N. H. (1981). Discourse functions of no desu. Papers from the Middlebury Symposium on Japanese Discourse Analysis, 151-177. 
McGloin, N. H. (1984). Danwa bunshō ni okeru no desu no kinō. Gengo, 13(1), 254-260. 
McGloin, N. H. (1989). A student's guide to Japanese grammar. Tokyo: Taishūkan. 
McGloin, N. H., Hudson, M. E., Nazikian, F., & Kakegawa, T. (2013). Modern Japanese grammar: A practical guide. London/New York: Routledge. 
Miura, A., & McGloin, N. H. (2008). An integrated approach to intermediate Japanese. Tokyo: Japan Times. 
Noda, H. (1997). .. .. .. ..... Tokyo: Kuroshio. 
Oka, M., Tsutsui, M., Kondo, J., Emori, S., Hanai, Y., & Ishikawa, S. (2009). Tobira: Gateway to advanced Japanese: Learning through content and multimedia. Tokyo: Kuroshio Publishers. 
Sadanobu, T. (2004). Mūdo no “ta” no kakosē. `...... ...... `...... `... D...... `...... .... G..... `.... ß., 1-68. 
Takatsu, T. (1991). A unified semantic analysis of the NO DA construction in Japanese. The Journal of the Association of Teachers of Japanese, 25(2), 167-176. 
Tanomura, T. (1990). G..... ....... ...... .... .. ... .. .... .. Tokyo: Izumi Shoin. 
Appendix: Transcription Conventions and Abbreviations 
CP  various forms of copula verb be  
FP  final particle  
LK  nominal linking particle  
N  nominalizer  
NEG  negative morpheme  
O  object marker  
Q  question marker  
QT  quotative marker  
SB  subject marker  
TP  topic marker  

“TWO SIDES OF THE SAME COIN”: YOKOHAMA PIDGIN JAPANESE AND JAPANESE PIDGIN ENGLISH* 
Andrei A. AVRAM 
University of Bucharest, Romania andrei.avram@lls.unibuc.ro 
Abstract 
The paper is a comparative overview of the phonology, morphology, syntax and lexicon of Yokohama Pidgin Japanese and Japanese Pidgin English, formerly spoken in Japan. Both varieties are shown to exhibit features typical of pre-pidgins, while they differ considerably in the circumstances of their emergence and the context of use. 
Keywords: Yokohama Pidgin Japanese; Japanese Pidgin English; phonology; morphology; syntax; vocabulary; pre-pidgin 
Summary 
The study presents a comparative overview of Yokohama Pidgin Japanese and Japanese Pidgin English as used in Japan a good century ago, covering their phonological, word-formation, syntactic and lexical features. The comparison shows that both varieties display features typical of an early stage of a pidgin, but that they differ substantially in the circumstances of their emergence and in the context of their use. 
Keywords: Yokohama Pidgin Japanese; Japanese Pidgin English; phonology; word formation; syntax; lexicon; early stage of a pidgin 
* An earlier version of this paper was presented at the 4th International Symposium on Japanese Studies “Tradition, Modernity, and Globalization in Japan”, 1-2 March 2014, Centre for Japanese Studies, Bucharest. I thank the audience for their probing questions, and, in particular, Andrej Bekeš (University of Ljubljana, Department of Asian Studies) for his insightful comments. 
Acta Linguistica Asiatica, 7(1), 2017. 
ISSN: 2232-3317, http://revije.ff.uni-lj.si/ala/ 

DOI: 10.4312/ala.7.1.57-76 



1 Introduction 
Yokohama Pidgin Japanese (henceforth YPJ), also known as “Yokohama dialect”, “Yokohama Pidgin” or “Japanese ports lingo”, is a variety of pidginized Japanese, spoken in the second half of the 19th century in Yokohama and, most probably, in Kobe and Nagasaki (Lange, 1903, p. XXVIII; Chamberlain, 1904, p. 369; Daniels, 1948, pp. 805-806; Loveday, 1986, p. 28; Loveday, 1996, p. 69; Stanlaw, 2004, pp. 56-59; Inoue, 2003; Inoue, 2004, p. 116; Inoue, 2006, pp. 55-56). The users of YPJ included Japanese, Westerners (Europeans and Americans), and also a sizable number of Chinese (Stanlaw, 2004, p. 57; Inoue, 2004, pp. 116-117; Inoue, 2006, p. 56). 
YPJ is poorly documented. The attestations are limited to a phrasebook (Atkinson, 1879), a glossary (Gills, 1886), a dictionary (Lentzner, 1892), travel accounts (Griffis, 1883; Knollys, 1887), and two magazine articles (Anonymous, 1879; Diósy, 1879). Under the circumstances, it is hardly surprising that analyses of YPJ are also scarce. Its lexicon is analyzed by Daniels (1948), while Inoue (2006) looks mostly at morphosyntactic features. More comprehensive overviews are found in Avram (2013, 2014). 
Japanese Pidgin English (henceforth JPE), also referred to as “Bamboo English” or “English-Japanese Pidgin”, is an extinct variety of pidginized English, formerly used by US army personnel and local Japanese after World War II (Norman, 1954; Loveday, 1986, p. 29) and later transplanted to South Korea (Algeo, 1960, p. 117; Stanlaw, 2004, p. 70; Loveday, 1986, p. 29). 
There are very few textual sources for JPE. The data analyzed in the present paper are from a dialogue (Michener, 1954), cartoons (Hume, 1954; Hume & Annarino, 1953a, 1953b), and a story (Webster, 1960). Short descriptions of JPE can be found in Norman (1954, 1955), Algeo (1960), Webster (1960), Goodman (1967), Duke (1972), and Stanlaw (1987, 1996, 2004, 2006). A more detailed analysis is found in Avram (2016). 
All examples1 appear in the original orthography or system of transcription used in the sources. The sources are mentioned between brackets. Unless otherwise specified, the translations are from the sources. 
The paper is organized as follows. The phonology, morphology, syntax and lexicon of YPJ and JPE are analyzed in sections 2, 3, 4, and 5, respectively. Section 6 is a discussion of the findings, with reference to various classifications of pidgin languages. Section 7 summarizes the conclusions. 
1 The following abbreviations are used in the examples: 1 = 1st person; 2 = 2nd person; 3 = 3rd person; D = Dutch; DEM = demonstrative; E = English; IMPER = imperative; INDEF = indefinite; J = Japanese; NEG = negator; O = object; PL = plural; S = subject; SG = singular; V = verb. 

2 Phonology 
2.1 	YPJ 
In spite of the inconsistencies in the transcriptions in the textual sources, it can be stated from the outset that the phonology of YPJ has characteristics typical of the Tokyo-Yokohama dialectal area. 
For instance, in Japanese-derived lexical items, the etymological high vowels /i/ and /ɯ/ are not rendered in the transcription if they occur between voiceless consonants or in word-final position. These are precisely the phonological environments in which devoicing of the high vowels /i/ and /ɯ/ occurs in the Tokyo-Yokohama dialectal area2. The absence of the vowel letters <i> or <u> presumably reflects the phonetic realizations [i̥] and [ɯ̥] respectively: 
(1) a. h’to ‘person’ (Diósy, 1879, p. 500) < J hito ‘man’ 
    b. nannats ‘seven’ (Atkinson, 1879, p. 18) < J nanatsu ‘seven’ 
    c. skoshe ‘little’ (Atkinson, 1879, p. 18) < J sukoshi ‘a little’ 
    d. tacksan ‘much’ (Atkinson, 1879, p. 18) < J takusan 
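
The conditioning environment described above can be approximated on romanized forms. The following Python sketch is illustrative only and is not taken from the sources: the function name, the letter-based set of voiceless consonants, and the restriction of word-final deletion to positions after a voiceless consonant are all assumptions made for the sake of the example.

```python
# Hypothetical helper: drops the high vowels <i>, <u> from a romanized Japanese
# word in the devoicing environments, mimicking spellings such as "nannats".
VOICELESS = set("kstphfc")  # assumption: romanized letters treated as voiceless
HIGH_VOWELS = set("iu")


def drop_devoiced_vowels(word: str) -> str:
    """Return the word without high vowels that stand in a devoicing environment."""
    out = []
    for i, ch in enumerate(word):
        if ch in HIGH_VOWELS and i > 0 and word[i - 1] in VOICELESS:
            following = word[i + 1] if i + 1 < len(word) else None
            # Environment: between voiceless consonants, or word-final
            # after a voiceless consonant (assumption).
            if following is None or following in VOICELESS:
                continue
        out.append(ch)
    return "".join(out)


for etymon in ("hito", "nanatsu", "sukoshi", "takusan", "kugi"):
    print(etymon, "->", drop_devoiced_vowels(etymon))
```

Applied to the etyma in (1), the sketch yields hto, nanats, skosh and taksan, which are close to, but not identical with, the attested spellings; a form without a voiceless environment, such as kugi, is left unchanged.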


As in Tokyo-Yokohama Japanese, the alveo-palatal voiceless fricative [ɕ] is substituted for the Standard Japanese palatal voiceless fricative [ç]: 
(2) a. shto ‘man’ (Atkinson, 1879, p. 20) < J hito ‘man’ 
    b. sheebatchey ‘stove’ (Atkinson, 1879, p. 24, f.n.) < J hibachi ‘stove’ 
As shown by the use of the digraph <ng> in the sources, YPJ exhibits the nasal velar [ŋ] in word-medial position, a feature of (earlier) Tokyo Japanese3: 
(3) a. koong-ee4 (Atkinson, 1879, p. 24) / koongee (Atkinson, 1879, p. 28) < J kugi ‘nail’ 
    b. tomango (Atkinson, 1879, p. 24) < J tamago ‘egg’ 
The syllable structure of YPJ is, with a few exceptions, that of Japanese: i.e. with simple syllable margins and with the uvular /N/ as the only admissible consonant in word-final codas5. Again as in Japanese, illegitimate onsets and codas are resolved via two repair strategies – epenthesis, as in (4), or paragoge, as in (5): 
(4) a. buranket ‘blanket’ (Diósy, 1879, p. 500) < E blanket 
    b. si...... ‘railway station’ (Diósy, 1879, p. 500) < E station 
2 See e.g. Avram (2005, pp. 28-33). 
3 See e.g. Shibatani (1990, pp. 171-173) and Avram (2005, pp. 48-56). 
4 In the YPJ examples from Atkinson (1879), <ee> frequently stands for [i], while <oo> represents both [ɯ] and [ɯː]. 
5 See e.g. Avram (2005, p. 96). 

(5) a. drunky ‘drunk’ (Atkinson, 1879, p. 28) < E drunk 
    b. madorosu ‘sailor’ (Diósy, 1879, p. 501) < D matroos ‘sailor’ 
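
The syllable template that drives these repairs can be stated as a simple pattern. The Python sketch below is a rough illustration under stated assumptions, not an analysis from the sources: it works on lowercase romanized spellings rather than phonemes, writes the word-final nasal as n, ignores digraphs and word-medial nasal codas, and uses a hypothetical function name.

```python
import re

# Assumed template: a sequence of (C)V syllables, optionally closed
# word-finally by the nasal, written "n" in this simplified romanization.
TEMPLATE = re.compile(r"(?:[^aeiou]?[aeiou])+n?")


def fits_template(word: str) -> bool:
    """True if the whole word can be parsed as (C)V ... (C)V(n)."""
    return TEMPLATE.fullmatch(word) is not None


for form in ("matroos", "madorosu", "blanket", "takusan"):
    print(form, fits_template(form))
```

On this simplified view, D matroos and E blanket violate the template because of their consonant clusters and final obstruents, whereas the repaired form madorosu conforms to it. Attested forms such as buranket or drunky show that the repair is not always complete, in line with the exceptions noted above.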
The phonology of YPJ appears to have displayed considerable inter-speaker variation. Some of these instances of variation are explicitly attributed to the different first languages of YPJ users. Consider, for example, the differences between the pronunciation of Westerners and that of the Chinese users of YPJ in the phonetic realization of [ɾ]. According to Atkinson (1879, p. 29), “foreigners [= Westerners] as a rule rattle their ‘R’s’ roughly, readily […] or else ignore them altogether”, whereas “the Celestial [= Chinese] lubricates the ‘R’”. This is illustrated by the following examples: 
(6) a. Westerner walk-karrymasing / walk-kawymasing vs. Chinese walk-kallimasing ‘misunderstand’ (Atkinson, 1879, p. 28) 
    b. Westerner am buy worry vs. Chinese am buy wolly ‘not feeling well’ (Atkinson, 1879, p. 28) 

There is also variation in the form of the same YPJ lexical items, as recorded in different sources: 
(7) a. piggy (Atkinson, 1879, p. 21) / peke (Diósy, 1879, p. 501) / peggy (Knollys, 1887, p. 312) ‘to go’ 
    b. pumgutz ‘punishment’ (Atkinson, 1879, p. 28) / bonkotz ‘thrashing’ (Diósy, 1879, p. 501) 
Finally, the same author occasionally lists different forms of the same YPJ word: 
(8) a. jiggy jiggy / jiki jiki ‘to make haste’ (Gills, 1886, p. 113) 
    b. maro-maro / maru-maru ‘to be somewhere’ (Diósy, 1879, p. 501) 
    c. sheebatchey / heebathchey6 ‘stove’ (Atkinson, 1879, p. 24, f.n.) 


2.2 	JPE 
The phonology of JPE, to the extent to which it can be inferred from the available evidence, attests to the occurrence of readjustments of the syllable structure of lexical items derived etymologically from English and Japanese, respectively. Goodman (1967, p. 51), for instance, states that “in the Japanese speaker’s version [of JPE], /o/ or /u/ is normally added in final position to English words that do not end in [m, n, ŋ]”. The three nasal phonemes of English appear to have been treated as the Japanese uvular nasal /N/, the only consonant which can occur in word-final position in Japanese. This would account for the absence of paragogic vowels in such cases. Duke (1972, p. 170) writes that American speakers also resort to paragoge, “by ending English words with either ‘i’ or ‘ee’”7: 
(9) a. changee ‘to change’ (Webster, 1960, p. 160) 
    b. ketchee ‘to catch’ (Webster, 1960, p. 160) 
    c. speakie ‘to speak’ (Webster, 1960, p. 161) 
6 In which <sh> and <h> presumably stand for [ɕ] and, respectively, [ç]. 


On the other hand, both parties involved, i.e. the Japanese and the American users of JPE, also tried to accommodate each other. According to Goodman (1967, p. 52), “English speakers […] developed a sensitivity to Japanese syllabic structure and attempted to end all words with /u/ or /o/ in a rather arbitrary pattern”. 
(10) a. post cardo (Webster, 1960, p. 163) 
     b. saymo-saymo ‘same’ (Goodman, 1967, p. 51) 
Goodman (1967, p. 52) notes that “Japanese speakers […] made the same sort of compensation by clipping off final vowels”: 
(11) ..... ‘car’ (Goodman, 1967, p. 52), cf. J ....... 
Inter-speaker variation is widely attested. In (12a) Japanese /N/ is phonetically realized as [n] by American speakers, while in (12b) English /r/ is phonetically realized by Japanese speakers as [ɾ] in word-initial position and is deleted word-finally, and English /l/ is likewise realized as [ɾ]: 
(12) a. [i.ibaN] vs. [i.ibªn] ‘very good’ (Goodman, 1967, p. 51) 
     b. [ɾoːɾaː] vs. [roʊlər] ‘roller’ (Goodman, 1967, p. 51) 
Consider next differences in the phonetic realization of vowels: 
(13) a. [sake] vs. [saki] ‘sake’ (Goodman, 1967, pp. 51-52) 
     b. [i.ibaN] vs. [i.iban] ‘very good’ (Goodman, 1967, p. 51) 
     c. [s...ko..i] vs. [sko..i] (Norman, 1967, p. 44) 


As can be seen, in (13a-b) American speakers use different vowels and in (13c) they do not exhibit the devoiced vowel. Similarly, in (14), Japanese speakers substitute a long vowel for the original diphthong [oʊ] and for [ə]8 respectively: 
(14) [ɾoːɾaː] vs. [roʊlər] ‘roller’ (Goodman, 1967, p. 51) 
7 The use of the paragogic vowel [i] is typical of stereotypical representations of English pidgins or creoles as well as of attempts at “speaking” in such varieties. 
8 In Japanese, the reflex of [-ər], spelled <er>, is long [aː], see e.g. Quackenbush & Oso (1991: 93). 


3 Morphology 
3.1 YPJ 
With the exception of the negators nigh < J nai and -en < J -en, the latter of which only occurs in two verbs, YPJ does not have any inflectional morphology. 
Two word-formation means are attested, compounding and affixation. Compounds frequently compensate for the absence of particular lexical items: 
(15) a. nammai kammy lit. ‘name’ + ‘paper’ = ‘card’ (Atkinson, 1879, p. 21) 
     b. niwa-tori lit. ‘garden’ + ‘bird’ = ‘rooster’ (Diósy, 1879, p. 501) 
A number of compounds are constructed with mono (< J mono ‘thing’), as in (16), or with reflexes of J hito ‘man’, as in (17). 
(16) shiroy mono lit. ‘white’ + ‘thing’ = ‘starch’ (Atkinson, 1879, p. 24) 
(17) selly shto lit. ‘to sell’ + ‘man’ = ‘auctioneer’ (Atkinson, 1879, p. 25) 


Affixation is confined to the use of the suffix -san (< J -san): 
(18) a. babysan ‘child’ (Atkinson, 1879, p. 19) 
     b. doctorsan ‘doctor’ (Atkinson, 1879, p. 24) 
     c. Nankinsan ‘Chinaman’ (Atkinson, 1879, p. 25) 


Reduplication is not really a word-formation means. Firstly, it is neither productive nor frequent: 
(19) a. drunky drunky ‘drunk’, cf. drunky ‘drunk’ (Atkinson, 1879, p. 28) 
     b. mate-mate ‘wait a little’ (Gills, 1883, p. 147), cf. matty ‘wait’ (Atkinson, 1879, p. 20) 
Secondly, as shown by example (19a), there is no demonstrable difference in meaning between the reduplicated form and the simplex one. Moreover, other examples are actually quasi-reduplicated forms9: 
(20) a. chobber chobber ‘food, sustenance’ (Atkinson, 1879, p. 21) 
     b. minner minner ‘all’ (Atkinson, 1879, p. 22) 
     c. sick-sick .š.o... (Atkinson, 1879, p. 20) 
     d. so so ‘sew’ (Atkinson, 1879, p. 21) 
9 Defined by Bakker (2003, p. 40) as “reduplicated forms for which single forms do not exist”. 
3.2 JPE 
Inflectional endings only occur sporadically in JPE as spoken by American users. 
In the derivational morphology of JPE compounding is rather poorly documented. One rare example of a compound is given below: 
(21) benjo ditch lit. ‘toilet’ + ‘ditch’ = ‘toilet, the can’ (Goodman, 1967, p. 49) 
Affixation, limited to the use of the Japanese-derived suffix -san, is better represented. According to Goodman (1967, p. 54), -san is “a sort of suffix”, which can be attached “to any of a group of English-derived terms, like mama, papa, boy, girl, and baby, as both terms of reference and address”. Duke (1972, p. 170) also notes “the use of the honorific suffix -san after many nouns”. Consider the following examples: 
(22) a. boy-san ‘boy; son’ (Duke, 1972, p. 170) 
     b. godmother-san ‘godmother’ (Webster, 1960, p. 160) 
     c. mama-san ‘woman, lady; mother’ (Duke, 1972, p. 170) 
     d. prince-san ‘prince’ (Webster, 1960, p. 160) 


Both Goodman (1967, p. 51) and Duke (1972, p. 170) also discuss reduplication. However, on the strength of the available evidence, JPE appears to have had quasi-reduplicated forms, with no corresponding simplex forms: 
(23) a. chop-chop ‘food’ (Webster, 1960, p. 163) 
     b. dame-dame ‘bad’ (Hume, 1954, p. 95) 
     c. hubba-hubba ‘to hurry’ (Webster, 1960, p. 164) 
     d. saymo-saymo ‘same’ (Goodman, 1967, p. 51) 



4 Syntax 
4.1 YPJ 
The absence of inflections and the small size of its lexicon account for the fact that YPJ words exhibit categorial multifunctionality, as illustrated by the following examples: 
(24) a. die job ‘strong, sound, good, able’ (Atkinson, 1879, p. 19) and ‘well (adv.)’ (Atkinson, 1879, p. 23) 
     b. jiggy jig ‘to hasten’ (Atkinson, 1879, p. 17), ‘quickly’ (Atkinson, 1879, p. 17) and ‘fast’ (Atkinson, 1879, p. 17), ‘the nearest’ (Atkinson, 1879, p. 19) 
     c. pumgutz ‘punish’ (Atkinson, 1879, p. 22) and ‘punishment’ (Atkinson, 1879, p. 28) 
     d. sick-sick ‘illness’ (Atkinson, 1879, p. 17) and ‘sick, ill’ (Atkinson, 1879, p. 28) 


As is mostly the case in Japanese, plurality is inferred from the context or expressed by means of cardinal numerals or quantifiers: 
(25) Tempo meats high kin arimas. (Atkinson, 1879, p. 18) 
     penny three see be 
     ‘I see three pence.’ 
The Japanese case markers (particles and postpositions) have not been retained, as can be seen from the examples below: 
(26) a. Dalley Ø house arimas? (Atkinson, 1879, p. 22) 
        who house be 
        ‘Whose house is this?’ 
     b. watarkshee boto Ø piggy (Atkinson, 1879, p. 21) 
        1SG boat go 
        ‘I’ve gone out in the boat’ 

Only three personal pronouns are attested: 
(27) 	watarkshee 1SG 
anatta / anatter and oh my 2SG 
acheera sto 3SG 

Of these, only watarkshee and oh my occur more frequently. The only demonstrative attested (just once) is kono: 
(28) kono house (Atkinson, 1879, p. 26) 
     DEM house 
     ‘this house’ 
Adjectives are better represented in the available sources. The only degree of comparison of adjectives attested in the corpus is the absolute superlative, formed with num wun preceding the adjective: 
(29) num wun10 your a shee (Atkinson, 1879, p. 25) 
     exceptional good 
     ‘exceptionally nice’ 
The only numerals found in the corpus of YPJ are the cardinal numerals from 1 to 10 and 100. 
10 Etymologically derived from E number one, and presumably pronounced [namwan]. 
Rather surprisingly, YPJ has a copulative verb, arimas < J arimasu. This is attested both in equative, as in (30a), and in predicative structures, as in (30b): 
(30) a. Tempo arimasu. (Atkinson, 1879, p. 16) 
        penny be 
        ‘This is a penny.’ 
     b. Kooroy arimasu. (Atkinson, 1879, p. 19) 
        black be 
        ‘It is black.’ 
Time adverbials are used to indicate tense and aspect: 
(31) a. Sigh oh narrow dozo bynebai moh skosh cow (Atkinson, 1879, p. 27) 
        good bye please by and by more little buy 
        ‘Good bye, please buy [in the future] some more.’ 
     b. meonitchi ... tacksan so so arimasu (Atkinson, 1879, p. 21) 
        tomorrow a lot sew be 
        ‘I will have plenty of work for him.’ 
Only one, invariant negator, nigh < J nai, is attested: 
(32) Atsie sammy eel oh piggy nigh? (Atkinson, 1879, p. 19) 
     hot cold colour change NEG 
     ‘Does his colour change in the various seasons?’ 
As in Japanese, the word order of YPJ is SOV: 
(33) Your a shee cheese eye curio high kin. (Atkinson, 1879, p. 25) 
     good small curios see 
     ‘I wish to see some nice small curios.’ 
YPJ exhibits rather consistently the parameters correlated with the SOV word order. This is illustrated by the examples under (34): 
(34) a. possessor – possessee 
        oh my oh char (Atkinson, 1879, p. 15) 
        2SG tea 
        ‘your tea’ 
     b. adjective – noun 
        die job sto (Atkinson, 1879, p. 19) 
        strong person 
        ‘a strong man’ 
     c. demonstrative – noun 
        kono house (Atkinson, 1879, p. 26) 
        DEM house 
        ‘this house’ 
     d. numeral – noun 
        Stoats sindoe skoshe matty. (Atkinson, 1879, p. 20) 
        one boatman a little wait 
        ‘Let one boatman wait.’ 
     e. adverb – verb 
        skoshe matty (Atkinson, 1879, p. 20) 
        a little wait 
        ‘wait a little’ [translation mine] 
However, as shown below, exceptions are also attested: 
(35) a. noun – numeral 
        Tempo meats high kin arimas. (Atkinson, 1879, p. 18) 
        penny three see be 
        ‘I see three pence.’ 
     b. verb – adverb 
        Oh my piggy jiggy jig (Atkinson, 1879, p. 28) 
        2SG get out quickly 
        ‘Get out quickly’ [translation mine] 
YPJ exclusively resorts to parataxis for sentence coordination: 
(36) watarkshe oki akindo, tacksan cow (Atkinson, 1879, p. 26) 
     1SG big merchant a lot buy 
     ‘I am an important merchant and I buy a lot’ [translation mine] 
Sentence subordination also relies on the exclusive use of parataxis, given the absence of any complementizers, conjunctions, conjunctive particles, etc.: 
(37) a. Nanny sto arimasu, watarkshee arimasen? (Atkinson, 1879, p. 19) 
        what person be 1SG be-NEG 
        ‘Who called when I was out?’ 
     b. Watarkshee am buy worry oh char parra parra (Atkinson, 1879, p. 17) 
        1SG ill tea boil 
        ‘Boil me some tea because I feel ill.’ [translation mine] 
     c. Dye die job arimasen, itchiboo sinjoe nigh. (Atkinson, 1879, p. 28) 
        table good be-NEG one bu give NEG 
        ‘If the table is not good, I won’t give you a bu.’ [translation mine] 
Generally, and as can be seen in the examples (37b) and (37c), subordinate clauses precede main clauses, as in Japanese. 
4.2 JPE 
The virtual absence of inflections and the small size of the lexicon explain the categorial multifunctionality typical of JPE lexical items. Goodman (1967, p. 53) notes “the use of many words in a variety of grammatical functions”. As stated by Duke (1972, p. 170), “grammatically, many of the words function as both nouns and verbs and sometimes as adjectives and adverbs”. Representative examples are provided below: 
(38) a. chop-chop ‘food’ and ‘to eat’ (Duke, 1972, p. 172) 
     b. hayaku ‘quickly’ and ‘to hurry up’ (Goodman, 1967, p. 53) 
     c. okay ‘OK’ and ‘to fix, to adjust’ (Goodman, 1967, p. 54) 
     d. sayonara ‘absence’ and ‘to get rid of’ (Goodman, 1967, p. 53) 
     e. taksan ‘much, many’, ‘very’ and ‘large’ (Duke, 1972, p. 172) 


The English definite and indefinite articles have not been retained. In the JPE of American speakers, one may be used as an indefinite article: 
(39) one prince-san (Webster, 1960, p. 163) 
     INDEF prince 
     ‘a prince’ 
Nouns are occasionally marked for the plural, but only by American users of JPE. Only three personal pronouns are attested in the corpus: 
(40) a. I / watash      1SG  (Michener, Sayonara, p. 170; Goodman, 1967, p. 48) 
     b. you             2SG  (Michener, Sayonara, p. 170) 
     c. we / ol watash  1PL  (Goodman, 1967, p. 48) 

As seen in (40c), pre-posed ol (< E all) optionally marks plurality. 
A number of adjectives occur in the available corpus of JPE. As shown by Goodman (1967, p. 48), ichiban (< J ichiban) is “used to indicate relative or absolute superlative”. 
Only cardinal numerals are attested. 
The verbal system of JPE is characterized by extreme simplicity. For instance, there is no overt copula (Goodman, 1967, p. 52; Stanlaw, 2006, p. 184): 
(41) a. You Ø takusan steki (Michener, Sayonara, p. 171) 
        2SG very wonderful 
        ‘You are very beautiful.’ 
     b. Kid, you Ø dai jobu (Webster, 1960, p. 164) 
        kid 2SG OK 
        ‘Kid, you’re all right.’ 
Also absent are auxiliary verbs. Consider the example below: 
(42) I beauty saron Ø go. (Michener, Sayonara, p. 171) 
     1SG beauty salon go 
     ‘I’m going to the beauty parlor.’ 
According to Goodman (1967, p. 52), “verbs from both languages [English and Japanese] were used […] in infinitive form or citation form without affixes”. This general absence of verbal inflections accounts for the fact that tense and aspect distinctions could only be inferred from the context or were indicated by time adverbials (Goodman, 1967, p. 52; Stanlaw, 2006, p. 184) such as one time ‘once’, kinoo ‘yesterday’ (< J kinō), ima ‘now’ (< J ima), ashita ‘tomorrow’ (< J ashita), all time ‘always’. 
(43) Maybe you one time gang boy (Hume & Annarino, 1953a, p. 43) 
     maybe 2SG once gangster 
     ‘Maybe you were once a gangster.’ 
The only negator attested is pre-posed no (< E no): 
(44) a. No can stay. (Michener, Sayonara, p. 170) 
        NEG can stay 
        ‘I cannot stay.’ 
     b. all time no fit (Webster, 1960, p. 164) 
        all time NEG fit 
        ‘[they] did not fit’ 
Prepositions are frequently omitted, in particular by Japanese speakers, as in (45a), and, more rarely, also by American users of JPE, as in (45b): 
(45) a. I Ø jobu go (Michener, Sayonara, p. 170) 
        1SG job go 
        ‘I must go to my job’ 
     b. Come on Ø my house (Webster, 1960, p. 164) 
        come on my house 
        ‘Come on to my house’ 
Consider next word order. According to Stanlaw (2006, p. 184), “both American and Japanese speakers were somewhat indeterminate about this”. Indeed, both SVO and SOV patterns are attested, with intra-speaker variation as well, as in (46b): 
(46) a. I takushi, get. (Michener, Sayonara, p. 170) 
        1SG taxi get 
        ‘I’ll call a cab.’ 
     b. You mess my hair, ne. I beauty saron go (Michener, Sayonara, p. 171) 
        ‘You’ve messed up my hair. I’ll have to go to the beauty parlour.’ 
Sentence coordination is mostly achieved by means of parataxis (Goodman, 1967, pp. 52-53), given that coordinating conjunctions are generally not used: 
(47) Meter-meter dai jobu. Testo-testo dammey-dammey. (Goodman, 1967, p. 53) 
     look over OK examine bad 
     ‘It’s fine to look at the girl, but don’t try anything else.’ 
The following is an extremely rare example in which a Japanese-derived coordinating conjunction (keredomo ‘but’ < J keredomo) is used: 
(48) I rike stay with you keredomo I train go honto (Michener, Sayonara, p. 170) 
     1SG like stay with 2SG but 1SG train go really 
     ‘I would like to stay with you, but I really have to go to the train station.’ 
Since complementizers and subordinating conjunctions are not used, sentence subordination is also achieved via parataxis, as in the examples below: 
(49) a. You all time speak work-work. (Stanlaw, 2006, p. 184) 
        2SG always speak work 
        ‘You always say you’re out working.’ 
     b. Come night ..., sisters speak sayonara (Webster, 1960, p. 164) 
        come night sisters speak good bye 
        ‘When the night […] came, the sisters left’ 

5 	Lexicon 
5.1 YPJ 
The lexicon of YPJ amounts to approximately 250 words. The inventory of these lexical items as well as their origin can be found in Daniels (1948). Therefore, this section is exclusively concerned with other features of the YPJ vocabulary (see also Avram, 2014, pp. 42-43). 
A few English-derived words illustrate reanalysis of morphemic boundaries: 
(50) 	a. come here (Atkinson, 1879, p. 19) / komiya (Diósy, 1879, p. 500) / kumheer (Knollys, 1887, p. 311) ‘dog’ < E come here! 
b. 	damyuri sto (Atkinson, 1879, p. 28) / ......... .... (Diósy, 1879, p. 500) / dammuraisu hito (Griffis, 1883, p. 493) ‘sailor’ < E damn your eye(s), J hito ‘man’ 
The following are lexical hybrids11: 
(51) 	a. kireen ‘clean’ (Atkinson, 1879, p. 25), cf. E clean and J kirei 
b. 	shiroy ‘shirt’ (Atkinson, 1879, p. 24), cf. E shirt and J shiroy ‘white’ 
Given the small size of the YPJ lexicon, words undergo semantic extension and frequently express a rather wide range of meanings: 
(52) a. 	aboorah ‘butter, oil, kerosene, pomatum, grease’ (Atkinson, 1879, p. 20) 
b. 	piggy ‘to remove, take away, carry off, clear [the table], get out, remove’ (Atkinson, 1879, p. 17), ‘change’ (Atkinson, 1879, p. 19), ‘push off’ (Atkinson, 1879, p. 20), ‘go(ne) out’ (Atkinson, 1879, p. 21) 
Another consequence of the small size of the vocabulary of YPJ is the occurrence of circumlocutions. Illustrative examples are provided below: 
(53) 	a. coots pom pom otoko (Atkinson, 1879, p. 20)
shoe hammer man
‘bootmaker’
b. 	fooney high-kin serampan nigh rosoko (Atkinson, 1879, p. 19)
ship see break NEG candle
‘light house’

11 Lexical items identified across languages, given their phonetic similarity (Mühlhäusler, 1997, p. 135). 
Finally, also attested are synonyms derived etymologically from different source languages: 
(54) 	a. am buy worry (Atkinson, 1879, p. 17) < J ambai ‘condition’, warui ‘bad’, and sick-sick (Atkinson, 1879, p. 17) ‘ill’ < E sick 
b. 	die job (Atkinson, 1879, p. 19) < J daijobu ‘fine’, and your a shee ‘alright’ (Atkinson, 1879, p. 18) < J yoroshii ‘good’ 
5.2 JPE 
The lexical contribution of English is discussed in Goodman (1967), while the lexical items and phrases derived etymologically from Japanese are discussed in Avram (2014, pp. 17-18), to which the reader is referred. The following remarks focus on other characteristics of the lexicon of JPE. 
The form below is the outcome of reanalysis of morphemic boundaries: 
(55) 	morskosh ‘a while’ (Norman, 1955, p. 44) < J mō ‘more’ and sukoshi ‘a little’ 
Also attested are lexical hybrids: 
(56) 	a. meter-meter ‘to look over’ (Goodman, 1967, p. 51), cf. J mite ‘to see IMPER’ and E meter 
b. 	mor ‘more’ (Norman, 1955, p. 44), cf. J mō ‘more’ and E more 

The small size of the JPE lexicon accounts for two striking features of the JPE vocabulary. Consider first the pervasive lexical polysemy. Goodman (1967, p. 54) mentions the “quality of semantic extensibility” of JPE words, and Stanlaw (2006, p. 184) rightly observes that “most of the vocabulary items have undergone semantic extension”. Both English- and Japanese-derived words exhibit considerable polysemy, as illustrated by the examples under (57) and (58) respectively: 
(57) 	a. ketchee no fun ‘[she] had no fun’ (Webster, 1960, p. 163) 
b. 	ketchee post cardo ‘[they] received a post card’ (Webster, 1960, p. 163) 
c. 	ketchee one mouse ‘[she] caught a mouse’ (Webster, 1960, p. 163) 
d. 	ketchee beeru ‘[he] got a beer’ (Webster, 1960, p. 163) 
(58) 	shimpai-nai ‘don't worry; don't bother; let's enjoy ourselves; you're welcome; I've recovered from my malady’ (Goodman, 1967, p. 54) 

The extremely small size of the JPE vocabulary also explains the occurrence of circumlocutions. Consider one such example: 
(59) 	speak sayonara ‘to leave someone’ (Webster, 1960, p. 164) 
One last characteristic of the JPE lexicon worth mentioning in this section is the existence of a few synonyms, one of which is from English and the other from Japanese: 
(60) a. nice < E ‘nice’ and suteki < J suteki ‘nice’ 
b. okay < E OK and dai jobu < J daijobu ‘all right’ 

6 	Classification of YPJ and JPE 
In a well-known typology, Mühlhäusler (1997, pp. 5-6) identifies three types of pidgin in the so-called “pidgin-to-creole life cycle”: (i) pre-pidgins 12; (ii) stable pidgins; (iii) expanded pidgins 13. Each of these is characterized by specific phonological, morphological, syntactic, and lexical diagnostic features. Since productive morphological reduplication is a correlate of the developmental stage (Bakker, 2003, p. 44; Bakker & Parkvall, 2005, p. 514), this diagnostic feature can be added to those suggested by Mühlhäusler (1997). 
The data from YPJ and JPE discussed in sections 3 through 5 indicate that both varieties should be classified as pre-pidgins. The distribution of pre-pidgin features in YPJ and JPE is set out in Table 1 below: 
Table 1: Pre-pidgin features of YPJ and JPE 
Feature  YPJ  JPE  
inter-speaker variation in phonology  +  +  
minimal personal pronoun system  +  +  
omission of copula (predicative, equative)  ,  +  
omission of tense and aspect markers  +  +  
omission of adpositions  +  (+)  
omission of complementizers  +  +  
non-productive reduplication  +  +  
categorial multifunctionality  +  +  
extensive use of parataxis  +  +  
small size of vocabulary  +  +  
reanalysis of morphemic boundaries  +  +  
lexical hybrids  +  +  
lexical polysemy  +  +  
circumlocutions  +  +  
synonyms from different source languages  +  +  

12 Also called “jargons”, “minimal pidgins” or “restricted pidgins”. 
13 Bakker (2008, p. 131) suggests the alternative term “PidginCreole”. 
As can be seen, the overwhelming majority of the features diagnostic of the pre-pidgin stage are attested in both YPJ and JPE. 
Pidgin languages are also classified on the basis of social criteria. Sebba (1997, pp. 26-33), for instance, suggests the following typology according to the social context of the language's origins: (i) military and police pidgins; (ii) seafaring and trade pidgins; (iii) plantation pidgins; (iv) mine and construction pidgins; (v) immigrants' pidgins; (vi) tourist pidgins; (vii) urban contact vernaculars. YPJ and JPE exemplify different types. YPJ emerged in several Japanese ports – Yokohama, Kobe, and Nagasaki – and it can therefore be assigned to type (ii). JPE, which emerged in various locations in Japan, in the context of contacts between the US army personnel and the local Japanese population, represents type (i). 
Consider next the social situation in which pidgins are used. Bakker (1995, pp. 27-28) distinguishes the following types: (i) maritime pidgins; (ii) trade pidgins; (iii) interethnic contact languages; (iv) work force pidgins. Here again, YPJ and JPE differ in terms of the type to which they can be assigned: YPJ is a representative of type (ii), while JPE is illustrative of type (iii). 
To sum up, the social contexts of the emergence and use of YPJ and JPE are different. Moreover, both varieties reflect a specific power differential, which accounts for the different lexifier language – Japanese for YPJ, but English in the case of JPE. Within the context of Japan's opening of its ports to the West after the 1850s, Japanese was the superstrate language and YPJ consequently bears its imprint. JPE, which emerged in the aftermath of World War II in American-occupied Japan, testifies to the position of English as the superstrate language. 

7 	Conclusions 
YPJ and JPE are structurally very similar and can both be assigned to the same developmental stage, i.e. that of pre-pidgin. 
YPJ and JPE differ, though, in terms both of the social context in which they emerged and of that in which they were formerly used. The similarities in structure and in developmental stage are the outcome of different histories and contexts of use. 
The different lexifier of YPJ and JPE respectively reflects differences in the relative power of the main contributing languages. In a sense, then, YPJ and JPE are “two sides of the same coin”, illustrative of two episodes in the history of language contact in Japan. 

References 
Algeo, J. (1960) Korean Bamboo English. American Speech XXXV (2), 115-123. 
Anon. (1879) A new dialect; or, Yokohama Pidgin. Littell's Living Age 142 (1836), 496-500. 
Atkinson, H. (1879) Revised and Enlarged Edition of Exercises in the Yokohama Dialect. Yokohama. 
Avram, A. A. (2005) Fonologia limbii japoneze contemporane. Bucharest: Editura Universității din București. 
Avram, A. A. (2013) When East meets West: A 19th century variety of pidginized Japanese. Paper presented at the ..... ............. ......... .. ........ ....... .... ..... for M........ .. ......, 2–4 March 2013, Center for Japanese Studies, University of Bucharest. 
Avram, A. A. (2014) Yokohama Pidgin Japanese revisited. Acta Linguistica Asiatica 4 (1), 29-46. 
Avram, A. A. (2016) An extinct variety of pidginized English: Japanese Pidgin English. In M. Burada, O. Tatu & R. Sinu (eds.), 12th Conference on British and American Studies .¨................ !......... .. ... ........... .. ¨......., 6-24. Newcastle upon Tyne: Cambridge Scholars Publishing. 
Bakker, P. (1995) Pidgins. In J. Arends, P. Muysken & N. Smith (eds.), Pidgins and Creoles. An Introduction, 25-39. Amsterdam / Philadelphia: John Benjamins. 
Bakker, P. (2003) The absence of reduplication in pidgins. In S. Kouwenberg (ed.), Twice as Meaningful. Reduplication in Pidgins, Creoles and Other Contact Languages, 37-46. London: Battlebridge. 
Bakker, P. (2008) Pidgins versus Creoles and Pidgincreoles. In S. Kouwenberg & J.V. Singler (eds.), The Handbook of Pidgin and Creole Studies, 130-157. Oxford: Blackwell Publishing. 
Bakker, P. & Parkvall, M. (2005) Reduplication in Pidgins and Creoles. In B. Hurch (ed.), Studies in Reduplication, 511-532. Berlin / New York: Mouton de Gruyter. 
Chamberlain, B. H. (1904) Things Japanese, Being Notes on Various Subjects Connected with Japan for the Use of Travellers and Others. London: Murray. 
Daniels, F. J. (1948) The vocabulary of the Japanese ports lingo. Bulletin of the School of Oriental and African Studies XII (3-4), 805-823. 
Diósy, A. (1879) Japoniana curiosissima. Littell's Living Age 142 (1836), 500-501. 
Duke, C. R. (1972) The Bamboo style of English. College Composition and Communication 21 (2), 170-172. 
Ferguson, C. A. (1971/1996) Absence of copula and the notion of simplicity: A study of normal speech, Baby Talk, Foreigner Talk and pidgins. In T. Huebner (ed.), Charles A. Ferguson, Sociolinguistic Perspectives. Papers on Language in Society, 1959-1994, 115-123. New York: Oxford University Press. 
Gills, H. A. (1886) A Glossary of Reference on Subjects Connected with the Far East, second edition. Hong Kong: Lane, Crawfors & Co.; Shanghai & Yokohama: Kelly & Walsh; London: Bernard Quaritch. 
Goodman, J. S. (1967) The development of a dialect of English-Japanese Pidgin. Anthropological Linguistics 9 (6), 43-55. 
Griffis, W. E. (1883) The Mikado's Empire. New York: Harper & Brothers. 
Hume, B. (1954) Babysan's World. Tokyo: Charles E. Tuttle Company. 
Hume, B. & Annarino, J. (1953a) Babysan: A Private Look at the Japanese Occupation. Tokyo: Kasuga Boeki K. K. 
Hume, B. & Annarino, J. (1953b) When We Get Back Home. Tokyo: Kyoya. 
Inoue, A. (2003) Sociolinguistic history and linguistic features of Pidginized Japanese in Yokohama. Paper presented at the Annual Meeting of the Society for Pidgin and Creole Linguistics, January 2003, Atlanta. 
Inoue, A. (2004) Pidginized variety of Japanese in Yokohama: Can we label it a pidgin? In K. Ikeda & J. Robideau (eds.), Proceedings 2003. Selected Papers from the College-Wide Conference for Students in Languages, Linguistics, and Literature, University of Hawai‘i, Mānoa, 116-127. Mānoa: College of Languages, Linguistics and Literature, University of Hawai‘i. 
Inoue, A. (2006) Grammatical features of Yokohama Pidgin Japanese: Common characteristics of restricted pidgins. In N. McGloin & J. Mori (eds.), Japanese/Korean Linguistics 15, 55-66. Stanford: CSLI Publications. 
Knollys, H. (1887) Sketches of Life in Japan. London: Chapman and Hall. 
Lange, R. (1903) A Text-book of Colloquial Japanese. Tokyo: Kyobunkan. 
Lentzner, K. (1892) Dictionary of the Slang English of Australia and of Some Mixed Languages: With an Appendix. Halle, Leipzig: Ehrhardt Karras. 
Loveday, L. J. (1986) Explorations in Japanese Sociolinguistics. Amsterdam/Philadelphia: John Benjamins. 
Loveday, L. J. (1996) Language Contact in Japan. A Socio-linguistic History. Oxford: Clarendon Press. 
Michener, J. (1954) Sayonara. Greenwich, CT: Fawcett. 
Mühlhäusler, P. (1997) Pidgin and Creole Linguistics, expanded and revised edition. London: University of Westminster Press. 
Norman, A. M. Z. (1954) Linguistic aspects of the mores of U.S. occupation and security forces in Japan. American Speech XXIX (3), 301-302. 
Norman, A. M. Z. (1955) Bamboo English. The Japanese influence on American speech in Japan. American Speech XXX (1), 44-48. 
Quackenbush, H. C. & Ōso, M. (1991) Gairaigo no keisei to sono kyōiku. Tokyo: Kokuritsu Kokugo Kenkyūjo. 
Sebba, M. (1997) Contact Languages. Pidgins and Creoles. London: Macmillan. 
Stanlaw, J. (1987) Japanese and English: Borrowing and contact. World Englishes 6 (2), 93-109. 
Stanlaw, J. (1996) Japanese Pidgin English. In T. McArthur (ed.), The Oxford Companion to the English Language, 543. Oxford: Oxford University Press. 
Stanlaw, J. (2004) Japanese English Language and Culture Contact. Hong Kong: Hong Kong University Press. 
Stanlaw, J. (2006) Japanese and English. Borrowing and contact. In K. Bolton and B. B. Kachru (eds.), World Englishes, 179-200. New York: Routledge. 
Shibatani, M. (1990) The Languages of Japan. Cambridge: Cambridge University Press. 
Webster, G. (1960) Korean Bamboo English once more. American Speech 35 (4), 261-265. 
CHINESE LEGAL TEXTS – QUANTITATIVE DESCRIPTION* 
Ľuboš GAJDOŠ 
Comenius University in Bratislava, Slovakia 
lubos.gajdos@uniba.sk 
Abstract 
The aim of the paper is to provide a quantitative description of legal Chinese. The study adopts the approach of corpus-based analysis and is conducted on the Chinese monolingual corpus Hanku. It presents basic statistical parameters of legal texts in Chinese, namely sentence length, the proportions of parts of speech, etc., and discusses issues in processing statistical data from various corpora, such as tokenisation and part-of-speech tagging, and their relevance to the study of register variation. 
Keywords: Chinese language; written Chinese; legal texts; corpus linguistics 
Summary 
The aim of the article is to offer a quantitative linguistic description of Chinese legal texts. The study applies the methodology of corpus analysis to the Chinese monolingual corpus Hanku. It presents basic statistical parameters, such as sentence length and the proportions of parts of speech in the texts, and discusses issues in processing statistical data from different corpora, such as tokenisation and part-of-speech tagging, and their relevance to research on register variation. 
Keywords: Chinese language; written Chinese; legal texts; corpus linguistics 
* The study builds on my previous work: Gajdoš, Ľ. (2016). Quantitative description of written Chinese – a preliminary corpus-based study. Studia orientalia: Victori Krupa dedicata. Bratislava: Ústav orientalistiky, pp. 62-75. 
Acta Linguistica Asiatica, 7(1), 2017. 
ISSN: 2232-3317, http://revije.ff.uni-lj.si/ala/ 

DOI: 10.4312/ala.7.1.77-87 



1 Chinese language 
Chinese is one of the most widely used languages in the world (Sun, 2006). Genealogically, Chinese belongs to the Sino-Tibetan language family and is often classified as an “isolating” or “analytic” language (Li & Thompson, 2009, pp. 703-723). Chinese has a written tradition of more than 3000 years. The majority of Chinese words are mono- or disyllabic, and the word order is relatively rigid, with SVO as the prototypical order. 
In a language with a written tradition, the discrepancy between spoken and written registers1 is a natural phenomenon. Chinese is no exception; nevertheless, some issues need to be considered when studying register2 variation, e.g. a relatively vague standard of the written language, the influence of wenyan3 on the written language, and the lack of qualitative and quantitative comparative studies of language registers. 
The study is based on the dichotomic model of Chinese language registers – colloquial and written. Hereinafter, the term “Chinese” is reserved for the written standard of Chinese, which is known as putonghua (普通话) ‘common language’. 

2 Chinese legal language 
The term legal language/text (hereinafter legal text) may be understood in many ways, depending on the research perspective, the research methodology, the criteria of classification, etc. Given the content of the sub-corpus zh-law, the term legal text here covers only monologic, prescriptive texts of legislation. Generally speaking, legislation is considered the prototype of legal language (Biel, 2014, pp. 19-22). 
Legal Chinese is a sub-register of the written language with its own lexis, syntactic properties, terminology, etc. 
1 In this article, I use the term register as “situationally-defined varieties described for their characteristic lexico-grammatical features” (Biber, 2006, pp. 11-12). 
2 Both terms are defined very vaguely in Chinese linguistics. The spoken register is called kouyu (口语) ‘spoken language’ and the written register shumianyu (书面语) ‘written or literary language’. 
3 Wenyan is known as “classical literary language” and “it looks to the style of writings prevalent in the period from the Spring and Autumn period to the Eastern Han dynasty for its grammatical and lexical norms” (Chen, 2004, p. 67). 

3 Methodology 
In this study, the Hanku corpus and corpus methodology are used as a systematic approach to the study of the register of legal Chinese. It is proposed that variance across registers might be revealed, to some extent, by statistical data from a corpus. 
To test this hypothesis on the register of legal Chinese, the following criteria are taken as part of the description: 
(1) the length of a sentence in a register 
(2) the frequency of every part of speech (hereinafter POS) in a register 
(3) the relative representation of some POS and their comparison 
(4) markers of passive voice 

The statistical data presented in this study are given in two values – absolute frequency and frequency in IPM.4 
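The normalisation behind the IPM values can be illustrated with a minimal Python sketch. The function names and the figures in the usage lines are hypothetical and serve only as an illustration; they are not taken from the Hanku interface.

```python
def ipm(count: int, corpus_size: int) -> float:
    """Return instances per million (IPM) for an absolute frequency."""
    if corpus_size <= 0:
        raise ValueError("corpus_size must be a positive number of tokens")
    return count * 1_000_000 / corpus_size


def implied_corpus_size(count: int, ipm_value: float) -> float:
    """Return the corpus size (in tokens) implied by a count and its IPM."""
    if ipm_value <= 0:
        raise ValueError("ipm_value must be positive")
    return count * 1_000_000 / ipm_value


# Hypothetical figures: 250 hits in a corpus of 500,000 tokens.
example_ipm = ipm(250, 500_000)
example_size = implied_corpus_size(250, example_ipm)
```

Because IPM is a relative measure, it allows frequencies from sub-corpora of different sizes to be set side by side, which absolute frequencies alone do not.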

4 The Chinese corpus Hanku 
The Hanku corpus is available at http://konfuciovinstitut.sk/corpus-hanku/; NoSketch Engine5 is used as the corpus manager. The basic unit of the corpus is a token, which basically corresponds to one word. A token is annotated for its part of speech (POS labelling), its composition into characters and its Hanyu pinyin transcription.6 The corpus is divided into two subcorpora (May 2017): 
• web-zh – texts from the PRC 
• zh-law – legal texts from the PRC; texts of laws and regulations 
The statistical data used in this study were obtained by writing CQL queries in the NoSketch Engine user interface. 
4 IPM: Instances Per Million, the number of occurrences normalized by the size of the corpus. 
5 NoSketch Engine is an open-source version of the Sketch Engine. See more at: https://www.sketchengine.co.uk. For other Chinese corpora available in Sketch Engine, see Petrovčič (2016). 
6 The POS annotation, tokenization and Hanyu pinyin transcription are results of automatic processing. 
Table 1: Subcorpus Zh-law – parameters 
Parameters  Status  Notes  
Type  synchronous  legal texts from the PRC  
Language of interface  English, Chinese  
Size (June 2017)  7.2 million  size referred in tokens  
Tokenisation  •  
POS annotation  •  Penn Chinese Treebank7  
Bibliographic annotation  •  
Phonetic annotation  •  
Statistic tools  •  frequency in IPM, average reduced frequency  
Save results directly from the interface  •  in text or XML format  
KWIC  •  
Collocations search  •  many collocation measures  
Advanced search options  •  Boolean operators – conjunction, disjunction, negation; possibility to use regular expressions at the character, word, pinyin, and metadata level; full CQL8 etc.  
Sorting by  •  Left, right, node, references etc.  


5 The length of a sentence in the sub-corpus zh-law 
It is generally believed that there is a positive correlation between the length of a sentence and the register affiliation – the more formal a text is, the longer the sentences are, and vice versa. 
Our previous research on written Chinese has confirmed this tendency, with more accurate statistical data showing that sentences tend to be longer than 29 tokens.9 It should be noted, though, that the number of tokens also includes punctuation (the POS tag “PU”), e.g. “,” or “()”, so that the length in words is shorter.10 
7 See more at: http://www.cs.brandeis.edu/~clp/ctb/posguide.3rd.ch.pdf. 
8 CQL – Corpus Query Language. 
9 Our previous research indicated slightly different figures, i.e. 20 words (tokens without punctuation). 
10 The length of a sentence was simply calculated by dividing the number of tokens by the number of sentences in the sub-corpus. 

The results indicate that a sentence in legal Chinese tends to have a length of approximately 29 tokens or 25 words (tokens without punctuation). It is also evident that the more information-saturated a text is (a text in nominal style), the longer its sentences are. When analysing the corpus data here, one must also bear in mind that the name of a law or a regulation is tokenized as a sentence as well. Since such short titles lower the average, the actual length of a sentence may be even greater.  
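The calculation described in footnote 10 can be sketched in Python as follows. The toy sentences, the (form, tag) representation and the function name are assumptions made for illustration; only the tag PU for punctuation comes from the corpus tagset.

```python
from typing import FrozenSet, Iterable, List, Tuple

Token = Tuple[str, str]  # (word form, POS tag)


def mean_sentence_length(
    sentences: Iterable[List[Token]],
    skip_tags: FrozenSet[str] = frozenset(),
) -> float:
    """Divide the number of counted tokens by the number of sentences."""
    sentences = list(sentences)
    if not sentences:
        raise ValueError("at least one sentence is required")
    # Tokens whose tag is listed in skip_tags (e.g. PU) are left out.
    counted = sum(
        1 for sentence in sentences for _, tag in sentence if tag not in skip_tags
    )
    return counted / len(sentences)


# Hypothetical toy data: two "sentences" with placeholder word forms.
sample = [
    [("A", "NN"), ("shall", "VV"), ("B", "NN"), (".", "PU")],
    [("C", "NN"), (",", "PU"), ("D", "NN"), ("may", "VV"), ("E", "VV"), (".", "PU")],
]
length_in_tokens = mean_sentence_length(sample)
length_in_words = mean_sentence_length(sample, frozenset({"PU"}))
```

Passing the punctuation tag in `skip_tags` yields the length in words rather than in tokens, which is the distinction drawn above between 29 tokens and 25 words.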

6 Parts of speech in the sub-corpus zh-law 
The proportion of individual POS was retrieved directly from the corpus with a query written in CQL;11 the result was then sorted by “node tags” and converted to IPM. 
Table 2: A proportion of POS in Zh-law 
Part of Speech  Examples  Tag  Frequency in the corpus  IPM  
Nouns  NN  2 804 164  389 236  
Punctuations  PU  1 091 923  151 566  
Verbs  VV  1 070 286  148 562  
Prepositions  P  242 165  33 614  
Non-predicative adjectives ..,.  JJ  232 514  32 274  
Adverbs  AD  222 902  30 940  
Coordinating Conjunctions .,.,..  CC  219 759  30 504  
Particle DE .  DEC  215 381  29 896  
Measure words  M  198 809  27 596  
Cardinal numbers  CD  162 183  22 512  
Ordinal numbers  OD  139 560  19 371  
Particle DE as genitive marker .  DEG  129 286  17 945  
Localizer  LC  109 661  15 221  
Determiners .,.  DT  99 523  13 814  
Proper nouns  NR  48 931  6 792  
Adjectives  VA  38 810  5 387  
Temporal nouns  NT  35 166  4 881  

11 CQL – Corpus Query Language; [tag=".*"]. 
Pronouns  PN  33 930  4 709  
Verbs  .,..,.  VE  28 195  3 913  
Etcetera  ..  ETC  21 082  2 926  
Particles  .,.,.  MSP  18 021  2 501  
Copulas  VC  15 708  2 180  
Preposition BA  BA  6 586  914  
Preposition BEI  SB  6 378  885  
Preposition BEI  LB  3 338  463  
Aspect particles .,.,.  AS  2 898  402  
Particle DE .  DEV  2 489  345  
Subordinating conjunctions ..,..  CS  1 917  266  
Modal particles .,.,.  SP  1 591  220  
Foreign words  FW  948  131  
Particle DE  .  DER  157  22  

Data in Table 2 reveal that interjections and onomatopoeias are not present in legal texts at all. When these data are compared with other registers, some factors should be taken into consideration: because the form of legal texts differs from that of other registers, the frequency of punctuation, cardinal and ordinal numbers, and measure words is much higher (see Chapter 7). 
To allow a more concise comparison, some POS are combined as shown in Table 3. 
Table 3: POS combined together 
POS  Frequency  IPM  
Nouns (NN+NR+LC+NT)  2 997 922  416 132  
Verbs (VV+VC+VE)  1 114 189  154 657  
Particles DE (DEC+DEG+DEV+DER)  347 313  48 209  
Numbers (CD+OD)  301 743  41 884  
Prepositions (P+BA+BEI)  258 467  35 877  
Non-predicative adjectives (JJ)  232 514  32 274  
Adverbs (AD)  222 902  30 940  

Conjunctions (CC+CS)  221 676  30 770  
Measure words (M)  198 809  27 596  
Pronouns (PN+DT)  133 453  18 524  
Particles (ETC+AS+MSP+SP)  43 592  6 051  
Adjectives (VA)  38 810  5 387  
Passive markers (SB+LB)  9 716  1 349  
Punctuation (PU)  1 091 923  151 566  
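The sums in Table 3 can be re-derived from the frequencies in Table 2 with a short Python sketch. The dictionary and function names are illustrative, and reading the label BEI as the two tags SB and LB is an assumption, although the arithmetic is consistent with it.

```python
# Absolute frequencies copied from Table 2 (tag -> frequency in zh-law).
TABLE2 = {
    "NN": 2_804_164, "PU": 1_091_923, "VV": 1_070_286, "P": 242_165,
    "JJ": 232_514, "AD": 222_902, "CC": 219_759, "DEC": 215_381,
    "M": 198_809, "CD": 162_183, "OD": 139_560, "DEG": 129_286,
    "LC": 109_661, "DT": 99_523, "NR": 48_931, "VA": 38_810,
    "NT": 35_166, "PN": 33_930, "VE": 28_195, "ETC": 21_082,
    "MSP": 18_021, "VC": 15_708, "BA": 6_586, "SB": 6_378,
    "LB": 3_338, "AS": 2_898, "DEV": 2_489, "CS": 1_917,
    "SP": 1_591, "FW": 948, "DER": 157,
}

# Groupings as labelled in Table 3; BEI is read here as SB + LB (assumption).
GROUPS = {
    "Nouns": ("NN", "NR", "LC", "NT"),
    "Verbs": ("VV", "VC", "VE"),
    "Particles DE": ("DEC", "DEG", "DEV", "DER"),
    "Numbers": ("CD", "OD"),
    "Prepositions": ("P", "BA", "SB", "LB"),
    "Conjunctions": ("CC", "CS"),
    "Pronouns": ("PN", "DT"),
    "Particles": ("ETC", "AS", "MSP", "SP"),
    "Passive markers": ("SB", "LB"),
}


def combine(counts, groups):
    """Sum the frequencies of all tags that belong to each group."""
    return {name: sum(counts[tag] for tag in tags) for name, tags in groups.items()}


def share(count, counts, exclude=()):
    """Proportion of `count` among all tokens whose tag is not excluded."""
    total = sum(n for tag, n in counts.items() if tag not in exclude)
    return count / total


combined = combine(TABLE2, GROUPS)
verb_share_all = share(combined["Verbs"], TABLE2)
verb_share_no_punct = share(combined["Verbs"], TABLE2, exclude=("PU",))
```

With these figures the verb group accounts for roughly 15% of all tokens and roughly 18% once punctuation is left out, so a percentage depends on whether punctuation is counted in the total.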

The chart shows that the nominal style is preferred in legal Chinese with a dominance of nouns. 

[Figure 1: chart of the combined POS categories: nouns (NN+NR+LC+NT), verbs (VV+VC+VE), particles DE (DEC+DEG+DEV+DER), numbers (CD+OD), prepositions (P+BA+BEI), non-predicative adjectives (JJ), adverbs (AD), conjunctions (CC+CS), measure words (M), pronouns (PN+DT), particles (ETC+AS+MSP+SP)] 
Figure 1: Proportion of parts of speech in the sub-corpus zh-law 
Mainly because of different approaches to tokenization and POS labelling, the statistical data exhibit a considerable divergence between the proportion of verbs in the sub-corpus zh-law (18%) and in the Sihanku corpus (26%). 

7 Most frequent content and function words in the sub-corpus zh-law 
Following the traditional model12 of dividing the lexis into content and function words (shici 实词 and xuci 虚词, respectively), I have searched for the most frequent content words with the CQL query [tag="V.*|N.*|LC|CD|OD|JJ|AD|DT|PN|M"]. The result was then sorted by “node forms” and converted to IPM. As for the function words, the CQL expression is [tag="P|CC|DE.|ETC|MSP|BA|SB|LB|AS|CS|SP"]. 
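The two tag patterns above can be tried out offline with Python's re module. This is a sketch under the assumption that a CQL attribute pattern has to match the whole tag, which re.fullmatch imitates; the helper names and the toy token list are hypothetical.

```python
import re
from collections import Counter

# The two tag patterns from the CQL queries, compiled as regular expressions.
CONTENT_TAGS = re.compile(r"V.*|N.*|LC|CD|OD|JJ|AD|DT|PN|M")
FUNCTION_TAGS = re.compile(r"P|CC|DE.|ETC|MSP|BA|SB|LB|AS|CS|SP")


def word_class(tag: str) -> str:
    """Classify a POS tag as 'content', 'function' or 'other'."""
    # fullmatch requires the pattern to cover the entire tag, so that
    # "P" does not also capture "PN" or "PU".
    if CONTENT_TAGS.fullmatch(tag):
        return "content"
    if FUNCTION_TAGS.fullmatch(tag):
        return "function"
    return "other"


def most_frequent(tokens, wanted: str, n: int):
    """Return the n most frequent word forms of the wanted class."""
    forms = Counter(form for form, tag in tokens if word_class(tag) == wanted)
    return forms.most_common(n)


# Hypothetical toy data with romanised placeholder forms.
toy = [
    ("tiao", "M"), ("de", "DEC"), ("tiao", "M"), ("he", "CC"),
    ("de", "DEG"), ("de", "DEC"), (",", "PU"),
]
top_content = most_frequent(toy, "content", 1)
top_function = most_frequent(toy, "function", 2)
```

Tags matched by neither pattern, such as PU or FW, fall into a residual class and are counted in neither list.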
Table 4: 30 most frequent content and function words in legal Chinese 
Content word  Frequency  IPM  Function word  Frequency  IPM 
.  102845  14276 .  339351  47104 
..  57301  7954 .  86413  11995 
..  50465  7005 ..  47375  6576 
..  48272  6700 .  40438  5613 
.  45266  6283 .  38084  5286 
..  41477  5757 .  26783  3718 
..  34843  4836 .  24460  3395 
..  33949  4712 .  21111  2930 
..  31789  4413 .  21060  2923 
.  31540  4378 .  18820  2612 
..  29894  4149 .  17647  2450 
.  28734  3988 .  17483  2427 
..  28029  3891 ..  15267  2119 
..  27402  3804 .  14866  2064 
..  26708  3707 ..  10988  1525 
..  24086  3343 .  10928  1517 
..  23779  3301 .  10130  1406 
.  23085  3204 .  9338  1296 
.  22666  3146 ..  7595  1054 
..  22012  3055 .  7355  1021 
..  21269  2952 .  7216  1002 
..  21018  2917 .  7176  996  

12 E.g. Liu, Y. et al. (2004). Practical Chinese Grammar. 
.  20085  2788 ..  6732  934 
..  19254  2673 .  6691  929 
.  18847  2616 .  6160  855 
..  18130  2517 .  4353  604 
..  17920  2487 .  4042  561 
..  17711  2458 ..  3418  474 
.  17333  2406 ..  2958  411 
.  17096  2373 .  2627  365  

Given the content of legal texts, it is no surprise that the most frequent content word is the measure word tiao 条, as it stands for an article of a law (§). The second most frequent word is the modal verb yingdang 应当, which is a means of expressing deontic modality. 
Our previous research has also shown that, among function words, conjunctions are highly frequent compared to unstructured texts (e.g. language data from the sub-corpus web-zh), which might be an indication of formal, written texts. There is also a high number of prepositions that are considered formal expressions, e.g. yu or yi. 

8 Passive voice in legal Chinese 
In this chapter, the results of So.o.à.’s study (2015), who conducted his research on a relatively small sub-corpus13 of legal texts with a different tagset and tokenizer, are compared. 
Passive voice in Chinese may be marked with prepositions, e.g. bei 被, or unmarked. Since So.o.à. has only searched for the passive voice marked by prepositions, I adopt this approach here as well. In the Hanku corpus, searching for the passive voice is quite straightforward with the following CQL query: [tag="LB|SB"]. 
The results in Table 5 show that the frequency of passive markers varies across the corpora. Comparison of the passive marker bei 被 in the sub-corpora zh-law and web-zh also indicates a slightly opposite tendency to that found in So.o.à.’s research, in which the frequency of the passive marker in legal Chinese was significantly higher than the frequency in other sub-corpora of written language. It is worth noting that the statistical data are not sufficiently large to enable a reliable comparison. 
13 The sub-corpus had a size of 480,000 tokens. 
Table 5: Passive markers in the sub-corpora zh-law, web-zh and the corpus Sihanku 
Marker  IPM zh-law  IPM web-zh  IPM Sihanku 
.  1296  1458  2206 
.  47  20  1312 
.  5  20  ?  

The statistical data also do not show that passive markers are an indicator of legal texts. The figures for passive markers do not indicate any significant difference between the two sub-corpora of Hanku. 

9 Implications for language pedagogy 
The quantitative study and its results may be used in language pedagogy too: when learning Chinese legal texts, one might choose to study not only verbs (according to their occurrence in a corpus) but also the collocational preferences of verbs with a subject/object or prepositional phrases, which is known as “word sketch” in corpus linguistics or as “valency” in general linguistics. 
It is only a matter of practice for students of Chinese to write a CQL query or a regular expression and search for concrete POS or words, or even for sentence patterns. I assume that this may help to improve language teaching methods and materials. 

10 Conclusion 
In this article, I have presented the results of a corpus-based approach to the study of register variation in Chinese. The research was conducted on a relatively small corpus, yet the language data in it may be described as complete and closed. The statistical data on legal Chinese reveal evident differences, e.g. in lexis and syntax. Among the above indicators, the rarity of modal particles and the absence of onomatopoeias and interjections may be clear evidence of a formal, written register. I have also presented statistical data in support of the hypothesis that the proportion of nouns in the formal written register (here zh-law) prevails over other POS. 
Some caution should be applied when comparing the statistical data in this study with other corpus data. Let us here highlight just some issues: 
• 	The size of a corpus matters, as in the case of passive markers. 
• 	Different approaches to tokenization may result in different statistical data. 
• 	It is not always a simple task to compare results from two corpora with different tagsets. 
To conclude, a quantitative description is only as accurate as the automatic tokenization and POS labelling on which it is based. Over the past few years, we have witnessed steady improvement in automatic tokenisation and POS tagging; nevertheless, problems still remain with regard to the quantitative comparison of results from different corpora. 

References 
Biber, D. (2006). University Language. Amsterdam/Philadelphia: John Benjamins Publishing Company. 
Biel, L. (2014). Lost in the Eurofog. Frankfurt am Main: Peter Lang AG, pp. 19-22. 
Chen, P. (2004). Modern Chinese: History and Sociolinguistics. Cambridge: Cambridge University Press. 
Gajdoš, Ľ. (2011). Discrepancy Between Spoken and Written Chinese – Methodical Notes on Linguistics. Studia Orientalia Slovaca, 10(1), 155-159. 
Gajdoš, Ľ. (2013). Slovensko-čínsky paralelný korpus [The Slovak-Chinese parallel corpus]. Studia Orientalia Slovaca, 12(2), 313-317. 
Gajdoš, Ľ. (2016). Quantitative description of written Chinese – a preliminary corpus-based study. Studia orientalia: Victori Krupa dedicata. Bratislava: Ústav orientalistiky, 62-75. 
Gajdoš, Ľ., Garabík, R., Benická, J. (2016). The New Chinese Webcorpus Hanku – Origin, Parameters, Usage. Studia Orientalia Slovaca, 15(1), 53-65. 
Li, N. Ch., Thompson, S. (2009). Chinese. The World's Major Languages (Second Edition), ed. by Comrie, B. Oxon: Routledge. 
Liu, Y. [... ] et al. (2004). Practical Chinese Grammar [...... ]. Beijing: Shangwu yinshuguan. 
NoSketch Engine [online] [10 May 2017]. Retrieved from Sketch Engine: https://www.sketchengine.co.uk. 
Petrovčič, M. (2016). Word Sketches of Separable Words Liheci in Chinese. Acta Linguistica Asiatica, 6(1), 47-57. 
So.o.à., I. (2015). Quantitative Analysis of Function Words in Chinese Legal Texts. Studia Orientalia Slovaca, 14(2), 165-182. 
Sun, Ch. (2006). Chinese: A Linguistic Introduction. Cambridge: Cambridge University Press. 
PERCEPTUAL ERRORS IN CHINESE LANGUAGE PROCESSING: A CASE STUDY OF CZECH LEARNERS 
Tereza SLAMĚNÍKOVÁ
Palacký University Olomouc, Czech Republic tereza.slamenikova@upol.cz
Abstract 
There are now a vast number of cross-linguistic studies that investigate how perceptual performance with non-native speech categories is constrained by the listener's native language. However, considering the acquisition of the Chinese language phonological system, studies examining the transfer from less frequent languages are rather rare. The aim of this paper is to fill the gap regarding Czech native speakers. Through examining errors in dictation tests, it introduces some difficulties experienced by beginner level Chinese students and thus provides insight on the perception of Chinese language segmental and suprasegmental features by Czech learners. The findings imply that while errors in initials and finals show a high influence of the native language, errors in disyllabic tonal combinations seem to follow the basic language-independent patterns that have been observed in previous studies. 
Keywords: second-language acquisition; non-native speech perception; L1 interference; Chinese phonological system; Czech learners 
Summary
There is a large body of research on how a listener's native language affects the perception of speech features of a foreign language; however, studies on the acquisition of the Chinese phonological system by speakers of less widely spoken languages are still very few. The aim of this study is to help fill this gap with results from Czech speakers. By examining their errors in dictation, the study presents certain difficulties faced by beginner-level Czech students of Chinese and offers insight into the characteristics of their perception at the segmental and prosodic levels. The findings show that while errors in the initial and the final of the syllable result from the negative influence of the native language, errors in disyllabic tonal combinations follow established patterns and are independent of the listener's native language.
Keywords: second/foreign language acquisition; perception of second/foreign language speech; native-language influence; Chinese sound system; Czech students
Acta Linguistica Asiatica, 7(1), 2017. 
ISSN: 2232-3317, http://revije.ff.uni-lj.si/ala/ 

DOI: 10.4312/ala.6.2.89-101 



1 Introduction
At the beginning of this research, the aim was to optimize the content of a course introducing the Chinese phonological system to first-year students in the Chinese Philology B.A. program at Palacký University in Olomouc. Although the characteristics of the Chinese phonological system have been thoroughly described, taking into account its differences from Czech phonological properties (Švarný, 1998; Švarný & Uher, 2001; Třísková, 2014), Chinese teaching materials were used in the course due to the absence of any set of practical exercises. Although this material provides a comprehensive training program, teaching experience has shown that it does not fully meet the requirements of this specific native-speaker group. In particular, it does not seem to sufficiently emphasize the similarities and differences between the Chinese and Czech phonological features.
It is generally recognized that a speaker's native language (L1) experience significantly impacts many aspects of second language (L2) acquisition. The influence of the native phonological system on the production and perception of L2 speech sounds is no exception. L2 learners tend to transfer phonological rules from their native language, as well as to apply strategies used in L1 acquisition. In Trubetzkoy's (1969) famous metaphor, the native language phonological system "represents a sieve through which everything that is said is processed" (p. 51). When learners hear another language, their linguistic experience sets the parameters that constrain their perception of the L2 phonological system.
As for Chinese language¹ acquisition, the rapidly growing interest in studying Chinese is bringing increasing attention to its pedagogical theory and practice. Nevertheless, case studies reporting difficulties caused by L1 linguistic experience are still limited to just a few languages. Even though teaching Chinese in the Czech Republic already has an 80-year tradition, there is no systematic study investigating Czech-specific influences on Chinese language processing.
The linguistically homogeneous environment of the Czech Republic² provides a space for adapting learning paths to the needs of a specific group of learners. In order to adjust the course called Introduction to Chinese Phonetics to the needs of Czech learners, several investigations were carried out, yielding complex insights. Among other things, an analysis of dictation tests has been performed. These dictation tests have been used as a tool for testing students' perception skills at the end of the course for the last 15 years. As such, they provide a valuable data collection source whose analysis can reveal useful information on Chinese language processing. By examining the errors occurring in dictation tests, this paper attempts to fill a gap in our understanding of Czech students' Chinese language perception. After a brief introduction of the course's organization and the structure of the dictation test, errors at the segmental and suprasegmental level will be investigated.
¹ The term Chinese used in this paper refers to Modern Standard Chinese, with pronunciation based on the Beijing dialect.
² Strictly speaking, the needs of two specific groups of learners are taken into account, since Slovak students are also enrolled in the Chinese Philology program at Palacký University. Nevertheless, the vast majority of students are of Czech origin. For this study, the target group is only Czech learners.

2 Organization of the course
Introduction to Chinese Phonetics is a mandatory course taken by all Chinese Philology students in the first semester of their study.³ Its aim is to introduce students to the basic speech features of Chinese, laying the groundwork for successful language learning. Using the official romanization system Pinyin, students' ability to produce and perceive speech sounds is systematically developed. The course comprises 21 lessons of 45 minutes each, scheduled three times a week. Different types of exercises are used to improve students' listening ability and production accuracy. After seven weeks of intensive training, students take a dictation test covering both segmental and suprasegmental phonological levels, i.e. the range of Chinese initials and finals, as well as the four lexical tones in their 15 possible disyllabic tonal combinations.
The Chinese handbook Hanyu Putonghua Yuyin Bianzheng «汉语普通话语音辨正» "Correction of Standard Chinese Pronunciation" is used as the main teaching material; however, it has been adapted for the purposes of the course. Using a discrete-item approach, the first third of the course focuses on initials, the second on finals and the last on tonal combinations. Initials and finals with similar phonological features (place or manner of articulation) are compared and drilled. The training uses disyllabic compounds, the basic units of the modern Chinese lexicon. When a specific feature is practiced (e.g. finals -an and -ang), either two syllables within the same compound (e.g. .......) or two compounds (e.g. ...... vs. .......) are contrasted.
It must be mentioned that the Introduction to Chinese Phonetics course is just one part of a comprehensive curriculum in the Chinese Philology program; listening and oral comprehension skills are developed in different courses. The goal of this course is to help learners achieve familiarity with common phonemes of the Chinese language as soon as possible. Disyllabic compounds are used as a tool to achieve this goal, regardless of their lexical meaning, in order to restrict learners' concentration to acoustic language features. Considering the learners' non-tonal background, basic suprasegmental units are introduced as well. Sentence-level prosodic patterns are practiced in the follow-up courses during the second, third and fourth semesters.
³ Warm thanks are due to Professor David Uher, PhD, who, as course supervisor and author of the dictation test, kindly supported the idea of analyzing errors in order to prepare more effective study materials.

3 Dictation tests
The dictation test contains 100 disyllabic compounds altogether. They are chosen from the exercises used during the course, and no additional combinations are included. Each compound is dictated twice by the teacher, and the entire test takes approximately 20 minutes. A special form is used, which is divided into four parts as shown below. It is composed of three partial dictation tasks and one full dictation task. In the first part (numbers 1 to 25), finals are preprinted and students are supposed to fill in the initials according to the dictation. In the second part (numbers 26 to 50) it is the other way around: initials are preprinted and finals (without tones) are to be written down. The third part (numbers 51 to 75) focuses on tonal combinations: students have to mark tones above syllables. These three parts are intended as warm-up exercises that should help students gather their concentration for the most difficult task: in the last part of the test (numbers 76 to 100) they are expected to note down the entire disyllabic compound including tone marks. Since the second and last parts require more time for writing, they are dictated at a slower speed. Below is an example of a filled-out test form.

Figure 1: The filled-in dictation test form 

4 Procedure
This study does not attempt to meet the requirements of experimental research. The tests were designed to assess students at the end of the course, not to provide a venue for research. Based on theoretical knowledge about listening comprehension and on teaching experience, they were designed to measure listening skills gained during the course. Even though the operationalization process applied in compiling an assessment test (theoretical notions about the nature of the listening construct are turned into actual practice, i.e. into a set of test items) is similar to that at the beginning of an experimental research procedure, the parameters used in compiling the test were determined by its primary purpose, which was not to observe but to assess students' performance. Nevertheless, the tests represent a valuable data source, especially considering the amount of collected material and the long time span over which it was collected. As such, they provide an otherwise not easily obtainable source for exploring perceptual errors by Czech learners of Chinese. The specific circumstances of the data collection, however, have to be taken into account when interpreting the findings.
The dictations have been held since 2002 and up to now over 1500 tests have been collected. This paper investigates errors occurring in the tests from 2002 to 2008. There were altogether 715 tests collected during this period. The first examination of the tests has, however, shown that those with a high error rate need to be excluded from the analysis. The majority of errors clearly indicated a lack of familiarity with the notation system, i.e. letters used to write down initials and finals were highly influenced by the symbols used in Czech orthography. This fact made it difficult to identify perceptual difficulties. Due to this, tests with a maximum of 10 errors were selected in order to analyze speech perception at the segmental level (27% of the analyzed sample). Considering the fact that no interference with the Czech graphic system can be expected while placing the tonal marks, the analyzed sample was extended to 20 errors at the suprasegmental level (52% of the analyzed sample). 
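The selection step described above can be sketched as a simple threshold filter. The sketch assumes that each test is represented by its total number of errors; the counts in `error_counts` and the helper `select_tests` are invented for illustration, and only the two thresholds (10 and 20) come from the text.

```python
def select_tests(error_counts, max_errors):
    """Keep only the tests whose total error count does not exceed max_errors."""
    return [n for n in error_counts if n <= max_errors]


# Hypothetical error counts for eight tests (not the study's data).
error_counts = [3, 12, 8, 25, 10, 17, 41, 20]

# Threshold used for the segmental analysis (initials and finals).
segmental_sample = select_tests(error_counts, max_errors=10)

# Wider threshold used for the suprasegmental analysis (tonal combinations).
suprasegmental_sample = select_tests(error_counts, max_errors=20)
```

Because the second threshold is wider, the suprasegmental sample always contains the segmental sample as a subset.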
During this time, many different versions of the test were used. Although the general layout has not changed, the set of disyllabic compounds used varies from one version to another. Because the tests were constructed with the intention of providing a general testing tool, the features of the Chinese phonological system are not equally distributed. As such, they were not suitable for analyzing sensitivity to specific speech contrasts in the form of scores for correct identifications. Considering this fact, the tests were analyzed in terms of errors in initials, finals and tonal combinations. The incorrect notations for each of these three units were collected together in order to identify the underlying patterns.

5 Results 
Errors were evaluated from two perspectives. Firstly, disyllabic compounds with incorrectly noted initials and finals were examined. Secondly, confusions in tonal combinations were analyzed. Analysis of the tests with a maximum of 10 errors has shown that, proportionally speaking, Czech students experience more difficulties at the segmental level than with the perception of disyllabic tonal combinations. Nonetheless, it has to be noted that the vast majority of errors at the segmental level are errors in finals. In fact, errors in finals and errors in tonal combinations each account for about two fifths of incorrectly noted compounds. Contrary to the expectation that a wider spectrum of errors might occur in the fourth (full dictation) part of the test, the analysis did not reveal significant differences. Moreover, despite the long period of data collection, the results show only negligible variation in error typology. Thus, the overall findings are presented in the following two subsections.
5.1 Errors at the segmental level 
Analysis has shown that errors in initials occur rarely. The results do not exactly support the presumption that, because of the different distinctive phonological features characteristic of Czech (correlation of voicing) and Chinese (correlation of aspiration), Czech students might have difficulties distinguishing aspirated and non-aspirated consonants (Třísková, 2014). Only one of the six pairs, velars g- and k-,⁴ appeared multiple times among the incorrectly noted initials. Familiarity with the non-aspirated voiceless consonants evidently provides a solid basis for recognizing the aspirated consonants.
However, distinguishing aspirated consonants from one another, specifically the dental c- and post-alveolar ch- versus palatal q-, causes Czech students the greatest difficulties. Apparently students tend to overlook the fact that c- and ch- combine with a completely different set of finals than q-. As can be seen below, the incorrectly noted versions are either pronounced completely differently or do not even exist. More than anything else, these errors indicate a lack of familiarity with the Chinese phonological system. However, one cannot overlook a possible reason for these confusions. The affricates c- and ch- are pronounced differently before apical vowels than in combination with other vowels: the type of aspiration is the same as in the case of the affricate q- (Třísková, 2012, p. 143), which might be a similarity causing confusion.
Besides, the analysis has shown that students tend to overlook the different distributional environments of the fricatives x- and sh- as well. It must be the same place of articulation that leads students to overlook not just their complementary distribution, but also their phonological differences. Nevertheless, in contrast to another pair of errors, these are easily fixed. The palatals q- and x- share the same set of finals, and since neither of them is present in the Czech phonological inventory, they are quite difficult to master.
⁴ Considering the topic of this paper, Pinyin letters are used to note the described initials and finals.
Table 1: Most frequent errors in initials
Initials  Examples
k- vs. g-  .....g.... instead of .....k....
c-/ch- vs. q-  c...... instead of q......; q..... instead of ch.....
sh- vs. x-  .¸.xa.. instead of .¸.sh...
q- vs. x-  ....q... instead of ....x...

Four fifths of errors at the segmental level consisted of incorrectly noted finals. Three features causing difficulties can be identified. Two of them are related to single finals. Firstly, about 10% of errors show difficulties with the distinction between the back unrounded -e and the two apical vowels -i after the post-alveolars zh-, ch-, sh-, r- and the dentals z-, c-, s-. Secondly, about 16% of errors involve front high vowels, specifically the discrimination between unrounded -i and rounded -ü. Students more often miss the rounding (noting -i instead of -ü) than the other way around. Both of these types of errors are explainable in terms of differences between the Czech and Chinese vowel systems.
However, the most difficult finals to perceive are those with nasal endings, which appear in more than two thirds of the incorrectly noted finals. Even though both alveolar and velar nasals can be found among Czech speech sounds, the velar nasal is just a positional variant of the alveolar phoneme and is regularly used before the velar consonants k- or g- in the middle of words. Unlike in Chinese, the alveolar and velar nasals do not occur in contrastive distribution. The experience of Czech learners with velar nasals is limited to a certain environment, which must be a reason why its mastery is difficult. Třísková (2012, p. 290) points out that, under the influence of their native language, Czech students tend to put a non-aspirated velar k- after the velar nasal while pronouncing the velar nasal. As far as our own pedagogical experience is concerned, they also tend to merge velar nasals with alveolar nasals, making no distinction between them. Analysis of the dictation tests has shown the same blending in their perception. The recognition of differences between the two types of nasal endings has been identified as the most difficult issue of speech perception. Considering the structure of the final, the following pairs are especially difficult to distinguish: -in vs. -ing, -uan vs. -uang and -an vs. -ang.
Errors in finals with nasal endings are not limited to the discrimination of nasals. There were also difficulties with vowel differentiation before velar nasals. Firstly, confusion of the finals -ang and -eng was identified. Secondly, the zero-initial syllable ying was repeatedly noted incorrectly as yong or yung. It seems difficult for students to distinguish the nucleus of finals whose pronunciation is affected by the velar nasal ending. Besides, there was one syllable with an alveolar ending that, based on the spectrum of incorrect notations, can be marked as difficult to perceive as well. It is the final -u(e)n, and as shown below, the incorrect notations of the example syllable sun include suan, sang and song.
Table 2: Most frequent errors in finals
Finals  Examples
apical vowels -i vs. back unrounded -e  ....í instead of ....é
unrounded front high -i vs. rounded -ü  .u.i instead of .u.u
alveolar vs. velar nasals (-in/-ing, -uan/-uang, -an/-ang)  jin.¸. instead of jing.¸.; ......uán instead of ......uáng; .......ang instead of .......an
finals -ang vs. -eng  .....ang instead of .....ěng
syllable ying  incorrectly noted as yong, yun, yung
final -un (-uen)  .....uan, .....ang, .....ong instead of .....Un

To sum up, confusion at the segmental level appears to be proportional to the dissimilarities between the two languages in phonetic and phonological features. The errors reveal in substantial measure the main differences in the phonological organization of Czech and Chinese. In regard to the experience gained through teaching the Introduction to Chinese Phonetics course, one cannot overlook the correspondence with the most prominent pronunciation difficulties, which confirms a high level of correlation between speech production and perception.
The types of errors appear to be consistent with the fundamental premise of the so-called perceptual assimilation model of cross-language speech perception described by C. T. Best (1995). She pointed out that how well L2 learners discern the differences between pairs of L2 sounds might depend on the fact that they tend to perceive the non-native sounds "according to their similarities to, and discrepancies from, the native constellations that are in the closest proximity to them in native phonological space" (p. 185).
Within the errors described above, several examples can be found indicating that learners tend to assimilate non-native sounds to the most similar native category, i.e. the one that is closest in the L1 phonological space. This must be the reason why Czech learners have difficulties distinguishing k- vs. g-, x- vs. sh-, -ü vs. -i and velar vs. alveolar finals: due to assimilation, non-native sounds and the closest native sounds are perceived as members of the same category. On the other hand, it also seems that the non-native palatals q- and x- are not strongly assimilated to two different native categories, since learners also face difficulties with their differentiation. These errors might indicate another type of assimilation defined by Best, i.e. uncategorizable speech sounds that are assimilated within native phonological space but not as a clear example of any particular native category. This also seems to be the case for another pair of non-native speech sounds: the apical vowels -i and the back unrounded vowel -e.
5.2 Errors at the suprasegmental level 
Within the analyzed sample of dictation tests, 778 incorrectly noted tonal combinations were collected. According to the occurrence of errors, the 15 possible tonal combinations⁵ can be divided into the following four levels of difficulty. As can be seen, there is at least a 1% gap between neighboring levels.
Table 3: Errors in tones 
Level  Percentage of errors in tones  Combinations  
1.  over 13%  T3–T4, T1–T3  
2.  11% – 9%  T2–T4, T2–T1, T2–T3, T4–T3  
3.  7% – 4%  T3–T1, T4–T2, T1–T2, T3–T2  
4.  less than 2%  T4–T1, T1–T4, T4–T4, T1–T1, T2–T2  

The results indicate that the easiest tonal combinations to recognize are all combinations of two identical tones. Besides, the combinations of T1 and T4, regardless of their position, also belong to the least difficult group of tonal combinations. On the other hand, the combinations with T2 or T3 (including their mutual combinations) appear to be difficult to perceive. The most difficult combination is T3–T4: its incorrect notations account for 19% of errors in tonal combinations. Taken together with T1–T3 as the second most difficult combination, their incorrect notations constitute almost one third of all errors at the suprasegmental level.
⁵ T3–T3 was included under T2–T3 because its pronunciation is, due to the sandhi rule, the same.
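The four difficulty bands of Table 3 can be expressed as a small lookup, shown here as a sketch. The function name `difficulty_level` is hypothetical; the band limits are read directly from the table, and shares that fall into the gaps between bands are deliberately left unclassified.

```python
def difficulty_level(share_percent):
    """Map a combination's share of all tone errors (in %) to a Table 3 level.

    Returns None for shares that fall into the gaps between the bands.
    """
    if share_percent > 13:
        return 1  # over 13%
    if 9 <= share_percent <= 11:
        return 2  # 11% to 9%
    if 4 <= share_percent <= 7:
        return 3  # 7% to 4%
    if share_percent < 2:
        return 4  # less than 2%
    return None


# The 19% share reported for T3-T4 falls into the most difficult level.
t3_t4_level = difficulty_level(19)
```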
Let us now examine the types of errors in detail. Firstly, the position of the error within the combination was investigated. Analysis has revealed that students have more difficulty recognizing the tone on the first syllable than on the second syllable. To be specific, 69% of errors occur on the initial and 37% on the final syllable. It seems that the impression of the second syllable might influence the perception of the first one. As the numbers indicate, errors on both syllables are rather rare; this was the case in only 6% of incorrectly noted combinations.
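The three percentages reported above are mutually consistent under one reading, namely that the figures for the first and second syllable each include the combinations with errors on both syllables. This reading is an assumption; under it, a short inclusion-exclusion check gives the following breakdown.

```python
# Percentages reported in the text (share of incorrectly noted combinations).
first_syllable_pct = 69   # error on the first syllable (assumed to include "both")
second_syllable_pct = 37  # error on the second syllable (assumed to include "both")
both_syllables_pct = 6    # errors on both syllables

# Inclusion-exclusion: every incorrectly noted combination has at least one error.
at_least_one_pct = first_syllable_pct + second_syllable_pct - both_syllables_pct

# Shares of combinations with an error on exactly one syllable.
only_first_pct = first_syllable_pct - both_syllables_pct
only_second_pct = second_syllable_pct - both_syllables_pct
```

Under this assumption the total comes to 100%, with 63% of incorrectly noted combinations wrong on the first syllable only and 31% on the second syllable only.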
Secondly, the question to be answered is whether the errors show any systematic behavior. Analysis has pointed out some important results that need to be further discussed. First of all, a strong connection between errors occurring within the combinations T3–T4 and T2–T4 has been identified. As for T2–T4, it is T3 that can be found on the first syllable in almost 90% of its incorrect notations. The same phenomenon can be observed in the opposite direction, albeit with a slightly lower frequency: T2 occurs on the first syllable in 82% of incorrectly noted T3–T4 combinations. It appears that T4, starting at the top of the pitch range, makes it difficult for students to distinguish whether the preceding syllable was rising or not.
Moreover, the same connection can be identified for another pair of combinations. As for T1–T2 and T1–T3, Czech students face perceptual confusion between T2 and T3 on the second syllable. In this case, the mutual error frequency rate of 64% is identical for both combinations. 
Nonetheless, patterns without such a clear mutual connection can be observed as well. Considering the other combinations rated above as most difficult, three other easily misidentified pairs of combinations can be identified. As can be seen below in Table 4, the error rate for T2–T3 incorrectly noted as T4–T3 is especially high. However, errors concerning the combination T4–T3 occur more often with the second syllable being confused for T2.
Table 4: Tonal combinations: most frequent types of errors 
Combination  Incorrectly noted as  Rate  
T2–T4  T3–T4  90%  
T3–T4  T2–T4  82%  
T2–T3  T4–T3  84%  
T1–T2  T1–T3  64%  
T1–T3  T1–T2  64%  
T4–T3  T4–T2  51%  
T4–T3  T2–T3  36%  
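Rates of the kind shown in Table 4 can be derived from a list of (dictated combination, noted combination) pairs. The following is a minimal sketch, not the procedure used in the study: the function `confusion_rates` is hypothetical and the ten error records are invented so that the arithmetic is easy to follow.

```python
from collections import Counter, defaultdict


def confusion_rates(errors):
    """For each dictated combination, return the share of each incorrect notation.

    errors: iterable of (dictated, noted) pairs, incorrect notations only.
    """
    by_target = defaultdict(Counter)
    for dictated, noted in errors:
        by_target[dictated][noted] += 1
    return {
        dictated: {noted: n / sum(counts.values()) for noted, n in counts.items()}
        for dictated, counts in by_target.items()
    }


# Hypothetical records (not the study's data): ten incorrect notations of T2-T4,
# nine of which have T3 on the first syllable.
errors = [("T2-T4", "T3-T4")] * 9 + [("T2-T4", "T2-T1")]
rates = confusion_rates(errors)
t2t4_as_t3t4 = rates["T2-T4"]["T3-T4"]  # 9 of 10 incorrect notations
```

Note that each rate is relative to the incorrect notations of one dictated combination only, so the rates listed for different combinations in Table 4 do not sum to 100%.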

The results demonstrate that the greatest difficulties in tonal differentiation involve the pair of T2 and T3. Considering the fact that students take the test after seven weeks of training, this result is consistent with McGinnis's (1996, p. 87) theory that at the second stage of development of tonal perception skills, students begin to develop tonal confusion difficulties on the basis of similar tonal contours. However, it has also been observed that Czech learners experience difficulties in discriminating T2 and T4, which are, according to McGinnis's research on American English speakers, easy to confuse during the initial period of Chinese language study because learners place greater emphasis on the extreme endpoint of a given tone and less on its direction.
Generally speaking, the phonetic features of tones are considered one of the key factors affecting tone learning. Gandour (1981) observed that tones which are highly similar in their fundamental frequency (F0) are more likely to be confused than tones whose F0 patterns are different. His findings were later confirmed in several other studies. In one recent study, So & Best (2010) examined the tonal perception of six Chinese syllables by native speakers of Hong Kong Cantonese, Japanese and Canadian English. Although their results revealed language-specific errors, language-independent patterns were identified as well: regardless of the learners' backgrounds, tones which have similar phonetic features (T1 vs. T2, T2 vs. T3, T1 vs. T4) were more difficult to distinguish than those with dissimilar features (T1 vs. T3, T2 vs. T4, T3 vs. T4). As for the Czech students, difficulties with the discrimination of T2 and T3⁶ are consistent with this theory. This also aligns with the general assumption that T2 and T3 are the most confusable tone pair.
However, significant differences can be found when comparing the results with another recent study, this time focusing on disyllabic compounds. Lilienfeld (2015) examined tone perception by 111 beginner-level Chinese students from different backgrounds (tonal as well as non-tonal, though none of them was from a Slavic-language background). Despite certain mean score differences, all students performed similarly in overall tonal combination perception. Analyzing the results, Lilienfeld detected the impact of tone placement as one of the factors influencing the perception of T3; the combinations with T3 in the initial position (T3–T1, T3–T2 and T3–T4) were identified as the hardest, in contrast to the easiest tonal combinations T1–T3 and T2–T3. These findings are considerably different from those identified in the case of Czech learners. Only the combination T3–T4 was found to be the most difficult in both studies. Otherwise, the results of this study are the opposite of those obtained by Lilienfeld. The combinations T1–T3 and T2–T3 belong to those that were highly misinterpreted; on the other hand, errors within the combinations T3–T1 and T3–T2 appear to be relatively fewer. Unfortunately, Lilienfeld's study does not include any description of error patterns typical for each combination, whose comparison might provide additional insight.
⁶ The similarity shared by T2 and T3 is that both have a lower-to-higher pitch range, even though for T3 this is the case only in the final position.
Despite the obvious differences, there is one important fact connecting Lilienfeld's study and the present one. Although all the most frequent errors are in some way connected with T3, the results suggest that this cannot be summed up as an overall difficulty regardless of the tonal combination. However, compared to Lilienfeld's findings, it is obvious that the perception difficulties with T3 observed in Czech students show a more complex pattern. Firstly, difficulties with the discrimination of T3 and T2 on the initial syllable occur in disyllabic compounds with T4 on the final syllable. Secondly, T3 and T2 on the final syllable are not easy to distinguish in disyllabic compounds with T1 or T4 on the initial syllable. Moreover, the position of T3 on the final syllable makes the recognition of T2 and T4 on the initial syllable difficult.

6 Conclusion
The analysis of errors occurring in dictation tests has revealed how Czech listeners' performance is constrained by their native phonological system. The results demonstrate that, despite the learners' non-tonal language background, the discrimination of tonal combinations is not the most challenging issue in Chinese speech perception for Czech learners, or at least not after the seven weeks of intensive training that the students went through before taking the test. If anything, they seem to experience comparable difficulties with the perception of Chinese finals. Besides, the study has also provided an overview of the most confusable pairs at the segmental and suprasegmental level. Unlike the language-specific errors in initials and finals, the identified patterns of errors in tonal combinations appear to be language-independent, since they are, in general, consistent with what has been reported in previous studies on cross-linguistic perception of the Standard Chinese tonal system, i.e. that tones with similar features are likely to cause more perceptual difficulties. In addition, the findings have also indicated that the tone errors of Czech learners seem to follow relatively complex patterns.

References 
Best, C. T. (1995). A direct realist view of cross-language speech perception. In W. Strange (Ed.), 
Speech perception and linguistic experience: Issues in cross-language research (pp. 171­
204). Baltimore: York Press. Buck, G. (2001). Assessing listening. Cambridge: Cambridge University Press. Duanmu, S. (2007). The phonology of Standard Chinese. New York: Oxford University Press. 
Gandour, J. (1981). Perceptual dimensions of tone: Evidence from Cantonese. Journal of Chinese Linguistics, 9(1), 20-36. 
Gottfried, T. L., & Suiter, T. L. (1997). Effect of linguistic experience on the identification of Mandarin tone identification. Journal of Phonetics, 25(2), 207-231. 
Lee, Y.-S., Vakoch, D. A., & Wurm, L. H. (1996). Tone perception in Cantonese and Mandarin: A cross-linguistic comparison. Journal of Psycholinguistic Research, 25(5), 527-542.
Li, M., & Shi, P. (1998). Hanyu Putonghua Yuyin Bianzheng. Beijing: Beijing Yuyan Xueyuan Chubanshe.
Lilienfeld, B. (2015). A mixed method study of Mandarin tone perception, production and self-regulated strategy use (Doctoral dissertation). 
Lin, Y.-H. (2007). The sounds of Chinese. Cambridge: Cambridge University Press.
McGinnis, S. (1996). Tonal distinction errors by beginning Chinese language students: A comparative study of American English and Japanese native speakers. In S. McGinnis (Ed.), Chinese pedagogy: An emerging field (pp. 81-91). Columbus: Ohio State University. 
Palková, Z. (1994). Fonetika a fonologie češtiny. Praha: Univerzita Karlova.
So, C. K., & Best, C. (2010). Cross-language perception of non-native tonal contrasts: effects of native phonological and phonetic influences. Language and Speech, 53(2), 273-293. 
Švarný, O., & Uher, D. (2001). Hovorová čínština. Úvod do studia hovorové čínštiny. Olomouc: Univerzita Palackého v Olomouci.
Švarný, O., et al. (1988). Hovorová čínština v příkladech. Olomouc: Vydavatelství Univerzity Palackého.
Trubetzkoy, N. (1969). Principles of phonology. Berkeley: University of California Press. 
Třísková, H. (2012). Segmentální struktura čínské slabiky. Praha: Karolinum.
Třísková, H. (2014). Aspirované souhlásky v čínštině a v angličtině. Časopis pro moderní filologii, 94(2), 196-218.

RESEARCH ARTICLESUnderstanding Reference: Morphological Marking in JapaneseShinichi SHOJI	.....................................................................................................  9Functional Significance of Contextual Distribution: Discourse Particle ar in BanglaSoumya Sankar GHOSH, Samir KARMAKAR, Arka BANERJEE ....................	23Japanese n deshita in Discourse: Past Form of n desuHironori NISHI ......................................................................................................	41“Two sides of the same coin”: Yokohama Pidgin Japanese and Japanese Pidgin EnglishAndrei A. AVRAM ................................................................................................	57Chinese Legal Texts – Quantitative Description ¼uboš GAJDOŠ ......................................................................................................	77Perceptual Errors in Chinese Language Processing: A Case Study of Czech LearnersTereza SLAMÌNÍKOVÁ ......................................................................................	89