Rethinking Metalinguistic Labels: SPATIO-TEMPORAL METALINGUISTIC TERMS IN LEARNERS' DICTIONARIES OF JAPANESE EXPRESSIONS

Andrej BEKES
University of Ljubljana
andrej.bekes@gmail.com

Abstract
This paper examines spatio-temporal metalinguistic terms in major existing learners' dictionaries of Japanese expressions. Based on the analysis, it proposes a layered metalinguistic labelling solution that achieves the greatest efficiency with the smallest possible number of labels.

Keywords: Japanese; metalinguistic label; semantic/functional index; grammar/expression dictionaries

Povzetek
Članek analizira prostorsko-časovne metajezikovne izraze v slovarjih japonskih izrazov za učeče se. Na osnovi analize predlaga večplastno uporabo metajezikovnih oznak, s čemer se optimizira učinkovitost ob kar se da majhnem naboru oznak.

Ključne besede: japonski jezik; metajezikovna oznaka; semantična/funkcijska razvrstitev; slovarji slovnice/izrazov

1. Introduction
Efficient semantic labelling is essential if a dictionary is to be used for the composition as well as the reception of texts. When a learners' dictionary is created, this task tends to be done on the basis of experience and tradition, and often in quite an ad hoc fashion, that is, without systematic theoretical considerations (e.g. Group JAMASSY 1998, where I was in charge of semantic/functional labels). One option for a more systematic treatment is offered by the NSM (natural semantic metalanguage) approach (e.g. Goddard and Wierzbicka 2007; for a critical assessment cf. Trobevšek Drobnak 2009). Another possibility is to build an array of metalinguistic labels from the bottom up, from corpora (e.g. Labrador De La Cruz 2004). On the other hand, the chosen labels should also be as theoretically relevant as possible.

Acta Linguistica Asiatica, 4(2), 2014.
ISSN: 2232-3317, http://revije.ff.uni-lj.si/ala/ DOI: 10.4312/ala.4.2.9-23

This means that if adequate labels are to be produced, not only collocations but also broader semantic properties, implicit within the wider context, need to be considered. Languaging (e.g. Becker 1988) is an activity where our linguistic potential is applied to fluid on-going situations. Not only functional expressions but also lexical items such as verbs, adjectives and nouns often have multiple meanings, which arise through interplay with the context. The problem is how to capture these characteristics in a metalanguage intended for explanation of the lexicon, or when providing learners with semantic/functional labels. One of the drawbacks of NSM is its apparently static view of the lexicon and of the meanings of lexical items, which misses a crucial property of human communication.

Regarding methodology, the intuitively chosen approach of Group JAMASSY (1998), which provides meanings not so much by description as by illustration, giving many examples of particular uses of functional lexical items, seems appropriate. While more terse than Makino and Tsutsui (2008) in the sense that its explicit explanations are limited to short commentaries, it seems to be a less restrictive approach when guiding experienced users/learners.

The context-based approach, resonating with the basic argument in Labrador De La Cruz (2004), has recently been taken up systematically. Within the field of Japanese language learning support tools alone, two outstanding tools should be noted. One is Natsume (http://hinoki.ryu.titech.ac.jp/natsume/), developed by Kikuko Nishina and associates and running partly on the BCCWJ and partly on closed corpora, and the other is NINJAL LWP (Lago Word Profiler), developed by Prashant Pardeshi and associates and implemented using the BCCWJ and TWC corpora (http://corpus.tsukuba.ac.jp/search/).
These tools provide users with usage-based information, extracted online by means of an analysis of the chosen corpora. For this reason, these tools are actually used not only by intermediate and advanced learners, but also by professional dictionary makers.

In spite of the power of these new tools, well thought-out dictionaries for beginner and lower intermediate learners are also crucial to speed up efficient learning of Japanese. For this purpose, an efficient array of semantic/functional labels that guides the users towards an effective understanding of the lexical item is necessary. The labels should be chosen so that, for composition, they can efficiently guide the learner towards those expressions in the target language that are most appropriate to achieve the learner's goal in a given situation. Since learners are not translators, however, this goal should be achieved through learners' intuitive grasp "from within" of how the target language works.

This paper focuses on spatio-temporal metalinguistic terms, with an emphasis on methodological issues, and analyses the implementation of such metalinguistic terms in contemporary Japanese grammar/expression dictionaries, in particular Group JAMASSY (1998). Building on the analysis of how spatio-temporal terms are structured in Group JAMASSY (1998), I argue for a concise but systematic and pragmatically relevant approach in semantic labelling.

2. Japanese grammar / expression dictionaries
A number of dictionaries (Group JAMASSY, Makino and Tsutsui, Tomomatsu et al., etc.) have been published since 1990 with the express purpose of easing the learning of items that do not appear in traditional dictionaries, i.e., functional expressions, sentence patterns, etc. In this paper, two such dictionaries were chosen, based on their breadth of coverage and popularity. An additional reason is that I am a co-author of one (Group JAMASSY 1998) and this paper serves as a re-examination of that previous work.

2.1 Characteristics of Makino and Tsutsui (2008) and Group JAMASSY (1998)
The purpose of the two dictionaries is largely similar: providing clear information to learners about common expressions and patterns. Makino and Tsutsui authored three volumes, from elementary to advanced, while Group JAMASSY's dictionary targets upper intermediate and advanced level learners. Therefore, only Makino and Tsutsui's third volume (2008), targeting advanced level learners, will be taken into consideration here.

In both dictionaries the main entries include lexical items and constructions, and are organised alphabetically (Makino and Tsutsui) or in gojūon order (Group JAMASSY). Makino and Tsutsui offer detailed information on formation, i.e., the constructions in which particular expressions are typically used. Group JAMASSY handles this type of information in subentries that are often detailed and hence sometimes difficult to follow.

There is one crucial difference in the organisation of the dictionaries. Makino and Tsutsui is bilingual, with Japanese entries and examples but strictly English explanations and glosses. Group JAMASSY's dictionary, on the other hand, was conceived as a monolingual dictionary, aimed primarily at upper intermediate and advanced learners and as a reference book for Japanese language teachers. In the wake of its success, translations of the dictionary into Chinese, Korean, Thai and Vietnamese have appeared. While Makino and Tsutsui provide ample examples of use together with translations and detailed explanations, Group JAMASSY relies on the power of multiple examples, supplemented with direct, concise explanations. In the translated versions of the dictionary the examples and the explanations are translated into the respective language, resulting in a dictionary with a structure similar to that of Makino and Tsutsui.

Another difference is in the organisation of the index.
Makino and Tsutsui (2008), being bilingual, offers two indexes: an index of romanised Japanese entries in alphabetical order, and an index of English translations of entries, also in alphabetical order. This solution may seem helpful for users engaging in productive tasks, but is not always conducive to a quick location of the desired expression in Japanese. Group JAMASSY tries to cope with the problem of quick entry location by including two indexes: besides the gojūon index of entries, there is also an index of entries sorted by their meaning and/or function. This semantic/functional index consists of more than one hundred metalinguistic labels, referring broadly to the meaning and/or function of a particular entry or subentry. In this respect, Group JAMASSY (1998) is similar to some of the other grammar dictionaries that appeared later, such as Tomomatsu et al. (2007/2010). Below, this paper focuses on the subset of these labels in Group JAMASSY (1998) that refer to spatial concepts, temporal concepts, or both.

3. Metalinguistic labels
Attempts at creating a language that would precisely convey meaning across languages have a distinguished lineage, going back to the tradition of searching for a universal language, with G. W. Leibnitz as one prominent proponent of the idea (e.g. Yaguello 1990, Eco 2003), and with NSM (e.g. Goddard and Wierzbicka 2007) as a notable present-day attempt. In both dictionaries under scrutiny, the explanations should ideally have been based on some subset of language in the vein of NSM, but this is not the case with Group JAMASSY and does not seem to be the case with Makino and Tsutsui either. In both dictionaries ordinary, though to some extent controlled, language is used for the explanations.

3.2 The purpose of the metalinguistic labels in the Group JAMASSY dictionary
Group JAMASSY (1998) uses metalinguistic labels to hint at the meaning and/or function of entries and subentries.
There are two types of labels: (a) labels used in the semantic/functional index, to direct the learner towards the relevant entries during language production tasks, and (b) those used in the main body of the dictionary, to disambiguate different meanings and usages. The problem is that the two sets of labels are not identical and are used in a way that seems somewhat inconsistent. Let us consider, for example, the entry あいだ:

(1) a 期間 ... p. 2 ... p. 2
    b ... p. 2
      1 N のあいだ
        a N のあいだ <空間>
        b N のあいだ <関係>
      2 あいだ <時間>
        a ...あいだ [N のあいだ] [A-いあいだ] [V-ている/V-るあいだ]

In the semantic/functional index the entry あいだ is listed under the label <期間> (meaning interval of time, period; see (1)a above), while in the main body あいだ is disambiguated with the synonymous/hypernymous <時間>. In the index one also finds labels such as <…> and <…>, while in the body of the dictionary, as in (1)b, the hypernym <関係> is used, which in this context refers to the specific meaning of a "human relationship" and happens to be the most frequent use in TWC. Also, while labels such as <…> and <…>, as above, are often used in the main body of the dictionary, there are no such labels in the index. There, on the other hand, one can find <…>, <…> and <…>, which point to the relevant entries in the body. In total, there are 158 labels used only in the main body, 115 labels used only in the index, and 45 labels that are used both in the index and in the main body of the dictionary.

3.3 Clashing requirements
There is a contradiction inherent in devising metalinguistic expressions to be used for explanation in dictionaries. On the one hand, the number of such expressions has to be small, so that they are manageable; on the other hand, for these expressions to be specific enough, their number has to be sufficiently large. This is the clash of requirements that Leibnitz and others had to face early in their work on universal language (e.g. Eco 2003; Yaguello 1990).
The problem, which for a while also hindered attempts at automatic translation, is that a language with the expressive power of natural languages has to be as complex as natural languages.

Group JAMASSY's attempt was no exception. Its goal was simplicity and intuitive ease of use, which resulted in plentiful examples, concise explanations, and a very limited set of non-technical, transparent semantic/functional labels used in the main body and index of the dictionary. The requirement that labels in the index be specific enough to direct the learner to a small set of potential candidate entries conflicts with the requirement that the number of labels be kept manageably small. The outcome of the clash was the aforementioned inconsistency between the labels used in the index and those used in the body of the dictionary.

Though this is theoretically an unresolvable problem (NSM being no exception), a trade-off is possible. The compromise would be to establish an array of labels sufficiently rich for practical purposes but at the same time concise enough to be manageable. Randomly chosen examples such as (1) above show that an improvement in the consistency and accuracy of semantic/functional labels is possible.

4. Some proposals for improvement
Limiting the scope to spatio-temporal labels, I will illustrate a few possible paths out of this impasse.

4.4 Consistency of use - labels in the body and in the index
There are many examples where the use of semantic/functional labels could be made more consistent by treating the labels in the body of the dictionary and the labels in the index as parts of the same set. In the following sections, examples based on the use of spatio-temporal labels are presented.

4.4.1 Unifying labels in the dictionary body and in the index
The labels in the index and the labels used for disambiguation in the body of the dictionary are sometimes related (hypernym-hyponym, etc.), yet differ.
For example, there are cases when 期間 is used in the index while 時間 is used in the body of the dictionary. The first entry in the dictionary is a good example:

(2) 2 あいだ <時間> p. 2
    a ...あいだ [N のあいだ] [A-いあいだ] [V-ている/V-るあいだ]
    b ...あいだに [N のあいだに] [Na なあいだに] [V-ている/V-るあいだに]

In (2) 2a, the meaning of あいだ can be glossed as "an interval of time during which an action continues". It would be more user-friendly to use the same label for disambiguation and in the index, and 期間 is more specific than 時間. In 2b, the meaning of あいだに can be glossed as "(relatively) instantaneous action, taking place within an interval of time", so again, as in 2a, it seems appropriate for the labels to be the same, i.e., <…>.

A more or less identical situation seems to hold in (3) below, which is semantically similar to the preceding example, with the meaning of 9 ^ ^ glossed as "(relatively) instantaneous action, taking place within an interval of time":

(3) 2 p. 48 [N [Na

In examples (4) and (5) below, the situation is similar again. The meaning of both constructions can be glossed as "an interval of time during which an action or state continues". In (4), as in (2) 2a, using 期間 for disambiguation would be more appropriate.

(4) 2 N ru® p. 110
(5) 2 N p. 145

On the other hand, in (6), both keeping and changing the disambiguation label could be argued for:

(6) 3 ...まで p. 546
    a N まで
    b V-るまで

The gloss of まで in 3a in (6) is "the time until which a state or some activity continues", and the gloss of まで in 3b is "the time limit defined by an action, until which a state or some activity continues". In both cases, the hypernym 時間 seems acceptable from the point of view of label economy, though a more specific label, such as "time limit" ^^, would be even more appropriate.
Additionally, in 3b, the temporal-relational semantic relationship between the defining action and the continuing state or activity could also be emphasized by means of double labelling, a possibility discussed in more detail in Section 4.5.

Similarly, the construction in the example below:

(7) 2 N ^ N p. 505

expresses the general meaning of a spatial relationship between a location and its surroundings. Therefore the more specific <…> "place, location" could be replaced by <…> as a disambiguator.

4.4.2 Consistent disambiguation of subentries
Consistent disambiguation of the subentries would make searching easier and improve understanding of the entries. The first entry in the dictionary is once again a good example.

(8) [あいだ]
    1 N のあいだ p. 2
      a N のあいだ <空間>
      b N のあいだ <関係>
    2 あいだ <時間>
      a ...あいだ
      b ...あいだに

In (8) the macrostructure of the entry [あいだ] is shown. While the temporal use of あいだ is disambiguated at the first level of the subentry, the spatial and relational uses are disambiguated at the second level of the subentries. Restructuring the entry [あいだ] into

(8') [あいだ]
     1 N のあいだ <空間>
     2 N のあいだ <関係>
     3 あいだ <時間>
       a ...あいだ
       b ...あいだに

would render the treatment of subentries in the dictionary more consistent. Restructuring the entry would also be advisable in cases such as the one below:

(9) p. 545 1 N N ^^ <0> (1 ) 1 3 3 ^^^^už ^o (2) A < 2 N <0WÄ> a N iV b V-SiV 4...iV p. 546 p. 547

In (9) the first-level subentry 1 is not disambiguated at all, although it gives examples of both the temporal and the spatial use of N N iV. Such situations may discourage an inexperienced user, who could otherwise make good use of such a dictionary if its entries were organised in a more consistent and transparent way. A more appropriate entry structure would be:

(9') [iV] 1 N N iV (1 ) iyy^^^^ÄÜ 1 3 ^iVM 3 2 N ^6 N iV (2) A t^o ... 3 N iV <0WÄ> 4...iV a N iV b V-SiV 5..iV

4.5 Combination of labels for greater precision
One way to increase the expressive power of a limited number of semantic/functional labels is to combine the labels.
For example, a superordinate label could specify a wider semantic area, while the subordinate label of the same entry would narrow the focus to one of its aspects. Incongruent examples, using a partly overlapping set of labels in the index and in the main body, such as (10) below, provide a hint:

(10) index: [tt^t] p. 34
     main body: 1 ... ttSt
                2 ... ttst

The index entry, which at present looks like:

(11) t tt^t

would look like:

(11') t ttst

while the labelling in the main body of the dictionary would not change. With two-layered labelling in the index, the user would have a clearer idea of how entries are interrelated.

Below I offer some additional examples of temporal and spatial labels that could become more transparent by being rearranged into two layers. In the index entry (12) below, the spatial use of the entry expression ^ L is indicated by the label $ M ^, while no label is provided to indicate the temporal use of ^L.

(12) [rL] p. 110 1 N 2 N

In the "spatial" use of rL the relational meaning seems to be more general, and it should take precedence over the spatial meaning. As for the temporal use of rL, as shown in example (4) above, it implies an interval. The temporal meaning should therefore be labelled as <期間>. The index should hence be rewritten as follows:

(12') rL L

The main entry would then look like:

(12") [ru] 2 N ru - • < /^^mn

There is no end to possible examples. The next example is again an entry with both spatial and temporal meanings.

(13) [t^o] p. 145
     1 N D^o(^) <空間>
     2 N t^o <時間>

In the index, the entry t^O is referred to by the labels $ nWM#^ and 期間. In its spatial use, the meaning seems not to be so much relational as implying "the range or scope of some activity"; thus <範囲> is appropriate, subspecified for space <空間>. The temporal use label, <時間>, seems too vague for the gloss "the period while a situation or an action continues".
Both the spatial and temporal uses are related; 期間 is a sort of temporal range, that is, a 範囲. The revised index labelling and entry should thus look as follows:

(13') index: 範囲 <空間> t^O; 期間 t^O
      entry: [t^O] 1 N t^o(^) 2 N t^o - • < /^fcmn

In this section, two-layered labelling was proposed. As can be seen from example (14), finer subdivisions with more layers of labels are possible and sometimes necessary. In (14) the entry [feä i ] is referred to in the index by two labels, $ nWM^ and MMM^. The problem is that MMM^ can refer to both temporal and spatial order, so as a label it is not very useful for the learner.

(14) [feä 1] p. 8, 9
     1 feä <空間> [N [V-S/V-fcfeä]
     2 <時間>
       a ...feä [N [V-fchä]

Two-layered labelling would provide clearer information for the learner. The entry hh 1 is about temporal and spatial order, so rather than <時間> or <空間>, its relational property should be given priority in the choice of labels. In the index, the labels should be arranged as below, in (14'). For greater precision and a sharper focus, an additional layer can be added, in this case <…>. This is the reverse of the original index, where the ambiguous label appears as the first label. Thus we would have in the index:

(14') hh hh

and in the body:

(14") [hh 1] 1 hh [N ®hh ] [V-S/V-fchh] a ...hh [N ®hh ] [V-fchh]

4.6 Contribution of corpora
Frequency data from corpora can be a guide for ordering different usages, though this principle is not an absolute guide. Let us consider entry (1) once again in (15).

(15) [あいだ]
     1 N のあいだ
       a N のあいだ <空間>
       b N のあいだ <関係>
     2 あいだ <時間>
       a ...あいだ

Here, for example, frequency-based ordering and semantic (more general first) ordering clash. Data from LAGO TWC show that the use as RELATION is more frequent than the use as SPACE. On the other hand, semantically, SPACE is primary and RELATION is derived. Further, the semantic properties of N in N のあいだ may also be relevant. The choices made in Group JAMASSY (1998) do not always seem to be the most appropriate.
Here are some representative examples, based on the same entry:

(16) b N のあいだ <関係> p. 2
     (1) Mffi^A®^«
     (2)
     (3)

Considering the first example above, (16)(1), the frequencies in LAGO TWC are 402 for N=^A and 1535 for N=A^. This difference should be big enough to warrant the choice of A^ instead as the typical example of a relation involving members of a set. On the other hand, the second example of use, (16)(2), seems to be appropriate as far as the choice of N is concerned, as Ns that belong to the semantic category HUMAN represent the vast majority of examples of use in LAGO TWC. It is doubtful, however, whether (16)(2) is indeed an example of RELATION. It seems that it could rather be interpreted as expressing a RANGE (or scope), where the predication relationship is between T^r^J and T AÄ^feS. The third example of use, (16)(3), shows a relation between abstract Ns. The frequency in LAGO TWC for N=<abstract noun> is very small (< 100) compared to N=A^ (1535), but the example seems to be appropriate, because abstract nouns are a separate category.

In accordance with Labrador De La Cruz (2004), labels based on corpora may improve the overall quality of labelling. Abstractions should be based on empirically obtained examples of use and on the goals of a particular dictionary. Some examples of ^ from LAGO TWC follow:

(17) a b

Here, (17)a is a relation between two specific sentient beings with volition, a pattern quite frequently observed in the corpus. (17)b is a relation between two human groupings that are possibly not well defined, but it is still rather frequent. (17)c is a relation among abstract categories, denoting general characteristics of some human society, and such constructions are not at all frequent in the corpus. From the three examples in (17) it is possible to see that the general label RELATION could be subspecified further as VOLITIONAL (in 17a), as GENERALISED VOLITIONAL (in 17b) and as PERCEIVED/ASCRIBED (in 17c).
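
The subspecification of RELATION suggested by (17) can be pictured as a layered label, that is, a path running from the general label to a more specific one. The following Python sketch is only an illustration under stated assumptions: the class `LayeredIndex`, its methods and the placeholder entry names are hypothetical and are not part of Group JAMASSY (1998) or of any existing tool.

```python
from collections import defaultdict


class LayeredIndex:
    """Hypothetical index keyed by layered labels (general -> specific)."""

    def __init__(self):
        # Maps a label path such as ("RELATION", "VOLITIONAL") to entry names.
        self._entries = defaultdict(list)

    def add(self, label_path, entry):
        """Register an entry under a full label path."""
        self._entries[tuple(label_path)].append(entry)

    def lookup(self, *prefix):
        """Return, sorted, all entries whose label path starts with the given layers."""
        return sorted(
            entry
            for path, entries in self._entries.items()
            if path[:len(prefix)] == prefix
            for entry in entries
        )


index = LayeredIndex()
# Placeholder entry names standing in for the corpus examples in (17).
index.add(("RELATION", "VOLITIONAL"), "example (17)a")
index.add(("RELATION", "GENERALISED VOLITIONAL"), "example (17)b")
index.add(("RELATION", "PERCEIVED/ASCRIBED"), "example (17)c")

# A query on the general label alone reaches all three uses.
print(index.lookup("RELATION"))
# Adding the second layer narrows the result to a single use.
print(index.lookup("RELATION", "VOLITIONAL"))
```

Looking entries up by label prefix is one possible way to keep the number of top-level labels small while still allowing more specific layers where they help; whether further layers are worth their cost remains an editorial decision.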

How far this labelling process should go in practice is perhaps one of the most difficult questions for future dictionary making, and not only for JAMASSY 2.0.

5. Conclusions
In Section 2 I compared the organisation of Group JAMASSY (1998) with that of Makino and Tsutsui (2008), pointing out the differences stemming from the different basic concepts on which the dictionaries are based. Makino and Tsutsui's volumes are more like grammar handbooks arranged in dictionary form, with ample examples and explanations. Group JAMASSY's dictionary, on the other hand, is organised as a dictionary of entries that are not described in general dictionaries. While the two dictionaries overlap, Group JAMASSY covers a larger number of entries. It offers plentiful examples with concise explanations. It also offers relatively systematic semantic/functional labels, used in the index to guide the user towards the desired entries, and within the entries to disambiguate and explain usage.

In Section 3 I briefly examined some issues connected with metalinguistic labels, in particular the inherent limitations of "universal" semantic/functional labels, and the clashing requirements that force editors to find a pragmatic balance between accuracy and usability.

Section 4 dealt with the metalinguistic labels in the Group JAMASSY dictionary. Consistency of structuring and use is a precondition for the successful application of labels. Some inconsistencies in the organisation and choice of labels were pointed out and alternatives suggested. Further, having two sets of labels, one for the index and one for the main body of the dictionary, was shown to be inappropriate. As a solution, a merger into one consistently organised set was proposed, so that the same labels could be used both for consistent disambiguation of entries and subentries and for the meaning and function index. The merger of labels has a drawback, i.e., an increase in the number of labels.
To keep the number of labels low and at the same time efficient for accessing the entries from the index via meaning/function, or as transparent disambiguators in the main body of the dictionary, a layered use of labels was proposed and illustrated with examples of spatio-temporal entries. Lastly, some possibilities of how to use corpus data were also outlined.

Literature
Becker, A. L. (1988). Language in particular: A lecture. In D. Tannen (Ed.), Linguistics in context: connecting observation and understanding, pp. 17-35. Norwood, NJ: Ablex Publishing Corporation.
Eco, Umberto (2003). Iskanje popolnega jezika v evropski kulturi (La ricerca della lingua perfetta nella cultura europea, Bari: Laterza 1993, transl. by Vera Troha). Ljubljana: Založba /*cf.
Goddard, Cliff and Anna Wierzbicka (2007). Semantic primes and cultural scripts in language learning and intercultural communication. In Gary Palmer and Farzad Sharifian (eds.), Applied Cultural Linguistics: Implications from second language learning and intercultural communication, pp. 105-124. Amsterdam: John Benjamins.
Group JAMASSY (1998). Nihongo bunkei jiten. Tokyo: Kurosio.
Labrador De La Cruz, Belen (2004). A Methodological Proposal for the Study of Semantic Functions across Languages. Meta: journal des traducteurs / Meta: Translators' Journal, Vol. 49/2, pp. 360-380.
Makino, Seiichi and Michio Tsutsui (2008). A Dictionary of Advanced Japanese Grammar. Tokyo: The Japan Times.
Tomomatsu Etsuko et al. (2007/2010). Shinsōban donna toki dō tsukau nihongo hyōgen bunkei jiten. Tokyo: ARK.
Trobevšek Drobnak, Frančiška (2009). On the Merits and Shortcomings of Semantic Primes and Natural Semantic Metalanguage in Cross-Cultural Translation. English Language Overseas Perspectives and Enquiries, Vol. VI/1-2, pp. 29-41.
Yaguello, Marina (1990). Gengo no musosha: 17seiki fuhengengo kara gendai SF made (Les fous du langage: des langues imaginaires et de leurs inventeurs. Paris: Seuil 1984, transl. by Tanikawa Taeko, Eguchi Osamu). Tokyo: Kosakusha.