Informatica 34 (2010) 103-110

Grammar of ReALIS and the Implementation of its Dynamic Interpretation

Gábor Alberti and Judit Kleiber
Department of Linguistics, University of Pécs, 7624 Pécs, Ifjúság 6, Hungary
E-mail: realis@btk.pte.hu

Keywords: lexical and discourse semantics, Hungarian, total lexicalism

Received: February 20, 2009

ReALIS, REciprocal And Lifelong Interpretation System, is a new "post-Montagovian" theory concerning the formal interpretation of sentences constituting coherent discourses, with a lifelong model of the lexical, interpersonal and encyclopaedic knowledge of interpreters at its centre, including their reciprocal knowledge of each other. First we provide a two-page summary of its 40-page mathematical definition. Then we show the process of dynamic interpretation of a Hungarian sentence (Hungarian is a "challenge" because of its rich morphology, free word order and sophisticated information structure). We show how, in the course of dynamic interpretation, an interpreter can anchor to each other the different types of referents occurring in copies of lexical items, which the interpreter retrieves on the basis of the sentence performed by the speaker (its morphemes, word order, case and agreement markers). Finally, the computational implementation of ReALIS is demonstrated.

Povzetek (Slovenian summary): The ReALIS system for the dynamic interpretation of complex sentences is presented.

1 Introduction

ReALIS [2] [4], REciprocal And Lifelong Interpretation System, is a new "post-Montagovian" [15] [17] theory concerning the formal interpretation of sentences constituting coherent discourses [9], with a lifelong model [1] of the lexical, interpersonal and cultural/encyclopaedic knowledge of interpreters at its centre, including their reciprocal knowledge of each other. The decisive theoretical feature of ReALIS lies in a peculiar reconciliation of three objectives which are all worth accomplishing in formal semantics but have not been reconciled so far.
The first aim concerns the exact formal basis itself ("Montague's Thesis" [20]): human languages can be described as interpreted formal systems. The second aim concerns compositionality: the meaning of a whole is a function of the meaning of its parts, which practically postulates the existence of a homomorphism from syntax to semantics, i.e. a rule-to-rule correspondence between the two sides of grammar. In Montague's interpretation systems a traditional logical representation played the role of an intermediate level between the syntactic representation and the world model, but Montague argued that this intermediate level of representation can, and should, be eliminated. (If $\alpha$ is a compositional mapping from syntax to discourse representation and $\beta$ is a compositional mapping from discourse representation to the representation of the world model, then $\gamma = \alpha \circ \beta$ must be a compositional mapping directly from syntax to model.)

The post-Montagovian history of formal semantics [17] [9], however, seems to have proven the opposite, some principle of "discourse representationalism": "some level of [intermediate] representation is indispensable in modelling the interpretation of natural language" [14]. The Thesis of ReALIS is that the two fundamental Montagovian objectives can be reconciled with the principle of "discourse representationalism" by embedding discourse representations in the world model, thereby getting rid of an intermediate level of representation while preserving its content and relevant structural characteristics. This idea can be carried out in a larger-scale framework in which discourse representations are embedded in the world model not directly but as parts of the representations of interpreters' minds, i.e. of their (permanently changing) information states [3].

2 Definition

The frame of the mathematical definition of ReALIS is summarized in this section; the complete, 40-page version is available in [4] (Sections 3-4).
As interpreters' mind representations are part of the world model, the definition of this model $\Re = \langle U, W_0, W \rangle$ is a quite complex structure, where
- $U$ is a countably infinite set: the universe
- $W_0 = \langle U_0, \mathcal{T}, \mathcal{S}, \mathcal{I}, \mathcal{D}, Q, A \rangle$: the external world
- $W$ is a partial function from the set $I \times T_m$, where $W[i,t]$ is a quintuple $\langle U[i], \sigma[i,t]^{\Pi}, \alpha[i,t]^{\Psi}, \lambda[i,t]^{\Lambda}, \kappa[i,t]^{K} \rangle$: the INTERNAL-WORLD FUNCTION.

The external world consists of the following components:
- $U_0$ is the external universe ($U_0 \subset U$), whose elements are called entities
- $\mathcal{T} = \langle T, \Theta \rangle$ is a structured set of temporal intervals ($T \subset U_0$)
- $\mathcal{S} = \langle S, \Sigma \rangle$ is a structured set of spatial entities ($S \subset U_0$)
- $\mathcal{I} = \langle I, \Upsilon \rangle$ is a structured set of interpreters ($I \subset U_0$)
- $\mathcal{D} = \langle D, \Delta \rangle$ is a structured set of linguistic signs (practically morph-like entities and bigger chunks of discourses) ($D \subset U_0$)
- $Q \subset T \times U_0^{*}$ is the set of core relations (with time intervals as the first argument of all core relations)
- $A$ is the information structure of the external world (which is nothing else but relation structure $Q$ reformulated as a standard simple information structure, as is defined in [22: 245]); its basic elements are called the infons of the external world
- $T$, $S$, $I$ and $D$ are pairwise disjoint, infinite, proper subsets of the external universe $U_0$ which meet further requirements that cannot be elaborated here.
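The constraints on the external world listed above can be made concrete with a small sketch. It is only an illustration under strong simplifying assumptions: the sets are finite (the definition requires infinite ones), the structuring relations are left out, and the names `ExternalWorld` and `check_external_world` are hypothetical rather than part of ReALIS or its implementation.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class ExternalWorld:
    """Finite toy stand-in for W0 (the definition itself requires infinite sets)."""
    universe: frozenset        # U0, the external entities
    times: frozenset           # T, temporal intervals
    places: frozenset          # S, spatial entities
    interpreters: frozenset    # I
    signs: frozenset           # D, linguistic signs
    core_relations: frozenset  # Q, tuples whose first member is a time interval


def check_external_world(w0):
    """Return the list of violated constraints (empty if none is violated)."""
    problems = []
    sorts = {"T": w0.times, "S": w0.places, "I": w0.interpreters, "D": w0.signs}
    # T, S, I and D must be proper subsets of U0 ...
    for name, members in sorts.items():
        if not members < w0.universe:
            problems.append(f"{name} is not a proper subset of U0")
    # ... and pairwise disjoint.
    for (a, xs), (b, ys) in combinations(sorts.items(), 2):
        if xs & ys:
            problems.append(f"{a} and {b} are not disjoint")
    # Every core relation starts with a time interval, followed by entities of U0.
    for rel in sorted(w0.core_relations):
        if not rel or rel[0] not in w0.times:
            problems.append(f"{rel!r} does not start with a time interval")
        elif not set(rel[1:]) <= w0.universe:
            problems.append(f"{rel!r} has an argument outside U0")
    return problems


toy = ExternalWorld(
    universe=frozenset({"t1", "here", "ann", "bob", "d1", "rock"}),
    times=frozenset({"t1"}),
    places=frozenset({"here"}),
    interpreters=frozenset({"ann", "bob"}),
    signs=frozenset({"d1"}),
    core_relations=frozenset({("t1", "ann", "bob")}),
)
print(check_external_world(toy))
```

Returning a list of violations instead of raising on the first one is a design choice of the sketch; it makes it easier to see which of the requirements a hand-built toy world fails to meet.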
The above-mentioned internal-world function $W$ is defined as follows:
- The relation structure $W[i,t]$ is called the internal world (or information state) of interpreter $i$ at moment $t$
- $U[i] \subset U$ is an infinite set: interpreter $i$'s internal universe (or the set of $i$'s referents, or internal entities); $U[i']$ and $U[i'']$ are disjoint sets if $i'$ and $i''$ are two different interpreters
- in our approach what changes during a given interpreter's lifespan is not his/her referent set $U[i]$ but only the four relations among the (peg-like [12]) referents, listed below, which are called $i$'s internal functions:
- $\sigma[i,t]^{\Pi} : \Pi \times U[i] \to U[i]$ is a partial function: the eventuality function (where $\Pi$ is a set of complex labels characterizing argument types of predicates)¹
- $\alpha[i,t]^{\Psi} : \Psi \times U[i] \to U[i] \cup U_0$ is another partial function: the anchoring function ($\alpha$ practically identifies referents, and $\Psi$ contains complex labels referring to the grammatical factors legitimizing these identifications)
- $\lambda[i,t]^{\Lambda} : \Lambda \times U[i] \to U[i]$ is a third partial function: the level function (elements of $\Lambda$ are called level labels); the level function is practically intended to capture something similar to the "box hierarchy" among referents in complex Kampian DRS boxes [10] enriched with some rhetorical hierarchy in the style of SDRT [2]
- $\kappa[i,t]^{K} : K \to U[i]$ is also a partial function: the cursor, which points to certain temporary reference points prominently relevant to the interpreter such as "Now", "Here", "Ego", "Then", "There", "You" etc.

¹ The DRS condition $[e: p\ t\ r_1 \ldots r_K]$ [10] (e.g. [e: resemble now Peter Paul]) can be formulated with the aid of this function as follows (with $i$ and $t$ fixed): $\sigma(\langle \mathrm{Pred}, \varphi \rangle, e) = p$, $\sigma(\langle \mathrm{Temp}, \tau \rangle, e) = t$, $\sigma(\langle \mathrm{Arg}, \psi_1 \rangle, e) = r_1$, ..., $\sigma(\langle \mathrm{Arg}, \psi_K \rangle, e) = r_K$.
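A minimal sketch of an information state may help to fix the idea: each of the four internal functions is a partial function, so a dictionary that is simply silent on undefined arguments is a natural toy representation. The class and function names are hypothetical, and the labels are simplified to pairs such as ("Arg", 2), an assumption that ignores the richer argument-type labels of the definition. The example stores and reads back the DRS condition of footnote 1.

```python
from dataclasses import dataclass, field


@dataclass
class InformationState:
    """Toy W[i,t]: the four partial functions stored as dictionaries."""
    sigma: dict = field(default_factory=dict)  # eventuality: (label, referent) -> referent
    alpha: dict = field(default_factory=dict)  # anchoring: (label, referent) -> referent or entity
    lam: dict = field(default_factory=dict)    # level: (label, referent) -> referent
    kappa: dict = field(default_factory=dict)  # cursor: label -> referent


def add_drs_condition(state, e, predicate, time, *args):
    """Store the condition [e: predicate time arg_1 ... arg_K] as values of sigma."""
    state.sigma[(("Pred", 0), e)] = predicate
    state.sigma[(("Temp", 0), e)] = time
    for k, referent in enumerate(args, start=1):
        state.sigma[(("Arg", k), e)] = referent


def read_drs_condition(state, e):
    """Rebuild the condition row of eventuality referent e, or None if sigma is undefined there."""
    if (("Pred", 0), e) not in state.sigma:
        return None
    row = [e, state.sigma[(("Pred", 0), e)], state.sigma.get((("Temp", 0), e))]
    k = 1
    while (("Arg", k), e) in state.sigma:
        row.append(state.sigma[(("Arg", k), e)])
        k += 1
    return row


state = InformationState()
add_drs_condition(state, "e", "resemble", "now", "Peter", "Paul")
print(read_drs_condition(state, "e"))
```

Nothing in the sketch distinguishes referents from external entities; in the definition the values of $\sigma$ are referents of the interpreter's internal universe, which are linked to external entities only through the anchoring function.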
The temporary states of these four internal functions above an interpreter's internal universe (which meet further requirements that cannot be elaborated here) serve as his/her "agent model" [11] in the process of (static and dynamic) interpretation.

Suppose the information structure $A$ of the external world (defined above as a part of model $\Re = \langle U, W_0, W \rangle$) contains the following infon: $\iota = \langle \mathit{perceive}, t, i, j, d, s \rangle$, where $i$ and $j$ are interpreters, $t$ is a point of time, $s$ is a spatial entity, $d$ is a discourse (chunk), and $\mathit{perceive}$ is a distinguished core relation (i.e. an element of $Q$). The interpretation of this "perceived" discourse $d$ can be defined in our model relative to an external world $W_0$ and an internal world $W[i,t]$.

The dynamic interpretation of discourse $d$ is essentially a mapping from $W[i,t]$, which is a temporary information state of interpreter $i$, to another (potential) information state of the same interpreter that is an extension of $W[i,t]$. This practically means that the above-mentioned four internal functions ($\sigma$, $\alpha$, $\lambda$, $\kappa$) are to be developed monotonically by simultaneous recursion, expressing the addition of the information stored by discourse $d$ to that stored in $W[i,t]$.

The new value of eventuality function $\sigma$ chiefly depends on the lexical items retrieved from the interpreter's internal mental lexicon as a result of the perception and recognition of the words / morphemes of the interpreter's mother tongue in discourse $d$. This process of the unification of lexical items can be regarded as the first phase of the dynamic interpretation of (a sentence of) $d$. In our ReALIS framework, as will be shown in the next section, extending function $\sigma$ corresponds to the process of accumulating DRS condition rows [17] containing referents which are all still regarded as different from each other.
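The requirement that the internal functions are developed monotonically can be illustrated as follows. The sketch assumes that an information state is a dictionary holding one dictionary per internal function; this representation and both function names are hypothetical. An extension may define a function on new arguments but may never change a value that is already stored.

```python
FUNCTION_NAMES = ("sigma", "alpha", "lambda", "kappa")


def extend_partial_function(old, new_entries):
    """Return old enlarged by new_entries; refuse to change an existing value."""
    result = dict(old)
    for key, value in new_entries.items():
        if key in result and result[key] != value:
            raise ValueError(f"non-monotonic update at {key!r}")
        result[key] = value
    return result


def is_extension(new_state, old_state):
    """True if each function of new_state contains the corresponding function of old_state."""
    return all(
        old_state[name].items() <= new_state[name].items()
        for name in FUNCTION_NAMES
    )


before = {
    "sigma": {(("Pred", 0), "e1"): "resemble"},
    "alpha": {},
    "lambda": {},
    "kappa": {"Now": "t0"},
}
after = dict(before)
after["sigma"] = extend_partial_function(before["sigma"], {(("Arg", 1), "e1"): "r1"})
print(is_extension(after, before), is_extension(before, after))
```

The check is deliberately symmetric across the four functions, following the wording of the text; how the cursor in particular is kept monotonic is not spelled out in this summary and is therefore not modelled here.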
It will be the next phase of dynamic interpretation to anchor these referents to each other (by function $\alpha$) on the basis of the different grammatical relations which can be established due to the recognized order of morphs / words in discourse $d$ and the case, agreement and other markers it contains. In our approach two referents are never identified with each other (or deleted); they are only anchored to each other. This anchoring, however, essentially corresponds to the identification of referents in DRSs.

The third phase in this simplified description of the process of dynamic interpretation concerns the third internal function, $\lambda$, the level function. This function is responsible for the expression of intra- and inter-sentential scope hierarchy [21] / information structure [23] / rhetorical structure [9], including the embedding of sentences, one after the other, in the currently given information state by means of rhetorical relations more or less in the way suggested in SDRT [9].

It is to be mentioned at this point that the information-state changing dynamic interpretation and the truth-value calculating static interpretation are mutually based upon each other. On the one hand, static interpretation operates on the representation of sentences (of discourses), which is nothing else but the output of dynamic interpretation. On the other hand, however, the above-discussed phases of dynamic interpretation (and chiefly the third phase) include subprocesses requiring static interpretation: certain presuppositions are to be verified [17].

The interpreter's fourth internal function, cursor $\kappa$, plays certain roles during the whole process of dynamic interpretation. Aspect, for instance, can be captured in our approach as the resetting or retaining of the temporal cursor value as a result of the interpretation of a sentence ($\to$ non-progressive / progressive aspect).
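The treatment of aspect through the cursor can be illustrated with a deliberately small sketch. It rests on two assumptions that go beyond the text: that the temporal cursor in question is the one labelled "Then", and that "resetting" means pointing it to the time referent of the newly interpreted eventuality. The function name `output_cursor` is hypothetical.

```python
def output_cursor(input_cursor, event_time, progressive):
    """Return the cursor values after one sentence has been interpreted.

    Assumption of this sketch: a non-progressive sentence resets the temporal
    cursor "Then" to the time referent of the new eventuality, while a
    progressive sentence retains the input value.
    """
    result = dict(input_cursor)  # all other reference points are carried over
    if not progressive:
        result["Then"] = event_time
    return result


start = {"Now": "t0", "Then": "t0", "Ego": "i"}
after_first = output_cursor(start, "t1", progressive=False)         # resets "Then"
after_second = output_cursor(after_first, "t2", progressive=True)   # retains "Then"
print(after_first["Then"], after_second["Then"])
```

The function returns new cursor values instead of modifying its input, which mirrors the distinction between input and output cursor values drawn in the text.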
It can be said in general that the input cursor values have a considerable effect on the embedding of the "new information" carried by a sentence in the interpreter's current information state, and this embedding in turn affects the output cursor values.

Dynamic interpretation in a ReALIS model $\Re = \langle U, W_0, W \rangle$, thus, is a partial function $\mathrm{Dyn}$ which maps a (potential) information state $W^{\circ}$ to a discourse $d$ and an information state $W[i,t]$ (of an interpreter $i$): $\mathrm{Dyn}(d): \langle \Re, W[i,t] \rangle \mapsto$ [...]

[...] 1), there are also alternative ways of satisfaction at our disposal, typically with reference to higher ranked requirements (e.g. (1i.row3)). Requirement (1h.row3), thus, can be satisfied (2x), but indirectly, due to (2z).

It is also typical that requirements concerning word order can be satisfied indirectly. There are five lexical items in the example that contain requirements demanding that a certain word immediately precede the common noun 'champion' (see (1c-g)). The adjective expressing nationality is required to be adjacent to the noun to the highest degree: rank '+1' expresses this fact in (1f.5). The fraction rank in (1g.5) implies an even stricter neighbourhood, but this is to be realized within one word in Hungarian (úszóbajnok 'swimming champion'). The other adjective, referring to a personal characteristic, 'tall', should remain before 'German' because of its rank number '2' in (1e.5). Then the weaker ranks '5' in the lexical item of the definite article (1d.7) and '6' in that of the demonstrative pronoun (1c.2) lead to the following grammatical word order in the prenominal zone in question:

arra a magas német úszóbajnokra
'that' 'the' 'tall' 'German' 'swimming' 'champion'

Alternative orders are ill-formed.
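The effect of the ranks quoted above can be sketched in a few lines. The rank numbers are the ones given in the text; everything else is an assumption of the sketch: the table and function names are hypothetical, a smaller number is taken to mean a stronger requirement, and a requirement counts as indirectly satisfied when every intervening word carries a stronger requirement of its own.

```python
# Ranks of the "immediately precede the noun" requirements quoted in the text;
# a smaller number is read as a stronger (higher ranked) requirement.
PRECEDE_NOUN_RANK = {"arra": 6, "a": 5, "magas": 2, "német": 1}


def prenominal_order(words):
    """Order the words so that stronger requirements end up closer to the noun."""
    return sorted(words, key=lambda w: PRECEDE_NOUN_RANK[w], reverse=True)


def adjacency_satisfied(sequence, word, noun):
    """Check the requirement of one word, directly or indirectly.

    Directly: the word is adjacent to the noun. Indirectly: every word standing
    between them has a requirement of a stronger rank.
    """
    i, j = sequence.index(word), sequence.index(noun)
    if i >= j:
        return False
    k = PRECEDE_NOUN_RANK[word]
    between = sequence[i + 1:j]
    return all(PRECEDE_NOUN_RANK.get(w, float("inf")) < k for w in between)


def well_formed(sequence, noun):
    """True if the requirement of every ranked word in the sequence is satisfied."""
    return all(
        adjacency_satisfied(sequence, w, noun)
        for w in sequence
        if w in PRECEDE_NOUN_RANK
    )


noun = "úszóbajnokra"
good = prenominal_order(["magas", "arra", "német", "a"]) + [noun]
print(good, well_formed(good, noun))
print(well_formed(["a", "arra", "magas", "német", noun], noun))
```

Under these assumptions the attested order is the only permutation of the four prenominal words that passes the check, since the ranks are pairwise different; the sketch does not model the fraction rank that keeps 'swimming' inside the compound noun.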
The explanation relies on the ranks discussed above: an adjacency requirement of rank $k$ concerning words $w'$ and $w''$ can be regarded as satisfied if $w'$ is indeed adjacent to $w''$, or if each word between $w'$ and $w''$ is such that the requirement demanding its being there is of a higher rank $n$ (n