Elektrotehniški vestnik 81(1-2): 27-32, 2014 Original Scientific Paper Attributed Context-Sensitive Graph Grammars Luka FUrst University of Ljubljana, Faculty of Computer and Information Science, Tržaška cesta 25, SI-1000 Ljubljana, Slovenia E-mail: luka.fuerst@fri.uni-lj.si Abstract. The paper introduces a concept of attributed context-sensitive graph grammars. The graph grammars are a graphical generalization of the textual grammars and can thus be used to specify the syntax of graphical programming or modeling languages. The attributed graph grammars extend the basic graph grammars with definitions of attributes and the associated attribute evaluation rules. By analogy to the attributed textual grammars, the purpose of the attributes and the rules is to define the semantic elements of a graphical language. The introduced concept is illustrated by an example of a grammar-driven conversion of flowcharts to the equivalent C code. The presented example might find its use in introductory programming courses. Keywords: graph grammar, graphical language, visual language, semantics, attributed grammar Kontekstno odvisne grafne gramatike s prilastki Clanek uvede koncept kontekstno odvisnih grafnih gramatik s prilastki. Grafne gramatike so grafna posplošitev tekstovnih gramatik, zato jih je mogoče uporabljati za podajanje sintakse grafnih programskih ali modelirnih jezikov. Grafne gramatike s prilastki razširjajo osnovne grafne gramatike z opredelitvijo prilastkov in pravil za računanje njihovih vrednosti. Po analogiji z besedilnimi gramatikami s prilastki je cilj prilastkov in pravil opredelitev semantičnih prvin grafnega jezika. Predstavljeni koncept ponazorimo s primerom grafnogramaticšne pretvorbe diagramov poteka v enakovredno programsko kodo v jeziku C. Prikazani primer je potencialno uporaben na uvodnih tečajih programiranja. 1 Introduction The set of all syntactically valid programs in a textual programming language such as Pascal or C can be specified by a textual grammar, i.e., a formal system for generating textual strings based on a set of string replacement rules (productions). However, a grammar does not suffice for representing semantic elements such as variable scope or loop behavior. Such elements can be defined by attributed grammars [1], [2], a formalism that extends the ordinary grammars by a set of attributes assigned to individual grammar symbols and a set of production-specific rules for evaluating the attributes. In this paper, we present how the attributes and attribute evaluation rules can be formulated within a graph Received 6 January 2014 Accepted 11 February 2014 grammar framework. The graph grammars are formal systems for generating graph sets and can thus be used for specifying the syntax of graphical languages [3]. We focus on the context-sensitive graph grammars, since they can represent a broader set of graphical languages than their context-free counterparts [4]. The attributed graph grammars remain an under-researched area. Their applications have been scarce; let us mention image parsing [5], music recognition [6] and graph drawing [7]. However, none of these approaches employs the concept of the attributed context-sensitive graph grammars in an analogous way to the attributed textual grammars. In our recent paper [8], we did use such a concept, but without a formal treatment. In this paper, we define the attributed context-sensitive graph grammars formally and provide a clarification example that might be applicable to introductory programming courses. The rest of this paper is structured as follows. In Section 2, we define the context-sensitive graph grammars. In Section 3, we present our running example. Section 4 introduces the attributed context-sensitive graph grammars, and Section 5 concludes the paper. 2 Context-Sensitive Graph Grammars For a graph G, let V [G], E [G], and X [G] = V [G] [G] denote the sets of its vertices, edges, and elements, respectively. Let s(e) and t(e) denote the source vertex and the target vertex, respectively, of an edge e £ E [G]. Let l(x) denote the label of an element x £ X [G]. A graph G is a subgraph of a graph H (denoted G Q H) if V[G] C V[H] and E[G] C E[H]. A homomorphism h: G ^ H is a vertex-to-vertex and edge-to-edge map- ping that preserves labels and adjacencies: l(h(x)) = l(x), s(h(e)) = h(s(e)), and t(h(e)) = h(t(e)) has to hold for each x e X [G] and e e E [Gj. A monomor-phism is an injective homomorphism. An isomorphism is a bijective homomorphism whose inverse is also a homomorphism. A context-sensitive graph production p is a rule of the form L ::= R, where L and R are graphs with a possible common subgraph C (C Q L and C Q R). The graphs L and R are called the left-hand side (LHS) and the right-hand side (RHS), respectively, of the production p (denoted Lhs [p] and Rhs [p]). The graph C is called the context of p (denoted Common[p]). Let X[p] = X[Lhs[p]] U X[Rhs[p]]. A context-sensitive graph grammar GG is a quadruple (N, T, P, A), where N is a set of nonterminal labels, T is a set of terminal labels, P is a set of context-sensitive graph productions, and A is a set of graphs called axioms. Each graph element in P and A has to be labeled by a label from either N or T. To apply a context-sensitive graph production p to a graph G, perform the following steps: 1) Find a monomorphism h: Lhs [p] ^ G. Let L' = h(Lhs[p]) Q G and C' = h(Common[p]) Q L' denote the subgraphs of G corresponding to Lhs[p] and Common[p], respectively. 2) Remove all elements L' \ C from G. 3) Add copies of the elements Rhs [p] \ Common [p] to the resulting graph. The added elements have to be attached to the elements of C' in the same way as the elements of Rhs [p] \ Common [p] are attached to the elements of Common[p]. The notation G ^ G' signifies that the graph G' is the result of applying the production p to the graph G. A derivation of a graph G in a context-sensitive graph f-lf-l ■ A P1 ^ P2 ^ Pk grammar GG is a sequence A —> Gi —> G2... —> Gk = G, where A e A[GG] and pi, ..., pk e P[GG]. A graph G for which a derivation exists is derivable in GG. The language of a grammar GG (denoted L(GG)) is a set of all terminally labeled graphs derivable in GG. (A graph is terminally labeled if all its elements are labeled by labels from T[GG].) A parser is an algorithm that, for a graph G and a grammar GG, determines whether G e L( GG) and produces a derivation of G in GG if this is the case. A parser for the context-sensitive graph grammars was introduced by Rekers and Schurr [4] and improved by Fuerst et al. [9]. 3 Running Example A grammar of flowcharts denoted GGFC will serve as a running example for the rest of this paper. For the grammar GGfc, N[GGfc] = {a} and T[GGfc] = {o, Cond, PrimStat, T, F, }, where o and | denote the virtual 'labels' of the unlabeled vertices and edges, respectively. The productions of GGFC are displayed in the left column of Table 1; the right column will be explained later. For an easier reference, many graph elements are tagged by unique IDs (shown in square brackets, e.g., VI or [e2]). The elements occurring on both sides of the individual productions, i.e., the elements of the Common sets, are grayed. These elements should actually have the same ID on both sides. However, since we will later have to distinguish between the LHS and RHS occurrences of the same element, we use primes (vi) for the RHS occurrences. The grammar GGFC contains a single axiom: this is the graph on the LHS of production pi. In the productions of the grammar GGFC, the a-labeled edges represent graphical statements; in a derivation of a flowchart, every edge a develops into an elementary or composite graphical statement of a flowchart. The vertices o (i.e., the unlabeled ones) represent the endpoints of individual statements. The PrimStat vertices represent primitive statements (non-control-flow ones), and the Cond vertices represent conditions. The edges T and F represent the 'true' and 'false' outcomes, respectively, of the associated conditions. The axiom of the grammar GGFC thus represents the fact that a flowchart comprises a single graphical statement. A graphical statement can take the form of a sequence of statements (production pi ), a primitive statement (p2), an if-then statement (p3), an if-then-else statement (p4), a while statement (p5), a do-while statement (p6), or a goto statement (p7). Since we are only interested in the control-flow statements, we grouped all the other types of the statements (assignments, reads, writes, etc.) under the 'primitive statements' category. Although the goto statement has been considered 'harmful' ever since the famous paper by Dijkstra [10], there are generally no objections against its use in flowcharts. In addition, as will be shown later, goto makes it possible to define a meaningful inherited attribute. A sample derivation in the grammar GGFC is shown in Fig. 1. 4 Attributed Context-Sensitive Graph Grammars We shall now introduce the central concept of this paper. An attributed context-sensitive graph grammar is a tuple (GG, B, R), where GG is a context-sensitive graph grammar, and the sets B and R are defined as follows: • B is a set of triples (p, x, A), where p e P[GG], x e X[p], and A is an attribute, a named variable assigned to a given graph element in a given production. An attribute A for a graph element x in a given production is denoted x.A. Let B(x) denote the set of all attributes associated with the graph element x. Table 1. Productions and attribute evaluation rules of the sample grammar for the flowcharts language. Production Attribute evaluation rules QM OKI M Pi a ::= Ô encode = e2.code © e3.code on [es] Ohl o h'] P4 ON On encode = (if isEmpty(vi.label) then e else v[.label © ':') © 'if (' © V3.code © ') {' © e2.code © '}' © 'else {' © e3.code © '}' P5 a QN M on ei.code = (if isEmpty(v1.label) then e else vi .'label © ':') © 'while (' © V3.code © ') {' © e2.code © '}' P6 K'lO )M )N e1.code = (if isEmpty(v1.label) then e else v'.label © ':') © 'do {' © e2.code © '} while (' © v3.code © ');' ok'] o of^o P7 aj[ej N] ::= M™ Nl ÔN1 on] v'3.label = if isEmpty(v3.label) then newLabel() else v3.label ei.code = 'goto ' © v'3.label © ';' ON Pi, elJ h] M f ^r^ t -one )[«J PrimStat M Pa ^> K] [«J a , „ CConp-^O K W [«2] Figure 1. Sample derivation in the GGFc grammar. • R is a set of pairs (p, R), where p G P[GG] and R is an attribute evaluation rule. An attribute evaluation rule for a production p is a rule of the form x0.A0 = f (xi.Ai, ..., xk.Ak), where x0, xi, ..., xk G X[p] and Ai G B(xi) for all i G {0, 1, ..., k}. The function f is called semantic function. By analogy to the attributed textual grammars [2], the set of attributes for each production graph element can be partitioned into a set of synthesized attributes and a set of inherited attributes. The synthesized attributes are those associated with the graph elements on the production LHSs, while the inherited ones are those associated with the RHS elements. The set of the attribute evaluation rules must not contain any circular dependencies. If at least two attributes directly or indirectly depend on each other, their values cannot be computed. The purpose of the attributes is to carry semantic information. In the usual case of compiling programs defined by the context-free grammars, the inherited attributes carry information about various symbols (variables, procedure declarations, etc.), while the synthesized attributes hold (intermediary) translations of the program or its parts. To obtain the 'semantics' of the program, the attributes have to be evaluated. In the case of the context-free grammars, the attributes are evaluated during the traversal of a derivation tree of the given program; the inherited attributes are evaluated on the way from the root to the leaves, while the synthesized ones are evaluated in the opposite direction. In the case of the context-sensitive graph grammars, derivation trees cannot be defined, however, since the production LHSs may contain more than one graph element. For this reason, the attributes can only be evaluated by traversing a derivation sequence of the input graph. The inherited attributes are evaluated during the left-to-right traversals of the sequence (from an axiom to the input graph), while the synthesized ones are evaluated during the right-to-left traversals. In general, several traversals may be required to evaluate all the attributes. Let us formally describe a single attribute evaluation step. Let p be a production, and let X[Lhs[p]] = {a1, ..., ai} and X[Rhs[p]] = {b1, ..., br}. Although the context elements occur on both sides of the production, we distinguish between their LHS and RHS occurrences. In a derivation step G A H, let L C G denote the subgraph corresponding to Lhs [p], and let R C H denote the subgraph corresponding to Rhs [p]. Let X[L] = {xi,..., xi} and X[R] = {y1,..., yr} such that xi (1 < i < l) corresponds to ai and yj (1 < j < r) corresponds to bj. For each j G {1, ..., r}, the following procedure is executed: if there is an inherited attribute A G B(bj) and a rule bj .A = f (aSl .Atl, ..., asl .Atl), and if all of xSl .Atl, ..., xsl .Atl have already been evaluated, then evaluate yj .A as f (xSl .Atl, ..., xSl .Atl). The synthesized attributes are evaluated in the same fashion, only the LHS and RHS are reversed. Let us proceed to our running example. The goal is to translate an arbitrary flowchart belonging to the language of the GGFC grammar to the equivalent C code. We define the following two attributes: • code: This synthesized attribute is assigned to each a-labeled edge in the production set. It holds the C translation of the graphical statement derived from the associated edge a. The code attribute for the edge in the axiom graph holds the C translation of the entire flowchart. • label: This inherited attribute is assigned to each empty vertex in the production set. At the beginning of the attribute evaluation procedure on a given flowchart, the value of the label attribute is initialized to an empty string for each empty vertex in the flowchart. We do not translate the primitive statements and conditions, since we are interested only in the control flow. We merely assume that each PrimStat vertex and each Cond vertex has an associated code attribute that contains the C code for the primitive statement or condition represented by that vertex. The attribute evaluation rules for individual productions are shown in the right column of Table 1. The rules employ the following special operators and functions: • © denotes string concatenation (a © bc = abc); • (if C then E1 else E2) equals E1 if the condition C is satisfied, and E2 if this is not the case; • isEmpty(s) returns true iff the string s is empty; • newLabel () returns a unique label; • e represents an empty string. Let us first consider the expressions involving the code attribute without paying attention to the label attributes. The C code for a sequence of statements (production p1) is obtained by concatenating the translations of individual constituents. The code of a primitive statement is just the code associated with the vertex PrimStat. The code of a graphical if-then statement is obtained by embedding the code associated with the vertex Cond and with the edge a on the production RHS into the if (...) {...} framework. The code for the graphical if-then-else, while, and do-while statements is obtained in a similar fashion. The code for the graphical goto statement is goto b, where b is a unique label assigned to the target vertex of the goto jump. The code attributes are evaluated in the second (right-to-left) pass over the derivation of a given flowchart. During the first (left-to-right) pass, the inherited label attributes are evaluated. Every empty vertex that is the target of at least one goto statement receives a unique label. (The if-then expression within the v3.label := ... statement ensures that a label is reused if it already exists.) By the end of the first pass, all the goto targets have received the unique labels. When computing the Table 2. Attribute evaluation procedure for the derivation of Fig. 1. Derivation step Evaluation 6 V5. label = a1 8 es.code = V14 7 e3.code = a1: while (V12) { V14 } 6 e7.code = goto a1; 5 eö.code = if (Vi0) { goto a1; } 4 e5. code = Vg 3 e4.code = Vg if (V10) { goto a1; } 2 e2.code = do { Vg if (Vi0) { goto a1; } } while (V7); 1 ei.code = if (V3) { a1: while (V12) { V14 } } else { do { Vg if (Vi0) { goto a1; } } while (V7); } code attribute for a statement, we check whether the empty vertex marking the beginning of the statement has a nonempty label; if it has, it must be the target of at least one goto, and so we have to prepend the label and a colon to the code for the statement. Table 2 shows the attribute evaluation procedure for the derivation of Fig. 1. To simplify the notation, we write vi instead of Vi.code. At the end, the code attribute for the edge in the axiom graph (ei.code) contains the translation of the entire flowchart. 5 Conclusion We introduce a concept of the attributed context-sensitive graph grammars as a natural extension of the attributed textual grammars. If the syntax of a graph language is specified by a graph grammar, its semantics can be defined by the attributes attached to individual grammar elements and by the rules for evaluating the attributes during the traversal of a derivation of a given graph. As an example, we define a graph grammar of flowcharts and specify a set of the attributes and attribute evaluation rules used in translating individual flowcharts into the C code. The 'semantics' of a particular flowchart is thus defined as the C code that represents the same control flow as the flowchart. The presented example can be incorporated into a visual tool. Such a tool can be beneficial in introductory programming courses, where control structures are often presented by flowcharts. By translating the flowcharts to an equivalent textual code, the instructor can show the connection between a graphical and a textual representation of the control flow. Recently, we proposed a novel method to convert metamodels (a declarative formalism for defining graph sets) to the equivalent context-sensitive graph grammars [8]. We showed that the semantics of models conforming to a given metamodel can be defined by an attributed graph grammar. However, the concept of the model semantics is as-yet underexplored. Our future efforts might thus be directed towards strengthening the connection between the (meta)models, graph grammars and formalisms for specifying semantics. References [1] D. E. Knuth. Semantics of context-free languages. Mathematical Systems Theory, 2(2):127-145, 1968. [2] J. Paakki. Attribute grammar paradigms—a high-level methodology in language implementation. ACM Computing Surveys, 27(2):196-255, 1995. [3] H. Ehrig, G. Engels, H.-J. Kreowski, G. Rozenberg, and U. Mon-tanari, editors. Handbook of graph grammars and computing by graph transformation (Vols. 1.-3.). World Scientific, 1997-1999. [4] J. Rekers and A. Schurr. Defining and parsing visual languages with Layered Graph Grammars. Journal of Visual Languages and Computing, 8(1):27-55, 1997. [5] F. Han and S. Chun Zhu. Bottom-up/top-down image parsing with attribute grammar. IEEE Trans. Pattern Anal. Mach. Intell., 31(1):59-73, 2009. [6] S. Baumann. A simplified attributed graph grammar for highlevel music recognition. In Intl. Conf. on Document Analysis and Recognition, Montreal, Canada, pages 1080-1083. IEEE, 1995. [7] G. Zinßmeister and C. McCreary. Drawing graphs with attribute graph grammars. In Graph Grammars and Their Applications to Computer Science, Williamsburg, VA, USA, pages 443-453. Springer, 1994. [8] L. Fürst, M. Mernik, and V. Mahnic. Converting metamodels to graph grammars: doing without advanced graph grammar features. Software and Systems Modeling. (to be published; DOI: 10.1007/s10270-013-0380-2). [9] L. Fürst, M. Mernik, and V. Mahnic. Improving the graph grammar parser of Rekers and Schürr. IET Software, 5(2):246-261, 2011. [10] E. W. Dijkstra. Letters to the editor: go to statement considered harmful. Commun. ACM, 11(3):147-148, 1968. Luka Fiirst received his B.Sc., M.Sc., and Ph.D. degrees from the Faculty of Computer and Information Science of the University of Ljubljana in 2004, 2007 and 2013, respectively. Since 2004, he has been employed with the same faculty, where he is currently working as a teaching assistant. His research interests include graph grammars, machine learning and graph algorithms in general.