Informatica
An International Journal of Computing and Informatics

Special Issue: Theoretical Computer Science
Guest Editors: Boštjan Vilfan, Roberto Grossi
The Slovene Society Informatika, Ljubljana, Slovenia

EDITORIAL BOARDS, PUBLISHING COUNCIL

Informatica is a journal primarily covering the European computer science and informatics community: scientific and educational as well as technical, commercial and industrial. Its basic aim is to enhance communications between different European structures on the basis of equal rights and international refereeing. It publishes scientific papers accepted by at least two referees outside the author's country. In addition, it contains information about conferences, opinions, critical examinations of existing publications, and news. Finally, major practical achievements and innovations in the computer and information industry are presented through commercial publications as well as through independent evaluations.

Editing and refereeing are distributed. Each editor from the Editorial Board can conduct the refereeing process by appointing two new referees or referees from the Board of Referees or Editorial Board. Referees should not be from the author's country. If new referees are appointed, their names will appear in the list of referees. Each paper bears the name of the editor who appointed the referees. Each editor can propose new members for the Editorial Board or referees. Editors and referees inactive for a longer period can be automatically replaced. Changes in the Editorial Board are confirmed by the Executive Editors.

The necessary coordination is handled by the Executive Editors, who examine the reviews, sort the accepted articles, and maintain appropriate international distribution. The Executive Board is appointed by the Society Informatika. Informatica is partially supported by the Slovenian Ministry of Science and Technology. Each author is guaranteed to receive the reviews of his article. When accepted, publication in Informatica is guaranteed in less than one year after the Executive Editors receive the corrected version of the article.

Executive Editor - Editor in Chief
Anton P. Železnikar, Volaričeva 8, Ljubljana, Slovenia
s51em@lea.hamradio.si, http://lea.hamradio.si/~s51em/

Executive Associate Editor (Contact Person)
Matjaž Gams, Jožef Stefan Institute, Jamova 39, 1000 Ljubljana, Slovenia
Phone: +386 1 4773 900, Fax: +386 1 219 385
matjaz.gams@ijs.si, http://ai.ijs.si/mezi/matjaz.html

Executive Associate Editor (Technical Editor)
Drago Torkar, Jožef Stefan Institute, Jamova 39, 1000 Ljubljana, Slovenia
Phone: +386 1 4773 900, Fax: +386 1 219 385
drago.torkar@ijs.si

Rudi Murn, Jožef Stefan Institute

Publishing Council: Tomaž Banovec, Ciril Baškovič, Andrej Jerman-Blažič, Jožko Čuk, Vladislav Rajkovič

Board of Advisors: Ivan Bratko, Marko Jagodič, Tomaž Pisanski, Stanko Strmčnik

Editorial Board: Suad Alagić (Bosnia and Herzegovina), Vladimir Bajić (Republic of South Africa), Vladimir Batagelj (Slovenia), Francesco Bergadano (Italy), Leon Birnbaum (Romania), Marco Botta (Italy), Pavel Brazdil (Portugal), Andrej Brodnik (Slovenia), Ivan Bruha (Canada), Se Woo Cheon (Korea), Hubert L. Dreyfus (USA), Jozo Dujmović (USA), Johann Eder (Austria), Vladimir Fomichov (Russia), Georg Gottlob (Austria), Janez Grad (Slovenia), Francis Heylighen (Belgium), Hiroaki Kitano (Japan), Igor Kononenko (Slovenia), Miroslav Kubat (USA), Ante Lauc (Croatia), Jadran Lenarčič (Slovenia), Huan Liu (Singapore), Ramon L.
de Mantaras (Spain), Magoroh Maruyama (Japan), Nikos Mastorakis (Greece), Angelo Montanari (Italy), Igor Mozetič (Austria), Stephen Muggleton (UK), Pavol Návrat (Slovakia), Jerzy R. Nawrocki (Poland), Roumen Nikolov (Bulgaria), Franc Novak (Slovenia), Marcin Paprzycki (USA), Oliver Popov (Macedonia), Karl H. Pribram (USA), Luc De Raedt (Belgium), Dejan Raković (Yugoslavia), Jean Ramaekers (Belgium), Wilhelm Rossak (USA), Ivan Rozman (Slovenia), Claude Sammut (Australia), Sugata Sanyal (India), Walter Schempp (Germany), Johannes Schwinn (Germany), Zhongzhi Shi (China), Branko Souček (Italy), Oliviero Stock (Italy), Petra Stoerig (Germany), Jiří Šlechta (UK), Gheorghe Tecuci (USA), Robert Trappl (Austria), Terry Winograd (USA), Stefan Wrobel (Germany), Xindong Wu (Australia)

From the editors

This special issue of Informatica is devoted to Theoretical Computer Science and contains a selection of papers derived from presentations at the conference Theoretical Computer Science 2003, which took place on October 16, 2003 within the framework of the multiconference Information Society 2003 (October 12-17, 2003, at the Jožef Stefan Institute, Ljubljana, Slovenia). The conference was the inaugural conference on its subject in Slovenia and is expected to continue in the future on an annual or biennial schedule. The 2003 event attracted several interesting contributions from diverse areas of Theoretical Computer Science, and some were selected for inclusion in this issue.

The area of the Analysis of Algorithms is represented by the paper by Suzuki and Ibaraki, An Average Running Time Analysis of a Backtracking Algorithm to Calculate the Size of the Union of Cartesian Products. The paper provides an analysis of a backtracking algorithm for the problem of calculating the size of the set ∪_{i=1..n} S_i1 × ⋯ × S_im, where the S_ij are finite sets of integers and m denotes the dimension of the space. The paper complements earlier work by the same authors and thus provides more detailed information on the problem.

The area of Graphs and Visualization is represented by the papers Bokal, Juvan, and Mohar: A Spectral Approach to Graphical Representation of Data, and Orbanić, Boben, Jaklič, and Pisanski: Algorithms for Drawing Polyhedra from 3-Connected Planar Graphs. The first considers a heuristic algorithm for the following problem: given a weighted graph, find positions (x_i, y_i) of the nodes such that the relative differences between the internode distances and the edge weights are minimised. The second presents an analysis of Tutte's method of drawing a planar graph, viewed in terms of matrix iteration and Markov chains.

Finally, the area of Formal Languages and Compiler Construction is represented by Mernik, Črepinšek, Kosar, Rebernak, and Žumer: Grammar-Based Systems: Definition and Examples, and Slivnik and Vilfan: Improved Error Recovery in Generated LR Parsers. The first is a guided tour of different applications of context-free grammars and attribute grammars. The paper does not report on novel research results, but it was felt that it gives an interesting and timely introduction to the area. The second discusses an application of a new parsing method (more precisely, an improvement of a recently reported parsing method) which generates the left parse of an input string on the basis of an LR grammar. It is described how this approach can be used to improve error reporting in parsers that are constructed with tools such as Bison, Yacc, etc.
Boštjan Vilfan, University of Ljubljana
Roberto Grossi, University of Pisa

An Average Running Time Analysis of a Backtracking Algorithm to Calculate the Size of the Union of Cartesian Products

Susumu Suzuki, Information Network Engineering, Aichi Institute of Technology, Toyota, 470-0392 Japan, susumu-suzuki@aitech.ac.jp
Toshihide Ibaraki, Graduate School of Informatics, Kyoto University, Kyoto, 606-8501 Japan, ibaraki@i.kyoto-u.ac.jp

Keywords: average running time, backtracking algorithm, Cartesian product

Received: February 7, 2004

Problem SUCP is the problem of calculating the size of the union of n Cartesian products, |∪_{i=1..n} S_i1 × ⋯ × S_im|, where the S_ij are finite sets of integers and m denotes the dimension of the space. SUCP contains as a special case the problem of counting the number of unsatisfying assignments of the satisfiability problem (SAT). Therefore, SUCP is NP-hard and, in further detail, #P-complete. We presented in [7] an exact algorithm to solve SUCP, called the grouping method, and analyzed its average running time. Apart from this, SUCP has hardly been studied so far. In this paper, we analyze the average running time of a backtracking algorithm to solve SUCP. For the analysis, the S_ij are constructed by randomly selecting each element from the set D = {1, 2, ..., d} with probability p. We show that its average running time is O(mnd((4m(−ln p)d / ln(n/(m−1)))^{m−1} lg d + 1)) for n > m, where lg and ln denote log_2 and log_e respectively. For the same instances, the average running time of the grouping method is O(mnd(nd(1 − p) + 1)^{m−1}) and that of the naive method is O(mnd^m). These results indicate that the backtracking algorithm is the most efficient of the three if p and n (> m) are large enough that 4m(−ln p) / ln(n/(m−1)) ≤ min{1, n(1 − p)} holds.

Povzetek: članek opisuje časovno analizo algoritma za izračun velikosti unije kartezičnih produktov.

1 Introduction

Problem SUCP is defined as follows [7]:

SUCP: Given finite sets of integers D_j = {1, 2, ..., d_j} and their subsets S_ij (⊆ D_j), where i = 1, ..., n and j = 1, ..., m, calculate the size of the union of the m-dimensional Cartesian products C_i = S_i1 × ⋯ × S_im; i.e.,

|C_1 ∪ ⋯ ∪ C_n| = |{ (x_1, ..., x_m) | (x_1 ∈ S_11 ∧ ⋯ ∧ x_m ∈ S_1m) ∨ ⋯ ∨ (x_1 ∈ S_n1 ∧ ⋯ ∧ x_m ∈ S_nm) }|.

Example 1 When a problem instance is given by D_1 = D_2 = {1, 2, 3}, S_11 = {1, 2}, S_12 = {1, 2, 3}, S_21 = {2, 3}, S_22 = {1}, S_31 = {1, 3}, S_32 = {1, 3}, the answer is

|C_1 ∪ C_2 ∪ C_3| = |S_11 × S_12 ∪ S_21 × S_22 ∪ S_31 × S_32|
= |{1, 2} × {1, 2, 3} ∪ {2, 3} × {1} ∪ {1, 3} × {1, 3}|
= |{(1,1), (1,2), (1,3), (2,1), (2,2), (2,3)} ∪ {(2,1), (3,1)} ∪ {(1,1), (1,3), (3,1), (3,3)}|
= |{(1,1), (1,2), (1,3), (2,1), (2,2), (2,3), (3,1), (3,3)}| = 8. □

Applications A shop sells three kinds of T-shirts (T-shirt#1, T-shirt#2, T-shirt#3). The sizes of T-shirt#1 are small and medium, and its colors are blue, green and red. Similarly, the sizes of T-shirt#2 are medium and large, and its color is blue. The sizes of T-shirt#3 are small and large, and its colors are blue and red. The measure of the variety of T-shirts sold at the shop can be represented as |{small, medium} × {blue, green, red} ∪ {medium, large} × {blue} ∪ {small, large} × {blue, red}|. That is, the problem of measuring the variety of T-shirts can be represented as SUCP. □
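For small instances the answer can be checked directly by enumerating the union. The following minimal Python sketch (ours, not from the paper) verifies Example 1 by brute force; it is exponential in m and only meant as a sanity check against the algorithms discussed below.

```python
from itertools import product

def sucp_bruteforce(S):
    """Size of the union of Cartesian products.

    S is a list of n instances; each instance is a list of m sets S_ij.
    Enumerates every tuple of each C_i = S_i1 x ... x S_im and counts
    the distinct tuples, i.e. |C_1 u ... u C_n|.
    """
    union = set()
    for Si in S:                      # one Cartesian product per row
        union.update(product(*Si))
    return len(union)

# Example 1 from the text: the answer should be 8.
S = [
    [{1, 2}, {1, 2, 3}],   # C_1 = S_11 x S_12
    [{2, 3}, {1}],         # C_2 = S_21 x S_22
    [{1, 3}, {1, 3}],      # C_3 = S_31 x S_32
]
print(sucp_bruteforce(S))  # -> 8
```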
Problem SUCP contains as a special case the problem of counting the number of unsatisfying assignments of the satisfiability problem (SAT). For example, the problem of counting the number of unsatisfying assignments for the DNF equation (x_1 ∧ ¬x_2 ∧ x_3) ∨ (x_2 ∧ ¬x_3) ∨ (¬x_1 ∧ x_3) = false is expressed as the problem SUCP for d_1 = d_2 = d_3 = 2, C_1 = {2} × {1} × {2}, C_2 = {1, 2} × {2} × {1} and C_3 = {1} × {1, 2} × {2}, understood by regarding "false" as 1 and "true" as 2. Therefore, SUCP is NP-hard and, in further detail, #P-complete [2]. The problem of counting the number of (un)satisfying assignments of SAT has been intensively studied, and the average running times of several algorithms (including backtracking algorithms) have been analyzed [1, 2, 3, 4, 5, 6]. On the other hand, SUCP for general d_j (≥ 2) has not been studied much so far, except for [7], in which we proposed an algorithm called the grouping method and analyzed an upper bound of its average running time. We are interested in algorithms that can solve SUCP with large d_j efficiently.

We explain the grouping method [7] briefly by an example. Let C_1 and C_2 be C_1 = S_11 × S_12 = {1, 2, 3} × {1, 2} and C_2 = S_21 × S_22 = {2, 3} × {1, 2, 3}. Since the decomposition of |C_1 ∪ C_2| over the first coordinate axis,

|C_1 ∪ C_2| = |{1}|·|S_12| + |{2}|·|S_12 ∪ S_22| + |{3}|·|S_12 ∪ S_22|,   (1)

contains the common term S_12 ∪ S_22 in two positions on its right-hand side, it can be aggregated as follows:

|C_1 ∪ C_2| = |{1}|·|S_12| + |{2, 3}|·|S_12 ∪ S_22|.   (2)

The grouping method calculates |C_1 ∪ C_2| efficiently by using the aggregated formula (2) instead of (1). To analyze the average running time, the S_ij are constructed by randomly selecting each element from the set D_j = D = {1, 2, ..., d} with probability p. The average running time of the grouping method is O(mnd(nd(1 − p) + 1)^{m−1}) [7]. In addition to the grouping method, if we consider the naive method, which calculates |C_1 ∪ ⋯ ∪ C_n| by checking whether each cell (x_1, ..., x_m) ∈ D_1 × ⋯ × D_m is contained in C_1 ∪ ⋯ ∪ C_n or not, its average running time is O(mnd^m).

In this paper, we propose a backtracking algorithm for the case of general d_j ≥ 2 (denoted SUCP-BT), which is a generalization of the basic backtracking algorithm for SAT in [6] (denoted SAT-BT), and analyze its average running time (denoted T), where problem instances are generated in the same manner as above. We follow the analysis procedure in [6]. That is, we first obtain a formula (7) (using (5) and (6)) including summation operations that expresses an upper bound of the average running time T, which is a generalization of the corresponding formula for SAT-BT in [6], and then transform formula (7) into a simpler formula (23) approximately. However, formula (7) for SUCP-BT is more complicated than the formula for SAT-BT in [6], and our transformation to obtain the simpler formula (23) differs from that in [6]. As a result of the analysis, we show that the average running time T of SUCP-BT is estimated as O(mnd((4m(−ln p)d / ln(n/(m−1)))^{m−1} lg d + 1)) for n > m (see (23)), where lg and ln denote log_2 and log_e respectively. The result indicates that SUCP-BT is more efficient than the former algorithms (that is, the grouping method and the naive method) if p and n (> m) are large enough that 4m(−ln p) / ln(n/(m−1)) ≤ min{1, n(1 − p)} holds.

The rest of this paper is organized as follows. We present SUCP-BT in Section 2, analyze its average running time in Section 3, and give its worst-case running time in Section 4. Finally, we summarize the paper in Section 5.

2 Backtracking Algorithm

We present SUCP-BT, where [l, r] = {x | l ≤ x ≤ r, x is an integer}.

step1: Put the pair (I_0, m) on top of the empty stack STACK, where I_0 = [1, d_1] × ⋯ × [1, d_m], and set V to 0; V is used to keep the answer |C_1 ∪ ⋯ ∪ C_n|.

step2: If STACK is empty, output V (= |C_1 ∪ ⋯ ∪ C_n|) and halt.

step3: Take the top pair (I, k) = ([l_1, r_1] × ⋯ × [l_m, r_m], k) from STACK. If the m-dimensional sub-domain I is contained in at least one of the Cartesian products C_i, update V by V + |[l_1, r_1]| × ⋯ × |[l_m, r_m]| and return to step2.

step4: If |[l_j, r_j]| = 1 for all j, return to step2. Otherwise, denote by k′ the first j in the order j = k+1, ..., m, 1, ..., k with |[l_j, r_j]| ≥ 2. Divide the m-dimensional sub-domain I into the two m-dimensional sub-domains I′ = [l_1, r_1] × ⋯ × [l_{k′}, c − 1] × ⋯ × [l_m, r_m] and I″ = [l_1, r_1] × ⋯ × [c, r_{k′}] × ⋯ × [l_m, r_m] by splitting the k′-th interval [l_{k′}, r_{k′}] into the two intervals [l_{k′}, c − 1] and [c, r_{k′}], where c = l_{k′} + ⌈(r_{k′} − l_{k′} + 1)/2⌉. Put the pairs (I′, k′) and (I″, k′) on top of STACK and return to step2. □
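Before the worked example, here is a minimal Python sketch of SUCP-BT as described in steps 1-4 (our own reading of the algorithm, not the authors' code; the processing order of the two halves may differ from the paper, but the output does not).

```python
from math import prod

def sucp_bt(S, d):
    """A sketch of SUCP-BT (steps 1-4 above).

    S[i][j] is the set S_ij; d[j] gives D_j = {1, ..., d_j}.
    A sub-domain is a list of intervals (l, r); the companion index k
    (0-based here) remembers which coordinate was split last, so the
    splits rotate over the coordinates as in step4.
    """
    def contained_in_some_product(I):
        # I is contained in C_i iff [l_j, r_j] is a subset of S_ij for all j.
        return any(all(set(range(l, r + 1)) <= Si[j]
                       for j, (l, r) in enumerate(I))
                   for Si in S)

    m = len(d)
    stack = [([(1, dj) for dj in d], m - 1)]       # step1
    V = 0
    while stack:                                   # step2
        I, k = stack.pop()
        if contained_in_some_product(I):           # step3
            V += prod(r - l + 1 for l, r in I)
            continue
        splittable = [j for j in list(range(k + 1, m)) + list(range(k + 1))
                      if I[j][1] - I[j][0] >= 1]
        if not splittable:                         # step4: a unit cell
            continue
        kp = splittable[0]
        l, r = I[kp]
        c = l + (r - l + 2) // 2                   # c = l + ceil((r-l+1)/2)
        stack.append((I[:kp] + [(l, c - 1)] + I[kp + 1:], kp))
        stack.append((I[:kp] + [(c, r)] + I[kp + 1:], kp))
    return V

S = [[{1, 2}, {1, 2, 3}], [{2, 3}, {1}], [{1, 3}, {1, 3}]]
print(sucp_bt(S, [3, 3]))   # Example 1: prints 8
```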
Let us apply SUCP-BT to the SUCP instance of Example 1 in Section 1. Figure 1 shows all of the sub-domains generated by SUCP-BT. Domain I_0 is divided into sub-domains I_1 and I_2 by a split of its first interval [1, 3] (step4), since it is contained in none of the Cartesian products C_1, C_2 and C_3 (step3). Sub-domain I_1 is contained in C_1. Sub-domain I_2 is divided into sub-domains I_3 and I_4 by a split of its second interval [1, 3], and sub-domain I_3 is divided into sub-domains I_5 and I_6 by a split of its second interval [1, 2], since I_2 and I_3 are contained in none of the Cartesian products. Sub-domain I_5 is contained in C_2. Sub-domain I_6 is not divided, since its size is one. Sub-domain I_4 is contained in C_3. As a result, SUCP-BT outputs |C_1 ∪ C_2 ∪ C_3| = |I_1| + |I_4| + |I_5| = 2 × 3 + 1 × 1 + 1 × 1 = 8 in step2.

[Figure 1: All of the sub-domains that the backtracking algorithm SUCP-BT generates for the problem instance of Example 1. The tree of splits: I_0 = [1,3] × [1,3]; a split of the first interval [1,3] gives I_1 = [1,2] × [1,3] (⊆ C_1) and I_2 = [3,3] × [1,3]; a split of the second interval [1,3] of I_2 gives I_3 = [3,3] × [1,2] and I_4 = [3,3] × [3,3] (⊆ C_3); a split of the second interval [1,2] of I_3 gives I_5 = [3,3] × [1,1] (⊆ C_2) and I_6 = [3,3] × [2,2].]

3 Average running time analysis

3.1 An intermediate formula for the average running time

As mentioned in Section 1, the S_ij (i = 1, ..., n, j = 1, ..., m) are constructed by randomly selecting each element from the set D_j = D = {1, 2, ..., d} with probability p. Let us consider the procedure obtained by modifying step3 of SUCP-BT so that the processing goes from step3 to step4 even when a sub-domain I is contained in at least one of the Cartesian products (that is, the procedure obtained by modifying SUCP-BT so that a sub-domain I with |I| ≥ 2 is always divided), and denote by T_1 the tree that shows how this procedure generates sub-domains. The nodes of T_1 are the sub-domains generated by the procedure, the root is the domain [1, d] × ⋯ × [1, d], and the leaves are the sub-domains whose size is one. A node I of T_1 has children I′ and I″ when the procedure divides the sub-domain I into the sub-domains I′ and I″.
T_1 has the following property:

Property 1 The nodes of T_1 that are contained in none of the Cartesian products and whose size is more than one are exactly the sub-domains that are generated and divided by SUCP-BT.

We denote the average running time of SUCP-BT by T, and also write T(d) when it is necessary to specify the value of d. First, we consider the case d = 2^h (h = 1, 2, ...). Then T(2^h) is expressed as follows:

T(2^h) = T_1(2^h)(1 − (1 − p^{2^h m})^n) + Σ_{i=0}^{h−1} Σ_{j=0}^{m−1} T_2(2^h, i, j) 2^{mi+j} (1 − p^{2^{h−i−1} j + 2^{h−i}(m−j)})^n.   (3)

In (3), the factor 1 − (1 − p^{2^h m})^n in the first term is the probability that the domain I_0 is contained in at least one of the Cartesian products, and T_1(2^h) is the average time for SUCP-BT to process I_0 in that case (that is, the average time for SUCP-BT to find that I_0 is contained in at least one of the Cartesian products). The factor (1 − p^{2^{h−i−1} j + 2^{h−i}(m−j)})^n in the second term is the probability that a sub-domain I at depth mi + j in T_1 is contained in none of the Cartesian products; therefore, by Property 1 above, it is the probability that a sub-domain I at depth mi + j in T_1 is generated and divided by SUCP-BT. T_2(2^h, i, j) is the average time for SUCP-BT to process a sub-domain I in that case. T_2(2^h, i, j) includes the time to find that I is contained in none of the Cartesian products and the time to divide I into two sub-domains I′ and I″. Furthermore, if I′ (I″) is a sub-domain that is not divided by SUCP-BT, T_2(2^h, i, j) also includes the time to check whether I′ (I″) is contained in at least one of the Cartesian products or not. T_2(2^h, i, j) consists of these times. Since T_1(2^h) and T_2(2^h, i, j) have the upper bounds

T_1(2^h) = O(2^h mn),  T_2(2^h, i, j) = O((2^{h−i−1} j + 2^{h−i}(m − j)) n),

by substituting these into (3) and replacing 1 − (1 − p^{2^h m})^n with 1, T(2^h) becomes

T(2^h) = O(F(h)),   (4)

where

F(h) = mn2^h + n Σ_{i=0}^{h−1} Σ_{j=0}^{m−1} (2^{h−i−1} j + 2^{h−i}(m − j)) 2^{mi+j} (1 − p^{2^{h−i−1} j + 2^{h−i}(m−j)})^n.   (5)

Next, we consider the case of an arbitrary integer d. Let ⌈lg d⌉ be denoted by a:

a = ⌈lg d⌉.   (6)

Since T (= T(d)) ≤ T(2^a), by changing h in (4) into a, the following upper bound of T is obtained:

T = O(F(a)),   (7)

where a and F(h) are given by (6) and (5), respectively. We use formula (7) as an intermediate formula for an upper bound of the average running time T of SUCP-BT for any integer d.

3.2 An end formula for the average running time

We now transform the upper bound O(F(a)) of (7) into a simpler, approximate formula. F(a) is estimated as follows. Using 2^{h−i−1} j + 2^{h−i}(m − j) ≤ 2^{a−i} m (which also raises each probability factor) and Σ_{j=0}^{m−1} 2^j ≤ 2^m,

F(a) ≤ mn2^a + mn Σ_{i=0}^{a−1} 2^{a−i} 2^{mi} 2^m (1 − p^{m 2^{a−i}})^n.   (8)

By regarding 2^{a−i} in (8) as a real number x (= 2^{a−i}) and writing 2^{mi} = 2^{ma}/x^m, each term of the sum is at most 2^{(1+a)m} (1 − p^{mx})^n / x^{m−1}, whence

F(a) ≤ mn2^a + mna 2^{(1+a)m} · max{ (1 − p^{mx})^n / x^{m−1} | 2 ≤ x ≤ 2^a, x is real }.   (9)

We denote the expression in the braces of (9) by f:

f(x) = (1 − p^{mx})^n / x^{m−1},   (10)

and evaluate the term

max{ f(x) | 2 ≤ x ≤ 2^a, x is real }.   (11)

We extend the domain of x from 2 ≤ x ≤ 2^a to 0 < x. Differentiating f(x) gives

f′(x) = f(x) · (mn(−ln p) / (x(p^{−mx} − 1))) · g(x),   (12)

where g(x) denotes the expression

g(x) = x − (m − 1)(p^{−mx} − 1) / (mn(−ln p)).   (13)

g(x) and its first and second derivatives g′(x) and g″(x) have the following properties:

g(0) = 0,   (14)

lim_{x→∞} g(x) = −∞,   (15)

g′(x) = 1 − ((m − 1)/n) p^{−mx},  g′(0) = 1 − (m − 1)/n,  lim_{x→∞} g′(x) = −∞,   (16)

g″(x) = −(m(m − 1)(−ln p)/n) p^{−mx} < 0.   (17)

We assume that n > m, i.e., g′(0) > 0. From g′(0) > 0, (16) and (17), the equation g′(x) = 0 has the single solution

α = ln(n/(m − 1)) / (m(−ln p))   (18)

for 0 < x, and the value of the function g′(x) varies as follows:

(i) g′(x) > 0 for 0 < x < α, g′(x) = 0 for x = α, and g′(x) < 0 for α < x.
From property (i), (14) and (15), the equation g(x) = 0 has a single solution β for 0 < x, and the value of the function g(x) varies as follows: g(x) > 0 for 0 < x < β, g(x) = 0 for x = β, and g(x) < 0 for β < x. Since f′(x) (see (12)) and g(x) (see (13)) have the same sign, f(x) attains its maximum over 0 < x at x = β:

max{ f(x) | 0 < x, x is real } = f(β).   (19)

Although β cannot be represented by a brief formula, it holds that

α ≤ β.   (20)

From (10), (18), (19) and (20), when n > m, the term (11) can be evaluated as follows:

max{ f(x) | 2 ≤ x ≤ 2^a, x is real } ≤ f(β) = (1 − p^{mβ})^n / β^{m−1} ≤ 1/β^{m−1} ≤ 1/α^{m−1}.   (21)

By substituting (21) into (9),

F(a) ≤ mn2^a + mna 2^{(a+1)m} / α^{m−1} = mn2^a + mna 2^{a+1} (2^{a+1}/α)^{m−1}.   (22)

By substituting (22), (6) and (18) into T = O(F(a)) (see (7)), and using 2^{⌈lg d⌉+1} ≤ 4d, an upper bound of the average running time T of SUCP-BT for n > m is finally obtained:

T = O(mnd + mn⌈lg d⌉ 2^{⌈lg d⌉+1} (2^{⌈lg d⌉+1} m(−ln p) / ln(n/(m − 1)))^{m−1})
  = O(mnd((4m(−ln p)d / ln(n/(m − 1)))^{m−1} lg d + 1)).   (23)

4 Worst-case running time analysis

Let T_2 denote the tree that shows how SUCP-BT generates sub-domains (an example of such a tree is Figure 1). For a sub-domain I = [l_1, r_1] × ⋯ × [l_m, r_m] in T_2, let dp(I) denote the depth of I in T_2 and len(I) = Σ_{j=1}^m |[l_j, r_j]| the sum of the sizes of all intervals of I. Since

Σ_{j=1}^m x_j ≤ m Π_{j=1}^m x_j

for any numbers x_1, x_2, ..., x_m that are no less than 1 (x_j ≥ 1), it holds that

len(I) ≤ m|I|.   (24)

Moreover, the sum of the sizes of all sub-domains I at the same depth in T_2 is no more than |I_0| = d^m; that is, it also holds that

Σ_{I in T_2 such that dp(I)=i} |I| ≤ d^m   (25)

for any i. From (24) and (25), and since the depth of T_2 is at most m⌈lg d⌉, the worst-case running time T_W of SUCP-BT is estimated as follows:

T_W = O( Σ_{i=0}^{m⌈lg d⌉} Σ_{I in T_2 such that dp(I)=i} len(I) n )
    = O( Σ_{i=0}^{m⌈lg d⌉} Σ_{I in T_2 such that dp(I)=i} |I| mn )
    = O( Σ_{i=0}^{m⌈lg d⌉} mnd^m )
    = O(m²nd^m lg d).   (26)

5 Conclusion

Problem SUCP is the problem of calculating the size of the union of n Cartesian products, |∪_{i=1..n} S_i1 × ⋯ × S_im|, where the S_ij are finite sets of integers and m denotes the dimension of the space. We showed that the average running time of a backtracking algorithm to solve SUCP is O(mnd((4m(−ln p)d / ln(n/(m − 1)))^{m−1} lg d + 1)) for n > m, where the S_ij are constructed by randomly selecting each element from the set D = {1, 2, ..., d} with probability p. The result indicates that the backtracking algorithm is more efficient than the former algorithms, i.e., the grouping method and the naive method, if p and n (> m) are large enough that 4m(−ln p) / ln(n/(m − 1)) ≤ min{1, n(1 − p)} holds.

References

[1] C. A. Brown and P. W. Purdom (1981) An average time analysis of backtracking, SIAM J. Computing, 10(3), Society for Industrial and Applied Mathematics, pp. 583-593.
[2] M. R. Garey and D. S. Johnson (1979) Computers and Intractability: A Guide to the Theory of NP-Completeness, W. H. Freeman and Company.
[3] J. Gu, P. W. Purdom, J. Franco, and B. W. Wah (1997) Algorithms for the satisfiability (SAT) problem: A survey, DIMACS Series in Discrete Mathematics and Theoretical Computer Science, 35, American Mathematical Society, pp. 19-151.
[4] K. Iwama (1989) CNF satisfiability test by counting and polynomial time, SIAM J. Computing, 18(2), Society for Industrial and Applied Mathematics, pp. 385-391.
[5] P. W. Purdom (1990) A survey of average time analyses of satisfiability algorithms, Journal of Information Processing, 13(4), Information Processing Society of Japan, pp. 449-455.
[6] P. W. Purdom and C. A.
Brown (1987) Polynomial-average-time satisfiability problems, Information Sciences, 41, Elsevier, pp. 23-42.
[7] S. Suzuki and T. Ibaraki (2003) Average running time analysis of an algorithm to calculate the size of the union of Cartesian products, Discrete Mathematics, 273, Elsevier, pp. 211-220.

A Spectral Approach to Graphical Representation of Data

Drago Bokal, Department of Mathematics, IMFM, Jadranska 19, Ljubljana, Slovenia, drago.bokal@imfm.uni-lj.si
Martin Juvan, Department of Mathematics, University of Ljubljana, Jadranska 19, Ljubljana, Slovenia, martin.juvan@fmf.uni-lj.si
Bojan Mohar, Department of Mathematics, University of Ljubljana, Jadranska 19, Ljubljana, Slovenia, bojan.mohar@uni-lj.si

Keywords: data presentation, clustering, graph drawing, spectral method, gradient method

Received: February 6, 2004

Graphical representation of relationship data is useful in several applications. Relationships among objects are modeled as a graph, and the strength of a relationship as a weight on the corresponding edge. In the paper we demonstrate how the spectral method can be applied to visualize such data. Application of the gradient method is suggested to fine-tune the solution obtained by the spectral method.

Povzetek: članek opisuje spektralno metodo za vizualizacijo podatkov.

1 Introduction

Many applications require graphical representation of (non-numerical) data. A general approach for such a task is presented. Given a database, pairs of objects from the given data set are classified as being "close" or "far apart" by specifying a numerical value S(x, y) for each such pair x, y. This value measures the similarity of objects: S(x, y) is large for closely related objects x and y, and small for very different objects. For the purpose of this paper we shall assume that S is symmetric, i.e., S(x, y) = S(y, x). Similarity measures can be produced in a number of ways, for example by applying factor analysis. The goal is to represent the objects graphically (e.g., on a computer screen) in such a way that objects which are similar with respect to the similarity measure S are represented close to each other, i.e., the distance between their graphical representations is in agreement with the "similarity distance" S(x, y).

Two main points of our approach rely on the spectral method, where calculations are based on the eigenvalues and eigenvectors of related Laplacian matrices. The general setting for this approach is surveyed by Mohar and Poljak [27]. Usually, the spectral approach can be expressed, via the Rayleigh quotient expressions for eigenvalues, as quadratic optimization and, more generally, via semidefinite programming [1, 18, 30, 32]. Applications of eigenvalue methods in combinatorics, graph theory and combinatorial optimization have a long history. For example, eigenvalue bounds on the chromatic number were formulated by Wilf [31] and Hoffman [23] in the 1960's. Another early application, in the area of graph partition, is due to Fiedler [15] and Donath and Hoffman [13]. An important result was the use of eigenvalues in the construction of superconcentrators and expanders by Alon and Milman [2, 3]. Isoperimetric properties of graphs and their eigenvalues play a crucial role in the design of various randomized algorithms. These applications are based on so-called rapidly mixing Markov chains. There is an increasing interest in the application of eigenvalues to combinatorial optimization problems.
For example, Burkard, Finke, Rendl and Wolkowicz [14, 29] used an eigenvalue approach in the study of the quadratic assignment problem and general graph partition problems, Delorme and Poljak [10, 11] and Goemans and Williamson [19] in the max-cut problem, and Juvan and Mohar [24, 25] in labelling problems. Spectral partitioning, which is based on eigenvectors of Laplace eigenvalues of graphs, has proved to be a successful heuristic approach in the design of partition algorithms [7, 22, 21], in solving sparse linear systems [28], clustering [20, 6], ranking [25, 21], graph drawing [16], automated finding of large components [12], image and video segmentation [17], etc. We refer to [27] for additional applications. For further results, the reader may consult existing books and survey papers, such as [8, 9, 27].

2 Problem description

Formally, the above problem can be stated as follows. Given a data set containing n objects and the similarity measure S(x, y) between pairs of objects, find an embedding of the n points in the plane such that the distances between the points representing similar objects are small, while the points corresponding to objects with small similarity are far from each other. In order to formally describe the "agreement" of the similarity measure with the distances in the plane, one has to introduce a function f : R → R which transforms every value s of similarity into the desired distance between points representing two objects whose similarity is S(x, y) = s. The transformation f must satisfy the following requirements:

(F1) Monotonicity: if s < s′, then f(s) > f(s′).

(F2) Validity: lim_{s→∞} f(s) = 0.

It can easily be proved that (F1) and (F2) together imply:

(F3) Non-negativity: f(s) > 0 for every s ∈ R.

The exact choice of f depends on the data to be represented and on the properties of their similarity measure S. If similarities of distinct objects are always positive, then one may take, for example, f(s) = 1/s. We consider the objects as vertices of the (complete) weighted graph G whose edge weights are determined by S and f: the weight of the edge xy is equal to f(S(x, y)). This setting has the advantage that when the similarity measure of certain pairs of objects is not defined, the edges corresponding to such pairs can simply be removed from the graph.

We consider the following problem. Given are a graph G = (V, E) and a weight function w : E → R⁺ (the edge weights). The goal is to find a mapping ρ : V → R², which assigns to every vertex of G a point in the Euclidean plane, such that the distance between the points ρ(x) and ρ(y) is as close as possible to the prescribed weight w(xy). We refer to this problem as the vertex placement problem. The vertex placement problem is an optimization problem, and there are several possible choices for the energy function which is to be minimized. Our choice is described in Subsection 3.2.

3 Solving the problem

We propose the following general algorithm to find an (approximate) solution to the vertex placement problem.

Algorithm 1: Basic algorithm for solving the vertex placement problem
Input: Graph G = (V, E), similarity measure S : V × V → R.
Output: Placement of the vertices of G in R².
Description:
Compute the edge weights of G, w(xy) = f(S(x, y)), xy ∈ E.
Obtain the initial placement by the spectral method.
Run the gradient method to obtain an improved placement.
Correct the final solution.
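As an illustration of the first step of Algorithm 1, the following minimal Python sketch (ours, not from the paper) turns a symmetric similarity matrix into edge weights using the transformation f(s) = 1/s suggested above; pairs with undefined similarity (here encoded as None) are simply omitted from the edge set.

```python
def edge_weights(S):
    """Build the weighted graph of Algorithm 1, first step.

    S is an n x n symmetric similarity matrix with positive entries;
    S[x][y] is None when the similarity of x and y is undefined.
    Returns a dict mapping edges (x, y), x < y, to weights 1/S(x, y).
    """
    n = len(S)
    w = {}
    for x in range(n):
        for y in range(x + 1, n):
            if S[x][y] is not None:        # undefined pairs: no edge
                w[(x, y)] = 1.0 / S[x][y]  # large similarity -> small distance
    return w

S = [[None, 4.0, 0.5],
     [4.0, None, None],
     [0.5, None, None]]
print(edge_weights(S))   # {(0, 1): 0.25, (0, 2): 2.0}
```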
3.1 Initial placement

To find the initial placement in Algorithm 1, the spectral method is proposed. Its formal description is given as Algorithm 2. Let us observe that this algorithm is only a heuristic, and there are no theoretical guarantees that it will return a solution close to an optimum. However, as mentioned in the introduction, it behaves quite well in practice. We refer to [27] for more information.

Algorithm 2: Obtaining the initial placement of vertices
Input: Weighted graph G = (V, E) with edge weights w.
Output: Initial placement of the vertices of G in R².
Description:
Compute the auxiliary matrix A from the edge weights w by setting (A)_ij = 0 if i = j or ij ∉ E, and (A)_ij = 1/w(ij) if ij ∈ E.
Determine the Laplace matrix L_A from A: (L_A)_ii = Σ_{k=1}^n (A)_ik, (L_A)_ij = −(A)_ij for i ≠ j.
Compute the eigenvectors e, f of L_A corresponding to the two smallest nontrivial eigenvalues of L_A.
Set x_i := e_i, y_i := f_i as the coordinates of the vertex i.

To obtain the auxiliary matrix A from the edge weights w, one can also use the following formula: (A)_ij = 0 if i = j or ij ∉ E, and (A)_ij = 1/(w(ij))² if ij ∈ E. Both alternatives seem to yield good results.

If the number of objects to be represented is too large, we first apply the same spectral method to cluster the data set into smaller cluster sets whose size fits the requirements. Particular clusters that are small enough can then be graphically represented as described above. On the other hand, the relations among the clusters themselves can also be represented by the same method, by defining the distance w between two clusters X, Y as

w(X, Y) = (1 / (|X||Y|)) Σ_{x∈X} Σ_{y∈Y} f(S(x, y)).

A similar spectral approach has already been applied to the clustering problem; see [4, 5].
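A compact Python sketch of Algorithm 2 follows (ours; it assumes numpy is available). It builds the weighted Laplacian and uses the eigenvectors belonging to the two smallest nontrivial eigenvalues as coordinates.

```python
import numpy as np

def spectral_placement(n, w):
    """Initial placement by Algorithm 2.

    n is the number of vertices; w maps edges (i, j), i < j, to weights.
    Returns an n x 2 array of coordinates.
    """
    A = np.zeros((n, n))
    for (i, j), wij in w.items():
        A[i, j] = A[j, i] = 1.0 / wij      # or 1/wij**2, the alternative rule
    L = np.diag(A.sum(axis=1)) - A         # Laplace matrix L_A
    vals, vecs = np.linalg.eigh(L)         # eigenvalues in ascending order
    # For a connected graph, vecs[:, 0] is the trivial (constant)
    # eigenvector; the next two give the coordinates.
    return vecs[:, [1, 2]]

w = {(0, 1): 1.0, (1, 2): 1.0, (2, 3): 1.0, (3, 0): 1.0, (0, 2): 1.4}
print(spectral_placement(4, w))
```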
3.2 The gradient method

The energy function we choose to minimize in solving the vertex placement problem is the sum of the squares of the relative differences between the distances implied by the current placement of the vertices and the desired distances:

f(x, y) = Σ_{ij∈E} ( (w(ij) − ‖(x_i, y_i) − (x_j, y_j)‖) / w(ij) )²,

where x = (x_1, x_2, ..., x_n), y = (y_1, y_2, ..., y_n), (x_i, y_i) is the point in R² corresponding to the vertex i, w(ij) is the desired distance between vertices i and j, and E is the set of edges of the graph G on vertices {1, ..., n}.

The gradient method is a well-known iterative algorithm for solving optimization problems whose objective function is differentiable (see, e.g., [26]). At each step, first the direction in which the placement will change is determined: usually, the negative gradient is taken as the direction, but in every third step the average of the last two gradients is used instead (this significantly reduces the "zig-zag" behavior which otherwise often occurs). Then the length of the step in the chosen direction is calculated. As a first approximation, a step of Newton's root-finding method for the directional derivative of the energy function in the chosen direction is used (the goal is a local minimum, and the derivative evaluates to zero in the minimum; Newton's method is used to find this root). Then a step of the calculated length is made in the direction of decreasing energy function. If the value of the energy function at the obtained point is higher than the current value, the length of the step is corrected: it is repeatedly multiplied by an appropriately chosen constant factor from the unit interval until the value of the energy function is lower than the current value. Additionally, if two subsequent directions differ too much (the measure is the angle between the two), the next step is made shorter. This method is a variant of an inexact line search using the Armijo rule as its stopping condition (cf. [26, sec. 3.2]), combined with some heuristics. The method stops when any of the following cases occurs:

- the norm of the direction vector is small enough,
- the number of steps exceeds the maximum number allowed, or
- a certain number of the quotients of subsequent energy values are small enough.

These criteria can be tuned to achieve either higher accuracy or faster performance. Note that the problem is invariant under translations and rotations of the plane. Thus we may fix the position of one vertex and additionally one coordinate of another vertex.
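The following Python sketch (ours) shows a simplified pass of the method just described; it uses a plain backtracking step-length correction in place of the Newton estimate and the direction-averaging heuristics, so it is an illustration of the energy and its gradient rather than the authors' exact procedure.

```python
import numpy as np

def energy(P, w):
    """f(x, y): sum of squared relative differences over the edges."""
    return sum(((wij - np.linalg.norm(P[i] - P[j])) / wij) ** 2
               for (i, j), wij in w.items())

def gradient(P, w):
    g = np.zeros_like(P)
    for (i, j), wij in w.items():
        d = np.linalg.norm(P[i] - P[j])
        if d == 0:
            continue
        # derivative of ((wij - d)/wij)^2 with respect to P[i]
        coef = -2.0 * (wij - d) / (wij ** 2 * d)
        g[i] += coef * (P[i] - P[j])
        g[j] -= coef * (P[i] - P[j])
    return g

def gradient_descent(P, w, steps=200, step0=0.1, shrink=0.5):
    for _ in range(steps):
        g = gradient(P, w)
        if np.linalg.norm(g) < 1e-9:       # direction vector small enough
            break
        t, e0 = step0, energy(P, w)
        while energy(P - t * g, w) >= e0:  # backtracking step correction
            t *= shrink
            if t < 1e-12:
                return P
        P = P - t * g
    return P

P0 = np.random.rand(4, 2)
w = {(0, 1): 1.0, (1, 2): 1.0, (2, 3): 1.0, (3, 0): 1.0}
print(energy(gradient_descent(P0, w), w))
```

Starting the descent from the placement produced by Algorithm 2 reproduces the overall pipeline of Algorithm 1.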
4 Practical considerations

The behavior of the proposed algorithms was tested on several distance matrices of various sizes. For these tests we used random graphs constructed as follows: let G = G(n, p) be a random graph on n vertices, where each edge is added to G with probability p. Choose randomly n points x_1, ..., x_n in the plane, and for each edge ij ∈ E_G let (D)_ij be the distance between the points x_i, x_j. With probability q the distances (D)_ij are perturbed by a factor r (half of the distances are increased, half are decreased). Additional tests were performed using random bipartite graphs, obtained in a similar way. Such graphs often appear in real-world applications, where the objects considered naturally fall into two disjoint classes.

Figures 1-3 demonstrate the results of the tests for parameters n = 10, with q and r ranging from 0 to 0.9 in steps of 0.1. The thicker the point, the more steps the gradient method required. The points were obtained by averaging the results of ten independently chosen random graphs with the same parameters.

[Figure 1: Performance test for random graphs with parameters p = 0.1, q on the x-axis and r on the y-axis ranging from 0.1 to 0.9.]
[Figure 2: Performance test for random graphs (p = 0.8).]
[Figure 3: Performance test for random graphs (p = 1).]

The performance analysis showed that the gradient method requires the largest number of steps when p is large while q and r are small (compare the aforementioned figures). The interpretation could be that the optimum solution is then nearly exact and hard to find, as the graph is dense. With large perturbations, the search space tends to have more local optima that are close to the initial state, and the algorithm is more easily stopped in one of them. When the graph is sparse, the problem usually splits into several smaller subproblems and in general requires fewer steps of the gradient method.

Figure 4 presents the energy of the solution as a function of the number of iterations performed. It decreases in steps, which demonstrates that the solution may be jumping from one local optimum to another at certain points in time.

[Figure 4: Decrease of the energy during iteration of the gradient method.]

The norm of the gradient calculated by the gradient method is displayed in Figure 5. It demonstrates that the norm of the gradient does not decrease monotonously; however, towards the local optimum its value settles and approaches 0. The large jumps in the gradient norm correspond to the bigger decreases in the value of the energy function.

[Figure 5: Changes in gradient norm during iteration of the gradient method.]

The progress of the algorithm was monitored using the traces of the points in the plane. After every step of the gradient method the graph was output and its position in the plane was displayed. The positions of the same vertex in two consecutive drawings were connected by a line segment. Thus for each point we reconstructed its trace during the optimization process. Figures 6-8 display one such picture for a graph on ten vertices. These traces were displayed for various initial placements of the vertices of the considered graphs. The results demonstrate that the traces are shortest if we choose as the initial placement the one proposed by Algorithm 2 (Figure 6). This was tested both against choosing higher eigenvalues (Figure 7) and against choosing a random initial placement (Figure 8). Note that the position of the first vertex and one coordinate of the second vertex are fixed, reducing the unnecessary computational costs due to isometric transformations of the plane.

[Figure 6: Traces of positions of vertices in the plane during the execution of the gradient method, initial placement using eigenvectors of smallest eigenvalues.]
[Figure 7: Traces of positions of vertices, initial placement using eigenvectors of larger eigenvalues.]
[Figure 8: Traces of positions of vertices, random initial placement.]

In Figure 7 a certain "zig-zag" behavior of two points can be observed. It hindered the gradient method from finding the optimal solution fast. To reduce the effect of such behavior, we slightly modified the gradient method so that in every third step the position of the vertices is updated according to the average direction of the last two gradients (see the description of the gradient method).

5 Conclusion

As demonstrated, the spectral method turns out to be well applicable to the problem of graphical representation of relations among objects. In the paper we suggested using the gradient method to fine-tune the solution obtained by the spectral method; however, other local optimization algorithms could be applied as well, cf. [26]. A comparison of their suitability to the described problem, as well as a rigorous analysis of convergence, could be the subject of further research. One could also investigate theoretical bounds on the optimality of the solutions proposed by the spectral method.

6 Acknowledgments

The authors acknowledge fruitful discussions about this subject with Robert Reinhardt. Also, the programming help of Matija Mazi and Martin Milanič is greatly appreciated. We also thank the referees for their suggestions.

References

[1] F. Alizadeh, Interior point methods in semidefinite programming with applications to combinatorial optimization, SIAM J. Optimiz. 5 (1995) 13-51.
[2] N. Alon, Eigenvalues and expanders, Combinatorica 6 (1986) 83-96.
[3] N. Alon, V. D. Milman, λ₁, isoperimetric inequalities for graphs, and superconcentrators, J. Combin. Theory, Ser. B 38 (1985) 73-88.
[4] M. Bolla, Spectra, Euclidean representations and clusterings of hypergraphs, Discrete Math. 117 (1993) 19-39.
[5] M. Bolla, G. Tusnády, Spectra and optimal partitions of weighted graphs, Discrete Math. 128 (1994) 1-20.
[6] P. K. Chan, M. Schlag, J. Zien, Spectral k-way ratio cut partitioning and clustering, Symp. on Integrated Systems, 1993.
[7] T. F. Chan, W. K. Szeto, On the optimality of the median cut spectral bisection graph partitioning method, SIAM J. Scientific Comput. 18 (1997) 943-948.
[8] F. R. K. Chung, Spectral graph theory, American Math. Soc., Providence, RI, 1997.
[9] D. M. Cvetković, M. Doob, H.
Sachs, Spectra of graphs, Academic Press, New York, 1979; 3rd edition, Johann Ambrosius Barth Verlag, Heidelberg, 1995.
[10] C. Delorme, S. Poljak, Laplacian eigenvalues and the maximum cut problem, Math. Programming 62 (1993) 557-574.
[11] C. Delorme, S. Poljak, Combinatorial properties and the complexity of a max-cut approximation, Europ. J. Combin. 14 (1993) 313-333.
[12] C. Ding, X. He, and H. Zha, A spectral method to separate disconnected and nearly disconnected web graph components, Proc. 7th ACM Int'l Conf. Knowledge Discovery and Data Mining (KDD 2001), 2001, pp. 275-280.
[13] W. E. Donath, A. J. Hoffman, Lower bounds for the partitioning of graphs, IBM J. Res. Develop. 17 (1973) 420-425.
[14] G. Finke, R. E. Burkard and F. Rendl, Quadratic assignment problems, Ann. Discrete Math. 31 (1987) 61-82.
[15] M. Fiedler, Algebraic connectivity of graphs, Czech. Math. J. 23 (98) (1973) 298-305.
[16] P. W. Fowler, T. Pisanski, J. Shawe-Taylor, Molecular graph eigenvectors for molecular coordinates, in "Graph drawing: GD'94" (R. Tamassia, ed.), Springer-Verlag, Berlin, 1995, pp. 282-285.
[17] C. Fowlkes, S. Belongie, F. Chung, and J. Malik, Spectral grouping using the Nyström method, preprint, 2002.
[18] M. Goemans, Semidefinite programming in combinatorial optimization, Math. Program. 79 (1997) 143-161.
[19] M. X. Goemans, D. P. Williamson, Improved approximation algorithms for maximum cut and satisfiability problems using semidefinite programming, J. ACM 42 (1995) 1115-1145.
[20] L. Hagen, A. B. Kahng, New spectral methods for ratio cut partitioning and clustering, IEEE Trans. Computer-Aided Design 11 (1992) 1074-1085.
[21] C. Helmberg, B. Mohar, S. Poljak, F. Rendl, A spectral approach to bandwidth and separator problems in graphs, Linear and Multilinear Algebra 39 (1995) 73-90.
[22] B. Hendrickson, R. Leland, An improved spectral graph partitioning algorithm for mapping parallel computations, SIAM J. Sci. Comput. 16 (1995) 452-469.
[23] A. J. Hoffman, On eigenvalues and colorings of graphs, in "Graph Theory and Its Applications" (B. Harris, ed.), Acad. Press, 1970, pp. 79-91.
[24] M. Juvan, B. Mohar, Optimal linear labelings and eigenvalues of graphs, Discrete Appl. Math. 36 (1992) 153-168.
[25] M. Juvan, B. Mohar, Laplace eigenvalues and bandwidth-type invariants of graphs, J. Graph Theory 17 (1993) 393-407.
[26] C. T. Kelley, Iterative Methods for Optimization, SIAM, Philadelphia, 1999.
[27] B. Mohar, S. Poljak, Eigenvalues in combinatorial optimization, in "Combinatorial and Graph-Theoretical Problems in Linear Algebra" (R. A. Brualdi, S. Friedland, V. Klee, eds.), IMA Volumes in Mathematics and Its Applications, Vol. 50, Springer-Verlag, 1993, pp. 107-151.
[28] A. Pothen, H. D. Simon, K. P. Liu, Partitioning sparse matrices with eigenvectors of graphs, SIAM J. Matrix Anal. Appl. 11 (1990) 430-452.
[29] F. Rendl, H. Wolkowicz, Applications of parametric programming and eigenvalue maximization to the quadratic assignment problem, Math. Progr. 53 (1992) 63-78.
[30] L. Vandenberghe and S. Boyd, Semidefinite programming, SIAM Review 38 (1996) 49-95.
[31] H. S. Wilf, The eigenvalues of a graph and its chromatic number, J. London Math. Soc. 42 (1967) 330-332.
[32] H. Wolkowicz, R. Saigal and L. Vandenberghe (editors), Handbook of Semidefinite Programming: Theory, Algorithms, and Applications, International Series in Operations Research & Management Science 27, Kluwer Academic Publishers, Boston, MA, 2000.
Algorithms for Drawing Polyhedra from 3-Connected Planar Graphs

Alen Orbanić, Marko Boben, Gašper Jaklič and Tomaž Pisanski, IMFM, OTR, Jadranska 19, Ljubljana, Slovenia
Emails: Alen.Orbanic@fmf.uni-lj.si, Marko.Boben@fmf.uni-lj.si, Gasper.Jaklic@fmf.uni-lj.si, Tomaz.Pisanski@fmf.uni-lj.si
(Supported in part by the Ministrstvo za šolstvo, znanost in šport Republike Slovenije.)

Keywords: graph drawing, Tutte's drawing method, representation of a graph, polyhedral representations

Received: September 15, 2003

Two algorithms for producing polyhedral representations of 3-connected planar graphs are discussed in the paper. One of them uses Tutte's drawing algorithm [11] to produce a 2D drawing; the drawing is then lifted into 3D space, giving a polyhedral embedding. The other is a simple algorithm by G. Hart [4] for drawing canonical polyhedral representations. Some alternative views (a physical model, a Markov chain model) of algorithms for obtaining Tutte's drawings are presented and proved.

Povzetek: članek opisuje dva pristopa za grafičen prikaz planarnih grafov.

1 Introduction

A convex polyhedron can be viewed as the convex hull of its vertices, written P = (p_1, ..., p_n), p_i ∈ R³, or as an intersection of the half-spaces defined by the supporting planes of the faces, using the side of R³ that contains the polyhedron. These are two dual definitions. For more detailed and generalized definitions of polyhedra see [12]. Every graph mentioned in this article is 3-connected and planar. The vertices and edges of a polyhedron P define a skeleton graph G(P) in an obvious way. The skeleton graph is obviously planar, and by Balinski's theorem [12] it is 3-connected. In 1922 Steinitz proved that 3-connected planar graphs are exactly the skeletons of convex 3D polyhedra. According to Whitney [2], every 3-connected planar graph has a unique embedding in the plane, and the faces of the embedding are exactly the non-separating induced cycles. These faces are exactly the faces of any polyhedron with the same skeleton graph.

Given a 3-connected planar graph G, one would like to have an algorithm that produces a polyhedron whose skeleton is G. We call such a representation a polyhedral representation. A graph is given as a combinatorial structure, and our algorithms return 3D coordinates of the vertices of the polyhedron. There are infinitely many polyhedral representations of a given graph. According to Koebe [5], for each polyhedron there is the canonical form, which is determined up to a rotation of 3D space. The canonical representation is especially "nice" because it possesses the maximal possible geometric symmetry. Each edge of such a polyhedron touches the unit sphere in exactly one point, and the center of gravity of these touching points is the origin. If a polyhedron P is in canonical form, then the polar polyhedron is also in canonical form, and the dual edges have the same touching points with the unit sphere. For a given polyhedron P containing the origin of R³, a polar polyhedron P* is well defined and unique:

P* = { y ∈ R³ | ⟨x, y⟩ ≤ 1 for all x ∈ P }.

The skeleton graph of P*, G* := G(P*), is exactly the dual graph of the skeleton graph G := G(P). We will present some methods together with references for more detailed information.
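To illustrate the definition of the polar P*, here is a small Python sketch (ours, using scipy) that computes the vertices of P* for a polyhedron containing the origin in its interior: each facet with supporting plane ⟨n, x⟩ = c, c > 0, contributes the polar vertex n/c.

```python
import numpy as np
from scipy.spatial import ConvexHull

def polar_vertices(points):
    """Vertices of the polar polyhedron P* of P = conv(points).

    Assumes the origin lies in the interior of P.  ConvexHull returns
    facet equations a.x + b <= 0, i.e. supporting planes a.x = -b with
    -b > 0; the corresponding polar vertex is a / (-b).  Duplicates can
    appear when qhull triangulates a facet, so unique rows are returned.
    """
    hull = ConvexHull(points)
    A, b = hull.equations[:, :-1], hull.equations[:, -1]
    return np.unique(np.round(A / (-b)[:, None], 9), axis=0)

# The cube [-1, 1]^3: its polar is the octahedron with vertices (+-1, 0, 0), ...
cube = np.array([[x, y, z] for x in (-1, 1) for y in (-1, 1) for z in (-1, 1)])
print(polar_vertices(cube))
```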
2 Tutte's algorithm

The proofs and detailed descriptions can be found in [10] and [11]. In 1963 Tutte [11] invented an interesting rubber-band method for embedding a 3-connected planar graph G into the plane. In this method one face C (on vertices k+1, ..., n, referred to as the outer cycle) of the graph is embedded into the plane as a fixed strictly convex polygon, while the other vertices are connected by rubber bands. On the edge ij, where i, j are not both in C, there is a strictly positive stretching coefficient ω_ij. Later we will also assign weights to the edges ij ∈ C, but we will not use them in Tutte's method. Let p_i represent the position of the vertex i in the plane. The positions p_i := p_i⁰, i = k+1, ..., n, are fixed. The other positions are to be calculated and lie somewhere inside the polygon. The system consisting of a rigid polygon as a frame and rubber bands is in equilibrium if the following system of equations holds:

Σ_{ij∈E(G)} ω_ij (p_j − p_i) = 0,  i ∈ V(G) − V(C),
p_i = p_i⁰,  i ∈ V(C).   (1)

Tutte proved that this system has a unique solution, which gives a set of positions that induces a straight-line embedding of G into the plane with convex faces. Instead of solving the system of linear equations directly, two alternative approaches can be used. In [8] the following algorithm, called the Schlegel diagram, was introduced:

Algorithm 1 Fix the positions of the vertices of the outer cycle C on a convex polygon in R². Assign to the other vertices random positions inside the polygon. For those vertices repeat the following steps:
- for each vertex i calculate the resulting force F_i of all adjacent rubber bands;
- move each vertex i by the vector aF_i, where a > 0 is a fixed real number;
until the displacements of all vertices are sufficiently small (depending on the prescribed precision of the drawing).

We will determine the values of a which guarantee the convergence of Algorithm 1. The weighted Laplacian matrix Q_ω of a graph G on n vertices with weights ω = (ω_ij)_{ij∈E(G)} is the n × n matrix with (Q_ω)_{i,j} = (Q_ω)_{j,i} = −ω_ij when ij ∈ E(G), (Q_ω)_{i,j} = 0 when i ≠ j and ij ∉ E(G), and (Q_ω)_{i,i} = −Σ_{j=1, j≠i}^n (Q_ω)_{i,j}. Let K = {1, ..., k} and let Q_{ω,K} be the matrix consisting of the first k rows of Q_ω. Let A be the n × n matrix whose first k rows are the corresponding rows of −Q_ω and whose last n − k rows are (0_{n−k,k} | −I_{n−k}),   (2) where 0_{n−k,k} denotes the (n − k) × k zero matrix and I_l the l × l identity matrix. Let x = (x_1, ..., x_n) and y = (y_1, ..., y_n), where p_i = (x_i, y_i); here the p_i := p_i⁰ are fixed values for i = k+1, ..., n, and the p_i are pairs of variables for i = 1, ..., k. Let also b = (b_1, ..., b_n) and c = (c_1, ..., c_n), with (b_i, c_i) = (0, 0) for i = 1, ..., k and (b_i, c_i) = −p_i⁰ for i = k+1, ..., n. The system of equations (1) can then be written as:

Ax = b,  Ay = c.   (3)

The iteration procedure of Algorithm 1 can be rewritten as:

x^{(r+1)} = x^{(r)} + a(Ax^{(r)} − b),  y^{(r+1)} = y^{(r)} + a(Ay^{(r)} − c).   (4)

Let us determine the bounds on a that ensure that the algorithm converges. It is sufficient to do this for the iteration for x. Let us rewrite the iteration (4): x^{(r+1)} = (aA + I)x^{(r)} − ab. Let ρ(M) denote the spectral radius of a matrix M. It is well known that an iteration of this type converges iff ρ(aA + I) < 1; see [1]. We will show that for a sufficiently small a > 0 it follows that ρ(aA + I) < 1. If σ(A) is the spectrum of the matrix A, then (with slight abuse of notation) σ(aA + I) = 1 + σ(aA) = 1 + a · σ(A). Proving that σ(A) is strictly negative yields the existence of an appropriate a such that ρ(aA + I) < 1. The matrix A is block upper triangular with the two diagonal blocks −Q_{ω,(K,K)} and −I_{n−k}, where Q_{ω,(K,K)} is the matrix obtained from Q_ω by taking the rows and columns in the set K in the induced order. It follows that σ(A) = σ(−Q_{ω,(K,K)}) ∪ σ(−I_{n−k}).
It suffices to show that Q_{ω,(K,K)} is positive definite. The matrix Q_{ω,(K,K)} is a principal sub-matrix of Q_ω. We will prove that σ(Q_ω) ≥ 0 and that Q_{ω,(K,K)} is nonsingular; Cauchy's interlacing theorem [1] then implies that Q_{ω,(K,K)} is positive definite. Using Gershgorin's theorem (see [1]) it is easy to see that the eigenvalues of the Laplacian matrix Q_ω are contained in [0, 2 · max_{i=1,...,n} {(Q_ω)_{i,i}}]; hence σ(Q_ω) ≥ 0. By Whitney's theorem H = G − C is connected. Let Q′ := Q_ω(H) be a weighted Laplacian matrix of H and Q := Q_{ω,(K,K)}. Then Q = Q′ + Δ, where Δ is a nonnegative diagonal matrix. Since G is connected, Δ ≠ 0. For an arbitrary vector w ∈ R^k, w^T Q w = w^T Q′ w + w^T Δ w. Since the eigenvalues of Q′ are, by Gershgorin's theorem, contained in the interval [0, 2 · max_i {(Q′)_{i,i}}], it follows that w^T Q′ w ≥ 0. Since H is connected, the multiplicity of the smallest eigenvalue is 1. But the smallest eigenvalue is 0 and its eigenvector is the all-ones vector 1. Therefore w^T Q′ w = 0 iff w = c1 for some c ∈ R. Since Δ is a nonnegative diagonal matrix, w^T Δ w ≥ 0. But for w = c1, w^T Δ w = c² 1^T Δ 1 > 0 if c ≠ 0. So w^T Q w > 0 for each w ≠ 0, and Q is positive definite. The following theorem holds (the exact bound follows from the spectral analysis above):

THEOREM 1 Let G be a 3-connected planar graph. If 0 < a < 2/max{1, ρ(Q_{ω,(K,K)})}, then Algorithm 1 converges to the unique solution of (3).

The second alternative approach uses Markov chains. Consider the Markov chain (X_r) on the vertex set of G whose transition matrix P is obtained from the weights: for i ∉ V(C), the walk moves from i to an adjacent vertex j with probability P_ij = ω_ij / Σ_{il∈E(G)} ω_il, while every state in V(C) is absorbing. For a set A ⊆ V(G), the quantities

h_i^A = P(X_r ∈ A for some r ≥ 0 | X_0 = i),   (9)

are hitting probabilities for the set A when starting in the state i. We write h_i^j for h_i^{{j}}. The following theorem holds (see [7]):

THEOREM 2 The vector h^A = (h_1^A, ..., h_n^A) is a solution of the system:
1. h_i^A = 1 for i ∈ A,
2. Σ_{j=1}^n P_ij h_j^A = h_i^A for i ∉ A,
3. 0 ≤ h_i^A ≤ 1 for all i.
If h̃^A is any other solution, then h^A ≤ h̃^A (inequality on components). Therefore h^A is a minimal solution.

Let h^j = (h_i^j)_{i=1}^n be the vector of hitting probabilities. By Theorem 2, Ph^j = h^j. Let J = [h^{k+1}, ..., h^n] be the matrix with the vectors h^j as columns. Hence PJ = J. Let x⁰ = (x⁰_{k+1}, ..., x⁰_n) and y⁰ = (y⁰_{k+1}, ..., y⁰_n) be the coordinates of the vertices on the outer polygon. Then:

P(Jx⁰) = Jx⁰,  P(Jy⁰) = Jy⁰,   (10)

and

(Jx⁰)_j = x⁰_j,  (Jy⁰)_j = y⁰_j,  j = k+1, ..., n.   (11)

Thus p_i = ((Jx⁰)_i, (Jy⁰)_i) is a solution of the system (1), and it is unique by Tutte. The following theorem holds:

THEOREM 3 Let G = ({1, ..., n}, E(G)) be a 3-connected planar graph, let C = (k+1, ..., n) be one of its faces, and let p_i⁰, i = k+1, ..., n, be the fixed coordinates of the vertices of C forming a convex polygon in the cyclical order of C. Let P be the transition matrix of the Markov chain (X_r) and let h_i^j be the hitting probabilities for entering the vertices j = k+1, ..., n when starting from the vertex i ∈ V(G). Then the Tutte drawing can be obtained in the following way:

p_i = Σ_{j=k+1}^n h_i^j p_j⁰,  i ∈ V(G) − V(C),
p_i = p_i⁰,  i ∈ V(C).

The theorem was originally conjectured by T. W. Tucker. To obtain the hitting probabilities h_i^j one can perform a sufficiently large number of the following experiments: start in i and make a series of steps until a state in C is reached. The frequency of the walks ending in j is an approximation of h_i^j. It can also be easily verified that lim_{r→∞} (P^r)_{ij} = h_i^j for i ∉ C, j ∈ C. Hence the hitting probabilities can be obtained by calculating P^r for large r.
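In practice the Tutte drawing can also be obtained by solving the equilibrium system (1) directly. The following Python sketch (ours, using numpy; the helper names are not from the paper) assembles and solves that linear system for the interior vertices.

```python
import numpy as np

def tutte_drawing(n, edges, outer, outer_pos, omega=None):
    """Solve the equilibrium system (1) directly.

    n: number of vertices (0-based ids); edges: list of pairs (i, j);
    outer: vertices of the outer cycle C; outer_pos: their fixed
    positions; omega: optional dict of stretching coefficients
    (default 1 on every edge).  Returns an n x 2 array of positions.
    """
    omega = omega or {e: 1.0 for e in edges}
    fixed = {v: np.asarray(p, float) for v, p in zip(outer, outer_pos)}
    inner = [v for v in range(n) if v not in fixed]
    idx = {v: t for t, v in enumerate(inner)}
    M = np.zeros((len(inner), len(inner)))
    rhs = np.zeros((len(inner), 2))
    for (i, j) in edges:
        w = omega[(i, j)]
        for u, v in ((i, j), (j, i)):
            if u in idx:                       # equation for inner vertex u
                M[idx[u], idx[u]] += w
                if v in idx:
                    M[idx[u], idx[v]] -= w
                else:
                    rhs[idx[u]] += w * fixed[v]
    sol = np.linalg.solve(M, rhs)
    pos = np.zeros((n, 2))
    for v, p in fixed.items():
        pos[v] = p
    for v, t in idx.items():
        pos[v] = sol[t]
    return pos

# K4 with outer triangle 1, 2, 3: vertex 0 lands at the barycenter.
edges = [(0, 1), (0, 2), (0, 3), (1, 2), (2, 3), (1, 3)]
print(tutte_drawing(4, edges, [1, 2, 3], [(0, 0), (1, 0), (0.5, 1)]))
```

By Theorem 3 the same positions could be obtained by iterating the transition matrix P and reading off the hitting probabilities.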
3 Lifting of a Tutte's Drawing

A detailed procedure with proofs can be found in [10]. A weighted embedded 3-connected planar graph G with weights w and positions p_1, ..., p_n of the vertices is in equilibrium if

Σ_{ij∈E(G)} w_ij (p_j − p_i) = 0, for all i ∈ V(G).   (12)

The corresponding weight w is called an equilibrium weight. The Maxwell-Cremona theorem states that if, for an embedded 3-connected graph with convex faces, there exists an equilibrium weight with strictly negative weights on the edges of the outer face and strictly positive weights on the inner edges, then the drawing can be lifted to a polyhedron. A Tutte drawing with the corresponding weights on the edges that are not in the outer face has all the vertices not on the outer face in equilibrium. To use the Maxwell-Cremona theorem there should also be negative equilibrium weights on the edges of the outer face. A simplified version of the theorem will be used to solve that problem.

Proposition 4 Let G be a 3-connected planar graph embedded with one outer face f_1 on a convex polygon and with all other faces (f_2, ..., f_m) as convex polygons inside the outer polygon. Let G* be the geometric dual of G. Let e = ij ∈ E(G) and let e* = fg be the corresponding dual edge, such that if one traverses from i to j in the embedding of G, the face f is on the left side. Let (i, j, f, g) denote an oriented patch and let q_{f_1} = (0, 0, 0). Then for 2 ≤ k ≤ m unique assignments q_{f_k} exist such that

q_f − q_g = w_ij (p_i × p_j),

where w is an equilibrium weight and × denotes the cross product. For the proof see [10].

Using the vectors q one can define a function on the interior of the outer polygon:

z(x) = ⟨x, q_{f_i}⟩, for x ∈ f_i ⊆ R².   (13)

It can be proved that the function z(x) is continuous, convex, and linear on each face, so that the image of each face lies on a plane. If we have a Tutte drawing with a triangle as the outer face, then, using (12) on the Tutte drawing, one can calculate the remaining negative weights on the edges of the triangle. The surface defined by the graph of the function z(x), together with a patch in place of the triangle, determines the hull of the polyhedral representation of G. We also use the fact that if G is 3-connected planar, then either G or G* has a triangle as a face (see [6]). This implies the following algorithm:

Algorithm 2
1. Determine the faces of G (some planarity algorithm).
2. If one of the faces is a triangle, use G; otherwise use G*.
3. Draw a Tutte drawing.
4. From the Tutte drawing, using Proposition 4, determine the vectors q.
5. Determine the vertices of the polyhedron using the function z(x).
6. If G was used, we have a polyhedral embedding. Otherwise, move the polyhedron so that the center of gravity of the vertices lies in the origin and calculate the polar polyhedron.

[Figure 1: Dodecahedron and Fullerene C60 drawn by Tutte's method.]

4 Canonical polyhedra

To produce the canonical polyhedral representation one can use methods for determining a primal-dual circle packing (PDCP) of a 3-connected planar graph. Having a PDCP, one can easily obtain the canonical polyhedral representation using the inverse of the stereographic projection onto the unit sphere. One such algorithm is due to Mohar [6] (circle packing in the plane or on the sphere). Another, which works in 3D and on more or less small graphs, is due to G. Hart [4]. We present some slight improvements of the latter algorithm. Hart's algorithm reads:

Algorithm 3
1. Start with a "good approximation" of a polyhedron that has edges as near as possible to the unit sphere. For instance, get the representation from Algorithm 2, move the center of gravity into the origin, and project the vertices from the origin onto the unit sphere.
2. Repeat the following procedure until the positions of the vertices are precise enough:
(a) for each edge e calculate the point p_e which is closest to the origin. If the edge is not at distance 1 from the origin, move the endpoints of e by the vector a(1 − ‖p_e‖)p_e, where 0 < a < 1;
(b) for each face calculate the approximate plane of the vertices on the face. If some vertex v is at distance d(v) from the plane, move v towards the plane in the direction of the normal by the magnitude βd(v), 0 < β < 1.
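The following Python sketch (ours; the parameter names a and β are as above) implements the edge-tangency step 2(a) of Algorithm 3 together with a simple planarity step 2(b) using least-squares face planes. It is only an illustration of the iteration and includes none of the cubic-graph refinements discussed below.

```python
import numpy as np

def canonicalize(V, edges, faces, a=0.3, beta=0.3, iters=500):
    """One possible reading of Algorithm 3, step 2.

    V: n x 3 vertex coordinates (a rough approximation, edges near the
    unit sphere); edges: pairs (i, j); faces: lists of vertex indices.
    """
    V = V.astype(float).copy()
    for _ in range(iters):
        # (a) pull each edge's closest point toward the unit sphere
        for i, j in edges:
            u = V[j] - V[i]
            t = -np.dot(V[i], u) / np.dot(u, u)   # foot of perpendicular
            p = V[i] + np.clip(t, 0.0, 1.0) * u
            V[i] += a * (1.0 - np.linalg.norm(p)) * p
            V[j] += a * (1.0 - np.linalg.norm(p)) * p
        # (b) push each face's vertices toward its best-fit plane
        for f in faces:
            P = V[f]
            c = P.mean(axis=0)
            n = np.linalg.svd(P - c)[2][-1]       # normal: least-variance direction
            dist = (P - c) @ n                    # signed distances d(v)
            V[f] -= beta * np.outer(dist, n)
    return V
```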
Algorithm 3 works well on small graphs. It seems to work well on cubic graphs and perhaps on graphs with low vertex degrees, but the convergence might be poor, especially on larger graphs. It behaves very poorly (it does not even converge) if the degrees are rather high (> 4). We tested the performance mostly on cubic graphs. If the cubic graph is large, the convergence is very poor. To improve the convergence on cubic graphs we first use Algorithm 3 for a few iterations. Then at each iteration we calculate the vertices of the approximate polar (from the approximate faces) and the dual edges. Then we check whether the current dual edges are perpendicular to the original edges. If not, the vertices are moved in a manner of rotation for some small magnitude that depends on the difference between the current angle and π/2. This brings a small improvement to the convergence of the algorithm. Both methods significantly depend on α, β and similar parameters in the improved algorithm. Unfortunately we were not able to prove convergence for any set of the parameters.

Figure 2: A dodecahedron obtained by lifting a Tutte's drawing of its dual graph (left) and a dodecahedron in the canonical form (right).

References

[1] J. W. Demmel, Applied Numerical Linear Algebra, SIAM, 1997.
[2] R. Diestel, Graph Theory, Springer, 1997.
[3] C. D. Godsil, G. F. Royle, Algebraic Graph Theory, Graduate Texts in Mathematics 207, Springer-Verlag, 2001.
[4] G. W. Hart, Calculating Canonical Polyhedra, Mathematica in Education and Research, Vol. 6, No. 3, Summer 1997, pp. 5-10.
[5] P. Koebe, Kontaktprobleme der konformen Abbildung, Ber. Sächs. Akad. Wiss. Leipzig, Math.-Phys. Kl., 88 (1936), 141-164.
[6] B. Mohar, A polynomial time circle packing algorithm, Discrete Math. 117 (1993), 257-263.
[7] J. Norris, Markov Chains, Cambridge University Press, 1999.
[8] T. Pisanski, B. Plestenjak, A. Graovac, NiceGraph Program and its Applications in Chemistry, Croat. Chem. Acta 68 (1995), 283-292.
[9] T. Pisanski et al., VEGA program, http://vega.ijp.si/Htmldoc/Vega03.html
[10] J. Richter-Gebert, Realization Spaces of Polytopes, Lecture Notes in Mathematics, Vol. 1643, Springer-Verlag, Berlin, 1996.
[11] W. T. Tutte, How to draw a graph, Proc. London Math. Soc. 13 (1963), 743-767.
[12] G. M. Ziegler, Lectures on Polytopes, Graduate Texts in Mathematics 152, Springer-Verlag, New York, 1995; revised edition 1998.

Grammar-Based Systems: Definition and Examples

Marjan Mernik, Matej Črepinšek, Tomaž Kosar, Damijan Rebernak and Viljem Žumer
University of Maribor, Faculty of Electrical Engineering and Computer Science
Smetanova ulica 17, 2000 Maribor, Slovenia
{marjan.mernik, matej.crepinsek, tomaz.kosar, damijan.rebernak, zumer}@uni-mb.si

Keywords: context-free grammars, attribute grammars, grammar-based systems

Received: January 30, 2004

Formal language theory is an important part of theoretical computer science and has also been applied in many practical applications.
The importance of context-free grammars and attribute grammars for compiler construction and the automatic generation of compilers/interpreters is already well known. However, grammars can be found in many other applications which are not as clearly related to their original application, language description and implementation. We call such systems grammar-based systems. Although these systems have existed for a long time, no general comparison and classification has been done until now. The aim of this paper is to introduce and popularize grammar-based systems.

Povzetek: članek opisuje definicije in primere sistemov s formalnimi slovnicami.

1 Introduction

This paper emphasizes grammars, especially context-free grammars and attribute grammars. Their importance for compiler construction is already well known. However, from formal language definitions (e.g. attribute grammars) many other language-based tools can be automatically generated [7, 9], such as: pretty printers, syntax-directed editors, type checkers, dataflow analyzers, partial evaluators, debuggers, profilers, test case generators, visualizers, animators, and documentation generators. In most of these cases, the core language definitions have to be augmented with tool-specific information. In other cases, only a part of the formal language definition is sufficient for automatic tool generation, or implicit information must be extracted from the formal language definition in order to generate a tool automatically. Moreover, grammars can be found in many other applications which are not as clearly related to their original application, language description and implementation. We call such systems grammar-based systems (GBSs). Some papers describing particular approaches even contain this term in their titles (e.g. [4], [33], [36]). However, there is no exact definition, comparison or classification of such systems. Since grammar-based systems have gone mainly unnoticed, their benefits have also not been clearly identified. The aim of this paper is to remedy this situation by defining, introducing and popularizing grammar-based systems. The benefits of GBSs are identified and clearly stated. The organization of the paper is as follows. Section 2 presents the original application of grammars, namely the automatic generation of compilers/interpreters and other language-based tools, using our compiler generator LISA [22]. In section 3 we introduce and define GBSs and their application areas, followed by a presentation of practical examples in section 4. Concluding remarks and future research work are given in section 5.

2 Original Application of Grammars

The original application of grammars is as a notation for language description and its implementation [1]. What do we gain by formalizing the syntax and semantics of a programming language? The following benefits are identified:

- The language definition standardizes the language. This is important to programmers, who need to write syntactically and semantically correct programs and understand them without any doubt about their meaning. It is also important to language implementors, who need to write a correct compiler/interpreter of the specified language.

- The language definition allows a formal analysis of its properties, such as whether the definition is an LL(k) grammar or an L-attributed grammar. This contributes to better syntax and semantics of the programming language.
The programming language that has been formally designed is more regular, has fewer exceptions, and is easier to learn.

- The language definition enables us to systematically derive the implementation of a language, such as an LR(k) parser and an attribute evaluator. Moreover, such an implementation can be obtained automatically. In this case, the language definition is used as an input to a compiler generator system.

Researchers have recognized the possibility that many other language-based tools could be generated from a formal language definition. Therefore, many tools not only automatically generate a compiler/interpreter, but also complete language-based environments [7]. Such automatically generated language-based environments include editors, type checkers, debuggers, various analyzers, and animators. The automatic generation of compilers/interpreters and other language-based tools using our LISA compiler generator is presented in the rest of this section. To support incremental language development [21] and educational activities in teaching a "Compiler construction" course [23], the LISA (Language Implementation System based on Attribute grammars) tool was developed [22]. LISA is a compiler-compiler, i.e. a system that automatically generates a compiler/interpreter from attribute grammar-based language specifications. The specification of a toy language SELA (Simple Expression Language with Assignments) is given below, in order to illustrate the LISA style.

    language SELA {
      lexicon {
        Number     [0-9]+
        Identifier [a-z]+
        Operator   \+ | :=
        ignore     [\0x09\0x0A\0x0D\ ]+
      }
      attributes Hashtable *.inEnv, *.outEnv;
                 int *.val;
      rule Start {
        START ::= STMTS compute {
          STMTS.inEnv = new Hashtable();
          START.outEnv = STMTS.outEnv;
        };
      }
      rule Statements {
        STMTS ::= STMT STMTS compute {
          STMT.inEnv = STMTS[0].inEnv;
          STMTS[1].inEnv = STMT.outEnv;
          STMTS[0].outEnv = STMTS[1].outEnv;
        }
        | STMT compute {
          STMT.inEnv = STMTS[0].inEnv;
          STMTS[0].outEnv = STMT.outEnv;
        };
      }
      rule Statement {
        STMT ::= #Identifier \:= EXPR compute {
          EXPR.inEnv = STMT.inEnv;
          STMT.outEnv = put(STMT.inEnv, #Identifier.value(), EXPR.val);
        };
      }
      rule Expression {
        EXPR ::= EXPR + EXPR compute {
          EXPR[2].inEnv = EXPR[0].inEnv;
          EXPR[1].inEnv = EXPR[0].inEnv;
          EXPR[0].val = EXPR[1].val + EXPR[2].val;
        };
      }
      rule Term1 {
        EXPR ::= #Number compute {
          EXPR.val = Integer.valueOf(#Number.value()).intValue();
        };
      }
      rule Term2 {
        EXPR ::= #Identifier compute {
          EXPR.val = ((Integer)EXPR.inEnv.get(#Identifier.value())).intValue();
        };
      }
    }

LISA automatically generates a SELA compiler/interpreter from this specification. An example of a program written in the SELA language is shown in Figure 1.

Figure 1: Language knowledgeable editor

LISA also automatically generates other tools, such as language knowledgeable editors and various inspectors (e.g. a finite state automata visualizer (Figure 2), and syntax and semantic tree animators (Figure 3)) that are useful for understanding the behavior of the generated language compiler/interpreter. A LISA-generated language knowledgeable editor is aware of the regular definitions of the language lexicon. Therefore, it can color the different parts of a program (comments, operators, reserved words) to enhance the understandability and readability of programs. In Figure 1 the operators in the SELA program are recognized while editing and displayed in a different color.
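Figure 1 itself is not reproduced here, but a program of the kind it shows can be read off the specification above: the lexicon admits lower-case identifiers and integer literals, and every statement is an assignment over '+' expressions. The following two-line SELA program is our own illustration, not the one in the figure:

    a := 3 + 4
    b := a + 2

Running it through the generated interpreter yields a final environment in which a is bound to 7 and b to 9.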
3 Grammar-Based Systems: Definition

As already mentioned, grammars can be found in many other systems than those described in section 2. These systems do not focus on language definition and implementation, but on solving various other problems. We call such systems grammar-based. The essential characteristic of GBSs is comprised in the following definition: A grammar-based system is any system that uses a grammar and/or sentences produced by this grammar to solve various problems outside the domain of programming language definition and its implementation. The vital component of such a system is well structured and expressed with a grammar or with sentences produced by this grammar in an explicit or implicit manner. The key characteristic of GBSs according to our definition is a grammar which presents a kernel for the problem-solving part of the system (the application). Without this grammar part the system becomes less general and, by lacking an important generic functionality, it can become usable only in a very restricted manner.

Figure 2: FSA visualizer

Figure 3: A snapshot of the semantic tree animator

In many GBSs the transformation of the representation to a context-free grammar makes various analyses of properties feasible. In others the ability of context-free grammars to represent infinite languages with a finite set of production rules is exploited. Sometimes a problem can be solved simply by converting the representation to a context-free grammar, since the appropriate tools already exist. Why is the study of GBSs so important? The theory of grammars is well defined and described in many books, where different examples and solutions are presented [1, 32]. In our research we discovered that grammars can be, and already are, used as a kernel part of many different practical systems. The fact that grammars are so well defined is an advantage for developers, because they can use that knowledge, and the already well tested solutions, to build their own. The main problem of using grammars as part of a solution is to identify those problems that have a grammatical nature (can be solved with grammars) and to convert their presentation into the form of grammars. One may ask why traditional programming applications such as compilers are excluded from the above definition, since it is clear that these applications heavily depend on grammars. Actually, their whole construction is based on grammars. Grammars have been used in this area since their invention, and other ad-hoc approaches were mainly superseded by grammars a long time ago. In this case we simply do not have other options. Therefore, talking about a grammar-based compiler would be awkward. On the other hand, using grammars in other application areas can be regarded as an alternative and novel approach with clearly defined benefits. In this case the noun qualifier "grammar-based" is really appropriate. Our longstanding interest in grammars inspired us to start collecting information about their different practical application areas, such as:

- Software engineering, where syntax definition occurs in various software development processes: in the form of rapid prototyping [28, 8], the modeling of flow and constraints of collaborating software components [17], and many others [3, 15].

- Evolutionary computation is the study of computational systems that use ideas from natural evolution and adaptation to search the solution space. One of the research fields of evolutionary computation is grammatical evolution [26].
- Information theory comprises a vast range of diverse scopes. So far we have observed grammars involved in encoding methods [2, 25], program compaction [4] and grammatical inference [19]. Other grammar-dependent software in information theory is under investigation.

- Neural networks bring grammars into use with grammatical descriptions of neural network topology [14, 6, 10].

- Data representation architecture uses special kinds of grammars for communicating business data among very diverse systems. The particular technology considered is XML [29].

- Other areas of computer science (speech recognition, data mining, syntactical pattern recognition, etc.).

Applications of grammars are found even in areas outside computer science, such as organizational science [27] and mechanical engineering [33]. Until now GBSs have been studied only for particular problems (e.g. compression) without any general comparison and classification. The only attempt, to the best of our knowledge, is a recent work described in [15], where the authors coined the word grammarware. To quote their definition: "Grammarware comprises grammars and all grammar-dependent software, i.e. software artefacts that directly involve grammar knowledge." Their definition classifies compilers, program analysis tools, program transformation tools, application generators, weaving tools and CASE tools as grammarware. Their definition is, in some sense, more restricted than ours and includes just GBSs from those areas of software engineering where grammars appear in an explicit form. Sentences produced by this grammar are always computer programs. Our definition is much broader, since we are interested in GBSs for application areas that are even outside computer science (e.g. organizational science). On the other hand, in our definition grammars and sentences produced by the grammar can be expressed implicitly (e.g. in GOOD [17] a sentence is a sequence of method calls during the execution of an application program).

4 Grammar-Based Systems: Examples

GBSs can be found in different application areas such as: software engineering, evolutionary computations, information theory, neural networks, data mining, syntactical pattern recognition, and data representation. In this section some of our own applications, as well as other representative applications of grammars, are presented and their benefits are stressed in more detail. The examples clearly show how grammars and sentences generated by a grammar can be used to describe various structured artifacts. Moreover, various possibilities exist for using GBSs. The examples include descriptions of GBSs where the sentences generated by the grammar appear in an explicit or implicit manner.

4.1 Software Engineering

4.1.1 Grammatical Approach to Problem Solving

In [8, 28] the grammatical approach to problem solving (GAPS) is presented. It is based on the following steps:

- describe the syntax of the problem (the structure of the classes that characterize the problem domain), deriving the context-free grammar from the conceptual class diagram,

- describe the semantics of the problem (the meaning of the classes in the problem domain), associating attributes to every concept derived from the use cases and operational diagrams,

- generate a rapid prototype of the system, using a compiler generator and the attribute grammar obtained in the two previous steps.

Only the first step is explained for the purpose of this paper. A detailed explanation of the above steps can be found in [8].
The role of non-terminal symbols in a context-free grammar is twofold. First, at a higher abstraction level, non-terminal symbols are used to describe different concepts in the programming language (e.g. an expression or a declaration in a general-purpose programming language). On the other hand, at a more concrete level, non-terminal and terminal symbols are used to describe the structure of a concept (e.g. an expression consists of two operands separated by an operator symbol, or a variable declaration consists of a variable type and a variable name). Therefore, both the concepts and the relations between them, belonging to the specific problem domain, are captured in a context-free grammar. But this is also true for the conceptual class diagram [30], which describes concepts in a problem domain and their relations. It is clear that both formalisms can be used for the same purpose and that some rough transformation from a conceptual class diagram to a context-free grammar, and vice versa, should exist. The transformation from a conceptual class diagram to a context-free grammar is depicted in Tables 1 and 2.

Table 1: From a conceptual class diagram to a context-free grammar
   class                      non-terminal symbol
   class attribute            terminal symbol
   association (A with B)     A ::= B
   generalization (A to B)    A ::= B

Table 2: Association multiplicity
   exactly one (1)        A ::= B
   optional (0..1)        A ::= B | epsilon
   many (0..*)            A ::= MoreB ;  MoreB ::= MoreB B | epsilon
   at least one (1..*)    A ::= MoreB ;  MoreB ::= MoreB B | B

Classes can collaborate with more than just one class. For example, class A may associate with classes B, C and D. In our approach, this collaboration is described with the context-free grammar production A ::= B C D. The sequence of non-terminal symbols on the right side of the production should be in natural order and depends on the collaboration of entities in a given problem domain. As an example, let us transform the conceptual class diagram in Fig. 4 to a context-free grammar. From this conceptual class diagram the following context-free grammar is obtained using transformation Tables 1 and 2:

    VIDEO_STORE ::= MOVIES CUSTOMERS
    MOVIES ::= MOVIES MOVIE | MOVIE
    MOVIE ::= title PRICE
    CUSTOMERS ::= CUSTOMERS CUSTOMER | epsilon
    CUSTOMER ::= name RENTALS
    RENTALS ::= RENTALS RENTAL | RENTAL
    RENTAL ::= daysRented MOVIE
    PRICE ::= new | child | reg

The next step of the grammatical approach is to write a detailed semantic description of the problem domain, deriving the attribute grammar. The prototype of the system is obtained after a straightforward transformation of the attribute grammar specification to a LISA specification. The following program describes a particular use of the system:

    //entering movies into database
    lion_king child
    gone_with_the_wind reg
    the_ring new
    //customers and their rentals description
    Andy 3 lion_king child 2 gone_with_the_wind reg
    Mary 3 the_ring new

Figure 4: Conceptual Class Diagram for Video Store

What are the advantages of using grammars in this case? By transforming the conceptual class diagram into a context-free grammar, a domain-specific language is obtained that describes the user interaction with the system. In this manner a rapid prototype is obtained that can be used whenever the user's requirements are not well defined.
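The multiplicity rules of Table 2 are mechanical enough to be generated automatically. The following sketch is ours and is not part of GAPS; it emits the productions for a single association from class a to class b, following the rows of Table 2:

    def multiplicity_productions(a, b, mult):
        # 'More' + b plays the role of the auxiliary non-terminal MoreB.
        more = "More" + b
        table = {
            "1":    [f"{a} ::= {b}"],
            "0..1": [f"{a} ::= {b} | epsilon"],
            "0..*": [f"{a} ::= {more}", f"{more} ::= {more} {b} | epsilon"],
            "1..*": [f"{a} ::= {more}", f"{more} ::= {more} {b} | {b}"],
        }
        return table[mult]

For example, multiplicity_productions("CUSTOMER", "RENTAL", "1..*") yields the productions CUSTOMER ::= MoreRENTAL and MoreRENTAL ::= MoreRENTAL RENTAL | RENTAL.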
Another benefit is that a conformability check of the conceptual class diagram is also possible.

4.1.2 Adaptive Programming

Adaptive programming (AP) [18] is a subclass of aspect-oriented programming [13]. The stress is put on code tangling: the required functionality is not always trivial to implement in existing applications when cross-cutting concerns exist. Adaptive programming offers a solution in the form of traversal specifications, in order to provide that additional functionality without modification of the existing code. These specify connections between objects as loosely as possible (so-called "structure-shy" programming). The structure of the application class dictionary can be seen as a context-free grammar (Figure 5), from which an object graph may be derived that expresses all possible navigations through the code.

    S ::= A B | A S B
    A ::= ...
    B ::= C ...

Figure 5: Class dictionary (CFG) and its object graph

The idea of adaptive programming is very general. The Demeter [18] language has been integrated with various object-oriented programming languages (the Demeter tool was successfully applied to Java). Demeter allows programmers to write specifications such as the following (see Fig. 5): starting from object A, go to object C via all objects with an attribute named "x". Such traversal specifications make it possible to add arbitrary execution paths. The problem of adaptive programming can also be solved using other techniques (e.g. the visitor pattern). In comparison, the presented solution avoids code tangling, increases the programmer's productivity and, consequently, reduces error-prone coding.

4.1.3 Other Approaches

In Grammar-Oriented Object Design (GOOD) [17] a context-free grammar is used to represent the set of all possible interactions (collaborations) of objects in a particular cluster, in order to fulfill the domain goals. When the grammar is interpreted at run-time, a cluster will dynamically bind the collaborators to the collaborations. Hence, GOOD facilitates the creation of dynamically configurable components, which encapsulate volatile business rules. The rationale behind this is that creating and representing a model of solutions is more extensible, simpler and more scalable than just creating a single solution. Possible solutions are modeled with a meta-model and represented as a context-free grammar. If this grammar is available to the "users" at run-time, then they are able to customize the system's behavior. An example of a production rule in [17] using EBNF is:

    ShoppingCartOperation ::= {AddItem | DeleteItem | SaveShoppingCart} CheckOut

Since the interaction of objects is obtained from use case diagrams that describe the functionality of a system, the author of [17] called such a grammar a use case grammar. The author distinguishes two types of meta-models: the static (class diagram) and the dynamic (valid object interaction sequences) meta-model. The latter is described with a context-free grammar. In [3] the correspondence between feature diagrams and context-free grammars has been identified, where atomic features map to terminal symbols, composite features map to non-terminal symbols, and feature operators map to syntax operators. In domain analysis, feature diagrams are used to describe commonalities, variabilities and dependencies between variable properties in the application domain.
By converting a feature diagram to a context-free grammar (FD2CFG), syntax tools can be applied to feature descriptions for free (e.g. the validity of a configuration corresponds to successful parsing). The Free University of Amsterdam recently launched a project on Grammar Engineering, software engineering for grammars [15]. Topics included are grammar recovery, grammar implementation, and the application of grammars in software renovation. The more technical issues include concepts and technology for grammar-based software renovation factories, grammar adaptation, grammar documentation, grammar testing and many others.

4.2 Evolutionary Computations

Genetic programming (GP) is an evolutionary approach in which the evolving population consists of computer programs [16]. Each member of the population, a chromosome, represents a possible solution in the search space of all possible programs written in a pre-selected programming language (e.g. Lisp). Since the search space is too large, it is restricted by the user-defined function set F and the terminal set T. The set T contains variables and constants, and the set F contains functions that are a priori believed to be useful for the problem domain. For example, in the Santa Fe ant trail problem [16], from the sets T = {(MOVE), (LEFT), (RIGHT)} and F = {IF-FOOD-AHEAD, PROGN2, PROGN3} the following solution (a Lisp program) can be evolved:

    (IF-FOOD-AHEAD (MOVE)(PROGN3 (LEFT)(PROGN2 (IF-FOOD-AHEAD
      (MOVE)(RIGHT))(PROGN2 (RIGHT)(PROGN2 (LEFT)(RIGHT))))(PROGN2
      (IF-FOOD-AHEAD (MOVE)(LEFT))(MOVE))))

In [26] the concept of grammatical evolution (GE) has been introduced. GE is an evolutionary algorithm that can evolve programs in an arbitrary language [31]. The input to GE is a BNF definition used in the genotype-to-phenotype mapping process. For example, the following grammar can be used as input for the Santa Fe ant trail problem:

    CODE ::= LINE
    CODE ::= CODE LINE
    LINE ::= EXPR
    EXPR ::= IF-STAT
    EXPR ::= OP
    IF-STAT ::= if (food-ahead()) EXPR else EXPR
    OP ::= left()
    OP ::= right()
    OP ::= move()

The population consists of variable-length binary strings that determine which production rules from the grammar definition are used in the genotype-to-phenotype mapping process. The appropriate production rule is selected by the following mapping function:

    rule = (integer value stored in the chromosome) MOD
           (number of production rules for the left-most non-terminal)

For example, the chromosome (203 245 110 55 29 200 241 11 151 162 227 74) encodes the following left-most derivation:

    CODE =>(203 mod 2) CODE LINE => CODE LINE LINE => LINE LINE LINE
         => EXPR LINE LINE => OP LINE LINE => move() LINE LINE => ...
         => move() if (food-ahead()) move() else left() move()

What are the benefits of using grammars in GE? Obviously, GE is much more flexible than GP because it can produce code in any language. Furthermore, in GE the closure problem, the generation and preservation of valid programs, does not exist. Other benefits come with the separation of the search and solution spaces, because the grammar enables the genotype-to-phenotype mapping process. This allows an unconstrained evolutionary search to be performed on simple variable-length binary strings. Moreover, new advances in genetic algorithms can easily be incorporated into GE, and any new search algorithm operating on binary strings can be used.
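The mapping just illustrated is simple enough to state as executable code. The sketch below is our rendering of the GE genotype-to-phenotype mapping, not the implementation of [26]; consistently with the worked derivation above, a codon is consumed only when the left-most non-terminal offers a choice, and the genome wraps around if it is exhausted.

    def ge_map(genome, grammar, start="CODE", max_wraps=2):
        symbols, out = [start], []
        idx = wraps = 0
        while symbols:
            sym = symbols.pop(0)           # expand the left-most symbol
            if sym not in grammar:         # terminal symbol: emit it
                out.append(sym)
                continue
            rules = grammar[sym]
            choice = 0
            if len(rules) > 1:
                if idx == len(genome):     # wrap the genome
                    idx, wraps = 0, wraps + 1
                    if wraps > max_wraps:
                        raise ValueError("mapping did not terminate")
                choice = genome[idx] % len(rules)
                idx += 1
            symbols = rules[choice] + symbols
        return " ".join(out)

    santa_fe = {
        "CODE":    [["LINE"], ["CODE", "LINE"]],
        "LINE":    [["EXPR"]],
        "EXPR":    [["IF-STAT"], ["OP"]],
        "IF-STAT": [["if (food-ahead())", "EXPR", "else", "EXPR"]],
        "OP":      [["left()"], ["right()"], ["move()"]],
    }

With this grammar, ge_map([203, 245, 110, 55, 29, 200, 241, 11, 151, 162, 227, 74], santa_fe) reproduces the phenotype move() if (food-ahead()) move() else left() move() derived above.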
4.3 Information Theory

The ability of a grammar to represent an infinite language with a finite set of production rules also makes grammars useful in compression algorithms. Grammar-based encoding (GBEnc) methods, such as derivation encoding [34], which represents a program by the sequence of grammar rules used to derive it from the start symbol, have been proven useful for compressing programs. For example, the program using the grammar from subsection 4.2 can be encoded as 110120121012 by derivation encoding. The derivation tree is shown in Figure 6. It was shown in [2] that programs can be compressed to almost 10% of their original size.

Figure 6: Derivation tree

Another grammar-based compression algorithm is SEQUITUR [25], which constructs a context-free grammar for its input. The resulting grammar is capable of generating just one string, namely the original sequence. For example, the sequence abcdbcabcd is represented by the following grammar:

    S ::= C A C
    A ::= bc
    C ::= aAd

The algorithm identifies the hierarchical structure (Figure 7) in sequences of symbols and uses that information for compression. By detecting and eliminating redundancy it outperforms standard compression techniques on very large or highly structured sequences. SEQUITUR performs well on many practical problems such as DNA sequences and genealogical databases. It is also possible to use the system as a basis for generalization in grammatical inference [19].

Figure 7: Hierarchical structure for the grammar

Grammar-based techniques (e.g. [4]) have also been used in program compaction, which is compression with an additional constraint: the compressed program has to be executable.
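Because a SEQUITUR grammar generates exactly one string, it can be checked by expansion. The few lines below are our illustration, not part of SEQUITUR itself; they expand the grammar above and confirm that it reproduces the original sequence:

    def expand(symbol, rules):
        # Recursively expand a straight-line grammar; anything without
        # a rule is treated as a terminal string.
        if symbol not in rules:
            return symbol
        return "".join(expand(s, rules) for s in rules[symbol])

    rules = {"S": ["C", "A", "C"], "A": ["bc"], "C": ["a", "A", "d"]}
    assert expand("S", rules) == "abcdbcabcd"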
4.4 Neural Networks

The selection of a suitable neural network topology is an important step in finding a good solution for the problem under investigation. Therefore, the search for a suitable neural network architecture is a common task. Here we can use direct encoding, or so-called grammatical encoding [14], where the architecture of the neural network is generated from its grammar description. The advantages of grammatical encoding are better scalability and the possibility of finding building blocks. This work is further elaborated in [6], where cellular encoding using graph grammars was proposed. The architecture, the weights, and the kind of sigmoids used by each neuron are encoded. Cellular encoding can be seen as a machine language for neural networks and can be used as a tool for designing neural networks. In [10] an attribute grammar is used to specify classes of neural network structures with an explicit representation of their functional organization. The approach is termed Network Generating Attribute Grammar Encoding (NGAGE). The specification of a neural network structure is extracted from the attributes of the root symbol and interpreted to produce a functional neural network. This neural network can be randomly initialized and trained afterwards. NGAGE is especially usable in genetic programming, in the form of a neural network representation, where each production rule (derivation subtree) corresponds to a meaningful structural component of the neural network. These characteristics of NGAGE can be used for the implementation of the genetic operators, crossover and mutation. The other benefit of NGAGE is the identical representation of different neural networks. The similarities and differences between them can be emphasized within a common framework.

4.5 Data Representation

The use of mark-up languages on the Web is indispensable. The hypertext mark-up language (HTML) is the best known example. In the last few years the eXtensible Markup Language (XML) has been introduced as a mark-up language for the uniform representation of data. It was originally meant as a format for transferring data over the Internet. The separation of data from its representation increased the standard's applicability to other computing areas. Although our perception is that compiler notation and mark-up languages have little in common, the reality is quite different. The syntax of XML documents is conceptually similar to the meta-language (defined by BNF) of compilers. The analogy between compilers and mark-ups is shown in Table 3.

Table 3: Analogy between compiler and mark-up
   Notation         Compiler                Mark-up
   meta-notation    LISA, ASF+SDF           XML
   syntax           context-free grammar    DTD, XML Schema

The syntax of an XML document is defined by a Document Type Definition (DTD). A DTD defines the document structure, elements, their attributes and types. It uses the syntax of EBNF ('*', '+', '?', '|') to describe the syntactical structure of XML documents. Therefore, a DTD can be seen as an extended BNF (EBNF). The XML elements of a DTD can be seen as non-terminal symbols in EBNF. Empty elements are without contents; their meaning is in their position and attributes. Non-empty elements can contain other elements and textual content. Textual content can be paralleled with terminal symbols in EBNF. An example of an XML document for such a DTD is written below:

    <paper_collection>
      <paper>
        <title>Grammatical Approach to Problem Solving</title>
        <author>Pedro Henriques</author>
        <year>2003</year>
      </paper>
    </paper_collection>

Another similarity between mark-up languages and compilers is the building of a parser. An XML parser generator analyzes the source DTD and automatically generates a parser for XML documents which comply with the source DTD, which is also the case in compiler building tools (e.g. LISA [20]). The disadvantage of XML against context-free grammars is in its lexical part. The textual content of XML elements can be either a generic string (denoted by '#PCDATA' in the DTD) or an enumeration of allowed values. This limitation is reduced with XML Schema, which offers a richer notation for describing the textual content of XML elements.
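As an illustration only, a DTD consistent with the XML fragment above might read as follows; the element names author and year are our assumptions:

    <!ELEMENT paper_collection (paper*)>
    <!ELEMENT paper (title, author, year)>
    <!ELEMENT title (#PCDATA)>
    <!ELEMENT author (#PCDATA)>
    <!ELEMENT year (#PCDATA)>

Read as EBNF, the first two declarations correspond to the productions paper_collection ::= paper* and paper ::= title author year, with #PCDATA playing the role of a terminal symbol.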
4.6 Other Applications

Due to page limitations, not all GBSs can be described in detail. The aim of the paper is to show the plethora of different research areas where grammars have proven to be useful. A short description of such systems in other areas follows. A number of researchers have proposed ways to use grammar-based notations for expressing knowledge in the speech recognition process. In most cases a CFG was used to generate or filter word transitions [24]. To improve semantic sentence recognition, probabilistic LR parsers have been used, as well as stochastic CFGs (SCFG) [12]. Data mining is the automated process of discovering knowledge from databases. Various data mining methods exist, among them inductive logic programming and genetic programming. In [36] both approaches were integrated using logic grammars, aiming to exploit the benefits of both. A special rule learning method has been developed where a grammar represents the rules. Moreover, the grammar can be modified in order to learn rules. Formal language theory has also been successfully applied to pattern recognition problems [5] in which the patterns contain most of their information in their structure rather than in their numeric values. In order to make grammars more suitable for pattern recognition, the concept of context-free grammars has been extended to stochastic grammars [35] and fuzzy grammars [11]. The design of organizational and work processes is well defined. However, there is a lack of formal approaches to discovering new organizational processes. By using grammars, one may systematically search for solutions in process redesign, as well as for new solutions in process organization. Therefore, the process grammar [27] offers solutions complementary to the existing ones.

4.7 Concluding remarks on GBS examples

It is important that GBSs are not studied in an isolated manner. In order to be able to make a general comparison of different GBSs and to classify them, we need to find their common and variable properties. The main reason for a classification of GBSs is to identify the differences among GBSs and to identify their representative examples. These classifications can be used by future developers to identify whether their system is solvable using a grammar-based approach. Developers can further use the classification to build their own systems faster and more efficiently. The following questions help to identify different dimensions of GBSs:

Q1 What is described with the grammar G?
Q2 What is described with a program P generated by the language L(G)?
Q3 Is the representation of a program P generated by the language L(G) explicit?
Q4 Is the control flow from G to P or from P to G? (Is the input to the GBS defined by the grammar or by the program?)
Q5 Why was the GBS invented?

In Table 4 the answers for some examples are given.

Table 4: A comparison among different GBSs
   GAPS:     Q1 conceptual class diagram (CCD); Q2 user interaction with the system; Q3 yes; Q4 G to P; Q5 to obtain a rapid prototype of the system
   GOOD:     Q1 interaction between objects; Q2 sequence of method calls executed by an application program; Q3 no; Q4 G to P; Q5 to extend the generality of the system
   FD2CFG:   Q1 feature diagram (FD); Q2 an instance of a system described by the FD; Q3 yes; Q4 G to P; Q5 to check if an instance is a valid system by the FD
   GE:       Q1 grammar of the target language used in GP; Q2 program written in the target language; Q3 no; Q4 G to P; Q5 to extend the generality of the system
   GBEnc:    Q1 grammar of the target language; Q2 program written in the target language to be compressed; Q3 yes; Q4 P to G; Q5 for compression
   SEQUITUR: Q1 grammar of the target sequence of symbols; Q2 sequence of symbols; Q3 yes; Q4 P to G; Q5 for compression
   AP:       Q1 connections between objects; Q2 structure of the application class dictionary; Q3 no; Q4 G to P; Q5 to extend the functionality of applications
   NGAGE:    Q1 neural network (NN) structure; Q2 a fully functional NN; Q3 no; Q4 G to P; Q5 to simplify the NN representation for GP

It is important for future application developers to notice that with a grammar-based approach their systems can benefit in several directions:

- the system can become more general (e.g. GOOD [17]),
- the system can be easier to develop (e.g. [3]),
- the system's underlying representation can be more efficient (e.g. [10]).

This paper describes some of the representative examples of the above mentioned benefits. The main contributions of this paper therefore are:

- a definition of grammar-based systems,
- identifying problems that can be solved with a grammar-based approach,
- identifying the benefits of grammar-based systems, and
- popularizing grammar-based systems.

5 Conclusions and future work

Formal language theory has been applied in many practical applications. In addition to language description and implementation (the original applications of grammars), grammars have proven useful in many other areas.
However, there has been no dedicated study of systems (applications) in which a grammar plays a vital role. In this paper such systems are introduced and defined as GBSs. The paper contains representative examples of GBSs in various areas of computer science, such as: software engineering, evolutionary computations, information theory, neural networks, and data representation. Although the formal theory of grammars is well defined, there are still many research possibilities in the field of GBSs. We have noticed several unexplored areas in GBSs, such as:

- Classification of GBSs. There is no classification of GBSs. We believe that this can be attained through questions similar to the ones proposed in section 4.7. However, further case studies of GBSs are required.

- When to develop a GBS? No guidelines exist to show whether a particular problem should be solved with grammar knowledge.

- GBS patterns. The remaining question is how to develop GBSs. Identifying patterns would improve and speed up the interest in developing GBSs.

In the future we plan to extend our research on GBSs. Our research will be focused on solving the problems presented in this paper and on finding other areas or problems that can be efficiently solved with the grammar-based approach. We want to show that problem definition using a formal approach (a grammar) can increase the efficiency, reliability and generality of the solution.

6 Acknowledgements

We would like to thank Jeff Gray and the anonymous referees for useful comments.

References

[1] A. V. Aho and J. D. Ullman. The theory of languages. Mathematical Systems Theory, 2(2):97-125, 1968.
[2] R. Cameron. Source encoding using syntactic information models. IEEE Transactions on Information Theory, 34(4):843-850, 1988.
[3] M. de Jonge and J. Visser. Grammars as feature diagrams. Draft, Apr. 2002.
[4] W. S. Evans and C. W. Fraser. Grammar-based compression of interpreted code. Communications of the ACM, 46(8):61-66, 2003.
[5] K. Fu. Syntactic Pattern Recognition and Applications. Prentice-Hall, 1982.
[6] F. Gruau. Neural Network Synthesis using Cellular Encoding and the Genetic Algorithm. PhD thesis, Laboratoire de l'Informatique du Parallélisme, École Normale Supérieure de Lyon, France, 1994.
[7] J. Heering and P. Klint. Semantics of programming languages: A tool-oriented approach. ACM SIGPLAN Notices, 35(3):39-48, Mar. 2000.
[8] P. Henriques, T. Kosar, M. Mernik, M. J. V. Pereira, and V. Žumer. Grammatical approach to problem solving. In Proceedings of the 25th International Conference on Information Technology Interfaces, pages 645-650. SRCE University Computing Centre, University of Zagreb, 2003.
[9] P. Henriques, M. V. Pereira, M. Mernik, M. Lenič, E. Avdičaušević, and V. Žumer. Automatic generation of language-based tools. In M. van den Brand and R. Laemmel, editors, Electronic Notes in Theoretical Computer Science, volume 65. Elsevier Science Publishers, 2002.
[10] T. Hussain and R. Browse. Attribute grammars for genetic representations of neural networks and syntactic constraints of genetic programming. In AIVIGI'98: Workshop on Evolutionary Computation, 1998.
[11] Y. Inagaki and T. Fukumura. On the description of fuzzy meaning of context-free languages. In L. Zadeh, K. Fu, K. Tanaka, and M. Shimura, editors, Fuzzy Sets and Their Applications to Cognitive and Decision Processes, pages 301-328. Academic Press, 1975.
[12] D. Jurafsky, C. Wooters, J. Segal, A. Stolcke, E. Fosler, G. Tajchman, and N. Morgan.
Using a stochastic context-free grammar as a language model for speech recognition. In Proc. ICASSP '95, pages 189-192, Detroit, MI, 1995.
[13] G. Kiczales, J. Lamping, A. Mendhekar, C. Maeda, C. Lopes, J.-M. Loingtier, and J. Irwin. Aspect-oriented programming. In Proceedings of the European Conference on Object-Oriented Programming, volume 1241, pages 220-242. Springer-Verlag, 1997.
[14] H. Kitano. Designing neural networks by genetic algorithms using graph generation systems. Complex Systems, (4):461-476, 1990.
[15] P. Klint, R. Lämmel, and C. Verhoef. Towards an engineering discipline for grammarware. Draft, Aug. 2003.
[16] J. R. Koza. Genetic Programming: On the Programming of Computers by Means of Natural Selection. MIT Press, 1992.
[17] K. Levi and A. Arsanjani. A goal-driven approach to enterprise component identification and specification. Communications of the ACM, 45(10):45-52, October 2002.
[18] K. J. Lieberherr. Adaptive Object-Oriented Software: The Demeter Method with Propagation Patterns. PWS Publishing Company, 1996.
[19] M. Mernik, M. Črepinšek, G. Gerlič, V. Žumer, B. R. Bryant, and A. Sprague. Learning context-free grammars using an evolutionary approach. Technical report, University of Maribor and The University of Alabama at Birmingham, 2003.
[20] M. Mernik, N. Korbar, and V. Žumer. LISA: A tool for automatic language implementation. ACM SIGPLAN Notices, 30(4):71-79, Apr. 1995.
[21] M. Mernik, M. Lenič, E. Avdičaušević, and V. Žumer. Multiple Attribute Grammar Inheritance. Informatica, 24(3):319-328, Sept. 2000.
[22] M. Mernik, M. Lenič, E. Avdičaušević, and V. Žumer. LISA: An Interactive Environment for Programming Language Development. In N. Horspool, editor, 11th International Conference on Compiler Construction, volume 2304 of Lecture Notes in Computer Science, pages 1-4. Springer-Verlag, 2002.
[23] M. Mernik and V. Žumer. An educational tool for teaching compiler construction. IEEE Transactions on Education, 46(1):61-68, February 2003.
[24] R. Moore, F. Pereira, and H. Murveit. Integrating speech and natural-language processing. In Proc. of the Speech and Natural Language Workshop, pages 243-247, Philadelphia, PA, 1989.
[25] C. G. Nevill-Manning and I. H. Witten. Compression and explanation using hierarchical grammars. The Computer Journal, 40:103-116, 1997.
[26] M. O'Neill and C. Ryan. Grammatical evolution. IEEE Transactions on Evolutionary Computation, 5(4):349-358, August 2001.
[27] B. T. Pentland. Grammatical models of organizational processes. Organization Science, 6(5):541-556, 1995.
[28] M. V. Pereira, M. Mernik, T. Kosar, P. Henriques, and V. Žumer. Object-oriented attribute grammar based grammatical approach to problem specification. Technical report, University of Braga, Department of Computer Science, 2002.
[29] E. T. Ray. Learning XML: Creating Self-Describing Data. O'Reilly & Associates, Inc., 2001.
[30] J. Rumbaugh, M. Blaha, W. Premerlani, F. Eddy, and W. Lorensen. Object-Oriented Modeling and Design. Prentice-Hall, 1991.
[31] C. Ryan, J. J. Collins, and M. O'Neill. Grammatical evolution: Evolving programs for an arbitrary language. In Proceedings of the First European Workshop on Genetic Programming, volume 1391 of LNCS, pages 83-95, Paris, 14-15 Apr. 1998.
[32] A. Salomaa. Theory of Automata. Pergamon Press, 1969.
[33] L. C. Schmidt and J. Cagan. GGREADA: A graph grammar-based machine design algorithm. Research in Engineering Design, 9:195-213, 1997.
[34] R. G. Stone.
On the choice of grammar and parser for the compact analytical encoding of programs. The Computer Journal, 29(5):307-314, 1986.
[35] P. Swain and K. Fu. Stochastic programmed grammars for syntactic pattern recognition. Pattern Recognition, (4):83-100, 1972.
[36] M. L. Wong and K. S. Leung. Data mining using grammar based genetic programming and applications. Kluwer Academic Publishers, 2000.

Improved Error Recovery in Generated LR Parsers

Boštjan Slivnik and Boštjan Vilfan
University of Ljubljana, Faculty of Computer and Information Science
Tržaška 25, 1000 Ljubljana, Slovenia
bostjan.slivnik@fri.uni-lj.si, bostjan.vilfan@fri.uni-lj.si

Keywords: LR parsing, error recovery and reporting

Received: February 17, 2004

A new method for error recovery in LR parsers is described. An error recovery routine based on this new method can be generated automatically by a parser generator as a part of an LR parser. Based on the result that a viable suffix from which the unread part of the input is derived can be computed in certain states of an LR parser, the new method uses the viable suffix to discard the erroneous part of the input and to synchronize the parser stack with the rest of the input afterwards. Thus it resembles a simple but efficient error recovery method used by LL and other predictive parsers. It is proved that all states suitable for this kind of error recovery can be identified automatically by a parser generator.

Povzetek: članek opisuje okrevanje po napaki v LR analizatorjih.

1 Introduction

Compilers are programs that mostly process erroneous input. Robust error recovery and meaningful error reporting are therefore essential parts of any industrial-strength compiler. Nowadays many compilers perform syntax analysis using an LALR parser (more precisely, an LA(1)LR(0) parser) that is generated automatically by a parser generator like yacc, bison, JavaCUP, etc. [3]. However, none of these parser generators uses any advanced method for error recovery and reporting, mainly because these methods are either (a) time consuming, or (b) require inspection of individual parser states and manual insertion of error recovery routines [4, 5]. LALR parser generators usually provide only a very simple mechanism for error recovery and none for error reporting. If a yacc-generated parser is to recover after an error is encountered within a sentential form derived from a nonterminal A, the compiler writer must manually insert a production A -> error a, where error is a yacc reserved word [2, 3]. In case of an error, the parser abandons other productions expanding A, moves forward over the erroneous part of the input and discards it until a string which can be reduced to a is seen. A reduction to A is performed and thus the parser is resynchronized. However, "proper placement of error tokens in a grammar is a black art" [3]. The paper is organized as follows. Section 2 includes definitions of some basic elements of formal language theory and parsing. Following a brief outline of the new method in Section 3, the construction of the finite automaton used by the error recovery routine is described in Section 4. This is followed in Section 5 by (a) the algorithm for computing the viable suffix needed for error recovery and (b) the algorithm for computing a grammatical context of the erroneous part of the input. Examples and figures are given along the way.

2 Basic definitions

Standard terminology of formal language theory and parsing is assumed [4, 5].
Throughout the paper we assume that grammars are reduced (no useless or unreachable symbols) and $-augmented (the production S' → $S$ is added as the only production expanding the new start symbol S'). A string γ ∈ V* is a viable prefix (suffix) of G iff there exists a rightmost (leftmost) derivation S ⇒*_rm γu (S ⇒*_lm u γ^R). The nondeterministic LR(k) machine N_k for the grammar G = (V, T, P, S) (where V contains both nonterminal and terminal symbols) is a finite (semi)automaton with the state set equal to I_G^k (the set of valid LR(k)-items for G), an initial item i_0 ∈ I_G^k, and a mapping δ_N : I_G^k × (V ∪ {ε}) → 2^{I_G^k} [1]. The deterministic LR(k) (or LA(k')LR(k)) machine M_k for the grammar G is a finite (semi)automaton with a set of states Q ⊆ 2^{I_G^k}, an initial state q_S ∈ Q, and a mapping δ_M : Q × V → Q. If δ_M(q_S, γ) = q for some q ∈ Q and γ ∈ V*, then [γ] denotes the set of equivalent viable prefixes leading from the initial state q_S to the state q. Furthermore, [γ] uniquely determines q (and is thus just another name for q). The LR(k) parser is a pushdown transducer (M, t) (or simply M). M denotes the deterministic pushdown automaton based on the deterministic LR(k) machine M_k, and t denotes the output effect (a mapping of parser actions into grammar productions). The states of M are the same as the states of M_k. The stack alphabet of M is the set of states of M_k. A configuration (or instantaneous description) of the parser M is represented as $Γ | u$, where Γ ∈ Q* and u ∈ T* denote the stack contents and the unread part of the input, respectively.

3 The outline of the new method

To relieve the compiler writer of the "black art" of proper placement of error productions, a better error recovery method is needed. Let us suppose that the input string uv' is being parsed with an LR(k) parser M. Starting with the initial stack contents Γ_0, the parser M performs the parser steps π(u) corresponding to the derivations

$Γ_0 | uv'$ ⊢^{π(u)} $Γ | v'$,    (1)
S ⇒*_rm γ v_i,    (2)

for various v_i and a single viable prefix γ, where Γ = Γ'[γ] (the stack contents Γ of the parser M correspond to the viable prefix γ of Derivations (2)), and enters the configuration $Γ | v'$ (note that Γ and v' denote the stack contents and the unread part of the input, respectively). Therefore, there exist leftmost derivations

S ⇒*_lm u δ_i^R ⇒*_lm u v_i    (3)

for various viable suffixes δ_i. Now suppose that $Γ | v'$ is an error configuration. In other words, v' cannot be derived from any viable suffix δ_i in any of Derivations (3). The idea on which the new method is based can be outlined as follows:

1. If there is only one viable suffix δ = X_1 δ' such that δ_i = δ for every i in Derivations (3), and
2. if this particular δ is known in the error configuration $Γ | v'$,

then the parser should discard the next few tokens of the input, resynchronize, and continue parsing the string derived from δ'. If there exists a unique viable suffix δ, there are two approaches to discarding the erroneous part of the input. The first is to skip everything until a string from FIRST_k(δ'$) is seen, and then to resynchronize by pushing the symbol X_1 on the stack. Using this approach, Derivation (1) can be extended with the derivation

$Γ | v'$ = $Γ | v'_1 v$ ⊢^{π(v'_1)} $Γ[γX_1] | v$,

where the steps denoted by π(v'_1) are used to skip the part of the input derived from the symbol X_1. Thus the string v$ is the longest suffix of v'$ having the property that (k : v$) ∈ FIRST_k(δ'$). The second approach is to skip everything until a string which can be reduced to X_2, the first symbol of δ', is read, and then to resynchronize by pushing X_1 and then X_2 on the stack.
Formally, Derivation (1) is extended with the derivation

$Γ | v'$ = $Γ | v'_1 v_2 v$ ⊢^{π(v'_1)} $Γ[γX_1] | v_2 v$ ⊢^{π(v_2)} $Γ[γX_1][γX_1 X_2] | v$,

where the steps denoted by π(v'_1) are used to skip the part of the input derived from the symbol X_1 (as in the case above), and where π(v_2) is the parse of v_2, the correct part of the input, in M. LR(k) parsers have the correct prefix property [4, 5], i.e., any string of terminals pushed on the parser stack is a prefix of some valid input. The parser using an error recovery routine based on either of the two strategies tries to skip as few symbols of the input string as possible. More precisely, there may be many different but viable splittings of the input string v', either into strings v'_1 and v (as in the first approach) or into strings v'_1, v_2 and v (as in the second approach). However, the problem of splitting the string v' is beyond the scope of this paper. Finally, if the viable suffix δ cannot be determined uniquely in the parser state [γ], the parser removes one state at a time from the parser stack until a state with a unique viable suffix is at the top of the stack. Then one of the approaches described above is applied.

4 The construction of the error recovery routine

Two conditions were set in the previous section that must be fulfilled if the parser is to recover from an error: (1) the viable suffix δ must be unique, and (2) it must be known. In general, a viable prefix γ, and thus a state [γ] (a set of LR(k)-equivalent viable prefixes), can have many corresponding viable suffixes δ_i. To identify the states of an LR(k) parser suitable for performing error recovery we start with the following two definitions [6]:

Definition 1 Let N_k = (I_G^k, V, P_{N_k}, i_0) be a nondeterministic LR(k) machine for a grammar G = (V, T, P, S). A string of LR(k)-valid items i_0 i_1 ... i_n ∈ (I_G^k)* is called a (γ, k)-path if there exists a sequence X_1, X_2, ..., X_n ∈ (V ∪ {ε}) such that X_1 X_2 ⋯ X_n = γ, X_n = ε, and [i_{j−1} X_j → i_j] ∈ P_{N_k} for j = 1, 2, ..., n.

Definition 2 (γ, k)-paths p_1 and p_2, where p_1 = i_0 i_1 ... i_n and p_2 = i'_0 i'_1 ... i'_m, are 0-equivalent iff n = m and the items i_j and i'_j differ only in the lookahead strings for all j = 1, 2, ..., n (i.e., if i_j = [A → α • β, x] and i'_j = [A' → α' • β', x'], then A = A', α = α', and β = β').

The reason for defining the 0-equivalence of (γ, k)-paths becomes obvious with the following lemma, which establishes a relationship between viable prefixes and suffixes on the one hand and LR(k) machines on the other.

Lemma 1 Any two 0-equivalent (γ, k)-paths define the same viable suffix.

Proof: Any (γ, k)-path p = i_0 i_1 ... i_n specifies a set of leftmost derivations, all having the form

S = A_1 ⇒_lm α_1 A_2 β_1 ⇒*_lm u_1 A_2 β_1 ⇒_lm u_1 α_2 A_3 β_2 β_1 ⇒*_lm u_1 u_2 A_3 β_2 β_1 ⇒_lm ⋯ ⇒*_lm u_1 u_2 ⋯ u_{m−1} α_m A_{m+1} β_m β_{m−1} ⋯ β_1,

where p_j = A_j → α_j A_{j+1} β_j is the production of the j-th LR(k)-item of p having the dot in the initial position at the far left (and A_{m+1} may be ε). As no change of the lookahead strings in the LR(k)-items of p affects the leftmost derivations above, all (γ, k)-paths which are 0-equivalent to p define the same viable suffix δ, where δ^R = A_{m+1} β_m β_{m−1} ⋯ β_1. □
When the traditional LR(k) parser enters the error configuration $Γ | v'$, the error is recognized because no action is specified for q and x = k : v', where Γ = Γ'q, i.e., ACTION(q, x) = error. But as LR(k) parsers have the correct prefix property, the first (k − 1) symbols of the lookahead buffer are correct, otherwise the error would have been detected earlier. The first step is to identify all states [γ] with the property that for any γ' ∈ [γ], all (γ', k)-paths ending with an item (k − 1)-active for (k − 1) : v' are 0-equivalent (an item [A → αX • β, y] is k-active for x if and only if x ∈ FIRST_k(βy)). In other words, the stack contents of the LR(k) parser and the first (k − 1) symbols in the lookahead buffer must define the viable suffix uniquely (remember that in the error configuration $Γ | v'$ the k-th symbol of v' is erroneous and thus not useful for error recovery). To do so, we change the focus to the nondeterministic LR(0) machine N_0.

Definition 3 An LR(0)-item [A → α • β] of N_0 is relevant for a state [γ] of the deterministic LR(k) machine for G if and only if, for all γ' ∈ [γ] and i = [A → α • β, y] ∈ [γ], all (γ', k)-paths p = p'i are 0-equivalent.

During the parser construction we compute a directed graph G_N with a set of vertices V_N and a set of edges E_N defined as

V_N = { ([A → α • β], q) : q ∈ Q_M ∧ ∃y ∈ T*: [A → α • β, y] ∈ q },
E_N = { ((i_1, q_1), (i_2, q_2)) : ∃X ∈ V ∪ {ε}: [q_1 X → q_2] ∈ P_M ∧ [i_1 X → i_2] ∈ P_{N_0} }.

The graph G_N is derived from the graph of the nondeterministic LR(0) machine by (1) replicating each LR(0)-item as many times as there are states of M in which an LR(k)-item with the corresponding core appears, and (2) erasing the edge labels. Thus every path in G_N starting with ([S' → • $S$], q_S) has its corresponding path in N_0 (and vice versa). As an example, consider the following $-augmented LALR grammar G_ex with the productions

S' → $S$,  S → AB,  A → aA,  A → ε,  B → Bb,  B → b.

The LALR machine for the grammar G_ex, as constructed by a modified version of bison, is shown in Figure 1. The graph G_N for the grammar G_ex is shown in Figure 2. The irrelevant vertices of G_N (i.e., the vertices (i, q) where i is irrelevant for q) can be identified by the following simple algorithm:

1. Compute the set of all conflicting vertices, i.e., those with at least two different predecessors in the same state:

V_N^(1) = { v : v = ([A → • β], q) ∧ ((i_1, q'), v), ((i_2, q'), v) ∈ E_N ∧ i_1 ≠ i_2 }.

(Different predecessors from different states represent no problem, as the state itself helps to determine the right path.)

2. Compute the set of successors of the conflicting vertices:

V_N^(2) = { v : v is reachable from some v' ∈ V_N^(1) }.

In Figure 2 the irrelevant vertices are shaded, while in Figure 1 they are enclosed between double lines. We define the skeleton automaton U with the set of states

Q_U = { i : (i, q) ∈ V_N \ (V_N^(1) ∪ V_N^(2)) },

the alphabet Q_M, and the mapping δ_U : Q_U × Q_M → Q_U defined as

δ_U(i, q') = i'  iff  ((i', q'), (i, q)) ∈ E_N ∧ (i, q), (i', q') ∈ V_N \ (V_N^(1) ∪ V_N^(2)).

The skeleton automaton is just a compact representation of the graph G_N with the irrelevant vertices removed. The skeleton automaton for the grammar G_ex is shown in Figure 3. The aforementioned modification of bison generates the skeleton automaton shown in Figure 4.

Figure 1: Bison-generated LALR machine for the grammar G_ex. Note the different grammar augmentation: instead of using the production S' → $S$, bison uses a production $accept → S $end.
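The two-step identification of irrelevant vertices is a plain graph computation. The sketch below is our illustration, not the bison modification itself; vertices are (item, state) pairs and edges are pairs of such vertices, as in the definition of G_N:

    def irrelevant_vertices(edges):
        # Step 1: conflicting vertices, i.e. vertices with at least two
        # different predecessor items within one and the same state.
        preds = {}
        for src, dst in edges:
            preds.setdefault(dst, set()).add(src)
        conflicting = set()
        for v, ps in preds.items():
            by_state = {}
            for item, state in ps:
                by_state.setdefault(state, set()).add(item)
            if any(len(items) >= 2 for items in by_state.values()):
                conflicting.add(v)
        # Step 2: everything reachable from a conflicting vertex is
        # irrelevant as well.
        succ = {}
        for src, dst in edges:
            succ.setdefault(src, set()).add(dst)
        irrelevant, stack = set(conflicting), list(conflicting)
        while stack:
            v = stack.pop()
            for w in succ.get(v, ()):
                if w not in irrelevant:
                    irrelevant.add(w)
                    stack.append(w)
        return irrelevant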
Figure 2: The graph G_N for the grammar G_ex; the irrelevant vertices are shaded.

Figure 4: Bison-generated skeleton automaton U for the grammar G_ex.

    context stack i | i == [S' → $·S$]       = ε
    context stack@((q, _) : st) [A → ·β]     = ctx ∘ [A → ·β]
        where i   = δU([A → ·β], q)
              ctx = context stack i
    context stack@((q, _) : st) [A → αX·β]   = ctx ∘ [A → αX·β]
        where ctx = context st [A → α·Xβ]

Figure 5: Algorithm for computing the viable suffix.

The algorithm of Figure 5 returns the context and thus the viable suffix β_1^R β_2^R ··· β_m^R. Hence, we can write down the following theorem.

Theorem 1. If any two (γ',k)-paths ending with items which are (k−1)-active for x are 0-equivalent for each γ' ∈ [γ], then the viable suffix δ in derivation (3) can be computed from the stack contents Γ = Γ'[γ] in the parser configuration $Γ|v$ for any v = xv'.

A parse of the erroneous string aacbb is shown in Tables 2 and 3. Both possible solutions mentioned in Section 2 are shown. In Table 2, the erroneous part of the input is discarded until the string b ∈ FIRST_1(AB$) is found in the lookahead buffer. The resulting stack contents after error recovery is performed is therefore $[$][$a][$aa][$aaA]. This is a simple but efficient solution. In Table 3, however, the string bb is reduced to the second symbol of the viable suffix AB$, namely B. Hence, the resulting stack contents is $[$][$A][$AB]. Finding and reducing the substring bb can be performed in the same way as by parsers generated by existing LALR parser generators, if error productions are used.

       STACK                               INPUT
    1. $(q0,v2)                            aacbb$
    2. $(q0,v2)(q1,v7)                     acbb$
    3. $(q0,v2)(q1,v7)(q1,v7)              cbb$
       → context returns [S' → $·S$], [S → ·AB], [A → ·aA], [A → a·A],
         [A → ·aA], [A → a·A], yielding viable suffix (AB$)^R;
    4. $(q0,v2)(q1,v7)(q1,v7)(q4,v8)       bb$

Table 2: A trace of parsing with error recovery: the erroneous part of the input is discarded until b ∈ FIRST_1(AB$) is seen.

       STACK                               INPUT
    1. $(q0,v2)                            aacbb$
    2. $(q0,v2)(q1,v7)                     acbb$
    3. $(q0,v2)(q1,v7)(q1,v7)              cbb$
       → context returns [S' → $·S$], [S → ·AB], [A → ·aA], [A → a·A],
         [A → ·aA], [A → a·A], yielding viable suffix (AB$)^R;
    4. $(q0,v2)(q3,v3)(q7,v4)              $

Table 3: A trace of parsing with error recovery: the erroneous part of the input is discarded until bb, derived from B in AB$, is reduced.

Finally, the list of items

  [S' → $·S$], [S → ·AB], [A → ·aA], [A → a·A], [A → ·aA], [A → a·A]

represents a particularly good starting point for printing helpful error messages, because it provides the compiler writer with the exact grammatical context within which the error occurred.

6 Conclusion

The presented method works with canonical LR(k) parsers as well as with LA(k)LR(k') parsers. The error recovery routine neither slows down nor otherwise influences the parser until the first error is encountered, and it can be generated automatically. Besides, it has two main benefits: (a) the compiler writer need not add any additional productions to the grammar, and (b) it is a good starting point for meaningful error reporting. However, the generation of the parser is slower and the generated parser is larger.

References

[1] A. V. Aho and J. D. Hopcroft.
Introduction to Automata Theory, Languages and Computation. Addison-Wesley, 1979.
[2] A. V. Aho, R. Sethi, and J. D. Ullman. Compilers: Principles, Techniques and Tools. Addison-Wesley Series in Computer Science. Addison-Wesley, 1986.
[3] J. Levine, T. Mason, and D. Brown. Lex & Yacc. O'Reilly & Associates, 1992.
[4] S. Sippu and E. Soisalon-Soininen. Parsing Theory, Volume I: Languages and Parsing, volume 15 of EATCS Monographs on Theoretical Computer Science. Springer-Verlag, 1988.
[5] S. Sippu and E. Soisalon-Soininen. Parsing Theory, Volume II: LR(k) and LL(k) Parsing, volume 20 of EATCS Monographs on Theoretical Computer Science. Springer-Verlag, 1990.
[6] B. Slivnik. Kombinacija Knuthovega in Lewis-Stearnsovega sintaksnega analizatorja z minimalno uporabo Knuthove analize. PhD thesis, University of Ljubljana, Ljubljana, Slovenia, 2003.

Informational Design of Conscious Entities

Anton P. Železnikar
Volaričeva ul. 8, Ljubljana, SI-1111
s51em@hamradio.si

Keywords: definitions, informational design, informational consciousness, informational methodology, metaphysicalistic organization, thesaural approach

Received: June 8, 2003

The thesaural approach to entities' metaphysicalistic organization is shown within the design of an informational consciousness system. Besides the phenomenal, psychological, biological, quantum-mechanical, artificial, and other possible definitions of consciousness, informational consciousness (IC) is outlined as a major autonomous subsystem necessary in any conscious system. It is impossible to imagine a kind of beyond to the informational phenomenalism in brain, nature, cosmos, and artifact. IC is founded formally by its own informational axiomatism, formulas, formula systems, schemes, scheme systems, graphs, and other constructive means and, within this theory and its implementation possibilities, remains clearly scientifically disciplined. Moreover, IC systems can be designed by the proposed informational methodology and implemented within existing complex computer systems and informational nets.

Povzetek: The article describes the informational design of conscious entities.

1 Definitions of different sorts of consciousness

A precise and exhaustive definition of consciousness does not exist at all. Definitions can emerge according to the research of brain, mind, cognition, emotions, motivation, and other ingredients of the consciousness complex. There are loose attempts stating a few commonly accepted principles within the concept of consciousness. A definitional distinction among philosophical, artificial, phenomenal, psychological, biological, quantum-mechanical, informational, and other sorts of consciousness is possible. Definitions of consciousness can differ substantially in different languages. The philosophical background dictates the inclination to one or another definition, together with different views of research and the emerging science of consciousness. Let us make a short list of possible definitions explicitly concerning consciousness of one sort or another. In a way, major consciousness components like cognition and emotions should be incorporated in the rough forms of definition. Instead of definitions, we start with assumptions of the following sort.
Assumption 1. From the research point of view, different sorts of consciousness can be distinguished:

  InC: individual (human, animal) consciousness,
  BC: biological consciousness,
  PhC: phenomenal consciousness,
  PhiC: philosophical consciousness (phenomenology),
  PsC: psychological consciousness,
  SC: social consciousness,
  LC: local consciousness,
  GC: global consciousness,
  CC: cosmological consciousness,
  QtC: quantum-theoretical consciousness,
  AC: artificial consciousness,
  IC: informational consciousness,
  AIC: artificial informational consciousness, etc.

These sorts of consciousness represent the majority of the concepts known nowadays. □

Assumption 2. Consciousness concerns an object or a complex of objects and emerges out of the subconscious or unconscious background, being the possible informational space. The substance of any sort of consciousness is informational and, within the informational, consciousness-like components can emerge, being organized consciously. □

This assumption does not deviate essentially from the known definitions of consciousness. As an exception, the informational view of consciousness may be understood, which, however, seems to be evident.

Assumption 3. Consciousness is generally regarded to be a conscious system with abilities or components representing conscious properties. One of the major system components is self-awareness or self-consciousness, the ability of a component to perceive itself. Perception as a conscious component perceives the relation between a component and the component's environment. □

To repeat: informational consciousness seems to be the essential component or property of any consciousness system. We experience how the phenomenal consciousness (the consciousness in the living human brain) has its evident informational roots, for instance, in pure linguistic experience, where memory performs as an associative storage of learned and experienced information. Phenomenology in philosophy performs as literary information and, in this view, as informational phenomenology.

2 Informational consciousness system

Instead of consciousness, a more appropriate term, called a conscious system, will be introduced. For the general denotation of an informational formula system, the symbol $ will be used. A conscious system is a formula system concerning consciousness as a particular informational entity, denoted by z. Thus, $[z] will symbolize the conscious system at first glance, not yet expressed informonically. In general, it is adequate to use informational systems instead of properties like consciousness, subconsciousness, perception, cognition, emotions, and so on. A conscious system can unite such kinds of subsystems organizationally. The system view of informational entities also approaches the understanding of informons and entropons as complex conscious and subconscious entities representing the named informational entities.

Assumption 4. A conscious system, denoted naively (primitively) by $[z], is a system of major subsystems, where z stands for consciousness, that is, z ⇌ consciousness, s ⇌ subconsciousness, u ⇌ unconsciousness, c ⇌ cognition, e ⇌ emotions, m ⇌ motivation, h ⇌ homeostasis, b ⇌ behavior, etc. $[z] is meant to be a circular, component-distributed, serial-parallel system of components, tied informationally by common operands.
□

In the next step of theory development, the informonic view has to be considered in order to explicate the complex and holistic nature of a conscious system. An informon α always appears together with its adequately named entropon ᾱ. Together they build the so-called α-informational space, denoted by (α, ᾱ). Consequently and recursively, α and ᾱ deliver (α, ᾱ); pairs of such spaces again deliver spaces of spaces, and so on without limit. This sort of informon-entropon recursion represents the holistic informational propagation of α-meaning through the informonic-entroponic consciousness system, defined by the next assumption.

Assumption 5. A consciousness system is an informonic-entroponic organization, denoted complexly by $[(z, z̄)]. Such a system connects, in an informational way through common operands, the informonic-entroponic spaces (z, z̄) and (a_i, ā_i), in the form

  $[(z, z̄)] ⇌ ( (z, z̄); (a_i, ā_i); i = 1, 2, ..., n; n < ∞ ).

The relation n < ∞ means that n is potentially unlimited and rises with the informational development of the system. □

The spaces (z, z̄) and (a_i, ā_i) are informationally connected through common operands. Since each informon a_i is a conscious entity, on the primitive formula level a_i[..., z, ...] holds. On the informonic level there is, certainly, a_i[..., z, ...] and, consequently, a_i[..., z̄, ...].

3 Meaning of informational operands

One of the central notions of the informational and the conscious is the meaning of informational entities. One sort of understanding of meaning comes from the use of natural language. Meaning is nothing more than the use of language within a community. In this way, meaning is the expression in language as such. The expression means that something comes to word where sequences of words appear.

Assumption 6. Meaning concerns operands occurring in a formula, formula scheme and formula graph, and in a formula system, system scheme and system graph. The meaning of an operand is determined by its informational context with other operands and operators in different informational expressions. Meaning is nothing else than the expression of an operand, in a contextual or explicit way. □

In a formula φ^∇ ⇌ (a_1 ⊳ a_2 ⊳ ··· ⊳ a_i ⊳ ··· ⊳ a_n)^∇, where ⊳ is an informational operator and the mark ∇ distinguishes the serial from the circular case, operand a_i is expressed contextually, within the formula among the other operands. The scheme of this formula for ⊳ = |= is

  σ[φ] ⇌ ( a_1 |= a_2 |= ··· |= a_i |= ··· |= a_n [ |= a_1 ] ).

From this scheme we see how a_i occurs in the context of the other operands. Within this context it has a specific meaning, so the conclusion can be made what it in fact means. The option [ |= a_1 ] in the scheme represents the circular case, where ∇ = ○. In a circular situation, operand a_i in the scheme can be rotated to the initial position of the scheme, that is,

  σ[μ[a_i]] ⇌ ( a_i |= ··· |= a_n ○ a_1 |= a_2 |= ··· |= a_i ),

where μ ⇌ meaning. This kind of scheme can be grasped as an explicit circular definition, that is, one among the parallel meaning schemes pertaining to operand a_i. The unambiguous meaning μ[a_i] can be obtained by parenthesizing this scheme.
Formally, the schematic result obtained by the operand rotation procedure can be expressed directly, using the rotation operator R concerning operand a_i, as R[a_i]σ[φ^○(a_1, a_2, ..., a_i, ..., a_n)]. The meaning μ[a_i] is then a concrete parenthesizing of this scheme, that is,

  μ[a_i] ⇌ P[ R[a_i]σ[φ^○(a_1, a_2, ..., a_i, ..., a_n)] ],

where P is the operator of parenthesizing. A direct, serial, or linear meaning of an informational operand a_1 is given by a serial, non-circular formula of the form φ(a_1, a_2, ..., a_i, ..., a_n), putting a_1 ⇌ φ(a_1, a_2, ..., a_i, ..., a_n). For getting the meaning of a formula system $'s operand a_ij, it would suffice to put the relation a_ij ⇌ $. It is understood that the system formulas φ_1^∇1, φ_2^∇2, ..., φ_n^∇n of $ are mutually informationally connected (dependent, impacted) through common operands. In this view, the meaning of a_ij becomes as complex as possible, expressed by system formulas interpreting its meaning in the different possible, explicit and implicit ways.

4 How to make an informational entity conscious?

By an informational entity, the meaning belonging to the explicitly expressed entity's title, its name, is understood. For instance, entity a has the name a and simultaneously represents a's meaning in the form of an informational formula a ⇌ a[a_1, a_2, ..., a_i, ..., a_na], schematized by σ[a] ⇌ (a_1 |= a_2 |= ··· |= a_i |= ··· |= a_na), where some operands a_i may equal a. A complex meaning of a is expressed by a formula system, which could mean an explicit definition of operand a in the form

  ( a[a_11, a_12, ..., a_1i_1, ..., a_1n_1];
    a[a_21, a_22, ..., a_2i_2, ..., a_2n_2];
    ...
    a[a_j1, a_j2, ..., a_ji_j, ..., a_jn_j];
    ...
    a[a_m1, a_m2, ..., a_mi_m, ..., a_mn_m] ),

schematized by

  σ[a] ⇌ ( a_11 |= a_12 |= ··· |= a_1n_1;
            a_21 |= a_22 |= ··· |= a_2n_2;
            ...
            a_m1 |= a_m2 |= ··· |= a_mn_m ),

where some operands a_ji_j may equal a. The system formulas a[a_j1, a_j2, ..., a_ji_j, ..., a_jn_j] of the meaning system a can be mutually connected also by common operands other than a, where j = 1, 2, ..., m and i_j = 1, 2, ..., n_j.

Assumption 7. An informational entity, named by a, becomes primitively conscious if it is informationally connected with at least one conscious space (β, β̄) within a consciousness system $[(z, z̄)]. The ability of consciousness is granted to a transitively, via the conscious organization of informon β. Thus, entity a transits into informon a, and (a, ā) ∈ $[(z, z̄)]. □

This kind of a's conscious emergence does not need a's own metaphysicalistic organization. So, a cannot decide autonomously upon its own conscious development and is, in this respect, dependent on the informational spaces of other system operands.

Assumption 8. An informational entity, named by a, becomes properly conscious if it is organized in the sense of the initial informational metaphysicalism M_0[a] and then complexly developed into the informonic-entroponic space (a, ā). Through metaphysicalism, informon a appropriates its own ability of consciousness, and the corresponding ability of subconsciousness, expressed entroponically by ā. This means that the space (a, ā) is complexly connected to the spaces (z, z̄) and (a_i, ā_i), constituting the common consciousness system $[(z, z̄)], where (a, ā), (z, z̄), (a_i, ā_i) ∈ $[(z, z̄)]. The consciousness space (z, z̄) is meant to be the central informonic-entroponic space for the metaphysicalistically organized informational entity z ⇌ consciousness. □

The informonic solution for operand a becomes, according to the previous formula system, as complex as

  a ⇌ ( a[a_11, ..., a_1n_1]; a[a_21, ..., a_2n_2]; ...; a[a_m1, ..., a_mn_m] ).

5 Informational organization of a language thesaurus

An exhaustive and adequately structured language thesaurus is the key means in the design and implementation of informational consciousness.
It is a must for the design of a metaphysicalistically conceptualized artificial informational consciousness system (MCAICS), the major component of other possible concepts, theories, and implementations of consciousness systems. In Fig. 1, the recursive graph of a more or less complete thesaurus of a natural language is presented [1]. This graph can be additionally refined in meaning according to the design needs and shows, at first glance, a linear or serial, tree-like structure. However, the graph becomes circular as soon as, in a concrete case, a word (informational operand) appears that was already used in a previous (higher) knot of the graph. Such a circular situation is regular and can concern any headword in the dictionary. Let us look at a case from the thesaurus [1] in which the headword attention is presented in the following way¹:

attention n 1 a focusing of the mind on something
  syn application, concentration, consideration, debate, deliberation, heed, study
  rel assiduity, diligence, industry, sedulity, sedulousness; notice, observation, regard, remark; absorption, engrossment, immersion, intentness
  con absence, absentmindedness, abstraction, detachment, remoteness, withdrawal; disinterest, indifference, unconcern, unmindfulness
  ant inattention
2 syn see NOTICE 1
  rel awareness, consciousness, mindfulness, sensibility
  con disregard, heedlessness, insensibility, unawareness, unconsciousness
3 syn see COURTESY 1
  rel deference, homage, honor, reverence, benignity; considerateness, consideration, kindliness, solicitude
  con neglect, negligence; aloofness, indifference, unconcern; discourtesy

notice n 1 a noting of or concerning oneself with something
  syn attention, cognizance, ear, heed, mark ‖, mind, note, observance, observation, regard, remark
  rel care, concern, consideration, thought; apprehension, grasp, understanding
  con disinterest, disregard, indifference, unconcern; carelessness, heedlessness, unmindfulness; insouciance, negligence, recklessness
2 syn see MEMORANDUM 2
3 syn see CRITICISM 1

¹The meaning of the abbreviations, also in the further examples, is the following: syn synonym(s); rel related word(s); ant antonym(s); con contrasted word(s); idiom idiomatic equivalent(s); ‖ use limited, if in doubt see a dictionary.

Etc. One sees how, by the use of a thesaurus, the meaning of the headword attention expands in a positive and a negative sense. However, a thesaurus also includes other words used in the subscript language of informational operands and informational operators (verbs, adjectives, prepositions, etc.). Between two subscribed operands, an appropriately subscribed operator must be chosen, corresponding in meaning to the context in which the operands and operators occur. Thus, verbs or verbal phrases can be searched in a thesaurus too.

6 Where does a thesaurus meet informational metaphysicalism?

A thesaurus must not be understood as a headword-synonym dictionary, but rather as a much more complex interpretation of the headword meaning, also in contrasted, idiomatic, and other possibilities of positively and negatively related word meanings concerning the headword [1]. In such a dictionary, the synonyms, antonyms, related words, contrasted words, idiomatic equivalents, and other meaningly relevant words and phrases of a headword can be searched for. Then, in the second step of searching, all these categories of words and phrases can again be searched for each found synonym, antonym, related word, contrasted word, idiomatic equivalent, and other meaningly relevant word and phrase.
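This two-step search generalizes to a recursion of arbitrary depth. Purely as an illustration, a depth-bounded expansion could be sketched in C as below; the ThesEntry type, the lookup() helper, the toy two-entry table, and all other names are hypothetical assumptions, not part of the thesaurus [1] or of this paper.

    #include <stdio.h>
    #include <string.h>

    #define MAX_WORDS 8

    typedef struct {
        const char *headword;
        const char *syn[MAX_WORDS];   /* synonyms:   positive identification */
        const char *ant[MAX_WORDS];   /* antonyms:   negative identification */
        const char *rel[MAX_WORDS];   /* related:    positive identification */
        const char *con[MAX_WORDS];   /* contrasted: negative identification */
    } ThesEntry;

    /* A toy two-entry table standing in for a real thesaurus. */
    static const ThesEntry table[] = {
        { "attention",
          { "application", "concentration", "heed" }, { "inattention" },
          { "notice", "observation" },                { "absence", "indifference" } },
        { "notice",
          { "attention", "heed", "note" },            { 0 },
          { "care", "concern" },                      { "disregard", "unconcern" } },
    };

    static const ThesEntry *lookup(const char *word) {
        for (size_t i = 0; i < sizeof table / sizeof table[0]; i++)
            if (strcmp(table[i].headword, word) == 0)
                return &table[i];
        return NULL;
    }

    /* Expand a headword: print each category, then recurse on every word
       found, repeating the thesaural search to the given depth; the depth
       bound also tames the circularity noted above. */
    static void expand(const char *word, int depth, int indent) {
        const ThesEntry *e = lookup(word);
        if (depth == 0 || e == NULL)
            return;
        printf("%*s%s:\n", indent, "", e->headword);
        const char *const *cats[] = { e->syn, e->ant, e->rel, e->con };
        for (int c = 0; c < 4; c++)
            for (int w = 0; w < MAX_WORDS && cats[c][w] != NULL; w++) {
                printf("%*s- %s\n", indent + 2, "", cats[c][w]);
                expand(cats[c][w], depth - 1, indent + 4);
            }
    }

    int main(void) { expand("attention", 2, 0); return 0; }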
This technique of the initial headword identification within a broader meaning can expand as deep as necessary, offering a sufficient informational complexity of the headword interpretation by sentences concerning the searched thesaural entities. To this pure thesaural identification of a headword, sentences can be constructed, explaining or defining the headword meaning. In this way, a complex meaning of the headword can emerge, expressing what the headword represents informationally in a context and what it does not represent at all. A thesaurally represented headword a can come close to the concept of the metaphysicalistic decomposition concerning the headword a, that is, M_0[a].

Figure 1: Recursive organization of the thesaurus [1] for the design of the thesaural-metaphysicalistic structure of a developing informon. The graph expands a headword in the thesaurus (the name, title, naming of informon and entropon) into synonyms, antonyms, related words, contrasted words, idiomatic equivalents, and other relevant headwords, and expands each of these again in the same way. Such a thesaurus is a must in the conceptualizing and design of informational consciousness.

Informational metaphysicalism is organized as informing, counterinforming, and informational embedding concerning an informational entity, in this case the headword. The thesaural identification of a headword is structured

- by its synonyms and meaningly related words: a kind of positive identification in the sense of metaphysicalistic informing,
- by its antonyms and meaningly contrasted words: a kind of negative identification in the sense of metaphysicalistic counterinforming, and
- by a kind of meaningly idiomatic identification (an equalization) in the sense of metaphysicalistic informational embedding (fixing the meaning).

Informational embedding can also be grasped as a kind of decision making upon the meaning, resulting from the metaphysicalistic informing and counterinforming of the headword. To the metaphysicalistic-thesaural model belongs also the determination of the concrete operators figuring in the metaphysicalistic organization, whose subscripts can be chosen by means of the thesaurus. The operand-operator identification in a metaphysicalistic model can be efficiently covered in meaning by words and word phrases obtained by the use of a thesaurus. The thesaural approach is in fact a complex approach to the problem of meaning concerning a headword and its identification, up to the possible details of meaning or interpretation. For each informational entity intending to inform in a conscious way, it is necessary to know what consciousness means, in the most complex possible way of meaning expression. Therefore, the meaning of the concept "consciousness" must be developed in an informationally complex manner. What is needed is the informon consciousness, represented by its informational synonym z, and determined in the informationally most complex possible form from the thesaural point of view. Further, within the metaphysicalistic model of an operand a, one has to distinguish meaningly relatively closely positioned operands, like the informing I_a and the corresponding informational entity i_a, the counterinforming C_a and the corresponding counterinformational entity c_a, and the informational embedding E_a and the corresponding informational embedding entity e_a.
Informational embedding is a kind of decision making in the understanding of that which results from the metaphysicalistic informing and the corresponding counterinforming within the systemically organized entity's metaphysicalism. Let us show an experiment made on the basis of Tab. 1, where the metaphysicalistic components are listed both in the sense of thesaural contents and of metaphysicalistic meaning (semantics). The table proceeds from the name a, the informational identifier of the metaphysicalistic decomposition, in the left-most column. The following columns represent the metaphysicalistic informing I_a (second column), the metaphysicalistic informational entity i_a (third column), the metaphysicalistic counterinforming C_a (fourth column), the metaphysicalistic counterinformational entity c_a (fifth column), the metaphysicalistic informational embedding E_a (sixth column), and the metaphysicalistic informational embedding entity e_a (seventh column). All of the columns include metaphysicalistically appropriate, thesaural, and intentional contents, dedicated to the future development of the metaphysicalistic organization, up to the complex informonic and entroponic perplexity of the occurring operands. Thus, the table becomes a significant reminder for the metaphysicalistic decomposition of the named operand a. In the upper part of the table, the first column deals with the named operand, being for instance a noun or a noun phrase in a thesaurus, dictionary, or encyclopedia. The named operand will appear as a headword, idiom, or otherwise related word in the thesaurus.

Table 1: Thesaural and intentional contents of the metaphysicalistic components I_a, i_a, C_a, c_a, E_a, and e_a of a named operand a.

aware adj
  syn alive, apprehensive, au courant, awake, cognizant, conscious, conversant, knowing, mindful, sensible, sentient, ware, witting
  rel acquainted, apprised, informed; alert, heedful; impressionable, perceptive, receptive
  con anesthetic, impassible, insensible, insensitive; ignorant, unknowing
  ant unaware

conscious adj 1 syn see AWARE
  rel noticing, noting, observing, perceiving, remarking; vigilant, watchful
  con forgetful, oblivious, unmindful; disregarding, ignoring, overlooking
  ant unconscious
2 syn see SELF-CONSCIOUS

From the nouns, adjectives, verbs, etc. in dictionaries and thesauri, adequate operand names for the metaphysicalistic operands can be chosen, as shown in Tab. 1 and Fig. 2. Entering into the recursive depth of dictionaries and thesauri, the complexity of the operands and their mutual connections can rise up to the required degree and precision of the concrete operand meanings. The metaphysicalistic organization of operand z, that is, of consciousness, in the graph of Fig. 2 is informonized, that is, complexly informationally interweaved within a conscious system $[(z, z̄)], as discussed in Assumption 5. For the initial metaphysicalistic organization, the serial decomposition of z was taken and then informonically organized according to the operand suggestions in Tab. 1. On the other hand, nothing was determined on the level of the graph operators, which, in the same way as the operands, can be determined by adequate subscripts fitting the operator position between concrete operands.
Namely, the general operator |=, as an informational joker, can in principle be put between arbitrary operands. However, in a concrete situation the operator is subscribed according to the natural language used, expressing a verb or a verbal phrase (in the latter case by an operator composition, e.g., as the operator composition in a (|=_α ∘ |=_β) b).

7 Methodology of the design of informational consciousness

In the preceding sections, several possible and essential methods of designing informational consciousness have been demonstrated. Systematically, the methods can be formulated as follows.

Method 1 [The naive expansion of consciousness.] The most general, in fact non-structural or naive, way of making an informational entity a conscious is to include it in an existing conscious system $[(z, z̄)] informonically, that is, a ∈ $[(z, z̄)], connecting it to the informonic operands of the system $[(z, z̄)] through common operands. □

This method needs additional explanation. Operand a, understood as a formula or formula system, is so far a rough informational entity determined as a usual informational formula (possibly a single operand) or an informational formula system. As such, it does not yet have the property of being conscious, but is still an unconscious entity. Such a case can happen in reading a sentence in a foreign language. Then, the unknown operands can be linked to the known translated operands, and the same is done with the unknown operators. Thus, the sentence becomes linked to already conscious (common, that is, translated) operands in the context of meaningly known entities. The sentence becomes understandable, that is, consciously perceivable.

The second method intends to give a future conscious entity a primordial organization for the development into a proper, self-contained conscious informational entity, proceeding from the initial, thesaurally-metaphysicalistically organized structure. This method puts in the foreground the beginning of the design of an informational consciousness system, when at least some key components must first be developed to the level of conscious informational behavior. One such entity is, without doubt, the complex meaning of the headword 'consciousness'.

Method 2 [The thesaural-metaphysicalistic expansion of consciousness.] A headword h ⇌ headword of a complete informational thesaurus t ⇌ thesaurus, with h ∈ t, is made conscious by its informonic expansion through a consequent recursive use of t in a metaphysicalistic way, that is, proceeding from the initial decomposition M_0[h] to the informonic organization of h, building up the informational space (h, h̄). □

What would the entroponic decomposition of the entropon h̄ mean at all? As already shown in Sect. 2, M_0[h] and M_0[h̄] deliver (M_0[h], M_0[h̄]) as a recursive expansion, to which the pointed-out entroponic decomposition of the entropon and the informonic decomposition of the entropon do not belong. In the general case of an operand a:

- the entropon of the entropon, an entity not discussed up to now, could mean entropon ā developing (emerging, coming into unconsciousness, expanding it) in an entroponic way, and
- the informon of the entropon, likewise not discussed up to now, is understood ambiguously: it could mean entropon ā developing (emerging, coming into consciousness in the very moment) informonically, or informon a entroponizing in the very moment.

Thus, the informonic decomposition of the entropon h̄
means the entropon h̄ whose fragments in this very moment come into consciousness in a metaphysicalistic way, and the entroponic decomposition of the entropon h̄ means the entropon h̄ whose fragments in this very moment emerge unconsciously, that is, entroponically, in a metaphysicalistic way.

Consequence 1 [Determination of operator subscripts by means of the thesaural-metaphysicalistic approach.] By the choice of headwords or headword phrases, denoted by h_1 and h_2, where h_1, h_2 ∈ t, the operator subscript (a verb or verb phrase) for |=_subscript ∈ t, in a basic or complex (interiorly parenthesized) h_1- and h_2-transition h_1 |=_subscript h_2, depends on the circumstances determined by the meaning context, where the operator |=_subscript must fit the operator composition |=_h1-subscript_dependent ∘ |=_h2-subscript_dependent, that is,

  (h_1 |=_subscript h_2) ⇌ (h_1 (|=_h1-subscript_dependent ∘ |=_h2-subscript_dependent) h_2).

Operator subscript dependence is a matter of meaning within the natural language content, that is, of the use of language. □

In Fig. 2, all the paths of the graph, representing operators, are informational jokers |=, with the general meaning of the verb "to inform". They are not yet subscribed. For instance, the operator of informational concern could be used in many cases. However, more specific operators from the thesaurus, appearing in headword definitions, can be chosen, e.g., |=_be_conscious, |=_be_aware, |=_be_unconscious, |=_conceive, |=_perceive, |=_understand, |=_intend, etc.

8 Conclusion

The design of a thesaurus belongs to the hard problems of linguistic study, being an innovative effort in the direction of a deeper understanding of language and of enlarging and spreading word and phrase meaning. It is important to comment on where the informational approach to conscious agents could meet the Computing Research Repository (ACM) as an innovative research theory and practice on the way to informational consciousness. This paper shows how the subject concerns natural language processing [Computation and Language]; combinatorics and graph theory [Discrete Mathematics]; indexing (naming), dictionaries (thesauri), retrieval, content and analysis (meaning) [Information Retrieval]; multiagent systems (informons), distributed artificial intelligence, intelligent agents, coordinated interactions, and practical applications [Multiagent Systems]; connectionism and adaptive behavior [Neural and Evolutionary Computation]; other, not listed subject areas (consciousness) [Other]; and robotics in the sense of informational consciousness [Robotics] [5]. Informational consciousness directly concerns cognitive psychology, cognitive science [2, 4], linguistics, and the philosophy of information [3] and mind. This paper and [8] show how, in English and German respectively, it is possible to begin the design and programming of an informational consciousness system by the use of a language thesaurus. It becomes evident that particular thesauri have to be constructed concerning the consciousness-related terms of a natural language. The process of translation could be hidden behind a somewhat more complex organization of an informational conscious system: in the first phase, the first natural language is translated into the first (subscribed) informational language. In the second phase, the mapping of the first informational language into the second informational language takes place. This mapping can be determined in advance by approved rules between the two languages.
In the third phase, the obtained result in the second informational language is translated into the second natural language.

References

[1] The Merriam-Webster Concise School and Office Thesaurus. 1991. Merriam-Webster Inc., Publishers. Springfield, MA.
[2] Dalgleish, T. & M. Power, Eds. 2000. Handbook of Cognition and Emotion. John Wiley & Sons. Chichester, England.
[3] Floridi, L., Ed. 2003/2004. The philosophy of information. Minds and Machines 13:459-588/14:1-132.
[4] Lewis, M. & J. M. Haviland-Jones, Eds. 2000. Handbook of Emotions. Second Edition. The Guilford Press. New York, London.
[5] Moravec, H. 1999. Robot. Mere Machine to Transcendent Mind. Oxford University Press. New York.
[6] Webster's Ninth New Collegiate Dictionary. 1986. Merriam-Webster, Inc. Springfield, MA.
[7] Železnikar, A. P.² 2003. Conscious informational entities. Informatica 27:483-494.
[8] Železnikar, A. P. 2004. Informon und Entropon im Bewusstseinssystem. Grundlagenstudien aus Kybernetik und Geisteswissenschaft/Humankybernetik 45:81-89.
[9] Železnikar, A. P.² 2004. Introduction to Artificial Consciousness. The Philosophy of the Informational, Formalization, and Implementation. A study in progress.

²Also readable in PDF (Adobe Acrobat Reader) at the website http://www.artifico.org.

The Demarcate Construction: A New Form of Tree-based Priority Queues

Rick Siow Mong Goh, Wai Teng Tang, Ian Li-Jin Thng and Marie Therese Robles Quieta
National University of Singapore, Department of Electrical and Computer Engineering
3 Engineering Drive 3, CCN Laboratory, Singapore 117576

Keywords: priority queue, splay tree, skew heap, calendar queue

Received: February 22, 2004

A new form of tree-based priority queues is proposed. These priority queues employ the demarcation process to systematically split a single tree-based priority queue into many smaller trees, each divided by a logical time boundary. These new Demarcate Construction priority queues offer an average speedup of more than twofold over their single-tree counterparts and outperform the current expected-O(1) Calendar Queues in many scenarios. Their generality from small to large queue sizes, their insensitivity to priority increment distributions, and their low overhead costs render them suitable for many applications.

Povzetek: The article describes a new form of priority queues.

1 Introduction

In this article we apply the concept of demarcation to tree-based priority queues. We define demarcation as the process whereby the events in a tree-based priority queue are separated clearly by time boundaries in the form of buckets, where each bucket is made up of a single tree. We are motivated by the observation that the known kinds of efficient trees, such as the Splay Tree [1] and the Skew Heap [2], have at best an amortized time bound of O(log(n)) per operation, where by amortized time is meant the time per operation averaged over a worst-case sequence of operations [3]. Comparatively, multilist-based priority queues such as the Calendar Queue (CQ) [4] and its variant, the Dynamic CQ (DCQ) [5], offer an "expected" O(1) average time bound per operation, where by "expected" is meant that the CQs are not theoretically proven to be O(1) but rather display O(1) performance in numerous application scenarios. However, the drawback of employing the CQs in applications is that the worst-case time bound per operation can be as poor as O(n) [6].
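For reference, a Skew Heap of the kind cited above supports both basic operations through a single merge routine. The following is a standard textbook-style sketch in C under our own naming; it is not the code of [2], nor the non-recursive variant used later in the benchmarks.

    #include <stddef.h>

    typedef struct Node {
        double       ts;                 /* timestamp = priority */
        struct Node *left, *right;
    } Node;

    /* Merge two skew heaps: keep the smaller root and unconditionally
       swap its children along the merge path; amortized O(log n) [2]. */
    static Node *sh_merge(Node *a, Node *b) {
        if (a == NULL) return b;
        if (b == NULL) return a;
        if (b->ts < a->ts) { Node *t = a; a = b; b = t; }
        Node *old_left = a->left;        /* the child swap */
        a->left  = sh_merge(a->right, b);
        a->right = old_left;
        return a;
    }

    static Node *sh_insert(Node *root, Node *e) {    /* enqueue */
        e->left = e->right = NULL;
        return sh_merge(root, e);
    }

    static Node *sh_delete_min(Node **root) {        /* dequeue */
        Node *min = *root;
        if (min != NULL)
            *root = sh_merge(min->left, min->right);
        return min;
    }

Insertion is a merge with a singleton, and dequeue merges the two subtrees of the removed root; the unconditional child swap is what yields the amortized O(log n) bound.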
That said, the CQ has nonetheless been a popular choice as the pending event set structure in discrete event simulators such as CSIM18 [7], GTW [8] and Network Simulator v2 [9], as well as in a quality-of-service algorithm where it maintains real-time packet requests [10], and even as part of a rate controller for ATM switches [11]. Similarly, a tree-based priority queue, the Splay Tree, has a wide array of applications: the pending event set structure in the CelKit (formerly known as SimKit) simulator [12], the data structure for fast IP lookups [13], data compression [14], and the block sorting process of Burrows and Wheeler [15].

An important use of priority queues, both tree-based and multilist-based, is in the area of discrete event simulation (DES). In DES, the pending event set (PES) is defined as the set of all events generated during a DES which have not yet been simulated. The PES is in essence a priority queue controlling the flow of the simulation of events, with the minimum timestamp having the highest priority and the maximum timestamp the lowest. Comfort [16] has revealed that up to 40% of the computational effort in a simulation may be devoted to the management of the PES alone, where the enqueue and dequeue operations account for as much as 98% of all operations on the PES. A DES frequently operates in a three-step cycle: dequeue (removal of the event with the highest priority from the PES), execute (processing this dequeued event), and enqueue (insertion of the new event or events resulting from the execution into the PES). The two basic operations, enqueue and dequeue, have run-time complexity closely dependent on the total number of events in the PES. Therefore, a PES structure should be efficient, especially for large-scale simulations that involve a large number of events.

In most applications the metric of interest for a priority queue is the time required to perform the most common operations. This metric is referred to as the access time. In typical applications such as DES, the total runtime of the simulation job is by far more important than the times of the individual operations, except in real-time applications. Therefore, the amortized (or average) access time per operation is far more important than the worst-case access time of each individual operation. Fine-grain simulations, such as but not limited to ATM network simulations, are time-consuming due to the huge number of events to process [5]. The faster and larger the networks, the higher the number of events in the PES and the longer the run-times these network simulations require, which may take days or weeks to yield results with an acceptable level of statistical error. For example, experiments conducted in Tcpsim [17] for three minutes of simulated time on a Sun Ultra 1 took more than one day of execution time on average [5]. Therefore, to speed up simulation jobs, one approach is to develop high-performance priority queue structures for the PES.

In this article we develop the Demarcate Construction (Demarco) priority queue, a multilist-based structure which is made up of two building blocks. The name Demarco arises from the word "demarcate", which means to divide and separate clearly, as if by boundaries. The primary structure is an array of buckets, where each bucket may contain a tree holding near-future events. The secondary structure is a simple unsorted linked list holding far-future events.
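To fix ideas before the detailed description in Section 2, the two tiers might be declared as follows. This is a hedged illustration only; the type and field names (Event, PrimaryTier, SecondaryTier, Demarco) are our assumptions mirroring the variables defined in Section 2.2, not the authors' code.

    /* Hypothetical declarations for the two tiers of a Demarco structure. */
    typedef struct Event {
        double        ts;        /* timestamp, i.e., the priority          */
        struct Event *left;      /* tree children while stored in PriT ... */
        struct Event *right;
        struct Event *next;      /* ... or list successor while in SecT    */
    } Event;

    typedef struct {             /* primary tier (PriT)                    */
        Event **bucket;          /* bucket[i]: root of the i-th tree       */
        int     nbuckets;        /* number of buckets (set to SecT_Num)    */
        double  start;           /* PriT_Start                             */
        double  bw;              /* PriT_Bw: bucketwidth, equation (1)     */
        int     index;           /* PriT_Index: first non-empty bucket     */
        long    num;             /* PriT_Num                               */
    } PrimaryTier;

    typedef struct {             /* secondary tier (SecT)                  */
        Event  *head, *tail;     /* unsorted singly linked list (FIFO)     */
        double  cur;             /* SecT_Cur                               */
        double  min, max;        /* SecT_Min, SecT_Max                     */
        long    num;             /* SecT_Num                               */
    } SecondaryTier;

    typedef struct { PrimaryTier prit; SecondaryTier sect; } Demarco;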
Demarcation refers to the process of constructing the primary structure and transferring events from the secondary structure to the primary one. In an amortized sense, this demarcation process ensures that a tree-based priority queue has comparable or better performance than one which does not undergo demarcation. The rest of this article is organized as follows: Section 2 describes the Demarcate Construction in detail, Section 3 explains the performance measurement techniques, the empirical results from various experiments using these techniques are given in Section 4, and Section 5 concludes.

2 Demarcate Construction

The Demarcate Construction (Demarco) rests on four essential principles. First and foremost, the concept of demarcation is to have many trees, each containing a small number of events. In contrast, a tree-based priority queue manages only a single tree containing all the nodes or events. Upon applying demarcation, an array of logical buckets is constructed. Each bucket spans an equal time-interval, and these buckets systematically enable the events to be demarcated and distributed over the buckets. Thus, on average, the tree in each bucket has a smaller number of events, leading to a much reduced height compared to a single-tree priority queue. Secondly, Demarco defers the sorting of events until necessary. At the onset, all enqueued events are appended to the secondary tier (SecT) of Demarco. These events are not sorted according to their timestamps. During the first dequeue operation, the primary tier (PriT) is constructed and the events are inserted into the corresponding buckets in PriT, where they are sorted according to the tree-based priority queue's native enqueue algorithm. Thirdly, unlike other multilist-based priority queues, Demarco does not rely on sampling heuristics to obtain its structure parameters. The parameters used when constructing PriT are obtained from the event distribution in SecT. Lastly, the algorithm of Demarco proceeds in demarcation cycles, where a cycle is defined as the duration in which the events in SecT are transferred to PriT, more events are enqueued in PriT and SecT, and all the events in PriT are dequeued.

Figure 1: Basic structure of a Demarcate Construction. The figure shows the primary tier as an array of buckets following PriT_Start (some invalidated, i.e., with all their events already dequeued, and some which may contain events, each event also being a tree node) and the secondary tier as a list whose SecT_Min, SecT_Max and SecT_Num get updated as events are appended.

2.1 Basic structure of Demarco

Figure 1 shows an example of the basic structure of a Demarcate Construction (Demarco). The main building blocks of Demarco consist of:

1. Primary Tier (PriT): an array of buckets where each bucket may contain a tree. Each tree node contains an event holding a near-future (i.e., soon to be dequeued) timestamp. Within each bucket, the events are sorted according to the algorithm of the tree-based priority queue. The parameters used in creating the PriT are obtained from the event distribution in SecT. In Figure 1, there are a total of six buckets, of which two are invalidated (i.e., their events have already been dequeued) and four may contain events.

2. Secondary Tier (SecT): an unsorted singly linked list. Acting as an overflow list containing far-future events, SecT buffers events that do not affect the PriT.
This reduces the number of events in the PriT and thus, on average, the number of events in each bucket decreases as the simulation time progresses. Since the performance of tree-based priority queues depends on the height (or number of levels) of the tree, reducing the number of events in PriT eventually leads to a reduction in the height of the trees in the PriT buckets. This, without doubt, leads to a superior overall performance.

2.2 The Demarco algorithm

Demarco marks a first departure from the CQs' resize triggers and sampling heuristics for obtaining structure parameters such as the number of buckets and the bucketwidth. Instead of the static methodologies used in the CQs, Demarco employs a dynamic approach to updating its structure parameters, which is explained in Section 2.2.1. The Demarco structure keeps a set of variables in order to function; they are defined as follows:

  PriT_Start: used for calculating the bucket-index of an event which is to be enqueued in PriT. It is set to SecT_Min during each demarcation process, whereby events are transferred from SecT to PriT.
  PriT_Num: number of events in PriT.
  PriT_Bw: bucketwidth of PriT.
  PriT_Index: bucket-index of the first non-empty bucket in PriT.
  SecT_Cur: minimum timestamp of an event that can be enqueued in SecT. This value is set equal to SecT_Max at each transfer of events from SecT to PriT.
  SecT_Min: minimum timestamp in SecT.
  SecT_Max: maximum timestamp in SecT.
  SecT_Num: number of events in SecT.

2.2.1 Dequeue operation

At the onset, all enqueued events are placed in SecT in a FIFO manner without time-order, thus leaving PriT empty. On the first dequeue operation, PriT is constructed and, thereafter, all the events are transferred from SecT to PriT. The bucketwidth of PriT, an important structure parameter, is dynamically assigned using equation (1):

  PriT_Bw = bucketwidth = (SecT_Max - SecT_Min) / SecT_Num.   (1)

The number of buckets to be created in PriT is set to SecT_Num, giving an average of one event per bucket under the assumption that the event distribution is uniform. Though in practical scenarios this may not be true, Demarco will still perform well, because the enqueue of events into PriT costs O(log(n_B)) per event, where n_B is the number of events in a bucket. For most scenarios n_B << N, where N is the total number of events in the Demarco structure. After the construction of PriT, the events in SecT are transferred to PriT. Transferring an event into PriT is like enqueuing an event into PriT, which utilizes the tree-based priority queue's native enqueue algorithm. Thereafter, the highest-priority event is in the first bucket of PriT (where PriT_Index = 0 and PriT_Start = SecT_Min have been initialized). On each dequeue, the highest-priority event is removed from the first bucket in PriT by employing the tree-based priority queue's native dequeue algorithm. Subsequently, when the first bucket is empty, it is invalidated and the second bucket is considered, while at the same time the parameter PriT_Index is incremented by one. If the second bucket is empty, PriT_Index is incremented again until a non-empty bucket is found, and the current highest-priority event is dequeued. After all the events in PriT are dequeued, i.e., all the buckets are empty, the demarcation cycle repeats itself, with SecT treating the next dequeue like the first dequeue described above.

Figure 2: Dequeue operation.
Figure 2 illustrates an example of the demarcation process during a dequeue operation when there are events in SecT but PriT is empty. Essentially, PriT is created with parameters set according to the events found in SecT. Thereafter, the events in SecT are transferred to PriT; Figure 2 demonstrates how the three smallest-timestamp events are transferred from SecT to Bucket 0. For this example, we first assume that five events, with timestamps 0.5, 7.0, 1.1, 0.9 and 4.5, are enqueued into an initially empty Demarco structure. Since SecT_Cur equals 0, all the events are enqueued in SecT. On the first dequeue, PriT is created with the parameters stated in Figure 2, using equation (1). Thereafter, the events from SecT are transferred to PriT. Immediately after this, the following variables are updated: SecT_Cur = SecT_Max = 7.0; SecT_Min, SecT_Max and SecT_Num are then reset. The events in the first non-empty bucket, i.e., Bucket 0, have by now been sorted according to the tree's enqueue algorithm. Next, event 0.5 is dequeued, followed by 0.9 and 1.1, according to the tree's dequeue algorithm. After each dequeue, PriT_Num is decremented; after dequeuing 1.1, PriT_Num is 2. Subsequently, the algorithm searches for the next non-empty bucket, which is Bucket 3, and PriT_Index is updated to 3. Thereafter, event 4.5 is dequeued. When all the events in PriT have been dequeued, the cycle repeats, with SecT handling the next dequeue operation like the first dequeue, as mentioned earlier.

2.2.2 Enqueue operation

For each enqueue operation, Demarco checks whether the event timestamp is greater than SecT_Cur. If so, the event is simply placed at the end of the linked list in SecT. If the event is not inserted in SecT, it is enqueued in PriT. On enqueuing in PriT, the bucket-index of the bucket where the event is to be inserted is

  bucket_index = (timestamp - PriT_Start) / PriT_Bw,   (2)

and the event is enqueued according to the tree's native enqueue algorithm.

2.3 Brief time complexity analysis

This section provides a brief performance analysis of a Demarco structure using amortized time complexity analysis [3]. Assume N events are enqueued into an empty Demarco structure. Initially all the events are enqueued into SecT. This enqueue of events into SecT costs O(1) per event, since SecT is an unsorted linked list and the events are simply appended at its end. On the first dequeue, all the events from SecT are bucket-sorted into PriT, where PriT is made up of buckets of equal time-interval (the bucketwidth). The events are distributed in PriT according to the event distribution; for instance, under a uniform distribution each bucket holds approximately one event. Suppose that during these dequeue operations more events are enqueued into the Demarco structure. Since each bucket spans an equal time-interval, the events in PriT are likely to be spread out, with each bucket holding a relatively small number of events. Therefore, even though a tree-based priority queue has O(log(n_B)) complexity, n_B is expected to be << N, leading to near-O(1) enqueue and dequeue complexity. Note that the cost of transferring the events from SecT to PriT is also ~O(N)/N = O(1) per event. Table 1 summarizes the theoretical performance of a Demarco structure built over a tree-based priority queue which has O(log(n)) amortized complexity for both its enqueue and dequeue operations.

Table 1: Amortized time complexity of a Demarco structure.

             PriT (expected, worst)    SecT
  Enqueue    O(1), O(log(n))           O(1)
  Dequeue    O(1), O(log(n))           -
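The cycle analyzed above can be summarized in code. The sketch below is a hedged outline reusing the hypothetical declarations given in Section 1; tree_insert() and tree_delete_min() stand for the underlying tree's native enqueue and dequeue algorithms and are not specified here. It follows equations (1) and (2) and the transfer step, but it is an illustration, not the authors' implementation.

    #include <stdlib.h>

    extern void   tree_insert(Event **root, Event *e);
    extern Event *tree_delete_min(Event **root);

    static void demarcate(Demarco *q) {          /* transfer SecT -> PriT  */
        SecondaryTier *s = &q->sect;
        PrimaryTier   *p = &q->prit;
        free(p->bucket);                         /* drop the old array     */
        p->nbuckets = (int)s->num;               /* one event per bucket   */
        p->bw = (s->max - s->min) / s->num;      /* equation (1)           */
        if (p->bw <= 0.0) p->bw = 1.0;           /* degenerate-case guard  */
        p->start = s->min;                       /* PriT_Start = SecT_Min  */
        p->index = 0;
        p->bucket = calloc(p->nbuckets, sizeof *p->bucket);
        for (Event *e = s->head, *nx; e != NULL; e = nx) {
            nx = e->next;
            int b = (int)((e->ts - p->start) / p->bw);  /* as in eq. (2)  */
            if (b >= p->nbuckets) b = p->nbuckets - 1;
            tree_insert(&p->bucket[b], e);
            p->num++;
        }
        s->cur = s->max;                         /* SecT_Cur = SecT_Max    */
        s->head = s->tail = NULL; s->num = 0;    /* reset SecT             */
    }

    /* Dequeue; assumes the queue as a whole is non-empty. */
    Event *demarco_dequeue(Demarco *q) {
        PrimaryTier *p = &q->prit;
        if (p->num == 0) demarcate(q);           /* start a new cycle      */
        while (p->bucket[p->index] == NULL)      /* skip empty/invalidated */
            p->index++;
        p->num--;
        return tree_delete_min(&p->bucket[p->index]);
    }

    void demarco_enqueue(Demarco *q, Event *e) {
        SecondaryTier *s = &q->sect;
        if (e->ts > s->cur) {                    /* far future: append to SecT */
            e->next = NULL;
            if (s->tail) s->tail->next = e; else s->head = e;
            s->tail = e;
            if (s->num == 0 || e->ts < s->min) s->min = e->ts;
            if (s->num == 0 || e->ts > s->max) s->max = e->ts;
            s->num++;
        } else {                                 /* near future: into PriT */
            PrimaryTier *p = &q->prit;
            int b = (int)((e->ts - p->start) / p->bw);  /* equation (2)   */
            if (b < p->index)     b = p->index;  /* clamp (our assumption) */
            if (b >= p->nbuckets) b = p->nbuckets - 1;
            tree_insert(&p->bucket[b], e);
            p->num++;
        }
    }

A zero-initialized Demarco starts with SecT_Cur = 0, matching the worked example above in which the first five events all enter SecT.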
Through the simulation benchmarks of Section 4, we show empirically that a Demarco structure has near-O(1) performance under all distributions.

3 Performance Measurement Techniques

The performance of priority queues is often measured by the average access time to enqueue or dequeue an event under different load conditions. The parameters varied for each queue are the access pattern model, the queue size, and the priority distribution. The queue size refers to the number of events in a priority queue. For our experiments, the queue size ranges from 100 to 1 million events. This wide range represents what small to large-scale real simulation jobs will encounter.

3.1 Access pattern models

The access pattern models that have been proposed emulate either the steady state or the transient phase of a typical simulation. They are as follows:

1. Classic Hold model [18]. The queue to be tested is initially built up to the specified benchmark size by a random series of enqueues (with slightly higher probability) and dequeues. Thereafter, a series of hold operations ensues. A hold operation is defined as a dequeue followed by an enqueue. The timestamp of the new enqueued event is obtained as follows: a random variate from the priority distribution to be tested is drawn and added to the timestamp of the event that was just dequeued; the result is the timestamp of the new event. The average access time is calculated as the average time taken for one hold operation. The recommended number of hold operations, in order to obtain a reasonably accurate average, is 30 times the queue size [6]. This method has the following features: (a) the problem of determining the transient period is avoided, and (b) the effect of the transient period on the different queue sizes tested is similar [6].

2. Up/Down model [19]. In an Up/Down test, the queue is built up to the benchmark queue size by a sequence of enqueues. Thereafter, the queue size is returned to zero by an equally long sequence of dequeues. The average access time is calculated as the time taken for all queue operations (in the enqueue phase and the dequeue phase) divided by the total number of queue operations. The Up/Down model emulates the worst-case scenario of a simulation job whose queue size is not stable [19].

3.2 Priority increment distributions

The priority increment distributions often used for benchmarking priority queue structures [5,6,18,19] are illustrated in Figure 3. The rand() used can be found in [20]. The Camel(x,y) distribution [19] represents a two-hump, heavily skewed distribution with x% of its mass concentrated in the two humps, the duration of the two humps being y% of the total interval. In addition to the five distributions shown in Figure 3, the Change(A,B,x) distribution [19] was also used, to test the sensitivity of the CQ when exposed to drastic changes in the priority increment distribution. The compound distribution Change(A,B,x) interleaves two different priority increment distributions A and B: initially, x priority increments are drawn from A, followed by another x priority increments drawn from B, and so on. Change distributions can be used to model simulations in which the priority increment distributions vary significantly over different time periods, for example battlefield simulations.
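For concreteness, increment samplers of this kind might be written as below. This is an illustrative, hedged sketch: uniform01() stands in for the rand()-based generator of [20], and the parameterizations (notably the hump placement in camel()) are our assumptions rather than the exact forms used in the benchmarks.

    #include <math.h>
    #include <stdlib.h>

    static double uniform01(void) {              /* uniform on [0,1) */
        return rand() / ((double)RAND_MAX + 1.0);
    }

    static double rect(double a, double b) {     /* Rectangle(a,b)   */
        return a + (b - a) * uniform01();
    }

    static double expo(double mean) {            /* Exp(mean), inverse CDF */
        return -mean * log(1.0 - uniform01());
    }

    static double triangle(double c) {           /* Triangle on [0,c):       */
        return c * sqrt(uniform01());            /* density 2t/c^2, so the   */
    }                                            /* inverse CDF is c*sqrt(u) */

    /* Camel(x,y) idea: with probability x the increment falls into one of
       two narrow humps that together cover a fraction y of [0,c). */
    static double camel(double c, double x, double y) {
        if (uniform01() < x) {
            double hump = y * c / 2.0;           /* width of each hump */
            return (uniform01() < 0.5) ? rect(0.0, hump) : rect(c - hump, c);
        }
        return rect(0.0, c);
    }

    /* Change(A,B,x): draw x increments from A, then x from B, and so on. */
    static double change(double (*A)(void), double (*B)(void), int x) {
        static int n = 0;
        return ((n++ / x) % 2 == 0) ? A() : B();
    }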
In our experiments we use Camel(0,1000,0.001,0.999) and Change(Exp(1), Triangle(90000,100000), 2000), which were used in the landmark survey [6].

Figure 3: Priority increment distributions.

3.3 Benchmarking codes and hardware

The performance of the priority queues was obtained by conducting experiments on an AMD Athlon MP server. Even though this server has dual processors, the experiments did not make use of its SMP capabilities, as the algorithms are sequential. However, this server ensured that, while each benchmark was carried out, other background processes that might affect the results were kept to a minimum, thus yielding more accurate experimental results. The code used for the Splay Tree was based on the Pascal code used by Jones [18]. The Skew Heap implementation was based on the non-recursive code given by Jones [21]. The SCQ and DCQ were based on the codes supplied by Brown [4] and Oh et al. [5], respectively. Two empirical tests were conducted to verify that no items in the priority queues were gained or lost and that successive dequeues removed events in stable time-order. The experiments were performed with the required memory for each priority queue pre-allocated. This eliminated the influence of the underlying memory management system, which might otherwise affect the results. This is also good practice in actual DES, as it prevents memory fragmentation when creating new events and deleting serviced events; pre-allocating memory likewise enhances the performance of the DES. The pre-allocation could be made dynamic by an initial pre-allocation followed by an allocation-on-demand methodology. All code was written in the C programming language, with all recursive procedure calls and the like eliminated. The loop overhead time and the time taken to generate random numbers were removed by factoring out the time required for running a dummy loop. Five runs of each experiment were done, and the median value was obtained for each queue size simulated in the experiment. The median value is a measure of central tendency and is chosen over the mean because some background processes could have adversely affected a particular run of an experiment, and averaging in such a run could render less accurate results.

4 Experimental Results

The objectives of this section are, firstly, to present the performance of tree-based priority queues with and without Demarco; secondly, to compare Demarco priority queues with the current fastest multilist-based queues, the CQ and DCQ; and lastly, to determine the Demarco priority queues' generality and sensitivity under the five priority increment distributions using the Classic Hold and Up/Down models, as well as when the queue size increases from 100 to 1 million. Note that a logarithmic scale has been used for the queue-size axis, so that logarithmic complexity shows up as a linear plot.

4.1 Steady-state experiments

Figures 4(a) to 4(f) show the results obtained under the Classic Hold experiments, which are commonly employed to test the steady-state performance of priority queues. Note that the obvious knee seen in the graphs is due to declining cache performance and occurs when the queue size is about 10,000. This phenomenon is also observed in the graphs in [6], where the experiments were done on Sun and Intel architectures. Figures 4 show vividly that the performance of the Demarco structures, i.e.,
Demarco-Skews and Demarco-Splays, outperform the tree-based priority queues; Skew Heap and Splay Tree, where by Demarco-Skews/-Splays is made up of a Demarco structure where each bucket in PriT of Demarco may contain a Skew Heap/Splay Tree. At larger queue sizes, the performance speedup that Demarco offers is more than three times. Figures 4(a) to 4(d) show that the performance of the Demarco structures are comparable to the expected O(1) complexity multilist-based priority queues, i.e. CQ and DCQ. Furthermore, Figures 4(e) and 4(f) demonstrates clearlyyhjt^thg^Demqrco^structurgs outperform the CQs which have erratic performance for skewed distributions such as the Camel and ChSnge. l inferior performance are: 1. The CQ size-based resize incompetent mechanism for e reasons for their triggers handling experiment. The median value is a measure of the central tendency and is chosen over the mean value because | | are an skewed distributions. The triggers are too rigid to react according to the events distribution since a resize trigger occurs only if the queue size fluctuates by a factor of two [4]. It results in many events being enqueued into a few buckets with long sublists and many empty buckets. Long sublists make enqueue operations expensive since each enqueue entails a sequential search, whereas excessive traversal of empty buckets increases the process of dequeue operations. The DCQ incorporates two additional gers, one for dequeue tion. These additional »läisj^riar reduce the instability faced by the CQ under Change distribution but however, the DCQ performs worse than the CQ for the Camel distribution. Sampling heuristic is inadequate to obtain good operating parameters, namely, the number of buckets and the bucketwidth, when skewed distributions are encountered. During each resize operation, the CQ samples at most the first twenty-five events. This is clearly too simplistic because for skewed distributions in which many events fall into some buckets, the inter-event time-gap of the first twenty-five events, which can span several buckets, and those in the few populated buckets may vary a lot. To simply increase the events sampled is not prudent unless it is uniform distribution. For most distributions, particularly skewed distributions such as the Camel, if we sample more events then take the mean or median, it is unlikely that it will be accurate since events are spread unevenly. Even if we sample the most populated bucket (i.e. in the DCQ), the DCQ also performs poorly because events in that bucket will have a small average time-gap whereas other events have widely diverse time-gaps. If the bucketwidth is updated to this small time-gap, there will likely be numerous empty buckets. Skipping these buckets will lead to inferior performance. Furthermore, sampling more events inevitably leads to higher overheads for each resize operation, affecting the CQ performance on a whole. Classic Hold & Rectangle mean access time (|js) ; Hold & Exponential mean access time (|js) 100 1000 10000 100000 Queue Size (a) 1000000 10000 100000 Queue Size (b) 1000000 Classic 8 Hold & Triangle mean access time (ps) ........' ........' ........' ■ ■ ' +...... -O --A---B- CQ DCQ Skew Heap Splay Tree ^ Demarco-Skews Demarco-Splays p ^ 1000 10000 _.100000 1000000 Quel ueue Size (c) ; Hold & NegTriangle mean access time (|js) 1000 10000 100000 1000000 Queue Size (d) Classic Hold & Camel mean access time (|js) .111.........I.........I 76 -543 ■ 21 - - CQ .......+...... 
Figure 4: Performance graphs for Classic Hold model experiments (mean access time in µs versus queue size, 100 to 1,000,000, for the (a) Rectangle, (b) Exponential, (c) Triangle, (d) NegTriangle, (e) Camel and (f) Change distributions; queues compared: CQ, DCQ, Skew Heap, Splay Tree, Demarco-Skews, Demarco-Splays).
4.2 Transient-state experiments
The Up/Down model, which tests the performance of priority queue structures during transient periods when the queue size fluctuates frequently, reveals the weaknesses of the CQs. Figures 5(a) to 5(d) show that the CQs experience several peaks; these suggest strongly that the resize operations found in the CQs can be costly, since the CQs resize whenever the queue size fluctuates by a factor of two. The triggers found in the CQs are clearly inflexible: even when a CQ is performing well with its existing operating parameters, its static triggers force it to resize whenever the queue size fluctuates by a factor of two. Figures 5(e) and 5(f) again demonstrate that the CQs are sensitive to skewed distributions. The Demarco structures outperform all the other priority queues in all these experiments.
Figure 5: Performance graphs for Up/Down model experiments (mean access time in µs versus queue size for the same six distributions and queues).
4.3 Overall performance comparison
This section illustrates numerically the performance speedup of the Demarco structures over the normal single tree-based priority queues. In addition, we compare the relative performance of the Demarco structures and tree-based priority queues versus the multilist-based CQ and DCQ.
Table 2: Speedup offered by the Demarco structure normalized over a single tree-based priority queue, per distribution.
Distribution   Demarco-Skews   Demarco-Splays
Rectangle      2.99            2.61
Exponential    2.62            2.66
Triangle       3.01            2.46
NegTriangle    2.99            2.66
Camel          2.05            1.31
Change         1.91            1.39
Average        2.60            2.18
Table 3: Speedup offered by the Demarco structure normalized over a single tree-based priority queue, per queue size.
Queue Size   Demarco-Skews   Demarco-Splays   Cost (MB)*
100          1.14832         1.27259          0.0016
200          1.21848         1.35236          0.0032
300          1.26669         1.39214          0.0048
400          1.32153         1.55857          0.0064
500          1.33875         1.47472          0.0080
700          1.37699         1.50817          0.0112
1000         1.42609         1.57609          0.0160
2000         1.45507         1.59857          0.0320
3000         1.58005         1.67455          0.0480
4000         1.60971         1.71499          0.0640
5000         1.59277         1.61879          0.0800
7000         1.33867         1.35547          0.1120
10000        1.38927         1.39643          0.1600
20000        1.78548         1.56020          0.3200
30000        1.99237         1.67113          0.4800
40000        2.17161         1.77727          0.6400
50000        2.30145         1.83629          0.8000
70000        2.50331         1.97654          1.1200
100000       2.69297         2.08904          1.6000
200000       3.00549         2.37097          3.2000
300000       3.15127         2.49080          4.8000
400000       3.21057         2.55460          6.4000
500000       3.33468         2.63459          8.0000
700000       3.48493         2.75407          11.2000
1000000      3.66322         2.88340          16.0000
*MB refers to one million bytes. Memory incurred is sixteen bytes per bucket, and the number of buckets used is equal to the queue size tested.
Table 4: Relative performance for the Exponential distribution (normalized with respect to the fastest access time, where 1.00 is the fastest).
Model          Queue Size   Demarco-Skews   Demarco-Splays   Skew Heap   Splay Tree   CQ     DCQ
Classic Hold   100          1.23            1.35             1.67        2.06         1.00   1.20
               1000         1.19            1.32             2.17        2.81         1.00   1.10
               10000        1.34            1.50             2.20        2.60         1.00   1.16
               100000       1.12            1.26             3.90        3.50         1.00   1.07
               1000000      1.05            1.17             5.33        4.77         1.00   1.03
               Average      1.19            1.32             3.05        3.15         1.00   1.11
Up/Down        100          1.00            1.06             1.15        1.22         2.31   2.57
               1000         1.00            1.04             1.56        1.73         1.76   1.94
               10000        1.00            1.12             1.31        1.45         1.33   1.54
               100000       1.00            1.12             2.39        2.43         1.52   1.64
               1000000      1.00            1.12             3.55        3.49         1.44   1.52
               Average      1.00            1.09             1.99        2.06         1.67   1.84
Total Average               1.09            1.21             2.52        2.61         1.34   1.48
Table 5: Relative average performance for all distributions (normalized with respect to the fastest access time, where 1.00 is the fastest).
Model          Queue Size   Demarco-Skews   Demarco-Splays   Skew Heap   Splay Tree   CQ      DCQ
Classic Hold   100          1.07            1.14             1.31        1.61         1.59    1.00
               1000         1.15            1.26             1.76        2.15         1.00    1.41
               10000        1.00            1.10             1.55        1.70         5.32    1.23
               100000       1.00            1.12             3.27        2.59         1.20    1.81
               1000000      1.00            1.11             4.44        3.61         1.90    NA*
               Average      1.04            1.15             2.47        2.33         2.20    NA*
Up/Down        100          1.00            1.09             1.01        1.12         2.17    2.38
               1000         1.00            1.06             1.25        1.42         1.56    1.75
               10000        1.00            1.08             1.16        1.27         19.55   15.96
               100000       1.00            1.10             1.93        1.95         NA*     NA*
               1000000      1.00            1.10             2.67        2.63         NA*     NA*
               Average      1.00            1.09             1.60        1.68         NA*     NA*
Total Average               1.02            1.12             2.04        2.01         NA*     NA*
*NA means that the access times are too high in at least one distribution; such results are not considered in this comparison.
Table 2 shows that the speedup offered by the Demarco structure is more than two times, averaged over all queue sizes and distributions. Table 3 shows the speedup for each queue size, averaged over all the distributions. From the table, it is clear that as the queue size increases, the speedup grows to more than three times for the Skew Heap and close to three times for the Splay Tree. Tables 4 and 5 illustrate the relative performance of all the priority queues considered. The Demarco-Skews and Demarco-Splays outperform their tree-based counterparts and are generally more stable than the CQs.
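The Cost column of Table 3 follows directly from the table's footnote: one bucket per event at an assumed sixteen bytes per bucket. A minimal sketch of that arithmetic; the struct is the one quoted in Section 4.4 below, and the byte figure is the paper's conservative 64-bit bound.

    /* Reproducing the Cost column of Table 3. */
    #include <stdio.h>

    struct bucket { unsigned int count; struct event *evt; };  /* quoted in Section 4.4 */

    int main(void) {
        long sizes[] = {100, 10000, 100000, 1000000};
        double bytes_per_bucket = 16.0;   /* assumed upper bound from Table 3 */
        int i;
        for (i = 0; i < 4; i++)           /* MB here means one million bytes */
            printf("%8ld buckets -> %8.4f MB\n",
                   sizes[i], sizes[i] * bytes_per_bucket / 1e6);
        return 0;
    }

For instance, 100,000 buckets give 1.6 MB and one million buckets give 16 MB, matching the tabulated values.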
4.4 Generality, sensitivity and cost-performance ratio of Demarco structures
Figures 6(a) and 6(b) show the generality and insensitivity of Demarco-Skews under the various distributions and queue sizes (Demarco-Splays has similar graphs and is thus not included). Though the performance of Demarco-Skews may differ by as much as a factor of two across distributions, the complexity is still near O(1). Furthermore, the graphs show that it is stable for all the distributions, unlike the CQs, which are near O(n) for skewed distributions. This superior performance is made possible by the four essential principles mentioned in Section 2.
Figure 6: Performance graphs for Demarco-Skews (mean Hold and mean Up/Down access time in µs versus queue size for the Rectangle, Exponential, Triangle, NegTriangle, Camel and Change distributions).
We also consider the cost-performance ratio of employing Demarco on a tree-based priority queue. The additional memory required to obtain a Demarco structure is shown in Table 3. The bulk of the memory requirement is the number of buckets allocated. Other memory variables, such as those in SecT (e.g. SecT_Min), take up an insignificant amount of memory and are thus not considered. In the C programming language, a bucket is "struct bucket { unsigned int count; struct event *evt; }". Therefore, each bucket in PriT takes up eight bytes on a 32-bit hardware platform and sixteen bytes on a 64-bit platform. Table 3 assumes the higher cost of sixteen bytes per bucket and that the number of buckets used is equal to the queue size tested. From Table 3, if the queue size of a simulation is 100,000, the expected speedup for Demarco-Skews is 2.69 and the cost of the additional memory incurred is about 1.6 MB, where MB refers to one million bytes. For a one million queue size simulation, Demarco-Skews performs 3.66 times better than a Skew Heap and the cost incurred is 16 MB of shared memory. By today's standard, 16 MB is considered affordable, and this will be even more so in the near future, when 64-bit workstations (which can accommodate more physical memory) proliferate in the desktop market.
5 Conclusion
Demarcate Construction is a new form of tree-based priority queue which employs the demarcation process. These new priority queues offer an average speedup of more than two times over their single tree-based counterparts and outperform the current expected O(1) Calendar Queues in many scenarios. Their generality from small to large queue sizes, insensitivity to priority increment distributions and low overhead costs make them superior priority queues for many applications, such as the pending event set structure in discrete event simulations.
References
[1] Sleator, D. D. and Tarjan, R. E. (1985) Self-adjusting binary search trees. Journal of the ACM 32, 3 (July), pp. 652-686.
[2] Sleator, D. D. and Tarjan, R. E. (1986) Self-adjusting heaps. SIAM Journal on Computing 15, 1 (Feb.), pp. 52-69.
[3] Tarjan, R. E. (1985) Amortized computational complexity. SIAM Journal on Algebraic and Discrete Methods 6, 2 (April), pp. 306-318.
[4] Brown, R. (1988) Calendar queues: A fast O(1) priority queue implementation for the simulation event set problem. Commun. ACM 31, 10 (Oct.), pp. 1220-1227.
[5] Oh, S. and Ahn, J.
(1998) Dynamic calendar queue. In Proceedings of the 32nd Annual Simulation Symposium, pp. 20-25.
[6] Rönngren, R. and Ayani, R. (1997) A comparative study of parallel and sequential priority queue algorithms. ACM Trans. Model. Comput. Simul. 7, 2 (April), pp. 157-209.
[7] Schwetman, H. (1996) CSIM18 User's Guide. Austin, TX: Mesquite Software, Inc.
[8] Das, S., Fujimoto, R., Panesar, K., Allison, D. and Hybinette, M. (1994) GTW: a time warp system for shared memory multiprocessors. In Proceedings of the 1994 Winter Simulation Conference, pp. 1332-1339.
[9] Fall, K. and Varadhan, K. (2002) The ns Manual. UCB/LBNL/VINT Network Simulator v2. http://www.isi.edu/nsnam/ns/.
[10] Stoica, I., Zhang, H. and Ng, T. S. E. (2000) A hierarchical fair service curve algorithm for link-sharing, real-time, and priority services. IEEE/ACM Trans. Networking 8, 2 (Apr.), pp. 185-199.
[11] Hagai, A. and Patt-Shamir, B. (2001) Multiple priority, per flow, dual GCRA rate controller for ATM switches. IEEE Workshop on High Performance Switching and Routing 2001, pp. 169-174.
[12] Gomes, F., Franks, S., Unger, B., Xiao, Z., Cleary, J. and Covington, A. (1995) SimKit: A high performance logical process simulation class library in C++. In Proceedings of the 1995 Winter Simulation Conference, pp. 706-713.
[13] Narlikar, G. and Zane, F. (2001) Performance modeling for fast IP lookups. In Proceedings of the 2001 ACM SIGMETRICS International Conference on Measurement and Modeling of Computer Systems, pp. 1-12.
[14] Grinberg, D., Rajagopalan, S., Venkatesan, R. and Wei, V. K. (1995) Splay trees for data compression. In Proceedings of the Sixth Annual ACM-SIAM Symposium on Discrete Algorithms, pp. 522-530.
[15] Yugo, K. I. R., Moffat, A. and Ngai, C. H. A. (2002) Enhanced word-based block-sorting text compression. In Proceedings of the Twenty-Fifth Australasian Conference on Computer Science, pp. 129-137.
[16] Comfort, J. C. (1984) The simulation of a master-slave event set processor. Simulation 42, 3 (March), pp. 117-124.
[17] Dupuy, A., Schwartz, J., Yemini, Y. and Bacon, D. (1990) NEST: a network simulation and prototyping testbed. Commun. ACM 33, 10 (Oct.), pp. 63-74.
[18] Jones, D. W. (1986) An empirical comparison of priority-queue and event-set implementations. Commun. ACM 29, 4 (April), pp. 300-311.
[19] Rönngren, R., Riboe, J. and Ayani, R. (1993) Lazy queue: New approach to implementing the pending event set. Int. J. Computer Simulation 3, pp. 303-332.
[20] Park, S. K. and Miller, K. W. (1988) Random number generators: Good ones are hard to find. Commun. ACM 31, 10 (Oct.), pp. 1192-1201.
[21] Jones, D. W. (1989) Concurrent operations on priority queues. Commun. ACM 32, 1 (Jan.), pp. 132-137.

Fault-Free Maximal Submeshes in Faulty Torus-Connected Multicomputers
Seong-Moo Yoo, Electrical and Computer Engineering Department, University of Alabama in Huntsville, Huntsville, Alabama 35899, USA, e-mail: yoos@eng.uah.edu
Hee Yong Youn, School of Information and Communication Engineering, Sungkyunkwan University, Suwon, Korea, e-mail: youn@ece.skku.ac.kr
Hyunseung Choo, School of Information and Communication Engineering, Sungkyunkwan University, Suwon, Korea, e-mail: choo@ece.skku.ac.kr
Keywords: fault-free subsystem, maximal submeshes, reconfiguration, torus-connected multicomputers, virtual submeshes
Received: May 11, 2004
In a parallel computer system with faulty processors, it is highly desirable to reconfigure the system by eliminating the faulty ones and thereby restore the system to some operational state.
In the reconfiguration, maintaining the maximum size fault-free subsystem is the main problem. In this paper, we propose an efficient scheme for identifying maximum size fault-free submeshes in a faulty torus-connected multicomputer system. For this, the relations between two submeshes are first defined. Then a two-phase approach is taken. In the first phase, an efficient algorithm for determining maximal faulty submeshes in a faulty torus is introduced. In the second phase, a procedure identifies the maximal fault-free submeshes by removing all faulty submeshes from the whole torus. The time complexity of the proposed scheme is O(Nf^3), where Nf is the number of faulty processors in a 2D torus. The proposed scheme can be utilized for task allocation in 2D tori in the presence of failed nodes.
Povzetek: The paper discusses the operation of a multiprocessor system when several processors fail.
1 Introduction
Fault-tolerance for a multiprocessor system is achieved either by using the workable portion of the system to emulate the whole machine with a certain slowdown, or by reconfiguring the machine to a smaller sized system after faults occur. Letting the fault-free part of a machine emulate the entire system functionality tends to have limitations in practical use, because the slowdown could translate into a considerable performance loss. Hence, fault tolerance by reconfiguration is very important in such a large distributed computing environment for continued operation of the multiprocessor after the failure of one or more processors and/or links. Once the faulty elements have been identified, graceful degradation can be achieved by reconfiguring the multiprocessor and the distributed algorithm running on it. The algorithm is formulated to run on a single processor, which would typically be the host or the resource manager in a commercial multiprocessor system.
In parallel computer systems consisting of a number of interconnected processors, incoming jobs are allocated to subsystems of a required number of processors, such as a subcube for a hypercube system or a submesh for a mesh system, respectively. Due to its complexity, some processors in the entire structure can be defective, and then they need to be excluded from the allocation. In other words, all the processors in the allocated subsystem must be good. In the presence of failed nodes, thus, the system should be reconfigured to allow fast contiguous allocation of fault-free subsystems to incoming jobs.
Among several parallel computer systems, the torus-connected multicomputer has become popular, and it has become the base of many parallel computer systems such as the Tera Computer [1] and the Cray T3D [2]. The task allocation problem in 2D meshes has recently drawn a lot of attention [3-18]; however, the allocation problem in a torus system in the presence of faulty nodes has not. Reconfiguring a faulty system into smaller size systems has been investigated for the hypercube architecture [19-20], but little work has been reported for torus-connected systems to the best of our knowledge. The reconfiguration problem in a torus-connected multiprocessor system reduces to finding the maximum size fault-free submeshes.
In this paper, we propose a scheme which can effectively identify the maximum size fault-free submeshes in a two-dimensional (2D) torus. For this, the relations between any two submeshes in a 2D torus are first defined. Then a two-phase approach is taken.
In the first phase, the largest size faulty submeshes in a faulty torus are determined. In the second phase, the largest size fault-free submeshes are identified by removing all faulty submeshes. The time complexity of the proposed scheme is O(Nf^3), where Nf is the number of faulty processors in a 2D torus. The proposed scheme will be useful for task allocation in a 2D torus with faulty processors, and it can be applied to a 3D torus system.
In the case of a real high performance parallel computer (a numerical solver, etc.), the torus connection is crucial, high speed is needed, and faulty processors must be replaced. Usually, for an efficient parallel system, communication links have to operate at maximal speed. In the case of faulty processors, the original torus structure is lost, because some indirect communication channels have to be used, or new links must be introduced to bridge a faulty processor. In this paper, the whole algorithm is organized globally. It is supposed that the parallel system works by the master-slave principle (computer farm). In such cases, the proposed algorithm could be useful, even though it does not consider faulty links or the unconnected links that are left after removing faulty processors.
The rest of the paper is organized as follows. In Section 2, the definitions and notation used throughout the paper are introduced. In Section 3, the procedures grouping the faulty processors into faulty submeshes are introduced, and the procedure identifying the largest size fault-free submeshes is then presented. Finally, we conclude the paper in Section 4.
2 Definitions and Notation
In this section, we define meshes by introducing the index set; refer to Figure 1. A node n = (x, y), (0 <= x <= a-1 ∧ 0 <= y <= b-1), refers to a processor, where a two-dimensional torus 2DT(a, b) = {(x, y) | 0 <= x <= a-1 ∧ 0 <= y <= b-1} is an a × b rectangular grid. Here a and b represent the width and height of the torus, respectively. A submesh S(base, end) = ((xb, yb), (xe, ye)) = {(x, y) | xb <= x <= xe ∧ yb <= y <= ye}; w and h denote the width and height of S, respectively.
Definition 0 (Neighbor): Let n = (xn, yn) and m = (xm, ym). n |- m ⟺ (xm = xn + 1) ∧ (ym = yn): m is the right neighbor of n, and n is the left neighbor of m. n ⊥ m ⟺ (xm = xn) ∧ (ym = yn + 1): m is the upper neighbor of n, and n is the lower neighbor of m.
Figure 1. Index set for a 2D mesh and a submesh (base (xb, yb), end (xe, ye), width w = xe - xb + 1, height h = ye - yb + 1).
Figure 2. Four different virtual submeshes considering (base, end).
Definition 1 (Virtual submesh): Refer to Figure 2. Sv = ((xb, yb), (xve, yve)) for S = ((xb, yb), (xe, ye)), where xe = xve mod a, ye = yve mod b, xve = xe if xb <= xe and xve = xe + a otherwise, and yve = ye if yb <= ye and yve = ye + b otherwise.
Definition 2: Separate: S1 ◊ S2 ⟺ S1 ∩ S2 = ∅; overlapped: S1 Ø S2 ⟺ S1 ∩ S2 ≠ ∅.
Definition 3: Covered: S1 • S2 ⟺ S1 ⊆ S2; equivalent: S1 ≡ S2 ⟺ (S1 ⊆ S2) ∧ (S2 ⊆ S1).
Definition 4 (Adjacent): S1 ||r S2 ⟺ ∃(n ∈ S1, m ∈ S2) [n |- m ∧ S1 ◊ S2]: S2 is right-adjacent to S1, and S1 is left-adjacent to S2. S1 ||u S2 ⟺ ∃(n ∈ S1, m ∈ S2) [n ⊥ m ∧ S1 ◊ S2]: S2 is upper-adjacent to S1, and S1 is lower-adjacent to S2.
Definition 5 (Intersected submesh): S = S1 ∩ S2.
Definition 6: S1 is a maximal submesh in 2DT ⟺ ¬∃S2 [S1 • S2]. Maximal submeshes may either be separate or overlap one another.
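The relations of Definitions 2-4 translate directly into coordinate tests. Below is a minimal sketch for non-wrapping submeshes (virtual submeshes from Definition 1 reduce wrapped coordinates to this case); the struct and all names are ours.

    /* Coordinate tests for the submesh relations of Definitions 2-4. */
    #include <stdbool.h>

    typedef struct { int xb, yb, xe, ye; } submesh;   /* base (xb,yb), end (xe,ye) */

    /* Overlapped (Def. 2): the x- and y-ranges both intersect. */
    bool overlapped(submesh a, submesh b) {
        return a.xb <= b.xe && b.xb <= a.xe && a.yb <= b.ye && b.yb <= a.ye;
    }
    bool separate(submesh a, submesh b) { return !overlapped(a, b); }

    /* Covered (Def. 3): every node of a lies inside b. */
    bool covered(submesh a, submesh b) {
        return b.xb <= a.xb && a.xe <= b.xe && b.yb <= a.yb && a.ye <= b.ye;
    }

    /* Right-adjacent (Def. 4): b starts in the column after a ends,
     * and the two row ranges share at least one row. */
    bool right_adjacent(submesh a, submesh b) {
        return b.xb == a.xe + 1 && a.yb <= b.ye && b.yb <= a.ye;
    }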
3 The Proposed Scheme
Our scheme for finding maximal submeshes in a faulty 2DT consists of two phases. In the first phase, maximal faulty submeshes are determined by grouping the faulty processors. In the second phase, maximal fault-free submeshes are identified by removing all maximal faulty submeshes. For locating maximal faulty submeshes, a mechanism for combining two submeshes needs to be developed. We first consider that.
3.1 Maximal Faulty Submeshes
3.1.1 Combining Two Submeshes
Definition 7: Combining two submeshes, S1 + S2, is a procedure for identifying new maximal submeshes such that each new submesh consists of part or whole of S1 and S2. The details of the procedure are explained below. There exist four cases in combining S1 and S2, according to their relative positions. Case 1: S1 ⊆ S2 ∨ S2 ⊆ S1. Case 2: S1 ◊ S2 and S1, S2 are not adjacent. Case 3: S1 and S2 are adjacent. Case 4: S1 Ø S2 ∧ ¬(S1 ⊆ S2) ∧ ¬(S2 ⊆ S1). In Case 1, no new submesh is identified, because S1 + S2 is S1 or S2 itself. In Case 2, no new submesh is identified either. In Case 3, one new submesh may be identified: S3 if S1 ||r S2, or S4 if S1 ||u S2, is generated as shown in Figure 3. The addresses of S3 and S4 are given in Table I. Note that the base and end points of a virtual submesh Si are denoted by (xib, yib) and (xie, yie), respectively. In this case, S1 should be deleted if S1 ⊆ S3 or S4, and S2 if S2 ⊆ S3 or S4. In Case 4, no or two new submeshes are identified. In Figure 4(a), two new submeshes, S3 and S4, are identified, whose addresses are given in Table I. In Figure 4(b) and (c), no new submesh can be identified, since S1 ≡ S3 and S2 ≡ S4. Therefore, in Case 4, S3 and S4 should be checked to see whether each of them is equivalent to either S1 or S2.
Table I. Addresses of the new maximal submeshes obtained.
       S3              S4
xb     min(x1b, x2b)   max(x1b, x2b)
yb     max(y1b, y2b)   min(y1b, y2b)
xe     max(x1e, x2e)   min(x1e, x2e)
ye     min(y1e, y2e)   max(y1e, y2e)
Figure 3. Generating new maximal submeshes when two submeshes are adjacent.
Figure 4. Generating new maximal submeshes when two submeshes overlap.
When submeshes in a 2DT are identified, we can identify only separate ones, or also overlapped ones. If only separate ones are identified, however, the largest submesh may not be recognized. For example, refer to Figure 5. We want to identify maximal submeshes from the 12 processors in Figure 5(a). The processors can be formed into two separate submeshes as shown in (b) and (c); note that these are the two ways which can identify separate largest size submeshes. The way shown in Figure 5(b), however, cannot recognize the 6 × 1 submesh. Similarly, in Figure 5(c), the 3 × 3 submesh cannot be identified. On the contrary, if the identified submeshes are allowed to overlap as shown in Figure 5(d), all the available largest size submeshes can be recognized. Therefore, the proposed scheme identifies new submeshes which can overlap with other submeshes.
Figure 5. An example of recognizing the available largest size submeshes.
3.1.2 Finding Maximal Faulty Submeshes
Here we maintain a data structure F, an ordered list keeping faulty submeshes, for finding the maximal faulty submeshes. Initially any faulty processor is a single-node submesh by itself, and F keeps these single-node submeshes. These faulty submeshes are then combined into maximal faulty submeshes: first columnwise, then rowwise. For this, all the single-node faulty submeshes are inserted into F in increasing order of the x-coordinates of their bases, and then of the y-coordinates of the bases if the x-coordinates are the same. If the base coordinates are the same, the end coordinates are considered.
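Both combining procedures below rely on the S1 + S2 operation of Table I, which is just two constant-time address computations over virtual coordinates. A sketch, with the submesh struct from the sketch in Section 2 repeated and min/max helpers of our own:

    /* Table I in code: the two candidate submeshes produced by S1 + S2. */
    typedef struct { int xb, yb, xe, ye; } submesh;

    static int mini(int a, int b) { return a < b ? a : b; }
    static int maxi(int a, int b) { return a > b ? a : b; }

    /* S3 spans both submeshes horizontally but only their common rows. */
    submesh combine_s3(submesh s1, submesh s2) {
        submesh s3 = { mini(s1.xb, s2.xb), maxi(s1.yb, s2.yb),
                       maxi(s1.xe, s2.xe), mini(s1.ye, s2.ye) };
        return s3;
    }
    /* S4 spans both submeshes vertically but only their common columns. */
    submesh combine_s4(submesh s1, submesh s2) {
        submesh s4 = { maxi(s1.xb, s2.xb), mini(s1.yb, s2.yb),
                       mini(s1.xe, s2.xe), maxi(s1.ye, s2.ye) };
        return s4;
    }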
Next is the procedure for columnwise combining.
Procedure Column_combining()
  S1 ← first submesh in F; S5 ← Nil
  Repeat
    S2 ← next submesh in F
    if (x1b = x2b)
      if (y1b = 0 ∧ S1 ≠ S5) then S5 ← S1
      if (S1 ||u S2)
        S3 ← S1 + S2; delete S1 and S2 from F; insert S3 into F
        if (y3b = 0) then S5 ← S3
        S1 ← S3
      else S1 ← S2
    else
      if (S5 ≠ Nil ∧ S1 ||u S5)
        S3 ← S1 + S5; delete S1 and S5 from F; insert S3 into F
      S1 ← S2; S5 ← Nil
  Until all submeshes in F are considered
For example, in Figure 6(a), suppose that the 17 processors marked X are faulty. Initially F contains 17 submeshes whose addresses are ((0, 0), (0, 0)), ((0, 2), (0, 2)), ((0, 3), (0, 3)), ((0, 5), (0, 5)), ((1, 0), (1, 0)), ((1, 1), (1, 1)), ..., and ((3, 5), (3, 5)). First consider the two submeshes S1((0, 0), (0, 0)) and S2((0, 2), (0, 2)). Here S5 is ((0, 0), (0, 0)). Since ¬(S1 ||u S2), S1 and S2 cannot be combined. Now consider S1((0, 2), (0, 2)) and S2((0, 3), (0, 3)). Since S1 ||u S2, S1 and S2 are combined into S3((0, 2), (0, 3)) and deleted from F. Next, S1((0, 2), (0, 3)) and S2((0, 5), (0, 5)) cannot be combined. Consider S1((0, 5), (0, 5)) and S2((1, 0), (1, 0)). The x-coordinates of the bases of S1 and S2 are different. Therefore, we consider S1((0, 5), (0, 5)) and S5((0, 0), (0, 0)). Since S1 ||u S5 by wraparound, those submeshes are combined into ((0, 5), (0, 0)) and deleted from F. The new submesh ((0, 5), (0, 0)) is inserted into F, and S2((1, 0), (1, 0)) becomes the new S1. So far, ((0, 2), (0, 2)), ((0, 3), (0, 3)), ((0, 0), (0, 0)), and ((0, 5), (0, 5)) have been deleted, and ((0, 2), (0, 3)) and ((0, 5), (0, 0)) have been inserted. This procedure continues, and finally F contains ((0, 2), (0, 3)), ((0, 5), (0, 0)), ((1, 4), (1, 2)), ((2, 1), (2, 1)), ((2, 4), (2, 5)) and ((3, 1), (3, 5)) after the columnwise combining, as shown in Figure 6(b). Note that the submeshes in F after columnwise combining have the following properties: (1) the width of every submesh is 1; (2) if S1 is ahead of S2 in F, either (x1b = x2b ∧ y1e < y2b) or (x1e < x2b) holds; (3) all submeshes are separate.
Next, the procedure for rowwise combining is given as follows.
Procedure Row_combining()
  S1 ← first submesh in F
  S2 ← next submesh in F
  Repeat
    if (S1 ||r S2 ∨ S2 ||r S1)
      S3 ← S1 + S2
      insert S3 into F
      if S1 ⊆ S3, delete S1 from F and S1 ← S3
      if S2 ⊆ S3, delete S2 from F
    if there is a next submesh of S2 in F
      S2 ← that submesh
    else
      S1 ← next submesh of S1 in F
      S2 ← next submesh of S1 in F
  Until no more next submesh of S1 exists /* all submeshes in F are considered */
For example, suppose that the faulty processors are combined as shown in Figure 6(b) after the columnwise combining. Initially, all submeshes are arranged in increasing order of the y-coordinates of their bases, and then of the x-coordinates of the bases if the y-coordinates are the same. If the base coordinates are the same, the end coordinates are considered. Thus, the submeshes in Figure 6(b) are ordered as ((2, 1), (2, 1)), ((3, 1), (3, 5)), ((0, 2), (0, 3)), ((1, 4), (1, 2)), ((2, 4), (2, 5)), and ((0, 5), (0, 0)). First, considering S1((2, 1), (2, 1)) and S2((3, 1), (3, 5)), which are adjacent to each other, the new submesh ((2, 1), (3, 1)) is identified. Since ((2, 1), (2, 1)) ⊆ ((2, 1), (3, 1)), ((2, 1), (2, 1)) is deleted from F, ((2, 1), (3, 1)) is inserted into F, ((2, 1), (3, 1)) is the new S1, and ((0, 2), (0, 3)) is the new S2. S1 cannot be combined with S2. Consider S2((1, 4), (1, 2)). Here, a new submesh ((1, 1), (3, 1)) is identified, and it becomes the new S1.
This S1 cannot be combined with any other remaining submesh ((2, 4), (2, 5)) or ((0, 5), (0, 0)) in F. Now the new S1 is ((3, 1), (3, 5)) and the new S2 is ((0, 2), (0, 3)). This procedure continues, and the final submeshes in F after rowwise combining are ((1, 1), (3, 1)), ((3, 1), (3, 5)), ((0, 2), (0, 3)), ((0, 2), (1, 2)), ((1, 4), (1, 2)), ((1, 4), (3, 5)), ((0, 5), (3, 5)), and ((0, 5), (1, 0)), as shown in Figure 6(c).
Figure 6. An example of finding faulty submeshes: (a) the faulty processors, (b) after columnwise combining, (c) after rowwise combining.
We analyze the time complexity of the procedure for finding maximal faulty submeshes as follows. Assume that Nf is the number of faulty processors in an a × b mesh system. The initial sorting of the faulty processors takes O(Nf log2 Nf). Columnwise combining takes only Θ(Nf), since each faulty processor is considered only once and deletion/insertion of a submesh takes Θ(1) using a doubly-linked list. Rowwise combining takes O(Nf^2), because in the worst case every submesh in F must be checked against every other submesh in F. Consequently, the time to find maximal faulty submeshes is O(Nf^2). We next show how the maximal fault-free submeshes are identified using the maximal faulty submeshes.
3.2 Maximal Fault-free Submeshes
Here we maintain another data structure FF, an ordered list keeping fault-free submeshes. The basic idea for identifying maximal fault-free submeshes is to chop off the portions of the fault-free submeshes which are part of maximal faulty submeshes. Initially, FF keeps the one entire mesh. Then the faulty submeshes are removed from it one by one. In this section, the mechanism for removing faulty submeshes is first explained; then the procedure for identifying maximal fault-free submeshes follows.
3.2.1 Removing Submeshes
Assume S1 Ø S2 and S9 ≡ S1 ∩ S2. Let k be the number of equal coordinates in the addresses of S1 and S9. Recall that each submesh has four coordinates: xb, yb, xe, and ye. In Figure 7, the cases corresponding to the five different values of k are shown. For example, in the case of Figure 7(c), k = 2, since x1b = x9b and y1b = y9b, but x1e ≠ x9e and y1e ≠ y9e.
Figure 7. Five different values of k: (a) k = 4, (b) k = 3, (c) k = 2, (d) k = 1, (e) k = 0.
Definition 8: Removing S2 from S1, S1 - S2, is a procedure for identifying new maximal submeshes such that the new submeshes are the remainder of S1 after excluding the processors belonging to S2. S1 - S2 is the same as S1 - S9, where S9 ≡ S1 ∩ S2. If S1 ◊ S2, S1 - S2 = S1 and there is no change in S1. If S1 Ø S2, new submeshes may be identified. The details of the procedure for each value of k are explained below.
k = 4: Since S1 ≡ S9, no submesh remains after removing S9 from S1.
k = 3: There exist four cases, as shown in Figure 8: Case 1: y1e ≠ y9e, Case 2: x1e ≠ x9e, Case 3: y1b ≠ y9b, and Case 4: x1b ≠ x9b. A new submesh S3 is generated after removing S9 from S1. The address of S3 is given in Table II.
Figure 8. Four cases of location of the intersected submesh when k = 3.
Table II. Address of the new submesh from S1 - S9 when k = 3.
Case       x3b        y3b        x3e        y3e
1 (A)      x1b        y9e + 1    x1e        y1e
2 (B)      x9e + 1    y1b        x1e        y1e
3 (C)      x1b        y1b        x1e        y9b - 1
4 (D)      x1b        y1b        x9b - 1    y1e
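The four address formulas of Table II recur in the k = 2, 1 and 0 cases below: for any overlapped pair, the remainder S1 - S9 consists exactly of the non-degenerate candidates among the four strips A-D, i.e. 4 - k submeshes. A sketch of this uniform treatment; the struct and helper names are ours.

    /* S1 - S9 for any k: generate the four Table II candidates and keep
     * the non-degenerate ones (4 - k of them survive). */
    typedef struct { int xb, yb, xe, ye; } submesh;

    int remove_sub(submesh s1, submesh s9, submesh out[4]) {
        int n = 0;
        if (s9.ye < s1.ye) {                    /* Case 1 (A): strip above S9 */
            submesh a = { s1.xb, s9.ye + 1, s1.xe, s1.ye }; out[n++] = a;
        }
        if (s9.xe < s1.xe) {                    /* Case 2 (B): strip right of S9 */
            submesh b = { s9.xe + 1, s1.yb, s1.xe, s1.ye }; out[n++] = b;
        }
        if (s1.yb < s9.yb) {                    /* Case 3 (C): strip below S9 */
            submesh c = { s1.xb, s1.yb, s1.xe, s9.yb - 1 }; out[n++] = c;
        }
        if (s1.xb < s9.xb) {                    /* Case 4 (D): strip left of S9 */
            submesh d = { s1.xb, s1.yb, s9.xb - 1, s1.ye }; out[n++] = d;
        }
        return n;                               /* number of new maximal submeshes */
    }

Note that the resulting strips may overlap at corners, which matches the scheme's choice of keeping overlapping maximal submeshes.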
k = 2: There exist six cases, as shown in Figure 9: Case 1: x1b = x9b ∧ y1b = y9b, Case 2: x1b = x9b ∧ x1e = x9e, Case 3: x1b = x9b ∧ y1e = y9e, Case 4: y1b = y9b ∧ x1e = x9e, Case 5: y1b = y9b ∧ y1e = y9e, and Case 6: x1e = x9e ∧ y1e = y9e. In each case, two new submeshes are generated after removing S9 from S1. The addresses of the new submeshes in Figure 9 are as follows: A, C, G: same as Case 1 in Table II; B, F, J: same as Case 2; D, E, K: same as Case 3; H, I, L: same as Case 4.
Figure 9. Six cases of location of the intersected submesh when k = 2.
k = 1: There exist four cases, as shown in Figure 10: Case 1: x1b = x9b, Case 2: x1e = x9e, Case 3: y1e = y9e, and Case 4: y1b = y9b. In each case, three new submeshes are generated. The addresses of the new submeshes in Figure 10 are as follows: A, D, K: same as Case 1 in Table II; B, I, L: same as Case 2; C, F, H: same as Case 3; E, G, J: same as Case 4.
Figure 10. Four cases of location of the intersected submesh when k = 1.
k = 0: Four new submeshes are generated, as shown in Figure 11. The addresses of the new submeshes in Figure 11 are as follows: A: same as Case 1 in Table II; D: same as Case 2; C: same as Case 3; B: same as Case 4.
Figure 11. Location of the intersected submesh when k = 0.
3.2.2 Identifying Maximal Fault-free Submeshes
After the removing operation, the maximal fault-free submeshes are identified using the following procedure.
Procedure Collection_faultFree()
  FF = {((0, 0), (a-1, b-1))}
  F = set of faulty submeshes
  while F is not empty
    S2 ← extract first submesh in F
    repeat
      S1 ← next submesh in FF, starting from the first
      if S1 Ø S2
        generate new submeshes from S1 - S2 as explained in Section 3.2.1
        if any of those submeshes is not covered by any submesh in FF, insert it into FF
    until all submeshes in FF are considered
Recall that there are 17 faulty processors (marked 'X') in Figure 6(a), and that those 17 processors were combined into 8 overlapped faulty submeshes, shown in Figure 6(c). Let us use the same example. There are 19 fault-free processors (not marked 'X') in Figure 6(a). To identify the maximal fault-free submeshes among the 19 fault-free processors, the faulty submeshes in Figure 6(c) are removed from the whole torus, instead of directly combining the 19 fault-free processors. Refer to Figure 12. Initially, the whole torus ((0, 0), (5, 5)) exists. After removing ((1, 1), (3, 1)), the first faulty submesh in Figure 6(c), from the whole torus, two overlapped submeshes, ((0, 2), (5, 0)) and ((4, 0), (0, 5)), exist, as shown in Figure 12(a). Next, consider ((3, 1), (3, 5)), the second faulty submesh. It does not overlap with ((4, 0), (0, 5)) but overlaps with ((0, 2), (5, 0)): ((3, 1), (3, 5)) ∩ ((0, 2), (5, 0)) = ((3, 2), (3, 5)), and ((0, 2), (5, 0)) - ((3, 2), (3, 5)) = ((0, 0), (5, 0)) and ((4, 2), (2, 0)). The submeshes remaining after removing ((3, 1), (3, 5)) are ((4, 0), (0, 5)), ((0, 0), (5, 0)), and ((4, 2), (2, 0)), as shown in Figure 12(b). Similarly, the remaining faulty submeshes, ((0, 2), (0, 3)), ((0, 2), (1, 2)), ((1, 4), (1, 2)), ((1, 4), (3, 5)), ((0, 5), (3, 5)), and ((0, 5), (1, 0)), are removed. Finally, as shown in Figure 12(c), there exist six fault-free submeshes, ((1, 3), (2, 3)), ((2, 0), (5, 0)), ((2, 2), (2, 3)), ((4, 0), (5, 5)), ((4, 1), (0, 1)), and ((4, 4), (0, 4)), which are the maximal fault-free submeshes.
Figure 12. An example of removing faulty submeshes.
We analyze the time complexity of the procedure to identify the maximal fault-free submeshes. Initially, FF contains only one submesh and F contains O(Nf) submeshes. The while loop executes O(Nf) times and the repeat loop executes O(Nf) times. Generation of new submeshes takes O(1) time. Comparison and insertion take O(Nf) time because FF is an ordered list.
Therefore, the time to find the maximal fault-free submeshes is O(Nf^3). Thus, the total time of the proposed scheme is O(Nf^3). Recall that Nf is the number of faulty processors in an a × b mesh system.
4 Conclusion
In this paper, we have proposed a scheme which can efficiently identify the largest size fault-free submeshes in a 2D torus with faulty processors. The proposed scheme employs a two-phase approach to systematically find the desired fault-free submeshes. The main idea is that the largest size faulty submeshes are first identified, and then the portions of fault-free submeshes overlapping the maximal faulty submeshes are excluded to find the largest size fault-free submeshes. For the effective manipulation of this process, the relative locations of any pair of submeshes in a 2D torus have also been defined. The time complexity of the proposed scheme is O(Nf^3), where Nf is the number of faulty processors in a 2D torus. Even though the task allocation problem in 2D meshes has recently drawn a lot of attention, the allocation problem in a torus system in the presence of faulty nodes has not. We are currently developing a task allocation scheme for a faulty 2D torus based on the proposed reconfiguration scheme.
References
[1] R. Alverson et al., "The Tera computer system," Proc. 1990 Int'l Conf. on Supercomputing, pp. 1-6, 1990.
[2] R. E. Kessler and J. L. Schwarzmeier, "CRAY T3D: A new dimension for Cray Research," Proc. COMPCON, pp. 176-182, Feb. 1993.
[3] J. Ding and L. N. Bhuyan, "An adaptive submesh allocation strategy for two-dimensional mesh connected systems," Int'l Conf. on Parallel Processing, pp. II-193-200, Aug. 1993.
[4] D. D. Sharma and D. K. Pradhan, "A fast and efficient strategy for submesh allocation in mesh-connected parallel computers," Symp. on Parallel and Distributed Processing, pp. 682-689, Dec. 1993.
[5] S. M. Yoo, H. Y. Youn, and B. Shirazi, "An efficient task allocation scheme for 2D mesh architectures," IEEE Trans. on Parallel and Distributed Systems, pp. 934-942, Sept. 1997.
[6] T. Liu, W. K. Huang, F. Lombardi, and L. N. Bhuyan, "A submesh allocation scheme for mesh-connected multicomputer systems," Int'l Conf. on Parallel Processing, pp. II-159-163, Aug. 1995.
[7] S. Bhattacharya and W. T. Tsai, "Lookahead processor allocation in mesh-connected massively parallel multicomputers," Int'l Parallel Processing Symposium, pp. 868-875, April 1994.
[8] J. Upadhayay and P. Mohapatra, "An efficient processor allocation scheme for mesh connected parallel computers," Symposium on Parallel and Distributed Processing, pp. 196-203, Oct. 1995.
[9] W. Liu, V. Lo, and K. Windisch, "Non-contiguous processor allocation algorithms for distributed memory multicomputers," Supercomputing, pp. 227-236, Nov. 1994.
[10] G. Kim and H. Yoon, "On submesh allocation for mesh multicomputers: a best-fit allocation and a virtual submesh allocation for faulty meshes," IEEE Trans. on Parallel and Distributed Systems, pp. 175-185, Feb. 1998.
[11] G. M. Chiu and S. K. Chen, "An efficient submesh allocation scheme for two-dimensional meshes with little overhead," IEEE Trans. on Parallel and Distributed Systems, pp. 471-486, May 1999.
[12] B. S. Yoo and C. R. Das, "A fast and efficient processor allocation scheme for mesh-connected multicomputers," IEEE Trans. on Computers, 51 (1), pp. 46-60, Jan. 2002.
[13] L. D. de Cerio, M. Valero-Garcia, and A. Gonzalez, "Hypercube algorithms on mesh connected multicomputers," IEEE Trans. on Parallel and Distributed Systems, 13 (12), pp. 1247-1260, Dec. 2002.
[14] D. Wang and J. Cao, "On optimal hierarchical configuration of distributed systems on mesh and hypercube," Parallel and Distributed Processing Symposium, p. 8, April 2003.
[15] K. H. Seo and S. C. Kim, "A dynamic processor management scheme on the reconfigurable meshes," Int'l Conf. on Parallel and Distributed Computing, Applications and Technologies (PDCAT 2003), pp. 497-501, Aug. 2003.
[16] N. C. Wang and T. S. Chen, "Task migration in all-port wormhole-routed 2D mesh multicomputers," Int'l Symp. on Parallel Architectures, Algorithms and Networks, pp. 123-128, May 2004.
[17] H. J. Ho and W. M. Lin, "A performance-optimizing scheduling technique for mesh-connected multicomputers based on real-time job size distribution," Int'l Conf. on Parallel and Distributed Systems (ICPADS 2004), pp. 639-646, July 2004.
[18] A. L. Rosenberg, "On scheduling mesh-structured computations for Internet-based computing," IEEE Trans. on Computers, 53 (9), pp. 1176-1186, Sept. 2004.
[19] S. Latifi, "Distributed subcube identification algorithms for reliable hypercubes," Information Processing Letters 38, pp. 315-321, June 1991.
[20] N. F. Tzeng and G. Lin, "Identifying maximal incomplete subcubes in faulty hypercubes," Symp. on Parallel and Distributed Computing Systems, pp. 186-193, Oct. 1994.

Distributing State Space for Parallel Computation of CTL Model Checking
Mustapha Bourahla, Computer Science Department, University of Biskra, Algeria, mbourahla@hotmail.com
Mohamed Benmohamed, Computer Science Department, University of Constantine, Algeria, ibnm@yahoo.fr
Keywords: Model Checking, Distributed Systems, Parallel Computation
Received: March 3, 2003
In this work we discuss the problem of performing distributed CTL model checking by splitting the given state space into several partial state spaces, and therefore distributing the transition relation. This work thus significantly extends the scope of properties that can be verified for very large designs. Each computer involved in the distributed computation owns a partial state space and performs a model checking algorithm on it. To proceed, a CTL property is checked by the different processes and the results are combined to produce the final decision about the satisfaction of this CTL property. In this paper, we present the splitting algorithm for the state space. We also present a new algorithm for distributed CTL model checking whose correctness is formally proved.
Povzetek: The paper analyses the operation of distributed CTL model checking.
1 Introduction
The main aim in exploiting a distributed environment for model checking [5] is to extend the applicability of model checking algorithms to larger and more complex systems. Many sequential approaches have been proposed to deal with large state spaces, e.g. partial-order methods, symbolic verification, abstractions, and partial state space reasoning. Often these approaches do not suffice: time or space resources can still significantly limit their practical applicability [3]. Recently, a new promising method for increasing the memory capacity was introduced. The method uses the collective pool of memory modules in a network of processes. A parallel supercomputer, a grid or a network of computers can provide the extra resources needed to tackle more realistic verification problems. Here we consider a network of workstations that communicate via message passing.
An important feature of algorithms running in a distributed environment is to solve the given task by distributing the data among the participating workstations with as small an amount of coordination as possible. One of the main issues in distributing model checking algorithms is how to partition the state space (the data) among the individual computers, called here network nodes. There are two aspects that significantly influence the overall effectiveness of model checking in the distributed environment: locality and (spatial) balance of the state space partition. Locality means that most of a state's descendants are assigned to the same node as the parent state, thus reducing communication and cooperation overhead. Balance means that each network node is assigned approximately the same number of states, thus achieving good speed-up. The main idea of many distributed algorithms is similar: the state graph is partitioned among the network nodes, i.e., each network node owns a subset of the state space. The differences lie in the way the state space is partitioned. Probabilistic techniques to partition the state space have been used e.g. in [2, 10, 14], and a technique which exploits some structural properties derived from the verified formula has been proposed in [1]. The model checking algorithm running on each network node thus has access only to a part of the entire system. Depending on the type of the algorithm, it communicates with other nodes to achieve the required (global) result. The authors of [9, 16] have developed an approach to model checking of software which uses modularity. Their notion of a module differs from that used in modular model checking as understood for example in [7, 8, 13]. A module here is not a part of a whole system that runs in parallel with other modules (i.e. that contributes to the whole system in a multiplicative way), but a subset of a state space that originates from splitting the whole system in an additive way. It is defined by following the syntactical structure of the program. This notion of module has also been used in [11], where the system is split according to the semantics of the program. Besides this partition, the authors in [9, 16] have also defined the notion of an assumption function that represents partial knowledge about the truth of formulas provided by the rest of the system (by the other parts).
In this contribution we propose a new method of splitting the state space that is based on manipulations of the Binary Decision Diagrams (BDDs) implementing this state space. Any transition system, modeled as a Kripke structure, is implemented by a BDD. Our method partitions the original Kripke structure (the original BDD) by creating different BDDs that are linked in a way preserving the original behavior. Hence, these BDDs represent the different fragments of the original Kripke structure. A CTL property is checked over the fragments of the Kripke structure in a parallel or a sequential way, where each fragment is used by a different process running the model checking algorithm on a workstation node. These processes exchange information when necessary to make a decision about the satisfaction check. The exchange of information is based essentially on the links between the different fragments. Furthermore, we have modified the model checking algorithm in such a way that it can be run in a distributed environment.
The main idea is that, once the system is partitioned, the Kripke structure on each network node can contain partial states and transitions between these states. These transitions can be unconditional (their execution does not require an exchange of information with other processes) or conditional, in which case their execution happens only after the reception of information from processes running on other nodes; this information represents the condition of the conditional transitions. In all cases we also have to take into account the associated communication complexity. The rest of this paper is organized as follows: in Section 2, we present the definition of transition systems and their modeling as Kripke structures. Section 3 is devoted to our method of partitioning the state space. In Section 4, we present the language CTL and its semantics over a partitioned Kripke structure. The distributed CTL model checking algorithm and the proof of its correctness are presented in Section 5. At the end, in Section 6, we present a conclusion and related works.
2 Modeling Transition Systems
Our aim is to perform a model checking algorithm on a transition system, which can model any concurrent system that can be described with a high-level language, on a cluster of n workstations, called (network) nodes. In addition to the sequential case, a partition algorithm is used to partition the state space among the nodes. After partitioning the state space, each node owns a part of the original state space. We will show that we can run distributed model checking on these parts of the state space to verify CTL properties.
Definition 1 A transition system M is a triple M = (V, R, I) where V = {v1, ..., vn} is a finite set of Boolean variables containing a subset of the atomic propositions (AP), R ⊆ 2^V × 2^V is a transition relation, and I ⊆ 2^V is a set of initial states.
Definition 2 A Kripke structure is a quintuple K = (Q, R, s0, AP, L) where:
1. Q is a non-empty (possibly infinite) set of states,
2. R ⊆ Q × Q is a total transition relation (i.e., ∀s ∈ Q : ∃s' ∈ Q : (s, s') ∈ R),
3. s0 ∈ Q is the initial state,
4. AP is a set of atomic propositions, and
5. L : Q → 2^AP is a state-labeling function.
A Kripke structure may be viewed as a labeled directed graph. The states are the graph's vertices, the transition relation defines the edges, the initial state is marked with a small incoming arrow, and each state is labeled with a set of atomic propositions. A transition system is modeled with a Kripke structure, where the states of the Kripke structure (Q) are mapped to Boolean expressions composed of the variables from V describing the transition system. These Boolean expressions are defined in the transition set of the original system. Note that, because the transition relation is required to be total, our model described with a Kripke structure has the same set of transitions defined by the relation R in the original transition system, with the addition of loop transitions from and to vertices that have no outgoing edges. The labeling function L : Q → 2^AP is intended to associate with each state in Q an interpretation of the atomic propositions in AP. Thus, through L we know for each state s ∈ Q which atomic propositions are true, namely those in L(s), and which ones are false, namely those not in L(s). The Kripke structures are implemented using Reduced Ordered Binary Decision Diagrams (ROBDDs, or shortly BDDs). These structures will be used as a base for our algorithms.
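One concrete, explicit-state reading of Definitions 1 and 2 for small V: a state is a bit vector over V = {v1, ..., vn} and the relation is a list of pairs. A minimal sketch; all names are ours, and a real implementation would of course use BDDs rather than explicit lists.

    /* Explicit-state sketch of a transition system / Kripke structure. */
    #include <stdbool.h>

    #define MAXTRANS 64

    typedef unsigned int state;              /* bit i = value of variable vi */

    typedef struct {
        int   n;                             /* |V| */
        state s0;                            /* initial state */
        int   ntrans;
        state src[MAXTRANS], dst[MAXTRANS];  /* transition relation R */
    } kripke;

    /* Totality check from Definition 2: every reached state must move on. */
    bool is_total(const kripke *k) {
        for (int i = 0; i < k->ntrans; i++) {
            bool has_succ = false;
            for (int j = 0; j < k->ntrans; j++)
                if (k->src[j] == k->dst[i]) { has_succ = true; break; }
            if (!has_succ) return false;     /* dst[i] has no outgoing edge */
        }
        return true;
    }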
Example 1 The Kripke structure of the transition system defined by the triple M = (V = {a, b, c}, R = {(a ∧ b ∧ c, a ∧ b ∧ c), (a ∧ b ∧ c, a ∧ b ∧ c), (a ∧ b ∧ c, a ∧ b ∧ c), (a ∧ b ∧ c, a ∧ b ∧ c), (a ∧ b ∧ c, a ∧ b ∧ c), (a ∧ b ∧ c, a ∧ b ∧ c)}, I = a ∧ b ∧ c) is shown in Figure 1, where s0 is the initial state of this Kripke structure. We assume AP = {b}.
Figure 1: Kripke structure of the transition system M.
3 Partitioning the Kripke Structure
The Kripke structure (which is represented by a BDD) modeling a transition system is partitioned using the algorithm defined below (Algorithm 1). To partition any Kripke structure, we have to define a list of clusters C of Boolean variables from the set of variables V. Each cluster of variables from C is then used to generate a corresponding Kripke sub-structure. We note that the variables in V are ordered. The clusters C should also be ordered, which means that Ci ≺ Cj if, for all vi ∈ Ci and vj ∈ Cj, vi ≺ vj. Each Kripke sub-structure Ki is composed of states that can be mapped to Boolean expressions B of the variables in Ci, with the condition that there exists B' such that B is a sub-expression of B' and B' is the Boolean expression, over the whole set V, of a state si in the original Kripke structure K. The composition of the generated Kripke sub-structures Ki, i = 0, 1, ..., n, should be equal to the original Kripke structure K.
Algorithm 1 Partitioning the Kripke structure
BDD partition(BDD K, G, H; ClustersList C) {
  if (C ≠ nil) {
    if (K == true) return G
    if (K == false) return H
    x = min(root(K), root(G), root(H))
    if (x ∈ CurrentCluster(C)) {
      T = partition(Kx, Gx, Hx, C)          /* positive cofactors */
      E = partition(K¬x, G¬x, H¬x, C)       /* negative cofactors */
      if (T == E) return T
      return createNode("x", T, E)
    } else {
      Ki = partition(K, G, H, NextCluster(C))
      if (∃P ∈ FragmentsList | P == Ki)
        return createNode(P.Label, nil, nil)
      else {
        InsertFragment(Ki)
        return createNode("Ki", nil, nil)
      }
    }
  } else return nil
}
The Kripke sub-structures resulting from the partition of the given Kripke structure are called fragments. This algorithm uses a list of fragments, FragmentsList, to store the generated fragments of the original Kripke structure.
Definition 3 A Kripke structure Ki = (Qi, Ri, s0i, APi, Li) is a fragment of a Kripke structure K = (Q, R, s0, AP, L) if
- ∀s ∈ Qi : ∃s' ∈ Q | s ∧ s' = s',
- s0i ∧ s0 = s0,
- ∀(s1, s2) ∈ Ri : ∃(s'1, s'2) ∈ R | (s1 ∧ s'1 = s'1) ∧ (s2 ∧ s'2 = s'2),
- APi ⊆ AP,
- Li : Qi → 2^APi | ∀s ∈ Qi : ∃s' ∈ Q : (s ∧ s' = s') ∧ (Li(s) ⊆ L(s')).
By this algorithm, we can partition a complete Kripke structure K into many fragments Ki. Each Ki is the Kripke structure of a sub-system. Once the given system is partitioned, the resulting Kripke structures K0, ..., Kn can have links between them, which make the sub-systems dependent. A fragment Ki of K is a sub-structure of K satisfying the property that every state in Ki can have unconditional and/or conditional successor states in Ki. Thus, a transition in the system represented by the fragment Ki can be an unconditional or a conditional transition. Formally, if ∃Ki and ∃(s1, s2) : (s1, s2) ∈ Ri (Ri is the transition relation of Ki), we denote the unconditional transition by s1 → s2, which means a transition from the state s1 to the state s2 can occur without any reference to the other fragments (its execution is independent). By contrast, the conditional transition has the form s1 →(j) s2, where j is a link label to another Kripke structure Kj. This is handled by communication between the processes running the different sub-structures. It means that a process will be suspended on a conditional transition until the reception of information from the process running the Kripke sub-structure indicated by the label. There is a particular Kripke sub-structure which is not referenced by any transition in the other sub-structures; we call it the root Kripke sub-structure.
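A fragment transition as stored on one node can carry its link label explicitly. A minimal sketch of this representation and of the firing rule for conditional transitions; the types and names are ours.

    /* An edge of a fragment: unconditional (link == -1), or conditional
     * s1 -(j)-> s2, waiting on the answer of the process running Kj. */
    typedef unsigned int state;

    typedef struct {
        state src, dst;
        int   link;        /* -1: unconditional; j >= 0: wait on process j */
    } ftrans;

    /* A conditional edge may fire only once its condition has arrived. */
    int can_fire(ftrans t, const int condition_arrived[] /* by process id */) {
        return t.link < 0 || condition_arrived[t.link];
    }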
This means that a process will be suspended on a conditional transition until reception of information from the process running the Kripke sub-structure indicated by the label. There is a particular Kripke sub-structure which is not referenced by any transition in the other sub-structures. We call this the root Kripke sub-structure. A path n in a Kripke structure K from a state s0 is a sequence n = s0s1 • • • such that Vi > 0 : si, si+1 e Q and (si,, si+1) e R. An equivalent path can be computed using the fragments of K. Figure 2 shows the partition of the Kripke structure presented in Figure 1 after the execution of the following call: InsertFragment(partition(K, true, false, {{a}, {b}, {c}})). There are three clusters of variables ({{a}, {b}, {c}}. Each Kripke structure K, represents a sub-system M,. The sub-systems M, can be executed in sequence or in parallel. These Kripke sub-structures can be represented as BDDs (It suffice just to remove the transition labels). So, for each S1 Q K4 K 4 K5 K5 K4 K3 ^-C^ Ki Figure 2: Fragments of K with their links state we realize a formula of the form F = f ™ ^ gi, where f is the state formula and the sub-formulas gi are the labels on its outgoing edges. The transitions drawn by the dashed edges have a label indicating which sub-system to communicate with (receivefrom{)) to get information (the condition) before this transition can take place (it is waiting for an information). We should remark that the number of generated Kripke sub-structures depends on the number of the speci-fed clusters of variables and the original Kripke structure. Also, some Kripke sub-structures can have the same set of states but different transitions. This is very important to reduce the communication complexity and to improve the performance of the model checking. The advantage of representing any transition system with a fragmented and therefore distributed Kripke structure that we can execute this transition system in a sequential or in parallel manner. The main goal of our distributed algorithm is to reduce the memory requirement. In symbolic model checking, pre-image is one of the operations with the highest memory requirement. Given a set of states S, pre-image computes pred{S), which is the set of all predecessors of states in S. The pre-image operation can be described by the formula pred{S) = {s' G Q |3s G S : R{s', s)}. It is easy to see that the memory requirement of this operation grows with the sizes of the transition relation R and the set S. Furthermore, intermediate results sometimes exceed the memory capacity even when pred{S) can be held in memory. Our distributed algorithm reduces memory requirements by restricting each of the computed sets of states to a fragment. This takes care of the S parameter of pre-image, and the reduction of the transition as well. The following algorithm is the parallel version of the pre-image function pred. K Algorithm 2 the pre-image function 2q pred(KripkeK, 2QS) { res = {s A receivefrom{processi) | 3t.s ■ t A t e S}U{s | 3t.s ^ t A t e S} for each process processi = pid sendto(processi, res) return res } For each Kripke sub-structure Ki, we associate a process to calculate the pre-image function pred in parallel with the other processes. The image function succ can be defined by the same way. All the processesprocessi, i = 0, ••• ,n execute pred once in parallel and then they begin communication to exchange information until there are no information to exchange which indicates the end of the function pre-image pred. 
The global result is the pre-image result of the process running the root Kripke sub-structure.

Example 2 In this example (Figure 3), we compute the pre-image of the states characterized by the expression S = a ∧ b ∧ c. We observe that the computation is parallel: the pre-image function pred is executed simultaneously on all the Kripke sub-structures by different processes P0, …, P6 located on different nodes. If a process has a link label to another process, it sends a message to get the result of that process's execution. This communication continues until all the requested information has been delivered. The result of the global pre-image function is the result of the process running the root Kripke sub-structure K0.

To reconstruct the original Kripke structure, it suffices to remove the conditional transitions by making them unconditional. To do that, we replace each conditional transition by the transitions that are the product of this conditional transition and all the unconditional transitions in the fragment indicated by the link label on the conditional transition. The source states and the target states are also composed to obtain the new source and target states. The resulting new transitions inherit the attribute unconditional. This process is iterated until no conditional transitions remain.

Algorithm 3 Reconstructing the original Kripke structure
Kripke Reconstruct(Kripke K) {
  while ∃t = s1 →j s2 ∈ K do {
    if Kj contains conditional transitions then
      Kj = Reconstruct(Kj)
    for all transitions s′1 → s′2 ∈ Kj do {
      Q = Q ∪ {s1 ∧ s′1, s2 ∧ s′2}
      R = R ∪ {(s1 ∧ s′1, s2 ∧ s′2)}
    }
  }
}

To get the original Kripke structure from its fragments, we just have to execute the call Reconstruct(K0).

Lemma 1 The Kripke structure produced by removing the conditional transitions from the fragments of a Kripke structure is the original Kripke structure used as input to the partition algorithm.

We have illustrated the parallel execution, which is our goal in addition to solving the state-explosion problem. Sequential execution is also possible using the same structure. In the next section, we present the CTL semantics over a transition system represented by fragments of a Kripke structure.

4 CTL Model Checking

Computation Tree Logic (CTL) is the most popular temporal logic introduced for formal verification with model checking [9, 10]. This logic is formally based on models where at each moment there may be several different possible futures (branching temporal logic). Its semantics is a tree of states; each path in the tree is intended to represent a single possible computation of the system. The tree itself thus represents all possible computations. The CTL operators allow the expression of properties of some or all computations of a system.

Definition 4 (Syntax of CTL). The language of CTL is defined by the following abstract syntax:

φ ::= p | ¬p | φ ∧ φ | ∃Xφ | ∃Gφ | ∃(φ U φ)

where p ranges over the set of atomic propositions AP, X is pronounced "next", ∃ "for some path", ∀ "for all paths" and U "until". There are two types of operators: temporal operators X, F and G, and path quantifiers ∃ and ∀. All CTL temporal operators are composed of a path quantifier followed by a temporal operator. This grammar is not given in its most succinct form, and there exist equivalence rules to express the same formula with different operators.
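For the illustrative sketches in this section, formulas of this fragment can be encoded as nested Python tuples (an assumed representation for these sketches, not the paper's data structure):

# Assumed tuple encoding of the CTL fragment of Definition 4, with atomic
# propositions written as ('p', name). Example: the formula ∃(a U ¬b):
phi = ('EU', ('p', 'a'), ('not', ('p', 'b')))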
In practice, by using these equivalence rules, a formula can be written such that negation appears only at the level of atomic propositions. Such a form of a formula is known as Positive Normal Form (henceforth PNF) [6]. A CTL-model M is represented as a Kripke structure (M = (Q, R, s0, AP, L)). A path p is an infinite sequence of states s0 s1 s2 ⋯ such that (si, si+1) ∈ R, ∀i ≥ 0. p[i] denotes the (i + 1)-th element of p. The set of paths starting in state s of the model M is defined by ΠM(s) = {p | p[0] = s}. For any CTL-model M = (Q, R, s0, AP, L) and state s ∈ Q, there is an infinite computation tree with root labeled s such that (s′, s″) is an arc in the tree if and only if (s′, s″) ∈ R. A state s for which p ∈ L(s) is sometimes called a p-state. A path is called a p-path if it consists solely of p-states.

Figure 3: Parallel computation

Definition 5 (Semantics of CTL). The semantics of CTL is defined by a satisfaction relation between a model M, one of its states s, and a formula φ: s ⊨M φ if and only if φ is valid in state s of model M. The satisfaction relation is defined by:
- s ⊨M p iff p ∈ L(s)
- s ⊨M ¬p iff not (s ⊨M p)
- s ⊨M φ ∧ ψ iff (s ⊨M φ) and (s ⊨M ψ)
- s ⊨M ∃Xφ iff ∃p ∈ ΠM(s). p[1] ⊨M φ
- s ⊨M ∃Gφ iff ∃p ∈ ΠM(s). ∀j ≥ 0. p[j] ⊨M φ
- s ⊨M ∃(φ U ψ) iff ∃p ∈ ΠM(s). (∃j ≥ 0. p[j] ⊨M ψ ∧ (∀k, 0 ≤ k < j. p[k] ⊨M φ))

Given a CTL formula φ and a CTL-model M with a finite set of states Q, the model checking algorithm computes the set of states for which φ is valid. This set is denoted Sat(φ), and it is computed recursively, i.e., by computing the set Sat(ψ) for each sub-formula ψ of φ. In order to decide whether s ⊨M φ, we just have to check whether state s is a φ-state, i.e., whether s ∈ Sat(φ).

Theorem 1 The system M represented by the Kripke structure K and the system M′ represented by its fragments Ki, 0 ≤ i ≤ n, are bisimilar.

Proof. We have to prove that M simulates M′ and that M′ simulates M. Formally,

M ∼ M′ ⇔ (∀(s1, s2) ∈ R ⇒ ∃(s̄1, s̄2) ∈ R̄) ∧ (∀(s̄1, s̄2) ∈ R̄ ⇒ ∃(s1, s2) ∈ R),

where s̄1 and s̄2 are the global states in M′ computed from s1 and s2, respectively, by the partition algorithm, and R̄ is the global transition relation of M′. We assume there are n + 1 processes, each one running a fragment from the n + 1 Kripke sub-structures. By construction, s1 has a corresponding global state in M′, defined by s̄1 = ⋀_{i=0}^{n} s1^i. Then there exists a next state of s1 that is also represented in the same form by the next states of the s1^i, which can be computed by the global transition relation R̄:

∃s2 : R(s1, s2) ⇒ ∃s2^0 : R0(s1^0, s2^0 ∧ ⋀_{j=1}^{n} receivefrom()) = R0(s1^0, s̄2),

where Ri, i = 0, …, n are the transition relations of the corresponding fragments Ki. Each process j, j = 1, …, n, sends its image state s2^j, computed using the local transition relation Rj, to the root process. The final image result is ⋀_{j=0}^{n} s2^j. We remark that, by construction, s2 is in each s2^j; hence s2 ⇒ ⋀_{j=0}^{n} s2^j. On the other hand, for any s̄1 we assume there exists s1 which is in s̄1. By the definition of a Kripke structure, there exists s2 such that R(s1, s2). By construction, s2 has a corresponding global state in M′, defined by s̄2 = ⋀_{i=0}^{n} s2^i. By the construction of the partial transition relations during the partition process, it is easy to verify that R̄(s̄1, s̄2) holds. □
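As a point of reference for the distributed algorithm of the next section, the recursive computation of Sat(φ) for this existential fragment can be sketched sequentially on an explicit Kripke structure (a minimal sketch using the tuple encoding assumed earlier; the paper's own implementation is symbolic). EG is computed as a greatest fixpoint and EU as a least fixpoint, mirroring the fixpoint() calls of Algorithm 4 below.

def sat(phi, Q, R, L):
    # Q: set of states; R: set of transition pairs (s, t); L: labeling map.
    pre = lambda S: {s for (s, t) in R if t in S}
    op = phi[0]
    if op == 'p':
        return {s for s in Q if phi[1] in L[s]}
    if op == 'not':                       # in PNF, phi[1] is atomic
        return set(Q) - sat(phi[1], Q, R, L)
    if op == 'and':
        return sat(phi[1], Q, R, L) & sat(phi[2], Q, R, L)
    if op == 'EX':
        return pre(sat(phi[1], Q, R, L))
    if op == 'EG':                        # greatest fixpoint Z = S ∩ pred(Z)
        S, Z = sat(phi[1], Q, R, L), set(Q)
        while True:
            Zn = S & pre(Z)
            if Zn == Z:
                return Z
            Z = Zn
    if op == 'EU':                        # least fixpoint Z = S2 ∪ (S1 ∩ pred(Z))
        S1, S2 = sat(phi[1], Q, R, L), sat(phi[2], Q, R, L)
        Z = set()
        while True:
            Zn = S2 | (S1 & pre(Z))
            if Zn == Z:
                return Z
            Z = Zn

# Usage on a two-state structure q0 -> q1 -> q1:
Q = {'q0', 'q1'}; R = {('q0', 'q1'), ('q1', 'q1')}
L = {'q0': {'a'}, 'q1': {'b'}}
print(sat(('EU', ('p', 'a'), ('p', 'b')), Q, R, L))   # {'q0', 'q1'}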
To define the semantics of CTL formulas over Kripke structures composed of fragment sub-structures, we need to adapt the standard semantic definition: CTL is usually interpreted over complete structures, while our structures are typically non-complete. By Theorem 1, we have the following equivalence in the definition of the semantics.

Lemma 2 (Semantics of the distributed system). Let φ be a CTL formula, and let Ki (0 ≤ i ≤ n) be the fragments of K computed using the partition algorithm. Then we have the equality

Sat(K, φ) = ⋂_{i=0}^{n} Sat(Ki, φ).

5 Distributed CTL Model Checking

Each fragment Ki is managed by a separate process Pi. These processes run in parallel, one on each node. Each process Pi checks the satisfaction of the CTL formula (which is in PNF) using the adapted symbolic model checking algorithm (Algorithm 4) presented below. If the satisfaction check can be decided only after reception of the results of other processes, the process running pSat sends messages to receive information from these processes. These steps are repeated until a fixpoint is reached (global stabilization occurs), i.e., until no new information can be computed.

Algorithm 4 Distributed CTL model checking
2^Q pSat(Kripke Ki, CTL φ) {
  switch (φ) {
    case p:
      res = {s ∧ receivefrom(Pj) | p ∈ Li(s) ∧ ∃t ∈ Qi. t →j s ∈ Ri}
          ∪ {s | p ∈ Li(s) ∧ ∃t ∈ Qi. (∀j : 0 ≤ j ≤ n. t → s ∈ Ri)}
      sendtoOtherProcesses(res)
    case ¬p:
      res = {s ∧ receivefrom(Pj) | p ∉ Li(s) ∧ ∃t ∈ Qi. t →j s ∈ Ri}
          ∪ {s | p ∉ Li(s) ∧ ∃t ∈ Qi. (∀j : 0 ≤ j ≤ n. t → s ∈ Ri)}
      sendtoOtherProcesses(res)
    case φ1 ∧ φ2: res = exchange(pSat(Ki, φ1)) ∩ exchange(pSat(Ki, φ2))
    case EXφ: res = exchange(pred(pSat(Ki, φ)))
    case EGφ: res = fixpoint(pSat(Ki, φ), false, Qi)
    case φ1 EU φ2: res = fixpoint(pSat(Ki, φ1), pSat(Ki, φ2), false)
  }
  return res
}

2^Q fixpoint(2^Q φ, ψ, init) {
  Qval = init
  do {
    Qold = exchange(Qval)
    Qval = ψ ∪ (φ ∩ pred(Qold))
  } while (Qold ≠ exchange(Qval))
  return Qval
}

2^Q exchange(2^Q S) {
  res = S
  for each process Pi ≠ pid sendto(Pi, S)
  for each process Pi ≠ pid res = res ∧ receivefrom(Pi)
  return res
}

The processes repeatedly compute information about the truth of formulas, and send and receive the computed information to and from the other processes. This finishes when a fixpoint is reached. Then each process extrapolates information, using the fact that the fixpoint has been reached. This is performed repeatedly until the information we are looking for has been computed. For each state and each formula, we want to say whether its value has already been computed or not. We consider a value for a state and a formula computed if an appropriate value of receivefrom() has already been defined for some process Pi. After the fixpoint has been reached, it can be necessary to resolve undefined values in order to continue the computation, i.e., to consider a value for some state and some formula as computed. The fact that a fixpoint has been reached cannot be detected locally. However, by employing an additional communication (exchange()) between the computers we are able to determine it. Additional communication between the processes is also needed to find out which tuples (s, φ) are minimal. Suppose a fixpoint has been reached. Each process Pi computes the set of tuples that are minimal in the set for which the value is undefined. When finished, it sends this set to every other process and receives similar information from the other processes. Using this information, each process is able to determine which tuples are minimal in the undefined set.
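The communication pattern of the exchange()-based fixpoint can be clarified by a sequential simulation (a sketch with illustrative names: the per-process sets are held in one list, and exchange() intersects them in place of the sendto()/receivefrom() rounds):

def exchange(vals):
    # Conjoin (here: intersect) the local sets of all processes.
    common = set.intersection(*vals)
    return [set(common) for _ in vals]

def distributed_fixpoint(phis, psis, inits, preds):
    # phis[i], psis[i], inits[i]: the three arguments of fixpoint() at
    # process i; preds[i]: the local pre-image function of fragment K_i.
    vals = [set(s) for s in inits]
    while True:
        olds = exchange(vals)
        vals = [psis[i] | (phis[i] & preds[i](olds[i]))
                for i in range(len(vals))]
        if exchange(vals) == olds:
            return vals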
5.1 Correctness

To prove the correctness of the distributed algorithm, we assume that the sequential algorithm is correct. The sequential algorithm evaluates a formula by computing the set of states satisfying this formula. In the distributed algorithm, every such set is owned by the process running the root Kripke sub-structure, and it is also partitioned among the processes in a conjunctive way. In the proof we show that, for every CTL formula, the set of states computed by the sequential algorithm is identical to the set computed by the processes in the distributed algorithm.

Theorem 2 (Correctness). Let φ be a CTL formula and Ki, i = 0, 1, …, n the fragments of K computed by Algorithm 1. If Sat(K, φ) and pSati(Ki, φ) for i = 0, 1, …, n terminate, then Sat(K, φ) = ⋀_{i=0}^{n} pSati(Ki, φ).

Proof. We prove the theorem by induction on the structure of φ.

If φ = p for p ∈ AP, the sequential algorithm Sat() returns the set of all the states in Q that are marked by p. The process running the root Kripke sub-structure in the distributed algorithm returns

res_id = (S_id ∧ ⋀_{j≠id} receivefrom()) = ⋀_{j=0}^{n} S_j.

This is based on the fact that each process can wait for the information representing a transition condition from another process.

For formulas of the form φ = φ1 ∧ φ2, pSat_id(φ) first computes pSat_id(φ1) ∧ pSat_id(φ2). At the end of this computation, the global set is

⋀_{i=0}^{n} (pSati(φ1) ∧ pSati(φ2)) = ⋀_{i=0}^{n} pSati(φ1) ∧ ⋀_{i=0}^{n} pSati(φ2).

By the induction hypothesis, this is identical to Sat(φ1) ∧ Sat(φ2), which is identical to Sat(φ1 ∧ φ2).

For the formula EXφ, we compute the pre-image of φ, pred(φ):

pred_id(φ) = {(s1 ∨ s2) ∧ ⋀_{j≠id} receivefrom() | ∃s′1 : s1 →j s′1 ∧ φ(s′1) ∧ ∃s′2 : s2 → s′2 ∧ φ(s′2)},

where s1 ∨ s2 is in Q_id. Conjoining the results pred_i over all the processes, i = 0, …, n, again yields the global pre-image.

If the formula is EGφ, we would like to prove that ⋀_{i=0}^{n} pSati(EGφ) = Sat(EGφ). So we need to prove that ⋀_{i=0}^{n} fixpointi(S, false, Qi) = fixpoint(S, false, Q), where S is the satisfaction set of φ. The following lemma proves a stronger requirement.

Lemma 3 Let Q^j be the value of Qval in iteration j of the sequential fixpoint algorithm. Similarly, let Q^j_id be the value of Qval in iteration j of the distributed fixpoint algorithm in process id. Q^0 is the initialisation of the sequential algorithm; Q^0_id is the initialisation of the distributed algorithm. Then, for every j: Q^j = ⋀_{k=0}^{n} Q^j_k. If the sequential fixpoint algorithm terminates after i0 iterations, then so does the distributed fixpoint algorithm.

Proof. We prove the lemma by induction on the number j of iterations in the loop of the sequential function fixpoint. At iteration 0, the initialization of the sequential algorithm, as well as of the distributed algorithm, is false. Hence, Q^0 = Q^0_id = false, which implies Q^0 = ⋀_{i=0}^{n} Q^0_i. Both algorithms perform at least one iteration, so they do not terminate at iteration 0. Assume the lemma holds for iteration j; we prove it for iteration j + 1. By the induction hypothesis of Lemma 3 we know that Q^j = ⋀_{k=0}^{n} Q^j_k. Moreover, Q^{j+1} = Sat(Q^j, φ) and Q^{j+1}_id = pSat_id(Q^j_id, φ). Thus, the induction hypothesis of Theorem 2 is applicable and implies that Sat(Q^j, φ) = ⋀_{i=0}^{n} pSati(Q^j_i, φ). Hence, Q^{j+1} = ⋀_{i=0}^{n} Q^{j+1}_i.

The sequential fixpoint procedure terminates at iteration j + 1 if Q^j = Q^{j+1}. We prove that this holds if and only if, for every process id, exchange(Q^j_id) = exchange(Q^{j+1}_id). From the above, Q^j = ⋀_{i=0}^{n} Q^j_i and Q^{j+1} = ⋀_{i=0}^{n} Q^{j+1}_i. Hence

∀id : exchange(Q^j_id) = exchange(Q^{j+1}_id) ⇔ Q^j = Q^{j+1}.

The last equality is implied by the previous one. This completes the proof of the lemma. □
The proof of the theorem is thus completed. □

6 Conclusions and Related Work

In this work we have considered a technique that uses the links established between the Kripke sub-structures (fragments) generated by a partitioning algorithm over the state space to perform CTL model checking in a distributed environment. We have developed the necessary theoretical background and described the distributed algorithm. An experimental version of the algorithm is currently being implemented. One point that certainly deserves at least some comment is how to choose the clusters of variables for the partitioning so as to minimize communication. For example, we could choose to partition according to a few variables that are known to change rarely. We expect to elaborate more possibilities in the future.

In this work, in contrast to others, we have focused on the partitioning problem and we have given a complete approach. Other work is based on the modular model checking approach [9], which uses assumptions about the missing parts of the state space. Another approach that utilizes a decomposition of the system into parts (modules, fragments) is [12]; its authors present a model checking algorithm for pushdown processes and consider the semantics of "fragments", which are interpreted as "incomplete portions" of the process. Related is also the model checking algorithm for the logics EF and CTL and pushdown processes [15]. Finally, in [4] the authors have used a 3-valued logic (with ? representing "don't know whether the property is true or false") to reason about Kripke structures with partial labeling (a so-called partial state space).

For future work, our first goal is to perform an experimental evaluation. In particular, we would like to find out how the performance is influenced by various types of partition function. We also intend to consider other logics and model checking algorithms in place of the "node algorithm".

References

[1] Barnat J., L. Brim and I. Cerna (2002) Property Driven Distribution of Nested DFS, VCL'02, Pittsburgh (USA).
[2] Barnat J., L. Brim and J. Stribrna (2001) Distributed LTL Model-Checking in SPIN, The 8th International SPIN Workshop on Model Checking of Software (SPIN'01), LNCS 2057, Springer-Verlag, pp. 217-234.
[3] Bourahla M. and M. Benmohamed (2002) Predicate Abstraction and Refinement for Model Checking VHDL State Machines, Electronic Notes in Theoretical Computer Science 66(2), Elsevier Science.
[4] Bruns G. and P. Godefroid (1999) Model checking partial state spaces with 3-valued temporal logics, 11th International Computer Aided Verification Conference, LNCS, Springer-Verlag, pp. 274-287.
[5] Clarke E. M., O. Grumberg and D. A. Peled (1999) Model Checking, The MIT Press.
[6] Emerson E. and C. Lei (1986) Efficient Model Checking in Fragments of the Propositional μ-Calculus, The First Annual Symposium on Logic in Computer Science.
[7] Kupferman O. and M. Y. Vardi (2000) An automata-theoretic approach to modular model checking, ACM Transactions on Programming Languages and Systems 22.
[8] Kupferman O. and M. Y. Vardi (1997) Modular model checking, COMPOS, pp. 381-401.
[9] Laster K. and O. Grumberg (1998) Modular model checking of software, TACAS'98, LNCS 1384, Springer-Verlag, pp. 20-35.
[10] Lerda F. and R. Sisto (1999) Distributed-memory model checking with SPIN, The 6th International SPIN Workshop on Model Checking of Software (SPIN'99), LNCS 1680, Springer-Verlag, pp. 22-39.
[11] Masson P.-A., J. Julliand and H. Mountassir (2000) Modular verification for a class of PLTL properties, The 7th International SPIN Workshop on Model Checking of Software (SPIN'00), LNCS 1945, Springer-Verlag, pp. 398-419.
[12] Steffen B. and O. Burkart (1994) Pushdown processes: Parallel composition and model checking, CONCUR'94, LNCS 836, Springer-Verlag, pp. 98-113.
[13] Tsay Y.-K. (2000) Compositional verification in linear-time temporal logic, FoSSaCS 2000, pp. 344-358.
[14] Stern U. and D. L. Dill (1997) Parallelizing the Murφ verifier, Computer Aided Verification (CAV'97), LNCS 1254, Springer-Verlag, pp. 256-267.
[15] Walukiewicz I. (2000) Model checking CTL properties of pushdown systems, Foundations of Software Technology and Theoretical Computer Science, LNCS 1974, Springer-Verlag, pp. 127-138.
[16] Yorav K. (2000) Exploiting Syntactic Structure for Automatic Verification, Ph.D. thesis, Technion, Haifa, Israel.

On-line Handwriting Chinese Character Analysis and Recognition Using Stroke Correspondence Search

Jungpil Shin
Department of Computer Software, University of Aizu, Aizu-Wakamatsu City, Fukushima, 965-8580, Japan
e-mail: jpshin@u-aizu.ac.jp

Keywords: on-line character recognition, stroke order-free, stroke number-free, stroke information, stroke correspondence search

Received: January 14, 2003

To improve the computational time and recognition accuracy in stroke order- and stroke number-free on-line handwriting character recognition, we have performed a structural analysis of stroke order style and stroke connection. Using handwritten characters chosen from among 2965 Chinese characters, we have deduced information on stroke order and connection using an automatic stroke correspondence system. First, we have shown that the majority of real characters are written in a fixed stroke order, and that the actual stroke order is predominantly the standard stroke order, with about 98.1% of the characters being written in the standard stroke order. Almost all the stroke connections occurred in the standard order (92.8%); while two-stroke connections were common, stroke connections in non-standard order occurred very rarely. In a comparison of our findings with the expected stroke connections, very few were found to actually occur. Second, we have shown methods for incorporating this information in a completely stroke order- and number-free framework. Third, we have shown that the proposed methods can be selected according to the quality of the writer(s), the character(s), and the recognition system. Finally, a large improvement in both computational time and recognition accuracy is demonstrated by experiment.

Povzetek: The article describes the recognition of handwritten Chinese characters.

1 Introduction

On-line recognition of handwritten cursive characters is a key issue in state-of-the-art character recognition research [8], and extensive research has been conducted to mitigate the variation seen in stroke order, stroke number and stroke deformation [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18]. In contrast with off-line recognition, on-line recognition has the important advantage of being able to use stroke order and connection information, because the character pattern is expressed as ordered time sequences. Based on stroke order and stroke connection information, there are two approaches to realizing on-line character recognition.
The first approach actively uses the stroke order and connection information, while the second approach takes into account the inevitable changes in stroke order and connection that occur from person to person, and hence utilizes "free stroke order" and "free stroke number". Our approach is basically the same as the second method, and the framework for character recognition is stroke order-free and stroke number-free. Using this framework for stroke order-free and number-free character recognition, we can obtain locally available information on the stroke order and connection [15, 16, 17, 18]. The algorithm required to construct a system for stroke order- and number-free recognition must determine the correct stroke correspondence between the input pattern and the reference pattern to achieve the highest recognition performance [2, 11, 17, 18]. However, this type of algorithm also carries out a large number of stroke correspondences that do not actually occur. To realize high-quality recognition on a small microprocessor with built-in memory, such as a Personal Digital Assistant (PDA), the recognition time and the required computational resources must be reduced. For this purpose, it is expected that a reasonable level of stroke correspondence searching can be realized by using information on the stroke order and stroke connection as they actually occur. The development of a recognition framework that incorporates information on stroke order and stroke connection is expected to achieve a reduction in computational time and to improve the recognition accuracy by neglecting any non-real stroke correspondences.

This paper is based on the assumption that recognition can be performed in the stroke order- and number-free framework. First, we investigate whether useful information can be obtained with regard to the stroke order and connection by focusing on a style analysis of the stroke order variation and the connection between strokes. Second, we propose methods for incorporating this information in a framework that is completely stroke order-free and stroke number-free. Third, we show that the proposed methods can be selected according to the quality of the writer(s), the character(s), and the recognition system. Finally, a large improvement in both the computational time and the recognition accuracy is demonstrated by experiment.

Figure 1: The correct stroke correspondence between (a) the input pattern and (b) the reference pattern.

2 The Stroke Correspondence Problem

An on-line input character can be expressed as an ordered series of writing strokes, i.e.,

A = A1 A2 ⋯ Ak ⋯ AN, (1)

where the k-th stroke, Ak, is the time sequence representation of the local features aik of a character, e.g., the x-y coordinates or the stroke direction. This is expressed as Ak = a1k a2k ⋯ aik ⋯ aIk, I = I(k). The reference pattern can be similarly expressed as

B = B1 B2 ⋯ Bl ⋯ BM, (2)

Bl = b1l b2l ⋯ bjl ⋯ bJl, J = J(l).

The stroke number of the input pattern is denoted by N, and the stroke number of the reference pattern is denoted by M, with N equal to M for correct stroke-number recognition. The measure of dissimilarity between the input pattern stroke, Ak, and the reference pattern stroke, Bl, is calculated using the stroke information of the character shape and position, and is denoted by the stroke distance δ(k; l).
A one-to-one stroke correspondence is defined by the bijection {l(k)} from the stroke number k of the input pattern to the stroke number l of the reference pattern, as shown in Fig. 1. The sum of the stroke distances δ(k; l) is used as the evaluation criterion for the optimum correspondence. Thus, with {l(k)} denoting a stroke correspondence based on the stroke information, the solution of the following minimization problem is considered to give the optimal stroke correspondence, in which the minimum value D(A, B) is chosen as the measure of matching (i.e., the dissimilarity) [11]:

D(A, B) = min_{ {l(k)} } Σ_{k=1}^{N} δ(k; l(k)). (3)

Solving this stroke correspondence determination problem provides a structural analysis of the pattern; thereby the dissimilarity D(A, B) and the stroke correspondence {l(k)} between the patterns are obtained.

3 Preparation for Analysis

The investigation data consisted of 2,965 categories of Chinese characters, i.e., the specified characters in the first level of the Japanese Industry Standard (JIS) code set, as written by 90 university students (total = 258,603 characters). The students were directed to write cursively in a normal manner. The data were recorded using a stylus pen on a liquid crystal display (LCD) tablet. The results of the investigation into the change in stroke number are shown in Table 1. This result is in good agreement with [2, 11, 18].

Table 1: Occurrence of changes in stroke number.
Stroke-number change   Occurrence   Ratio
  2                        13       0.000050
  1                       662       0.002560
  0                    172428       0.666767
 -1                     54100       0.209201
 -2                     18933       0.073213
 -3                      7268       0.028105
 -4                      3202       0.012382
 -5                      1105       0.004273
 -6                       517       0.001999
 -7                       226       0.000874
 -8                        99       0.000383
 -9                        29       0.000112
-10                        11       0.000043
-11                         7       0.000027
-12                         1       0.000004
-13                         2       0.000008
Total                  258603

Reference patterns were generated from the training data by storing the average values of the loci of feature points from non-connected strokes, extracted by rearranging the strokes according to the correct stroke order. One reference pattern was made for each category. The input character was transformed onto a 256 x 256 mesh plane by preprocessing consisting of redundancy elimination, smoothing, size normalization and feature point extraction. The x-y coordinates and the movement direction vector between one point and the next were extracted as feature information from the character data. The feature information of the input and reference patterns was placed into aik and bjl, respectively. Figure 2 shows typical reference and input patterns.

Figure 2: Examples of (a) the input and (b) the reference patterns.

Automatic searching for the correct stroke correspondence between the input and reference patterns was not easy, since the analysis of the stroke order and the stroke connection had to be performed simultaneously and with high accuracy. First, the Cube Search method [17, 18] was used to determine the correct stroke correspondence between the input and reference patterns by automatic searching with backtracking.
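The minimization in Eq. (3) can be illustrated by an exhaustive search over all bijections (a Python sketch with illustrative names; it is not the paper's method):

from itertools import permutations

def optimal_correspondence(delta, N):
    # delta[k][l]: stroke distance between input stroke k and reference
    # stroke l; returns D(A, B) and the minimising map, with perm[k] = l(k).
    best_D, best_map = float('inf'), None
    for perm in permutations(range(N)):
        D = sum(delta[k][perm[k]] for k in range(N))
        if D < best_D:
            best_D, best_map = D, perm
    return best_D, best_map

Such an enumeration is factorial in the stroke count; the Cube Search method just mentioned replaces it with pruned backtracking.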
This algorithm has the novel advantage of being able to search efficiently for the optimal stroke correspondence in spite of: (i) variation in stroke order, (ii) variation in stroke number owing to stroke connections, and (iii) exceptional user-generated stroke deformations. Some of the incorrect stroke correspondences were manually converted into more appropriate stroke correspondences based on observation of the characters. These errors arose from strokes written in erroneous positions.

4 Stroke Order and Stroke Connection Information

In the Cube Search method [17, 18], to realize a completely free condition for stroke order, a high percentage of unreal stroke correspondences is also examined. We expected that a reasonable level of stroke correspondence searching could be realized by using the actual information on the stroke order and stroke connection. Furthermore, using statistically stable information does not hinder the framework of stroke order- and number-free recognition. The development of a recognition framework that incorporates information on the stroke order and stroke connection is expected to achieve a reduction in computational time, and to improve the recognition accuracy by neglecting non-real stroke correspondences. We analysed the stroke order and stroke connection of each handwritten character using an automatic stroke correspondence analysis system based on the Cube Search method [17, 18].

Figure 3: The occurrence, Ckl, of correct stroke correspondence between (a) the input pattern stroke, k, and (b) the reference pattern stroke, l.

Figure 4: Examples of the occurrence of correct stroke correspondences.

4.1 Stroke Order Analysis

Stroke order variation among writers is mainly caused by the particular writing style of the individual. To use the stroke order information in the framework of stroke order-free recognition, the changes in the real stroke order data were investigated from the following viewpoints: (1) the range of the fluctuation of the stroke order in comparison with the standard stroke order, and (2) whether the stroke order change is completely random. One such analysed result is shown by the frequency example included in Fig. 3. The quantity Ckl is the number of input patterns in which the l-th stroke of the reference pattern is written as the k-th stroke of the input pattern. Note that a connected stroke is composed of strokes that correspond to the reference pattern strokes, based on the fact that the change in stroke order of characters with a connected stroke has the same tendency as the change in stroke order of characters having only single strokes. In the example of Fig. 3, 15 among the 90 writers sampled incorrectly wrote the third stroke as the first, and the deviations of these 15 writers show up in the distribution of the first, second and third strokes.

Figure 5: Character examples, each having a gap.

Figure 6: Stroke correspondence distribution.
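Occurrence matrices such as that of Fig. 3 can be accumulated directly from resolved stroke correspondences; the following sketch (illustrative names) also extracts the largest observed off-diagonal distance, which corresponds to the r-gap introduced below:

def occurrence_matrix(samples, N):
    # samples: one tuple per writer, with corr[k] = l meaning that the
    # l-th reference stroke was written as the k-th input stroke.
    C = [[0] * N for _ in range(N)]
    for corr in samples:
        for k, l in enumerate(corr):
            C[k][l] += 1
    r_gap = max((abs(k - l) for k in range(N) for l in range(N) if C[k][l]),
                default=0)
    return C, r_gap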
From the results of the stroke order analysis, 76% of all characters were written in accordance with the standard stroke order and, furthermore, the distribution of the remaining characters was concentrated along the diagonal line. In an occurrence table such as that of Fig. 3, if the maximum distance of a non-zero element from the diagonal line is r, the character is said to have an r-gap. Figure 4 shows examples with their r-gap values. Figure 5 shows character examples, each having a gap. Figure 6 shows the percentage of characters having an r-gap, with 95.7% and 98.1% of the characters included in the r = 2 and r = 3 ranges, respectively.

Based on the above stroke correspondence results, the following methods can be incorporated into the Cube Search algorithm. (1) If there is no case where the l-th stroke of the reference pattern is written as the k-th stroke of the input pattern, i.e., Ckl = 0, then the (k, l) correspondence is excluded. (2) Based on the tendency of the stroke correspondences to concentrate on the diagonal line, if |k - l| > r then the (k, l) correspondence is excluded. One of the above methods, or the range r, can be selected according to the quality of the writer(s), the character(s), and the recognition system. For the example character of Fig. 3, only the 15 pairs with Ckl ≠ 0, or the 34 pairs within r = 2, need be considered as candidates when searching for correspondences. By using the stroke order information, a large improvement is therefore expected in terms of recognition accuracy and computation time.

4.2 Stroke Connection Analysis

Stroke connection occurs mainly in rapid or cursive handwriting. As a result, the number of strokes in a character decreases. Figure 7 shows examples of characters in which stroke connections occur, including some fixed-shape stroke connections. Figure 8 shows the transformation of a character caused by the occurrence of stroke connections. The following points need to be considered for stroke connection. (1) How many strokes take part in consecutive continuous writing? (2) In how many places in one character does independent continuous writing occur?

Figure 7: Typical stroke connections and character examples.

Figure 8: Transformation of a character by the occurrence of stroke connections.

Table 2: Occurrence of consecutive stroke connections with (a) standard and (b) non-standard order.
type of consecutive stroke connection   number   ratio(%)   kinds   ratio(%)
(a) 2-strokes                            94970     78.88    12779    65.78
    3-strokes                            15483     12.86     3695    19.02
    4-strokes                             1093      0.91      751     3.87
    5- or more strokes                     251      0.21      238     1.23
(b) 2-strokes                             8036      6.67     1513     7.79
    3-strokes                              417      0.35      318     1.64
    4-strokes                              140      0.12      117     0.60
    5- or more strokes                      15      0.01       15     0.08
total                                   120405    100       19426   100
single stroke                          2580086              32717

… (dR + dF) tan 5°. (13)

The cubes that do not contain at least one point rw are eliminated. As a result, the workspace of the arm is described by a set of cubes of edge L/n. In order to increase the accuracy, the cubes forming the surface of the workspace are broken into smaller ones with an edge of half length.
In every step i the workspace volume is computed by

V^i = V_S^i/2 + V_I^i, (14)

where V_S is the volume of the cubes on the surface and V_I the volume of all other (inner) cubes. The procedure is ended when

|V^i - V^(i-1)| / V^(i-1) (15)

is smaller than a prescribed value v. The input data to the AWS program are the joint limits measured in the reference pose of the arm, which are φAm, φFm, φRm, φEFm, φAM, φFM, φRM, and φEFM. These parameters vary considerably among individuals, affected by age, sex, injuries, and the stage of the illness [7]. Relative to the height H of the subject, the segments of the arm are normalized in accordance with the anthropometric table [15]. The length of the shoulder girdle is then dSG = 0.129 · H, that of the humerus dU = 0.185 · H, and that of the forearm dF = 0.146 · H.

The procedure to determine the workspace has several stages. The first stage is to compute the set of points which can be reached by the wrist. This computation involves four nested loops, each associated with one joint angle. In every iteration, the position rw (Equation 12) is computed and stored as a three-dimensional vector. The procedure is repeated until all ranges of joint angles are swept [1]. In addition, the collisions between the segments of the arm (humerus and forearm) and the body (head, neck and trunk) have to be taken into account. If during the computation an arm segment intersects the body, the related position of the wrist is eliminated as impossible and is not considered part of the workspace. An elliptical cylinder is used to approximate the body, whose size is 0.174 · H in the frontal plane and 0.089 · H in the sagittal plane. The head is approximated by a sphere whose radius is 0.065 · H [15]. The resolution is set to 5° for all joint angles; it corresponds to the measurement error in the input data. Since the ranges of the coordinates change from one subject to another and throughout the workspace, it is impossible to exactly predict the number of iterations. Usually, tens of thousands of iterations are needed to determine the whole set of points approximating the reachable workspace. The obtained set of points lies inside a cube of edge L = 2(dR + dF) whose center is in the center of the shoulder.

The last stage in determining the arm workspace is the smoothing of the surface cubes. For this purpose a Bezier interpolation [16] is performed. The workspace can thus be visualized with different color textures, illuminated from different positions of light, or made transparent. A comparison between the calculated and the measured reachable workspace of a healthy left arm is shown in Figure 4. The workspace in Figure 4a was measured in equidistant planes in [1]. Such a measurement is extremely complicated and time consuming, and it can only be performed on healthy subjects. By contrast, the calculated workspace (Figure 4b) can be obtained from very simple manual measurements which are part of a standard procedure in treating hemiplegic patients in the rehabilitation center. The similarity between the measured and the calculated workspace is quite evident.

Figure 4: A comparison between the measured (a) and the calculated (b) reachable workspace.

4 Example

An example of the treatment of a hemiplegic patient is presented in Figure 5. The figure shows the reachable workspace of the left handicapped arm before the treatment (WL1) and in two different phases during the treatment (WL2 and WL3). Workspace WR0 is associated with the healthy right arm, mirrored to the left-hand side for comparison.
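Before turning to the measured data, the angle sweep of the previous section can be sketched as follows (a minimal Python sketch: fk() stands in for Equation (12), which is not reproduced here, and collides() for the body-collision test; both names are assumptions of this sketch):

import numpy as np

def sweep_workspace(limits, fk, collides, step=5.0):
    # limits: [(min_deg, max_deg)] for the four joint angles
    # (abduction, flexion, rotation, elbow flexion), swept at 5-degree steps.
    grids = [np.arange(lo, hi + step, step) for (lo, hi) in limits]
    points = []
    for a in grids[0]:
        for f in grids[1]:
            for r in grids[2]:
                for e in grids[3]:
                    if not collides(a, f, r, e):       # body-intersection test
                        points.append(fk(a, f, r, e))  # wrist point r_w, Eq. (12)
    return np.array(points)

The returned point cloud can then be binned into the cubes of edge L/n for the volume estimate of Equations (14) and (15).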
The measured ranges of motion of the handicapped left shoulder before and during the treatment, as well as those of the healthy right shoulder, are collected in Table 1. The elbow flexion-extension angles are taken to be those of the healthy arm, φEFm = -90° and φEFM = 60°.

Table 1: The measured shoulder ranges
       φFm     φFM     φAm     φAM     φRm     φRM
WL1   -45°     65°    -10°     80°    -55°     25°
WL2   -55°    140°    -10°     95°    -60°     40°
WL3   -80°    120°    -10°     95°    -45°     50°
WR0   -60°    170°    -10°    170°    -60°     90°

Table 2 shows the computed volume of the reachable workspace of the handicapped left arm and of the healthy right arm. The workspace was computed with a relative error v < 10%. The height of the subject is 180.0 cm.

Table 2: The computed workspace volume for H = 18.00 dm
       volume ± v
WL1    190.4 dm³
WL2    401.4 dm³
WL3    364.3 dm³
WR0    598.2 dm³

Note that the workspace volume is not directly associated with the functionality of the arm. Other indices can be used in combination with the volume, such as the workspace compactness [17]. The workspace compactness quantifies the similarity of the workspace shape to a sphere; it is assumed that a compact workspace is more adaptable than an elongated one. The location of the workspace relative to the body is also an important issue: a reachable space in front of the body could in general be more useful.

The AWS program is written in Matlab and converted into a stand-alone application. It is designed to be used at the Institute for Rehabilitation as a standard tool for the examination of the shoulder complex. The input data form contains the same information as the paper form used in the past; a workspace block is added (Figure 6).

Figure 5: A comparison of the reachable workspace of the left handicapped arm with the reachable workspace of the right healthy arm of a hemiplegic patient.

Figure 6: The input form of the AWS program.

The workspace is visualized in a posterior and in an anterior view. It is possible to superimpose the workspace of the handicapped arm onto that of the healthy arm by using a transparent envelope (Figure 7). Thus, the workspaces can be directly compared during different phases of the rehabilitation process. The results can easily be documented and stored in a computer or printed. They can be displayed and numerically processed, as well as electronically transferred to another user.

Figure 7: A direct comparison of the workspace of the handicapped arm (WL1) with that of the healthy arm (WR0) by superimposing the two workspaces.

5 Conclusion

A computer program which computes the reachable workspace of the human arm has been reported in this article. The program is based on a simplified kinematic model of the human upper extremity, in which the shoulder complex is approximated by a spherical joint and the elbow complex by a rotation. The input data for this program are taken from a standard evaluation procedure in physiotherapy. The reachable workspace can be quantified by its volume or by other mathematical indices. The main advantage of this program is that it can visualize the reachability of the measured human arm. The obtained results can be used for computer-aided documentation, for numerical or visual comparison between different phases of a rehabilitation process, and for numerical or visual comparison between different subjects. This helps us to plan and control the rehabilitation procedure of shoulder patients. The program is now being introduced at the Institute for Rehabilitation, Ljubljana, Slovenia.
Acknowledgement

This investigation was supported by the Slovenian Ministry of Education, Science and Sport. We are grateful to the employees of the Institute for Rehabilitation, Ljubljana, Slovenia, for their valuable contribution.

References

[1] J. Lenarcic, A. Umek (1994) Simple model of human arm reachable workspace, IEEE Trans. Systems, Man and Cybernetics, Vol. 24, pp. 1239-1246.
[2] V. M. Zatsiorsky (1998) Kinematics of Human Motion, Human Kinetics.
[3] A. E. Engin and S. T. Tumer (1989) Three-dimensional kinematic modeling of the human shoulder complex - part I: Physical model and determination of joint sinus cones, Trans. of the ASME Jour. of Biomech. Eng., Vol. 111, pp. 107-112.
[4] J. E. Wood, S. G. Meek and S. C. Jacobsen (1989) Quantitation of human shoulder anatomy for prosthetic arm control - II. Anatomy Matrices, Jour. of Biomech., Vol. 22, No. 4, pp. 309-325.
[5] Z. Dvir and N. Berme (1987) The shoulder complex in elevation of the arm: a mechanism approach, Jour. of Biomech., Vol. 11, pp. 219-225.
[6] J. Lenarcic and M. Stanišic (2003) A humanoid shoulder complex and the humeral pointing kinematics, IEEE Trans. on Robotics and Automat., Vol. 19, No. 3, pp. 499-506.
[7] C. C. Norkin and D. J. White (1985) Measurement of Joint Motion: A Guide to Goniometry, F. A. Davis Company, Philadelphia.
[8] I. A. Kapandji (1970) Physiology of the Joints, Churchill Livingstone, London.
[9] D. J. Magee (1997) Orthopaedic Physical Assessment, W. B. Saunders Company, 3rd ed.
[10] J. Hesselbach, M. B. Helm, H. Kerle, M. Frindt and A. M. Weinberg (1998) Advances in Robot Kinematics: Analysis and Control, Kluwer Academic Publishers, pp. 551-560.
[11] J. Lenarcic, M. M. Stanišic, V. Parenti-Castelli (2000) Kinematic design of a humanoid robotic shoulder complex, Proc. Int. Conf. on Robotics and Automat., San Francisco, USA.
[12] V. T. Inman, J. B. Saunders, L. C. Abbott (1944) Observation on the function of the shoulder joint, Jour. of Bone and Joint Surgery, Vol. 26, pp. 1-30.
[13] C. Högfors, B. Peterson, G. Sigholm and P. Herberts (1991) Biomechanical model of the human shoulder joint - II. The shoulder rhythm, Jour. of Biomech., Vol. 24, No. 8, pp. 699-709.
[14] S. D. Bagg and W. J. Forrest (1980) A biomechanical analysis of scapular rotation during arm abduction in the scapular plane, American Jour. of Phys. Med. Rehabilitation, Vol. 67, No. 6, pp. 238-245.
[15] D. A. Winter (1990) Biomechanics and Motor Control of Human Movement, Wiley-Interscience Publication, University of Waterloo.
[16] G. Farin (1990) Curves and Surfaces for Computer Aided Geometric Design, Academic Press Inc. - Harcourt Brace Jovanovich Publishers.
[17] J. Lenarcic (1992) An approach to optimum design of robot manipulators, Laboratory Robotics and Automation, Vol. 4, pp. 137-143.

A Software Architecture for Enterprise Components

D. A. Helton
Computer Information Systems, Northern Michigan University, Marquette, Michigan, USA
E-mail: dhelton@nmu.edu

Keywords: component-based software development, enterprise components, software components, component software, COTS

Received: January 31, 2004

Component-based software development, the idea of constructing computer programs from existing software modules, has attracted considerable attention. Component size has become a significant consideration.
Recently, there have emerged significant research prototypes and early commercial systems for constructing major business applications from a few components of coarse granularity, termed enterprise components, rather than by combining hundreds of small components. This research with large components has shown that building applications from components of coarse granularity presents distinct challenges compared with assembling systems from small elements. The focus of this paper is a proposed system architecture, synthesizing the best features of other systems using enterprise software components, and adding new features, as needed, as a step towards the improvement of software development involving large components.

Povzetek: The article describes building business systems from large modules.

1 Introduction

There is growing interest in coarse-grained software components, or enterprise components (Levi & Arsanjani 2002), which offer the potential of building applications quickly from a few large modules rather than by composing numerous fine-grained components. An analysis of an organization's functional areas determines what is to be included within each of these large components. Each major process becomes a single enterprise component (Bandini et al. 2002, Helton 1999, Levi & Arsanjani 2002). The objective of this study is to derive an improved system architecture for software development with large components, which will be presented in sections 2, 3 and 4 of this paper, as follows: (section 2) a review and synthesis of features from existing systems, (section 3) the definition of the functionalities of the system components and the interrelationships among them, and (section 4) the system architecture specification.

2 Review and Synthesis of Previous Systems

This stage includes three activities: (1) review existing systems handling large components to determine the requirements for an improved software architecture, (2) locate candidate features that might fulfill these requirements within these systems or other sources, and (3) select a group of items from the candidates for synthesis into an improved system. The first and second activities were completed in a previous study (Helton 1999), and a summary of the results is included below. This preliminary study contained an examination of research prototypes and early commercial systems using enterprise components, including Carnegie-Mellon University Software Engineering Institute (SEI) reports of development with commercial-off-the-shelf (COTS) systems (Carney 1997), work at Harvard University's Brigham & Women's Hospital on large-component imaging software (Deibel & Greenes 1995), Sun Microsystems' Enterprise JavaBeans (EJB) model (Thomas 1997), the large-component version of SAP enterprise resource planning (ERP) applications (SAP 1997), and Stanford University's CHAIMS project on very large-grained, heterogeneous, distributed components (Beringer et al. 1998). The study (Kim 2002) of the Korean government's initiative in promoting enterprise component development was also beneficial.

Review of Other Systems. Because the technology is immature, existing large-component software systems do not adequately meet the needs envisioned by their designers. The aforementioned study (Helton 1999) of other large-component systems uncovered some of their deficiencies. Despite these inadequacies, the study of these and other related systems provides a source for locating functionalities that meet the requirements.
Theoretically, building systems from a few large components should be less challenging than building a system from many small components. The study of existing systems indicated the following needs for enhanced large-component systems: (1) streamlined components, (2) simple interfaces, (3) straightforward composition, (4) plug-and-play compatibility, and (5) a standard distribution infrastructure. The aforesaid review of these systems (Helton 1999) suggests a need for simplifying rather than complicating the large components. Several of the large-component systems described do not appropriately subordinate the details of coarse-grained components. Scrutiny of these systems indicates that large components manifest a complexity that makes their composition with other elements significantly more challenging than assembling systems from fine-grained components.

The previous study (Helton 1999) also indicated that, as the granularity moves up the scale, the appropriateness of using the conventional set of object-oriented (OO) features becomes suspect. Several of the systems (Enterprise JavaBeans, SAP R/3, and the Harvard information system research) overdo object orientation by extending it to the highest level, i.e., to that of the overall large-component structure. The design of the inner structure or implementation of a large component, which is not the focus of this paper, may follow the OO paradigm extensively, if the designer so chooses. Even in working on the implementation design, great care must be taken to avoid creating mazes of confusing OO inheritance hierarchies. Nevertheless, the basic concept of OO, that is, depicting in the software an entity from some business domain, was appropriate for each system studied. The preliminary study (Helton 1999) also revealed that the use of OO features beyond this inner structure presented problems in these other systems. Since only a few components are involved, many elements of OO are superfluous. The inherent complexity of large components makes them challenging enough to handle without adding further complications. Large components are stable and do not require dynamic features for their quick replication or frequent modification. As mentioned earlier, each enterprise component might represent a distinct, complex business function, such as a payroll system or an inventory process. Instantiation, when only a few components of this granularity are involved, is a needless complication. As observed by Taylor (1995), the reusability of components decreases as complexity increases. At this level of component granularity, reusability is minimal. Thus, the small amount of reusability attained by permitting inheritance is overwhelmed by the disadvantages generated by entangled class hierarchies. Following the same logic with regard to the diminishing returns of reusability as components increase in size, other OO features offer little in return for the baffling complexity required to implement them.

The previous review (Helton 1999) of the systems also shows that simple interfaces are another essential requirement for facilitating the handling of large components. The systems mentioned above further complicate large-component systems by using OO inheritance for the interfaces of large components. This overuse of OO inheritance can permit direct access to deeply nested interfaces within a large component (D'Souza & Wills 1998). This contradicts the basic idea of hiding component details behind interfaces (Szyperski 1997).
Streamlining component interfaces may be achieved by delegation (Lewandowski 1998). The overall component contains a straightforward interface, providing the only access to the internal implementation for external clients. The large component may contain nested interfaces, even several layers deep. Access to these inner interfaces is via forwarding or delegation, which restricts flexibility to some degree but keeps the component structure simple.

The examination of other large-component systems (Helton 1999) brought to light another requirement for improving this type of system. In order to work effectively, the system needs a straightforward means of connecting the enterprise components together. If the cumbersome OO interface and implementation hierarchies are eliminated, then an uncomplicated glue component may be inserted to tie pre-existing applications together. The Stanford CHAIMS researchers used this technique, and D'Souza and Wills (1998) also suggested it.

Plug-and-play compatibility of large business components from multiple vendors is another major requirement revealed by the earlier study (Helton 1999) of these systems. Though the ERP vendors allow competitors to add secondary modules, none of these vendors guarantees that their major modules may be substituted by comparable components produced by competitors. Experience with component projects has indicated that effective reuse of large components is generally restricted to vertical industry segments (Szyperski 1997). This conforms to the reusability principle: the more specialized a module is, the less reuse potential it has. Thus, the feasibility of using components across industry boundaries is much more restricted for large components than it is for small ones. The Harvard information systems project confirms the need for establishing standards within particular business domains (Deibel & Greenes 1995). The interchangeability of components from multiple sources requires, foremost, standardization of the specification, i.e., the interface and the related descriptions of the functionality of the enterprise component. Uniform implementations are not required for plug-and-play compatibility of components. Researchers at PeopleSoft and SAP also confirm the need to develop specialized, standard interfaces for particular industries. Experience has demonstrated that standardizing interfaces for specific industry groups yields adequate compatibility for effectively assembling the respective large components.

A standard distribution infrastructure is one of the most important requirements for development with large components. Companies usually expect that new major applications will be deployed over networks. The pressures of competition have produced incompatible distribution systems (Szyperski 1997), making development with large components challenging. Some high-level data models abstract the complexity of data storage mechanisms by simply depicting persistent data as being contained within application modules. The conceptual framework (Helton 1999) for the proposed architecture, however, incorporates, at the top level, software elements of large granularity connected by a distribution system, as well as a data storage facility, normally a database management system (DBMS). Originally, CORBA, DCOM, and Java were not designed to support large software components (Szyperski 1997). Enterprise JavaBeans extends Java to the point that EJB is a whole, distinct component system, rather than just a distribution system.
Though capable of handling large components, EJB is an adaptation of a system designed to handle small software elements. EJB thus retains the overuse of some OO features that is commonplace in systems focusing upon small components. Those working with the large-component developments described in the aforesaid study (Helton 1999), pertinent to the COTS, Harvard University Brigham & Women's Hospital, EJB, ERP, and CHAIMS projects, as well as to the recent initiative in Korea (Kim 2002), found it difficult to find existing infrastructures that supported their conceptual models for handling large-grained components. For example, in the Harvard information systems project (Deibel & Greenes 1995), the investigators initially developed an attractive architecture for handling large components. As the work progressed, however, the Harvard researchers realized that no existing distribution system completely supported their model. Thus, they modified their architecture to conform to the requirements of existing infrastructures, oriented towards components of small granularity. The complexity and magnitude of building business applications require developers and researchers to focus upon assembling large components, despite the inadequacies of the available underlying distribution systems. The early commercial and research prototypes of large-component systems examined in the preliminary study (Helton 1999) were forced to use infrastructure systems that did not adequately support their coarse-grained components. The Stanford CHAIMS project was unique in that, rather than compromising by using an incompatible infrastructure, the researchers developed their own infrastructure (Beringer et al. 1998).

This concludes the review of existing systems handling large components, carried out in order to establish requirements for improving large-component systems. As noted earlier, these requirements are: (1) streamlined components, (2) simple interfaces, (3) straightforward composition, (4) plug-and-play compatibility, and (5) a standard distribution infrastructure. The next activity is to locate feasible elements that satisfy these five requirements.

Locate Candidate Features. The review of prior systems, in the foregoing section, suggests that the requirements for a large-component architecture may be met by making the following enhancements: (1) acceptance of attractive features used with the structured approach to software development, (2) removal of inappropriate object-oriented features, (3) patterning each enterprise component after a complete business process, (4) adoption of solid notions from the client/server model, and (5) employment of standards specially designed for large components. In developing the conceptual framework for the architecture (Helton 1999), candidate features were obtained from the following sources: (1) structured programming, (2) object orientation, (3) process-oriented notions, (4) the client/server model, and (5) standards concepts. In this step, locate candidate features, these sources of features for an enhanced software architecture will be examined further. High cohesion and loose coupling, two concepts associated with structured development, enhance component development (Budgen 2003, Sametinger 1997). The inclusion of these two features in the suggested architecture is possible because unattractive inheritance hierarchies and other inappropriate OO notions have been expunged.
High cohesion is attained by ensuring that everything pertaining to a particular function adheres to a central, unifying purpose. Loose coupling is achieved by eliminating all connections between components other than data linkages.

Structured interfaces, derived from structured programming, should be beneficial to the proposed architecture. Several systems (Enterprise JavaBeans, Harvard health information systems, SAP R/3) unnecessarily complicate development with large components by including OO interfaces, which produce puzzling interface hierarchies. The rationale for including this maze of interfaces is to permit direct external access to smaller components embedded within larger ones. Even if component inheritance and other inappropriate OO features are eliminated, this compromise, consisting of the inclusion of interface inheritance, potentially degrades the effectiveness of a system. The Stanford CHAIMS project confirms that procedural (structured) interfaces streamline development involving large components. Though these structured interfaces only allow indirect communication with embedded components, they enhance the construction of applications with large components by subordinating minutiae.

Information hiding (Inverardi & Tivoli 2002, Ravichandran & Rothenberger 2003) is another feature from structured programming suggested for the proposed architecture. Information hiding abstracts the intricacy of a large component's inner workings. Even a baffling maze of OO inheritance hierarchies, for example, may be concealed through this feature. A large business component, composed of an aggregation of smaller components, should limit access to nested elements via delegation instead of exposing inner interfaces (Lewandowski 1998).

Component development with small components incorporates many features that do not function well with enterprise components. Several of the systems (Enterprise JavaBeans, SAP R/3, Harvard health information systems research), however, overextend OO concepts to large components, making their handling unduly complex. The central OO notion of mapping business entities directly to software is worthwhile for large components. A modified version of OO encapsulation is useful for large components, too. Simply expressed, this means that an object class encapsulates both data and related operations on the data (Pressman 1997). The implementation of a component may follow any model desired (Buck-Emden 1996, Short 1997, Szyperski 1997). If the inner structure follows the OO model, a single enterprise component may have many object classes. A large component represents a function from the business domain and encapsulates everything pertaining to this function (D'Souza & Wills 1998, Levi & Arsanjani 2002, Shaw et al. 1995). A component may even subordinate, through abstraction, all of the mechanics involved in attaining persistence of data, that is, storage of data, generally in a DBMS. Most other OO concepts are neither necessary nor beneficial for handling large components. Enterprise components tend to be more static in nature than small components (D'Souza & Wills 1998). The implementation of a large component may or may not follow OO structure. The details of the inner structure remain hidden and consequently are insignificant in the high-level view of the architecture. This is consistent with the concept of black-box assembly of components (Inverardi & Tivoli 2002, Ravichandran & Rothenberger 2003).
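As an illustration of these ideas, the following minimal sketch (all names hypothetical) shows a coarse-grained component that exposes a single structured interface, hides its inner structure, and grants access to nested elements only by delegation.

// A coarse-grained billing component: one structured interface, hidden internals.
public final class BillingComponent {

    // The single externally visible, procedural interface.
    public double invoiceTotal(String customerId) {
        // Delegation: calls are forwarded to embedded sub-components rather
        // than exposing the sub-components' interfaces to external clients.
        double subtotal = pricing.subtotalFor(customerId);
        return subtotal + taxation.taxOn(subtotal);
    }

    // Nested elements are private: no direct external access is possible.
    private final PricingEngine pricing = new PricingEngine();
    private final TaxCalculator taxation = new TaxCalculator();

    private static final class PricingEngine {
        double subtotalFor(String customerId) { return 100.0; } // stub value
    }

    private static final class TaxCalculator {
        double taxOn(double amount) { return amount * 0.08; }   // stub rate
    }
}

Whether BillingComponent internally follows an OO decomposition or any other model is invisible to, and irrelevant for, the client assembling the application.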
In composing large components, the developer works at a level of abstraction where the concern is with the connections between components. Thus, the study of other systems (Helton 1999) made it clear that at this high level, components need to be simplified, irrespective of the complexity of their inner workings. Extending OO ideas, other than the aforementioned mapping of business functions to software entities, is detrimental to development with large components, as illustrated by the CHAIMS project at Stanford University (Beringer et al. 1998). Confusion exists, among many developers, about the differences between OO and component approaches to development (D'Souza & Wills 1998). Development with coarse-grained components should capitalize on the static demeanor of these large components by reducing their number to just a handful. Having classes instantiate enterprise components only complicates development with large components and is unnecessary. Since only a few large, stable components are involved in these systems, using OO inheritance, and thereby possibly creating entangled dependency hierarchies, is even more inappropriate. Multiple inheritance has the potential of creating extremely confusing class hierarchy webs (Hissam et al. 2002, Pressman 1997); it is thus controversial even when working with small-grained components, and it is highly questionable for composing large components. In addition, removal of inheritance eliminates problems with related features, particularly those associated with overriding and polymorphism. Since large components are inherently static, and thus use static binding, late binding is generally unnecessary.

Though mapping elements from the business domain to software is a central OO notion, the process-oriented manifestation of this idea is better suited to large component development. This means that an entire business process may be portrayed within an application as a large component (D'Souza & Wills 1998, Levi & Arsanjani 2002).

Client/server concepts may be included in the large component architecture at two levels (Helton 1999). First is the common practice of decomposing a program into modules, with a central client module calling server elements. The business components communicate only through a coordinating glue component; a minimal sketch of this arrangement appears at the end of this passage. At the second level, the client/server model may be employed in the architecture to demonstrate the interaction of concurrently executing processes. While the physical distribution of these processes was implicit in the conceptual framework (Helton 1999), this feature will be specified explicitly in the proposed architecture.

The review of the coarse-grained systems (Helton 1999) suggests that standards are needed for: (1) large components, in general, to provide functionality identical to that of equivalent units available from other vendors; (2) interfaces, to make components plug compatible with those from other companies; and (3) underlying distribution systems, so that developers may focus upon the application software and so that the distribution infrastructure is able to handle large components appropriately. Though at times an elusive goal, significant precedents for de facto or formal standards exist in the software field (Szyperski 1997), and meaningful component development requires standards (Hissam et al. 2002).
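Before turning to the standards themselves, the client/server glue arrangement referred to above may be outlined as follows. All names are hypothetical; the point is only that server components never invoke one another, and all coordination flows through the glue.

// Contract shared by the large server components (hypothetical).
interface ServerComponent {
    String process(String request);
}

final class PayrollComponent implements ServerComponent {
    public String process(String request) { return "payroll: " + request; }
}

final class InventoryComponent implements ServerComponent {
    public String process(String request) { return "inventory: " + request; }
}

// The glue (client) component: the only module that ties the others together.
// Server components hold no references to one another.
public final class GlueComponent {
    private final ServerComponent payroll = new PayrollComponent();
    private final ServerComponent inventory = new InventoryComponent();

    public void runMonthEnd() {
        // All coordination logic lives here, not in the server components.
        String stock = inventory.process("month-end count");
        String pay = payroll.process("month-end run");
        System.out.println(stock + " / " + pay);
    }

    public static void main(String[] args) {
        new GlueComponent().runMonthEnd();
    }
}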
A standard for large components would include specifications to ensure that the behavior of a component from one vendor is equivalent to that of a comparable component from a competitor (Vitharana 2003). Components with standard, that is, identical, interfaces would be plug compatible with each other. Both of the foregoing standards are necessary: components with identical interfaces but dissimilar specifications would plug into a system properly, yet yield erroneous functionality. Standard distribution infrastructures are absolutely essential, since communication between elements is impossible unless they use the same distribution system. Components must be compatible in each of these three areas in order to interact properly. Consequently, a particular ERP vendor, for example, must maintain rigid internal standards within the corporation in order to attain component functionality. The only other possibility would be to rig heterogeneous elements together with wrappers and other patches. As discussed in reviewing these systems, vendors have been reluctant to enter into agreements for standards that would make components interchangeable between companies. ERP vendors do not provide general, straightforward plug compatibility (Helton 1999). These companies include mechanisms for appending peripheral components, but protect their own key business components so that they are not interchangeable with modules produced by competitors. Though companies are reluctant to make agreements regarding components with independent standardization agencies, once organizations consent, they are usually willing to extend the standards beyond the three compulsory areas listed above (Szyperski 1997). Similarly, the proposed architecture contains suggestions for features beyond the minimal requirements of component specification, interface, and distribution infrastructure. These additional features, though not essential, enhance a system's performance.

Selection of Features for Architecture. The next activity is to select a set of features for the proposed architecture. The set of features was derived by applying the two initial stages of review and synthesis of prior systems. Figure 1 divides these selected features into three categories: (1) general component structure, (2) interface, and (3) distribution infrastructure.

General Component Structure. The features selected for this category are the following: (1) standard components providing plug-and-play interoperability, (2) business domain processes mapping directly to software, (3) encapsulation, (4) high cohesion, (5) a server containing the large business application components, and (6) a client acting as a glue component. These features describe the overall structure of large components. Standardization of functionality is accomplished through standard components providing plug-and-play interoperability; this assures that a component performs as specified, in contrast to the standardization of interfaces, which guarantees that components interconnect properly. Identical specifications, not solely identical interfaces, are required for true interchangeability of elements. A developer needs to know what specific functionality to expect from a large component. Components with indistinguishable specifications must behave in the same way. The specification, however, implies nothing about the similitude of component implementations.
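The distinction between a standard interface and a standard specification can be sketched as follows (hypothetical names throughout): two vendors implement the same interface with entirely different internals, and either component plugs into the client unchanged, provided both honor the specified behavior.

// The standardized interface; the accompanying comment states the
// behavioral specification both vendors must honor.
public interface HotelReservation {
    /** Must reserve a room of the given type for the given number of
        nights and return a non-empty confirmation code. */
    String reserve(String guestName, String roomType, int nights);
}

// Vendor A's component: one possible implementation.
final class VendorAReservation implements HotelReservation {
    public String reserve(String guestName, String roomType, int nights) {
        return "A-" + Math.abs(guestName.hashCode());
    }
}

// Vendor B's component: entirely different internals, same contract.
final class VendorBReservation implements HotelReservation {
    public String reserve(String guestName, String roomType, int nights) {
        return "B-" + System.nanoTime();
    }
}

final class BookingClient {
    // The client depends only on the standard interface, so either vendor's
    // component plugs in without any change here.
    static String book(HotelReservation component) {
        return component.reserve("Jones", "double", 3);
    }

    public static void main(String[] args) {
        System.out.println(book(new VendorAReservation()));
        System.out.println(book(new VendorBReservation()));
    }
}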
Explicitly specified in the Stanford CHAIMS architecture, and implicit in several of the other systems (Helton 1999), is the assumption that the architecture will have incompatible components requiring wrappers to make them work together effectively. Commercial developers encounter a staggering quantity of incompatible software elements in their work, and these professionals are quite successful in devising mechanisms that enable heterogeneous elements to function together to some degree. Inherently incompatible components, however, suggest sub-par performance, and incompatibility will not be specified as a feature of the proposed architecture. Standard, compatible components are the ideal, and thus will be included.

The next feature selected for the architecture is business domain processes mapping directly to software. This key concept, which has its roots in OO development, facilitates preliminary data modeling (a topic beyond the scope of this paper) and makes the software more comprehensible. This feature enables the software to capture the essence of what is actually happening in the corresponding business domain. One-to-one mapping of domain objects onto the software is well established in OO programming. Though mapping business elements to software is a key OO feature, process-oriented development (Matthes et al. 1999) extends this idea to elements of larger granularity. In development with large components, generally an entire business process, such as a hotel reservation system, is mapped to an enterprise component, consistent with the process-oriented approach. Conceptually, this is appealing, because a large system may be developed by combining just a handful of these coarse-grained elements.

The encapsulation feature, a notion associated with OO development, helps make a component a manageable, autonomous unit. Unfortunately, this feature has remained to some degree an ideal, not having been implemented completely in practice. Instead of a module containing all of its related behavior, portions of its functionality are often placed outside of the module, or even in a separate library. At times, encapsulation is challenging, as in the case of data storage. Conceptually, the handling of persistent data occurs within the module, whereas in reality the management of permanent data takes place mainly in an external DBMS. Physically, the data are stored in an external database shared by many enterprise components. Logically, however, the data are internal to the component in question; a sketch of this arrangement follows below. Though this and other considerations force designers to place outside of a module items properly belonging within it, encapsulation of everything directly pertaining to a module remains the ideal.

Proponents of structured programming established high cohesion of software modules as one of their guiding principles. This feature also fits well within OO design. Focusing everything within a component on one objective is readily attainable within small components with limited functionality. Applying this principle to large components is more challenging because of the increased complexity. Superficially, a highly cohesive large component may appear to contain elements with diverse functionality; on closer inspection, these are related functions, all supporting a common purpose. Care must be taken to ensure that items within a complicated component do not support unrelated tasks.
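The sketch promised above illustrates data that are logically internal but physically external. The schema and the JDBC URL are assumed purely for illustration; JDBC stands in for whatever DBMS access mechanism a real component would employ.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

// To its clients, the component appears to own its data; physically, every
// read and write is delegated to an external, shared DBMS.
public final class CustomerComponent {

    // Assumed in-memory database URL, purely for illustration.
    private static final String JDBC_URL = "jdbc:hsqldb:mem:crm";

    // Logically the balance lives "inside" the component ...
    public double balanceOf(String customerId) throws SQLException {
        // ... physically the component delegates to the DBMS.
        try (Connection con = DriverManager.getConnection(JDBC_URL);
             PreparedStatement ps = con.prepareStatement(
                     "SELECT balance FROM customers WHERE id = ?")) {
            ps.setString(1, customerId);
            try (ResultSet rs = ps.executeQuery()) {
                return rs.next() ? rs.getDouble(1) : 0.0;
            }
        }
    }
}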
The next feature, a server with large business application components, has precedents in the Stanford University CHAIMS architecture and in the client/server application model. At least conceptually, ERP systems and other architectures also imply the presence of this feature, which entails a system with only a few components of coarse granularity. Tedious yet relatively stable business processes, such as payroll and inventory, should make good candidates for mapping to the software as large server components. As indicated in Figure 1, the last feature from the general component structure category, a client containing a large glue component, was suggested by D'Souza and Wills (1998) and was also incorporated into the Stanford CHAIMS architecture. An earlier precedent for this was the client/server application model, where a coordinating client requests services from a server module. This feature will be discussed further under the next step of the systems development research methodology, involving the functionalities of components and their interactions.

Interface. The features chosen for this category, enumerated in Figure 1, are: (1) standard interfaces, (2) structured interfaces, (3) information hiding, and (4) external access only. Interfaces are crucial to the interrelationship of components. Standard interfaces, the first feature in this category chosen for the proposed architecture, are required for large components to connect together correctly. Interchangeability between two components, as explained under the general component structure category above, also demands that functionality be identical. Otherwise, a component would plug into the system properly but not provide the desired functionality. Standard interfaces also offer extensibility to the system. A new component with a standard interface may be added to the system, as long as the new item's business functionality is compatible with the rest of the application. The proposed architecture for large components incorporates structured interfaces to eliminate the possibility of generating confusing interface inheritance hierarchies through OO interfaces, such as those featured in several systems described earlier (Enterprise JavaBeans, Harvard health information systems, SAP R/3). The Stanford CHAIMS project has demonstrated that large, stable components work well with straightforward structured interfaces, rather than adding the complications of OO interfaces. Large, stable components do not require the dynamic capability of instantiating interfaces from classes of interfaces. The next feature incorporated into the proposed software architecture is information hiding, which abstracts implementation details behind a component's interface. This concept was common in structured programming and was carried over to some extent in OO development. The object approach, unfortunately, permitted direct external access to nested interfaces, thereby nullifying the hiding feature. Restricting components to external access only is a feature, related to information hiding, that will be included in the proposed architecture. As suggested by Lewandowski (1998), access to embedded interfaces should be only through delegation. This forwarding of calls maintains information hiding, which is important for keeping large components manageable.

Distribution Infrastructure. As indicated in Figure 1, the features in this category are: (1) standard distribution protocols, and (2) loose coupling.
These features enable remote communication between components, an expectation in current large business applications. Standard distribution protocols are obligatory for any intercommunication whatsoever between physically isolated modules. The three most widely used commercial systems, CORBA, DCOM, and Java, were designed for small components. Enterprise JavaBeans is an extension of Java that handles large components. EJB, however, overextends the OO paradigm, retaining features that worked satisfactorily with fine-grained components but that are unsuited to components of large granularity. Component-based software development (CBSD) demands attention to the business application, not the underlying distribution structure. Consequently, the developer usually is required to use the most suitable distribution system available, irrespective of the problems it might present with large components. The Stanford CHAIMS research project managed to achieve communication between large components by creating a new middleware layer to sit atop a current commercial distribution system. This is a sub-optimal solution for communication among large components. The architecture, therefore, includes a suggestion for a distribution system specifically designed for large components; a sketch of how a standard protocol might expose a large component appears at the end of this section.

Loose coupling, a key notion borrowed from structured programming (Sametinger 1997), is the final feature recommended for the improved architecture for assembling components of large granularity. Unless components are loosely coupled, their independent deployment is infeasible. Even moderate coupling complicates the assembly of components as autonomous units. Instead of building software from discrete component blocks, the developer must wire together entire sub-assemblies of overly coupled modules.

3 Definition of Component Functionalities and Interrelationships

During this step in creating the system architecture, the functionalities of system components and the interrelationships among them will be defined. In the previous step, review and synthesis of prior systems, desirable features were derived for an improved architecture for composing large components, as indicated in Figure 1. This step will focus upon what components do and how they relate to other system elements.

Component Functionality. As discussed in the previous phase, the proposed architecture features standardized components of large granularity. This means that the components from one company are completely interchangeable with those from another, as long as the elements have identical specifications and interfaces. The system may be extended by adding other compatible components. The architecture thus calls for homogeneous elements, because research with the systems discussed in Helton (1999) indicates serious problems with handling disparate elements. These standardized elements are included in the architecture as an ideal, though extenuating circumstances might force someone implementing such a system to use wrappers to harmonize the functioning of heterogeneous components. Entire business processes are mapped directly to the software as server components of coarse granularity. Thus, the entire application might be constructed from just a handful of large components, as discussed in the previous step on review and synthesis of prior systems. A glue component coordinates the operations of the application, consistent with the Stanford CHAIMS project and as suggested by D'Souza and Wills (1998). Since current technology assumes the use of graphical user interfaces, a user interface component may be added.
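Returning to the distribution protocols discussed above, the promised sketch follows. The names are hypothetical, and Java RMI is used only as a readily available stand-in, since, as argued earlier, no commercial infrastructure of the period was actually designed for large components.

import java.rmi.Remote;
import java.rmi.RemoteException;
import java.rmi.registry.LocateRegistry;
import java.rmi.server.UnicastRemoteObject;

// Remote interface for a coarse-grained server component. Each operation
// corresponds to a whole business task, not a fine-grained object call.
public interface PayrollService extends Remote {
    String runPayroll(String periodId) throws RemoteException;
}

// Server side: export the component and register it under a well-known name.
final class PayrollServer implements PayrollService {
    public String runPayroll(String periodId) { return "completed run " + periodId; }

    public static void main(String[] args) throws Exception {
        PayrollService stub = (PayrollService)
                UnicastRemoteObject.exportObject(new PayrollServer(), 0);
        LocateRegistry.createRegistry(1099).rebind("payroll", stub);
    }
}

Keeping the remote interface at the granularity of whole business tasks is what distinguishes this usage from the fine-grained remote objects such protocols were designed for.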
Optionally, the glue component may be combined with the user interface component just mentioned. Irrespective of its type, each component encapsulates its functionality so that it remains a discrete, manageable unit. A conceptual framework might abstract functionality in such a way that a data storage facility seems to be encapsulated within the component. The software architecture, however, depicts the reality that the database management system is a distinct mechanism, external to the other components. In addition, all components, regardless of their type, must exhibit high cohesion, as explained in the review and synthesis step of the research methodology. Everything within the enterprise component should focus upon one central objective.

Component Interrelationships. Interconnections between components involve two aspects of a system: (1) interfaces, and (2) the distribution infrastructure. These will be described in the following section and elaborated further in the third step of the research methodology, system architecture development. Standard interfaces are included in the improved architecture to make components plug compatible. Two components with identical specifications should be completely interchangeable if their interfaces are also indistinguishable. Standard interfaces may also be used to extend system functionality by adding components. The standard interfaces permit these appended components to fit properly into the system, contingent upon the component specification being compatible with the rest of the system. The proposed architecture incorporates structured interfaces. In several of the systems studied (Helton 1999), OO interfaces were employed, rendering them susceptible to the generation of puzzling interface inheritance hierarchies. The proposed architecture uses information hiding to abstract the implementations of large components, making them manageable. The inner workings may follow any paradigm whatsoever. The only type of direct access to components permitted in the proposed architecture is direct external access. This restriction reinforces information hiding. Delegation permits indirect access to interior interfaces, as suggested by Lewandowski (1998). Standard distribution protocols are specified in the proposed architecture, even though the distribution systems commercially available either were designed for small components or have been extended somewhat imperfectly to handle large components. As was discussed in the review and synthesis of prior systems step of the research methodology, the distribution system ideally should be one specifically crafted to handle large components. A specialized infrastructure for large components would obviate the need for building extra middleware software, as the Stanford CHAIMS researchers were forced to do, to adapt large components to infrastructures really designed for fine-grained components. Since the proposed architecture is for a distributed system, loose coupling of elements is required. As adherents of structured programming demonstrated, tight coupling of modules complicates even applications that are not distributed (Sametinger 1997). The glue component provides central control, which facilitates loose coupling. This will be discussed further in the next section.

4 System Architecture Specification

The final step in defining the improved software architecture is to bring everything together to form a coherent system.
Though the focus here is upon composing large components for a business application, the architecture specifies some general requirements for the physical distribution of software. The system architecture provides detail down to the level of interface specification. The implementation of components is abstracted, because component inner structure is a widely researched topic beyond the scope of this paper. The overall system architecture is depicted in Figure 2. This figure conforms to the completed conceptual framework developed in (Helton 1999). The proposed system architecture, as illustrated in Figure 2, contains a glue component, one or more large application server components, a database management system, and an underlying distribution infrastructure. The role of each of these elements in the architecture is discussed in the subsequent material.

Glue Component. The glue component coordinates the activities of the large application components. The glue component, as well as the other components, has a structured interface. Observe in Figure 2 that all invocations are made through the glue component, enforcing loose coupling between application elements. The effectiveness of this type of glue component has been well established in the client/server application model for programming (Berson 1996).

Server Application Components. Components A and B in Figure 2 represent a small number of large components that comprise the server application components. Their implementations are obscured through information hiding. No direct external access to inner elements is permitted. Either A or B, however, may have nested interfaces accessible through delegation, as described earlier in this paper. A new server component X could be substituted for component B if the interfaces and specifications of both X and B are identical. This mechanism provides interchangeability of elements obtained from competing vendors. An element C, moreover, distinct in its functionality from anything presently available within the application, could be appended to the system to extend the application's capabilities. Component C would have to have a standard interface, and its functionality could not conflict with that of the rest of the system. As indicated in Figure 2, component A should not make a direct invocation to B, or vice versa, though this is theoretically possible. Having one server component call another directly violates the structured programming principle against passing unconditional control to another module. Channelling all calls through the glue component eliminates this type of undesirable invocation.

Database Management System. Though conceptually data are contained within components, at the operational level persistence requires components to access the database management system, as depicted in Figure 2. The client component (the glue component, in this case) invokes server components A and B as if they stored the data; via delegation, the server components make the actual calls to the DBMS.

Distribution System. In the proposed system architecture, a standard underlying distribution system manages all communication between the dispersed system elements. Ideally, an infrastructure specifically designed to handle large components would be used.
This would eliminate the need to create and maintain an additional software layer between the distribution system and the application, such as the extra software stratum required in the Stanford CHAIMS system because its distribution system was designed to manage small rather than large components.

5 Summary

The goal of this study was to suggest an enhanced system architecture for large components. The conceptual framework (Helton 1999) formed the input for delineating the system architecture. The resulting software architecture synthesizes the best features from the existing systems presented in (Helton 1999, Kim 2002) and from related technologies.

References

[1] Bandini S., Paoli F., Manzoni S. & Mereghetti P. (2002) A support system for COTS-based software development for business services. 14th International Conference on Software Engineering & Knowledge Engineering, Ischia, Italy.
[2] Beringer D., Tornabene C., Jain P. & Wiederhold G. (1998) A language and system for composing autonomous, heterogeneous and distributed megamodules. International Workshop on Large-Scale Software Composition, in conjunction with DEXA'98, Ninth International Workshop on Database and Expert Systems Applications, Vienna, Austria, p. 826-833.
[3] Berson A. (1996) Client/Server Architecture, 2nd ed. New York: McGraw-Hill.
[4] Buck-Emden R. & Galimow J. (1996) SAP R/3 System: A Client/Server Technology. Weinland A. (trans.). Harlow, England: Addison-Wesley.
[5] Budgen D. (2003) Software Design, 2nd ed. Harlow, England: Addison-Wesley.
[6] Carney D. (1997) Assembling large systems from COTS components: Opportunities, cautions, and complexities. SEI Monographs on Use of Commercial Software in Government Systems. Pittsburgh, PA: Carnegie Mellon University.
[7] Deibel S. & Greenes R. (1995) Component based computing in radiology systems architecture. Decision Systems Group Technical Report, DSG-AR-005-1.0, Brigham & Women's Hospital. Cambridge, MA: Harvard University.
[8] D'Souza D. & Wills A. C. (1998) Objects, Components and Frameworks with UML: The Catalysis Approach. Reading, MA: Addison-Wesley.
[9] Helton D. (1999) Coarse-grained components as an alternative to component frameworks. Proceedings, 4th International Workshop on Component-Oriented Programming, Lisbon, Portugal.
[10] Hissam S. A., Seacord R. C. & Lewis G. A. (2002) Building systems from commercial components. Proceedings of the 24th International Conference on Software Engineering, Orlando, Florida, USA, p. 679-680.
[11] Inverardi P. & Tivoli M. (2002) The role of architecture in components assembly. Proceedings, 7th International Workshop on Component-Oriented Programming, Malaga, Spain.
[12] Kim S. D. (2002) Lessons learned from a nationwide CBD promotion project. Communications of the ACM, 45, no. 10, p. 83-87.
[13] Levi K. & Arsanjani A. (2002) A goal-driven approach to enterprise component identification and specification. Communications of the ACM, 45, no. 10, p. 45-52.
[14] Lewandowski S. M. (1998) Frameworks for component-based client/server computing. ACM Computing Surveys, 30, no. 1, p. 3-27.
[15] Matthes F., Wegner H. & Hupe P. (1999) A process-oriented approach to software component definition. Proceedings, 11th Conference on Advanced Information Systems Engineering, Heidelberg, Germany, June 14-18, 1999.
[16] Pressman R. S. (1997) Software Engineering: A Practitioner's Approach, 4th ed. New York: McGraw-Hill.
[17] Ravichandran T. & Rothenberger M. A. (2003) Software reuse strategies and component markets. Communications of the ACM, 46, no. 8, p. 109-114.
[18] Sametinger J. (1997) Software Engineering with Reusable Components. Berlin: Springer-Verlag.
[19] SAP AG (1997) R/3: System benefits of the business framework. Walldorf, Germany: SAP AG Technology Marketing.
[20] Shaw M. et al. (1995) Abstractions for software architecture and tools to support them. IEEE Transactions on Software Engineering, 21, no. 4, p. 314-335.
[21] Szyperski C. (1997) Component Software: Beyond Object-Oriented Programming. Reading, MA: Addison-Wesley.
[22] Thomas A. (1997) Enterprise JavaBeans: Server Component Model for Java. Boston, MA: Patricia Seybold Group.
[23] Vitharana P. (2003) Risks and challenges of component-based software development. Communications of the ACM, 46, no. 8, p. 67-72.