Volume 35 Number 2 June 2011 ISSN 0350-5596 Informatica An International Journal of Computing and Informatics EDITORIAL BOARDS, PUBLISHING COUNCIL Informatica is a journal primarily covering intelligent systems in the European computer science, informatics and cognitive community; scientific and educational as well as technical, commercial and industrial. Its basic aim is to enhance communication between different European structures on the basis of equal rights and international refereeing. It publishes scientific papers accepted by at least two referees outside the author's country. In addition, it contains information about conferences, opinions, critical examinations of existing publications and news. Finally, major practical achievements and innovations in the computer and information industry are presented through commercial publications as well as through independent evaluations. Editing and refereeing are distributed. Each editor from the Editorial Board can conduct the refereeing process by appointing two new referees or referees from the Board of Referees or Editorial Board. Referees should not be from the author's country. If new referees are appointed, their names will appear in the list of referees. Each paper bears the name of the editor who appointed the referees. Each editor can propose new members for the Editorial Board or referees. Editors and referees inactive for a longer period can be automatically replaced. Changes in the Editorial Board are confirmed by the Executive Editors. The necessary coordination is made through the Executive Editors, who examine the reviews, sort the accepted articles and maintain appropriate international distribution. The Executive Board is appointed by the Society Informatika. Informatica is partially supported by the Slovenian Ministry of Higher Education, Science and Technology. Each author is guaranteed to receive the reviews of his article.
When accepted, publication in Informatica is guaranteed in less than one year after the Executive Editors receive the corrected version of the article. Executive Editor - Editor in Chief Anton P. Železnikar Volaričeva 8, Ljubljana, Slovenia s51em@lea.hamradio.si http://lea.hamradio.si/~s51em/ Executive Associate Editor - Managing Editor Matjaž Gams, Jožef Stefan Institute Jamova 39, 1000 Ljubljana, Slovenia Phone: +386 1 4773 900, Fax: +386 1 251 93 85 matjaz.gams@ijs.si http://dis.ijs.si/mezi/matjaz.html Executive Associate Editor - Deputy Managing Editor Mitja Luštrek, Jožef Stefan Institute mitja.lustrek@ijs.si Executive Associate Editor - Technical Editor Drago Torkar, Jožef Stefan Institute Jamova 39, 1000 Ljubljana, Slovenia Phone: +386 1 4773 900, Fax: +386 1 251 93 85 drago.torkar@ijs.si Editorial Board Juan Carlos Augusto (Argentina) Costin Badica (Romania) Vladimir Batagelj (Slovenia) Francesco Bergadano (Italy) Marco Botta (Italy) Pavel Brazdil (Portugal) Andrej Brodnik (Slovenia) Ivan Bruha (Canada) Wray Buntine (Finland) Ondrej Drbohlav (Czech Republic) Hubert L. Dreyfus (USA) Jozo Dujmovic (USA) Johann Eder (Austria) Ling Feng (China) Vladimir A. Fomichov (Russia) Maria Ganzha (Poland) Marjan Gušev (Macedonia) N. Jaisankar (India) Dimitris Kanellopoulos (Greece) Samee Ullah Khan (USA) Hiroaki Kitano (Japan) Igor Kononenko (Slovenia) Miroslav Kubat (USA) Ante Lauc (Croatia) Jadran Lenarčič (Slovenia) Shiguo Lian (China) Huan Liu (USA) Suzana Loskovska (Macedonia) Ramon L. de Mántaras (Spain) Angelo Montanari (Italy) Deepak Laxmi Narasimha (Malaysia) Pavol Návrat (Slovakia) Jerzy R. Nawrocki (Poland) Nadja Nedjah (Brazil) Franc Novak (Slovenia) Marcin Paprzycki (USA/Poland) Ivana Podnar Žarko (Croatia) Karl H.
Pribram (USA) Luc De Raedt (Belgium) Shahram Rahimi (USA) Dejan Rakovic (Serbia) Jean Ramaekers (Belgium) Wilhelm Rossak (Germany) Ivan Rozman (Slovenia) Sugata Sanyal (India) Walter Schempp (Germany) Johannes Schwinn (Germany) Zhongzhi Shi (China) Oliviero Stock (Italy) Robert Trappl (Austria) Terry Winograd (USA) Stefan Wrobel (Germany) Konrad Wrona (France) Xindong Wu (USA)

A Data Model and an XQuery Extension for Concurrent XML Structures

Emmanuel Bruno and Elisabeth Murisasco
LSIS - UMR CNRS 6168, Université du Sud Toulon-Var, BP 20132, 83957 La Garde Cedex, France
E-mail: {bruno, murisasco}@univ-tln.fr

Keywords: multistructure, concurrent hierarchy, textual document, tree-like structure, XML data model, XQuery

Received: March 13, 2009

An XML document is mainly hierarchical, but some applications need to simultaneously associate more than one hierarchy with the same data. In general, concurrent hierarchies cannot be merged in order to get a well-formed XML document. This work stands in this context: it aims at describing and querying hierarchical XML structures defined over the same textual data in a concurrent way. Our proposal, called MSXD, is composed of a data model (which can be seen as an index over the data and between all the structures) and a dedicated query language defined as an extension of XQuery. The key idea is to propose a method for a compact description of multiple tree-like structures over a single text, based on segmentation. Segmentation encoding allows querying overlap/containment relations of markups belonging to different structures. This paper also presents a solution to define a multistructured schema in order to describe the relationships (as weak constraints) between parts of concurrent structures. Finally, this paper focuses on the architecture of the XML environment implementing our proposals.

Povzetek (abstract in Slovenian): The system makes it possible to apply several hierarchies to XML documents.

1 Motivation

XML [10] is the de facto standard to describe structured data.
Several applications in the context of information systems are based on their use: electronic publishing, language engineering, technical documentation, digital libraries, the Web, etc. XML documents are mainly hierarchical. The hierarchy, captured in a tree-like structure [23], corresponds to one level of analysis of the data contained in the document (e.g. a logical analysis). A large set of tools is now available and widely used, in particular for editing, querying (XPath [16], XQuery [7]) or transformation (XSLT [15]). In our view, this success is due to the hierarchical structure, which is easier to exploit than a graph structure. Moreover, manipulation remains simple in an XML environment. Recently, the title of an article published at the SIGMOD conference [27] claimed that "One hierarchy is not enough" for data-centric application contexts. Indeed, the CONCUR feature of SGML [24] first pointed out this need in the nineties, but in the context of document-centric encoding, where some applications need to consider more than one hierarchy over the same text in a concurrent way. In recent years, several other works about concurrent markup have been published [31, 30, 32, 22, 33, 19, 25, 14, 21]. All these works propose solutions to describe multiple tree-like structures (with overlapping) in a single document. The main problem with concurrent hierarchies is that they cannot be merged into a well-formed XML document without using a flat representation or hyperlinks that make the structure difficult to query. Our work stands in this context. It aims at representing and querying concurrent hierarchical XML structures defined over the same textual data. We call such data "multistructured textual documents". The key idea is to propose a method for a compact description of multiple trees over a single text, based on its segmentation.
Segmentation encoding allows querying overlap/containment relations of markups belonging to different structures. The proposed solution, called MSXD (MultiStructured XML Documents)(1), is entirely XML-compatible. This paper describes on the one hand the MSXD model and its associated query language, and on the other hand the way these proposals are implemented in an XML environment. We also tackle the description of relationships between structures (as weak constraints) in order to propose a solution to define a multistructured schema enabling validation across multiple hierarchies [20]. This paper is an extended version of different preliminary contributions ([11], [12] and [13]).

1.1 A running example

To illustrate the problem, we consider a medieval manuscript (Princeton, Garrett 80 (PG)) related to medico-pharmaceutical recipes written in the Occitan language [8]. This kind of text is studied by philologists of the University of Pisa (Italy), in the Department of Language and Romance Literature. Philology is a science that studies ancient or medieval civilizations by means of literary documents. Researchers are interested in studying the text of the manuscript according to different points of view corresponding to different uses of the document. Each analysis results in a mainly hierarchical markup of the text which can easily be represented in XML. From the document represented as an image (see Figure 1), for illustrating our intention, we have extracted the following textual content: ... Per recobrar maniar Ad home cant a perdut lo maniar prin de l erba blanca ...

(1) Funded by the French research agency (ANR) - Semweb project (2004-2007).
This same text has been segmented and marked up in three ways: its physical structure S1 (the manuscript is organized into pages, on two columns composed of lines), its syntactic structure S2 (the manuscript is composed of sentences and words) and its semantic structure S3 (the manuscript describes medical prescriptions which have signs, ingredients, plants and effects). The result is shown in Figure 2, where three XML documents hierarchically organize the same text according to these three different points of view. Each structure marks up the text differently from another (see Figure 3). For example, the segment of text "Ad home cant a perdut lo maniar" is tagged by Sign in S3, while it is tagged differently in S1: "Ad home cant a perdut" is marked by Line, whereas "lo maniar" begins another textual segment also marked by Line. Moreover, notice that a Sign can overlap two lines. Therefore, these documents are independent, with potential overlapping. Structures (but not the relations between them) could be defined by means of a grammar such as a DTD, an XML Schema [6] or a RelaxNG schema [17]. These XML documents are currently considered separately, even if researchers identify correlations between them. In particular, they would like to be able to express the following queries: What are the sentences that follow the signs? (this query combines the syntactic and semantic structures); What is the prescription that contains the largest number of lines? (this query combines the physical and semantic structures); What are the words cut by an end of line? (this query combines the physical and syntactic structures). Usual XML manipulation languages do not support concurrent structures, so our issue is (1) to model multiple hierarchical structures in order to query them in a concurrent way and (2) to provide an XML environment to support multistructured documents.
Finally, notice that the (possibly weak) relationships between parts of these three structures are not defined: for example, one can observe that a page (in the physical structure) always begins with a prescription (in the semantic structure). It could be useful to use this kind of constraint during querying and to check consistency between structures. We tackle this issue by defining a schema to describe the relationships between parts of the concurrent structures.

1.2 Objectives

Our objectives are the following:

A suitable, XML-compatible, data model. This model, dedicated to multistructured textual documents, is called MSXD. It makes it possible (1) to consider several segmentations of the same text, and (2) to define a hierarchical structure for each of these segmentations. Notice that the structures can be weakly coupled and that there is no main structure. The set of segmentations and the set of hierarchical structures are deduced automatically from the given XML structured documents (for example, produced by philologists). That is why they can be developed in a distributed way. The replication of the common text (in each XML document) can be seen as a drawback, but (i) given the volumes involved, data storage is not really a problem, and (ii) synchronisation is not an issue because the common text is never updated: the only changing part (the structure) is not replicated. Moreover, each user can then edit his or her own copy offline. This data model is based on the use of hedges [28], the foundation of the grammar language RelaxNG [17] (REgular LAnguage description for XML New Generation).

An extension of XQuery [7]. Our objective is to query the structures and the textual content in a concurrent way, using an XQuery that is as unchanged as possible. One of the original contributions of our proposal is that the multistructured document is never instantiated.
As we said before, the set of structures can be syntactically described as a distributed set of XML documents: we want to keep each structure intact, so that available XML tools and languages can be used for its construction and its manipulation. Given a set of XML documents over the same textual value, an MSXD instance can be seen as an index over the textual value and between all the structures. This index is used to evaluate queries using concurrent structures.

The description of relationships between structures. We propose a solution to define a multistructured schema in order to describe the relationships (as weak constraints) between parts of the concurrent structures. This is done as a set of rules by means of Allen's relations to constrain the relative position of fragments in the structures. This work is a first step towards validation across multiple hierarchies [20].

This paper is organized as follows: Section 2 defines the MSXD data model and shows the XML syntax associated with an MSXD instance. Section 3 presents an extension of XQuery to multistructure. Section 4 proposes a schema enabling the expression of constraints between concurrent structures. Section 5 specifies the MSXD architecture. Section 6 is dedicated to related work and Section 7 concludes. Currently, we only consider the definition of multiple structures in the single textual modality.
Figure 1: A page from the manuscript

2 The MSXD model

We first choose to define the notion of a multistructured document and then to introduce all the concepts used in this definition. We illustrate the MSXD model with our running example. Our data model is based on the use of hedges (the foundation of RelaxNG). Informally, a hedge is a sequence of trees. In XML terminology, a hedge is a sequence of elements possibly intervened by character data; in particular, an XML document is a hedge [28].

2.1 Formal model definition

Definition 1. A multistructured document is a triplet (V, I, S) where V is a textual value, I a set of segmentations of V and S a set of structures associated with segmentations from I.

A multistructured document can be seen as a textual value augmented with different hierarchical structures defined over it. These structures share the same text but concern (in general) different strings extracted from that text.

Definition 2. A segmentation of the textual value V of length l is a list X_V of strings such that X_V = {x_i | x_i = V[b_i..e_i] and b_1 = 0 and e_i >= b_i and b_i = e_(i-1) + 1 and e_|X_V| = l - 1}; b_i and e_i are respectively the start and the end positions of the fragment in V.
We define two functions for each X_V[i]: start(X_V[i]) = b_i and end(X_V[i]) = e_i. The set X_V makes a total partition of the textual value V. For our example, V is the text extracted from the manuscript ("...Per recobrar maniar Ad home cant a perdut lo maniar prin de l erba blanca...") and we consider three segmentations X_V^1, X_V^2 and X_V^3 (corresponding to the sets of textual contents of the XML elements from each structure S1, S2 and S3, see Figure 2):

- X_V^1 = x_1^1 ∪ ... ∪ x_n^1, with in particular x_1^1 = "Per recobrar maniar" and the last segment = "blanca...";
- X_V^2 = x_1^2 ∪ ... ∪ x_15^2, with in particular x_1^2 = "Per" and x_15^2 = "blanca";
- X_V^3 = x_1^3 ∪ ... ∪ x_m^3, with in particular x_1^3 = "Per recobrar maniar" and x_4^3 = "erba blanca".

Figure 2: Physical (S1), Syntactic (S2) and Semantic (S3) structures

Figure 3: Multiple segmentations of the text

We use the concept of fragment to define structures: fragments mark up a segmentation. Fragment positions in the textual value are useful to compute their relative positions.

Definition 3. A fragment f is defined over a segmentation X_V of the textual value V and an alphabet Sigma_V, a set of labels:

1. f = epsilon (the empty fragment), start(f) = end(f) = 0;
2. f = x_i with x_i in X_V, start(f) = start(X_V[i]), end(f) = end(X_V[i]);
3. f = a<x> with a in Sigma_V and x a fragment, start(f) = start(x), end(f) = end(x) (f is called a tree);
4. f = x y with x and y two fragments and start(y) = end(x) + 1, start(f) = start(x), end(f) = end(y).

Notice some remarks about Definition 3:

- Rule 1 defines the empty fragment.
Recall that we want to represent several segmentations of the same text in order to define a hierarchical structure for each of these segmentations (producing several structured documents over the same text). Using empty fragments (as milestones) is not compatible with our model, because a fragment has to be related to a segmentation. The only legal empty fragment is the one associated with the empty document.

- Rule 3 uses the alphabet Sigma_V, which is a set of labels for fragments having a tree-like structure, corresponding to XML element names: for the physical analysis, Sigma_V^1 = {Manuscript, Page, Column, Line}; for the syntactic analysis, Sigma_V^2 = {Manuscript, Syntax, Sentence, W}; for the semantic analysis, Sigma_V^3 = {Manuscript, Prescription, Ingredient, Sign, Plant}.

- Rule 4 produces sequences of fragments. We do not make a distinction between a fragment and a singleton sequence containing that fragment, and we do not consider nested sequences.

We now give some examples of fragments respectively constructed over the segmentations X_V^1 and X_V^3:

- Over X_V^1 (two fragments): (1) f_1^1 = x_1^1 with x_1^1 = "Per recobrar maniar", start(f_1^1) = 1, end(f_1^1) = 19; and (2) f_2^1 = Line<x_1^1>, start(f_2^1) = start(f_1^1), end(f_2^1) = end(f_1^1). The fragment f_2^1 is represented in XML syntax by <Line>Per recobrar maniar</Line>.

- Over X_V^3 (three fragments): (1) f_4^3 = x_4^3 with x_4^3 = "erba blanca", start(f_4^3) = 60, end(f_4^3) = 70; (2) f_5^3 = Plant<f_4^3>, start(f_5^3) = start(f_4^3), end(f_5^3) = end(f_4^3); and (3) f_6^3 = Ingredient<f_5^3 ...>, start(f_6^3) = start(f_5^3), end(f_6^3) = 92 (we suppose that 92 is the end position of the textual fragment marked by "..." in our example). The fragment f_6^3 is represented in XML by <Ingredient><Plant>erba blanca</Plant>...</Ingredient>.

Definition 4. A structure is a tree f (a labelled fragment) over a segmentation X_V of the textual value V, with start(f) = 0 and end(f) = l - 1 (l being the length of V).
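To make Definitions 2-4 concrete, the following Python sketch builds a segmentation with contiguous start/end positions and wraps a segment into a labelled tree fragment, mirroring rules 2 and 3. All class and function names here are our own illustration, not part of MSXD.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Segment:
    """One element x_i of a segmentation X_V: a slice V[b_i..e_i]."""
    text: str
    start: int  # b_i
    end: int    # e_i (inclusive, as in Definition 2)

def segmentation(parts: List[str], offset: int = 0) -> List[Segment]:
    """Build a segmentation from consecutive strings: b_i = e_(i-1) + 1."""
    segments, pos = [], offset
    for p in parts:
        segments.append(Segment(p, pos, pos + len(p) - 1))
        pos += len(p)
    return segments

@dataclass
class Fragment:
    """A fragment per Definition 3: a segment or a labelled tree."""
    start: int
    end: int
    label: Optional[str] = None       # set for rule-3 (tree) fragments
    children: Optional[list] = None   # the fragment wrapped by the label

def tree(label: str, child: Fragment) -> Fragment:
    """Rule 3: f = a<x>; the tree inherits its child's positions."""
    return Fragment(child.start, child.end, label, [child])

# Fragments f_1^1 = x_1^1 and f_2^1 = Line<x_1^1> from the running example
x1 = Segment("Per recobrar maniar", 1, 19)
f1 = Fragment(x1.start, x1.end)
f2 = tree("Line", f1)
assert (f2.start, f2.end) == (1, 19) and f2.label == "Line"
```

The contiguity condition b_i = e_(i-1) + 1 is what makes the positions an index: any fragment's extent can be compared with any other's, whatever structure it comes from.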
Figure 4 illustrates our model over the two segmentations X_V^1 and X_V^3 (we do not show the third segmentation, to make the figure easier to read) and the two structures S1 and S3 defined over them. The figure is composed of two parts: the first indicates the segmentations (start and end positions are associated with each textual segment inside a segmentation; for convenience, the numbering of start and end positions only takes into account the segments used in our example), and the second part shows the hierarchical organization of fragments into a structure (physical structure on the left, semantic structure on the right). In summary, from the XML documents of Figure 2, we can extract the text V and we can deduce three structures (S1, S2 and S3) built on three segmentations (X_V^1, X_V^2 and X_V^3). Fragments are constructed over segmentations.

2.2 Relative position of two fragments

Our model is designed so that Allen's relations [2] can be used on fragments in order to compute their relative position inside a segmentation or between two segmentations.

Definition 5. Predicates on two fragments f1 and f2 are defined over one or two segmentations of the same textual value:

- before(f1, f2) = after(f2, f1): end(f1) < start(f2)
- before(f1, f2, n) = after(f2, f1, n): start(f2) - end(f1) = n
- meets(f1, f2) = met-by(f2, f1): end(f1) = start(f2)
- during(f1, f2) = contains(f2, f1): start(f1) > start(f2) and end(f1) < end(f2)
- overlaps(f1, f2) = is-overlapped(f2, f1): start(f1) < start(f2) and end(f1) > start(f2) and end(f1) < end(f2)
- starts(f1, f2) = started-by(f2, f1): start(f1) = start(f2) and end(f1) < end(f2)
- finishes(f1, f2) = finished-by(f2, f1): end(f1) = end(f2) and start(f1) > start(f2)
- equals(f1, f2) = equals(f2, f1): start(f1) = start(f2) and end(f1) = end(f2)

Notice that if f1 and f2 are defined on the same segmentation, the predicates meets and overlaps are always false.
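Under the paper's position encoding (inclusive integer start/end), Definition 5 can be sketched directly. The fragment positions below are hypothetical, chosen only so that a Plant spans a Line boundary as in the running example; the inverse relations (after, met-by, contains, is-overlapped, started-by, finished-by) are obtained by swapping the arguments.

```python
# Fragments reduced to (start, end) position pairs, both inclusive.

def before(f1, f2):   return f1[1] < f2[0]
def meets(f1, f2):    return f1[1] == f2[0]
def during(f1, f2):   return f1[0] > f2[0] and f1[1] < f2[1]
def overlaps(f1, f2): return f1[0] < f2[0] and f2[0] < f1[1] < f2[1]
def starts(f1, f2):   return f1[0] == f2[0] and f1[1] < f2[1]
def finishes(f1, f2): return f1[1] == f2[1] and f1[0] > f2[0]
def equals(f1, f2):   return f1[0] == f2[0] and f1[1] == f2[1]

# Hypothetical positions: a Plant fragment straddling two Line fragments
plant = (60, 70)      # "erba blanca"
line_a = (50, 65)
line_b = (66, 80)
assert overlaps(line_a, plant)   # the first Line runs into the Plant
assert overlaps(plant, line_b)   # the Plant runs into the second Line
assert not during(plant, line_a) and not equals(plant, line_b)
```

Because the predicates only read positions, they work identically for two fragments of the same structure and for fragments of different structures, which is exactly what makes cross-hierarchy querying possible.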
Finally, we need to compute the level of a fragment in a structure. This level captures the parent/child relationship between two fragments in a structure.

Definition 6. Let F(s) (s is a structure) be the set of fragments f such that f = s or there exist x in F(s) and a in Sigma_V with x = a<f>. The function level(s, f) returns the level of the fragment f in the structure s; it is calculated with the following algorithm:

- level(s, s) = 0
- level(s, y) = level(s, x) + 1 with x = a<y> (x and y in F(s)).

Figure 4 also shows the relative position of two fragments in two segmentations. In particular, the two following Allen's predicates are true (they indicate that a Plant overlaps two Lines): overlaps(Plant(s'5, e'5), Line(s5, e5)) = true and overlaps(Plant(s'5, e'5), Line(s6, e6)) = true. Lastly, notice that in Figure 4 we have associated with each fragment its level in the hierarchical structure to which it belongs. For example, in S1: level(S1, Manuscript<>) = 0, level(S1, Page<>) = 1, level(S1, Column<>) = 2, level(S1, Line<>) = 3.

Figure 4: Illustration of our model

2.3 MSXD XML syntax

We see that, for a given multistructured document, each structure can be described using an XML syntax; thus its schema can be described using RelaxNG (see Figure 5). We choose RelaxNG because our model relies on hedges to represent XML documents, and the design of RelaxNG is based on this theory [28]. Another, equivalent, solution could have been to use XML Schema. We define an XML syntax for an MSXD instance (see Figure 6).
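The level function of Definition 6 amounts to the depth of a labelled fragment in its structure tree. A minimal sketch follows; the tuple encoding of a structure is our own, hypothetical representation, not the MSXD storage format.

```python
def level(structure, target):
    """level(s, s) = 0; level(s, y) = level(s, x) + 1 when x = a<y>.
    A structure is a (label, children) tuple; children are sub-tuples
    (tree fragments) or plain strings (textual segments)."""
    def walk(node, depth):
        label, children = node
        if label == target:
            return depth
        for child in children:
            if isinstance(child, tuple):
                found = walk(child, depth + 1)
                if found is not None:
                    return found
        return None
    return walk(structure, 0)

# The physical structure S1 from the running example
s1 = ("Manuscript",
      [("Page",
        [("Column",
          [("Line", ["Ad home cant a perdut"]),
           ("Line", ["lo maniar prin de l"])])])])
assert level(s1, "Manuscript") == 0
assert level(s1, "Column") == 2
assert level(s1, "Line") == 3
```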
In this figure, we refer to the textual value of the manuscript (identified by a URI), each structure (identified by a URI) and its associated schema. Notice that segmentations are implicit.

3 Querying MSXD instances

Our data model is close to XDM (the XML Data Model) [23] used in XQuery and XPath. Recall that XDM defines unnested sequences of items (nodes or atomic values). For querying, we propose an extension of XQuery to deal with a multistructured document. Indeed, we propose to define an XQuery item as an atomic value or a fragment (instead of a node), and then we still deal with sequences of items. Thus, the XQuery language can be adapted to query a multistructured document. An XQuery on an MSXD instance with a single structure is equivalent to the same XQuery on the XML document corresponding to this structure. In the case of multiple structures, we extend the semantics of the filters of XQuery. To that end, we successively study and extend:

- The accessors defined in the XQuery Data Model (XDM) [23]. Accessors can be seen as a set of primitives to access an instance of the XML data model (parent, child, ...).

- The axes defined in XQuery [7]. Axes are a higher-level access mode to instances of the XML data model. An axis can be an accessor (for instance parent and child) or can be based on accessors (for instance the ancestor axis is defined by the recursive application of the parent accessor).

- The normalization of an extended XQuery into a core XQuery (as defined in the XQuery Formal Semantics [18]). To formally define the semantics of XQuery, a subset of XQuery named the XQuery core has been defined. Every XQuery can be expressed in the core. For instance, an XPath step is translated into a For Let Where Return (FLWR) expression. The dynamic environment existing during the evaluation of an XQuery is explicitly defined by binding a set of standard variables in the XQuery core.
Our objective is to use an XQuery that is as unchanged as possible; that is why we choose to rely on the XQuery normative documents.

3.1 Extending XDM accessors and XQuery axes

To navigate in every structure and to adapt Allen's relations, we choose to slightly modify the semantics of some accessors or operators (close to Allen's relations). In other cases, we define new functions.

Accessors. First, to express containment, we adapt the parent/child relationship: the dm:children and dm:parent accessors have been extended to return every parent and every child of a fragment in every structure it belongs to.

Axes. As a fragment does not belong to every structure, and as we want the axes to be as general as possible, the ancestor and descendant axes cannot be defined by recursively applying the dm:parent and dm:children accessors as is done in XQuery(2). Thus, we define two new accessors, dmmsxd:ancestor and dmmsxd:descendant, which respectively return the sequence of fragments containing a fragment or contained in a fragment (according to start and end positions).

(2) http://www.w3.org/TR/xquery/#id-full-axis-feature

Figure 5: Schemas for the Physical (S1), Syntactic (S2) and Semantic (S3) structures

Figure 6: XML syntax for the multistructured manuscript

Order.
We need to consider Allen's relations, used to compute the relative position between two fragments inside a segmentation or between two segmentations. A partial order on fragments in a given multistructured document can be defined on start and end positions. As in XQuery, the boolean operators n1 << n2, n1 >> n2 and n1 is n2 can be defined. They are respectively true, for fragments n1 and n2, if n1 is before n2, if n1 is after n2, and if n1 and n2 are the same fragment. We use the first two boolean operators (<<, >>) to express the following (after) and preceding (before) Allen's relations.

New functions. We introduce each of the remaining relations as functions (evaluated according to the relative positions of two fragments). We only give here the definition of two Allen's relations (equals and overlaps) and the corresponding operators, because they are used in the query examples given below. The remaining relations can be defined in the same way:

- the function msxd:is-equal(n1, n2) and the operator is-equal are true if n1 and n2 have the same start and end positions in the same multistructured document;

- the function msxd:is-overlapping(n1, n2) and the operator is-overlapping are true if start(n1) < start(n2) and end(n1) > start(n2) and end(n1) < end(n2).

3.2 Extending the dynamic evaluation

In the formal semantics of XPath and XQuery, the dynamic context is explicitly defined by binding variables. In particular, the $fs:sequence, $fs:dot, $fs:position and $fs:last variables respectively represent the sequence of context items, the context item, the context position and the context size. The side effect of each operator is also explicit in the core XQuery. It is necessary to extend the dynamic context to carry information about every structure associated with an MSXD instance.
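The position-based accessors dmmsxd:ancestor and dmmsxd:descendant can be sketched over a pool of fragments drawn from all structures. The dict layout and the sample positions below are our own illustration, not the engine's actual data structures.

```python
# Fragments from any structure, pooled with their global positions.
fragments = [
    {"name": "Line", "start": 20, "end": 40, "structure": "S1"},
    {"name": "Sign", "start": 20, "end": 55, "structure": "S3"},
    {"name": "W",    "start": 24, "end": 27, "structure": "S2"},
]

def descendant(f, pool):
    """dmmsxd:descendant: fragments contained in f, from any structure."""
    return [g for g in pool if g is not f
            and g["start"] >= f["start"] and g["end"] <= f["end"]]

def ancestor(f, pool):
    """dmmsxd:ancestor: fragments containing f, from any structure."""
    return [g for g in pool if g is not f
            and g["start"] <= f["start"] and g["end"] >= f["end"]]

line = fragments[0]
assert [g["name"] for g in descendant(line, fragments)] == ["W"]
assert [g["name"] for g in ancestor(fragments[2], fragments)] == ["Line", "Sign"]
```

Note that the word W has ancestors in two different structures (a Line in S1 and a Sign in S3), which is precisely what the recursive dm:parent-based definition of the ancestor axis cannot express.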
We extend it by binding a new variable, $msxd:selected_structures, to a sequence of strings that represents the set of ids of the structures to be taken into account during the evaluation of the query. In our example, when the document is loaded, by default: $msxd:selected_structures = {"http://lsis.univ-tln.fr/msxd/instance/S1.xml", "http://lsis.univ-tln.fr/msxd/instance/S2.xml", "http://lsis.univ-tln.fr/msxd/instance/S3.xml"}. This variable can be used to restrict the set of structures: if it is set to a subset of the structures, only this subset is taken into account by the XQueries. We need to define two basic functions, to return the existing structures in a document and to create the instance of a document:

- msxd:structures($arg as fragment()*) as xs:string* returns a sequence of strings which represents the ids of the structures to which every fragment of the sequence $arg belongs;

- msxd:doc($uri as xs:string?) as document-node()? retrieves the XML description of an MSXD instance using $uri and returns its document fragment $root (equivalent to fn:doc() in http://www.w3.org/TR/xpath-functions/). This changes the dynamic context by binding $msxd:selected_structures to msxd:structures($root), i.e. by default every structure of a document is considered during a query.

Finally, we slightly modify the normalization of a path expression in XQuery (a step followed by a relative path expression) to return only fragments of the selected structures. The new rule is given in Figure 7. The standard normalization transforms an XPath step into a FLWR expression; it sets the dynamic environment after the evaluation of the first step, and for each item of the result ($fs:sequence) the relative path is evaluated (lines 3, 4 and 5). In our extension, we restrict the results to items from the selected structures (lines 6, 7 and 8). Lines 1 and 2 ensure that each fragment is unique and that every fragment is sorted according to the global document order.
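The effect of the extended normalization (restriction to selected structures, plus the deduplication and document ordering of lines 1 and 2 of the rule) can be sketched as follows. The XQuery general comparison `=` is existential, hence the set intersection; the fragment encoding is hypothetical.

```python
def filter_selected(result, selected_structures, structures_of):
    """Keep only fragments belonging to at least one selected structure,
    deduplicated and sorted in global document order (by start position,
    with containing fragments before contained ones)."""
    kept, seen = [], set()
    for frag in result:
        if structures_of(frag) & selected_structures and id(frag) not in seen:
            seen.add(id(frag))
            kept.append(frag)
    return sorted(kept, key=lambda f: (f["start"], -f["end"]))

frags = [
    {"start": 5, "end": 9,  "sid": "S2"},  # a fragment of a selected structure
    {"start": 0, "end": 19, "sid": "S1"},  # earlier in document order
    {"start": 5, "end": 9,  "sid": "S3"},  # belongs to an unselected structure
]
out = filter_selected(frags, {"S1", "S2"}, lambda f: {f["sid"]})
assert [f["sid"] for f in out] == ["S1", "S2"]
```

Setting the selected set to a single structure id reproduces the behaviour of query Q8: the same path expression then only returns fragments of that structure.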
3.3 Query examples

We propose some queries for our running example:

Prolog - Binding the multistructured document to a global variable:
declare variable $msdoc := msxd:doc("manuscript.msxd");

Q1 - Children of Manuscript:
$msdoc//Manuscript/*
returns every child of the fragment Manuscript, which is shared by every structure (the result is the sequence Page from S1, Syntax from S2 and Prescriptions from S3, see Figure 2).

Q2 - First Sentence of Prescriptions described on one Column:
$msdoc//Column//Prescription//Sentence[1]

[StepExpr / RelativePathExpr]Expr ==
1 fs:apply-ordering-mode (
2 fs:distinct-doc-order-or-atomic-sequence (
3 let $fs:sequence as node()* := [StepExpr]Expr return
4 let $fs:last := fn:count($fs:sequence) return
5 for $fs:dot at $fs:position in $fs:sequence return
6 for $msxd:fragment in [RelativePathExpr]Expr return
7 if ([(msxd:structures($msxd:fragment)) = $msxd:selected_structures]Expr)
8 then $msxd:fragment
9 else () ))

Figure 7: Extended normalization of a Path expression

Q4 - Columns which are Sentences:
for $v in $msdoc//Sentence return $msdoc//Column[. is-equal $v]

Q5 - Sentences containing a Plant:
$msdoc//Sentence[descendant::Plant]

Q6 - First Sentence after a Sign:
$msdoc//Sign/following::Sentence[1]

Q7 - First Sentence after a Sign:
for $h in $msdoc//Sign return $msdoc//Sentence[. >> $h][1]
is the same as Q6 but uses the order operator instead of the following axis.

Q8 - Children of Manuscript in S1:
let $msxd:selected_structures := "http://lsis.univ-tln.fr/msxd/instance/S1.xml"
return $msdoc/Manuscript/*
is the same as Q1 but returns only the children from S1, because the variable $msxd:selected_structures is explicitly set to the identifier of S1.

4 A schema for multistructured documents

We define a schema for multistructured documents as a set of rules (rather than a content model definition), because our structures are weakly coupled and the multistructured document is not hierarchical. Allen's relations (starts, overlaps, equals, ...)
make it possible to constrain the relative positions of fragments belonging to different structures. The constraints are expressed using XPath-based predicates; we suppose that an XPath expression applied to a structure returns a sequence of fragments.

Definition 7. A multistructured document schema is a pair (Gs, C) where Gs is a set of grammars defining valid structures and C = {c_i | c_i = c(p1 in s1, p2 in s2)} is a set of constraints, where c is the name of an Allen's predicate and p1, p2 are XPath expressions applied to the structures s1 and s2. The constraint is true if for each fragment f1 in val(p1), there exists a fragment f2 in val(p2) such that c(f1, f2) is true. A document is valid according to the schema if and only if every constraint in C is true.

Figure 8 (see the comments in the figure) shows an XML syntax for multistructured document schemas and illustrates some constraints between fragments of the three structures of our running example (notice that each constraint is applied to two structures). Every constraint can be read in the same way, for example:
- Rule 1: Root fragments of the physical and syntactic structures are equal. Each fragment matching /Manuscript in every document valid according to the structure (whose alias is) manuscript_physical must be equal to at least one fragment matching /Manuscript in every document valid according to the structure (whose alias is) manuscript_syntactic;
- Rule 3: A page starts with a prescription. Each fragment matching Page in every document valid according to the structure (whose alias is) manuscript_physical starts with at least one fragment matching Prescription in every document valid according to the structure (whose alias is) manuscript_semantic;
- Rule 4: A prescription contains sentences.
Each fragment matching /Prescription in every document valid according to the structure (whose alias is) manuscript_semantic contains at least one fragment matching Sentence in every document valid according to the structure (whose alias is) manuscript_syntactic.

This work is a first step towards validation across multiple hierarchies [20]. It makes it possible to check the conformance of concurrent annotations attached to the same textual document with respect to predefined weak relationships between parts of different structures. Even if the validation is optional, this kind of constraint is useful in the case of distributed annotation, to check the consistency of the structures before querying.

Figure 8: A grammar for our multistructured document

Figure 9: MSXD functional architecture

5 Implementation of the MSXD data model and query language

In this section, we describe the architecture of an implementation of the MSXD data model and of an XQuery engine extended to support multistructured documents. Figure 9 presents the functional architecture of our prototype and a typical use case in four steps: (1) a multistructured document distributed across the network, (2) multistructured document indexing, (3) multistructured document querying, and (4) query evaluation. The figure is organized in two rows and two columns. The first row presents the user level and the second the MSXD engine level. The first column describes the design and indexing of an MSXD instance from a set of XML documents; the second column illustrates the querying.

5.1 User space: an almost usual XML environment

The user space (parts 1 and 3 in Figure 9) is close to a usual XML environment.
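The constraint semantics of Definition 7 above can be sketched as follows. This is our own toy encoding, not the authors' validator: fragments are represented as (start, end) offsets over the shared text, and a constraint c(p1 in s1, p2 in s2) holds iff for every fragment of val(p1) there exists a fragment of val(p2) satisfying the Allen predicate c.

```python
# Hypothetical sketch of Definition 7 (our own encoding, not the authors'
# validator): fragments are (start, end) offsets over the shared text.

def equals(f1, f2):
    # Allen's "equals": both fragments cover the same segment of text.
    return f1 == f2

def started_by(f1, f2):
    # f1 starts with f2: same start position, f2 contained in f1.
    return f1[0] == f2[0] and f2[1] <= f1[1]

def check(pred, val_p1, val_p2):
    # For every f1 in val(p1) there must exist an f2 in val(p2) with pred.
    return all(any(pred(f1, f2) for f2 in val_p2) for f1 in val_p1)

# Rule 3 style check: every Page starts with a Prescription.
pages = [(0, 100), (100, 200)]
prescriptions = [(0, 40), (100, 150), (150, 190)]
rule3_ok = check(started_by, pages, prescriptions)
```

A sequential validator, like the one described for the prototype, would simply evaluate check() once per constraint in C and report the first failure.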
First, each user can describe his or her structures in XML documents which mark up the same text (see Figure 2). Notice that the structures can be distributed over a network. Then the user defines a (virtual) MSXD instance with respect to the MSXD XML syntax (see Figure 6). To do this, the user only needs an XML editor and does not have to change his or her habits. To query an MSXD instance, a user expresses the query using the language defined in Section 3. The indexing is done automatically if needed (see the next section). Recall that the language extends XQuery: if the user queries a single structure, it is standard XQuery. A query result is a sequence of fragments represented in XML.

5.2 MSXD Engine: indexing and querying structures and content

As we have shown in Section 2, an MSXD document is an XML meta-document which refers to one XML document for each structure (all of them sharing the same textual value); our implementation of the data model can be seen as a dynamically built index between them (part 2 in Figure 9). We chose to separate the indexing of the structure from the indexing of the content. The analysis of the first structure enables us to deduce the textual value. The textual value is indexed only once, in a specific component which is in charge of the textual indexing (to answer full-text queries) and of the access to the textual values of fragments (to build XML answers to the queries). When the other structures are analyzed, the consistency of the text is checked, and an alignment by means of spaces, tabulations or end-of-line characters is automatically processed if necessary. For the storage of the textual value and its indexing, we use Apache Lucene3. The analysis of each structure enables the creation of a relational representation of the structures.
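The textual consistency check above can be sketched as follows, a minimal sketch of our own, assuming the comparison works on whitespace-normalized text: two structures are considered textually consistent when their character data are equal up to spaces, tabulations and end-of-line characters.

```python
# Minimal sketch (ours, not the prototype's code): two structures are
# textually consistent when their character data are equal up to spaces,
# tabulations and end-of-line characters.
import re

def normalize(text):
    """Collapse runs of whitespace so that aligned texts compare equal."""
    return re.sub(r"\s+", " ", text).strip()

def consistent(text_a, text_b):
    return normalize(text_a) == normalize(text_b)

same = consistent("one two\nthree", "one  two\tthree ")
```

When the check succeeds, the positions of the second structure's fragments can be re-expressed over the normalized textual value deduced from the first structure.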
We define three main relations to store a given document: (1) the fragments, with their node type (element, attribute, text, ...), their start and end positions in the normalized textual value, and their label; (2) the structures; and (3) the structure contents (id of the structure, id of the fragment, level of the fragment in the structure). Notice that this representation is independent of the textual value: it uses start and end positions (the textual value manager stores the real values) and it computes the level of the fragments in each structure (a fragment can then be shared between structures). In our prototype, we chose to embed a Java RDBMS4, but an external one can also be used. We define an API consisting of the usual DOM [1] accessors extended with operators based on Allen's relations (as defined in Definition 5). We store segmentations as tuples in an RDBMS, so we implemented these operators in SQL. This API provides high-level access to multistructured documents. MSXD instances conform to the DOM API and provide new methods such as access to overlapping fragments. The query language prototype relies on this API. To implement the query engine (part 4 in Figure 9) for this academic prototype, we chose to work step by step and to use standards. First, we translate the user query into the XQuery Core language5. Even if it is not designed to be the foundation of prototypes, we chose it because we need a clean, "simple" language with a well-defined semantics. Then, the XQuery Core query is used to build a query tree, which is optimized before its evaluation. Most of our operators are implemented to work in a pipeline; the XQuery filter operators (operators which give access to the children, ancestors, ... of a given fragment) use the extended DOM API. Notice that the SQL translation remains visible at query time for future optimization (for example, by grouping several SQL queries nested within FLWR operators into a single SQL query).

3 http://lucene.apache.org
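A relational representation along these lines might look as follows. This is a hypothetical sketch (the prototype embeds a Java RDBMS; the table and column names are ours): the three relations just described, and an SQL overlap test expressed on start/stop positions, the kind of query the Allen-based operators compile to.

```python
# Hypothetical sketch (the prototype embeds a Java RDBMS; table and column
# names are ours): the three relations of this section, plus an SQL overlap
# test on start/stop positions.
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE fragment  (id INTEGER PRIMARY KEY, node_type TEXT,
                        start INTEGER, stop INTEGER, label TEXT);
CREATE TABLE structure (id INTEGER PRIMARY KEY, uri TEXT);
CREATE TABLE content   (structure_id INTEGER, fragment_id INTEGER,
                        level INTEGER);
""")
# A Line (physical structure) and a Sentence (syntactic structure) that
# overlap on the shared textual value.
con.executemany("INSERT INTO fragment VALUES (?,?,?,?,?)",
                [(1, "element", 0, 50, "Line"),
                 (2, "element", 40, 90, "Sentence")])
# Two fragments overlap when each one starts before the other one stops.
rows = con.execute("""
    SELECT a.label, b.label FROM fragment a, fragment b
    WHERE a.id < b.id AND a.start < b.stop AND b.start < a.stop
""").fetchall()
```

Because only positions are stored, the same join answers overlap, containment and the other Allen relations by changing the WHERE clause, independently of the textual value itself.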
Figures 10 and 11 show two screenshots: the main window of the application and the query-tree display window. Figure 10 shows a capture of our prototype during the evaluation of query Q2. A user can select (from a set of test cases) or edit an extended XQuery (top right). The automatic translation into XQuery Core is shown bottom right, and the result (either in XML or in the internal format based on start and end positions) is displayed bottom left. The second figure displays the query tree corresponding to the core query, together with dynamic information gathered during the execution (number of items created or filtered by the operators, ...). Finally, we developed an implementation of the multistructured schema validation in which constraints are checked sequentially. In order to obtain a more efficient validation and to provide a more intuitive way to express constraints, we are investigating the use of ontologies. We are currently applying this proposal to a linguistic application of multilevel analysis of multimodal data (OTIM, http://lpl-aix.fr/~otim).

6 Related works

If we look at the XML standards, it seems clear that the standard tree-like model [23] and namespaces [9] could be used to represent multistructured documents if each structure were hierarchical and could be merged with the others. But this is not the case in general. In our example, some elements from the physical and syntactic structures can overlap (Line and Sentence). The problem of overlapping is not recent; see [20] for a review. Several works have studied multistructured documents in the context of XML for document-centric encoding. We classify the main proposals into three categories. The first one concerns the very first works on the representation of several hierarchical structures inside the same text (the CONCUR feature [24] of SGML, TEI6); these solutions are often syntactic.
TEI's solutions require choosing either a flat representation of the multistructured document or a main (hierarchical) structure, and using references (ID/IDREF) for the description of the other structures. The second category is based on proprietary graph models. LMNL7 proposes a new markup language and model designed to overcome the limitations of hierarchical markup in XML and to obtain an instance of a multistructured document. The LMNL graph-based model is not XML-compatible, even if it can import and export XML. Notice that LMNL considers user annotations but offers no solution for querying. MultiX [14] is a proprietary graph-based model; an instance of it can be serialized in a unique XML document. Multistructured querying is achieved by means of a set of XQuery [7] functions which, in particular, deal with overlapping. Based on [14], [29] proposed a methodology for the construction of multistructured documents. This high-level approach aims at defining structures during the construction process. To our knowledge, no other contribution considers a priori the problem of this construction, which leads to restructuring and to the automatic differentiation of structures. We have not yet considered the problem of multistructured document editing. MVDM (Multiview Document Model) [21] is a proprietary graph-based model. It aims at handling multimedia documents and therefore at representing different kinds of relationships between two document entities (and not only hierarchical relations). MVDM focuses on the notion of view, which corresponds to a particular organization of a document. Stored in a document repository, multistructured documents can be queried according to criteria linked to one or several views of that document (automatic generation of SQL queries taking overlapping into account); another solution for navigating the repository is proposed with a multidimensional analysis (OLAP). Finally, Annotation Graphs (AG) [5] come from the linguistic domain.
Annotation graphs propose a proprietary formal model for the representation, over the same flow (audio, text, ...), of several structures (which may not be hierarchical). An XML "flat" representation of an AG is proposed, but no solution for querying. The third category comprises XML-compatible contributions. A very interesting framework is proposed in [19]. It is a new model based on the Goddag data structure [31], which can be seen as a generalization of DOM trees for the representation of multi-hierarchical XML documents. This proposal also defines an extension of XPath [26] to navigate between different structures sharing the same textual data (with specific axes for concurrent querying such as overlapping, xancestor or xdescendant).

Figure 10: Evaluation of XQuery Q2 in our prototype

Figure 11: Tree of the XQuery Core for Q2 in our prototype

| Proposal | Data model | XML compatibility | Main structure | Validation | Operators |
|---|---|---|---|---|---|
| CONCUR | none | SGML syntax | no | no | no |
| TEI (milestones) | none | XML syntax | yes | no | yes |
| Annotation graphs (AG) | proprietary graph | XML syntax for serialisation | no | no | not for querying multistructure, only an XPath linguistic extension |
| LMNL | proprietary graph | specific markup and XML import/export | no | no | no |
| MultiX | proprietary graph | XML import/export | yes | no | specific XQuery functions for querying and structure manipulation |
| Colored trees | extension of the XML data model | XML export | no | no | XPath step extension |
| Based on Goddag | Goddag | DOM extension and XML import/export | no | no | XPath axis extension |
| MSXD | extension of the XML data model | an XML document for each structure | no | weak constraints | XQuery semantics extension and new functions |
| MVDM | proprietary graph | XML syntax per structure | no | no | repository (SQL with overlapping management and multidimensional analysis) |

Figure 12: Proposals related to multistructure

4 http://www.hsqldb.org/
5 http://www.w3.org/TR/xquery-semantics/
6 http://www.tei-c.org/P4X/NH.html
7 http://www.lmnl.net/
Another proposal, colored trees [27], deals with multiple hierarchies in a data-centric context. It aims at sharing atomic data and does not consider overlapping; it is therefore out of our scope. The idea is to build several hierarchies (called colors) over the same set of values (text nodes), so nodes are multicolored. In order to navigate between colors/hierarchies, the authors extend the notion of step in XPath: a step begins with the choice of a color (and thus of a structure) before the usual selection of an axis, a node test and some predicates. Figure 12 summarizes the main features of each proposal related to multistructured documents according to several criteria: data model; XML compatibility; existence of a main structure (whether the user at the logical level or the system at the physical level chooses a main structure, so that the other structures have to be defined with regard to it); validation of a multistructured document (whether it is possible to define a schema for multistructured document validation across multiple hierarchies [20]); and operators (for querying several structures and their content in a concurrent way). No query language has been defined for querying annotation graphs, but there exist operators that complete the XML propositions (XPath in particular [3, 4]); they are related to the linguistic context. Our proposal belongs to the third category: our objective is to remain close to the XML standards. Our model is close to Goddags. Indeed, we want to keep the hierarchical aspect of each structure, so that classical XML tools remain available. But Goddag does not provide mechanisms to add annotations (as LMNL does) and it does not describe relationships between structures to enable validation across multiple hierarchies [20] (none of these proposals, whatever the approach, offers this). We do not detail user annotation, but our proposal supports it (see [11] and [12]).
Annotations represent textual data added by a user to the text in one structure, and thus missing from the other structures. They represent specific information that the user needs to integrate into his or her analysis. For querying several hierarchies over the same textual content in a concurrent way, we chose to extend the semantics of the XQuery filter. Our objective is not to add new axes (as the Goddag-based work does) or to extend the XPath step with colors (as colored trees do). Moreover, even though we support all of Allen's relations, we chose to stay simple and to keep XQuery as unchanged as possible by only adding the necessary functions (unlike MultiX).

7 Conclusion

In this paper, our intention was twofold. First, we defined an XML-compatible model for multistructured textual documents based on the use of hedges (the foundation of RELAX NG). A multistructured textual document is a text over which several concurrent hierarchical structures are defined. Each structure corresponds to an analysis of the text according to a specific use. Secondly, we proposed an extension of XQuery in order to query structures and content in a concurrent way. We applied our proposals to a medieval manuscript written in Occitan. Finally, we described the architecture of an implementation of the MSXD data model and of an XQuery engine extended to support multistructured documents. Our solution is entirely XML-compatible and conforms to the standards. A multistructured textual document is defined as a set of fragments defined over the same textual value and grouped into concurrent hierarchical structures. The key idea is a method for the compact description of multiple trees over a single text, based on segmentation. The segmentation encoding allows querying overlap/containment relations between markups belonging to different structures. We showed how each structure can be described in an XML document. The multistructured textual document itself is never instantiated.
To query a multistructured textual document, we chose to extend the semantics of the XQuery filter. We showed how to take into account equality, overlapping and the other Allen's relations; for that, we added functions and operators to XQuery. We tried to avoid changing the structure of XQuery (as colored trees did, though without considering overlapping). We also did not simply add new axes (as Goddag did with xancestor and xdescendant), although one can notice that such axes make queries easier to read; however, xancestor, xdescendant, or even new axes associated with Allen's relations, can be normalized and rewritten into our proposal. Moreover, we defined a multistructured schema in order to express weak constraints between structures; it is defined as a set of rules in which Allen's relations constrain the relative positions of fragments in the structures. An alternative solution could rely on the use of ontologies, which could offer more flexibility. We are currently applying this proposal to a linguistic application of multilevel analysis of multimodal data (Project OTIM8). Our main perspective is the definition of multiple structures in modalities other than text. For example, it could be useful to define one or several structures associated with the image of a manuscript, as is done for its textual transcription. The objective would then be to manipulate the set of all these structures in a concurrent way. Secondly, we plan to extend our query engine to distribute parts of queries over a P2P network and to enable data sharing.

References

[1] A. Le Hors et al. Document Object Model (DOM) Level 3 Core Specification. Recommendation, W3C, 2004.

[2] J. Allen. Time and time again: The many ways to represent time. International Journal of Intelligent Systems, 6(4):341-355, July 1991.

[3] S. Bird, Y. Chen, S. B. Davidson, H. Lee, and Y. Zheng. Extending XPath to support linguistic queries.
In Proceedings of the ACM Workshop on Programming Language Technologies for XML (PLAN-X), pages 35-46, January 2005.

8 http://lpl-aix.fr/~otim

[4] S. Bird, Y. Chen, S. B. Davidson, H. Lee, and Y. Zheng. Designing and evaluating an XPath dialect for linguistic queries. In Proceedings of the 22nd International Conference on Data Engineering (ICDE'06), April 2006.

[5] S. Bird and M. Liberman. A formal framework for linguistic annotation. Speech Communication, 33(1-2):23-60, September 2001.

[6] P.-V. Biron and A. Malhotra. XML Schema Part 2: Datatypes, second edition. Recommendation, W3C, 2004.

[7] S. Boag. XQuery 1.0: An XML Query Language. Recommendation, W3C, 2007.

[8] M. S. Corradini Bozzi. Etude des textes de matière médico-pharmaceutique en langue d'oc. In Bulletin de l'Association Internationale d'Etudes Occitanes, VIII, pages 29-34, 1990.

[9] T. Bray, D. Hollander, A. Layman, and R. Tobin. Namespaces in XML 1.1, second edition. Recommendation, W3C, 2006.

[10] T. Bray, J. Paoli, and C. M. Sperberg-McQueen. Extensible Markup Language (XML) 1.0. Recommendation, W3C, 1998.

[11] E. Bruno and E. Murisasco. MSXD: a formal model for concurrent structures defined over the same textual data. In Proceedings of the International Conference on Database and Expert Systems Applications (DEXA 2006), pages 172-181. LNCS, 2006.

[12] E. Bruno and E. Murisasco. Describing and querying hierarchical structures defined over the same textual data. In Proceedings of the ACM Symposium on Document Engineering (DocEng 2006), pages 147-154, Amsterdam, The Netherlands, October 2006.

[13] E. Bruno and E. Murisasco. An XML environment for multistructured textual documents. In Proceedings of the Second International Conference on Digital Information Management (ICDIM'07), pages 230-235, Lyon, France, October 2007.

[14] N. Chatti, S. Kaouk, S. Calabretto, and J.-M. Pinon. MultiX: an XML-based formalism to encode multi-structured documents.
In Proceedings of the Extreme Markup Languages Conference, August 6-10, 2007.

[15] J. Clark. XSL Transformations (XSLT) Version 1.0. Recommendation, W3C, 1999.

[16] J. Clark and S. DeRose. XML Path Language (XPath) Version 1.0. Recommendation, W3C, 1999.

[18] D. Draper et al. XQuery 1.0 and XPath 2.0 Formal Semantics. Recommendation, W3C, 2007.

[19] A. Dekhtyar and I.-E. Iacob. A framework for management of concurrent XML markup. Data and Knowledge Engineering, 52(2):185-208, 2005.

[20] S. DeRose. Markup overlap: a review and a horse. In Proceedings of the Extreme Markup Languages Conference, 2004.

[21] K. Djemal, C. Soulé-Dupuy, and C. Vallès-Parlangeau. Modeling and exploitation of multistructured documents. In Proceedings of the IEEE 3rd International Conference on Information and Communication Technologies: From Theory to Applications (ICTTA'08), Damascus, Syria, April 2008.

[22] P. Durusau and M. Brook O'Donnell. Concurrent markup for XML documents. In Proceedings of XML Europe, Atlanta, 2002.

[23] M. Fernandez, A. Malhotra, J. Marsh, M. Nagy, and N. Walsh. XQuery 1.0 and XPath 2.0 Data Model. Recommendation, W3C, 2007.

[24] C. F. Goldfarb and Y. Rubinsky. The SGML Handbook. Clarendon Press, Oxford, 1990.

[25] M. Hilbert, O. Schonefeld, and A. Witt. Making CONCUR work. In Proceedings of the Extreme Markup Languages Conference, August 2005.

[26] I.-E. Iacob and A. Dekhtyar. Towards a query language for multihierarchical XML: Revisiting XPath. In Proceedings of the Eighth International Workshop on the Web and Databases (WebDB'05), pages 43-48, June 2005.

[27] H. V. Jagadish, L. V. S. Lakshmanan, M. Scannapieco, D. Srivastava, and N. Wiwatwattana. Colorful XML: One hierarchy isn't enough. In Proceedings of the International Conference on Management of Data (SIGMOD'04), pages 251-262, 2004.

[28] M. Murata. Hedge automata: a formal model for XML schemata. Web page, 2000.

[29] P.-E. Portier and S. Calabretto. Creation and maintenance of multi-structured documents.
In Proceedings of the ACM Symposium on Document Engineering (DocEng 2009), Munich, Germany, September 2009.

[30] C. M. Sperberg-McQueen and L. Burnard. TEI P4: Guidelines for Electronic Text Encoding and Interchange, 2001.

[31] C. M. Sperberg-McQueen and C. Huitfeldt. GODDAG: A data structure for overlapping hierarchies. In Proceedings of Digital Documents and Electronic Publishing / Principles of Digital Document Processing (DDEP/PODDP'00), pages 139-160, 2000.

[32] J. Tennison and W. Piez. Layered Markup and Annotation Language (LMNL). In Proceedings of the Extreme Markup Languages Conference, 2002.

[33] A. Witt. Multiple hierarchies: new aspects of an old solution. In Proceedings of the Extreme Markup Languages Conference, 2004.

A Sequential Three-Stage Integer Goal Programming (IGP) Model for Faculty-Course-Time-Classroom Assignments

Raed Al-Husain, Mohamad K. Hasan and Hameed Al-Qaheri*
Department of Quantitative Methods and Information Systems, College of Business Administration, Kuwait University, Kuwait
E-mail: raed@cba.edu.kw, mkamal@cba.edu.kw, alqaheri@cba.edu.kw

Keywords: integer goal programming, timetabling, university scheduling problem

Received: March 19, 2010

Developing university schedules that take into account factors such as faculty members' preferences for courses, timeslots, and classrooms, in a timely fashion, while being unbiased and meeting university requirements, is a hurdle in many universities around the world. This paper exploits a three-stage integer goal programming (IGP) technique to solve the university scheduling problem, as an expansion of an earlier two-stage model by the authors. Segmenting the problem into three stages made it possible to reach a complete schedule in a timely manner, with a high level of satisfaction among faculty members, while meeting all university requirements.
The output of every stage is used as the input to the following stage, and goals are satisfied using the priority-sequence approach, according to their order of importance, based on college, department, and major regulations and requirements.

Povzetek: Z novo metodo IGP naredijo univerzitetni urnik. (In Slovenian: a new IGP method is used to build a university timetable.)

1 Introduction

The use of optimization techniques to ensure a more efficient and effective operational workflow has long been a major factor in the success of organizations across industries; the educational sector is no exception. Scheduling problems in universities, such as offering required courses at the same time on the same day, assigning the wrong class size to the wrong classroom, unavoidably biased faculty-course assignments, and the relatively long time needed to complete the schedule, have all been problematic issues associated with manual and judgmental approaches to developing course schedules. This paper exploits a three-stage integer goal programming (IGP) technique to solve the university scheduling problem, as an expansion of an earlier two-stage model by the authors [12]. The three-stage model is developed and solved sequentially: faculty members are assigned to courses, courses are assigned to time slots, and time slots are assigned to classrooms. In our approach, each stage is solved optimally and its outputs are fed as inputs to the following stage. In every stage, university, college, and department regulations are considered as a set of goals to be achieved, along with faculty members' preferences. The model has been tested at the College of Business Administration of Kuwait University using Excel Premium Solver.
The rest of the paper is organized as follows: Section 2 presents a selective review of the literature; Section 3 covers the three-stage integer goal programming (IGP) model formulation; Section 4 covers the experimentation and discusses the results of the three stages, followed by an overall analysis and assessment of the three-stage model in Section 5; conclusions and future research are discussed in Section 6.

2 Review of literature

The idea of developing sophisticated models to solve the university scheduling problem has been around since the early 70s [14] [11]. The techniques used range from optimization models to complex heuristic models. Some models solved only the problem of assigning faculty members to courses [23] [6]. Other models took the time-slot factors into consideration as well [10] [6] [7] [15] [17], and some models took faculty-time-classroom assignment into account [13] [1] [2]. Most of the cited work decomposes the problem into distinct but interrelated stages, rather than solving it as a complex single-stage model. This approach has the advantage of significantly reducing computation time while finding a relatively satisfying solution. Heuristic approaches and decision support systems have also been used to solve the university scheduling problem, in order to overcome the complexities that can arise from optimization techniques. The major reason for using such approaches is to reach a solution relatively close to optimality in a relatively reasonable time [13] [16] [8] [3] [9]. More recently, the use of variable neighbourhood search (VNS) approaches for the university examination timetabling problem has been investigated. The technique has proven to produce high-quality solutions across a wide range of problems, but with a relatively large amount of computational time [18].

* Corresponding author.

Figure 1: Faculty Course Schedule Block Diagram and Information Flow.
Another heuristic approach, used in the College of Engineering at Virginia Tech, is Benders' partitioning. Improved course schedules were obtained, in terms of the total distance travelled by faculty members from their offices to the classrooms where their courses are offered [19]. The genetic algorithm meta-heuristic has been another approach to the university timetabling problem. It considers the timetable in a bottom-up fashion at the various levels of department, faculty or entire university, and is claimed to be the first application of meta-heuristics to a timetabling problem [20]. Hyper-heuristic methods have also been used to solve the university timetabling problem: Burke pursued a novel direction in hyper-heuristics, a unified graph-based hyper-heuristic (GHH) framework, under which a number of local-search-based algorithms are studied that search over sequences of low-level graph colouring heuristics [21]. More complicated approaches have also been used, such as a multi-objective evolutionary algorithm with a variable-length chromosome representation that incorporates a micro-genetic algorithm and a hill-climber for local exploitation, and a goal-based Pareto ranking scheme for assigning the relative strength of solutions [22]. The aim of this paper is to solve the university scheduling problem by extending the work of Badri [6] with a three-stage (sequential) integer goal programming (IGP) model covering different university, college, department and major rules and regulations, as shown in Figure 1. Introducing the goal programming technique into every stage has enabled the fulfilment of many rules and regulations. The use of integer goal programming to develop a comprehensive schedule at the College of Business Administration of Kuwait University has proven to outperform the manual approaches that had previously been adopted.
The results have proven to be both efficient and effective: resources are optimally utilized, faculty members are fairly satisfied, and students progress efficiently and in a timely manner through their graduation requirements. The work in this paper extends an earlier study by the authors [12]. The major difference between this study and those of Badri [6] [7] and Hasan [12] is the inclusion of the classroom availability regulations in the model. Hence, a complete schedule can be developed that takes into account faculty members' preferences for courses and timeslots, as well as the classroom availability regulations of the college.

3 A sequential three-stage integer goal programming (IGP) model formulation

The approach followed to solve the university scheduling problem is to segment the problem into three distinct yet interrelated stages: the faculty-course assignment stage, the course-timeslot assignment stage, and the timeslot-room assignment stage. The inputs of every stage are translated into goals, which are given priorities according to their order of importance and satisfied in that order. The output of every stage, which represents an optimal assignment, is then fed to the next stage as its input. This process continues until the final stage is solved and a complete schedule is created. Figure 1 shows the schematic diagram of the entire modelling process. A detailed description of the three-stage integer goal programming (IGP) model applied to the university scheduling problem is given next; see Hasan et al. [12] for a full description of the two-stage integer goal programming model.
3.1 Stage I: faculty-course assignment integer goal programming (IGP) model formulation

3.1.1 Stage I model notations:
i: faculty member, i = 1, 2, ..., m
j: course number, j = 1, 2, ..., n
L_i: maximum course load of faculty member i
N_j: number of sections offered for course j
X_ij: number of sections of course j that will be assigned to faculty member i
R_ij: the preference of faculty member i to teach course j, where R_ij = {0, 1, 2, 3, 4, 5} such that the value 5 represents a very favourable course, and 0 a course not desired at all.

3.1.2 Stage I model goals and constraints
In this section we formulate the integer goal programming (IGP) model of stage I that represents the rules and regulations of Kuwait University and the College of Business Administration, and the requirements of the Department of Quantitative Methods, for assigning courses to faculty. The stage I IGP model has five goals and one hard constraint, described as follows:

Goal 1: Each faculty member i should take exactly his maximum course load L_i. This goal has priority P_1 and the objective is to minimize both d_1i^- and d_1i^+ ∀i.
Goal 2: All N_j course sections of course j should be covered by faculty members. This goal also has priority P_1 and the objective is to minimize both d_2j^- and d_2j^+ ∀j.
Goal 3: Each faculty member should take at least one College Level Course (CLC) section. This goal is a department-level requirement, is given priority level P_2, and the objective is to minimize d_3i^- ∀i.
Goal 4: Each faculty member should take at least one Major Level Course (MLC) section. This goal represents another department-level requirement, is given priority level P_3, and the objective is to minimize d_4i^- ∀i.
Goal 5: Maximize the total preference of each faculty member i that has course load L_i. This goal has priority P_4 and the objective is to minimize d_5i^- ∀i.
Hard constraint: Finally, we have one hard constraint that does not allow any faculty member i to take more than two sections of the same course j, and that forbids assigning a course the faculty member does not desire at all (R_ij = 0).

Stage I IGP model: Satisfying the goals with their priorities and the other requirements, the stage I IGP model can be written as follows:

Minimize P_1 w [ Σ_{i=1..m} (d_1i^- + d_1i^+) + Σ_{j=1..n} (d_2j^- + d_2j^+) ] + P_2 Σ_{i=1..m} d_3i^- + P_3 Σ_{i=1..m} d_4i^- + P_4 Σ_{i=1..m} d_5i^-

Subject to:
Σ_{j=1..n} X_ij + d_1i^- - d_1i^+ = L_i,    i = 1, 2, ..., m
Σ_{i=1..m} X_ij + d_2j^- - d_2j^+ = N_j,    j = 1, 2, ..., n
Σ_{j∈CLC} X_ij + d_3i^- - d_3i^+ = 1,    i = 1, 2, ..., m
Σ_{j∈MLC} X_ij + d_4i^- - d_4i^+ = 1,    i = 1, 2, ..., m
Σ_{j=1..n} R_ij X_ij + d_5i^- - d_5i^+ = 5 L_i,    i = 1, 2, ..., m
X_ij = 0 if R_ij = 0, and X_ij ≤ 2 otherwise,    i = 1, 2, ..., m and j = 1, 2, ..., n
d_1i^-, d_1i^+, d_2j^-, d_2j^+, d_3i^-, d_3i^+, d_4i^-, d_4i^+, d_5i^-, d_5i^+ ≥ 0    for i = 1, 2, ..., m and j = 1, 2, ..., n

where w is a very big value used to force the deviations d_1i^-, d_1i^+, d_2j^-, d_2j^+ ∀i and ∀j to be zero.
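To make the preemptive-priority mechanics of the stage I model concrete, the following Python sketch solves a deliberately tiny, hypothetical instance (two faculty members, two courses; all data invented for illustration) by brute-force enumeration. The P_1 load and coverage goals are satisfied lexicographically before the P_4 preference goal, standing in for the big-weight w in the objective above; a real instance would be handed to an integer-programming solver rather than enumerated.

```python
from itertools import product

# Toy stage I instance (all numbers hypothetical): m = 2 faculty, n = 2 courses.
L = [2, 2]              # L_i: maximum course load of faculty i
N = [2, 2]              # N_j: number of sections offered for course j
R = [[5, 3],            # R_ij: preference of faculty i for course j (0..5)
     [2, 5]]

best = None
for flat in product(range(3), repeat=4):        # X_ij in {0, 1, 2}
    X = [flat[0:2], flat[2:4]]
    # Hard constraint: X_ij = 0 whenever R_ij = 0 (not binding in this toy data).
    if any(X[i][j] > 0 and R[i][j] == 0 for i in range(2) for j in range(2)):
        continue
    # P1 goals: meet each faculty load exactly, and cover every offered section.
    g_load = sum(abs(sum(X[i]) - L[i]) for i in range(2))
    g_cover = sum(abs(X[0][j] + X[1][j] - N[j]) for j in range(2))
    # P4 goal: under-deviation of each total preference from its target 5 * L_i.
    g_pref = sum(max(0, 5 * L[i] - sum(R[i][j] * X[i][j] for j in range(2)))
                 for i in range(2))
    key = (g_load + g_cover, g_pref)            # preemptive (lexicographic) order
    if best is None or key < best[0]:
        best = (key, X)

print(best)     # each faculty member gets both sections of his preferred course
```

The lexicographic tuple `key` plays the role of the priority levels P_1 and P_4: a solution with a smaller first component always wins, regardless of the preference deviation.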
4.1 Stage II: faculty-course-time assignment integer goal programming model formulation
In this section we formulate the goal programming model that represents the rules and regulations of Kuwait University and the College of Business Administration requirements for assigning time slots to the faculty-course assignments resulting from stage I.

4.1.1 Stage II model notations:
k_ij: section number of course j that is assigned to faculty i, where k_ij = {1, 2, 3, ..., K}
d: days of the week, where d = {1, 2} such that 1 = {Sunday, Tuesday, Thursday} and 2 = {Monday, Wednesday}
p: period of the day, where p = {m, a} such that m = morning time and a = afternoon time
t_dp: time slot number in a day d and period p, where t_dp = {1, 2, ...}
O_tdp: number of rooms available at time slot t_dp
X_{ij,k_ij,t_dp} = 1 if faculty i is assigned to section k_ij of course j during time slot t of day d in period p, and 0 otherwise

4.1.2 Stage II model goals and constraints:
The stage II IGP model has seven goals and three hard constraints, described as follows:

Goal 1: The total number of courses assigned to a specific time slot cannot exceed the number of rooms available in that time slot. This goal has priority P_1 and the objective is to minimize d_1,t_dp^+ ∀t_dp.
Goal 2: This goal eliminates timing conflicts among sections of the same CLC that can be taken at the same time: the total number of sections of the same CLC assigned to a specific morning time slot cannot exceed 2. This goal has priority P_2 and the objective is to minimize d_2j,t_dm^+ ∀j ∈ CLC and t_dm.
Goal 3: The total number of sections of the same CLC assigned to a specific afternoon time slot cannot exceed 1. This goal has priority P_3 and the objective is to minimize d_3j,t_da^+ ∀j ∈ CLC and t_da.
Goal 4: Reduce the gaps between MLC. The MLC should be 4 times more condensed during the morning time than during the afternoon time, where 4 is just any number that the department wishes to choose. This goal has priority P_4 and the objective is to minimize d_4^-.
Goal 5: This is more of a guideline: 60% of courses should preferably be offered during the odd days and 40% during the even days (hence the coefficient 60/40 = 1.5 in the model below). This goal has priority P_5 and the objective is to minimize d_5^-.
Goal 6: This is also a guideline: 70% of courses should preferably be offered during the morning time and 30% during the afternoon time (coefficient 70/30 ≈ 2.3), applied separately to each day group. These goals have priority P_6 and the objective is to minimize d_6^- and d_7^-.
Goal 7: Maximize the faculty preferences on their class times. This goal has priority P_7 and the objective is to minimize d_8i^- ∀i.
Hard constraints:
1. The number of sections taught by a faculty member in any specific time slot must be at most 1.
2. The number of MLC sections offered in any specific time slot must be at most 1.
3. Each section of every course assigned to a faculty member must receive exactly one time slot.

Stage II IGP model: Satisfying the goals with their priorities and the other requirements, the stage II IGP model can be written as follows:

Minimize P_1 Σ_{t_dp} d_1,t_dp^+ + P_2 (Σ_j Σ_{t_dm} d_2j,t_dm^+) + P_3 (Σ_j Σ_{t_da} d_3j,t_da^+) + P_4 d_4^- + P_5 d_5^- + P_6 (d_6^- + d_7^-) + P_7 Σ_i d_8i^-

Subject to:
Σ_i Σ_j Σ_{k_ij} X_{ij,k_ij,t_dp} - d_1,t_dp^+ + d_1,t_dp^- = O_tdp    ∀t_dp
Σ_i Σ_{k_ij} X_{ij,k_ij,t_dm} - d_2j,t_dm^+ + d_2j,t_dm^- = 2    ∀j ∈ CLC and t_dm
Σ_i Σ_{k_ij} X_{ij,k_ij,t_da} - d_3j,t_da^+ + d_3j,t_da^- = 1    ∀j ∈ CLC and t_da
Σ_i Σ_{j∈MLC} Σ_{k_ij} Σ_{t_dm} X_{ij,k_ij,t_dm} - 4 (Σ_i Σ_{j∈MLC} Σ_{k_ij} Σ_{t_da} X_{ij,k_ij,t_da}) - d_4^+ + d_4^- = 0
Σ_i Σ_j Σ_{k_ij} Σ_{t_1p} X_{ij,k_ij,t_1p} - 1.5 (Σ_i Σ_j Σ_{k_ij} Σ_{t_2p} X_{ij,k_ij,t_2p}) - d_5^+ + d_5^- = 0
Σ_i Σ_j Σ_{k_ij} Σ_{t_1m} X_{ij,k_ij,t_1m} - 2.3 (Σ_i Σ_j Σ_{k_ij} Σ_{t_1a} X_{ij,k_ij,t_1a}) - d_6^+ + d_6^- = 0
Σ_i Σ_j Σ_{k_ij} Σ_{t_2m} X_{ij,k_ij,t_2m} - 2.3 (Σ_i Σ_j Σ_{k_ij} Σ_{t_2a} X_{ij,k_ij,t_2a}) - d_7^+ + d_7^- = 0
Σ_j Σ_{k_ij} Σ_{t_dp} R_{ij,k_ij,t_dp} X_{ij,k_ij,t_dp} - d_8i^+ + d_8i^- = M    ∀i, where M is a large number
Σ_j Σ_{k_ij} X_{ij,k_ij,t_dp} ≤ 1    ∀i, ∀t_dp
Σ_i Σ_{j∈MLC} X_{ij,k_ij,t_dp} ≤ 1    ∀t_dp
Σ_{t_dp} X_{ij,k_ij,t_dp} = 1    ∀i, j, k_ij
all X_{ij,k_ij,t_dp} are binary, and all d's ≥ 0

4.2 Stage III: room assignment integer goal programming model formulation

4.2.1 Stage III model notations:
L: floor level, where L = {1, 2, 3}
C: room size category, where C = {1, 2, 3, 4}
r_LC: room number on floor level L and of size category C, where r_LC = {101, ..., 112, 201, ..., 214, 301, ..., 318}
D_L: department number based on the floor level location, where D_L = {1, 2, 3}
p_LC: room level location preference, where p_LC = {2, 4, 5} such that 2 is the least preferred and 5 the most preferred room level location for a course to be placed.

4.2.2 Stage III model goals and constraints:
The stage III IGP model has one goal and two hard constraints, described as follows:

Goal 1: The primary goal of the stage III model is to locate each previously assigned course in a room of the right size, as close as possible to the department that is offering the course. This is accomplished by achieving a high enough sum-product of the decision variable X_{ijc,k_ijc,t_dp,r_LC} (the assignment of faculty i to section k_ijc of course j_c of size category C in time period t_dp to room location r_LC) with the location preference p_LC, where M is a large number. The objective is to minimize the under-deviation d^- of this sum-product.
Hard constraints:
1. Each section of a course that has been assigned a specific faculty and time should be located in one room only.
2. Each room is assigned to at most one faculty in a specific time period.

Stage III IGP model:
Minimize d^-
Subject to:
Σ_i Σ_j Σ_k Σ_t Σ_{r_LC} p_LC X_{ijc,k_ijc,t_dp,r_LC} + d^- - d^+ = M
Σ_L Σ_C Σ_{r_LC} X_{ijc,k_ijc,t_dp,r_LC} = 1    ∀t_dp, i, k_ijc
Σ_i X_{ijc,k_ijc,t_dp,r_LC} ≤ 1    ∀t_dp, L, C, r_LC
all X_{ijc,k_ijc,t_dp,r_LC} are binary, and all d's ≥ 0

5 Experimentation
The model was initially applied to generate the schedule for 4 different majors representing 2 different departments at the College of Business Administration in Kuwait University for the fall 2009 semester: the Marketing major (MRKT) at the Department of Management and Marketing, and the majors of Information Systems (IS), Logistics and Supply Chain Management (LSCM) and Statistics (STAT) at the Department of Quantitative Methods and Information Systems (QMIS).

5.1 Data collection
Each of the above-mentioned majors had to fill in the model inputs for every stage sequentially until the final schedule was completed.
Model inputs include faculty members, the number of courses and their sections to be offered, faculty-course preferences, the required teaching load of every faculty member, course-timeslot preferences, and the university, college, and department rules and regulations of assignment. Examples of these regulations include the ratio of courses to be offered in the morning sessions versus the afternoon sessions, the ratio of courses to be offered during Day 1 (Sunday, Tuesday, and Thursday) versus Day 2 (Monday and Wednesday), and the amount of dispersion of major level classes. For further discussion of the stage I and stage II models, please refer to [12]. The same procedure has been followed in stage III, the timeslot-room assignment model. Inputs of this model include room information, i.e. the number of rooms, their size category, and their floor location. Each room was given a size category, Room Category (RC), based on its capacity, as shown in Table 1. This distinction ensures that rooms are assigned only to courses of the right Expected Course Category (ECC) size. Moreover, departments are located on 3 different levels, Room Level (RL), at the College of Business Administration in Kuwait University; hence rooms were distinguished based on their floor level in order to assign them as close as possible to the department that is offering the course. Table 2 shows the room characterization, where RN is the Room Number.

Table 1: Room Size Category.
Category  Capacity
1         25
2         30
3         40-44
4         65

Table 2: Room Characterization.
RL  RC  RN
1   2   105-106, 111-112
1   3   101-104, 107-110
2   2   205-207, 212-214
2   3   201-204, 208-211
3   1   312-314
3   2   305-307
3   3   301-304, 308-311
3   4   315-318

5.2 Stage I results
The output of this stage, the faculty-course assignment stage, represents an optimal assignment of faculty members to courses and their sections according to the imposed rules and regulations. Thirteen different scenarios were tested to ensure the effectiveness of the model.
These scenarios take into account three different cases that could arise when developing a schedule: the faculties' loads = the total course sections offered, the faculties' loads > the total course sections offered, and the faculties' loads < the total course sections offered. In the first case, most goals were 100% met except for the last goal, the faculty-course preferences goal. The satisfaction level of this goal, i.e. faculty getting their first choice of courses, ranged from 73.3% to 85.2%. However, when it came to second-choice preferences, all faculties were 100% satisfied. In the second and third cases, where the load of the faculties available does not equal the amount of courses offered, the satisfaction of the goals ranged from 54.9% to 100%, depending on the amount of variation between the faculties' load available and the amount of courses offered. For further discussion of stage I results, please refer to [12].

5.3 Stage II results
The output of this stage, the course-timeslot assignment stage, represents an optimal assignment of the courses that were already assigned to different faculty members to timeslots, according to the imposed rules and regulations. Most goals were met up to 100% with the exception of goal 4, the dispersion of MLC in the morning versus the afternoon timeslots. The model was able to condense the MLC during the morning timeslots as desired, yet there was an underachievement of this goal by 42%. Moreover, about 90% of the faculties got their first choice of preferences when it came to their desired timeslot in the schedule. Combining the faculty satisfaction levels of the two stages, 73.6% of the faculties were able to get their first choices of preferences, and 100% of the faculties were able to get at least their second choice of preferences. For further discussion of the stage II model, please refer to [12].
5.4 Stage III results
Upon the completion of stages I and II of the model, an optimal assignment of faculties to courses, and then of those courses to different timeslots, is obtained. The result is then used as an input to the stage III model, the timeslot-room assignment. Based on the formulation of the stage III model, a complete schedule was obtained. Table 3 shows part of the generated schedule. All timeslots were successfully assigned to different room locations, the right course sizes were assigned to the right room sizes, and courses were distributed in the college to the desired floors based on the departments offering these courses.

Table 3: Room Assignment Distribution (columns: Dep. I.D., Major I.D., Dep. Location, Course Level, Time, Faculty I.D., Course I.D., Section No., ECC, RL, RC, RN).

6 Overall analysis and assessment of the three-stage model
Breaking the university scheduling problem into three stages has greatly improved the solution process and computation time of such a complex problem. Once all the required input data of every stage were available, the computation time for each stage was a few seconds using the Excel Premium Solver.
Moreover, although the output of every stage represents a local optimum of the overall problem, considering the satisfaction level of assigning faculties to courses and courses to different timeslots, the efficient and effective allocation of timeslots to the right rooms, and the computation time of solving the entire problem, the decomposition of the scheduling problem is considered an advantage rather than a disadvantage. On the other hand, solving the entire scheduling problem in one complex model might result in an infeasible solution when a global optimum is desired.

7 Conclusion and future research
Developing effective, unbiased, and timely schedules has long been an issue in universities around the world. The utilization of optimization techniques, however, has proven capable of overcoming such a complex problem. Although different approaches have been used to resolve this problem and reach an "optimal" schedule, the consideration of factors such as faculties' preferences for different courses and timeslots, an efficient room assignment, and university rules and regulations of assignment all at once has been a hindrance. Moreover, computation time has always been a problem when all of the above factors were considered in one complex model. This paper utilizes the integer goal programming (IGP) technique and the idea of breaking (decomposing) the problem into smaller subproblems, i.e. different stages, in order to simplify the formulation and swiftly reach a satisfying solution to the overall scheduling problem. The method used in satisfying goals is the priority sequence approach, where goals are satisfied according to their order of importance based on university, college, and department regulations and requirements. The output of every stage has been used as an input to the subsequent stage until a complete schedule is developed.
After the successful results of the first two stages of the model were verified in an earlier study conducted by the authors, a new stage, the timeslot-room assignment stage, was added to the previous model and contributed to the development of a complete schedule that takes into account all the different factors desired when developing a schedule. The overall model has been tested at the College of Business Administration in Kuwait University using 4 different majors in 2 different departments. Results showed that the faculties' satisfaction level reached up to 85.2% in stage I and 88.8% in stage II of the model, as shown in an earlier study. The overall satisfaction level when combining the two results reached up to 73.6% as far as faculties getting their first choices of preferences. Nonetheless, the faculties' satisfaction level reached up to 100% when it came to getting at least their second choices of preferences. The room assignment stage successfully used the results obtained in the previous stages and efficiently distributed courses with assigned timeslots to the desired room locations. Work is underway to eventually integrate the three-stage model of this paper with a Decision Support System (DSS), such as Cornell's SchedulExpert, in order to build an ultimate scheduling tool that will enable users to develop quick and effective schedules that are demand driven by the students, through the new development of a student planner DSS, rather than supply driven by the college. The integration of the university scheduling DSS and the student planner DSS into a unique integrated DSS will be a great tool that will efficiently and effectively enhance the whole Kuwait University registration system.

References
[1] Al-Yakoob, S. M. and Sherali, H. D. (2006). Mathematical programming models and algorithms for a class-faculty assignment problem. European Journal of Operational Research 173, 488-507.
[2] Al-Yakoob, S. M. and Sherali, H. D. (2007).
A mixed-integer programming approach to a class timetabling problem: A case study with gender policies and traffic considerations. European Journal of Operational Research 180, 1028-1044.
[3] Aladag, C., Hocaoglu, G. and Basaran, M. (2009). The effect of neighborhood structures on tabu search algorithm in solving course timetabling problem. Expert Systems with Applications, in press.
[4] Anderson, D., Sweeney, D., Williams, T. (2008). An Introduction to Management Science: Quantitative Approaches to Decision Making, 11e.
[5] Badri, M. A. (1996). A Linear Goal Programming Model for Solving the Optimal Faculty Assignment Problem. King Saud University Journal, Vol. 8, Business Administration 2, 537-566, Riyadh.
[6] Badri, M. A. (1996). A two-stage multi-objective scheduling model for [faculty-course-time] assignments. European Journal of Operational Research 94 (1), 16-28.
[7] Badri, M. A., Davis, D. L., Davis, D. F., Hollingsworth, J. (1998). A multi-objective course scheduling model: Combining faculty preferences for courses and times. Computers and Operations Research 25 (4), 303-316.
[8] Broek, J., Hurkens, C., and Woeginger, G. (2009). Timetabling problems at the TU Eindhoven. European Journal of Operational Research 196, 877-885.
[9] De Causmaecker, P., Demeester, P., and Berghe, G. (2009). A decomposed metaheuristic approach for a real-world university timetabling problem. European Journal of Operational Research 195, 307-318.
[10] Dinkel, J. and Mote, J. (1989). An Efficient Decision Support System For Academic Course Scheduling. Operations Research 37 (6), 853-864.
[11] Harwood, G. and Lawless, R. (1975). Optimizing organizational goals in assigning faculty teaching schedules. Decision Science 6, 513-524.
[12] Hasan, M., Al-Husain, R., Al-Qaheri, H. (2009). Course Assignment for University Faculties. Arab Journal of Administrative Sciences, Vol. 17, No. 1, pp. 169-191.
[13] Hinkin, T. R. and Thompson, G. M. (2002).
SchedulExpert: Scheduling Courses in the Cornell University School of Hotel Administration. Interfaces 32 (6), 45-57.
[14] Lee, S. and Schniederjans, M. (1983). Multicriteria Assignment Problem: A goal programming approach. Interfaces 13, 75-81.
[15] Mirrazavi, S. K., Mardle, S. J. and Tamiz, M. (2003). A two-phase multiple objective approach to university timetabling utilizing optimization and evolutionary solution methodologies. Journal of the Operational Research Society 54, 1155-1166.
[16] Pongcharoen, P., Promtet, W., Yenradee, P., and Hicks, C. (2008). Stochastic Optimization Timetabling Tool for University Course Scheduling. International Journal of Production Economics 112, 903-918.
[17] Valouxis, C. and Housos, E. (2003). Constraint programming approach for school timetabling. Computers & Operations Research 30, 1555-1572.
[18] Burke, E. K., Eckersley, A. J., McCollum, B., Petrovic, S., Qu, R. (2010). Hybrid variable neighbourhood approaches to university exam timetabling. European Journal of Operational Research, Vol. 206, No. 1, pp. 46-53.
[19] Sarin, S. C., Wang, Y., Varadarajan, A. (2010). A university-timetabling problem and its solution using Benders' partitioning -- a case study. Journal of Scheduling, Vol. 13, No. 2, pp. 131-141.
[20] Adewumi, A. O., Sawyerr, B. A., Ali, M. M. (2009). A heuristic solution to the university timetabling problem. Engineering Computations, Vol. 26, No. 8, pp. 972-984.
[21] Qu, R., Burke, E. K. (2009). Hybridizations within a graph-based hyper-heuristic framework for university timetabling problems. The Journal of the Operational Research Society, Vol. 60, No. 9, pp. 1273-1285.
[22] Cheong, C. Y., Tan, K. C., Veeravalli, B. (2009). A multi-objective evolutionary algorithm for examination timetabling. Journal of Scheduling, Vol. 12, No. 2, pp. 121-146.
[23] Schniederjans, M. and Kim, G. (1987). A Goal Programming Model to Optimize Departmental Preference in Course Assignments.
Computers and Operations Research 14 (2), 87-96.

Performance Comparison Study of Multicast Routing Protocols for Mobile Ad hoc Networks under Default Flooding and Density and Mobility Aware Energy-Efficient (DMEF) Broadcast Strategies

Natarajan Meghanathan
Department of Computer Science, Jackson State University, Jackson, MS 39217, USA
E-mail: natarajan.meghanathan@jsums.edu

Keywords: mobile ad hoc networks, broadcast, energy-efficiency, multicast routing, stability, simulation

Received: June 10, 2010

Recently, we proposed a novel network density and mobility aware energy-efficient broadcast route discovery strategy (called DMEF) to discover stable routes in mobile ad hoc networks (MANETs). DMEF works by letting each node dynamically choose its own broadcast transmission range for the route discovery messages, depending on the perceived number of neighbour nodes within its default maximum transmission range and the node's own mobility at the time of route discovery. A node surrounded by more neighbours makes itself available to a smaller neighbourhood, and vice versa. Similarly, a slow-moving node broadcasts the route discovery message to a majority of its neighbours so that links formed using this node can be more stable. A fast-moving node advertises itself only to the neighbours closer to it. The effectiveness of DMEF has so far been tested only for MANET unicast and multi-path routing protocols. In this paper, we study the impact of DMEF on the performance of MANET multicast routing protocols. We investigate the minimum-hop based Multicast Ad hoc On-demand Distance Vector (MAODV) routing protocol, the minimum-link based Bandwidth-Efficient Multicast Routing Protocol (BEMRP) and our recently proposed non-receiver aware and receiver aware multicast extensions to the Location Prediction Based Routing protocol (NR-MLPBR and R-MLPBR).
Exhaustive simulation studies of these multicast routing protocols with DMEF and the default flooding as the route discovery strategies have been conducted. Performance results for each multicast routing protocol illustrate that DMEF is effective in discovering multicast trees that exist for a longer time, with a lower energy consumed per node and without any appreciable increase in the hop count per source-receiver path. Povzetek: Predstavljeno je testiranje nove metode DMEF za iskanje stabilnih povezav v mobilnih omrežjih.

1 Introduction
A mobile ad hoc network (MANET) is a dynamic distributed system of mobile, autonomous wireless nodes. The network has limited bandwidth and the nodes have limited battery charge. In order to conserve battery charge, each node has a limited transmission range (i.e., it transmits data signals only over a limited distance). As a result, MANET routes are typically multi-hop in nature. As nodes move independently of each other, routes between a source and a destination node often break and new routes have to be discovered. MANET routing protocols are of two types: proactive protocols require the nodes to periodically exchange table updates to pre-determine routes between any pair of source-destination nodes, while reactive protocols determine routes only when a route is required from a source to a destination. In dynamically changing environments, typical of MANETs, reactive on-demand routing protocols incur lower control overhead to discover routes compared to proactive routing protocols [5]. In this paper, we work only with reactive routing protocols. Flooding is the default route discovery approach for on-demand MANET routing protocols [14]. The flooding algorithm to discover routes can be briefly explained as follows: whenever a source node needs a route to a destination node, it broadcasts a Route Request (RREQ) message to its neighbours.
Neighbour nodes of the source node broadcast the received RREQ further, if they have not already done so. A RREQ message for a particular route discovery process is forwarded by a node exactly once. The destination node receives the RREQs along several routes, selects the best route according to the route selection principles of the particular routing protocol, and notifies the source of the selected route through a Route-Reply (RREP) packet. The source then starts sending data packets on the discovered route. Flooding is inefficient and consumes significantly high energy and bandwidth. When a node receives a message for the first time in its neighbourhood, at least 39% of the neighbourhood would have seen it already, and on average only 41% of additional area can be covered with a rebroadcast [15]. In an earlier work [11], we proposed a novel density and mobility aware energy-efficient broadcast strategy, referred to as DMEF, to reduce the energy consumption of broadcast route discoveries by letting a node broadcast only within a limited neighbourhood. The neighbourhood size to which a node advertises itself as part of the route discovery process is decided independently at the node, based on the number of neighbours surrounding the node and the mobility of the node. The neighbourhood size for rebroadcast is reduced in such a way that the RREQ packets still make it to the destination through one or more paths with a reduced energy spent per route discovery, and such paths are also more stable compared to those discovered using flooding. The effectiveness of DMEF has so far been studied only for MANET unicast [11] and multi-path routing protocols [12]. In this paper, we study the impact of DMEF on the performance of MANET multicast routing protocols. Multicasting is the process of sending a stream of data from one source node to multiple recipients by establishing a routing tree, which is an acyclic connected subgraph of the entire network.
The set of receiver nodes forms the multicast group. While propagating down the tree, data is duplicated only when necessary. This is better than multiple unicast transmissions. Multicasting in ad hoc wireless networks has numerous applications [21]: collaborative and distributed computing, civilian operations, emergency search and rescue, law enforcement, warfare situations, etc. We investigate the minimum-hop based Multicast Ad hoc On-demand Distance Vector (MAODV) routing protocol [18], the minimum-link based Bandwidth-Efficient Multicast Routing Protocol (BEMRP) [16] and our recently proposed non-receiver aware and receiver aware multicast extensions to the Location Prediction Based Routing protocol [9], referred to as the NR-MLPBR and R-MLPBR protocols [10], respectively. Exhaustive simulation studies of these multicast routing protocols with DMEF and the default flooding as the route discovery strategies have been conducted in this paper. Performance results for each multicast routing protocol illustrate that DMEF is effective in discovering multicast trees that exist for a longer time, with a lower energy consumed per node and without any appreciable increase in the hop count per source-receiver path. The rest of the paper is organized as follows: Section 2 briefly describes the DMEF strategy. Section 3 reviews the multicast routing protocols studied. Section 4 discusses the simulation environment and presents the simulation results illustrating the effectiveness of DMEF vis-à-vis flooding. Section 5 reviews state-of-the-art related work on different optimal broadcast route discovery strategies proposed in the literature and discusses the advantages of DMEF and the differences with related work. Section 6 concludes the paper and discusses future work. Throughout this paper, the terms 'path' and 'route', 'link' and 'edge', and 'message' and 'packet' are used interchangeably. They mean the same.
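As a minimal illustration of the flooding-based route discovery described in the introduction, the sketch below simulates RREQ propagation over a small hypothetical topology. It simplifies by letting every node, including the destination, rebroadcast exactly once, and by having the destination accept the first RREQ copy, which yields a minimum-hop route.

```python
from collections import deque

def flood_route_discovery(adj, source, dest):
    """Simulate flooding: every node forwards the RREQ exactly once; the first
    copy to reach a node defines its upstream link, and the destination's
    RREP retraces that reverse path back to the source."""
    parent = {source: None}
    queue = deque([source])
    rebroadcasts = 0
    while queue:
        u = queue.popleft()
        rebroadcasts += 1              # node u broadcasts the RREQ once
        for v in adj[u]:
            if v not in parent:        # duplicate RREQ copies are dropped
                parent[v] = u
                queue.append(v)
    route, n = [], dest
    while n is not None:               # walk the reverse path for the RREP
        route.append(n)
        n = parent[n]
    return route[::-1], rebroadcasts

# Hypothetical 5-node topology: node 0 is the source, node 4 the destination.
adj = {0: [1, 2], 1: [0, 3], 2: [0, 3], 3: [1, 2, 4], 4: [3]}
route, cost = flood_route_discovery(adj, 0, 4)
print(route, cost)    # minimum-hop route [0, 1, 3, 4]; all 5 nodes rebroadcast
```

Note that every node in the connected component rebroadcasts once: it is exactly this blanket rebroadcast cost that DMEF, described next, tries to reduce.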
2 DMEF strategy

2.1 Terminology and assumptions
Every node (say node u) in the network is configured with a maximum transmission range (Range_Max). If the distance between two nodes is less than or equal to the maximum transmission range, the two nodes are said to be within the "complete neighbourhood" of each other. Each node periodically broadcasts a beacon message in its complete neighbourhood. The time between two successive broadcasts is chosen uniform-randomly by each node from the range [0...T_wait]. Using this strategy, each node learns the number of nodes in its complete neighbourhood.

2.2 Basic idea of DMEF
The twin objectives of DMEF are to discover stable routes with a reduced energy consumption compared to that incurred using flooding. DMEF achieves this by considering the number of neighbours of a node (a measure of node density) and node mobility. The basic idea behind DMEF is as follows: the transmission range of a RREQ broadcast for route discovery is not fixed for every node. A node surrounded by more neighbours in its complete neighbourhood should broadcast the RREQ message only within a smaller neighbourhood that is sufficient for the message to be picked up and forwarded to the other nodes in the rest of the network. On the other hand, a node surrounded by fewer neighbours in its complete neighbourhood should broadcast the RREQ message to a larger neighbourhood (but still contained within the complete neighbourhood) so that a majority of the nodes in the complete neighbourhood can pick up the message and rebroadcast it further. A node rebroadcasts a RREQ message at most once. The density aspect of DMEF thus helps to reduce the unnecessary transmission and reception of broadcast RREQ messages and conserves energy.
To discover stable routes that exist for a longer time, DMEF adopts the following approach: a node that is highly mobile makes itself available only to a smaller neighbourhood around itself, whereas a node that is less mobile makes itself available over a larger neighbourhood (but still contained within the complete neighbourhood). The reasoning is that links involving a slow-moving node will exist for a longer time. Hence, it is better for a slow-moving node to advertise itself to a larger neighbourhood, so that the links (involving this node) that become part of the routes discovered will exist for a longer time. On the other hand, a fast-moving node will have links of relatively longer lifetime only with neighbours that are closer to it. Hence, it is worthwhile to let a fast-moving node advertise only to its nearby neighbours.

2.3 DMEF mathematical model

DMEF effectively uses the knowledge of neighbourhood node density and mobility so that they complement each other in discovering stable routes in a more energy-efficient fashion. The transmission range, Range_RREQ(u), used by a node u to rebroadcast a RREQ message is given by the following model:

    Range_RREQ(u) = Range_Max - (|Neighbors_u| * v_u^β) / α        (1)

The idea behind the formulation of equation (1) is that the larger the value of the term (|Neighbors_u| * v_u^β) / α, the lower the transmission range chosen by the node for broadcasting the RREQ message. For fixed values of the parameters α and β, this term becomes larger for a node that has a larger number of neighbours and/or is moving with a larger velocity. In order to ensure that Range_RREQ(u) is always greater than or equal to zero, the value of the parameter α has to be chosen carefully.
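As a quick numerical illustration of the model, the sketch below (a hypothetical helper of ours, assuming the range expression Range_Max - (|Neighbours| * v^β) / α with α large enough to keep the result non-negative) shows that, for the same α and β, a denser and faster node ends up with a smaller RREQ broadcast range:

```python
def rreq_range(num_neighbours, velocity, alpha, beta, range_max=250.0):
    """Equation (1): RangeRREQ = RangeMax - (|Neighbours| * velocity**beta) / alpha.
    The clamp to zero should never trigger if alpha is chosen as required."""
    return max(range_max - (num_neighbours * velocity ** beta) / alpha, 0.0)

# Same alpha and beta: the denser, faster node gets the smaller range.
sparse_slow = rreq_range(5, 5.0, 10, 1.3)    # ~246 m
dense_fast = rreq_range(15, 15.0, 10, 1.3)   # ~199 m
print(dense_fast < sparse_slow)  # True
```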
For a given value of the parameter β, the necessary condition is:

    α ≥ (|Neighbors_u| * v_u^β) / Range_Max        (2)

In practice, the value of α has to be sufficiently larger than the value obtained from (2), so that the RREQ message reaches neighbours who can forward the message further to the rest of the network. Otherwise, certain source-destination pairs may not be reachable from one another even though one or more paths may exist between them in the underlying network.

2.4 Dynamic selection of DMEF parameter values

The specialty of DMEF is that it allows each node to dynamically and independently choose at run-time the appropriate values of the critical operating parameters α and β, depending on the perceived number of nodes in the complete neighbourhood of the node and the node's own velocity. A node simply has to be pre-programmed with the values of α and β to be chosen for different numbers of nodes in the complete neighbourhood and different node velocities. Let maxNeighb_lowDensity and maxNeighb_modDensity represent the maximum number of neighbours a node should have in order to conclude that the complete neighbourhood density of the node is low and moderate respectively. If a node has more than maxNeighb_modDensity neighbours, the node is said to exist in a complete neighbourhood of high density. Let lowDensity_α, modDensity_α and highDensity_α represent the values of α to be chosen by a node for complete neighbourhoods of low, moderate and high density respectively. Let maxVel_lowMobility and maxVel_modMobility represent the maximum velocity values for a node in order to conclude that the mobility of the node is low and moderate respectively. If the velocity of a node is more than maxVel_modMobility, the mobility of the node is said to be high. Let lowMobility_β, modMobility_β and highMobility_β represent the values of β to be chosen by a node when its mobility is low, moderate and high respectively.
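The parameter-selection rules just described can be sketched as follows. This is a hypothetical Python rendering of ours, not the paper's code; the pre-programmed thresholds and values are taken from Table 1 in Section 4.

```python
# Pre-programmed values from Table 1 (Section 4).
MAX_NEIGHB_LOW_DENSITY, MAX_NEIGHB_MOD_DENSITY = 5, 10
LOW_DENSITY_A, MOD_DENSITY_A, HIGH_DENSITY_A = 5, 10, 20
MAX_VEL_LOW_MOBILITY, MAX_VEL_MOD_MOBILITY = 5, 15
LOW_MOBILITY_B, MOD_MOBILITY_B, HIGH_MOBILITY_B = 1.6, 1.3, 1.1
RANGE_MAX = 250.0

def dmef_parameter_selection(num_neighbours, velocity):
    """Dynamically pick (alpha, beta) for a node from its perceived
    neighbourhood density and its own velocity."""
    # beta from the node's mobility class.
    if velocity <= MAX_VEL_LOW_MOBILITY:
        beta = LOW_MOBILITY_B
    elif velocity <= MAX_VEL_MOD_MOBILITY:
        beta = MOD_MOBILITY_B
    else:
        beta = HIGH_MOBILITY_B
    # Smallest alpha that keeps the RREQ range of equation (1) non-negative.
    minimum_alpha = (num_neighbours * velocity ** beta) / RANGE_MAX
    # alpha from the node's density class, never below minimum_alpha.
    if num_neighbours <= MAX_NEIGHB_LOW_DENSITY:
        alpha = max(minimum_alpha, LOW_DENSITY_A)
    elif num_neighbours <= MAX_NEIGHB_MOD_DENSITY:
        alpha = max(minimum_alpha, MOD_DENSITY_A)
    else:
        alpha = max(minimum_alpha, HIGH_DENSITY_A)
    return alpha, beta

print(dmef_parameter_selection(15, 20.0))  # (20, 1.1): high density, high mobility
print(dmef_parameter_selection(4, 3.0))    # (5, 1.6): low density, low mobility
```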
Let Neighbors_u^t and v_u^t represent the set of neighbours in the complete neighbourhood and the velocity of a node u at time t. Note that the set Neighbors_u^t is determined by node u based on the latest periodic beacon exchange in the complete neighbourhood formed by the maximum transmission range, Range_Max. The algorithm, DMEF_Parameter_Selection, to dynamically choose the values of the parameters α and β (represented as α_u^t and β_u^t) is illustrated in Figure 1:

Input: Neighbors_u^t and v_u^t
Auxiliary variables:
    minimum_α   // minimum value of α needed to keep the transmission range of a node from becoming negative
    Range_Max   // the maximum transmission range of a node for the complete neighbourhood
Density-related variables: maxNeighb_lowDensity, maxNeighb_modDensity, lowDensity_α, modDensity_α, highDensity_α
Node-velocity-related variables: maxVel_lowMobility, maxVel_modMobility, lowMobility_β, modMobility_β, highMobility_β
Output: α_u^t and β_u^t

Begin DMEF_Parameter_Selection
    if (v_u^t ≤ maxVel_lowMobility)
        β_u^t ← lowMobility_β
    else if (v_u^t ≤ maxVel_modMobility)
        β_u^t ← modMobility_β
    else
        β_u^t ← highMobility_β
    minimum_α ← (|Neighbors_u^t| * (v_u^t)^(β_u^t)) / Range_Max
    if (|Neighbors_u^t| ≤ maxNeighb_lowDensity)
        α_u^t ← Maximum(minimum_α, lowDensity_α)
    else if (|Neighbors_u^t| ≤ maxNeighb_modDensity)
        α_u^t ← Maximum(minimum_α, modDensity_α)
    else
        α_u^t ← Maximum(minimum_α, highDensity_α)
    return α_u^t and β_u^t
End DMEF_Parameter_Selection

Figure 1: Algorithm to Dynamically Select the Parameter Values for DMEF.

3 Review of MANET multicast routing protocols

In this section, we discuss the working of the MANET multicast routing protocols (MAODV, BEMRP, NR-MLPBR and R-MLPBR) whose performance under DMEF and default flooding is studied through simulations in this paper. We also provide a brief overview of LPBR before discussing its two multicast extensions.
3.1 Multicast extension of the ad hoc on-demand distance vector (MAODV) routing protocol

MAODV [18] is the multicast extension of the well-known Ad hoc On-demand Distance Vector (AODV) unicast routing protocol [17]. Here, a receiver node joins the multicast tree through a member node that lies on the minimum-hop path to the source. A potential receiver wishing to join the multicast group broadcasts a RREQ message. If a node receives the RREQ message and is not part of the multicast tree, the node broadcasts the message in its neighbourhood and also establishes the reverse path by storing the state information, consisting of the group address, requesting node id and sender node id, in a temporary cache. If a node receiving the RREQ message is a member of the multicast tree and has not seen the RREQ message earlier, the node waits to receive several copies of the RREQ message and sends back a RREP message on the shortest path to the receiver. The member node also indicates in the RREP message the number of hops from itself to the source. The potential receiver receives several RREP messages and selects the member node that lies on the shortest path to the source. The receiver node then sends a Multicast Activation (MACT) message to the selected member node along the chosen route. The route from the source to the receiver is set up when the member node and all the intermediate nodes in the chosen path update their multicast tables with the state information from the temporary cache. A similar approach is used in NR-MLPBR and R-MLPBR when a new receiver node wishes to join the multicast group. Tree maintenance in MAODV is based on the expanding ring search (ERS) approach, using the RREQ, RREP and MACT messages. The downstream node of a broken link is responsible for initiating ERS to issue a fresh RREQ for the group. This RREQ contains the hop count of the requesting node from the source and the last known sequence number for that group.
It can be replied to only by the member nodes whose recorded sequence number is greater than that indicated in the RREQ and whose hop distance to the source is smaller than the value indicated in the RREQ.

3.2 Bandwidth-efficient multicast routing protocol (BEMRP)

According to BEMRP [16], a node newly joining the multicast group opts for the nearest forwarding node in the existing tree, rather than choosing a minimum-hop path from the source of the multicast group. As a result, the number of links in the multicast tree is reduced, leading to savings in network bandwidth. Multicast tree construction is receiver-initiated: when a node wishes to join the multicast group as a receiver, it initiates the flooding of Join control packets targeted towards the nodes that are currently members of the multicast tree. On receiving the first Join control packet, a member node waits for a certain time before sending a Reply packet. The member node sends the Reply packet on the path traversed by the Join control packet with the minimum number of intermediate forwarding nodes. The newly joining receiver node collects the Reply packets from different member nodes and sends a Reserve packet on the path that has the minimum number of forwarding nodes from the member node to itself. To provide more bandwidth efficiency, the tree maintenance approach in BEMRP is hard-state based, i.e. a member node transmits control packets only after a link breaks. BEMRP uses two schemes to recover from link failures: in the broadcast-multicast scheme, the upstream node of the broken link is responsible for finding a new route to the previous downstream node; in the local-rejoin scheme, the downstream node of the broken link tries to rejoin the multicast group using a limited flooding of Join control packets.
3.3 Location prediction based routing (LPBR) protocol

LPBR works as follows: whenever a source node has data packets to send to a destination node but does not have a route to that node, it initiates a flooding-based route discovery by broadcasting a Route-Request (RREQ) packet. During this flooding process, each node forwards the RREQ packet exactly once, after incorporating its location update vector (LUV) in the RREQ packet. The LUV of a node comprises the node id, the current X and Y co-ordinates of the node, and its current velocity and angle of movement with respect to the X-axis. The destination node collects the LUV information of all the nodes in the network from the RREQ packets received through several paths and sends a Route-Reply (RREP) packet to the source on the minimum-hop path traversed by a RREQ packet. The source starts sending the data packets on the path learnt (based on the RREP packet) and informs the destination about the time of the next packet dispatch through the header of the data packet currently being sent. If an intermediate node cannot forward a data packet, it sends a Route-Error packet to the source node, which then waits for a while for the destination to inform it of a new route predicted using the LUVs gathered from the latest flooding-based route discovery. If the destination does not receive the data packet within the expected time, it locally constructs the current global topology by predicting the locations of the nodes: each node is assumed to be currently moving in the same direction and at the same speed as mentioned in its latest LUV. If there is at least one source-destination path in the predicted global topology, the destination node sends the source a LPBR-RREP packet on the minimum-hop path in the predicted topology. If the predicted path actually exists in reality, the intermediate nodes on the predicted route manage to forward the LPBR-RREP packet to the source.
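The topology prediction above rests on dead-reckoning each node forward from its latest LUV. A minimal sketch of that projection, assuming straight-line motion at constant speed (the function name is ours):

```python
import math

def predict_position(x, y, velocity, angle_deg, dt):
    """Project a node's position dt seconds ahead, assuming it keeps the
    speed and direction (angle from the X-axis) reported in its latest LUV."""
    theta = math.radians(angle_deg)
    return (x + velocity * dt * math.cos(theta),
            y + velocity * dt * math.sin(theta))

# A node at (100, 50) moving at 10 m/s along the X-axis, 3 seconds later:
print(predict_position(100.0, 50.0, 10.0, 0.0, 3.0))  # (130.0, 50.0)
```

Applying this to every LUV and connecting the pairs of predicted positions within transmission range of each other yields the locally constructed global topology on which the destination searches for a predicted path.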
The source uses the route learnt through the latest LPBR-RREP packet to send the data packets; a costly flooding-based route discovery has thus been avoided. If an intermediate node cannot forward the LPBR-RREP packet (i.e., the predicted path did not exist in reality), the intermediate node sends a LPBR-RREP-ERROR packet to the destination, informing it of the failure to forward the LPBR-RREP packet. The destination discards all the LUVs, and the source node initiates the next flooding-based route discovery after timing out waiting for the LPBR-RREP packet.

3.4 Multicast extensions to the LPBR protocol (NR-MLPBR and R-MLPBR)

Both multicast extensions of LPBR, referred to as NR-MLPBR and R-MLPBR, aim to minimize the number of global broadcast tree discoveries as well as the hop count per source-receiver path of the multicast tree. They use the same underlying idea of letting the receiver nodes predict a new path based on a locally constructed global topology obtained from the location and mobility information of the nodes learnt through the latest broadcast tree discovery. Receiver nodes running NR-MLPBR (the Non-Receiver aware Multicast extension of LPBR) are not aware of the other receivers of the multicast group, whereas each receiver node running R-MLPBR (the Receiver-aware Multicast extension of LPBR) is aware of the identities of the other receivers of the multicast group. NR-MLPBR attempts to predict a minimum-hop path to the source, whereas R-MLPBR attempts to predict a path to the source that has the minimum number of non-receiver nodes. The multicast extensions of LPBR work as follows: when a source attempts to construct a multicast tree, it floods a Multicast Tree Request Message (MTRM) throughout the network. The location and mobility information of the intermediate forwarding nodes is recorded in the MTRM. Each node, including the receiver nodes of the multicast group, broadcasts the MTRM exactly once in its neighbourhood.
Each receiver node of the multicast group receives several MTRMs and sends a Multicast Tree Establishment Message (MTEM) on the minimum-hop path traversed by the MTRMs. The set of paths traversed by the MTEMs forms the multicast tree rooted at the source. If an intermediate node of the tree notices a downstream node moving away from it, the intermediate node sends a Multicast Path Error Message (MPEM) to the source. The source does not immediately initiate another tree discovery procedure. Instead, the source waits for the appropriate receiver node (whose path to the source has broken) to predict a path to the source. The receiver predicts a new path based on the location and mobility information of the nodes collected through the MTRMs during the latest global tree discovery procedure: the receiver attempts to locally construct the global topology by predicting the locations of the nodes in the network using the latest location and mobility information collected. NR-MLPBR and R-MLPBR differ from each other in the type of path predicted and notified to the source. NR-MLPBR determines and sends a Multicast Predicted Path Message (MPPM) on the minimum-hop path to the source. R-MLPBR attempts to choose a path that will minimize the number of newly added intermediate nodes in the multicast tree. To this end, R-MLPBR determines a set of node-disjoint paths to the source on the predicted topology and sends the MPPM on the path that includes the minimum number of non-receiver nodes; if there is a tie, R-MLPBR chooses the path that has the least hop count. The source waits to receive a MPPM from the affected receiver node. If a MPPM is received within a certain time, the source considers the path traversed by the MPPM as part of the multicast tree and continues to send data packets down the tree, including to the nodes on the new path. Otherwise, the source initiates another global tree discovery procedure by broadcasting the MTRM.
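R-MLPBR's choice among the node-disjoint predicted paths can be sketched as follows. This is a minimal illustration of ours (the path representation and function name are hypothetical): fewest non-receiver intermediate nodes first, smaller hop count as the tie-breaker.

```python
def choose_mppm_path(paths, receivers):
    """Among candidate node-disjoint predicted paths (receiver ... source),
    pick the one with the fewest non-receiver intermediate nodes,
    breaking ties by the smaller hop count."""
    def cost(path):
        intermediates = path[1:-1]  # exclude the receiver itself and the source
        non_receivers = sum(1 for n in intermediates if n not in receivers)
        return (non_receivers, len(path) - 1)  # hop count = number of edges
    return min(paths, key=cost)

receivers = {"r1", "r2", "r3"}
paths = [
    ["r1", "a", "b", "s"],         # 2 non-receiver intermediates, 3 hops
    ["r1", "r2", "c", "s"],        # 1 non-receiver intermediate, 3 hops
    ["r1", "r2", "r3", "d", "s"],  # 1 non-receiver intermediate, 4 hops
]
print(choose_mppm_path(paths, receivers))  # ['r1', 'r2', 'c', 's']
```

Routing through existing receivers in this way avoids adding new intermediate (non-receiver) nodes, and hence new links, to the tree.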
R-MLPBR has thus been designed to also reduce the number of links that form the multicast tree, in addition to the source-receiver hop count and the number of global tree discoveries.

4 Simulations

The network dimension used is a 1000m x 1000m square network. The transmission range of each node is assumed to be 250m. The number of nodes used in the network is 25, 50 and 75 nodes, representing networks of low, medium and high density with an average distribution of 5, 10 and 15 neighbours per node respectively. Initially, nodes are uniformly randomly distributed in the network. We implemented all four multicast routing protocols (MAODV, BEMRP, NR-MLPBR and R-MLPBR) in the ns-2 simulator [4]. The broadcast tree discovery strategies simulated are the default flooding approach and DMEF. The DMEF parameter values are given in Table 1.

Table 1: DMEF Parameter Values.

    DMEF Parameter          Value
    maxNeighb_lowDensity    5
    maxNeighb_modDensity    10
    lowDensity_α            5
    modDensity_α            10
    highDensity_α           20
    maxVel_lowMobility      5
    maxVel_modMobility      15
    lowMobility_β           1.6
    modMobility_β           1.3
    highMobility_β          1.1
    T_wait                  10 seconds

The signal propagation model used is the two-ray ground reflection model [4]. The Medium Access Control (MAC) layer model is the IEEE 802.11 [3] model. The channel bandwidth is 2 Mbps. The node queues operate on a First-in First-Out (FIFO) basis, with a maximum queue size of 100 packets. The node mobility model used is the Random Waypoint model [2], with the node velocity chosen from [v_min, ..., v_max]; v_min was set to 0, and the values of v_max used are 10 m/s, 30 m/s and 50 m/s, representing scenarios of low, moderate and high node mobility respectively. The pause time is 0 seconds. Simulations are conducted with multicast group sizes of 2 and 4 (small), 8 and 12 (moderate) and 24 (large) receiver nodes. For each group size, we generated 5 lists of receiver nodes and simulations were conducted with each of them. Traffic sources are constant bit rate (CBR).
Data packets are 512 bytes in size and the packet sending rate is 4 data packets/second. The multicast session continues until the end of the simulation time, which is 1000 seconds. The transmission energy and reception energy per hop are set at 1.4 W and 1 W respectively [6]. The initial energy at each node is 1000 Joules. Each node periodically broadcasts a beacon message within its neighbourhood to make its presence felt to the other nodes in the neighbourhood.

4.1 Performance metrics

The performance metrics studied through this simulation are listed below. The performance results for each metric, displayed in Figures 2 through 14, are an average of the results obtained from simulations conducted with 5 sets of multicast groups and 5 sets of mobility profiles for each combination of group size, node velocity and network density. The multicast source in each case was selected randomly among the nodes in the network, and the source is not part of the multicast group; the nodes that are part of the multicast group are merely the receivers.

• Number of Links per Multicast Tree: the time-averaged number of links in the multicast trees discovered, computed over the entire multicast session. The notion of "time-average" is explained as follows: if there are multicast trees T1, T2 and T3 with 5, 8 and 6 links, used for 12, 6 and 15 seconds respectively, then the time-averaged number of links in the multicast trees is (5*12 + 8*6 + 6*15) / (12 + 6 + 15) = 6, and not 6.33, which is the plain average of 5, 8 and 6.

• Hop Count per Source-Receiver Path: the time-averaged hop count of the paths from the source to each receiver of the multicast group, computed over the entire multicast session.

• Time between Successive Broadcast Tree Discoveries: the time between two successive broadcast tree discoveries, averaged over the entire multicast session.
This metric is a measure of the lifetime of the multicast trees discovered and also of the effectiveness of the path prediction approach followed in NR-MLPBR and R-MLPBR.

• Energy Throughput: the average of the ratio of the number of data packets reaching the destination to the sum of the energy spent across all the nodes in the network.

• Energy Consumed per Node: the sum of the energy consumed at a node due to the transfer of data packets as part of the multicast session, the broadcast tree discoveries, as well as the periodic broadcast and exchange of beacons in the neighbourhood.

• Energy Consumed per Tree Discovery: the average of the total energy consumed by the global broadcast based tree discovery attempts. This includes the energy consumed to transmit (broadcast) the MTRM packets to the nodes in the neighbourhood and to receive the MTRM packet sent by each node in the neighbourhood, summed over all the nodes. It also includes the energy consumed to transmit the MTEM packet from each receiver to the source of the multicast session.

4.2 Number of links per multicast tree

The number of links per multicast tree (refer to Figures 2 and 3) is a measure of the efficiency of the multicast routing protocol in reducing the number of link transmissions during the transfer of the multicast data from the source to the receivers of the multicast group. The smaller the number of links in the tree, the larger the link transmission efficiency of the multicast routing protocol: if fewer links are part of the tree, fewer link-level transmissions are needed to deliver each data packet, which improves the efficiency of link usage and conserves network bandwidth. Naturally, the BEMRP protocol, which has been designed purely to yield bandwidth-efficient multicast trees, discovers trees that have a reduced number of links for all the operating scenarios. This leads to a larger hop count per source-receiver path for BEMRP, as observed in Figures 4 and 5.
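The number of links reported here is a time-averaged quantity; the time-averaging described in Section 4.1 can be sketched as follows (a minimal illustration; the function name is ours):

```python
def time_averaged(values_with_durations):
    """Weight each tree's metric value by how long that tree was in use."""
    total = sum(value * duration for value, duration in values_with_durations)
    return total / sum(duration for _, duration in values_with_durations)

# The worked example from Section 4.1: trees with 5, 8 and 6 links,
# used for 12, 6 and 15 seconds respectively.
print(time_averaged([(5, 12), (8, 6), (6, 15)]))  # 6.0
```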
R-MLPBR, which has been designed to choose the predicted paths with the minimum number of non-receiver nodes, manages to significantly reduce the number of links vis-à-vis the MAODV and NR-MLPBR protocols. R-MLPBR attempts to minimize the number of links in the multicast tree without yielding a higher hop count per source-receiver path. But the tradeoff between link efficiency and the hop count per source-receiver path continues to exist and cannot be nullified. In other words, R-MLPBR cannot discover trees that have the minimum number of links as well as the minimum hop count per source-receiver path. Nevertheless, R-MLPBR is the first multicast routing protocol that yields trees with a reduced number of links and, at the same time, a reduced hop count (close to the minimum) per source-receiver path.

Performance with Flooding as the Tree Discovery Strategy

• Impact of Node Mobility: For a given network density and multicast group size, we do not see any appreciable variation in the number of links per tree for each of the multicast routing protocols studied.

[Figure 2: Average Number of Links per Multicast Tree (Tree Discovery Procedure: Flooding); panels 2.1-2.9 show 25, 50 and 75 nodes at maximum node velocities of 10, 30 and 50 m/s.]

• Impact of Network Density: For a given multicast group size, the number of links per tree for MAODV and NR-MLPBR is about 4-15%, 8-28% and 10-38% more than that incurred with BEMRP in networks of low, moderate and high density respectively. This illustrates that as the network density increases, BEMRP attempts to reduce the number of links per tree by incorporating links that can be shared by multiple receivers on the paths towards the source.
On the other hand, both MAODV and NR-MLPBR attempt to choose minimum-hop paths between the source and any receiver and hence exploit the increase in network density to discover minimum-hop paths, but at the cost of link efficiency. R-MLPBR, in contrast, attempts to reduce the number of links per tree as the network density increases. For a given multicast group size, the number of links per tree for R-MLPBR is about 4-15%, 8-18% and 10-21% more than that incurred by BEMRP. This shows that R-MLPBR is relatively more scalable, similar to BEMRP, with increase in the network density.

• Impact of Multicast Group Size: For a given level of node mobility, for smaller multicast groups (of size 2), the number of links per tree for MAODV, NR-MLPBR and R-MLPBR is about 3-7%, 8-11% and 9-14% more than that incurred for BEMRP in low, medium and high-density networks respectively. For medium and large-sized multicast groups, the number of links per tree for both MAODV and NR-MLPBR is about 7-15%, 17-28% and 22-38% more than that incurred for BEMRP in low, medium and high-density networks respectively. On the other hand, the number of links per tree for R-MLPBR is about 6-15%, 12-18% and 16-21% more than that incurred for BEMRP in low, medium and high-density networks respectively. This shows that R-MLPBR is relatively more scalable, similar to BEMRP, with increase in the multicast group size.

Performance with DMEF as the Tree Discovery Strategy

• Impact of Node Mobility: For each multicast routing protocol, as the maximum node velocity is increased from 10 m/s to 30 m/s, the number of links per multicast tree increases by up to 24% (for multicast groups of small and moderate sizes) and 3% (for larger multicast groups). As the maximum node velocity is increased from 10 m/s to 50 m/s, the number of links per tree increases by up to 15% (for multicast groups of small and moderate sizes) and 5% (for larger multicast groups).
Thus, DMEF can yield multicast trees with a reduced number of links under low node mobility, especially for multicast groups of small and moderate sizes.

[Figure 3: Average Number of Links per Multicast Tree (Tree Discovery Procedure: DMEF); panels 3.1-3.9 show 25, 50 and 75 nodes at maximum node velocities of 10, 30 and 50 m/s.]

• Impact of Network Density: For a given group size, the number of links per tree for MAODV and NR-MLPBR is about 4-15%, 8-28% and 10-35% more than that incurred with BEMRP in networks of low, moderate and high density respectively. For a given group size, the number of links per tree for R-MLPBR is about 3-9%, 8-18% and 9-24% more than that incurred by BEMRP. The results are more or less similar to those obtained using flooding as the tree discovery strategy.

• Impact of Multicast Group Size: For a given level of node mobility, for smaller multicast groups (of size 2), the number of links per tree for MAODV, NR-MLPBR and R-MLPBR is about 4-7%, 8-9% and 9-14% more than that incurred for BEMRP in low, medium and high-density networks respectively. For medium and large-sized multicast groups, the number of links per tree for both MAODV and NR-MLPBR is about 7-15%, 17-28% and 21-35% more than that incurred for BEMRP in low, medium and high-density networks respectively. On the other hand, the number of links per tree for R-MLPBR is about 6-8%, 11-18% and 15-24% more than that incurred for BEMRP in low, medium and high-density networks respectively. These results are almost the same as those obtained when flooding is used as the tree discovery strategy.
4.3 Hop count per source-receiver path

All three multicast routing protocols MAODV, NR-MLPBR and R-MLPBR incur almost the same average hop count per source-receiver path, and it is considerably lower than that incurred for BEMRP. The hop count per source-receiver path is an important metric, often indicative of the end-to-end delay per multicast packet from the source to a specific receiver. BEMRP incurs a significantly larger hop count per source-receiver path, which can be attributed to the tendency of this multicast routing protocol to look for trees with a reduced number of links. When multiple receiver nodes have to be connected to the source through a reduced set of links, the hop count per source-receiver path is bound to increase. In Figures 4 and 5, we can see a significant increase in the hop count per source-receiver path as the multicast group size increases. In the case of flooding, the hop count per source-receiver path for BEMRP can be as much as 41%, 57% and 59% more than that incurred for the other three multicast routing protocols. In the case of DMEF, the hop count per source-receiver path for BEMRP can be as much as 36%, 49% and 53% more than that incurred for the other three multicast routing protocols. The increase in the hop count per source-receiver path for BEMRP is thus slightly less under DMEF than under flooding.

Performance with Flooding as the Tree Discovery Strategy

• Impact of Node Mobility: For a given network density and group size, we do not see any appreciable variation in the hop count per source-receiver path for each of the multicast routing protocols studied.

• Impact of Network Density: As we increase the network density, the hop count per source-receiver path decreases. This is mainly observed in the case of the minimum-hop based MAODV, NR-MLPBR and R-MLPBR.
In the case of BEMRP, the impact of network density on the decrease in the hop count is relatively small, as it is a bandwidth-efficient multicast routing protocol attempting to reduce the number of links in the tree. In networks of moderate density (50 nodes), the hop count per source-receiver path for the three minimum-hop based multicast protocols is about 6%, 9-12% and 15-19% less than that incurred in low-density networks for multicast groups of small, medium and larger sizes respectively. In high-density networks (75 nodes), the hop count per source-receiver path for the three minimum-hop based multicast protocols is about 7-9%, 11-18% and 15-19% less than that incurred in low-density networks for multicast groups of small, medium and larger sizes respectively. In the case of BEMRP, the maximum reduction in the hop count with increase in network density is within 10%.

[Figure 4: Average Hop Count per Source-Receiver Path (Tree Discovery Procedure: Flooding); panels 4.1-4.9 show 25, 50 and 75 nodes at maximum node velocities of 10, 30 and 50 m/s, with the number of receivers per multicast group on the X-axis.]

• Impact of Multicast Group Size: For smaller multicast groups (of size 2), the hop count per source-receiver path for BEMRP can be 6-10%, 8-12% and 10-12% more than that of the other three multicast routing protocols in networks of low, moderate and high density respectively. For medium-sized multicast groups, the hop count per source-receiver path for BEMRP can be 14-29%, 21-30% and 23-37% more than that of the other three multicast routing protocols in networks of low, moderate and high density respectively.
For large-sized multicast groups, the hop count per source-receiver path for BEMRP can be 27-41%, 35-57% and 33-59% more than that of the other three multicast routing protocols in networks of low, moderate and high density respectively.

Performance with DMEF as the Tree Discovery Strategy

• Impact of Node Mobility: For each of the multicast routing protocols, as the maximum node velocity is increased from 10 m/s to 30 m/s, we observe that the hop count per source-receiver path increases by up to 17% (for multicast groups of small and moderate sizes) and 7% (for multicast groups of larger size). As the maximum node velocity is increased from 10 m/s to 50 m/s, we observe that the hop count per source-receiver path increases by up to 13% (for multicast groups of small and moderate sizes) and 15% (for multicast groups of larger size). This shows that DMEF can yield multicast trees with a reduced hop count per source-receiver path under low node mobility, especially for multicast groups of small and moderate sizes.

• Impact of Network Density: The impact is similar to that observed in the case of flooding. For the minimum-hop based multicast protocols, the hop count per source-receiver path decreases significantly with increase in network density. In the case of BEMRP, the decrease in the hop count per source-receiver path with increase in network density is relatively small.

• Impact of Multicast Group Size: For smaller multicast groups (of size 2), the hop count per source-receiver path for BEMRP can be 6-9%, 9-12% and 10-12% more than that of the other three multicast routing protocols in networks of low, moderate and high density respectively. For medium-sized multicast groups, the hop count per source-receiver path for BEMRP can be 13-28%, 20-29% and 23-34% more than that of the other three multicast routing protocols in networks of low, moderate and high density respectively.
For large-sized multicast groups, the hop count per source-receiver path for BEMRP can be 24-36%, 33-50% and 36-54% more than that for the other three multicast routing protocols in networks of low, moderate and high density respectively.

[Figure 5: Average Hop Count per Source-Receiver Path (Tree Discovery Procedure: DMEF). Panels 5.1-5.9 show networks of 25, 50 and 75 nodes at maximum node velocities of 10, 30 and 50 m/s.]

4.4 Time between successive broadcast tree discoveries

The time between successive broadcast tree discoveries is a measure of the stability of the multicast trees and of the effectiveness of the location-prediction and path-prediction approach of the two multicast extensions. For a given condition of node density and node mobility, both NR-MLPBR and R-MLPBR incur a relatively larger time between successive broadcast tree discoveries for small and medium-sized multicast groups. MAODV tends to become more unstable as the multicast group size is increased, owing to the minimum-hop nature of the paths discovered and the absence of any path-prediction approach. For larger multicast groups, BEMRP tends to perform better by virtue of its tendency to strictly minimize only the number of links in the tree. On the other hand, NR-MLPBR attempts to reduce the hop count per source-receiver path and ends up choosing predicted paths that increase the number of links in the tree, quickly leading to the failure of the tree. The time between successive tree discoveries for R-MLPBR is 15-25%, 15-59% and 20-82% more than that obtained for MAODV in networks of low, moderate and high density respectively. For a given level of node mobility and network density, MAODV trees become highly unstable as the multicast group size increases.
For multicast groups of size 2 and 4, the time between successive broadcast tree discoveries for NR-MLPBR and R-MLPBR is greater than that obtained for BEMRP, especially in networks of low and moderate density. For larger multicast group sizes, when we employ flooding, BEMRP tends to incur a larger time between successive broadcast tree discoveries than NR-MLPBR and R-MLPBR. On the other hand, when we employ DMEF, R-MLPBR tends to incur a larger time between successive broadcast tree discoveries than BEMRP, even for larger group sizes.

Performance with Flooding as the Tree Discovery Strategy

• Impact of Node Mobility: For a given multicast group size, network density and multicast routing protocol, the time between successive broadcast tree discoveries at a maximum node velocity of 30 m/s is roughly 28-47% of that obtained at a maximum node velocity of 10 m/s. The time between successive broadcast tree discoveries at a maximum node velocity of 50 m/s is roughly 21-36% of that obtained at a maximum node velocity of 10 m/s.

• Impact of Network Density: For each multicast routing protocol, for a given multicast group size and level of node mobility, the time between successive broadcast tree discoveries decreases as the network density increases. This is mainly observed for the minimum-hop based multicast protocols (especially MAODV and NR-MLPBR), which incur a reduced hop count per source-receiver path as the network density increases. However, such minimum-hop paths obtained in moderate and high-density networks are relatively less stable than those obtained in low-density networks. For a given multicast group size and low node mobility, the time between successive tree discoveries in networks of moderate density (50 nodes) is 67-90% for MAODV and NR-MLPBR, and 73-96% for R-MLPBR and BEMRP, of that incurred in low-density networks.
[Figure 6: Average Time between Successive Tree Discoveries (Tree Discovery Procedure: Flooding). Panels 6.1-6.9 show networks of 25, 50 and 75 nodes at maximum node velocities of 10, 30 and 50 m/s.]

For a given multicast group size and low node mobility, the time between successive tree discoveries in networks of high density (75 nodes) is 51-80% for MAODV and NR-MLPBR, and 70-90% for R-MLPBR and BEMRP, of that obtained in low-density networks. In low-density networks, the time between successive tree discoveries for R-MLPBR and NR-MLPBR is about 10-15% more than that obtained for BEMRP for small multicast groups, and is almost the same as that of BEMRP for moderately sized multicast groups. For larger multicast groups, the time between successive tree discoveries for R-MLPBR and NR-MLPBR can be about 10-23% less than that obtained for BEMRP. In moderate and high-density networks, the time between successive tree discoveries for R-MLPBR is about 7-25% more than that obtained for BEMRP for small multicast groups, and is about the same for moderately sized multicast groups. For larger multicast groups, the time between successive tree discoveries for R-MLPBR can be about 15-25% less than that obtained for BEMRP. In both moderate and high-density networks, R-MLPBR incurs a larger time between successive tree discoveries (by as much as 30%) than NR-MLPBR.

• Impact of Multicast Group Size: For a given network density and node mobility, the time between successive tree discoveries decreases as the multicast group size increases. For smaller group sizes, the time between successive broadcast tree discoveries for MAODV and BEMRP is respectively about 80%-90% and 85%-94% of that incurred for NR-MLPBR and R-MLPBR.
For larger group sizes, the time between successive broadcast tree discoveries for MAODV is about 70%, 51% and 41% of that incurred for BEMRP in networks of low, moderate and high density respectively. Similarly, for larger group sizes, the time between successive broadcast tree discoveries for NR-MLPBR is about 76%, 64% and 57% of that incurred for BEMRP in networks of low, moderate and high density respectively. On the other hand, R-MLPBR tends to incur a relatively larger time between successive tree discoveries even for larger multicast group sizes. For larger multicast groups, the time between successive tree discoveries for R-MLPBR is about 75%-80% of that incurred for BEMRP for all network densities.

Performance with DMEF as the Tree Discovery Strategy

• Impact of Node Mobility: For a given multicast group size, network density and multicast routing protocol, the time between successive broadcast tree discoveries at a maximum node velocity of 30 m/s is roughly 38-59% of that obtained at a maximum node velocity of 10 m/s across networks of low, moderate and high density. The time between successive broadcast tree discoveries at a maximum node velocity of 50 m/s is roughly 34-50% of that obtained at a maximum node velocity of 10 m/s. In each instance, the time between successive tree discoveries obtained while using DMEF is at least 10-15% more than that obtained with flooding.

[Figure 7: Average Time between Successive Tree Discoveries (Tree Discovery Procedure: DMEF). Panels 7.1-7.9 show networks of 25, 50 and 75 nodes at maximum node velocities of 10, 30 and 50 m/s.]

• Impact of Network Density: As we increase the network density from 25 nodes to 50 nodes, the time between successive broadcast tree discoveries for MAODV, NR-MLPBR, R-MLPBR and BEMRP decreases by 13%, 9%, 6% and 6% respectively.
On the other hand, as we increase the network density from 25 nodes to 75 nodes, the larger number of nodes in the neighbourhood is taken into account by DMEF to discover stable routes, and there is no appreciable difference in the time between successive tree discoveries for NR-MLPBR, R-MLPBR and BEMRP. In the case of MAODV, the time between successive tree discoveries decreases by 8%.

• Impact of Multicast Group Size: For a given network density and node mobility, the time between successive tree discoveries decreases as the multicast group size increases. For smaller group sizes, the time between successive broadcast tree discoveries for MAODV and BEMRP is respectively about 82% and 87% of that incurred for NR-MLPBR and R-MLPBR. For moderate group sizes, the time between successive broadcast tree discoveries for MAODV, NR-MLPBR and BEMRP is about 77-86%, 96% and 96% of that incurred for R-MLPBR. For larger group sizes, the time between successive broadcast tree discoveries for MAODV and NR-MLPBR is about 80-89% and 92-94% of that obtained for R-MLPBR and BEMRP.

4.5 Energy consumed per node

Energy consumption in multicast routing is directly proportional to the number of links in the tree: the larger the number of links, the more transmissions take place and the more energy is consumed in the network, and vice versa. The simulation results in Figures 8 and 9 clearly illustrate this. BEMRP incurs the least energy consumption per node and MAODV the largest; the energy consumed per node for the two multicast extensions lies between these two extremes. The energy consumed per node for R-MLPBR is less than that of NR-MLPBR, as the former also attempts to simultaneously reduce the number of links as well as the hop count per source-receiver path. The energy consumption per node increases as the multicast group size increases.
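The proportionality between the number of tree links and the energy consumed per node can be illustrated with a minimal back-of-the-envelope sketch. The per-transmission and per-reception energy values and the helper function below are illustrative assumptions, not the energy model of the simulations reported here:

```python
# Hypothetical sketch: per-node energy spent forwarding multicast data
# over a tree, assuming fixed per-transmission / per-reception costs.
E_TX = 1.4  # Joules per packet transmission (assumed value)
E_RX = 1.0  # Joules per packet reception (assumed value)

def energy_per_node(num_tree_links, num_nodes, num_packets):
    """Every tree link carries each packet once: one transmission and
    one reception per link. The total is averaged over all nodes."""
    total = num_tree_links * num_packets * (E_TX + E_RX)
    return total / num_nodes

# A tree with more links costs proportionally more energy per node.
sparse = energy_per_node(num_tree_links=10, num_nodes=50, num_packets=100)
dense = energy_per_node(num_tree_links=15, num_nodes=50, num_packets=100)
assert abs(dense / sparse - 1.5) < 1e-9  # energy scales linearly with link count
```

This linear dependence is why a protocol that minimizes tree links (BEMRP) sits at the low end of the per-node energy curves while a minimum-hop protocol (MAODV) sits at the high end.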
For a given multicast group size and multicast routing protocol, the energy consumed per node increases with increase in network density as well as with increase in node mobility.

• Impact of Node Mobility: For a given multicast group size, network density and multicast routing protocol, the energy consumed per node at a maximum node velocity of 30 m/s can be up to 10-35% more than that obtained at a maximum node velocity of 10 m/s. The energy consumed per node at a maximum node velocity of 50 m/s can be up to 10-40% more than that obtained at a maximum node velocity of 10 m/s. BEMRP and MAODV incur the largest increase in energy consumed per node with increase in node mobility. NR-MLPBR and R-MLPBR incur a relatively lower increase, which can be attributed to the tendency of these multicast routing protocols to reduce the number of broadcast tree discoveries through effective tree prediction.

[Figure 8: Average Energy Consumed per Node (Tree Discovery Procedure: Flooding). Panels 8.1-8.9 show networks of 25, 50 and 75 nodes at maximum node velocities of 10, 30 and 50 m/s; x-axis: # Receivers per Multicast Group.]

• Impact of Network Density: For multicast groups of size 2 and 4, we observe that the energy consumed per node decreases as the network density increases from 25 to 50 nodes and from 25 to 75 nodes. This can be attributed to the small group size, which allows the data-forwarding load to be shared effectively among all the nodes in the network. For larger group sizes, all the nodes in the network end up spending more energy (due to transmission/reception, or at least receiving the packets in the neighbourhood).
As a result, for multicast group sizes of 8, 12 and 24, as we increase the network density from 25 nodes to 50 nodes, the energy consumed per node for MAODV, NR-MLPBR, R-MLPBR and BEMRP increases by 47%-134%, 46%-133%, 42%-122% and 30%-96% respectively. As we increase the network density from 25 nodes to 75 nodes, the energy consumed per node for MAODV, NR-MLPBR, R-MLPBR and BEMRP increases by 52%-158%, 50%-154%, 42%-125% and 25%-100% respectively. MAODV and NR-MLPBR incur a relatively larger energy consumed per node at high network densities owing to their tendency to discover trees with the minimum hop count. R-MLPBR and BEMRP discover trees with a reduced number of links and hence incur a relatively lower energy consumed per node at high network density.

• Impact of Multicast Group Size: As we increase the multicast group size from 2 to 24, the energy consumed per node for MAODV and NR-MLPBR increases by a factor of 2.1 to 2.6, 5.7 to 5.9 and 6.0 to 7.0 in networks of low, medium and high density respectively. In the case of BEMRP and R-MLPBR, as we increase the multicast group size from 2 to 24, the energy consumed per node increases by a factor of 2.1 to 2.5, 4.9 to 5.2 and 4.6 to 6.2 in networks of low, medium and high density respectively. The increase in the energy consumed per node is sub-linear with respect to the group size; hence, all four multicast routing protocols are scalable with respect to the increase in multicast group size.

Performance with DMEF as the Tree Discovery Strategy

• Impact of Node Mobility: For a given multicast group size, network density and multicast routing protocol, the energy consumed per node at maximum node velocities of 30 m/s and 50 m/s can be up to 5-20% more than that obtained at a maximum node velocity of 10 m/s. This indicates the effectiveness of DMEF vis-à-vis flooding in reducing the energy consumed per node.
DMEF discovers relatively more stable trees by involving only slow-moving nodes in the tree. As a result, the multicast trees exist for a long time and less energy is spent on tree discoveries. As with flooding, BEMRP and MAODV incur the largest increase in energy consumed per node with increase in node mobility, while NR-MLPBR and R-MLPBR incur a relatively lower increase.

[Figure 9: Average Energy Consumed per Node (Tree Discovery Procedure: DMEF). Panels 9.1-9.9 show networks of 25, 50 and 75 nodes at maximum node velocities of 10, 30 and 50 m/s.]

• Impact of Network Density: Similar to that observed for flooding, for multicast groups of size 2 and 4, the energy consumed per node decreases as the network density increases from 25 to 50 nodes and from 25 to 75 nodes. For multicast group sizes of 8, 12 and 24, as we increase the network density from 25 nodes to 50 nodes, the energy consumed per node for MAODV, NR-MLPBR, R-MLPBR and BEMRP increases by 54%-157%, 53%-156%, 48%-136% and 38%-118% respectively. As we increase the network density from 25 nodes to 75 nodes, the energy consumed per node for MAODV, NR-MLPBR, R-MLPBR and BEMRP increases by 49%-173%, 47%-172%, 42%-146% and 27%-114% respectively. MAODV and NR-MLPBR incur a relatively larger energy consumed per node at high network densities owing to their tendency to discover trees with the minimum hop count. R-MLPBR and BEMRP discover trees with a reduced number of links and hence incur a relatively lower energy consumed per node at high network density.
For a given network density, the energy consumed per node with flooding can be as much as 5%-16%, 12%-23% and 22%-37% more than that incurred with DMEF in the presence of low, medium and high node mobility respectively.

• Impact of Multicast Group Size: As we increase the multicast group size from 2 to 24, the energy consumed per node for MAODV and NR-MLPBR increases by a factor of 2.2 to 2.4, 5.6 to 5.8 and 6.0 to 7.1 in networks of low, medium and high density respectively. In the case of BEMRP and R-MLPBR, as we increase the multicast group size from 2 to 24, the energy consumed per node increases by a factor of 2.2 to 2.4, 4.9 to 5.4 and 4.8 to 6.4 in networks of low, medium and high density respectively. The increase in the energy consumed per node is sub-linear; hence, all four multicast routing protocols are scalable with respect to the increase in multicast group size.

4.6 Energy throughput

For each of the multicast routing protocols, for a given network density and node mobility, the energy throughput decreases with increase in the multicast group size. This can be attributed to the need to spend more energy to deliver a given multicast packet to many receivers than to a few receivers. For a given network density and multicast group size, the energy throughput of a multicast routing protocol decreases slightly as the node velocity is increased from low to moderate and high. For a given multicast group size and node mobility, the energy throughput of a multicast routing protocol decreases with increase in network density. This can be attributed to the involvement of several nodes (at larger network density) in distributing the offered traffic load to the multicast group. For a given simulation condition, the energy throughput of BEMRP is slightly larger than that of the other multicast routing protocols, which can be attributed to its lower energy consumed per node (and smaller number of links).
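The energy throughput metric reported in Figures 10 and 11 is the number of multicast packets delivered per Joule of energy consumed. As a simple ratio it can be sketched as follows; the numeric values are illustrative, not results from the simulations:

```python
def energy_throughput(packets_delivered, total_energy_joules):
    """Energy throughput: multicast packets delivered per Joule
    of energy consumed across the network."""
    return packets_delivered / total_energy_joules

# Illustrative values (assumed): delivering the same packets to a larger
# receiver set consumes more total energy, so packets-per-Joule drops.
small_group = energy_throughput(packets_delivered=1000, total_energy_joules=250.0)
large_group = energy_throughput(packets_delivered=1000, total_energy_joules=800.0)
assert small_group > large_group
```

This is why the metric falls with group size, node mobility and network density: each of these raises the energy spent per delivered packet.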
Performance with Flooding as the Tree Discovery Strategy

• Impact of Node Mobility: As we increase the node mobility, the energy throughput for a multicast routing protocol decreases by as much as 8%-12%, 12%-17% and 24%-26% in networks of low, moderate and high density respectively. For a given network density, the reduction in the energy throughput with increase in node mobility is due to the relatively larger amount of energy spent on broadcast tree discoveries.

[Figure 10: Energy Throughput: # Packets Delivered per Joule (Tree Discovery Procedure: Flooding). Panels 10.1-10.9 show networks of 25, 50 and 75 nodes at maximum node velocities of 10, 30 and 50 m/s.]

• Impact of Network Density: The decrease in energy throughput with increase in network density is largest for MAODV and NR-MLPBR, relatively lower for R-MLPBR and least for BEMRP. At a network density of 50 nodes, the energy throughput of MAODV and NR-MLPBR is 45%-64%, and that of R-MLPBR and BEMRP is 50%-65%, of that observed at a network density of 25 nodes. At a network density of 75 nodes, the energy throughput of MAODV, NR-MLPBR, R-MLPBR and BEMRP is 29%-48%, 30%-50%, 33%-50% and 38%-50% of that observed at a network density of 25 nodes.

• Impact of Multicast Group Size: As the multicast group size is increased from 2 to 4, the energy throughput of the multicast routing protocols decreases by 30%-40%, 36%-40% and 24%-45% in networks of low, moderate and high density respectively. As the multicast group size is increased from 2 to 24, the energy throughput of the multicast routing protocols decreases by about 78%, 83% and 85% in networks of low, moderate and high density respectively.
Performance with DMEF as the Tree Discovery Strategy

• Impact of Node Mobility: As we increase the node mobility from low to moderate and high, the energy throughput for a multicast routing protocol decreases by as much as 7%-8%, 8%-12% and 16%-17% in networks of low, moderate and high density respectively. The relatively higher energy throughput while using DMEF can be attributed to the tendency of the broadcast strategy to involve only relatively slow-moving nodes in the trees; as a result, less energy is consumed on broadcast tree discoveries.

• Impact of Network Density: The decrease in energy throughput with increase in network density is largest for MAODV and NR-MLPBR, relatively lower for R-MLPBR and least for BEMRP. At a network density of 50 nodes, the energy throughput of MAODV, NR-MLPBR, R-MLPBR and BEMRP is 48%-63%, 47%-63%, 52%-64% and 58%-69% of that observed at a network density of 25 nodes. At a network density of 75 nodes, the energy throughput of MAODV, NR-MLPBR, R-MLPBR and BEMRP is 32%-47%, 32%-48%, 36%-48% and 42%-50% of that observed at a network density of 25 nodes.

• Impact of Multicast Group Size: As the multicast group size is increased from 2 to 4, the energy throughput of the multicast routing protocols decreases by 36%-44%, 35%-45% and 30%-47% in networks of low, moderate and high density respectively. As the multicast group size is increased from 2 to 24, the energy throughput of the multicast routing protocols decreases by about 80%, 84% and 84% in networks of low, moderate and high density respectively.

[Figure 11: Energy Throughput: # Packets Delivered per Joule (Tree Discovery Procedure: DMEF). Panels 11.1-11.9 show networks of 25, 50 and 75 nodes at maximum node velocities of 10, 30 and 50 m/s.]

4.7 Energy consumed per tree discovery

For a given broadcast strategy, the energy consumed per tree discovery is the same for all four multicast routing protocols. For both flooding and DMEF, the energy consumed increases with increase in network density, which can be attributed to the involvement of multiple nodes in the broadcast of the MTRMs. In low-density networks, the energy consumed per tree discovery using flooding is 10-22%, 19-35% and 14-20% more than that using DMEF under low, moderate and high node mobility conditions respectively. In moderate-density networks, the energy consumed per tree discovery using flooding is about 15%, 23% and 28% more than that using DMEF under low, moderate and high node mobility conditions respectively. In high-density networks, the energy consumed per tree discovery using flooding is about 18%, 30% and 37% more than that using DMEF. As observed, DMEF performs increasingly better than flooding with increase in network density and/or node mobility.

[Figure 12: Energy Consumed per Broadcast Tree Discovery: Flooding vs. DMEF (25 Nodes).]

For a given multicast group size, the energy consumed while using flooding in moderate-density (50 nodes) and high-density (75 nodes) networks is respectively about 3.8 and 8 times more than that incurred in networks of low density. This indicates that as the number of nodes is increased by a factor of x (x = 2 for moderate density and x = 3 for high density), the energy consumed by flooding increases by roughly a factor of 2x. In the case of DMEF, for a given multicast group size, the energy consumed in moderate-density networks is about 3.7, 3.5 and 3.2 times more than that observed in low-density networks under low, moderate and high node mobility conditions respectively.
For a given multicast group size, the energy consumed while using DMEF in high-density networks is about 7.8, 7.2 and 6.6 times more than that observed in low-density networks under low, moderate and high node mobility conditions respectively. Thus, the energy consumed while using DMEF does not increase as steeply with network density as observed for flooding. DMEF performs appreciably well in lowering the energy consumed per tree discovery with increase in node mobility and/or network density.

[Figure 13: Energy Consumed per Broadcast Tree Discovery: Flooding vs. DMEF (50 Nodes).]

[Figure 14: Energy Consumed per Broadcast Tree Discovery: Flooding vs. DMEF (75 Nodes).]

5 Survey of MANET broadcast route discovery strategies

We surveyed the literature for broadcast route discovery strategies proposed to reduce the route discovery overhead; below we describe the strategies relevant to the research conducted in this paper. In Section 5.3, we qualitatively analyse the advantages of our DMEF broadcast strategy over the broadcast strategies described in Sections 5.1 and 5.2.

5.1 Reliable route selection (RRS) algorithm

In [20], the authors proposed a Reliable Route Selection (RRS) algorithm based on the Global Positioning System (GPS) [8]. The RRS algorithm divides the circular area formed by the transmission range of a node into two zones: a stable zone and a caution zone. A node is said to maintain stable links with the neighbour nodes in its stable zone and unstable links with the neighbour nodes lying in its caution zone. If R is the transmission range of a node, the radius of the stable zone is defined as r = R − δ·S, where S is the velocity of the node and δ is a tunable parameter. The stable zone is a circular region (with its own centre) inscribed inside the circular region formed by the transmission range of the node.
The centre of the stable zone need not coincide with the centre of the circular region forming the transmission range of the node, but it always lies in the direction of movement of the node. RRS works as follows: the Route-Request (RREQ) message of a broadcast route discovery process includes the coordinates of the current position of the transmitter of the RREQ message, the coordinates of the centre of the stable zone of the transmitter, the value of the parameter δ to be used by an intermediate node, and the stable zone radius of the transmitter of the message. The source node of the route discovery process broadcasts the RREQ message in the complete neighbourhood formed by the transmission range R, with the RRS-related fields set to initial values corresponding to the source node. An intermediate node receiving the RREQ message rebroadcasts it only if the node lies in the stable zone of the transmitter. If a route discovery attempt based on a set value of δ is unsuccessful, the source node decrements the value of δ and launches another global broadcast-based route discovery. This process is continued (i.e., the value of δ is decremented and the global broadcast reinitiated) until the source finds a path to the destination. If the source cannot find a route to the destination even when conducting route discovery with δ set to zero, the source declares that the destination is not connected to it.

5.2 Probability, counter, area and neighbour-knowledge based methods

In [15], the authors propose several broadcast route discovery strategies that reduce the number of nodes retransmitting a broadcast message. These strategies can be grouped into four families: probability-based, counter-based, area-based and neighbour-knowledge based methods. (i) Probability-based method: When a node receives a broadcast message for the first time, the node rebroadcasts the message with a certain probability.
If the received message has already been seen, the node drops it, irrespective of whether the node retransmitted the message when it received it for the first time. (ii) Counter-based method: When a node receives a broadcast message for the first time, it waits for a certain time before retransmitting the message. During this broadcast-wait-time, the node maintains a counter to track the number of redundant copies of the broadcast message received from its other neighbours. If the counter value exceeds a threshold within the broadcast-wait-time, the node drops the message; otherwise, the node retransmits it. (iii) Area-based method: A broadcasting node includes its location information in the message header. The receiving node calculates the additional coverage area that would be obtained if the message were rebroadcast. If the additional coverage area is less than a threshold value, all future receptions of the same message are dropped. Otherwise, the node starts a broadcast-wait-timer and caches the redundant copies of the broadcast message received during this broadcast-wait-time. After the timer expires, the node considers all the cached messages and recalculates the additional coverage area it would obtain by rebroadcasting the message. If the obtainable additional coverage area is less than the threshold value, the cached messages are dropped; otherwise, the message is rebroadcast. (iv) Neighbour-knowledge based method: This method requires nodes to maintain lists of their 1-hop and 2-hop neighbours, learnt via periodic beacon exchange. Using these lists, a node calculates the smallest set of 1-hop neighbours required to reach all of its 2-hop neighbours; the nodes in this minimum set are called the Multipoint Relays (MPRs).
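To make the counter-based rule of method (ii) concrete, the decision logic can be sketched as follows. The class name, default threshold and bookkeeping are illustrative simplifications, not the implementation from [15]:

```python
class CounterBasedNode:
    """Sketch of the counter-based rebroadcast rule: count redundant
    copies heard during the broadcast-wait-time and rebroadcast only
    if the counter stays below the threshold when the timer expires."""

    def __init__(self, counter_threshold=3):
        self.counter_threshold = counter_threshold
        self.counters = {}  # message id -> redundant copies heard so far

    def on_receive(self, msg_id):
        # The first reception starts the broadcast-wait-timer (modelled
        # implicitly here); later copies only bump the redundancy counter.
        if msg_id not in self.counters:
            self.counters[msg_id] = 0
        else:
            self.counters[msg_id] += 1

    def should_rebroadcast(self, msg_id):
        # Called when the broadcast-wait-timer for msg_id expires.
        return self.counters.get(msg_id, 0) < self.counter_threshold

node = CounterBasedNode(counter_threshold=3)
for _ in range(4):          # first copy + 3 redundant copies heard
    node.on_receive("rreq-7")
assert not node.should_rebroadcast("rreq-7")  # threshold reached: drop
```

The intuition: if a node has already overheard the same message several times, its own rebroadcast would add little coverage, so suppressing it saves energy with only a small risk of leaving some neighbour unreached.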
5.3 Advantages of DMEF and differences from related work

The DMEF strategy is very effective in discovering relatively long-living routes in an energy-efficient manner and differs from the RRS algorithm in the following ways:

• RRS is highly dependent on location-service schemes like GPS, while DMEF is not dependent on any location-service scheme.

• RRS requires the RREQ message header to be changed, while DMEF does not require any change in the structure of the RREQ messages used for broadcasting. DMEF can thus be used without requiring any change to a MANET routing protocol.

• In RRS, a node lying in the stable zone of the transmitter rebroadcasts the RREQ over its complete neighbourhood (i.e., at the full transmission range), even though only the recipient nodes lying in the stable zone of that transmitter will rebroadcast the RREQ further. Hence, RRS is not energy-efficient. In DMEF, the transmission range for a broadcast at a node is dynamically and locally determined using the node's velocity and neighbourhood density, and is usually considerably less than the maximum transmission range.

• RRS does not properly handle the scenario in which the value of δ·S exceeds the transmission range R of the node. The value of δ has to be iteratively reduced (by trial and error) to determine the connectivity between the source and destination nodes. DMEF is better than RRS in that it requires only one broadcast route discovery attempt from the source to determine a route to the destination if the two nodes are indeed connected. The values of the DMEF parameters are dynamically determined by each node itself, because a node knows its own velocity and neighbourhood better than the source of the broadcast process does.

• The network density does not influence the stable zone radius selected by RRS. In RRS, the number of nodes retransmitting the RREQ message in a neighbourhood increases significantly as the network density is increased.
DMEF is quite effective in reducing the number of nodes retransmitting the RREQ message in high-density networks.

The advantages of the DMEF scheme compared with the broadcast route discovery strategies discussed in Section 5.2 can be summarized as follows:

• The probability-based and MPR-based methods do not guarantee that the broadcast message will be routed on a path with the minimum hop count or close to it. Previous research [13] on the impact of these broadcast strategies on the stability and hop count of DSR routes indicates that the hop count of the paths can be far more than the minimum and that the routes have a smaller lifetime than the paths discovered using flooding. The probability-based method cannot always guarantee that the RREQ message gets delivered to the destination. Also, with increase in network density, the number of nodes retransmitting the message increases for both the probability-based and MPR-based methods. DMEF determines paths with a hop count close to that of the minimum-hop paths, and such paths have a relatively larger lifetime than those discovered using flooding. DMEF almost always guarantees that a source-destination route is discovered if at least one such route exists in the underlying network. DMEF also effectively controls the RREQ message retransmission overhead as the network density increases.

• The counter-based and area-based methods require careful selection of threshold values for their proper functioning. Each node has to wait for a broadcast-wait-time before retransmitting the message, which can introduce significant route acquisition delays. The area-based method also requires the nodes to be location-aware and to include the location information in the broadcast messages. With DMEF, there is no waiting time at a node before rebroadcasting a received RREQ message, if the message has been received for the first time during a particular route discovery.
DMEF does not depend on any location-aware services for its operation, and the structure of the RREQ message need not be changed. 5.4 Other relevant optimizations for multicast routing overhead In addition to the methods described in Sections 5.1 and 5.2, some of the other optimizations that have been proposed in the MANET literature include: (i) a Swarm Intelligence based multicast routing protocol for ad hoc networks (MANHSI), proposed in [1]; (ii) the independent tree ad hoc multicast routing (ITAMAR) framework proposed in [19], which includes a number of heuristics to compute a set of alternate trees that improve the mean time between interruptions in multicast communication, achieved with only a small increase in the route discovery overhead; and (iii) a virtual overlay mesh of unicast paths, proposed in [7] for efficient discovery of multicast routes. 6 Conclusions and future work Simulations have been conducted with both flooding and DMEF as the broadcast tree discovery strategies. DMEF helps the multicast routing protocols discover stable trees and at the same time does not increase the source-receiver hop count appreciably. Hence, the energy consumed per node with DMEF is lower than that incurred with flooding. With DMEF as the tree discovery strategy, the performance of NR-MLPBR and R-MLPBR with respect to the time between successive tree discoveries and the energy consumed per node improved relatively more than that observed for BEMRP and MAODV. This can be attributed to the effective path prediction of the two multicast extensions, an idea inherited from LPBR and complemented by DMEF. Thus, DMEF has been demonstrated to be an effective broadcast strategy for discovering multicast trees. In related work [12], we have also demonstrated the effectiveness of DMEF in discovering node-disjoint multi-path routes for MANETs.
Thus, DMEF is an effective broadcast strategy for discovering stable unicast, multicast and multi-path routes for MANETs, with relatively lower energy consumption than the default flooding approach. The related work listed in Sections 5.1 and 5.4 requires the strategies to be embedded into the design of the protocols, with changes built into their route discovery procedures. Ours is the first such effort to study the impact of protocol-independent broadcast strategies (like DMEF and flooding) on the performance of multicast routing protocols. In future work, we will evaluate the impact of the probability, counter, area and neighbour-knowledge based methods on the performance of the multicast routing protocols. Acknowledgments Research was sponsored by the U.S. Army Research Laboratory and was accomplished under Cooperative Agreement Number W911NF-08-2-0061. The views and conclusions in this document are those of the authors and should not be interpreted as representing the official policies, either expressed or implied, of the Army Research Laboratory or the U.S. Government. The U.S. Government is authorized to reproduce and distribute reprints for Government purposes notwithstanding any copyright notation herein. References [1] Z. M. Alfawaer, H. W. Hua and N. Ahmed, "A Novel Multicast Routing Protocol for Mobile Ad Hoc Networks," American Journal of Applied Sciences, vol. 4, no. 5, pp. 333-338, 2007. [2] C. Bettstetter, H. Hartenstein and X. Perez-Costa, "Stochastic Properties of the Random-Waypoint Mobility Model," Wireless Networks, vol. 10, no. 5, pp. 555-567, 2004. [3] G. Bianchi, "Performance Analysis of the IEEE 802.11 Distributed Coordination Function," IEEE Journal on Selected Areas in Communications, vol. 18, no. 3, pp. 535-547, 2000. [4] L. Breslau et al., "Advances in Network Simulation," IEEE Computer, vol. 33, no. 5, pp. 59-67, May 2000. [5] J. Broch, D. A. Maltz, D. B. Johnson, Y. C. Hu and J.
Jetcheva, "A Performance Comparison of Multi-hop Wireless Ad Hoc Network Routing Protocols," Proceedings of the 4th Annual ACM/IEEE Conference on Mobile Computing and Networking, pp. 85-97, October 1998. [6] L. M. Feeney, "An Energy-Consumption Model for Performance Analysis of Routing Protocols for Mobile Ad Hoc Networks," Journal of Mobile Networks and Applications, vol. 3, no. 6, pp. 239-249, June 2001. [7] C. Gui and P. Mohapatra, "Overlay Multicast for MANETs using Dynamic Virtual Mesh," Wireless Networks, vol. 13, no. 1, pp. 77-91, 2007. [8] B. Hofmann-Wellenhof, H. Lichtenegger and J. Collins, Global Positioning System: Theory and Practice, 5th rev. ed., Springer, September 2004. [9] N. Meghanathan, "A Location Prediction Based Reactive Routing Protocol to Minimize the Number of Route Discoveries and Hop Count per Path in Mobile Ad hoc Networks," The Computer Journal, vol. 52, no. 4, pp. 461-482, July 2009. [10] N. Meghanathan, "Multicast Extensions to the Location Prediction Based Routing Protocol for Mobile Ad hoc Networks," ISAST Transactions on Computers and Intelligent Systems, vol. 1, no. 1, pp. 56-65, August 2009. [11] N. Meghanathan, "A Density and Mobility Aware Energy-efficient Broadcast Route Discovery Strategy for Mobile Ad hoc Networks," International Journal of Computer Science and Network Security, vol. 9, no. 11, pp. 15-24, November 2009. [12] N. Meghanathan, "A Node-Disjoint Multi-path Extension of the Location Prediction Based Routing Protocol for Mobile Ad hoc Networks," Proceedings of the 3rd International Conference on Signal Processing and Communication Systems, Omaha, Nebraska, USA, September 28-30, 2009. [13] N. Meghanathan, "Impact of Broadcast Route Discovery Strategies on the Performance of Mobile Ad Hoc Network Routing Protocols," Proceedings of the International Conference on High Performance Computing, Networking and Communication Systems, pp. 144-151, July 2007. [14] C. S. R. Murthy and B. S.
Manoj, Ad Hoc Wireless Networks: Architectures and Protocols, Prentice Hall, June 2004. [15] S. Ni, Y. Tseng, Y. Chen and J. Sheu, "The Broadcast Storm Problem in a Mobile Ad Hoc Network," Proceedings of the ACM International Conference on Mobile Computing and Networking, pp. 151-162, 1999. [16] T. Ozaki, J-B. Kim and T. Suda, "Bandwidth-Efficient Multicast Routing for Multihop, Ad hoc Wireless Networks," Proceedings of the IEEE INFOCOM Conference, vol. 2, pp. 1182-1192, Anchorage, USA, April 2001. [17] C. E. Perkins and E. M. Royer, "The Ad hoc On-demand Distance Vector Protocol," in Ad hoc Networking, edited by C. E. Perkins, pp. 173-219, Addison-Wesley, 2000. [18] E. Royer and C. E. Perkins, "Multicast Operation of the Ad-hoc On-demand Distance Vector Routing Protocol," Proceedings of the 5th ACM/IEEE Annual Conference on Mobile Computing and Networking, pp. 207-218, Seattle, USA, August 1999. [19] S. Sajama and Z. J. Haas, "Independent-tree Ad hoc Multicast Routing (ITAMAR)," Mobile Networks and Applications, vol. 8, no. 5, pp. 551-566, October 2003. [20] Y.-J. Suh, W. Kim and D.-H. Kwon, "GPS-Based Reliable Routing Algorithms for Ad Hoc Networks," The Handbook of Ad Hoc Wireless Networks, pp. 361-374, CRC Press, 2003. [21] C. K. Toh, G. Guichal and S. Bunchua, "ABAM: On-demand Associativity-based Multicast Routing for Ad hoc Mobile Networks," Proceedings of the 52nd IEEE VTS Fall Vehicular Technology Conference, vol. 3, pp. 987-993, September 2000.
An Extended TOPSIS Method for Multiple Attribute Group Decision Making Based on Generalized Interval-valued Trapezoidal Fuzzy Numbers Peide Liu Shandong Economic University, Information Management School Jinan Shandong 250014, China E-mail: Peide.liu@gmail.com Keywords: interval-valued fuzzy number, relative closeness coefficient, multiple attribute group decision making Received: May 29, 2009 The extended TOPSIS method presented here deals with multiple attribute group decision making problems in which the attribute values and weights take the form of generalized interval-valued trapezoidal fuzzy numbers (GIVTFN). First, some properties are defined: the concept and relational calculation rules of GIVTFN, the distance between GIVTFN and its characteristics, and the method for transforming linguistic terms into GIVTFN. Second, the normalization method for GIVTFN is illustrated, and an extended TOPSIS method based on GIVTFN is presented in detail. The alternatives are ranked by the relative closeness coefficient of TOPSIS. Finally, an illustrative example is given to show the effectiveness of the method and its decision-making steps. Povzetek: Z izboljšano metodo TOPSIS so dosegli boljše rezultate pri odločanju z mnogoterimi atributi. 1 Introduction Multiple attribute decision making (MADM) is an important part of modern decision science. It has been extensively applied to various areas, such as society, economics, management, military affairs and engineering technology, for example investment decision-making, project evaluation, economic evaluation and personnel evaluation. Since objective things are fuzzy and uncertain and human thinking is ambiguous, the majority of multiple attribute decision-making problems are uncertain and ambiguous; this class of problems is called fuzzy multiple attribute decision making (FMADM).
Since Bellman and Zadeh [1] first proposed the basic model of fuzzy decision making based on the theory of fuzzy mathematics, FMADM has been receiving more and more attention, and many achievements have been made on FMADM problems [2-5, 7-21]. TOPSIS (Technique for Order Preference by Similarity to Ideal Solution), proposed by Hwang and Yoon [6], is a popular approach to MCDM problems. Its basic principle is that the chosen alternative should have the shortest distance from the positive ideal solution and the farthest distance from the negative ideal solution. In TOPSIS, the performance ratings and the weights of the criteria are given as crisp values, but in many cases crisp data are inadequate to model real-life situations. Jahanshahloo et al. [7] extend the TOPSIS method to fuzzy decision-making situations by considering interval numbers and defining a crisp Euclidean distance between two interval numbers. Wang and Elhag [8] propose a fuzzy TOPSIS method based on alpha-level sets and present a nonlinear programming (NLP) solution procedure using triangular fuzzy numbers. Liu and Zeng [9] propose a new TOPSIS method for fuzzy multiple attribute group decision making based on the expected value operator of the trapezoidal fuzzy number, for the case where the fuzzy decision matrices and the weights of the decision attributes and decision makers are all given as trapezoidal fuzzy numbers. Tsaur et al. [10] convert the fuzzy MCDM problem into a crisp one via centroid defuzzification and then solve the non-fuzzy MCDM problem by the TOPSIS method. Chu and Lin [11] also change the fuzzy MCDM problem into a crisp one; differing from the others, they first derive the membership functions of all the weighted ratings in a weighted normalized decision matrix, convert them to crisp values by defuzzification, and then use the TOPSIS method to solve the problem.
The concept of the interval-valued fuzzy set was initially proposed by Gorzałczany [12] and Turksen [13]. Several researchers have focused on interval-valued fuzzy numbers [12-18] in recent years, because interval-valued fuzzy numbers are more general and better at expressing fuzzy information. Wang and Li [14-15] defined the expansion operations of interval-valued fuzzy numbers and proposed the concept and properties of the similarity coefficient based on interval-valued fuzzy numbers. Hong and Lee [16] proposed the distance of interval-valued fuzzy numbers. Ashtiani et al. [17] defined interval-valued triangular fuzzy numbers and presented an extended TOPSIS group decision-making method for them. Wei and Chen [18] proposed similarity measures between generalized interval-valued trapezoidal fuzzy numbers (GIVTFN) for risk analysis. This paper develops an extended TOPSIS method for multiple attribute group decision making in which the attribute weights and values are given in the form of GIVTFN, building on a newly defined distance of GIVTFN. The remainder of this paper is organized as follows. Section 2 briefly introduces the basic concepts and operation rules of GIVTFN and defines the distance of GIVTFN. Section 3 describes the extended TOPSIS method for solving multiple attribute group decision making problems using GIVTFN. Section 4 gives a numerical example to illustrate the validity of the method and its decision-making steps. Section 5 concludes the study.
2 The basic concept of the interval-valued trapezoidal fuzzy numbers

2.1 The generalized trapezoidal fuzzy numbers

(1) The concept of the generalized trapezoidal fuzzy numbers

Definition 1 [19]: A generalized trapezoidal fuzzy number can be defined as a vector A = (a1, a2, a3, a4; wA) (as shown in Fig. 1), where a1 ≤ a2 ≤ a3 ≤ a4 and wA ∈ [0, 1], and the membership function μA(x): R → [0, 1] is defined as follows:

  μA(x) = wA (x − a1)/(a2 − a1),  x ∈ [a1, a2)
        = wA,                    x ∈ [a2, a3]
        = wA (x − a4)/(a3 − a4),  x ∈ (a3, a4]
        = 0,                     x ∈ (−∞, a1) ∪ (a4, ∞)      (1)

The elements x ∈ R are real numbers, and the membership function μA(x) is a regular, continuous convex function expressing the membership degree to the fuzzy set. If −1 ≤ a1 ≤ a2 ≤ a3 ≤ a4 ≤ 1, then A is called a normalized trapezoidal fuzzy number. In particular, if wA = 1, then A is called a trapezoidal fuzzy number (a1, a2, a3, a4); if a1 < a2 = a3 < a4, then A reduces to a triangular fuzzy number; if a1 = a2 = a3 = a4, then A reduces to a real number.

Figure 1: The generalized trapezoidal fuzzy number A.

(2) The operations of the generalized trapezoidal fuzzy numbers

Suppose that a = (a1, a2, a3, a4; wa) and b = (b1, b2, b3, b4; wb) are generalized trapezoidal fuzzy numbers; then the operational rules are as follows [20]:

(i) a ⊕ b = (a1 + b1, a2 + b2, a3 + b3, a4 + b4; min(wa, wb))      (2)

(ii) a ⊖ b = (a1 − b4, a2 − b3, a3 − b2, a4 − b1; min(wa, wb))      (3)

(iii) a ⊗ b = (a, b, c, d; min(wa, wb))      (4)

where a = min(a1 b1, a1 b4, a4 b1, a4 b4), b = min(a2 b2, a2 b3, a3 b2, a3 b3), c = max(a2 b2, a2 b3, a3 b2, a3 b3) and d = max(a1 b1, a1 b4, a4 b1, a4 b4). If a1, a2, a3, a4, b1, b2, b3, b4 are all positive numbers, then a ⊗ b = (a1 b1, a2 b2, a3 b3, a4 b4; min(wa, wb)).

(iv) a ⊘ b = (a1/b4, a2/b3, a3/b2, a4/b1; min(wa, wb))      (5)

(3) The centre of gravity (COG) point of the generalized trapezoidal fuzzy numbers

Chen and Chen [21] proposed the concept of the COG point of the generalized trapezoidal fuzzy numbers. Suppose that the COG point of a = (a1, a2, a3, a4; wa) is (xa, ya); then [21]:

  ya = wa ((a3 − a2)/(a4 − a1) + 2) / 6,  if a1 ≠ a4
  ya = wa / 2,                            if a1 = a4          (6)
  xa = (ya (a2 + a3) + (a4 + a1)(wa − ya)) / (2 wa)

2.2 The interval-valued trapezoidal fuzzy numbers

Suppose that A = [A^L, A^U] = [(a1^L, a2^L, a3^L, a4^L; wA^L), (a1^U, a2^U, a3^U, a4^U; wA^U)], where 0 ≤ a1^L ≤ a2^L ≤ a3^L ≤ a4^L ≤ 1, 0 ≤ a1^U ≤ a2^U ≤ a3^U ≤ a4^U ≤ 1, 0 ≤ wA^L ≤ wA^U ≤ 1 and A^L ⊂ A^U; then A is called a generalized interval-valued trapezoidal fuzzy number. For two such numbers A = [A^L, A^U] and B = [B^L, B^U], the operations A ⊕ B, A ⊖ B, A ⊗ B and A ⊘ B (formulas (7)-(10), with division requiring B > 0) are obtained by applying the rules (2)-(5) to the lower trapezoids and to the upper trapezoids separately, and the scalar multiple is λA = [(λ a1^L, λ a2^L, λ a3^L, λ a4^L; wA^L), (λ a1^U, λ a2^U, λ a3^U, λ a4^U; wA^U)] for λ > 0 (11).

2.3 The distance of the interval-valued trapezoidal fuzzy numbers

Suppose that A = [A^L, A^U] and B = [B^L, B^U] are any two generalized interval-valued trapezoidal fuzzy numbers; the distance between A and B is calculated as follows:

(1) Utilize formula (6) to calculate the COG points (xA^L, yA^L), (xA^U, yA^U), (xB^L, yB^L), (xB^U, yB^U) of the generalized trapezoidal fuzzy numbers A^L, A^U, B^L, B^U respectively.

(2) The distance of the two interval-valued trapezoidal fuzzy numbers is:

  d(A, B) = sqrt( [ (xA^L − xB^L)² + (yA^L − yB^L)² + (xA^U − xB^U)² + (yA^U − yB^U)² ] / 4 )      (12)

where d(A, B) satisfies the following properties:
(i) if A and B are normalized interval-valued trapezoidal fuzzy numbers, then 0 ≤ d(A, B) ≤ 1;
(ii) A = B ⇔ d(A, B) = 0;
(iii) d(A, B) = d(B, A);
(iv) d(A, C) + d(C, B) ≥ d(A, B).

Obviously, properties (i) and (iii) are satisfied. For property (ii): if A = B, then d(A, B) = 0; conversely, if d(A, B) = 0, then the COG of A is equal to that of B, so we approximately take A = B.
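As a concrete check of formulas (6) and (12), the COG and distance computations can be sketched in Python (a minimal illustration; the function names `cog` and `distance` and the tuple encoding are our own, not from the paper):

```python
import math

def cog(a1, a2, a3, a4, w):
    """COG point (x, y) of a generalized trapezoidal fuzzy number
    (a1, a2, a3, a4; w), per Chen and Chen's formula (6)."""
    y = w / 2.0 if a1 == a4 else w * ((a3 - a2) / (a4 - a1) + 2.0) / 6.0
    x = (y * (a2 + a3) + (a4 + a1) * (w - y)) / (2.0 * w)
    return x, y

def distance(A, B):
    """COG-based distance (formula (12)) between two interval-valued
    trapezoidal fuzzy numbers, each given as a (lower, upper) pair of
    (a1, a2, a3, a4, w) tuples."""
    s = 0.0
    for part in (0, 1):                  # lower, then upper trapezoid
        xa, ya = cog(*A[part])
        xb, yb = cog(*B[part])
        s += (xa - xb) ** 2 + (ya - yb) ** 2
    return math.sqrt(s / 4.0)
```

For a symmetric trapezoid such as (0, 1, 2, 3; 1), the COG x-coordinate is the midpoint 1.5; the distance of a number to itself is 0 and the distance is symmetric, matching properties (ii) and (iii).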
For property (iv): to simplify the notation, write α1 = xA^L, α2 = xB^L, α3 = xC^L, β1 = yA^L, β2 = yB^L, β3 = yC^L, γ1 = xA^U, γ2 = xB^U, γ3 = xC^U, φ1 = yA^U, φ2 = yB^U, φ3 = yC^U, and set u = (α1 − α2, β1 − β2, γ1 − γ2, φ1 − φ2) and v = (α2 − α3, β2 − β3, γ2 − γ3, φ2 − φ3). By formula (12), 2 d(A, B) = ||u||, 2 d(B, C) = ||v|| and 2 d(A, C) = ||u + v||, where ||·|| is the Euclidean norm on R⁴. Then

  (d(A, B) + d(B, C))² − d²(A, C) = [ (||u|| + ||v||)² − ||u + v||² ] / 4 = [ ||u|| ||v|| − ⟨u, v⟩ ] / 2 ≥ 0

by the Cauchy–Schwarz inequality, and therefore d(A, B) + d(B, C) ≥ d(A, C).

2.4 Utilizing the interval-valued trapezoidal fuzzy numbers to represent the linguistic terms

In the real decision-making process, it is difficult for the decision makers to give the attribute values and weights directly in the form of generalized interval-valued trapezoidal fuzzy numbers, so the form of linguistic terms is usually adopted instead. Wei and Chen [18] utilize the interval-valued trapezoidal fuzzy numbers to represent a 9-member linguistic term set
(shown in Table 1).

Table 1: A 9-member interval linguistic term set.

linguistic terms (attribute values) | linguistic terms (weights) | generalized interval-valued trapezoidal fuzzy numbers
Absolutely-poor (AP)  | Absolutely-low (AL)  | [(0.00, 0.00, 0.00, 0.00; 0.8), (0.00, 0.00, 0.00, 0.00; 1.0)]
Very-poor (VP)        | Very-low (VL)        | [(0.00, 0.00, 0.02, 0.07; 0.8), (0.00, 0.00, 0.02, 0.07; 1.0)]
Poor (P)              | Low (L)              | [(0.04, 0.10, 0.18, 0.23; 0.8), (0.04, 0.10, 0.18, 0.23; 1.0)]
Medium-poor (MP)      | Medium-low (ML)      | [(0.17, 0.22, 0.36, 0.42; 0.8), (0.17, 0.22, 0.36, 0.42; 1.0)]
Medium (F)            | Medium (M)           | [(0.32, 0.41, 0.58, 0.65; 0.8), (0.32, 0.41, 0.58, 0.65; 1.0)]
Medium-good (MG)      | Medium-high (MH)     | [(0.58, 0.63, 0.80, 0.86; 0.8), (0.58, 0.63, 0.80, 0.86; 1.0)]
Good (G)              | High (H)             | [(0.72, 0.78, 0.92, 0.97; 0.8), (0.72, 0.78, 0.92, 0.97; 1.0)]
Very-good (VG)        | Very-high (VH)       | [(0.93, 0.98, 1.00, 1.00; 0.8), (0.93, 0.98, 1.00, 1.00; 1.0)]
Absolutely-good (AG)  | Absolutely-high (AH) | [(1.00, 1.00, 1.00, 1.00; 0.8), (1.00, 1.00, 1.00, 1.00; 1.0)]

3 Group decision making method

3.1 The description of the decision making problems

Let E = {e1, e2, ..., eq} be the set of decision makers in the group decision making, A = {A1, A2, ..., Am} the set of alternatives, and C = {C1, C2, ..., Cn} the set of attributes. Suppose that aijk = [(aijk1^L, aijk2^L, aijk3^L, aijk4^L; wijk^L), (aijk1^U, aijk2^U, aijk3^U, aijk4^U; wijk^U)] is the attribute value given by decision maker ek, where aijk is an interval-valued trapezoidal fuzzy number for alternative Ai with respect to attribute Cj, and ωjk = [(ωjk1^L, ωjk2^L, ωjk3^L, ωjk4^L; vjk^L), (ωjk1^U, ωjk2^U, ωjk3^U, ωjk4^U; vjk^U)] is the attribute weight given by decision maker ek, where ωjk is also an interval-valued trapezoidal fuzzy number. Let λ = (λ1, λ2, ..., λq) be the weight vector of the decision makers, where each λk is a real number and the λk sum to 1. We then use the attribute weights, the decision makers' weights and the attribute values to rank the alternatives.
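Table 1 can be transcribed directly into a lookup table; the sketch below (Python, with our own names `SCALE` and `WEIGHT_SCALE`) stores each linguistic term as a (lower, upper) pair of (a1, a2, a3, a4, w) tuples:

```python
# Transcription of Table 1: each linguistic term maps to an
# interval-valued trapezoidal fuzzy number whose lower and upper
# trapezoids share the endpoints and have heights 0.8 and 1.0.
def _iv(a1, a2, a3, a4):
    return ((a1, a2, a3, a4, 0.8), (a1, a2, a3, a4, 1.0))

SCALE = {
    "AP": _iv(0.00, 0.00, 0.00, 0.00), "VP": _iv(0.00, 0.00, 0.02, 0.07),
    "P":  _iv(0.04, 0.10, 0.18, 0.23), "MP": _iv(0.17, 0.22, 0.36, 0.42),
    "F":  _iv(0.32, 0.41, 0.58, 0.65), "MG": _iv(0.58, 0.63, 0.80, 0.86),
    "G":  _iv(0.72, 0.78, 0.92, 0.97), "VG": _iv(0.93, 0.98, 1.00, 1.00),
    "AG": _iv(1.00, 1.00, 1.00, 1.00),
}
# The weight terms share the same numeric values, in the same order.
WEIGHT_SCALE = dict(zip(
    ["AL", "VL", "L", "ML", "M", "MH", "H", "VH", "AH"], SCALE.values()))
```

Looking up, e.g., `SCALE["VG"]` or `WEIGHT_SCALE["VH"]` returns the pair [(0.93, 0.98, 1.00, 1.00; 0.8), (0.93, 0.98, 1.00, 1.00; 1.0)] from the last-but-one row of Table 1.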
3.2 Normalize the decision-making information

We need to normalize the decision-making information in order to eliminate the impact of different physical dimensions on the decision-making result. There are generally benefit attributes (I1) and cost attributes (I2). The normalization method is as follows.

For benefit attributes (j ∈ I1), with mjk = max_i(aijk4^U):

  xijk = [(aijk1^L/mjk, aijk2^L/mjk, aijk3^L/mjk, aijk4^L/mjk; wijk^L), (aijk1^U/mjk, aijk2^U/mjk, aijk3^U/mjk, aijk4^U/mjk; wijk^U)]      (13)

For cost attributes (j ∈ I2), with njk = min_i(aijk1^L):

  xijk = [(njk/aijk4^L, njk/aijk3^L, njk/aijk2^L, njk/aijk1^L; wijk^L), (njk/aijk4^U, njk/aijk3^U, njk/aijk2^U, njk/aijk1^U; wijk^U)]      (14)

3.3 Combine the evaluation information of each decision maker

The collective attribute values and weights are obtained from the attribute values and weights given by the different experts for the different alternatives under each attribute, weighted by the decision makers' weights:

  xij = Σ_{k=1}^{q} (λk xijk) = [(Σ_k λk xijk1^L, Σ_k λk xijk2^L, Σ_k λk xijk3^L, Σ_k λk xijk4^L; min_k(wijk^L)), (Σ_k λk xijk1^U, Σ_k λk xijk2^U, Σ_k λk xijk3^U, Σ_k λk xijk4^U; min_k(wijk^U))]      (15)

  ωj = Σ_{k=1}^{q} (λk ωjk) = [(Σ_k λk ωjk1^L, Σ_k λk ωjk2^L, Σ_k λk ωjk3^L, Σ_k λk ωjk4^L; min_k(vjk^L)), (Σ_k λk ωjk1^U, Σ_k λk ωjk2^U, Σ_k λk ωjk3^U, Σ_k λk ωjk4^U; min_k(vjk^U))]      (16)
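Formulas (13)-(16) can be sketched as follows (a minimal Python illustration; the names `normalize` and `combine` and the tuple encoding are our own assumptions):

```python
def normalize(column, benefit=True):
    """Normalize one attribute column (formulas (13)/(14)). `column`
    holds, per alternative, a (lower, upper) pair of
    (a1, a2, a3, a4, w) tuples."""
    if benefit:
        m = max(iv[1][3] for iv in column)      # max of the upper a4
        scale = lambda t: (t[0]/m, t[1]/m, t[2]/m, t[3]/m, t[4])
    else:
        n = min(iv[0][0] for iv in column)      # min of the lower a1
        scale = lambda t: (n/t[3], n/t[2], n/t[1], n/t[0], t[4])
    return [(scale(lo), scale(up)) for lo, up in column]

def combine(values, lam):
    """Aggregate one normalized value per decision maker into a
    collective value (formula (15)): weighted sum of the endpoints,
    minimum of the heights."""
    out = []
    for part in (0, 1):                         # lower, then upper
        ends = [sum(l * v[part][i] for l, v in zip(lam, values))
                for i in range(4)]
        w = min(v[part][4] for v in values)
        out.append((*ends, w))
    return tuple(out)
```

The same `combine` applies unchanged to the attribute weights (formula (16)), since both are interval-valued trapezoidal fuzzy numbers.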
3.4 Construct the weighted matrix

Let V = [vij]m×n be the weighted matrix, where

  vij = xij ⊗ ωj = [(xij1^L ωj1^L, xij2^L ωj2^L, xij3^L ωj3^L, xij4^L ωj4^L; min(wij^L, vj^L)), (xij1^U ωj1^U, xij2^U ωj2^U, xij3^U ωj3^U, xij4^U ωj4^U; min(wij^U, vj^U))]      (17)

3.5 The extended TOPSIS decision making method

(1) Determine the positive ideal solution V⁺ = [vj⁺]1×n and the negative ideal solution V⁻ = [vj⁻]1×n of the evaluation objects:

  vj⁺ = [(max_i(vij1^L), max_i(vij2^L), max_i(vij3^L), max_i(vij4^L); max_i(wij^L)), (max_i(vij1^U), max_i(vij2^U), max_i(vij3^U), max_i(vij4^U); max_i(wij^U))]      (18)

  vj⁻ = [(min_i(vij1^L), min_i(vij2^L), min_i(vij3^L), min_i(vij4^L); min_i(wij^L)), (min_i(vij1^U), min_i(vij2^U), min_i(vij3^U), min_i(vij4^U); min_i(wij^U))]      (19)

(2) Calculate the COG of each element of the weighted matrix and of the ideal solutions: based on formula (6), calculate the COG coordinates [(y, x)V] of each element in the weighted matrix and the COG coordinates [(y, x)V⁺] and [(y, x)V⁻] of each element of the positive ideal solution and the negative ideal solution.

(3) Calculate the weighted distance between each alternative Ai and the ideal solution V⁺ and the negative ideal solution V⁻:

  di⁺ = sqrt( Σ_{j=1}^{n} [d(vij, vj⁺)]² )      (20)

  di⁻ = sqrt( Σ_{j=1}^{n} [d(vij, vj⁻)]² )      (21)

where i = 1, 2, ..., m and j = 1, 2, ..., n.

(4) Calculate the relative closeness coefficient:

  Ci = di⁻ / (di⁺ + di⁻)      (22)

(5) Rank the alternatives. We rank the alternatives based on the relative closeness coefficient: the bigger the relative closeness coefficient, the better the alternative, and vice versa.

4 Illustrative example

Suppose that a telecommunication company intends to choose a manager for the R&D department from four volunteers named A1, A2, A3 and A4. The decision-making committee assesses the four volunteers based on five attributes: (1) the proficiency in identifying research areas (C1), (2) the proficiency in administration (C2), (3) the personality (C3), (4) the past experience (C4) and (5) the self-confidence (C5). The committee has three members, labeled DM1, DM2 and DM3, respectively, and the weight vector of the decision makers is λ = (1/3, 1/3, 1/3). Each decision maker utilizes the linguistic terms to assess the importance of each attribute and to evaluate the four volunteers; the evaluation information is shown in Tables 2, 3, 4 and 5, respectively [17].

Table 2: The attribute weights given by the three DMs.

       C1   C2   C3   C4   C5
DM1    VH   H    H    VH   M
DM2    VH   H    MH   H    MH
DM3    VH   MH   MH   VH   M

Table 3: The evaluation information given by DM1.

       C1   C2   C3   C4   C5
A1     VG   VG   VG   VG   VG
A2     G    VG   VG   VG   MG
A3     VG   MG   G    G    G
A4     G    F    F    G    MG

Table 4: The evaluation information given by DM2.

       C1   C2   C3   C4   C5
A1     G    MG   G    G    VG
A2     G    VG   VG   VG   MG
A3     G    G    MG   VG   G
A4     VG   F    MG   F    G

Table 5: The evaluation information given by DM3.

       C1   C2   C3   C4   C5
A1     MG   F    G    VG   VG
A2     MG   MG   G    MG   G
A3     VG   VG   VG   VG   MG
A4     MG   VG   MG   VG   F

The decision-making steps are shown as follows:

(1) Convert the linguistic terms into interval-valued trapezoidal fuzzy numbers [aijk] by applying the Table 1 correspondence entry by entry to Tables 3-5; for example, every VG entry becomes [(0.93, 0.98, 1.00, 1.00; 0.8), (0.93, 0.98, 1.00, 1.00; 1.0)].
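Formulas (17)-(19) of Sections 3.4 and 3.5 amount to component-wise products and component-wise extrema; a minimal sketch (our own naming, not the paper's code):

```python
def weighted_value(x, w):
    """Formula (17): component-wise product of a normalized value and
    a weight, taking the minimum of the two heights."""
    return tuple(
        (x[p][0]*w[p][0], x[p][1]*w[p][1], x[p][2]*w[p][2],
         x[p][3]*w[p][3], min(x[p][4], w[p][4]))
        for p in (0, 1))

def ideal(column, positive=True):
    """Formulas (18)/(19): component-wise max (positive ideal) or min
    (negative ideal) over one column of the weighted matrix."""
    agg = max if positive else min
    return tuple(
        tuple(agg(v[p][i] for v in column) for i in range(5))
        for p in (0, 1))
```

Applied to the example's first column, `weighted_value` multiplies the collective value of A1 under C1 by the collective weight of C1 (0.743 x 0.930 is approximately 0.691, as printed below), and `ideal` picks the column-wise extremes that form V+ and V-.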
.6 3,0.8 0 .6 3,0.8 0 .98,1 .00 .63,0.80 4 1,0.58 63,0.80 9 8 ,1 .0 0 9 8,1.00 78,0.92 78,0.92 9 8,1.00 63,0.80 9 8 ,1 .0 0 63,0.80 9 8,1.0 0, 9 8 ,1 .0 0 , 9 8,1.0 0, 7 8,0.9 2, 6 3,0.8 0, 4 1,0.58, ,0.8 6 ,0.8 6 ,1 .0 0 ,0.8 6 ,0.6 5 ,0.8 6 ,1 .0 0 ,1 .0 0 ,0.9 7 ,0.9 7 ,1 .0 0 ,0.8 6 ,1 .0 0 ,0.8 6 ,1 .0 0 ,1 .0 0 ,1 .0 0 , 0.97, 0.86, 0.65, ;1 .00)] ;1 .00)] ;1 .00)] ;1 .00)] 1 .00)] 1 .00)] 1 .00)] 1 .00)] 1 .00)] 1 .00)] 1 .00)] 1 .00)] 1 .00)] 1 .00)] 1 .00)] 1 .00)] 1.0)] 1.0)] 1.0)] 1.0)] (2) Combine the individual preferences in order to obtain a collective preference value of each alternative: [ [(0.743,0.797,0.907,0.943;0.800),(0.743,0.797,0.907,0.943;1.000)] [(0.673,0.730,0.880,0.933;0.800),(0.673,0.730,0.880,0.933;1.000)] [(0.860,0.913,0.973,0.990;0.800),(0.860,0.913,0.973,0.990;1.000)] [(0.743,0.797,0.907,0.943;0.800),(0.743,0.797,0.907,0.943;1.000)] [(0.610,0.673,0.793,0.837;0.800),(0.610,0.673,0.793,0.837;1.000)] [(0.813,0.863,0.933,0.953;0.800),(0.813,0.863,0.933,0.953;1.000)] [(0.743,0.797,0.907,0.943 ;0.800),(0.743,0.797,0.907,0.943;1.000)] [(0.523,0.600,0.720,0.767;0.800),(0.523,0.600,0.720,0.767;1.000)] [(0.790,0.847,0.947,0.980;0.800),(0.790,0.847,0.947,0.980;1.000)] [(0.860,0.913,0.973,0.990;0.800),(0.860,0.913,0.973,0.990;1.000)] [(0.743,0.797,0.907,0.943 ;0.800),(0.743,0.797,0.907,0.943;1.000)] [(0.493,0.557,0.727,0.790;0.800),(0.493,0.557,0.727,0.790;1.000)] [(0.860,0.913,0.973,0.990;0.800),(0.860,0.913,0.973,0.990;1.000)] [(0.813,0.863,0.933,0.953;0.800),(0.813,0.863,0.933,0.953;1.000)] [(0.860,0.9 1 3,0.973,0.990;0.800),(0.860,0.9 1 3,0.973,0.990;1.0 00)] [(0.657,0.723,0.833,0.873;0.800),(0.657,0.723,0.833,0.873;1.000)] [(0.930,0.980,1 .000,1 .000;0.800),(0.930,0.980,1 .000,1 .000;1 .000)] [(0.627,0.680,0.840,0.897;0.800),(0.627,0.680,0.840,0.897;1.000)] [(0.673,0.730,0.880,0.933;0.800),(0.673,0.730,0.880,0.933;1.000)] [(0.540,0.607,0.767,0.827;0.800),(0.540,0.607,0.767,0.827;1.000)] 
The attribute weight vector is:
[[(0.930,0.980,1.000,1.000;0.800),(0.930,0.980,1.000,1.000;1.000)], [(0.627,0.680,0.840,0.897;0.800),(0.627,0.680,0.840,0.897;1.000)], [(0.627,0.680,0.840,0.897;0.800),(0.627,0.680,0.840,0.897;1.000)], [(0.930,0.980,1.000,1.000;0.800),(0.930,0.980,1.000,1.000;1.000)], [(0.320,0.410,0.580,0.650;0.800),(0.320,0.410,0.580,0.650;1.000)]]
(3) Calculate the weighted decision making matrix:
[(0.691,0.781,0.907,0.943;0.800),(0.691,0.781,0.907,0.943;1.000)]
[(0.626,0.715,0.880,0.933;0.800),(0.626,0.715,0.880,0.933;1.000)]
[(0.800,0.895,0.973,0.990;0.800),(0.800,0.895,0.973,0.990;1.000)]
[(0.691,0.781,0.907,0.943;0.800),(0.691,0.781,0.907,0.943;1.000)]
[(0.382,0.458,0.666,0.750;0.800),(0.382,0.458,0.666,0.750;1.000)]
[(0.510,0.587,0.784,0.855;0.800),(0.510,0.587,0.784,0.855;1.000)]
[(0.466,0.542,0.762,0.846;0.800),(0.466,0.542,0.762,0.846;1.000)]
[(0.328,0.408,0.605,0.687;0.800),(0.328,0.408,0.605,0.687;1.000)]
[(0.495,0.576,0.795,0.879;0.800),(0.495,0.576,0.795,0.879;1.000)]
[(0.539,0.621,0.818,0.888;0.800),(0.539,0.621,0.818,0.888;1.000)]
[(0.466,0.542,0.762,0.846;0.800),(0.466,0.542,0.762,0.846;1.000)]
[(0.309,0.379,0.610,0.708;0.800),(0.309,0.379,0.610,0.708;1.000)]
[(0.800,0.895,0.973,0.990;0.800),(0.800,0.895,0.973,0.990;1.000)]
[(0.756,0.846,0.933,0.953;0.800),(0.756,0.846,0.933,0.953;1.000)]
[(0.800,0.895,0.973,0.990;0.800),(0.800,0.895,0.973,0.990;1.000)]
[(0.611,0.709,0.833,0.873;0.800),(0.611,0.709,0.833,0.873;1.000)]
[(0.298,0.402,0.580,0.650;0.800),(0.298,0.402,0.580,0.650;1.000)]
[(0.201,0.279,0.487,0.583;0.800),(0.201,0.279,0.487,0.583;1.000)]
[(0.215,0.299,0.510,0.607;0.800),(0.215,0.299,0.510,0.607;1.000)]
[(0.173,0.249,0.445,0.537;0.800),(0.173,0.249,0.445,0.537;1.000)]
(4) Determine the positive ideal solution and the negative ideal solution:
[[(0.800,0.895,0.973,0.990;0.800),(0.800,0.895,0.973,0.990;1.000)],
[(0.510,0.587,0.784,0.855;0.800),(0.510,0.587,0.784,0.855;1.000)], [(0.539,0.621,0.818,0.888;0.800),(0.539,0.621,0.818,0.888;1.000)], [(0.800,0.895,0.973,0.990;0.800),(0.800,0.895,0.973,0.990;1.000)], [(0.298,0.402,0.580,0.650;0.800),(0.298,0.402,0.580,0.650;1.000)]]
[[(0.626,0.715,0.880,0.933;0.800),(0.626,0.715,0.880,0.933;1.000)], [(0.328,0.408,0.605,0.687;0.800),(0.328,0.408,0.605,0.687;1.000)], [(0.309,0.379,0.610,0.708;0.800),(0.309,0.379,0.610,0.708;1.000)], [(0.611,0.709,0.833,0.873;0.800),(0.611,0.709,0.833,0.873;1.000)], [(0.173,0.249,0.445,0.537;0.800),(0.173,0.249,0.445,0.537;1.000)]]
(5) Calculate the weighted matrix and the COG of each attribute with respect to the positive ideal solution and the negative ideal solution, with entries of the form (y, x):
V = [(y, x)_ij]_{4x5} =
[[(0.3333,0.8283),(0.4166,0.8283)], [(0.3422,0.5645),(0.4278,0.5645)], [(0.3429,0.6863),(0.4287,0.6863)], [(0.3215,0.9107),(0.4019,0.9107)], [(0.3341,0.4809),(0.4176,0.4809)]]
[[(0.3381,0.7873),(0.4227,0.7873)], [(0.3427,0.6837),(0.4284,0.6837)], [(0.3418,0.7159),(0.4273,0.7159)], [(0.3258,0.8691),(0.4072,0.8691)], [(0.3393,0.3880),(0.4242,0.3880)]]
[[(0.3215,0.9107),(0.4019,0.9107)], [(0.3438,0.6540),(0.4298,0.6540)], [(0.3438,0.6540),(0.4298,0.6540)], [(0.3215,0.9107),(0.4019,0.9107)], [(0.3386,0.4084),(0.4233,0.4084)]]
[[(0.3333,0.8283),(0.4166,0.8283)], [(0.3397,0.5071),(0.4246,0.5071)], [(0.3441,0.5026),(0.4301,0.5026)], [(0.3299,0.7540),(0.4123,0.7540)], [(0.3383,0.3515),(0.4229,0.3515)]]
V+ = [[(0.3215,0.9107),(0.4019,0.9107)], [(0.3427,0.6837),(0.4284,0.6837)], [(0.3418,0.7159),(0.4273,0.7159)], [(0.3215,0.9107),(0.4019,0.9107)], [(0.3341,0.4809),(0.4176,0.4809)]]
V- = [[(0.3381,0.7873),(0.4227,0.7873)], [(0.3397,0.5071),(0.4246,0.5071)], [(0.3441,0.5026),(0.4301,0.5026)], [(0.3299,0.7540),(0.4123,0.7540)], [(0.3383,0.3515),(0.4229,0.3515)]]
(6) Calculate the weighted distance between each project Ai and the ideal solution and the negative ideal solution:
d+ = (0.1050, 0.1140, 0.0707, 0.2500)
d- = (0.2002,
0.2136, 0.2097, 0.0292)
(7) Calculate the relative closeness coefficient: C = (0.6560, 0.6520, 0.7479, 0.1046).
(8) Rank the alternatives: based on the relative closeness coefficients, the order of the alternatives is a3 > a1 > a2 > a4.
(9) Analysis: In this example, our approach produces the same ranking as the literature [17], which supports the effectiveness of the approach presented in this paper. Compared with [7-11], all of which utilize the TOPSIS method to deal with decision making problems under a fuzzy information environment, the method in this paper can deal with more complex decision making problems under the generalized interval-valued trapezoidal fuzzy information environment. Compared with [17], this method solves the FMADM problem based on generalized interval-valued trapezoidal fuzzy information, whereas [17] solves it based on interval-valued triangular fuzzy numbers. Regarding the decision making method, [17] first uses the lower and upper limits of the interval-valued triangular fuzzy numbers to calculate the relative closeness coefficient based on the TOPSIS method, and then uses the mean closeness coefficient to rank the alternatives, so that method does not treat the interval-valued triangular fuzzy numbers as a whole; in this study, we propose an extended TOPSIS based on the definitions of the distance and the comparison method between generalized interval-valued trapezoidal fuzzy numbers. Compared with [18], both address decision making problems based on generalized interval-valued trapezoidal fuzzy numbers; however, [18] ranks the alternatives based on a similarity that is hard to calculate, whereas the method proposed in this paper is easy to compute.
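The closeness computation of steps (6)-(8) can be checked numerically. A minimal sketch, using the distance vectors d+ and d- reported in step (6); the closeness formula C_i = d_i^- / (d_i^+ + d_i^-) is the standard TOPSIS definition:

```python
# Relative closeness coefficient of each alternative (TOPSIS step 7),
# computed from the weighted distances of step (6).
d_plus = [0.1050, 0.1140, 0.0707, 0.2500]
d_minus = [0.2002, 0.2136, 0.2097, 0.0292]

closeness = [dm / (dp + dm) for dp, dm in zip(d_plus, d_minus)]

# Step (8): rank alternatives by decreasing closeness coefficient.
ranking = sorted(range(1, 5), key=lambda i: closeness[i - 1], reverse=True)

print([round(c, 4) for c in closeness])  # [0.656, 0.652, 0.7479, 0.1046]
print(ranking)                           # [3, 1, 2, 4], i.e. a3 > a1 > a2 > a4
```

The output reproduces the coefficients and the ranking a3 > a1 > a2 > a4 given in the example.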
5 Conclusion Fuzzy multiple attribute decision making (FMADM) is widely used in various areas. Generalized interval-valued trapezoidal fuzzy numbers can precisely express the attribute values and weights arising in the decision making process. This study proposes an extended TOPSIS method, together with its decision making steps, for solving MADM problems in which the attribute weights and values are given in the form of GIVTFN. The method is simple and easy to understand. It enriches and develops the theory and methods of FMADM, and it offers a new idea for solving FMADM problems. Acknowledgement This paper is supported by the Humanities and Social Sciences Research Project of the Ministry of Education of China (No. 09YJA630088) and the Natural Science Foundation of Shandong Province (No. ZR2009HL022). The authors would also like to express their appreciation to the anonymous reviewers for their very helpful comments on improving the paper. References [1] Bellman R.E., Zadeh L.A. (1970). Decision-making in a fuzzy environment, Management Science, 17(4), pp. B141-B164. [2] Hwang C.L., Yoon K. (1981). Multiple Attributes Decision Making Methods and Applications, Springer, Berlin Heidelberg. [3] Zavadskas E.K., Kaklauskas A., Turskis Z. (2009). Multi-Attribute Decision-Making Model by Applying Grey Numbers. Informatica, 20(2), pp. 305-320. [4] Liu P.D. (2009a). Multi-Attribute Decision-Making Method Research Based on Interval Vague Set and TOPSIS Method. Technological and Economic Development of Economy, 15(3), pp. 453-463. [5] Liu P.D. (2009b). A novel method for hybrid multiple attribute decision making. Knowledge-Based Systems, 22(5), pp. 388-391. [6] Xu Zeshui (2007). Methods for aggregating interval-valued intuitionistic fuzzy information and their application to decision making. Control and Decision, 22(2), pp. 215-219. [7] Jahanshahloo G.R., Lotfi F.H., Izadikhah M.
(2006). An algorithmic method to extend TOPSIS for decision-making problems with interval data, Applied Mathematics and Computation, 175(2), pp. 1375-1384. [8] Wang Ying-Ming, Elhag Taha M.S. (2006). Fuzzy TOPSIS method based on alpha level sets with an application to bridge risk assessment, Expert Systems with Applications, 31(2), pp. 309-319. [9] Liu Weijun, Zeng Ling (2008). A new TOPSIS method for fuzzy multiple attribute group decision making problem, Journal of Guilin University of Electronic Technology, 28(1), pp. 59-62. [10] Tsaur S.H., Chang T.Y., Yen C.H. (2002). The evaluation of airline service quality by fuzzy MCDM, Tourism Management, 23, pp. 107-115. [11] Chu T.C., Lin Y.C. (2003). A fuzzy TOPSIS method for robot selection, The International Journal of Advanced Manufacturing Technology, 21, pp. 284-290. [12] Gorzalczany M.B. (1987). A method of inference in approximate reasoning based on interval-valued fuzzy sets, Fuzzy Sets and Systems, 21(1), pp. 1-17. [13] Turksen I.B. (1996). Interval-valued strict preference with Zadeh triples, Fuzzy Sets and Systems, 78(2), pp. 183-195. [14] Wang G., Li X. (1998). The applications of interval-valued fuzzy numbers and interval-distribution numbers, Fuzzy Sets and Systems, 98(3), pp. 331-335. [15] Wang G., Li X. (2001). Correlation and information energy of interval-valued fuzzy numbers, Fuzzy Sets and Systems, 103(1), pp. 169-175. [16] Hong D.H., Lee S. (2002). Some algebraic properties and a distance measure for interval-valued fuzzy numbers, Information Sciences, 148(1), pp. 1-10. [17] Ashtiani Behzad, Haghighirad Farzad, Makui Ahmad, et al. (2009). Extension of fuzzy TOPSIS method based on interval-valued fuzzy sets, Applied Soft Computing, 9, pp. 457-461. [18] Wei Shih-Hua, Chen Shyi-Ming (2009). Fuzzy risk analysis based on interval-valued fuzzy numbers, Expert Systems with Applications, 36(2), pp. 2285-2299. [19] Chen S.H. (1985).
Ranking fuzzy numbers with maximizing set and minimizing set, Fuzzy Sets and Systems, 17(2), pp. 113-129. [20] Chen Shyi-Ming, Chen Jim-Ho (2009). Fuzzy risk analysis based on ranking generalized fuzzy numbers with different heights and different spreads, Expert Systems with Applications, 36(3), pp. 6833-6842. [21] Chen S.J., Chen S.M. (2003). A new method for handling multicriteria fuzzy decision-making problems using FN-IOWA operators, Cybernetics and Systems, 34(2), pp. 109-137. Optimal Decision Tree Based Multi-class Support Vector Machine Manju Bala and R. K. Agrawal School of Computer & Systems Sciences, Jawaharlal Nehru University, New Delhi-110 067, India Keywords: support vector machine, decision tree, class separability, information gain, Gini index and chi-squared, interclass scatter, intraclass scatter. Received: August 5, 2010 In this paper, a decision tree SVM architecture is constructed to solve multi-class problems. To maintain high generalization ability, the optimal structure of the decision tree is determined using statistical measures of class separability. The proposed optimal decision tree SVM (ODT-SVM) takes advantage of both the efficient computation of the decision tree architecture and the high classification accuracy of SVM. A robust non-parametric test is carried out for statistical comparison of the proposed ODT-SVM with other classifiers over multiple data sets. Performance is evaluated in terms of classification accuracy and computation time. The statistical analysis on UCI repository datasets indicates that the ten-fold cross-validation accuracy of our proposed framework is significantly better than that of widely used multi-class classifiers. Experimental results and statistical tests show that the proposed ODT-SVM is significantly better than conventional OvO and OAA in terms of both training and testing time. Povzetek: The decision tree method with SVM achieves significantly better results than the original SVM.
1 Introduction The Support Vector Machine (SVM) has proved to be a successful learning machine in the literature, especially for classification. SVM is based on the statistical learning theory developed by Vapnik [6, 25]. Since it was originally designed for binary classification [3], extending the binary SVM to multi-class problems is not straightforward. Constructing k-class SVMs (k > 2) is an ongoing research issue [1, 4]. Two approaches are suggested in the literature to solve multi-class SVM. One considers all data in one optimization [7]. The other decomposes the multi-class problem into a series of binary SVMs, such as "One-Against-All" (OAA) [25] and "One-versus-One" (OvO) [16]. It has been reported in the literature that both conventional OvO and OAA SVMs suffer from the problem of unclassifiable regions [19, 24]. To resolve the unclassifiable region in conventional OvO, a decision tree OvO SVM formulation was proposed [19]. Takahashi and Abe [24] proposed a class separability measure, i.e. the Euclidean distance between class centers, to construct a decision tree based OAA SVM that overcomes the unclassifiable region. In the literature, a large number of distance measures other than the Euclidean distance have been used to determine class separability, each with its own advantages and disadvantages. More realistic and effective statistical measures used in the literature are information gain, Gini index, chi-square, and scatter-matrix-based class separability in kernel-induced space. In this paper, we evaluate the performance, in terms of classification accuracy and computation time, of the proposed OvO ODT-SVM [17] and OAA ODT-SVM [18]. In both models, class separability is determined using statistical measures, i.e. information gain, Gini index, chi-square and scatter-matrix-based class separability in kernel-induced space. A robust non-parametric test is carried out for statistical comparison of the proposed ODT-SVM with other classifiers over multiple data sets.
In section 2 we briefly describe the basics of SVM. In section 3 we discuss the decision tree OvO and OAA SVM approaches. Section 4 presents our proposed ODT-SVM framework using four statistical class separability measures. In section 5, we discuss the theoretical analysis and empirical estimation of the training and testing time of both proposed schemes. The experimental results demonstrate the effectiveness of our ODT-SVMs in comparison to conventional OvO and OAA SVMs. Section 6 concludes. 2 Support vector machines The Support Vector Machine (SVM) is based on the statistical learning theory developed by Vapnik [6, 25]. It classifies data by determining a set of support vectors, which are members of the set of training inputs that outline a hyperplane in feature space [15]. Let {(x_1, y_1), ..., (x_n, y_n)} be a training set, with x_i ∈ R^d and y_i ∈ {-1, +1} the corresponding target class. The basic problem of training an SVM can be formulated as:
Maximize: L(α) = Σ_{i=1}^{n} α_i - (1/2) Σ_{i=1}^{n} Σ_{j=1}^{n} α_i α_j y_i y_j (x_i · x_j)
Subject to: Σ_{i=1}^{n} α_i y_i = 0 and α_i ≥ 0, i = 1, 2, ..., n (1)
The computation of dot products between vectors without explicitly mapping to another space is performed by a kernel function K(x_i, x_j). Use of a kernel function [22] enables the curse of dimensionality to be addressed, and the solution implicitly contains the support vectors that provide a description of the significant data for classification. Substituting K(x_i, x_j) for (x_i · x_j) in eqn. (1) produces a new optimization problem:
Maximize: L(α) = Σ_{i=1}^{n} α_i - (1/2) Σ_{i=1}^{n} Σ_{j=1}^{n} α_i α_j y_i y_j K(x_i, x_j)
Subject to: 0 ≤ α_i ≤ C, i = 1, ..., n and Σ_{i=1}^{n} α_i y_i = 0 (2)
Solving it for α gives a decision function of the form
f(x) = sign(Σ_{i=1}^{n} α_i y_i K(x_i, x) + b) (3)
whose decision boundary is a hyperplane in feature space, which translates to nonlinear boundaries in the original space. 3 Decision tree based SVM The most common way to build a multi-class SVM is by constructing and combining several binary classifiers [14].
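The decision function of eqn. (3) can be sketched directly in code. In the sketch below, the support vectors, coefficients α_i, labels and bias b are made-up toy values, not the output of an actual solver; the Gaussian kernel matches the form used later in the experiments (Table 2):

```python
import math

def gaussian_kernel(x, z, gamma=0.5):
    # K(x, z) = exp(-gamma * |x - z|^2); gamma is a hypothetical value.
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, z)))

def svm_decision(x, support_vectors, alphas, labels, b, kernel=gaussian_kernel):
    # f(x) = sign(sum_i alpha_i * y_i * K(x_i, x) + b), eqn. (3).
    s = sum(a * y * kernel(sv, x)
            for sv, a, y in zip(support_vectors, alphas, labels)) + b
    return 1 if s >= 0 else -1

# Toy example: one support vector per class (hypothetical values).
svs = [(0.0, 0.0), (2.0, 2.0)]
alphas = [1.0, 1.0]
labels = [1, -1]
print(svm_decision((0.1, 0.2), svs, alphas, labels, b=0.0))  # -> 1
```

A real solver would obtain the α_i and b from the dual problem (2); the sketch only illustrates how the kernel expansion replaces an explicit feature mapping.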
To solve multi-class classification problems, the whole classification problem is divided into a number of binary classification problems. The two representative ensemble schemes are OvO and OAA [21]. Conventional OvO SVM has the problem of unclassifiable regions. To resolve the unclassifiable region for OvO SVM, the Decision Directed Acyclic Graph (DDAG) SVM [19], based on decision tree OvO SVM, was proposed in the literature. The authors showed, with an example three-class problem, the existence of unclassifiable regions which can lead to degradation of the generalization ability of the classifier. In general, the unclassifiable region is visible and the generalization ability of the classifier is not good for a k-class problem with k > 2. In the DDAG OvO scheme [19], the VC dimension, the LOO error estimator and Joachims' ξα LOO measure were used for estimating the generalization ability of the pairwise classifier at each level of the decision tree. During training at the top node, the pair of classes with the highest generalization ability is selected from an initial list of classes; two lists are then generated by deleting one class of the pair from the initial list. If the separated classes include plural classes, the same procedure is repeated at the nodes connected to the top node for the two lists, until one class remains in the separated region. This means that after k - 1 steps just one class remains, which therefore becomes the prediction for the current test sample. Gjorgji et al. [13] proposed a binary tree architecture (SVM-BDT) that uses SVMs for making the binary decisions in the nodes, which takes advantage of both the efficient computation of the tree architecture and the high accuracy of SVMs. The hierarchy of binary decision subtasks using SVMs is designed with clustering algorithms. In the SVM-BDT scheme, the classes are divided into two disjoint groups g1 and g2 using the Euclidean distance as the distance measure. The two disjoint groups so obtained are then used to train an SVM classifier in the root node of the decision tree.
The classes from the first and second clustering groups are assigned to the left and right subtrees respectively. This process continues recursively until only one class is left in a group, which defines a leaf of the decision tree. Takahashi and Abe [24] proposed an OAA SVM based decision tree formulation to overcome the problem of unclassifiable regions and improve the generalization ability of SVM. They showed, with an example of unclassifiable regions for a three-class problem, how these regions can lead to degradation of the generalization ability of the classifier. In general, the unclassifiable region is visible and the generalization ability of the classifier is not good for a k-class problem with k > 2. In the scheme proposed by Takahashi and Abe [24], a hyperplane that separates one class from the others is determined during training at the top node. If the separated classes include plural classes, a hyperplane separating those classes is determined at the node connected to the top node. This process is repeated until one class remains in the separated region, which resolves the problem of unclassifiable regions that exists in OAA SVM. They proposed different types of decision trees based on a class separability measure, i.e. the Euclidean distance between class centers. 4 Proposed decision tree SVMs framework using statistical measures The Euclidean distance measure used in the construction of decision trees (i.e. OvO SVM-BDT and the Takahashi and Abe [24] OAA SVM formulation) does not take into account the within-class variability of patterns. Hence, it may not be suitable for measuring the separability between two different classes of patterns. To obtain a better picture of the overlap of the subspaces occupied by the individual classes, statistical distance measures, which constitute a natural concept of measuring class separability, are employed by the pattern recognition community.
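The Euclidean class-center measure criticized above can be sketched as follows; the toy class samples are hypothetical. Note that the measure depends only on the class means, so two class pairs with equal center distances but very different spreads score identically, which motivates the statistical measures of section 4.1:

```python
def class_center(points):
    # Mean vector of one class's training points.
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))

def center_distance(class_a, class_b):
    # Euclidean distance between class centers, as used by SVM-BDT and the
    # Takahashi-Abe decision trees; within-class spread is ignored entirely.
    ca, cb = class_center(class_a), class_center(class_b)
    return sum((a - b) ** 2 for a, b in zip(ca, cb)) ** 0.5

tight = [(0.9, 1.0), (1.1, 1.0)]   # compact class around (1, 1)
wide = [(3.0, 0.0), (5.0, 2.0)]    # spread-out class around (4, 1)
print(center_distance(tight, wide))  # -> 3.0
```

Replacing `wide` with an equally spread or an arbitrarily compact class around the same center leaves the score unchanged, illustrating the limitation.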
4.1 Statistical class separability measures Among statistical measures, information gain (IG) [20] is a measure based on entropy [23], which indicates the degree of disorder of a system. It measures the reduction in the weighted average impurity of the partitions, compared with the impurity of the complete set of samples, when the value of a specific attribute is known. Thus, the value of IG signifies how the whole system is related to an attribute. IG is calculated as
IG(C|E) = H(C) - H(C|E) (4)
where IG(C|E) is the information gain of the label C for a given attribute E, H(C) is the system's entropy and H(C|E) is the system's relative entropy when the value of the attribute E is known. The system's entropy indicates its degree of disorder and is given by
H(C) = -Σ_{i=1}^{k} p(C_i) log(p(C_i)) (5)
where p(C_i) is the probability of class C_i. The relative entropy is calculated as
H(C|E) = Σ_j p(e_j) (-Σ_{i=1}^{k} p(C_i|e_j) log p(C_i|e_j)) (6)
where p(e_j) is the probability of value j for attribute E and p(C_i|e_j) is the probability of C_i given e_j. The optimal binary SVM model is selected on the basis of the maximum value of IG, which signifies more separability between patterns belonging to the two different classes. IG for a given independent binary SVM, containing n_i elements of C_i and n_j elements of C_j, can be calculated as
IG(i,j) = H(C_j, C_i) - [p(C_j) H(t_p, f_p) + p(C_i) H(f_n, t_n)] (7)
where
H(x, y) = -(x/(x+y)) log(x/(x+y)) - (y/(x+y)) log(y/(x+y)) (8)
p(C_j) = n_j/(n_i + n_j) and p(C_i) = n_i/(n_i + n_j) (9)
and t_p, f_p, t_n and f_n denote the number of true positive, false positive, true negative and false negative data points respectively. A higher value of IG signifies less overlap, or more distance, between the two classes of data points. Hence, IG is a natural measure of the class separability of different classes of data points.
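Eqns. (7)-(9) can be sketched in a few lines; the split counts t_p, f_p, t_n, f_n below are hypothetical, and the mapping of counts to class sizes (n_j = t_p + f_n, n_i = f_p + t_n) is an assumption consistent with the definitions above. Base-2 logarithms are used:

```python
import math

def entropy2(x, y):
    # H(x, y) of eqn. (8); a zero count contributes zero by convention.
    total = x + y
    h = 0.0
    for c in (x, y):
        if c:
            p = c / total
            h -= p * math.log2(p)
    return h

def info_gain(tp, fp, tn, fn):
    # IG(i, j) = H(nj, ni) - [p(Cj)H(tp, fp) + p(Ci)H(fn, tn)], eqn. (7),
    # with the class priors p(Cj), p(Ci) of eqn. (9).
    nj, ni = tp + fn, fp + tn   # assumed class sizes from the counts
    n = ni + nj
    p_j, p_i = nj / n, ni / n
    return entropy2(nj, ni) - (p_j * entropy2(tp, fp) + p_i * entropy2(fn, tn))

print(info_gain(tp=40, fp=5, tn=45, fn=10))   # imperfect split: 0 < IG < 1
print(info_gain(tp=50, fp=0, tn=50, fn=0))    # perfect split: IG = 1.0
```

A hyperplane that separates the two classes perfectly attains the maximum gain, so selecting the class pair with the largest IG picks the most separable pair.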
Similarly, for every independent binary OAA SVM, assume there are two classes of data, C_i and C_{j≠i}, and the training set D contains n_i elements of C_i and n_j elements of class C_{j≠i}. IG for a given OAA SVM model i can be calculated as
IG(i) = H(C_i, C_{j≠i}) - [p(C_i) H(t_p, f_p) + p(C_{j≠i}) H(f_n, t_n)] (10)
The Gini index is another popular measure for feature selection in the field of data mining, proposed by Breiman et al. [5]. It measures the impurity of a given set of training data D and can be calculated as
Gini(D) = 1 - Σ_{i=1}^{2} (p(C_i))^2 (11)
For a binary split, a weighted sum of the impurity of each resulting partition is computed. The reduction in impurity that would be incurred by a particular binary split in a binary SVM between two classes C_j and C_i is calculated as
ΔGini(i,j) = Gini(D) - Gini_{ij}(D) (12)
where
Gini_{ij}(D) = p(C_j) Gini(L) + p(C_i) Gini(R) (13)
and Gini(L) is the Gini index on the left side of the hyperplane and Gini(R) is that on the right side. The OvO SVM model between the class pair (i_k, j_k) that maximizes the reduction in impurity (i.e. the Gini index) is selected as the splitting node of the decision tree SVM at a particular level. Similarly, for every independent binary OAA model, assume there are two classes of data, C_i and C_{j≠i}. The reduction in impurity that would be incurred by a particular binary split is given by
ΔGini(i) = Gini(D) - Gini_i(D) (14)
where
Gini_i(D) = p(C_i) Gini(L) + p(C_{j≠i}) Gini(R) (15)
Chi-square [9], another criterion used for binary splits in data mining and machine learning, is a statistical test in which the sampling distribution of the test statistic is a chi-square distribution when the null hypothesis is true. We are interested in determining whether a particular decision rule is useful or informative. In this case, the null hypothesis is a random rule that would place t_p data points from C_i and f_p data points from C_j independently in the left branch of the decision tree, and the remainder in the right branch.
The candidate decision rule differs significantly from the random rule if the proportions differ significantly from those given by the random rule. The chi-square statistic is given by
χ² = g(t_p, (t_p + f_p) p(C_j)) + g(f_n, (f_n + t_n) p(C_j)) + g(f_p, (t_p + f_p) p(C_i)) + g(t_n, (f_n + t_n) p(C_i)) (16)
where g(count, expect) = (count - expect)² / expect.
The higher the value of χ², the less likely it is that the null hypothesis is true. Thus, for a sufficiently high χ², the difference between the expected and the observed distributions is statistically significant; one can reject the null hypothesis and consider the candidate rule informative. Hence, the OvO SVM model for class pair (i_k, j_k), or the OAA SVM model i, that maximizes χ² is selected as the splitting node of the ODT-SVM at a particular level. To measure the class variability of patterns, the ratio of the inter-class and intra-class scatter in the kernel-induced feature space can also be used; it better depicts the physical relationship of the data in the input space and thereby provides high generalization ability of the decision tree based classifier. The scatter-matrix based measure (S) of the training set D in the original space [10] is defined as
S = tr(S_b) / tr(S_w) (17)
where S_b is the between-class scatter matrix and S_w is the within-class scatter matrix, defined as
S_b = (m_i - m_j)(m_i - m_j)^T, where m_i = (1/n_i) Σ_{x∈C_i} x and m_j = (1/n_j) Σ_{x∈C_j} x (18)
represent the mean vectors of the data from class C_i and class C_j respectively, and
S_w = Q_i + Q_j (19)
where Q_i and Q_j are given as
Q_i = (1/n_i) Σ_{x∈C_i} (x - m_i)(x - m_i)^T (20)
Q_j = (1/(n - n_i)) Σ_{x∉C_i} (x - m_j)(x - m_j)^T (21)
Using the kernel trick [7], the data points from C_i and C_j are implicitly mapped from R^d to a high dimensional feature space K. Let Φ(·): R^d → K denote the mapping and k_θ(x_i, x_j) = ⟨Φ(x_i), Φ(x_j)⟩ denote the kernel function, where θ is the set of kernel parameters and ⟨·,·⟩ is the inner product. K denotes the kernel matrix, and K_ij is defined as k_θ(x_i, x_j).
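The chi-square criterion of eqn. (16) can be sketched as follows; the split counts are hypothetical, and the class priors are taken from the counts as in eqn. (9):

```python
def g(count, expect):
    # g(count, expect) = (count - expect)^2 / expect, as in eqn. (16).
    return (count - expect) ** 2 / expect

def chi_square(tp, fp, tn, fn):
    # Eqn. (16): compare the observed split counts with those of a random
    # rule that distributes points according to the class priors.
    nj, ni = tp + fn, fp + tn   # assumed class sizes from the counts
    n = ni + nj
    p_j, p_i = nj / n, ni / n
    return (g(tp, (tp + fp) * p_j) + g(fn, (fn + tn) * p_j)
            + g(fp, (tp + fp) * p_i) + g(tn, (fn + tn) * p_i))

# A rule that matches the random split exactly scores 0 (uninformative);
# a cleaner separation scores higher.
print(chi_square(tp=25, fp=25, tn=25, fn=25))  # -> 0.0
print(chi_square(tp=45, fp=5, tn=45, fn=5))    # -> 64.0
```

As the text notes, the split (OvO class pair or OAA model) maximizing this statistic is the most informative candidate for the current tree node.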
Let K_{A,B} be the kernel matrix computed with the data points from A and B, which denote two subsets of the training sample set D. Let S_b^Φ and S_w^Φ denote the between-class scatter matrix and the within-class scatter matrix in K, respectively, defined as follows:
S_b^Φ = Σ_{i=1}^{2} n_i (m_i^Φ - m^Φ)(m_i^Φ - m^Φ)^T (22)
S_w^Φ = Σ_{i=1}^{2} Σ_{x∈D_i} (Φ(x) - m_i^Φ)(Φ(x) - m_i^Φ)^T (23)
where m_i^Φ denotes the mean of the training data points from D_i and m^Φ is the mean of all the training data points in K. Let F be a vector whose elements are all 1; its size is decided by the context. Then
m_i^{ΦT} m_i^Φ = n_i^{-2} · F^T K_{D_i,D_i} F (24)
m^{ΦT} m^Φ = n^{-2} · F^T K_{D,D} F (25)
m_i^{ΦT} m^Φ = n_i^{-1} n^{-1} · F^T K_{D_i,D} F (26)
tr(S_b^Φ) = Σ_{i=1}^{2} n_i (F^T K_{D_i,D_i} F / n_i² - 2 F^T K_{D_i,D} F / (n_i n) + F^T K_{D,D} F / n²) (27)
tr(S_w^Φ) = Σ_{i=1}^{2} (Σ_{x∈D_i} k_θ(x, x) - F^T K_{D_i,D_i} F / n_i) (28)
Now the class separability (SC) in the feature space K is obtained as
SC = tr(S_b^Φ) / tr(S_w^Φ) (29)
4.2 Algorithm for construction of ODT-SVM For the construction of the optimal decision tree SVM model (ODT-SVM), one of the class separability measures can be used to determine the structure of the decision tree. For illustration, we consider information gain in the proposed optimal decision tree algorithm. The outline of OvO ODT-SVM using the IG class separability measure for a k-class problem is given below:
1 Generate the initial list {C_1, ..., C_k}.
2 Calculate H(C_j, C_i) using eqn. (8) for i, j = 1, ..., k and i ≠ j.
3 Calculate H(t_p, f_p), H(f_n, t_n), p(C_i) and p(C_j) using eqn. (8) and eqn. (9) respectively.
4 Compute IG(i, j) using eqn. (7).
5 Determine the class pair (i, j) for which IG(i, j) takes the maximum value over the list. If the data points belong to class C_i, then delete C_j from the list; otherwise delete class C_i.
6 If more than one class remains, repeat steps 2-5; otherwise terminate the algorithm.
Similar computational steps are followed for the other three measures to determine the structure of OvO ODT-SVM. Similarly, the outline of OAA ODT-SVM using the IG class separability measure for a k-class problem is given below:
1 Generate the initial list {C_1, ..., C_k}.
2 Calculate H(C_i, C_{j≠i}) using eqn. (8) for i = 1, ..., k.
3 Calculate H(t_p, f_p), H(f_n, t_n), p(C_i) and p(C_{j≠i}) using eqn. (8) and eqn. (9) respectively.
4 Compute IG(i) using eqn. (10).
5 Determine the model i for which IG(i) takes the maximum value over the list.
6 If the remaining set is still multi-class, repeat steps 2-5; otherwise terminate the algorithm.
Similar computational steps are followed for the other three measures to determine the structure of OAA ODT-SVM. 4.3 Time complexity In order to compute the time complexity of the training phase of OvO SVM, we assume without loss of generality that the number of data points in each class is approximately the same, i.e. n/k. To solve a k-class problem using conventional OvO, k(k-1)/2 binary SVM classifiers are developed. Assuming the time complexity of building an SVM with n data points and d features is O(n²d), it can be shown that the training time of conventional OvO is
T^train_OvO = (k(k-1)/2)(2n/k)² d = O(n²d)
In the worst case, the decision tree generated in SVM-BDT is skewed, if the classes in the two groups at every level are divided into uneven sizes. Under the assumption that group g1 contains only one class and group g2 the remaining classes, or vice versa, the decision tree so generated is of depth (k-1) for a k-class problem. Hence, the training time of SVM-BDT is given by
T^train_SVM-BDT-worst = n²d + (n(k-1)/k)² d + ... + (2n/k)² d = O(n²dk) (30)
In the SVM-BDT approach under the best case, the classes in the two groups at every level are divided into approximately equal sizes. The decision tree so generated is almost height balanced, with maximum depth ⌈log(k)⌉. The number of nodes in the decision tree at depth i is 2^{i-1}, each containing n/2^{i-1} data points. Hence, the training time of SVM-BDT in the best case is given by
T^train_SVM-BDT-best = n²d + 2(n/2)² d + 4(n/4)² d + ... = O(n²d) (31)
In general, however, the structure of the OvO decision tree generated using statistical measures is almost height balanced, with maximum depth ⌈log(k)⌉. There are 2^{i-1} nodes at the i-th level, and each node uses (2n/k) data points.
Hence, the training time for OvO ODT-SVM using a statistical measure is
T^train_OvO-ODT-SVM = Σ_{i=1}^{⌈log k⌉} 2^{i-1} (2n/k)² d = (k-1)(2n/k)² d = O(n²d/k) (32)
During the testing phase of conventional OvO, k(k-1)/2 decision functions are to be evaluated; the majority voting is also computed with k(k-1)/2 operations. Hence the testing time for each sample is O(k²). In the worst case the depth of SVM-BDT is (k-1), which requires (k-1) evaluations per test sample; in the best case the depth of SVM-BDT is ⌈log(k)⌉, requiring ⌈log(k)⌉ evaluations. Since the maximum depth of OvO ODT-SVM is ⌈log(k)⌉, its testing time requires ⌈log(k)⌉ operations. According to the above analysis it is evident that the training and testing of OvO ODT-SVM always take less computation time than conventional OvO SVM and SVM-BDT. For a k-class problem, (k-1) hyperplanes are to be calculated in the case of OAA ODT-SVM, whereas k SVM models are developed in the conventional OAA SVM approach, which is more than the number of SVM models required by the decision tree OAA SVM. Assume the time complexity of building an SVM model is O(n²d), where n is the number of data points and d the number of attributes. The overall time complexity of training SVM using the conventional OAA approach is then proportional to kn²d. In order to compute the training time required by OAA ODT-SVM, we again assume that the number of data points in each class is approximately the same, i.e. n/k. The first level of the OAA ODT-SVM model takes time proportional to n²d, while at the second stage the SVM model has (n - n/k) data points. It can be shown that the i-th stage takes time proportional to (n - n(i-1)/k)² d. The time required by the decision tree based SVM is thus given by
T^train_OAA-ODT-SVM = n²d + (n - n/k)² d + (n - 2n/k)² d + ... + (n - n(k-2)/k)² d (33)
Under the above assumptions, the time required for training an OAA ODT-SVM is approximately one third of that of conventional OAA.
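The classifier and evaluation counts underlying this analysis can be tabulated for a small example. A sketch, using only the counts stated in the text (k(k-1)/2 pairwise SVMs for OvO, k models for OAA, k-1 tree nodes for the ODT schemes, and test depth ⌈log2 k⌉ for the balanced OvO tree; the OAA ODT test count is its worst case):

```python
import math

def classifier_counts(k):
    # (binary SVMs trained, decision evaluations per test sample)
    # for each multi-class scheme discussed above.
    return {
        "OvO": (k * (k - 1) // 2, k * (k - 1) // 2),
        "OAA": (k, k),
        "OvO ODT-SVM": (k - 1, math.ceil(math.log2(k))),
        "OAA ODT-SVM (worst case)": (k - 1, k - 1),
    }

for name, (train, test) in classifier_counts(8).items():
    print(f"{name}: {train} trained / {test} evaluated")
```

For k = 8 the tree schemes train 7 binary SVMs versus 28 for conventional OvO, and the balanced OvO tree evaluates only 3 classifiers per test sample.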
In the testing phase, the values of all the hyperplanes need to be determined in the conventional OAA formulation, whereas in OAA ODT-SVM the values of all the hyperplanes need not, in general, be computed. Hence the time complexity of testing is also lower for OAA ODT-SVM than for conventional OAA. 5 Experimental results To evaluate the performance of our proposed ODT-SVM framework using information gain, Gini index, chi-square and scatter-matrix-based class separability in kernel-induced space, we performed experiments on publicly available UCI [2] benchmark datasets. Table 1 describes the datasets used in our experiments. The yeast dataset actually has ten classes; we merged the data of six classes into one class to make it a five-class problem. The kernel functions used in the experiments are given in Table 2. The performance of the classifiers is evaluated in terms of classification accuracy, training time and testing time on each data set. We applied the Friedman test [11], [12], a non-parametric test for the statistical comparison of multiple classifiers over multiple data sets. For each dataset, we rank the competing algorithms: the one that attains the best performance is ranked 1, the second best is ranked 2, and so forth. A method's mean rank is obtained by averaging its ranks across all datasets. Compared to the mean value, the mean rank reduces the susceptibility to outliers [8]. As recommended by Demsar [8], the Friedman test is effective for comparing multiple algorithms across multiple datasets. It compares the mean ranks of the approaches to decide whether to reject the null hypothesis, which states that all the approaches are equivalent and that their mean ranks should therefore be equal. If the Friedman test rejects its null hypothesis, we can proceed with a post hoc test, the Nemenyi test, which is applied to the mean ranks of the competing approaches and indicates which performances differ statistically. Table 1: Description of datasets.
To assess the effectiveness of our proposed OvO ODT-SVM and OAA ODT-SVM, we compared our methods with conventional OvO, SVM-BDT, and conventional OAA SVMs, respectively. We used five kernel functions with C = 1000 and γ ∈ {2^-11, 2^-10, 2^-9, ..., 2^0}. The classification accuracy is determined using ten-fold cross-validation. For a given kernel function and C, we determine the value of γ for which the maximum classification accuracy is achieved. Table 3 and Table 4 show the comparison of maximum classification accuracy between conventional OvO and OvO ODT-SVM, and between conventional OAA and OAA ODT-SVM, respectively. Table 2: Kernel functions. Table 5 shows the comparison of maximum classification accuracy between both ODT-SVM models, conventional OvO, conventional OAA, and multi-class classifiers commonly used in the literature, i.e. C4.5 and the multilayer perceptron (MLP). The C4.5 and MLP implementations in the WEKA machine learning environment [26] are used in our experiments with their default parameter values. The best classification accuracy for each dataset is shown in bold. When we apply the Friedman test with 7 algorithms and 11 datasets, F_F is distributed according to the F distribution with (7-1) = 6 and (7-1)×(11-1) = 60 degrees of freedom. The critical value of F(6,60) at the 0.05 significance level is 2.25. F_F calculated from the mean ranks is 12.31. Since 12.31 > 2.25, we can reject the null hypothesis and infer that a significant difference exists among the competing classifiers.
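The ranking procedure above can be sketched as follows, using the Friedman statistic with the Iman-Davenport F_F correction described by Demsar [8]. The accuracy matrix here is illustrative (rows = datasets, columns = classifiers), not the paper's data:

```python
# Sketch of the Friedman test with the Iman-Davenport F_F correction
# (Demsar, 2006). The accuracy matrix is made up for illustration.
import numpy as np
from scipy import stats

acc = np.array([
    [98.0, 98.0, 98.0, 96.0],
    [91.2, 93.6, 91.7, 85.7],
    [93.3, 96.8, 92.6, 91.0],
    [75.7, 77.6, 72.7, 67.6],
    [61.5, 67.7, 68.6, 56.7],
])
N, k = acc.shape                      # N datasets, k classifiers

# Rank classifiers on each dataset (rank 1 = best accuracy; ties averaged)
ranks = np.apply_along_axis(lambda r: stats.rankdata(-r), 1, acc)
mean_ranks = ranks.mean(axis=0)

# Friedman chi-square, then the Iman-Davenport F_F statistic
chi2 = 12 * N / (k * (k + 1)) * (np.sum(mean_ranks ** 2) - k * (k + 1) ** 2 / 4)
ff = (N - 1) * chi2 / (N * (k - 1) - chi2)
crit = stats.f.ppf(0.95, k - 1, (k - 1) * (N - 1))
print(mean_ranks, round(ff, 2), round(crit, 2))
```

If F_F exceeds the critical value, the null hypothesis of equivalent classifiers is rejected and the Nemenyi post hoc test on the mean ranks can follow, as done in the paper.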
Problem                  #train  #test  #class  #attributes
Iris                     150     11     3       4
Wine                     178     0      3       13
Vehicle                  846     0      4       18
Glass                    214     0      6       9
Segmentation             210     0      7       19
Ecoli                    336     0      8       7
Satimage                 4435    2000   6       36
New_Thyroid              215     0      3       5
Yeast                    1484    0      5       8
Movement_Libra           360     0      15      90
HeartDisease_Cleveland   303     0      5       13

Kernel function   K(x, x_i), γ > 0
Gaussian          exp(-γ|x - x_i|^2)
Laplace           exp(-γ|x - x_i|)
Cauchy            1/(1 + γ|x - x_i|^2)
Hypersecant       2/(exp(γ|x - x_i|) + exp(-γ|x - x_i|))
Square sinc       sin^2(γ|x - x_i|)/(γ|x - x_i|)^2

Dataset / Kernel   OvO     ED      SC      IG      Gini    Chi-square
Iris
  Gaussian         98      98      98      98      98      98
  Laplace          96      96      96      96      96      96
  Cauchy           98      98      98      98      98      98
  Hypersecant      98      98      98      98      98      98
  Square sinc      95.33   94.67   97.45   98      97.99   98
Satimage
  Gaussian         89.43   88.87   92.89   89.66   89.57   91.99
  Laplace          91.21   90.76   92.41   91.23   91.42   91.12
  Cauchy           90.46   90.61   92.78   89.34   92.98   90.89
  Hypersecant      89.71   90.09   93.78   91.34   91.98   93.98
  Square sinc      74.51   76.43   77.87   78.86   77.8    76.76
Wine
  Gaussian         82.58   98.98   96.32   96.65   97.98   97.52
  Laplace          82.58   92.23   92.67   95.55   93.58   91.58
  Cauchy           82.02   82.02   82.02   82.62   82.87   82.02
  Hypersecant      93.26   93.36   93.26   94.26   92.13   93.26
  Square sinc      75.28   76.89   76.97   77.97   75.28   76.97
Vehicle
  Gaussian         76.83   79.95   84.45   84.82   85.24   85.24
  Laplace          77.42   78.3    80.61   81.61   80.74   80.24
  Cauchy           76.48   82.85   84.52   84.52   84.58   85.65
  Hypersecant      83.33   81.59   84.28   84.28   84.98   84.98
  Square sinc      71.51   80.57   81.32   81.32   81.99   81.56
Glass
  Gaussian         72.43   62.9    71.5    71.21   69.63   75.43
  Laplace          75.70   76.17   76.64   74.64   75.7    77.57
  Cauchy           72.90   72.21   72.43   71.03   68.92   73.36
  Hypersecant      71.96   72.90   71.5    70.09   69.16   71.03
  Square sinc      66.36   64.2    64.78   62.62   62.62   56.07
Segmentation
  Gaussian         84.76   87.85   89.24   89.29   88.87   87.43
  Laplace          87.14   88.19   89.62   89.67   89.87   90.87
  Cauchy           86.19   91      89.87   89.69   89.71   89.99
  Hypersecant      90      89.14   90.87   89.52   92.05   91.95
  Square sinc      81.9    79.05   79.52   80.13   80.95   79.05
Ecoli
  Gaussian         85.42   84.98   85.42   86.01   84.79   86.14
  Laplace          86.97   87.2    86.99   86.9    86.99   87.9
  Cauchy           85.42   85.42   86.98   84.82   86.52   85.71
  Hypersecant      85.42   85.42   88.45   88.82   85.42   85.74
  Square sinc      85.12   84.23   87.65   85.12   87.85   86.15
New_Thyroid
  Gaussian         97.21   97.21   97.21   100     98.21   97.21
  Laplace          96.74   96.74   100     96.89   96.89   96.74
  Cauchy           96.74   96.84   100     96.89   96.89   97.74
  Hypersecant      97.67   100     98.67   98.89   100     100
  Square sinc      94.88   94.88   96.49   96.49   96.49   96.49
Yeast
  Gaussian         58.43   59.92   62.40   60.32   63.95   61.92
  Laplace          59.56   60.31   62.43   63.32   65.17   61.86
  Cauchy           61.54   62.37   64.17   65.41   66.54   62.54
  Hypersecant      59.45   67.76   67.65   67.54   65.56   64.76
  Square sinc      57.98   57.32   58.68   59.45   60.12   58.98
Movement_Libra
  Gaussian         74.45   79.94   81.91   78.94   77.74   76.94
  Laplace          81.73   83.77   85.77   85.87   87.97   82.77
  Cauchy           76.45   78.13   79.89   78.54   76.89   78.89
  Hypersecant      72.23   78.64   77.65   73.64   77.18   76.64
  Square sinc      42.22   42.22   42.22   42.22   42.22   42.22
HeartDisease_Cleveland
  Gaussian         43.56   41.23   34.87   42.27   12.54   43.56
  Laplace          22.11   23.43   23.91   32.31   11.88   22.11
  Cauchy           15.23   34.38   17.47   13.45   12.21   15.18
  Hypersecant      22.44   24.34   25.32   25.41   12.21   22.44
  Square sinc      12.78   15.38   12.23   11.23   13.43   12.09
Table 3: Classification accuracy of conventional OvO vs. OvO ODT-SVMs (ED, SC, IG, Gini, Chi-square) [%].
Dataset / Kernel   OAA     ED      SC      IG      Gini    Chi-square
Iris
  Gaussian         96      98      98      98      98      98
  Laplace          96      96      96      96.44   96      96
  Cauchy           96      98      98      98      98      98
  Hypersecant      98      98      98      98      98      98
  Square sinc      96      94.67   98      98      98      98
Satimage
  Gaussian         89.43   88.87   92.89   89.66   89.57   91.99
  Laplace          89.95   90.76   92.41   91.23   91.42   91.12
  Cauchy           89.46   90.61   92.78   89.34   92.98   90.89
  Hypersecant      89.71   90.09   93.78   91.34   91.98   93.98
  Square sinc      74.51   76.43   77.87   78.86   77.8    76.76
Wine
  Gaussian         82.01   98.98   96.32   96.65   97.98   97.52
  Laplace          82.58   92.23   92.67   95.55   93.58   91.58
  Cauchy           82.15   82.02   82.02   82.62   82.87   82.02
  Hypersecant      93.82   93.36   93.26   94.26   92.13   93.26
  Square sinc      74.72   76.89   76.97   77.97   75.28   76.97
Vehicle
  Gaussian         84.63   79.95   84.45   84.82   85.24   85.24
  Laplace          80.61   78.3    80.61   81.61   80.74   80.24
  Cauchy           84.52   82.85   84.52   84.52   84.58   85.65
  Hypersecant      84.87   81.59   84.28   84.28   84.98   84.98
  Square sinc      78.45   80.57   81.32   81.32   81.99   81.56
Glass
  Gaussian         60.75   62.9    71.5    71.21   69.63   75.43
  Laplace          76.17   76.17   76.64   74.64   75.7    77.57
  Cauchy           68.69   72.21   72.43   71.03   68.92   73.36
  Hypersecant      63.55   72.9    71.5    70.09   69.16   71.03
  Square sinc      61.21   64.2    64.78   62.62   62.62   56.07
Segmentation
  Gaussian         84.36   87.85   89.24   89.29   88.87   87.43
  Laplace          86.19   88.19   89.62   89.67   89.87   90.87
  Cauchy           85.71   91      89.87   89.69   89.71   89.99
  Hypersecant      88.57   89.14   90.87   89.52   92.05   91.95
  Square sinc      80.48   79.05   79.52   80.13   80.95   79.05
Ecoli
  Gaussian         84.01   84.98   85.42   86.01   84.79   86.14
  Laplace          86.90   87.2    86.99   86.9    86.99   87.9
  Cauchy           86.90   85.42   86.98   84.82   86.52   85.71
  Hypersecant      82.74   85.42   88.45   88.82   85.42   85.74
  Square sinc      82.74   84.23   87.65   85.12   87.85   86.15
New_Thyroid
  Gaussian         95.45   97.98   97.54   100     98.21   97.89
  Laplace          96.78   98.89   100     98.52   96.89   98.49
  Cauchy           97.34   97.65   98.94   97.89   100     97.96
  Hypersecant      96.38   100     98.99   98.89   98.90   98.99
  Square sinc      93.54   95.56   96.49   96.78   98.67   97.45
Yeast
  Gaussian         59.65   59.92   63.90   61.54   66.65   62.32
  Laplace          61.26   61.43   64.77   65.54   67.65   64.23
  Cauchy           59.46   63.23   64.17   67.41   66.54   65.43
  Hypersecant      59.99   68.72   68.65   65.54   65.56   61.34
  Square sinc      56.54   58.32   58.68   59.45   60.12   59.98
Movement_Libra
  Gaussian         73.44   77.76   85.21   82.61   76.41   76.34
  Laplace          82.48   84.43   84.67   84.42   85.54   83.75
  Cauchy           76.14   78.93   74.71   85.54   78.65   78.89
  Hypersecant      73.43   77.84   78.45   77.61   78.65   74.64
  Square sinc      42.2    42.22   42.22   42.22   42.22   42.22
HeartDisease_Cleveland
  Gaussian         43.43   42.36   44.54   42.51   12.21   43.53
  Laplace          29.13   43.03   23.34   31.43   16.78   24.12
  Cauchy           45.73   49.38   27.47   43.45   22.35   25.38
  Hypersecant      22.44   24.64   27.62   25.32   13.61   21.45
  Square sinc      12.78   15.48   13.61   11.23   12.43   14.37
Table 4: Classification accuracy of conventional OAA vs. OAA ODT-SVMs (ED, SC, IG, Gini, Chi-square) [%].

Dataset                  OvO     OAA     SVM-BDT  OvO ODT-SVM  OAA ODT-SVM  C4.5    MLP
Iris                     98      98      98       98           98           96      97.33
Satimage                 91.21   89.95   91.65    93.61        93.98        85.7    88.12
Wine                     93.26   96.45   92.63    96.76        98.98        90.96   92.51
Vehicle                  83.33   84.87   82.98    84.95        85.65        71.83   81.98
Glass                    75.7    76.17   72.69    76.17        77.57        67.61   63.08
Segmentation             90      90      90       93.78        92.05        86.6    90.48
Ecoli                    87.97   86.9    85.78    89.98        88.82        85.08   84.34
New_Thyroid              97.67   97.34   100      100          100          91.59   95.33
Yeast                    61.54   61.26   68.59    67.65        68.65        56.71   61.43
Movement_Libra           81.73   82.48   87.45    87.97        85.54        67.13   80.78
HeartDisease_Cleveland   43.56   45.73   49.43    43.56        49.38        50.33   52.65
Table 5: Comparison of best average classification accuracy of ODT-SVMs with other multi-class classifiers.

To determine which classifiers are significantly different, we carried out the Nemenyi test, whose results are illustrated in Figure 1. In this figure, the mean rank of each classifier is marked by a circle, and the horizontal bar across each circle indicates the "critical difference". The performance of two methods is significantly different if their mean ranks differ by at least the critical difference, i.e. if their horizontal bars do not overlap.
Figure 1 reveals that OvO ODT-SVM is significantly different from C4.5 and MLP, but not significantly different from conventional OvO, conventional OAA, SVM-BDT, or OAA ODT-SVM, in terms of classification accuracy; in other words, the proposed scheme is comparable with all the SVM variants. Table 6 and Table 7 show the computation time of OvO ODT-SVM for the training and testing phases, respectively, for the Gaussian kernel with γ = 2^-11 and C = 1000. They show that the time required for training and testing OvO ODT-SVM is significantly less than that of the conventional OvO SVM and SVM-BDT approaches. Similarly, Table 8 and Table 9 show that the time required for training and testing OAA ODT-SVM is significantly less than that of conventional OAA SVM. The Friedman test indicates that significant differences exist among the OvO classifiers on both training and testing time. Figure 2 and Figure 3 illustrate the results of the Nemenyi test, revealing what those differences are among the OvO classifiers on training and testing time, respectively. Consistent with our time complexity analysis, all variants of the OvO ODT-SVM scheme other than the Gini measure are more efficient in training time than conventional OvO and SVM-BDT. Figure 3 shows that OvO ODT-SVM using Gini is ranked best and is significantly better than conventional OvO and SVM-BDT. The variants of the OvO ODT-SVM scheme are not significantly different from each other in either training or testing time. Similarly, the Friedman test indicates that significant differences exist among the OAA schemes on both training and testing time. Figure 4 and Figure 5 illustrate the results of the Nemenyi test, revealing what those differences are among the OAA schemes on training and testing time, respectively.
Again consistent with our time complexity analysis, all variants of the OAA ODT-SVM scheme other than the Gini measure are more efficient in training time than conventional OAA. Figure 5 shows that OAA ODT-SVM using IG is ranked best and is significantly better than conventional OAA. As with the OvO schemes, the variants of the OAA ODT-SVM scheme are not significantly different from each other in either training or testing time. Among the four measures employed for determining the structure of the decision tree, none is a clear winner over the others in terms of computation time for training and testing.

Figure 1: The Nemenyi test applied to mean ranks of classification accuracy of the various classifiers.

Dataset                  OvO      SVM-BDT  ED      SC      IG      Gini    Chi-square
Iris                     2.84     5.18     2.15    2.13    2.19    2.17    2.79
Satimage                 1065.06  2852.82  867.71  864.77  873.48  958.32  871.47
Wine                     2.15     3.53     1.96    2.08    2.11    2.13    2.05
Vehicle                  189.96   508.81   154.76  154.24  155.79  170.92  155.43
Glass                    12.31    16.89    8.93    8.60    9.05    10.06   8.99
Segmentation             3.00     5.26     2.20    2.52    2.49    2.45    2.39
Ecoli                    26.48    45.90    17.28   24.09   20.07   21.97   17.55
New_Thyroid              5.31     8.54     3.93    3.13    2.99    3.23    3.58
Yeast                    245.87   675.43   178.62  176.32  173.34  198.14  189.54
Movement_Libra           534.78   1432.76  433.56  437.76  446.34  476.73  436.42
HeartDisease_Cleveland   8.42     13.23    4.67    6.87    5.43    6.48    6.65
Table 6: Training time of OvO ODT-SVM vs. OvO and SVM-BDT [sec].

Dataset                  OvO      SVM-BDT  ED      SC      IG      Gini    Chi-square
Iris                     0.03     0.03     0.03    0.01    0.02    0.02    0.02
Satimage                 10.44    18.53    9.96    9.32    9.07    7.79    8.53
Wine                     0.03     0.06     0.03    0.03    0.03    0.03    0.03
Vehicle                  0.44     0.79     0.43    0.44    0.43    0.33    0.36
Glass                    0.04     0.06     0.02    0.04    0.02    0.03    0.02
Segmentation             0.05     0.07     0.04    0.05    0.04    0.05    0.04
Ecoli                    0.05     0.06     0.06    0.04    0.04    0.04    0.04
New_Thyroid              0.07     0.09     0.06    0.05    0.06    0.05    0.05
Yeast                    0.98     1.79     0.86    0.89    0.86    0.69    0.78
Movement_Libra           5.89     9.89     4.96    4.89    4.43    3.86    4.54
HeartDisease_Cleveland   0.09     0.10     0.05    0.05    0.04    0.05    0.06
Table 7: Testing time of OvO ODT-SVM vs. OvO and SVM-BDT [sec].

Figure 2: The Nemenyi test applied to mean ranks of training time of the OvO schemes.

Figure 3: The Nemenyi test applied to mean ranks of testing time of the OvO classifiers.

Dataset                  OAA      ED      SC      IG      Gini    Chi-square
Iris                     28.09    12.13   21.65   12.07   15.95   15.75
Satimage                 3451.82  858.09  945.91  940.30  989.37  989.43
Wine                     5.20     2.95    2.53    2.51    2.52    2.52
Vehicle                  615.65   153.04  168.71  167.71  176.46  453.63
Glass                    135.60   22.85   27.18   27.18   69.02   21.34
Segmentation             6.70     2.17    2.53    2.51    4.80    4.82
Ecoli                    80.68    23.34   30.02   31.02   78.42   79.80
New_Thyroid              10.34    4.54    4.23    5.67    5.13    5.67
Yeast                    1278.76  306.98  367.87  306.76  359.54  567.87
Movement_Libra           1734.41  428.87  472.87  478.24  478.91  456.69
HeartDisease_Cleveland   102.34   24.76   23.43   21.23   32.23   24.54
Table 8: Training time of OAA vs. OAA ODT-SVM [sec].

Dataset                  OAA      ED      SC      IG      Gini    Chi-square
Iris                     0.02     0.03    0.01    0.01    0.01    0.01
Satimage                 13.95    7.72    9.07    7.59    9.05    7.87
Wine                     0.03     0.02    0.02    0.02    0.02    0.02
Vehicle                  0.59     0.33    0.39    0.32    0.38    0.33
Glass                    0.05     0.02    0.04    0.04    0.03    0.03
Segmentation             0.05     0.04    0.03    0.03    0.04    0.04
Ecoli                    0.05     0.04    0.04    0.03    0.04    0.03
New_Thyroid              0.07     0.05    0.05    0.04    0.04    0.05
Yeast                    1.28     0.76    0.89    0.66    0.78    0.69
Movement_Libra           6.89     3.78    4.57    3.87    4.56    4.08
HeartDisease_Cleveland   0.05     0.04    0.03    0.04    0.02    0.03
Table 9: Testing time of OAA vs. OAA ODT-SVM [sec].
Figure 4: The Nemenyi test applied to mean ranks of training time of the OAA classifiers.

Figure 5: The Nemenyi test applied to mean ranks of testing time of the OAA classifiers.

6 Conclusion In this paper, we evaluated the performance, in terms of classification accuracy and computation time, of the proposed OvO ODT-SVM and OAA ODT-SVM using the statistical measures information gain, gini index, chi-square and scatter-matrix-based class separability in kernel-induced space. We also showed theoretically that the training and testing computation time of both ODT-SVMs using statistical measures is better than that of conventional SVMs. A robust non-parametric test was carried out for the statistical comparison of classifiers over multiple data sets. The results of the experiments on UCI repository datasets indicate that the accuracy of our proposed framework is significantly better than conventional OvO SVM, conventional OAA SVM and two widely used multi-class classifiers, C4.5 and MLP, for most of the datasets. Our experimental results also demonstrate that the computation time of the proposed ODT-SVM formulations is significantly less than that of conventional SVM and SVM-BDT models. The statistical test performed over multiple classifiers also shows that the performance of the ODT-SVM model is significantly better than that of other natural multi-class classifiers such as C4.5 and MLP. Among the four measures employed for determining the structure of the decision tree, none is a clear winner over the others in terms of computation time for training and testing. Acknowledgements We are thankful to the reviewers for their valuable comments, which have led to the improvement of the paper. References [1] Allwein, Schapire, R., & Singer, Y. (2000).
Reducing multiclass to binary: A unifying approach for margin classifiers. In Machine Learning: Proceedings of the Seventeenth International Conference. [2] Blake, C. L., & Merz, C. J. (1998). UCI Repository of Machine Learning Databases. Irvine, CA: University of California, Dept. of Information and Computer Science. http://www.ics.uci.edu/~mlearn/ML-Repository.htm [3] Boser, Guyon, I., & Vapnik, V. (1992). A training algorithm for optimal margin classifiers. 5th Annual ACM Workshop on COLT (pp. 144-152). [4] Bredensteiner, & Bennett, K. (1999). Multicategory classification by support vector machines. Computational Optimization and Applications, 12, 53-79. [5] Breiman, L., Friedman, J., Olshen, R., & Stone, C. (1984). Classification and Regression Trees. Belmont, CA: Wadsworth. [6] Cortes, C., & Vapnik, V. N. (1995). Support-Vector Networks. Machine Learning, 20, 273-297. [7] Crammer, & Singer, Y. (2001). On the algorithmic implementation of multiclass kernel-based vector machines. Journal of Machine Learning Research, 2, 265-292. [8] Demsar, J. (2006). Statistical Comparisons of Classifiers over Multiple Data Sets. Journal of Machine Learning Research, 7, 1-30. [9] Duda, R. O., & Hart, P. E. (1973). Pattern Classification and Scene Analysis. New York: J. Wiley. [10] Duda, R. O., Hart, P. E., & Stork, D. G. (2000). Pattern Classification (2nd ed.). John Wiley & Sons, Inc. [11] Friedman, M. (1940). A Comparison of Alternative Tests of Significance for the Problem of m Rankings. Annals of Mathematical Statistics, 11, 86-92. [12] Friedman, M. (1937). The Use of Ranks to Avoid the Assumption of Normality Implicit in the Analysis of Variance. Journal of the American Statistical Association, 32, 675-701. [13] Gjorgji, M., Dejan, G., & Ivan, C. (2009). A Multi-class SVM Classifier Utilizing Binary Decision Tree. Informatica, 33, 233-241. [14] Hsu, C. W., & Lin, C. J. (2002). A comparison of methods for multiclass support vector machines. IEEE Transactions on Neural Networks, 13(2), 415-425. [15] Kittler, J., & Hojjatoleslami, A. (1998). A weighted combination of classifiers employing shared and distinct representations. IEEE Computer Vision and Pattern Recognition (pp. 924-929). [16] Kreßel. (1999). Pairwise classification and support vector machines. Cambridge, MA: MIT Press. [17] Manju, B., & R. K., A. (July 2010). Statistical Measures to Determine Optimal Decision Tree Based One versus One SVM. Accepted for publication in Defence Science Journal. [18] Manju, B., & R. K., A. (2009). Evaluation of Decision Tree SVM Framework Using Different Statistical Measures. International Conference on Advances in Recent Technologies in Communication and Computing (pp. 341-445). Kerala. [19] Platt, Cristianini, N., & Shawe-Taylor, J. (2000). Large margin DAGs for multiclass classification. Advances in Neural Information Processing Systems, 12, 547-553. [20] Quinlan, J. R. (1987). Induction of Decision Trees. Machine Learning, 1, 81-106. [21] Rifkin, R., & Klautau, A. (2004). In Defense of One-Vs-All Classification. Journal of Machine Learning Research, 5, 101-141. [22] Schölkopf, B., & Smola, A. (2002). Learning with Kernels. Cambridge, MA: MIT Press. [23] Shannon, C. E. (1948). A Mathematical Theory of Communication. Bell System Technical Journal, 27, 379-423, 623-656. [24] Takahashi, F., & Abe, S. (2002). Decision-tree-based multiclass support vector machines. Proceedings of the Ninth International Conference on Neural Information Processing (ICONIP '02), 3, pp. 1418-1422. Singapore. [25] Vapnik, V. N. (1998). Statistical Learning Theory. New York: John Wiley & Sons. [26] Witten, I. H., & Frank, E. (2005). Data Mining: Practical Machine Learning Tools and Techniques (2nd ed.). Morgan Kaufmann.
A Shadow Dynamic Finite State Machine for Branch Prediction: An Alternative for the 2-bit Saturating Counter Saleh Abdel-Hafeez, Asem Albosul and Ahmad Shatnawi Department of Computer Engineering Jordan University of Science & Technology, Irbid, Jordan 21110 E-mail: sabdel@just.edu.jo Ann Gordon-Ross and Shadi Harb Department of Electrical & Computer Engineering University of Florida, Gainesville, FL 32611, USA E-mail: ann@ece.ufl.edu Keywords: branch predictor, bimodal, finite state machine, SDFSM, SPEC2000, saturated counter, SimpleScalar Received: July 31, 2009 We propose an adaptive learning machine-based branch predictor - the shadow dynamic finite state machine (SDFSM) - that enables more accurate branch predictions by learning unique branching patterns through a self-modifying technique. SDFSM states represent branch pattern bits. If a state mispredicts a branch, the state is swapped with its shadow state, which represents the correct branching pattern bit. Therefore, the prediction accuracy can reach 100% if the number of states matches a branch's pattern length. When compared to a 2-bit saturating counter using bimodal branch predictors, the SDFSM decreases average misprediction rates by 18.3%, with individual decreases as high as 55%. Povzetek: Predstavljena je metoda za učenje vejitvenih vzorcev v procesorju. 1 Introduction and related work In order to meet high performance demands, modern processor architectures exploit varieties of dynamic branch prediction topologies ([4]-[6] provide an excellent introduction and research coverage) to increase instruction-level parallelism (ILP). Dynamic branch predictors use run-time branch execution history to predict branch direction. Most previous techniques use a branch pattern history table (known as PHTs, BHTs, or BPHTs) to record past branch behavior (e.g., global and/or local) and these tables are indexed using a function/subset of the branch address. 
Nearly all dynamic branch predictors explored in the last 10 years have been based on tables containing 2-bit saturating counters [7][8]. Extensive simulations of branch predictors reveal that the 2-bit saturating counter performs the best on average [9][10], and it is thus used in modern commercial processors. In recent years, research has explored more advanced branch prediction techniques such as neural networks [11][12] and other forms of machine learning. Despite their impressive simulation accuracy, to the best of our knowledge no commercial efforts have publicly announced incorporating such branch predictors, because these predictors are commonly known to exhibit high prediction latency and long training periods with increased area and energy per prediction [13]. In order to provide increased branch prediction accuracy with low area and power overheads, in this paper we propose a novel adaptive learning machine-based shadow dynamic finite state machine (SDFSM). The SDFSM learns/predicts an application's unique branching pattern using the prediction values (taken/not taken) stored in each state. Upon branch execution, the state transition is input independent and the value of the target state predicts the branch outcome. Each state has a corresponding shadow state, which contains the alternate branch prediction value. In the event of a mispredicted branch, the SDFSM performs self-modification by swapping the current state with the current state's shadow state, which contains the correctly predicted branch outcome. This method of state swapping dynamically records unique branch patterns, thus specializing the branch predictor to the needs of an application. Extensive experimental results compare the SDFSM prediction accuracy to the commonly used bimodal [1][2] counter-based predictor and reveal that, for a subset of benchmarks, an SDFSM with six shadow states provides more accurate predictions than counter-based predictors with one-to-one prediction latency.
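For contrast with the SDFSM introduced below, the 2-bit saturating counter baseline can be modeled in a few lines. This is the generic textbook scheme, not code from this paper, and the alternating branch pattern it is exercised on anticipates the 1010 example the paper later calls pathological for saturating counters:

```python
# Generic 2-bit saturating counter, the cell of the baseline bimodal
# predictor. States 0-1 predict "not taken", states 2-3 predict "taken";
# the counter saturates at 0 and 3 instead of wrapping around.

class SaturatingCounter:
    def __init__(self, state=1):
        self.state = state          # 0..3, weakly not-taken by default

    def predict(self):
        return self.state >= 2      # True = predict taken

    def update(self, taken):
        if taken:
            self.state = min(3, self.state + 1)
        else:
            self.state = max(0, self.state - 1)

# An alternating taken/not-taken pattern keeps a weakly biased counter
# oscillating between states 1 and 2, mispredicting every branch.
ctr = SaturatingCounter()
outcomes = [True, False] * 4
hits = 0
for taken in outcomes:
    if ctr.predict() == taken:
        hits += 1
    ctr.update(taken)
print(hits, "correct of", len(outcomes))
```

Starting from the weakly not-taken state, the counter scores zero on this pattern, which is exactly the weakness the SDFSM's pattern learning targets.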
The remainder of this paper is organized as follows. Section 2 describes the proposed SDFSM as an alternative replacement for 2-bit saturating counters and presents the SDFSM architecture. Section 3 and Section 4 present our simulation methodology setup and branch predictor analysis, respectively. Section 5 compares counter-based predictors and SDFSM-based predictors. Section 6 presents a performance analysis and finally, section 7 gives conclusions and suggested future dynamic branch prediction development. 2 Shadow dynamic finite state machine (SDFSM) branch prediction In this section, we present our shadow dynamic finite state machine (SDFSM) branch prediction technique for learning/predicting an application's unique branching patterns. 2.1 SDFSM operation Figure 1 depicts the SDFSM using a 4-state SDFSM automaton (larger SDFSMs are similarly represented using more states). SDFSM state values record/predict branch outcomes. SDFSM operation consists of two phases: the training phase and the operational phase. During the training phase, SDFSM states are manipulated such that they learn the application's branching pattern. SDFSM state transition is deterministic upon each branch execution and the next state's value corresponds to the predicted branch outcome. In other words, branch prediction is determined by the branch history pattern and not by the input condition leading to the next state. If a state's prediction value is correct, no change is made to the SDFSM. If a state's prediction value is incorrect, the SDFSM self-modifies to adapt to the branching pattern. In order to learn branching patterns, each state has a corresponding shadow state (positioned adjacent to the state), and the shadow state contains the opposite prediction value. Thus, if a state's value does not correspond to the branching pattern, the state is swapped with its shadow state in order to swap the state's branch prediction value. 
During the training phase, the states record the observed pattern, and during the operational phase, the states predict taken/not taken. This implies that the SDFSM learns a distinct pattern on-the-fly and then predicts this pattern perfectly. Furthermore, the training and operational phases are not necessarily mutually exclusive, as the SDFSM returns to the training phase any time there is a misprediction. Figure 2 illustrates the 4-state SDFSM using a repeated pattern of 1010, which is commonly known to produce poor prediction rates for saturating counter techniques [14]. All state values are initialized to 0. Upon first execution of the branch, the SDFSM enters the initial state (step 1), whose state value is 0 and which predicts the branch as not taken. After branch resolution, since the state mispredicted the branch outcome, the state is swapped with its shadow state and the state's predicted value becomes 1. On the next execution of the branch, the SDFSM transitions to the next state (step 2), which correctly predicts the branch as not taken. On the next execution of the branch, the SDFSM transitions to the next state (step 3), which predicts the branch as not taken. Again, branch resolution determines that the branch was mispredicted, and the shadow state is swapped in. On the next execution of the branch, the SDFSM transitions to the next state (step 4), which correctly predicts the branch as not taken. On the next execution of the branch, the SDFSM transitions back to the initial state, which ends the training phase and begins the operational phase. The SDFSM now correctly predicts the branch outcome on every branch execution.

Figure 1: The proposed shadow dynamic finite state machine (SDFSM) using four states.

Figure 2: The SDFSM updates state predictions by swapping states with shadow states based on the observed pattern.
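The walk-through above can be reproduced with a small behavioral model of the SDFSM. This is an illustrative sketch of the described swap-on-mispredict rule, not the authors' implementation:

```python
# Behavioral sketch of an N-state SDFSM. Each state stores one prediction
# bit; on a misprediction the active state is "swapped with its shadow",
# i.e. its bit flips to the observed outcome. State transition is
# input-independent: the machine steps to the next state on every branch.

class SDFSM:
    def __init__(self, n_states):
        self.bits = [0] * n_states   # all state values initialized to 0
        self.pos = 0                 # initial state

    def predict(self):
        return self.bits[self.pos]   # 1 = taken, 0 = not taken

    def update(self, outcome):
        if self.bits[self.pos] != outcome:
            self.bits[self.pos] = outcome   # swap in the shadow state
        self.pos = (self.pos + 1) % len(self.bits)

# The repeating pattern 1010 from Figure 2: after one pass (the training
# phase) a 4-state SDFSM predicts the pattern perfectly.
fsm = SDFSM(4)
pattern = [1, 0, 1, 0] * 4
misses = []
for outcome in pattern:
    misses.append(fsm.predict() != outcome)
    fsm.update(outcome)
print(misses)
```

Consistent with the walk-through, the model mispredicts only at steps 1 and 3 of the first pass and is perfect thereafter; a pattern of period x is learned perfectly whenever the number of states is x or a multiple of x.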
Perfect branch pattern prediction only occurs if the pattern repeats itself with a repetition cycle equal to (or a divisor of) the number of states. A 4-state SDFSM can perfectly predict any 2- or 4-entry branch pattern. This restriction can be generalized to any x-entry pattern, which would require an SDFSM with x states or any multiple of x states. In Section 4, we provide an in-depth analysis of numerous SDFSM sizes. During context switching, in addition to traditional branch predictor state saving techniques, the SDFSM operational state can be quickly saved and restored using special hardware to read and save state in a single clock cycle. The state saving area overhead would be small, as only one n-bit counter is required per context. Currently, SDFSM operation is not pipelined, thus mispredicted branches and branch overlap are not accounted for. However, these operational enhancements could be easily incorporated into the SDFSM by adding additional steering logic and mispredict rollback capabilities. These additions would be straightforward, could be done such that the prediction accuracy is unaffected, and are a focus of our future work. 2.2 SDFSM architecture Figure 3 depicts the generalized SDFSM architecture (with N states), consisting of an array of N prediction states and a shift register to selectively enable the appropriate prediction state. Prediction state architectural components include a single D-type flip-flop (DFF) to store the state's predicted value, a two-input multiplexer to swap the predicted value (effectively implementing a swap with the shadow state), and several gate-level components. Prediction state inputs are similar to those used for 2-bit saturating counters: initialize (IN), prediction input pattern (PIP), enable (Z), and the clock (CLK) signal. Prediction states have a single output, the predicted value. The outputs of all prediction states are connected to a common output (Prediction Output Value) using tri-state buffers. The shift register is composed of N DFFs, whose outputs Q (also denoted as Z) are connected to the adjacent DFFs' inputs D and selectively enable the prediction states. The shift register is clocked using the BRANCH signal, which is asserted each time the branch associated with this predictor is fetched. At system startup, IN is asserted to reset the system. IN is connected to each DFF's reset (RES) port, effectively setting all register values to 0, except for the last DFF in the shift register; IN is connected to the set (SET) port of this DFF in order to set this DFF's value to 1. The shift register is responsible for selectively enabling a single prediction state, thus only one bit in the shift register should ever have a value of 1. Each time the BRANCH signal is asserted, the shift register updates its values, which enables the next sequential prediction state via the Z signal.

Parameter                                    Configuration
BTB size, assoc, cache line size             128KB, 4-way, 32B
L2 unified size, assoc, cache line size      256KB, 4-way, 64B
L1 data size, assoc, cache line size         8KB, 4-way, 32B
L1 instruction size, assoc, cache line size  8KB, 4-way, 32B
Branch predictor technique                   Bimodal
Reorder buffer size                          512
L3 unified size, assoc, cache line size      4MB, 2-way, 64B
Pipeline depth                               40
Table 1: Architectural parameters.

Component type           Number of components
D-type flip-flop (DFF)   2N
Two-input multiplexer    N
AND gate logic           N
XOR gate logic           N
Tri-state gate logic     N
Table 2: Total number of hardware components based on the number of SDFSM prediction states (N).
SDFSM prediction states have two operational phases: the predict operation and the correct operation. The predict operation provides the branch prediction value, while the correct operation swaps the branch prediction value with the shadow state value if the branch is mispredicted. During the predict operation, the enabled prediction state's output drives the Prediction Output Value using Z's assertion to enable the tri-state buffer. During this time, the PIP input value should correspond to the Prediction Output Value (not shown in Figure 3) so that the DFF value does not change. If a branch is mispredicted, the PIP value changes to the branch outcome value and the prediction state enters the correct operational phase. During this phase, simple logic gates controlling the multiplexer's inputs and select line swap the DFF's stored value with the shadow value. Thus, in order to swap the DFF's stored value, the PIP must differ from the currently stored value and Z must be asserted. The SDFSM has been architecturally designed to complete in one fast clock cycle. Assuming the DFFs are constructed using two levels of 3-input NAND gates and the multiplexer is constructed using standard two-level logic gates, the longest register-to-register delay is seven gates (since DFF updating for the shift register and the prediction states is mutually exclusive, no phase flows through both DFFs). This situation occurs during the correct operational phase. Table 2 gives hardware area estimates in numbers of hardware components based on the number of prediction states N, where total hardware area grows at a rate of O(N). To minimize the output steering logic, prediction state outputs share a common output wire using tri-state buffers. In addition, to minimize active power, the DFFs in each prediction state are only activated on a misprediction. Overall, the SDFSM architecture is highly cost-effective in terms of performance, area, and power.
Figure 4: Prediction accuracy for each benchmark using a 4k-entry BHT for the bimodal branch predictor using a 2-bit saturating counter (Counter) and SDFSMs with 2, 3, 4, 6, 8, 10, and 12 states (SDFSM-N).

Figure 5: Prediction accuracy for the six advantageous benchmarks using a 4k-entry BHT for the bimodal branch predictor using a 2-bit saturating counter (Counter) and SDFSMs with 2, 3, 4, 6, 8, 10, and 12 states (SDFSM-N).

Figure 6: Arithmetic mean of the prediction accuracy for (a) all benchmarks and (b) the six advantageous benchmarks for the bimodal branch predictor with a 2-bit saturating counter (Counter) and SDFSMs with 2, 3, 4, 6, 8, 10, and 12 states (SDFSM-N) for BHT sizes ranging from 256 to 128k entries.

3 Simulation methodology and evaluation metrics

In order to perform an in-depth analysis of the SDFSM, we exhaustively simulated the SPEC2000 benchmark suite [16] (we simulated each application in its entirety for all provided input stimuli) using the SimpleScalar PISA processor simulator version 4 [15]. We modified sim-bpred to implement the SDFSM and simulated the SDFSM with 2, 3, 4, 6, 8, 10, and 12 states. Our comparison framework focused on comparing the SDFSM to a popular branch prediction technique (bimodal) using 2-bit saturating counters with branch prediction table sizes ranging from 256 to 128k entries.
We compare with the bimodal predictor because the bimodal predictor is a branch predictor cornerstone and allows us to establish the fundamental contribution of our SDFSM. Table 1 summarizes the base system's architectural parameters, which represent common modern system parameters, yet are conservative with respect to future technologies. Each branch prediction table entry contains an FSM, which can be either the SDFSM or a 2-bit saturating counter. Hence, the predictor storage budget (PSB) in bits is determined by:

  PSB = 2^N × ⌈log2(number of FSM states)⌉

where N is the number of index bits used for the branch prediction table. In the conventional bimodal branch predictor, the low-order J bits of the branch address index into a branch history table (BHT) of size 2^J entries. The BHT entries can either be 2-bit saturating counters or can be replaced with SDFSMs of any size. Since it is difficult to precisely compare predictors with exactly the same hardware budgets, we compare predictors based on the number of table entries, which provides a fair performance comparison because these tables account for the majority of the hardware budget. Cumulative prediction rate accuracies are computed and analyzed using the arithmetic mean for averaging prediction rates, over all benchmarks, based on predictor storage budget. In addition, individual branch prediction accuracies for every benchmark and every branch prediction technique studied were measured for increasing hardware budgets, reflecting branch predictor sizes available in commercial microprocessors. Improving processor performance, measured in number of instructions executed per cycle (IPC), is considered the key motivation for combining improved branch prediction accuracy with low-latency branch prediction. High prediction latency nullifies any prediction accuracy advantages due to decreased IPC.
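The PSB formula above is straightforward to evaluate; the helper name below is ours:

```python
# Predictor storage budget in bits: PSB = 2^N * ceil(log2(S)),
# where N is the number of index bits and S the number of FSM states.
from math import ceil, log2

def psb_bits(index_bits, fsm_states):
    return (2 ** index_bits) * ceil(log2(fsm_states))

psb_bits(12, 4)   # 4k-entry BHT of 2-bit counters (4 states -> 2 bits/entry)
psb_bits(12, 6)   # 4k-entry BHT of 6-state SDFSMs (3 bits/entry)
```

This also makes the paper's point concrete: at equal entry counts, an SDFSM table costs more bits per entry than a 2-bit counter table (e.g., 3 bits versus 2 for six states), which is why the comparison is made per table entry rather than per bit.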
For a 2-bit saturating counter, since each up-down counter only requires 2 bits to record the branch behavior, the technique requires simple hardware and little storage space. In addition, the 2-bit saturating counter's inherent simplicity results in simple single-cycle prediction computation, thus guaranteeing low prediction latency. In contrast, perceptron-based predictors require comparatively complicated computation using adder components. The prediction latency of the original perceptron predictor was more than 4 cycles [13], which required heavy pipelining to hide such latencies. This pipelining led to problems similar to those encountered when designing modern hyperpipelined execution cores [12]. Thus, since the SDFSM has the same access delay (single-cycle) as the 2-bit saturating counter, the key evaluation metric is SDFSM prediction accuracy compared to 2-bit saturating counters with a fixed hardware budget.

4 Experimental results

Figure 4 shows the prediction accuracy for all benchmarks for the bimodal branch predictor with a 2-bit saturating counter (Counter) and SDFSMs with 2, 3, 4, 6, 8, 10, and 12 states (SDFSM-N) using a 4k-entry BHT. On average over all benchmarks, the 2-bit saturating counter outperforms all SDFSMs. However, we reiterate that branch predictors behave differently for all applications, and there is no one branch predictor that outperforms all other branch predictors for all applications. Figure 5 subsets the results and depicts the six applications where the SDFSM shows improved prediction accuracy over the 2-bit saturating counter. On average, the 6-state SDFSM provides the largest prediction accuracy improvements with an average misprediction rate decrease of 18.3%, with individual decreases ranging from 6.3% to 55%. Figure 5 also reveals that for each benchmark, the optimal-sized SDFSM is quite different.
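For reference, the baseline 2-bit saturating counter behaves as follows (a standard textbook model; the class name and the weakly-not-taken initial value are our choices):

```python
# A 2-bit saturating counter as used in the baseline bimodal predictor:
# counts 0..3, predicts taken when >= 2, and needs two consecutive
# mispredictions before the predicted direction flips (damping).

class SaturatingCounter:
    def __init__(self, value=1):        # 1 = weakly not-taken
        self.value = value

    def predict(self):
        return self.value >= 2           # True = predict taken

    def update(self, taken):
        if taken:
            self.value = min(3, self.value + 1)
        else:
            self.value = max(0, self.value - 1)

c = SaturatingCounter(3)                 # strongly taken
c.update(False)                          # one not-taken outcome...
c.predict()                              # ...still predicts taken
```

The final two lines illustrate the hysteresis discussed later in Section 5.1: a single contrary outcome does not change the prediction.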
The optimal SDFSM sizes for ammp, equake, gzip_graphic, mcf, mesa, and perlbmk_makerand are the 6-state, 12-state, 8-state, 2-state, 12-state, and 6-state SDFSMs, respectively. Figure 6(a) depicts the arithmetic mean of the prediction accuracy for all benchmarks for the bimodal branch predictor with a 2-bit saturating counter (Counter) and SDFSMs with 2, 3, 4, 6, 8, 10, and 12 states (SDFSM-N) for BHT sizes ranging from 256 to 128k entries. The prediction accuracy increases as BHT size increases and saturates asymptotically. On average, the 2-bit saturating counter still outperforms all SDFSMs, with the 2-bit predictor's prediction accuracy saturating at 92.3% and the next most accurate predictor (the 6-state SDFSM) saturating at 91%. On average, the 2-bit saturating counter with 16k entries (a practical hardware budget) provides 1.7% more accuracy than the next most accurate predictor. Figure 6(b) subsets the results from Figure 6(a) and averages the six applications where the SDFSM shows improved prediction accuracy over the 2-bit saturating counter. For these benchmarks, the 6-state SDFSM is 1.2% more accurate than the 2-bit saturating counter, saturating asymptotically at 93.5%. This figure also shows that both the 6- and 12-state SDFSMs outperform the 2-bit saturating counter. Overall, these results reveal that our SDFSM has the potential to further enhance the accuracy of 2-bit saturating counters. Since the literature shows that the most advanced branch prediction methods adopt neural or saturating elements, the SDFSM has the potential to improve on these methods as a replacement for the saturating elements. The SDFSM is intended to enhance branch prediction for certain applications that exhibit particular behaviors such as aliasing, damping, and other irregularities such as those found in artificial intelligence and gaming applications (see Section 5 for details).
5 Comparison analysis

In this section, we analyze the exhaustive results presented in Section 4 and discuss comparative advantages and disadvantages of the 2-bit and SDFSM branch predictors considering aliasing interference, damping, adaptability, training time, and latency.

5.1 Aliasing interference and damping

Since the BHT size is generally much smaller than the total number of branches in an application, the bimodal branch predictor uses the low-order J bits to index into the BHT. Therefore, if two conditional branches have the same low-order J bits, their branch streams will be intermingled and sent to the same predictor. We define this situation as aliasing interference. Due to aliasing interference, and because we use the bimodal branch predictor, both the 2-bit saturating counter and our SDFSM-based predictor generally result in lower prediction accuracy in the presence of significant aliasing interference. Aliasing interference can be alleviated through two methods. Simply increasing the BHT size can significantly reduce aliasing interference. Additionally, using other branch prediction techniques such as per-address branch predictors (PAs) can reduce aliasing interference by using a two-level indexing method [14]. The first level is indexed using a subset of H bits of the branch address to index into a pattern history table of size 2^H, which stores the unique local branch history pattern of that branch. This pattern is then used to index into the second level, which contains either global pattern histories (PAg) or per-address pattern histories (PAp) [3]. In general, aliasing interference does not directly imply prediction accuracy penalties. For example, if two branches alias to the same BHT entry but their executions are mutually exclusive (e.g., the first branch executes 1000 times followed by 1000 executions of the second branch), the prediction accuracy lost due to aliasing interference is negligible.
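The low-order-bit indexing that causes aliasing is easy to demonstrate (the two branch addresses below are hypothetical):

```python
# The bimodal predictor indexes its BHT with the low-order J bits of
# the branch address, so branches agreeing in those bits share an entry.

def bht_index(branch_addr, j_bits):
    return branch_addr & ((1 << j_bits) - 1)

a, b = 0x400123, 0x40A123      # hypothetical branch addresses
bht_index(a, 12) == bht_index(b, 12)   # both map to entry 0x123: aliased
```

With a 4k-entry BHT (J = 12), any two branches 4 KiB of index space apart in their low bits collide, which is why simply growing the table reduces, but never eliminates, interference.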
However, if two branch executions are not mutually exclusive (the worst case being that the two branches alternate executions), then aliasing interference may lead to a significant decrease in prediction accuracy. To analyze the effects of aliasing interference in the case of two interfering branches, we define the most frequently executing branch as the majority branch and the least frequently executing branch as the minority branch. We further define a majority run as consecutive majority branch executions with no intervening minority branch executions. Minority runs are similarly defined. Smith [1] observed that 2-bit saturating counters implicitly provided an appropriate amount of damping (or hysteresis), which alleviated some of the aliasing interference. The damping mechanism in 2-bit saturating counters requires two consecutive mispredictions before the prediction value changes, thus ignoring minority runs of length one. Damping trades adaptability for reduced vulnerability to short minority runs. In addition, damping also allows loop branches to incur just one misprediction per loop iteration, instead of two mispredictions (one on loop exit and one on loop entry). On the other hand, the SDFSM's implicit damping mechanism is quite different from the 2-bit saturating counter's. The SDFSM simply learns the branching pattern that maps to a particular BHT entry. Therefore, as long as the combined patterns of the interfering branches produce a learnable pattern, the SDFSM will learn that pattern. However, since these combined patterns are likely longer than individual branching patterns, this implies that SDFSMs with more states provide increased damping. The SDFSM predictor actually provides high or even perfect prediction accuracy for applications with short minority runs as well as long minority/majority runs, by minimizing or even eliminating aliasing interference.

Figure 7: Bimodal predictor misprediction rates for various pattern lengths.
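The worst-case alternation mentioned above can be checked with a short behavioral simulation. The two functions below are simplified stand-ins for the predictors (not the paper's simulator); the scenario assumed is two aliased branches that alternate, one always taken and one never taken, so the shared entry sees the stream 1, 0, 1, 0, ...

```python
# Counter vs. SDFSM on a shared BHT entry fed an alternating stream.

def run_counter(stream):
    value, hits = 1, 0                   # 2-bit counter, weakly not-taken
    for taken in stream:
        hits += (value >= 2) == taken
        value = min(3, value + 1) if taken else max(0, value - 1)
    return hits

def run_sdfsm(stream, n):
    bits, pos, hits = [0] * n, 0, 0      # n-state SDFSM
    for taken in stream:
        hits += bits[pos] == taken
        if bits[pos] != taken:
            bits[pos] = taken            # shadow swap on mispredict
        pos = (pos + 1) % n
    return hits

stream = [1, 0] * 50                     # two aliased alternating branches
run_counter(stream), run_sdfsm(stream, 2)
```

The counter oscillates between its weak states and mispredicts every time, while the 2-state SDFSM learns the combined pattern after a single pass, matching the claim that a learnable combined pattern neutralizes this interference.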
On the contrary, in the presence of aliasing interference, damping in saturating counters only works well for long minority runs. The literature shows that the bimodal predictor is widely known to have a significant amount of aliasing interference even as the hardware budget increases [2][4]. In our experiments, since both the 2-bit saturating counter and the SDFSMs use a bimodal predictor, large amounts of aliasing interference will favor the counter-based predictor, since the counter-based predictor can better tolerate aliasing interference. Figure 4 shows that on average the 2-bit saturating counter can reduce the misprediction rate by 14.3% over the best SDFSM predictor (SDFSM-6). On the other hand, for the six benchmarks where SDFSMs are advantageous, short minority runs (which are considered a limitation of counter-based predictors) favor the SDFSM predictor. For these six benchmarks, Figure 5 shows that the SDFSM can decrease misprediction rates by 18.3% on average.

5.2 Recurring patterns

Researchers have shown that aliasing in the pattern history tables can significantly degrade the performance of bimodal branch predictors. [3][4][21][23] showed that a repeating pattern of length one (i.e., "1111...1" or "0000...0") was detected for approximately 50% of the branches, indicating that a significant amount of branch interference may occur if the PHTs are updated for these branches. For these situations, a simple predictor such as a bimodal predictor would typically outperform the SDFSM predictor, which would incur every interference update. In addition, research showed bimodal predictors could accurately predict branches with short repeating patterns, while branches with a repeating pattern of length six tended to have higher misprediction rates [21][22][23], as is shown in Figure 7 from [23].
Since Section 4 revealed that the 6-state SDFSM performed best on average, the 6-state SDFSM may provide improved performance for these branching patterns of length 6. In addition, our results demonstrated that SDFSMs with a smaller number of states suffered less branch interference penalty as compared to SDFSMs with a larger number of states, which could explain why the 6-state SDFSM outperformed the 12-state SDFSM (or any other SDFSM with a multiple of 6 states).

5.3 Adaptability and training time

Branches typically exhibit high biasing (usually 70% [4]) towards one outcome (taken or not taken). This bell distribution (peaking at 70%) is key to a counter-based predictor's high prediction accuracy and explains why the 2-bit saturating counter outperforms the SDFSM for the majority of the benchmarks. To provide better prediction accuracy for low-biasing applications, previous work [3][5] shows that applications with branches that show low biasing require dynamic adaptability in order to achieve high prediction accuracies. This dynamic adaptability enables the predictor to specialize itself to a branch's biasing during application execution. Dynamic adaptability provides the added benefit of not requiring any static profiling or branch predictor training during system/application design time. The 2-bit saturating counter lacks dynamic adaptability. On the other hand, our N-state SDFSM-based predictor can dynamically adapt to any branch pattern of length equal to (or a divisor of) N. The larger the number of states, the more flexibility the SDFSM has for adapting to different pattern lengths. However, SDFSMs with a large number of states can negatively impact the prediction accuracy due to longer training times. Figure 5 exemplifies this impact with the 6- and 12-state SDFSMs. Ammp, gzip_graphic, mcf, and perlbmk_makerand show increased prediction accuracy for a 6-state SDFSM even though the 12-state SDFSM captures the same branching pattern.
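The adaptability rule stated above — an N-state SDFSM locks onto any repeating pattern whose length L divides N — reduces to a divisibility check:

```python
# Which repeating pattern lengths an N-state SDFSM can capture:
# L is learnable exactly when L divides N.

def learnable(pattern_len, n_states):
    return n_states % pattern_len == 0

[l for l in range(1, 13) if learnable(l, 6)]    # lengths a 6-state captures
[l for l in range(1, 13) if learnable(l, 12)]   # lengths a 12-state captures
```

This makes the trade-off visible: 12 states cover every length 6 states do, plus 4 and 12, but at the cost of the longer training time the text attributes to larger SDFSMs.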
On the other hand, equake and mesa show decreased prediction accuracy for the 6-state SDFSM because these benchmarks likely have longer branch patterns, thus requiring more SDFSM states. On average, the 6- and 12-state SDFSMs decrease misprediction rates by 18.3% and 15.4% compared to the 2-bit saturating counter, respectively; the gap between them is due to the 12-state SDFSM's increased training time. Similar trends are evident when comparing 2- and 6-state SDFSMs, as well as any other SDFSMs with common divisors.

5.4 Latency

Few hardware resources are required to implement both the 2-bit saturating counter and the SDFSM predictors, and thus these techniques require only modest storage space. In addition, this inherent simplicity results in simple predictions and computations, which guarantees low prediction latency (a critical component for high performance in processors). The SDFSM-based predictor requires only a single cycle for training and prediction, while 2-bit saturating counter-based predictors require two cycles for training and predicting. Thus, the overall prediction latency of the SDFSM-based predictor is 50% less than that of the 2-bit saturating counter-based predictor, resulting in higher instructions-per-cycle (IPC).

6 Performance evaluation

Figure 6(a) showed that the counter-based predictor was more accurate on average than the SDFSM with respect to the arithmetic mean. However, the counter-based predictor's misprediction latency in cycles is twice that of the SDFSM, as was described in Section 5.4. The additional misprediction cycle adversely affects overall processor performance due to stalls while waiting for the training and subsequent prediction. Therefore, in order to more fairly compare complete predictor performance, we must consider the misprediction penalty latency in conjunction with the misprediction rate.
We evaluate the SDFSM and counter-based bimodal-type predictors with respect to the mispredictions per cycle (MPC) and the prediction accuracy rates (PAs) as determined by simulation. In order to provide an analysis that is independent of the processor clock speed, the misprediction rate is measured in cycles rather than in seconds, such that:

  MPC_counter = (100% - PA_counter) × 2 cycles

and:

  MPC_SDFSM = (100% - PA_SDFSM) × 1 cycle

Figure 8 shows the MPCs with respect to hardware budget in number of entries, and Figure 10 subsets these results as in Figure 6(b) (i.e., those benchmarks where the SDFSM showed improvement over the counter-based predictor with respect to misprediction rates). Similarly to the misprediction rates for these subsetted benchmarks, the MPCs for all SDFSMs improve with respect to the counter predictor, with an average overall performance increase of 37%. However, on average over all benchmarks, the counter-based predictor still had the lowest misprediction rate. Branch predictor performance can also be evaluated using the misprediction speedup, as derived in [17], such that:

  MPC Speedup = MPC_counter / MPC_SDFSM

Figure 9 shows the misprediction speedup versus hardware budget in number of entries for various SDFSM sizes compared to the counter-based predictor. These speedups are in line with speedups obtained for other recent innovations in branch predictors [18]-[20].

Figure 8: Mispredictions per cycle per benchmark with a hardware budget of 4KB.

7 Conclusion and future work

This paper proposes the shadow dynamic finite state machine (SDFSM), a new branch predictor whose FSM states are dynamically trained during run-time to learn unique branch pattern behaviors. Whereas the SDFSM can be generalized to any arbitrary number of states, we explored several SDFSM sizes and compared extensive simulation results on the SPEC2000 benchmark suite with 2-bit saturating counters using a conventional bimodal-based branch predictor.
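The MPC metric above is a direct product of misprediction rate and per-misprediction latency; in code (the saturation accuracies plugged in below are the averages reported in Section 4, used purely for illustration):

```python
# Mispredictions per cycle: misprediction rate (as a fraction) times the
# predictor's per-misprediction latency in cycles.

def mpc(accuracy_pct, penalty_cycles):
    return (100.0 - accuracy_pct) / 100.0 * penalty_cycles

mpc_counter = mpc(92.3, 2)   # counter: 2-cycle train-and-predict
mpc_sdfsm = mpc(91.0, 1)     # SDFSM: single cycle
speedup = mpc_counter / mpc_sdfsm
```

Even though the counter is the more accurate predictor here, its doubled penalty gives the SDFSM the lower MPC, which is exactly the effect the speedup metric is designed to expose.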
Results revealed that the SDFSM decreases the average misprediction rate for six benchmarks, which have irregular branching tendencies (i.e., those seen in artificial intelligence and gaming applications). Furthermore, in the situations where the SDFSM was slightly less accurate than the 2-bit predictor, this reduced accuracy was due to the nature of the bimodal predictor architecture (and not a failure of the SDFSM), which exhibits a large percentage of aliasing phenomena that severely affect our SDFSM automaton's prediction accuracy. The SDFSM will likely show marked improvements when coupled with predictors that are less affected by aliasing, such as PAs and GAs. In addition, the SDFSM uses a simple hardware structure, which provides single-cycle training and prediction latency; in contrast, the 2-bit counter predicts and corrects in two cycles. This single-cycle advantage for the SDFSM offsets the accuracy advantage of the 2-bit counter by trading off performance with respect to the instructions-per-cycle (IPC) rate. Finally, we explored and analyzed the number of SDFSM states in the scope of adaptability, training, damping, and aliasing in order to determine their effect on prediction accuracy. Results show that a 6-state SDFSM is a good average configuration for the bimodal predictor topology. Thus, our results encourage researchers to explore the SDFSM combined with more advanced predictor methods, thus improving the accuracy of those predictors.

Figure 10: Average mispredictions per cycle versus hardware budget in number of entries.

Our future work is motivated by the per-application variation in optimal SDFSM size as shown in Figure 5. Consequently, choosing the best number of states is a key design decision, since the SDFSM structure does not
dynamically alter its number of states based on pattern entries. Therefore, our future work includes architecting an adaptive SDFSM capable of dynamically altering its number of states based on actual branch pattern length.

Figure 9: Misprediction speedup versus hardware budget in number of entries for various SDFSM state sizes compared to the counter-based predictor.

References

[1] J. Smith, "A Study of Branch Prediction Strategies," International Symposium on Computer Architecture (ISCA), pp. 135-148, June 1981.
[2] C.-C. Lee, C. C. Chen, and T. N. Mudge, "The Bi-Mode Branch Predictor," International Symposium on Microarchitecture (MICRO-30), pp. 4-13, December 1997.
[3] T. Yeh and Y. Patt, "A Comparison of Dynamic Branch Predictors that Use Two Levels of Branch History," International Symposium on Computer Architecture (ISCA), pp. 257-266, June 1993.
[4] C. Young, N. Gloy, and M. D. Smith, "A Comparative Analysis of Schemes for Correlated Branch Prediction," International Symposium on Computer Architecture (ISCA), pp. 276-286, July 1995.
[5] A. Seznec, "Analysis of the O-GEometric History Length Branch Predictor," International Symposium on Computer Architecture (ISCA), pp. 394-405, June 2005.
[6] P. Biggar, N. Nash, K. Williams, and D. Gregg, "An Experimental Study of Sorting and Branch Prediction," Journal of Experimental Algorithmics (JEA), Vol. 12, Article 1.8, June 2008.
[7] Y. Ma, H. Gao, and H. Zhou, "Using Indexing Functions to Reduce Conflict Aliasing in Branch Prediction Tables," IEEE Transactions on Computers, Vol. 55, No. 8, pp. 1057-1061, August 2006.
[8] C. Y. Ho, K. F. Chong, C. H. Yau, and A. S. S. Fong, "A Study of Dynamic Branch Predictors: Counter versus Perceptron," International Conference on Information Technology (ITNG'07), pp. 528-536, April 2007.
[9] R.
Nair, "Optimal 2-Bit Branch Predictors," IEEE Transactions on Computers, Vol. 44, No. 5, pp. 698-702, May 1995.
[10] J. L. Hennessy and D. A. Patterson, Computer Architecture: A Quantitative Approach, Morgan Kaufmann Publishers, 3rd Edition, 2003.
[11] D. A. Jimenez and C. Lin, "Dynamic Branch Prediction with Perceptrons," International Symposium on High-Performance Computer Architecture (HPCA), pp. 197-206, January 2001.
[12] A. S. Fong and C. Y. Ho, "Global/Local Hashed Perceptron Branch Prediction," International Conference on Information Technology: New Generations (ITNG '08), pp. 247-252, April 2008.
[13] D. A. Jimenez, "Improved Latency and Accuracy for Neural Branch Prediction," ACM Transactions on Computer Systems (TOCS), Vol. 23, No. 2, pp. 197-218, May 2005.
[14] K. C. Breen and D. G. Elliott, "Aliasing and Anti-Aliasing in Branch History Table Prediction," Computer Architecture News, Vol. 31, No. 5, pp. 14, December 2003.
[15] T. Austin, D. Ernst, E. Larson, C. Weaver, R. Desikan, J. Huh, B. Yoder, D. Burger, and S. Keckler, SimpleScalar Tutorial 2001 (for release 4.0).
[16] SPEC 2000, The SPEC 2000 Benchmark Report, Waterside Associates, Fremont, CA, January 1990.
[17] M. Burtscher and B. G. Zorn, "Prediction Outcome History-based Confidence Estimation for Load Value Prediction," Journal of Instruction-Level Parallelism, Vol. 1, May 1999.
[18] T. H. Heil, Z. Smith, and J. E. Smith, "Improving Branch Predictors by Correlating on Data Values," 32nd Annual International Symposium on Microarchitecture (MICRO-32), pp. 28-37, November 1999.
[19] R. Thomas, M. Franklin, C. Wilkerson, and J. Stark, "Improving Branch Prediction by Dynamic Dataflow-based Identification of Correlated Branches from a Large Global History," 30th Annual International Symposium on Computer Architecture (ISCA), pp. 314-323, June 2003.
[20] R. Sendag, J. J. Yi, P. Chuang, and D. J.
Lilja, "Low Power/Area Branch Prediction Using Complementary Branch Predictors," IEEE International Symposium on Parallel and Distributed Processing (IPDPS 2008), pp. 1-12, June 2008.
[21] P. Chang, M. Evers, and Y. N. Patt, "Improving Branch Prediction Accuracy by Reducing Pattern History Table Interference," Proceedings of the 1996 Conference on Parallel Architectures and Compilation Techniques (PACT '96), pp. 48-57, October 1996.
[22] A. R. Talcott, M. Nemirovsky, and R. C. Wood, "The Influence of Branch Prediction Table Interference on Branch Prediction Scheme Performance," International Conference on Parallel Architectures and Compilation Techniques, pp. 89-98, June 1995.
[23] J. Stark, M. Evers, and Y. N. Patt, "Variable Length Path Branch Prediction," International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), pp. 170-179, December 1998.

Programming the Story: Interactive Storytelling System

SeokKyoo Kim, SungHyun Moon and SangYong Han
Seoul National University, Seoul, Republic of Korea
E-mail: {anemone, shmoon, syhan}@pplab.snu.ac.kr

Juno Chang
Sangmyung University, Seoul, Republic of Korea
E-mail: jchang@smu.ac.kr

Keywords: interactive storytelling, planning, game programming

Received: September 1, 2009

A multi-story can be generated by the interactions of users in an interactive storytelling system. In this paper, we suggest a narrative structure and a corresponding Storytelling Markup Language. Actor, Action, and Constraint are declared and programmed using the interactive storytelling system, which generates the stories. Generated stories can be transformed into multimedia formats such as text, images, and animations.

Povzetek: Opisan je sistem za generiranje mnogoterih zgodb.

1 Introduction

A term called 'Digital Storytelling' is being used in various sectors of society nowadays.
Studies are being conducted not only in academic fields that have traditionally dealt with storytelling, such as literature, but also in media studies, computer engineering, and others, owing to its involvement with digital technology. Although the term has not been defined concretely, it is recognized as storytelling in the digital era. One definition of digital storytelling is storytelling that is done by applying digital technology in the media environment or as a means of expression. In other words, digital storytelling in a broad sense indicates the case wherein digital technology is applied to the entire media environment in the production process of audiovisual materials, or, at least, to creating story and discourse as a means of expression. Interactive storytelling is one area of digital storytelling. This concept is a type of narrative using an interaction between the emotional and dramatic aspects of the story and the computer: storytelling in which the user exerts an influence so as to change the direction in which the story proceeds. The closest examples are edutainment and video games, in which the progress and ending of the story can vary depending on the choices made by the user. In games especially, such a characteristic can be found in RPGs (Role Playing Games), adventure games, visual novels, and others. Studies on such interactive storytelling are not yet being actively conducted, but various attempts are now being made along with the development of online content such as online games. This study considers the following as essential components of interactive storytelling. 1) Narrative Structure - The essential elements for processing stories and the structure to express such elements are required. 2) Script language to embody the narrative structure.
- A system, wherein the narrative structure is expressed in a type of language which authors or programmers can understand and the computer can process, needs to be arranged. 3) Story generator and authoring tool assisting in the generation of story and narrative structure even without professional knowledge of programming languages. - The development environment, which allows story makers to create stories easily even if they do not have knowledge of script languages, should be supported. The story generator interprets input information as narrative structure, converts it into the script language, and then generates the story based on it. The authoring tool is implemented as a graphical user interface environment including the story generator. Figure 1 shows the constituents of an interactive storytelling system.

Figure 1: An Interactive Storytelling System

Among the various important elements constituting an interactive storytelling system, the narrative structure and the interactive storytelling script language based on it are suggested in this paper. For this purpose, the existing script languages and their expression modes are described in Section 2, and the constraint-based narrative structure is suggested in Section 3. In Section 4, SML (Storytelling Markup Language) is suggested for the system. This script language is fundamental to an interactive storytelling system and is used to develop the story generator and authoring tools, making it possible to generate stories. The last section summarizes this study, discussing methods of utilizing this study and future tasks.

2 Related works

Languages that have been used in interactive storytelling systems so far can be largely divided into three types: natural language, logic programming languages, and markup languages derived from XML. A natural language is the most suitable language to narrate actions, characters, events, and other sources of story because it expresses a human language as it is.
Although using a natural language provides convenience for users and increases accessibility and legibility, there are many difficulties compared to processing the existing programming languages. In previous studies, such a natural language system was used in integration with a speech recognition system, which showed the final stories by processing the natural language that came through the speech recognition system. Since logic programming languages use an artificial intelligence planning technique in the narrative structure of an interactive storytelling system, programming languages that are compatible with logic programming are used. For example, there are STRIPS (STanford Research Institute Problem Solver) [2] and languages derived from STRIPS. STRIPS, which was introduced to solve AI problems, is the most suitable for expressing a planning algorithm of narrative structure, and thus it is used in interactive storytelling systems [3][4]. Lastly, there is XML (eXtensible Markup Language). HTML (Hyper-Text Markup Language), a subordinate concept of XML, is used as the standard output format of the World Wide Web all over the world, and many users thus know or can easily learn the format. Along with such an environmental factor, XML expresses all information in characters that users know, leading to high legibility. XML has been widely used as the standard to express information since it was proposed by the W3C (World Wide Web Consortium), and it has even been suggested that it might replace HTML. Many development tools already include libraries dealing with XML and are continuously developing them. Accordingly, XML has the advantage of being widely used. Most of all, XML, a language with the property of generality, can add various formats and express almost any imaginable format.
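To make the STRIPS approach mentioned above concrete, here is a minimal sketch of a STRIPS-style operator applied to a story world. The facts and the "open door" action are invented for illustration; this is not the representation used by any of the cited systems.

```python
# A STRIPS-style action has preconditions, an add list, and a delete
# list, applied to a world state represented as a set of ground facts.

def applicable(state, preconditions):
    return preconditions <= state        # all preconditions hold

def apply_op(state, preconditions, add_list, delete_list):
    assert applicable(state, preconditions)
    return (state - delete_list) | add_list

# Hypothetical story-world facts and an "open door" action:
state = {"at(hero, castle)", "has(hero, key)"}
pre = {"at(hero, castle)", "has(hero, key)"}
add = {"door_open(castle)"}
delete = {"has(hero, key)"}          # the key is consumed
new_state = apply_op(state, pre, add, delete)
```

A planner chains such operators until the facts describing the story's desired ending hold, which is how planning-based narrative structures generate the sequence of story events.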
XML-derived languages used in interactive storytelling systems include MPML (Multimodal Presentation Markup Language) [5], AIML (Artificial Intelligence Markup Language) [6], APML (Affective Presentation Markup Language) [7], FML (Functional Markup Language) [8], and BML (Behavior Markup Language) [9]. MPML is a markup language suitable for controlling the actions of characters in a lifelike way; it is a powerful language that can control the behavior of two-dimensional characters, the presentation flow, and the integration of external objects. FML and BML are designed to unify the representational framework for Embodied Conversational Agents (ECAs) so as to produce multimodal behaviors of computer-generated characters in a broad range of circumstances. APML is another attempt for ECAs, able to generate context-adapted behaviors based on a Mind-Body interface: the Mind represents the personality of an agent, while the Body reflects its appearance and expressive behaviors. However, since these are languages for controlling agents, they fall short as script languages for generating stories, which is what this study aims at. In the VISTA (Virtual Interactive Story Telling Agents) project [10], AIML was used to write the programs. AIML is an XML-style script language supporting AI application programs, and the VISTA system used AIML interlocked with Prolog: AIML handled the question-answer relationships applied to stories, while Prolog was used to generate various action rules. With AIML, however, all questions and answers must be defined in advance, and it is hard to produce varied results inferred from varying conditions. A further difficulty is that general users who are not familiar with programming must nevertheless know Prolog.
In addition to languages for formalizing narrative structure, authoring tools that assist in programming the language and specifying the information have been investigated and developed: INSCAPE [11] and PRISM [12]. In INSCAPE, an author writes an interactive story idea; prepares characters, props, and stages; and plans the entire flow of the story with those assets to achieve the desired goals. It also adopted an XML-style language called ICML (Inscape Communication Mark-up Language) as its underlying data model, designed to create interactive stories for edutainment, simulation, training, and other areas of nonlinear story. PRISM provides a story map for setting up an interactive story in a way similar to INSCAPE, and it adopts a hybrid narrative structure combining 'condition based branching narrative' and 'planning' methods to generate interactive stories. 3 Narrative structure As examined above, various script languages have been used in interactive storytelling systems. This paper proposes SML, based on XML. Although script languages derived from XML already exist, such as MPML and AIML, they cannot define the narrative structure or have difficulty generating a variety of stories. In comparison, the SML proposed in this study has the following strength: it defines the narrative structure so that authors can intuitively program stories without difficulty, and languages conforming to the XML format can then be defined according to this narrative structure. 3.1 Constraint based narrative structure Previous studies have expressed stories mainly in STRIPS or Lisp format in order to solve problems with AI planning techniques [1][4][13]. Planning algorithms for narrative structure include HTN (Hierarchical Task Network) and HSP (Heuristic Search Planning); STRIPS is used in systems based on both. The structure of an HTN can be represented as a tree in which the conclusion of the story is the root and each sub-conclusion is a child.
HTN planning employs a top-down method, which leads to good narrative coherence. In contrast, HSP generates a path from an initial state to a goal state, which makes it possible to generate stories flexibly [3]; the method introduced in this paper can be regarded as a kind of HSP. HTN constructs a story as a tree. The root node and each non-terminal node indicate the conclusion and the sub-conclusions of the story, respectively, while a terminal node indicates the occurrence of a certain action. To construct the story, the final conclusion is divided into several sub-conclusions, each of which is divided again into further sub-conclusions, and the story is generated by solving each of them; several actions together solve the smallest unit of sub-conclusion. In other words, the story is considered the combination of sub-conclusions, and the lowest-level sub-conclusion comprises a collection of a specific character's actions. The story is generated with an AI divide-and-conquer planning technique, in which the problem is divided into smaller units that are then solved and combined. In such a structure the conclusion cannot be reached unless the sub-conclusions are satisfied, which in turn guarantees a coherent flow of the story. On the other hand, the story is processed around only those actions that are preconditions of the conclusion, so only stories lying on paths of the HTN are formed, which decreases the degree of freedom. An ideal interactive storytelling system should give the story a high degree of freedom while maintaining a coherent flow. It is, however, not easy to realize a system satisfying both, because the two are usually in conflict [13].
This study aims at a direction of story generation that makes it possible for authors to adjust the degree of freedom and the coherence of stories. Previous studies maintained causal relationships through AI divide-and-conquer planning, but hit a limit on the degree of freedom. This study therefore proposes a form that escapes that frame by eliminating the hierarchical structure. In other words, contrary to the existing structures, the story is seen as a continuous sequence of character actions, with the sub-conclusion level removed. In place of the eliminated sub-conclusions, constraint conditions such as causal or temporal relationships between actions are declared in order to maintain the unity of the story, and by adjusting the strength of these constraints, the coherence and the degree of freedom can be balanced to the extent the author wants. The narrative structure proposed in this study is divided into a constituent declaration and a constraint declaration. The constituent declaration defines actors, action properties, stages, and props, while the constraint declaration provides the various condition-relation structures that control the flow of the story and support its nonlinear progress. 3.2 Constituent declaration In the constituent declaration, the basic constituents necessary for stories are declared: property, stage, actor, props, action, and others. Figure 2 shows the constituents defined in the declaration and the interdependence among them. Figure 2: Declaration Constituents and Relationships - Property A property expresses a characteristic of actors or props numerically; actors and props can hold a numerical value for each property they need. - Stage A stage represents the space where the story proceeds.
- Actor An actor is the constituent that becomes the subject or the object of the story: it performs actions or is acted upon. Each actor has a name and contains the types of properties the actor has, the value of each, and the stage information marking the space in which the actor is at the very beginning of the story. - Props Props are things that can only be the object, not the subject, of actions. Thus, things or actors that cannot be the subject of actions are defined as props. - Action An action is a motion that actors can take. It specifies the actors who can carry it out, the properties necessary for it, the stages where it can occur, and how many objects it takes. In this constraint-based narrative structure, properties are shared between actors and actions. This is the device by which more logical choices can be made in deciding the motions of actors. Consider an example of its use. Assume an actor 'A' has the properties talkative = "80" and aggressive = "30", the action 'attack' has the properties talkative = "30" and aggressive = "80", and the action 'talk' has the properties talkative = "80" and aggressive = "30". If actor 'A' is in a situation where he or she must perform either 'attack' or 'talk', the probability of carrying out the 'talk' motion increases because of the property values. A story generator must fully utilize this structure in order to establish the balance between the characteristics and the actions of actors. The constituents listed above are combined to generate an event. In brief, an event means that a specific actor carries out a certain motion, where the defined form of the action determines whether an object is present. Figure 3: Components of event Figure 3 shows the necessary components of an event.
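The property-matching idea above can be made concrete with a small sketch. This is an illustrative assumption, not the paper's algorithm: the function names and the scoring rule (each shared property contributes 100 minus the absolute difference between the actor's and the action's values) are invented here to show how shared properties can bias action choice.

```python
import random

def similarity(actor_props, action_props):
    """Score an action by how closely its property profile matches the
    actor's: each shared property contributes 100 - |actor - action|."""
    shared = actor_props.keys() & action_props.keys()
    return sum(100 - abs(actor_props[p] - action_props[p]) for p in shared)

def choose_action(actor_props, actions):
    """Pick one action, with probability proportional to its score."""
    scores = {name: similarity(actor_props, props)
              for name, props in actions.items()}
    chosen = random.choices(list(scores), weights=list(scores.values()))[0]
    return chosen, scores

# The values assumed in the text: actor A is talkative, not aggressive.
actor_a = {"talkative": 80, "aggressive": 30}
actions = {"attack": {"talkative": 30, "aggressive": 80},
           "talk":   {"talkative": 80, "aggressive": 30}}
_, scores = choose_action(actor_a, actions)
# 'talk' matches A exactly (score 200) while 'attack' differs by 50 on
# both properties (score 100), so 'talk' is twice as likely to be chosen.
```

Under such a scheme the author tunes an actor's behavior simply by editing property values in the constituent declaration, which is the balance between characteristics and actions that the structure is meant to support.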
Concretely, an event combines actor, stage, action, and props in the form 'a specific actor does what action (to something) (to someone) where'. The generation of a story in this structure means the generation of such events. In addition, a special component called 'viewpoint' is introduced to reduce the complexity of story generation. It indicates the stage at which the user is currently looking. All story generation is limited to the current viewpoint, just as in a play the audience sees only one stage; and just as a curtain comes down when the set changes, a change of viewpoint is required to change the stage. As an initial stage must be specified in a play, the viewpoint for the beginning of the story must be specified in the constituent declaration as well. 3.3 Constraint declaration In the constraint declaration, the constraint conditions that control the direction of the narrative flow are declared. These conditions act as a framework holding the direction in which the story proceeds and preventing it from taking a wrong turn. The degree of freedom is determined by the strength of the constraint conditions; their types are shown below. The descriptions here are conceptual, so an actual script language should additionally support a wildcard character ('?') and logical operations in order to control the stories more delicately. 1) Ending : (event) → (end) This condition ends the story: if a given event occurs, the story ends. Multiple endings can be set up by assigning several kinds of events. 2) Transaction : (pre event) → (post event) Two events occur together, like a transaction. This condition is set in order to induce another event to occur in sequence upon the occurrence of a specific event.
In a script language, post-events should be able to refer to the subject and the object of the pre-event, so that extensive forms of the transaction condition can be written. For example, assume the wildcard character '?', designating any actor, is given as a condition, and the pre-event is defined as '?' hit '?'. It should then be possible to define the post-event as '[the object of the pre-event] hit [the subject of the pre-event]'; in this case, the subject and the object of the post-event are determined at the moment the pre-event occurs. 3) Transition : (event/action) → (change property) A transition constraint changes the value of a specific property upon the occurrence of a designated event or action. The condition is set, for example, as 'reduce the health property of the object of a hit action'. This is useful for defining motions accompanied by changes of a specific state. 4) Induction : (condition of property) → (event/action) An induction constraint declares that an event or motion is performed when a specific property satisfies a certain condition; for example, 'eat rice when hungry is less than 30'. This constraint is assigned to make an actor perform a specific motion according to a change of state. 5) Must : (pre event) → (post event) This constraint condition serves several functions. Its meaning is: 'if the pre-event occurs without the post-event having occurred, the post-event must occur before the story ends.' In other words, it is used when, in a causal relationship, the event corresponding to the effect must occur whenever the event corresponding to the cause occurs. A negative option can also be placed on the post-event so that the post-event must never occur once the pre-event has occurred; this forbids a particular event from ever following a specific event.
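The wildcard and back-reference mechanism of the transaction constraint can be sketched as follows. This is a minimal illustration under assumed representations, not part of SML: events are (subject, action, object) triples, and the names `bind_post_event`, `"pre.subject"`, and `"pre.object"` are invented here.

```python
def bind_post_event(pre_pattern, post_template, event):
    """If `event` matches `pre_pattern` (where '?' matches any actor),
    return the concrete post-event with references resolved; else None."""
    subj, act, obj = event
    p_subj, p_act, p_obj = pre_pattern
    if act != p_act:
        return None
    if p_subj not in ("?", subj) or p_obj not in ("?", obj):
        return None
    # Resolve back-references to the matched pre-event.
    refs = {"pre.subject": subj, "pre.object": obj}
    return tuple(refs.get(part, part) for part in post_template)

# Pre-event pattern: '?' hit '?'; post-event: the object hits back.
post = bind_post_event(("?", "hit", "?"),
                       ("pre.object", "hit", "pre.subject"),
                       ("Frodo", "hit", "Gollum"))
# post == ("Gollum", "hit", "Frodo"), bound at the moment the pre-event occurs
```

The binding happens at match time, which is exactly the behavior the text requires: the post-event's subject and object are fixed only when a concrete pre-event occurs.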
For the 'must' condition, an event that must occur in any case can also be indicated by setting only the post-event, with no pre-event. 6) Ordering : (pre event) → (post event) This declaration adjusts the logical flow of the story. It makes no difference if the events assigned here never occur; it only states that 'the pre-event must always occur before the post-event'. In other words, if the pre-event does not occur, the post-event cannot occur either. 7) Stage Change : (event) → (change stage) This constraint changes the current stage of the subject or the object of an event when a specific event occurs. It is similar to having characters withdraw from or appear on the stage in a play. In summary, the basic environment and constituents required for the story are defined in the constituent declaration, and the constraint items holding the outline of the overall direction of the story are assigned in the constraint declaration. 4 SML (Storytelling Markup Language) 4.1 SML The narrative structure for interactive storytelling was defined above. However, that structure is merely a kind of abstract data type. In order to generate stories, its components must be expressed in a definite language that the story generation engine can process. A language for expressing the narrative structure requires the following characteristics. First, legibility should be good enough that users can easily understand the meaning without difficulty and add what they want. There would normally be no occasion to write code by hand, because authoring tools are used; good legibility, however, ensures there is no difficulty even when code is written directly. Second, there should be good expressiveness. In the structure proposed above, an actor or an action has subordinate constituents in complex forms.
In particular, the definition of a motion demands information about the properties necessary for the motion, the actors who can perform it, and the stages where it can occur, and each motion has a different number of such constituents. The structures defined in the constraint declaration are combined with 'AND' or 'OR' relationships and demand a wildcard character and a reference structure, which makes them more complex still. The language must be able to express such complex, many-sided structures. Third, the structure should be easy to process. An authoring tool or a story generator must generate and parse code written in the language, so the structure of the language should make it easy to develop tools that support it. This characteristic is especially important when the authoring tool, the story generator, and the script language are independent of each other, since it even makes it possible to develop various authoring tools and story generators for the language afterwards. The language should be defined with the above conditions in mind in order to achieve satisfactory results. Considering the structures of existing languages, the XML (eXtensible Markup Language) format comes to mind first. XML can be seen as a super-ordinate concept of HTML (Hyper-Text Markup Language). HTML is well known to many users because it is the standard output mode of the World Wide Web; many users, in turn, can read and write HTML, so the XML mode can be considered to have quite excellent legibility. In addition, XML itself is a language that can freely accommodate various formats, leading to fairly good expressiveness. Also, since XML is widely known and used, there are many related libraries.
Since languages are easy to handle with the help of libraries, the XML format is, in a sense, an 'easy-to-handle structure'. As described, XML satisfies all the conditions listed above. This study therefore defines the previously presented narrative structure in the XML format, naming the language SML (Storytelling Markup Language). The DTD (Document Type Definition) of SML is not included in this paper, since the constituents and the constraint conditions of the narrative structure are large in quantity and complicated. 4.2 An example of SML Figure 4: An Example of Story Structure. Figure 4 illustrates part of a story, while Figure 5 shows, in SML, only the part necessary for the story illustrated in Figure 4. The event 'Frodo finds the One Ring in the Middle Earth' or 'Gollum finds the One Ring in the Middle Earth' is set to occur for certain with a 'must' constraint that has no pre-event. With a 'transaction' constraint, upon generation of an event 'somebody finds the One Ring in the Middle Earth', 'the finder rules or is ruled by the One Ring'; whether the finder is ruled by the One Ring depends on the properties of Frodo and Gollum. The property value of 'ruling' is higher than that of 'being ruled by' in the case of Frodo, while the property value of 'being ruled by' is higher for Gollum. Thus an event in which the action 'rule' occurs is more likely to be generated for Frodo, while an event in which the action 'be ruled by' occurs becomes more likely for Gollum. 4.3 Story generator In the previous sections, the narrative structure was defined, along with the language to express it. At this point, a compiler or interpreter is needed to interpret code written in the language and perform the operations: in other words, a story generator, which receives SML code and generates stories, is required.

Figure 5: An Example of SML. The listing declares the stage MiddleEarth; the actors Frodo and Gollum together with their property values; the prop TheOneRing; the actions 'find', 'rule', and 'ruledby', each with its permitted subjects (Frodo, Gollum) and stage; and the 'must' and 'transaction' (transaction0) constraints described above.
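As a first step, a story generator must parse the SML declarations. The paper does not publish the DTD of SML, so the document in the sketch below is a hypothetical reconstruction: the tag and attribute names (actor/actorId, stageR/stageId, transaction/transactionId, actionR/actionId) follow the fragments visible in Figure 5, but the overall structure is an assumption made for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical SML fragment; tag/attribute names inferred from Figure 5.
SML = """
<sml>
  <constituents>
    <stage stageId="MiddleEarth"/>
    <actor actorId="Frodo"><stageR stageId="MiddleEarth"/></actor>
    <actor actorId="Gollum"><stageR stageId="MiddleEarth"/></actor>
    <props propId="TheOneRing"/>
  </constituents>
  <constraints>
    <transaction transactionId="transaction0">
      <preEvent><actionR actionId="find"/></preEvent>
      <postEvent><actionR actionId="rule"/></postEvent>
    </transaction>
  </constraints>
</sml>
"""

# Parse the declarations into plain Python values the generator can use.
root = ET.fromstring(SML)
actors = [a.get("actorId") for a in root.iter("actor")]
transactions = [t.get("transactionId") for t in root.iter("transaction")]
# actors == ["Frodo", "Gollum"]; transactions == ["transaction0"]
```

From these parsed declarations, a generator would then repeatedly propose candidate events, filter them through the declared constraints, and stop when an 'ending' constraint fires.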