Volume 35 Number 2 June 2011 ISSN 0350-5596 Informatica An International Journal of Computing and Informatics EDITORIAL BOARDS, PUBLISHING COUNCIL Informatica is a journal primarily covering intelligent systems in the European computer science, informatics and cognitive community; scientific and educational as well as technical, commercial and industrial. Its basic aim is to enhance communication between different European structures on the basis of equal rights and international refereeing. It publishes scientific papers accepted by at least two referees outside the author's country. In addition, it contains information about conferences, opinions, critical examinations of existing publications and news. Finally, major practical achievements and innovations in the computer and information industry are presented through commercial publications as well as through independent evaluations. Editing and refereeing are distributed. Each editor from the Editorial Board can conduct the refereeing process by appointing two new referees or referees from the Board of Referees or Editorial Board. Referees should not be from the author's country. If new referees are appointed, their names will appear in the list of referees. Each paper bears the name of the editor who appointed the referees. Each editor can propose new members for the Editorial Board or referees. Editors and referees inactive for a longer period can be automatically replaced. Changes in the Editorial Board are confirmed by the Executive Editors. The necessary coordination is made through the Executive Editors, who examine the reviews, sort the accepted articles and maintain appropriate international distribution. The Executive Board is appointed by the Society Informatika. Informatica is partially supported by the Slovenian Ministry of Higher Education, Science and Technology. Each author is guaranteed to receive the reviews of his article.
When accepted, publication in Informatica is guaranteed in less than one year after the Executive Editors receive the corrected version of the article. Executive Editor - Editor in Chief Anton P. Železnikar Volaričeva 8, Ljubljana, Slovenia s51em@lea.hamradio.si http://lea.hamradio.si/~s51em/ Executive Associate Editor - Managing Editor Matjaž Gams, Jožef Stefan Institute Jamova 39, 1000 Ljubljana, Slovenia Phone: +386 1 4773 900, Fax: +386 1 251 93 85 matjaz.gams@ijs.si http://dis.ijs.si/mezi/matjaz.html Executive Associate Editor - Deputy Managing Editor Mitja Luštrek, Jožef Stefan Institute mitja.lustrek@ijs.si Executive Associate Editor - Technical Editor Drago Torkar, Jožef Stefan Institute Jamova 39, 1000 Ljubljana, Slovenia Phone: +386 1 4773 900, Fax: +386 1 251 93 85 drago.torkar@ijs.si Editorial Board Juan Carlos Augusto (Argentina) Costin Badica (Romania) Vladimir Batagelj (Slovenia) Francesco Bergadano (Italy) Marco Botta (Italy) Pavel Brazdil (Portugal) Andrej Brodnik (Slovenia) Ivan Bruha (Canada) Wray Buntine (Finland) Ondrej Drbohlav (Czech Republic) Hubert L. Dreyfus (USA) Jozo Dujmovic (USA) Johann Eder (Austria) Ling Feng (China) Vladimir A. Fomichov (Russia) Maria Ganzha (Poland) Marjan Gušev (Macedonia) N. Jaisankar (India) Dimitris Kanellopoulos (Greece) Samee Ullah Khan (USA) Hiroaki Kitano (Japan) Igor Kononenko (Slovenia) Miroslav Kubat (USA) Ante Lauc (Croatia) Jadran Lenarčič (Slovenia) Shiguo Lian (China) Huan Liu (USA) Suzana Loskovska (Macedonia) Ramon L. de Mántaras (Spain) Angelo Montanari (Italy) Deepak Laxmi Narasimha (Malaysia) Pavol Návrat (Slovakia) Jerzy R. Nawrocki (Poland) Nadja Nedjah (Brazil) Franc Novak (Slovenia) Marcin Paprzycki (USA/Poland) Ivana Podnar Žarko (Croatia) Karl H.
Pribram (USA) Luc De Raedt (Belgium) Shahram Rahimi (USA) Dejan Rakovic (Serbia) Jean Ramaekers (Belgium) Wilhelm Rossak (Germany) Ivan Rozman (Slovenia) Sugata Sanyal (India) Walter Schempp (Germany) Johannes Schwinn (Germany) Zhongzhi Shi (China) Oliviero Stock (Italy) Robert Trappl (Austria) Terry Winograd (USA) Stefan Wrobel (Germany) Konrad Wrona (France) Xindong Wu (USA)

A Data Model and an XQuery Extension for Concurrent XML Structures

Emmanuel Bruno and Elisabeth Murisasco
LSIS - UMR CNRS 6168, Université du Sud Toulon-Var, BP 20132, 83957 La Garde Cedex, France
E-mail: {bruno, murisasco}@univ-tln.fr

Keywords: multistructure, concurrent hierarchy, textual document, tree-like structure, XML data model, XQuery

Received: March 13, 2009

An XML document is mainly hierarchical, but some applications need to simultaneously associate more than one hierarchy with the same data. In general, concurrent hierarchies cannot be merged in order to get a well-formed XML document. This work stands in this context: it aims at describing and querying hierarchical XML structures defined over the same textual data in a concurrent way. Our proposal, called MSXD, is composed of a data model (which can be seen as an index over the data and between all the structures) and a dedicated query language defined as an extension of XQuery. The key idea is to propose a method for a compact description of multiple tree-like structures over a single text, based on segmentation. Segmentation encoding allows querying overlap/containment relations of markups belonging to different structures. This paper also presents a solution to define a multistructured schema in order to describe the relationships (as weak constraints) between parts of concurrent structures. Finally, this paper focuses on the architecture of the XML environment implementing our proposals.

Povzetek (abstract in Slovenian): The system makes it possible to apply several hierarchies to XML documents.

1 Motivation

XML [10] is the de facto standard to describe structured data.
Several applications in the context of information systems are based on their use: electronic publishing, language engineering, technical documentation, digital libraries, the Web, etc. XML documents are mainly hierarchical. The hierarchy, captured in a tree-like structure [23], corresponds to one level of analysis of the data contained in the document (e.g. a logical analysis). A large set of tools is now available and widely used, in particular for editing, querying (XPath [16], XQuery [7]) or transformation (XSLT [15]). In our view, this success is due to the hierarchical structure, which is easier to exploit than a graph structure. Moreover, manipulation remains simple in an XML environment. Recently, the title of an article published at the SIGMOD conference [27] claimed that "One hierarchy is not enough" for data-centric application contexts. Indeed, the CONCUR feature of SGML [24] first pointed out this need in the nineties, but in the context of document-centric encoding, where some applications need to consider more than one hierarchy over the same text in a concurrent way. In recent years, several other works about concurrent markup have been published [31, 30, 32, 22, 33, 19, 25, 14, 21]. All these works propose solutions to describe multiple tree-like structures (with overlapping) in a single document. The main problem with concurrent hierarchies is that they cannot be merged into a well-formed XML document without using a flat representation or hyperlinks that make the structure difficult to query. Our work stands in this context. It aims at representing and querying concurrent hierarchical XML structures defined over the same textual data. We call such data "multistructured textual documents". The key idea is to propose a method for a compact description of multiple trees over a single text, based on its segmentation.
Segmentation encoding allows querying overlap/containment relations of markups belonging to different structures. The proposed solution, called MSXD (MultiStructured XML Documents)(1), is entirely XML-compatible. This paper describes on the one hand the MSXD model and its associated query language, and on the other hand the way these proposals are implemented in an XML environment. We also tackle the description of relationships between structures (as weak constraints) in order to propose a solution to define a multistructured schema enabling validation across multiple hierarchies [20]. This paper is an extended version of different preliminary contributions ([11], [12] and [13]).

1.1 A running example

To illustrate the problem, we consider a medieval manuscript (Princeton, Garrett 80 (PG)) related to medico-pharmaceutical recipes written in the Occitan language [8]. This kind of text is studied by philologists of the University of Pisa (Italy), in the Department of Language and Romance Literature. Philology is a science that studies ancient or medieval civilizations by means of literary documents. Researchers are interested in studying the text of the manuscript according to different points of view corresponding to different uses of the document. Each analysis results in a mainly hierarchical markup of the text which can easily be represented in XML. From the document represented as an image (see Figure 1), for illustrating our intention, we have extracted the following textual content: ... Per recobrar maniar Ad home cant a perdut lo maniar prin de l erba blanca ...

(1) Funded by the French research agency (ANR) - Semweb project (2004-2007).
This same text has been segmented and marked up in three ways: its physical structure S1 (the manuscript is organized into pages, on two columns composed of lines), its syntactic structure S2 (the manuscript is composed of sentences and words) and its semantic structure S3 (the manuscript describes medical prescriptions which have signs, ingredients, plants and effects). The result is shown in Figure 2, where three XML documents hierarchically organize the same text according to these three different points of view. Each structure marks up the text differently from another (see Figure 3). For example, the segment of text "Ad home cant a perdut lo maniar" is tagged by Sign in S3, while it is tagged differently in S1: "Ad home cant a perdut" is marked by Line, whereas "lo maniar" begins another textual segment also marked by Line. Moreover, notice that a Sign can overlap two lines. Therefore, these documents are independent, with potential overlapping. Structures (but not the relations between them) could be defined by means of a grammar such as a DTD, an XML Schema [6] or a RelaxNG schema [17]. These XML documents are currently considered separately, even if researchers identify correlations between them. In particular, they would like to be able to express the following queries: What are the sentences that follow the signs? (this query combines the syntactic and semantic structures); What is the prescription that contains the largest number of lines? (this query combines the physical and semantic structures); What are the words cut by an end of line? (this query combines the physical and syntactic structures). Usual XML manipulation languages do not support concurrent structures, so our issue is (1) to model multiple hierarchical structures in order to query them in a concurrent way and (2) to provide an XML environment to support multistructured documents.
Finally, notice that the (possibly weak) relationships between parts of these three structures are not defined: for example, one can observe that a page (in the physical structure) always begins with a prescription (in the semantic structure). It could be useful to use this kind of constraint during querying and to check consistency between structures. We tackle this issue by defining a schema to describe the relationships between parts of the concurrent structures.

1.2 Objectives

Our objectives are the following:

A suitable, XML-compatible, data model. This model, dedicated to multistructured textual documents, is called MSXD. It makes it possible (1) to consider several segmentations of the same text, and (2) to define a hierarchical structure for each of these segmentations. Notice that the structures can be weakly coupled and that there is no main structure. The set of segmentations and the set of hierarchical structures are deduced automatically from the given XML structured documents (for example, produced by philologists). That is why they can be developed in a distributed way. The replication of the common text (in each XML document) can be seen as a drawback, but (i) given the volumes involved, data storage is not really a problem, and (ii) synchronisation is not an issue because the common text is never updated: the only changing part (the structure) is not replicated. Moreover, each user can then edit his or her own copy offline. This data model is based on the use of hedges [28], the foundation of the grammar language RelaxNG [17] (REgular LAnguage description for XML New Generation).

An extension of XQuery [7]. Our objective is to query the structures and the textual content in a concurrent way, using an XQuery that is as unchanged as possible. One of the original contributions of our proposal is that the multistructured document is never instantiated.
As we said before, the set of structures can be syntactically described as a distributed set of XML documents: we want to keep each structure intact, so that available XML tools and languages can be used for its construction and its manipulation. Given a set of XML documents over the same textual value, an MSXD instance can be seen as an index over the textual value and between all the structures. This index is used to evaluate queries using concurrent structures.

The description of relationships between structures. We propose a solution to define a multistructured schema in order to describe the relationships (as weak constraints) between parts of the concurrent structures. This is done as a set of rules by means of Allen's relations to constrain the relative position of fragments in the structures. This work is a first step towards validation across multiple hierarchies [20].

This paper is organized as follows: Section 2 defines the MSXD data model and shows the XML syntax associated with an MSXD instance. Section 3 presents an extension of XQuery to multistructure. Section 4 proposes a schema enabling the expression of constraints between concurrent structures. Section 5 specifies the MSXD architecture. Section 6 is dedicated to related work and Section 7 concludes. Currently, we only consider the definition of multiple structures in the single textual modality.
Figure 1: A page from the manuscript

2 The MSXD model

We first choose to define the notion of a multistructured document and then to introduce all the concepts used in this definition. We illustrate the MSXD model with our running example. Our data model is based on the use of hedges (the foundation of RelaxNG). Informally, a hedge is a sequence of trees. In XML terminology, a hedge is a sequence of elements possibly intervened by character data; in particular, an XML document is a hedge [28].

2.1 Formal model definition

Definition 1. A multistructured document is a triplet (V, I, S) where V is a textual value, I a set of segmentations of V and S a set of structures associated with segmentations from I.

A multistructured document can be seen as a textual value augmented with different hierarchical structures defined over it. These structures share the same text but concern (in general) different strings extracted from that text.

Definition 2. A segmentation of the textual value V of length l is a list X_V of strings such that X_V = {x_i | x_i = V[b_i..e_i] and b_1 = 0 and e_i >= b_i and b_i = e_(i-1) + 1 and e_|X_V| = l - 1}; b_i and e_i are respectively the start and the end positions of the fragment in V.
We define two functions for each X_V[i]: start(X_V[i]) = b_i and end(X_V[i]) = e_i. The set X_V makes a total partition of the textual value V. For our example, V is the text extracted from the manuscript ("...Per recobrar maniar Ad home cant a perdut lo maniar prin de l erba blanca...") and we consider three segmentations X_V^1, X_V^2 and X_V^3 (corresponding to the sets of textual contents of the XML elements from each structure S1, S2 and S3, see Figure 2):

- X_V^1 = x_1^1 ∪ ... ∪ x_n^1, with in particular x_1^1 = "Per recobrar maniar" and the last segment = "blanca...";
- X_V^2 = x_1^2 ∪ ... ∪ x_15^2, with in particular x_1^2 = "Per" and x_15^2 = "blanca";
- X_V^3 = x_1^3 ∪ ... ∪ x_m^3, with in particular x_1^3 = "Per recobrar maniar" and x_4^3 = "erba blanca".

Figure 2: Physical (S1), Syntactic (S2) and Semantic (S3) structures

Figure 3: Multiple segmentations of the text

We use the concept of fragment to define structures: fragments mark up a segmentation. Fragment positions in the textual value are useful to compute their relative positions.

Definition 3. A fragment f is defined over a segmentation X_V of the textual value V and an alphabet Sigma_V, a set of labels:

1. f = epsilon (the empty fragment), start(f) = end(f) = 0;
2. f = x_i with x_i in X_V, start(f) = start(X_V[i]), end(f) = end(X_V[i]);
3. f = a<x> with a in Sigma_V and x a fragment, start(f) = start(x), end(f) = end(x) (f is called a tree);
4. f = x y with x and y two fragments and start(y) = end(x) + 1, start(f) = start(x), end(f) = end(y).

Notice some remarks about Definition 3:

- Rule 1 defines the empty fragment.
Recall that we want to represent several segmentations of the same text in order to define a hierarchical structure for each of these segmentations (producing several structured documents over the same text). Using empty fragments (as milestones) is not compatible with our model, because a fragment has to be related to a segmentation. The only legal empty fragment is the one associated with the empty document.

- Rule 3 uses the alphabet Sigma_V, which is a set of labels for fragments having a tree-like structure, corresponding to XML element names: for the physical analysis, Sigma_V^1 = {Manuscript, Page, Column, Line}; for the syntactic analysis, Sigma_V^2 = {Manuscript, Syntax, Sentence, W}; for the semantic analysis, Sigma_V^3 = {Manuscript, Prescription, Ingredient, Sign, Plant}.

- Rule 4 produces sequences of fragments. We do not make a distinction between a fragment and a singleton sequence containing that fragment, and we do not consider nested sequences.

We now give some examples of fragments respectively constructed over the segmentations X_V^1 and X_V^3:

- Over X_V^1 (two fragments): (1) f_1^1 = x_1^1 with x_1^1 = "Per recobrar maniar", start(f_1^1) = 1, end(f_1^1) = 19; and (2) f_2^1 = Line<x_1^1>, start(f_2^1) = start(f_1^1), end(f_2^1) = end(f_1^1). The fragment f_2^1 is represented in XML syntax by <Line>Per recobrar maniar</Line>.

- Over X_V^3 (three fragments): (1) f_4^3 = x_4^3 with x_4^3 = "erba blanca", start(f_4^3) = 60, end(f_4^3) = 70; (2) f_5^3 = Plant<f_4^3>, start(f_5^3) = start(f_4^3), end(f_5^3) = end(f_4^3); and (3) f_6^3 = Ingredient<f_5^3 ...>, start(f_6^3) = start(f_5^3), end(f_6^3) = 92 (we suppose that 92 is the end position of the textual fragment marked by "..." in our example). The fragment f_6^3 is represented in XML by <Ingredient><Plant>erba blanca</Plant>...</Ingredient>.

Definition 4. A structure is a tree f (a labelled fragment) over a segmentation X_V of the textual value V, with start(f) = 0 and end(f) = l - 1 (l being the length of V).
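To make Definitions 2-4 concrete, the following Python sketch builds a segmentation with contiguous start/end positions and wraps a segment into a labelled tree fragment, mirroring rules 2 and 3. All class and function names here are our own illustration, not part of MSXD.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Segment:
    """One element x_i of a segmentation X_V: a slice V[b_i..e_i]."""
    text: str
    start: int  # b_i
    end: int    # e_i (inclusive, as in Definition 2)

def segmentation(parts: List[str], offset: int = 0) -> List[Segment]:
    """Build a segmentation from consecutive strings: b_i = e_(i-1) + 1."""
    segments, pos = [], offset
    for p in parts:
        segments.append(Segment(p, pos, pos + len(p) - 1))
        pos += len(p)
    return segments

@dataclass
class Fragment:
    """A fragment per Definition 3: a segment or a labelled tree."""
    start: int
    end: int
    label: Optional[str] = None       # set for rule-3 (tree) fragments
    children: Optional[list] = None   # the fragment wrapped by the label

def tree(label: str, child: Fragment) -> Fragment:
    """Rule 3: f = a<x>; the tree inherits its child's positions."""
    return Fragment(child.start, child.end, label, [child])

# Fragments f_1^1 = x_1^1 and f_2^1 = Line<x_1^1> from the running example
x1 = Segment("Per recobrar maniar", 1, 19)
f1 = Fragment(x1.start, x1.end)
f2 = tree("Line", f1)
assert (f2.start, f2.end) == (1, 19) and f2.label == "Line"
```

The contiguity condition b_i = e_(i-1) + 1 is what makes the positions an index: any fragment's extent can be compared with any other's, whatever structure it comes from.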
Figure 4 illustrates our model over the two segmentations X_V^1 and X_V^3 (we do not show the third segmentation, to make the figure easier to read) and the two structures S1 and S3 defined over them. The figure is composed of two parts: the first indicates the segmentations (start and end positions are associated with each textual segment inside a segmentation; for convenience, the numbering of start and end positions only takes into account the segments used in our example), and the second part shows the hierarchical organization of fragments into a structure (physical structure on the left, semantic structure on the right). In summary, from the XML documents of Figure 2, we can extract the text V and we can deduce three structures (S1, S2 and S3) built on three segmentations (X_V^1, X_V^2 and X_V^3). Fragments are constructed over segmentations.

2.2 Relative position of two fragments

Our model is designed so that Allen's relations [2] can be used on fragments in order to compute their relative position inside a segmentation or between two segmentations.

Definition 5. Predicates on two fragments f1 and f2 are defined over one or two segmentations of the same textual value:

- before(f1, f2) = after(f2, f1): end(f1) < start(f2)
- before(f1, f2, n) = after(f2, f1, n): start(f2) - end(f1) = n
- meets(f1, f2) = met-by(f2, f1): end(f1) = start(f2)
- during(f1, f2) = contains(f2, f1): start(f1) > start(f2) and end(f1) < end(f2)
- overlaps(f1, f2) = is-overlapped(f2, f1): start(f1) < start(f2) and end(f1) > start(f2) and end(f1) < end(f2)
- starts(f1, f2) = started-by(f2, f1): start(f1) = start(f2) and end(f1) < end(f2)
- finishes(f1, f2) = finished-by(f2, f1): end(f1) = end(f2) and start(f1) > start(f2)
- equals(f1, f2) = equals(f2, f1): start(f1) = start(f2) and end(f1) = end(f2)

Notice that if f1 and f2 are defined on the same segmentation, the predicates meets and overlaps are always false.
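Under the paper's position encoding (inclusive integer start/end), Definition 5 can be sketched directly. The fragment positions below are hypothetical, chosen only so that a Plant spans a Line boundary as in the running example; the inverse relations (after, met-by, contains, is-overlapped, started-by, finished-by) are obtained by swapping the arguments.

```python
# Fragments reduced to (start, end) position pairs, both inclusive.

def before(f1, f2):   return f1[1] < f2[0]
def meets(f1, f2):    return f1[1] == f2[0]
def during(f1, f2):   return f1[0] > f2[0] and f1[1] < f2[1]
def overlaps(f1, f2): return f1[0] < f2[0] and f2[0] < f1[1] < f2[1]
def starts(f1, f2):   return f1[0] == f2[0] and f1[1] < f2[1]
def finishes(f1, f2): return f1[1] == f2[1] and f1[0] > f2[0]
def equals(f1, f2):   return f1[0] == f2[0] and f1[1] == f2[1]

# Hypothetical positions: a Plant fragment straddling two Line fragments
plant = (60, 70)      # "erba blanca"
line_a = (50, 65)
line_b = (66, 80)
assert overlaps(line_a, plant)   # the first Line runs into the Plant
assert overlaps(plant, line_b)   # the Plant runs into the second Line
assert not during(plant, line_a) and not equals(plant, line_b)
```

Because the predicates only read positions, they work identically for two fragments of the same structure and for fragments of different structures, which is exactly what makes cross-hierarchy querying possible.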
Finally, we need to compute the level of a fragment in a structure. This level captures the parent/child relationship between two fragments in a structure.

Definition 6. Let F(s) (s is a structure) be the set of fragments f such that f = s or there exist x in F(s) and a in Sigma_V with x = a<f>. The function level(s, f) returns the level of the fragment f in the structure s; it is calculated with the following algorithm:

- level(s, s) = 0
- level(s, y) = level(s, x) + 1 with x = a<y> (x and y in F(s)).

Figure 4 also shows the relative position of two fragments in two segmentations. In particular, the two following Allen's predicates are true (they indicate that a Plant overlaps two Lines): overlaps(Plant(s'5, e'5), Line(s5, e5)) = true and overlaps(Plant(s'5, e'5), Line(s6, e6)) = true. Lastly, notice that in Figure 4 we have associated with each fragment its level in the hierarchical structure to which it belongs. For example, in S1: level(S1, Manuscript<>) = 0, level(S1, Page<>) = 1, level(S1, Column<>) = 2, level(S1, Line<>) = 3.

Figure 4: Illustration of our model

2.3 MSXD XML syntax

We see that, for a given multistructured document, each structure can be described using an XML syntax; thus its schema can be described using RelaxNG (see Figure 5). We choose RelaxNG because our model relies on hedges to represent XML documents, and the design of RelaxNG is based on this theory [28]. Another, equivalent, solution could have been to use XML Schema. We define an XML syntax for an MSXD instance (see Figure 6).
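The level function of Definition 6 amounts to the depth of a labelled fragment in its structure tree. A minimal sketch follows; the tuple encoding of a structure is our own, hypothetical representation, not the MSXD storage format.

```python
def level(structure, target):
    """level(s, s) = 0; level(s, y) = level(s, x) + 1 when x = a<y>.
    A structure is a (label, children) tuple; children are sub-tuples
    (tree fragments) or plain strings (textual segments)."""
    def walk(node, depth):
        label, children = node
        if label == target:
            return depth
        for child in children:
            if isinstance(child, tuple):
                found = walk(child, depth + 1)
                if found is not None:
                    return found
        return None
    return walk(structure, 0)

# The physical structure S1 from the running example
s1 = ("Manuscript",
      [("Page",
        [("Column",
          [("Line", ["Ad home cant a perdut"]),
           ("Line", ["lo maniar prin de l"])])])])
assert level(s1, "Manuscript") == 0
assert level(s1, "Column") == 2
assert level(s1, "Line") == 3
```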
In this figure, we refer to the textual value of the manuscript (identified by a URI), each structure (identified by a URI) and its associated schema. Notice that segmentations are implicit.

3 Querying MSXD instances

Our data model is close to XDM (the XML Data Model) [23] used in XQuery and XPath. Recall that XDM defines unnested sequences of items (nodes or atomic values). For querying, we propose an extension of XQuery to deal with a multistructured document. Indeed, we propose to define an XQuery item as an atomic value or a fragment (instead of a node), and then we still deal with sequences of items. Thus, the XQuery language can be adapted to query a multistructured document. An XQuery on an MSXD instance with a single structure is equivalent to the same XQuery on the XML document corresponding to this structure. In the case of multiple structures, we extend the semantics of the filters of XQuery. To that end, we successively study and extend:

- The accessors defined in the XQuery Data Model (XDM) [23]. Accessors can be seen as a set of primitives to access an instance of the XML data model (parent, child, ...).

- The axes defined in XQuery [7]. Axes are a higher-level access mode to instances of the XML data model. An axis can be an accessor (for instance parent and child) or can be based on accessors (for instance the ancestor axis is defined by the recursive application of the parent accessor).

- The normalization of an extended XQuery into a core XQuery (as defined in the XQuery Formal Semantics [18]). To formally define the semantics of XQuery, a subset of XQuery named the XQuery core has been defined. Every XQuery can be expressed in the core. For instance, an XPath step is translated into a For Let Where Return (FLWR) expression. The dynamic environment existing during the evaluation of an XQuery is explicitly defined by binding a set of standard variables in the XQuery core.
Our objective is to use an XQuery that is as unchanged as possible; that is why we choose to rely on the XQuery normative documents.

3.1 Extending XDM accessors and XQuery axes

To navigate in every structure and to adapt Allen's relations, we choose to slightly modify the semantics of some accessors or operators (close to Allen's relations). In other cases, we define new functions.

Accessors. First, to express containment, we adapt the parent/child relationship: the dm:children and dm:parent accessors have been extended to return every parent and every child of a fragment in every structure it belongs to.

Axes. As a fragment does not belong to every structure, and as we want the axes to be as general as possible, the ancestor and descendant axes cannot be defined by recursively applying the dm:parent and dm:children accessors as is done in XQuery(2). Thus, we define two new accessors, dmmsxd:ancestor and dmmsxd:descendant, which respectively return the sequence of fragments containing a fragment or contained in a fragment (according to start and end positions).

(2) http://www.w3.org/TR/xquery/#id-full-axis-feature

Figure 5: Schemas for the Physical (S1), Syntactic (S2) and Semantic (S3) structures

Figure 6: XML syntax for the multistructured manuscript

Order.
We need to consider Allen's relations, used to compute the relative position between two fragments inside a segmentation or between two segmentations. A partial order on fragments in a given multistructured document can be defined on start and end positions. As in XQuery, the boolean operators n1 << n2, n1 >> n2 and n1 is n2 can be defined. They are respectively true, for fragments n1 and n2, if n1 is before n2, if n1 is after n2, and if n1 and n2 are the same fragment. We use the first two boolean operators (<<, >>) to express the following (after) and preceding (before) Allen's relations.

New functions. We introduce each of the remaining relations as functions (evaluated according to the relative positions of two fragments). We only give here the definition of two Allen's relations (equals and overlaps) and the corresponding operators, because they are used in the query examples given below. The remaining relations can be defined in the same way:

- the function msxd:is-equal(n1, n2) and the operator is-equal are true if n1 and n2 have the same start and end positions in the same multistructured document;

- the function msxd:is-overlapping(n1, n2) and the operator is-overlapping are true if start(n1) < start(n2) and end(n1) > start(n2) and end(n1) < end(n2).

3.2 Extending the dynamic evaluation

In the formal semantics of XPath and XQuery, the dynamic context is explicitly defined by binding variables. In particular, the $fs:sequence, $fs:dot, $fs:position and $fs:last variables respectively represent the sequence of context items, the context item, the context position and the context size. The side effect of each operator is also explicit in the core XQuery. It is necessary to extend the dynamic context to carry information about every structure associated with an MSXD instance.
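The position-based accessors dmmsxd:ancestor and dmmsxd:descendant can be sketched over a pool of fragments drawn from all structures. The dict layout and the sample positions below are our own illustration, not the engine's actual data structures.

```python
# Fragments from any structure, pooled with their global positions.
fragments = [
    {"name": "Line", "start": 20, "end": 40, "structure": "S1"},
    {"name": "Sign", "start": 20, "end": 55, "structure": "S3"},
    {"name": "W",    "start": 24, "end": 27, "structure": "S2"},
]

def descendant(f, pool):
    """dmmsxd:descendant: fragments contained in f, from any structure."""
    return [g for g in pool if g is not f
            and g["start"] >= f["start"] and g["end"] <= f["end"]]

def ancestor(f, pool):
    """dmmsxd:ancestor: fragments containing f, from any structure."""
    return [g for g in pool if g is not f
            and g["start"] <= f["start"] and g["end"] >= f["end"]]

line = fragments[0]
assert [g["name"] for g in descendant(line, fragments)] == ["W"]
assert [g["name"] for g in ancestor(fragments[2], fragments)] == ["Line", "Sign"]
```

Note that the word W has ancestors in two different structures (a Line in S1 and a Sign in S3), which is precisely what the recursive dm:parent-based definition of the ancestor axis cannot express.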
We extend it by binding a new variable, $msxd:selected_structures, to a sequence of strings that represents the set of ids of the structures to be taken into account during the evaluation of the query. In our example, when the document is loaded, by default: $msxd:selected_structures = {"http://lsis.univ-tln.fr/msxd/instance/S1.xml", "http://lsis.univ-tln.fr/msxd/instance/S2.xml", "http://lsis.univ-tln.fr/msxd/instance/S3.xml"}. This variable can be used to restrict the set of structures: if it is set to a subset of the structures, only this subset is taken into account by the XQueries. We need to define two basic functions, to return the existing structures in a document and to create the instance of a document:

- msxd:structures($arg as fragment()*) as xs:string* returns a sequence of strings which represents the ids of the structures to which every fragment of the sequence $arg belongs;

- msxd:doc($uri as xs:string?) as document-node()? retrieves the XML description of an MSXD instance using $uri and returns its document fragment $root (equivalent to fn:doc() in http://www.w3.org/TR/xpath-functions/). This changes the dynamic context by binding $msxd:selected_structures to msxd:structures($root), i.e. by default every structure of a document is considered during a query.

Finally, we slightly modify the normalization of a path expression in XQuery (a step followed by a relative path expression) to return only fragments of the selected structures. The new rule is given in Figure 7. The standard normalization transforms an XPath step into a FLWR expression; it sets the dynamic environment after the evaluation of the first step, and for each item of the result ($fs:sequence) the relative path is evaluated (lines 3, 4 and 5). In our extension, we restrict the results to items from the selected structures (lines 6, 7 and 8). Lines 1 and 2 ensure that each fragment is unique and that every fragment is sorted according to the global document order.
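The effect of the extended normalization (restriction to selected structures, plus the deduplication and document ordering of lines 1 and 2 of the rule) can be sketched as follows. The XQuery general comparison `=` is existential, hence the set intersection; the fragment encoding is hypothetical.

```python
def filter_selected(result, selected_structures, structures_of):
    """Keep only fragments belonging to at least one selected structure,
    deduplicated and sorted in global document order (by start position,
    with containing fragments before contained ones)."""
    kept, seen = [], set()
    for frag in result:
        if structures_of(frag) & selected_structures and id(frag) not in seen:
            seen.add(id(frag))
            kept.append(frag)
    return sorted(kept, key=lambda f: (f["start"], -f["end"]))

frags = [
    {"start": 5, "end": 9,  "sid": "S2"},  # a fragment of a selected structure
    {"start": 0, "end": 19, "sid": "S1"},  # earlier in document order
    {"start": 5, "end": 9,  "sid": "S3"},  # belongs to an unselected structure
]
out = filter_selected(frags, {"S1", "S2"}, lambda f: {f["sid"]})
assert [f["sid"] for f in out] == ["S1", "S2"]
```

Setting the selected set to a single structure id reproduces the behaviour of query Q8: the same path expression then only returns fragments of that structure.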
3.3 Query examples

We propose some queries for our running example:

Prolog - Binding the multistructured document to a global variable:
declare variable $msdoc := msxd:doc("manuscript.msxd");

Q1 - Children of Manuscript:
$msdoc//Manuscript/*
returns every child of the fragment Manuscript, which is shared by every structure (the result is the sequence Page from S1, Syntax from S2 and Prescriptions from S3, see Figure 2).

Q2 - First Sentence of Prescriptions described on one Column:
$msdoc//Column//Prescription//Sentence[1]

[StepExpr / RelativePathExpr]Expr ==
1 fs:apply-ordering-mode (
2 fs:distinct-doc-order-or-atomic-sequence (
3 let $fs:sequence as node()* := [StepExpr]Expr return
4 let $fs:last := fn:count($fs:sequence) return
5 for $fs:dot at $fs:position in $fs:sequence return
6 for $msxd:fragment in [RelativePathExpr]Expr return
7 if ([(msxd:structures($msxd:fragment)) = $msxd:selected_structures]Expr)
8 then $msxd:fragment
9 else () ))

Figure 7: Extended normalization of a Path expression

Q4 - Columns which are Sentences:
for $v in $msdoc//Sentence return $msdoc//Column[. is-equal $v]

Q5 - Sentences containing a Plant:
$msdoc//Sentence[descendant::Plant]

Q6 - First Sentence after a Sign:
$msdoc//Sign/following::Sentence[1]

Q7 - First Sentence after a Sign:
for $h in $msdoc//Sign return $msdoc//Sentence[. >> $h][1]
is the same as Q6 but uses the order operator instead of the following axis.

Q8 - Children of Manuscript in S1:
let $msxd:selected_structures := "http://lsis.univ-tln.fr/msxd/instance/S1.xml"
return $msdoc/Manuscript/*
is the same as Q1 but returns only the children from S1, because the variable $msxd:selected_structures is explicitly set to the identifier of S1.

4 A schema for multistructured documents

We define a schema for multistructured documents as a set of rules (rather than a content model definition), because our structures are weakly coupled and the multistructured document is not hierarchical. Allen's relations (starts, overlaps, equals, ...)
make it possible to constrain the relative positions of fragments belonging to different structures. The constraints are expressed using XPath-based predicates; we suppose that an XPath expression applied to a structure returns a sequence of fragments.

Definition 7. A multistructured document schema is a pair (Gs, C) where Gs is a set of grammars defining valid structures and C = {c_i | c_i = c(p1 in s1, p2 in s2)} is a set of constraints, where c is the name of an Allen's predicate and p1, p2 are XPath expressions applied to the structures s1 and s2. The constraint is true if for each fragment f1 in val(p1), there exists a fragment f2 in val(p2) such that c(f1, f2) is true. A document is valid according to the schema if and only if every constraint in C is true.

Figure 8 (see the comments in the figure) shows an XML syntax for multistructured document schemas and illustrates some constraints between fragments of the three structures of our running example (notice that each constraint is applied to two structures). Every constraint can be read in the same way, for example:
- Rule 1: Root fragments of the physical and syntactic structures are equal. Each fragment matching /Manuscript in every document valid according to the structure (whose alias is) manuscript_physical must be equal to at least one fragment matching /Manuscript in every document valid according to the structure (whose alias is) manuscript_syntactic;
- Rule 3: A page starts with a prescription. Each fragment matching Page in every document valid according to the structure (whose alias is) manuscript_physical starts with at least one fragment matching Prescription in every document valid according to the structure (whose alias is) manuscript_semantic;
- Rule 4: A prescription contains sentences.
Each fragment matching /Prescription in every document valid according to the structure (whose alias is) manuscript_semantic contains at least one fragment matching Sentence in every document valid according to the structure (whose alias is) manuscript_syntactic.

This work is a first step towards validation across multiple hierarchies [20]. It makes it possible to check the conformance of concurrent annotations attached to the same textual document with respect to predefined weak relationships between parts of different structures. Even if the validation is optional, this kind of constraint is useful in the case of distributed annotation, to check the consistency of the structures before querying.

Figure 8: A grammar for our multistructured document

Figure 9: MSXD functional architecture

5 Implementation of the MSXD data model and query language

In this section, we describe the architecture of an implementation of the MSXD data model and of an XQuery engine extended to support multistructured documents. Figure 9 presents the functional architecture of our prototype and a typical use case in four steps: (1) a multistructured document distributed across the network, (2) multistructured document indexing, (3) multistructured document querying, and (4) query evaluation. The figure is organized in two rows and two columns. The first row presents the user level and the second the MSXD engine level. The first column describes the design and indexing of an MSXD instance from a set of XML documents; the second column illustrates the querying.

5.1 User space: an almost usual XML environment

The user space (parts 1 and 3 in Figure 9) is close to a usual XML environment.
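The constraint semantics of Definition 7 above can be sketched as follows. This is our own toy encoding, not the authors' validator: fragments are represented as (start, end) offsets over the shared text, and a constraint c(p1 in s1, p2 in s2) holds iff for every fragment of val(p1) there exists a fragment of val(p2) satisfying the Allen predicate c.

```python
# Hypothetical sketch of Definition 7 (our own encoding, not the authors'
# validator): fragments are (start, end) offsets over the shared text.

def equals(f1, f2):
    # Allen's "equals": both fragments cover the same segment of text.
    return f1 == f2

def started_by(f1, f2):
    # f1 starts with f2: same start position, f2 contained in f1.
    return f1[0] == f2[0] and f2[1] <= f1[1]

def check(pred, val_p1, val_p2):
    # For every f1 in val(p1) there must exist an f2 in val(p2) with pred.
    return all(any(pred(f1, f2) for f2 in val_p2) for f1 in val_p1)

# Rule 3 style check: every Page starts with a Prescription.
pages = [(0, 100), (100, 200)]
prescriptions = [(0, 40), (100, 150), (150, 190)]
rule3_ok = check(started_by, pages, prescriptions)
```

A sequential validator, like the one described for the prototype, would simply evaluate check() once per constraint in C and report the first failure.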
First, each user can describe his or her structures in XML documents which mark up the same text (see Figure 2). Notice that the structures can be distributed over a network. Then the user defines a (virtual) MSXD instance with respect to the MSXD XML syntax (see Figure 6). To do this, the user only needs an XML editor and does not have to change his or her habits. To query an MSXD instance, a user expresses the query using the language defined in Section 3. The indexing is done automatically if needed (see the next section). Recall that the language extends XQuery: if the user queries a single structure, it is standard XQuery. A query result is a sequence of fragments represented in XML.

5.2 MSXD Engine: indexing and querying structures and content

As we have shown in Section 2, an MSXD document is an XML meta-document which refers to one XML document for each structure (all of them sharing the same textual value); our implementation of the data model can be seen as a dynamically built index between them (part 2 in Figure 9). We chose to separate the indexing of the structure from the indexing of the content. The analysis of the first structure enables us to deduce the textual value. The textual value is indexed only once, in a specific component which is in charge of the textual indexing (to answer full-text queries) and of the access to the textual values of fragments (to build XML answers to the queries). When the other structures are analyzed, the consistency of the text is checked, and an alignment by means of spaces, tabulations or end-of-line characters is automatically processed if necessary. For the storage of the textual value and its indexing, we use Apache Lucene3. The analysis of each structure enables the creation of a relational representation of the structures.
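The textual consistency check above can be sketched as follows, a minimal sketch of our own, assuming the comparison works on whitespace-normalized text: two structures are considered textually consistent when their character data are equal up to spaces, tabulations and end-of-line characters.

```python
# Minimal sketch (ours, not the prototype's code): two structures are
# textually consistent when their character data are equal up to spaces,
# tabulations and end-of-line characters.
import re

def normalize(text):
    """Collapse runs of whitespace so that aligned texts compare equal."""
    return re.sub(r"\s+", " ", text).strip()

def consistent(text_a, text_b):
    return normalize(text_a) == normalize(text_b)

same = consistent("one two\nthree", "one  two\tthree ")
```

When the check succeeds, the positions of the second structure's fragments can be re-expressed over the normalized textual value deduced from the first structure.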
We define three main relations to store a given document: (1) the fragments, with their node type (element, attribute, text, ...), their start and end positions in the normalized textual value, and their label; (2) the structures; and (3) the structure contents (id of the structure, id of the fragment, level of the fragment in the structure). Notice that this representation is independent of the textual value: it uses start and end positions (the textual value manager stores the real values) and it computes the level of the fragments in each structure (a fragment can then be shared between structures). In our prototype, we chose to embed a Java RDBMS4, but an external one can also be used. We define an API consisting of the usual DOM [1] accessors extended with operators based on Allen's relations (as defined in Definition 5). We store segmentations as tuples in an RDBMS, so we implemented these operators in SQL. This API provides high-level access to multistructured documents. MSXD instances conform to the DOM API and provide new methods such as access to overlapping fragments. The query language prototype relies on this API. To implement the query engine (part 4 in Figure 9) for this academic prototype, we chose to work step by step and to use standards. First, we translate the user query into the XQuery Core language5. Even if it is not designed to be the foundation of prototypes, we chose it because we need a clean, "simple" language with a well-defined semantics. Then, the XQuery Core query is used to build a query tree, which is optimized before its evaluation. Most of our operators are implemented to work in a pipeline; the XQuery filter operators (operators which give access to the children, ancestors, ... of a given fragment) use the extended DOM API. Notice that the SQL translation remains visible at query time for future optimization (for example, by grouping several SQL queries nested within FLWR operators into a single SQL query).

3 http://lucene.apache.org
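A relational representation along these lines might look as follows. This is a hypothetical sketch (the prototype embeds a Java RDBMS; the table and column names are ours): the three relations just described, and an SQL overlap test expressed on start/stop positions, the kind of query the Allen-based operators compile to.

```python
# Hypothetical sketch (the prototype embeds a Java RDBMS; table and column
# names are ours): the three relations of this section, plus an SQL overlap
# test on start/stop positions.
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE fragment  (id INTEGER PRIMARY KEY, node_type TEXT,
                        start INTEGER, stop INTEGER, label TEXT);
CREATE TABLE structure (id INTEGER PRIMARY KEY, uri TEXT);
CREATE TABLE content   (structure_id INTEGER, fragment_id INTEGER,
                        level INTEGER);
""")
# A Line (physical structure) and a Sentence (syntactic structure) that
# overlap on the shared textual value.
con.executemany("INSERT INTO fragment VALUES (?,?,?,?,?)",
                [(1, "element", 0, 50, "Line"),
                 (2, "element", 40, 90, "Sentence")])
# Two fragments overlap when each one starts before the other one stops.
rows = con.execute("""
    SELECT a.label, b.label FROM fragment a, fragment b
    WHERE a.id < b.id AND a.start < b.stop AND b.start < a.stop
""").fetchall()
```

Because only positions are stored, the same join answers overlap, containment and the other Allen relations by changing the WHERE clause, independently of the textual value itself.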
Figures 10 and 11 show two screenshots: the main window of the application and the query-tree display window. Figure 10 shows a capture of our prototype during the evaluation of query Q2. A user can select (from a set of test cases) or edit an extended XQuery (top right). The automatic translation into XQuery Core is shown bottom right, and the result (either in XML or in the internal format based on start and end positions) is displayed bottom left. The second figure displays the query tree corresponding to the core query, together with dynamic information gathered during the execution (number of items created or filtered by the operators, ...). Finally, we developed an implementation of the multistructured schema validation in which constraints are checked sequentially. In order to obtain a more efficient validation and to provide a more intuitive way to express constraints, we are investigating the use of ontologies. We are currently applying this proposal to a linguistic application of multilevel analysis of multimodal data (OTIM, http://lpl-aix.fr/~otim).

6 Related works

If we look at the XML standards, it seems clear that the standard tree-like model [23] and namespaces [9] could be used to represent multistructured documents if each structure were hierarchical and could be merged with the others. But this is not the case in general. In our example, some elements from the physical and syntactic structures can overlap (Line and Sentence). The problem of overlapping is not recent; see [20] for a review. Several works have studied multistructured documents in the context of XML for document-centric encoding. We classify the main proposals into three categories. The first one concerns the very first works on the representation of several hierarchical structures inside the same text (the CONCUR feature [24] of SGML, TEI6); these solutions are often syntactic.
TEI's solutions require choosing either a flat representation of the multistructured document or a main (hierarchical) structure, and using references (ID/IDREF) for the description of the other structures. The second category is based on proprietary graph models. LMNL7 proposes a new markup language and model designed to overcome the limitations of hierarchical markup in XML and to obtain an instance of a multistructured document. The LMNL graph-based model is not XML-compatible, even if it can import and export XML. Notice that LMNL considers user annotations but offers no solution for querying. MultiX [14] is a proprietary graph-based model; an instance of it can be serialized in a unique XML document. Multistructured querying is achieved by means of a set of XQuery [7] functions which, in particular, deal with overlapping. Based on [14], [29] proposed a methodology for the construction of multistructured documents. This high-level approach aims at defining structures during the construction process. To our knowledge, no other contribution considers a priori the problem of this construction, which leads to restructuring and to the automatic differentiation of structures. We have not yet considered the problem of multistructured document editing. MVDM (Multiview Document Model) [21] is a proprietary graph-based model. It aims at handling multimedia documents and therefore at representing different kinds of relationships between two document entities (and not only hierarchical relations). MVDM focuses on the notion of view, which corresponds to a particular organization of a document. Stored in a document repository, multistructured documents can be queried according to criteria linked to one or several views of that document (automatic generation of SQL queries taking overlapping into account); another solution for navigating the repository is proposed with a multidimensional analysis (OLAP). Finally, Annotation Graphs (AG) [5] come from the linguistic domain.
Annotation graphs propose a proprietary formal model for the representation, over the same flow (audio, text, ...), of several structures (which may not be hierarchical). An XML "flat" representation of an AG is proposed, but no solution for querying. The third category comprises XML-compatible contributions. A very interesting framework is proposed in [19]. It is a new model based on the Goddag data structure [31], which can be seen as a generalization of DOM trees for the representation of multi-hierarchical XML documents. This proposal also defines an extension of XPath [26] to navigate between different structures sharing the same textual data (with specific axes for concurrent querying such as overlapping, xancestor or xdescendant).

Figure 10: Evaluation of XQuery Q2 in our prototype

Figure 11: Tree of the XQuery Core for Q2 in our prototype

| Proposal | Data model | XML compatibility | Main structure | Validation | Operators |
|---|---|---|---|---|---|
| CONCUR | none | SGML syntax | no | no | no |
| TEI (milestones) | none | XML syntax | yes | no | yes |
| Annotation graphs (AG) | proprietary graph | XML syntax for serialisation | no | no | not for querying multistructure, only an XPath linguistic extension |
| LMNL | proprietary graph | specific markup and XML import/export | no | no | no |
| MultiX | proprietary graph | XML import/export | yes | no | specific XQuery functions for querying and structure manipulation |
| Colored trees | extension of the XML data model | XML export | no | no | XPath step extension |
| Based on Goddag | Goddag | DOM extension and XML import/export | no | no | XPath axis extension |
| MSXD | extension of the XML data model | an XML document for each structure | no | weak constraints | XQuery semantics extension and new functions |
| MVDM | proprietary graph | XML syntax per structure | no | no | repository (SQL with overlapping management and multidimensional analysis) |

Figure 12: Proposals related to multistructure

4 http://www.hsqldb.org/
5 http://www.w3.org/TR/xquery-semantics/
6 http://www.tei-c.org/P4X/NH.html
7 http://www.lmnl.net/
Another proposal, colored trees [27], deals with multiple hierarchies in a data-centric context. It aims at sharing atomic data and does not consider overlapping; it is therefore out of our scope. The idea is to build several hierarchies (called colors) over the same set of values (text nodes), so nodes are multicolored. In order to navigate between colors/hierarchies, the authors extend the notion of step in XPath: a step begins with the choice of a color (and thus of a structure) before the usual selection of an axis, a node test and some predicates. Figure 12 summarizes the main features of each proposal related to multistructured documents according to several criteria: data model; XML compatibility; existence of a main structure (whether the user at the logical level or the system at the physical level chooses a main structure, so that the other structures have to be defined with regard to it); validation of a multistructured document (whether it is possible to define a schema for multistructured document validation across multiple hierarchies [20]); and operators (for querying several structures and their content in a concurrent way). No query language has been defined for querying annotation graphs, but there exist operators that complete the XML propositions (XPath in particular [3, 4]); they are related to the linguistic context. Our proposal belongs to the third category: our objective is to remain close to the XML standards. Our model is close to Goddags. Indeed, we want to keep the hierarchical aspect of each structure, so that classical XML tools remain available. But Goddag does not provide mechanisms to add annotations (as LMNL does) and it does not describe relationships between structures to enable validation across multiple hierarchies [20] (none of these proposals, whatever the approach, offers this). We do not detail user annotation, but our proposal supports it (see [11] and [12]).
Annotations represent textual data added by a user to the text in one structure, and thus missing from the other structures. They represent specific information that the user needs to integrate into his or her analysis. For querying several hierarchies over the same textual content in a concurrent way, we chose to extend the semantics of the XQuery filter. Our objective is not to add new axes (as the Goddag-based work does) or to extend the XPath step with colors (as colored trees do). Moreover, even though we support all of Allen's relations, we chose to stay simple and to keep XQuery as unchanged as possible by only adding the necessary functions (unlike MultiX).

7 Conclusion

In this paper, our intention was twofold. First, we defined an XML-compatible model for multistructured textual documents based on the use of hedges (the foundation of RELAX NG). A multistructured textual document is a text over which several concurrent hierarchical structures are defined. Each structure corresponds to an analysis of the text according to a specific use. Secondly, we proposed an extension of XQuery in order to query structures and content in a concurrent way. We applied our proposals to a medieval manuscript written in Occitan. Finally, we described the architecture of an implementation of the MSXD data model and of an XQuery engine extended to support multistructured documents. Our solution is entirely XML-compatible and conforms to the standards. A multistructured textual document is defined as a set of fragments defined over the same textual value and grouped into concurrent hierarchical structures. The key idea is a method for the compact description of multiple trees over a single text, based on segmentation. The segmentation encoding allows querying overlap/containment relations between markups belonging to different structures. We showed how each structure can be described in an XML document. The multistructured textual document itself is never instantiated.
To query a multistructured textual document, we chose to extend the semantics of the XQuery filter. We showed how to take into account equality, overlapping and the other Allen's relations; for that, we added functions and operators to XQuery. We tried to avoid changing the structure of XQuery (as colored trees did, though without considering overlapping). We also did not simply add new axes (as Goddag did with xancestor and xdescendant), although one can notice that such axes make queries easier to read; however, xancestor, xdescendant, or even new axes associated with Allen's relations, can be normalized and rewritten into our proposal. Moreover, we defined a multistructured schema in order to express weak constraints between structures; it is defined as a set of rules in which Allen's relations constrain the relative positions of fragments in the structures. An alternative solution could rely on the use of ontologies, which could offer more flexibility. We are currently applying this proposal to a linguistic application of multilevel analysis of multimodal data (Project OTIM8). Our main perspective is the definition of multiple structures in modalities other than text. For example, it could be useful to define one or several structures associated with the image of a manuscript, as is done for its textual transcription. The objective would then be to manipulate the set of all these structures in a concurrent way. Secondly, we plan to extend our query engine to distribute parts of queries over a P2P network and to enable data sharing.

References

[1] A. Le Hors et al. Document Object Model (DOM) Level 3 Core Specification. Recommendation, W3C, 2004.

[2] J. Allen. Time and time again: The many ways to represent time. International Journal of Intelligent Systems, 6(4):341-355, July 1991.

[3] S. Bird, Y. Chen, S. B. Davidson, H. Lee, and Y. Zheng. Extending XPath to support linguistic queries.
In Proceedings of the ACM Workshop on Programming Language Technologies for XML (PLAN-X), pages 35-46, January 2005.

8 http://lpl-aix.fr/~otim

[4] S. Bird, Y. Chen, S. B. Davidson, H. Lee, and Y. Zheng. Designing and evaluating an XPath dialect for linguistic queries. In Proceedings of the 22nd International Conference on Data Engineering (ICDE'06), April 2006.

[5] S. Bird and M. Liberman. A formal framework for linguistic annotation. Speech Communication, 33(1-2):23-60, September 2001.

[6] P.-V. Biron and A. Malhotra. XML Schema Part 2: Datatypes, second edition. Recommendation, W3C, 2004.

[7] S. Boag. XQuery 1.0: An XML Query Language. Recommendation, W3C, 2007.

[8] M. S. Corradini Bozzi. Etude des textes de matière médico-pharmaceutique en langue d'oc. In Bulletin de l'Association Internationale d'Etudes Occitanes, VIII, pages 29-34, 1990.

[9] T. Bray, D. Hollander, A. Layman, and R. Tobin. Namespaces in XML 1.1, second edition. Recommendation, W3C, 2006.

[10] T. Bray, J. Paoli, and C. M. Sperberg-McQueen. Extensible Markup Language (XML) 1.0. Recommendation, W3C, 1998.

[11] E. Bruno and E. Murisasco. MSXD: a formal model for concurrent structures defined over the same textual data. In Proceedings of the International Conference on Database and Expert Systems Applications (DEXA 2006), pages 172-181. LNCS, 2006.

[12] E. Bruno and E. Murisasco. Describing and querying hierarchical structures defined over the same textual data. In Proceedings of the ACM Symposium on Document Engineering (DocEng 2006), pages 147-154, Amsterdam, The Netherlands, October 2006.

[13] E. Bruno and E. Murisasco. An XML environment for multistructured textual documents. In Proceedings of the Second International Conference on Digital Information Management (ICDIM'07), pages 230-235, Lyon, France, October 2007.

[14] N. Chatti, S. Kaouk, S. Calabretto, and J.-M. Pinon. MultiX: an XML-based formalism to encode multi-structured documents.
In Proceedings of the Extreme Markup Languages Conference, August 6-10, 2007.

[15] J. Clark. XSL Transformations (XSLT) Version 1.0. Recommendation, W3C, 1999.

[16] J. Clark and S. DeRose. XML Path Language (XPath) Version 1.0. Recommendation, W3C, 1999.

[18] D. Draper et al. XQuery 1.0 and XPath 2.0 Formal Semantics. Recommendation, W3C, 2007.

[19] A. Dekhtyar and I.-E. Iacob. A framework for management of concurrent XML markup. Data and Knowledge Engineering, 52(2):185-208, 2005.

[20] S. DeRose. Markup overlap: a review and a horse. In Proceedings of the Extreme Markup Languages Conference, 2004.

[21] K. Djemal, C. Soulé-Dupuy, and C. Vallès-Parlangeau. Modeling and exploitation of multistructured documents. In Proceedings of the IEEE 3rd International Conference on Information and Communication Technologies: From Theory to Applications (ICTTA'08), Damascus, Syria, April 2008.

[22] P. Durusau and M. Brook O'Donnell. Concurrent markup for XML documents. In Proceedings of XML Europe, Atlanta, 2002.

[23] M. Fernandez, A. Malhotra, J. Marsh, M. Nagy, and N. Walsh. XQuery 1.0 and XPath 2.0 Data Model. Recommendation, W3C, 2007.

[24] C. F. Goldfarb and Y. Rubinsky. The SGML Handbook. Clarendon Press, Oxford, 1990.

[25] M. Hilbert, O. Schonefeld, and A. Witt. Making CONCUR work. In Proceedings of the Extreme Markup Languages Conference, August 2005.

[26] I.-E. Iacob and A. Dekhtyar. Towards a query language for multihierarchical XML: Revisiting XPath. In Proceedings of the Eighth International Workshop on the Web and Databases (WebDB'05), pages 43-48, June 2005.

[27] H. V. Jagadish, L. V. S. Lakshmanan, M. Scannapieco, D. Srivastava, and N. Wiwatwattana. Colorful XML: One hierarchy isn't enough. In Proceedings of the International Conference on Management of Data (SIGMOD'04), pages 251-262, 2004.

[28] M. Murata. Hedge automata: a formal model for XML schemata. Web page, 2000.

[29] P.-E. Portier and S. Calabretto. Creation and maintenance of multi-structured documents.
In Proceedings of the ACM Symposium on Document Engineering (DocEng 2009), Munich, Germany, September 2009.

[30] C. M. Sperberg-McQueen and L. Burnard. TEI P4: Guidelines for Electronic Text Encoding and Interchange, 2001.

[31] C. M. Sperberg-McQueen and C. Huitfeldt. GODDAG: A data structure for overlapping hierarchies. In Proceedings of Digital Documents and Electronic Publishing / Principles of Digital Document Processing (DDEP/PODDP'00), pages 139-160, 2000.

[32] J. Tennison and W. Piez. Layered Markup and Annotation Language (LMNL). In Proceedings of the Extreme Markup Languages Conference, 2002.

[33] A. Witt. Multiple hierarchies: new aspects of an old solution. In Proceedings of the Extreme Markup Languages Conference, 2004.

A Sequential Three-Stage Integer Goal Programming (IGP) Model for Faculty-Course-Time-Classroom Assignments

Raed Al-Husain, Mohamad K. Hasan and Hameed Al-Qaheri*
Department of Quantitative Methods and Information Systems, College of Business Administration, Kuwait University, Kuwait
E-mail: raed@cba.edu.kw, mkamal@cba.edu.kw, alqaheri@cba.edu.kw

Keywords: integer goal programming, timetabling, university scheduling problem

Received: March 19, 2010

Developing university schedules that take into account factors such as faculty members' preferences for courses, timeslots, and classrooms, in a timely fashion, while being unbiased and meeting university requirements, is a hurdle in many universities around the world. This paper exploits a three-stage integer goal programming (IGP) technique to solve the university scheduling problem, as an expansion of an earlier two-stage model by the authors. Segmenting the problem into three stages made it possible to reach a complete schedule in a timely manner, with a high level of satisfaction among faculty members, while meeting all university requirements.
The output of every stage is used as the input to the following stage, and goals are satisfied using the priority-sequence approach, according to their order of importance, based on college, department, and major regulations and requirements.

Povzetek: Z novo metodo IGP naredijo univerzitetni urnik. (In Slovenian: a new IGP method is used to build a university timetable.)

1 Introduction

The use of optimization techniques to ensure a more efficient and effective operational workflow has long been a major factor in the success of organizations across industries; the educational sector is no exception. Scheduling problems in universities, such as offering required courses at the same time on the same day, assigning the wrong class size to the wrong classroom, unavoidably biased faculty-course assignments, and the relatively long time needed to complete the schedule, have all been problematic issues associated with manual and judgmental approaches to developing course schedules. This paper exploits a three-stage integer goal programming (IGP) technique to solve the university scheduling problem, as an expansion of an earlier two-stage model by the authors [12]. The three-stage model is developed and solved sequentially: faculty members are assigned to courses, courses are assigned to time slots, and time slots are assigned to classrooms. In our approach, each stage is solved optimally and its outputs are fed as inputs to the following stage. In every stage, university, college, and department regulations are considered as a set of goals to be achieved, along with faculty members' preferences. The model has been tested at the College of Business Administration of Kuwait University using Excel Premium Solver.
The rest of the paper is organized as follows: Section 2 presents a selective review of the literature; Section 3 covers the three-stage integer goal programming (IGP) model formulation; Section 4 covers the experimentation and discusses the results of the three stages, followed by an overall analysis and assessment of the three-stage model in Section 5; conclusions and future research are discussed in Section 6.

2 Review of literature

The idea of developing sophisticated models to solve the university scheduling problem has been around since the early 70s [14] [11]. The techniques used range from optimization models to complex heuristic models. Some models solved only the problem of assigning faculty members to courses [23] [6]. Other models took the time-slot factors into consideration as well [10] [6] [7] [15] [17], and some models took faculty-time-classroom assignment into account [13] [1] [2]. Most of the cited work decomposes the problem into distinct but interrelated stages, rather than solving it as a complex single-stage model. This approach has the advantage of significantly reducing computation time while finding a relatively satisfying solution. Heuristic approaches and decision support systems have also been used to solve the university scheduling problem, in order to overcome the complexities that can arise from optimization techniques. The major reason for using such approaches is to reach a solution relatively close to optimality in a relatively reasonable time [13] [16] [8] [3] [9]. More recently, the use of variable neighbourhood search (VNS) approaches for the university examination timetabling problem has been investigated. The technique has proven to produce high-quality solutions across a wide range of problems, but with a relatively large amount of computational time [18].

* Corresponding author.

Figure 1: Faculty Course Schedule Block Diagram and Information Flow.
Another heuristic approach, used in the College of Engineering at Virginia Tech, is Benders' partitioning. Improved course schedules were obtained, in terms of the total distance travelled by faculty members from their offices to the classrooms where their courses are offered [19]. The genetic algorithm meta-heuristic has been another approach to the university timetabling problem. It considers the timetable in a bottom-up fashion at the various levels of department, faculty or entire university, and is claimed to be the first application of meta-heuristics to a timetabling problem [20]. Hyper-heuristic methods have also been used to solve the university timetabling problem: Burke pursued a novel direction in hyper-heuristics, a unified graph-based hyper-heuristic (GHH) framework, under which a number of local-search-based algorithms are studied that search over sequences of low-level graph colouring heuristics [21]. More complicated approaches have also been used, such as a multi-objective evolutionary algorithm with a variable-length chromosome representation that incorporates a micro-genetic algorithm and a hill-climber for local exploitation, and a goal-based Pareto ranking scheme for assigning the relative strength of solutions [22]. The aim of this paper is to solve the university scheduling problem by extending the work of Badri [6] with a three-stage (sequential) integer goal programming (IGP) model covering different university, college, department and major rules and regulations, as shown in Figure 1. Introducing the goal programming technique into every stage has enabled the fulfilment of many rules and regulations. The use of integer goal programming to develop a comprehensive schedule at the College of Business Administration of Kuwait University has proven to outperform the manual approaches that had previously been adopted.
The results have proven to be both efficient and effective: resources are optimally utilized, faculty members are fairly satisfied, and students progress efficiently and in a timely manner through their graduation requirements. The work in this paper extends an earlier study by the authors [12]. The major difference between this study and those of Badri [6] [7] and Hasan [12] is the inclusion of the classroom availability regulations in the model. Hence, a complete schedule can be developed that takes into account faculty members' preferences for courses and timeslots, as well as the classroom availability regulations of the college.

3 A sequential three-stage integer goal programming (IGP) model formulation

The approach followed to solve the university scheduling problem is to segment the problem into three distinct yet interrelated stages: the faculty-course assignment stage, the course-timeslot assignment stage, and the timeslot-room assignment stage. The inputs of every stage are translated into goals, which are given priorities according to their order of importance and satisfied in that order. The output of every stage, which represents an optimal assignment, is then fed to the next stage as its input. This process continues until the final stage is solved and a complete schedule is created. Figure 1 shows the schematic diagram of the entire modelling process. A detailed description of the three-stage integer goal programming (IGP) model applied to the university scheduling problem is given next; see Hasan et al. [12] for a full description of the two-stage integer goal programming model.
3.1 Stage I: faculty-course assignment integer goal programming (IGP) model formulation

3.1.1 Stage I model notations:
i: faculty member, i = 1, 2, ..., m
j: course number, j = 1, 2, ..., n
L_i: maximum course load of faculty member i
N_j: number of sections offered for course j
X_ij: number of sections of course j that will be assigned to faculty member i
R_ij: the preference of faculty member i to teach course j, where R_ij = {0, 1, 2, 3, 4, 5} such that the value 5 represents a very favourable course, and 0 a course not desired at all.

3.1.2 Stage I model goals and constraints
In this section we formulate the integer goal programming (IGP) model of stage I that represents the rules and regulations of Kuwait University and the College of Business Administration, and the requirements of the Department of Quantitative Methods, for assigning courses to faculty. The stage I IGP model has five goals and one hard constraint, described as follows:

Goal 1: Each faculty member i should take exactly his maximum course load L_i. This goal has priority P_1 and the objective is to minimize both d_1i^- and d_1i^+ ∀i.
Goal 2: All N_j course sections of course j should be covered by faculty members. This goal also has priority P_1 and the objective is to minimize both d_2j^- and d_2j^+ ∀j.
Goal 3: Each faculty member should take at least one College Level Course (CLC) section. This goal is a department-level requirement, is given priority level P_2, and the objective is to minimize d_3i^- ∀i.
Goal 4: Each faculty member should take at least one Major Level Course (MLC) section. This goal represents another department-level requirement, is given priority level P_3, and the objective is to minimize d_4i^- ∀i.
Goal 5: Maximize the total preference of each faculty member i that has course load L_i. This goal has priority P_4 and the objective is to minimize d_5i^- ∀i.
Hard constraint: Finally, we have one hard constraint that does not allow any faculty member i to take more than two sections of the same course j, and that forbids assigning a course the faculty member does not desire at all (R_ij = 0).

Stage I IGP model: Satisfying the goals with their priorities and the other requirements, the stage I IGP model can be written as follows:

Minimize P_1 w [ Σ_{i=1..m} (d_1i^- + d_1i^+) + Σ_{j=1..n} (d_2j^- + d_2j^+) ] + P_2 Σ_{i=1..m} d_3i^- + P_3 Σ_{i=1..m} d_4i^- + P_4 Σ_{i=1..m} d_5i^-

Subject to:
Σ_{j=1..n} X_ij + d_1i^- - d_1i^+ = L_i,    i = 1, 2, ..., m
Σ_{i=1..m} X_ij + d_2j^- - d_2j^+ = N_j,    j = 1, 2, ..., n
Σ_{j∈CLC} X_ij + d_3i^- - d_3i^+ = 1,    i = 1, 2, ..., m
Σ_{j∈MLC} X_ij + d_4i^- - d_4i^+ = 1,    i = 1, 2, ..., m
Σ_{j=1..n} R_ij X_ij + d_5i^- - d_5i^+ = 5 L_i,    i = 1, 2, ..., m
X_ij = 0 if R_ij = 0, and X_ij ≤ 2 otherwise,    i = 1, 2, ..., m and j = 1, 2, ..., n
d_1i^-, d_1i^+, d_2j^-, d_2j^+, d_3i^-, d_3i^+, d_4i^-, d_4i^+, d_5i^-, d_5i^+ ≥ 0    for i = 1, 2, ..., m and j = 1, 2, ..., n

where w is a very big value used to force the deviations d_1i^-, d_1i^+, d_2j^-, d_2j^+ ∀i and ∀j to be zero.
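To make the preemptive-priority mechanics of the stage I model concrete, the following Python sketch solves a deliberately tiny, hypothetical instance (two faculty members, two courses; all data invented for illustration) by brute-force enumeration. The P_1 load and coverage goals are satisfied lexicographically before the P_4 preference goal, standing in for the big-weight w in the objective above; a real instance would be handed to an integer-programming solver rather than enumerated.

```python
from itertools import product

# Toy stage I instance (all numbers hypothetical): m = 2 faculty, n = 2 courses.
L = [2, 2]              # L_i: maximum course load of faculty i
N = [2, 2]              # N_j: number of sections offered for course j
R = [[5, 3],            # R_ij: preference of faculty i for course j (0..5)
     [2, 5]]

best = None
for flat in product(range(3), repeat=4):        # X_ij in {0, 1, 2}
    X = [flat[0:2], flat[2:4]]
    # Hard constraint: X_ij = 0 whenever R_ij = 0 (not binding in this toy data).
    if any(X[i][j] > 0 and R[i][j] == 0 for i in range(2) for j in range(2)):
        continue
    # P1 goals: meet each faculty load exactly, and cover every offered section.
    g_load = sum(abs(sum(X[i]) - L[i]) for i in range(2))
    g_cover = sum(abs(X[0][j] + X[1][j] - N[j]) for j in range(2))
    # P4 goal: under-deviation of each total preference from its target 5 * L_i.
    g_pref = sum(max(0, 5 * L[i] - sum(R[i][j] * X[i][j] for j in range(2)))
                 for i in range(2))
    key = (g_load + g_cover, g_pref)            # preemptive (lexicographic) order
    if best is None or key < best[0]:
        best = (key, X)

print(best)     # each faculty member gets both sections of his preferred course
```

The lexicographic tuple `key` plays the role of the priority levels P_1 and P_4: a solution with a smaller first component always wins, regardless of the preference deviation.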
4.1 Stage II: faculty-course-time assignment integer goal programming model formulation
In this section we formulate the goal programming model that represents the rules and regulations of Kuwait University and the College of Business Administration requirements for assigning time slots to the faculty-course assignments resulting from stage I.

4.1.1 Stage II model notations:
k_ij: section number of course j that is assigned to faculty i, where k_ij = {1, 2, 3, ..., K}
d: days of the week, where d = {1, 2} such that 1 = {Sunday, Tuesday, Thursday} and 2 = {Monday, Wednesday}
p: period of the day, where p = {m, a} such that m = morning time and a = afternoon time
t_dp: time slot number in a day d and period p, where t_dp = {1, 2, ...}
O_tdp: number of rooms available at time slot t_dp
X_{ij,k_ij,t_dp} = 1 if faculty i is assigned to section k_ij of course j during time slot t of day d in period p, and 0 otherwise

4.1.2 Stage II model goals and constraints:
The stage II IGP model has seven goals and three hard constraints, described as follows:

Goal 1: The total number of courses assigned to a specific time slot cannot exceed the number of rooms available in that time slot. This goal has priority P_1 and the objective is to minimize d_1,t_dp^+ ∀t_dp.
Goal 2: This goal eliminates timing conflicts among sections of the same CLC that can be taken at the same time: the total number of sections of the same CLC assigned to a specific morning time slot cannot exceed 2. This goal has priority P_2 and the objective is to minimize d_2j,t_dm^+ ∀j ∈ CLC and t_dm.
Goal 3: The total number of sections of the same CLC assigned to a specific afternoon time slot cannot exceed 1. This goal has priority P_3 and the objective is to minimize d_3j,t_da^+ ∀j ∈ CLC and t_da.
Goal 4: Reduce the gaps between MLC. The MLC should be 4 times more condensed during the morning time than during the afternoon time, where 4 is just any number that the department wishes to choose. This goal has priority P_4 and the objective is to minimize d_4^-.
Goal 5: This is more of a guideline: 60% of courses should preferably be offered during the odd days and 40% during the even days (hence the coefficient 60/40 = 1.5 in the model below). This goal has priority P_5 and the objective is to minimize d_5^-.
Goal 6: This is also a guideline: 70% of courses should preferably be offered during the morning time and 30% during the afternoon time (coefficient 70/30 ≈ 2.3), applied separately to each day group. These goals have priority P_6 and the objective is to minimize d_6^- and d_7^-.
Goal 7: Maximize the faculty preferences on their class times. This goal has priority P_7 and the objective is to minimize d_8i^- ∀i.
Hard constraints:
1. The number of sections taught by a faculty member in any specific time slot must be at most 1.
2. The number of MLC sections offered in any specific time slot must be at most 1.
3. Each section of every course assigned to a faculty member must receive exactly one time slot.

Stage II IGP model: Satisfying the goals with their priorities and the other requirements, the stage II IGP model can be written as follows:

Minimize P_1 Σ_{t_dp} d_1,t_dp^+ + P_2 (Σ_j Σ_{t_dm} d_2j,t_dm^+) + P_3 (Σ_j Σ_{t_da} d_3j,t_da^+) + P_4 d_4^- + P_5 d_5^- + P_6 (d_6^- + d_7^-) + P_7 Σ_i d_8i^-

Subject to:
Σ_i Σ_j Σ_{k_ij} X_{ij,k_ij,t_dp} - d_1,t_dp^+ + d_1,t_dp^- = O_tdp    ∀t_dp
Σ_i Σ_{k_ij} X_{ij,k_ij,t_dm} - d_2j,t_dm^+ + d_2j,t_dm^- = 2    ∀j ∈ CLC and t_dm
Σ_i Σ_{k_ij} X_{ij,k_ij,t_da} - d_3j,t_da^+ + d_3j,t_da^- = 1    ∀j ∈ CLC and t_da
Σ_i Σ_{j∈MLC} Σ_{k_ij} Σ_{t_dm} X_{ij,k_ij,t_dm} - 4 (Σ_i Σ_{j∈MLC} Σ_{k_ij} Σ_{t_da} X_{ij,k_ij,t_da}) - d_4^+ + d_4^- = 0
Σ_i Σ_j Σ_{k_ij} Σ_{t_1p} X_{ij,k_ij,t_1p} - 1.5 (Σ_i Σ_j Σ_{k_ij} Σ_{t_2p} X_{ij,k_ij,t_2p}) - d_5^+ + d_5^- = 0
Σ_i Σ_j Σ_{k_ij} Σ_{t_1m} X_{ij,k_ij,t_1m} - 2.3 (Σ_i Σ_j Σ_{k_ij} Σ_{t_1a} X_{ij,k_ij,t_1a}) - d_6^+ + d_6^- = 0
Σ_i Σ_j Σ_{k_ij} Σ_{t_2m} X_{ij,k_ij,t_2m} - 2.3 (Σ_i Σ_j Σ_{k_ij} Σ_{t_2a} X_{ij,k_ij,t_2a}) - d_7^+ + d_7^- = 0
Σ_j Σ_{k_ij} Σ_{t_dp} R_{ij,k_ij,t_dp} X_{ij,k_ij,t_dp} - d_8i^+ + d_8i^- = M    ∀i, where M is a large number
Σ_j Σ_{k_ij} X_{ij,k_ij,t_dp} ≤ 1    ∀i, ∀t_dp
Σ_i Σ_{j∈MLC} X_{ij,k_ij,t_dp} ≤ 1    ∀t_dp
Σ_{t_dp} X_{ij,k_ij,t_dp} = 1    ∀i, j, k_ij
all X_{ij,k_ij,t_dp} are binary, and all d's ≥ 0

4.2 Stage III: room assignment integer goal programming model formulation

4.2.1 Stage III model notations:
L: floor level, where L = {1, 2, 3}
C: room size category, where C = {1, 2, 3, 4}
r_LC: room number on floor level L and of size category C, where r_LC = {101, ..., 112, 201, ..., 214, 301, ..., 318}
D_L: department number based on the floor level location, where D_L = {1, 2, 3}
p_LC: room level location preference, where p_LC = {2, 4, 5} such that 2 is the least preferred and 5 the most preferred room level location for a course to be placed.

4.2.2 Stage III model goals and constraints:
The stage III IGP model has one goal and two hard constraints, described as follows:

Goal 1: The primary goal of the stage III model is to locate each previously assigned course in a room of the right size, as close as possible to the department that is offering the course. This is accomplished by achieving a high enough sum-product of the decision variable X_{ijc,k_ijc,t_dp,r_LC} (the assignment of faculty i to section k_ijc of course j_c of size category C in time period t_dp to room location r_LC) with the location preference p_LC, where M is a large number. The objective is to minimize the under-deviation d^- of this sum-product.
Hard constraints:
1. Each section of a course that has been assigned a specific faculty and time should be located in one room only.
2. Each room is assigned to at most one faculty in a specific time period.

Stage III IGP model:
Minimize d^-
Subject to:
Σ_i Σ_j Σ_k Σ_t Σ_{r_LC} p_LC X_{ijc,k_ijc,t_dp,r_LC} + d^- - d^+ = M
Σ_L Σ_C Σ_{r_LC} X_{ijc,k_ijc,t_dp,r_LC} = 1    ∀t_dp, i, k_ijc
Σ_i X_{ijc,k_ijc,t_dp,r_LC} ≤ 1    ∀t_dp, L, C, r_LC
all X_{ijc,k_ijc,t_dp,r_LC} are binary, and all d's ≥ 0

5 Experimentation
The model was initially applied to generate the schedule for 4 different majors representing 2 different departments at the College of Business Administration in Kuwait University for the fall 2009 semester: the Marketing major (MRKT) at the Department of Management and Marketing, and the majors of Information Systems (IS), Logistics and Supply Chain Management (LSCM) and Statistics (STAT) at the Department of Quantitative Methods and Information Systems (QMIS).

5.1 Data collection
Each of the above-mentioned majors had to fill in the model inputs for every stage sequentially until the final schedule was completed.
Model inputs include faculty members, the number of courses and their sections to be offered, faculty-course preferences, the required teaching load of every faculty member, course-timeslot preferences, and the university, college, and department rules and regulations of assignment. Examples of these regulations include the ratio of courses to be offered in the morning sessions versus the afternoon sessions, the ratio of courses to be offered during Day 1 (Sunday, Tuesday, and Thursday) versus Day 2 (Monday and Wednesday), and the amount of dispersion of major level classes. For further discussion of the stage I and stage II models, please refer to [12]. The same procedure has been followed in stage III, the timeslot-room assignment model. Inputs of this model include room information, i.e. the number of rooms, their size category, and their floor location. Each room was given a size category, Room Category (RC), based on its capacity, as shown in Table 1. This distinction ensures that rooms are assigned only to courses of the right Expected Course Category (ECC) size. Moreover, departments are located on 3 different levels, Room Level (RL), at the College of Business Administration in Kuwait University; hence rooms were distinguished based on their floor level in order to assign them as close as possible to the department that is offering the course. Table 2 shows the room characterization, where RN is the Room Number.

Table 1: Room Size Category.
Category  Capacity
1         25
2         30
3         40-44
4         65

Table 2: Room Characterization.
RL  RC  RN
1   2   105-106, 111-112
1   3   101-104, 107-110
2   2   205-207, 212-214
2   3   201-204, 208-211
3   1   312-314
3   2   305-307
3   3   301-304, 308-311
3   4   315-318

5.2 Stage I results
The output of this stage, the faculty-course assignment stage, represents an optimal assignment of faculty members to courses and their sections according to the imposed rules and regulations. Thirteen different scenarios were tested to ensure the effectiveness of the model.
These scenarios take into account three different cases that could arise when developing a schedule: the faculties' loads = the total course sections offered, the faculties' loads > the total course sections offered, and the faculties' loads < the total course sections offered. In the first case, most goals were 100% met except for the last goal, the faculty-course preferences goal. The satisfaction level of this goal, i.e. faculty getting their first choice of courses, ranged from 73.3% to 85.2%. However, when it came to second-choice preferences, all faculties were 100% satisfied. In the second and third cases, where the load of the faculties available does not equal the amount of courses offered, the satisfaction of the goals ranged from 54.9% to 100%, depending on the amount of variation between the faculties' load available and the amount of courses offered. For further discussion of stage I results, please refer to [12].

5.3 Stage II results
The output of this stage, the course-timeslot assignment stage, represents an optimal assignment of the courses that were already assigned to different faculty members to timeslots, according to the imposed rules and regulations. Most goals were met up to 100% with the exception of goal 4, the dispersion of MLC in the morning versus the afternoon timeslots. The model was able to condense the MLC during the morning timeslots as desired, yet there was an underachievement of this goal by 42%. Moreover, about 90% of the faculties got their first choice of preferences when it came to their desired timeslot in the schedule. Combining the faculty satisfaction levels of the two stages, 73.6% of the faculties were able to get their first choices of preferences, and 100% of the faculties were able to get at least their second choice of preferences. For further discussion of the stage II model, please refer to [12].
5.4 Stage III results
Upon the completion of stages I and II of the model, an optimal assignment of faculties to courses, and then of those courses to different timeslots, is obtained. The result is then used as an input to the stage III model, the timeslot-room assignment. Based on the formulation of the stage III model, a complete schedule was obtained. Table 3 shows part of the generated schedule. All timeslots were successfully assigned to different room locations, the right course sizes were assigned to the right room sizes, and courses were distributed in the college to the desired floors based on the departments offering these courses.

Table 3: Room Assignment Distribution (columns: Dep. I.D., Major I.D., Dep. Location, Course Level, Time, Faculty I.D., Course I.D., Section No., ECC, RL, RC, RN).

6 Overall analysis and assessment of the three-stage model
Breaking the university scheduling problem into three stages has greatly improved the solution process and computation time of such a complex problem. Once all the required input data of every stage were available, the computation time for each stage was a few seconds using the Excel Premium Solver.
Moreover, although the output of every stage represents a local optimum of the overall problem, considering the satisfaction level of assigning faculties to courses and courses to different timeslots, the efficient and effective allocation of timeslots to the right rooms, and the computation time of solving the entire problem, the decomposition of the scheduling problem is considered an advantage rather than a disadvantage. On the other hand, solving the entire scheduling problem in one complex model might result in an infeasible solution when a global optimum is desired.

7 Conclusion and future research
Developing effective, unbiased, and timely schedules has long been an issue in universities around the world. The utilization of optimization techniques, however, has proven capable of overcoming such a complex problem. Although different approaches have been used to resolve this problem and reach an "optimal" schedule, the consideration of factors such as faculties' preferences for different courses and timeslots, an efficient room assignment, and university rules and regulations of assignment all at once has been a hindrance. Moreover, computation time has always been a problem when all of the above factors were considered in one complex model. This paper utilizes the integer goal programming (IGP) technique and the idea of breaking (decomposing) the problem into smaller subproblems, i.e. different stages, in order to simplify the formulation and swiftly reach a satisfying solution to the overall scheduling problem. The method used in satisfying goals is the priority sequence approach, where goals are satisfied according to their order of importance based on university, college, and department regulations and requirements. The output of every stage has been used as an input to the subsequent stage until a complete schedule is developed.
After the successful results of the first two stages of the model were verified in an earlier study conducted by the authors, a new stage, the timeslot-room assignment stage, was added to the previous model and contributed to the development of a complete schedule that takes into account all the different factors desired when developing a schedule. The overall model has been tested at the College of Business Administration in Kuwait University using 4 different majors in 2 different departments. Results showed that the faculties' satisfaction level reached up to 85.2% in stage I and 88.8% in stage II of the model, as shown in an earlier study. The overall satisfaction level when combining the two results reached up to 73.6% as far as faculties getting their first choices of preferences. Nonetheless, the faculties' satisfaction level reached up to 100% when it came to getting at least their second choices of preferences. The room assignment stage successfully used the results obtained in the previous stages and efficiently distributed courses with assigned timeslots to the desired room locations. Work is underway to eventually integrate the three-stage model of this paper with a Decision Support System (DSS), such as Cornell's SchedulExpert, in order to build an ultimate scheduling tool that will enable users to develop quick and effective schedules that are demand driven by the students, through the new development of a student planner DSS, rather than supply driven by the college. The integration of the university scheduling DSS and the student planner DSS into a unique integrated DSS will be a great tool that will efficiently and effectively enhance the whole Kuwait University registration system.

References
[1] Al-Yakoob, S. M. and Sherali, H. D. (2006). Mathematical programming models and algorithms for a class-faculty assignment problem. European Journal of Operational Research 173, 488-507.
[2] Al-Yakoob, S. M. and Sherali, H. D. (2007).
A mixed-integer programming approach to a class timetabling problem: A case study with gender policies and traffic considerations. European Journal of Operational Research 180, 1028-1044.
[3] Aladag, C., Hocaoglu, G. and Basaran, M. (2009). The effect of neighborhood structures on tabu search algorithm in solving course timetabling problem. Expert Systems with Applications, in press.
[4] Anderson, D., Sweeney, D., Williams, T. (2008). An Introduction to Management Science: Quantitative Approaches to Decision Making, 11e.
[5] Badri, M. A. (1996). A Linear Goal Programming Model for Solving the Optimal Faculty Assignment Problem. King Saud University Journal, Vol. 8, Business Administration 2, 537-566, Riyadh.
[6] Badri, M. A. (1996). A two-stage multi-objective scheduling model for [faculty-course-time] assignments. European Journal of Operational Research 94 (1), 16-28.
[7] Badri, M. A., Davis, D. L., Davis, D. F., Hollingsworth, J. (1998). A multi-objective course scheduling model: Combining faculty preferences for courses and times. Computers and Operations Research 25 (4), 303-316.
[8] Broek, J., Hurkens, C., and Woeginger, G. (2009). Timetabling problems at the TU Eindhoven. European Journal of Operational Research 196, 877-885.
[9] De Causmaecker, P., Demeester, P., and Berghe, G. (2009). A decomposed metaheuristic approach for a real-world university timetabling problem. European Journal of Operational Research 195, 307-318.
[10] Dinkel, J. and Mote, J. (1989). An Efficient Decision Support System For Academic Course Scheduling. Operations Research 37 (6), 853-864.
[11] Harwood, G. and Lawless, R. (1975). Optimizing organizational goals in assigning faculty teaching schedules. Decision Science 6, 513-524.
[12] Hasan, M., Al-Husain, R., Al-Qaheri, H. (2009). Course Assignment for University Faculties. Arab Journal of Administrative Sciences, Vol. 17, No. 1, pp. 169-191.
[13] Hinkin, T. R. and Thompson, G. M. (2002).
SchedulExpert: Scheduling Courses in the Cornell University School of Hotel Administration. Interfaces 32 (6), 45-57.
[14] Lee, S. and Schniederjans, M. (1983). Multicriteria Assignment Problem: A goal programming approach. Interfaces 13, 75-81.
[15] Mirrazavi, S. K., Mardle, S. J. and Tamiz, M. (2003). A two-phase multiple objective approach to university timetabling utilizing optimization and evolutionary solution methodologies. Journal of the Operational Research Society 54, 1155-1166.
[16] Pongcharoen, P., Promtet, W., Yenradee, P., and Hicks, C. (2008). Stochastic Optimization Timetabling Tool for University Course Scheduling. International Journal of Production Economics 112, 903-918.
[17] Valouxis, C. and Housos, E. (2003). Constraint programming approach for school timetabling. Computers & Operations Research 30, 1555-1572.
[18] Burke, E. K., Eckersley, A. J., McCollum, B., Petrovic, S., Qu, R. (2010). Hybrid variable neighbourhood approaches to university exam timetabling. European Journal of Operational Research, Vol. 206, No. 1, pp. 46-53.
[19] Sarin, S. C., Wang, Y., Varadarajan, A. (2010). A university-timetabling problem and its solution using Benders' partitioning -- a case study. Journal of Scheduling, Vol. 13, No. 2, pp. 131-141.
[20] Adewumi, A. O., Sawyerr, B. A., Ali, M. M. (2009). A heuristic solution to the university timetabling problem. Engineering Computations, Vol. 26, No. 8, pp. 972-984.
[21] Qu, R., Burke, E. K. (2009). Hybridizations within a graph-based hyper-heuristic framework for university timetabling problems. The Journal of the Operational Research Society, Vol. 60, No. 9, pp. 1273-1285.
[22] Cheong, C. Y., Tan, K. C., Veeravalli, B. (2009). A multi-objective evolutionary algorithm for examination timetabling. Journal of Scheduling, Vol. 12, No. 2, pp. 121-146.
[23] Schniederjans, M. and Kim, G. (1987). A Goal Programming Model to Optimize Departmental Preference in Course Assignments.
Computers and Operations Research 14 (2), 87-96.

Performance Comparison Study of Multicast Routing Protocols for Mobile Ad hoc Networks under Default Flooding and Density and Mobility Aware Energy-Efficient (DMEF) Broadcast Strategies

Natarajan Meghanathan
Department of Computer Science, Jackson State University, Jackson, MS 39217, USA
E-mail: natarajan.meghanathan@jsums.edu

Keywords: mobile ad hoc networks, broadcast, energy-efficiency, multicast routing, stability, simulation

Received: June 10, 2010

Recently, we proposed a novel network density and mobility aware energy-efficient broadcast route discovery strategy (called DMEF) to discover stable routes in mobile ad hoc networks (MANETs). DMEF works by letting each node dynamically choose its own broadcast transmission range for the route discovery messages, depending on the perceived number of neighbour nodes within its default maximum transmission range and the node's own mobility at the time of route discovery. A node surrounded by more neighbours makes itself available to a smaller neighbourhood, and vice versa. Similarly, a slow-moving node broadcasts the route discovery message to a majority of its neighbours so that links formed using this node can be more stable. A fast-moving node advertises itself only to the neighbours closer to it. The effectiveness of DMEF has so far been tested only for MANET unicast and multi-path routing protocols. In this paper, we study the impact of DMEF on the performance of MANET multicast routing protocols. We investigate the minimum-hop based Multicast Ad hoc On-demand Distance Vector (MAODV) routing protocol, the minimum-link based Bandwidth-Efficient Multicast Routing Protocol (BEMRP) and our recently proposed non-receiver aware and receiver aware multicast extensions to the Location Prediction Based Routing protocol (NR-MLPBR and R-MLPBR).
Exhaustive simulation studies of these multicast routing protocols with DMEF and the default flooding as the route discovery strategies have been conducted. Performance results for each multicast routing protocol illustrate that DMEF is effective in discovering multicast trees that exist for a longer time, with a lower energy consumed per node and without any appreciable increase in the hop count per source-receiver path. Povzetek: Predstavljeno je testiranje nove metode DMEF za iskanje stabilnih povezav v mobilnih omrežjih.

1 Introduction
A mobile ad hoc network (MANET) is a dynamic distributed system of mobile, autonomous wireless nodes. The network has limited bandwidth and the nodes have limited battery charge. In order to conserve battery charge, each node has a limited transmission range (i.e., it transmits data signals only over a limited distance). As a result, MANET routes are typically multi-hop in nature. As nodes move independently of each other, routes between a source and a destination node often break and new routes have to be discovered. MANET routing protocols are of two types: proactive protocols require the nodes to periodically exchange table updates to pre-determine routes between any pair of source-destination nodes, while reactive protocols determine routes only when a route is required from a source to a destination. In dynamically changing environments, typical of MANETs, reactive on-demand routing protocols incur lower control overhead to discover routes compared to proactive routing protocols [5]. In this paper, we work only with reactive routing protocols. Flooding is the default route discovery approach for on-demand MANET routing protocols [14]. The flooding algorithm to discover routes can be briefly explained as follows: whenever a source node needs a route to a destination node, it broadcasts a Route Request (RREQ) message to its neighbours.
Neighbour nodes of the source node broadcast the received RREQ further, if they have not already done so. A RREQ message for a particular route discovery process is forwarded by a node exactly once. The destination node receives the RREQs along several routes, selects the best route according to the route selection principles of the particular routing protocol, and notifies the source of the selected route through a Route-Reply (RREP) packet. The source then starts sending data packets on the discovered route. Flooding is inefficient and consumes significantly high energy and bandwidth. When a node receives a message for the first time in its neighbourhood, at least 39% of the neighbourhood would have seen it already, and on average only 41% of additional area can be covered with a rebroadcast [15]. In an earlier work [11], we proposed a novel density and mobility aware energy-efficient broadcast strategy, referred to as DMEF, to reduce the energy consumption of broadcast route discoveries by letting a node broadcast only within a limited neighbourhood. The neighbourhood size to which a node advertises itself as part of the route discovery process is decided independently at the node, based on the number of neighbours surrounding the node and the mobility of the node. The neighbourhood size for rebroadcast is reduced in such a way that the RREQ packets still make it to the destination through one or more paths with a reduced energy spent per route discovery, and such paths are also more stable compared to those discovered using flooding. The effectiveness of DMEF has so far been studied only for MANET unicast [11] and multi-path routing protocols [12]. In this paper, we study the impact of DMEF on the performance of MANET multicast routing protocols. Multicasting is the process of sending a stream of data from one source node to multiple recipients by establishing a routing tree, which is an acyclic connected subgraph of the entire network.
The set of receiver nodes forms the multicast group. While propagating down the tree, data is duplicated only when necessary. This is better than multiple unicast transmissions. Multicasting in ad hoc wireless networks has numerous applications [21]: collaborative and distributed computing, civilian operations, emergency search and rescue, law enforcement, warfare situations, etc. We investigate the minimum-hop based Multicast Ad hoc On-demand Distance Vector (MAODV) routing protocol [18], the minimum-link based Bandwidth-Efficient Multicast Routing Protocol (BEMRP) [16] and our recently proposed non-receiver aware and receiver aware multicast extensions to the Location Prediction Based Routing protocol [9], referred to as the NR-MLPBR and R-MLPBR protocols [10], respectively. Exhaustive simulation studies of these multicast routing protocols with DMEF and the default flooding as the route discovery strategies have been conducted in this paper. Performance results for each multicast routing protocol illustrate that DMEF is effective in discovering multicast trees that exist for a longer time, with a lower energy consumed per node and without any appreciable increase in the hop count per source-receiver path. The rest of the paper is organized as follows: Section 2 briefly describes the DMEF strategy. Section 3 reviews the multicast routing protocols studied. Section 4 discusses the simulation environment and presents the simulation results illustrating the effectiveness of DMEF vis-à-vis flooding. Section 5 reviews state-of-the-art related work on different optimal broadcast route discovery strategies proposed in the literature and discusses the advantages of DMEF and the differences with related work. Section 6 concludes the paper and discusses future work. Throughout this paper, the terms 'path' and 'route', 'link' and 'edge', and 'message' and 'packet' are used interchangeably. They mean the same.
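As a minimal illustration of the flooding-based route discovery described in the introduction, the sketch below simulates RREQ propagation over a small hypothetical topology. It simplifies by letting every node, including the destination, rebroadcast exactly once, and by having the destination accept the first RREQ copy, which yields a minimum-hop route.

```python
from collections import deque

def flood_route_discovery(adj, source, dest):
    """Simulate flooding: every node forwards the RREQ exactly once; the first
    copy to reach a node defines its upstream link, and the destination's
    RREP retraces that reverse path back to the source."""
    parent = {source: None}
    queue = deque([source])
    rebroadcasts = 0
    while queue:
        u = queue.popleft()
        rebroadcasts += 1              # node u broadcasts the RREQ once
        for v in adj[u]:
            if v not in parent:        # duplicate RREQ copies are dropped
                parent[v] = u
                queue.append(v)
    route, n = [], dest
    while n is not None:               # walk the reverse path for the RREP
        route.append(n)
        n = parent[n]
    return route[::-1], rebroadcasts

# Hypothetical 5-node topology: node 0 is the source, node 4 the destination.
adj = {0: [1, 2], 1: [0, 3], 2: [0, 3], 3: [1, 2, 4], 4: [3]}
route, cost = flood_route_discovery(adj, 0, 4)
print(route, cost)    # minimum-hop route [0, 1, 3, 4]; all 5 nodes rebroadcast
```

Note that every node in the connected component rebroadcasts once: it is exactly this blanket rebroadcast cost that DMEF, described next, tries to reduce.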
2 DMEF strategy

2.1 Terminology and assumptions
Every node (say node u) in the network is configured with a maximum transmission range (Range_Max). If the distance between two nodes is less than or equal to the maximum transmission range, the two nodes are said to be within the "complete neighbourhood" of each other. Each node periodically broadcasts a beacon message in its complete neighbourhood. The time between two successive broadcasts is chosen uniform-randomly by each node from the range [0...T_wait]. Using this strategy, each node learns the number of nodes in its complete neighbourhood.

2.2 Basic idea of DMEF
The twin objectives of DMEF are to discover stable routes with a reduced energy consumption compared to that incurred using flooding. DMEF achieves this by considering the number of neighbours of a node (a measure of node density) and node mobility. The basic idea behind DMEF is as follows: the transmission range of a RREQ broadcast for route discovery is not fixed for every node. A node surrounded by more neighbours in its complete neighbourhood should broadcast the RREQ message only within a smaller neighbourhood that is sufficient for the message to be picked up and forwarded to the other nodes in the rest of the network. On the other hand, a node surrounded by fewer neighbours in its complete neighbourhood should broadcast the RREQ message to a larger neighbourhood (but still contained within the complete neighbourhood) so that a majority of the nodes in the complete neighbourhood can pick up the message and rebroadcast it further. A node rebroadcasts a RREQ message at most once. The density aspect of DMEF thus helps to reduce the unnecessary transmission and reception of broadcast RREQ messages and conserves energy.
To discover stable routes that exist for a longer time, DMEF adopts the following approach: a node that is highly mobile makes itself available only to a smaller neighbourhood around itself, whereas a node that is less mobile makes itself available over a larger neighbourhood (but still contained within the complete neighbourhood). The reasoning is that links involving a slow-moving node will exist for a longer time. Hence, it is better for a slow-moving node to advertise itself to a larger neighbourhood, so that the links (involving this node) that become part of the routes discovered will exist for a longer time. On the other hand, a fast-moving node will have links of relatively longer lifetime only with neighbours that are closer to it. Hence, it is worthwhile to let a fast-moving node advertise only to its nearby neighbours.

2.3 DMEF mathematical model

DMEF effectively uses the knowledge of neighbourhood node density and mobility so that they complement each other in discovering stable routes in a more energy-efficient fashion. The transmission range, Range_RREQ(u), used by a node u to rebroadcast a RREQ message is given by the following model:

    Range_RREQ(u) = Range_Max - (|Neighbors_u| * v_u^β) / α        (1)

The idea behind the formulation of equation (1) is that the larger the value of the term (|Neighbors_u| * v_u^β) / α, the lower the transmission range chosen by the node for broadcasting the RREQ message. For fixed values of the parameters α and β, this term becomes larger for a node that has a larger number of neighbours and/or is moving with a larger velocity. In order to ensure that Range_RREQ(u) is always greater than or equal to zero, the value of the parameter α has to be chosen carefully.
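As a quick numerical illustration of the model, the sketch below (a hypothetical helper of ours, assuming the range expression Range_Max - (|Neighbours| * v^β) / α with α large enough to keep the result non-negative) shows that, for the same α and β, a denser and faster node ends up with a smaller RREQ broadcast range:

```python
def rreq_range(num_neighbours, velocity, alpha, beta, range_max=250.0):
    """Equation (1): RangeRREQ = RangeMax - (|Neighbours| * velocity**beta) / alpha.
    The clamp to zero should never trigger if alpha is chosen as required."""
    return max(range_max - (num_neighbours * velocity ** beta) / alpha, 0.0)

# Same alpha and beta: the denser, faster node gets the smaller range.
sparse_slow = rreq_range(5, 5.0, 10, 1.3)    # ~246 m
dense_fast = rreq_range(15, 15.0, 10, 1.3)   # ~199 m
print(dense_fast < sparse_slow)  # True
```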
For a given value of the parameter β, the necessary condition is:

    α ≥ (|Neighbors_u| * v_u^β) / Range_Max        (2)

In practice, the value of α has to be sufficiently larger than the value obtained from (2), so that the RREQ message reaches neighbours who can forward the message further to the rest of the network. Otherwise, certain source-destination pairs may not be reachable from one another even though one or more paths may exist between them in the underlying network.

2.4 Dynamic selection of DMEF parameter values

The specialty of DMEF is that it allows each node to dynamically and independently choose at run-time the appropriate values of the critical operating parameters α and β, depending on the perceived number of nodes in the complete neighbourhood of the node and the node's own velocity. A node simply has to be pre-programmed with the values of α and β to be chosen for different numbers of nodes in the complete neighbourhood and different node velocities. Let maxNeighb_lowDensity and maxNeighb_modDensity represent the maximum number of neighbours a node should have in order to conclude that the complete neighbourhood density of the node is low and moderate respectively. If a node has more than maxNeighb_modDensity neighbours, the node is said to exist in a complete neighbourhood of high density. Let lowDensity_α, modDensity_α and highDensity_α represent the values of α to be chosen by a node for complete neighbourhoods of low, moderate and high density respectively. Let maxVel_lowMobility and maxVel_modMobility represent the maximum velocity values for a node in order to conclude that the mobility of the node is low and moderate respectively. If the velocity of a node is more than maxVel_modMobility, the mobility of the node is said to be high. Let lowMobility_β, modMobility_β and highMobility_β represent the values of β to be chosen by a node when its mobility is low, moderate and high respectively.
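The parameter-selection rules just described can be sketched as follows. This is a hypothetical Python rendering of ours, not the paper's code; the pre-programmed thresholds and values are taken from Table 1 in Section 4.

```python
# Pre-programmed values from Table 1 (Section 4).
MAX_NEIGHB_LOW_DENSITY, MAX_NEIGHB_MOD_DENSITY = 5, 10
LOW_DENSITY_A, MOD_DENSITY_A, HIGH_DENSITY_A = 5, 10, 20
MAX_VEL_LOW_MOBILITY, MAX_VEL_MOD_MOBILITY = 5, 15
LOW_MOBILITY_B, MOD_MOBILITY_B, HIGH_MOBILITY_B = 1.6, 1.3, 1.1
RANGE_MAX = 250.0

def dmef_parameter_selection(num_neighbours, velocity):
    """Dynamically pick (alpha, beta) for a node from its perceived
    neighbourhood density and its own velocity."""
    # beta from the node's mobility class.
    if velocity <= MAX_VEL_LOW_MOBILITY:
        beta = LOW_MOBILITY_B
    elif velocity <= MAX_VEL_MOD_MOBILITY:
        beta = MOD_MOBILITY_B
    else:
        beta = HIGH_MOBILITY_B
    # Smallest alpha that keeps the RREQ range of equation (1) non-negative.
    minimum_alpha = (num_neighbours * velocity ** beta) / RANGE_MAX
    # alpha from the node's density class, never below minimum_alpha.
    if num_neighbours <= MAX_NEIGHB_LOW_DENSITY:
        alpha = max(minimum_alpha, LOW_DENSITY_A)
    elif num_neighbours <= MAX_NEIGHB_MOD_DENSITY:
        alpha = max(minimum_alpha, MOD_DENSITY_A)
    else:
        alpha = max(minimum_alpha, HIGH_DENSITY_A)
    return alpha, beta

print(dmef_parameter_selection(15, 20.0))  # (20, 1.1): high density, high mobility
print(dmef_parameter_selection(4, 3.0))    # (5, 1.6): low density, low mobility
```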
Let Neighbors_u^t and v_u^t represent the set of neighbours in the complete neighbourhood and the velocity of a node u at time t. Note that the set Neighbors_u^t is determined by node u based on the latest periodic beacon exchange in the complete neighbourhood formed by the maximum transmission range, Range_Max. The algorithm, DMEF_Parameter_Selection, to dynamically choose the values of the parameters α and β (represented as α_u^t and β_u^t) is illustrated in Figure 1:

Input: Neighbors_u^t and v_u^t
Auxiliary variables:
    minimum_α   // minimum value of α needed to keep the transmission range of a node from becoming negative
    Range_Max   // the maximum transmission range of a node for the complete neighbourhood
Density-related variables: maxNeighb_lowDensity, maxNeighb_modDensity, lowDensity_α, modDensity_α, highDensity_α
Node-velocity-related variables: maxVel_lowMobility, maxVel_modMobility, lowMobility_β, modMobility_β, highMobility_β
Output: α_u^t and β_u^t

Begin DMEF_Parameter_Selection
    if (v_u^t ≤ maxVel_lowMobility)
        β_u^t ← lowMobility_β
    else if (v_u^t ≤ maxVel_modMobility)
        β_u^t ← modMobility_β
    else
        β_u^t ← highMobility_β
    minimum_α ← (|Neighbors_u^t| * (v_u^t)^(β_u^t)) / Range_Max
    if (|Neighbors_u^t| ≤ maxNeighb_lowDensity)
        α_u^t ← Maximum(minimum_α, lowDensity_α)
    else if (|Neighbors_u^t| ≤ maxNeighb_modDensity)
        α_u^t ← Maximum(minimum_α, modDensity_α)
    else
        α_u^t ← Maximum(minimum_α, highDensity_α)
    return α_u^t and β_u^t
End DMEF_Parameter_Selection

Figure 1: Algorithm to Dynamically Select the Parameter Values for DMEF.

3 Review of MANET multicast routing protocols

In this section, we discuss the working of the MANET multicast routing protocols (MAODV, BEMRP, NR-MLPBR and R-MLPBR) whose performance under DMEF and default flooding is studied through simulations in this paper. We also provide a brief overview of LPBR before discussing its two multicast extensions.
3.1 Multicast extension of the ad hoc on-demand distance vector (MAODV) routing protocol

MAODV [18] is the multicast extension of the well-known Ad hoc On-demand Distance Vector (AODV) unicast routing protocol [17]. Here, a receiver node joins the multicast tree through a member node that lies on the minimum-hop path to the source. A potential receiver wishing to join the multicast group broadcasts a RREQ message. If a node receives the RREQ message and is not part of the multicast tree, the node broadcasts the message in its neighbourhood and also establishes the reverse path by storing the state information, consisting of the group address, requesting node id and sender node id, in a temporary cache. If a node receiving the RREQ message is a member of the multicast tree and has not seen the RREQ message earlier, the node waits to receive several copies of the RREQ message and sends back a RREP message on the shortest path to the receiver. The member node also indicates in the RREP message the number of hops from itself to the source. The potential receiver receives several RREP messages and selects the member node that lies on the shortest path to the source. The receiver node then sends a Multicast Activation (MACT) message to the selected member node along the chosen route. The route from the source to the receiver is set up when the member node and all the intermediate nodes in the chosen path update their multicast tables with the state information from the temporary cache. A similar approach is used in NR-MLPBR and R-MLPBR when a new receiver node wishes to join the multicast group. Tree maintenance in MAODV is based on the expanding ring search (ERS) approach, using the RREQ, RREP and MACT messages. The downstream node of a broken link is responsible for initiating ERS to issue a fresh RREQ for the group. This RREQ contains the hop count of the requesting node from the source and the last known sequence number for that group.
It can be replied to only by the member nodes whose recorded sequence number is greater than that indicated in the RREQ and whose hop distance to the source is smaller than the value indicated in the RREQ.

3.2 Bandwidth-efficient multicast routing protocol (BEMRP)

According to BEMRP [16], a node newly joining the multicast group opts for the nearest forwarding node in the existing tree, rather than choosing a minimum-hop path from the source of the multicast group. As a result, the number of links in the multicast tree is reduced, leading to savings in network bandwidth. Multicast tree construction is receiver-initiated: when a node wishes to join the multicast group as a receiver, it initiates the flooding of Join control packets targeted towards the nodes that are currently members of the multicast tree. On receiving the first Join control packet, a member node waits for a certain time before sending a Reply packet. The member node sends the Reply packet on the path traversed by the Join control packet with the minimum number of intermediate forwarding nodes. The newly joining receiver node collects the Reply packets from different member nodes and sends a Reserve packet on the path that has the minimum number of forwarding nodes from the member node to itself. To provide more bandwidth efficiency, the tree maintenance approach in BEMRP is hard-state based, i.e. a member node transmits control packets only after a link breaks. BEMRP uses two schemes to recover from link failures: in the broadcast-multicast scheme, the upstream node of the broken link is responsible for finding a new route to the previous downstream node; in the local-rejoin scheme, the downstream node of the broken link tries to rejoin the multicast group using a limited flooding of Join control packets.
3.3 Location prediction based routing (LPBR) protocol

LPBR works as follows: whenever a source node has data packets to send to a destination node but does not have a route to that node, it initiates a flooding-based route discovery by broadcasting a Route-Request (RREQ) packet. During this flooding process, each node forwards the RREQ packet exactly once, after incorporating its location update vector (LUV) in the RREQ packet. The LUV of a node comprises the node id, the current X and Y co-ordinates of the node, and its current velocity and angle of movement with respect to the X-axis. The destination node collects the LUV information of all the nodes in the network from the RREQ packets received through several paths and sends a Route-Reply (RREP) packet to the source on the minimum-hop path traversed by a RREQ packet. The source starts sending the data packets on the path learnt (based on the RREP packet) and informs the destination about the time of the next packet dispatch through the header of the data packet currently being sent. If an intermediate node cannot forward a data packet, it sends a Route-Error packet to the source node, which then waits for a while for the destination to inform it of a new route predicted using the LUVs gathered from the latest flooding-based route discovery. If the destination does not receive the data packet within the expected time, it locally constructs the current global topology by predicting the locations of the nodes: each node is assumed to be currently moving in the same direction and at the same speed as mentioned in its latest LUV. If there is at least one source-destination path in the predicted global topology, the destination node sends the source a LPBR-RREP packet on the minimum-hop path in the predicted topology. If the predicted path actually exists in reality, the intermediate nodes on the predicted route manage to forward the LPBR-RREP packet to the source.
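The topology prediction above rests on dead-reckoning each node forward from its latest LUV. A minimal sketch of that projection, assuming straight-line motion at constant speed (the function name is ours):

```python
import math

def predict_position(x, y, velocity, angle_deg, dt):
    """Project a node's position dt seconds ahead, assuming it keeps the
    speed and direction (angle from the X-axis) reported in its latest LUV."""
    theta = math.radians(angle_deg)
    return (x + velocity * dt * math.cos(theta),
            y + velocity * dt * math.sin(theta))

# A node at (100, 50) moving at 10 m/s along the X-axis, 3 seconds later:
print(predict_position(100.0, 50.0, 10.0, 0.0, 3.0))  # (130.0, 50.0)
```

Applying this to every LUV and connecting the pairs of predicted positions within transmission range of each other yields the locally constructed global topology on which the destination searches for a predicted path.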
The source uses the route learnt through the latest LPBR-RREP packet to send the data packets; a costly flooding-based route discovery has thus been avoided. If an intermediate node cannot forward the LPBR-RREP packet (i.e., the predicted path did not exist in reality), the intermediate node sends a LPBR-RREP-ERROR packet to the destination, informing it of the failure to forward the LPBR-RREP packet. The destination discards all the LUVs, and the source node initiates the next flooding-based route discovery after timing out waiting for the LPBR-RREP packet.

3.4 Multicast extensions to the LPBR protocol (NR-MLPBR and R-MLPBR)

Both multicast extensions of LPBR, referred to as NR-MLPBR and R-MLPBR, aim to minimize the number of global broadcast tree discoveries as well as the hop count per source-receiver path of the multicast tree. They use the same underlying idea of letting the receiver nodes predict a new path based on a locally constructed global topology obtained from the location and mobility information of the nodes learnt through the latest broadcast tree discovery. Receiver nodes running NR-MLPBR (the Non-Receiver aware Multicast extension of LPBR) are not aware of the other receivers of the multicast group, whereas each receiver node running R-MLPBR (the Receiver-aware Multicast extension of LPBR) is aware of the identities of the other receivers of the multicast group. NR-MLPBR attempts to predict a minimum-hop path to the source, whereas R-MLPBR attempts to predict a path to the source that has the minimum number of non-receiver nodes. The multicast extensions of LPBR work as follows: when a source attempts to construct a multicast tree, it floods a Multicast Tree Request Message (MTRM) throughout the network. The location and mobility information of the intermediate forwarding nodes is recorded in the MTRM. Each node, including the receiver nodes of the multicast group, broadcasts the MTRM exactly once in its neighbourhood.
Each receiver node of the multicast group receives several MTRMs and sends a Multicast Tree Establishment Message (MTEM) on the minimum-hop path traversed by the MTRMs. The set of paths traversed by the MTEMs forms the multicast tree rooted at the source. If an intermediate node of the tree notices a downstream node moving away from it, the intermediate node sends a Multicast Path Error Message (MPEM) to the source. The source does not immediately initiate another tree discovery procedure. Instead, the source waits for the appropriate receiver node (whose path to the source has broken) to predict a path to the source. The receiver predicts a new path based on the location and mobility information of the nodes collected through the MTRMs during the latest global tree discovery procedure: the receiver attempts to locally construct the global topology by predicting the locations of the nodes in the network using the latest location and mobility information collected. NR-MLPBR and R-MLPBR differ from each other in the type of path predicted and notified to the source. NR-MLPBR determines and sends a Multicast Predicted Path Message (MPPM) on the minimum-hop path to the source. R-MLPBR attempts to choose a path that will minimize the number of newly added intermediate nodes in the multicast tree. To this end, R-MLPBR determines a set of node-disjoint paths to the source on the predicted topology and sends the MPPM on the path that includes the minimum number of non-receiver nodes; if there is a tie, R-MLPBR chooses the path that has the least hop count. The source waits to receive a MPPM from the affected receiver node. If a MPPM is received within a certain time, the source considers the path traversed by the MPPM as part of the multicast tree and continues to send data packets down the tree, including to the nodes on the new path. Otherwise, the source initiates another global tree discovery procedure by broadcasting the MTRM.
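R-MLPBR's choice among the node-disjoint predicted paths can be sketched as follows. This is a minimal illustration of ours (the path representation and function name are hypothetical): fewest non-receiver intermediate nodes first, smaller hop count as the tie-breaker.

```python
def choose_mppm_path(paths, receivers):
    """Among candidate node-disjoint predicted paths (receiver ... source),
    pick the one with the fewest non-receiver intermediate nodes,
    breaking ties by the smaller hop count."""
    def cost(path):
        intermediates = path[1:-1]  # exclude the receiver itself and the source
        non_receivers = sum(1 for n in intermediates if n not in receivers)
        return (non_receivers, len(path) - 1)  # hop count = number of edges
    return min(paths, key=cost)

receivers = {"r1", "r2", "r3"}
paths = [
    ["r1", "a", "b", "s"],         # 2 non-receiver intermediates, 3 hops
    ["r1", "r2", "c", "s"],        # 1 non-receiver intermediate, 3 hops
    ["r1", "r2", "r3", "d", "s"],  # 1 non-receiver intermediate, 4 hops
]
print(choose_mppm_path(paths, receivers))  # ['r1', 'r2', 'c', 's']
```

Routing through existing receivers in this way avoids adding new intermediate (non-receiver) nodes, and hence new links, to the tree.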
R-MLPBR has thus been designed to also reduce the number of links that form the multicast tree, in addition to the source-receiver hop count and the number of global tree discoveries.

4 Simulations

The network dimension used is a 1000m x 1000m square network. The transmission range of each node is assumed to be 250m. The number of nodes used in the network is 25, 50 and 75 nodes, representing networks of low, medium and high density with an average distribution of 5, 10 and 15 neighbours per node respectively. Initially, nodes are uniformly randomly distributed in the network. We implemented all four multicast routing protocols (MAODV, BEMRP, NR-MLPBR and R-MLPBR) in the ns-2 simulator [4]. The broadcast tree discovery strategies simulated are the default flooding approach and DMEF. The DMEF parameter values are given in Table 1.

Table 1: DMEF Parameter Values.

    DMEF Parameter          Value
    maxNeighb_lowDensity    5
    maxNeighb_modDensity    10
    lowDensity_α            5
    modDensity_α            10
    highDensity_α           20
    maxVel_lowMobility      5
    maxVel_modMobility      15
    lowMobility_β           1.6
    modMobility_β           1.3
    highMobility_β          1.1
    T_wait                  10 seconds

The signal propagation model used is the two-ray ground reflection model [4]. The Medium Access Control (MAC) layer model is the IEEE 802.11 [3] model. The channel bandwidth is 2 Mbps. The node queues operate on a First-in First-Out (FIFO) basis, with a maximum queue size of 100 packets. The node mobility model used is the Random Waypoint model [2], with the node velocity chosen from [v_min, ..., v_max]; v_min was set to 0, and the values of v_max used are 10 m/s, 30 m/s and 50 m/s, representing scenarios of low, moderate and high node mobility respectively. The pause time is 0 seconds. Simulations are conducted with multicast group sizes of 2 and 4 (small), 8 and 12 (moderate) and 24 (large) receiver nodes. For each group size, we generated 5 lists of receiver nodes and simulations were conducted with each of them. Traffic sources are constant bit rate (CBR).
Data packets are 512 bytes in size and the packet sending rate is 4 data packets/second. The multicast session continues until the end of the simulation time, which is 1000 seconds. The transmission energy and reception energy per hop are set at 1.4 W and 1 W respectively [6]. The initial energy at each node is 1000 Joules. Each node periodically broadcasts a beacon message within its neighbourhood to make its presence felt to the other nodes in the neighbourhood.

4.1 Performance metrics

The performance metrics studied through this simulation are listed below. The performance results for each metric, displayed in Figures 2 through 14, are an average of the results obtained from simulations conducted with 5 sets of multicast groups and 5 sets of mobility profiles for each combination of group size, node velocity and network density. The multicast source in each case was selected randomly among the nodes in the network, and the source is not part of the multicast group; the nodes that are part of the multicast group are merely the receivers.

• Number of Links per Multicast Tree: the time-averaged number of links in the multicast trees discovered, computed over the entire multicast session. The notion of "time-average" is explained as follows: if there are multicast trees T1, T2 and T3 with 5, 8 and 6 links, used for 12, 6 and 15 seconds respectively, then the time-averaged number of links in the multicast trees is (5*12 + 8*6 + 6*15) / (12 + 6 + 15) = 6, and not 6.33, which is the plain average of 5, 8 and 6.

• Hop Count per Source-Receiver Path: the time-averaged hop count of the paths from the source to each receiver of the multicast group, computed over the entire multicast session.

• Time between Successive Broadcast Tree Discoveries: the time between two successive broadcast tree discoveries, averaged over the entire multicast session.
This metric is a measure of the lifetime of the multicast trees discovered and also of the effectiveness of the path prediction approach followed in NR-MLPBR and R-MLPBR.

• Energy Throughput: the average of the ratio of the number of data packets reaching the destination to the sum of the energy spent across all the nodes in the network.

• Energy Consumed per Node: the sum of the energy consumed at a node due to the transfer of data packets as part of the multicast session, the broadcast tree discoveries, as well as the periodic broadcast and exchange of beacons in the neighbourhood.

• Energy Consumed per Tree Discovery: the average of the total energy consumed by the global broadcast based tree discovery attempts. This includes the energy consumed to transmit (broadcast) the MTRM packets to the nodes in the neighbourhood and to receive the MTRM packet sent by each node in the neighbourhood, summed over all the nodes. It also includes the energy consumed to transmit the MTEM packet from each receiver to the source of the multicast session.

4.2 Number of links per multicast tree

The number of links per multicast tree (refer to Figures 2 and 3) is a measure of the efficiency of the multicast routing protocol in reducing the number of link transmissions during the transfer of the multicast data from the source to the receivers of the multicast group. The smaller the number of links in the tree, the larger the link transmission efficiency of the multicast routing protocol: if fewer links are part of the tree, fewer link-level transmissions are needed to deliver each data packet, which improves the efficiency of link usage and conserves network bandwidth. Naturally, the BEMRP protocol, which has been designed purely to yield bandwidth-efficient multicast trees, discovers trees that have a reduced number of links for all the operating scenarios. This leads to a larger hop count per source-receiver path for BEMRP, as observed in Figures 4 and 5.
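The number of links reported here is a time-averaged quantity; the time-averaging described in Section 4.1 can be sketched as follows (a minimal illustration; the function name is ours):

```python
def time_averaged(values_with_durations):
    """Weight each tree's metric value by how long that tree was in use."""
    total = sum(value * duration for value, duration in values_with_durations)
    return total / sum(duration for _, duration in values_with_durations)

# The worked example from Section 4.1: trees with 5, 8 and 6 links,
# used for 12, 6 and 15 seconds respectively.
print(time_averaged([(5, 12), (8, 6), (6, 15)]))  # 6.0
```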
R-MLPBR, which has been designed to choose the predicted paths with the minimum number of non-receiver nodes, manages to significantly reduce the number of links vis-à-vis the MAODV and NR-MLPBR protocols. R-MLPBR attempts to minimize the number of links in the multicast tree without yielding a higher hop count per source-receiver path. But the tradeoff between link efficiency and the hop count per source-receiver path continues to exist and cannot be nullified. In other words, R-MLPBR cannot discover trees that have the minimum number of links as well as the minimum hop count per source-receiver path. Nevertheless, R-MLPBR is the first multicast routing protocol that yields trees with a reduced number of links and, at the same time, a reduced hop count (close to the minimum) per source-receiver path.

Performance with Flooding as the Tree Discovery Strategy

• Impact of Node Mobility: For a given network density and multicast group size, we do not see any appreciable variation in the number of links per tree for each of the multicast routing protocols studied.

[Figure 2: Average Number of Links per Multicast Tree (Tree Discovery Procedure: Flooding); panels 2.1-2.9 show 25, 50 and 75 nodes at maximum node velocities of 10, 30 and 50 m/s.]

• Impact of Network Density: For a given multicast group size, the number of links per tree for MAODV and NR-MLPBR is about 4-15%, 8-28% and 10-38% more than that incurred with BEMRP in networks of low, moderate and high density respectively. This illustrates that as the network density increases, BEMRP attempts to reduce the number of links per tree by incorporating links that can be shared by multiple receivers on the paths towards the source.
On the other hand, both MAODV and NR-MLPBR attempt to choose minimum-hop paths between the source and any receiver and hence exploit the increase in network density to discover minimum-hop paths, but at the cost of link efficiency. R-MLPBR, in contrast, attempts to reduce the number of links per tree as the network density increases. For a given multicast group size, the number of links per tree for R-MLPBR is about 4-15%, 8-18% and 10-21% more than that incurred by BEMRP. This shows that R-MLPBR is relatively more scalable, similar to BEMRP, with increase in the network density.

• Impact of Multicast Group Size: For a given level of node mobility, for smaller multicast groups (of size 2), the number of links per tree for MAODV, NR-MLPBR and R-MLPBR is about 3-7%, 8-11% and 9-14% more than that incurred for BEMRP in low, medium and high-density networks respectively. For medium and large-sized multicast groups, the number of links per tree for both MAODV and NR-MLPBR is about 7-15%, 17-28% and 22-38% more than that incurred for BEMRP in low, medium and high-density networks respectively. On the other hand, the number of links per tree for R-MLPBR is about 6-15%, 12-18% and 16-21% more than that incurred for BEMRP in low, medium and high-density networks respectively. This shows that R-MLPBR is relatively more scalable, similar to BEMRP, with increase in the multicast group size.

Performance with DMEF as the Tree Discovery Strategy

• Impact of Node Mobility: For each multicast routing protocol, as the maximum node velocity is increased from 10 m/s to 30 m/s, the number of links per multicast tree increases by up to 24% (for multicast groups of small and moderate sizes) and 3% (for larger multicast groups). As the maximum node velocity is increased from 10 m/s to 50 m/s, the number of links per tree increases by up to 15% (for multicast groups of small and moderate sizes) and 5% (for larger multicast groups).
Thus, DMEF can yield multicast trees with a reduced number of links under low node mobility, especially for multicast groups of small and moderate sizes.

[Figure 3: Average Number of Links per Multicast Tree (Tree Discovery Procedure: DMEF); panels 3.1-3.9 show 25, 50 and 75 nodes at maximum node velocities of 10, 30 and 50 m/s.]

• Impact of Network Density: For a given group size, the number of links per tree for MAODV and NR-MLPBR is about 4-15%, 8-28% and 10-35% more than that incurred with BEMRP in networks of low, moderate and high density respectively. For a given group size, the number of links per tree for R-MLPBR is about 3-9%, 8-18% and 9-24% more than that incurred by BEMRP. The results are more or less similar to those obtained using flooding as the tree discovery strategy.

• Impact of Multicast Group Size: For a given level of node mobility, for smaller multicast groups (of size 2), the number of links per tree for MAODV, NR-MLPBR and R-MLPBR is about 4-7%, 8-9% and 9-14% more than that incurred for BEMRP in low, medium and high-density networks respectively. For medium and large-sized multicast groups, the number of links per tree for both MAODV and NR-MLPBR is about 7-15%, 17-28% and 21-35% more than that incurred for BEMRP in low, medium and high-density networks respectively. On the other hand, the number of links per tree for R-MLPBR is about 6-8%, 11-18% and 15-24% more than that incurred for BEMRP in low, medium and high-density networks respectively. These results are almost the same as those obtained when flooding is used as the tree discovery strategy.
4.3 Hop count per source-receiver path

All three multicast routing protocols MAODV, NR-MLPBR and R-MLPBR incur almost the same average hop count per source-receiver path, and it is considerably lower than that incurred for BEMRP. The hop count per source-receiver path is an important metric, often indicative of the end-to-end delay per multicast packet from the source to a specific receiver. BEMRP incurs a significantly larger hop count per source-receiver path, which can be attributed to the tendency of this multicast routing protocol to look for trees with a reduced number of links. When multiple receiver nodes have to be connected to the source through a reduced set of links, the hop count per source-receiver path is bound to increase. In Figures 4 and 5, we can see a significant increase in the hop count per source-receiver path as the multicast group size increases. In the case of flooding, the hop count per source-receiver path for BEMRP can be as much as 41%, 57% and 59% more than that incurred for the other three multicast routing protocols. In the case of DMEF, the hop count per source-receiver path for BEMRP can be as much as 36%, 49% and 53% more than that incurred for the other three multicast routing protocols. The increase in the hop count per source-receiver path for BEMRP is thus slightly less under DMEF than under flooding.

Performance with Flooding as the Tree Discovery Strategy

• Impact of Node Mobility: For a given network density and group size, we do not see any appreciable variation in the hop count per source-receiver path for each of the multicast routing protocols studied.

• Impact of Network Density: As we increase the network density, the hop count per source-receiver path decreases. This is mainly observed in the case of the minimum-hop based MAODV, NR-MLPBR and R-MLPBR.
In the case of BEMRP, the impact of network density on the decrease in the hop count is relatively small, as it is a bandwidth-efficient multicast routing protocol attempting to reduce the number of links in the tree. In networks of moderate density (50 nodes), the hop count per source-receiver path for the three minimum-hop based multicast protocols is about 6%, 9-12% and 15-19% less than that incurred in low-density networks for multicast groups of small, medium and larger sizes respectively. In high-density networks (75 nodes), the hop count per source-receiver path for the three minimum-hop based multicast protocols is about 7-9%, 11-18% and 15-19% less than that incurred in low-density networks for multicast groups of small, medium and larger sizes respectively. In the case of BEMRP, the maximum reduction in the hop count with increase in network density is within 10%.

[Figure 4: Average Hop Count per Source-Receiver Path (Tree Discovery Procedure: Flooding); panels 4.1-4.9 show 25, 50 and 75 nodes at maximum node velocities of 10, 30 and 50 m/s, with the number of receivers per multicast group on the X-axis.]

• Impact of Multicast Group Size: For smaller multicast groups (of size 2), the hop count per source-receiver path for BEMRP can be 6-10%, 8-12% and 10-12% more than that of the other three multicast routing protocols in networks of low, moderate and high density respectively. For medium-sized multicast groups, the hop count per source-receiver path for BEMRP can be 14-29%, 21-30% and 23-37% more than that of the other three multicast routing protocols in networks of low, moderate and high density respectively.
For large-sized multicast groups, the hop count per source-receiver path for BEMRP can be 27-41%, 35-57% and 33-59% more than that of the other three multicast routing protocols in networks of low, moderate and high density respectively.

Performance with DMEF as the Tree Discovery Strategy

• Impact of Node Mobility: For each of the multicast routing protocols, as the maximum node velocity is increased from 10 m/s to 30 m/s, we observe that the hop count per source-receiver path increases by up to 17% (for multicast groups of small and moderate sizes) and 7% (for multicast groups of larger size). As the maximum node velocity is increased from 10 m/s to 50 m/s, we observe that the hop count per source-receiver path increases by up to 13% (for multicast groups of small and moderate sizes) and 15% (for multicast groups of larger size). This shows that DMEF can yield multicast trees with a reduced hop count per source-receiver path under low node mobility, especially for multicast groups of small and moderate sizes.

• Impact of Network Density: The impact is similar to that observed in the case of flooding. For the minimum-hop based multicast protocols, the hop count per source-receiver path decreases significantly with increase in network density. In the case of BEMRP, the decrease in the hop count per source-receiver path with increase in network density is relatively small.

• Impact of Multicast Group Size: For smaller multicast groups (of size 2), the hop count per source-receiver path for BEMRP can be 6-9%, 9-12% and 10-12% more than that of the other three multicast routing protocols in networks of low, moderate and high density respectively. For medium-sized multicast groups, the hop count per source-receiver path for BEMRP can be 13-28%, 20-29% and 23-34% more than that of the other three multicast routing protocols in networks of low, moderate and high density respectively.
For large-sized multicast groups, the hop count per source-receiver path for BEMRP can be 24-36%, 33-50% and 36-54% more than that for the other three multicast routing protocols in networks of low, moderate and high density respectively.

[Figure 5: Average Hop Count per Source-Receiver Path (Tree Discovery Procedure: DMEF). Panels 5.1-5.9 show networks of 25, 50 and 75 nodes at maximum node velocities of 10, 30 and 50 m/s.]

4.4 Time between successive broadcast tree discoveries

The time between successive broadcast tree discoveries is a measure of the stability of the multicast trees and of the effectiveness of the location-prediction and path-prediction approach of the two multicast extensions. For a given condition of node density and node mobility, both NR-MLPBR and R-MLPBR incur a relatively larger time between successive broadcast tree discoveries for small and medium-sized multicast groups. MAODV tends to become more unstable as the multicast group size is increased, owing to the minimum-hop nature of the paths discovered and the absence of any path-prediction approach. For larger multicast groups, BEMRP tends to perform better by virtue of its tendency to strictly minimize only the number of links in the tree. On the other hand, NR-MLPBR attempts to reduce the hop count per source-receiver path and ends up choosing predicted paths that increase the number of links in the tree, quickly leading to the failure of the tree. The time between successive tree discoveries for R-MLPBR is 15-25%, 15-59% and 20-82% more than that obtained for MAODV in networks of low, moderate and high density respectively. For a given level of node mobility and network density, MAODV trees become highly unstable as the multicast group size increases.
For multicast groups of size 2 and 4, the time between successive broadcast tree discoveries for NR-MLPBR and R-MLPBR is greater than that obtained for BEMRP, especially in networks of low and moderate density. For larger multicast group sizes, when we employ flooding, BEMRP tends to incur a larger time between successive broadcast tree discoveries than NR-MLPBR and R-MLPBR. On the other hand, when we employ DMEF, R-MLPBR tends to incur a larger time between successive broadcast tree discoveries than BEMRP, even for larger group sizes.

Performance with Flooding as the Tree Discovery Strategy

• Impact of Node Mobility: For a given multicast group size, network density and multicast routing protocol, the time between successive broadcast tree discoveries at a maximum node velocity of 30 m/s is roughly 28-47% of that obtained at a maximum node velocity of 10 m/s. The time between successive broadcast tree discoveries at a maximum node velocity of 50 m/s is roughly 21-36% of that obtained at a maximum node velocity of 10 m/s.

• Impact of Network Density: For each multicast routing protocol, for a given multicast group size and level of node mobility, the time between successive broadcast tree discoveries decreases as the network density increases. This is mainly observed for the minimum-hop based multicast protocols (especially MAODV and NR-MLPBR), which incur a reduced hop count per source-receiver path as the network density increases. However, such minimum-hop paths obtained in moderate and high-density networks are relatively less stable than those obtained in low-density networks. For a given multicast group size and low node mobility, the time between successive tree discoveries in networks of moderate density (50 nodes) is 67-90% for MAODV and NR-MLPBR, and 73-96% for R-MLPBR and BEMRP, of that incurred in low-density networks.
[Figure 6: Average Time between Successive Tree Discoveries (Tree Discovery Procedure: Flooding). Panels 6.1-6.9 show networks of 25, 50 and 75 nodes at maximum node velocities of 10, 30 and 50 m/s.]

For a given multicast group size and low node mobility, the time between successive tree discoveries in networks of high density (75 nodes) is 51-80% for MAODV and NR-MLPBR, and 70-90% for R-MLPBR and BEMRP, of that obtained in low-density networks. In low-density networks, the time between successive tree discoveries for R-MLPBR and NR-MLPBR is about 10-15% more than that obtained for BEMRP for small multicast groups, and is almost the same as that of BEMRP for moderately sized multicast groups. For larger multicast groups, the time between successive tree discoveries for R-MLPBR and NR-MLPBR can be about 10-23% less than that obtained for BEMRP. In moderate and high-density networks, the time between successive tree discoveries for R-MLPBR is about 7-25% more than that obtained for BEMRP for small multicast groups, and is about the same for moderately sized multicast groups. For larger multicast groups, the time between successive tree discoveries for R-MLPBR can be about 15-25% less than that obtained for BEMRP. In both moderate and high-density networks, R-MLPBR incurs a larger time between successive tree discoveries (by as much as 30%) than NR-MLPBR.

• Impact of Multicast Group Size: For a given network density and node mobility, the time between successive tree discoveries decreases as the multicast group size increases. For smaller group sizes, the time between successive broadcast tree discoveries for MAODV and BEMRP is respectively about 80%-90% and 85%-94% of that incurred for NR-MLPBR and R-MLPBR.
For larger group sizes, the time between successive broadcast tree discoveries for MAODV is about 70%, 51% and 41% of that incurred for BEMRP in networks of low, moderate and high density respectively. Similarly, for larger group sizes, the time between successive broadcast tree discoveries for NR-MLPBR is about 76%, 64% and 57% of that incurred for BEMRP in networks of low, moderate and high density respectively. On the other hand, R-MLPBR tends to incur a relatively larger time between successive tree discoveries even for larger multicast group sizes. For larger multicast groups, the time between successive tree discoveries for R-MLPBR is about 75%-80% of that incurred for BEMRP for all network densities.

Performance with DMEF as the Tree Discovery Strategy

• Impact of Node Mobility: For a given multicast group size, network density and multicast routing protocol, the time between successive broadcast tree discoveries at a maximum node velocity of 30 m/s is roughly 38-59% of that obtained at a maximum node velocity of 10 m/s across networks of low, moderate and high density. The time between successive broadcast tree discoveries at a maximum node velocity of 50 m/s is roughly 34-50% of that obtained at a maximum node velocity of 10 m/s. In each instance, the time between successive tree discoveries obtained while using DMEF is at least 10-15% more than that obtained with flooding.

[Figure 7: Average Time between Successive Tree Discoveries (Tree Discovery Procedure: DMEF). Panels 7.1-7.9 show networks of 25, 50 and 75 nodes at maximum node velocities of 10, 30 and 50 m/s.]

• Impact of Network Density: As we increase the network density from 25 nodes to 50 nodes, the time between successive broadcast tree discoveries for MAODV, NR-MLPBR, R-MLPBR and BEMRP decreases by 13%, 9%, 6% and 6% respectively.
On the other hand, as we increase the network density from 25 nodes to 75 nodes, the larger number of nodes in the neighbourhood is taken into account by DMEF to discover stable routes, and there is no appreciable difference in the time between successive tree discoveries for NR-MLPBR, R-MLPBR and BEMRP. In the case of MAODV, the time between successive tree discoveries decreases by 8%.

• Impact of Multicast Group Size: For a given network density and node mobility, the time between successive tree discoveries decreases as the multicast group size increases. For smaller group sizes, the time between successive broadcast tree discoveries for MAODV and BEMRP is respectively about 82% and 87% of that incurred for NR-MLPBR and R-MLPBR. For moderate group sizes, the time between successive broadcast tree discoveries for MAODV, NR-MLPBR and BEMRP is about 77-86%, 96% and 96% of that incurred for R-MLPBR. For larger group sizes, the time between successive broadcast tree discoveries for MAODV and NR-MLPBR is about 80-89% and 92-94% of that obtained for R-MLPBR and BEMRP.

4.5 Energy consumed per node

Energy consumption in multicast routing is directly proportional to the number of links in the tree: the larger the number of links, the more transmissions take place and the more energy is consumed in the network, and vice versa. The simulation results in Figures 8 and 9 clearly illustrate this. BEMRP incurs the least energy consumption per node and MAODV the largest; the energy consumed per node for the two multicast extensions lies between these two extremes. The energy consumed per node for R-MLPBR is less than that of NR-MLPBR, as the former also attempts to simultaneously reduce the number of links as well as the hop count per source-receiver path. The energy consumption per node increases as the multicast group size increases.
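The proportionality between the number of tree links and the energy consumed per node can be illustrated with a minimal back-of-the-envelope sketch. The per-transmission and per-reception energy values and the helper function below are illustrative assumptions, not the energy model of the simulations reported here:

```python
# Hypothetical sketch: per-node energy spent forwarding multicast data
# over a tree, assuming fixed per-transmission / per-reception costs.
E_TX = 1.4  # Joules per packet transmission (assumed value)
E_RX = 1.0  # Joules per packet reception (assumed value)

def energy_per_node(num_tree_links, num_nodes, num_packets):
    """Every tree link carries each packet once: one transmission and
    one reception per link. The total is averaged over all nodes."""
    total = num_tree_links * num_packets * (E_TX + E_RX)
    return total / num_nodes

# A tree with more links costs proportionally more energy per node.
sparse = energy_per_node(num_tree_links=10, num_nodes=50, num_packets=100)
dense = energy_per_node(num_tree_links=15, num_nodes=50, num_packets=100)
assert abs(dense / sparse - 1.5) < 1e-9  # energy scales linearly with link count
```

This linear dependence is why a protocol that minimizes tree links (BEMRP) sits at the low end of the per-node energy curves while a minimum-hop protocol (MAODV) sits at the high end.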
For a given multicast group size and multicast routing protocol, the energy consumed per node increases with increase in network density as well as with increase in node mobility.

• Impact of Node Mobility: For a given multicast group size, network density and multicast routing protocol, the energy consumed per node at a maximum node velocity of 30 m/s can be up to 10-35% more than that obtained at a maximum node velocity of 10 m/s. The energy consumed per node at a maximum node velocity of 50 m/s can be up to 10-40% more than that obtained at a maximum node velocity of 10 m/s. BEMRP and MAODV incur the largest increase in energy consumed per node with increase in node mobility. NR-MLPBR and R-MLPBR incur a relatively lower increase, which can be attributed to the tendency of these multicast routing protocols to reduce the number of broadcast tree discoveries through effective tree prediction.

[Figure 8: Average Energy Consumed per Node (Tree Discovery Procedure: Flooding). Panels 8.1-8.9 show networks of 25, 50 and 75 nodes at maximum node velocities of 10, 30 and 50 m/s; x-axis: # Receivers per Multicast Group.]

• Impact of Network Density: For multicast groups of size 2 and 4, we observe that the energy consumed per node decreases as the network density increases from 25 to 50 nodes and from 25 to 75 nodes. This can be attributed to the small group size, which allows the data-forwarding load to be shared effectively among all the nodes in the network. For larger group sizes, all the nodes in the network end up spending more energy (due to transmission/reception, or at least receiving the packets in the neighbourhood).
As a result, for multicast group sizes of 8, 12 and 24, as we increase the network density from 25 nodes to 50 nodes, the energy consumed per node for MAODV, NR-MLPBR, R-MLPBR and BEMRP increases by 47%-134%, 46%-133%, 42%-122% and 30%-96% respectively. As we increase the network density from 25 nodes to 75 nodes, the energy consumed per node for MAODV, NR-MLPBR, R-MLPBR and BEMRP increases by 52%-158%, 50%-154%, 42%-125% and 25%-100% respectively. MAODV and NR-MLPBR incur a relatively larger energy consumed per node at high network densities owing to their tendency to discover trees with the minimum hop count. R-MLPBR and BEMRP discover trees with a reduced number of links and hence incur a relatively lower energy consumed per node at high network density.

• Impact of Multicast Group Size: As we increase the multicast group size from 2 to 24, the energy consumed per node for MAODV and NR-MLPBR increases by a factor of 2.1 to 2.6, 5.7 to 5.9 and 6.0 to 7.0 in networks of low, medium and high density respectively. In the case of BEMRP and R-MLPBR, as we increase the multicast group size from 2 to 24, the energy consumed per node increases by a factor of 2.1 to 2.5, 4.9 to 5.2 and 4.6 to 6.2 in networks of low, medium and high density respectively. The increase in the energy consumed per node is sub-linear with respect to the group size; hence, all four multicast routing protocols are scalable with respect to the increase in multicast group size.

Performance with DMEF as the Tree Discovery Strategy

• Impact of Node Mobility: For a given multicast group size, network density and multicast routing protocol, the energy consumed per node at maximum node velocities of 30 m/s and 50 m/s can be up to 5-20% more than that obtained at a maximum node velocity of 10 m/s. This indicates the effectiveness of DMEF vis-à-vis flooding in reducing the energy consumed per node.
DMEF discovers relatively more stable trees by involving only slow-moving nodes in the tree. As a result, the multicast trees exist for a long time and less energy is spent on tree discoveries. As with flooding, BEMRP and MAODV incur the largest increase in energy consumed per node with increase in node mobility, while NR-MLPBR and R-MLPBR incur a relatively lower increase.

[Figure 9: Average Energy Consumed per Node (Tree Discovery Procedure: DMEF). Panels 9.1-9.9 show networks of 25, 50 and 75 nodes at maximum node velocities of 10, 30 and 50 m/s.]

• Impact of Network Density: Similar to that observed for flooding, for multicast groups of size 2 and 4, the energy consumed per node decreases as the network density increases from 25 to 50 nodes and from 25 to 75 nodes. For multicast group sizes of 8, 12 and 24, as we increase the network density from 25 nodes to 50 nodes, the energy consumed per node for MAODV, NR-MLPBR, R-MLPBR and BEMRP increases by 54%-157%, 53%-156%, 48%-136% and 38%-118% respectively. As we increase the network density from 25 nodes to 75 nodes, the energy consumed per node for MAODV, NR-MLPBR, R-MLPBR and BEMRP increases by 49%-173%, 47%-172%, 42%-146% and 27%-114% respectively. MAODV and NR-MLPBR incur a relatively larger energy consumed per node at high network densities owing to their tendency to discover trees with the minimum hop count. R-MLPBR and BEMRP discover trees with a reduced number of links and hence incur a relatively lower energy consumed per node at high network density.
For a given network density, the energy consumed per node with flooding can be as much as 5%-16%, 12%-23% and 22%-37% more than that incurred with DMEF in the presence of low, medium and high node mobility respectively.

• Impact of Multicast Group Size: As we increase the multicast group size from 2 to 24, the energy consumed per node for MAODV and NR-MLPBR increases by a factor of 2.2 to 2.4, 5.6 to 5.8 and 6.0 to 7.1 in networks of low, medium and high density respectively. In the case of BEMRP and R-MLPBR, as we increase the multicast group size from 2 to 24, the energy consumed per node increases by a factor of 2.2 to 2.4, 4.9 to 5.4 and 4.8 to 6.4 in networks of low, medium and high density respectively. The increase in the energy consumed per node is sub-linear; hence, all four multicast routing protocols are scalable with respect to the increase in multicast group size.

4.6 Energy throughput

For each of the multicast routing protocols, for a given network density and node mobility, the energy throughput decreases with increase in the multicast group size. This can be attributed to the need to spend more energy to deliver a given multicast packet to many receivers than to a few receivers. For a given network density and multicast group size, the energy throughput of a multicast routing protocol decreases slightly as the node velocity is increased from low to moderate and high. For a given multicast group size and node mobility, the energy throughput of a multicast routing protocol decreases with increase in network density. This can be attributed to the involvement of several nodes (at larger network density) in distributing the offered traffic load to the multicast group. For a given simulation condition, the energy throughput of BEMRP is slightly larger than that of the other multicast routing protocols, which can be attributed to its lower energy consumed per node (and smaller number of links).
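The energy throughput metric reported in Figures 10 and 11 is the number of multicast packets delivered per Joule of energy consumed. As a simple ratio it can be sketched as follows; the numeric values are illustrative, not results from the simulations:

```python
def energy_throughput(packets_delivered, total_energy_joules):
    """Energy throughput: multicast packets delivered per Joule
    of energy consumed across the network."""
    return packets_delivered / total_energy_joules

# Illustrative values (assumed): delivering the same packets to a larger
# receiver set consumes more total energy, so packets-per-Joule drops.
small_group = energy_throughput(packets_delivered=1000, total_energy_joules=250.0)
large_group = energy_throughput(packets_delivered=1000, total_energy_joules=800.0)
assert small_group > large_group
```

This is why the metric falls with group size, node mobility and network density: each of these raises the energy spent per delivered packet.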
Performance with Flooding as the Tree Discovery Strategy

• Impact of Node Mobility: As we increase the node mobility, the energy throughput for a multicast routing protocol decreases by as much as 8%-12%, 12%-17% and 24%-26% in networks of low, moderate and high density respectively. For a given network density, the reduction in the energy throughput with increase in node mobility is due to the relatively larger amount of energy spent on broadcast tree discoveries.

[Figure 10: Energy Throughput: # Packets Delivered per Joule (Tree Discovery Procedure: Flooding). Panels 10.1-10.9 show networks of 25, 50 and 75 nodes at maximum node velocities of 10, 30 and 50 m/s.]

• Impact of Network Density: The decrease in energy throughput with increase in network density is largest for MAODV and NR-MLPBR, relatively lower for R-MLPBR and least for BEMRP. At a network density of 50 nodes, the energy throughput of MAODV and NR-MLPBR is 45%-64%, and that of R-MLPBR and BEMRP is 50%-65%, of that observed at a network density of 25 nodes. At a network density of 75 nodes, the energy throughput of MAODV, NR-MLPBR, R-MLPBR and BEMRP is 29%-48%, 30%-50%, 33%-50% and 38%-50% of that observed at a network density of 25 nodes.

• Impact of Multicast Group Size: As the multicast group size is increased from 2 to 4, the energy throughput of the multicast routing protocols decreases by 30%-40%, 36%-40% and 24%-45% in networks of low, moderate and high density respectively. As the multicast group size is increased from 2 to 24, the energy throughput of the multicast routing protocols decreases by about 78%, 83% and 85% in networks of low, moderate and high density respectively.
Performance with DMEF as the Tree Discovery Strategy

• Impact of Node Mobility: As we increase the node mobility from low to moderate and high, the energy throughput for a multicast routing protocol decreases by as much as 7%-8%, 8%-12% and 16%-17% in networks of low, moderate and high density respectively. The relatively higher energy throughput while using DMEF can be attributed to the tendency of the broadcast strategy to involve only relatively slow-moving nodes in the trees; as a result, less energy is consumed on broadcast tree discoveries.

• Impact of Network Density: The decrease in energy throughput with increase in network density is largest for MAODV and NR-MLPBR, relatively lower for R-MLPBR and least for BEMRP. At a network density of 50 nodes, the energy throughput of MAODV, NR-MLPBR, R-MLPBR and BEMRP is 48%-63%, 47%-63%, 52%-64% and 58%-69% of that observed at a network density of 25 nodes. At a network density of 75 nodes, the energy throughput of MAODV, NR-MLPBR, R-MLPBR and BEMRP is 32%-47%, 32%-48%, 36%-48% and 42%-50% of that observed at a network density of 25 nodes.

• Impact of Multicast Group Size: As the multicast group size is increased from 2 to 4, the energy throughput of the multicast routing protocols decreases by 36%-44%, 35%-45% and 30%-47% in networks of low, moderate and high density respectively. As the multicast group size is increased from 2 to 24, the energy throughput of the multicast routing protocols decreases by about 80%, 84% and 84% in networks of low, moderate and high density respectively.

[Figure 11: Energy Throughput: # Packets Delivered per Joule (Tree Discovery Procedure: DMEF). Panels 11.1-11.9 show networks of 25, 50 and 75 nodes at maximum node velocities of 10, 30 and 50 m/s.]

4.7 Energy consumed per tree discovery

For a given broadcast strategy, the energy consumed per tree discovery is the same for all four multicast routing protocols. For both flooding and DMEF, the energy consumed increases with increase in network density, which can be attributed to the involvement of multiple nodes in the broadcast of the MTRMs. In low-density networks, the energy consumed per tree discovery using flooding is 10-22%, 19-35% and 14-20% more than that using DMEF under low, moderate and high node mobility conditions respectively. In moderate-density networks, the energy consumed per tree discovery using flooding is about 15%, 23% and 28% more than that using DMEF under low, moderate and high node mobility conditions respectively. In high-density networks, the energy consumed per tree discovery using flooding is about 18%, 30% and 37% more than that using DMEF. As observed, DMEF performs increasingly better than flooding with increase in network density and/or node mobility.

[Figure 12: Energy Consumed per Broadcast Tree Discovery: Flooding vs. DMEF (25 Nodes).]

For a given multicast group size, the energy consumed while using flooding in moderate-density (50 nodes) and high-density (75 nodes) networks is respectively about 3.8 and 8 times more than that incurred in networks of low density. This indicates that as the number of nodes is increased by a factor of x (x = 2 for moderate density and x = 3 for high density), the energy consumed by flooding increases by roughly a factor of 2x. In the case of DMEF, for a given multicast group size, the energy consumed in moderate-density networks is about 3.7, 3.5 and 3.2 times more than that observed in low-density networks under low, moderate and high node mobility conditions respectively.
For a given multicast group size, the energy consumed while using DMEF in high-density networks is about 7.8, 7.2 and 6.6 times more than that observed in low-density networks under low, moderate and high node mobility conditions respectively. Thus, the energy consumed while using DMEF does not increase as steeply with network density as observed for flooding. DMEF performs appreciably well in lowering the energy consumed per tree discovery with increase in node mobility and/or network density.

[Figure 13: Energy Consumed per Broadcast Tree Discovery: Flooding vs. DMEF (50 Nodes).]

[Figure 14: Energy Consumed per Broadcast Tree Discovery: Flooding vs. DMEF (75 Nodes).]

5 Survey of MANET broadcast route discovery strategies

We surveyed the literature for broadcast route discovery strategies proposed to reduce the route discovery overhead; below we describe the strategies relevant to the research conducted in this paper. In Section 5.3, we qualitatively analyse the advantages of our DMEF broadcast strategy over the broadcast strategies described in Sections 5.1 and 5.2.

5.1 Reliable route selection (RRS) algorithm

In [20], the authors proposed a Reliable Route Selection (RRS) algorithm based on the Global Positioning System (GPS) [8]. The RRS algorithm divides the circular area formed by the transmission range of a node into two zones: a stable zone and a caution zone. A node is said to maintain stable links with the neighbour nodes in its stable zone and unstable links with the neighbour nodes lying in its caution zone. If R is the transmission range of a node, the radius of the stable zone is defined as r = R − δ·S, where S is the velocity of the node and δ is a tunable parameter. The stable zone is a circular region (with its own centre) inscribed inside the circular region formed by the transmission range of the node.
The centre of the stable zone need not coincide with the centre of the circular region forming the transmission range of the node, but it always lies in the direction of movement of the node. RRS works as follows: the Route-Request (RREQ) message of a broadcast route discovery process includes the coordinates of the current position of the transmitter of the RREQ message, the coordinates of the centre of the stable zone of the transmitter, the value of the parameter δ to be used by an intermediate node, and the stable zone radius of the transmitter of the message. The source node of the route discovery process broadcasts the RREQ message in the complete neighbourhood formed by the transmission range R, with the RRS-related fields set to initial values corresponding to the source node. An intermediate node receiving the RREQ message rebroadcasts it only if the node lies in the stable zone of the transmitter. If a route discovery attempt based on a set value of δ is unsuccessful, the source node decrements the value of δ and launches another global broadcast-based route discovery. This process is continued (i.e., the value of δ is decremented and the global broadcast reinitiated) until the source finds a path to the destination. If the source cannot find a route to the destination even when conducting route discovery with δ set to zero, the source declares that the destination is not connected to it.

5.2 Probability, counter, area and neighbour-knowledge based methods

In [15], the authors propose several broadcast route discovery strategies that reduce the number of nodes retransmitting a broadcast message. These strategies can be grouped into four families: probability-based, counter-based, area-based and neighbour-knowledge based methods. (i) Probability-based method: When a node receives a broadcast message for the first time, the node rebroadcasts the message with a certain probability.
If the received message has already been seen, the node drops it, irrespective of whether the node retransmitted the message when it received it for the first time. (ii) Counter-based method: When a node receives a broadcast message for the first time, it waits for a certain time before retransmitting the message. During this broadcast-wait-time, the node maintains a counter to track the number of redundant copies of the broadcast message received from its other neighbours. If the counter value exceeds a threshold within the broadcast-wait-time, the node drops the message; otherwise, the node retransmits it. (iii) Area-based method: A broadcasting node includes its location information in the message header. The receiving node calculates the additional coverage area that would be obtained if the message were rebroadcast. If the additional coverage area is less than a threshold value, all future receptions of the same message are dropped. Otherwise, the node starts a broadcast-wait-timer and caches the redundant copies of the broadcast message received during this broadcast-wait-time. After the timer expires, the node considers all the cached messages and recalculates the additional coverage area it would obtain by rebroadcasting the message. If the obtainable additional coverage area is less than the threshold value, the cached messages are dropped; otherwise, the message is rebroadcast. (iv) Neighbour-knowledge based method: This method requires nodes to maintain lists of their 1-hop and 2-hop neighbours, learnt via periodic beacon exchange. Using these lists, a node calculates the smallest set of 1-hop neighbours required to reach all of its 2-hop neighbours; the nodes in this minimum set are called the Multipoint Relays (MPRs).
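To make the counter-based rule of method (ii) concrete, the decision logic can be sketched as follows. The class name, default threshold and bookkeeping are illustrative simplifications, not the implementation from [15]:

```python
class CounterBasedNode:
    """Sketch of the counter-based rebroadcast rule: count redundant
    copies heard during the broadcast-wait-time and rebroadcast only
    if the counter stays below the threshold when the timer expires."""

    def __init__(self, counter_threshold=3):
        self.counter_threshold = counter_threshold
        self.counters = {}  # message id -> redundant copies heard so far

    def on_receive(self, msg_id):
        # The first reception starts the broadcast-wait-timer (modelled
        # implicitly here); later copies only bump the redundancy counter.
        if msg_id not in self.counters:
            self.counters[msg_id] = 0
        else:
            self.counters[msg_id] += 1

    def should_rebroadcast(self, msg_id):
        # Called when the broadcast-wait-timer for msg_id expires.
        return self.counters.get(msg_id, 0) < self.counter_threshold

node = CounterBasedNode(counter_threshold=3)
for _ in range(4):          # first copy + 3 redundant copies heard
    node.on_receive("rreq-7")
assert not node.should_rebroadcast("rreq-7")  # threshold reached: drop
```

The intuition: if a node has already overheard the same message several times, its own rebroadcast would add little coverage, so suppressing it saves energy with only a small risk of leaving some neighbour unreached.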
5.3 Advantages of DMEF and differences from related work

The DMEF strategy is very effective in discovering relatively long-living routes in an energy-efficient manner and differs from the RRS algorithm in the following ways:

• RRS is highly dependent on location-service schemes like GPS, while DMEF is not dependent on any location-service scheme.

• RRS requires the RREQ message header to be changed, while DMEF does not require any change in the structure of the RREQ messages used for broadcasting. DMEF can thus be used without requiring any change to a MANET routing protocol.

• In RRS, a node lying in the stable zone of the transmitter rebroadcasts the RREQ over its complete neighbourhood (i.e., at the full transmission range), even though only the recipient nodes lying in the stable zone of that transmitter will rebroadcast the RREQ further. Hence, RRS is not energy-efficient. In DMEF, the transmission range for a broadcast at a node is dynamically and locally determined using the node's velocity and neighbourhood density, and is usually considerably less than the maximum transmission range.

• RRS does not properly handle the scenario in which the value of δ·S exceeds the transmission range R of the node. The value of δ has to be iteratively reduced (by trial and error) to determine the connectivity between the source and destination nodes. DMEF is better than RRS in that it requires only one broadcast route discovery attempt from the source to determine a route to the destination if the two nodes are indeed connected. The values of the DMEF parameters are dynamically determined by each node itself, because a node knows its own velocity and neighbourhood better than the source of the broadcast process does.

• The network density does not influence the stable zone radius selected by RRS. In RRS, the number of nodes retransmitting the RREQ message in a neighbourhood increases significantly as the network density is increased.
DMEF is quite effective in reducing the number of nodes retransmitting the RREQ message in high-density networks.

The advantages of the DMEF scheme compared with the broadcast route discovery strategies discussed in Section 5.2 can be summarized as follows:

• The probability-based and MPR-based methods do not guarantee that the broadcast message will be routed on a path with the minimum hop count or close to it. Previous research [13] on the impact of these broadcast strategies on the stability and hop count of DSR routes indicates that the hop count of the paths can be far more than the minimum and that the routes have a smaller lifetime than the paths discovered using flooding. The probability-based method cannot always guarantee that the RREQ message gets delivered to the destination. Also, with increase in network density, the number of nodes retransmitting the message increases for both the probability-based and MPR-based methods. DMEF determines paths with a hop count close to that of the minimum-hop paths, and such paths have a relatively larger lifetime than those discovered using flooding. DMEF almost always guarantees that a source-destination route is discovered if at least one such route exists in the underlying network. DMEF also effectively controls the RREQ message retransmission overhead as the network density increases.

• The counter-based and area-based methods require careful selection of threshold values for their proper functioning. Each node has to wait for a broadcast-wait-time before retransmitting the message, which can introduce significant route acquisition delays. The area-based method also requires the nodes to be location-aware and to include the location information in the broadcast messages. With DMEF, there is no waiting time at a node before rebroadcasting a received RREQ message, if the message has been received for the first time during a particular route discovery.
DMEF does not depend on any location-aware services for its operation, and the structure of the RREQ message need not be changed. 5.4 Other relevant optimizations for multicast routing overhead In addition to the methods described in Sections 5.1 and 5.2, some of the other optimizations that have been proposed in the MANET literature include: (i) a Swarm Intelligence based multicast routing protocol for ad hoc networks (MANHSI), proposed in [1]; (ii) the independent tree ad hoc multicast routing (ITAMAR) framework proposed in [19], which includes a number of heuristics to compute a set of alternate trees that improve the mean time between interruptions in multicast communication, achieved with only a small increase in the route discovery overhead; and (iii) a virtual overlay mesh of unicast paths, proposed in [7] for efficient discovery of multicast routes. 6 Conclusions and future work Simulations have been conducted with both flooding and DMEF as the broadcast tree discovery strategies. DMEF helps the multicast routing protocols discover stable trees and at the same time does not increase the source-receiver hop count appreciably. Hence, the energy consumed per node with DMEF is lower than that incurred with flooding. With DMEF as the tree discovery strategy, the performance of NR-MLPBR and R-MLPBR with respect to the time between successive tree discoveries and the energy consumed per node improved relatively more than that observed for BEMRP and MAODV. This can be attributed to the effective path prediction of the two multicast extensions, an idea inherited from LPBR and complemented by DMEF. Thus, DMEF has been demonstrated to be an effective broadcast strategy for discovering multicast trees. In related work [12], we have also demonstrated the effectiveness of DMEF in discovering node-disjoint multi-path routes for MANETs.
Thus, DMEF is an effective broadcast strategy for discovering stable unicast, multicast and multi-path routes for MANETs, with relatively lower energy consumption than the default flooding approach. The related work listed in Sections 5.1 and 5.4 requires the strategies to be embedded into the design of the protocols, with changes built into their route discovery procedures. Ours is the first such effort to study the impact of protocol-independent broadcast strategies (like DMEF and flooding) on the performance of multicast routing protocols. In future work, we will evaluate the impact of the probability, counter, area and neighbour-knowledge based methods on the performance of the multicast routing protocols. Acknowledgments Research was sponsored by the U.S. Army Research Laboratory and was accomplished under Cooperative Agreement Number W911NF-08-2-0061. The views and conclusions in this document are those of the authors and should not be interpreted as representing the official policies, either expressed or implied, of the Army Research Laboratory or the U.S. Government. The U.S. Government is authorized to reproduce and distribute reprints for Government purposes notwithstanding any copyright notation herein. References [1] Z. M. Alfawaer, H. W. Hua and N. Ahmed, "A Novel Multicast Routing Protocol for Mobile Ad Hoc Networks," American Journal of Applied Sciences, vol. 4, no. 5, pp. 333-338, 2007. [2] C. Bettstetter, H. Hartenstein and X. Perez-Costa, "Stochastic Properties of the Random-Waypoint Mobility Model," Wireless Networks, vol. 10, no. 5, pp. 555-567, 2004. [3] G. Bianchi, "Performance Analysis of the IEEE 802.11 Distributed Coordination Function," IEEE Journal on Selected Areas in Communications, vol. 18, no. 3, pp. 535-547, 2000. [4] L. Breslau et al., "Advances in Network Simulation," IEEE Computer, vol. 33, no. 5, pp. 59-67, May 2000. [5] J. Broch, D. A. Maltz, D. B. Johnson, Y. C. Hu and J.
Jetcheva, "A Performance Comparison of Multi-hop Wireless Ad Hoc Network Routing Protocols," Proceedings of the 4th Annual ACM/IEEE Conference on Mobile Computing and Networking, pp. 85-97, October 1998. [6] L. M. Feeney, "An Energy-Consumption Model for Performance Analysis of Routing Protocols for Mobile Ad Hoc Networks," Journal of Mobile Networks and Applications, vol. 3, no. 6, pp. 239-249, June 2001. [7] C. Gui and P. Mohapatra, "Overlay Multicast for MANETs using Dynamic Virtual Mesh," Wireless Networks, vol. 13, no. 1, pp. 77-91, 2007. [8] B. Hofmann-Wellenhof, H. Lichtenegger and J. Collins, Global Positioning System: Theory and Practice, 5th rev. ed., Springer, September 2004. [9] N. Meghanathan, "A Location Prediction Based Reactive Routing Protocol to Minimize the Number of Route Discoveries and Hop Count per Path in Mobile Ad hoc Networks," The Computer Journal, vol. 52, no. 4, pp. 461-482, July 2009. [10] N. Meghanathan, "Multicast Extensions to the Location Prediction Based Routing Protocol for Mobile Ad hoc Networks," ISAST Transactions on Computers and Intelligent Systems, vol. 1, no. 1, pp. 56-65, August 2009. [11] N. Meghanathan, "A Density and Mobility Aware Energy-efficient Broadcast Route Discovery Strategy for Mobile Ad hoc Networks," International Journal of Computer Science and Network Security, vol. 9, no. 11, pp. 15-24, November 2009. [12] N. Meghanathan, "A Node-Disjoint Multi-path Extension of the Location Prediction Based Routing Protocol for Mobile Ad hoc Networks," Proceedings of the 3rd International Conference on Signal Processing and Communication Systems, Omaha, Nebraska, USA, September 28-30, 2009. [13] N. Meghanathan, "Impact of Broadcast Route Discovery Strategies on the Performance of Mobile Ad Hoc Network Routing Protocols," Proceedings of the International Conference on High Performance Computing, Networking and Communication Systems, pp. 144-151, July 2007. [14] C. S. R. Murthy and B. S.
Manoj, Ad Hoc Wireless Networks: Architectures and Protocols, Prentice Hall, June 2004. [15] S. Ni, Y. Tseng, Y. Chen and J. Sheu, "The Broadcast Storm Problem in a Mobile Ad Hoc Network," Proceedings of the ACM International Conference on Mobile Computing and Networking, pp. 151-162, 1999. [16] T. Ozaki, J-B. Kim and T. Suda, "Bandwidth-Efficient Multicast Routing for Multihop, Ad hoc Wireless Networks," Proceedings of the IEEE INFOCOM Conference, vol. 2, pp. 1182-1192, Anchorage, USA, April 2001. [17] C. E. Perkins and E. M. Royer, "The Ad hoc On-demand Distance Vector Protocol," in Ad hoc Networking, edited by C. E. Perkins, pp. 173-219, Addison-Wesley, 2000. [18] E. Royer and C. E. Perkins, "Multicast Operation of the Ad-hoc On-demand Distance Vector Routing Protocol," Proceedings of the 5th ACM/IEEE Annual Conference on Mobile Computing and Networking, pp. 207-218, Seattle, USA, August 1999. [19] S. Sajama and Z. J. Haas, "Independent-tree Ad hoc Multicast Routing (ITAMAR)," Mobile Networks and Applications, vol. 8, no. 5, pp. 551-566, October 2003. [20] Y.-J. Suh, W. Kim and D.-H. Kwon, "GPS-Based Reliable Routing Algorithms for Ad Hoc Networks," The Handbook of Ad Hoc Wireless Networks, pp. 361-374, CRC Press, 2003. [21] C. K. Toh, G. Guichal and S. Bunchua, "ABAM: On-demand Associativity-based Multicast Routing for Ad hoc Mobile Networks," Proceedings of the 52nd IEEE VTS Fall Vehicular Technology Conference, vol. 3, pp. 987-993, September 2000.
An Extended TOPSIS Method for Multiple Attribute Group Decision Making Based on Generalized Interval-valued Trapezoidal Fuzzy Numbers Peide Liu Shandong Economic University, Information Management School Jinan Shandong 250014, China E-mail: Peide.liu@gmail.com Keywords: interval-valued fuzzy number, relative closeness coefficient, multiple attribute group decision making Received: May 29, 2009 The extended TOPSIS method presented here deals with multiple attribute group decision making problems in which the attribute values and weights take the form of generalized interval-valued trapezoidal fuzzy numbers (GIVTFN). First, some properties are defined: the concept and relational calculation rules of GIVTFN, the distance between GIVTFN and its characteristics, and the method for transforming linguistic terms into GIVTFN. Second, the normalization method for GIVTFN is illustrated, and an extended TOPSIS method based on GIVTFN is presented in detail. The alternatives are ranked by the relative closeness coefficient of TOPSIS. Finally, an illustrative example is given to show the effectiveness of the method and its decision-making steps. Povzetek: Z izboljšano metodo TOPSIS so dosegli boljše rezultate pri odločanju z mnogoterimi atributi. 1 Introduction Multiple attribute decision making (MADM) is an important part of modern decision science. It has been extensively applied to various areas, such as society, economics, management, military affairs and engineering technology, for example investment decision-making, project evaluation, economic evaluation and personnel evaluation. Since objective things are fuzzy and uncertain and human thinking is ambiguous, the majority of multiple attribute decision-making problems are uncertain and ambiguous; this class of problems is called fuzzy multiple attribute decision making (FMADM).
Since Bellman and Zadeh [1] first proposed the basic model of fuzzy decision making based on the theory of fuzzy mathematics, FMADM has been receiving more and more attention, and many achievements have been made on FMADM problems [2-5, 7-21]. TOPSIS (Technique for Order Preference by Similarity to Ideal Solution), proposed by Hwang and Yoon [6], is a popular approach to MCDM problems. Its basic principle is that the chosen alternative should have the shortest distance from the positive ideal solution and the farthest distance from the negative ideal solution. In TOPSIS, the performance ratings and the weights of the criteria are given as crisp values, but in many cases crisp data are inadequate to model real-life situations. Jahanshahloo et al. [7] extend the TOPSIS method to fuzzy decision-making situations by considering interval numbers and defining a crisp Euclidean distance between two interval numbers. Wang and Elhag [8] propose a fuzzy TOPSIS method based on alpha-level sets and present a nonlinear programming (NLP) solution procedure using triangular fuzzy numbers. Liu and Zeng [9] propose a new TOPSIS method for fuzzy multiple attribute group decision making based on the expected value operator of the trapezoidal fuzzy number, for the case where the fuzzy decision matrices and the weights of the decision attributes and decision makers are all given as trapezoidal fuzzy numbers. Tsaur et al. [10] convert the fuzzy MCDM problem into a crisp one via centroid defuzzification and then solve the non-fuzzy MCDM problem by the TOPSIS method. Chu and Lin [11] also change the fuzzy MCDM problem into a crisp one; differing from the others, they first derive the membership functions of all the weighted ratings in a weighted normalized decision matrix, convert them to crisp values by defuzzification, and then use the TOPSIS method to solve the problem.
The concept of the interval-valued fuzzy set was initially proposed by Gorzałczany [12] and Turksen [13]. Several researchers have focused on interval-valued fuzzy numbers [12-18] in recent years, because interval-valued fuzzy numbers are more general and better at expressing fuzzy information. Wang and Li [14-15] defined the expansion operations of interval-valued fuzzy numbers and proposed the concept and properties of the similarity coefficient based on interval-valued fuzzy numbers. Hong and Lee [16] proposed the distance of interval-valued fuzzy numbers. Ashtiani et al. [17] defined interval-valued triangular fuzzy numbers and presented an extended TOPSIS group decision-making method for them. Wei and Chen [18] proposed similarity measures between generalized interval-valued trapezoidal fuzzy numbers (GIVTFN) for risk analysis. This paper develops an extended TOPSIS method for multiple attribute group decision making in which the attribute weights and values are given in the form of GIVTFN, building on a newly defined distance of GIVTFN. The remainder of this paper is organized as follows. Section 2 briefly introduces the basic concepts and operation rules of GIVTFN and defines the distance of GIVTFN. Section 3 describes the extended TOPSIS method for solving multiple attribute group decision making problems using GIVTFN. Section 4 gives a numerical example to illustrate the validity of the method and its decision-making steps. Section 5 concludes the study.
2 The basic concept of the interval-valued trapezoidal fuzzy numbers

2.1 The generalized trapezoidal fuzzy numbers

(1) The concept of the generalized trapezoidal fuzzy numbers

Definition 1 [19]: A generalized trapezoidal fuzzy number can be defined as a vector A = (a1, a2, a3, a4; wA) (as shown in Fig. 1), where a1 ≤ a2 ≤ a3 ≤ a4 and wA ∈ [0, 1], and the membership function μA(x): R → [0, 1] is defined as follows:

  μA(x) = wA (x − a1)/(a2 − a1),  x ∈ [a1, a2)
        = wA,                    x ∈ [a2, a3]
        = wA (x − a4)/(a3 − a4),  x ∈ (a3, a4]
        = 0,                     x ∈ (−∞, a1) ∪ (a4, ∞)      (1)

The elements x ∈ R are real numbers, and the membership function μA(x) is a regular, continuous convex function expressing the membership degree to the fuzzy set. If −1 ≤ a1 ≤ a2 ≤ a3 ≤ a4 ≤ 1, then A is called a normalized trapezoidal fuzzy number. In particular, if wA = 1, then A is called a trapezoidal fuzzy number (a1, a2, a3, a4); if a1 < a2 = a3 < a4, then A reduces to a triangular fuzzy number; if a1 = a2 = a3 = a4, then A reduces to a real number.

Figure 1: The generalized trapezoidal fuzzy number A.

(2) The operations of the generalized trapezoidal fuzzy numbers

Suppose that a = (a1, a2, a3, a4; wa) and b = (b1, b2, b3, b4; wb) are generalized trapezoidal fuzzy numbers; then the operational rules are as follows [20]:

(i) a ⊕ b = (a1 + b1, a2 + b2, a3 + b3, a4 + b4; min(wa, wb))      (2)

(ii) a ⊖ b = (a1 − b4, a2 − b3, a3 − b2, a4 − b1; min(wa, wb))      (3)

(iii) a ⊗ b = (a, b, c, d; min(wa, wb))      (4)

where a = min(a1 b1, a1 b4, a4 b1, a4 b4), b = min(a2 b2, a2 b3, a3 b2, a3 b3), c = max(a2 b2, a2 b3, a3 b2, a3 b3) and d = max(a1 b1, a1 b4, a4 b1, a4 b4). If a1, a2, a3, a4, b1, b2, b3, b4 are all positive numbers, then a ⊗ b = (a1 b1, a2 b2, a3 b3, a4 b4; min(wa, wb)).

(iv) a ⊘ b = (a1/b4, a2/b3, a3/b2, a4/b1; min(wa, wb))      (5)

(3) The centre of gravity (COG) point of the generalized trapezoidal fuzzy numbers

Chen and Chen [21] proposed the concept of the COG point of the generalized trapezoidal fuzzy numbers. Suppose that the COG point of a = (a1, a2, a3, a4; wa) is (xa, ya); then [21]:

  ya = wa ((a3 − a2)/(a4 − a1) + 2) / 6,  if a1 ≠ a4
  ya = wa / 2,                            if a1 = a4          (6)
  xa = (ya (a2 + a3) + (a4 + a1)(wa − ya)) / (2 wa)

2.2 The interval-valued trapezoidal fuzzy numbers

Suppose that A = [A^L, A^U] = [(a1^L, a2^L, a3^L, a4^L; wA^L), (a1^U, a2^U, a3^U, a4^U; wA^U)], where 0 ≤ a1^L ≤ a2^L ≤ a3^L ≤ a4^L ≤ 1, 0 ≤ a1^U ≤ a2^U ≤ a3^U ≤ a4^U ≤ 1, 0 ≤ wA^L ≤ wA^U ≤ 1 and A^L ⊂ A^U; then A is called a generalized interval-valued trapezoidal fuzzy number. For two such numbers A = [A^L, A^U] and B = [B^L, B^U], the operations A ⊕ B, A ⊖ B, A ⊗ B and A ⊘ B (formulas (7)-(10), with division requiring B > 0) are obtained by applying the rules (2)-(5) to the lower trapezoids and to the upper trapezoids separately, and the scalar multiple is λA = [(λ a1^L, λ a2^L, λ a3^L, λ a4^L; wA^L), (λ a1^U, λ a2^U, λ a3^U, λ a4^U; wA^U)] for λ > 0 (11).

2.3 The distance of the interval-valued trapezoidal fuzzy numbers

Suppose that A = [A^L, A^U] and B = [B^L, B^U] are any two generalized interval-valued trapezoidal fuzzy numbers; the distance between A and B is calculated as follows:

(1) Utilize formula (6) to calculate the COG points (xA^L, yA^L), (xA^U, yA^U), (xB^L, yB^L), (xB^U, yB^U) of the generalized trapezoidal fuzzy numbers A^L, A^U, B^L, B^U respectively.

(2) The distance of the two interval-valued trapezoidal fuzzy numbers is:

  d(A, B) = sqrt( [ (xA^L − xB^L)² + (yA^L − yB^L)² + (xA^U − xB^U)² + (yA^U − yB^U)² ] / 4 )      (12)

where d(A, B) satisfies the following properties:
(i) if A and B are normalized interval-valued trapezoidal fuzzy numbers, then 0 ≤ d(A, B) ≤ 1;
(ii) A = B ⇔ d(A, B) = 0;
(iii) d(A, B) = d(B, A);
(iv) d(A, C) + d(C, B) ≥ d(A, B).

Obviously, properties (i) and (iii) are satisfied. For property (ii): if A = B, then d(A, B) = 0; conversely, if d(A, B) = 0, then the COG of A is equal to that of B, so we approximately take A = B.
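As a concrete check of formulas (6) and (12), the COG and distance computations can be sketched in Python (a minimal illustration; the function names `cog` and `distance` and the tuple encoding are our own, not from the paper):

```python
import math

def cog(a1, a2, a3, a4, w):
    """COG point (x, y) of a generalized trapezoidal fuzzy number
    (a1, a2, a3, a4; w), per Chen and Chen's formula (6)."""
    y = w / 2.0 if a1 == a4 else w * ((a3 - a2) / (a4 - a1) + 2.0) / 6.0
    x = (y * (a2 + a3) + (a4 + a1) * (w - y)) / (2.0 * w)
    return x, y

def distance(A, B):
    """COG-based distance (formula (12)) between two interval-valued
    trapezoidal fuzzy numbers, each given as a (lower, upper) pair of
    (a1, a2, a3, a4, w) tuples."""
    s = 0.0
    for part in (0, 1):                  # lower, then upper trapezoid
        xa, ya = cog(*A[part])
        xb, yb = cog(*B[part])
        s += (xa - xb) ** 2 + (ya - yb) ** 2
    return math.sqrt(s / 4.0)
```

For a symmetric trapezoid such as (0, 1, 2, 3; 1), the COG x-coordinate is the midpoint 1.5; the distance of a number to itself is 0 and the distance is symmetric, matching properties (ii) and (iii).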
For property (iv): to simplify the notation, write α1 = xA^L, α2 = xB^L, α3 = xC^L, β1 = yA^L, β2 = yB^L, β3 = yC^L, γ1 = xA^U, γ2 = xB^U, γ3 = xC^U, φ1 = yA^U, φ2 = yB^U, φ3 = yC^U, and set u = (α1 − α2, β1 − β2, γ1 − γ2, φ1 − φ2) and v = (α2 − α3, β2 − β3, γ2 − γ3, φ2 − φ3). By formula (12), 2 d(A, B) = ||u||, 2 d(B, C) = ||v|| and 2 d(A, C) = ||u + v||, where ||·|| is the Euclidean norm on R⁴. Then

  (d(A, B) + d(B, C))² − d²(A, C) = [ (||u|| + ||v||)² − ||u + v||² ] / 4 = [ ||u|| ||v|| − ⟨u, v⟩ ] / 2 ≥ 0

by the Cauchy–Schwarz inequality, and therefore d(A, B) + d(B, C) ≥ d(A, C).

2.4 Utilizing the interval-valued trapezoidal fuzzy numbers to represent the linguistic terms

In the real decision-making process, it is difficult for the decision makers to give the attribute values and weights directly in the form of generalized interval-valued trapezoidal fuzzy numbers, so the form of linguistic terms is usually adopted instead. Wei and Chen [18] utilize the interval-valued trapezoidal fuzzy numbers to represent a 9-member linguistic term set
(shown in Table 1).

Table 1: A 9-member interval linguistic term set.

linguistic terms (attribute values) | linguistic terms (weights) | generalized interval-valued trapezoidal fuzzy numbers
Absolutely-poor (AP)  | Absolutely-low (AL)  | [(0.00, 0.00, 0.00, 0.00; 0.8), (0.00, 0.00, 0.00, 0.00; 1.0)]
Very-poor (VP)        | Very-low (VL)        | [(0.00, 0.00, 0.02, 0.07; 0.8), (0.00, 0.00, 0.02, 0.07; 1.0)]
Poor (P)              | Low (L)              | [(0.04, 0.10, 0.18, 0.23; 0.8), (0.04, 0.10, 0.18, 0.23; 1.0)]
Medium-poor (MP)      | Medium-low (ML)      | [(0.17, 0.22, 0.36, 0.42; 0.8), (0.17, 0.22, 0.36, 0.42; 1.0)]
Medium (F)            | Medium (M)           | [(0.32, 0.41, 0.58, 0.65; 0.8), (0.32, 0.41, 0.58, 0.65; 1.0)]
Medium-good (MG)      | Medium-high (MH)     | [(0.58, 0.63, 0.80, 0.86; 0.8), (0.58, 0.63, 0.80, 0.86; 1.0)]
Good (G)              | High (H)             | [(0.72, 0.78, 0.92, 0.97; 0.8), (0.72, 0.78, 0.92, 0.97; 1.0)]
Very-good (VG)        | Very-high (VH)       | [(0.93, 0.98, 1.00, 1.00; 0.8), (0.93, 0.98, 1.00, 1.00; 1.0)]
Absolutely-good (AG)  | Absolutely-high (AH) | [(1.00, 1.00, 1.00, 1.00; 0.8), (1.00, 1.00, 1.00, 1.00; 1.0)]

3 Group decision making method

3.1 The description of the decision making problems

Let E = {e1, e2, ..., eq} be the set of decision makers in the group decision making, A = {A1, A2, ..., Am} the set of alternatives, and C = {C1, C2, ..., Cn} the set of attributes. Suppose that aijk = [(aijk1^L, aijk2^L, aijk3^L, aijk4^L; wijk^L), (aijk1^U, aijk2^U, aijk3^U, aijk4^U; wijk^U)] is the attribute value given by decision maker ek, where aijk is an interval-valued trapezoidal fuzzy number for alternative Ai with respect to attribute Cj, and ωjk = [(ωjk1^L, ωjk2^L, ωjk3^L, ωjk4^L; vjk^L), (ωjk1^U, ωjk2^U, ωjk3^U, ωjk4^U; vjk^U)] is the attribute weight given by decision maker ek, where ωjk is also an interval-valued trapezoidal fuzzy number. Let λ = (λ1, λ2, ..., λq) be the weight vector of the decision makers, where each λk is a real number and the λk sum to 1. We then use the attribute weights, the decision makers' weights and the attribute values to rank the alternatives.
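Table 1 can be transcribed directly into a lookup table; the sketch below (Python, with our own names `SCALE` and `WEIGHT_SCALE`) stores each linguistic term as a (lower, upper) pair of (a1, a2, a3, a4, w) tuples:

```python
# Transcription of Table 1: each linguistic term maps to an
# interval-valued trapezoidal fuzzy number whose lower and upper
# trapezoids share the endpoints and have heights 0.8 and 1.0.
def _iv(a1, a2, a3, a4):
    return ((a1, a2, a3, a4, 0.8), (a1, a2, a3, a4, 1.0))

SCALE = {
    "AP": _iv(0.00, 0.00, 0.00, 0.00), "VP": _iv(0.00, 0.00, 0.02, 0.07),
    "P":  _iv(0.04, 0.10, 0.18, 0.23), "MP": _iv(0.17, 0.22, 0.36, 0.42),
    "F":  _iv(0.32, 0.41, 0.58, 0.65), "MG": _iv(0.58, 0.63, 0.80, 0.86),
    "G":  _iv(0.72, 0.78, 0.92, 0.97), "VG": _iv(0.93, 0.98, 1.00, 1.00),
    "AG": _iv(1.00, 1.00, 1.00, 1.00),
}
# The weight terms share the same numeric values, in the same order.
WEIGHT_SCALE = dict(zip(
    ["AL", "VL", "L", "ML", "M", "MH", "H", "VH", "AH"], SCALE.values()))
```

Looking up, e.g., `SCALE["VG"]` or `WEIGHT_SCALE["VH"]` returns the pair [(0.93, 0.98, 1.00, 1.00; 0.8), (0.93, 0.98, 1.00, 1.00; 1.0)] from the last-but-one row of Table 1.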
3.2 Normalize the decision-making information

We need to normalize the decision-making information in order to eliminate the impact of different physical dimensions on the decision-making result. There are generally benefit attributes (I1) and cost attributes (I2). The normalization method is as follows.

For benefit attributes (j ∈ I1), with mjk = max_i(aijk4^U):

  xijk = [(aijk1^L/mjk, aijk2^L/mjk, aijk3^L/mjk, aijk4^L/mjk; wijk^L), (aijk1^U/mjk, aijk2^U/mjk, aijk3^U/mjk, aijk4^U/mjk; wijk^U)]      (13)

For cost attributes (j ∈ I2), with njk = min_i(aijk1^L):

  xijk = [(njk/aijk4^L, njk/aijk3^L, njk/aijk2^L, njk/aijk1^L; wijk^L), (njk/aijk4^U, njk/aijk3^U, njk/aijk2^U, njk/aijk1^U; wijk^U)]      (14)

3.3 Combine the evaluation information of each decision maker

The collective attribute values and weights are obtained from the attribute values and weights given by the different experts for the different alternatives under each attribute, weighted by the decision makers' weights:

  xij = Σ_{k=1}^{q} (λk xijk) = [(Σ_k λk xijk1^L, Σ_k λk xijk2^L, Σ_k λk xijk3^L, Σ_k λk xijk4^L; min_k(wijk^L)), (Σ_k λk xijk1^U, Σ_k λk xijk2^U, Σ_k λk xijk3^U, Σ_k λk xijk4^U; min_k(wijk^U))]      (15)

  ωj = Σ_{k=1}^{q} (λk ωjk) = [(Σ_k λk ωjk1^L, Σ_k λk ωjk2^L, Σ_k λk ωjk3^L, Σ_k λk ωjk4^L; min_k(vjk^L)), (Σ_k λk ωjk1^U, Σ_k λk ωjk2^U, Σ_k λk ωjk3^U, Σ_k λk ωjk4^U; min_k(vjk^U))]      (16)
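Formulas (13)-(16) can be sketched as follows (a minimal Python illustration; the names `normalize` and `combine` and the tuple encoding are our own assumptions):

```python
def normalize(column, benefit=True):
    """Normalize one attribute column (formulas (13)/(14)). `column`
    holds, per alternative, a (lower, upper) pair of
    (a1, a2, a3, a4, w) tuples."""
    if benefit:
        m = max(iv[1][3] for iv in column)      # max of the upper a4
        scale = lambda t: (t[0]/m, t[1]/m, t[2]/m, t[3]/m, t[4])
    else:
        n = min(iv[0][0] for iv in column)      # min of the lower a1
        scale = lambda t: (n/t[3], n/t[2], n/t[1], n/t[0], t[4])
    return [(scale(lo), scale(up)) for lo, up in column]

def combine(values, lam):
    """Aggregate one normalized value per decision maker into a
    collective value (formula (15)): weighted sum of the endpoints,
    minimum of the heights."""
    out = []
    for part in (0, 1):                         # lower, then upper
        ends = [sum(l * v[part][i] for l, v in zip(lam, values))
                for i in range(4)]
        w = min(v[part][4] for v in values)
        out.append((*ends, w))
    return tuple(out)
```

The same `combine` applies unchanged to the attribute weights (formula (16)), since both are interval-valued trapezoidal fuzzy numbers.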
3.4 Construct the weighted matrix

Let V = [vij]m×n be the weighted matrix, where

  vij = xij ⊗ ωj = [(xij1^L ωj1^L, xij2^L ωj2^L, xij3^L ωj3^L, xij4^L ωj4^L; min(wij^L, vj^L)), (xij1^U ωj1^U, xij2^U ωj2^U, xij3^U ωj3^U, xij4^U ωj4^U; min(wij^U, vj^U))]      (17)

3.5 The extended TOPSIS decision making method

(1) Determine the positive ideal solution V⁺ = [vj⁺]1×n and the negative ideal solution V⁻ = [vj⁻]1×n of the evaluation objects:

  vj⁺ = [(max_i(vij1^L), max_i(vij2^L), max_i(vij3^L), max_i(vij4^L); max_i(wij^L)), (max_i(vij1^U), max_i(vij2^U), max_i(vij3^U), max_i(vij4^U); max_i(wij^U))]      (18)

  vj⁻ = [(min_i(vij1^L), min_i(vij2^L), min_i(vij3^L), min_i(vij4^L); min_i(wij^L)), (min_i(vij1^U), min_i(vij2^U), min_i(vij3^U), min_i(vij4^U); min_i(wij^U))]      (19)

(2) Calculate the COG of each element of the weighted matrix and of the ideal solutions: based on formula (6), calculate the COG coordinates [(y, x)V] of each element in the weighted matrix and the COG coordinates [(y, x)V⁺] and [(y, x)V⁻] of each element of the positive ideal solution and the negative ideal solution.

(3) Calculate the weighted distance between each alternative Ai and the ideal solution V⁺ and the negative ideal solution V⁻:

  di⁺ = sqrt( Σ_{j=1}^{n} [d(vij, vj⁺)]² )      (20)

  di⁻ = sqrt( Σ_{j=1}^{n} [d(vij, vj⁻)]² )      (21)

where i = 1, 2, ..., m and j = 1, 2, ..., n.

(4) Calculate the relative closeness coefficient:

  Ci = di⁻ / (di⁺ + di⁻)      (22)

(5) Rank the alternatives. We rank the alternatives based on the relative closeness coefficient: the bigger the relative closeness coefficient, the better the alternative, and vice versa.

4 Illustrative example

Suppose that a telecommunication company intends to choose a manager for the R&D department from four volunteers named A1, A2, A3 and A4. The decision-making committee assesses the four volunteers based on five attributes: (1) the proficiency in identifying research areas (C1), (2) the proficiency in administration (C2), (3) the personality (C3), (4) the past experience (C4) and (5) the self-confidence (C5). The committee has three members, labeled DM1, DM2 and DM3, respectively, and the weight vector of the decision makers is λ = (1/3, 1/3, 1/3). Each decision maker utilizes the linguistic terms to assess the importance of each attribute and to evaluate the four volunteers; the evaluation information is shown in Tables 2, 3, 4 and 5, respectively [17].

Table 2: The attribute weights given by the three DMs.

       C1   C2   C3   C4   C5
DM1    VH   H    H    VH   M
DM2    VH   H    MH   H    MH
DM3    VH   MH   MH   VH   M

Table 3: The evaluation information given by DM1.

       C1   C2   C3   C4   C5
A1     VG   VG   VG   VG   VG
A2     G    VG   VG   VG   MG
A3     VG   MG   G    G    G
A4     G    F    F    G    MG

Table 4: The evaluation information given by DM2.

       C1   C2   C3   C4   C5
A1     G    MG   G    G    VG
A2     G    VG   VG   VG   MG
A3     G    G    MG   VG   G
A4     VG   F    MG   F    G

Table 5: The evaluation information given by DM3.

       C1   C2   C3   C4   C5
A1     MG   F    G    VG   VG
A2     MG   MG   G    MG   G
A3     VG   VG   VG   VG   MG
A4     MG   VG   MG   VG   F

The decision-making steps are shown as follows:

(1) Convert the linguistic terms into interval-valued trapezoidal fuzzy numbers [aijk] by applying the Table 1 correspondence entry by entry to Tables 3-5; for example, every VG entry becomes [(0.93, 0.98, 1.00, 1.00; 0.8), (0.93, 0.98, 1.00, 1.00; 1.0)].
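Formulas (17)-(19) of Sections 3.4 and 3.5 amount to component-wise products and component-wise extrema; a minimal sketch (our own naming, not the paper's code):

```python
def weighted_value(x, w):
    """Formula (17): component-wise product of a normalized value and
    a weight, taking the minimum of the two heights."""
    return tuple(
        (x[p][0]*w[p][0], x[p][1]*w[p][1], x[p][2]*w[p][2],
         x[p][3]*w[p][3], min(x[p][4], w[p][4]))
        for p in (0, 1))

def ideal(column, positive=True):
    """Formulas (18)/(19): component-wise max (positive ideal) or min
    (negative ideal) over one column of the weighted matrix."""
    agg = max if positive else min
    return tuple(
        tuple(agg(v[p][i] for v in column) for i in range(5))
        for p in (0, 1))
```

Applied to the example's first column, `weighted_value` multiplies the collective value of A1 under C1 by the collective weight of C1 (0.743 x 0.930 is approximately 0.691, as printed below), and `ideal` picks the column-wise extremes that form V+ and V-.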
.6 3,0.8 0 .6 3,0.8 0 .98,1 .00 .63,0.80 4 1,0.58 63,0.80 9 8 ,1 .0 0 9 8,1.00 78,0.92 78,0.92 9 8,1.00 63,0.80 9 8 ,1 .0 0 63,0.80 9 8,1.0 0, 9 8 ,1 .0 0 , 9 8,1.0 0, 7 8,0.9 2, 6 3,0.8 0, 4 1,0.58, ,0.8 6 ,0.8 6 ,1 .0 0 ,0.8 6 ,0.6 5 ,0.8 6 ,1 .0 0 ,1 .0 0 ,0.9 7 ,0.9 7 ,1 .0 0 ,0.8 6 ,1 .0 0 ,0.8 6 ,1 .0 0 ,1 .0 0 ,1 .0 0 , 0.97, 0.86, 0.65, ;1 .00)] ;1 .00)] ;1 .00)] ;1 .00)] 1 .00)] 1 .00)] 1 .00)] 1 .00)] 1 .00)] 1 .00)] 1 .00)] 1 .00)] 1 .00)] 1 .00)] 1 .00)] 1 .00)] 1.0)] 1.0)] 1.0)] 1.0)] (2) Combine the individual preferences in order to obtain a collective preference value of each alternative: [ [(0.743,0.797,0.907,0.943;0.800),(0.743,0.797,0.907,0.943;1.000)] [(0.673,0.730,0.880,0.933;0.800),(0.673,0.730,0.880,0.933;1.000)] [(0.860,0.913,0.973,0.990;0.800),(0.860,0.913,0.973,0.990;1.000)] [(0.743,0.797,0.907,0.943;0.800),(0.743,0.797,0.907,0.943;1.000)] [(0.610,0.673,0.793,0.837;0.800),(0.610,0.673,0.793,0.837;1.000)] [(0.813,0.863,0.933,0.953;0.800),(0.813,0.863,0.933,0.953;1.000)] [(0.743,0.797,0.907,0.943 ;0.800),(0.743,0.797,0.907,0.943;1.000)] [(0.523,0.600,0.720,0.767;0.800),(0.523,0.600,0.720,0.767;1.000)] [(0.790,0.847,0.947,0.980;0.800),(0.790,0.847,0.947,0.980;1.000)] [(0.860,0.913,0.973,0.990;0.800),(0.860,0.913,0.973,0.990;1.000)] [(0.743,0.797,0.907,0.943 ;0.800),(0.743,0.797,0.907,0.943;1.000)] [(0.493,0.557,0.727,0.790;0.800),(0.493,0.557,0.727,0.790;1.000)] [(0.860,0.913,0.973,0.990;0.800),(0.860,0.913,0.973,0.990;1.000)] [(0.813,0.863,0.933,0.953;0.800),(0.813,0.863,0.933,0.953;1.000)] [(0.860,0.9 1 3,0.973,0.990;0.800),(0.860,0.9 1 3,0.973,0.990;1.0 00)] [(0.657,0.723,0.833,0.873;0.800),(0.657,0.723,0.833,0.873;1.000)] [(0.930,0.980,1 .000,1 .000;0.800),(0.930,0.980,1 .000,1 .000;1 .000)] [(0.627,0.680,0.840,0.897;0.800),(0.627,0.680,0.840,0.897;1.000)] [(0.673,0.730,0.880,0.933;0.800),(0.673,0.730,0.880,0.933;1.000)] [(0.540,0.607,0.767,0.827;0.800),(0.540,0.607,0.767,0.827;1.000)] 
The attribute weight vector is:
[[(0.930,0.980,1.000,1.000;0.800),(0.930,0.980,1.000,1.000;1.000)], [(0.627,0.680,0.840,0.897;0.800),(0.627,0.680,0.840,0.897;1.000)], [(0.627,0.680,0.840,0.897;0.800),(0.627,0.680,0.840,0.897;1.000)], [(0.930,0.980,1.000,1.000;0.800),(0.930,0.980,1.000,1.000;1.000)], [(0.320,0.410,0.580,0.650;0.800),(0.320,0.410,0.580,0.650;1.000)]]
(3) Calculate the weighted decision making matrix:
[(0.691,0.781,0.907,0.943;0.800),(0.691,0.781,0.907,0.943;1.000)]
[(0.626,0.715,0.880,0.933;0.800),(0.626,0.715,0.880,0.933;1.000)]
[(0.800,0.895,0.973,0.990;0.800),(0.800,0.895,0.973,0.990;1.000)]
[(0.691,0.781,0.907,0.943;0.800),(0.691,0.781,0.907,0.943;1.000)]
[(0.382,0.458,0.666,0.750;0.800),(0.382,0.458,0.666,0.750;1.000)]
[(0.510,0.587,0.784,0.855;0.800),(0.510,0.587,0.784,0.855;1.000)]
[(0.466,0.542,0.762,0.846;0.800),(0.466,0.542,0.762,0.846;1.000)]
[(0.328,0.408,0.605,0.687;0.800),(0.328,0.408,0.605,0.687;1.000)]
[(0.495,0.576,0.795,0.879;0.800),(0.495,0.576,0.795,0.879;1.000)]
[(0.539,0.621,0.818,0.888;0.800),(0.539,0.621,0.818,0.888;1.000)]
[(0.466,0.542,0.762,0.846;0.800),(0.466,0.542,0.762,0.846;1.000)]
[(0.309,0.379,0.610,0.708;0.800),(0.309,0.379,0.610,0.708;1.000)]
[(0.800,0.895,0.973,0.990;0.800),(0.800,0.895,0.973,0.990;1.000)]
[(0.756,0.846,0.933,0.953;0.800),(0.756,0.846,0.933,0.953;1.000)]
[(0.800,0.895,0.973,0.990;0.800),(0.800,0.895,0.973,0.990;1.000)]
[(0.611,0.709,0.833,0.873;0.800),(0.611,0.709,0.833,0.873;1.000)]
[(0.298,0.402,0.580,0.650;0.800),(0.298,0.402,0.580,0.650;1.000)]
[(0.201,0.279,0.487,0.583;0.800),(0.201,0.279,0.487,0.583;1.000)]
[(0.215,0.299,0.510,0.607;0.800),(0.215,0.299,0.510,0.607;1.000)]
[(0.173,0.249,0.445,0.537;0.800),(0.173,0.249,0.445,0.537;1.000)]
(4) Determine the positive ideal solution and the negative ideal solution:
[[(0.800,0.895,0.973,0.990;0.800),(0.800,0.895,0.973,0.990;1.000)],
[(0.510,0.587,0.784,0.855;0.800),(0.510,0.587,0.784,0.855;1.000)], [(0.539,0.621,0.818,0.888;0.800),(0.539,0.621,0.818,0.888;1.000)], [(0.800,0.895,0.973,0.990;0.800),(0.800,0.895,0.973,0.990;1.000)], [(0.298,0.402,0.580,0.650;0.800),(0.298,0.402,0.580,0.650;1.000)]]
[[(0.626,0.715,0.880,0.933;0.800),(0.626,0.715,0.880,0.933;1.000)], [(0.328,0.408,0.605,0.687;0.800),(0.328,0.408,0.605,0.687;1.000)], [(0.309,0.379,0.610,0.708;0.800),(0.309,0.379,0.610,0.708;1.000)], [(0.611,0.709,0.833,0.873;0.800),(0.611,0.709,0.833,0.873;1.000)], [(0.173,0.249,0.445,0.537;0.800),(0.173,0.249,0.445,0.537;1.000)]]
(5) Calculate the weighted matrix and the COG of each attribute with respect to the positive ideal solution and the negative ideal solution, with entries of the form (y, x):
V = [(y, x)_ij]_{4x5} =
[[(0.3333,0.8283),(0.4166,0.8283)], [(0.3422,0.5645),(0.4278,0.5645)], [(0.3429,0.6863),(0.4287,0.6863)], [(0.3215,0.9107),(0.4019,0.9107)], [(0.3341,0.4809),(0.4176,0.4809)]]
[[(0.3381,0.7873),(0.4227,0.7873)], [(0.3427,0.6837),(0.4284,0.6837)], [(0.3418,0.7159),(0.4273,0.7159)], [(0.3258,0.8691),(0.4072,0.8691)], [(0.3393,0.3880),(0.4242,0.3880)]]
[[(0.3215,0.9107),(0.4019,0.9107)], [(0.3438,0.6540),(0.4298,0.6540)], [(0.3438,0.6540),(0.4298,0.6540)], [(0.3215,0.9107),(0.4019,0.9107)], [(0.3386,0.4084),(0.4233,0.4084)]]
[[(0.3333,0.8283),(0.4166,0.8283)], [(0.3397,0.5071),(0.4246,0.5071)], [(0.3441,0.5026),(0.4301,0.5026)], [(0.3299,0.7540),(0.4123,0.7540)], [(0.3383,0.3515),(0.4229,0.3515)]]
V+ = [[(0.3215,0.9107),(0.4019,0.9107)], [(0.3427,0.6837),(0.4284,0.6837)], [(0.3418,0.7159),(0.4273,0.7159)], [(0.3215,0.9107),(0.4019,0.9107)], [(0.3341,0.4809),(0.4176,0.4809)]]
V- = [[(0.3381,0.7873),(0.4227,0.7873)], [(0.3397,0.5071),(0.4246,0.5071)], [(0.3441,0.5026),(0.4301,0.5026)], [(0.3299,0.7540),(0.4123,0.7540)], [(0.3383,0.3515),(0.4229,0.3515)]]
(6) Calculate the weighted distance between each project Ai and the ideal solution and the negative ideal solution:
d+ = (0.1050, 0.1140, 0.0707, 0.2500)
d- = (0.2002,
0.2136, 0.2097, 0.0292)
(7) Calculate the relative closeness coefficient: C = (0.6560, 0.6520, 0.7479, 0.1046).
(8) Rank the alternatives: based on the relative closeness coefficients, the order of the alternatives is a3 > a1 > a2 > a4.
(9) Analysis: In this example, our approach produces the same ranking as the literature [17], which supports the effectiveness of the approach presented in this paper. Compared with [7-11], all of which utilize the TOPSIS method to deal with decision making problems under a fuzzy information environment, the method in this paper can deal with more complex decision making problems under the generalized interval-valued trapezoidal fuzzy information environment. Compared with [17], this method solves the FMADM problem based on generalized interval-valued trapezoidal fuzzy information, whereas [17] solves it based on interval-valued triangular fuzzy numbers. Regarding the decision making method, [17] first uses the lower and upper limits of the interval-valued triangular fuzzy numbers to calculate the relative closeness coefficient based on the TOPSIS method, and then uses the mean closeness coefficient to rank the alternatives, so that method does not treat the interval-valued triangular fuzzy numbers as a whole; in this study, we propose an extended TOPSIS based on the definitions of the distance and the comparison method between generalized interval-valued trapezoidal fuzzy numbers. Compared with [18], both address decision making problems based on generalized interval-valued trapezoidal fuzzy numbers; however, [18] ranks the alternatives based on a similarity that is hard to calculate, whereas the method proposed in this paper is easy to compute.
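The closeness computation of steps (6)-(8) can be checked numerically. A minimal sketch, using the distance vectors d+ and d- reported in step (6); the closeness formula C_i = d_i^- / (d_i^+ + d_i^-) is the standard TOPSIS definition:

```python
# Relative closeness coefficient of each alternative (TOPSIS step 7),
# computed from the weighted distances of step (6).
d_plus = [0.1050, 0.1140, 0.0707, 0.2500]
d_minus = [0.2002, 0.2136, 0.2097, 0.0292]

closeness = [dm / (dp + dm) for dp, dm in zip(d_plus, d_minus)]

# Step (8): rank alternatives by decreasing closeness coefficient.
ranking = sorted(range(1, 5), key=lambda i: closeness[i - 1], reverse=True)

print([round(c, 4) for c in closeness])  # [0.656, 0.652, 0.7479, 0.1046]
print(ranking)                           # [3, 1, 2, 4], i.e. a3 > a1 > a2 > a4
```

The output reproduces the coefficients and the ranking a3 > a1 > a2 > a4 given in the example.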
5 Conclusion Fuzzy multiple attribute decision making (FMADM) is widely used in various areas. Generalized interval-valued trapezoidal fuzzy numbers can precisely express the attribute values and weights arising in the decision making process. This study proposes an extended TOPSIS method, together with its decision making steps, for solving MADM problems in which the attribute weights and values are given in the form of GIVTFN. The method is simple and easy to understand. It enriches and develops the theory and methods of FMADM, and it offers a new idea for solving FMADM problems. Acknowledgement This paper is supported by the Humanities and Social Sciences Research Project of the Ministry of Education of China (No. 09YJA630088) and the Natural Science Foundation of Shandong Province (No. ZR2009HL022). The authors would also like to express their appreciation to the anonymous reviewers for their very helpful comments on improving the paper. References [1] Bellman R.E., Zadeh L.A. (1970). Decision-making in a fuzzy environment, Management Science, 17(4), pp. B141-B164. [2] Hwang C.L., Yoon K. (1981). Multiple Attributes Decision Making Methods and Applications, Springer, Berlin Heidelberg. [3] Zavadskas E.K., Kaklauskas A., Turskis Z. (2009). Multi-Attribute Decision-Making Model by Applying Grey Numbers. Informatica, 20(2), pp. 305-320. [4] Liu P.D. (2009a). Multi-Attribute Decision-Making Method Research Based on Interval Vague Set and TOPSIS Method. Technological and Economic Development of Economy, 15(3), pp. 453-463. [5] Liu P.D. (2009b). A novel method for hybrid multiple attribute decision making. Knowledge-Based Systems, 22(5), pp. 388-391. [6] Xu Zeshui (2007). Methods for aggregating interval-valued intuitionistic fuzzy information and their application to decision making. Control and Decision, 22(2), pp. 215-219. [7] Jahanshahloo G.R., Lotfi F.H., Izadikhah M.
(2006). An algorithmic method to extend TOPSIS for decision-making problems with interval data, Applied Mathematics and Computation, 175(2), pp. 1375-1384. [8] Wang Ying-Ming, Elhag Taha M.S. (2006). Fuzzy TOPSIS method based on alpha level sets with an application to bridge risk assessment, Expert Systems with Applications, 31(2), pp. 309-319. [9] Liu Weijun, Zeng Ling (2008). A new TOPSIS method for fuzzy multiple attribute group decision making problem, Journal of Guilin University of Electronic Technology, 28(1), pp. 59-62. [10] Tsaur S.H., Chang T.Y., Yen C.H. (2002). The evaluation of airline service quality by fuzzy MCDM, Tourism Management, 23, pp. 107-115. [11] Chu T.C., Lin Y.C. (2003). A fuzzy TOPSIS method for robot selection, The International Journal of Advanced Manufacturing Technology, 21, pp. 284-290. [12] Gorzalczany M.B. (1987). A method of inference in approximate reasoning based on interval-valued fuzzy sets, Fuzzy Sets and Systems, 21(1), pp. 1-17. [13] Turksen I.B. (1996). Interval-valued strict preference with Zadeh triples, Fuzzy Sets and Systems, 78(2), pp. 183-195. [14] Wang G., Li X. (1998). The applications of interval-valued fuzzy numbers and interval-distribution numbers, Fuzzy Sets and Systems, 98(3), pp. 331-335. [15] Wang G., Li X. (2001). Correlation and information energy of interval-valued fuzzy numbers, Fuzzy Sets and Systems, 103(1), pp. 169-175. [16] Hong D.H., Lee S. (2002). Some algebraic properties and a distance measure for interval-valued fuzzy numbers, Information Sciences, 148(1), pp. 1-10. [17] Ashtiani Behzad, Haghighirad Farzad, Makui Ahmad, et al. (2009). Extension of fuzzy TOPSIS method based on interval-valued fuzzy sets, Applied Soft Computing, 9, pp. 457-461. [18] Wei Shih-Hua, Chen Shyi-Ming (2009). Fuzzy risk analysis based on interval-valued fuzzy numbers, Expert Systems with Applications, 36(2), pp. 2285-2299. [19] Chen S.H. (1985).
Ranking fuzzy numbers with maximizing set and minimizing set, Fuzzy Sets and Systems, 17(2), pp. 113-129. [20] Chen Shyi-Ming, Chen Jim-Ho (2009). Fuzzy risk analysis based on ranking generalized fuzzy numbers with different heights and different spreads, Expert Systems with Applications, 36(3), pp. 6833-6842. [21] Chen S.J., Chen S.M. (2003). A new method for handling multicriteria fuzzy decision-making problems using FN-IOWA operators, Cybernetics and Systems, 34(2), pp. 109-137. Optimal Decision Tree Based Multi-class Support Vector Machine Manju Bala and R. K. Agrawal School of Computer & Systems Sciences, Jawaharlal Nehru University, New Delhi-110 067, India Keywords: support vector machine, decision tree, class separability, information gain, Gini index and chi-squared, interclass scatter, intraclass scatter. Received: August 5, 2010 In this paper, a decision tree SVM architecture is constructed to solve multi-class problems. To maintain high generalization ability, the optimal structure of the decision tree is determined using statistical measures of class separability. The proposed optimal decision tree SVM (ODT-SVM) takes advantage of both the efficient computation of the decision tree architecture and the high classification accuracy of SVM. A robust non-parametric test is carried out for statistical comparison of the proposed ODT-SVM with other classifiers over multiple data sets. Performance is evaluated in terms of classification accuracy and computation time. The statistical analysis on UCI repository datasets indicates that the ten-fold cross-validation accuracy of our proposed framework is significantly better than that of widely used multi-class classifiers. Experimental results and statistical tests show that the proposed ODT-SVM is significantly better than conventional OvO and OAA in terms of both training and testing time. Povzetek: The decision tree method with SVM achieves significantly better results than the original SVM.
1 Introduction The Support Vector Machine (SVM) has proved to be a successful learning machine in the literature, especially for classification. SVM is based on the statistical learning theory developed by Vapnik [6, 25]. Since it was originally designed for binary classification [3], extending the binary SVM to multi-class problems is not straightforward. Constructing k-class SVMs (k > 2) is an ongoing research issue [1, 4]. Two approaches are suggested in the literature to solve multi-class SVM. One considers all data in one optimization [7]. The other decomposes the multi-class problem into a series of binary SVMs, such as "One-Against-All" (OAA) [25] and "One-versus-One" (OvO) [16]. It has been reported in the literature that both conventional OvO and OAA SVMs suffer from the problem of unclassifiable regions [19, 24]. To resolve the unclassifiable region in conventional OvO, a decision tree OvO SVM formulation was proposed [19]. Takahashi and Abe [24] proposed a class separability measure, i.e. the Euclidean distance between class centers, to construct a decision tree based OAA SVM that overcomes the unclassifiable region. In the literature, a large number of distance measures other than the Euclidean distance have been used to determine class separability, each with its own advantages and disadvantages. More realistic and effective statistical measures used in the literature are information gain, Gini index, chi-square, and scatter-matrix-based class separability in kernel-induced space. In this paper, we evaluate the performance, in terms of classification accuracy and computation time, of the proposed OvO ODT-SVM [17] and OAA ODT-SVM [18]. In both models, class separability is determined using statistical measures, i.e. information gain, Gini index, chi-square and scatter-matrix-based class separability in kernel-induced space. A robust non-parametric test is carried out for statistical comparison of the proposed ODT-SVM with other classifiers over multiple data sets.
In section 2 we briefly describe the basics of SVM. In section 3 we discuss the decision tree OvO and OAA SVM approaches. Section 4 presents our proposed ODT-SVM framework using four statistical class separability measures. In section 5, we discuss the theoretical analysis and empirical estimation of the training and testing time of both proposed schemes. The experimental results demonstrate the effectiveness of our ODT-SVMs in comparison to conventional OvO and OAA SVMs. Section 6 concludes. 2 Support vector machines The Support Vector Machine (SVM) is based on the statistical learning theory developed by Vapnik [6, 25]. It classifies data by determining a set of support vectors, which are members of the set of training inputs that outline a hyperplane in feature space [15]. Let {(x_1, y_1), ..., (x_n, y_n)} be a training set, with x_i ∈ R^d and y_i ∈ {-1, +1} the corresponding target class. The basic problem of training an SVM can be formulated as:
Maximize: L(α) = Σ_{i=1}^{n} α_i - (1/2) Σ_{i=1}^{n} Σ_{j=1}^{n} α_i α_j y_i y_j (x_i · x_j)
Subject to: Σ_{i=1}^{n} α_i y_i = 0 and α_i ≥ 0, i = 1, 2, ..., n (1)
The computation of dot products between vectors without explicitly mapping to another space is performed by a kernel function K(x_i, x_j). Use of a kernel function [22] enables the curse of dimensionality to be addressed, and the solution implicitly contains the support vectors that provide a description of the significant data for classification. Substituting K(x_i, x_j) for (x_i · x_j) in eqn. (1) produces a new optimization problem:
Maximize: L(α) = Σ_{i=1}^{n} α_i - (1/2) Σ_{i=1}^{n} Σ_{j=1}^{n} α_i α_j y_i y_j K(x_i, x_j)
Subject to: 0 ≤ α_i ≤ C, i = 1, ..., n and Σ_{i=1}^{n} α_i y_i = 0 (2)
Solving it for α gives a decision function of the form
f(x) = sign(Σ_{i=1}^{n} α_i y_i K(x_i, x) + b) (3)
whose decision boundary is a hyperplane in feature space, which translates to nonlinear boundaries in the original space. 3 Decision tree based SVM The most common way to build a multi-class SVM is by constructing and combining several binary classifiers [14].
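The decision function of eqn. (3) can be sketched directly in code. In the sketch below, the support vectors, coefficients α_i, labels and bias b are made-up toy values, not the output of an actual solver; the Gaussian kernel matches the form used later in the experiments (Table 2):

```python
import math

def gaussian_kernel(x, z, gamma=0.5):
    # K(x, z) = exp(-gamma * |x - z|^2); gamma is a hypothetical value.
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, z)))

def svm_decision(x, support_vectors, alphas, labels, b, kernel=gaussian_kernel):
    # f(x) = sign(sum_i alpha_i * y_i * K(x_i, x) + b), eqn. (3).
    s = sum(a * y * kernel(sv, x)
            for sv, a, y in zip(support_vectors, alphas, labels)) + b
    return 1 if s >= 0 else -1

# Toy example: one support vector per class (hypothetical values).
svs = [(0.0, 0.0), (2.0, 2.0)]
alphas = [1.0, 1.0]
labels = [1, -1]
print(svm_decision((0.1, 0.2), svs, alphas, labels, b=0.0))  # -> 1
```

A real solver would obtain the α_i and b from the dual problem (2); the sketch only illustrates how the kernel expansion replaces an explicit feature mapping.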
To solve multi-class classification problems, the whole classification problem is divided into a number of binary classification problems. The two representative ensemble schemes are OvO and OAA [21]. Conventional OvO SVM has the problem of unclassifiable regions. To resolve the unclassifiable region for OvO SVM, the Decision Directed Acyclic Graph (DDAG) SVM [19], based on decision tree OvO SVM, was proposed in the literature. The authors showed, with an example three-class problem, the existence of unclassifiable regions which can lead to degradation of the generalization ability of the classifier. In general, the unclassifiable region is visible and the generalization ability of the classifier is not good for a k-class problem with k > 2. In the DDAG OvO scheme [19], the VC dimension, the LOO error estimator and Joachims' ξα LOO measure were used for estimating the generalization ability of the pairwise classifier at each level of the decision tree. During training at the top node, the pair of classes with the highest generalization ability is selected from an initial list of classes; two lists are then generated by deleting one class of the pair from the initial list. If the separated classes include plural classes, the same procedure is repeated at the nodes connected to the top node for the two lists, until one class remains in the separated region. This means that after k - 1 steps just one class remains, which therefore becomes the prediction for the current test sample. Gjorgji et al. [13] proposed a binary tree architecture (SVM-BDT) that uses SVMs for making the binary decisions in the nodes, which takes advantage of both the efficient computation of the tree architecture and the high accuracy of SVMs. The hierarchy of binary decision subtasks using SVMs is designed with clustering algorithms. In the SVM-BDT scheme, the classes are divided into two disjoint groups g1 and g2 using the Euclidean distance as the distance measure. The two disjoint groups so obtained are then used to train an SVM classifier in the root node of the decision tree.
The classes from the first and second clustering groups are assigned to the left and right subtrees respectively. This process continues recursively until only one class is left in a group, which defines a leaf of the decision tree. Takahashi and Abe [24] proposed an OAA SVM based decision tree formulation to overcome the problem of unclassifiable regions and improve the generalization ability of SVM. They showed, with an example of unclassifiable regions for a three-class problem, how these regions can lead to degradation of the generalization ability of the classifier. In general, the unclassifiable region is visible and the generalization ability of the classifier is not good for a k-class problem with k > 2. In the scheme proposed by Takahashi and Abe [24], a hyperplane that separates one class from the others is determined during training at the top node. If the separated classes include plural classes, a hyperplane separating those classes is determined at the node connected to the top node. This process is repeated until one class remains in the separated region, which resolves the problem of unclassifiable regions that exists in OAA SVM. They proposed different types of decision trees based on a class separability measure, i.e. the Euclidean distance between class centers. 4 Proposed decision tree SVMs framework using statistical measures The Euclidean distance measure used in the construction of decision trees (i.e. OvO SVM-BDT and the Takahashi and Abe [24] OAA SVM formulation) does not take into account the within-class variability of patterns. Hence, it may not be suitable for measuring the separability between two different classes of patterns. To obtain a better picture of the overlap of the subspaces occupied by the individual classes, statistical distance measures, which constitute a natural concept of measuring class separability, are employed by the pattern recognition community.
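The Euclidean class-center measure criticized above can be sketched as follows; the toy class samples are hypothetical. Note that the measure depends only on the class means, so two class pairs with equal center distances but very different spreads score identically, which motivates the statistical measures of section 4.1:

```python
def class_center(points):
    # Mean vector of one class's training points.
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))

def center_distance(class_a, class_b):
    # Euclidean distance between class centers, as used by SVM-BDT and the
    # Takahashi-Abe decision trees; within-class spread is ignored entirely.
    ca, cb = class_center(class_a), class_center(class_b)
    return sum((a - b) ** 2 for a, b in zip(ca, cb)) ** 0.5

tight = [(0.9, 1.0), (1.1, 1.0)]   # compact class around (1, 1)
wide = [(3.0, 0.0), (5.0, 2.0)]    # spread-out class around (4, 1)
print(center_distance(tight, wide))  # -> 3.0
```

Replacing `wide` with an equally spread or an arbitrarily compact class around the same center leaves the score unchanged, illustrating the limitation.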
4.1 Statistical class separability measures Among statistical measures, information gain (IG) [20] is a measure based on entropy [23], which indicates the degree of disorder of a system. It measures the reduction in the weighted average impurity of the partitions, compared with the impurity of the complete set of samples, when the value of a specific attribute is known. Thus, the value of IG signifies how the whole system is related to an attribute. IG is calculated as
IG(C|E) = H(C) - H(C|E) (4)
where IG(C|E) is the information gain of the label C for a given attribute E, H(C) is the system's entropy and H(C|E) is the system's relative entropy when the value of the attribute E is known. The system's entropy indicates its degree of disorder and is given by
H(C) = -Σ_{i=1}^{k} p(C_i) log(p(C_i)) (5)
where p(C_i) is the probability of class C_i. The relative entropy is calculated as
H(C|E) = Σ_j p(e_j) (-Σ_{i=1}^{k} p(C_i|e_j) log p(C_i|e_j)) (6)
where p(e_j) is the probability of value j for attribute E and p(C_i|e_j) is the probability of C_i given e_j. The optimal binary SVM model is selected on the basis of the maximum value of IG, which signifies more separability between patterns belonging to the two different classes. IG for a given independent binary SVM, containing n_i elements of C_i and n_j elements of C_j, can be calculated as
IG(i,j) = H(C_j, C_i) - [p(C_j) H(t_p, f_p) + p(C_i) H(f_n, t_n)] (7)
where
H(x, y) = -(x/(x+y)) log(x/(x+y)) - (y/(x+y)) log(y/(x+y)) (8)
p(C_j) = n_j/(n_i + n_j) and p(C_i) = n_i/(n_i + n_j) (9)
and t_p, f_p, t_n and f_n denote the number of true positive, false positive, true negative and false negative data points respectively. A higher value of IG signifies less overlap, or more distance, between the two classes of data points. Hence, IG is a natural measure of the class separability of different classes of data points.
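Eqns. (7)-(9) can be sketched in a few lines; the split counts t_p, f_p, t_n, f_n below are hypothetical, and the mapping of counts to class sizes (n_j = t_p + f_n, n_i = f_p + t_n) is an assumption consistent with the definitions above. Base-2 logarithms are used:

```python
import math

def entropy2(x, y):
    # H(x, y) of eqn. (8); a zero count contributes zero by convention.
    total = x + y
    h = 0.0
    for c in (x, y):
        if c:
            p = c / total
            h -= p * math.log2(p)
    return h

def info_gain(tp, fp, tn, fn):
    # IG(i, j) = H(nj, ni) - [p(Cj)H(tp, fp) + p(Ci)H(fn, tn)], eqn. (7),
    # with the class priors p(Cj), p(Ci) of eqn. (9).
    nj, ni = tp + fn, fp + tn   # assumed class sizes from the counts
    n = ni + nj
    p_j, p_i = nj / n, ni / n
    return entropy2(nj, ni) - (p_j * entropy2(tp, fp) + p_i * entropy2(fn, tn))

print(info_gain(tp=40, fp=5, tn=45, fn=10))   # imperfect split: 0 < IG < 1
print(info_gain(tp=50, fp=0, tn=50, fn=0))    # perfect split: IG = 1.0
```

A hyperplane that separates the two classes perfectly attains the maximum gain, so selecting the class pair with the largest IG picks the most separable pair.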
Similarly, for every independent binary OAA SVM, assume there are two classes of data, C_i and C_{j≠i}, and the training set D contains n_i elements of C_i and n_j elements of class C_{j≠i}. IG for a given OAA SVM model i can be calculated as
IG(i) = H(C_i, C_{j≠i}) - [p(C_i) H(t_p, f_p) + p(C_{j≠i}) H(f_n, t_n)] (10)
The Gini index is another popular measure for feature selection in the field of data mining, proposed by Breiman et al. [5]. It measures the impurity of a given set of training data D and can be calculated as
Gini(D) = 1 - Σ_{i=1}^{2} (p(C_i))^2 (11)
For a binary split, a weighted sum of the impurity of each resulting partition is computed. The reduction in impurity that would be incurred by a particular binary split in a binary SVM between two classes C_j and C_i is calculated as
ΔGini(i,j) = Gini(D) - Gini_{ij}(D) (12)
where
Gini_{ij}(D) = p(C_j) Gini(L) + p(C_i) Gini(R) (13)
and Gini(L) is the Gini index on the left side of the hyperplane and Gini(R) is that on the right side. The OvO SVM model between the class pair (i_k, j_k) that maximizes the reduction in impurity (i.e. the Gini index) is selected as the splitting node of the decision tree SVM at a particular level. Similarly, for every independent binary OAA model, assume there are two classes of data, C_i and C_{j≠i}. The reduction in impurity that would be incurred by a particular binary split is given by
ΔGini(i) = Gini(D) - Gini_i(D) (14)
where
Gini_i(D) = p(C_i) Gini(L) + p(C_{j≠i}) Gini(R) (15)
Chi-square [9], another criterion used for binary splits in data mining and machine learning, is a statistical test in which the sampling distribution of the test statistic is a chi-square distribution when the null hypothesis is true. We are interested in determining whether a particular decision rule is useful or informative. In this case, the null hypothesis is a random rule that would place t_p data points from C_i and f_p data points from C_j independently in the left branch of the decision tree, and the remainder in the right branch.
The candidate decision rule differs significantly from the random rule if the proportions differ significantly from those given by the random rule. The chi-square statistic is given by
χ² = g(t_p, (t_p + f_p) p(C_j)) + g(f_n, (f_n + t_n) p(C_j)) + g(f_p, (t_p + f_p) p(C_i)) + g(t_n, (f_n + t_n) p(C_i)) (16)
where g(count, expect) = (count - expect)² / expect.
The higher the value of χ², the less likely it is that the null hypothesis is true. Thus, for a sufficiently high χ², the difference between the expected and the observed distributions is statistically significant; one can reject the null hypothesis and consider the candidate rule informative. Hence, the OvO SVM model for class pair (i_k, j_k), or the OAA SVM model i, that maximizes χ² is selected as the splitting node of the ODT-SVM at a particular level. To measure the class variability of patterns, the ratio of the inter-class and intra-class scatter in the kernel-induced feature space can also be used; it better depicts the physical relationship of the data in the input space and thereby provides high generalization ability of the decision tree based classifier. The scatter-matrix based measure (S) of the training set D in the original space [10] is defined as
S = tr(S_b) / tr(S_w) (17)
where S_b is the between-class scatter matrix and S_w is the within-class scatter matrix, defined as
S_b = (m_i - m_j)(m_i - m_j)^T, where m_i = (1/n_i) Σ_{x∈C_i} x and m_j = (1/n_j) Σ_{x∈C_j} x (18)
represent the mean vectors of the data from class C_i and class C_j respectively, and
S_w = Q_i + Q_j (19)
where Q_i and Q_j are given as
Q_i = (1/n_i) Σ_{x∈C_i} (x - m_i)(x - m_i)^T (20)
Q_j = (1/(n - n_i)) Σ_{x∉C_i} (x - m_j)(x - m_j)^T (21)
Using the kernel trick [7], the data points from C_i and C_j are implicitly mapped from R^d to a high dimensional feature space K. Let Φ(·): R^d → K denote the mapping and k_θ(x_i, x_j) = ⟨Φ(x_i), Φ(x_j)⟩ denote the kernel function, where θ is the set of kernel parameters and ⟨·,·⟩ is the inner product. K denotes the kernel matrix, and K_ij is defined as k_θ(x_i, x_j).
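The chi-square criterion of eqn. (16) can be sketched as follows; the split counts are hypothetical, and the class priors are taken from the counts as in eqn. (9):

```python
def g(count, expect):
    # g(count, expect) = (count - expect)^2 / expect, as in eqn. (16).
    return (count - expect) ** 2 / expect

def chi_square(tp, fp, tn, fn):
    # Eqn. (16): compare the observed split counts with those of a random
    # rule that distributes points according to the class priors.
    nj, ni = tp + fn, fp + tn   # assumed class sizes from the counts
    n = ni + nj
    p_j, p_i = nj / n, ni / n
    return (g(tp, (tp + fp) * p_j) + g(fn, (fn + tn) * p_j)
            + g(fp, (tp + fp) * p_i) + g(tn, (fn + tn) * p_i))

# A rule that matches the random split exactly scores 0 (uninformative);
# a cleaner separation scores higher.
print(chi_square(tp=25, fp=25, tn=25, fn=25))  # -> 0.0
print(chi_square(tp=45, fp=5, tn=45, fn=5))    # -> 64.0
```

As the text notes, the split (OvO class pair or OAA model) maximizing this statistic is the most informative candidate for the current tree node.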
Let K_{A,B} be the kernel matrix computed with the data points from A and B, which denote two subsets of the training sample set D. Let S_b^Φ and S_w^Φ denote the between-class scatter matrix and the within-class scatter matrix in K, respectively, defined as follows:
S_b^Φ = Σ_{i=1}^{2} n_i (m_i^Φ - m^Φ)(m_i^Φ - m^Φ)^T (22)
S_w^Φ = Σ_{i=1}^{2} Σ_{x∈D_i} (Φ(x) - m_i^Φ)(Φ(x) - m_i^Φ)^T (23)
where m_i^Φ denotes the mean of the training data points from D_i and m^Φ is the mean of all the training data points in K. Let F be a vector whose elements are all 1; its size is decided by the context. Then
m_i^{ΦT} m_i^Φ = n_i^{-2} · F^T K_{D_i,D_i} F (24)
m^{ΦT} m^Φ = n^{-2} · F^T K_{D,D} F (25)
m_i^{ΦT} m^Φ = n_i^{-1} n^{-1} · F^T K_{D_i,D} F (26)
tr(S_b^Φ) = Σ_{i=1}^{2} n_i (F^T K_{D_i,D_i} F / n_i² - 2 F^T K_{D_i,D} F / (n_i n) + F^T K_{D,D} F / n²) (27)
tr(S_w^Φ) = Σ_{i=1}^{2} (Σ_{x∈D_i} k_θ(x, x) - F^T K_{D_i,D_i} F / n_i) (28)
Now the class separability (SC) in the feature space K is obtained as
SC = tr(S_b^Φ) / tr(S_w^Φ) (29)
4.2 Algorithm for construction of ODT-SVM For the construction of the optimal decision tree SVM model (ODT-SVM), one of the class separability measures can be used to determine the structure of the decision tree. For illustration, we consider information gain in the proposed optimal decision tree algorithm. The outline of OvO ODT-SVM using the IG class separability measure for a k-class problem is given below:
1 Generate the initial list {C_1, ..., C_k}.
2 Calculate H(C_j, C_i) using eqn. (8) for i, j = 1, ..., k and i ≠ j.
3 Calculate H(t_p, f_p), H(f_n, t_n), p(C_i) and p(C_j) using eqn. (8) and eqn. (9) respectively.
4 Compute IG(i, j) using eqn. (7).
5 Determine the class pair (i, j) for which IG(i, j) takes the maximum value over the list. If the data points belong to class C_i, then delete C_j from the list; otherwise delete class C_i.
6 If more than one class remains, repeat steps 2-5; otherwise terminate the algorithm.
Similar computational steps are followed for the other three measures to determine the structure of OvO ODT-SVM. Similarly, the outline of OAA ODT-SVM using the IG class separability measure for a k-class problem is given below:
1 Generate the initial list {C_1, ..., C_k}.
2 Calculate H(C_i, C_{j≠i}) using eqn. (8) for i = 1, ..., k.
3 Calculate H(t_p, f_p), H(f_n, t_n), p(C_i) and p(C_{j≠i}) using eqn. (8) and eqn. (9) respectively.
4 Compute IG(i) using eqn. (10).
5 Determine the model i for which IG(i) takes the maximum value over the list.
6 If the remaining set is still multi-class, repeat steps 2-5; otherwise terminate the algorithm.
Similar computational steps are followed for the other three measures to determine the structure of OAA ODT-SVM. 4.3 Time complexity In order to compute the time complexity of the training phase of OvO SVM, we assume without loss of generality that the number of data points in each class is approximately the same, i.e. n/k. To solve a k-class problem using conventional OvO, k(k-1)/2 binary SVM classifiers are developed. Assuming the time complexity of building an SVM with n data points and d features is O(n²d), it can be shown that the training time of conventional OvO is
T^train_OvO = (k(k-1)/2)(2n/k)² d = O(n²d)
In the worst case, the decision tree generated in SVM-BDT is skewed, if the classes in the two groups at every level are divided into uneven sizes. Under the assumption that group g1 contains only one class and group g2 the remaining classes, or vice versa, the decision tree so generated is of depth (k-1) for a k-class problem. Hence, the training time of SVM-BDT is given by
T^train_SVM-BDT-worst = n²d + (n(k-1)/k)² d + ... + (2n/k)² d = O(n²dk) (30)
In the SVM-BDT approach under the best case, the classes in the two groups at every level are divided into approximately equal sizes. The decision tree so generated is almost height balanced, with maximum depth ⌈log(k)⌉. The number of nodes in the decision tree at depth i is 2^{i-1}, each containing n/2^{i-1} data points. Hence, the training time of SVM-BDT in the best case is given by
T^train_SVM-BDT-best = n²d + 2(n/2)² d + 4(n/4)² d + ... = O(n²d) (31)
In general, however, the structure of the OvO decision tree generated using statistical measures is almost height balanced, with maximum depth ⌈log(k)⌉. There are 2^{i-1} nodes at the i-th level, and each node uses (2n/k) data points.
Hence, the training time for OvO ODT-SVM using a statistical measure is
T^train_OvO-ODT-SVM = Σ_{i=1}^{⌈log k⌉} 2^{i-1} (2n/k)² d = (k-1)(2n/k)² d = O(n²d/k) (32)
During the testing phase of conventional OvO, k(k-1)/2 decision functions are to be evaluated; the majority voting is also computed with k(k-1)/2 operations. Hence the testing time for each sample is O(k²). In the worst case the depth of SVM-BDT is (k-1), which requires (k-1) evaluations per test sample; in the best case the depth of SVM-BDT is ⌈log(k)⌉, requiring ⌈log(k)⌉ evaluations. Since the maximum depth of OvO ODT-SVM is ⌈log(k)⌉, its testing time requires ⌈log(k)⌉ operations. According to the above analysis it is evident that the training and testing of OvO ODT-SVM always take less computation time than conventional OvO SVM and SVM-BDT. For a k-class problem, (k-1) hyperplanes are to be calculated in the case of OAA ODT-SVM, whereas k SVM models are developed in the conventional OAA SVM approach, which is more than the number of SVM models required by the decision tree OAA SVM. Assume the time complexity of building an SVM model is O(n²d), where n is the number of data points and d the number of attributes. The overall time complexity of training SVM using the conventional OAA approach is then proportional to kn²d. In order to compute the training time required by OAA ODT-SVM, we again assume that the number of data points in each class is approximately the same, i.e. n/k. The first level of the OAA ODT-SVM model takes time proportional to n²d, while at the second stage the SVM model has (n - n/k) data points. It can be shown that the i-th stage takes time proportional to (n - n(i-1)/k)² d. The time required by the decision tree based SVM is thus given by
T^train_OAA-ODT-SVM = n²d + (n - n/k)² d + (n - 2n/k)² d + ... + (n - n(k-2)/k)² d (33)
Under the above assumptions, the time required for training an OAA ODT-SVM is approximately one third of that of conventional OAA.
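The classifier and evaluation counts underlying this analysis can be tabulated for a small example. A sketch, using only the counts stated in the text (k(k-1)/2 pairwise SVMs for OvO, k models for OAA, k-1 tree nodes for the ODT schemes, and test depth ⌈log2 k⌉ for the balanced OvO tree; the OAA ODT test count is its worst case):

```python
import math

def classifier_counts(k):
    # (binary SVMs trained, decision evaluations per test sample)
    # for each multi-class scheme discussed above.
    return {
        "OvO": (k * (k - 1) // 2, k * (k - 1) // 2),
        "OAA": (k, k),
        "OvO ODT-SVM": (k - 1, math.ceil(math.log2(k))),
        "OAA ODT-SVM (worst case)": (k - 1, k - 1),
    }

for name, (train, test) in classifier_counts(8).items():
    print(f"{name}: {train} trained / {test} evaluated")
```

For k = 8 the tree schemes train 7 binary SVMs versus 28 for conventional OvO, and the balanced OvO tree evaluates only 3 classifiers per test sample.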
In the testing phase, the values of all the hyperplanes need to be determined in the conventional OAA formulation, whereas in OAA ODT-SVM the values of all the hyperplanes need not, in general, be computed. Hence the time complexity of testing is also lower for OAA ODT-SVM than for conventional OAA. 5 Experimental results To evaluate the performance of our proposed ODT-SVM framework using information gain, Gini index, chi-square and scatter-matrix-based class separability in kernel-induced space, we performed experiments on publicly available UCI [2] benchmark datasets. Table 1 describes the datasets used in our experiments. The yeast dataset actually has ten classes; we merged the data of six classes into one class to make it a five-class problem. The kernel functions used in the experiments are given in Table 2. The performance of the classifiers is evaluated in terms of classification accuracy, training time and testing time on each data set. We applied the Friedman test [11], [12], a non-parametric test for the statistical comparison of multiple classifiers over multiple data sets. For each dataset, we rank the competing algorithms: the one that attains the best performance is ranked 1, the second best is ranked 2, and so forth. A method's mean rank is obtained by averaging its ranks across all datasets. Compared to the mean value, the mean rank reduces the susceptibility to outliers [8]. As recommended by Demsar [8], the Friedman test is effective for comparing multiple algorithms across multiple datasets. It compares the mean ranks of the approaches to decide whether to reject the null hypothesis, which states that all the approaches are equivalent and that their mean ranks should therefore be equal. If the Friedman test rejects its null hypothesis, we can proceed with a post hoc test, the Nemenyi test, which is applied to the mean ranks of the competing approaches and indicates which performances differ statistically. Table 1: Description of datasets.
To assess the effectiveness of our proposed OvO ODT-SVM and OAA ODT-SVM, we compared our methods with conventional OvO, SVM-BDT, and conventional OAA SVMs, respectively. We used five kernel functions with C = 1000 and γ ∈ {2^-11, 2^-10, 2^-9, ..., 2^0}. The classification accuracy is determined using ten-fold cross-validation. For a given kernel function and C, we determine the value of γ for which the maximum classification accuracy is achieved. Table 3 and Table 4 show the comparison of maximum classification accuracy between conventional OvO and OvO ODT-SVM, and between conventional OAA and OAA ODT-SVM, respectively. Table 2: Kernel functions. Table 5 shows the comparison of maximum classification accuracy between both ODT-SVM models, conventional OvO, conventional OAA, and multi-class classifiers commonly used in the literature, i.e. C4.5 and the multilayer perceptron (MLP). The C4.5 and MLP implementations in the WEKA machine learning environment [26] are used in our experiments with their default parameter values. The best classification accuracy for each dataset is shown in bold. When we apply the Friedman test with 7 algorithms and 11 datasets, F_F is distributed according to the F distribution with (7-1) = 6 and (7-1)×(11-1) = 60 degrees of freedom. The critical value of F(6,60) at the 0.05 significance level is 2.25. F_F calculated from the mean ranks is 12.31. Since 12.31 > 2.25, we can reject the null hypothesis and infer that a significant difference exists among the competing classifiers.
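The ranking procedure above can be sketched as follows, using the Friedman statistic with the Iman-Davenport F_F correction described by Demsar [8]. The accuracy matrix here is illustrative (rows = datasets, columns = classifiers), not the paper's data:

```python
# Sketch of the Friedman test with the Iman-Davenport F_F correction
# (Demsar, 2006). The accuracy matrix is made up for illustration.
import numpy as np
from scipy import stats

acc = np.array([
    [98.0, 98.0, 98.0, 96.0],
    [91.2, 93.6, 91.7, 85.7],
    [93.3, 96.8, 92.6, 91.0],
    [75.7, 77.6, 72.7, 67.6],
    [61.5, 67.7, 68.6, 56.7],
])
N, k = acc.shape                      # N datasets, k classifiers

# Rank classifiers on each dataset (rank 1 = best accuracy; ties averaged)
ranks = np.apply_along_axis(lambda r: stats.rankdata(-r), 1, acc)
mean_ranks = ranks.mean(axis=0)

# Friedman chi-square, then the Iman-Davenport F_F statistic
chi2 = 12 * N / (k * (k + 1)) * (np.sum(mean_ranks ** 2) - k * (k + 1) ** 2 / 4)
ff = (N - 1) * chi2 / (N * (k - 1) - chi2)
crit = stats.f.ppf(0.95, k - 1, (k - 1) * (N - 1))
print(mean_ranks, round(ff, 2), round(crit, 2))
```

If F_F exceeds the critical value, the null hypothesis of equivalent classifiers is rejected and the Nemenyi post hoc test on the mean ranks can follow, as done in the paper.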
Problem                  #train  #test  #class  #attributes
Iris                     150     11     3       4
Wine                     178     0      3       13
Vehicle                  846     0      4       18
Glass                    214     0      6       9
Segmentation             210     0      7       19
Ecoli                    336     0      8       7
Satimage                 4435    2000   6       36
New_Thyroid              215     0      3       5
Yeast                    1484    0      5       8
Movement_Libra           360     0      15      90
HeartDisease_Cleveland   303     0      5       13

Kernel function   K(x, x_i), γ > 0
Gaussian          exp(-γ|x - x_i|^2)
Laplace           exp(-γ|x - x_i|)
Cauchy            1/(1 + γ|x - x_i|^2)
Hypersecant       2/(exp(γ|x - x_i|) + exp(-γ|x - x_i|))
Square sinc       sin^2(γ|x - x_i|)/(γ|x - x_i|)^2

Dataset / Kernel   OvO     ED      SC      IG      Gini    Chi-square
Iris
  Gaussian         98      98      98      98      98      98
  Laplace          96      96      96      96      96      96
  Cauchy           98      98      98      98      98      98
  Hypersecant      98      98      98      98      98      98
  Square sinc      95.33   94.67   97.45   98      97.99   98
Satimage
  Gaussian         89.43   88.87   92.89   89.66   89.57   91.99
  Laplace          91.21   90.76   92.41   91.23   91.42   91.12
  Cauchy           90.46   90.61   92.78   89.34   92.98   90.89
  Hypersecant      89.71   90.09   93.78   91.34   91.98   93.98
  Square sinc      74.51   76.43   77.87   78.86   77.8    76.76
Wine
  Gaussian         82.58   98.98   96.32   96.65   97.98   97.52
  Laplace          82.58   92.23   92.67   95.55   93.58   91.58
  Cauchy           82.02   82.02   82.02   82.62   82.87   82.02
  Hypersecant      93.26   93.36   93.26   94.26   92.13   93.26
  Square sinc      75.28   76.89   76.97   77.97   75.28   76.97
Vehicle
  Gaussian         76.83   79.95   84.45   84.82   85.24   85.24
  Laplace          77.42   78.3    80.61   81.61   80.74   80.24
  Cauchy           76.48   82.85   84.52   84.52   84.58   85.65
  Hypersecant      83.33   81.59   84.28   84.28   84.98   84.98
  Square sinc      71.51   80.57   81.32   81.32   81.99   81.56
Glass
  Gaussian         72.43   62.9    71.5    71.21   69.63   75.43
  Laplace          75.70   76.17   76.64   74.64   75.7    77.57
  Cauchy           72.90   72.21   72.43   71.03   68.92   73.36
  Hypersecant      71.96   72.90   71.5    70.09   69.16   71.03
  Square sinc      66.36   64.2    64.78   62.62   62.62   56.07
Segmentation
  Gaussian         84.76   87.85   89.24   89.29   88.87   87.43
  Laplace          87.14   88.19   89.62   89.67   89.87   90.87
  Cauchy           86.19   91      89.87   89.69   89.71   89.99
  Hypersecant      90      89.14   90.87   89.52   92.05   91.95
  Square sinc      81.9    79.05   79.52   80.13   80.95   79.05
Ecoli
  Gaussian         85.42   84.98   85.42   86.01   84.79   86.14
  Laplace          86.97   87.2    86.99   86.9    86.99   87.9
  Cauchy           85.42   85.42   86.98   84.82   86.52   85.71
  Hypersecant      85.42   85.42   88.45   88.82   85.42   85.74
  Square sinc      85.12   84.23   87.65   85.12   87.85   86.15
New_Thyroid
  Gaussian         97.21   97.21   97.21   100     98.21   97.21
  Laplace          96.74   96.74   100     96.89   96.89   96.74
  Cauchy           96.74   96.84   100     96.89   96.89   97.74
  Hypersecant      97.67   100     98.67   98.89   100     100
  Square sinc      94.88   94.88   96.49   96.49   96.49   96.49
Yeast
  Gaussian         58.43   59.92   62.40   60.32   63.95   61.92
  Laplace          59.56   60.31   62.43   63.32   65.17   61.86
  Cauchy           61.54   62.37   64.17   65.41   66.54   62.54
  Hypersecant      59.45   67.76   67.65   67.54   65.56   64.76
  Square sinc      57.98   57.32   58.68   59.45   60.12   58.98
Movement_Libra
  Gaussian         74.45   79.94   81.91   78.94   77.74   76.94
  Laplace          81.73   83.77   85.77   85.87   87.97   82.77
  Cauchy           76.45   78.13   79.89   78.54   76.89   78.89
  Hypersecant      72.23   78.64   77.65   73.64   77.18   76.64
  Square sinc      42.22   42.22   42.22   42.22   42.22   42.22
HeartDisease_Cleveland
  Gaussian         43.56   41.23   34.87   42.27   12.54   43.56
  Laplace          22.11   23.43   23.91   32.31   11.88   22.11
  Cauchy           15.23   34.38   17.47   13.45   12.21   15.18
  Hypersecant      22.44   24.34   25.32   25.41   12.21   22.44
  Square sinc      12.78   15.38   12.23   11.23   13.43   12.09
Table 3: Classification accuracy of conventional OvO vs. OvO ODT-SVMs (ED, SC, IG, Gini, Chi-square) [%].
Dataset / Kernel   OAA     ED      SC      IG      Gini    Chi-square
Iris
  Gaussian         96      98      98      98      98      98
  Laplace          96      96      96      96.44   96      96
  Cauchy           96      98      98      98      98      98
  Hypersecant      98      98      98      98      98      98
  Square sinc      96      94.67   98      98      98      98
Satimage
  Gaussian         89.43   88.87   92.89   89.66   89.57   91.99
  Laplace          89.95   90.76   92.41   91.23   91.42   91.12
  Cauchy           89.46   90.61   92.78   89.34   92.98   90.89
  Hypersecant      89.71   90.09   93.78   91.34   91.98   93.98
  Square sinc      74.51   76.43   77.87   78.86   77.8    76.76
Wine
  Gaussian         82.01   98.98   96.32   96.65   97.98   97.52
  Laplace          82.58   92.23   92.67   95.55   93.58   91.58
  Cauchy           82.15   82.02   82.02   82.62   82.87   82.02
  Hypersecant      93.82   93.36   93.26   94.26   92.13   93.26
  Square sinc      74.72   76.89   76.97   77.97   75.28   76.97
Vehicle
  Gaussian         84.63   79.95   84.45   84.82   85.24   85.24
  Laplace          80.61   78.3    80.61   81.61   80.74   80.24
  Cauchy           84.52   82.85   84.52   84.52   84.58   85.65
  Hypersecant      84.87   81.59   84.28   84.28   84.98   84.98
  Square sinc      78.45   80.57   81.32   81.32   81.99   81.56
Glass
  Gaussian         60.75   62.9    71.5    71.21   69.63   75.43
  Laplace          76.17   76.17   76.64   74.64   75.7    77.57
  Cauchy           68.69   72.21   72.43   71.03   68.92   73.36
  Hypersecant      63.55   72.9    71.5    70.09   69.16   71.03
  Square sinc      61.21   64.2    64.78   62.62   62.62   56.07
Segmentation
  Gaussian         84.36   87.85   89.24   89.29   88.87   87.43
  Laplace          86.19   88.19   89.62   89.67   89.87   90.87
  Cauchy           85.71   91      89.87   89.69   89.71   89.99
  Hypersecant      88.57   89.14   90.87   89.52   92.05   91.95
  Square sinc      80.48   79.05   79.52   80.13   80.95   79.05
Ecoli
  Gaussian         84.01   84.98   85.42   86.01   84.79   86.14
  Laplace          86.90   87.2    86.99   86.9    86.99   87.9
  Cauchy           86.90   85.42   86.98   84.82   86.52   85.71
  Hypersecant      82.74   85.42   88.45   88.82   85.42   85.74
  Square sinc      82.74   84.23   87.65   85.12   87.85   86.15
New_Thyroid
  Gaussian         95.45   97.98   97.54   100     98.21   97.89
  Laplace          96.78   98.89   100     98.52   96.89   98.49
  Cauchy           97.34   97.65   98.94   97.89   100     97.96
  Hypersecant      96.38   100     98.99   98.89   98.90   98.99
  Square sinc      93.54   95.56   96.49   96.78   98.67   97.45
Yeast
  Gaussian         59.65   59.92   63.90   61.54   66.65   62.32
  Laplace          61.26   61.43   64.77   65.54   67.65   64.23
  Cauchy           59.46   63.23   64.17   67.41   66.54   65.43
  Hypersecant      59.99   68.72   68.65   65.54   65.56   61.34
  Square sinc      56.54   58.32   58.68   59.45   60.12   59.98
Movement_Libra
  Gaussian         73.44   77.76   85.21   82.61   76.41   76.34
  Laplace          82.48   84.43   84.67   84.42   85.54   83.75
  Cauchy           76.14   78.93   74.71   85.54   78.65   78.89
  Hypersecant      73.43   77.84   78.45   77.61   78.65   74.64
  Square sinc      42.2    42.22   42.22   42.22   42.22   42.22
HeartDisease_Cleveland
  Gaussian         43.43   42.36   44.54   42.51   12.21   43.53
  Laplace          29.13   43.03   23.34   31.43   16.78   24.12
  Cauchy           45.73   49.38   27.47   43.45   22.35   25.38
  Hypersecant      22.44   24.64   27.62   25.32   13.61   21.45
  Square sinc      12.78   15.48   13.61   11.23   12.43   14.37
Table 4: Classification accuracy of conventional OAA vs. OAA ODT-SVMs (ED, SC, IG, Gini, Chi-square) [%].

Dataset                  OvO     OAA     SVM-BDT  OvO ODT-SVM  OAA ODT-SVM  C4.5    MLP
Iris                     98      98      98       98           98           96      97.33
Satimage                 91.21   89.95   91.65    93.61        93.98        85.7    88.12
Wine                     93.26   96.45   92.63    96.76        98.98        90.96   92.51
Vehicle                  83.33   84.87   82.98    84.95        85.65        71.83   81.98
Glass                    75.7    76.17   72.69    76.17        77.57        67.61   63.08
Segmentation             90      90      90       93.78        92.05        86.6    90.48
Ecoli                    87.97   86.9    85.78    89.98        88.82        85.08   84.34
New_Thyroid              97.67   97.34   100      100          100          91.59   95.33
Yeast                    61.54   61.26   68.59    67.65        68.65        56.71   61.43
Movement_Libra           81.73   82.48   87.45    87.97        85.54        67.13   80.78
HeartDisease_Cleveland   43.56   45.73   49.43    43.56        49.38        50.33   52.65
Table 5: Comparison of best average classification accuracy of ODT-SVMs with other multi-class classifiers.

To determine which classifiers are significantly different, we carried out the Nemenyi test, whose results are illustrated in Figure 1. In this figure, the mean rank of each classifier is marked by a circle, and the horizontal bar across each circle indicates the "critical difference". The performance of two methods is significantly different if their mean ranks differ by at least the critical difference, i.e. if their horizontal bars do not overlap.
Figure 1 reveals that OvO ODT-SVM is significantly different from C4.5 and MLP, but not significantly different from conventional OvO, conventional OAA, SVM-BDT, or OAA ODT-SVM, in terms of classification accuracy; in other words, the proposed scheme is comparable with all the SVM variants. Table 6 and Table 7 show the computation time of OvO ODT-SVM for the training and testing phases, respectively, for the Gaussian kernel with γ = 2^-11 and C = 1000. They show that the time required for training and testing OvO ODT-SVM is significantly less than that of the conventional OvO SVM and SVM-BDT approaches. Similarly, Table 8 and Table 9 show that the time required for training and testing OAA ODT-SVM is significantly less than that of conventional OAA SVM. The Friedman test indicates that significant differences exist among the OvO classifiers on both training and testing time. Figure 2 and Figure 3 illustrate the results of the Nemenyi test, revealing what those differences are among the OvO classifiers on training and testing time, respectively. Consistent with our time complexity analysis, all variants of the OvO ODT-SVM scheme other than the Gini measure are more efficient in training time than conventional OvO and SVM-BDT. Figure 3 shows that OvO ODT-SVM using Gini is ranked best and is significantly better than conventional OvO and SVM-BDT. The variants of the OvO ODT-SVM scheme are not significantly different from each other in either training or testing time. Similarly, the Friedman test indicates that significant differences exist among the OAA schemes on both training and testing time. Figure 4 and Figure 5 illustrate the results of the Nemenyi test, revealing what those differences are among the OAA schemes on training and testing time, respectively.
Again consistent with our time complexity analysis, all variants of the OAA ODT-SVM scheme other than the Gini measure are more efficient in training time than conventional OAA. Figure 5 shows that OAA ODT-SVM using IG is ranked best and is significantly better than conventional OAA. As with the OvO schemes, the variants of the OAA ODT-SVM scheme are not significantly different from each other in either training or testing time. Among the four measures employed for determining the structure of the decision tree, none is a clear winner over the others in terms of computation time for training and testing.

Figure 1: The Nemenyi test applied to mean ranks of classification accuracy of the various classifiers.

Dataset                  OvO      SVM-BDT  ED      SC      IG      Gini    Chi-square
Iris                     2.84     5.18     2.15    2.13    2.19    2.17    2.79
Satimage                 1065.06  2852.82  867.71  864.77  873.48  958.32  871.47
Wine                     2.15     3.53     1.96    2.08    2.11    2.13    2.05
Vehicle                  189.96   508.81   154.76  154.24  155.79  170.92  155.43
Glass                    12.31    16.89    8.93    8.60    9.05    10.06   8.99
Segmentation             3.00     5.26     2.20    2.52    2.49    2.45    2.39
Ecoli                    26.48    45.90    17.28   24.09   20.07   21.97   17.55
New_Thyroid              5.31     8.54     3.93    3.13    2.99    3.23    3.58
Yeast                    245.87   675.43   178.62  176.32  173.34  198.14  189.54
Movement_Libra           534.78   1432.76  433.56  437.76  446.34  476.73  436.42
HeartDisease_Cleveland   8.42     13.23    4.67    6.87    5.43    6.48    6.65
Table 6: Training time of OvO ODT-SVM vs. OvO and SVM-BDT [sec].

Dataset                  OvO      SVM-BDT  ED      SC      IG      Gini    Chi-square
Iris                     0.03     0.03     0.03    0.01    0.02    0.02    0.02
Satimage                 10.44    18.53    9.96    9.32    9.07    7.79    8.53
Wine                     0.03     0.06     0.03    0.03    0.03    0.03    0.03
Vehicle                  0.44     0.79     0.43    0.44    0.43    0.33    0.36
Glass                    0.04     0.06     0.02    0.04    0.02    0.03    0.02
Segmentation             0.05     0.07     0.04    0.05    0.04    0.05    0.04
Ecoli                    0.05     0.06     0.06    0.04    0.04    0.04    0.04
New_Thyroid              0.07     0.09     0.06    0.05    0.06    0.05    0.05
Yeast                    0.98     1.79     0.86    0.89    0.86    0.69    0.78
Movement_Libra           5.89     9.89     4.96    4.89    4.43    3.86    4.54
HeartDisease_Cleveland   0.09     0.10     0.05    0.05    0.04    0.05    0.06
Table 7: Testing time of OvO ODT-SVM vs. OvO and SVM-BDT [sec].

Figure 2: The Nemenyi test applied to mean ranks of training time of the OvO schemes.

Figure 3: The Nemenyi test applied to mean ranks of testing time of the OvO classifiers.

Dataset                  OAA      ED      SC      IG      Gini    Chi-square
Iris                     28.09    12.13   21.65   12.07   15.95   15.75
Satimage                 3451.82  858.09  945.91  940.30  989.37  989.43
Wine                     5.20     2.95    2.53    2.51    2.52    2.52
Vehicle                  615.65   153.04  168.71  167.71  176.46  453.63
Glass                    135.60   22.85   27.18   27.18   69.02   21.34
Segmentation             6.70     2.17    2.53    2.51    4.80    4.82
Ecoli                    80.68    23.34   30.02   31.02   78.42   79.80
New_Thyroid              10.34    4.54    4.23    5.67    5.13    5.67
Yeast                    1278.76  306.98  367.87  306.76  359.54  567.87
Movement_Libra           1734.41  428.87  472.87  478.24  478.91  456.69
HeartDisease_Cleveland   102.34   24.76   23.43   21.23   32.23   24.54
Table 8: Training time of OAA vs. OAA ODT-SVM [sec].

Dataset                  OAA      ED      SC      IG      Gini    Chi-square
Iris                     0.02     0.03    0.01    0.01    0.01    0.01
Satimage                 13.95    7.72    9.07    7.59    9.05    7.87
Wine                     0.03     0.02    0.02    0.02    0.02    0.02
Vehicle                  0.59     0.33    0.39    0.32    0.38    0.33
Glass                    0.05     0.02    0.04    0.04    0.03    0.03
Segmentation             0.05     0.04    0.03    0.03    0.04    0.04
Ecoli                    0.05     0.04    0.04    0.03    0.04    0.03
New_Thyroid              0.07     0.05    0.05    0.04    0.04    0.05
Yeast                    1.28     0.76    0.89    0.66    0.78    0.69
Movement_Libra           6.89     3.78    4.57    3.87    4.56    4.08
HeartDisease_Cleveland   0.05     0.04    0.03    0.04    0.02    0.03
Table 9: Testing time of OAA vs. OAA ODT-SVM [sec].
Figure 4: The Nemenyi test applied to mean ranks of training time of the OAA classifiers.

Figure 5: The Nemenyi test applied to mean ranks of testing time of the OAA classifiers.

6 Conclusion In this paper, we evaluated the performance, in terms of classification accuracy and computation time, of the proposed OvO ODT-SVM and OAA ODT-SVM using the statistical measures information gain, gini index, chi-square and scatter-matrix-based class separability in kernel-induced space. We also showed theoretically that the training and testing computation time of both ODT-SVMs using statistical measures is better than that of conventional SVMs. A robust non-parametric test was carried out for the statistical comparison of classifiers over multiple data sets. The results of the experiments on UCI repository datasets indicate that the accuracy of our proposed framework is significantly better than conventional OvO SVM, conventional OAA SVM and two widely used multi-class classifiers, C4.5 and MLP, for most of the datasets. Our experimental results also demonstrate that the computation time of the proposed ODT-SVM formulations is significantly less than that of conventional SVM and SVM-BDT models. The statistical test performed over multiple classifiers also shows that the performance of the ODT-SVM model is significantly better than that of other natural multi-class classifiers such as C4.5 and MLP. Among the four measures employed for determining the structure of the decision tree, none is a clear winner over the others in terms of computation time for training and testing. Acknowledgements We are thankful to the reviewers for their valuable comments, which have led to the improvement of the paper. References [1] Allwein, Schapire, R., & Singer, Y. (2000).
Reducing multiclass to binary: A unifying approach for margin classifiers. In Machine Learning: Proceedings of the Seventeenth International Conference. [2] Blake, C. L., & Merz, C. J. (1998). UCI Repository of Machine Learning Databases. Irvine, CA: University of California, Dept. of Information and Computer Science. http://www.ics.uci.edu/~mlearn/ML-Repository.htm [3] Boser, Guyon, I., & Vapnik, V. (1992). A training algorithm for optimal margin classifiers. 5th Annual ACM Workshop on COLT (pp. 144-152). [4] Bredensteiner, & Bennett, K. (1999). Multicategory classification by support vector machines. Computational Optimization and Applications, 12, 53-79. [5] Breiman, L., Friedman, J., Olshen, R., & Stone, C. (1984). Classification and Regression Trees. Belmont, CA: Wadsworth. [6] Cortes, C., & Vapnik, V. N. (1995). Support-Vector Networks. Machine Learning, 20, 273-297. [7] Crammer, & Singer, Y. (2001). On the algorithmic implementation of multiclass kernel-based vector machines. Journal of Machine Learning Research, 2, 265-292. [8] Demsar, J. (2006). Statistical Comparisons of Classifiers over Multiple Data Sets. Journal of Machine Learning Research, 7, 1-30. [9] Duda, R. O., & Hart, P. E. (1973). Pattern Classification and Scene Analysis. New York: J. Wiley. [10] Duda, R. O., Hart, P. E., & Stork, D. G. (2000). Pattern Classification (2nd ed.). John Wiley & Sons, Inc. [11] Friedman, M. (1940). A Comparison of Alternative Tests of Significance for the Problem of m Rankings. Annals of Mathematical Statistics, 11, 86-92. [12] Friedman, M. (1937). The Use of Ranks to Avoid the Assumption of Normality Implicit in the Analysis of Variance. Journal of the American Statistical Association, 32, 675-701. [13] Gjorgji, M., Dejan, G., & Ivan, C. (2009). A Multi-class SVM Classifier Utilizing Binary Decision Tree. Informatica, 33, 233-241. [14] Hsu, C. W., & Lin, C. J. (2002). A comparison of methods for multiclass support vector machines. IEEE Transactions on Neural Networks, 13(2), 415-425. [15] Kittler, J., & Hojjatoleslami, A. (1998). A weighted combination of classifiers employing shared and distinct representations. IEEE Computer Vision and Pattern Recognition (pp. 924-929). [16] Kreßel. (1999). Pairwise classification and support vector machines. Cambridge, MA: MIT Press. [17] Manju, B., & R. K., A. (July 2010). Statistical Measures to Determine Optimal Decision Tree Based One versus One SVM. Accepted for publication in Defence Science Journal. [18] Manju, B., & R. K., A. (2009). Evaluation of Decision Tree SVM Framework Using Different Statistical Measures. International Conference on Advances in Recent Technologies in Communication and Computing (pp. 341-445). Kerala. [19] Platt, Cristianini, N., & Shawe-Taylor, J. (2000). Large margin DAGs for multiclass classification. Advances in Neural Information Processing Systems, 12, 547-553. [20] Quinlan, J. R. (1987). Induction of Decision Trees. Machine Learning, 1, 81-106. [21] Rifkin, R., & Klautau, A. (2004). In Defense of One-Vs-All Classification. Journal of Machine Learning Research, 5, 101-141. [22] Schölkopf, B., & Smola, A. (2002). Learning with Kernels. Cambridge, MA: MIT Press. [23] Shannon, C. E. (1948). A Mathematical Theory of Communication. Bell System Technical Journal, 27, 379-423, 623-656. [24] Takahashi, F., & Abe, S. (2002). Decision-tree-based multiclass support vector machines. Proceedings of the Ninth International Conference on Neural Information Processing (ICONIP '02), 3, pp. 1418-1422. Singapore. [25] Vapnik, V. N. (1998). Statistical Learning Theory. New York: John Wiley & Sons. [26] Witten, I. H., & Frank, E. (2005). Data Mining: Practical Machine Learning Tools and Techniques (2nd ed.). Morgan Kaufmann.
A Shadow Dynamic Finite State Machine for Branch Prediction: An Alternative for the 2-bit Saturating Counter Saleh Abdel-Hafeez, Asem Albosul and Ahmad Shatnawi Department of Computer Engineering Jordan University of Science & Technology, Irbid, Jordan 21110 E-mail: sabdel@just.edu.jo Ann Gordon-Ross and Shadi Harb Department of Electrical & Computer Engineering University of Florida, Gainesville, FL 32611, USA E-mail: ann@ece.ufl.edu Keywords: branch predictor, bimodal, finite state machine, SDFSM, SPEC2000, saturated counter, SimpleScalar Received: July 31, 2009 We propose an adaptive learning machine-based branch predictor - the shadow dynamic finite state machine (SDFSM) - that enables more accurate branch predictions by learning unique branching patterns through a self-modifying technique. SDFSM states represent branch pattern bits. If a state mispredicts a branch, the state is swapped with its shadow state, which represents the correct branching pattern bit. Therefore, the prediction accuracy can reach 100% if the number of states matches a branch's pattern length. When compared to a 2-bit saturating counter using bimodal branch predictors, the SDFSM decreases average misprediction rates by 18.3%, with individual decreases as high as 55%. Povzetek: Predstavljena je metoda za učenje vejitvenih vzorcev v procesorju. 1 Introduction and related work In order to meet high performance demands, modern processor architectures exploit varieties of dynamic branch prediction topologies ([4]-[6] provide an excellent introduction and research coverage) to increase instruction-level parallelism (ILP). Dynamic branch predictors use run-time branch execution history to predict branch direction. Most previous techniques use a branch pattern history table (known as PHTs, BHTs, or BPHTs) to record past branch behavior (e.g., global and/or local) and these tables are indexed using a function/subset of the branch address. 
Nearly all dynamic branch predictors explored in the last 10 years have been based on tables containing 2-bit saturating counters [7][8]. Extensive simulations of branch predictors reveal that the 2-bit saturating counter performs the best on average [9][10], and it is thus used in modern commercial processors. In recent years, research has explored more advanced branch prediction techniques such as neural networks [11][12] and other forms of machine learning. Despite their impressive simulation accuracy, to the best of our knowledge no commercial efforts have publicly announced incorporating such branch predictors, because these predictors are commonly known to exhibit high prediction latency and long training periods with increased area and energy per prediction [13]. In order to provide increased branch prediction accuracy with low area and power overheads, in this paper we propose a novel adaptive learning machine-based shadow dynamic finite state machine (SDFSM). The SDFSM learns/predicts an application's unique branching pattern using the prediction values (taken/not taken) stored in each state. Upon branch execution, the state transition is input independent and the value of the target state predicts the branch outcome. Each state has a corresponding shadow state, which contains the alternate branch prediction value. In the event of a mispredicted branch, the SDFSM performs self-modification by swapping the current state with the current state's shadow state, which contains the correctly predicted branch outcome. This method of state swapping dynamically records unique branch patterns, thus specializing the branch predictor to the needs of an application. Extensive experimental results compare the SDFSM prediction accuracy to the commonly used bimodal [1][2] counter-based predictor and reveal that, for a subset of benchmarks, an SDFSM with six shadow states provides more accurate predictions than counter-based predictors with one-to-one prediction latency.
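For contrast with the SDFSM introduced below, the 2-bit saturating counter baseline can be modeled in a few lines. This is the generic textbook scheme, not code from this paper, and the alternating branch pattern it is exercised on anticipates the 1010 example the paper later calls pathological for saturating counters:

```python
# Generic 2-bit saturating counter, the cell of the baseline bimodal
# predictor. States 0-1 predict "not taken", states 2-3 predict "taken";
# the counter saturates at 0 and 3 instead of wrapping around.

class SaturatingCounter:
    def __init__(self, state=1):
        self.state = state          # 0..3, weakly not-taken by default

    def predict(self):
        return self.state >= 2      # True = predict taken

    def update(self, taken):
        if taken:
            self.state = min(3, self.state + 1)
        else:
            self.state = max(0, self.state - 1)

# An alternating taken/not-taken pattern keeps a weakly biased counter
# oscillating between states 1 and 2, mispredicting every branch.
ctr = SaturatingCounter()
outcomes = [True, False] * 4
hits = 0
for taken in outcomes:
    if ctr.predict() == taken:
        hits += 1
    ctr.update(taken)
print(hits, "correct of", len(outcomes))
```

Starting from the weakly not-taken state, the counter scores zero on this pattern, which is exactly the weakness the SDFSM's pattern learning targets.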
The remainder of this paper is organized as follows. Section 2 describes the proposed SDFSM as an alternative replacement for 2-bit saturating counters and presents the SDFSM architecture. Section 3 and Section 4 present our simulation methodology setup and branch predictor analysis, respectively. Section 5 compares counter-based predictors and SDFSM-based predictors. Section 6 presents a performance analysis and finally, section 7 gives conclusions and suggested future dynamic branch prediction development. 2 Shadow dynamic finite state machine (SDFSM) branch prediction In this section, we present our shadow dynamic finite state machine (SDFSM) branch prediction technique for learning/predicting an application's unique branching patterns. 2.1 SDFSM operation Figure 1 depicts the SDFSM using a 4-state SDFSM automaton (larger SDFSMs are similarly represented using more states). SDFSM state values record/predict branch outcomes. SDFSM operation consists of two phases: the training phase and the operational phase. During the training phase, SDFSM states are manipulated such that they learn the application's branching pattern. SDFSM state transition is deterministic upon each branch execution and the next state's value corresponds to the predicted branch outcome. In other words, branch prediction is determined by the branch history pattern and not by the input condition leading to the next state. If a state's prediction value is correct, no change is made to the SDFSM. If a state's prediction value is incorrect, the SDFSM self-modifies to adapt to the branching pattern. In order to learn branching patterns, each state has a corresponding shadow state (positioned adjacent to the state), and the shadow state contains the opposite prediction value. Thus, if a state's value does not correspond to the branching pattern, the state is swapped with its shadow state in order to swap the state's branch prediction value. 
During the training phase, the states record the observed pattern, and during the operational phase, the states predict taken/not taken. This implies that the SDFSM learns a distinct pattern on-the-fly and then predicts this pattern perfectly. Furthermore, the training and operational phases are not necessarily mutually exclusive, as the SDFSM returns to the training phase any time there is a misprediction. Figure 2 illustrates the 4-state SDFSM using a repeated pattern of 1010, which is commonly known to produce poor prediction rates for saturating counter techniques [14]. All state values are initialized to 0. Upon first execution of the branch, the SDFSM enters the initial state (step 1), whose state value is 0 and which predicts the branch as not taken. After branch resolution, since the state mispredicted the branch outcome, the state is swapped with its shadow state and the state's predicted value becomes 1. On the next execution of the branch, the SDFSM transitions to the next state (step 2), which correctly predicts the branch as not taken. On the next execution of the branch, the SDFSM transitions to the next state (step 3), which predicts the branch as not taken. Again, branch resolution determines that the branch was mispredicted, and the shadow state is swapped in. On the next execution of the branch, the SDFSM transitions to the next state (step 4), which correctly predicts the branch as not taken. On the next execution of the branch, the SDFSM transitions back to the initial state, which ends the training phase and begins the operational phase. The SDFSM now correctly predicts the branch outcome on every branch execution.

Figure 1: The proposed shadow dynamic finite state machine (SDFSM) using four states.

Figure 2: The SDFSM updates state predictions by swapping states with shadow states based on the observed pattern.
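The walk-through above can be reproduced with a small behavioral model of the SDFSM. This is an illustrative sketch of the described swap-on-mispredict rule, not the authors' implementation:

```python
# Behavioral sketch of an N-state SDFSM. Each state stores one prediction
# bit; on a misprediction the active state is "swapped with its shadow",
# i.e. its bit flips to the observed outcome. State transition is
# input-independent: the machine steps to the next state on every branch.

class SDFSM:
    def __init__(self, n_states):
        self.bits = [0] * n_states   # all state values initialized to 0
        self.pos = 0                 # initial state

    def predict(self):
        return self.bits[self.pos]   # 1 = taken, 0 = not taken

    def update(self, outcome):
        if self.bits[self.pos] != outcome:
            self.bits[self.pos] = outcome   # swap in the shadow state
        self.pos = (self.pos + 1) % len(self.bits)

# The repeating pattern 1010 from Figure 2: after one pass (the training
# phase) a 4-state SDFSM predicts the pattern perfectly.
fsm = SDFSM(4)
pattern = [1, 0, 1, 0] * 4
misses = []
for outcome in pattern:
    misses.append(fsm.predict() != outcome)
    fsm.update(outcome)
print(misses)
```

Consistent with the walk-through, the model mispredicts only at steps 1 and 3 of the first pass and is perfect thereafter; a pattern of period x is learned perfectly whenever the number of states is x or a multiple of x.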
Perfect branch pattern prediction only occurs if the pattern repeats itself with a repetition cycle equal to (or a divisor of) the number of states. A 4-state SDFSM can perfectly predict any 2- or 4-entry branch pattern. This restriction can be generalized to any x-entry pattern, which would require an SDFSM with x states or any multiple of x states. In Section 4, we provide an in-depth analysis of numerous SDFSM sizes. During context switching, in addition to traditional branch predictor state saving techniques, the SDFSM operational state can be quickly saved and restored using special hardware to read and save state in a single clock cycle. The state saving area overhead would be small, as only one n-bit counter is required per context. Currently, SDFSM operation is not pipelined, thus mispredicted branches and branch overlap are not accounted for. However, these operational enhancements could be easily incorporated into the SDFSM by adding additional steering logic and mispredict rollback capabilities. These additions would be straightforward, could be done such that the prediction accuracy is unaffected, and are a focus of our future work. 2.2 SDFSM architecture Figure 3 depicts the generalized SDFSM architecture (with N states), consisting of an array of N prediction states and a shift register to selectively enable the appropriate prediction state. Prediction state architectural components include a single D-type flip-flop (DFF) to store the state's predicted value, a two-input multiplexer to swap the predicted value (effectively implementing a swap with the shadow state), and several gate-level components. Prediction state inputs are similar to those used for 2-bit saturating counters: initialize (IN), prediction input pattern (PIP), enable (Z), and the clock (CLK) signal. Prediction states have a single output, the predicted value. The outputs of all prediction states are connected to a common output (Prediction Output Value) using tri-state buffers. The shift register is composed of N DFFs, whose outputs Q (also denoted as Z) are connected to the adjacent DFFs' inputs D and selectively enable the prediction states. The shift register is clocked using the BRANCH signal, which is asserted each time the branch associated with this predictor is fetched. At system startup, IN is asserted to reset the system. IN is connected to each DFF's reset (RES) port, effectively setting all register values to 0, except for the last DFF in the shift register; IN is connected to the set (SET) port of this DFF in order to set this DFF's value to 1. The shift register is responsible for selectively enabling a single prediction state, thus only one bit in the shift register should ever have a value of 1. Each time the BRANCH signal is asserted, the shift register updates its values, which enables the next sequential prediction state via the Z signal.

Parameter                                    Configuration
BTB size, assoc, cache line size             128KB, 4-way, 32B
L2 unified size, assoc, cache line size      256KB, 4-way, 64B
L1 data size, assoc, cache line size         8KB, 4-way, 32B
L1 instruction size, assoc, cache line size  8KB, 4-way, 32B
Branch predictor technique                   Bimodal
Reorder buffer size                          512
L3 unified size, assoc, cache line size      4MB, 2-way, 64B
Pipeline depth                               40
Table 1: Architectural parameters.

Component type           Number of components
D-type flip-flop (DFF)   2N
Two-input multiplexer    N
AND gate logic           N
XOR gate logic           N
Tri-state gate logic     N
Table 2: Total number of hardware components based on the number of SDFSM prediction states (N).
SDFSM prediction states have two operational phases: the predict operation and the correct operation. The predict operation provides the branch prediction value, while the correct operation swaps the branch prediction value with the shadow state value if the branch is mispredicted. During the predict operation, the enabled prediction state's output drives the Prediction Output Value using Z's assertion to enable the tri-state buffer. During this time, the PIP input value should correspond to the Prediction Output Value (not shown in Figure 3) so that the DFF value does not change. If a branch is mispredicted, the PIP value changes to the branch outcome value and the prediction state enters the correct operational phase. During this phase, simple logic gates controlling the multiplexer's inputs and select line swap the DFF's stored value with the shadow value. Thus, in order to swap the DFF's stored value, the PIP must differ from the currently stored value and Z must be asserted. The SDFSM has been architecturally designed to complete in one fast clock cycle. Assuming the DFFs are constructed using two levels of 3-input NAND gates and the multiplexer is constructed using standard two-level logic gates, the longest register-to-register delay is seven gates (since DFF updating for the shift register and the prediction states is mutually exclusive, no phase flows through both DFFs). This situation occurs during the correct operational phase. Table 2 gives hardware area estimates in numbers of hardware components based on the number of prediction states N, where total hardware area grows at a rate of O(N). To minimize the output steering logic, prediction state outputs share a common output wire using tri-state buffers. In addition, to minimize active power, the DFFs in each prediction state are only activated on a misprediction. Overall, the SDFSM architecture is highly cost-effective in terms of performance, area, and power.
Figure 4: Prediction accuracy for each benchmark using a 4k-entry BHT for the bimodal branch predictor using a 2-bit saturating counter (Counter) and SDFSMs with 2, 3, 4, 6, 8, 10, and 12 states (SDFSM-N).

Figure 5: Prediction accuracy for the six advantageous benchmarks using a 4k-entry BHT for the bimodal branch predictor using a 2-bit saturating counter (Counter) and SDFSMs with 2, 3, 4, 6, 8, 10, and 12 states (SDFSM-N).

Figure 6: Arithmetic mean of the prediction accuracy for (a) all benchmarks and (b) the six advantageous benchmarks for the bimodal branch predictor with a 2-bit saturating counter (Counter) and SDFSMs with 2, 3, 4, 6, 8, 10, and 12 states (SDFSM-N) for BHT sizes ranging from 256 to 128k entries.

3 Simulation methodology and evaluation metrics

In order to perform an in-depth analysis of the SDFSM, we exhaustively simulated the SPEC2000 benchmark suite [16] (we simulated each application in its entirety for all provided input stimuli) using the SimpleScalar PISA processor simulator version 4 [15]. We modified sim-bpred to implement the SDFSM and simulated the SDFSM with 2, 3, 4, 6, 8, 10, and 12 states. Our comparison framework focused on comparing the SDFSM to a popular branch prediction technique (bimodal) using 2-bit saturating counters with branch prediction table sizes ranging from 256 to 128k entries.
We compare with the bimodal predictor because the bimodal predictor is a branch predictor cornerstone and allows us to establish the fundamental contribution of our SDFSM. Table 1 summarizes the base system's architectural parameters, which represent common modern system parameters, yet are conservative with respect to future technologies. Each branch prediction table entry contains an FSM, which can be either the SDFSM or a 2-bit saturating counter. Hence, the predictor storage budget (PSB) in bits is determined by:

  PSB = 2^N × ⌈log2(number of FSM states)⌉

where N is the number of index bits used for the branch prediction table. In the conventional bimodal branch predictor, the low-order J bits of the branch address index into a branch history table (BHT) of size 2^J entries. The BHT entries can either be 2-bit saturating counters or can be replaced with SDFSMs of any size. Since it is difficult to precisely compare predictors with exactly the same hardware budgets, we compare predictors based on the number of table entries, which provides a fair performance comparison because these tables account for the majority of the hardware budget. Cumulative prediction rate accuracies are computed and analyzed using the arithmetic mean for averaging prediction rates, over all benchmarks, based on predictor storage budget. In addition, individual branch prediction accuracies for every benchmark and every branch prediction technique studied were measured for increasing hardware budgets, reflecting branch predictor sizes available in commercial microprocessors. Improving processor performance, measured in number of instructions executed per cycle (IPC), is considered the key motivation for combining improved branch prediction accuracy with low-latency branch prediction. High prediction latency nullifies any prediction accuracy advantages due to decreased IPC.
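The PSB formula above is straightforward to evaluate; the helper name below is ours:

```python
# Predictor storage budget in bits: PSB = 2^N * ceil(log2(S)),
# where N is the number of index bits and S the number of FSM states.
from math import ceil, log2

def psb_bits(index_bits, fsm_states):
    return (2 ** index_bits) * ceil(log2(fsm_states))

psb_bits(12, 4)   # 4k-entry BHT of 2-bit counters (4 states -> 2 bits/entry)
psb_bits(12, 6)   # 4k-entry BHT of 6-state SDFSMs (3 bits/entry)
```

This also makes the paper's point concrete: at equal entry counts, an SDFSM table costs more bits per entry than a 2-bit counter table (e.g., 3 bits versus 2 for six states), which is why the comparison is made per table entry rather than per bit.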
For a 2-bit saturating counter, since each up-down counter only requires 2 bits to record the branch behavior, the technique requires simple hardware and little storage space. In addition, the 2-bit saturating counter's inherent simplicity results in simple single-cycle prediction computation, thus guaranteeing low prediction latency. In contrast, perceptron-based predictors require comparatively complicated computation using adder components. The prediction latency of the original perceptron predictor was more than 4 cycles [13], which required heavy pipelining to hide such latencies. This pipelining led to problems similar to those encountered when designing modern hyperpipelined execution cores [12]. Thus, since the SDFSM has the same access delay (single-cycle) as the 2-bit saturating counter, the key evaluation metric is SDFSM prediction accuracy compared to 2-bit saturating counters with a fixed hardware budget.

4 Experimental results

Figure 4 shows the prediction accuracy for all benchmarks for the bimodal branch predictor with a 2-bit saturating counter (Counter) and SDFSMs with 2, 3, 4, 6, 8, 10, and 12 states (SDFSM-N) using a 4k-entry BHT. On average over all benchmarks, the 2-bit saturating counter outperforms all SDFSMs. However, we reiterate that branch predictors behave differently for all applications, and there is no one branch predictor that outperforms all other branch predictors for all applications. Figure 5 subsets the results and depicts the six applications where the SDFSM shows improved prediction accuracy over the 2-bit saturating counter. On average, the 6-state SDFSM provides the largest prediction accuracy improvements with an average misprediction rate decrease of 18.3%, with individual decreases ranging from 6.3% to 55%. Figure 5 also reveals that for each benchmark, the optimal-sized SDFSM is quite different.
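For reference, the baseline 2-bit saturating counter behaves as follows (a standard textbook model; the class name and the weakly-not-taken initial value are our choices):

```python
# A 2-bit saturating counter as used in the baseline bimodal predictor:
# counts 0..3, predicts taken when >= 2, and needs two consecutive
# mispredictions before the predicted direction flips (damping).

class SaturatingCounter:
    def __init__(self, value=1):        # 1 = weakly not-taken
        self.value = value

    def predict(self):
        return self.value >= 2           # True = predict taken

    def update(self, taken):
        if taken:
            self.value = min(3, self.value + 1)
        else:
            self.value = max(0, self.value - 1)

c = SaturatingCounter(3)                 # strongly taken
c.update(False)                          # one not-taken outcome...
c.predict()                              # ...still predicts taken
```

The final two lines illustrate the hysteresis discussed later in Section 5.1: a single contrary outcome does not change the prediction.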
The optimal SDFSM sizes for ammp, equake, gzip_graphic, mcf, mesa, and perlbmk_makerand are the 6-state, 12-state, 8-state, 2-state, 12-state, and 6-state SDFSMs, respectively. Figure 6(a) depicts the arithmetic mean of the prediction accuracy for all benchmarks for the bimodal branch predictor with a 2-bit saturating counter (Counter) and SDFSMs with 2, 3, 4, 6, 8, 10, and 12 states (SDFSM-N) for BHT sizes ranging from 256 to 128k entries. The prediction accuracy increases as BHT size increases and saturates asymptotically. On average, the 2-bit saturating counter still outperforms all SDFSMs, with the 2-bit predictor's prediction accuracy saturating at 92.3% and the next most accurate predictor (the 6-state SDFSM) saturating at 91%. On average, the 2-bit saturating counter with 16k entries (a practical hardware budget) provides 1.7% more accuracy than the next most accurate predictor. Figure 6(b) subsets the results from Figure 6(a) and averages the six applications where the SDFSM shows improved prediction accuracy over the 2-bit saturating counter. For these benchmarks, the 6-state SDFSM is 1.2% more accurate than the 2-bit saturating counter, saturating asymptotically at 93.5%. This figure also shows that both the 6- and 12-state SDFSMs outperform the 2-bit saturating counter. Overall, these results reveal that our SDFSM has the potential to further enhance the accuracy of 2-bit saturating counters. Since the literature shows that the most advanced branch prediction methods adopt neural or saturating elements, the SDFSM has the potential to improve on these methods as a replacement for the saturating elements. The SDFSM is intended to enhance branch prediction for certain applications that exhibit particular behaviors such as aliasing, damping, and other irregularities such as those found in artificial intelligence and gaming applications (see Section 5 for details).
5 Comparison analysis

In this section, we analyze the exhaustive results presented in Section 4 and discuss comparative advantages and disadvantages of the 2-bit and SDFSM branch predictors considering aliasing interference, damping, adaptability, training time, and latency.

5.1 Aliasing interference and damping

Since the BHT size is generally much smaller than the total number of branches in an application, the bimodal branch predictor uses the low-order J bits to index into the BHT. Therefore, if two conditional branches have the same low-order J bits, their branch streams will be intermingled and sent to the same predictor. We define this situation as aliasing interference. Due to aliasing interference, and because we use the bimodal branch predictor, both the 2-bit saturating counter and our SDFSM-based predictor generally result in lower prediction accuracy in the presence of significant aliasing interference. Aliasing interference can be alleviated through two methods. Simply increasing the BHT size can significantly reduce aliasing interference. Additionally, using other branch prediction techniques such as per-address branch predictors (PAs) can reduce aliasing interference by using a two-level indexing method [14]. The first level is indexed using a subset of H bits of the branch address to index into a pattern history table of size 2^H, which stores the unique local branch history pattern of that branch. This pattern is then used to index into the second level, which contains either global pattern histories (PAg) or per-address pattern histories (PAp) [3]. In general, aliasing interference does not directly imply prediction accuracy penalties. For example, if two branches alias to the same BHT entry but their executions are mutually exclusive (e.g., the first branch executes 1000 times followed by 1000 executions of the second branch), the prediction accuracy lost due to aliasing interference is negligible.
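The low-order-bit indexing that causes aliasing is easy to demonstrate (the two branch addresses below are hypothetical):

```python
# The bimodal predictor indexes its BHT with the low-order J bits of
# the branch address, so branches agreeing in those bits share an entry.

def bht_index(branch_addr, j_bits):
    return branch_addr & ((1 << j_bits) - 1)

a, b = 0x400123, 0x40A123      # hypothetical branch addresses
bht_index(a, 12) == bht_index(b, 12)   # both map to entry 0x123: aliased
```

With a 4k-entry BHT (J = 12), any two branches 4 KiB of index space apart in their low bits collide, which is why simply growing the table reduces, but never eliminates, interference.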
However, if two branch executions are not mutually exclusive (the worst case being that the two branches alternate executions), then aliasing interference may lead to a significant decrease in prediction accuracy. To analyze the effects of aliasing interference in the case of two interfering branches, we define the most frequently executing branch as the majority branch and the least frequently executing branch as the minority branch. We further define a majority run as consecutive majority branch executions with no intervening minority branch executions. Minority runs are similarly defined. Smith [1] observed that 2-bit saturating counters implicitly provided an appropriate amount of damping (or hysteresis), which alleviated some of the aliasing interference. The damping mechanism in 2-bit saturating counters requires two consecutive mispredictions before the prediction value changes, thus ignoring minority runs of length one. Damping trades adaptability for reduced vulnerability to short minority runs. In addition, damping also allows loop branches to incur just one misprediction per loop iteration, instead of two mispredictions (one on loop exit and one on loop entry). On the other hand, the SDFSM's implicit damping mechanism is quite different from the 2-bit saturating counter's. The SDFSM simply learns the branching pattern that maps to a particular BHT entry. Therefore, as long as the combined patterns of the interfering branches produce a learnable pattern, the SDFSM will learn that pattern. However, since these combined patterns are likely longer than individual branching patterns, this implies that SDFSMs with more states provide increased damping. The SDFSM predictor actually provides high or even perfect prediction accuracy for applications with short minority runs as well as long minority/majority runs, by minimizing or even eliminating aliasing interference.

Figure 7: Bimodal predictor misprediction rates for various pattern lengths.
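The worst-case alternation mentioned above can be checked with a short behavioral simulation. The two functions below are simplified stand-ins for the predictors (not the paper's simulator); the scenario assumed is two aliased branches that alternate, one always taken and one never taken, so the shared entry sees the stream 1, 0, 1, 0, ...

```python
# Counter vs. SDFSM on a shared BHT entry fed an alternating stream.

def run_counter(stream):
    value, hits = 1, 0                   # 2-bit counter, weakly not-taken
    for taken in stream:
        hits += (value >= 2) == taken
        value = min(3, value + 1) if taken else max(0, value - 1)
    return hits

def run_sdfsm(stream, n):
    bits, pos, hits = [0] * n, 0, 0      # n-state SDFSM
    for taken in stream:
        hits += bits[pos] == taken
        if bits[pos] != taken:
            bits[pos] = taken            # shadow swap on mispredict
        pos = (pos + 1) % n
    return hits

stream = [1, 0] * 50                     # two aliased alternating branches
run_counter(stream), run_sdfsm(stream, 2)
```

The counter oscillates between its weak states and mispredicts every time, while the 2-state SDFSM learns the combined pattern after a single pass, matching the claim that a learnable combined pattern neutralizes this interference.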
On the contrary, in the presence of aliasing interference, damping in saturating counters only works well for long minority runs. The literature shows that the bimodal predictor is widely known to have a significant amount of aliasing interference even as the hardware budget increases [2][4]. In our experiments, since both the 2-bit saturating counter and the SDFSMs use a bimodal predictor, large amounts of aliasing interference will favor the counter-based predictor, since the counter-based predictor can better tolerate aliasing interference. Figure 4 shows that on average the 2-bit saturating counter can reduce the misprediction rate by 14.3% over the best SDFSM predictor (SDFSM-6). On the other hand, for the six benchmarks where SDFSMs are advantageous, short minority runs (which are considered a limitation of counter-based predictors) favor the SDFSM predictor. For these six benchmarks, Figure 5 shows that the SDFSM can decrease misprediction rates by 18.3% on average.

5.2 Recurring patterns

Researchers have shown that aliasing in the pattern history tables can significantly degrade the performance of bimodal branch predictors. [3][4][21][23] showed that a repeating pattern of length one (i.e., "1111...1" or "0000...0") was detected for approximately 50% of the branches, indicating that a significant amount of branch interference may occur if the PHTs are updated for these branches. For these situations, a simple predictor such as a bimodal predictor would typically outperform the SDFSM predictor, which would incur every interference update. In addition, research showed bimodal predictors could accurately predict branches with short repeating patterns, while branches with a repeating pattern of length six tended to have higher misprediction rates [21][22][23], as is shown in Figure 7 from [23].
Since Section 4 revealed that the 6-state SDFSM performed best on average, the 6-state SDFSM may provide improved performance for these branching patterns of length 6. In addition, our results demonstrated that SDFSMs with a smaller number of states suffered less branch interference penalty as compared to SDFSMs with a larger number of states, which could explain why the 6-state SDFSM outperformed the 12-state SDFSM (or any other SDFSM with a multiple of 6 states).

5.3 Adaptability and training time

Branches typically exhibit high biasing (usually 70% [4]) towards one outcome (taken or not taken). This bell distribution (peaking at 70%) is key to a counter-based predictor's high prediction accuracy and explains why the 2-bit saturating counter outperforms the SDFSM for the majority of the benchmarks. To provide better prediction accuracy for low-biasing applications, previous work [3][5] shows that applications with branches that show low biasing require dynamic adaptability in order to achieve high prediction accuracies. This dynamic adaptability enables the predictor to specialize itself to a branch's biasing during application execution. Dynamic adaptability provides the added benefit of not requiring any static profiling or branch predictor training during system/application design time. The 2-bit saturating counter lacks dynamic adaptability. On the other hand, our N-state SDFSM-based predictor can dynamically adapt to any branch pattern of length equal to (or a divisor of) N. The larger the number of states, the more flexibility the SDFSM has for adapting to different pattern lengths. However, SDFSMs with a large number of states can negatively impact the prediction accuracy due to longer training times. Figure 5 exemplifies this impact with the 6- and 12-state SDFSMs. Ammp, gzip_graphic, mcf, and perlbmk_makerand show increased prediction accuracy for a 6-state SDFSM even though the 12-state SDFSM captures the same branching pattern.
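The adaptability rule stated above — an N-state SDFSM locks onto any repeating pattern whose length L divides N — reduces to a divisibility check:

```python
# Which repeating pattern lengths an N-state SDFSM can capture:
# L is learnable exactly when L divides N.

def learnable(pattern_len, n_states):
    return n_states % pattern_len == 0

[l for l in range(1, 13) if learnable(l, 6)]    # lengths a 6-state captures
[l for l in range(1, 13) if learnable(l, 12)]   # lengths a 12-state captures
```

This makes the trade-off visible: 12 states cover every length 6 states do, plus 4 and 12, but at the cost of the longer training time the text attributes to larger SDFSMs.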
On the other hand, equake and mesa show decreased prediction accuracy for the 6-state SDFSM because these benchmarks likely have longer branch patterns, thus requiring more SDFSM states. On average, the 6- and 12-state SDFSMs decrease misprediction rates by 18.3% and 15.4% compared to the 2-bit saturating counter, respectively; the gap between them is due to the 12-state SDFSM's increased training time. Similar trends are evident when comparing 2- and 6-state SDFSMs, as well as any other SDFSMs with common divisors.

5.4 Latency

Few hardware resources are required to implement both the 2-bit saturating counter and the SDFSM predictors, and thus these techniques require only modest storage space. In addition, this inherent simplicity results in simple predictions and computations, which guarantees low prediction latency (a critical component for high performance in processors). The SDFSM-based predictor requires only a single cycle for training and prediction, while 2-bit saturating counter-based predictors require two cycles for training and predicting. Thus, the overall prediction latency of the SDFSM-based predictor is 50% less than that of the 2-bit saturating counter-based predictor, resulting in higher instructions-per-cycle (IPC).

6 Performance evaluation

Figure 6(a) showed that the counter-based predictor was more accurate on average than the SDFSM with respect to the arithmetic mean. However, the counter-based predictor's misprediction latency in cycles is twice that of the SDFSM, as was described in Section 5.4. The additional misprediction cycle adversely affects overall processor performance due to stalls while waiting for the training and subsequent prediction. Therefore, in order to more fairly compare complete predictor performance, we must consider the misprediction penalty latency in conjunction with the misprediction rate.
We evaluate the SDFSM and counter-based bimodal-type predictors with respect to the mispredictions per cycle (MPC) and the prediction accuracy rates (PAs) as determined by simulation. In order to provide an analysis that is independent of the processor clock speed, the misprediction rate is measured in cycles rather than in seconds, such that:

  MPC_counter = (100% - PA_counter) × 2 cycles

and:

  MPC_SDFSM = (100% - PA_SDFSM) × 1 cycle

Figure 8 shows the MPCs with respect to hardware budget in number of entries, and Figure 10 subsets these results as in Figure 6(b) (i.e., those benchmarks where the SDFSM showed improvement over the counter-based predictor with respect to misprediction rates). Similarly to the misprediction rates for these subsetted benchmarks, the MPCs for all SDFSMs improve with respect to the counter predictor, with an average overall performance increase of 37%. However, on average over all benchmarks, the counter-based predictor still had the lowest misprediction rate. Branch predictor performance can also be evaluated using the misprediction speedup, as derived in [17], such that:

  MPC Speedup = MPC_counter / MPC_SDFSM

Figure 9 shows the misprediction speedup versus hardware budget in number of entries for various SDFSM sizes compared to the counter-based predictor. These speedups are in line with speedups obtained for other recent innovations in branch predictors [18]-[20].

Figure 8: Mispredictions per cycle per benchmark with a hardware budget of 4KB.

7 Conclusion and future work

This paper proposes the shadow dynamic finite state machine (SDFSM), a new branch predictor whose FSM states are dynamically trained during run-time to learn unique branch pattern behaviors. Whereas the SDFSM can be generalized to any arbitrary number of states, we explored several SDFSM sizes and compared extensive simulation results on the SPEC2000 benchmark suite with 2-bit saturating counters using a conventional bimodal-based branch predictor.
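The MPC metric above is a direct product of misprediction rate and per-misprediction latency; in code (the saturation accuracies plugged in below are the averages reported in Section 4, used purely for illustration):

```python
# Mispredictions per cycle: misprediction rate (as a fraction) times the
# predictor's per-misprediction latency in cycles.

def mpc(accuracy_pct, penalty_cycles):
    return (100.0 - accuracy_pct) / 100.0 * penalty_cycles

mpc_counter = mpc(92.3, 2)   # counter: 2-cycle train-and-predict
mpc_sdfsm = mpc(91.0, 1)     # SDFSM: single cycle
speedup = mpc_counter / mpc_sdfsm
```

Even though the counter is the more accurate predictor here, its doubled penalty gives the SDFSM the lower MPC, which is exactly the effect the speedup metric is designed to expose.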
Results revealed that the SDFSM decreases the average misprediction rate for six benchmarks, which have irregular branching tendencies (i.e., those seen in artificial intelligence and gaming applications). Furthermore, in the situations where the SDFSM was slightly less accurate than the 2-bit predictor, this reduced accuracy was due to the nature of the bimodal predictor architecture (and not a failure of the SDFSM), which exhibits a large percentage of aliasing phenomena that severely affect our SDFSM automaton's prediction accuracy. The SDFSM will likely show marked improvements when coupled with predictors that are less affected by aliasing, such as PAs and GAs. In addition, the SDFSM uses a simple hardware structure, which provides single-cycle training and prediction latency; in contrast, the 2-bit counter predicts and corrects in two cycles. This single-cycle advantage for the SDFSM offsets the accuracy advantage of the 2-bit counter by trading off performance with respect to the instructions-per-cycle (IPC) rate. Finally, we explored and analyzed the number of SDFSM states in the scope of adaptability, training, damping, and aliasing in order to determine their effect on prediction accuracy. Results show that a 6-state SDFSM is a good average configuration for the bimodal predictor topology. Thus, our results encourage researchers to explore the SDFSM combined with more advanced predictor methods, thus improving the accuracy of those predictors.

Figure 10: Average mispredictions per cycle versus hardware budget in number of entries.

Our future work is motivated by the per-application variation in optimal SDFSM size as shown in Figure 5. Consequently, choosing the best number of states is a key design decision, since the SDFSM structure does not
dynamically alter its number of states based on pattern entries. Therefore, our future work includes architecting an adaptive SDFSM capable of dynamically altering its number of states based on actual branch pattern length.

Figure 9: Misprediction speedup versus hardware budget in number of entries for various SDFSM state sizes compared to the counter-based predictor.

References

[1] J. Smith, "A Study of Branch Prediction Strategies," International Symposium on Computer Architecture (ISCA), pp. 135-148, June 1981.
[2] C.-C. Lee, C. C. Chen, and T. N. Mudge, "The Bi-Mode Branch Predictor," International Symposium on Microarchitecture (MICRO-30), pp. 4-13, December 1997.
[3] T. Yeh and Y. Patt, "A Comparison of Dynamic Branch Predictors that Use Two Levels of Branch History," International Symposium on Computer Architecture (ISCA), pp. 257-266, June 1993.
[4] C. Young, N. Gloy, and M. D. Smith, "A Comparative Analysis of Schemes for Correlated Branch Prediction," International Symposium on Computer Architecture (ISCA), pp. 276-286, July 1995.
[5] A. Seznec, "Analysis of the O-GEometric History Length Branch Predictor," International Symposium on Computer Architecture (ISCA), pp. 394-405, June 2005.
[6] P. Biggar, N. Nash, K. Williams, and D. Gregg, "An Experimental Study of Sorting and Branch Prediction," Journal of Experimental Algorithmics (JEA), Vol. 12, Article 1.8, June 2008.
[7] Y. Ma, H. Gao, and H. Zhou, "Using Indexing Functions to Reduce Conflict Aliasing in Branch Prediction Tables," IEEE Transactions on Computers, Vol. 55, No. 8, pp. 1057-1061, August 2006.
[8] C. Y. Ho, K. F. Chong, C. H. Yau, and A. S. S. Fong, "A Study of Dynamic Branch Predictors: Counter versus Perceptron," International Conference on Information Technology (ITNG'07), pp. 528-536, April 2007.
[9] R.
Nair, "Optimal 2-Bit Branch Predictors," IEEE Transactions on Computers, Vol. 44, No. 5, pp. 698-702, May 1995.
[10] J. L. Hennessy and D. A. Patterson, Computer Architecture: A Quantitative Approach, Morgan Kaufmann Publishers, 3rd Edition, 2003.
[11] D. A. Jimenez and C. Lin, "Dynamic Branch Prediction with Perceptrons," International Symposium on High-Performance Computer Architecture (HPCA), pp. 197-206, January 2001.
[12] A. S. Fong and C. Y. Ho, "Global/Local Hashed Perceptron Branch Prediction," International Conference on Information Technology: New Generations (ITNG '08), pp. 247-252, April 2008.
[13] D. A. Jimenez, "Improved Latency and Accuracy for Neural Branch Prediction," ACM Transactions on Computer Systems (TOCS), Vol. 23, No. 2, pp. 197-218, May 2005.
[14] K. C. Breen and D. G. Elliott, "Aliasing and Anti-Aliasing in Branch History Table Prediction," Computer Architecture News, Vol. 31, No. 5, pp. 14, December 2003.
[15] T. Austin, D. Ernst, E. Larson, C. Weaver, R. Desikan, J. Huh, B. Yoder, D. Burger, and S. Keckler, SimpleScalar Tutorial 2001 (for release 4.0).
[16] SPEC 2000, The SPEC 2000 Benchmark Report, Waterside Associates, Fremont, CA, January 1990.
[17] M. Burtscher and B. G. Zorn, "Prediction Outcome History-based Confidence Estimation for Load Value Prediction," Journal of Instruction-Level Parallelism, Vol. 1, May 1999.
[18] T. H. Heil, Z. Smith, and J. E. Smith, "Improving Branch Predictors by Correlating on Data Values," 32nd Annual International Symposium on Microarchitecture (MICRO-32), pp. 28-37, November 1999.
[19] R. Thomas, M. Franklin, C. Wilkerson, and J. Stark, "Improving Branch Prediction by Dynamic Dataflow-based Identification of Correlated Branches from a Large Global History," 30th Annual International Symposium on Computer Architecture (ISCA), pp. 314-323, June 2003.
[20] R. Sendag, J. J. Yi, P. Chuang, and D. J.
Lilja, "Low Power/Area Branch Prediction Using Complementary Branch Predictors," IEEE International Symposium on Parallel and Distributed Processing (IPDPS 2008), pp. 1-12, June 2008.
[21] P. Chang, M. Evers, and Y. N. Patt, "Improving Branch Prediction Accuracy by Reducing Pattern History Table Interference," Proceedings of the 1996 Conference on Parallel Architectures and Compilation Techniques (PACT '96), pp. 48-57, October 1996.
[22] A. R. Talcott, M. Nemirovsky, and R. C. Wood, "The Influence of Branch Prediction Table Interference on Branch Prediction Scheme Performance," International Conference on Parallel Architectures and Compilation Techniques, pp. 89-98, June 1995.
[23] J. Stark, M. Evers, and Y. N. Patt, "Variable Length Path Branch Prediction," International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), pp. 170-179, December 1998.

Programming the Story: Interactive Storytelling System

SeokKyoo Kim, SungHyun Moon and SangYong Han
Seoul National University, Seoul, Republic of Korea
E-mail: {anemone, shmoon, syhan}@pplab.snu.ac.kr

Juno Chang
Sangmyung University, Seoul, Republic of Korea
E-mail: jchang@smu.ac.kr

Keywords: interactive storytelling, planning, game programming

Received: September 1, 2009

A multi-story can be generated by the interactions of users in an interactive storytelling system. In this paper, we suggest a narrative structure and a corresponding Storytelling Markup Language. Actor, Action, and Constraint are declared and programmed using the interactive storytelling system, which generates the stories. Generated stories can be transformed into multimedia formats such as text, images, and animations.

Povzetek: Opisan je sistem za generiranje mnogoterih zgodb.

1 Introduction

A term called 'Digital Storytelling' is being used in various sectors of society nowadays.
Studies are being conducted not only in academic fields that have traditionally dealt with storytelling, such as literature, but also in media studies, computer engineering, and others, owing to its involvement with digital technology. Although the term has not been defined concretely, it is recognized as storytelling in the digital era. One definition of digital storytelling is storytelling that is done by applying digital technology in the media environment or as a means of expression. In other words, digital storytelling in a broad sense indicates the case wherein digital technology is applied to the entire media environment in the production process of audiovisual materials, or, at least, to creating story and discourse as a means of expression. Interactive storytelling is one area of digital storytelling. This concept is a type of narrative using an interaction between the emotional and dramatic aspects of the story and the computer: storytelling in which the user exerts an influence so as to change the direction in which the story proceeds. The closest examples are edutainment and video games, in which the progress and ending of the story can vary depending on the choices made by the user. In games especially, such a characteristic can be found in RPGs (Role Playing Games), adventure games, visual novels, and others. Studies on such interactive storytelling are not yet being actively conducted, but various attempts are now being made along with the development of online content such as online games. This study considers the following as essential components of interactive storytelling. 1) Narrative Structure - The essential elements for processing stories and the structure to express such elements are required. 2) Script language to embody the narrative structure.
- A system, wherein the narrative structure is expressed in a type of language which authors or programmers can understand and the computer can process, needs to be arranged. 3) Story generator and authoring tool assisting in the generation of story and narrative structure even without professional knowledge of programming languages. - The development environment, which allows story makers to create stories easily even if they do not have knowledge of script languages, should be supported. The story generator interprets input information as narrative structure, converts it into the script language, and then generates the story based on it. The authoring tool is implemented as a graphical user interface environment including the story generator. Figure 1 shows the constituents of an interactive storytelling system.

Figure 1: An Interactive Storytelling System

Among the various important elements constituting an interactive storytelling system, the narrative structure and the interactive storytelling script language based on it are suggested in this paper. For this purpose, the existing script languages and their expression modes are described in Section 2, and the constraint-based narrative structure is suggested in Section 3. In Section 4, SML (Storytelling Markup Language) is suggested for the system. This script language is fundamental to an interactive storytelling system and is used to develop the story generator and authoring tools, making it possible to generate stories. The last section summarizes this study, discussing methods of utilizing this study and future tasks.

2 Related works

Languages that have been used in interactive storytelling systems so far can be largely divided into three types: natural language, logic programming languages, and markup languages derived from XML. A natural language is the most suitable language to narrate actions, characters, events, and other sources of story because it expresses a human language as it is.
Although using a natural language provides convenience for users and increases accessibility and legibility, there are many difficulties compared to processing the existing programming languages. In previous studies, such a natural language system was used in integration with a speech recognition system, which showed the final stories by processing the natural language that came through the speech recognition system. Since logic programming languages use an artificial intelligence planning technique in the narrative structure of an interactive storytelling system, programming languages that are compatible with logic programming are used. For example, there are STRIPS (STanford Research Institute Problem Solver) [2] and languages derived from STRIPS. STRIPS, which was introduced to solve AI problems, is the most suitable for expressing a planning algorithm of narrative structure, and thus it is used in interactive storytelling systems [3][4]. Lastly, there is XML (eXtensible Markup Language). HTML (Hyper-Text Markup Language), a subordinate concept of XML, is used as the standard output format of the World Wide Web all over the world, and many users thus know or can easily learn the format. Along with such an environmental factor, XML expresses all information in characters that users know, leading to high legibility. XML has been widely used as the standard to express information since it was proposed by the W3C (World Wide Web Consortium), and it has even been suggested that it might replace HTML. Many development tools already include libraries dealing with XML and are continuously developing them. Accordingly, XML has the advantage of being widely used. Most of all, XML, a language with the property of generality, can add various formats and express almost any imaginable format.
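To make the STRIPS approach mentioned above concrete, here is a minimal sketch of a STRIPS-style operator applied to a story world. The facts and the "open door" action are invented for illustration; this is not the representation used by any of the cited systems.

```python
# A STRIPS-style action has preconditions, an add list, and a delete
# list, applied to a world state represented as a set of ground facts.

def applicable(state, preconditions):
    return preconditions <= state        # all preconditions hold

def apply_op(state, preconditions, add_list, delete_list):
    assert applicable(state, preconditions)
    return (state - delete_list) | add_list

# Hypothetical story-world facts and an "open door" action:
state = {"at(hero, castle)", "has(hero, key)"}
pre = {"at(hero, castle)", "has(hero, key)"}
add = {"door_open(castle)"}
delete = {"has(hero, key)"}          # the key is consumed
new_state = apply_op(state, pre, add, delete)
```

A planner chains such operators until the facts describing the story's desired ending hold, which is how planning-based narrative structures generate the sequence of story events.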
XML-derived languages used in interactive storytelling systems include MPML (Multimodal Presentation Markup Language) [5], AIML (Artificial Intelligence Markup Language) [6], APML (Affective Presentation Markup Language) [7], FML (Functional Markup Language) [8], and BML (Behavior Markup Language) [9]. MPML is a markup language suitable for controlling the actions of characters in a lifelike way; it is a powerful language that can control the behavior of two-dimensional characters, the presentation flow, and the integration of external objects. FML and BML are designed to unify the representational framework for Embodied Conversational Agents (ECAs) so as to produce multimodal behaviors of computer-generated characters in a broad range of circumstances. APML is another attempt for ECAs, able to generate context-adapted behaviors based on a Mind-Body interface: the Mind represents the personality of an agent, while the Body reflects its appearance and expressive behaviors. However, since these are languages for controlling agents, they fall short as script languages for generating stories, which is what this study aims at. In the VISTA (Virtual Interactive Story Telling Agents) project [10], AIML was used to write the programs. AIML is an XML-style script language supporting AI application programs, and the VISTA system used AIML interlocked with Prolog: AIML handled the question-answer relationships applied to stories, while Prolog was used to generate various action rules. With AIML, however, all questions and answers must be defined in advance, and it is hard to produce varied results inferred from varying conditions. A further difficulty is that general users who are not familiar with programming must nevertheless know Prolog.
In addition to languages for formalizing narrative structure, authoring tools that assist in programming the language and specifying the information have been investigated and developed: INSCAPE [11] and PRISM [12]. In INSCAPE, an author writes an interactive story idea; prepares characters, props, and stages; and plans the entire flow of the story with those assets to achieve the desired goals. It also adopted an XML-style language called ICML (Inscape Communication Mark-up Language) as its underlying data model, designed to create interactive stories for edutainment, simulation, training, and other areas of nonlinear story. PRISM provides a story map for setting up an interactive story in a way similar to INSCAPE, and it adopts a hybrid narrative structure combining 'condition based branching narrative' and 'planning' methods to generate interactive stories. 3 Narrative structure As examined above, various script languages have been used in interactive storytelling systems. This paper proposes SML, based on XML. Although script languages derived from XML already exist, such as MPML and AIML, they cannot define the narrative structure or have difficulty generating a variety of stories. In comparison, the SML proposed in this study has the following strength: it defines the narrative structure so that authors can intuitively program stories without difficulty, and languages conforming to the XML format can then be defined according to this narrative structure. 3.1 Constraint based narrative structure Previous studies have expressed stories mainly in STRIPS or Lisp format in order to solve problems with AI planning techniques [1][4][13]. Planning algorithms for narrative structure include HTN (Hierarchical Task Network) and HSP (Heuristic Search Planning); STRIPS is used in systems based on both. The structure of an HTN can be represented as a tree in which the conclusion of the story is the root and each sub-conclusion is a child.
HTN planning employs a top-down method, which leads to good narrative coherence. In contrast, HSP generates a path from an initial state to a goal state, which makes it possible to generate stories flexibly [3]; the method introduced in this paper can be regarded as a kind of HSP. HTN constructs a story as a tree. The root node and each non-terminal node indicate the conclusion and the sub-conclusions of the story, respectively, while a terminal node indicates the occurrence of a certain action. To construct the story, the final conclusion is divided into several sub-conclusions, each of which is divided again into further sub-conclusions, and the story is generated by solving each of them; several actions together solve the smallest unit of sub-conclusion. In other words, the story is considered the combination of sub-conclusions, and the lowest-level sub-conclusion comprises a collection of a specific character's actions. The story is generated with an AI divide-and-conquer planning technique, in which the problem is divided into smaller units that are then solved and combined. In such a structure the conclusion cannot be reached unless the sub-conclusions are satisfied, which in turn guarantees a coherent flow of the story. On the other hand, the story is processed around only those actions that are preconditions of the conclusion, so only stories lying on paths of the HTN are formed, which decreases the degree of freedom. An ideal interactive storytelling system should give the story a high degree of freedom while maintaining a coherent flow. It is, however, not easy to realize a system satisfying both, because the two are usually in conflict [13].
This study aims at a direction of story generation that makes it possible for authors to adjust the degree of freedom and the coherence of stories. Previous studies maintained causal relationships through AI divide-and-conquer planning, but hit a limit on the degree of freedom. This study therefore proposes a form that escapes that frame by eliminating the hierarchical structure. In other words, contrary to the existing structures, the story is seen as a continuous sequence of character actions, with the sub-conclusion level removed. In place of the eliminated sub-conclusions, constraint conditions such as causal or temporal relationships between actions are declared in order to maintain the unity of the story, and by adjusting the strength of these constraints, the coherence and the degree of freedom can be balanced to the extent the author wants. The narrative structure proposed in this study is divided into a constituent declaration and a constraint declaration. The constituent declaration defines actors, action properties, stages, and props, while the constraint declaration provides the various condition-relation structures that control the flow of the story and support its nonlinear progress. 3.2 Constituent declaration In the constituent declaration, the basic constituents necessary for stories are declared: property, stage, actor, props, action, and others. Figure 2 shows the constituents defined in the declaration and the interdependence among them. Figure 2: Declaration Constituents and Relationships - Property A property expresses a characteristic of actors or props numerically; actors and props can hold a numerical value for each property they need. - Stage A stage represents the space where the story proceeds.
- Actor An actor is the constituent that becomes the subject or the object of the story: it performs actions or is acted upon. Each actor has a name and contains the types of properties the actor has, the value of each, and the stage information marking the space in which the actor is at the very beginning of the story. - Props Props are things that can only be the object, not the subject, of actions. Thus, things or actors that cannot be the subject of actions are defined as props. - Action An action is a motion that actors can take. It specifies the actors who can carry it out, the properties necessary for it, the stages where it can occur, and how many objects it takes. In this constraint-based narrative structure, properties are shared between actors and actions. This is the device by which more logical choices can be made in deciding the motions of actors. Consider an example of its use. Assume an actor 'A' has the properties talkative = "80" and aggressive = "30", the action 'attack' has the properties talkative = "30" and aggressive = "80", and the action 'talk' has the properties talkative = "80" and aggressive = "30". If actor 'A' is in a situation where he or she must perform either 'attack' or 'talk', the probability of carrying out the 'talk' motion increases because of the property values. A story generator must fully utilize this structure in order to establish the balance between the characteristics and the actions of actors. The constituents listed above are combined to generate an event. In brief, an event means that a specific actor carries out a certain motion, where the defined form of the action determines whether an object is present. Figure 3: Components of event Figure 3 shows the necessary components of an event.
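The property-matching idea above can be made concrete with a small sketch. This is an illustrative assumption, not the paper's algorithm: the function names and the scoring rule (each shared property contributes 100 minus the absolute difference between the actor's and the action's values) are invented here to show how shared properties can bias action choice.

```python
import random

def similarity(actor_props, action_props):
    """Score an action by how closely its property profile matches the
    actor's: each shared property contributes 100 - |actor - action|."""
    shared = actor_props.keys() & action_props.keys()
    return sum(100 - abs(actor_props[p] - action_props[p]) for p in shared)

def choose_action(actor_props, actions):
    """Pick one action, with probability proportional to its score."""
    scores = {name: similarity(actor_props, props)
              for name, props in actions.items()}
    chosen = random.choices(list(scores), weights=list(scores.values()))[0]
    return chosen, scores

# The values assumed in the text: actor A is talkative, not aggressive.
actor_a = {"talkative": 80, "aggressive": 30}
actions = {"attack": {"talkative": 30, "aggressive": 80},
           "talk":   {"talkative": 80, "aggressive": 30}}
_, scores = choose_action(actor_a, actions)
# 'talk' matches A exactly (score 200) while 'attack' differs by 50 on
# both properties (score 100), so 'talk' is twice as likely to be chosen.
```

Under such a scheme the author tunes an actor's behavior simply by editing property values in the constituent declaration, which is the balance between characteristics and actions that the structure is meant to support.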
Concretely, an event combines actor, stage, action, and props in the form 'a specific actor does what action (to something) (to someone) where'. The generation of a story in this structure means the generation of such events. In addition, a special component called 'viewpoint' is introduced to reduce the complexity of story generation. It indicates the stage at which the user is currently looking. All story generation is limited to the current viewpoint, just as in a play the audience sees only one stage; and just as a curtain comes down when the set changes, a change of viewpoint is required to change the stage. As an initial stage must be specified in a play, the viewpoint for the beginning of the story must be specified in the constituent declaration as well. 3.3 Constraint declaration In the constraint declaration, the constraint conditions that control the direction of the narrative flow are declared. These conditions act as a framework holding the direction in which the story proceeds and preventing it from taking a wrong turn. The degree of freedom is determined by the strength of the constraint conditions; their types are shown below. The descriptions here are conceptual, so an actual script language should additionally support a wildcard character ('?') and logical operations in order to control the stories more delicately. 1) Ending : (event) → (end) This condition ends the story: if a given event occurs, the story ends. Multiple endings can be set up by assigning several kinds of events. 2) Transaction : (pre event) → (post event) Two events occur together, like a transaction. This condition is set in order to induce another event to occur in sequence upon the occurrence of a specific event.
In a script language, post-events should be able to refer to the subject and the object of the pre-event, so that extensive forms of the transaction condition can be written. For example, assume the wildcard character '?', designating any actor, is given as a condition, and the pre-event is defined as '?' hit '?'. It should then be possible to define the post-event as '[the object of the pre-event] hit [the subject of the pre-event]'; in this case, the subject and the object of the post-event are determined at the moment the pre-event occurs. 3) Transition : (event/action) → (change property) A transition constraint changes the value of a specific property upon the occurrence of a designated event or action. The condition is set, for example, as 'reduce the health property of the object of a hit action'. This is useful for defining motions accompanied by changes of a specific state. 4) Induction : (condition of property) → (event/action) An induction constraint declares that an event or motion is performed when a specific property satisfies a certain condition; for example, 'eat rice when hungry is less than 30'. This constraint is assigned to make an actor perform a specific motion according to a change of state. 5) Must : (pre event) → (post event) This constraint condition serves several functions. Its meaning is: 'if the pre-event occurs without the post-event having occurred, the post-event must occur before the story ends.' In other words, it is used when, in a causal relationship, the event corresponding to the effect must occur whenever the event corresponding to the cause occurs. A negative option can also be placed on the post-event so that the post-event must never occur once the pre-event has occurred; this forbids a particular event from ever following a specific event.
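The wildcard and back-reference mechanism of the transaction constraint can be sketched as follows. This is a minimal illustration under assumed representations, not part of SML: events are (subject, action, object) triples, and the names `bind_post_event`, `"pre.subject"`, and `"pre.object"` are invented here.

```python
def bind_post_event(pre_pattern, post_template, event):
    """If `event` matches `pre_pattern` (where '?' matches any actor),
    return the concrete post-event with references resolved; else None."""
    subj, act, obj = event
    p_subj, p_act, p_obj = pre_pattern
    if act != p_act:
        return None
    if p_subj not in ("?", subj) or p_obj not in ("?", obj):
        return None
    # Resolve back-references to the matched pre-event.
    refs = {"pre.subject": subj, "pre.object": obj}
    return tuple(refs.get(part, part) for part in post_template)

# Pre-event pattern: '?' hit '?'; post-event: the object hits back.
post = bind_post_event(("?", "hit", "?"),
                       ("pre.object", "hit", "pre.subject"),
                       ("Frodo", "hit", "Gollum"))
# post == ("Gollum", "hit", "Frodo"), bound at the moment the pre-event occurs
```

The binding happens at match time, which is exactly the behavior the text requires: the post-event's subject and object are fixed only when a concrete pre-event occurs.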
For the 'must' condition, an event that must occur in any case can also be indicated by setting only the post-event, with no pre-event. 6) Ordering : (pre event) → (post event) This declaration adjusts the logical flow of the story. It makes no difference if the events assigned here never occur; it only states that 'the pre-event must always occur before the post-event'. In other words, if the pre-event does not occur, the post-event cannot occur either. 7) Stage Change : (event) → (change stage) This constraint changes the current stage of the subject or the object of an event when a specific event occurs. It is similar to having characters withdraw from or appear on the stage in a play. In summary, the basic environment and constituents required for the story are defined in the constituent declaration, and the constraint items holding the outline of the overall direction of the story are assigned in the constraint declaration. 4 SML (Storytelling Markup Language) 4.1 SML The narrative structure for interactive storytelling was defined above. However, that structure is merely a kind of abstract data type. In order to generate stories, its components must be expressed in a definite language that the story generation engine can process. A language for expressing the narrative structure requires the following characteristics. First, legibility should be good enough that users can easily understand the meaning without difficulty and add what they want. There would normally be no occasion to write code by hand, because authoring tools are used; good legibility, however, ensures there is no difficulty even when code is written directly. Second, there should be good expressiveness. In the structure proposed above, an actor or an action has subordinate constituents in complex forms.
In particular, the definition of a motion demands information about the properties necessary for the motion, the actors who can perform it, and the stages where it can occur, and each motion has a different number of such constituents. The structures defined in the constraint declaration are combined with 'AND' or 'OR' relationships and demand a wildcard character and a reference structure, which makes them more complex still. The language must be able to express such complex, many-sided structures. Third, the structure should be easy to process. An authoring tool or a story generator must generate and parse code written in the language, so the structure of the language should make it easy to develop tools that support it. This characteristic is especially important when the authoring tool, the story generator, and the script language are independent of each other, since it even makes it possible to develop various authoring tools and story generators for the language afterwards. The language should be defined with the above conditions in mind in order to achieve satisfactory results. Considering the structures of existing languages, the XML (eXtensible Markup Language) format comes to mind first. XML can be seen as a super-ordinate concept of HTML (Hyper-Text Markup Language). HTML is well known to many users because it is the standard output mode of the World Wide Web; many users, in turn, can read and write HTML, so the XML mode can be considered to have quite excellent legibility. In addition, XML itself is a language that can freely accommodate various formats, leading to fairly good expressiveness. Also, since XML is widely known and used, there are many related libraries.
Since languages are easy to handle with the help of libraries, the XML format is, in a sense, an 'easy-to-handle structure'. As described, XML satisfies all the conditions listed above. This study therefore defines the previously presented narrative structure in the XML format, naming the language SML (Storytelling Markup Language). The DTD (Document Type Definition) of SML is not included in this paper, since the constituents and the constraint conditions of the narrative structure are large in quantity and complicated. 4.2 An example of SML Figure 4: An Example of Story Structure. Figure 4 illustrates part of a story, while Figure 5 shows, in SML, only the part necessary for the story illustrated in Figure 4. The event 'Frodo finds the One Ring in the Middle Earth' or 'Gollum finds the One Ring in the Middle Earth' is set to occur for certain with a 'must' constraint that has no pre-event. With a 'transaction' constraint, upon generation of an event 'somebody finds the One Ring in the Middle Earth', 'the finder rules or is ruled by the One Ring'; whether the finder is ruled by the One Ring depends on the properties of Frodo and Gollum. The property value of 'ruling' is higher than that of 'being ruled by' in the case of Frodo, while the property value of 'being ruled by' is higher for Gollum. Thus an event in which the action 'rule' occurs is more likely to be generated for Frodo, while an event in which the action 'be ruled by' occurs becomes more likely for Gollum. 4.3 Story generator In the previous sections, the narrative structure was defined, along with the language to express it. At this point, a compiler or interpreter is needed to interpret code written in the language and perform the operations: in other words, a story generator, which receives SML code and generates stories, is required.

Figure 5: An Example of SML. The listing declares the stage MiddleEarth; the actors Frodo and Gollum together with their property values; the prop TheOneRing; the actions 'find', 'rule', and 'ruledby', each with its permitted subjects (Frodo, Gollum) and stage; and the 'must' and 'transaction' (transaction0) constraints described above.
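As a first step, a story generator must parse the SML declarations. The paper does not publish the DTD of SML, so the document in the sketch below is a hypothetical reconstruction: the tag and attribute names (actor/actorId, stageR/stageId, transaction/transactionId, actionR/actionId) follow the fragments visible in Figure 5, but the overall structure is an assumption made for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical SML fragment; tag/attribute names inferred from Figure 5.
SML = """
<sml>
  <constituents>
    <stage stageId="MiddleEarth"/>
    <actor actorId="Frodo"><stageR stageId="MiddleEarth"/></actor>
    <actor actorId="Gollum"><stageR stageId="MiddleEarth"/></actor>
    <props propId="TheOneRing"/>
  </constituents>
  <constraints>
    <transaction transactionId="transaction0">
      <preEvent><actionR actionId="find"/></preEvent>
      <postEvent><actionR actionId="rule"/></postEvent>
    </transaction>
  </constraints>
</sml>
"""

# Parse the declarations into plain Python values the generator can use.
root = ET.fromstring(SML)
actors = [a.get("actorId") for a in root.iter("actor")]
transactions = [t.get("transactionId") for t in root.iter("transaction")]
# actors == ["Frodo", "Gollum"]; transactions == ["transaction0"]
```

From these parsed declarations, a generator would then repeatedly propose candidate events, filter them through the declared constraints, and stop when an 'ending' constraint fires.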