Volume 26 Number 4 December 2002 ISSN 0350-5596

Informatica
An International Journal of Computing and Informatics

The Slovene Society Informatika, Ljubljana, Slovenia

Archive of abstracts may be accessed at USA: http://, Europe: http://ai.ijs.si/informatica, Asia: http://www.comp.nus.edu.sg/~liuh/Informatica/index.html.

Subscription Information: Informatica (ISSN 0350-5596) is published four times a year in Spring, Summer, Autumn, and Winter (4 issues per year) by the Slovene Society Informatika, Vožarski pot 12, 1000 Ljubljana, Slovenia.

The subscription rate for 2002 (Volume 26) is
- USD 80 for institutions,
- USD 40 for individuals, and
- USD 20 for students.

Claims for missing issues will be honored free of charge within six months after the publication date of the issue.

Technical Support: Borut Žnidar, Kranj, Slovenia.
Lectorship: Fergus F. Smith, AMIDAS d.o.o., Cankarjevo nabrežje 11, Ljubljana, Slovenia.
Printed by Biro M, d.o.o., Žibertova 1, 1000 Ljubljana, Slovenia.

Orders for subscription may be placed by telephone or fax using any major credit card. Please call Mr. R. Murn, Jožef Stefan Institute: Tel (+386) 1 4773 900, Fax (+386) 1 219 385, or send checks or a VISA or Eurocard card number, or use the bank account number 900-27620-5159/4 Nova Ljubljanska Banka d.d. Slovenia (LB 50101-678-51841 for domestic subscribers only).

Informatica is published in cooperation with the following societies (and contact persons):
Robotics Society of Slovenia (Jadran Lenarčič)
Slovene Society for Pattern Recognition (Franjo Pernuš)
Slovenian Artificial Intelligence Society; Cognitive Science Society (Matjaž Gams)
Slovenian Society of Mathematicians, Physicists and Astronomers (Bojan Mohar)
Automatic Control Society of Slovenia (Borut Zupančič)
Slovenian Association of Technical and Natural Sciences / Engineering Academy of Slovenia (Igor Grabec)
ACM Slovenia (Dunja Mladenić)

Informatica is surveyed by: AI and Robotic Abstracts, AI References, ACM Computing Surveys, ACM Digital Library, Applied Science & Techn. Index, COMPENDEX*PLUS, Computer ASAP, Computer Literature Index, Cur. Cont. & Comp. & Math. Sear., Current Mathematical Publications, Cybernetica Newsletter, DBLP Computer Science Bibliography, Engineering Index, INSPEC, Linguistics and Language Behaviour Abstracts, Mathematical Reviews, MathSci, Sociological Abstracts, Uncover, Zentralblatt für Mathematik.

The issuing of the Informatica journal is financially supported by the Ministry of Education, Science and Sport, Trg OF 13, 1000 Ljubljana, Slovenia.

Postage paid at post office 1102 Ljubljana. Slovenia, taxe perçue.

INFORMATICA
AN INTERNATIONAL JOURNAL OF COMPUTING AND INFORMATICS

INVITATION, COOPERATION

Submissions and Refereeing
Please submit three copies of the manuscript with good copies of the figures and photographs to one of the editors from the Editorial Board or to the Contact Person. At least two referees outside the author's country will examine it, and they are invited to make as many remarks as possible directly on the manuscript, from typing errors to global philosophical disagreements. The chosen editor will send the author copies with remarks. If the paper is accepted, the editor will also send copies to the Contact Person.
The Executive Board will inform the author that the paper has been accepted, in which case it will be published within one year of receipt of e-mails with the text in Informatica LaTeX format and figures in .eps format. The original figures can also be sent on separate sheets. Style and examples of papers can be obtained by e-mail from the Contact Person or from FTP or WWW (see the last page of Informatica).

Opinions, news, calls for conferences, calls for papers, etc. should be sent directly to the Contact Person.

QUESTIONNAIRE
Send Informatica free of charge
Yes, we subscribe

Please complete the order form and send it to Dr. Rudi Murn, Informatica, Institut Jožef Stefan, Jamova 39, 1111 Ljubljana, Slovenia.

Since 1977, Informatica has been a major Slovenian scientific journal of computing and informatics, including telecommunications, automation and other related areas. In its 16th year (more than five years ago) it became truly international, although it still remains connected to Central Europe. The basic aim of Informatica is to impose intellectual values (science, engineering) in a distributed organisation.

Informatica is a journal primarily covering the European computer science and informatics community - scientific and educational as well as technical, commercial and industrial. Its basic aim is to enhance communications between different European structures on the basis of equal rights and international refereeing. It publishes scientific papers accepted by at least two referees outside the author's country. In addition, it contains information about conferences, opinions, and critical examinations of existing publications and news. Finally, major practical achievements and innovations in the computer and information industry are presented through commercial publications as well as through independent evaluations.

Editing and refereeing are distributed. Each editor can conduct the refereeing process by appointing two new referees or referees from the Board of Referees or Editorial Board. Referees should not be from the author's country. If new referees are appointed, their names will appear in the Refereeing Board.

Informatica is free of charge for major scientific, educational and governmental institutions. Others should subscribe (see the last page of Informatica).

ORDER FORM - INFORMATICA
Name: ....................................................
Title and Profession (optional): .....................................................................
Home Address and Telephone (optional): ...........................................................
Office Address and Telephone (optional): ...........................................................
E-mail Address (optional): .............
Signature and Date: ...................
Informatica WWW: http://ai.ijs.si/informatica/ http://orca.st.usm.edu/informatica/ Referees: Witold Abramowicz, David Abramson, Adel Adi, Kenneth Aizawa, Suad Aiagić, Mohamad Alam, Dia Ali, Alan Aliu, Richard Amoroso, John Anderson, Hans-Jurgen Appelrath, Ivän Araujo, Vladimir Bajič, Michel Barbeau, Grzegorz Bartoszewicz, Catriel Been, Daniel Beech, Fevzi Belli, Simon Beloglavec, Sondes Bennasri, Francesco Bergadano, Istvan Berkeley, Azer Bestavros, Andraž Bežek, Balaji Bharadwaj, Ralph Bisland, Jacek Blazewicz, Laszlo Boeszoermenyi, Damjan Bojadžijev, Jeff Bone, Ivan Bratko, Pavel Brazdil, Boštjan Brumen, Jerzy Brzezinski, Marian Bubak, Davide Bugali, Troy Bull, Leslie Burkholder, Frada Burstein, Wojciech Buszkowski, Rajkumar Bvyya, Netiva Caftori, Particia Carando, Robert Cattral, Jason Ceddia, Ryszard Choras, Wojciech Cellary, Wojciech Chybowski, Andrzej Ciepielewski, Vie Ciesielski, Mei Ó Cinnéide, David Cliff, Maria Cobb, Jean-Pierre Corriveau, Travis Craig, Noel Craske, Matthevi- Crocker, Tadeusz Czachorski, Milan Češka, Honghua Dai, Bart de Decker, Deborah Dent, Andrej Dobnikar, Sait Dogru, Peter Dolog, Georg Dorfner, Ludoslaw Drelichowski, Matija Drobnič, Maciej Drozdowski, Marek Druzdzei, Marjan Družovec, Jozo Dujmović, Pavol Duriš, Amnon Eden, Johann Eder, Hesham El-Rewini, Darrell Ferguson, Warren Fergusson, David Fiater, Pierre Flener, Wojciech Fliegner, Vladimir A. Fomichov, Terrence Forgarty, Hans Fraaije, Hugo de Gans, Eugeniusz Gatnar, Grant Gayed, James Geller, Michael Georgiopoius, Michael Gertz, Jan Golinski, Janusz Gorski, Georg Gottlob, David Green, Herbert Groiss, Jozsef Gyorkos, Marten Haglind, Abdelwahab Hamou-Lhadj, Inman Harvey, Jaak Henno, Marjan Hericko, Elke Hochmueller, Jack Hodges, Doug Howe, Rod Howell, Tomžš Hruška, Don Huch, Simone Fischer-Huebner, Alexey Ippa, Hannu Jaakkola, Sushil Jajodia, Ryszard Jakubowski, Piotr Jedrzejowicz, A. Milton Jenkins, Eric Johnson, Pelina Jordanova, Djani Juričič, Marko Juvancic, Sabhash Kak, Li-Shan Kang, Ivan Kapust0k, Orlando Karam, Roland Kaschek, Jacek Kierzenka, Jan Kniat, Stavros Kokkotos, Fabio Kon, Kevin Korb, Giiad Koren, Andrej Krajne, Henryk Krawczyk, Ben Kroese, Zbyszko Krolikowski, Benjamin Kuipers, Matjaž Kukar, Aarre Laakso, Les Labuschagne, Ivan Lah, Phil Laplante, Bud Lawson, Herbert Leitold, Ulrike Leopold-Wildburger, Timothy C. Lethbridge, Joseph Y-T. Leung, Barry Levine, Xuefeng Li, Alexander Linkevich, Raymond Lister, Doug Locke, Peter Lockeman, Matija Lokar, Jason Lovi'der, Kim Teng Lua, Ann Macintosh, Bernardo Magnini, Andrzej Malachowski, Peter Marcer, Andrzej Marciniak, Witold Marciszewski, Vladimir Marik, Jacek Martinek, Tomasz Maruszewski, Florian Matthes, Daniel Memmi, Timothy Menzies, Dieter Merkl, Zbignievi' Michalewicz, Gautam Mitra, Roland Mittermeir, Madhav Moganti, Reinhard Moller, Tadeusz Morzy, Daniel Mossé, John Mueller, Jari Multisilta, Hari Narayanan, Jerzy Nawrocki, Ranee Necaise, Elzbieta Niedzielska, Marian Niedq'zwiedziriski, Jaroslav Nieplocha, Oscar Nierstrasz, Roumen Nikolov, Mark Nissen, Jerzy Nogieć, Stefano Nolfi, Franc Novak, Antoni Nowakowski, Adam Nowicki, Tadeusz Nowicki, Daniel Olejar, Hubert Osterie, Wojciech Olejniczak, Jerzy Olszewski, Cherry Owen, Mieczyslaw Owoc, Tadeusz Pankowski, Jens Penberg, William C. 
Perkins, Warren Persons, Mitja Peruš, Stephen Pike, Niki Pissinou, Aleksander Pivk, Ullin Place, Gabika Polčicová, Gustav Pomberger, James Pomykalski, Dimithu Prasanna, Gary Preckshot, Dejan Rakovič, Cveta Razdevšek Pučko, Ke Qiu, Michael Quinn, Gerald Quirchmayr, Vojislav D. Radonjic, Luc de Raedt, Ewaryst Rafajlowicz, Sita Ramakrishnan, Kai Rannenberg, Wolf Rauch, Peter Rechenberg, Felix Redmill, James Edward Ries, David Robertson, Marko Robnik, Colette Rolland, Wilhelm Rossak, Ingrid Russell, A.S.M. Sajeev, Kimmo Salmenjoki, Pierangela Samarati, Bo Sanden, P. G. Sarang, Vivek Sarin, Iztok Savnik, Ichiro Satoh, Walter Schempp, Wolfgang Schreiner, Guenter Schmidt, Heinz Schmidt, Dennis Sewer, Zhongzhi Shi, Mária Smolárová, Carine Souveyet, William Spears, Hartmut Stadtler, Oliviero Stock, Janusz Stoklosa, Przemyslaw Stpiczynski, Andrej Stritar, Maciej Stroinski, Leon Strous, Tomasz Szmuc, Zdzislaw Szyjewski, Jure Šilc, Metod Škarja, Jiří Šlechta, Chew Lim Tan, Zahir Tari, Jurij Tasič, Gheorghe Tecuci, Piotr Teczynski, Stephanie Teufel, Ken Tindell, A Min Tjoa, Vladimir Tosic, Wieslaw Traczyk, Roman Trobec, Marek Tudruj, Andrej Ule, Amjad Umar, Andrzej Urbanski, Marko Uršič, Tadeusz Usowicz, Romana Vajde Horvat, Elisabeth Valentine, Kanonkluk Vanapipat, Alexander P. Vazhenin, Jan Verschuren, Zygmunt Vetulani, Olivier de Vel, Valentino Vranić, Jožef Vyskoc, Eugene Wallingford, Matthew Warren, John Weckert, Michael Weiss, Tatjana Welzer, Lee White, Gerhard Widmer, Stefan Wrobel, Stanislaw Wrycza, Janusz Zalewski, Damir Zazula, Yanchun Zhang, Ales Zivkovic, Zonling Zhou, Robert Zorc, Anton P. Železnikar

An Assessment of the Organization Virtuality with Three Different Reference Models

Cene Bavec
School of Management in Koper, Slovenia
Phone: +386 5 610 2000; fax: +386 5 610 2015
E-mail: cene.bavec@guest.arnes.si

Keywords: virtual organization, level of virtuality, modeling organization, colored Petri nets, virtual government

Received: May 17, 2002

The main objective of the research was to test a holistic view on virtual organizations with different perceptions of virtuality. Traditional and virtual organizations can be seen as two extremes of a more general model of organization. To assess the transition from traditional to virtual organizations we have to grade organizational virtuality. In the paper we discuss three fundamentally different reference models used to assess this transition. Two models are well known: Mowshowitz's switching principle, and the Model of Business Networking (MBN) as a representative of the models preferred by ICT experts, who see virtual organizations through the implementation of ICT, particularly the Internet. To capture other characteristics of virtual organizations we also present a model based on Colored Petri nets and fuzzy logic that we originally used to study organized anarchy. All three models were applied to assess the case of the Customs Administration in Slovenia. The assessment confirms that the organization of the customs services clearly demonstrates efficient utilization of the Internet and other features of virtual organizations.

1 Introduction

Many authors argue that the theory of virtual organizations leads to a generalization of the traditional organization theory.
It is not yet a prevailing organizational concept (Klüber et al., 1999), but Internet-based, networked and virtual organizations have already proved to be an efficient organizational paradigm that has brought the business world a higher level of flexibility, efficiency and resource utilization, and better customer services. An intuitive perception of virtual organizations is often inadequate and misleading, so we are still searching for new managerial principles and practical tools for everyday management that could replace the still prevailing traditional organizational principles born in the industrial age.

The theory of virtual organizations is presently very chaotic. We still have not developed practically useful indicators to make an objective assessment of virtual organizations and to distinguish them from other forms of organization. We have learned from practical experience that it is not realistic to classify organizations into only two classes - virtual and traditional (Jansen et al., 1999). These should be seen as two extremes of a more general model of organizations. If we want to describe and assess the transition from traditional to virtual organizations, we have to grade their virtuality.

In this paper we discuss indicators that could assess the virtuality of a specific business organization. In the case presented we studied a real business environment to underline practical issues of virtual organizations and to raise the general issue of their virtuality and efficiency. We used three very different reference models of virtual organizations. Two models are well known - the switching principle and metamanagement (Mowshowitz, 1999) and the Model of Business Networking (Klüber et al., 1999), which represents a class of models used mainly by IT and Internet specialists. To learn more about the structure and the internal nature of virtual organizations we also present a formal definition of the organization, based on Colored Petri nets and fuzzy set logic. The definition was implemented in a computer model of non-hierarchical organizations and organized anarchy (Bavec, 2001). The model was not initially designed to describe virtual organizations. Nevertheless, it predicted some features of non-traditional organizations, such as fuzziness of organizational rules and boundaries.

2 Reference Models Used

The main objective of the research was to test a holistic view on virtual organizations with different perceptions of virtuality:
• the Switching Principle - mainly a managerial view, with emphasis on organizational flexibility and manageability,
• the Model of Business Networking - it defines inter-organizational relations and gives a predominantly ICT view on organization, with emphasis on modularity and business transparency,
• the model based on the Colored Petri nets formalism - an internal view on "fuzzy" organizational structures and information flows.

We were well aware that the models are not compatible and not even comparable, but we intentionally selected such different perceptions of virtual organizations. The goal was to gain a deeper understanding of potential indicators that could be used in their assessment. To overcome this methodological obstacle we used the models to separately assess the seven features of virtual organizations proposed by Mertens et al.
(1998):
• boundary crossing
• complementary core competencies
• sharing of knowledge
• geographical dispersion
• changing participants
• participant equality
• electronic communication

2.1 Switching Principle and Metamanagement

The first reference model we used was based on the switching principle and metamanagement introduced by Mowshowitz (1999). In the simplest terms, we could describe it as the ability of an organization to dynamically select the best performer or executor (need-fulfillment) for a particular task (need). That means that an organization treats tasks separately from their potential performers. Switching takes place when replacing one performer would bring benefits that are greater than the direct and indirect costs of replacement. Another concept introduced by Mowshowitz is metamanagement. It is basically the management of virtually organized tasks and the managerial implementation of the switching principle.

This principle may seem trivial at first glance, but it opens an entirely new view on the organization. We should notice that in traditional organization theory and practice it was always a sign of serious mis-planning or "bad organization" when we had to change (switch) a performer in the production phase of a task, when the organization was already implemented. The possibility of switching undoubtedly adds to organizational and managerial flexibility, but the question that remains is just how realistic it could be in everyday business. The basic idea of virtuality is that switching can be done relatively fast. It would be difficult for the management to carry out all traditional risk analyses, so trust becomes an important decision factor and even an economic factor. We have to trust the new partner (Ishaya, Macaulay, 1999) and be reasonably confident that he will integrate into our operations and perform his role according to our expectations. If the mistrust is too high it could overwhelm the other benefits.

From Mowshowitz's model we could assume that the level of virtuality is correlated with the ability to implement the switching principle or metamanagement. We could use it to assess changing participants and participant equality from Mertens' list of virtual organization features.

2.2 ICT Oriented Models of Virtual Organization

Another, very different perception of virtual organization is seen in the Model of Business Networking (Klüber et al., 1999). It is a typical representative of the models preferred by ICT experts, as they see virtual organizations through the implementation of ICT, particularly the Internet. But the model is more general and incorporates important features of virtual organizations that are highly relevant for the management. The Model of Business Networking (MBN) has the following elements that were part of our assessment of a real business environment:
• Customer processes (determine the design of a value chain),
• Integrators and Aggregators (third parties included in business relationships),
• Business Bus (logical space where complex services among business partners are flexibly and efficiently exchanged with the support of service providers),
• Business Ports (standardized interfaces to access the Business Bus).

Similar approaches are widely used, often under different names, in the design and implementation of information systems based on open networking like the Internet. According to the MBN, integrators and aggregators are an essential element of networked and virtual organizations.
They provide different business services: knowledge, coordination, process, information, and transaction services. They act in a way that "softens" or even eliminates organizational boundaries between business partners. The Business Bus and the Business Ports describe inter-organizational relations and interfaces that define mainly the information structure of virtual organizations. More generally, however, they describe complementary core competencies, sharing of knowledge, geographical dispersion and electronic communication.

3 Modeling Organizations with Colored Petri Nets

3.1 Rationale Behind the Model

In the year 1994 we developed a model of organization based on extended Colored Petri nets and fuzzy logic to study organized anarchy and the influence of information systems on the organization. Their superior semantic power makes possible a very rich representation of the organization and overcomes some limitations of classical representations. A general definition of the organization was based on abstract fuzzy sets with axiomatically assigned properties. The organization was defined as a set of rules that determine the chain of authority, the description of working (organizational) places, and other organizational relations. It also determines the conditions under which organizational processes can change their states. The quality of an organization can be measured only through its impact on processes, so it must be modeled together with them. The Petri nets proved to be an efficient way to combine the organizational structure and the processes in the organization.

This methodology was used for modeling properties that reflect ambiguity or deviation from the traditional hierarchical organization. The study also exposes a paradigm that could be called the informed anarchy paradigm, in analogy with the organized anarchy. The informed anarchy paradigm is based on the empirically founded observation that unclear technologies for allocating and disseminating information, as well as faulty understanding of information, are prevailing properties of organizations. An object-oriented model was developed that interlinks a formal organization with decision and information processes into an integral model. It was based on three classes of objects: organization, decision processes and information processes. An important feature of the model was its ability to model conditions on the micro level, that is, both on the level of the working (organizational) place and of the individual process. Further development confirmed that the same model also represents some relevant features of virtual organizations.

3.2 Model of Organization Based on Colored Petri Nets

Petri nets are well proven tools for systems modeling that can describe dynamic and static properties of a system. Similar situations, only much more complex, can be met in business organizations, so we experimented with the possibility of modeling them with Colored Petri nets (CPN). We published some results of the computer simulation (Bavec, 2001) - the relation between the level of organizational anarchy, the load of problems, formal and informal information systems, and the efficiency of decision-making.

We defined an organization ORG as a 12-tuple, an extended Colored Petri net (Bavec, 1995):

ORG = (B, P, T, D, C, R, I, O, δ, η, ρ, σ)

where
B = (b1, b2, ..., bk) is a finite set of colors,
P = (p1, p2, ..., pn) is a finite set of places,
T = (t1, t2, ..., tm) is a finite set of transitions,
D = (d1, d2, ..., ds) is a finite set of organizational places (working places, divisions, etc.),
C = (c1, c2, ..., ct) is a finite set of concepts or objects,
R = (r1, r2, ..., ru) is a finite set of organizational relations, r ⊆ D × D,
I: T → P∞ is an input function that maps transitions ti into (multi-sets of) places pi,
O: T → P∞ is an output function that maps transitions ti into (multi-sets of) places pi,
δ: D → P is a function that maps organizational places di ∈ D into places pi ∈ P,
η: C → P is a function that maps concepts or objects ci ∈ C into places pi ∈ P,
ρ: R → T is a function that maps organizational relations ri ∈ R into transitions ti ∈ T,
σ: Z → P is a function that maps multi-sets of tokens z(pi) ∈ Z into places pi ∈ P,
λ ∈ [0, 1] is a threshold, an additional condition for firing transitions.
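To make the formal definition above more tangible, the following is a minimal, purely illustrative Python sketch of a colored-net-like structure with a fuzzy firing threshold λ. The class, the place and transition names, and the simple enablement rule are our own simplifications for illustration; they are not the exact formalism of (Bavec, 1995).

```python
from collections import Counter

class ColoredNet:
    """Toy colored net: tokens are colored, and a transition fires only if its
    input tokens are present AND the fuzzy strength of the underlying
    organizational relation reaches the threshold lambda."""

    def __init__(self, threshold):
        self.threshold = threshold          # lambda in [0, 1]
        self.marking = {}                   # place -> Counter of colored tokens
        self.transitions = {}               # name -> (inputs, outputs, degree)

    def add_place(self, place, tokens=()):
        self.marking[place] = Counter(tokens)

    def add_transition(self, name, inputs, outputs, degree=1.0):
        """inputs/outputs: lists of (place, color); degree: fuzzy strength of
        the organizational relation mapped onto this transition."""
        self.transitions[name] = (inputs, outputs, degree)

    def enabled(self, name):
        inputs, _, degree = self.transitions[name]
        has_tokens = all(self.marking[p][c] > 0 for p, c in inputs)
        return has_tokens and degree >= self.threshold   # extra fuzzy condition

    def fire(self, name):
        if not self.enabled(name):
            return False
        inputs, outputs, _ = self.transitions[name]
        for p, c in inputs:
            self.marking[p][c] -= 1
        for p, c in outputs:
            self.marking[p][c] += 1
        return True

# A two-place example: a task token moves from one organizational place to
# another only if the (fuzzy) organizational bond is strong enough.
net = ColoredNet(threshold=0.6)
net.add_place("import_office", tokens=["declaration"])
net.add_place("partner_company")
net.add_transition("hand_over",
                   inputs=[("import_office", "declaration")],
                   outputs=[("partner_company", "declaration")],
                   degree=0.8)
print(net.fire("hand_over"))                    # True, since 0.8 >= 0.6
print(dict(net.marking["partner_company"]))     # {'declaration': 1}
```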
We are aware that business and human organizations, particularly ones as complex as virtual organizations, cannot be highly formalized (structured). Consequently, there is a question of how far we can go with formal definitions. But the model confirms that we can model some features of virtual organizations with the CPN and its derivations (Deng et al., 1990). In the model we implemented fuzzy logic, mainly through the threshold λ ∈ [0, 1], which additionally controls the firing of tokens. We also showed that the introduction of concepts or objects (see also Bastide, 1996) ci ∈ C assigned to organizational places di ∈ D (they can be anything from working places to organizational units) provided us with a tool to model complex relations between the organization as a set of rules and the processes that run in accordance with the organizational rules. With the controlled firing of tokens in the CPN and fuzzy logic we can describe and study features of virtual organizations such as the switching principle, the ambiguity of organizational relations and, particularly, boundary crossing.

4 The Case Study - Assessment of the Government Agency

4.1 Beyond the Business Partnership

Emerging experience and the theory of virtual organization are based on present business practice, with very few examples from government administration. But there is a widespread belief among researchers that governments and public administrations are one of the most promising grounds for the introduction of virtual organizations. The development of the Internet and of innovative customer- and citizen-oriented services in the governments of developed countries more than justifies this assumption. It is encouraging to notice a similar development also in moderately developed countries, such as Slovenia.

In the case presented we studied the collaboration of the Customs Administration in Slovenia (CA) with different private companies. The case reveals the development steps from a traditional government agency towards a highly efficient and technologically advanced organization that clearly articulates elements of the new organizational paradigm - the virtual organization. The CA has developed a sophisticated and efficient customs information system in which 98% of all customs declarations are lodged electronically, up to the highest security standards. The development of the CA information system and computer applications was outsourced to the private company ZZI from Ljubljana. The application was linked to the operational environment of the users. Jointly with its partners, ZZI developed an interface to enable its software to be integrated into larger operational information systems based on BAAN, SAP, Navision, etc.
It is important to notice that the CA has been equally open to other potential partners that could develop software and services to enable different users to link their systems to the customs information system, enabling them to submit customs declarations and other documents over the Internet. Besides software development, ZZI and nearly 30 other providers also offer transmission services for electronic data interchange within different environments and among different partners. The server solutions and programs used by clients facilitate automatic data interchange within different environments and applications. Such a service, for example, is the transmission of messages from the Internet environment of the partners to the X.400 environment of the Customs Administration.

4.2 Implementation of the Switching Principle and Metamanagement

At the beginning of the research we were concerned about the notorious rigidity of government organizations. It seemed quite unrealistic to expect any sign of the switching principle in government agencies. But what we actually found was a clear presence of elements of metamanagement and an implementation of the switching principle. Services such as the transmission of messages from the Internet environment of partners to the X.400 environment of the Customs Administration, and other forms of electronic data interchange services performed by private companies, are without any doubt "switchable". The official policy of the CA is in favor of tighter inclusion of trustworthy participants into the customs information system, which only confirms this development.

Transferring elements of the CA's authority to partners expresses another important feature of virtual organizations - trust. From the government's point of view it is a major breakthrough to realize that it is more convenient and even cheaper to trust partners and to control them more "softly", in an indirect and off-line mode. A look into the project and official documents of the CA reveals that the switching principle, as well as the role of trust, was introduced entirely intuitively, without any reference to virtual organizations. It shows that the evolution of a virtual organization can be entirely spontaneous, a natural development in competitive and technologically advanced environments.

4.3 Implementation of the MBN

According to the Model of Business Networking (MBN), integrators and aggregators are essential elements of networked and virtual organizations. They are the third parties included in the business relationship. In the case of the customs information system they are neither government agencies nor users of the system. The electronic data interchange services performed by ZZI and other companies are "infomediary" services (Österle, 1999). The companies around the CA play the role of integrators and aggregators with standardized procedures and standardized interfaces, and can be interchanged and replaced. This gives the CA very high flexibility, accompanied by a noticeable cost reduction and better customer services. Again, the CA has implemented the most visible feature of virtual organizations - very high flexibility. The overall architecture of the CA information system is modular, with a topology surprisingly similar to the MBN. The terminology used is different, but its structure can be described with the MBN features.
The modular system design and the application of the Business Bus (the way business services are exchanged) and the Business Ports (standard business and technological interfaces) offer a tool for optimal organizational design in the CA and its partners. The Business Bus in this case is the virtual world of customs declaration processing, which is separated from the physical world of goods, importers and customs houses. The locations of customs warehouses are a matter of convenience and agreement between the CA and an importer. This indicates the presence of vital elements of virtual organization.

4.4 "Fuzzy" Organizational Bonds

With the description of the virtual organization with the CPN we could identify and formalize some internal features, such as the strength of managerial and organizational bonds. We could study the mechanisms that make some positions in an organization logically members of two or even more different organizations. Many positions or tasks in ZZI are so strongly linked to the CA, and vice versa, that employees often do not know who their boss really is and to whom to report in some cases. This means that some organizational relations are not just fuzzy but can also extend out of the organization. That is a relatively simple explanation for ambiguous boundary limits and boundary crossing. The most interesting feature that CPN modeling offers is its ability to model fuzzy information and decision processes, which are characteristic of the organized anarchy and also of virtual organizations. We were quite intrigued by the fact that we can apply so many ideas of organized anarchy directly to virtual organizations.

4.5 Results

The reference models used confirm that the organization of the Customs Administration clearly demonstrates features of contemporary organizations, with efficient utilization of the Internet and even more hidden elements of virtual organization. In the absence of proven methodologies and indicators for the assessment of virtual organizations, we assessed the features of virtual organizations proposed by Mertens et al. (1998). We transformed the whole problem into seven separate and independent assessments. Each feature was ranked on a scale from 1 to 100 and plotted on a radar chart (Figure 1). The picture reveals an uneven development of virtual organization features - some of them are very pronounced, while others are still very close to traditional organizations.

[Figure 1: An assessment of the basic features of virtual organizations - the case of the Customs Administration in Slovenia. The radar chart plots the seven Mertens features (boundary crossing, complementary core competencies, sharing of knowledge, geographical dispersion, changing participants, participant equality, electronic communication) on a 1-100 scale.]

Electronic communication and geographical dispersion are very developed, but in everyday business they are the easiest goals to achieve. They represent more of a technical than a managerial challenge. Obviously, it is much more difficult to achieve managerial goals like sharing of knowledge, participant equality and, particularly, changing participants. These are prerequisites for the introduction of the switching principle and metamanagement. The results also confirm what we could intuitively expect - boundary crossing and complementary core competencies are somewhere in the middle on the scale of managerial problems. They could be achieved in the next step, after introducing electronic communication and geographical dispersion on a broad scale.
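A chart of the kind shown in Figure 1 can be reproduced with a few lines of matplotlib. The sketch below uses placeholder scores, not the values assessed in the study; only the seven feature names and the 1-100 scale come from the text.

```python
import numpy as np
import matplotlib.pyplot as plt

# The seven virtual-organization features after Mertens et al. (1998).
features = ["boundary crossing", "complementary core competencies",
            "sharing of knowledge", "geographical dispersion",
            "changing participants", "participant equality",
            "electronic communication"]
# Placeholder scores on the 1-100 scale (illustrative only, not the
# assessed values for the Customs Administration).
scores = [60, 55, 40, 85, 20, 35, 90]

angles = np.linspace(0, 2 * np.pi, len(features), endpoint=False)
angles = np.concatenate([angles, angles[:1]])   # close the polygon
values = scores + scores[:1]

ax = plt.subplot(polar=True)
ax.plot(angles, values)
ax.fill(angles, values, alpha=0.2)
ax.set_xticks(angles[:-1])
ax.set_xticklabels(features, fontsize=8)
ax.set_ylim(0, 100)
plt.title("Assessed features of virtuality (illustrative scores)")
plt.show()
```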
Conclusion

The research has revealed that the CA has progressively developed towards a virtual organization entirely intuitively. At the beginning, its main goal was to outsource the development of its information system and to utilize ICT as much as possible. This confirms a well-known fact noticed in the business community: we presently face an evolution rather than a revolution towards virtual organizations. This evolution is spontaneous in technologically advanced environments. But even if we accept that the emergence of virtual organizations can be spontaneous, management still needs a deeper insight into the challenges of the new organizational paradigm. It will soon turn out to be one of the most important areas of expertise of contemporary managers.

The case of the Customs Administration also presents a fine example of virtuality - the virtual world of customs declaration processing is separated from the physical world of goods, importers and customs houses. It provides such high flexibility that Slovenian accession to the European Union will not require any changes in the organization of the CA - its virtual world will simply be extended from the Slovenian borders to the whole EU.

An assessment of the degree of virtuality proved to be a real challenge. We are still short of a useful methodology or a set of relevant indicators. Nevertheless, the simple case we investigated showed that we can combine different models in search of a more holistic view on virtual organizations. We were able to detect weaknesses and obstacles in managerial strategies and also to grade their goals from the easiest to the more complex. Technical issues like the extensive introduction of the Internet are relatively easy to achieve and to manage. One of the most pronounced features of virtual organizations, boundary crossing, is also quite common, even in the early phases of the development of a virtual organization. The ability to change participants received the lowest grade in our research. It seems that the real managerial challenge is hidden in the switching principle and metamanagement. This could lead us to the conclusion that fully developed virtual organizations are still difficult to achieve. For that reason management needs a much deeper understanding of the challenges and obstacles in the transition from traditional to virtual organizations. Researchers could contribute with models and tools that would enable managers to set relevant goals and to assess their efforts.

References

[1] Bastide R. (1996): Approaches in Unifying Petri Nets and the Object-Oriented Approach, Working paper, L.I.S., Université Toulouse
[2] Bavec C. (1995): Object Oriented Modelling of Organization, Ph.D. Dissertation (in Slovenian), University of Ljubljana, Faculty of Economics
[3] Bavec C. (2001): Modeling of Management Decision-making Processes in Organized Anarchy, Informatica, 25 (2001), 375-379
[4] Bavec C., Zorko Z. (2002): Evolution of Networked and Virtual Government Agencies - The Case from Slovenia, Proceedings of the 3rd European Conference E-Comm-Line 2002, September 26-27, 2002, Bucharest, Romania
[5] Cohen M. D., March J. G., Olsen J. P. (1972): A Garbage Can Model of Organizational Choice, Administrative Science Quarterly, 17, 1-25
[6] Davidow W. H., Malone M. S. (1992): The Virtual Corporation, Harper Collins, New York
[7] Deng Y., Chang S. K. (1990): A G-net Model for Knowledge Representation and Reasoning, IEEE Transactions on Knowledge and Data Engineering, Vol. 3, No. 3
[8] Drucker P. F.
(1999): Management Challenges for the 21st Century, Butterworth-Heinemann
[9] Hesselbein F., Goldsmith M., Beckhard R. (1997): The Organization of the Future, The Drucker Foundation, Jossey-Bass, San Francisco
[10] Ishaya T., Macaulay L. (1999): The Role of Trust in Virtual Teams, Proceedings of the 2nd International VoNet Workshop, September 23-24, 1999, Simowa Verlag, Bern
[11] Jansen W., Steenbakkers W., Jägers H. (1999): Electronic Commerce and Virtual Organizations, Proceedings of the 2nd International VoNet Workshop, September 23-24, 1999, Simowa Verlag, Bern
[12] Jensen K. (1992): Coloured Petri Nets: Basic Concepts, Analysis Methods and Practical Use, Vol. 1, Springer-Verlag, Berlin Heidelberg
[13] Klüber R., Alt R., Österle H. (1999): Emerging Electronic Services for Virtual Organizations - Concept and Framework, Proceedings of the 2nd International VoNet Workshop, September 23-24, 1999, Simowa Verlag, Bern
[14] Mertens P., Griese J., Ehrenberg D. (1998): Virtuelle Unternehmen und Informationsverarbeitung, Springer, Berlin
[15] Morabito J., Sack I., Bhate A. (1999): Organization Modeling - Innovative Architectures for the 21st Century, Prentice Hall
[16] Mowshowitz A. (1997): Virtual Organization: A Vision of Management in the Information Age, The Information Society, Vol. 10
[17] Mowshowitz A. (1999): The Switching Principle in Virtual Organization, Proceedings of the 2nd International VoNet Workshop, September 23-24, 1999, Simowa Verlag, Bern
[18] Strausak N. (1998): "Résumé of VoTalk", VoNet Workshop, April 27-28, 1998, Simowa Verlag, Bern

Using Image Segmentation as a Basis for Categorization

Janez Brank
Department of Intelligent Systems
Jožef Stefan Institute, Ljubljana, Slovenia
janez.brank@ijs.si

Keywords: image categorization, segmentation, generalized kernels

Received: June 26, 2002

Image categorization is the problem of classifying images into one or more of several possible categories or classes, which are defined in advance. Classifiers can be trained using machine learning algorithms, but existing machine learning algorithms cannot work with images directly. This leads to the need for a suitable way of representing or describing images such that learning algorithms can work with them. We consider a representation based on texture segmentation and a similarity measure between segmented images which has been used successfully in the related area of image retrieval. A generalized kernel for use with the support vector machine (SVM) algorithm can be built from such a similarity measure. We compare this approach with a more straightforward representation based on autocorrelograms, and we show that these two representations can be combined to obtain classifiers with higher categorization accuracy.

1 Introduction

Besides textual and relational data, people increasingly have to deal with pictorial data, or data in the form of images. Large pictorial databases are being produced as archives digitize their collections, and additionally the World Wide Web contains a huge number of images. Apart from the purely technical problems of storing and processing such large amounts of data, the emergence of large collections of images opens the problem of enabling users to make sense of this data and find what they need. Image categorization deals with one aspect of this problem: given a set of images and a set of predefined categories or classes, we assume that each image should belong to one or possibly several of these categories.
For a large collection it would be impractical to have a human observer categorize all the images, so we want to be able to classify images automatically after a small number of images has been classified manually and used for training the automatic classifiers. However, this view of image categorization as a machine learning task immediately opens up a new problem: existing machine learning algorithms generally cannot work with images directly. Instead, they often assume they will be dealing with instances described by vectors or tuples. We need to be able to represent images using structures of this kind to make use of existing machine learning algorithms.

1.1 Related work in image retrieval

We can build on existing work in image retrieval, which is a related area where the problem of representation has already been encountered. In image retrieval, the user poses a query to the system and the system should find images that are somehow relevant to the query. Thus a way of representing the query, a way of representing images, and a way of comparing a query and an image (to determine if the image is relevant with regard to this query) are needed. One approach that is both technically feasible and useful enough to be commonly used in practice (e.g. in web image search engines such as Google) is to describe each image using a few keywords; the user's query can then request images whose description includes or excludes particular keywords. However, this approach is only feasible if textual descriptions of images can be obtained automatically (e.g. from the HTML file that linked to an image); it is usually too costly to have a human maintainer prepare such descriptions manually for a larger database. In addition, this textual approach suffers from problems of polysemy: different people would use different words to describe an image, and the same words may mean different things to different people. Therefore it is often desirable to rely solely on what can be automatically extracted from the images themselves. The user's query is then often simply a request to look for images similar to a given query image or sketch (this approach is known as "querying by content", or "content-based image retrieval").

There are several close parallels between image retrieval and image categorization. In categorization, if a new image is similar to training images from a particular category, it should probably itself belong to that category; in content-based image retrieval, if an image from the database is similar to the query image, it should probably be shown to the user. Thus we see that both areas need a way of representing images and assessing similarity between them. Many image representations and similarity measures have been proposed in image retrieval, and we would like to examine some of them from the point of view of image categorization as well.

One popular class of image representations is based on simplifying the image by approximating the color of each pixel by the nearest color from a predefined and fixed color palette; this can also be seen as partitioning (or quantizing) the space of all possible colors. Some information is then recorded about the presence of each color on the image.
When simply the proportion of the image covered by (the pixels of) each color is stored, the resulting description is called a histogram [11]. However, this disregards all spatial information (how the color is distributed around the image): for example, a large patch of red would affect the histogram of an image in the same way as a large number of red pixels scattered all over the image, which is surely undesirable. Several improved histogram-like representations of images have been proposed. For example, an autocorrelogram [4] records, for each color c and for a few small integers d, the probability that a pixel, chosen randomly at distance d from a randomly chosen pixel of color c, will itself be of the color c. This retains information about the amount of a color present on the image, but also records something about the spatial arrangement of each color (a small sketch of this representation is given at the end of this subsection). Still, all "global" representations of this type can be seen as somewhat rigid, as they record a strictly fixed amount of data for each image. They cannot take into account the fact that some images are more complex than others, that an image may contain several objects, or that it may be helpful to distinguish between an (interesting) object and (uninteresting) background.
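As an illustration of the global representations just described, the following is a minimal Python sketch of a color autocorrelogram over a quantized palette. The 16x4x4 HSV quantization matches the setup used later in the experiments, but the set of distances, the restriction to axial neighbours, and all function names are assumptions made for this example, not a prescribed implementation.

```python
import numpy as np

def quantize_hsv(hsv, h_bins=16, s_bins=4, v_bins=4):
    """Map an HSV image (channel values in [0, 1]) to palette indices 0..255."""
    h = np.minimum((hsv[..., 0] * h_bins).astype(int), h_bins - 1)
    s = np.minimum((hsv[..., 1] * s_bins).astype(int), s_bins - 1)
    v = np.minimum((hsv[..., 2] * v_bins).astype(int), v_bins - 1)
    return (h * s_bins + s) * v_bins + v

def autocorrelogram(labels, n_colors=256, distances=(1, 3, 5, 7)):
    """For each color c and distance d, estimate the probability that a pixel
    at (axial) distance d from a pixel of color c also has color c."""
    h, w = labels.shape
    feats = np.zeros((n_colors, len(distances)))
    for j, d in enumerate(distances):
        same = np.zeros(n_colors)
        total = np.zeros(n_colors)
        # Compare each pixel with its 4 axial neighbours at distance d.
        for dy, dx in ((d, 0), (-d, 0), (0, d), (0, -d)):
            a = labels[max(dy, 0):h + min(dy, 0), max(dx, 0):w + min(dx, 0)]
            b = labels[max(-dy, 0):h + min(-dy, 0), max(-dx, 0):w + min(-dx, 0)]
            np.add.at(total, a.ravel(), 1)
            np.add.at(same, a[a == b], 1)
        feats[:, j] = same / np.maximum(total, 1)
    return feats.ravel()   # 256 colors x 4 distances = a 1024-dimensional vector

# Example usage with a random "image" standing in for a real photograph:
hsv_image = np.random.default_rng(0).random((96, 128, 3))
x = autocorrelogram(quantize_hsv(hsv_image))
print(x.shape)   # (1024,)
```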
1.2 Image segmentation

Another, more sophisticated class of image representations is based on segmentation, or dividing an image into a set of regions such that each region is roughly homogeneous in color and/or texture. Each image is then represented by a set of regions; each region is typically described by a short vector that is a by-product of the segmentation procedure (containing e.g. the average color of the region, information about texture, and so on). Additionally, the location of each region on the image (i.e. which parts of the image are covered by that region) is often recorded as well. In general, regions might overlap, and each region might itself be composed of several disjoint parts; this is not necessarily problematic, as the regions need not be shown to the user, and image similarity measures usually permit the regions to be disconnected, and sometimes work with overlapping regions as well. Representations based on segmentation can adapt well to differences in complexity between images, and have been used successfully in image retrieval [NRS99, WLW00].

Various segmentation algorithms have been proposed in the context of image retrieval [NRS99, WLW00]. These approaches are usually based on dividing the image into a grid of small "windows" (e.g. 4x4 pixels); each window is described by a short vector (containing e.g. the average color and possibly a few coefficients from the higher-frequency bands of a wavelet transform, in order to capture the presence of edges or texture), and these vectors are then clustered. Each of the resulting clusters contains vectors that lie close together in their vector space, and such vectors hopefully correspond to windows that are similar in appearance; therefore it makes sense to form a region from such a group of windows. The region thus obtained can be described by the centroid of the cluster, i.e. by the average of the vectors that describe the windows from which the region was formed.

To use segmentation for image retrieval, it is also necessary to introduce a measure of similarity between segmented images. Such measures usually examine pairs of individual regions (one region from each image) and combine the measures of similarity or difference between regions into a single similarity measure between entire images. For example, the integrated region matching (IRM) measure [8] defines the distance between two images as a weighted sum of distances between regions, in which the weights are chosen so as to allow larger regions to have a larger influence on the similarity between images.

To use the representations described above for image categorization, one could use global representations (e.g. autocorrelograms) in combination with any of several machine learning algorithms (such as support vector machines, SVM); or use a segmentation-based similarity measure with an algorithm that allows an arbitrary similarity measure to be plugged into it (e.g. the nearest-neighbor method). However, our earlier work [1] has shown that the nearest-neighbor method, in combination with segmentation-based image similarity measures, results in rather unimpressive performance in comparison to SVM and global representations. It is therefore our goal to try using segmentation together with support vector machines. The main challenge here is that the SVM in its original formulation assumes all training and test examples to be described by vectors with the same number of components, while in the case of segmentation the description of each image has more structure than that, and the number of regions can also vary from image to image.

2 Support vector machines

Support Vector Machines (SVMs) [3] are a relatively recent family of machine learning algorithms that have been used successfully in many application domains. In the most elementary form of this method, we assume that each training example is a vector from some d-dimensional real space, and that there are exactly two classes, called positive and negative. Several extensions to multiclass problems are possible [5], usually by converting one multiclass learning problem into several two-class problems (e.g. training one classifier for each pair of classes to separate members of one class from those of the other class).

In SVM learning, we want to separate the positive vectors from the negative ones using a hyperplane such that the positive training vectors lie on one side of the plane and the negative ones lie on the other side. Additionally, to help make the classifier more robust and more reliable for use on unseen test vectors, we want the training vectors to lie as far from the separating hyperplane as possible. Maximizing this distance (known as the margin) from the plane to the nearest training example can be cast as an optimization problem in the following way. Let xᵢ be the i-th training vector, and yᵢ its label (which equals +1 for positive and −1 for negative training examples). A hyperplane can be described by the equation wᵀx + b = 0, where w is the "normal", i.e. a vector perpendicular to the plane, and b is a threshold that determines the actual location of the plane in space; wᵀx denotes the dot product of the vectors w and x. Given a particular vector x, we can determine which side of the plane it lies on by examining whether wᵀx + b is positive or negative. However, to ensure that the training examples do not lie too close to the plane, we must also insist that wᵀx + b has a large enough absolute value. We can describe this using the following conditions: yᵢ = +1 ⇒ wᵀxᵢ + b ≥ 1 and yᵢ = −1 ⇒ wᵀxᵢ + b ≤ −1, or, more concisely: yᵢ(wᵀxᵢ + b) ≥ 1 for all training instances i.
If all training examples satisfy these conditions, the space between the hyperplanes wᵀx + b = 1 and wᵀx + b = −1 is empty; to maximize the breadth of this margin space, we need to maximize the distance between these two planes, which equals 2/||w||. Maximizing the margin is thus equivalent to minimizing ||w|| (or, equivalently, ||w||²/2) subject to the above conditions. This optimization problem is usually also extended to allow some training instances to be misclassified (or at least lie within the margin, though perhaps on the correct side of the separating plane) if this leads to a wider margin on the other training instances (the soft margin formulation of SVM). Solving the optimization problem gives us the values of w and b, and the resulting classifier simply works according to the formula prediction(x) = sgn[wᵀx + b].

Using standard techniques from optimization theory, this optimization problem can be transformed into a "dual" form. It turns out that the dual form, as well as the resulting classification rule, can be expressed so that the training vectors need never be accessed directly, as long as we are able to compute the dot product of any two vectors. In particular, the normal w can be written as w = Σᵢ αᵢyᵢxᵢ, where the αᵢ coefficients are obtained by solving the dual optimization problem. The classifier can then be described as prediction(x) = sgn[b + Σᵢ αᵢyᵢxᵢᵀx].

Now suppose we used some mapping φ to map our original instances xᵢ into some other (possibly higher-dimensional) vector space F. Let K(xᵢ, xⱼ) := ⟨φ(xᵢ), φ(xⱼ)⟩_F be a function that, given two instances xᵢ and xⱼ, computes the dot product ⟨·,·⟩_F (in the new space F) of their images φ(xᵢ) and φ(xⱼ) under the mapping φ. It follows from the above that we could train a hyperplane in F without ever working with the mapped vectors φ(xᵢ) explicitly, as long as we are able to compute K(xᵢ, xⱼ) for any two vectors xᵢ and xⱼ. The function K defined in this way is known as a kernel. The importance of kernels arises from the fact that the mapping φ need not be linear, and for a nonlinear φ a hyperplane in F could correspond to some highly nonlinear separation surface in the original space. In this way, kernels allow the SVM algorithm to induce nonlinear models while preserving the optimization framework essentially intact. The appeal of kernels stems from the fact that a wisely chosen function K can be simple to compute and yet correspond to a complex nonlinear mapping into some very high-dimensional space F.

A kernel corresponds to a dot product in some vector space and can therefore in some sense be seen as a sort of similarity measure: the dot product of two vectors (if their length is fixed) is greatest when they point in the same direction, and then decreases as the angle between them increases, eventually becoming 0 (for orthogonal vectors) and even negative, reaching its minimum when the two vectors point in exactly opposite directions. However, the converse is not true: not every similarity measure corresponds to a scalar product in some vector space. If we used a non-kernel similarity measure as if it were an actual kernel, we would no longer have the mathematical guarantees that the SVM training algorithm converges, and even if it converged there would be no theoretical grounds to expect the resulting classifier to have good performance.
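To make the role of the kernel concrete, the following scikit-learn sketch trains an SVM from a precomputed Gram matrix, so that the learner only ever sees kernel values K(xᵢ, xⱼ) rather than the vectors themselves. The RBF kernel and the toy data are assumptions chosen for this example; this is not the setup used in the experiments reported below.

```python
import numpy as np
from sklearn.svm import SVC

def rbf_kernel(A, B, gamma=0.5):
    """A valid (Mercer) kernel: K(a, b) = exp(-gamma * ||a - b||^2)."""
    sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * sq)

rng = np.random.default_rng(0)
X_train = rng.normal(size=(40, 2))
y_train = (X_train[:, 0] * X_train[:, 1] > 0).astype(int)   # a nonlinear concept
X_test = rng.normal(size=(10, 2))

# The SVM never accesses the vectors directly, only kernel values.
K_train = rbf_kernel(X_train, X_train)   # shape (n_train, n_train)
K_test = rbf_kernel(X_test, X_train)     # shape (n_test, n_train)

clf = SVC(kernel="precomputed", C=1.0)
clf.fit(K_train, y_train)
print(clf.predict(K_test))
```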
3 Generalized kernels

Generalized SVMs have been proposed by Mangasarian [9] to allow an arbitrary similarity function to be used in a way analogous to a kernel. In the previous section we have seen that SVM can learn nonlinear models of the form prediction(x) = sgn[b + Σᵢ αᵢyᵢK(xᵢ, x)], where K(xᵢ, x) = ⟨φ(xᵢ), φ(x)⟩_F for some mapping φ to some space F and some dot product ⟨·,·⟩_F in F. Now if some arbitrary function K were used instead of a proper kernel function, again giving us a classifier of the form sgn[b + Σᵢ αᵢyᵢK(xᵢ, x)], this might still be a perfectly reasonable and useful classifier, but it wouldn't necessarily correspond to some hyperplane in some vector space F to which the instances xᵢ and x might have been mapped. Thus we couldn't obtain the αᵢ values using the criterion of maximizing the margin, because there wouldn't even be a hyperplane whose margin we could maximize. Instead, [9] proposes to minimize the value αᵀHα (subject to the same constraints as before, i.e. that our training instances should lie on the correct side of the separation surface) for some positive definite matrix H. (This problem has a very similar structure to the dual form of the original SVM optimization problem, and is in fact equivalent to it if K really corresponds to a dot product and a suitable matrix H is chosen.)

In the simplest case of the generalized SVM, we would take H = I (the identity matrix) and thus minimize Σᵢ αᵢ². This can be interpreted intuitively as looking for a separation surface that can be expressed in the simplest possible way, possibly with many αᵢ equal to 0 (i.e. without really using the training example xᵢ in the description of the separating surface). It can be shown that the formulation for H = I is equivalent to mapping each instance x into the vector (K(x, x₁), ..., K(x, xₙ)) of its "similarities" (as measured by K) to all the training instances x₁, ..., xₙ, and then using an ordinary linear support vector machine over this new representation. For the problem of image categorization, this amounts to the intuitively appealing suggestion that two images should be treated as similar if they exhibit a similar pattern of similarities to known training images.
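The H = I case just described can be sketched in a few lines: each example is mapped to its vector of similarities to the training examples, and an ordinary linear SVM is trained on those vectors. The similarity function below is a simple placeholder; in the setting of this paper it would be the IRM similarity between two segmented images, and it need not be a valid kernel.

```python
import numpy as np
from sklearn.svm import LinearSVC

def similarity(a, b):
    # Placeholder similarity; in the paper's setting this would be an
    # IRM-like similarity between segmented images (not necessarily a kernel).
    return 1.0 / (1.0 + np.linalg.norm(a - b))

def similarity_features(items, train_items):
    """Map each item to the vector (K(x, x_1), ..., K(x, x_n)) of its
    similarities to all training items (the H = I generalized-SVM case)."""
    return np.array([[similarity(x, t) for t in train_items] for x in items])

rng = np.random.default_rng(0)
train, test = rng.normal(size=(30, 5)), rng.normal(size=(8, 5))
y_train = (train[:, 0] > 0).astype(int)

S_train = similarity_features(train, train)   # shape (n_train, n_train)
S_test = similarity_features(test, train)     # shape (n_test, n_train)

clf = LinearSVC(C=1.0).fit(S_train, y_train)  # ordinary linear SVM on similarities
print(clf.predict(S_test))
```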
If there are d region clusters, each image would now be represented by a d-dimensional real vector (with possibly many zero-valued components, as there would probably be many more clusters than an average image has regions). With all images represented in this same d-dimensional space, we can then use the ordinary linear support vector machine to train classifiers.

5 Experimental evaluation

To compare the approaches described in the previous sections, we conducted experiments on the misc database, which is publicly available (http://www-db.stanford.edu/IMAGE/) and has already been used in the image retrieval literature [13, 10], as well as in our earlier work on image categorization [1]. This database contains approximately 10000 small photographic images (of sizes around 128 by 96 pixels). It is thematically very diverse. We selected 1172 images from the database and manually assigned each of them to one of 14 categories (butterflies, US flag, sunsets, autumn, flowers, planets, satellite images of Earth, cars, mountains, clouds, sea, surfboards, sailboats, prairie animals). The intention of this selection was to have categories of varying size and difficulty. The smallest category (flags) contains 32 images, and the largest (sunsets) contains 224 images. Some of the categories, such as sunsets or flowers, have characteristic and easily recognizable color distributions, while some categories are quite similar in this respect and would therefore be more difficult to distinguish (e.g. sea and clouds, both of which have a lot of blue and white pixels). To train the SVM classifiers, we used the LibSvm [2] program, which has the advantage of natively supporting multiclass problems. It uses the all-pairs approach to convert a multiclass problem into several two-class problems: for each pair of classes, a classifier is trained to distinguish members of one class from members of the other class. To classify a new example, it is shown to all the classifiers, each of which then votes for one or the other of the two classes which it has been trained to separate. The class with the greatest number of votes is then adopted as the final prediction. We compared the following approaches to image categorization:
1. Images are represented in the HSV (hue, saturation, value) color space, which is quantized into 256 colors (the H axis is split into 16 equal ranges and the S and V axes into 4 equal ranges). Each image is then described by an autocorrelogram in the resulting quantized color space. The autocorrelograms are 1024-dimensional vectors and are used as input for linear SVM.
2. Images are segmented into regions using the segmentation algorithm from WALRUS [10]. The IRM similarity metric [8] is then used to construct a generalized kernel as described in Section 3 above. In other words, each image is represented by a vector of its IRM similarities to all the training images; these vectors are then used as input for linear SVM.
3. Images are segmented as in the previous paragraph. Each region is described by a short (12-dimensional) vector, which is a by-product of the segmentation algorithm. The vectors resulting from all the regions of all the training images are then clustered (here we use the same algorithm, BIRCH [14], that is also used by WALRUS during segmentation). An image is then described by a sparse vector specifying what proportion of the area of the image is covered by regions from each region cluster.
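As an aside, a minimal sketch of the cluster-coverage representation used in approach 3 is given below; the Region type and all identifiers are illustrative assumptions and not part of the original experiments.

import java.util.List;

// Maps an image, given as a list of its segmented regions, to a d-dimensional
// vector whose k-th component is the proportion of the image's area covered by
// regions assigned to region cluster k.
public class ClusterCoverage {

    public static class Region {
        public final int clusterId; // index of the region cluster this region was assigned to
        public final double area;   // area of the region, e.g. in pixels

        public Region(int clusterId, double area) {
            this.clusterId = clusterId;
            this.area = area;
        }
    }

    public static double[] coverageVector(List<Region> regions, int numClusters) {
        double[] v = new double[numClusters];
        double totalArea = 0.0;
        for (Region r : regions) {
            v[r.clusterId] += r.area;
            totalArea += r.area;
        }
        if (totalArea > 0) {
            for (int k = 0; k < numClusters; k++) {
                v[k] /= totalArea; // normalize so that the components sum to 1
            }
        }
        return v;
    }
}

The resulting vectors are sparse in practice, since an image typically has regions in only a few of the several hundred clusters, and they can be fed directly to a linear SVM.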
Depending on the parameters of the segmentation, the average number of regions per image might vary from less than ten to more than a hundred; then, depending on the parameters of the clustering, the number of region clusters (and hence the dimensionality of the space in which our images are now represented) is usually on the order of a few hundred. Once images are represented in this way, linear SVM can be used to train classifiers for them. For the sake of comparison, we also report the performance of the nearest neighbor method with the IRM similarity metric (that is, each image is predicted to belong to the same class as the most similar training image). All performance values reported here are averages (and standard errors) based on tenfold stratified cross-validation.

Method                     Classification accuracy
Autocorrelograms           80.2 % ± 1.3 %
Generalized kernels        79.0 % ± 1.3 %
Region clustering          70.0 % ± 1.6 %
Nearest neighbors + IRM    69.1 % ± 1.3 %

As expected, the nearest-neighbor method is in general less successful than the approaches based on SVM. However, it turns out that the two segmentation-based approaches do not outperform the representation based on autocorrelograms. The performance of the generalized kernel method is not significantly different (using a paired t-test) from that of autocorrelograms, and the generalized kernel method has the additional disadvantage of much greater computational cost. In addition, the performance of the region clustering approach is remarkably poor. A closer examination suggests that the partitioning of regions into region clusters is problematic and unstable. For example, if the centroid of each cluster is recorded and then all regions are redistributed to the cluster with the nearest centroid, most of the regions will tend to move to a different cluster than the one they were originally attached to. This means that two otherwise similar regions might fall into different clusters by pure chance, and the similarity between their images would thus go unnoticed. The authors of the BIRCH clustering algorithm were aware of the possibility of such problems and proposed several redistribution passes in which the regions are redistributed to the nearest centroids, but in our experiments this did not lead to a really stable partition even after five or ten such passes. An alternative way of making use of the region clustering approach might be to include the test images in the region clustering phase. This really amounts to a form of transduction, i.e. using test data as if it were simply additional unlabeled training data. It ensures that both the training images and the test images are really being represented in a space that treats both groups of images equally. In this setting, the performance of region clustering increases considerably, and it achieves an accuracy of 86.4 % ± 1.0 %. However, for the comparison with other methods to be fair, transduction should also be included in the SVM learning process. Since LibSvm does not support transduction, we used the SvmLight program [6] for these experiments; it implements Joachims' transductive SVM algorithm [7]. With transductive SVM, region clustering achieves an average accuracy of 91.9 % ± 1.0 %, while autocorrelograms achieve an accuracy of 90.7 % ± 1.1 %.
Although this difference is not really significant from a practical point of view (a t-test shows that it is statistically significant at a confidence level of 0.945, slightly below the usual 0.95), it suggests that the region clustering approach does have at least some potential to be useful. Finally, we also considered combining several representations. An analysis of classification errors shows that classifiers based on different representations often make mistakes on different test images; that is, it often happens that a test image is classified correctly by one classifier but incorrectly by another. For example, consider the classifiers based on autocorrelograms and on generalized kernels (with the IRM measure). Of the 1172 images, 828 are classified correctly by both; 120 only by the former; 100 only by the latter; and 128 are misclassified by both. (To obtain these numbers, each image was classified by a model obtained from that 90 % of the dataset which does not contain the image under consideration.) Thus it seems that some advantage could be gained by combining the features of both of these representations. Many approaches exist for combining several classifiers, but with SVM this can be done in a particularly simple way. If we have two representations, φ_1: X → F_1 and φ_2: X → F_2, combining their features (or attributes) would be equivalent to a new representation φ: X → F_1 × F_2 defined by the formula φ(x) = (φ_1(x), φ_2(x)). Now if the kernels K_1(x_i, x_j) and K_2(x_i, x_j) correspond to dot products on F_1 and F_2, respectively, the function K(x_i, x_j) := K_1(x_i, x_j) + K_2(x_i, x_j) is a dot product on F_1 × F_2. Thus we can obtain the equivalent of a combined representation simply by computing the sum of two kernels. In our experiments, the combination of the autocorrelogram representation and the generalized kernel using the IRM similarity measure achieved a categorization accuracy of 83.7 % ± 1.4 %. A t-test shows that this performance level is significantly better than that of either of these two representations individually.

6 Conclusions and future work

Our experiments show that it is difficult to use segmentation-based image representation methods in image categorization. Relatively complex ways of using information obtained from segmentation, such as the generalized kernel approach and (to a lesser extent) the region clustering approach, have been found able to compete with a simpler and more straightforward approach such as autocorrelograms, but not to significantly outperform it. In the presence of unlabeled test images, the region clustering approach performs really well (relative to other representations) if a transductive SVM learner is not available. We have shown that it is possible to use a segmentation-based representation in combination with another representation to achieve a small but significant increase in categorization accuracy. We nonetheless believe that there must be ways of using segmentation more profitably for image categorization, just as it is used in image retrieval, and that this is still an interesting topic for future work. In particular, it would be interesting to further explore the influence of the clustering algorithm used in the region clustering approach, and to look for more stable clustering algorithms that would allow the region clustering approach to perform better in the inductive in addition to the transductive setting.
In addition, as segmentation is a relatively complex task, and segmentation algorithms usually depend on several parameters, it would be interesting to explore the influence of these various parameters on the segmentation (and consequently on image categorization) in a more systematic way. The region clustering approach could also be augmented by taking the similarity between different clusters into account. Currently, regions that belong to different clusters contribute to different components of the sparse vectors that describe our images, and therefore whatever similarity might exist between two regions from different clusters cannot contribute anything towards our algorithm's perceived similarity between their two images. Acknowledging that regions can be at least somewhat similar even if they belong to different clusters might lead to an improved representation, but would (if taken to the extreme) again require us to do the equivalent of comparing every region of one image with every region of the other image, which is what the region clustering approach was designed to avoid in the first place. Perhaps one could determine (from the region clustering process itself), for each region cluster, just a few most similar clusters and then compare pairs of regions from the closely similar clusters but ignore pairs of regions from entirely unrelated clusters. Region clustering could also be integrated with segmentation. Currently, segmentation is performed separately on each image, by clustering the descriptions of its 4x4 pixel windows; then, the region descriptions of all the images in the training set are clustered to form region clusters. These two steps could be merged by considering the descriptions of all windows from all the images as a single large set and performing clustering on this set. Each image would then be represented by a vector of values showing what proportion of the image is covered by windows belonging to a particular cluster. The combination of several kernels could also be pursued further, particularly in the direction of combining more than two classifiers and using weighted sums of kernels. Additionally, the methods considered here should be tested on other datasets, as (given that widely different methods achieve highly similar categorization accuracy values on the present dataset) it is perhaps simply unrealistic to expect better performance on the current dataset, as the categories have an essentially "semantic" motivation that the current image representation methods simply cannot capture.

References

[1] J. Brank: Machine learning on images (in Slovenian). Proc. IS 2001, Ljubljana, 2001, pp. 152-155.
[2] C.-C. Chang, C.-J. Lin: LibSVM: a library for support vector machines (version 2.3). Dept. of Comp. Sci. and Inf. Eng., Nat'l Taiwan University, April 2001.
[3] C. Cortes, V. Vapnik: Support-vector networks. Machine Learning, 20(3):273-297, September 1995.
[4] J. Huang, S. R. Kumar, M. Mitra: Combining supervised learning with color correlograms for content-based image retrieval. Proc. 5th ACM Multimedia Conf., Seattle, USA, 1997, pp. 325-334.
[5] C.-W. Hsu, C.-J. Lin: A comparison of methods for multi-class support vector machines. Dept. of Comp. Sci. and Inf. Eng., Nat'l Taiwan University, April 2001.
[6] T. Joachims: Making large-scale SVM learning practical. In: B. Schölkopf et al. (eds.), Advances in Kernel Methods. MIT Press, 1999, pp. 169-184.
[7] T. Joachims: Transductive inference for text classification using support vector machines. Proc. 16th ICML, Bled, Slovenia, 1999, pp. 200-209.
[8] J. Li, J. Z. Wang, G. Wiederhold: IRM: Integrated region matching for image retrieval. Proc. 8th ACM Multimedia Conf., Los Angeles, 2000, pp. 147-156.
[9] O. L. Mangasarian: Generalized support vector machines. In: A. J. Smola et al. (eds.), Advances in Large Margin Classifiers, MIT Press, 2000, pp. 135-146.
[10] A. Natsev, R. Rastogi, K. Shim: WALRUS: a similarity retrieval algorithm for large databases. Proc. ACM SIGMOD, 1999, pp. 395-406.
[11] M. J. Swain, D. H. Ballard: Color indexing. Int. Journal of Computer Vision, 7(1):11-32, Nov. 1991.
[12] J. Z. Wang, J. Li, G. Wiederhold: SIMPLIcity: Semantics-sensitive integrated matching for picture libraries. Advances in Visual Inf. Systems, 4th Int. Conf., 2000, pp. 360-371.
[13] J. Z. Wang, G. Wiederhold, O. Firschein, S. X. Wei: Content-based image indexing using Daubechies' wavelets. Int. Journal of Digital Libraries, 1(4):311-328, December 1997.
[14] T. Zhang, R. Ramakrishnan, M. Livny: BIRCH: An efficient data clustering method for very large databases. Proc. ACM SIGMOD, 1996, pp. 103-114.

Recognition of Image Authenticity Using Significant DCT Coefficients Quantization

Chin-Chen Chang*, Jun-Chou Chuang* and Tung-Shou Chen**
* Department of Computer Science and Information Engineering, National Chung Cheng University, Chiayi, Taiwan 62107, R.O.C. Phone: 886-5-2720411 ext. 33100, FAX: 886-5-2720859, E-mail: {ccc, lzchung}@cs.ccu.edu.tw
** Department of Information Management, National Taichung Institute of Technology, Taichung, Taiwan 404, R.O.C. Phone: 886-4-22211181 ext. 2213, FAX: 886-4-22233545, E-mail: tschen@ntit.edu.tw

Keywords: JPEG, DCT, image authentication, signature, watermarking

Received: May 11, 2002

Traditional image authentication methods cannot preserve authenticity after the processing of JPEG lossy compression. This is because JPEG lossy compression destroys the secret embedded in the image. However, JPEG lossy compression has become so widely used that it simply exists everywhere. Thus, this kind of modification should be taken into consideration. To improve traditional image authentication methods, we shall propose a new method that can not only prevent images from being tampered with but also allow reasonable JPEG lossy compression. Our method works by extracting some significant discrete cosine transform (DCT) coefficients and setting a compression-tolerant range. The extracted DCT coefficients will be able to survive when the image is lossily compressed but not otherwise modified. The experimental results show that our method can tolerate JPEG lossy compression while keeping the image from illegal modifications.

1 Introduction

Digital images have been widely used in the computer world. However, without due protection, they can be easily modified with image-processing tools. As a result, image authentication has become an important research issue. If someone has maliciously manipulated an image, the image authentication system has to have the ability to point out the exact places that have been modified.
Image authentication methods can be classified into two categories: digital-signature-based methods (Bhattacharjee & Kutter 1998, Friedman 1993, Lu & Liao 2000) and watermarking-based methods (Kundur & Hatzinakos 1999, Schneider & Chang 1996, Lin et al. 2000, Hung et al. 2001). In a digital-signature-based method, the original image is hashed and then encrypted via public key encryption (Rivest et al. 1978); the encryption result is called the "signature" of the image. On the other hand, in a watermarking-based method, watermarks are first embedded into an image and later extracted from it to verify its authenticity. An image is said to have gone through malicious manipulations if the retrieved watermarks are not identical to the corresponding original watermarks. In 2001, Hung et al. proposed an image authentication method based on the DCT coefficients. Their method performs the vector quantization (VQ) technique on the original image. The extracted VQ indices are called features. These features are embedded into the DCT coefficients located in the middle-frequency part of each DCT block of the original image. The embedded features can be used for image tamper detection and recovery. Unfortunately, this method cannot tolerate JPEG lossy compression (Pennebaker & Mitchell 1993), even though the image after JPEG lossy compression may still be considered acceptable. Hence, we shall propose a new method here in this paper that can prevent images from being tampered with even when they have gone through JPEG lossy compression. Our method takes advantage of two distinct properties of the image and extracts some features from these two properties for further verification. The first property is that the DCT coefficients located in the upper left positions contain most of the information in an image block. To make use of this property, we extract some features from those significant DCT coefficients and modify them within a pre-defined compression-tolerant range. The extracted features can tolerate JPEG lossy compression if the significant DCT coefficients do not go beyond the compression-tolerant range. Conversely, if the image has been maliciously manipulated, the modified DCT coefficients will surely exceed the compression-tolerant range, and that is how we can still detect illegal modifications. The second property concerns JPEG lossy compression itself: after the encoding and decoding processes, the high-frequency part of each block will be lost, because most of the DCT coefficients located in the lower right positions become zero after JPEG lossy compression. Thus the 64 pixel values of each smooth image block will move toward their mean value. Based on this property, the proposed method calculates the maximum difference between the mean value and each pixel value for each block. The maximum difference should decrease after JPEG lossy compression; in other words, if the maximum difference increases, then we know that the image has been tampered with. This paper is organized as follows. In Section 2, we shall review the DCT property and the image authentication method of Hung et al. After that, our proposed image authentication method will be described in Section 3. In Section 4, some experiments and security analyses will be discussed. Finally, the conclusions will be presented in Section 5.

2 Related Work

2.1 Discrete Cosine Transformation

DCT is an image transformation method that can transform each pixel in the spatial domain into the frequency domain. It is very popular in applications such as image compression, image processing, watermarking, etc. The two-dimensional DCT and inverse DCT (IDCT) are defined as follows, respectively:

F(i, j) = (2/N) c(i) c(j) Σ_{x=0..N-1} Σ_{y=0..N-1} f(x, y) cos[(2x+1)iπ/2N] cos[(2y+1)jπ/2N],   (1)

f(x, y) = (2/N) Σ_{i=0..N-1} Σ_{j=0..N-1} c(i) c(j) F(i, j) cos[(2x+1)iπ/2N] cos[(2y+1)jπ/2N].   (2)

Here c(i), c(j) = 1/√2 for i, j = 0, and otherwise c(i), c(j) = 1.
Besides, f(x, y) is the pixel value in the spatial domain and F(i, j) is the DCT coefficient in the frequency domain. Generally speaking, 2-D DCT is often used to process blocks of 8x8 pixels each; the parameter N is then set to 8. An important property of DCT is that, after DCT transformation of an image block, the DCT coefficients located in the upper left positions contain most of the energy. That is to say, even if we only use those DCT coefficients to reconstruct a block by IDCT transformation, the image features of the reconstructed block will still be close to those of the original block. That is why we can extract some significant DCT coefficients located in the upper left positions as our significant features.

2.2 The Hung et al. Image Authentication Method

2.2.1 DCT-Based Embedding Procedure

Hung et al.'s image authentication can be divided into two stages: feature extraction and feature embedding. In the feature extraction stage, they employ the VQ compression technique to process the original image. Note that the encoding block of VQ is 4x4 pixels. Therefore, they can obtain VQ indices as features, also called watermarks. Two 16-dimensional codebooks Q_d and Q_r are used to encode the original image O respectively. The encoding results are W_d and W_r: W_d is the detection feature and W_r is the recovery feature. These two distinct features are used for image tampering detection and recovery. In feature embedding, W_d and W_r are embedded into the DCT coefficients located in the middle-frequency part of each block of the original image. Here the DCT block is 8x8 pixels. The detection features of an image block are embedded into the block itself, but the recovery features are not, because the recovery features would be destroyed if their block were tampered with. To solve this problem, they use a pseudorandom permutation operation to embed the recovery features of one block in another block. Consider an image block B_i. Assume the embedded features are W_f = (w_1, w_2, ..., w_s) and the middle-frequency coefficients are M_f = (m_1, m_2, ..., m_s). Here s is the total bit length of the features to be embedded. They use the hiding function and a pseudorandom number with a seed S_k to embed the features into the DCT coefficients located in the middle-frequency part. The hiding function is defined as follows:

H(m_j, w_j) = ⌊m_j / 4a⌋ × 4a + 2a   if w_j = 1,
H(m_j, w_j) = ⌊m_j / 4a⌋ × 4a        if w_j = 0.   (3)

Here a > 1 is the adjusting magnitude. A large a value will make the image more distorted; on the other hand, a small a value cannot endure much error. After the embedding procedure is finished, IDCT is performed upon the above DCT block to obtain the embedded image. The embedded image can be published once the verification information has been embedded into the original image.

2.2.2 Tamper Detection and Recovery Procedure

Given a test image O', they use the embedded detection features to decide whether the test image O' has been tampered with or not. If the answer is yes, they use the recovery procedure to recover the modified places. First, they apply the DCT transformation to the test image O', and then they retrieve the detection features and the recovery features in each DCT block. They use the extraction function and the pseudorandom number with a seed S_k to extract the features in each DCT block. The extraction function is defined as follows:

E(m_j') = 0 if [(m_j' + a) mod (4a)] < 2a, and 1 otherwise.   (4)

Here m_j' is a coefficient in the middle-frequency part.
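A minimal sketch of the hiding and extraction functions (3) and (4), applied to a single middle-frequency coefficient, may make the scheme easier to follow; the integer coefficient type, the class name and the example values are assumptions made for illustration only.

// Hiding and extraction of one feature bit in one middle-frequency DCT coefficient,
// following equations (3) and (4) above.
public class MidFrequencyHiding {

    // H(m, w): quantize m to a multiple of 4a, then add 2a when the embedded bit is 1.
    static int hide(int m, int w, int a) {
        int base = Math.floorDiv(m, 4 * a) * (4 * a);
        return (w == 1) ? base + 2 * a : base;
    }

    // E(m'): recover the embedded bit, tolerating distortions smaller than a.
    static int extract(int mPrime, int a) {
        int r = Math.floorMod(mPrime + a, 4 * a);
        return (r < 2 * a) ? 0 : 1;
    }

    public static void main(String[] args) {
        int a = 8;                  // adjusting magnitude (example value)
        int m = 54;                 // an example middle-frequency coefficient
        int hidden = hide(m, 1, a); // embed the bit 1
        System.out.println(extract(hidden, a));     // prints 1
        System.out.println(extract(hidden + 3, a)); // a small distortion still yields 1
    }
}

Distortions of magnitude a or more, however, can flip the extracted bit, which is exactly the trade-off controlled by the adjusting magnitude a.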
As for the recovery features, because the pseudorandom permutation operation was previously used to permute them, the inverse pseudorandom permutation must be performed here: they compute W_r = p^(-1)(W_r'), where p^(-1) is the inverse permutation function. Next, they perform VQ on the test image O'. The 16-dimensional codebook Q_d is used to encode the test image O'. Consider a DCT block B_i of the test image. An 8x8 DCT block B_i can be divided into four encoding VQ-blocks (b_i1, b_i2, b_i3, b_i4) of 4x4 pixels each. Let D_i denote (d_i1, d_i2, d_i3, d_i4), where d_ij is the VQ index of the closest codeword of the subblock b_ij. A block B_i is said to have been tampered with if D_i differs from its detection feature and RMSE(B_i, B_i') > t, where RMSE is the root mean square error between two blocks and t is a threshold; here B_i' is the reconstructed block generated following the side-match VQ method (Chang & Chen 1993), as follows. When the method detects that a DCT block B_i has been tampered with, the recovery procedure is performed. Assume B_i = (b_i1, b_i2, b_i3, b_i4) has been illegally modified. They use the recovery features, by looking up the codebook Q_r, to recover two of the subblocks; the remaining two subblocks are reconstructed with the side-match VQ method. Assume the four neighbors of a subblock b_ij are b_l, b_r, b_u, and b_d. Then the side-match method can use these four neighbors to reconstruct the subblock b_ij. When the side-match method is adopted, the bit length of the recovery features can be reduced. Let the reconstructed block of B_i be B_i'. Finally, we can obtain a recovered image. The main drawback of this method is that it cannot tolerate inevitable innocent modification such as JPEG lossy compression. Therefore, we intend to propose a new method to improve it, as follows.

3.1 The Signing Procedure

Given a gray-level image O, we first partition it into nonoverlapping blocks, and then we use DCT to transform each block into a DCT coefficient matrix C(i, j), where 0 ≤ i, j ≤ 7. The DCT coefficients located in the upper left positions contain most of the information of the image block, even if the image has been compressed by lossy JPEG. This is the reason why we often call them the "significant DCT coefficients". We extract some features from the significant DCT coefficients of each block for further verification. If the block is modified, then the significant DCT coefficients will also be changed, and the extracted features will not be identical with the original ones. To withstand JPEG lossy compression, we set a compression-tolerant range for those significant DCT coefficients. The detailed procedure is described as follows. Consider an image block B_i with its DCT coefficient matrix C. The proposed method chooses ten significant DCT coefficients C(0,0), C(0,1), C(1,0), C(2,0), C(1,1), C(0,2), C(0,3), C(1,2), C(2,1), and C(3,0) in zig-zag scan order (like JPEG). We apply these DCT coefficients to represent the features of the block. Besides, we use a scale function to quantize and adjust these DCT coefficients with respect to the quantization value a. The quality of the signed image is measured via the mean square error (MSE) between the original and the signed gray-level image:

MSE = (1/n²) Σ_{i=1..n} Σ_{j=1..n} (x_ij - x'_ij)².   (7)

Here x_ij denotes an original pixel value, and x'_ij denotes the corresponding decoded pixel value. Besides, we define the compression rate (CR) as follows:

CR = X / X'.   (8)

Here X means the original image size while X' means the compressed image size.
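To illustrate the signing-side quantization of the ten significant coefficients, a small sketch follows; it assumes simple rounding to the nearest multiple of a and a tolerance of half a quantization step, which may differ in detail from the paper's exact scale function and verification test.

// Quantization of the ten significant DCT coefficients of one 8x8 block and a
// simple compression-tolerance check (both are illustrative assumptions).
public class SignificantCoefficients {

    // Zig-zag positions of the ten significant coefficients listed above.
    static final int[][] SIGNIFICANT = {
        {0,0},{0,1},{1,0},{2,0},{1,1},{0,2},{0,3},{1,2},{2,1},{3,0}
    };

    // Signing side: quantize each significant coefficient to a multiple of a.
    static void quantizeSignificant(double[][] dct, int a) {
        for (int[] p : SIGNIFICANT) {
            dct[p[0]][p[1]] = Math.round(dct[p[0]][p[1]] / a) * (double) a;
        }
    }

    // A signed coefficient is considered to have survived JPEG compression if the
    // decompressed value stays within half a quantization step of the signed value.
    static boolean survivesCompression(double signedValue, double decompressedValue, int a) {
        return Math.abs(decompressedValue - signedValue) < a / 2.0;
    }
}

Larger values of a widen the tolerance (and thus survive stronger compression) at the cost of a more distorted signed image, which is the behaviour visible in Tables 1 and 2 below.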
Two extra experiments were conducted to illustrate the performance of our method, in addition to the experiments described above. The quantization value a was set to 8 in these two extra experiments. In the first experiment, we input the original image 'Lena' shown in Figure 3 and output the signed gray-level image 'Lena' in Figure 5. After that, we used JPEG lossy compression to compress the signed 'Lena', where the compression rate was 3 (CR = 3), and, moreover, we modified the compressed 'Lena' in the eyes. The decompressed 'Lena' after compression and tampering is shown in Figure 6. The result of the verification procedure is listed in Figure 7. According to our experiment, the decompression result is "acceptable", and the tampered eyes can also be pointed out by our method. The same phenomenon can be observed for 'F16'; the experimental results are shown in Figures 8-10.

5 Conclusions

In this paper, we have made use of two important properties of images for image authentication. Taking advantage of those two properties, our method can both prevent images from being tampered with and allow acceptable JPEG lossy compression. First, we set the compression-tolerant range for the significant DCT coefficients. A block can withstand JPEG lossy compression if the significant DCT coefficients after JPEG lossy compression do not fall outside the compression-tolerant range. Second, we set the maximum difference for each block to detect malicious manipulations done to the non-significant DCT coefficients. According to our experimental results, our method can indeed withstand JPEG lossy compression while keeping the image from being tampered with.

6 References

[1] Bhattacharjee S. & Kutter M. (1998) Compression Tolerant Image Authentication. IEEE International Conference on Image Processing, 1, p. 435-439.
[2] Friedman G. L. (1993) The Trustworthy Digital Camera: Restoring Credibility to the Photographic Image. IEEE Transactions on Consumer Electronics, 39, 4, p. 905-910.
[3] Lu C. S. & Liao H. Y. (2000) Structural Digital Signature for Image Authentication: An Incidental Distortion Resistant Scheme. Proceedings of the Workshops on ACM Multimedia, p. 115-118.
[4] Kundur D. & Hatzinakos D. (1999) Digital Watermarking for Telltale Tamper Proofing and Authentication. Proceedings of the IEEE, 87, 7, p. 1167-1180.
[5] Schneider M. & Chang S. F. (1996) A Robust Image Content Based Digital Signature for Image Authentication. International Conference on Image Processing, 3, p. 227-230.
[6] Lin E. T., Podilchuk C. I. & Delp E. J. (2000) Detection of Image Alterations Using Semi-Fragile Watermarks. Security and Watermarking of Multimedia Contents, 3971, p. 152-163.
[7] Hung K. L., Chang C. C. & Chen T. S. (2001) Secure Discrete Cosine Transform Based Technique for Recoverable Tamper Proofing. Optical Engineering, 40, 9, p. 1950-1958.
[8] Rivest R., Shamir A. & Adleman L. (1978) A Method for Obtaining Digital Signatures and Public-key Cryptosystems. Communications of the ACM, 21, 2, p. 120-126.
[9] Pennebaker W. B. & Mitchell J. L. (1993) JPEG: Still Image Data Compression Standard. New York, Van Nostrand Reinhold.
[10] Chang R. F. & Chen W. T. (1993) Side-match vector quantization for reconstruction of lost blocks. Journal of Visual Communications and Image Representation, 4, 2, p. 171-177.
Table 1: The image quality (PSNR) of the signed images under different quantization values a = 4, 6, 8, 10, and 12

Image   a=4        a=6        a=8        a=10       a=12
F-16    50.04 dB   49.77 dB   47.72 dB   45.93 dB   44.36 dB
Lena    52.07 dB   49.87 dB   47.97 dB   46.16 dB   44.62 dB

Table 2: Numbers of error blocks of the signed images pointed out after JPEG compression with different qualities q = 8, 5, and 1

Signed images   a=4    a=6    a=8    a=10   a=12
F-16 (q=8)      0      0      0      0      0
F-16 (q=5)      942    586    230    70     0
F-16 (q=1)      1024   981    853    820    640
Lena (q=8)      0      0      0      0      0
Lena (q=5)      960    597    225    74     0
Lena (q=1)      1024   1018   921    873    650

Figure 1: An example of the signing procedure: (a) a block of 8x8 pixels, (b) the DCT coefficients, (c) the adjusted DCT coefficients, (d) the content of an array A, (e) the signed block.
Figure 2: An example of the verification procedure: (a) a tampered block of 8x8 pixels, (b) the DCT coefficients, (c) the adjusted DCT coefficients.
Figure 3: The original image 'Lena'.
Figure 4: The original image 'F16'.
Figure 5: The signed image 'Lena', where the PSNR = 47.97 dB.
Figure 6: The compressed 'Lena' with the eyes tampered with.
Figure 7: The verification result.
Figure 8: The signed image 'F16', where the PSNR = 47.72 dB.
Figure 9: The 'F16' with the airframe tampered with.
Figure 10: The verification result.

Protecting the Data State of Mobile Agents by Using Bitmaps and XOR Operators

Jesús Arturo Pérez Díaz, ITESM, Campus Cuernavaca, Av. Paseo de la Reforma 182-A, 62589, Temixco, Morelos, Mexico. E-mail: jesus.arturo.perez@itesm.mx
AND Darío Álvarez Gutiérrez, University of Oviedo, Calvo Sotelo s/n, 33007 Oviedo, Spain. E-mail: darioa@pinon.ccu.uniovi.es

Keywords: security, mobile agents, data state protection, encryption

Received: June 20, 2001

Mobile agents have been considered a promising technology to develop e-commerce applications; however, the security concerns about the technology have stopped their widespread use.
The identified security areas comprise protecting hosts against malicious agents, protecting the agent's transmission, and protecting agents against malicious hosts. The first two security issues and the protection of the agent's code state can be solved by applying traditional security techniques. Even though there are some works that manage the privacy of execution, their implementation is almost unfeasible in terms of performance and complexity. This paper describes a fast and easy-to-implement algorithm that a mobile agent can use to encrypt its data during its itinerary. The algorithm only makes use of a bitmap and XOR operations: XOR operations are applied between the data to be ciphered and a random bitmap, while the map is repeatedly shifted to the right or to the left, and a CRC field is computed for validation against malicious tampering. The method only uses basic bit operations, so its implementation is very easy to develop. Besides, since it does not use any computationally expensive cryptographic technique (e.g. digital signatures), it is very fast. In this way we manage to have a secure, simple, fast and feasible protection algorithm to protect data while mobile agents are roaming, where simplicity and performance are its main advantages.

1 Introduction

Mobile Agent Systems are expected to make e-commerce transactions inside virtual supermarkets. In this application area security is crucial, since we can consider that any application will not be useful without doing secure transactions. Mobile agents consist of code state, data state, and execution state. Mobile agent systems are platforms that allow agents to migrate from one node (a mobile agent system) to another, keeping these three states. While agents migrate, there are several security aspects involved. We can point out different mechanisms that must be implemented by the mobile agent system to ensure the security of mobile agent applications. Mobile agent systems basically must provide:
• Protection of the agent system against attacks from mobile agents.
• Protection of the agent against other agents.
• Protection of information transmission between agent servers against unauthorized third parties.
• Protection of the agent against malicious agent systems (malicious hosts), which includes the protection of the agent's data.

Different security architectures for mobile agents [1] and mobile agent systems [2] [3] have used standard cryptographic techniques like public key cryptography or digital signatures to authenticate authorities and solve the problem of protecting the host against malicious agents. Also, they have implemented secure channels for the transmission of the agents by using SSL or TLS. Nevertheless, the protection of mobile agents from malicious hosts is only partially solved. The code state of the agent can be signed, since it will not be modified; in this way, we can protect the static part of the agent. However, protecting the data state (which changes dynamically) is a more difficult task to tackle. There are some works in this area, described in the related work section; however, no solution with a feasible implementation has been found. Consider that most mobile agent e-commerce applications do not need to protect the entire data state, but only some important values where agents filter information and compile their results.
The algorithm presented in this paper protects all the data that the agent decides to encrypt by calling a cipher function. When the agent returns, only the source server is able to decrypt the sensitive information stored by the agent. The algorithm describes an easy way to protect sensitive data that must be gathered and carried by mobile agents along their itinerary. Two principal advantages are highlighted: the algorithm is simple and feasible to implement, and it is computationally inexpensive.

2 Related Work

Wilhelm presented a technique for protecting the itinerary of mobile agents by using hardware mechanisms [4]. He considered that software algorithms were not enough to ensure complete security during the mobile agent's itinerary. Even though the technique managed to achieve the protection of the itinerary, its implementation in real applications is difficult, since special hardware is required. One interesting approach to avoid malicious host attacks was proposed by Fritz Hohl [5]. This approach, called Code Mess Up, consists of a combination of two mechanisms: the first one generates a new and far less understandable version of the agent's code; the second mechanism restricts the lifetime of the agent's code and data. In this way, when the code of the agent is messed up, a malicious server would need some extra time in order to understand the code and then attack it, but since the agent's lifetime is restricted, the malicious server will not have enough time to attack the agent. In this way the agent remains untouched. Another solution for this problem was proposed by Tomas Sander and Christian Tschudin [6] [7]. They presented techniques for achieving "non-interactive computing with encrypted programs" in certain cases and give a complete solution for this problem in important instances. They further show how an agent might securely perform a cryptographic primitive, digital signing, in an untrusted execution environment. Their results are based on the use of homomorphic encryption schemes and function composition techniques. The last two solutions were designed to offer privacy of the agent's execution, but not to give privacy and integrity to the agent's data. Besides, both of them have two main problems: a quite difficult implementation and a considerable performance hit in case of implementation. Perhaps these disadvantages are the reason why these techniques have not been implemented by any mobile agent system, as far as we know, and the problem in current mobile agent systems is still unsolved. Considering these related works, the goal of our research is to offer a simple and feasible-to-implement algorithm that can be used by mobile agents just for encrypting the data that they gather while they are roaming in untrusted execution environments, and without a perceivable performance hit.

3 Data Encryption Using Bitmaps and the XOR Operation

The design of this data protection technique takes into account the fact that, in most applications, it is not vital to protect the whole data state of the mobile agent, but only some variables holding sensitive data gathered by the agent, which are the main goal of the agent's travel and need to be protected. Typical examples of such applications are e-commerce applications in which an agent travels along an itinerary looking for prices or particular services. The vital information that must be protected is the price or service offered in each visited server.
This technique requires that the agent travel holding data generated by the source server, which will be used by the agent to encrypt the sensitive data gathered, using fast XOR operations.

3.1 Usefulness of the XOR Operator

The main encryption idea is to apply the XOR operation between data and a random number (expressed as a bitmap in a row of a matrix and known only by the source server) to encrypt information. Once the agent returns to the source server, the XOR operation is applied again to the encrypted data, using the same random number, and the information is restored. The agent in the source server will generate two matrices with a number of rows equal to the number of data items it expects to encrypt. Initially, both matrices will be filled with the same random numbers (forming a random background bitmap). One of the matrices will be stored in the source server and the agent will carry the other. For example, let's assume a 10 (binary 1010) is generated as a random number and put in a row of the matrix. This number goes with the agent and a copy is also stored in the source server. During the itinerary, the agent gets a 3 (binary 0011) that it wishes to encrypt. The XOR operation will then be applied between the random number and the datum to be protected (1010 XOR 0011), giving 1001 as a result. This datum is stored in the same row of the matrix, overwriting the initial random mask, in order to prevent the next server from seeing the random number used to encrypt the datum. Also, the next server is not able to know the real datum, since it does not know the random number used to apply the XOR operation. A given server will use the next free row available in the matrix to store new data, as the occupied rows contain data encrypted in previous servers. In this way, the current server will never be able to know the previously encrypted data, since it knows neither the datum nor the random number. The source server has stored, in a duplicate matrix, a copy of the random numbers, in order to retrieve, when the agent returns, the data encrypted by the agent while roaming from server to server. Thus, to retrieve the datum, the encrypted number 1001 gets an XOR applied with the corresponding random number generated in the source server (1001 XOR 1010), giving 0011 as a result (3, the datum the agent had encrypted). To ensure that the information restored upon return to the source server has not been tampered with, and is the same information that the agent encrypted in each server, a CRC field is computed in order to perform an integrity test. The complete encryption algorithm is described in the next section.

3.2 Detailed Description of the Encryption Algorithm

A matrix with several fields is defined (Table 1), which is used to encrypt the agent's data and later to validate them. The matrix is initially filled with random numbers, creating a background bitmap used to encrypt data gathered by the agent along its itinerary (Table 2). The matrix is duplicated: one copy travels with the agent and the other is kept in the source server. The source server's matrix is used to recover the data upon the agent's return. Every datum to be protected by the agent needs a row of the matrix, so the agent must know beforehand the approximate amount of data it is going to use.
The structure of the matrix is as follows:

Host ID    Data to be protected (f1 ... fn)    CW          CRC
128 bits   128-bit blocks                      128 bits    128 bits

Table 1. Fields composing the rows of the matrix (the data area of the mobile agent)

The first field is the identifier of the server, as we need to know, for each row, the place where the data was encrypted. The second field represents the space needed to store the data to be gathered by the agent, in 128-bit blocks f1, ..., fn. The third field is the "codeword" (CW), a random number generated in the remote server; the codeword is used to rotate data before applying the encryption function. The last field is a CRC, which is computed by applying an XOR operation over all the 128-bit blocks in the data area. This CRC is used upon the agent's return to verify that the data area has not been altered.

Table 2. Matrix filled with a randomly generated bitmap (illustrated in the original with example bit patterns in each field)

When the agent departs from the source server, it carries the matrix filled with random numbers, creating a background bitmap that is used to hide information (Table 2). Along the itinerary, the following algorithm is applied for each datum to be encrypted (a code sketch is given after the list):
1. The remote server creates a record with the same fields as a row of the matrix that the agent carries.
2. The host ID, the data to be encrypted in the form of 128-bit blocks, and a freshly generated random codeword (CW) are put into the record.
3. Each 128-bit block fi is rotated to the left as many times as indicated by the 7 least-significant bits of the CW; that is, fi <- fi <<< li, where li = CW & 07Fh.
4. Before applying the third step to fi+1, the CW is rotated to the right as many times as indicated by the 7 most-significant bits of the CW; thus, the number of times each fi is rotated is not always the same. Once the CW is rotated, step 3 is repeated. These steps continue until no more 128-bit blocks are left.
5. The original CW is restored into the corresponding field of the record, so that the original information can be retrieved with the inverse algorithm in the source server.
6. The CRC field is computed as follows: its initial value is filled with binary 0's, and it is then XOR'ed in sequence (from left to right) with all the 128-bit blocks in the data area, giving the final CRC value.
7. Lastly, the corresponding row in the matrix holding the original bitmap is XOR'ed with the generated record, thus encrypting the data.
8. The counter indicating the number of rows used in the matrix is incremented, so that the next available row of the matrix can be used.

It is worth noting that, once the mobile agent arrives at a new server, the new server cannot access the information stored in the previous server, as the background bitmap held in the previous row before the XOR operation was applied cannot be guessed (only the rows not yet used can be seen, which enables an attack described later). Only the source server, which has a copy of the original matrix, is able to apply the inverse algorithm to retrieve the encrypted data. The CRC field is computed in order to detect any alteration made by a malicious server on the encrypted data.
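A minimal sketch of steps 1-8 for a single datum is given below; the use of BigInteger for 128-bit blocks, the row layout and the exact choice of rotation amounts (in particular taking the 7 most-significant bits of CW in step 4) are illustrative assumptions.

import java.math.BigInteger;
import java.security.SecureRandom;

// Encrypts one record {host ID, data blocks, CW, CRC} into a row of the agent's
// matrix that still holds its original random background bitmap.
public class XorRowCipher {
    private static final int BLOCK_BITS = 128;
    private static final BigInteger MASK =
            BigInteger.ONE.shiftLeft(BLOCK_BITS).subtract(BigInteger.ONE);
    private static final SecureRandom RNG = new SecureRandom();

    static BigInteger rotateLeft(BigInteger v, int n) {
        n %= BLOCK_BITS;
        return v.shiftLeft(n).or(v.shiftRight(BLOCK_BITS - n)).and(MASK);
    }

    static BigInteger rotateRight(BigInteger v, int n) {
        return rotateLeft(v, BLOCK_BITS - (n % BLOCK_BITS));
    }

    // row layout: [0] host ID, [1..n] data blocks, [n+1] codeword, [n+2] CRC.
    static void encryptIntoRow(BigInteger[] row, BigInteger hostId, BigInteger[] data) {
        BigInteger cw = new BigInteger(BLOCK_BITS, RNG);                 // step 2: random codeword
        BigInteger[] record = new BigInteger[data.length + 3];
        record[0] = hostId;

        BigInteger rotatingCw = cw;
        for (int i = 0; i < data.length; i++) {
            int l = rotatingCw.and(BigInteger.valueOf(0x7F)).intValue(); // 7 least-significant bits
            record[1 + i] = rotateLeft(data[i], l);                      // step 3: rotate the block
            int m = rotatingCw.shiftRight(BLOCK_BITS - 7).intValue();    // assumed: 7 most-significant bits
            rotatingCw = rotateRight(rotatingCw, m);                     // step 4: rotate CW for the next block
        }
        record[1 + data.length] = cw;                                    // step 5: store the original CW

        BigInteger crc = BigInteger.ZERO;
        for (int i = 0; i < data.length; i++) {
            crc = crc.xor(record[1 + i]);                                // step 6: CRC over the data blocks
        }
        record[2 + data.length] = crc;

        for (int i = 0; i < record.length; i++) {
            row[i] = row[i].xor(record[i]);                              // step 7: XOR with the background bitmap
        }
    }
}

The source server decrypts a used row by XOR'ing it with its own copy of the original bitmap, checking the CRC, and undoing the rotations with the recovered codeword.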
Without any validation action, a malicious server could modify just one bit of the encrypted data field, and when the agent returned to its source server the alteration could not be detected; the agent would recover a wrong, modified value, since the source server just applies an XOR operation with the stored copy of the matrix in order to recover the encrypted value. The CRC field does not prevent a malicious server from making an alteration, but it ensures that if an alteration were made it would be detected, since the CRC field would become invalid. The bit rotations made in step 3 may appear unnecessary. However, if the blocks were not rotated, a malicious server could alter the encrypted information in just one bit in a specific position, and the CRC might not change, since each block is XOR'ed with the next block; then, upon return to the source server, this alteration would not be detected. On the other hand, once the random rotations are applied, a maliciously altered bit in one block would be detected, as this bit affects many positions in the inverse decryption algorithm (since the position of that bit changes after the rotations are applied), always rendering an invalid CRC that reveals the alteration. To retrieve the information encrypted by the agent along the itinerary, the source server just applies the XOR operation to each row of the matrix that was used by the agent, with the corresponding row of the copy of the original matrix holding the initial background random bitmap. Then, using the random CW, inverse rotations are applied to retrieve the real data that was encrypted by the agent in a given intermediate server. The main advantage of this technique of encrypting data using bitmaps and XOR operations is that it is very easy to implement, compared with other methods, which use very complex mathematical algorithms [8]. Besides, it is computationally inexpensive, as only very fast bit operations are used, effectively avoiding the performance impact of other techniques such as digital signatures, keys, or any other means that hurt performance.

4 Feasibility of Implementation and Incorporation into Current Mobile Agent Systems

A great advantage of our protection scheme is the feasibility of implementation. Besides, it could very easily be incorporated into the security mechanisms of current mobile agent systems. The majority of Java-based mobile agent systems define an abstract class called Agent. All the agents programmed by the user inherit from this class the required functionality, so that the agent can migrate from one host to another or can create more agents. This abstract class usually follows a pattern like this:

public abstract class Agent implements java.io.Serializable {
    public void run() { ... }
    public final java.lang.Object clone() { ... }
    public final void createAgent(.....) { ... }
    public final void dispatch(java.net.URL destinationURL) { ... }
    public final void revert() { ... }
}

We just need to add an addsecure() method to the Agent abstract class in order to allow agents to securely store sensitive information in the data structure (the matrix of bitmaps) that is carried with them, so we could define:

public final void addsecure(Object data)

The implementation of this method will encrypt the information using the algorithm described in the previous section and will store it in the next available row of the matrix that is carried with the agent. The matrix can easily be defined in Java using a Java array
(e.g. an instance of the class Vector) that will hold the background bitmap originated in the source server. In this way, each time an agent class is created by a user (e.g. commerceAgent), it will inherit the addsecure() method, allowing it to store information in a secure way.

public class commerceAgent extends Agent {
    ...
}

When the user creates an instance of commerceAgent, the instance will be able to protect the information it is gathering, just by invoking the addsecure() method. For example:

commerceAgent findFlyAgent;

defines an agent of type commerceAgent. The run method of findFlyAgent would contain the instructions to query the price of the flight it is looking for; once it gets the price, it records it for later analysis at the home server, so the last line of the run method would be:

findFlyAgent.addsecure(FlyPrice);

in order to protect the sensitive datum it has obtained. The current server will execute the encryption process and will store FlyPrice safely in the matrix. The next server visited will not be able to find out what the price was in the previous one; it will only be able to encrypt the information that the agent gathers in that server.

5 Limitations of the Method

The algorithm allows the agent to protect the information it chooses during its itinerary, and to verify that it has not been altered when the agent returns. The algorithm does not prevent the possible alteration of data by malicious hosts, but it detects any modification that has been made. In this way, if any alteration is detected (which means a CRC field is invalid), the agent will reject the information, since it is considered invalid. In this way our technique offers integrity. The current server will never be able to access the previously encrypted data, since it knows neither the data nor the random numbers used to apply the XOR operation. However, it can see and copy the still available rows with random numbers that will be used to encrypt the next data, not only in the current server but in the following servers as well. The first and most evident consequence of this is that a visited server cannot retrieve the data that was encrypted before it, but could easily make a copy of the rest of the background bitmap. This means that a server could potentially retrieve the data encrypted in the future by an agent, assuming that the agent visits the same server again. Thus, an agent should not visit the same server twice if it wants to be completely secure. Another possible attack (although less probable) is that two cooperating malicious servers team up to retrieve the information carried by the agent. The first server would send to the second one a copy of the unused part of the background bitmap already known by the first server (the available rows of the matrix). If the agent arrived later at the second malicious server, that server would be able to retrieve the data encrypted since the agent left the first malicious server, and then modify the values. The last limitation is that there is a fixed maximum number of data items that can be protected, which is given by the length of the matrix (the length must be set in advance). However, in practice, a reasonable length can be set according to the expected task to be carried out by the agent. Finally, this technique only protects the part of the data state that the agent wishes to encrypt. The rest of the data, such as local variables, etc., is not protected.
6 Future Work

We will continue working on this technique in order to implement an improved algorithm that avoids the current limitations. Nevertheless, we intend to use only the operations included in this paper (bit rotations and the XOR operation), or equally fast and simple ones, so that we can keep the simplicity and speed which are the objectives and philosophy of this work.

7 Conclusions

One of the problems that a mobile agent system must solve is the protection of agents from malicious hosts, which includes the protection of the data state of an agent. This is very important in order for agent technology to be adopted in e-commerce applications, for example in applications where agents collect information (such as flight prices) for later analysis at the source server. Protecting the data gathered by the agent (and not the whole data state, which is not vital) is the objective of the research described in this paper. For example, malicious servers should not be able to see or modify the information gathered, in order to change previous low prices and make their own price appear as the best. Other techniques such as [5] and [6] try to solve the problem through privacy of execution, applying very complex techniques, which are very difficult to implement and, more importantly, are computationally very expensive, as key cryptography is used. This is a hurdle that is very difficult to overcome in practical systems. We propose a new technique to protect the part of the data state of an agent (the data gathered that the agent wishes to protect) that does not suffer from these limitations, as it is fast and easy to implement. A matrix is generated at the source server and filled initially with random bit numbers. Each row is used to protect one datum, and is divided into 128-bit blocks. A copy of the matrix is stored at the source server, and the other copy travels with the agent. To protect one data item, the agent uses a row and applies the XOR operation to the data item with the random number previously held there, encrypting it and overwriting the initial random number with the result. A CRC is computed, also using XOR operations, in order to detect alterations when returning to the source server. Some bit alterations would not change the CRC, so the random bitmaps are bit-rotated n times, as indicated by a random codeword that is also held in the matrix (and is also rotated). The original server is the only one able to decrypt the information, since the inverse algorithm (basically undoing the rotations and applying the XOR operation with the original random bitmap) requires knowledge of the original random bitmap and codewords, which is available only to the original server. The only limitation is that an agent should not visit the same server twice, or a server allied with a malicious server, as a copy of the matrix could be made and subsequent encrypted data items could be retrieved. A minor limitation is that the agent should estimate the maximum number of data items to protect, as the matrix must be generated beforehand. The technique we have presented removes the complexity and computational limitations of other techniques, which hinder the acceptance of agent technology in real applications. The protection of the agent's data state is made feasible in practical applications, as no performance hit is introduced because no expensive key cryptography is used.
Furthermore, the algorithm could be many times faster than any other that uses traditional key cryptographic techniques, since only bit operations are used. This algorithm can be easily integrated into current mobile agent systems in order to create basic e-commerce applications that compile information securely.

References

[1] Pérez Díaz Jesús Arturo, Álvarez Gutiérrez Darío. Sahara: a comprehensive security architecture for mobile agent systems. Simposio Español de Informática Distribuida. ISBN: 84-8158-163-1. Orense, Spain.
[2] Pérez Díaz Jesús Arturo, Álvarez Gutiérrez Darío, Cueva Lovelle Juan Manuel. An implementation of a secure Java2-based mobile agent system. The Fifth International Conference on the Practical Application of Intelligent Agents and Multi-Agent Technology, PAAM 2000. ISBN: 1 902426 07 X. Manchester, U.K.
[3] Fraunhofer IGD. Project page, http://www.informatik.uni-stuttgart.de/ipvr/vs/projekte/mole/mal/preview/SeMoA-CSecure-Mobile-Agentsl.9656.txt.html
[4] Wilhelm Uwe G., Staamann Sebastian. Protecting the Itinerary of Mobile Agents. Proceedings of the ECOOP Workshop on Distributed Object Security and 4th Workshop on Mobile Object Systems, July 21-22, 1998, Belgium.
[5] Hohl Fritz. An approach to solve the problem of malicious hosts in mobile agent systems. Institute of Parallel and Distributed Systems, University of Stuttgart, Germany, 1997.
[6] Sander Thomas, Tschudin Christian. Protecting Mobile Agents against Malicious Hosts. Lecture Notes in Computer Science (LNCS) 1419, Springer-Verlag, New York, USA, June 1998.
[7] Sander Thomas, Tschudin Christian. Towards mobile cryptography. International Computer Science Institute (ICSI) Technical Report, 97(049):1-14, November 1997.
[8] Joan Feigenbaum, Peter Lee. Trust Management and Proof-Carrying Code in Secure Mobile-Code Applications. Accepted paper to the DARPA Workshop on Foundations for Secure Mobile Code, 26-28 March 1997.
[9] Sobrado Igor. Evaluation of two security schemes for mobile agents. Proc. ACM SIGCOMM Latin America and Caribbean 4/01, San José, Costa Rica, April 3-5, 2001.

Evaluation of Technologies for Business Process Automation

Maja Pušnik, Matjaž B. Jurič and Ivan Rozman
University of Maribor, Faculty of Electrical Engineering, Computer and Information Science, Institute of Informatics, Smetanova 17, 2000 Maribor
E-mail: maja.pusnik@uni-mb.si

Keywords: business process automation technologies, decision models, ebXML, XLANG, RosettaNet

Received: June 2, 2002

The importance of process automation for B2B (business to business) collaboration is rising. The efforts are directed towards automating business processes and forming a global electronic market. In this paper we present and evaluate the three most important technologies for business process automation: ebXML (Electronic Business XML - eXtensible Markup Language), RosettaNet and XLANG. They differ in terms of features, quality and serviceability. We analyze, compare and evaluate these technologies from the perspective of SMEs (small and medium enterprises). Based on the comparison, we define a multi-criteria decision model with twenty parameters and corresponding weights, evaluate the alternatives and define a utility function, which helps us select the most suitable technology. The contributions of this paper are the in-depth evaluation of the technologies and the definition of a multi-criteria decision model.

1 Introduction

It is a well-known fact that businesses must change in order to survive the upcoming changes and progress.
To make the idea of a global marketplace and B2B work, proper technologies, which will assure safety and efficiency, must be created. They have to be appropriate for all kinds of enterprises, small and large, for those with great financial resources and responsibilities and for those with limited budgets. Only with such universal technologies will a global market and complete serviceability be realized. In this paper we review and compare the three most important technologies for business process automation: ebXML, XLANG and RosettaNet. We define criteria for their evaluation and build a decision model with twenty criteria. We evaluate the results and choose the most suitable technology from the perspective of an SME. All three technologies are based on XML and build on the functionality of web services, reusing existing web service technologies such as SOAP (Simple Object Access Protocol), UDDI (Universal Description, Discovery and Integration) and WSDL (Web Service Definition Language). We will see that they differ in some features while in others they are complementary. Because they are based on open standards, they are accessible in terms of price and complexity not only to large enterprises but also to small and medium enterprises.

The review of related research has shown that there are not many similar analyses. The comparison made in [6] only compares ebXML and RosettaNet in an informal way and does not define a decision model. The author in [20] compares B2B standards, which include RosettaNet, ebXML, OAGIS (Open Applications Group Integration Specification) and simple web services. The same author explains in a different article [21] how RosettaNet, ebXML, OAGIS and EDI (Electronic Data Interchange) fit together. However, the author does not define a formal decision model. In [22] the author again compares RosettaNet, ebXML, OAGIS, web services, xCBL (UBL) - XML Common Business Library (Universal Business Language) and cXML (commerce XML) and creates a comparison framework.

Our paper is organized as follows: the needs of the market are evaluated in the second chapter. The third chapter compares ebXML, RosettaNet and XLANG. The fourth chapter defines a multi-criteria decision model and evaluates the technologies. The last, fifth chapter gives a conclusion of the results.

2 Needs of the Market

The way enterprises work, understand their existence and survive must be retained, but the way they do business and communicate with each other must be improved. Business processes must still work on and through the net, just as they have worked manually. Technologies must describe business processes in a consistent and safe way, and more: they must enable changes, upgrades and adaptations, since business is a living process and must therefore be flexible and manageable. But this is only the first step; there is still the question of automation. A business process consists of many steps and includes many people, some of them completely unnecessary, which only increases the possibility of making a mistake. One of the goals in creating a global electronic market is to automate everything that can be automated, including routine work and explicitly defined processes with long-term rules and foreseen conditions. Some solutions have already been created in the past, more or less successfully, but they are by far not sufficient for the goals and ambitions of the new millennium. Web services have only been the beginning of what is yet to come.
They enabled the description of processes, but not their automation. The ultimate goal of these technologies is to make business as safe and as accessible as possible for all businesses all over the world [19].

3 Comparison of Technologies

There are several technologies for the coordination and automation of business processes. Some of them have been present on the market for quite a long time, for example EDI. But the problem with the older ones is their inaccessibility for smaller enterprises and an obvious inflexibility, since most of them require a large initial investment and expensive support. The up-to-date technologies build upon legacy technologies, which have used older proprietary standards.

Figure 1: Process Coordination Framework [1] (layers, from top to bottom: contracts/agreements - ebXML CPA; private process - XLANG, WSFL, BPML; public collaborative process - WSCL, ebXML BPSS; endpoint description - WSEL; service description and transport binding - WSDL, ebXML CPP)

The need for different kinds of technologies has increased. Modern technologies are mutually connected and complementary. Figure 1 presents their relationships, divided horizontally by the level of provided services and vertically by the initiating organization by which they were sponsored and created [1]. All of the technologies are an upgrade of web services and they are all based on the XML language. Their design priorities and fields of concentration, however, differ. Service description and transport binding was assured in WSDL and in ebXML CPP (Collaboration Protocol Profile) as the EDI follower. With time, new technologies emerged from them in different directions. Their relations are seen in Figure 1: WSEL (Web Service Endpoint Language), ebXML BPSS (Business Process Specification Schema), WSCL (Web Service Conversation Language), WSFL (Web Service Flow Language), XLANG, BPML (Business Process Markup Language), RosettaNet PIPs (Partner Interface Processes) and ebXML CPA (Collaboration Protocol Agreement) [1]. WSDL is meant for describing network services as a set of endpoints operating on messages. WSEL is meant for non-operational features of web services, such as security. WSCL allows us to define abstract interfaces of web services for business process conversations. WSFL allows the description of business processes or interaction patterns based on web service operations. ebXML provides a set of technologies for describing the various stages of business collaboration. CPPs enable companies to specify their profiles, in which they define the terms for collaboration. CPAs are the computer equivalents of trading partner agreements; they can be defined manually or generated automatically from two or more CPPs. The actual flow of a business process is specified using BPSS. Shared public and private business processes for collaboration between two or more partners are specified using BPML. The focus of XLANG and RosettaNet PIPs is similar to BPML and will be discussed further later in this article. XML, concentrating on content, enables remote systems to interchange and interpret documents without human intervention.
An XML document is basically an ordinary text file with markup [1]. The combination of structure, flexibility and verification makes XML useful not only for electronic publishing, but also for designing the business messages exchanged between enterprises [1]. While building larger processes, all business partners must agree upon the vocabulary, the interfaces and the type of method invocation before they send individual messages. XML vocabularies can define all kinds of business documents or even whole frameworks, which provides interoperability and functionality.

3.1 ebXML

ebXML is a family of specifications that enables companies of all sizes to collaborate with each other, independently of location [2], through the exchange of XML-based messages [8]. Development of the ebXML specifications is an ongoing effort sponsored by OASIS (Organization for the Advancement of Structured Information Standards) and UN/CEFACT (United Nations Centre for Trade Facilitation and Electronic Business) [8]. The need for ebXML lies in the experience from the past. EDI, the predecessor technology for data interchange among enterprises, was out of reach for most SMEs, since the costs were too high and the implementation too complex. ebXML is based on XML, web services and open standards and is publicly available. It overcomes this barrier and enables the creation of software for building applications based on a common structure and syntax, which will lower the costs of business data interchange. The ebXML mission is to provide an open XML-based infrastructure enabling the global use of electronic business information in an interoperable, secure and consistent manner by all parties [8]. The ebXML architecture was primarily designed for B2B interaction. UDDI and SOAP offer services with similar functionality at a lower level; ebXML uses and builds upon these standards. It provides safe and reliable messaging and adds a set of higher-level specifications for expressing the semantics of B2B collaborations. For these purposes it provides CPPs, CPAs, BPSS, core components, a registry/repository and BPML [11]. ebXML provides an effective platform for long-term business transactions and enables us to express the following: quality of service, timeouts, confirmations, multi-language support, authentication, authorization, privacy, integrity and non-repudiation. An example of ebXML usage is shown in Figure 2.

Figure 2: ebXML in practice (the companies request the ebXML specifications, register company business profiles, scenarios and implementation details, build local system implementations, and confirm that profiles and scenarios are accepted)

By using ebXML, companies have a standard method to exchange business messages, conduct trading relationships, communicate through data in common terms, and define and register business processes [8]. It enables all parties to complement and extend current EC/EDI (electronic commerce/EDI) investments and it expands electronic business to new and existing partners. It also facilitates the convergence of current and emerging XML efforts [8].
ebXML delivers value by [8]: using the strengths of OASIS and UN/CEFACT to ensure a global, open process; developing technical specifications for the open ebXML infrastructure; creating the technical specifications with the world's best experts; collaborating with other initiatives and standards development organizations; building on the experience and strength of existing EDI knowledge; enlisting industry leaders to participate in and adopt the ebXML infrastructure; and realizing the commitment of ebXML participants to implement the ebXML technical specifications.

3.2 XLANG

XLANG is a notation for the specification of message exchange behavior among participating web services, supporting in particular the automation of business processes [9]. It is expected to serve as the basis for automated protocol engines that can track the state of process instances and help to enforce protocol correctness in message flows. XLANG is based on XML and is used for describing business processes in the BizTalk initiative. It offers a model for the orchestration of services and contract collaboration between partners [3]. XLANG is fully focused on public processes. It supports long-term operations and nesting. It enables: exception handling, restoring operations, behavior, actions, control flow, correlations, transaction contents, service management, timeouts, custom correlation of messages, modular behavior description and contracts with multiple roles [3]. However, it defines neither authentication, quality of service, nor non-repudiation [4]. The goal of XLANG is to make it possible to formally specify business processes as stateful, long-running interactions [9]. The main features of XLANG include [1]:
• behavior: a container for the description of the service's behavioral aspects, including support for looping, concurrency and exception handling,
• actions: atoms of behavior, referencing WSDL operations on available ports,
• control flow: the sequence in which the service performs actions,
• correlations: the structure the service uses to route messages to the correct workflow instances,
• context: the context for long-running transactions,
• service management: features of service instance management, and
• port mapping: the method for plugging in the service user and the service provider.

XLANG is an extension of WSDL, and dynamics in processes are supported with different flows [3]:
1. Message flow, where actions are the basic constituents of an XLANG process definition that specifies the behavior of the service. The actions are request/response, solicit response, one way, notification, timeouts and exceptions.
2. Data flow, which is fed by the message flow and supports the control flow decisions.
3. Control flow, which provides support for looping, besides the regular elements. It also enables exception handling and transactional behavior.

XLANG also supports business process contracts; however, they are merely mappings between two port types which interact together. A contract can only map ports that are unidirectional [3]. The unit of action offered by a service is an operation. An operation can be a single asynchronous message, or a request/response pair of messages with optional fault messages. The operation can be either incoming or outgoing. But WSDL does not specify the operation semantics. There are three possibilities [17]:
1. In the first case, the operation is a stateless service that has no memory of previous operations, such as a stock quote service.
2. The second possibility is an operation on an object, in the usual sense of object-oriented programming systems, in which case the object has the ability to use its state variables to keep a record of the consequences of previous operations. In this case, we usually think of the object as being subservient to the caller, since the caller controls the entire life cycle of the object. The object itself has little influence on the order in which its operations are invoked and no independent behavior.
3. The third possibility is autonomous agents with a full state representation of the service. In this case the service supports long-term interactions with full state, in which every interaction has a beginning, a defined protocol for operation calls and an ending. The supplier has to provide a service which starts an interaction by receiving an order through the incoming message, and then returns an acknowledgement to the buyer if the order can be fulfilled.

Enterprise workflow systems today support the definition, execution and monitoring of long-running processes that coordinate the activities of multiple business applications. But they do not separate the internal implementation from the external protocol description [9].

Figure 3: XLANG connecting two parties [9]

Figure 3 represents the dynamics between two participants inside an electronic market, where XLANG is the translating key between a buyer and a supplier that cooperate on the net using the advantages of the electronic market.

3.3 RosettaNet

RosettaNet is a non-profit consortium of more than 400 of the world's leading Information Technology, Electronic Components, Semiconductor Manufacturing and Solution Provider companies, working to create, implement and promote open electronic business process standards [7]. RosettaNet was created as a compromise between EDI and SOAP. Its main goals are reaching dynamic, flexible trading networks, operational efficiency and new business opportunities [10]. It enables: real-time complex transactions, checking, confirmation, non-repudiation, multiple languages, additional standards in industry, SSL (Secure Socket Layer) authentication, digital signature and data encoding. Its biggest advantage is the well defined although inflexible PIP [5]. The purpose of every PIP is to offer general business data models and documents, which enable interface implementation by system developers. Every interface includes [14]:
• an XML document, based on the DTD (Document Type Definition), which specifies PIP services, transactions and management, including dictionary properties,
• class and sequence diagrams in UML,
• a validation tool and
• an implementation guide.

The PIP interface offers a mechanism for sending messages and reporting failures. It demonstrates the integration of web services and the security features demanded by RosettaNet [19]:
• two-way SSL authentication,
• digital signature,
• data encryption and
• non-repudiation.

A RosettaNet PIP defines an automated business process among trading partners for requesting and offering product price and availability information [16]. Different business processes are covered by:
1. The RosettaNet executive plan, which offers general guidance, address priorities and integration through tables.
2. Individual supply chain plans, which address supply chain-specific themes, prioritization, sources, implementation and adaptation.
3. RosettaNet partners, which enable voting about standards, participation in workshops and implementation.

RosettaNet standards are managed on a global level; locally they are focused on implementation and support, so partners can choose between global or local membership [13]. RosettaNet is very rich in supporting tools: the RosettaNet implementation tool including the current PIP template, a Partner Agreement Wizard for quick importation, development and testing of customized PIPs, and more. It also contains the RosettaNet dictionary and the RosettaNet implementation framework. The template enables the development of new PIPs. The Partner Agreement Wizard enables the importing of trading partners and fast development of new processes. An embedded PIP enables the implementation of only the particular PIP the partner needs. It includes support for all published RosettaNet PIPs as well as for CIDX (Chemical Industry Data Exchange) and PIDX (Petroleum Industry Data Interchange). A PIP can also be tested before it is actually applied and used. RosettaNet is also focused on industry support; the adapter for industry development enables integration with new and existing applications and ways of doing business [15].

Figure 4: RosettaNet communication

RosettaNet plans to integrate support for the ebXML Messaging Services Specification in future releases of RosettaNet's Implementation Framework (RNIF). While RosettaNet remains committed to developing the business process standards required to support the complex needs of the high-technology industry, it also wants to ensure interoperability across all supply chains. Figure 4 represents the communication between two trading partners with the help of a RosettaNet PIP, which enables the connection of business processes [12].

4 Evaluation Model

4.1 Criteria

To be able to evaluate the technologies for describing business processes with respect to their suitability and quality, we have defined a multi-criteria decision model. We have identified the following criteria [18]:

Defining and describing processes: Evaluates the architectural support, syntax and semantics for describing all the features of the process and the support for the transition from classical to electronic business, from the aspects of flexibility, simplicity, user friendliness and compliance with standards.

Collaboration description: Evaluates the support for business interactions and for defining relationships between partners, from the aspects of flexibility, safety and complexity.

Role model: Evaluates the support, with modeling tools, for describing roles and the collaboration between them.

Small/big/medium enterprises support: Evaluates the appropriateness and flexibility of the technology for different company sizes with different characteristics, needs and preferences.

Complexity and learning effort: Evaluates the amount of effort and change needed to learn and understand the technology and all its features.

Efficiency: Evaluates how efficient the technology is at describing and specifying business processes.

Maturity: Evaluates the maturity, based on the number of years the technology has existed.

Tools support: Evaluates the support within tools and integrated development environments, which ease development and assure quality.

Synchronous communication support: Evaluates the support for synchronous, short-term transactions, which require an immediate answer.
Asynchronous communication support: Evaluates the support for asynchronous, long-term transactions.

Independency from communication protocols: Describes the relationship between the communication protocol and the technology.

Quality of service: Evaluates the possibilities for specifying the service quality of certain flows, which can be done either by raising the priority of a flow or by limiting the priority of another flow.

Authentication: Evaluates the level of verification of the sender's identity - whether the business message sender is or is not who he claims to be [6].

Authorization: Evaluates the level of verification of whether the sender of a message is permitted to send the subject message to the receiving partner [6].

Integrity: Evaluates whether the messages remain unaltered during transportation [6].

Encryption: Evaluates the coding and the level of security of messages against unauthorized readers [6].

Non-repudiation: Evaluates the mechanism for verifying whether an originating trading partner can or cannot deny having originated and sent a message, and whether a receiving trading partner can or cannot deny having received a message sent by its partner [6].

Exceptions handling: Evaluates the preparation for all sorts of failures, duplications and losses of data.

Claim detection: Evaluates the preparation and support for events of claim loss.

Data transformation: Evaluates the possibilities, tools and technologies for data transformation between collaborating enterprises.

c1 - Defining and describing processes: 2 - flexible, simple, compliant with standards; 1 - simple and user friendly; 0 - basic features only
c2 - Collaboration description: 2 - multiple languages, flexibility, safety; 1 - safety and basic features; 0 - basic features
c3 - Role model: 1 - yes; 0 - no
c4 - Small/big/medium enterprises support: 2 - big/medium/small; 1 - big/medium; 0 - big
c5 - Complexity and learning effort: 2 - simple; 1 - moderate effort; 0 - great effort
c6 - Efficiency: 2 - very efficient; 1 - averagely efficient; 0 - low efficiency
c7 - Maturity: actual number of years
c8 - Tools support: 2 - many; 1 - medium; 0 - low
c9 - Synchronous communication support: 1 - yes; 0 - no
c10 - Asynchronous communication support: 1 - yes; 0 - no
c11 - Independency from communication protocols: 1 - yes; 0 - no
c12 - Quality of service: 1 - yes; 0 - no
c13 - Authentication: 1 - yes; 0 - no
c14 - Authorization: 1 - yes; 0 - no
c15 - Integrity: 1 - yes; 0 - no
c16 - Encryption: 1 - yes; 0 - no
c17 - Non-repudiation: 1 - yes; 0 - no
c18 - Exceptions handling: 2 - handling of message loss, resolution and system recovery; 1 - two of the above; 0 - one of the above
c19 - Claim detection: 2 - good; 1 - average; 0 - poor
c20 - Data transformation: 2 - good; 1 - average; 0 - poor

Table I: Criteria and scale

4.2 Utility Function

We have defined a utility function which normalizes the results so that they are comparable (on a scale between 0 and 1). In the case where the input value is an actual number, the utility function first transforms it to the closed interval from 0 to 1.

Equation 1: Utility function

$U_j = \sum_{i=1}^{20} \frac{w_i}{100} \cdot \frac{c_i(A_j)}{c_i^{\max}}$

Equation 2: Maximum utility

$U = \max_{j=1,\dots,N} U_j$

Meaning of the symbols:
• U - maximum utility,
• Uj - utility of alternative j,
• ci - criterion i (Table I),
• ci^max - maximum value on the scale of criterion i,
• Aj - alternative j (ebXML, XLANG, RosettaNet),
• wi - weight of criterion i,
• N - total number of alternatives.
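To illustrate how such a utility function can be applied, the following Java sketch computes a weighted, normalized utility for one alternative and selects the best one. It is a minimal sketch under the assumption that each criterion value is divided by the maximum value of its scale and that the result is divided by the sum of the weights; the class and method names are illustrative and not taken from the original model.

public class UtilityModel {

    // values[i]    - score of criterion i for one alternative
    // maxValues[i] - maximum value on the scale of criterion i
    // weights[i]   - weight of criterion i
    static double utility(double[] values, double[] maxValues, double[] weights) {
        double weighted = 0.0, weightSum = 0.0;
        for (int i = 0; i < values.length; i++) {
            weighted += weights[i] * (values[i] / maxValues[i]);
            weightSum += weights[i];
        }
        return weighted / weightSum;   // lies in [0,1] when every value is within its scale
    }

    // Returns the index of the alternative with the maximum utility (U = max_j U_j).
    static int best(double[] utilities) {
        int argMax = 0;
        for (int j = 1; j < utilities.length; j++) {
            if (utilities[j] > utilities[argMax]) argMax = j;
        }
        return argMax;
    }
}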
4.3 Results

For the purposes of the evaluation of the technologies in this article, we have selected the weights based on the preferences of an SME, where security (authentication, authorization, integrity, encryption and non-repudiation), defining and describing processes, collaboration support, complexity and learning effort, maturity, tools support, data transformation, exception handling and quality of service are particularly important. The selection of the weights is based on the survey done in [18]. The weights can, however, be altered according to the needs and priorities of each distinctive business. Table II shows the evaluation of ebXML, RosettaNet and XLANG. It is divided into five columns. The first column presents the criteria. The second column shows the weights assigned to each criterion. The remaining columns show the evaluations for each technology, using the scale explained in Table I. The last row shows the results calculated using the utility function.

As seen in Table II, ebXML has achieved the highest result. It turns out that ebXML is the best technology for most businesses. XLANG is second best, although it lacks quality of service, authentication and non-repudiation. However, it is integrated within the BizTalk Server initiative, which is very promising, and we believe that it will improve over time. RosettaNet is the least appropriate for general SMEs. Its main strengths lie in its technical features and level of development. Since it is the oldest technology of the three, it is the most mature one. Its main disadvantage is the fact that it is suitable mainly for very large companies, since its PIP framework is very inflexible and, once created, very difficult to alter, and it is thus inappropriate for smaller businesses.

c w ebXML XLANG RosettaNet
c1 9 2 2 0
c2 8 2 1
c3 2 1 0
c4 10 2 2
c5 9 1 1
c6 7 2 1
c7 8 2 2
c8 2 1 1
c9 2 1 1
c10 2 1 1
c11 2 1 0
c12 6 1 0
c13 3 1 1
c14 3 1 1
c15 5 1 1
c16 3 1 1
c17 4 1 1
c18 6 2 2
c19 2 2 1
c20 7 2 2
0.735 0.606 0.498

Table II: Evaluation matrix and results

5 Conclusions

The need to do business on the net and to automate business processes is increasing, as is the need for supporting technologies. Such technologies must satisfy certain standards; they must be flexible and available to all organizations, large ones but particularly small and medium enterprises. Describing business processes must be relatively simple, so that even non-programmers can use it, since business process experts usually do not have the knowledge needed to work with complex languages. In this article we have identified, compared and evaluated the features of the three most important technologies and, upon our findings, defined a multi-criteria decision model for their quantitative evaluation. The defined decision model is usable for all kinds of enterprises, which can express their priorities through the criteria weights. For the purposes of this article we have also defined a common set of weights for small and medium enterprises and carried out the evaluation of the technologies. From this perspective we have determined that the ebXML technology is the most suitable, with the widest range of possibilities, followed by XLANG and RosettaNet.

References

[1] Selim Aissi, Pallavi Malu, Krishnamurthy Srinivasan (May 2002), E-Business Process Modeling: The Next Big Step, IEEE Computer, pp. 55-62.
[2] Alan Kotok, David R. R. Webber (2002), ebXML: The New Global Standard for Doing Business over the Internet, New Riders Publishing.
[3] ebPML.org (2002), XLANG, http://www.ebpml.org/xlang.htm.
[4] David O'Riordan (April 10, 2002), Business Process Standards for Web Services: The Candidates, Web Services Business Strategies and Architectures, http://www.webservicesarchitect.com/content/articles/oriordan01.asp.
[5] Joe McKee (May/June 2002), RosettaNet at the Dance - an e-business standard does its own choreography, Oracle Technology Network, pp. 51-52.
[6] Pekka Rantola, Janne J. Korhonen (May 15, 2002), RosettaNet vs. ebXML - Security Solutions and Exception Handling, Helsinki University of Technology, http://www.soberit.hut.fi/T-86/T-86.161/2002/RosettaNet.
[7] Vitria (February 26, 2002), RosettaNet E-Business Process Standards for the High-Tech Industry, Vitria Technology, Inc., http://www.vitria.com/news/press_releases/pr2002-02-26.html.
[8] ebXML, Technical specifications, http://www.ebxml.org/.
[9] Cover Pages hosted by OASIS (June 6, 2001), XLANG, Technology Reports, http://xml.coverpages.org/xlang.html.
[10] RosettaNet Overview, Background Information, http://www.rosettanet.org/RosettaNet/Rooms/DisplayPages/LayoutInitial.
[11] Madhu Siddalingaiah (August 17, 2001), Overview of ebXML, Technical overviews, Sun Microsystems, http://dcb.sun.com/practices/webservices/overviews/overviewebxml.jsp.
[12] Arsin Corporation (2002), Solution Integration Services, Arsin RosettaNet PIP Solution, http://www.arsin.com/docs/RNTFactSheet_final.pdf.
[13] Andy Moir (April 2002), Introduction to RosettaNet, XML.gov, http://xml.gov/presentations/rosettanet.
[14] Cover Pages hosted by OASIS (November 2002), RosettaNet, Technology report, http://xml.coverpages.org/rosettaNet.htm.
[15] Microsoft BizTalk Server (2002), BizTalk Accelerator for RosettaNet Features, Microsoft, http://www.microsoft.com/biztalk/evaluation/features/rosettanet.asp.
[16] BEA WebLogic Integration (2002), RosettaNet 2.0 Security Sample, BEA, http://edocs.bea.com/wli/docs70/b2bsampl/rn2sec.htm.
[17] XLANG (2001), Web Services for Business Process Design, Microsoft Corporation, http://www.gotdotnet.com/team/xml_wsspecs/xlang-c/default.htm.
[18] Object Technology Center (2002), Preferences of small and medium businesses, Technical Report, FERI.
[19] Matjaž B. Jurič, S. Jeelani Basha, Rick Leander, Ramesh Nagappan (December 2001), Professional J2EE EAI, Wrox Press Ltd.
[20] Arijit Sengupta (2002), Oracle's support for open eBusiness standards, Oracle Corporation, http://www.idealliance.org/papers/xmle02/slides/Sengupta/sengupta.ppt.
[21] Arijit Sengupta (2002), Data integration, process integration and trading partner agreements, Oracle Corporation, http://www.edifice.org/ERUG/Sengupta_ERUG.ppt.
[22] Paavo Kotinurmi (October 22, 2002), Comparing XML Based B2B Integration Frameworks, Helsinki University of Technology SoberIT, http://www.soberit.hut.fi/ICTEC/lectures/20021022_Kotinurmi.pdf.

Visual Secret Sharing Watermarking for Digital Image

Shen-Chuan Tai, Chuen-Ching Wang* and Chong-Shou Yu
Institute of Electrical Engineering, National Cheng Kung University, Tainan, Taiwan, R.O.C.
Address: Institute of Electrical Engineering (computer / group 92533), National Cheng Kung University, Tainan, 701, Taiwan
Email: wcj@rose.ee.ncku.edu.tw

Keywords: Visual Secret Sharing Watermarking (VSSW), Discrete Cosine Transform (DCT), Digital Watermarking, Copyright Protection.

Received: February 23, 2001

A visual secret sharing watermarking (VSSW) technique is proposed as a way of solving copyright protection problems for digital images.
The proposed watermarking technique employs a visual secret sharing (VSS) scheme and separates the watermark into two parts, a public watermark and a secret watermark. For watermarking security, only the public watermark is inserted into the original image, while the owner holds the secret watermark. Without the secret watermark, it is almost impossible to extract the watermark even if the embedding algorithm is published. To meet the requirements of robustness and imperceptibility, we modify DCT coefficients belonging to the middle frequency band to embed the public watermark. Importantly, the watermark can be retrieved from the watermarked image without resorting to the original image. Various experiments using the proposed watermarking method are presented to demonstrate robustness to tampering and to a variety of common image processing operations and geometric manipulations.

1 Introduction

Protection of intellectual property is an increasingly important concern, as widespread use of the Internet is making multimedia data increasingly easy to copy and distribute. Fortunately, digital watermarking techniques allow us to embed copyright information into digital content and later extract the watermark to detect copyright infringement and confirm legal ownership. Many watermarking techniques [1-12] have been published in the literature. These published techniques utilize either the transform domain or the spatial domain. In [2], Cox et al. describe a method to embed into the host image a watermark composed of a randomly generated sequence. This scheme applies the full-frame Discrete Cosine Transform (DCT) to the original image, producing a set of coefficients. Then, a subset of these coefficients is chosen according to a rule that depends on the most perceptually significant coefficients of the DCT transform domain. To embed the watermark, they use a scaling modulator to alter the values of these coefficients according to the values of the watermark. Cox's watermarked version is robust to some attacks involving common signal and geometric processing operations. However, there are some drawbacks to this technique. First, extracting the watermark requires the original image for watermark detection. This limitation affects its application on the Internet. Second, the authors use a similarity-measure threshold to determine whether the host image is watermarked or not; in practice, selection of too small or too large a threshold will lead to watermark detection errors. Third, modulating the most significant coefficients (excluding the DC term) with a random sequence degrades image quality, and is thus an unreasonable requirement for a general watermarking scheme. Hsu and Wu [4] proposed a frequency-domain watermarking technique that uses a fixed block-based DCT transformation. The method first breaks up the host image into 8x8 blocks and then performs the DCT on each block. In the embedding algorithm, they select 16 middle-band coefficients from each block and then modify these coefficients according to the residual mark to reverse the corresponding polarity. After this procedure, the watermark is embedded into the host image to form a watermarked image. Unlike [2], this method makes use of a binary image as the watermark, thereby allowing identification of the extracted watermark by direct use of the unaided human eye. Both human visual recognition and a similarity measurement were used experimentally to verify the efficiency of their watermark extraction method. However, there are some drawbacks to this technique.
First, this method cannot overcome certain attacks, for example image rotation and image resampling. Second, this scheme also requires the original image to extract the watermark information. For watermarking systems applicable to the Internet, security is a very important concern. The watermarking systems of [2] and [4] require the original image for watermark retrieval, making verification complicated and necessitating that the original image be shared in a public place or network for ownership verification, thus making these systems unsuitable for Internet application. In this paper, a new watermarking technique based on a VSS scheme for enhancing watermarking security is presented. The proposed method operates in a full-frame DCT domain, which allows a reasonable tradeoff between quality and robustness. More importantly for Internet application, watermark extraction requires neither the original image nor the original watermark, thus simplifying the watermarking system and allowing the original image and the original watermark to be kept secret. Further, because it is dangerous to trust only a single person or organization to manage very important information, the proposed algorithm includes a visual secret sharing scheme (VSS), which shares a secret among a limited number of members.

This paper is organized as follows. Section 2 introduces the basic concept of the VSS as applied in the proposed watermarking system. The watermarking technique itself is described in section 3. Experimental results are shown in section 4. Finally, section 5 presents conclusions.

2 The Basis of the Visual Secret Sharing Scheme

VSS is a well-known cipher technique for digital images. Decoding can be performed by the naked eye, with no instrumentation or complex computation. The concept of VSS is derived from [13]. In [14], Naor and Shamir extended this idea to (k, n)-VSS, which is designed to break a shared image into n different shadows. Each single shadow looks like random data. The shared image can be recovered easily from any k ($k \leq n$) of the shadows. The watermark and the secret watermark are defined as sets of pixels whose gray-levels are either 0 or 255, i.e. $W = \{w(i,j) \in \{0,255\}\}$ and $W_s = \{w_s(i,j) \in \{0,255\}\}$ (2), respectively. In the proposed method, the owner randomly assigns the secret watermark. After defining the secret watermark, the public watermark can be generated by using the relationships, based on the (2,2)-VSS scheme, listed in Table 1. For each pixel $P$ in W and the corresponding block $B_s$ in $W_s$, the block $B_p$ in $W_p$ can be derived as

$B_p = \begin{cases} \bar{B}_s, & \text{if } P = \text{black} \\ B_s, & \text{if } P = \text{white} \end{cases}$   (3)

where $\bar{B}_s$ stands for inverting each sub-pixel in the block $B_s$. For example, if $P$ is a black pixel in W and the corresponding block $B_s$ in $W_s$ is $\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$, then the corresponding block $B_p$ in $W_p$ is the complement of $B_s$, that is $\begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}$. Proceeding in this way for every pixel, the public watermark is finally obtained by collecting each block $B_p$. It can be expressed as $W_p = \{w_p(\alpha,\beta) \in \{0,255\}\}$.

Step 2. For $i = 1, 2, \dots, 2k \times 2l$, pack the coefficient $C_{2i-1}$ and the neighboring coefficient $C_{2i}$ into the $i$-th coefficient pair $CP_i = (C_{2i-1}, C_{2i})$.

Step 3. Use the secret key S to generate the predefined random numbers as a set $R = \{R_i\}$, where $R_i$ denotes the $i$-th random number.

Step 4. According to R, extract the $i$-th pixel of the watermark by judging the $R_i$-th coefficient pair according to the following equation:

$w_p(i) = \begin{cases} 1, & \text{if } C_{2R_i-1} > C_{2R_i} \\ 0, & \text{otherwise} \end{cases}$   (11)

where $w_p(i)$ stands for the $i$-th pixel of $W_p$, and "1" and "0" represent gray-level 0 and gray-level 255, respectively.

Step 5. Assemble all the pixels to obtain the reconstructed public watermark $W_p'$.
Step 6. Superimpose the reconstructed public watermark ($W_p'$) on the secret watermark ($W_s$) to obtain the reconstructed watermark ($W'$).

Figure 3: Block diagram of the extraction procedure (the watermarked image is transformed to the frequency domain, the mid-frequency coefficients are extracted to form the reconstructed public watermark, and a visual OR with the secret watermark yields the reconstructed watermark)

3.4 Reduction Process

As mentioned in section 3.3, the reconstructed watermark can be used directly for identifying ownership. To improve the clarity of $W'$, we use a post-process called the "reduction process" to reduce the redundancy of data caused by the VSS scheme. This process is a quite simple lookup table (Fig. 4) which performs the reduction by direct mapping. That is, a block of four sub-pixels belonging to one of the groups will be transferred into a corresponding pixel. Each block in group A will be mapped into a black pixel (gray-level 0), and a block with four white sub-pixels in group C will be mapped into a white pixel (gray-level 255). Each block containing one black sub-pixel and three white sub-pixels, located in group B, will be assigned to a gray pixel (gray-level 125). Suppose that the four inputs $a_1, a_2, a_3, a_4$ in the block of each group represent either the white or the black gray-level, and the three outputs 00, 10, 11 represent black, gray and white pixels, respectively. Then the 16 possible states in Fig. 4 can be mapped into 3 possible states. Referring to Fig. 4, let $f_1 f_2$ be the output bits, which are controlled by $a_1, a_2, a_3, a_4$. Then $f_1 f_2$ may be expressed as:

$f_1(a_1,a_2,a_3,a_4) = a_1 a_2 a_3 + a_2 a_3 a_4 + a_1 a_3 a_4 + a_1 a_2 a_4$   (12)
$f_2(a_1,a_2,a_3,a_4) = a_1 a_2 a_3 a_4$   (13)

Via the reduction process, the reconstructed watermark is reduced to the same size as the original watermark. Also, the reduced version of the reconstructed watermark is more visible to human vision. A comparison of with-reduction and without-reduction is shown in Fig. 5. Since the reduction process mitigates noise effects caused by common image operations, the with-reduction result shown in Fig. 5(b) yields superior quality relative to the without-reduction result of Fig. 5(a).

Figure 4: Reduction process lookup table for a 4:1 data reduction rate (blocks in group A map to a black pixel, gray-level 0; blocks in group B map to a gray pixel, gray-level 125; blocks in group C map to a white pixel, gray-level 255)

Figure 5: Performance test for the reduction process: (a) original reconstructed watermark; (b) reduced version of (a)

4 Simulation Results

To demonstrate the performance of the proposed scheme, watermarking is performed on the "Lena," "Baboon" and "Airplane" standard images, and the watermarked images are subjected to robustness and quality testing. The employed images are of size 512x512 pixels. The original watermark is of size 50x50 pixels. In the robustness test, comparison between the extracted watermark and the original watermark is made by unaided human vision. Further, we define the Detection Rate (DR) as a quantitative measurement for evaluating extraction fidelity. If $n \times l$ is the size of the original watermark, then DR can be expressed as

$DR = \frac{1}{n \times l} \sum_{i,j} d(i,j)$   (14)

where

$d(i,j) = \begin{cases} 1, & \text{if } w'(i,j) = w(i,j) \\ 1/2, & \text{if } w'(i,j) \text{ is a gray pixel} \\ 0, & \text{otherwise} \end{cases}$   (15)

Here, $w(i,j)$ represents each pixel of the original watermark and $w'(i,j)$ represents each pixel of the reconstructed watermark. Gray pixels are given a 50% hit ratio when computing the Detection Rate. The quality of a watermarked image is estimated by using the peak-to-peak signal-to-noise ratio (PSNR), expressed by:

$PSNR = 10 \log_{10} \frac{255^2}{MSE}$
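As an illustration of the two measures just defined, the following Java sketch computes the Detection Rate (with gray pixels counted as a 50% hit) and the PSNR between two gray-level images. It is a minimal, assumed implementation for clarity, not the authors' code, and the gray level 125 used to mark gray pixels follows the reduction process described above.

public class WatermarkMetrics {

    // Detection Rate: exact matches count 1, gray pixels count 0.5, all others 0.
    static double detectionRate(int[][] original, int[][] reconstructed) {
        int n = original.length, l = original[0].length;
        double hits = 0.0;
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < l; j++) {
                if (reconstructed[i][j] == 125) hits += 0.5;          // gray pixel
                else if (reconstructed[i][j] == original[i][j]) hits += 1.0;
            }
        }
        return hits / (n * l);
    }

    // PSNR in dB between the original and the watermarked image (8-bit gray levels).
    static double psnr(int[][] original, int[][] watermarked) {
        int n = original.length, m = original[0].length;
        double mse = 0.0;
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < m; j++) {
                double d = original[i][j] - watermarked[i][j];
                mse += d * d;
            }
        }
        mse /= (double) (n * m);
        return 10.0 * Math.log10(255.0 * 255.0 / mse);
    }
}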