Informatica
An International Journal of Computing and Informatics

EDITORIAL BOARDS, PUBLISHING COUNCIL

Informatica is a journal primarily covering the European computer science and informatics community; scientific and educational as well as technical, commercial and industrial. Its basic aim is to enhance communications between different European structures on the basis of equal rights and international refereeing. It publishes scientific papers accepted by at least two referees outside the author's country. In addition, it contains information about conferences, opinions, critical examinations of existing publications and news. Finally, major practical achievements and innovations in the computer and information industry are presented through commercial publications as well as through independent evaluations.

Editing and refereeing are distributed. Each editor from the Editorial Board can conduct the refereeing process by appointing two new referees or referees from the Board of Referees or Editorial Board. Referees should not be from the author's country. If new referees are appointed, their names will appear in the list of referees. Each paper bears the name of the editor who appointed the referees. Each editor can propose new members for the Editorial Board or referees. Editors and referees inactive for a longer period can be automatically replaced. Changes in the Editorial Board are confirmed by the Executive Editors.

The necessary coordination is made through the Executive Editors, who examine the reviews, sort the accepted articles and maintain appropriate international distribution. The Executive Board is appointed by the Society Informatika. Informatica is partially supported by the Slovenian Ministry of Higher Education, Science and Technology.

Each author is guaranteed to receive the reviews of his article.
When accepted, publication in Informatica is guaranteed in less than one year after the Executive Editors receive the corrected version of the article.

Executive Editor - Editor in Chief
Anton P. Železnikar
Volariceva 8, Ljubljana, Slovenia
s51em@lea.hamradio.si
http://lea.hamradio.si/~s51em/

Executive Associate Editor - Managing Editor
Matjaž Gams, Jožef Stefan Institute
Jamova 39, 1000 Ljubljana, Slovenia
Phone: +386 1 4773 900, Fax: +386 1 251 93 85
matjaz.gams@ijs.si
http://dis.ijs.si/mezi/matjaz.html

Executive Associate Editor - Deputy Managing Editor
Mitja Luštrek, Jožef Stefan Institute
mitja.lustrek@ijs.si

Editorial Board
Juan Carlos Augusto (Argentina) Costin Badica (Romania) Vladimir Batagelj (Slovenia) Francesco Bergadano (Italy) Marco Botta (Italy) Pavel Brazdil (Portugal) Andrej Brodnik (Slovenia) Ivan Bruha (Canada) Wray Buntine (Finland) Hubert L. Dreyfus (USA) Jozo Dujmovic (USA) Johann Eder (Austria) Vladimir A. Fomichov (Russia) Maria Ganzha (Poland) Marjan Gušev (Macedonia) Dimitris Kanellopoulos (Greece) Hiroaki Kitano (Japan) Igor Kononenko (Slovenia) Miroslav Kubat (USA) Ante Lauc (Croatia) Jadran Lenarcic (Slovenia) Huan Liu (USA) Suzana Loskovska (Macedonia) Ramon L. de Mantaras (Spain) Angelo Montanari (Italy) Pavol Návrat (Slovakia) Jerzy R. Nawrocki (Poland) Nadja Nedjah (Brazil) Franc Novak (Slovenia) Marcin Paprzycki (USA/Poland) Gert S. Pedersen (Denmark) Ivana Podnar Žarko (Croatia) Karl H.
Pribram (USA) Luc De Raedt (Belgium) Dejan Rakovic (Serbia) Jean Ramaekers (Belgium) Wilhelm Rossak (Germany) Ivan Rozman (Slovenia) Sugata Sanyal (India) Walter Schempp (Germany) Johannes Schwinn (Germany) Zhongzhi Shi (China) Oliviero Stock (Italy) Robert Trappl (Austria) Srikumar Venugopal (Australia) Terry Winograd (USA) Stefan Wrobel (Germany) Konrad Wrona (France) Xindong Wu (USA)

Executive Associate Editor - Technical Editor
Drago Torkar, Jožef Stefan Institute
Jamova 39, 1000 Ljubljana, Slovenia
Phone: +386 1 4773 900, Fax: +386 1 251 93 85
drago.torkar@ijs.si

An Exquisite Mutual Authentication Scheme with Key Agreement Using Smart Card

Chiu-Hsiung Liao
General Education Center, National Chin-Yi University of Technology, Taichung, Taiwan 411, R.O.C.
E-mail: cliao@ncut.edu.tw

Hon-Chan Chen and Ching-Te Wang
Department of Information Management, National Chin-Yi University of Technology, Taichung, Taiwan 411, R.O.C.
E-mail: {chenhc, ctwang}@ncut.edu.tw

Keywords: Diffie-Hellman scheme, transformed identity, authentication, key agreement, password update

Received: June 25, 2008

To access a network system legally, efficiently and securely, the authentication scheme is essential and very important. In this paper, we propose a nonce-based authentication scheme using smart card. We use the Diffie-Hellman scheme to enhance the security of our protocol. To lessen the computation load, the remote system alone performs the exponentiation computation, and it performs it only once. The other computations all involve simple one-way hash functions or exclusive-or operations. No verification table is needed in our protocol. The protocol provides not only mutual authentication between a user and the remote server but also the achievement of key agreement. The protocol also supports convenient password update at the user's terminal. To avoid identity duplication, we introduce the idea of transformed identity in our protocol.
Povzetek: Opisana je nova shema dostopa do omrežja s pomočjo pametne kartice. (A new scheme for network access with the help of a smart card is described.)

1 Introduction

In recent years, people communicate via networks much more frequently than before. The frequency with which network users transmit information and share computing resources increases very quickly. Moreover, with e-commerce being prosperous, people use computers daily to link with servers to ask for service. In these situations, remote authentication and network security become inevitable and very important.

The authentication scheme is an essential part of assuring legitimate, secure and efficient access to a network system. Among authentication schemes, password-based authentication is widely used. But password-based authentication is vulnerable to dictionary attacks [1,2,3,4], i.e. password guessing attacks, because people are inclined to choose easy-to-remember identities or meaningful phrases. As a result, a number of protocols have been proposed to overcome the guessing attacks [1,5-7]. Some of the improved protocols [1,8-12] use public key encryption in authentication. The others [6,11,12,14] use nonces and one-way hash functions. The nonce-based protocol is more secure because the nonce is randomly generated. As for the one-way hash function, it is irreversible. Thus, a protocol using hash functions and nonces is safe and secure.

Recently, some authentication protocols using smart card have been proposed [6,11,12,14]. Using smart card has many merits. Not only can it implement computations and store a wealth of useful information, like identification number, password and basic personal data, but also it is portable. Although a protocol using public key encryption is much more secure, it may incur a burdensome computation load. Therefore, we propose an authentication protocol using the Diffie-Hellman scheme [15] to enhance the security level and efficiency but to reduce the computation load for a smart card. In our method, the smart card is responsible for simple computations and the server is responsible for complicated ones. The proposed scheme also uses the one-way hash function and the exclusive-or operation to maintain security and convenience. To prevent replay attacks and the synchronization problem, we adopt nonces in our scheme instead of using time-stamps. Furthermore, we introduce the design of transformed identity [16] in our scheme to avoid the duplication of identities.

The rest of this paper is organized as follows: Some related schemes are reviewed in Section 2. The proposed authentication scheme is described in Section 3. The security analysis of our scheme is discussed in Section 4. The efficiency and specialities of the proposed scheme are discussed in Section 5. The functionality and performance of the proposed scheme are compared with related schemes and the result is listed in Table 1. Finally, the conclusions are given in Section 6.

2 Reviews of related schemes

In this section, we review some related schemes briefly and closely.

2.1 Chien and Jan's scheme (ROSI scheme)

Chien and Jan proposed a nonce-based authentication scheme using smart card, the Robust and Simple authentication protocol (ROSI) [6], in 2003. The ROSI scheme consists of two phases: "the registration phase" and "the authentication phase". In the scheme, a prospective user, u, selects his identity, IDu, password, PWu, and an initial nonce, Ni. Then, the user transmits these values to the server, S, in the registration phase. After accepting the application, the server stores IDu and h(PWu || Ni) in its database, where the symbol " || " is the string concatenation. The server also uses its secret key to calculate some parameters and stores them in a smart card. Then, the server issues the smart card to the applicant, u.
After the authentication phase, the user and the server can mutually authenticate each other. However, in this scheme, it is necessary to set up a verification table, and a legitimate user cannot update his password conveniently and freely when the security faces potential threats.

2.2 Juang's password authenticated key agreement scheme

In Juang's authenticated key agreement scheme using smart card [12], two phases are included: "the registration phase" and "the login and session key agreement phase". A prospective user submits his identity and password to the server for registration. After getting a smart card, the user can use it to access the server. The user applies his smart card to compute a secret key and uses the key to encrypt a message, which includes a random value and an authentication tag. After receiving the message, the server computes the secret key and decrypts the received message to extract the embedded authentication tag. Then, the server verifies the validity of this tag. In order to attain the shared session key, the user's smart card has to encrypt a forwarding message and decrypt the received message from the server to perform a nonce-checking. In this scheme, we found that the smart card has to encrypt and decrypt several messages by using the cryptographic scheme. In this situation, the smart card has to compute modular exponential operations, which require a large amount of computation. These computations may overload the capability of the smart card.

2.3 Hwang et al.'s remote user authentication scheme

The scheme [14] is comprised of three main phases and an additional one. The main phases are "the registration phase", "the login phase" and "the authentication phase". The additional phase is "the password changing phase", performed at the user's discretion. When a prospective user, u, wants to register with a server, S, he submits his identity, IDu, and a hash value of his password, h(PWu), to the registration center of S.
Then, the center uses the server's secret key, xs, and the hash value of the password to compute a shifted password, PW'u = h(IDu ⊕ xs) ⊕ h(PWu), and stores it, together with the hash function h(·), into a smart card, where " ⊕ " is the exclusive-or operation. Then, the smart card is issued to the user. To access the server, the user connects his smart card to a card reader and keys in his identity and password at the user's terminal. The smart card executes the exclusive-or operation on the shifted password and h(PWu) to attain a crucial parameter, h(IDu ⊕ xs). The smart card then combines this parameter with a time-stamp to compute an authenticating value. Next, the user transmits these values to the server for authentication. On receiving the messages, the server executes the verification procedures and performs the authentication. However, although the scheme can verify a legitimate user, the user and the server cannot achieve mutual authentication and session key agreement. The scheme cannot avoid the time synchronization problem, either.

2.4 Behind the reviews

In reviewing the related schemes, we are motivated to propose an improved scheme. Not only do we supplement the deficiencies, but we also enhance the efficiency and the functionality. In our scheme, the verification table is not required and mutual authentication can be achieved. Furthermore, a user is allowed to select and update his password freely. Finally, the computation cost is reduced in the proposed scheme.

3 The proposed authentication scheme

Our authentication scheme consists of four phases: the registration phase, the login and authentication phase, the key agreement phase and the password update phase. As mentioned before, for the sake of security, we prefer to adopt modular exponentiation in the registration phase. But it is performed only at the remote server to reduce the computation load for the smart card.
The login phase is executed at the user's terminal and the authentication is verified mutually between the user and the server. The key agreement is achieved by the user and the server respectively, and is kept temporarily for mutual communication in the session. As for the password update phase, it is completed only at the user's terminal. To describe our proposed scheme with ease, we use the following symbols and operations:

1. The operator " ⊕ " is the bit-wise exclusive-or operation.
2. The symbol " || " is the string concatenation.
3. The function " h " is a one-way hash function.
4. For the sake of convenience, let the expression " X → Y : M " mean a sender X transmits a message M to a recipient Y.

3.1 The registration phase

The registration phase is performed with the remote server. When a person, u, wants to become a legitimate user of a server, S, he offers an account application to S. The procedure is as follows:

Step 1: u → S : IDu, PWu. Responding to the challenge from the server, the applicant submits his identity, IDu, and password, PWu, to the server for registration via a secure communication channel. Both IDu and PWu are selected by himself freely.

Step 2: After receiving the response, the server first confirms the formats of the submitted identity and password. Then, the server takes note of the registration time, TSu, and archives the user's IDu and related TSu for later authenticating use. Then the server performs the following four processes:

(1) Compute the transformed identity [16], TIDu = TSu || IDu, automatically by itself. The transformed identity, TIDu, can ensure the uniqueness of the identity. At this stage, the applicant only needs to remember his selected identity, IDu, and password, PWu.
(2) Compute Au = h(TIDu ⊕ x), where the parameter x is the secret key of S and is kept confidential.
(3) Compute Bu = (g^Au mod p) ⊕ PWu, where p is a large prime positive integer and g is a primitive element in the Galois field GF(p).
(4) Store the values TSu, Bu and h(·) in a smart card and issue the smart card to the applicant.

3.2 The login and authentication phase

When a legitimate user, u, intends to log in to the server, S, the user's terminal and the server need to mutually authenticate each other.

Step 1: u → S : M1 = {IDu, NTIDu, Cu}. The user, u, connects his smart card to a reader. The smart card challenges the user for his identity, IDu, and password, PWu, which were selected at his application. The smart card automatically performs the following processes:

(1) Generate a nonce, nu. Store the value nu temporarily until the end of the session.
(2) Retrieve the stored registration time to generate the transformed identity, TIDu = TSu || IDu.
(3) Compute NTIDu = TIDu ⊕ nu.
(4) Compute the value Cu = h(Bu ⊕ PWu) ⊕ nu.
(5) Send the message M1 = {IDu, NTIDu, Cu} to the server, S.

Step 2: S → u : M2 = {Du, NTIDs}. After receiving the message M1, S does the following processes:

(1) Retrieve from the database the registration time, TSu, corresponding to the identity, IDu, of the connecting user. If no corresponding user matches, the server terminates the connection. Otherwise, it goes on to the next processes.
(2) Compute TIDu = TSu || IDu and n'u = NTIDu ⊕ TIDu.
(3) Compute Au = h(TIDu ⊕ x) and g^Au mod p, then h(g^Au mod p).
(4) Compute n''u = Cu ⊕ h(g^Au mod p). If n'u = n''u, the received NTIDu was truly sent from u and the parameters n'u and n''u should be the same as nu, which was generated by the smart card at the user's terminal. Hence, the legitimacy of the connecting user is authenticated. See Theorem 1. So, the communication will carry on. On the other hand, if n'u ≠ n''u, the server terminates the connection. Furthermore, the server stores n'u in memory temporarily for later use.
(5) Create a nonce, ns, randomly. Compute Du = Cu ⊕ nu ⊕ ns and NTIDs = TIDu ⊕ ns. Then the server sends the message M2 = {Du, NTIDs} to the connecting user, u.
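The dataflow of the registration phase and of Steps 1 and 2 above can be traced in a short Python sketch. This is a hypothetical toy model, not the authors' implementation: h(·) is played by SHA-256 over big integers, ⊕ is Python's integer XOR, and g, p, x and the user's data are made-up values far too small for real use.

```python
import hashlib
import secrets

def h(n: int) -> int:
    """One-way hash function h(.), modeled here with SHA-256 over integers."""
    data = n.to_bytes((n.bit_length() + 7) // 8 or 1, "big")
    return int.from_bytes(hashlib.sha256(data).digest(), "big")

# Toy public parameters (assumptions for illustration; far too small for real use).
p = 2**127 - 1               # large prime p
g = 5                        # primitive element g of GF(p)
x = secrets.randbits(256)    # server's confidential secret key x

# --- Registration phase (performed at the server) ---
TS_u = b"20080625"                            # registration time TSu (toy value)
ID_u = b"alice"                               # user-chosen identity IDu
PW_u = int.from_bytes(b"alice-pw", "big")     # user-chosen password PWu
TID_u = int.from_bytes(TS_u + ID_u, "big")    # transformed identity TIDu = TSu || IDu
A_u = h(TID_u ^ x)                            # Au = h(TIDu xor x)
B_u = pow(g, A_u, p) ^ PW_u                   # Bu = (g^Au mod p) xor PWu, stored on the card

# --- Login, Step 1: the smart card builds M1 = {IDu, NTIDu, Cu} ---
n_u = secrets.randbits(128)                   # fresh nonce nu
NTID_u = TID_u ^ n_u                          # NTIDu = TIDu xor nu
C_u = h(B_u ^ PW_u) ^ n_u                     # Cu = h(Bu xor PWu) xor nu

# --- Login, Step 2: the server recovers the nonce two ways and cross-checks ---
n1 = NTID_u ^ TID_u                           # n'u  = NTIDu xor TIDu
n2 = C_u ^ h(pow(g, h(TID_u ^ x), p))         # n''u = Cu xor h(g^Au mod p)
assert n1 == n2 == n_u                        # the connecting user is authenticated
```

Because Bu ⊕ PWu equals g^Au mod p, the card itself never needs modular exponentiation; the only pow() calls happen on the server side, matching the division of labor described above.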
Theorem 1: If n'u = n''u, the user, u, is authenticated.

Proof. Since NTIDu = TIDu ⊕ nu, we have n'u = NTIDu ⊕ TIDu = nu. Also, given Bu = (g^Au mod p) ⊕ PWu, we have Cu = h(Bu ⊕ PWu) ⊕ nu = h((g^Au mod p) ⊕ PWu ⊕ PWu) ⊕ nu = h(g^Au mod p) ⊕ nu. Then, n''u = Cu ⊕ h(g^Au mod p) = h(g^Au mod p) ⊕ nu ⊕ h(g^Au mod p) = nu. It follows that n'u = n''u = nu. The nonce, nu, is generated at the user terminal when the user, u, inserts his smart card into a card reader. So it is fresh and unique. It is also embedded in NTIDu and never exposed. No one can impersonate it or pry into it. Both TIDu and NTIDu are unique, and NTIDu can be computed by u only. Once n'u = n''u is proven, we verify that NTIDu was really transmitted by u. Hence, the genuineness of the user, u, is authenticated. □

Step 3: u → S : M3 = {Eu}. When u receives the message M2, he executes the following processes:

(1) Compute n's = NTIDs ⊕ TIDu and n''s = Cu ⊕ nu ⊕ Du. If n's = n''s, the communication goes on. In this situation, both n's and n''s are equal to ns, which was generated by the server. Thus, the server is authenticated. See Theorem 2. On the other hand, if n's ≠ n''s, u ceases the communication. Furthermore, u keeps ns temporarily at the user terminal for later use.
(2) Compute Eu = (Cu ⊕ nu) || (ns + 1). Then u sends the message M3 = {Eu} to S. The parameter ns + 1 is the response to the server.

Theorem 2: The server, S, is authenticated if n's = n''s.

Proof. Since NTIDs = TIDu ⊕ ns, n's = NTIDs ⊕ TIDu = ns. Also, since Du = Cu ⊕ nu ⊕ ns, n''s = Cu ⊕ nu ⊕ Du = ns. Then, we have n's = n''s = ns. The nonce, ns, is immediately generated by S when S verifies the genuineness of the user, u. So ns is fresh and unique. The transformed identity, TIDu, is also unique. Thus, NTIDs is unique and it can be computed by the server only. Furthermore, Du is computed with Cu, nu and ns. A false server cannot forge all of them. Once n's = n''s is proven, the integrity of S is authenticated.
□

Step 4: After receiving the message M3, the server finds Eu in it. Since Bu ⊕ PWu = g^Au mod p, Cu = h(Bu ⊕ PWu) ⊕ nu = h(g^Au mod p) ⊕ nu. Thus, Cu ⊕ nu = h(g^Au mod p). So, Eu = (Cu ⊕ nu) || (ns + 1) = h(g^Au mod p) || (ns + 1), and it is really the string concatenation of h(g^Au mod p) and ns + 1. The server can easily extract ns + 1 from Eu and find ns in there. At this time, the server ensures that the authenticating user does have the nonce, ns. Now, both the user and the server can try for a session key agreement.

3.3 The key agreement phase

After receiving the nonce, ns, sent from the server, the user creates a session key SKu = h((Bu ⊕ PWu) || ns || nu). Once the server ensures that u has the nonce, ns, it generates a session key SKs = h((g^Au mod p) || ns || nu). Since Bu = (g^Au mod p) ⊕ PWu is computed in the registration phase, h((Bu ⊕ PWu) || ns || nu) = h((g^Au mod p) || ns || nu). Thus, SKu = SKs. Therefore, the key agreement is achieved and the session key for the session communication is SK = h((Bu ⊕ PWu) || ns || nu) = h((g^Au mod p) || ns || nu).

3.4 The password update phase

When a user wants to change his password for personal reasons or for the sake of security, he can do so at the user's terminal by performing the following:

Step 1: Insert the smart card into a reader and announce a password update request at the user's terminal.
Step 2: Key in the original password, PWu. The smart card calculates Bu ⊕ PWu.
Step 3: Responding to the challenge of the smart card, the user gives a new password, PW*u. The smart card calculates B*u = (Bu ⊕ PWu) ⊕ PW*u and then replaces Bu with this new B*u. At this time, the password update phase is completed.

4 Security analysis

We are concerned not only with the efficiency and the specialties of our scheme, but also with its security and computational complexity. In this section, we will display the strength of our scheme first, and later we discuss the computational complexity.
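As a preliminary sanity check, the two invariants the preceding phases rest on, SKu = SKs in the key agreement phase and the preservation of Bu ⊕ PWu = g^Au mod p across a password update, can be reproduced in a hypothetical Python sketch (SHA-256 stands in for h, the concatenation h(a || b || c) is modeled by hashing fixed-width fields, and all parameter values are toy assumptions):

```python
import hashlib
import secrets

def h(n: int) -> int:
    """h(.) over an integer, modeled with SHA-256."""
    data = n.to_bytes((n.bit_length() + 7) // 8 or 1, "big")
    return int.from_bytes(hashlib.sha256(data).digest(), "big")

def h_concat(a: int, b: int, c: int) -> str:
    """h(a || b || c): concatenation modeled by hashing fixed-width 64-byte fields."""
    data = b"".join(v.to_bytes(64, "big") for v in (a, b, c))
    return hashlib.sha256(data).hexdigest()

# Toy parameters (assumptions for illustration only).
p, g = 2**127 - 1, 5
x = secrets.randbits(256)
PW_u = int.from_bytes(b"old-pw", "big")
TID_u = int.from_bytes(b"20080625" + b"alice", "big")
A_u = h(TID_u ^ x)
gAu = pow(g, A_u, p)                    # g^Au mod p, computed once at the server
B_u = gAu ^ PW_u                        # stored on the smart card

# --- Key agreement: both sides derive the same session key ---
n_u, n_s = secrets.randbits(128), secrets.randbits(128)
SK_u = h_concat(B_u ^ PW_u, n_s, n_u)   # user:   h((Bu xor PWu) || ns || nu)
SK_s = h_concat(gAu, n_s, n_u)          # server: h((g^Au mod p) || ns || nu)
assert SK_u == SK_s

# --- Password update at the terminal: B*u = (Bu xor PWu) xor PW*u ---
PW_new = int.from_bytes(b"new-pw", "big")
B_new = (B_u ^ PW_u) ^ PW_new
assert B_new ^ PW_new == gAu            # later logins still recover g^Au mod p
```

The second assertion shows why the update needs no contact with the server: only the XOR mask around g^Au mod p changes, so every later login still recovers the same value.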
The security analysis is listed as follows:

(1) Our scheme can overcome the guessing attacks: The user is allowed to select his own identity and password freely in our scheme, so he is apt to choose an easy-to-remember or meaningful identity and password. In this situation, it seems easy to guess the identity and the password of a legitimate user. However, the construction of transformed identity in our proposed scheme makes the transformed identity an independent entity. This uniqueness can prevent the transformed identity from being duplicated and resists the guessing attacks. Suppose an intruder guesses a legitimate user's identity. The guessed identity cannot be converted into a valid transformed identity without the exact registration time, which is stored in the user's smart card. As a result, an intruder's attempt to access a remote server will be rejected without a valid transformed identity.

(2) Our scheme is capable of resisting the man-in-the-middle attacks: A malicious intruder may intercept or eavesdrop on the communication between a legitimate user, u, and the server, S. After intercepting the message M1 sent by u, he may impersonate u and replay the message to S. Then, he waits for a response message from S. The intruder cannot compute the effective TIDu from the intercepted NTIDu without the nonce, nu, which is generated randomly by the smart card and is never exposed in the communication. Even though the intruder has the response message, M2, from S, he cannot extract the nonce, ns, from NTIDs, which is included in M2, because he has no TIDu at hand. The nonce, ns, is generated by S and is needed to authenticate the server in Step 3(1) of the login and authentication phase. This nonce is also required to achieve the session key agreement. Furthermore, the intruder must respond to the server with ns + 1. Because the nonce, ns, is unavailable, this response cannot be completed, either.
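The XOR masking that this argument relies on can be demonstrated in a few lines: an intercepted NTIDu unmasks the nonce only for a party holding the exact transformed identity. The identities and registration times below are hypothetical:

```python
import secrets

# Transformed identity TIDu = TSu || IDu, known only to u and S (toy values).
TID_u = int.from_bytes(b"20080625" + b"alice", "big")
n_u = secrets.randbits(128)        # nonce generated by the smart card
NTID_u = TID_u ^ n_u               # NTIDu = TIDu xor nu, visible to an intruder

# A guess with the wrong registration time unmasks a wrong "nonce":
TID_guess = int.from_bytes(b"20080626" + b"alice", "big")
assert NTID_u ^ TID_guess != n_u   # the eavesdropper learns nothing useful
assert NTID_u ^ TID_u == n_u       # only the legitimate parties recover nu
```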
Eventually, an illegitimate user will be rejected and the connection terminated. On the other hand, when a malicious person intercepts the message M1, he may pretend to be the server that u is connecting to. However, he has no TIDu, and he cannot compute it because he has no TSu at hand. The intruder has no nonce, nu, either. Thus, he cannot send valid parameters to the user u for authenticating the integrity of the server. The communication terminates when the authentication fails.

(3) An intruder cannot achieve session key agreement: The user's password is never exposed in the transmission. An intruder cannot intercept the password or any information about it. Meanwhile, the parameter Bu is stored in the user's smart card, so no one can access it. Thus, the parameter g^Au mod p cannot be computed from Bu ⊕ PWu. On the other hand, if an intruder intends to compute g^Au mod p directly, he needs to compute Au = h(TIDu ⊕ x) first. But the secret key x of the server is kept confidential; no one can have it. Hence, it is impossible to compute g^Au mod p directly. Therefore, no session key agreement can be achieved without all of g^Au mod p, nu and ns at hand.

(4) An intruder will be confronted with the complexity of the discrete logarithm: The secret key x of the server is protected by the one-way hash function. It is not possible to derive it from Au = h(TIDu ⊕ x). Trying to solve for Au from g^Au mod p is also impossible, because the adversary will be confronted with the difficulty and the complexity of the discrete logarithm problem. Without the secret key x, an adversary cannot pretend to be the server, S, in the communication. The parameter Bu cannot be derived without Au. Thus, an adversary cannot pretend to be the connecting user, u, either.

5 The efficiency and specialties of our scheme

From the procedures of the construction, we point out some merits of our scheme. We are concerned not only with efficiency but also with special properties.
(1) No verification table is needed: Once a prospective user, u, offers his identity, IDu, and password, PWu, in the registration phase, the server, S, takes note of the registration time, TSu, to derive the transformed identity, TIDu. Then, S calculates the parameter Bu and stores it in a smart card. When the legitimate user wants to access the system, he only gives his selected identity to compute the transformed identity and then transmits it to the remote server. The smart card also automatically generates a nonce, nu, to compute the authenticating values, Cu and NTIDu. Then the values are transmitted to the server. It is not necessary for the remote server to set up any verification table of passwords or other personal information.

(2) The transformed identity is unique: The construction of the transformed identity makes the identity unique. A few users could select the same identities, but the transformed identities will eventually be different since our scheme takes the registration time into account. This prevents duplication from happening.

(3) The user's identity and password can be selected freely: Since our proposed scheme uses the transformed identity to discriminate between different users, the original identity is allowed to be selected according to the user's preference. Taking into account the registration time, the proposed scheme converts the selected identity into the transformed identity. The transformed identities will be different from one another even if the selected identities are the same. Thus, a user's identity can be selected freely. The transformed identity is used to compute the parameter Au. Then, g^Au mod p is computed. The parameter Bu is generated by performing the exclusive-or operation on PWu and g^Au mod p. Because Bu is stored in the user's smart card, no one can pry into it. Therefore, the password can also be selected freely.
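The uniqueness argument in points (2) and (3) reduces to a one-line property of the concatenation, sketched here with hypothetical registration times:

```python
# Two users who choose the same identity still receive distinct transformed
# identities, because the registration time TSu is prepended (toy values).
ID = b"alice"
TID_1 = b"2008-06-25 10:00" + b"||" + ID   # TIDu = TSu || IDu, first registrant
TID_2 = b"2008-06-25 10:05" + b"||" + ID   # same identity, later registration
assert TID_1 != TID_2                      # no duplicate transformed identities
```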
(4) The Diffie-Hellman scheme is used: In the registration phase, the server calculates the parameter Bu through the Diffie-Hellman scheme to enhance security. Because the computation of modular exponentiation is burdensome for a smart card, the proposed scheme makes the server execute the operation in order to relieve the smart card of this troublesome implementation and to speed up the computation.

(5) The computations proceed very quickly and the load is low: The modular exponentiation is the only burdensome and time-consuming computation. It is used in the Diffie-Hellman scheme and is performed only once, at the remote server. The other computations, at both the user's terminal and the remote server, are just one-way hash functions, string concatenations and exclusive-or operations. The computations proceed very quickly, and the load is extremely low for either of them. Table 1 demonstrates that the computational complexity is low.

(6) The password can be conveniently updated at the user's terminal: The server needs no password-verification table to check a user's genuineness. The proposed scheme allows a user to update his password at his own terminal. It is convenient and efficient for users.

(7) Mutual authentication is executed: The scheme enables the user and the server to mutually authenticate each other. Theorems 1 and 2 prove the correctness of the mutual authentication between the user and the remote server.

At the end of this section, we compare our proposed scheme with some other schemes on computational complexity and performance. The comparison on computational complexity is listed in Table 1. For an objective view of the performance, we include the following criteria:

Item 1. No verification table needed: At the remote server, a password-verification table is not needed to authenticate the users.

Item 2.
Using unique transformed identity: Describe whether a user can choose his identity according to his preference while preventing duplication.

Item 3. Choosing a password freely: Display whether a scheme allows a user to choose his password freely or not.

Item 4. Mutual authentication: Demonstrate whether a legitimate user and the remote server can mutually authenticate each other or not.

Item 5. Updating the password conveniently: Discuss whether a user can conveniently update his password at the user's terminal or not.

Item 6. Session key agreement: Show whether a scheme can achieve the session key agreement or not.

Item 7. Avoiding the time synchronization problem: Exhibit whether a scheme can avoid the time synchronization problem or not.

The result of the comparisons on the performances is listed in Table 2.

6 The conclusions

We have proposed an exquisite mutual authentication scheme without a verification table of passwords and other users' personal information. The proposed scheme includes session key agreement and convenient password update. Our scheme uses the registration time to create the unique transformed identity in order to discriminate a user from the others efficiently, even if they may choose the same value for their identities. Through the storage of important information in the smart card, the proposed scheme can generate the necessary parameters without exposing the password in transmission. Our scheme can withstand the replay attacks and resist the man-in-the-middle attacks. Moreover, the security of our scheme relies on the intractability of the discrete logarithm because the Diffie-Hellman scheme is used.

References

[1] S.M. Bellovin, M. Merritt, (1993) Augmented encrypted key exchange: A password-based protocol secure against dictionary attacks and password file compromise, Proceedings of the First ACM Conference on Computer & Communications Security, pp.244-250.

[2] Y. Ding, P.
Horster, (1995) Undetectable on-line password guessing attacks, ACM Operating Syst. Rev., pp.77-86.

[3] D.V. Klein, (1990) Foiling the cracker: a survey of, and improvements to, password security, Proceedings of the Second USENIX UNIX Security Workshop, pp.5-14.

[4] R. Morris, K. Thompson, (1979) Password security: a case history, Communications of the ACM, 22(11), pp.594-597.

[5] V. Goyal, V. Kumar, M. Singh, A. Abraham, and S. Sanyal, (2006) A new protocol to counter online dictionary attacks, Computers & Security, 25, pp.114-120.

[6] H.Y. Chien, J.K. Jan, (2003) Robust and simple authentication protocol, Computer Journal, 46, pp.193-201.

[7] C.L. Lin, H.M. Sun, T. Hwang, (2001) Attacks and solutions on strong-password authentication, IEICE Trans. Commun., E84-B, No. 9, pp.2622-2627.

[8] S. Halevi, H. Krawczyk, (1998) Public-key cryptography and password protocols, Proceedings of the 5th ACM Conference on Computer and Communications Security, San Francisco, CA, pp.122-131.

[9] C.C. Chang, W.Y. Liao, (1994) A remote password authentication scheme based upon ElGamal's signature scheme, Computers & Security, Vol. 13, pp.137-144.

[10] C.C. Chang, L.H. Wu, (1990) A password authentication scheme based upon Rabin's public-key cryptosystem, Proceedings of the International Conference on Systems Management '90, Hong Kong, pp.425-429.

[11] M.S. Hwang, L.H. Li, (2000) A new remote user authentication scheme using smart card, IEEE Transactions on Consumer Electronics, 46(1), pp.28-30.

[12] W.S. Juang, (2004) Efficient password authenticated key agreement using smart card, Computers & Security, 23, pp.167-173.

[13] Y.C. Chen, L.Y. Yeh, (2005) An efficient nonce-based authentication scheme with key agreement, Applied Mathematics and Computation, 169, pp.982-994.

[14] M.S. Hwang, C.C. Lee, Y.L. Tang, (2002) A simple remote user authentication scheme, Mathematical and Computer Modelling, 36, pp.103-107.

[15] W. Diffie, M.
Hellman, (1976) New directions in cryptography, IEEE Trans. Inform. Theory, 22, pp. 476-492.
[16] C.T. Wang, C.C. Chang, C.H. Lin, (2004) Using IC cards to remotely login passwords without verification tables, Proceedings of the 18th International Conference on Advanced Information Networking and Applications (AINA), Fukuoka, Japan, pp. 321-326.

| Phase | Registration | Login and Authentication | Key Agreement | Password Update |
| Our scheme | 1Co 1Ha 2⊕ 1ME | 3Co 2Ha 9⊕ 1ME | Yes | Yes |
| Chien et al. [6] | 2Co 3Ha 1⊕ | 5Co 17Ha 10⊕ | No | No |
| Hwang and Li [11] | 1ME 2⊕ | 2Ha 5ME 2MM | No | No |
| Juang [12] | 1Ha 1⊕ | 1Co 4Ha 1⊕ 3En 3De | Yes | No |
| Hwang et al. [14] | 2Ha 2⊕ | 1Ha 2⊕ | No | Yes |

Co: concatenation; Ha: one-way hash function; ⊕: exclusive-or; ME: modular exponentiation; MM: modular multiplication; En: encryption; De: decryption

Table 1: Comparison on computational complexity

| Criterion item | No verification table | Using transformed ID | Choosing PW freely | Mutual authentication | PW update | Session key agreement | Avoiding synchronization |
| Our scheme | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
| Chien et al.'s [6] | No | No | Yes | Yes | No | No | Yes |
| Hwang and Li's [11] | Yes | No | No | No | No | No | No |
| Juang's [12] | Yes | No | Yes | Yes | No | Yes | Yes |
| Hwang et al.'s [14] | Yes | No | Yes | No | Yes | No | No |

Table 2: The result of the comparisons on performance among schemes

Routing Scalability in Multicore-Based Ad Hoc Networks

Ami Marowka
12 Anna Frank, Ramat-Gan, 52526, Israel
E-mail: amimar2@yahoo.com

Keywords: ad hoc networks, routing speedup, multicore, power management

Received: December 4, 2007

The integration of multicore processors into wireless mobile devices is creating new opportunities to enhance the speed and scalability of message routing in ad hoc networks. In this paper we study the impact of multicore technology on routing speed and node efficiency, and draw conclusions regarding the measures that should be taken to conserve energy and prolong the lifetime of a network.
We formally define three metrics and use them for performance evaluation: Time-to-Destination (T2D), Average Routing Speedup (ARS), and Average Node Efficiency (ANE). The T2D metric is the time a message takes to travel to its destination in a loaded traffic network. ARS measures the average routing speed gained by a multicore-based network over a single-core-based network, and ANE measures the average efficiency of a node, i.e., the number of active cores. These benchmarks show that routing speedup in networks with multicore nodes increases linearly with the number of cores and significantly decreases traffic bottlenecks, while allowing more routings to be executed simultaneously. The average node efficiency, however, decreases linearly with the number of cores per node. Power-aware protocols and energy management techniques should therefore be developed to turn off the unused cores.

Povzetek: An analysis of the connectivity of network nodes is presented.

1 Introduction

The recent emergence of affordable dual-core processors in consumer products will overturn many currently accepted standards for software applications (5). The transition from single-core to multicore CPUs calls for the parallelization of all applications. Developers will be faced with the challenge of designing multi-threaded applications that run efficiently on multiple cores. Dual-core processors are only the beginning. Chip makers are currently working on the next generation of multicore processors, which will contain 4, 8, or 16 cores on a single die. According to the roadmap introduced by Intel (15), dual-core processors are slated to reach mobile devices as well. The integration of multicore processors into wireless communication and mobile computation devices will strengthen the communication infrastructure in general. Ad hoc wireless networks of mobile devices will also become more robust.
An ad hoc network is a self-organized mobile network in which every node is responsible for both computation and communication operations (1; 11). In this paper we address the problem of how to measure the gain in routing speedup in ad hoc wireless networks whose nodes are equipped with multicore processors, in particular when the network is heavily loaded. Moreover, we analyze the efficiency of multicore nodes in the network and show that their energy consumption could be dramatically reduced by adapting existing methods and techniques. We use location-based routing protocols in our traffic simulations (12). Location-based protocols are usually compared and analyzed by means of the "hop count" metric; i.e., the best single-path routing protocol is the one that finds a path from the origin node to the destination node with the fewest hops. In real networks, however, many routing tasks are carried out at the same time. This is known in the literature as a multiple-sessions scenario, in which several routing sessions compete for a node's services. Each message should reach its destination node without waiting in a queue of routing sessions to be served. In such a scenario, we need a metric that quantifies the arrival time of messages at their destinations. This work therefore starts by defining a routing metric, called the Time-to-Destination (T2D), that quantifies the time required for a message to travel along the best path determined by the routing protocol. Then, two other metrics are defined based on the T2D: the Average Routing Speedup (ARS) and the Average Node Efficiency (ANE). The first quantifies the average improvement in message travel time for a multicore network compared to a single-core network. The second measures the average node efficiency in a multicore network, compared to the efficiency of a network with single-core nodes.
These metrics serve to analyze the behavior of routings in our simulations of heavily loaded networks. We derive conclusions regarding the benefits of message routing in multicore networks, and show how power-aware management protocols can reduce energy consumption. Moreover, we analyze the effect of mobility on the delivery success rate in multicore ad hoc networks and show that multicore nodes can improve the dependability of mobile networks. To the best of our knowledge, this is the first work in the open literature discussing routing protocols in ad hoc networks of multicore devices. The design principles behind the speedup and efficiency metrics presented in this work were inspired by the parallel computing literature. However, our evaluation method of measuring the real scalability of routing protocols on scalable networks is novel. In order to explore how protocols scale as the number of nodes and the average node degree increase, our simulator has the capability to generate extendable networks. In the open literature, scalability is evaluated by measuring the performance on different numbers of nodes, such as 10, 20, 30, and so on, where, for example, the 10-node network has a different topology from the 20-node network. Our method for generating planar random networks enables us to generate a 20-node network that is an extension of the 10-node network, and therefore to measure and evaluate real scalability. In Section 4 we describe this method in detail; to the best of our knowledge, we are the first to use it. Work-in-progress versions of this paper were presented in (9; 10). The remainder of the paper is organized as follows. Section 2 describes state-of-the-art work in the area of MIMO ad hoc networks. Section 3 defines the network model. Section 4 defines the metrics described above. Section 5 details the simulations and analyzes the resulting data. Section 6 summarizes our results.
2 State-of-the-Art of MIMO MANET

Multiple-input multiple-output (MIMO) wireless communication systems are the most promising multiple-antenna technology today (16; 17). The advantages of MIMO communication, which exploits the physical channel between many transmit and receive antennas, are currently receiving significant attention (18). The integration of an air-interface technology such as MIMO with a modulation scheme called orthogonal frequency division multiplexing (OFDM) (19) has the potential to lay the foundation for the data-rate and capacity gains that will be needed for years to come. Since multiple data streams are transmitted in parallel from different antennas, there is a linear increase in throughput with every pair of antennas added to the system. MIMO systems do not increase throughput simply by increasing bandwidth; they exploit the spatial dimension by increasing the number of unique spatial paths between the transmitter and receiver. MIMO-OFDM combines OFDM and MIMO techniques, thereby achieving spectral efficiency and increased throughput. A MIMO-OFDM system transmits independent OFDM-modulated data from multiple antennas simultaneously. At the receiver, after OFDM demodulation, MIMO decoding on each of the sub-channels extracts the data from all the transmit antennas on all the sub-channels. The IEEE 802.16e standard incorporates MIMO-OFDMA.

The MIMO capability of antenna arrays has been studied at the physical layer and over a single link. A few research directions have studied MIMO in a multi-hop network from the perspective of higher layers. The research in (20) proposed a scheduling algorithm to offer fair medium access in a network where nodes are equipped with MIMO antennas; the model under study provides a simple abstraction of the physical-layer properties of MIMO antennas. At the routing layer, a routing scheme to exploit MIMO gains is proposed in (21). The idea is to adaptively switch the transmission/reception strategy using MIMO so that the aggregate throughput at the routing layer is increased. At each hop along a route this decision is made dynamically based on network conditions such as node density and traffic load. At the transport layer, TCP performance over MIMO communications was studied in (22). Focusing on the two architectures previously proposed to exploit spatial multiplexing and diversity gains (namely BLAST and STBC), the authors studied how ARQ and packet-combining techniques impact overall TCP performance. Their results indicate that, from the standpoint of TCP performance, the enhanced reliability offered by the diversity gain is preferable to the higher capacities offered by spatial multiplexing.

3 A Multicore Network Model

A highly realistic network model would take into account many complexities, such as control-traffic overhead, traffic congestion, node mobility, the irregular shape of radio coverage areas, and the intermittence of communication due to weather conditions and interference from preexisting infrastructure (power lines, base stations, etc.). Including all these details in the network model, however, would make it extremely complicated and scenario-dependent, which would hamper the derivation of meaningful and sufficiently general analytical results. It has been shown that a simple parameterized model can accurately reflect the simulations (14). The model defined here, which is used in our simulations, therefore makes some widely accepted simplifying assumptions. We formally define a multicore network model as follows:

Definition 1: A multicore network model is an undirected graph G = (V, E, D, M), where:
1. V is the set of nodes;
2. E ⊆ V × V is the set of undirected edges, i.e., (u, v) ∈ E if u is able to transmit to v;
3. D is the average node degree;
4. M is the number of cores in a node;
5. Pi,j is the link probability of retransmission;
6.
For v ∈ V, the node mobility is determined every pause time pau(v) by its velocity vel(v) towards destination des(v).

A realistic multicore model must reflect the potential cost of the retransmissions required to recover from link errors. We use a linear transmission cost function, Ti,j = L / (A · (1 − Pi,j)), where a link is assumed to exist between the node pair (i, j) as long as node j lies within the transmission range of node i, A is the transmission rate, L is the message size, and Pi,j is the packet error probability associated with that link. This cost function captures the cumulative delay expended in reliable data transfer, for both reliable and unreliable link layers.

Node mobility is based on the parameterized random waypoint model (2): a node chooses a destination uniformly at random in the simulated region, chooses a velocity uniformly at random, and then moves to that destination at the chosen velocity. Upon arriving at the chosen waypoint, the node pauses for a period before repeating the same process. In this model, the pause time represents the degree of mobility in a simulation; a longer pause time amounts to more nodes being stationary for more of the simulation.

The nodes communicate using omnidirectional antennas with maximum range r. We assume that all the nodes are equipped with identical transceivers, and further that each M-core node is equipped with M transceivers and M antennas. Each core has multiple wireless channels and identical hardware and software mechanisms. A multicore node can hence transmit multiple packets at the same time to different nodes. The network is assumed to be homogeneous, consisting of nodes with the same transmission range, number of cores, and battery power. Imposing a common transmission range induces a strongly connected communication graph, often called a unit disk graph in the literature on routing protocols. We assume that the traffic in the network is highly loaded, and that routing sessions are issued in a random manner.
In other words, the origin and destination nodes of each routing are chosen randomly, and no a priori knowledge is available regarding future sessions. Each node maintains a queue, so that incoming routing sessions are served on a FIFO basis. For simplicity it is assumed that the queue is large enough to store all arriving messages, so no messages are lost due to overflow. If the network consists of multicore nodes with M cores, then each node can serve M routing requests simultaneously. The model assumes that different networks may have different ratios of computation time to communication time, and that communication and computation overlap. The communication time is the amount of time it takes for a packet transmitted by one node to be received by the next node on the path. The computation time is the amount of time it takes for a packet to be processed, from the time it is received by a node to the time it is transmitted to the next node. This interval includes computationally intensive functions such as signal processing, encoding and decoding, and encrypting and decrypting. It also includes relatively lightweight functions such as next-hop routing decisions and channel access delay. Routing requests in our model are managed by localized, location-based protocols (12). In such protocols the location of the destination node is known, and the distance to neighboring nodes can be estimated on the basis of incoming signal strengths. The routes between nodes are created through a series of localized hop decisions: each node decides which neighbor will receive the message based on its own location, the locations of its neighbors, and the message's destination. In our simulations we use the Most Forward within Radius (MFR) protocol (13), which forwards the message to the neighbor that maximizes its progress toward the destination.
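As an illustration of the two ingredients just described, the sketch below (our own code, not the paper's; all names are illustrative) computes the linear transmission cost of a link and the MFR forwarding choice:

```python
import math

def transmission_cost(msg_bits: float, rate_bps: float, p_err: float) -> float:
    """Expected time to move a message of L bits across one link at rate A
    with link error probability P: T = L / (A * (1 - P)).  The factor
    1 / (1 - P) is the expected number of (re)transmissions until success."""
    if not 0.0 <= p_err < 1.0:
        raise ValueError("link error probability must lie in [0, 1)")
    return msg_bits / (rate_bps * (1.0 - p_err))

def mfr_next_hop(current, neighbors, dest):
    """Most Forward within Radius, sketched: forward to the neighbor whose
    projection onto the line from the current node to the destination
    makes the most forward progress.  Nodes are (x, y) tuples."""
    dx, dy = dest[0] - current[0], dest[1] - current[1]
    norm = math.hypot(dx, dy)
    if norm == 0:
        return None  # message already at its destination
    ux, uy = dx / norm, dy / norm  # unit vector toward the destination

    def progress(n):
        return (n[0] - current[0]) * ux + (n[1] - current[1]) * uy

    best = max(neighbors, key=progress, default=None)
    return best if best is not None and progress(best) > 0 else None

# A 1 Mb message on an error-free 2 Mbps link takes 0.5 s; a 25% error
# probability inflates that by 1/0.75.
assert transmission_cost(1e6, 2e6, 0.0) == 0.5
assert abs(transmission_cost(1e6, 2e6, 0.25) - 0.5 / 0.75) < 1e-12
# Of three neighbors, (1, 0) makes the most progress toward (3, 0).
assert mfr_next_hop((0, 0), [(1, 0), (0, 1), (-1, 0)], (3, 0)) == (1, 0)
```

Returning None when no neighbor makes forward progress mirrors the situation where a greedy location-based protocol gets stuck and must fall back on a recovery strategy.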
4 Routing Speedup and Efficiency

In measuring the scalability of parallel applications, the most commonly used practical metrics are relative speedup and relative efficiency (7; 8). Relative speedup is the ratio of a parallel application's run time on a single processor to its run time on N processors; it is important to emphasize that the application and its test problem are identical in both situations. Relative efficiency is the relative speedup divided by the number of processors N. Researchers use the speedup metric to check the performance of their applications on multiple platforms; it is a natural choice, since it is dimensionless and captures the relative benefit of solving a problem in parallel. Motivated by these definitions, we define similar metrics for measuring the scalability of routing in heavily loaded, multi-session ad hoc networks. First, we define a time-based routing metric called the Time-to-Destination (T2D). Second, we define the Routing Speedup (RS) and Average Routing Speedup (ARS) metrics. Finally, we define the Node Efficiency (NE) and Average Node Efficiency (ANE) metrics.

The metrics used in simulations of wireless ad hoc networks usually reflect the design goal of the network protocol. Most routing schemes thus use hop count as their metric, where hop count is the number of transmissions along a given route from source to destination. This choice agrees with the assumption that delay is proportional to hop count, which is reasonable when the impact of congestion is not significant. However, this assumption is not warranted for realistic ad hoc network scenarios in which routing sessions are issued simultaneously from many different origin nodes to random destinations. We therefore need a metric that reflects the delay caused by traffic congestion in a wireless network. We formally define the Time-to-Destination metric as follows: Definition 2: Time-to-Destination (T2D).
Assume a unit disk multicore graph G and a route R = v_s, ..., v_d from source node v_s to destination node v_d. The Time-to-Destination (T2D) is the aggregate time taken to route a message from the origin node to the destination node. The T2D metric is also known in the open literature as the end-to-end latency. We assume that different networks have different ratios of computation time to communication time, and that communication and computation overlap. Thus, for a routing with delay times d_s, ..., d_d at each hop, the total time it takes for a message to traverse the route is given by

T2D = Tcomm + Tcomp = Σ_{i=s}^{d} d_i,

where Tcomm and Tcomp are the total communication time and total computation time respectively, and L, the length of the route, is the number of hops from the source node to the destination node.

Definition 3: Routing Speedup (RS). Given a unit disk multicore graph G and a route R = v_s, ..., v_d from source node v_s to destination node v_d, the Routing Speedup (RS) of R is the ratio of the message routing time for a network with single-core nodes to the message routing time for a network with M-core nodes:

RS = T2D_1 / T2D_M,

where the subscripts 1 and M indicate the single-core and M-core networks respectively. Since it is more practical to measure the average speedup in a scenario with multiple sessions, we also define the Average Routing Speedup as follows:

Definition 4: Average Routing Speedup (ARS). The Average Routing Speedup (ARS) is the average speedup over R routings in G:

ARS = (1/R) Σ_{i=1}^{R} RS_i.

The ARS metric is a practical tool for evaluating the speed gained by moving from an ad hoc network with single-core nodes to one with multicore nodes. ARS does not, however, measure the efficiency of a multicore node. (By node efficiency, we mean the average number of cores that are busy per node.)
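Definitions 2-4 condense to a few lines of Python (a sketch of our own; the function names are illustrative, not from the paper):

```python
def t2d(delays):
    """Time-to-Destination (Definition 2): the sum of the per-hop delays
    d_s, ..., d_d along the route, each delay covering that hop's
    communication and computation time."""
    return sum(delays)

def routing_speedup(t2d_single, t2d_multi):
    """RS = T2D_1 / T2D_M (Definition 3)."""
    return t2d_single / t2d_multi

def average_routing_speedup(pairs):
    """ARS (Definition 4): the mean of RS over R routings, where `pairs`
    holds (T2D_1, T2D_M) for each routing."""
    speedups = [routing_speedup(s, m) for s, m in pairs]
    return sum(speedups) / len(speedups)

# A 5-hop route whose per-hop delay halves on multicore nodes is twice as fast.
assert routing_speedup(t2d([2.0] * 5), t2d([1.0] * 5)) == 2.0
assert average_routing_speedup([(10.0, 5.0), (9.0, 3.0)]) == 2.5
```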
As will be shown in the next section, knowing the node efficiency permits a dramatic reduction of power consumption in each node and thus increases the lifetime of the whole network.

Definition 5: Node Efficiency (NE). Assume a unit disk multicore graph G, and R routings executed in G over a period Δt. The Node Efficiency (NE) of a given node during Δt is the ratio of T1comp, its aggregate computation time in the case of a single-core network, to M times TMcomp, its aggregate computation time in the case of an M-core network:

NE = T1comp / (M · TMcomp).

For the same practical reasons mentioned with respect to the RS and ARS metrics, we define the Average Node Efficiency as follows:

Definition 6: Average Node Efficiency (ANE). The Average Node Efficiency (ANE) in G of |V| = n nodes during time Δt is:

ANE = (1/n) Σ_{i=1}^{n} NE_i.

It is important to notice that NE and ANE do not take idle times into account, only those times when the nodes are involved in the routing process. The aim of these metrics is to measure the efficiency of the cores within a node, rather than a given node's dominance over other nodes in the routing process. Although this information is important for intelligent power management and energy conservation, the efficiency metrics we define here focus on what happens inside a multicore node. In the next section we use these metrics to evaluate and analyze the implications of multicore nodes for position-based routing protocols and power management strategies in ad hoc networks.

Figure 1: Illustration of a scalable planar network of 27 nodes with c = 3.

5 Simulator and Simulations

A discrete event simulator was developed in order to monitor, observe, and measure ARS and ANE in multicore ad hoc networks. We generated a database with hundreds of random unit graphs, with values of V and D spanning a wide range of sparse and dense networks.
The results shown in this paper are only a representative sample of the many simulations performed, and each result is averaged over many runs. The network generator first partitions the plane into k regions: an innermost disk whose radius is equal to the transmission range r, and a series of concentric annuli of width r surrounding the disk. The number of nodes in each annulus is proportional to its area. If c nodes are randomly located in the inner disk of a network with k regions, then a total of c · k² nodes will be randomly placed in the entire network. For example, if the innermost region of a network has 3 nodes (c = 3), then 9, 15, and 21 nodes will be located in the first three rings respectively. Figure 1 illustrates a network of 27 nodes for c = 3. The networks are thus generated incrementally, ring after ring. A network with k regions is just an extension of the network with (k − 1) regions. In this way we can calculate the scalability of our protocols exactly. For small networks (up to 50 nodes) we chose c = 3, and for large networks c = 4. The networks were extensions of a base network of 27 nodes (for c = 3) or 36 nodes (for c = 4). The average node degree of the base network was preserved in the extended networks. Usually, the topology of a randomly generated scalable network is not symmetrical like the illustration shown in Figure 1. An example of a randomly generated 4-region network of 48 nodes and average node degree 4 is shown in Figure 2.

Figure 2: A randomly generated 4-region network of 48 nodes with an average node degree of 4.

All the simulations shown in this paper were carried out on a network of 108 nodes, with an average node degree of 7. Measurements were performed for three traffic loads: 100, 1K, and 10K routings. The source and destination nodes of each route were randomly chosen, and all routings were issued simultaneously.
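The incremental construction can be sketched as follows (our own code; function names are illustrative). Ring i of width r has (2i + 1) times the area of the inner disk, hence (2i + 1) · c nodes:

```python
import math, random

def ring_node_counts(c, k):
    """Nodes per region: c in the innermost disk of radius r, and
    (2i + 1) * c in the i-th annulus of width r, whose area is
    (2i + 1) times the disk's.  The k-region total is c * k**2."""
    return [c] + [(2 * i + 1) * c for i in range(1, k)]

def simulated_area(n_nodes, c, r):
    """A(N, R) = (k * R)^2 * pi = (N / c) * R^2 * pi, with N = c * k^2."""
    return (n_nodes / c) * r ** 2 * math.pi

def place_nodes(c, k, r, rng=random):
    """Scatter each region's nodes uniformly over its annulus; a k-region
    network simply extends the (k - 1)-region network by one further ring."""
    nodes = []
    for i, count in enumerate(ring_node_counts(c, k)):
        lo, hi = i * r, (i + 1) * r
        for _ in range(count):
            rho = math.sqrt(rng.uniform(lo ** 2, hi ** 2))  # uniform in area
            theta = rng.uniform(0.0, 2.0 * math.pi)
            nodes.append((rho * math.cos(theta), rho * math.sin(theta)))
    return nodes

# c = 3 reproduces the paper's example: rings of 3, 9, 15, 21 nodes,
# i.e. a 48-node 4-region network whose area is (4R)^2 * pi.
assert ring_node_counts(3, 4) == [3, 9, 15, 21]
assert len(place_nodes(3, 4, 250.0)) == 48
assert abs(simulated_area(48, 3, 250.0) - (4 * 250.0) ** 2 * math.pi) < 1e-6
```

Because each extension only appends a ring, the smaller network is literally a subset of the larger one, which is what makes the scalability measurements comparable across sizes.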
The scalability of routing was tested assuming values of 2, 4, 8, and 16 cores per node. Each simulation shown in this section (except Fig. 5) was performed assuming that the ratio of computation time to communication time is 1. The routing protocol used in our benchmarks is the Most Forward within Radius (MFR) protocol. For the simulations of our multicore network model we used parameter values that are the average values of IEEE 802.11-based interfaces as they appear in (4), summarized in Table 1. Each core is assumed to be able to transmit and receive at 2 Mbps. The packet size was 64 KB and the link probability of retransmission was set to 0.25. For the network mobility it was assumed that all nodes move according to the random waypoint model. First, a node chooses a destination uniformly at random in the simulated region. The area A(N, R) of a simulated region is determined relative to the number of nodes in the network (N = c · k²) and the data transmission range R. Thus, A(N, R) = (k · R)² · π = (N/c) · R² · π. For example, a scalable network of 48 nodes with c = 3 has 4 rings, and thus the area of the network region is (4R)²π. In the simulations we used a data transmission range of 250 m. Next, the node chooses a velocity uniformly at random, with a maximum velocity of 10 m/s, and moves to that destination at the chosen velocity. Upon arriving at the chosen waypoint, the node pauses for a period before repeating the same process. We simulate pause times in the range of 0-10 seconds.

Table 1: Parameter values used in the simulations.
| Communication | Packet size | 64 KB |
| | Retransmission probability | 0.25 |
| Mobility | Simulated area | (N/c) · R² · π |
| | Transmission range | 250 m |
| | Velocity | 0-10 m/s |
| | Pause time | 0-10 s |

Figures 3 and 4 depict the routing scalability that can be expected from a multicore-based network. Figure 3 plots ARS as a function of the number of routings for nodes with 2, 4, 8, and 16 cores.
Figure 4 plots ARS as a function of the number of cores per node for traffic loads of 100, 1K, and 10K routings. We choose to present the same information in two different ways to make all the relationships clearly visible. Analysis of these results leads to the following findings. First, ARS increases as the number of routings increases; this is obvious from Figure 3. For 10K routings the ARS increases linearly with the number of cores, up to values of 1.96, 3.82, 7.26, and 13.27 for 2, 4, 8, and 16 cores respectively. For 100 routings, however, the ARS reaches its maximum value of 1.35 at only 4 cores (Figure 4). Adding more cores does not increase the ARS unless there is more traffic to handle.

Figure 3: Average Routing Speedup as a function of the number of routings.

These encouraging results show that multicore nodes decrease the congestion of loaded networks, increase the routing speedup, and decrease the travel time of messages. Figure 5 plots the ARS as a function of the number of cores per node and the ratio of computation time to communication time. The goal of this paper is to study the impact of various multicore ad hoc networks on routing scalability, so we varied the ratio of computation time to communication time in the range 0.5 to 3. Delays in computation and communication can be incurred from many sources: next-hop routing decisions, channel access delays, transmission delays, traffic load, the sizes of contending packets, the medium access control algorithm used by the nodes, the modulation and symbol rate of the packets, and finally the distance that the packets must travel. Moreover, wireless communication usually requires several network-dependent, computationally intensive functions such as signal processing, encoding and decoding, and encrypting and decrypting. Figure 5 shows that as the ratio of computation time to communication time increases, the degree of speedup for large core numbers improves.
In the case of 16 cores the ARS increases by 34% as the computation time to communication time ratio is varied in the range 0.5 to 3. In the case of 8 cores the improvement is only 8%, and for 2 and 4 cores there is no improvement at all. However, in the case of 2, 4, and 8 cores the ARS is already high. This phenomenon also matches the results shown in Figure 8. The network reaches its load-balancing equilibrium point when the number of cores reaches 16.

Figure 4: Average Routing Speedup as a function of the number of cores.

At this point each node has enough available cores to amortize the increase in the ratio of computation time to communication time. Figures 6, 7, and 8 describe the node efficiency of multicore-based networks. Figure 6 graphs ANE as a function of the number of routings for the cases of 2, 4, 8, and 16 cores. Figure 7 graphs ANE as a function of the number of cores for the cases of 100, 1K, and 10K routings. Once again, we choose to present the same information in two different ways to clarify the relations. Figure 8 is a histogram showing the distribution of routing loads over all nodes for the case of 10K simultaneous, random routings. Analysis of these results reveals the following relationships. The ANE asymptotically increases with the number of routings (Figure 6). For example, in the case of an 8-core network the ANE approaches the values 0.18, 0.54, and 0.70 for 100, 1K, and 10K routings respectively. However, the ANE decreases dramatically as the number of cores increases (Figure 7). For example, in the case of 10K routings the ANE approaches the values 0.88, 0.78, 0.70, and 0.63 for 2, 4, 8, and 16 cores respectively. In other words, as the number of cores increases, more cores remain idle. Figure 8 presents another view of this phenomenon. This histogram shows how the routing load is distributed among the nodes. For example, in the case of a single-core network most nodes (80 of 108) were not busy at least 40% of the time (marked low in the histogram).
18 nodes were busy between 40% and 80% of the time (marked moderate), and 10 nodes were busy more than 80% of the time (marked high). These 10 nodes are the dominant set of the network, through which most of the traffic passes. However, as the number of cores increases, more nodes become busy more of the time.

a.) d(x, y) ≥ 0 and d(x, y) = 0 ⟺ x = y (non-negativity),
b.) d(x, y) = d(y, x) (symmetry),
c.) d(x, z) ≤ d(x, y) + d(y, z) (triangle inequality).

In an arbitrary metric space the distance is measured by values strictly smaller than ∞. By allowing ∞ as a distance we have in effect restricted to bounded metric spaces. While the modest generalization of allowing ∞ as a similarity value does not pose a serious restriction (databases are usually built from finite, and therefore bounded, sets of data), it makes the set [0, ∞], when ordered by the usual ≥ relation, a complete lattice as required in sets with similarity. Meet and join are computed as supremum and infimum, respectively. Note that we turned [0, ∞] upside down, so that the least element is ∞ and the greatest is 0. Hence the metric dA is again a special case of a measure of similarity, because dA(x, x) = 0, which is the greatest element of the complete lattice L[0,∞]. Thus the bounded metric space (A, dA) can be transformed into the set with similarity (A, L[0,∞], dA) and embedded into sets with similarities.

3 Tables, relations, and basic relational operations

A relational database is composed of several relations in the form of two-dimensional tables of rows and columns containing related tuples. The rows (tuples) are called records and the columns (fields in the record) are called attributes. Each attribute has a data type that defines the set of possible values. Thus a relation is a subset of a Cartesian product of sets (value domains).
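The metric-to-similarity passage above, with [0, ∞] turned upside down, can be made concrete in a few lines of Python (our own illustration; nothing here is from the paper):

```python
import math

# The complete lattice L_[0, inf]: the interval [0, inf] with the order
# reversed, so 0 (identical elements) is the TOP and inf is the BOTTOM.
# Meet (infimum in the reversed order) is therefore the numeric maximum,
# and join (supremum in the reversed order) is the numeric minimum.
def meet(values):
    return max(values)

def join(values):
    return min(values)

d = lambda x, y: abs(x - y)  # an ordinary metric on the reals

# d(x, x) = 0 is the greatest element of the lattice, so every element is
# maximally similar to itself, exactly as a measure of similarity requires.
assert d(5, 5) == 0
assert meet([1.0, 4.0, 2.0]) == 4.0   # the "worst" (largest) distance
assert join([1.0, math.inf]) == 1.0   # the "best" (smallest) distance
```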
3.1 Cartesian products and subsets

In order to use sets with similarity instead of (ordinary) sets, we need a suitable notion of relation between sets with similarity. Hence we first need to know how to interpret Cartesian products and subsets of sets with similarity in a natural and effective way.

Definition 4. The Cartesian product of sets with similarity A = (A, LA, pA) and B = (B, LB, pB) is the set with similarity A × B = (A × B, LA × LB, pA×B), where A × B is the Cartesian product of sets, LA × LB is the product of complete lattices, and the measure of similarity pA×B is given by

pA×B((x1, y1), (x2, y2)) = (pA(x1, x2), pB(y1, y2)).

The corresponding canonical projections are (π1, p1) : A × B → A and (π2, p2) : A × B → B, where π1 and π2 are projections of sets, while p1 and p2 are projections of complete lattices. This interpretation of Cartesian products of sets with similarity is sound, since a product of complete lattices is a complete lattice (14) and pA×B satisfies the condition of being a measure of similarity:

pA×B((x, y), (x, y)) = (pA(x, x), pB(y, y)) = (1A, 1B) = 1A×B,

where 1A is the greatest element of LA, 1B is the greatest element of LB, and 1A×B is the greatest element of the complete lattice LA × LB.

Further, we have decided to consider only those subobjects or substructures I of the set with similarity A = (A, LA, pA) whose similarity measure is induced by the structure of A. That is, the underlying set is a subset of A, but the measure of similarity and the corresponding lattice are inherited from A. Even though the domain of the measure of similarity has changed from A × A to I × I, we will keep the notation pA and write I = (I, LA, pA).

Definition 5. A subset of the set with similarity A = (A, LA, pA) is a set with similarity I = (I, LA, pA), where I ⊆ A. Subsets of sets with similarity will also be called induced subobjects.
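Definition 4 pairs up the component measures of similarity; a minimal sketch (our own toy measures, not the paper's):

```python
def product_similarity(p_a, p_b):
    """Similarity on a Cartesian product (Definition 4, sketched): the
    product measure returns the pair of component similarities, and the
    product lattice is ordered componentwise."""
    def p_axb(pair1, pair2):
        (x1, y1), (x2, y2) = pair1, pair2
        return (p_a(x1, x2), p_b(y1, y2))
    return p_axb

p_len = lambda s, t: abs(len(s) - len(t))  # toy distance-style measure on strings
p_num = lambda x, y: abs(x - y)            # toy distance-style measure on numbers
p = product_similarity(p_len, p_num)

# p_AxB((x, y), (x, y)) = (1_A, 1_B), the greatest element of the product
# lattice; with distances ordered in reverse that is the pair of zeros.
assert p(("ab", 3), ("ab", 3)) == (0, 0)
assert p(("ab", 3), ("abcd", 5)) == (2, 2)
```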
3.2 Relations and basic relational operations
The family of subsets of A, denoted by IndSub(A), is essentially just the power set P(A).
Theorem 1. The induced subobjects of a set with similarity A = (A, L_A, ρ_A) form a complete Boolean algebra equivalent to P(A), in which all the basic relational operations can be properly interpreted.
The formal proof is given in [7]. The Boolean lattice IndSub(A) is ordered by the usual subset relation, where the least element is the empty subobject ∅ = (∅, L_A, ρ_A) and the greatest element is A. Hence selection, union, and difference are calculated as usual (A1, A2 ⊆ A):
σ_F(A1, L_A, ρ_A) = (σ_F(A1), L_A, ρ_A),
(A1, L_A, ρ_A) ∪ (A2, L_A, ρ_A) = (A1 ∪ A2, L_A, ρ_A),
(A1, L_A, ρ_A) − (A2, L_A, ρ_A) = (A1 − A2, L_A, ρ_A).
Moreover, Cartesian products, projections, selections, unions, and differences of induced subobjects satisfy all the abstract properties that are axiomatized by relational calculus [15; 16]. A relation between two objects of the category of similarities, namely A = (A, L_A, ρ_A) and B = (B, L_B, ρ_B), is now determined by a subset R ⊆ A × B, which induces a subobject (R, L_A × L_B, ρ_{A×B}) of the Cartesian product A × B. Hence tables and answers to queries are modeled as induced subobjects.
4 Similarity of relations
Sets with similarity enjoy additional constructions which do not exist at the level of underlying sets. For instance, a suitable notion of similarity between induced subobjects can be defined. In the case of a reflexive set, which is equipped with a reflexive relation, the corresponding lattice is the complete lattice from definition 3. The measures of similarity are defined as follows:
• The measure of similarity ρ_Strings is the Damerau-Levenshtein distance [12], given as the minimum number of operations needed to transform one string into the other, where an operation is an insertion, deletion, or substitution of a single character, or a transposition of two adjacent characters.
Since the length of the strings is bounded by 30, the Damerau-Levenshtein distance is at least 0 (the greatest element of the lattice L_Strings) and at most 30 (the least element of the lattice L_Strings).
• The measure of similarity d_Cities calculates the similarity of two cities as their air distance, given as the Euclidean distance (in kilometres) between the Gauss-Krüger coordinates of the city centers (we have used the tool from http://www2.arnes.si/).
The measure of similarity corresponding to the Cartesian product USERS = NAME × NICK × CITY of the given sets with similarity is defined in accordance with definition 4:
ρ_Users((p1, n1, c1), (p2, n2, c2)) = (ρ_Strings(p1, p2), ρ_Strings(n1, n2), d_Cities(c1, c2)),
where (p1, n1, c1) and (p2, n2, c2) are two rows of the given table instance, i.e., elements of the Cartesian product of sets Users = Strings × Strings × Cities. For instance, the similarity between the first rows of the two tables 1 and 2 is equal to (5, 5, 18.3), but the similarity of the last two rows of table 2 is equal to (9, 5, 189.9). Clearly, (9, 5, 189.9) ≥ (5, 5, 18.3) componentwise, which means that the first pair of rows is more similar than the second one, i.e., the similarity value of the first pair is higher in the complete lattice L_Users = L_Strings × L_Strings × L_Cities.
Furthermore, the measure of similarity ρ between induced subobjects A and B of the Cartesian product USERS can be defined in accordance with theorem 2:
ρ(A, B) = (⋀_{x∈A} ⋁_{y∈B} ρ_Users(x, y)) ∧ (⋀_{y∈B} ⋁_{x∈A} ρ_Users(x, y)),
where all the meets and joins are computed in the complete lattice L_Users (cf. the Egli-Milner ordering and the Hausdorff metric). Hence the similarity between the given table instances is equal to (9, 4, 106.9). The similarity would certainly decrease if one of the tables were extended with users living far from Slovenia and/or using much longer names or nicks.
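The two-sided meet/join formula above can be sketched in Python. The row similarity used here (componentwise absolute differences on numeric tuples) is a simplified stand-in of our own for ρ_Users, not the paper's Damerau-Levenshtein and air-distance measures; under the reversed order, joins are componentwise minima and meets are componentwise maxima:

```python
# Sketch of the table-similarity formula
#   rho(A,B) = (MEET_{x in A} JOIN_{y in B} rho(x,y))  MEET
#              (MEET_{y in B} JOIN_{x in A} rho(x,y))
# over a toy row similarity (componentwise absolute differences).
def rho_row(r1, r2):
    return tuple(abs(a - b) for a, b in zip(r1, r2))

def join_rows(sims):
    # join = componentwise minimum (the "most similar"), order being reversed
    return tuple(min(c) for c in zip(*sims))

def meet_rows(sims):
    # meet = componentwise maximum (the "least similar")
    return tuple(max(c) for c in zip(*sims))

def table_similarity(A, B):
    forward = meet_rows([join_rows([rho_row(x, y) for y in B]) for x in A])
    backward = meet_rows([join_rows([rho_row(x, y) for x in A]) for y in B])
    return meet_rows([forward, backward])

A = [(1, 10), (2, 20)]   # two toy table instances, rows = numeric tuples
B = [(1, 12), (5, 20)]
print(table_similarity(A, B))  # -> (3, 2)
```

A table compared with itself yields the top element (all zeros), mirroring ρ(A, A) = 1.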
Note that since L_Users is not linearly ordered, there are also incomparable elements in the lattice. Clearly, if the similarity measures integrated within the sets with similarity were changed, the similarity value between the two table instances would also change and possibly have a different interpretation. Hence the usefulness of the calculated similarity values depends highly on the definitions of the basic similarity measures.
Example 2. Let table 3 contain a portion of the public-transport bus routes in Ljubljana (Slovenia). The relational schema of table 3, corresponding to relation BUSES, is [R : ROUTE, D : DEPARTURE, DT : DTIME, A : ARRIVAL, AT : ATIME], where the domains corresponding to the attributes R, D, DT, A, and AT are the following sets with similarity:
ROUTE = (Buses, L_2, σ_Buses),
DEPARTURE = (Stops, L_Stops, σ_Stops),
DTIME = (Time, L_Time, σ_Time),
ARRIVAL = (Stops, L_Stops, σ_Stops),
ATIME = (Time, L_Time, σ_Time).
Here, Buses and Stops are the sets of bus routes and bus stops in Ljubljana, respectively. Time is the set of all possible times of the form HH:MM, where HH denotes hours written as 00, 01, ..., 23 and MM denotes minutes written as 00, 01, ..., 59. The complete lattice L_Stops is the linearly ordered set {0, 1, ..., M, ∞} of the non-negative integers smaller than the number of all bus stops M, together with the infinity value ∞, under the order relation ≥. The complete lattice L_Time is the linearly ordered set Time with 23:59 being the least element and 00:00 being the greatest element. Moreover, the lattice L_2 is the lattice of Boolean values from definition 2. The measures of similarity are defined as follows:
• The measure of similarity σ_Buses is 1 if the given bus routes are equal and 0 if they are not.
• The measure of similarity σ_Stops calculates the similarity of the given bus stops as the minimum number of bus stops one needs to pass by bus to come from the first bus stop to the second.
If it is impossible to do this, the given similarity value is equal to ∞.

ROUTE | DEPARTURE | DTIME | ARRIVAL | ATIME | Exchangeability
9 (Štep. naselje-Trnovo) | 145 (Emona) | 09:58 | 026 (Konzorcij) | 10:25 | (1, 0, 23:58, 4, 00:25)
5 (Štep. naselje-Podutik) | 145 (Emona) | 10:10 | 025 (Hotel Lev) | 10:26 | (1, 0, 00:10, 6, 00:26)
13 (Sostro-Bežigrad) | 145 (Emona) | 10:11 | 059 (Bavarski dvor) | 10:24 | (1, 0, 00:11, 5, 00:24)
9 (Štep. naselje-Trnovo) | 145 (Emona) | 10:14 | 026 (Konzorcij) | 10:41 | (1, 0, 00:14, 4, 00:41)
6 (Črnuče-Dolgi most) | 058 (Bavarski dvor) | 10:29 | 034 (Hajdrihova) | 10:32 | (1, 6, 00:29, 0, 00:32)
1 (Vižmarje-Mestni log) | 024 (Kolizej) | 10:36 | 034 (Hajdrihova) | 10:49 | (1, 7, 00:36, 0, 00:49)
6 (Črnuče-Dolgi most) | 026 (Konzorcij) | 10:41 | 034 (Hajdrihova) | 10:46 | (1, 8, 00:41, 0, 00:46)
1 (Vižmarje-Mestni log) | 026 (Konzorcij) | 10:44 | 034 (Hajdrihova) | 10:49 | (1, 8, 00:44, 0, 00:49)
6 (Črnuče-Dolgi most) | 026 (Konzorcij) | 10:47 | 034 (Hajdrihova) | 10:54 | (1, 8, 00:47, 0, 00:54)
Table 3: Table of public-transport bus routes in Ljubljana. The last column contains data about the exchangeability of each row with an exact answer to the query given within example 2.

• The measure of similarity σ_Time calculates the similarity of two time moments as their difference (second minus first) in the form HH:MM.
Now consider the query "It is 10 o'clock and I am at the Emona bus stop. Are there any buses to Hajdrihova? I would like to arrive as soon as possible.", written in the language of relational algebra [4]:
σ_{D=Emona ∧ DT=10:00 ∧ A=Hajdrihova ∧ AT=10:00}(BUSES).
The last column of table 3 contains the calculated similarity or exchangeability values of the exact and the possibly relaxed answer (row). Notice that in table 3 there are no buses satisfying all the conditions given by the user, but there are several buses that could be interesting for the user, such as the buses described by the second or the third row.
These have a different destination, which is not a real handicap, since the user could take another bus to come to Hajdrihova, i.e., bus routes 1 and 6, respectively. However, if we would like to suggest a suitable bus or a sequence of buses, we just need to calculate a θ-join of the given relation with the requirement that the arrival bus stop of the first bus and the departure bus stop of the second bus are (basically) the same, i.e., there are no bus stops between them; perhaps one only needs to cross the street.
6 Conclusion
We have defined the mathematical structure of sets with similarity, which allows us to treat the features of richly-structured data, such as order, distance, and similarity, in a theoretically sound and uniform way. The proposed measures of similarity allow us to perform all types of similarity search. In addition, we now briefly discuss possible implementations of the resulting databases enriched with measures of similarity.
Clearly, the user should be able to query approximate or cooperative data from databases without being concerned about the internal structure of the data. Hence some default similarity measures should be integrated within the database. Still, the user should have the opportunity to modify the default notions of similarity if he/she is willing to do so. However, the question that arises is how to store the defined similarity measures. When the size of the data set A is small, the evident way to store a similarity measure ρ_A : A × A → L_A is in tabular form, i.e., as a relation ρ_A ⊆ A × A × L_A. This kind of representation quickly becomes inefficient since it requires space quadratic in the size of A. Fortunately, in most cases the similarity measure can be easily calculated, so there is no need to store it. There are two typical examples of similarity measures that can be computed rather than stored.
First, distance-like similarities are computed from auxiliary data, such as geographic location, duration, and various other features that only require a minimal amount of additional storage. Second, reflexive relations are often defined in terms of deduction rules; e.g., it may be known that the relation is symmetric or transitive. In such cases we only store the base cases in a database and deduce the rest from them. This is precisely the idea behind deductive database languages, such as Datalog.
References
[1] P. Buneman, A. Jung, A. Ohori (1991) Using Powerdomains to Generalize Relational Databases, Theoretical Computer Science 91/1, Elsevier, pp. 23-55.
[2] W. W. Chu, Q. Chen (1994) A Structured Approach for Cooperative Query Answering, IEEE Transactions on Knowledge and Data Engineering 6/5, IEEE Computer Society, pp. 738-749.
[3] W. W. Chu, H. Jung, K. Chiang, M. Minock, G. Chow, C. Larson (1996) CoBase: A Scalable and Extensible Cooperative Information System, Journal of Intelligent Information Systems 6/2-3, Springer US, pp. 223-259.
[4] E. F. Codd (1970) A Relational Model of Data for Large Shared Data Banks, Communications of the ACM 13/6, Association for Computing Machinery, pp. 377-387.
[5] T. Gaasterland, P. Godfrey, J. Minker (1992) An Overview of Cooperative Answering, Journal of Intelligent Information Systems 1/2, Springer US, pp. 123-157.
[6] T. Gaasterland, P. Godfrey, J. Minker (1992) Relaxation as a Platform for Cooperative Answering, Journal of Intelligent Information Systems 1/3-4, Springer US, pp. 293-321.
[7] M. Hajdinjak (2006) Knowledge Representation and Evaluation of Cooperative Spoken Dialogue Systems, Ph.D. thesis, Faculty of Electrical Engineering, University of Ljubljana, Ljubljana, Slovenia.
[8] M. Hajdinjak, F. Mihelič (2006) The PARADISE Evaluation Framework: Issues and Findings, Computational Linguistics 32/2, MIT Press, pp. 263-272.
[9] G. R. Hjaltason, H.
Samet (2003) Index-Driven Similarity Search in Metric Spaces, ACM Transactions on Database Systems 28/4, Association for Computing Machinery, pp. 517-580. [10] A. Motro (1988) VAGUE: A User Interface to Relational Databases that Permits Vague Queries, ACM Transactions on Office Information Systems 6/3, Association for Computing Machinery, pp. 187-214. [11] A. Motro (1990) FLEX: A Tolerant and Cooperative User Interface to Databases, IEEE Transactions on Knowledge and Data Engineering 2/2, IEEE Computer Society, pp. 231-246. [12] G. Navarro (2001) A guided tour to approximate string matching, ACM Computing Surveys 33/1, Association for Computing Machinery, pp. 31-88. [13] W. Ng (2001) An Extension of the Relational Data Model to Incorporate Ordered Domains, ACM Transactions on Database Systems 26/3, Association for Computing Machinery, pp. 344-383. [14] D. E. Rutherford (1965) Introduction to Lattice Theory, Oliver & Boyd, Edinburgh, London. [15] J. D. Ullman (1988) Principles of Database and Knowledge-Base Systems, Volume I, Computer Science Press Inc., Rockville, Maryland. [16] J. D. Ullman (1989) Principles of Database and Knowledge-Base Systems, Volume II: The New Technologies, Computer Science Press, Inc., Rockville, Maryland. 
Robust H∞ Control of a Doubly Fed Asynchronous Machine
Gherbi Sofiane
Department of Electrical Engineering, Faculty of Science of the Engineer, 20 August 1956 University, Skikda, Algeria
E-mail: sgherbi@gmail.com
Yahmedi Said
Department of Electronics, Faculty of Science of the Engineer, Badji Mokhtar University, Annaba, Algeria
E-mail: sais.yahmedi@carmail.com
Sedraoui Moussa
Department of Electronics, Faculty of Science of the Engineer, Constantine University, road of AIN EL BEY, Constantine, Algeria
E-mail: msedraoui@gmail.com
Keywords: doubly fed asynchronous machine, robust control, H∞ control, LMIs
Received: April 25, 2008
The doubly fed asynchronous machine is among the most used electrical machines due to its low cost and simplicity of construction and maintenance [1]. In this paper we present a method to synthesize a robust controller for the doubly fed asynchronous machine, which is the main component of the wind turbine system (actually the most used model [2]). Indeed, there are different challenges in the control of wind energy systems, and we have to take into account several parameters that perturb the system, such as the wind speed variation, the variation in electrical energy consumption, and the kind of power consumed (active or reactive), etc. The proposed method is based on the H∞ control problem with the linear matrix inequality (LMI) solution of Gahinet-Apkarian [3]; the results show the stability and performance robustness of the system in spite of the perturbations mentioned before.
Povzetek: Opisana je metoda upravljanja motorja vetrnih turbin.
1 Introduction
Among all the renewable energy electricity production systems, wind turbine systems are the most used, especially those based on the doubly fed asynchronous machine. The control of these systems is particularly difficult because of all the uncertainties introduced, such as the wind speed variations, the electrical energy consumption variations, and the system parameter variations. In this paper we focus on the robust control (H∞ controller design method) of the doubly fed asynchronous machine, which is the most used in wind turbine systems due to its low cost and simplicity of construction and maintenance [1].
This paper is organised as follows. Section 2 presents the wind turbine system equipped with the doubly fed asynchronous machine, and then the mathematical electrical equations from which the system is modelled (in state space form) are given. Section 3 presents the H∞ robust controller design method with the LMI solution used to control our system. Section 4 presents a numerical application, with results in both the frequency and time domains. Finally, a conclusion is given in Section 5.
2 System presentation and modelling
Figure 1 represents the wind turbine system: the wind drives the doubly fed asynchronous machine, which exchanges active and reactive power with the electrical network feeding the electrical energy consumption.
Figure 1: The wind turbine system.
The system uses wind power to drive the doubly fed asynchronous machine, which acts as a generator. The output power produced must have the same high quality when it enters the electrical network, i.e., 220 volts amplitude and 50 Hz frequency, with harmonics held to a low level, in spite of wind speed changes and electrical energy consumption in active or reactive power form.
References [4], [5], [6] describe detailed models of wind turbines for simulations; we use the model equipped with doubly fed induction generators (asynchronous machines) (for more details see [7]). The system electrical equations are given in the (d, q) frame orientation. The stator voltage differential equations are:
Vds = Rs·Ids + dΦds/dt − ws·Φqs    (1)
Vqs = Rs·Iqs + dΦqs/dt + ws·Φds    (2)
The rotor voltage differential equations are:
Vdr = Rr·Idr + dΦdr/dt − wr·Φqr    (3)
Vqr = Rr·Iqr + dΦqr/dt + wr·Φdr    (4)
The stator flux vector equations are:
Φds = Ls·Ids + M·Idr    (5)
Φqs = Ls·Iqs + M·Iqr    (6)
The rotor flux vector equations are:
Φdr = Lr·Idr + M·Ids    (7)
Φqr = Lr·Iqr + M·Iqs    (8)
The electromagnetic torque equation is:
Cem = p·(M/Ls)·(Φds·Iqr − Φqs·Idr)    (9)
The mechanical equation is:
Cem = Cr + J·dΩ/dt + f·Ω    (10)
With:
Vds, Vqs: statoric voltage vector components in the d and q axes respectively.
Vdr, Vqr: rotoric voltage vector components in the d and q axes respectively.
Ids, Iqs: statoric current vector components in the d and q axes respectively.
Idr, Iqr: rotoric current vector components in the d and q axes respectively.
Φds, Φqs: statoric flux vector components in the d and q axes respectively.
Φdr, Φqr: rotoric flux vector components in the d and q axes respectively.
Rs, Rr: stator and rotor resistances (of one phase) respectively.
Ls, Lr: stator and rotor cyclic inductances respectively.
ws, wr: statoric and rotoric current pulsations respectively.
M: cyclic mutual inductance.
p: number of pole pairs of the machine.
Cr: resistant torque.
f: viscous friction coefficient.
J: moment of inertia.
Ω: mechanical speed.
2.1 State space model
In order to apply the robust controller design method, we have to put the system model in state space form; we consider the rotoric voltages Vdr, Vqr as the inputs and the statoric voltages Vds, Vqs as the outputs, i.e.
we have to design a controller that acts on the rotoric voltages to keep the output statoric voltages at 220 volts and 50 Hz frequency in spite of the electric network perturbations (demand variations, etc.) and the wind speed variations (see figure 2).
Figure 2: A doubly fed wind turbine system control configuration.
Here u, y and e are the rotoric voltage vector (control vector), the statoric output voltage vector and the error signal between the input reference and the system output, respectively. K and G are the controller and the wind turbine system, respectively. R is the statoric voltage reference vector, and the perturbations are the electric energy demand variations, wind speed variations, etc.
Let us consider x = [Ids Iqs Φdr Φqr]^T as the state vector and u = [Vdr Vqr]^T as the command vector. The stator flux vector is oriented along the d axis of Park's reference frame, so Φqs = 0 and Φds is constant in the steady state.
We use the following doubly fed asynchronous machine parameters:
Rs = 5 Ω; Rr = 1.0113 Ω; M = 0.1346 H; Ls = 0.3409 H; Lr = 0.605 H; wr = 146.6 rad/s; ws = 2π·50 rad/s.
Let w = ws − wr and σ = 1 − M²/(Ls·Lr).
The state space model (11) can be obtained by combining equations (1) to (8) as follows:
ẋ = A·x + B·u
y = C·x + D·u    (11)
where A, B, C and D are constant matrices built from the machine parameters Rs, Rr, Ls, Lr, M, σ and the pulsations w and wr.
3 The H∞ controller design method
It is necessary to recall the basics of a control loop (figure 3), with G the perturbed system.
Figure 3: The control loop with the output multiplicative uncertainties.
The multiplicative uncertainties at the process output, which include all the perturbations that act on the system, are then Δm = G⁻¹·(G′ − G), with G′ = G(I + Δm) the perturbed system. Figure 4 shows the singular values of Δm in the frequency domain; we can see that the uncertainties are smaller at low frequencies and grow at medium and high frequencies, which means a strong perturbation at high frequencies (the transient phase). We also note a peak at w = 260 rad/s; this is due to the fact that the system is highly coupled at this pulsation. We can bound the system uncertainties by the following weighting matrix function:
Wt(jw) = diag( 0.55(0.02jw + 1)/(1 + 0.0001jw), 0.55(0.02jw + 1)/(1 + 0.0001jw) )    (12)
Informatica 32 (2008) 143-150
Figure 5 shows that the singular values of Wt(jw) bound the maximum singular values of the uncertainties in the entire frequency domain. The robust stability condition [11] is then:
σ̄[T(jw)·Wt(jw)] < 1    (13)
or:
σ̄[T(jw)] < 1/σ̄[Wt(jw)]    (14)
where σ̄ is the maximum singular value and T(jw) is the nominal closed loop transfer matrix defined by:
T(jw) = G(jw)·K(jw)·[I + G(jw)·K(jw)]⁻¹    (15)
Condition (13) allows us to guarantee stability robustness. On the other hand, we must also guarantee satisfactory performances (no overshoot, time response, etc.) in the closed loop (performance robustness); this can be done through the performance robustness condition [8]:
σ̄[S(jw)·Wp(jw)] < 1    (16)
or:
σ̄[S(jw)] < 1/σ̄[Wp(jw)]    (17)
where S(jw) is the sensitivity matrix given by:
S(jw) = [I + G(jw)·K(jw)]⁻¹    (18)
Wp(jw) is a weighting matrix function designed to meet the desired performance specifications in the frequency domain; we choose the following matrix function:
Wp(jw) = diag( (0.005jw + 1)/(0.05jw), (0.005jw + 1)/(0.05jw) )    (19)
Figure 6 represents the singular values of Wp(jw) in the frequency domain; one notices that the specifications on the performances
are bigger at low frequencies (integrator frequency behaviour), and this guarantees no static error. The standard problem of H∞ control theory is then:
min_{K stabilising} ‖ [ T(jw)·Wt(jw) ; S(jw)·Wp(jw) ] ‖_∞    (20)
i.e., to find a stabilising controller K that minimises the norm (20), where ‖·‖_∞ is the H-infinity norm.
4 Application
The minimisation problem (20) is solved by using two Riccati equations [9] or with the linear matrix inequality approach. For our system, we use the linear matrix inequality solution (for more details see [10]). The solution (controller) can be obtained via the Matlab instruction hinflmi, available in the LMI Toolbox of Matlab® (The MathWorks, Inc.) [11]. Figures 7 and 8 show the satisfaction of the stability and performance robustness conditions (14) and (17). Figure 9 shows the step responses of the closed loop controlled nominal system for step references on Vds_ref and Vqs_ref, respectively; the outputs Vds and Vqs follow the references with a good time response and no overshoot.
Figure 4: The system uncertainties' maximum singular values.
Figure 5: Maximum singular values of the system uncertainties Δm bounded by the singular values of Wt(jw).
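For readers who want to probe the weighting choices numerically, the scalar diagonal entries of Wt and Wp from (12) and (19) can be evaluated directly (a sketch of ours, not the authors' code; the sampled frequencies are arbitrary). Conditions (14) and (17) then give frequency-dependent upper bounds for σ̄[T] and σ̄[S]:

```python
# Scalar diagonal entries of the weighting functions Wt (12) and Wp (19).
def Wt(w):
    jw = 1j * w
    return 0.55 * (0.02 * jw + 1) / (1 + 0.0001 * jw)

def Wp(w):
    jw = 1j * w
    return (0.005 * jw + 1) / (0.05 * jw)

# Conditions (14) and (17) bound the closed loop at every frequency w:
#   sigma_max[T(jw)] < 1/|Wt(jw)|  and  sigma_max[S(jw)] < 1/|Wp(jw)|.
for w in (0.01, 1.0, 260.0, 1e4):
    print(f"w = {w:8g} rad/s: T bound {1/abs(Wt(w)):10.4f}, "
          f"S bound {1/abs(Wp(w)):10.4f}")

# Wp contains an integrator, so |Wp| blows up as w -> 0, forcing |S| -> 0
# there (hence no static error); Wt grows at high frequency, rolling T off.
assert abs(Wp(0.01)) > abs(Wp(100.0))
assert abs(Wt(1e4)) > abs(Wt(1.0))
```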
Figure 6: Singular values of the weighting performance specification Wp(jw).
Figure 7: Stability robustness condition.
Figure 8: Performance robustness condition.
Figure 9: Step response of the controlled closed loop nominal system.
5 Conclusion
In this paper we deal with the control problem of a wind turbine equipped with a doubly fed asynchronous machine subject to various perturbations and system uncertainties (wind speed variations, electrical energy consumption, system parameter variations, etc.). We show that the H∞ controller design method can be successfully applied to this kind of system, keeping stability and good performances in spite of the perturbations and system uncertainties.
References
[1] G. L. Johnson (2006), 'Wind Energy Systems: Electronic Edition', Manhattan, KS, October 10.
[2] 'AWEA Electrical Guide to Utility Scale Wind Turbines' (2005), The American Wind Energy Association, available at http://www.awea.org.
[3] P. Gahinet, P. Apkarian (1994), 'A Linear Matrix Inequality Approach to H∞ Control', Int. J. of Robust & Nonlinear Control, vol. 4, pp. 421-448.
[4] J. Soens, J. Driesen, R. Belmans (2005), 'Equivalent Transfer Function for a Variable-speed Wind Turbine in Power System Dynamic Simulations', International Journal of Distributed Energy Resources, Vol. 1, No. 2, pp. 111-133.
[5] 'Dynamic Modelling of Doubly-Fed Induction Machine Wind-Generators' (2003), DIgSILENT GmbH Technical Documentation, available at http://www.digsilent.de.
[6] J. Soens, J. Driesen, R.
Belmans (2004), 'Wind Turbine Modelling Approaches for Dynamic Power System Simulations', IEEE Young Researchers Symposium in Electrical Power Engineering - Intelligent Energy Conversion, (CD-ROM), Delft, The Netherlands.
[7] J. Soens, V. Van Thong, J. Driesen, R. Belmans (2003), 'Modelling Wind Turbine Generators for Power System Simulations', European Wind Energy Conference EWEC.
[8] S. Skogestad, I. Postlethwaite (1996), 'Multivariable Feedback Control: Analysis and Design', John Wiley and Sons, pp. 72-75.
[9] J. C. Doyle, K. Glover, P. P. Khargonekar, B. A. Francis (1989), 'State-Space Solutions to Standard H2 and H∞ Control Problems', IEEE Transactions on Automatic Control, Vol. 34, No. 8.
[10] D.-W. Gu, P. Hr. Petkov, M. M. Konstantinov (2005), 'Robust Control Design with MATLAB®', Springer-Verlag London Limited, pp. 27-29.
[11] P. Gahinet, A. Nemirovski, A. J. Laub, M. Chilali (1995), 'LMI Control Toolbox for Use with MATLAB®', User's Guide Version 1, The MathWorks, Inc.
Balancing Load in a Computational Grid Applying Adaptive, Intelligent Colonies of Ants
Mohsen Amini Salehi
Department of Software Engineering, Faculty of Engineering, Islamic Azad University, Mashhad Branch, Iran
E-mail: Amini@mshdiau.ac.ir
Hossein Deldari
Department of Software Engineering, Faculty of Engineering, Ferdowsi University of Mashhad, Iran
E-mail: hd@um.ac.ir
Bahare Mokarram Dorri
Management and Planning Organisation of Khorasan, Mashhad, Iran
E-mail: mokarram@mpo-kh.ir
Keywords: grid computing, load balancing, ant colony, Agent-based Resource Management System (ARMS)
Received: July 1, 2007
Load balancing is substantial when developing parallel and distributed computing applications. The emergence of computational grids extends the necessity of this problem. Ant colony is a meta-heuristic method that can be instrumental for grid load balancing. This paper presents an ecosystem of adaptive fuzzy ants.
The ants in this environment can create new ones and may also commit suicide depending on existing conditions. A new concept called Ant level load balancing is presented here for improving the performance of the mechanism. A performance evaluation model is also derived. Then theoretical analyses, which are supported by experimental results, prove that this new mechanism surpasses its predecessor.
Povzetek: Metoda inteligentnih mravelj je uporabljena na problemu razporejanju bremen.
1 Introduction
A computational grid is a hardware and software infrastructure which provides consistent, pervasive and inexpensive access to high end computational capacity. An ideal grid environment should provide access to all the available resources seamlessly and fairly.
The resource manager is an important infrastructural component of a grid computing environment. Its overall aim is to efficiently schedule applications needing utilization of available resources in the grid environment. A grid resource manager provides a mechanism for grid applications to discover and utilize resources in the grid environment. Resource discovery and advertisement offer complementary functions. The discovery is initiated by a grid application to find suitable resources within the grid. Advertisement is initiated by a resource in search of a suitable application that can utilize it. A matchmaker is a grid middleware component which tries to match applications and resources. A matchmaker may be implemented in centralized or distributed ways. As the grid is inherently dynamic and has no boundary [1], the distributed approaches usually show better results [2] and are also more scalable. A good matchmaker (broker) should uniformly distribute the requests along the grid resources with the aid of load balancing methods.
As mentioned in [1], the grid is a highly dynamic environment for which there is no unique administration. Therefore, the grid middleware should compensate for the lack of unique administration.
ARMS is an agent-based resource manager infrastructure for the grid [3, 4]. In ARMS, each agent can act simultaneously as a resource questioner, resource provider, and matchmaker. Details of the design and implementation of ARMS can be found in [2]. In this work, we use ARMS as the experimental platform.
Cosy is a job scheduler which supports job scheduling as well as advanced reservations [5]. It is integrated into ARMS agents to perform global grid management [5]; Cosy needs a load balancer to better utilize available resources. This load balancer is introduced in Section 3.
The rest of the paper is organized as follows: Section 2 introduces the load balancing approaches for grid resource management. In Section 3, ant colony optimization and self-organizing mechanisms for load balancing are discussed. Section 4 describes the proposed mechanism. Performance metrics and simulation results are included in Section 5. Finally, the conclusion of the article is presented as well as future work related to this research.
2 Load balancing
Load balancing algorithms are essentially designed to spread the resources' load equally, thus maximizing their utilization while minimizing the total task execution time [7]. This is crucial in a computational grid where the most pressing issue is to fairly assign jobs to resources. Thus, the difference between the heaviest and the lightest resource load is minimized. A flexible load sharing algorithm is required to be general, adaptable, stable, scalable, fault tolerant, transparent to the application, and to induce minimum overhead to the system [8]. The properties listed above are interdependent. For example, a lengthy delay in processing and communication can affect the algorithm overhead significantly, result in instability and indicate that the algorithm is not scalable.
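As a toy illustration of the balancing objective stated above (minimizing the gap between the heaviest and the lightest resource load), consider the following Python sketch; the node names and load values are invented for the example:

```python
# Hypothetical node loads (made-up values for illustration only).
loads = {"n1": 90, "n2": 10, "n3": 50}

def imbalance(loads):
    # The quantity a load balancer tries to minimize: the difference
    # between the heaviest and the lightest resource load.
    return max(loads.values()) - min(loads.values())

def balance_pair(loads):
    # Move load from the heaviest to the lightest node, splitting it evenly
    # (integer split; the heavier node keeps the odd remainder).
    hi = max(loads, key=loads.get)
    lo = min(loads, key=loads.get)
    total = loads[hi] + loads[lo]
    loads[hi], loads[lo] = total - total // 2, total // 2
    return loads

assert imbalance(loads) == 80
balance_pair(loads)          # n1 and n2 are averaged to 50 each
assert imbalance(loads) == 0
```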
The load balancing process can be defined by three rules: the location, distribution and selection rules [7]. The location rule determines which resource domain will be included in the balancing operation. The domain may be local, i.e. inside the node, or global, i.e. between different nodes. The distribution rule establishes the redistribution of the workload among available resources in the domain, while the selection rule decides whether the load balancing operation can be performed preemptively or not [7].
2.1 Classification of load balancing mechanisms
In general, load balancing mechanisms can be broadly categorized as centralized or decentralized, dynamic or static [10], and periodic or non-periodic [11]. In a centralized algorithm, there is a central scheduler which gathers all load information from the nodes and makes appropriate decisions. However, this approach is not scalable for a vast environment like the grid. In decentralized models, there is usually no specific node known as a server or collector. Instead, all nodes have information about some or all other nodes. This leads to a huge overhead in communication. Furthermore, this information is not very reliable because of the drastic load variation in the grid and the need to update frequently. Static algorithms are not affected by the system state, as their behaviour is predetermined. On the other hand, dynamic algorithms make decisions according to the system state. The state refers to certain types of information, such as the number of jobs waiting in the ready queue, the current job arrival rate, etc. [12]. Dynamic algorithms tend to have better performance than static ones [13]. Some dynamic load balancing algorithms are adaptive; in other words, their policies are modifiable as the system state changes. Via this approach, methods adjust their activities based on system feedback [13].
3 Related works

Swarm intelligence [14] is inspired by the behaviour of insects such as wasps, ants or honey bees. Ants, for example, have little intelligence relative to their hostile and dynamic environment [15]. Nevertheless, they perform remarkable activities, such as organizing their dead in cemeteries and foraging for food. In fact, ants communicate indirectly through the chemical substances they deposit [16]. This ability of ants has been applied to several heuristic problems, such as optimal routing in telecommunication networks [15], coordinating robots, sorting [17], and especially load balancing [6, 9, 18, 19]. Messor [20] is the main contribution in the load balancing context.

3.1 Messor

Messor is a grid computing system implemented on top of the Anthill framework [18]. Ants in this system can be in the Search-Max or the Search-Min state. In the Search-Max state, an ant wanders around randomly until it finds an overloaded node. The ant then switches to the Search-Min state to find an underloaded node. After these states, the ant balances the overloaded and the underloaded node that it found. Once an ant encounters a node, it retains information about the nodes visited. Other ants which visit this node can use this information to perform more efficiently. However, given the dynamism of the grid, this information does not stay reliable for long and may even cause erroneous decision-making by other ants.

3.2 Self-organizing agents for grid load balancing

In [6], J. Cao et al. propose a self-organizing load balancing mechanism using ants in ARMS. As this mechanism is simple but inefficient, we call it the "seminal approach". The main purpose of this study is the optimization of this seminal mechanism. Thus, we propose a modified mechanism based on a swarm of intelligent ants that uniformly balances the load throughout the grid.
In this mechanism, an ant always wanders '2m + 1' steps to finally balance a pair of nodes, one overloaded and one underloaded. As stated in [6], the efficiency of the mechanism depends strongly on the number of cooperating ants (n) as well as on their step count (m). If a loop includes only a few steps, the ant will initiate the load balancing process frequently, while if the ant starts with a larger m, the frequency of load balancing decreases. This implies that the ant's step count should be determined according to the system load. However, with this method, the number of ants and the number of their steps are defined by the user and do not change during the load balancing process. In fact, having the user define the number of ants and their wandering steps is impractical in an environment such as the grid, where users have no background knowledge and the ultimate goal is to offer a transparent, powerful computing service to end users. Considering the above shortcomings, we propose a new mechanism that adapts to environmental conditions and produces better results. The next section describes the proposed method.

4 Proposed method

In the new mechanism, we propose an echo system of intelligent ants which react proportionally to their conditions. Interactions between these intelligent, autonomous ants result in load balancing throughout the grid. In this echo system, ants are created on demand and achieve load balancing during their adaptive lives. They may bear offspring when they sense that the system is drastically unbalanced, and commit suicide when they detect equilibrium in the environment. These ants examine every node visited during their steps and record node specifications for future decision making. Moreover, every ant in the new mechanism hops 'm' steps (the value of 'm' is determined adaptively) instead of '2m + 1'.
At the end of the 'm' steps, 'k' overloaded nodes are equalized with 'k' underloaded nodes, in contrast to one overloaded node with one underloaded node in the previous method. This results in earlier convergence with fewer ants and less communication overhead. In the next sections, the proposed method is described in more detail.

4.1 Creating ants

If a node realizes that it is overloaded, it can create a new ant taking only a few steps, to balance the load as quickly as possible. Actually, as noted in [2], neighbouring agents in ARMS exchange their load status periodically. If a node's load exceeds the average of its neighbours for several periods of time, and the node has not been visited by any ant during this time, then the node itself creates a new ant to balance its load over a wider area. An agent can estimate load in several ways to determine whether a node is overloaded. For the sake of comparison with similar methods, the number of jobs waiting in a node is taken as the criterion for load measurement.

4.2 Decision-making

Every ant is allocated a memory space in which it records specifications of the environment while it wanders. The memory space is divided into an underloaded list (Min List) and an overloaded list (Max List). In the former, the ant saves the specifications of the underloaded nodes visited; in the latter, those of the overloaded nodes visited. At every step, the ant randomly selects one of the current node's neighbours.

4.2.1 Deciding algorithm

After entering a node, the ant first checks its memory to determine whether it has already visited this node. If not, the ant can assess the condition of the node, i.e. overloaded, underloaded or in equilibrium, using the knowledge it has acquired from the environment.
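A minimal sketch of the ant's per-step bookkeeping as we read it (the data layout and names are our own, and the fuzzy classification described next is stubbed here with plain thresholds, 1.2 and 0.8, which are illustrative assumptions):

```python
import random

class Ant:
    def __init__(self, steps):
        self.remaining_steps = steps
        self.visited = set()  # nodes already seen by this ant
        self.min_list = {}    # node -> load, underloaded nodes
        self.max_list = {}    # node -> load, overloaded nodes
        self.load_sum = 0     # for the average load the ant keeps
        self.load_count = 0

    def classify(self, load):
        """Stub for the fuzzy decision: compare against the average
        load this ant has observed so far (thresholds are made up)."""
        avg = self.load_sum / self.load_count
        if load > 1.2 * avg:
            return "Max"
        if load < 0.8 * avg:
            return "Min"
        return "Avg"

    def step(self, node, load, neighbours):
        """Visit a node, record it if unseen, then hop to a random neighbour."""
        if node not in self.visited:
            self.visited.add(node)
            self.load_sum += load
            self.load_count += 1
            state = self.classify(load)
            if state == "Max":
                self.max_list[node] = load
            elif state == "Min":
                self.min_list[node] = load
        self.remaining_steps -= 1
        return random.choice(neighbours)
```

After `remaining_steps` reaches zero, the recorded Max and Min lists are what the final k-to-k balancing step operates on.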
As the load quantity of a node is a linguistic variable and the state of the node is determined relative to system conditions, decision making is performed adaptively by applying fuzzy logic [21, 22]. To make a decision, the ant feeds the node's current workload and the remaining steps as two inputs into the fuzzy inference system. The ant then determines the state of the node, i.e. Max, Avg or Min. The total average of the loads visited is kept as the ant's internal knowledge about the environment; the ant uses it to build the membership functions of the node's workload, as shown in Figure 1.a. The membership functions of Remain steps, and of Decide as the output, are based on a threshold and are presented in Figures 1.b and 1.c.

Figure 1: Membership functions of fuzzy sets: a) the node's load, b) remaining steps, c) decide.

The inference system can be expressed as the following relation:

R : Load<L, ML, MH, H> * RmStep<F, A, V> -> Decide<Min, Avg, Max>   (1)

where L, ML, MH, H in Figure 1.a indicate Low, Medium Low, Medium High and High, respectively, and F, A, V in Figure 1.b indicate Few, Average and Very. Thus, the ant can make a proper decision. If the result is "Max" or "Min", the node's specifications are added to the ant's Max-list or Min-list, respectively. Subsequently, the corresponding counter for Max, Min or Avg is increased by one. These counters also represent the ant's knowledge about the environment; how this knowledge is employed is explained in the next sections.

4.2.2 Ant level load balancing

In the subtle behaviour of ants and their interactions, we can see that when two ants face each other, they stop for a moment and touch antennae, probably to recognize members of their own colony. This is what inspired the first use of ant level load balancing. With respect to the system structure, it is probable that two or more ants meet each other on the same node.
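Returning to the classification step of Section 4.2.1, the Load * RmStep -> Decide inference can be sketched with ordinary triangular membership functions. The shapes, break-points and rule weighting below are illustrative guesses of ours, not the paper's exact Figure 1:

```python
def tri(x, a, b, c):
    """Triangular membership function rising from a, peaking at b, falling to c."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x < b else (c - x) / (c - b)

def decide(load, rm_step, total_avg, max_load, threshold):
    """Fuzzy-style decision: grade the node's load against the ant's
    observed average, weight by how many steps remain, and pick the
    strongest conclusion (Min / Avg / Max)."""
    lo = tri(load, -total_avg, 0, total_avg)            # low load
    mid = tri(load, 0, total_avg, max_load)             # medium load
    hi = tri(load, total_avg, max_load, 2 * max_load)   # high load
    few = max(0.0, 1 - rm_step / threshold)             # few steps remain
    # Toy rule base: high load -> Max, low load -> Min, otherwise Avg;
    # few remaining steps reinforce committing to Max/Min now.
    scores = {"Min": lo * (1 + few), "Avg": mid, "Max": hi * (1 + few)}
    return max(scores, key=scores.get)
```

For instance, with an observed average of 50, a maximum of 100 and a step threshold of 10, a node carrying load 90 with one step left is classified "Max", while a node carrying load 10 is classified "Min".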
As mentioned earlier, each of these ants may have gathered the specifications of some overloaded and underloaded nodes. The amount of information is not necessarily the same for each ant: for example, one ant holds the specifications of four overloaded and two underloaded nodes, while the other holds two overloaded and six underloaded nodes at the same position. In this situation, the ants extend their knowledge by exchanging it. We call this "ant level load balancing". In the aforementioned example, after ant level load balancing between the two co-positioned ants, each ant holds the specifications of three overloaded and four underloaded nodes in its memory. This leads to better performance in the last step, when the ants want to balance the load of 'k' overloaded with 'k' underloaded nodes. This operation can be applied to more than two ants. Actually, when two or more co-positioned ants exchange their knowledge, they extend their movement radius to a bigger domain, thus improving their awareness of the environment. Another idea is taken from the pheromone deposits ants leave while travelling, which other ants use to pursue them. This is applied in most ant colony optimization problems [23, 24]. There is, however, a subtle difference between these two applications. In the former, the information retained by the ant may become invalid over time. This problem can be solved by evaporation [23, 24]. Evaporation, however, is not applicable in some cases, e.g. in the grid, where load information varies frequently. In the latter application, on the other hand, the knowledge exchanged is completely reliable.

4.2.3 How new ants are created

In special conditions, particularly when its life span is long, the ant's memory may fill up even though the ant is still encountering nodes which are overloaded or underloaded. In this situation, if a node is overloaded, the ant bears a new ant with a predefined number of steps.
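The ant level load balancing exchange of Section 4.2.2, in which two co-positioned ants pool and re-split their Max/Min knowledge, can be sketched as follows. The even-split policy is our reading of the example above (four plus two overloaded and two plus six underloaded nodes becoming three and four per ant); the data layout is ours:

```python
def exchange(ant_a, ant_b):
    """Pool two co-positioned ants' node lists and split them evenly,
    so both leave with the same view of the neighbourhood.
    Each ant is a dict with 'max' and 'min' lists of node ids."""
    for key in ("max", "min"):
        pooled = ant_a[key] + ant_b[key]
        half = len(pooled) // 2
        ant_a[key], ant_b[key] = pooled[:half], pooled[half:]

a = {"max": ["n1", "n2", "n3", "n4"], "min": ["n5", "n6"]}
b = {"max": ["n7", "n8"], "min": ["n9", "n10", "n11", "n12", "n13", "n14"]}
exchange(a, b)
print(len(a["max"]), len(a["min"]))  # 3 4, as in the example above
```

Unlike pheromone trails, nothing here needs to evaporate: the exchanged entries are first-hand observations, so they are as fresh as either ant's own memory.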
If it encounters an underloaded node, the ant immediately replaces the entry with the biggest load in its underloaded list with this node's specification. This results in better balancing performance and adaptability to the environment. Here, adaptability translates into automatically increasing the number of ants whenever there is an abundance of overloaded nodes.

4.3 Load balancing, starting a new itineration

When its journey ends, the ant has to start a balancing operation between the overloaded (Max) and underloaded (Min) elements gathered during its roaming. At this stage, the numbers of elements in the Max-list and the Min-list are close to each other (because of ant level load balancing). To achieve load balancing, the ant recommends the underloaded nodes to the overloaded nodes and vice versa. In this way, the load is dispersed equally among the underloaded and overloaded nodes. After load balancing, the ant should reinitialize itself to begin a new itineration. One of the fields that must be reinitialized is the ant's step count. However, as stated in the previous sections, the ant's step count (m) must be commensurate with system conditions [6]. Therefore, if most of the visited nodes were underloaded or in equilibrium, the ant should prolong its wandering, i.e. decrease the load balancing frequency, and vice versa. Doing this requires knowledge about the environment, based on the numbers of overloaded, underloaded and equilibrium nodes visited during the last itineration. Because of the power of fuzzy logic in adapting among several parameters of a problem [22], and since the step count (m) can be regarded as a linguistic variable, e.g. short, medium or long, it is rational to use fuzzy logic to determine the step count of the next itineration.
Actually, this is an adaptive fuzzy controller which determines the step count of the next itineration (NextM, for short) based on the numbers of overloaded, underloaded and equilibrium nodes visited, along with the step count of the last itineration (LastM). In other words, the numbers of overloaded, underloaded and equilibrium nodes encountered during the LastM steps indicate the recent condition of the environment, while LastM itself reports the lifetime history of the ant. The membership functions of the fuzzy sets are shown in Figure 2.

Figure 2: Membership functions of fuzzy sets: a) Last m, b) Next m, c) counts of Max, Min and Average nodes visited.

Here TL, L, M, H, TH denote Too Low, Low, Medium, High and Too High in Figures 2.a and 2.b, and L, M, H denote Low, Medium and High in Figure 2.c. This fuzzy system can be expressed as a relation and a corresponding function as follows:

R : MaxCount<L, M, H> * MinCount<L, M, H> * AvgCount<L, M, H> * LastM<TL, L, M, H, TH> -> NextM<TL, L, M, H, TH, Dead>   (2)

f(x) = \frac{\sum_{r=1}^{135} y^r \prod_{i=1}^{4} \mu_r(x_i)}{\sum_{r=1}^{135} \prod_{i=1}^{4} \mu_r(x_i)}   (3)

where x_i denotes the i-th input into the system, y^r is the centre of the specific membership function declared in rule r, and \mu_r(x_i) indicates the membership value of the i-th input in the membership functions of the r-th rule. This inference system has 4 inputs and 135 rules, as stated in (3). In this system, a large number of underloaded and, especially, equilibrium elements indicates an equilibrium state. Consequently, NextM should be prolonged, thus lowering the load balancing frequency. One can say that, as an ant's step count extends to extreme values, its effect tends to zero. Based on this premise, we can conclude that an ant with an overly long step count does not have any influence on the system balance; rather, it merely imposes its communication overhead on the system. In this situation, the ant must commit suicide.
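The defuzzification in (3) is a standard weighted average of rule centres. Stripped of the 135-rule table (which we do not reproduce), the computation looks like this sketch with a toy two-rule, one-input system:

```python
def defuzzify(rules, inputs):
    """Weighted-average (centre-of-gravity style) defuzzification, as in (3).
    Each rule is (centre, memberships), where memberships holds one function
    per input; a rule's firing strength is the product over its inputs."""
    num = den = 0.0
    for centre, memberships in rules:
        strength = 1.0
        for mu, x in zip(memberships, inputs):
            strength *= mu(x)
        num += centre * strength
        den += strength
    return num / den if den else 0.0

# Toy system: rule "low -> NextM 10" and rule "high -> NextM 50".
low = lambda x: max(0.0, 1 - x)   # membership of 'low'
high = lambda x: max(0.0, x)      # membership of 'high'
rules = [(10.0, [low]), (50.0, [high])]
print(defuzzify(rules, [0.25]))   # 20.0, i.e. 10*0.75 + 50*0.25
```

In the paper's controller the four inputs are MaxCount, MinCount, AvgCount and LastM, and the rule centres lie on the NextM axis, including the "Dead" region that terminates the ant.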
This is the last ring of the echo system: if NextM fires in the "Dead" membership function, the ant does not start any new itineration. The diagrams in Figure 3 exhibit the ant's behaviour in different environmental conditions. Figure 3.a shows the relation between LastM, the number of overloaded nodes visited and the output, while Figure 3.b illustrates the relation between LastM, the number of equilibrium nodes visited and the output.

Figure 3: Schematic view of the adaptive determination of the next itineration step count: a) LastM - MaxCount - output, b) LastM - AvgCount - output.

5 Performance evaluation

In this section, several common statistics are investigated which show the performance of the mechanism.

5.1 Efficiency

To prove that the new mechanism increases efficiency, it should be compared with the mechanism described in [4]. First, we introduce some of the most important criteria in load balancing. Let P be the number of agents and W_{pk} (p = 1, 2, ..., P) the workload of agent p at step k. The average workload is:

\bar{W}_k = \frac{1}{P} \sum_{p=1}^{P} W_{pk}   (4)

The mean square deviation of W_{pk}, describing the load balancing level of the system, is defined as:

L_k = \frac{1}{P} \sum_{p=1}^{P} (\bar{W}_k - W_{pk})^2   (5)

Finally, the system load balancing efficiency e_k is defined as:

e_k = \frac{L_0 - L_k}{C_k}   (6)

where e_k is the efficiency at step k and C_k is the total number of agent connections used to achieve the load balancing level L_k. To compare the efficiency of the two mechanisms, we should consider e_{knew} / e_{kTrad}. As L_0 indicates the load balancing level at the beginning of the load balancing process and is equal for both the new and the seminal mechanisms, we shall discuss the value of L_k. For the sake of simplicity, assume that every balanced node reaches the average workload \bar{W}_k after the balancing process and no longer requires balancing, i.e.
\bar{W}_k - W_{pk} = 0   (7)

On the other hand, after the k-th stage, if the memory space considered for overloaded and underloaded elements equals 'a' (a > 2), then ka elements have been balanced:

L_{knew} = \frac{1}{P} \sum_{p=1}^{P-ka} (\bar{W}_k - W_{pk})^2   (8)

while in the seminal approach we have:

L_{kTrad} = \frac{1}{P} \sum_{p=1}^{P-2k} (\bar{W}_k - W_{pk})^2   (9)

If we suppose that a > 2, we can conclude:

P - 2k > P - ka   (10)

After k stages, the difference in the number of balanced nodes between the two mechanisms is:

(P - 2k) - (P - ka) = k(a - 2)   (11)

Then:

L_{kTrad} = \frac{1}{P} \left( \sum_{p=1}^{P-ka} (\bar{W}_k - W_{pk})^2 + \sum_{p=P-ka+1}^{P-2k} (\bar{W}_k - W_{pk})^2 \right)   (12)

L_{knew} = \frac{1}{P} \sum_{p=1}^{P-ka} (\bar{W}_k - W_{pk})^2   (13)

L_{kTrad} > L_{knew} \Rightarrow \frac{L_{knew}}{L_{kTrad}} < 1   (14)

With respect to (14), and since each ant in the seminal mechanism wanders 2m + 1 steps instead of m, so that C_{kTrad} is roughly twice C_{knew}, we have:

\frac{e_{knew}}{e_{kTrad}} = \frac{2(L_0 - L_{knew})}{L_0 - L_{kTrad}} > 2   (15)

An unlimited increase of the ant's memory, however, leads to occupying too much bandwidth as well as increasing the processing time. Actually, there is a trade-off between the step count (S) and the memory allocated to each ant (a). If a<