Informatica 30 (2006) 365-372 365 A Study of Fairness of Information Distribution and Utilization in a Mobile Agent-Based Adaptive Information Service System Iftikhar Ahmed and Mohammad Jafar Sadiq, Department of Electrical & Computer Engineering Faculty of Engineering, International Islamic University Malaysia E-mail: a.iftikhar@iiu.edu.my Keywords: mobile agents, information systems, intelligent systems Received: July 19, 2005 The information handling capacity of wide area networks is ever increasing and becoming dynamic, thus posing a challenge for their optimum utilization. In such a dynamic environment it is very difficult to maintain a coherent picture of the information services. The current information services on the internet therefore need to be optimized. A novel technique proposed recently employs autonomous mobile agents in faded information field architecture (FIF) to facilitate information provision and servicing on wide area networks. The architecture is decentralized in nature whereby both the service provider and information seeker communicate through mobile agents. However, the degree of fairness of information distribution and access is still a problem to be addressed. This paper investigates the issue of fairness among information users for access to information in a networked environment employing mobile agent technology. A FIF was simulated employing a number of algorithms characterizing its behaviour with respect to fairness of information distribution by various service providers in the network. Simulations results indicate that in a given information system, both the service providers and the information seekers may be given equal access to the distribution and utilization of information provided a number of suitable algorithms are employed depending on the nature of information service available on the wide area network. Povzetek: Narejena je analiza porazdelitve v agentnem informacijskem storitvenem sistemu. 1 Introduction Information in today's business environment has become the prime ingredient of a successful business enterprise. The rapid advances in Information and Communication Technologies (ICT) have given new impetus to the way business is conducted on the internet with information being the key element of a successful business enterprise. The traditional paradigms of business, learning and other essential services are undergoing an evolution. The internet is a huge data repository that is expanding without a central authority. The number of worldwide internet users is predicted to exceed one billion by the end of 2005 [1]. Moreover 300 terabytes of information is published online every year [2]. As the performance demands of the internet and its usage increase exponentially, the emphasis is shifting from platform centric computing to network centric computing. Information systems have to be designed that are distributed, dynamic and have high assurance and fault tolerance to meet the heterogeneous demands of users. As ICT advances, the dynamics of e-commerce tend to be more data intensive and complex. The traditional business to customer paradigm is encompassing the business to business transactions as well. Companies have to comprehend the trends and demands of the users quickly in order to survive in the competitive environment of today. The expectations of users and customers have soared high and they seek a flawless any time any where service syndrome. The traditional information services based on the clientserver model therefore cannot cope with the rising demand placed by the complexities of data intensive heterogeneous computing. It is therefore imperative to employ new techniques for optimum and accurate information retrieval on the web. Various techniques being investigated for this purpose [ 3-5 ] mostly tend to improve upon the current method of information retrieval, i.e., search engines Information fading based on a demand-oriented service architecture is one technique that may bring about a conceptual change in the information search method in vogue [6], in addition to Semantic web [7]. The architecture balances the cost of information provision and utilization on the network by employing push and pull mobile agents to service user requests and conducts the business of information provision and utilization on the network. In the faded information field (FIF) the service provider distributes the most demanded and popular information contents closer to its vicinity on various nodes in the network. The volume of distributed information on a node is inversely proportional to the distance of information storing node 366 Informatica 30 (2006) 365-372 I. Ahmed et al. from the service provider. Thus the FIF permits the fairness of information distribution to unspecified users with equal access time. A number of attributes of FIF architecture have been investigated recently [8-11]. This paper will analyse the fairness of information access to various users in a FIF information system. The paper is organized as follows; Information service system performance requirements are discussed in section 2, followed by a description of FIF architecture. Mobile agents will be surveyed in section 4. The attribute of fairness of information distribution and utilization will be discussed in section 4. The paper will be concluded in section 5. 2 Information service system performance requirements The wide area network services are gradually creeping into our daily lives with the number of users of these services constantly on the rise. The information systems now involve a constantly changing environment where stringent performance demands are placed on the network. The businesses are taking advantage of the wide area network facility to offer bargains and put various items on sales and special promotions resulting in the users having much more flexibility and choice available making the process of information provision and utilization a complex matter to deal with on the network. It is thus imperative for an information system to respond to both user needs and service provider (SP) requirements to respond in time to these needs in a rapidly changing environment. The main desirable attributes of such a system therefore should be flexibility, reliability and quick reaction time. Only a system with these properties can successfully satisfy both users and SPs in a dynamic network environment. The traditional data retrieval methodologies focused on the optimization of digging in huge data base to satisfy a search request. However, the FIF employs the concept of autonomous provision and utilization of information based on mobile agents. The amount of information to be faded away from SP is a function of network conditions like congestion, popularity of information contents and any other criteria programmed into mobile agents [6]. 3 The FIF architecture The information in a FIF is distributed on various nodes in the system. The information pattern in the field is analogous to the electromagnetic radiation from an antenna, high in intensity near the antenna and low in intensity away from the antenna. The SP fades away the information contents to the nodes in the field with the richness in information contents inversely proportional to the distance of the node from the SP through Push mobile agents (MAs). The user on the other hand sends out Pull MAs seeking the desired information [6, 12]. The general schematic of FIF architecture is depicted in figure 1. - FIF -► Information Fading Figure 1: The layout of FIF architecture. 3.1 FIF System Components The system essentially consists of logically connected nodes through which users and service providers correspond. Mobile agents are used by both parties to acquire and provide information respectively, under evolving/changing situation. The mobile agents (MA) generated by service providers are termed as push mobile agents (Push MAs). Push MAs carry out the function of autonomous coordination and negotiation with other nodes for information fading according to the network traffic status and the level of importance attached to the particular information contents. The level of importance of particular information content is based on its popularity, determined from a high hit rate of query by the information seeker on the network. The pull mobile agents (Pull MAs) are generated by users and they autonomously navigate in search of the required information on the network nodes in a step-by-step fashion. Once the required information is located, these agents report back to the information-seeking source. The push and pull MAs have no direct correspondence with each other. The third important subsystem of a FIF is the node itself. It is a platform for both storage of information and program execution. It monitors the local information-based system conditions and autonomously makes decisions for allocation requests by the SP. It determines the upper directions to the information service in addition to response times on its upper directions. The upper directions or stages of information eventually point to the SP node whereas the lower directions to the information service lead to the outer nodes away from the SP. Each subsystem is autonomous in terms of control to execute its operations and coordination with other nodes under evolving network conditions. 3.2 Communication Format in the FIF The conventional communications techniques cannot cope with the evolving conditions in a heterogeneous network environment where the state of nodes, the status of the SPs and the stability of connections are highly unpredictable as the user demand to access the A STUDY OF FAIRNESS OF INFORMATION... Informatica 30 (2006) 365-372 367 information is dynamic in nature. FIF therefore employs a different technique referred to as content Code communication technique [6]. The information contents in a FIF are uniquely defined by their content codes (CCs). These are further elaborately specified by their characteristic codes (CHs). Push MAs are sent out in the FIF by SPs specifying Content Codes (CCs) of information to the nodes using the message format as depicted in the FIF communication format example in Figure 3. The information about a university is structured according to the degree of importance in this example. For instance, SP1 specifies the university name followed by CH1 indicating the location of university with respect to a particular country and city. CH2 depicts the programs offered by the university and so on. The message format components thus lead to the breakdown of the principal source of information available on the SP into its uniquely defined characteristic codes (CHs). Push MAs are sent out in the FIF by SPs specifying Content Codes (CCs) of information to the nodes using the message format as shown in Figure 3. The selection of information storage/allocation is autonomously carried out by the nodes based on CHs. Similarly the Pull MAs sent out by the users search for the required information based on CHs. S C C C C P1 H1 H2 H3 H4 Universit Locati Progra Fee Start The current distributed network environment is based on the traditional client-server paradigm. In the case of mobile agents employed in a network, the service provision and utilization can be distributed in nature and is dynamically configured according to the changing network performance metrics like congestion and user demand for service provision [7]. The two network environments are depicted in Figure 1 both for clientserver and mobile agents. Mobile agents are typically suited to applications requiring structuring and coordinating wide area network and distributed services involving a large number of remote real time interactions [17]. Figure 2: The message format in FIF architecture. 4 A survey of mobile agents A distributed environment is most suitable for the employment of mobile agents. The details of such applications in a distributed environment can be found elsewhere [13-15]. Only, a brief introduction of mobile agents and their role in a network based information system will be discussed in this paper. The term mobile agent is often context-dependent and has two separate and distinct concepts: mobility and agency. The term agency implies having the same characteristics as that of an agent. These are self-contained and identifiable computer programs that can move within the network from node to node and act on behalf of a user or other entity. These can halt execution from a host without human interruption [16]. Response Mobile agent communication Figure 3: A mobile agent is a network aware program that can conserve bandwidth in a complex adaptive information system environment. 5 Fairness of information distribution and utilization simulation The purpose of Information fading in a network twofold: is Congestion Reduction - The server hosting a popular site is highly likely to get congested. The information provider could employ additional machines (child nodes) to which the original server would distribute or fade its information contents. A pull MA entering the network of the parent server could then access any of the child nodes, which would service the agent's request for the desired information and therefore reduce the service request traffic load on the parent server (PS) in the network. Fairness of Access - Service providers (whether commercial or non-commercial) would like their information contents to be easily accessible by the users on the network. The established and renowned service providers stand out compared to new competitors as search engines already index their search details. Not only the new 368 Informatica 30 (2006) 365-372 I. Ahmed et al. service providers suffer a long delay in getting their nomenclature indexed by a search engine, but they are the tail enders on the search engine index. This particular scheme of indexing by search engine creates disparity among Service providers on the network. Faded information field architecture therefore, can aid in creating fairness among various contenders of information provision and utilization on the network. For congestion reduction, the service provider must use its own resources to create the field size since the congestion reduction benefits only the service provider. However, in a typical business application like advertising for example, each information provider cannot be expected to have the resources to create a field for itself alone. It is therefore possible that each server participates in the faded information field as being part of the field in addition to being a source of information. Thus, no one is the owner of the faded information field resources, and there is neither prohibition nor delay in becoming a member of the field. For the sake of localization (prioritising a nearby service provider over a distant one), information should be faded in a geographic area as close as possible around the server. The faded information distance away from the SP was approximated using propagation delays in this simulation. For the sake of fairness, the field size of all SPs should be equal. This is taken to be the number of nodes that constitute the field. Since each SP is autonomous in creating the field, there has to be an algorithm to ensure that each SP creates fields of the same size. The following are possible algorithms for an SP to autonomously determine its faded information field given that all SPs are aware of the nodes available for them to create FIFs. (In essence, these algorithms compute the FIF size with the given variables as described next): • Sorting. Each SP orders the list of available FIF nodes into a list sorted by delay (as a substitute for physical distance). It can thus determine how "far" it needs to send its push-MAs to create the standard field size, and imposes a travel restriction beyond the envisaged perimeter of its field. The push-MAs will be required to keep updating the field until they reach the travel restriction set by the SP (which can be in the form of travel time). Alternatively the push-MAs can be multicast to the required nodes in the field. The sorting technique is inherently resource intensive and it must be re-employed to account for cost update in the WAN as and when it occurs. However, it is expected that all SPs will have the same number of nodes in the field, since each one selects the same number of nodes from the sorted list. • Step Size. The SP determines the closest and furthest distances from the SP to the nodes, and then divides that distance by the number of nodes to get the average 'hop' distance that a push-MA might take to travel from one node to the next. The inter-nodal step distance D is computed by using the following equation: D = (max (d) - min (d)) / (N - 1) * (R / 100 *N) + min (d) Where D = step distance d = distance from SP to node N = number of nodes R = required field size as a percent of total available nodes. The approximate distance that the SP envisages its push-MAs to cover in its FIF and reach a particular node is determined by the product of step distance and the nth node (DxN). The SP will now send push-MAs that traverse the required distance. Alternatively, the server would determine which nodes fall in the required distance and multicast to those nodes. The step size must be recomputed every time there is an update in the system, but the major computation is finding the maximum and minimum distances, which is much faster than sorting. • Averaged Step Size. In order to account for skewing of SPs' distances, an average distance is computed for each server that is used to determine the step size instead. In this method, the determination of maximum and minimum distances is replaced by finding the total distance to all the nodes. The algorithm was simulated using the following equation: D= (£d / (N - 1) - min (d)) / (N - 1) * 2 * (R / 100.0 * N) + min (d) Where D represents the inter-nodal distance, N is the total number of nodes in the FIF and d is the distance of the node currently being visited by the MA from SP. • Modified Averaged Step Size. The averaged step size algorithm still gets affected by the skewing of nodes. The algorithm was modified by computing the average step size of all the costs excluding the minimum and maximum costs of system information updates. In this method, the total has to be computed and the maximum and minimum has to be determined. The maximum and minimum are subtracted from the total. The average step size for push MA visits is computed using the following equation given below. The variables were incremented as the simulations iterations were performed to compute the step distance D of push MA visit to determine the FIF size. D = ((Id - Nmi„ * min (d) - Nmax * max (d)) / (N - 1 - Nmm - Nmax) - min2 (d)) / (N - 1 -Nmm - Nmax) * 2 * (R/100.0 * N - Nmm) + min2 (d) A STUDY OF FAIRNESS OF INFORMATION... Informatica 30 (2006) 365-372 369 Where Nmin = number of nodes at the minimum distance Nmax = number of nodes at the maximum distance and min2 (d) = distance of the second closest node The performance of each algorithm was tested by simulation using a wide area network with randomly distributed servers connected to randomly distributed gateways. The simulation involved 50 servers and 200 routers/gateways. Djikstra's algorithm [18] was used to determine the server-to-server costs. Under a standard policy each server was allowed to maintain a field size of 10 nodes. Figure 4 demonstrates the field size distribution for this wide area network. - Step Size .....Mod. Average • Averaged ■ sorting 45 40 35 CO o 30 £ e S 25 f o 20 o ■Q E 15 3 z 10 5 0 Iii......^ 5 10 Size of Field 15 normal step size results, as expected. In the second simulation run, the sorting algorithm had an average field size of 10.2 nodes per information provider, the step size algorithm had 10.3, the averaged step size had 10.5 and the modified average had 10.8 nodes respectively. Also, the averaged step size algorithm gave a fair distribution of field size, with most of the servers within a certain range. -Step Size .....Mod. Average — Averaged ■ sorting 45 40 35 £ a 30