Informatica 32 (2008) 39-49

Semantic Grid Platform in Support of Engineering Virtual Organisations

Matevž Dolenc, Robert Klinc and Žiga Turk
University of Ljubljana, Faculty of Civil and Geodetic Engineering, Jamova 2, SI-1000 Ljubljana, Slovenia
E-mail: {mdolenc, zturk, rklinc}@itc.fgg.uni-lj.si

Peter Katranuschkov
Technical University of Dresden, Mommsenstr. 13, 01062 Dresden, Germany
E-mail: peter.katranuschkov@cib.bau.tu-dresden.de

Krzysztof Kurowski
Poznan Supercomputing and Networking Center, Noskowskiego 10 Street, 61-704 Poznan, Poland
E-mail: kikas@man.poznan.pl

Keywords: virtual organisation, interoperability, service oriented architecture, ontologies, semantic grid, engineering software, InteliGrid

Received: January 15, 2008

The EU project InteliGrid (2004-2007) combined and extended the state-of-the-art research and technologies in the areas of semantic interoperability, virtual organisations and grid technology to provide diverse engineering industries with a collaboration platform for flexible, secure, robust, interoperable, pay-per-demand access to information, communication and processing infrastructure. This paper describes the system architecture and the technical aspects of the developed platform as well as the key components it offers, including services for document management, access to product model servers and utilisation of high-performance computing infrastructure.

Summary: A semantic grid platform in support of engineering virtual organisations is presented.

1 Introduction

Grids are generally known as infrastructures for high performance computing. However, the original idea behind grid computing was to support collaborative problem solving in virtual organisations (VO).
This coincides with the vision of the EU project InteliGrid (2004-2007): to provide complex industries that have challenging integration and interoperability needs (such as automotive, aerospace and construction) with flexible, secure, robust, ambient accessible, interoperable, pay-per-demand access to (1) information, (2) communication and (3) processing infrastructure.

The isolation and lack of interoperability of software applications - identified in the late 1980s as the islands of automation problem [1] - is well known in various industries. The term was largely used during the 1980s to describe how rapidly developing automation systems were at first unable to communicate easily with each other. Industrial communication protocols, network technologies, and system integration helped to improve this situation. A number of European projects such as ATLAS [2], COMBI [3], COMBINE [4], ToCEE [5], ISTforCE [6], OSMOS [7] and others have shown, both theoretically and through developed prototypes, that interoperability based on product data technology is achievable and can provide many benefits to the industry. Nevertheless, practical solutions require comprehensive environments that coherently incorporate all aspects of interoperability. This is why such solutions are rarely used in the industry despite all research and development efforts. However, most of the necessary technology for solving this problem, particularly the standards and tools for interoperability, either already exists or is emerging in ongoing grid and web services developments. An overview of past computer integrated construction research is presented by Boddy et al. [8].

The industry still communicates mostly by using drawings, files, project web sites and related ASP services. Semantic Web services built around a standardised product model have been partially demonstrated in research projects (e.g. ISTforCE, OSMOS) but their scalability in large complex environments has not been tested.
Semantic interoperability of software and information systems belonging to members of the virtual organisation is essential for their efficient collaboration. Grids provide the robustness but need to be made aware of the business concepts that the VO is addressing. The grid environment itself needs to commit to an ontology of the products and processes, thereby evolving into an ontology committed semantic grid environment. To do so, generic business-object-aware extensions to grid middleware are needed, implemented in a way that allows grids to commit to an arbitrary ontology, and these extensions need to be propagated to toolkits that allow hardware and software to be integrated into the grid. These were the challenges in the InteliGrid project.

Key requirements for the InteliGrid platform were gathered through an extensive requirements elicitation and analysis process and were used as a baseline in the design of the high-level InteliGrid architecture [9].

Figure 1: Generic virtual organisation end-user scenario actions.

Based on the work done by the OSMOS project, InteliGrid internal requirements analysis and feedback from various public demonstrations, as well as various formal and informal discussions with different members of the engineering community, the top requirements can be summarised with the term 5S Grid:

- Security. Industry is eager to move to a ground-up secure environment [10]. InteliGrid addressed this by adopting Grid Security Infrastructure (GSI) and integrating the Role Based Access Control model (RBAC) [11] into the platform authorisation processes [12].
- Simplicity. The platform must work seamlessly with current client applications and operating systems and should not require end users to redefine their usual work processes.
- Stability & standards. The need for stable long-term specifications and (open) standards is well known [13].
The developed platform complies with such open standards, including WS-Resource Framework (WSRF) [14] and WS-Interoperability (WS-I) [15] for grid technology related developments, the RBAC model for VO security, etc.
- Scalable service oriented architecture (SOA). The service-oriented architecture [16] is a well-accepted and known system architecture. The InteliGrid project adopted the Open Grid Service Architecture OGSA [17] as a baseline, and developed the platform using an OGSA compliant grid middleware.
- Semantics. The platform must support rich, domain specific semantics [18]. The InteliGrid project addressed this issue by developing a set of domain specific ontologies [19].

A number of different use cases were considered while designing the InteliGrid platform, ranging from basic ones, such as joining a virtual organisation, to more advanced cases involving the use of semantic information [9]. The developed use cases were abstracted into the generic virtual organisation end user scenario presented in Fig. 1.

Starting from available technologies, industry practices and trends, the project aim was to create knowledge, infrastructure and toolkits that allow a broad transition of the industry towards semantic, model based and ontology committed collaboration based on the grid technology (rather than the web, which is the infrastructure technology of the SWOP project [20]).

The rest of the paper is organised as follows: Section 2 presents the background information on three key technologies (grid technology, semantic interoperability and virtual organisations), Section 3 describes the high-level system architecture of the platform as well as the developed tools and services, and finally Section 4 presents our conclusions and outlines proposed future research and development work.
2 Technology

The InteliGrid project addressed the challenge by successfully combining and extending the state-of-the-art research and technologies in three key areas: (a) semantic interoperability, (b) virtual organisations, and (c) grid technology (see Fig. 2) to provide a standards-based collection of ontology based services and grid middleware in support of dynamic virtual organisations as well as grid enabled engineering applications. It was recognized that if grid technology is to provide the underlying engineering interoperability and collaboration infrastructure for a complex engineering virtual organisation, it needs to support shared semantics.

Figure 2: The InteliGrid project addressed three key technology areas: grid technology, semantic interoperability and virtual organisations.

2.1 Grid technology

At its core, grid technology can be viewed as a generic enabling technology for distributed computing, based on an open set of standards and protocols that enable communication across heterogeneous, geographically dispersed environments. With grid computing, organisations can optimize computing and data resources, pool them for large capacity workloads, share them across networks and enable collaboration [21]. Foster [22] notes that the grid must be evaluated in terms of the applications, business value, and scientific results that it delivers, and not its architecture. It is based on hardware and software infrastructures which provide dependable, consistent, pervasive and inexpensive access to computing resources anywhere and anytime.
The term resource has evolved from covering only computing power and storage to covering a wide spectrum of concepts, including: physical resources (computation, communication, storage), informational resources (databases, archives, instruments), individuals (people and the expertise they represent), capabilities (software packages, brokering and scheduling services) and frameworks for access and control of these resources [23]. By using a grid for sharing resources, researchers and small enterprises can gain access to resources they could not otherwise afford. Research institutes, on the other hand, can leverage their investment in research facilities by making them available to many more scientists. An overview of grid technology in civil engineering (including different grid technology standards, middleware and specific challenges related to technology adoption within engineering industries) has been published by Dolenc et al. [24].

2.2 Semantic interoperability

The development of grids and the increased use of agent and service based technologies have a profound impact on the way data is exchanged on the Internet. An important feature of the new semantic based approaches is the separation of content from presentation, which makes the use and reuse of data easier. To achieve that kind of (semantic) interoperability, systems must be able to exchange data and information in such a way that the precise meaning of the data is not lost and is readily accessible, and that the data itself can be translated into a form that is understandable by almost any system. It is important that the meaning of the exchanged information is interpreted accurately. The benefits of semantic interoperability are numerous, but the most notable one is that it allows computers to process and reason about the data.
The state-of-the-art developments and corresponding standards that are developed and used for semantic web applications can be reused in grid computing environments with some modifications. Semantic interoperability and its content description standards are in particular about ontologies and their inherent rules. Whilst the content description standard addresses the general applicability in distributed environments (as in InteliGrid), an important aspect in ensuring semantic interoperability is extensibility, i.e. the content description standard is required to make an 'open world assumption'. That is, semantic concepts are not confined to a single file or scope. While a concept may be defined originally in a basic ontology, it can be extended and instantiated in another definition or exchange file. The ontologies for semantic interoperability are therefore designed mostly in a layered approach, allowing for vertical and horizontal extensions. This means that the ontology has to support abstraction layers (from high-level concepts, such as 'resource', to specific concepts, such as 'construction-site-meeting-memo'), as well as the possibility for horizontal extensions targeting different domain/application areas. These requirements are addressed in particular by ontology standards for the semantic web, which have therefore been gaining importance. Semantic interoperability issues in the context of information and communication technology as well as recent semantic web developments have been addressed by Veltman [25].

2.3 Virtual organisations

Client demands for one-of-a-kind products and services require a one-time collaboration of different organisations, which have to consolidate and synergise their dispersed competencies in order to deliver the desired product or service. Each organisation is usually involved in the delivery of one or more components of the requested product or service.
To deliver the complete product or service, organisations need to rely on each other for information completeness, as all product components are inter-related. Consequently, this has implications not only for the way information (related to the to-be-delivered product or service) is exchanged and shared, but also for the way in which secure, quick to set-up, transparent (to the end-user) and non-intrusive (to the normal ways of work of an individual/organisation) information and communication technology is used for this purpose. The virtual organisation is quickly becoming the preferred organisational form for one-of-a-kind settings that deliver one-of-a-kind products, and it typically goes through four distinct lifecycle stages (Fig. 3) [26]:

1. Identification/conception typically begins upon a specific (unique) client need for a product or service that a single organisation cannot deliver and serves as a business opportunity for a set of organisations which will combine competencies to deliver the product and/or service that the client needs.
2. Formation/configuration focuses on the establishment of the VO in terms of role definitions, definition of information flow mechanisms, identification of information exchange formats and modalities, interoperability of inter-organisational tools, shared resource and services definition and configuration, etc. According to Kazi and Hannus [27] one of the key ICT requirements in a VO environment is the capability of a quick set-up and configuration.
3. Operation/collaboration is the main stage of a typical VO where different VO tasks are carried out in parallel and/or in series based on task needs. Within this stage there is a significant degree of work taking place within a distributed (engineering) setting with the possibility of some partners leaving and others joining according to the need of the VO.
4. Termination/reconfiguration. When a VO consortium completes the delivery of the required product/service, it is terminated or reconfigured to form another VO (e.g. from a VO that develops a product to a VO that provides maintenance or service for that product). During this stage, it is very important to have proper mechanisms in place for archiving the data/information used and produced during the operation and collaboration stage.

Figure 3: Typical virtual organisation lifecycle [26].

3 InteliGrid platform

Grids were expected to be the solution to the "islands of computation" problem, but they were also expected to provide the interoperability and collaboration platform, provided that they include the key ingredient required for a complex engineering virtual organisation - support for shared semantics. Scientific research and technical development in the project have advanced the state-of-the-art in the field of semantic grids and in the field of virtual organisation interoperability. While the architecture, engineering and construction sector (including facility management) provided the testing environment for the project, all technologies developed are generic and applicable in any kind of virtual organisation environment.

A grid environment, in the context of the InteliGrid project, is an infrastructure for secure and coordinated resource-sharing among individuals and institutions with the aim to create dynamic virtual organisations. InteliGrid's hypothesis was that the meaning of the resources should be explicit, which leads to the issue of semantics and ontologies. The vision of the project was to create virtual dynamic organisations through secure and coordinated resource-sharing among individuals, institutions, and resources. Grid computing is an approach to distributed computing that spans not only locations but also organisations, machine architectures and software boundaries, with the aim of providing unlimited power, collaboration and information access to everyone connected to a grid.
3.1 Architecture

The InteliGrid architecture is based on the SOA concept as well as on lessons learned in earlier related projects [5, 6]. It is a high-level architecture that conforms to the key requirement of a generic approach, which can be verified by fitting the existing architectures of systems developed over the last decade into it. The architecture is also used to identify the components that exist and the components that need to be developed. It includes (see Fig. 4): (1) the layer representing the conceptually modelled real world domain that is being addressed (e.g. buildings, aeroplanes, organisations, engineers, processes etc.), (2) the conceptual layer containing things that exist in the form of standards, ideas, graphs, schemas, ontologies, notions etc., (3) the software layer comprising software that can be compiled, installed and executed, and that communicates with other software, and (4) the basic resource layer that includes the IT resources needed to run the applications and services defined in the layer above, e.g. hardware, firmware, software, etc.

Figure 4: Four main layers of the InteliGrid conceptual architecture and their relationships - it is important that all architectural layers commit to common ontologies.

InteliGrid is delivering a generic grid-based integration and semantic-web based interoperability platform for creating and managing networked virtual organisations. The developed service-oriented architecture is presented in Fig. 5, together with all its principal components and their interfaces.

Figure 5: InteliGrid high-level platform architecture. (Diagram tiers: semantic interoperability services; business service providers; grid middleware services; physical resources and core services. Labels in the diagram include RDF and OWL(-S) schemas, XML/SOAP over SSL/TLS with single sign on and delegation, OGSA-DAI, GridFTP, WebDAV and OpenDSP.)

From the security perspective, a virtual organisation is a collection of individuals and institutions, represented by various services and service consumers that are defined according to a set of resource and data sharing security policies and rules. Those resource sharing rules must be dynamically controlled and enforced throughout the whole virtual organisation environment. Thus, one of the most challenging tasks in the project was to create an appropriate security infrastructure covering all aspects of operating within a dynamically established virtual organisation. The InteliGrid platform enables both service consumers and service providers to manage and share their resources securely with any of the individual organisations participating in the virtual organisation.

Technically speaking, components are deployed either at some workstation or at a remote node on the grid. If they are on the grid, it is not important where they are deployed physically; the resource where they run will very likely be allocated dynamically. The grouping of the various services in Fig. 5 is presented according to the logic of the service and does not necessarily imply who uses which service.

There are four main types of components in the InteliGrid platform:

- Business specific applications. These applications are the consumers of the business service providers and are usually accessed through a web based portal interface, although desktop applications can also make use of different available services.
- Secure Web Services and WSRF compliant services.
They can be further divided into: (1) interoperability services (top tier) that simplify the interoperability among all services, and (2) domain and business specific services that perform some value added work. There are two kinds of business services: (a) collaboration services that provide file and structured data sharing and collaboration infrastructure, and (b) vertical business services that create new design or plan information.
- Middleware services. These services offer traditional grid middleware functionality extended to meet the particular needs of the InteliGrid platform. The services are based on mature grid technologies and their open source reference implementations.
- Other resources. The bottom layer consists of various physical infrastructure resources that suppliers offer to the platform. All these resources are available and can be accessed remotely through well-defined interfaces and secure communication protocols.

3.2 Tools and services

The developed InteliGrid platform includes different client side applications and tools as well as many server side components enabling potential end users to securely execute high-performance calculations, access heterogeneous data resources, and generally work in established virtual organisations. The description of all available applications and services is available on-line at http://www.InteliGrid.com/products.

Figure 6: InteliGrid testbed portal implementation is based on the GridSphere portal framework. (a) A single sign on entry point to all InteliGrid available online services - the authentication process is based on a defined RBAC model. (b) The platform requires that all business services and resources are registered - a portlet enables registration of several different types of services and resources.

The following sections provide an overview of the main InteliGrid products:
- Collaboration platform that provides a working testbed environment, including online access to available resources;
- Ontology services that, together with the developed ontologies, establish the conceptual and architectural backbone of a semantic grid infrastructure;
- Semantic document management service and tools that provide a major testing application for the ontology services;
- High-performance services that provide easy integration of existing engineering software (for example finite element codes, etc.);
- Product model services that provide access to and integration of engineering product models.

3.2.1 Collaboration platform

The InteliGrid collaboration platform for virtual organisations allows dynamic creation and management of virtual organisations in various engineering industry sectors. The platform is independent of the underlying computing technologies, data storage mechanisms or access protocols. The platform enables secure sharing and control of resources across dynamic and geographically dispersed organisations. It features a secure, semantic-based and robust grid middleware together with easy-to-use web based interfaces for information integration, communication and interoperability. The web interface is built on the GridSphere portal framework [28], which provides an open source portlet-based web portal. Built-in single sign on (Fig. 6a), authentication, authorization and control mechanisms allow end users such as engineers, designers, architects, etc. to create their own space within a virtual organisation to securely share relevant information and resources with other business partners and groups. The platform enables local administrators and IT staff to monitor the status and conditions of all provided services (Fig. 7a). It also allows virtual organisation managers to orchestrate and control access to different business service providers (Fig. 6b).
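As a rough illustration of the role based authorisation idea mentioned above, the following Python sketch maps users to roles and roles to permissions. It is not the InteliGrid implementation: the class name, the role names and the permission names are hypothetical and were chosen only to make the RBAC principle concrete.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class VirtualOrganisation:
    """Toy RBAC store for one VO (hypothetical, for illustration only)."""

    # role name -> set of permission names granted to that role
    role_permissions: dict[str, set[str]] = field(default_factory=dict)
    # user name -> set of role names assigned to that user
    user_roles: dict[str, set[str]] = field(default_factory=dict)

    def grant(self, role: str, permission: str) -> None:
        """Attach a permission to a role."""
        self.role_permissions.setdefault(role, set()).add(permission)

    def assign(self, user: str, role: str) -> None:
        """Give a user a role, e.g. when a partner joins the VO."""
        self.user_roles.setdefault(user, set()).add(role)

    def revoke(self, user: str, role: str) -> None:
        """Remove a role from a user, e.g. when a partner leaves the VO."""
        self.user_roles.get(user, set()).discard(role)

    def is_allowed(self, user: str, permission: str) -> bool:
        """A user is allowed if any of their roles carries the permission."""
        return any(
            permission in self.role_permissions.get(role, set())
            for role in self.user_roles.get(user, set())
        )


# Hypothetical roles and permissions, not taken from the project.
vo = VirtualOrganisation()
vo.grant("vo_manager", "register_service")
vo.grant("engineer", "read_document")
vo.assign("alice", "engineer")

print(vo.is_allowed("alice", "read_document"))     # True
print(vo.is_allowed("alice", "register_service"))  # False
```

Because permissions are attached to roles rather than to individual users, partners can join or leave the virtual organisation by changing role assignments only, without touching the permission definitions.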
Other actors such as virtual organisation project managers and grid administrators are able to establish and dynamically modify virtual organisations and their resources including users, services, databases and computation resources (Fig. 7b).

3.2.2 Ontology services

To fully utilise the advantages of the ontology-based approach, ontology services - providing convenient methods for management of ontology instances, i.e. semantic metadata about entities in the IT environment - need to be developed and made available through the platform service framework [29]. These services constitute the interoperability layer and make use of the grid middleware services that provide basic authorisation management and generic access to all grid resources. The ontology services provide generic and specific convenience methods to create, manipulate and manage the ontology instances of classes defined in the ontology framework. The developed ontologies and ontology services establish the conceptual and architectural backbone of the semantic grid infrastructure. They facilitate information management, improve the consistency of the distributed environment and make it less prone to errors. End-user applications can also strongly benefit from the added semantic value. The technology is well suited to support human-computer interactions because semantic models are more closely related to end user perceptions than the usually applied IT based schemas. All InteliGrid developed business services and end-user client applications actively use ontology services to enhance the end user experience [30].
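To make the notion of creating, manipulating and managing ontology instances more tangible, the following Python sketch models a small layered class hierarchy with instances attached to it. It is an in-memory stand-in written under stated assumptions, not the InteliGrid ontology service interface: the class `OntologyService`, its method names and the intermediate concept `document` are hypothetical, while `resource` and `construction-site-meeting-memo` are the example concepts named in Section 2.2.

```python
class OntologyService:
    """In-memory stand-in for an ontology instance service (hypothetical API)."""

    def __init__(self):
        self._parents = {}    # class name -> parent class name (None for a root)
        self._instances = {}  # instance id -> (class name, metadata dict)

    def define_class(self, name, parent=None):
        """Add a concept; a later definition may extend an existing concept."""
        if parent is not None and parent not in self._parents:
            raise KeyError(f"unknown parent class: {parent}")
        self._parents[name] = parent

    def is_a(self, name, ancestor):
        """Walk up the hierarchy to test whether `name` specialises `ancestor`."""
        while name is not None:
            if name == ancestor:
                return True
            name = self._parents[name]
        return False

    def create_instance(self, instance_id, class_name, **metadata):
        if class_name not in self._parents:
            raise KeyError(f"unknown class: {class_name}")
        self._instances[instance_id] = (class_name, dict(metadata))

    def update_instance(self, instance_id, **metadata):
        self._instances[instance_id][1].update(metadata)

    def get_instance(self, instance_id):
        class_name, metadata = self._instances[instance_id]
        return class_name, dict(metadata)

    def delete_instance(self, instance_id):
        del self._instances[instance_id]

    def find(self, class_name):
        """Return ids of all instances of a class or of any of its subclasses."""
        return sorted(
            instance_id
            for instance_id, (cls, _) in self._instances.items()
            if self.is_a(cls, class_name)
        )


svc = OntologyService()
svc.define_class("resource")                      # high-level concept
svc.define_class("document", parent="resource")   # assumed intermediate layer
svc.define_class("construction-site-meeting-memo", parent="document")

svc.create_instance("memo-001", "construction-site-meeting-memo", status="draft")
svc.create_instance("cluster-01", "resource")

print(svc.find("document"))  # ['memo-001']
print(svc.find("resource"))  # ['cluster-01', 'memo-001']
```

Querying by a high-level concept returns instances of all its specialisations, which is the practical benefit of the layered design: a client that only knows about `resource` can still discover a meeting memo registered by a domain specific extension.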