P i a a to —I fc o 6 3 30 »H ttì e ,o Volume 18 Number 1 April 1994 ISSN 0350-5596 Informatica An International Journal of Computing and Informatics Profile: Robert Trappl . Guest Editor: Miroslav Kubat Diagnosis of Technical Systems Closure of Database Relation IJCAr93: Critical The Slovene Society Informatika, Ljubljana, Slo "iT Informatica An International Journal of Computing and Informatics Basic info about Informatica and back issues may be FTP'd from ftp.arnes.si in magazines/informatica ID: anonymous PASSWORD: FTP archive may be also accessed with WWW (worldwide web) clients with URL: ftp://ftp.arnes.si/magazines/informatica Subscription Information: Informatica (ISSN 0350-5596) is published four times a year in Spring, Summer, Autumn, and Winter (4 issues per year) by the Slovene Society Informatika, Vožarski pot 12, 61000 Ljubljana, Slovenia. The subscription rate for 1994 (Volume 18) is - DEM 50 (US$ 34) for institutions, - DEM 25 (US$ 17) for individuals, and _ - DEM 10 (USS 4) for students plus the mail charge DEM 10 (USS 4). Claims for missing issues will be honored free of charge within six months after the publication date of the issue. MfeX Tech. Support: Borut Žnidar, DALCOM d.o.o., Stegne 27, 61000 Ljubljana, Slovenia. Lectorship: Fergus F. Smith, AMIDAS d.o.o., Cankarjevo nabrežje 11, Ljubljana, Slovenia. Printed by Biro M, d.o.o., Zibertova 1, 61000 Ljubljana, Slovenia. Orders for subscription may be placed by telephone or fax using any major credit card. Please call Mr. R. Murn, Department for Computer Science, Jožef Stefan Institute: Tel (+386) 61 1259 199, Fax (+386) 61 219 385, or use the bank account number 900-27620-5159/4 Ljubljanska banka d.d. Slovenia (LB 50101-678-51841 for domestic subscribers only). According to the opinion of the Ministry for Informing (number 23/216-92 of March 27, 1992), the scientific journal Informatica is a product of informative matter (point 13 of the tariff number 3), for which the tax of traffic amounts to 5%. Informatica is published in cooperation with the following societies (and contact persons): Robotics Society of Slovenia (Jadran Lenarčič) Slovene Society for Pattern Recognition (Franjo Pernuš) Slovenian Artificial Intelligence Society (Matjaž Gams) Slovenian Society of Mathematicians, Physicists and Astronomers (Bojan Mohar) Authomatic Control Society of Slovenia (Borut Zupančič) Slovenian Association of Technical and Natural Sciences (Janez Peklenik) Referees: - David Duff, Pete Edwards, Hugo de Garis, David Hille, Tom loerger, Yves Kodratoff, Ockkeun Lee, Rich Maclin, Raymond Mooney, Peter Pachowicz, Aswin Ram, Lorenza Saitta, Jude Shavlik, Derek Sleeman, David Wilkins, Bradley Whitehall, Jianping Zhang The issuing of the Informatica journal is financially supported by the Ministry for Science and Technology, Slovenska 50, 61000 Ljubljana, Slovenia. PROFILES For the first time in its Profiles history, Informatica takes the opportunity to introduce one of its editors who is not only a distinguished researcher, computer scientist and professional AI-activity organizer, but also a technological innovator and the leader of one of the largest and the most important industrial projects in the history of AI. In the SQUAD project at NEC Corporation, over 20,000 software engineers are participating in the implementation of knowledge engineering for software quality control. It is certainly important to stress at least one of the major innovative fields of Hiroaki Kitano, the massively parallel artificial intelligence. This field is a consequence of his scientific and research work concerning his speech-to-speech translation system, implemented for a 'live' J apanese-to-English translation on several massively parallel machines Uke CM-2, IXM-2, and SNAP-1. It is not for the first time that the readers of Informatica can learn about some innovative Ki-tano's achievements. A short overview of his professional activity was already reported in Informatica, 18 (1994), pp. 97-99 (M. Gams, Report: IJCAr93—A Critical Review). So let us cite some characteristic sequences from there. The Compiiters and Thought Award for outstanding young scientists (below 35 years) was given to Hiroaki Kitano. ... According to the reviewers, Kitano won the prize for novel translation work on a nnassively parallel computer. His approach builds on memory as the foundation of intelligence. Each of computer's 32,000 processors contains a few sentences and determines the best match for an input sentence in natural language. Translation is based on extensive use of memory. Statements are parsed into basic parts, translated, and sensibly combined to the translated language. All reasoning is by analogy. The program achieves around 75-percent accuracy by one reports and around 90-percent in others: ... He categorises complete, correct and consistent problem domains as toy domains. Real-world domains are denoted as incomplete, incorrect, inconsistent, and are in addition characterised by human bias, tractability and economics. ... a great part of Al research is devoted to toy problems which are unscalable not only to real-world problems but also to intelligent systems. ... Al as well as computer science and engineering in general are much more dependent on the development of computer hardware that generally recognised. Future computer projects can be characterised by massive computing, massive parallelism, massive data. The readers of Informatica may agree how the preceding dictum illuminates complementarily not only the work but also the philosophical and engineering orientation of Hiroaki Kitano. It gives significant hints not only to researchers but also innovators in computer and software engineering. The adjective massive in respect to processors, memory and data comes close to something called the informational and regards an in componential manner massive system (machine) in any respect, that is, in the sense of an informational machine which supports systemicaUy the informing of its hardware and software components (entities). Hiroaki Kitano Hiroaki Kitano is a researcher at Sony Computer Science Laboratory, Tokyo, Japan. He holds a Ph.D. in computer science from Kyoto University. Since 1988, he was a visiting scientist at Center for Machine Translation, Carnegie Mellon University. In 1993, he received the Computers and Thought Award from IJCAI for his accomplishments in speech-to-speech translation and massively parallel AI research. He is currently serving as a member of several important committees, such as the advisory committee for IJCAI, the international committee on evolutionary computation, the program committee for AAAI-94, etc. His scientific contribution spans broad range of AI research. In 1989, Kitano developed one of the first speech-to-speech translation system, a system which translates spoken Japanese into En-gUsh producing vocal outputs. The system, Dm-Dialog system, was also the first system, so far the only one, which demonstrates the possibility of simultaneous interpretation by machines. Kitano, based on his experiences as a professional interpreter, proposed an almost concurrent parsing and generation model so that, in some occasions, translation can be generated even before the completion of an input sentence. The Dm-Dialog system employs massively parallel marker-passing to carry out computation, and based on memory-based translation approach. Due to highly parallel nature of the computing scheme in Dm-Dialog, his system was implemented on several massively parallel machines, such as CM-2 connection machine, IXM-2 associative memory processor, and SNAP-1 semantic network array processor. These implementations demonstrate very high performance translation, translation in milliseconds order, which made a decisive case for massively parallel approach for spoken language translation. In fact, ATR Interpreting Telephony Research Laboratory, a Japanese semi-government research institute, made a decision to pursue a massively parallel memory-based approach to develop its speech-to-speech translation system. The impact of Kitano's work was not limited to natural language community, it was the first application of massive parallelism to AI for highly complex tasks. Kitano also challenged to expand horizon of high performance computing using the state-of-the-art semi-conductor technology. In 1992, he designed a special purpose wafer-scale integration (WSI), called WSI-MBR. It is dedicated device for memory-based reasoning (MBR). The architecture exploits the redundunt nature of MBR to overcome inevitable fabrication defects of WSI. With its special analog-digital hybrid design, the performance of a WSI-MBR on an eight-inch wafer is expected to be equivalent to 70 Tera-Flops in digital computers. It enables over 2 million parallelisms in one wafer. The low cost, small size, and high-performance of WSI-MBR have significant potential to re-shape the future of computing and industry. The success of these projects led him to promote a new field in AI—massively parallel artificial intelligence. He chaired a panel on this subject in IJCAI-91, held a tutorial in IJCAI-93, and recently edited a book Massively Parallel Artificial Intelligence (Kitano and Kendler, Eds.) published from AAAI Press/The MIT Press. In 1990, he published a seminal paper on the combination of genetic algorithms (GA) and neural networks (NN). His paper, appeared in Complex System journal, introduced a developmental stage in combining GA and NN. This is the first work to integrate evolution, development, and learning. This work quickly gets attention in GA and ALife community, and a number of re- search projects was initiated and inspired by his method. Kitano has a substantial contribution to case-based reasoning and corporate information system field, too. While at NEC Corporation, he leads the SQUAD project, which is a project to develop a large scale corporate-wide case-based system on the software quality control domain. The SQUAD system has over 25,000 cases in various areas of software quality control, and the size of the case-base increases at the speed of 3,000 cases a year. In one of the largest knowledge engineering projects undertaken in the AI history, over 20,000 software engineers are involved in reporting cases throughout the company. Aside from his scientific work, Kitano actively promoted important new areas in AI, such as massively parallel AI, entertainment, and grand challenge AI applications. He chaired several important workshops such as the First International Workshop on Parallel Processing for AI (PPAI-91) held in conjunction with IJCAI-91 at Sydney, Australia, and the Second International Workshop on Parallel Processing for AI (PPAI-93), IJCAI-93, at Chambery, France. Back in Japan, he established SIG-PPAI, a new SIG under Japanese Society for Artificial Intelligence (JSAI). SIG-PPAI is a dedicated forum to discuss parallel processing for AI. In 1991, he organized and chaired the workshop on Grand Challenge AI Applications, at Tokyo, Japan, which triggered several key research projects in Japan each of which addresses grand challenge applications. In 1993, he organized a panel "Grand Challenge AI Applications" at IJCAI-93 to discuss international aspects on grand challenge. His recent emphasis in promoting a new area is entertainment. At AAAI-94, he co-chaired the workshop on AI, AHfe, and Entertainment. He is currently planning several other workshops to help researchers to explore a new area of AI research. Kitano serves as an associate editor in several of the major journals, such as Journal of Evolutionary Computation, and Applied AI, and is a member of the editorial board for Journal of Al Research, Journal of Theoretical and Experimental AI (JETAI), Artificial Life journal, and Informatica. Edited by A.P. Železnikar Evolution of Methods in Object Schema Z. Tari and X. Li School of Information Systems Queensland University of Technology GPO Box 2324, Brisbane, Australia zahirtQicis.qut.edu.au, xueliQfit.qut.edu.au Keywords: Object-oriented databases, schema evolution, methods, proof correctness Edited by: Xindong Wu Received: May 12, 1994 Revised: September 20, 1994 Accepted: October 6, 1994 This paper deals with the problem of method restructuring in object-oriented databases. We propose a framework that allows a change of methods while maintaining database consistency. A set of minimal changing operations which affect all parts of methods is provided. The semantics of methods evolution relates to two levels of granularity of a sciema. The first level concerns the evolution of methods in the context of class inheritance hierarchy, whereas the second level relates to behavioural evolution in which the chain of calling relationships between methods is considered. 1 Introduction The purpose of this paper is to show how behavior of an object-oriented (0-0) schema can be updated by keeping the state of the database consistent. This problem is generally involved in a more global problem, namely schema evolution. This problem relates to the ability of updating information within a schema. Schema evolution is an important facility because most of the design of database applications caUs for frequent changes to catch up to changes in reality. Schema evolution has been primarily performed on the relational model. With the emergence of richer models (e.g., 0-0 models), schema evolution is addressed in a more general way, i.e. not only on the structural perspective as it was done in the relational model. In this paper we consider the problem of schema evolution in 0-0 systems. This problem concerns the ability to safely alter a schema both structurally and behaviour ally. - Structural alterations may be performed ■ at different levels of granularity within a schema: class level, relationship level, instance level. Several works, e.g. [4,13,14,16], have investigated these different levels of structural modifications. For each level a set of alteration operations and semantic rules to ensure consistency are provided. Many products have been enhanced by the above facilities (e.g. Gemstone, O2, Orion). A detailed study of schema evolution in such systems can be found in [15,8]. - Èehavioural alterations concern the evolution of the behaviour of objects. The operations relate to the possibility of modifying both the signature and the code of the methods, and ensuring the correctness of these operations. Some approaches have been provided (e.g., [1,9,11,17,18]) in which some structural changes, such as updating relationships between objects, may affect the behavior of objects. In other words, they are considered as behavior updates. The literature has widely addressed the first above aspect. However, regarding the second above aspect, i.e. the modifications of methods, to our knowledge no system fully supports such an aspect. Indeed, the research in this area may be considered as a "sample view" of the problem of method changing, and it does not cover every part of a method. This paper provides our solution to the problem of method evolution in 0-0 schema. The problem of behavioural consistency is also addressed. The behaviour of an 0-0 schema is considered here as a pattern of activities coherent to the schema. Method evolution refers to updating the methods using operations which include deletion, insertion, and modification. These, after their use, may generate inconsistencies which involve (1) run-time type errors, (2) side-eiFects, (3) unchanged methods made redundant or meaningless, and (4) unexpected results given by methods. In this paper, solutions to the problems (l)-(4) cited above are provided. To do so, a fundamental issue is the way in which methods are represented. The semantics of methods is represented declara-tively by using appropriate data structures, such as - a graph for representing method overriding and the inheritance relationships between methods; and - a graph for describing overloading semantics and the method-calling relationships. Because the information about method behaviours is represented in graphs, the behavioural inconsistency problems can be checked out by searching the states represented in graphs. By making the information that constrains the method behaviours available to a method-changing handling mechanism, we address the strategies used to deal with behavioural inconsistency problems. Briefly, the approach we propose has three aspects: - method representation framework, - the method evolution operations, and - the checking strategies. The first aspect is addressed by proposing appropriate graph data structures that represent all the information about methods. A set of operations are provided for the second aspect, and this covers all the possible modifications of a method schema. The last aspect is addressed by checking the inconsistent behaviours on different states of the graphs. We also provide consistency checking algorithms that take these modified graphs as input and output validation of the graphs. Program verification techniques (e.g., Hoare rules [10]) are used to provide a proof of correctness for the modified methods. The consistency checking is performed both at the inheritance level and at the method-caUing level. The algorithms to check these different levels of consistency are proposed. The remainder of the paper is organised as follows. The next section discusses the related work. Section 3 describes the basic concepts related to methods in 0-0 schema. Section 4 gives an example. Section 5 fixes the notations and presents a formalisation of the semantics of methods. In section 6, the syntax of a method evolution language is described. Section 7 presents the strategies for behaviour consistency checking and we conclude with future work in section 8. 2 Related Work The evolution of methods in 0-0 databases is a complex problem. Generally, methods are described in a procedural language in which the effects of update operations are difficult to understand. A small amount of literature has been dedicated to this problem, and generally addresses the declarative part of methods, i.e. the update of the signatures. This section surveys previous work in the area of method evolution and puts our work in context. - In [9,18], the authors mainly approach structural evolution. Behavioural evolution is focused on the update of signatures. Behavioural consistency is restricted to runtime type errors and side-effect problems. The former are detected during structural consistency checking which ensures that the nanie and signature of methods are compatible along the method-inheritance chain. Data Flow technique (VDF) is used in [9] to detect run-time errors in terms of three type safe sufficient conditions. The type safe conditions guarantee - the inheritance relationships among data types referred by a method; - the declaration of method data types; and - the matches of actual and formal arguments. - Waller's approach [17] is graph-oriented for checking method consistency. An abstract run is introduced to represent the unfolding of the method computation from input to output. The signature of a method is "unfolded" as initial and final states of a directed graph which gives a consistency proof. When a method is changed, another consistency proof will be generated. The approach is based on incremental consistency checking which analyses the differences between two consistency proofs. - Method Execution Tree (MET) [11] gives an intuitive way to represent the execution of method calls. The MET of a method execution is represented by the root of the tree (m,t)[o], where m is a method name, t is the type that m is encompassed in, and o is an object instance of t used as the actual argument to invoke m. When m is executed it calls other methods and passes objects to them. Therefore, a node in the tree is a method call and an edge in the tree represents a calling relationship. The ultimate goal to use a MET is to find four kinds of problems: (1) unavoidable nontermination, (2) the method is reachable from another, (3) a method has bounded execution, and (4) method execution is aborted. The problems (1) and (3) can be detected through the configuration of its MET. Also for each method, the problems (2) and (4) are dealt with by examining à report of the method execution. The above mentioned approaches suffer from the lack of semantics associated with methods. Especially, there are no discussions about the method correctness after method updates. However, among these approaches Waller's approach seems to be very elaborate. It has following drawbacks: - When an abstract run is unfolding a method, some cycles may still appear in the graph which are not distinctly different from that of the recursion of methods. For example, people — people may represent a family tree or an employee — manager relationship. The unfolding algorithm runs into the problem it has tried to avoid by making the assumption that methods are recursion-free. Moreover, if the abstract runs to unfold recursive methods are unavoidable, this approach runs into a more serious problem because of the complexity of a recursive method before it is unfolded. - Although this approach is computationally sound and complete, it still cannot deal with low level behavioural inconsistency, i.e. with instances. For example, the output of a method, say for instance an integer, may be fed into another method deriving unexpected negative values of employee salaries. The problem is rooted in the representation of the graph: the vertices are represented as {ti,Ci,i) where U is a collection of method output objects. Ci is a class name, and i is the sequence number in the steps of abstract run. The checking is based on the isomorphism between the corresponding vertices and the minimal bridge between two graphs of the consistency proofs. Virtually there is still nothing more than the preventing of side-effects and run-time type errors in methods. In this paper we provide solutions for the above problems by - proposing a framework for representing the semantics of methods as graphs. This framework bears the information about signatures, pre- post-conditions, and relationships of methods; - proposing the basic operations for method evolution; and - proposing a consistency checking strategy. The checking algorithms are developed based on the two ^ kinds of graphs and can be implemented to automatically validate the changed methods. 3 Basic Concepts This section clarifies the basic concepts of the object paradigm such us: overriding, overloading, and late binding. These characteristics are fundamental to method evolution in 0-0 databases. Other aspects of the object paradigm such us static and dynamic typing, and covariance and contravariance subtype rules complement the previous features. AU of these aspects related to the object paradigm wiU be described in this section in order to provide a clear understanding of such concepts in the context of method evolution framevirork in 0-0 databases. The reader is assumed to be familiar with the 0-0 terminology [3]. Only a brief overview of the basic 0-0 concepts is given here. A class is a central concept in 0-0 systems. It is an abstraction of some information of the real world. Instances may be generated from classes. They are called objects. A class has attributes and a set of methods that are available to all objects of the class. Attributes and implementations of methods are hidden, and methods are used only through their signatures. The signature of a method specifies a set of input attributes and a result attribute. A method is invoked by sending a message to the object to which the method is attached. A message actually is a caU to a method with actual arguments assigned to match with the method signature. In this paper we use the terms of "calling a method" or "sending a message" interchangeably. An 0-0 schema is a set of classes connected together with various types of relationships. The most common (used) relationships are: inheritance (isa) and aggregation (partof). In this paper we adopt a "simplified" Booch notation [5] for describing an 0-0 application (see Figure 1). In [5], classes are specified as cloud dashed icons. An icon class contains the name, the attributes and the interfaces of the class. Additional relationships such as using, instantiation and metaclass relationships are not considered in this paper. They are generally introduced during the design, and then transformed to an isa or partof relationship during the implementation. The static part of an 0-0 schema, i.e. the attributes of classes, is widely addressed in the literature. However the dynamic part, i.e. the methods of classes, is generally not deeply studied. Indeed, method plays an important role in 0-0 systems, especially for method evolution. This section clarifies concepts related to methods in 0-0 systems to prepare for further work in the area of method evolution. Overriding, Overloading and Late Binding. Method overriding is inherent in class inheritance semantics. A method defined in the superclass can be redefined by its subclass(es). By using the same name, a number of methods may override along with an inheritance path. Method overloading reflects the meaning of the polymorphism. Different methods with the same name may be defined within the same lexical scope (e.g., a class). Method overriding and overloading give an effective way of a natural mapping between a problem space and a method space. Late binding is a way that guarantees the overriding and the overloading are properly handled [3]. During compile-time, the binding of method names can only be done in a context-free format and may not be able to link the ones that are meant to be executed. However, at run-time, a certain method to be bound is dependent on the context that - a method of a subclass should always override the one of the superclass; - amongst the methods with the same name in the _ same lexical scope, the one which matches the signature of the caller should be bound. In other words, the late binding is a process in which the system determines the method code to be run and performs the type checking at runtime. Late binding is also known as dynamic binding or run-time binding. Static and Dynamic Typing. Methods in 0-0 systems are categorised as either static typed [2] or dynamic typed. Sometimes static typing is also known as strong typing [12], A static typed method is characterised by the fact that aU types of objects referred to by the method can be determined at compile-time. Situations include: - a call for an overloaded method such that the types of signatures need to be discriminated; - a call for an inherited method such that the types of objects involved need to be decided; - any call-by-reference situations in which a method or an object is to be referred to via a pointer. In the presence of late binding, some of the static-typed languages require that aU candidates must uniformly match with that of the dispatcher (i.e., a caller of a method or a variable referring to a method). Dynamic typing in programming languages is characterised by the fact that the types referred to in a program are determined at compile-time whenever that is possible. Otherwise, they are determined at run-time. This means that during late binding, the match of types of objects is performed within the context of method execution. Whether a static-typed or a dynamic-typed programming language is used, the type safety problem for overloaded and inherited methods have some principles, namely, the covariance subtype rule [9] and the contravariance subtype rule [7,6]. Following are the definitions of such typing rules. Covariance Subtype Rule. When a method is redefined or substituted by another method, in order to maintain the type safe method calling, the co-variance subtype rule states that a method m\ . with a signature X —> Y, denoted as m\ : {X —y), is a subtype of a method m2 with a signature X' —Y' if X is a subtype of X' and y is a subtype of Y'. Thus if 7712 is replaced^ by mi with the satisfaction of the covariance subtype rule, it is stiU type safe. However, when late binding is taking place, the covariance subtype rule per se cannot prevent run-time type errors. In this case, the late binding may take an object x' E. X' instead of a; £ X to match with mi. This is allowable by the covariance rule since X is a subtype of X' and X inherits all objects from X'. However x' may not have some attribute that is peculiar to x required by mi. In this case, a runtime type error within mi would be detected. In brief, the covariance subtype rule provides a sufficient condition to specialise a method into its subtype and problems may occur when an object from a supertype is passed into a subtype. Example 1 Let us consider the following example to clarify the covariance subtyping. Assume that X = Student, Y = Lecturer, X' = Person and Y' = Staff. If m-i : {Person —^ Staff) is replaced by mi : {Student —^ Lecturer), then all calls previously to m^ now become calls to mi. Since Student is a subtype of Person and Lecturer is a subtype of Staff, this replacement is acceptable for the reason that a student inherits all the attributes of the class Person. This is the same for a lecturer who is a person. Consequently, any enquires previously to a person and a staff will then be substituted by to student and lecturer. This is the nature of subtyping. However with late binding, there might be a case that an actual argument p which is of type Person is bound to the formal argument s which is of type Student. Since p cannot provide some specific information required by s (say, credit-points of a student), a run-time type error is reported. It can be seen that this is not a problem when m^ is not replaced by its subtype m\. Contravariance Subtype Rule. The contravariance subtype rule has been discussed by Cardelli in [7]. Given two methods mi and m2, where mi : {X —> Y) and m2 : {X' -—> Y'), m\ Isa m^ means that X is a supertype of X' and Y is a subtype of Y'. By contravariance subtype rule, type safety is provided in late binding. In this case, m2 may be substituted by mi. Then, during late binding, if an object x' G X' instead of a; € X is taken to match the mi signature X —Y, since X is a supertype of X', any object taken from X' must have aU properties that a; G X has, so no run-time type error could possibly happen. In brief, a method m2 : (X' —Y') can be generalised to mi : (X —y) when contravariance subtype rule is applied. Example 2 Let us consider an example where X — Person and Y = Lecturer, X' = Student and Y' = Staff. If m^ is replaced by m\, then all the calls previously to m^ are now calls to mi. Since Person is a supertype of Student and Lecturer is a subtype of Staff, this replacement is acceptable, in the sense that mi requires less information as its input type of Person and returns more information as its output type of Lecturer. Apparently, any previous calls to m2 will be acceptable to mi because the signature of m2 actually specifies more information than is required by the signature of mi. During late binding, if either an actual argument s with the type Student or p' with the type Person is bound to the formal argument p with the type Person, there is no run-time type error because s actually provides redundant information for p and p' is compatible to p. Covariance vs. Contravariance. The covariance subtype and the contravariance subtype rules are mutually exclusive. That is, if a system adopts the covariance subtyping, it will give an error whenever the redefined domain (left) part of the method signature violates this rule of the covariance subtyping. A similar situation happens to a system that adopts the contravariance subtype rule. When the redefined domain (left) part of the method signature is not a supertype of the domain it is redefined from, an error is issued. Many existing systems use only one of these rules: Eiffel for example uses the covariance rule. However in general none of the existing systems support method evolution. In fact, the covariance and the contravariance subtype rules are applied under the following circumstances: - When a method is altered, the relationship or consistency with other methods needs to be reconfirmed. — When a method is called in a context sensitive situation such as pointer dereferencing, overriding, or overloading, in which late binding is invoked, the method signature needs to be matched with the actual arguments. We are concerned with a combination of these two rules in the context of the method migration. The method migration can be performed in two directions: specialisation and generalisation (section 7). The combined rule is an exclusive ored of covariance and contravariance subtype rules. So, either one of the rules is allowed to be applied on a method but not both. The distinction between the use of these two rules is based on the semantics of method evolution which is detailed in section 7. By applying either the covariance rule or the contravariance rule, methods are specialised or generalised. The type safety then relies on consistency checking in the presence of late binding. This is also detailed in section 7 and a run-time exception handler for this idea is to be discussed in future work. 4 University Example This section describes an example of an 0-0 application relating to a university environment (see Figure 1). This example wiU be used to illustrate the method alteration within an 0-0 schema as well as consistency checking. Booch [5] notation is used for showing the class diagram. Dashed icons represent the classes, such as Subject and Course. Every class has three types of information: the name of the class, the set of attributes (properties), and the corresponding methods that are used by the objects derived from that class. Classes are connected with different types of relationships - aggregation and inheritance relationships. The former are represented graphically as a line with a plain boulet at the beginning of the relationships. For instance, the class Course has course-name, time, plus a reference to Subject as attributes. This reference expresses the aggregation relationship that contains the class Subject. The other relationship, graphically expressed as a directed arrow, expresses the inheritance relationship between classes. For instance, there is an inheritance relationship between the classes FullTimeL and Lecturer which expresses the fact that a full time lecturer is a lecturer. z' v / Report clash_reporl{) /' timBjable_reportO * stucly_report() \ printO J Subject / unif_code() ^ room_num > allocationO i'igure 1: The Class Diagram The above class diagram contains information that is required for design purposes. However, for the evolution of the method, other types of knowledge, such as the relationships between methods, should be declaratively specified. These kinds of relationships are called calling relationships, and these model the calling dependencies between methods. Calling relationships are generally parts of the implementation part of methods. In our approach we wiU provide additional data structures that allow the declarative specifications of such calling relationships. Let us consider the example of Figure 1 to show the advantages of the use of caUing relationships for method evolution purposes. For instance, the method allocation{) of the class Course, denoted as Course.allocation{), refers in its implementation to the method Subject.find(). This calling relationship between Course.allocation{) and Subject.findQ means that Course.allocationQ may have to be modified when Subject.findig is changed. This relationship will be declaratively represented in appropriate data structures and then used as a basic "piece" of information during the evolution of methods. 5 Method Schema Definitions and notations are proposed in this section to prepare the formalisation of methods. Basic definitions of graphs, class inheritance hierarchy, and method schema are given. Notations for the relationships between methods are described. A graph G is defined as a tuple < N, E > where N is a set of nodes and E is a set of edges. Every edge describes a relation between two nodes. A directed graph is a graph in which every edge is an asymmetrical relation. A directed acyclic graph G=< N,E > is d. directed graph without cycles. The class inheritance hierarchy of an 0-0 schema is a directed acyclic graph < C,E >, where C is the set of classes and E is set of isa relationships between classes. Given a class c and a method m, the method m in c is identified by c.m which is called the method identifier. A method schema describes all information, related to methods of a schema. It represents the behaviour part of a schema and it is represented as a tuple in the form of < C, M, ID, I > in which - C is the set of all class names, - M is the set of aU method names, - ID is the set of aU method identifiers, - I is the inheritance hierarchy defined over C. The class inheritance hierarchy of an 0-0 schema is a directed acyclic graph , where C is the set of classes and E is the set of isa relationships between classes. Polymorphism allows a class' method to be redefined in its subclasses. Thus given a class c and a method m, we identify the method m in c by c.m which is called a method identifier. An attribute set A is defined as a set of functions, A = F: B—>T, where - B is a set of attribute names. - T is a set of aU types. - F is a set of functions that map between attribute names and types. - A is the set of aU attributes, denoted by A = {a|a = b:t, where beB, teT}. Refinements are needed for the above notations, such as the functions that map attributes to classes and map attributes to method signatures etc. These are omitted because they are not in our focus. Square brackets [] are used to delimit the notations. Given classes c, c', and c", and methods m, m', we denote by [c t c' ]: c is a subclass of c'. [c.m I c'.m' ]: the method m of the class c calls the method m! of the class c'. Or we may say that c'.m' is expected by c.m. [c.m —c'.m ]: the method m in the class c is redefined from the class c', where etc' and ßc" such that m is defined in c", c.m — c".m and c' t c". [a< a' ]: a is a subtype of a', where a = b : t] a' = b' : t'-, a, a' G A; b, b' e B; t, t' £ T; and f is a subtype of t'. [m - t is the signature of the method. Note that the arrow "—s-" has an overloaded meaning with that of the method redefinition notation in section 5. However, in the context of our discussion, there should be no ambiguity. When methods are overloaded, the same name is declared for diflFerent methods in the same lexical scope. However they are distinguished on the basis of the number and types of their arguments. When it is necessary, a system may need to refer to the signature of a method to distinguish between overloaded methods. Figure 2 describes a graphical description of a method. The implementation part is the body of the method and this is represented in the form of c.m(P,Q,R,U), where - P is a set of attributes local to m in c and may be used as actual arguments to call other methods. - Q is a set of constraints on the method input attributes (i.e., formal arguments). They are predicates specifying the pre-conditions before the method is executed. The variables of the predicates are taken from the signature S. - R is a set of predicates specifying the postconditions after the method is executed. - U is a set of methods directly called by c.m. Elements of U can be expressed by method identifiers with the signatures which are substituted by the attributes from P or S. p u Q(S) R(t) chical structures: the method definition directed acyclic graph (DAG) and the method dependency graph (MDG). The former is a graph which models the inheritance relationships between methods, while the latter models the calling relationships. These graphs are defined as foUows: Definition 1 (Method Definition) Given a method schema < C, M,ID,I > and a method m of M, the method definition directed acyclic graph of m is a graph DAG(m)=< C',E'>, where — C' = {c'\c' G C, and m is declared in c'} - E' = {é\e' : c'.m c".m luhere c' G Figure 2: Graphical Representation of a Method In an 0-0 schema, a class is defined within an inheritance hierarchy. A method as a property of a class may be inherited from that of its superclasses. By polymorphism, a method may also be redefined in subclasses with the same method name(s) as that of its superclass(es). Every method is involved in two orthogonal hierar- C, c" e C, m e M, c'.m e ID,ć\m e ID}. When context is clear, we may mention DAG for DAG(m). Object.print Lecturer.print TimeTable.print Subject.print PartTimeL.print Figure 3: The Method Definition DAG oi print Note that a node with only incoming arrows of a DAG indicates the common definition. All other kinds of nodes of the graph indicate the redefinition of methods. Figure 3 describes the DAG of print method which has the following structure: DAG(print)=< C\E' >, where - C = {Object, Lecturer, TimeTable, Subject,Report] - E' = {Lecturer.print—^Object.print, Subject.print—^Object.print, Report.print —> TimeTable.print,TimeTable.print —► Object.print} Definition 2 (Method Dependency Graph) Given a method schema < C,M,ID,I >, a class c of C and a method m of M, the method dependency graph of c.m method represents the methods of M which are linked to c.m by the calling relationship. This graph is in the form of MDG(c.m)=< ID',E' >, where - ID' = {c'.m'|c'.m' e ID, and c.m i c'.m' or 3 e ID' such that c".m" i c'.m'} - E' = {e\e : c'.m' J, c".m",c' G C,c" G C,c'.m' e ID',e ID'}. When context is clear, we may mention MDG for MDG(c.m). TimeTable.TimeTable Report.Report - Report.clash.report -TimeTable.dash -TimeTable.print - c.m(P,Q,R, U) is the method implementation of c.m; - DAG(m) is the method definition DAG defined over c.m; and - MDG(c.m) is the method dependency graph of c.m. The next section introduces the syntax of method-changing operations using the definitions and notations presented in this section as well as those concepts described in previous sections. Report.teach_report--TimeTable.allocation Figure 4: Method Dependency Graph of Report.Report Figure 4 shows the method dependency of Report.Report. This is presented in the following form: MDG(Report.report) = < ID',E' >, where - ID' = {Report.Report, TimeTable.TimeTable, Report. clash.report, Report .teachjreport, TimeTable.dash, T each.allocation, TimeTable .print} - E' = {Report.Report J, TimeTable.TimeTable, Report.Report , I Report.clash-report, Report.clash-report ], TimeTable. clash,TimeTable .clash | TimeTable.print, Report.tech-report | TimeTable.dash, Report .teachjreport J, Teach.allocation}. With all the concepts introduced in this section, the semantics of a method can now be represented as a single concept: namely method intension. It is an integration of all the information related to a method. A method intension contains the identifier, the signature, the implementation, and the two graphs based on method definition and method dependency. Definition 3 (Method Intension) Given a method m of a class c, the semantics of c.m is defined by the tuple INT(c.m) = < c.m : S —^t, c.m{P, Q,R, U), DAG{m), MDG{c.m) > where, - c.m is the method identifier; - S—>t is the signature of c.m; 6 Syntax of the Method Evolution Language The evolution of a method may be performed by different operations on a schema. These operations are: the deletion of a method, the insertion of a method, and the modification of a method. Our approach considers only two basic method-changing operations of deletion and insertion. The method-changing operation should be coded as a transaction to combine elementary operations into a single method-changing operation. For instance, the modification of a method is described as a transaction of deletion followed by an insertion. The moving operation is regarded as a composition of deletion and insertion operations as well. Consistency checking is done after the transaction. Within a transaction, a virtually deleted method can still be accessed. This section discusses the syntax of the method-changing operations and emphasises what can be changed rather than how it can be changed. Method-changing operations refer only to locally defined methods rather than those references inherited via the class inheritance hierarchies. In other words, the method-changing operations can only be performed on those existing methods which can be found in DAG(m), mSM. Otherwise, an insertion operation of method m wiU create a new DAG(m). For instance,-according to the schema of Figure 1, the class FullTimeL inherits the method print{) from the class Lecturer but does not modify it since the method print is not locally declared within the class FullTimeL. This aspect will be detailed later. Bearing in mind the fact that a method can be changed by designing a transaction consisting of only two basic method-changing operations, a method can be changed by the following operations: A. Deletion 1. Deletion of a class: the class is deleted and aU its methods are implicitly deleted as weU. 2. Deletion of a method: the method is deleted explicitly from a class. B. Insertion 1. Insertion of a method with a new class. 2. Insertion of a method within an existing class. C. Modification 1. Modification of the name or the signature of a method. 2. Modification of the implementation of a method. 3. Modification of both the definition and the implementation of a method. The proposed set of operations may be reduced to a minimal set. For instance, a method can be deleted explicitly from a class or deleted implicitly due to the deletion of a class. The implicit deletion is sequenced within a transaction into two distinct operations: the deletion of aU methods defined within the class, and the deletion of the class. Similarly, the insertion of a method may occur with two distinct operations: the insertion of a new class followed by the insertion of the method. Since the structural evolution of a schema is not addressed in this paper, the insertion and the deletion of a class wiU not be discussed. Consequently, the set of method-changing operations is reduced to the situations A-2, B-2, C-1 and C-2. The syntax of the method evolution language is the following. niethod_changing_operation : := ll deletion ::= delete_method () method.ident if ier ::= . c ::= m : := insertion ::= insert_method (, ) method_definition ::= [ : ] method_s ignature ::= -> method_implementation ::= method.body ::= ([

,.,]) P ::= {} Q ::= {} R : := {} U ::= {} modification : := I I ::= modify.def ([], ) ::= modify_imp ([], ) ::= modify_mtd ([],) ::= ([,]) The may be seen as a generic operation to and . The can also be used as a generic method for moving and copying operations. Using the above syntax for method evolution, a method c.m may be updated as follows: Insertion - syntax: insertjmethod{c.m : {S — t),c'.m()) - semantic: This operation inserts m in c. The body of m is copied from the method with the same name in c'. - example: insert jmeihod{Lectur er. find : (c : char —)■ find : char), Subject.findO) Deletion - syntax: deletejmethod{c.m) - semantic: This operation deletes the method m from the class c. - example: deletejmethod(C our se.allocation) Modification of the name - syntax: modify.def{c.m,c.m') - semantic: This operation changes the name m into m'. This is interpreted as a transaction which deletes m and is followed by the insertion of m'. Since the modification operation is a transaction, the deleted m remains accessible from m'. The new method m' has the same signature and the same body as m. - example: modify-def(Subject.find, Subject, search) Modification of the signature - syntax: modify.def{,c.m : {S' —t')) - semantic: This operation is regarded as a transaction which firstly deletes c.m. Then, a method is inserted with the same name and with the specified signature. When the context is clear, a comma in the command indicates that the method before the comma has the same identifier as the method after the comma. - example: modify.def(, Report. Report: (S—> t)) The details of S and t ought to be given. Modification of the body - syntax: modify-imp(c.m, c.m(P',Q',R', U')) - semantic: This operation modifies the body of c.m. - example: modi fy-imp{Student.print, Student.print{P', Q', R', U')) The details of P', Q', R', U' ought to be given in practice. The proposed method-changing operations are arbitrary and may possibly result in compile-time or run time errors. These errors can be detected or handled at various stages by a recompilation trigger mechanism or by a run-time exception handler. The focus of this paper wiU be on the development of algorithms that are applicable to these stages, in order to produce methods that are type safe and behaviourally consistent. Informally speaking, run-time errors are those situations that can disrupt the execution of methods. Some run-time errors are machineimplementation dependent. In this case, the runtime errors report abnormal situations of the program execution environment such as arithmetic overflows, memory, disk, communication, and file management failures. However, due to overriding, overloading, and later binding, many runtime errors are caused by method updates. These include subtype mismatches during late binding, violations of domain constraints, or some abnormal arithmetic errors. The major problem to be dealt with is type mismatch errors during runtime. The pre-detection of the run-time type errors is not a trivial problem. Additionally, methods should be defined as side-effect free. In fact, the specifications of method pre/post conditions give a means to ensure that the expected behaviours are performed at the implementation stage. When a method is not side-effect free, a global variable, or an object in the database, may be changed by the method without a passing of method arguments. In our approach, it is important to the evolution handling mechanism to maintain side-effect free methods, especially when a method is subject to intensive calls or massive data updates in a database. 7 Behavioural Consistency This section develops a mechanism for handling method-changing operations and consistency checking. For that, the concept of method scope is introduced. This is defined as a set of integrated graphs and a set of method intensions. The method scope represents all the information about the methods contained in an 00 schema. Algorithms for building the method scope are given and strategies for method consistency checking based on the method scope are described. 7.1 The Semantic of Method Changing The consistency of method behaviour can be seen as the possibility of calling the method and obtaining the expected results. Inconsistent situa- tions are run-time type errors, side-effects, redundant methods, and unexpected behaviours, such as (1) - (4) described in section 1. In the following subsections, we discuss various aspects of the semantics of method changing. In subsection 7.1.1, method migration, as a special case of method changing, is presented. The type safety in this case is provided by the migration rules. Subsection 7.1.2 develops the behavioural consistency. 7.1.1 Method Migration When a method is changed using the operators discussed in section 6, it can happen in many different ways. One common situation is that methods may be migrated between super and subclasses. This is called migration by either specialisation or generalisation. The operation of the method migration may be constructed as a transaction of a deletion followed by an insertion. With the method migration, the type-safety problem concerns that: (1) the signature of a method may be changed; (2) a message may have an unexpected receiver i.e., a method from its super/subclass is bound. Another issue regarding the behaviour of migrated methods, such as the consistency of modified pre/post conditions, will be covered in the following two subsections. Two rules, covariance subtype rule and con-travariance subtype rule, play an important role in the checking of type safety. Based on these two rules, we shall examine two specific late binding situations. 1. Method generalisation: this means that a method with the supertype of its input attributes is used where its subtype is expected. For example, a method mi : {Person — Lecturer) may be used where a method m2 : (Student —> Person) is expected. Here Student is generalised to Person and Student is surely acceptable to Person because at run-time, type Student can always be safely generalised to type Person. However, the result type Person is changed to the type Lecturer. This substitution satisfies the contravariance subtype rule. Fig.5 illustrates this idea. The dashed arrows indicate the subtype relations and the curved solid arrow indicates that the method m2 is generalised to the method mi. mi: Person À fr^: Student Teacher T Person Figure 5: Method Generalisation 2. Method specialisation: this means that a method with the subtype of its input attributes is used where a supertype is expected. For example, a method mz : (Student —Student) may be used where a method m4 : (Person —Person) is expected. In this case. Person is specialised to Student. Surely, at run time, only messages sent to the object with m^ can be bound with type safety. When the late binding mechanism tries to bind a message sent to the object with 7714, a run-time type error (Person does not match with Student) is detected. This is illustrated by Figure 6. The dashed arrows indicate the subtype relations and the curved solid arrow indicates that the method m4 is specialised to the method ma. m4' Person A rr^: Student Person 1 Student Figure 6: Method Specialisation Updating a method either by generalisation or specialisation offers a general framework for method migration as well as for type safety. The following definition completes the previous discussions by integrating the specialisation and the generalisation rules. Definition 4 (Migration Rules) In summary, the migration via generalisation or specialisation is said to be type safe if - a method is generalised, or — a specialised method is not to be bound to the method of its supertype. 7.1.2 Behavioural Consistency In addition to the type safety problem, the other issue is behavioural consistency which is a fundamental aspect of method evolution. This relates to the capability to produce correct behaviour applications in the sense that they reflect the assumptions of the real world. However the properties of behavioural consistency are difficult to identify and views on consistency have been given by many other works. The type safety problem can be regarded as a sub-issue of consistency. We now give a definition of behaviour consistency by taking into account aU of the modifications of every part of a method. Definition 5 (Behavioural Consistency) An 0-0 database schema is behaviourally consistent if and only if it satisfies the following conditions: - The behaviour of a method rn in a class c is prescribed by c.m: (S—^t) and c.m(P,Q,R,U). - The relationships among methods are described in DAGs and MDGs. - There are no such consequences caused directly by the method-changing operations, as - run-time type errors - side-effects - redundant methods, i.e. methods that have the same signature and the same implementation - unexpected behaviour, i.e. pre/post conditions cannot be satisfied. In a given method schema H = < C, M, ID, I >, a method c.mGiD with c.m:(S—>t) and c.m(P,Q,R,U) as the definition and the implementation parts, can be updated using the proposed method-changing operations. The principles of behavioural consistency checking are that: 1. Any arbitrary method changing is allowed. 2. The consistency checking is a decidable problem if the method intensions {INT(c.m)| c.melD} are given. This is mainly because of the following: - All INTs have limited sizes. - AU effects of method-changing operations are reflected in INTs. - The effects of method-changing operations have finite states. - The checking algorithms are designed based on the searches and comparisons of INTs' states. Thus the method consistency checking problem can be abstracted as a problem space that has a limited size, with finite states, and a set of search algorithms. 3. When a method c.m is changed, INT{c.m) becomes INT'(c.m') which may be empty if c.m is deleted. 4. The checking criteria are used to check the following situations: - the signature of c.m is type-consistent within MDG{c.m). - the signature of c.m is consistent with regards to the redefined methods within DAG{m). - there are no two methods c'.m' and c".to" such that they have identical signatures and pre/post conditions. - there is no such method intension INT{ć\m^'') such that the pre/post condition P" and Q" of c".m" are false in any case of calls within the graphs. - all the c.m(P, Q,R,U) in INT{c.m) are checked using MDG{c.m). The principle of the checking is to adopt the Hoare rules [10] to check method correctness and method calling consistency, and to satisfy the matches between the formal and actual arguments (i.e., the substitution of signatures by the input or local attributes for the called methods). The migration rules are to be checked and the following rule is also used to check the signatures: if c.m i c'.m' e MDG{c.m) then Vc'.m' E U,U C ID, S' U {t'} CPUSö{t}ARDQ' - if c.m is deleted, then INT{c.m) should be deleted too. For any other INT"{c" .m"), where c.m, appears either in DAG{m'') or in MDG{c".m"), c.m should be deleted from the DAG{'m") and the MDG{ć\m'"). Then, the consequent deletions are checked in terms of the above mentioned criteria. Given a class c and a method m of that class, a scenario of the method-changing and the behaviour consistency checking is that - a method-changing operation is defined on c.m.; - the consistency checking procedure uses INT{c.m) and other related INTs] - a method-changing operation is rejected if either errors are found or a warning message is issued with the acceptance of the operation; - an audit report on the operation is given. In section 7.2 we develop appropriate data structures for handling method evolution. This is called method scope, and algorithms for consistency checking are provided in section 7.3. 7.2 Method Scope In order to check inconsistency, method intensions which carry the semantics of methods including the changes on the methods are used as the basis of checking. The checking procedure may refer to many different method intensions for deciding consistencies amongst methods. Indeed, a global view on methods is needed to optimise the checking strategies. For that reason, the concept of method scope as the integration of aU method intensions is introduced. Since there may exist several method intensions such that they aU share the same DAG, DAGs are integrated as a global DAG(M) in the method scope. On the other hand, MDGs are integrated as MDG(ID) because an MDG may be a subgraph of some others. MDG(ID) is not necessarily a connected graph since there may be some methods which do not caU each other at all. The connectivity of MDG(ID) is not an important issue since it is not necessary that a method must call another one within a schema. Definition 6 (Method Scope) The method scope of a method schema is a tuple which consists of integrated graphs and method intensions. It is defined as SCOPE = where, - DAG{M) = [j^^r^DAG{m) - MDGilD) = MDGic.m) - INT = {INT{c.m)\c.m e ID, c e C, m e M} We may treat the SCOPE as a generic class. The algorithm we proposed {algorithml) builds up the instances of SCOPE gradually with the development of a method schema. Initially, the SCOPE is empty. Then, when a method is added to a schema, an INT(c.m)is created in the SCOPE for c.rri. When a method is deleted, the INT(c.m) will be deleted from the INT and the SCOPE is modified accordingly. The operations on the SCOPE should be in transactions and be coupled with consistency checking. When a check fails, the transaction should roU back. Particularly, when - m is added in c, then m is used to search for the DAG{m). If there is no DAG{m) found, then a new DAG{m) is created. Otherwise, the c.m wiU be added into a ĐAG(m) according to the position of c in its inheritance hierarchy. Consequently, DAG{M) is also modified. - m is added in c, since every method is also associated with an MDG, MDG{c.m) will be created and MDG{ID) consequently updated. This process of building MDG{ID) follows a bottom-up pattern, which means that every method must be in existence before it can be called. - c.m is dropped from a schema, the corresponding INT{c.m) including DAG{m) should also be modified or deleted. The maintenance of the method scope is coupled with method-changing operations. The above algorithm is incorporated with the consistency checking algorithms 2, 3, 4, and 5. Considering the complexity of the algorithm, it is in 0(71^) where n is the size of the problem space. By evaluating the size of DAG(M) and MDG(ID), n can be decided as the product of the number of classes multiplied by the average number of methods and then multiplied by the average number of attributes of a method. Since the algorithms discussed in this section are all basically search algorithms based on directed graphs, the complexity analysis for the each of the following algorithms is exactly same as O(n^). 7.3 Semantic Consistency Checking Unexpected behaviours of methods are the activities performed by methods which are not specified by the method declarations. The update of one method may cause changes in the behaviour of other methods. It is crucial to the checking of behavioural consistency that each method is guarded by its signature and its pre/post conditions. However, since a method may call other methods via a class inheritance hierarchy or just from other classes, the relationships amongst methods need to be constructed. In order to achieve that, the following information about a method is needed: — the signature, - the pre/post conditions, - the local attributes that may be passed as actual arguments to the called methods in the MDG, and — the identifiers of methods called. To ensure behavioural consistency, a two-level checking strategy is proposed. The first level of checking concerns the prevention of run-time type errors and side effects. This is performed mainly on D AGs. The second level of checking relates to the unexpected and redundant behaviours of methods. This is performed mainly on MDGs. These levels are complete in the sense that they involve all possible situations of the consistency. Here we describe these two levels in terms of the method-changing operations. 7.3.1 Method Insertion When a method m is inserted in a class c, the two levels of checking will be carried out. For the first level, INT(c.m) is created and used to check the consistency of the names and signatures of m. We informally describe the algorithm for this level, i.e. algorithm 2. This algorithm checks the following conditions: % Algorithm 2 (Level-1 Insertion Consistency Checking on DAGs) % 1. c exists in the schema; In the case of overriding, methods are distinguished by the method identifiers with different class names. In the case of overloading, methods are distinguished by the method identifier plus the signatures. This implies that an INT(c.m) may represent a group of methods which are overloaded. 2. There is no c'.m—>^c.m which exists in DAG(m) such that c'tcGl. This is to ensure that there is no method redefined from c.m before c.m is inserted. It can be seen that the acyclicity of DAG(m) is also ensured by this condition. 3. The signature of c.m:(5 —> it) is compatible . with those methods in DAG(m). The compatibility of signatures is defined as foUows: Vc.m—^c'.m e DAG(m), such that cfc' and c.m-.{S —^ <), c'.m:(5' —^ t') If 5 = Si X S2 X S3 X ... X Sj and S' = s'j x s'2 x S3 x ... X s'^ then (i) Si-S3 ^ •Sfc di t' for j > k. This definition actually defines the common part of signatures to be checked. The proposed definition is composed of two rules, namely, the covariance subtype rule and the contravariance subtype rule. We check these two rules as one case because we are doing two-level checking and any violation on subtyping or mismatch on signatures wiU be checked later. Particularly, for the covariance subtype rule, which is regarded as situations (i) and (ii) above (i.e., S :< S'), the match with s^.,.1, will be checked later at the second level where the MDG(c.m) is checked. For the contravariance subtype rule which is regarded as situations (i) and (iii) above (i.e., S' :< S), the match with Sk+i,...,Sj wiU be checked later at the second level where the MDG(c.m) is checked. The advantages here are that it allows arbitrary changes to be applied on the signatures of a redefined method. Especially in the case of multiple inheritance, a method may be redefined from that of more than one superclass. In this case, the signature of the redefined method may only match with the common part of that of its superclasses and the non-common part is left to be checked later within the method-calling chains. This means that the violation of either the covariance subtype rule or the con-travariance subtype rule in the non-common part of the signatures is not a problem until the method is to be called. The bottom line is to support the overriding, overloading, and later binding semantics. A method can be called by many different methods in many different situations (e.g., a method may be used to replace one which is to be deleted in the DAG, i.e., method migration). 4. There is no redundant method in SCOPE. For a given INT(c.m), any INT(c'.m') in SCOPE is checked in order to find: - identical structures of DAG(m) with DAG(m'), or MDG(m) with MDG(m'); - identical signatures of c.m : (S —> t) with that of c'.m' : (5' —^ t'), i.e., S = S' and t — t', and identical pre/post conditions P = P',Q = Q'. The insertion of a redundant method wiU be warned. For the second level of consistency checking, algorithm 3 is developed to check the implementation part of the method INT(c.m). These are the different steps of the algorithm: % Algorithm 3 (Level-2 Insertion Consistency Checking on MDGs) % 1. For every c'.m' involved in MDG(c.m), INT(c'.m') is defined. 2. For any c".m"ic'.m' e MDG(c.m), with the method intension INT(c".m") and INT(c'.m') respectively, ensure that S'U{f }U P" U U {f} and check if R" D Q'. Then, method migration rules (section 7.1) are checked. After that, Hoare inductive assertion techniques [10] should be applied here to prove the partial correctness of the method intensions in INT. This is to ensure that aU post-conditions in a method-calling chain are true, derived from the signatures and preconditions of calling methods. The violation of the covariance subtype rule and the con-travariance subtype rule is dealt with here. 7.3.2 Method Deletion When the deletion of a method m from class c is requested, as for the insertion of a method, the two-level checking is carried out. At the first level of checking, we develop an algorithm [algorithm 4) that checks the following aspects: % Algorithm 4 (Level-1 Deletion Consistency Checking on DAGs) % 1. c.m should be deleted from DAG(m); 2. either drops or reconnects the rest of nodes in the graph. In particular, aU the successors of c.m may need to be reconnected with the predecessor(s) of c.m; 3. checks the compatibility of signatures between the reconnected nodes. At the second level, we develop an algorithm (algorithm 5) that modifies the MDG according to the methods which call c.m or are called by c.m. Since MDG(c.m) disappears with the deletion of INT(c.m), MDG(ID) needs to be modified. Algorithm 5 ensures that aU references to c.m in MDG(ID) should be modified. This occurs in two steps: (1) the modification of the caUer methods of c.m, and (2) the modification of the called methods of c.m. The following details the different steps of algorithm 5 in which the modifications of the caller methods and the called methods are described. % Algorithm 5 (Level-2 deletion Consistency Checking on MDGs) % A) The Modification of the Callers The modification of the caller methods of c.m regards the set of {INT{c'.m')\c'.m' | c.m,c.m G evolution of methods in object schema Informatica 18 (1994) 257-275 273 c1.m1 c2.m2 A t-' c4.m4 c3.m3 V C5.m5 c1.m1 c2.m2 c3,m3 \ c.m c1.m1 V C4.m4 \ c4.m4 \ ■i / c5.m5 / y C2.m2 cS.mS - \ c5.m5 Figure 7: Method Deletion ID,c'.m' £ ID}. There are two cases for modifying caller methods. In the first case, the c.m may be replaced by a method in the DAG(m). If this is not possible, then the alternative is to try the MDG(c.m) to determine a method to replace c.m. We give details as follows. 1. If c.m is not a root node of MDG(ID), aU methods previously calling c.m now have to call the successor of c.m in the DAG(m) according to c in the inheritance path. This implies that the signatures of all the callers of c.m have to be checked against that of the successor of c.m in DAG(m). Assume for example that Temporary and Temporary-Lecturer are classes where Temporary-Lecturer is a subclass of Lecturer and Temporary. In this case if Temporary-Lecturer.print is deleted, then Lecturer.allocation may have to call Lecturer.print or Temporary.print according to the checking of the signature compatibility and the resolution of the multiple inheritance conflict. 2. After the deletion of c.m, if no replacement of c.m can be found from DAG(m) for the callers of c.m, the method intensions of all the callers of c.m wiU have two choices: (a) the reference (caU) to c.m is replaced by a successor of c.m in MDG(c.m), or (b) the reference to c.m is deleted from the caller's method intension. This decision is made through a user's interaction or by a predefined constraint. Otherwise, the deletion should be re- jected. Assume a method c.m is to be deleted and there is no substitution that can be found from DAG(m), then a situation may be left for the user to make a decision (see Figure 7). Figure 7 can be read as that: if c.m is deleted, then the modification on the MDG is arbitrary, decided by a user or by some predefined constraints. B) Modification of the Called Now, let us consider the methods that are called by c.m, i.e. c.m is a root node or has successors in the MDG(ID). The modification on the called method of c.m regards the set of {INT{c'.m')\c.m j ć.m',c.m 6 ID,c'.m' e ID}. There are three ways to maintain the method scope: cascade, nullified, and restricted. 1. Cascade. If there is no other caUer to the successors of c.m, then aU the method intensions of c.m successors wiU be deleted together with INT(c.m) unless there are some predefined constraints preventing this (e.g., if c.mjc'.m' and c^^c' (c.mGiD, c'.m'elD, cgC, c'gC) then c'.m' cannot be deleted). In this case, a recursive checking for aU successors will be carried out. 2. Nullified. Only INT(c.m) is deleted and the set {INT{c'.m')\c'.m' i c.m,c.m G ID, c'.m' e ID} is modified with c'.m' | c.m being deleted. This leaves the set {INT{c'.m')\c.m i c'.m', c.m g ID, c'.m' g ID} untouched. 3. Restricted. The deletion of c.m wiU be rejected if there is any caUer of c.m. In this case, the effect on the method scope is minimum. Again, the decision of restricted deletion is made through an interactive session with a user or it is predefined as a constraint. 7.3.3 Method Modification Method modification is a transaction of deleting and inserting operations. Thus the checking of the behaviour consistency can be different from that of insertion or deletion only operations. Within a transaction, a method is virtually deleted and then inserted with changed parts. The consistency checking is therefore made in two steps. In the first step, the checking is made on deletion: the reaction of the checking will be suppressed. In the second step, the checking on insertion is made and the reaction of the checking is also suppressed. When the transaction is completed, aü suppressed checking reactions then will be synthesised. A method-changing transaction may roU back if the following conditions are not satisfied: - DAGs are stiU acyclical. - There is no redundant method. - To prevent the run-time errors and side-effects, signatures of methods in method definition DAGs are checked for compatibility. - MDGs involved in the transaction are examined in terms of the linkages between pre/post conditions and there is no false returned. 8 Conclusion In this paper we addressed the development of a framework for the evolution methods in object-oriented databases. This involves - the definition of the set of update operations; - the development of appropriate structures to represent the semantic of methods; and - an approach and algorithms for consistency checking. Data structures, such as DAG and MDG, that model the semantics of inheritance and calling relationships are proposed. These structures are used to check the consistency of a schema after method updates. However our approach lacks a better incorporation with the transaction model. In our approach, if an update of a method occurs and causes some inconsistent database states, then we automatically roll back the transaction. We think that a better facility, such as compensation transactions, wiU improve our approach and put it in a real development environment. Acknowledgment The authors gratefully acknowledge the comments provided by the anonymous referees which helped improving the readability of the paper. References [1] Abiteboul, S., Kanellakis, P., and Waller, E.: Method Schemas. Proc. of the ACM SIGACT-SIGMOD-SIGART Symp. on Principles of Database Systems, Nashville, pp. 16-27, 1990. [2] Agrawal, R., DeMichiel, L.G., and Lindasay, B.G.: Static Type checking of Multi-Methods. Proc. of the Int. Conf. on Object-Oriented Programming Systems, Languages and Applications, pp. 113-128, 1991. [3] Atkinson M., Bancilhon F., et al: The Object-Oriented Database System Manifesto. Proc. of the Int. Conf. on Deductive and Object-Oriented Databases, Kyoto, pp. 40-56, 1989. [4] Banerjee J.W., Kim W., Kim H-J. and Korth H.F.: Semantics and Implementation of Schema Evolution in Object-Oriented Databases. Proc. of the Int. Conf. on Management of Data, San Francisco, pp. 311-322, 1987. [5] Booch, G.: Object-Oriented Analysis and Design With Applicants. 2nd Edition, Addison-Wesley, 1994. [6] Canniing P.S., Cook W.R., Hill W.L., and Olthoff W.G. Interfaces for Strongly-Typed Object-Oriented Programming, Proc. Object-Oriented Programming: Systems, Languages and Applications, pp. 457-467, 1989. [7] Cardelli, L.: A Semantics of Multiple Inheritance. Information and Computation, Vol. 76, Academic Press, pp. 138-164, 1988. [8] Clamen, S.: Schema Evolution Support Summary. Electronic news (group OBJECTS) in computer network, posted by clamen-t-cs.CMU.EDU, 1992. [9] Coen-Porisini, A., Lavazza, L. and Zicari, R.: The ESSE Project: An Overview. Proc. of the Far East Workshop on Future Database Systems, Kyoto, Vol. 3, pp. 28-37, 1992. [10] Hoare, C.A.R.: An Axiomatic Basis for Computer Programming. Communications of the ACM, Vol. 12, pp. 576-583, Oct. 1969. [11] Hull, R., Katsumi, T., and Yoshikawa, M.: Behavior Analysis of Object- Oriented Databases: Method Structure, Execution Trees, and Reachability. Proc. of the FODO, Lecture Notes in Computer Science, No. 367, Springer-Verlag, pp. 3 72388, 1989. [12] Meyer B., Ensuring Strong Typing in an 0-0 Language. Proc. of the Int. Conf. on Object-Oriented Programming Systems, Languages and Applications, 1992. [13] Monk, S. and Sommerville, I.: Schema Evolution in ooDBs Using Class Versioning. SIGMOD Record, 22(3), pp. 16-22, 1993. [14] Penney, D.J. and Stein, J.: Class Modification in the Gemstone Object-Oriented DBMS. Proc. of the Int. Conf. on Object-Oriented Programming Systems, Languages and Applications, Florida, pp. 111-117, 1987. [15] Roddick, J.F.: Schema Evolution in Database Systems - An Annotated Bibliography. SIGMOD Record, 21(4), 1992. [16] Skarra A.H. and Zdonik S.B.: Type Evolution in an Object-Oriented Database. In Research Directions in Object-Oriented Programming, B. Shriver and P. Wegner (eds), MIT Press, 1987. [17] Waller E.: Schema Updates and Consistency. Proc. of the Int. Conf. Deductive and Object-Oriented Databases, 1991. [18] Zicari, R.: A Framework for Schema Updates in an Object-Oriented Database System. Chapter 7, in Building an Object-Oriented Database System: the Story of 02, Bancilhon F., Delobel C., Kenel-lakis P., eds), Morgan Kaufmann, 1991. INFORMATIONAL BEING-OF Anton P. Železnikar Volaričeva ulica 8, 61111 Ljubljana, Slovenia a.p.zeleznikarSij s.si Keywords: axioms, Being-of, functional composition and decomposition, function, includedness, informational, informational frame and gestalt, metaphysical gestalts, metaphysicalism, nested functional forms; serial, parallel, circular and metaphysical functionality Edited by: V. Fomichov Received: April 5, 1994 Revised: August 5, 1994 Accepted: September 5, 1994 Informational Being-of is another fundamental informational concept of functionality in comparison with the informational includedness studied in [9]. It has its formal-theoretical informational structure which is recursive, circular and spontaneous. Informational Being-of can be studied in many aspects from which we chose basic axioms concerning informational functionality, informational interpretations of formula (p |=of ol, and phenomena of serial, parallel, circular informational functionality. Some advanced problems of decomposition (destruction) and composition (construction) concerning informational functionality are treated. At the end, informational functionality of metaphysical cycles impacted by an exterior entity is studied and the so-called metaphysical gestalts concerning the informational Being-of are introduced. Informational gestalts reveal several problems of informational formula structuring, functional interdependence and the like. 1 Introduction Informational Being-of is the original term coined in this paper^. In the common speech we say that something is of something or is something's something, for example, also in the context, to be a property (a definite something) of something, an information of information, in general, etc. Further meaning can be deduced from that which is comprehended as formal functionality (being à function of something), where the function of the function's argument is coming into presence. Informational Being-of concerns the so-caUed functionality of informational entities, that is explicit and implicit formulas of the kind (p{a) and (p |=of a, respectively, being informational operands within the informational theory [4]. Informational function is the basic and one of the most powerful informational concepts which enables the active informational role of an entity ^This paper is a private author's work and no part of it may be used, reproduced or translated in any manner whatsoever without written permission except in the case of brief quotations embodied in critical articles. upon another (passive or active) entity. In this respect, formula (p{a) has to be informationaUy determined in an informationaUy recursive and arising manner, as a regularly informing operand. Expression (p{a) symbolizes a system of inforina-tional formulas in which several operands can depend on argument a, for instance in the filled metaphysical shell belonging to an entity. Definition 1 is the basic concept determining the informational function as a fundamental item of the informational theory. In the last consequence (in a concrete situation), formulas depending on a are explained (informationaUy interpreted) by formulas, in which a appears explicitly as an operand and not in the functional form (operand-operator-parenthesis formula). Informational Being-of is in several informational-theoretical aspects a paraUel and complementary construction to informational Being-in [9]. It can in several respects be more complex than informational Being-in. Both give to the informational theory the power of extremely complex functionality where active, passive, and active-passive entities can be distinguished in a transparent way. The question is, for example, what kind of functionality does the informational includedness (which is a synonym for informational Being-in) perform. Thus, informational includedness can also be studied from this point of view, that is, to be understood as a particular case of the informational Being-of. We will show how such a view is righteous and has its roots in the ba^ic philosophy of an arising informational theory. 2 An Initial Philosophy of Informational Being-of To be something of something pertains to the meaning of the of as described, for example, in [10], where it is written, among other meanings, that from its original sense, of was used in the expression of the notions of removal, separation, privation, derivation, origin, starting-point, spring of action, cause, agent, instrument, material, and other senses, which involve the notion of talking, coming, arising, or resulting from. Of means from, away from; also down of, up of, and off of when following an adverb, with which it is sometimes closely connected. Of indicates a point of time from which something begins or proceeds, the emergence out of which something is formed. Of is used in certain phrases, which particularize the meaning, as within of, wide of, back of, backwards of, etc. It expresses a property, possession or appurtenance. As an informational operator, the Of is connected with verbs (operational compositions, see [7]), e.g., to recover, deliver, empty, free, rid of, etc. The Of introduces that {(p as a function) which is removed (deduced, inferred) from something (a as a functional argument). There are functional relations (informational transitions) between the maker (argument a and the impacting environment), its making (informing) and the made (function (p as (p{a)). The Of expresses racial or local origin, descent, etc. after the verbs arise, be, come, descend, spring, be born, bred, propagated, and the like. In informational language, we already use informational arising of, being of something and, further, coming into existence of and from, etc. The Of connects notions of origin (cause, maker, generator) and consequence (the made, result), where is a consequence of a (and, possibly, other entities) in ).c ipj and, for the second informational includedness, according to [9], ^ Nof (a 1= (p)\ ^ {a 1= |=of )) Cof Vi , ((a 1= (p) |=of ¥>) Cof f ) ((y) (=of a) C |=of ((a(/3) h V) Cof V) - • («(/?) 1= substitute would be 0i(02(03(- • •(a„_2(o„_i)) • • •))); Ol(a2(Q!3(---(o„_2)---))); 01(02(03)) Consequence 6 [A Parallel Functional Dependence] A function a can simultaneously (in parallel) depend on more than only one operand. This parallelism of dependence on several operands can be expressed as which proves the adequacy of the introduced parallel functional expression. □ Informational parallelism and informational functionality are informationaUy dependent phenomena, which interfere with each other. Functions all(oi,02,03,• • •,o„_2,o„_i,a„) and a(oi, 02, • • •, a„) (Definition 11 and Consequence 6, respectively) are essentially different functional structures (functionalities). Consequence 7 [Informational Parallelism and Functionality] The beginning question is, what is the difference between the regular functional expression 0(01,02, •••, o„) and expression o(oi; 02; • • ■ ; On) where commas have been replaced by semicolons. The comma system (oi, 02, • • • ,On) is a system of separated entities 0i,02,---,0n which may or may not cooperate (inform among each other). The semicolon system (oi; 02; • • • ; On) is a characteristic parallel system in which semicolons are nothing else than parallel informational operators (e.g., ||=j. The meaning is (01102; • • SOn) { o,- 11= (Xj] [i,j = l,2,--,nj Operator ||= has the meaning "informs in parallel with". "In parallel" means simultaneously, dependently or independently, spontaneously, circularly, particularly, etc. For instance, for the meaning of the last formula there are the following three alternatives: ,n ( ai 11= Oj; \ i j\ \i,j= 1,2, '^cci ^ a,-; ii^r, = 1,2,- (■{c^i h («i ".); \i,j = 1,2,-^ai 1= Q!j; ij^r, V V w parallel independence partly parallel dependence/ independence a complete parallel dependence ((v 1= (y Nof «)) c "2(a), • • •, an(a)) - ($( (p{a) $part(a;i(a), «2(0), • • •, On)) (p{a) $)) An inverse particular circular informational function, v^partC*^)) is a functional informational system, that is, V?art(a) - ¥'(«) ^pL or ¥'part(a)*(ai(")> "2(0:), • • •, an{a)) ^ a„(a)) (p{a) $)) Subscript 'part' marks a particular case of frame ^part or (^-Q-i numerical index, seman- tic designator, particular symbol, etc.) in which operands «1(0;), a2{a), •••, «„(a) occur in the sequence written. □ Consequence 9 [Circular Informational Shell and Function] According to the previous definition, a right-circular and left-circular functional shells ip^ and (p^ are respectively. For a function (p{a), the circular forms are ^ (p^ia) ^ (š(p{a) where ^