Volume 22, Number 3, October 1998, ISSN 0350-5596

Informatica
An International Journal of Computing and Informatics

Special Issue: Parallel Computing With Optical Interconnections

The Slovene Society Informatika, Ljubljana, Slovenia

Basic information about Informatica and back issues may be FTP'd from ftp.arnes.si in magazines/informatica (ID: anonymous). The FTP archive may also be accessed with WWW (world-wide web) clients at the URL http://turing.ijs.si/Mezi/informatica.htm

Subscription Information: Informatica (ISSN 0350-5596) is published four times a year in Spring, Summer, Autumn, and Winter (4 issues per year) by the Slovene Society Informatika, Vožarski pot 12, 1000 Ljubljana, Slovenia. The subscription rate for 1998 (Volume 22) is DEM 50 (US$ 35) for institutions, DEM 25 (US$ 17) for individuals, and DEM 10 (US$ 7) for students, plus a mail charge of DEM 10 (US$ 7). Claims for missing issues will be honored free of charge within six months after the publication date of the issue.

Technical Support: Borut Žnidar, Kranj, Slovenia. Lectorship: Fergus F. Smith, AMIDAS d.o.o., Cankarjevo nabrežje 11, Ljubljana, Slovenia. Printed by Biro M, d.o.o., Žibertova 1, 1000 Ljubljana, Slovenia.

Orders for subscription may be placed by telephone or fax using any major credit card. Please call Mr. R. Murn, Jožef Stefan Institute: Tel (+386) 61 1773 900, Fax (+386) 61 219 385, or use the bank account number 900-27620-5159/4 Ljubljanska banka d.d. Slovenia (LB 50101-678-51841 for domestic subscribers only). According to the opinion of the Ministry for Informing (number 23/216-92 of March 27, 1992), the scientific journal Informatica is a product of informative matter (point 13 of tariff number 3), for which the turnover tax amounts to 5%.

Informatica is published in cooperation with the following societies (and contact persons): Robotics Society of Slovenia (Jadran Lenarčič); Slovene Society for Pattern Recognition (Franjo Pernuš); Slovenian Artificial Intelligence Society and Cognitive Science Society (Matjaž Gams); Slovenian Society of Mathematicians, Physicists and Astronomers (Bojan Mohar); Automatic Control Society of Slovenia (Borut Zupančič); Slovenian Association of Technical and Natural Sciences (Janez Peklenik).

Informatica is surveyed by: AI and Robotic Abstracts, AI References, ACM Computing Surveys, Applied Science Techn. Index, COMPENDEX*PLUS, Computer ASAP, Computer Literature Index, Cur. Cont. & Comp. & Math. Sear., Current Mathematical Publications, Engineering Index, INSPEC, Mathematical Reviews, MathSci, Sociological Abstracts, Uncover, Zentralblatt für Mathematik, Linguistics and Language Behaviour Abstracts, Cybernetica Newsletter.

The issuing of the Informatica journal is financially supported by the Ministry for Science and Technology, Slovenska 50, 1000 Ljubljana, Slovenia. Postage paid at post office 1102 Ljubljana, Slovenia. Taxe perçue.

Introduction to Parallel Computing with Optical Interconnections

Advances in semiconductor technologies, coupled with progress in parallel processing and network computing, are placing stringent requirements on inter-system and intra-system communications. With advances in silicon and GaAs technologies, processor speed will soon reach the gigahertz (GHz) range. Thus, communication technology is becoming, and will remain, a potential bottleneck in many systems.
This dictates that significant progress needs to be made in the traditional metal-based interconnects, and/or that new interconnect technologies, such as optics, be introduced in these systems. Optical means are now widely used in telecommunication networks and the evolution of optical and optoelectronic technologies tends to show that they could be successfully introduced in shorter distance interconnection systems such as parallel computers. These technologies offer a wide range of techniques that can be used in interconnection systems. But introducing optics in interconnect systems also means that specific problems have yet to be solved while some unique features of the technology must be taken into account in order to design optimal systems. Such problems and features include device characteristics, network topologies, packaging issues, compatibility with silicon processors, system level modeling, algorithm design, etc. Papers in this special issue were selected to address the potential for using optical interconnections in massively parallel processing systems, and their efi^ect on system and algorithm design. Optics offer many benefits for interconnecting large numbers of processing elements, but may require us to rethink how we build parallel computer systems and communication networks, and how we write algorithms. Fully exploring the capabilities of optical interconnection networks requires an interdisciplinary effort. We hope this special issue serves the reader in this respect. Through rigorous reviews, six papers were chosen from a pool of papers submitted to this special issue. This is reflected in the high quality of the papers accepted. In this issue, the papers by Middendorf and ElGindyi and Sahni and Wang are published. In the next issue we expect to publish the remaining four papers. We wish the reader enjoy reading the papers in this special issue and the following one and find useful information there. Middendorf and ElGindy present an algorithm for matrix multiplication on an array of processors with optical pipelined row and column buses (APPB). They show that two n xn matrices A and B with elements that can be represented by 0{w) bits and where the number of nonzero elements of B is at most kß ■ n, 1 < kß logn we divide the w bits of S into at most j = sequences of consecutive bits. The bits of each sequence are send to the processors of a single column. Similar as before we compute a binary number for each sequence and multiply the jth number with 2^-1)j e [1 : j^^^]. Then the sum of the values obtained is computed in timeO(logj^). Time: 0(log j^). We obtain the following lemma. Lemma 2 Addition of n? w-bit numbers on an nxn APPB where each processor holds one of the numbers can be done with algorithm ADD in time 0(logiu .-l-log*n). Let an n X n matrix B be stored in an n x n APPB such that processor Pij holds element òy of B. The algorithm NUMBER-NONZEROS below labels the nonzero elements of B in column-major order. The algorithm has a running time of 0(log* n). Let f{bij) be the corresponding number of nonzero element bij, i, j e [1 : n]. Algorithm NUMBER-NONZEROS consists of three steps. In the first step the nonzero elements in each single column are numbered if {bij} is the corresponding number of nonzero element bij). Then, in the second step the prefix sums over the numbers of nonzero elements in the columns are determined. 
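Returning for a moment to algorithm ADD: its core is the identity that the sum of a collection of w-bit numbers equals, over all bit positions j, 2^j times the number of operands whose jth bit is set; the prefix-sum steps over the buses compute exactly these per-position counts in parallel. A short sequential sketch in plain Python (an illustration of the identity only, not the APPB algorithm) is:

```python
import random

def add_by_bit_counts(values, w):
    """Sequential check of the identity behind algorithm ADD:
    sum(values) == sum over bit positions j of 2**j * (# of values whose bit j is set).
    """
    total = 0
    for j in range(w):
        ones_in_position_j = sum((v >> j) & 1 for v in values)  # the per-bit count ADD obtains via prefix sums
        total += ones_in_position_j << j                         # weight the count by 2**j
    return total

nums = [random.randrange(1 << 16) for _ in range(64)]
assert add_by_bit_counts(nums, 16) == sum(nums)
```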
Finally, each processor Pij with a nonzero element bij computes f{bij) from fibij) and the total number of nonzero elements in columns 1 to i - 1. Algorithm NUMBER-NONZEROS(B): 1. Number the nonzero elements of B in each column (Hint: Set a flag in each processor with a nonzero element and compute prefix sums over the flags in each column). Let /'^(òy) be the number of nonzero element bij and let /f be the total number of nonzero elements in column i. Time: 0(1). 2. Compute the prefix sums ovGr f^^ /2, • • • ? f^ ^ follows. Let /j^i^.j = /f+/I+.. for i G [1 : n]. (a) Send ff to all processors in the first logn processors of column i, i € [1 : n]. Time: 0(1). (b) For j e [1 : log n] determine the prefix sums over the jth bit of the /?'s in row j and multiply the results with Time: 0(1). (c) For each column i, i > 1 compute the sum of the values obtained in step (2.b) in an logn X logn submesh to obtain /[j.jj. This is done as in step (3) of algorithm ADD but with the difference that, sums over bits are now computed only in the rows of each submesh (Technical details are left to the reader). Finally, for each i £ [2 : n] a binary number is formed of the bits of each sum /jj.^j that are spread over logn processors similarly as in step (4.i) of algorithm ADD. Time: 0(log*n). 3. For 2 € [1 : n - 1] send to all processors in column i + 1. Then each processor Pij with a nonzero element determines /(&jj) = /[i.j_i] + r{bij) if ž 6 [2 : n] and /(6<,) = /^(öy) ìi i = 1. Time: 0(1). Lemma 3 The nonzero elements of an nx n matrix stored in annxn APPB can be numbered in column-major order with algorithm NUMBER-NONZEROS in time 0(log* n). Finally we describe a technique for labelling the nonzero elements within each submatrix of a partitioned matrix. First we introduce the following definitions. For an ž e [1 : n] let Pij^,Pij2, ■ ■ ■ 1 < il < J2 < • • ■ < im < be the processors in row i of the APPB that store a nonzero number in their register X. To compress the contents of the registers X of the processors in row i to the left means that processor Pij^ sends the nonzero number stored in its register X to processor Pi^h- Then, each processor that has received a number stores it in its register X and the other processors set their registers X to zero. Clearly, it is possible to compress the contents of the registers X of the processors in all rows of the APPB to the left in time 0(1). Assume that matrix B can be partitioned into kß n X kt submatrices ßt, ^ € [1 : kß], such that Bt is of size n X kt, ki + k2 + ... + kk^ = n, and each submatrix Bt contains exactly n nonzero elements. The algorithm NUMBER-NONZEROS-BLOCKWISE below labels the nonzero elements of each submatrix Bt in row-major order. The algorithm has a running time of 0(max{A;B,log* n}). Let g{bij) be the corresponding number of nonzero element bij. Algorithm NUMBER-NONZEROS-BLOCKWISE consists of three steps. In the first step the parameter kß is determined and the matrices Bt, t 6 [1 : kß] are identified. In the second step the number of nonzero elements in each row i, i G [1 : n] of every matrix Bt, t e [1 : fcß] is determined. Further, the nonzero elements in each row of every matrix Bt are numbered from left to right. Prefix sums over the values Qi, i e [1 : n] are computed for all f G [1 : kß] in the third step. Then, using this prefix sums and the number of each nonzero element bij in its row of matrix Bt the values g{bij) are computed. Algorithm NUMBER-NONZEROS-BLOCKWISE(B) : 1. 
Number the nonzero elements of B in column major order with algorithm NUMBER-NONZEROS. Each nonzero element bij with (t - l)n -I- 1 < f{bij) < tn belongs to submatrix Bt, t € [1 : kß]. Time: 0(log*n). 2. For each t £ [1 : kß] number in each row of the mesh the nonzero elements of Bt as described next. Let g^ihj) be the corresponding number of nonzero element bij of Bt and let gl be the total number of nonzero elements bij of Bt in row i. (a) In each row compress the numbers fihj) obtained in (1) to the left. Time: 0(1). (b) Each processor that has received in the last step a number /(by) with {t - l)n + 1 < f{hj) < tn, t E. [1 : n] examines whether its right neighbour has also received a number with a value between {t - l)n -h 1 and tn. If this is not the case or if there is no right neighbour that has received a nonzero value then the processor sends its column index to processor t £ [1 : kß] if it has a value f{bij) with (f-l)n-l-l < f{bij) < tn. Then, repeat the same but with "left" instead of "right". Time: 0(1). (c) For z e [1 : n], t e [1 : kß]'- If processor Pi,f has received two column indices x\, y\, < y\ in the last step it determines g\ = y\ — x\ + l from these values and otherwise sets g\ = 0. Time: 0(1). (d) For i e [1 : n], t E [I : kg - 1]: If gj > 0 processor sends x\ to all processors in its row containing a nonzero element bij of Bt. Time: 0(1). (e) Number the nonzero elements in each row from left to right and let g*{bij) be the number of element bij. Then each processor Pi J with a nonzero element bij computes 9\bij) =g*(bij) ~xl + l. Time: 0(1). Time: 0(1). 3. FOR / = 1 TO DO n + 1 (a) For each t E [(I - min{A;B,/j^^}] compute the prefix sums dfi-.i] values J G [1 : n] in an n X log^ n submesh: i. Compute in an n X logn submesh prefix sums over the jth bits, j € [1 : logn] of the gf's, « S [1 : n] and multiply the result for the jth bit with Then add the obtained values for each i e [1 : n] in an logn x logn submesh similarly as in step (3) of algorithm ADD. The bits of each result are spread over log n processors. The corresponding binary values are formed in the next step. Time: 0(log*n). ii. FOR t = {I - 1) log^ n 1 TO mm{kB,l^}DO Form a binary number of the bits of each sum pfj.jj, « E [1 : n] that are spread over logn processors similarly as in step (4) of algorithm ADD in constant time. Time: Oimm{kB,^}). Time: 0(max{A;j3,log* n}). (b) For each i 6 [1 : n - 1], f S [1 : fcß] send to all processors in row i -I-1 that have a nonzero element b^j of Bt- Then, for each i € [1 : n], t e [1 : feß] a processor that has a nonzero element of Bt computes g(bij) = 9li:i-i]+9\bij) if i > 1 and g{bij) = g^{bij) Time: 0(1). Time: 0(max{A;ß,log* n}). Lemma 4 Let kß n x kt matrices Bt, t € [1 : kß], S [1 : n] each containing at most n nonzero elements are stored in disjoint suhmeshes of an n x n APPB. Then for all matrices Bi,B2,... ,BkB their nonzero elements can be numbered in row-major order with algorithm NUMBER-NONZEROS-BLOCKWISE m time 0(max{A;ß,log*n}). 4 Matrix Multiplication Algorithm In this section we describe algorithm MATR-MULT that multiplies two n x n matrices A and B with elements that can be represented by w bits and where B has at most kß ■ n nonzero elements, 1 < kß < n on an n X n APPB in time 0(A;j5 -l-max{logui, log logn}). The idea of the algorithm is as follows. Matrix B is partitioned into n x kt submatrices Bt, t € [1 : y], ki+k2 + .. ■+ky = n such that each submatrix contains at most n nonzero elements. Clearly we can choose y < Ikß. 
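To make the block structure concrete, here is a small sequential sketch (an illustration only; the paper itself derives the blocks from the column-major numbering f computed on the APPB) of how consecutive columns of B can be grouped greedily so that every block holds at most n nonzero elements. Because a block is closed only when the next column would push it over n nonzeros, roughly 2·k_B blocks always suffice when B has at most k_B·n nonzeros.

```python
def partition_into_blocks(col_nnz, n):
    """Group consecutive columns of B into blocks B_1, ..., B_y (widths k_1, ..., k_y)
    so that each block contains at most n nonzero elements.

    col_nnz[j] is the number of nonzeros in column j (each is at most n).
    Returns the list of block widths; their sum equals the number of columns.
    """
    widths = []
    width = nnz = 0
    for c in col_nnz:
        if width > 0 and nnz + c > n:   # adding this column would overflow the current block
            widths.append(width)
            width = nnz = 0
        width += 1
        nnz += c
    if width > 0:
        widths.append(width)
    return widths

# Example: n = 8 and columns holding 3, 3, 3, 1, 8, 2, 2, 2 nonzeros (24 = 3n in total).
print(partition_into_blocks([3, 3, 3, 1, 8, 2, 2, 2], 8))   # -> [2, 2, 1, 3], i.e. y = 4 <= 2*k_B
```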
For ease of description we assume y = kB. A is multiplied with B such that the computation of the products of the elements of A with the elements on Bt is done before we multiply A with Bt+i, i G [1 : fcß — 1]. The computation of the elements of the product matrix C = A X B from these products is done in a pipelined manner, i.e. after computing the products of the elements of A with the elements of Bt we add only so many partial sums of the 0-elements of Ax Bi, A X B2, ..., A X Bt 3tS necessary to obtain free space for the computation of the products of elements of A with elements of Bt+i, t E [1 : kß -1]. Finally, we add all the remaining partial sums necessary to compute the O-elements. In the following algorithm we assume logn < w < Otherwise, easy modifications of algorithm MATR-MULT will show that A and B can be multiplied in time 0{kß -H max{logu;, log logn}). We now give a more detailed outline of algorithm MATR-MULT. In the first two steps the parameter kß is determined and the submatrices Bt, t E [1 : kß] are identified. Also the numbers gj and g'j..^ are computed for f e [1 : kß], i e [1 : n]. Further, for each nonzero element bij of B the numbers fibij) and g(6y ) are computed. In the third step for f = 1 to kß the products of elements of A with the elements of Bt are computed. This is done such that in row Z e [1 : n] all products of the elements of row I of A with elements of Bt are computed. Observe, that these products are already in their final row. The products are formed by sending each nonzero element h j of Bt to all processors in column g{bij). Then each nonzero element an of A is send in its row to the processors with column indices + 9\ui-i] + 2, • • • , 9[i.i-.i] + 9i (these processors contain all nonzero elements from row I of Bt) and multiplied with the local nonzero element of Bt- To make the computation of the C-elements from the products easy the products are rordered in each row such that products that belong to the same target C-element are in neighbored processors. Since each processor has only a constant number of registers we will need to free registers before we can compute products of nonzero elements of A with nonzero elements of Bt+i- Therefore we form partial sums from some products that belong to the same C-element and route computed C-elements to their final processor in the row (the same is done with older partial sums for C-elements emerging from A x Si, AxB2,..., AxBt-i). We do this in such a way that we never use more than two registers per processor to store the products and partial sums for the C-elements. Note, that we do not compute all C-elements of Ax Bt before we start computing A X Bt+i since this may take too much time (it C£m happen that n products belong to the same C-element). The remaining C-elements that have not been computed in step 3 are computed in the final step 4. This is done in two phases. In the first phase we sum so many partial sums that there remain in each row at most ^ partial sums corresponding to at most C-elements where a = max{u;,logn}. In the second phase we first compute the sum of the jth bits of all remaining partial sums that correspond to the same C-element, j € [1 : t/;]. Then, from the obtained w sums with at most logn bits each we compute the corresponding C-element in an a x a submesh and send the C-element to its final processor. Algorithm MATR-MULT(A, B): 1. Number the nonzero elements of B in column-major order with algorithm NUMBER-NONZEROS and let f{bij) be the number of nonzero element bij. 
Let Bt be the submatrlx of B that contains the nonzero elements with (^ - l)n -M < fibij) 1 and in the register F if i = 1. Time: 0(1). (e) IF t > 1 THEN In every row of the mesh add the contents of some registers X with the same target C-element, send computed C-elements to their final destination, and compress the registers X to the left such that afterwards the registers X of processors in the right half of the mesh are all zero: i. Each processor with an uneven column index sends the content of its register X to its right neighbour. There it is added to the content of the register X if both values have the same target C-element. If that was the case the register X of the sender is the set to zero. Time: 0(1). ii. Same as last step but now with "even" and "left" instead of "uneven" and "right". Time: 0(1). iii. Send C-elements that have been computed to their final destination in the rows. Time: 0(1). iv. In every row compress the contents of the registers X to the left. Observe, that afterwards all registers X in the right half of the mesh are zero. Time: 0(1). Time: 0(1). (f) IF t>l THEN Do the same as in step (3.e) but now with the registers Y instead of the registers X. Afterwards all registers Y in the right half of the mesh are zero. Then send the contents of the register X of each processor P^ with « e [1 : n], j < ^ to processor where it is stored in the register Y and set the register X of Pij to zero. Now, all registers X are zero. Time: 0(1). Time: O(fcß). 4. Do all the remaining additions necessary to compute the rest of the C-elements and send them to their final destinations, as follows. Let a = max{u;,logn}. (a) Partition the rows of the mesh in subar-rays of length and add all partial sums (stored in the registers Y) with the same target C7-element in every subarray. Send C-elements that have been computed to their final destination. Notice, that in every row there remain at most ^ partial sums of at most C-elements that still have to be computed. Time: O (log a). (b) Partition the mesh into n a x ^ submeshes (each submesh is used to add the remaining partial sums of one row of the mesh). For each row send the ith remaining partial sum in the row, i G [1 : to all processors in column i of the corresponding a x ^ submesh. Time: 0(1). (c) For each j £ [1 : a] sum in row j of every a x g submesh all the jth bits corresponding to partial sums with the same target 0-element (The details are left to the reader). Time: 0(1). (d) In each otx^ submesh it remains to compute for each of the at most ^ remaining C-elements a sum of at most a numbers with at most logn bits each. This can be done similar to step (3) of algorithm ADD. Each sum is computed in an a x a submesh. Time: 0(log*n). (e) In each a x a submesh form a binary number of the bits of each sum computed in the last step to obtain the corr-esponding O-element. Then, send the computed C-elements to their final destinations. Time: 0(loga!). Time: O(loga) = 0(max{logw, log logn}). Now, we can state the main theorem. Theorem 1 Two n x n matrices A and B with elements that can be represented by 0{w) bits and where the number of nonzero elements of B is at most kß ■ n, 1 < kß log logn. Corollary 2 An nxn matrix A and an n x 1 vector with elements that can be represented by O(logn) bits can be multiplied on an n x n APPB in time 0 (log logn). 5 Conclusion We have described an algorithm for matrix multiplication on arrays of processors with pipelined optical buses (APPB). 
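As a point of reference for Theorem 1, the product C = A x B with a sparse B can be formed sequentially by touching only the nonzero elements of B; the contribution of MATR-MULT is to schedule exactly these k_B·n^2 elementary products and their summation on the n x n APPB within the stated time bound. A plain sequential sketch (not the APPB algorithm) is:

```python
def sparse_b_matmul(A, B_nonzeros, n):
    """C = A * B, with B given by its nonzero entries as (i, j, value) triples.

    A is an n x n matrix (list of lists).  With at most k_B * n nonzeros in B,
    this performs k_B * n * n elementary products, the same products that
    MATR-MULT distributes over the n x n APPB.
    """
    C = [[0] * n for _ in range(n)]
    for i, j, b_ij in B_nonzeros:
        for l in range(n):
            C[l][j] += A[l][i] * b_ij   # a_{l,i} * b_{i,j} contributes to c_{l,j}
    return C
```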
Attention was given to the case that one of the matrices is sparse. As a special case we obtained an algorithm for fast matrix-vector multiplication. Our results improve some results obtained in [7] for the stronger model of a reconfigurable array of processors with optical buses (AROB). It is an interesting question whether our algorithm can be improved when using an AROB instead of an APPB. In general, it is worthwhile to compare the power of the APPB and the AROB for other problems.

References

[1] Beresford-Smith B., Diessel O. & ElGindy H. (1996) Optimal algorithms for constrained reconfigurable meshes. Journal of Parallel and Distributed Computing, 39, 1, p. 74-78.
[2] Guo Z., Melhem R. M., Chiarulli R. W. & Levitan S. P. (1991) Pipelined communication in optically interconnected arrays. Journal of Parallel and Distributed Computing, 12, p. 269-282.
[3] Middendorf M., Schmeck H., Schröder H. & Turner G. Multiplication of matrices with different sparseness properties on dynamically reconfigurable meshes. To be published in VLSI Design.
[4] Miller R., Prasanna-Kumar V. K., Reisis D. I. & Stout Q. F. (1993) Parallel computations on reconfigurable meshes. IEEE Trans. Comput., 42, 6, p. 678-692.
[5] Nakano K. & Wada K. (1995) Integer summing algorithms on reconfigurable meshes. Proceedings of the IEEE First Conf. on Algorithms and Architectures for Parallel Processing (ICA3PP 95), Brisbane, Australia, 19.-21. April, p. 187-196.
[6] Park H., Kim H. J. & Prasanna V. K. (1993) An O(1) time optimal algorithm for multiplying matrices on reconfigurable mesh. Inform. Process. Lett., 47, p. 109-113.
[7] Pavel S. & Akl S. G. (1995) On the power of arrays with reconfigurable optical buses. Technical Report No. 95-374, Dept. of Computing and Information Science, Queen's University, Kingston, Ontario.
[8] Pavel S. & Akl S. G. Matrix operations using arrays with reconfigurable optical buses. To be published in Journal of Parallel Algorithms and Applications.
[9] Qiao C. (1995) Efficient matrix multiplication in a reconfigurable array with spanning optical buses. Proceedings of the Fifth Symp. on the Frontiers of Massively Parallel Computing (Frontiers '95), McLean, Virginia, 6.-9. February, p. 273-279.

BPC Permutations on the OTIS-Hypercube Optoelectronic Computer^

Sartaj Sahni and Chih-Fang Wang
Department of Computer and Information Science and Engineering, University of Florida, Gainesville, FL 32611
Phone: 352-392-1527, Fax: 352-392-1220
{sahni,wang}@cise.ufl.edu

Keywords: OTIS-Hypercube, BPC permutations, diameter
Edited by: Yi Pan
Received: October 30, 1997. Revised: January 20, 1998. Accepted: June 19, 1998.

We show that the diameter of an N^2 processor OTIS-Hypercube computer (N = 2^d) is 2d + 1. OTIS-Hypercube algorithms for some commonly performed permutations - transpose, bit reversal, vector reversal, perfect shuffle, unshuffle, shuffled row-major, and bit shuffle - are developed. We also propose an algorithm for general BPC permutations.

1 Introduction

Electronic interconnects are superior to optical interconnects when the interconnect distance is up to a few millimeters (Feldman et. al. 1988, Kiamilev et. al. 1991). However, for longer interconnects, optics (and in particular, free-space optics) provides power, speed, and bandwidth advantages over electronics. With this in mind, Marsden et. al. (Marsden et. al. 1993), Hendrick et. al. (Hendrick et. al. 1995), and Zane et. al. (Zane et. al.
1996) have proposed a hybrid computer architecture in which the processors are divided into groups; intra-group connects are electronic, and inter-group interconnects are optical. Krishnamoorthy et. al. (Krishnamoorthy et. al. 1992) have demonstrated that bandwidth and power consumption are minimized when the number of processors in a group equals the number of groups. Marsden et. al. (Marsden et. al. 1993) propose a family of optoelectronic architectures in which the number of groups equals the number of processors per group. In this family - the optical transpose interconnection system ( OTIS ) - the inter-group connects ( or optical interconnect ) connect processor p of group g to processor g of group p. The intra-group interconnect ( or electronic interconnect ) can be any of the standard previously studied connection schemes for electronic computers. This strategy gives rise to the OTIS-Mesh, OTIS-Hypercube, OTIS-Perfect shuffle, OTIS-Mesh of trees, and so forth computers. Figure 1 shows a generic 16 processor OTIS computer; only the optical connections are shown. The solid squares indicate individual processors, and a pro- group 0 group 1 group 2 group 3 ^This work was supported, in part, by the Army Research Office under grant DAA H04-95-1-0111. Figure 1: Example of OTIS connections with 16 processors cessor index is given by the pair (G,P) where G is its group index and P the processor or local index. Figure 2 shows a 16 processor OTIS-Hypercube. The number inside a processor is the processor index within its group. Hendrick et. al. (Hendrick et. al. 1995) have computed the performance characteristics ( power, throughput, volume, etc. ) of the OTIS-Hypercube architecture. Zane et. al. (Zane et. al. 1996) have shown that each move of an N^ processor hypercube can be simulated by an N^ processor OTIS-Hypercube using either one local electronic move, or one local electronic move and two optical inter group move using the OTIS interconnection. We shall refer the latter as OTIS moves. Sahni and Wang (Sahni & Wang 1997) and Wang and Sahni (Wang k Sahni 1997) have evalu- (0,0) group 0 (0,1) group 1 group 2 (1,0) group 3 (1,1) Figure 2: 16 processor OTIS-Hypercube ated thoroughly the characteristics of the OTIS-Mesh architecture, developing algorithms for basic data rearrangements.. In this paper, we study the OTIS-Hypercube architecture and obtain basic properties and basic permutation routing algorithms for this architecture. These algorithms can be used to develop efficient application programs. In the following, when we describe a path through an OTIS-Hypercube, we use the term electronic move to refer to a move along an electronic interconnect ( so it is an intra-group move ) and the OTIS move to refer to a move along an optical interconnect. 2 OTIS-Hypercube Diameter Let N = 2'^ and let d{i,j) be the length of the shortest path from processor i to processor j in a hypercube. Let (Gl,Pi) and (^2,^2) be two OTIS-Hypercube processors. The shortest path between these two processors fits into one of the following categories: (a) The path employs electronic moves only. This is possible only when Gi = G^- (b) The path employs an even number of OTIS moves. Paths of this type look like {Gi,Pi) (A', Gl) ^ (Gl Pi') ^(02, P2) . iPi,G[) {GuPD {G[,Pi) ^ Here E denotes a sequence ( possibly empty ) of electronic moves and 0 denotes a single OTIS move. 
If the number of OTIS moves is more than two, we may compress the path into a shorter path that uses 2 OTIS moves only: (Gi,Pi) (Gi,P2) (P2,GI) ^ (P2,G2) ^ (G2,P2) • (c) The path employs an odd number of OTIS moves. Again, if the number of moves is more than one, we can compress the path into a shorter one that employs exactly one OTIS move as in (b). The shorter path looks like: (Gi, Pi) (Gi, G2) -A (G2,GI)^(G2,P2) . Shortest paths of type (a) have length exactly d(Pi,P2) ( which equals the number of ones in the binary representation of Pi © P2 ). Paths of type (b) and type (c) have length d(Pi, P2) -I- d{Gi, G2) + 2 and d(Pi,G2) -I- d(P2,Gi) + 1, respectively. As a result, we obtain the following theorem: Theorem 1 The length of the shortest path between processors (Gi,Pi) and (G2,P2) is d{Pi,P2) when Gl = G2 and min{d(Pi,P2) + d(Gi,G2) + 2,d{PuG2)+diP2,Gi) + l} when Gii^Gi- Theorem 2 The diameter of the OTIS-Hypercube is 2d+l. Proof Since each group is a d-dimensional hypercube, d{Pi,P2), d{GuG2), d{PuG2), and d(P2,Gi) are all less than or equal to d. Prom theorem 1, we conclude that no two processors are more than 2d 4-1 apart. Now consider the processors (Gi,Pi), (G2,P2) such that Pi = 0 and P2 = A'' - 1. Let Gi = 0 and G2 = A^ - 1. So d(Pi,P2) = d(Gi,G2) = d(Pi,G2) = d{P2,Gi) = d. Hence, the distance between (Gi,Pi) and (G2,P2) is 2d-t-l. As a result, the diameter of the OTIS-Mesh is exactly 2d -I-1. □ 3 Common Data Rearrangements In this section, we concentrate on the realization of permutations such as transpose, perfect shuffle, un-shuffie, vector reversal which are frequently used in applications. Nassimi and Sahni (Nassimi & Sahni 1982) have developed optimal hypercube algorithms for these frequently used permutations. These algorithms may be simulated by an OTIS-Hypercube using the method of (Zane et. al. 1996) to obtain algorithms to realize these data rearrangement patterns on an OTIS-Hypercube. Table 1 gives the number of moves used by the optimal hypercube algorithms; a break down of the number of moves in the group and local dimensions; and the number of electronic and OTIS moves required by the simulation. We shall obtain OTIS-Hypercube algorithms, for the permutations of Table 1, that require far fewer moves than the simulations of the optimal hypercube algorithms. As mentioned before, each processor is indexed as (G,P) where G is the group index and P the local index. An index pair (G, P) may be transformed into Permutation Optimal Hypercube Moves OTIS-Hypercube Simulation total group dimension local dimension OTIS electronic Transpose 2d d d 2d 2d Perfect Shuffle 2d d d 2d 2d Unshuffle 2d d d 2d 2d Bit Reversal 2d d d 2d 2d Vector Reversal 2d d d 2d 2d Bit Shuffie 2d-2 d-1 d-1 2d-2 2d-2 Shuffled Row-major 2d-2 d-1 d-1 2d-2 2d-2 GiPu Swap d d/2 d/2 d d Table 1: Optimal moves for N'^ = processor hypercube and respective OTIS-Hypercube simulations a singleton index I = GP by concatenating the binary representations of G and P. The permutations of Table 1 are members of the BPC ( bit-permute-complement ) class of permutations defined in (Nassimi & Sahni 1982). In a BPC permutation, the destination processor of each data is given by a rearrangement of the bits in the source processor index. For the case of our N"^ processor OTIS-Hypercube we know that A'^ is a power of two and so the number of bits needed to represent a processor index is p = logj A''^ = 21ogA'' = 2d. A BPC permutation (Nassimi & Sahni 1982) is specified by a vector A = [Ap_i, Ap_2,... 
, Ao] where (a) Aie{±0,±l,...,±ip- (b) [|Ap_i|, • ■ • , l^ol] is a permutation of • 1)}, 0 < 2 < p and The destination for the data in any processor may be computed in the following manner. Let mp_imp_2 ...mo be the binary representation of the processor's index. Let dp-idp-^... do be that of the destination processor's index. Then, = rrii l-rrii if Ai > 0, if Ai < 0. In this definition, —0 is to be regarded as < 0, while -hO is > 0. In a 16-processor OTIS-Hypercube, the processor indices have four bits with the first two giving the group number and the second two the local processor index. The BPC permutation [-0,1,2,-3] requires data from each processor m^m2mimQ to be routed to processor (1 - mo)mim2(l - ms). Table 2 lists the source and destination processors of the permutation. The permutation vector A for each of the permutations of Table 1 is given in Table 3. 3.1 Transpose b/2-1,... ,0,p-l,... ,p/2] The transpose operation may be accomplished via a single OTIS move and no electronic moves. The sim- ulation of the optimal hypercube algorithm, however, takes 2d OTIS and 2d electronic moves. 3.2 Perfect Shuffle [0,p - l,p - 2,... , 1 We can adapt the strategy of (Nassimi & Sahni 1982) to an OTIS-Hypercube. Each processor uses two variables A and B. Initially, all data are in the A variables and the B variables have no data. The algorithm for perfect shuffle is given below: Step 1: Swap A and B in processors with last two bits equal to 01 or 10. Step 2: for (i = 1; i < d - 1; i -I- -I-) { (a) Swap the B variables of processors that differ on bit i only; (b) Swap the A and B variables of processors with bit i of their index I not equal to bit i -f 1 of their index; } Step 3: Perform an OTIS move on the A and B variables. Step 4: for (i = 0; i < d - 1; J -I- -I-) { (a) Swap the B variables of processors that differ on bit i only; (b) Swap the A and B variables of processors with bit i of their index I not equal to bit i -M of their index; } Step 5: Perform an OTIS move on the A and B variables. Step 6: Swap the B variables of processors that differ on bit 0 only. Step 7: Swap the A and B variables of processors with last two bits equal to 01 or 10. Actually, in Step 1 it is sufficient to copy from A to B, and in Step 7 to copy from B to A. Source Destination Processor {G,P) Binary Binary (G, P) Processor 0 (0,0) 0000 1001 (2,1) 9 1 (0,1) 0001 0001 (0,1) 1 2 (0,2) 0010 1101 (3,1) 13 3 (0,3) 0011 0101 (1,1) 5 4 (1,0) 0100 1011 (2,3) 11 5 (1,1) 0101 0011 (0,3) 3 6 (1,2) 0110 nil (3,3) 15 7 (1,3) Olli Olli (1,3) 7 8 (2,0) 1000 1000 (2,0) 8 9 (2,1) 1001 0000 (0,0) 0 10 (2,2) 1010 1100 (3,0) 12 11 (2,3) 1011 0100 (1,0) 4 12 (3,0) 1100 1010 (2,2) 10 13 (3,1) 1101 0010 (0,2) 2 14 (3,2) Ilio Ilio (3,2) 14 15 (3,3) 1111 0110 (1,2) 6 Table 2: Source and destination of the BPC permutation [—0,1,2, —3] in a 16 processor OTIS-Hypercube Permutation Permutation Vector Transpose Perfect Shuffle Unshuffle Bit Reversal Vector Reversal Bit Shuffle Shuffled Row-major GiPu Swap [p/2-1,... ,0,p-l,... ,p/2] b-2,p-3,... ,0,p-l] [0,l,...,p-l] [-(p-l),-(p-2),...,-0] [p-l,p-3,... ,l,p-2,p-4,... ,0] [p-l,p/2-l,p-2,p/2-2,... ,p/2,0] [p-l,...,3p/4,p/2-l,... ,p/4,3p/4-l,...,p/2,p/4-l,... ,0] Table 3: Permutations and their permutation vectors Table 4 shows the working of this algorithm on a 16 processor OTIS-Hypercube. 
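A short sketch may make the BPC destination rule concrete. Assuming the reading of the rule that matches the worked example and Table 2 (source bit i is routed to destination bit position |A_i| and complemented when A_i carries a minus sign, with -0 counted as negative), the following plain-Python function reproduces Table 2 for the permutation [-0, 1, 2, -3] on a 16-processor OTIS-Hypercube:

```python
def bpc_destination(index, perm, p):
    """Destination index under a BPC permutation.

    perm is [A_{p-1}, ..., A_0]; each entry is a (position, negate) pair so that
    "-0" can be expressed.  Source bit i goes to destination bit `position`,
    complemented when `negate` is True.
    """
    dest = 0
    for i in range(p):
        position, negate = perm[p - 1 - i]      # perm is written most significant entry first
        bit = (index >> i) & 1
        if negate:
            bit ^= 1
        dest |= bit << position
    return dest

# The permutation [-0, 1, 2, -3] of Table 2 (p = 4; group = high 2 bits, local = low 2 bits).
perm = [(0, True), (1, False), (2, False), (3, True)]
for src in range(16):
    dst = bpc_destination(src, perm, 4)
    print(f"{src:2d} ({src >> 2},{src & 3}) -> {dst:2d} ({dst >> 2},{dst & 3})")
```

Running this prints, for example, 0 (0,0) -> 9 (2,1) and 2 (0,2) -> 13 (3,1), matching the first rows of Table 2.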
The correctness of the algorithm is easily established, and we see that the number of data move step is 2d+ 2 {2d electronic moves and 2 OTIS moves; each OTIS move moves two pieces of data from one processor to another, each electronic swap moves a single data between two processors ). The communication complexity of 2d-t-2 is very close to optimal. For example, data from the processor with index I = 0101...0101 is to move to the processor with index I' = 1010 ... 1010 and the distance between these two processors is 2(i-t-1. Notice that the simulation of the optimal hypercube algorithm for perfect shuffle takes 4d moves. 3.3 Unshuffle [p-2,p-3,... ,0,p-l This is the inverse of a perfect shuffle and may he performed by running the perfect shuffle algorithm backwards (i.e., beginning with Step 7); the for loops of Steps 2 and 4 are also run backwards. Thus the number of moves is the same as for a perfect shuffle. 3.4 Bit Reversal [0,1,... When simulating the optimal hypercube algorithm, the task requires 2d electronic moves and 2d OTIS moves. But with the following algorithm: Step 1: Do a local bit reversal in each group. Step 2: Perform an OTIS move of all data. Step 3: Do a local bit reversal in each group. we can actually achieve the rearrangement in 2d electronic moves and 1 OTIS move, since Steps 1 and 3 can be performed optimally in d electronic moves each (Nassimi & Sahni 1982). The number of moves is optimal since the data from processor 0101... 0101 is to move to processor 1010... 1010, and the distance between these two processors is 2d-I-1 ( Theorem 1 ). Step 1 Step 2 OTIS Step 4 OTIS Step 6 Step 7 index initial i = 1 i = 0 i = 1 variable (a) (b) (a) (b) (a) (b) 0000 0 0 0 0 0 0 0 0 0 0 0 0 A - - 2 2 2 4 4 8 8 8 - - B 0001 1 - - - 6 6 2 2 2 - - 8 A - 1 - - 4 2 6 10 10 - 8 - B 0010 2 - - - 8 8 12 12 4 - - 1 A - 2 - - 10 12 8 4 12 - 1 - B 0011 3 3 3 1 14 14 14 14 6 9 9 9 A - - 1 3 12 10 10 6 14 1 - - B 0100 4 4 4 6 - - - - - 2 2 2 A - - 6 4 - - - - - 10 - - B Old 5 - - - - - - - - - - 10 A - 5 10 - B 0110 6 - - - - - - - - - - 3 A - 6 3 - B Olli 7 7 7 7 - - - - - 11 11 11 A - - 5 5 - - - - - 3 - - B 1000 8 8 8 8 - - - - - 4 4 4 A - - 10 10 - - - - - 12 - - B 1001 9 - 12 A - 9 12 - B 1010 10 - - - - - - - - - - 5 A - 10 5 - B 1011 11 11 11 9 - - - - - 13 13 13 A - - 9 11 - - - - - 5 - - B 1100 12 12 12 14 1 1 1 1 9 6 6 6 A - - 14 12 3 5 5 9 1 14 - - B 1101 13 - - - 7 7 3 3 11 - - 14 A - 13 - - 5 3 7 11 3 - 14 - B Ilio 14 - - - 9 9 13 13 13 - - 7 A - 14 - - 11 13 9 5 5 - 7 - B 1111 15 15 15 15 15 15 15 15 15 15 15 15 A - - 13 13 13 11 11 7 7 7 - - B Table 4: Illustration of the perfect shuffle algorithm on a 16 processor OTIS-Hypercube 3.5 Vector Reversal A vector reversal can be done using 2d electronic and 2 OTIS moves. The steps are: Step 1: Perform a local vector reversal in each group. Step 2: Do an OTIS move of all data. Step 3: Perform a local vector reversal in each group. Step 4: Do an OTIS move of all data. The correctness of the algorithm is obvious. The number of moves is computed using the fact that Steps 1 and 3 can be done in d electronic moves each (Nas-simi & Sahni 1982). Since a vector reversal requires us to move data from processor 00... 00 to processor 11... 11, and since the distance between these two processors is 2d+1 ( Theorem 1 ), our vector reversal algorithm can be improved by at most one move. 3.6 Bit Shuffle p-l,p-3,... ,l,p-2,p-4,... ,0 Let G — GuGi where G„ and Gi partition G in half. 
Same for P = PuPi- Our algorithm employs a GiPu Swap permutation in which data from processor GuGiPuPi is routed to processor GuPuGiPi- So we need to first look at how this permutation is performed. S. Sahni et al. 3.6.1 GiPu Swap [p - Iv • • ,3p/4,p/2 - 1,... ,p/4,3p/4 - 1,... ,p/2,p/4 - 1,... ,0] The swap is performed by a series of bit exchanges of the form B{i) = [ßp-i,... , ßo], 0 < i < p/4, where {p/2 + i, j = p/4 + i p/4 + i, j=p/2 + i j otherwise Let G (i) and P (i) denote the ith bit of G and P respectively. So (7(0) is the least significant bit in G, and P{d) is the most significant bit in P. The bit exchange B(i) may be accomplished as below: Step 1: Every processor (G,P) with G(i) P(of/2 + i) moves its data to the processor {G,P') where P' differs from P only in bit d/2 + i. Step 2: Perform an OTIS move on the data moved in Step 1. Step 3: Processors (G, P) that receive data in Step 2 move the received data to (G, P'), where P' differs from P only in bit i. Step 4: Perform an OTIS move on the data moved in Step 3. The cost is 2 electronic moves and 2 OTIS moves. To perform a G;P„ Swap permutation, we simply do B(i) for 0 < i < d/2. This takes d electronic moves and d OTIS moves. By doing pairs of bit exchanges (ß(0),ß(l)), (ß(2),ß(3)), etc. together, we can reduce the number of OTIS moves to d/2. 3.6.2 Bit Shuffle A bit shuffle, now, can be performed following these steps: Step 1: Perform a GjP« swap. Step 2: Do a local bit shuffle in each group. Step 3: Do an OTIS move. Step 4: Do a local bit shuffle in each group. Step 5: Do an OTIS move. Steps 2 and 4 are done using the optimal d move hypercube bit shuffle algorithm of (Nassimi & Sahni 1982). The total number of data moves is 3d electronic moves and d/2 + 2 OTIS moves. 3.7 Shuffled Row-major This is the inverse of a bit shuffle and may be done in the same number of moves by running the bit shuffle algorithm backwards. Of course. Steps 2 and 4 are to be changed to shuffled row-major operations. 4 BPC Permutations Every BPC permutation A can be realized by a sequence of bit exchange permutations of the form B{i,j) = [B2d-i,--- ,Bo], d < i < 2d, 0 < j < d, and J, 9 = 1 i, Q = j q, otherwise, and a BPC permutation C = [C2(i-i,... , Co] = noHp where |C,| < d, 0 < 9 < d, IIg and Hp involve d bits each. For example, the transpose permutation may be realized by the sequence B{d + j,j), 0 < j < d; bit reversal is equivalent to the sequence B{2d- 1 -j,j), ^ < 3 < d\ vector reversal can be realized by performing no bit exchanges and using C = [-{2d-1), -(2d-2),...,-0] ( Ug = [-(2d-l),-(2d-2),...,-d], Hp = [-(d - 1),... ,-0] ); and perfect shuffle may be decomposed into B{d,0) and G — [2d - 2,2d -3,... ,d,2d - 1,d - 2,... ,1,0,d - 1] ( Üq = [2d-2,2d-3,... ,d,2d-l], Ep = [d-2,... ,l,0,d-l] ). A bit exchange permutation B{i,j) can be performed in 2 electronic moves and 2 OTIS moves using a process similar to that used for the bit exchange permutation B{i). Notice that B{i) = B{i,i). Our algorithm for general BPC permutations is: Step 1: Decompose the BPC permutation A into the pair cycle moves Bi{ii,ji), S2(Ì2,72),-• •, Bk{ik,3k) and the BPC permutation C = Ho Ep as above. Do this such that ii > i^ > ■■■ > ik, and >Ì2 > ■•• >jk' Step 2: If A; = 0, do the following: Step 2.1: Do the BPC permutation Ep in each group using the optimal algorithm of (Nassimi & Sahni 1982). Step 2.2: Do an OTIS move. Step 2.3: Do the BPC permutation E'^ in each group using the algorithm of (Nassimi k Sahni 1982). Step 2.4: Do an OTIS move. 
Step 3: If fc = d, do the following: Step 3.1: Do the BPC permutation Eg in each group. Step 3.2: Do an OTIS move. Step 3.3: Do the BPC permutation Ep in each group. Step 4: lik < d/2, do the following: Step 4.1: Perform the bit exchange permutation Bi,...,Bk. Simulation Our Algorithm Permutation OTIS electronic OTIS electronic Transpose 2d 2d 1 0 Perfect Shuffle 2d 2d 2 2d Unshuffle 2d 2d 2 2d Bit Reversal 2d 2d 1 2d Vector Reversal 2d 2d 2 2d Bit Shuffle 2d-2 2d-2 cž/2-f 2 M Shuffled Row-major 2d-2 2d-2 dl2 + 2 3d GiPu Swap ■ d d d/2 d Table 5: Performance Comparisons Step 4.2: Do Steps 2.1 through 2.4. Step 5: If A; > d/2, do the following: Step 5.1: Perform a sequence oi d - k bit exchanges involving bits other than those in ,Bk in the same orderly fashion described in Step 1. Recompute He and Dp. Swap IIg and lip. Step 5.2: Do Steps 3.1 through 3.3. The local BPC permutations determined by IIg and lip take at most d electronic moves each (Nassimi & Sahni 1982); and the bit exchange permutations take at most d electronic moves and d/2 OTIS moves. So the total number of moves is at most 3d electronic moves and dji -I- 2 OTIS moves. 5 Conclusion In this paper we have shown that the diameter of the OTIS-Hypercube is 2(i-|-1, which is very close to that of an N'^ processor hypercube. However, each OTIS-Hypercube processor is connected to at most d + I other processors; while in an N'^ processor hypercube, a processor is connected to up to 2d other processors. We have also developed algorithms for frequently used data permutations. Table 5 compares the performance of our algorithms and those obtained by simulating the optimal hypercube algorithms using the simulation technique of (Zane et. al. 1996). For most of the permutations considered, our algorithms are either optimal or within one move of being optimal. An algorithm for general BPC permutations has also been proposed. References [1] Feldman M., Esener S., Guest C., & S. Lee (1988) Comparison Between Electrical and Free-Space Optical Interconnects Based on Power and Speed Considerations. Applied Optics, 27, 9, p. 1742-1751. [2] Hendrick W., Kibar 0., Marchand P., Fan C., Blerkom D. V., McCormick F., Cokgor I., Hansen M., & Esener S. (1995) Modeling and Optimization of the Optical Transpose Interconnection System. Optoelectronic Technology Center, Program Review, Cornell University. [3] Kiamilev F., Marchand P., Krishnamoorthy A., Esener S., & Lee S. (1991) Performance Comparison Between Optoelectronic and VLSI Multistage Interconnection Networks. Journal of Lightwave Technology, 9, 12, p. 1674-1692. [4] Krishnamoorthy A., Marchand P., Kiamilev F., & Esener S. (1992) Grain-Size Considerations for Optoelectronic Multistage Interconnection Networks. Applied Optics, 31, 26, p. 5480-5507. [5] Marsden G. C., Marchand P. J., Harvey P., & Esener S. C. (1993) Optical Transpose Interconnection System Architectures. Optics Letters, 18, 13, p. 1083-1085. [6] Nassimi D. & Sahni S. (1982) Optimal BPC Permutations On A Cube Connected Computer. IEEE Transactions on Computers, C-31, 4, p. 338-341. [7] Sahni S. & Wang C.-F. (1997) BPC Permutations On The OTIS-Mesh Optoelectronic Computer. Proceedings of the fourth International Conference on Massively Parallel Processing Using Optical Interconnections (MPPOr97), p. 130-135. [8] Wang C.-F. & Sahni S. (1997) Basic Operations on the OTIS-Mesh Optoelectronic Computer. 
Technical Report 97-008, CISE Department, University of Florida, available by anonymous ftp login from ftp.cise.ufl.edu under directory tech-report/tr97/tr97-008.ps.gz. [9] Zane F., Marchand P., Paturi R., & Esener S. (1996) Scalable Network Architectures Using the Optical Transpose Interconnection System (OTIS). Proceedings of the second International Conference on Massively Parallel Processing Using Optical Interconnections (MPPOrOe), p. 114-121. A Framework Supporting Specialized Electronic Library Construction Daniel J. Helm, Jerry W. Cogle Jr. and Raymond J. D'Amore The Mitre Corporation, 1820 Dolley Madison Blvd., McLean, VA 22102-3481, USA Phone: 703 883 6899, Fax: 703 883 7978 Email: (dhelm, j cogle, rdćunore) Smitre. org Keywords: digital library, information retrieval, system architecture Edited by: Rudi Murn Received: March 25, 1998 Revised: July 28, 1998 Accepted: September 8, 1998 The Collaborative Electronic Library Framework (CELF) is an architecture enabling a group of users to build a specialized digital collection organized around a set of topical areas of interest to a community of users. The system provides services to collect network-based information and tools to publish collected information into different topical areas through both manual and automatic mechanisms. Services are also provided to enable users to locate and retrieve useful information from the repository by browse and search techniques. All access to CELF is via a Web browser. 1 Introduction In recent years, with the proliferation of the World Wide Web, digital libraries have begun to flourish. There have been many viewpoints on what a digital library is or should be and many proposed approaches for their construction (Levy et. al. 95). Most existing digital libraries are built using manual labor intensive techniques, where a group of dedicated individuals collect and organize information. Other approaches have included more automatic techniques, although the resulting systems are typically less organized and coherent. Our approach is to fuse both manual and automatic approaches for digital library construction by leveraging useful benefits from each technique. These dual approaches are implemented in a prototype system called the Collaborative Electronic Library Framework (CELF). CELF provides an environment for developing an electronic library using an integrated suite of advanced data processing tools. The system incorporates data collection utilities and categorization services for automatically assigning collected documents into different topical areas. Document contribution and review services are provided to enable manually submitted information to be added to the library. A browse and search subsystem is also provided to enable users to effectively access and "mine" the information repository. Certain concepts of CELF have evolved from the MIDS project (Helm et. al. 96), which is an ongoing MITRE sponsored research project involving research and development in the area of information organization and discovery. CELF information is organized around topical hier- archies (taxonomies) that provide effective information organization and knowledge discovery supporting general and very specific information needs. The actual underlying structure is similar in concept to Yahoo (Yahoo) and includes similar tools and services such as search and topic-based browsing. 
Notable differences are that CELF provides a work flow framework for effective collaborative content insertion and management, and also provides advanced tools to automatically collect and categorize documents into taxonomies. These advanced tools are critical in supporting effective content creation and management, and will be discussed in subsequent sections. One basic principle behind CELF is to provide a framework whereby a group of users can build a specialized collection that is of value to a particular special interest community. A pilot project is underway to test the feasibility of the concept, where a group of participants were selected to build a specialized collection using the CELF framework. The primary focus of this paper is to describe the functionality of CELF and to discuss the currently fielded operational prototype supporting a pilot program; a future paper is planned which will provide an analysis and experimental evaluation of key component technologies used within CELF. 1.1 System Overview CELF includes two browsable and searchable repositories: a "core" manually generated high-quality collection, referred to as "library," and an automatically generated collection, referred to as "sources." The sources repository contains documents which are col- lected and categorized by automatic techniques, while the library repository contains documents that are manually published by a group of users. Both the library and sources repositories are available for enduser access. The sources repository also serves as a primary collection that document contributors will access as a basis for locating documents for inclusion in the core library collection. Information submitted to the library is initially entered into an HTML form interface, which enables meta- data describing a document to be provided, such as URL, title, description, abstract, and topical categories. Documents for which meta-data are entered can be obtained from many different sources, including the CELF sources repository, Intranets, and the Web. In general, only meta-data is actually stored within CELF for aJl collected documents. A URL meta-data field provides access to the full document stored at the remote location, although an option is provided on the document submission form to specify that a copy be made of the remote document. For copied documents, the URL meta-data field points to the local copy and an "external-URL" meta-data field provides access to the original remote document. A publishing option is also provided for a document contributor to create a document on their local computer and upload it to CELF. After a document is submitted for the library repository, it goes into a "pending review" state, where a document reviewer will approve or reject the document using a review interface. Approved documents are periodically added and indexed in the library repository where they are made available to end-users via the browse and search interfaces. Rejected documents are removed from the system. A document reviewer can also add. additional comments and/or update metadata for a document prior to the approval stage. 2 Detailed System Description CELF provides a set of integrated services including document collection, publishing, indexing, categorizing, browsing and retrieval, to provide a framework by which a specialized collection can be built. Software comprising CELF includes the Netscape Catalog Server 1.0 product (Catalog Server), as well as custom software developed by The MITRE Corporation. 
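The submission-and-review workflow just described can be summarized with a small sketch (field and state names are illustrative; the paper specifies the workflow, not its code):

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class LibrarySubmission:
    """Meta-data entered on the CELF document submission form."""
    url: str
    title: str
    description: str = ""
    abstract: str = ""
    categories: List[str] = field(default_factory=list)   # topical areas selected by the contributor
    external_url: str = ""    # original remote URL when a local copy of the document was made
    state: str = "pending review"
    reviewer_comments: List[str] = field(default_factory=list)

def review(submission: LibrarySubmission, approve: bool, comment: str = "") -> Optional[LibrarySubmission]:
    """A reviewer approves a pending submission (it will later be indexed into the library) or rejects it."""
    if comment:
        submission.reviewer_comments.append(comment)
    if approve:
        submission.state = "approved"
        return submission
    return None   # rejected submissions are removed from the system
```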
2.1 Netscape Catalog Server The Netscape Catalog Server product evolved from the Harvest (Bowman et. al. 94) research prototype system that provides an efficient information collection and retrieval framework. The Netscape Catalog Server framework includes two primary server components: a Resource Description Server (RDS) and Catalog Server. An RDS stores meta-data manually sub- mitted by users or automatically collected by robots. Robots are specialized subsystems that can be configured to collect information from the Web and other repositories. A Catalog Server provides a search and browse interface to data collected and indexed from one or more RDS's. An integrated Verity (Verity) search engine provides the underlying retrieval framework for the Catalog Server. 2.2 System Architecture CELF contains a configuration of RDS's and Catalog Servers as shown in Fig. 1. The figure shows three RDS's (Harvest RDS, Agent RDS and Manual RDS) and two Catalog Servers (Library Catalog and Sources Catalog). The Harvest RDS is a specialized subsystem that is used to recursively collect documents from different Web sites. The top level URL's from which to collect are manually specified in a configuration file. A robot associated with the RDS periodically collects Web pages beginning at the top level URL's specified. Meta-data extracted from the collected pages are stored in a document database associated with the RDS. The Agent RDS is a customized subsystem that contains a search agent (Genesereth et. al. 94) that is configured to contact Web-based search engines as a basis for providing URL's for which an associated robot will collect documents. The search agent uses CELF keyword-based topical profiles (also used to automatically categorize collected documents) as canned queries to the search engines. URL's returned from the contacted search engines are then used to "seed" a robot configuration file (as with the Harvest RDS). The robot associated with the Agent RDS periodically collects the requested documents and stores their meta-data in the RDS's document database. The Manual RDS stores the meta-data manually submitted to CELF. These manually submitted documents are stored in a database, which a select group of users will access during the review process. Approved documents will be flagged accordingly and rejected documents will be automatically removed from the document database. The Sources Catalog is a browsable and searchable collection of documents that have been automatically collected by the Agent RDS and Harvest RDS. A categorization module augments this subsystem, by periodically categorizing the collected and indexed documents using the CELF topical profiles. Information not categorized can only be retrieved using the keyword-based search interface. The Library Catalog is a browsable and searchable collection of "core" documents that have been manually submitted by users and approved by document reviewers. This subsystem periodically collects and indexes documents from the Manual RDS. The client interface is a customized subsystem that enables users as well as document contributors to browse and search for documents contained in both the library and sources repository. Different options are available to affect the search process as well as the format of retrieval results. 
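As an illustration of how the Agent RDS is seeded, the sketch below runs each topical profile as a canned query against a list of search engines and gathers the returned URLs into a robot seed list. The run_profile_query function is a placeholder: the paper does not describe the search agent's actual interface, so anything behind that call is an assumption.

```python
from typing import Dict, List, Set

def run_profile_query(engine: str, query: str) -> List[str]:
    """Placeholder for the search agent's call to one Internet search engine."""
    raise NotImplementedError("stand-in for the real, unspecified search-engine interface")

def build_seed_list(profiles: Dict[str, str], engines: List[str]) -> List[str]:
    """Turn topical profiles (topic -> canned query) into a de-duplicated robot seed list."""
    seeds: List[str] = []
    seen: Set[str] = set()
    for topic, query in profiles.items():
        for engine in engines:
            for url in run_profile_query(engine, query):
                if url not in seen:        # the Agent robot later fetches each seed as a single page
                    seen.add(url)
                    seeds.append(url)
    return seeds
```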
CELF also includes an administrative interface that enables privileged users to perform such operations as document contribution, document review, as well as the administration of different servers and configuration files used in CELF. All features of CELF, including the administrative functions, are available via a Web browser. 2.2.1 Automatic Document Collection Methods Two automatic data collection mechanisms are used within CELF: Agent and Harvest. The two mechanisms have unique features although ultimately both utilize robots for contacting Web sites, retrieving : HTML pages and extracting meta-data for each document. The meta-data are stored in RDS's pending import into the Sources Catalog. Both automatic collection mechanisms are employed to collect large focused quantities of data from the Internet and Intranets. The desired effect of these collections for the document contributor, is sufficient recall from the Sources Catalog such that manual Internet searching need not routinely be performed. Both robots are seeded with a list of URL's to be collected. The Harvest mechanism is configured to collect all meta-data from linked pages at specified URL's. User configurable parameters are also supported to limit the number of levels to recursively retrieve pages, and whether to include or exclude certain file types. The Harvest mechanism is scheduled to collect at periodic time intervals and, depending on the number of starting URL's and depth of recursion, can collect large volumes of information. A general problem with the harvest-based collection approach is that although large quantities of information can be retrieved, the relevancy of the average document is typically very low. It is a brute force approach that collects many non-relevant documents for the sake of locating a smaller number of "golden nuggets" of information. To augment this approach, we developed an Agent-based collection scheme that can perform more precise collection. The Agent RDS collects data from pages identified as containing relevant information. The URL's are identified using The MITRE Corporation's search agent extension. This collection approach makes use of a set of profiles defined for each topic. These profiles were initially generated automatically using the names of the CELF topics in conjunction with Boolean logic; however, profiles can be enhanced manually and au- tomatically to produce more focused search results. For automatic profile adaptation, the system can automatically extend profiles using statistically significant words and phrases extracted from sets of documents. These documents can include previously collected and categorized documents and/or documents deemed relevant via user relevance judgments. The core component of this subsystem is a search agent. The agent contacts a pre- configured list of Internet search engines using the profile queries to conduct canned searches for new pages. New URL results are stored and used to seed the Agent robót collection. The Agent robot is configured to collect a single page for each URL in the configuration list. This scheme provides a fairly precise collection mechanism by only retrieving pages that are of very high relevance to the CELF taxonomy. It is limited by the fact that only index:ed collections can be contacted as a basis for document retrieval. In concert, the two automatic collection mechanisms work to build the Sources Catalog. This repository contains information records which may be of interest but have not been manually reviewed. 
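The Harvest-style collection behaviour described above (seed URLs, a recursion-depth limit, and file-type include/exclude rules) can be sketched with the Python standard library. This is only an illustration of the idea, not the Netscape Catalog Server robot, and it ignores scheduling, robots.txt handling, and meta-data extraction.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urldefrag
from urllib.request import urlopen

class LinkParser(HTMLParser):
    """Collects the href targets of <a> tags on a page."""
    def __init__(self):
        super().__init__()
        self.links = []
    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

def harvest(seed_urls, max_depth=2, exclude_suffixes=(".gif", ".jpg", ".ps", ".zip")):
    """Breadth-first, depth-limited page collection starting from the configured seed URLs."""
    collected = {}                                   # url -> page text
    frontier = [(url, 0) for url in seed_urls]
    seen = set(seed_urls)
    while frontier:
        url, depth = frontier.pop(0)
        if url.lower().endswith(exclude_suffixes):   # file-type exclusion
            continue
        try:
            with urlopen(url, timeout=10) as response:
                page = response.read().decode("utf-8", errors="replace")
        except (OSError, ValueError):
            continue
        collected[url] = page
        if depth < max_depth:                        # recursion-depth limit
            parser = LinkParser()
            parser.feed(page)
            for href in parser.links:
                link = urldefrag(urljoin(url, href)).url
                if link not in seen:
                    seen.add(link)
                    frontier.append((link, depth + 1))
    return collected
```

A real deployment would extract only meta-data (title, description, URL) from each collected page and hand it to the RDS, as described above.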
The purpose of this repository is to provide access to focused Internet information in a format familiar to both end users and data contributors.

2.2.2 Automatic Document Categorization

After documents have been retrieved and meta-data extracted for the collected documents, a categorization module is invoked to classify all documents into the CELF topical hierarchy. This module uses topical profiles to classify all documents by comparing each profile to the document meta-data. These profiles are typically the same as those used in the Agent-based collection subsystem, although it is possible to utilize profiles unique to the categorization subsystem. The resulting output includes the topical labels applicable to each document. This meta-data is later indexed to support the CELF topic-based browse functionality. At pre-defined time intervals, document meta-data collected via the Agent and Harvest subsystems are indexed by the integrated Verity search engine. A categorization module is then invoked, which uses the CELF profiles associated with each topic as canned queries to categorize the documents. For each query, a set of zero or more matching documents is returned. The categorization module then, for each returned document, sets the "category" meta-data field to the topic used to perform the retrieval operation. Documents matching multiple queries/profiles are assigned all relevant topical labels. After the categorization module completes, the collection is re-indexed to make the newly assigned topical labels available to the browse interface.

Figure 1: Collaborative Electronic Library Framework (CELF) Architecture

The query-based categorization approach allows the full power of the Verity search engine syntax to be utilized in our profiles, and also enables the entire collection to be re-categorized on demand as new information items are added and/or profiles are modified.

2.3 Post-retrieval Tools

In addition to the backend collection, indexing, and categorization tools, CELF also supports post-retrieval tools that enable document contributors and end users to effectively "mine" the information repository. Although full-text searching and topic-based browsing are supported, we have found the need to incorporate document clustering as a basis for further refining the list of documents returned for a particular query. Even when a user narrows a query down to a lower-level topic, it is still possible that a relatively large number of documents will be returned. Clustering provides an automatic method for breaking large retrieval lists into dynamically generated subgroups (Rao et al. 95). This aids the user by uncovering the salient themes in a list of returned documents. Fig. 2 provides example fragments of a standard document retrieval list (left screen) and a clustered retrieval list. The document list display presents meta-data (e.g., title, description, URL) for each document. The clustered display depicts the document subgroups generated via a document clustering algorithm (Jain et al. 88). The clustering module utilizes an adaptive nearest-centroid sorting algorithm to group together documents whose similarity exceeds a configurable threshold. The cosine-similarity measure (Salton et al. 83) is currently used to compute the similarity between documents.
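The following Python sketch illustrates the kind of grouping just described: term-frequency vectors with stop words removed, cosine similarity, and adaptive nearest-centroid assignment against a configurable threshold. The tokenization, the stop-word list, and the threshold value are placeholder assumptions; the sketch is not the CELF module itself.

# Sketch of cosine-similarity, nearest-centroid clustering of documents
# (illustrative only; tokenization, stop words and threshold are assumptions).
import math
from collections import Counter

STOP_WORDS = {"the", "a", "an", "of", "and", "to", "in", "for", "on"}

def vectorize(text):
    """Term-frequency vector of a document with stop words removed."""
    terms = [t for t in text.lower().split() if t not in STOP_WORDS]
    return Counter(terms)

def cosine(u, v):
    """Cosine similarity between two term-frequency vectors."""
    dot = sum(u[t] * v[t] for t in u if t in v)
    norm = math.sqrt(sum(x * x for x in u.values())) * \
           math.sqrt(sum(x * x for x in v.values()))
    return dot / norm if norm else 0.0

def cluster(documents, threshold=0.3):
    """Assign each document to the nearest existing centroid whose cosine
    similarity exceeds the threshold; otherwise start a new cluster."""
    clusters = []   # each cluster: {"centroid": Counter, "docs": [indices]}
    for i, doc in enumerate(documents):
        vec = vectorize(doc)
        best, best_sim = None, threshold
        for c in clusters:
            sim = cosine(vec, c["centroid"])
            if sim >= best_sim:
                best, best_sim = c, sim
        if best is None:
            clusters.append({"centroid": Counter(vec), "docs": [i]})
        else:
            best["centroid"].update(vec)   # adapt the centroid with the new document
            best["docs"].append(i)
    return clusters

The top discriminating keywords used to label a subgroup could then be read off, for example, as the centroid terms that are frequent within the cluster but rare in the other centroids.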
Each document subgroup is labeled with the top discriminating keywords and phrases that distinguish the set of documents. Documents with a higher score are more similar to the cluster summary. Fig. 3 shows the major processing steps utilized in our clustering algorithm. The Vector Generator creates, for each document, a list of the unique terms with corresponding frequency counts. All insignificant words (e.g., stop words) are excluded. The vectors are then passed to the Cluster Generator, which groups similar vectors together based on their cosine-similarity score. The results of the Cluster Generator are then passed to the Post-Analysis and Refinement stage, which can prune, split and/or merge clusters to improve cluster effectiveness. Essentially, this stage can set up the clustering algorithm for multiple passes, which is implied by the feedback cycle shown in Fig. 3.

Figure 5: Document Display Screens

[8] Levy D. M. and Marshall C. C. (1995) Going Digital: A Look at Assumptions Underlying Digital Libraries. Communications of the ACM, 38(4), pp. 77-84, April 1995.
[9] Rao R., Pedersen J. O., Hearst M. A., Mackinlay J. D., Card S. K., Masinter L., Halvorsen P., and Robertson G. G. (1995) Rich Interaction in the Digital Library. Communications of the ACM, 38(4), pp. 29-39, April 1995.
[10] Salton G. and McGill M. J. (1983) Introduction to Modern Information Retrieval. McGraw-Hill, Inc.
[11] Verity, http://www.verity.com.
[12] Yahoo, http://www.yahoo.com.

A Study in the Use of Parallel Programming Technologies in Computer Tomography
Eugene G. Sukhov
Institute of Control Sciences, Russian Academy of Sciences, Profsoyuznaya 65, Moscow, Russia
Phone: +095 334 79 51, E-mail: sukhov@ipu.rssi.ru

Keywords: parallel programming, computer tomography, parallel computations, O-O programming

Edited by: Branko Souček
Received: December 12, 1992 Revised: February 9, 1993 Accepted: March 1, 1993

In this paper we consider a parallel implementation of linear algebra algorithms for computer tomography and study two approaches for this purpose. The first concerns a parallel Cholesky algorithm based on a block decomposition procedure. The second deals with a parallel iteration procedure in terms of so-called sections, where each section is a subset of the components of a 2D array. The formulations given are well suited to object-oriented programming. A FORTRAN-like implementation is considered.

1 Introduction

In this paper we discuss a parallel implementation of image reconstruction algorithms for computer tomography. It is well known that image reconstruction demands a large amount of computation, in some cases in real time, which is beyond the capability of present-day computer systems even when relatively modest image matrices are involved. Specialized systems can be based on fast analogue devices [2] or on transputer arrays [1, 3], but the use of general-purpose multiprocessor systems is also attractive for tomography computations [4]. In this paper we do not address the emission tomography presented in [1, 4]; we consider images obtained through an ultrasonic diffraction procedure or through an X-ray procedure. In the first case it is essential to solve an inverse diffraction problem, whose numerical solution leads to a large-scale linear algebraic system with a dense symmetric matrix. The second case offers a variety of ways to create the image reconstruction algorithm. Each such algorithm is an iterative version of the inverse Radon transform [5]. Our objectives are: (1) to construct a direct numerical procedure for the solution of the symmetric linear algebraic systems arising in computer tomography, applicable to a wide range of multiprocessor systems; (2) to construct iterative parallel procedures based on a ray approach with the same applicability; and (3) to estimate the efficiency of object-oriented (O-O) programming for the problems under study. We shall study Cholesky decomposition for the first point and the iterative "back-projection equalization" algorithm for the second one. The algorithms proposed are well suited to Single-Instruction, Multiple-Data (SIMD) systems, although implementation on Multiple-Instruction, Multiple-Data (MIMD) systems is possible without trouble. The examples below are given in a convenient FORTRAN-like notation with O-O insertions. The reason is that O-O programming facilities are available in the latest FORTRAN versions.

2 Cell Methods

The cell methods are very convenient for solving large-scale matrix problems in linear algebra. The variety of cell methods and base operations proposed in [6, 7] fits well with parallel processing. It was shown that the cell decomposition can be evaluated in a pyramidal recursive mode. We start the process from the top level and move down to the bottom level, where the base matrix operations are actually performed. We developed a set of basic operations with matrices and vectors that plays a part similar to that of PBLAS in the ScaLAPACK library [8] for non-parallel matrix computations. Our goal, however, was to create procedures suitable for recursive implementation.
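As a rough illustration of this recursive, self-similar style of cell computation (not the paper's own cellular formulation, which is given in the next section), the following Python/NumPy sketch factors a symmetric positive definite matrix by recursing on a 2 x 2 cell splitting; the cut-off size and the use of NumPy are assumptions made only for illustration.

# Sketch of a recursive ("self-similar") block Cholesky factorization on a
# 2 x 2 cell splitting; illustrative only, not the paper's cellular algorithm.
import numpy as np

def block_cholesky(A, cutoff=64):
    """Return lower-triangular L with A = L @ L.T, recursing on 2 x 2 cells."""
    n = A.shape[0]
    if n <= cutoff:
        return np.linalg.cholesky(A)          # base-level cell operation
    m = n // 2
    A11, A21, A22 = A[:m, :m], A[m:, :m], A[m:, m:]
    L11 = block_cholesky(A11, cutoff)         # factor the top-left cell
    L21 = np.linalg.solve(L11, A21.T).T       # L21 = A21 * L11^{-T} (triangular solve)
    S = A22 - L21 @ L21.T                     # symmetric Schur-complement update
    L22 = block_cholesky(S, cutoff)           # same procedure, one level down
    L = np.zeros_like(A, dtype=float)
    L[:m, :m], L[m:, :m], L[m:, m:] = L11, L21, L22
    return L

Each call applies the same three base operations (factor a cell, triangular solve, symmetric update) one level down, which is the self-similarity exploited by the pyramidal decomposition.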
The basis is the self-similarity of the numerical procedures at each level of the pyramid. It was demonstrated that a SIMD system model of W ring-connected processors is a good instrument for the basic operations above. Another model is a W x W processor matrix with ring-connected rows and columns. The cell processors can form numerical modules for large-scale parallel matrix processing. Multicluster systems of this sort may also include sharable memory modules of different types, and the multilevel decomposition can easily be adapted to them. Parallelism available at the base level is intracellular. It can be hidden from the application designer by software facilities. Parallelism above the base level is intercellular. This kind of parallelism, if present, is explicitly affected by the computer structure and is not discussed in this paper.

3 Cholesky Scheme

The linear algebraic system under study can be written as

(1) Ax = b,

where A is a real, symmetric, positive definite matrix of order N. The standard scheme for solving (1) is the Cholesky decomposition

(2) A = LL^T,

followed by the solution of the two resulting systems

(3) Ly = b, L^T x = y,

where L is a lower triangular matrix and the superscript T denotes the transpose.

LL^T-decomposition. Let the matrix A be split into n^2 square cells, each cell of order N/n. The LL^T-decomposition algorithm (2) can be written in the following cellular form:
1) A_11 = B_11 B_11^T;
2) B_j1 <- A_j1 B_11^(-T), j = 2, 3, ..., n;
3) For i = 2, 3, ..., n:
3.1) A_ik <- A_ik - B_ij B_kj^T, j = 1, 2, ..., i-1, k = i, i+1, ..., n;
3.2) A_ii = B_ii B_ii^T, B_ki <- A_ki B_ii^(-T), k = i+1, ..., n.

The decomposition can be organized over a hierarchy of memory levels, provided the condition h_1 >> h_2 >> h_W >> W is fulfilled. At each level we have to keep no fewer than three cells in memory; hence the 4-level decomposition can be performed with n_1 ~ sqrt(h_1/3), n_2 ~ sqrt(h_2/3), n_3 ~ sqrt(h_W/3), n_4 = W, and if h_W = W, then n_3 = W and the fourth level is absent. Example 2. An array of n^2 disk units keeps the "large" n x n cells of the initial matrix. If the host computer has memory of size v_2, we can set n_1 = N/n, n_2 ~ sqrt(v_2/3).

5 Iterative Reconstruction Scheme

In the previous sections we described a method for the solution of dense linear systems. Such solvers are important for inverse scattering problems in diffraction computer tomography. On the other hand, matrices arising in X-ray and gamma-ray imaging are sparse, which is the reason for using iterative algorithms [5]. Each iterative scheme is a version of the numerical inverse Radon transform. The input data in this case are the ray sums over the beams passed along different ray traces. In this paper we use a simple model of radiation attenuation based on plane-parallel beam scanning. This attenuation may be due to Compton or combination scattering and photoelectric absorption. We view the image under study as a square matrix of pixels in which a discretized function of two variables is to be reconstructed. Suppose that the domain of this function is covered by a grid of N^2 square pixels. Each pixel is located by the index pair (i, j) in the usual matrix style, and the function value attached to it is denoted by u_ij. We interpret this function as the pixel absorptance. In our case the radiation attenuation is exponential: I/I_0 = exp(-a ΔX), where I_0 is the initial radiation intensity, I is the radiation intensity after passing through a layer of thickness ΔX, and a is an attenuation factor. So the total absorptance can be conveniently expressed by the value -ln(I/I_0). In that case it is possible to sum the pixel absorptances along the ray path. On the other hand, it is necessary to take into account that each pixel is traversed by the ray in its own manner.
For this purpose we introduce weight factors g_ij. The maximum weight corresponds to a ray path through the geometric center of a pixel. A weighted ray sum S can be written as

(4) S = Σ u_ij g_ij, (i, j) ∈ l,

where the summation is applied to the pixels along the ray path l. In practice this model corresponds to plane-parallel monochromatic raying of the object under investigation and measuring the transmitted radiation intensity. In that case the total attenuation can be easily determined. The model above may be adapted to a variety of scattering and absorption processes. For image reconstruction a set of ray sums should be accumulated. Then the u_ij can be determined as the solution of the linear system (4), where the weight factors g_ij are known. The system (4) can be conveniently written in the standard form

(5) Gu = S.

Let the image be stored as a matrix of order ν = √n, where n is the total number of pixels. The right-side vector S is the set of projections, i.e. the measured values of total attenuation. Below we shall discuss the "back-projection equalization" algorithm as a good illustration of the parallel approach to the iterative solution of (5).

6 Sections

Let the image be a square matrix of order N. Suppose the X-ray beam is directed so as to make an angle α with the vertical (|α| < π/2). In such a geometry the beam cannot enter the image through the bottom and exit through the top. We define a section C(i, j, α) as a sequence of triads:

(6) C(i_1, j_1, α) = (i_k, j_k, g_k), k = 1, 2, ..., M,

where i_k, j_k are the indices of the pixels passed by the ray during its propagation and g_k is the weight factor associated with the k-th pixel. If the beam is uniform and its step is equal to the mesh size, then the section is uniquely determined by the indices i_1, j_1 of the first pixel passed. The first and the last pixels are always boundary pixels of the image, as illustrated in Fig. 2.

Figure 2: Section example on an 8 x 8 image (sections 1-3, periodic boundary conditions).

Any tracing algorithm may be applied to calculate the index set of a section in (6). These calculations form the pre-processing stage of the reconstruction. In O-O style we define two data types, PIXEL and SECTION:

TYPE PIXEL
INTEGER I, J   ! INDEX
REAL G         ! WEIGHT
END TYPE

TYPE SECTION (I,J,ALPHA)
INTEGER I, J
REAL ALPHA
PRIVATE
INTEGER M
PIXEL A(M)     ! A = PIXEL ARRAY
END TYPE

The operations (methods) to be used on the sections are the following:
1. Initialization (making), i.e. tracing and weighting a separate ray path. This may be done only once for all images in work, but for good imaging it is necessary to make a set of sections directed at a variety of angles. The initialization is headed as SUBROUTINE INIT(I,J,ALPHA).
2. Opening, i.e. activation of the section: SUBROUTINE OPEN(I,J,ALPHA). All the operations below apply to previously opened sections.
3. Reading, i.e. extracting and weighting matrix elements in accordance with the index set; the array V is used for data storage: SUBROUTINE READ(V), DIMENSION V(M).
4. Writing, when the contents of the array V are stored in accordance with the index set: SUBROUTINE WRITE(V), DIMENSION V(M).
5. Closing the section: SUBROUTINE CLOSE(I,J,ALPHA).

Suppose that we have collected all the projection data arrays S_ij for all the sections in work.
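Before turning to the algorithm itself, here is a small Python analogue of the PIXEL/SECTION machinery above, under simplifying assumptions: the ray is traced with a nearest-pixel rule advancing one image row per step, and all weights are set to 1; the actual tracing and weighting procedure is left open in the paper and would replace these placeholders.

# Minimal Python analogue of the SECTION abstraction (illustrative assumptions:
# nearest-pixel tracing, one step per row, unit weights).
import math

class Section:
    def __init__(self, i, j, alpha, n):
        """Trace the ray entering at pixel (i, j) of an n x n image under
        angle alpha (|alpha| < pi/2) and store (row, col, weight) triads."""
        self.pixels = []
        col = float(j)
        for row in range(i, n):                  # the beam cannot re-enter from below
            k = int(round(col))
            if not 0 <= k < n:
                break                            # ray leaves through a side of the image
            self.pixels.append((row, k, 1.0))    # unit weight g_k as a placeholder
            col += math.tan(alpha)               # horizontal advance per image row

    def read(self, image):
        """Extract the weighted pixel values along the ray path."""
        return [image[r][c] * g for r, c, g in self.pixels]

    def write(self, image, values):
        """Store corrected values back along the same path."""
        for (r, c, _), v in zip(self.pixels, values):
            image[r][c] = v

    def ray_sum(self, image):
        """Weighted ray sum S = sum of u_ij * g_ij along the section."""
        return sum(self.read(image))

With sections built in this way, the loop given next reads each section, compares its weighted ray sum with the measured projection S(I,J), and spreads the residual evenly over the section's pixels.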
If the sections are initialized, we can write the back-projection equalization algorithm in the following form:

ITERATION: DO I = 1, NU
  DO J = 1, NU
    CALL OPEN(I, J, ALPHA(I))
    CALL READ(V)
    D(J) = S(I,J) - SUM(V)      ! residual of the measured ray sum
    DLT = D(J) / LENGTH()       ! spread the residual evenly along the section
    V = V + DLT
    CALL WRITE(V)
    CALL CLOSE(I, J, ALPHA(I))
  END DO
END DO
IF (MAX(ABS(DLT)) .LE. EPS) THEN
  EXIT
ELSE
  REPEAT ITERATION
END IF

Here the variable NU = ν is the order of the pixel matrix and the number of direction angles, and the small positive number EPS is the threshold value. The function SUM(V) sums all components of the array V, whose length is given by LENGTH().

7 The Parallelism Outlines

The section-based scheme above presents inherent parallelism, owing to the ability of the sections to be processed in parallel in SIMD style. To improve parallelism we introduce an extra operation of section addition. This addition joins the index sets and weights of the sections involved. If the image matrix is processed on a parallel system of NU processors, the addition provides length equalization for all the sections. This is important for load balancing of the parallel processors. For length equalization it is necessary to place periodic boundary conditions on the left and right sides of the image. These require that all J-index increments in the initialization procedure be evaluated modulo ν. In that case a section terminated on either side of the image is added to its extension on the opposite side. As a result, all the sections started at the top of the image with the same angle α have the same length and terminate at the bottom of the image. Sections 2 and 3 in Fig. 2 are added to produce a sum-section with the same length as section 1. For section processing we can use the model SIMD system with ring-connected processors. Such a configuration is well suited to periodic boundary conditions. The total memory should be sufficiently large to store the whole image and the supplements. On the other hand, section addition is necessary to implement the cell methods in imaging. If the processed image is split into cells, any full section is the sum of intracellular sections. After the current cell is processed, the partial ray sums are stored. The values stored are the initial conditions for the ray sums in the adjacent cells. But the implementation of cell sections is not quite elegant, for lack of recurrence. If an image cell of order W is processed in parallel, then O(W) parallel ax + b operations per iteration are required. Numerical experiments showed sufficiently good convergence for W = 32.

8 Conclusion

Computer tomography problems are a fruitful area for the application of parallel computation. The linear algebra cell methods for large systems of equations are the main key to obtaining the solution. The methods discussed have a natural hierarchy (and recursivity in the algebraic case) and are well suited to massively parallel computing systems. The explicit hardware dependence is located only at the lowest level, where the base numerical algorithms are applied; this is good for portability. O-O programming is an adequate approach to the problems under study. It is based on the same concepts that underlie the decomposition of large matrix problems for parallel computation.

References

[1] K. A. Girodias, H. H. Barrett, and R. L. Shoemaker, "Parallel simulated annealing for emission tomography", Phys. Med. Biol., Vol. 36, pp. 921-938, July 1991.
[2] A. F. Gmitro, V. Tresp and G. R. Gindi, "Videographic tomography - Part I: Reconstruction with parallel-beam projection data", IEEE Trans. Medical Imaging, Vol. 9, pp. 366-375, Apr. 1990.
[3] F. Wiegand and B. S. Hoyle, "Simulations for parallel processing of ultrasound reflection-mode tomography with applications to two-phase flow measurement", IEEE Trans. Ultrasonics, Ferroelectrics and Frequency Control, Vol. 36, pp. 652-660, Nov. 1989.
[4] M. I. Miller and B. Roysam, "Bayesian image reconstruction for emission tomography incorporating Good's roughness prior on massively parallel processors", Proc. Natl. Acad. Sci. USA, Vol. 88, pp. 3223-3227, Apr. 1991.
[5] V. Picalov and N. G. Preobrazhenski, Reconstructive Tomography in Gas Dynamics and Plasma Physics, Novosibirsk, Nauka, 1987.
[6] E. G. Sukhov, "Cell parallel methods for linear algebraic systems solution", Avtomatika i Telemekhanika, 1988, No. 9, pp. 139-143.
[7] E. G. Sukhov, "Matrix calculations on SIMD multiprocessors", Programmirovanie, 1981, No. 4, pp. 40-49.
[8] J. Choi, J. Dongarra, and D. Walker, "PB-BLAS: A Set of Parallel Block Basic Linear Algebra Subroutines", Concurrency: Practice and Experience, 8 (1996), pp. 517-535.
[9] S. Atlas et al., "POOMA: A high performance distributed simulation environment for scientific applications", Supercomputing '95 Conference, 1995.

Topological Informational Spaces

Anton P. Železnikar
An Active Member of the New York Academy of Sciences
Volaričeva ulica 8, SI-1111 Ljubljana, Slovenia
anton.p.zeleznikar@ijs.si and at home s51em@lea.hamradio.si
http://lea.hamradio.si/~s51em/

Keywords: basis, closed system, connectedness, covering, discrete space, exterior, indiscrete space, informational space: operand formula vector, vector distributiveness (orthogonality), metrics (meaning); interior, linkage, neighborhood, open system, subbasis; system of informational formulas, operands and basic transition formulas; topology of systems

Edited by: Vladimir A. Fomichov
Received: April 28, 1998 Revised: August 18, 1998 Accepted: August 25, 1998

A system of different informational formulas, Φ, can possess various topological structures, 𝒪. By this, topological informational spaces of the form (Φ, 𝒪) can be constructed, and the question arises: how can topological structures be introduced reasonably for concrete systems of informational formulas? A topology brings with it certain other concepts, e.g., those concerning closed topology, connectedness, continuum, interior, exterior, neighborhood, basis, subbasis, metric, space, etc. of systems, and especially the concept of meaning as a kind of informational accumulation point. The paper treats topologies of three types of informational formula systems (formula, operand, and basic-transition systems). An example of a bidirectional consciousness shell is presented, enabling complex engine modeling.

1 Introduction

A basic problem of topology¹ is to define a general space. By topology a mathematical concept (structure, branch) is meant², giving sense to various intuitive notions. Topological notions can be innovatively extended into the realm of the informational, realizing one of the significant features of the so-called informational space. Such a space can be determined also from other points of view concerning, for instance, the distributivity of informational entities (operands), which can lead to different concepts of a vector space. In general, a more complete theory of informational space would need concepts of informational subtheories, such as those concerning informational topological space, informational vector space, and informational graph theory.
Another, mathematically grounded view of the problem of a graph is the so-called topological graph theory [14]. The primitive objective of this theory is to draw a graph on a surface so that no two edges (graph arrows representing informational operators) cross, an intuitive geometric problem that can be solved by specifying symmetries or combinatorial side-conditions (surface graph-imbedding). Although potentially interesting for informational graph theory, this kind of problem is not in the focus of informational graph investigation. Informationally, a graph is merely a presentation of the potentiality of a formula or formula system concerning the setting of the parenthesis pairs (parenthesizing³) in a formula or formula system. Introducing a topology on (over) a system of informational formulas poses a challenge to logic, both the informational and the philosophical one, which comes close to some known metamathematical problems [30]. Just imagine a topology in the realm of mathematical axiomatism where, for a set of axioms, a topology of axiomatic statements (e.g., true formulas) is constructed. Although topology is a general mathematical principle in the realm of set and space theory, it seems non-obvious to take a system of formulas as a set with a topology on it.⁴ In mathematics, topology may be considered as an abstract study of the limit-point concept [16]. Which factors could dictate the introduction of a topology for a given system of informational formulas? In informational cases, different kinds of reasonable topologies, corresponding to intuitive ideas of what an understanding, interpretation, conception, perception and meaning should be, come to consciousness, that is, into the modeling foreground. In set theory, the concept of a set (collection, class, family, system, aggregate) itself is undefined. The same holds for an element x of the set X. Phrases like is in, belongs to, lies in, etc. are used. In informational theory, topology may be considered as an abstract study of the concept of meaning [32, 35] (concerning interpretation, understanding, conceptualism, consciousness, etc. of the informational). Here, the meaning of something, of some formula or formula system, functions as an informational limit point, to which it is possible to proceed as near as possible by the additional meaning decomposition of that something. The concept of a set is replaced by the concept of a system of informational formulas or/and informational formula systems. In this respect, notions similar to those in mathematics can be used, however, considering the informational character of entities (operands) and their relations (operators).

¹ This paper is a private author's work and no part of it may be used, reproduced or translated in any manner whatsoever without written permission, except in the case of brief quotations embodied in critical articles.
² Otherwise, topology is a science of the position and relation of bodies in space. This paper concerns at least the following topological topics: point-system (set) topology (general topology), metric space (e.g., meaning topology), and graph topology.
³ Parenthesizing (in German, Einklammerung) also has a philosophical meaning in phenomenology, for instance in Husserl [17].
⁴ The author believes, too, that such an idea reaches beyond the conventional horizon of a mathematician. However, he believes that the following discussion will show the appropriateness of such a trait.
Introducing topological concepts in informational theory, the reader will get the opportunity to experience what happens if the informational concepts, priory described by the author (e.g., [31, 32, 33, 34, 35, 36, 37, 38], to mention some of the available sources) are thrown into the realm of a topological informational space. In this view, informational serialism, parallelism, circularism, spontaneism, gestaltism, tran-sitism, organization, graphism, understanding, interpretation, meaning, and consciousness will appear under various topological possibilities, complementing the already previously presented informational properties, structure, and organization. Mathematical topology, as presented for example in [7, 8, 16, 18, 19, 21, 24, 25], roots firmly in the mathematical set theory [5, 6, 20]. In informational theory, the set is replaced by the concept concerning a system of informational formulas (system, informational system or IS, in short). A system is—said roughly—a set of informationally (operandly, through or by operands) connected informational formulas. The question is, which are the substantial differences occurring between the mathematical and the informational conceptuahsm in concern to topological structure? Elements of a mathematical set are elements determined by a logical expression (defining formula, relation, statement) and, for example, by notation of the form X = {XI,X2,. . . ,XTn} which presents a concrete structure of the set by its elements. In informational theory, instead of a set, there is a system of informational formulas being elements of the system. Formulas are active, emerging, changing, vanishing informational entities (by themselves) which can inform in a spontaneous and circular manner. What does not change is their informational markers distinguishing the entities. Notation of the form $ K'PnJ where presents,® in fact, only an instantaneous description of the parallel system of markers ipi, by a vertical presentation, denoting concrete formulas (or formula systems), and being separated by semicolons. These are nothing else as a special sort of informational operators, e.g. ||=, meaning the parallel informing of formulas of the system Also, there is a substantial difference between the symbols = and the second one is read as 'meaii(s)' and denotes meaning and not the usual equality. Another notions to be determined informationally are informational union and informational intersection of systems. It has to be stressed that formulas in a system "behave" in the similar manner as the elements in a set in respect to the union and intersection operation. Thus, the same operators can be used as in mathematics, without a substantial conceptual difference. 2 A IVIathematical vs. Informational Dictionary The presented dictionary should bring the mathematical feeling into the domain of informational theory. It certainly concerns the topological terms priory. The correspondence between set-theoretical and system-informational terms yields the following comparative table®: Mathematical vs. Informational Topology set X set braces: {,} system $: general formula system transition formula system and operand formula system system parentheses: (, ) ®For the system-conditional formula, ipi,(p2,--- , c^i • • • J and (/'2 L- ■ • ! Q) • • • J 7 respectively, notation (fi ip2 or, simply, ipi 'P2 will be used and read as formula ipi informs formula ip2 via operand a or, simply, formula ipi is informationally linked to formula (p2- This operation is informationally symmetric. 
Thus, i /3, • • • J, then (fx is linked informationally with (^3. Formally, {{ipx <^2) A ((/32 ^ Vs)) / a,ß X ( iplai,a2.....OnJ. A.P. Železnikar Further, there can exist more than one common operand, e.g., ai,aj,ak ■ ■ ■ , am in (pi and (/?2 in a transitive manner. In this case, fl <^2 j ^ Lfi, tpj, ; Oik Ifij, ip^, ; Olm \ Vi' V2 / Transitivity of operator applies also to the case of more than one common operator through several formulas, and it can be defined from case to case. Because common operands concern informing between formulas, the implication in the last definition can be expressed by means of a parallel system / a \ V <^2 (<^1 tp^) Another significant feature follows from the last definition: Theorem 1 Let the linkages in a circular manner tpi Vm-l Vm, Vm be given. Then, This feature is called the reflexivity of a circular informational linkage of formulas within (in the framework of) a formula system. □ Proof 1 We have to prove that {'fil,'P2,--- ,'Pm {fi Vj)] i,j e {1,2,... ,m} Within this conditionality also i 6 {1,2,... ,m} holds in a transitive (consequently multiple-linkage) manner. Another evident meaning of the theorem result is tpi (ßi,ip2,... ,ipm for alH" € {1,2,... ,m} It means that ipi is informationally linked to each of ipi,ip2,... jifm, including to itself (informational circularity). This proves the theorem. □ 3.2 Formula Systems In informational theory, a system of informational formulas corresponds to the notion of a set of elements in mathematics. A fundamental concept of informational theory is that of the system (short for the system of informational formulas). Definition 2 Intuitively, a system is a well-defined list of informationally well-formed formulas (separated by semicolons). Formulas consist of operands, (binary) operators, and parenthesis pairs. In a system, formulas inform in parallel to each other. In a proper system, formulas are information-ally linked via common operands, directly or indirectly (transitively), in such a way that each formula is, to some extent, informationally linked with each other formula of the system. In an improper system, some formulas are informationally isolated. Informationally, only proper systems appear (inform) to be reasonable. Isolated formulas inform per se, beyond the informational context of other formulas or subsystems of formulas in a system. For the rest of the system, such formulas are unobservable and do not observe informationally other formulas or formula subsystems. □ The union of two systems and $2, denoted by U $2, means the system / \ ($1 U #2) ^

g ^2) ^ system classifier y The union classifier can be expressed also as which represents the so-called alternative system, using comma instead of semicolon between formulas [35]. The intersection of two systems #1 and #2, denoted by $1 n $2, means the system ($in#2) y € #2 / If systems $1 and #2 do not have any formulas in common, $1 n $2 ^ 0; they are said to be disjoint systems. The relative complement of a system $1 with respect to a system denoted by or the difference of $ and #1, denoted by $ \ $1, is the system $ \ Kv? e $) V ((^ ^ #1)) The complement of a system denoted by C#i, is the system C$1 where T functions as a universal system. Evidently, C$1 ^ (T \ Usually, in a complex case, the formula system $ has the role of the currently universal system to which its subsystems can be compared.® 4 Topological Informational Spaces 4.1 Definitions 4.1.1 Open Systems Definition 3 Let $ mark a reasonable non-empty system of informational formulas. A class^° (short for informational class) O of subsystems o/$, D C is a topology on ^ iff D satisfies the following axioms: (Ti) The union of any number of systems in D belongs to D. (Tn) The intersection of any two systems in D belongs to D. (Till) Systems $ and 0 belong to D. The systems of D are then called 0-open systems, or simply open systems, and $ together with D, i.e. the informational pair (#,D), is called the topological informational space. □ As we see, a topological space is defined as an ordered pair between the carrier # and its topology O. Let us formulate Def. 3 in another way to get a different experience of the meaning of an informational topology. Topology can be determined by the following four steps too: A basic system $ of formulas tfi,ip2,... exists. 2° There exists a type characteristics^^ D e ^L5PL#JJ. 3° The first axiom is: For each system D', the informational implication // (D' e D) U- VVneo' / eO and $ € D hold. 4° The second axiom is: For each Ei and each E2, the informational implication (Si e D; H2 e D) ((Hi n S2) e D) holds. Such a structure family is called the topological structure, and the relation H € O can be expressed verbally as: system H is open in topology D.^^ ®A complex system is, for example, that of informational consciousness, in which several complex subsystems are imbedded. However, it does not mean that, in a specific case, a subsystem appears as a kind of universal to which its subsystems can be system-complementally compared. ®A reasonable system of informational formulas usually concerns a concrete, cyclically structured informational graph [32, 35]. class (family, collection) of subsystems means a system of subsystems. ^^For more details see [6], p. 246. Evidently, the openness of a system concerns its topology. Example 1 Topologies Deducible from Standardized Metaphysicalism. Let the following classes of subsystems of the standardized metaphysicalistic system [32, 35] • • • iVe) of circular formulas^^ be given: Di;=±(9JI;0;((^6)); For Da, the union ((v?4;,D) is a neighborhood (O-neighborhood) of a point (formula)

• • • . u, Besides, parallel components Cii-- - can appear and be distributed within different kinds of formulas, or even form a serial or circular serial formula as a whole, that is cS pS til-- - čS čS ^ V čS čS ?1)• • •. cS cS and/or respectively. Vector ||(5) corresponding to system 5 is determined by ll<5) "j \ \ o^'o The structure of vector ||5) needs to be additionally explained. What does such a vector include and in which sense the difference between the mathematical and informational vector comes to the surface? First, let us list all the components of vector ||(J) in concern to the origin system 8. System 5 is simply a parallel system of serial and/or circular serial formulas. But, in fact, this list is in no way a complete one in regard to the complex parallelism hidden in particular formulas of The reader should remind the axiomatic approach of the informational where the fundamental axiom is expressed by the implication («N/3) a; \ ß-, If this rule is recursively applied to a serial or circular serial formula [Q:,ai, • • • ,a„J or " . [o!,Qi, • ■ • ,q:„J, respectively, then, evidently, the application of the last axiom delivers all the sub-formulas appearing in a serial and/or circular serial formula, that is, in the serial case. theme of intention irrelevant, all reference a fiction, etc. (see At-tridge [1] p. 12). That a text for Derrida, especially a literary text, is always situated, read and re-read in a specific place and times makes it 'iterable' or repeatable, the same but always different, and therefore never reducible to an abstraction by theoretical contemplation (Derrida [11] pp. 172-97). A text is unique and repeatable, concrete and abstract simultaneously. This coexistence lies in the heart of deconstruction and reflects the connectedness of the subject and object in the experience of the self as pure consciousness. and in the circular serial case, A.P. Železnikar where the asterisked markers ^ój*,... denote the systems of serial subformulas of lengths 1, ... ,n —l,n conditionally in respect to operands in floor parentheses. Namely, a system 7^11 L^> 1 • ■ • ; ^raj includes only and only such basic transitions of the form ai \= aj (£ — 1) which appear in formula • • • or formula 1 ■ ■ ■ ' ^"J ^ whole), respectively. Similar concerns lengths £ up to value n orn + 1, respectively. A short analysis shows that in the serial and circular serial case the number of all possible subformulas of a given length can be evaluated by simple formulas. Let mark the length of a subformula in a serial formula with the length . Then, evidently, the number of such subformulas in a formula is if is even 'sub In a circular case there is o _ aub '-Bub fuzl if if is odd is even if Ìq is odd system, each operand in at least one circular formula. The third system, is the representative of all possible situations occurring by all possible parenthesis pairs displacements within the constructed (analyzed and synthesized) system. □ As said, the originally conceptualized system (obtained by the top-down or bottom-up approach or from both of them) is Thus, the remaining two systems, and evidently emerge from that is, > and where —> denotes the corresponding derivation approach. On this basis, three different topologies can be determined, as formula, operand and basic-transition topology, respectively. —> ^«Ni formal representative of the corresponding informational graph [35]. Now, let us show, how different topologies can be defined on $5 and $^1=»? 
^ concrete case, and how all they mirror one and the same informational graph, with different possibilities in regard to various parenthesis displacements in formulas of the system. As an example we choose the metaphysicalistic case. 5 Variants of Informational Topologies A topology D depends on the carrier system that is, on the characteristic forms of its formulas. Which kinds of formulas in $ can be distinguished? The most usual system of formulas is composed of different serial and circular-serial formulas. These formulas emerge during the analysis of an informational case, usually in a kind of top-down and bottom-up decomposition of an initial (top) marker or an end (bottom) marker, carrying implicitly a yet-not-determined concept, proceeding stepwise into a more detail of the case—a progressive case decomposition from different points of view. This approach seems to be the most natural one, seen from the human point of consciousness. Just after of such a case identification more abstract and convenient approach with possibilities can be considered. Definition 13 The constructed system of formulas, can take the following characteristic forms: ^ ■•• ; ^n;) u $[Ci|implicit operands]; ^ (6 H 6; 6 N 6; • • • ; \= U) u 1= ^jlimplicit basic transitionsj The first system, is an authentic, intuitively constructed representation of a real case. The second system, is strictly expressed by all the occurring system operands as the title operands of a circular formula 5,1 Topologies of a Simple Metaphysicalism Simple metaphysicalism is a basic scheme of informational invariance which can be further decomposed in greater details during identification of the involved entities, that is, a formula expressed in the metaphysicalistic form. Thus, the graph in Fig. 4 can be understood as a consequence of the circular metaphysicalistic formula system ji^.f^^L^ij, ctjj • Subscript j concerns the formula system component of system e.g., j = 1,2,... ,n. Subscript i concerns the operand component of metaphysicalistic formula system ^.y^L^ji^jJ € Subscript kj concerns the parenthesis-pair combination 1 < kj < . of the formula subsystem system fc^.V^LCijsctjJ) with altogether n Vi J Cij+l m possibilities, considering serial (input) and circular serial formulas of a system of formula systems, where denotes the length of the formula in a formula subsystem. 5.1.1 Topologies on the circular formula system According to the graph in Fig. 4, one of the possible formula systems can be constructed (reconstructed). Let it be the consequent observing type of metaphysicalism for which the extreme left-parenthesis heaping is characteristic, that is, kj = 1. In this case, the graph formula component component informing component counterinforming component embedding Figure 4: The graph representing the basic metaphysicalism of a formula system L^y, a^J E component ^ij, impacted by something (interior and/or exterior) aj. is interpreted by the one of possible formula systems, that is, {{{{Uij N ) N ic., ) N ) N c^., ) N l^i] e«.,) H e«.,) h^ij-, (0) According to the preceding notation, there is ^ • To be more transparent, let us replace this system by the abbreviated one, in the form representing the input transition formula and the six circular formulas, respectively. Which kind of topologies on can then be defined in a meaningful way? (1) Let topology if possible, maintain the meaning of the original formula system in respect to the graph in Fig. 4. 
Let the meaningful condition be («yfi), {^2), i^Ps), ((fi), (ifs), {(fe) e by which all of the loops and the input transition enter the topology. What are the consequences of such a choice? First, the mutual intersections of formulas are empty systems, that is, (V'p n v^g) ^ 0; p g; p, ? = 0,1,... , 6 However, all the possible unions of formulas (ßo, fi, V2) Ì J where kj can be an arbitrary subscript in the interval concerning the formula system Thus, one can take ctjj and par- allelize it according to the rules discussed in the previous text. Sometimes, the parallelization of if^^l^ij, aj\ into the system of primitive transition formulas is marked by The result is and, accordingly to Fig. 4, evidently. /aj N ^ij] N^«,,Ne«.-,; Ha \= csü N «J:«.-,; Ve«,., N e«., [^i Iv'^i I'p'i] K] 6 H^n instance, can be useful in cases where informational formulas and chains are combined, to enable the expression of An informational transition formula cćin appear in the system only once. In this way, the last five rows include only the remaining feedback transitions. (0) Let us introduce the notation j ■ For a better transparency, we replace the upper system by the abbreviated /t; Ai;A2;A3; A4;A5; Ae;^!; [v'J Wi\ — j«3; Wi] Wi] Ms; Wi] \ Me W'l] A.P. Železnikar bidirectional formula component bidirectional component informing bidirectional component counterinforming bidirectional component embedding Figure 6: The bidirectional graph representing the metaphysicalism of Fig. 5 by considering the primitive onedi-rectional and counterdirectional transition pairs (Ai,Aì~), (A2,A^), (A3,A^), (A4,A4"), (A5,A^), (AejAg"), (i"5,Mr)> (MÖ.MD' respectively. The correspondence between transition o' formulas in c^^ [ùj>oijì and their abbreviated notations in is evident. Notation t marks the input transition, Ap {p = 1,... ,6) the forward transition of the main loop, and /i, (9 = 1,... , 6) the feedback transition, corresponding to the graph in Fig. 4. The transitional situation is presented in Fig. 5. Systems - are already reduced by the common transitions within the main loop [(^'j]. Which kinds of senseful topologies can now be defined on (1) The basic topological question could concern the main loop in Fig. 5. The circular route of this loop is To this route, evidently, the subsystem [cp^], that is, (Ai; A2; A3; A4; As; Ae;ßi) € corresponds. If [ip'i] C is the only element besides 0 and which has to enter in topology already satisfies the axioms (Ti), (Tn), and (Tin)'. Thus, D^^^,! ;=± (0;yji'; (2) Let us study topology in which the basic transition systems, covering the loops in Fig. 4, are included. Thus, (Ai; A2; A3; A4; As; Aei^i), (a2; a3;a4;/u2), (a4; as; a6;/ì3), (À2;M4); (A6;/Ì6) € This choice of topological subsystems causes the inclusion of further subsystems. By the intersection axiom (Tn), there is (a2;a3;a4), (a4; as; ae), (a2), (a4), (ae) € Evidently, these subsystems represent the common parts of the loops. By the union axiom (Ti), elements like (Ai;... ;^6;MI;m2;M3;M4;M5;M6), (Ai;... ;À6;/^i;^2;m3;m4;m5), (Ai;... ;À6;^I;/X2;M4;M5;M6), (Ai;... ;Àe;Mi;/^3;M4;M5;M6), (Ai;... ; ^6', M2; M3'j M4; M5': Me), (Ai;... ;À6;mi;M2;M3;M4), (Ai;... ;^'6;Mi'>M2;M3), (Ai; ... ; Ae;/ii;/i2), (Ai;... ;À6;M5;M6) etc. must additionally enter topology Dj|=j,,2- Now, again axiom (Tu) has to be applied, etc. The number of elements in becomes enormous. 
5.2 Topologies of a Bidirectional Metaphysicalism Bidirectionality in informational sense means introducing a strict counterdirectional path (reverse serial or circular serial formula) in regard to the existing path (initial formula). In a graph, this situation is evidently visible by the occurrence of counterarrows or, in some cases, by the operand connection lines with arrows on both sides of the line. The graph in Fig. 7 represents a conceptually invariant shell of the possible bidirectional artificial consciousness. Bidirectionality is ensured in every point of the informational structure. Further, the graph can be used as a template for any formula system development on one side, and as a individual semantic approach to the choice of vertical components in several specific domains of the informational, that is, of the conscious individualism, its structure and organization on the other side. A suggestion for the choice of vertical components is given in [36, 38]. Thus, vertical components can fit best the specific field of research general components informing counterinforming embedding Copyright © 1998 by Anton P. Železnikar Figure 7: An initial informational shell of the generalized and standardized metaphysicalism of consciousness system 2i (a kind of pure consciousness), exploring the bidirectional metaphysicalism. A.P. Železnikar in respect to the function in the vertical metaphys-icaUstic scheme. On the other side, chosen vertical components can be again metaphysically decomposed in the horizontal direction. In the framework of consciousness circumstances, the stream of consciousness can be forced consciously into the opposite direction within informational cycles as shown in Fig. 6. A critical conscious informing must investigate its own conscious stream (of informing, counterinforming, and informational embedding) in one and the other direction, changing the causal conditions circularly in the opposite direction. In an unidirectional graph, each arrow, representing an operator, is replaced by the bidirectional arrow, representing two operators, the direct and the reverse one. Thus, in Fig. 7, a bidirectional arrow marks meaning two separate and functionally (essen- tially) different operands. Topologically, each concrete case concerning the graph in Fig. 6 can be informationally distinguished, foe example, by a definite setting of the parenthesis pairs in formulas. As mentioned frequently before, the formula system ^^ is the originally conceptualized model of a real informational situation. In this sense, bidirectionality offers the possibility to investigate a loop in one and the opposite direction. For a loop, for instance, the principles of the pure observing and the pure informing can be applied in one and the opposite direction, simultaneously. For the case in Fig. 6, there is, for example, [ma [vi] e«.,) N ec.,) t= Ci,-; N (eco- h (e«., h N (e:?., N [vri (k., N(^cJHeü)))))); Wì] H H N (ko- H 3«,,))); [vn (((Cc.v t= J N e«.,) N e«.,) N [vs] [V3-] NU.-,) N^co-; [Vi] wt] (e:«..- N e«.,) N e:«.,; [V5] N (eco- N N e?.,) N [ve] wt\) ferently), following the realistic circumstances for each of the system formula. However, any other setting of the parenthesis pairs in the system formulas does not change the informational graph in Fig. 6. Maybe, in a specific case, some direct and/or reverse paths can even be omitted or left simply void for a later final decision. (0) Let us denote , ip^. 
Now, for the sake of transparency, let be Vo; fi; v't; Kfe-, ft ) representing the input transition formula (^o (bringing into the system the exterior object a at point iij) and the twelve circular formulas. (1) To represent the variability of a formula system rooting in the possibility of arbitrary parenthesis pairs displacements in formulas, we can use the formal expression of informational schemes (the so-called graph routes of graph paths) for Fig. 6, and write the graph equivalent scheme in regard to the initial system . in the form This is the original (initial) formula system, a consequently observing case in each system formula, from which the graph in Fig. 6 was drawn, consistently following the rule of an arrow and its counterarrow. It is clear that according to a specific informational case, the parenthesis pairs can be set adequately (and dif- \ «j iij h 1= 1= cf,. h c:^,, 1= N t= Ö^f.y Hcf.v Ne«.., N^f.v; Nc«., N^co-; H kii t= N His \= [vo] [¥>1 I'pt [V2] [¥>3 [V4 WX b5] [ft [ve] In this formula system scheme, some directed and counterdirected paths obtain equal formal expression, e-g-, [V4] and [vj-] [vs] and and I be] and [tp^] seem to be equal. However, it is to understand that they originate from different informational situations and, according to the original circumstances, they have different operators between the equal operands^^. (2) An interesting case occurs in dealing with the operand system C - C-! C) concerning the graph in Fig. 6 and with possible topologies on this system. First, let us explain in which way the bidirectional operands tf^.-, tf^. are formally and explicitly represented. Fig. 6 shows how many causal circular paths (loops) pass a certain operand. The following correspondence is evident: C ► ► 6; ► 6 Operator ► reads directly informs the number of loops. How, for instance, operand is expressed explicitly by means of the loops passing it, using the so-called operand rotation principle for each of particular loop, and the informational path (scheme) form? The advantage of the path formula is that the setting of the parenthesis pairs remains open, and in this case various possibilities of the final setting of parenthesis pairs can be considered. Evidently, the following comes out from the graph in Fig. 6: he?,, Ni«., Ne:?.,; Ni«., Ncc., N^e.,; e^so- Ne«,, Ne?,, Ne:«.,; Ne«., Ncf., N^c.,; Ne«.-, Ne:«.,; e:«., N c«., N e:«,, \ Hi bf] [¥>11 [fi] ivi] [vf] [fl] / The last two paths are virtually equivalent (see the footnote Similar schemata can be obtained from the graph in Fig. 6 or the remaining operand systems , and The rule for an explicit expression of an operand out of given formula system is to collect all the formulas in which the operand occurs and then express these circular formulas, according to the principle of an operand rotation, in a way by which the operand comes to the title position (the most left and the most right position in a circular formula). What can then be said to the topological outlook of the obtained framed operand (in fact, a system of informational paths) representing formula a system by each of the system path? It is to stress that the graph (with 8 formula 'Ci,- for the formula system scheme paths) is a subgraph of the graph in Fig. 6 (merely the local informing and embedding loops are missing). In this sense we introduce a new concept of topology consisting of informational paths (routes, marked by p) instead of informational formulas ip, represent- ing p Thus, instead of we introduce or respectively. 
Each path (graph route) p represents potentially formulas if ip is the length of the path corresponding formula (number of the adequate formula binary operators). In this way a new sort of topological space is introduced, for instance pertaining to , . »tere = D and, e.g.. Evidently, ^ (3) One could construct other reasonable topologies being subsystems of . But, the next provok- ^^Operator \= denotes a general informational joker. In two cases, the equal transition formulas a f= /3 and a ^ ß can represent different transitions. For instance, between two substantives different verb forms can be set. It means that in virtually equal formal cases, the intention of a's informing follows the first and then the second verbal form. Finally, the cases are resolved as being different by the particularization of operators. ing question concerns a topology of formula systems ^^ (not just formulas ip) and topologies of topological spaces of the form Let mark a system of formula systems and a system of topological spaces in general. Let e and e C This structure delivers a system topological space Another concept of topology of topological spaces follows the condition e and delivering a topology topological space of the form 5.3 Topological Informational Spaces Possessing Informational Metrics What kinds of informational metrics could come to the surface, could be considered, and finally theoretically (constructively) applied in artificial systems of consciousness and other cognitive models? Which are the possibilities of introducing various kinds of metrics concepts—the informationally static^^ and infor-mationally dynamic^'' ones—into a topologically structured informational space? Several candidates come into consideration as measures of the informational metrics. The properties of such measures could be, for instance, meaningness, understandingness, interpretativeness, perceptiveness, conceptiveness, determinativeness, and several others. If so, the corresponding decomposition and expressiveness of informational measures as entities must be available. Where could these measures reside within a meta-physicalistic model? The answer is, anywhere. By the principle of operand rotation in a circular formula, any loop operand can be rotated to the initial (main) position of the loop and, by this, expressed by an adequate informational formula in respect to the parenthesispair setting in the formula. Meaning of something as an informational measure can usually appear in the embedding part of a metaphysicalistic loop. As a meaning of something it could represent the informational value (informational length) of something. In a similar manner, the informational distance between two informational operands could be determined, implicitly and explicitly, by a functionally inner and outer informational difference, respectively. Various concepts of understanding, conception, perception, etc. can serve as special measures of meaning (metrics). They can be placed constructively in any part of the metaphysicalistic loop and, then, rotated to an informationally static metrics, the most common concepts of informing are meant, for instance, that of something's meaning. Typical, purely static metrics concerns numerical or any other value, distance, or any other geometrical measure. 
^^By informationally dynamic metrics, the individually organized informational phenomena are meant, for instance that of an individual understanding structure, which has something in common with the individual structures of others, and which is to some extent structured invariantly (standardized) in concern to the meaning or understanding. the main position of a formula and expressed explicitly [32]. This kind of constructive approach must remain within the reasonable limits, preserving the common logical principles or direction. 6 Possible Geometry and Topology of the Informational That what will be stressed in this section concerns the interpretation possibilities of informational topologies by means of geometric bodies—their surfaces, intersections, volumes, and arbitrary substructures occurring interiorly, on the surface, and/or exteriorly of these bodies. Interpretation ideas can be found in several sources dealing with geometry [9, 22, 23, 26, 27]. Mathematica [9] seems to be the tool for an adequate graphical presentation. By such an interpretation of systems of informational formulas, geometrical bodies become also a means for informationally semantic presentation of modeled entities. For instance, a sphere—its interior, surface and exterior—can be taken as a body of consciousness (or a body of any other informational entity). The surface of the sphere can represent topologically that which is potentially possible to become conscious, and a circle on the sphere surface can represent the currently conscious. Such circles can expand as parts of different toruses which intersect with the sphere. They can represent different intentional in-formings within the consciousness activity. Further, the interior of the sphere can represent the subconscious which can come to the surface. On contrary, the exterior of the sphere can be grasped as the non-conscious and non-subconscious yet. Thus, a system of spheres and toruses intersecting each other can built a complex and to some degree globally transparent model of interacting consciousness systems. Such a complex geometrical model can be particularly, that is, additionally, characterized with specific topologies, bringing into the modeling system an interaction of different topological spaces. In this context, both informational topologies and geometrical bodies can become a reasonable unit for complex informational investigation and experiments in the domain of the informational, and particularly in the domain of the conscious in an informational sense. Geometrically interpreted, informational topological spaces of informational topological spaces could get a transparent view to an arbitrary (recurrent) depth. Further, such interpreting geometrical structures can behave variable in any possible aspect, for instance, in moving of geometrical body intersections together with bodies which can change also dimensions (volumes, radii, sides, surfaces) to follow the dynamic picture of informational circumstances emerging, changing, and vanishing. Such problems of informational and consciousness geometry interpretation deserve a special attention and will be treated somewhere else. 7 Conclusion We see how the concept of mathematical topology comes intuitively close to the informational topology. However, the substantial differences occurring between them, e.g. the nature of emergence of operands, operators, and formula systems, have to be stressed over and over again. 
Some of the differences are already recognized from the mathematical-informational dictionary in Sect. 2, and other follow from the discussion and examples in this paper. It is worth to refresh these differences by the following list: 1. A formula system is obviously a set of interdependent formulas, irrespective, how it is expressed; e.g., by (1) serial circular formulas of different lengths, (2) primitive transition formulas, or (3) informational operands that are in some way, by some specific formula systems given on some other places. 2. Formulas (elements) of an informational formula system are directly dependent on each other through the common operands. Thus, the change of an operand in a given formula changes the same operand in an other formula and, thus, changing the informing of the other formula. As said, the interdependence of formulas as system elements is a rule, that is, a consequence of their formal linkage through common formula operands. In this respect, informational formulas as system elements behave differently in respect to the elements of a mathematical set. 3. A consequence of the preceding item is that elements of a set are meant as a sort of constant determined entities, and are in this way represented as (fixed) set elements. On the other hand, formulas as system elements possess their emerging nature in any respect: in emerging operands and operators, in setting of parenthesis pairs in a formula, and, most significantly, in expanding or contracting a formula by the number of occurring operands and operators, that is, in spreading and narrowing the meaning power of a formula. 4. A concrete formula system can also emerge according to the circumstances of its informing, for instance, by adding the interpretational formulas concerning the occurring operands, expressing the operand properties by additional (new) formulas. On the other side, a concrete mathematical set is defined constantly, even its cardinality is infinite. The elements of a set are determined by an unchangeable rule (e.g., predicate) or by a sort of concrete or recursive enumeration. By informational topology, a complex meaningly structured grouping and coupling of formulas concerning substantial informational spaces can formally be expressed (implemented), keeping the entire, that is, a non-reductional informational nature of involved entities as they perform in their reality. In this respect, a topologized formula system is not a simplified model for real informational situations, for instance in the domain of cognitive science^®. Tangled webs of causal influences are target phenomena in recent biology and cognitive science [10]. Such twisted influences include both internal and external factors as well as patterns of reciprocal (also bidirectional) interaction. The shell graph in Fig. 7 is a general scheme for the most pretentious informational modeling and experimenting, where the so-called reductionist approach can be entirely circumvented. Such an initial informational shell can be used as an informing model for any other problems beside consciousness (e.g., in philosophy, cognitive science, biology, psychology, psychiatry, language, on-line economic simulation, etc., as shown in [32, 37] where additional references are listed.). This points evidently to the applicability of informational topology with its deep intuitive background being appropriate for natural and artificial modeling of interactive philosophical and scientific problems. 
An evident example of informational metaphysicalism could be the so-called inner speech (talking to oneself) [3]. Such speech is constituted by the experienced meaning (informing), the emergence of speech (counterinforming), and logical articulation (informational embedding), respectively. But all components of this sort can emerge in a distributed form across the inner-speech informing. They can be treated (grasped, understood) topologically as a certain informational or semantical unity through topological grouping by subsystems σ ∈ 𝒯, where σ ⊆ Φ and 𝒯 is the corresponding topological informational space.
In contrast to artificial neural networks (ANNs), which are simplified mathematical models of neural systems formed by massively interconnected computational units running in parallel [28], informational topological systems can always fit adequately and arbitrarily precisely the point of meaning in a topological continuum.
References
[1] Attridge, D. 1992. Derrida and the question of literature. In J. Derrida, Acts of Literature. Routledge. New York.
[2] Balakrishnan, V.K. 1995. Combinatorics (Including Concepts of Graph Theory). McGraw-Hill. New York.
[3] Blachowicz, J. 1997. The dialogue of the soul with itself. Journal of Consciousness Studies 4:485-508.
[4] Bonnington, C.P. & C.H.C. Little. 1995. The Foundations of Topological Graph Theory. Springer-Verlag. New York, Berlin, Heidelberg.
[5] Bourbaki, N. 1960-1966. Théorie des ensembles. Chapitres 1, 2, 3 et 4. Hermann. Paris.
[6] Bourbaki, N. 1965. Teoriya mnozhestv (Set Theory, Russian translation). Mir. Moscow.
[7] Bourbaki, N. 1965. Topologie générale. Chapitres 1 et 2. Hermann. Paris.
[8] Bourbaki, N. 1968. Obshchaya topologiya. Osnovnye struktury (General Topology: Basic Structures, Russian translation). Nauka, Fizmatgiz. Moscow.
[9] Boyland, P. 1991. Guide to Standard Mathematica Packages. Wolfram Research.
[10] Clark, A. 1998. Twisted tales: causal complexity and cognitive scientific explanation. Minds and Machines 8:79-99.
[11] Derrida, J. 1988. Signature event context. In G. Graff, Ed., Limited Inc. Northwestern University Press. Evanston.
[12] Fréchet, M. 1906. Sur quelques points du calcul fonctionnel. Rend. Palermo 22:1-74.
[13] Franz, W. 1960. Topologie I. Allgemeine Topologie. Sammlung Göschen, Band 1181. Walter de Gruyter & Co. Berlin.
[14] Gross, J.L. & T.W. Tucker. 1987. Topological Graph Theory. J. Wiley. New York.
[15] Haney, W.S. 1998. Deconstruction and consciousness: the question of unity. Journal of Consciousness Studies 5:19-33.
[16] Hocking, J.G. & G.S. Young. 1961. Topology. Addison-Wesley. Reading, MA, London.
[17] Husserl, E. 1950. Ideen zu einer reinen Phänomenologie und phänomenologischen Philosophie. Husserliana III (W. Biemel). Martinus Nijhoff. Haag.
[18] Hu, Sze-Tsen. 1964. Elements of General Topology. Holden-Day. San Francisco, London, Amsterdam.
[19] Kelley, J.L. 1955. General Topology. Springer-Verlag. New York, Heidelberg, Berlin.
[20] Lipschutz, S. 1964. Set Theory. Schaum. McGraw-Hill. New York.
[21] Lipschutz, S. 1965. General Topology. Schaum. New York.
[22] Maeder, R.E. 1990. Programming in Mathematica. Addison-Wesley. Redwood City, CA.
[23] Prassolow, V. 1995. Topologie in Bildern. Verlag Harri Deutsch. Thun, Frankfurt am Main.
[24] Prijatelj, N. 1985. Mathematical Structures III. Neighbourhoods. Državna založba Slovenije. Ljubljana. In Slovene.
[25] Thron, W.J. 1966. Topological Structures. Holt, Rinehart and Winston. New York, London.
[26] Tóth, L.F. 1965. Reguläre Figuren. B.G. Teubner. Leipzig.
[27] Wells, D. 1991.
The Penguin Dictionary of Curious and Interesting Geometry. Illustrated by J. Sharp. Penguin Books. London.
[28] Yang, H.H., N. Murata & S. Amari. 1998. Statistical inference: learning in artificial neural networks. Trends in Cognitive Sciences 2(1):4-10.
[29] Železnikar, A.P. 1994. Informational Being-of. Informatica 18:277-298.
[30] Železnikar, A.P. 1995. Elements of metamathematical and informational calculus. Informatica 19:345-370.
[31] Železnikar, A.P. 1996. Informational frames and gestalts. Informatica 20:65-94.
[32] Železnikar, A.P. 1996. Organization of informational metaphysicalism. Cybernetica 39:135-162.
[33] Železnikar, A.P. 1996. Informational transition of the form α ⊨ β and its decomposition. Informatica 20:331-358.
[34] Železnikar, A.P. 1997. Zum formellen Verstehen des Informationsphänomenalismus. Grundlagenstudien aus Kybernetik und Geisteswissenschaft/Humankybernetik 38:3-14.
[35] Železnikar, A.P. 1997. Informational graphs. Informatica 21:79-114.
[36] Železnikar, A.P. 1997. Informational theory of consciousness. Informatica 21:345-368.
[37] Železnikar, A.P. 1997. Informationelle Untersuchungen. Grundlagenstudien aus Kybernetik und Geisteswissenschaft/Humankybernetik 38:147-158.
[38] Železnikar, A.P. 1997. Informational consciousness. Cybernetica 40:261-296.
[39] Zykov, A.A. 1969. Teoriya konechnykh grafov I (Theory of Finite Graphs I, in Russian). Nauka, Siberian Branch. Novosibirsk.

Control Mechanisms for Assuring Better IS Quality

Marjan Pivka
University of Maribor, School of Business and Economics Maribor, Razlagova 14, 62000 Maribor, Slovenia
Tel: ++386 62 2290247, Fax: ++386 62 26 681
Email: pivka@uni-mb.si
Keywords: software quality, IS manager, business
Edited by: Janez Grad
Received: August 28, 1997   Revised: July 17, 1998   Accepted: August 25, 1998

The software domain is faced with a number of quality assurance and process improvement models. Business managers are under pressure from many different kinds of assessments of their operations, products and services. Accounting departments are audited by financial auditors. What about Information Systems? Do we have a universal model for how to achieve the required IS quality? This paper deals with the definition of IS quality and with the influence of different control mechanisms on IS. The results of this empirical research are several. First of all, none of the control mechanisms is universal and applicable to all IS resources. Applying more than one of them could be redundant. Mutual recognition of results between them is required. IS managers are responsible for understanding them and for using them, with all their limitations, on specific IS resources.

1 Introduction

The computer-based Information System (IS) uses hardware, software, telecommunications and other forms of Information Technology (IT) to transform data resources into a variety of information products. Enterprises need and use those products in their business processes to achieve business objectives. IT resources need to be managed in order to provide such information products to the enterprise. Typical resources of an IS are:
- dataware: computer databases and other data resources
- software: computer programs, applications, ...
- lifeware: human resources
- hardware: computers, communications and other office technology
- orgware: organisation, procedures etc.
In business we are constantly under pressure to reduce all kinds of expenses on the one hand and to improve quality on the other.
One of these expenses is that for Information Systems, and there are always some logical questions to be asked: is it good enough to justify the expense? Do we get what we need? Shall we invest in a new IS? Is our IS reliable? Are its results accurate? Those and other questions can be answered in a discussion of the quality of IS, because quality is what we all expect from an IS: coverage of all functional requirements, reliability, needed results delivered on time, usability for all users, maintainability etc. It would be too easy to suppose that IS quality could be achieved by assuring a high quality of IS resources. An IS is too complex, it grows with the organisation, and it needs different control mechanisms in its maturity process. The high quality of the technical resources of an IS, such as software or hardware, is by no means a guarantee of its high-quality implementation in an improperly organized enterprise! And vice versa! Different methods and principles (our term is control mechanisms) are known for IS quality assurance. Some of them control the IS development process, others IS resources, and some may be used to control both the development process and the implementation process. The best known control mechanisms today are:
- quality system standards (ISO 9000 family standards);
- software product standards;
- software process assessment models such as BOOTSTRAP [Haase et al 1993], CMM [Paulk et. all 1993] [CMM v2.0], ISO/SPICE [Rout P. Terence 1995] and many others [SPC 1997];
- IS Auditing.
It is difficult to understand those, and perhaps other, aggressively marketed control mechanisms. Many have asked themselves: ISO 9000, CMM, SPICE, IS Auditing, BOOTSTRAP - what shall I do? Why does our financial auditor request an IS audit if we use a certified software product, or have an ISO 9001 certificate in software development? What shall I do? Which mechanism or model can help me? What are their strengths, and what are their weaknesses? The answer is neither universal nor easy. The aim of this paper is to analyse the influence of the best known control mechanisms on IS (not just software!) quality. The paper is organized as follows. The second section introduces a formal definition of IS quality. Sections three to six deal with quality system standards, software product standards, software process assessment models and IS Auditing. Each control mechanism is briefly introduced and its strengths and weaknesses with respect to IS quality are discussed. The discussion of collaboration, competition or conflicts between those control mechanisms is presented in section 7. Finally, conclusions are given in section 8.

2 The formal definition of IS quality

Quality is defined as the totality of characteristics of an entity that bear on its ability to satisfy stated and implied needs [ISO/IEC 8402:1995]. According to this definition, the quality of an IS in general is the totality of needed or implied quality characteristics of dataware, software, lifeware, hardware, and orgware. Or, speaking more generally, the quality of an IS is the totality of the quality of its resources. Therefore, if we want to judge whether a given information system is reliable or not, accurate or not, efficient or inefficient, etc., we shall:
- define a quality model, i.e. quality characteristics based on required and implied needs,
- measure or assess each characteristic,
- compare the measured or assessed characteristics with the specific requirements, and
- validate the results.
In the field of engineering these procedures are well defined and known as evaluation models.
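As a rough illustration of these four steps, the sketch below turns a small quality model into a single score and a list of unmet requirements. It is only one possible instantiation under assumed names: the characteristics, target values and weights are invented, and the weighted-sum aggregation is merely one possible choice for the function f that is left open below.

```python
from dataclasses import dataclass

@dataclass
class Characteristic:
    name: str
    required: float   # stated or implied need (target value, normalised to 0..1)
    actual: float     # measured or assessed value (normalised to 0..1)
    weight: float     # significance of the characteristic for this IS

def evaluate(characteristics: list[Characteristic]) -> tuple[float, list[str]]:
    """Compare actual values against requirements and aggregate a quality score.
    A weighted sum is used purely for illustration; the aggregation function is
    not fixed by the paper."""
    total_weight = sum(c.weight for c in characteristics)
    score = sum(c.weight * min(c.actual / c.required, 1.0)
                for c in characteristics) / total_weight
    gaps = [c.name for c in characteristics if c.actual < c.required]
    return score, gaps

# Illustrative characteristics only; a real model must be decomposed to a
# measurable level for every IS resource (dataware, software, lifeware, ...).
model = [
    Characteristic("reliability",     required=0.99, actual=0.97, weight=3),
    Characteristic("usability",       required=0.80, actual=0.85, weight=2),
    Characteristic("maintainability", required=0.70, actual=0.60, weight=1),
]
score, gaps = evaluate(model)
print(f"quality score = {score:.2f}, unmet requirements: {gaps}")
```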
In software engineering, such a model is defined in [ISO/IEC 9126:1991] and in [ISO/IEC DIS 14598:1996]. Real implementation of this evaluation model requires the practical solution of some very serious problems:
1. The definition of the IS quality model. A quality model in general is a structure or composition of all quality characteristics of an entity. Thus, for an IS quality model the quality characteristics and their sub-characteristics shall be defined for each IS entity.
2. Each quality characteristic shall be decomposed to a measurable or assessable level. The metrics or assessment method for each characteristic has to be defined.
3. Users', legal and/or professional requirements (i.e. stated and implied needs) and their significance for IS quality shall be defined for each quality characteristic.
The quality of an IS is of course not a simple sum of the quality of each IS resource. The quality of an IS, Q, is a function of the stakeholder-defined (stated and/or implied) IS characteristics (DCi), the actual values of those characteristics (ACi), and the stakeholder-defined influence of each characteristic (pi) on the IS:

Q = f(DCi, ACi, pi).

The function f and its arguments DCi, ACi and pi depend on the management and operations maturity of the organisation, the type of organisation, its environment etc. for which the IS is intended. They all define general requirements such as the type of IS (Management IS, Decision Support Systems, Executive Support Systems, ... or any combination thereof), and of course very specific requirements such as inputs, outputs, interfaces, security requirements, services etc. This definition formally demonstrates that the quality of an IS cannot be achieved merely by a high quality of the IS resources (ACi). It can be achieved only if the requirements (f, DCi and pi) are demonstrated with the actual resources (ACi). For instance: the implementation of a high-quality and complex software product in a badly organized organization will result in a badly organized IS!

3 Influence of Quality System standards on IS quality

The quality system is defined as the organisational structure, procedures, processes and resources needed to implement quality management [ISO/IEC 8402:1995]. The most popular international quality system standards are the ISO 9000 family of standards. ISO 9001:1994 defines a model for quality assurance in design, development, production, installation and servicing. It defines a number of quality requirements structured in 20 clauses, from management responsibility to statistical techniques. ISO 9000-3:1997 [ISO 9000-3:1997] provides guidelines for the application of ISO 9001:1994 to the development, supply, installation and maintenance of computer software. The British TickIT [TickIT 1998] certification scheme assures a thorough application of ISO 9001 in combination with the ISO 9000-3 guidelines in software development processes. The result of ISO 9001 application in software development is a defined, controlled and managed process of development, supply, installation and maintenance of computer software. The strengths of an implemented ISO 9001 in a software development process are:
— The management of the software development process is focused on internationally acknowledged quality requirements defined by ISO 9001, which are also measurable. As such, management can monitor and compare quality characteristics of the software development process (productivity, efficiency, number of bugs, user complaints etc.) with their plans and with their competitors.
— It covers the processes in a software development, acquisition and implementation: from requirements definitions to planning activities. Software Life Cycle (SLC) activities, configuration management, hardware purchasing, software maintenance and training. — It assures constant improvements in SLC, with Quality Assurance and Quality Control activities such as corrective and preventive actions. the — It improves co-operation between all parts in SLC processes. — Its international recognition (national and international certification schemes) has a strong impact on the software industry. The weaknesses of implemented ISO 9001 in a software development process are: — The quality system is limited to the software development processes. IS resources such as lifeware, orgware, hardware and other IS resources are not directly considered; — Implementation of activities such as security management, application control and technology specific controls, depend on the maturity of the implemented quality system. This may vary from not implemented at all to fully implemented procedures; — Implementation of the ISO 9001 standard does not give clear answers to questions concerning productivity, functionality, usability, reliability, cost effectiveness etc. of an IS. — A controlled software development process is no guarantee for quality solutions to business problems. If the user defines a bad or inadequate requirement, this will be with high quality built in a software, but the end product will be of no use. The above discussion consider only the software development process. An interesting is situation is, where IT resources in an enterprise are under the umbrella of ISO 900x requirements, but other business processes are not. The information products are faced with problems like functionality, availability or usability in such a cases. The root causes are in the different maturity levels of business orgware. A Data Base Administrator can not design a robust and accurate data model of an enterprise if the enterprise has no business vision or poUcy statement, long and short term business plans, and also in the case when he or she may or can communicate only with people from third level management. The business objectives of an enterprise are not achieved only with the IT resources. IT resources are the support to other processes and ideally, all of them must be on the same maturity level to achieve expected business results. The implementation of the ISO 9001 standard in the software development process has no essential influence on a very important IS source, orgware. This means that a quality of IS can not be assured only by implementation of the ISO 9001 standard in the software development process, but it is also necessary to consider the maturity of the organisational and management level of the company. In other words, the impact of ISO 9001 in software development and on IS depend on the maturity of the environment for which the software development is intended. However, it is to be expected that the gap between IT processes and business processes will be reduced in time if the quality system is implemented only in one of them. A quality system requires internal audits and corrective and preventive actions which assure improvements and growth of all involved. 4 Influence of software product standards on IS quality There are several hundred software standards. 
Most of them are National or Multinational such as ANSI (American National Standard Institute), BSI (British Standard Institute) and DIN (Deutsches Institute für Normung) standards, and professional standards such as IEEE Standards, Defence standards etc. The majority of them deal with the results of software activities and tasks such as Management, Quality Assurance, Configuration Management, Safety, Design, Requirement Specification, Coding, Verification & Validation etc. and are mainly considered in the process of software development (discussed in sections 2,4, and 5). The applicability of them and their influence on IS quality therefore depends on the scope of the standard within the software development process. Only a few of the software product standards deal with software products for end users or software pack- ages. They are: ISO/IEC 12119 1995: Information Technology - Software packages - Quality requirements and testing, ISO/IEC 9126 1991: Information Technology - Software product evaluation - Quality characteristics and guidelines for their use and ISO 14598:1996 - Part 1 to Part 6: Information Technology, Software product evaluation. The international standard ISO/IEC 12119 is applicable to software packages like accounting, payroll, data base programs etc. Sets of quality requirements based on this standard are: requirements on product description, documentation, programs and data, and testing procedure. The strengths of software product standards are: — Conformity according to those standards provide confidence that the product actually does what it claims to. — They can be used in national or international software packages certification schemes. — They can be used as a marketing advantage for off the shelf products; — They can have a substantial impact in a software acquisition process. The weaknesses of those standards generally deal with their scope and their importance in the IS. Existing software product standards cover only some parts of the software design process or software products. Its influence on IS quality is limited to the importance of the considered subject of software quality. It is obvious that safety standards (implemented in a software development process!) have a supreme influence on IS quality in safety critical systems, such as Nuclear Power Plant. And vice versa: a text processor package with a certificate of conformity with ISO/IEC 12119, has only little influence on IS quality. 5 Influence of software process assessment models on IS quality It is very important to recognise, that any software process improvement program needs a sound understanding of the current status of the software development process. Software process assessment is the most common method used to achieve this understanding. Process assessment is defined as The disciplined examination of the process used by an organisation against a set of criteria to determine the capability of those processes to perform within quality, cost and schedule goals. The aim is to characterise current practice, identifying strengths and weaknesses and the ability of the process to control or avoid significant causes of poor quality, cost and schedule performance. [ISO/IEC JTC1/SC7 1992]. This definition is also applicable to software process assessment. The most popular approaches for software process assessment are the Software Engineering Institute's (USA) CMM - CapabiUty Maturity Model [Paulk et. all 1993] [Paulk M.C. 
1995] [CMM v2.0], the ISO/SPICE (Software Process Improvement and Capability Determination) project [ISO 15504 (SPICE) PDTR Draft 1996], and BOOTSTRAP (a European-developed assessment method [Haase et al 1993]). Some of the others are the Capers Jones software measurement model [Jones 1991] and Model-based Process Assessment [McGowan et al 1993]. The SLC is also a subject of standardisation: the IEEE standard for software life cycle processes [IEEE 1988] and ISO/IEC 12207:1995, Information Technology - Software life cycle processes (SLC). Many other models and standards can be found on the Software Productivity Consortium web server [SPC 1997].
The CMM provides a conceptual structure for improving the management and development of a software process in a disciplined and consistent way. The CMM divides the software process into five maturity levels which highlight the primary process changes made at each level: Initial or basic level (ad hoc process), Repeatable (basic project management is established), Defined (the process is documented and standardised), Managed (process and product measurements are established) and Optimised. Each level comprises a set of process goals that, when satisfied, stabilise an important part of the software process. This in turn results in an increase of the software process capability. Each maturity level is composed of a number of key process areas. These key process areas are sets of activities that, when implemented, achieve a set of goals important for enhancing the process capability.
A European software process assessment and improvement method, BOOTSTRAP, has been developed in an ESPRIT project (1990-1993). BOOTSTRAP has been built on the basis of the CMM and the ISO 9000 series of standards. The basic concept underlying BOOTSTRAP requires the fulfilment of basic organisational requirements, such as process control, project management and risk management, before any changes in methods and technology are made to improve the software process. An organisation and its processes are assessed with respect to organisation, methodology and technology. The result of a BOOTSTRAP assessment is a capability profile showing the maturity of an organisation against an ideal level of maturity, a comparison with ISO 9001 requirements, and recommendations for appropriate actions for further improvements.
ISO/SPICE is a project of the international Committee on Software Engineering Standards ISO/IEC JTC1/SC7. It synthesises the above-mentioned models and standards (CMM, BOOTSTRAP, Trillium, ISO 9001, ISO 12207 and others). SPICE embodies a sophisticated model for software process management drawn from the world-wide experience of large and small companies. The architecture of the SPICE process assessment defines a two-dimensional view of software process capability: process categories and capability levels. The process categories are: the Customer-supplier process category, the Engineering process category, the Project process category, the Support process category and the Organisation process category. Each category is a set of processes addressing the same general area of business. The result of a SPICE assessment is a "software process profile" defining a capability level for each considered software process category. This model will have a significant influence on the software domain in the near future, very probably as a de facto standard.
The strengths of software process assessment models are that:
- The Software Life Cycle processes are dealt with in full.
- They give a clear profile of the current capability of the software development and maintenance processes. This profile corresponds to the maturity level of the software development and maintenance process. Maturity levels or maturity profiles identifies current capabilities of the process and identify process areas for further improvements. - They have a important influence on the software community. - The results of an assessment can be used in a benchmarking process. - The results can be used in national or international procurement activities. - They can be used for self assessment, as a starting point into a quality improvement program. The weaknesses of those models are: - They are limited to the Software Life Cycle and do not consider other IS/IT resources such as orgware, peopleware, hardware, telecommunications etc. - They are not known outside of the software community (to the IT users). - They are neither national or international standards. Companies where those models are applicable are software houses and Electronic Data Processing departments or software development departments within enterprises. They can be used as a self assessment tool for improvements plans or implemented by a third party as an independent assessment, if stakeholders require such an assessment. 6 IS Auditing The EDP Auditor Foundation, Inc. (EDPAF) developed General standards for information system auditing [Dykman A.C.] and Control objectives as a model for IS audit procedure. By general standards for information systems auditing [IS Audit], the Information System Auditing is defined as any audit that encompasses the review and evaluation of all aspects (or any portion) of automated information processing systems, including related nan automated processes, and the interfaces between them. Those aspects, defined by EDP auditor Control objectives [Dykman A.C.] are: - Management control - Information system development, acquisition, and maintenance - Information system operations controls - Apphcation controls - Database supported information system controls - Distributed data processing and network operations controls - Electronic data interchange controls - Service bureau operations controls - Micro computer controls - Local area network controls - Expert system controls and - Joint application design controls. Each general control is divided in to controllable and manageable units. From the definition of the IS, the IS audit definition by EDPAF, and the described control objectives it may be concluded that EDP audit procedures deals with all aspects of an IS. The result of the IS audit procedure is a set of documented facts obtained with interviews and questionnaires by a certified auditor on an audited IS entity. 
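A certified auditor's output can thus be pictured as a list of documented facts grouped under the control objectives above. The sketch below is only an illustration: the two objectives are taken from the EDPAF list, while the findings, their sources and the pass/fail judgements are invented for the example.

```python
from dataclasses import dataclass, field

@dataclass
class Finding:
    source: str        # interview or questionnaire from which the fact was obtained
    fact: str          # the documented fact
    satisfied: bool    # whether the fact supports the control objective

@dataclass
class ControlObjective:
    name: str
    findings: list[Finding] = field(default_factory=list)

    def is_satisfied(self) -> bool:
        # an objective counts as satisfied only if it has findings and none of them fail
        return bool(self.findings) and all(f.satisfied for f in self.findings)

# Hypothetical audit fragment; objective names follow the EDPAF list above.
audit = [
    ControlObjective("Application controls", [
        Finding("questionnaire Q7", "input validation documented for payroll", True),
        Finding("interview with DBA", "no reconciliation of batch totals", False),
    ]),
    ControlObjective("Local area network controls", [
        Finding("interview with IT manager", "LAN access rights reviewed yearly", True),
    ]),
]
for objective in audit:
    status = "satisfied" if objective.is_satisfied() else "weak"
    print(f"{objective.name}: {status} ({len(objective.findings)} documented facts)")
```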
The strengths of the IS auditing assessment model are: - any IS/IT entity can be audited; - it is a strong management tool; - special considerations are pointed on technical aspects such as Data Bases, LANs, Micro Computer Control, application controls, IT/IS risk evaluation and data security etc.; - it is usually used in collaboration with financial auditing; - it can be used as a basis for an improvement program; The weaknesses of this model are: - it can not be used as a tool to find out the current maturity level of a software development process or to compare a project profile; - it can not used in an competition attaining work / business - it is neither a national or international standard; - it can not be used as a self assessment model; This model is applicable mostly to the auditing of EDP departments and other IS resources within enterprises [Pivka M. 1998]. Owners of IS auditing are usually company's management and accounting auditors. IS auditing is generally not applicable to software houses. Detail informations on control objectives for information and related technology are in [CobiT 98] which is a registered trade mark of ISACA. 7 IS control mechanisms: collaboration, competition or conflicts? Information Technology managers, software development managers, business managers and users are faced with aggressively marketed control mechanisms. The following questions are interesting from the IS management point of view: 1. Which model to choose? 2. Which IS resources are controlled by those mechanisms? 3. Are they redundant? 4. Do they compete? 5. Do they collaborate? The answer to those and other questions on IS control mechanisms are not easy and universal. The following paragraphs describe the most general characteristics of control mechanisms and their influence on IS resources. In table 1, the ranking of the influence of control mechanisms on some IS resources is defined. It is obvious, that assessment models have limited or nil influence on technology resources such as communications. But of course, they have a strong influence on the software development process. Table 2 represents which model to choose for some most interesting business requirements. Quality systems based on ISO 9001, BOOTSTRAP, SPICE, CMM and other assessment models, are management tools for improving the software development, maintenance and implementation process. They have an important influence on the software environment in defining their maturity level and in helping them to find and define the key management procedures to improve the software process. Assessment teams (first, second or third party teams) use BOOTSTRAP, SPICE or the CMM model to identify the maturity level of the software process. Models are not used in national or international certification schemes. On the other hand, the ISO 9001 certificate confirms at an international level that the software process is in compliance with internationally accepted quality requirements. The influence on IS of those models is therefore limited to the scope of the model! We may conclude, that there is some competition and conflicts between them, but also a possibility for collaboration. For instance: self assessment with a CMM model can be a good starting point for an improvement program with the aim of attaining an ISO 9001 certificate. The influence of software product standards on IS quality is limited to the importance of the considered software package or the scope of the standard. 
Those standards shall therefore be recognised as helpful and useful in assessing aspects such as risk management, application control, data security, or any other IS/IT entity, where such standards exist and are implemented.
The EDPAF IS audit model is a tool for general management to evaluate the efficiency, security, productivity etc. of the implemented IS/IT, or a part of it, in a company. IS auditing does not deal with the natural growth of software processes as proposed by BOOTSTRAP, SPICE and the CMM, or with quality systems as defined in the ISO 9000 family of standards. IS audits are most usually ordered by top management or accounting auditors. There are some gaps between IS auditing and the other models, which are certainly opportunities for collaboration. At least the following aspects are a matter of collaboration: application controls, risk management and IT assessment. There is also some overlapping (and therefore conflicts and competition) between those models, especially between the assessment models. The BOOTSTRAP, SPICE and CMM models overlap each other, and conducting more than one of them in practice is redundant. Which of them to choose is, in our opinion, a matter of the added value provided by the method and the assessor company (third-party assessment). It is generally accepted that a software process which is certified to be in compliance with ISO 9001 is on the third capability level of the CMM. Overlapping between IS Auditing and the other control mechanisms is obvious only for the software development and maintenance process; as such, SLC processes are an aspect of possible conflicts and competition. The responsibility for proper decisions between the different models is unfortunately on the stakeholders' side.

Table 1: Control mechanisms and their influence on IS.

| IS resource          | ISO 9001             | CMM, SPICE, BOOTSTRAP | Product standards  | IS Auditing |
| People               | strong               | strong                | none               | medium      |
| Application system   | strong               | medium                | medium (1)         | strong      |
| Software development | strong               | strong                | medium (2)         | medium      |
| Technology of IS     | medium               | medium                | none               | strong      |
| Facilities           | medium               | none                  | none               | strong      |
| Data security        | medium to strong (3) | medium to strong (3)  | standard dependent | strong      |
| Risk management      | medium to strong (3) | medium to strong (3)  | standard dependent | strong      |
| Communications       | medium               | medium                | standard dependent | strong      |

(1) depends on the standard and the application system; (2) depends on the standard and the applied SLC; (3) depends on the maturity of the company.

Table 2: Which model to choose.

Requirement: to assess specific IS/IT resources concerning productivity, security, usability, ...
Possible solutions: IS auditing, combined with assessment methods for the software process if necessary.
Typical stakeholders: top management, financial auditors.

Requirement: to assess the software process as the basis for a software improvement programme.
Possible solutions: assessment methods (BOOTSTRAP, CMM, SPICE, ISO 9001); depends on the added value offered by the assessor company.
Typical stakeholders: SW managers, user requirements.

Requirement: to buy (or produce) a SW product for a market with legal or specific requirements.
Possible solutions: software product standards or ISO 9001.
Typical stakeholders: legal requirements, market or contractor's demands.

Requirement: to improve software quality and to gain a market advantage.
Possible solutions: ISO 900x certification with one of the assessment methods.
Typical stakeholders: usually influenced by market demands or by management.
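Where such comparisons have to be made repeatedly, Table 1 can also be kept in machine-readable form. The sketch below simply encodes the table and adds a small helper for shortlisting mechanisms by resource coverage; cells that read "depends on the maturity of the company" or "standard dependent" are collapsed to "varies", so some nuance is lost.

```python
# Influence of each control mechanism on IS resources, transcribed from Table 1.
INFLUENCE = {
    "People":               {"ISO 9001": "strong", "CMM/SPICE/BOOTSTRAP": "strong", "Product standards": "none",   "IS auditing": "medium"},
    "Application system":   {"ISO 9001": "strong", "CMM/SPICE/BOOTSTRAP": "medium", "Product standards": "medium", "IS auditing": "strong"},
    "Software development": {"ISO 9001": "strong", "CMM/SPICE/BOOTSTRAP": "strong", "Product standards": "medium", "IS auditing": "medium"},
    "Technology of IS":     {"ISO 9001": "medium", "CMM/SPICE/BOOTSTRAP": "medium", "Product standards": "none",   "IS auditing": "strong"},
    "Facilities":           {"ISO 9001": "medium", "CMM/SPICE/BOOTSTRAP": "none",   "Product standards": "none",   "IS auditing": "strong"},
    "Data security":        {"ISO 9001": "varies", "CMM/SPICE/BOOTSTRAP": "varies", "Product standards": "varies", "IS auditing": "strong"},
    "Risk management":      {"ISO 9001": "varies", "CMM/SPICE/BOOTSTRAP": "varies", "Product standards": "varies", "IS auditing": "strong"},
    "Communications":       {"ISO 9001": "medium", "CMM/SPICE/BOOTSTRAP": "medium", "Product standards": "varies", "IS auditing": "strong"},
}

def mechanisms_covering(resource: str, at_least: str = "strong") -> list[str]:
    """Return the mechanisms whose influence on `resource` reaches the given level."""
    order = {"none": 0, "medium": 1, "varies": 1, "strong": 2}
    row = INFLUENCE[resource]
    return [m for m, level in row.items() if order[level] >= order[at_least]]

print(mechanisms_covering("Data security"))                      # ['IS auditing']
print(mechanisms_covering("People", at_least="medium"))
```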
In situations where IS auditors, auditors for quality systems (ISO 9000 family) and software process assessors are hired, or where some assessment results exist from the past, we recommend that company management demand collaboration and a mutual recognition of results from all parties. This shall have a substantial influence on the costs and added value of the company. Otherwise, a lot of people in the company will be interviewed several times with similar questions on the same subjects! Table 2 shows an example of which model to choose for some business targets or strategies.

8 Conclusions

In this paper only the most popular and well defined control mechanisms for achieving better IS quality are briefly presented. Those control mechanisms are: ISO 9001 quality system standards, software process assessment models (CMM, BOOTSTRAP, SPICE), software product standards and IS Auditing. The answer to the question of which of them to choose is therefore not easy, and it depends on the perspective of the stakeholder: enterprise management, IS management, the buyer of a software package, a contractor for software or software services etc. Once the scope, required goals and expectations are defined, and the strengths and weaknesses of the available control mechanisms are understood, then the right choice is not so difficult any more. This can be derived from the descriptions above. It is also obvious that some competition, conflicts and possibilities for collaboration exist between the considered models. It is the management's role and responsibility to understand those models and to avoid different audits on the same IS resource.

Acknowledgements

I would like to express my thanks to Gilles Motet (DGEI/INSA Toulouse, France), Regis Fleurquin (IUT Informatique Vannes, France) and Dr. Walter Wintersteiger (Management & Informatik Dornbirn, Austria) for their important and valuable comments on the paper.

References

[Haase et al 1993] V. Haase, R. Messnarz, G. Koch, H.J. Kugler, P. Decrinis: BOOTSTRAP: Fine-tuning process assessment. IEEE Software, July 1994, pp 25-35.
[Dykman A.C.] Charlene A. Dykman (editor): Control Objectives. Controls in an Information Systems Environment: Objectives, Guidelines, and Audit Procedures. EDP Auditors Foundation, Inc., Illinois, USA, 1992.
[CMM v2.0] Draft version of CMM v2.0. Internet web site http://www.sei.cmu.edu
[CobiT 98] Control Objectives for Information & related Technology. ISACA, Illinois, USA, 1998.
[IEEE 1988] Institute of Electrical and Electronics Engineers Inc., Standard for Software Life Cycle Processes, P1074/D2.1, Dec. 1988.
[IS Audit] IS Audit & Control Journal: General standards for IS auditing. Vol. I, 1994, pp 60-66.
[ISO 9000-3:1997] Quality Management and Quality Assurance Standards - Part 3: Guidelines for the application of ISO 9001:1994 to the development, supply, installation and maintenance of computer software.
[ISO/IEC JTC1/SC7 1992] The Need and Requirements for a Software Process Assessment Standard. Study Report, Issue 2.0, JTC1/SC7 N944R.
[ISO/IEC 8402:1995] Quality management and quality assurance - Vocabulary.
[ISO/IEC 9126:1991] Information Technology - Software product evaluation - Quality characteristics and guidelines for their use.
[ISO/IEC 12119:1995] Information Technology - Software packages - Quality requirements and testing.
[ISO/IEC 12207:1995] Information technology - Software life cycle processes.
[ISO/IEC DIS 14598:1996] Information technology - Software product evaluation. Part 1: General overview. Part 5: Process for evaluators.
[ISO 15504 (SPICE) PDTR Draft 1996] Software Process Improvement and Capability dEtermination (Software Process Assessment Standard). SPICE web site: http://www-sqi.cit.gu.edu.au/spice/suite.shtml
[Jones 1991] T. Capers Jones, Applied Software Measurement. McGraw-Hill, 1991.
[McGowan et al 1993] C.L. McGowan and S.A. Bohner, Model Based Process Assessment. Proc. Int. Conf. Software Engineering, IEEE CS Press, Los Alamitos, 1993.
[O'Brien J. A. 1994] Introduction to Information Systems. IRWIN, 1994.
[Paulk et. all 1993] M. Paulk, Bill Curtis, M.B. Chrissis & C. Weber, Capability Maturity Model, Version 1.1. IEEE Software, July 1993, pp 18-27.
[Paulk M.C. 1995] Paulk M.C. How ISO 9001 Compares with the CMM. IEEE Software, January 1995, pp 74-83.
[Pivka M. 1998] Pivka Marjan. A Comparison of Control Mechanisms to Help Achieve Better IS Quality. IS Audit & Control Journal, Volume II, 1998.
[Rout P. Terence 1995] SPICE: A Framework for Software Process Assessment. Software Process Improvement and Practice, August 1995.
[SPC 1997] Software Productivity Consortium: http://www.software.org/quagmire
[TickIT 1998] The TickIT Guide. Issue 4.0. DISC TickIT Office, London, 1998.

Authentic and Functional Intelligence

Mario Radovan
University of Rijeka, Faculty of Philosophy, Department of Information Science, Omladinska 14, 51000 Rijeka, Croatia
E-mail: mradovan@mapef.pefri.hr
Keywords: mind, computation, subjectivity, understanding, thinking, intelligence, three worlds, care thesis, coping-with
Edited by: Rudi Murn
Received: April 15, 1998   Revised: May 21, 1998   Accepted: June 16, 1998

Philosophical discussions about the aims, possibilities and limitations of Artificial Intelligence (AI) can shed light on the plausibility of different approaches to cognition and computation, and with that, they can have a great impact on the future development of computer technologies. However, we argue that such discussions are often based on vague concepts or on unsafe assumptions. Intelligence and understanding are usually observed on the level of behaviour; we argue that these phenomena should be considered in terms of motivations; in that context, we hold it necessary to differentiate between authentic and functional cognitive abilities. Computation does not seem to be a plausible way toward authentic understanding and intelligence; however, computational systems do offer virtually unlimited possibilities to replicate and exceed human cognitive abilities on the functional level.

1 Introduction

Discussions about the aims, possibilities and limitations of Artificial Intelligence (AI) take place between two basic and mutually opposed theses. The first one claims that the goal of AI is not only to produce systems which mimic some specific intelligent behaviour but to create machines with minds, in the full sense of the word. On such a view, human beings are biologically created computers. The other thesis claims that the very idea that the human mind could be artificially replicated by computational systems of any kind is radically wrong. In this paper, we argue that neither of these theses has so far offered a conclusive argument in its favour. In light of our present knowledge, however, we hold the second thesis more plausible. But this thesis cannot be actually proved, not only because present knowledge about the human brain is not sufficient to provide such a proof, but primarily because the basic features of human cognition - such as awareness, understanding, thinking, and intelligence - are not defined well enough.
Arguments about the relationship between human cognition and computation are almost always based on such concepts, without it being mentioned that there is any problem with their meaning. Although often interesting, such arguments are vague (or simply invalid), or they can be reduced to some rather obvious and trivial assertion. In section (2) we discuss a few typical examples of such arguments and their drawbacks. In section (3) we deal with the perennial problem of the subjectivity of mental states. We introduce the idea of three worlds - the physical world, the world of subjective states, and the world created by humans. We argue that such a three-world ontology offers a basic conceptual framework within which problems involving cognition and computation can be expressed and discussed in the most appropriate way. In section (4) we discuss the basic problems and limitations of AI in terms of this three-world ontology. In this context, we introduce the distinction between authentic and functional intelligence. We claim that computation is not a plausible way towards an artificial replication of subjectivity, and hence towards authentic understanding and intelligence. However, we recognise that AI does hold unlimited possibilities for replicating human cognitive abilities on the functional level. We argue that authentic intelUgence is out of reach of purely computational systems, though on the level of functional intelligence, performances of computational systems will be more and more out of reach of human capacities. A recent example of this trend comes from computer chess, where machine functional (performational) intelligence surpassed human capacities. Chess programs are often criticised to rely mainly on the "brute force" of the machine. I do not think that the qualities of such programs could be reduced to "brute force" ; however, even if they could be so reduced in part, that would not be unusual when qualities of machine performance are concerned. On the other hand, if the main goal of AI is to create a machine with an authentic mind, then we must face the fact that we do not actually have any clear idea what direction of research could lead to such an artificial mind. In this context, we claim that an authentic artificial mind is not possible without a nearly authentic sentient life, and that leads us out of AI, perhaps towards biology. 2 Simplistic Solutions In this section, we discuss some typical arguments based on vague concepts and unsound assumptions, or trivially true results. Thus, such arguments can tell us far less than they pretend to do. The victory of the chess-playing system Deep Blue over the world's chess champion offers an illustrative example of the tone which dominates in such discussions. As a first, the media presented the event as a threat to human dignity, as if the designers and creators of Deep Blue were not human beings. On the other hand, Searle offers a calming explanation: "The computer knows nothing of chess ... It just manipulates meaningless formal symbols according to the instructions we give it" (Searle '97, p. 59). Such an explanation of machine abilities seems to be more misleading than informative. Namely, it implicitly assumes that the skill of a chess master consists of something radically different than the ability to "manipulate meaningless formal symbols". But let us suppose that Searle knows how to play chess, and that he is no less intelligent than the world champion. 
Does this mean that he would have equal chances of winning a match with the world champion? Probably not. But isn't that simply because the'world champion knows far more "meaningless" rules about which move seems best in which position than Searle does? Searle gives no explanation of the nature of a chess master's "meaningful" way of reasoning (or computation). Hence, it is not obvious why the claim stated for a computer - that it "knows nothing" except to "manipulate meaningless formal symbols" - could not be equally stated about the world chess champion himself. 2.1 Replicating the Mind Let us start with arguments about the theoretical possibility of complete artificial replication of the human mind. Although computers vastly exceed humans in performing various sorts of computations, it seems that no machine could ever appreciate a wine, or a sonata, for example. Hence, it seems that no machine could fully replicate a conscious human mind. However, Dennett, who advocates "a version of function-alism", says: "if all the control functions of a human wine taster's brain can be reproduced in silicon chips, the enjoyment will ipso facto be reproduced as well" (Dennett, p. 31). Such a claim seems to be true by definition. Namely, if you reproduce all causes (and that is what "all the control functions" should be), you would also reproduce all effects. However, such a claim leaves open at least two essential questions: what are the "control functions" of the brain, and how (if at all) could these functions be reproduced in a lifeless system. If we would know the answers to these two questions, we would hardly need anything more. But there are no clear indications that the functionalist approach could ever lead to the answers to such questions. Hence, the trivial (even if true) conclusion of the argument tell as very little about the real problems concerning the possibility of fully replicating the human mind. The problem of replicating the mind is usually posed in a more precise manner, as the question whether the human mind could be fully replicated by a computational system. In that context, there are claims that the universal Turing machine and the Church-Turing thesis offer the theoretical grounds for a positive answer to this question. Roughly speaking, the universal Turing machine is a symbol system which consists of a set of symbols, operations, internal states and state change instructions by means of which algorithms can be defined and performed. The Church-Turing thesis says that every computable system can be simulated by the universal Turing machine. Such a machine can be implemented on a digital computer (neglecting the rather theoretical limitation that the universal Turing machine has an unlimited input tape). Hence, the Church-Turing thesis opens the following possibility: if the human brain is a computable system, then a suitably programmed computer could replicate all its features, and with that fully replicate the mind. Thus, Goertzel claims that (1) humans are "systems governed by the equations of physics", and (2) "the equations of physics can be approximated, to within any degree of accuracy, by space and time discrete iterations that can be represented as Turing machine programs" (Goertzel, p. 22). Consequently, according to Goertzel, brain activities - and with that, human cognitive abilities - can be fully replicated (at least in principle) by implementation on the universal Turing machine. 
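To make the machine model being appealed to concrete, the following sketch implements a tiny instance of such a symbol system: a finite table of state-change instructions driving a read/write head over a tape. It is offered only as an illustration of the formal notion; the rule table (appending a stroke to a unary number) is an arbitrary toy example, the tape here is finite rather than unlimited, and nothing about it carries any claim about brains or minds.

```python
def run_turing_machine(tape, rules, state="start", blank="_", max_steps=1000):
    """A minimal stepper: `rules` maps (state, symbol) to
    (new_state, symbol_to_write, head_move), i.e. the state-change
    instructions mentioned above. The example never moves left of the start."""
    tape = list(tape)
    head = 0
    for _ in range(max_steps):
        if state == "halt":
            break
        symbol = tape[head] if 0 <= head < len(tape) else blank
        if head >= len(tape):
            tape.append(blank)          # grow the tape on demand
        state, write, move = rules[(state, symbol)]
        tape[head] = write
        head += 1 if move == "R" else -1
    return "".join(tape).rstrip(blank)

# Append one stroke to a unary number: scan right over 1s, write 1 on the blank.
rules = {
    ("start", "1"): ("start", "1", "R"),
    ("start", "_"): ("halt",  "1", "R"),
}
print(run_turing_machine("111_", rules))   # -> 1111
```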
On the other hand, on the basis of Goedel's theorem, Penrose claims that human cognitive abilities cannot even in principle be replicated by a computational system of any kind (Penrose, p. 116). Roughly speaking, Goedel's theorem shows that there are sentences in a system which are not provable in that system, but humans can see they are true (imder a certain interpretation). Any computational replication of the mind could "see" only those things which are computable: therefore, less than a human mathematician can see. Hence, independently of the "technical" problems, the mind is not even in principle computationally replica-ble. Arguing against Penrose's position, Searle claims that from the fact that computational simulation of some human ability cannot be done on some level of description (i.e., on the level of the language of mathematics), it does not follow that the same ability cannot be simulated on same other level of description (Searle '97). Hence, he holds that Goedel's theorem does not prove that a full simulation of human brain processes is not possible. Searle holds that those brain processes which take part in a mathematician's brain when he sees the truth of an unprovable sentence could be (at least in principle) completely simulated on the level of neurons, so that the computational system (which performs the simulation) would have the same knowledge as the mathematician. Penrose admits that a computational simulation of the brain on the neural level seems possible; however, on the basis of Goedel's theorem, he insists that human cognitive abilities cannot be fully replicated by any sort of computation. Hence, Penrose speculates that the source of human cognitive abilities (which transcend the computable) should be looked for, not on the level of neurons (which seems to be computable), but on some lower level for which there would not exist any computable description. All these positions concerning the computability of the brain take as obvious something that is of essential importance, but is not known. For Goertzel, it is "a physical fact" that the brain is a computer, and that it "deals only with computable functions". Searle assumes that the existence of truths which are not computable on the level of conscious thought does not say anything relevant about the possibility that the system itself (i.e. brain) could be non-computable. And, granted that the system (i.e. brain) can see something that is not computable (on some level of description), Penrose takes it for granted that the system itself is not computable. However, it seems that the question of computability in the human brain is an open question. Copeland gives an analysis of that problem, based on the fact that there exist undecidable sets and that human cognitive system is productive, on the basis of which he claims that "for all we know many aspects of brain function may be non-computable". However, Copeland concludes that this problem is still terra incognita and that "in point of fact" we do not have any definite answer concerning the computability of the human brain (Copeland, p. 233). Let us add here that computability is an abstract category, and it is not obvious how to apply such a category to an organic system, at least not as long as that system has not been formally described. And an organic system can be described in many different ways, which renders the problem even more difficult. 
Hence, the basic drawback of the arguments and positions listed above is not that they are wrong, but rather that they do not pay enough attention to the problem which is essential for their very coherence. 2.2 Causal Powers The same kind of neglect and simplification is peculiar to arguments concerned with the necessary conditions which an artefact must fulfil to be (or to become) conscious or to have a mind in the full sense of the word. It seems that the simplest way to replicate the conscious mind by artificial means would be to replicate the "neurobiological basis" from which the conscious mind emerges in organisms like ourselves. On the other hand, it seems logically possible to obtain the same effect by means and methods radically different from the ones in human brains. However, Searle emphasises that from the fact that the human brain causes consciousness it follows that "anything else capable of causing consciousness would have to have the relevant causal powers at least equal to the minimal powers that human and animal brains have for the production of consciousness" (Searle '97, pp. 158-59). In other words, an artificial product made from different matter and in a structurally different way than the human brain might "cause consciousness" if and only if its structure shared with the brain the "causal powers" to get it over the threshold of consciousness. In short, to be conscious, an artificial system (of whatever kind) "must be able to cause what brains cause" (Searle '97, p. 191). The problem with Searle's claims (as well as with Dennett's) is that their negation would lead to contradiction. If we want an artefact with consciousness like the one "caused" by a brain, the artefact must be of such a "structure" as to have sufficient "causal powers" to "cause what brains cause". Indeed, to deny such a claim would mean to contradict the principle of causality, which is the cornerstone of scientific explanations. Hence, although such claims leave the impression that we know something essential about the way an artificial mind could be created, they tell us very little. Namely, the real problem is what these "causal powers" which get us over the threshold of consciousness are. And that is an empirical question. We encounter the same kind of problems with "proofs" that the human mind is (or is not) a computational system, or that a computer can (or cannot) be conscious, think and understand. The essence of the problem with such proofs can be reduced to the following form of reasoning. There are entities of type H (humans) which manifest a property P (consciousness, intelligence, etc.). To prove that a computer cannot have property P, one tries to show that there is nothing in it that could cause P. However, as long as we do not know what causes P in H, there is not much sense in trying to prove that entities different from H cannot have property P. The claim that in order to have property P a thing would have to have the "causal power" to produce P does not say anything substantial. It seems reasonable to suppose that, for example, a thermostat does not have any conscious mental states, but it would be less reasonable to try to prove it. As Chalmers puts it, there may exist some "crucial ingredient in processing that the thermostat lacks and that a mouse possesses, or that a mouse lacks and a human possesses"; however, Chalmers holds that there is "no such ingredient that is obviously required for experience, and indeed it is not obvious that such an ingredient must exist" (Chalmers, p.
259). In his recent book, Searle explains that he did not intend to prove that computers cannot be conscious, as his arguments were often interpreted. After all, if consciousness can emerge in a human brain (a lump of grey matter), why could the same (or a similar) state not emerge in some other sort of system? What he intended to prove, says Searle, was that "computational operations by themselves", as symbol manipulations, are "not sufficient to guarantee the presence of consciousness" (Searle '97, p. 209). He claims that his proof is based on the fact that "syntax by itself has no mental content", and on the fact that "the abstract symbols have no causal powers to cause consciousness because they have no causal powers at all" (Searle '97, p. 210). In this, Searle is right. However, such an argument could hardly be called a proof, since these truths hold by definition: syntax is not, does not have, and cannot cause, semantics. On the other hand, nothing can be said to "guarantee the presence of consciousness" as long as we do not know what it is that guarantees its presence in the human brain. 2.3 Solution by Definition The main problem for those who argue that computational systems can (at least in principle) fully replicate human cognitive abilities is consciousness, which, it seems, computers neither have nor could create by computation. The typical approach to conscious mental states in such arguments reduces to finding a way to exclude them from the scope of discourse. As an example, let us take Copeland's argument about machine thinking. Copeland claims that the question "Could a computer literally think?" can be settled "only by a decision on our part", as is always the case when old concepts are used in new situations (Copeland, p. 53). In other words, although we may never manage to build machines that think, we can nevertheless "settle the question of whether or not it is a conceptual mistake to say that an artefact might literally think". Copeland argues that there is nothing wrong with applying the term 'thinking' in its literal sense "to an artefact of the right sort" (Copeland, p. 33). On the basis of the fact that "we are not consciously aware of all, or even most, of our mental processes", Copeland assumes that consciousness is not of essential importance for human cognition (Copeland, p. 34). He claims that human mental activities, such as understanding speech and perceiving the external world, can be performed non-consciously; and since humans can perform these activities non-consciously, Copeland claims that the question whether an artefact could be said to perform such activities can be discussed "without considering whether or not an artefact could be conscious" (Copeland, p. 37). However, we hold that such a way of reasoning is wrong. Namely, from the fact that some human thoughts and acts are not conscious it does not follow that an artefact which is never conscious can think in the literal sense of the word. After getting rid of consciousness, Copeland introduces the notion of "massively adaptable" internal processes. The internal processes of a system are said to be massively adaptable if the system itself can "analyse situations, deliberate, reason, exploit analogies, revise beliefs in the light of experience, weigh up conflicting interests, ... and so forth" (Copeland, p. 79). Now, suppose that one day such an artefact with massively adaptable inner processes is produced. Would you say that it thinks? Of course, yes.
However, Copeland does not tell us how an artefact could "revise beliefs in the light of experience" if that artefact never has any conscious experience. But instead of dealing with such questions, Copeland introduces a robot which eats, writes poetry, "and so forth". Now, when we are "confronted" with such a robot, "we ought to say that it thinks", says Copeland, because "the contrary decision would be impossible to justify" (Copeland, p. 132). The main (and fatal) drawback of Copeland's argument consists in the fact that he completely neglects consciousness, without which there is no coherent way to speak about mental states at all. On the other hand, his definition of massive adaptability is such that it literally implies the ability to think; hence, to conclude that a "massively adaptable" artefact should be said to think means only to explicate a direct consequence of the definition. In Section 4 we argue that such a definitional approach to the problem of thinking is wrong, and we propose an opposite, motivational, approach to the problem of defining the basic cognitive categories for humans and machines. 3 The Problem of Subjectivity Science assumes that reality is objective in the sense that neither its existence nor its structure depends on the viewpoint of a particular observer. Scientific knowledge deals with things which could exist without being known, and is expressed in a way which is equally accessible to every (sufficiently qualified) human. We say that science speaks of phenomena from the neutral (or third-person) point of view. On the other hand, subjective mental states seem to be a different kind of phenomenon in both an ontological and an epistemic sense. Namely, mental states can exist only as someone's states, and they are known only to the subject whose states they are. For example, neural activities that cause a pain could (in principle) exist and be observed without being experienced, but the pain itself could not: it exists only if and as experienced. In other words, it seems that conscious mental states cannot be described in an objective fashion for the simple reason that they are always and only someone's subjective states. Churchland says that the taxonomy of physical science has some limits, and that it reaches them "at the subjective character of the contents of consciousness" (Churchland, p. 196). In this context, positions about the nature of mental states are divided into two basic views (which appear in various variants): physicalism and property dualism. 3.1 Physicalism and Property Dualism Physicalism claims that mental states can be ontologically reduced to physical states of the brain, and thus expressed in the objective (third-person) language of science - firstly, in the language of neurology, and then in the language of biology and the language of physics. Copeland, who considers himself a physicalist, admits that physicalism is "supported only by faith", or more precisely, by the fact that other theories seem less plausible or at least less "comfortable". He says: "The anti-physicalist alternative - an irreducibly mental dimension that sticks out of an otherwise physical universe like a sore thumb - is unpalatable to us. It offends against our expectation that nature is a harmonious, integrated affair" (Copeland, p. 179).
In essence, physicalism excludes the subjective from the scope of the discussion, but in doing so it does not solve the problem of how to deal with the "offensive" fact that the universe does contain subjective states of people like you and me. On the other side, property dualism holds that the attributes 'mental' and 'physical' designate two ontologically different kinds of properties, and that the same entity can have both kinds of properties. The mental is caused by the physical, but it cannot be expressed by the taxonomy of physics, since subjective states (qualia) lie beyond the reach of the objective language of science. There is a special feeling, a quale, to each of the conscious states; property dualists hold that we would be none the wiser about qualia even if neuroscience were completed and everything were known about the biology and physics of the brain. For, to know everything about the physical processes going on in the brain does not mean to know anything about the hurtfulness of pain or the bliss of joy. However, by accepting the irreducibility of conscious mental states, property dualism faces the problem of explaining the relation between the physical and the mental "dimensions" of reality. That is, to say that the mental is caused by the physical but is not reducible to it only explicates the problem, and does not solve it. It has been suggested that some neural activities in specific regions of the brain (in the networks connecting the thalamus and the cortex) might be the source of conscious mental states. However, even if some neural "firing" could explain the how of conscious states, it would not explain the what of these states. No explanation on the physical (neural or subneural) level can express a subjective state as experienced, since the taxonomy of science does not have terms which could express a feeling (quale), which is intrinsically subjective and ontologically irreducible to the physical. We lack not only an explanation of the relationship between subjective mental states and physical activities of the brain, but we cannot even conceive of the sort of taxonomy which could allow an explanation of the relationship between the subjective and the objective. Hence, the basic question is not whether machines can think - and therefore whether to be means to be a computer - but rather how to speak about subjective phenomena in a scientific fashion at all, since the scientific taxonomy is intrinsically objective. The mystery of the subjective dimension of reality is primarily of a conceptual nature, and we do not really know how to solve it. 3.2 The Three Worlds Framework The basic aim of ontology is to define a conceptual system which makes it possible to speak about that which exists. In other words, a minimal requirement which an ontology should satisfy is to provide a conceptual framework within which we can appropriately speak about known (or, in principle, verifiable) phenomena. We claim that the adoption of the three-world ontological framework - which emphasises the difference between the physical world, the world of subjective states, and the world created by humans - opens the possibility to express and discuss the problems of human cognition and computation in the most appropriate way. The idea of the "three worlds" is attributed to Popper, but it has not been widely accepted by other philosophers, especially not in the context of the discussions about the relationship between human cognition and computation.
Popper's basic idea about the three worlds is clear, but he treats many problems only barely or unsuitably, so that they can be considered open. We assume that to the physical world (world1) belong all natural phenomena up to but not including the phenomena of consciousness as a boundary case. Although consciousness is a natural phenomenon, it is that peculiar and unique natural phenomenon which at the same time also comprises a new world for itself (world2): the world of subjective states, such as pain, desire, love or anger. The world created by humans (world3) contains everything that has been created by conscious human beings (even though not all entities from world3 have to be created consciously). World3 contains abstract entities such as numbers and theorems as well as all forms created by humans. We assume here that forms have an autonomous existence; however, they are not eternal (in the Platonic sense), but created by humans and (often) imposed on the physical world. For example, a desk conceived as a piece of wood belongs to the physical world; however, conceived as a desk, in a functional or aesthetic sense, it is a creation of world2 (the conscious mind) and belongs to world3, the world of what has been created by humans. The same holds for a computer: the computer on my desk is a physical object, but what makes that "heap of atoms" on my desk a computer is the form imposed on it by humans. In the context of the unsolved problem of subjectivity, the question of the mutual impacts between the three worlds does not have a simple solution. According to Popper, consciousness "plays the main role in the causal chains" that lead from world3 to world1 (Popper, p. 24). Popper's arguments are not precise enough to be simply accepted or to allow a clear confutation. However, it seems clear that the three-world ontological frame does not, by itself, solve the problem of "causal chains" between the three worlds, nor does it seem that this problem has some obvious solution. Furthermore, Popper does not pay enough attention to the problem of delineation between the three worlds; he says: "reality consists of three worlds, which are interconnected and act upon each other in some way, and also partially overlap each other" (Popper, p. 8). Penrose says that the three worlds are mutually "profoundly dependent", but that "there is something distinctly mysterious about the way that these three worlds inter-relate with one another" (Penrose, p. 139). Let us try to express our basic positions in a more precise way. First, we assume that the three worlds are disjoint; they do not overlap, nor are they mutually ontologically reducible. Second, we assume that there are no causal relations between the three worlds. The realm of the physical is causally closed, and we know no causal relations other than the physical ones. Hence, it is possible to speak coherently of causal relations only on the level of the physical world. It could seem that such a position implies that, for example, music cannot affect my physical behaviour; or perhaps that my conscious deliberation about my next move cannot have any effect on my choice. However, the proposed position does not imply such consequences; all it claims is that causal effects take place only in the physical world. Music can have an impact on behaviour, but only when instantiated by means of the physical world, and only through the physical impact of its instantiation on the neural system (through listening to it, or thinking of it).
Being conscious of some situation does impact a decision. However, the conscious state itself emerges from some neural processes, which were triggered and are influenced only by physical causes. And from these processes new mental states are constantly emerging. In short, all causal impacts take place inside world1. Such a position could be said to belong to epiphenomenalism, which is often said not to be a plausible theory. According to epiphenomenalism, mental states are products and manifestations of brain activities, but mental states (as subjective phenomena) do not have any causal impact on these activities. Flanagan claims that it is "extremely implausible" that subjective awareness "plays no significant causal role", although he admits that it is hard to say "exactly what role it plays" (Flanagan, p. 151). We argue that a conscious mental state (as subjective and irreducible to the physical) cannot be said to cause anything by itself; what causes is the brain state whose feature and manifestation the conscious state is. The main thesis of this paper is that there can be no understanding and intelligence without awareness as a form of consciousness. Furthermore, conscious mental states are considered a world in themselves (world2), and the source of the world of creation (world3). However, we claim that it is not a conscious mental state which really acts (or causes), but the neural system whose emergent feature the mental state is. When we say that the conscious mind creates or imposes a structure, we mean that the causal impacts take place on the level of the physical. In other words, it is not consciousness as a subjective state that creates by itself, but the physical system must have consciousness as a property in order to be motivated and able to fear, desire, think, understand and create. We assume that world3 has been entirely created by the human mind. For example, scientific theories are essentially creations of the human mind; we create a theory, and then we evaluate it on the basis of the effects we manage to obtain with its use. To world3 belong not only mathematics, science, and technical ideas, but everything which has been created by humans by means of some system of representation, as well as the systems of representation themselves. Examples are various social structures, customs, works of art and social symbols, which are created by humans and exist on the basis of some system of representation. Humans not only create new entities but also assign functions and values to entities in the physical world as well as to created entities. For example, rivers, fruits and flowers belong to world1; but their functions and values belong to world3. We argue that entities of world3 cannot create; when instantiated by world1, they only explicate that which has been implicitly created by their very creation. For example, an implemented program can produce symbols, shapes, colours or sounds, but its outputs were implicitly created by the very creation (and implementation) of the program itself, although many of the outputs may not be created with an intention or a purpose. 4 Computation and Understanding Discussions about the relationship between computation and understanding from the Turing test onwards tend to remain at the level of effects (behaviour or functioning). We argue that the questions of understanding, thinking and intelligence should be considered in terms of motivation.
In other words, an attempt to define or describe these phenomena should start from the plausible sources (causes) of the phenomena themselves. In this connection, we claim that an appropriate definition of authentic understanding and authentic intelligent behaviour should be based on the concept of awareness, without which there are neither valid reasons nor suitable ways to speak of authentic intelligence or understanding at all. 4.1 Understanding as Problem Solving No behaviour can be said to be "intelligent" independently of its motivations and aims. Hence, the question of the relationship between behaviour and intelligence cannot be resolved in terms of outside effects, but must take into account the motivations and aims. According to Popper, all living beings are constantly preoccupied with solving problems; they are trying to improve their situation, or at least to avoid its deterioration. Popper claims that consciousness was a problem-solving consciousness right from its very beginning (Popper, p. 17). Conscious beings encounter and create: they encounter successes and failures, satisfactions and frustrations, joys and sorrows; they create myths, religions, arts, and science in order to explain, solve or celebrate the mysteries of existence. A conscious being creates in order to reduce fears, produce pleasure, add values, communicate feelings, overcome loneliness, relieve tension, establish wholeness (Organ '88, p. 130). It is tension that triggers acting and creating. This may be only a vague feeling "that things are not as they ought to be"; but it can also be an awareness of the "fundamental dissatisfaction", uneasiness, finitude, lostness, loneliness, anxiety, meaninglessness. On the other hand, there is an "incessant hope" for integration, satisfaction, meaning, and harmony (Organ '88, p. 171). Human intelligence has been motivated by the internal necessity to solve the problems which it encounters; it has been driven by the impulses to avoid the painful and reach happiness. According to Penrose, it is not necessary to have precise definitions of notions such as consciousness, understanding and intelligence to be able to observe or define the basic relationships between these notions. In this connection, Penrose holds that "intelligence is something which requires understanding", and that "understanding requires some sort of awareness"; therefore, intelligence requires awareness. He holds that to speak of understanding - and hence, also of intelligence - without awareness is "a bit of a nonsense" (Penrose, p. 100). Although Penrose does not elaborate his position in detail, it seems that his attitude supports the motivational approach to the phenomena of understanding and intelligence which we propose here. The very existence of aware-less functional intelligence depends on the authentic intelligence that is aware, as the only source and measure of every other quality. For the same reasons, we cannot coherently speak of the creativity of aware-less intelligence, since such an intelligence does not and cannot have any motivation to create: it does not have any aim nor any criterion on the basis of which it could evaluate its own behaviour as creating. 4.2 The Care Thesis The motivational approach to cognition implies that thoughts and intelligent behaviour are not independent of the subjective states (moods), neither in the generative nor in the evaluative sense.
A mood opens a specific way for things to show up as mattering, and with that opens new ways of understanding and new possibilities of reasonable (i.e. intelligent) action. This also means that aware-less cognitive abilities are qualitatively limited and fragmentary, and that they exist only as interpretations made by an aware cognitive system. There are many successful expert systems, and we hold that there is no clear limit to the further improvement of machine abilities and skills within the realm of functional replication of human cognitive abilities. However, aware-less machines cannot reach authentic intelligence since for such systems there exist neither problems nor motivations to create. Artificial intelligence is a human creation, and as such it belongs to world3. On the other hand, human intelligence is a feature of world2: a feature which could neither exist nor be evaluated without the existence of the other features of world2, such as desire, fear, love and anger. Let us sum up the essence of the motivational approach we have proposed in the following Care thesis: human beings, as aware systems, are essentially determined by anxiety and desires: by care. Human cognitive abilities are features of essentially caring systems; they spring from care and are shaped by it. Hence, they cannot be fully replicated by a care-less (aware-less) system. Computers, as symbol manipulating devices, can be said (although not proved) to be care-less systems; hence, such systems cannot understand, be intelligent and creative in the sense in which humans are. Discussions about the possibilities and limitations of AI often refer to Heidegger's views of human cognition and behaviour. Heidegger's approach to cognition could be said to be in accordance with the motivational approach proposed here, except for one essential difference. Namely, we hold that awareness is of fundamental importance for human cognition and behaviour, while Heidegger builds his position about cognition (and about being human, in general) not on awareness but on something he calls Dasein. According to Dreyfus, Dasein "can mean 'everyday human existence', and so Heidegger uses the term to refer to human being"; however, "we are not to think Dasein as a conscious subject"; "Dasein must be understood to be more basic than mental states and their intentionality" (Dreyfus, p. 13). In this context, the "understanding of being" is something which "is contained in our knowing-how-to-cope in various domains rather than in a set of beliefs that such and such is the case". In essence, Heidegger's basic position can be reduced to the claim that "we embody an understanding of being that no one has in mind" (Dreyfus, p. 18). We hold that to speak of understanding as aware-less coping does not make much sense, since everybody and everything "copes" somehow. Furthermore, the fact that human understanding does not consist only (or primarily) "in a set of beliefs that such and such is the case" does not imply (1) that understanding is not related to an aware mind, or (2) that "being there" (of human beings) is not essentially an aware state. And if we reject these two assumptions of Heidegger's, his sophisticated elaboration of "Dasein-ing" reduces to nearly nothing. In fact, Heidegger himself admits that aware-less coping-with can last only until some problem ("breakdown", "obstinacy") appears; and then Dasein becomes a conscious subject confronted with an obstinate and obtrusive world.
Namely, "if the going gets difficult, we must pay attention and so switch to deliberate subject/object inten-tionality" (Dreyfus, p. 69). But, as we stated by the Care thesis, sentient beings are always confronted with "difficulties" (explicit or implicit), so that Heidegger's Daseining hardly ever exists; and when it does exist, it seems to be irrelevant. Indeed, human understanding and intelligence can be said to begin as awareness that "things are not as they should be". Finally, though in his peculiar terminology, Heidegger says the same thing: "[Care] is to be taken as an ontological structural concept. It has nothing to do with 'tribulation', 'melancholy', or the 'cares-of life' ... These - like their opposites, 'gaiety' and 'freedom from care' - are onti-cally possible only because Dasein, when understood ontologically, is care" (Heidegger '62, p. 84). We hold that to care (or to he care) means to be a subject that is aware. 5 Summary Authentic cognitive abilities are features of an essentially caring system: they originate from care and are shaped by it; hence, they cannot be fully replicated by a care-less (aware-less) system. We assume that computers are care-less systems, and that they cannot reach authentic understanding and intelligence. On the other hand, we hold that computational systems open virtually unlimited possibilities of replicating human cognitive abilities on the functional (behavioural) level. To develop an artificial system with authentic intelligence, we would first have to create a system which would be "alive" in the sense of being aware and having its own motivations (cares). However, developing such a system should not be considered as a real goal for current AI research. For, in that case, AI could actually not offer anything that could be considered a first step toward such a goal. All too optimistic announcements have been more damaging than beneficial to AI. Instead, machine intelligence and AI achievements should be defined and evaluated in functional terms. On the other hand, if we hold that, for example, an expert system with as much reasoning power in some domain as a human expert, is nevertheless "not really intelligent", then not only do we not have intelligent systems, but we also have no clear idea how such a system could be developed. References [1] Chalmers, J. D.: The Conscious Mind: In Search of a Fundamental Theory, Oxford University Press, 1996. [2] Churchland, M. P.: The Engine of Reason, the Seat of the Soul, The MIT Press, 1995. [3] Copeland, J.: Artificial Intelligence: A Philosophical Introduction, Blackwell, 1993. [4] Dennett, D. C.: Consciousness Explained, Penguin Books, 1993. [5] Dreyfus, L. H.: Being-in-the-World: A Commentary on Heidegger's 'Being and Time', Division I, The MIT Press, 1991. [6] Flanagan, 0.: Consciousness Reconsidered, The MIT Press, 1992. [7] Goertzel, B.: Self and Self-Organisation in Complex AI Systems, in Gams, M., Paprzycki, M., Wu, X. (eds): Mind Versus Computer, lOS Press / Omsha, 1997. [8] Heidegger, M.: Being and Time, Harper and Row, 1986. [9] Nagel, T.: The View from Nowhere, Oxford University Press, 1986. [10] Organ, T.: Philosophy and the Self: East and West, Susquehanna University Press, 1987. [11] Organ, T.: The Self in Its Worlds: East and West, Susquehanna University Press, 1988. [12] Penrose, R.: The Large, the Small and the Human Mind, Cambridge University Press, 1997. [13] Popper, K.: In Search of a Better World, Rout-ledge, 1992. 
[14] Radovan, M.: Intelligent Systems: Approaches and Limitations, Informatica, Vol. 20 (1996), pp. 319-330. [15] Radovan, M.: Computation and Understanding, in Gams, M., Paprzycki, M., Wu, X. (eds): Mind Versus Computer, IOS Press / Ohmsha, 1997. [16] Searle, J. R.: The Rediscovery of the Mind, The MIT Press, 1992. [17] Searle, J. R.: The Construction of Social Reality, The Free Press, 1995. [18] Searle, J. R.: The Mystery of Consciousness, The New York Review of Books, 1997. [19] Winograd, T.: Preface, in Gams, M., Paprzycki, M., Wu, X. (eds): Mind Versus Computer, IOS Press / Ohmsha, 1997. LFA+: A Fast Chaining Algorithm for Rule-Based Systems Xindong Wu and Guang Fang Department of Software Development, Monash University 900 Dandenong Road, Melbourne, VIC 3145, Australia AND Matjaž Gams Jožef Stefan Institute Jamova 39, 1000 Ljubljana, Slovenia xindong@insect.sd.monash.edu.au; matjaz.gams@ijs.si Keywords: expert systems, rule-based systems, fast chaining, conflict resolution Edited by: Rudi Murn Received: May 20, 1997 Revised: April 8, 1998 Accepted: July 14, 1998 A significant weakness of rule-based production systems is the large computational requirement for performing matching. The time complexity of chaining algorithms is generally still NP-hard (non-polynomial) in the number of rules in a rule base. LFA is a linear-chaining algorithm for rule-based systems which does not require a specific conflict resolution step for chaining. However, its applications are still restricted; e.g., it cannot process first-order rules efficiently. This paper reviews the design of chaining algorithms for rule-based systems, and analyses some well-known chaining algorithms such as RETE and LFA. The central contribution is the design of a robust LFA algorithm, LFA+, which can process first-order logic rules. 1 Introduction 1.1 Rule-based systems Rule-based systems (RBSs) are an important type of pattern-directed inference systems. They consist of three basic components as follows: 1. A set of rules, which can be activated or fired by patterns in data. 2. One or more data structures (data bases), which can be examined and modified. 3. An interpreter or inference engine that controls selection and activation of the rules. A rule includes a left-hand side, LHS, which is responsible for examining items in the data structures, and a right-hand side, RHS, which is responsible for modifying data structures. Data examination consists of comparing patterns associated with the LHSs with elements in the data structures. The patterns may be defined in many ways, such as simple strings, complex graphs, semantic networks, tree structures, or even arbitrary segments of code which are capable of inspecting data elements. Data modification can involve firing actions to modify data, rules, or even the environment. Information in the data can be in the form of lists, trees, nets, rules, or any other useful representation. The organisation of rule-based systems is modular, and their characteristics are as follows [Waterman & Hayes-Roth 78]: - RBS modules (an RBS module is a bundle of mechanisms for examining and modifying one or more data structures) separate permanent knowledge (rules in the rule base) from temporary knowledge (data in the working memory). - RBS modules are structurally independent. They facilitate incremental expansion of the system and make large systems easier to understand (modules can be dealt with one by one). - RBS modules facilitate functional independence. It is generally useful to distribute different functions to different modules. - RBS modules may be processed by using a variety of control schemes, i.e.
different modules may have different control structures. - RBSs separate data examination from data modification because of the separation of the LHSs and RHSs of rules. - RBSs use rules with a high degree of structure, and are a natural knowledge representation (the natural "IF ... THEN ..." structure). In the light of problem-solving methods, rule-based systems can be divided into two classes, namely forward-chaining systems and backward-chaining systems. Forward-chaining systems are antecedent-driven, while backward-chaining systems are consequent-driven. Forward-chaining systems are commonly known as rule-based systems. Rule-based systems are a well-known type of system in which the control structure can be mapped into a relatively simple recognise-act paradigm. A typical interpreter of a rule-based system performs the following operations in each 'recognise-act' cycle: 1. Match: Find the set of rules in the rule base whose LHSs are satisfied by the existing contents of the working memory. 2. Conflict resolution: Select one rule with a satisfied LHS; if no rule has a satisfied LHS then stop. 3. Act: Perform the actions in the RHS of the selected rule and go to step 1. By using suitable interpretations of each of the above actions, the operation of a chaining-based inference engine can be readily described as an iteration of such actions. A forward-chaining engine regulates the derivation of new data, while a backward-chaining engine controls the verification of hypothetical information. Another view of the inference engine is that it generates one or more inference nets linking the initial system state to a goal state [Schalkoff 90]. The fundamental operation of the inference engine is the process of matching. Partial matches or complete matches often involve matching with variables, for which a suitable unification algorithm which ensures that variable bindings are consistent is necessary. This procedure may require many tests and comparisons, so it is usually difficult to design a fast-chaining algorithm for a large rule-based system. There are three basic approaches to the problem of conflict resolution in a rule-based system as follows [Rich & Knight 91]: - Assign preference based on matched rules in the rule base. - Assign preference based on matched objects in the working memory. - Assign preference based on actions that matched rules would perform. 1.2 Problems with the 'Recognise-Act' Paradigm For naive rule-based systems, all but the smallest systems are computationally intractable because of the complexity of matching in the 3-phase cycles. The successful match of a rule in the rule base with the working memory does not always mean that the rule will be fired. A rule may fail to match with the working memory in an overall problem-solving process, but it probably needs to be tested in each 3-phase cycle when the working memory is changed. Meanwhile, some other rule may be successful in matching with the working memory from the very beginning of a problem-solving process, but may fail to receive enough priority to fire in each conflict resolution phase. When there are changes in the working memory, the rule needs to be tested again and again. It has been observed that some systems spend more than nine-tenths of their total run time performing pattern matching in large rule-based systems [Forgy 82]. As a result of these problems, efficiency is a major issue in large rule-based systems.
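The 'recognise-act' cycle described above can be illustrated with a minimal sketch. The Rule class, the set-of-facts working memory, the refraction-based stopping condition and the trivial conflict-resolution policy (take the first satisfied rule) are simplifications introduced here for illustration only; they are not taken from OPS5, KEshell or any other system discussed in this paper.

# A minimal sketch of the 3-phase 'match - conflict resolution - act' cycle,
# assuming facts are plain strings and a rule fires when all of its LHS
# elements are present in working memory. All names are illustrative.
from dataclasses import dataclass

@dataclass
class Rule:
    name: str
    lhs: frozenset   # condition elements, all of which must be in working memory
    rhs: frozenset   # facts added to working memory when the rule fires

def recognise_act(rules, working_memory, max_cycles=100):
    fired = set()                                   # refraction: fire each rule at most once
    for _ in range(max_cycles):
        # Match: rules whose LHSs are satisfied by the working memory.
        conflict_set = [r for r in rules
                        if r.lhs <= working_memory and r.name not in fired]
        if not conflict_set:                        # conflict resolution fails: stop
            break
        rule = conflict_set[0]                      # trivial conflict resolution
        working_memory |= rule.rhs                  # Act: perform the RHS actions
        fired.add(rule.name)
    return working_memory

rules = [Rule("r1", frozenset({"it-has-two-legs", "fly"}), frozenset({"it-is-a-bird"})),
         Rule("r2", frozenset({"it-is-a-bird"}), frozenset({"lays-eggs"}))]
print(recognise_act(rules, {"it-has-two-legs", "fly"}))

Even this naive loop makes the cost problem visible: every cycle re-tests every rule against the whole working memory, which is exactly the redundancy that the indexing and filtering techniques discussed in Section 4.2 try to avoid.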
Since rule-based systems may be expected to exhibit a high standard of performance in interactive domains or in real-time domains, many researchers have worked towards improving the efficiency of such systems. As yet, the most significant results have been the RETE algorithm (see Section 4.4) and other RETE-like algorithms such as TREAT (see Section 4.5). These algorithms are match algorithms which avoid matching all rules with the working memory in order to find appropriate rules on each 3-phase cycle, so that efficiency can be improved. However, the following two problems still exist in all known rule-based systems except KEshell [Wu 93a]: 1. All complete chaining algorithms are exponential in time complexity. Non-worst-case sub-exponential algorithms are not possible for general cases. 2. Chaining in rule-based systems is a much more complicated process than testing the satisfiability of individual propositional formulae. It is not possible to know in advance precisely how many 3-phase "match - conflict resolution - act" cycles are needed for each problem-solving task. In KEshell, a new algorithm called LFA (see Section 4.6) has been designed. LFA is a linear forward-chaining algorithm for rule-based systems. The most significant advantages of LFA are that its time complexity is O(n), where n is the number of rules in the rule base, and that it does not need an independent conflict resolution step. By using a two-level "rule schema + rule body" structure (see Section 3.2.2), knowledge representation in KEshell can explicitly express numeric computation and inexact calculus in the same way as inference rules in rule bodies. Provided that its knowledge representation is suitably extended and its processing measures are further improved, LFA should achieve a wider range of applications. However, its knowledge representation cannot represent first-order logic rules efficiently. This is a significant restriction for applications. The research objective of this paper is to relax the above-mentioned limitation of LFA so that it can efficiently process first-order logic rules. We will present the design of a robust LFA algorithm, LFA+, based on LFA [Wu 93a], which has the following components: - Extended knowledge representation for first-order logic rules, which includes specific representations of recursive rules and rules with negative condition elements. - Sorting measures, for ordering the knowledge in a knowledge base. - Linear forward chaining. The paper is organised as follows. Section 2 introduces expert system principles and some concepts of the first-order logic language, and explains one definition for describing the LFA+ algorithm. In Section 3, knowledge representation issues are addressed, and two languages for rule-based systems - OPS5 and "rule schema + rule body" - are described and compared. Section 4 discusses algorithm design issues and techniques, and analyses the RETE, TREAT and LFA algorithms. In Section 5, knowledge representation measures, sorting strategies, the chaining procedure and analyses of the LFA+ algorithm are presented in detail. Finally, Section 6 outlines conclusions and future research. Definitions are listed in the Appendix. 2 Background in Expert Systems and First-Order Logic 2.1 Expert systems An "expert system" is a computer program which uses knowledge and inference procedures to solve problems that are difficult enough to require human expertise for their solutions [Raeth 90].
Expert system technology aims at improving qualitative factors and can provide expert-level performance on complex problems. A typical expert system consists mainly of the following parts: - A working memory/data base, which stores the evidence and intermediate results of problems during the chaining process. - A knowledge base (KB) or knowledge source. - An inference/chaining engine for solving users' problems by applying the knowledge encoded in the knowledge base. - An explanation engine or tracing engine for telling the users how the solutions were obtained. - A knowledge acquisition engine for acquiring knowledge or modifying the knowledge base when necessary. - A knowledge base management subsystem that detects inconsistencies in the KB. Their relationships are shown in Figure 1. Figure 1: An expert system structure. Conventional software programs are designed to control computers algorithmically and tell the computers exactly what to do in problem-solving. These programs are usually procedural. Once a program system has been encoded, it is difficult to change the system design. On the other hand, expert systems excel at encoding knowledge declaratively, and they can be modified flexibly because of the separation of knowledge from expert system shells, the separation of knowledge from data, and their modular structures. Furthermore, expert systems have the following features [Pedersen 89]: - They use symbols to encode the world, which can be used in varied ways. - Most expert systems support uncertainty representation. - Expert systems can handle unknown cases of a problem by applying the knowledge in the knowledge base. - They can explain their reasoning. - They can make multiple conclusions. - They can tailor conclusions. 2.2 Language of first-order logic A first-order language is identified by a triple <V, F, P>: - V is a set of variables. - F is a set of functors, each of which has an arity. - P is a set of predicate symbols, each of which has an arity. The terms (see the definition below) of the language are built from variables and functors (constants are viewed as functors of arity 0), while predicates are built from terms and predicate symbols (propositions are viewed as predicate symbols of arity 0). Definition: A term is defined inductively as follows: - A variable is a term. - A constant is a term. - If f is an n-ary function symbol and t1, ..., tn are terms, then f(t1, ..., tn) is a term. Interpretation: A truth value interpretation of a first-order logic language is a triple <D, F, R>: - D is the domain. - F is a mapping from functions of domain elements to the domain. - R is a mapping from predicates of domain elements to truth values. Horn clauses are a subset of first-order logic languages, but the subset is powerful enough to encode Turing machines. A Horn clause has the following form: p(t) :- q1(t1), q2(t2), ..., qn(tn), where p and q1, q2, ..., qn are predicate letters, n >= 0, and all variables which occur in the terms t, t1, t2, ..., tn are universally quantified at the front of the clause (implicitly). If n is 0 then the clause is referred to as a fact; otherwise, it is called a rule. The atom p(t) is referred to as the head of the clause, and q1(t1), q2(t2), ..., qn(tn) as the body of the clause. The terms t, t1, t2, ..., tn may be arbitrary terms, and hence may contain variables and/or functions. A logic program is a set of Horn clauses. However, it is often useful to consider sub-classes of this class of programs in rule-based systems.
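To make the first-order notions above concrete, the following sketch shows one possible Python representation of variables, terms and Horn clauses. The class names and the example clauses (a parent fact and a recursive ancestor rule) are assumptions made purely for illustration; they are not taken from LFA, KEshell or OPS5.

# Illustrative representation of terms and Horn clauses p(t) :- q1(t1), ..., qn(tn).
from dataclasses import dataclass
from typing import Tuple

@dataclass(frozen=True)
class Var:
    name: str                       # a variable is a term

@dataclass(frozen=True)
class Term:
    functor: str                    # a constant is a functor of arity 0
    args: Tuple = ()

@dataclass(frozen=True)
class Clause:
    head: Term                      # the atom p(t), the head of the clause
    body: Tuple[Term, ...] = ()     # empty body: the clause is a fact

# parent(tom, bob).  -- a fact
fact = Clause(Term("parent", (Term("tom"), Term("bob"))))

# ancestor(X, Y) :- parent(X, Z), ancestor(Z, Y).  -- a recursive rule
rule = Clause(Term("ancestor", (Var("X"), Var("Y"))),
              (Term("parent", (Var("X"), Var("Z"))),
               Term("ancestor", (Var("Z"), Var("Y")))))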
One such sub-class of logic programs is the class of Datalog programs, in which terms are only allowed to be either variables or constants. 2.3 Unification and Match Unification: Unification is the basis of the use of logical inference in artificial intelligence. It is a method of finding variable bindings for two predicates or terms such that they become identical [Sterling et al. 86]. Match: Two terms match if [Bratko 90]: - they are identical, or - the variables in both terms can be instantiated to objects in such a way that after the substitution of variables by these objects the terms become identical. The following is an extended definition, partial match, for describing LFA+ (see Section 5.2). Definition - Partial match: Given a premise factor, p-factor, of one rule schema and a conclusion factor, c-factor, of another rule schema, partial match of p-factor with c-factor, written as partial-match(p-factor, c-factor), has the following meanings: - If p-factor is a variable then c-factor is the same variable. - If p-factor is a proposition p or not(p) then c-factor is p or not(p). - If p-factor is a predicate p(...) or not(p(...)) then c-factor is p(...) or not(p(...)). 3 Knowledge representation 3.1 Introduction In order to solve complex problems encountered in AI, a considerable amount of knowledge, as well as some mechanisms for manipulating knowledge, are necessary. Barr and Feigenbaum identify four types of knowledge as follows [Miranker 87]: - Objects, i.e. nouns and the adjectives that describe them. - Events: object interaction. - Performance: how to do something, also known as procedural knowledge. - Meta-knowledge: knowledge about knowledge. Knowledge plays two roles in AI programs as follows: - It may define the search space and the criteria for determining a solution to a problem. - It may improve the efficiency of a reasoning procedure by informing an inference procedure of the best places to look for a solution. Knowledge representation occurs at two levels [Rich & Knight 91]: - The data level, at which facts are described. - The symbol level, in which representations of objects at the data level are defined in terms of symbols that can be manipulated by programs, such as PROLOG rules. Knowledge representation and search are the two main themes of AI problem solving, but they are not independent issues. A particular problem may be solved more easily when the chosen method of knowledge representation represents the problem naturally and efficiently supports the operations required by the search strategy. For a particular problem, different combinations of knowledge representation methods and search may yield more or less effective means for solving the problem. The next section will focus on representing knowledge by using rules. 3.2 Knowledge representations for rule-based systems The use of rules for encoding knowledge is a particularly important issue because rule-based reasoning systems have played a very important role in the evolution of AI from a purely laboratory science into a commercially significant one. This section outlines two representation methods, namely OPS5 and "rule schema + rule body", which have been applied in some rule-based systems. 3.2.1 OPS5 OPS5 [Forgy 82] is a rule-based system language. An OPS5 rule comprises the following: 1. The symbol P. 2. A rule name. 3. The left-hand side (LHS). 4. The symbol -->. 5. The right-hand side (RHS). All of these are enclosed in parentheses. A typical LHS structure is as follows: {<c> (computer ^name <n> ^price <p>)}.
This structure is used to describe computer objects; it includes a computer name and price. The RHS structure is similar to the LHS structure, but contains an action, e.g. 'modify', before it (see below). The ^ is the OPS5 operator that distinguishes attributes from values. A variable is a symbol beginning with the character '<' and ending with the character '>', e.g. <x>. The predicates in OPS5 include =, <>, <, >, <=, >=. A predicate is placed between an attribute and a value. The following is a typical rule from [Brownston et al. 86]: (p have-enough-money-to-buy-computer {<c> (computer ^name <n> ^price <p>)} {<a> (saving-account ^balance {<b> >= <p>})} --> (modify <a> ^balance (compute <b> - <p>))) where the meanings are apparent. 3.2.2 Rule schema + rule body "Rule schema + rule body" [Wu 93a] is a 2-level method of knowledge representation. A rule schema is used to describe the hierarchy among factors or nodes in a reasoning network. A rule body consists of computing rules and/or inference rules and is used to express specific evaluation methods for factors and/or certainty factors in corresponding rule schemata. A rule schema has the general form: IF E1, E2, ..., En THEN A, where E1, E2, ..., En is a conjunction (AND) of all premise factors and A is a predicate or variable called a conclusion factor. Each rule schema has a corresponding rule body. In a rule body, there are one or more inference rules such as production rules and/or computing rules for computation. A rule schema with its corresponding rule body is called a rule set. A rule set is an independent knowledge unit in the "rule schema + rule body" representation and can be described in Backus Naur form, in outline, as follows: <rule set> ::= <rule schema> <rule body> <rule schema> ::= 'IF' <premise factor> {',' <premise factor>} 'THEN' <conclusion factor> <premise factor> ::= <proposition> | <predicate> | <variable> <conclusion factor> ::= <proposition> | <predicate> | <variable> <rule body> ::= <rule> {<rule>} <rule> ::= <inference rule> | <computing rule> <computing rule> ::= (<conclusion factor> | CF'('<conclusion factor>')') '=' <algebraic expression> <inference rule> ::= 'IF' <condition> {'and' <condition>} 'THEN' <conclusion> <condition> ::= (<premise factor> | CF'('<premise factor>')') <operator> <value> <operator> ::= '>' | '<' | '=' | '<>' | '>=' | '<=' <value> ::= <numerical value> | <truth value> | <string> | <variable> The terms <variable>, <predicate>, <algebraic expression> and the different kinds of values above have the standard interpretations. 3.3 Comparison between OPS5 rules and "rule schema + rule body" representation rules There are a number of advantages to the "rule schema + rule body" representation [Wu 93a]. Firstly, rule schemata in a knowledge base provide a way of describing meta-knowledge about concrete rules in rule bodies, which facilitates sorting rule sets in a rule base. Secondly, it expresses computing rule sets in the same form as inference rule sets. Further, it provides natural "IF-THEN" expertise expression in two-level structures, and also provides flexible processing of inexact reasoning in rule bodies, and so forth. However, the negative aspect of these advantages is that the "rule schema + rule body" representation is not as powerful as OPS5 rules. For example, it cannot efficiently represent first-order logic rules. 4 Design of rule-based system algorithms Usually an algorithm refers to a method of solving a well-specified computational problem for a system. With the development of rule-based systems, efficiency has been a major consideration up to this stage. A rule-based algorithm is considered more efficient than others if its cost per working memory change is lower. This section addresses some measures for improving the efficiency of rule-based system algorithms and analyses some good algorithms [Fang & Wu 94]. 4.1 The knowledge For the purpose of obtaining efficiency, three types of knowledge or state information may be incorporated into a rule-based system algorithm as follows [McDermott et al. 78]: 1.
Condition membership, which provides knowledge about the possible satisfaction of each individual condition element. An algorithm that uses condition membership can ignore further processing of those rules which are not active, i.e. those rules for which one or more positive condition elements are not partially satisfied. 2. Memory support, which provides knowledge about which working memory elements individually partially satisfy each individual condition element. Associated with each condition element in rule-based systems is a memory which indicates precisely which subset of working memory elements partially match the condition element. 3. Condition relationship, which provides knowledge about the interaction of condition elements within a rule, and partial satisfaction of rules. The process for condition relationship is similar to that of maintaining the results of intermediate joins in database systems. Two further types of knowledge can be identified as follows: 4. Conflict set support, which provides knowledge about which rule has consistent variable bindings between its condition elements. The conflict set is retained across each 3-phase cycle, and the contents of the conflict set are used to limit search in a rule-based system during the chaining process [Miranker 87]. 5. Premise-conclusion relationship, which provides knowledge about which premise factor of a rule schema is the conclusion factor of another rule schema, or which premise factor of a rule schema partially matches (see the definition in Section 2.3) the conclusion factor of another rule schema. This knowledge is used to arrange rule sets in the rule base into order [Wu 93a]. 4.2 Matching techniques In each 3-phase cycle, matching is a crucial step. This section introduces four techniques for matching, namely indexing, filtering, and the decision tree and decision table methods. 4.2.1 Indexing In the matching process, the current state can be used as an index for immediate selection of matching rules, provided that the rule preconditions are stated as exact descriptions of the current state [Rich & Knight 91]. The simplest form of indexing for rule-based systems is that the interpreter begins the match process by extracting one or more features from each working memory element and uses these features to hash into the rule collection. This obtains a set of rules that might have satisfied LHSs. A more efficient form of indexing adds memory to the process. For example, one scheme involves storing a count with each condition element. The counts are all set to zero when the system starts execution. When a data element enters the working memory, all condition elements matching the data element have their counts increased by one. When a data element leaves the working memory, all condition elements matching the data element have their counts decreased by one. The interpreter deals with those LHSs that have non-zero counts for all their positive condition elements. This scheme has been combined with other efficiency measures in a few algorithms [Forgy 82]. 4.2.2 Filtering Filtering is a method which uses a filter, namely a body of code that uses the knowledge sources (KSs) introduced in the last section, to reduce the number of rules tested by a rule-based system. If a filter contains enough information, a significant number of rules can be excluded from consideration. A filter admits to further testing any subset of rules that may be unsatisfied by its KSs.
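Returning briefly to the count-based indexing scheme of Section 4.2.1, a minimal sketch is given below. It assumes, for simplicity, that each condition element is reduced to a single hashable pattern which a working-memory element either matches exactly or not; the class and method names are illustrative assumptions and are not taken from [Forgy 82].

# Sketch of count-based indexing: keep a count of matching working-memory
# elements per condition element; only rules whose positive condition
# elements all have non-zero counts are passed on to full matching.
from collections import defaultdict

class CountIndex:
    def __init__(self, rules):
        self.rules = rules                     # rule name -> list of condition patterns
        self.counts = defaultdict(int)         # pattern -> number of matching WM elements

    def add_wme(self, element):
        self.counts[element] += 1              # element enters working memory

    def remove_wme(self, element):
        self.counts[element] -= 1              # element leaves working memory

    def candidate_rules(self):
        return [name for name, conds in self.rules.items()
                if all(self.counts[c] > 0 for c in conds)]

index = CountIndex({"r1": ["it-has-two-legs", "fly"], "r2": ["it-has-four-legs"]})
index.add_wme("it-has-two-legs")
index.add_wme("fly")
print(index.candidate_rules())                 # ['r1']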
Filters usually take the form of discrimination nets, of which the famous RETE and TREAT algorithms are the best examples. RETE incorporates memory support and condition relationship, whereas TREAT takes one more knowledge source into account - conflict set support (see Section 4.4). RETE and TREAT will be further analysed in the following sections. 4.2.3 The decision tree method The decision tree method compiles the set of condition elements, which are defined in the form of lists, into a near-optimal decision tree [Malik 81]. Firstly, it restricts a segment variable (e.g. @), which represents list fragments of an undefined length, to the tail of a sublist, and assumes that any datum in the working memory necessarily matches some condition elements of rules. It treats two kinds of element features: those which compute the length of a list or a sublist, and those which extract an atom from some specified position. These tasks are done by the internal nodes of the tree. A leaf of the tree is a pointer that points to a stack which contains all data that match the corresponding condition elements. However, sometimes there exist overlapping cases in which some leaves contain more than one condition element. These considerations lead to an algorithm. It starts by selecting an 'efficient' feature which is defined everywhere as the root of the tree from the discriminating feature table (built from condition element features). In the table, dedicated symbols denote undefined values and variable feature values respectively. It then recursively gives each branch a label which corresponds to a sub-table of the identification table. The recursion stops when a sub-table becomes empty. An example from [Malik 81] is as follows. Assume five rule antecedents are given: R1: (A(B C x)) (D x) (x F y) (E F y) --> ... R2: (A(B C x)) ~(D x) (x F y) (B F y) ~(C B @) --> ... R3: (A(x F y)) (H x) (x F z) --> ... R4: (A(B F @)) (B F y) (Z C B) --> ... R5: (A(x B)) (x C B) --> ... where x, y and z are variables, and capital letters represent propositions. An identification table which plots these feature values against the set of condition elements is constructed in Figure 2. Figure 2: An Identification Table. After non-discriminating features are deleted from the table, a discriminating decision tree can be built, as shown in Figure 3. Figure 3: A Discriminating Decision Tree. At the beginning of each "match - conflict resolution - act" cycle, the interpreter traverses the tree with each modified datum in the working memory, computing a feature value at each tree node and selecting the branch corresponding to the value. In this way a leaf containing the condition element can be reached. A characteristic of the decision tree method is that it avoids redundant computations. This method is suited to rule-based systems where condition elements of rules are represented in the form of lists. Taking into account the tree optimisation possibilities, the potential performance of the method seems to be promising. 4.2.4 The decision table method This method is based on the table knowledge representation in [Colomb 89], and it is mainly suited to propositional rule-based systems.
The transformation of the rule set in a rule base to a table representation involves a transformation algorithm which eliminates all rows in the table which are inconsistent and all rows which are subsumed by other rows. Its chaining procedure can be described as in Figure 4. Figure 4: The chaining procedure of the decision table method (input (condition) -> table processor -> output (conclusion)) A table consists of rows which can be viewed as assignments of values to corresponding variables. A simple example of a table is as follows:
it-has-two-legs, fly, it-is-a-bird
it-has-four-legs, it-is-an-animal
When the input is it-has-two-legs and fly, the first row in the table will fire, and the output of the table processor (i.e. the conclusion) will be it-is-a-bird; similarly, when the input is it-has-four-legs, the second row will fire, and the output of the table processor will be it-is-an-animal. Usually, rows in the decision table have interpretations as consequents, as in the case of the Garvan ES1 system. Because intermediate assertions are replaced with expressions which imply them, and because of the negation processing in the transformation procedure, the size of a knowledge table may become explosively large. In order to solve this problem, a few algorithms have been designed for reducing ambiguity and redundancy in [Colomb & Chung 90]. These algorithms can greatly reduce the response time of a system. In addition, an unambiguous table can be further transformed into a decision tree by ID3-like algorithms [Quinlan 86, Wu 93b]. A decision tree executes a number of nodes logarithmic in the number of rows in the decision table. Therefore, the decision table method can process large propositional rule-based systems efficiently. 4.3 Conflict resolution The output from the matching process, and the input to conflict resolution, is a set referred to as the conflict set. All rules whose LHSs have been satisfied by working memory elements are identified by the conflict set elements, which are termed instantiations. An instantiation is an ordered pair of a rule name and a list of working memory elements matching the condition elements of the rule. It is the job of conflict resolution to find an instantiation which will be executed in the act phase of a cycle. A conflict-resolution strategy is a coordinated set of principles for making selections among competing instantiations. A rule-based system's performance depends on its conflict-resolution strategy for both sensitivity and stability [Brownston et al. 86]. Sensitivity is the degree to which a system responds quickly to the dynamically changing demands of its environment, while stability is the continuity of its behaviour. The following principles can be applied in any conflict-resolution strategy [Brownston et al. 86]: — Refraction Refraction prevents rules from firing on the same data more than once. The intention is to avoid the trivial form of infinite looping which might occur when a rule does not change the working memory contents. — Data ordering Data ordering, which orders data by recency or activation, is a basic principle of conflict resolution and a powerful way of adding sensitivity to a conflict-resolution strategy. It gives preference to rules that match those elements most recently added to working memory or that are strongly related to recently-added data. This principle is usually combined with other principles to narrow down the selection of one instantiation to fire next.
— Specificity ordering The specificity principle gives preference to rules that are more specific according to some standard which can be measured in a variety of ways. For example, one specificity principle depends on a specificity function that is correlated with the degree of complexity of rule condition elements. — Rule ordering Rule ordering (which tends to be less sensitive) provides a static ordering of a rule set independent of the way in which rules are instantiated by data. The ordering may be computed by using some rule feature or features. Either a total or a partial ordering can be given by a relation on rules. If a total rule ordering has been provided, the rules can be stored in that order and scanned linearly until a matching one is found. LFA [Wu 93a] is a successful example of using the rule ordering strategy. — Arbitrary choice and parallel selection None of the above principles can guarantee that only a single instantiation will remain in the conflict set. If single firing is required for each cycle, an arbitrary decision referred to as arbitrary choice ordering can be made after all conflict-resolution principles have been applied; however, in some systems, especially parallel systems, all the remaining instantiations can be fired in one cycle, which is called parallelism in firing. The following are two alternative conflict-resolution strategies for OPS5 systems, LEX and MEA [Forgy 81]: LEX The LEX conflict-resolution strategy includes four steps which are applied in order to find an instantiation: 1. Discard from the conflict set those instantiations that have already fired. If there are no instantiations that have not fired, conflict resolution fails and no instantiation is selected. 2. This step partially orders the remaining instantiations in the conflict set on the basis of recency of working memory elements, by using the following algorithm to compare pairs of instantiations: Compare the most recent elements from the two instantiations. If one element is more recent than the other, the instantiation containing that element dominates. If the two elements are equally recent, compare the second most recent elements from the instantiations. Continue in this way until either one element of one instantiation is found to be more recent than the corresponding element in the other instantiation, or no element remains for one instantiation. If one instantiation is exhausted before the other, the other dominates. If the two instantiations are exhausted at the same time, neither dominates. 3. If no single instantiation dominates all others under the previous step, this step compares the dominant instantiations on the basis of the specificity of the LHSs of the rules. Count the number of tests (for constants and variables) that have to be made in finding an instantiation for the LHS. The LHSs that require more tests dominate. 4. If no single instantiation dominates after the previous step, make an arbitrary selection of one instantiation as the dominant instantiation. MEA The MEA strategy differs from LEX in that another step has been added after the first step of LEX. It places extra emphasis on the recency of the working memory element matching the first condition element of a rule. If no single instantiation dominates, then the remaining set is passed through the same sequence of orderings as in LEX. 4.4 RETE The RETE match algorithm [Forgy 82] is an algorithm for computing the conflict set.
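Before RETE is examined in detail, the recency-ordering comparison of step 2 of LEX above can be sketched compactly. The representation of an instantiation as a rule name plus a list of working memory time tags is an assumption made for illustration only.

```python
# A minimal sketch of the recency-ordering step of LEX (step 2 above): both
# instantiations' time-tag lists are walked from most recent to least recent.

def dominates(tags_a, tags_b):
    """Return True if instantiation A dominates B on recency, False if B
    dominates A, and None if neither dominates."""
    a = sorted(tags_a, reverse=True)
    b = sorted(tags_b, reverse=True)
    for x, y in zip(a, b):
        if x > y:
            return True
        if x < y:
            return False
    # one instantiation exhausted first: the one with elements left dominates
    if len(a) > len(b):
        return True
    if len(a) < len(b):
        return False
    return None

# instantiation = (rule name, time tags of matched working memory elements)
i1 = ("r1", [12, 7, 3])
i2 = ("r2", [12, 5, 4])
print(dominates(i1[1], i2[1]))   # True: 7 is more recent than 5
```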
It improves matching efficiency by incorporating memory support and condition relationship to avoid iterating computations over the working memory and the rule base; its concrete measure is the use of a sorting network (subparts of which can be shared), compiled from the condition element patterns of the rules, to test features of data elements and to store state information. The structural form of the sorting network is given in Figure 5. Figure 5: The Structural Form of RETE's Sorting Networks (intra-element feature test nodes followed by inter-element feature test nodes) RETE deals with two types of element features, i.e. intra-element features and inter-element features. Intra-element features are the features that involve one working memory element. For example, the class of an element must be 'Expression', or the value of an 'OP' attribute must be a specified constant. However, inter-element features result from having a variable occur in more than one pattern. For instance, the value of an attribute of an element must be equal to the value of an attribute of another element. As shown in Figure 5, when the pattern compiler processes an LHS, it builds a chain of intra-element feature test nodes, which are one-input, for each condition element pattern of the LHS, based on the intra-element features which are required by the condition element pattern. It then builds inter-element feature test nodes for testing the inter-element features of the LHS. The inter-element feature test nodes are two-input and left-associative [Ho & Marshall 92]. Finally it builds a terminal node to represent the production rule. The match procedure is as follows: The root node receives a token, which is state information (the tags '+' and '-' in a token indicate how the state information is to be changed), and then passes a copy of the token to all its successors, i.e. the intra-element feature test nodes. A '+' token that has satisfied the intra-element feature tests is added to the alpha memory. A '-' token that has satisfied the intra-element feature tests has a corresponding '+' token that is already present in the alpha memory; the corresponding '+' token is removed. Once a token updates an alpha memory, it continues to go through the network, and the next node is an inter-element feature test node. Inter-element feature test nodes store the first token and wait until the second one arrives, and then compare them. If they find that the variables between the two tokens are bound consistently, they join the two tokens into a bigger one. The bigger token is stored in the beta memory, and then sent to another inter-element feature test node for further consistency tests of variable bindings (if possible). If all the variable bindings are consistent for an LHS, the final token is sent to the terminal node. The terminal node receives the token, and adds the rule instantiation (an ordered pair of the rule name and the list of matched working memory elements) of the LHS to the conflict set. Compared with naive matching algorithms, the advantages of the RETE match algorithm can be summarised as follows: - By using the sorting network, it does not need the interpretive step. - Sufficient state is maintained so that it can avoid many iterated computations. - The subparts of the network for similar condition element patterns can be shared. There are also some disadvantages inherent to the RETE match algorithm: - It is just a matching algorithm. - The time complexity of RETE is NP-hard with respect to the number of rules in the rule base.
- Removing a data element requires essentially the same operations as adding it, so the deletion of working memory elements is expensive. - It is inefficient when most of the data changes in each cycle, because in that case RETE still maintains its state between cycles, i.e. it cannot efficiently process non-redundant rule-based systems. 4.5 TREAT The TREAT match algorithm [Miranker 87] is a RETE-like algorithm. It not only makes use of the condition membership and memory support knowledge sources, but also combines them with a new source of information, conflict set support. Its significant features are that (1) in some cases it performs much better than the RETE algorithm, and (2) it can be used in parallel systems. The TREAT algorithm constructs a sorting network from the condition patterns of the rule set, but no subpart of the network can be shared by more than one condition pattern. Furthermore, it adopts the following measures: — Conflict Set Support TREAT retains the conflict set across system cycles and uses its contents to reduce the number of comparisons required to find consistent variable bindings [Miranker 87]. As a result, it reduces the computations between beta memories that the RETE algorithm needs. — Handling Negated Condition Elements When a data element which partially matches a positive condition element is added to the working memory, the conflict set remains the same, except that the addition of the working memory element may result in new instantiations. If a rule is active (See Condition support) and the new instantiations contain the new working memory element, then the instantiations are added to the conflict set. When a working memory element which partially matches a positive condition element is deleted, no new rules will be instantiated. In that case, the instantiations that contain the removed working memory element are invalidated and are removed from the conflict set. When a rule firing adds a working memory element that partially matches a negated condition element, there may be some rule instantiations that are invalidated and will have to be removed from the conflict set. In this case, the invalidated instantiations will not contain the working memory element. To find the instantiations which must be removed from the conflict set, the negated condition element which is partially matched is temporarily transformed into a positive one to form a new rule. The working memory element is used as a seed to build instantiations of this new rule. The new instantiations are then compared with the conflict set, and any of them that exist in the conflict set are removed. When a working memory element is removed and it partially matches a negated condition element, and there is no other similar data element in the working memory whose variable bindings are consistent with those of the removed one, the removal may cause some rule instantiations to enter the conflict set. — Memory Support The alpha memories forming the memory support part of the TREAT match algorithm are the same as those of the RETE match algorithm. Information related to each condition element is stored in arrays indexed by CE-nums (condition element numbers). All the condition elements in a rule-based system are numbered with CE-nums. The alpha memories are partitioned into old-mem, new-delete-mem and new-add-mem as three separate vectors. The addition and deletion of a working memory element are different from those of the RETE match algorithm.
This can be seen from the algorithm illustration below. — Condition Support Associated with each rule is a rule-active property. A rule is active if each of its positive condition elements is partially matched by some working memory element. The rule-active property of a rule is affected by updating the contents of the old-mem for the rule. The advantages of the TREAT algorithm are as follows: — The deletion of elements is simpler than in RETE. — Inactive rules are ignored. — It can handle both temporally redundant and non-redundant rule-based systems. — It can be easily implemented in parallel rule-based systems. However, TREAT also has some disadvantages: — No computing results can be shared by condition patterns or rules. — The time complexity for matching is still NP-hard, as with RETE. — It is also just a matching algorithm. 4.6 LFA Unlike the RETE-like algorithms, the LFA algorithm [Wu 93a] is a linear forward-chaining algorithm. It adopts the 2-level "rule schema + rule body" knowledge representation outlined in Section 3.2.2. The major features of the LFA algorithm are that chaining is carried out in 2-phase "match - act" cycles instead of the 3-phase "match - conflict resolution - act" cycles, and that it can choose one rule set in each cycle without any specific conflict resolution. The following outlines the concrete measures that the LFA algorithm has adopted: - "Rule Schema + Rule Body" "Rule schema + rule body" represents knowledge in two levels. Rule schemata describe the hierarchy among factors (including premise factors and conclusion factors) or nodes in a reasoning network. Rule bodies, which consist of computing rules and/or inference rules, are used to express specific computing methods for the factors and/or certainty factors in their corresponding rule schemata. This 2-level structure facilitates sorting the knowledge base and avoids matching all the rules in a knowledge base with the working memory when some piece of data is not available. - Sorting the Knowledge Base [Wu 93a] At the end of knowledge acquisition or knowledge modification, the knowledge base is sorted or compiled into a partial order: if rule schema N is "IF factor-1, factor-2, ..., factor-n THEN factor", then all the schemata with factor-1, factor-2, ..., factor-n as their conclusion factors have rule-set numbers smaller than N. Other sorting measures are as follows: 1. Processing dead cycles. A cycle like "if A then B, if B then C, and if C then A" in a domain reasoning network is called a dead cycle if none of A, B, and C is a leaf node in the domain reasoning network and there is no other rule schema whose conclusion factor is one of them. A dead cycle cannot be numbered and has to be changed to a live cycle or removed. Figure 6 is an example of a dead cycle. 2. Renumbering schemata. Renumber all rule schemata which have all of their premise factors being leaf nodes in the domain reasoning network. For any factor F, if all schemata with it as their conclusion factor have been renumbered, it is treated as a leaf node for further renumbering. If all rule schemata in a knowledge base have been renumbered then stop. 3. Resolving live cycles. A cycle like "if A then B, if B then A, and if C then A" is called a live cycle, and A is called a live node in the live cycle, if C is not involved in any dead cycle (Figure 7 shows a live cycle). A live cycle can be resolved by treating one of its live nodes as a leaf node for further renumbering. Resolve all live cycles and go to 2.
Figure 6: A dead cycle
Figure 7: A live cycle
— Linear Forward Chaining After knowledge compilation, chaining is performed as follows: FOR the first TO the last renumbered schema in the knowledge base DO IF there exist data in the working memory for each of the condition factors of the schema THEN fire the corresponding rule body of the schema. ENDFOR. Advantages of the LFA algorithm: — Time complexity is O(n), where n is the number of rules in a knowledge base. — It is a complete forward-chaining algorithm, in which the conflict resolution step is unnecessary. — Natural knowledge representation. — Computational knowledge and uncertainty calculus are integrated with logic inference in the 2-level knowledge representation. — Intermediate computing results can be shared. Disadvantages of the LFA algorithm: — First-order rules cannot be efficiently processed. — Its chaining is in a fixed order, and thus all problem evidence needs to be provided at the beginning of chaining in order to ensure that the inference can be accomplished. 5 LFA+: A robust LFA algorithm For rule-based systems, LFA is the best forward-chaining algorithm to date in terms of time complexity. However, as with other algorithms, it also has its own limitations. The significant drawback which greatly restricts its application is that LFA cannot efficiently deal with first-order logic rules. Based on the original LFA algorithm, this section describes a robust LFA algorithm, LFA+, which mainly tackles this problem and efficiently deals with recursive rules and negation. The idea of transforming first-order logic rules into simpler domains was used before in inductive logic programming [Lavrac & Dzeroski 94] and in empirical learning, in order to transform more complex expressive mechanisms into e.g. attribute-value descriptions [Gams et al. 91]. 5.1 Knowledge representation Knowledge representation adopts the basic "Rule Schema + Rule Body" structure (See Section 3.2.2), but its expressiveness is extended to include first-order logic rules. In addition, some other measures are presented for the following purposes: 1. It facilitates avoiding matching all the rules in a rule base with the working memory at run time when some piece of data is not available. 2. It supports ordering the knowledge. — Predicate representation A predicate can be represented in the form predicate-name(object-list). In the object-list, an object is a constant or a variable. — Rule schemata representation A rule schema takes the general form: IF factor1, factor2, ..., factorn THEN factor where factor1, factor2, ..., factorn and factor may be variables for computing rules or logic assertions for inference rules. Any variable in a logic assertion is replaced in rule schemata by the notation '_', which means "don't care". The factor can also be an algebraic function or an action defined to modify the working memory. — Negation representation Negation representation takes the form not(p), where p represents a predicate. Its meaning is defined as that of [Bratko 90]: if p cannot be proven to be TRUE, then not(p) is TRUE. Two constraints are given to this representation: 1. 'not(p)' and 'p' cannot appear in two different rules respectively within a rule set. For example, for p(X,Y) :- s(X,Y), not(r(X)). p(X,Y) :- s(X,Y), r(Y). there are two rule sets: Rule schema: IF s(_,_), not(r(_)) THEN p(_,_) Rule body: IF s(X,Y) and not(r(X)) THEN p(X,Y) and Rule schema: IF s(_,_), r(_)
THEN p(.,-) Rule body: IF s(X,Y) and r(Y) THEN p(X,Y) Thereby, the confused representation that not(r(-)) and r(_) appear in the same schema can be avoided, and when data do not exist for r(_), the matching procedure for the rule body in the second rule set is unnecessary at run time. 2. 'not(p)' and 'p' may appear in the same inference rule in a rule set, but only the 'p' is included in the corresponding premise factors of the rule schema. For instance, the rule set for p(X,Y,Z) :- s(X,Y,Z), not(r(X,Y)), r(Y,Z). is Rule schema: IF s(.,-,-), r(.,.) THEN p(-,-,-) Rule body: IF s(X,Y,Z) and not(r(X,Y)) and r(Y,Z) THEN p(X,Y,Z) This knowledge representation can be described in extended Backus. Naur form as follows: := := := 'IF' 'THEN' := {','} := := := ||< algebraic f unct ion> I := |(not' ('') ') := {' ('') '} := {',' } := || ' := (functor ' ( ' ' ) ' ) ;= {', ' } := (|) {} := (| (CF'(' ')'))'=' | := 'IF' 'THEN' := {'and' } :=| := (|(CF'('') ')| | ) (|(CF' ( ' ') ') || ) := | | | | := '>'|'<'|'='|'<>'| '>='!'<=" := || - Recursive rule representation In the above form, the further specific representation for recursive rules may be described as follows: := := 'IF' 'THEN' {} := 'IF' 'and' 'THEN' {} Here, < conclusion 1>, < antecedent 2> and have the same predicate symbol. In rule schemata, if and become the same, and so do , and , and there is only one rule for , then and can be put into one rule set, and the rule schema is as follows: IF THEN Otherwise they have to be broken into different rule sets, but these rule sets as a whole take part in the numbering process. Their schemata together are referred to as a recursive schema set. For example, the rule set for ancestor(X,Y) :- father(X,Y). ancestor(X,Y) father(X,Z) , ancestor(Z,Y). ancestor(X,Y) :- father(Z,Y), ancestor(X,Z). is as follows: rule schema: IF father (-,-) THEN ancestor (_,_) rule body: IF father(X,Y) THEN ancestor(X,Y) IF father(X,Z) and ancestor(Z,Y) THEN ancestor(X,Y) IF father(Z,Y) and ancestor(X,Z) THEN ancestor(X,Y) However, the rule sets for parent(X,Y) :- mother(X.Y). parent(X,Y) :- father(X,Y). parent(X,Y) :- sister(Z,Y), parent(X,Z). parent(X,Y) :- brother(Z,y), parent(X,Z). are rule schema: IF mother (_,.) THEN parent (.,_) rule body: IF mother(X,Y) THEN parent(X.Y) rule schema: IF father(_,_) THEN parent (.,_) rule body : IF father(X,Y) THEN parent(X,Y) rule schema: IF sister(.,_), parent(_,_) THEN parent (_,_) rule body: IF sister(Z,Y) and parent(X,Z) THEN parent(X,Y) and rule schema: IF brother (.,_) , parent (_,_) THEN parent (_,_) rule body: IF brother(Z,Y) and parent(X,Z) THEN parent(X,Y) As long as the is defined as follows: := |) '_' := {' ('')'} clearly, this knowledge representation can represent any Horn clause rules. 5.2 The LFA+ algorithm This algorithm consists of two procedures namely sorting the knowledge in a knowledge base and linear-forward chaining. • Sorting knowledge in a knowledge base This procedure aims at placing rule sets in the rule base in order. The order of the rule sets is the order of the rule schemata. All measures adopted are described as follows: — Processing of negations Some schemata may have been separated due to the constraints of negation representation. These schemata together are called a schema set. In this case, the schema set as a whole takes part in the numbering process (See next step). 
Inside the schema set, the schemata take the order of their rule-body rules; in situations where it is unnecessary to distinguish not(p) and p, all the rule-body rules can be put into one rule set in that order. — Schema numbering If rule schema #N is IF factor1, factor2, ..., factorn THEN factor then all schemata whose conclusion factors partially match (See Section 2.3) with any of factor1, factor2, ..., factorn have rule set numbers smaller than N. This process puts the schemata into a partial order. — Processing of live cycles A cycle such as "IF a1 THEN b1, IF b2 THEN a2, and IF c THEN a3", in which a1, a2 and a3 partially match with each other, and b1 partially matches with b2, is referred to as a live cycle, and a1, a2 and a3 become live nodes in the live cycle, if c is not involved in any dead cycles (e.g. the live cycle in Figure 8). Schemata in a live cycle can be numbered by starting from one of the live nodes of the live cycle. — Processing of dead cycles A cycle such as "IF a1 THEN b1, IF b2 THEN c1, and IF c2 THEN a2", in which a2 partially matches with a1, b1 partially matches with b2 and c1 partially matches with c2, is referred to as a dead cycle if none of a1, a2, b1, b2, c1 and c2 can be instantiated (such as in Figure 9). A dead cycle cannot be numbered; it has to be changed into a live cycle or removed.
Figure 8: A live cycle
Figure 9: A dead cycle
— Processing of parallel schema sets If some schemata have the same conclusion factor (a negation schema set is treated as one schema) and are not involved in any dead cycles or any live cycles, then all these schemata together are referred to as a parallel schema set. A parallel schema set as a whole takes part in the numbering process, but inside the parallel schema set, the order of the schemata is arbitrary. — Rule schema renumbering Renumber the schemata in a knowledge base until they are all in order. If they are already in order then stop. • Linear forward chaining After sorting the knowledge in the knowledge base, the LFA+ algorithm performs the following process: Loop: from the first to the last schema do If data exist in the working memory for each of the premise factors (but for not(p), p may or may not exist) in the schema, then fire the corresponding rule body of the schema Endloop 5.3 Advantages and disadvantages The time complexity of the LFA+ algorithm is also O(n), where n is the number of rules in the rule base, as for the original LFA algorithm. Yet, compared with the original LFA algorithm, the robust LFA+ algorithm has the following advantages: — It can deal with first-order logic rules. — Recursive inference rules can be efficiently processed. — It facilitates coping with rules that include negative condition elements. However, the LFA+ algorithm also has some disadvantages in common with other algorithms: — It cannot meet dynamic requirements on the knowledge base during inference. — Its chaining is in a fixed order, and thus all problem evidence needs to be provided at the beginning of chaining in order to ensure that the inference can be accomplished. 5.4 An example Suppose the following Prolog rules are given: sibling(X,Y) :- brother(X,Y). (1) sibling(X,Y) :- sister(X,Y). (2) sibling(X,Y) :- brother(Y,X). (3) sibling(X,Y) :- sister(Y,X). (4) parent(X,Y) :- father(X,Y). (5) parent(X,Y) :- mother(X,Y). (6) ancestor(X,Y) :- parent(X,Y). (7) parent(X,Y) :- sibling(Z,Y), parent(X,Z). (8) ancestor(X,Y) :- parent(Z,Y), ancestor(X,Z).
(9) sibling(X,Y) :- brother(Z,Y), sibling(X,Z), X \== Y. (10) sibling(X,Y) :- sister(Z,Y), sibling(X,Z), X \== Y. (11) sibling(X,Y) :- brother(Y,Z), sibling(X,Z), X \== Y. (12) sibling(X,Y) :- sister(Y,Z), sibling(X,Z), X \== Y. (13) Based on the knowledge representation method of LFA+, their corresponding rule schemata can be represented as follows: IF brother(_,_) THEN sibling(_,_) (1') IF sister(_,_) THEN sibling(_,_) (2') IF brother(_,_) THEN sibling(_,_) (3') IF sister(_,_) THEN sibling(_,_) (4') IF father(_,_) THEN parent(_,_) (5') IF mother(_,_) THEN parent(_,_) (6') IF parent(_,_) THEN ancestor(_,_) (7') IF sibling(_,_),parent(_,_) THEN parent(_,_) (8') IF parent(_,_),ancestor(_,_) THEN ancestor(_,_) (9') IF brother(_,_),sibling(_,_) THEN sibling(_,_) (10') IF sister(_,_),sibling(_,_) THEN sibling(_,_) (11') IF brother(_,_),sibling(_,_) THEN sibling(_,_) (12') IF sister(_,_),sibling(_,_) THEN sibling(_,_) (13') As shown in (10'), (11'), (12'), and (13'), variables in relational expressions of variables (X \== Y) are not represented in schemata, because they are determined by the same variables in the corresponding predicates. Since (1') and (3'), (2') and (4'), (10') and (12'), and (11') and (13') are the same respectively, they can be combined. So they become the following: IF brother(_,_) THEN sibling(_,_) (1") IF sister(_,_) THEN sibling(_,_) (2") IF father(_,_) THEN parent(_,_) (3") IF mother(_,_) THEN parent(_,_) (4") IF parent(_,_) THEN ancestor(_,_) (5") IF sibling(_,_),parent(_,_) THEN parent(_,_) (6") IF parent(_,_),ancestor(_,_) THEN ancestor(_,_) (7") IF brother(_,_),sibling(_,_) THEN sibling(_,_) (8") IF sister(_,_),sibling(_,_) THEN sibling(_,_) (9") Apparently, (1") and (2") form a parallel schema set, and so do (3") and (4"). (1"), (2"), (8") and (9") form one kind of recursive schema set, and so do (3"), (4") and (6"); (5") and (7") form another kind of recursive schema set, and they can be combined. Based on the schema numbering measures of LFA+, these schemata can be put into the following order: 1 IF brother(_,_) THEN sibling(_,_) 2 IF sister(_,_) THEN sibling(_,_) 3 IF brother(_,_),sibling(_,_) THEN sibling(_,_) 4 IF sister(_,_),sibling(_,_) THEN sibling(_,_) 5 IF father(_,_) THEN parent(_,_) 6 IF mother(_,_) THEN parent(_,_) 7 IF sibling(_,_),parent(_,_) THEN parent(_,_) 8 IF parent(_,_) THEN ancestor(_,_) where No. 8 is the combination of (5") and (7"). Finally, the ordered rule sets are as follows: 1 IF brother(_,_) THEN sibling(_,_) IF brother(X,Y) THEN sibling(X,Y) IF brother(X,Y) THEN sibling(Y,X) 2 IF sister(_,_) THEN sibling(_,_) IF sister(X,Y) THEN sibling(X,Y) IF sister(X,Y) THEN sibling(Y,X) 3 IF brother(_,_),sibling(_,_) THEN sibling(_,_) IF brother(Z,Y) and sibling(X,Z) and X \== Y THEN sibling(X,Y) IF brother(Y,Z) and sibling(X,Z) and X \== Y THEN sibling(X,Y) 4 IF sister(_,_),sibling(_,_) THEN sibling(_,_) IF sister(Z,Y) and sibling(X,Z) THEN sibling(X,Y) IF sister(Y,Z) and sibling(X,Z) THEN sibling(X,Y) 5 IF father(_,_) THEN parent(_,_) IF father(X,Y) THEN parent(X,Y) 6 IF mother(_,_) THEN parent(_,_) IF mother(X,Y) THEN parent(X,Y) 7 IF sibling(_,_),parent(_,_) THEN parent(_,_) IF sibling(Z,Y) and parent(X,Z) THEN parent(X,Y) 8 IF parent(_,_) THEN ancestor(_,_) IF parent(X,Y) THEN ancestor(X,Y) IF parent(Z,Y) and ancestor(X,Z) THEN ancestor(X,Y) 5.5 A chaining implementation of LFA+ The implementation is carried out with Sicstus Prolog on DEC stations. It consists of two parts, namely receiving_data and chaining.
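The effect of the ordering above can be illustrated with a small sketch that runs the eight ordered rule sets over a handful of facts. Each rule body is written as a Python function from the current fact set to newly derived facts, and a (possibly recursive) rule set is simply iterated to a fixpoint before moving on. This is only an illustration of the ordering, not the authors' Prolog implementation; rule sets 2, 3, 4 and 6 are omitted since the sample facts give them nothing to derive.

```python
# Forward chaining over the ordered rule sets of the example (sketch).

def rule_set_1(f):   # 1: IF brother THEN sibling
    return {("sibling", x, y) for (p, x, y) in f if p == "brother"} | \
           {("sibling", y, x) for (p, x, y) in f if p == "brother"}

def rule_set_5(f):   # 5: IF father THEN parent
    return {("parent", x, y) for (p, x, y) in f if p == "father"}

def rule_set_7(f):   # 7: IF sibling, parent THEN parent
    return {("parent", x, y)
            for (p1, z, y) in f if p1 == "sibling"
            for (p2, x, z2) in f if p2 == "parent" and z2 == z}

def rule_set_8(f):   # 8: IF parent THEN ancestor (recursive)
    return {("ancestor", x, y) for (p, x, y) in f if p == "parent"} | \
           {("ancestor", x, y)
            for (p1, z, y) in f if p1 == "parent"
            for (p2, x, z2) in f if p2 == "ancestor" and z2 == z}

def fire(facts, body):
    while True:                       # fixpoint for (possibly) recursive sets
        new = body(facts) - facts
        if not new:
            return facts
        facts = facts | new

facts = {("brother", "john", "doris"), ("father", "adam", "john")}
for body in (rule_set_1, rule_set_5, rule_set_7, rule_set_8):
    facts = fire(facts, body)
print(sorted(f for f in facts if f[0] == "ancestor"))
# [('ancestor', 'adam', 'doris'), ('ancestor', 'adam', 'john')]
```

Because the rule sets are visited in the compiled order, a single linear pass (with a fixpoint inside each recursive set) suffices and no conflict resolution step is needed.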
Main program 'LFA4-' has three parameters: InterpJ, Leaf-file and KbJile. InterpJ is an interpretation file, which has the following contents: 1. One or more domain value lists. For example, in domain([a,b,...]), the list is a value range of object variables; whereas in domain('~Y',[1,2,...]), the list is the value range of the variable '~Y'. 2. A truth value interpretation list. E.g. r _map ( [map (ancestor (adam John, true) ),...]). LeafJile includes the predicates which need to be provided by the users, and KbJile is a rule file which contains all the rules for a problem solving. The following is an implementation example. For convenience, we keep the Prolog rules as rule bodies since the order of rule sets is the order of rule schemata. So the contents of the Kb_file are as follows: 1. 'IF'. brother(_,_). 'THEN'. sibling(_,_). sibling(X,Y) :- brother(X,Y). sibling(X,Y) :- brother(Y,X). 2. 'IF'. sister(_,_). 'THEN'. sibling(_,_). sibling(X,Y) :- sister(X,Y). sibling(X,Y) sister(Y,X). 3. 'IF'. brother(_,_). ','. sibling(_,_). 'THEN', sibling(_,_). sibling(X,Y) :- brother(Z,Y), sibling(X,Z),X \== Y. sibling(X,Y) :- brother(Y,Z), sibling(X,Z), X \== Y. 4. 'IF'. sister(_,_). ','. sibling(_,_). 'THEN'. sibling(_,_). sibling(X,Y) :- sister(Z,Y), sibling(X,Z),X \== Y. sibling(X,Y) :- sister(Y,Z), sibling(X,Z),X \== Y. 5. 'IF', father(_,_). 'THEN', parent(_,_). parent(X,Y) :- father(X,Y). 6. 'IF'. motlier(_,_). 'THEN'. parent(_,_). parent(X,Y) :- mother(X.Y). 7. 'IF'. sibling(_,_). ','. parent(_,_). 'THEN', parent(_,_). parent(X,Y) :- sibling(Z.Y),parent(X,Z). 8. 'IF', parente.,_). 'THEN', ancestor(_,_). ancestor(X,Y) :- parent(X.Y). ancestor(X,Y) :- parent(Z,Y), ancestor(X,Z). Suppose the contents of the Leafüle are the following: brother('~Xl','~Y1'). sister('-X2','~Y2') . father('~X3','~Y3'). mother('~X4','~Y4'). and the contents of the interpretation file are as follows: domain([adam,eve,david,doris,j ohn,mary, edgar,fred,lucy,margaret,violet, Patrick]). r_map([map(sister(doris,John),true), map(sister(margaretjfred).true). map(sister(lucy,edgar).true). map(sister(margaret.violet),true), map(sister(margaret,Patrick).true), map(sister(margaretjfred).true) . map(sister(violet.fred),true), map(sister(violet.margaret).true). map(sister(violet.Patrick).true). map(brother(fred,violet),true), map(brother(fred.Patrick).true), map(brother(patrick,fred),true), map(brother(patrick.margaret).true) , map(brother(patrick,violet),true), map(brother(edgar,lucy),true), map(brother(fred,margaret).true). map(brother(john.doris).true). map(father(adam.doris).true). map(father(adam.John).true). map(father(david.edgar).true). map(father(david.lucy).true). map(father(john.fred).true), map(father(john,margaret),true), map(mother(eve.John).true). map(mother(eve,doris).true). map(mother(doris,edgar),true). map(mother(doris,lucy).true), map(mother(mary,fred).true). map(mother(mary.margaret).true)]). The following records the example run. >sicstus SICStus 2.1 #9: Thu Apr 21 09:39:25 +1000 I ?- I ?-coiisult('datalog.tex') . {consulting /fang/project/datalog.tex...> {Undefined predicates will just fail} yes I ?- 'LFA+'(interp_file,leaf_file.kb_file). {consulting /fang/project/interp_file...} {/fang/proj ect/interp_file consulted, 67 msec 1632 bytes} Interpretation interp_file loaded. Can you provide a value for ~X1 (y/n/q)? y- input : j ohn. Can you provide a value for ~Y1 (y/n/q)? y- input:diris. Wrong value! Can you provide a value for ~Y1 (y/n/q)? y- input :doris. More values for prev. variables (y/n/q)? 
y. New value for ~X1 (y/n)? y. input:fred. New value for ~Y1 (y/n)? n. More values for prev. variables (y/n/q)? n. Can you provide a value for ~X2 (y/n/q)? y. input:doris. Can you provide a value for ~Y2 (y/n/q)? y. input:john. More values for prev. variables (y/n/q)? n. Can you provide a value for ~X3 (y/n/q)? y. input:adam. Can you provide a value for ~Y3 (y/n/q)? y. input:john. More values for prev. variables (y/n/q)? q. chaining ... 1: ->sibling(john,doris)->sibling(doris,john) 2: ->F->F 4: 5: ->parent(adam,john) 7: ->parent(adam,doris) ->ancestor(adam,john)->ancestor(adam,doris) !. As shown above, there are no data for rule sets 2, 4 and 6; the rules in rule set 3 are false, and there are two rules in its rule body. 6 Conclusions and future research With the development of rule-based expert systems, efficiency has been a major consideration for chaining algorithms [Fang & Wu 94]. The naive approaches are to combine indexing with direct interpretation of the LHSs in the rule base. They are inefficient in dealing with large knowledge bases. The RETE match algorithm has made some significant improvements. It compares a set of LHSs of rules with a set of data elements in the working memory to compute the conflict set, and does not need the interpretive step. The indexing function is represented as a network of simple feature recognisers. This algorithm can process the conflict set efficiently, since it does not iterate over the working memory and the rule base. However, the RETE match algorithm only incorporates the memory support and condition relationship knowledge sources. It still has significant disadvantages. For example, deletion of working memory elements is expensive, and its time complexity is NP-hard (non-polynomial) with respect to the number of rules in a knowledge base. The TREAT algorithm is a RETE-like algorithm. It makes use of the condition membership, memory support, and conflict set support knowledge sources. The obvious improvement is that it can be easily adopted in parallel systems. However, in some cases its performance is worse than that of RETE. LFA is the best chaining algorithm to date in terms of theoretical time complexity. By adopting a rule ordering method, its time complexity of chaining can be O(n), where n is the number of rules in a knowledge base. This advantage results from its knowledge representation method, namely the 2-level "rule schema + rule body" knowledge representation (See Section 3.2.2), and from using premise-conclusion knowledge to compile the rule set in the knowledge base. However, it is difficult to deal with first-order logic rules by using LFA. LFA+ is a robust forward-chaining algorithm. It can process first-order logic rules efficiently. This mainly benefits from its knowledge representation. LFA+ inherits LFA's knowledge representation method, namely it represents knowledge in a 2-level "rule schema + rule body" structure, but the representation has been extended to cover first-order logic rules (See Subsection 5.1). Based on this representation, LFA+'s sorting and chaining procedures for first-order logic rules can be the same as those of LFA for processing propositional logic rules. Therefore, LFA+ is a powerful linear-chaining algorithm for rule-based expert systems. However, due to its static rule ordering method, its chaining is in a fixed order. So all problem evidence must be provided at the beginning of chaining in order to ensure that the inference can be accomplished for problem solving.
This is a restriction to its appHcation in data sensitive rule-based systems which give preferences to those rules that match the most recent data elements added to the working memory. LFA-I- is well suitable to be implemented by using logic programming language tools. When it is implemented by using imperative language tools, it has not provided memory support for avoiding iterating computations for matching working memory elements with condition elements when data exist for the corresponding rule schemata, so processing efficiency will decrease. For future research, it is important to maintain LFA-l-'s linear performance. In order to extend its application, the following two directions may be considered: - Combining other sorting measures with the 2-level "rule schema -I- rule body" knowledge representation method or modifying the "rule schema + rule body" structure if necessary in order that it can meet the requirements of data sensitive systems. — Introducing the memory support knowledge source to LFA-t-. This involves building a mechanism related to how to organise those memories and how to efficiently locate the memories. Also, meeting the dynamic demands of rule-based systems (i.e. the knowledge can be changed at run time) is a challenge. This is a common problem in all known chaining algorithms for rule-based systems. For example, RETE-like algorithms do not allow knowledge modification at run time either. References [Avron and Edward 81] Avron Barr and Edward A. Feigenbaum, The Handbook of Artificial Intelligence, Vol. 7, Heuris Tech Press, 1981 [Bratko90] Ivan Bratko, PROLOG — Programming for Artificial Intelligence, Second Edition, Addison-Wesley Publishing Company, 1990 [Brownston et al. 86] Lee Brownston, Robert Farrell, Elaine Kant and Nancy Martin, Programming Expert Systems in 0PS5 — An Introduction to Rule-based Programming, Addison-Wesley Publishing Company, Ine, 1986 [Colomb 89] R. M. Colomb Representation of Propositional Expert Systems as Decision Tables, Proceedings of the 3rd Australian Joint Conference on Artificial Intelfigence, 1989 [Colomb & Chung 90] R. M. Colomb and C. Y. C. Chung Ambiguity and Redundancy Analysis of a Propositional Expert System, Proceedings of the 4th Australian Joint Conference on Artificial Intelligence, 1990 [Fang k Wu 94] Guang Fang and Xindong Wu, Chaining in Rule-based Systems, Proceedings of the 7th Australian Joint Conference on Artificial Intelligence, 1994, 575-582 [Forgy 81] C.L. Forgy, 0PS5 User's Manual, Department of Computer Science, Carnegie-Mellon University, 1981 [Forgy 82] C.L. Forgy, A fast algorithm for the many pattern/many object pattern match problem, Artificial Intelligence, 19(1982), 17-27 [Gams et al. 91] Matjaž Gams, Matija Drobnic and Marko Petkovsek, International Journal of Man-Machine Studies, 34(1991),49-68 [Ho & Marshall 92] Ho Soo Lee and Marshall I. Schor, Match Algorithms for Generalised RETE Networks, Artificial Intelligence, 54(1992), 249-274 [Kacsuk 90] Peter Kacsuk, Execution Models of PROLOG for Parallel Computers, Pitman, London, 1990 [Lavrac &: Dzeroski ] Nada Lavrac and Saso Dzeroski, Inductive Logic Programming: Techniques and Applications, Ellis Howood, (1994). [Malik 81] Malik Ghallab, Decision Trees for Optimising Pattern-Matching Algorithms in Production Systems, Proceedings of the 7th Inter. Joint Conf. on AI, 1981, 310-312 [McDermott & Forgy 78] J. McDemott and C. Forgy, Production System Conflict Resolution Strategies, Academic Press, Inc., 1978 [McDermott et al. 78] J. Mcdermott, A. 
Newell and J. Moore, The Efficiency of Certain Production System Implementations, Academic Press, Inc., 1978 [Miranker 87] D.P. Miranker, TREAT:^ New and Efficient Match Algorithm for AI Production Systems, PhD Thesis, Columbia University, 1987 [Pedersen 89] Ken Pedersen,fepert Systems Programming — Practical Techniques for Rule-based Systems, John Wiley & Sons, Ine, 1989 [Quinlan 86] J. R. Quinlan, Induction of Decision Trees, Machine Learning Vol 1, No. 1, 1986 [Raeth 90] Peter G. Raeth, Expert Systems —A Software Methodology for Modern Applications, IEEE Computer Society Press, 1990 [Rich & Knight 91] Elaine Rich and Kevin Knight, Artificial Intelligenceilntex. Edition(2)), McGraw-Hill Inc., 1991 [Schalkoff 90] Robert J. Schalkoff, Artificial Intelligence — An Engineering Approach, McGraw-Hill Inc. 1990 [Sterling et al. 86] Leon Sterling and Ehud Shapiro, The Art of PROLOG — Advanced Programming Techniques, The MIT Press, 1990 [Steven 90] Steven L. Tanimoto, The Elements of Artificial Intelligence — Using Common Lisp, Computer Science Press, 1990 [Waterman & Hayes-Roth 78] D. A. Waterman and Frederick Hayes-Roth, An Overview of Pattern-directed Inferences Systems, Academic Press Inc., 1978 [Wu 93a] Xindong Wu, LFA: A Linear Forward-chaining Algorithm for AI Production Systems, Expert System: The Int. J. of Knowledge Engineering, 10(1993), 4: 237-242 [Wu 93b] Xindong Wu, Inductive Learning: Algorithms and Frontiers, Artificial Intelligence Review, 7(1993), 2: 93-108 Appendix — Definitions Backus-Naur form This expression refers to a formal language for context-free grammars. A grammar consists of a set of rewrite rules, each of which has a left-hand side and a right-hand side, separated by the metalanguage symbol ::=. The left-hand side of each rule is a nonterminal symbol of the grammar, while the right-hand side is a sequence of nonterminal and terminal symbols. Nonterminal symbols are usually surrounded by angle brackets < and >. Binding This term refers to the association between a variable and a value for the variable that holds within some scope, such as the scope of a rule, function call, or procedure invocation. Bound A variable that has been assigned a value by the process of binding is said to be bound to that value. Certainty factors (CPs) Certainty factors are properties associated with attribute/value pairs and rules, commonly used to represent uncertainty, or likelihood. Certainty factors are usually automatically maintained by expert system shells. Condition elements The left-hand side of a rule in a rule-based system is sometimes expressed as a set of patterns (or templates) which are to be matched against the contents of the working memory; each such pattern is called a condition element. When a rule is instantiated, each condition element is found to match one element of the working memory. Conflict set A conflict set is a set of all instantiations generated by the match process during a recognise-act cycle. The process of conflict resolution selects one instantiation from the conflict set and fires it. Cycle A cycle is a single iteration of a loop. In production systems, an execution consists of iterated recognise-act cycles. Domain reasoning network A domain reasoning network (e.g. Figure 8 in Section 5.2) is an AND/OR tree associated with a knowledge base in rule schema + rule body by the following analogies: 1. Nodes in the tree correspond to factors in the knowledge base. 2. 
A rule schema such as IF El, ■ ■ ■, En THEN A in the knowledge base corresponds to the arcs, which indicate the hierarchy among factors in the tree (shown in Figure 10). Figure 10: A rule schema Filtering The exclusion of either data (data filtering) or rules (rule filtering) from the match process for the sake of efficiency is termed filtering. Fire This term means to execute the set of actions specified in the right-hand side of an instantiation of a rule. Forward chaining Forward-chaining is a problem-solving method that starts with initial evidence of a problem and applies inference rules to generate new evidence until either one of the inferences satisfies a goal or no further inferences can be made. In forward-chaining production systems, the applicability of a rule is determined by matching the conditions specified on the left-hand side against the evidence currently stored in working memory. Instantiation Instantiation refers to a pattern or formula in which variables have been replaced by constants. In a production system, an instantiation is the result of successfully matching a rule against the working memory contents. It can be represented as an ordered pair of which the first member identifies the rule that has been satisfied, while the second member is a list of working memory elements that match the condition elements of the rule. Interpreter The interpreter is a part of a production system that executes the rules. Match In a production system the match process compares a set of patterns from the left-hand sides of rules against the data in data memory to find all possible ways in which the rules can be satisfied with consistent bindings (i.e. instantiations). Mixed chaining A search strategy which uses both backward and forward chaining during a single processing of a knowledge base is known as mixed chaining. Object An object is an entity in a programming system that is used to represent declarative knowledge and possible procedural knowledge about a physical object, concept, or problem-solving strategy. One-input node This term refers to a node in the RETE match algorithm network associated with a test of a single attribute of a condition element. It passes a token if and only if the attribute test is satisfied. Pattern A pattern is an abstract description of a datum that places some constraints on the value(s) it may assume, -but needs not specify it in complete detail. Temporal redundancy This term refers to the tendency of rule systems to make relatively few changes to the working memory, and hence to the conflict set, from one recognise-act cycle to the next. The RETE algorithm exploits temporal redundancy so as to avoid unnecessary re-computations of all matches. Two-input node Two-input nodes are nodes in the RETE algorithm network that merge matches for a condition element with matches for all preceding condition elements. Forecasting from Low Quality Data with Applications in Weather Forecasting Bavy Li and James Liu The Hong Kong Polytechnic University, Hung Horn, Kowloon, HK Phone: +852 2766 7273, Fax: +852 2774 0842 E-mail: {csnlU,csnkliu}@comp.polyu.edu.hk AND Honghua Dai The University of New England, Armidale, NSW 2351, AUST Phone: +612 6773 3182, Fax: +612 6773 3312 E-mail: hdai@cs.une.edu.au Keywords: machine learning, naive Bayes, neural network Edited by: Xindong Wu Received: March 9, 1997 Revised: April 15, 1998 Accepted: July 29, 1998 Accurate prediction is the most important issue in the study of machine learning and knowledge discovery. 
Various learning approaches fail to achieve a higher accuracy in new unseen cases due to various causes. Low quality data is one of them. This paper reports the results of applying Naive Bayesian classifiers to real meteorological data in Hong Kong to learn weather forecasting rules for the prediction of binary classification problems, such as rain/no-rain, and for the prediction of unlimited classification problems, such as the prediction of precipitation. The comparison results show that, among the methods we compared, the Backpropagation Network (BPN) achieved a better average accuracy rate on the binary rain/no-rain classification problem, while the Naive Bayesian Network with initial probability density approximation (NBN-IPD) achieved the highest average rate over the three classes and a very competitive rate on unlimited class classification problems. 1 Introduction In the last decade, neural computation has become increasingly popular, both in university research and the commercial marketplace. Neural network models are already used for solving problems in various areas such as pattern recognition, signal processing, optimisation, control theory, time series analysis and the simulation of complex dynamical systems. Neural networks are a type of computational structure inspired by observed processes in the natural networks of biological neurons in the brain. Each simple computational unit, a neuron, is highly interconnected with a weight. Neural network models are effective even in conjunction with complex parallel processing, learning patterns from examples, insensitivity to disturbing noise, real-time capability, self-organization and generalization (Taylor 1993, Liu 1997, Liu and Sin 1997). The classical application of neural networks to weather forecasting was introduced by Widrow and Smith in 1963. They applied the Adaline to predict the occurrence of the next day's rainfall on the basis of fluctuations in the barometric pressure over the two days preceding their calculation. The percentage of successful predictions was comparable to that achieved by the official weather prediction agency, which used a huge set of related parameters for weather forecasting (Karayiannis and Venetsanopoulos 1993). In this decade, some researchers (Chung and Kumar 1993) have shown with their experimental results that neural networks are effective in acquiring knowledge automatically from meteorological data. The aim of this paper is to look at an alternative way of learning the behaviour of the weather system. We focus on the Naive Bayesian classification for predicting the occurrence of rain/no-rain and the different levels of rainfall over a 24-hour period during the rainy season in Hong Kong. As a supplement, we will compare the performance of the above method with the BPN for forecasting purposes. 2 Naive Bayesian Classifiers Naive Bayesian classifiers, or Naive Bayes, are a simplified form of Bayes' rule that assumes the independence of observations. In spite of this assumption, some research results (Buntine 1989, Cestnik 1990, Langley 1992) have demonstrated that Naive Bayes have competitive performance in comparison with other learning algorithms such as Instance-Based Learning and the Decision Tree Algorithm, with the normal distribution assumption or some form of discretisation for continuous-valued attributes.
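The practical effect of the independence assumption is that the posterior of a class becomes the prior multiplied by one factor per observed attribute value. The following is a minimal, generic sketch of that computation, ahead of the formal development in Sections 2.1 and 2.2; the tiny data set, attribute names and the use of plain relative frequencies are illustrative assumptions, and the smoothing of small counts discussed later is omitted.

```python
# A minimal sketch of Naive Bayes under the independence assumption, with
# conditional probabilities estimated by relative frequencies.

from collections import Counter, defaultdict

train = [                     # (attribute values, class): illustrative data
    ({"cloud": "high", "wind": "S"}, "rain"),
    ({"cloud": "high", "wind": "N"}, "rain"),
    ({"cloud": "low",  "wind": "N"}, "no-rain"),
    ({"cloud": "low",  "wind": "S"}, "no-rain"),
]

prior = Counter(c for _, c in train)
cond = defaultdict(Counter)           # cond[class][(attribute, value)] counts
for values, c in train:
    for attr_value in values.items():
        cond[c][attr_value] += 1

def posterior_scores(values):
    scores = {}
    for c, n_c in prior.items():
        score = n_c / len(train)                       # p(Ck)
        for attr_value in values.items():
            score *= cond[c][attr_value] / n_c         # p(V | Ck), independence
        scores[c] = score
    return scores

print(posterior_scores({"cloud": "high", "wind": "S"}))
# {'rain': 0.25, 'no-rain': 0.0} -> the class 'rain' is selected
```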
2.1 Naive Bayes Criterion Given a set of instances, each instance is specified with a set of attribute values and a class. These instances are used for supervised learning. In such a learning environment, each instance belongs to only one class. Now we try to look for a classification rule that fits the training set, and which can be used for classifying new instances into classes. Let A_i, i = 1, ..., n, be a set of attributes. Each attribute has a certain number of possible values V_{A_i,j}, j = 1, ..., m, which denote the observed value of the attribute A_i in instance j. Let m_1 be the total number of training instances and m_2 the total number of testing instances, implying that m_1 + m_2 = m. Let C_k, k = 1, ..., p, be a set of target classes. Our data set contains m_1 training instances, m_2 testing instances, p classes and n attributes. The Bayesian approach to classification estimates the posterior probability of each class given the observed attribute values for the instance. The class with the highest estimated posterior probability is selected. The posterior probability is calculated using the following formula: p(C_k | V{A_1} V{A_2} ... V{A_n}) = p(C_k) p(V{A_1} V{A_2} ... V{A_n} | C_k) / p(V{A_1} V{A_2} ... V{A_n})   (1) where V{A_i} = V_{A_i,1} V_{A_i,2} V_{A_i,3} ... V_{A_i,m_1}. Recall that the independence of the observations is assumed; Equation 1 can then be simplified as p(C_k | V{A_1} V{A_2} ... V{A_n}) = p(C_k) prod_{i=1,j=1}^{n,m_1} [ p(C_k | V_{A_i,j}) / p(C_k) ]   (2) Equation 2 shows that the posterior probability can be computed by multiplying the a priori probability of C_k by a set of factors H_{A_i,j}, 1 <= j <= m_1, 1 <= i <= n, to obtain the final posterior probability, with the form H_{A_i,j} = p(C_k | V_{A_i,j}) / p(C_k). 2.2 Estimation of Probabilities Let N(V_{A_i,j}) be the count of the same value V_{A_i,j} of attribute A_i being observed in the training set, and similarly, let N(C_k V_{A_i,j}) be the count of V_{A_i,j} of attribute A_i and class C_k being observed together. Consider the approximation of the probabilities in Equation 2 with relative frequencies. We have H_{A_i,j} = N(C_k V_{A_i,j}) / (N(V_{A_i,j}) p(C_k)) with properties: if N(V_{A_i,j}) = 0 and N(C_k V_{A_i,j}) = 0, then H_{A_i,j} = 1 (predefined); if N(V_{A_i,j}) > 0 and N(C_k V_{A_i,j}) = 0, then H_{A_i,j} = 0; if N(V_{A_i,j}) > 0 and N(C_k V_{A_i,j}) > 0, then H_{A_i,j} > 0. According to Equation 2, some problems arise with the above estimation. H_{A_i,j} will be undefined (when N(V_{A_i,j}) = 0) if it is not predefined to 1, and the approximation seems unreliable when N(V_{A_i,j}) is very small. In 1965, Good was the first to introduce a more flexible and convenient class of initial probability density approximations, which involves two parameters a and b for estimating conditional probabilities. Let a + b = ξ, where ξ is related to the amount of noise in the domain. Once ξ is determined, we can assume that a = p(C_k) ξ and b = ξ - a (Cestnik 1990). To estimate the probabilities of Equation 2 when applying the initial probability density function, we let p(C_k | V_{A_i,j}) = (N(C_k V_{A_i,j}) + a) / (N(V_{A_i,j}) + a + b) with properties: if N(V_{A_i,j}) = 0 and N(C_k V_{A_i,j}) = 0, then H_{A_i,j} = 1; if N(V_{A_i,j}) > 0 and N(C_k V_{A_i,j}) = 0, then H_{A_i,j} > 0; if N(V_{A_i,j}) > 0 and N(C_k V_{A_i,j}) > 0, then H_{A_i,j} > 0. 3 Data Collection Meteorological data for our training and testing purposes was obtained from the Hong Kong Royal Observatory (now the HK Observatory) between 1984 and 1992. Daily observations were taken at the Observatory headquarters and King's Park. We were interested in rainy season data in Hong Kong, so we only extracted data from May to October each year. 3.1 Data Preprocessing The raw data set must be converted into a suitable format before training. This is not only a necessary step for processing a large low-quality data set, but also guarantees that reasonable analysis will be obtained from the experiments. When a weather phenomena data item was found missing in the instances, we approximated it using a linear interpolation function which was constructed with the nearby values of the same element.
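As a simple illustration of that preprocessing step, such an interpolation might look like the sketch below. The series, the choice of nearest neighbours and the handling of gaps at the ends are assumptions for illustration; the authors' exact interpolation function is not specified in more detail.

```python
# A minimal sketch of filling a missing daily value by linear interpolation
# between the nearest available observations of the same element.

def fill_missing(series):
    """series: list of daily values with None for missing observations."""
    filled = list(series)
    for i, v in enumerate(filled):
        if v is not None:
            continue
        # nearest known values before and after the gap
        left = next((j for j in range(i - 1, -1, -1) if filled[j] is not None), None)
        right = next((j for j in range(i + 1, len(filled)) if filled[j] is not None), None)
        if left is None or right is None:
            continue                      # cannot interpolate at the ends
        frac = (i - left) / (right - left)
        filled[i] = filled[left] + frac * (filled[right] - filled[left])
    return filled

print(fill_missing([27.0, None, None, 30.0]))   # [27.0, 28.0, 29.0, 30.0]
```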
The corresponding input attributes must adopt the same set of measurement units; as full metric units have been used by the Observatory since 1986, data for some elements in 1984 and 1985 had to be converted to the same measurement units. The input data were examined and transformed to normally distributed input in order to achieve a better performance.
Input Col. | Description | Remarks
1-12 | Month value of each record | 12 codes are used; e.g. [000010000000] = May, [000000100000] = July, etc.
13 | Today mean of air temp. | measurement unit: °C
14 | 5-day mean: mean of air temp. |
15 | Diff. between the temp. at 04:00 and 24:00 |
16 | Today mean of atmo. pressure | measurement unit: hPa
17 | 5-day mean: mean of pressure |
18 | Diff. between the pressure at 04:00 and 24:00 |
19-26 | Prevailing wind direction (8 nodes for 8 directions) | The order is: N, NE, E, SE, S, SW, W, NW; e.g. [10000000] = N and [00100000] = E, etc.
27 | Prevailing wind speed | measurement unit: meters/sec
28 | Today mean of cloud amount | measurement unit: oktas
29 | 5-day mean: mean of cloud |
30 | Diff. between the cloud amount at 04:00 and 24:00 |
31 | Total rainfall | measurement unit: mm
32 | 5-day mean: total rainfall |
33 | Diff. between the rainfall at 04:00 and 24:00 |
34 | Today mean of humidity | measurement unit: %
35 | 5-day mean: mean of humidity |
36 | Today total of sunshine | measurement unit: hour
37 | 5-day mean: total of sunshine |
38 | Daily potential evaporation | measurement unit: mm
39 | Daily mean of dew point temp. | measurement unit: °C
Table 1: Basic Input Parameters for Weather Prediction
3.2 Data Preparation Different input factors would have different effects on the final predictions, so we have to select the significant input parameters for the different prediction purposes. Table 1 illustrates the 39 attributes which are the basic input factors for our experiments. For the prediction of the occurrence of rainfall in one day (00:00 hr to 24:00 hr), days with less than 0.05 mm rainfall were defined as 'no rain', and the rest were defined as 'rain' after the defuzzification process. Although the bulk of the information we assimilate every day is fuzzy, most of the actions or decisions taken by people or machines are crisp or binary (Tanaka 1993, Türkşen 1997). In our experiments, Rain was the fuzzy data set for predicting the occurrence of either rain or no-rain, and Depth3 was the fuzzy data set for the precipitation prediction, with membership functions as follows: Rain = {no-rain, rain | μ_Rain(no-rain), μ_Rain(rain) ∈ [0,1]} Depth3 = {Light, Moderate, Heavy | μ_Depth3(Light), μ_Depth3(Moderate), μ_Depth3(Heavy) ∈ [0,1]} where μ_Rain(no-rain) = 1 if 0 <= x <= 0.05, (0.055 - x)/0.005 if 0.05 < x <= 0.055, 0 otherwise; μ_Rain(rain) = 1 if x > 0.05, 0 otherwise; μ_Depth3(Light) = 1 if 0.05 <= x <= 2.6, (2.855 - x)/0.255 if 2.6 < x <= 2.855, 0 otherwise; μ_Depth3(Moderate) = 1 if 2.6 < x <= 14, (15.14 - x)/1.14 if 14 < x <= 15.14, 0 otherwise; μ_Depth3(Heavy) = 1 if x > 14, 0 otherwise. It is clear that the range of each left boundary is 10% of its membership function for the above defined fuzzy sets. The λ-cut for fuzzy sets is one defuzzification method, and was considered in our study. In general, the set A_λ is a crisp set called the λ-cut (or α-cut) set of the fuzzy set A, where A_λ = {x | μ_A(x) >= λ} and λ ∈ [0,1]. The two rainfall category fuzzy sets (Rain, Depth3) apply the λ = 1 cut sets for further computation and discussion. Tables 2 and 3 show the two and three classes of rainfall categories for our next-day rainfall level predictions.
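The Depth3 membership functions and the λ = 1 cut can be sketched directly. The function shapes below follow the reconstruction given above (boundaries 0.05, 2.6 and 14 mm with a roughly 10% transition region), so the exact break points should be treated as assumptions; the code is an illustration, not the authors' implementation.

```python
# A minimal sketch of the Depth3 membership functions and the lambda = 1 cut
# used to map a daily rainfall total (in mm) to a crisp class.

def mu_light(x):
    if 0.05 <= x <= 2.6:
        return 1.0
    if 2.6 < x <= 2.855:
        return (2.855 - x) / 0.255
    return 0.0

def mu_moderate(x):
    if 2.6 < x <= 14:
        return 1.0
    if 14 < x <= 15.14:
        return (15.14 - x) / 1.14
    return 0.0

def mu_heavy(x):
    return 1.0 if x > 14 else 0.0

def lambda_cut(x, lam=1.0):
    """Return the crisp Depth3 classes whose membership reaches lam."""
    members = {"Light": mu_light(x), "Moderate": mu_moderate(x), "Heavy": mu_heavy(x)}
    return [name for name, mu in members.items() if mu >= lam]

for rainfall in (1.0, 5.0, 20.0):
    print(rainfall, lambda_cut(rainfall))
# 1.0 ['Light'], 5.0 ['Moderate'], 20.0 ['Heavy']
```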
4 Class-Dependent Discretization

Class-dependent discretization refers to the discretization of our continuous data for use by the Naive Bayes classifier, which requires discrete input values. Naive Bayes using discretization during preprocessing has been reported to perform better, in some domains with continuous-valued attributes, than Naive Bayes using the normal distribution assumption. We can therefore apply Naive Bayes in domains with continuous-valued attributes when it is integrated with a discretization algorithm that transforms continuous data into ordered discrete values. Unlike most traditional discretization procedures, a class-dependent discretizer can automatically determine the preferred number and width of intervals of continuous data, and significantly improve the classification performance of many existing learning algorithms (John et al. 1995).

In our general setting, we have a set of m_1 training instances. Each of these instances has been pre-classified into one of p classes, where each class carries a label C_k, k = 1, ..., p. The whole data set is described by n distinct attributes A_1, ..., A_n. For any attribute A_i there is a domain of attribute values defined as V_{A_i,j}, j = 1, ..., m_1 (training set) and V_{A_i,j'}, j' = 1, ..., m_2 (testing set). They can be numeric, symbolic, or both.

4.1 The Discretization Criterion

The discretization criterion, based on the concept of class-attribute dependence, seeks to maximize the dependency relationship between the target class and a continuous-valued attribute, and to minimize the amount of information lost due to discretization. Let the boundary set of an attribute A_i be B_i = {e_0, e_1, ..., e_{L_i}}, a set of ordered endpoints, where e_0 denotes the lower boundary value and e_{L_i} denotes the upper boundary value of the observed attribute A_i. The ordered boundary points satisfy e_{beta-1} < e_beta for beta = 1, 2, ..., L_i, where L_i represents the total number of intervals of the attribute.

Let Q_i be the 2-dimensional quanta matrix of the observed attribute A_i, shown in Table 4. The matrix element q_{a,beta} denotes the total number of observed attribute values from the training set which fall within the partition [e_{beta-1}, e_beta] and belong to class C_a.

Table 4: The 2-Dimensional Discretization Quanta Matrix
Class     | Boundary [e_{beta-1}, e_beta], beta = 1 ... L_i        | Row Total
C_1       | q_{1,1}  ...  q_{1,beta}  ...  q_{1,L_i}               | q_{1,.}
...       | ...                                                    | ...
C_a       | q_{a,1}  ...  q_{a,beta}  ...  q_{a,L_i}               | q_{a,.}
...       | ...                                                    | ...
C_p       | q_{p,1}  ...  q_{p,beta}  ...  q_{p,L_i}               | q_{p,.}
Col Total | q_{.,1}  ...  q_{.,beta}  ...  q_{.,L_i}               | m_1

Based on the theory of maximum mutual information discretization, we partition the original m_1 possible continuous values into L_i initial ordered discrete intervals. The rule for the maximum allowable number of intervals, L_i <= m_1 / (lambda x p), is applied here; Wong and Chiu suggest that the parameter lambda should be 3 for a liberal estimate, given m_1 training instances and p classes.

4.2 Boundary Improvement

Considering the 2-D quanta matrix in Table 4, the estimated joint probability is

p(C_k = C_a and V_{A_i,j} in [e_{beta-1}, e_beta]) = p_{a,beta} = q_{a,beta} / m_1    (3)

In a similar way, the estimated marginal probabilities can be set as

p(C_k = C_a) = p_{a,.} = q_{a,.} / m_1    (4)
p(V_{A_i,j} in [e_{beta-1}, e_beta]) = p_{.,beta} = q_{.,beta} / m_1    (5)

The Class-Attribute Mutual Information (CAMI) between the classes C_k and the interval boundaries of the observed attribute, given the quanta matrix Q_i, is

MI = \sum_{C_k} \sum_{V_{A_i}} p_{a,beta} \log \frac{p_{a,beta}}{p_{a,.} \times p_{.,beta}}    (6)

where MI >= 0.
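Purely as an illustration (the variable names are ours, not from the paper), the quanta matrix of Table 4 and the CAMI of Equation 6 could be computed as follows, where `values` and `labels` are the training values of one attribute and their classes, and `boundaries` is the ordered endpoint set B_i:

```python
import math
from bisect import bisect_left

def quanta_matrix(values, labels, boundaries, classes):
    """q[a][b] = number of training values in interval (e_{b-1}, e_b] of class a."""
    L = len(boundaries) - 1
    q = {c: [0] * L for c in classes}
    for x, c in zip(values, labels):
        # locate the interval index; values outside the range are clamped
        b = min(max(bisect_left(boundaries, x, 1) - 1, 0), L - 1)
        q[c][b] += 1
    return q

def cami(q):
    """Class-Attribute Mutual Information (Equation 6) of a quanta matrix."""
    m1 = sum(sum(row) for row in q.values())
    L = len(next(iter(q.values())))
    col = [sum(q[c][b] for c in q) for b in range(L)]   # column totals q_{.,b}
    mi = 0.0
    for c, row in q.items():
        p_a = sum(row) / m1                              # row marginal p_{a,.}
        for b, n in enumerate(row):
            if n == 0:
                continue
            p_ab, p_b = n / m1, col[b] / m1
            mi += p_ab * math.log(p_ab / (p_a * p_b))
    return mi
```

The boundary improvement step described next would repeatedly perturb one endpoint of `boundaries`, rebuild the quanta matrix, and keep the perturbation whenever `cami` increases.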
The boundary improvement process attempts to adjust the initial boundary set between the lower and the upper boundary pairs, starting from the first ordered interval [e_0, e_1]. Within the boundary set, each interval can be perturbed either boundary-up or boundary-down in order to maximize the value of the CAMI. Both the boundary set and the relevant quanta matrix must be updated after each interval adjustment. In order to obtain an optimal estimate of global interdependence, the boundary improvement process is repeated until no further improvement is found.

4.3 Interval Reduction

Because of redundancy, some adjacent intervals of the modified quanta matrix may have a very similar frequency distribution over the observed classes. If two neighbouring boundary pairs fail the statistical test in Equation 7, the frequency distribution between the two intervals and the observed class is insignificantly interdependent (Wong and Chiu 1987), and the two intervals can be merged. The Class-Attribute Interdependence Redundancy (CAIR) is IR = MI / JE, which is bounded below by zero, where MI is calculated from Equation 6 and JE is the joint entropy,

JE = - \sum_{C_k} \sum_{V_{A_i}} p_{a,beta} \log p_{a,beta}

To eliminate the "redundant" intervals, the following statistical test is introduced:

IR > \frac{\chi^2_{(p-1)(L_i-1)}}{2 m_1 \cdot JE}    (7)

where \chi^2_{(p-1)(L_i-1)} represents a Chi-square distribution with (p - 1) x (L_i - 1) degrees of freedom. Here p denotes the total number of output classes and L_i denotes the total number of current intervals of the observed attribute. To check an adjacent interval pair, we supply only the partial CAMI and the partial joint entropy to Equation 7, computed from the frequency elements and the subtotal of training instances of the two neighbouring columns. This interval reduction algorithm is performed until all pairs of adjacent intervals pass the statistical interdependence test, yielding an optimal number and an appropriate width of intervals for the observed attribute.
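The pairwise test can be sketched as follows (this is our own illustration; the critical Chi-square value would normally be taken from a table, or from scipy.stats.chi2.ppf, at the degrees of freedom and significance level chosen by the analyst):

```python
import math

def partial_stats(q, b):
    """Partial CAMI and joint entropy over the two neighbouring columns b and b+1
    of a quanta matrix q (dict: class -> list of interval counts)."""
    classes = list(q.keys())
    sub = {c: [q[c][b], q[c][b + 1]] for c in classes}      # 2-column sub-matrix
    m = sum(sum(row) for row in sub.values())
    col = [sum(sub[c][j] for c in classes) for j in (0, 1)]
    mi = je = 0.0
    for c in classes:
        p_a = sum(sub[c]) / m
        for j in (0, 1):
            n = sub[c][j]
            if n == 0:
                continue
            p_ab, p_b = n / m, col[j] / m
            mi += p_ab * math.log(p_ab / (p_a * p_b))
            je -= p_ab * math.log(p_ab)
    return mi, je, m

def should_merge(q, b, chi2_critical):
    """True when columns b and b+1 fail the CAIR test of Equation 7
    and should therefore be merged into a single interval."""
    mi, je, m = partial_stats(q, b)
    if je == 0.0:
        return True                      # degenerate columns are trivially redundant
    ir = mi / je
    return ir <= chi2_critical / (2 * m * je)
```

Applying `should_merge` to every adjacent pair, merging failures and repeating until all pairs pass, corresponds to the interval reduction loop described above.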
5 Results Comparison and Analysis

BPN is one machine learning technique applied to the automation of knowledge acquisition; the detailed construction of such networks can be found in Widrow et al. (1994). For our experiments we used a neural network toolkit called NeuralWorks Professional II/Plus (a neural network simulator by NeuralWare, Inc., Pittsburgh, PA 15276-9910, USA), version 5, running on a SUN SPARC workstation. The algorithms for classifying with the NBN were written in C, a general-purpose programming language, under the UNIX system.

5.1 Rain/No-rain Prediction

Referring to Table 2, a total of 46 input columns are used to predict the occurrence of rain/no-rain during the Hong Kong rainy season. Table 5 shows the classification rates (C-Rate) of daily rain of less than 0.05 mm (Class 1) and daily rain > 0.05 mm (Class 2) for the different learning algorithms. In the table, the notation 1120 (560, 560) denotes that out of 1120 instances, 560 belong to Class 1 (the first number in brackets) and 560 to Class 2 (the second number), and so on.

Both the NBN-RF (relative frequency approximation) and NBN-IPD (initial probability density approximation) rows were calculated by randomly selecting 2/3 of the instances from the whole data set for training. The other two rows (NBN-RF*, NBN-IPD*) were computed by randomly selecting 2 out of 3 training instances within each output class. Table 5 reveals that if a reasonable proportion of the training and testing instances is collected from each output class, a better classification rate can be achieved. However, this rate is also affected by the quality of the original data set. With the best architecture of each learning algorithm recorded, the optimal performance was about 73% for Class 1 and 71% for Class 2, obtained with the BPN.

Table 5: Prediction of Rain/No-rain
Learning Algorithm | Training Set      | Testing Set     | C-Rate Class 1 (%) | C-Rate Class 2 (%) | Remarks
BPN                | 1120 (560, 560)   | 540 (270, 270)  | 73.1               | 71.1               | optimum: 30 hidden nodes, momentum = 0.3
NBN-RF             | 1104 (193, 911)   | 552 (367, 185)  | 57.8               | 77.4               |
NBN-IPD            | 1104 (193, 911)   | 552 (367, 185)  | 58.9               | 75.5               | amount of noise = 13
NBN-RF*            | 1104 (374, 730)   | 552 (186, 366)  | 65.6               | 74.3               |
NBN-IPD*           | 1104 (374, 730)   | 552 (186, 366)  | 66.7               | 75.1               | amount of noise = 2

5.2 Precipitation Prediction

Similarly, Table 3 shows the 46 input parameters used for rainfall depth prediction. The three output nodes in Table 6 represent light rain (Class 1), moderate rain (Class 2) and heavy rain (Class 3). Let d be the volume of rainfall in mm; the above three classes refer to 0.05 < d <= 2.6, 2.6 < d <= 14 and d > 14 respectively. The results of the BPN algorithm were collected from 25 hidden nodes with a momentum of 0.32. The best outcomes of IPD and IPD* were obtained by setting the amount of noise to 1. The optimal performance (NBN-IPD*) for the prediction was approximately 95% for Class 1, 98% for Class 2 and 96% for Class 3. For rainfall depth prediction, the results from both the RF and the IPD are much better than those of the BPN, achieving around 90% accuracy on the unseen testing data.

Table 6: Prediction of Rainfall Depth
Learning Algorithm | Training Set          | Testing Set      | C-Rate Class 1 (%) | C-Rate Class 2 (%) | C-Rate Class 3 (%)
BPN                | 540 (180, 180, 180)   | 270 (90, 90, 90) | 64.3               | 66.1               | 65
NBN-RF             | 528 (173, 178, 177)   | 263 (86, 100, 77)| 100                | 84                 | 84.4
NBN-IPD            | 528 (173, 178, 177)   | 263 (86, 100, 77)| 96.5               | 98                 | 96.1
NBN-RF*            | 528 (173, 185, 170)   | 263 (86, 93, 84) | 100                | 90.3               | 95.2
NBN-IPD*           | 528 (173, 185, 170)   | 263 (86, 93, 84) | 95.4               | 97.9               | 96.4

The Forecast Group at the Observatory suggested the ranges of rainfall level (after a defuzzification process similar to that in subsection 3.2) for the prediction shown in Table 7. The training set 1104 (373, 204, 240, 174, 113) could be constructed from our meteorological data (out of 1104 instances, 373 are no rain, 204 trace rain, 240 light rain, 174 moderate rain and 113 heavy rain), and the testing set is 552 (186, 102, 120, 87, 57). The optimal results of the NBN-IPD* were approximately 73%, 14%, 60%, 93% and 87% for the five categories respectively. The figure below shows the average classification rates with the amount of noise ranging from 1 to 15; in general, the average performance varies between 64.4% and 65.7%.

[Figure: Average Performance of Precipitation Prediction by NBN-IPD*]

6 Conclusions and Discussions

The experimental results indicate that the performances of the BPN and the NBN are comparable, and that both give a reasonable performance on the prediction of the occurrence of next-day rain/no-rain. The peak performance for rain and no-rain was about 73% and 71% respectively.
The optimal performance was produced by the BPN with sufficient and relevant information for training. Although the NBN performed extremely well on the prediction of rainfall depth within 3 categories (see Table 6 for details), the average performance on the prediction of rainfall range levels within 5 categories dropped to 65% with the same networks. Some possible reasons for these large differences (Liu and Wong 1996) are as follows:

- Rainfall is a stochastic event and the cause of its occurrence is very complex. Even under identical weather conditions, it is possible that it will rain at one moment, but not at another.
- The number of explanatory variables used as input parameters may not be sufficient to capture all of the necessary features for the 24-hour-period prediction, since significant changes in weather conditions may take place during that time.
- Before forecasting experts make a prediction of the future weather element distribution over a particular area, they require extra information about the surrounding area. However, the data used in the experiment were extracted from a single point station (the Observatory at King's Park) only. Data from other weather stations, such as Waglan Island and Hong Kong International Airport, were excluded from the study for reasons of simplicity and data unavailability.

Table 7: Rainfall Categories
Category | Depth d (mm)
Nil      | 0 < d < 0.05
Trace    | 0.05 < d < 0.1
Light    | 0.1 < d < 4.9
Moderate | 4.9 < d < 25.0
Heavy    | d > 25.0

Currently, most research papers in weather forecasting limit themselves to short-range effects. Apart from the study of other elements in weather forecasting (temperature changes, severe thunderstorms, etc.), we also have to be concerned with long-term predictions. All in all, weather forecasting will remain a permanent scientific challenge for future research.

Acknowledgement

The authors appreciate the support from UGC Grant (B-Q092).

References

[1] Buntine W. (1989) Learning Classification Rules using Bayes. Proceedings of the 6th International Workshop on Machine Learning, p. 94-96.
[2] Cestnik B. (1990) Estimating Probabilities: A crucial task in machine learning. Proceedings of the European Conference on AI, p. 147-149.
[3] Ching J.Y., Wong A.K.C. and Chan K.C.C. (1995) Class-Dependent Discretization for Inductive Learning from Continuous and Mixed-Mode Data. IEEE Transactions on Pattern Analysis and Machine Intelligence, 17, 6, p. 1-11.
[4] Chung C.Y.C. and Kumar V.R. (1993) Knowledge Acquisition using a Neural Network for a Weather Forecasting Knowledge-based System. Neural Computing & Applications, 1, p. 215-223.
[5] Dai H.H. (1994) Learning of Forecasting Rules from Large Noisy Real Meteorological Data. PhD Dissertation, Department of Computing, RMIT, Australia.
[6] Dai H.H. and Ciesielski V. (1994a) Learning of Inexact Rules by the FISH-NET Algorithm from Low Quality Data. Proceedings of the 7th AJCAI, World Scientific, p. 108-115.
[7] Dai H.H. and Ciesielski V. (1994b) The Low Prediction Accuracy Problem in Learning. Proceedings of the 2nd Australian and New Zealand Conf. on IS, IEEE, p. 367-371.
[8] Good I.J. (1965) The Estimation of Probabilities. M.I.T. Press, Cambridge, Massachusetts.
[9] Karayiannis N.B. and Venetsanopoulos A.N. (1993) Artificial Neural Networks: Learning Algorithms, Performance Evaluation and Applications. Kluwer Academic Publishers, USA.
[10] Langley P., Iba W. and Thompson K. (1992) An Analysis of Bayesian Classifiers. Proceedings of the 10th National Conference on AI, p. 223-228.
[11] Liu J. and Wong L.
(1996) A Case Study for Hong Kong Weather Forecasting. Proceedings of the International Conference on Neural Information Processing, Hong Kong, 2, p. 787-792. [12] Liu J.N.K. (1997) Quality Prediction for Concrete Manufacturing. Automation in Construction, Elsevier Science, 5, p. 491-499. [13] Liu J.N.K, and Sin K.Y. (1997) Fuzzy Neural Networks for Machine Maintenance in Mass Transit Railway System. IEEE Transactions on Neural Networks, 8, 4, p. 932-941. [14] Royal Observatory of Hong Kong (1984-1986) Surface Observations in Hong Kong. The Government Printer, Hong Kong. [15] Royal Observatory of Hong Kong (1987-1992) Surface Observations in Hong Kong. The Government Printer, Hong Kong. [16] Tanaka Y. (1993) An Overview of Fuzzy Logic. Proceedings of WESCON, IEEE, San Francisco, USA, p. 446-450. [17] Taylor J.G. (1993) The Promise of Neural Networks. Springer-Verlag, London, Great Britain. [18] Türk§en LB. (1997) Fuzzy Logic: Review of Recent Concerns. Proceedings of IEEE Int. Conf. on SMC, Orlando, USA, 3, p. 2975-2978. [19] Widrow B. and Simth F.W. (1963) Pattern-Recognizing Control Systems. Proceedings of Computer and Informations Sciences Symposium, Spartan Books, Washington, USA, p. 462-467. [20] Widrow B., Rumelhart D. and Lehr M. (1994) Neural Networks: AppUcations in Industry, Business and Science. Communications of the ACM, 37, 3, p. 93-105. [21] Wong A.K.C, and Chiù D.K:Y. (1987) Synthesizing Statistical Knowledge from Incomplete Mixed-Mode Data. IEEE Transactions on Pattern Analysis and Machine Intelligence, 9, 6, p. 796-805. [22] Zhang M. and Scofield R.A. (1992) Satellite-derived Estimation of Rainfall in Forward-backward Thunderstorm Propagation Model Using Neural Network Expert System Techniques. Proceedings of IEEE/INNS Int. Joint Conf. on NN, Baltimore, 2, p. 425-430. Reverse Engineering and Abstraction of Legacy Systems Margot Postema, and Heinz W. Schmidt School of Computer Science & Software Engineering Monash University 900 Dandenong Road Caulfield East, VIC 3145, Australia Email: {margot, hws}@csse.monash.edu.au Keywords: reverse engineering, legacy systems, transformation, abstraction Edited by: Rudi Murn Received: June 23, 1998 Revised: September 11, 1998 Accepted: October 11, 1998 Extremely laige software systems which have been developed and maintained by many different people are termed legacy systems. These legacy systems were traditionally developed using methods such as structured analysis and design, or even individual programming techniques and styles. Over time, maintenance has changed the original program structure and specifications. However, usually the specifications have not been maintained, and the current design and program understanding is lost. Maintenance of these systems becomes so costly, that they become candidate for reengineering. Reverse engineering of legacy systems aims at discovering design and specification of existing software programs. The recovered designs are then forward engineered to improved systems. This difficult task can be assisted by CASE tools, but still requires human expertise and domain knowledge. With a technology shift from structured to object-oriented software construction, an additional problem arises i.e. transforming the structured legacy system to an object-oriented system. We outline the current state of CASE tools, followed by research directions. Further, we indicate that a two staged approach may be advantageous, where self-contained subsystems can be abstracted for design recovery and object discovery. 
1 Introduction

Extremely large software systems which have been developed and maintained by many different people are termed legacy systems. Since the early 1970's, these legacy systems were developed using methods such as structured analysis and design (Yourdon, 1989). Prior to this, individual programming techniques and styles were used to develop software systems. Maintenance and understanding of legacy software can pose problems, particularly when the original programmers have left and replacement staff do the modifications. Due to individual programming styles, programs developed prior to the use of structured programming techniques can be even more difficult to understand and maintain. Over time, source code maintenance has changed the original software specification and design. However, as is often the case, the specifications and design have not been updated. Thus, program design and understanding is lost, and the only documentation of the system is the source code itself.

These software systems become difficult to maintain, and a decision needs to be made as to their future, i.e. are they to be discarded, maintained or reengineered? For a system to be reengineered, it must first be reverse engineered to discover the system design and specifications. From here, program understanding and abstraction of higher level concepts occur prior to forward engineering or reconstruction. Tools assist the reverse engineering process, and research is now also developing tools to assist program abstraction for reconstruction.

In the following Section, we discuss aspects of the decision process. System renovation (or reengineering) is covered in Section 3. Sections 4 and 5 define reverse engineering (a subset of reengineering) and give an overview of CASE tools and methods used to recover, as well as abstract, specification and design. Section 6 describes the current state of research and is followed by future research directions and a conclusion.

2 The Decision Process

Jacobson's decision matrix (Jacobson & Londström, 1991, p. 341) identifies systems which have high business value and need to be reengineered. A similar model is also presented in Sneed (1995a); Sneed further discusses quantifying and proving the benefits of reengineering. As the reengineering process can become expensive, analysis is necessary to determine the business value embedded in the software itself. Hence, methods and techniques are essential for success. When analysing and assessing software systems, it may be beneficial to isolate only those portions (or sub-systems) of the software which are heavily maintained and contribute to the majority of the maintenance cost. These sub-systems could be interfaced to the rest of the legacy software and then reimplemented. Thus, we have extended the decision matrix for the reengineering decision (mentioned previously) to a 3-dimensional (3-d) view (see Figure 1). This view additionally identifies reuse value. In Figure 1, the 3-d decision matrix has an x-axis representing business value, a y-axis representing changeability and a z-axis representing encapsulation. We assume a scale of 0 to 1 for each axis, where 0 indicates no value and 1 indicates a maximum value. Our previous discussion on isolating sub-systems for reimplementation is highlighted in Figure 1 as systems candidate for partial reengineering in a society of objects.

[Figure 1: 3-dimensional Decision Matrix]

The 3-d decision matrix can then be viewed as a Monolithic System (low encapsulation) and a Society of Objects (high encapsulation). It is represented as follows (a small illustrative sketch of this classification follows the list):

- A: Monolithic System (low encapsulation).
  1. Low business value and low changeability indicates discarding the system.
  2. Low business value and high changeability indicates corrective maintenance of the system.
  3. High business value and high changeability indicates perfective maintenance of the system.
  4. High business value and low changeability indicates reengineering the system.
- B: Society of Objects (high encapsulation).
  1. Low business value and low changeability indicates reconfiguring the system.
  2. Low business value and high changeability indicates extending the system.
  3. High business value and high changeability indicates improving the system.
  4. High business value and low changeability indicates partial reengineering of the system, i.e. isolate portions of the system which are suitable for reengineering, and interface these to the existing system.
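Purely to illustrate the 3-d decision matrix (the function name and the 0.5 cut-off are our own assumptions, not part of the model), the mapping from the three axes to an action could be written as:

```python
def reengineering_decision(business_value, changeability, encapsulation, threshold=0.5):
    """Map the three axes of the 3-d decision matrix (each in [0, 1]) to an action.
    Low encapsulation corresponds to a monolithic system, high encapsulation
    to a society of objects; 0.5 is an illustrative cut-off between low and high."""
    high_bv = business_value >= threshold
    high_ch = changeability >= threshold
    if encapsulation < threshold:                 # A: monolithic system
        if not high_bv and not high_ch: return "discard"
        if not high_bv and high_ch:     return "corrective maintenance"
        if high_bv and high_ch:         return "perfective maintenance"
        return "reengineer"
    else:                                         # B: society of objects
        if not high_bv and not high_ch: return "reconfigure"
        if not high_bv and high_ch:     return "extend"
        if high_bv and high_ch:         return "improve"
        return "partial reengineering (isolate and interface sub-systems)"

# e.g. reengineering_decision(0.9, 0.2, 0.8) -> "partial reengineering (...)"
```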
The outcome of analysing legacy software's value will often result in a decision to reengineer or renovate the software. We now describe the process involved.

3 System Renovation

For system renovation or reengineering to occur, methods are required to recover the current design and reengineer the software system. These methods, assisted by tools, include:

- reverse engineering
- redocumentation
- design recovery
- restructuring
- transformation
- forward engineering

Reengineering initially reverse engineers the current program to recover the high-level abstraction or design of the software. The recovered design is then forward engineered, with a low-level implementation of the high-level abstraction. The result is a new program with either the same functionality, or enhanced functionality to meet new requirements. The reengineering process can be summarized by the diagram in Figure 2, where tools are used to decompose software for representation in various forms. Software documents are input and parsed prior to storage in repositories. From here, extracted information is passed to browsers, where viewing of abstract syntax trees, graphs, call graphs, interfaces, documentation, reports, etc. occurs. The maintenance programmer then obtains an understanding of the software system. Specifications can be abstracted before proceeding with the forward engineering step.

[Figure 2: Reengineering Process]

Experience and knowledge play an important role in the comprehension of a program (van Zuylen, 1993a). van Zuylen further describes how software engineers study program code. Two processes have been identified: top-down and bottom-up. The bottom-up approach looks at building higher level concepts by grouping together statements which perform a common function. The top-down approach, as demonstrated in Figure 3, uses heuristics, refinement and adjustment until the software engineer has an understanding of the program. A combination of the two approaches is required for understanding and abstraction. Whilst the abstraction process is currently mostly manual, tools and research developments for semi-automated assistants are emerging. In Sections 5 and 6, we describe how partial automation of this process has been achieved with the use of tools.
[Figure 3: Inference structure for program comprehension, from van Zuylen 1993a, p. 88. The figure relates syntactic knowledge, code and documentation, domain knowledge and the problem statement.]

Technology has now moved from structured to object-oriented (Meyer, 1988) techniques, with many added advantages, particularly in the area of reduced maintenance costs. Thus, the current focus is to reverse engineer legacy systems and convert these to object-oriented applications. At this stage, we must point out that translating programs from one language and/or technology to another does not necessarily improve the design. Thus the current trend does not address the issue of redesigning programs for improved understandability and maintainability.

Legacy codes are increasingly being made accessible as services in a network. This requires the isolation of self-contained legacy components, and their wrapping for interoperability. Such wrappers do not require an understanding or modeling of the entire set of functions originally required to be implemented when the system development was commissioned. Only those objects and functions currently used and required from outside need to be modeled, i.e. another level of abstraction simplifies the reverse modeling or design recovery. A two staged approach seems more promising than total redesign from scratch, where:

1. In Stage 1, existing tested functionality is encapsulated, and made accessible in a network with minimal changes to the original code.
2. In Stage 2, portions of the encapsulated code are successively replaced in one or more subsequent stages.

In Stage 2 above, the discovery of semantic objects or the reconstruction of an OO design is necessary to enable the application of modern techniques. Identification of subsystems candidate for partial reengineering (as mentioned in Section 2) would then be possible. The approach mentioned above is similar to that of Brodie & Stonebraker (1995), where legacy systems are incrementally migrated, and would be greatly aided by partial automation. Brodie & Stonebraker (1995, page 78) mention that incremental rewrites of legacy systems (i.e. migration) are much more successful than restructuring. The difficulty lies in identifying the candidate modules and being assured of an explicit interface or wrapper, i.e. any changes made in the wrapped module will not affect the implementation. Hybrid tools are being introduced to assist this transformation. Before abstraction and reconstruction can commence, reverse engineering with the aid of tools can be performed.

4 Reverse Engineering

Reverse engineering may be viewed as the process of discovering software design and specification from source code, and representing it in forms such as data flow diagrams, flowcharts, specifications, hierarchy charts, call graphs, etc. For an annotated bibliography on reverse engineering, the reader is referred to van den Brand (1997).

Table 1: Acronyms
CASE - Computer Aided Software Engineering
OO - Object Oriented
CIA - C Information Abstraction
AST - Abstract Syntax Tree
FOL - First Order Logic
DESIRE - Design Recovery
ILP - Inductive Logic Programming
SQL - Structured Query Language
CBMS - Code-base management system
TPI - Transform Plug-In
SDK - Software Developer Kit
CQML - Code Query & Manipulation Language
CORBA - Common Object Request Broker Architecture
RCS - Revision Control System

Ideally, the reverse engineering process should be able to discover design abstractions and objects from the source code.
Research is focusing its attention on this problem, which we elaborate in Section 6. Currently, case tools for reverse engineering generate call graphs from legacy code (e.g. GDPro (Advanced Software Technologies, 1997)). Murphy et.al. (1995) present an empirical study of five call graph extractors (cflow, CIA, Field, mkfunctmap and rigi-parse), which shows that the call graphs produced, vary according to treatment of inputs such as macros, function pointers, input formats etc. This can inherently affect the view that the software engineer has of the source code. The call graphs are produced by parsing the source code to produce abstract syntax trees (ASTs). Example tools which use parsers to store information about the program include the C Information Abstraction System (CIA) (Chen, 1995a) and REDO (van Zuhlen, 1993b). The tools analyse and perform transformations on the parse tree and then generate information such as call graphs or data dependency analysis from the parse trees. CIA (Chen, 1995a) for example, constructs an entity relationship model and stores this in a database. Query and analysis tools can be used to visualise the information. REDO uses an intermediate language or representation, called Uniform (Breuer & Lano, 1991) to represent the source code and produce low level specifications. The REDO tools aim at reverse engineering from the source code in a bottom-up way. Additionally to REDO, standard CASE environments can be used to build views top-down (van Zuhlen, 1993b, p. 106) However, large systems are often difficult to comprehend and need to be broken down. Other approaches to reverse engineering for program understanding and hence abstraction assistants are presented below. 5 Program Abstraction 5.1 Program Slicing Program slicing (introduced by Weiser (1984)) is a method for automatically decomposing programs by analysing their data flow and control flow. Slicing, or code isolation is a method for extracting components of a program that are concerned with a specific behaviour or interest such as a business rule or transaction. It is very useful in reverse engineering, but requires knowledge about data flows, dependencies and "dead code" (Rochester & Douglas, 1993). "Many traditional techniques employed in reverse engineering (e.g. structure charts and call graphs) are inadequate for the recognition of algorithms, and certainly for the recognition of design" (Beck & Eichmann, 1993). Beck and Eichmann (1993) introduce a new form of program slicing {interface slicing) which aids in algorithm and design recognition in the reverse engineering process. Traditionally, program slicing is defined as building a dependence graph using data- and control-flow analysis, and then generating actual slices of graph operations (Beck &c Eichmann, 1993). An approach to identifying slices can be specification driven (Cimi-tile, Lucia & Munro, 1995). Interface slicing (Beck & Eichmann, 1993) reduces the procedures appearing in the interface (i.e. procedures used inside other procedures are excluded). The method reduces the view of procedures, thereby reducing graph size. Hence, abstraction is introduced to aid understanding. Other approaches to aid program understanding include approximate reasoning systems (see Section 5.3.) 5.2 Domain Specific Approaches Research has also focused on using knowledge-based or expert systems approaches. The main principle behind this approach is to gather a library of previously discovered design patterns and apply these to the program in question. 
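A minimal sketch of this idea follows (entirely illustrative; real systems such as ASPIS use far richer representations and human-curated libraries). Programs and stored design plans are reduced to feature sets, and the library is ranked by overlap with the program under study:

```python
def jaccard(a, b):
    """Similarity of two feature sets (0 = disjoint, 1 = identical)."""
    return len(a & b) / len(a | b) if a | b else 0.0

def match_against_library(program_features, plan_library, top=3):
    """Rank stored design plans by similarity to the program under study.
    plan_library maps plan names to their characteristic feature sets."""
    scores = {name: jaccard(program_features, feats)
              for name, feats in plan_library.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:top]

# Illustrative use: features might be routines called, data structures touched,
# or I/O statements found by parsing.
library = {
    "sequential file update": {"open", "read", "sort", "merge", "write"},
    "report generation":      {"read", "accumulate", "format", "print"},
}
print(match_against_library({"open", "read", "merge", "write"}, library))
```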
These systems are used in software and reverse engineering and are also features of analogical reasoning and relational matching. The ASPIS (Aslett, 1991) project, uses knowledge about development methods and in particular, knowledge about application domains (i.e. domain knowledge) in the form of expert systems. This type of approach shows a great potential for improving software development of new systems. If we take a banking domain as an example, we could develop a new banking system by simply using an existing one in the knowledge base. The ASPIS system can be used to construct a specification from reusable components. It's retrieval mech- anism makes use of a component called the Refinement History (Freeman, 1987), consisting of a log of previous results. It searches a Reuse Database; has a comprehensive thesaurus to match synonyms, and uses the notion of conceptual closeness based on the number of common microfeatures (analogical reasoning). Hence, components from different domains can also be identified. It is important to note, that a human administrator decides whether a component is suitable for inclusion in the reuse library. The knowledge acquisition sessions for ASPIS to obtain the domain knowledge has been developed through the "interview method". However, this approach brings up the old knowledge acquisition bottleneck-piohlem (Feigenbaum, 1977). Examples using knowledge bases in reverse engineering use languages which support first order logic (FOL) (Bennett, 1993). These include the powerful pattern matching abilities of Prolog (Aslett, 1991) and Refine (Burson, Kotik & Markosoan, 1993). Generally, the idea behind these approaches can be summarised in the view presented in Figure 4 where: 1. Programs are compared to information available in the knowledge base. 2. Various views can be obtained by a generalisation process. 3. The user interacts with the knowledge base. 4. The user finds a solution. Figure 4: Using a knowledge-based approach to understand programs Bennett (1993) describes the ReForm project which transforms code to FOL - it theoretically should be a general tool to read in any code (not language specific), and then output in any code. The system currently reads in assembler code and produces FOL, with the aim of producing specifications that are semantically correct with the original code. The knowledge base system should then transform the original code into a logically equivalent piece. It is an interactive system which requires input from the maintenance programmer. These inputs include domain specific knowledge about the representation of the specification. The maintenance programmer can make changes to the semantics via an editor. Other systems such as REDO and MACS (Bennett, 1993) are very similar. More example approaches using knowledge bases include analogical reasoning (Tausend & Bell, 1992) and relational matching (Vosselman, 1992). Exact relational matching was first used in chemistry and mathematics (Vosselman, 1992), growing out of a need for classification of molecules (consisting of atoms and connections). New molecules were compared to examples in the database. The search can start with a comparison of structural similarities, or expand to inexact matching using weighted primitive and relation tuples, and attribute valued thresholds. A primitive p is considered to inexactly match primitive q if all corresponding attribute value differences were less than the threshold. 
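The threshold rule just described can be stated compactly; the following is our own paraphrase of it (the attribute names and threshold values are placeholders):

```python
def inexact_match(p, q, thresholds):
    """Primitive p inexactly matches primitive q if every corresponding
    attribute value difference is below its threshold (Vosselman, 1992).
    p, q and thresholds are dicts keyed by attribute name."""
    return all(abs(p[attr] - q[attr]) < thresholds[attr] for attr in thresholds)

# e.g. two routine "primitives" described by size and fan-out:
print(inexact_match({"kloc": 4.8, "fan_out": 7},
                    {"kloc": 5.1, "fan_out": 6},
                    {"kloc": 0.5, "fan_out": 2}))   # True
```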
Analogical reasoning (or inexact matching) finds a solution for a new problem by using known solutions of a similar problem. The process makes use of a knowledge base and finds examples with similar structures and arguments. The relational match method uses the information theory to compare two relational descriptions. It finds the best mapping from features in one description to another, and requires an evaluation function to measure similarities. The relations are represented as a tuples or lists which may have attributes. Whilst the above approaches do aid the reverse engineer, they assume comprehensive knowledge/databases which are domain specific and timely to build. Surprisingly, this approach is being followed in the current trend to convert procedural systems to object-oriented systems (Section 6.1). 5.3 Concept Understanding Program analysis is usually the most time consuming and difficult part of software maintenance (Bennett, 1993). Understanding of a program and it's concepts is often very different firom just looking at the program specification or it's formal definition (BiggerstaflF, 1993a). Parsing technology, which is so commonly used in reverse engineering, assumes complete formal structures of patterns (Biggerstaff, 1993a). Furthermore, Biggerstaff (1993a) points out that as patterns are also open to ambiguity, the concept of understanding a program cannot be determined by algorithms and abstractions, but by "chaotic piecing together of evidence for a concept until some threshold of confidence is reached about its identity" (Biggerstaff, 1993a). Currently there is much interest in the notion of reuse and design patterns. Numerous references are made to the design patterns identified by Gamma et.al. (1995) also known as the gof However, it is still another new idea and one wanders whether the design ^"gang of four"; i.e. the four authors of the OO design patterns text (Gamma et.al., 1995) patterns identified by a particular group such as the gof are really reusable (Menzies, 1996). As Menzies (1996) points out, one needs to consider facts such as: - Different developers use different designs to solve the same problem. - At least one library of reusable design patterns contains significant deviations to the gof library. Biggerstaff (1993a) summarises the various approaches as: - Highly domain specific, knowledge bases - not suitable for large programs. - Algorithmic program understanders or recognizers (such as a graph-parser), which are also computationally infeasible for complex systems. - Model driven, plausible reasoning systems (e.g. DESIRE) - where results are approximate and imprecise. The use of general, as well as domain specific knowledge, is featured in the Design Recovery system DESIRE (Biggerstaff, 1993a), to assist reverse engineers understand programs. It includes query facilities, slicing, domain knowledge, and call graphs, and rehes heavily on user interaction. This seems to be the current approach to reverse engineering (particularly, with an emphasis of program understanding and abstraction), and is further expanded in Section 6. Recovered program design is usually very poor, and requires restructuring for improved understandability and maintainability. Total automation of this process is very difficult, however some techniques and tools to assist this are described in the following Section. 
As mentioned in Section 3, a two staged approach to reverse engineering would be beneficial, where portions of the system are interfaced to the legacy code, and then restructured or redesigned. Further discussions are included in Section 7.1. 5.4 Restructuring Restructuring involves transforming to improve code and/or design, eg. changing If statements to Case statements. A decision needs to be made when and where to restructure, requiring the use of metrics, and an evaluation function. Designers need to select good abstract roles for the modules, using abstraction and information hiding principles. They must be able to predict future likely changes - extendibility. The software engineer needs to make design decisions to incorporate information hiding principles, and which information needs to be shared. Then the system must adapt to new business requirements and/or enhancements. This is when the maintenance engineer discovers that the original design could be better "Therefore, modularisation as a reverse-engineering process must be treated heuristically, rather than by a formal set of rules." (Schwanke, 1993, p.267). Maughan and Avotins (1995) investigate the production of an object-oriented reengineering tool-set for performing architectural restructuring of object-oriented systems. The system produces formal specifications from the source code, and records the reverse engineer's restructuring plans, indicating which axioms fail with the decisions made. New design implementation can be further enhanced by including a metrics function. Metrics can be used to measure the quality of the object-oriented design. Martin (1995) describes a simple set of metrics that can be used in terms of interdependence between subsystems of the design. The metric measures stability (based on class inter and intra-dependence), ab-stractness (number of abstract versus total classes) and distance from main sequence (using the abstract and stability measures) to determine where restructuring should occur. Chu (1993) summarises previous approaches to program translation as: - Traditional techniques translate programs on an element by element basis and then refine the programs. This results in semantically equivalent programs, but no improvement for maintainability or readability. — Translation via abstraction and reimplementation of programs was proposed to address these issues. This is achieved by recognition of standard computational components called cliches, which are stored in a library. This method unrealistically assumes programs are built from standard components, and further requires exhaustive searching to recover designs. Chu (1993) proposes that the re-engineer should restructure the parsed software and then apply rules to semantic patterns. He also follows a knowledge based approach, which seems quite timely to build, and is domain specific. Methods and tools currently used in reengineering are assistants to the engineers. Specific reverse engineering tools have been useful, however abstraction techniques and concept understanding is still difficult. Hence we investigated if a machine learning technique such as inductive logic programming could be applied. 5.5 Inductive Logic Programming Inductive logic programming (ILP) is a research area formed at the intersection of Machine Learning and Logic Programming. ILP can be used in the area of generahsation and classification of relations. 
ILP systems use background knowledge (B) and training examples (E) to induce hypotheses (H) expressed in a logic program form (Lavrac k Dzeroski, 1994). These have the relationship: BAH^E An example from Lavrac (Lavrac & Dzeroski, 1994) is given as follows: The task is to define the target relation daugh-ier(X,Y) which states that X is a daughter of person Y, in terms of the background relations female and parent. Positive and negative examples are presented. The following definition is formulated in the form of a Horn clause: daughter{X,Y) ^ female{X),parent{Y,X) ILP systems can learn a single concept or multiple concepts. They can work as batch or incremental learners. ILP could be used to generalise software programs and classify a new example based on the induced hypothesis. Most ILP systems have been used for prediction. However Grendel2 (Cohen, 1994b) has been used for discovery. GRENDEL (Cohen, 1994a) is an extension of the algorithm used in Quinlan's FOIL (Quinlan, 1990). GRENDEL is guided by an antecedent description grammar i.e. it adds an explicit grammatical bias to FOIL. The concept description language is input to the learner in the form of background knowledge. An example where GRENDEL learns illegal chess positions from background knowledge, positive and negative examples is presented in Cohen (1994a). Grendel2 (Cohen, 1994b), a successor to GRENDEL has been extended to discover specifications from C code. A case study (Cohen, 1994b) of a large C program (over one million lines of source code) shows the ILP software can discover two thirds of the specification with about 60 percent accuracy. The system recovered declarative view specifications from relational database examples. Positive examples were obtained by executing the code to give materialised views of the database and the table relations were encoded as background knowledge. No negative examples are required for the learning system. The method focuses on using inductive reasoning about the behaviour of the code, rather than deductive reasoning of the static code. ILP systems use examples and background knowledge to form a hypothesis. In the area of reverse engineering, this is only useful when many runs of a program (such as relational database views, representing dynamic behaviour) are used as input examples to the learner, and the business rules (eg. table relations) are used as background knowledge. ILP could be useful in a situation where there were many similar programs which could be generalised. However, this doesn't seem to be the case in legacy systems. Data representation would also need to be carefully considered. In the following Section, we describe a reverse engineering environment, which is widely used in the development of reverse engineering and transformation-applications. 5.6 Reasoning - Transformation Software Reasoning, Inc. (Reasoning, 1997) is renowned for its products, formerly known as Software Refinery and the Refine Language Tools. These have been used in many reverse engineering research projects (An-toniol et.al. 1995, Newcomb & Kotik 1995, Wells et.al. 1995, Yeh et.al. 1995). In late 1996, Reasoning rearchitected and upgraded its products which are now called ReasoningS code-base management system (CBMS), Reasoning Language Gateways, Reasoning Transform Plug-Ins (TPIs) and the Software Developer Kit (SDK). The technology represents a powerful solution for the Year 2000 problem (Millennium). 
The Millennium problem concerns the practice of storing calendar year dates in two-digit format instead of four digits. Thus in 1999 the year is stored as the digits 99, and in the year 2000 it would be represented as 00. Major problems arising from this practice are foreseen if it is not corrected now. Reasoning's Transform Plug-Ins have been developed to automatically transform non Year 2000 software into Year 2000 compliant software. The architecture of the Reasoning Transformation Software is shown in Figure 5.

[Figure 5: Reasoning Transformation Software Architecture. The figure shows Transform Plug-Ins and Language Gateways on top of the CQML compiler and a scalable tree storage engine holding abstract syntax trees, control flow graphs and data flow graphs, together with a parser generator, report writer and GUI toolkit.]

Reasoning5 uses a technology similar to that of database management systems, i.e. data can be stored and queried. It is a code-based management system (CBMS) which stores information about source code. In Reasoning5, the Code Query and Manipulation Language (CQML) can be used for transforming the source code. A language gateway is used to import source code into the CBMS, where it is represented as abstract syntax trees, control flow graphs and data flow graphs. Once source code is in the CBMS, it can be transformed by Transform Plug-Ins (TPIs). Reasoning provides language gateways for Ada, C, Cobol and Fortran. By using the Software Developer Kit, language gateways and TPIs can be customised to handle various dialects, or other languages. Reasoning5 runs on powerful, scalable Unix hardware and has no limit on program size.

6 State of The Art

We have presented tools and techniques useful for the reverse engineering process, but full automation of the whole reengineering process is not usually successful. Hence the focus is on semi-automated and hybrid approaches. Legacy programs can be parsed and stored as databases which can then be queried, similarly to SQL queries on a relational database (Chen 1990, Jarzabek & Keam 1995). Rules (Gall & Klösch 1995, Jarzabek & Keam 1995) are used to discover an initial E-R model. Gall & Klösch (1995), for example, integrate domain knowledge into the object identification process. Jarzabek & Keam (1995) have implemented a knowledge based assistant which allows queries on the program database to assist the reverse engineer.

Graphical views of software systems are also an important issue. For instance, call graphs can become too large and complex for comprehension and understanding. Work is emerging on the visualisation aspects of large software systems. Ciao (Chen, 1995b), for example, is a graphical navigator which allows queries on a program database, and also incorporates change management systems. Ciao highlights differences in entities between different versions of a program, which is essential for locating any files modified during a particular instance of modification. "The program difference technology is a critical first step in building a software history database that helps us study aging of legacy code." (Chen, 1995b). Another software visualisation tool is SHriMP (Simple Hierarchical Multi-Perspective) (Storey & Müller, 1995). The SHriMP visualisation technique has been incorporated into the Rigi reverse engineering system and creates fisheye views. Program slicing (Section 5.1) could also be used to view subsections of programs.
6.1 Design Recovery and Abstraction The notion of design recovery (Biggerstaff 1993b, Chikofsky 1993, Lano & Breuer 1993) includes using domain knowledge, external information, and deduction or fuzzy reasoning in the software system, to identify high levels of abstraction (Chikofsky, 1993). The automation of the design recovery process doesn't seem to be currently feasible, as there is too much expert and domain knowledge required. Examples of this include various programming styles, different designs and implementation of the same problem, depending on the software engineer, and the various syntactical versions in the many programming languages and environments. Specifications which have been successfully extracted were mostly a manual effort (Brodie & Stonebraker, 1995, p. 168). Ideally, a learning system guided by the reverse engineer is required. Further investigations in areas of artificial intelligence and machine learning could be promising. Michie and Sammut (1995) for example are researching a machine learning model of human cognition. Higher levels of abstraction are required for program understanding and comprehension. This can be achieved by using methods such as interface sUcing (Beck & Eichmann, 1993) (Section 5.1). Once understanding is achieved, new technology (object-oriented) can be implemented. 6.2 Object Discovery The current aims are to provide semi-automatic tools or cooperative environments to discover objects. These environments usually rely on some form of knowledge base or encoded heuristics. Lano and Breuer (1993) describe reverse engineering structured COBOL to a formal object specification language (Z-l—1-). Prom there forward engineering can occur in any language. The COBOL code is firstly translated to UNIFORM intermediate code. Some transformation occurs (eg. PERFORM UNTIL DO UNTIL); files and subprograms are classified as objects. Higher level object abstractions are found from dataflow analysis, where variables associated with main data structures are discovered. DECODE (Quilici k Chin, 1995), another reverse engineering environment, comprises of: 1. An automated plan recogniser (APU) which examines source code for instances of patterns i.e. compares the source code to program plans stored in a plan library. 2. A knowledge base for recording extracted design information. 3. A structured notebook for editing and querying the extracted design. The system is limited in that it only works well where the patterns encoded in the plan library match the legacy code. It provides a graphical view of the system, allowing easier programmer understanding and editing facilities. 7 Future Direction As experience, knowledge, external information, deduction and refinement are required to recover design, we need a model to represent the cognition process i.e. a recovery system. The reverse engineer needs to guide the recovery system (Sneed, 1995b), which should also record the process, and allow for comments to be added. This is similar to the refinement history (Freeman, 1987) and reengineering plan (Maughan & Avotins, 1995). The process is analogous to the software development and maintenance processes, where the software engineer, armed with knowledge and aided by tools, decides how things are done, i.e. we need to reverse this process. 
Table 2: Example Configuration Management Data
Module | Size (KLOC) | Age (Weeks) | Reason (1: Failure, 2: Change)
A      | 1000        | 2           | 1
B      | 1000        | 20          | 2
C      | 500         | 50          | 2
D      | 500         | 10          | 1
E      | 300         | 10          | 2
A      | 1000        | 4           | 1
A      | 1000        | 8           | 1
C      | 500         | 60          | 1
B      | 1000        | 50          | 1

7.1 ToRevise

In our project "Towards Reverse Engineering by Visual Interactive Semantics" (ToRevise), we focus on modeling interfaces (wrappers for near self-contained legacy components). Legacy codes are increasingly being made accessible as services in client/server networks. Hence a model is required to assist in isolating and abstracting self-contained legacy components, and in wrapping them for interoperability. We assume, for simplicity, that the underlying interoperability substrate is based on the CORBA (Mowbray, 1995) object model, and that only a fraction of the functionality of the legacy components is actually used in the enterprise and therefore requires interface specification recovery. The recovery and redesign can be seen as a two stage process, where:

1. In Stage 1, modules are selected and encapsulated, and subsequently tested for original functionality.
2. Stage 2 involves the reimplementation of the encapsulated modules.

For ToRevise we are mainly focusing on Stage 1 support. The initial problem which needs addressing is to identify candidate components of the system to reimplement using object-oriented techniques. A reverse engineering decision support tool, which analyses source code information from a different perspective (e.g. configuration management data), should indicate which modules are candidates. With a focus on the decision process, the approach should help to identify and rank the most promising areas of reverse engineering in legacy code. The second problem is then being able to guarantee that a module can be isolated and self-contained.

In industry, most large software development projects use configuration management tools to control access to source files and provide version control. A configuration management library such as RCS (a tool that comes with most Unix systems) stores the current version of the program and the difference (delta) from the previous version. RCS requires the software engineer to check out each file for editing and check it in again when changes are complete; a version number is assigned to each checked-in file. As such, the configuration management library is a starting point for providing information to identify the most promising modules for reverse engineering. Modules which should only be interfaced to the new implementations should also be highlighted. Once the candidates are selected, the design abstractions need to be discovered. Table 2 presents some example configuration management data, which can be represented by the view in Figure 6. Module A is highlighted as a candidate for reverse engineering.

[Figure 6: View of KLOC vs frequency of change, showing the reason for change (failure vs enhancement); large modules with a high frequency of failure-driven changes are marked for reverse engineering.]
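Using the kind of data in Table 2, such a ranking could be sketched as follows. This is our own illustration, not the ToRevise tool itself; the scoring rule (frequent failure-driven changes to large modules count most) simply mirrors the view in Figure 6:

```python
from collections import defaultdict

# (module, KLOC, age_weeks, reason) rows as in Table 2; reason 1 = failure, 2 = change
LOG = [("A", 1000, 2, 1), ("B", 1000, 20, 2), ("C", 500, 50, 2), ("D", 500, 10, 1),
       ("E", 300, 10, 2), ("A", 1000, 4, 1), ("A", 1000, 8, 1), ("C", 500, 60, 1),
       ("B", 1000, 50, 1)]

def rank_candidates(log):
    """Rank modules by change frequency, failure-driven changes and size."""
    stats = defaultdict(lambda: {"kloc": 0, "changes": 0, "failures": 0})
    for module, kloc, _age, reason in log:
        s = stats[module]
        s["kloc"] = kloc
        s["changes"] += 1
        s["failures"] += (reason == 1)
    def score(s):
        return s["changes"] * (1 + s["failures"]) * s["kloc"]
    return sorted(stats.items(), key=lambda kv: score(kv[1]), reverse=True)

for module, s in rank_candidates(LOG):
    print(module, s)        # module A comes out on top, as in Figure 6
```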
Further analysis of configuration management data can also indicate which modules are related, i.e. modified at the same time, and for the same reason, as another module. For example, if we examine Figure 6, we might find that module D and portions of modules B and C were modified at the same time as module A. This may isolate self-contained subsystems, and can be represented by the view in Figure 7, where:

- The current legacy system requires reengineering.
- Candidate sub-systems are identified.
  1. The sub-systems are isolated and tested for interoperability.
  2. Portions of the sub-systems are replaced.

[Figure 7: A Staged Approach using Abstraction for Design Recovery]

Figure 8 describes how the tool or recovery assistant should function. Inputs are source code, configuration management data and user knowledge. Various views are presented, and the user interacts with the assistant to select modules for reimplementation.

[Figure 8: Tool Design: Abstraction and Object Discovery. Inputs: source code, configuration management library, user. Components: analyser, decision support tool, reverse engineering workbench. Views: configuration analysis, module selection, abstraction and object discovery, metrics, restructuring.]

At this stage, it must be emphasised that this may be a rather idealistic view of a solution to real world problems. Interleaving (Rugaber, 1995), or code fragmentation, can make it difficult to achieve self-contained components. Interleaving can occur for efficiency reasons or due to inadequate software maintenance. The emphasis is still on an intelligent assistant, which needs to be aware of ever so many problems and programming idioms.

7.2 Related Work

As mentioned above, efficiency reasons or maintenance (patches and quick fixes) can lead to code fragmentation, or to routines which achieve many independent goals. With a focus on systems where programmers have not aggregated related data structures and methods, techniques are required to abstract objects from source code. Rugaber et al. (1996) have developed a tool to detect and extract independent plans from a routine, leaving a smaller, more coherent routine. Plans are defined as descriptions of computational structures, proposed by designers to solve a problem. Domain specific knowledge seems necessary to develop these types of tools to extract novel and idiosyncratic plans. For example, Rugaber et al. have used knowledge about usage patterns in order to discover exception handlers in program source code. The software understanding process involves two parallel knowledge acquisition techniques (Rugaber et al., 1996):

- using domain knowledge to understand the code;
- using knowledge of the code to understand the domain.

Through empirical evaluation of source code, Rugaber et al. have also detected co-occurrences of routines (routines always called by the same routines, executed under the same conditions, and with data flows computed from one to another). They suggest that routines which co-occur are candidates for reformulation wrappers, and are building tools to perform this automatically. Other work includes a graph-oriented, direct-manipulation approach (called the star diagram) to assist in restructuring source code (Bowdidge & Griswold, 1994). The tool allows the user to manipulate source code and discover abstract data types. An approach similar to that mentioned in Section 5.2 has been adopted by Harris & Yeh (1996), who apply source code recognition techniques to supporting libraries. As mentioned in Section 5.6, Software Refinery is the prevalent development environment for the latest reverse engineering tools. Adaptable automated tools such as these are required to assist with the reverse engineering problem. Case studies and available data sets would significantly contribute to reverse engineering research (Wills & Cross, 1996). Analysis techniques now include references other than source code (e.g. reference manuals, business rules) (Wills & Cross, 1996).

8 Conclusion

Legacy software systems are difficult and costly to maintain.
As many of these systems contain the business rules, reengineering (or partial reengineering) is necessary. Design and specifications need to be rediscovered, and then forward engineered into improved systems. Today's technology has shifted from structured (procedural) programs to object-oriented systems, which offer many advantages in the areas of reduced maintenance costs and program reusability. The difficult task of recapturing a program's design and specification, with understanding, is assisted by current CASE tools. We have presented an overview of existing technology and tools, including CASE tools that represent programs as abstract syntax trees, call graphs, etc.; program slicing techniques; knowledge-based approaches; approximate reasoning systems; and inductive logic programming. Visualisation techniques can also aid program understanding. The knowledge-based approach has been used frequently, and whilst it has been criticised as inadequate and very domain specific, this path is still being followed in current efforts to convert procedural to object-oriented systems. We have sketched an ongoing project, ToRevise, whose focus is on semi-automated and hybrid approaches. Intelligent assistants are required which can help the reverse engineer understand and recover program design. Candidate modules need to be selected, and assured of an explicit interface. Module migration can then proceed, where the engineer can reimplement the system with an improved design.

References

[1] Advanced Software Technologies (1997) http://www.advancedsw.com, August. [2] Antoniol G., Fiutem R., Merlo E. & Tonello P. (1995) Application and user interface migration from BASIC to Visual C++. International Conference on Software Maintenance, IEEE Computer Society Press, Los Alamitos, Calif., p. 76-85. [3] Aslett M.J. (1991) A Knowledge Based Approach to Software Development: ESPRIT Project ASPIS. Elsevier Science Publishers, The Netherlands. [4] Bowdidge R.W. & Griswold W.G. (1994) Automated Support for Encapsulating Abstract Data Types. In Proc. 2nd ACM SIGSOFT Symposium on Foundations of Software Engineering, December, 19, 5, p. 97-110. [5] Beck J. & Eichmann D. (1993) Program and interface slicing for reverse engineering. In R.C. Waters & E.J. Chikofsky, editors, Working Conference on Reverse Engineering, IEEE Computer Society Press, p. 54-63. [6] Bennett K.H. (1993) Automated Support of Software Maintenance. In R.S. Arnold, editor, Software Reengineering, IEEE Computer Society Press, California, p. 59-70. [7] Biggerstaff T.J. (1993a) The concept assignment problem in program understanding. In R.C. Waters & E.J. Chikofsky, editors, Proceedings Working Conference on Reverse Engineering, IEEE Computer Society Press, p. 27-43. [8] Biggerstaff T.J. (1993b) Design recovery for maintenance and reuse. In R.S. Arnold, editor, Software Reengineering, IEEE Computer Society Press, California, p. 520-533. [9] Breuer P.T. & Lano K. (1991) Creating specifications from code: Reverse-engineering techniques. In Journal of Software Maintenance: Research and Practice, 3, p. 145-162. [10] Brodie M.L. & Stonebraker M. (1995) Migrating Legacy Systems: Gateways, Interfaces & The Incremental Approach. Morgan Kaufmann Publishers, Inc. [11] Burson S., Kotik G.B. & Markosian L.Z. (1993) A program transformation approach to automating software re-engineering. In R.S. Arnold, editor, Software Reengineering, IEEE Computer Society Press, California, p. 275-283. [12] Chen Y., Nishimoto M.Y.
& Ramamoorthy C.V. (1990) The C information abstraction system. In IEEE Transactions on Software Engineering, IEEE Computer Society, March, 16, p. 325-334. [13] Chen Yih-Farn (1995a) Reverse engineering. In B. Krishnamurthy, editor, Practical Reusable Unix Software, John Wiley & Sons, chapter 6. [14] Chen Y.R., Fowler G.S., Koutsofios & Wallach R.S. (1995b) Ciao: A graphical navigator for software and document repositories. In G. Caldiera and K. Bennett, editors, International Conference on Software Maintenance, IEEE Computer Society Press, p. 66-75. [15] Chikofsky E.J. & Cross J.H. (1993) Reverse engineering and design recovery: A taxonomy. In R.S. Arnold, editor, Software Reengineering, IEEE Computer Society Press, California, p. 54-58. [16] Chu W.C. (1993) A re-engineering approach to program translation. In D. Card, editor, Conference on Software Maintenance 1993, IEEE Computer Society Press, p. 42-50. [17] Cimitile A., De Lucia A. & Munro M. (1995) Identifying functions using specification driven program slicing: A case study. In G. Caldiera and K. Bennett, editors, International Conference on Software Maintenance, IEEE Computer Society Press, p. 124-133. [18] Cohen W.W. (1994a) Grammatically biased learning: learning logic programs using an explicit antecedent description language. In Artificial Intelligence, Elsevier Science, The Netherlands, August, 68, p. 303-366. [19] Cohen W.W. (1994b) Recovering software specifications with inductive logic programming. In Proceedings of the Twelfth National Conference on Artificial Intelligence, AAAI Press, 1, p. 142-148. [20] Feigenbaum E. (1977) The art of AI. In Proceedings of the Fifth International Conference on Artificial Intelligence. [21] Freeman P. (1987) A conceptual analysis of the Draco approach to constructing software systems. In IEEE Transactions on Software Engineering, 13, p. 830-844. [22] Gall H. & Klösch R. (1995) Finding objects in procedural programs: An alternative approach. In L. Wills, P. Newcomb & E. Chikofsky, editors, Second Working Conference on Reverse Engineering, IEEE Computer Society Press, p. 208-216. [23] Gamma E., Helm R., Johnson R. & Vlissides J. (1995) Design Patterns: Elements of Reusable Object-Oriented Software. Addison-Wesley. [24] Harris D.R. & Yeh A.S. (1996) Extracting Architectural Features from Source Code. In Automated Software Engineering - Special Issue: Reverse Engineering, Kluwer Academic Publishers, June, 3, 1/2, p. 109-138. [25] Jacobson I. & Lindström F. (1991) Re-engineering of old systems to an object-oriented architecture. In A. Paepcke, editor, OOPSLA '91 Conference Proceedings, Addison-Wesley Publishing Company, p. 340-350. [26] Jarzabek S. & Keam T.P. (1995) Design of a generic reverse engineering assistant tool. In L. Wills, P. Newcomb & E. Chikofsky, editors, Second Working Conference on Reverse Engineering, IEEE Computer Society Press. [27] Lano K.C. & Breuer P.T. (1993) Reverse engineering Cobol via formal methods. In H.J. van Zuylen, editor, The REDO Compendium, John Wiley & Sons, chapter 16. [28] Lavrac N. & Dzeroski S. (1994) Inductive Logic Programming: Techniques and Applications, Ellis Horwood. [29] Martin R. (1995) OO design quality metrics: An analysis of dependencies. In ROAD, Sep/Oct, http://www.oma.com/Publications/publications.html. [30] Maughan G. & Avotins J. (1995) A meta-model for object-oriented reengineering and metrics collection. Technical Report TR95-34, Department of Software Development, Monash University. [31] Menzies T.
(1996) Patterns in inference strategies: Lessons and potential pitfalls for reuse design patterns. Technical Report TR96-9, Updated August 1996, Department of Software Development, Monash University. [32] Meyer B. (1988) Object-Oriented Software Construction. Prentice Hall, UK. [33] Michie D. & Sammut C. (1995) Behavioural clones and cognitive skill models. In K. Furukawa, D. Michie & S. Muggleton, editors, Machine Intelligence 14, Clarendon Press, chapter 15. [34] Mowbray T.J. & Zahavi R. (1995) The Essential CORBA: Systems Integration Using Distributed Objects. John Wiley & Sons, USA. [35] Murphy G.C., Notkin D. & Lan E.S. (1995) An empirical study of static call graph extractors. Technical Report 95-08-01, Department of Computer Science & Engineering, University of Washington. [36] Newcomb P. & Kotik G. (1995) Reengineering procedural into object-oriented systems. In L. Wills, P. Newcomb & E. Chikofsky, editors, Second Working Conference on Reverse Engineering, IEEE Computer Society Press, Canada, p. 237-249. [37] Quilici A. & Chin D.N. (1995) A cooperative environment for reverse-engineering legacy software. In L. Wills, P. Newcomb & E. Chikofsky, editors, Second Working Conference on Reverse Engineering, IEEE Computer Society Press, Canada, p. 156-165. [38] Quinlan J.R. (1990) Learning logical definitions from relations. In Machine Learning, 5. [39] Reasoning Inc. (1997) http://www.reasoning.com, August. [40] Rochester J.B. & Douglas D.P. (1993) Reengineering existing systems. In R.S. Arnold, editor, Software Reengineering, IEEE Computer Society Press, California, p. 41-53. [41] Rugaber S., Stirewalt K. & Wills L.M. (1995) The interleaving problem in program understanding. In L. Wills, P. Newcomb & E. Chikofsky, editors, Second Working Conference on Reverse Engineering, IEEE Computer Society Press, Canada, p. 166-175. [42] Rugaber S., Stirewalt K. & Wills L.M. (1996) Understanding Interleaved Code. In Automated Software Engineering - Special Issue: Reverse Engineering, Kluwer Academic Publishers, June, 3, 1/2, p. 47-76. [43] Schwanke R.W. (1993) An intelligent tool for reengineering software modularity. In R.S. Arnold, editor, Software Reengineering, IEEE Computer Society Press, California, p. 265-275. [44] Sneed H.M. (1995a) Planning the Reengineering of Legacy Systems. IEEE Software, 12, 1, January 1995, p. 24-34. [45] Sneed H.M. (1995b) Reverse engineering as a bridge to CASE. In L. Wills, P. Newcomb & E. Chikofsky, editors, Second Working Conference on Reverse Engineering, IEEE Computer Society Press, Canada, p. 300-313. [46] Storey M.D. & Müller H.A. (1995) Manipulating and documenting software structures using SHriMP views. In G. Caldiera and K. Bennett, editors, International Conference on Software Maintenance, IEEE Computer Society Press, p. 275-284. [47] Tausend B. & Bell S. (1992) Analogical reasoning for logic programming. In S. Muggleton, editor, Inductive Logic Programming, Academic Press, chapter 19, p. 397-408. [48] van Zuylen H.J. (1993a) Understanding in reverse engineering. In H.J. van Zuylen, editor, The REDO Compendium, John Wiley & Sons, chapter 6. [49] van den Brand M.G.J., Klint P. & Verhoef C. (1997) Reverse Engineering and System Renovation - An Annotated Bibliography. In ACM SIGSOFT Software Engineering Notes, ACM Press, 22, 1, January, p. 57-68. [50] van Zuylen H.J. (ed.) (1993b) The REDO Compendium, John Wiley & Sons. [51] Vosselman G. (1992) Relational Matching. Springer-Verlag, Germany. [52] Weiser M. (1984) Program Slicing. In IEEE Transactions on Software Engineering, IEEE Computer Society, July, SE-10, p. 352-357. [53] Wells C.H., Brand R. & Markosian L. (1995) Customized tools for software quality assurance and reengineering. In L. Wills, P. Newcomb & E. Chikofsky, editors, Second Working Conference on Reverse Engineering, IEEE Computer Society Press, Canada, p. 71-77. [54] Wills L.M. & Cross J.H.
(1996) Recent Trends and Open Issues in Reverse Engineering. In Automated Software Engineering - Special Issue: Reverse Engineering, Kluwer Academic Publishers, June, 3, 1/2, p. 165-172. [55] Yeh A.S., Harris D.R. & Reubenstein H.B. (1995) Recovering abstract data types and object instances from a conventional procedural language. In L. Wills, P. Newcomb & E. Chikofsky, editors, Second Working Conference on Reverse Engineering, IEEE Computer Society Press, Canada, p. 227-236. [56] Yourdon E. (1989) Modern Structured Analysis. Prentice-Hall, Englewood Cliffs, N.J.

Consciousness in Science and Philosophy 1998—"Charleston 1"-Abstracts
Edited by Anton P. Železnikar and Mitja Peruš
November 6-7, 1998
Eastern Illinois University, Department of Mathematics, Charleston, IL

1 Keynote Lectures

1 Anton P. Železnikar. Informational consciousness in philosophy, science, and formalism. Volaričeva u. 8, SI-1111 Ljubljana, Slovenia.

How does consciousness as informational conceptualism reflect in philosophy, science, and the most pretentious symbolic, abstract, and powerful formalism? This paper gives an overview of informational supervenience and of the most relevant philosophical, logical, and formalistic concepts concerning consciousness as an object not yet scientifically well determined nor philosophically sufficiently recognized. The paper follows some principles of established philosophism, scientism, and formalism as a developing form, approach and methodology of consciousness informationalism, that is, informational phenomenalism [4,5,6,7]. The following structured keyword list best represents the perplexed investigation possibilities, joining what has been thought and researched up to now in a new field of investigation, called informational consciousness:
— attention, consciousness, experience, intention, memory, mind, motivation
— Being (Sein), eidetics, entity (Seiende), essence (Wesen), the informational (i.)
— i. conceptualism, pure phenomenology, science, phenomenalism, i. formalism
— mentalism, consciousnessism, beingism, essentialism, existentialism, informationalism
— i. spontaneism (a metaphor for i. emergentism including change and decay), i. circularism, i. decompositionalism (initial reduction, i. epoché) and i. emergingism—being axiomatic, deductive, inductive, abductive, and inferential in a serial, parallel, and circular way—constituting the new emerging formal discipline called informational mathematics
— causalism, serial and parallel i. decomposition, circular causal informing (loops), perplexed serial, parallel, and circular spontaneous i. web causality, and
— i. formula, i. formula system, i. graph theory, i. topology and i.
geometry.

A logical objective scheme (in contrast to the serial one proposed by Chalmers [1]) of the general supervenience concerning the physical (hard conscious), the informational, and the phenomenal (soft conscious) seems to function bicircularly (circularly and bidirectionally), that is,

physical ⇄ informational ⇄ phenomenal

What has to be carefully analyzed is the single transition [3] of the form α ⊨ β (an informational communication from α to β). The operator ⊨ has to be understood as a consequence of two intersecting informational frames(1) [2], that is, α ⊨ β. A reflection onto the previously discussed supervenience would deliver the corresponding scheme. In this scheme, α represents a physical something and β its phenomenal equivalent (percept). That which physically gives α, as something's informing, is α ⊨; the informational coupling to the percept β is ⊨ β. Both the physical α, in the form α ⊨, and the phenomenal β, in the form ⊨ β, enclose the informational operator ⊨.

A conceptual comparison of informational and philosophical terms delivers the following informational—philosophical pairs: informational (i.)—being; informing—Being; i. entity—Being-there; metaphysicalistic (m.)—self-subsistent; m. entity—Self; i. inclusivism—Being-in; circular—Being-in-itself; phenomenalistic informing—Being-in-the-world; spontaneous informing—Being-as-it-is; i. transition—Being-with-one-another (communication); i. formula—proposition; operand—predicate; operator—predicate junction; i. functionalism—Being-of; decomposition—deconstruction; etc.

The paper brings informational conceptualism, technicalities, and formalism concerning operands, operators, transition, axioms, formulas, formula systems, serial and circular formulas, inclusivism, functionalism; serial, circular, and parallel decomposition; i. graphs, the operand rotation principle; basic, standardized, general, and two-way metaphysicalistic decomposition; i. consciousness [4, 5, 7], metaphysicalistic decomposition of consciousness, i. emergentism, i. topology, i. investigation [6], one-way and multiple communication, and exclusive references to the theory and formalism.

(1) Metaphorically, the informational frame means an arbitrary something, within an informational formula, belonging together. How frames can be built up is shown transparently in [2,3].

In the paper the most important items concerning last-century philosophy and the informational approach, especially in a new formalistic form and meaning, are critically and cross-referentially discussed in reasonable detail. For instance, informational metaphysicalism [2, 4, 5] is founded logically and superveniently [1]: logically in the sense of consciousness functioning, and superveniently from the interrelatedness of the physical, informational, and phenomenal points of view. The following informational graph shows the metaphysicalistic correlates:

[Informational graph of the metaphysicalistic correlates, with nodes labelled: entity α, informing, counterinforming, i. embedding, thing, physicalism, informationalism, phenomenalism, process, stability, observing, mind, phenomenon, perseverance, mediation, consciousness.]

This case shows a broad applicability of the metaphysicalistic approach in different disciplines (philosophy, psychology, modeling of consciousness [4, 5, 7], etc.).

References [1] Chalmers, D.J. 1996. The Conscious Mind. In Search of the Fundamental Theory. Oxford University Press. Oxford, U.K. [2] Železnikar, A.P. 1996. Informational frames and gestalts. Informatica 20:65-94. [3] Železnikar, A.P. 1996. Informational transition of the form α ⊨ β and its decomposition.
Informatica 20:331-358. [4] Železnikar, A.P. 1997. Conceptualism of Consciousness. Informatica 21:333-344. [5] Železnikar, A.P. 1997. Informational Theory of Consciousness. Informatica 21:345-368. [6] Železnikar, A.P. 1997. Informationelle Untersuchungen. Grundlagenstudien aus Kybernetik und Geisteswissenschaft/Humankybernetik 38:147-158. [7] Železnikar, A.P. 1997. Informational Consciousness. Cybernetica 40:261-296.

James H. Fetzer. Consciousness and cognition: Semiotic conceptions. Department of Philosophy, University of Minnesota, Duluth, MN 55812.

Inference to the best explanation involves selecting an hypothesis or theory from among a set of alternatives as the hypothesis or theory that provides the best explanation for the available evidence. Alternatives that explain more of the evidence are preferable to those that explain less, while those that are incompatible with the evidence are rejected. With respect to consciousness and cognition, considerations adduced within this study suggest that the computational conception confronts insuperable problems, which stem from an inadequate understanding of the nature of psychophysical laws and from static and dynamic differences between the modes of operation of digital machines and of thinking things. The theory of minds as semiotic systems explains more of the available evidence and thus provides a preferable theory. The most sophisticated version of the computational conception bearing directly upon these issues appears to be that of David Chalmers, The Conscious Mind. Systems stand in relations of simulation when they produce the same outputs from the same inputs. Systems stand in relations of replication when they both produce the same outputs from the same inputs and do so by means of the same processes or procedures. According to Chalmers, systems that satisfy the condition of organizational invariance must stand in relations of replication, necessarily, and also possess the same associated psychophysical properties. Chalmers, however, overlooks the consideration that at least some of those psychophysical properties may depend upon the stuff of which such systems happen to be made. His conclusion, therefore, does not follow. The author attempts to clarify the nature of psychophysical laws and to overcome the obstacles that confront computational conceptions on the basis of the theory of minds as semiotic (or sign-using) systems, which implies the existence of at least three kinds of minds. These appear to correspond to levels of mentality that can be found distributed throughout nature, where species that are lower on the evolutionary scale tend to have lower grades of mentality, while those higher on the scale tend to have higher grades. Semiotic accounts of consciousness and of cognition provide a framework for solving the mind/body problem and for coping with the problem of other minds, building on the author's earlier work, including "Signs and Minds", Philosophy and Cognitive Science, "Mental Algorithms", and "Thinking and Computing".

References [1] Chalmers, D. 1996. The Conscious Mind. Oxford University Press. NY. [2] Fetzer, J.H. 1988. Signs and minds. In J. Fetzer, Ed. Aspects of Artificial Intelligence. 133-161. Kluwer Academic Publishers. Dordrecht. The Netherlands. [3] Fetzer, J.H. 1994. Mental algorithms: Are minds computational systems? Pragmatics & Cognition 2:1-29. [4] Fetzer, J.H. 1996. Philosophy and Cognitive Science. 2nd Ed. Paragon House. Minneapolis, MN. [5] Fetzer, J.H. 1997.
Thinking and computing: Computers as special kinds of signs. Minds and Machines 7:345-364.

2 Cybernetic Concepts of Consciousness

Jerry L. Chandler. On the evolutionary roots of consciousness. Krasnow Institute for Advanced Study, George Mason University, 837 Canal Drive, McLean, VA 22102, U.S.A.

Awareness will be taken as a root of consciousness. A definition (Shorter Oxford, 1993) is "responsive to conditions". Here, we presuppose that continuous awareness follows from the dynamics of a historical organization of structures and initial conditions. In earlier writings, the C* hypothesis was described in terms of third-order cybernetic systems with a minimum of three degrees of organization. Here, I presuppose the logical structure and organization of C* as described elsewhere (Chandler, 1991, 1992, 1996, 1997, 1998; Chandler, Ehresmann and Vanbremeersch, 1996). Any system is considered as a mathematical object (category) within a degree of organization, O1, O2, O3, .... Pre-pre-primitive responsiveness (PPPR) in simple systems (O1, O2, O3) includes responses to conditions. Awareness is locally defined at the atomic scale, in terms of closure, conformation, concatenation and cyclicity. Such systems follow the second and third principles of thermodynamics, and functions can be described in terms of energy and work. Locally, one form of awareness is spectra, systematic responses to electromagnetic energy and fields (in terms of quantum rules). Equilibrium behaviors are intrinsic to quantum descriptions. System composition may vary by gain and loss of parts. Parts (components) may be either symmetric with respect to the whole or not. Pre-primitive responsiveness (PPR) in biopolymers is slightly more sophisticated than that of PPPR. Responses to changes in initial conditions can greatly accelerate or decelerate the flow of matter and energy. Collaboration (cooperative work) among molecular systems (i.e., O1, O2, O3 and O4) changes the response times but not the thermodynamic equilibrium values. Primitive responsiveness (PR) in a simple (prokaryotic) cell responds to external conditions in an intentional pattern with growth, movement and creative changes in material composition. Awareness of the chemical and physical ecoment is evaluated by cyclic communication channels, and decisions are locally implemented by changes in behavior. Awareness of the ecoment (sources of nutrients) binds primitive awareness to the conservation laws and the "web of life". Exponential growth creates a congentive thermodynamic system, which accumulates organized matter and energy via biopolymeric operations of material addition and multiplicative dynamic influences. Local behaviors of PPPR, PPR, and PR follow conservation rules. Sensory awareness (SA) in higher organisms responds to external stimuli concomitantly. Categories of PPPR, PPR, and PR may be integrated to generate an evaluation of the intensity, coherence and value of external events. Decisions within the pre-responsiveness systems select from a range of generated options. Responses (with respect to specific stimuli) may be (chaotically) amplified or suppressed in intensity, duration, and scale. Pre-responsiveness systems may co-ordinate implementation of responses over substantial durations and substantial spatial domains. Preexisting organized structures and stores of energy control the internal dynamics of response. Non-human mammalian awareness is continuously composed from electromagnetic, physical, chemical and biological sensory systems.
Categories of PPPR, PPR, and PR are integrated via sequences of PRS after evaluation of the intensity and value of external events. Internal memory allows historical record keeping and context-specific access. Following central decisions, responses are staged in terms of the short-, intermediate- and long-range needs of the organism. External communications are planned and executed. (I am grateful to Professor Andree Ehresmann, whose category-theory based "memory evolutive systems (MES)" have strongly influenced the development of these ideas.)

Bruce MacLennan. The protophenomenal structure of consciousness, with especial application to the experience of color. Computer Science Department, University of Tennessee, Knoxville, TN 37996-1301, U.S.A.

Introduction. This paper addresses the principal problem of consciousness, which is to reconcile our experience of subjective awareness with the scientific world view; it is essentially the same as Chalmers's "Hard Problem." This problem arises because subjective experience has a special epistemological status, since it is the personal (and private) substratum of all observation, whereas empirical science is typically based on common (nonpersonal, public) particular observations. Nevertheless, although subjective experience cannot be reduced to physical observables, we may have parallel phenomenological and physical reductions, which inform each other. However, naive introspection is treacherous since it may be unduly influenced by theoretical preconceptions, but phenomenological training aids unbiased (or less biased) analysis of the structure of consciousness. Through phenomenologically trained observers we may acquire unbiased (public) data about the structure of consciousness. (We use "phenomenology" and related terms in the sense of Husserl and Heidegger, that is, to refer to the analysis of the phenomena, the given (data) of conscious experience.)

Protophenomena. Such an analysis has led us to postulate protophenomena as the elementary constituents of phenomena [1, 2]. Each has the property of elementary (irreducible) subjectivity. Very simple examples of protophenomena include the experience of a spot of color at a particular location in the visual field and the feeling of pressure at a particular location on the skin. However, there are much more complex and subtle protophenomena, including elementary components of recognitions, judgments, expectations, intentions, moods and so forth. Further, protophenomena are very "small," in the sense that changes in the activity of individual protophenomena will not typically affect the macroscopic phenomenal state; nevertheless the state of consciousness is no more than the sum total of the states of all the protophenomena. Protophenomena are postulated to be associated with activity sites in the brain, the "activity" (degree of presence in consciousness) of a protophenomenon corresponding to some physical variable at that site. (Protophenomena and their activity sites need not be discrete, but that seems the most likely possibility at this time.) There are a number of candidates for the activity sites, but their identity remains an open question. Some of the possibilities include synapses, neural somata and dendritic microtubules, but their exact identity is not crucial for the theory of protophenomena. What is the ontological status of protophenomena; do they exist? It is best for now to treat protophenomena as "theoretical entities," analogous to atoms when they were first hypothesized.
Theoretical entities are validated by the role they hold in the theory and by their fruitfulness for scientific progress. Ultimately, we may find that protophenomena exist individually (in the same way that atoms were found to exist), e.g. as properties of individual activity sites, or we may find that protophenomena exist only in the context of large numbers of activity sites, and thus that they are emergent properties, analogous to emergent physical properties. For now this is an open question. Causal dependencies among activity sites suggest how protophenomena are integrated into a phenomenal world. Just as physical processes in an activity site depend on physical processes in other activity sites, as well as on extrinsic processes (e.g. in sensory neurons), so the activity of a protophenomenon depends on the activities of other protophenomena, as well as on variables that are not directly dependent on protophenomenal activity (i.e., variables associated with the external world). The dynamics of protophenomenal activity can be described by differential equations. In many cases the dependencies (the equations) are approximately linear, and protophenomenal activity can be described in terms of a characteristic function (often known as an "impulse response"). Protophenomenal dependencies establish connections among protophenomena and thereby assemble them into a phenomenal world. One way they do this is by establishing continuity through expectations. Another way is by means of conjunctive dependencies and by more complex temporal dependencies. As a result a phenomenal world may be described by a set of possible trajectories in protophenomenal state space. In summary, the fact of phenomenal experience corresponds to a protophenomenon's activity, since that activity represents its degree of presence in conscious experience; the quality of conscious experience corresponds to the protophenomenon's dependencies, which relate it to other protophenomena.

Color and Spectral Inversions. As an example of the protophenomenal approach, we can consider the well-known problem of a spectral inversion. In brief, the problem is as follows: Although we agree on the names for various wavelengths, is it possible that you experience red wavelengths the same way I experience blue wavelengths, and vice versa? Before we can solve this problem we need a more accurate phenomenology of color. The plausibility of a spectral inversion derives in part from an oversimplified phenomenology of color, since we have imagined that color can be reduced to a single dimension (wavelength), but a phenomenological analysis shows it to be much more complex (see [3] and the references cited therein). Setting aside many of the higher-level complexities of color (e.g. its emotional and cultural connotations), yet avoiding the trap of a one-dimensional view, we can observe that it has long been known that we can identify four pure hues, which are termed the "unique hues," an observation that has led to the double-opponent theory of color vision. In this theory the three color receptors (short, medium and long wavelength, henceforth S, M and L) are combined in various ways to yield three orthogonal axes. The light-dark axis is formed by S + M + L and its opposite; the yellow-blue axis is formed by M + L - S and its opposite; the red-green axis is formed by S + L - M and its opposite (here we use a common form of the theory). The two zeroes on each of the two chromatic axes (yellow-blue and red-green) define the four unique hues.
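As a concrete illustration of the opponent combination just described, the following short Python sketch (my own illustrative example; the receptor values and the equal-weight channel sums are assumptions, not part of the source) computes the three axes from S, M and L responses; in the theory, a unique hue corresponds to a zero on one of the two chromatic channels.

    # Double-opponent channels from S, M, L receptor responses (illustrative only).
    def opponent_channels(s, m, l):
        light_dark = s + m + l      # achromatic axis
        yellow_blue = m + l - s     # positive toward yellow, negative toward blue
        red_green = s + l - m       # positive toward red, negative toward green
        return light_dark, yellow_blue, red_green

    samples = {
        "on the yellow-blue axis (red-green = 0)": (0.2, 0.6, 0.4),  # s + l == m
        "on the red-green axis (yellow-blue = 0)": (0.9, 0.5, 0.4),  # m + l == s
    }
    for label, (s, m, l) in samples.items():
        print(label, opponent_channels(s, m, l))

The same channel definitions can be used to examine the overlap argument made below: whichever chromatic combination overlaps most with S + M + L would play the role of phenomenal yellow.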
The problem of a spectral inversion can be recast in terms of inversions between the poles on one or more of these axes, or in terms of exchanges between two or more of the axes. However, we will show by phenomenological analysis that such spectral inversions are impossible, that is, that abnormal neurological connections would lead to abnormalities in conscious experiences that could be detected by the subject. Here the arguments will be summarized briefly. First, it is fairly obvious that dark and light have phenomenologically distinct characters, and hence are noninterchangeable: in the dark, forms and hues are indistinguishable, but not in the light. Second, phenomenological analyses of color from ancient times to our own have observed that yellow is intrinsically brighter than blue (the neurophysiological reason being the large overlap between S + M + L and M + L - S). Hence, blue and yellow are phenomenologically similar to dark and light, and hence noninterchangeable. Therefore, in a case of abnormal vision, whatever receptor combination has the largest overlap with S + M + L will be experienced as phenomenal-yellow, and if this does not correspond to spectral-yellow then the anomaly will be detectable. The case of a red-green inversion is more subtle, but phenomenological analysis again exposes a difference. For example, Goethe observed that green is a phenomenological mixture of yellow and blue, whereas red results from an "augmentation" of yellow and blue. Further, the experience of "unique red" is nonspectral; that is, it cannot be created by monochromatic light, whereas experience of the other three unique hues (including green) can. (The well-known studies of Berlin and Kay also support the phenomenal differences between red and green.) Finally, the red-green axis cannot be exchanged with the yellow-blue, because the former is less similar to light-dark than the latter. This phenomenological fact, which has been recognized since ancient times, is a consequence of S + L - M ("red") having a smaller overlap with S + M + L ("light") than does M + L - S ("yellow"). As a result of this neurophenomenological analysis, we can begin to understand the topology of color. First we have the three axes, which define three polar oppositions: light-dark, yellow-blue, red-green. Superimposed on this structure are relations of similarity: yellow is most similar to light, and blue is most similar to dark. Green is most similar to yellow and blue and is intermediate in its similarity to light and dark. Red is similar to yellow, but not to blue. These conclusions are objective in that they result from observations made independently by many phenomenologists over the centuries. Finally, we will consider several more examples of abnormal or nonhuman color perception. For example, if we have S + M - L instead of M + L - S in the yellow-blue channel, then spectral blue-greens will be experienced as yellows, and spectrally orange light will be experienced as green. On the other hand, if we have S + M - L and M + L - S (two asymmetric channels) for the chromatic channels, then color phenomenology will have several detectable anomalies: there will be two spectral unique hues (as opposed to three) and one whole phenomenal color quadrant (purple) will be nonspectral (unexperienceable with monochromatic light). Many other neural anomalies can be hypothesized.
However, if a sensory system is too different from our own, we may be neurologically unable to imagine the experience, although we can describe its topology. Since imaginal areas have parallel structures to perceptual areas, we have limited ability to imagine qualia that are distinctly different from what we can perceive. Conclusions. The protophenomenal perspective has several benefits. First, it allows the fact of conscious experience to be integrated into scientific theory without denying or distorting the nature of that experience. Second, it permits a form of reduction of the more complex to the simpler while acknowledging the complexity of phenomena and avoiding naive introspectionism. Third, it permits a detailed account of the structure of conscious experience. Of course, many open questions remain. For example: What are the activity sites and what sorts of physical systems can be activity sites? (This has implications for nonbiological consciousness.) What distinguishes conscious from nonconscious neural activity? Are protophenomena emergent? (This has implications for degrees of consciousness.) Are protophenomena qualitatively exhausted by their mutual dependencies (structuralism)? What can we say about the boundaries and unity of consciousness? Finally, much detailed neurophenomenological work remains to be done before we will understand the detailed structure of consciousness. References [1] MacLennan, B.J. 1995. The investigation of consciousness through phenomenology and neuroscience. In J. King & K. H. Pribram, Eds. Scale in Conscious Experience: Is the Brain Too Important to be Left to Specialists to Study? Lawrence Erlbaum. Hillsdale. [2] MacLennan, B.J. 1996. The elements of consciousness and their neurodynamical correlates. Journal of Consciousness Studies 3:409-424. [3] MacLennan, B.J. 1998. Finding order in our world: The . primacy of the concrete in neural representations and the role of invariance in substance reidentification. Behavioral and Brain Sciences 21:78-79. 5 Ben Weems. The genesis of analogical and logical operations in sensory-motor organisation. McGill University. The paper deals with the genesis of two basic modes of organisation: call them metaphor and proto-logic; not yet logical operations, only the basis on which they will be formed: call them Both/And and Either/Or, for the moment. These two basic modes of order, when reflected at another level of organisation, after the development of logical operations, will appear as logical disjunction and logical conjunction, but I wish to examine their roots in earlier pre-logical processes. Metaphor can be described as the process of gradually differentiating the common ground AB, from its initially wholly implicit existence, in the Venn diagram of logical disjunction, to its eventual explicit distinct existence in the Venn diagram of logical conjunction. But those two logical operations themselves do not allow for such a process. How do the two primitive modes of order operate in practice, and how do they bring about the possibility of logical operations? It substantially follows Piaget's description of this process in the child, but refers it to cultural history for additional data, as Piaget repeatedly called for. The intent of the paper is to adduce a body of data that is not normally used in such discussions at all; that of intellectual history. The reasons why those data are not used are plain to see: on the one hand, the truth is what is true in all times and places, else what could we mean by truth? 
On the other hand, the history of ideas means entering into a world in which the truth is a function of the terms which are current in any given period. I will suggest that room must be found for both of these requirements; for the implications of logic, and for the history of ideas, which evolves ultimately, by the "logic" of metaphor. Examples of that metaphoric process: at most trivial level, the etymological history of the word "astonish", from a live metaphor to a dead abstraction, and then reduplicated in "thunderstruck". At a serious level, the evolution of the fate of Newton's Absolute Time and Space, and their replacement in Special Relativity. Sources of the common process in Pre-linguistic organisation, à la Piaget, and his description of the reiteration of that sensorimotor process on itself twice, at the level of symbolic organisation, creating the two symbolic modes, metaphor and something else, that will become logic. The result of this first process is to differentiate the subject from the outer world against some further body of neural organisation that is invariant in the two sets of contexts, outer world and the body image of the subject. Symbolic order is the following: a matter of giving progressively more distinct form to elements of that common ground against which subject and world are differentiated; two forms or stages of this process; examples of these two processes, at different levels of generality, from different societies in the past will be given; the way specific symbolic forms mediate the shift from one overall paradigm of organisation to another at the last level of generality in our own culture, and the degree to which the relation of metaphor underlies and must always underlie, those of logic, as for example, in the relation between the equation, and the phenomena which it has been used to illuminate; and the relations between the emergent further common ground and its two contexts, subject and world, are always changing, and only an analysis of their historical course can produce a picture of the process. But that body of further invariant neural organisation can never be anything but an attribute of the body image of the subject, and an aspect of the terms in which that subject orders its experience. The underlying paradox of all symbolic order, then, is that to become invariant in all contexts, and so an objective fact of the world, invariant in all coordinate systems, for all observers, is to become an attribute of the body image of the subject, as in the case of Newton's Absolute Time and Space. The first considers the emergent common ground as the potentially material and factual common ground of subject and object, the process of its gradual ever more explicit embodiment, that of metaphor. The limit of that process is to discover that the final invariant in all man's experience is himself, man, the entire-sentient creature, and so there is no more common ground of man and world, and the basic paradigm of Both/And, of the first mode of order, is now transformed into that of Either/Or, mere subjective impression, or the world in itself. This paradigm of Either/Or, of the excluded middle, and so of the eternal choice between appearance and reality, recurs at every instant. 
It is also the birth of two mutually exclusive modes of order: that of the live unit of input, or change of state, and that of the unit which has become the unit of code, which will in various incarnations give rise to the distinctions of matter and motion, objects and events, nouns and verbs, among others. It is the order of logical operations. The limit of the second process is to arrive at a new final invariant in all our experience, from all points of view, for all observers, and so absolutely objective, at just the point where we discover that it is not a last fact of the world, but of the way men organise their experience of the world, and so has become a new symbolic form, in that it is at once all we know of the world, yet is at the same time an attribute of the subject, and so can be identical with neither. The two-part process then starts over again at a new level of generality.

3 Physical Background of Consciousness

6 Menas Kafatos. Foundational principles and consciousness. Center for Earth Observing and Space Research & Institute for Computational Sciences and Informatics, MSN 5C3, Science and Tech I, Room 301, George Mason University, Fairfax, VA 22030-4444, U.S.A.

An axiomatic approach is proposed for addressing the problem of consciousness. If indeed the underlying reality of the universe is a background of consciousness out of which all objective processes arise [1, 2], attempting to define this background in terms of its derivable theoretical physical constructs is futile. As such, the singular subjective experience, the so-called "hard problem", might not be describable in terms of mathematical algorithms. Yes, we know that the physical universe is quite successfully describable in terms of theories which are mathematical in nature, i.e. mathematics is the best language for our views of the universe. Perhaps one should approach the problem of consciousness by searching for a "pre-mathematical" background out of which all mathematics arises. In other words, another approach might be to look at foundational principles which might be informative of properties of consciousness as we know them. These principles should still be linked to mathematics but in a much more general, qualitative sense. They can be considered similar to the axioms in specific geometries, except in this case the principles are much more general, and in analogy the "theorems" are actually complete theoretical world constructs (such as the quantum theory, etc.). We have identified several such principles: Simplicity; Self-organization; Complexity; Complementarity; Non-locality; and Union/Wholeness. This list is certainly not complete, and one could even come up with different constructs and certainly different names. For example, Simplicity could also be termed Reductionism; Union/Wholeness is a more scientific term for Love, the core of all subjective experiences; etc. One requirement of the proposed approach is to come up with foundational principles that apply across many different domains (such as physical theories, e.g. principles that apply equally well to quantum theory, classical theory, etc.) and at the same time are connected to both the subjective and objective experiences. This approach has to be more qualitative and certainly more subjective than the usual deductive methodology of science, and yet it should be tied to science in a most fundamental way.
We suspect that the proposed approach might constitute a new way of moving the scientific framework forward and of placing this framework within the conscious experience, rather than the other way around (the latter is to date the usual approach and might just be a totally wrong approach).

References [1] Kafatos, M. & R. Nadeau. 1990. The conscious universe: Part and Whole in modern physical theory. Springer-Verlag. New York. [2] Draganescu, M. 1997. On the structural-phenomenological theories of consciousness. Noetic Journal 1:28-33.

7 Evan Harris Walker. Consciousness as a factor in the modified Schrödinger equation. Walker Cancer Research Institute Inc., Aberdeen, Maryland 21001, U.S.A.

The problem of consciousness involves three major issues: (1) philosophical problems, (2) mind-brain interactions, and (3) the relationship of consciousness to the basic equations of physics. Each of these entails difficulties unlike those encountered elsewhere in efforts to understand nature. Each of these areas has been dealt with in the quantum mechanical theory of consciousness developed by the present writer. The philosophical problems include the monism/duality and the binding/identity problems. Monism/duality is resolved by limiting the definition of "physical reality" to things physically measurable as defined by physics; binding physical events so as to produce consciousness identity is resolved by appeal to quantum theory. The mind-brain interaction is treated as arising from QM tunneling—at and between synapses. Tunneling lets us treat the mechanisms of consciousness quantitatively and in detail, so that comparison between experimental data and the characteristics of consciousness can be made. Finally, determining the relationship of consciousness to the basic equations of physics is achieved by introducing a modification to the Schrödinger equation. The modified Schrödinger equation (MSE) results from the addition of an information measure term, Ψ ln Ψ, that doubles as a dispersion measure. The MSE allows for automatic state vector collapse on completion of measurement loops. "Measurement loops" refers to interactions of the Einstein-Podolsky-Rosen type, wherein resolution of the EPR paradox entails measurement completion in both parts of the interacting system, and a comparison of the measurement results. The ln term then takes the form of the logarithm of a real quantity, forcing the system into automatic collapse—and the term vanishes. Introducing a coefficient matrix G, having 0's off the diagonal, -1 on the diagonal for the selected term, and 1's otherwise, represents state selection. The minimum G matrix is 4 x 4, and of the form of the Dirac gamma matrices in the uncollapsed MSE. Thus, consciousness may underlie the space-time continuum.

8 John S. Hagelin. Is consciousness the unified field? A field theorist's perspective. Department of Physics & Institute of Science, Technology and Public Policy, Maharishi University of Management, 1000 North 4th Street, Fairfield, IA 52557, U.S.A.

Advances in theoretical physics during the past decade have led to a progressively more unified understanding of the laws of nature, culminating in the recent discovery of completely unified field theories [1-6]. The parallel discovery of a unified field of consciousness raises fundamental questions concerning the relationship between the two.
Following a general introduction to unified quantum field theories, we consider the proposal due to Maharishi Mahesh Yogi that the unified field of modern theoretical physics and the field of "pure consciousness" are identical. We show that the proposed identity between consciousness and the unified field is consistent with all known physical principles, but requires an expanded physical framework for the understanding of consciousness [7,8]. Such a framework may indeed be necessary to account for experimentally observed field effects of consciousness and phenomenological aspects of higher states of consciousness. Finally, we will consider the important question of a "technology" of the unified field—i.e., a practical means of applying this most advanced and fundamental knowledge of natural law and consciousness for the benefit of mankind [9].

References [1] Hagelin, J.S. 1989. Superstring Grand Unification, MIU Preprint 89/41. Proceedings of 1989 Superstring Workshop. Texas A&M. [2] Antoniadis, I., J. Ellis, J.S. Hagelin & D.V. Nanopoulos. 1989. The Flipped SU(5) x U(1) String Model Revamped. Phys Lett 231B:65-74. [3] Hagelin, J.S. 1991. Recent Progress in Realistic 4-D String Model Building. Proceedings of the 1991 PASCOS Conference. Boston, MA, U.S.A. [4] Hagelin, J.S., S. Kelly & T. Tanaka. 1993. Exact Supersymmetric Amplitude for K°-K° and Mixing. Mod Phys Lett A8:2737-2745. [5] Hagelin, J.S., S. Kelly & T. Tanaka. 1994. Supersymmetric Flavor Changing Neutral Currents: Exact Amplitudes and Phenomenological Analysis. Nucl Phys B415:293-331. [6] Connors, L., A.J. Deans & J.S. Hagelin. 1993. Supersymmetry Mechanism for Naturally Small Density Perturbations and Baryogenesis. Phys Rev Lett 71:4291-4294. [7] Hagelin, J.S. 1987. Is Consciousness the Unified Field? A Field Theorist's Perspective. Modern Science and Vedic Science 1:28-87. [8] Hagelin, J.S. 1989. Restructuring Physics from Its Foundation in Light of Maharishi's Vedic Science. Modern Science and Vedic Science 3:3-72. [9] Hagelin, J.S., M.V. Rainforth, D.W. Orme-Johnson, K.L. Cavanaugh & C.N. Alexander. Effects of Group Practice of the Transcendental Meditation Program on Preventing Violent Crime in Washington, DC: Results of the National Demonstration Project, June-July 1993. Journal of Offender Rehabilitation. In press.

9 Stephen Thaler. The fragmentation of the universe and the devolution of consciousness. Imagination Engines, Inc., St. Louis, MO 63146-4331.

In a previously published paper (Thaler, 1995), I have described how any trained neural network, whether artificial or biological, preferentially interprets spontaneous fluctuations within its synaptic connections and processing units as either learned or confabulated memories. Thus, the sustained random and internal perturbation of such a network provides a compelling, all-neural model for how cortical networks spontaneously manufacture a succession of thoughts and ideas, loosely coupled to perceived external stimuli, to produce the stream of consciousness. Realizing that the inorganic universe may likewise be viewed as a connectionist system, bound together by various physical forces and evolving into distinct dynamical states in response to internal chaos, the stream of consciousness may be viewed as a simplistic simulation of the perceivable cosmos, implemented through electrochemical connectivity. Biological organisms utilize this neurological emulation of the external world to advantageously anticipate potential environmental scenarios impacting their existence.
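The perturbation mechanism described in the preceding paragraph can be illustrated with a toy model. The following Python sketch is my own minimal illustration (the Hopfield-style associator, the noise level and the pattern set are assumptions, not Thaler's implementation): a small auto-associative network is trained on a few binary patterns, its weights are then repeatedly perturbed with internal noise, and each relaxed output is counted either as a stored memory or as a confabulation.

    # Toy illustration: random internal perturbation of a trained associator
    # yields a mixture of learned and confabulated output patterns.
    import numpy as np

    rng = np.random.default_rng(0)
    patterns = np.array([[1, -1, 1, -1, 1, -1, 1, -1],
                         [1, 1, 1, 1, -1, -1, -1, -1],
                         [1, -1, -1, 1, 1, -1, -1, 1]])
    n = patterns.shape[1]
    W = sum(np.outer(p, p) for p in patterns) / n   # Hebbian outer-product learning
    np.fill_diagonal(W, 0)

    def relax(weights, state, steps=20):
        # A fixed number of synchronous sign updates.
        for _ in range(steps):
            state = np.sign(weights @ state)
            state[state == 0] = 1
        return state

    memories = confabulations = 0
    for _ in range(200):
        noisy_W = W + rng.normal(0.0, 0.3, size=W.shape)   # internal perturbation
        out = relax(noisy_W, rng.choice([-1, 1], size=n).astype(float))
        if any(np.array_equal(out, s * p) for p in patterns for s in (1, -1)):
            memories += 1
        else:
            confabulations += 1
    print(memories, "memory-like outputs,", confabulations, "confabulations")

In this toy model, raising the noise level shifts the balance from memory-like outputs toward confabulations, which is the qualitative behaviour the abstract appeals to.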
Apart from any cultural mythologies that attempt to explain and glorify the emergence of human consciousness, we must inevitably ask why the physical world has compartmentalized into pieces that must anticipate the activity within other pieces using such inner simulations. To answer this question, we conduct a thought experiment in which some portion of the inorganic universe topologically pinches off from the whole, to produce a parallel-processing island surrounded by connection-sparse boundaries through which only sporadic and impoverished information exchange may occur. Supplied only scanty clues about the state of the external world, abundant internal chaos drives this region into any of a multiplicity of physical states for any given condition of the external universe. Therefore, in lieu of accurate knowledge about the universe at large, this insular region engages in 'bad guesses' as to the state of the external environment, in a process we commonly acknowledge and mistakenly celebrate as perception. With further spontaneous subdivision of this region by new semi-insulating partitions, the resulting submodules activate to lend interpretation to each other's activity. Now a conglomerate entity, this topologically distinct region may spontaneously ascribe significance to its own overall collective behavior, in a process that we have grown to recognize as biological consciousness. Considering such a fragmentation process ubiquitous and ongoing, similar regions detaching from the whole that do not ascribe merit to their own collective behavior take no action in preserving their topological boundaries, inevitably recombining with the external universe. Those that do accidentally interpret their inner activity as significant proliferate and develop additional survival apparatus. Ultimately, the world naturally segregates into the uniform inorganic background we call unconscious, and the highly delusional protoplasmic islands we glorify as conscious. At the core of this self-preserving illusion is the inherent capacity of neural networks to map any input space to any other. Here, generic, global activation turnover within one set of neural networks is mapped to activation patterns within other neural networks, to produce the qualia of being, self-awareness, and the general subjective feel of consciousness (Chalmers, 1996). Certainly, such a perspective suggests not only that biological consciousness is a degraded form of universal connectionism, but that the resulting intellect is highly suspect and largely based upon neurologically habituated metaphor (i.e., long-standing inner simulations that are inherently faulty or ambiguous). Therefore, apart from deducing the micro-features of cognition, the human quest for a science of consciousness is doomed, since the analogy machine, the brain, can at best produce an analogy for itself. Such a perspective would be consistent with the late Paul K. Feyerabend's philosophical view of science as a collection of useful myths (Horgan, 1993). Furthermore, the door is now opened to an acknowledgement that consciousness may not be a uniquely human or even organic phenomenon. For decades, this point of view has been advanced by such scientist-philosophers as Olaf Stapledon, Arthur C. Clarke, and Carl Sagan, through the conjecture of a "cosmic consciousness" (Ash, 1977). In this presentation, I offer a plausible model for how human consciousness has arisen by fragmentation and degradation of just such a prototypical master consciousness.
References [1] Chalmers, D.J. 1996. The conscious mind: in search of a fundamental theory. Oxford University Press. New York. [2] Horgan, J. 1993. The worst enemy of science. Scientific American 268:5:36-37. [3] Ash, B. 1977. The visual encyclopedia of science fiction. Harmony Books. New York. [4] Thaler, S.L. 1995. "Virtual Input Phenomena" within the death of a simple pattern associator. Neural Networks 8:1:55-65.

10 Richard L. Amoroso. The role of gravitation in the dynamics of consciousness. The International Noetic University, Physics Lab 610, 16th Street #506, Oakland, CA 94612-1229, U.S.A.

Both general relativity and quantum theory are known to be incomplete. Twistor and superstring theories are currently considered the most promising candidates for their nonlocal integration; but both lack a 'Rosetta stone' for delineating the unique topological package of higher-dimensional hyperstructure required to complete the task. Both eastern and western theologies claim that gravitation is caused by the movement of spirit—spirit, ki, chi, or prana being not immaterial but Bose or photon based. The spin exchange model of quantum gravity [1], incorporating the expanded Wheeler-Feynman absorber theory of radiation [2,3], putatively describes gravitons as superposed moments of confined nonlocal photons mediated by unitary field dynamics. The cosmological constant is the coupling constant between both domains; the zero-averaged fluctuation of the gravitational potential localizes and delocalizes the flow of conscious energy. The lower limit for the quantization of mind is a Planck-scale hypercavity where the gravitational potential may remain balanced when at rest. Gravitational mass dependency is not required by conscious entities for state evolution, as in the Hameroff-Penrose Orch-OR model, because spacetime curvature provides boundary conditions gating the energy flow of Bose psychons. No gravitational work is required; mental activity is frictionless at this level, but not at the higher organic species level. Radiation or light pressure is sufficient to modulate the boundary conditions. All levels of scale are proportional to the elemental Planck unit through the law of energy quantization. Degenerate energy from infinite-density singularities is not applicable to consciousness. Thus one may whimsically query 'How many einsteins (moles of photons) does it take to turn on a light bulb?', the bulb being a 'byte' of Planck bits pertinent to the conscious scale of the entity. Thus entity Z, with a 10" Planck byte raster of consciousness, has a 10" byte psychosphere and resolves factors of 10" bytes of external and internal mental events. Thought consists of dynamic moments of local quantization and summation of conscious energy. This is the cosmological root of consciousness. Nothing is said about the branches in quantum brain dynamics and nonlinear neurodynamics essential to higher brain function and the phenomenological inputs of awareness. An experimental protocol has been developed to empirically isolate and extracellularly confine the Bose psychon [4,5].

References [1] Amoroso, R.L., M. Kafatos & P. Ečimovič. 1998. In G. Hunter & S. Jeffers, Eds. Causality and Locality in Modern Physics. Kluwer Academic. In press. [2] Wheeler, J.A. & R.P. Feynman. 1945. Rev. of Modern Physics 17:157. [3] Cramer, J.G. 1986. Transactional interpretation of quantum mechanics. Rev. of Modern Physics 58:3:647-687. [4] Amoroso, R.L. 1996. Bioelectrochemistry & Bioenergetics 40:39-42. [5] Amoroso, R.L.
11 Richard L. Amoroso. The feasibility of constructing a conscious quantum computer. The International Noetic University, Physics Lab 610, 16th Street #506, Oakland, CA 94612-1229, U.S.A. & Cerebroscopic Systems, Inc., 48 Shattuck Square #148, Berkeley, CA 94704-1119.

Extensions of quantum theory (QT) and Einstein's general theory of relativity (GR) to a new cosmological model of a conscious universe suggest that a nonlocal conscious process (NLCP) is associated with the unified field. The NLCP is coupled to a nonlocal 'noetic space' that confines a fundamental singularity of individual noumenal consciousness. Singularities of elemental intelligence interact with the brain through a quantum of action autopoietically regulating the psychosphere (defined as the hyperstructure of an individual's consciousness). This model is called noetic field theory: the quantization of mind (NFT). Based on the putative concept of the brain as a naturally occurring quantum computer, and according to the tenets of NFT, the extracellular confinement of natural intelligence provides a viable cosmology for constructing a conscious quantum computer (CQC). Two types of optical quantum conscious computer are described as the most viable platforms for a CQC: A. optical DNA oligomer computers, and B. an optical multimesh hypercube core with holographic interconnects utilizing heterosoric crystals. This type of CQC would not be considered synonymous with artificial life, but rather as a class of 'zombie' platform exhibiting evanescent natural intelligence during operation. Applications would include sensory bypass prostheses and research platforms for studying the physical nature of consciousness and its interactions.

4 Transcendental States of Consciousness

12 Kapil Deo Pandey. Transcendental states of consciousness. C 27/239-1(ka), Jagatganj, Varanasi-221002, India.

(1) Brahman, an identity revealed by the Upanisads as the ultimate reality, is of the nature of Consciousness [1]. It is indeterminate, without any kind of predication [2]. Regarding it, negative description [3] is the only description possible, since it lies beyond the reach of the language-capacity [4]. Hence it can be described as a completely transcendental reality.
(2) According to the view of some other scholars, the Pure Consciousness that is common [5] to both the individual self (Jiva) and God (Isvara) is the thing reflected; the reflection of that very thing in the cosmic illusion (Maya) [6] is the Consciousness called God, while the reflection in different minds is the Consciousness called the individual self [7].
(3) To explain the process of the world (Samsara), bondage and liberation, the ultimate reality has to be shown not merely as sheer transcendence but as immanence also. These two aspects are two facets of the same real existent identity. The Upanisads, therefore, declare that Brahman alone exists and, as such, is transcendent [8].
(4) Considering this reality empirically, it is essential to give an account of the levels of Consciousness according to the Vedanta system of Indian Philosophy, as follows. Three states of Consciousness [9]:
(I) Vaisvanara/Virat, the Consciousness associated with the aggregate of the fourfold gross bodies, and Visva, the consciousness associated with the individual fourfold gross bodies. Their gross bodies are called the alimentary sheath (Annamayakosa) and are said to be in the waking state (Jagrat).
(II) Sutratma/Hiranyagarbha/Prana, the consciousness associated with the aggregate of all the subtle bodies, and Taijasa, the consciousness associated with the individual subtle bodies. Their subtle bodies are called Pranamayakosa, Manomayakosa and Vijnanamayakosa, and are associated with the dream state, a state intermediate between the waking state and the state of dreamless sleep.
(III) Isvara, the Consciousness associated with the aggregate of ignorance, on account of its being the illuminator of the aggregate of ignorance, and Prajna, the Consciousness associated with individual ignorance. They are said to be associated with the causal body (Karanasarira), on account of its being the cause of egoism, and are also called the blissful sheath (Anandamayakosa), further known as the state of dreamless sleep (Susupti).
(5) Pure Consciousness: there is an unlimited state of Consciousness which is the substratum of the Consciousness Isvara/Prajna. This is called the fourth (Turiya caitanya), and the Mandukyopanisad says: "Santam Sivam advaitam manyate" (the wise conceive of it as the fourth aspect). This is the transcendental state of consciousness [10].
(6) According to Vedanta [11], perceptual knowledge is nothing but pure Consciousness, as the Brhadaranyakopanisad says: "Yat saksad aparoksad Brahma" [12]. Hence perception is twofold: Jivasaksi (that due to the witness in the individual self) and Isvarasaksi (that due to the witness in God). The individual self (Jiva) is Consciousness limited by the mind (Antahkarana), and the witness in that (Jivasaksi) is the Consciousness that has the antahkarana as its limiting adjunct. Similarly, the witness in God (Isvara) is the Consciousness of which the cosmic illusion (Maya) is the limiting adjunct. The witness is the same as Brahman (Transcendental Consciousness) [13].
(7) The Vedantic concept of the Absolute is both immanent and transcendent simultaneously. Brahman is not the seer [14], since there is no other thing to be seen. Brahman is the utter 'Beyond', the 'Turiya'. It is not the Saksi (witness). Here it is worth pointing out that neither the Absolute of Bradley and Hegel [15] nor the Sunya [16] of Nagarjuna seems to be really transcendent.

References

[1] Taittiriya Upanisad 2.1.1.
[2] Svetasvatara Upanisad 6.11.
[3] Chandogya Upanisad 7.25.2.
[4] Taittiriya Upanisad 2.4.1.
[5] Kathopanisad 1.3.12.
[6] Svetasvatara Upanisad 4.10.
[7] Brahmasutra with Sankarabhasya 3.2.18.
[8] Ibid 1.1.24 and Chandogya Upanisad 3.13.7 & 7.25.2.
[9] Vedantasara, Chapter II.
[10] Ibid, Chapter II, and Mandukyopanisad 7.
[11] Vedantaparibhasa, Perception.
[12] Brhadaranyakopanisad 3.4.1.
[13] Chatterjee, A.K. & R.R. Dravid. The concept of Saksi in Advaita Vedanta.
[14] Brhadaranyakopanisad 3.7.23.
[15] Tripathi, R.K. Problems of Philosophy and Religion. Bhattacharya, K.C. Studies in Philosophy.
[16] Murti, T.R.V. Central Philosophy of Buddhism.

13 David W. Orme-Johnson & Kenneth G. Walton. The myth of the relaxation response. 191 Dalton Drive, Seagrove Beach, FL 32459, U.S.A. & Neurochemistry Laboratory, FM 1005, Maharishi University of Management, Fairfield, IA 52557, U.S.A.
Although relaxation and meditation techniques have been hypothesized to produce the so-called relaxation response, a review of the literature finds that the acute physiological changes that occur during most techniques are not significantly different from uninstructed rest, sitting with eyes closed. Compared to rest, some techniques produce specific acute changes resulting from their specific methodologies, such as reduced muscle tension in muscle relaxation techniques, reduced respiration according to the well-known orienting response in techniques that require focused attention, and reflexive entrainment of the heart rate with the breath in techniques that control respiration. The relaxation response was originally modeled on the changes produced by the Transcendental Meditation (TM) technique, but some changes that occur during the TM technique, such as increases in cardiac output, skin conductance, and plasma epinephrine, are in the opposite direction to the relaxation response. Many other changes associated with practice of this technique, such as increased cerebral blood flow and EEG coherence, are not predicted by the relaxation response. With regard to clinical outcomes, randomized clinical trials that controlled for expectation, placebo, and other design features, as well as meta-analyses and reviews of over 790 studies, provide strong evidence that different techniques are not equivalent and that they have specific effects [1]. For example, it appears that muscular disorders are best treated with muscularly oriented methods, while autonomic dysfunctions such as hypertension and migraine headaches are more effectively treated with techniques that target the autonomic nervous system. The Transcendental Meditation technique appears to be the most effective treatment overall for a broad range of stress-related disorders, including hypertension, anxiety, substance abuse, and mental health problems. This is purportedly due to its effectiveness at promoting a state of "transcendental consciousness", a proposed fourth state of consciousness whose characteristics differ from those of ordinary sleeping, dreaming and waking [2, 3].

References

[1] Orme-Johnson, D.W. & K.G. Walton. 1998. All approaches to preventing or reversing effects of stress are not the same. American Journal of Health Promotion 12:297-299.
[2] Travis, F. & R.K. Wallace. 1997. Autonomic patterns during respiratory suspensions: Possible markers of Transcendental Consciousness. Psychophysiology 34:39-46.
[3] Mason, L.I., C.N. Alexander, F.T. Travis, G. Marsh, D.W. Orme-Johnson, J. Gackenbach, D.C. Mason, M. Rainforth & K.G. Walton. 1997. Electrophysiological correlates of higher states of consciousness during sleep in long-term practitioners of the Transcendental Meditation program. Sleep 20:102-110.

14 Frederick Travis. Physiological patterns during Transcendental Meditation practice as a basis for a neural model of transcending. Psychology Department, Maharishi University of Management, Fairfield, IA 52557, U.S.A.

... Our normal waking consciousness, rational consciousness as we call it, is but one special type of consciousness, whilst all about it, parted from it by the filmiest of screens, there lie potential forms of consciousness entirely different. (James, 1961, Varieties of Religious Experience, p. 305)

Over the last three decades, scientific research has probed into the physiological parameters of experiences during meditation practice.
Because the functioning of the nervous system underlies and gives rise to specific experiences, physiological patterns (EEG, breath rate, skin conductance, heart rate) can index changes in internal states of consciousness. Through these physiological "windows," experiences during meditation practice, which are distinctly different from ordinary waking, have been extensively investigated [1, 2]. Such meditation experiences may be examples of the potential forms of consciousness that James alluded to at the turn of this century. The largest number of studies have been conducted on the Transcendental Meditation (TM) technique, as developed by Maharishi Mahesh Yogi [3]. This presentation will present the latest findings on physiological patterns during TM practice as the basis for discussing a neural model of potentially unique forms of consciousness during TM practice. I will present: 1) six distinct EEG patterns of transcending, suggesting that EEG markers may be a secondary marker of these experiences; 2) breath, skin conductance and heart rate markers that distinguish the deepest experiences during TM practice from other meditation experiences; and 3) a comparison of EEG and autonomic markers during equal-length, counterbalanced eyes-closed and TM sessions. These physiological patterns suggest that meditation involves frontal executive areas switching awareness to a state of restful alertness at the beginning of TM practice, and then maintaining that state through subcortical basal ganglia/thalamic/cortical loops.

References

[1] Shapiro, D.H. & R.N. Walsh. 1984. Meditation: Classic and contemporary perspectives. Aldine Publishing. Hawthorne, NY.
[2] Travis, F.T. & R.K. Wallace. 1997. Autonomic markers during respiratory suspensions: Possible markers of transcendental consciousness. Psychophysiology 34:39-46.
[3] Maharishi Mahesh Yogi. 1969. Maharishi Mahesh Yogi on the Bhagavad-Gita: A new translation and commentary. Chapters 1-6. Penguin Books. Baltimore.

15 Edmond Chouinard. Floating consciousness, a coupling of mind and matter, with related experimental results on "Mental Moving of Flowers: A Non-Local Mind Fluctuation Sensor". Metaphysical Measurements Inc., 485 River Avenue, Providence, RI 02908. Related book: "Holographic Mind".

The ultimate transcendental Consciousness of a Seer is considered to be one aspect of a foam labyrinth of striations formed by a spatially unique structure of quantum mechanical photon/boson particulants of the vacuum state. The Seen is considered to evolve as a series of spatial Fourier transforms of the Seer structures, each uniquely defined by its own sets of low-pass transformation filters. Such spatial transforms replicate from preceding states of evolution, resulting in an unfolding of creation. Explicated forms evolve from more implicated existences. Both special and general relativity help explain the evolution of such transforms. The ever-changing geometrical configuration of mass points (planets, black holes, stars, etc.) in an expanding universe provides a rich assortment of naturally occurring light-bending structures needed to generate the multitude of spatial Fourier transforms (giving rise to the time-varying richness of awareness in the consciousness of Seers). The Seer consciousness is necessarily non-local since its constituents, the photon/boson particulants, consume no time in moving from point to point in configuration space, considered with respect to their own frame of reference.
The mechanics of Seeing is thus defined by the ever-ongoing, specialized and individualized (via select optical Fourier filters) photon/boson spatial transformations. This triune Seer-Seeing-Seen structure is related to a number of principles (tattvas) developing from ancient Vedic systems of classification that are based upon sensory information mechanisms and upon various states of awareness that can arise, moving from abstract transcendental subtleties of mind to the most pragmatic dense forms of matter. This threefold aspect of consciousness may lie anywhere among the tattva principles, in analogy to a sounding of any three tones on a piano keyboard. The repeating spatial Fourier transforms thus generate a glissando movement of differentiated states of consciousness as it moves downward over the keyboard of tattvas. The Seer is always the highest tone and the Seen is always the lowest tone. The process of Seeing lies somewhere between. The Seer may thus float from the most rarefied to the most worldly pragmatic, but exactly what can be Seen at any time depends on the instantaneous location of the Seer. This Model of Floating Consciousness is founded upon its ability to explain some most extraordinary personal experiences while remaining consistent with the world of physics. The Model finds corroboration in many popular forms of dualism which postulate the self-existence of a subtle mind, independent of a biological body, though normally tightly coupled to such a body. Such ideas can initially appear as fantasy, but when they are experienced, even the "hard problem" of consciousness tends to disappear. Experimental research on related mind-matter topics is presently being performed under the umbrella of this Floating Consciousness Model. Ongoing, developing experimental protocols simultaneously measure many dozens of physical, electro-optical, and biophysical variables on subjects, the interrelating environment, and targets (i.e., on the Seer, the process of Seeing, and the Seen), as a variety of mental and catalytic environmental conditions are invoked. High-power plasma physics devices are also a part of this research, as they have been involved, in a yet unknown way, in occasional and mysterious extraordinary visual sightings. Tantalizing experimental results are now beginning to accrue, and many photographic slides, along with particulars on setups and protocols, will be presented on some experiments entitled "Mental Moving of Flowers: A Non-Local Mind Fluctuation Sensor". The results will be compared with the theoretical framework outlined herein. Obviously, since this Floating Consciousness Model is so broad, so inclusive, and has so many critical points that are as yet mostly undeveloped, we expect to find much room for updating the overall scheme. I can only ask for your help to further mature this risky venture.

16 Richard L. Amoroso. The utility of the transcendental state as a tool in theory formation. The Noetic Institute, 120 Village Square #49, Orinda, CA 94563-2502, U.S.A.

We have come full circle from the time of Galileo, when the logic of philosophical deduction gave way to the empirical methods of science as the primary investigative tool of epistemology, to a time when science and spirituality may finally have the impetus to merge and form a new methodology for comprehending the universe.
In some arenas science has reached, at least to our current understanding, the limits of empiricism; and, more pertinently, the standard models of quantum theory and cosmology have begun to discover that we live in a conscious universe that contains an inherent teleological principle. For the first time in the evolution of human consciousness this sets the stage for a complete unified epistemology, one that integrates Science, Philosophy, and Theology, which Plato defined as Noetics, the only complete epistemology. Since there are 10,000 religions or spiritual paths in the world today, most of which have conflicting teachings or dogmas, how could developing an 'empirical metaphysics' be possible? Since all theory formation has an inherent metaphysical component, the solution seems rather straightforward. If the premise that we live in a conscious universe is true, and that this conscious universe contains an inherent teleology, in principle it should not matter which path is utilized to achieve the required 'transcendental state'. Whether through dance, fasting, meditation, chanting, prayer or peyote, it would be the resulting pure transcendental state that is of paramount importance. Investigators might also utilize the historical tracts and scriptures pertinent to their individual path to help guide questions posed to the universe through meditation, and receive input through the mediation of the transcendental state. Examples from history might be the dream of the snake joining head to tail in the discovery of benzene or, perhaps more pertinent to our interest in the nature of consciousness, Descartes' claim of receiving a revelation from God pertaining to the distinction between mind and body. Descartes' 'vision' has remained controversial for 300 years and has still not been tested. Science may now have progressed to the point where this is possible. The most important point of this discussion is that the information received through 'empirical metaphysics', for theory formation or otherwise, need not interfere with science, because however a theory is formed it must still be experimentally tested. The great value of developing an integrative discipline of science and spirituality is that potentially 10's, 100's, or 1,000's of years could be saved, along with the resources expended on spurious paths, which could alternatively be used to alleviate human suffering or maintain the environment. For example, if the 'Big Bang' is wrong, some of the best minds in astrophysics could have more efficiently expended millions of man-hours over the last 75 years. Many of us would argue that an individual path would be the best or the only one suited as a tool for empirical metaphysics; the most proficient test would be demonstrated in the fruits resulting from the labor. Historically, science and spirituality have remained somewhat mutually exclusive, the former satisfied only by empirical evidence and the latter quietly submitting to faith. The possibility of integrating science, philosophy, and spirituality is discussed, along with potential methods and implications. The nature of the transcendental state, methods of achieving it, and mutual verification within research groups are also explored.

17 Dejan Raković. Transitional states of consciousness as a biophysical basis of transpersonal transcendental phenomena. Faculty of Electrical Engineering, P.O. Box 35-54, 11120 Belgrade, Yugoslavia.
It is pointed out (in the framework of the author's biophysical relativistic model of altered states of consciousness [1]) that transitional states of consciousness (as transpersonal biological macroscopic quantum gravitational tunneling phenomena) represent an excellent framework for understanding and mental control of the transpersonal states of consciousness described in various religious and esoteric traditions of East and West during the history of civilization, implying their real biophysical nature. The transitional character of these states also explains why these phenomena are short-lasting and poorly reproducible, and why they are most easily mentally controlled shortly before a waking/sleeping transitional state. This might also explain the extraordinary efficiency of prayer performed shortly before sleeping, which is recommended by all religious traditions, and its significance in the mutual reprogramming of psychic conflicts (as a germ of future interpersonal fights, as well as of potential psychosomatic and psychological disorders of the persons in conflict) during transpersonal interactions of the persons in conflict in transitional states of the praying person (with direct mental addressing of the person conflicted with, or the energetically more efficient indirect mental addressing via ionically abundant disembodied archetype structures from religious traditions). At the same time, this offers a new insight into psychosomatic health and illness as essentially transpersonal phenomena, because curing a psychosomatic disorder in only one person involved in some conflict, without mutual reprogramming of the psychosomatic conflict, is not final: the non-reprogrammed conflict in the other person causes its (unconscious) transpersonal re-inducement in the first one, in mentally loaded transitional states of consciousness of the second one. All this suggests that a deeper biophysical understanding of the nature of consciousness and transpersonal phenomena might soon give rise to scientific understanding and empirical verification of even fundamental philosophical/religious questions (such as the practical/spiritual significance of imperative moral behavior of every individual, extremely important for accelerated integration and spiritual evolution of personality), and to the appearance of a new humanism, without meaningless and painful interpersonal, interethnic, and interreligious conflicts.

Keywords: theoretical mechanisms, biophysics, relativistic and quantum physics, brainwaves, neural networks, ionic structures, transcendental states of consciousness, transitional states of consciousness, altered states of consciousness, acupuncture system, psychosomatic healing, religious/esoteric transpersonal conflict reprogramming, prayer, morals.

References

[1] Raković, D. 1995. Brainwaves, neural networks, and ionic structures: Biophysical model for altered states of consciousness. In D. Raković & Dj. Koruga, Eds. Consciousness: Scientific Challenge of the 21st Century. 291-316. ECPD. Belgrade.
Raković, D. 1997. Consciousness and quantum collapse: Biophysics versus relativity. The Noetic J. 1:1.
Raković, D. 1997. Prospects for conscious brain-like computers: Biophysical arguments. Informatica 21:507-516. Special Issue on Consciousness Studies.

18 Suhrit K. Dey. Scientific studies on consciousness in Indian Philosophy. S.K. Dey, Professor of Mathematics, Eastern Illinois University, Charleston, IL 61920, U.S.A.
Indian philosophy is essentially a collection of many apparently distinct and sometimes contradictory thoughts and ideas. But except for one group of philosophers holding the materialistic views of Charvak, all essentially merge into one ultimate Reality, the Cosmic Consciousness. In this article a scientific analysis has been conducted of various thoughts in Indian Philosophy, and it has been found that Vedanta Philosophy, which categorically declares Cosmic Consciousness to be the very core element of this universe, is the synopsis and a harmonious synthesis of all fragments of the philosophical tenets of ancient India. Only Charvak's works have been completely rejected in Vedanta.

Practical Applications of Consciousness-Based Technologies

19 John S. Hagelin. Natural Law and Practical Applications of Consciousness-Based Technologies in Government. Institute of Science, Technology and Public Policy, 1000 North 4th Street, Fairfield, IA 52557, U.S.A.

Scientifically proven technologies of consciousness can help governments overcome critical problems in the fields of education, health, crime, the economy, and the environment. Natural law, the laws of nature discovered by the physical and biological sciences, upholds the lives of over eight million species in earth's complex ecosystems. Natural law governs the entire universe with absolute efficiency and economy, in accordance with the universal principle of least action. A growing body of scientific evidence suggests that most national problems can be averted by government policies that effectively harness the limitless organizing power of natural law for the governance of society. These policies would require understanding and application of available technologies of consciousness. Specifically, according to research, most social problems can be traced to the widespread "violation of natural law". "Violation of natural law" denotes actions which cause negative physiological, psychosocial or environmental repercussions: actions that fail to take advantage of the laws of nature governing health, the environment, or economic and social systems. It is therefore vitally important that governments prevent the widespread violation of natural law, and adopt policies that effectively apply the most up-to-date scientific knowledge of the laws of nature for the smooth and efficient administration of society. In particular, governments need a technology to enable their citizens to live spontaneously in accord with natural laws. Scientific research over the past 30 years has verified that there are consciousness-based technologies which align individual thought and action with natural law, allowing individuals to live spontaneously in accord with natural law. According to the Department of Health and Human Services, 70% of all disease is preventable, caused by an "epidemic of unhealthy habits" such as smoking, alcohol and drug abuse, and improper diet. Scientifically proven technologies of consciousness have been found to spontaneously reduce such health-afflicting behaviors, improve health, and reduce medical costs by 50% or more. Thus, a well-designed national health policy incorporating these consciousness-based technologies can better harness the laws of nature which uphold human health, and thereby improve health and dramatically decrease medical expenditures.
I will summarize material from a recent book (Hagelin, J., Manual for a Perfect Government, 1998) detailing how specific technologies of consciousness align individual thought and action with natural law, and will discuss how such technologies enable governments to overcome critical national problems in the areas of education, health, crime, the economy, and the environment.

20 Jeremy Z. Fields. Consciousness-Based Medicine: Breaking the Vicious Cycle of Stress, Chronic Disease and Aging. Center for Healthy Aging, Saint Joseph Hospital, Chicago, IL 60657, U.S.A.

Maharishi Vedic Medicine (MVM) is a comprehensive system of natural medicine that holds "pure" or "transcendental" consciousness to be the most fundamental field of life [1]. This level of consciousness is different from the familiar states of waking, dreaming and sleeping. MVM recommends approaches such as the Transcendental Meditation technique and the TM-Sidhi program to increase access to this field of pure consciousness. As access to this field is increased through daily practice of the techniques, stabilized levels of consciousness higher than waking, dreaming and sleeping are attained. On the physiological level, the order and organizing power inherent in pure consciousness mobilize the inner intelligence of the body, allowing defense and repair systems to remove the chronic effects of stress more effectively than other stress-reduction approaches [2]. With this reduction in "allostatic load" (the accumulated wear and tear resulting from the effects of stress on adaptive mechanisms), and with the increase in adaptability to new stressful experiences, the mind/body moves in the direction of perfect health, and aging is increasingly disallowed. This model, therefore, predicts that individuals who regularly practice the TM and TM-Sidhi programs will exhibit lower rates of physical and psychosocial stress, lower incidence and severity of disease, and less disability. Along with a healthier body, thinking and feeling should also become more refined, and positive health behaviors and happiness should increase. Finally, risk factors for disease-related mortality should decrease, and longevity and other indicators of successful aging should increase. Twenty-eight years of published data from controlled clinical trials and from studies with other experimental designs strongly support the above model of consciousness-based medicine [1, 2]. These data will be reviewed and their far-reaching implications will be discussed.

References

[1] Orme-Johnson, D.W. & R.E. Herron. 1997. An innovative approach to reducing medical care utilization and expenditures. The American Journal of Managed Care 3:135-144.
[2] Orme-Johnson, D.W. & K.G. Walton. 1998. All approaches to preventing or reversing effects of stress are not the same. American Journal of Health Promotion 12:297-299.

21 Kenneth G. Walton. Consciousness-Based Technologies in Offender Rehabilitation: Psychological/Physiological Benefits and Reduced Recidivism. Neurochemistry Laboratory, FM 1005, Maharishi University of Management, Fairfield, IA 52557, U.S.A.

Understanding of consciousness has been enhanced in recent years by theories and practical approaches derived from ancient traditions. The most thoroughly researched of these approaches have been the technologies brought out by Maharishi Mahesh Yogi, modern exponent of the Vedic tradition in India.
Although an important body of knowledge of consciousness has been kept alive in this tradition for over 8000 years, Maharishi holds this knowledge to be universal; that is, it is true and effective for all people, independent of their geographic and cultural origins. Some key components of this knowledge are: (1) that there is an underlying, simplest state of consciousness which is the self-sufficient and self-referral basis of all experience; (2) that through special techniques it is possible to experience this state in isolation, that is, awareness aware only of itself, completely without thoughts, emotions or sensory perceptions; and (3) that through regular experience of this state, as is held to occur in the Transcendental Meditation technique, both mental and physical functioning are enhanced, and concrete benefits can be observed in all areas of life. Offender rehabilitation is one of the areas in which this technology has been tested [1]. Controlled studies in maximum-security prisoners have shown significant improvements on Loevinger's sentence completion test of ego or self development, as well as significant reductions in scores on a variety of tests of psychopathology, and a 50% reduction in recidivism rates. Because low scores on tests of self development and high scores on psychopathology tests have been associated with conduct disorders and delinquency, the positive changes in these factors due to consciousness-based technologies, along with other stress-reduction effects, may provide the basis for rehabilitation. These positive changes due to practice of the Transcendental Meditation technique were in sharp contrast to the minimal effects of comparison programs, including counseling, religious programs, and drug rehabilitation programs. Other studies outside of prison have indicated that the Transcendental Meditation program is an effective approach to reducing addictions and alcohol and drug use [2], which may further contribute to the reductions in criminal tendencies. The implications of these results for effective rehabilitation of criminals will be discussed.

References

[1] Alexander, C.N., K.G. Walton, R.S. Goodman & D.W. Orme-Johnson, Eds. 1998. The Transcendental Meditation program in offender rehabilitation and crime prevention. Journal of Offender Rehabilitation. In press.
[2] O'Connell, D.F. & C.N. Alexander, Eds. 1994. Self-recovery: Treating addictions using Transcendental Meditation and Maharishi Ayur-Veda. Alcoholism Treatment Quarterly 11.

22 Kam-Tim So & David W. Orme-Johnson. Three randomized studies of the effects of the Transcendental Meditation technique on intelligence: Support for the theory of pure intelligence as a field underlying thought and action. David Orme-Johnson, 191 Dalton Drive, Seagrove Beach, FL 32459, U.S.A.

In the Vedic tradition from which Maharishi Mahesh Yogi derived the Transcendental Meditation technique, consciousness is considered a field underlying individual intelligence and indeed all material expressions in nature [1]. In recent years, a variety of tests of this theory have been performed, and the evidence generally has been supportive. In this study, we have attempted to determine whether introducing the technique at the high school level produces effects that are sufficiently broad-based to support the theory of consciousness as an underlying field responsible for emotions, thoughts and actions.
We compared changes in mental abilities over six or twelve months in 362 high school students in Taiwan who were randomly assigned to the Transcendental Meditation (TM) technique, Napping, or No Treatment, and compared these with each of two self-selected groups, Contemplation Meditation and No Interest in meditation. Measures included general intelligence, inspection time, constructive thinking, field independence, creativity, and anxiety. Multivariate Analysis of Variance on all measures showed that the TM groups increased significantly more than Contemplation Meditation (F = 6.8, p < .000001), Napping (F = 3.19, p < .004), the two No-Treatment control groups (F = 3.65, p < .002; F = 5.48, p < .00003), and the No Interest group (F = 3.69, p < .001). In the most stringent test, students randomly assigned to the TM technique improved more than randomly assigned controls on all individual measures. The results appear to support the wide range of other empirical, experiential, and theoretical evidence for a field of "pure intelligence" or "pure consciousness" underlying all thought and action. Results of this study and others may indicate that this field is enlivened through practice of the Transcendental Meditation program, and suggest that it is this distinctive effect of the TM program that makes it more efficient at promoting a wide range of benefits in the area of psychological and physical health [2].

References

[1] Dillbeck, M.C., K.L. Cavanaugh, T. Glenn, D.W. Orme-Johnson & V. Mittlefehldt. 1987. Consciousness as a field: The Transcendental Meditation and TM-Sidhi program and changes in social indicators. The Journal of Mind and Behavior 8:67-104.
[2] Orme-Johnson, D.W. & K.G. Walton. 1998. All approaches to preventing or reversing effects of stress are not the same. American Journal of Health Promotion 12:297-299.

23 D. Raković, M. Tomašević, E. Jovanov, V. Radivojević, P. Šuković, Ž. Martinović, M. Car, D. Radenović, Z. Jovanović-Ignjatić & L. Škarić. Electroencephalographic (EEG) correlates of some activities which may alter consciousness: The Transcendental Meditation technique, musicogenic states, microwave resonance relaxation, healer/healee interaction, and alertness/drowsiness. Faculty of Electrical Engineering, P.O. Box 35-54, 11120 Belgrade; "Vinča" Institute of Nuclear Sciences, P.O. Box 522, 11001 Belgrade; Institute "M. Pupin", Computer Systems Department, P.O. Box 15, 11000 Belgrade; Institute for Mental Health, Department of Clinical Neurophysiology, Palmotićeva 37, Belgrade; Institute for Biological Research, Center for Multidisciplinary Studies, 29. Nov. 142, Belgrade; Private Medical Practice "Lav", Vlade Zečevića 14, Belgrade, Yugoslavia.

A key problem in finding the most complete and useful theory of consciousness may revolve around how to empirically determine different styles or states of consciousness and how to incorporate these within a single paradigm. This was our motivation to start examination of EEG correlates of some activities or substates of consciousness which occur spontaneously or are induced artificially.
Our investigations demonstrated more or less characteristic features in 25 subjects practicing the Transcendental Meditation program (increased beta power in the prefrontal region, increased theta power in the left frontal and right temporal regions, increased alpha power in both temporal regions, and a correlation between increased alpha power and decreased correlation dimension); in 6 subjects given 4 types of spiritual music to induce musicogenic states (with significant changes in only 3 cases out of 24, where increased theta and alpha power was observed only in those subjects who described their musical experiences as very pleasant); in 28 subjects in whom relaxation was induced by microwave resonance therapy applied to corresponding acupuncture points (with slightly decreased EEG power in all frequency bands, especially in the left central region, which can be ascribed to higher activation of the stimulated left circulatory part of the acupuncture system; it should also be noted that persons not previously subjected to this treatment responded more strongly, presumably as a consequence of a more imbalanced acupuncture system); in 5 healer/healee noncontact interactions (with an increase in the maximum mean coherence of their EEG patterns in the alpha band observed only in short 4 s time intervals); and in 30 subjects monitored for alertness/drowsiness level (with an automatic neural-network classifier implemented to assess the correlation between EEG power spectrum fluctuations and changes of vigilance level, demonstrating linear separability of the states of alert wakefulness and drowsy wakefulness, and allowing very fast data processing and possible real-time applications in clinical practice). New technologies applied to the EEG may permit rapid and reproducible identification of different styles or states of consciousness. Such a tool might be useful in evaluating the effectiveness of different techniques for stress reduction and for altering expressions of consciousness.

24 Rachel Spigel Goodman. International peace initiatives through Maharishi Mahesh Yogi's consciousness-based technology. Psychology Department, DB1157, Maharishi University of Management, Fairfield, IA 52557, U.S.A.

This talk will introduce Maharishi Mahesh Yogi's Vedic approach to peace and its practical applications in the international arena. The central thesis of this approach is that consciousness itself is basic to thought and action, and thus basic to all the events which constitute conflict resolution and peace [15]. The view that consciousness is basic, even to material objects, has also been expressed by physicists, such as the father of quantum theory, Max Planck, who remarked, "I regard consciousness as fundamental. I regard matter as derivative from consciousness" [13]. According to Maharishi, pure consciousness is the state of least excitation of awareness in the mind; it is beyond the finest level of thinking activity and can be systematically located, experienced, and utilized through the Transcendental Meditation (TM) technique and the advanced Transcendental Meditation-Sidhi program to strengthen thought and action [15] in the direction of greater coherence and more positive evolutionary tendencies for the individual and the society as a whole.
Empirical research has indicated that practice of the TM and TM-Sidhi programs produces tangible results for the individual, including improved health [5, 16], enhanced intellectual capacity [4, 21], alleviation of stress-related conditions [18], and improvement in interpersonal relations [1]. The individual and the community in Maharishi's Vedic perspective have a reciprocal relationship: the consciousness of a society is a wholeness (known as collective consciousness) created by all the individuals living therein [10, 14]. The degree of peacefulness or coherence of any society depends on the level of coherence (or stress) generated by the individuals comprising that society [15]. In this context, coherence refers to the level of integration of the diverse concerns of the individual with the needs of the whole society. According to this perspective, development of consciousness in the individual is the essential means for creating the conditions conducive to coherence and progress in the society. The level of harmony in society, in turn, influences the individual. Based on Maharishi's early prediction in the 1960's [17], and paralleled by principles of physics related to coherence in other natural systems, researchers have found that only one percent of a society's population practicing the TM technique (known as the "Maharishi Effect"), or a group as small as the square root of 1% of a society's population practicing the TM and advanced TM-Sidhi program together (known as the "Extended Maharishi Effect"), is necessary to influence society in the direction of greater coherence. Over forty research studies, including prospective intervention studies, have supported Maharishi's initial prediction in the areas of the economy [3], quality of life [2, 20, 22], crime rate [11, 12], support of the government by the constituency [8, 9], and international interactions [6, 19]. In this talk, I will give a brief summary of the research on Maharishi Effect initiatives related to conflict resolution and international relations. In particular, I will describe studies which have empirically tested Maharishi's theoretical perspectives by utilizing time series analysis of events to elucidate the coherence-creating impact of large groups practicing the TM and TM-Sidhi programs on U.S.-Soviet interactions [6, 7] and U.S.-world relations [8, 9] during the Reagan administration.

References

[1] Alexander, C.N., G.C. Swanson, M.V. Rainforth, T.W. Carlisle & C.C. Todd. 1993. A prospective study on the Transcendental Meditation program in two occupational settings: Effects on stress-reduction, health, and employee development. Anxiety, Stress, and Coping 6:246-262.
[2] Assimakis, P.D. & M.C. Dillbeck. 1995. Time series analysis of improved quality of life in Canada: Social change, collective consciousness, and the TM-Sidhi program. Psychological Reports 76:1171-1193.
[3] Cavanaugh, K.L. 1987. Time series analysis of U.S. and Canadian inflation and unemployment: A test of a field-theoretic hypothesis. In Proceedings of the American Statistical Association, Business and Economics Statistics Section. Pp. 799-904.
[4] Chandler, H.M. 1991. Transcendental meditation and awakening wisdom: A 10-year longitudinal study of self-development (Doctoral dissertation, Maharishi International University, 1991). Dissertation Abstracts International 51:5048B.
[5] Dillbeck, M.C. & D.W. Orme-Johnson. 1987. Physiological differences between Transcendental Meditation and rest. American Psychologist 42:879-881.
Cavanaugh & J.L. Davies. 1990. The dynamics of US-Soviet relations, 1979-1986: Effects of reducing social stress through the Transcendental Meditation and TM-Sidhi program. In Proceedings of the American Statistical Association. Alexandria, VA. [7] Gelderloos, P., M.J. Frid, P.H. Goddard, X. Xue & S.A. Loliger. 1988. Creating world peace through the collective practice of the Maharishi Technology of the Unified Field: Improved U.S.-Soviet relations. Social Science Perspectives Journal 2:4:80-94. [8] Goodman, R.S. 1997. The Maharishi Effect and government: Effects of a national demonstration project and a permanent group of Transcendental Meditation and TM-Sidhi program practitioners on success, public approval, and coherence in the Clinton, Reagan, and Bush Presidencies (Doctoral dissertation, Maharishi University of Management, 1997). Dissertation Abstracts International 58:06A:2385. [9] Goodman, R.S., D.W. Orme-Johnson, M.V. Rainforth & D.H. Goodman. 1997. Transforming political institutions through individual and collective consciousness: The Maharishi Effect and government. In Proceedings of the 1997 Annual Meeting of the American Political Science Association. Washington, D.C. [10] Hagelin, J. S. 1987. Is consciousness the unified field? A field theorist's perspective. Modern Science and Vedic Science 1:29-87. [11] Hagelin, J. S., D. Orme-Johnson, M. Rainforth, K. Cavanaugh & C. Alexander. 1994. Results of the demonstration project to reduce violent crime and improve governmental effectiveness in Washington, D.C., June 7-July 30, 1993. Preliminary Technical Report ITR-94:1. Institute of Science, Technology and Public Policy , Maharishi University of Management. Fairfield, lA, U.S.A. [12] Hatchard, G.D., A.J. Deans, K.L. Cavanaugh & D.W. Orme-Johnson. 1996. The Maharishi Effect: A model for social improvement. Time series analysis of a phase transition to reduced crime in Merseyside metropolitan area. Psychology, Crime and Law 2:3:165-174. [13] Klein, D.B. 1984. The concept of consciousness: A survey. University of Nebraska Press. Lincoln, NE, U.S.A. [14] Maharishi Mahesh Yogi. 1986. Life supported by natural law. Maharishi International University Press. Fairfield, lA, U.S.A. [15] Maharishi Mahesh Yogi. 1995. Maharishi's absolute theory of government: Automation in administration. Age of Enlightenment Printers. India. [16] Orme-Johnson, D.W. 1987. Reduced health insurance utilization through the Transcendental Meditation program. Psychosomatic Medicine 49:493-507. [17] Orme-Johnson, D.W. 1992. Theory and research on conflict resolution through the Maharishi Effect. Modern Science and Vedic Science 5:1-2:76-98. [18] Orme-Johnson, D.W. 1994. Transcendental Meditation as an epidemiological approach to drug and alcohol abuse: Theory, research, and financial impact evaluation. Alcoholism Treatment Quarterly 2:1-2:119-168. [19] Orme-Johnson, D.W., C.N. Alexander, J.L. Davies, H.M. Chandler & W.E. Larimore. 1988. International peace project in the Middle East; The effect of the Maharishi technology of the unified field. Journal of Conflict Resolution 32:776- [20] Orme-Johnson, D.W., P. Gelderloos & M.C. Dillbeck. 1988. The effects of the Maharishi Technology of the Unified Field on the U.S. quality of life (1960-1984). Social Science Perspectives Journal 2:4:127-146. [21] So Kam-Tim. 1995. Testing and developing holistic intelligence in Chinese culture with Maharishi's Vedic Psychology: Three experimental replications using Transcendental Meditation. 
[22] Reeks, D. 1990. Improved quality of life in Iowa through the Maharishi Effect (Doctoral dissertation, Maharishi International University, 1990). Dissertation Abstracts International 51:12:6155B.

25 Daniel Meyer-Dinkgrafe. Theatre as an aid to development of higher states of consciousness. Department of Theatre, Film and Television Studies, University of Wales Aberystwyth, 1 Laura Place, Aberystwyth, Ceredigion SY23 2AU, Wales, U.K.

Theatre involves consciousness on many levels: the performer's emotions, the author's process of writing, the effects of the play itself, or even of the lighting, set, costume, or sound, on performers and audience, and the reception process in the theatre. Thus, the theatre environment affords everyone involved myriad opportunities to effect changes in the many types of experiences of the audience and the participants. A model of consciousness contained in the Vedic tradition has been represented by the teachings of Maharishi Mahesh Yogi. This model proposes that an unbounded field of "pure" consciousness underlies all thought and events. When this field of pure or "transcendental" awareness is systematically experienced through technologies such as the Transcendental Meditation (TM) technique, the qualities of the field (i.e., holistic, dynamic, creative) are enlivened. Maharishi's conceptualisation of consciousness, in conjunction with the Indian treatise on drama and theatre, the Natyashastra, has proved valuable in understanding and explaining the specific relationship between consciousness and theatre. Based on extensive scientific research on Vedic technologies brought out by Maharishi, I propose that the application of these technologies will improve the quality of life of all people involved in the theatre, including measurable improvements in performance-specific parameters in the theatre-related activity of playwrights, actors, producers and designers. I also propose that theatre itself, if following the guidelines provided in the Natyashastra, may be an effective addition to these Vedic technologies for improvement of quality of life. For theatre itself to be useful as an aid to attaining higher states of consciousness, a thorough retranslation and reassessment of the Natyashastra from the perspective of Maharishi's teachings is needed. I hope that the hypotheses presented in this paper will stimulate colleagues interested in collaborating with me on further research in this area.

Consciousness in Philosophy and Cognitive Science

26 Imants Baruss. Overview of consciousness research. Department of Psychology, King's College, University of Western Ontario, 266 Epworth Ave., London, Ontario, Canada N6A 2M3.

An examination of definitions of consciousness by Baruss has revealed that they fall into four general categories: consciousness as the variable degree to which an entity can make discriminations and act in a goal-directed manner; metacognition demonstrated behaviourally; subjective experience characterized by intentionality; and a subjective sense of existence. Historically the effort has been to operationalize subjective experience as metacognitive behaviour and, conversely, to infer subjective experience when an entity passes the Turing test for intelligence. Subjective aspects of consciousness pose problems for science to the extent that science is confined to that which is objectively observable.
The resolution of first- and third-person perspectives is tied, more generally, to the beliefs about consciousness and reality held by consciousness researchers. Using a survey instrument designed for that purpose, in 1986 Baruss and Moore found a material-transcendent dimension underlying the beliefs about consciousness and reality of a sample of 334 academics and professionals with a potential interest in consciousness. This dimension was borne out in a subsequent survey of 212 participants at the academic conference Toward a Science of Consciousness 1996 ("Tucson II"). Those tending toward the material pole emphasize emergent and informational aspects of consciousness, while those tending toward the transcendent pole emphasize the subjective nature and primacy of consciousness. Scores in a transcendent direction are associated with existential and spiritual interests and with claims to having had extraordinary experiences such as out-of-body experiences. Thus, the personal experiences and beliefs of investigators play a role in their understanding of consciousness and create diversity in the research community. All three domains of discourse (the experiential, cognitive and physiological) must be engaged, and knowledge integrated from both normal and altered states, for an adequate theory of consciousness.

27 Mihaly Lenart & Ana Pasztor. Pragmatic paradoxes and the hard problem of consciousness. Department of Biomedical Engineering, University of Miami, 1251 Memorial Dr., MCA 219A, Miami, FL 33146, U.S.A. & School of Computer Science, Florida International University, University Park, Miami, FL 33199. Tel. (305) 348-2019. Fax (305) 348-3549.

The overall concern of our research is human communication. In this lecture we will zoom in on paradoxical communication. There are three types of paradoxes [Watzlawick, 67], each corresponding to one of the three main areas of communication theory, namely syntax, semantics, and pragmatics. First, there are the so-called antinomies, which arise in mathematical systems, e.g., "the class of all classes which do not contain themselves." Then there are semantic antinomies or paradoxical definitions, e.g., "the least undefinable ordinal," and finally there are paradoxes which arise in ongoing interactions between people and are called pragmatic paradoxes. Our concern is this latter type of paradox, in particular so-called double binds [Bateson, 72]. The ingredients for double binds are 1. a strong human relation involving some kind of dependency, 2. a negative injunction, e.g., "if you do not do so and so I will punish you," 3. a second injunction conflicting with the first at a more abstract level and enforced by punishments, and 4. a negative injunction or a context prohibiting the dependent person(s) in the relationship from stepping outside the frame of communication and dissolving the paradox. Pragmatic paradoxes, particularly double binds, can have a destructive effect on people's behaviour and their sanity by challenging their belief in the consistency of their universe. This makes it extremely important to fully understand pragmatic paradoxes. Both logico-mathematical and semantic antinomies are mostly well understood and have been the subject of extensive mathematical research. Not so pragmatic paradoxes. The difficulty in understanding them goes back to the problem of qualia, or the alleged intrinsic first-person character of subjective experience: the famous Hard Problem of Consciousness, as posed by David Chalmers.
In this lecture we will present a model of first-person experience, i.e. of subjective experience, by decomposing it into those cognitive processes which serve as its building blocks [Damasio, 94; Pasztor, 98]. In order to function, these cognitive processes need the help of certain bodily and physiological processes for consolidation and expression. These physical reactions are important for their external observation and confirmation. The primary behavioral elements involved are: body posture, accessing cues, gestures, eye movements, and language patterns. After presenting each category in detail, we will show how they allow us to recognize the building blocks of subjective experience mentioned above, and how we can utilize them to understand the true nature of pragmatic paradoxes in general, and double binds in particular.

References

[1] [Bateson, 72] Bateson, G. 1972. Steps to an Ecology of Mind. Ballantine Books.
[2] [Damasio, 94] Damasio, A.R. 1994. Descartes' Error. Grosset/Putnam.
[3] [Pasztor, 98] Pasztor, A. 1998. Subjective experience divided and conquered. Communication and Cognition 31:1. In E. Myin, Ed. Approaching Consciousness. Part II. 73-102.
[4] [Watzlawick, 67] Watzlawick, P., J.H. Beavin & D.D. Jackson. 1967. Pragmatics of Human Communication. W.W. Norton & Co.

28 Jeremy Horne. Information processing efficiency of logical operators. 15 Cooper Hill Court, Durham, NC 27713.

Underpinning the logic applied to structure arguments and display mathematical relationships is a philosophy explaining why it works. It can be argued that "Logic [is] the language of innate order in consciousness" (Horne 1997). "Logic ... is the theory of order," said James K. Feibleman (Feibleman, Assumptions 89). "It is the ... theory of abstract structures" (Feibleman, Assumptions 14). Stuart Kauffman in his Origins of Order (Kauffman 182-235) relies upon it to explicate his view that there is structure inherent in seemingly randomly generated logical operations. Wheeler, Thorne, Misner, and Piaget (as cited below) say that logic symbolizes inherent order in our universe. Binary logic is a reduction of expression to its simplest components. This paper focuses upon a fundamental component, the operator, and its role in revealing order in ungrouped expressions. I suggest that information processing efficiency can rank the order of logical operators in a parenthesis-free (ungrouped) expression. Efficiency depends upon how rapidly and accurately an operator manages the complexity of the other functions it processes. Similarly, efficient logical thinking often means how fast a person can run through logical alternatives to arrive at a correct conclusion. Each of the 16 operators, or functions, in a two-variable system is a self-maintaining (homeostatic) automaton in logical space. The homeostatic character of the function is displayed when outputs are fed back as inputs so as eventually to repeat, thus providing a basis for measuring how efficiently it processes other information in the system. Prioritization schemes in a parenthesis-free notation might be based on such an efficiency order to measure overall computational efficiency in the system. Before talking about the nature of operators in terms of computational efficiency, we need, of course, to examine the issue of prioritizing the operators.

29 Jazz Rasool. The nature of the mind-body gateway. 53 Broomgrove Gardens, Edgware, Middlesex HA8 5RH, U.K.
29 Jazz Rasool. The Nature of the mind-body gateway. 53 Broomgrove Gardens, Edgware, Middlesex HA8 5RH, U.K.

For many years, interactions between the mind and body of human beings have been limited in their description to the relationship between the psychological, neurological and immunological systems of the physical body. Even when the physics of mind-matter interactions has been discussed, these subjects in turn have not been adequately covered. These investigations also have not taken into account the processes of physiological transformation brought about by non-local influences such as those attributed to absent healing. They also fail to adequately describe the connection between shifts in perspective within consciousness and subsequent physiological transformation. It is the author's hypothesis that the physical representation, activity and evolution of emotion within the spacetime environment of an individual determines the nature of the physiological correlates of mind. It is proposed that focused and coherent states of consciousness associated with states of emotion and conscious intention manifest in specific physiological forms through the fundamental formative media of matter, specifically the quantum vacuum and the quantum gravitational field associated with it. It is proposed by the author that interactions between mind and body processes are facilitated by consciousness cooperating with quantum properties of gravity to transform fundamental aspects of space and time and the associated emergent physiological and material processes. A model is presented that outlines the nature of the interaction between consciousness and the local spacetime continuum and the biological structures, or gateways, that mediate such interactive processes. The gateways are defined according to structures that currently mediate the firing patterns of brain neurons. Concentrations of substances such as Gamma-Aminobutyric Acid (GABA) released by glial components within neuronal architectures are known to affect neuronal sensitivity to firing, and control of such substances is central to the function of many anti-epileptic, anti-convulsant and anaesthetic drugs. The burst firing of cerebral neurons that accompanies epilepsy leads to transformations in consciousness and physiology; there are mind and body changes (change in 'aura', loss of consciousness and physiological convulsions). The release of GABA and other such substances can be influenced by a change in the electron flows along cytoskeletal microtubule protein architectures, which have recently been implicated in playing important roles in the qualitative and quantitative aspects of conscious functioning. The author proposes that these microtubule structures and the water contained within them are the common ground upon which all mind-body transformations take place. It is proposed by the author that effective forms of alternative and complementary medical practices utilise specific forms of burst firing initiated by changes in the electrical properties of the microtubule architectures of not just neurons but all eukaryotic cells. Such alterations lead to simultaneous transformations in traits and properties of both mind and body. The author proposes the development of concepts and the initiation of research in this area of consciousness science that will facilitate more effective forms of medical treatment, adding increased precision and quality to healthcare systems that purport to maintain states of wellbeing within society.

30 Mitja Perus. Neural and quantum complex system dynamics as the background of consciousness.
National Institute of Chemistry, Hajdrihova 19, SI-1001 Ljubljana, Slovenia.

First, a summary of the results of my research on neuro-quantum information-processing analogies will be presented briefly. Then my Hopfield-neural-net-like quantum information processing "algorithm", well tested by computer simulations, will be shown. It was developed as a relatively "natural" and biologically plausible hypothesis. The quantum Hopfield-like network model is based on Feynman's path-integral approach, where a Hebb-like Green-function propagator is used as the kernel of the integral equation

\psi(r_2, t_2) = \int G(r_1, t_1; r_2, t_2)\, \psi(r_1, t_1)\, dr_1, \qquad G(r_1, t_1; r_2, t_2) = \sum_{k=1}^{P} \psi^k(r_1, t_1)^*\, \psi^k(r_2, t_2).   (1)

In a metaphorical description, the overall interaction web ("all quantum points for one, one point for all") is combined with the Hebb-like "self-interference" projection operator (which projects onto memory subspaces), and the system is exposed to inputs. During the learning phase, inputs become encoded as the "network's attractors", or eigenfunctions \psi^k, respectively. During the recall phase, one of them, \psi^{k_0}, is selectively reconstructed if the new input ("key") is similar to that memory pattern \psi^{k_0}. Thus, we split the right side of Eq. (1) into memory subspaces and get the most similar and relevant memory pattern out:

\psi(r_2, t_2) = \psi^{k_0}(r_2, t_2) \int \psi^{k_0}(r_1, t_1)^*\, \psi(r_1, t_1)\, dr_1 + \sum_{k \neq k_0} \psi^{k}(r_2, t_2) \int \psi^{k}(r_1, t_1)^*\, \psi(r_1, t_1)\, dr_1 = \text{recalled pattern} + \text{noise}.   (2)

The quantum density matrix of the form \rho(r_1, r_2) = \sum_{k=1}^{P} \psi^k(r_1)\, \psi^k(r_2)^* could also be used for the description of quantum memory. A discussion of the biological implementation of the proposed information processing procedure will follow. It will be argued that neural nets are needed to "collapse" the wave-function, as in (2), and to discretize the quantum dynamics in order to enable the read-out of the results of quantum "computation". Probably, the brain's neural networks have the role of an interface between the classical environment and quantum-based consciousness, thus regulating the so-called "access consciousness" (the informational aspect of phenomenal consciousness). However, as I will argue, quantum dynamics is needed for associative binding into conscious perceptions, but seems not yet to be enough to explain qualia.

References
For details on Hopfield neural nets see:
Amit, D. 1989. Modeling Brain Functions. CUP, Cambridge.
Perus, M. 1995. In K. Sachs-Hombach, Ed. Bilder im Geiste. 183-194. Rodopi, Amsterdam/Atlanta.
Peruš, M. & P. Ečimovič. 1998. Int. J. Applied Sci. & Computat. 4:283-310.
For details on the transformation of the Hopfield model to the quantum information-processing formalism see:
Perus, M. 1996. Informatica 20:173-183.
Perus, M. 1998. Open Sys. & Inform. Dyn. 5. In press.
Peruš, M. 1998. Advan. Synerget. In press.
Perus, M. 1998. Z. Angew. Math. & Mech. In press.
Peruš, M. 1997. In M. Gams et al., Eds. Mind Versus Computer. 156-170. IOS Press, Amsterdam.
For a discussion of implications for consciousness studies see:
Peruš, M. 1997. Informatica 21:491-506.
Peruš, M. 1998. Informatica 22:95-102.
Peruš, M. 1997. Noetic J. 1:108-113.
Peruš, M. 1997. World Futures: J. Gen. Evol. 51:95-110.
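As a purely classical, illustrative analogue of the Hopfield-like recall scheme sketched in the abstract above (it is not Peruš's quantum algorithm), the following Python/NumPy sketch builds a Hebb-like kernel G as a sum of outer products of stored patterns and propagates a corrupted cue through it; the overlap with the most similar stored pattern dominates the output, while the remaining terms behave as noise. The pattern size, the random patterns and the sign thresholding are assumptions made only for this example.

import numpy as np

rng = np.random.default_rng(0)
n, P = 64, 3                                   # pattern length, number of stored patterns
patterns = rng.choice([-1.0, 1.0], size=(P, n))

# Hebb-like kernel: G = sum over k of the outer product of pattern k with itself, scaled by 1/n.
G = sum(np.outer(p, p) for p in patterns) / n

# Cue: a corrupted copy of pattern 0 with roughly 20% of its entries flipped.
cue = patterns[0].copy()
flip = rng.choice(n, size=n // 5, replace=False)
cue[flip] *= -1

# Recall: propagate the cue through the kernel and threshold. The overlap with the
# most similar stored pattern dominates; the remaining terms act as "noise".
recalled = np.where(G @ cue >= 0, 1.0, -1.0)
overlaps = patterns @ cue / n
print("overlaps of the cue with the stored patterns:", np.round(overlaps, 2))
print("fraction of entries of pattern 0 recovered:", np.mean(recalled == patterns[0]))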
7 Hand-out Topics of Consciousness

31 Cyrus F. Nourani. Intelligent Trees and Consciousness Science. METAAI South California, P.O. Box 278, Cardiff by the Sea, CA 92007, U.S.A.

1. Introduction. Multiagent Computing, Intelligent Syntax, Visual Linguistics, Perception and Multiagent Cognition are the bases for a tree-intelligent Consciousness Science with computational models. The projects consist of Intelligent Trees, Intelligent Languages, Multiagent Computing on Multiboards, Visual-Computational Linguistics, Double Vision Computing, Multimedia Linguistics, Multiagent Cognition, and Multiagent Computational Logic. Intelligent Multimedia techniques and paradigms are defined. The computing techniques, the MIM deductive system and its model theory are presented in brief [1, 5]. The basic application areas we start with as examples are explicating visual perception and consciousness cognition, e.g. [7]. The second application area is based on AI planning. Reasoning and planning are applied to define scene dynamics consciousness based on scene descriptions and compatibility relations. The project allows us to predict scene dynamics. We apply our recent Intelligent Language paradigm and intelligent visual computing paradigms to define IM multiagent multimedia computing as a thought paradigm. The phenomenological and philosophical issues are reviewed in brief and addressed in part by IM. Intelligent Trees and Intelligent Multimedia Models are defined as a basis for thought towards a Consciousness Science. A new computing area is defined by Artificial Intelligence principles for multimedia cognition. The principles defined are practical for artificial intelligence and its applications to cognition and consciousness. What is new is a multiagent tree computing basis for consciousness science and cognition analogous to a Gestalt. Computing, cognitive, epistemological, metamathematical, and philosophical aspects are treated and reviewed in brief. It is further a start for Multimedia Linguistics. Intelligent syntax languages, Nourani [1, 2], are defined and their linguistic parsing theories outlined. A brief overview of context abstraction shows how context-free and context-sensitive properties might be defined by intelligent syntax. The preliminaries to a new computing logic termed MIM-Logic are defined with a brief model theory. The mathematical term for MIM is Morph Gentzen. The application areas are based on available advanced Artificial Intelligence techniques. There are at least a few areas to start with. Artificial Intelligence reasoning and planning can be applied to define content based on personality descriptions and the compatibility relations being viewed. The project allows us to predict scene dynamics before viewing. Some of the applicable techniques, for example G-diagrams for models and AI applications, have been invented and published by the author over the last decade. Multimedia thought models can be combined with intelligent trees and objects to stage and scene definition mind models.

2. Context. A preliminary overview of context abstraction and meta-contextual reasoning is presented from Nourani [2, 3]. Abstract computational linguistics with intelligent syntax, model theory and categories is presented in brief. Designated functions define agents, as in artificial intelligence agents, or represent languages with only an abstract definition known at syntax. For example, a function F_i can be an agent corresponding to a language L_i. L_i can in turn involve agent functions amongst its vocabulary. Thus context might be defined at L_i [2, 3]. Generic diagrams for models are defined as yet a second-order lift from context. The techniques to be presented have allowed us to define a computational linguistics and model theory for intelligent languages. Models for the languages are defined by our techniques. The role of context in knowledge representation systems, particularly in the process of reasoning, is related to the diagram functions defining the relevant world knowledge for a particular context. The relevant world functions can proliferate the axioms and the relevant sentences for reasoning for a context.
A formal computable theory can be defined based on the functions defining computable models for a context.

3. Consciousness Awareness. Heidegger's phenomenology has been applied in our papers since 1993 [11] towards AI Modes for Thought and a computational epistemology for visual consciousness [6]. Heidegger had indicated that the resources by which we conduct our day-to-day activities do not usually require conscious awareness. Resources can present themselves in different modes of being: Available/Unavailable, and Concurrent with respect to day-to-day activities. There has been an intuition for years that consciousness is not, or does not exist as, an ordinary state or process itself, but that it consists in the awareness of such states and processes. Locke epitomizes this intuition: "Consciousness is the perception of what passes in a man's own mind". Introspective consciousness is a perception-like awareness of the current states and activities in our own mind, e.g., sense perception, e.g. [12]. Our project presents morphing as subconscious activity by applying the IM Computing Logic as a Way to Discovery [11]. The Morph Gentzen Computing Logic. The IM Multimedia computing techniques [9] have a computing logic counterpart. The basic principles are a mathematical logic where a Gentzen or natural deduction system is defined by taking multimedia objects coded by diagram functions. We have put forth a basis for a mind computing model in our papers during the last decade for mental leaps [10, 12].

4. Intelligent Models. Intelligent syntax languages are defined and their linguistic parsing theories outlined. A computational logic for intelligent languages is presented in brief, with a soundness and completeness theorem. A brief overview of context abstraction shows how context-free and context-sensitive properties might be defined. The Gentzen system defined on MIM can be assigned an intelligent model theory. The mathematics is presented in [5].

4.1. Diagrammatic and Analogical Models. The IM Morphed Gentzen logic is a foundation for a consciousness logic which can be defined for specific reasoning models, for example analogical reasoning. Analogical reasoning rules are specific deductive designs which can be embedded by the logic. The multiagent cognition project started in 1993 with the Double Vision Computing project [6]. Since then there have been the 1994-95 Abstract Linguistics and Meta-Contextual Reasoning, Intelligent Multimedia, MIM logic, and its consequent paper on Consciousness Science.

4.2. Fundamental Mind as a Philosophical Question. Kant states that consciousness operates in the dichotomy between the thinking subject and the thought object. There are two components of knowledge: spontaneity and receptivity. In cognition, Kant's spontaneity is not in the void. If acts of thought have objective bearing, it is because they are filled with intuition of something that is given to the person, towards which the person is receptive. Whenever we know something through intuition, something is given to us. Kant calls it sensibility. Intuition is never anything but sensuous [8]. The explanatory question is whether consciousness can be explained by physical theories. The ontological question is whether consciousness itself is physical. The Consciousness Science formalized by our project since 1993 has applied positivist thought.
However, we have taken Kant's sensibility with metamathematical precision up to infinitary models and logics, diagrammatic techniques, a Multiagent Cognitive Theory, dynamic epistemics, and the Morph Gentzen or MIM logic towards such a science.

References
[1] Nourani, C.F. 1998. Intelligent trees, consciousness science, thought models, and intelligent discovery. Private communication.
[2] Nourani, C.F. 1996. Slalom Tree Computing. AI Communications, December 1996. IOS Press, Amsterdam.
[3] Nourani, C.F. 1998. Intelligent languages: a preliminary syntactic theory. MFCS Grammatics, Prague.
[4] Gentzen, G. 1943. Beweisbarkeit und Unbeweisbarkeit von Anfangsfällen der transfiniten Induktion in der reinen Zahlentheorie. Math. Ann. 119:140-161.
[5] Nourani, C.F. 1998. MIM-Logik. Summer Logic Colloquium, Prague.
[6] Nourani, C.F. 1995. Double Vision Computing. IAS-4, Intelligent Autonomous Systems, Karlsruhe, Germany.
[7] Chalmers, D.J. 1996. The Conscious Mind: In Search of a Fundamental Theory. Oxford University Press, Oxford, U.K.
[8] Kant, I. 1990. Critique of Pure Reason. Translated by J.M.D. Meiklejohn.
[9] Nourani, C.F. 1996. Intelligent multimedia: a new computing technique and new AI paradigms.
[10] Holyoak, K.J. & P. Thagard. Mental Leaps: Analogy in Creative Thought. MIT Press, Cambridge.
[11] Nourani, C.F. 1994. Towards Computational Epistemology: A Forward. Summer Logic Colloquium, Clermont-Ferrand, France.
[12] Llinás, R. & P.S. Churchland, Eds. 1996. The Mind-Brain Continuum. MIT Press.

32 Tony Wright. The expansion and degeneration of human consciousness. c/o Steve Charter, Lockyers Farm, Peak Lane, Compton Dundon, Somerton, Somerset, TAH 6PE, England.

Whether in ancient texts or in oral and mythic traditions, virtually all cultures preserve a story that suggests humanity was once very different: operating at a higher level of perception, long lived, free from disease, naked, forest dwelling, fruit eating, residing in 'paradise'. This golden age ends with catastrophe on a global scale; those that survive, and subsequent generations, increasingly suffer from the kind of difficulties we recognise as normal today. Some traditions talk of several ages of man, each ending in disaster, with each new age bringing a further slide back to our animalistic origins. A new theory suggests these traditions tell a very real story. Drawing on observations and anomalies in modern science, ancient wisdom and experimental techniques, a picture emerges that is simple at its core yet virtually impossible to detect: a catch-22 with an unexpected twist, at first sight perhaps difficult to assimilate with our apparently advanced technological, scientific culture. In simple terms, the evidence suggests that we have suffered a devastating long-term decline in brain function affecting the most delicate aspects of neural structure. The damage is primarily restricted to one half of the brain; however, unfortunately this is always the dominant half. This creates a severely distorted experience of who or what we are that is usually impossible to escape from. Basically, the equipment we use to assess how we feel or what we experience is flawed and is unable to detect its own damage. As the immune system, hormone system, assimilation and just about everything else is effectively run by the brain, a somewhat disturbing picture emerges. The primary cause of this damage? Testosterone: one of our own powerful growth-regulating hormones, produced by both males and females.
Unleashed from millions of years of powerful inhibiting chemicals, it now wreaks havoc on the new brain, distorting its unique structure and the biochemistry necessary for optimum function. The damage starts, and is at its greatest, in the earliest phase of development in the uterus and continues unabated throughout a lifetime, rapidly accelerating with puberty in males and menopause in females. Damage is greater in males as more testosterone is produced. When our distant tree-living ancestor switched from eating insects to fruit, flowers, leaves and shoots, its own hormone regime was significantly affected by a powerful and unique cocktail of thousands of chemicals. This complex mixture had many properties: very rich in simple sugar (rocket fuel for the brain), containing chemicals that enhance neurotransmitter activity, relatively low in protein, high pH, etc. In particular, such a cocktail is rich in chemicals that can inhibit steroids such as oestrogen and, in particular, testosterone. This is a very significant effect, as these powerful steroids effectively dictate what parts of the DNA code are read; this in turn governs exactly how an organism develops and grows. The effects may have been minor at first; however, what emerged was a subtle loosening of the restrictive effects of the steroids, allowing a gradual, somewhat chaotic, tumour-like proliferation of the relatively sensitive brain cells. In addition, the same chemicals begin to extend the juvenile period prior to sexual maturity, when those same steroids effect a range of physical and chemical changes. An extending period of juvenility is key to brain development, as this is the period in which the brain continues to grow rapidly. Humans are regarded as having an extremely long period of juvenility and retain some traits through to maturity. What emerged over many millions of years was a new layer of brain with a different structure. As the brain grew larger and began to dominate the earlier, more structured layers, it began to take effective control of the whole organism, including the internal hormone system. In what was to become the direct ancestor of modern humans, the new brain reached a critical mass in terms of size and ability to function sustainably and cohesively at highly energised levels. This may have been linked to an increasing specialisation in fruit eating, the most chemically optimum of foods for gradual, high-quality brain growth and function. At some point, perhaps two or three million years ago, the new level of brain function and activity began to induce key internal changes in the production of hormones, particularly by the pineal gland, increasing the amount and period of melatonin production. This hormone has many powerful effects on longevity, immune function, juvenility, etc., in part because it has a strongly inhibiting effect on testosterone. There were now two layers of testosterone suppression: one a sustained external effect, as part of an immensely complex bio-chemical cocktail that provided the raw materials for building the human brain and body for millions of years; the other, perhaps the key to many unique human features, an extremely powerful and increasing internal addition, entirely dependent on a new level of brain activity and underpinned by the sustained flood of external chemistry. These were the major elements that set human evolution apart: primarily a brain/hormone-led development that created its own increasingly powerful feedback loop. The new brain tripled in size in two million years.
Then, not for the first or last time, came ecological catastrophe on a global scale. The thin, solidified top layer of the earth, pulled by centrifugal force, slipped a few degrees around the liquid centre. This creates rapid movement of climatic zones, massive inundation of low-lying areas as oceans spill over their basins, earthquakes, volcanic activity and extreme wind speeds. The relatively fragile tropical forest ecosystems are virtually destroyed, and with them the only source of year-round fruit. The crucial chemical foundation in human evolution had become an Achilles heel. The external inhibition of testosterone and the optimum chemistry for brain function are lost, which in turn reduces the production of melatonin: a double blow. Damage through increased exposure to testosterone is greatest in the uterus as the brain develops, producing successive generations with a brain/hormone regime in which each is less able to maintain continued expansion. Although the damage is primarily restricted to one half of the brain, its rapid loss of ability equates to increasing fear and a drive to dominate and control what can no longer be experienced or understood. The domination by the dysfunctional half of the brain merely accelerates the decline, as it is much less able to maintain the chemistry necessary for continued development. Historically this may have been a slow war of attrition, with the remaining function ever more dominated and suppressed by dysfunction. This imbalance severely distorted the unique level of consciousness and led eventually to physical symptoms such as handedness. The same basic patterns still occur very early on during a lifetime. In this way the period of exponential development rapidly spiralled into self-perpetuating degeneration. In a bid to halt or reverse the damage, a range of ingenious techniques were developed (e.g. mystical, shamanic, etc.). They centred around replacing some of the missing bio-chemistry, replacing the key dysfunctional elements, and various approaches to induce greater function in the less damaged half of the brain while reducing the dominance of the severely damaged half. A combination of these techniques may have slowed or halted the damage in the individual, creating a sense of a profound shift in experience; however, without a clear understanding of the underlying causes and the need to prevent hormonal damage in the uterus, a sustainable solution did not emerge. In more recent times (the last ten to fifteen thousand years) the relic and fragmented techniques that have survived can do little more than hint at something beyond understanding. Relatively profound and life-altering, these experiences are now more an indicator of the degree of dysfunction that is considered 'normal' than a significant experience of a lost state of being. We are now dominated to varying degrees by cleverly disguised fear and a desperate need for familiarity and a sense of safety, and driven to great lengths to control any perceived threat to those damaged needs. A bleak picture indeed. But perhaps the first step to redemption is acknowledging the severity and nature of the problem. With the depth of specialist understanding in modern science brought to bear in a holistic approach, the symptoms and underlying cause could be sustainably reversed within a generation.
If there is the slightest possibility that our self-perception equipment is damaged, then attempting to consider anything other than repairing the damage as an absolute priority would perhaps be the most telling symptom of the severity of the damage. Where does this leave us? Certainly, it leaves us with a need to undertake more, and possibly very challenging, research. Clues that support, or at least do not seem to contradict, some kind of severe dysfunction crop up in a whole range of places. One unlikely source is human sleep, where orthodox teaching suggests how important sufficient sleep is, yet the research into its function paints a less than clear picture. Looking for clarity is not helped by apparently wise people who advocate the use of short or extended periods without sleep in order to access greater function. Who will assemble the gods unto thee, that thou mayest find the life which thou seekest? Come, do not sleep for six days and seven nights. (From the Epic of Gilgamesh.) Despite (or in spite of) the general perception that plenty of regular sleep is necessary for normal function, a range of clues suggests all may not be as it seems. From sleep walking to the deep insights that can emerge through dreams and the sleeping prophet (Edgar Cayce), many anomalies remain unexplained. Would the use of reduced or no sleep in many spiritual approaches make sense if there were a great difference in the need for sleep between the two halves of the brain? The mature, rapidly evolving new brain, capable of increasingly highly energised states, no longer had any need, or had a greatly reduced need, for re-charging (sleep). What little, if any, was required may have been taken by each half of the brain in turn, in a similar way to dolphins. As dysfunction progressed there was a regression to a primitive need to shut down for increasingly long periods, as the ability to maintain sustained activity was eroded. As this was primarily a problem for only one half of the brain, it is not difficult to see a means of exploiting this weakness. By starving the damaged half of its recharge time for long enough to run its batteries down, its dysfunction increases while its ability to suppress and maintain control declines. The opportunity to stimulate and re-engage the potentially functional half while free from its jailer may be the basis of many powerful techniques now mostly lost. Anything like fully cohesive function would only be possible if the missing evolutionary bio-chemistry had been replaced. Without the necessary chemistry (e.g. diet), the primary experience is of increasing dysfunction and tiredness, with a few hints of unusual experience. So, in reality, it may be that 'normal' sleep keeps the dysfunctional half of the brain in control.

Call for Papers
Advances in the Theory and Practice of Natural Language Processing
Special Issue of Informatica 22 (1998) No. 4

Informatica, an International Journal for Computing and Informatics, announces the Call for Papers for an interdisciplinary volume dedicated to the theoretical and practical aspects of natural language (NL) analysis and generation. This special issue is intended to be a forum for presenting, first of all, the theoretical ideas that have proved to be effective in the design of NL processing systems or that promise to considerably extend the sphere of successful NLP applications.

TOPICS: Original papers are invited in all subareas and on all aspects of NLP, especially on:
1. The current state and advancements in the last five years in particular subfields of NLP.
2. Natural-language-like knowledge and meaning representation formal systems.
3. Formal approaches to describing the conceptual structures of complicated real discourses (pertaining, e.g., to medicine, technology, law, business, etc.).
4. New logics for NLP.
5. Semantics-oriented methods of natural language analysis; conceptual information retrieval in textual data bases.
6. Computational lexical semantics; ontologies for NLP.
7. Understanding of metaphors and metonymy.
8. Anaphora resolution.
9. Generation of natural language discourses.
10. Parallel conceptual processing of natural language texts.
11. Intelligent text summarization.
12. New directions in NLP.

Informatica 22 (1998) No. 4, in an enlarged volume, is fixed as the special issue.

Time Table and Contacts
The deadline for the submission of papers in four copies is July 30, 1998. Printed-paper mail address: Prof. A.P. Zeleznikar, Jozef Stefan Institute, Jamova c. 39, SI-1111 Ljubljana, Slovenia.
Correspondence (e-mail addresses):
- anton.p.zeleznikar@ijs.si Prof. Anton P. Železnikar, Slovenia
- vaf@nw.math.msu.su Prof. Vladimir A. Fomichov, Russia
- kitano@csl.sony.co.jp Prof. Hiroaki Kitano, Japan

Format and Reviewing Process
As a rule, papers should not exceed 8,000 words (including figures and tables but excluding references; a full-page figure should be counted as 500 words). Ideally, 5,000 words are desirable. Each paper will be reviewed by at least two anonymous referees outside the author's country and by the appropriate editors. In case a paper is accepted, its author (authors) will be asked to transform the manuscript into the Informatica style (available from ftp.arnes.si; directory: /magazines/informatica). For more information about Informatica and the Special Issue see FTP: ftp.arnes.si with anonymous login or URL: http://turing.ijs.si/Mezi/informat.htm.

Second Call for Papers
Special Issue on SPATIAL DATABASES
INFORMATICA, an International Journal of Computing and Informatics
http://orca.st.usm.edu/informatica

Edited by:
Maria A. Cobb, University of Southern Mississippi, Department of Computer Science & Statistics, Hattiesburg, MS 39406-5106, maria.cobb@usm.edu
Frederick E. Petry, Center for Intelligent and Knowledge-Based Systems, Tulane University, New Orleans, LA 70118, petry@eecs.tulane.edu

A special issue of Informatica on the topic of spatial databases is planned for publication as Vol. 23, No. 4. Max Egenhofer (University of Maine, NCGIA) has agreed to write a special introductory article for the issue. Others who have tentatively agreed to contribute to the issue include Robert Laurini and Alan Saalfeld. We extend invitations for papers in all areas related to spatial databases, including, but not limited to, the following:
- spatial data and relationship models
- distributed spatial databases
- visualization
- querying
- uncertainty management
- GIS
- spatio-temporal reasoning

Dates: Full papers are due on December 1, 1998. Author notification is set for March 31, 1999. Final revisions of accepted papers are due September 30, 1999 (in LaTeX format). Authors interested in submitting a paper for the issue should contact one of the guest editors listed above for submission details.

Machine Learning List
The Machine Learning List is moderated. Contributions should be relevant to the scientific study of machine learning. Mail contributions to ml@ics.uci.edu. Mail requests to be added or deleted to ml-request@ics.uci.edu.
Back issues may be FTP'd from ics.uci.edu in pub/ml-list/V/ or N.Z, where X and N are the volume and number of the issue; ID: anonymous, PASSWORD:
URL: http://www.ics.uci.edu/AI/ML/Machine-Learning.html

CC AI
The journal for the integrated study of Artificial Intelligence, Cognitive Science and Applied Epistemology. CC-AI publishes articles and book reviews relating to the evolving principles and techniques of Artificial Intelligence, as enriched by research in such fields as mathematics, linguistics, logic, epistemology, the cognitive sciences and biology. CC-AI is also concerned with developments in the areas of hardware and software and their applications within AI.
Editorial Board and Subscriptions: CC-AI, Blandijnberg 2, B-9000 Ghent, Belgium. Tel.: (32) (9) 264.39.52, Telex RUGENT 12.754, Telefax: (32) (9) 264.41.97, e-mail: Carine.Vanbelleghem@RUG.AC.BE

Preliminary Call for Contributions
First Southern Symposium on Computation
December 4-5, 1998, University of Southern Mississippi, Hattiesburg, Mississippi
Organized by: the School of Mathematical Sciences of the University of Southern Mississippi, in cooperation with the Department of Mathematics and Statistics and the Department of Computer Science of Mississippi State University and the Department of Computer Science of Louisiana State University.
The aim of the conference is to bring together researchers in ALL areas of computation and, in an informal atmosphere, attempt to develop links between its various threads.
Keynote Addresses (Preliminary List): G. Fairweather (Colorado School of Mines) - tentative, S. Iyengar (Louisiana State University), P. Lax (Courant Institute) - tentative, F. Petry (Tulane University), S. Piacsek (Stennis Space Center), T. Skjellum (Mississippi State University), L. Welch (UT Arlington) - tentative, J. Zhu (Mississippi State University).
Program Committee: A. Al-Dhelaan (King Saud University, Saudi Arabia), I. Banicescu (Mississippi State University, USA), I. Bar-On (Technion University, Israel), V. Berinde (North University of Baia Mare, Romania), A. Blazhievskiy (University of Podillia, Ukraine), C. Breshears (Waterways Experiment Station MSRC, Vicksburg, USA), I. Gladwell (Southern Methodist University, USA), M. Gams (Jozef Stefan Institute, Slovenia), J. Leszczynski (Jackson State University, USA), X. Li (Loyola University, New Orleans, USA), L. Mate (Technical University of Budapest, Hungary), N. Mastorakis (Hellenic Naval Academy, Greece), F. Mazzia (University of Bari, Italy), S-I. Morar (DeMontfort University, UK), S. Oppenheimer (Mississippi State University, USA), P. Schmidt (University of Akron, USA), I. Sivergina (Jekaterinburg Tech, Russia), P. Stpiczynski (UMCS, Poland), T. Taha (University of Georgia, USA), J. Tyler (Louisiana State University, USA), P. Yalamov (University of Rousse, Bulgaria), W. Zhang (Louisiana Tech, USA).
Organizing Committee: M. Paprzycki (Chair), A. Ali, D. Ali, C. Burgess, M. Cobb, J. Ding, J. Kolibal, J. Lee, M. Mascagni, R. Necaise, R. Pandey, L. Perkins, W. Russell and R. Seyfarth.
We invite contributions covering all areas of computation. Special sessions are welcome. Contributions from graduate students are also invited.
Important Dates: Special session proposals due: September 30, 1998. Extended abstracts (1 page) due: October 31, 1998. Acceptance: November 15, 1998. Papers for proceedings: November 30, 1998. Electronic proceedings will be published.
Information about the meeting will be updated on the conference WWW site, to be located at http://pax.st.usm.edu/cmi/fscc98.html. To obtain more information about the meeting, send e-mail to fscc98@pax.st.usm.edu.

Call For Papers
8th International Conference on Computer Analysis of Images and Patterns, CAIP'99
Ljubljana, Slovenia, 1-3 September 1999

Conference Cochairs: Franc Solina, Aleš Leonardis, University of Ljubljana, Faculty of Computer and Information Science, Tržaška 25, 1001 Ljubljana, Slovenia. Tel: +386 61 1768 389, Fax: +386 61 1264 647, E-mail: franc.solina@fri.uni-lj.si, ales.leonardis@fri.uni-lj.si

Program Committee: S. Ablameyko, Belarus; J. Arnspang, Denmark; R. Bajcsy, USA; I. Bajla, Slovakia; A. M. Bruckstein, Israel; V. Chernov, Russia; D. Chetverikov, Hungary; A. Del Bimbo, Italy; J. O. Eklundh, Sweden; V. Hlavač, Czech Republic; J. Kittler, United Kingdom; R. Klette, New Zealand; W. Kropatsch, Austria; A. Leonardis, Slovenia; R. Mohr, France; M. Schlesinger, Ukraine; W. Skarbek, Poland; F. Solina, Slovenia; G. Sommer, Germany; L. Van Gool, Belgium; M. A. Viergever, Netherlands; S. W. Zucker, USA.

Call for Papers
The CAIP conference is a traditional Central European conference devoted to all aspects of computer vision, image analysis, pattern recognition and related fields. The conference is sponsored by IAPR, the Slovenian Pattern Recognition Society, the IEEE Slovenia Section, the Faculty of Computer and Information Science at the University of Ljubljana, and Hermes SoftLab. The scientific program of the conference will consist of plenary lectures by invited speakers, contributed papers presented in two parallel sessions, and posters. The CAIP proceedings are published by Springer Verlag in the series Lecture Notes in Computer Science and will be distributed to the participants at the conference.

Scope of the Conference
- Image Analysis
- Computer Vision
- Pattern Recognition
- Medical Imaging
- Network Centric Vision
- Augmented Reality
- Image and Video Indexing
- Industrial Applications

Instructions to Authors
Authors who wish to present a paper at the conference should send five copies of their paper to one of the two conference chairs, marked CAIP'99. To enable double-blind review there should be two title pages. The first should contain the title, the author's name, affiliation and address, telephone, fax and e-mail, an abstract of 200 words and up to three keywords. The second title page should consist only of the title, abstract and keywords. The papers, excluding the title pages, should not be longer than 10 pages. On a separate page the authors should answer the following three questions about their paper: (a) what is the original contribution?, (b) what is the most similar work?, (c) why is their work relevant to others? A template for the camera-ready version will be available on the conference home page. During the CAIP'99 review period the authors should not submit any related paper with essentially the same content to any other conference.
Deadline for submission of papers: 15 January 1999.

Registration
Information on registration will be available on the conference home page.

Venue
The conference will be held at the Faculty of Computer and Information Science at the University of Ljubljana. Ljubljana is the capital of Slovenia. The city, which is a lively mixture of Mediterranean and northern influences, offers all amenities within a short distance. Alpine resorts, the Adriatic coast and several natural spas are close to Ljubljana.
The conference homepage is: http://razor.fri.uni-lj.si/CAIP99

Errata
Due to an error, the bibliography was omitted from the paper "Advances in Computer Assisted Image Interpretation" by W. Mees and C. Perneel. We sincerely apologise.

Preliminary Call For Papers
TIME-99: Sixth International Workshop on Temporal Representation and Reasoning
Orlando, Florida, USA, May 1-2, 1999

The purpose of this workshop is to bring together active researchers in the area of temporal representation and reasoning in Artificial Intelligence. Previous workshops have been successful at bridging the gaps between theoretical and applied research in temporal reasoning. The workshop is planned as a two-day event to immediately precede FLAIRS-99 (12th Annual Florida Artificial Intelligence Research Symposium, May 3-5). Visit the TIME-99 webpage at http://www.doc.mmu.ac.uk/STAFF/C.Dixon/time99/time99.html or the FLAIRS-99 webpage at http://erau.db.erau.edu/towhid/flairs99.html for more details. TIME-99 will be conducted as a combination of paper presentations, an extended poster session, and invited talks. Submissions of high-quality papers describing mature results or on-going work are invited for all areas of temporal representation and reasoning, including, but not limited to:
- temporal logics and ontologies
- temporal constraint reasoning
- temporal languages and architectures
- expressive power versus tractability
- belief and uncertainty in temporal knowledge
- temporal databases
- spatiotemporal reasoning and spatiotemporal databases
- temporal learning and discovery
- reasoning about actions and events
- time and nonmonotonism
- time in problem solving (e.g. diagnosis, scheduling, ...)
- time in multiple agents, communication, and synchronization
- other applications

To maximize interaction among participants, the size of the workshop will be limited. Accepted papers will be invited for full presentation or a poster presentation. All submissions must be received by December 6, 1998. Prospective participants should submit a 6-8 page paper (indicating the selected areas), preferably electronically. To do so, send a postscript file (generated from a .dvi file) to one of the following e-mail addresses: C.Dixon@doc.mmu.ac.uk or M.Fisher@doc.mmu.ac.uk. Alternatively, send 5 hard copies of your submission to: Clare Dixon or Michael Fisher, TIME-99 Program Chairs, Department of Computing and Mathematics, Manchester Metropolitan University, Manchester M1 5GD, United Kingdom.

Publication
All accepted papers will be published in the workshop proceedings. In addition, a selected subset of the papers will be invited for inclusion (subject to refereeing) in a book or in a special issue of a journal.
Organizing Committee: James Allen, General Chair; Clare Dixon, PC Co-Chair; Michael Fisher, PC Co-Chair; Howard Hamilton, Treasurer/FLAIRS Liaison; Scott Goodwin, Organizing Committee Member.

Program Committee (Preliminary List): Frank Anger, National Science Foundation, USA; Mark Boddy, Honeywell Systems and Research Centre, USA; Iliano Cervesato, Stanford University, USA; Luca Chittaro, Università di Udine, Italy; Jan Chomicki, Monmouth University, USA; Philippe Dague, Université Paris-Nord, France; Marc Denecker, Katholieke Universiteit Leuven, Belgium; Clare Dixon, Manchester Metropolitan University, UK; Marcelo Finger, Universidade de Sao Paulo, Brazil; Michael Fisher, Manchester Metropolitan University, UK; Dov Gabbay, Kings College, London, UK; Scott Goodwin, University of Regina, Canada; Hans Guesgen, University of Auckland, NZ; Peter Haddawy, University of Wisconsin-Milwaukee, USA; Patrick Hayes, University of West Florida, USA; Lina Khatib, Florida Institute of Technology, USA; Peter Ladkin, Universitaet Bielefeld, Germany; Gerard Ligozat, Université Paris-Sud, France; Ron van der Meyden, University of Technology, Sydney, Australia; Angelo Montanari, Università di Udine, Italy; Bernhard Nebel, Albert-Ludwigs-Universitaet Freiburg, Germany; Han Reichgelt, University of the West Indies, Jamaica; Mark Reynolds, Murdoch University, Perth, Australia; Abdul Sattar, Griffith University, Australia; Yuval Shahar, Stanford University, USA; Paolo Terenziani, University of Torino, Italy; Andre Trudel, Acadia University, Canada; Thierry Vidal, National Engineering School of Tarbes (ENIT), France; Lluis Vila, Technical University of Catalonia, Spain.

Summary of Important Dates: December 6, 1998 - submission deadline; May 1-2, 1999 - TIME-99 Workshop; May 3-5, 1999 - FLAIRS-99 Conference. [Dates for notification of acceptance and the deadline for camera-ready copy are to be decided, but will be around mid-February and mid-March respectively.]

THE MINISTRY OF SCIENCE AND TECHNOLOGY OF THE REPUBLIC OF SLOVENIA
Address: Slovenska 50, 1000 Ljubljana. Tel.: +386 61 1311 107, Fax: +386 61 1324 140. WWW: http://www.mzt.si
Minister: Lojze Marinček, Ph.D.
The Ministry also includes:
The Standards and Metrology Institute of the Republic of Slovenia. Address: Kotnikova 6, 61000 Ljubljana. Tel.: +386 61 1312 322, Fax: +386 61 314 882.
Slovenian Intellectual Property Office. Address: Kotnikova 6, 61000 Ljubljana. Tel.: +386 61 1312 322, Fax: +386 61 318 983.
Office of the Slovenian National Commission for UNESCO. Address: Slovenska 50, 1000 Ljubljana. Tel.: +386 61 1311 107, Fax: +386 61 302 951.

Scientific, Research and Development Potential: The Ministry of Science and Technology is responsible for R&D policy in Slovenia and for controlling the government R&D budget in compliance with the National Research Program and the Law on Research Activities in Slovenia. The Ministry finances or co-finances research projects through public bidding, while it directly finances some fixed costs of the national research institutes. According to statistics based on OECD (Frascati) standards, national expenditure on R&D rose from 1,6% of GDP in 1994 to 1,71% in 1995. Table 3 shows the incomes of R&D organisations in million USD.
Objectives of R&D policy in Slovenia:
- maintaining the high level and quality of scientific and technological research activities;
- stimulation and support of collaboration between research organisations and the business, public, and other sectors;
- stimulating and supporting scientific and research disciplines that are relevant to Slovenian national authenticity;
- co-financing and tax exemption for enterprises engaged in technical development and other applied research projects;
- support for human resources development, with emphasis on young researchers; involvement in international research and development projects;
- transfer of knowledge, technology and research achievements into all spheres of Slovenian society.

Total investments in R&D (% of GDP)     1,71
Number of R&D organisations              297
Total number of employees in R&D      12.416
Number of researchers                  6.094
Number of Ph.D.                        2.155
Number of M.Sc.                        1.527
Table 1: Some R&D indicators for 1995

                    Ph.D.              M.Sc.
               1993  1994  1995   1993  1994  1995
Bus. Ent.        51    93   102    196   327   330
Gov. Inst.      482   574   568    395   471   463
Priv. np Org.    10    14    24     12    25    23
High. Edu.     1022  1307  1461    426   772   711
TOTAL          1565  1988  2155   1029  1595  1527
Table 2: Number of employees with Ph.D. and M.Sc.

                            Basic Research   Applied Research   Exp. Devel.       Total
                             1994    1995     1994    1995     1994   1995    1994    1995
Business Enterprises          6,6     9,7     48,8    62,4     45,8   49,6    101,3   121,7
Government Institutes        22,4    18,6     13,7    14,3      9,9    6,7     46,1    39,6
Private non-profit Org.       0,3     0,7      0,9     0,8      0,2    0,2      1,4     1,7
Higher Education             17,4    24,4     13,7    17,4      8,0    5,7     39,1    47,5
TOTAL                        46,9    53,4     77,1    94,9     63,9   62,2    187,9   210,5
Table 3: Incomes of R&D organisations by sectors in 1994 and 1995 (in million USD)

Table source: Slovene Statistical Office.

JOŽEF STEFAN INSTITUTE
Jožef Stefan (1835-1893) was one of the most prominent physicists of the 19th century. Born to Slovene parents, he obtained his Ph.D. at Vienna University, where he was later Director of the Physics Institute, Vice-President of the Vienna Academy of Sciences and a member of several scientific institutions in Europe. Stefan explored many areas in hydrodynamics, optics, acoustics, electricity, magnetism and the kinetic theory of gases. Among other things, he originated the law that the total radiation from a black body is proportional to the 4th power of its absolute temperature, known as the Stefan-Boltzmann law.
The Jožef Stefan Institute (JSI) is the leading independent scientific research institution in Slovenia, covering a broad spectrum of fundamental and applied research in the fields of physics, chemistry and biochemistry, electronics and information science, nuclear science and technology, energy research and environmental science. The Jožef Stefan Institute (JSI) is a research organisation for pure and applied research in the natural sciences and technology. Both are closely interconnected in research departments composed of different task teams. Emphasis in basic research is given to the development and education of young scientists, while applied research and development serve for the transfer of advanced knowledge, contributing to the development of the national economy and society in general.
The Institute is located in Ljubljana, the capital of the independent state of Slovenia. The capital today is considered a crossroad between East, West and Mediterranean Europe, offering excellent productive capabilities and solid business opportunities, with strong international connections.
Ljubljana is connected to important centers such as Prague, Budapest, Vienna, Zagreb, Milan, Rome, Monaco, Nice, Bern and Munich, all within a radius of 600 km.
In the last year, on the site of the Jožef Stefan Institute, the Technology Park "Ljubljana" has been proposed as part of the national strategy for technological development, to foster synergies between research and industry, to promote joint ventures between university bodies, research institutes and innovative industry, to act as an incubator for high-tech initiatives and to accelerate the development cycle of innovative products. At the present time, part of the Institute is being reorganized into several high-tech units supported by and connected within the Technology Park at the Jožef Stefan Institute, established as the beginning of a regional Technology Park "Ljubljana". The project is being developed at a particularly historical moment, characterized by the process of state reorganisation, privatisation and private initiative. The national Technology Park will take the form of a shareholding company and will host an independent venture-capital institution.
At present the Institute, with a total of about 700 staff, has 500 researchers, about 250 of whom are postgraduates, over 200 of whom have doctorates (Ph.D.), and around 150 of whom have permanent professorships or temporary teaching assignments at the Universities. In view of its activities and status, the JSI plays the role of a national institute, complementing the role of the universities and bridging the gap between basic science and applications.
Research at the JSI includes the following major fields: physics; chemistry; electronics, informatics and computer sciences; biochemistry; ecology; reactor technology; applied mathematics. Most of the activities are more or less closely connected to information sciences, in particular computer sciences, artificial intelligence, language and speech technologies, computer-aided design, computer architectures, biocybernetics and robotics, computer automation and control, professional electronics, digital communications and networks, and applied mathematics.
The promoters and operational entities of the project are the Republic of Slovenia, Ministry of Science and Technology and the Jožef Stefan Institute. The framework of the operation also includes the University of Ljubljana, the National Institute of Chemistry, the Institute for Electronics and Vacuum Technology and the Institute for Materials and Construction Research, among others. In addition, the project is supported by the Ministry of Economic Relations and Development, the National Chamber of Economy and the City of Ljubljana.
Jožef Stefan Institute, Jamova 39, 61000 Ljubljana, Slovenia
Tel.: +386 61 1773 900, Fax: +386 61 219 385, Tlx.: 31 296 JOSTIN SI
WWW: http://www.ijs.si, E-mail: matjaz.gams@ijs.si
Contact person for the Park: Iztok Lesjak, M.Sc.
Public relations: Natalija Polenec Informatica WWW: turing.ijs.si/Mezi/ìnformatica.htm Referees: Witold Abramowicz, David Abramson, Kenneth Aizawa, Suad Alagić, Alan Aliu, Richard Amoroso, John Anderson, Hans-Jurgen Appelrath, Grzegorz Bartoszewicz, Catriel Beeri, Daniel Beech, Fevzi Belli, Istvan Berkeley, Azer Bestavros, Balaji Bharadwaj, Jacek Blazewicz, Laszlo Boeszoermenyi, Damjan Bojadžijev, Jeff Bone, Ivan Bratko, Jerzy Brzezinski, Marian Bubak, Leslie Burkholder, Frada Burstein, Wojciech Buszkowski, Netiva Caftori, Jason Ceddia, Ryszard Choras, Wojciech Cellary, Wojciech Chybowski, Andrzej Ciepielewski, Vie Ciesielski, David Cliff, Travis Craig, Noel Craske, Tadeusz Czachorski, Milan Češka, Andrej Dobnikar, Sait Dogru, Georg Dorfner, Ludoslaw Drelichowski, Matija Drobnič, Maciej Drozdowski, Marek Druzdzel, Jozo Dujmović, Pavol Duriš, Hesham El-Rewini, Pierre Flener, Wojciech Fliegner, Terrence Forgarty, Hans Fraaije, Hugo de Garis, Eugeniusz Gatnar, James Geller, Michael Georgiopolus, Jan Golinski, Janusz Gorski, Georg Gottlob, David Green, Herbert Groiss, Inman Harvey, Elke Hochmueller, Rod Howell, Tomaš Hruška, Alexey Ippa, Ryszard Jakubowski, Piotr Jedrzejowicz, Eric Johnson, Polina Jordanova, Djani Juričič, Sabhash Kak, Li-Shan Kang, Roland Kaschek, Jan Kniat, Stavros Kokkotos, Kevin Korb, Gilad Koren, Henryk Krawczyk, Ben Kroese, Zbyszko Krolikowski, Benjamin Kuipers, Matjaž Kukar, Aarre Laakso, Phil Laplante, Bud Lawson, Ulrike Leopold-Wildburger, Joseph Y-T. Leung, Alexander Linkevich, Raymond Lister, Doug Locke, Peter Lockeman, Matija Lokar, Jason Lowder, Andrzej Malachowski, Peter Marcer, Andrzej Marciniak, Witold Marciszewski, Vladimir Marik, Jacek Martinek, Tomasz Maruszewski, Florian Matthes, Timothy Menzies, Dieter Merkl, Zbigniew Michalewicz, Roland Mittermeir, Madhav Moganti, Angelo Montanari, Tadeusz Morzy, Daniel Mossé, John Mueller, Hari Narayanan, Elzbieta Niedzielska, Marian Niedq'zwiedzinski, Jaroslav Nieplocha, Jerzy Nogieć, Stefano Nolfi, Franc Novak, Antoni Nowakowski, Adam Nowicki, Tadeusz Nowicki, Hubert Osterle, Wojciech Olejniczak, Jerzy Olszewski, Cherry Owen, Mieczyslaw Owoc, Tadeusz Pankowski, Mitja Peruš, Warren Persons, Stephen Pike, Niki Pissinou, Ullin Place, Gustav Pomberger, James Pomykalski, Gary Preckshot, Dejan Raković, Cveta Razdevšek Pučko, Ke Qiu, Michael Quinn, Gerald Quirchmayer, Luc de Raedt, Ewaryst Rafajlowicz, Sita Ramakrishnan, Wolf Rauch, Peter Rechenberg, Felix Redmill, David Robertson, Marko Robnik, Ingrid Rüssel, A.S.M. Sajeev, Bo Sanden, Vivek Sarin, Iztok Savnik, Walter Schempp, Wolfgang Schreiner, Guenter Schmidt, Heinz Schmidt, Denis Sever, William Spears, Hartmut Stadtler, Janusz Stoklosa, Przemyslaw Stpiczynski, Andrej Stritar, Maciej Stroinski, Tomasz Szmuc, Zdzislaw Szyjewski, Jure Šile, Metod Škarja, Jifi Šlechta, Zahir Tari, Jurij Tasič, Piotr Teczynski, Stephanie Teufel, Ken Tindell, A Min Tjoa, Wieslaw Traczyk, Roman Trobec, Marek Tudruj, Andrej Ule, Amjad Umar, Andrzej Urbanski, Marko Uršič, Tadeusz Usowicz, Elisabeth Valentine, Kanonkluk Vžinapipat, Alexander P. Vazhenin, Zygmunt Vetulani, Olivier de Vel, John Weckert, Gerhard Widmer, Stefan Wrobel, Stanislaw Wrycza, Janusz Zalewski, Damir Zazula, Yanchun Zhang, Robert Zorc INFORMATICA AN INTERNATIONAL JOURNAL OF COMPUTING AND INFORMATICS INVITATION, COOPERATION Submissions and Refereeing Please submit three copies of the manuscript with good copies of the figures and photographs to one of the editors from the Editorial Board or to the Contact Person. 
At least two referees outside the author's country will examine it, and they are invited to make as many remarks as possible directly on the manuscript, from typing errors to global philosophical disagreements. The chosen editor will send the author copies with remarks. If the paper is accepted, the editor will also send copies to the Contact Person. The Executive Board will inform the author that the paper has been accepted, in which case it will be published within one year of receipt of e-mails with the text in Informatica LaTeX format and figures in .eps format. The original figures can also be sent on separate sheets. Style and examples of papers can be obtained by e-mail from the Contact Person or from FTP or WWW (see the last page of Informatica). Opinions, news, calls for conferences, calls for papers, etc. should be sent directly to the Contact Person.

QUESTIONNAIRE
Send Informatica free of charge
Yes, we subscribe
Please complete the order form and send it to Dr. Rudi Murn, Informatica, Institut Jožef Stefan, Jamova 39, 61111 Ljubljana, Slovenia.

Since 1977, Informatica has been a major Slovenian scientific journal of computing and informatics, including telecommunications, automation and other related areas. In its 16th year (more than five years ago) it became truly international, although it still remains connected to Central Europe. The basic aim of Informatica is to impose intellectual values (science, engineering) in a distributed organisation. Informatica is a journal primarily covering the European computer science and informatics community - scientific and educational as well as technical, commercial and industrial. Its basic aim is to enhance communications between different European structures on the basis of equal rights and international refereeing. It publishes scientific papers accepted by at least two referees outside the author's country. In addition, it contains information about conferences, opinions, critical examinations of existing publications and news. Finally, major practical achievements and innovations in the computer and information industry are presented through commercial publications as well as through independent evaluations. Editing and refereeing are distributed. Each editor can conduct the refereeing process by appointing two new referees or referees from the Board of Referees or Editorial Board. Referees should not be from the author's country. If new referees are appointed, their names will appear in the Refereeing Board. Informatica is free of charge for major scientific, educational and governmental institutions. Others should subscribe (see the last page of Informatica).

ORDER FORM - INFORMATICA
Name: ..................................................
Title and Profession (optional): ..................................................
Home Address and Telephone (optional): ..................................................
Office Address and Telephone (optional): ..................................................
E-mail Address (optional): ..................................................
Signature and Date: ....................

EDITORIAL BOARDS, PUBLISHING COUNCIL
Informatica is a journal primarily covering the European computer science and informatics community; scientific and educational as well as technical, commercial and industrial. Its basic aim is to enhance communications between different European structures on the basis of equal rights and international refereeing.
It publishes scientific papers accepted by at least two referees outside the author's country. In addition, it contains information about conferences, opinions, critical examinations of existing publications and news. Finally, major practical achievements and innovations in the computer and information industry are presented through commercial publications as well as through independent evaluations.

Editing and refereeing are distributed. Each editor from the Editorial Board can conduct the refereeing process by appointing two new referees or referees from the Board of Referees or Editorial Board. Referees should not be from the author's country. If new referees are appointed, their names will appear in the list of referees. Each paper bears the name of the editor who appointed the referees. Each editor can propose new members for the Editorial Board or referees. Editors and referees inactive for a longer period can be automatically replaced. Changes in the Editorial Board are confirmed by the Executive Editors.

The coordination necessary is made through the Executive Editors who examine the reviews, sort the accepted articles and maintain appropriate international distribution. The Executive Board is appointed by the Society Informatika. Informatica is partially supported by the Slovenian Ministry of Science and Technology.

Each author is guaranteed to receive the reviews of his article. When accepted, publication in Informatica is guaranteed in less than one year after the Executive Editors receive the corrected version of the article.

Executive Editor - Editor in Chief
Anton P. Železnikar
Volaričeva 8, Ljubljana, Slovenia
E-mail: anton.p.zeleznikar@ijs.si
WWW: http://lea.hamradio.si/~s51em/

Executive Associate Editor (Contact Person)
Matjaž Gams, Jožef Stefan Institute
Jamova 39, 61000 Ljubljana, Slovenia
Phone: +386 61 1773 900, Fax: +386 61 219 385
E-mail: matjaz.gams@ijs.si
WWW: http://www2.ijs.si/~mezi/matjaz.html

Executive Associate Editor (Technical Editor)
Rudi Murn, Jožef Stefan Institute

Publishing Council: Tomaž Banovec, Ciril Baškovič, Andrej Jerman-Blažič, Jožko Čuk, Jernej Virant

Editorial Board
Suad Alagić (Bosnia and Herzegovina), Shuo Bai (China), Vladimir Bajić (Republic of South Africa), Vladimir Batagelj (Slovenia), Francesco Bergadano (Italy), Leon Birnbaum (Romania), Marco Botta (Italy), Pavel Brazdil (Portugal), Andrej Brodnik (Slovenia), Ivan Bruha (Canada), Se Woo Cheon (Korea), Hubert L. Dreyfus (USA), Jozo Dujmović (USA), Johann Eder (Austria), Vladimir Fomichov (Russia), Georg Gottlob (Austria), Janez Grad (Slovenia), Francis Heylighen (Belgium), Hiroaki Kitano (Japan), Igor Kononenko (Slovenia), Miroslav Kubat (Austria), Ante Laue (Croatia), Jean-Pierre Laurent (France), Jadran Lenarčič (Slovenia), Ramon L. de Mantaras (Spain), Svetozar D. Margenov (Bulgaria), Magoroh Maruyama (Japan), Angelo Montanari (Italy), Igor Mozetič (Austria), Stephen Muggleton (UK), Pavol Navrat (Slovakia), Jerzy R. Nawrocki (Poland), Marcin Paprzycki (USA), Oliver Popov (Macedonia),
Karl H. Pribram (USA), Luc De Raedt (Belgium), Dejan Raković (Yugoslavia), Jean Ramaekers (Belgium), Paranandi Rao (India), Wilhelm Rossak (USA), Claude Sammut (Australia), Walter Schempp (Germany), Johannes Schwinn (Germany), Branko Souček (Italy), Oliviero Stock (Italy), Petra Stoerig (Germany), Jiří Šlechta (UK), Gheorghe Tecuci (USA), Robert Trappl (Austria), Terry Winograd (USA), Claes Wohlin (Sweden), Stefan Wrobel (Germany), Xindong Wu (Australia)

Board of Advisors: Ivan Bratko, Marko Jagodic, Tomaž Pisanski, Stanko Strmčnik

An International Journal of Computing and Informatics

Contents:

Introduction, 253
Matrix Multiplication on Processor Arrays with Optical Buses (M. Middendorf, H. ElGindy), 255
BPG Permutations on the OTIS-Hypercube Optoelectronic Computer (S. Sahni, C.-F. Wang), 263
A Framework Supporting Specialized Electronic Library Construction (D.J. Helm, J.W. Cogle Jr., R.J. D'Amore), 271
A Study in the Use of Parallel Programming Technologies in Computer Tomography (E.G. Sukhov), 281
Topological Informational Spaces (A.P. Železnikar), 287
Control Mechanisms for Assuring Better IS Quality (M. Pivka), 309
Authentic and Functional Intelligence (M. Radovan), 319
LFA+: A Fast Chaining Algorithm for Rule-Based Systems (X. Wu, G. Fang, M. Gams), 329
Forecasting from Low Quality Data with Applications in Weather Forecasting (B. Li, J. Liu, H. Dai), 351
Reverse Engineering and Abstraction of Legacy Systems (M. Postema, H.W. Schmidt), 359
Reports and Announcements, 373