LANGUAGE CONSIDERATIONS OF PARALLEL PROCESSING SYSTEMS PART ONE: Concurrent microprocessing systems INFORMATICA 2/87 UDK 681.3.06:519.682/.683 Peter Koibezen Institut »Jožef Štefan«, Ljubljana ASTRACT - Full parallellsm ollered by the nultl-processor is not stili fully exploited. Huch uork that has been done In structured prograaalng to separate a mono-processor prograa into weI1-deflned aodules, and atteapts to systeaiatlze the interactlons betueen modules, have helped to achieve a more discipllned approach to softuare developnent wlth much benellt to nultl-aikroprocessor softuare. Thls paper presents various Issues relevant to language aspects ai parallel processing sy5tB»s, The objectlve Is to present a discussion o( Issues and soie o( ths current approaches rather than a well-developed «etodaIogy o( softuare, vhich has yet to be developed. New approaches to parallel processing architecture are briefly outlined too. O JEZIKIH SISTEMOV PARALELNEGA PROCESIRANJA. PRVI DELi KonkurenCni dilkroprocesorski sistemi. Popolna sočasnost, ki Jo onogofia materialna oprana vefiprocesorsklh siste­ mov, £e ni dovolj izkoriSflena. Da bi se ta cilj dosegel, Je bilo (lad drugim vlojtano :e veliko napora tudi v strukturlrano programiranje, ki deli snoprocesorski program v dobro definirane module, poskuSa sistemizirati akcije med moduli, pomaga dosetfi bolj urejen pristop k razvoju programske opreme in ji daje'številne prednosti. Članek podaja zakljufike, ki Izhajajo iz jezikovnega vidika na sisteme paralelnega procesiranja. Obravnavani so zgolj rezultati in noveJCl poskusi reltevanja problaaov programske opreme. O kaki bolj dovršeni metodologiji programske oprsme pa ni' mod govoriti, saj je le-ta Se vedno v razvojnih fazah. Na kratko so opisane tudi nekatere najvidnejše radunalniJtkc posebej učinkovito podpirajo paralelno procesiranje. arhitektura, ki ite INTRODUCTlON Hlgh level languages and thelr translators have become eseential for uriting applicaticn pro- grams for mono-processor systems. The same, houever, cannot be sald lor multi-microproces- sor systems. The Immense varlety of appllcatl- ons and harduare architectures, and the diver- slty af phllosophies about hou systems shoud be structured, makes it extreaely difficult to design languages that are llkely to be wldely accepted. It stili remalns a difficult chal- lenge to design a hlgh level language uhich is sufficiently general and modular to acconmodate a large number of architectural types of machl- nes /1/. In the absence of bold and fresh ideas to express ooncurrency, it Is than natu- ral that current thlnklng Is along the llnes for extending or generalizing the sequential programmlng languages /2/. At least it is knoun that using thls approach one has sooat- hlng that uorks for an Isolated microprocessar uhich forms a constitutent part of. the uhola systein. Thus a sequentlal language enables individual softuare modules to be uritten. Thls is a rather prlmltlve approach, houavar, uhere concurrency (uhich rei^ulres a control and comrnunlcatlon struoture), fiynchronlzation for resource sharlng, efficiecy and robustness a- spects are outside the language consideratlon. A further dlfflculty stems froa the fact that the language issues and runtime support aspects cannot be Isolated totally. The attributes o( the kernel are important in_ deolding uhether or not certaln issues need to be dealt ulth at the language level. Most o( the language proposals In the concu­ rrent programning area aiso have an underlylng model of distrlbuted computlng. The aany of these languages are in the research phase and any have not been Implemented, also there is little hard practlcal evperiance. Most of the tirne the underlylng model Is not expllcltly stated. Event if one attempts to eKtract the underlylng model from a proposal, it is not aluays an aasy task. Sometlmes the model and languages issuas become inseparable. The choice of the model uould affect the prograaalng aethodology and the proof techniques for a language basad on that model. A nodel provides a conceptual frameuork in uhich to discuss and undarstand the behaviour of concurrent ooaputations, and is intended to capture the underlying .phllosop- hy of a programmlng language. 32 A hlgh level language is a nedium which not only enables us to obtain a nachina eKeoutable cede but, perhaps more inportantl/ allous us to tormulate an application preclsely. In this sense, there is a greate vacuun for a vshicle to describe concurrent applicatios {araally. Anather difficulty in using languagas applica- ble to inulti-mioroprocessor sy8tens Is the neceš5ity for a transiator. Translator urlting ifflinediataly requires the speoif ioation oi the target nachine. It is desirabla that the translator also runs on the target aaohines. Since there is no arohltectural uniiorBity, this rei]uires a translator design uhioh is capable of running on uidal/ varying oondgura- tions. Ideally, a translator also should taka advantage of the structure and hence bo aodu- lar. This requires significant departure iroa coapiler uiriting for mono-procsssor oyBtQaa. FEATURES OF CONCURRENT LANGUAGE Sone of the desired features of a concurrant language can be listed as follous /3/i - eKpressive pover or richness - provability ease and efficiency of implementation - Qasy of use - readabilit/ of resulting prograaa impact of changes - entant of ooncurrenoy possible EKpressive pover or riohnass This refaro to the abitity of the medel/1anguaga in being abls to express certain behaviours, l.e. the richness to be able to modal certain coaputations like recursion, non-deterninlsn, and so on. This praperty is also referred to as ooapleteness or adequancy. An increase in SKpressive pouer is likely to be accompanied by an increase in the difficulty of proving prograias. Uhile it ia desirable to have simplicity as ona of the goals, it is not advisabls to hava that as the overriding criterion. ProvabiIlty One aiay bs Intsrested in proving (nany propertias, like partial correotnoss, (re- edoin frofii daadlocks, teninatlon, fairnass, etc. The presense of some construots uould make it eHtrenely difficult, if not laposalble, ta prove certain properties. For SKaople, at the current state of the art ol program pro­ ving, the presence of tine-outs oould oake the achievement of the traotability af proofs alaost inpossible. Of course, an iaportant consideration is the pouer of the language used for specifyng assertions about prograa properties. Tha assertion language or the logic used should be rich enough to be able to specify forfflally varioue desired properties /i,/. Fornal ization of the semaritics of constructs is an important prerequlsite for prograa proving. Uhile researches have been discussing ali of the above properties for a long tine, there are very feu wen defined techniques or foraal methods to illustrate the exlstenc8 of the necessary properties. good. The practicality of raechanisas uould be measured by the efficlenoy of thelr iapleaanta- tions. Ease and •ffiolanoy o{ plementation of certa difficult to achieve nierely to define primit nakes thein uorth impleim posslble to deliver reasonable efficiency ef(iciency, or costli Important consideration might be irnplenented e .such Inplenientations a laplaaantatlon The ia- in features Qay be quite It is not Buffioient ives uhose funotionality entlng. It auet also be that functinality uith In nost apllcations tha ness, is likely to be an Uhile somo oonstruots a6ily, the efficienoy of ay not necessarily bo Eas« ot uit The presence o does not mean that they woul Noriiially high level oonst stractlons capabilittes aake se of use and eKpressive pow ry criteria. A model/langua ugh to exprees a certain does not autoii)atically aean done in an easy uay-csrtain and obscure uaye have to Constructs uhich reflect abstractions uould be appeal f powerful features d be easy ta use. ructs and good «b- thing easier. Ea- er are coapleaenta- ge being rich eno- type of ooaputation that it could be ingenious, aukvard be resorted to. intuitive uaya of ing to ths user. While uriting prograns, language primitives should allou ccherent combinations. Avoiding subtle interactions among primitives would aake then easier to use and help reduce errors. The fleKibillty cf the constructs is also an iapor- tant factor in the ease of their use. Rea(!abillty oi rsaultlng prograaa Any propasal for neu language featurss should be Borutlnized closely to deterrnine the effeot of tha proposed facility on program structure. The aschaniseas should be such that they discourage ooapleii and confusing structures. The presense of high level and very pouerful constructs could lead to ea5ily comprehensible prograas. Of coursa, this may not always be the ease. The ability to compose the process structure hierarchioaHy should be of great benefit. In general, con­ structs that are easily verified are likQly to be easily understood. lapaot oC ohanges If the constructs do not include or force a high degreo of isiodularity, a change in the deflnition of one process aay necessitate iDany changes throughout ths rest of the systeiii. This uould be highly undesirable, particularly if the number of the processas Involved is quite large. Persiitting s graat degree of autonoiiiy in the definition of procss- ses would help a good deal in rcduclng and localizing the impact of ohanges. Eut^tit oS ooncHirrcno^ pocsibi* The groater the degree of conQurr3noy the constructs psralt to be eKpressed, the better. But the overhead« involved in supporting such concurranc/ should not be such as to offsst the advantage« gained through the Increase in parallelisa. HIGH PARALLEL PROCESSING ARCHITECTURE3 It is fifth much incorp llkely paral 1 inents infere ons, 1 stems stribu netuor At pre paral 1 5ing a are v uhioh opreat used f floati ment o .operat agreed by aH concerned that the kiy to generation conputer arohiteoturas is a higher degree of parallelita than is orated into cosiputers at prasent. It is that there will be a nuaber of layers of elisni! closely coupled processing ele- reflecting the parallelisa inherant in nce or knouledge base processing operatl- ooser coupling betueen ths various subsy- in a fifth generaton conputer, and di- ted processing acrosB local and uide area ks o( computers /3/. sent there ellsm inp rrays and ectors of act synohr lons on or multi-s ng-point n f the pipe ion, and are tuo typ lemented in pipelines. identical onausly to arrays of d tage operat ultiplicatlo line oarries passes its es of ctose-coupled coaputersi procet- Processing arrays processing eleaants perfora identical ata. Pipelines are ions /7/ such as ns, uhera each ala- out one step of interaediate result 33 to the next element, Operations on succeBSive sets at data can take plače at intervals ol one step. Parallel processing of thls type, known as "regular" paraHellsm, utU undoubtedl/ find a plače in fifth generattion computers, but necbanisms to deal wlth irregular parallelism are the oiain topio (or research. Three appro- aches are present today! parallel control flou, dataflou and graph reductlon /9,10/. Trad itionany, by step o( a progr under the control whlch determines carried 'out next. implicit in the st statement in the it detailed prooessin parallel computer age were availab are called at the parallel, and th ali processing pro continuing, Progr current Pascal an operations of th remains to be seen is only a sligh sequentlal process radical demands ot res. parallel control (Iaw eaoh am is e^Bcuted in sequence, at a single prograa oounter the loulevel operation to be The flou of control is ructure of the progra«. Each odule is a call to a aore g procedure. Thersfors, if i systea and programning langu- le, the processing procedure same tine. They executsd in e control module uaits until cedure are coaplete before anming languages such as con- d Ada have facilitias far is sort conputers, but it whether this approach, whioh t variation on conventional ing, yill be ade^uate far the fifth generation arohitectu- For promlsi the inf ' generat It can close-c extensl data t level, inferen dataf la sing e logical out, a nents and wa Interme are tw netuork eaoh ve, uh element when it compute irement number of reasons, one of the. most ng archltectural »odels, oertainly for erence processing subsyste(»s of a fifth ion Computer, is dataflaw architecturs, čope with Irregular as well as regular oupled parallelism, it is flexible and hle, it has the potencial for very high hroughputs, and it refleots, at hardware the type of parallelisa Inherant in ce processing. The central idea of w archltecture is: a netviork of proces- tements is set up, which refleots ths structure of the task to ba oarried nd items of data flou betuaen the ele- Each elenents operates at its oun paca, its until it has a complete set of dlate Inputs before it "fires". Thero o techniques for the control of »uoh a In the totaIly data-driven approaoh, lement uaits passlvely for data to arri- reas in the demand-driven reglne eaoh issues requests "upstreas" for data is ready lor it. In general a dataflou or Computer 5ubsystens has three requ- to store representations of prograa graphs, to implement some form of data tokens to flou through the graphs, and to provide suitable instruotlon processing facilities.' Three lines of research are belng folloued in response to these diff Iculties. The first is to regard a dataflou task as fiiied at coaplle tirne, and to prohiblt re-entrat code /12,13/. This statlc approach is ilustrated in Fig.1, uhich ušes a netuork of binary processors each uith tuo alternative output channels. Processing elamenl Processing element processing element Processing element Fig.1-A static dataflow networkic Flg.3-A tagged dataflow arohitecture. The graph reduction arohitecture /16/ is the ne)it variation on the dataflow approaoh dlsous- sed above. This variation is to evaluate functions by working direotly on their graphi- cal representations. As various portions of the graph are evaluated, they ara repleacad by their internediate results. Evaluation of aaoh of° the louest nodes (uhloh bacoaes a saarch to sae uhether such a node is present in the given relatlons), can prooeed in parallel. The in­ ternediate boolean results are then fad back through .the graph as it is raduced, until a single result eaerges. ALICE /17/ is the conputer uhloh inoorporates graph reduction directly into its basic arohitecture. It is designed to be programned In the applloative language HOPE, but can also support declarativa languages such as PROLOS. The architeoture of ALICE enables the parallel operations to be performed uithout any explioit instructions from the program. Each node in a prograa graph is represented as a pachet uithin Alioa. A packet consists of an identifier fialds, a lunction or operator fleld, and ona or aore argument fields, uhich ffiay be data values or reterences to other packets. Thare are also control fields used by the oonputer in its operatlon. D, and varled sa{tware aspects like the operatlng s/stems /20/, conniunlcations Inf rastructure, and tools to ald appllcatlon programalng such as high level languages suitable for parallel programmlng, aH form a tlghtly knlt sltuatlon in which it is far more difficult to isolata the constltuent parts and arrive at unlversally accepted solutlons. The requlreffients of the fifth generation for la/ers af parallelIsm and an enphasls on infa- rence rather than nunerlcal coaputation look like providlng sufficient incentive. Even i( the objectlve of a computer with enhanced intelligeoe is not attained, the new architec- tures ulll provide englnes of unpreoendsnt pover for conventional conputing. The aove auay from general-purpose processors to aggre- gations of special-purpose chips is likely to affect ali branches of infornatian .technology. The increase In the scale of integration, and the advanced CAD systens for mlcrochip produc- tlon wiH lind appllcatlons in every branoh of ntlcroelectronlcs. The Industrlal, econoitlo and polltical conseciuences of having access to, or not having acoess to the new generation of Silicon toundries are farreachlng. 5. REFERENCES /1/ Bedlna M,, Dlstante F. SH i HH Resources Allocation in a Multiprocessor sy8te«t An Aro- hitectural Problem, Advances in Microprocessing and Mlcroprogramming, EUROMICRO, 198'i. /2/ Tucker A.B., Jr. Progranming languages (Second editlon), ncGraw-Hill Book Coi>pany, 1986. /3/ Stankovifc J.A., A Perspeotlve on Distrlbu- ted Computer Systens, IEEE Trans. on Conp., vol.C-33, NO.12, December 198^. /A/ Chandy M., Mlsra J. Distrlbutad slaulati- on: A čase study in design and verificatlon of distributed prograns, IEEE Trans. Engineerlng, September 1978. on Software /5/ lia P.R., Lee E.V.S., Tsuohya H. "A task allocation model for distributed conputing sy- stems", IEEE Trans, on Conp., pp. A1-47, Jan. 1982. /6/ Flynn N.J. Directions and Issues in Arc- hltecture and Language", Coaputer IEEE, October 1980. /7/ Mllutinavi6 V. A High Level Language Archltecturei Bit-SlIce-Based Processor and As­ sociated Systein Software, Bicroprocessing and •Mlcroprogramming 12, 1983. /8/ Kogge P.M. The Arohltecture of Plpelined Computers, llcGrau-Hill Book Cofflpany, 1981. /9/ Treleaven P. Fifth generation coaputer arohltecture analysis, in Moto-Oka, 1982, pp. 265-275. /10/ Huang K., Briggs F. Computer Arohltectu­ re and Parallel Processing (International Štu­ dent Editlon), ncGraw-Hill Book Conpany, 198C>. /11/ Brajak P. Paralelno procesiranje: arhi­ tektura 90-tih godina. Zbornik jugoslovenskog savjetovanja o novima generacijama raSunala, MIPRO 86, Rijeka, Maj 1986, str.33-A6. /12/ Smith K. New computer breed ušes tran- sputers for parallel processing, Electronics 56,«,, 1983, pp.67-68. /13/ nihovllovli; B., Mavrie S, KolH.e2en P. Transputer-osnovnl gradnik vetlprooesorskih si­ stemov, Informatica 4/86, LJubljana, 1986. 14/ Tanaka H., et.al. The preli«inary rese- arch on the data flou machine and the data base machine as the baslc arohltecture of fifth generation computer systeB, in Moto-Oka, 1982. /15/ Gurd J. Developments in dataflow arohl­ tecture, In SPL Internaoional, 1962. /16/ Cripps M., Field A.J., Reeve M.J. The design and Implementation of Alicet a parallel graph reduction machine, Byta Magazine, June 1985. /17/ Darlington J,, Reeve H.J. Alicei a aul- tiprocessor reduction machine for the parallel evaluatlon of appllcative languages, ACM Con- fernce on Functional Programnlng and Computer Arohltecture, October 1981, pp.65-75. /18/ Inmos. Ocean Progranming Manuel, Prenti- ce-Hall, 1984. /19/ Kolbezen P. Analiza multiprocesorskega sistema, IJS Delovno poroliilo Dp-4461, Univerza E.Kardelja, IJS, LJubljana, December 1986. /20/ Kolbezen P, Problemi naBrtovanJa aulti- procesorskega sistema, IJS Delovno poroCilo Dp-',<>62, Univerza E.Kardelja, IJS, Ljubljana, December 1986. /21/ Foster M.J., Kung H.T. The design of special-purpose VLSI chips, IEEE Coaputer Maga­ zine 13,1, 1980, PP.26-A0. /22/ Seitz C.L. Conoourent VLSI Archlteoturas IEEE Trans, on Comp., Vol.C-33, No.12, Oec. 1984.