AN EVOLUTIONARY APPROACH TO CHIP DESIGN: AN EMPIRICAL EVALUATION

Gregor Papa
"Jožef Stefan" Institute, Ljubljana, Slovenia

Key words: chip design, area and time optimization, low-power circuits, evolutionary algorithms

Abstract: This paper presents a new method with an evolutionary approach to some parts of integrated-circuit (IC) design. The study is focused on application-specific integrated circuits (ASICs), which need an even more sophisticated design (in terms of size, speed and low power) because of their specific uses. Optimally scheduled operations are not necessarily optimally allocated to units. To enable optimal allocation we need to consider some allocation criteria while the scheduling is being done. Therefore, algorithms with concurrent scheduling and allocation produce the best results. It is obvious that we have to deal with a trade-off between the quality of the solution and its design time. The main part of the paper is a presentation of an improved method of the evolutionary search for the optimal design of ICs. The evolutionary approach considers scheduling and allocation constraints and ensures a globally optimal solution in a reasonable time. The evaluation of our method shows that the evolutionary method is able to find a solution that is more appropriate in terms of all the considered and important objectives than is the case when working with classical deterministic methods.

An evolutionary approach to chip design: an empirical evaluation (Slovenian title: Evolucijski pristop pri načrtovanju čipov: empirično ovrednotenje)

Key words (translated from Slovenian): chip design, low-power circuits, area and time optimization, evolutionary algorithms

Abstract (translated from Slovenian): This work presents a method for the concurrent scheduling of operations and allocation of units that is based on an evolutionary approach and is applicable in the design process of digital integrated circuits. The work deals primarily with the design of application-specific circuits, which, because of their specific uses, require an even more refined structure in terms of size as well as operating speed and low power consumption. Optimally scheduled operations cannot, in general, also be optimally allocated to individual units. To allocate the operations optimally, the allocation rules have to be considered already during scheduling. We approached the problem with an evolutionary technique, which is frequently used in search methods in a wide range of fields. An evolutionary algorithm was developed that considers the scheduling and allocation requirements, allows a short design time and provides globally optimal solutions. The algorithm was also implemented on a computer and applied to a set of test-bench circuits. The circuits were chosen according to how frequently they appear in the related literature; they are of different sizes and have different numbers of operation types. The analysis of the results showed that, compared with classical deterministic methods, the described evolutionary algorithm is able to find a solution that is in general more favourable in terms of all the considered and important parameters.

1 Introduction

Whenever a new integrated circuit (IC) is designed, the problem of selecting the best register-transfer level (RTL) specification has to be faced. And as circuits get bigger, so too does the problem: more combinations have to be examined before the optimal combination is found. Therefore, automatic circuit optimization is required, as this speeds up the whole design process and eliminates some of the errors.
High-level synthesis /2/, /4/ is an automatic design process that transforms the initial behavioral description into the final RTL specification. The process consists of the following tasks: compilation, transformation, scheduling, allocation and binding. Of these, operation scheduling and resource allocation are the most important subtasks of high-level synthesis, because they are at the core of the design and crucially influence both the design and the final layout. Due to the interdependence of these two tasks, the solution of one task depends on an estimation of the solution of the other task, which has not been solved yet. The scheduling of operations into different control steps therefore affects the allocation of operations to different units. The interaction of these two tasks presents formidable obstacles to the goal of optimization /1/. There are, however, some approaches to concurrent solving, but their solutions, to some extent, are less than optimal.

The evolutionary technique is used in various search methods for a range of different optimization areas /8/. Its non-deterministic, i.e., probabilistic, approach reduces its popularity, but at the same time gives it an advantage when it comes to multi-criteria problems and problems with several local optima. The genetic algorithm (GA), as a frequent implementation of evolutionary techniques, is an optimization method based on the mechanism of evolution and natural genetics.

2 Definitions

An IC described with a hardware description language, e.g. VHDL, can be presented as a control/data-flow graph (CDFG) /4/; in our case it is enough to consider only the data-flow part, i.e., the data-flow graph (DFG). Each node i represents an operation to be executed, and the type of the node determines the type of the operation. The edges e represent the dependencies between operations. An edge e_ij, between nodes i and j, represents the data produced by node i and used by node j. All edges are unidirectional.

In the final design there is a group of resources that define the implementation. Here we have different functional units (FUs) FU_i (i = 1..N), where N is the number of different types of functional units. There are also storage (registers Reg) and connection (buses Bus) units, while T represents the execution time of the DFG. Functional units (adders, multipliers, ...) perform transformations on data values, connection units transport values from one unit to another, while storage units preserve those values over time. The parameters are calculated as follows:
- the number of FU_i is the highest number of functional units of the i-th type needed in a single control step;
- the number of registers Reg is the highest number of variables needed in a single control step. We count variables that are needed by a functional unit as input data, variables that are returned as output data, and variables that are not used at the moment but will be used in a later control step or must remain available until the end of the execution of all operations. Of course, several functional units can use the same data from the same register in a control step;
- the number of buses Bus is the highest number of data transmissions (into or out of the functional units) at any single moment;
- the execution time T is the time needed to execute all the operations of the schedule;
- the weights w are the weights of the IC parameters considered in the IC quality-evaluation cost function. They depend on the ratio of the sizes of the different (functional, storage) units.
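To make the parameter calculation concrete, the following sketch (not part of the original paper) shows one possible way of deriving FU_i, Reg, Bus and T from a scheduled and allocated DFG. The names (Op, count_resources) and the simplifications (operand reads and the result write are counted as bus transfers in the operation's start step; a value occupies a register from the step after it is produced until its last use, or until the end if it is never read) are illustrative assumptions only.

    # Minimal sketch: estimating the resource parameters of a scheduled DFG.
    from collections import defaultdict
    from dataclasses import dataclass, field

    @dataclass
    class Op:
        name: str          # operation identifier
        op_type: str       # e.g. "+", "x", "<"
        start: int         # scheduled control step
        delay: int         # execution time in control steps
        fu: str            # allocated functional-unit instance, e.g. "FE2#0"
        preds: list = field(default_factory=list)   # data-producing predecessors

    def count_resources(ops):
        """Return (FU usage per type, Reg, Bus, T) for a given schedule/allocation."""
        fu_in_step = defaultdict(set)         # control step -> busy FU instances
        transfers_in_step = defaultdict(int)  # control step -> bus transfers
        last_use = {}                         # producer -> last step its value is read

        for op in ops:
            for step in range(op.start, op.start + op.delay):
                fu_in_step[step].add(op.fu)
            # operand reads and the result write, counted in the start step
            transfers_in_step[op.start] += len(op.preds) + 1
            for p in op.preds:
                last_use[p.name] = max(last_use.get(p.name, 0), op.start)

        T = max(op.start + op.delay for op in ops)

        # a value is live from the step after it is produced until its last use
        live_in_step = defaultdict(int)
        for op in ops:
            for step in range(op.start + op.delay, last_use.get(op.name, T) + 1):
                live_in_step[step] += 1

        fu_count = defaultdict(int)           # highest simultaneous usage per FU type
        for step, fus in fu_in_step.items():
            per_type = defaultdict(int)
            for fu in fus:
                per_type[fu.split("#")[0]] += 1
            for fu_type, n in per_type.items():
                fu_count[fu_type] = max(fu_count[fu_type], n)

        Reg = max(live_in_step.values(), default=0)
        Bus = max(transfers_in_step.values(), default=0)
        return dict(fu_count), Reg, Bus, T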
When the tasks are performed separately (Figure 1a) the solution is not necessarily optimal; it is better to use an approach with an iterative repetition of the scheduling and the allocation (Figure 1b). Here, though, the problem of which operation/unit to change next appears, since the order of the changes can influence the final solution. The situation is similar with the approach that partitions the operations into small groups, within which there is an iterative repetition of the scheduling and the allocation (Figure 1c). Since there are fewer operations in a group there is no problem with the order of the changes, but there is a problem with the appropriate partitioning of the operations. Obviously, the best approach is the one with purely concurrent scheduling and allocation (Figure 1d), where the order of the iterative refinement does not influence the quality of the solution /7/. Concurrency is achieved through the use of algorithms that do not depend on the order of the transformations. Therefore, there is no influence of a changed start time on the allocated unit, nor is there any influence of the allocated unit on the start time. When all the transformations are made, the appropriateness of the changes is checked.

Figure 1: Scheduling and allocation concurrency

3 Concurrency

Optimally scheduled operations are not necessarily optimally allocated to units. For them to be optimally allocated as well, some allocation constraints should be considered during the scheduling. Therefore, algorithms that perform concurrent scheduling and allocation are better in terms of the optimality of their solutions. These algorithms are, however, very time consuming. Therefore, we have to deal with a trade-off between the quality of the final solution and the length of its design time. There are some approaches to concurrent solving, but their solutions, to some extent, are less than optimal.

4 The ECSA algorithm

The facts presented in the introductory paragraphs and the promising results of different evaluations /8/, /9/ led us to the ECSA (Evolutionary Concurrent Scheduling and Allocation) design approach /7/. This approach considers scheduling and allocation constraints, allows a short design time and can find globally optimal solutions. The input description of the circuit is transformed into two basic (initial) schedules, obtained with the ASAP and ALAP algorithms. The FUs used in the first case are those that are the fastest for each operation, and in the second case those that are the slowest for each operation. These two schedules represent a kind of boundary solution, since all the other solutions are executed within the time limits defined by these two schedules. In other words, no other solution can be faster or slower, irrespective of the combinations of units used.

Each solution has to be properly encoded (into a chromosome), i.e., each operation's start time and functional unit have to exist in the chromosome. The initial population is built upon the two initial solutions, which are multiplied to form the population of so-called boundary solutions. The optimal solution has to be somewhere in between the boundaries; therefore, genetic operators (crossover, mutation, variation) transform those encoded solutions. With the transformations, the start times and the allocated functional units are changed. Multicycling solutions are also supported by the approach. The final solution obtained with the genetic operators is additionally influenced by a simulated annealing algorithm, which improves the solution if the search stopped somewhere near the globally optimal point.
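The two boundary schedules can be obtained with standard ASAP and ALAP passes over the DFG: ASAP with the delays of the fastest unit available for each operation, ALAP with the delays of the slowest one. The sketch below (assumed names and data layout, not the paper's implementation) illustrates the idea; operations are given as (name, predecessor-names) pairs in topological order, and `delay` maps an operation name to the delay of the chosen functional unit.

    # Minimal sketch of the ASAP and ALAP boundary schedules.
    def asap_schedule(ops, delay):
        """Start every operation as soon as all of its predecessors have finished."""
        start = {}
        for name, preds in ops:
            start[name] = max((start[p] + delay[p] for p in preds), default=0)
        return start

    def alap_schedule(ops, delay, deadline):
        """Start every operation as late as the deadline and its successors allow."""
        succ = {name: [] for name, _ in ops}
        for name, preds in ops:
            for p in preds:
                succ[p].append(name)
        start = {}
        for name, _ in reversed(ops):        # reverse topological order
            finish = min((start[s] for s in succ[name]), default=deadline)
            start[name] = finish - delay[name]
        return start

    # Example: a tiny DFG with a -> c and b -> c, unit delay of one step each:
    # asap_schedule([("a", []), ("b", []), ("c", ["a", "b"])], {"a": 1, "b": 1, "c": 1})
    # returns {"a": 0, "b": 0, "c": 1}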
4.1 Encoding

The performance of the algorithm depends on a proper encoding. In the ECSA algorithm an integer encoding is used, i.e., the chromosome string contains the numbers that represent the start time of each operation and the unit allocated to each operation, where the position in the string depends on the order of the operations in the input IC description. This means that the chromosome consists of a pair of time/unit information for each operation, and the genetic operators can influence both parts of that information, either together or separately. Integer encoding was chosen since it does not need any transformation (into binary values) at the beginning and (back into decimal values) at the end. In addition, the implemented genetic operators can check the changed values instantly, without any transformation. The correctness of a transformation, i.e., a crossover, a mutation or a variation, can therefore be checked within the function itself.

4.2 Cost function

One of the most important parts of the algorithm is its cost function. Our algorithm is multi-objective /3/, which means that it controls several criteria or objectives. The cost function, represented by Eqn. 1, considers the number of functional units, the number of registers, the number of buses and the execution time of all the operations in the DFG:

Cost = \sqrt{ \sum_i cost_{FU_i}^2 + cost_{Reg}^2 + cost_{Bus}^2 + cost_T^2 }        (1)

where

cost_{FU_i} = w_{FU_i} \cdot FU_i,  cost_{Reg} = w_{Reg} \cdot Reg,  cost_{Bus} = w_{Bus} \cdot Bus,  cost_T = w_T \cdot T.

To obtain the cost of a certain DFG, the algorithm has to evaluate the required number of resources. In contrast to other multi-objective functions that give more than one final solution, this one already includes the decision-making part, i.e., it chooses one solution from all the solutions on the Pareto front. The chosen solution has the shortest distance to the origin, where the origin represents the ideal, costless solution and the axes represent the considered objectives. The main weakness /3/ of this approach is the difficulty of determining the appropriate weights when there is not enough information about the problem. Since we are aware of the problem specifics and we know the cost weights of the resources being used, this weakness is not so significant.
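The following small sketch (assumed names, not the paper's code) shows one way the integer chromosome and the cost of Eqn. 1 could look in practice: each gene is a (start_time, functional_unit) pair, in the order in which the operations appear in the input description, and the cost is the weighted Euclidean distance of the resource-usage vector from the ideal, costless origin.

    # Minimal sketch of the chromosome encoding and the cost of Eqn. 1.
    import math

    def decode(chromosome):
        """chromosome: flat list [t1, u1, t2, u2, ...] -> list of (time, unit) pairs.
        Example: decode([0, 1, 0, 2, 1, 1]) -> [(0, 1), (0, 2), (1, 1)]."""
        return list(zip(chromosome[0::2], chromosome[1::2]))

    def cost(fu_count, reg, bus, t, w_fu, w_reg, w_bus, w_t):
        """Distance of the solution from the origin; fu_count and w_fu are keyed by FU type."""
        cost_fu_sq = sum((w_fu[fu_type] * n) ** 2 for fu_type, n in fu_count.items())
        return math.sqrt(cost_fu_sq
                         + (w_reg * reg) ** 2
                         + (w_bus * bus) ** 2
                         + (w_t * t) ** 2)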
4.3 Evolutionary operators

In each iteration (or generation) of the algorithm there are four genetic operators that transform the chromosomes. They consider the data dependencies and the given library of available functional units. Each time the genetic operators transform a chromosome, the chromosome is checked to see whether it meets all the constraints, by considering the data dependencies and the unit types. Besides the basic implementation of the operators, we also applied the independent operators, which do not need any parameter value to be set in advance: they depend only on the progress of the search and on the size of the problem to be solved.

4.3.1 Basic operators

Selection. Based on the cost-function values, the worst solutions are discarded in the selection step. To ensure that the size of the population remains the same, these solutions are replaced with the best solutions. This ensures that the best solutions of a given generation are involved in the creation of the next generation (elitism).

Crossover. Two approaches are used for the crossover, each expressing the dominance of one set of characteristics. After two crossover points are determined, in the first case the unit information is exchanged between the two chromosomes and the start times are adapted; in the second case the start times are exchanged and a suitable unit is allocated. The dominance is thus expressed either in the functional units or in the start times of the operations.

Mutation. There are also two similar approaches for transforming the chromosome. In both cases the start time is changed: either the operation is moved to a later control step, with the use of a faster functional unit, or it is moved to an earlier control step, if the data dependencies allow this, with a slower unit.

Variation. Two operations are selected and, if they are of the same type (e.g. additions), their functional units are swapped. If needed, their start times are also updated.

4.3.2 Independent operators

The advantage of the independent GA approach /12/ is that there is no need to preset the working parameters, e.g. the number of generations, the population size, and the probabilities of crossover, mutation and variation. These parameters are set automatically during the optimization, depending on the progress and the speed of the optimization.

Setup. If the chromosome that represents a solution is large, then the population also has to be large enough to ensure that a lot of different chromosomes are involved in the search. The population size therefore depends on the size of the chromosome, i.e., on the complexity of the problem.

Crossover. Considering four candidates, two parents and their two offspring, only the first and the third, rated according to their fitness, pass to the next generation. This forces at least one of the offspring to be passed to the next generation in addition to the best candidate. Otherwise the offspring would have only a small influence on the new generations, since crossing two good parents probably produces offspring that are not as good; however, they might become good after a few more transformations.

Mutation. Chromosomes with a low fitness are mostly exposed to mutation. A position in such a chromosome is mutated if the same value appears at that position in the majority of the chromosomes in the population. This is the way to change the bad characteristics of "poorly fitted" chromosomes and to redirect the search in another direction. In the case of "well-fitted" chromosomes, a position is mutated if its value differs from the value at the same position in the majority of the other good chromosomes. This ensures faster convergence in the final stages of the optimization.

Variation. The interchange of the values of two positions, as described for the basic operators, is performed if the value at one position occurs with a high frequency in the population and the value at the other position occurs with a low frequency.

4.3.3 Simulated annealing

When the GA finishes its work and the most appropriate solution is found, that solution is additionally influenced by the simulated annealing algorithm. It checks whether the search for the optimal solution stopped in the vicinity of the global optimum, and it changes the solution if needed.
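To illustrate how the pieces fit together, here is a compact sketch (assumed names and greatly simplified operator logic, not the paper's implementation) of the overall flow: a population seeded with copies of the two boundary solutions, elitist selection, crossover/mutation/variation, and a final simulated-annealing refinement. The operators crossover, mutate, vary and perturb are passed in as callables and are assumptions of this sketch; mutate and vary are assumed to work in place, crossover to return two offspring.

    # Illustrative sketch of the overall GA loop with a final SA refinement.
    import math
    import random

    def ecsa(seed_asap, seed_alap, evaluate, crossover, mutate, vary, perturb,
             pop_size=50, generations=100, p_c=0.7, p_m=0.04, p_v=0.03):
        """Evolve a population seeded with copies of the two boundary solutions."""
        population = [list(seed_asap) if i % 2 == 0 else list(seed_alap)
                      for i in range(pop_size)]
        for _ in range(generations):
            # selection with elitism: the worst half is replaced by copies of the best
            population.sort(key=evaluate)              # lower cost is better
            half = pop_size // 2
            population[half:] = [list(c) for c in population[:pop_size - half]]
            # crossover on pairs, then mutation and variation on single chromosomes
            for i in range(0, pop_size - 1, 2):
                if random.random() < p_c:
                    population[i], population[i + 1] = crossover(population[i],
                                                                 population[i + 1])
            for chromosome in population:
                if random.random() < p_m:
                    mutate(chromosome)
                if random.random() < p_v:
                    vary(chromosome)
        best = min(population, key=evaluate)
        return simulated_annealing(best, evaluate, perturb)

    def simulated_annealing(solution, evaluate, perturb, temp=1.0, cooling=0.95, steps=200):
        """Refine the GA result; worse neighbours are accepted with decreasing probability."""
        current, current_cost = solution, evaluate(solution)
        for _ in range(steps):
            neighbour = perturb(current)               # assumed small random change
            delta = evaluate(neighbour) - current_cost
            if delta < 0 or random.random() < math.exp(-delta / temp):
                current, current_cost = neighbour, current_cost + delta
            temp *= cooling
        return current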
5 Evaluation

The appropriateness of the proposed approach was tested with a computer implementation of the ECSA algorithm, which was used with test-bench ICs. The ICs used for the evaluation were chosen based on their appearance in the literature and similar studies. They differ in terms of their size and the number of operation types.

5.1 Test-bench circuits

5.1.1 Differential equation

The relatively small differential-equation circuit /14/ has only 11 operations, but 4 different operation types (6 multiplications, 2 additions, 2 subtractions, 1 comparison). This circuit is useful when testing libraries with different implementations of the same operation types.

5.1.2 Elliptic filter

This filter /6/ consists of 34 operations, but only two operation types: 26 additions and 8 multiplications. The circuit is suitable for comparison due to its size and operation dependencies, which form two independent, similar critical paths, both influencing the circuit delay.

5.1.3 Bandpass filter

One of the implementations of the bandpass filter /5/ is the circuit used for our evaluation. It consists of 29 operations: 11 multiplications, 10 additions and 8 subtractions. Due to the data dependencies, almost all the operations influence the circuit delay.

5.1.4 Least-mean-square filter

This filter for signal adaptation (noise reduction) is based on the least-mean-square method /2/. It consists of 47 operations: 24 multiplications and 23 additions. This test-bench circuit is useful due to its size and unique data dependencies.

5.2 Functional units

For an easier and more realistic comparison of the different algorithms when testing the size and delay of the implemented circuits, we built a library of different functional units, which differ in terms of their sizes and delays. Table 1 shows the sizes and delays of various implementations of the arithmetic-logic operations. Here, different types of logic were used to make units with different delays and sizes. The values are based on an analysis of data on circuits and their complexities /11/. The number of gate transitions defines the delay, while the overall number of gates needed to implement the unit defines its size. These values are only for orientation, since the real numbers depend on the chosen technology /11/. The delays presented in Table 1 are relative, i.e., normalized to the fastest functional unit among all the operations. Most of the units are multifunctional, i.e., they can perform different types of operation.

Table 1. Technical characteristics of the functional units

  functional unit   {operations}       delay [No. of steps]   No. of gates
  FE1               {+, -}             6, 6                     370
  FE2               {+, -}             1, 1                     665
  FE3               {<}                6                        353
  FE4               {+, -, <}          1, 1, 1                  696
  FE5               {x}                4                       3040
  FE6               {x}                2                       7296
  FE7               {÷}                21                      3040
  FE8               {÷}                9                       7296
  FE9               {x, ÷}             4, 21                   3344
  FE10              {x, ÷}             2, 9                    8025
  FE11              {+, -, <, x, ÷}    1, 1, 1, 4, 21          3692
  FE12              {+, -, <, x, ÷}    1, 1, 1, 2, 9           8373
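For illustration, one possible machine-readable form of part of the library in Table 1 is shown below; the dictionary layout and the helper function are assumptions of this sketch, not the paper's data format. Delays are in control steps and sizes in gates; only the units whose entries in Table 1 are unambiguous (FE1 to FE6) are listed.

    # Illustrative encoding of part of the Table 1 functional-unit library.
    FU_LIBRARY = {
        "FE1": {"ops": {"+": 6, "-": 6}, "gates": 370},          # slow adder/subtracter
        "FE2": {"ops": {"+": 1, "-": 1}, "gates": 665},          # fast adder/subtracter
        "FE3": {"ops": {"<": 6},         "gates": 353},          # slow comparator
        "FE4": {"ops": {"+": 1, "-": 1, "<": 1}, "gates": 696},  # fast multifunctional unit
        "FE5": {"ops": {"x": 4},         "gates": 3040},         # slow multiplier
        "FE6": {"ops": {"x": 2},         "gates": 7296},         # fast multiplier
    }

    def fastest_unit(op_type, library=FU_LIBRARY):
        """Return the (unit name, delay) of the fastest unit able to perform op_type."""
        candidates = [(name, entry["ops"][op_type])
                      for name, entry in library.items() if op_type in entry["ops"]]
        return min(candidates, key=lambda item: item[1])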
5.3 Parameters

By considering 18750 different schedules of each circuit with the ECSA algorithm and 3125 different combinations of the parameters, we statistically compared (using the procedure described in /10/) the results according to their cost function (Eqn. 1). To ensure that most solutions were time-constrained, i.e., executed in the shortest possible time, the weight w_T was set to an extremely high value. As presented in Table 2, the high-quality solutions are mostly obtained with the following values of the parameters: probability of crossover, 0.7; probability of mutation, 0.04; and probability of variation, 0.03. In addition, considering the sizes of the circuits, the number of generations and the population size should be set to 3-times and 3.5-times the size of the circuit, respectively.

Table 2. Optimal values of the parameters for different test-bench circuits

  parameter                  Differential   Fifth-order       Bandpass   Least-mean-     Average
                             equation       elliptic filter   filter     square filter   optimal values
  number of generations      40             100               90         130             3 x DFG size
  population size            55             120               110        160             3.5 x DFG size
  probability of crossover   0.8            0.6               0.7        0.6             0.7
  probability of mutation    0.04           0.05              0.04       0.05            0.04
  probability of variation   0.04           0.02              0.02       0.05            0.03

The values of the parameters in this combination are referred to as the optimal values. These optimal values are determined on the basis of the share of solutions with a certain parameter value among the good solutions. A parameter value that is to be considered optimal should have at least a 25% share of the high-quality solutions, as well as a less than 10% share of the low-quality solutions. The ECSA algorithm was used with the values of the parameters presented in Table 2. The other parameters needed to run the FDS and ECSA algorithms and the cost function depended on the sizes of the FUs.

5.4 Results

The ECSA algorithm was evaluated by a comparison with the nearly optimal /13/ force-directed scheduling (FDS) /12/. FDS tries to optimally schedule the DFG by distributing the operations of the same type uniformly over the available control steps. Table 3 presents the results of the following evaluations: FDS with fast units, FDS with slow units, and ECSA with basic and with independent genetic operators. There are two types of DFG for each circuit. The first, or plain, is an ordinary data-flow graph with nodes that represent operations, as described in similar studies; the second, or improved, considers the input variables (start registers) via some additional nodes to ensure a more accurate estimation of the registers and the buses needed to implement the circuit.

5.4.1 Differential equation

Because of the small circuit size there is no improvement in the solutions obtained with the ECSA algorithm (either basic or independent) when considering the ordinary DFG: all the solutions are of a larger size. But when we consider the start registers (input variables) there are some ECSA solutions with a slightly larger size and a smaller number of buses.

5.4.2 Fifth-order elliptic filter

The evolutionary method with the basic approach found a smaller circuit with a smaller number of buses and a slightly longer execution time for the ordinary DFG, while the independent approach could not find any improved solution. When dealing with the improved DFG, both approaches (basic and independent) found considerably smaller circuits with a slight increase in the execution time, while the independent approach also found a solution with a substantial decrease in the required number of registers and buses.

5.4.3 Bandpass filter

When dealing with the ordinary DFG, both ECSA methods found solutions with a smaller number of registers and buses; the basic approach also found a smaller circuit, but with a slightly longer execution time. When dealing with the improved DFG, both approaches found solutions with the same circuit size and execution time as the comparable FDS solution, but the required number of registers and buses was considerably smaller for the ECSA solutions.

5.4.4 Least-mean-square filter

At the expense of a small increase in the delay, the basic ECSA was able to decrease the size and lower the number of registers and buses for the ordinary DFG, but the independent ECSA was not able to improve any parameter.
When dealing with the improved DFG, the basic ECSA was able to keep the initial delay, to decrease the circuit size and to lower the number of required registers and buses. The independent ECSA was only able to decrease the number of buses, while increasing the circuit size.

Table 3. The evaluation results of the ECSA algorithm with different test-bench ICs

  algorithm          functional units          size    registers   buses   delay   runtime [s]

  differential equation
  FDS-fast           1xFE2 + 1xFE4 + 3xFE6     23249   17          6       6        0.01
  FDS-slow           2xFE1 + 1xFE3 + 2xFE5      7173   18          6       20       0.01
  ECSA-basic         2xFE2 + 1xFE4 + 3xFE6     23914   18          8       6        0.11
  ECSA-independent   2xFE2 + 1xFE4 + 3xFE6     23914   17          6       6        0.09

  differential equation with start registers
  FDS-fast           1xFE2 + 1xFE4 + 3xFE6     23249   10          9       6        0.01
  FDS-slow           2xFE1 + 1xFE3 + 2xFE5      7173   11          9       20       0.01
  ECSA-basic         2xFE2 + 1xFE4 + 4xFE6     31210   10          7       6        0.15
  ECSA-independent   2xFE2 + 1xFE4 + 3xFE6     23914   10          7       6        0.35

  fifth-order elliptic filter
  FDS-fast           3xFE2 + 3xFE6             23883   26          8       17       0.02
  FDS-slow           5xFE1 + 3xFE5             10970   30          6       78       0.03
  ECSA-basic         2xFE2 + 1xFE5 + 2xFE6     18962   29          4       21       3.80
  ECSA-independent   4xFE2 + 4xFE6             31844   30          8       17       1.60

  fifth-order elliptic filter with start registers
  FDS-fast           3xFE2 + 3xFE6             23883   21          16      17       0.02
  FDS-slow           5xFE1 + 3xFE5             10970   25          16      78       0.04
  ECSA-basic         2xFE2 + 1xFE5 + 2xFE6     18962   24          16      19       4.80
  ECSA-independent   2xFE2 + 2xFE6             15922   18          9       21       3.40

  bandpass filter
  FDS-fast           3xFE2 + 4xFE6             31179   34          10      10       0.01
  FDS-slow           4xFE1 + 3xFE5             10600   35          8       44       0.04
  ECSA-basic         3xFE2 + 3xFE6             23883   33          8       11       1.70
  ECSA-independent   3xFE2 + 1xFE5 + 4xFE6     34219   33          8       10       1.30

  bandpass filter with start registers
  FDS-fast           3xFE2 + 4xFE6             31179   25          23      10       0.02
  FDS-slow           4xFE1 + 3xFE5             10600   26          23      44       0.04
  ECSA-basic         3xFE2 + 4xFE6             31179   23          19      10       2.40
  ECSA-independent   3xFE2 + 4xFE6             31179   23          19      10       2.90

  least-mean-square filter
  FDS-fast           3xFE2 + 6xFE6             45771   68          12      13       0.40
  FDS-slow           3xFE1 + 4xFE5             13270   72          8       70       2.79
  ECSA-basic         3xFE2 + 6xFE5 + 3xFE6     42123   67          10      14       6.30
  ECSA-independent   9xFE2 + 6xFE5 + 9xFE6     89889   69          30      15       8.05

  least-mean-square filter with start registers
  FDS-fast           3xFE2 + 6xFE6             45771   33          29      13       0.48
  FDS-slow           3xFE1 + 4xFE5             13270   37          27      70       3.52
  ECSA-basic         4xFE2 + 2xFE5 + 5xFE6     45220   32          25      13       9.20
  ECSA-independent   5xFE2 + 3xFE5 + 7xFE6     63517   33          25      13      12.30

6 Conclusions

Optimally scheduled operations are not necessarily optimally allocated to functional units. To enable an optimal allocation we need to consider some allocation criteria while the scheduling is being done. This paper describes such an evolutionary approach, which considers scheduling and allocation constraints and ensures a globally optimal solution in a reasonable time. To evaluate our method we implemented the algorithm on a computer and used it with a group of test-bench ICs. These circuits were chosen because circuits of the same type were used in similar studies. They differ in terms of their size and the number of operation types. It turned out that the evolutionary method (either basic or independent) is able to find a solution that is more appropriate in terms of all the considered and important parameters than is the case when working with classical deterministic methods. There are slightly longer runtimes when the ECSA algorithm is used.
But considering the speed (a few seconds) and the computational behaviour, where the runtimes for larger circuits increase enormously (exponentially) when the FDS algorithm is used, we can conclude that both small and large circuits can be designed and optimized with the proposed evolution-based algorithm, which exhibits a linear increase in the design time with an increase in the circuit size.

References

/1/ J. R. Armstrong, F. G. Gray, VHDL Design: Representation and Synthesis, Prentice Hall, Upper Saddle River, 2000.
/2/ J. Benesty, P. Duhamel, A Fast Exact Least Square Adaptive Algorithm, IEEE Transactions on Signal Processing 40, 1992, pp. 2904-2920.
/3/ C. A. Coello Coello, A Comprehensive Survey of Evolutionary-Based Multiobjective Optimization Techniques, Knowledge and Information Systems, Vol. 1, No. 3, pp. 269-308, August 1999.
/4/ D. Gajski, N. Dutt, A. Wu, S. Lin, High-Level Synthesis: Introduction to Chip and System Design, Kluwer Academic Publishers, Norwell, Massachusetts, 1992.
/5/ G. W. Grewal, T. O. Wilson, An Enhanced Genetic Algorithm for Solving the High-Level Synthesis Problems of Scheduling, Allocation, and Binding, Intl. Journal of Computational Intelligence and Applications, 1, 2001, pp. 91-110.
/6/ S. Y. Kung, H. J. Whitehouse, T. Kailath, VLSI and Modern Signal Processing, Prentice Hall, 1985.
/7/ G. Papa, Concurrent operation scheduling and unit allocation with an evolutionary technique in the process of integrated-circuit design, Ph.D. Thesis, Faculty of Electrical Engineering, University of Ljubljana, Ljubljana, 2002.
/8/ G. Papa, B. Koroušič-Seljak, B. Benedičič, T. Kmecl, Universal motor efficiency improvement using evolutionary optimization, accepted for publication in IEEE Transactions on Industrial Electronics.
/9/ G. Papa, J. Šilc, Automatic Large-Scale Integrated Circuit Synthesis Using Allocation-Based Scheduling Algorithm, Microprocessors and Microsystems 26, 2002, pp. 139-147.
/10/ G. Papa, J. Šilc, Evolutionary Synthesis Algorithm - Genetic Operators Tuning, in: A. Grmela, N. Mastorakis (eds.), Advances in Intelligent Systems, Fuzzy Systems, Evolutionary Computation, WSEAS Press, 2002, pp. 256-261.
/11/ B. Parhami, Computer Arithmetic: Algorithms and Hardware Designs, Oxford University Press, New York, 2000.
/12/ P. G. Paulin, J. P. Knight, Force-directed Scheduling in Automatic Data Path Synthesis, Proc. 24th ACM/IEEE Design Automation Conference, Miami, USA, June 1987, pp. 195-202.
/13/ P. G. Paulin, J. P. Knight, Scheduling and Binding Algorithms for High-Level Synthesis, Proc. 26th ACM/IEEE Design Automation Conference, Las Vegas, USA, June 1989, pp. 1-6.
/14/ P. G. Paulin, J. P. Knight, E. F. Girczyc, HAL: A Multiparadigm Approach to Automatic Data Path Synthesis, Proc. 23rd ACM/IEEE Design Automation Conference, Las Vegas, USA, June 1986, pp. 263-270.

Gregor Papa
Institut "Jožef Stefan"
Computer Systems Department, Jamova c. 39, 1000 Ljubljana, Slovenia
tel. +386 1 4773 514
fax. +386 1 4773 882
Email: gregorpapa@ljs.si

Prispelo (Arrived): 13.05.2003
Sprejeto (Accepted): 26.08.2003