
International Journal of Theoretical and Applied Computer Sciences Volume 1 Number 1 (2006) pp. 35–47 (c) GBS Publishers and Distributors (India) http://www.gbspublisher.com/ijtacs.htm

An Evolutionary Algorithm Based System Level Synthesis for Platform Based Design

Rabindra Ku. Jena
Institute of Management Technology, Nagpur, India
E-mail: [email protected]

Gopal K. Sharma
ABV-Indian Institute of Information Technology & Management, Gwalior, India
E-mail: [email protected]

Abstract

This paper presents an evolutionary algorithm approach to the allocation, binding and scheduling problems of VLSI design at the system level. From a behavioral description given as a specification graph, the genetic algorithm generates the cheapest and fastest heterogeneous hardware/software architecture. The proposed approach is evaluated on a video decoder. The experimental results show the efficiency, accuracy and scalability of our approach.

Keywords: Design Space Exploration, System Level Synthesis, Genetic Algorithm, CAD for VLSI

1. Introduction

In order to bridge the gap between the growing complexity of chip development and increasing time-to-market pressure, a higher-level design methodology is mandatory. As a result, CAD tool developers have recently focused on automated synthesis at the system level, referred to as system level synthesis. System level synthesis is described as a mapping from a behavioral description, in which the functional objects have the granularity of tasks, procedures or processes, onto a structural specification whose structural objects are general and/or special purpose processors, ASICs, buses and memories.


Most system level synthesis approaches follow traditional problem-solving techniques such as dynamic programming and/or greedy methods. However, design space exploration at the system level is an NP-hard problem [11], so these conventional methods are inefficient and suffer from long run-times. A heuristic-based approach is therefore required for system level synthesis. A few heuristic-based approaches are found in the literature [3, 5], but most of them do not follow the semantics of system level design. This paper therefore proposes a new approach to system level synthesis based on a genetic algorithm. The main purpose of using a genetic algorithm for system level synthesis is to explore the huge search space in parallel.

The rest of the paper is organized as follows. Section 2 discusses existing approaches to system level synthesis reported in the literature. Based on the literature review, Section 3 formulates the problem and discusses the proposed approach. To apply the proposed approach, an example of a video decoder for image compression using the H.261 standard is considered and described in Section 4. Section 5 gives the details of the implementation of the proposed approach, the experimental results for the image compression application, and an analysis of the results. Finally, the paper is concluded in Section 6.

2. Existing Approaches to System Level Synthesis

Various approaches to system level synthesis have been reported in the literature [1-5, 7, 8, 10]. The existing approaches can be broadly classified according to their class of input specification or their use of optimization models and procedures. With respect to the input specification, approaches use either a control-dominant specification [1, 2, 7, 10] or a data-flow-dominant specification [5, 8]. The synthesis result is either a dedicated control and data path in VLSI, a multi-chip dedicated VLSI architecture, or a mixed hardware and software architecture. On the other hand, work on optimization methods can be classified into exact methods and heuristics. Exact methods include enumerative search methods based on integer linear programming [3]. This approach suffers from long run-times, so it is suitable only for small problems. Among the heuristic-based approaches, Gupta [5] presents a heuristic for hardware/software partitioning at the system level. Henkel et al. [4] used simulated annealing for the synthesis task. However, they considered the system level synthesis task only as a partitioning problem.

Moreover, in some of the approaches the architecture is either already fixed [4] or communication is not considered as part of the synthesis problem [3]. In contrast, our proposed method follows the semantics of system level synthesis and also considers communication as part of the synthesis task.


3. Proposed System Level Synthesis

Problem Formulation

Our proposed system level synthesis task is based on the specification model proposed by Teich et al. [11]. The specification model consists of the following components:

• Dependence graph: G(V, E) is a directed graph, where V is the set of nodes and E ⊆ V × V is the set of edges. There are two types of dependence graph:
  (1) the problem graph G_P(V_P, E_P), which is derived from a data flow graph as shown in Figure 1;
  (2) the architecture graph G_A(V_A, E_A), which is derived from a resource structure graph as shown in Figure 2.

• Specification graph: G_S(V_S, E_S) consists of D dependence graphs G_i(V_i, E_i), 1 ≤ i ≤ D, and a set of mapping edges E_M (i.e. user-defined, problem-specific constraints between dependence graphs), as shown in Figure 3. D is the depth (number of levels) of G_S. In particular, V_S = ∪ V_i and E_S = (∪ E_i) ∪ E_M, where the unions run over 1 ≤ i ≤ D. Additional problem-specific constraints and parameters may be assigned to the specification graph. Figure 3 shows the specification graph of depth two obtained from Figure 1(b) and Figure 2(b).

In order to describe the mappings, an activation function is required. The activation function activates the nodes and edges of the specification graph. So, the function is defined as a mapping V_S ∪ E_S → {0, 1}. The value 1 is assigned to the activated nodes or edges and the value 0 is assigned to the non-activated nodes or edges.
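As an illustration of the model above, the following minimal Python sketch builds a toy specification graph of depth two and an activation function over its nodes and edges. The class name, field names and the tiny two-task example are assumptions made for illustration; they are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class SpecificationGraph:
    """Toy specification graph of depth two (hypothetical structure)."""
    problem_nodes: set      # V_P
    problem_edges: set      # E_P, pairs of problem nodes
    arch_nodes: set         # V_A
    arch_edges: set         # E_A, pairs of architecture nodes
    mapping_edges: set      # E_M, pairs (problem node, architecture node)

    def nodes(self):
        # V_S is the union of the node sets of all dependence graphs.
        return self.problem_nodes | self.arch_nodes

    def edges(self):
        # E_S is the union of all dependence-graph edges and the mapping edges.
        return self.problem_edges | self.arch_edges | self.mapping_edges


def make_activation(active):
    """Return a function mapping every node or edge to 1 (activated) or 0."""
    active = frozenset(active)
    return lambda x: 1 if x in active else 0


spec = SpecificationGraph(
    problem_nodes={"t1", "t2"},
    problem_edges={("t1", "t2")},
    arch_nodes={"cpu", "asic"},
    arch_edges=set(),
    mapping_edges={("t1", "cpu"), ("t2", "cpu"), ("t2", "asic")},
)

# Activate both tasks, the "cpu" resource and the edges that connect them.
activation = make_activation(
    {"t1", "t2", "cpu", ("t1", "t2"), ("t1", "cpu"), ("t2", "cpu")}
)

allocated_nodes = {v for v in spec.nodes() if activation(v) == 1}
activated_mapping_edges = {e for e in spec.mapping_edges if activation(e) == 1}
```

In this toy instance both tasks are mapped onto the same resource, so no communication resource needs to be activated.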

Figure 1: A data flow graph (a) and the corresponding problem graph (b)


Figure 2: Resource structure graph (a) and corresponding architecture graph (b)

Figure 3: Specification graph of depth two.

System Synthesis

Our synthesis problem is basically an allocation, binding and scheduling problem. Allocation, binding and scheduling are defined as follows.

Allocation: An allocation α of a specification graph is the subset of all activated nodes and edges of the dependence graphs. The nodes and edges selected by the allocation are called allocated nodes and allocated edges, respectively.

Binding: A binding β is the subset of all activated edges e, where e ∈ E_M.

Feasible binding: Given a specification graph G_S and an allocation α, a binding β is called feasible if:

1) each activated edge e ∈ β starts and ends at an allocated node;
2) for each allocated node v ∈ V_i, 1 ≤ i < D, exactly one outgoing edge e is activated, where e ∈ E_M and e ∈ β;
3) for each allocated edge e = (v_i, v_j) ∈ E_i, 1 ≤ i < D,
   i. either both operations are mapped onto the same node,
   ii. or there exists an allocated edge e_1 ∈ E_{i+1} to handle the communication associated with e.

Feasible allocation: A feasible allocation is an allocation that allows at least one feasible binding.

Schedule: A schedule is a function γ : V_P → Z+ that satisfies, for all edges e = (v_i, v_j) ∈ E_P,

γ(v_j) ≥ γ(v_i) + delay(v_i, β).
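The schedule condition above can be illustrated with a small Python sketch. It assumes that the binding has already been fixed, so that delay(v_i, β) reduces to a per-task delay stored in a dictionary, and that start times are counted from 0. The function names and the three-task example are hypothetical, and the as-soon-as-possible scheduler is a generic technique, not necessarily the scheduler used by the authors. `graphlib` requires Python 3.9 or later.

```python
from graphlib import TopologicalSorter


def is_valid_schedule(start, edges, delay):
    """Check start[vj] >= start[vi] + delay[vi] for every edge (vi, vj)."""
    return all(start[vj] >= start[vi] + delay[vi] for vi, vj in edges)


def asap_schedule(tasks, edges, delay):
    """As-soon-as-possible start times that satisfy the schedule condition."""
    preds = {t: set() for t in tasks}
    for vi, vj in edges:
        preds[vj].add(vi)
    start = {}
    # static_order() yields every task after all of its predecessors.
    for t in TopologicalSorter(preds).static_order():
        start[t] = max((start[p] + delay[p] for p in preds[t]), default=0)
    return start


tasks = ["a", "b", "c"]
edges = [("a", "b"), ("b", "c"), ("a", "c")]
delay = {"a": 2, "b": 3, "c": 1}   # delays under an assumed, fixed binding

start = asap_schedule(tasks, edges, delay)
```

Here task "c" cannot start before time 5, because its predecessor "b" starts at time 2 and takes 3 time units.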

To solve this system level synthesis problem, we formulate the synthesis problem as an optimization (minimization) problem as follows:

Minimize:   cost and delay associated with α, β, γ
Subject to: α is a feasible allocation,
            β is a feasible binding,
            γ is a schedule.

Optimization Method

The system level synthesis task is an NP-hard optimization problem with a large and discrete search space. A genetic algorithm [6] is a suitable candidate for solving this type of problem. The inherent parallelism of the genetic algorithm allows it to explore the design space in a single optimization run.

The proposed optimization methodology treats the synthesis problem as the mapping of an algorithm-level, dataflow-graph-based specification onto a heterogeneous hardware/software architecture. The problem requires:

1) selection of the architecture (allocation) from a specific set of possible architectures,
2) mapping of the algorithm onto the selected architecture in space (binding),
3) scheduling in time, and
4) design space exploration with the objective of finding a set of implementations under a set of constraints on cost and performance.

Our approach provides an evolutionary technique based on a genetic algorithm to perform steps (1)-(4) in a single run.

The data structures used to implement our synthesis problem are as follows.

Each chromosome is represented as a vector. The length of the vector equals the number of vertices in G_P, i.e. |V_P|; gene i holds the index of the resource assigned to task i.

The delay matrix D of dimension |V_P| × |V_A| holds the delays: D[i][j] is the delay of task i when it is executed on resource j.

A cost vector C of dimension |V_A| holds the cost of the resources: C[i] is the cost of resource i.

A constraint list L of dimension |V_P| points to the lists of valid resources: L[i] points to the list of valid resources for task i.
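A minimal Python sketch of these data structures is given below, under stated assumptions: indices are 0-based (the paper uses 1-based indices), the 3-task, 4-resource values are invented for illustration, and the helper `is_valid` is a hypothetical name, not a function from the paper.

```python
INVALID = -1  # marks a task/resource pair that cannot be used

# Hypothetical instance with 3 tasks and 4 resources.
# D[i][j]: delay of task i on resource j (INVALID if not executable there).
D = [
    [0, INVALID, INVALID, INVALID],
    [INVALID, 8, 2, INVALID],
    [INVALID, 1, INVALID, 3],
]

# C[j]: cost of resource j.
C = [0, 150, 50, 10]

# L[i]: list of valid resources for task i.
L = [[0], [1, 2], [1, 3]]


def is_valid(chromosome):
    """A chromosome is valid if it has one gene per task and every gene
    names a resource from that task's constraint list."""
    return len(chromosome) == len(L) and all(
        resource in L[task] for task, resource in enumerate(chromosome)
    )


chromosome = [0, 2, 1]  # task 0 -> resource 0, task 1 -> resource 2, task 2 -> resource 1
```

For example, `[0, 0, 1]` is rejected because resource 0 is not in the constraint list of task 1.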

The pseudo-code of the proposed genetic algorithm for the system level synthesis problem (SLSGA) is given below.

The Repair algorithm, which converts a chromosome with an invalid allocation resulting from the crossover and mutation operations into one with a valid allocation, is presented as follows:


The Commonlist() function returns a non-zero value if there exists a common element between the lists pointed to by L[predecessor(i)] and L[successor(i)]; otherwise it returns zero.

Crossover

The crossover operator randomly chooses a position in the chromosome. The genes of both chromosomes up to that position are then exchanged, as shown in the algorithm given below.

Algorithm Crossover (C1, C2)
{
    i = random (1, 2, ..., m);
    for j = 1 to i do
        Swap (C1[j], C2[j]);   // swap the contents of C1[j] and C2[j]
}

The crossover of two chromosomes C1 and C2 generates two new chromosomes. The random function returns a random number between 1 and m, where m is the number of genes in the chromosome. The Swap function swaps the genes between C1 and C2.
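The single-point crossover described above can be sketched in Python as follows. Unlike the pseudo-code, which swaps genes in place, this sketch returns two new lists; the `rng` parameter is an assumption added so that the random source can be seeded.

```python
import random


def crossover(c1, c2, rng=random):
    """Single-point crossover: exchange the first i genes of two chromosomes."""
    if len(c1) != len(c2):
        raise ValueError("chromosomes must have the same length")
    m = len(c1)
    i = rng.randint(1, m)        # number of leading genes to swap, 1..m inclusive
    child1 = c2[:i] + c1[i:]
    child2 = c1[:i] + c2[i:]
    return child1, child2


parent1 = [1, 1, 1, 1]
parent2 = [2, 2, 2, 2]
child1, child2 = crossover(parent1, parent2, random.Random(0))
```

When i equals m, the two children are simply the two parents exchanged, which matches the behaviour of the pseudo-code.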


Mutation

The mutation operator chooses a gene at random from a chromosome and replaces the contents of the gene with a resource having lower cost and delay.

Algorithm Mutation (C1)
{
    i = random (1, 2, ..., m);
    replace (C1[i], L[i]);
}
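A possible Python reading of the mutation operator is sketched below. The paper does not state how the replacement is chosen among several better resources or what happens when none exists, so choosing uniformly at random and leaving the gene unchanged are assumptions. The data (D, C, L) are invented, 0-based illustration values.

```python
import random

# Hypothetical instance: 3 tasks, 4 resources (0-based indices).
D = [[0, -1, -1, -1], [-1, 8, 2, -1], [-1, 1, -1, 3]]   # delays, -1 = not executable
C = [0, 150, 50, 10]                                    # resource costs
L = [[0], [1, 2], [1, 3]]                               # valid resources per task


def mutate(chromosome, rng=random):
    """Replace one randomly chosen gene by a valid resource that has a lower
    cost or a lower delay; leave the gene unchanged if none exists."""
    i = rng.randrange(len(chromosome))
    current = chromosome[i]
    better = [
        r for r in L[i]
        if r != current and (C[r] < C[current] or D[i][r] < D[i][current])
    ]
    mutated = list(chromosome)
    if better:
        mutated[i] = rng.choice(better)
    return mutated


original = [0, 1, 1]
mutated = mutate(original, random.Random(3))
```

Because candidates are drawn only from L[i], a mutated gene always names a resource on which the task can execute.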

The replace function replaces the resource at C1[i] with a resource having lower cost or delay from the list of resources pointed to by L[i].

4. An Application

We explain our methodology using the example of a video decoder (H.261 standard) for image compression [11]. Its block diagram is shown in Figure 4. Basically, it is the lower part of the loop of the coding algorithm used for image compression. Here the motion compensation vector has to be extracted from the input data after the run length decoding (RLD) operation. As the amount of transmitted data is small compared to the size of a macro block, this transmission is assumed to take zero time (symbolized by a "0" above node 12 in Figure 4). The numbers inside the boxes and circles are the indices of the functional and communication nodes, respectively. The proposed resource structure graph is shown in Figure 5.

Figure 4: Problem graph for video decoder


Figure 5: The proposed resource structure graph

[Figure 5 labels, resource name (index): INM (1), OUTM (2), RISC1 (3), RISC2 (4), FM (5), DPFM (6), BMM (7), DSP (8), SAM (9), DCTM (10), HC (11), SBS (12), SBM (13), SBF (14)]

Cost vector (C)

Index:   1   2   3   4   5   6   7   8   9  10  11  12  13  14
Cost:    0   0 150 150  20  40 100 200  50 100  50  10  20  30

The cost vector (C) gives the cost of each resource of the proposed resource structure graph. For example, the cost of the resource at index 3, which is a RISC processor (identified as RISC1 in Figure 5), is 150 units.

Delay Matrix (D)

Task   1   2   3   4   5   6   7   8   9  10  11  12  13  14
   1   0  -1  -1  -1  -1  -1  -1  -1  -1  -1  -1  -1  -1  -1
   2  -1  -1   8   8  -1  -1  -1   8  -1  -1   2  -1  -1  -1
   3  -1  -1   1   1  -1  -1  -1   2  -1  -1   1  -1  -1  -1
   4  -1  -1  -1  -1   0   0  -1  -1  -1  -1  -1  -1  -1  -1
   5  -1  -1   9   9  -1  -1  -1   3  -1  -1   2  -1  -1  -1
   6  -1  -1   8   8  -1  -1  -1   4  -1   2  -1  -1  -1  -1
   7  -1  -1   2   2  -1  -1  -1   2   1  -1  -1  -1  -1  -1
   8  -1  -1  -1  -1   0   0  -1  -1  -1  -1  -1  -1  -1  -1
   9  -1   0  -1  -1  -1  -1  -1  -1  -1  -1  -1  -1  -1  -1
  10  -1  -1  -1  -1  -1  -1  -1  -1  -1  -1  -1   3   2   1
  11  -1  -1   0   0  -1  -1  -1   0  -1  -1   0   3   2   1
  12  -1  -1   0   0  -1  -1  -1   0  -1  -1  -1   0   0   0
  13  -1  -1  -1  -1  -1  -1  -1  -1  -1  -1  -1   1   2   3
  14  -1  -1   0   0  -1  -1  -1   0  -1  -1  -1   1   2   3
  15  -1  -1   0   0  -1  -1  -1   0  -1  -1  -1   1   2   3
  16  -1  -1   0   0  -1  -1  -1   0  -1  -1  -1   1   2   3
  17  -1  -1  -1  -1  -1  -1  -1  -1  -1  -1  -1   1   2   3
  18  -1  -1  -1  -1  -1  -1  -1  -1  -1  -1  -1   1   2   3

Table 1: Delay table (rows: task index; columns: resource index)


The delay matrix (D), which gives the delay of the hardware resources (both computation and communication), is shown in Table 1. Each row corresponds to the index of a node in the problem graph (a task), as shown in Figure 4, and each column corresponds to the index of a node in the resource structure graph (a resource), as shown in Figure 5. Each cell of the table contains either a non-negative or a negative value. A non-negative value is the delay incurred when the task is executed on the particular resource. For example, D2,3 is 8 units, i.e. the delay required when task 2 executes on resource 3. A negative value indicates that the task cannot execute on that resource. Our algorithm (SLSGA), stated in Section 3, uses only the non-negative values.
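As a short worked example of reading the delay table, the Python sketch below takes the rows for tasks 2 and 7 from Table 1 and extracts the resources on which each task can execute, together with the corresponding delays. The function name `valid_resources` is an assumption; resource indices are kept 1-based to match the table.

```python
# Rows for tasks 2 and 7, copied from Table 1 (columns are resources 1..14).
ROW_TASK_2 = [-1, -1, 8, 8, -1, -1, -1, 8, -1, -1, 2, -1, -1, -1]
ROW_TASK_7 = [-1, -1, 2, 2, -1, -1, -1, 2, 1, -1, -1, -1, -1, -1]


def valid_resources(row):
    """Map each 1-based resource index to its delay, skipping negative entries."""
    return {j: d for j, d in enumerate(row, start=1) if d >= 0}


task2 = valid_resources(ROW_TASK_2)
task7 = valid_resources(ROW_TASK_7)
```

For task 2 this yields resources 3, 4, 8 and 11, with the delay of 8 units on resource 3 quoted in the text.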

The constraint list L (Figure 6) is derived from the specification graph. Each entry in L corresponds to the index of a task (Figure 4). The head of each arrow points to the list of resources that can be assigned to that particular task.

Constraint List (L):

Figure 6: Constraint list.

5. Experimental Results and Analysis

We implemented the proposed algorithm on a Pentium PC running Linux. The best result was found with the following parameter settings.

Genetic Operators
Selection: Tournament scheme with replacement
Crossover: Single point crossover with probability Pc
Mutation: Mutation with probability Pm
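The selection operator can be sketched in Python as follows. The paper does not give the tournament size or the exact meaning of "with replacement"; this sketch assumes a tournament of size k = 2 whose contestants are drawn with replacement, and it treats a lower objective value as better because the problem is a minimization. The function name, the `rng` parameter and the toy population are hypothetical.

```python
import random


def tournament_select(population, objective, k=2, rng=random):
    """Draw k contestants with replacement and return the one with the
    lowest objective value (the problem is a minimization)."""
    contestants = [rng.choice(population) for _ in range(k)]
    return min(contestants, key=objective)


# Toy population: each "chromosome" is a list whose sum stands in for its objective value.
population = [[3, 4], [1, 1], [2, 5]]
winner = tournament_select(population, objective=sum, rng=random.Random(1))
```

A larger k increases the selection pressure, since the best individual in the population is more likely to take part in each tournament.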


Objective function: F(α, β, γ) = weighted mean of C and D = (a·C + b·D) / (a + b), where C and D here denote the cost and the delay of the implementation. In our case we choose a = b = 1.
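The weighted-mean objective can be written as a short Python sketch. How the total cost of an implementation is obtained from the cost vector is not spelled out in the text; summing the cost of each distinct resource used by a chromosome is an assumption, and the delay is simply passed in as a number. All names and values are illustrative.

```python
def allocation_cost(chromosome, cost_vector):
    """Assumed cost model: each distinct resource used is paid for once."""
    return sum(cost_vector[r] for r in set(chromosome))


def objective(cost, delay, a=1.0, b=1.0):
    """Weighted mean of cost and delay: (a*C + b*D) / (a + b)."""
    return (a * cost + b * delay) / (a + b)


cost_vector = [0, 150, 50, 10]      # hypothetical resource costs (0-based)
chromosome = [1, 1, 2]              # tasks 0 and 1 share resource 1
total_cost = allocation_cost(chromosome, cost_vector)
value = objective(total_cost, delay=12)
```

With a = b = 1 the objective is the plain average of cost and delay, so in this example (200 + 12) / 2 = 106.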

To obtain the best result, we ran the experiment a number of times while varying parameters such as the mutation probability, the crossover probability and the population size. Figure 7 shows the value of the objective function versus the number of generations for varying mutation and crossover probabilities. The best result was found with Pc = 0.98, Pm = 0.01 and a population size of 120. Figure 8 shows the relationship between the number of generations, the population size and the objective function value. Figure 9 shows the fastest and cheapest architecture for the video decoder obtained with our proposed algorithm.

Figure 7: Number of generations versus objective function values.

Figure 8: Number of generations versus population sizes versus objective function values.


Figure 9: Architecture of the cheapest and fastest implementation.

6. Conclusion

System level design methodology and synthesis is an important concern among researchers, as it helps to cope with time-to-market pressure and chip complexity. Our work can be treated as one step forward in this direction. The running time and the quality of the solution can be improved by a parallel implementation of the genetic algorithm (PGA), with different combinations and permutations of the PGA parameters. Other dimensions of system synthesis, such as power and energy consumption, can also be incorporated into design space exploration at the system level.

[Figure 9 resource labels: FM, BMM, INM, SAM, OUTM, DCTM]

References

[1] Buchenrieder, K., Sedlmeier, A. and Veith, C. (1995). Codes: a framework for heterogeneous system. In Co-design: Computer-Aided Software/Hardware Engineering, IEEE Press, Piscataway, NJ, USA, 378-392.

[2] Blickle, T. and Thiele, L. (1996). A comparison of selection schemes used in evolutionary algorithms. Evolutionary Computation, 4(4), 361-394.

[3] D'Ambrosio, J.G. and Hu, X. (1994). Configuration level hardware/software partition for real time embedded systems. In Proc. of CODES/CASHE'94, 3rd International Workshop on Hardware/Software Co-design, Grenoble, France, 34-41.

[4] Ernst, R., Henkel, J. and Benner, T. (1993). Hardware-software co-synthesis for microcontrollers. IEEE Design and Test of Computers, 64-75.

[5] Gupta, R. and De Micheli, G. (1992). System-level synthesis using re-programmable components. In Proc. of EDAC, 2-7.

[6] Goldberg, D.E. (1989). Genetic Algorithms in Search, Optimization, and Machine Learning. Addison Wesley, Reading, MA.

[7] Ismail, T.B., O'Brien, K. and Jerraya, A.A. (1994). Interactive system level partitioning with PARTIF. In Proc. of the European Conference on Design Automation (EDAC), 464-474.

[8] Kalavade, A. and Lee, E.A. (1995). A global criticality/local phase driven algorithm for the constrained hardware/software partitioning problem. Third International Workshop on Hardware/Software Codesign, Grenoble, France.

[9] Lagnese, E.D. and Thomas, D.E. (1995). Architectural partitioning for system level synthesis of integrated circuits. IEEE Trans. on CAD, 10(7), 355-369.

[10] Thomas, D.E., Adams, J.K. and Schmitt, H. (1993). A model and methodology for hardware-software codesign. IEEE Design and Test of Computers, 10(3), 6-15.

[11] Teich, J., Blickle, T. and Thiele, L. (1997). An evolutionary approach to system-level synthesis. In Proc. of CODES/CASHE'97, Braunschweig, Germany, 167-171.

[12] Vahid, F. and Gajski, D. (1992). Specification partitioning for system design. In Proc. 29th Design Automation Conference, Anaheim, CA, 219-224.