DAOmap: A Depth-optimal Area Optimization Mapping Algorithm for FPGA Designs
Deming Chen, Jason Cong, Computer Science Department, UCLA
Presented by Shikang Xu


Page 1: DAOmap: A Depth-optimal Area Optimization Mapping Algorithm for FPGA Designs

DAOmap: A Depth-optimal Area Optimization Mapping Algorithm for FPGA Designs

Deming Chen, Jason Cong, Computer Science Department, UCLA

Presented by Shikang Xu

Page 2: DAOmap: A Depth-optimal Area Optimization Mapping Algorithm for FPGA Designs

Outline

• Introduction
• Related Works
• Definitions and Problem Formulation
• Algorithm Description
• Discussion of Techniques

Page 3: DAOmap: A Depth-optimal Area Optimization Mapping Algorithm for FPGA Designs

Introduction

• The LUT-based FPGA architecture dominates the existing programmable chip industry

• FPGA technology mapping converts a given Boolean circuit into a functionally equivalent network comprised only of LUTs


Page 4: DAOmap: A Depth-optimal Area Optimization Mapping Algorithm for FPGA Designs

Related Works

• Area Minimization
– Chortle-crf [Francis et al., DAC’91]
– MIS-pga [Murgai et al., ICCAD’91]
– Praetor [Cong et al., FPGA’99]
– Anti-fuse FPGA Mapper [Kang et al., ASPDAC’04]

• Delay Minimization
– DAG-Map [Chen et al., DTC’92]
– FlowMap [Cong et al., ICCAD’92]
– Edge-map [Yang et al., ICCAD’94]

• Power Minimization
– PowerMinMap [Li et al., ASPDAC’03]
– Emap [Lamoureux et al., ICCAD’03]
– DVmap [Chen et al., FPGA’04]

• Simultaneous Delay and Area Minimization (Area Minimization under Timing Constraints)
– FlowMap-r [Cong et al., TVLSI’94]
– CutMap [Cong et al., FPGA’95]
– BoolMap-D [Legl et al., DAC’96]

Adopted from Deming Chen, Jason Cong, Computer Science Department, UCLA

Page 5: DAOmap: A Depth-optimal Area Optimization Mapping Algorithm for FPGA Designs

Definitions

• Cone (Ov): A subnetwork of the original network, consisting of v and some of its predecessors, such that for any node w in Ov, there is a path from w to v in Ov.

• Fanin cone (Fv): The maximum cone of v, consisting of all PI predecessors of v.

• Input(Ov): The set of distinct nodes outside Ov that supply inputs to the gates in Ov.

• Cut: A partitioning (X, X’) of a cone Ov such that X’ is a cone of v.

• Cut-set: Written V(X, X’); it consists of input(X’).

Page 6: DAOmap: A Depth-optimal Area Optimization Mapping Algorithm for FPGA Designs

Definitions

• Cutsize: The cardinality of the cut-set. A cut is said to be K-feasible if its cutsize is <= K.

• Level: The level of a node v is the length of the longest path from any PI to the node v.

• Depth: The depth of a network is the largest node level in the network.

• Mapping Depth: The largest optimal delay of the mapped circuit.

[Figure: fanin cone Fv of node v above the PIs, with nodes a, b, c, d, e and a 3-feasible cone Cv marked; the example is labeled "Delay of 2". Picture adopted from Deming Chen, Jason Cong, Computer Science Department, UCLA]
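The level and depth definitions can be illustrated with a minimal sketch. The network below (the `fanins` dictionary and its node names) is a hypothetical example and not the network in the slide's picture; PIs are assumed to have level 0.

```python
from functools import lru_cache

# Hypothetical network: each node maps to its list of fanins; PIs have none.
fanins = {
    "a": [], "b": [], "c": [],
    "d": ["a", "b"],
    "e": ["b", "c"],
    "v": ["d", "e"],
}

@lru_cache(maxsize=None)
def level(node):
    """Length of the longest path from any PI to node (PIs assumed at level 0)."""
    ins = fanins[node]
    return 0 if not ins else 1 + max(level(u) for u in ins)

# The depth of the network is the largest node level.
depth = max(level(n) for n in fanins)
print(level("v"), depth)
```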

Page 7: DAOmap: A Depth-optimal Area Optimization Mapping Algorithm for FPGA Designs

Problem Formulation

Area Minimization under Timing Constraint

• Given: a Boolean network and the unit delay model (each LUT contributes one unit of delay)

• Goal: cover the network with K-feasible cones (K-LUTs) such that
– Optimal mapping depth is guaranteed
– Area (number of LUTs) is minimized

Page 8: DAOmap: A Depth-optimal Area Optimization Mapping Algorithm for FPGA Designs

Algorithm Description

A Cut-enumeration-based method consisting of cut generation and cut selection

• Cut generation traverses the network from the PIs to the POs and combines subcuts on the fanin nodes of a target node to generate all the cuts on the target node.

• After the cuts are generated, the network is traversed from the POs to the PIs, and cuts are selected to produce the LUT mapping result.

Page 9: DAOmap: A Depth-optimal Area Optimization Mapping Algorithm for FPGA Designs

Cut Enumeration

• Cut enumeration means generating all K-feasible cuts of a cone for a given node effectively:

$$f(K, v) = \bigotimes^{K}_{u \in input(v)} \left[ u + f(K, u) \right]$$

f(K, v) represents all the K-feasible cuts rooted at node v, operator + is Boolean OR, and ⊗K is Boolean AND on its operands, but filtering out all the resulting p-terms with more than K variables.

Page 10: DAOmap: A Depth-optimal Area Optimization Mapping Algorithm for FPGA Designs

Cut Enumeration: Example

All the cuts rooted on node s can be generated by combining the cuts rooted on its fanin nodes q and r. The cuts on the fanin nodes are called subcuts. Combining subcut C1 with subcut C2 forms a new cut Cs = {m, n, o, p} rooted on s. If the number of inputs of the new cut exceeds K, the cut is discarded.
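The combination step can be sketched as follows. This is a minimal illustration, not DAOmap's implementation: the function name `enumerate_cuts` and the network structure (q fed by m and n, r fed by o and p) are assumptions, since the slide's picture is not available here.

```python
from itertools import product

def enumerate_cuts(fanins, order, K):
    """Return {node: set of K-feasible cuts}, each cut a frozenset of inputs.

    `order` must be topological (PIs first). PIs get no cuts of their own.
    """
    cuts = {}
    for v in order:
        ins = fanins[v]
        if not ins:
            cuts[v] = set()
            continue
        # For each fanin u the choices are u itself or any subcut rooted at u.
        choices = [cuts[u] | {frozenset([u])} for u in ins]
        merged = set()
        for combo in product(*choices):
            cut = frozenset().union(*combo)
            if len(cut) <= K:          # discard cuts with more than K inputs
                merged.add(cut)
        cuts[v] = merged
    return cuts

# Assumed structure for the example: q = g(m, n), r = g(o, p), s = g(q, r).
fanins = {"m": [], "n": [], "o": [], "p": [],
          "q": ["m", "n"], "r": ["o", "p"], "s": ["q", "r"]}
order = ["m", "n", "o", "p", "q", "r", "s"]

cuts4 = enumerate_cuts(fanins, order, K=4)
cuts3 = enumerate_cuts(fanins, order, K=3)
print(sorted(sorted(c) for c in cuts4["s"]))
```

With K = 4 the cut {m, n, o, p} is kept on s; with K = 3 it is filtered out.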

Page 11: DAOmap: A Depth-optimal Area Optimization Mapping Algorithm for FPGA Designs

Cut Enumeration: Time propagation

• The arrival time propagates through each cut; each cut represents a LUT and hence a unit delay. The minimum arrival time at a node v is

$$Arr_v = \min_{\forall C \text{ on } v} \left[ \max_{i \in input(C)} Arr_i + 1 \right]$$

where C ranges over every cut generated for v through cut enumeration and Arr_i is the minimum arrival time on input signal i of C. Several cuts may achieve Arr_v; they form a set Xv.
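A minimal sketch of this propagation under the unit delay model is shown below. The function name `arrival_times`, the example cut sets, and the convention that PIs arrive at time 0 are assumptions made for illustration.

```python
def arrival_times(cuts, order):
    """Propagate minimum arrival times; return (arr, X) where X[v] is the
    list of cuts on v that achieve the minimum arrival time arr[v]."""
    arr, X = {}, {}
    for v in order:                      # topological order, PIs first
        if not cuts[v]:                  # PI: assumed to arrive at time 0
            arr[v], X[v] = 0, []
            continue
        # Each cut is one LUT: one unit of delay after its latest input.
        cut_arr = {c: 1 + max(arr[i] for i in c) for c in cuts[v]}
        arr[v] = min(cut_arr.values())
        X[v] = [c for c, t in cut_arr.items() if t == arr[v]]
    return arr, X

# Hypothetical cut sets (K = 4) for s = g(q, r), q = g(m, n), r = g(o, p).
cuts = {
    "m": set(), "n": set(), "o": set(), "p": set(),
    "q": {frozenset("mn")},
    "r": {frozenset("op")},
    "s": {frozenset("qr"), frozenset("qop"), frozenset("mnr"), frozenset("mnop")},
}
order = ["m", "n", "o", "p", "q", "r", "s"]
arr, X = arrival_times(cuts, order)
print(arr["s"], X["s"])
```

Here only the cut {m, n, o, p} reaches s in one LUT level, so it is the single member of Xs.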

Page 12: DAOmap: A Depth-optimal Area Optimization Mapping Algorithm for FPGA Designs

Cut Enumeration: Area Propagation

• Similar to the arrival time, the area can also be propagated. The area of a cut C is calculated as

$$A_C = \sum_{i \in input(C)} \left[ A_i / f(i) \right] + U_C$$

where U_C is the area contributed by the cut C, A_i is the estimated area of the cone rooted on signal i, and f(i) is the fanout number of signal i. That is, the area on i is shared among the fanout nodes of i.
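The area formula can be sketched in the same style. This sketch assumes U_C = 1 for every cut (one LUT), takes the area of a node as the minimum over its cuts, and ignores the timing constraint; the function name `propagate_area` and the fanout counts are hypothetical.

```python
def propagate_area(cuts, fanout, order, unit_cost=1.0):
    """Return (area, cut_area): estimated area per node and per (node, cut)."""
    area, cut_area = {}, {}
    for v in order:                      # topological order, PIs first
        if not cuts[v]:                  # PIs contribute no area
            area[v] = 0.0
            continue
        for c in cuts[v]:
            # A_C = sum over inputs of A_i / f(i), plus U_C (assumed constant).
            cut_area[(v, c)] = unit_cost + sum(area[i] / fanout[i] for i in c)
        area[v] = min(cut_area[(v, c)] for c in cuts[v])
    return area, cut_area

# Hypothetical example: q also drives another node, so f(q) = 2.
cuts = {
    "m": set(), "n": set(), "o": set(), "p": set(),
    "q": {frozenset("mn")},
    "r": {frozenset("op")},
    "s": {frozenset("qr"), frozenset("qop"), frozenset("mnr"), frozenset("mnop")},
}
fanout = {"m": 1, "n": 1, "o": 1, "p": 1, "q": 2, "r": 1, "s": 1}
order = ["m", "n", "o", "p", "q", "r", "s"]
area, cut_area = propagate_area(cuts, fanout, order)
print(area["s"], cut_area[("s", frozenset("qr"))])
```

Because q has two fanouts, only half of its area is charged to the cut {q, r} on s.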

Page 13: DAOmap: A Depth-optimal Area Optimization Mapping Algorithm for FPGA Designs

Delay and Area Propagation

[Figure: example network with nodes a, b, c, d, e, f, g, w, x, y, z; each cut is annotated with its delay and area (for example "Delay 1, Area 1" or "Delay 2, Area 3") and each node with its optimal delay and area (for example "Optimal Delay = 2, Area = 2").]

• The propagation process visits cuts and nodes iteratively
• The longest best delay on the POs is the optimal mapping delay

Adopted from Deming Chen, Jason Cong, Computer Science Department, UCLA

Page 14: DAOmap: A Depth-optimal Area Optimization Mapping Algorithm for FPGA Designs

Area propagation under Timing constraints

• To guarantee optimal mapping depth, we need to propagate the estimated area together with the minimum arrival time:

$$A_v = \min_{\forall C \in X_v} \left[ A_C \right]$$

Av represents the best achievable area under the constraint that it also yields the optimal mapping delay up to the point of v.

• With these formulae, the areas of cuts and nodes are iteratively calculated until the enumeration process reaches the POs.

• During cut selection, when we know that v is not on a critical path, a cut C not belonging to Xv can be chosen as long as it does not violate the timing constraint.

Page 15: DAOmap: A Depth-optimal Area Optimization Mapping Algorithm for FPGA Designs

Cost function of a cut

Some key parameters:

• IC: cutsize of C
• NC: number of nodes covered by C
• f(v): fanout number of the root node v
• RC: number of reconvergent paths

Page 16: DAOmap: A Depth-optimal Area Optimization Mapping Algorithm for FPGA Designs

Example of Cost function

• In the example above, C1 and C2 have the same cutsize, but C2 is better
• C2 covers two sets of reconvergent paths
• Having a cut rooted at node 5 will reduce potential duplications

Page 17: DAOmap: A Depth-optimal Area Optimization Mapping Algorithm for FPGA Designs

Global Duplication Cost Adjustment

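• Consider potential node duplications
• Check the sub-cuts for multiple fanouts
• Propagate adjusted cost globally

$$A_C = \sum_{i \in input(C)} \left[ A_i / f(i) \right] + U_C + P_{f1} + P_{f2}$$

$$P_f = \begin{cases} N_{C_f} / I_{C_f} & \text{if } f(i) > 1 \\ 0 & \text{otherwise} \end{cases}$$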

Page 18: DAOmap: A Depth-optimal Area Optimization Mapping Algorithm for FPGA Designs

Cut Selection

• From POs to PIs

• Critical paths: optimal delay + best area available

• Non-critical paths: relaxed delay + better area
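The PO-to-PI selection pass can be sketched as below. This is a simplified illustration and not DAOmap's procedure: it picks, for each needed node, the cheapest cut that still meets the node's required time, using a shared-area estimate as the cost. The function name `select_cuts`, the example network, and the precomputed `arr` and `area` values are all hypothetical.

```python
def select_cuts(cuts, arr, area, fanout, order, pos):
    """Walk from POs to PIs; return {node: chosen cut} for every mapped LUT."""
    depth = max(arr[p] for p in pos)         # optimal mapping depth
    req = {p: depth for p in pos}            # required time at each needed node
    chosen = {}
    for v in reversed(order):                # reverse topological order
        if v not in req or not cuts[v]:      # not needed, or a PI
            continue
        # Any cut that meets the required time is allowed (slack is usable).
        feasible = [c for c in cuts[v] if 1 + max(arr[i] for i in c) <= req[v]]
        best = min(feasible,
                   key=lambda c: 1 + sum(area[i] / fanout[i] for i in c))
        chosen[v] = best
        for i in best:                       # inputs must arrive one level earlier
            req[i] = min(req.get(i, depth), req[v] - 1)
    return chosen

# Hypothetical network (K = 3): x = g(a, b), y = g(x, c), z = g(y, d); POs: z, x.
cuts = {
    "a": set(), "b": set(), "c": set(), "d": set(),
    "x": {frozenset("ab")},
    "y": {frozenset("xc"), frozenset("abc")},
    "z": {frozenset("yd"), frozenset("xcd")},
}
arr = {"a": 0, "b": 0, "c": 0, "d": 0, "x": 1, "y": 1, "z": 2}
area = {"a": 0.0, "b": 0.0, "c": 0.0, "d": 0.0, "x": 1.0, "y": 1.0}
fanout = {"a": 1, "b": 1, "c": 1, "d": 1, "x": 2, "y": 1, "z": 1}
order = ["a", "b", "c", "d", "x", "y", "z"]

chosen = select_cuts(cuts, arr, area, fanout, order, pos=["z", "x"])
print(chosen)
```

In this example z reuses the LUT that the PO x needs anyway, so y is never mapped and the result uses two LUTs at the optimal depth of 2.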


Page 19: DAOmap: A Depth-optimal Area Optimization Mapping Algorithm for FPGA Designs

Cut Selection

• Greedily picking the cuts with the smallest costs forfeits some local opportunities to reduce duplication.

• Use heuristics to guide the selection procedure
– Iterative Cut Selection Procedure
– Local Cost Adjustment
    • Input Sharing
    • Slack Distribution
    • Cut Probing

Page 20: DAOmap: A Depth-optimal Area Optimization Mapping Algorithm for FPGA Designs

Efficiency

• With DAOmap, the researchers report better area and lower runtime than CutMap.

• The impact of the various techniques on the final area values is shown here.

Page 21: DAOmap: A Depth-optimal Area Optimization Mapping Algorithm for FPGA Designs

Efficiency: Impact of Techniques

• Input sharing proves to be the most important technique for reducing area because it reduces the number of edges and node duplications.

• The mincost propagation evaluates how accurate the cost estimation model is.

• Global duplication cost adjustment offers the next largest gain, which shows that duplication of nodes adds to the area cost.

Page 22: DAOmap: A Depth-optimal Area Optimization Mapping Algorithm for FPGA Designs

Summary

• A cut-enumeration-based cut generation and selection process for LUT mapping

• Novel techniques give DAOmap significant area and runtime reductions over the state-of-the-art algorithm CutMap