8/13/2019 10.Synthesis 1(1)
Mehrdad Nourani
Dept. of EE
Univ. of Texas at Dallas
EEDG/CE 6301: Advanced Digital Logic
Synthesis and Design Automation
Session 10
Architectural Synthesis
Synthesis
Transform behavioral into structural view.
Architectural-level synthesis
Architectural abstraction level.
Determine macroscopic structure.
Example: major building blocks such as adders, registers, and muxes.
Logic-level synthesis
Logic abstraction level.
Determine microscopic structure.
Example: logic gate interconnection.
Synthesis and Optimization
Architectural-Level Synthesis Motivation
Raise input abstraction level.
Reduce specification of details.
Extend designer base.
Self-documenting design specifications.
Ease modifications and extensions.
Reduce design time.
Explore and optimize macroscopic structure
Series/parallel execution of operations.
Architectural-Level Synthesis
Translate HDL models into sequencing graphs.
Behavioral-level optimization
Optimize abstract models independently from the implementation parameters.
Architectural synthesis and optimization
Create macroscopic structure: data-path and control-unit.
Consider area and delay information of the implementation.
Example - Pseudo Code
Second-order differential equation solver
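The pseudo code on this slide did not survive as text. A minimal Python sketch of a second-order differential equation solver of this kind, assuming the commonly used textbook form y'' + 3xy' + 3y = 0 integrated with forward Euler steps, could look as follows. The function name `diffeq`, its parameter names, and the loop exit condition are assumptions made for illustration.

```python
def diffeq(x, y, u, dx, a):
    """Forward-Euler integration of y'' + 3*x*y' + 3*y = 0 (assumed form).

    x, y, u are the current abscissa, value and derivative; dx is the step
    size and a is the end of the integration interval (all names assumed).
    """
    while True:
        x1 = x + dx
        u1 = u - (3 * x * u * dx) - (3 * y * dx)   # update of the derivative
        y1 = y + u * dx                            # update of the value
        x, u, y = x1, u1, y1
        if not x < a:                              # assumed loop exit test
            break
    return y

print(diffeq(0.0, 1.0, 0.0, 0.5, 1.0))
```

One loop iteration contains six multiplications and five ALU-type operations (two additions, two subtractions, one comparison), which is the operation mix used in the scheduling examples later in the deck.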
Example - VHDL
Example - Verilog
Dataflow Graphs
Behavioral views of
architectural models.
Useful to represent data-paths.
Graph
Vertices = operations.
Edges = dependencies.
Dependencies arise due to
Input to an operation is the result of another operation.
Serialization constraints in the specification.
Two tasks share the same resource.
Dataflow Graphs (cont.)
Assumes the existence of variables that store the information required and generated by operations.
Each variable has a lifetime, which is the interval from birth to death.
Variable birth is the time at which the value is generated.
Variable death is the latest time at which the value is referenced as input to an operation.
Values must be preserved during the lifetime.
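The lifetime definition can be sketched in a few lines of Python. This is an illustration only: the operation names, the schedule, and the convention that a value is born when its producer finishes (start time plus delay) are assumptions.

```python
def variable_lifetimes(start, delay, edges):
    """Return {producer: (birth, death)} for every value that is consumed."""
    lifetimes = {}
    for producer, consumer in edges:
        birth = start[producer] + delay[producer]   # value is generated
        death = start[consumer]                     # value is read as an input
        _, latest = lifetimes.get(producer, (birth, death))
        lifetimes[producer] = (birth, max(latest, death))
    return lifetimes

# Hypothetical schedule: "mul" feeds both "add" and "sub".
start = {"mul": 1, "add": 2, "sub": 4}
delay = {"mul": 1, "add": 1, "sub": 1}
edges = [("mul", "add"), ("mul", "sub")]
lifetimes = variable_lifetimes(start, delay, edges)
print(lifetimes)
```

The result of `mul` must be kept in a register from step 2 until its last reader starts in step 4.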
Sequencing Graphs
Useful to represent data-path and control.
Extended dataflow graphs: Control Data Flow Graphs (CDFGs).
Polar: source and sink.
Operation serialization.
Hierarchy.
Control-flow commands: branching and iteration.
Paths in the graph represent concurrent streams of operations.
Behavioral-level optimization
Tree-height reduction using commutative and associative properties
x = a + b * c + d
=>
x = (a + d) + b * c
Tree-height reduction using the distributive property
x = a * (b * c * d + e)
=>
x = a * b * c * d + a * e
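The effect of both rewrites on tree height can be checked with a small sketch. Expressions are written as nested `(op, left, right)` tuples; the left-to-right parse of the original expressions and the balanced grouping `(a * b) * (c * d)` of the rewritten product are assumptions.

```python
def height(expr):
    """Height of an expression tree given as nested (op, left, right) tuples."""
    if isinstance(expr, str):          # a leaf variable
        return 0
    _, left, right = expr
    return 1 + max(height(left), height(right))

# x = a + b * c + d, evaluated left to right
before_1 = ("+", ("+", "a", ("*", "b", "c")), "d")
# x = (a + d) + b * c
after_1 = ("+", ("+", "a", "d"), ("*", "b", "c"))

# x = a * (b * c * d + e)
before_2 = ("*", "a", ("+", ("*", ("*", "b", "c"), "d"), "e"))
# x = a * b * c * d + a * e, with the product grouped as (a * b) * (c * d)
after_2 = ("+", ("*", ("*", "a", "b"), ("*", "c", "d")), ("*", "a", "e"))

print(height(before_1), height(after_1))   # first rewrite
print(height(before_2), height(after_2))   # second rewrite
```

With unlimited resources the tree height is the number of operation levels on the longest dependency chain, so a lower height allows a shorter schedule.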
Architectural Synthesis and Optimization
Synthesize macroscopic structure in terms of
building-blocks.
Explore area/performance trade-offs
maximum performance implementations subject to area constraints.
minimum area implementations subject to performance constraints.
Determine an optimal implementation.
Create logic model for data-path and control.
Circuit Specification for Architectural Synthesis
Circuit behavior
Sequencing graphs.
Building blocks
Resources.
Functional resources: process data (e.g. ALU).
Memory resources: store data (e.g. register).
Interface resources: support data transfer (e.g. MUX and buses).
Constraints
Interface constraints
Format and timing of I/O data transfers.
Implementation constraints
Timing and resource usage.
+ Area
+ Cycle-time and latency
Resources
Functional resources: perform operations on data.
Example: arithmetic and logic blocks.
Standard resources
Existing macro-cells.
Well characterized (area/delay).
Example: adders, multipliers, ALUs, Shifters, ...
Application-specific resources
Circuits for specific tasks.
Yet to be synthesized.
Example: instruction decoder.
Memory resources: store data.
Example: memory and registers.
Interface resources
Example: buses and ports.
Resources and Circuit Families
Resource-dominated circuits.
Area and performance depend on a few, well-characterized blocks.
Example: DSP circuits.
Non resource-dominated circuits.
Area and performance are strongly influenced by sparse logic, control and wiring.
Example: some ASIC circuits.
Synthesis in the Temporal Domain: Scheduling
Scheduling
Associate a start-time with each operation.
Satisfying all the sequencing (timing and resource) constraints.
Goal
Determine area/latency trade-off.
Determine latency and parallelism of the implementation.
Scheduled sequencing graph
Sequencing graph with start-time annotation.
Unconstrained scheduling.
Scheduling with timing constraints
Scheduling with resource constraints.
Tradeoff in Scheduling
[Figures: schedule with 4 multipliers and 2 ALUs; schedule with 1 multiplier and 1 ALU]
Tradeoff in Scheduling (cont.)
[Figures: schedule with 2 multipliers and 3 ALUs; schedule with 2 multipliers and 2 ALUs]
Synthesis in the Spatial Domain: Binding
Binding
Associate a resource with each operation of the same type.
Determine area of the implementation.
Sharing
Bind a resource to more than one operation.
Operations must not execute concurrently.
Bound sequencing graph
Sequencing graph with resource annotation.
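A minimal sketch of binding with sharing, assuming a schedule is already known. The greedy first-free-instance policy, the operation names, and the example schedule are assumptions chosen for illustration; the slides do not prescribe a particular binding algorithm.

```python
from collections import defaultdict

def bind(start, delay, op_type):
    """Greedy binding: returns {op: (type, instance_index)}."""
    free_at = defaultdict(list)     # type -> release time of each instance
    binding = {}
    for op in sorted(start, key=lambda o: start[o]):
        instances = free_at[op_type[op]]
        for idx, release in enumerate(instances):
            if release <= start[op]:        # instance is idle: share it
                break
        else:
            instances.append(0)             # all busy: allocate a new instance
            idx = len(instances) - 1
        instances[idx] = start[op] + delay[op]
        binding[op] = (op_type[op], idx)
    return binding

# Hypothetical schedule of 6 multiplications and 5 ALU operations, unit delays.
START = {"v1": 1, "v2": 1, "v10": 1, "v3": 2, "v6": 2, "v11": 2,
         "v7": 3, "v8": 3, "v4": 3, "v5": 4, "v9": 4}
DELAY = {v: 1 for v in START}
TYPE = {v: "mul" if v in {"v1", "v2", "v3", "v6", "v7", "v8"} else "alu"
        for v in START}

binding = bind(START, DELAY, TYPE)
used = {k: 1 + max(i for t, i in binding.values() if t == k) for k in ("mul", "alu")}
print(binding, used)
```

Two operations end up on the same instance only if the first has finished before the second starts, which is the "must not execute concurrently" condition above.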
Example: Bound Sequencing Graph
Performance and Area Estimation
Resource-dominated circuits
Area = sum of the areas of the resources bound to the operations.
Determined by binding.
Latency = start time of the sink operation minus start time of the source operation.
Determined by scheduling.
Non resource-dominated circuits
Area also affected by
registers, steering logic, wiring and control.
Cycle-time also affected by
steering logic, wiring and (possibly) control.
Scheduling Algorithms
Outline
The scheduling problem
Scheduling without constraints
Scheduling under timing constraints
Relative scheduling
Scheduling under resource constraints
The ILP model
Heuristic methods
List scheduling
Force-directed scheduling
Scheduling
Circuit model
Sequencing graph.
Cycle-time is given.
Operation delays expressed in cycles.
Scheduling
Determine the start times for the operations.
Satisfying all the sequencing (timing and resource) constraints.
Goal
Determine area/latency trade-off.
Scheduling affects
Area: the maximum number of concurrent operations of the same type is a lower bound on the required hardware resources.
Performance: concurrency of the resulting implementation.
Performance: concurrency of resulting implementation.
Scheduling Example
[Figures: sequencing graph; a scheduled DFG]
Scheduling Models
Unconstrained scheduling.
Scheduling with timing constraints
Latency
Detailed timing constraints
Scheduling with resource constraints
Simplest scheduling model
All operations have bounded delays.
All delays are in cycles.
Cycle-time is given
No constraints - no bounds on area.
Goal
Minimize latency
Minimum-Latency Unconstrained Scheduling
Given a set of operations V with integer delays D and a partial order on the operations E.
Find an integer labeling of the operations $\varphi : V \to Z^{+}$ such that
$t_i = \varphi(v_i)$,
$t_i \ge t_j + d_j \quad \forall\, i, j \ \text{s.t.}\ (v_j, v_i) \in E$,
and $t_n$ is minimum.
Unconstrained scheduling is used when
Dedicated resources are used.
Operations differ in type.
Operations cost is marginal when compared to that of steering logic, registers, wiring, and control logic.
Binding is done before scheduling: resource conflicts are solved by serializing operations sharing the same resource.
Deriving bounds on latency for constrained problems.
ASAP Scheduling Algorithm
Denote by $t^S$ the start times computed by the as soon as possible (ASAP) algorithm.
Yields minimum values of start times.
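The algorithm listing on this slide was an image. A minimal sketch of ASAP scheduling, assuming steps are numbered from 1 and the source and sink vertices are left implicit, is shown below. The function name and the 11-operation graph are illustrative; the edge `v3 -> v4` in particular is an assumption, while the other edges match the sequencing constraints of the ILP example later in the deck.

```python
from collections import defaultdict

def asap(delay, edges):
    """Earliest start step of every operation; steps are numbered from 1."""
    preds = defaultdict(list)
    for u, v in edges:
        preds[v].append(u)
    start = {}

    def visit(v):
        if v not in start:
            # An operation starts as soon as all its predecessors have finished.
            start[v] = max((visit(u) + delay[u] for u in preds[v]), default=1)
        return start[v]

    for v in delay:
        visit(v)
    return start

# Illustrative graph with unit delays (v3 -> v4 is an assumed edge).
EDGES = [("v1", "v3"), ("v2", "v3"), ("v3", "v4"), ("v4", "v5"),
         ("v6", "v7"), ("v7", "v5"), ("v8", "v9"), ("v10", "v11")]
DELAY = {f"v{i}": 1 for i in range(1, 12)}

t_s = asap(DELAY, EDGES)
latency = max(t_s[v] + DELAY[v] for v in DELAY) - 1   # t_n - t_0 with t_0 = 1
print(t_s, latency)
```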
ALAP Scheduling Algorithm
Denote by $t^L$ the start times computed by the as late as possible (ALAP) algorithm.
Yields maximum values of start times.
Requires a latency upper bound $\lambda$ (i.e. a bound on $t_n - t_0$).
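A matching sketch of ALAP scheduling, under the same assumptions as before: steps numbered from 1, implicit source and sink, and an illustrative 11-operation graph in which the edge `v3 -> v4` is assumed. The parameter name `latency_bound` stands for the latency upper bound.

```python
from collections import defaultdict

def alap(delay, edges, latency_bound):
    """Latest start step of every operation for a given latency bound."""
    succs = defaultdict(list)
    for u, v in edges:
        succs[u].append(v)
    start = {}

    def visit(v):
        if v not in start:
            # An operation must finish before its earliest-starting successor;
            # operations without successors must finish by step bound + 1.
            deadline = min((visit(w) for w in succs[v]), default=latency_bound + 1)
            start[v] = deadline - delay[v]
        return start[v]

    for v in delay:
        visit(v)
    return start

# Illustrative graph with unit delays (v3 -> v4 is an assumed edge).
EDGES = [("v1", "v3"), ("v2", "v3"), ("v3", "v4"), ("v4", "v5"),
         ("v6", "v7"), ("v7", "v5"), ("v8", "v9"), ("v10", "v11")]
DELAY = {f"v{i}": 1 for i in range(1, 12)}

t_l = alap(DELAY, EDGES, latency_bound=4)
print(t_l)
```

If any resulting start time is smaller than 1, the latency bound is too tight and no schedule exists.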
Latency-Constrained Scheduling
ALAP solves a latency-constrained problem.
Latency bound can be set to the latency computed by the ASAP algorithm.
Mobility
Defined for each operation.
Difference between ALAP and ASAP schedule.
Zero mobility implies that an operation can be started only at one given time step.
Mobility greater than 0 measures the span of the time interval in which an operation may start.
Slack on the start time.
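Mobility is a per-operation subtraction once both schedules are known. The sketch below uses ASAP and ALAP start times for an 11-operation example with unit delays and latency bound 4; the numeric start times are assumed values chosen to be consistent with the mobility groups listed on the next slide.

```python
# Assumed ASAP (t_s) and ALAP (t_l) start times, unit delays, latency bound 4.
t_s = {"v1": 1, "v2": 1, "v3": 2, "v4": 3, "v5": 4, "v6": 1,
       "v7": 2, "v8": 1, "v9": 2, "v10": 1, "v11": 2}
t_l = {"v1": 1, "v2": 1, "v3": 2, "v4": 3, "v5": 4, "v6": 2,
       "v7": 3, "v8": 3, "v9": 4, "v10": 3, "v11": 4}

# Mobility = ALAP start time minus ASAP start time.
mobility = {v: t_l[v] - t_s[v] for v in t_s}

# Operations with zero mobility lie on the critical path.
critical = [v for v, m in mobility.items() if m == 0]
print(mobility)
print(critical)
```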
Example
Operations with zero mobility
{v1, v2, v3, v4, v5}: critical path
Operations with mobility one
{v6, v7}
Operations with mobility two
{v8, v9, v10, v11}
Scheduling under Resource Constraints
Classical scheduling problem.
Fix area bound - minimize latency.
The amount of available resources affects the achievable latency.
Dual problem
Fix latency bound - minimize resources.
Assumption
All delays bounded and known.
Minimum Latency Resource-Constrained Scheduling
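Given a set of ops V with integer delays D, a partial order on the operations E, and upper bounds $\{a_k;\ k = 1, 2, \ldots, n_{res}\}$.
Find an integer labeling of the operations $\varphi : V \to Z^{+}$ such that
$t_i = \varphi(v_i)$,
$t_i \ge t_j + d_j \quad \forall\, i, j \ \text{s.t.}\ (v_j, v_i) \in E$,
and $t_n$ is minimum.
Number of operations of any given type in any schedule step does not exceed the bound.
The type of each operation is given by $\mathcal{T} : V \to \{1, 2, \ldots, n_{res}\}$.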
Scheduling under Resource Constraints
Intractable problem
Algorithms
Approximate
List scheduling
Force-directed scheduling
Exact
Integer linear program
Hu (restrictive assumptions)
List Scheduling Algorithms
Heuristic method for
Minimum latency subject to resource bound.
Minimum resource subject to latency bound.
Greedy strategy.
Priority list heuristics.
Assign a weight to each vertex indicating its scheduling priority
Longest path to sink.
Longest path to timing constraint.
[Figure: labeled sequencing graph]
Priority of v_i = label of v_i = number of edges in the longest path from v_i to v_n
List Scheduling Algorithm for Minimum Latency
Based on a priority metric (e.g. mobility, labeling, etc.)
List Scheduling for Minimum Latency (cont.)
Candidate operations $U_{l,k}$
Operations of type k whose predecessors are scheduled and completed at a time step before l
Unfinished operations $T_{l,k}$ are operations of type k that started at earlier cycles and whose execution is not finished at time l
Note that when execution delays are 1, $T_{l,k}$ is empty.
$U_{l,k} = \{ v_i \in V : \mathrm{type}(v_i) = k \ \text{and}\ t_j + d_j \le l \ \ \forall j : (v_j, v_i) \in E \}$
$T_{l,k} = \{ v_i \in V : \mathrm{type}(v_i) = k \ \text{and}\ t_i + d_i > l \}$
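The two sets translate directly into a list scheduler for minimum latency. The sketch below is one possible reading, not the slide's own listing (which was an image): priorities are longest-path labels, steps are numbered from 1, the sink is implicit, and every resource bound is assumed to be at least 1. Function names and the example graph (including the edge `v3 -> v4`) are assumptions.

```python
def labels(ops, edges):
    """Priority = number of edges on the longest path to the implicit sink."""
    succs = {v: [w for u, w in edges if u == v] for v in ops}
    memo = {}
    def visit(v):
        if v not in memo:
            memo[v] = 1 + max((visit(w) for w in succs[v]), default=0)
        return memo[v]
    return {v: visit(v) for v in ops}

def list_schedule(delay, op_type, edges, bound):
    """Resource-constrained list scheduling; bound maps type -> a_k (>= 1)."""
    preds = {v: [u for u, w in edges if w == v] for v in delay}
    priority = labels(delay, edges)
    start, step = {}, 1
    while len(start) < len(delay):
        for k, a_k in bound.items():
            # T_{l,k}: operations of type k still executing in this step
            unfinished = [v for v in start
                          if op_type[v] == k and start[v] + delay[v] > step]
            # U_{l,k}: unscheduled operations of type k, all predecessors done
            candidates = [v for v in delay
                          if v not in start and op_type[v] == k
                          and all(u in start and start[u] + delay[u] <= step
                                  for u in preds[v])]
            candidates.sort(key=lambda v: -priority[v])
            for v in candidates[:max(0, a_k - len(unfinished))]:
                start[v] = step
        step += 1
    return start

EDGES = [("v1", "v3"), ("v2", "v3"), ("v3", "v4"), ("v4", "v5"),
         ("v6", "v7"), ("v7", "v5"), ("v8", "v9"), ("v10", "v11")]
DELAY = {f"v{i}": 1 for i in range(1, 12)}
TYPE = {v: "mul" if v in {"v1", "v2", "v3", "v6", "v7", "v8"} else "alu"
        for v in DELAY}

schedule = list_schedule(DELAY, TYPE, EDGES, {"mul": 2, "alu": 2})
print(schedule)
```

With 2 multipliers and 2 ALUs of delay 1, the selections made by this sketch coincide with the four steps listed in Example I.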
Example I
Assumptions
a1 = 2 multipliers with delay 1.
a2 = 2 ALUs with delay 1.
First step
U1,1 = {v1, v2, v6, v8}; select {v1, v2}
U1,2 = {v10}; selected
Second step
U2,1 = {v3, v6, v8}; select {v3, v6}
U2,2 = {v11}; selected
Third step
U3,1 = {v7, v8}; select {v7, v8}
U3,2 = {v4}; selected
Fourth step
U4,2 = {v5, v9}; selected
[Figure: sequencing graph annotated with labels (priorities); label rows as extracted: 4, 4, 3, 2 / 3, 3, 2 / 2, 2]
Example II
Assumptions
a1 = 3 multipliers with delay 2.
a2= 1 ALU with delay 1.
List Scheduling for Minimum Resource Usage
ALAP does not exist
(Latency is too tight)
Lower slack means
higher urgency/priority
Example (i)
Assume λ = 4
Let a = [1, 1]^T
First step
U1,1 = {v1, v2, v6, v8}
Operations with zero slack: {v1, v2}; a = [2, 1]^T
U1,2 = {v10}
Second step
U2,1 = {v3, v6, v8}
Operations with zero slack: {v3, v6}
U2,2 = {v11}
Third step
U3,1 = {v7, v8}
Operations with zero slack: {v7, v8}
U3,2 = {v4}
Fourth step
U4,2 = {v5, v9}
Both have zero slack; a = [2, 2]^T
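The steps of this example can be reproduced with a sketch of list scheduling for minimum resource usage. It assumes the usual structure: compute ALAP times under the latency bound, start with one resource per type, schedule zero-slack candidates unconditionally (growing the resource count if needed), then fill any idle resources with the remaining candidates. Names and the graph (including the edge `v3 -> v4`) are assumptions.

```python
def list_schedule_min_resources(delay, op_type, edges, latency_bound):
    preds = {v: [u for u, w in edges if w == v] for v in delay}
    succs = {v: [w for u, w in edges if u == v] for v in delay}
    alap = {}
    def t_late(v):                       # ALAP start time under the bound
        if v not in alap:
            alap[v] = min((t_late(w) for w in succs[v]),
                          default=latency_bound + 1) - delay[v]
        return alap[v]
    if min(t_late(v) for v in delay) < 1:
        return None                      # latency too tight: no ALAP schedule
    usage = dict.fromkeys(op_type.values(), 1)   # a = [1, ..., 1]
    start, step = {}, 1
    while len(start) < len(delay):
        for k in usage:
            busy = sum(1 for v in start
                       if op_type[v] == k and start[v] + delay[v] > step)
            cands = [v for v in delay
                     if v not in start and op_type[v] == k
                     and all(u in start and start[u] + delay[u] <= step
                             for u in preds[v])]
            cands.sort(key=lambda v: alap[v])    # lower slack = higher priority
            for v in cands:
                if alap[v] - step <= 0:          # zero slack: must start now
                    start[v] = step
                    busy += 1
            usage[k] = max(usage[k], busy)
            for v in cands:                      # fill resources already paid for
                if v not in start and busy < usage[k]:
                    start[v] = step
                    busy += 1
        step += 1
    return start, usage

EDGES = [("v1", "v3"), ("v2", "v3"), ("v3", "v4"), ("v4", "v5"),
         ("v6", "v7"), ("v7", "v5"), ("v8", "v9"), ("v10", "v11")]
DELAY = {f"v{i}": 1 for i in range(1, 12)}
TYPE = {v: "mul" if v in {"v1", "v2", "v3", "v6", "v7", "v8"} else "alu"
        for v in DELAY}

schedule, usage = list_schedule_min_resources(DELAY, TYPE, EDGES, latency_bound=4)
print(schedule, usage)
```

For latency bound 4 the sketch ends with two multipliers and two ALUs, the same final vector a = [2, 2]^T as in the example.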
Example (ii)
Assume λ = 8
Let a = [1, 1]^T
[Figures: ALAP schedule; final schedule]
Scheduling Based on Integer Linear Programming (ILP)
ILP Solution
Use standard ILP packages.
Transform into LP problem [Gebotys].
Advantages
Exact method.
Other constraints can be incorporated easily
Maximum and minimum timing constraints
Disadvantages
Works well only up to a few thousand variables.
ILP Formulation
Binary decision variables
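$X = \{ x_{il};\ i = 1, 2, \ldots, n;\ l = 1, 2, \ldots, \lambda + 1 \}$.
$x_{il}$ is TRUE (1) only when operation $v_i$ starts in step $l$ of the schedule (i.e. $l = t_i$).
$\lambda$ is an upper bound on latency.
Start time of operation $v_i$
$t_i = \sum_{l} l \cdot x_{il}$
Operations start only once
$\sum_{l} x_{il} = 1, \quad i = 1, 2, \ldots, n$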
ILP Formulation (cont.)
Sequencing relations must be satisfied
Resource bounds must be satisfied
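In terms of the binary variables $x_{il}$, these two conditions can be written as below. This is a sketch consistent with the inequalities of the worked example that follows; the symbol $\mathcal{T}(v_i)$ for the resource type of $v_i$ and the index $m$ are assumed notation.

```latex
% Sequencing: v_i may start only after its predecessor v_j has finished
\sum_{l} l \, x_{il} \;-\; \sum_{l} l \, x_{jl} \;-\; d_j \;\ge\; 0
\qquad \forall\, (v_j, v_i) \in E

% Resources: at most a_k operations of type k execute in any step l
\sum_{i \,:\, \mathcal{T}(v_i) = k} \;\; \sum_{m = l - d_i + 1}^{l} x_{im} \;\le\; a_k
\qquad k = 1, \ldots, n_{res}; \quad l = 1, \ldots, \lambda + 1
```

For unit delays the inner sum reduces to the single term $x_{il}$, so each resource constraint simply counts the operations of one type that start in step $l$.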
ILP Formulation (cont.)
Minimize $c^T t$ such that
ILP Formulation (cont.)
About the objective function ($c^T t$)
$c = [0, 0, \ldots, 0, 1]^T$ corresponds to minimizing the latency of the schedule.
$c = [1, 1, \ldots, 1, 1]^T$ corresponds to finding the earliest start times of all operations under the given constraints.
Example
Resource constraints
2 ALUs; 2 multipliers.
$a_1 = 2$; $a_2 = 2$.
Single-cycle operation.
$d_i = 1 \ \forall i$.
$\lambda = 4$ is given as upper bound
Objective function:
No need to consider $x_{1,1} + x_{2,1} + 2x_{3,2} + 3x_{4,3} + 4x_{5,4}$ because operations 1, 2, 3, 4 and 5 have zero mobility.
Example (cont.)
Resource constraints
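2 ALUs; 2 multipliers.
$a_1 = 2$; $a_2 = 2$.
Single-cycle operation.
$d_i = 1 \ \forall i$.
Operations start only once
$x_{0,1} = 1$; $x_{1,1} = 1$; $x_{2,1} = 1$; $x_{3,2} = 1$
$x_{4,3} = 1$; $x_{5,4} = 1$
$x_{6,1} + x_{6,2} = 1$
$x_{7,2} + x_{7,3} = 1$
$x_{8,1} + x_{8,2} + x_{8,3} = 1$
$x_{9,2} + x_{9,3} + x_{9,4} = 1$
$x_{10,1} + x_{10,2} + x_{10,3} = 1$
$x_{11,2} + x_{11,3} + x_{11,4} = 1$
$x_{n,5} = 1$
Consider $t_i^S$ and $t_i^L$ of the operations (i.e. the mobility)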
Example (cont.)
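Sequencing relations must be satisfied (one inequality per operation dependency)
$2x_{3,2} - x_{1,1} \ge 1$
$2x_{3,2} - x_{2,1} \ge 1$
$2x_{7,2} + 3x_{7,3} - x_{6,1} - 2x_{6,2} \ge 1$
$2x_{9,2} + 3x_{9,3} + 4x_{9,4} - x_{8,1} - 2x_{8,2} - 3x_{8,3} \ge 1$
$2x_{11,2} + 3x_{11,3} + 4x_{11,4} - x_{10,1} - 2x_{10,2} - 3x_{10,3} \ge 1$
$4x_{5,4} - 2x_{7,2} - 3x_{7,3} \ge 1$
$4x_{5,4} - 3x_{4,3} \ge 1$
$5x_{n,5} - 2x_{9,2} - 3x_{9,3} - 4x_{9,4} \ge 1$
$5x_{n,5} - 2x_{11,2} - 3x_{11,3} - 4x_{11,4} \ge 1$
$5x_{n,5} - 4x_{5,4} \ge 1$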
Example (cont.)
Resource bounds must be satisfied:
Any set of start times satisfying the constraints provides a feasible solution.
Any feasible solution is optimum since the mobility of the sink ($x_{n,5} = 1$) is 0.
Relative Timing Constraints in ILP
Relative timing constraints can be considered by adding a new set of inequalities.
Example:
Dual ILP Formulation
Minimize resource usage under latency constraint.
Same constraints as previous formulation.
Additional constraint
Latency bound must be satisfied.
Resource usage is unknown in the constraints.
Resource usage is the objective to minimize.
Minimize $c^T a$
Vector $a$ represents resource usage
Vector $c^T$ represents resource costs
Example
Multiplier area = 5; ALU area = 1.
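Objective function: $5a_1 + a_2$
Latency constraint: $\lambda = 4$
Start time constraints same.
Sequencing dependency constraints same.
Resource constraints
$x_{1,1} + x_{2,1} + x_{6,1} + x_{8,1} - a_1 \le 0$
$x_{3,2} + x_{6,2} + x_{7,2} + x_{8,2} - a_1 \le 0$
$x_{7,3} + x_{8,3} - a_1 \le 0$
$x_{10,1} - a_2 \le 0$
$x_{9,2} + x_{10,2} + x_{11,2} - a_2 \le 0$
$x_{4,3} + x_{9,3} + x_{10,3} + x_{11,3} - a_2 \le 0$
$x_{5,4} + x_{9,4} + x_{11,4} - a_2 \le 0$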
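Because every operation in this example has at most three possible start steps, the dual problem is small enough to check by exhaustive enumeration. The sketch below is a brute-force search over the start-time windows, not an ILP solver; the windows follow the "operations start only once" constraints of the earlier example slide, and the edge list (in particular `v3 -> v4`) is an assumption.

```python
from itertools import product

# Allowed start steps per operation (ASAP..ALAP window for latency bound 4).
WINDOW = {"v1": [1], "v2": [1], "v3": [2], "v4": [3], "v5": [4],
          "v6": [1, 2], "v7": [2, 3], "v8": [1, 2, 3], "v9": [2, 3, 4],
          "v10": [1, 2, 3], "v11": [2, 3, 4]}
EDGES = [("v1", "v3"), ("v2", "v3"), ("v3", "v4"), ("v4", "v5"),
         ("v6", "v7"), ("v7", "v5"), ("v8", "v9"), ("v10", "v11")]
MUL = {"v1", "v2", "v3", "v6", "v7", "v8"}      # the rest are ALU operations
AREA_MUL, AREA_ALU = 5, 1                       # resource costs c

ops = list(WINDOW)
best = None
for times in product(*(WINDOW[v] for v in ops)):
    t = dict(zip(ops, times))
    # Sequencing constraints with unit delays: t_i >= t_j + 1
    if any(t[v] < t[u] + 1 for u, v in EDGES):
        continue
    # Resource usage = peak number of same-type operations in one step
    a1 = max(sum(1 for v in ops if v in MUL and t[v] == l) for l in range(1, 5))
    a2 = max(sum(1 for v in ops if v not in MUL and t[v] == l) for l in range(1, 5))
    cost = AREA_MUL * a1 + AREA_ALU * a2
    if best is None or cost < best[0]:
        best = (cost, a1, a2, t)

print(best[:3])
```

Under these assumptions the cheapest feasible schedule uses two multipliers and two ALUs, for a total cost of 5 * 2 + 2 = 12.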