33
1 ECE 667 - Synthesis & Verification - L ecture 2 ECE 697B (667) ECE 697B (667) Spring 2006 Spring 2006 Synthesis and Verification of Digital Circuits Scheduling Scheduling Heuristic Algorithms Heuristic Algorithms Adopted, with permission, from M Srivastava and M. Podkonjak, UCLA, 2003.

ECE 667 - Synthesis & Verification - Lecture 2 1 ECE 697B (667) Spring 2006 ECE 697B (667) Spring 2006 Synthesis and Verification of Digital Circuits Scheduling

  • View
    217

  • Download
    0

Embed Size (px)

Citation preview

1

ECE 667 - Synthesis & Verification - Lecture 2

ECE 697B (667)ECE 697B (667)Spring 2006Spring 2006

Synthesis and Verificationof Digital Circuits

Scheduling Scheduling

Heuristic AlgorithmsHeuristic Algorithms

Adopted, with permission, from M Srivastava and M. Podkonjak, UCLA, 2003.

ECE 667 - Synthesis & Verification - Lecture 2 2

Scheduling, Allocation, and AssignmentScheduling, Allocation, and Assignment

D

+

-

>>

>>

+

-

>>

+ >>

+

>>

+

Allocation: How Much?2 adders

Assignment: Where?

Schedule: When?

Shifter 1

Time Slot 4

1 shifter24 registers

D

Techniques well understood and mature

Copyright 2003 Mani Srivastava

ECE 667 - Synthesis & Verification - Lecture 2 3

Scheduling and Assignment - overviewScheduling and Assignment - overview

4 control steps

+ * *

1 +1

2 +2

3 +3 *1

4 *2 *3

Control Step + *

1 +3

2 +1 *2

3 +2 *3

4 *1

Control Step

Copyright 2003 Mani Srivastava

Schedule 1s1

s2

s3

+1

+2

*1

s3

s4 s4

+3

*2 *3

s2

s3

s4

Schedule 2+1

+2

*1

s1

s2 s3*2

+3

*3

ECE 667 - Synthesis & Verification - Lecture 2 4

Algorithm Description Algorithm Description Data Flow Graph Data Flow Graph

Copyright 2003 Mani Srivastava

ECE 667 - Synthesis & Verification - Lecture 2 5

Control Data Flow Graph (CDFG)Control Data Flow Graph (CDFG)

Copyright 2003 Mani Srivastava

ECE 667 - Synthesis & Verification - Lecture 2 6

Sequencing GraphSequencing Graph

Copyright 2003 Mani Srivastava

ECE 667 - Synthesis & Verification - Lecture 2 7

Sequencing GraphSequencing Graph

*

* * + <

* * * +

• Add Add sourcesource and and sink sink nodes (NOP) to the DFGnodes (NOP) to the DFG

*

NOP

* * + <

* * * +

NOPData Flow Graph (DFG)

Sequencing Graph

ECE 667 - Synthesis & Verification - Lecture 2 8

• As Soon as Possible scheduling– Unconstrained minimum latency scheduling– Uses topological sorting of the sequencing graph (polynomial time)– Gives optimum solution to scheduling problem– Schedule first the first node no T1 until last node nv is scheduled

ASAP Scheduling AlgorithmASAP Scheduling Algorithm

ECE 667 - Synthesis & Verification - Lecture 2 9

ASAP Scheduling Algorithm - exampleASAP Scheduling Algorithm - example

Copyright 2003 Mani Srivastava

start

finishTS = 4

TS = 3

TS = 2

TS = 1

ECE 667 - Synthesis & Verification - Lecture 2 10

ASAP Scheduling ExampleASAP Scheduling Example

Sequence Graph ASAP ScheduleCopyright 2003 Mani Srivastava

ECE 667 - Synthesis & Verification - Lecture 2 11

ALAP Scheduling AlgorithmALAP Scheduling Algorithm

Copyright 2003 Mani Srivastava

• As late as Possible scheduling– Latency-constrained scheduling (latency is fixed)

– Uses reversed topological sorting of the sequencing graph

– If over-constrained (latency too small), solution may not exist

– Schedule first the last node nv T, until first node n0 is scheduled

(Latency)

ECE 667 - Synthesis & Verification - Lecture 2 12

ALAP Scheduling Algorithm - exampleALAP Scheduling Algorithm - example

Copyright 2003 Mani Srivastava

TS = 4 = T

TS = 3

TS = 2

TS = 1

start

finish

ECE 667 - Synthesis & Verification - Lecture 2 13

ALAP Scheduling ExampleALAP Scheduling Example

Sequence GraphALAP Schedule

(latency constraint = 4)Copyright 2003 Mani Srivastava

ECE 667 - Synthesis & Verification - Lecture 2 14

• Slack of Operator Slack of Operator i: Si: Si i = TS= TSiiALAPALAP – TS – TSii

ASAPASAP

– Defines mobility of the operatorsDefines mobility of the operators

*

NOP

* *

<

* *

* +

NOP

1 2

3

4

5

6

7

8

9

10

11

0

n

+

0

*

NOP

*

*

+ <

*

*

* +

NOP

1 2

3

4

5

6

7

8

9

10

11

n

S6 = 2-1 = 1

S7 = 2-1 = 1

S8 = 3-1 = 2

S9 = 4-2 = 2

S10 = 3-1 = 2

S11 = 4-2 = 2

Computing Slack (mobility)Computing Slack (mobility)

ECE 667 - Synthesis & Verification - Lecture 2 15

Timing ConstraintsTiming Constraints

• Time measured in cycles or control steps• Imposing relative timing constraints between operators i and j

– max & min timing constraints

ECE 667 - Synthesis & Verification - Lecture 2 16

Constraint Graph GConstraint Graph Gcc(V,E)(V,E)

MULT delay =2

ADD delay = 1

ECE 667 - Synthesis & Verification - Lecture 2 17

Existence of schedule under timing constraintsExistence of schedule under timing constraints

• Example:– Assume delays: ADD=1, MULT=2– Path {1 2} has weight 2 u12=3, that is

cycle {1,2,1} has weight = -1, OK– No positive cycles in the graph, so it has a

consistent schedule

• Upper bound (max timing constraint) is a problem• Examine each max timing constraint (i, j):

– Longest weighted path between nodes i and j must be max timing constraint uij.

– Any cycle in Gc including edge (i, j) must be negative or zero

• Necessary and sufficient condition:– The constraint graph Gc must not have positive cycles

ECE 667 - Synthesis & Verification - Lecture 2 18

Existence of schedule under timing constraintsExistence of schedule under timing constraints

• Example: satisfying assignment– Assume delays: ADD=1, MULT=2– Feasible assignment:

Vertex Start time

• v0 step 1

• v1 step 1

• v2 step 3

• v3 step 1

• v4 step 5

• vn step 6

ECE 667 - Synthesis & Verification - Lecture 2 19

Observation about ALAP & ASAPObservation about ALAP & ASAP

• No priority is given to nodes on critical pathNo priority is given to nodes on critical path• As a result, less critical nodes may be scheduled ahead As a result, less critical nodes may be scheduled ahead

of critical nodesof critical nodes• No problem if unlimited hardware is availableNo problem if unlimited hardware is available• However if the resources are limited, the less critical However if the resources are limited, the less critical

nodes may block the critical nodes and thus produce nodes may block the critical nodes and thus produce inferior schedulesinferior schedules

• List schedulingList scheduling techniques overcome this problem by techniques overcome this problem by utilizing a more global node selection criterionutilizing a more global node selection criterion

Copyright 2003 Mani Srivastava

ECE 667 - Synthesis & Verification - Lecture 2 20

Hu’s AlgorithmHu’s Algorithm

• Simple case of the scheduling problem– Operations of unit delay– Operations (and resources) of the same type

• Hu’s algorithm– Greedy– Polynomial AND optimal– Computes lower bound on number of resources for a given

latency (MR-LCS), OR – computes lower bound on latency subject to resource

constraints (ML-RCS)

• Basic idea:– Label operations based on their distances from the sink– Try to schedule nodes with higher labels first

(i.e., most “critical” operations have priority)[©Gupta]

ECE 667 - Synthesis & Verification - Lecture 2 21

Hu’s AlgorithmHu’s AlgorithmHU (G(V,E), a) {

Label the vertices // label = length of longest pathpassing through the vertex

l = 1

repeat {

U = unscheduled vertices in V whose predecessors have been scheduled

(or have no predecessors)

Select S U such that |S| a and labels in S are maximal

Schedule the S operations at step l by setting ti=l, i: vi S.

l = l + 1

} until vn is scheduled.

}[©Bazargan]

ECE 667 - Synthesis & Verification - Lecture 2 22

Hu’s Algorithm: Example (a=3)Hu’s Algorithm: Example (a=3)

[©Gupta]

ECE 667 - Synthesis & Verification - Lecture 2 23

List SchedulingList Scheduling

• Greedy algorithm for ML-RCS and MR-LCS– Does NOT guarantee optimum solution

• Similar to Hu’s algorithm– Operation selection decided by criticality– O(n) time complexity

• More general input– Resource constraints on different resource types

[©Bazargan]

ECE 667 - Synthesis & Verification - Lecture 2 24

List Scheduling Algorithm: ML-RCSList Scheduling Algorithm: ML-RCS

LIST_L (G(V,E), a) {l = 1repeat {

for each resource type k {

Ul,k = available vertices in V.

Tl,k = operations in progress.

Select Sk Ul,k such that |Sk| + |Tl,k| ak

Schedule the Sk operations at step l

}l = l + 1

} until vn is scheduled.

}[©Bazargan]

ECE 667 - Synthesis & Verification - Lecture 2 25

List Scheduling: Example (List Scheduling: Example (aa=[3,1])=[3,1])

[©Gupta]

ECE 667 - Synthesis & Verification - Lecture 2 26

List Scheduling Algorithm: MR-LCSList Scheduling Algorithm: MR-LCSLIST_R (G(V,E), ’) {

a = 1, l = 1

Compute the ALAP times tL.

if t0L < 0

return (not feasible)repeat {

for each resource type k {

Ul,k = available vertices in V.

Compute the slacks { si = tiL - l, vi Ul,k }.

Schedule operations with zero slack, update a

Schedule additional Sk Ul,k under a constraints}l = l + 1

} until vn is scheduled.}

[©Bazargan]

ECE 667 - Synthesis & Verification - Lecture 2 27

List Scheduling AlgorithmList Scheduling Algorithm

• Algorithm 1: Minimize latency under resource constraintAlgorithm 1: Minimize latency under resource constraint– Resource constraint represented by vector Resource constraint represented by vector aa (indexed by resource type) (indexed by resource type)

• Example: two resources, MULT, ADD; Example: two resources, MULT, ADD; aa11=1, =1, aa22=2=2

– UUl,kl,k = set of candidate operations for resource type = set of candidate operations for resource type k k at time step at time step ll

• The The candidatecandidate operations operations UUl,kl,k

– those operations of type those operations of type kk whose predecessors have already been scheduled early whose predecessors have already been scheduled early enough so that they are completed at step enough so that they are completed at step l:l:

UUl,kl,k = { = { vvii V: type(vV: type(vii) = k ) = k and and ttj j + d+ djj ll, for all , for all jj: (: (vvii, v, vjj) ) EE

• The The unfinishedunfinished operations operations TTl,kl,k

– those operations of type those operations of type kk that started at earlier cycles but whose execution is not that started at earlier cycles but whose execution is not finished at step finished at step l:l:

TTl,kl,k = { = { vvii V: type(vV: type(vii) = k ) = k and and tti i + d+ di i > > l l

• Priority listPriority list– List operators according to some heuristic urgency measureList operators according to some heuristic urgency measure– Common priority list: labeled by position on the longest path in decreasing orderCommon priority list: labeled by position on the longest path in decreasing order

(the most urgent operations are scheduled first)(the most urgent operations are scheduled first)

ECE 667 - Synthesis & Verification - Lecture 2 28

List Scheduling Algorithm (1)List Scheduling Algorithm (1)

LIST_L(Gc(V,E), a) { % Minimize latency under resource constraint

l = 1;

repeat {

for each resource type k = 1,2, …, Nr

{ determine candidate operations Ul,k ;

determine unfinished operations Tl,k ;

select Sk Ul,k vertices such that: |Sk|+|Tk| ak;

schedule the Sk operations at step l by setting ti = l for all vi Sk ;

}

l = l +1 ;

{

until (vn is scheduled)

return (t)

}

ECE 667 - Synthesis & Verification - Lecture 2 29

List Scheduling – example 1aList Scheduling – example 1a

• Assumptions– All operations have unit delay– Resource constraints:

MULT: a1 = 2, ALU: a2 = 2

• Step 1: – U1,1 = {v1,v2, v6, v8}, select {v1,v2} – U1,2 = {v10}, select + schedule

• Step 2:– U2,1 = {v3,v6, v8}, select {v3,v6} – U2,2 = {v11}, select + schedule

• Step 3:– U3,1 = {v7,v8}, select + schedule– U3,2 = {v4}, select + schedule

• Step 4:– U4,2 = {v5,v9}, select + schedule

*

NOP

*

*

<

*

*

+

NOP

1 2

3

4

5

6

7

*8

+9

10

11

0

n

T1

T2

T3

T4

ECE 667 - Synthesis & Verification - Lecture 2 30

List Scheduling – Example 1bList Scheduling – Example 1b

• Assumptions– Operations have different delay:

delMULT = 2, delALU = 1

– Resource constraints:

MULT: a1 = 3, ALU: a2 = 1

• MUTL ALU start time

{v1, v2, v6} v10 1

-- v11 2

{v3, v7, v8} -- 3

-- -- 4

-- v4 5

-- v5 6

-- v9 7

*

NOP

<

+

NOP

1 2

3

4

5

6

7 8

+9

10

11

n

* *

* * *

T1

T2

T3

T4

T5

T6

T7

ECE 667 - Synthesis & Verification - Lecture 2 31

List Scheduling Algorithm (2)List Scheduling Algorithm (2)

LIST_R(Gc(V,E), L { % Minimize resources under latency constrainta = 1;Compute the latest possible starting times t L using ALAP scheduling

if (t0L < 0)

return ;l = 1;repeat {

for each resource type k = 1,2, …, Nr

{ determine candidate operations Ul,k ;

compute the slacks { si = tiL – l for all vi Ul,k };

schedule the candidate operations with zero slack and update vector a ;schedule the candidate operations requiring no additional resources ;

} l = l +1 ; {

until (vn is scheduled)return (t) }

ECE 667 - Synthesis & Verification - Lecture 2 32

List Scheduling – example 2List Scheduling – example 2

• Assumptions– All operations have unit delay

– Latency constraint: L = 4

– No resource constraints

• Use slack information to guide the scheduling– Schedule operations with slack = 0

first

*

NOP

*

*

<

*

*

+

NOP

1 2

3

4

5

6

7

*8

+9

10

11

0

n

T1

T2

T3

T4

ECE 667 - Synthesis & Verification - Lecture 2 33

Scheduling – a Combinatorial Optimization ProblemScheduling – a Combinatorial Optimization Problem

• NP-complete ProblemNP-complete Problem• Optimal solutions for special cases and ILPOptimal solutions for special cases and ILP• Heuristics - iterative Improvements Heuristics - iterative Improvements • Heuristics – constructiveHeuristics – constructive• Various versions of the problemVarious versions of the problem

• Unconstrained, minimum latencyUnconstrained, minimum latency• Resource-constrained, minimum latencyResource-constrained, minimum latency• Timing-constrained, minimum latencyTiming-constrained, minimum latency• Latency-constrained, minimum resourceLatency-constrained, minimum resource

• If all resources are identical, problem is reduced to If all resources are identical, problem is reduced to multiprocessor scheduling (Hu’s algorithm)multiprocessor scheduling (Hu’s algorithm)

• Minimum latency multiprocessor problem is intractableMinimum latency multiprocessor problem is intractable