Clock Skewing
EECS 290A: Sequential Logic Synthesis and Verification
Outline
• Motivation
• Graphs
• Algorithms for the shortest path computation
  • Dijkstra and Bellman-Ford
• Optimum cycle ratio computation
  • Howard’s algorithm
• ASAP and ALAP skews
• Clock skew as the shortest path
• Retiming as discrete clock skewing
Motivation
• When combinational optimization cannot help, sequential optimization holds some promise
• Sequential optimization changes one or more of the following:
  • the clock cycle (clock skewing)
  • the number and positions of memory elements (retiming)
  • combinational logic (retiming and resynthesis)
• Clock skewing is an “easy” way of reducing the clock period without moving latches
  • Moving latches, if done on a mapped and placed netlist, may destroy placement, etc.
Directed Graphs
• A graph is a set of vertices and edges, G = (V, E)
• Each edge is directed (has a source and a sink)
• A path is a sequence of vertices connected by edges
• A cycle is a circular path
• A graph is strongly connected if there exists a path from any vertex to any other vertex
• For the general formulation of the graph problems, each edge e has a distance, d(e), and a latency, t(e)
• In this lecture
  • The graph is the “latch dependency graph”
    • Vertices are latches
    • Edges are combinational paths between the latches
  • The distance of an edge is its combinational delay
  • The latency of an edge is 1
Graph Problems
• Optimum cycle ratio
  • Given d(e) and t(e) for each edge e, for each cycle C in G we define a cycle ratio $\rho(C) = D(C)/T(C)$, where $D(C) = \sum_{e_i \in C} d(e_i)$ and $T(C) = \sum_{e_i \in C} t(e_i)$
  • The problem is to determine the minimum (maximum) ratio $\rho^*$ over all cycles C in G
• Shortest path
  • Given d(e) for each edge e and a source vertex s, determine the shortest path from s to every other vertex in G
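As a small worked instance of the cycle ratio definition, consider a hypothetical cycle C with two edges whose delays are 3 and 5 (values chosen here purely for illustration). In the latch dependency graph every edge has latency 1, so, writing $\rho(C)$ for the cycle ratio:

```latex
\[
D(C) = 3 + 5 = 8, \qquad
T(C) = 1 + 1 = 2, \qquad
\rho(C) = \frac{D(C)}{T(C)} = \frac{8}{2} = 4 .
\]
```

If this were the cycle with the largest ratio in the graph, the maximum cycle ratio would be 4.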
Shortest Path: Preliminaries
Start-shortest-path(G, s)
  For each vertex v ∈ G
    • w(v) = ∞
    • p(v) = NULL
  w(s) = 0
• w(v) is the length of the shortest path found so far from vertex s to vertex v
• p(v) is the predecessor function, which gives, for each node v, the previous node on the shortest path from s
Relax/tighten(u, v, d())
  if ( w(v) > w(u) + d(u,v) )
    w(v) = w(u) + d(u,v)
    p(v) = u
Relaxation example: w(u) = 3, w(v) = 6, d(u,v) = 1. Since w(v) > w(u) + d(u,v), that is, 6 > 3 + 1, the edge (u,v) is relaxed and w(v) = 4.
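The initialization and relaxation steps above could be sketched in Python roughly as follows. The dictionary-based representation and the function names `start_shortest_path` and `relax` are assumptions made for this illustration; the slides only give the pseudocode.

```python
import math

def start_shortest_path(vertices, s):
    """Initialize tentative distances w and predecessors p for source s."""
    w = {v: math.inf for v in vertices}
    p = {v: None for v in vertices}
    w[s] = 0
    return w, p

def relax(u, v, d, w, p):
    """Tighten w[v] through edge (u, v); return True if w[v] changed."""
    if w[v] > w[u] + d[(u, v)]:
        w[v] = w[u] + d[(u, v)]
        p[v] = u
        return True
    return False

# The numbers from the slide: w(u) = 3, w(v) = 6, d(u, v) = 1.
w, p = start_shortest_path(["s", "u", "v"], "s")
w["u"], w["v"] = 3, 6          # pretend earlier relaxations produced these
d = {("u", "v"): 1}
changed = relax("u", "v", d, w, p)
print(changed, w["v"], p["v"])  # True 4 u
```

Returning a flag from `relax` is a convenience added here; it is what the queue-based variants need in order to know whether a vertex must be revisited.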
Shortest Path: Dijkstra Algorithm
Start-shortest-path(G, s)
S = ∅, Q_w = V(G)
while ( Q_w ≠ ∅ )
  u = Extract-Min( Q_w )
  S = S ∪ {u}
  for each vertex v, which is a successor of u
    • Relax( u, v, d() )
    • Update ordering in Q_w
• Q_w is a priority queue storing vertices by their distance w
• S is the set of vertices whose shortest path from s has already been found
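A minimal Python sketch of this loop is given below. It departs from the pseudocode in one respect, which is an implementation choice made here: instead of updating the ordering inside the queue, it pushes a new entry and skips stale ones on extraction, since the standard `heapq` module has no decrease-key operation. The adjacency-list format and the example graph are hypothetical.

```python
import heapq
import math

def dijkstra(graph, s):
    """graph: dict vertex -> list of (successor, distance), distances >= 0."""
    w = {v: math.inf for v in graph}
    p = {v: None for v in graph}
    w[s] = 0
    done = set()             # the set S of vertices with final distances
    queue = [(0, s)]         # priority queue ordered by tentative distance
    while queue:
        dist_u, u = heapq.heappop(queue)
        if u in done:
            continue         # stale entry left by an earlier relaxation
        done.add(u)
        for v, d_uv in graph[u]:
            if w[v] > dist_u + d_uv:      # relax (u, v)
                w[v] = dist_u + d_uv
                p[v] = u
                heapq.heappush(queue, (w[v], v))
    return w, p

# Hypothetical example graph.
graph = {
    "s": [("a", 4), ("b", 1)],
    "a": [("c", 1)],
    "b": [("a", 2), ("c", 5)],
    "c": [],
}
w, p = dijkstra(graph, "s")
print(w)  # {'s': 0, 'a': 3, 'b': 1, 'c': 4}
```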
Example
T. H. Cormen, C. E. Leiserson, R. L. Rivest, Introduction to algorithms, New York: McGraw-Hill, 1990.
Shortest Path: Bellman-Ford
• The limitation of Dijkstra is that it only works for non-negative distances d(u,v)
• Bellman-Ford overcomes this limitation and can detect a negative cycle
Start-shortest-path(G, s)
for i = 1 to |V(G)| - 1
  for each edge (u,v) ∈ E(G)
    • relax( u, v, d() )
for each edge (u,v) ∈ E(G)
  if w(v) > w(u) + d(u,v)
    • return FALSE   /* a negative cycle is reachable from s */
return TRUE
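The pseudocode above translates almost line by line into Python. The sketch below assumes an edge-list representation `(u, v, d)`, which is a choice made here, and returns the distances alongside the TRUE/FALSE flag; both example graphs are hypothetical.

```python
import math

def bellman_ford(vertices, edges, s):
    """edges: list of (u, v, d). Returns (ok, w, p); ok is False when a
    negative cycle is reachable from s."""
    w = {v: math.inf for v in vertices}
    p = {v: None for v in vertices}
    w[s] = 0
    for _ in range(len(vertices) - 1):    # |V| - 1 relaxation passes
        for u, v, d in edges:
            if w[v] > w[u] + d:
                w[v] = w[u] + d
                p[v] = u
    for u, v, d in edges:                 # any further tightening => negative cycle
        if w[v] > w[u] + d:
            return False, w, p
    return True, w, p

vertices = ["s", "a", "b"]
ok, w, p = bellman_ford(vertices, [("s", "a", 4), ("s", "b", 5), ("b", "a", -3)], "s")
print(ok, w)   # True {'s': 0, 'a': 2, 'b': 5}

# The cycle a -> b -> a has total distance 1 + (-3) = -2.
ok_neg, _, _ = bellman_ford(vertices, [("s", "a", 4), ("a", "b", 1), ("b", "a", -3)], "s")
print(ok_neg)  # False
```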
Efficient Implementation of Bellman-Ford
• If w(u) is not tightened in the current iteration, u cannot affect the distances of its successors in the next iteration
Start-shortest-path(G, s)
Q = {s}   /* Q is a FIFO queue */
while ( Q ≠ ∅ )
  u = Extract from Q
  for each edge (u,v) ∈ E(G)
    • relax( u, v, d() )
    • if ( distance of v has changed ) Insert v into Q
Check for negative cycle
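One possible Python rendering of the queue-based variant is sketched below. The slide does not say how the negative-cycle check is done; the sketch assumes one common approach, counting the edges on the current tentative path to each vertex and reporting a negative cycle once that count reaches |V|. The names `hops` and `queued` are introduced here for illustration.

```python
from collections import deque
import math

def bellman_ford_fifo(graph, s):
    """graph: dict vertex -> list of (successor, distance).
    Returns (ok, w); ok is False when a negative cycle is reachable from s."""
    n = len(graph)
    w = {v: math.inf for v in graph}
    hops = {v: 0 for v in graph}    # edges on the current tentative path to v
    w[s] = 0
    queue = deque([s])
    queued = {s}                    # avoids duplicate queue entries
    while queue:
        u = queue.popleft()
        queued.discard(u)
        for v, d_uv in graph[u]:
            if w[v] > w[u] + d_uv:  # relax (u, v)
                w[v] = w[u] + d_uv
                hops[v] = hops[u] + 1
                if hops[v] >= n:    # a path with |V| edges repeats a vertex
                    return False, w
                if v not in queued:
                    queue.append(v)
                    queued.add(v)
    return True, w

ok, w = bellman_ford_fifo({"s": [("a", 4), ("b", 5)], "a": [], "b": [("a", -3)]}, "s")
print(ok, w)   # True {'s': 0, 'a': 2, 'b': 5}

ok_neg, _ = bellman_ford_fifo({"s": [("a", 4)], "a": [("b", 1)], "b": [("a", -3)]}, "s")
print(ok_neg)  # False
```

Only vertices whose distance changed are revisited, which is the saving described in the first bullet of the slide.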
Optimum Cycle Ratio
• Determine the minimum (maximum) ratio $\rho^*$ over all cycles C in G
• Applications:
  • Problem 1: Find the loop that has the largest combinational delay per memory element
    • The circuit cannot be clocked faster than this delay
  • Problem 2: Find the loop that has the smallest combinational delay per memory element
    • If the circuit is implemented with transparent latches, this delay should satisfy some constraints
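To make Problem 1 concrete, the sketch below estimates the maximum cycle ratio by binary search on a candidate ratio r, using a Bellman-Ford style negative-cycle test on the edge weights r·t(e) - d(e). This is a simple substitute chosen for illustration (often attributed to Lawler), not Howard's algorithm. It assumes every latency t(e) is positive and that the graph contains at least one cycle; the function names and the example graph are hypothetical.

```python
def has_cycle_with_ratio_above(vertices, edges, r):
    """True if some cycle C has D(C) - r*T(C) > 0, i.e. a negative cycle
    under the edge weights r*t - d."""
    w = {v: 0.0 for v in vertices}   # as if a virtual source reached every vertex
    for _ in range(len(vertices)):
        changed = False
        for u, v, d, t in edges:
            if w[v] > w[u] + r * t - d:
                w[v] = w[u] + r * t - d
                changed = True
        if not changed:
            return False             # distances settled: no negative cycle
    return True                      # still tightening after |V| passes

def max_cycle_ratio(vertices, edges, tol=1e-6):
    """edges: list of (u, v, d, t) with t > 0; graph must contain a cycle."""
    lo = min(d / t for _, _, d, t in edges)   # no cycle ratio is below this
    hi = max(d / t for _, _, d, t in edges)   # no cycle ratio is above this
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if has_cycle_with_ratio_above(vertices, edges, mid):
            lo = mid
        else:
            hi = mid
    return hi

# Latch dependency graph: latency 1 per edge, distances are delays.
# Cycle a-b-a has ratio (3 + 5) / 2 = 4, cycle b-c-b has ratio (2 + 2) / 2 = 2.
latches = ["a", "b", "c"]
paths = [("a", "b", 3, 1), ("b", "a", 5, 1), ("b", "c", 2, 1), ("c", "b", 2, 1)]
mcr = max_cycle_ratio(latches, paths)
print(round(mcr, 3))  # 4.0
```

In this example the loop through latches a and b is the one that limits the clock period.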
Latch-to-Latch Max Delay
• Naive method:
  • Cut at the latch boundary
  • For each pair (i, j) of latches
    • Set the arrival time of latch i to 0 and of the rest of the latches to -∞
    • Perform DFS from latch j to find its combinational delay
• Better method:
  • Cut at the latch boundary
  • For each latch i
    • Set the arrival time of latch i to 0 and of the rest of the latches to -∞
    • Move through the TFO cone of latch i in topological order and propagate the arrival times through the fanouts
    • Collect the latches j whose arrival time is greater than -∞
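The better method might look as follows in Python. The netlist model is an assumption made for this sketch: `fanins` maps each combinational node to its fanin nodes with the latches already cut, `latch_out` and `latch_in` name the node at each latch output and the node driving each latch input, and all node names are hypothetical. For brevity the sketch sweeps the whole network in topological order instead of restricting the sweep to the TFO cone; nodes outside the cone simply keep the arrival time -∞.

```python
from graphlib import TopologicalSorter
import math

def latch_to_latch_delays(fanins, delay, latch_out, latch_in):
    """Returns {(i, j): max combinational delay from latch i to latch j}."""
    order = list(TopologicalSorter(fanins).static_order())  # fanins come first
    result = {}
    for i, src in latch_out.items():
        arrival = {n: -math.inf for n in order}
        arrival[src] = 0                 # only latch i launches a signal
        for n in order:
            ins = fanins.get(n, [])
            if ins:                      # propagate the latest fanin arrival
                arrival[n] = max(arrival[f] for f in ins) + delay.get(n, 0)
        for j, dst in latch_in.items():
            if arrival[dst] > -math.inf: # latch j is reachable from latch i
                result[(i, j)] = arrival[dst]
    return result

# Two latches A and B, two gates g1 (delay 2) and g2 (delay 1).
fanins = {"g1": ["A.q"], "g2": ["g1", "B.q"]}
delay = {"g1": 2, "g2": 1}
latch_out = {"A": "A.q", "B": "B.q"}
latch_in = {"A": "g2", "B": "g1"}
delays = latch_to_latch_delays(fanins, delay, latch_out, latch_in)
print(delays)  # {('A', 'A'): 3, ('A', 'B'): 2, ('B', 'A'): 1}
```

The resulting pairs are the edges of the latch dependency graph, with the delays as edge distances.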
Cycle Ratio Algorithms
A. Dasdan, “Experimental analysis of the fastest optimum cycle ratio and mean algorithms”, ACM TODAES, vol. 9(4), pp. 385-418, 2004
Overview of Howard’s Algorithm
• This is a Bellman-Ford algorithm with a cycle detection subroutine, which gradually tightens the lower bound on the Max Cycle Ratio (MCR)
• Exponential in the worst case but efficient in practice
• Heuristics are used for faster convergence
  • Find a good starting cycle ratio
  • Detect only relevant changes
• Preprocessing the graph
  • Remove non-cyclic branches
  • Decompose into strongly connected components
Notation for Howard’s Algorithm
• u, v are vertices, which represent latches
• w(u,v) is the distance between u and v, which represents the combinational delay
  • Defined for adjacent vertices only
• d(u) is the longest distance from u to any vertex v
• p(u) is the successor function
  • For each node u, returns the node v such that the distance between u and v is the longest (equal to d(u))
• r is the current best maximum ratio for any loop
  • Initialized to the longest self-loop and refined to r’ in procedure FindRatio()
MCR: Find Ratio
[Figure: pseudocode annotated with the following steps]
• Initialization
• Searching for a new cycle
• Determining a new ratio
• Trying to find a longer loop
• Updating the ratio
Howard’s Algorithm
[Figure: pseudocode annotated with the following steps]
• Initialization
• Trying to find longer loops
• Heuristic to speed up convergence
• Constraint propagation
Clock Skew
• Zero skew
  • Clock arrives at all latches at the same time
• Non-trivial skew
  • Each latch has a skew (a phase of the clock signal at this latch)
• ASAP (“as soon as possible”) and ALAP (“as late as possible”) skews at a latch define a timing window (sequential slack), which the clock at the latch should satisfy for the design to meet the timing constraints
  • The sequential slacks at different latches are not independent
• Clock skew optimization is a fundamental problem, tightly related to retiming and other sequential transformations
  • Skewing changes the skews of the latches; retiming moves the latches according to the allowed skews
Example
[Figure: a circuit between PI and PO, shown in three versions]
Clock period = 3, buffer delay = 1
• Initial: skew = 0
• ALAP: skew = -1
• ASAP: skew = -3
ASAP and ALAP Skew Computation
• Given a clock period r, set the weight of an edge (u,v) to be w’(u,v) = w(u,v) - r
• Connect the latches that depend on PIs to the source vertex s
• Connect the latches that produce POs to the sink vertex t
• Run Bellman-Ford to find the shortest path from s to u
  • This is the ASAP skew of latch u
• Run Bellman-Ford to find the shortest reverse path from t to u
  • This is the ALAP skew of latch u