
Timing Issues and Clock Distribution

Lecture 10, 18-322 Fall 2003

Textbook: [Sections 7.5, 10.1, 10.3]

Overview

• Timing issues & clock distribution
• System Performance Determination
• Pipelining
• Clock skew. Register timing
• Countering clock skew

Review: Register Timing

[Figure: register timing diagram with clk and Q waveforms, marking setup time, hold time, clk-to-Q (propagation) delay (tpFF), cycle time, and the unstable-data interval. © Prentice Hall 1995]

Sequential Systems: The Big Picture

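[Figure: block diagram of a sequential system. Primary Inputs and the Current State enter the Combinational Logic, which produces the Primary Outputs and the Next State; the Memory Elements (Registers), driven by the Clock, hold the state.]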

Maximum Clock Frequency

[Figure: flip-flops (FF’s) clocked by φ, with combinational LOGIC of delay tp,comb between them.]

“Speed” of the sequential machine (how fast can this machine be clocked)

f = 1/Tφ (clock frequency)

Example: tp ~ 100ns => 10MHz (limit on performance)

tp,FF + tp,comb + tsetup < Tφ
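A minimal Python sketch of this bound, for illustration only. The function name and the split of the 100 ns example into flip-flop, logic, and setup parts are assumptions; the slide gives only the total.

```python
def max_clock_mhz(t_p_ff_ns, t_p_comb_ns, t_setup_ns):
    """Upper bound on clock frequency from tp,FF + tp,comb + tsetup < T_phi.

    All delays are in nanoseconds; the result is in MHz.
    """
    t_min_ns = t_p_ff_ns + t_p_comb_ns + t_setup_ns  # shortest usable period
    return 1000.0 / t_min_ns                         # 1/ns = GHz = 1000 MHz


# Hypothetical split of a 100 ns total path delay (only the total is from the slide).
print(max_clock_mhz(5, 90, 5))
```

With a 100 ns total the bound comes out at 10 MHz, matching the example above.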

Setup Time

Required time for input to be stable BEFORE CLOCK EDGE

[Figure: combinational logic (Comb. Logic) feeding a register; data must be stable here, before the clock edge arrives here.]

Setup Time Fix

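[Figure: Φ and Data waveforms showing a setup time violation.]

This violation can be fixed by stretching the clock cycle

[Figure: Φ and Data waveforms with the stretched clock cycle, marked OK.]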

Setup Time Fix 2

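[Figure: Φ and Data waveforms showing a setup time violation.]

OR… by accelerating the combinational logic

[Figure: Φ and Data waveforms with the faster logic, marked OK.]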

Hold Time

Required time for input to be stable AFTER CLOCK EDGE

[Figure: combinational logic (Comb. Logic) feeding a register; data must stay stable here, after the clock edge arrives here.]

Hold Time Violations

[Figure: example path with Prop Delay: 1 ns and Hold Time: 2 ns.]

• Hold time violations are caused by “short paths”
• Cannot be fixed by slowing down the clock!!!
• Fixed by slowing down fast paths
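The point that the clock period does not help can be seen in a small Python sketch. It assumes zero clock skew, and the function names and the 1 ns buffer are hypothetical; only the 1 ns propagation delay and 2 ns hold time come from the slide.

```python
def hold_slack_ns(t_short_path_ns, t_hold_ns):
    """Hold slack: earliest new-data arrival minus hold time (zero skew assumed).

    The clock period does not appear, so a slower clock cannot help.
    """
    return t_short_path_ns - t_hold_ns


def setup_slack_ns(t_period_ns, t_long_path_ns, t_setup_ns):
    """Setup slack, shown for contrast: this one does grow with the period."""
    return t_period_ns - t_long_path_ns - t_setup_ns


# Slide values: 1 ns propagation delay on the short path, 2 ns hold time.
print(hold_slack_ns(1, 2))        # negative: hold violation at any clock period
# Hypothetical fix: add 1 ns of delay (for example a buffer) to the fast path.
print(hold_slack_ns(1 + 1, 2))    # zero: the hold requirement is just met
```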

Timing Analysis

• Look for longest path: clock speed
• Look for shortest paths: check hold time

Static Timing Analysis:
• Attempt to determine longest/shortest path from schematic
• Difficult problem
• Know the delay of logic elements, but cannot easily reason about the entire design
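As a rough illustration of the longest/shortest path search, the sketch below propagates earliest and latest arrival times through a gate-level graph in topological order. The netlist format, gate names, and delays are hypothetical, and each gate is given a single fixed delay, which is a simplification.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def arrival_times(gates, primary_inputs):
    """Return {node: (earliest, latest)} arrival times.

    gates maps a gate name to (delay, [fan-in names]); primary inputs
    arrive at time 0. Purely topological: logic values are ignored.
    """
    graph = {name: set(fanins) for name, (_, fanins) in gates.items()}
    times = {pi: (0, 0) for pi in primary_inputs}
    for node in TopologicalSorter(graph).static_order():
        if node in times:          # primary input, already timed
            continue
        delay, fanins = gates[node]
        earliest = min(times[f][0] for f in fanins) + delay  # shortest path
        latest = max(times[f][1] for f in fanins) + delay    # longest path
        times[node] = (earliest, latest)
    return times


# Hypothetical three-gate circuit with inputs a, b, c.
gates = {
    "g1": (2, ["a", "b"]),
    "g2": (3, ["g1", "c"]),
    "out": (1, ["g2", "a"]),
}
print(arrival_times(gates, ["a", "b", "c"])["out"])
```

Because the traversal never looks at logic values, it can report a longest path that no input pattern actually exercises.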

False Paths

[Figure: example circuit with labels #2, #3, #4.]

Solutions:
• Simulation
• False Path Analysis

Speeding up System Performance: Pipelining

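[Figure: non-pipelined version and pipelined version of a datapath. Non-pipelined: inputs a and b are held in registers (REG) clocked by φ, pass through combinational logic of total delay tp,comb ending in the log block, and are captured in an output register (Out). Pipelined: additional registers clocked by φ are inserted between the logic blocks.]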

How Good Is This?

Tmin,pipe = tp,reg + max(tp,ADD, tp,abs, tp,log) + tsetup,reg

• Pipelining is used to implement high-performance data-paths
• Adding extra pipeline stages only makes sense up to a certain point
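A short Python sketch comparing the two versions. The stage delays and register parameters are hypothetical, and the non-pipelined period assumes that tp,comb is the sum of the ADD, abs, and log delays.

```python
def t_min_nonpipelined(t_p_reg, stage_delays, t_setup_reg):
    """Minimum period when all stages sit between one pair of registers."""
    return t_p_reg + sum(stage_delays) + t_setup_reg


def t_min_pipelined(t_p_reg, stage_delays, t_setup_reg):
    """Tmin,pipe: the slowest single stage sets the period."""
    return t_p_reg + max(stage_delays) + t_setup_reg


# Hypothetical delays in ns: ADD = 6, abs = 3, log = 5; register tp = 1, tsetup = 1.
stages = [6, 3, 5]
print(t_min_nonpipelined(1, stages, 1))
print(t_min_pipelined(1, stages, 1))
```

The register delay and setup time are paid in every stage, so they put a floor under the period no matter how finely the logic is divided. That is one reason extra stages stop paying off.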

[Figure: pipelined version of the datapath, with registers (REG) clocked by φ between the stages; inputs a and b, output Out after the log block.]


Synchronous Pipelined Data-Path: Clock Skew

Clock Rates as High as 1 GHz in CMOS!

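[Figure: pipelined datapath In -> CL1 -> R1 -> CL2 -> R2 -> CL3 -> R3 -> Out. The registers see the clock φ at local times tφ’, tφ’’, tφ’’’. Labeled delays: tr,min and tr,max (registers), tl,min and tl,max (logic), ti (interconnect).]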

Clock Edge Timing Depends upon Position

A clock line behaves as a distributed RC line

Each register sees a local clock time that depends on its distance from the clock source -> clock skew

δ = tφ’’ – tφ’ (> 0 or < 0)

Clock skew can severely affect the performance

Note: we assumed here tsetup = 0

Constraints on Skew

(a) Race between clock and data.

[Figure: R1 is clocked by φ’ at tφ’ and R2 by φ’’ at tφ’’ = tφ’ + δ. The earliest time the data at R2 can change is tφ’ + tr,min + tl,min + ti.]

If the local clock of R2 is delayed w.r.t. R1, it might happen that the inputs of R2 change before the previous data is latched -> race

δ ≤ tr,min + ti + tl,min

(b) Data should be stable before clock pulse is applied.

[Figure: R1 is clocked at tφ’ and R2 captures at tφ’’ + T = tφ’ + T + δ. In the worst case the data reaches R2 at tφ’ + tr,max + tl,max + ti.]

The correct input data is stable at R2 after the worst-case propagation delay. The clock period must be large enough for the computations to settle.

T ≥ tr,max + ti + tl,max - δ

Clock Constraints in Edge-Triggered Logic

δ ≤ tr,min + ti + tl,min   (1)

T ≥ tr,max + ti + tl,max - δ   (2)

• Maximum Clock Skew Determined by Minimum Delay between Latches (condition 1)
• Minimum Clock Period Determined by Maximum Delay between Latches (condition 2)
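Conditions (1) and (2) can be checked mechanically. The Python sketch below is illustrative: the function names and all delay values are hypothetical, and tsetup = 0 is assumed as in the derivation above.

```python
def max_skew(t_r_min, t_i, t_l_min):
    """Condition (1): largest skew that still avoids a race."""
    return t_r_min + t_i + t_l_min


def min_period(t_r_max, t_i, t_l_max, skew):
    """Condition (2): smallest clock period for a given skew."""
    return t_r_max + t_i + t_l_max - skew


def check_edge_triggered(skew, period, t_r_min, t_r_max, t_l_min, t_l_max, t_i):
    """Return (race_free, period_ok) for one pair of communicating registers."""
    race_free = skew <= max_skew(t_r_min, t_i, t_l_min)
    period_ok = period >= min_period(t_r_max, t_i, t_l_max, skew)
    return race_free, period_ok


# Hypothetical delays in ps.
delays = dict(t_r_min=50, t_r_max=80, t_l_min=100, t_l_max=900, t_i=10)
for skew in (-100, 0, 100, 200):
    print(skew, check_edge_triggered(skew, period=1000, **delays))
```

In this sweep only the 200 ps skew fails condition (1), and no choice of period repairs it. A negative skew always passes (1) but raises the minimum period.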

Positive and Negative Skew

(a) Positive skew

[Figure: data passes through alternating CL and R stages; the clock φ is routed in the same direction as the data.]

• The clock is routed in the same direction as data
• The skew has to satisfy (1)
• If it violates (1), then the circuit malfunctions independently of the clock period
• Clock period decreases!!!

(b) Negative skew

[Figure: the same CL and R stages; the clock φ is routed in the opposite direction of the data.]

• The clock is routed in the opposite direction of data
• (1) is satisfied implicitly. The circuit operates correctly independently of the skew
• Clock period increases by |δ|


Countering Clock Skew

[Figure: Data and Clock Routing. Datapath from In through REG stages (clocked by φ) and the log block to Out; the clock distribution is labeled Positive Skew and Negative Skew.]

Goal: clock skew between registers is bounded! (What matters is the relative skew between communicating registers.)

Clock Distribution: H-Trees

[Figure: H-tree clock network driven from clk.]

• Every branch sees the same wire length and capacitance
• The clock skew is theoretically zero
• The sub-blocks should be small enough s.t. the skew within the block is tolerable
• It is essential to consider clock distribution early in the design process

Clock distribution is a major design problem!
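The equal-wire-length property of an H-tree can be illustrated with a purely geometric Python sketch. The function name, coordinates, and dimensions are hypothetical. Buffers, load differences, and process variation are ignored, which is why real skew is not zero.

```python
def h_tree_leaves(cx, cy, half_w, half_h, levels, length=0.0):
    """Return [(x, y, wire_length_from_root)] for every leaf of an H-tree.

    Each level routes half_w horizontally and then half_h vertically
    to reach four sub-block centers, then recurses at half the size.
    """
    if levels == 0:
        return [(cx, cy, length)]
    leaves = []
    for sx in (-1, 1):        # left or right along the bar of the H
        for sy in (-1, 1):    # up or down along a stroke of the H
            leaves += h_tree_leaves(cx + sx * half_w, cy + sy * half_h,
                                    half_w / 2, half_h / 2, levels - 1,
                                    length + half_w + half_h)
    return leaves


# Hypothetical 3-level tree rooted at the origin.
leaves = h_tree_leaves(0.0, 0.0, 4.0, 4.0, levels=3)
print(len(leaves), {length for _, _, length in leaves})
```

Every leaf reports the same wire length from the root. This is the geometric reason the skew is zero in theory.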

Clock Network with Distributed Buffering

[Figure: the main clock driver (CLOCK) feeds secondary clock drivers, each serving the modules in its local area.]

• Reduces absolute delay, and makes Power-Down easier
• Sensitive to variations in Buffer Delay

DEC Alpha 21164

[Figure: chip layout with the clock drivers marked.]

• 9.3 M Transistors, 4 metal layers, 0.55 µm
• Clock Freq: 300 MHz
• Clock Load: 3.75 nF
• Power in Clock = 20 W (out of 50 W)
• Two Level Clock Distribution:
o Single 6-stage driver at center
o Secondary buffers drive left and right side
o Max clock skew less than 100 psec
o Routing the clock in the opposite direction
o Proper timing

Clock Skew in Alpha

[Figure: clock skew across the Alpha chip, with the clock driver labeled.]

Timing & Race Conditions: Example

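[Figure: 32-bit registers R1 and R2 feed a 32-bit adder built from full-adder cells (inputs A, B, Cin; outputs Sum, Cout); R3, R4, and R5 are further 32-bit registers. The clk driver (150Ω) drives the Source side, and ~1mm of wire (200Ω, 100fF) leads to the Destination side. A 300fF load is marked.]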

Example (cont’d)

Find the skew between the source register clock (φ’) and the destination (φ’’)

[Figure: π model. The 150Ω driver drives node φ’ (600fF + 50fF = 650fF); the 200Ω wire leads to node φ’’ (50fF + 900fF = 950fF).]

tφ’ = 0.69 (150) (650) = 67ps
tφ’’ = 0.69 [(150) (650) + (150 + 200)(950)] = 297ps
δ = tφ’’ – tφ’ = 230ps

Check race condition

δ ≤ tr,min + ti + tl,min   condition (1)
thold + δ ≤ tclk-Q + tsum
100 + 230 ≤ 50 + 300   TRUE => No race problem

Find minimum clock period

T ≥ tr,max + ti + tl,max - δ   condition (2)
T ≥ tclk-Q + 31 tcarry + tsum - δ + tsetup
T ≥ 50 + 31(250) + 300 – 230 + 150 = 8020ps => T ≥ 8.02 ns
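The arithmetic of this example can be replayed with a short Python sketch. It evaluates the same expressions as the slide. The function and parameter names are hypothetical, and the delay values (thold = 100, tclk-Q = 50, tsum = 300, tcarry = 250, tsetup = 150) are assumed to be in ps.

```python
def rc_delay_ps(rc_terms):
    """0.69 * sum(R * C), with R in ohms and C in fF, rounded to whole ps."""
    return round(0.69 * sum(r * c for r, c in rc_terms) / 1000.0)  # ohm*fF = fs


def clock_skew_ps(r_driver=150, r_wire=200, c_src_ff=650, c_dst_ff=950):
    """Return (t_phi', t_phi'', skew) for the pi model of the example."""
    t_src = rc_delay_ps([(r_driver, c_src_ff)])
    t_dst = rc_delay_ps([(r_driver, c_src_ff), (r_driver + r_wire, c_dst_ff)])
    return t_src, t_dst, t_dst - t_src   # skew = t_phi'' - t_phi'


def race_free(skew, t_hold=100, t_clk_q=50, t_sum=300):
    """Condition (1) with the hold time included."""
    return t_hold + skew <= t_clk_q + t_sum


def min_period_ps(skew, t_clk_q=50, t_carry=250, t_sum=300, t_setup=150, bits=32):
    """Condition (2): worst case is a carry rippling through bits - 1 stages."""
    return t_clk_q + (bits - 1) * t_carry + t_sum - skew + t_setup


t_src, t_dst, skew = clock_skew_ps()
print(t_src, t_dst, skew, race_free(skew), min_period_ps(skew))
```

With the slide's values this gives a 230 ps skew, no race, and a minimum period of 8020 ps.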