Unrolling Carry Recurrence. Carry-Lookahead Equations

Unrolling Carry Recurrence

Carry-Lookahead Equations

4-Bit CLA

Circuit Structure

CLA Complexity

Managing CLA Complexity

Multilevel CLA Example

Radix-16 Addition

• Two Binary Numbers Grouped into Hex Digits

• Block Generate and Propagate Signals in Each Radix-16 Digit

• Replace c4 Position of CLA Network with Block Signals g[i,i+3] and p[i,i+3]

• Results in 4-bit “Lookahead Carry Generator”
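As a minimal sketch of the block signals for one radix-16 digit, the Python below computes the block generate and propagate of a 4-bit group from its bit-level signals. It assumes g_j = x_j AND y_j and p_j = x_j XOR y_j (the slides do not fix the definition of p_j at this point), and the function name `block_gp` is hypothetical.

```python
def block_gp(x, y):
    """Return (g, p) for one 4-bit block of operands x and y (each 0..15).

    Assumption: bit-level signals are g_j = x_j AND y_j, p_j = x_j XOR y_j.
    """
    g_bits = [((x >> j) & 1) & ((y >> j) & 1) for j in range(4)]
    p_bits = [((x >> j) & 1) ^ ((y >> j) & 1) for j in range(4)]
    g0, g1, g2, g3 = g_bits
    p0, p1, p2, p3 = p_bits
    # A carry leaves the block if it is generated at some bit and
    # propagated by every higher bit of the block.
    g = g3 | (g2 & p3) | (g1 & p2 & p3) | (g0 & p1 & p2 & p3)
    # The block propagates an incoming carry only if every bit propagates.
    p = p0 & p1 & p2 & p3
    return g, p


# Sanity check over all pairs of hex digits: block g means "digit sum >= 16",
# block p means "digit sum == 15" (this holds for the XOR definition of p_j).
for x in range(16):
    for y in range(16):
        g, p = block_gp(x, y)
        assert g == int(x + y >= 16)
        assert p == int(x + y == 15)
```

With these two signals per hex digit, the same 4-bit lookahead logic can be reused one level up, treating each digit as a single position.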

CLA Design

Lookahead Carry Generator

Block Generate and Propagate

• Assume i0 < i1 < i2

g[i0,i2-1] = g[i1,i2-1] + g[i0,i1-1] p[i1,i2-1]

p[i0,i2-1] = p[i0,i1-1] p[i1,i2-1]

• Example: g[0,3] is Generate Signal of Block for bits 0-3

• Relationships Allow for Merging of Blocks

• Can Allow Merged Block to Overlap
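The merging relationship for contiguous blocks can be checked numerically. The sketch below is illustrative only: `block_gp` and `merge` are hypothetical helper names, the bit-level g and p values are random, and the direct block computation simply ripples through the block one bit at a time.

```python
import random


def block_gp(g_bits, p_bits, lo, hi):
    """Directly compute (g, p) of block [lo, hi] by rippling from bit lo up to hi."""
    g, p = 0, 1
    for j in range(lo, hi + 1):
        g = g_bits[j] | (g & p_bits[j])
        p &= p_bits[j]
    return g, p


def merge(low, high):
    """Merge lower block [i0, i1-1] with the adjacent higher block [i1, i2-1]."""
    g_low, p_low = low
    g_high, p_high = high
    return g_high | (g_low & p_high), p_low & p_high


random.seed(0)
k = 16
for _ in range(1000):
    g_bits = [random.randint(0, 1) for _ in range(k)]
    p_bits = [random.randint(0, 1) for _ in range(k)]
    # Pick split points with i0 < i1 < i2, so both blocks are non-empty.
    i0, i1, i2 = sorted(random.sample(range(k + 1), 3))
    merged = merge(block_gp(g_bits, p_bits, i0, i1 - 1),
                   block_gp(g_bits, p_bits, i1, i2 - 1))
    assert merged == block_gp(g_bits, p_bits, i0, i2 - 1)
```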

Example Merged LAG

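[Figure: four 4-bit blocks CLA3, CLA2, CLA1, CLA0 with inputs x15-12 y15-12, x11-8 y11-8, x7-4 y7-4, x3-0 y3-0 and carries c12, c8, c4, c0; each block outputs its sum bits (s15-12, s11-8, s7-4, s3-0) and block signals (g15-12 p15-12, g11-8 p11-8, g7-4 p7-4, g3-0 p3-0). A Lookahead Carry Generator combines the block signals and produces g15-0 and p15-0.]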

CLA Latency

CLA Architecture

Overlapped LAGs

• Overlap Blocks [i1,j1] and [i0,j0], with i0 ≤ i1-1 ≤ j0 < j1

• Relationships Become:

g[i0,j1] = g[i1,j1] + g[i0,j0] p[i1,j1]

p[i0,j1] = p[i0,j0] p[i1,j1]

• Useful for Building Trees of Different Shapes
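To see that overlapping blocks merge to the same result as a direct computation, the sketch below checks every index combination with i0 ≤ i1-1 ≤ j0 < j1 on random bit-level signals. The helper names are hypothetical and the ordering constraint is the assumed reading of the overlap condition.

```python
import itertools
import random


def block_gp(g_bits, p_bits, lo, hi):
    """Directly compute (g, p) of block [lo, hi] by rippling from bit lo up to hi."""
    g, p = 0, 1
    for j in range(lo, hi + 1):
        g = g_bits[j] | (g & p_bits[j])
        p &= p_bits[j]
    return g, p


def merge(low, high):
    """Merge block [i0, j0] with a higher, possibly overlapping block [i1, j1]."""
    g_low, p_low = low
    g_high, p_high = high
    return g_high | (g_low & p_high), p_low & p_high


random.seed(1)
k = 8
checked = 0
for _ in range(50):
    g_bits = [random.randint(0, 1) for _ in range(k)]
    p_bits = [random.randint(0, 1) for _ in range(k)]
    for i0, i1, j0, j1 in itertools.product(range(k), repeat=4):
        if i0 <= i1 - 1 <= j0 < j1:  # contiguous or overlapping blocks
            low = block_gp(g_bits, p_bits, i0, j0)
            high = block_gp(g_bits, p_bits, i1, j1)
            assert merge(low, high) == block_gp(g_bits, p_bits, i0, j1)
            checked += 1
```

The overlap is harmless because a bit counted in both blocks contributes the same AND and OR terms twice, and both operations are idempotent.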

CLA With LAG

CLA Latency

Example: 64-bit CLA in 13 gate levels since 4^3 = 64

Generates final carry out for Fig. 6.5
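The 13-level figure can be reproduced with a small sketch. It assumes a latency model of 4 log4(k) + 1 gate levels for a k-bit CLA built from 4-bit lookahead blocks; that model is an assumption here (it matches the 64-bit example but is not stated on the slide), and the function name is hypothetical.

```python
def cla_gate_levels(k):
    """Gate levels of a k-bit CLA built from 4-bit lookahead blocks.

    Assumed model: 4 * log4(k) + 1, with k an exact power of 4.
    """
    levels = 0
    n = 1
    while n < k:
        n *= 4
        levels += 1
    if n != k:
        raise ValueError("k must be a power of 4")
    return 4 * levels + 1


print(cla_gate_levels(64))  # 13, since 4^3 = 64
```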

Ling Adders

Ling Adders – Wired OR

Block p and g Generators

Carry Determination as Prefix Computations

• Two Contiguous (or Overlapping) Blocks (g', p') and (g'', p''), where (g'', p'') is the Left (More Significant) Block

• Merged Block (g, p)

g = g'' + g' p''

p = p' p''

• Large Group Generates Carry if:

1) left group generates carry, or

2) right group generates and left group propagates

Carry Operator, ¢

• Define Operator Over (g, p) Pairs

(g, p) = (g', p') ¢ (g'', p'')

g = g'' + g' p''

p = p' p''

• ¢ is Associative

(g', p') ¢ (g'', p'') ¢ (g''', p''')

= [(g', p') ¢ (g'', p'')] ¢ (g''', p''')

= (g', p') ¢ [(g'', p'') ¢ (g''', p''')]

Carry Operator, ¢ (cont)

• ¢ is NOT Commutative

(g', p') ¢ (g'', p'') ≠ (g'', p'') ¢ (g', p')

• This is Easy to See Because:

g = g'' + g' p'' ≠ g' + g'' p'
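Both properties of the carry operator can be confirmed by brute force, since each operand takes only four values. The sketch below writes the operator as `carry_op` (a hypothetical name) with the convention that the first argument is the less significant block (g', p') and the second is the more significant block (g'', p'').

```python
from itertools import product


def carry_op(left, right):
    """(g, p) = (g', p') ¢ (g'', p''), with g = g'' + g' p'' and p = p' p''.

    `left` is the less significant block (g', p'); `right` is the more
    significant block (g'', p'').
    """
    g1, p1 = left
    g2, p2 = right
    return g2 | (g1 & p2), p1 & p2


pairs = list(product((0, 1), repeat=2))  # all four (g, p) values

# Associativity: exhaustive check over all 4**3 operand triples.
for a, b, c in product(pairs, repeat=3):
    assert carry_op(carry_op(a, b), c) == carry_op(a, carry_op(b, c))

# Non-commutativity: a generating block below a killing block yields no
# carry, while the reverse order does.
generate, kill = (1, 0), (0, 0)
assert carry_op(generate, kill) == (0, 0)
assert carry_op(kill, generate) == (1, 0)
```

Associativity is what allows the operands to be grouped freely; the lack of commutativity is why their left-to-right order must still be preserved.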

Prefix Adders

Carry Determination

• Assume Adder with NO cIN

c_{i+1} = g[0,i]

• Carry Enters i+1 Block iff Generated in Block [0,i]

• Assume Adder with cIN = 1

• Viewed as Generated Carry from Stage -1

p_{-1} = 0, g_{-1} = cIN

• Compute g[-1,i] For All i

• Formulate Carry Determination as: Prefix Computation
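A minimal sketch of carry determination as a prefix computation follows. It performs the scan sequentially for clarity (a parallel prefix network would compute the same values with a different grouping), treats the carry-in as stage -1 with g = cIN and p = 0, and assumes bit-level signals g_i = x_i AND y_i and p_i = x_i XOR y_i. The function names are hypothetical.

```python
import random


def carry_op(left, right):
    """Carry operator: `left` is the less significant (g, p) pair."""
    g1, p1 = left
    g2, p2 = right
    return g2 | (g1 & p2), p1 & p2


def carries(x, y, k, c_in=0):
    """Return [c_0, c_1, ..., c_k] for k-bit operands x and y.

    The carry-in is modeled as stage -1 with g = c_in and p = 0.
    """
    prefix = (c_in, 0)   # (g, p) of stage -1
    result = [c_in]      # c_0 = c_in
    for i in range(k):
        xi, yi = (x >> i) & 1, (y >> i) & 1
        prefix = carry_op(prefix, (xi & yi, xi ^ yi))
        result.append(prefix[0])  # c_{i+1} = g[-1, i]
    return result


# Check against ordinary integer addition: sum bit i is x_i XOR y_i XOR c_i,
# and the carry-out is c_k.
random.seed(2)
k = 8
for _ in range(1000):
    x, y = random.randrange(2 ** k), random.randrange(2 ** k)
    c_in = random.randint(0, 1)
    c = carries(x, y, k, c_in)
    total = sum((((x >> i) ^ (y >> i) ^ c[i]) & 1) << i for i in range(k))
    total += c[k] << k
    assert total == x + y + c_in
```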

Prefix Sums Analogy

• Designs for Prefix Sums Can be Converted to Carry Computation

• Replace Adder with ¢ Operator

• Addition IS Commutative, Order Doesn’t Matter

• Can Group (g, p) in Any Way to Combine Into Block Signals (as long as order is preserved)

• (g, p) Allow for Overlapping Groups, Prefix Sums Do Not (sum would contain some values added two or more times)

Prefix Sum Network

[Figure: prefix sum network, annotated with its delay (adder levels) and cost (# of adders)]

Another Way for Prefix Sums

• Compute the Following First:

x0+x1, x2+x3, x4+x5, ..., x_{k-2}+x_{k-1}

• A k/2-Input Prefix Sum Network on These Pairwise Sums Yields the Odd-Indexed Partial Sums s1, s3, s5, ..., s_{k-1}

• Next, Even-Indexed Sums Computed As:

s_{2j} = s_{2j-1} + x_{2j}
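The recursive construction can be sketched in software as follows. This is an illustration of the data flow, not a hardware description: it assumes the input length is a power of two, and `prefix_sums` is a hypothetical name.

```python
def prefix_sums(x):
    """Prefix sums s[i] = x[0] + ... + x[i]; len(x) must be a power of 2."""
    k = len(x)
    if k == 1:
        return list(x)
    # Step 1: pairwise sums x0+x1, x2+x3, ..., x[k-2]+x[k-1].
    pairs = [x[2 * j] + x[2 * j + 1] for j in range(k // 2)]
    # Step 2: a k/2-input prefix computation on the pairs gives the
    # odd-indexed results s1, s3, ..., s[k-1].
    odd = prefix_sums(pairs)
    s = [0] * k
    s[0] = x[0]
    for j in range(k // 2):
        s[2 * j + 1] = odd[j]
    # Step 3: even-indexed results, s[2j] = s[2j-1] + x[2j] for j >= 1.
    for j in range(1, k // 2):
        s[2 * j] = s[2 * j - 1] + x[2 * j]
    return s


print(prefix_sums([1, 2, 3, 4]))  # [1, 3, 6, 10]
```

Each recursion level adds one layer of adders before the smaller network and one after it, which is where the roughly 2 log2(k) depth of this design comes from.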

Alternative Prefix Sum Network

Comparison of Prefix Sum Networks

• First Design Faster: log2(k) versus 2 log2(k) - 2 (levels)

• First Design has High Fan-out Requirements

• First Design Requires More Cells: (k/2) log2(k) versus 2k - 2 - log2(k)

• Second Design is Brent-Kung Parallel Prefix Graph

• First Design is Kogge-Stone Parallel Prefix Graph (fan-out can be avoided by distributing computations)
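To make the trade-off concrete, the sketch below evaluates the level and cell counts quoted above for a few word widths. The formulas are taken directly from the comparison; the function names are hypothetical and k is assumed to be a power of two.

```python
def first_design(k):
    """(levels, cells) of the first network: log2(k) and (k/2) log2(k)."""
    n = k.bit_length() - 1  # log2(k) for k a power of 2
    return n, (k // 2) * n


def second_design(k):
    """(levels, cells) of the second network: 2 log2(k) - 2 and 2k - 2 - log2(k)."""
    n = k.bit_length() - 1
    return 2 * n - 2, 2 * k - 2 - n


for k in (16, 32, 64):
    print(k, first_design(k), second_design(k))
# k = 16: first design (4, 32), second design (6, 26)
# k = 64: first design (6, 192), second design (10, 120)
```

For 64 inputs the first design saves four levels at the cost of 72 additional cells.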

Brent-Kung Network

independent, so single delay

Kogge-Stone Network

Area/Levels of Prefix Networks

Hybrid Parallel Prefix Network