Lecture 12 - CpE 690 Introduction to VLSI Design

8/13/2019 Lecture 12 - CpE 690 Introduction to VLSI Design


Full Adder Design I

Direct Implementation of Boolean Equations:

SUM = A ⊕ B ⊕ C

Cout = MAJ(A,B,C)

36 transistors
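The two equations above can be checked exhaustively. The following is a minimal sketch, assuming MAJ is the usual three-input majority function; the name `full_adder` is illustrative and not from the slides.

```python
from itertools import product

def full_adder(a, b, c):
    """Return (sum, cout) for one-bit inputs a, b, c."""
    s = a ^ b ^ c                        # SUM = A xor B xor C
    cout = (a & b) | (b & c) | (a & c)   # MAJ: 1 when at least two inputs are 1
    return s, cout

# Exhaustive check against ordinary integer addition.
for a, b, c in product((0, 1), repeat=3):
    s, cout = full_adder(a, b, c)
    assert 2 * cout + s == a + b + c
```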



Ripple Carry Adder

Simplest design: cascade full adders

Critical path goes from Cin to Cout

Worst case delay is linear in number of bits

td (N-bit adder) = (N-1).tcarry + tsum

Need to minimize tcarry, the delay from C to Cout in each full adder

tsum (delay from A,B,C to S) is negligible for large N

[Figure: 4-bit ripple carry adder; four cascaded full adders with inputs A0-A3 and B0-B3, sums S0-S3, and carry chain Cin, C1, C2, C3, Cout]
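The cascade and its delay formula can be sketched as follows. This is a behavioral model only; the function names and the LSB-first bit-list convention are assumptions made for illustration.

```python
def full_adder(a, b, c):
    """One-bit full adder: returns (sum, cout)."""
    return a ^ b ^ c, (a & b) | (b & c) | (a & c)

def ripple_carry_add(a_bits, b_bits, cin=0):
    """Add two equal-length bit lists (LSB first) by cascading full adders."""
    carry = cin
    sums = []
    for a, b in zip(a_bits, b_bits):
        s, carry = full_adder(a, b, carry)   # carry ripples into the next bit
        sums.append(s)
    return sums, carry

def ripple_delay(n, t_carry, t_sum):
    """Worst-case delay from the slide: td = (N-1).tcarry + tsum."""
    return (n - 1) * t_carry + t_sum

# 11 + 6 = 17: 1011 + 0110 gives sum bits 0001 with Cout = 1.
sums, cout = ripple_carry_add([1, 1, 0, 1], [0, 1, 1, 0])
```

The delay values passed to `ripple_delay` are whatever unit the caller chooses; the formula only shows that the delay grows linearly with N.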



Full Adder Design II

A more compact design can be realized by generating S as a function of Cout:

S = A.B.C + (A + B + C).Cout'   (Cout' is the complement of Cout)

If we could eliminate the output inverters, we could simplify the design and reduce the C to Cout delay

[Figure: static CMOS full adder schematic; inputs A, B, Ci; internal node X; outputs Co and S]

28 transistors
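That the sum can be derived from the complemented carry is easy to confirm by enumeration. A minimal sketch, assuming the sum expression uses the complement of Cout (the overbar appears to have been lost in extraction); `sum_from_cout` is a hypothetical name.

```python
from itertools import product

def sum_from_cout(a, b, c):
    """Compute Cout first, then S = A.B.C + (A + B + C).Cout'."""
    cout = (a & b) | (b & c) | (a & c)
    not_cout = 1 - cout
    s = (a & b & c) | ((a | b | c) & not_cout)
    return s, cout

# The reused-carry form matches S = A xor B xor C for every input combination.
for a, b, c in product((0, 1), repeat=3):
    s, cout = sum_from_cout(a, b, c)
    assert s == a ^ b ^ c
```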



Full Adder Design II - Layout

Standard cell style (not bit-slice) layout:

[Figure: standard cell layout of Full Adder Design II]

Full Adder: Design III (Mirror Adder)

Output inverters removed

pMOS and nMOS networks are mirrors of each other rather than complementary, which simplifies layout

enabled by symmetry of the add operation

Transistors placed & sized to minimize carry propagation delay, at the expense of sum generation

24 transistors
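The symmetry referred to above is that inverting all three inputs of a full adder inverts both outputs (the carry and sum functions are self-dual). A minimal sketch that checks this property; the helper names are illustrative.

```python
from itertools import product

def carry(a, b, c):
    return (a & b) | (b & c) | (a & c)

def sum_bit(a, b, c):
    return a ^ b ^ c

def is_self_dual(f):
    """True if f(A', B', C') == f(A, B, C)' for all inputs."""
    return all(f(1 - a, 1 - b, 1 - c) == 1 - f(a, b, c)
               for a, b, c in product((0, 1), repeat=3))

def and3(a, b, c):
    """A 3-input AND, used only as a counterexample."""
    return a & b & c
```

Both `carry` and `sum_bit` pass the check, while a plain AND does not, which is why a mirrored pull-up network works here but not for arbitrary gates.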


Mirror Adder Layout

Bit-slice cell style (not standard cell) layout:

Transistors now run vertically with horizontal poly

Data travels from left to right

carry propagates vertically from one bit to the next

Can build wide transistors without affecting bit pitch


GPK Representation

Introduce new intermediate signals that describe full adder operation in terms of carry propagation (X' denotes the complement of X)

G = A.B (i.e. generate carry: Cout = 1 independent of C)

P = A ⊕ B (i.e. propagate carry: Cout = C)

K = A'.B' (i.e. kill carry: Cout = 0 independent of C)

Note that G, P and K are only functions of A and B

don't need to wait for C

A B C | G P K | Cout S
0 0 0 | 0 0 1 | 0    0
0 0 1 | 0 0 1 | 0    1
0 1 0 | 0 1 0 | 0    1
0 1 1 | 0 1 0 | 1    0
1 0 0 | 0 1 0 | 0    1
1 0 1 | 0 1 0 | 1    0
1 1 0 | 1 0 0 | 1    0
1 1 1 | 1 0 0 | 1    1
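The truth table can be regenerated from the G, P, K definitions. A minimal sketch, assuming Cout = G + P.C and S = P ⊕ C (both follow from the table); `gpk` is an illustrative name.

```python
from itertools import product

def gpk(a, b):
    """Generate, propagate and kill signals; functions of A and B only."""
    g = a & b
    p = a ^ b
    k = (1 - a) & (1 - b)
    return g, p, k

for a, b, c in product((0, 1), repeat=3):
    g, p, k = gpk(a, b)
    assert g + p + k == 1          # exactly one of G, P, K is active
    cout = g | (p & c)             # carry is generated, or propagated from C
    s = p ^ c
    assert 2 * cout + s == a + b + c
```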


GPK Representation

Can see the action of generate, propagate and kill operators in mirror adder:

[Figure: mirror adder schematic with the Kill, Generate, "0"-Propagate and "1"-Propagate transistor paths labeled; inputs A, B, Ci; outputs Co and S]


Using GPK to Speed up Carry Propagation

Divide the words to be added into bit groups or blocks

e.g. think about adding 4 bits at a time

Addition of each block is a three-step process:

1. Compute bit-wise generate, propagate (& kill) signals

Gi = Ai.Bi   Pi = Ai ⊕ Bi   Ki = Ai'.Bi'

2. Use PG(K) signals and Cin to determine Ci for each bit (and Cout)

3. Calculate sums using Si = Pi ⊕ Ci

[Figure: 4-bit adder block with 4-bit inputs A and B, 4-bit output S, carry input Cin and carry output Cout]
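The three steps can be sketched for one block as follows. This is a behavioral model in which step 2 is a simple ripple through the PG signals; faster carry networks replace only that step. The function names and LSB-first bit lists are assumptions for illustration.

```python
def to_bits(x, n=4):
    """Integer to list of n bits, LSB first."""
    return [(x >> i) & 1 for i in range(n)]

def block_add(a_bits, b_bits, cin):
    # Step 1: bit-wise generate and propagate signals.
    g = [a & b for a, b in zip(a_bits, b_bits)]
    p = [a ^ b for a, b in zip(a_bits, b_bits)]
    # Step 2: carry into each bit, Ci+1 = Gi + Pi.Ci, starting from Cin.
    carries = [cin]
    for gi, pi in zip(g, p):
        carries.append(gi | (pi & carries[-1]))
    # Step 3: sums, Si = Pi xor Ci.
    sums = [pi ^ ci for pi, ci in zip(p, carries)]
    return sums, carries[-1]

def block_add_int(a, b, cin=0, n=4):
    """Convenience wrapper returning the (n+1)-bit result as an integer."""
    sums, cout = block_add(to_bits(a, n), to_bits(b, n), cin)
    return sum(s << i for i, s in enumerate(sums)) | (cout << n)
```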


Group Addition with PG Logic


Manchester Carry

Use transmission gates to provide carry propagation

[Figure: static and dynamic Manchester carry gate circuits]


4-bit Manchester Carry Logic

[Figure: 4-bit Manchester carry chain and its RC model, with series resistance R/2 and node capacitance 9C per stage]



Delays in Manchester Chain

Using the Elmore delay model, delay (after n stages) = (9/4).n(n+1)RC

Delay increases quadratically with n

Better to add a couple of inverters after 3-4 bits

makes overall delay linear in n

[Figure: Manchester carry chain RC model; series resistance R/2 and node capacitance 9C per stage]

n                      1        2        3        4
total delay            4.5 RC   13.5 RC  27 RC    45 RC
delay of extra stage   -        9 RC     13.5 RC  18 RC
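The table entries follow from summing, for each node, its capacitance times the resistance between the source and that node. A minimal sketch, assuming the uniform ladder suggested by the figure labels (R/2 per stage, 9C per node); delays are returned in units of RC and `elmore_delay` is an illustrative name.

```python
from fractions import Fraction

def elmore_delay(n, r_stage=Fraction(1, 2), c_node=9):
    """Elmore delay, in units of RC, at the far end of an n-stage RC ladder."""
    total = Fraction(0)
    upstream_r = Fraction(0)
    for _ in range(n):
        upstream_r += r_stage          # resistance from the source to this node
        total += upstream_r * c_node   # this node's contribution
    return total

delays = [elmore_delay(n) for n in range(1, 5)]
extra = [delays[i] - delays[i - 1] for i in range(1, 4)]
```

Each added stage costs more than the one before it, which is the motivation for buffering the chain every few bits.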


Manchester Carry Stick Layout

[Figure: stick diagram of Manchester carry bit slices between VDD and GND, with an Inverter/Sum row and a Propagate/Generate row; signals Pi, Gi, Ci-1, Ci, Pi+1, Gi+1, Ci+1]



Carry-Bypass Adder

If (P0 and P1 and P2 and P3) then Cout = Cin

Otherwise use PG within the block

Bypass signal: BP = P0.P1.P2.P3

In a large adder with many blocks, BP is set up well before Cin arrives

Also known as Carry-Skip Adder

[Figure: 4-bit carry chain from Cin to Cout, shown without and with the bypass path controlled by BP]
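The bypass does not change the logic function, only the path the carry takes. A minimal behavioral sketch of one 4-bit block; the names `to_bits` and `bypass_block` are illustrative, and timing is not modeled.

```python
def to_bits(x, n=4):
    """Integer to list of n bits, LSB first."""
    return [(x >> i) & 1 for i in range(n)]

def bypass_block(a_bits, b_bits, cin):
    """Return (BP, Cout) for one block."""
    p = [a ^ b for a, b in zip(a_bits, b_bits)]
    g = [a & b for a, b in zip(a_bits, b_bits)]
    bp = int(all(p))                 # BP = P0.P1.P2.P3
    carry = cin
    for gi, pi in zip(g, p):         # normal PG ripple within the block
        carry = gi | (pi & carry)
    cout = cin if bp else carry      # bypass: Cin goes straight to Cout
    return bp, cout

# 0101 + 1010: every bit propagates, so BP = 1 and Cout follows Cin.
example = bypass_block(to_bits(0b0101), to_bits(0b1010), 1)
```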



Carry-Bypass Critical Path

[Figure: 16-bit carry-bypass adder in four M-bit blocks (bits 0-3, 4-7, 8-11, 12-15), each with Setup, Carry propagation and Sum stages; critical path annotated with tsetup, tbypass and tsum]

If we have N bits, M bits/block and N/M blocks, worst case delay is

tadder = tsetup + M.tcarry + (N/M-1).tbypass + (M-1).tcarry + tsum

[Figure: delay td versus N for ripple and bypass adders; crossover marked at about 4-8]
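The two worst-case delay formulas from the slides can be evaluated side by side. This is a sketch only: the unit delays used in the example are assumptions chosen for illustration, so the crossover point it produces will not necessarily match the plotted one.

```python
def ripple_delay(n, t_carry, t_sum):
    """td = (N-1).tcarry + tsum"""
    return (n - 1) * t_carry + t_sum

def bypass_delay(n, m, t_setup, t_carry, t_bypass, t_sum):
    """tadder = tsetup + M.tcarry + (N/M-1).tbypass + (M-1).tcarry + tsum"""
    blocks = n // m
    return (t_setup + m * t_carry + (blocks - 1) * t_bypass
            + (m - 1) * t_carry + t_sum)

# Assumed unit delays (all equal to 1) and M = 4 bits per block.
for n in (8, 16, 32, 64):
    print(n, ripple_delay(n, 1, 1), bypass_delay(n, 4, 1, 1, 1, 1))
```

Both delays are linear in N, but the bypass adder's slope is tbypass/M per bit instead of tcarry per bit.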


Carry-Select Adder

For each M-bit block:

Calculate block carries for both Cin=0 and Cin=1

Then when Cin finally arrives, use multiplexer to select correct result

[Figure: carry-select block; PG Setup feeding a "0" carry propagation chain and a "1" carry propagation chain, a multiplexer controlled by Co,k producing Co,k+M and the carry vector, followed by Sum Generation]
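The select scheme can be sketched behaviorally: each block computes both candidate results, and the incoming carry only picks one. The function names and the integer wrapper are assumptions for illustration.

```python
def ripple_block(a_bits, b_bits, cin):
    """Ripple through one block; returns (sum bits, cout)."""
    carry, sums = cin, []
    for a, b in zip(a_bits, b_bits):
        sums.append(a ^ b ^ carry)
        carry = (a & b) | ((a ^ b) & carry)
    return sums, carry

def carry_select_block(a_bits, b_bits, cin):
    result0 = ripple_block(a_bits, b_bits, 0)   # precomputed for Cin = 0
    result1 = ripple_block(a_bits, b_bits, 1)   # precomputed for Cin = 1
    return result1 if cin else result0          # multiplexer

def carry_select_add(a, b, width=8, m=4):
    """Add two width-bit integers using m-bit carry-select blocks."""
    carry, total = 0, 0
    for k in range(0, width, m):
        a_bits = [(a >> i) & 1 for i in range(k, k + m)]
        b_bits = [(b >> i) & 1 for i in range(k, k + m)]
        sums, carry = carry_select_block(a_bits, b_bits, carry)
        for i, s in enumerate(sums):
            total |= s << (k + i)
    return total | (carry << width)
```

In hardware the two chains run in parallel, so the cost is roughly doubled carry logic in exchange for a one-multiplexer delay per block.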


Carry-Select Adder Critical Path

worst case delay is:

tadder = tsetup + M.tcarry + (N/M).tmux + tsum


Square Root Carry-Select Adder

By making blocks of increasing length, we can perform more carry calculations while waiting for the multiplexer select signal

tadder = tsetup + M.tcarry + sqrt(2N).tmux + tsum

[Figure: 20-bit square root carry-select adder with blocks of 2, 3, 4, 5 and 6 bits (bits 0-1, 2-4, 5-8, 9-13, 14-19, sums S0-1 to S14-19); each block has Setup, "0" Carry, "1" Carry, Multiplexer and Sum Generation stages; parenthesized numbers (1) to (9) mark signal arrival times, starting from Ci,0]
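The benefit of graded block sizes shows up in a simple arrival-time model. This sketch assumes unit delays for setup, per-bit carry, multiplexer and sum, which appear to be the values behind the parenthesized numbers in the figure; the function name is illustrative.

```python
def select_arrival_times(block_sizes, t_setup=1, t_carry=1, t_mux=1, t_sum=1):
    """Return (carry-ready times, mux-output times, total delay)."""
    carry_ready, mux_out = [], []
    prev_mux = 0                                  # Ci,0 is available at time 0
    for m in block_sizes:
        ready = t_setup + m * t_carry             # both carry chains are done
        prev_mux = max(ready, prev_mux) + t_mux   # wait for chains and select
        carry_ready.append(ready)
        mux_out.append(prev_mux)
    return carry_ready, mux_out, prev_mux + t_sum

sqrt_style = select_arrival_times([2, 3, 4, 5, 6])   # 20 bits, growing blocks
linear = select_arrival_times([4, 4, 4, 4, 4])       # 20 bits, equal blocks
```

With growing blocks, each carry chain finishes just as its select signal arrives, so no multiplexer sits idle waiting.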


Carry-Select Adder: Delay Comparisons

Square root select particularly effective for large N (e.g. 64-bit)


Tree Adders

For wide adders (N>32 bits), delay of carry lookahead (bypass or select) adders is dominated by the delay of passing the carry through the lookahead stages (multiplexers).

This delay can be reduced by recursively looking ahead across lookahead blocks, e.g.

lookahead across 2-bit blocks to generate Cin to 4-bit blocks

lookahead across 4-bit blocks to generate Cin to 8-bit blocks, etc.

Delay can be O(log N) (at expense of area and power!)

[Figure: 16-bit tree adder, e.g. Brent-Kung adder; a PG generation row for bits 0-15, a tree of PG logic, G logic and buffer cells forming group signals 1:0, 3:2, ..., 15:14, then 3:0, 7:4, 11:8, 15:12, then 7:0, 15:8, then 11:0, then 5:0, 9:0, 13:0, giving all prefixes 0:0 to 15:0, and a Sum calculation row]
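The recursive lookahead can be sketched as a prefix computation over (G, P) pairs. The code below follows the Brent-Kung pattern named in the figure (a forward tree followed by a reverse tree); it assumes a power-of-two width and Cin = 0, and the function names are illustrative.

```python
def combine(hi, lo):
    """Merge group (G, P) of the upper bits with that of the lower bits."""
    g_hi, p_hi = hi
    g_lo, p_lo = lo
    return g_hi | (p_hi & g_lo), p_hi & p_lo

def brent_kung_prefix(pg):
    """Return group (G, P) for bits i:0 at every position i."""
    x = list(pg)
    n = len(x)
    d = 1
    while d < n:                       # forward tree: 1:0, 3:0, 7:0, 15:0, ...
        for i in range(2 * d - 1, n, 2 * d):
            x[i] = combine(x[i], x[i - d])
        d *= 2
    d = n // 2
    while d >= 1:                      # reverse tree fills in 11:0, 5:0, 9:0, ...
        for i in range(3 * d - 1, n, 2 * d):
            x[i] = combine(x[i], x[i - d])
        d //= 2
    return x

def tree_add(a, b, width=16):
    pg = [((a >> i) & (b >> i) & 1, ((a >> i) ^ (b >> i)) & 1)
          for i in range(width)]
    prefix = brent_kung_prefix(pg)
    carries = [0] + [g for g, _ in prefix]   # carries[i] is the carry into bit i
    total = carries[width] << width
    for i in range(width):
        total |= (pg[i][1] ^ carries[i]) << i   # Si = Pi xor Ci
    return total
```

Each loop level corresponds to one row of cells in the tree, so the number of levels grows with log N rather than N.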


Unsigned Multiplication

Example:

          1 1 0 0   : 12 (decimal) multiplicand
        x 0 1 0 1   :  5 (decimal) multiplier
          -------
          1 1 0 0
        0 0 0 0       partial
      1 1 0 0         products
    0 0 0 0
  ---------------
  0 0 1 1 1 1 0 0   : 60 (decimal) product

M x N-bit multiplication

Produce N M-bit partial products

Sum these to produce (M+N)-bit product
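The worked example can be reproduced by forming one shifted partial product per multiplier bit and summing them. A minimal sketch; the function names are illustrative.

```python
def partial_products(multiplicand, multiplier, n_bits):
    """One partial product per multiplier bit, already shifted into place."""
    return [(multiplicand if (multiplier >> j) & 1 else 0) << j
            for j in range(n_bits)]

def multiply(multiplicand, multiplier, n_bits):
    return sum(partial_products(multiplicand, multiplier, n_bits))

pps = partial_products(0b1100, 0b0101, 4)   # 12 x 5
product = multiply(0b1100, 0b0101, 4)
```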


Carry Save Multiplier


Carry Save Multiplier Critical Path

tmult = (M+N-2).tcarry + 2.tsum + tAND

or, with a fast adder in the final stage,

tmult = (N-1).tcarry + tfast_adder + tsum + tAND

[Figure: carry save multiplier array with a final Fast Adder stage]
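The two critical-path expressions can be compared numerically. This is only a sketch: the function names are hypothetical and the delay values in the example are assumptions, not figures from the lecture.

```python
def tmult_first_form(m, n, t_carry, t_sum, t_and):
    """tmult = (M+N-2).tcarry + 2.tsum + tAND"""
    return (m + n - 2) * t_carry + 2 * t_sum + t_and

def tmult_fast_adder_form(n, t_carry, t_fast_adder, t_sum, t_and):
    """tmult = (N-1).tcarry + tfast_adder + tsum + tAND"""
    return (n - 1) * t_carry + t_fast_adder + t_sum + t_and

# Assumed values: 8 x 8 bits, unit gate delays, fast adder taking 3 units.
first = tmult_first_form(8, 8, 1, 1, 1)
second = tmult_fast_adder_form(8, 1, 3, 1, 1)
```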



Baugh-Wooley Partial Products

Subtraction of these terms is accomplished by adding their two's complement, i.e. by adding (term' + 1), where term' is the bitwise complement of term


Baugh-Wooley Multiplication Array


Modified Baugh-Wooley Multiplier

Simply replace AND gate in these cells with NAND gate

and set two of the carry-in constants to 1

[Figure legend: multiplier cell with AND gate; multiplier cell with NAND gate; full adder]
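The NAND substitution and the two constant 1s can be checked with a behavioral model. The sketch below assumes the common N x N two's-complement formulation, in which the partial products involving exactly one sign bit are inverted and constants are added at bit positions N and 2N-1; the array on the slide is not recoverable, so these positions are an assumption, as is the function name.

```python
def modified_baugh_wooley(a, b, n=4):
    """Multiply two n-bit two's-complement integers using additions only."""
    a_bits = [(a >> i) & 1 for i in range(n)]   # two's-complement bits of a
    b_bits = [(b >> j) & 1 for j in range(n)]
    total = 0
    for i in range(n):
        for j in range(n):
            bit = a_bits[i] & b_bits[j]          # AND cell
            if (i == n - 1) != (j == n - 1):     # exactly one sign bit involved
                bit ^= 1                         # NAND cell
            total += bit << (i + j)
    total += (1 << n) + (1 << (2 * n - 1))       # the two constant 1 inputs
    total &= (1 << (2 * n)) - 1                  # keep 2n result bits
    if total >> (2 * n - 1):                     # reinterpret as signed
        total -= 1 << (2 * n)
    return total
```

No subtractor is needed anywhere in the array: the inverted cells and the two constants together account for all the negative-weight terms.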

