Digital Integrated Circuits Chpt. 5 Lec. 01- 08/29/2006 CSE477 VLSI Digital Circuits Fall 2002 Lecture 21: Multiplier Design Mary Jane Irwin ( www.cse.psu.edu/~mji ) www.cse.psu.edu/~cg477 [Adapted from Rabaey’s Digital Integrated Circuits,


CSE477 VLSI Digital Circuits

Fall 2002

Lecture 21: Multiplier Design
Mary Jane Irwin ( www.cse.psu.edu/~mji )

www.cse.psu.edu/~cg477

[Adapted from Rabaey’s Digital Integrated Circuits, ©2002, J. Rabaey et al.]


Review: Basic Building Blocks

Datapath
– Execution units
  » Adder, multiplier, divider, shifter, etc.
– Register file and pipeline registers
– Multiplexers, decoders
Control
– Finite state machines (PLA, ROM, random logic)
Interconnect
– Switches, arbiters, buses
Memory
– Caches (SRAMs), TLBs, DRAMs, buffers


Review: Binary Adder Landscape

[Figure: taxonomy tree of synchronous word parallel adders. Entries: ripple carry adders (RCA), carry prop min adders, signed-digit adders, fast carry prop adders, residue adders, and the leaf entries Manchester carry chain, parallel prefix, conditional sum, carry select, carry skip. Complexity annotations shown in the figure: T = O(N), A = O(N); T = O(1), A = O(N); T = O(log N), A = O(N log N); T = O(N), A = O(N); T = O(N), A = O(N).]


Multiply Operation

Multiplication as repeated additions

[Figure: an N-bit multiplicand and an N-bit multiplier produce a partial product array of N rows, which sums to a 2N-bit double precision product. The partial product array can be formed in parallel.]
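The idea of forming all partial product rows independently and then summing them can be sketched in a few lines. This is only an illustration for unsigned operands; the function name `partial_products` and the parameter `n` are assumptions made for this example, not names from the lecture.

```python
def partial_products(multiplicand, multiplier, n):
    """Return the n rows of the partial product array for unsigned operands.

    Row j is the multiplicand ANDed with multiplier bit j, shifted left by j.
    Each row depends only on one multiplier bit, so all rows could be
    formed in parallel in hardware.
    """
    rows = []
    for j in range(n):
        bit = (multiplier >> j) & 1
        rows.append((multiplicand if bit else 0) << j)
    return rows


# Summing the rows gives the double precision (2n-bit) product.
rows = partial_products(0b0110, 0b1001, 4)
product = sum(rows)
```

With a 4-bit multiplicand of 6 and multiplier of 9, the four rows sum to 54.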


Shift & Add Multiplication

Right shift and add
– Partial product array rows are accumulated from top to bottom on an N-bit adder
– After each addition, right shift (by one bit) the accumulated partial product to align it with the next row to add
– Time for N bits: T_serial_mult = O(N T_adder) = O(N^2) for a RCA
Making it faster
– Use a faster adder
– Use higher radix (e.g., base 4) multiplication
  » Use multiplier recoding to simplify multiple formation
– Form partial product array in parallel and add it in parallel
Making it smaller (i.e., slower)
– Use an array multiplier
  » Very regular structure with only short wires to nearest neighbor cells. Thus, very simple and efficient layout in VLSI
  » Can be easily and efficiently pipelined
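The right-shift-and-add scheme can be modeled at the register level as follows. This is a minimal sketch for unsigned operands, assuming the common arrangement in which the multiplier sits in the low half of a 2N-bit register and is shifted out as product bits shift in; the names `high`, `low`, and `shift_add_multiply` are illustrative, not taken from the slides.

```python
def shift_add_multiply(multiplicand, multiplier, n):
    """Unsigned n-bit multiply using n add-and-right-shift steps."""
    mask = (1 << n) - 1
    high = 0                   # accumulated partial product (upper half)
    low = multiplier & mask    # multiplier bits, replaced by product bits
    for _ in range(n):
        if low & 1:
            # n-bit addition; the result may need n+1 bits (carry out)
            high += multiplicand & mask
        # Right shift the combined (high, low) register by one bit:
        # the LSB of high moves into the MSB of low.
        low = (low >> 1) | ((high & 1) << (n - 1))
        high >>= 1
    return (high << n) | low
```

Each loop iteration costs one adder delay, which is where the O(N T_adder) time estimate comes from.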


Tree Multiplier Structure

[Figure: multiple forming circuits driven by D (‘icand), with entries labeled 0, D, and DD; a mux controlled by Q (‘ier); a partial product array reduction tree; and a fast carry propagate adder (CPA) producing P (product). Delay: mux + reduction tree (log N) + CPA (log N).]


(4,2) Counter

Built out of two (3,2) counters (just FA’s!)
– all of the inputs (4 external plus one internal) have the same weight (i.e., are in the same bit position)
– the internal output is carried to the next higher weight position (marked in the figure)

[Figure: two stacked (3,2) counters.]

Note: Two carry outs - one “internal” and one “external”
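A bit-level sketch of a (4,2) counter built from two full adders is shown below. The function and signal names (`full_adder`, `counter_4_2`, `cout_internal`) are assumptions for illustration; the wiring follows the description above, with the first full adder's carry leaving as the internal carry out.

```python
def full_adder(a, b, c):
    """(3,2) counter: three bits of equal weight in, (sum, carry) out."""
    s = a ^ b ^ c
    carry = (a & b) | (b & c) | (a & c)
    return s, carry


def counter_4_2(a, b, c, d, cin):
    """(4,2) counter from two (3,2) counters.

    Returns (sum, carry, cout_internal). Both carries have twice the
    weight of the inputs. cout_internal depends only on a, b, c, so the
    internal carry does not ripple through a row of tiled counters.
    """
    s1, cout_internal = full_adder(a, b, c)
    s, carry = full_adder(s1, d, cin)
    return s, carry, cout_internal
```

For any input combination, a + b + c + d + cin equals s + 2 * (carry + cout_internal), which is the property that lets the counter replace four rows with two.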


Tiling (4,2) Counters

Reduces columns four high to columns only two high
– Tiles with neighboring (4,2) counters
– Internal carry in at same “level” (i.e., bit position weight) as the internal carry out

[Figure: three neighboring (4,2) counters, each drawn as a pair of (3,2) counters.]


4x4 Partial Product Array Reduction

Fast 4x4 multiplication using (4,2) counters

[Figure: multiplicand and multiplier, the partial product array, the reduced pp array (to CPA), and the double precision product.]


8x8 Partial Product Array Reduction

[Figure: ‘icand and ‘ier, the partial product array, and the reduced partial product array.]

How many (4,2) counters minimum are needed to reduce it to 2 rows?

Answer: 24


Alternate 8x8 Partial Product Array Reduction

[Figure: ‘icand and ‘ier, the partial product array, and the reduced partial product array.]

More (4,2) counters, so what is the advantage?


Array Reduction Layout Approach

[Figure: layout floorplan. Multiple generators take the multiplicand and the multiple selection signals (‘ier) and feed a stack of (4,2) counter slices, followed by the CPA.]


Next Lecture and Reminders

Next lecture
– Shifters, decoders, and multiplexers
  » Reading assignment – Rabaey, et al, 11.5-11.6
Reminders
– Project final reports due December 5th
– HW5 (last one!) due November 19th
– Final grading negotiations/correction (except for the final exam) must be concluded by December 10th
– Final exam scheduled
  » Monday, December 16th from 10:10 to noon in 118 and 121 Thomas


Topics

Adders and ALUs (§6.4, §6.5)
Multipliers (§6.6)
– Array multiplier
– Baugh-Wooley multiplier
– Booth encoding
– Wallace tree multiplier
Subsystem design principles (§6.2)


Elementary School Algorithm

          0 1 1 0    multiplicand
        × 1 0 0 1    multiplier
          0 1 1 0
      + 0 0 0 0
        0 0 1 1 0
    + 0 0 0 0
      0 0 0 1 1 0
  + 0 1 1 0
    0 1 1 0 1 1 0

The rows added at each step are the partial products, each shifted one position to the left of the previous one.


Combinational Multiplier

Each bit of the multiplier controls whether the corresponding addition occurs.


Array Multiplier

Regular layout
– An n × m cell layout
– Easy to pipeline
– Used frequently in FPGAs and ASICs
Critical path
– Less than (n+m-1) bit adder delay
Handles unsigned multiplication ONLY


A 4 × 4 Unsigned Array Multiplier

Skew array for rectangular layout

                        X3    X2    X1    X0
                  ×     Y3    Y2    Y1    Y0
                        X3Y0  X2Y0  X1Y0  X0Y0
                  X3Y1  X2Y1  X1Y1  X0Y1
            X3Y2  X2Y2  X1Y2  X0Y2
      X3Y3  X2Y3  X1Y3  X0Y3
P7    P6    P5    P4    P3    P2    P1    P0
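The array above can be exercised with a small bit-level model. This sketch adds one row of AND terms at a time with a ripple of full adders; the wiring in an actual array multiplier figure may differ (for example, a carry-save arrangement), so treat the structure and the names `array_multiply` and `full_adder` as illustrative assumptions.

```python
def full_adder(a, b, c):
    """One adder cell: returns (sum, carry_out)."""
    return a ^ b ^ c, (a & b) | (b & c) | (a & c)


def array_multiply(x, y, n=4):
    """Unsigned n x n multiply built only from AND gates and full adders."""
    xb = [(x >> i) & 1 for i in range(n)]
    yb = [(y >> j) & 1 for j in range(n)]
    p = [0] * (2 * n)              # running sum, one bit per product column
    for j in range(n):             # row j holds the terms X_i Y_j
        carry = 0
        for i in range(n):
            term = xb[i] & yb[j]   # partial product bit X_i Y_j, weight 2^(i+j)
            p[i + j], carry = full_adder(p[i + j], term, carry)
        p[j + n] = carry           # column j+n has not been written before row j
    return sum(bit << k for k, bit in enumerate(p))
```

The inner index `i + j` is what produces the skewed (parallelogram) shape of the array: row j starts j columns to the left of row 0.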


Unsigned Array Multiplier

[Figure: 4 × 4 unsigned array multiplier built from adder cells (inputs a, b, Cin; outputs Cout, Sum). The cell inputs are the partial product bits x0y0 through x3y3 and constant 0s; the outputs are the product bits P0 through P7.]


Signed Multiplication

Signed number representation
– $X = -x_{n-1} 2^{n-1} + \sum_{i=0}^{n-2} x_i 2^i$
Signed n×n multiplication
– $(1110)_2 \times (0011)_2 = (1010)_2$, i.e., (-2) × 3 = (-6)
– No difference from unsigned multiplication if the result has the same bit-width as the input
But what if we want the result to be 2n bits?
– Use sign-bit extension
– Needs 2n × 2n array multiplier
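The sign-extension approach can be checked with a short sketch: extend both n-bit two's-complement patterns to 2n bits, multiply them as unsigned numbers, and keep the low 2n bits. The helper names (`sign_extend`, `to_signed`, `signed_multiply_via_extension`) are made up for this illustration.

```python
def sign_extend(pattern, n, width):
    """Extend an n-bit two's-complement bit pattern to `width` bits."""
    if (pattern >> (n - 1)) & 1:
        # Copy the sign bit into bit positions n .. width-1.
        pattern |= ((1 << width) - 1) & ~((1 << n) - 1)
    return pattern


def to_signed(pattern, width):
    """Interpret a `width`-bit pattern as a two's-complement integer."""
    return pattern - (1 << width) if (pattern >> (width - 1)) & 1 else pattern


def signed_multiply_via_extension(xp, yp, n):
    """2n-bit signed product using an unsigned 2n x 2n multiplication."""
    w = 2 * n
    full = sign_extend(xp, n, w) * sign_extend(yp, n, w)
    return full & ((1 << w) - 1)   # keep only the low 2n bits


result = signed_multiply_via_extension(0b1110, 0b0011, 4)
# result is the 8-bit pattern 0b11111010, i.e. -6; its low 4 bits are 0b1010
```

The low n bits agree with a plain unsigned n × n multiplication, which is the "no difference" case noted above.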


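Baugh-Wooley Multiplier: Principle

$XY = x_{n-1} y_{n-1} 2^{2n-2} + \sum_{i=0}^{n-2} \sum_{j=0}^{n-2} x_i y_j 2^{i+j} - \sum_{i=0}^{n-2} (x_{n-1} y_i + y_{n-1} x_i) 2^{i+n-1}$

$\bar{x}_i = 1 - x_i$, $\bar{y}_i = 1 - y_i$

$XY = -(x_{n-1} + y_{n-1} - x_{n-1} y_{n-1}) 2^{2n-2} + (x_{n-1} + y_{n-1}) 2^{n-1} + \sum_{i=0}^{n-2} \sum_{j=0}^{n-2} x_i y_j 2^{i+j} + \sum_{i=0}^{n-2} (x_{n-1} \bar{y}_i + y_{n-1} \bar{x}_i) 2^{i+n-1}$

$XY = -2^{2n-1} + (\bar{x}_{n-1} + \bar{y}_{n-1} + x_{n-1} y_{n-1}) 2^{2n-2} + (x_{n-1} + y_{n-1}) 2^{n-1} + \sum_{i=0}^{n-2} \sum_{j=0}^{n-2} x_i y_j 2^{i+j} + \sum_{i=0}^{n-2} (x_{n-1} \bar{y}_i + y_{n-1} \bar{x}_i) 2^{i+n-1}$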
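As a sanity check, the last rearrangement on this slide (the form with the constant $-2^{2n-1}$ term, in which every remaining term is non-negative) can be evaluated numerically and compared against ordinary signed multiplication. The function names below are hypothetical, and the code evaluates the formula arithmetically rather than modeling the adder array.

```python
def to_bits(value, n):
    """Two's-complement bits of `value`, least significant first."""
    return [(value >> i) & 1 for i in range(n)]


def baugh_wooley(X, Y, n):
    """Evaluate the Baugh-Wooley form of X*Y for n-bit signed X and Y."""
    x, y = to_bits(X, n), to_bits(Y, n)
    xs, ys = x[n - 1], y[n - 1]                    # sign bits
    total = -(1 << (2 * n - 1))                    # constant -2^(2n-1)
    total += ((1 - xs) + (1 - ys) + xs * ys) << (2 * n - 2)
    total += (xs + ys) << (n - 1)
    for i in range(n - 1):                         # unsigned core terms
        for j in range(n - 1):
            total += (x[i] & y[j]) << (i + j)
    for i in range(n - 1):                         # complemented cross terms
        total += (xs * (1 - y[i]) + ys * (1 - x[i])) << (i + n - 1)
    return total
```

In a 2n-bit result, the constant $-2^{2n-1}$ is the same as adding a 1 in the most significant bit position, since the two differ by $2^{2n}$.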


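Baugh-Wooley Multiplier: Structure

[Figure: 4 × 4 Baugh-Wooley array multiplier built from adder cells (inputs a, b, Cin; outputs Cout, Sum). The cell inputs are the partial product bits x0y0 through x3y3, extra x3 and y3 inputs, a constant 1, and constant 0s; the outputs are the product bits P0 through P7.]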


Booth Multiplier

Utilize Booth encoding scheme

Booth encoding scheme
– Handles signed multiplication
– Reduces the number of partial products by half
– Small area and fast
– Encoding scheme cannot be applied hierarchically
  » Often used as the first stage of partial product reduction


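Booth Encoding: Principle

Two’s-complement form of multiplier y
– $Y = -y_{n-1} 2^{n-1} + y_{n-2} 2^{n-2} + y_{n-3} 2^{n-3} + \dots$
– $Y = (-y_{n-1} + y_{n-2}) 2^{n-1} + (-y_{n-2} + y_{n-3}) 2^{n-2} + (-y_{n-3} + y_{n-4}) 2^{n-3} + \dots$
Consider first two terms
– $XY = (-2y_{n-1} + y_{n-2} + y_{n-3}) X 2^{n-2} + (-2y_{n-3} + y_{n-4} + y_{n-5}) X 2^{n-4} + \dots$
– By looking at three bits of y, we can determine whether to add x or 2x (possibly negated, or nothing at all) to the partial product.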


Booth Actions

yi   yi-1   yi-2   increment
0    0      0      0
0    0      1      X
0    1      0      X
0    1      1      2X
1    0      0      -2X
1    0      1      -X
1    1      0      -X
1    1      1      0


Booth Example

Don’t forget the sign extension of the encoded values when adding them together
– Only have to extend 2 bits though

x = 011001 ($25_{10}$), y = 101110 ($-18_{10}$).

y1 y0 y-1 = 100, P1 = P0 - (10 × 011001) = 11111001110.

y3 y2 y1 = 111, P2 = P1 + 0 = 11111001110.

y5 y4 y3 = 101, P3 = P2 - 0110010000 = 11000111110.
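The example can be replayed with a short radix-4 Booth sketch that scans the multiplier three bits at a time, overlapping by one bit, and looks the increment up in the Booth actions table. The names `BOOTH_ACTIONS` and `booth_radix4` are illustrative, the code assumes an even bit count `n`, and it works on Python integers rather than fixed-width registers, so sign extension is implicit.

```python
# (y_i, y_{i-1}, y_{i-2}) -> multiple of X to add, per the Booth actions table
BOOTH_ACTIONS = {
    (0, 0, 0): 0, (0, 0, 1): 1, (0, 1, 0): 1, (0, 1, 1): 2,
    (1, 0, 0): -2, (1, 0, 1): -1, (1, 1, 0): -1, (1, 1, 1): 0,
}


def booth_radix4(x, y, n):
    """Multiply x by the n-bit two's-complement value y (n even)."""
    ybits = [(y >> i) & 1 for i in range(n)]
    product = 0
    prev = 0                                  # y_{-1} is defined as 0
    for i in range(0, n, 2):
        increment = BOOTH_ACTIONS[(ybits[i + 1], ybits[i], prev)]
        product += (increment * x) << i       # weight of this digit is 2^i
        prev = ybits[i + 1]
    return product


p3 = booth_radix4(25, -18, 6)   # three steps: -2X, then 0, then -X shifted by 4
```

Only n/2 increments are added for an n-bit multiplier, which is the halving of partial products mentioned for the Booth multiplier.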


Wallace Tree

Reduces the number of partial products
Built from carry-save adders:
– Three inputs: a, b, c
– Two outputs: y, z such that y + z = a + b + c
Carry-save equations:
– $y_i = a_i \oplus b_i \oplus c_i$
– $z_{i+1} = a_i b_i + b_i c_i + c_i a_i$
– What’s the difference from carry-ripple adder?
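A carry-save adder is easy to model with bitwise operations on whole words, since each bit position is computed independently. This is a minimal sketch for non-negative integers; the function name `carry_save_add` is an assumption for illustration.

```python
def carry_save_add(a, b, c):
    """Reduce three operands to two (y, z) with y + z == a + b + c.

    y holds the per-bit sums (a XOR b XOR c); z holds the per-bit majority
    carries, shifted left by one because carry i has the weight of bit i+1.
    No carry propagates from one bit position to the next.
    """
    y = a ^ b ^ c
    z = ((a & b) | (b & c) | (c & a)) << 1
    return y, z


y, z = carry_save_add(0b0110, 0b0011, 0b0101)
```

Unlike a carry-ripple adder, the delay here is one full-adder delay regardless of word length, because the carries are passed on as a second output word instead of being propagated.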


Wallace Tree Structure

[Figure: three full adders (FA) with inputs a2 b2 c2, a1 b1 c1, a0 b0 c0, drawn two ways. Carry-ripple adder: outputs s2, s1, s0. Carry-save adder: outputs z3, y2, z2, y1, z1, y0.]


Wallace Tree Operation

n additions are reduced to (2n/3) additions after each level
– Sum of inputs = Sum of outputs
– Can apply the reduction hierarchically
– More efficient design uses 4-2 adders to reduce n additions to (n/2) additions after each level
Need final adder to add the last two numbers
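The level-by-level reduction can be sketched by repeatedly grouping operands in threes and replacing each group with the two outputs of a carry-save adder until only two operands remain. The grouping strategy and the names `csa`, `wallace_reduce`, and `wallace_multiply` are assumptions for this example; it handles unsigned operands only and models 3-2 reduction, not the 4-2 variant.

```python
def csa(a, b, c):
    """Carry-save adder on whole words: returns (y, z) with y + z == a + b + c."""
    return a ^ b ^ c, ((a & b) | (b & c) | (c & a)) << 1


def wallace_reduce(operands):
    """Reduce a list of operands to at most two; returns (operands, levels)."""
    ops = list(operands)
    levels = 0
    while len(ops) > 2:
        full = len(ops) - len(ops) % 3        # operands covered by whole groups
        nxt = []
        for k in range(0, full, 3):
            nxt.extend(csa(*ops[k:k + 3]))    # 3 operands in, 2 out
        nxt.extend(ops[full:])                # leftovers pass through
        ops = nxt
        levels += 1
    return ops, levels


def wallace_multiply(x, y, n):
    """Unsigned multiply: partial products, tree reduction, final addition."""
    rows = [((x if (y >> j) & 1 else 0) << j) for j in range(n)]
    ops, levels = wallace_reduce(rows)
    return sum(ops), levels                   # sum() stands in for the final adder
```

For eight partial products the operand count goes 8, 6, 4, 3, 2, so this sketch uses four carry-save levels before the final adder.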


A Booth-Wallace Tree Multiplier

[Figure: a row of Booth encoders (B) feeding a four-level Wallace tree. Level 1: four 4-2 adder arrays, then FF. Level 2: two 4-2 adder arrays, then FF. Level 3: one 4-2 adder array, then FF. Level 4: a 3-2 adder array. The tree is followed by a 64-bit final adder (not part of pipeline).]

Most commonly used high-performance multiplier


Topics

Adders and ALUs (§6.4, §6.5)

Multipliers (§6.6)

Subsystem design principles (§6.2)


Pipelining

Pipelining can be used to reduce clock period at the expense of latency:

[Figure: two blocks, combinational logic 1 and combinational logic 2.]
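A rough model makes the trade-off concrete. It assumes the combinational delay splits evenly across stages and that each stage adds a fixed register overhead; both assumptions, the function name `pipeline_timing`, and all numbers are hypothetical and chosen only for illustration.

```python
def pipeline_timing(logic_delay, stages, reg_overhead):
    """Return (cycle_time, latency) for an evenly split pipeline.

    cycle_time = logic_delay / stages + reg_overhead
    latency    = stages * cycle_time
    """
    cycle_time = logic_delay / stages + reg_overhead
    latency = stages * cycle_time
    return cycle_time, latency


# Hypothetical numbers: 10 ns of logic, 0.5 ns of register overhead per stage.
for stages in (1, 2, 4):
    cycle_time, latency = pipeline_timing(10.0, stages, 0.5)
    print(stages, cycle_time, latency)
```

Under these assumptions the cycle time shrinks as stages are added while the total latency grows by one register overhead per stage.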


Cycle Time and Latency

[Figure: two plots, cycle time versus # stages and latency versus # stages.]


Data Paths

A data path is a logical and physical structure:
– bit-wise logical organization
– bit-wise physical structure
Data paths generally use buses to pass data between function units.


Bit Slice Organization

[Figure: registers, shifter, and ALU arranged as bit slices from bit n-1 to bit 0, with bus and control.]


Data Path Cell Design

Connections may be made by:
– abutment, requiring stretching cells;
– river routing, requiring a routing channel between function units.


Project

Due 10/26
– Schematic
– Verilog/Spectre simulation results
– 10/27 presentation (10-15 PowerPoint slides)
Important (efficiency-related)
– How to add array of instances