Upload
others
View
1
Download
1
Embed Size (px)
Citation preview
VLSI-1 Class Notes
Lecture 12:Datapath Design
Mark McDermottElectrical and Computer Engineering
The University of Texas at Austin
10/1/18
VLSI-1 Class Notes
Datapath Design
§ Comparators§ Shifters§ Adders§ Multipliers§ Registers
Page 210/1/18
VLSI-1 Class Notes
ARM Datapath
Page 310/1/18
VLSI-1 Class Notes
Comparators
§ 0 s detector: A = 00…000§ 1 s detector: A = 11…111§ Equality comparator: A = B§ Magnitude comparator: A < B
10/1/18 Page 4
VLSI-1 Class Notes
1 s & 0 s Detectors
§ 1 s detector: N-input AND gate§ 0 s detector: NOTs + 1 s detector (N-input NOR)
A0A1
A2A3
A4A5
A6A7
allones
A0A1
A2A3
allzeros
allones
A1
A2A3
A4A5
A6A7
A0
10/1/18 Page 5
VLSI-1 Class Notes
Equality Comparator
§ Check if each bit is equal (XNOR, or equality gate )§ 1 s detect on bitwise equality
A[0]B[0]
A = B
A[1]B[1]A[2]B[2]A[3]B[3]
10/1/18 Page 6
VLSI-1 Class Notes
Magnitude Comparator
§ Compute B-A and look at sign§ B-A = B + ~A + 1§ For unsigned numbers, carry out is sign bit
A0
B0A1
B1A2
B2A3
B3
A = BZ
C
A B£
N A B³
10/1/18 Page 7
VLSI-1 Class Notes
Signed vs. Unsigned
§ For signed numbers, comparison is harder– C: carry out– Z: zero (all bits of A-B are 0)–N: negative (MSB of result)– V: overflow (inputs had different signs, output sign ¹¹ B)
10/1/18 Page 8
VLSI-1 Class Notes
§ Logical Shift:– Shifts number left or right and fills with 0 s
• 1011 LSR 1 = 0101 1011 LSL1 = 0110
Shifters
Logical Shift Left (LSL)
DestinationCF 0
Destination CF
Logical Shift Right (LSR)
...0
zero shifted in
10/1/18 Page 9
zero shifted in
VLSI-1 Class Notes
Shifters
§ Arithmetic Shift:– Shifts number left or right. Right shift sign extends
1011 ASR1 = 1101 1011 ASL1 = 0110
Destination CF
Arithmetic Shift Right
Sign bit shifted in
Arithmetic Shift Right (ASR) Shifts right (divides by powers of two) and preserves the sign bit, for 2's complement operations. e.g.
ASR #5 = divide by 32
10/1/18 Page 10
VLSI-1 Class Notes
Shifters
§ Barrel Shift (Rotate):– Shifts number left or right and fills with lost bits
1011 ROR1 = 1101 1011 ROL1 = 0111
Destination CF
Rotate Right
10/1/18 Page 11
Destination CF
Rotate Right through Carry
Rotate Right (ROR)Similar to an ASR but the bits wrap around as they leave the LSB and appear as the MSB.e.g. ROR #5Note the last bit rotated is also used as the Carry Out.
Rotate Right Extended (RRX)This operation uses the CPSR C flag as a 33rd bit. Rotates right by 1 bit. Encoded as ROR #0
VLSI-1 Class Notes
ARM: Using the Barrel Shifter: The Second Operand
1210/1/18
Register, optionally with shift operation applied.Shift value can be either be:
5 bit unsigned integerSpecified in bottom byte of another register.
Immediate value8 bit numberCan be rotated right through an even number of positions.Assembler will calculate rotate for you from constant.
Operand 1
Result
ALU
Barrel Shifter
Operand 2
VLSI-1 Class Notes
Second Operand: Using a Shifted Register
§ Using a multiplication instruction to multiply by a constant means first loading the constant into a register and then waiting a number of internal cycles for the instruction to complete.
§ A more optimum solution can often be found by using some combination of MOVs, ADDs, SUBs and RSBs with shifts.– Multiplications by a constant equal to a ((power of 2) 1) can be done in
one cycle.
MOV R2, R0, LSL #2 ; Shift R0 left by 2, write to R2, (R2=R0x4)ADD R9, R5, R5, LSL #3 ; R9 = R5 + R5 x 8 or R9 = R5 x 9RSB R9, R5, R5, LSL #3 ; R9 = R5 x 8 - R5 or R9 = R5 x 7SUB R10, R9, R8, LSR #4 ; R10 = R9 - R8 / 16MOV R12, R4, ROR R3 ; R12 = R4 rotated right by value of R3
1310/1/18
VLSI-1 Class Notes
Funnel Shifter
§ A funnel shifter can do all six types of shifts§ Selects N-bit field Y from 2N-bit input
– Shift by k bits (0 ££ k < N)
B C
offsetoffset + N-1
0N-12N-1
Y
10/1/18 Page 14
VLSI-1 Class Notes
Funnel Shifter Operation
10/1/18 Page 15
VLSI-1 Class Notes
Simplified Funnel Shifter
§ Optimize down to 2N-1 bit input
10/1/18 Page 16
VLSI-1 Class Notes
Funnel Shifter Design 1
§ N N-input multiplexers– Use 1-of-N hot select signals for shift amount– nMOS pass transistor design (Vt drops!)
k[1:0]
s0s1s2s3Y3
Y2
Y1
Y0
Z0Z1Z2Z3Z4
Z5
Z6
left Inverters & Decoder
10/1/18 Page 17
VLSI-1 Class Notes
Funnel Shifter Design 2
§ Log N stages of 2-input MUXes– No select decoding needed
Y3
Y2
Y1
Y0Z0
Z1
Z2
Z3
Z4
Z5
Z6
k0k1left
10/1/18 Page 18
VLSI-1 Class Notes
Logarithmic Barrel Shifter
Right shift only
Right/Left shift
Right/Left Shift & Rotate
VLSI-1 Class Notes
32-bit Logarithmic Barrel
§ Datapath never wider than 32 bits§ First stage preshifts by 1 to handle left shifts
20
VLSI-1 Class Notes
Multi-input Adders
§ Suppose we want to add k N-bit words– Ex: 0001 + 0111 + 1101 + 0010 = 10111
§ Straightforward solution: k-1 N-input CPAs– Large and slow
+
+
0001 0111
+
1101 0010
10101
10111
10/1/18 Page 21
VLSI-1 Class Notes
Carry Save Addition
§ Full adder sums 3 inputs, produces 2 outputs– Carry output has twice weight of sum output
§ N full adders in parallel: carry save adder– Produce N sums and N carry outs
Z4Y4X4
S4C4
Z3Y3X3
S3C3
Z2Y2X2
S2C2
Z1Y1X1
S1C1XN...1 YN...1 ZN...1
SN...1CN...1
n-bit CSA
10/1/18 Page 22
VLSI-1 Class Notes
CSA Application
§ Use k-2 stages of CSAs– Keep result in carry-save redundant form
§ Final CPA computes actual result
4-bit CSA
5-bit CSA
0001 0111 1101 0010
+
10110101_
01010_ 00011
0001 0111+1101 10110101_
XYZSC
0101_ 1011 +0010 0001101010_
XYZSC
01010_+ 00011 10111
ABS
10111
10/1/18 Page 23
VLSI-1 Class Notes
Multiplication
§ Example:
§ M x N-bit multiplication– Produce N M-bit partial products– Sum these to produce M+N-bit product
1100 : 1210 0101 : 510 1100 0000 1100 000000111100 : 6010
multipliermultiplicand
partialproducts
product
10/1/18 Page 24
VLSI-1 Class Notes
General Form
§ Multiplicand: Y = (yM-1, yM-2, …, y1, y0)§ Multiplier: X = (xN-1, xN-2, …, x1, x0)
§ Product: 1 1 1 1
0 0 0 02 2 2
M N N Mj i i j
j i i jj i i j
P y x x y- - - -
+
= = = =
æ öæ ö= =ç ÷ç ÷è øè ø
å å åå
x0y5 x0y4 x0y3 x0y2 x0y1 x0y0
y5 y4 y3 y2 y1 y0x5 x4 x3 x2 x1 x0
x1y5 x1y4 x1y3 x1y2 x1y1 x1y0x2y5 x2y4 x2y3 x2y2 x2y1 x2y0
x3y5 x3y4 x3y3 x3y2 x3y1 x3y0x4y5 x4y4 x4y3 x4y2 x4y1 x4y0
x5y5 x5y4 x5y3 x5y2 x5y1 x5y0p0p1p2p3p4p5p6p7p8p9p10p11
multipliermultiplicand
partialproducts
product
10/1/18 Page 25
VLSI-1 Class Notes
Dot Diagram
§ Each dot represents a bit
partial products
multiplier x
x0
x15
10/1/18 Page 26
VLSI-1 Class Notes
Array Multipliery0y1y2y3
x0
x1
x2
x3
p0p1p2p3p4p5p6p7
B
ASin Cin
SoutCout
BA
CinCout
Sout
Sin=
CSAArray
CPA
critical path BA
Sout
Cout CinCout
Sout
=Cin
BA
10/1/18 Page 27
VLSI-1 Class Notes
Rectangular Array
§ Squash array to fit rectangular floorplan
y0y1y2y3
x0
x1
x2
x3
p0
p1
p2
p3
p4p5p6p7
10/1/18 Page 28
VLSI-1 Class Notes
Fewer Partial Products
§ Array multiplier requires N partial products§ If we looked at groups of r bits, we could form N/r partial
products.– Faster and smaller?– Called radix-2r encoding
§ Ex: r = 2: look at pairs of bits– Form partial products of 0, Y, 2Y, 3Y– First three are easy, but 3Y requires adder LL
10/1/18 Page 29
VLSI-1 Class Notes
Booth Encoding
§ Instead of 3Y, try –Y, then increment next partial product to add 4Y
§ Similarly, for 2Y, try –2Y + 4Y in next partial product
10/1/18 Page 30
VLSI-1 Class Notes
Booth Hardware
§ Booth encoder generates control lines for each PP– Booth selectors choose PP bits
Mi
yj
Xi
yj-1
2Xi
PPij
BoothSelector
BoothEncoder
x2i+1
x2ix2i-1
10/1/18 Page 31
VLSI-1 Class Notes
Booth hardware
Page 3210/1/18
VLSI-1 Class Notes
Sign Extension
§ Partial products can be negative– Require sign extension, which is cumbersome– High fanout on most significant bit
multiplier x
x0
x15
0
00
x-1
x16x17
ssssssssssssssss
ssssssssssssss
ssssssssssss
ssssssssss
ssssssss
ssssss
ssss
ss
PP0PP1
PP2
PP3PP4
PP5PP6
PP7PP8
10/1/18 Page 33
VLSI-1 Class Notes
Simplified Sign Extension
§ Sign bits are either all 0 s or all 1’s– Note that all 0 s is all 1’s + 1 in proper column– Use this to reduce loading on MSB
s111111111111111s
s1111111111111s
s11111111111s
s111111111s
s1111111s
s11111s
s111s
s1s
PP0
PP1
PP2
PP3
PP4
PP5
PP6
PP7
PP8
10/1/18 Page 34
VLSI-1 Class Notes
Even Simpler Sign Extension
§ No need to add all the 1 s in hardware– Precompute the answer!
ssss
ss1
ss1
ss1
ss1
ss1
ss1
ss
PP0PP1PP2PP3PP4PP5PP6PP7PP8
10/1/18 Page 35
VLSI-1 Class Notes
Advanced Multiplication
§ Signed vs. unsigned inputs§ Higher radix Booth encoding§ Array vs. tree CSA networks
10/1/18 Page 36