8/6/2019 Proj Desai Mousa 16 Tap Filter
http://slidepdf.com/reader/full/proj-desai-mousa-16-tap-filter 1/85
Design Objectives
- To have register-based storage of the 16 latest input values and the 16 impulse response coefficients on-chip.
- To use a clocked architecture to synchronize input and output values.
- To reduce the number of multipliers and adders needed, that is, to optimize area, power, and cost.
- To achieve the above without compromising speed.
Design Objectives
- 16 taps of delay line.
- 8 bits of input/output bit resolution.
- Burst mode of data transfer at the input, supporting 32 elements of the desired resolution in one burst.
Main issues of concern when designing an FIR filter:
- Sharp response
- Number of taps
- Numerical precision
- Fully parallel
Advantages and Disadvantages
Advantages:
- Always stable (assuming a non-recursive implementation).
- Quantization noise is not much of a problem.
- Transients have a finite duration.
Disadvantages:
- A high-order filter is generally needed to satisfy the stated specification, so more coefficients are needed, with more storage and computation.
Review of discrete-time systems
Linear time-invariant (LTI) systems
Causal systems:
for all input x[k]=0, k<0 -> output y[k]=0, k<0
Impulse response:
input 1,0,0,0,... -> output h[0],h[1],h[2],h[3],...
input x[0],x[1],x[2],x[3],... -> output y[0],y[1],y[2],y[3],...
[Figure: block diagram, x[k] -> system -> y[k]]
\[ y[k] = \sum_i u[i]\, h[k-i] = u[k] * h[k] \]
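The convolution sum above can be sketched in a few lines of Python. This is only an illustration of the formula; the function name `convolve` and the sample values are assumptions chosen for the example, not part of the original design.

```python
def convolve(u, h):
    """Return y[k] = sum_i u[i] * h[k - i] for finite sequences u and h."""
    y = [0] * (len(u) + len(h) - 1)
    for i, ui in enumerate(u):
        for j, hj in enumerate(h):
            y[i + j] += ui * hj  # term u[i] * h[k - i] with k = i + j
    return y


# A unit impulse at the input reproduces the impulse response at the output.
impulse_out = convolve([1, 0, 0], [3, 2, 1])
```

Feeding the unit impulse 1,0,0,... returns h[0],h[1],h[2],... followed by zeros, which matches the definition of the impulse response on this slide.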
Overview
FIR filter equation
\[ y[n] = x[n] * h[n] \]
where * denotes convolution and the length of h[n] is the number of "taps" or coefficients in the FIR filter.
For a 16-tap FIR filter
\[ y[n] = a_0 x[n] + a_1 x[n-1] + a_2 x[n-2] + a_3 x[n-3] + \dots + a_{15} x[n-15] \]
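As a software sketch of the 16-tap sum, the following Python model keeps the 16 latest inputs in a delay line, in the spirit of the register-based storage named in the design objectives. The function name `fir16` and its arguments are hypothetical; the hardware itself is not described at this level in the slides.

```python
from collections import deque


def fir16(samples, coeffs):
    """Compute y[n] = a0*x[n] + a1*x[n-1] + ... + a15*x[n-15]."""
    if len(coeffs) != 16:
        raise ValueError("a 16-tap filter needs exactly 16 coefficients")
    delay_line = deque([0] * 16, maxlen=16)  # newest sample at index 0
    outputs = []
    for x in samples:
        delay_line.appendleft(x)  # the oldest sample drops off the far end
        outputs.append(sum(a * d for a, d in zip(coeffs, delay_line)))
    return outputs


coeffs = list(range(1, 17))
response = fir16([1] + [0] * 15, coeffs)
```

Driving the model with a unit impulse returns the coefficients a0 to a15 in order, which is a quick sanity check for any FIR implementation.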
Different Filter Representations
Difference equation
\[ y[k] = \tfrac{1}{2}\, y[k-1] + \tfrac{1}{8}\, y[k-2] + x[k] \]
Recursive computation needs y[-1] and y[-2]. For the filter to be LTI, y[-1] = 0 and y[-2] = 0.
Transfer function (assumes LTI system)
\[ Y(z) = \tfrac{1}{2} z^{-1} Y(z) + \tfrac{1}{8} z^{-2} Y(z) + X(z) \]
\[ H(z) = \frac{Y(z)}{X(z)} = \frac{1}{1 - \tfrac{1}{2} z^{-1} - \tfrac{1}{8} z^{-2}} \]
Block diagram representation
[Figure: x[k] enters a summing node that produces y[k]; two unit delays form y[k-1] and y[k-2], which are fed back through gains 1/2 and 1/8]
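To make the recursion concrete, here is a minimal Python sketch of the difference equation y[k] = 1/2 y[k-1] + 1/8 y[k-2] + x[k] with zero initial conditions, as required for the filter to be LTI. The function name `recursive_filter` is an assumption for this example.

```python
def recursive_filter(x):
    """Run y[k] = 0.5*y[k-1] + 0.125*y[k-2] + x[k] with y[-1] = y[-2] = 0."""
    y_prev1, y_prev2 = 0.0, 0.0  # y[k-1] and y[k-2], initially at rest
    y = []
    for xk in x:
        yk = 0.5 * y_prev1 + 0.125 * y_prev2 + xk
        y.append(yk)
        y_prev2, y_prev1 = y_prev1, yk  # advance the two unit delays
    return y


impulse_response = recursive_filter([1, 0, 0, 0])
```

Unlike an FIR filter, the impulse response here never becomes exactly zero, because every output feeds back into later outputs.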
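Discrete-Time Systems
Z-Transform:
\[ H(z) = \sum_i h[i]\, z^{-i} \]
\[
\underbrace{\begin{bmatrix} 1 & z^{-1} & \cdots & z^{-5} \end{bmatrix}
\begin{bmatrix} y[0] \\ y[1] \\ y[2] \\ y[3] \\ y[4] \\ y[5] \end{bmatrix}}_{Y(z)}
=
\underbrace{\begin{bmatrix} 1 & z^{-1} & \cdots & z^{-5} \end{bmatrix}
\begin{bmatrix}
h[0] & 0 & 0 & 0 \\
h[1] & h[0] & 0 & 0 \\
h[2] & h[1] & h[0] & 0 \\
0 & h[2] & h[1] & h[0] \\
0 & 0 & h[2] & h[1] \\
0 & 0 & 0 & h[2]
\end{bmatrix}}_{H(z)\,\begin{bmatrix} 1 & z^{-1} & z^{-2} & z^{-3} \end{bmatrix}}
\begin{bmatrix} u[0] \\ u[1] \\ u[2] \\ u[3] \end{bmatrix}
\]
\[ Y(z) = \sum_i y[i]\, z^{-i}, \qquad U(z) = \sum_i u[i]\, z^{-i} \]
\[ Y(z) = H(z)\, U(z) \]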
Discrete-Time Systems
'Popular' frequency responses for filter design:
- low-pass (LP), high-pass (HP), band-pass (BP)
- band-stop, multi-band, ...
[Figure: sketches of the frequency responses]
Digital Filter Specifications
For example, the magnitude response \( |G(e^{j\omega})| \) of a digital lowpass filter may be given as indicated below.
[Figure: magnitude response specification of a digital lowpass filter]
Structured Streams
Hierarchical structures:
- Pipeline
- SplitJoin
- Feedback Loop
Different Strategies
Map filter per tile and run forever
Pros:
- No filter swapping overhead
- Reduced memory traffic
- Localized communication
- Tighter latencies
- Smaller live data set
Cons:
- Load balancing is critical
- Not good for dynamic behavior
- Requires # filters ≤ # processing elements
Discrete-Time Systems
'FIR filters' (finite impulse response):
\[ H(z) = \frac{B(z)}{z^N} = b_0 + b_1 z^{-1} + \dots + b_N z^{-N} \]
- Moving average filters (MA)
- N poles at the origin z=0 (hence guaranteed stability)
- N zeros (zeros of B(z)), 'all zero' filters
- corresponds to difference equation
\[ y[k] = b_0\, u[k] + b_1\, u[k-1] + \dots + b_N\, u[k-N] \]
- Impulse response
\[ h[0] = b_0,\; h[1] = b_1,\; \dots,\; h[N] = b_N,\; h[N+1] = 0,\; \dots \]
Speeding Up FIR Filter
FIR speed-up
y(0) = c(0)x(0) + c(1)x(-1) + c(2)x(-2) + . . . + c(N-1)x(1-N);
y(1) = c(0)x(1) + c(1)x(0) + c(2)x(-1) + . . . + c(N-1)x(2-N);
y(2) = c(0)x(2) + c(1)x(1) + c(2)x(0) + . . . + c(N-1)x(3-N);
. . .
y(n) = c(0)x(n) + c(1)x(n-1) + c(2)x(n-2) + . . . + c(N-1)x(n-(N-1));
- Run MAC at double frequency, read two 32-bit numbers
- FIR filtering: two outputs in parallel
- Two outputs = 4N reads, 2N MACs, 2 writes
Direct Form Realization
[Figure: direct-form FIR block diagram; a tapped delay line holds u[k], u[k-1], u[k-2], u[k-3], u[k-4]; each tap is multiplied by b0 ... b4 and a chain of adders produces y[k]]
\[ y[k] = b_0\, u[k] + b_1\, u[k-1] + \dots + b_N\, u[k-N] \]
\[ T_{Critical} = T_M + (N-1)\, T_A, \qquad N = \text{number of taps} \]
\[ T_{Clock} \ge T_{Critical} \]
Retiming FIR Filter Realizations
- Select subgraph (shaded)
- Remove delay element on all inbound arrows
- Add delay element on all outbound arrows
[Figure: direct-form FIR block diagram with taps u[k] ... u[k-4], multipliers b0 ... b4, and an adder chain producing y[k]; the subgraph to be retimed is shaded]
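Retiming
[Figure: the FIR block diagram after retiming; taps u[k], u[k-1], u[k-2], u[k-3], multipliers b0 ... b4, adders producing y[k], with delay elements moved across the selected subgraph]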
Four Tap Direct Form Realization
[Figure: four-tap direct-form FIR; taps u[k], u[k-1], u[k-2], u[k-3], multipliers b0 ... b3, adders arranged as a tree producing y[k]]
\[ y[k] = b_0\, u[k] + b_1\, u[k-1] + b_2\, u[k-2] + b_3\, u[k-3] \]
\[ T_{Critical} = T_M + \log(N)\, T_A, \qquad N = \text{number of taps} \]
\[ T_{Clock} \ge T_{Critical} \]
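Transposed Direct-Form Realization
[Figure: transposed direct-form FIR; the input u[k] feeds all multipliers b0 ... b4 at once, and the products are accumulated through a chain of adders and delay elements toward y[k]]
\[ y[k] = b_0\, u[k] + b_1\, u[k-1] + \dots + b_N\, u[k-N] \]
\[ T_{Critical} = T_M + T_A, \qquad N = \text{number of taps} \]
\[ T_{Clock} \ge T_{Critical} \]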
Lattice Form Realizations
[Figure: two coupled tapped delay lines over u[k], u[k-1], ..., u[k-4]; one uses the coefficients in the order b0 ... b4 and the other in reversed order b4 ... b0, producing the outputs y[k] and ỹ[k]]
FIR Filter Realizations
Lattice Form
[Figure: lattice FIR structure; four cascaded stages, each with a delay element, two multipliers by the stage coefficient (k0, k1, k2, k3) and two adders; an output multiplier b0; outputs y[k] and ỹ[k]]
\[ y[k] = b_0\, u[k] + b_1\, u[k-1] + \dots + b_N\, u[k-N] \]
i.e. different software/hardware, same i/o-behavior
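The "same i/o-behavior" point can be checked numerically. The sketch below models the direct form and the transposed direct form from the earlier slides in Python and compares their outputs; the function names and the test signal are assumptions made for this illustration.

```python
def direct_form(u, b):
    """Direct form: delay the input, multiply each tap, sum all products."""
    delay = [0] * (len(b) - 1)  # holds u[k-1], u[k-2], ...
    y = []
    for uk in u:
        taps = [uk] + delay
        y.append(sum(bi * ti for bi, ti in zip(b, taps)))
        delay = taps[:-1]  # shift the delay line by one sample
    return y


def transposed_form(u, b):
    """Transposed form: multiply the input by every b_i, delay the partial sums."""
    n = len(b) - 1
    state = [0] * n  # registers between the adders
    y = []
    for uk in u:
        y.append(b[0] * uk + (state[0] if n else 0))
        # each register loads its product plus the old value of the next register
        state = [b[i + 1] * uk + (state[i + 1] if i + 1 < n else 0)
                 for i in range(n)]
    return y


b = [1, 2, 3, 4, 5]
u = [1, 0, 0, 2, -1, 0, 0, 0]
same_behavior = direct_form(u, b) == transposed_form(u, b)
```

Both functions compute the same output sequence; they differ only in where the delay elements sit, which is what changes the critical path in hardware.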
Efficient Direct Form Realization
Efficient Direct-Form realization.
[Figure: the input u[k] passes through two chains of four delay elements; delayed samples are added in pairs before the multipliers b0 ... b4, and the products are summed into y[k]]
Pin Diagram
[Figure: pin diagram of the 16-bit, 16-tap FIR filter block]
- Inputs: x[0] ... x[15], a[0] ... a[15], Reset, Coeffin, Din, Clk
- Outputs: y[0] ... y[31], Drive
- Supply: Vdd, Gnd
Synthesis using Synopsys Design Compiler
Initial Target Frequency: 100 MHz (typical)
Specifications
Input Specifications
- 16-bit unsigned integers for data inputs.
- 16-bit unsigned integers for coefficients.
Output Specifications
- 32-bit unsigned integer output.
System Components
- Memory: input and coefficient
- Control: mod-4 and mod-8 counters, 3-8 decoder, combinational logic
- Multiplier: radix-8 Booth multiplier, multiplier register
- Adder: 9-bit carry-save adder, adder register
- Output register
Specifications
Drive Signal (Output Signal)
- A new output is available.
- Inputs or coefficients are to be applied only when Drive is asserted.
Coefficients
- Any coefficient change implies a new filter definition.
- Input memory is cleared, and new data is to be entered.
Specifications
System Clock
- One clock cycle for the filter = 32 input clock pulses.
- One tap cycle = 8 input clock pulses, described as 8 phases.
- 4 such taps for each output.
System Reset
- Active high
System Timing
mod-8 counter states (events listed in the order shown in the timing diagram):
- Input or coefficient memory enable
- Multiplier propagation delay
- Multiplier propagation delay
- Multiplier register enable
- Add register enable
- Output register enable
[Figure: timing diagram marking the counter state of each event]
System Timing Strategy
- Two-phase clocking
- Generation of internal lower-frequency clocks using mod-4 and mod-8 counters
- Each state of the mod-4 counter is used for the computation of one filter tap
- Output is available at the end of one cycle of the mod-4 counter
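2-Parallel FIR Filtering Structure
[Figure: inputs x(2k) and x(2k+1) each drive a pair of subfilters H0 and H1 (four subfilters in total); two adders and a delay element D, labelled z^-2, combine the subfilter outputs into y(2k) and y(2k+1)]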
Hardware-Efficient 2-Parallel FIR Filter
\[ Y_0 = X_0 H_0 + z^{-2} X_1 H_1 \]
\[ Y_1 = X_0 H_1 + X_1 H_0 = (H_0 + H_1)(X_0 + X_1) - H_0 X_0 - H_1 X_1 \]
[Figure: inputs x(2k) and x(2k+1) drive three subfilters H0, H0+H1 and H1; adders and a delay element D, labelled z^-2, combine their outputs into y(2k) and y(2k+1)]
Savings in the New Structure
Originally,
- 2N multiplications + 2(N-1) additions for two inputs
In the new structure
- 3*(N/2) = 1.5N multiplications
- 3(N/2 - 1) + 4 = 1.5N + 1 additions
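The three-subfilter identity behind these savings can be sketched in Python as follows. The code splits input and coefficients into even and odd phases, runs three half-length filters, and compares the result with an ordinary FIR. All function names are assumptions for this example, and both the input length and the number of taps are assumed to be even.

```python
def fir(x, h):
    """Reference FIR: y[n] = sum_i h[i] * x[n-i], zero initial state."""
    return [sum(h[i] * x[n - i] for i in range(len(h)) if n - i >= 0)
            for n in range(len(x))]


def fast_two_parallel_fir(x, h):
    """2-parallel FIR using three subfilters H0, H1 and H0+H1."""
    x0, x1 = x[0::2], x[1::2]  # even and odd input phases
    h0, h1 = h[0::2], h[1::2]  # even and odd coefficient phases
    p0 = fir(x0, h0)  # H0 X0
    p1 = fir(x1, h1)  # H1 X1
    ps = fir([a + b for a, b in zip(x0, x1)],
             [a + b for a, b in zip(h0, h1)])  # (H0 + H1)(X0 + X1)
    y = []
    for k in range(len(x0)):
        delayed_p1 = p1[k - 1] if k >= 1 else 0  # z^-2 is one block delay
        y.append(p0[k] + delayed_p1)             # y(2k)
        y.append(ps[k] - p0[k] - p1[k])          # y(2k+1)
    return y


x = [3, 1, 4, 1, 5, 9, 2, 6]
h = [1, 2, 3, 4]
matches_reference = fast_two_parallel_fir(x, h) == fir(x, h)
```

Each subfilter has N/2 taps, so one pair of outputs costs 3 * (N/2) multiplications instead of 2N, which is the saving quoted above.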
Design Flow FIR 16 Tap Delay
- VHDL design entry
- Synthesis
- Floor planning
- Place & route
Verification: functional verification, timing verification, physical verification
Files exchanged between steps: EDIF, PDEF, SDF, parasitics
[Figure: design flow chart connecting the steps above]
The FIR Filter
In the implementation of the 16-tap FIR filter, the coefficients are represented as fixed-point 16-bit 2's complement numbers. It is assumed that either or both of the coefficients and data are fractional numbers.
FIR Filter (Critical Path)
In order to save area and improve the critical path performance, we decided to add the 12-bit sum and carry results of the multiplier during the accumulation operation. Therefore, the adder has to add three 12-bit numbers. To do that, the first stage of the adder is a 3-to-2 combiner, which is just a CSA. The next stage is a CPA (Carry Propagate Adder) arranged in a static Manchester carry chain form. The chain is divided into four sections, each of which has three carry stages. Buffers are used between sections to reduce the overall delay.
Survey of Multiplier
Combinational multiplier: uses n adders, eliminates registers.
Radix-2 Unsigned Multiplication
Use a single n-bit adder, three registers (P, A, B), and a testing circuit for A_0.
Initialization: Place the unsigned numbers in registers A and B. Set P to zero.
1. If A_0 is 1, then register B, containing b_{n-1} b_{n-2} ... b_0, is added to P; otherwise 00...00 (nothing) is added to P. The sum is placed back into P.
2. Shift register pair (P, A) one bit right. The last bit of A is shifted out (not used).
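The two steps above can be modelled in Python as a register-level sketch. The function name `radix2_multiply` and the explicit word length `n` are assumptions of this example; P is allowed one extra bit to hold the carry out of the adder before the shift.

```python
def radix2_multiply(a, b, n):
    """Multiply two unsigned n-bit integers with the P/A/B shift-add scheme."""
    mask = (1 << n) - 1
    P, A, B = 0, a & mask, b & mask
    for _ in range(n):
        if A & 1:      # step 1: test A_0 and conditionally add B into P
            P += B
        # step 2: shift the pair (P, A) right; the LSB of P enters the MSB of A
        A = (A >> 1) | ((P & 1) << (n - 1))
        P >>= 1
    return (P << n) | A  # the 2n-bit product sits in the register pair (P, A)


product_6_by_9 = radix2_multiply(6, 9, 4)
```

After n iterations the high half of the product is in P and the low half in A, so no separate product register is needed.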
Array Multiplier
- An array multiplier is an efficient layout of a combinational multiplier.
- Array multipliers may be pipelined to decrease the clock period at the expense of latency.
Array Multiplier Organization
          0 1 1 0    multiplicand
      x   1 0 0 1    multiplier
          0 1 1 0
    + 0 0 0 0
      0 0 1 1 0
  + 0 0 0 0
    0 0 0 1 1 0
+ 0 1 1 0
  0 1 1 0 1 1 0      product
Skew the array for a rectangular layout.
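Array Multiplier Organization
\[ t_{mult} \approx (M-1)\, t_{carry} + (N-1)\, t_{sum} + t_{and} \]
- For a small t_mult, both t_carry and t_sum must be small.
- Beneficial to make t_carry = t_sum -> Differential Logic (DCVS)
[Figure: array multiplier cell built around a full adder (FA) with inputs Xi, Yi, Pin, Cin and outputs Pout, Cout; the critical path runs through the array, whose dimensions are labelled N-1 (partial products) and M-1]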
Architecture of Array Multiplier
[Figure: 4x4 array multiplier; inputs X3 ... X0 and Y0 ... Y3, rows of half adders (HA) and full adders, outputs Z7 ... Z0]
Advantages of Array Multiplier
Array multipliers
- Partial product generation and accumulation are merged
- Identical cells
- High-rate pipelining
[Figure: 5x5 array of partial products a_i x_j (i, j = 0 ... 4), summed by column weight into product bits p0 ... p9]
Array Multiplier
- Array multiplier for unsigned numbers
[Figure: 5x5 unsigned array multiplier; rows of partial products a_i x_j with 0 inputs on the first row, producing product bits p9 ... p0]
Array Multiplier for Two's Complement
- type I cell
  - ordinary full adder
- type II cell
  - x + y - z = 2c - s
    s = (x + y - z) mod 2
    c = [(x + y - z) + s] / 2
  - type I cell with inverted z and s: z = 1 - z', s = 1 - s'
[Figure: type II cell with inputs x, y and z (weight = -1) and outputs c, s]
Truth table of the type II cell (x + y - z = 2c - s):
x y z | c s
0 0 0 | 0 0
0 0 1 | 0 1
0 1 0 | 1 1
0 1 1 | 0 0
1 0 0 | 1 1
1 0 1 | 0 0
1 1 0 | 1 0
1 1 1 | 1 1
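The relation between the type I and type II cells can be checked exhaustively. The Python sketch below builds a type II cell from an ordinary full adder with inverted z and s, then confirms x + y - z = 2c - s for all eight input combinations; the function names are assumptions for this example.

```python
from itertools import product


def full_adder(x, y, z):
    """Type I cell: x + y + z = 2c + s."""
    total = x + y + z
    return total // 2, total % 2


def type2_cell(x, y, z):
    """Type II cell: a full adder fed with z' = 1 - z, returning s = 1 - s'."""
    c, s_inverted = full_adder(x, y, 1 - z)
    return c, 1 - s_inverted


# (x, y, z) -> (c, s) for all eight input combinations
table = {xyz: type2_cell(*xyz) for xyz in product((0, 1), repeat=3)}
```

The dictionary `table` reproduces the truth table on this slide, so the negative-weight cell needs no new adder design, only two inversions.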
Array Multiplier for Two's Complement
- type II' cell:
  - -x - y + z = -2c + s
    x + y - z = 2c - s
  - identical to the type II cell
[Figure: type II' cell with inputs x, y, z and outputs c, s; labels "weight = -2" and "weight = -1"]
Architecture of Carry-Save Multiplier
[Figure: carry-save multiplier array with the critical path marked, followed by a vector-merging adder]
tmult=(N-1) tcarry + tand + tvma
Carry-Save Multiplier (4x4)
Baugh-Wooley Multiplier
- Algorithm for two's-complement multiplication.
- Adjusts partial products to maximize the regularity of the multiplication array.
- Moves partial products with negative signs to the last steps; also adds the negation of partial products rather than subtracting.
Serial-Parallel Multiplier
- Used in serial-arithmetic operations.
- Multiplicand can be held in place by a register.
- Multiplier is shifted into the array.
Serial-Parallel Multiplier
Serial Multiplier
[Figure: serial multiplier with inputs X and Y, gates G1 and G2, a full adder with carry in Ci and carry out Co, a flip-flop delay element with reset, and a serial-to-parallel register for the sum S; N-1 stages]
M+N bits, M*N cycles
Serial-Parallel Multiplier
[Figure: chain of adders with parallel inputs Y0, Y1, Y2, ..., Yn-1 and serial input X]
Serial-Parallel Multiplier
                     X3   X2   X1   X0
Y0:                 X3Y0 X2Y0 X1Y0 X0Y0
Y1:            X3Y1 X2Y1 X1Y1 X0Y1
Y2:       X3Y2 X2Y2 X1Y2 X0Y2
Y3:  X3Y3 X2Y3 X1Y3 X0Y3
     P7   P6   P5   P4   P3   P2   P1   P0
Serial-Parallel Multiplier
\[ X = \sum_{i=0}^{m-1} X_i\, 2^i, \qquad Y = \sum_{j=0}^{n-1} Y_j\, 2^j \]
\[ P = X \cdot Y = \sum_{i=0}^{m-1} X_i\, 2^i \sum_{j=0}^{n-1} Y_j\, 2^j = \sum_{i=0}^{m-1} \sum_{j=0}^{n-1} (X_i Y_j)\, 2^{i+j} = \sum_{k=0}^{m+n-1} P_k\, 2^k \]
[Figure: adder cell with inputs X_i, Y_i, C_i and outputs P_{i+1}, C_{i+1}]
The Architecture of the Booth Algorithm
The Booth Multiplier
- High-performance, low-power multiplier units are necessary in many situations, such as DSP systems.
Carry Save Addition
[Figure: carry-save array for inputs X7 ... X0 and Y0 ... Y7; rows of full adders (FA) followed by a final CLA adder]
Booth's Algorithm
Booth Algorithm
With \( y_{-1} = 0 \):
1st order (radix-2):
\[ XY = \sum_{i=0}^{n-1} (y_{i-1} - y_i)\, 2^{i}\, x \]
2nd order (radix-4):
\[ XY = \sum_{i=0}^{n/2-1} (y_{2i-1} + y_{2i} - 2 y_{2i+1})\, 2^{2i}\, x \]
3rd order (radix-8):
\[ XY = \sum_{i=0}^{n/3-1} (y_{3i-1} + y_{3i} + 2 y_{3i+1} - 4 y_{3i+2})\, 2^{3i}\, x \]
4th order (radix-16):
\[ XY = \sum_{i=0}^{n/4-1} (y_{4i-1} + y_{4i} + 2 y_{4i+1} + 4 y_{4i+2} - 8 y_{4i+3})\, 2^{4i}\, x \]
Booth Encoding
Encode a number by taking groups of 3 bits, where each 3-bit group overlaps by 1 bit.
Consider a multiplier B with (n + 1) bits:
- Pad B with 0 to match the first term
- If B has an odd number of bits, then extend the sign: B_n B_n B_{n-1} ... B_0 0
\[ E_j = -2 B_i + B_{i-1} + B_{i-2} \]
\[ E_{j+1} = -2 B_{i+2} + B_{i+1} + B_i \]
Booth Multiplier
- Encoding scheme to reduce the number of stages in multiplication.
- Performs two bits of multiplication at once, so it requires half the stages.
- Each stage is slightly more complex than a simple multiplier, but an adder/subtracter is almost as small and fast as an adder.
Booth Encoding
Two's-complement form of multiplier:
\[ y = -2^{n} y_n + 2^{n-1} y_{n-1} + 2^{n-2} y_{n-2} + \dots \]
Rewrite using \( 2^a = 2^{a+1} - 2^a \):
\[ y = 2^{n} (y_{n-1} - y_n) + 2^{n-1} (y_{n-2} - y_{n-1}) + 2^{n-2} (y_{n-3} - y_{n-2}) + \dots \]
Consider the first two terms: by looking at three bits of y, we can determine whether to add x or 2x to the partial product.
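As an illustration of looking at three bits at a time, the Python sketch below recodes a two's-complement multiplier into radix-4 Booth digits in {-2, -1, 0, 1, 2} and uses them to multiply. The function names and the explicit even word length `n` are assumptions of this example; each digit is computed as y_{2i-1} + y_{2i} - 2 y_{2i+1} with y_{-1} = 0.

```python
def booth_radix4_digits(y, n):
    """Radix-4 Booth digits of an n-bit two's-complement y, least significant first."""
    if n % 2:
        raise ValueError("n must be even for radix-4 recoding")
    bits = [(y >> i) & 1 for i in range(n)]  # two's-complement bits of y
    prev = 0                                 # y_{-1} = 0
    digits = []
    for i in range(0, n, 2):
        digits.append(prev + bits[i] - 2 * bits[i + 1])
        prev = bits[i + 1]                   # groups overlap by one bit
    return digits


def booth_multiply(x, y, n):
    """Sum the selected multiples of x, one per pair of multiplier bits."""
    return sum(d * x * 4 ** k for k, d in enumerate(booth_radix4_digits(y, n)))


digits_of_11 = booth_radix4_digits(11, 6)
product_22_by_11 = booth_multiply(22, 11, 6)
```

Only n/2 partial products are generated, each being 0, ±x or ±2x, which is why the recoding halves the number of stages.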
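Booth Multiplier
[Figure: Booth multiplier block diagram; multiplicand bits x0 ... x8 pass through an inverter/shift stage that forms the x and 2x multiples; a Booth decoder on multiplier bits y0 ... y8 drives the selector; the selected partial products are summed by a Wallace tree and CLA adders]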
Array Multiplier Cell for Booth's Algorithm
[Figure: cell consisting of a MUX that selects one of 0, (-2A)_i, (2A)_i, (A)_i, (-A)_i under control of "select", feeding a full adder with inputs cin, sin and outputs cout, sout]
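Sign Extension Reduction
Sign extension of four partial products with sign bits S0 ... S3:
S0 S0 S0 S0 S0 S0 S0 S0 - - - - - - - -
S1 S1 S1 S1 S1 S1 - - - - - - - -
S2 S2 S2 S2 - - - - - - - -
S3 S3 - - - - - - - -
The sign-extension bits sum to
\[ S_3 (2^7 + 2^6) + S_2 (2^7 + 2^6 + 2^5 + 2^4) + S_1 (2^7 + 2^6 + 2^5 + 2^4 + 2^3 + 2^2) + S_0 (2^7 + 2^6 + 2^5 + 2^4 + 2^3 + 2^2 + 2^1 + 2^0) \]
\[ = S_3 (2^7 + 2^7 - 2^6) + S_2 (2^7 + 2^7 - 2^4) + S_1 (2^7 + 2^7 - 2^2) + S_0 (2^7 + 2^7 - 2^0) \]
\[ = S_3 (-2^6) + S_2 (-2^4) + S_1 (-2^2) + S_0 (-2^0) \pmod{2^8} \]
The sign extension therefore reduces to the bit pattern
\[ 1\ \bar{S}_3\ 1\ \bar{S}_2\ 1\ \bar{S}_1\ 1\ \bar{S}_0 \]
plus 1 at the least significant position.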
Wallace Tree
Reduces the depth of the adder chain.
Built from carry-save adders:
- three inputs a, b, c
- produces two outputs y, z such that y + z = a + b + c
Carry-save equations:
- y_i = parity(a_i, b_i, c_i)
- z_i = majority(a_i, b_i, c_i)
Wallace Tree Structure
7-bit Wallace Tree Addition
At each stage, i numbers are combined to form ceil(2i/3) sums.
Final adder completes the summation.
Wiring is more complex.
Can build a Booth-encoded Wallace tree multiplier.
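The stage-by-stage reduction can be sketched in Python as follows. This is an illustrative model on whole integers, not the project's hardware description; the function names are assumptions. Each stage groups the rows in threes, passes each group through a carry-save adder, and forwards any leftover rows unchanged, so i rows become ceil(2i/3) rows. An ordinary addition stands in for the final adder.

```python
def carry_save_add(a, b, c):
    """Three-to-two reduction: parity word and shifted majority word."""
    y = a ^ b ^ c
    z = ((a & b) | (a & c) | (b & c)) << 1
    return y, z


def wallace_sum(operands):
    """Sum non-negative integers; return (total, rows remaining per stage)."""
    rows = list(operands)
    sizes = [len(rows)]
    while len(rows) > 2:
        nxt = []
        full = len(rows) - len(rows) % 3      # rows that form complete triples
        for k in range(0, full, 3):
            nxt.extend(carry_save_add(*rows[k:k + 3]))
        nxt.extend(rows[full:])               # leftover rows pass through
        rows = nxt
        sizes.append(len(rows))
    return sum(rows), sizes                   # final adder completes the sum


if __name__ == "__main__":
    total, sizes = wallace_sum([1, 2, 3, 4, 5, 6])
    print(total, sizes)
```

For six operands the row count goes 6, 4, 3, 2, that is, three carry-save stages before the final adder.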
CSA vs. Wallace Tree
[Figure: CSA vs. Wallace tree. Inputs 1 to 6 are reduced by four full adders (FA) in each arrangement to a carry (C) and a sum (S) output.]
Radix-4 Modified Booth's Algorithm
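A (multiplicand)        0 1 0 1 1 0   = 22
X (multiplier)          0 0 1 0 1 1   = 11
Y (recoded multiplier)  0 1  0 -1  0 -1   (radix-4 digits +1, -1, -1)
Partial products, each written with a leading 1 and a complemented sign bit in place of the sign extension:
Correction bit                      1
-A * 4^0                          1 0 0 1 0 1 0
-A * 4^1                      1 0 0 1 0 1 0
+A * 4^2                  1 1 1 0 1 1 0
Sum                     1 0 0 0 1 1 1 1 0 0 1 0   (low 11 bits = 242 = 22 * 11)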
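The recoding step for the 22 x 11 example can be sketched in Python. This is a behavioural illustration using Python integers, not the project's datapath; the function names and the `nbits` parameter are assumptions. Each radix-4 digit is formed from three overlapping multiplier bits as x[i-1] + x[i] - 2*x[i+1], with an implicit 0 appended below the least significant bit.

```python
def booth_radix4_digits(x, nbits):
    """Recode an nbits-wide two's-complement multiplier into radix-4 digits.

    Returns digits in {-2, -1, 0, 1, 2}, least significant first.
    """
    if nbits % 2:
        raise ValueError("nbits must be even")
    x &= (1 << nbits) - 1
    padded = x << 1                       # append the implicit x[-1] = 0
    digits = []
    for i in range(0, nbits, 2):
        b0 = (padded >> i) & 1            # x[i-1]
        b1 = (padded >> (i + 1)) & 1      # x[i]
        b2 = (padded >> (i + 2)) & 1      # x[i+1]
        digits.append(b0 + b1 - 2 * b2)
    return digits


def booth_multiply(a, x, nbits):
    """Multiply a by x by summing one partial product per radix-4 digit."""
    digits = booth_radix4_digits(x, nbits)
    return sum(d * a * 4 ** k for k, d in enumerate(digits))


if __name__ == "__main__":
    print(booth_radix4_digits(11, 6))
    print(booth_multiply(22, 11, 6))
```

A 6-bit multiplier yields three partial products instead of six, which is the saving that the Booth-encoded multiplier relies on.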
Wallace Tree
[Figure: a chain of full adders (FA) summing inputs y0 to y5, with carries Ci-1 in and Ci out and outputs C and S, shown beside the equivalent Wallace tree of full adders.]
Collapse the chain of FAs summing y0 to y5 (5 adder delays) into a Wallace tree (4 adder delays).
Floor Plan of Multiplier
[Figure: floor plan of the multiplier in the actual datapath. Labels: x, Y, LSB, MSB, M1, "M2 or M3".]
Floor Plan
[Figure: chip floor plan. Blocks: Adder, Add Reg, Out Reg, Multiplier, Multiplier Reg, Control Block, Coefficient Memory, Input Memory, Routing.]
Floor Planning
Results
Cell
Number of Ports           34
Number of Nets            157
Number of Cells           32
Combinational Area        24286.050781
Non-Combinational Area    14935.535156
Total Area                39221.585938
Main Module
Booth Multiplier
Core Module
Controller Module
Conclusion
Good design experience.
Using a parallel FIR filter realization reduced the number of multipliers and adders needed; as a result, area shrank and power consumption was lowered.
Timing strategies: using non-blocking assignments in Verilog reduced the number of states needed for implementation.
Partitioning the design into submodules made the design more manageable and easier to optimize.
Performance optimization was reached with slack time equal to +9.54.