Proj Desai Mousa 16 Tap Filter


8/6/2019 Proj Desai Mousa 16 Tap Filter

http://slidepdf.com/reader/full/proj-desai-mousa-16-tap-filter 1/85

Design Objectives
• To have register-based storage of the 16 latest input values and the 16 impulse response coefficients on-chip.
• To utilize a clocked architecture to synchronize input and output values.
• To reduce the number of multipliers and adders needed, that is, to optimize area, power, and cost.
• To achieve the above without compromising speed.

Design Objectives (continued)
• 16 taps of delay line.
• 8 bits of input/output resolution.
• Burst mode of data transfer at the input, supporting 32 elements of the desired resolution in one burst.

Main issues of concern when designing an FIR filter:
• Sharp response
• Number of taps
• Numerical precision
• Fully parallel implementation

Advantages and Disadvantages
• Advantages:
 – Always stable (assuming a non-recursive implementation).
 – Quantization noise is not much of a problem.
 – Transients have a finite duration.
• Disadvantages:
 – A high-order filter is generally needed to satisfy the stated specification, so more coefficients are needed, with more storage and computation.

Review of Discrete-Time Systems
• Linear time-invariant (LTI) systems
• Causal systems: for every input with x[k] = 0 for k < 0, the output satisfies y[k] = 0 for k < 0.
• Impulse response:
 – input 1, 0, 0, 0, ... -> output h[0], h[1], h[2], h[3], ...
 – input x[0], x[1], x[2], x[3], ... -> output y[0], y[1], y[2], y[3], ...
• Convolution sum (the input is written u[k] in this and later formulas):
\[ y[k] = \sum_i h[i]\,u[k-i] = h[k] * u[k] \]

Overview: FIR Filter Equation
• y[n] = x[n] * h[n], where * denotes convolution, n is the sample index, and the length of h[n] is the number of "taps" (coefficients) in the FIR filter.
• For a 16-tap FIR filter:
\[ y[n] = a_0 x[n] + a_1 x[n-1] + a_2 x[n-2] + a_3 x[n-3] + \dots + a_{15} x[n-15] \]
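As a minimal sketch of the 16-tap equation above, the following Python function evaluates the sum of products over a delay line. The name `fir_direct`, the example coefficient values, and the zero initial state (x[n] = 0 for n < 0) are illustrative assumptions, not part of the original design.

```python
def fir_direct(coeffs, samples):
    """Direct-form FIR: y[n] = sum_k coeffs[k] * x[n-k], assuming x[n] = 0 for n < 0."""
    delay_line = [0] * len(coeffs)              # delay_line[k] holds x[n-k]
    outputs = []
    for x in samples:
        delay_line = [x] + delay_line[:-1]      # shift the newest sample in
        outputs.append(sum(a * d for a, d in zip(coeffs, delay_line)))
    return outputs


# Hypothetical coefficients a0..a15 = 1..16; an impulse input reads them back out.
coeffs = list(range(1, 17))
impulse = [1] + [0] * 19
response = fir_direct(coeffs, impulse)
```

Feeding a unit impulse returns the coefficients themselves followed by zeros, which is the defining property of a finite impulse response.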

Different Filter Representations
• Difference equation:
\[ y[k] = \tfrac{1}{2}\,y[k-1] + \tfrac{1}{8}\,y[k-2] + x[k] \]
 – Recursive computation needs y[-1] and y[-2].
 – For the filter to be LTI, y[-1] = 0 and y[-2] = 0.
• Transfer function (assumes an LTI system):
\[ Y(z) = \tfrac{1}{2}\,z^{-1}\,Y(z) + \tfrac{1}{8}\,z^{-2}\,Y(z) + X(z) \]
\[ H(z) = \frac{Y(z)}{X(z)} = \frac{1}{1 - \tfrac{1}{2}\,z^{-1} - \tfrac{1}{8}\,z^{-2}} \]
• Block diagram representation:
[Figure: block diagram in which two unit delays produce y[k-1] and y[k-2], fed back through gains 1/2 and 1/8 into the adder that forms y[k] from x[k].]

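Discrete-Time Systems: Z-Transform
\[ H(z) = \sum_i h[i]\,z^{-i}, \qquad Y(z) = \sum_i y[i]\,z^{-i}, \qquad U(z) = \sum_i u[i]\,z^{-i} \]
Example with a three-coefficient impulse response and a four-sample input:
\[
\begin{bmatrix} y[0]\\ y[1]\\ y[2]\\ y[3]\\ y[4]\\ y[5] \end{bmatrix}
=
\begin{bmatrix}
h[0] & 0 & 0 & 0\\
h[1] & h[0] & 0 & 0\\
h[2] & h[1] & h[0] & 0\\
0 & h[2] & h[1] & h[0]\\
0 & 0 & h[2] & h[1]\\
0 & 0 & 0 & h[2]
\end{bmatrix}
\begin{bmatrix} u[0]\\ u[1]\\ u[2]\\ u[3] \end{bmatrix}
\]
Multiplying both sides on the left by the row vector of powers of z^{-1}:
\[
Y(z) = \begin{bmatrix} 1 & z^{-1} & z^{-2} & \dots & z^{-5} \end{bmatrix}
\begin{bmatrix} y[0]\\ \vdots\\ y[5] \end{bmatrix}
= H(z)\,\begin{bmatrix} 1 & z^{-1} & z^{-2} & z^{-3} \end{bmatrix}
\begin{bmatrix} u[0]\\ \vdots\\ u[3] \end{bmatrix}
\]
\[ Y(z) = H(z)\,U(z) \]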
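The matrix form and the product Y(z) = H(z) U(z) describe the same computation, which a short Python sketch can check numerically. The helper names and the sample values of h and u are assumptions chosen only for illustration.

```python
def convolution_matrix(h, n_inputs):
    """Banded (Toeplitz) matrix T with T[r][c] = h[r - c], zero outside the band."""
    rows = len(h) + n_inputs - 1
    return [[h[r - c] if 0 <= r - c < len(h) else 0 for c in range(n_inputs)]
            for r in range(rows)]


def mat_vec(matrix, vec):
    """Plain matrix-vector product."""
    return [sum(m * v for m, v in zip(row, vec)) for row in matrix]


def poly_mul(a, b):
    """Coefficients of the product of two polynomials in z^-1."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out


h = [1, 2, 3]        # hypothetical h[0], h[1], h[2]
u = [4, 5, 6, 7]     # hypothetical u[0] .. u[3]
T = convolution_matrix(h, len(u))
y = mat_vec(T, u)    # y[0] .. y[5]
```

The vector `y` holds the coefficients of Y(z); multiplying the polynomials H(z) and U(z) directly gives the same six values.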

Discrete-Time Systems
"Popular" frequency responses for filter design:
• low-pass (LP)
• high-pass (HP)
• band-pass (BP)
• band-stop
• multi-band
• ...
[Figure: sketches of these frequency responses.]

Digital Filter Specifications
For example, the magnitude response |G(e^{j\omega})| of a digital lowpass filter may be specified as indicated below.
[Figure: magnitude response specification of a digital lowpass filter.]

Structured Streams
• Hierarchical structures:
 – Pipeline
 – SplitJoin
 – Feedback Loop

Different Strategies
Map one filter per tile and run forever.
• Pros:
 – No filter-swapping overhead
 – Reduced memory traffic
 – Localized communication
 – Tighter latencies
 – Smaller live data set
• Cons:
 – Load balancing is critical
 – Not good for dynamic behavior
 – Requires # filters ≤ # processing elements

Discrete-Time Systems: "FIR filters" (finite impulse response)
\[ H(z) = \frac{B(z)}{z^{N}} = b_0 + b_1 z^{-1} + \dots + b_N z^{-N} \]
• Moving average (MA) filters
• N poles at the origin z = 0 (hence guaranteed stability)
• N zeros (the zeros of B(z)): "all-zero" filters
• Corresponds to the difference equation
\[ y[k] = b_0\,u[k] + b_1\,u[k-1] + \dots + b_N\,u[k-N] \]
• Impulse response
\[ h[0] = b_0,\; h[1] = b_1,\; \dots,\; h[N] = b_N,\; h[N+1] = 0,\; \dots \]

Speeding Up FIR Filter
FIR speed-up:
y(0) = c(0)x(0) + c(1)x(-1) + c(2)x(-2) + . . . + c(N-1)x(1-N);
y(1) = c(0)x(1) + c(1)x(0) + c(2)x(-1) + . . . + c(N-1)x(2-N);
y(2) = c(0)x(2) + c(1)x(1) + c(2)x(0) + . . . + c(N-1)x(3-N);
. . .
y(n) = c(0)x(n) + c(1)x(n-1) + c(2)x(n-2) + . . . + c(N-1)x(n-(N-1));
• Run the MAC at double frequency, read two 32-bit numbers.
• FIR filtering: two outputs in parallel.
• Two outputs = 4N reads, 2N MACs, 2 writes.

Direct Form Realization
[Figure: direct-form FIR structure. The input u[k] passes through a chain of unit delays giving u[k-1], u[k-2], u[k-3], u[k-4]; the taps are multiplied by b0 ... b4 and summed by a chain of adders to form y[k].]
\[ y[k] = b_0\,u[k] + b_1\,u[k-1] + \dots + b_N\,u[k-N] \]
\[ T_{critical} = T_M + (N-1)\,T_A, \qquad T_{clock} \ge T_{critical}, \qquad N = \text{number of taps} \]

Retiming FIR Filter Realizations
• Select a subgraph (shaded).
• Remove a delay element on all inbound arrows.
• Add a delay element on all outbound arrows.
[Figure: the direct-form structure (taps u[k] ... u[k-4], coefficients b0 ... b4, output y[k]) with the subgraph to be retimed shaded.]

Retiming
[Figure: the structure after retiming, with delay elements moved across the selected subgraph; taps u[k], u[k-1], u[k-2], u[k-3], coefficients b0 ... b4, output y[k].]

Four Tap Direct Form Realization
[Figure: four-tap direct-form structure with taps u[k], u[k-1], u[k-2], u[k-3], coefficients b0 ... b3, and output y[k].]
\[ y[k] = b_0\,u[k] + b_1\,u[k-1] + b_2\,u[k-2] + b_3\,u[k-3] \]
\[ T_{critical} = T_M + \log(N)\,T_A, \qquad T_{clock} \ge T_{critical}, \qquad N = \text{number of taps} \]

Transposed Direct-Form Realization
[Figure: transposed direct-form structure. The input u[k] is multiplied by every coefficient b0 ... b4 at once; the products are accumulated through a chain of adders separated by unit delays, ending at y[k].]
\[ y[k] = b_0\,u[k] + b_1\,u[k-1] + \dots + b_N\,u[k-N] \]
\[ T_{critical} = T_M + T_A, \qquad T_{clock} \ge T_{critical}, \qquad N = \text{number of taps} \]
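A small Python sketch can illustrate that the transposed form computes the same output as the direct form while keeping only one multiply and one add between registers. The function names, the register model, and the test values are assumptions made for illustration.

```python
def fir_direct(b, u):
    """Reference: y[k] = sum_i b[i] * u[k-i], with u[k] = 0 for k < 0."""
    return [sum(b[i] * u[k - i] for i in range(len(b)) if k - i >= 0)
            for k in range(len(u))]


def fir_transposed(b, u):
    """Transposed direct form: partial sums travel through the registers."""
    state = [0] * (len(b) - 1)           # registers between consecutive adders
    outputs = []
    for x in u:
        products = [bk * x for bk in b]  # the input is broadcast to all multipliers
        outputs.append(products[0] + (state[0] if state else 0))
        # Each register loads its product plus the next register's old value.
        state = [products[i + 1] + (state[i + 1] if i + 1 < len(state) else 0)
                 for i in range(len(state))]
    return outputs


b = [3, -1, 4, 1, 5]                     # hypothetical coefficients b0..b4
u = [2, 7, 1, 8, 2, 8, 1, 8]             # hypothetical input samples
same = fir_transposed(b, u) == fir_direct(b, u)
```

Because each register update involves a single product and a single addition, the sketch mirrors the critical path of one multiplier plus one adder stated above.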

Lattice Form Realizations
[Figure: construction of the lattice form from two coupled tapped delay lines over u[k] ... u[k-4], one using the coefficients b0 ... b4 and one using them in reversed order, producing the outputs y[k] and ỹ[k].]

FIR Filter Realizations: Lattice Form
[Figure: lattice structure with four stages, each containing a unit delay, two multipliers and two adders, with stage coefficients k0, k1, k2, k3, followed by a gain b0; input u[k], outputs y[k] and ỹ[k].]
\[ y[k] = b_0\,u[k] + b_1\,u[k-1] + \dots + b_N\,u[k-N] \]
i.e. different software/hardware, same I/O behavior.

Efficient Direct Form Realization
[Figure: efficient direct-form realization. Two chains of unit delays carry u[k]; pairs of delayed samples are added before being multiplied by b0 ... b4, and the products are summed to form y[k].]

Pin Diagram
[Figure: pin diagram of the 16-bit, 16-tap FIR filter block.]
• Pins shown: x[0] ... x[15], a[0] ... a[15], y[0] ... y[31], Drive, Reset, Coeffin, Din, Clk, Vdd, Gnd
• Synthesis using Synopsys Design Compiler
• Initial target frequency: 100 MHz (typical)

Specifications
• Input specifications:
 – 16-bit unsigned integers for data inputs.
 – 16-bit unsigned integers for coefficients.
• Output specifications:
 – 32-bit unsigned integer output.

System Components
• Memory: input and coefficient
• Control:
 – Mod-4 and mod-8 counters
 – 3-to-8 decoder
 – Combinational logic
• Multiplier:
 – Radix-8 Booth multiplier
 – Multiplier register
• Adder:
 – 9-bit carry-save adder
 – Adder register
• Output register

Specifications
• Drive signal (output signal):
 – Indicates that a new output is available.
 – Inputs or coefficients are to be applied only when Drive is asserted.
• Coefficients:
 – Any coefficient change implies a new filter definition.
 – The input memory is cleared, and new data is to be entered.

Specifications
• System clock:
 – One clock cycle for the filter = 32 input clock pulses.
 – One tap cycle = 8 input clock pulses, described as 8 phases.
 – 4 such taps for each output.
• System reset:
 – Active high.

System Timing
[Figure: timing chart over the mod-8 counter states, marking the phases in which each of the following events occurs.]
• Input or coefficient memory enable
• Multiplier propagation delay
• Multiplier propagation delay
• Multiplier register enable
• Add register enable
• Output register enable

System Timing Strategy
• Two-phase clocking.
• Generation of internal lower-frequency clocks using mod-4 and mod-8 counters.
• Each state of the mod-4 counter is used for the computation of one filter tap.
• The output is available at the end of one cycle of the mod-4 counter.
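The counter arrangement can be sketched in Python as follows. The sketch assumes that the mod-8 counter counts input clock pulses (the 8 phases) and that the mod-4 counter advances once per tap cycle; raising the output flag in the last phase of the last tap cycle is an assumption made for illustration, as are all names.

```python
def schedule(num_input_clocks):
    """List (clock, tap_state, phase, output_ready) for each input clock pulse."""
    events = []
    for clk in range(num_input_clocks):
        phase = clk % 8                  # mod-8 counter: phase within a tap cycle
        tap_state = (clk // 8) % 4       # mod-4 counter: one state per tap cycle
        output_ready = phase == 7 and tap_state == 3   # assumed end-of-cycle flag
        events.append((clk, tap_state, phase, output_ready))
    return events


events = schedule(64)
ready_clocks = [clk for clk, _, _, ready in events if ready]
```

With these assumptions, one output becomes available every 8 x 4 = 32 input clock pulses, matching the filter clock cycle given in the specifications.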

2-Parallel FIR Filtering Structure
[Figure: 2-parallel FIR structure. The inputs x(2k) and x(2k+1) feed four subfilters (H0, H1, H0, H1); two adders and one delay D (z^-2) combine the subfilter outputs into y(2k) and y(2k+1).]

Hardware-Efficient 2-Parallel FIR Filter
\[ Y_0 = X_0 H_0 + z^{-2} X_1 H_1 \]
\[ Y_1 = X_0 H_1 + X_1 H_0 = (H_0 + H_1)(X_0 + X_1) - H_0 X_0 - H_1 X_1 \]
[Figure: hardware-efficient 2-parallel structure using three subfilters (H0, H0+H1, H1), a pre-adder on the inputs x(2k) and x(2k+1), post-adders, and one delay D (z^-2), producing y(2k) and y(2k+1).]

Savings in the New Structure
• Originally:
 – 2N multiplications + 2(N-1) additions for two inputs
• In the new structure:
 – 3(N/2) = 1.5N multiplications
 – 3(N/2 - 1) + 4 = 1.5N + 1 additions
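The three-subfilter decomposition can be checked with a short Python sketch. It assumes an even number of coefficients and input samples, treats H0/H1 and X0/X1 as the even- and odd-indexed halves of h and x, and uses hypothetical helper names; it is a block-level model, not the hardware structure itself.

```python
def conv(a, b):
    """Full linear convolution of two sequences."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out


def vec_add(a, b):
    """Element-wise sum, zero-padding the shorter sequence."""
    n = max(len(a), len(b))
    a = a + [0] * (n - len(a))
    b = b + [0] * (n - len(b))
    return [p + q for p, q in zip(a, b)]


def fast_fir_2parallel(h, x):
    """2-parallel FIR using three half-length subfilters instead of four."""
    h0, h1 = h[0::2], h[1::2]
    x0, x1 = x[0::2], x[1::2]
    p0 = conv(h0, x0)                              # H0 X0
    p1 = conv(h1, x1)                              # H1 X1
    p2 = conv(vec_add(h0, h1), vec_add(x0, x1))    # (H0 + H1)(X0 + X1)
    y0 = vec_add(p0, [0] + p1)                     # Y0 = H0 X0 + z^-2 H1 X1
    y1 = [c - a - b for a, b, c in zip(p0, p1, p2)]  # Y1 = P2 - P0 - P1
    y = []
    for k in range(len(y0)):                       # interleave y(2k), y(2k+1)
        y.append(y0[k])
        y.append(y1[k] if k < len(y1) else 0)
    return y


h = [1, 2, 3, 4]                 # hypothetical coefficients
x = [5, 6, 7, 8, 9, 10]          # hypothetical input block
fast = fast_fir_2parallel(h, x)
reference = conv(h, x)
```

The delay z^-2 at the full sample rate appears as a shift by one position in the half-rate sequences, which is why `p1` is prefixed with a single zero.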

Design Flow: FIR 16-Tap Delay
[Figure: design flow chart.]
• Flow stages: Design Entry, Synthesis, Floor planning, Place & Route
• Verification stages: Functional Verification, Timing Verification, Physical Verification
• Formats and data labelled on the chart: VHDL, EDIF, PDEF, SDF, Parasitic

The FIR Filter
In the implementation of the 16-tap FIR filter, the coefficients are represented as fixed-point 16-bit 2's complement numbers. It is assumed that either or both of the coefficients and data are fractional numbers.

FIR Filter (Critical Path)
In order to save area and improve the critical path performance, we decided to add the 12-bit sum and carry results of the multiplier during the accumulation operation. Therefore, the adder has to add three 12-bit numbers. To do that, the first stage of the adder is a 3-to-2 combiner, which is just a CSA (carry-save adder). The next stage is a CPA (carry-propagate adder) arranged in a static Manchester carry chain form. The chain is divided into four sections, each of which has three carry stages. Buffers are used between sections to reduce the overall delay.

Survey of Multipliers
• Combinational multiplier: uses n adders, eliminates registers.
[Figure: combinational multiplier.]

Radix-2 Unsigned Multiplication
• Use a single n-bit adder, three registers (P, A, B), and a testing circuit for A_0.
• Initialization: place the unsigned numbers in registers A and B. Set P to zero.
• Repeat the following two steps n times:
 1. If A_0 is 1, then register B, containing b_{n-1} b_{n-2} ... b_0, is added to P; otherwise 00...00 (nothing) is added to P. The sum is placed back into P.
 2. Shift the register pair (P, A) one bit right. The last bit of A is shifted out (not used).
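The two steps above can be sketched in Python as follows. The function name is hypothetical, and the model assumes that the carry out of the n-bit addition is kept and shifted back into P, which the register description implies but does not state.

```python
def radix2_multiply(a, b, n):
    """Shift-and-add multiplication of two unsigned n-bit numbers."""
    mask = (1 << n) - 1
    if not (0 <= a <= mask and 0 <= b <= mask):
        raise ValueError("operands must fit in n bits")
    p = 0
    for _ in range(n):
        if a & 1:                 # step 1: test A0, add B to P if it is set
            p += b                # may produce a carry into bit n
        # step 2: shift the pair (P, A) right by one bit;
        # the low bit of P moves into the top bit of A, A0 is discarded
        a = ((p & 1) << (n - 1)) | (a >> 1)
        p >>= 1
    return (p << n) | a           # the 2n-bit product sits in (P, A)


product_6_by_9 = radix2_multiply(0b0110, 0b1001, 4)
```

After n iterations the multiplier bits have been shifted out of A and replaced by the low half of the product.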

Array Multiplier
• An array multiplier is an efficient layout of a combinational multiplier.
• Array multipliers may be pipelined to decrease the clock period at the expense of latency.

Array Multiplier Organization
Worked example (the array is skewed for a rectangular layout):

          0 1 1 0     Multiplicand
    x     1 0 0 1     Multiplier
          0 1 1 0
    +   0 0 0 0
        0 0 1 1 0
    + 0 0 0 0
      0 0 0 1 1 0
  + 0 1 1 0
    0 1 1 0 1 1 0     Product

Array Multiplier Organization
\[ t_{mult} \approx (M-1)\,t_{carry} + (N-1)\,t_{sum} + t_{and} \]
• For a small t_mult, both t_carry and t_sum matter.
• It is beneficial to make t_carry = t_sum -> differential logic (DCVS).
[Figure: array multiplier cell built around a full adder (FA) with inputs Xi, Yi, Pin, Cin and outputs Pout, Cout; the critical path through the array is annotated with N-1 (partial products) and M-1.]

Architecture of Array Multiplier
[Figure: 4x4 array multiplier with inputs X3 ... X0 and Y0 ... Y3, half adders (HA) at the row ends, and product outputs Z7 ... Z0.]

Advantages of Array Multiplier
• Array multipliers:
 – Partial product generation and accumulation are merged
 – Identical cells
 – High-rate pipelining
[Figure: 5x5 array of partial products a_i x_j (i, j = 0 ... 4), accumulated into the product bits p0 ... p9.]

Array Multiplier
• Array multiplier for unsigned numbers
[Figure: 5x5 unsigned array multiplier; the partial products a_i x_j enter rows of adder cells (with 0 applied to the unused inputs of the first row) and produce the product bits p9 ... p0.]

Array Multiplier for Two's Complement
• Type I cell:
 – Ordinary full adder
• Type II cell (input z has weight -1):
 – x + y - z = 2c - s
 – s = (x + y - z) mod 2
 – c = [(x + y - z) + s] / 2
 – Equivalent to a type I cell with inverted z and s: z = 1 - z', s = 1 - s'

Truth table of the type II cell:

  x  y  z | c  s
  0  0  0 | 0  0
  0  0  1 | 0  1
  0  1  0 | 1  1
  0  1  1 | 0  0
  1  0  0 | 1  1
  1  0  1 | 0  0
  1  1  0 | 1  0
  1  1  1 | 1  1
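The statement that a type II cell is a full adder with inverted z and s can be checked exhaustively. The following Python sketch uses hypothetical function names and simply enumerates all eight input combinations.

```python
from itertools import product


def full_adder(x, y, z):
    """Type I cell: returns (carry, sum) with x + y + z = 2*carry + sum."""
    total = x + y + z
    return total >> 1, total & 1


def type2_cell(x, y, z):
    """Type II cell built from a full adder with z and s inverted."""
    carry, s_inverted = full_adder(x, y, 1 - z)   # z' = 1 - z
    return carry, 1 - s_inverted                  # s = 1 - s'


# Rows (x, y, z, c, s) in the same order as the truth table above.
table = [(x, y, z, *type2_cell(x, y, z)) for x, y, z in product((0, 1), repeat=3)]
identity_holds = all(x + y - z == 2 * c - s for x, y, z, c, s in table)
```

Every row satisfies x + y - z = 2c - s, so the inverted full adder reproduces the table.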

Array Multiplier for Two's Complement
• Type II' cell:
 – -x - y + z = -2c + s, which is equivalent to x + y - z = 2c - s
 – Identical to the type II cell
[Figure: type II' cell with inputs x, y, z and outputs c, s, annotated with the weights -1 and -2.]

Architecture of Carry-Save Multiplier
[Figure: 4x4 carry-save multiplier; the critical path runs through the array of adder cells and then through the vector-merging adder.]
\[ t_{mult} = (N-1)\,t_{carry} + t_{and} + t_{merge} \]

Baugh-Wooley Multiplier
• Algorithm for two's-complement multiplication.
• Adjusts partial products to maximize the regularity of the multiplication array.
• Moves partial products with negative signs to the last steps; also adds the negation of partial products rather than subtracting.

Serial-Parallel Multiplier
• Used in serial-arithmetic operations.
• The multiplicand can be held in place by a register.
• The multiplier is shifted into the array.

Serial-Parallel Multiplier: Serial Multiplier
[Figure: serial multiplier built from a full adder whose carry output (Co) is fed back to its carry input (Ci) through a delay element (flip-flop), with gates G1 and G2, a reset input, N-1 stages, and a serial-to-parallel register on the sum output S; inputs X and Y.]
• Result: M+N bits, computed in M*N cycles.

Serial-Parallel Multiplier
[Figure: serial-parallel multiplier; the bits Y0, Y1, Y2, ..., Yn-1 are applied in parallel to a row of adder cells while X is applied serially.]

Serial-Parallel Multiplier
Partial-product matrix for a 4x4 multiplication:

                      X3    X2    X1    X0
                      Y3    Y2    Y1    Y0
                    X3Y0  X2Y0  X1Y0  X0Y0
              X3Y1  X2Y1  X1Y1  X0Y1
        X3Y2  X2Y2  X1Y2  X0Y2
  X3Y3  X2Y3  X1Y3  X0Y3
  P7    P6    P5    P4    P3    P2    P1    P0

Serial-Parallel Multiplier
\[ X = \sum_{i=0}^{m-1} X_i\,2^{i}, \qquad Y = \sum_{j=0}^{n-1} Y_j\,2^{j} \]
\[ P = X \times Y = \sum_{i=0}^{m-1} X_i\,2^{i} \cdot \sum_{j=0}^{n-1} Y_j\,2^{j} = \sum_{i=0}^{m-1} \sum_{j=0}^{n-1} (X_i Y_j)\,2^{i+j} = \sum_{k=0}^{m+n-1} P_k\,2^{k} \]
[Figure: one adder cell of the multiplier with inputs Xi, Yi and carry-in Ci, producing Pi+1 and carry-out Ci+1.]

The Architecture of the Booth Algorithm
• The Booth multiplier:
 – High-performance, low-power multiplier units are necessary in many situations, such as DSP systems.

Carry Save Addition
[Figure: 8x8 carry-save array with inputs X7 ... X0 and Y0 ... Y7; rows of full adders (FA) pass sums and carries downward, and a final CLA adder produces the result.]

Booth's Algorithm
[Figure: illustration of Booth's algorithm; no text recoverable.]

Booth Algorithm
With y_{-1} = 0:
• 1st order (radix-2):
\[ XY = x \sum_{i=0}^{n-1} \left( -y_i + y_{i-1} \right) 2^{i} \]
• 2nd order (radix-4):
\[ XY = x \sum_{i=0}^{n/2-1} \left( -2y_{2i+1} + y_{2i} + y_{2i-1} \right) 2^{2i} \]
• 3rd order (radix-8):
\[ XY = x \sum_{i=0}^{n/3-1} \left( -4y_{3i+2} + 2y_{3i+1} + y_{3i} + y_{3i-1} \right) 2^{3i} \]
• 4th order (radix-16):
\[ XY = x \sum_{i=0}^{n/4-1} \left( -8y_{4i+3} + 4y_{4i+2} + 2y_{4i+1} + y_{4i} + y_{4i-1} \right) 2^{4i} \]

Booth Encoding
• Encode a number by taking groups of 3 bits, where each 3-bit group overlaps its neighbor by 1 bit.
• Consider a multiplier B with (n + 1) bits:
 – Pad B with a 0 on the right to match the first term.
 – If B has an odd number of bits, then extend the sign: B_n B_n B_{n-1} ... B_0 0
• Each group is encoded as a digit E_j (with B_{-1} = 0):
\[ E_j = -2B_{2j+1} + B_{2j} + B_{2j-1} \]

Booth Multiplier
• Encoding scheme to reduce the number of stages in multiplication.
• Performs two bits of multiplication at once, so it requires half the stages.
• Each stage is slightly more complex than in a simple multiplier, but an adder/subtracter is almost as small and fast as an adder.

Booth Encoding
• Two's-complement form of the multiplier:
\[ y = -2^{n} y_n + 2^{n-1} y_{n-1} + 2^{n-2} y_{n-2} + \dots \]
• Rewrite using 2^a = 2^{a+1} - 2^a:
\[ y = 2^{n}(y_{n-1} - y_n) + 2^{n-1}(y_{n-2} - y_{n-1}) + 2^{n-2}(y_{n-3} - y_{n-2}) + \dots \]
• Consider the first two terms: by looking at three bits of y, we can determine whether to add 0, ±x, or ±2x to the partial product.
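A radix-4 recoding along these lines can be sketched in Python. The function names are hypothetical, and the sketch assumes the multiplier y is given as a signed integer that fits in `n_bits` two's-complement bits; an odd width is sign-extended by one bit, as the encoding rules require.

```python
def booth_radix4_digits(y, n_bits):
    """Radix-4 Booth digits d_j in {-2, ..., 2} with y == sum(d_j * 4**j)."""
    if not -(1 << (n_bits - 1)) <= y < (1 << (n_bits - 1)):
        raise ValueError("y does not fit in n_bits two's-complement bits")
    if n_bits % 2:
        n_bits += 1                       # sign-extend to an even number of bits
    # Python's >> on negative integers is arithmetic, so this yields
    # the two's-complement bits y_0 .. y_{n_bits-1}.
    bits = [(y >> i) & 1 for i in range(n_bits)]
    digits = []
    previous = 0                          # y_{-1} = 0
    for i in range(0, n_bits, 2):
        digits.append(-2 * bits[i + 1] + bits[i] + previous)
        previous = bits[i + 1]            # groups overlap by one bit
    return digits


def booth_multiply(x, y, n_bits):
    """Sum the selected multiples 0, +/-x, +/-2x, each weighted by 4**j."""
    return sum(d * x * 4 ** j for j, d in enumerate(booth_radix4_digits(y, n_bits)))


digits_for_minus_3 = booth_radix4_digits(-3, 8)
```

An 8-bit multiplier yields only four digits, which is the halving of the number of stages mentioned for the Booth multiplier.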

Booth Multiplier
[Figure: Booth multiplier block diagram. The multiplicand x (8 bits) passes through an inverter/shift stage and a selector offering x and 2x; a Booth decoder driven by the multiplier bits y0 ... y8 controls the selection; the selected partial products are summed by a Wallace tree followed by CLA adders.]

Array Multiplier Cell for Booth's Algorithm
[Figure: multiplier cell consisting of a multiplexer (MUX) and a full adder. The select input chooses among 0, (A)i, (-A)i, (2A)i and (-2A)i; the full adder combines the chosen bit with the inputs cin and sin to produce cout and sout.]

8/6/2019 Proj Desai Mousa 16 Tap Filter

http://slidepdf.com/reader/full/proj-desai-mousa-16-tap-filter 67/85

6767

S0 S0 S0 S0 S0 S0 S0 S0 - - - - - - - -

S1 S1 S1 S1 S1 S1 - - - - - - - -

S2 S2 S2 S2 - - - - - - - -

S3 S3 - - - - - - - -

Sign

extension

)2(0)2(1)2(2)2(3

)222(0)222(1)222(2)222(3

)22222222(0

)222222(1)2222(2)22(3

0246

077277477677

01234567

234567456767

!

!

S S S S 

S S S S 

S S S 

1 S3  1 S2  1 S1  1 S0+1

Sign Ext ension Red uction Sign Ext ension Red uction 
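The identity behind sign extension reduction can be checked exhaustively for the four-row, eight-column case on this slide. The following Python sketch is an illustration only; the function names are hypothetical, and ~S stands for the complemented sign bit.

```python
from itertools import product


def sign_extension_sum(signs):
    """Sum of the sign-extension bits in the eight-column field, modulo 2**8.

    signs = (S0, S1, S2, S3); row k repeats S_k in columns 2k through 7.
    """
    total = 0
    for k, s in enumerate(signs):
        total += s * sum(1 << col for col in range(2 * k, 8))
    return total % 256


def reduced_constant(signs):
    """Value of the row 1 ~S3 1 ~S2 1 ~S1 1 ~S0 plus 1, modulo 2**8."""
    value = 0
    for k, s in enumerate(signs):
        value |= (1 - s) << (2 * k)    # complemented sign bit in column 2k
        value |= 1 << (2 * k + 1)      # constant 1 in column 2k + 1
    return (value + 1) % 256


# Compare both forms for all 16 combinations of sign bits.
matches = all(
    sign_extension_sum(s) == reduced_constant(s)
    for s in product((0, 1), repeat=4)
)
print(matches)   # True
```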

Wallace Tree

Reduces depth of adder chain.

Built from carry-save adders:

- three inputs a, b, c
- produces two outputs y, z such that y + z = a + b + c

Carry-save equations:

- y_i = parity(a_i, b_i, c_i)
- z_i = majority(a_i, b_i, c_i), weighted as a carry into bit i+1
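The carry-save equations can be tried out on whole words with bitwise operators. This Python sketch assumes non-negative integer operands; the name `carry_save_add` is hypothetical.

```python
def carry_save_add(a: int, b: int, c: int) -> tuple[int, int]:
    """Reduce three operands to a sum word y and a carry word z with no carry chain."""
    y = a ^ b ^ c                             # parity of each bit column
    z = ((a & b) | (a & c) | (b & c)) << 1    # majority of each column, moved to bit i+1
    return y, z


y, z = carry_save_add(22, 11, 5)
print(y, z, y + z)   # 24 14 38
```

Every bit position is computed independently, so the delay of one carry-save stage does not grow with word width; only the final y + z needs a carry-propagating adder.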

Wallace Tree Structure

7-bit Wallace Tree Addition

Wallace Tree Operation

At each stage, i numbers are combined to form ceil(2i/3) sums.

Final adder completes the summation.

Wiring is more complex.

Can build a Booth-encoded Wallace tree multiplier.
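Applying the ceil(2i/3) rule repeatedly gives the number of carry-save stages needed before the final adder. The Python sketch below is an illustration; the function name `wallace_stage_counts` is hypothetical.

```python
def wallace_stage_counts(n: int) -> list[int]:
    """Operand count after each carry-save stage, until two operands remain."""
    counts = [n]
    while counts[-1] > 2:
        i = counts[-1]
        counts.append((2 * i + 2) // 3)   # integer form of ceil(2i/3)
    return counts


print(wallace_stage_counts(7))    # [7, 5, 4, 3, 2]
print(wallace_stage_counts(16))   # [16, 11, 8, 6, 4, 3, 2]
```

Seven operands need four stages and sixteen need six, compared with five and fourteen carry-save stages respectively when the operands are added one after another in a linear chain.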

CSA vs. Wallace Tree

[Figure: two arrangements of full adders (FA) that reduce inputs 1 through 6 to a carry output C and a sum output S.]

Radix-4 Modified Booth's Algorithm

A                      = 0 1 0 1 1 0   (22)
X                      = 0 0 1 0 1 1   (11)
Y (recoded multiplier) = 0 1 0 -1 0 -1   (digits +1, -1, -1)

Partial products, each written as a constant 1, the inverted sign bit, and the low five bits, together with the extra +1 from sign extension reduction:

            1
          1 0 0 1 0 1 0      (-A)
      1 0 0 1 0 1 0          (-A, shifted two bits)
  1 1 1 0 1 1 0              (+A, shifted four bits)
  ---------------------
1 0 0 0 1 1 1 1 0 0 1 0

The leading 1 is the carry out of the 11-bit field; the remaining bits 0 0 0 1 1 1 1 0 0 1 0 equal 242 = 22 x 11.
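The worked example can be reproduced numerically. The Python sketch below hard-codes the recoded digits of X = 11 and assumes the row format used on the sign extension reduction slide (a constant 1, the inverted sign bit, then the low five bits); the helper name `reduced_row` is hypothetical and sized for this example only.

```python
def reduced_row(value: int, width: int = 5) -> int:
    """Encode a signed partial product as 1, ~S, then its low `width` bits."""
    sign = 1 if value < 0 else 0
    low_bits = value & ((1 << width) - 1)
    return (1 << (width + 1)) | ((1 - sign) << width) | low_bits


a = 22
digits = [-1, -1, 1]    # radix-4 Booth digits of X = 11, least significant first
rows = [reduced_row(d * a) for d in digits]

# Shift row k left by 2k bits and add the single correction 1 in the ~S0 column.
total = sum(row << (2 * k) for k, row in enumerate(rows)) + (1 << 5)

print([format(r, "07b") for r in rows])   # ['1001010', '1001010', '1110110']
print(format(total, "012b"))              # 100011110010
print(total & 0x7FF)                      # 242
```

Masking with 0x7FF discards the carry out of the 11-bit field and leaves the product.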

Wallace Tree

[Figure: two arrangements of four full adders (FA) over inputs y0 through y5, with carry inputs C_{i-1}, carry outputs C_i, and outputs C and S.]

Collapse the chain of FAs over y0-y5 (5 adder delays) into a Wallace tree with 4 adder delays.

Floor Plan of Multiplier

In the actual datapath:

[Figure: multiplier floor plan. Labels: x, Y, LSB, MSB, M1, M2 or M3.]

Floor Plan

[Figure: chip floor plan. Labeled blocks: Adder, Add Reg, Out Reg, Multiplier, Multiplier Reg, Control Block, Coefficient Memory, Input Memory, Routing.]

Floor Planning

Results

Cell                      Value
Number of Ports           34
Number of Nets            157
Number of Cells           32
Combinational Area        24286.050781
Non-Combinational Area    14935.535156
Total Area                39221.585938

Main Module

Booth Multiplier

Core Module

Controller Module

Conclusion

Good design experience.

Using a parallel FIR filter realization reduced the number of multipliers and adders needed, so area shrank and power consumption was lowered.

Timing strategies: using non-blocking assignments in Verilog reduced the number of states needed for implementation.

Partitioning the design into submodules made the design more manageable and optimized.

Performance optimization was reached with a slack time of +9.54.