25
1 © 1999 Berkeley Design Technology, Inc. www.BDTI.com Workshop #311 Understanding the New DSP Processor Architectures Copyright © 1999 Berkeley Design Technology, Inc. 1 Understanding the New DSP Processor Architectures Berkeley Design Technology, Inc. 2107 Dwight Way, Second Floor Berkeley, California U.S.A. +1 (510) 665-1600 [email protected] http://www.BDTI.com © 1999 Berkeley Design Technology, Inc. 2 Outline u DSP architectural basics u Improved performance through increased parallelism l Allowing more operations per instruction • Enhanced conventional DSPs • Single-instruction, multiple-data (SIMD) l Issuing multiple instructions per instruction cycle • VLIW (very long instruction word) DSPs • Superscalar DSPs u CPUs with SIMD extensions u DSP/microcontroller hybrids

Understanding the New DSP Processor Architectures · Understanding the New DSP Processor Architectures ... Understanding the New DSP Processor Architectures ... • Enhance the instruction

Embed Size (px)

Citation preview

Page 1: Understanding the New DSP Processor Architectures · Understanding the New DSP Processor Architectures ... Understanding the New DSP Processor Architectures ... • Enhance the instruction

1

© 1999 Berkeley Design Technology, Inc.www.BDTI.com

Workshop #311Understanding the New DSP Processor Architectures

Copyright © 1999 Berkeley Design Technology, Inc.1

Understanding the New DSP Processor Architectures

Berkeley Design Technology, Inc.2107 Dwight Way, Second Floor

Berkeley, California U.S.A.

+1 (510) [email protected]

http://www.BDTI.com

© 1999 Berkeley Design Technology, Inc.2

Outline

u DSP architectural basicsu Improved performance through increased parallelism

l Allowing more operations per instruction• Enhanced conventional DSPs• Single-instruction, multiple-data (SIMD)

l Issuing multiple instructions per instruction cycle• VLIW (very long instruction word) DSPs• Superscalar DSPs

u CPUs with SIMD extensionsu DSP/microcontroller hybrids

Page 2: Understanding the New DSP Processor Architectures · Understanding the New DSP Processor Architectures ... Understanding the New DSP Processor Architectures ... • Enhance the instruction

2

© 1999 Berkeley Design Technology, Inc.www.BDTI.com

Workshop #311Understanding the New DSP Processor Architectures

© 1999 Berkeley Design Technology, Inc.3

: ::

55 55

A Motivating Example:FIR Filtering

x = input samplesy = output samplesh = filter coefficientsD = unit time delay

y

hn-1

D D

hn-2

...

...

hk

...

...

h0

x

a "tap"

Filter characteristicsgoverned by number of taps, selection

of filter coefficients.

ΣΣ x∗∗h

DD

© 1999 Berkeley Design Technology, Inc.4

FIR Filter on a Low-CostMicrocontroller

loop:

mov*r0,x0mov*r1,y0

mpyx0,y0,a

adda,b

movy0,*r2

incr0

incr1

incr2

decctr

tstctr

jnzloop

MemoryData Path

Problems:•• Memory bandwidth bottleneck• Control code and addressing

overhead• Possibly slow multiply

(Computes one tap per loop iteration)(Computes one tap per loop iteration)

Page 3: Understanding the New DSP Processor Architectures · Understanding the New DSP Processor Architectures ... Understanding the New DSP Processor Architectures ... • Enhance the instruction

3

© 1999 Berkeley Design Technology, Inc.www.BDTI.com

Workshop #311Understanding the New DSP Processor Architectures

© 1999 Berkeley Design Technology, Inc.5

Early DSP Architecture

Data PathData Path

ProgramProgramMemoryMemory

DataDataMemoryMemory

RegisterRegister

MultMult

ALUALU

AccumulatorAccumulator

Memory StructureMemory Structure

Data PathData Path

© 1999 Berkeley Design Technology, Inc.6

FIR Filter on a Conventional DSP

DO dotprod UNTIL CE;dotprod: MR=MR+MX0*MY0(SS),MX0=DM(I0,M0),MY0=PM(I4,M4);

Page 4: Understanding the New DSP Processor Architectures · Understanding the New DSP Processor Architectures ... Understanding the New DSP Processor Architectures ... • Enhance the instruction

4

© 1999 Berkeley Design Technology, Inc.www.BDTI.com

Workshop #311Understanding the New DSP Processor Architectures

© 1999 Berkeley Design Technology, Inc.7

Baseline: "Conventional DSPs"

u Common attributes:

l 16- or 24-bit fixed-point (fractional), or 32-bit floating-point arithmetic

l 16-, 24-, or 32-bit instructions

l One instruction per cycle ("single issue")

l Complex, "compound" instructions encodingmany operations

l Highly constrained, non-orthogonal architectures

© 1999 Berkeley Design Technology, Inc.8

Baseline: "Conventional DSPs"

u Common attributes (cont.):

l Dedicated addressing hardware w/ specializedaddressing modes

l Multiple-access on-chip memory architecture

l Dedicated hardware for loops and otherexecution control

l Specialized on-chip peripherals and I/O interfaces

l Low cost, low power, low memory usage

Page 5: Understanding the New DSP Processor Architectures · Understanding the New DSP Processor Architectures ... Understanding the New DSP Processor Architectures ... • Enhance the instruction

5

© 1999 Berkeley Design Technology, Inc.www.BDTI.com

Workshop #311Understanding the New DSP Processor Architectures

© 1999 Berkeley Design Technology, Inc.9

Increasing Parallelism

u Boosting performance beyond the increases afforded by faster clock speeds requires the processor to do more work in every clock cycle. How?

u By increasing the processors' parallelism in one of the following ways:1. Increase the number of operations that can be

performed in each instruction2. Increase the number of instructions that can be

issued and executed in every cycle

© 1999 Berkeley Design Technology, Inc.10

1. More Operations Per Instruction

u How to increase the number of operations that can be performed in each instruction?

l Add execution units (multiplier, adder, etc.)• Enhance the instruction set to take advantage

of the additional hardware• Possibly, increase the instruction word width• Use wider buses to keep the processor fed with

data

l Add SIMD (single instruction, multiple data) capabilities

Page 6: Understanding the New DSP Processor Architectures · Understanding the New DSP Processor Architectures ... Understanding the New DSP Processor Architectures ... • Enhance the instruction

6

© 1999 Berkeley Design Technology, Inc.www.BDTI.com

Workshop #311Understanding the New DSP Processor Architectures

© 1999 Berkeley Design Technology, Inc.11

2. More Instructions Per Clock Cycle

u How to increase the number of instructions that are issued and executed in every clock cycle?l Use VLIW techniquesl Use superscalar techniques

u VLIW and superscalar architectures typically use simple, RISC-based instructionsl More orthogonal than the complex, compound

instructions traditionally used in DSP processors

© 1999 Berkeley Design Technology, Inc.12

New Architectures for DSP

u Enhanced conventional DSPs

l Examples: Lucent DSP16xxx, ADI ADSP-2116x

u VLIW (Very Long Instruction Word) DSPs

l Examples: TI TMS320C6xxx, Infineon Carmel

u Superscalar DSPs

l Example: ZSP ZSP164xx

u General-purpose processors, hybrids:

l Examples: PowerPC with AltiVec, TriCore

Page 7: Understanding the New DSP Processor Architectures · Understanding the New DSP Processor Architectures ... Understanding the New DSP Processor Architectures ... • Enhance the instruction

7

© 1999 Berkeley Design Technology, Inc.www.BDTI.com

Workshop #311Understanding the New DSP Processor Architectures

© 1999 Berkeley Design Technology, Inc.13

Enhanced Conventional DSPs

More parallelism via:

u Multi-operation data path

l e.g., 2nd multiplier, adder

l SIMD capabilities (ranging from limited to extensive)

u Highly specialized hardware in core

l e.g., application-oriented data path operations

u Co-processors

l Viterbi decoding, FIR filtering, etc.

Example: Lucent DSP16xxx, ADI ADSP-2116x

© 1999 Berkeley Design Technology, Inc.14

Lucent's DSP16xxConventional DSP

16x16

ALU Bit Manip.

Two Accumulators

I Data Bus (16)

X Data Bus (16)

Page 8: Understanding the New DSP Processor Architectures · Understanding the New DSP Processor Architectures ... Understanding the New DSP Processor Architectures ... • Enhance the instruction

8

© 1999 Berkeley Design Technology, Inc.www.BDTI.com

Workshop #311Understanding the New DSP Processor Architectures

© 1999 Berkeley Design Technology, Inc.15

Lucent's DSP16xxxEnhanced Conventional DSP

16x16

ALU Bit Manip.

Eight Accumulators

I Data Bus (32)

X Data Bus (32)

16x16

Adder

Dual-MAC architecture

© 1999 Berkeley Design Technology, Inc.16

FIR Filtering on the DSP16xxx

Do nTaps/2

a0=a0+p0+p1 p0=xh*yh p1=xl*yl y=*r0++ x=*pt0++

Compare to FIR filter code for predecessor, DSP16xx:Do nTaps

a0=a0+p p=x*y y=*r0++ x=*pt++

Page 9: Understanding the New DSP Processor Architectures · Understanding the New DSP Processor Architectures ... Understanding the New DSP Processor Architectures ... • Enhance the instruction

9

© 1999 Berkeley Design Technology, Inc.www.BDTI.com

Workshop #311Understanding the New DSP Processor Architectures

© 1999 Berkeley Design Technology, Inc.17

Enhanced Conventional DSPs

u Advantages:

l Allows incremental performance increases while maintaining competitive cost, power, code density

l Compatibility is possible; similarity is likely

u Disadvantages:

l Increasingly complex, hard-to-program architecturesl Poor compiler targets

l How much farther can we get with this approach?

© 1999 Berkeley Design Technology, Inc.18

SIMDSingle Instruction, Multiple Data

u Splits words into smaller chunks for parallel operationsu Some SIMD processors support multiple data widths (16-

bit, 8-bit, ..)u Examples: Lucent DSP16xxx, ADI ADSP-2116x,

ADI TigerSHARC

16 bits 16 bits 16 bits 16 bits

16 bits 16 bits

++ −− ×× ++ −− ××

Page 10: Understanding the New DSP Processor Architectures · Understanding the New DSP Processor Architectures ... Understanding the New DSP Processor Architectures ... • Enhance the instruction

10

© 1999 Berkeley Design Technology, Inc.www.BDTI.com

Workshop #311Understanding the New DSP Processor Architectures

© 1999 Berkeley Design Technology, Inc.19

SIMD Extensions

u SIMD is becoming more and more common in DSP processors

l Limited SIMD capabilities on the DSP16xxx

l Full SIMD capabilities (enabled by dual data paths) onADI's ADSP-2116x

l "Hierarchical" SIMD capabilities on TigerSHARC

© 1999 Berkeley Design Technology, Inc.20

SIMD Extensions

u SIMD extensions for CPUs are also common. Why?

l Make good use of existing wide resources• Buses, data path

l Significantly accelerate many DSP/image/video algorithms without a radical architectural change

l Examples include Pentium with MMX/SSE, PowerPC G4 with AltiVec, ...

Page 11: Understanding the New DSP Processor Architectures · Understanding the New DSP Processor Architectures ... Understanding the New DSP Processor Architectures ... • Enhance the instruction

11

© 1999 Berkeley Design Technology, Inc.www.BDTI.com

Workshop #311Understanding the New DSP Processor Architectures

© 1999 Berkeley Design Technology, Inc.21

SIMD Challenges

u Algorithms, data organization must be amenable to data-parallel processingl Programmers must be creative, and sometimes pursue

alternative algorithmsl Reorganization penalties can be significantl Most effective on algorithms that process large blocks

of data–not very useful on single-sample algorithms

© 1999 Berkeley Design Technology, Inc.22

SIMD Challenges

u Loss of generalityl Each instruction processes N elements

(typically 4 ≤ N ≤ 8)l Loops often must be unrolled for speed

u High program memory usagel Loop unrollingl Re-arranging data for SIMD processingl Merging partial results

u Often, only fixed-point supported

Page 12: Understanding the New DSP Processor Architectures · Understanding the New DSP Processor Architectures ... Understanding the New DSP Processor Architectures ... • Enhance the instruction

12

© 1999 Berkeley Design Technology, Inc.www.BDTI.com

Workshop #311Understanding the New DSP Processor Architectures

© 1999 Berkeley Design Technology, Inc.23

Analog Devices' ADSP-2106xConventional DSP

ALUALU MACMAC ShifterShifter

Program Memory (PM) Address Bus (24)

Data Memory (DM) Address Bus (32)

Program Memory (PM) Data Bus (48)

Data Memory (DM) Data Bus (40)

ExternalAddress Bus

(32)

ExternalData Bus

(48)

© 1999 Berkeley Design Technology, Inc.24

Analog Devices' ADSP-2116xEnhanced Conventional with SIMD

ALUALU MACMAC ShifterShifter

ALUALU MACMAC ShifterShifter

Program Memory (PM) Address Bus (32)

Data Memory (DM) Address Bus (32)

Program Memory (PM) Data Bus (64)

Data Memory (DM) Data Bus (64)

ExternalAddress Bus

(32)

ExternalData Bus

(64)

ADSP-2116x adds2nd data path,wider buses

Page 13: Understanding the New DSP Processor Architectures · Understanding the New DSP Processor Architectures ... Understanding the New DSP Processor Architectures ... • Enhance the instruction

13

© 1999 Berkeley Design Technology, Inc.www.BDTI.com

Workshop #311Understanding the New DSP Processor Architectures

© 1999 Berkeley Design Technology, Inc.25

Superscalar vs VLIW

Memory

INS 1INS 1

INS 2INS 2

INS 3INS 3

INS nINS n

••••••••••••

??

ALUALU MACMAC BMUBMU •••• •••• ••••

Execution UnitsInstruction scheduling,dispatch

INS 1INS 1 INS 2INS 2

INS 3INS 3 INS 4INS 4

INS 5INS 5INS 6INS 6

time

© 1999 Berkeley Design Technology, Inc.26

VLIW (Very Long Instruction Word)

Examples of current & upcoming VLIW-based architectures for DSP applications:

l TI TMS320C6xxx, Infineon Carmel, ADI TigerSHARC,StarCore SC140

Characteristics:l Multiple independent instructions per cycle, packed

into single large "super-instruction" or "packet"

l More regular, orthogonal, RISC-like operations

l Large, uniform register sets

Page 14: Understanding the New DSP Processor Architectures · Understanding the New DSP Processor Architectures ... Understanding the New DSP Processor Architectures ... • Enhance the instruction

14

© 1999 Berkeley Design Technology, Inc.www.BDTI.com

Workshop #311Understanding the New DSP Processor Architectures

© 1999 Berkeley Design Technology, Inc.27

Example VLIW Data Path ('C62xx)

L: ALUS: Shifter, ALUM: MultiplierD: Address gen.

On-Chip Program Memory

Register File A

L1 S1 M1 D1

Register File B

L2 S2 M2 D2

On-Chip Data Memory

2 independentdata paths,

8 execution units

32x8=256 bits(8 instructions)

3232 3232

Dispatch Unit

© 1999 Berkeley Design Technology, Inc.28

FIR Filtering on the 'C62xx

LOOP:

ADD .L1 A0,A3,A0

||ADD .L2 B1,B7,B1

||MPYHL .M1X A2,B2,A3

||MPYLH .M2X A2,B2,B7

||LDW .D2 *B4++,B2

||LDW .D1 *A7--,A2

||[B0] ADD .S2 -1,B0,B0

||[B0] B .S1 LOOP

; LOOP ends here

Page 15: Understanding the New DSP Processor Architectures · Understanding the New DSP Processor Architectures ... Understanding the New DSP Processor Architectures ... • Enhance the instruction

15

© 1999 Berkeley Design Technology, Inc.www.BDTI.com

Workshop #311Understanding the New DSP Processor Architectures

© 1999 Berkeley Design Technology, Inc.29

VLIW Architectures

u Advantages:

l Increased performance

l More regular architectures• Potentially easier to program, better compiler

targets

l Scalable (?)

© 1999 Berkeley Design Technology, Inc.30

VLIW Architectures

u Disadvantages:

l New kinds of programmer/compiler complexity

• Programmer (or code-generation tool) must keep track of instruction scheduling

• Some VLIW processors have deep pipelines and long latencies--can be confusing, may make peak performance elusive

l Code size bloat

• High program memory bandwidth requirements

l High power consumption

Page 16: Understanding the New DSP Processor Architectures · Understanding the New DSP Processor Architectures ... Understanding the New DSP Processor Architectures ... • Enhance the instruction

16

© 1999 Berkeley Design Technology, Inc.www.BDTI.com

Workshop #311Understanding the New DSP Processor Architectures

© 1999 Berkeley Design Technology, Inc.31

Superscalar Architectures

Current superscalar architectures for DSP apps:

l ZSP ZSP164xx, Infineon TriCore (DSP/µC hybrid)

Characteristics:

l Borrow techniques from high-end CPUs

l Multiple (usually 2-4) instructions issued per instruction cycle• Instruction scheduling handled in hardware, not by

programmer/toolsl RISC-like instruction setl Lots of parallelism

© 1999 Berkeley Design Technology, Inc.32

FIR Filtering on the ZSP164xx

LOOP: LDDU R4, R14, 2

LDDU R8, R15, 2

MAC2.A R4, R8

AGN0 LOOP

(All four instructions execute in a single cycle)

Page 17: Understanding the New DSP Processor Architectures · Understanding the New DSP Processor Architectures ... Understanding the New DSP Processor Architectures ... • Enhance the instruction

17

© 1999 Berkeley Design Technology, Inc.www.BDTI.com

Workshop #311Understanding the New DSP Processor Architectures

© 1999 Berkeley Design Technology, Inc.33

Superscalar Architectures

u Advantages:

l Large jump in performance

l More regular architectures (potentially easier to program, better compiler targets)

l Programmer (or code generation tool) isn't required to schedule instructions

• But peak performance may be hard to achieve without hand-tweaking

• Code size not increased significantly

© 1999 Berkeley Design Technology, Inc.34

Superscalar Architectures

u Disadvantages:

l Energy consumption is a major challenge

l Dynamic behavior complicates software development

• Execution-time variability can be a hazard–how to guarantee meeting real-time constraints?

– Coding for worst-case behavior may leave a lot of a processor's performance untapped

• Code optimization is challenging– If you can't tell how long a segment takes to execute,

how can you tell if an optimization improved the performance?

Page 18: Understanding the New DSP Processor Architectures · Understanding the New DSP Processor Architectures ... Understanding the New DSP Processor Architectures ... • Enhance the instruction

18

© 1999 Berkeley Design Technology, Inc.www.BDTI.com

Workshop #311Understanding the New DSP Processor Architectures

© 1999 Berkeley Design Technology, Inc.35

Comparing Architectural EfficiencyBlock FIR Filter Benchmark

0

200

400

600

800

1000

1200

1400 Instruction Cycles

DSP16xx

ADSP-2116x

DSP16xxx

ADSP-2106x

TMS320C67xx

TMS320C3x

TMS320C54x

fixed-point resultsfloating-point results

TMS320C62xx

Co

nve

nti

on

al

VL

IW

En

han

ced

Co

nv

En

han

ced

Co

nv

+SIM

D

VL

IW

Co

nve

nti

on

al

Co

nve

nti

on

al

Co

nve

nti

on

al

© 1999 Berkeley Design Technology, Inc.36

SpeedBlock FIR Filter Benchmark

0

1

2

3

4

5

6

7

mic

rose

cond

s(lo

wer

is fa

ster

)

'320C6202250 MHz

DSP16210120 MHz

'320C6701167 MHz

LSI401Z200 MHz

'320C549120 MHz

ADSP-21160 100 MHz

fixed-point resultsfloating-point results

projected

Page 19: Understanding the New DSP Processor Architectures · Understanding the New DSP Processor Architectures ... Understanding the New DSP Processor Architectures ... • Enhance the instruction

19

© 1999 Berkeley Design Technology, Inc.www.BDTI.com

Workshop #311Understanding the New DSP Processor Architectures

© 1999 Berkeley Design Technology, Inc.37

Energy ConsumptionBlock FIR Filter Benchmark

0

2

4

6

8

10

12

14

16

18

wat

t-mic

rose

cond

s(lo

wer

is fa

ster

)

'320C6202

250 MHz, 1.8 V

DSP16210

120 MHz, 2.5 V

'320C6701

167 MHz, 1.8 V

'320UC5402

80 MHz, 1.8 V

ADSP-21160

100 MHz, 2.5 V

fixed-point resultsfloating-point results

projected

© 1999 Berkeley Design Technology, Inc.38

Memory UsageFinite State Machine Benchmark

0

50

100

150

200

250

byte

s(lo

wer

is b

ette

r)

'320C6202

DSP16210

'320C6701

LSI401Z

'320C549

ADSP-21160

fixed-point resultsfloating-point results

projected

Page 20: Understanding the New DSP Processor Architectures · Understanding the New DSP Processor Architectures ... Understanding the New DSP Processor Architectures ... • Enhance the instruction

20

© 1999 Berkeley Design Technology, Inc.www.BDTI.com

Workshop #311Understanding the New DSP Processor Architectures

© 1999 Berkeley Design Technology, Inc.39

High-Performance GPPs with SIMD

u Most high-performance GPPs targeting desktop applications are superscalar architecturesl Pentium, PowerPC

u Often have many dynamic features to accelerate performancel Sophisticated, multi-level cachesl Branch predictionl Speculative execution

u Most offer SIMD extensions to increase performance on DSP and multimedia applications (audio, video) l MMX/SSE, AltiVec

© 1999 Berkeley Design Technology, Inc.40

High-Performance GPPs with SIMD

u These processors can often execute DSP tasks faster than DSP processors

u So why do people still use DSPs?l Price l Power consumptionl Availability of off-the-shelf DSP software, DSP-

oriented development toolsl DSP-oriented on-chip integrationl Execution-time predictability is especially

problematic with high-performance GPPs

Page 21: Understanding the New DSP Processor Architectures · Understanding the New DSP Processor Architectures ... Understanding the New DSP Processor Architectures ... • Enhance the instruction

21

© 1999 Berkeley Design Technology, Inc.www.BDTI.com

Workshop #311Understanding the New DSP Processor Architectures

© 1999 Berkeley Design Technology, Inc.41

An Illustrative ExampleExecution-Time Predictability

Vector add on PowerPC 604e:

@vec_add_loop:lfsu fpTemp1,4(rAAddr) # Load A data, ptr. updatelfsu fpTemp2,4(rBAddr) # Load B data, ptr. updatefadds fpSum,fpTemp1,fpTemp2 # Perform add operationstfsu fpSum,4(rCAddr) # Store sum, ptr. update

bdnz @vec_add_loop # loop

Q: How many instruction cycles per iteration?

© 1999 Berkeley Design Technology, Inc.42

Hybrid DSP/Microcontrollers

u GPPs for embedded applications are starting to address DSP needs

u Embedded GPPs typically don't have the advanced features that affect execution-time predictability, so are easier to use for DSP

u There are a wide variety of approaches to combining DSP and microcontroller functionality

Page 22: Understanding the New DSP Processor Architectures · Understanding the New DSP Processor Architectures ... Understanding the New DSP Processor Architectures ... • Enhance the instruction

22

© 1999 Berkeley Design Technology, Inc.www.BDTI.com

Workshop #311Understanding the New DSP Processor Architectures

© 1999 Berkeley Design Technology, Inc.43

Hybrid DSP/MicrocontrollersApproaches

l Multiple processors on a die• e.g., Motorola DSP5665x

l DSP co-processor• e.g., Massana FILU-200, ARM Piccolo

l DSP brain transplant in existing µC• e.g., SH-DSP

l Microcontroller tweaks to existing DSP• e.g., TMS320C27xx

l Totally new design• e.g., TriCore

© 1999 Berkeley Design Technology, Inc.44

Hybrid DSP/MicrocontrollersAdvantages, Disadvantages

l Multiple processors on a die• Two entirely different instruction sets,

debugging tools, etc.• Both cores can operate in parallel• No resource contention...• ..but probably resource duplication

Page 23: Understanding the New DSP Processor Architectures · Understanding the New DSP Processor Architectures ... Understanding the New DSP Processor Architectures ... • Enhance the instruction

23

© 1999 Berkeley Design Technology, Inc.www.BDTI.com

Workshop #311Understanding the New DSP Processor Architectures

© 1999 Berkeley Design Technology, Inc.45

Hybrid DSP/MicrocontrollersAdvantages, Disadvantages

l DSP co-processor• May result in complicated programming model

– Dual instruction sets– In ARM7/Piccolo case, possible deadlocks

• Possible resource contention– e.g., Piccolo requires ARM7 to perform all data

transfers

• May allow both cores to operate in parallel

© 1999 Berkeley Design Technology, Inc.46

Hybrid DSP/MicrocontrollersAdvantages, Disadvantages

l DSP brain transplant in existing µC,microcontroller tweaks to existing DSP• Simpler programming model than dual cores• Constraints imposed by "legacy" architecture

l Totally new design• Avoids legacy constraints• May result in a cleaner architecture• Adopting a totally new architecture can be risky

Page 24: Understanding the New DSP Processor Architectures · Understanding the New DSP Processor Architectures ... Understanding the New DSP Processor Architectures ... • Enhance the instruction

24

© 1999 Berkeley Design Technology, Inc.www.BDTI.com

Workshop #311Understanding the New DSP Processor Architectures

© 1999 Berkeley Design Technology, Inc.47

Execution TimesComplex Block FIR Filter Benchmark

0

10

20

30

40

50

60

70

80

90

TMS320C549120 MHz

SH-DSP66 MHz

TriCore66 MHz

Pentium III600 MHz

DSP5681235 MHz

TMS320C2700150 MHz

ADSP-2189M75 MHz

mic

rose

cond

s

fixed-point resultsfloating-point results

(lower is better)

© 1999 Berkeley Design Technology, Inc.48

Conclusions

u The variety, performance range of processors for DSP is explodingl Better selection, flexibility,...l ...but harder to choose the "best" processorl Architectures are interesting, but other factors may

be more important

Page 25: Understanding the New DSP Processor Architectures · Understanding the New DSP Processor Architectures ... Understanding the New DSP Processor Architectures ... • Enhance the instruction

25

© 1999 Berkeley Design Technology, Inc.www.BDTI.com

Workshop #311Understanding the New DSP Processor Architectures

© 1999 Berkeley Design Technology, Inc.49

Conclusions

u DSPs, microcontrollers, and CPUs are swapping architectural tricksl CPU, µC vendors recognize the need for DSP

capabilities

l DSP, µC vendors don't want to lose sockets to each other

l What is good in a CPU may not be good in a DSP; be careful of issues such as execution-timepredictability, programmability, etc.

© 1999 Berkeley Design Technology, Inc.50

For More Information...

Free resources on BDTI's web site,

www.BDTI.com

l DSP Processors Hit the Mainstreamcovers DSP architectural basics and new developments. Originally printed in IEEE Computer Magazine.

l Evaluating DSP Processor Performance,a white paper from BDTI.

l Numerous other BDTI article reprints, slides

l comp.dsp FAQ