19
Evaluating FPGAs For Communication Infrastructure Applications © 2003 Berkeley Design Technology, Inc. Communications Design Conference September 2003 Page 1 © 2003 Berkeley Design Technology, Inc. Berkeley Design Technology, Inc. 2107 Dwight Way, Second Floor Berkeley, California 94704 USA +1 (510) 665-1600 [email protected] http://www.BDTI.com Optimized DSP Software • Independent DSP Analysis Evaluating FPGAs For Communication Infrastructure Applications 2 © 2003 Berkeley Design Technology, Inc. Presentation Goals By the end of this workshop, you should know: Key processor-selection criteria and trends for communication infrastructure Key strengths and weaknesses of high-end DSPs Key strengths and weaknesses of high-end FPGAs How typical DSPs and FPGAs stack up in terms of performance and cost/perf.

Evaluating FPGAs For Communication Infrastructure Applications · 2019. 12. 18. · ALU Shift MAC ALU Shift MAC ALU Shift MAC ALU Shift ... 2G 2.5G 3G 8-13 Kbps 64-384 Kbps 384-2000+

  • Upload
    others

  • View
    1

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Evaluating FPGAs For Communication Infrastructure Applications · 2019. 12. 18. · ALU Shift MAC ALU Shift MAC ALU Shift MAC ALU Shift ... 2G 2.5G 3G 8-13 Kbps 64-384 Kbps 384-2000+

Evaluating FPGAs For Communication Infrastructure Applications

© 2003 Berkeley Design Technology, Inc.

Communications Design Conference September 2003Page 1

© 2003 Berkeley Design Technology, Inc.

Berkeley Design Technology, Inc.2107 Dwight Way, Second Floor

Berkeley, California 94704USA

+1 (510) 665-1600

[email protected]://www.BDTI.com

Optimized DSP Software • Independent DSP Analysis

Evaluating FPGAs For Communication Infrastructure

Applications

2© 2003 Berkeley Design Technology, Inc.

Presentation Goals

By the end of this workshop, you should know:• Key processor-selection criteria and trends for

communication infrastructure• Key strengths and weaknesses of

high-end DSPs• Key strengths and weaknesses of

high-end FPGAs• How typical DSPs and FPGAs stack up in

terms of performance and cost/perf.

Page 2: Evaluating FPGAs For Communication Infrastructure Applications · 2019. 12. 18. · ALU Shift MAC ALU Shift MAC ALU Shift MAC ALU Shift ... 2G 2.5G 3G 8-13 Kbps 64-384 Kbps 384-2000+

Evaluating FPGAs For Communication Infrastructure Applications

© 2003 Berkeley Design Technology, Inc.

Communications Design Conference September 2003Page 2

3© 2003 Berkeley Design Technology, Inc.

Generalized Comm System

Detection,Demodulation

SourceCoding

ChannelCoding

InverseChannelCoding

SourceDecode

Signal In

SignalOut

Parameter Estimation

Encryption,Decryption

Modulation

Mult. AccessTransmitter

ReceiverMult. Access

4© 2003 Berkeley Design Technology, Inc.

Systems: Two Types

Infrastructure• Examples: base stations, central office equipment,

cable “head-end”

Terminals• Portable

• Battery-powered, size-constrained• Examples: cellular phone, mobile media player,

PDA• Non-portable (e.g., “CPE”)

• Examples: set-top box, home media server

Page 3: Evaluating FPGAs For Communication Infrastructure Applications · 2019. 12. 18. · ALU Shift MAC ALU Shift MAC ALU Shift MAC ALU Shift ... 2G 2.5G 3G 8-13 Kbps 64-384 Kbps 384-2000+

Evaluating FPGAs For Communication Infrastructure Applications

© 2003 Berkeley Design Technology, Inc.

Communications Design Conference September 2003Page 3

5© 2003 Berkeley Design Technology, Inc.

Terminal Requirements

Key criteria• Sufficient performance• Cost • Energy efficiency• Memory use• Small-system integration support• Packaging• Tools• Application-development infrastructure• Chip-product roadmap

6© 2003 Berkeley Design Technology, Inc.

Infrastructure Requirements

Key criteria• Board area per channel• Power per channel• Cost per channel• Large-system integration support• Tools• Application-development infrastructure• Architecture roadmap

• Compatibility, multi-vendor support

Page 4: Evaluating FPGAs For Communication Infrastructure Applications · 2019. 12. 18. · ALU Shift MAC ALU Shift MAC ALU Shift MAC ALU Shift ... 2G 2.5G 3G 8-13 Kbps 64-384 Kbps 384-2000+

Evaluating FPGAs For Communication Infrastructure Applications

© 2003 Berkeley Design Technology, Inc.

Communications Design Conference September 2003Page 4

7© 2003 Berkeley Design Technology, Inc.

Key Processing Technologies

DSPsGPPs/DSP-enhanced GPPsReconfigurable architectures• FPGAs• Reconfigurable

processors

Massively parallel processorsASSPsASICs• Licensable cores• Customizable cores• Platform-based

design

8© 2003 Berkeley Design Technology, Inc.

DSPs: The Incumbents

Modern conventional DSPs introduced ~1986• One instruction, one MAC per cycle• Developed primarily for telecom applications

High-performance VLIW DSPs introduced ~1997• Developed primarily for wireless infrastructure• Speed focused:

• Independent execution units support many instructions,MACs per cycle

• Deeper pipelines and simpler instruction sets support higher clock rates

• Emphasis on compilability

Page 5: Evaluating FPGAs For Communication Infrastructure Applications · 2019. 12. 18. · ALU Shift MAC ALU Shift MAC ALU Shift MAC ALU Shift ... 2G 2.5G 3G 8-13 Kbps 64-384 Kbps 384-2000+

Evaluating FPGAs For Communication Infrastructure Applications

© 2003 Berkeley Design Technology, Inc.

Communications Design Conference September 2003Page 5

9© 2003 Berkeley Design Technology, Inc.

Example: StarCore SC140

• 6-issue 16-bit fixed-point architecture• Up to four 16-bit MACs per cycle

• Motorola MSC8101 (one SC140 core) shipping at 300 MHz, $116 (1 ku)

Data Buses (2 x 64 bits)

Address Buses (3 x 32 bits)

Instruction Bus (1 x 128 bits)

AGUs(2)

Prog.Seq.

BMUMACALUShift

MACALUShift

MACALUShift

MACALUShift

Motorola, Agere,… and now Infineon

10© 2003 Berkeley Design Technology, Inc.

Motorola MSC8101

SC140Core

PowerPCBus

(100 MHz)

FilterCoprocessor

CPM

ATM

Ethernet

UTOPIA

UART

I2C

SPIE1/T1E3/T3

HDLC

DMAController

512 KBSRAM

MemoryController

Addr.(32-bit)

Data(64-bit)

Page 6: Evaluating FPGAs For Communication Infrastructure Applications · 2019. 12. 18. · ALU Shift MAC ALU Shift MAC ALU Shift MAC ALU Shift ... 2G 2.5G 3G 8-13 Kbps 64-384 Kbps 384-2000+

Evaluating FPGAs For Communication Infrastructure Applications

© 2003 Berkeley Design Technology, Inc.

Communications Design Conference September 2003Page 6

11© 2003 Berkeley Design Technology, Inc.

Other Infrastructure DSPsTexas Instruments TMS320C64x• 8-issue 16-bit fixed-point architecture

• Up to four 16-bit MACs per cycle• Special instructions and co-processors for communications• Compatible with ‘C62x, ‘C67x

• Sampling at 720 MHz, $216 (1 ku)• Shipping at 600 MHz, $108 (1 ku)

Analog Devices TigerSHARC (ADSP-TS20x)• 4-issue fixed- and floating-point

• Up to eight 16-bit fixed-point MACs per cycle• Special instructions for 3G base stations• High memory bandwidth (18 GB/s)

• Sampling at 600 MHz, $334 (1 ku)• TS101 shipping at 300 MHz, $234 (1 ku)

12© 2003 Berkeley Design Technology, Inc.

DSP Processors

!DSP performance, efficiency strong comparedwith other types of off-the-shelf processors But may not be adequate for demanding tasks Fixed architectures limit efficiency, design flexibility Centralized computation and extensive indirection

reduce efficiency

Relatively limited selection of chips per family!But products offer strong, relevant integration

Strengths and Weaknesses

Page 7: Evaluating FPGAs For Communication Infrastructure Applications · 2019. 12. 18. · ALU Shift MAC ALU Shift MAC ALU Shift MAC ALU Shift ... 2G 2.5G 3G 8-13 Kbps 64-384 Kbps 384-2000+

Evaluating FPGAs For Communication Infrastructure Applications

© 2003 Berkeley Design Technology, Inc.

Communications Design Conference September 2003Page 7

13© 2003 Berkeley Design Technology, Inc.

DSP ProcessorsStrengths and Weaknesses

!Relatively low development cost, risk!Mature technology!Large, experienced developer base!Fast time-to-market!Some architectures available from multiple vendors But some vendors’ roadmaps are unclear or

uncertain

14© 2003 Berkeley Design Technology, Inc.

Why Consider Alternatives?

Convergence• DSP-intensive products increasingly include complex

non-DSP functionalityProcessing throughput, density• E.g., 3G wireless computation demands outstripping

DSP processor advancesDevelopment• DSP processor software development tools

(e.g., compilers) have significant limitationsCost• Desire for integration drives SoC approach

Energy efficiencyFlexibility

Page 8: Evaluating FPGAs For Communication Infrastructure Applications · 2019. 12. 18. · ALU Shift MAC ALU Shift MAC ALU Shift MAC ALU Shift ... 2G 2.5G 3G 8-13 Kbps 64-384 Kbps 384-2000+

Evaluating FPGAs For Communication Infrastructure Applications

© 2003 Berkeley Design Technology, Inc.

Communications Design Conference September 2003Page 8

15© 2003 Berkeley Design Technology, Inc.

Wireless Bandwidth Growth

• GSM• DSC1800• PCS1900• IS-95B• IS-54B• IS-136• PDC

• GPRS• HCSD• IS-95C• IS-136+• IS-136 HS• Compact EDGE

• 3GPP-DS-FDD• 3GPP-DS-TDD• 3GPP-MC• ARIB W-CDMA• IS-2000 CDMA• IS-95-HDR

2G 2.5G 3G

8-13 Kbps 64-384 Kbps 384-2000+ KbpsNARROWBAND

CIRCUITVOICE

WIDEBANDPACKET

DATA

~100 MIPS ~10,000 MIPS ~100,000 MIPSSource: MorphICs Technology, Inc.

16© 2003 Berkeley Design Technology, Inc.

Are Processors Efficient?

Steps for performing one basic operation:• Fetch instruction from memory• Decode instruction• Compute address• Fetch data

• (Off-chip memory ! L2, update cache state)• (L2 ! L1, update cache state)• L1 ! registers• Registers ! arithmetic unit

• Perform desired operation• Write result

• Compute address, access hierarchy• Update data pointers• Update program counter

The Monarchial Model of Computing

Page 9: Evaluating FPGAs For Communication Infrastructure Applications · 2019. 12. 18. · ALU Shift MAC ALU Shift MAC ALU Shift MAC ALU Shift ... 2G 2.5G 3G 8-13 Kbps 64-384 Kbps 384-2000+

Evaluating FPGAs For Communication Infrastructure Applications

© 2003 Berkeley Design Technology, Inc.

Communications Design Conference September 2003Page 9

17© 2003 Berkeley Design Technology, Inc.

FPGAs

An amorphous “sea” of reconfigurable logic with reconfigurable interconnect• Possibly interspersed with fixed-logic resources,

e.g., processors, multipliersPotential for very high parallelismHistorically used for prototyping and “glue logic,” but becoming more sophisticated• DSP-oriented architecture features• DSP-oriented tools and design libraries

• Viterbi, Turbo, and Reed-Solomon coders and decoders, FIR filters, FFTs,…

Key DSP players: Altera and Xilinx

Field-Programmable Gate Arrays

18© 2003 Berkeley Design Technology, Inc.

Altera StratixUp to 28 hard-wired “DSP blocks”• 8×9-bit, 4×18-bit, 1×36-bit multiply operations• Optional pipelining, accumulation, etc.

Three sizes of hard-wired memory blocks

M512 RAMBlocks

Phase-LockedLoops

Logic ArrayBlocks

M4K RAMBlocks

I/O Elements

MegaRAMBlocks

DSP Blocks

Page 10: Evaluating FPGAs For Communication Infrastructure Applications · 2019. 12. 18. · ALU Shift MAC ALU Shift MAC ALU Shift MAC ALU Shift ... 2G 2.5G 3G 8-13 Kbps 64-384 Kbps 384-2000+

Evaluating FPGAs For Communication Infrastructure Applications

© 2003 Berkeley Design Technology, Inc.

Communications Design Conference September 2003Page 10

19© 2003 Berkeley Design Technology, Inc.

Altera Stratix

IP blocks• Filters, FFTs, Viterbi decoders,…• Nios processor• Third-party IP, e.g., DMA controllers

DSP tools• Parameterized IP block generators• Simulink to FPGA link• C+Simulink to FPGA design flow

Most family members available nowPrices begin at $170 (1 ku)

High-end, DSP-enhanced FPGAs

20© 2003 Berkeley Design Technology, Inc.Source: Altera

AlteraFIR Filter Compiler

Page 11: Evaluating FPGAs For Communication Infrastructure Applications · 2019. 12. 18. · ALU Shift MAC ALU Shift MAC ALU Shift MAC ALU Shift ... 2G 2.5G 3G 8-13 Kbps 64-384 Kbps 384-2000+

Evaluating FPGAs For Communication Infrastructure Applications

© 2003 Berkeley Design Technology, Inc.

Communications Design Conference September 2003Page 11

21© 2003 Berkeley Design Technology, Inc.

Xilinx

Virtex-II• Includes array of hard-wired 18 × 18 multipliers plus

distributed memory • Up to 168 multipliers in biggest chip• Most versions shipping now

Virtex-II Pro: joint effort with IBM• Adds up to four hard-wired

PowerPC 405 cores• Up to 216 multipliers in biggest chip• Most versions shipping now

Prices begin at $169 (1 ku) Source: Xilinx

“Virtex” line of FPGAs

22© 2003 Berkeley Design Technology, Inc.

Xilinx

Soft IP blocks; e.g.,• Reed-Solomon encoder, Viterbi decoder,

turbo decoder• ARC processor, MicroBlaze CPU

Sophisticated “Core Generator” tool for generating parameterized IP blocks

Simulink to FPGA link via “System Generator”

Page 12: Evaluating FPGAs For Communication Infrastructure Applications · 2019. 12. 18. · ALU Shift MAC ALU Shift MAC ALU Shift MAC ALU Shift ... 2G 2.5G 3G 8-13 Kbps 64-384 Kbps 384-2000+

Evaluating FPGAs For Communication Infrastructure Applications

© 2003 Berkeley Design Technology, Inc.

Communications Design Conference September 2003Page 12

23© 2003 Berkeley Design Technology, Inc.

XilinxFIR Filter Generator

Source: Xilinx

24© 2003 Berkeley Design Technology, Inc.

Performance Analysis

• Comparing performance of off-the-shelf DSPs to that of FPGAs is tricky

• The common MMACS metric is oversimplified to the point of absurdity• FPGA vendors use distributed-arithmetic

benchmarks that require fixed coefficients• MMACS metric overlooks need to dedicate

resources to non-MAC tasks• MMACS metric ignores memory bandwidth needed

to feed MACs• Many important DSP algorithms don’t use MACs at

all!

Page 13: Evaluating FPGAs For Communication Infrastructure Applications · 2019. 12. 18. · ALU Shift MAC ALU Shift MAC ALU Shift MAC ALU Shift ... 2G 2.5G 3G 8-13 Kbps 64-384 Kbps 384-2000+

Evaluating FPGAs For Communication Infrastructure Applications

© 2003 Berkeley Design Technology, Inc.

Communications Design Conference September 2003Page 13

25© 2003 Berkeley Design Technology, Inc.

Alternative Approach: Application Benchmarks

Use a full application, e.g., N channels of an OFDM receiverHazards:• Applications tend to be ill-defined• Hand-optimization usually required in real-

world applications• Costly, time-consuming to implement• Evaluates programmer as much as processor• What is a “reasonable” benchmark

implementation?

26© 2003 Berkeley Design Technology, Inc.

Solution: Simplified Application Benchmark

BDTI’s benchmark is based on a simplified OFDM receiver• Closely resembles a real-world application• Simplified to enable optimized

implementations• Constrained to ensure consistent, reasonable

implementation practicesBenchmark implementer goals:• Maximize number of channels• Minimize cost per channel

Page 14: Evaluating FPGAs For Communication Infrastructure Applications · 2019. 12. 18. · ALU Shift MAC ALU Shift MAC ALU Shift MAC ALU Shift ... 2G 2.5G 3G 8-13 Kbps 64-384 Kbps 384-2000+

Evaluating FPGAs For Communication Infrastructure Applications

© 2003 Berkeley Design Technology, Inc.

Communications Design Conference September 2003Page 14

27© 2003 Berkeley Design Technology, Inc.

Benchmark Overview

Flexibility is an asset:• Algorithms range from table look-ups to MAC-

intensive transform• Data sizes range from 4 to 16 bits• Data rates range from 40 to 320 MB/s• Data includes real and complex values

FFT Slicer Viterbi Decoder

IQ Demodulator FIR

28© 2003 Berkeley Design Technology, Inc.

Benchmark Requirements“Pins to pins”Real-time throughputBit-exact output dataResource sharing is permitted

Channel 1

FFT4 ch.

FFT4 ch.

FIR8 ch.

Slicer4 ch.

Slicer4 ch.

Viterbi 2 ch.

Viterbi 2 ch.

Viterbi 2 ch.

Viterbi 2 ch.

Channel 2Channel 3Channel 4Channel 5

Channel 8Channel 7Channel 6

Page 15: Evaluating FPGAs For Communication Infrastructure Applications · 2019. 12. 18. · ALU Shift MAC ALU Shift MAC ALU Shift MAC ALU Shift ... 2G 2.5G 3G 8-13 Kbps 64-384 Kbps 384-2000+

Evaluating FPGAs For Communication Infrastructure Applications

© 2003 Berkeley Design Technology, Inc.

Communications Design Conference September 2003Page 15

29© 2003 Berkeley Design Technology, Inc.

BDTI Communications Benchmark™

$3,480$325$116Cost (1 ku)

~50~10<<1Channels

Altera Stratix 1S80-6

(Preliminary)

Altera Stratix 1S20-6

(Preliminary)

Motorola MSC8101 (300 MHz)

From BDTI’s report, FPGAs for DSP.

30© 2003 Berkeley Design Technology, Inc.

Density Comparison

Source: Andre DeHon

[AL

U b

it O

ps /

λλ λλ2 s]

Technology [λλλλ]

100

10

10.1 1.0

SRAM-based FPGAsRISC Processors

Page 16: Evaluating FPGAs For Communication Infrastructure Applications · 2019. 12. 18. · ALU Shift MAC ALU Shift MAC ALU Shift MAC ALU Shift ... 2G 2.5G 3G 8-13 Kbps 64-384 Kbps 384-2000+

Evaluating FPGAs For Communication Infrastructure Applications

© 2003 Berkeley Design Technology, Inc.

Communications Design Conference September 2003Page 16

31© 2003 Berkeley Design Technology, Inc.

FPGAs

!Massive performance gains on some algorithms!Architectural flexibility can yield efficiency! Adjust data widths throughout algorithm! Parallelism where you need it! Massive on-chip memory bandwidth

Efficiency compromised by generality• Embedded MAC units and memory blocks improve efficiency

but reduce generality

!Potentially good cost and energy efficiency But absolute prices and power consumption are much higher

than DSPs’

Strengths and Weaknesses

32© 2003 Berkeley Design Technology, Inc.

FPGAs

Development is long and complicated Higher complexity inherent due to flexibility Design flow is unfamiliar to most DSP engineers!But cost and complexity is much lower than ASICs’

Development infrastructure badly lags DSPs’ DSP-oriented tools are immature

!Field reconfigurability (for some products)!Reconfigure hardware for diverse tasks• Xilinx has mature products, but others are

playing catch-up

Strengths and Weaknesses

Page 17: Evaluating FPGAs For Communication Infrastructure Applications · 2019. 12. 18. · ALU Shift MAC ALU Shift MAC ALU Shift MAC ALU Shift ... 2G 2.5G 3G 8-13 Kbps 64-384 Kbps 384-2000+

Evaluating FPGAs For Communication Infrastructure Applications

© 2003 Berkeley Design Technology, Inc.

Communications Design Conference September 2003Page 17

33© 2003 Berkeley Design Technology, Inc.

Why Use a DSP?

• Some applications are not amenable to FPGA implementations• Parallelism is sometimes inherently limited• Ultimate speed is not always the first priority

• FPGAs are still too expensive for terminal applications

• FPGA energy efficiency is still an unknown• Implementing a complex algorithm is much

more difficult on an FPGA than on a DSP

34© 2003 Berkeley Design Technology, Inc.

Grading the Alternatives

A = Best, E = Worst

Energy Efficiency

Top Speed

Run-time Flexibility

Design Flexibility

Design Effort

C

B

A

B

D

FPG

As

C

D

C

E

B

DS

Ps

D

E

B

E

A

GP

Ps

AC

EC

AB

EC

A+C

AS

SP

s

Cu

stom

Cores

A

A

A

E

E

ASIC

s

Page 18: Evaluating FPGAs For Communication Infrastructure Applications · 2019. 12. 18. · ALU Shift MAC ALU Shift MAC ALU Shift MAC ALU Shift ... 2G 2.5G 3G 8-13 Kbps 64-384 Kbps 384-2000+

Evaluating FPGAs For Communication Infrastructure Applications

© 2003 Berkeley Design Technology, Inc.

Communications Design Conference September 2003Page 18

35© 2003 Berkeley Design Technology, Inc.

Future Communications Applications Dealing with Non-ideal Channels

Multi-antenna approach exploits multi-path fading by sending data along good channelsResults in large theoretical improvements in bandwidth efficiency for fading channelsBut … computationally hungry

ArrayProcessing

)(tx

ArrayProcessing

1st path, α1 = 1

2nd path, α2 = 0.6

)(ty

SNR (dB)0 5 10 15 20 250

5

10

15

20

25

30

Cap

acity

(bits

/s/H

z) (4,4) With Feedback(4,4) No Feedback(4,1) Orthogonal Design (1,1) Baseline

Source: Jan Rabaey, Berkeley Wireless Research Center

36© 2003 Berkeley Design Technology, Inc.

Conclusions

High-end FPGAs can outstrip DSPs on certain DSP tasks• Computation-intensive, highly parallelizable tasks

High-end FPGAs are expensive, but they can beat DSPs in terms of performance per dollarDSP have the advantage in development infrastructure, time-to-market, developer familiarity.In many applications, a heterogeneous combination of computing engines is desirable• Expect to see more heterogeneous processor chips

The “best” architecture depends on the details of the application

Page 19: Evaluating FPGAs For Communication Infrastructure Applications · 2019. 12. 18. · ALU Shift MAC ALU Shift MAC ALU Shift MAC ALU Shift ... 2G 2.5G 3G 8-13 Kbps 64-384 Kbps 384-2000+

Evaluating FPGAs For Communication Infrastructure Applications

© 2003 Berkeley Design Technology, Inc.

Communications Design Conference September 2003Page 19

37© 2003 Berkeley Design Technology, Inc.

For More Information...www.BDTI.com Free Information• BDTImark2000™ scores• DSP Insider newsletter• Pocket Guide to Processors for DSP

White papers on processor architectures and benchmarkingArticle reprints on DSP-orientedprocessors and applications • EE Times• IEEE Spectrum• IEEE Computer and others

comp.dsp FAQ

2001 Edition