1
Architectural Analysis of a DSP Device,
the Instruction Set and the Addressing Modes
SYSC5603 (ELG6163) Digital Signal Processing Microprocessors, Software and Applications
Miodrag Bolic
2
Outline
• FIR filter on ADSP-21xx
• DSP Requirements
  – Fast Multiply-Accumulates (Data-path)
  – Extended Precision Accumulator Register (Data-path)
  – Dual Operand Fetch (Memory)
  – Circular Buffering (Addressing)
  – Zero-Overhead Looping (Instruction set)
• Analog Devices Architectures and Programming
  – SHARC
  – Blackfin
  – Performance Optimization
3
ADSP-21xx
Copied from [Kester03]
4
CALCULATING OUTPUTS OF 4-TAP FIR FILTER USING A CIRCULAR BUFFER
y(3) = h(0) x(3) + h(1) x(2) + h(2) x(1) + h(3) x(0)
y(4) = h(0) x(4) + h(1) x(3) + h(2) x(2) + h(3) x(1)
y(5) = h(0) x(5) + h(1) x(4) + h(2) x(3) + h(3) x(2)
Memory     Read       Write   Read       Write   Read
Location   for y(3)           for y(4)           for y(5)
0          x(0)       x(4)    x(4)               x(4)
1          x(1)               x(1)       x(5)    x(5)
2          x(2)               x(2)               x(2)
3          x(3)               x(3)               x(3)
Copied from [Kester03]
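The memory contents in the table above can be reproduced with a short Python sketch. The names `memory`, `write_ptr`, and `write_sample` are illustrative and do not come from the slides; each sample value is set equal to its index so that the buffer contents read like the x(n) labels.

```python
N_TAPS = 4                 # 4-tap filter: keep the 4 most recent samples

memory = [0] * N_TAPS      # memory locations 0..3
write_ptr = 0              # location holding the oldest sample


def write_sample(value):
    """Overwrite the oldest sample, then advance the pointer with wraparound."""
    global write_ptr
    memory[write_ptr] = value
    write_ptr = (write_ptr + 1) % N_TAPS


# Use the sample index n as the value of x(n).
for n in range(4):
    write_sample(n)
after_x3 = list(memory)    # buffer read to compute y(3)

write_sample(4)
after_x4 = list(memory)    # x(4) replaced x(0) in location 0

write_sample(5)
after_x5 = list(memory)    # x(5) replaced x(1) in location 1
```

Only one memory location changes per new sample; the other samples never move, which is the point of the circular buffer.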
5
FIR filter steps
1. Obtain a sample with the ADC; generate an interrupt
2. Detect and manage the interrupt
3. Move the sample into the input signal's circular buffer
4. Update the pointer for the input signal's circular buffer
5. Zero the accumulator
6. Control the loop through each of the coefficients
7. Fetch the coefficient from the coefficient's circular buffer
8. Update the pointer for the coefficient's circular buffer
9. Fetch the sample from the input signal's circular buffer
10. Update the pointer for the input signal's circular buffer
11. Multiply the coefficient by the sample
12. Add the product to the accumulator
13. Move the output sample (accumulator) to a holding buffer
14. Move the output sample from the holding buffer to the DAC
Copied from [Kester03]
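Steps 3 to 13 can be sketched in Python as follows. This is a minimal model, not DSP code: the class name `FirFilter` and its fields are assumptions made for illustration, the ADC interrupt (steps 1 and 2) is replaced by a plain function call, and the DAC transfer (step 14) is left to the caller.

```python
class FirFilter:
    def __init__(self, coeffs):
        self.h = list(coeffs)          # coefficient buffer
        self.n = len(self.h)
        self.x = [0.0] * self.n        # input signal's circular buffer
        self.newest = self.n - 1       # index of the most recent sample

    def process(self, sample):
        # Steps 3-4: store the sample over the oldest entry, update the pointer
        self.newest = (self.newest + 1) % self.n
        self.x[self.newest] = sample

        acc = 0.0                                # Step 5: zero the accumulator
        x_ptr = self.newest
        h_ptr = 0
        for _ in range(self.n):                  # Step 6: loop over coefficients
            coeff = self.h[h_ptr]                # Step 7: fetch coefficient
            h_ptr = (h_ptr + 1) % self.n         # Step 8: update coefficient pointer
            value = self.x[x_ptr]                # Step 9: fetch sample
            x_ptr = (x_ptr - 1) % self.n         # Step 10: update sample pointer
            acc += coeff * value                 # Steps 11-12: multiply, accumulate
        return acc                               # Step 13: output sample


fir = FirFilter([1.0, 2.0, 3.0, 4.0])
impulse_response = [fir.process(s) for s in [1.0, 0.0, 0.0, 0.0, 0.0]]
```

Feeding a unit impulse returns the coefficients in order, followed by zero once the impulse has left the buffer.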
6
FIR filter steps (cont.)
ADSP21xx Example code:
CNTR = N-1;
DO convolution UNTIL CE;
convolution: MR = MR + MX0 * MY0 (SS), MX0 = DM(I0,M1), MY0 = PM(I4,M5);
Single-Cycle Instruction (the convolution line)
Copied from [Kester03]
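The loop body above combines a multiply-accumulate with two operand fetches. The Python sketch below models that pattern under some stated assumptions: the multiplier is assumed to use the MX0 and MY0 values fetched by the previous instruction, the modify registers M1 and M5 are assumed to be 1, circular wraparound is omitted, and the prefetch before the loop and the final MAC after it are not shown on the slide and are added here only to make the model complete.

```python
def fir_mac_pipelined(data, coeffs):
    """Model of the CNTR = N-1 loop: multiply the previous pair, fetch the next."""
    n = len(coeffs)
    i0 = i4 = 0                      # index registers into data and program memory

    # Assumed prologue: clear MR and prefetch the first operand pair.
    mr = 0
    mx0, my0 = data[i0], coeffs[i4]
    i0 += 1
    i4 += 1

    # CNTR = N-1; DO convolution UNTIL CE;
    for _ in range(n - 1):
        # One instruction: the right-hand side is evaluated first, so the
        # multiply sees the old MX0/MY0 while the new pair is being loaded.
        mr, mx0, my0 = mr + mx0 * my0, data[i0], coeffs[i4]
        i0 += 1
        i4 += 1

    # Assumed epilogue: accumulate the last fetched pair.
    mr += mx0 * my0
    return mr


result = fir_mac_pipelined([1, 2, 3, 4], [5, 6, 7, 8])
```

With N taps, the loop runs N-1 times because the last product is formed after the final fetch.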
8
Copied from [Takala05]
9
Copied from [Takala05]
10
Motorola DSP5600X
Copied from [Takala05]
11
Copied from [Takala05]
12
Copied from [Takala05]
13
ADSP-21xx
MAC
www.analog.com/dsp
14
Copied from [Takala05]
15
SHARC Architecture ADSP-2106X
Copied from [Takala05]
17
Copied from [Takala05]
18
Copied from [Takala05]
19
Copied from [Takala05]
21
Copied from [Takala05]
22
Copied from [Takala05]
23
Hardware loops
• Software loop:
        MOVE #16,B          ; Initialize loop counter B
  LOOP: MAC (R0)+,(R4)+,A   ; Register-indirect addressing with post-increment
        DEC B
        JNE LOOP
• Hardware loops: no time is spent on
  – Decrementing counters
  – Checking to see if the loop is finished
  – Branching back to the top of the loop
        RPT #16
        MAC (R0)+,(R4)+,A
[Lapsley97]
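The saving can be made concrete by counting executed instructions for both versions. The Python sketch below is an illustration only: it counts instructions rather than cycles (cycle costs differ between processors), and the function names are hypothetical. Both functions expect at least one element.

```python
def software_loop(x, h):
    """MOVE once, then MAC + DEC + JNE on every pass."""
    executed = 1                  # MOVE #n,B
    b = len(x)
    r0 = r4 = 0
    a = 0
    while True:
        a += x[r0] * h[r4]        # MAC (R0)+,(R4)+,A
        r0 += 1
        r4 += 1
        b -= 1                    # DEC B
        executed += 3             # MAC, DEC and JNE were all executed
        if b == 0:                # JNE LOOP falls through when B reaches zero
            break
    return a, executed


def hardware_loop(x, h):
    """RPT once, then only the MAC on every pass."""
    executed = 1                  # RPT #n
    a = 0
    for r in range(len(x)):
        a += x[r] * h[r]          # MAC (R0)+,(R4)+,A
        executed += 1
    return a, executed


x = list(range(16))
h = [1] * 16
sw_result, sw_count = software_loop(x, h)
hw_result, hw_count = hardware_loop(x, h)
```

For 16 iterations the software loop executes 49 instructions and the hardware loop 17, while both produce the same sum.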
24
Copied from [Kester03]
25
ADI General Purpose DSP Product Families
(chart axis: Performance)
• ADSP-218x/9x, Power Efficient, $5 - $10
  – Up to 160 MMACS
  – Wired Voice, Wireless Voice, VOIP/VON, Industrial Control
• Blackfin, Media Enabled, $5 - $30
  – Up to 3000 MMACS
  – Image compression, Digital Still/Video Camera, MMOIP, Telematics, Biometrics
• SHARC, Low-Cost Floating Point, $10 - $100
  – Up to 600 MMACS (32-bit)
  – Audio, Infotainment, Industrial
• TigerSHARC, High-Performance, $35 - $200
  – Up to 4800 MMACS (16-bit) or 1200 MMACS (32-bit)
  – 2.5G/3G Infrastructure, Medical Imaging, Industrial Imaging, Multiprocessing
www.analog.com/dsp
27
SHARC Architecture
Copied from [Smith97]
28
SHARC Architecture - Features
• The Super Harvard ARChitecture
• 100 MHz Core / 300 MFLOPS Peak
• Parallel Operation of: Multiplier, ALU, 2 Address Generators & Sequencer
  – No Arithmetic Pipeline; All Computations Are Single-Cycle
• High Precision and Extended Dynamic Range
  – 32/40-Bit IEEE Floating-Point Math
  – 32-Bit Fixed-Point MACs with 64-Bit Product & 80-Bit Accumulation
• Single-Cycle Transfers with Dual-Ported Memory Structures
  – Supported by Cache Memory and Enhanced Harvard Architecture
• Glueless Multiprocessing Features
• JTAG Test and Emulation Port
• DMA Controller, Serial Ports, Link Ports, External Bus, SDRAM Controller, Timers
www.analog.com/dsp
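The benefit of accumulating 64-bit products in an 80-bit register can be checked with plain integer arithmetic. The sketch below is an illustration under simple assumptions: operands are treated as signed 32-bit integers, overflow is modeled as two's-complement wraparound, and the helper `wrap_signed` is hypothetical rather than a model of any SHARC status or saturation logic.

```python
PRODUCT_BITS = 64
ACC_BITS = 80
guard_bits = ACC_BITS - PRODUCT_BITS        # extra bits above the product


def wrap_signed(value, bits):
    """Reduce value to a signed two's-complement number of the given width."""
    value &= (1 << bits) - 1
    if value >> (bits - 1):                 # sign bit set
        value -= 1 << bits
    return value


# Largest possible magnitude of a signed 32 x 32 product: (-2**31) * (-2**31)
worst_product = (-2**31) * (-2**31)         # equals 2**62

# Number of worst-case products that fit in the 80-bit accumulator
acc_max = 2**(ACC_BITS - 1) - 1
safe_count = acc_max // worst_product

fits_80 = wrap_signed(safe_count * worst_product, ACC_BITS)
overflow_80 = wrap_signed((safe_count + 1) * worst_product, ACC_BITS)
overflow_64 = wrap_signed(2 * worst_product, PRODUCT_BITS)
```

Under these assumptions a 64-bit accumulator wraps on the second worst-case product, while the 80-bit accumulator holds 131071 of them before wrapping.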
29
ADSP-2106x Core Architecture
[Block diagram; labeled components:]
• DAG 1 (8 x 4 x 32), DAG 2 (8 x 4 x 24)
• Program Sequencer
• Cache Memory (32 x 48)
• PMA bus (24 bits), DMA bus (32 bits)
• PMD bus (48 bits), DMD bus (40 bits)
• Bus Connect
• Register File (16 x 40)
• Floating & Fixed-Point Multiplier, Fixed-Point Accumulator
• 32-Bit Barrel Shifter
• Floating-Point & Fixed-Point ALU
• Timer, Flags, JTAG Test & Emulation
www.analog.com/dsp
30
Example- Dot product
• C code
Copied from [Smith97]
31
Example- Dot product - Assembly
Copied from [Smith97]
32
Example- Dot product - Assembly
Copied from [Smith97]
33
C or Assembly
• How complicated is the program?
• Are you pushing the maximum speed of the DSP?
• How many programmers will be working together?
• Which is more important, product cost or development cost?
• What is your background?
• What does the DSP's manufacturer suggest you use?
Copied from [Smith97]
35
BLACKfin Processor Core
[Block diagram; labeled components: Acc0 and Acc1 (40-bit), Barrel Shifter, data registers R0 to R7, DAG0 and DAG1 with I0 to I3, L0 to L3, B0 to B3, M0 to M3, pointer registers P0 to P5, FP, SP, Sequencer]
• Data Arithmetic Unit
  – Two 16-bit Multipliers
  – Two 40-bit ALUs, Four 8-bit Video ALUs
  – Barrel Shifter
  – Sixteen 16-bit / Eight 32-bit Math Registers
• Address Arithmetic Unit
  – Two DAGs, byte addressing
  – Eight 32-bit pointer registers
  – Four Sets of 32-bit Index, Modify, Length, Base
• Sequencer
  – 16-bit Instructions, 32-bit Instructions
  – Multi-Issue, 64-bit Instructions
  – Interlocked Pipeline
• Micro Signal Architecture, developed with Intel
www.analog.com/dsp
36
ADSP-BF535 BLACKfin Processor Architecture
Great Performance Value
• Highest Frequency (350 MHz)
• 1.0V to 1.6V
• 260 PBGA
High System Integration
• Address range 768 Mbytes
• SPORTs support 8 Channels of I2S Audio
• (532 Mbps) I/O Bandwidth, DMA Bandwidth & Memory Bandwidth
• Microcontroller features include WDT, PCI, USB 1.1, SDRAM controller
[Block diagram; labels as they appear:]
• BLACKfin Processor Core, to 350 MHz
• Memory: 308 Kbytes On-Chip SRAM; 264 Kbytes On-Chip SRAM; 48 Kbytes On-Chip Cache
• System Peripherals
• User Peripherals
• SDRAM, FLASH/SRAM Interfaces
• Real Time Clock, Watchdog, JTAG, DMA, PLL, Dynamic Power Management
• SPI 2, UART 2, Timers 3 (32-bit), GPIO 16, SPORTs 2, PCI, USB 1.1
www.analog.com/dsp
37 to 50
Seminars about Blackfin