23
Compiler Design for Digital Signal Processors Erik Eckstein 03/25/2010

Compiler Design for Digital Signal Processors · LSI Proprietary 3 Storage Networking Systems Semiconductors Entry RAID Systems Mid-range RAID Systems Host RAID Adapters Host RAID

  • Upload
    others

  • View
    1

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Compiler Design for Digital Signal Processors · LSI Proprietary 3 Storage Networking Systems Semiconductors Entry RAID Systems Mid-range RAID Systems Host RAID Adapters Host RAID

Compiler Design

for Digital Signal Processors

Erik Eckstein

03/25/2010

Page 2: Compiler Design for Digital Signal Processors · LSI Proprietary 3 Storage Networking Systems Semiconductors Entry RAID Systems Mid-range RAID Systems Host RAID Adapters Host RAID

2LSI Proprietary

LSI Corporation

• LSI is a global leader in semiconductors and software solutions

• Solutions for storage and networking markets

• Founded 1981

• Headquarter: Milpitas, California

• Design centers around the world

– Small design center in Vienna: development of DSP software tools

Page 3: Compiler Design for Digital Signal Processors · LSI Proprietary 3 Storage Networking Systems Semiconductors Entry RAID Systems Mid-range RAID Systems Host RAID Adapters Host RAID

3LSI Proprietary

StorageNetworking

Systems Semiconductors

Entry RAID Systems

Mid-range RAID

Systems

Host RAID Adapters

Host RAID SW

Data Management

SW

HDD ICs: HDCs, SoCs,

RCs, Pre-Amps

SAS & SAS ROC

Controllers

SAS Expanders, Bridges

SAN Fabric Products

Custom Storage

Products

Custom Products –

Enterprise Datacom

Network Processors

DSPs

Framers/Mappers

Link Layer Products

Modems, USB,

Firewire

Market Focus

Page 4: Compiler Design for Digital Signal Processors · LSI Proprietary 3 Storage Networking Systems Semiconductors Entry RAID Systems Mid-range RAID Systems Host RAID Adapters Host RAID

4LSI Proprietary

• Digital Signal Processors– Used in mobile phones, wireless devices, infrastructure, etc.

– Audio, video processing

– High performance for signal processing algorithms

– Low power consumption

• Example: StarPro2716– 16 StarCore SC3400e DSP cores @ 750 MHz

– 8 ARM cores for networking packet processing

– Ethernet I/O support

– 6MB on chip memory

– 96 GMACs (96*109 multiply-accumulate instructions/sec)

– Multichannel speech processing in voice gateways

DSP Overview

Page 5: Compiler Design for Digital Signal Processors · LSI Proprietary 3 Storage Networking Systems Semiconductors Entry RAID Systems Mid-range RAID Systems Host RAID Adapters Host RAID

5LSI Proprietary

DSP Characteristics

• RISC-like architecture

• Multiple parallel units

• VLIW (very long instruction word)

• Multiple memory buses

• 40-bit registers

• Complex addressing modes

• Hardware loops

• Fixed-point arithmetic

Page 6: Compiler Design for Digital Signal Processors · LSI Proprietary 3 Storage Networking Systems Semiconductors Entry RAID Systems Mid-range RAID Systems Host RAID Adapters Host RAID

6LSI Proprietary

SC3400e Core

PCUprogram control unit

AGUaddress generation unit

DALUdata ALU

ALU

ALU

ALU

ALU

AA

U

AA

U

BM

U

PROGRAM

XPA (32)

XPD (128)

DATA

XAA (32)

XAD.in XAD.out (64) (64)

XBA (32)

2 FETCH BUFFERS

16DATA

REGISTERS

27ADDRESS

REGISTERS

Page 7: Compiler Design for Digital Signal Processors · LSI Proprietary 3 Storage Networking Systems Semiconductors Entry RAID Systems Mid-range RAID Systems Host RAID Adapters Host RAID

7LSI Proprietary

DSP-C Compiler Requirements

• Support for all DSP features– with language extensions (fractional data types)

– with automatic recognition

• Optimizing for speed– time critical code

• Optimizing for code size– control code

Page 8: Compiler Design for Digital Signal Processors · LSI Proprietary 3 Storage Networking Systems Semiconductors Entry RAID Systems Mid-range RAID Systems Host RAID Adapters Host RAID

8LSI Proprietary

high intermediate representation(general)

Compiler Design

low intermediate representation(processor specific)

component component component...

component

component

component

Page 9: Compiler Design for Digital Signal Processors · LSI Proprietary 3 Storage Networking Systems Semiconductors Entry RAID Systems Mid-range RAID Systems Host RAID Adapters Host RAID

9LSI Proprietary

Code Generator

C Code Parser

Optimizations on Low Level IR

Optimizations on High Level IR

Front End

Back End

Compilation Stages

Page 10: Compiler Design for Digital Signal Processors · LSI Proprietary 3 Storage Networking Systems Semiconductors Entry RAID Systems Mid-range RAID Systems Host RAID Adapters Host RAID

10LSI Proprietary

addr of addr of

High Level IR Example

assign

addr of

obj "b"

+

read read

obj "c"

obj "a"

a = b + c;

Page 11: Compiler Design for Digital Signal Processors · LSI Proprietary 3 Storage Networking Systems Semiconductors Entry RAID Systems Mid-range RAID Systems Host RAID Adapters Host RAID

11LSI Proprietary

Low Level IR Example

move.w #<32,r0

suba r1,r0

asll d7,d1

move.w #<-1,d5 move.l (r6),d4

Page 12: Compiler Design for Digital Signal Processors · LSI Proprietary 3 Storage Networking Systems Semiconductors Entry RAID Systems Mid-range RAID Systems Host RAID Adapters Host RAID

12LSI Proprietary

Optimizations are Important!

• Compiler contains approx. 50 different optimizations– 70% high level optimizations

– 30% low level optimizations

• Most optimization are invoked several times– up to 17 times!

Page 13: Compiler Design for Digital Signal Processors · LSI Proprietary 3 Storage Networking Systems Semiconductors Entry RAID Systems Mid-range RAID Systems Host RAID Adapters Host RAID

13LSI Proprietary

Optimization Example: FIR Filter

long fir(short *x, short *y)

{

long sum = 0;

int i;

for (i = 0; i < 128; i++) {

sum += x[i] * y[i];

}

return sum;

}

Page 14: Compiler Design for Digital Signal Processors · LSI Proprietary 3 Storage Networking Systems Semiconductors Entry RAID Systems Mid-range RAID Systems Host RAID Adapters Host RAID

14LSI Proprietary

Loop Strength Reduction

long fir(short *x, short *y)

{

long sum = 0;

int i;

for (i = 0; i < 128; i++) {

sum += *x * *y;

x++;

y++;

}

return sum;

}

Page 15: Compiler Design for Digital Signal Processors · LSI Proprietary 3 Storage Networking Systems Semiconductors Entry RAID Systems Mid-range RAID Systems Host RAID Adapters Host RAID

15LSI Proprietary

Hardware Loop Support

long fir(short *x, short *y)

{

long sum = 0;

int i;

loop 128 {

sum += *x * *y;

x++;

y++;

}

return sum;

}

Page 16: Compiler Design for Digital Signal Processors · LSI Proprietary 3 Storage Networking Systems Semiconductors Entry RAID Systems Mid-range RAID Systems Host RAID Adapters Host RAID

16LSI Proprietary

Instruction Selection

_fir

doensh3 #128

clr sum

loopstart3

move.w (x),tmp1

move.w (y),tmp2

imac tmp2,tmp1,sum

adda #4,x

adda #4,y

loopend3

rts

Page 17: Compiler Design for Digital Signal Processors · LSI Proprietary 3 Storage Networking Systems Semiconductors Entry RAID Systems Mid-range RAID Systems Host RAID Adapters Host RAID

17LSI Proprietary

Register Allocation

_fir

doensh3 #128

clr d0

loopstart3

move.w (r0),d1

move.w (r1),d2

imac d2,d1,d0

adda #4,r0

adda #4,r1

loopend3

rts

5 cycles/iteration

Page 18: Compiler Design for Digital Signal Processors · LSI Proprietary 3 Storage Networking Systems Semiconductors Entry RAID Systems Mid-range RAID Systems Host RAID Adapters Host RAID

18LSI Proprietary

Addressing Mode Optimization

_fir

doensh3 #128

sub d0,d0,d0

loopstart3

move.w (r0)+,d1

move.w (r1)+,d2

imac d2,d1,d0

loopend3

rts

3 cycles/iteration

Page 19: Compiler Design for Digital Signal Processors · LSI Proprietary 3 Storage Networking Systems Semiconductors Entry RAID Systems Mid-range RAID Systems Host RAID Adapters Host RAID

19LSI Proprietary

Instruction Scheduling

_fir

doensh3 #128

sub d0,d0,d0

loopstart3

[ move.w (r0)+,d1

move.w (r1)+,d2 ]*)

imac d2,d1,d0

loopend3

rts

2 cycles/iteration*) executed in parallel

Page 20: Compiler Design for Digital Signal Processors · LSI Proprietary 3 Storage Networking Systems Semiconductors Entry RAID Systems Mid-range RAID Systems Host RAID Adapters Host RAID

20LSI Proprietary

Software Pipelining

1 cycles/iteration

_fir

sub d0,d0,d0

doensh3 #127

[ move.w (r0)+,d1 move.w (r1)+,d2 ]

loopstart3

[ move.w(r0)+,d1

move.w(r1)+,d2

imac d2,d1,d0 ]

loopend3

imac d2,d1,d0

rts

Page 21: Compiler Design for Digital Signal Processors · LSI Proprietary 3 Storage Networking Systems Semiconductors Entry RAID Systems Mid-range RAID Systems Host RAID Adapters Host RAID

21LSI Proprietary

Loop Transformation

0.25 cycles/iteration

_fir

...

loopstart3

[ move.4w (r0)+,d0:d1:d2:d3

move.4w (r1)+,d8:d9:d10:d11

imac d0,d8,d12

imac d1,d9,d13

imac d2,d10,d14

imac d3,d11,d1 ]

loopend3

...

Page 22: Compiler Design for Digital Signal Processors · LSI Proprietary 3 Storage Networking Systems Semiconductors Entry RAID Systems Mid-range RAID Systems Host RAID Adapters Host RAID

Questions?

Page 23: Compiler Design for Digital Signal Processors · LSI Proprietary 3 Storage Networking Systems Semiconductors Entry RAID Systems Mid-range RAID Systems Host RAID Adapters Host RAID

23LSI Proprietary