
TMS320C54x Architecture


Page 1: TMS320C54x Architecture

© Copyright 2008 UG Consultants, Bangalore. All rights reserved

TMS320C54X ARCHITECTURE

UG Consultants, Bangalore


Page 2: TMS320C54x Architecture

Introduction to TMS320C54x

• Lowest power consumption among DSPs: 0.54 mW/MIPS

• Acceleration for FIR and LMS filtering, codebook search, polynomial evaluation, and Viterbi decoding

Roadmap

Page 3: TMS320C54x Architecture

TMS320C54X ARCHITECTURE

• Advanced Multibus Architecture With Three Separate 16-Bit Data Memory Buses and One Program Memory Bus

• 40-Bit Arithmetic Logic Unit (ALU) Including a 40-Bit Barrel Shifter and Two Independent 40-Bit Accumulators


Page 4: TMS320C54x Architecture

TMS320C54X ARCHITECTURE

• 17- × 17-Bit Parallel Multiplier Coupled to a 40-Bit Dedicated Adder for Non-Pipelined Single-Cycle Multiply/Accumulate (MAC) Operation

• Compare, Select, and Store Unit (CSSU) for the Add/Compare Selection of the Viterbi Operator


Page 5: TMS320C54x Architecture

TMS320C54X ARCHITECTURE

• Exponent Encoder to Compute an Exponent Value of a 40-Bit Accumulator Value in a Single Cycle

• Two Address Generators With Eight Auxiliary Registers and Two Auxiliary Register Arithmetic Units (ARAUs)


Page 6: TMS320C54x Architecture

TMS320C54X ARCHITECTURE

• Extended Addressing Mode for 8M × 16-Bit Maximum Addressable External Program Space

• 128K × 16-Bit On-Chip RAM Composed of:
  – Eight Blocks of 8K × 16-Bit On-Chip Dual-Access Program/Data RAM
  – Eight Blocks of 8K × 16-Bit On-Chip Single-Access Program RAM

Page 7: TMS320C54x Architecture

TMS320C54X ARCHITECTURE

• 16K × 16-Bit On-Chip ROM Configured for Program Memory

• Single-Instruction-Repeat and Block-Repeat Operations for Program Code

• Block-Memory-Move Instructions for Better Program and Data Management

Page 8: TMS320C54x Architecture

TMS320C54X ARCHITECTURE

• Instructions With a 32-Bit Long Word Operand

• Instructions With Two- or Three-Operand Reads

• Arithmetic Instructions With Parallel Store and Parallel Load

• Conditional Load/Store Instructions

• Complex Specialized Instructions (Ex: FIRS, POLY, LMS, SQDST)


Page 9: TMS320C54x Architecture

Overview of ’C54x

• Different memory and peripheral options.

• Performance of up to 500 MIPS (’C5441).

• Highly specialized instruction set.

• Super modified Harvard architecture.

• Low-power devices, well suited to cellular applications and battery-operated equipment.

• 16-bit fixed-point DSP (CISC processor).

Page 10: TMS320C54x Architecture

TMS320C54x Key features

The total memory is divided into three spaces:

• 64K words of program memory (’548, ’549, ’5402, ’5410, and ’5420 devices support extended program memory of up to 8M words)

• 64K words of data memory (on-chip and external)

• 64K words of I/O memory

ROM, DARAM, and SARAM are the types of on-chip memory supported:

• DARAM (dual-access RAM) can be accessed twice per machine cycle.

• SARAM (single-access RAM) can be accessed once per machine cycle.

• ROM contains the boot loader and data tables. ROM can be customized by submitting a ROM mask to TI.

• The CPU and peripherals (e.g., BSP, HPI) can write to and read from DARAM in the same cycle.

Page 11: TMS320C54x Architecture

Architecture

Page 12: TMS320C54x Architecture

Architecture

Internal Bus Structures:

There are eight internal buses in total.

A separate output data bus, the E-Bus, is used to write to memory.

A dual data-bus scheme, the C-Bus and D-Bus, permits fetching two 16-bit operands, or one 32-bit long-word operand, in a single cycle.

The P-Bus carries instruction code and immediate data operands from program memory and is also connected to the multiplier input.

Four address buses (PAB, CAB, DAB, and EAB) carry the addresses needed for instruction execution.

Page 13: TMS320C54x Architecture

Architecture

Page 14: TMS320C54x Architecture

Architecture

CPU and status registers:

• 40-bit ALU

• Two 40-bit accumulators

• 40-bit barrel shifter

• 17 × 17 multiplier and accumulate unit (MAC)

• 40-bit adder

• Data address generation unit (ARAU0 and ARAU1)

• Program address generation unit (PAGEN)

• Compare, select, and store unit (CSSU)

• Exponent encoder

Page 15: TMS320C54x Architecture

Architecture

Status and control registers:

The status registers indicate the condition of the processor. There are two status registers, ST0 and ST1.

The control register used to configure the processor is called PMST.

The contents of these registers can be changed using the SSBX, RSBX, and LD instructions.
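As a rough illustration of how SSBX and RSBX act on the status registers, the Python sketch below models ST0 and ST1 as 16-bit integers and sets or clears one bit at a time. The bit positions used for ST1 (SXM at bit 8, OVM at bit 9, INTM at bit 11, ASM in bits 4-0) are not given on the slide; they are assumptions based on the commonly documented 'C54x layout, and the helper names `ssbx`, `rsbx`, and `asm_field` are hypothetical.

```python
# Minimal model of the 'C54x status registers as 16-bit values.
# The bit positions are assumptions (usual ST1 layout), not taken from the slide.
ST1_BITS = {"FRCT": 6, "C16": 7, "SXM": 8, "OVM": 9, "INTM": 11}


def ssbx(status, n, bit):
    """Return a copy of [ST0, ST1] with bit `bit` of register n set (n: 0 = ST0, 1 = ST1)."""
    updated = list(status)
    updated[n] = (updated[n] | (1 << bit)) & 0xFFFF
    return updated


def rsbx(status, n, bit):
    """Return a copy of [ST0, ST1] with bit `bit` of register n cleared."""
    updated = list(status)
    updated[n] = updated[n] & ~(1 << bit) & 0xFFFF
    return updated


def asm_field(st1):
    """Read ASM (assumed ST1 bits 4-0) as a signed shift count in -16..15."""
    raw = st1 & 0x1F
    return raw - 32 if raw & 0x10 else raw


# Start from all-zero registers purely for illustration.
regs = [0x0000, 0x0000]
regs = ssbx(regs, 1, ST1_BITS["SXM"])   # turn sign extension on
regs = ssbx(regs, 1, ST1_BITS["OVM"])   # turn overflow mode on
regs = rsbx(regs, 1, ST1_BITS["OVM"])   # and off again
print(f"ST1 = {regs[1]:04X}")
```

Returning a new list instead of mutating in place keeps each step easy to inspect; a real device of course updates the register directly.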

Status Register ST0 :


Page 16: TMS320C54x Architecture

Architecture

Status register ST1:

PMST (processor mode status register): The contents of PMST determine the processor mode and the memory configuration of the DSP.

Page 17: TMS320C54x Architecture

TMS320C54x Pipelining

The TMS320C54x has a six-stage pipeline. Each stage is independent, which allows overlapped execution of instructions. One to six different instructions can be active simultaneously, each at a different stage. The pipeline provides very fast throughput but requires some attention to detail in programming.

Pre-fetch: PAB is loaded with the contents of the PC.

Fetch: The opcode is fetched from the program bus (PB) and loaded into the IR.

Decode: The contents of the IR are decoded.

Access: DAB is loaded with the address if a read access is required. If a second operand is required, CAB is also loaded with its address. The auxiliary registers are updated in indirect addressing mode.

Read: The data operand(s), if any, are read from the data buses DB and CB. At the same time, the write data address, if any, is placed on EAB.

Execute/write: The instruction is executed, and EB is loaded with the write data, if any.
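The overlap of the six stages can be sketched in a few lines of Python. This is an idealized model that assumes one new instruction enters the pipeline every cycle with no stalls or bus conflicts; the function names `pipeline_chart` and `active_in_cycle` are made up for the example.

```python
# Idealized six-stage pipeline: instruction i enters stage P in cycle i
# and advances one stage per cycle (no stalls, no resource conflicts).
STAGES = ["P", "F", "D", "A", "R", "E/W"]


def pipeline_chart(num_instructions):
    """Return one row per instruction; each row lists the stage occupied in every cycle."""
    total_cycles = num_instructions + len(STAGES) - 1
    rows = []
    for i in range(num_instructions):
        row = ["."] * total_cycles
        for offset, name in enumerate(STAGES):
            row[i + offset] = name
        rows.append(row)
    return rows


def active_in_cycle(rows, cycle):
    """Count how many instructions occupy some stage in the given cycle."""
    return sum(1 for row in rows if row[cycle] != ".")


rows = pipeline_chart(8)
for row in rows:
    print(" ".join(f"{cell:>3}" for cell in row))
```

Once the pipeline has filled, `active_in_cycle` reports six instructions in flight, which is the "one to six instructions active" behaviour described above.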

[Figure: the six pipeline stages P, F, D, A, R, E/W, labelled "Instruction fetch", "Operand read", and "Operand write"]

Page 18: TMS320C54x Architecture

C54x Pipelining Bus/hardware Use

[Figure: six instructions overlapped in time, each passing through P F D A R E/W one cycle after the previous one]

Stage  Description               Bus used   Hardware used
P      Generate program address  PAB        PC
F      Get opcode                PB         Program memory
D      Decode instruction        -          Decoder
A      Generate read address     DAB/CAB    ARs and ARAU
R      Read operands             DB/CB      Data memory
       Generate write address    EAB        ARs and ARAU
E/W    Execute instruction       -          MAC, ALU
       Write result              EB         Data memory

Page 19: TMS320C54x Architecture

Architecture

Arithmetic Logic Unit (ALU)

Page 20: TMS320C54x Architecture

Architecture

ALU inputs:

X input sources:
1. Shifter output (a 32-bit or 16-bit data memory operand, or a shifted accumulator value)
2. Data memory operand from the D-Bus (DB)

Y input sources:
1. Accumulator A or B
2. Data memory operand from the C-Bus (CB)
3. T register

Note: Neither accumulator A nor B is connected directly to the X input of the ALU. So how is an instruction such as ADD A,0,B executed?

The source accumulator A, shifted by 0, reaches the X input through the shifter output.

The destination accumulator B drives the Y input.
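A small Python sketch can make the data path of ADD A,0,B concrete. It assumes the usual reading of the instruction, dst = dst + (src << SHIFT), and ignores overflow saturation and the carry bit; `add_acc` and the other helper names are illustrative only.

```python
# Sketch of the ALU data path for "ADD src, SHIFT, dst" on 40-bit accumulators.
# Assumption: dst = dst + (src << SHIFT); saturation (OVM) and carry are not modelled.
MASK40 = (1 << 40) - 1


def to_signed40(value):
    """Interpret a 40-bit pattern as a two's-complement integer."""
    value &= MASK40
    return value - (1 << 40) if value & (1 << 39) else value


def shift40(value, shift):
    """Shift a 40-bit value left (shift >= 0) or arithmetically right (shift < 0)."""
    signed = to_signed40(value)
    shifted = signed << shift if shift >= 0 else signed >> -shift
    return shifted & MASK40


def add_acc(src, shift, dst):
    """Return the new dst value after ADD src, shift, dst."""
    x_input = shift40(src, shift)  # src reaches the X input through the shifter
    y_input = dst & MASK40         # dst drives the Y input directly
    return (x_input + y_input) & MASK40


a, b = 0x0000001234, 0x0000001000
b = add_acc(a, 0, b)               # ADD A,0,B
print(f"B = {b:010X}")
```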


Page 21: TMS320C54x Architecture

Architecture

Accumulators A and B: Destination registers for MAC or ALU operations. Each accumulator is divided into three parts:

• Guard bits: AG and BG (bits 39-32)

• High word: AH and BH (bits 31-16)

• Low word: AL and BL (bits 15-0)

The guard bits provide headroom that prevents overflow in iterative computations. Bits 32-16 of accumulator A can be used as an input to the multiplier in the MAC unit.
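To show what the guard bits buy, the sketch below splits a 40-bit accumulator value into its guard, high, and low parts and checks whether the value still fits in 32 bits. It assumes the standard split (guard = bits 39-32, high = bits 31-16, low = bits 15-0); the function names are hypothetical.

```python
# Split a 40-bit accumulator into guard bits, high word, and low word.
MASK40 = (1 << 40) - 1


def split_accumulator(acc):
    """Return (guard, high, low) for a 40-bit accumulator value."""
    acc &= MASK40
    guard = (acc >> 32) & 0xFF
    high = (acc >> 16) & 0xFFFF
    low = acc & 0xFFFF
    return guard, high, low


def fits_in_32_bits(acc):
    """True when the guard bits are only a sign extension of bit 31."""
    guard, high, _ = split_accumulator(acc)
    expected = 0xFF if high & 0x8000 else 0x00
    return guard == expected


# Summing three large positive 32-bit values overflows 32 bits,
# but the running total is still held exactly thanks to the guard bits.
total = 0
for _ in range(3):
    total = (total + 0x7FFFFFFF) & MASK40

guard, high, low = split_accumulator(total)
print(f"AG={guard:02X} AH={high:04X} AL={low:04X} fits32={fits_in_32_bits(total)}")
```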


Page 22: TMS320C54x Architecture

Architecture

Barrel shifter: Shifts data by -16 to 31 bit positions in a single operation. It is used for:

• Pre-scaling an operand before an ALU operation

• Shift operations

• Normalizing

• Post-scaling an accumulator before storing it

Page 23: TMS320C54x Architecture

Architecture

Input sources:

• DB for a 16-bit data input operand

• DB and CB for a 32-bit data input operand

• Either one of the two 40-bit accumulators

Output destinations:

• One of the ALU inputs

• E-Bus through the MSW/LSW write select unit

Shift value (ranges from -16 to 31), taken from:

• An immediate operand

• The ASM field of ST1

• The T register
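The shifter behaviour for a 16-bit data operand can be approximated as follows. This sketch assumes that a positive count shifts left, a negative count shifts right, and that sign extension of the operand is controlled by a flag (modelled here as the `sxm` argument, after the SXM status bit, which the slide does not mention). The name `barrel_shift16` is invented for the example.

```python
# Approximate model of the barrel shifter acting on a 16-bit data operand.
# Assumptions: positive counts shift left, negative counts shift right,
# and `sxm` selects sign extension of the 16-bit input.
MASK40 = (1 << 40) - 1


def barrel_shift16(operand, shift, sxm=True):
    """Extend a 16-bit operand to 40 bits and shift it by -16..31 positions."""
    if not -16 <= shift <= 31:
        raise ValueError("shift count must be in the range -16..31")
    operand &= 0xFFFF
    if sxm and operand & 0x8000:
        operand -= 0x10000          # sign-extend the 16-bit input
    result = operand << shift if shift >= 0 else operand >> -shift
    return result & MASK40


print(f"{barrel_shift16(0x0001, 16):010X}")             # pre-scale into the high word
print(f"{barrel_shift16(0x8000, 0, sxm=True):010X}")    # sign-extended
print(f"{barrel_shift16(0x8000, 0, sxm=False):010X}")   # zero-extended
```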


Page 24: TMS320C54x Architecture

Memory Organization

Memory map: The memory space is divided into three individually selectable spaces:

• Program memory of 64K words

• Data memory of 64K words

• I/O space of 64K words

Some devices have more than 64K words of program memory, referred to as paged extended program memory. This program memory is divided into 64K-word blocks, each called a page.
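The paging arithmetic is easy to check with a short sketch. It assumes the maximum extended program space of 8M words mentioned earlier in the deck, so a full address is a page number followed by a 16-bit offset within the page; the function names are illustrative.

```python
# Paged extended program memory: 8M words split into 64K-word pages.
PAGE_WORDS = 64 * 1024               # words per page
EXTENDED_WORDS = 8 * 1024 * 1024     # assumed maximum program space (8M words)


def num_pages():
    """Number of 64K-word pages in the extended program space."""
    return EXTENDED_WORDS // PAGE_WORDS


def extended_address(page, offset):
    """Combine a page number and a 16-bit offset into one linear word address."""
    if not 0 <= page < num_pages():
        raise ValueError("page out of range")
    if not 0 <= offset < PAGE_WORDS:
        raise ValueError("offset out of range")
    return (page << 16) | offset


def split_extended(address):
    """Split a linear word address back into (page, offset)."""
    return address >> 16, address & 0xFFFF


print(num_pages())
print(hex(extended_address(1, 0x0000)))
print(split_extended(0x123456))
```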


Page 25: TMS320C54x Architecture

Memory Organization

Program memory: Contains instructions, immediate data operands, and tables.

Data memory: Stores data used by the instructions. It can also be used to store code.

I/O memory: Used for addressing memory-mapped peripherals. It can also serve as extra memory storage.

Memory Configuration

Three bits in the PMST register affect the memory configuration: MP/MC, OVLY, and DROM.

MP/MC is also an external pin on the processor.
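A hedged sketch of reading these three bits out of a PMST value is shown below. The bit positions (MP/MC at bit 6, OVLY at bit 5, DROM at bit 3, and the interrupt vector pointer in bits 15-7) are assumptions based on the commonly documented PMST layout and do not appear on the slide, so confirm them against the TI reference before relying on them.

```python
# Decode the memory-configuration bits of PMST.
# Bit positions are assumptions (usual PMST layout), not taken from the slide.
PMST_BITS = {"MP/MC": 6, "OVLY": 5, "DROM": 3}


def decode_pmst(pmst):
    """Return the MP/MC, OVLY, and DROM bits of a 16-bit PMST value."""
    return {name: (pmst >> bit) & 1 for name, bit in PMST_BITS.items()}


def vector_base(pmst):
    """Base address of the interrupt vector table, assuming IPTR sits in bits 15-7."""
    return pmst & 0xFF80


example = 0xFFE0  # illustrative value only
print(decode_pmst(example))
print(hex(vector_base(example)))
```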


Page 26: TMS320C54x Architecture

Memory Organization

Program memory, page 0 (addresses in hex)

MP/MC = 1 (microprocessor mode):
0000-FF7F  External
FF80-FFFF  Interrupt vectors (external)

MP/MC = 0 (microcomputer mode):
0000-EFFF  External
F000-FEFF  On-chip ROM (4K × 16)
FF00-FF7F  Reserved
FF80-FFFF  Interrupt vectors (on-chip)

Page 27: TMS320C54x Architecture

Memory Organization

Program memory, page 0 (addresses in hex)

MP/MC = 1 (microprocessor mode):
0000-007F  Reserved (OVLY = 1) or external (OVLY = 0)
0080-3FFF  On-chip DARAM (OVLY = 1) or external (OVLY = 0)
4000-FF7F  External
FF80-FFFF  Interrupt vectors (external)

MP/MC = 0 (microcomputer mode):
0000-007F  Reserved (OVLY = 1) or external (OVLY = 0)
0080-3FFF  On-chip DARAM (OVLY = 1) or external (OVLY = 0)
4000-EFFF  External
F000-FEFF  On-chip ROM (4K × 16)
FF00-FF7F  Reserved
FF80-FFFF  Interrupt vectors (on-chip)

Data memory (addresses in hex)

0000-007F  Memory-mapped registers and scratch-pad registers
0080-3FFF  On-chip DARAM (16K words)
4000-EFFF  External
F000-FEFF  ROM (DROM = 1) or external (DROM = 0)
FF00-FFFF  Reserved (DROM = 1) or external (DROM = 0)
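The effect of MP/MC, OVLY, and DROM on this map can be captured in a small lookup, sketched below in Python. The address boundaries are one reading of the memory-map figure on this slide (0080h, 4000h, F000h, FF00h, FF80h) and apply only to the device it depicts; the function names `program_region` and `data_region` are hypothetical.

```python
# Classify an address according to one reading of the slide's memory map.
# Boundaries are device specific and reconstructed from the figure, so treat
# them as assumptions rather than a definitive map.

def program_region(address, mp_mc, ovly):
    """Return the region of a page-0 program address for given MP/MC and OVLY bits."""
    if not 0 <= address <= 0xFFFF:
        raise ValueError("address must be a 16-bit value")
    if address <= 0x007F:
        return "reserved" if ovly else "external"
    if address <= 0x3FFF:
        return "on-chip DARAM" if ovly else "external"
    if mp_mc:  # microprocessor mode: everything above 3FFFh is off-chip
        return "external interrupt vectors" if address >= 0xFF80 else "external"
    if address <= 0xEFFF:
        return "external"
    if address <= 0xFEFF:
        return "on-chip ROM"
    if address <= 0xFF7F:
        return "reserved"
    return "on-chip interrupt vectors"


def data_region(address, drom):
    """Return the region of a data address for a given DROM bit."""
    if not 0 <= address <= 0xFFFF:
        raise ValueError("address must be a 16-bit value")
    if address <= 0x007F:
        return "MMR / scratch pad"
    if address <= 0x3FFF:
        return "on-chip DARAM"
    if address <= 0xEFFF:
        return "external"
    if address <= 0xFEFF:
        return "on-chip ROM" if drom else "external"
    return "reserved" if drom else "external"


print(program_region(0xF800, mp_mc=0, ovly=0))
print(program_region(0xF800, mp_mc=1, ovly=0))
print(data_region(0xF000, drom=1))
```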


Page 28: TMS320C54x Architecture

Memory Organization

On-chip ROM organization:

• Subdivided into blocks to enhance performance

• Allows one access per block

• On some devices, the on-chip ROM contains:
  – A boot loader that boots from a serial port, external memory, or the HPI
  – A 256-word µ-law expansion table
  – A 256-word A-law expansion table
  – A 256-word sine table
  – An interrupt vector table

On-chip ROM contents (542/534/548/549/5402), start addresses in hex:

F800  Boot loader code
FC00  µ-law expansion table
FD00  A-law expansion table
FE00  Sine look-up table
FF00  Reserved
FF80  Interrupt vector table (IVT)

Page 29: TMS320C54x Architecture

Memory Organization

Memory Mapped Registers:

A portion of data memory is used as registers, called memory-mapped registers (MMRs).

The MMRs reside in data page 0 (0000h - 007Fh). The CPU registers (26 in total) require no wait states.

Peripheral registers reside at addresses 0020h - 005Fh.

Scratch-pad registers are used for temporary variable storage. They reside in the range 0060h - 007Fh.

Each MMR is associated with a memory address. For example, ST0 is at address 06h and AR0 is at 10h.
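For reference, the register-to-address association can be written as a lookup table. The slide only states the ST0 and AR0 addresses and the peripheral and scratch-pad ranges; the remaining addresses in the sketch below follow the commonly documented 'C54x CPU register map and should be treated as assumptions, as should the 0000h - 001Fh extent of the CPU register area.

```python
# CPU memory-mapped registers and their data-page-0 addresses.
# Only ST0 (06h) and AR0 (10h) come from the slide; the rest are assumptions
# based on the usual 'C54x register map.
CPU_MMR = {
    "IMR": 0x00, "IFR": 0x01, "ST0": 0x06, "ST1": 0x07,
    "AL": 0x08, "AH": 0x09, "AG": 0x0A,
    "BL": 0x0B, "BH": 0x0C, "BG": 0x0D,
    "T": 0x0E, "TRN": 0x0F,
    **{f"AR{i}": 0x10 + i for i in range(8)},
    "SP": 0x18, "BK": 0x19, "BRC": 0x1A, "RSA": 0x1B, "REA": 0x1C,
    "PMST": 0x1D,
}


def mmr_kind(address):
    """Classify a data-memory address using the ranges given on the slide."""
    if 0x0000 <= address <= 0x001F:
        return "CPU register"
    if 0x0020 <= address <= 0x005F:
        return "peripheral register"
    if 0x0060 <= address <= 0x007F:
        return "scratch pad"
    return "general data memory"


print(len(CPU_MMR), "CPU registers")
print(f"ST0 at {CPU_MMR['ST0']:02X}h, AR0 at {CPU_MMR['AR0']:02X}h")
print(mmr_kind(0x0060))
```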


Page 30: TMS320C54x Architecture

Memory-Mapped Registers Tables

Page 31: TMS320C54x Architecture

Memory Organization

Temporary Register (T)

The T register is used to hold:

• One of the multiplicands

• A dynamic (execution-time programmable) shift count for instructions with a shift operation, such as ADD, LD, and SUB

• A dynamic bit address for the BITT instruction

• An operand for CMPS, the double-precision operation instructions, EXP, and NORM
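The EXP and NORM pairing is a good example of T acting as a dynamic shift count. The sketch below assumes the usual description of these instructions: EXP writes to T the number of redundant sign bits in the 40-bit accumulator minus 8 (or 0 for a zero accumulator), and NORM then shifts the accumulator by that count. The Python names `exp40` and `norm40` are illustrative, not TI mnemonics.

```python
# Sketch of EXP followed by NORM on a 40-bit accumulator.
# Assumption: EXP yields (redundant sign bits - 8), and NORM shifts by that value.
MASK40 = (1 << 40) - 1


def exp40(acc):
    """Return the exponent that EXP would place in T for a 40-bit value."""
    acc &= MASK40
    if acc == 0:
        return 0
    sign = acc >> 39
    redundant = 0
    for bit in range(38, -1, -1):       # scan downwards from the bit below the sign
        if (acc >> bit) & 1 != sign:
            break
        redundant += 1
    return redundant - 8


def norm40(acc, t):
    """Shift the accumulator left by t (or arithmetically right if t is negative)."""
    acc &= MASK40
    signed = acc - (1 << 40) if acc >> 39 else acc
    shifted = signed << t if t >= 0 else signed >> -t
    return shifted & MASK40


acc = 0x0000000001
t = exp40(acc)
print(t, f"{norm40(acc, t):010X}")
```

After normalization the most significant magnitude bit sits just below bit 31, so the value makes full use of the 32-bit high and low words.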
