© Copyright 2008 UG Consultants, Bangalore. All rights reserved
TMS320C54X ARCHITECTURE
UG Consultants, Bangalore
Introduction to TMS320C54x
• Lowest-power DSP: 0.54 mW/MIPS
• Acceleration for FIR and LMS filtering, code book search, polynomial evaluation, Viterbi decoding
Roadmap
TMS320C54X ARCHITECTURE
• Advanced Multibus Architecture With Three Separate 16-Bit Data Memory Buses and One Program Memory Bus
• 40-Bit Arithmetic Logic Unit (ALU) Including a 40-Bit Barrel Shifter and Two Independent 40-Bit Accumulators
• 17- × 17-Bit Parallel Multiplier Coupled to a 40-Bit Dedicated Adder for Non-Pipelined Single-Cycle Multiply/Accumulate (MAC) Operation
• Compare, Select, and Store Unit (CSSU) for the Add/Compare Selection of the Viterbi Operator
• Exponent Encoder to Compute an Exponent Value of a 40-Bit Accumulator Value in a Single Cycle
• Two Address Generators With Eight Auxiliary Registers and Two Auxiliary Register Arithmetic Units (ARAUs)
• Extended Addressing Mode for 8M × 16-Bit Maximum Addressable External Program Space
• 128K × 16-Bit On-Chip RAM Composed of:
  – Eight Blocks of 8K × 16-Bit On-Chip Dual-Access Program/Data RAM
  – Eight Blocks of 8K × 16-Bit On-Chip Single-Access Program RAM
• 16K × 16-Bit On-Chip ROM Configured for Program Memory
• Single-Instruction-Repeat and Block-Repeat Operations for Program Code
• Block-Memory-Move Instructions for Better Program and Data Management
• Instructions With a 32-Bit Long-Word Operand
• Instructions With Two- or Three-Operand Reads
• Arithmetic Instructions With Parallel Store and Parallel Load
• Conditional Load/Store Instructions
• Complex Specialized Instructions (Ex: FIRS, POLY, LMS, SQDST)
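The multiply/accumulate feature above is what makes FIR filtering fast: each filter tap costs one MAC. A minimal Python sketch of that idea, assuming a signed 40-bit accumulator that wraps on overflow; the helper names `wrap40` and `fir_mac` are illustrative and are not TI mnemonics:

```python
ACC_BITS = 40  # accumulator width quoted in the feature list


def wrap40(value):
    """Wrap an integer to a signed 40-bit two's-complement value."""
    value &= (1 << ACC_BITS) - 1
    if value & (1 << (ACC_BITS - 1)):
        value -= 1 << ACC_BITS
    return value


def fir_mac(samples, coeffs):
    """Compute one FIR output as a chain of multiply/accumulate steps."""
    acc = 0
    for x, h in zip(samples, coeffs):
        acc = wrap40(acc + x * h)  # one MAC per filter tap
    return acc


# Three-tap example with 16-bit sample and coefficient values.
y = fir_mac([1000, -2000, 3000], [3, 2, 1])
print(y)
```

Because the accumulator is wider than 32 bits, a sum of several full-scale 16 × 16 products does not wrap in this model.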
Overview of ‘C54x
Overview of the ’C54x device:
• Different memory and peripheral options.
• Performance of up to 500 MIPS (’C5441).
• Highly specialized instruction set.
• Super modified Harvard architecture.
• Low-power devices well suited for cellular applications and battery-operated devices.
• 16-bit fixed-point DSP (CISC processor).
TMS320C54x Key features
Total memory is divided into three spaces:
• 64K words of program memory (’548, ’549, ’5402, ’5410, and ’5420 devices support extended program memory of up to 8M words)
• 64K words of data memory (on-chip and external)
• 64K words of I/O memory
ROM, DARAM, and SARAM are the types of on-chip memory supported:
• DARAM (dual-access RAM) can be accessed twice per machine cycle.
• SARAM (single-access RAM) can be accessed once per machine cycle.
• ROM contains the boot loader and data tables. ROM can be customized by submitting a ROM mask to TI.
• The CPU and peripherals (e.g., BSP, HPI) can write to and read from DARAM in the same cycle.
Architecture
Internal Bus Structures:
There are eight internal buses in total: four program/data buses (PB, CB, DB, EB) and four address buses (PAB, CAB, DAB, EAB).
A separate output data bus, the E-bus, is used to write to memory.
A dual data-bus scheme, the C-bus and D-bus, permits fetching two 16-bit operands, or one 32-bit long-word operand, in the same cycle.
The P-bus carries instruction code and immediate data operands from program memory, and is also connected to the multiplier input.
Four address buses (PAB, CAB, DAB, and EAB) carry the addresses needed for instruction execution.
CPU and status registers:
• 40-bit ALU
• Two 40-bit accumulators
• 40-bit barrel shifter
• 17 × 17 multiplier and accumulate unit (MAC)
• 40-bit adder
• Data address generation unit (ARAU0 and ARAU1)
• Program address generation unit (PAGEN)
• Compare, select, and store unit (CSSU)
• Exponent encoder
Status and control registers:
The status registers indicate the state of the processor. There are two status registers, ST0 and ST1.
The control register used to configure the processor is called PMST.
The contents of these registers can be changed using the SSBX, RSBX, and LD instructions.
Status register ST0:
Status register ST1:
PMST (processor mode status register): the contents of PMST determine the operating mode and memory configuration of the DSP.
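Setting and clearing a single status bit, as SSBX and RSBX do, amounts to an OR or an AND-with-complement on a 16-bit register. A rough Python model follows. The bit positions used for SXM and INTM are assumptions for illustration and should be checked against the TI register diagrams; `ssbx` and `rsbx` here are plain Python helpers, not the instructions themselves.

```python
SXM_BIT = 8    # assumed position of the SXM bit in ST1
INTM_BIT = 11  # assumed position of the INTM bit in ST1


def ssbx(reg, bit):
    """Return the 16-bit register value with the given bit set."""
    return (reg | (1 << bit)) & 0xFFFF


def rsbx(reg, bit):
    """Return the 16-bit register value with the given bit cleared."""
    return reg & ~(1 << bit) & 0xFFFF


st1 = 0x0000
st1 = ssbx(st1, SXM_BIT)   # set SXM
st1 = ssbx(st1, INTM_BIT)  # set INTM
st1 = rsbx(st1, SXM_BIT)   # clear SXM again, INTM stays set
print(hex(st1))
```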
TMS320C54x Pipelining
The TMS320C54x has a six-stage pipeline. Each stage is independent and allows overlapped execution of instructions. One to six different instructions can be active simultaneously, each at a different stage. The pipeline provides very fast throughput, but requires some attention to detail in programming.
Pre-fetch (P): PAB is loaded with the contents of the PC.
Fetch (F): The opcode is fetched from the program bus (PB) and loaded into the IR.
Decode (D): The contents of the IR are decoded.
Access (A): DAB is loaded with the read address if a read access is required; if a second operand is required, CAB is loaded with its address. Auxiliary registers are also updated in indirect addressing mode.
Read (R): The data operand(s), if any, are read from the data buses DB and CB. At the same time, the write address, if any, is placed on EAB.
Execute/write (E/W): The instruction is executed, and EB is loaded with the write data if a write is required.
[Figure: the six stages P, F, D, A, R, E/W, annotated with instruction fetch, operand read, and operand write]
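The overlap described above can be visualized with a small simulation. This is only a sketch: it assumes one instruction enters the pipeline per cycle and that there are no stalls or conflicts, and `pipeline_chart` is a hypothetical helper name.

```python
STAGES = ["P", "F", "D", "A", "R", "E/W"]


def pipeline_chart(num_instructions):
    """Return one row per instruction; each row holds its stage at every cycle."""
    total_cycles = num_instructions + len(STAGES) - 1
    chart = []
    for i in range(num_instructions):
        row = []
        for cycle in range(total_cycles):
            stage_index = cycle - i  # instruction i enters the pipeline at cycle i
            in_flight = 0 <= stage_index < len(STAGES)
            row.append(STAGES[stage_index] if in_flight else "")
        chart.append(row)
    return chart


# Print a staircase diagram for six back-to-back instructions.
for row in pipeline_chart(6):
    print(" ".join(f"{stage:>3}" for stage in row))
```

Reading any single column of the printed chart shows which instructions are active in that cycle; once the pipeline is full, all six stages are busy at once.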
C54x Pipelining Bus/hardware Use
Overlapped execution of six instructions (time runs left to right):
P F D A R E/W
  P F D A R E/W
    P F D A R E/W
      P F D A R E/W
        P F D A R E/W
          P F D A R E/W

Stage  Action                    Bus       Hardware
P      Generate program address  PAB       PC
F      Get opcode                PB        Program memory
D      Decode instruction                  Decoder
A      Generate read address     DAB/CAB   ARs and ARAU
R      Read operands             DB/CB     Data memory
       Generate write address    EAB       ARs and ARAU
E/W    Execute instruction                 MAC, ALU
       Write result              EB        Data memory
Arithmetic Logic Unit (ALU)
ALU inputs:
X input sources:
1. Shifter output: a 32-bit or 16-bit data memory operand, or a shifted accumulator value.
2. Data memory operand from the D-bus.
Y input sources:
1. Accumulator A or B
2. Data memory operand from the C-bus (CB)
3. T register
Note: Neither accumulator A nor B is connected directly to the X input of the ALU. So how is an instruction such as ADD A,0,B executed?
ADD A,0,B computes B = B + (A shifted by 0). The source accumulator A passes through the shifter (shift count 0) to the X input.
The destination accumulator B feeds the Y input.
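An accumulator-to-accumulator add of this kind can be modeled in a few lines of Python: one operand passes through the shifter, the other feeds the ALU directly, and the sum is kept to 40 bits. This is a simplified sketch that ignores the overflow and sign-extension mode bits; `add_acc` and `to_signed40` are illustrative names.

```python
def to_signed40(value):
    """Wrap an integer to a signed 40-bit two's-complement value."""
    value &= (1 << 40) - 1
    return value - (1 << 40) if value & (1 << 39) else value


def add_acc(src, shift, dst):
    """Model ADD src, SHIFT, dst: return dst + (src shifted by SHIFT)."""
    if shift >= 0:
        shifted = to_signed40(src << shift)  # left shift through the shifter
    else:
        shifted = src >> -shift              # arithmetic right shift
    return to_signed40(shifted + dst)        # ALU adds the two inputs


a = 0x1234
b = 0x1000
b = add_acc(a, 0, b)  # corresponds to ADD A,0,B
print(hex(b))
```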
Accumulators A and B: destination registers for MAC or ALU operations. Each accumulator is divided into three parts:
• Guard bits AG and BG (bits 39–32)
• High word AH and BH (bits 31–16)
• Low word AL and BL (bits 15–0)
Guard bits provide headroom that prevents overflow in iterative computations. Bits 32–16 of accumulator A can be used as an input to the multiplier in the MAC unit.
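The three accumulator parts can be illustrated by slicing a 40-bit value. The sketch below assumes the layout guard = bits 39–32, high word = bits 31–16, low word = bits 15–0; `split_accumulator` is a hypothetical helper.

```python
def split_accumulator(acc):
    """Split a 40-bit accumulator value into guard, high, and low fields."""
    raw = acc & 0xFFFFFFFFFF  # keep the 40-bit two's-complement pattern
    return {
        "guard": (raw >> 32) & 0xFF,   # AG / BG
        "high": (raw >> 16) & 0xFFFF,  # AH / BH
        "low": raw & 0xFFFF,           # AL / BL
    }


# Summing three maximum positive 32-bit values overflows 32 bits,
# but the excess is held in the guard bits rather than lost.
total = 3 * 0x7FFFFFFF
print(split_accumulator(total))
```

With eight guard bits, this model can absorb on the order of 256 full-scale 32-bit additions before the 40-bit value itself wraps.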
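Barrel shifter: shifts data by –16 to 31 bit positions in a single operation (negative counts shift right, positive counts shift left). It is used for:
• Pre-scaling an operand before an ALU operation
• Shift operations
• Normalizing
• Post-scaling an accumulator value before storing it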
Input sources:
• DB for a 16-bit data input operand
• DB and CB for a 32-bit data input operand
• Either one of the two 40-bit accumulators
Output destinations:
• One of the ALU inputs
• E-bus, through the MSW/LSW write select unit
Shift value (ranges from –16 to 31), taken from:
• An immediate operand
• The ASM field of ST1
• The T register
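A behavioral sketch of the shifter in Python, assuming a signed 40-bit result, sign-extended right shifts for negative counts, and wrap-around on left-shift overflow. `barrel_shift` is an illustrative name, and mode bits that affect the real hardware are ignored.

```python
def barrel_shift(value, count):
    """Shift a signed value by count positions (-16..31), keeping 40 bits."""
    if not -16 <= count <= 31:
        raise ValueError("shift count must be in the range -16 to 31")
    if count >= 0:
        shifted = value << count   # left shift
    else:
        shifted = value >> -count  # arithmetic (sign-extending) right shift
    shifted &= (1 << 40) - 1       # keep 40 bits
    return shifted - (1 << 40) if shifted & (1 << 39) else shifted


print(hex(barrel_shift(0x00FF, 16)))  # pre-scale a 16-bit operand into the high word
print(barrel_shift(-256, -4))         # scale a negative value down by 16
```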
Memory Organization
Memory map: The memory space is divided into three individually selectable spaces:
• Program memory of 64K words
• Data memory of 64K words
• I/O space of 64K words
Some devices have more than 64K words of program memory; this is referred to as paged extended program memory. The program memory is divided into 64K-word blocks called pages.
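With 64K-word pages, an extended program address splits naturally into a page number and a 16-bit offset within the page. A minimal sketch, assuming a flat address of up to 8M words (the maximum quoted earlier for extended program space); `split_program_address` is a hypothetical helper:

```python
PAGE_SIZE = 0x10000          # 64K words per page
MAX_WORDS = 8 * 1024 * 1024  # 8M words of extended program space


def split_program_address(address):
    """Split a flat program address into (page, offset within page)."""
    if not 0 <= address < MAX_WORDS:
        raise ValueError("address outside the 8M-word program space")
    return address // PAGE_SIZE, address % PAGE_SIZE


page, offset = split_program_address(0x01F000)
print(page, hex(offset))
```

Under these assumptions, 8M words correspond to 128 pages, numbered 0 through 127.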
Program memory: Contains instructions, immediate data operands, and tables.
Data memory: Stores data used by the instructions. Can also be used to store code.
I/O memory: Used for addressing memory-mapped peripherals. Can also serve as extra memory storage.
Memory Configuration
Three bits in the PMST register affect the memory configuration: MP/MC, OVLY, and DROM.
MP/MC is also an external pin on the processor.
Program memory, page 0 (addresses in hex):
Address      MP/MC = 1 (microprocessor mode)   MP/MC = 0 (microcomputer mode)
0000–007F    Reserved (OVLY = 1)               Reserved (OVLY = 1)
             External (OVLY = 0)               External (OVLY = 0)
0080–3FFF    On-chip DARAM (OVLY = 1)          On-chip DARAM (OVLY = 1)
             External (OVLY = 0)               External (OVLY = 0)
4000–EFFF    External                          External
F000–FEFF    External                          On-chip ROM (4K × 16)
FF00–FF7F    External                          Reserved
FF80–FFFF    Interrupt vectors (external)      Interrupt vectors (on-chip)

Data memory (addresses in hex):
Address      Contents
0000–007F    Memory-mapped registers and scratch-pad registers
0080–3FFF    On-chip DARAM (16K words)
4000–EFFF    External
F000–FEFF    ROM (DROM = 1) or external (DROM = 0)
FF00–FFFF    Reserved (DROM = 1) or external (DROM = 0)
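The effect of the MP/MC, OVLY, and DROM bits can be summarized as an address decoder. The sketch below is one reading of the map above: the boundaries 007F/0080, 3FFF/4000, EFFF/F000, FEFF/FF00, and FF7F/FF80 are taken from the figure, while the function names and region labels are illustrative. Other ’C54x devices have different maps.

```python
def program_region(addr, mp_mc, ovly):
    """Classify a page-0 program address for given MP/MC and OVLY bits."""
    if addr <= 0x007F:
        return "reserved" if ovly else "external"
    if addr <= 0x3FFF:
        return "on-chip DARAM" if ovly else "external"
    if mp_mc:  # microprocessor mode: the upper region is external
        return "interrupt vectors (external)" if addr >= 0xFF80 else "external"
    if addr <= 0xEFFF:  # microcomputer mode from here on
        return "external"
    if addr <= 0xFEFF:
        return "on-chip ROM"
    if addr <= 0xFF7F:
        return "reserved"
    return "interrupt vectors (on-chip)"


def data_region(addr, drom):
    """Classify a data address for a given DROM bit."""
    if addr <= 0x007F:
        return "MMR / scratch-pad registers"
    if addr <= 0x3FFF:
        return "on-chip DARAM"
    if addr <= 0xEFFF:
        return "external"
    if addr <= 0xFEFF:
        return "on-chip ROM" if drom else "external"
    return "reserved" if drom else "external"


print(program_region(0xF800, mp_mc=0, ovly=1))
print(data_region(0xF800, drom=0))
```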
On-chip ROM organization:
• Subdivided into blocks to enhance performance
• Allows one access per block
On some devices, the on-chip ROM contains:
• A boot loader that boots from the serial port, external memory, or HPI
• A 256-word μ-law expansion table
• A 256-word A-law expansion table
• A 256-word sine table
• An interrupt vector table
On-chip ROM layout for 542/534/548/549/5402 (start addresses in hex):
F800   Boot loader code
FC00   μ-law table
FD00   A-law table
FE00   Sine lookup table
FF00   Reserved
FF80   Interrupt vector table (IVT)
Memory Mapped Registers:
A portion of data memory is used as registers; these are called memory-mapped registers (MMRs).
Peripheral registers reside at addresses 0020h – 005Fh.
Scratch-pad registers are used for temporary variable storage. They reside in the range 0060h – 007Fh.
The MMRs reside in data page 0 (0000h – 007Fh). The CPU registers (26 in total) require no wait states.
Each MMR is associated with a memory address. For example, ST0 is at address 06h and AR0 is at 10h.
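The address ranges above can be turned into a small lookup. In this sketch the CPU-register range 0000h – 001Fh is an assumption inferred from the other two ranges, the two example addresses are read as hexadecimal (06h and 10h), and `classify_page0` is a hypothetical helper.

```python
# Example register addresses from the text, read as hexadecimal.
MMR_ADDRESS = {"ST0": 0x06, "AR0": 0x10}


def classify_page0(address):
    """Classify a data-memory address within data page 0."""
    if 0x0000 <= address <= 0x001F:
        return "CPU register"  # assumed range, see the note above
    if 0x0020 <= address <= 0x005F:
        return "peripheral register"
    if 0x0060 <= address <= 0x007F:
        return "scratch-pad"
    return "not in data page 0"


for name, address in MMR_ADDRESS.items():
    print(name, hex(address), classify_page0(address))
```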
Temporary Register (T)
• Holds one of the multiplicands for multiply operations
• Holds a dynamic (execution-time programmable) shift count for instructions with a shift operation, such as ADD, LD, and SUB
• Holds a dynamic bit address for the BITT instruction
• Serves as an operand for CMPS, the double-precision instructions, EXP, and NORM
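The EXP/NORM pairing mentioned above can be sketched as follows. The model assumes that EXP yields the number of redundant sign bits of a 40-bit accumulator minus 8 (and 0 for a zero accumulator), and that NORM then shifts the accumulator by that count; this is a simplified reading, and `exp40` and `norm40` are illustrative Python names.

```python
def exp40(acc):
    """Exponent of a signed 40-bit value: redundant sign bits minus 8."""
    if acc == 0:
        return 0
    # Significant magnitude bits; negatives are measured via their complement.
    magnitude_bits = acc.bit_length() if acc > 0 else (~acc).bit_length()
    redundant_sign_bits = 39 - magnitude_bits
    return redundant_sign_bits - 8


def norm40(acc, t):
    """Shift acc by the exponent t (left if positive, right if negative)."""
    return acc << t if t >= 0 else acc >> -t


acc = 0x123400
t = exp40(acc)                  # exponent that would be held in T
print(t, hex(norm40(acc, t)))   # normalized value fills the 32-bit range
```

After normalization in this model, a positive value lies between 40000000h and 7FFFFFFFh, so the most significant magnitude bit sits just below the 32-bit sign position.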