TMS320C54x DSP Design Workshop Student Guide DSP54x-NOTES-1.2 May 1997 Technical Training


Copyright © 1997 Texas Instruments Incorporated. All rights reserved.

Notice

No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording or otherwise, without the prior written permission of Texas Instruments.

Texas Instruments reserves the right to update this Guide to reflect the most current product information for the spectrum of users. If there are any differences between this Guide and a technical reference manual, references should always be made to the most current reference manual. Information contained in this publication is believed to be accurate and reliable. However, responsibility is assumed neither for its use nor any infringement of patents or rights of others that may result from its use. No license is granted by implication or otherwise under any patent or patent right of Texas Instruments or others.

Revision History


Welcome to the TMS320C54x DSP Design Workshop

Texas Instruments Technical Training

Introductions

• Name

• Company

• Project Responsibilities

• DSP Experience

• 320 Experience

• Hardware/Software, Asm/C

• Interests


TMS320C54x Workshop Agenda

I:
1. Introduction and Overview
2. Assembly Language Environment
3. Addressing Modes

II:
4. Basic Programming Techniques
5. Advanced Programming Control
6. Pipeline Issues

III:
7. Numerical Issues
8. Fundamental DSP Applications
9. Advanced DSP Applications
10. Interrupts

IV:
11. Hardware Interfacing
12. Other Interfacing
13. System Design Issues
14. Using the C Compiler

Introduction and Overview

Learning Objectives

• Describe the requirements of a DSP system.

• Identify the CPU components of the ’C54x.

• List the ’C54x internal buses and their usage.

• List the ’C54x pipeline stages and their actions.

• Describe the memory map of the ’C54x.

• List memory and peripherals of the ’C54x devices.

• Become familiar with the ’C54x simulator.


Module 1

DSP: Sum-of-Products

$$y = \sum_{n=1}^{100} x_n a_n$$

[Figure: x and a feed MPY; the product feeds ADD, which produces y.]
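The sum of products on this slide can be sketched in C. This is a minimal illustration, not workshop material: the function name sum_of_products, the 16-bit operand type, and the 64-bit accumulator are assumptions chosen so the sketch runs on a host compiler without overflow.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Sum of products: y = x[0]*a[0] + ... + x[n-1]*a[n-1].
 * Each loop pass performs one multiply (MPY) and one add (ADD).
 * The 64-bit accumulator is an assumption for this host-side sketch. */
int64_t sum_of_products(const int16_t *x, const int16_t *a, size_t n)
{
    assert(x != NULL && a != NULL);
    int64_t y = 0;
    for (size_t i = 0; i < n; i++) {
        /* Widen before multiplying so the 16 x 16 product cannot overflow. */
        y += (int32_t)x[i] * (int32_t)a[i];
    }
    return y;
}
```

Calling sum_of_products(x, a, 100) on two 100-element arrays corresponds to the 100-term sum on the slide.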

MAC Unit Details

[Figure: MAC unit block diagram. Labels: MPY, ADD, D and C buses, M bus, acc A, acc B, multiplier inputs marked s/u, adder inputs A, B, and 0, FRCT. Example instruction: MAC *AR2+, *AR3+, A]


Accumulators + ALU

General-Purpose Math, ex: t = s + e - r

[Figure: ALU block diagram. Labels: MUX, U BUS, acc A, acc B, ALU, A BUS, B BUS; MUX inputs A, B, M and A, B, C, T, D, S.]

LD   s, A
ADD  e, A
SUB  r, A
STL  A, t
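The four-instruction sequence for t = s + e - r can be mirrored step by step in C. This is a hedged sketch: the function name add_sub_16 is hypothetical, the 64-bit variable merely stands in for an accumulator wider than 16 bits, operands are assumed to be sign-extended on load, and STL is modeled as storing the low 16 bits of the accumulator with no saturation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Host-side model of: LD s,A / ADD e,A / SUB r,A / STL A,t.
 * Assumptions: sign-extended loads, no saturation, and STL keeps
 * only the low 16 bits of the accumulator. */
void add_sub_16(int16_t s, int16_t e, int16_t r, uint16_t *t)
{
    assert(t != NULL);
    int64_t acc = s;                            /* LD  s, A */
    acc += e;                                   /* ADD e, A */
    acc -= r;                                   /* SUB r, A */
    *t = (uint16_t)((uint64_t)acc & 0xFFFFu);   /* STL A, t */
}
```

Because the intermediate value lives in a wide accumulator, s + e may exceed 16 bits without harm as long as the final result fits once r is subtracted.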


Barrel Shifter

[Figure: barrel shifter block diagram. Labels: SHIFTER (-16 to +31), inputs C, D and A, B, S BUS to the ALU, W BUS.]

LD   X, 16, A
STH  B, y
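The effect of a shifted load and a high-word store can be sketched in C. The helper names load_shifted and store_high are hypothetical; the -16 to +31 range comes from the slide, while treating negative shifts as arithmetic right shifts and STH as a store of accumulator bits 31 to 16 are assumptions of this sketch. Note that the slide's two instructions use different accumulators (A and B); the usage below chains the helpers only to show the arithmetic.

```c
#include <assert.h>
#include <stdint.h>

/* Load a 16-bit operand through the shifter.
 * shift > 0: left shift; shift < 0: arithmetic right shift (assumed). */
int64_t load_shifted(int16_t x, int shift)
{
    assert(shift >= -16 && shift <= 31);
    if (shift >= 0) {
        /* Multiply instead of << so negative operands stay well defined. */
        return (int64_t)x * ((int64_t)1 << shift);
    }
    /* Arithmetic right shift written as floor division. */
    int64_t d = (int64_t)1 << (-shift);
    int64_t q = x / d;
    if (x % d != 0 && x < 0) {
        q -= 1;
    }
    return q;
}

/* Model of a high-word store: return accumulator bits 31..16. */
uint16_t store_high(int64_t acc)
{
    return (uint16_t)(((uint64_t)acc >> 16) & 0xFFFFu);
}
```

For example, store_high(load_shifted(0x1234, 16)) gives back 0x1234: a left shift of 16 places the operand in the high word, which the high-word store then returns.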

Temporary Register

[Figure: T register loaded from the D bus; the T BUS feeds the ALU and MAC. Other labels: EXP, A, B, X.]

ex: A = xa

LD   x, T
MPY  a, A
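The two-step multiply (load T, then multiply a memory operand by T) can be sketched in C. The function name mpy_t is hypothetical. The frct parameter models the FRCT label shown on the MAC slide; treating it as a one-bit left shift of the product for fractional (Q15) operands is an assumption of this sketch, since these notes do not spell out its behavior.

```c
#include <assert.h>
#include <stdint.h>

/* Model of: LD x,T / MPY a,A  (A = T * a).
 * frct = 1 doubles the product, which is the usual way to keep a
 * Q15 x Q15 result aligned as Q31 (assumed behavior of FRCT). */
int64_t mpy_t(int16_t t, int16_t a, int frct)
{
    assert(frct == 0 || frct == 1);
    int64_t product = (int64_t)t * (int64_t)a;
    return frct ? product * 2 : product;
}
```

With frct = 1, mpy_t(0x4000, 0x4000, 1) returns 0x20000000, which is 0.5 times 0.5 = 0.25 when both operands are read as Q15 and the result as Q31.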


’C54x Buses

[Figure: ’C54x bus structure. The P, D, C, and E buses connect internal memory and, through MUXes, external memory to the ALU, SHIFT, B, A, MAC, and T blocks; an M bus is also shown. Example instruction: MAC *AR2+, *AR3+, A]


Pipeline - Concept

F: Fetch     Get instruction from memory.
D: Decode    Schedule activity.
R: Read      Get operand from memory.
X: Execute   Perform operation.

Memory Interaction

• Broken into two phases:
  1. Calculate address
  2. Collect data

• Allows more time for memory interface.


’C54x Pipeline - Enhanced

P  Prefetch  Calculate address of instruction.
F  Fetch     Collect instruction.
D  Decode    Interpret instruction.
A  Access    Calculate address of operand.
R  Read      Collect operand.
X  Execute   Perform operation.

Memory Write

• When storing results back to memory

• Two phases:
  - Address set up
  - Data written

• Overlaid onto R + X phases

• Best balance of:
  - Processor loading
  - Speed
  - Cost


’C54x Pipeline Events

P  Drive address of instruction          PA
F  Collect instruction                   PD
D  Interpret instruction, plan job       ctlr
A  Set up pointers, calc data address    DA
R  Collect operand                       DD
X  Execute operation                     *,+

Write: calculate write address (EA), send result (ED).

’C54x Pipeline Hardware

P  PC, PA
F  Program Mem, PD
D  Controller
A  ARs, DA, ARAUs
R  Data Mem, DD ; AR, ARAU, EA
X  CALU (MAC, ALU) ; ED, Data Mem


’C54x Components and Bus Usage

[Figure: ’C54x components and the buses they use. Internal memory and, through MUXes, external memory connect over the P, C, D, and E buses to PC, CNTL, ARs, ALU, SHIFT, B, A, MAC, and T.]


Pipeline Performance

TIME

P1 F1 D1 A1 R1 X1
   P2 F2 D2 A2 R2 X2
      P3 F3 D3 A3 R3 X3
         P4 F4 D4 A4 R4 X4
            P5 F5 D5 A5 R5 X5
               P6 F6 D6 A6 R6 X6

FULLY LOADED ’PIPE’
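The timing diagram above can be expressed as a small model: with no stalls, instruction i occupies stage s during time slot i + s. The sketch below is an illustration under that no-stall assumption; the names pipeline_cycles and stage_at are hypothetical.

```c
#include <assert.h>

#define NUM_STAGES 6

/* Stage letters in pipeline order: Prefetch, Fetch, Decode,
 * Access, Read, Execute. */
static const char STAGES[NUM_STAGES + 1] = "PFDARX";

/* Time slots needed to complete n_instr instructions with no stalls. */
int pipeline_cycles(int n_instr)
{
    assert(n_instr >= 0);
    return n_instr == 0 ? 0 : n_instr + NUM_STAGES - 1;
}

/* Stage occupied by instruction instr (1-based) in time slot cycle
 * (1-based), or '-' if the instruction is not in the pipeline then. */
char stage_at(int instr, int cycle)
{
    assert(instr >= 1 && cycle >= 1);
    int s = cycle - instr;
    return (s >= 0 && s < NUM_STAGES) ? STAGES[s] : '-';
}
```

In time slot 6 the model gives X for instruction 1 and P for instruction 6, which is the fully loaded column of the diagram; six instructions finish in 11 slots, and each further instruction adds one slot.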

Pipeline Conflicts - External Memory

[Figure: ’54x with both program (P) and data (D) in external memory. Pipeline timing for instructions 1 to 6 through the P, F, D, A, R, and X stages, with stall slots (--) inserted after P4 and P5.]


Pipeline Flow: Internal and External Memories

[Figure: two alternative ’54x arrangements of program (P) and data (D) memory.]

P1 F1 D1 A1 R1 X1
   P2 F2 D2 A2 R2 X2
      P3 F3 D3 A3 R3 X3
         P4 F4 D4 A4 R4 X4
            P5 F5 D5 A5 R5 X5
               P6 F6 D6 A6 R6 X6

NO CONFLICT

Pipeline: Internal Memory Only

[Figure: ’C54x with internal ROM (4K blocks) and RAM (1K blocks) connected to the ALU and MAC over the P, D, and C buses. Other labels: DA.]

Two accesses per block per cycle


’C541 Memory Maps

PROGRAM
0000 - 13FF   EXT, or RAM ? (OVLY)
1400 - 8FFF   EXT
9000 - FF7F   Internal ROM ?
FF80 - FFFF   VECTORS

DATA
0000 - 13FF   MMR / RAM
1400 - DFFF   EXT
E000 - FFFF   EXT, or ROM ? (DROM)

’C541 Program Memory Options

All External (MP/MC = 1)
0000 - FFFF   EXT (VECTORS* at FF80)

28K ROM** (MP/MC = 0)
0000 - 8FFF   EXT
9000 - 97FF   2K ROM
9800 - 9FFF   2K ROM
A000 - AFFF   4K ROM
B000 - BFFF   4K ROM
C000 - CFFF   4K ROM
D000 - DFFF   4K ROM
E000 - EFFF   4K ROM
F000 - FFFF   4K ROM w VECs*

’RAM’ Option (OVLY = 1)
0000 - 007F   EXT
0080 - 13FF   RAM
1400 - FFFF   EXT or ROM (VECTORS*)

*  FF80 - FFFF are the default locations for vectors.
** Internal ROM FF00 - FF7F reserved for TI test.


’C541 Data Memory

0000 - 13FF   MMR / RAM
1400 - DFFF   EXT
E000 - FFFF   EXT or ROM

MMR / RAM detail
0000 - 03FF   MMR + RAM a
0400 - 07FF   RAM b
0800 - 0BFF   RAM c
0C00 - 0FFF   RAM d
1000 - 13FF   RAM e

0000 - 03FF detail
0000 - 005F   MMR
0060 - 007F   SPRAM
0080 - 03FF   RAM a
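The ’C541 data memory map can be captured as a small address classifier. This is a sketch under assumptions: the boundary addresses are the ones printed on the slide, their pairing with the region labels is reconstructed from the slide layout, and the type and function names are hypothetical. Whether E000 - FFFF is external memory or ROM depends on the DROM setting shown on the memory map slide, so the sketch reports it as a single region.

```c
#include <assert.h>
#include <stdint.h>

/* Regions of the 'C541 data space as labeled on the slide. */
typedef enum {
    C541_MMR,          /* 0000 - 005F */
    C541_SPRAM,        /* 0060 - 007F */
    C541_RAM,          /* 0080 - 13FF (RAM a..e) */
    C541_EXT,          /* 1400 - DFFF */
    C541_EXT_OR_ROM    /* E000 - FFFF */
} c541_data_region;

c541_data_region c541_data_region_of(uint16_t addr)
{
    if (addr < 0x0060u) return C541_MMR;
    if (addr < 0x0080u) return C541_SPRAM;
    if (addr < 0x1400u) return C541_RAM;
    if (addr < 0xE000u) return C541_EXT;
    return C541_EXT_OR_ROM;
}

/* RAM block letter ('a'..'e') for a RAM address, or 0 otherwise.
 * Each block spans 1K words, so the block index is addr / 0x400. */
char c541_ram_block(uint16_t addr)
{
    if (c541_data_region_of(addr) != C541_RAM) return 0;
    char block = (char)('a' + (addr >> 10));
    assert(block >= 'a' && block <= 'e');
    return block;
}
```

For example, address 0x0200 (used by the lab's sample program) falls in RAM block a under this reading of the map.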

’C542 Memory Maps

P
0000 - 27FF   EXT, or RAM ? (OVLY)
2800 - F7FF   EXT
F800 - FF7F   ROM
FF80 - FFFF   VECTORS

D
0000 - 27FF   RAM
2800 - FFFF   EXT


’C54x Memory Mix

C54x   RAM   ROM   DROM
1        5    28      8
2       10     2
3       10     2
4        4    24      8
5        6    48     16
6        6    48     16
9       32    16

’C54x Peripheral Mix

C54x   SER   TDM   BSP   HPI
1        2
2              1     1     1
3              1     1
4        2
5        1           1     1
6        1           1
9              1     2     1


Lab 1: Debugger Walkthrough

Window Management
• Select
• Close
• Open
• Move
• Size
• Edit

Running Code
• Reset
• Step
• Run
• Breakpoint
• Benchmark

Display and Automation
• Saving configurations
• Using log files

Debugger Screen

[Figure: debugger screen showing the command menu bar, code window, command window, CPU registers, and memory window.]

Lab 1: The Debugger Interface

The Texas Instruments DSP family has moved to a common user interface called the Source Debugger. Almost all TMS320 tools, including the TMS320C54x simulator, use this interface.

This module will guide you through the basic commands of the source debugger. Upon completion of the walkthrough, you will be able to:

• Set up and manipulate windows to display variables and data structures

• Single-step C statements and/or assembly instructions

• Set breakpoints and benchmark code

• Issue debugger commands via command menus, keyboard entry, or a mouse

Note: This walkthrough is intended to demonstrate the use of the debugger interface. It is not meant to be an opportunity to get to know the ’C54x assembly language or C. Please do not attempt to dwell upon them, as this adds considerable time (and effort) to the process. The assembly language will be thoroughly presented in succeeding modules.

Simulator Files and Directory

Verify that you are in the proper directory by typing:

cd \dsp54x\labs ↵

The demo program is a C file which simply loads an incrementing value to a variety of data types. Although of little interest in terms of DSP, it is a useful platform for exercising the debugger interface and commands.

Sample Program - Source Debugger Walkthrough

/*-------------------------------------------------------------------------*/
/* Sample program for Source Debugger Walkthrough                          */
/*-------------------------------------------------------------------------*/

/* declare globals: int, float array, mixed type structure */
int i;
float a[10];
struct {
    int i;
    float j;
    int k[4];
    int *p;
} example;

void init(int x);

/* count from 0 to 999 forever, call init each count */
int main(void)
{
    int count;

    for (;;)
        for (count = 0; count < 1000; count++)
            init(count);
}

/* load all globals with the current count value */
void init(int x)
{
    /* fill the float array */
    for (i = 0; i < 10; i++)
        a[i] = x;

    /* fill the scalar structure members */
    example.i = x;
    example.j = x;

    /* fill the int array inside the structure */
    for (i = 0; i < 4; i++)
        example.k[i] = x;

    /* point p at data address 0x0200 + x */
    example.p = (int *)(0x0200 + x);
}

Starting the Simulator

To start the debugger and load your linked output file, type:

SIM5XX lab1 ↵

The debugger assumes that the file to be loaded has a default extension of .out. We will learn how to create output files in Module 2.

You should now see the debugger screen.

Note: If, in the process of this lab, you reach a point where the system no longer responds, or is otherwise corrupted, you may reload the file by typing LOAD lab1 at the command prompt. In rare cases, you may have to exit the simulator entirely by typing QUIT and starting over.

Selecting the Active Window

The active window is shown with a highlighted border. To change the active window, point the mouse at the desired window and press the left button. Repeat this a few times to cycle through the active windows.

Make the DISASSEMBLY window active. You can scroll through the code displayed in the DISASSEMBLY window in several ways: with the keyboard up-arrow, down-arrow, PgUp, and PgDn keys, or by pointing the mouse at the up and down arrows on the window border and pressing the left button.

Note: Be careful. If you click while over an element of a window, you may set a breakpoint (if you are in a FILE or DISASSEMBLY window), or select a register or memory location for modification. To remove the breakpoint, simply point and click at the highlighted instruction.

Try scrolling through the DISASSEMBLY window several ways.

When you want to return to a particular label or address, use the command addr nnnn, where nnnn is the label or address to return to. For example, type:

addr c_int00 ↵

Move to an absolute address by typing:

addr 0x0005 ↵

Move to a function by typing:

addr main ↵

Sizing and Moving Windows

You can size and move any window. Make the CPU window active with the mouse. To change the size of the window, grab the left or right corner of the window by holding the left mouse button down, and drag the corner to a new position. Release the mouse button.

To move the window, grab the top of the window by holding the left mouse button down and drag the window to a new position.

To restore the screen to its original state, use the “screen configuration” command with no arguments:

sconfig ↵

To load a particular screen configuration, specify the desired file with the sconfig command:

sconfig tc.clr ↵

To save a configuration, use the ssave <file name> command. There is no default extension, although .CLR (for color) is the extension generally used.

Typing sconfig once again will return you to the original configuration. With no arguments, the sconfig command uses the default filename init.clr. You may use either of these configurations, or any of your own creation, whenever using the debugger.

Running the Program

The sample program begins execution at the C reset function labeled c_int00. To reposition the disassembly window, type:

addr c_int00 ↵

The assembly code shown at c_int00 can be single-stepped by pressing <F8> on the keyboard or by pointing and clicking the left mouse button.

Try running a few instructions by pressing <F8> and watching the PC value (in the CPU window) change as the corresponding instruction is executed (highlighted). Modified register and memory contents are also highlighted.

To skip past this reset function, type:

go main ↵

Notice that the display changes to show the C program in the FILE window. Also notice that the CPU registers are no longer displayed. The CALLS window is opened to show which C functions have been called.

The ability to view C source code in its native format is why our debugger is termed a “source” debugger.

Watch Window

Suppose you want to watch the value of a C variable while single-stepping the program. Type:

wa count ↵

This creates a watch window with the value of the variable count displayed. The value displayed for count is not meaningful at this point since it has not been initialized yet. You may discover that opening a watch window on a variable not found in the current function will generate a warning.

Single-step the program (execute one C statement at a time) by pressing

<F8>

<F8>

Notice that the variable count was assigned the value zero. You should now be at the init() function call. Press:

<F8>

and you will go to the function. Notice the change in the CALLS window.

Add another variable to the watch window by typing:


wa i ↵

Single-step some more C statements by pressing <F8>.

To watch an array element, type:

wa a[0] ↵

Notice that the display shows a[0] as a floating-point value automatically. The debugger displays values according to their defined type.

When a watch or display is no longer needed on screen, it may be closed by first selecting it (using <F6> or a mouse click), and then using the close window command: the <F4> key. <F4> does not apply to the main simulator windows (CPU, MEM, Disassembly, etc.).

Displaying arrays and structures is a powerful debugging feature. Type:

wa a ↵

You receive the error message Invalid watch expression because you are allowed to watch only single, scalar values. If you have forgotten the type of the variable a, type:

whatis a ↵

To display the entire array of floating-point values, use the display command:

disp a ↵

You might want to move the DISP window over to the right of the screen.

Display the structure called example by typing:

disp example ↵

This structure has four members called i, j, k, and p. Note that they are displayed in accordance with their type. Move this window over to the right just below the DISP: a window.

To display the contents of the array example.k, move the cursor down to highlight the line showing k: [...] and select this by pressing:

<F9> or the left mouse button

A new window is opened which shows the elements of the array. If this had been another structure (instead of an array), it would be shown as k: {...}. Brackets indicate arrays and braces indicate structures.

Since this new window showing the array k is opened directly on top of the previous window, you should move it down to make the example window visible.

Single STEP and NEXT instructions

Now that the display windows are opened, let’s restart from the beginning and then single-step some instructions. Type:

restart ↵

go main ↵

Press:

<F8>

Continue executing instructions by repeatedly pressing <F8>. Observe how the values in the watch window and display windows change. Continue stepping through the init function until it returns to the main function. If you do not wish to see the remainder of the function in step mode, you can complete the function and return immediately by entering:

ret ↵

Note: If you are not in a sub-function at this point, the simulator will never reach a return and, therefore, will never halt. To stop the simulator in such an event, simply press <Esc>.

Suppose you want to single-step without seeing the details of each individual function call. You can step across function calls using:

next ↵

Alternatively, you can press <F10>. Notice that the next C statement is executed without showing function calls. (Called functions are not skipped; they are just not executed in single-step mode.)

Both the step command <F8> and next command <F10> can be executed from the command line with an argument specifying the number of instructions to execute. For example, type:

step 10000 ↵

To stop execution, press:

<Esc>

Note: You can use a Boolean expression as well as a numeric count with the step command; e.g., step (AR0 != 0)

If you are executing within the init() function and want to return, type:

ret ↵


Now try the next command with a count value:

next 10000 ↵

You can sit back and observe the single-step operation.

To stop execution, press:

<Esc>

and you will see a User halt message displayed in the command window.

Debugging Assembly Language and C Programs

This part of the tutorial assumes you have already completed the first part of the walkthrough and have loaded the lab1.out program into the debugger.

To start execution over again, type:

restart ↵

go main ↵

MIXED Mode

To debug in mixed mode, which allows you to observe assembly instructions and C statements simultaneously, type:

mix ↵

You should see both the C source code and the corresponding assembly code. The DISASSEMBLY window shows highlighted memory locations which are associated with the current C statement.

You may have to move and size your display windows and watch windows to see the CPU and REGISTER windows. A suggestion is to remove (reset) the watch window using the command:

wr ↵

Try single-stepping by repeatedly pressing:

<F8>

Notice that assembly instructions are stepped. If you are currently executing within the init() function and want to return from the function, type:

ret ↵

Try the next command by repeatedly pressing:

<F10>

Continue this while observing that the assembly instruction CALL init is skipped over.

To single-step C statements while you are in mixed mode, type either:

cstep ↵

or

cnext ↵

Like their counterparts, step and next, the cstep and cnext commands can execute a fixed number of statements. For example:

cstep 10 ↵

will execute 10 C statements.

ASM Mode

If you are interested only in debugging an assembly language program, you can switch to assembly mode by typing:

asm ↵

Notice that the windows that display C data structures disappear when you are in assembly mode. This is a convenient way to clear up the screen if you want to observe CPU register values or display memory contents. Try single-stepping by repeatedly pressing:

<F8>

and observe the changing register values in the CPU window. Changed values are highlighted so you will notice when a change occurs.

You can go back to mixed mode by simply typing:

mix ↵

Notice that your DISP windows reappear.

Review of Modes

In summary, there are three modes of operation:

• Mixed mode (mix command) shows assembly and C (if C source exists).

• Assembly mode (asm command) shows assembly code only.

• C mode (c command) automatically switches between C and assembly displays, depending on what type of source code is executing.

Breakpoints and Benchmarking

Restart your program and execute to the first call to the init() function. Type:

restart ↵

mix ↵

go init ↵

To set a breakpoint, you can use the command ba xxxx, where xxxx is an absolute memory location or a valid label. This method requires that you know the address (or label). For example, type:

ba init ↵

This sets a breakpoint at the entry point to the function. Notice that the instruction is highlighted when a breakpoint is set.

To list breakpoints that are set, type:

bl ↵

To delete all breakpoints, use the breakpoint reset command. Type:

br ↵

Verify the process by listing again. Type:

bl ↵

As an alternative to the ba command, you can add a breakpoint by pointing the mouse at the line where the breakpoint is desired and pressing the left mouse button. The line that the breakpoint is set on should now be highlighted. Pressing the left mouse button again will remove the breakpoint.

To execute your program up to the breakpoint, type:

run ↵

The program should stop at the breakpoint. If the breakpoint is not reached, press <Esc> andverify that the breakpoint has been set (use the bl command or look at theFILE/DISASSEMBLY window to see a highlighted instruction).

To use a previously entered debugger command (for lazy typists), press:

<Tab> ↵

Notice that pressing <Tab> backs up to the previous command entered. Pressing ↵ causes thatcommand to be executed again. In fact, you can cycle back through all previous commands youhave entered by repeatedly pressing <Tab>. Pressing <Shift><Tab> takes you forwardthrough this command buffer.

Let’s assume you still have a breakpoint set at the for statement. To “benchmark” the execution time required to execute from one breakpoint to another, you need to set a second breakpoint. Go ahead and select another instruction for a breakpoint using either <F9>, a mouse click, or the ba command. To benchmark, type:

run ↵

runb ↵

? clk ↵

The run command executes to the first breakpoint. The runb command is the “run-with-benchmarking” command. The ? command tells the debugger to evaluate the following C expression and display the result. The clk debugger variable is valid only after a runb command and holds the number of clock cycles counted during that runb, that is, from the first breakpoint to the second.
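The clk value is a count of cycles, so turning it into elapsed time requires the cycle rate of the target. The Python sketch below shows that arithmetic only; the function name, the sample count of 1200 cycles, and the 40 MHz cycle rate are illustrative assumptions, not values from this lab.

```python
def cycles_to_microseconds(clk, cycle_rate_mhz):
    """Convert a benchmark cycle count to microseconds.

    clk            -- cycle count displayed by `? clk` after a runb
    cycle_rate_mhz -- cycles per microsecond (an assumed figure)
    """
    if cycle_rate_mhz <= 0:
        raise ValueError("cycle rate must be positive")
    return clk / cycle_rate_mhz


# Hypothetical reading: 1200 cycles at an assumed 40 MHz cycle rate.
print(cycles_to_microseconds(1200, 40), "microseconds")
```

Because the result scales inversely with the cycle rate, the same clk reading corresponds to twice the time at half the rate.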

Evaluating Expressions

To evaluate a C expression, you can use the ? command. This is one way to modify register values, since C expressions may have side effects such as assignment. Type:

? pc ↵

You should see the pc value displayed. To modify the current pc, type:

? pc = main ↵

To modify a register, type:

? ar0 = 0 ↵

To evaluate an expression without displaying the result in the COMMAND window, use the eval command instead of the ? command. Type:

eval pc = 0 ↵

eval pc = main ↵

CPU, MEMORY, and WATCH window registers can be modified by pointing the mouse to the desired register and pressing the left mouse button. When the register is selected, it will be highlighted and ready for input from the keyboard.

Point to the CPU window AR0 and press the left mouse button. Enter a new value of 5 and press ↵ when complete.

Displaying Files

You can display any file in the FILE window. Type:

file siminit.cmd ↵

You should see the debugger’s initialization command file displayed. At this point, you can go back to debugging and the previous C source file will automatically be displayed when you start executing instructions.

Within the debugger COMMAND window, you can perform DOS-like commands to examine and change the current directory. Use the command dir nnnn, where nnnn is the directory name, to display a directory listing. Type:

dir ↵

to display the current directory.

The command cd nnnn, where nnnn is the new directory name, changes the current directory.

To clear the COMMAND window, type:

cls ↵

Some other miscellaneous commands are:

• quit, which exits the debugger and returns you to DOS.

• restart, which sets the PC to the code entry point.

Drop Down Menus

To access the drop-down menus from the menu bar at the top of the screen, press <Alt><key>, where <key> is the highlighted menu letter (L, B, W, M, C, or D). Once a menu is displayed, you can execute a command either by typing the designated letter, or by using the arrow keys to move the selector bar to the desired command and pressing ↵. For example, press:

<Alt>L

then repeatedly press the right arrow key to look at the drop-down menus.

The drop-down menus can also be selected by pointing and pressing the left mouse button. For example, select the mode menu with the mouse.

Changing the Display Sizes

If you have a display capable of greater than 80 x 25 character resolution, you can get more information on the screen using the debugger -b[bbbb] option when you invoke the debugger. Let’s try it. Exit the debugger by typing:

quit ↵

From the DOS prompt, enter:

sim54xx lab1 -bb

and you should get a display that shows more detail, but may also cause more eye strain. A larger monitor will allow you to take full advantage of the source debugger’s high resolution modes.

The -bb switch creates a 50-line display. Another switch, -b, offers an intermediate-sized 43-line display. Your preferred display size may be made the default by saving the screen configuration as init.clr with the ssave command described earlier. Then the need to explicitly use the -b switch is eliminated.

Batch Operation of Debugger

You can execute debugger commands from a batch file. This can be useful if there is a certain sequence of commands you want to enter every time you start a debug session for a given application. The filename should have a .log extension. To execute a .log file while in the debugger, use the command take <filename>.log. For example, try the batch command file:

take lab1.log ↵

Congratulations, you have completed the walkthrough. To exit the debugger, press:

<Esc>

Type:

quit ↵

Simulator Quick Reference

Window Management

Selecting Window
  F6                     rotates to next window
  WIN <name>             selects <name> window
  Click window frame     selects window
  F4                     closes selected window

Moving Inside Window
  Up Arrow / Down Arrow
  Page Up / Page Down
  Click on window frame arrows
  For DISASSEM window: type ADDR <value>
  For MEMORY window: type MEM <value>

Moving Window
  Click on top of frame; drag to new location
  Type MOVE and use arrows or type coordinates

Sizing Window
  Click on bottom right corner; drag to new shape
  Type SIZE and use arrows or type coordinates
  ZOOM: click on top left corner
  UNZOOM: click again on top left corner

Screen Configuration
  SCONFIG <name>         load configuration <name>
  SSAVE <name>           save configuration <name>

Modes
  ASM                    display ASM info, or <Alt> D,A
  C                      display C info, or <Alt> D,C
  MIX                    display both ASM and C, or <Alt> D,M

Running Code

Reset
  Type RESET             forces PC to zero
  Type RESTART           return to "entry point"

Stepping
  F8 or type STEP        for one step
  F10 or type NEXT       condense subroutines
  Type STEP <n>          for <n> steps
  Type NEXT <n>          for <n> nexts

Running
  RUN                    run until <Esc> or breakpoint
  RUNB                   run with benchmark
  GO <label>             run to <label>

Watches and Breakpoints

  Operation   Watch   Breakpoint
  ADD         WA      BA
  RESET       WR      BR
  LIST        WL      BL
  DELETE      WD #    BD #
  (or hot keys or mouse clicks)

Other Actions
  ? <label>              display value of <label>
  ? <label> = <n>        load <label> with <n>
  file <name>            load file <name> to file window
  TAB                    scroll to prior commands
  SHIFT TAB              scroll to subsequent commands
  F9                     alternate form of mouse click
  TAKE <name>            simulator ’batch’ file
  LOAD <name>            download file <name>

Entry/Exit
  SIM2xx <file>          start simulator with <file>.out
  SIM2xx -bb             high resolution mode
  QUIT                   exit simulator
  SYSTEM                 go to DOS shell

’C54x Review - CALU

• CALU supports:
  - General-purpose operations:
      MAC
      ALU
  - Special functions:
      CSSU (Viterbi)
      EXP (Norm)
      FIRS: MAC + ALU
  - 16- or 32-bit operations:
      C16 mode
      ’Double’ operations

’C54x Review - System

• Four buses allow 1 fetch, 2 reads, and 1 write each cycle.
• Built from and for cDSP:
  - Fast growing family
  - Easy to modify for custom use.
• Attributes
  - Static design
  - Low power
  - Any clock below maximum
  - Low $/MIP
  - Fast/dense instructions
  - Small size for functionality
  - LC version for 3V operation

Assembly Language Tools

Learning Objectives

• Describe steps to create executable output files
• Create an assembly file containing:
  - Code
  - Constants (initialized data)
  - Variables
• Create a linker command file which:
  - Identifies input and output files
  - Describes a system’s available memory
  - Indicates where code and data shall be located
• Develop multi-file systems

Module 2

Software Development Tools

[Figure: tool flow. Text Editor → .asm → ASM500 → .obj → LNK500 → .out (-o) → Debug or HEX500. ASM500 also produces a .lst file (-L); LNK500 reads a .cmd file and produces a .map file (-m).]

ASM500 -LS TEST
LNK500 TEST.CMD

Software Debug Tools

[Figure: paths for the .out file.]

• SIM5xx: software only
• EVM500: contains DSP; ISA card
• XDS510: ISA card; no DSP; PC <-> target board
• HEX500: ROM programmer

Lab 2a: COFF Tools

1. Assemble LAB2A.ASM.
   Note error message - inspect .LST file.
2. Edit LAB2A.ASM.
   Replace label 'strt' with 'start' - update and exit file.
3. Reassemble LAB2A.ASM.
   Verify error-free assembly - reinspect .LST file.
4. Link using LAB2A.CMD.
   Verify the result in LAB2A.MAP.
5. Simulate LAB2A.OUT.
   Step through the code to verify performance.
6. Inspect batch files: A.BAT, L.BAT, S.BAT, ALS.BAT
   Consider their use to save time in later labs.
7. Add a NOP before the loop to separate the .text label from start.
   Reassemble, link and simulate. Note any change from before.

Assembly Files

• Create an assembly file containing:
  - Code
  - Constants (initialized data)
  - Variables

Assembly Conventions

label:    mnemonic    operand,operand    ;comment

  label:     colon optional
  mnemonic   instruction or directive
  fields     separated by tabs or spaces

• Any ASCII text is O.K.
• Use .asm extension
• Instructions and directives cannot be in first column
• Comments O.K. in any column after semicolon
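The column rules above can be sketched as a small line splitter. This is a simplified Python illustration, not how ASM500 parses source: the function name is hypothetical, and it assumes no semicolons or commas appear inside quoted strings.

```python
def parse_asm_line(line):
    """Split one source line into (label, mnemonic, operands, comment)."""
    # Everything after the first semicolon is a comment.
    code, sep, comment = line.partition(";")
    comment = comment.strip() if sep else None

    label = None
    rest = code
    if code[:1].strip():
        # Non-blank text in the first column is a label; its colon is optional.
        fields = code.split(None, 1)
        label = fields[0].rstrip(":")
        rest = fields[1] if len(fields) > 1 else ""

    # The next field is the instruction or directive, then the operand list.
    parts = rest.split(None, 1)
    mnemonic = parts[0] if parts else None
    operands = [op.strip() for op in parts[1].split(",")] if len(parts) > 1 else []
    return label, mnemonic, operands, comment


print(parse_asm_line("start:  LD   x,A    ;get x"))
print(parse_asm_line("        B    start"))
```

Note that the second line has no label precisely because its first column is blank, which is the rule that keeps instructions from being mistaken for labels.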

Assembly Files

• Mnemonics
  - Lines of 320 code
  - Generally written in upper case
  - Become components of program memory
• Directives
  - Begin with a period (.) and are lower case
  - Can create constants and variables
  - May occupy no memory space when used to control ASM and LNK process

COFF Data Types

Type                Examples
Decimal             1234 or +1234 or -1234 (Default type)
Hexadecimal         0A40h or 0A40H or 0xA40
Binary              1110001b or 11111001B
Octal               226q or 572Q
Floating-point      1.623e-23 (sign and decimal point optional)
Character           'D'
Character strings   "this is a string"
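To make the integer notations in the table concrete, the following Python sketch converts them to values. It is an illustration of the notation only (the helper name is hypothetical) and covers the decimal, hexadecimal, binary, and octal rows, not floating-point or character constants.

```python
def parse_constant(text):
    """Return the integer value of a constant written in the table's notation."""
    s = text.strip()
    sign = -1 if s.startswith("-") else 1
    if s.startswith(("+", "-")):
        s = s[1:]
    lower = s.lower()

    # C-style hexadecimal prefix, e.g. 0xA40.
    if lower.startswith("0x"):
        return sign * int(lower[2:], 16)

    # Radix suffixes: h = hexadecimal, b = binary, q = octal.
    radix_by_suffix = {"h": 16, "b": 2, "q": 8}
    suffix = lower[-1:]
    if suffix in radix_by_suffix:
        return sign * int(lower[:-1], radix_by_suffix[suffix])

    # No prefix or suffix: decimal is the default type.
    return sign * int(lower, 10)


for text in ["1234", "0A40h", "0xA40", "1110001b", "226q"]:
    print(text, "=", parse_constant(text))
```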

Coding Example: z = x + y

Code:        get x, add y, store z, loop
Constants:   x = 2, y = 7
Variables:   z

        .text
start   LD      x,A
        ADD     y,A
        STL     A,z
        B       start

        .data
x       .int    2
y       .int    7

        .bss    z,1

The .bss Directive

• Only directive with assembly label defined in the operand field
• Use separate .bss statements for each named variable
• Remember .bss by thinking:
  - Block - reserves a block of memory
  - Symbol - beginning at address symbol
  - Size - of the specified size
• Example: Create a 5-word array 'x'

        .bss    x, 5
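One way to picture what a series of .bss statements asks for is a running address counter. The Python sketch below assumes the variables are placed back to back in source order starting at a base address; the helper name and the base of 60h are assumptions for illustration, and the real placement is decided at link time.

```python
def allocate_bss(requests, base):
    """Assign consecutive word addresses to (symbol, size) requests.

    Returns the symbol-to-address map and the next free address.
    """
    addresses = {}
    next_free = base
    for symbol, size in requests:
        if size <= 0:
            raise ValueError(f"size of {symbol} must be positive")
        addresses[symbol] = next_free   # symbol names the start of the block
        next_free += size               # block reserves `size` words
    return addresses, next_free


# Equivalent of ".bss x,5" followed by ".bss y,1", at an assumed base of 60h.
table, next_free = allocate_bss([("x", 5), ("y", 1)], 0x60)
for symbol, address in table.items():
    print(f"{symbol} at {address:04X}h")
print(f"next free word: {next_free:04X}h")
```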

Basic Assembler Directives

Assembler Directive   Example               Definition
.text                 .text                 Code to follow
.data                 .data                 Constants to follow
.bss                  .bss x,10             Allocate space for variables
.int, .word           TBL .int 53h, 5Ah     Create 16-bit integer constant(s)

Exercise 2b: ASM Files and Sections

; a = 0,1,2,3,4
        _______
_____   _______   ________

; x = input array of length 5
        _______   _______, _______

; y = result array of length 1
        _______   _______, _______

[Figure: memory picture. a holds 0, 1, 2, 3, 4; x and y follow.]

Lab 2b: Assembly Files

[Figure: memory layout. table holds the nine values 1, 2, 3, 4, 8, 6, 4, 2, 0; x, a, and y are arrays.]

Lab 2b: Procedure

1. Copy LAB2A.ASM to LAB2B.ASM. In LAB2B:

2. Define three arrays in RAM (x, a, y).

3. Define an initialized data table that contains the nine values above.

4. Write code that begins with the label start, contains four NOP instructions, and ends with a branch (B) back to start.

5. Assemble the file and inspect the list (.LST) file.

What is the opcode for NOP? ______________

What are the addresses for the .text and .data sections?

.text ____________

.data ____________

Why? ______________________________________________

Linking

• Create a linker command file which:
  - Identifies input and output files
  - Describes a system’s available memory
  - Indicates where code and data shall be located

Linking

[Figure: LNK500 takes .obj files and link.cmd as inputs and produces .out and .map files.]

link.cmd specifies:
• Files: input and output
• Memory description
• How to place s/w into h/w

Example System

[Figure: ’C54x example system memory map.]

Program Memory
  PROM    8000 to FFFF    code

Data Memory
  SRAM    4000 to 6000    var
  EPROM   8000 to A000    const

Linker Command File

example1.obj
-o example1.out
-m example1.map

MEMORY
{
  Page 0:   /* Program */
    PROM:   org = 8000h , len = 8000h
  Page 1:   /* Data */
    SRAM:   org = 4000h , len = 2000h
    EPROM:  org = 8000h , len = 2000h
}

SECTIONS
{
  .text:> PROM  PAGE 0
  .bss: > SRAM  PAGE 1
  .data:> EPROM PAGE 1
}
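Each MEMORY entry describes the range from org to org + len - 1. As a hedged illustration, the Python sketch below computes the last address of each range in this example system and checks for overlaps within a page; the helper names are hypothetical.

```python
def last_address(org, length):
    """Last word address covered by a range that starts at org."""
    return org + length - 1


def overlaps(range_a, range_b):
    """True if two (org, len) ranges share at least one address."""
    a_org, a_len = range_a
    b_org, b_len = range_b
    return a_org <= last_address(b_org, b_len) and b_org <= last_address(a_org, a_len)


# Ranges from the example system, keyed by page.
pages = {
    0: {"PROM": (0x8000, 0x8000)},
    1: {"SRAM": (0x4000, 0x2000), "EPROM": (0x8000, 0x2000)},
}

for page, ranges in pages.items():
    for name, (org, length) in ranges.items():
        print(f"page {page} {name}: {org:04X}h to {last_address(org, length):04X}h")
    names = list(ranges)
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            if overlaps(ranges[first], ranges[second]):
                print(f"page {page}: {first} overlaps {second}")
```

PROM and EPROM cover some of the same addresses, but they sit on different pages (program and data), so the check compares ranges only within one page.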

Memory Descriptor Suggestions

1. Describe each memory resource on the processor (internal RAM and/or ROM)
2. Describe each external memory chip in your system
3. Combine contiguous memory segments, if desired
4. Split any memory segment into multiple segments, if desired
5. Name memory segments with useful names; e.g.:
   - Types of memory chips (EPROM, RAM, EEPROM)
   - Usage (vectors, code, variables)
   - Chip layout names (U1, E2)

Exercise 2c: Example System

[Figure: ’C541 (µP mode) memory map.]

Program
  16K SRAM starting at 0
  32K EPROM starting at 8000

Data
  SPRAM, DARAM
  32K DEPROM starting at 8000

Exercise 2c: Link Command File

example1.obj
-o example1.out
-m example1.map

MEMORY
{
  PAGE ___: /* Program Memory */
    ______: org = ______, len = ______
    ______: org = ______, len = ______

  ________:
    ______: org = ______, len = ______
    ______: org = ______, len = ______
    ______: org = ______, len = ______
}

SECTIONS
{
  .text: > EPROM  PAGE 0
  .bss:  > SPRAM  PAGE 1
  .data: > DEPROM PAGE 1
}
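The blanks in the MEMORY block take lengths in hex words, while the system diagram gives sizes in K. A small Python sketch of the conversion follows; the helper names are hypothetical, and the leading zero on values that start with a letter simply mirrors the way constants such as 0E000h are written elsewhere in this guide.

```python
def k_to_words(size_k):
    """Number of words in a memory of size_k K (1K = 1024 words)."""
    return size_k * 1024


def as_hex_constant(value):
    """Format a value as a four-digit hex constant with an h suffix."""
    digits = format(value, "04X")
    if digits[0] in "ABCDEF":
        digits = "0" + digits   # keep the constant from starting with a letter
    return digits + "h"


for size_k in (8, 16, 32):
    print(f"{size_k}K = {as_hex_constant(k_to_words(size_k))}")
```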

Lab 2c: Linking

[Figure: ’C541 (µP mode) memory map. Program: 8K EPROM ending at FFFF. Data: SPRAM, DARAM, and 32K DEPROM starting at 8000.]

Procedure
1. Copy the linker command file LAB2A.CMD to LAB2C.CMD
2. Specify LAB2B as the input; and request output and map files
3. Define a system memory map to include:
   TMS320C541 (µP mode) with all internal RAM mapped as Data
   8K Program EPROM ending at 64K
   32K Data EPROM beginning at 8000h
4. Place code sections as follows:
   Code into EPROM
   Table into DEPROM
   Variable arrays in SPRAM
5. Link and inspect the .MAP file. What addresses are assigned to:
   .text _____________
   .data _____________
   .bss  _____________
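Step 3 gives the program EPROM by its size and end address rather than its origin. The arithmetic for recovering the origin can be sketched in Python as follows; the function name is hypothetical.

```python
def origin_for_region_ending_at(size_k, end_k):
    """Origin of a size_k K region whose last word is just below end_k K."""
    end_exclusive = end_k * 1024
    length = size_k * 1024
    return end_exclusive - length, length


# 8K program EPROM ending at 64K.
org, length = origin_for_region_ending_at(8, 64)
print(f"org = {org:04X}h, len = {length:04X}h, last address = {org + length - 1:04X}h")
```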

Multiple Sections

[Figure: ’C54x system with on-chip ROM and RAM. Program Memory: EPROM (code) and EPROM (vectors). Data Memory: SRAM (var) and DEPROM (const).]

How do we put a particular code section into specific memory?

Named Sections

Directive   Example                    Description
.sect       .sect "vectors"            Creates initialized sections for code or constants
.usect      label .usect "name", 23    Creates uninitialized sections for variables

Adding Reset

sum.asm

        .def    start
        .text
start   LD      x,A
        ADD     y,A
        STL     A,z
        B       start

        .data
x       .int    2
y       .int    7

        .bss    z,1

vectors.asm

        .ref    start
        .sect   ".vectors"
        B       start

Linker CMD File with Vectors

sum.obj
vectors.obj
-o sum.out
-m sum.map

MEMORY
{
  Page 0: /* Program Memory */
    EPROM:  org = 0E000h , len = 1F80h
    VECS:   org = 0FF80h , len = 0080h
  Page 1: /* Data Memory */
    SPRAM:  org = 0060h , len = 20h
    DARAM:  org = 0080h , len = 1380h
    DEPROM: org = 8000h , len = 8000h
}

SECTIONS
{
  .text:    > EPROM  PAGE 0
  .data:    > DEPROM PAGE 1
  .bss:     > SPRAM  PAGE 1
  .vectors: > VECS   PAGE 0
}

The additions for the reset vector are vectors.obj, the VECS memory range, and the .vectors section. The EPROM length changes from 2000h to 1F80h to make room for VECS.
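The EPROM length shown as 1F80h follows from reserving the top of the region for the vector table. A Python sketch of that arithmetic, with a hypothetical helper name, is:

```python
def split_for_vectors(org, length, vec_org):
    """Split one (org, len) region so the part from vec_org upward is separate.

    Returns ((org, new_len), (vec_org, vec_len)).
    """
    end_exclusive = org + length
    if not org < vec_org < end_exclusive:
        raise ValueError("vector origin must fall inside the region")
    return (org, vec_org - org), (vec_org, end_exclusive - vec_org)


# 8K EPROM at 0E000h with the vector table starting at 0FF80h.
eprom, vecs = split_for_vectors(0xE000, 0x2000, 0xFF80)
print(f"EPROM: org = {eprom[0]:04X}h, len = {eprom[1]:04X}h")
print(f"VECS:  org = {vecs[0]:04X}h, len = {vecs[1]:04X}h")
```

Splitting the range this way keeps .text from being placed on top of the vector table, since the two memory ranges no longer overlap.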

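Lab 2d: Multi-file Linking

[Figure: program (P) and data (D) memory. Program: .text holds start, followed by NOP, NOP, NOP, B start; .vectors at FF80 holds B start. Data: .bss holds x, a, y; .data holds table with the values 1, 2, 3, 4, 8, 6, 4, 2.]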

Lab 2d: Multi-file Systems

Procedure
1. Create VECTORS.ASM
2. Copy LAB2B.ASM to LAB2D.ASM
   Modify LAB2D to make start accessible
3. Assemble LAB2D and VECTORS
4. Copy LAB2C.CMD to LAB2D.CMD
   Modify LAB2D.CMD to specify the desired input and output files and the routing of the RESET vector
5. Link the system and inspect the .MAP file
6. Step through the code on the simulator to verify performance.

COFF Directive Summary

Type                     Directive   Purpose
Initialized Sections     .text       Program code
                         .data       Data constants
                         .sect       User-named
Uninitialized Sections   .bss        Data variables
                         .usect      User-named
Constants                .int        Create integer
                         .word       Create integer
                         .long       Create aligned 32-bit constant
Labels                   .def        Define global variable
                         .ref        Reference global variable
                         .global     Global declaration (.ref + .def)
Misc                     .set        Assign a value, similar to .equ or #define
                         .end        Halt assembler

Exercise 2d: Multi-file Issues

Procedure
1. Fill in blanks in EX2C.CMD to support reset vector.
2. Per EX2C.CMD, fill in the post-link addresses in the left-side blanks in the ASM files.
   - Branch is a 2-word instruction.
   - Other instructions are single-word.
   - Put an 'X' in any blank that has no address.
   - Linkage is performed in the order you specify, with sections as the major and files as the minor sort criteria
3. Resolve symbolic references in the right-side blanks.
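The address bookkeeping in step 2 can be sketched as a running counter. The Python example below applies the exercise's simplification (a branch takes two words, every other instruction one) to the LD/ADD/STL/B sequence from the earlier z = x + y example; the function name and the starting address of 0E000h are assumptions for illustration.

```python
# Word counts under the exercise's simplification; anything not listed is 1 word.
WORDS_PER_INSTRUCTION = {"B": 2}


def assign_addresses(mnemonics, start):
    """Return [(address, mnemonic), ...] and the next free address."""
    listing = []
    address = start
    for mnemonic in mnemonics:
        listing.append((address, mnemonic))
        address += WORDS_PER_INSTRUCTION.get(mnemonic.upper(), 1)
    return listing, address


listing, next_free = assign_addresses(["LD", "ADD", "STL", "B"], 0xE000)
for address, mnemonic in listing:
    print(f"{address:04X}  {mnemonic}")
print(f"next free: {next_free:04X}")
```

Directives such as .ref and .def generate no words, which is why the exercise asks for an 'X' on those lines instead of an address.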

Exercise 2d: EX2C.CMD File

mult.obj
sum.obj
vectors.obj
-o system.out
-m system.map

MEMORY
{
  Page 0:
    SRAM:   org = 0000h  , len = 4000h
    EPROM:  org = 0E000h , len = _____
    VECS:   org = _____  , len = _____
  Page 1:
    SPRAM:  org = 0060h , len = 0020h
    DARAM:  org = 0100h , len = 0400h
    DEPROM: org = 8000h , len = 8000h
}

SECTIONS
{
  .text: > EPROM  PAGE 0
  .data: > DEPROM PAGE 1
  .bss:  > SPRAM  PAGE 1
  _____: > _______________
}

Exercise 2d: mult.asm

[???? = digits illegible in the source listing]

mult.asm

____            .ref    z, y
____            .def    c, mult
____  K         .set    ????
____  mult      LD      z, A        ____
____            MPY     y, A        ____
____            ADD     c, A        ____
____            STH     A, z        ____
____  done      B       done        ____
                .data
____  c         .int    ???? K      ____

Note: order of link is: SECTION major, FILE minor

Example:                        yields:

file1.obj                       file1.text
file2.obj                       file2.text
SECTIONS{                       file1.data
  .text : > ROM                 file2.data
  .data: > ROM }
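The section-major, file-minor ordering can be sketched as two nested loops. In the Python illustration below, the section sizes and the ROM origin are made-up values, and the function name is hypothetical; only the ordering comes from the note above.

```python
def place_sections(files, section_order, origin):
    """Return (file, section, address) tuples in link order.

    files         -- list of (file_name, {section_name: size_in_words}),
                     in the order the files are listed for the linker
    section_order -- section names in the order they appear in SECTIONS
    origin        -- first address of the memory range (assumed value)
    """
    placements = []
    address = origin
    for section in section_order:           # major sort key: section
        for file_name, sizes in files:      # minor sort key: file
            if section in sizes:
                placements.append((file_name, section, address))
                address += sizes[section]
    return placements


# Hypothetical section sizes, in words.
files = [
    ("file1.obj", {".text": 6, ".data": 2}),
    ("file2.obj", {".text": 5, ".data": 3}),
]
for file_name, section, address in place_sections(files, [".text", ".data"], 0x8000):
    print(f"{address:04X}  {file_name} {section}")
```

Swapping the two loops would give file-major order (all of file1, then all of file2), which is not what the note describes.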

Exercise 2d: sum.asm and vectors.asm

[???? = digits illegible in the source listing]

sum.asm

____            .ref    c, mult
____            .def    start, z, y
____  start     LD      x, A        ____
____            ADD     c, A        ____
____            STH     A, z        ____
____            B       mult        ____
____            .data
____  x         .int    ????
____  y         .int    ????
____            .bss    z, ????

vectors.asm

____            .ref    start
____            .sect   ".vectors"
____            B       start       ____

LAB2A.ASM: Solution

; SOLUTION FILE FOR LAB2A.ASM

        NOP
start:  NOP
        NOP
        B       start

Exercise 2b: Solution

; a = 0,1,2,3,4
; x = input array of length 5
; y = result

        .data
a       .int    0,1,2,3,4
        .bss    x,5
        .bss    y,1

[Figure: memory picture. a holds 0, 1, 2, 3, 4; x and y follow.]

LAB2B.ASM: Solution

        .bss    x,4
        .bss    a,4
        .bss    y,1

        .data
        .word   1,2,3,4
        .word   8,6,4,2,0

        .text
        NOP
start:  NOP
        NOP
        NOP
        NOP
        B       start

Exercise 2c: Link Command File

example1.obj
-o example1.out
-m example1.map

MEMORY
{
  PAGE 0: /* Program Memory */
    SRAM  : org = 0000h , len = 4000h
    EPROM : org = 8000h , len = 8000h

  PAGE 1: /* Data Memory */
    SPRAM : org = 0060h , len = 0020h
    DARAM : org = 0080h , len = 1380h
    DEPROM: org = 8000h , len = 8000h
}

SECTIONS
{
  .text: > EPROM  PAGE 0
  .bss:  > SPRAM  PAGE 1
  .data: > DEPROM PAGE 1
}

LAB2C.CMD: Solution

        .bss    data,4
        .

lab2b.obj
-o lab2c.out
-m lab2c.map

MEMORY {
  PAGE 0: EPROM  : org = 0E000h len = 02000h
  PAGE 1: SPRAM  : org = 00060h len = 00020h
          DARAM  : org = 00080h len = 01380h
          DEPROM : org = 08000h len = 08000h
}

SECTIONS{
  .text : > EPROM  PAGE 0
  .data : > DEPROM PAGE 1
  .bss  : > SPRAM  PAGE 1
}

2 - 42

Exercise 2d: CMD File Solution

mult.obj
sum.obj
vectors.obj
-o system.out
-m system.map

MEMORY {
  PAGE 0: SRAM:   org = 0000h  , len = 4000h
          EPROM:  org = 0E000h , len = 1F80h
          VECS:   org = 0FF80h , len = 0080h
  PAGE 1: SPRAM:  org = 0060h  , len = 0020h
          DARAM:  org = 0100h  , len = 0400h
          DEPROM: org = 8000h  , len = 8000h
}

SECTIONS {
  .text:    > EPROM  PAGE 0
  .data:    > DEPROM PAGE 1
  .bss:     > SPRAM  PAGE 1
  .vectors: > VECS   PAGE 0
}
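The origin and length values in a MEMORY directive can be checked by hand before linking: the last address of a region is org + len - 1, and it must fall below the origin of the next region on the same page. The following is a minimal Python sketch of that arithmetic, not a linker feature. The names `last_address` and `overlaps` and the tuple layout are assumptions made for illustration; the numbers are taken from the Exercise 2d command file.

```python
# Hypothetical model: each MEMORY entry is (name, origin, length), in words.
PAGE0 = [("SRAM", 0x0000, 0x4000), ("EPROM", 0xE000, 0x1F80), ("VECS", 0xFF80, 0x0080)]
PAGE1 = [("SPRAM", 0x0060, 0x0020), ("DARAM", 0x0100, 0x0400), ("DEPROM", 0x8000, 0x8000)]


def last_address(origin, length):
    """Last word address covered by a region."""
    return origin + length - 1


def overlaps(regions):
    """Return name pairs of neighbouring regions (sorted by origin) that intersect."""
    clashes = []
    ordered = sorted(regions, key=lambda r: r[1])
    for (n1, o1, l1), (n2, o2, _) in zip(ordered, ordered[1:]):
        if last_address(o1, l1) >= o2:
            clashes.append((n1, n2))
    return clashes


for page in (PAGE0, PAGE1):
    for name, origin, length in page:
        print(f"{name:7s} {origin:04X}h-{last_address(origin, length):04X}h")
print(overlaps(PAGE0), overlaps(PAGE1))
```

In this model EPROM ends at FF7Fh, one word below the VECS origin of FF80h, so the two program-memory regions are adjacent without overlapping.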

Exercise 2d: ASM Files Solution

(The listing addresses, opcodes, and numeric operands on this slide are illegible in the source and are omitted.)

mult.asm

        .ref    z, y
        .def    c, mult
K       .set    [value illegible]
mult    LD      z, A
        MPY     y, A
        ADD     c, A
        STH     A, z
done    B       done
        .data
c       .int    [values illegible]

sum.asm

        .ref    c, mult
        .def    start, z, y
start   LD      x, A
        ADD     c, A
        STH     A, z
        B       mult
        .data
x       .int    [value illegible]
y       .int    [value illegible]
        .bss    z, [size illegible]

vectors.asm

        .ref    start
        .sect   ".vectors"
        B       start

LAB2D & VECTORS: Solution

.def start

.bss x,4

.bss a,4

.bss y,1

.data

.word 1,2,3,4

.word 8,6,4,2,0

.text

NOP

start: NOP

NOP

NOP

NOP

B start

.ref start

.sect ".vectors"

b start

LAB2D.CMD: Solution

lab2d.obj

vectors.obj

-o lab2d.out

-m lab2d.map

MEMORY {

PAGE 0: EPROM: org = 0E000h len = 01F80h

VECS: org = 0FF80h len = 00080h

PAGE 1: SPRAM: org = 00060h len = 00020h

DARAM: org = 00080h len = 01380h

DEPROM: org = 08000h len = 08000h }

SECTIONS{

.vectors: > VECS PAGE 0

.text : > EPROM PAGE 0

.data : > DEPROM PAGE 1

.bss : > SPRAM PAGE 1 }

Addressing Modes

Learning Objectives

• List the four basic addressing modes and identify the purpose of each.

• Express constants via immediate addressing.

• Access tables and arrays via indirect addressing - a pointer-like process.

• Select the optimal mode when using indirect addressing.

• Perform general purpose access to Data Memory via direct addressing (two methods).

• Define and implement methods for controlling page boundary crossings.

• Access stack variables and MMRs via special versions of direct addressing.

Addressing Modes

Type            Symbol              Purpose, Benefit
Immediate       #                   Using constants/initialization
  Long                              16-bit values
  Short                             Single cycle
Indirect        *                   Support for pointers - access arrays, lists, tables
  w. Inc/Dec                        0 cycle auto increment/decrement by +/- 1
  w. Index                          0 cycle auto increment by "n"
Direct          <default> - or - @  General-purpose access to data
  Absolute                          Access any location in data memory - 'flat memory'
  Paged                             Single-cycle access within boundary
  SP-relative                       Optimal for stack-based values (C)
  MMR                               Optimal for DP0 values (MMR and SPRAM)
Register                            Operate between Acc A and B

Immediate Addressing

• Long Immediate
  - Allows use of constant
  - Up to 16-bit operand
  - 2 words, 2 cycles
  - Optimal for initialization

  Example:
        LD   #1234h,A       ; loads 1234h to A

• Short Immediate
  - Available in limited cases
  - 9-bit or smaller values
  - 1 word, 1 cycle
  - Init. Acc (8), DP (9), ASM (5), etc.

  Example:
        LD   #12h,A         ; loads 12h to A
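The choice between the short and long forms comes down to whether the constant fits the field width quoted for the target: Acc (8), DP (9), ASM (5). A minimal Python sketch of that test follows. The name `immediate_form` and the signedness of each field (Acc and DP treated as unsigned, ASM as a signed 5-bit field) are assumptions of this sketch rather than statements from the slide.

```python
# Field widths quoted on the slide: Acc (8), DP (9), ASM (5).
# Assumption: Acc and DP fields are unsigned, ASM is a signed 5-bit field.
SHORT_FIELDS = {"ACC": (0, 255), "DP": (0, 511), "ASM": (-16, 15)}


def immediate_form(target, value):
    """Return 'short' (1 word, 1 cycle) or 'long' (2 words, 2 cycles)."""
    low, high = SHORT_FIELDS[target]
    if low <= value <= high:
        return "short"
    # Only the accumulator load is modelled with a 16-bit long form here.
    if target == "ACC" and -32768 <= value <= 0xFFFF:
        return "long"
    raise ValueError(f"{value} does not fit an immediate field for {target}")


print(immediate_form("ACC", 0x12))     # the LD #12h,A example
print(immediate_form("ACC", 0x1234))   # the LD #1234h,A example
```

Under these assumptions the two slide examples come out as short and long respectively.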

Indirect Addressing

• Hardware support of pointer concept
• Eight ARs (Address or Auxiliary Registers) available
• AR0 also used as (optional) index
• Allows fast, efficient access to arrays, lists, tables, etc.

Example

$y = \sum_{n=1}^{100} x_n$

        .bss  x,100
        .text
        STM   #x,AR1
        LD    *AR1+,A
        ADD   *AR1+,A
        ADD   *AR1+,A
        ...
        STL   A,y

[Diagram: data memory holding x1, x2, x3, ..., x100 followed by y; AR1 initially points to x]

Indexed Addressing

• Add step size option to auto increment.
• AR0 holds step size.
• Mode selected by using *ARn+0 or *ARn-0.
• Pre-mod fixed index w. extra cycle: *+ARn(K)

Example

$y = \sum_{n=1}^{100} x_{2n}$

        .bss  x,200
        .text
        STM   #2,AR0
        STM   #x,AR1
        ADD   *AR1+0,A
        ADD   *AR1+0,A
        ...
        STL   A,*(y)

[Diagram: data memory holding x2, x4, x6, ..., x200 followed by y; AR1 initially points to x]
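The effect of the indexed post-increment can be followed in a small Python model: each access reads through the pointer and then adds AR0 to it. This is only a sketch of the addressing arithmetic. The function name `indexed_sum` and the stand-in data values are assumptions, and the pointer starts at the first element of the array, as the STM #x,AR1 line does.

```python
def indexed_sum(memory, base, step, count):
    """Model `ADD *AR1+0,A` executed `count` times with AR0 = step."""
    ar0, ar1, acc = step, base, 0
    for _ in range(count):
        acc += memory[ar1]   # ADD *AR1+0,A : read through the pointer...
        ar1 += ar0           # ...then post-increment AR1 by AR0
    return acc, ar1


x = list(range(1, 201))      # stand-in data: x[0]..x[199] hold 1..200
total, ar1 = indexed_sum(x, base=0, step=2, count=100)
print(total, ar1)
```

With a step of 2, every second element is visited and the pointer ends 200 words past its starting point.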

Indirect Addressing Options

No Mod        *ARn          no modification to ARn

Inc/Dec       *ARn+         post increment by 1
              *ARn-         post decrement by 1

Index         *ARn+0        post increment by AR0
              *ARn-0        post decrement by AR0

Circular      *ARn+%        post increment by 1 - circular
              *ARn-%        post decrement by 1 - circular
              *ARn+0%       post increment by AR0 - circular
              *ARn-0%       post decrement by AR0 - circular

Bit-Reverse   *ARn+0B       post increment by n - bit rev (for FFT)
              *ARn-0B       post decrement by n - bit rev (for FFT)

Pre-mod       *ARn(lk)      use *(ARn+LK), ARn unchanged
              *+ARn(lk)     use *(ARn+LK), ARn changed
              *+ARn(lk)%    use *(ARn+LK), ARn changed - circular
              *+ARn         pre-increment by 1, during write only

Absolute      *(lk)         absolute direct
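The circular and bit-reversed modes are easier to picture with the index arithmetic written out. The Python sketch below models only the index sequence, not the hardware (it ignores the buffer alignment rules and the BK register mechanics). The function names are hypothetical, and the bit-reversed add is written as the usual reverse-carry addition on a `width`-bit index.

```python
def reverse_bits(value, width):
    """Mirror the lowest `width` bits of value."""
    result = 0
    for _ in range(width):
        result = (result << 1) | (value & 1)
        value >>= 1
    return result


def bit_reversed_add(index, step, width):
    """Add with carries propagating from MSB toward LSB (reverse carry)."""
    total = reverse_bits(index, width) + reverse_bits(step, width)
    return reverse_bits(total & ((1 << width) - 1), width)


def circular_add(index, step, size):
    """Wrap the index inside a buffer of `size` words."""
    return (index + step) % size


order, idx = [], 0
for _ in range(8):
    order.append(idx)
    idx = bit_reversed_add(idx, 4, 3)   # 8-point FFT: step = 8 / 2
print(order)   # [0, 4, 2, 6, 1, 5, 3, 7]
```

For an 8-point FFT the step is half the FFT size, which yields the familiar bit-reversed visiting order.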

Indirect Addressing Caveats

• Load pointers before using

• Pointer (MMR) latencies:
  - no latency    STM, MVDK
  - 1 cycle       MVDM, MVKD, MVDD
  - 2 cycles      STLM, ST, etc.

• ARs are read/modified in access phase, so during debug, will appear to change early.

• CMPT must = 0 (bit 5, ST1)
  - is 0 on reset
  - is forced to 0 with RSBX CMPT
  - CMPT = 1 allows old 5x-styled NARP operation for ARs.

Absolute Direct Addressing

• Actually a form of indirect addressing.
• Allows access to any data memory operand.
• Requires extra word of code and extra cycle(s).

Example

Data Memory
   Addr    Data
   . . .
x: 01FF    1000
y: 0200    0500
   . . .

        .data
x:      .word 1000h
y:      .word 0500h
        .text
                            Acc A
        LD    *(x),A        0000001000
        ADD   *(y),A        0000001500

Paged Direct Addressing

• Allows single-word/single-cycle operation
• Seven-bit address field allows access to 128 words
• Pages are selected by DP field in ST0.

Data Memory
   Addr    Data
   0180    0001
   . . .
x: 01FF    1000
y: 0200    0500
   . . .

        .data
x:      .word 01000
y:      .word 00500
        .text
                            Acc A     DP
        LD    #x,DP         -----     003
        LD    x,A           01000     003
        ADD   y,A           01001     003

Because y (0200h) lies on the next page, ADD y,A with DP = 3 reads address 0180h (data 0001) instead of y, so the accumulator ends at 1001 rather than 1500.
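The page-crossing result can be reproduced with the address arithmetic of paged direct mode: the 9-bit DP supplies the upper address bits and the instruction supplies only the low 7 bits of the operand address. The Python sketch below is a model under that reading of the slide; the function names `data_page` and `direct_address` are made up for illustration.

```python
def data_page(address):
    """Upper 9 bits of a 16-bit data address (the value DP takes in the example)."""
    return (address >> 7) & 0x1FF


def direct_address(dp, operand_address):
    """Paged direct: DP forms the upper 9 bits, the low 7 bits come from the operand."""
    return (dp << 7) | (operand_address & 0x7F)


memory = {0x0180: 0x0001, 0x01FF: 0x1000, 0x0200: 0x0500}
X, Y = 0x01FF, 0x0200

dp = data_page(X)                        # LD  #x,DP
acc = memory[direct_address(dp, X)]      # LD  x,A
acc += memory[direct_address(dp, Y)]     # ADD y,A : wraps to the start of page 3
print(hex(dp), hex(direct_address(dp, Y)), hex(acc))
```

The second access lands on 0180h because only the 7-bit offset of y (which is 0) survives, which is the reason the blocking techniques are needed.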

Paged Direct Addressing - Blocking

A single DP can be assured by either of two simple methods:

Specify the blocking argument in the linker command file:

        .bss : > RAM BLOCK=128

Group and block variables in ASM file:

        .bss  x,2,1       ;request all vars together,
                          ;third field requests block
y       .set  x+1         ;assign vars within block
                          ;orig sets

Paged Direct - Blocking Example

Data Memory
   Addr    Data
   1FF     ----
   200     1000
   201     0500

        .bss  x,2,1
y       .set  x+1
        .text
                            Acc A     DP
        LD    #x,DP         -----     004
        LD    x,A           01000     004
        ADD   y,A           01500     004

Paged Direct Addressing - Caveats

• Data page must be managed by programmer.
  - No warnings issued by tools for crossed page.

• CPL bit in ST1 must be 0 for paged direct.
  - Default condition on reset.
  - Invoked with RSBX CPL.

• Useful for fast, random access to <100 variables at a time.
  - For >100 variables, use pointers.
  - Not speed critical - use absolute direct.

• Recommended: Watch DP when debugging code:

        WA  ST0<<7, Base = , x

  will display "Base = " and the first address active for paged direct addressing cast in hex.

Stack Relative Direct Addressing

• Alternative to paged direct mode
• Uses 16-bit SP instead of 9-MSB DP as base
• Useful for stack-based operations

Example

Data Memory
   SP ->   ....
           0100
           0050

        .text
                            Acc A
        SSBX  CPL
        LD    1,A           0000000100
        ADD   2,A           0000000150

Notes:

1. SP and DP relative direct are mutually exclusive!

2. Restore CPL = 0 (RSBX CPL) before using paged direct again.

3. CPL = 0 on reset.
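Since the same instruction encoding serves both modes, the effective address depends on the state of CPL when the instruction runs. A minimal Python sketch of that selection follows. The function name `effective_address` and the SP value 0FF0h are hypothetical; the data words 0100 and 0050 are those of the slide example.

```python
def effective_address(offset, cpl, dp, sp):
    """Direct-mode address: SP-relative when CPL = 1, DP-paged when CPL = 0."""
    if cpl:
        return (sp + offset) & 0xFFFF
    return ((dp & 0x1FF) << 7) | (offset & 0x7F)


SP = 0x0FF0                                   # hypothetical stack pointer value
memory = {SP + 1: 0x0100, SP + 2: 0x0050}

cpl = 1                                       # SSBX CPL
acc = memory[effective_address(1, cpl, dp=0, sp=SP)]    # LD  1,A
acc += memory[effective_address(2, cpl, dp=0, sp=SP)]   # ADD 2,A
print(hex(acc))
```

With CPL cleared, the same offsets 1 and 2 would instead resolve inside the page selected by DP, which is why note 2 asks for RSBX CPL before returning to paged direct code.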

MMR Direct Addressing

• DP is ignored - not used or modified
• CPL is ignored - not used or modified
• Allows access to all DP0 resources (MMRs and SPRAM)
• Invoked via MMR-specific mnemonics:

        LDM, STLM       MMR ↔ Acc
        STM             # → MMR
        PSHM, POPM      MMR ↔ Stack
        MVDM, MVMD      MMR ↔ DMem
        MVMM            AR, SP, BK ↔ AR, SP, BK

Example

        .mmregs                 ;Allows MMR names as addresses
        LDM   ST1,B
        OR    #4000,B
        STLM  B,ST1

Memory-Mapped Registers (MMR)

Name    Addr. (Hex)   Description
IMR     0000          Interrupt Mask Register
IFR     0001          Interrupt Flag Register
-----   2 - 5         Reserved
ST0     0006          Status 0 Register
ST1     0007          Status 1 Register
AL      0008          A accumulator low (A[15:00])
AH      0009          A accumulator high (A[31:16])
AG      000A          A accumulator guard (A[39:32])
BL      000B          B accumulator low (B[15:00])
BH      000C          B accumulator high (B[31:16])
BG      000D          B accumulator guard (B[39:32])
T       000E          Temporary Register
TRN     000F          Transition Register
AR0     0010          Auxiliary Register 0
AR1     0011          Auxiliary Register 1
AR2     0012          Auxiliary Register 2
AR3     0013          Auxiliary Register 3
AR4     0014          Auxiliary Register 4
AR5     0015          Auxiliary Register 5
AR6     0016          Auxiliary Register 6
AR7     0017          Auxiliary Register 7
SP      0018          Stack Pointer Register
BK      0019          Circular Size Register
BRC     001A          Block Repeat Counter
RSA     001B          Block Repeat Start Address
REA     001C          Block Repeat End Address
PMST    001D          PMST Register
-----   01E-01F       Reserved

Register Addressing

• Allows interchange between accumulators

• Examples:

        LD    A,B           A → B
        ADD   B,A           A = A + B

• Can sometimes be merged with other action

        ADD   x,B,A         A = B + x

Exercise 3: Addressing

Assume: DP=0, CPL=0, CMPT=0

Address/Data (hex)
Scratch (DP=0)      Data1 (DP=4)      Data2 (DP=6)
60    20h           200   100h        300   100h
61    120h          201   60h         301   30h
62                  202   40h         302   60h

Program                 DP    AR0   AR1   AR2   A     B
LD    #0,DP
STM   #2,AR0
STM   #200h,AR1
STM   #300h,AR2
LD    61h,A                                     120
ADD   *AR1+,A
SUB   60h,A,B                                         200
ADD   *AR1+,B,A                                 260
LD    #6,DP
ADD   1,A
ADD   *AR2+,A                                   390
SUB   *AR2+,0,A
SUB   #32,A
ADD   *AR1-0,A,B                                      380
SUB   *AR2-0,B,A                                320
STL   A,62h

Lab 3: Addressing

[Diagram: program memory (P) holds .text and vectors; data memory (D) holds .bss and .data. Values move between .data and .bss through ACC, addressed by AR(src) and AR(dst).]

Code Abstract

        LD    (src1)
        STL   (dst1)
        LD    (src2)
        STL   (dst2)
        ...
done:   B     done

Caveats

• Don't put in a loop
• Use best addressing mode
• Optimizations come in later labs

Lab 3: Procedure

1. Copy LAB2D.ASM to LAB3.ASM. Modify LAB3 by replacing the NOPs with code to copy the nine data table values into the allocated RAM, as shown in the diagram above.

2. Copy LAB2D.CMD to LAB3.CMD. Modify LAB3 as required.

3. Assemble and link your code. Check the .LST and .MAP files for expected results.

4. Step through the code on the simulator. Verify performance; debug as necessary.

Optional: If time permits, add components to create a location "status" and copy ST0 to status. Which addressing modes are best here? Why?

Exercise 3: Addressing - Solution

Assume: DP=0, CPL=0, CMPT=0

Address/Data (hex)
Scratch (DP=0)      Data1 (DP=4)      Data2 (DP=6)
60    20h           200   100h        300   100h
61    120h          201   60h         301   30h
62                  202   40h         302   60h

Program                 DP    AR0   AR1   AR2   A     B
LD    #0,DP             0
STM   #2,AR0                  2
STM   #200h,AR1                     200
STM   #300h,AR2                           300
LD    61h,A                                     120
ADD   *AR1+,A                       201         220
SUB   60h,A,B                                         200
ADD   *AR1+,B,A                     202         260
LD    #6,DP             6
ADD   1,A                                       290
ADD   *AR2+,A                             301   390
SUB   *AR2+,0,A                           302   360
SUB   #32,A                                     340
ADD   *AR1-0,A,B                    200               380
SUB   *AR2-0,B,A                          300   320
STL   A,62h
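The solution values can be cross-checked by stepping through the program in a small model. The Python sketch below is a hand translation of each instruction under the stated assumptions (CPL = 0, CMPT = 0, all values in hex except the decimal constant 32). The helper name `direct` and the dictionary used as data memory are illustrative choices, not part of the workshop material.

```python
mem = {0x60: 0x20, 0x61: 0x120,
       0x200: 0x100, 0x201: 0x60, 0x202: 0x40,
       0x300: 0x100, 0x301: 0x30, 0x302: 0x60}


def direct(dp, offset):
    """Paged direct address: 9-bit DP above a 7-bit offset."""
    return (dp << 7) | (offset & 0x7F)


dp = 0                               # LD   #0,DP
ar0 = 2                              # STM  #2,AR0
ar1 = 0x200                          # STM  #200h,AR1
ar2 = 0x300                          # STM  #300h,AR2
a = mem[direct(dp, 0x61)]            # LD   61h,A
a += mem[ar1]; ar1 += 1              # ADD  *AR1+,A
b = a - mem[direct(dp, 0x60)]        # SUB  60h,A,B
a = b + mem[ar1]; ar1 += 1           # ADD  *AR1+,B,A
dp = 6                               # LD   #6,DP
a += mem[direct(dp, 1)]              # ADD  1,A       (address 301h)
a += mem[ar2]; ar2 += 1              # ADD  *AR2+,A
a -= mem[ar2]; ar2 += 1              # SUB  *AR2+,0,A (shift of 0)
a -= 32                              # SUB  #32,A     (32 decimal = 20h)
b = a + mem[ar1]; ar1 -= ar0         # ADD  *AR1-0,A,B
a = b - mem[ar2]; ar2 -= ar0         # SUB  *AR2-0,B,A
store_at = direct(dp, 0x62)          # STL  A,62h resolves through DP = 6
mem[store_at] = a & 0xFFFF

print(hex(a), hex(b), hex(ar1), hex(ar2), hex(store_at))
```

In this model the final STL writes to 362h rather than to the scratch location 62h, because DP is still 6 when the store executes.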

LAB3.ASM: Solution

; LAB3.ASM: Data Xfer solution

.def start,table,x

.bss x,4

.bss a,4

.bss y,1

.data

table: .word 1,2,3,4

.word 8,6,4,2,0

.text

NOP

start: STM #table,AR1

STM #x,AR2

LD *AR1+,A ;1

STL A,*AR2+

LD *AR1+,A ;2

STL A,*AR2+

LD *AR1+,A ;3

STL A,*AR2+

LD *AR1+,A ;4

STL A,*AR2+

LD *AR1+,A ;5

STL A,*AR2+

LD *AR1+,A ;6

STL A,*AR2+

LD *AR1+,A ;7

STL A,*AR2+

LD *AR1+,A ;8

STL A,*AR2+

LD *AR1+,A ;9

STL A,*AR2+

; Optional process solution

.mmregs

.bss status,1

.def status

option: LDM ST0,A

STL A,*(status)

done: B done

LAB3.CMD: Solution

lab3.obj

vectors.obj

-o lab3.out

-m lab3.map

MEMORY { PAGE 0: EPROM: org = 0E000h len = 01F80h

VECS: org = 0FF80h len = 00080h

PAGE 1: SPRAM: org = 00060h len = 00020h

DARAM: org = 00080h len = 01380h }

SECTIONS{

.vectors: > VECS PAGE 0

.text : > EPROM PAGE 0

.data : > DARAM PAGE 1

.bss : > SPRAM PAGE 1

}

Basic Programming Techniques

Learning Objectives

• Perform simple branch, loop control, and subroutine operations.

• Set up and employ the stack for subroutine call and return.

• Use the accumulator to load, store, add and subtract 16-bit values from data and program memory.

• Use the multiplier to implement sum-of-products equations.

Basic Program Control

Branch                  Call                    Return
B     next              CALL  sub               RET
BACC  src               CALA  src
BC    next,cnd,...      CC    sub,cnd,...       RC    cnd,...

Instruction     Cycles
B, CALL         4
RET             5
BACC, CALA      6
BC, CC, RC      5/3

Condition Operators

Pick 1 of:  EQ, NEQ, LEQ, GEQ, LT, GT
  and/or
Pick 1 of:  OV, NOV

OR

Pick 1 of:  TC, NTC
  and/or
Pick 1 of:  C, NC
  and/or
Pick 1 of:  BIO, NBIO

Examples

        RC    TC
        CC    sub,BNEQ
        BC    new,AGT,AOV
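The "pick 1 and/or pick 1" rule can be expressed as a small validity check. The Python sketch below is only an illustration of the grouping shown on the slide. The names `GROUP1`, `GROUP2`, and `valid_combination` are hypothetical, and the requirement that accumulator-based conditions name the same accumulator is an assumption of this sketch, not something the slide states.

```python
GROUP1 = [{"EQ", "NEQ", "LEQ", "GEQ", "LT", "GT"}, {"OV", "NOV"}]
GROUP2 = [{"TC", "NTC"}, {"C", "NC"}, {"BIO", "NBIO"}]


def strip_acc(cond):
    """Split an accumulator prefix off: 'AGT' -> ('A', 'GT'); 'TC' -> (None, 'TC')."""
    if cond[0] in "AB" and cond[1:] in GROUP1[0] | GROUP1[1]:
        return cond[0], cond[1:]
    return None, cond


def valid_combination(conds):
    """True if all conditions come from one group, at most one per category."""
    parsed = [strip_acc(c) for c in conds]
    names = [name for _, name in parsed]
    for group in (GROUP1, GROUP2):
        if all(any(name in cat for cat in group) for name in names):
            one_each = all(sum(name in cat for name in names) <= 1 for cat in group)
            accs = {acc for acc, _ in parsed if acc}
            return one_each and len(accs) <= 1   # assumed: same accumulator
    return False


for example in (["TC"], ["BNEQ"], ["AGT", "AOV"], ["AGT", "ALT"], ["AGT", "TC"]):
    print(example, valid_combination(example))
```

The three slide examples pass the check, while two picks from the same category or a mix across the two groups are rejected.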

Loop Counter: BANZ

$y = \sum_{n=1}^{5} x_n$

        .bss  x,5
        STM   #x,AR1
        STM   #4,AR2
        LD    #0,A
loop:   ADD   *AR1+,A
        BANZ  loop,*AR2-
        STL   A,y
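The counter is loaded with 4 for five passes because BANZ tests the register for non-zero and then post-decrements it, so the loop body runs once more than the initial count. A short Python model of that behavior, with a hypothetical function name and sample data, is sketched below.

```python
def banz_loop(x, counter):
    """Model the ADD / BANZ loop; returns (accumulator, number of passes)."""
    ar1, ar2, acc, passes = 0, counter, 0, 0
    while True:
        acc += x[ar1]           # ADD  *AR1+,A
        ar1 += 1
        passes += 1
        branch = ar2 != 0       # BANZ tests AR2 for non-zero...
        ar2 -= 1                # ...and *AR2- post-decrements it
        if not branch:
            break
    return acc, passes


print(banz_loop([1, 2, 3, 4, 5], 4))
```

Loading the counter with n - 1 therefore gives n passes through the loop.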

Comparison: CMPR

for (n=5; n<10; n++)

        STM   #5,AR1
        STM   #10,AR0
loop:   ...
        ...   *AR1+
        ...
        CMPR  LESS,AR1
        BC    loop,TC

Useful: .include files

EQUAL   .set  00b
LESS    .set  01b
GRTR    .set  10b
NOTEQ   .set  11b

The Stack

Setup:

STACK   .usect "STK",100
        STM   #STACK+100,SP

Use:

        CALL :  PC → *--SP
        RET  :  *SP++ → PC

[Diagram: data memory from 0 to 64K with the STK section starting at STACK; open locations lie below SP, SP points to the last used location, used locations lie above it]

Measuring Stack Required

Determining the amount of stack to allocate can be done in four steps:

1. Allocate a large stack and fill it with known values:

        LD    #-8531,A
        MVMM  SP,AR7
        RPT   #length
        STL   A,*AR7-

   [Diagram: every stack word holds DEAD; SP and AR7 mark the two ends of the filled area]

2. Run system to exercise all operations

3. Halt and inspect stack for prior value

   [Diagram: used stack words now hold other values (6B14, 0013, ..., 7AB3, 0000); untouched words still hold DEAD]

4. Delete excess (unused) stack
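The four steps amount to a high-water-mark measurement, which can be mimicked in a few lines of Python. This is a sketch of the idea rather than target code. The function names and the three example stack values are hypothetical; the fill constant is the one from step 1, since -8531 as a 16-bit word is DEADh.

```python
FILL = -8531 & 0xFFFF          # 16-bit pattern of -8531, i.e. 0xDEAD


def fill_stack(size):
    """Step 1: a stack area in which every word holds the fill pattern."""
    return [FILL] * size


def words_used(stack):
    """Step 3: count words overwritten, scanning up from the lowest address.

    Index 0 is the lowest address. The stack grows toward lower addresses,
    so untouched fill words sit at the low end of the area.
    """
    unused = 0
    for word in stack:
        if word != FILL:
            break
        unused += 1
    return len(stack) - unused


stack = fill_stack(100)
stack[-3:] = [0x7AB3, 0x0013, 0x6B14]   # hypothetical values left by a run (step 2)
print(hex(FILL), words_used(stack))
```

The measurement can under-report if the program happens to push the fill value itself or skips over locations without writing them, so some margin above the measured depth is a sensible choice in step 4.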

Exercise 4-1. Program Control

1. What is the difference between Branch and Call?

2. Why is there no Return using Accumulator?

3. Are multi-conditions based on the AND or OR of conditions?

4. When looping "n" times, what value do you place in the loop counter?

5. Which register(s) may be used for loop counting?

6. When adding to the stack, what happens to SP?

7. What does SP point to?

8. How many cycles do Branch operations require? Why?

Lab 4a

VECTORS.ASM

        .sect ".vectors"
        B     BEGIN
        ;allocate stack [.usect]
        .text
BEGIN
        ;[setup SP]
        ;[call START]

LAB4A.CMD

MEMORY
{ Page 1
    RAM:  org=___, len=___
    . . .
}
SECTIONS
    . . .
    STACK: > RAM

1. Modify VECTORS.ASM to allocate a stack, set up the SP, and call start

2. Copy LAB3.CMD to LAB4A.CMD

3. Modify LAB4A.CMD to route the stack to Data RAM

4. Link LAB3.OBJ with the modified VECTORS.OBJ to produce LAB4A.OUT

5. Simulate LAB4A.OUT to verify your results, especially the placement of a return address on the stack


4 - 11

Dual Accumulators

        39 - 32     31 - 16     15 - 0
A:      AG          AH          AL
B:      BG          BH          BL

        LD      x,A
        STL     A,*AR1+

        ADD     *AR2-,16,B
        STH     B,y

[Figure: ALU block diagram. MUX input labels: A, B, C, T and A, B, D, S. Other labels: ALU, A, B, M.]

4 - 12

Instruction Formats

Load acc-Lo with            Smem

Load acc-Hi with            Smem

Load acc w. T-SHIFT         Smem

Load A with shift           shft    Xmem

Load A with SHIFT
                            SHIFT   Smem


4 - 13

Load Accumulator: LD

LD _____, dst

Shift Type      Data Memory
Low Acc         Smem
High Acc        Smem, 16
T-reg Value     Smem, TS
Fixed Value     Xmem, [shft]
Extended        Smem, SHIFT !

Constant
                #k8
                #K, 16 !
                #K, [shft] !

Accumulator
                src, ASM
                src, [SHIFT]

LEGEND
Smem: single data       shft: 0<=S<=15        ASM: Acc. Shifter     K: 16-bit const.
Xmem: ptr. data         SHIFT: -16<=S<=15     TS: TREG(5-0)         k8: 8-bit const.
src, dst: Acc. A or B   ! = 2-word size

4 - 14

Add and Subtract: ADD, SUB

ADD _____ or SUB _____

Shift Type      Data Memory                   Constant                  Accumulator
Low Acc         Smem, src
High Acc        Smem, 16, src, [dst]          #K, 16, src, [dst] !
T-reg Value     Smem, TS, src                                           src, ASM, [dst]
Fixed Value     Xmem, [shft], src             #K, [shft], src !
Extended        Smem, SHIFT, src, [dst] !                               src, [SHIFT], [dst]
Dual Op:        Xmem, Ymem, dst

LEGEND
Smem: single data       shft: 0<=S<=15        ASM: Acc. Shifter     K: 16-bit const.
Xmem: ptr. data         SHIFT: -16<=S<=15     TS: TREG(5-0)         k8: 8-bit const.
src, dst: Acc. A or B   ! = 2-word size


4 - 15

MIN, MAX

MAX dst     dst = max (A, B)     if A > B then C = 0

MIN dst     dst = min (A, B)     if A < B then C = 0

Example: z = max (x_n)

        .bss    x,100
        .bss    z,1

        STM     #x,AR1
        STM     #98,BRC
        LD      *AR1+,B
        RPTB    loop
        LD      *AR1+,A
loop:   MAX     B
        STL     B,*(z)

[Figure: AR1 pointing into the array x, with the result stored in z.]

4 - 16

Store Accumulator: STL, STH

Shift type      STL                       STH
None            AccLo → Smem              Acc >> 16 → Smem
ASM             Acc << ASM → Smem         Acc >> (16-ASM) → Smem
Short (Xmem)    Acc << shft → Xmem        Acc >> (16-shft) → Xmem
Extended !      Acc << SHIFT → Smem       Acc >> (16-SHIFT) → Smem

LEGEND
Smem: single data       shft: 0<=S<=15        ASM: Acc. Shifter     K: 16-bit const.
Xmem: ptr. data         SHIFT: -16<=S<=15     TS: TREG(5-0)         k8: 8-bit const.
src, dst: Acc. A or B   ! = 2-word size
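
The shift expressions in the STL/STH table can be checked with a small Python model. This is only a sketch under stated assumptions: the accumulator is held as a plain non-negative Python integer, sign-extension mode and saturation are ignored, and the helper names `fields`, `stl`, and `sth` are hypothetical, chosen for illustration.

```python
def fields(acc):
    """Split a 40-bit accumulator value into guard (39-32), high (31-16), low (15-0)."""
    acc &= (1 << 40) - 1
    return (acc >> 32) & 0xFF, (acc >> 16) & 0xFFFF, acc & 0xFFFF


def stl(acc, shift=0):
    """STL model: low 16 bits of the accumulator after the optional shift."""
    shifted = acc << shift if shift >= 0 else acc >> -shift
    return shifted & 0xFFFF


def sth(acc, shift=0):
    """STH model: Acc >> (16 - shift), truncated to 16 bits."""
    return (acc >> (16 - shift)) & 0xFFFF


acc = 0x0012345678              # guard = 00h, high = 1234h, low = 5678h
guard, high, low = fields(acc)
low_word = stl(acc)             # same as the low field
high_word = sth(acc)            # same as the high field
high_word_shifted = sth(acc, 4) # bits 27-12 of the accumulator
```

With no shift, `stl` and `sth` simply return the AL and AH fields; a positive shift moves the stored 16-bit window toward the low end of the accumulator.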


4 - 17

Store Constant to Memory

ST #K, Smem

• Direct store of constant to memory

• Accumulator not affected

• Two words, two cycles

• Alt. syntax allows store of T or TRN registers

4 - 18

MAC Unit

[Figure: MAC unit block diagram. A 17 x 17 multiplier (input labels D, P, C, A and T, D) feeds a 40-bit adder (input labels A, B, 0); results go to accumulators A and B.]

17 x 17 Multiplier:

 - Sign / Unsigned support

 - 8000h x 8000h = 7FFFh in SMUL=1 mode

40-Bit Adder:

 - Separate from ALU

 - Sum & Add in single cycle
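
The 8000h x 8000h case can be illustrated with a short Python model. This is a sketch, not device code: it assumes fractional mode (a one-bit left shift of the product), which the slide does not state, and the names `to_signed16` and `frct_multiply` are hypothetical.

```python
def to_signed16(word):
    """Interpret a 16-bit word as a two's-complement value."""
    word &= 0xFFFF
    return word - 0x10000 if word & 0x8000 else word


def frct_multiply(a_word, b_word, smul=True):
    """Signed 16 x 16 multiply with a one-bit left shift (assumed fractional mode).

    With smul=True the single overflowing case, 8000h x 8000h, is replaced
    by the largest positive 32-bit value.
    """
    product = (to_signed16(a_word) * to_signed16(b_word)) << 1
    if smul and product == 0x80000000:
        return 0x7FFFFFFF
    return product


saturated_high = frct_multiply(0x8000, 0x8000) >> 16          # high word with SMUL=1
unsaturated = frct_multiply(0x8000, 0x8000, smul=False)       # 80000000h, reads as negative in 32 bits
quarter = frct_multiply(0x4000, 0x4000)                       # 0.5 x 0.5 in fractional format
```

In this model, -1.0 times -1.0 would be +1.0, which a signed 32-bit fractional format cannot represent; the saturated result is the nearest representable value, and its high word is the 7FFFh quoted on the slide.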


4 - 19

Multiplier Instructions

OP      Options                     Execution
LD      Smem, T                     T = S
MPY     Smem, dst                   dst = S · T
        Xmem, Ymem, dst             dst = X · Y
        Smem, #K, dst !             dst = S · K
        #K, dst !                   dst = K · T
MAC     Smem, src                   src = src + S · T
        Xmem, Ymem, src, [dst]      dst = src + X · Y
        Smem, #K, src, [dst] !      dst = src + S · K
        #K, src, [dst] !            dst = src + K · T
MAS     Smem, src                   src = src - S · T
        Xmem, Ymem, src, [dst]      dst = src - X · Y

4 - 20

Additional Multiplier Instructions

OpCode  Options             Execution
MPYA    Smem                B = S · AH
        dst                 dst = T · AH
MACA    Smem                B = B + S · AH
        T, src, [dst]       dst = src + T · AH
MASA    Smem                B = B - S · AH
        T, src, [dst]       dst = src - T · AH
SQUR    Smem, dst           dst = S^2
        A, dst              dst = AH^2
SQURA   Smem, src           src = src + S^2
SQURS   Smem, src           src = src - S^2


4 - 21

Examples

z = x + y - w

        LD      x,A
        ADD     y,A
        SUB     w,A
        STL     A,z

y = mx + b

        LD      m,T
        MPY     x,A
        ADD     b,A
        STL     A,y

y = x1 · a1 + x2 · a2

        LD      x1,T
        MPY     a1,B
        LD      x2,T
        MAC     a2,B
        STL     B,y
        STH     B,y+1

4 - 22

Lab 4b: Basic Programming

[Figure: Lab 4b memory diagram. Program memory holds the Lab 3 code and the vector. In data memory, Lab 3 copies the table values 1 2 3 4 / 8 6 4 2 from ROM to RAM. Lab 4 uses AR1, AR2, T, and A to compute the sum below, stores it in y, and finishes at Done.]

$$y = \sum_{n=1}^{4} (a_n \cdot x_n)$$


4 - 23

Lab 4b: Procedure

1. Copy LAB3.ASM to LAB4B.ASM. Open LAB4B.ASM.

2. Modify the initialization process to use a BANZ loop.

3. Call a routine that does the following:
   a. Initialize pointers to the x and a arrays.
   b. Multiply the first two array elements into the accumulator.
   c. Multiply and accumulate the remaining pairs using in-line code -- don't use BANZ.
   d. Store the result to memory location y.
   e. Return to the main routine.

4. Set up an appropriate linker command file.

5. Assemble, link, simulate and debug your code.

Optional: Obtain the maximum value of an individual product.
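
When simulating the lab, it helps to know what value to expect in memory. The Python sketch below computes the sum of products and the largest single product, assuming the table values shown in the lab diagram (x = 1, 2, 3, 4 and a = 8, 6, 4, 2); the variable names are illustrative only.

```python
x = [1, 2, 3, 4]        # first table row, copied to array x
a = [8, 6, 4, 2]        # second table row, copied to array a

# Individual products a_n * x_n, in order.
products = [an * xn for an, xn in zip(a, x)]

y = sum(products)               # value expected in y after the sum-of-products routine
largest_product = max(products) # value expected from the optional exercise
```

For these inputs the products are 8, 12, 12, and 8, so the simulator should show 40 (28h) in y and 12 (0Ch) for the optional maximum.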


4 - 25

VECTOR4.ASM : Solution

;Solution for VECTORS.ASM for LAB4A

        .ref    start

LEN     .set    100

STACK   .usect  "STK",LEN

        .sect   ".vectors"

        B       BEGIN

        .text

BEGIN   STM     #STACK+LEN,SP

        call    start

4 - 26

LAB4A.CMD : Solution

lab3.obj
vector4.obj
-o lab4a.out
-m lab4a.map

MEMORY {
    PAGE 0: EPROM: org = 0E000h   len = 01F80h
            VECS : org = 0FF80h   len = 00080h
    PAGE 1: SPRAM: org = 00060h   len = 00020h
            DARAM: org = 00080h   len = 01380h }

SECTIONS {
    .vectors : > VECS    PAGE 0
    .text    : > EPROM   PAGE 0
    .data    : > DARAM   PAGE 1
    .bss     : > SPRAM   PAGE 1
    STK      : > DARAM   PAGE 1 }


4 - 27

LAB4B.ASM : Solution

        .def    start,table,x
        .bss    x,4
        .bss    a,4
        .bss    y,1

        .text
        NOP
start:  STM     #table,AR1
        STM     #x,AR2
        STM     #8,AR7
loop:   LD      *AR1+,A
        STL     A,*AR2+
        BANZ    loop,*AR7-
        CALL    sop
        CALL    maxi
done:   B       done

sop:    STM     #x,AR1
        STM     #a,AR2
        LD      *AR1+,T         ;1
        MPY     *AR2+,A
        LD      *AR1+,T         ;2
        MAC     *AR2+,A
        LD      *AR1+,T         ;3
        MAC     *AR2+,A
        LD      *AR1,T          ;4
        MAC     *AR2,A
        STL     A,*(y)
        RET

        .data
table:  .word   1,2,3,4
        .word   8,6,4,2,0

4 - 28

LAB4B.ASM Optional : Solution

        .bss    max,1

maxi:   STM     #x,AR1
        STM     #a,AR2
        LD      *AR1+,T         ;1
        MPY     *AR2+,B
        LD      *AR1+,T         ;2
        MPY     *AR2+,A
        MAX     B
        LD      *AR1+,T         ;3
        MPY     *AR2+,A
        MAX     B
        LD      *AR1+,T         ;4
        MPY     *AR2+,A
        MAX     B
        STL     B,max
        RET


4 - 29

LAB4B.CMD : Solution

lab4b.obj
vector4.obj
-o lab4b.out
-m lab4b.map

MEMORY {
    PAGE 0: EPROM: org = 0E000h   len = 01F80h
            VECS:  org = 0FF80h   len = 00080h
    PAGE 1: SPRAM: org = 00060h   len = 00020h
            DARAM: org = 00080h   len = 01380h }

SECTIONS {
    .vectors: > VECS    PAGE 0
    .text   : > EPROM   PAGE 0
    .data   : > DARAM   PAGE 1
    .bss    : > SPRAM   PAGE 1
    STK     : > DARAM   PAGE 1 }


DSP54x - Advanced Programming 5 - 1

Advanced Programming

Learning Objectives

5 - 2

Learning Objectives

• Repeat Functions

• Data Move Functions

• Dual Operands (Xmem, Ymem)

• Long, Double, & Parallel Ops


Module 5

5 - 3

Repeat Next: RPT

• Features
    - Next instruction iterated N+1 times
    - Saves code space (1 or 2 words)
    - Low overhead (1 or 2 cycles)
    - Easy to use
    - Non-interruptible

• Options
    - RPT #k8      up to 256 iterations
    - RPT #K       up to 64K iterations
    - RPT Smem     ref. data mem for count value

Example: int x[5]={0,0,0,0,0};

        .bss    x,5

        STM     #x,AR1
        LD      #0,A
        RPT     #4
        STL     A,*AR1+
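
The "N+1 times" rule is the part of RPT most often misread, so here is a minimal Python model of the zero-fill example. It is a sketch only: the data address 0x60, the `state` dictionary, and the helper names `rpt` and `stl_a_postinc` are hypothetical, not part of any tool or library.

```python
def rpt(count, instruction):
    """Model RPT: run the following instruction count + 1 times."""
    for _ in range(count + 1):
        instruction()


# x[5] at a hypothetical data address, pre-filled with junk to show the effect.
X_ADDR = 0x60
memory = {X_ADDR + i: 0xDEAD for i in range(5)}
state = {"AR1": X_ADDR, "A": 0}     # STM #x,AR1 / LD #0,A


def stl_a_postinc():
    """STL A,*AR1+ : store the low accumulator word, then advance the pointer."""
    memory[state["AR1"]] = state["A"] & 0xFFFF
    state["AR1"] += 1


rpt(4, stl_a_postinc)               # RPT #4 gives five stores
x = [memory[X_ADDR + i] for i in range(5)]
```

Loading the repeat count with 4 clears all five words; loading it with 5 would write one word past the end of the array.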

5 - 4

Enhanced Performance with RPT

These instructions execute faster when in a RPT loop:

        MVDM    MVKD    MACD
        MVMD    MVDK    MACP
        MVDP    READA   FIRS
        MVPD    WRITA

Pointer setup and usage becomes more efficient while RPT loop is active.


5 - 5

Non-Repeatable Instructions

Generally, not operations useful to repeat; e.g., branches, status register ops, etc.:

        B[D]        BC[D]       BANZ[D]     INTR        RETE[D]
        CALL[D]     CC[D]       RPT         TRAP        RETF[D]
        RET[D]      RC[D]       RPTZ        RESET       BACC[D]
        Far Ops     XC          RPTB[D]     IDLE        CALA[D]

        ANDM        LD DP       MVMM        RSBX
        ORM         LD ASM      CMPR        SSBX
        XORM        LD ARP      DST         RND
        ADDM

Can yield errors. Won't damage device.

5 - 6

Repeat and Zero: RPTZ

• Repeats following instruction N+1 times

• Additionally, zeros specified accumulators

• Uses long constant only

• Requires two words and two cycles

Example: int x[5]={0,0,0,0,0};

        .bss    x,5

        STM     #x,AR2
        RPTZ    B,#4
        STL     B,*AR2+


5 - 7

Block Repeat: RPTB

• Allows 'zero overhead' looping on any size code segment

• Is a 2-word, 4-cycle instruction

• Is interruptible

• RSA is next line of code

• REA is operand for RPTB

• BRC must be pre-loaded with 'count-1'

• May operate on any length block

5 - 8

RPTB Example

Add 1 to each element in the array x[5]

        .bss    x,5

begin:  LD      #1,16,B
        STM     #4,BRC
        STM     #x,AR4
        RPTB    next-1
        ADD     *AR4,16,B,A     ; }
        STH     A,*AR4+         ; } Loop 5x
next:   LD      #0,B
        ...

'next-1' assures complete fetch of possible multiword final instruction
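
The arithmetic inside the block-repeat loop works on the high half of the accumulators, which is easy to lose track of. The Python sketch below mirrors the loop body step by step, assuming BRC and AR4 are loaded before RPTB and ignoring sign extension and saturation; `add_one_to_each` is a hypothetical name and the list index stands in for the AR4 pointer.

```python
def add_one_to_each(x):
    """Model the RPTB loop: add 1 to every 16-bit element of x in place."""
    b = 1 << 16                         # LD  #1,16,B   (1 in the high half of B)
    brc = len(x) - 1                    # STM #4,BRC    for a five-element array
    ar4 = 0                             # STM #x,AR4    (index into x)
    for _ in range(brc + 1):            # RPTB next-1   runs the block BRC + 1 times
        a = (x[ar4] << 16) + b          # ADD *AR4,16,B,A
        x[ar4] = (a >> 16) & 0xFFFF     # STH A,*AR4+
        ar4 += 1
    return x


result = add_one_to_each([10, 20, 30, 40, 50])
wrapped = add_one_to_each([0xFFFF])     # a 16-bit value of FFFFh wraps to 0
```

Shifting both operands into the high word lets a single STH write the sum back without any extra shift.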


5 - 9

Nested Loops

        STM     #L-1,AR7
1st:    out                     ; level 3
        out
        STM     #M-1,BRC
        RPTB    2nd-1
        mid                     ; level 2
        mid
        RPT     #N-1
        inner                   ; level 1
        mid
        mid
2nd:    out
        out
        BANZ    1st,*AR7-

Level   Operator    Cycles
1       RPT         1
2       RPTB        4+2
3       BANZ        2+4 · N
...     ...         ...

• RPT uses (invisible) RC

• RPTB uses BRC, RSA, REA

• Nesting RPTB possible, but not efficient

5 - 10

Exercise 5-1: Repeat Operations

1. Which repeat functions are interruptible?

2. How many/few lines of code can be in a RPTB?

3. Which repeat function is fastest?

4. What does RPT 5 do?

Add array x[10] to y[10]

Add 100 values in the array x


5 - 11

Data Move

• Faster than Load and Store

• Transfer avoids accumulator

• Allows access to program memory

• Optimal with RPT (speed and code size)

5 - 12

Optimal Initialization: MVPD

Example: x[5]={1,2,3,4,5};

Program Memory (ROM)

        .sect   ".vectors"
        B       START

        .text
START:  STM     #x,AR5
        RPT     #4
        MVPD    TBL,*AR5+
        ...

        .data
TBL:    .word   1,2,3,4,5

Data Memory (RAM)

        .bss    x,5

No data ROM required!
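
The repeated MVPD can be pictured as a block copy from program memory to data memory. The Python sketch below models that copy under the assumption that the source program address advances by one on each repeat; the addresses 0xE100 and 0x60 and the function name `rpt_mvpd` are hypothetical.

```python
def rpt_mvpd(program, pmad, data, ar, count):
    """Model RPT #count followed by MVPD pmad,*ARn+ ; returns the final pointer."""
    for offset in range(count + 1):         # RPT runs the move count + 1 times
        data[ar] = program[pmad + offset]   # program word -> data word
        ar += 1                             # post-increment of the data pointer
    return ar


TBL = 0xE100                                # hypothetical program address of TBL
X = 0x60                                    # hypothetical data address of x
program = {TBL + i: value for i, value in enumerate([1, 2, 3, 4, 5])}
data = {}

ar5 = rpt_mvpd(program, TBL, data, X, 4)    # STM #x,AR5 / RPT #4 / MVPD TBL,*AR5+
x = [data[X + i] for i in range(5)]
```

Because the table lives in program memory alongside the code, the initial values need no separate data ROM, which is the point of the slide.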


5 - 13

Move Instructions

DATA ↔ DATA                     # w/c
MVDK    Smem, dmad              2/2
MVKD    dmad, Smem              2/2
MVDD    Xmem, Ymem              1/1

DATA ↔ MMR                      # w/c
MVDM    dmad, MMR               2/2
MVMD    MMR, dmad               2/2
MVMM    mmr, mmr                1/1

PGM ↔ DATA                      # w/c
MVPD    pmad, Smem              2/3
MVDP    Smem, pmad              2/4

PGM(Acc) ↔ DATA                 # w/c
READA   Smem                    1/5
WRITA   Smem                    1/5

LEGEND
Smem: regular data memory address       dmad: 16-bit data memory address
Xmem, Ymem: dual operand data mems      pmad: 16-bit pgm. memory address
MMR: any memory map register            mmr: AR0-AR7, or SP

5 - 14

Exercise 5-2: Move Operations

1. Which instructions would be best for a context save and restore?

2. Which mmrs does MVMM access?

3. Which move operations allow a run-time selectable pmad?

4. Write a routine to copy x[20] to y[20].


5 - 15

Dual Operand Multiplication

[Figure: data memory feeding the MAC unit over the C bus and the D bus, with results going to accumulators A and B.]

5 - 16

Dual Operand MPY

y = mx + b

Standard Solution

        LD      m,T
        MPY     x,A
        ADD     b,A
        STL     A,y

Dual Op Solution

        MPY     *AR2,*AR3,A
        ADD     b,A
        STL     A,y

Dual Op Caveats

• May use only AR2-AR5

• Requires less code space

• Executes more quickly


5 - 17

Dual Operand MPY Example

$$y = \sum_{n=1}^{20} x_n a_n$$

Single operand version (3 cycles per loop pass):

        LD      #0,B
        STM     #a,AR2
        STM     #x,AR3
        STM     #19,BRC
        RPTB    done-1
        LD      *AR2+,T
        MPY     *AR3+,A
        ADD     A,B
done:   STH     B,y
        STL     B,y+1

Dual operand version (2 cycles per loop pass):

sop1:   LD      #0,B
        STM     #a,AR2
        STM     #x,AR3
        STM     #19,BRC
        RPTB    done-1
        MPY     *AR2+,*AR3+,A
        ADD     A,B
done:   STH     B,y
        STL     B,y+1

Total savings: 1 cycle * 20 iterations = 20 cycles
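
The savings figure follows from counting the instructions inside the repeated block. The short Python sketch below does that arithmetic, assuming each instruction in the loop body takes one cycle (an assumption consistent with the per-pass counts on the slide); all names are illustrative.

```python
ITERATIONS = 20


def loop_cycles(cycles_per_pass, iterations=ITERATIONS):
    """Cycles spent inside the repeated block, ignoring setup code."""
    return cycles_per_pass * iterations


single_operand = loop_cycles(3)     # LD *AR2+,T / MPY *AR3+,A / ADD A,B
dual_operand = loop_cycles(2)       # MPY *AR2+,*AR3+,A / ADD A,B
savings = single_operand - dual_operand


def sum_of_products(a, x):
    """Reference result that both loop versions should produce."""
    return sum(an * xn for an, xn in zip(a, x))
```

Both versions compute the same sum; the dual operand form only removes the separate load of T from each pass.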

5 - 18

Dual Operand MAC Example

$$y = \sum_{n=1}^{20} x_n a_n$$

sop2:   STM     #x,AR2
        STM     #a,AR3
        RPTZ    A,#19
        MAC     *AR2+,*AR3+,A
        STH     A,y
        STL     A,y+1

Performance: N+2 cycles for N iterations


5 - 19

Dual Operand MAC and MPY

MPY     Xmem,Ymem,dst           dst = Xmem * Ymem
MAC     Xmem,Ymem,src,[dst]     dst = src + Xmem * Ymem
MAS     Xmem,Ymem,src,[dst]     dst = src - Xmem * Ymem
MACP    Smem,pmad,src,[dst]     dst = src + Smem * pmad

LEGEND
Smem: regular data memory address       dmad: 16-bit data memory address
Xmem, Ymem: dual operand data mems      pmad: 16-bit pgm. memory address
src: source accumulator                 dst: destination accumulator

5 - 20

X,Y Addressing Rules

Dual operand addressing allows only certain pointers and modes:

Pointers    Modes
AR2         *ARn
AR3         *ARn+
AR4         *ARn-
AR5         *ARn+0%

Modifiers: BK + AR0

Since the only index offered is circular, regular index is possible only if BK is set to 0, or made very large, e.g., FFFFh.
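
The effect of BK on the `*ARn+0%` mode can be explored with a small Python model. This is a sketch of the commonly documented circular update (buffer base taken from the upper address bits, index wrapped at BK), not a verified copy of the device logic; it assumes the step is no larger than BK, treats BK = 0 as plain linear indexing as the slide states, and uses the hypothetical name `circular_step`.

```python
def circular_step(ar, step, bk):
    """Return the new pointer after *ARn+0% with AR0 = step and buffer size bk."""
    if bk == 0:
        return (ar + step) & 0xFFFF     # no wrap: behaves as a regular index
    n = bk.bit_length()                 # smallest n with 2**n > bk
    mask = (1 << n) - 1
    base = ar & ~mask                   # buffer start: low n bits cleared
    index = (ar & mask) + step          # position inside the buffer after the step
    if index >= bk:
        index -= bk                     # ran off the end: wrap to the start
    elif index < 0:
        index += bk                     # ran off the start: wrap to the end
    return (base | index) & 0xFFFF


wrapped = circular_step(0x103, 2, 5)        # 5-word buffer at 100h: 103h + 2 wraps
inside = circular_step(0x100, 2, 5)         # stays inside the buffer
linear = circular_step(0x103, 2, 0)         # BK = 0: regular indexing
huge_bk = circular_step(0x103, 2, 0xFFFF)   # very large BK also behaves linearly here
```

With a small BK the pointer cycles within the buffer; with BK at 0 or FFFFh the same addressing mode gives an ordinary indexed step, which is how the dual operand modes get a regular index.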


5 - 21

Exercise 5-3: Dual Op MAC

• What addressing options exist for dual operand mode?

• Which multiplication instructions support dual operands?

• Write the code to solve, for i = 1 to 10:

        y(i) = y(i-1) + x(i) · e

5 - 22

Notes


5 - 23

Long Word Operations

Example: $Z_{32} = X_{32} + Y_{32}$

Standard Operations

        LD      XHI,16,A
        ADDS    XLO,A
        ADD     YHI,16,A
        ADDS    YLO,A
        STH     A,ZHI
        STL     A,ZLO

Words = 6

Cycles = 6

Long Word Operations

        DLD     XHI,A
        DADD    YHI,A
        DST     A,ZHI

Words = 3

Cycles = 4
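
Both instruction sequences should leave the same 32-bit sum in ZHI:ZLO. The Python sketch below models each one so the equivalence can be checked on paper; it treats all words as unsigned and ignores sign extension and the guard bits, and the function names are hypothetical.

```python
def add32_standard(xhi, xlo, yhi, ylo):
    """Model the six-instruction version built from 16-bit operations."""
    a = xhi << 16           # LD   XHI,16,A
    a += xlo                # ADDS XLO,A    (low word added without sign extension)
    a += yhi << 16          # ADD  YHI,16,A
    a += ylo                # ADDS YLO,A
    zhi = (a >> 16) & 0xFFFF    # STH A,ZHI
    zlo = a & 0xFFFF            # STL A,ZLO
    return zhi, zlo


def add32_long(x, y):
    """Model the three-instruction version: DLD, DADD, DST on 32-bit values."""
    z = (x + y) & 0xFFFFFFFF
    return (z >> 16) & 0xFFFF, z & 0xFFFF


# 0001FFFFh + 00000001h: the low-word carry must ripple into the high word.
standard = add32_standard(0x0001, 0xFFFF, 0x0000, 0x0001)
long_form = add32_long(0x0001FFFF, 0x00000001)
```

The test value is chosen so that the low words overflow, which is the case a word-by-word addition is most likely to get wrong.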

5 - 24

Long Operand Instructions

DLD     Lmem, dst               dst = Lmem
DST     src, Lmem               Lmem = src
DADD    Lmem, src, [dst]        dst = src + Lmem
DSUB    Lmem, src, [dst]        dst = src - Lmem
DRSUB   Lmem, src, [dst]        dst = Lmem - src

• Double store requires two cycles for dual E-bus activity

• Double Load/Add/Sub are single cycle in DARAM

• Double operations to single access memories take two cycles

• Default auto-increment step size is TWO


5 - 25

Long Operand Issues

• Long operand instructions read MSW from specified address and LSW at same address with LSB toggled

        Ex1:    DLD     100,A       A = @100 @101
        Ex2:    DLD     201,B       B = @201 @200

• Recommended: Align words in memory so that MSW is at even address

        Ex1:    .long   12345678h       even: 1 2 3 4
                                        odd:  5 6 7 8

        Ex2:    .bss    XHI,2,1,1       even: XHI
                                        odd:  XLO

        .bss operands: Name, Size, Page Contig, EVEN ALIGN
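
The "LSB toggled" rule is simply an exclusive-OR of the address with 1. The Python sketch below reproduces the two DLD examples from the slide; the memory contents and the name `dld` are made up for illustration.

```python
def dld(memory, address):
    """Model a long load: MSW from `address`, LSW from the address with bit 0 toggled."""
    msw = memory[address]
    lsw = memory[address ^ 1]
    return (msw << 16) | lsw


# Hypothetical data memory contents.
memory = {100: 0x1234, 101: 0x5678, 200: 0xBBBB, 201: 0xAAAA}

a = dld(memory, 100)    # even address: MSW at 100, LSW at 101
b = dld(memory, 201)    # odd address:  MSW at 201, LSW at 200
```

Starting from an odd address pairs the word with the one below it, so the halves come out in the opposite memory order. This is why the slide recommends aligning the MSW on an even address.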

5 - 26

Double Word Operations

Example: Z = X + Y and F = D + E

• Split accumulators into separate LO and HI halves: SSBX C16

1. Interleave Data

        .data
        .align  2
        .word   X,D,Y
        .word   E,Z,F

        --- or ---

        .bss    X,6,1,1

        Memory order: X, D, Y, E, Z, F

2. Write Code

        SSBX    C16
        ...
        DLD     X,B
        DADD    Y,B
        DST     B,Z
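
With C16 set, the long add behaves as two independent 16-bit additions. The Python sketch below contrasts that with the ordinary 32-bit add, using the interleaved layout from the slide (X in the high word, D in the low word, and so on); the function names and the sample values are hypothetical, and the guard bits are ignored.

```python
def dadd_c16(acc, lmem):
    """C16 = 1: add the high halves and the low halves separately, no carry between them."""
    hi = ((acc >> 16) + (lmem >> 16)) & 0xFFFF
    lo = ((acc & 0xFFFF) + (lmem & 0xFFFF)) & 0xFFFF
    return (hi << 16) | lo


def dadd_32(acc, lmem):
    """C16 = 0: one 32-bit addition, so a low-word carry reaches the high word."""
    return (acc + lmem) & 0xFFFFFFFF


X, D = 3, 0xFFFF        # first long word:  X in the high half, D in the low half
Y, E = 4, 1             # second long word: Y in the high half, E in the low half

xd = (X << 16) | D      # DLD  X,B
ye = (Y << 16) | E

zf = dadd_c16(xd, ye)   # DADD Y,B with C16 = 1
Z, F = zf >> 16, zf & 0xFFFF
carried = dadd_32(xd, ye)
```

Here D + E overflows 16 bits. In dual 16-bit mode F wraps to 0 and Z is still X + Y = 7, whereas the plain 32-bit add would push the carry into the high word and give 8.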


5 - 27

Parallel Operations

Example: Z = X + Y and F = D + E

• Parallel load/store instructions use D Bus and E Bus in same cycle.

        .bss    X,3             ; X, Y, Z (AR5)
        .bss    D,3             ; D, E, F (AR6)

        STM     #X,AR5
        STM     #D,AR6
        LD      #0,ASM
        LD      *AR5+,16,A
        ADD     *AR5+,16,A
        ST      A,*AR5
     || LD      *AR6+,B
        ADD     *AR6+,16,B
        STH     B,*AR6

• Parallel ops focus on high accumulator.

• Store in parallel ops are offset by ASM value.
    - ASM is a 5-bit signed field in ST1 (bits 4-0)
    - ASM is best loaded with: LD #k5,ASM

• What is the error in the above example?

5 - 28

Parallel Instructions

Instruction         Example                     Operation

LD || MAC[R]        LD       Xmem,dst           dst = Xmem << 16
LD || MAS[R]        ||MAC[R] Ymem,[dst2]        dst2 = dst2 + T * Ymem

ST || MPY           ST       src,Ymem           Ymem = src >> (16-ASM)
ST || MAC[R]        ||MAC[R] Xmem,dst           dst = dst + T * Xmem
ST || MAS[R]

ST || ADD           ST       src,Ymem           Ymem = src >> (16-ASM)
ST || SUB           ||ADD    Xmem,dst           dst = dst + Xmem
ST || LD

ST || LD            ST       src,Ymem           Ymem = src >> (16-ASM)
                    ||LD     Xmem,T             T = Xmem


5 - 29

Double, Long, and Parallel Review

• How many program words are double, long, and parallel ops?

• How many cycles do they take to execute?

• If a ST || LD refers to the same Acc and DMEM, what happens?

• What is the ASM field? What does it affect? How is it loaded?

• How should 32-bit data be aligned in memory? How is that accomplished?

5 - 30

Bus Usage

Instruction Activity            PB      CB          DB          EB

Program Read                    A,D
Program Write                   A                               D
Data Single Read                                    A,D
Data Dual Read                          A,D         A,D
Data Long (32-bit) Read                 A,D(ms)     A1,D(ls)
Data Single Write                                               A,D
Data Read / Data Write                              A,D         A,D
Dual Read / Coefficient Read    A,D     A,D         A,D
Peripheral Write                                                A,D
Peripheral Read*                                    A,D

* MMRs only accessible via D Bus, MMR access as Ymem op yields bad data!


5 - 31

Module Review

• Fast loops:                  Repeat
• Fast data transfer:          Move Ops
• Faster math:                 Dual Operands
• Fast 32-bit math:            Long Ops
• Double math:                 Double Ops
• Two actions in one cycle:    Parallel Ops

5 - 32

Lab 5: Advanced Programming

Program Memory (ROM)

        .sect   ".vectors"
        B       start

        .text
start:  …
        MVPD    tbl,*…
        …
        MAC     *,*
        …

        .data
tbl:    .word   1,2,3,4
        .word   8,6,4,2

Data Memory (RAM)

        .bss
x       ___ ___ ___ ___
a       ___ ___ ___ ___
y       ___


5 - 33

Lab 5: Procedure

1. Copy LAB4B.ASM to LAB5.ASM. Modify LAB5 to:
   a. Perform initialization with a repeated MVPD.
   b. Perform the sum-of-products with a repeated dual operand MAC.

2. Copy LAB4.CMD to LAB5.CMD. Modify LAB5.CMD to:
   a. Load .data to program memory.
   b. Input LAB5.OBJ and create LAB5.OUT and LAB5.MAP.

3. Assemble, link, and simulate the program. Debug and verify performance.

4. Optional: If time permits, modify LAB5.ASM to use MACP. What effects would using MACP have on the system implementation?

5 - 34

Lab 5: Optional

Program Memory (ROM)

        .sect   ".vectors"
        B       start

        .text
start:  …
        MVPD    tbl,*+
        …
        MACP    coeff,…
        …

        .data
tbl:    .word   1,2,3,4
        .word   8,6,4,2         <- a

Data Memory (RAM)

        .bss
x       ___ ___ ___ ___
y       ___


5 - 36

Exercise 5-1: Solution

Add 100 values in the array x

        .bss    x,100

        STM     #x,AR6
        RPTZ    A,99
        ADD     *AR6+,A

Add array x[10] to y[10]

        .bss    x,10
        .bss    y,10

        STM     #x,AR2
        STM     #y,AR3
        STM     #9,BRC
        RPTB    next-1
        LD      *AR2+,A
        ADD     *AR3,A
        STL     A,*AR3+
next:   LD      #0,A

5 - 37

Exercise 5-2 & 5-3 : Solutions

Copy x[20] to y[20]

        .bss    x,20
        .bss    y,20

        STM     #x,AR2
        STM     #y,AR3
        RPT     #19
        MVDD    *AR2+,*AR3+

y(i) = y(i-1) + x(i) · e, where i = 1 to 10

        .bss    x,10
        .bss    y,10
        .bss    e,1

        STM     #x,AR2
        STM     #y,AR3
        STM     #e,AR4
        LD      #0,A
        STM     #10-1,BRC
        RPTB    loop-1
        MAC     *AR2+,*AR4,A
        STL     A,*AR3+
loop:


5 - 38

Lab 5: Solution

        .bss    x,4
        .bss    a,4
        .bss    y,1

        .data
tbl:    .word   1,2,3,4
        .word   8,6,4,2,0

        .text
start:  STM     #data,AR1
        RPT     #8
        MVPD    tbl,*AR1+
        STM     #data,AR2
        STM     #coeff,AR3
        RPTZ    A,3
        MAC     *AR2+,*AR3+,A
        STL     A,*(result)

5 - 39

Lab 5 Optional: Solution

        .bss    x,4
        .bss    y,1

        .data
tbl:    .word   1,2,3,4,0
a:      .word   8,6,4,2

        .text
        STM     #data,AR1
        RPT     #4
        MVPD    tbl,*AR1+
        STM     #data,AR1
        RPTZ    A,3
        MACP    *AR1+,coeff,A
        STL     A,*(result)

Advantages: Faster initialization, Less RAM required


Pipeline Issues

Learning Objectives

6 - 2

Learning Objectives

• Describe the ’C54x pipeline events.

• Implement delayed branching.

• Identify and resolve pipeline conflicts.


Module 6

TMS320C54x DSP Design Workshop

Module 6

Pipeline Issues


6 - 3

Pipeline Operation

P   PREFETCH   PAB loaded with PC contents.

F   FETCH      PB loaded by wrapper manager.

D   DECODE     IR loaded with either PB content or IQ content. IR content is decoded.

A   ACCESS     DAB loaded with data1 read address if required.
               CAB loaded with data2 read address if required.
               Auxiliary register update.

R   READ       DB loaded by wrapper manager with data1 if required.
               CB loaded by wrapper manager with data2 if required.
               EAB loaded with data3 write address if required.

X   EXECUTE    Execution of the instruction and EB loaded with write data.

6 - 4

Pipe Flow

TIME ->

P1 F1 D1 A1 R1 X1
   P2 F2 D2 A2 R2 X2
      P3 F3 D3 A3 R3 X3
         P4 F4 D4 A4 R4 X4
            P5 F5 D5 A5 R5 X5
               P6 F6 D6 A6 R6 X6
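The pipe flow diagram can be regenerated mechanically, which makes the overlap easier to reason about. The following is a minimal sketch in Python, assuming the ideal case the slide shows: one instruction enters per cycle and nothing stalls. The names `pipe_flow` and `stage_cycle` are hypothetical helpers, not part of any TI tool.

```python
STAGES = "PFDARX"  # Prefetch, Fetch, Decode, Access, Read, Execute


def pipe_flow(n_instructions: int) -> list[str]:
    """Return one text row per instruction, shifted one cycle per row.

    Column alignment holds for up to 9 instructions (two-character cells).
    """
    rows = []
    for i in range(1, n_instructions + 1):
        cells = ["  "] * (i - 1) + [f"{stage}{i}" for stage in STAGES]
        rows.append(" ".join(cells))
    return rows


def stage_cycle(instr: int, stage: str) -> int:
    """Cycle (1-based) in which instruction `instr` occupies `stage`."""
    return instr + STAGES.index(stage)


if __name__ == "__main__":
    for row in pipe_flow(6):
        print(row)
    # X1 and P6 land in the same cycle: the pipeline is full from cycle 6 on.
    print(stage_cycle(1, "X"), stage_cycle(6, "P"))
```

Under these assumptions the sixth instruction is prefetched in the same cycle in which the first one executes, so six instructions are in flight at once.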


6 - 5

Standard vs. Delayed Branch: B & BD

[Pipeline diagram, B addr: the branch is decoded (D!), the words fetched behind it are flushed, and the target address (PA) is then prefetched.]

B addr          2 words, 4 cycles

[Pipeline diagram, BD new: the two code words that follow BD (instructions 3 and 4) continue through the pipeline and execute while the target (PN) is prefetched.]

BD new          2 words, 2 cycles, 2 final code words

6 - 6

Delayed Branch Examples

        LD      x,A
        ADD     y,A
        MPY     z,B
        STL     A,r
        B       next

6 words, 8 cycles

Move branch up two words of code.

        LD      x,A
        ADD     y,A
        BD      next
        MPY     z,B
        STL     A,r

6 words, 6 cycles
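The word and cycle totals of the two sequences can be checked with a short tally. This sketch is in Python and rests on stated assumptions: B and BD cost what the preceding B/BD slide gives (2 words, 4 cycles and 2 words, 2 cycles), and each of the other four instructions is taken as 1 word and 1 cycle, which is an assumption chosen to be consistent with the slide totals rather than a figure quoted from an instruction reference. `COST` and `totals` are hypothetical names.

```python
# (words, cycles) per mnemonic. B/BD figures come from the B vs. BD slide;
# the single-word, single-cycle figures are assumptions for this sketch.
COST = {
    "LD": (1, 1),
    "ADD": (1, 1),
    "MPY": (1, 1),
    "STL": (1, 1),
    "B": (2, 4),
    "BD": (2, 2),
}


def totals(sequence: list[str]) -> tuple[int, int]:
    """Sum words and cycles over a list of mnemonics."""
    words = sum(COST[m][0] for m in sequence)
    cycles = sum(COST[m][1] for m in sequence)
    return words, cycles


if __name__ == "__main__":
    print(totals(["LD", "ADD", "MPY", "STL", "B"]))   # standard branch last
    print(totals(["LD", "ADD", "BD", "MPY", "STL"]))  # branch moved up two words
```

The code size is identical in both versions; only the two cycles that the standard branch spends flushing the pipeline are recovered.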


6 - 7

Delayed Operations

        BD      CALLD   BCD
        BACCD   CALAD   CCD
        RETD    RCD
        BANZD   RETED
        RPTBD   RETFD

Delayed branches are effectively two words faster than their non-delayed version.

6 - 8

Delay Slot Caveats

• Delay slot is two words deep - cycles or lines of code are not relevant

• Delay operation may not be a branch of any kind (B, CALL, RET, RPT, etc.)

• Conditions in delay slot will be too late

• Do not load BRC in slot of RPTBD

• No PUSH/POP in CALLD or RETD delay slots


6 - 9

Conditional Execution: XC

• Allows fast choice of running one or two words of code or substitution of NOPs.

• Condition evaluated early, so must be set two instructions prior.

• Avoid change of condition in last two lines prior to XC, as they can be recognized in event of interrupt prior to XC.

        XC      n,cnd,cnd,cnd

Using BC:

        -pre-
        -pre-
        CMPR    GRTR,AR1
        BC      next,TC
        LD      *AR3+,A
next:   ABS     A

3 words, 5/4 cycles

Using XC:

        CMPR    GRTR,AR1
        -other-
        -other-
        XC      1,NTC
        LD      *AR3+,A
        ABS     A

2 words, 2 cycles

6 - 10

Exercise 6-1: Delayed Operations

• How does BD differ from B? How do they differ in code?

• What should not appear in delay slots?

• Why shouldn’t PUSH or POP appear in a CCD slot?

• What should be done if a condition is set in the delay slot of BC?

• Write code using RPTBD to perform:

        y = \sum_{n=1}^{10} x_n a_n

• When would this approach be better than using RPTZ?


6 - 11

Exercise 6-1: Solution

        STM     #7,BRC                  ; loop body runs 8 times
        RPTBD   next-1
        MPY     *AR2+,*AR3+,A           ; delay slot word 1: first product loads A
        MAC     *AR2+,*AR3+,A           ; delay slot word 2: second product
        MAC     *AR2+,*AR3+,A           ; loop body: remaining 8 products
next:   STL     A,y
        STH     A,y+1

6 - 12

Lab 6

1. Modify VECTORS.ASM to employ BD.

2. What code would be most useful in the delay slot?

Optional: If time permits, modify your sum-of-products to be interruptible.


6 - 13

Pipeline Cases

Average C54x System Code

    30%  C Code                                   No Problem      (Case 1)
    70%  Assembly Code
         65%  CALU Operations                     No Problem      (Case 2)
          5%  MMR Writes
               1%  Regular MMR Write              Use Key         (Case 3)
               2%  Early Write                    No Problem      (Case 4)
               2%  Protected MMR Write                            (Case 5)
                   1.9%  Usual Case               No Problem
                   0.1%  Prior Reg MMR Write      Add 1 Cycle     (Case 6)

Analysis:

• 99% of ’C54x code requires no special attention.

• Latency requirements are resolved via a table.

6 - 14

Pipeline Case 1 - C Code

• Compiler does not produce code with latency issues

• User need not debug C code for pipeline-related issues

• C code is ideal for non-critical speed path code.
  - Operating system
  - Diagnostics
  - Etc.

• Allows portability of software to other platforms as required.

• Systems can easily mix C and ASM code.


6 - 15

Pipeline Case 2 - CALU Operations

• No pipeline errors exist between CALU operations.

• Special efforts have been made to avoid errors without slowing down the pipeline.

• Only one rare case exists where a CALU activity results in a slowdown, and it is handled automatically by the ’C54x without data errors (W,W,R||R).

6 - 16

CALU Operations - Analysis

• The ’C54x may need to perform a fetch, two reads, and a write in any given cycle. Depending on the system setup, this event could occur in one cycle or be spread over several cycles. In no case are errors generated. Consider the following environments:
  - More than one external access: multiple cycles
  - Each resource in separate memories: single cycle
  - Note: ’C54x memories are broken into blocks.
  - More than one resource in a single ’C54x memory block - Dual Access RAM:
        Early phase     P and D
        Late phase      C and E


6 - 17

Pipeline Events

Single read instructions:               PA  PD  DA  DD

Dual read instructions:                 PA  PD  DA  CA  DD  CD

Single write instructions:              PA  PD  EA  ED

Dual write instruction (2 cycles):      PA  PD  EA  ED
                                                EA  ED

Read/write instructions:                PA  PD  DA  DD  EA  ED

6 - 18

DARAM Events

Single read instructions:               P   D

Dual read instructions:                 P   D   C

Single write instructions:              P   E

Dual write instruction (2 cycles):      P   E
                                            E

Read/write instructions:                P   E   D


6 - 19

Case Study - Latencies Avoided

WRITE       STL     A,*AR3+             P   E
READ        LD      *AR2+,A             P   D

What if both are to the same address?

WRITE       STL     A,*AR3+             P   E (held off)
---         LD      #0,A                P
DUAL READ   ADD     *AR4,*AR5,A         P   C   D

Early write held off to allow dual access to operate w/o delay.

6 - 20

Case Study - Automatic Latency

WRITE       STL     A,*AR3+             P   E
WRITE       STH     A,*AR3              P   E
DUAL READ   ADD     *AR4,*AR5,A         P   C   D

One cycle latency automatically inserted by decoder


6 - 21

Pipeline Issues for MMR Activity

    5%  MMR Writes
         1%  Regular MMR Op                   Use Key         (Case 3)
         2%  Early Write                      No Problem      (Case 4)
         2%  Protected MMR Write                              (Case 5)
             1.9%  Usual Case                 No Problem
             0.1%  Prior Reg MMR Op           Add 1 Cycle     (Case 6)

6 - 22

MMRs That Affect Pipeline

Name    Description                     Ph

BRC     Block Repeat Counter            P
RSA     Block Repeat Start Address      P
REA     Block Repeat End Address        P
T       Temporary Register              X
A       Acc A - written as MMR          X
B       Acc B - written as MMR          X
ST0     Status Register 0               --
ST1     Status Register 1               --
PMST    Proc. Mode Status Register      --

Name    Description                     Ph

AR0     Auxiliary Register 0            R*
AR1     Auxiliary Register 1            R
AR2     Auxiliary Register 2            R
AR3     Auxiliary Register 3            R
AR4     Auxiliary Register 4            R
AR5     Auxiliary Register 5            R
AR6     Auxiliary Register 6            R
AR7     Auxiliary Register 7            R
SP      Stack Pointer Register          R/A
BK      Circular Size Register          A

* AR’s have been designed specially to operate ‘late’: R instead of A.


6 - 23

Pipeline Control Fields

ST0     DP
ST1     BRAF, CPL, OVM, SXM, C16, FRCT, ASM
PMST    IPTR, MP/MC-, OVLY, DROM

Stage   Fields

X       OVM, SXM, C16, FRCT, ASM
R
A       DP, CPL, DROM
D
F
P       BRAF, MP/MC-, OVLY, IPTR

6 - 24

Pipeline Case 3 : Standard Ops on MMRs

• Standard write operations will create latency issues that must be considered by the programmer!

• Consider the following pipeline diagram to understand how much latency a control field requires:

Instr.  Stage where the effect is ready                Fields

0       P0 F0 D0 A0 R0 X0                              Instr. 0 writes to a control field.
1       (1 word for effect)
2       X2                                             A, B, T, SXM, ASM, OVM, FRCT, C16
3       R3                                             ARn, SP(0)
4       A4                                             SP(1), BK, DP, CPL, DROM
5       D5
6       F6
7       P7                                             OVLY, MP/MC-, IPTR, & BRC, RSA, REA, BRAF

Effect on these stages ready now.

example:

        SSBX    SXM
        NOP
        LD      x,B
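The latency diagram reduces to a small lookup. The sketch below is in Python and is an interpretation of the slide, not a TI tool: it assumes a standard (unprotected) write in the X phase of instruction 0, reads the stage in which each field is consumed from the diagram, and returns the worst-case number of intervening words. `FIELD_STAGE` and `worst_case_latency` are hypothetical names; SP(0) and SP(1) follow the slide's notation.

```python
STAGES = "PFDARX"  # pipeline order: Prefetch .. Execute

# Stage in which each register or field is consumed, as read off the diagram.
FIELD_STAGE = {
    **dict.fromkeys(["A", "B", "T", "SXM", "ASM", "OVM", "FRCT", "C16"], "X"),
    **dict.fromkeys(["ARn", "SP(0)"], "R"),
    **dict.fromkeys(["SP(1)", "BK", "DP", "CPL", "DROM"], "A"),
    **dict.fromkeys(["OVLY", "MP/MC-", "IPTR", "BRC", "RSA", "REA", "BRAF"], "P"),
}


def worst_case_latency(field: str) -> int:
    """Words needed between a standard write and the first safe use.

    The diagram shows X2, R3, A4, ..., P7 as the first safe stages, i.e. an
    X-stage field needs 1 intervening word and each earlier stage needs
    one more.
    """
    return len(STAGES) - STAGES.index(FIELD_STAGE[field])


if __name__ == "__main__":
    for name in ("SXM", "ARn", "DP", "BRC"):
        print(name, worst_case_latency(name))
```

For SXM this gives one word, which matches the single NOP in the slide's example; protected instructions can need fewer words than this worst case.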


6 - 25

Calculating Minimum Number of Protected Cycles

• Latency diagram shows worst case number of NOPs to insert between store to control field and effect to be valid.

• NOPs need not be used; any other non-involved code may intervene.

• Extra cycle from double word dependent instructions may be counted, reducing the number of other intervening cycles required; e.g.:

EXPLICIT PROTECTED CYCLE

        SSBX    SXM
        NOP
        LD      x,B

IMPLICIT PROTECTED CYCLE

        SSBX    SXM
        LD      *(x),B

6 - 26

Pipeline Case 4: Early Write

• Many MMRs and bit fields are set up during initialization and don’t get changed during runtime, e.g.:

begin:  …
        SSBX    SXM
        …
        …
        CALL    MAIN

main:   …
        LD      x,A

  Any pipeline read more than 6 words removed from the write is immune.

• These cases are very common and do not present any pipeline concerns to the user.


6 - 27

Pipeline Case 5 : Protected Instructions

• Given the pipeline latency issue, it would be helpful to have optimized instructions that operate early
  - Allow faster code
  - Easier to write

• Therefore, these instructions offer a one-cycle early execution on MMR writes:

        STM     #K,MMR          ST      #K,MMR
        POPD    Smem            POPM    MMR
        MVDK    Smem,dmad       MVMD    MMR,Smem
        FRAME   n

• Initialization of AR’s with no explicit latency:

        ST, STM, MVMM, MVDK, MVMD
        LD      #k9,DP
        LD      #k5,ASM

• Modify allows early increment, so no latency issues arise: MAR *ARn+

6 - 28

Pipeline Case 6: Protected Instruction Exception

Problem: Protected instructions attempting to write early (in the R phase) can be blocked if a prior standard instruction writes to an addressing register in the X phase:

        STLM    A,AR0
        MVMM    AR2,AR1
        LD      *AR1,B

        (Diagram markers: Ex, E, A)

Solution: Add one protected cycle before or after STM:

        STLM    A,AR0
        MVMM    AR2,AR1
        nop
        LD      *AR1,B

Note: Problem can be extended through a chain of special instructions:

        STLM    A,AR0
        MVMM    AR7,AR6
        MVMM    AR2,AR1
        nop
        LD      *AR1,B


6 - 29

Execute Unit Latencies

Control Field           Latency 0                       Latency 1

SXM, C16, FRCT, OVM                                     All stores, incl: SSXM, RSXM

T                       STM; MVDK; LD x,T; ST || LD     All other stores, incl: EXP

ASM                     LD #k5,ASM; LD Smem,ASM         All other stores

A or B                  All except:                     1. mod acc  2. read mmr

6 - 30

Access Unit Latencies

Control Field       Latency 0           Latency 1           Latency 2           Latency 3

AR,                 STM; ST             MVKD; MVDM          All other
SP (CPL=0)          MVDK; MVMD          MVPD; MVDD          stores
                    MVMM                POPM SP
                                        POPD SP

BK,                                     STM; ST             MVKD; MVDM          All other
SP (CPL=1)                              MVDK; MVMD          MVPD; MVDD          stores
                                        MVMM; FRAME         POPM SP
                                        PUSH; POP           POPD SP
                                        RETFD

DP (CPL=0)          LD #K,DP                                STM; ST             All other
                    LD Smem,DP                              MVDK; MVMD          stores

CPL                                                         STM; ST             All other
                                                            MVDK; MVMD          stores incl.
                                                                                SSBX RSBX


6 - 31

Other Latencies

Ctl Field       Latency 2       Latency 3       Latency 4       Latency 5       Latency 6

DROM            STM; ST         All other
                MVDK            stores
                MVMD

OVLY                                                            STM; ST         All other
IPTR                                                            MVDK            stores
MP/MC-                                                          MVMD

BRC *                                           SRCCD           STM; ST         All stores
                                                pre-loop        MVDK            pre-loop
                                                                MVMD

BRAF **                                                                         All stores
                                                                                pre-loop

*  Note: Writing to BRC before RPTB has zero latency for ST, STM, MVDK, MVMD, and one latency for all other stores
** Avoid modifying BRAF in line prior to RPTB[D]

6 - 32

Latency Caveats

• No latency for CALU operations

• Use protected MMR writes whenever possible

• Set status early

• Use latency diagram when writing to MMRs

• For debug: focus on unprotected MMR writes

• Reference Guide has chapter on pipeline use


6 - 33

Exercise 6-2a

1. Determine if latency condition exists.
2. Note why.
3. Add appropriate number of NOPs to correct.

        LD      GAIN,T
        STM     #input,AR1
        MPY     *AR1+,A

        STLM    B,AR2
        STM     #input,AR3
        MPY     *AR2+,*AR3+,A

        MPY     *AR1,A
        POPM    AR0
        MVKD    #table,*AR0

        ADD     y,A
        LD      #table,DP
        ADD     table,A,B

6 - 34

Exercise 6-2b

        MAC     x,B
        STLM    B,ST0
        ADD     table,A,B

        STM     #pointer,AR4
        STM     #stack,SP
        LD      VAR1,A

        STL     B,coeff
        STLM    A,SP
        POPM    AR0

        LD      *AR2,A
        SSBX    SXM
        LD      data,B


6 - 36

Exercise 6-2a - Solution

        LD      GAIN,T
        STM     #input,AR1
        MPY     *AR1+,A             ; 0 latency: STM

        STLM    B,AR2
        STM     #input,AR3
        nop
        MPY     *AR2+,*AR3+,A       ; 1 latency: STM exception

        MPY     *AR1,A
        POPM    AR0
        nop
        MVKD    #table,*AR0         ; 1 latency: pop ARn

        ADD     y,A
        LD      #table,DP
        ADD     table,A,B           ; 0 latency: LD DP


6 - 37

Exercise 6-2b: Solution

        MAC     x,B
        STLM    B,ST0
        nop
        nop
        nop
        ADD     table,A,B           ; 3 latencies: DP

        STM     #pointer,AR4
        STM     #stack,SP
        LD      VAR1,A              ; 0 latency: STM

        STL     B,coeff
        STLM    A,SP
        nop
        nop
        POPM    AR0                 ; 2 latencies: SP(0)

        LD      *AR2,A
        SSBX    SXM
        nop
        LD      data,B              ; 1 latency: SXM

6 - 38

VECTOR6.ASM : Solution

        .ref    start
LEN     .set    100
STACK   .usect  "STK",LEN
        .sect   ".vectors"
        BD      start
        STM     #STACK+LEN,SP


Numerical Issues

Learning Objectives

7 - 2

Learning Objectives

• Identify & resolve issues for:
  - Multiplication
  - Addition / Subtraction
  - Division
• Select the appropriate numerical models:
  - Integer vs. Fraction
  - Signed vs. Unsigned math
  - Rounding vs. Truncation
  - Overflow vs. Carry
  - Fixed point vs. Floating point
• List mnemonics to perform:
  - Extended precision math
  - Boolean Operations


7 - 3

Integer Multiplication

• Integer multiplication yields products larger than the inputs, as can be seen in the example below, using single-digit decimal values as inputs:

          9     value
        x 9     times value
        ---
         81     yields double-size result

• Does the user store the lower (1) or upper (8) result?
• Both must be kept, resulting in additional resources (two cycles, words of code, and RAM locations) to complete the store.
• Worse, how can the double-sized result be used recursively as an input in later calculations, given that the multiplier inputs are single-width?

7 - 4

Fractional Multiplication

• Multiplication of fractions yields products that never exceed the range of a fraction, as can be seen in the example below, using single-digit decimal fractions as inputs:

          .9    value
        x .9    times value
        ----
         .81    yields double-size result

• Don't we still have a double-sized result to store?
• In this case, we can store just the upper result (.8)
• This allows storage of result with fewer resources
• Results may be used recursively
• Has accuracy been lost by dropping the lower accumulator value?


7 - 5

Accuracy vs. Precision

• Often the programmer wants to retain the fullest accuracy of a calculation, thus dropping the 16 LSBs of the result in the previous example seems a bad choice.
• Note, though, the inputs: how much accuracy do they offer?
• The product offers double precision, but its accuracy is based on the single-width inputs.
• Thus, storing a single-precision result is not only an efficient solution, but represents the limit of the accuracy of the result.
• The accumulator is double-sized for two reasons:
  - To allow for integer operations, which would possibly require the LSBs for the result.
  - So that sum-of-product operations will generate accumulative noise at the 32nd vs. the 16th bit.


7 - 7

Two's Complement Fractions

• How can fractions be represented in binary?
• Since fractions have a range of +1 to -1, we will have to create a system capable of representing this range.
• Since negative numbers are involved, a two's complement system is required. Two's complement numbers follow these rules:
  1. The bits are a binary weighted progression
  2. The MSB, and only the MSB, is of negative sign
  3. Complement equals invert plus one
  4. Small values written to large registers require sign extension
• Given items 1 and 2 above, we can create the following fractional model:

        -1    1/2    1/4    1/8    ...
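As a rough illustration of the weights above, the sketch below evaluates a two's complement bit pattern as a fraction. The function name `frac_value` and the 4-bit test patterns are assumptions made for this example; they are not part of the workshop material.

```python
def frac_value(bits: str) -> float:
    """Interpret a string of '0'/'1' as a two's complement fraction.

    The MSB carries weight -1; the remaining bits carry 1/2, 1/4, 1/8, ...
    """
    value = -1.0 * int(bits[0])
    weight = 0.5
    for b in bits[1:]:
        value += weight * int(b)
        weight /= 2
    return value


if __name__ == "__main__":
    for pattern in ("0100", "1101", "0111", "1000"):
        print(pattern, frac_value(pattern))
```

The same function works for any word length, so a 16-bit pattern is read the same way, only with more weights.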

7 - 8

Fractional Example

• The following example demonstrates how two's complement fractions perform under multiplication.
• The 4/8-bit model shown here behaves identically to the 16/32-bit TMS320 device.

              0100
            x 1101
            ------
              0100
             0000
            0100
           1100
           -------
           1110100

• What values do the inputs represent?
• What is the result?

        Accumulator     1111 0100

• What should be stored to memory?

        Data Memory     1110

• What are the Q-types of the input, accumulator, and output values?
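The 4/8-bit example can be checked with a short model. This is only a sketch: the helper names are invented for illustration, and it assumes the product is left-shifted by one bit before the upper half is stored, which is what produces the 1110 shown for data memory.

```python
def to_signed(value: int, bits: int) -> int:
    """Reinterpret the low `bits` bits of value as a two's complement integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value & (1 << (bits - 1)) else value


def fractional_multiply_4bit(x: int, y: int):
    """Multiply two 4-bit Q3 patterns; return (accumulator, stored) bit patterns."""
    product = to_signed(x, 4) * to_signed(y, 4)   # Q6 result, fits in 8 bits
    acc = product & 0xFF                          # 8-bit accumulator pattern
    shifted = (product << 1) & 0xFF               # assumed left shift by one (Q7)
    stored = shifted >> 4                         # keep the upper 4 bits (Q3)
    return acc, stored


if __name__ == "__main__":
    acc, stored = fractional_multiply_4bit(0b0100, 0b1101)
    print(format(acc, "08b"), format(stored, "04b"))
```

Here 0100 is 0.5 and 1101 is -0.375; the exact product is -0.1875, and the stored 4-bit value 1110 is -0.25 because the low bits are dropped.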


7 - 9

Redundant Sign Bit

• Multiplication of two signed numbers yields product with two sign bits:

              S x x x               Q3
            * S y y y               Q3
            S S z z z z z z         Q6

  or, with FRCT=1:

            S z z z z z z 0         Q7

• Extra sign bit causes problems if stored to memory as result:
  - Wastes space
  - Creates off-size Q
• Solution: Fractional mode bit!
• When FRCT (mode bit in ST1) is set, the multiplier output is left-shifted by one:

        SSBX    FRCT
        ...
        MPY     *AR2,*AR3,A
        STH     A,*(z)

• For 16-bit 'C54x:  Q15*Q15=Q15
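The effect of the fractional mode bit on a Q15 multiply can be sketched as follows. The function names (`q15`, `mpy_q15`, `store_high`) are hypothetical, and the model ignores accumulator width limits and saturation; it only shows where the extra sign bit goes.

```python
def q15(x: float) -> int:
    """Convert a fraction in [-1, 1) to a 16-bit Q15 integer (truncating)."""
    return int(x * 32768)


def mpy_q15(a: int, b: int, frct: bool) -> int:
    """Model a 16x16 signed multiply and return the product as a signed integer."""
    product = a * b          # Q30: carries two sign bits
    if frct:
        product <<= 1        # Q31: redundant sign bit shifted out
    return product


def store_high(acc: int) -> int:
    """Model a store of the high accumulator word (bits 31..16)."""
    return acc >> 16


if __name__ == "__main__":
    a, b = q15(0.5), q15(-0.25)
    print(store_high(mpy_q15(a, b, frct=True)) / 32768)    # Q15 reading
    print(store_high(mpy_q15(a, b, frct=False)) / 32768)   # off by a factor of two
```

With the shift applied, the stored word can be read directly as Q15; without it, the stored word is an off-size value that is half as large when read as Q15.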

7 - 10

Exercise 7a : Multiplier Issues

1. Does the C54x support integer operations?
2. What is the optimal numerical type? Why?
3. How is the extra sign bit in fractional multiply handled?


7 - 11

Accumulation

• With fractions, we were able to guarantee that no multiplicative overflow could occur, i.e.: F*F<=F.
• For addition, this rule does not apply, i.e.: F+F can exceed F.
• Therefore, we need additional measures to manage the possibility of overflow for accumulation. Two general methods apply:
  - Guard Bits: the 'C54x offers an 8-bit extension above the high accumulator to allow valid representation of the result of up to 256 summations.
  - Non-gain Systems: offer additional criteria that allow a simple solution for unlimited-length summations.

7 - 12

Guard Bits

• Guard Bits: the 'C54x offers an 8-bit extension above the high accumulator to allow valid representation of the result of up to 256 summations.

        |  AG  |    AH    |    AL    |
        |  BG  |    BH    |    BL    |
        39     31         15         0

• At the conclusion of the summation, what should be done?
  - Store all accumulator components?
  - Store only high accumulator?
  - What should be done about guard values?


7 - 13

Saturation (SAT)

• SAT instruction saturates value exceeding 32-bit range in the selected accumulator:

        SAT A    -or-    SAT B

• Provides single-cycle 'clipping' function:

  [Figure: waveform before saturating and after saturating; vertical axis marked 256, 1, 0, -1, -256]

  - Values not overflowed are unchanged
  - Positive overflows are set to: 00 7FFF FFFF h
  - Negative overflows are set to: FF 8000 0000 h
• Is automatic on store if SST=1 (LP devices)
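The clipping rule can be sketched in a few lines. This is a model of the arithmetic only, with made-up helper names (`sat`, `as_40bit_hex`); the accumulator is represented as an ordinary signed Python integer rather than a 40-bit register.

```python
MAX_32 = 0x7FFFFFFF
MIN_32 = -0x80000000


def sat(acc: int) -> int:
    """Clip a signed accumulator value to the signed 32-bit range."""
    return max(MIN_32, min(MAX_32, acc))


def as_40bit_hex(acc: int) -> str:
    """Format a signed value as a 40-bit two's complement pattern 'GG HHHH LLLL'."""
    raw = acc & 0xFFFFFFFFFF
    return f"{raw >> 32:02X} {(raw >> 16) & 0xFFFF:04X} {raw & 0xFFFF:04X}"


if __name__ == "__main__":
    for value in (1234, 0x123456789, -0x123456789):
        print(as_40bit_hex(value), "->", as_40bit_hex(sat(value)))
```

Values that already fit in 32 bits pass through unchanged; anything that has spilled into the guard bits is replaced by the nearest 32-bit limit.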

7 - 14

Overflow Bits (OVA, OVB, OVM)

• Overflow (OV) is used to record if the range of AccHi is ever exceeded.
  - OV is a latched event: once set it remains set
  - OV is cleared only by:
      - Test of OV:  BC oflow,BOV
      - Write to OV: RSBX OVA  or  STM #val,ST0
      - System reset
  - OV is largely obsolete, given the presence of the Acc Guard.
• Overflow Mode (OVM) causes accumulator to saturate at 32nd bit:
  - Acc = 0x00 7FFF FFFF is the positive limit
  - Acc = 0xFF 8000 0000 is the negative limit
• Setting OVM (SSBX OVM) causes guard bits to be unused.
• Setting OVM makes accumulator values non-linear even if subsequent terms would have corrected for intermediate overflows.
• Overflow mode is generally undesirable, and should usually be turned off (RSBX OVM)


7 - 15

Non-gain Systems

• Many systems can be modeled to have no DC gain:
  - Filters with low Q.
  - Any system scaled by its maximum gain value.
• Input values from A/D converters are automatically fractions, if the limits of the A/D are presumed to be +/- 1.
• Coefficient values can similarly be bounded by making the largest value the scaling factor for all other values.
• For these systems, it is known that the final value of the process is less than or equal to the input values.
• The accumulator therefore can be allowed to temporarily overflow, since the final result is known to be bounded by +/- 1.
• Allows maximum usage of selected A/D and D/A converters
  - D/A bits for gain are more expensive than using analog components

7 - 16

Number Circle

[Figure: number line from -1 (8000h) through 0 (0000) to ~1 (7FFFh), and the same range drawn as a circle: 0000 with neighbours 0001h and FFFFh, +½ = 4000h, ~1 = 7FFFh, -1 = 8000h, -½ = C000h. The circle is marked with C (carry) and OV (overflow).]

          7FF0h
        + 100h  = 80F0h     Overflowed intermediate result
        +  10h  = 8100h     Overflowed intermediate result
        - 200h  = 7F00h     Valid final result
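The running sum on the number circle can be reproduced with a 16-bit wraparound model. The sketch below is an illustration under that 16-bit assumption (the real accumulators are wider), and the function names are invented for the example. It also shows what happens to the same sequence if every step is saturated instead.

```python
def to_signed16(value: int) -> int:
    """Reinterpret the low 16 bits of value as a signed number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def running_sum(start: int, terms):
    """Add terms with wraparound, returning each intermediate 16-bit pattern."""
    acc = start & 0xFFFF
    trace = []
    for t in terms:
        acc = (acc + t) & 0xFFFF
        trace.append(acc)
    return trace


def saturating_sum(start: int, terms):
    """Add terms but clip to the signed 16-bit range after every step."""
    acc = to_signed16(start)
    trace = []
    for t in terms:
        acc = max(-0x8000, min(0x7FFF, acc + t))
        trace.append(acc & 0xFFFF)
    return trace


if __name__ == "__main__":
    terms = [0x100, 0x10, -0x200]
    print([f"{v:04X}h" for v in running_sum(0x7FF0, terms)])
    print([f"{v:04X}h" for v in saturating_sum(0x7FF0, terms)])
```

The wraparound version passes through two overflowed values and still lands on 7F00h, while the saturating version ends at 7DFFh, which is the non-linearity attributed to overflow mode.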


7 - 17

Fractional Representation

        Fractions    ⇒ * 32768 ⇒    Integers    Hex
          ~1                          32K        7FFFh
           ½                          16K        4000h
           0                           0         0000
          -½                         -16K        C000h
          -1                         -32K        8000h

To store 0.707 type:

        .word   32768*707/1000
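For reference, the constant expression can be evaluated by hand or with the sketch below. It assumes the assembler evaluates the expression with truncating integer arithmetic; the function name `q15_constant` is hypothetical.

```python
def q15_constant(numerator: int, denominator: int) -> int:
    """Evaluate 32768*numerator/denominator with truncating integer arithmetic."""
    return 32768 * numerator // denominator


if __name__ == "__main__":
    word = q15_constant(707, 1000)
    print(word, f"{word:04X}h", word / 32768)
```

Writing the fraction as 707/1000 keeps the whole expression in integers, which is why the slide multiplies by 32768 first and divides last.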

7 - 18

Handling Amplifier Functions

• Gain is best handled external to the 'C54x
• Allows DSP to perform frequency shaping functions
  - higher precision than analog
  - lower cost
  - more stable
  - readily supports adaptive systems
• Analog system can perform gain functions
  - Op amps and resistors are very low cost
  - DC gain can easily be made very accurate
  - Adaptive DC gain in analog
      - Not as easy, but reasonable
      - May be controlled by 'C54x


7 - 19

Exercise 7b : Accumulation Issues

1. How wide are the accumulators?
2. What are guard bits for?
3. What is the easiest way to avoid accumulative overflow?
4. When is saturation useful?
5. What benefit do OVA, OVB, and OVM serve?

7 - 20

LAB 7 : Fractional Math

Program Memory (ROM)

        .vectors
                B       start

        .text
        start:  ...
                MVPD    tbl,*...
                ...
                MAC     *,*...

        .data
        tbl:    .word   1 , 2
                .word   3 , 4
                .word   8 , 6
                .word   4 , 2

Data Memory (RAM)

        .bss
        x   ___ ___ ___ ___
        a   ___ ___ ___ ___
        y   ___

On the slide each table value carries a leading "0.", giving the fractional table 0.1, 0.2, 0.3, 0.4, 0.8, 0.6, 0.4, 0.2.


7 - 21

LAB 7 : Procedure

1. Copy LAB5.ASM to LAB7.ASM. Modify LAB7 to:
   a. Use the fractional data table shown above
   b. Perform fractional multiplication
   What status bits will be important for this routine to perform correctly?
2. Copy LAB5.CMD to LAB7.CMD. Modify LAB7.CMD to input LAB7.OBJ and create LAB7.OUT and LAB7.MAP.
3. Assemble, link, and simulate the program. Debug and verify performance. What answer did you get?
4. To better view the result on the simulator, try:

        WA *(y)/327,y = 0 .,d

5. Optional: Time permitting, repeat your experiment using some negative array values. Was your result as expected?

7 - 22

Division

• The 'C54x does not have a single-cycle 16-bit divide instruction
  - Divide is a rare function in DSP
  - Division hardware is expensive
• The 'C54x does have a single-cycle 1-bit divide instruction: conditional subtract or SUBC
  - Preceded by RPT #15, a 16-bit divide is performed
  - Is much faster than without SUBC
• The SUBC process operates only on unsigned operands, thus software must:
  - Compare the signs of the input operands
      - If they are alike, plan a positive quotient
      - If they differ, plan to negate (NEG) the quotient
  - Strip the signs of the inputs
  - Perform the unsigned division
  - Attach the proper sign based on the comparison of the inputs


7 - 23

Division Routine

        LD      @den,16,A
        MPYA    @num            ; B = num*den (tells sign)
        ABS     A               ; Strip sign of denominator
        STH     A,@den
        LD      @num,A
        ABS     A               ; Strip sign of numerator
        RPT     #15             ; 16 iterations
        SUBC    @den,A          ; 1-bit divide
        XC      1,BLT           ; If result needs to be negative
        NEG     A               ; Invert sign
        STL     A,@quot         ; Store quotient
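The routine above can be followed step by step with a small model. This sketch assumes the usual conditional-subtract behaviour (subtract the divisor aligned at bit 15; on success shift left and insert a 1, otherwise just shift left), a non-zero divisor, and magnitudes that fit in 16 bits. The function names are made up for the example.

```python
def subc(acc: int, divisor: int) -> int:
    """One conditional-subtract step on a non-negative accumulator value."""
    trial = acc - (divisor << 15)
    if trial >= 0:
        return (trial << 1) + 1     # subtraction succeeded: shift in a 1
    return acc << 1                 # subtraction failed: shift in a 0


def divide(num: int, den: int):
    """Signed divide; returns (quotient, remainder of the magnitudes)."""
    negative = (num * den) < 0              # sign of the product gives the quotient sign
    acc, divisor = abs(num), abs(den)       # strip the signs
    for _ in range(16):                     # RPT #15 repeats the next instruction 16 times
        acc = subc(acc, divisor)
    quotient = acc & 0xFFFF                 # low half holds the quotient
    remainder = acc >> 16                   # high half holds the remainder
    return (-quotient if negative else quotient), remainder


if __name__ == "__main__":
    print(divide(100, 7))
    print(divide(-100, 7))
```

After the sixteen steps the quotient sits in the low half of the accumulator, which is why the routine stores the low word.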

7 - 24

Learning Objectives

• Identify & resolve issues for:
  - Multiplication
  - Addition / Subtraction
  - Division
• Select the appropriate numerical models:
  - Integer vs. Fraction
  - Signed vs. Unsigned math
  - Rounding vs. Truncation
  - Overflow vs. Carry
  - Fixed point vs. Floating point
• List mnemonics to perform:
  - Extended precision math
  - Boolean Operations


7 - 25

Rounding

• Result of multiplication can be rounded for MPY, MAC and MAS operations. This is specified by appending the instruction with an "R" suffix.
  - Example: MAC with rounding is MACR.
  - Rounding consists of adding 2^15 to the result and then clearing the low accumulator.
• In a long sum-of-products, only the last MAC operation should specify rounding:

        RPTZ    A,#98
        MAC     *AR2+,*AR3+,A
        MACR    *AR2+,*AR3+,A

• Rounding can also be achieved with a load operation: LDR Smem,dst
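The rounding rule stated above (add 2^15, then clear the low accumulator) is easy to compare with plain truncation. The helper names below are assumptions for this sketch, and the accumulator is modelled as an unbounded Python integer.

```python
def round_acc(acc: int) -> int:
    """Add 2**15, then clear the low 16 bits."""
    return (acc + (1 << 15)) & ~0xFFFF


def truncate_acc(acc: int) -> int:
    """Clear the low 16 bits without rounding."""
    return acc & ~0xFFFF


if __name__ == "__main__":
    for acc in (0x12347FFF, 0x12348000):
        print(f"{acc:08X} -> round {round_acc(acc):08X}, truncate {truncate_acc(acc):08X}")
```

Only values whose low word is 8000h or above are bumped up to the next high-word value; truncation always rounds toward the lower one.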

7 - 26

Sign Extension (SXM)

Example: LD #0F794h,8,A

                        C    G     ACC
        SXM=1   Before  X    00    EF13 648C
                After   X    FF    FFF7 9400

        SXM=0   Before  X    00    EF13 648C
                After   X    00    00F7 9400
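The two results in the table can be reproduced with a small model of a shifted immediate load. The function names are hypothetical; the sketch only assumes that SXM selects whether the 16-bit constant is sign-extended before the shift.

```python
def load_immediate(k16: int, shift: int, sxm: bool) -> int:
    """Model a shifted 16-bit immediate load; return the 40-bit accumulator pattern."""
    value = k16 & 0xFFFF
    if sxm and value & 0x8000:
        value -= 0x10000              # sign-extend the 16-bit constant
    return (value << shift) & 0xFFFFFFFFFF


def fmt40(raw: int) -> str:
    """Format a 40-bit pattern as 'GG HHHH LLLL'."""
    return f"{raw >> 32:02X} {(raw >> 16) & 0xFFFF:04X} {raw & 0xFFFF:04X}"


if __name__ == "__main__":
    print("SXM=1:", fmt40(load_immediate(0xF794, 8, sxm=True)))
    print("SXM=0:", fmt40(load_immediate(0xF794, 8, sxm=False)))
```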


7 - 27

Carry Bit (C)

• Carry is:
  - Used with unsigned numbers to indicate an over/under flow condition
  - Set or cleared with each calculation - it is not latched
  - Optimal for extending 32-bit accumulators to larger-word-size calculations
• Example - 64-bit addition:

  [Figure: four 16-bit words XD XC XB XA added through accumulator B, with the carry bit C passed between the two 32-bit halves]

7 - 28

Special Load, Add, Subtract

Carry & Borrow

        ADDC    Smem,src        src = src + Smem + C
        SUBB    Smem,src        src = src - Smem - (NOT C)

Sign-suppressed Math

        ADDS    Smem,src        src = src + u(Smem)
        SUBS    Smem,src        src = src - u(Smem)

Load unsigned

        LDU     Smem,dst        dst = u(Smem)


7 - 29

64-bit Add & Subtract Code

Example: z64 = w64 + x64 - y64

        w3 w2 w1 w0
        x3 x2 x1 x0
        y3 y2 y1 y0
        z3 z2 z1 z0

        DLD     @w1,A           ; A = w1+w0
        DADD    @x1,A           ; A += x1+x0
        DLD     @w3,B           ; B = w3+w2
        ADDC    @x2,B           ; B += x2+C
        ADD     @x3,16,B        ; B += x3
        DSUB    @y1,A           ; A -= y1+y0
        DST     A,@z1           ; z1 = w1+w0+x1+x0-y1-y0
        SUBB    @y2,B           ; B -= y2+C'
        SUB     @y3,16,B        ; B -= y3
        DST     B,@z3           ; z3 = w3+w2+x3+x2+C-y3-y2-C'
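The carry and borrow chaining used in the routine can be sketched with 32-bit halves. This is a model of the arithmetic only, with invented function names; operands are treated as non-negative 64-bit patterns and the result is taken modulo 2^64.

```python
MASK32 = 0xFFFFFFFF


def add32(a: int, b: int, carry_in: int = 0):
    """32-bit add; returns (sum, carry_out)."""
    total = a + b + carry_in
    return total & MASK32, total >> 32


def sub32(a: int, b: int, borrow_in: int = 0):
    """32-bit subtract; returns (difference, borrow_out)."""
    total = a - b - borrow_in
    return total & MASK32, 1 if total < 0 else 0


def add_sub_64(w: int, x: int, y: int) -> int:
    """Compute (w + x - y) mod 2**64 using two 32-bit halves."""
    lo, carry = add32(w & MASK32, x & MASK32)      # low halves first
    hi, _ = add32(w >> 32, x >> 32, carry)         # high halves plus carry
    lo, borrow = sub32(lo, y & MASK32)             # low halves first
    hi, _ = sub32(hi, y >> 32, borrow)             # high halves minus borrow
    return (hi << 32) | lo


if __name__ == "__main__":
    print(hex(add_sub_64(0x00000001FFFFFFFF, 1, 0)))
    print(hex(add_sub_64(0x0000000100000000, 0, 1)))
```

In both halves the low words are combined first so that the carry or borrow is available when the high words are processed.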

7 - 30

Long Multiplication

                        X1   X0         S  U
                    x   Y1   Y0         S  U
                    -----------
                        X0 * Y0         U * U
                   Y1 * X0              S * U
                   X1 * Y0              S * U
              Y1 * X1                   S * S
              -----------------
              W3   W2   W1   W0         S  U  U  U

        MACSU   Xmem,Ymem,src   src = src + u(Xmem)*Ymem
        MPYU    Smem,dst        dst = u(TREG)*u(Smem)


7 - 31

Long Multiply Routine

        STM     #X0,AR2
        STM     #Y0,AR3
        LD      *AR2,T              ; T = x0
        MPYU    *AR3+,A             ; A = ux0*uy0
        STL     A,@W0               ; w0 = ux0*uy0
        LD      A,-16,A             ; A = A>>16
        MACSU   *AR2+,*AR3-,A       ; A += ux0*y1
        MACSU   *AR3+,*AR2,A        ; A += uy0*x1
        STL     A,@W1               ; w1 = A
        LD      A,-16,A             ; A = A>>16
        MAC     *AR2,*AR3,A         ; A += x1*y1
        STL     A,@W2               ; w2 = A-lo
        STH     A,@W3               ; w3 = A-hi
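The partial-product scheme can be checked against an ordinary 64-bit multiply. The sketch below mirrors the order of the routine (low product, two cross products, high product) but is only a model: the names are invented, and a Python integer stands in for the accumulator.

```python
def to_signed16(v: int) -> int:
    """Reinterpret the low 16 bits of v as a signed number."""
    v &= 0xFFFF
    return v - 0x10000 if v & 0x8000 else v


def long_multiply(x: int, y: int):
    """Multiply two signed 32-bit integers via four 16x16 products.

    Returns the 64-bit product as 16-bit words (w3, w2, w1, w0).
    """
    x0, y0 = x & 0xFFFF, y & 0xFFFF                           # low halves: unsigned
    x1, y1 = to_signed16(x >> 16), to_signed16(y >> 16)       # high halves: signed

    acc = x0 * y0               # U * U
    w0 = acc & 0xFFFF
    acc >>= 16
    acc += x0 * y1              # U * S
    acc += y0 * x1              # U * S
    w1 = acc & 0xFFFF
    acc >>= 16                  # arithmetic shift keeps the sign
    acc += x1 * y1              # S * S
    w2 = acc & 0xFFFF
    w3 = (acc >> 16) & 0xFFFF
    return w3, w2, w1, w0


if __name__ == "__main__":
    print([f"{w:04X}" for w in long_multiply(0x12345678, -0x0ABCDEF1)])
```

Only the high halves carry the sign, which is why the low halves are treated as unsigned in three of the four products.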

7 - 32

Exponent Encoder

• One-cycle exponent ([-8, +31] range) computation
• Result in T register as 2's complement value

  [Figure: accumulators A and B feed the ALU and the EXPONENT ENCODER; the encoder's 6-bit result goes to T. Range scale marked -8, 0, 16, 31]

        exp     A       ; 1 cycle for exp
        norm    A       ; 1 cycle normalize
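One way to read the [-8, +31] range is as the number of redundant sign bits in the 40-bit accumulator minus the 8 guard bits. The sketch below models that reading; it is an assumption for illustration, including the choice to return 0 for a zero accumulator, and `exp40` is a made-up name.

```python
def exp40(acc: int) -> int:
    """Return the assumed exponent result for a 40-bit two's complement pattern."""
    acc &= 0xFFFFFFFFFF
    if acc == 0:
        return 0                          # assumed special case for zero
    sign = acc >> 39
    leading = 0
    for bit in range(39, -1, -1):         # count leading bits equal to the sign bit
        if (acc >> bit) & 1 != sign:
            break
        leading += 1
    return (leading - 1) - 8              # redundant sign bits minus the 8 guard bits


if __name__ == "__main__":
    for value in (0x7FFFFFFFFF, 0x0040000000, 0x0000010000, 0x0000000001, 0xFFFFFFFFFF):
        print(f"{value:010X} -> {exp40(value)}")
```

A result of 0 means the value is already normalized in the 32-bit accumulator; negative results mean the value has grown into the guard bits.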


7 - 33

Floating Point Usage

Full Floating Point

        e1  m1
        e2  m2
        e3  m3

        LD      e1,T
        LD      m1,T,A
        LD      e2,T
        ADD     m2,T,A
        LD      e3,T
        ADD     m3,T,A

        2*N RAM & Cycles

Block Floating Point

        e   m1
            m2
            m3

        LD      e,T
        LD      m1,T,A
        ADD     m2,T,A
        ADD     m3,T,A
        ...

        N+1 RAM & Cycles

7 - 34

Exercise 7c : Numerical Issues

1. How is division performed on the 54x?
2. How is rounding performed on the 54x?
3. How are fractions represented in the assembler?
4. What benefit does the carry bit offer? What instructions employ/affect the carry bit?
5. When are unsigned operations useful?
6. Does the 54x offer any form of floating point operation?


7 - 35

Learning Objectives

• Identify & resolve issues for:
  - Multiplication
  - Addition / Subtraction
  - Division
• Select the appropriate numerical models:
  - Integer vs. Fraction
  - Signed vs. Unsigned math
  - Rounding vs. Truncation
  - Overflow vs. Carry
  - Fixed point vs. Floating point
• List mnemonics to perform:
  - Extended precision math
  - Boolean Operations

7 - 36

Bitfield Test & Bit Extraction

        CMPM  Smem,#K           TC = 1 if Smem = K
        BITF  Smem,#K           TC = 0 if Smem & K = 0
        BIT   Xmem,bit          TC = Xmem(15-bit)
        BITT  Smem              TC = Smem(15-T(3-0))

[Diagram: a 16-bit memory word, bits 15 down to 0; the selected bit n is copied into TC.]

        BIT   *AR2,5
        BC    true,TC

        LD    @bit,T
        BITT  @x
        BC    false,NTC
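The bit-code convention is easy to get backwards, so here is a small Python sketch of the three tests. It follows only the formulas on the slide (TC = Xmem(15-bit), TC = 0 if Smem & K = 0, TC = 1 if Smem = K); the function names are illustrative.

```python
def bit(word, bit_code):
    """Model of BIT: the operand is a bit code, so bit (15 - bit_code) is tested."""
    return (word >> (15 - bit_code)) & 1


def bitf(word, mask):
    """Model of BITF: TC = 0 only if every masked bit is zero."""
    return 1 if (word & mask & 0xFFFF) != 0 else 0


def cmpm(word, k):
    """Model of CMPM: TC = 1 if the 16-bit word equals the constant."""
    return 1 if (word & 0xFFFF) == (k & 0xFFFF) else 0


# "BIT *AR2,5" tests bit number 10 of the word, not bit number 5.
tc = bit(0x0400, 5)
```

With this convention a bit code of 0 selects the MSB (bit 15) and a bit code of 15 selects the LSB.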


7 - 37

Boolean Operations

AND / OR / XOR                          1 cycle
        Smem,src                        src = src (op) Smem
        src,[SHIFT],[dst]               dst = dst (op) src << SHIFT

AND / OR / XOR                          2 cycles
        #K,[shft],src,[dst]             dst = src (op) #K << shft
        #K,16,src,[dst]                 dst = src (op) #K << 16

ANDM / ORM / XORM / ADDM                2 cycles
        #K,Smem                         Smem = Smem (op) #K

7 - 38

Shift and Rotate Operations

        SFTA  src,SHIFT,[dst]     left:   C  <- [39..32 | 31..0] <- 0
                                  right:  Sx -> [39..32 | 31..0] -> C

        SFTL  src,SHIFT,[dst]     left:   C  <- [00 | 31..0] <- 0
                                  right:  0  -> [00 | 31..0] -> C

        ROLTC src                 C <- [00 | 31..0] <- TC

        ROL   src                 C <- [00 | 31..0] <- C

        ROR   src                 C -> [00 | 31..0] -> C


7 - 39

Shifter Hardware

[Diagram: barrel shifter (-16, +31) fed from accumulators A and B and from the C and D buses, with sign control from SXM. Its output goes to the ALU and, through the MSW/LSW write select, to the E bus; the C and TC bits are attached to the shifter. Path widths of 16, 32 and 40 bits are marked.]

Shift count sources:
        T(5-0)        (-16, +31) range
        ASM(4-0)      (-16, +15) range
        Constant      (-16, +15) range or (0, +15) range

7 - 40

Other Numerical Operations

        ABS   src,[dst]         dst = |src|
        NEG   src,[dst]         dst = -src
        CMPL  src,[dst]         dst = ~src   (one's complement)


7 - 41

Exercise 7d : Boolean Operations

1. How are bits tested on the 54x? What's unusual about it?
2. What boolean operations are present on the 54x?
3. Must boolean functions operate on the accumulator?
4. What is the difference between shift and rotate?
5. What is the difference between SFTA and SFTL?
6. What is the difference between NEG and CMPL?


7 - 43

LAB7.ASM : Solution

        .def    start,table,y
        .bss    x,4
        .bss    a,4
        .bss    y,1

        .data
table:  .word   32768*1/10
        .word   32768*2/10
        .word   32768*3/10
        .word   32768*4/10
        .word   32768*8/10
        .word   32768*6/10
        .word   32768*4/10
        .word   32768*2/10

        .text
        NOP
start:  STM     #x,AR2          ; AR2 -> start of RAM arrays
        RPT     #8              ; repeat next instruction 9 times (RPT #n runs n+1 times)
        MVPD    table,*AR2+     ; copy table from program to data memory
        CALL    sop
done:   B       done

sop:    STM     #x,AR2          ; AR2 -> x
        STM     #a,AR3          ; AR3 -> a
        RSBX    OVM             ; no saturation on overflow
        SSBX    SXM             ; sign extension on
        SSBX    FRCT            ; fractional multiply mode
        RPTZ    A,#3            ; A = 0, repeat next instruction 4 times
        MAC     *AR2+,*AR3+,A   ; A += x(i) * a(i)
        STH     A,*(y)          ; y = high word of A
        RET
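As a cross-check on the lab result, the same sum of products can be reproduced on a PC. The sketch below assumes the assembler evaluates `32768*k/10` with truncating integer arithmetic, that FRCT = 1 shifts each product left by one bit, and that STH keeps bits 31-16 of the accumulator. The helper names are illustrative.

```python
def q15(numerator, denominator):
    """Table entry as a truncated integer (assumed assembler behaviour)."""
    return 32768 * numerator // denominator


TABLE = [q15(k, 10) for k in (1, 2, 3, 4, 8, 6, 4, 2)]


def sum_of_products_q15(x, a):
    """Model of RPTZ A,#3 / MAC *AR2+,*AR3+,A / STH A,*(y) with FRCT = 1."""
    acc = 0
    for xi, ai in zip(x, a):
        acc += (xi * ai) << 1   # Q15 * Q15 -> Q30, shifted to Q31 by FRCT
    return acc >> 16            # STH: keep the high word (Q15)


x, a = TABLE[:4], TABLE[4:]     # first four words land in x, next four in a
y = sum_of_products_q15(x, a)
y_fraction = y / 32768          # close to 0.1*0.8 + 0.2*0.6 + 0.3*0.4 + 0.4*0.2
```

The fractional value comes out just under 0.4 because both the table entries and the final store truncate.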


Fundamental DSP Applications

Learning Objectives

8 - 2

Objectives

• Describe how FIR/IIR filters operate
• Implement delay lines in two ways
• Write code for FIR/IIR filters on the 54x
• Translate signal flow diagrams to 54x code
• Employ techniques to avoid IIR instability
• Select the best filter type for a given need


Module 8

8 - 3

Finite Impulse Response (FIR) Filter

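[Signal flow diagram: xin enters the delay line X0 -> z^-1 -> X1 -> z^-1 -> X2; the taps are multiplied by a0, a1 and a2 and the products are summed to form yout.]

y(n) = a0 × x(n) + a1 × x(n–1) + a2 × x(n–2)

        LD    x2,T
        MAC   a2,A

Delay line options: Circular Buffer or Linear Buffer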

8 - 4

I/O Memory Read & Write

        PORTR PA,Smem           PA -> Smem
        PORTW Smem,PA           PA <- Smem

• Port operations access I/O devices
• Requires two words & two cycles
• I/O range can be up to 64K locations
• There are no I/O resources on-chip


8 - 5

Linear Buffer (Delay Line)

1. Access from the oldest to newest sample (*ARn-).

        JUN  MAY  APR  MAR  FEB  JAN

2. Input the newest sample on the top of the buffer (PORTR). Every older sample has first been moved down one location, so the oldest one is lost.

        after delay:    JUN  JUN  MAY  APR  MAR  FEB
        JUL via PORTR:  JUL  JUN  MAY  APR  MAR  FEB

        after delay:    JUL  JUL  JUN  MAY  APR  MAR
        AUG via PORTR:  AUG  JUL  JUN  MAY  APR  MAR

        DELAY *AR2-

Note: DELAY operates in DARAM only!
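The linear delay line can be prototyped in Python before committing it to 54x code. The sketch below is a behavioural model only: `line[0]` plays the role of X0 (newest sample), the inner loop stands in for the DELAY/LTD data moves, and the function name is illustrative.

```python
def fir_linear(coeffs, samples):
    """FIR filter using a linear buffer that is physically shifted each sample."""
    n = len(coeffs)
    line = [0] * n                      # line[0] = newest, line[n-1] = oldest
    out = []
    for x in samples:
        # Move data from the oldest location toward the newest,
        # so nothing is overwritten before it has been copied.
        for i in range(n - 1, 0, -1):
            line[i] = line[i - 1]
        line[0] = x                     # newest sample goes on top
        out.append(sum(a * v for a, v in zip(coeffs, line)))
    return out


# Feeding an impulse returns the coefficients in order.
impulse_response = fir_linear([1, 2, 3], [1, 0, 0, 0])
```

Every sample costs one data move per tap here, which is the overhead the circular buffer on the next slide avoids.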

8 - 6

Six-Level Circular Buffer

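[Diagram: a six-location buffer between "start" and "end", drawn both as a ring and as a column. Samples stay in place; ARn marks the current position, and each new sample overwrites the oldest one.]

        initial:      JUN  MAY  APR  MAR  FEB  JAN
        after JUL:    JUN  MAY  APR  MAR  FEB  JUL      (JUL replaces JAN)
        after AUG:    JUN  MAY  APR  MAR  AUG  JUL      (AUG replaces FEB)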


8 - 7

Circular Addressing Hardware

[Diagram: circular buffer range from Element 0 to Element N-1.]

        Top of Buffer         A ... A  0 ... 0
        End of Buffer + 1     A ... A  BK
        (ARn) Index           A ... A  x ... x

BK = Length "n" of Delay Line

8 - 8

Circular Addressing Code

FIR.ASM

X0      .usect  "D_LINE",32
        .text
        STM     #32,BK          ; BK = size of circular buf
        . . .
        *AR3+%                  ; circular addressing

LINK.CMD

SECTIONS
{
    D_LINE: { } > RAM PAGE 1        (slide annotation: align(64))
    . . .
}

Circular buffers of length n must be aligned on 2^K > n boundaries
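The alignment rule follows from how the buffer base is derived. The Python sketch below models the address update under the assumptions stated on these slides: the base is ARn with its low K bits cleared, where 2^K is the smallest power of two greater than BK, and the index is the low K bits. The function name is illustrative, and the model assumes the step size does not exceed BK.

```python
def circ_step(arn, step, bk):
    """Model of a *ARn+% style update for a circular buffer of length bk."""
    k = bk.bit_length()                 # smallest K with 2**K > bk
    mask = (1 << k) - 1
    base = arn & ~mask                  # top of buffer: low K bits are zero
    index = arn & mask                  # position inside the buffer
    index = (index + step) % bk         # wrap at end of buffer
    return base | index


# A 32-word buffer needs a 64-word boundary, e.g. address 0140h.
top = 0x0140
wrapped_up = circ_step(top + 31, +1, 32)     # last element, then increment
wrapped_down = circ_step(top, -1, 32)        # first element, then decrement
```

If the buffer were not aligned, clearing the low K bits would give a base below the real start of the buffer, which is why the linker has to place D_LINE on a 64-word boundary.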


8 - 9

Circular Addressing Caveats

• Allows all AR modifications:
  - increment or decrement
  - indexing (1, K, AR0)
• Is invoked on any AR with the modulo (%) operator
• Implements true modulo addressing (pointer will never exit array even if incremented past end of array)
• Alignment is to next larger binary boundary
• Alignment can leave gaps in RAM
• Linker will attempt to backfill unused RAM if possible on a whole-file basis
• Recommended: Link largest blocks first. Why?

8 - 10

FIR Filter

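[Signal flow diagram: xin enters the delay line X0 -> z^-1 -> X1 -> z^-1 -> X2 -> z^-1 -> X3 -> z^-1 -> X4; the taps are multiplied by a0 through a4 and the products are summed to form yout.]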

y(n) = a0 × x(n) + a1 × x(n-1) + a2 × x(n-2) + a3 × x(n-3) + a4 × x(n-4)

or

Y0 = a0 × X0 + a1 × X1 + a2 × X2 + a3 × X3 + a4 × X4


8 - 11

FIR Filter - Linear Buffer

Direct Addressing

        LD    #X0,DP
        SSBX  FRCT
LOOP    LD    X4,T
        MPY   A4,A
        LTD   X3
        MAC   A3,A
        LTD   X2
        MAC   A2,A
        LTD   X1
        MAC   A1,A
        LTD   X0
        MAC   A0,A
        STH   A,X0
        PORTW X0,PA0
        BD    LOOP
        PORTR PA1,X0

Indirect Addressing

        STM   #A+3,AR2
        STM   #X+3,AR1
        STM   #4,AR0
        SSBX  FRCT
LOOP    LD    *AR1-,T
        MPY   *AR2-,A
        LTD   *AR1-
        MAC   *AR2-,A
        LTD   *AR1-
        MAC   *AR2-,A
        LTD   *AR1-
        MAC   *AR2-,A
        LTD   *AR1
        MAC   *AR2+0,A
        STH   A,*AR1
        PORTW *AR1,PA0
        BD    LOOP
        PORTR PA1,*AR1+0

Note: location X0 used as temporary output location

8 - 12

FIR Filter - Dual Op w. Delay

INIT    STM   #X+5,AR2          ; point to last datum
        STM   #4,AR0            ; ptr reset value
        SSBX  FRCT              ; fractional numbers

FIR     RPTZ  A,#4              ; 5 iterations
        MACD  *AR2-,COEF,A      ; Mpy, Acc, Delay
        STH   A,*AR2            ; X = result
        PORTW *AR2+,PA0         ; X to DAC, inc to X0
        BD    FIR               ; loop (soon)
        PORTR PA1,*AR2+0        ; get new X0, inc to X4

        .data
COEF    .word A4,A3,A2,A1,A0    ; coeffs: old to new
X       .usect "daram",1+5+1    ; X, d.line, 1st delay


8 - 13

FIR Filter - Dual Op w. Circ.Buffer

        STM   #A4,AR3               ; coeff ptr
        STM   #X4,AR2               ; circ buf ptr
        STM   #-1,AR0               ; dual op dec
        STM   #5,BK                 ; circ buf size
        SSBX  FRCT                  ; fractions

FIR     RPTZ  A,#4                  ; 5 iterations
        MAC   *AR2+0%,*AR3+0%,A     ; SOP & circ
        STH   A,*AR2                ; result to old x
        PORTW *AR2,PA0              ; result to DAC
        BD    FIR                   ; branch soon
        PORTR PA1,*AR2+0%           ; get new data, inc to old

Note: in dual op mode *ARn-% is indirectly supported via *ARn+0% by setting AR0 = -1

8 - 14

Second-Order IIR Filter

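[Signal flow diagram: x(n) enters a summing node whose output w(n) is stored as X0 and passed down the delay line X0 -> z^-1 -> X1 -> z^-1 -> X2.
 Feedback Path - Poles: X1 and X2 are scaled by -A1 and -A2 and added back at the input summing node.
 Forward Path - Zeros: X0, X1 and X2 are scaled by B0, B1 and B2 and summed to give y(n).]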
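The structure in the diagram can be checked numerically before it is coded on the 54x. The Python sketch below is a floating-point model that assumes the usual reading of the diagram: w(n) = x(n) - A1*w(n-1) - A2*w(n-2), then y(n) = B0*w(n) + B1*w(n-1) + B2*w(n-2). The function name and the coefficient values are illustrative.

```python
def biquad(x, b0, b1, b2, a1, a2):
    """Second-order IIR section with a single shared delay line (X0, X1, X2)."""
    w1 = w2 = 0.0                       # X1 and X2, initially cleared
    y = []
    for xn in x:
        w0 = xn - a1 * w1 - a2 * w2     # feedback path (poles)
        y.append(b0 * w0 + b1 * w1 + b2 * w2)   # forward path (zeros)
        w2, w1 = w1, w0                 # advance the delay line
    return y


# Illustrative coefficients, all below 1.0 in magnitude.
response = biquad([1.0, 0.0, 0.0], b0=0.2, b1=0.3, b2=0.1, a1=-0.5, a2=0.25)
```

Note that the feedback sum is formed first, so w(n) can grow larger than either the input or the output; that is the motivation for the scaling issues discussed later in this module.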


8 - 15

IIR Filter - Single Operand

        LD    #x0,DP
        SSBX  FRCT
IIR:    PORTR 0000,x0           ; x0 as Input Handler
        LD    x0,16,A
        LD    x1,T              ; Feedback Section
        MAC   a1,A
        LD    x2,T
        MAC   a2,A
        STH   A,x0              ; x0 as Delay Element
        MPY   b2,A              ; Forward Section
        LTD   x1
        MAC   b1,A
        LTD   x0
        MAC   b0,A
        STH   A,x0              ; x0 as Output Handler
        BD    IIR
        PORTW x0,0001

8 - 16

IIR Filter - Dual Operand

        SSBX  FRCT
        STM   #X2,AR3
        STM   #Coeff+4,AR4
        MVMM  AR4,AR1
        STM   #6,BK
        STM   #-1,AR0
IIR:    PORTR 0001h,*AR3
        LD    *AR3,16,A
        MAC   *AR3+0%,*AR4-,A       ; Feedback Path
        MAC   *AR3+0%,*AR4-,A
        STH   A,*AR3
        MPY   *AR3+0%,*AR4-,A       ; Forward Path
        MAC   *AR3+0%,*AR4-,A
        MAC   *AR3,*AR4-,A
        STH   A,*AR3
        MVMM  AR1,AR4
        BD    IIR
        PORTW *AR3,0002h


8 - 17

Classical Form IIR

[Signal flow diagram: x(n) passes through a first pair of z^-1 delays with coefficients a11 and a12 and their summing nodes, then through a second pair of z^-1 delays with coefficients b11 and b12 and their summing nodes, to produce y(n).]

• Gain (pole, feedback section) after attenuation (zero, forward section)
• Less need for input scaling
• More robust
• Alternate coding model

8 - 18

Classical IIR Code

        STM   #X,AR2
        STM   #A,AR3
        STM   #Y,AR4
        STM   #B,AR5
        STM   #3,BK
        STM   #-1,AR0

Even Iteration

IIR:    PORTR 0001h,*AR2
        MPY   *AR2+0%,*AR3+0%,A
        MAC   *AR2+0%,*AR3+0%,A
        MAC   *AR2,*AR4+0%,A
        MAS   *AR4+,*AR5+,A
        MAS   *AR4,*AR5-,A
        STH   A,*AR4
        PORTW *AR4,0002h

Odd Iteration

        ...
        MAS   *AR4+,*AR5+,A
        MAS   *AR3,*AR5-,A
        STH   A,*AR4
        BD    IIR
        PORTW *AR3,0002h


8 - 19

IIR Solutions Comparison

Parameter       1 Operand        2 Operand       Classical
Cycle Count     12(M) + 4(P)     9(M) + 4(P)     6(M) + 4(P)
Code Size       20               24              34
Reg's Used      0                3 + BK          5 + BK

8 - 20

IIR Implementation Issues

• Break down high-order systems
• Scale down coefficients that are ≥ 1
• Input scaling
• Optimal Topology


8 - 21

Break Down High-Order IIR

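[Signal flow diagram: a high-order IIR filter drawn as two cascaded second-order sections. x(n) enters the first section (coefficients a11, a12, b11, b12) and the second section (coefficients a21, a22, b21, b22) produces y(n). Each section has two z^-1 delays; intermediate signals d(n) and w(n) are marked.]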

8 - 22

Scale Down Coefficient ≥1

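[Signal flow diagram: the second-order IIR filter shown earlier (delay line X0, X1, X2; forward coefficients B0, B1, B2; feedback coefficient –A2; summing node w(n)), with the feedback coefficient –A1 replaced by two parallel branches of –(A1)/2 each, so that the two products add up to the original –A1 term.]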


8 - 23

Input Scaling

Q31 format:

        PORTR 0001h,Xin
        LD    Xin,16,A

Divide by 8:

        PORTR 0001h,Xin
        LD    Xin,16-3,A

8 - 24

Optimal IIR Topology

[Signal flow diagram: three cascaded second-order sections from x(n) to y(n), with coefficient pairs a11/a12, b11/b12, a21/a22, b21/b22, a31/a32 and b31/b32, each built from z^-1 delays and summing nodes.]

Best blend of efficiency and performance by preceding a gain stage (pole) with a zero stage


8 - 25

IIR vs. FIR Filters

• FIR:
  - All zero implementation
  - Unconditionally stable
  - Linear Phase possible
  - Best for phase encoded data
• IIR:
  - Pole & zero implementation
  - Stable if no errors made
  - Much better frequency performance
  - Best for frequency discrimination

8 - 26

Lab 8 : Recursive Filter

Implement the signal flow diagram on the 'C54x.

[Signal flow diagram: Y0 -> z^-1 -> Y1 -> z^-1 -> Y2; Y1 × A and Y2 × B are summed to form Y0, which is also written to I/O Port 0.]

        A    = 1.975
        B    = –1.000
        y(0) = 0.000
        y(1) = 0.1400
        y(2) = ?

Notes:
Y0 is the current output based on the two prior outputs, Y1 and Y2.
Initial conditions y(0) and y(1) are given, so the '54x will begin processing at t=2.
Since location Y0 is not an input value, results can be directly written to Y1.


8 - 27

Lab 8 : Procedure

• Create a new assembly file to:
  1. Allocate RAM for coefficients and delay line
  2. Establish a ROM table for coefficients and initial conditions
  3. Initialize ROM into RAM
  4. Initialize processor modes
  5. Write code to implement signal flow diagram in infinite loop
  6. Build reset vector
• Assemble the program
• Link the program using an appropriate linker command file
• Run the program on the simulator through 40 loops
• Exit the simulator and view your results by typing: PLOT OUT.DAT
• Verify the results with the instructor
• Time permitting, consider optimizing your code

8 - 28

Lab 8: Equations

        y(n) = A*y(n-1) + B*y(n-2)

        Y(z) = A*z^-1*Y(z) + B*z^-2*Y(z)

        Y(z)*[1 - A*z^-1 - B*z^-2] = 0

solving for roots:

        z = [ A +/- (A^2 + 4B)^(1/2) ] / 2

if z is complex, then A^2 + 4B < 0, so

        z = [ A +/- j*(-A^2 - 4B)^(1/2) ] / 2

        |z| = [ A^2/4 + (-A^2 - 4B)/4 ]^(1/2)

        |z| = [ -B ]^(1/2)

Therefore, if B = -1,

        |z| = 1
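The derivation can be confirmed numerically with the lab values. The Python sketch below uses floating point rather than Q15, so it only illustrates the ideal recursion; the function names are illustrative.

```python
import cmath

A = 1.975
B = -1.0


def oscillator(n_samples, y0=0.0, y1=0.14):
    """Run y(n) = A*y(n-1) + B*y(n-2) from the given initial conditions."""
    y = [y0, y1]
    for n in range(2, n_samples):
        y.append(A * y[n - 1] + B * y[n - 2])
    return y


def poles():
    """Roots of z^2 - A*z - B = 0, i.e. z = [A +/- sqrt(A^2 + 4B)] / 2."""
    root = cmath.sqrt(A * A + 4 * B)
    return (A + root) / 2, (A - root) / 2


samples = oscillator(40)
magnitudes = [abs(z) for z in poles()]
```

With B = -1 both pole magnitudes come out as 1, so the output neither decays nor grows: the recursion is a sine generator, which is why the lab plots a sine wave after 40 loops.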


8 - 30

Lab 8: Solution - Parts 1 & 2

**** 1. Allocate RAM for coefficients and delay line
        .bss    a,4,1                   ; Alloc RAM in 1 page
b       .set    a+1
y1      .set    a+2
y2      .set    a+3

**** 2. Establish a ROM table for coeff's and init. conds.
        .data
TBL     .word   32768*1975/2000         ; A/2 (q15)
        .word   32768*(-1)              ; B   (q15)
        .word   32768*14/100            ; Y1  (q15)
        .word   32768*0                 ; Y2  (q15)
SIZE    .set    $-TBL

8 - 31

Lab 8: Solution - Parts 3 & 4

**** 3. Initialize ROM into RAM
        .text                           ; Begin code space
start   STM     #a,AR7                  ; Pointer to RAM array
        RPT     #SIZE-1                 ; Loop # of TBL elements
        MVPD    TBL,*AR7+               ; Copy ROMs to RAMs

**** 4. Initialize processor modes
        SSBX    FRCT                    ; For Q15*Q15 -> Q31
        LD      #a,DP                   ; Set page for direct addressing
        RSBX    OVM                     ; Allow use of guard bits
        SSBX    SXM                     ; Two's comp numbers


8 - 32

Lab 8: Solution - Parts 5 & 6

**** 5. Write code to implement signal flow diagram ...
SINE    LD      y2,T                    ; T = y2
        MPY     b,A                     ; A = b*y2
        LTD     y1                      ; T = y1 , y1 -> y2
        MAC     a,A                     ; A = (a*y1)/2 + b*y2
        MAC     a,A                     ; A = a*y1 + b*y2
        STH     A,y1                    ; y0 -> y1
        PORTW   y1,0000                 ; write to out.dat file
        B       SINE                    ; loop ...

**** 6. Build reset vector
        .include VECTOR6.SSM

8 - 33

SIMINIT.CMD

ma 0x0000,0, 0x0100, R|W|EX         Small Ext'l Pgm Mem at 0
ma 0x9000,0, 0x1000, R|W|EX         4k of Ext'l  "    "
ma 0xe000,0, 0x1000, R|W|EX         4k of Ext'l  "    "
ma 0xff80,0, 0x0080, R|W|EX         Vector Area  "    "
ma 0x0000,1, 0x0060, R|W            MMRs in Data Mem
ma 0x0060,1, 0x0020, R|W            SPRAM  "
ma 0x0080,1, 0x0380, R|W            RAM 0  "
ma 0x0400,1, 0x0400, R|W            RAM 1  "
ma 0x1400,1, 0x0400, R|W|EX         Small X Data Mem at 1400
ma 0x8000,1, 0x1000, R|W|EX         4k of Extl Mem Org 8000
ma 0x0,2,1,oport                    Output Port 0
mc 0x0,2,1,out.dat,W


Algorithms

Learning Objectives

9 - 2

Learning Objectives

• List the advanced C54x instructions
• Associate the advanced mnemonic with the algorithmic need
• Identify the architectural components that provide advanced performance
• Experiment with some of these instructions on the simulator


Module 9

9 - 3

Advanced Applications

• FIRS     Symmetrical FIR filter
• LMS      Adaptive filtering
• POLY     Polynomial evaluation
• STRCD    Code book Search
  SACCD
  SRCCD
• DADST    Viterbi algorithm
  DSADT
  CMPS

Next topic: FIRS, Symmetrical FIR filter

9 - 4

Symmetric FIR Filter

        Coeffs:   a0    a1    a2    a3    a3    a2    a1    a0
        Data:     x(8)  x(7)  x(6)  x(5)  x(4)  x(3)  x(2)  x(1)
                  New ----------------    Old ----------------

Symmetric FIR filters are commonly used in applications where phase distortion may degrade the signal quality, e.g. modems.

The general form of this FIR equation is written

        Y(n) = a0 x(8) + a1 x(7) + a2 x(6) + a3 x(5) + a3 x(4) + a2 x(3) + a1 x(2) + a0 x(1)

using 8 Mult's, 7 Adds

In the specific case of a Symmetric FIR we can write

        Y(n) = a0 (x(8)+x(1)) + a1 (x(7)+x(2)) + a2 (x(6)+x(3)) + a3 (x(5)+x(4))

using 4 Mult's, 7 Adds
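The two forms of the equation can be compared directly. The Python sketch below assumes the data list is ordered newest first, x(8) down to x(1), and that only the four unique coefficients a0..a3 are stored; the function names are illustrative.

```python
def fir_direct(a, x):
    """General form: 8 multiplies with the mirrored coefficient set."""
    full = a + a[::-1]                  # a0 a1 a2 a3 a3 a2 a1 a0
    return sum(c * v for c, v in zip(full, x))


def fir_folded(a, x):
    """Symmetric form: add the mirrored samples first, then 4 multiplies."""
    return sum(a[k] * (x[k] + x[-1 - k]) for k in range(len(a)))


coeffs = [1, 2, 3, 4]                   # a0..a3
data = [8, 7, 6, 5, 4, 3, 2, 1]         # x(8)..x(1)
y_direct = fir_direct(coeffs, data)
y_folded = fir_folded(coeffs, data)
```

Both functions return the same value; the folded form simply pairs x(8) with x(1), x(7) with x(2) and so on before multiplying, which is the pairing the FIRS instruction performs in hardware.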


9 - 5

FIRS Implementation

1. Split the data into two parts; New and Old.
2. Set up circular buffers for each part. Set up the pointers for the buffers to the newest of "New" and the oldest of "Old". Set up a coefficient table.
3. Sum the first two data points into the high A accumulator (AH) and decrement the data pointers.
4. Zero the B accumulator and repeat the following four times:
   a. Multiply AH times the coefficient, accumulate the result into the high B accumulator (BH) and increment the coefficient pointer.
   b. Sum the next two data points and decrement the data pointers.
5. Store the result (BH) & set data pointers to oldest "Old" and oldest "New".
6. Replace oldest "Old" value with oldest "New" value. Dec. "Old" pointer.
7. Replace oldest "New" value with a new input datum and go to step 3.

[Diagram: the "New" circular buffer (x(5), x(6), x(7), x(8), pointer AR2) and the "Old" circular buffer (x(4), x(3), x(2), x(1), pointer AR3), with higher addresses toward the bottom, beside the coefficient table a0..a3. Successive frames show AH = x(8)+x(1), then BH = a0(x(8)+x(1)) + a1(x(7)+x(2)) + a2(x(6)+x(3)) + a3(x(5)+x(4)), then x(5) replacing x(1) in the Old buffer and x(9) arriving from the A/D.]

9 - 6

FIRS Code Example

X_new   .usect "DATA1",4
X_old   .usect "DATA2",4

        LD    #Y,DP
        SSBX  FRCT
        STM   #X_new,AR2                ; AR2 points to NEW buf
        STM   #X_old+3,AR3              ; AR3 points to OLD buf
        STM   #4,BK                     ; Circular buffer length = 4
        STM   #-1,AR0                   ; Emulates *ARn-%

FIR     ADD   *AR2+0%,*AR3+0%,A         ; AH = x(8)+x(1)
        RPTZ  B,#3                      ; B = 0; do the following 4 times:
        FIRS  *AR2+0%,*AR3+0%,COEFS     ; B=AH*a0; AH=x(7)+x(2), etc...

        STH   B,Y
        PORTW Y,0000h                   ; Output the result

                                        ; Point to oldest OLD buf and oldest NEW buf,
                                        ; then xfer old NEW over old OLD
        MAR   *+AR2(2)%
        MAR   *AR3+%
        MVDD  *AR2,*AR3+0%
        BD    FIR
        PORTR 0001h,*AR2                ; Input new X to NEW buf

        .data
COEFS   .word a0,a1,a2,a3


9 - 7

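Architecture - FIRS

        FIRS *AR2+0% , *AR3+0% , COEFS

[Diagram: the C and D buses feed the ALU, which adds the two data operands into Acc A (ADD). Acc A and the coefficient from the P bus feed the multiplier (MPY); the product passes through a MUX and the ALU and is accumulated into Acc B (MAC).]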

9 - 8

Advanced Applications

• FIRS     Symmetrical FIR filter
• LMS      Adaptive filtering
• POLY     Polynomial evaluation
• STRCD    Code book Search
  SACCD
  SRCCD
• DADST    Viterbi algorithm
  DSADT
  CMPS

Next topic: LMS, Adaptive filtering


9 - 9

Least Mean Square

A least mean square (LMS) approach is widely used for adaptive filter routines. The technique minimizes an error term by tuning the filter coefficients.

[Block diagram: the input x(n) drives both the real system H(z), producing d(n), and the synthesized system W(z), producing y(n). A summing node subtracts y(n) from d(n) to form the error e(n).]

        x(n) = input data
        d(n) = desired response
        y(n) = actual response
        H(z) = real system
        W(z) = synthesized system
        e(n) = error

9 - 10

Adaptive FIR Filtering using LMS

[Signal flow diagram: x(n) enters a chain of z^-1 delays; the taps are weighted by b0, b1, ... b(n-1) and summed to form y(n). y(n) is subtracted from d(n) to form e(n), which drives the LMS block that updates the coefficients.]

FIR type filters are usually used in an adaptive algorithm since they are more tolerant of non-optimal coefficients.


9 - 11

LMS Loading

Each iteration (only once):
1 - Determine error:  e(i) = d(i) - y(i)
2 - Scale by "rate" term B:  e'(i) = 2*B*e(i)

Each term (N sets):
3 - Qualify error with signal strength:  e''(i) = x(i-k) * e'(i)
4 - Sum error with coefficient:  b(i+1) = b(i) + e''(i)
5 - Update coefficient:  b(i) = b(i+1)

Analysis:

         Step   Count   Instruction
LMS:     1      1       SUB
         2      1       MPY
         3      N       MPY
         4      N       ADD
         5      N       STH
FIR:     a      N       MPY
         b      N       ADD
         c      1       STH

Coded with separate instructions: @ 100 taps: 500+ cycles
Coded with ST || MPY and LMS (MAC + ADD): @ 100 taps: 200+ cycles
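The five steps above can be sketched in floating-point Python. This is only a behavioral sketch, not the workshop's code: the function names, the list-based history buffer and the 2-tap test system are assumptions made for illustration.

```python
def lms_iteration(b, x_hist, d, beta):
    """One LMS iteration following steps 1-5 of the slide.

    b      : list of N coefficients b(0..N-1)
    x_hist : the N most recent inputs, x_hist[k] = x(i-k)
    d      : desired response d(i)
    beta   : the "rate" term B
    Returns (y, e, new_b).
    """
    # FIR output: y(i) = sum over k of b(k) * x(i-k)
    y = sum(bk * xk for bk, xk in zip(b, x_hist))
    e = d - y                          # step 1: determine error
    e_scaled = 2.0 * beta * e          # step 2: scale by the rate term
    # steps 3-5: qualify with x(i-k), add to the coefficient, store it
    new_b = [bk + xk * e_scaled for bk, xk in zip(b, x_hist)]
    return y, e, new_b


def identify(h, inputs, beta):
    """Adapt b toward a hypothetical noise-free reference system h."""
    n = len(h)
    b = [0.0] * n
    hist = [0.0] * n
    for x in inputs:
        hist = [x] + hist[:-1]         # newest sample first
        d = sum(hk * xk for hk, xk in zip(h, hist))
        _, _, b = lms_iteration(b, hist, d, beta)
    return b
```

Steps 1 and 2 run once per sample, while steps 3 to 5 run once per tap, which is why the per-tap work dominates the cycle count at 100 taps.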

9 - 12

LMS Instruction

LMS Xmem, Ymem       ; A += (Xmem) << 16 + 2^15
                     ; B += (Xmem) * (Ymem)

Example: LMS *AR3+, *AR4+

                       Before instruction    After instruction
A                      00 1111 2222          00 2111 A222
B                      00 1000 0000          00 1200 0000
FRCT                   0                     0
AR3                    0100                  0101
AR4                    0200                  0201
Data memory 0100h      1000                  1000
Data memory 0200h      2000                  2000

A:   00 1111 2222h + 00 1000 0000h + 8000h = 00 2111 A222h
B:   00 1000 0000h + 1000h * 2000h = 00 1200 0000h

The LMS instruction adapts the coefficient and performs the MAC for the filtering in the same cycle. Storing the coefficient will require 1 additional cycle.
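The before/after values on this slide can be checked with a small Python model of the two accumulator updates. It is a sketch under stated assumptions: accumulators are treated as 40-bit values with wrap-around, saturation (OVM) is not modeled, and the function name is made up for this example.

```python
MASK40 = (1 << 40) - 1  # accumulators A and B are 40 bits wide


def lms_instruction(a, b, xmem, ymem, frct=0):
    """Model of LMS Xmem, Ymem (no saturation, 40-bit wrap-around).

    A += (Xmem << 16) + 2^15   ... rounded coefficient update
    B += Xmem * Ymem           ... filter tap (product doubled if FRCT = 1)
    """
    product = xmem * ymem
    if frct:
        product <<= 1
    new_a = (a + (xmem << 16) + (1 << 15)) & MASK40
    new_b = (b + product) & MASK40
    return new_a, new_b


a, b = lms_instruction(0x0011112222, 0x0010000000, 0x1000, 0x2000)
print(f"A = {a:010X}, B = {b:010X}")  # A = 002111A222, B = 0012000000
```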


9 - 13

LMS Adaptive Filter Code

        ...                        ; Pre-calculate 2*Beta*e(n) ...
        .asg   AR3, Coeffs         ; AR3 points to Coefficient table ... a(n)
        .asg   AR4, Data           ; AR4 points to Data table ... x(n)

        LD     B2e, T              ; T holds the error step amount
        LD     #0, B               ; Zero out B
        STM    #N-2, BRC           ; Load Block Repeat Counter
        RPTBD  End-1               ; Start RPTB, next two are delay slots
        MPY    *Data+0%, A         ; A = error * oldest sample
        LMS    *Coeffs, *Data      ; B += a(n)*x(n) ... filter tap
                                   ; A += (a(n) << 16) + 2^15 ... coeff. update

        ST     A, *Coeffs+         ; Store updated coefficient
     || MPY    *Data+0%, A         ; and form A = x(n-1)*2*Beta*e(n)
        LMS    *Coeffs, *Data      ; B = accumulated filter output
                                   ; A = updated filter coefficients

End     STH    A, *Coeffs          ; Store the final updated coefficient
        STH    B, *Result          ; Store final filter result

9 - 14

Architecture - LMS

LMS *AR2+0%, *AR3+0%

[Figure: LMS data path. "ALU : LMS" path (MUX, inputs A and D) updates Acc A; "MAC : FIR" path (MPY and ADD, inputs D and C, plus B) accumulates into Acc B.]


9 - 15

Advanced Applications

- FIRS: Symmetrical FIR filter
- LMS: Adaptive filtering
- POLY: Polynomial evaluation
- STRCD, SACCD, SRCCD: Code book search
- DADST, DSADT, CMPS: Viterbi algorithm

This section: POLY (polynomial evaluation)

9 - 16

Polynomial Evaluation

The general form of a 3rd order polynomial equation can be written as:

    P(x) = a_3 x^3 + a_2 x^2 + a_1 x + a_0

The equation can be rewritten as:

    P(x) = [(a_3 x + a_2) x + a_1] x + a_0

This process can be extended to any order polynomial.

Polynomial evaluation is commonly used in convolutional encoding.
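The nested form is Horner's rule. A minimal floating-point sketch in Python, with an illustrative function name and the coefficients passed highest order first (both are choices made for this example):

```python
def horner(coeffs, x):
    """Evaluate a polynomial with coefficients [a_n, ..., a_1, a_0] at x.

    Each pass performs one multiply and one add, which is the same
    multiply-accumulate pattern the nested form exposes.
    """
    result = coeffs[0]
    for a in coeffs[1:]:
        result = result * x + a
    return result


# a3 = 1/2, a2 = 3/8, a1 = 1/4, a0 = 1/8 at x = 3/4
print(horner([0.5, 0.375, 0.25, 0.125], 0.75))  # 0.734375, i.e. 94/128
```

An order-n polynomial needs n multiply-add passes this way, instead of computing each power of x separately.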


9 - 17

POLY Operation

1. Set up a pointer to the coefficients.
2. Load x into the T register.
3. Load a3 into the high A accumulator (AH). Decrement pointer.
4. Load a2 into the high B accumulator (BH). Decrement pointer.
5. Repeat the following three times:
   a. Multiply AH times T, accumulate with BH and round in AH.
   b. Load the next coefficient into BH. Decrement pointer.
6. Store AH as result.

[Figure: register contents during the sequence. T holds x. AH steps through a3, (a3 x + a2), [(a3 x + a2) x + a1] and finally P(x). BH steps through a2, a1, a0 and then the words that follow the coefficients (shown as ?1, ?2). ARn points into the coefficient table a3, a2, a1, a0, ?1, ?2.]

9 - 18

Polynomial Evaluation

        SSBX   FRCT             ; POLY operation is affected by these bits
        SSBX   OVM
        SSBX   SXM

        LD     *AR4+, T         ; T = X(0)
        LD     *AR3+, 16, A     ; A = A(order) = PX init
        LD     *AR3+, 16, B     ; B = A(order-1)

        RPT    #2               ; 3 times
        POLY   *AR3+            ; A = PX = Rnd(B + A*T), B = An << 16

        STH    A, *AR2+         ; PX = A >> 16

Note: The POLY instruction "expects" Q15 numbers!

A parallel load may be added to do iterative POLY operations with no penalty:

     || LD     *AR4+, T         ; T = new x
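To see what the three POLY passes compute in Q15, here is a bit-level Python sketch. It assumes FRCT = 1 (product doubled), models rounding as "add 8000h and clear the low word", and leaves out saturation (OVM); the function names and the `next_word` parameter (the word that follows the coefficients in memory) are hypothetical.

```python
def poly_step(a, b, t, smem):
    """One POLY Smem pass with FRCT = 1.

    a, b : accumulator values (Python ints, sign carried by the int)
    t    : Q15 value of x held in T
    smem : next Q15 coefficient read from memory
    """
    ah = a >> 16                          # high word of A
    product = (ah * t) << 1               # Q15 * Q15, doubled by FRCT
    total = product + b
    new_a = ((total + 0x8000) >> 16) << 16   # round into the high word
    new_b = smem << 16                    # next coefficient goes to BH
    return new_a, new_b


def poly_q15(coeffs, x, next_word=0):
    """Mirror of the slide's sequence: two loads, then len(coeffs) - 1 passes."""
    a = coeffs[0] << 16                   # LD *AR3+, 16, A
    b = coeffs[1] << 16                   # LD *AR3+, 16, B
    for word in list(coeffs[2:]) + [next_word]:
        a, b = poly_step(a, b, x, word)
    return a >> 16                        # STH A, ...


# a3 = 0x4000, a2 = 0x3000, a1 = 0x2000, a0 = 0x1000, x = 0x6000
print(hex(poly_q15([0x4000, 0x3000, 0x2000, 0x1000], 0x6000)))  # 0x5e00
```

0x5E00 is 94/128 in Q15, which agrees with the floating-point value of the nested form for these inputs.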


9 - 19

Architecture - POLY

POLY *AR3+

[Figure: POLY data path. The MAC unit (MPY and ADD, inputs A, T and B) writes Acc A; the ALU path (MUX, input D) loads Acc B.]

9 - 20

Advanced Applications

- FIRS: Symmetrical FIR filter
- LMS: Adaptive filtering
- POLY: Polynomial evaluation
- STRCD, SACCD, SRCCD: Code book search
- DADST, DSADT, CMPS: Viterbi algorithm

This section: STRCD, SACCD, SRCCD (code book search)


9 - 21

Code Book Search

A code-excited linear predictive (CELP) speech coder is widely used for applications requiring speech coding with a bit rate under 16 kbps. The speech coder uses a vector quantization technique to select an excitation signal from codebooks. This excitation signal is then applied to a linear predictive-coding (LPC) synthesis filter.

[Figure: CELP codebook search loop with blocks Codebook (entries 0, 1, 2, ...), Gain, Synthesis Filter, Weighting Filter and Mean-square error minimization (Select Codebook Entry); signals: input speech, p(n), g(n).]

9 - 22

Code Book Search

Obtaining the optimum code vector involves minimizing the mean-square error generated from the weighted input speech and from the zero-input response of the synthesis filter.

Mean-square error:

    E_i = \sum_{n=0}^{N-1} [ p(n) - \beta g_i(n) ]^2

Optimum code vector localization:

* p(n) is the weighted input speech
* g_i(n) is the zero-input response of the synthesis filter
* \beta is the gain of the codebook
* N is the number of samples in a subframe


9 - 23

Code Book Search

The cross-correlation, c_i, of p(n) and g_i(n) is represented by:

    c_i = \sum_{n=0}^{N-1} g_i(n) p(n)

The energy variable, G_i, is given by:

    G_i = \sum_{n=0}^{N-1} g_i(n)^2

Minimize E_i by maximizing c_i^2 / G_i. If a code vector with i = opt is optimal, the following relation holds for any i. The codebook search routine evaluates this relation for each code vector and finds the optimum one.

    c_i^2 / G_i < c_{opt}^2 / G_{opt}    or    c_i^2 * G_{opt} < c_{opt}^2 * G_i
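The cross-multiplied form of the test avoids a division per code vector. A Python sketch of the search, assuming the codebook responses g_i(n) are given as plain lists; the function name is illustrative, and unlike the assembly version (which stores the block-repeat counter), this sketch records the loop index directly.

```python
def codebook_search(p, codebook):
    """Return (i_opt, c_opt_sq, g_opt) for the best code vector.

    p        : weighted input speech p(n), n = 0..N-1
    codebook : list of zero-input responses g_i(n), one list per entry
    """
    # Initial values Iopt = 0, Copt^2 = 0, Gopt = 1
    i_opt, c_opt_sq, g_opt = 0, 0, 1
    for i, g in enumerate(codebook):
        c = sum(gn * pn for gn, pn in zip(g, p))     # cross-correlation c_i
        c_sq = c * c
        energy = sum(gn * gn for gn in g)            # energy G_i
        # c_i^2 * G_opt - G_i * c_opt^2 >= 0  ->  entry i becomes the optimum
        if c_sq * g_opt - energy * c_opt_sq >= 0:
            i_opt, c_opt_sq, g_opt = i, c_sq, energy
    return i_opt, c_opt_sq, g_opt


p = [1, 2, 3]
book = [[1, 0, 0], [0, 0, 1], [1, 2, 3], [3, 2, 1]]
print(codebook_search(p, book))  # (2, 196, 14)
```

Because G_i and G_opt are non-negative, comparing the cross products gives the same ordering as comparing the ratios c_i^2 / G_i.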

9 - 24

Code Book Search

        .mmregs
        .text
CBS:    STM    #C, AR5
        STM    #G, AR2
        STM    #G-opt, AR3
        STM    #I-opt, AR4
        ST     #0, *AR4
        ST     #1, *AR3+
        ST     #0, *AR3-
        STM    #N-1, BRC
        RPTB   done
        SQUR   *AR5+, A             ; A = C(i)^2
        MPYA   *AR3+                ; B = C(i)^2 * Gopt
        MAS    *AR2+, *AR3-, B      ; B = C(i)^2 * Gopt - G(i) * Copt^2, T = G(i)
        SRCCD  *AR4, BGEQ           ; If (B >= 0) then BRC --> Iopt
        STRCD  *AR3+, BGEQ          ;   and T --> Gopt
        SACCD  A, *AR3-, BGEQ       ;   and A --> Copt^2
done:   NOP

Memory layout:

AR5 --> C(0) ...
AR2 --> G(0) ...
AR3 --> Gopt = 1
        Copt = 0
AR4 --> Iopt = 0

[Slide overlay, as printed: "D" (delayed form of the block repeat), with "SQUR *AR5+, A" and "done: MPYA *AR3+".]


9 - 25

Codebook Search Instructions

Store T register conditionally ...
        STRCD  Xmem, cond           ; Xmem = T if condition is true

Store Block Repeat Counter conditionally ...
        SRCCD  Xmem, cond           ; Xmem = BRC if condition is true

Store Accumulator conditionally ...
        SACCD  src, Xmem, cond      ; Xmem = src << (ASM - 16) if condition is true

9 - 26

Advanced Applications

- FIRS: Symmetrical FIR filter
- LMS: Adaptive filtering
- POLY: Polynomial evaluation
- STRCD, SACCD, SRCCD: Code book search
- DADST, DSADT, CMPS: Viterbi algorithm

This section: DADST, DSADT, CMPS (Viterbi algorithm)


9 - 27

Data Transmission

[Figure: XMIT modulates the bit stream 0010110 ...; fading, multipath and noise act on the signal; RCV demodulates it as 0010100 ...]

- Digital source data is modulated to XMIT
- Signal is demodulated at RCV
- Noise acquired on RCV can cause data errors
- For greater reliability, an error detection and correction (EDAC) technique is desired

9 - 28

Viterbi Encoder

[Figure: input bits enter a chain of four Z^-1 delays; two adders (+) combine taps to produce the G0 bits and the G1 bits.]

- N bits are fed into network.
- M (>N) bits flow out (... G0 G1 G0 G1 ...)
- e.g. 3 in : 4 out, 4 in : 8 out, etc.
- recognizable "holes" are created in data path, e.g.:

                   3 in : 4 out    4 in : 8 out
    valid codes    2^3 = 8         2^4 = 16
    total codes    2^4 = 16        2^8 = 256
    "holes"        8               240

Receiver can use table of valid vs. invalid code to detect errors
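A small Python sketch of a rate 1/2 encoder with four delay elements, plus the "holes" count from the table. The slide does not print the tap connections, so the default taps here are an assumption: they are the GSM full-rate speech channel polynomials G0 = 1 + D^3 + D^4 and G1 = 1 + D + D^3 + D^4. Function names are illustrative.

```python
def conv_encode(bits, taps_g0=(0, 3, 4), taps_g1=(0, 1, 3, 4)):
    """Rate 1/2 convolutional encoder, output order ... G0 G1 G0 G1 ...

    taps_* list the delays (in bit times) that are XORed together;
    the defaults are an assumption, see the note above.
    """
    delay = [0, 0, 0, 0]                 # the four Z^-1 elements, newest first
    out = []
    for bit in bits:
        window = [bit] + delay           # window[k] = input delayed by k
        g0 = sum(window[k] for k in taps_g0) % 2
        g1 = sum(window[k] for k in taps_g1) % 2
        out += [g0, g1]
        delay = window[:4]               # shift the delay line
    return out


def count_holes(n_in, m_out):
    """Invalid code words when n_in bits are expanded to m_out bits."""
    return 2 ** m_out - 2 ** n_in


print(conv_encode([1, 0, 0, 0, 0]))            # [1, 1, 0, 1, 0, 0, 1, 1, 1, 1]
print(count_holes(3, 4), count_holes(4, 8))    # 8 240
```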


9 - 29

Viterbi Decoder Concept

[Figure: trellis section with transitions labeled 0 and 1 between state n and state n+1.]

- Receive data
- There are now 4 possible sets: 00, 01, 10, 11
- Traditional approach: keep only 1
  Viterbi method: keep 'best' 2
- 'Best' is determined by maximum value along paths between states
- 'Pruning' less likely paths from consideration keeps N samples of data from using 2^n locations, reducing storage to just 2*n locations
- After 'M' samples are acquired, the two best paths are compared to a table of valid/invalid codes (traceback)
- Invalid set is dropped, valid set is saved as received data
  If both are valid, maximum likelihood set is saved

9 - 30

Viterbi Decoder

[Figure: butterfly. Old states 2*J and 2*J+1 connect to new states J and J+8; the branches carry + M and - M. The candidates for J are held in AH, AL and those for J+8 in BH, BL.]

D-cod:  LD     *AR2, T          ; T = M
        DADST  *AR5, A          ; AH = (2*J) + M, AL = (2*J+1) - M
        DSADT  *AR5+, B         ; BH = (2*J) - M, BL = (2*J+1) + M
        CMPS   A, *AR4+         ; (J) = max(AH, AL), etc
        CMPS   B, *AR3+         ; (J+8) = max(BH, BL), etc

Often, local distance is the same for consecutive butterflies, so the benchmark approaches 4 cycles per butterfly.


9 - 31

Viterbi Memory Map

- In one symbol time interval, 8 butterflies yield 16 new states
- This operation repeats over a number of symbol time intervals
- At the end of the sequence of time intervals, a back track routine is performed to find the optimal path out of the 16 paths calculated
- This path represents the bit sequence to be decoded

    Relative location   Contents                             Pointer
    0 - 15              Old states: metrics 2*J & 2*J+1      AR5
    16 - 23             New states: metrics J                AR4
    24 - 31             New states: metrics J + 8            AR3

9 - 32

Viterbi Instructions

CMPS src, Smem

    IF { [ src(31-16) ] > [ src(15-0) ] }
    THEN :  (src(31-16)) → Smem
            0 → TC
            (TRN << 1) + 0 → TRN
    ELSE :  (src(15-0)) → Smem
            1 → TC
            (TRN << 1) + 1 → TRN

DADST Lmem, dst

    Lmem(31-16) + (T) → dst(39-16)
    Lmem(15-0) - (T) → dst(15-0)

DSADT Lmem, dst

    Lmem(31-16) - (T) → dst(39-16)
    Lmem(15-0) + (T) → dst(15-0)
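The three instructions combine into one add-compare-select butterfly. The Python sketch below models that behavior with plain integers; it assumes no overflow handling and a 16-bit TRN, and the function names are made up for this example. In the ELSE branch of `cmps` the low half is the value stored, since that is the larger (or equal) metric.

```python
def dadst(hi, lo, t):
    """DADST: high half plus T, low half minus T."""
    return hi + t, lo - t


def dsadt(hi, lo, t):
    """DSADT: high half minus T, low half plus T."""
    return hi - t, lo + t


def cmps(hi, lo, trn):
    """CMPS: keep the larger half, shift the decision bit into TRN.

    Returns (stored value, new TRN, TC).
    """
    if hi > lo:
        return hi, (trn << 1) & 0xFFFF, 0
    return lo, ((trn << 1) | 1) & 0xFFFF, 1


def butterfly(old_2j, old_2j1, m, trn):
    """One butterfly: old states 2*J, 2*J+1 -> new states J, J+8."""
    ah, al = dadst(old_2j, old_2j1, m)     # candidates for state J
    bh, bl = dsadt(old_2j, old_2j1, m)     # candidates for state J+8
    new_j, trn, _ = cmps(ah, al, trn)
    new_j8, trn, _ = cmps(bh, bl, trn)
    return new_j, new_j8, trn


print(butterfly(10, 4, 3, 0))  # (13, 7, 1)
```

The bits collected in TRN are the survivor decisions that a traceback routine reads later.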


9 - 33

Compare Select Store (CSS) Unit

[Figure: CSS unit. The ALU (C16 = 1, 32-bit path used as two 16-bit halves) takes T and the CB [15:0] / DB [15:0] buses and produces AH/AL and BH/BL. The CSS unit compares the MSB and LSB halves (COMP), selects which half to write to EB [15:0], and updates TRN and TC.]

- Dual 16-bit ALU operations
- T register input to the ALU as dual 16-bit operand
- 16-bit transition shift register (TRN)
- One cycle store Max and Shift decision

9 - 34

Absolute and Square Distance

ABDST Xmem, Ymem : Absolute Distance

    B += | AH |
    AH = Xmem - Ymem

SQDST Xmem, Ymem : Square Distance

    B += AH^2
    AH = Xmem - Ymem
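Each pass accumulates the previous difference before forming the new one, so a repeated ABDST or SQDST lags by one step. A Python sketch of that ordering, assuming plain integers with no saturation; the function names and the final "flush" term are choices made for this example.

```python
def abdst_sum(xs, ys):
    """Sum of |x - y| accumulated in the order ABDST uses."""
    ah = 0
    b = 0
    for x, y in zip(xs, ys):
        b += abs(ah)       # B += |AH| uses the difference from the previous pass
        ah = x - y         # AH = Xmem - Ymem
    return b + abs(ah)     # one more accumulate picks up the last difference


def sqdst_sum(xs, ys):
    """Sum of (x - y)^2 accumulated in the order SQDST uses."""
    ah = 0
    b = 0
    for x, y in zip(xs, ys):
        b += ah * ah       # B += AH^2
        ah = x - y         # AH = Xmem - Ymem
    return b + ah * ah


print(abdst_sum([1, 5, 2], [4, 3, 2]), sqdst_sum([1, 5, 2], [4, 3, 2]))  # 5 13
```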


9 - 35

Review

- What instruction is used to perform adaptive filtering?
- What instructions are used to perform Viterbi decoding?
- What features of the C54x architecture allow the FIRS instruction to execute in a single cycle? What might slow it down?
- What features of the C54x architecture allow the Viterbi operations to execute so quickly?
- What mnemonic is used for solving polynomials? What concept allows this to run so quickly?

9 - 36

LAB9A .. Acoustic Echo Cancellation

The goal of this adaptive filter is to create a replica of the echo so that when the signal, y(n), is subtracted from the echo signal, z(n), the result is zero.

[Figure: the reference signal x(n) (ref.dat) drives the speaker and the adaptive filter. The microphone picks up the echo together with near-end speech and room noise, giving the echo signal z(n) (echo.dat). The filter output y(n) is subtracted from z(n) to give the error signal e(n) (error.dat), which is used for the LMS update of the coefficients b_k.]


9 - 37

LAB9A .. Acoustic Echo Cancellation

- Assemble and link LAB9A
- Enter the simulator and invoke LAB9A.TAK. The take file loads LAB9A and connects the simulator to the input and output files.
- Run the program. Depending on the speed of the simulator it may take several minutes to process all 2000 points. However, you can stop the simulator at any time and proceed to the next step in this procedure.
- Exit the simulator
- Plot the ERROR.DAT file by typing: DRAWHEX ERROR.DAT. Note that the error converges from a large amplitude to a small amplitude. This is the time it takes the LMS filter to adapt. The remaining signal is ambient noise present in the room when the data was collected. Analysis shows that the echo is attenuated by an average of 28 dB.
- Determine the number of clock cycles required for the filter and LMS update.
- Compare this to the 4N clock cycles required by most DSPs. N is the number of taps of the adaptive filter.
- Try changing the number of the filter taps and Beta. What is the effect?

9 - 38

LAB9A .. Acoustic Echo Cancellation

The code for the LMS filter is in LAB9A.ASM. The input x(n) is stored in a file called REF.DAT, representing a 1 kHz sine wave sampled at 8 kHz. To create the REF.DAT file, a signal was generated with a function generator, then sampled with a 13 bit A/D converter. At the same time REF.DAT was collected, the 1 kHz signal was sent to a speaker.

A microphone was used to collect the resulting echoes in a 10' x 15' room.

The echo signal z(n) was sampled in the same manner as the reference signal. The echo signal is stored in a file called ECHO.DAT. Two thousand samples, or 0.25 seconds of data, are stored in the files. LAB9A.ASM uses REF.DAT and ECHO.DAT as inputs to the LMS filter. When the program is run, the resulting error signal is stored in ERROR.DAT. The LMS filter length is set for 16 taps. For a sampling rate of 8 kHz, a 16 tap filter can cancel up to 16/8000 = 2 ms of echo delay.
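The timing figures in this description follow directly from the tap count and the sampling rate. A short Python sketch of the arithmetic, useful when trying other filter lengths; the helper names are hypothetical.

```python
def echo_span_ms(taps, sample_rate_hz):
    """Longest echo delay (in ms) an N-tap filter can cover."""
    return 1000.0 * taps / sample_rate_hz


def record_length_s(samples, sample_rate_hz):
    """Duration of a recording in seconds."""
    return samples / sample_rate_hz


def taps_for_delay(delay_ms, sample_rate_hz):
    """Smallest tap count whose span covers delay_ms (rounded up)."""
    return -(-int(delay_ms * sample_rate_hz) // 1000)


print(echo_span_ms(16, 8000))      # 2.0
print(record_length_s(2000, 8000)) # 0.25
print(taps_for_delay(2, 8000))     # 16
```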


9 - 39

LAB9B .. GSM Channel Coding

GSM stands for Global System for Mobile Communications. It is the digital cellular standard used in Europe and throughout the world.

[Figure, transmitter: RPE-LPC speech encoder with voice activity detector (input 160 samples, output 260 bits) → class 1a bits with 3 parity check bits computed on the 50 class 1a bits (53 bits), class 1b (132 bits), class 2 (78 bits) → reorder class 1 bits and add 4 zero trailing bits (189 bits) → 1/2 rate, constraint length 5 convolutional encoding on class 1a and 1b bits (378 bits) → interleaving, A5 encryption and slot formatting → GMSK modulation performed in RF codec → RF transmission, s(t) → channel.]

[Figure, receiver: channel → RF reception, s'(t) → GMSK demodulation performed in RF codec → equalization, slot disassembly, de-encryption and de-interleaving → Viterbi decode of the 378 class 1 bits; the 78 class 2 bits bypass the decoder → test parity check bits on the 53 bits, discard block if check fails (50 bits), plus 132 class 1b bits → bit reordering (260 bits) → RPE-LPC speech decoder with comfort noise (input 260 bits, output 160 samples).]

A LAB9B label in the figure marks the portion of the chain covered by this lab.

9 - 40

LAB9B .. GSM Channel Coding

- Examine the file LAB9B.ASM. Note the special instructions used for implementing the Viterbi butterfly.
- Assemble and link LAB9B.ASM.
- Simulate LAB9B program by typing: take LAB9B.TAK
- Examine the input data array in the MEMORY window.

[Figure: input bits enter a chain of four Z^-1 delays; two adders (+) produce the G0 bits and the G1 bits.]


9 - 41

LAB9B .. continued

    Convolutional encoder output (G0 and G1):    1     0
    Signed antipodal format:                    -7     7

- Encode the input data by running to the first break point. Break points were set by the take file. The encoded data is in the MEMORY1 window. Simulate transmission errors by making changes to the encoded data. Valid changes are between -7 and 7. Note: the encoded data is in signed antipodal format, the format that the GSM equalizer output would be in.
- Run the Viterbi decoder and compare the input data array to the output data array. Is the output correct?

9 - 42

LAB9B .. GSM Channel Coding

Every 20 ms, 160 sampled values from the ADC are analyzed by the Regular Pulse Excitation (RPE) Linear Predictive Coding (LPC) voice encoder. The filter amounts to a model of the speaker's vocal tract (pharynx, teeth, tongue, etc.) and the excitation signal represents sounds (pitch, loudness, etc.). Finding suitable filter coefficients and an excitation signal yields an appropriate speech signal.

The real reduction in bit rate comes from further analyzing the excitation signal. The difference between current and previous excitation signals is found by using Long Term Predictive analysis (LTP). The LTP algorithm searches all of the previous sequences (15 ms of history) for the sequence that has the highest correlation to the current sequence. The difference is transmitted along with a pointer to the sequence that should be selected for use. The 160 samples are reduced to 260 bits. At 260 bits every 20 ms, the resulting bit rate is 13 kbit/s.
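The bit counts quoted in this lab can be tied together with a few lines of arithmetic. The Python sketch below uses only the numbers from the GSM channel coding block diagram and the text above; the total of 456 bits per frame is derived here (378 coded bits plus 78 unprotected class 2 bits) rather than quoted from the slides, and the function name is illustrative.

```python
def gsm_full_rate_bit_budget():
    """Bit counts per 20 ms speech frame, as listed in the block diagram."""
    class_1a, parity, class_1b, tail, class_2 = 50, 3, 132, 4, 78

    speech_bits = class_1a + class_1b + class_2          # encoder output
    to_encoder = class_1a + parity + class_1b + tail     # before convolutional coding
    coded_bits = 2 * to_encoder                          # rate 1/2 encoder
    frame_bits = coded_bits + class_2                    # derived total per frame
    frames_per_second = 1000 // 20                       # one frame every 20 ms
    return {
        "speech_bits": speech_bits,
        "to_encoder": to_encoder,
        "coded_bits": coded_bits,
        "frame_bits": frame_bits,
        "speech_rate_bps": speech_bits * frames_per_second,
    }


print(gsm_full_rate_bit_budget())
# {'speech_bits': 260, 'to_encoder': 189, 'coded_bits': 378,
#  'frame_bits': 456, 'speech_rate_bps': 13000}
```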


9 - 43

LAB9C : Polynomial Evaluation

- Examine (or create your own) LAB9C.ASM.
- Assemble and link LAB9C.
- Simulate LAB9C and observe the operation of the POLY instruction by single stepping through the code until the "End" label.
- Verify that the code generates the expected result.
- Optional: Modify the inputs to generate new results.
- Note: The POLY instruction "expects" Q15 numbers!

    x  = 3/4 = 0x6000
    a0 = 1/8 = 0x1000
    a1 = 1/4 = 0x2000
    a2 = 3/8 = 0x3000
    a3 = 1/2 = 0x4000

    P(x) = a_3 x^3 + a_2 x^2 + a_1 x + a_0
    P(x) = [(a_3 x + a_2) x + a_1] x + a_0
    P(x) = 94/128 = 0x5E00

9 - 44

Additional Resources

1. S. M. Redl, M. K. Weber, M. W. Oliphant, "An Introduction to GSM", Artech House, 1995.

2. H. Hendrix, "A Brief Tutorial on GSM Decoding Techniques", TI Internal paper, 1995.

3. H. Hendrix, "Viterbi Decoding Techniques on the TMS320C54x Family", TI Application Report, 1995.


Interrupts

Learning Objectives

Objectives

• Describe the 'C54x state upon reset.
• Identify interrupt sources.
• Identify the requirements for interrupt recognition.
• Describe the sequence of events during an interrupt.
• Build vector tables.


Module 10

Hardware Reset Actions

• All control signals are driven inactive high.
• Address lines are driven to FF80h.
• Data bus is driven to high-impedance state.
• Interrupts are disabled: 1 → INTM
• Prior interrupts are purged: 0 → IFR
• The repeat counter (RC) is cleared.
• IACK- is driven low.
• An internal reset is sent to the peripherals.
• Seven CLKOUT cycles after RS- is released, the processor will fetch from 0FF80h.

Processor Status on Reset

Math          Misc          Memory          Pins
0 → OVA       0 → BRAF      0 → OVLY        1 → XF
0 → OVB       0 → DP        0 → DROM        1 → CLKEN
0 → OVM       0 → CPL       ? → MP/MC-      0 → AVIS
1 → C         0 → CMPT      1FFh → IPTR     0 → HM
0 → C16       0 → ARP
0 → ASM       1 → INTM
0 → FRCT
1 → SXM


Interrupt Locations

Interrupt    Offset (Hex)   Description
RS           0              Reset
NMI          4              Nonmaskable Interrupt
SINT17-30    8-3C           Software Interrupt 17-30
INT0         40             External User Interrupt #0
INT1         44             External User Interrupt #1
INT2         48             External User Interrupt #2
TINT         4C             Internal Timer Interrupt
RINT0        50             Serial Port 0 Receive Interrupt
XINT0        54             Serial Port 0 Transmit Interrupt
RINT1        58             Serial Port 1 Receive Interrupt
XINT1        5C             Serial Port 1 Transmit Interrupt
INT3         60             External User Interrupt #3
             64-7F          Reserved

Interrupt Management

Bit    15-9       8      7       6       5       4       3      2      1      0
IFR    Reserved   INT3   XINT1   RINT1   XINT0   RINT0   TINT   INT2   INT1   INT0
IMR    Reserved   INT3   XINT1   RINT1   XINT0   RINT0   TINT   INT2   INT1   INT0

ST1 bit 11: INTM

Master Enable:     RSBX    INTM
Master Inhibit:    SSBX    INTM
Set IMR Bits:      ST      #102h,*(IMR)
Modify IMR:        ORM     #40h,*(IMR)
Clear IFR Bit:     ST      #1,*(IFR)
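To see which sources the mask constants in these instructions select, the bit layout of IMR and IFR can be decoded in a few lines. This is an illustrative Python sketch only; `BIT_NAMES`, `sources` and `write_ifr` are hypothetical names, and the write-one-to-clear behavior of IFR is taken from the "Clear IFR Bit" line on the slide.

```python
# Bit 0 .. bit 8 of IFR/IMR, as laid out on the slide.
BIT_NAMES = ["INT0", "INT1", "INT2", "TINT", "RINT0",
             "XINT0", "RINT1", "XINT1", "INT3"]


def sources(mask):
    """Return the interrupt names whose bits are set in mask."""
    return [name for bit, name in enumerate(BIT_NAMES) if mask & (1 << bit)]


def write_ifr(ifr, value):
    """Model a write to IFR: every 1 written clears that pending flag."""
    return ifr & ~value


imr = 0x102           # ST  #102h,*(IMR)
print(sources(imr))   # ['INT1', 'INT3']
imr |= 0x40           # ORM #40h,*(IMR)
print(sources(imr))   # ['INT1', 'RINT1', 'INT3']

ifr = 0x003                            # INT0 and INT1 pending
print(sources(write_ifr(ifr, 0x1)))    # ['INT1']
```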


Recognition of Interrupts

1. INT high for 2 cycles
2. INT low for 3 cycles
3. IFR bit latched
4. IMR bit = 1?
5. INTM bit = 0?
6. Interrupt begins
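The last three checks in the recognition flow reduce to a simple predicate on IFR, IMR and INTM. The sketch below is a hedged Python model of that gate for maskable interrupts only; `interrupt_begins` is a hypothetical name, and the pin-level timing (2 cycles high, 3 cycles low) is assumed to have already latched the IFR bit.

```python
def interrupt_begins(ifr, imr, intm, bit):
    """Return True if maskable interrupt `bit` would be taken.

    ifr, imr: register values; intm: 0 or 1 (ST1 bit 11);
    bit: position of the interrupt in IFR/IMR (0 = INT0 ... 8 = INT3).
    """
    flag_latched = bool(ifr & (1 << bit))
    unmasked = bool(imr & (1 << bit))
    return flag_latched and unmasked and intm == 0


# TINT (bit 3) pending and unmasked, global interrupts enabled:
print(interrupt_begins(ifr=0x008, imr=0x008, intm=0, bit=3))   # True
# Same flag, but INTM = 1 holds the interrupt off; the flag stays latched:
print(interrupt_begins(ifr=0x008, imr=0x008, intm=1, bit=3))   # False
```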

Post Interrupt Hardware Sequence

CPU Action        Description
1 → INTM          Disable global interrupts
PC → --*(SP)      Push PC onto predecremented stack
Vector(n) → PC    Load PC with int. vector "n" address
0 → IACK pin      IACK signal goes low
0 → IFR(n)        Clear corresponding interrupt flag bit


IACK Decoder

Addr    6 5 4 3 2 1 0
INT0    1 0 0 0 0 0 0
INT1    1 0 0 0 1 0 0
INT2    1 0 0 1 0 0 0
INT3    1 1 0 0 0 0 0

[Figure: 'C54x driving a '138 (3-to-8 demux). Select inputs C, B, A are fed from A5, A3, A2; enable G1 is fed from A6 and enable G2- from IACK-. Outputs: Y0 = IACK0, Y1 = IACK1, Y2 = IACK2, Y4 = IACK3.]

Note: For internal vectors set AVIS = 1.
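The decoder can be sanity-checked against the address table with a small model. This Python sketch assumes the wiring as read from the figure (C, B, A from A5, A3, A2; G1 from A6; G2- from IACK-); `iack_output` is a hypothetical name and the function models only the select logic of the '138, not its timing.

```python
def iack_output(vector_addr, iack_n):
    """Return the index of the asserted '138 output, or None if disabled.

    vector_addr: address on the bus during interrupt acknowledge.
    iack_n: level of the active-low IACK- pin (0 = acknowledging).
    """
    a6 = (vector_addr >> 6) & 1
    if not (a6 == 1 and iack_n == 0):     # G1 high and G2- low enable the part
        return None
    c = (vector_addr >> 5) & 1            # C <- A5
    b = (vector_addr >> 3) & 1            # B <- A3
    a = (vector_addr >> 2) & 1            # A <- A2
    return (c << 2) | (b << 1) | a


for name, offset in [("INT0", 0x40), ("INT1", 0x44),
                     ("INT2", 0x48), ("INT3", 0x60)]:
    print(name, "-> Y%d" % iack_output(0xFF80 + offset, iack_n=0))
```

The loop selects Y0, Y1, Y2 and Y4 for INT0 through INT3, which matches the IACK0 to IACK3 labels in the figure.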

Context Save & Restore Instructions

Instruction    Description
PSHM mmr       Push MMR onto stack; SP - 1 → SP
POPM mmr       Pop from stack to MMR; SP + 1 → SP
PSHD Smem      Push data memory value onto stack; SP - 1 → SP
POPD Smem      Pop top of stack to data memory; SP + 1 → SP
FRAME K        Modify stack pointer; SP + K → SP


Context Save

        .ref    ISR1
        .sect   ".vectors"
        ...
INT1:   BD      ISR1            ; delayed branch to the ISR
        PSHM    ST0             ; the two delay slots save the status registers
        PSHM    ST1

        .mmregs
        .def    ISR1
        .text
ISR1:   PSHM    AL              ; save accumulator A (low, high, guard)
        PSHM    AH
        PSHM    AG
        PSHM    AR1
        PSHM    IMR
        PSHM    PMST
        ; ISR FOLLOWS...

Context Restore

        ; ISR CONCLUDES...
        ; Context Restore:
        POPM    PMST
        POPM    IMR
        POPM    AR1
        POPM    AG
        POPM    AH
        POPM    AL
        POPM    ST1
        POPM    ST0
        RETF
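The restore sequence must pop in exactly the reverse order of the save, because the stack is last-in, first-out. A minimal Python model of the two sequences illustrates this; `context_save` and `context_restore` are hypothetical names, the stack is a plain dictionary keyed by address, and the register values are made up for the demonstration.

```python
SAVE_ORDER = ["ST0", "ST1", "AL", "AH", "AG", "AR1", "IMR", "PMST"]


def context_save(regs, memory, sp):
    """PSHM each register: predecrement SP, then store."""
    for name in SAVE_ORDER:
        sp -= 1
        memory[sp] = regs[name]
    return sp


def context_restore(regs, memory, sp):
    """POPM in reverse order: load, then postincrement SP."""
    for name in reversed(SAVE_ORDER):
        regs[name] = memory[sp]
        sp += 1
    return sp


regs = {name: 0x100 + i for i, name in enumerate(SAVE_ORDER)}
before = dict(regs)
memory = {}
sp = context_save(regs, memory, sp=0x0400)
for name in regs:                 # the ISR body is free to clobber everything
    regs[name] = 0
sp = context_restore(regs, memory, sp)
print(regs == before, hex(sp))    # True 0x400
```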


Return Instructions

Instruction   Actions                                   Cycles
RET[D]        *(SP) → PC, SP + 1 → SP                   RET 5, RETD 3
RETE[D]       *(SP) → PC, SP + 1 → SP, 0 → INTM         RETE 5, RETED 3
RETF[D]       RTN → PC, SP + 1 → SP, 0 → INTM           RETF 3, RETFD 1

Nested Interrupts

        ; Nestable ISR . . .
        PSHM    IMR             ; Save IMR
        STM     #5,IMR          ; Enable only interrupts 0 and 2
        RSBX    INTM            ; Enable interrupts (INTM = 0)
        ...
        SSBX    INTM            ; Disable interrupts (INTM = 1)
        POPM    IMR             ; Restore IMR value
        RETE


Vector Table Structure

        .sect   ".vectors"
RSV:    BD      Reset
        STM     #STK+LEN,SP
NMV:    Put NMI routine here ...

        .loop   40h-$           ; fill the words up to offset 40h with RETE
        RETE
        .endloop

IV1:    BD      ISR1
        PSHM    ST0
        PSHM    ST1
IV2:    BD      ISR2
        PSHM    ST0
        PSHM    ST1
        ...

Filling Empty Vectors

Standard Vector
IVn:    BD      ISRn
        PSHM    ST0
        PSHM    ST1

Unused Vector
IVn:    NOP
        NOP
        NOP
        NOP

Unused Vector - Debug
IVn:    BD      IVn
        NOP
        NOP

Unused Vector - Production
IVn:    XORM    #10b,*(IMR)
        RETE


NMI Interrupt

• Supersedes all regular activity.
• Can serve as ultra-high priority interrupt event.
• Ignores state of INTM and IMR.
• Sets INTM=1.

- Cannot supersede RPT or not RDY.
- Can slow time-critical interrupt response.
- Can interrupt itself.
- Can lead to ambiguous return state.

Using Address for NMI Return Status

Put main and ISR code in separate areas.

[Figure: program memory map showing the ISRs, Vectors and Main regions, with the label Ints_Off marking the boundary used by the test below.]

When about to return from NMI:

        POPM    AL              ; Get return address
        PSHM    AL              ; Put return address back
        SUB     #Ints_Off,A     ; Is return address in ISR region?
        RETC    GEQ             ; If yes, return w/o clearing INTM
        RETE                    ; Else clear INTM (allow interrupts)


Using Flag for NMI Return Status

VecN:   BD      IsrN            ; Jump to ISR in 2 words
        SSBX    TC              ; Set flag (non-interruptible)
        NOP                     ; Room for 1 more...

IsrN:   ...
        RETED                   ; Return from ISR in 2 cycles
        RSBX    TC              ; Clear flag
        NOP                     ; Last word...

NMI:    ...                     ; NMI ISR code starts here
        RETC    TC              ; If TC=1, return to ISR with INTM=1
        RETE                    ; Else allow interrupts and return

Fast Interrupts

Allows 3-cycle ISR, e.g.:

RINT0:  NOP
        RETFD
        MVKD    DRR0,*AR7+%

• Only 2 words of code may follow RETFD.
• One word may precede the RETFD.
• Creates "unsupervised" action.


INTR and TRAP Instruction

[Label] INTR k    ; 0 ≤ k ≤ 31
[Label] TRAP k    ; 0 ≤ k ≤ 31

INTR = TRAP + (1 → INTM)

k      Interrupt   Offset        k      Interrupt   Offset
0      RS          0h            16     INT0        40h
1      NMI         4h            17     INT1        44h
2      SINT17      8h            18     INT2        48h
3      SINT18      Ch            19     TINT        4Ch
4      SINT19      10h           20     RINT0       50h
5      SINT20      14h           21     XINT0       54h
...    ...         ...           22     RINT1       58h
15     SINT30      3Ch           23     XINT1       5Ch
                                 24     INT3        60h
                                 25     Reserved    64h
                                 ...    ...         ...
                                 31     Reserved    7Ch

RESET Instruction

(IPTR)<<7 → PC
0 → IFR
0 → CLOCK OFF (C541)

Math          Misc          Memory          Pins
0 → OVA       0 → OVM       0 → OVLY        1 → XF
0 → OVB       0 → BRAF      0 → DROM        1 → CLKEN
0 → OVM       0 → DP        ? → MP/MC-      0 → AVIS
1 → C         0 → CPL                       0 → HM
0 → C16       0 → CMPT
0 → ASM       0 → ARP
0 → FRCT      1 → INTM
1 → SXM


Interrupt Vector Address

Interrupt vector address (shown for Reset):

Bits     15-7 (from IPTR)    6-0 (vector offset)
Value    1 1111 1111         000 0000

PMST:

Bits     15-7    6        5      4      3      2        1-0
Field    IPTR    MP/MC-   OVLY   AVIS   DROM   CLKOFF   Reserved

IPTR = 1 1111 1111 (1FFh) in the figure.
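The vector address is the 9-bit IPTR field placed in bits 15-7, with the 7-bit vector offset below it. The following Python sketch works this out for a few cases; `vector_address` is a hypothetical name, and the offset is taken as 4 x k, which is what the INTR/TRAP table implies (k = 16 gives 40h, k = 31 gives 7Ch).

```python
def vector_address(iptr, k):
    """Return the program address of interrupt vector k.

    iptr: 9-bit interrupt vector pointer from PMST bits 15-7.
    k:    interrupt number, 0..31 (each vector is 4 words long).
    """
    if not 0 <= iptr <= 0x1FF:
        raise ValueError("IPTR is a 9-bit field")
    if not 0 <= k <= 31:
        raise ValueError("k must be in the range 0..31")
    return (iptr << 7) | (k << 2)


print(hex(vector_address(0x1FF, 0)))    # 0xff80  reset vector after hardware reset
print(hex(vector_address(0x1FF, 16)))   # 0xffc0  INT0
print(hex(vector_address(0x1FF, 19)))   # 0xffcc  TINT
print(hex(vector_address(0x001, 17)))   # 0xc4    INT1 with the table moved to 0080h
```

The first line reproduces the FF80h reset fetch address given on the Hardware Reset Actions slide.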

Timer Block Diagram

[Figure: CLKOUT feeds the 4-bit prescaler counter PSC, which is paired with the 4-bit TDDR; PSC feeds the 16-bit counter TIM, which is paired with the 16-bit PRD and produces TINT.]

TINT rate = 1 / (TCLK1 x (TDDR+1) x (PRD+1))
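As a worked example of the TINT rate formula, the sketch below computes the interrupt rate for one set of values. It is Python rather than target code; `tint_rate_hz` is a hypothetical name, TCLK1 is assumed to be the period of the timer input clock in seconds, and the 25 ns period and PRD value are chosen only for illustration.

```python
def tint_rate_hz(tclk1_s, tddr, prd):
    """Return the TINT rate in Hz.

    tclk1_s: timer input clock period in seconds (assumed meaning of TCLK1).
    tddr:    4-bit timer divide-down ratio (0..15).
    prd:     16-bit period register value (0..65535).
    """
    if not 0 <= tddr <= 15:
        raise ValueError("TDDR is a 4-bit field")
    if not 0 <= prd <= 0xFFFF:
        raise ValueError("PRD is a 16-bit register")
    return 1.0 / (tclk1_s * (tddr + 1) * (prd + 1))


# 25 ns clock, no prescaling, PRD = 39999 -> one interrupt every 40000 clocks
rate = tint_rate_hz(25e-9, tddr=0, prd=39999)
print(round(rate))    # 1000
```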


Timer Control Register

[Figure: register layout, bits 15 down to 0, with fields Reserved, PSC, TRB, TSS and TDDR; callouts "enable" and "start / stop" point at the control bits.]

PSC     Timer prescaler counter
TDDR    Timer divide-down ratio
TSS     1 = stop timer, 0 = timer run
TRB     1 = load TIM from PRD

Lab 10 - Interrupt Driven Event

VECTORS.ASM
        RS
        TINT

1. LAB9_M.ASM
START:  CALL    TIMER_INIT
        CALL    SINE_INIT
        ENABLE INTS
MAIN:   A = A + 1
        LOOP

2. LAB9_T.ASM
        TIMER_INIT
        RET

3. LAB9_S.ASM
        SINE_INIT
        RET

        CONTEXT SAVE
        SINE_ISR
        OUT TO PORT0
        CONTEXT RESTORE
        RET

OUT.DAT

MODIFY LAB9.CMD


Lab Procedure

• Write code and get files talking
• Verify that interrupts are working
• Get sine wave working (watch DP)
• Perform/verify full context save/restore

Review

• What are the interrupt sources?
• How do you poll for interrupts?
• What must you set up to respond to an interrupt?
• What conditions affect interrupt latency?


Hardware Interfacing

Learning Objectives

Objectives

• Describe the purpose of each interface pin.
• Connect the 'C54x to various memory and peripheral devices.
• Identify the key timing for external reads and writes.
• Implement software wait states.


Module 11

Interfacing Memory and Peripherals

[Figure: a TMS320C54x connected to external DATA, PGM and I/O devices over a shared 16-bit ADDRESS bus (A) and 16-bit DATA bus (D), with one device's data lines shown split as (8-15) and (0-7). PS, DS and IS each drive the CS1 input of one device; MSTRB and IOSTRB drive the CS2 inputs; R/W drives WE; OE is tied to GND.]

Read Timing

[Figure: timing diagram of address, data and MSTRB- over one cycle (0 to 25 ns), with intervals 1, 2 and 3 marked.]

#    Parameter        Cycle Time 25
1    Setup Address    5
2    Data Valid       5
3    Memory Speed     15

Notes:
1. Address timing also includes the PS, DS, IS and MSTRB signals.
2. All times are in nanoseconds.
3. H = one-half CLOCKOUT1 cycle time.
4. MSTRB stays low across reads.


Bus Collision Avoidance

• Only one external memory: no collision possible
• Multiple external memories
  - Sequential reads within one memory: MS address lines don't change. Only one memory will respond, so no bus collision is possible.
  - Sequential reads across memories: MS address lines change
    · Multiple devices may respond. This can yield early data collisions, or the new device may not turn on in time, especially if a demux is used to feed CS-
    · Won't corrupt read data, since the 54x reads only at the end of the cycle
    · Won't damage memory, since the event is brief
    · Can yield noise, wastes power
• Solutions
  - If noise/power is not a concern: no problem
  - Use faster memory (higher cost, power, etc.)
  - Add a wait state only when reading across devices (how?)

BSCR: Bank Switch Control Register

Bits     15-12     11       10-0
Field    BNKCMP    PS-DS    res.

BNKCMP value    MSBs compared    Bank Size
0 0 0 0         None             64K
1 0 0 0         15               32K
0 1 0 0         15 - 14          16K
0 0 1 0         15 - 13          8K
0 0 0 1         15 - 12          4K

Note: Use only specified values of BNKCMP.

Bit 11: If PS-DS is set, add 1 wait state when access changes between PS and DS.
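The bank-switch logic amounts to comparing the most significant address bits of consecutive external accesses and adding a wait state when they differ. The Python sketch below models only that comparison; `crosses_bank` is a hypothetical name, and it is parameterized by bank size in words rather than by the BNKCMP encoding.

```python
def crosses_bank(prev_addr, addr, bank_size_words):
    """Return True if two consecutive 16-bit addresses lie in different banks.

    bank_size_words must be a power of two (4K, 8K, 16K, 32K or 64K).
    """
    if bank_size_words & (bank_size_words - 1):
        raise ValueError("bank size must be a power of two")
    shift = bank_size_words.bit_length() - 1   # number of address bits inside a bank
    return (prev_addr >> shift) != (addr >> shift)


K = 1024
print(crosses_bank(0x7FFF, 0x8000, 32 * K))   # True:  bit 15 differs
print(crosses_bank(0x1000, 0x1FFF, 4 * K))    # False: bits 15-12 are equal
print(crosses_bank(0x0000, 0xFFFF, 64 * K))   # False: no MSBs are compared
```

With a 64K bank no address bits are compared, which corresponds to the "None" row of the table.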


Interface Comparison: 2 vs 4 Phase

Device    Phases    I.Rate (ns)    SRAM (ns)    Ratio
C10       4         200            75           38%
C25       4         100            35           35%
C50       2         50             32           64%
C54x      2         25             15           60%

Four-phase systems allow about 1/3 cycle for memory, while the two-phase approach offers about 2/3 cycle.

Memory: Cost vs Speed
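The Ratio column is the SRAM access time divided by the instruction cycle time. A short Python check, assuming both columns are in nanoseconds (`memory_window_percent` is a hypothetical name):

```python
# (instruction cycle time, required SRAM access time), taken from the table.
DEVICES = {
    "C10":  (200, 75),
    "C25":  (100, 35),
    "C50":  (50, 32),
    "C54x": (25, 15),
}


def memory_window_percent(cycle, sram):
    """Return the share of the instruction cycle available to the memory, in %."""
    return round(100 * sram / cycle)


for name, (cycle, sram) in DEVICES.items():
    print(name, memory_window_percent(cycle, sram), "%")
```

The two four-phase parts land near one third of the cycle and the two two-phase parts near two thirds, which is the point of the comparison.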

Memory Interface Protocols

• Four-phase memory interface is an industry standard, as shown in the diagram below:
• The first phase is a strobe off time to allow the address to stabilize
• Phase two is strobe on, with valid address
• Phase three is for sending/receiving data
• In the fourth phase the strobe goes off to latch data, allow data hold time, and to relinquish the bus before the next memory cycle

[Figure: address, data and strobe waveforms, with phases marked "- A - D".]


Memory Interface Protocols

• Read cycles do not require the dead phases, though.
• Since the 54x is the only device 'listening' on the bus, it can latch data near the end of "D" time and not be confused by any spurious early data, nor require an explicit data strobe signal.
• By eliminating the 'dead' phases, the 54x is able to offer a much larger time window to external memory, as seen previously.

[Figure: address, data and strobe waveforms, with phases marked "- A - D".]

Memory Interface Protocols

• Write cycles do require the dead phases, however.
• Since here the external memory is 'listening', the protocol must guard against attempting a write before the address is stabilized (the first 'off' time) and provide a data latch and hold time (the last 'off' time).
• Therefore, writes must have a four-phase protocol.
• These two 'off' phases must be implemented with an extra CPU cycle each.

[Figure: address, data and strobe waveforms, with phases marked "- A - D".]


External Interface Write Timing

[Figure: timing diagram of ADDR, DATA, MSTRB and R/W across a read (R), a dead cycle, a write (W) and a dead cycle, showing read address/read data and write address/write data. Marked intervals include 5, 15, 2H-5 and 4H-2, with parameters 1, 2 and 3 called out.]

#    Parameter                                        Cycle Time 25    Cycle Time 20
1    Address valid to MSTRB (min)                     2H-5 = 20        15
2    Data valid before MSTRB (min) (setup time)       2H-5 = 20        15
3    Data valid after MSTRB (min) (hold time)         H-5 = 8          5

Notes:
1. All times are in nanoseconds.
2. H = one-half CLOCKOUT1 cycle time.
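The three write-timing parameters follow directly from H, so the table can be regenerated for either cycle time. This is a small Python sketch with a hypothetical function name; note that for a 25 ns cycle H-5 evaluates to 7.5 ns, which the table shows rounded to 8.

```python
def write_timing_ns(cycle_ns):
    """Return the minimum write-timing parameters (ns) for a given cycle time."""
    h = cycle_ns / 2                       # H = one-half CLOCKOUT1 cycle time
    return {
        "addr_valid_to_mstrb": 2 * h - 5,  # parameter 1
        "data_setup": 2 * h - 5,           # parameter 2
        "data_hold": h - 5,                # parameter 3
    }


for cycle in (25, 20):
    print(cycle, write_timing_ns(cycle))
```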

Three Cycle Write Overhead

• The two-phase memory interface reduces memory cost, but yields a three-cycle write.
• How much performance is lost as a result?
• Consider the usual activity of DSP: sum of products.

Order    Data    Coeffs    Code    Writes    Cycles    Ops    A.B.R.
20       20      20        10      1         53        51     1.04
50       50      50        10      1         113       111    1.02
100      100     100       10      1         213       211    1.01

• In these examples we can see that the overhead due to external writes is small for normal DSP functions.
• Can the overhead be reduced further if necessary?
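The Cycles, Ops and A.B.R. columns can be reproduced from the other columns if each data, coefficient and code access is counted as one cycle and the single external write as three. The Python sketch below makes that reading explicit; the counting rule and the function name are assumptions inferred from the table values, not a statement from the slide.

```python
def sum_of_products_overhead(order, code=10, writes=1, cycles_per_write=3):
    """Return (cycles, ops, ratio) for a sum of products of the given order."""
    ops = order + order + code + writes                        # data + coeffs + code + writes
    cycles = order + order + code + writes * cycles_per_write  # each write costs 3 cycles
    return cycles, ops, round(cycles / ops, 2)


for order in (20, 50, 100):
    print(order, sum_of_products_overhead(order))
# 20 (53, 51, 1.04)
# 50 (113, 111, 1.02)
# 100 (213, 211, 1.01)
```

The two extra cycles per result are amortized over the whole sum, so the ratio approaches 1 as the order grows.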


Write Timing Details

• Single write requires three cycles.
• Chained writes use 2N+1 cycles.
• Single write is only one CPU cycle.
• Internal write is one cycle.

Note: writing an array of internal memory to external memory is full speed: while the external bus is dead, the CPU reads the next datum from internal memory. Thus N reads and N writes take ~2*N cycles.
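The bus-cycle counts quoted above can be tabulated quickly. In this hedged Python sketch `chained_write_cycles` is a hypothetical name that simply applies the 2N+1 rule from the slide; it does not model the pipeline.

```python
def chained_write_cycles(n):
    """Return the external bus cycles for n back-to-back external writes."""
    if n < 0:
        raise ValueError("n must be non-negative")
    return 2 * n + 1 if n else 0


for n in (1, 2, 8, 64):
    total = chained_write_cycles(n)
    print(n, total, round(total / n, 2))   # writes, cycles, cycles per write
```

A single write gives 3 cycles, and the cost per write tends toward 2 as the chain gets longer, consistent with the ~2*N figure for copying an internal array to external memory.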

IO Memory Timing

[Figure: timing diagram over 0 to 50 ns (marks at 0, 12, 25, 37 and 50) showing CLKOUT, A(15-0) and R/W-, read data, write data and IOSTRB-. Marked intervals include 2, 5, 7 and 27 ns.]


Software Wait States

[Figure: memory map with I/O Mem, Lo Data, Hi Data, Lo Prog and Hi Prog regions, each tied to one SWWSR field.]

SWWSR: Software Wait State Register. Data Addr: 0028h

Bits     15    14-12    11-9       8-6         5-3        2-0
Field    R     I/O      Hi Data    Low Data    Hi Prog    Low Prog

• 3-bit fields = 0 to 7 software wait states (SW-WS)
• On reset, all P/D memory is 7 WS (SWWSR = 7FFFh).
• On last SW-WS: MSC- will go LOW for 1 cycle.
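A rough way to choose SWWSR values is to ask how many extra cycles a given memory needs and then pack the five 3-bit fields. The Python sketch below rests on two assumptions that are not stated on the slide: that the fields sit at bits 14-12, 11-9, 8-6, 5-3 and 2-0 in the order listed, and that each wait state adds one full 25 ns cycle to the 15 ns zero-wait access window. `pack_swwsr` and `wait_states_needed` are hypothetical names.

```python
import math

# Assumed field positions (shift of the least significant bit of each field).
FIELD_SHIFT = {"low_prog": 0, "hi_prog": 3, "low_data": 6, "hi_data": 9, "io": 12}


def pack_swwsr(**wait_states):
    """Pack per-region wait-state counts (0..7) into a SWWSR value."""
    value = 0
    for name, count in wait_states.items():
        if name not in FIELD_SHIFT:
            raise KeyError("unknown SWWSR field: " + name)
        if not 0 <= count <= 7:
            raise ValueError("each field holds 0 to 7 wait states")
        value |= count << FIELD_SHIFT[name]
    return value


def wait_states_needed(access_ns, cycle_ns=25, zero_ws_ns=15):
    """Estimate the wait states for a memory with the given access time."""
    return max(0, math.ceil((access_ns - zero_ws_ns) / cycle_ns))


print(hex(pack_swwsr(io=7, hi_data=7, low_data=7, hi_prog=7, low_prog=7)))  # 0x7fff
print(wait_states_needed(15))    # 0
print(wait_states_needed(70))    # 3
print(wait_states_needed(200))   # 8, more than a 3-bit field can hold
```

Under this model a 200 ns device needs more than the 7 wait states a field can supply, which is one case where hardware wait states come into play.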

Hardware Wait States

• Software wait states may not be sufficient for all systems.
• Therefore hardware wait states may be used when:
  - More than 7 wait states are required
  - More than 2 speeds of memory exist in a map
  - Variable wait-states exist
• Hardware wait-states are not considered for 0 and 1 SWWS areas.
• For 2-7 SWWS areas, the MSC- (Micro State Complete) pin falls at the end of the last SWWS. Therefore, this signal may be used to indicate that n cycles have already transpired, upon which external delay may be added, if required.
• Hardware wait is completed by a high signal input into the READY pin.
  - READY is ignored for 0 and 1 SWWS areas
  - READY is sampled on falling CLOCKOUT1 (mid-cycle)
  - READY is not sampled before MSC- falls

Mixed Wait-State Example

[Figure: a 25 ns TMS320C54x connected over the 16-bit ADDRESS and DATA buses to a 15 ns SRAM and a 200 ns EPROM, mapped as Hi Pgm and Lo Pgm. Signals shown: PS-, MSTRB-, A15, READY, MSC- and CLOCKOUT1; the chip selects (CS-) are formed with OR gates, and a D flip-flop (FF) is part of the wait logic. SWWSR fields (R, I/O, Hi Data, Low Data, Hi Prog, Low Prog) are shown as 0, 7, 1, 1, x, 4.]

Memory Timing Summary

• All internal accesses are single cycle.
• Internal DARAM may perform two accesses per cycle.
• External-read timing is biased for read access times:
    - 15 ns for 25-ns devices
• Write-cycle timing (cost/performance tradeoffs):
    - 3 cycles for a single write
    - 2 cycles per write with multiple writes
    - 1 CPU cycle to initiate bus cycle(s)
• Software-generated wait states allow for slower memories.
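A rough way to relate memory access time to wait states is sketched below in Python. It is only an estimate under stated assumptions: a zero-wait-state read is taken to allow 15 ns on a 25-ns device (the figure quoted above), each wait state is assumed to add one full CPU cycle, and address-decode and buffer delays are ignored. The function name `software_wait_states` is hypothetical.

```python
import math


def software_wait_states(access_ns, cycle_ns=25.0, zero_ws_ns=15.0):
    """Estimate the wait states needed for a memory with this access time.

    Assumes each wait state extends the read window by one CPU cycle and
    ignores decode and buffer delays, so real designs need extra margin.
    """
    if access_ns <= zero_ws_ns:
        return 0
    return math.ceil((access_ns - zero_ws_ns) / cycle_ns)


for name, t_acc in [("15 ns SRAM", 15), ("200 ns EPROM", 200)]:
    ws = software_wait_states(t_acc)
    source = "software" if ws <= 7 else "hardware wait (READY) also needed"
    print(f"{name}: {ws} wait states, {source}")
```

A result above 7 cannot be reached by the software wait-state generator alone, which is the case the hardware wait mechanism addresses.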

Review Questions

• What signals select program, data or I/O?
• How many wait states can the software wait-state generator assign?
• On what size boundaries are software wait states assigned in program, data and I/O?
• What is the advantage of slower write cycles?

Lab 11: Hardware Interface

[Figure: a TMS320C54x-40 with 16-bit ADDRESS and DATA buses and the control signals PS-, DS-, IS-, MSTRB-, IOSTRB-, R/W- and MP/MC-, connected to three devices: PROG, an 8K EPROM (70 ns); DATA, an 8K SRAM (15 ns); and 1 ADC & 1 DAC (120 ns). Each device shows CS1-, CS2-, OE-, A and D pins; the SRAM and the ADC/DAC also show WE-.]

Fill in the SWWSR fields (I/O, Hi Data, Low Data, Hi Prog, Low Prog) and complete the instruction:

    STM   #________, SWWSR


Ports

Learning Objectives

• List the available types of ports
• Demonstrate how to initialize each port to a given status
• Demonstrate how to connect each port to external devices
• Write code to send and receive data to a given port
• Describe when and how a given port is best utilized

Module 12

TMS320C54x Ports

• Standard Serial Port
• Buffered Serial Port (BSP)
• TDM Serial Port
• Host Port Interface (HPI)

Standard Serial Port

SP Pins and Signals

          Transmit   Receive
Clock     CLKX       CLKR
Frame     FSX        FSR
Data      DX         DR

Dual 54x Serial Port Interconnect

[Figure: two '54x devices, #1 and #2, with their serial ports connected to each other. Each device shows the pins CLKX, CLKR, FSX, FSR, DX and DR.]

Serial Port Diagram

[Figure: serial port block diagram. The CPU reaches DRR, DXR and SPC over the Data Bus. On the receive side DR feeds RSR and DRR; on the transmit side DXR feeds XSR and DX. The Control Logic block connects to FSR, CLKR, CLKX and FSX and produces RINT and XINT.]

Serial Transmit Timing Example

[Timing diagram: signals CLKX, FSX, DX and XINT. DX carries D7 D6 D5 D4 D3 D2 D1 D0 followed by E7. Events marked: D → DXR, D → XSR, E → DXR, E → XSR.]

Maximum Data Rate Example

[Timing diagram: signals CLKX, FSX, DX and XINT with words sent back to back. DX carries C0, D7 D6 D5 D4 D3 D2 D1 D0, E7 E6 E5 E4. Events marked: D → XSR, E → DXR, E → XSR, F → DXR.]

Frame Types - Burst/Continuous

[Figure: frame sync and data waveforms for data words 1, 2 and 3 in Continuous and Burst modes.]

FSM = 1 : Burst
FSM = 0 : Continuous

Serial Port Control Register

BIT   NAME        Function              0 State     1 State     RS
0     N/A         Std / enhance mode    Std         Enh         0
1     DLB         Digital Loopback      Run         Test        0
2     FO          Format                16 b.       8 b.        0
3     FSM         Frame Synch Mode      Cont.       Burst       0
4     MCM         Master Clock Mode     Ext'l       I.Rate/4    0
5     TXM         Transmit Mode         Follow      Lead        0
6     XRST-       Transmit Reset        Reset       Run         0
7     RRST-       Receive Reset         Reset       Run         0
8     IN0         Input value on CLKR   In.Val=0    In.Val=1    x
9     IN1         Input value on CLKX   In.Val=0    In.Val=1    x
10    RRDY        Rcv Ready = RINT      No Data     Ready       0
11    XRDY        Xmit Ready = XINT     DXR Full    Ready       1
12    XSREMPTY-   Xmit Shft Reg Empty   OK          Error       0
13    RSRFULL     Rcv Shft Reg Full     OK          Error       0
14    FREE        Free Run on Break     Halt        FreeRun     0
15    SOFT        Soft stop on Break    Hard        Soft        0

(RS = value after reset)
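To show how individual control bits turn into the constant loaded into SPC, here is a minimal Python sketch. The helper `spc_value` and its keyword names are hypothetical; the bit numbers are taken from the register table, and only the writable mode bits 1 through 7 are modelled. The example configuration (8-bit words, external clock and frame sync, continuous mode) is arbitrary.

```python
# Bit positions of the writable SPC mode bits, as listed in the table.
SPC_BITS = {
    "DLB": 1,   # digital loopback
    "FO": 2,    # 0 = 16-bit, 1 = 8-bit
    "FSM": 3,   # 0 = continuous, 1 = burst
    "MCM": 4,   # 0 = external clock, 1 = internal clock
    "TXM": 5,   # 0 = FSX is an input, 1 = FSX is an output
    "XRST": 6,  # 0 = transmitter in reset, 1 = running
    "RRST": 7,  # 0 = receiver in reset, 1 = running
}


def spc_value(**fields):
    """Combine named single-bit fields into an SPC register value."""
    value = 0
    for name, flag in fields.items():
        if name not in SPC_BITS:
            raise KeyError(f"unknown SPC field: {name}")
        if flag:
            value |= 1 << SPC_BITS[name]
    return value


# First write: set the modes while both sections stay in reset.
halted = spc_value(FO=1)
# Second write: release the transmitter and receiver from reset.
running = halted | spc_value(XRST=1, RRST=1)
print(f"{halted:04X}h then {running:04X}h")
```

Splitting the setup into a halted value and a running value mirrors the two-write initialization the serial port expects.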

Serial Port Exercise

• Initialize the '541 XMIT port 0 as follows:
    - Frame synch is internal and burst mode
    - Word size is 16 bit
    - Run at the fastest possible speed.

        SSBX  INTM              ; Disable ints
        ORM   #______, IMR      ; Enable Xmit interrupt (XINT)
        STM   #______, SPC      ; Halt SP and configure SP control reg
        STM   #______, IFR      ; Clear any old Xmit ints (XINT)
        ORM   #______, SPC      ; Start Xmit process
        RSBX  INTM              ; Enable ints

Serial Port Exercise (Solution)

• Initialize the '541 XMIT port 0 as follows:
    - Frame synch is internal and burst mode
    - Word size is 16 bit
    - Run at the fastest possible speed.

        SSBX  INTM              ; Disable ints
        ORM   #0020h, IMR       ; Enable Xmit interrupt (XINT)
        STM   #00B8h, SPC       ; Halt SP and configure SP control reg (FO = 0: 16 bit)
        STM   #0020h, IFR       ; Clear any old Xmit ints (XINT)
        ORM   #0040h, SPC       ; Start Xmit process
        RSBX  INTM              ; Enable ints

Serial Port Caveats

• On reset DX = HiZ; all other pins are inputs
• When initializing the SP, use two writes:
    - One to halt the SP and set desired modes
    - Second to take the SP out of reset
• In Continuous mode XMIT stops if no new data is present
    - XMIT restarts when new data is written to DXR
    - FSX is asserted to indicate new packet initiated
• After last bit sent, DX = HiZ
    - Allows other devices to share bus
    - Should add pullup to avoid chatter
• For self-test (DLB)
    - Use TXM = 1
    - If MCM = 1, CLKX -> CLKR
    - If MCM = 0, CLKR -> CLKX
• For low power: clear MCM, XRST, RRST

Buffered Serial Port (BSP)

[Figure: buffered serial port block diagram. The standard SP path (DR, DX, DRR, DXR, RINT, XINT) is shown with the CPU running RCV-ISR and XMT-ISR. In the buffered path the autobuffering unit (ABU) uses the 11-bit registers ARR, AXR, BKR and BKX to move data over DBUS between the serial port and data memory (DMEM) in the block from 800h to 1000h, organized as R-Ping, R-Pong, X-Ping and X-Pong buffers, each with its RINT or XINT.]

Serial Port Control Expansion Register

BIT    NAME    Function                 0 State     1 State       RS
0-4    CLKDV   Clock Divisor            1 : 1       1 : n+1       3
5      FSP     Frame Sync Polarity      Active Hi   Active Lo     0
6      CLKP    Clk Polarity : XMIT on   Rising      Falling       0
7      FE      Format Extension         16/8        10/12         0
8      FIG     Frame Ignore             See 2nd     No 2nd        0
9      PCM     Pulse Code Mod'n         Normal      neg=HiZ       0
10     BXE     Buffer XMIT Enable       Std SP      BSP on        0
11     XH      XMIT Half                in 2nd      in 1st        0
12     HALTX   Halt XMIT                Continue    Halt on Int   0
13     BRE     Buffer RCV Enable        Std SP      BSP on        0
14     RH      RCV Half                 in 2nd      in 1st        0
15     HALTR   Halt RCV                 Continue    Halt on Int   0

(RS = value after reset)
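The expansion register mixes a 5-bit divisor with single-bit flags, so a small builder helps avoid shift mistakes. The Python sketch below is illustrative only: `bspce_value` and its parameter names are hypothetical, the bit positions come from the table, and XH and RH are left out on the assumption that they are status bits set by the hardware.

```python
def bspce_value(clkdv=0, fsp=0, clkp=0, fe=0, fig=0, pcm=0,
                bxe=0, haltx=0, bre=0, haltr=0):
    """Build a BSPCE value from the divisor and the control flags."""
    if not 0 <= clkdv <= 31:
        raise ValueError("CLKDV is a 5-bit field (0..31)")
    value = clkdv  # bits 4-0
    flags = ((5, fsp), (6, clkp), (7, fe), (8, fig), (9, pcm),
             (10, bxe), (12, haltx), (13, bre), (15, haltr))
    for bit, flag in flags:
        if flag:
            value |= 1 << bit
    return value


# Example: receive autobuffering on, halt at the next half-buffer
# boundary, clock divisor field set to 3.
print(f"{bspce_value(clkdv=3, bre=1, haltr=1):04X}h")
```

Setting `haltr=1` alone gives 8000h, the mask an interrupt routine can OR into BSPCE to stop reception after the current half buffer.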

Buffered Serial Port Exercise

Initialize the Serial Port to Transmit using:

    FSM = Burst         TXM = Ext'l          Polarities = 0
    Clock = Int'l       Rate = Max.          Format = 10 bit
    PCM = off           ABU(X) = on          Halt = off
    Rcv. Locn = 800h    Array size = 100h

        ____  ____              ; Disable interrupts
        ____  #____h, ____      ; Work on MMRs
        ____  #____h, ____      ; Enable XMIT int (XINT)
        ____  #____h, ____      ; Config. SPC (XRST = 0)
        ____  #____h, ____      ; Config. SPCE (ABU on)
        ____  #____h, ____      ; Init AXR to start of buffers
        ____  #____h, ____      ; Init buffer size
        ____  #____h, ____      ; Clear any old XINT
        ____  #____h, ____      ; Start SP XMIT
        ____  ____              ; Enable interrupts

Buffered Serial Port Solution

Initialize the Serial Port to Transmit using:

    FSM = Burst         TXM = Ext'l          Polarities = 0
    Clock = Int'l       Rate = Max.          Format = 10 bit
    PCM = off           ABU(X) = on          Halt = off
    Rcv. Locn = 800h    Array size = 100h

        SSBX  INTM              ; Disable interrupts
        LD    #0000h, DP        ; Work on MMRs
        ORM   #0020h, IMR       ; Enable XMIT int (XINT)
        STM   #0098h, BSPC      ; Config. SPC (XRST = 0)
        STM   #0480h, BSPCE     ; Config. SPCE (ABU on)
        STM   #0800h, AXR       ; Init AXR to start of buffers
        STM   #0200h, BKX       ; Init buffer size
        ORM   #0020h, IFR       ; Clear any old XINT
        ORM   #0040h, BSPC      ; Start SP XMIT
        RSBX  INTM              ; Enable interrupts

Interrupt Service Routine for BSP

Respond to RINT of Buffered Serial Port with iteration counter

Basic ISR:

RINT    LD    #0000h, DP        ; Work on MMRs
        BIT   1, BSPCE          ; Extract RH: ping or pong?
        ...                     ; Do (at least) two other words...
        XC    2, TC             ; If in ping, reset AR7 to top of ping
        STM   #0800h, AR7       ; AR7 = top of ping
        ...                     ; Process current array...

With an iteration counter:

cntr    .usect "SPRAM", 1       ; Allocate one word in SPRAM
        .text
        ANDM  #0000h, *(cntr)   ; Clear counter location
        ...                     ; Other code...

RINT    LD    #0000h, DP        ; Work on MMRs
        BIT   1, BSPCE          ; Extract RH: ping or pong?
        XC    2, TC             ; If in ping, reset AR7 to top of ping
        STM   #0800h, AR7       ; AR7 = top of ping
        ADDM  #0001h, cntr      ; Increment counter
        CMPM  #count, cntr      ; Is counter at next to last iteration?
        XC    2, TC             ; If so, tell ABU to
        ORM   #8000h, BSPCE     ; stop after next RCV of next array
        ...                     ; Process current array...

With the counter instructions moved so that two words separate each test from its XC:

cntr    .usect "SPRAM", 1       ; Allocate one word in SPRAM
        .text
        ANDM  #0000h, *(cntr)   ; Clear counter location
        ...                     ; Other code...

RINT    LD    #0000h, DP        ; Work on MMRs
        BIT   1, BSPCE          ; Extract RH: ping or pong?
        CMPM  #count, cntr      ; Is counter at next to last iteration?
        XC    2, TC             ; If in ping, reset AR7 to top of ping
        STM   #0800h, AR7       ; AR7 = top of ping
        ADDM  #0001h, cntr      ; Increment counter
        XC    2, TC             ; If so, tell ABU to
        ORM   #8000h, BSPCE     ; stop after next RCV of next array
        ...                     ; Process current array...

ABU Caveats

• BKX, BKR range: min = 2, max = 2047 (not 2048)
• Buffers must be aligned on a 2^N > BK boundary
• Odd size arrays have ping = pong + 1
• AXR, ARR, BKX, BKR are all 11-bit registers
• If AXR and ARR are not initialized to the base of the 'ping' arrays, the 1st data set will be incomplete
• Xmit and Rcv arrays can overlap to extend array size
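The size and alignment rules can be checked before committing a buffer layout. The Python sketch below is a minimal illustration with hypothetical helper names. It assumes the alignment boundary is the smallest power of two strictly greater than BK, and that only the low 11 bits of the buffer address matter because AXR and ARR are 11-bit registers.

```python
def abu_boundary(bk):
    """Smallest power of two strictly greater than the block size BK."""
    if not 2 <= bk <= 2047:
        raise ValueError("BKX/BKR must be in the range 2..2047")
    return 1 << bk.bit_length()


def check_abu_buffer(base, bk):
    """Report alignment and ping/pong sizes for a buffer at `base`."""
    boundary = abu_boundary(bk)
    offset = base & 0x7FF  # assumed: the 11-bit register holds this part
    return {
        "aligned": offset % boundary == 0,
        "boundary": boundary,
        "ping_words": (bk + 1) // 2,  # odd sizes: ping = pong + 1
        "pong_words": bk // 2,
    }


print(check_abu_buffer(0x0800, 0x200))
print(check_abu_buffer(0x0900, 0x200))
```

The second call reports a misaligned buffer, since 0900h does not sit on the 400h boundary required for a 200h-word block.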

Using ABU with Overlapping Arrays

• Initiate ABU to receive Array1 in 'ping'
• On RINT begin to process Array1
• When finished initiate ABU to send Array1, wait for RINT
• On RINT begin to process Array2 in 'pong'
• ABU RCV will return to 'ping', but will not pass ABU XMIT if rates are equal
• ABU Xmit may surpass CPU, so XHALT is recommended
• RHALT should not be necessary

[Figure: timeline with rows Input, Process and Output, showing arrays 1 through 6 alternating between half buffers A and B (1 A, 2 B, 3 A, 4 B, 5 A, 6 B) as they move from input to processing to output.]

TDM Serial Port

• Eight or more devices may share the bus.
• Any '54x may own multiple slices and/or listener IDs.
• During its slice, a '54x may talk to any combination of listeners.
• May also be used as a regular serial port.

[Figure: TDM bus with time slices 0 through 7.]

TDM Four-Wire Bus

[Figure: TMS320C54x devices 0, 1, ... 7 sharing four bus lines. Pins shown against each line: TCLK with TCLKX and TCLKR; TFRM with TFSX; TDAT with TDX and TDR; TADD with TFSR.]

TDM Signals

[Timing diagram: TCLK, TFRM, TDAT and TADD. TDAT shows bit 1 and bit 0 of slot 7 followed by bit 15, bit 14, bit 13, bit 12, ... of slot 0. TADD shows address bits a7, a6, a5, a4, ...]

• Any one 'C5x generates the clock and frame signals.
• Data to transmit and listener address are generated by the 'C5x which "owns" the current signal.
• All 'C5x's capture address and data bits. A TDM RCV interrupt is generated if the routing list includes the device's ID.

TDM Port Registers

Register   Contents, bit 15 down to bit 0
TRCV       receive data (bits 15-0)
TDXR       transmit data (bits 15-0)
TSPC       res, res, rsrfull, xsrempty, xrdy, rrdy, in1, in0, rrst, xrst, txm, mcm, fsm, fo, dlb, tdm
TCSR       x (bits 15-8), ch7 ... ch0 (bits 7-0)
TRTA       ta7 ... ta0 (bits 15-8), ra7 ... ra0 (bits 7-0)
TRAD       x, x, x2, x1, x0, s2, s1, s0, a7 ... a0

Field notes:

• ch7-ch0 (TCSR): my time(s) to talk
• ta7-ta0 (TRTA): routing list for my xmit data
• ra7-ra0 (TRTA): my listener ID
• x2-x0 (TRAD): time when last msg for me came in
• s2-s0 (TRAD): current time
• a7-a0 (TRAD): current routing list
• Any ra# and a# both = 1 means "msg for me"
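The slot and address rules reduce to a few bit tests, sketched here in Python. The function names are hypothetical, and the field positions (channel bits in the low byte of TCSR, ta7-ta0 in the high byte and ra7-ra0 in the low byte of TRTA) are assumptions read from the register diagram.

```python
def tdm_transmits_in_slot(tcsr, slot):
    """True if this device owns the given time slot (bit ch<slot> set)."""
    if not 0 <= slot <= 7:
        raise ValueError("slot must be 0..7")
    return bool((tcsr >> slot) & 1)


def tdm_routing_list(trta):
    """Listener address bits sent with this device's data (ta7..ta0)."""
    return (trta >> 8) & 0xFF


def tdm_message_for_me(trta, address_bits):
    """True if any listener ID bit (ra#) and address bit (a#) are both 1."""
    my_ids = trta & 0xFF  # ra7..ra0
    return bool(my_ids & address_bits & 0xFF)


trta = 0x0502  # talks to listeners 0 and 2, listens as ID 1
print(tdm_routing_list(trta))          # routing list for transmitted data
print(tdm_message_for_me(trta, 0x06))  # address bits a2 and a1 set
print(tdm_message_for_me(trta, 0x05))  # address bits a2 and a0 set
```

Because the match is a bitwise AND, a single transmission can address several listeners at once, and a device can listen under several IDs.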

Host Port Interface (HPI)

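HPI Concept

[Figure: HPI concept. The HOST connects to the HPI through CONTROL lines and an 8-bit DATA bus; the HPI holds the HPIC and HPIA registers, and HPIC also appears among the '54x MMRs. The '54x memory map shows the MMRs at 0000h, Bk 0, Bk 1 (BSP) from 0800h to FFFh, and Bk 2 (HPI) from 1000h to 1800h, which the host side sees as 000h to 800h.]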

HPI Control Signals

Pin       From   Function
HBIL      Host   0: 1st byte; 1: 2nd byte
HCNTL0,   Host   Codes as (HCNTL0, HCNTL1): 00 Control; 01 Address;
HCNTL1           11 Data; 10 Data with auto-increment (++W and R++)
HRW-      Host   0: to DSP; 1: from DSP. Use R/W- or A(n)
HDS1-,    Host   Data strobes, wired according to the host's pins:
HDS2-
                   Host pins       HDS1-    HDS2-   HRW-
                   RD-, WE-        RD-      WE-     WE-
                   STRB-, R/W-     STRB-    VDD     R/W-
                   STRB, R/W-      STRB     gnd     R/W-
HAS-      Host   Host with multiplexed address/data: ALE → HAS-; else VDD → HAS-
HCS-      Host   Chip select: use device select or A(n)
HRDY      DSP    To host (ready); needed if host rate > DSP rate / 5
HINT-     DSP    To host Int(n): interrupt from DSP

HPIC Register

Bits     15-8          7-4    3      2        1      0
Field    Copy of 7:0   0000   HINT   DSPINT   SMOD   BOB

[Labels beneath the fields indicate which side (Host, DSP or both) accesses each bit.]

Bit   Name     Function         Values
0     BOB      Byte Order Bit   0 = MSByte 1st (Big Endian); 1 = LSByte 1st (Little Endian)
1     SMOD     Shared Mode      0 = Host Only Mode (HOM); 1 = Shared Access Mode (SAM)
2     DSPINT   DSP Interrupt    0 = No interrupt; 1 = DSP interrupted by host
3     HINT     Host Interrupt   DSP writes 1: HINT- driven low; Host writes 1: HINT cleared (ack)

Mode   Max Rate   Details
SAM    5 cycles   Asserted by DSP. 320 active only.
HOM    50 ns      Reset condition. When DSP is: active, idle 1 or 2, reset, no clock.

HPI Process

Columns: 0 = HCNTL0, 1 = HCNTL1, B = HBIL, R = HRW-, DD = byte on HD7-0, DatL = HPI data latch; 1234 to 1238 = contents of those HPI memory addresses.

INFO     0  1  B  R   DD   HPIC   HPIA   DatL   1234   1235   1236   1237   1238
Ctrl.    0  0  0  0   00   0000   XXXX   XXXX   0102   0304   0506   0708   090A
         0  0  1  0   00   0000   XXXX   XXXX   0102   0304   0506   0708   090A
Addr.    0  1  0  0   12   0000   12XX   XXXX   0102   0304   0506   0708   090A
         0  1  1  0   34   0000   1234   0102   0102   0304   0506   0708   090A
W:D1     1  1  0  0   AA   0000   1234   AA02   0102   0304   0506   0708   090A
         1  1  1  0   BB   0000   1234   AABB   AABB   0304   0506   0708   090A
W:+D2    1  0  0  0   CC   0000   1235   CCBB   AABB   0304   0506   0708   090A
         1  0  1  0   DD   0000   1235   CCDD   AABB   CCDD   0506   0708   090A
W:+D3    1  0  0  0   EE   0000   1236   EEDD   AABB   CCDD   0506   0708   090A
         1  0  1  0   FF   0000   1236   EEFF   AABB   CCDD   EEFF   0708   090A
Addr.    0  1  0  0   12   0000   1237   0708   AABB   CCDD   EEFF   0708   090A
         0  1  1  0   37   0000   1237   0708   AABB   CCDD   EEFF   0708   090A
R:D4+    1  0  0  1   07   0000   1237   0708   AABB   CCDD   EEFF   0708   090A
         1  0  1  1   08   0000   1237   0708   AABB   CCDD   EEFF   0708   090A
R:D5+    1  0  0  1   09   0000   1238   090A   AABB   CCDD   EEFF   0708   090A
         1  0  1  1   0A   0000   1238   090A   AABB   CCDD   EEFF   0708   090A
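The host-side sequence in the HPI process table can be followed with a small software model. The Python sketch below is a simplified illustration, not a description of the hardware: the class `HpiModel` and its methods are hypothetical, transfers are modelled a whole word at a time with the most significant byte first, and auto-increment is assumed to happen before a write and after a read, as the rows labelled W:+D and R:D+ suggest.

```python
class HpiModel:
    """Word-level model of host accesses through HPIA with auto-increment."""

    def __init__(self, memory):
        self.memory = dict(memory)  # address -> 16-bit word
        self.hpia = 0

    def write_address(self, first, second):
        """Load HPIA from two bytes, most significant byte first."""
        self.hpia = (first << 8) | second

    def write_data(self, first, second, auto_increment=False):
        """Store one word; with auto-increment, HPIA advances first."""
        if auto_increment:
            self.hpia = (self.hpia + 1) & 0xFFFF
        self.memory[self.hpia] = (first << 8) | second

    def read_data(self, auto_increment=False):
        """Return (first, second) bytes; HPIA advances after the read."""
        word = self.memory[self.hpia]
        if auto_increment:
            self.hpia = (self.hpia + 1) & 0xFFFF
        return word >> 8, word & 0xFF


hpi = HpiModel({0x1234: 0x0102, 0x1235: 0x0304, 0x1236: 0x0506,
                0x1237: 0x0708, 0x1238: 0x090A})
hpi.write_address(0x12, 0x34)
hpi.write_data(0xAA, 0xBB)                       # W:D1 at 1234h
hpi.write_data(0xCC, 0xDD, auto_increment=True)  # W:+D2 at 1235h
hpi.write_data(0xEE, 0xFF, auto_increment=True)  # W:+D3 at 1236h
hpi.write_address(0x12, 0x37)
print(hpi.read_data(auto_increment=True))        # R:D4+ from 1237h
print(hpi.read_data(auto_increment=True))        # R:D5+ from 1238h
```

The model leaves out the data latch, the HBIL byte sequencing and HRDY timing, all of which matter on real hardware.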

'C54x Host to 'C54x HPI

[Figure: connection between a 'C54x host and a 'C54x HPI. Host pins shown: D7-0, A2-0 (A2, A1, A0), R/W-, IS-, IOSTRB-, READY, INT1- and VCC. HPI pins shown: HD7-0, HCNTRL0, HCNTRL1, HBIL, HRW-, HAS-, HCS-, HDS1-, HDS2-, HRDY and HINT-.]

Motorola 68HC11F1 to '54x HPI

[Figure: connection between an MC68HC11F1 and the 'C54x HPI. Host pins shown: PC7-0, PF2-0 (PF2, PF1, PF0), R/W-, CSIO2, E, IRQ-, VCC and one NC connection. HPI pins shown: HD7-0, HCNTRL0, HCNTRL1, HBIL, HRW-, HAS-, HCS-, HDS1-, HDS2-, HRDY and HINT-.]

Intel 80C51 to '54x HPI

[Figure: connection between an Intel 80C51BH and the 'C54x HPI. Host pins shown: P0.7:0.0, P0.3, P0.2, P0.1, P0.0, ALE, P3.7/RD-, P3.6/WR-, P3.2/INT0- and one N/C connection, together with an HPI SELECT LOGIC block. HPI pins shown: HD7-0, HCNTRL0, HCNTRL1, HBIL, HRW-, HCS-, HAS-, HDS1-, HDS2-, HRDY and HINT-.]

Note: HCS- must be low when HDS1- or HDS2- is low. HCS- may be tied to ground or driven low by some HPI select logic.


System Considerations

Learning Objectives

Become familiar with system level design considerations for the C54x like:

• Boot loader
• Clock options
• Power management
• Program security
• JTAG emulation
• Memory interfacing
• Multiprocessor issues

Module 13

Boot Loader

The main function of the boot loader is to transfer user code from an external source to the program memory at power-up.

Depending on the C54x variant, the part can be booted from:

• 8 or 16 bit serial mode (SSP, BSP or TDM)
• 8 or 16 bit parallel I/O mode
• 8 or 16 bit parallel EPROM mode
• Warm boot mode
• HPI boot mode

Boot Sequence

If the MP/MC- pin is sampled low during a hardware reset, execution begins at location 0FF80h of the on-chip ROM.

This location contains a branch instruction to the start of the boot loader program.

Unless specified otherwise, the on-chip ROM is factory programmed with the boot loader program.

Boot Loader Operation

The boot loader program sets up the CPU status registers before initiating the boot load:

• Interrupts are globally disabled (INTM = 1)
• Internal DARAM is mapped in program / data space (OVLY = 1)
• 7 wait states are selected for the entire program, data and I/O spaces
• External memory bank size is set to 4K words
• 1 cycle is inserted when accesses switch between program and data space

The boot routine then reads the I/O port address 0FFFFh by driving the I/O strobe pin low.

The lower 8 bits of the word read from this port address specify the mode of transfer.

Boot Mode Selection

[Flowchart:]

1. Begin and initialize.
2. Test INT2: HPI mode? If yes, begin execution at HPI RAM.
3. If no, read the Boot Routine Selection (BRS) word from I/O address 0FFFFh.
4. Depending on the BRS value, enter Serial Boot Mode, I/O Boot Mode, Parallel Boot Mode or Warm Boot Mode.

HPI boot

The first step of the boot loader is to check if the Host Port Interface (HPI) boot option is selected.

In order to do that, HINT is asserted low. In HPI mode, this pin is normally tied to INT2.

If INT2 and HINT are tied together, INT2's bit in the Interrupt Flag Register (IFR) will be set. The bootloader waits 20 CLOCKOUT cycles after asserting HINT and then reads IFR bit #2.

• If bit #2 is a 1, control is transferred to the start of HPI RAM. NOTE: HPI RAM must already be loaded by the host before bringing the C54x out of reset.
• If bit #2 is a 0, the boot routine skips the HPI mode.

[Figure: the 'C54x asserts pin HINT (0), which drives INT2 (0) and sets bit 2 of IFR.]

HPI boot

Alternative methods:

If it's inconvenient to tie INT2 and HINT together, the following methods will work.

• Send a valid interrupt to the INT2 input pin within 30 CLOCKOUT cycles after the DSP fetches the reset vector.

or ...

• Use the warm boot option described later in this section. This method is preferred.


Serial boot

BRS word at address 0FFFFh:

Bits 15-8   Bits 7-4   Bits 3-0
XXXXXXXX    XXkm       0n00

k = 0, standard serial port     k = 1, TDM serial port
n = 0, 8 bit                    n = 1, 16 bit
m = 0, CLKX, FSX output         m = 1, CLKX, FSX input

The '541 serial boot option can use either the buffered serial port (BSP) or the time-division multiplexed (TDM) serial port in standard mode during booting.

Serial boot process

1. Configure SPC register to put SP in reset and then pull SP out of reset. Configure BSPCE register.
2. Read DA from SP
3. Read code length from SP
4. Read code word from SP and save the code word in DM
5. Transfer data from DM into PM and increment PM
6. Decrement code length
7. Code length = 0?
   No: go back to step 4
   Yes: Branch to DA and start executing code

Note: 8 bit read is high byte then low byte.
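The serial boot flow can be mimicked on a host to check a boot image before it is sent to the DSP. The sketch below is an illustration only, not the on-chip boot loader: `byte_stream`, `read_word` and `serial_boot_sim` are hypothetical names, DA is assumed to mean the destination address, and the stream is assumed to be in 8 bit mode (high byte first, as in the note above).

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical byte source standing in for the serial port in 8 bit mode. */
typedef struct {
    const uint8_t *bytes;
    size_t         len;
    size_t         pos;
} byte_stream;

/* Assemble one 16-bit word: high byte first, then low byte. */
static int read_word(byte_stream *s, uint16_t *out)
{
    if (s->len - s->pos < 2)
        return -1;                          /* stream ended early */
    uint16_t hi = s->bytes[s->pos++];
    uint16_t lo = s->bytes[s->pos++];
    *out = (uint16_t)((hi << 8) | lo);
    return 0;
}

/* Follows the flowchart: read DA, read code length, then copy words into
   program memory until the length reaches 0. Returns 0 and stores DA in
   *entry on success, -1 on a short stream or a write outside pm[]. */
int serial_boot_sim(byte_stream *s, uint16_t *pm, size_t pm_words,
                    uint16_t *entry)
{
    uint16_t da, count, word;

    if (read_word(s, &da) != 0 || read_word(s, &count) != 0)
        return -1;

    size_t dst = da;
    while (count != 0) {
        if (dst >= pm_words || read_word(s, &word) != 0)
            return -1;
        pm[dst++] = word;                   /* "transfer into PM, increment PM" */
        count--;                            /* "decrement code length" */
    }
    *entry = da;                            /* the real loader would branch here */
    return 0;
}
```

Whether the length word counts words exactly as assumed here should be checked against the boot loader documentation for the specific device.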


I/O Parallel Boot

BRS word at address 0FFFFh:

              Bits 15-8   Bits 7-4   Bits 3-0
8 bit mode    XXXXXXXX    XXX1       1000
16 bit mode   XXXXXXXX    XXXX       1100

Most common use of this mode is to boot from a slow microprocessor.

EPROM (Parallel) boot

BRS word at address 0FFFFh:

Bits 15-8   Bits 7-2   Bits 1-0
XXXXXXXX    SRC        AA

AA = 01, 8 bit mode
AA = 10, 16 bit mode
SRC = 6 bit page address


Warm Boot

BRS word at address 0FFFFh:

Bits 15-8   Bits 7-2   Bits 1-0
XXXXXXXX    ADDR       11

ADDR = 6 bit page address
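Taken together, the BRS layouts for the serial, I/O, parallel and warm boot modes can be decoded from the lower 8 bits alone. The following is a minimal host-side sketch under stated assumptions, not the ROM boot loader itself: the type and function names are hypothetical, the order of the tests is one plausible choice, and k and m are assumed to sit in bits 5 and 4 because the slide shows bits 7-4 as XXkm.

```c
#include <stdint.h>

/* Hypothetical names; field positions follow the BRS layouts on the slides. */
typedef enum { BOOT_SERIAL, BOOT_IO, BOOT_PARALLEL, BOOT_WARM } boot_mode;

typedef struct {
    boot_mode mode;
    int       width_bits;  /* 8 or 16; 0 for warm boot */
    unsigned  page;        /* SRC or ADDR field (bits 7-2); 0 otherwise */
    int       tdm_port;    /* serial only: k bit (assumed bit 5) */
    int       clk_input;   /* serial only: m bit (assumed bit 4) */
} brs_info;

brs_info decode_brs(uint16_t word)
{
    unsigned brs = word & 0xFFu;            /* upper 8 bits are don't care */
    brs_info r = { BOOT_SERIAL, 0, 0, 0, 0 };

    switch (brs & 0x3u) {
    case 0x1:                               /* AA = 01 */
        r.mode = BOOT_PARALLEL; r.width_bits = 8;  r.page = brs >> 2;
        break;
    case 0x2:                               /* AA = 10 */
        r.mode = BOOT_PARALLEL; r.width_bits = 16; r.page = brs >> 2;
        break;
    case 0x3:                               /* bits 1-0 = 11 */
        r.mode = BOOT_WARM; r.page = brs >> 2;
        break;
    default:                                /* bits 1-0 = 00 */
        r.width_bits = (brs & 0x4u) ? 16 : 8;
        if (brs & 0x8u) {                   /* bits 3-0 = 1000 or 1100 */
            r.mode = BOOT_IO;
        } else {                            /* bits 3-0 = 0n00 */
            r.mode = BOOT_SERIAL;
            r.tdm_port  = (int)((brs >> 5) & 1u);
            r.clk_input = (int)((brs >> 4) & 1u);
        }
        break;
    }
    return r;
}
```

For example, `decode_brs(0x000C)` reports 16 bit I/O boot, and any word whose two low bits are 11 reports warm boot with the page taken from bits 7-2.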

Clock Options

The Phase Locked Loop (PLL) mode is determined at start-up by the input states on three pins: CLKMD1, CLKMD2 and CLKMD3.

These pins should not be reconfigured during normal operation.

PLL options for '541, '2, '3, '4, '5 and '6

CLKMD1  CLKMD2  CLKMD3  Option 1+                Option 2+
0       0       0       PLL x 3, ext. osc.       PLL x 5, ext. osc.
1       1       0       PLL x 2, ext. osc.       PLL x 4, ext. osc.
1       0       0       PLL x 3, int. osc.       PLL x 5, int. osc.
0       1       0       PLL x 1.5, ext. osc.     PLL x 4.5, ext. osc.
0       0       1       Divide by 2, ext. osc.   Divide by 2, ext. osc.
0       1       1       Stop mode*               Stop mode*
1       0       1       PLL x 1, ext. osc.       PLL x 1, ext. osc.
1       1       1       Divide by 2, int. osc.   Divide by 2, int. osc.

+ You can select your device with either option 1 or 2, but not both.
* PLL is disabled. System clock is not provided to CPU / peripherals.
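As a quick way to check a board strapping against the PLL option table, the two option columns can be written as lookup arrays. This is a sketch only: `cpu_clock_hz` and the array names are hypothetical, the index packing (CLKMD1 as the most significant bit) is an arbitrary choice made here, and the arithmetic assumes the source frequency times the multiplier fits in an `unsigned long`.

```c
/* Multiplier times two, so x1.5 -> 3 and divide by 2 -> 1; 0 marks stop mode.
   Index = (CLKMD1 << 2) | (CLKMD2 << 1) | CLKMD3. */
static const unsigned char option1_x2[8] = {  6, 1, 3, 0,  6, 2, 4, 1 };
static const unsigned char option2_x2[8] = { 10, 1, 9, 0, 10, 2, 8, 1 };

/* Returns the CPU clock in Hz for the given pin states, or 0 in stop mode.
   option is 1 or 2, matching the two columns of the table. */
unsigned long cpu_clock_hz(int option, int clkmd1, int clkmd2, int clkmd3,
                           unsigned long source_hz)
{
    unsigned idx = ((unsigned)(clkmd1 & 1) << 2) |
                   ((unsigned)(clkmd2 & 1) << 1) |
                    (unsigned)(clkmd3 & 1);
    const unsigned char *table = (option == 2) ? option2_x2 : option1_x2;

    return (source_hz * table[idx]) / 2u;
}
```

With a 20 MHz source on an option 1 device, pins 0 1 0 select PLL x 1.5, so `cpu_clock_hz(1, 0, 1, 0, 20000000UL)` gives 30 MHz.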


Clock Options - '548

CLKMD1  CLKMD2  CLKMD3  Clock mode / CLKMD value on reset
0       0       0       1/2 with ext. source / CLKMD = 0000h
1       1       0       1/2 with ext. source / CLKMD = 6000h
1       0       0       1/2 with ext. source / CLKMD = 4000h
0       1       0       1/2 with ext. source / CLKMD = 2000h
0       0       1       1/2 with ext. source / CLKMD = 1000h
0       1       1       Stop mode / CLKMD = na
1       0       1       PLL*1 with ext. source / CLKMD = 0007h
1       1       1       1/2 with ext. source / CLKMD = 7000h

Check this and add more

PLL Lockup Time

Since it is an analog system, the PLL requires a lockup time before it is stable.


Power Management

The current consumption of a DSP can vary depending on many factors including:

• the instructions being executed
• whether the external pins are exercised or not
• temperature
• supply voltage
• capacitance of the external traces

[Chart: current consumption by operating mode. Modes shown: IDLE1, IDLE2, IDLE3; Repeat NOP, Inline NOP; Repeat MAC, Inline MAC. Values shown: 5.6 mA, 2 mA; 0.55 mA, 0.3 mA, 0.4 mA; 16 - 44 mA, 20 - 52 mA.]

Conditions: TMS320LC548, CLKOUT = 40 MHz, Vcc = 3.0 V, room temperature, PLL x1 clock mode, internal consumption only.

Power Management Hints

Some design techniques for minimizing power consumption ...

• Minimize external trace lengths and their associated capacitance
• Set Address Visibility (AVIS) = 0
• When not being used, make sure the timer and serial ports are in reset and MCM = 0
• Ensure all input pins are grounded or pulled high
• Set SWWSR to 0 wait states when possible
• Use circular addressing instead of DMOVs
• Use internal instead of external memory accesses
• Minimize the clock frequency to match the task required
• Implement power down modes where possible


Power Management - IDLE

Idle mode is entered by executing the IDLE instruction:

    IDLE K    (where K = 1, 2 or 3)

The device will stay in this mode until it is interrupted.

IDLE1
• All CPU activities stopped
• Peripherals active
• Wake on: Reset, peripheral interrupts and external interrupts

IDLE2
• All CPU activities stopped
• Peripherals inactive
• CLKOUT inactive
• Wake on: Reset and external interrupts*

IDLE3 !
• All CPU activities stopped
• Peripherals inactive
• CLKOUT inactive
• PLL halted +
• Wake on: Reset and external interrupts*

* Ints are not latched in idle mode; they must be low for 5 cycles to be acked.
+ PLL will require a transitory locking time of 50 µs for restart.
! IDLE3 mode on the 545A, 546A and 548 has additional features. See Technical Reference.

Power Management - HOLD

Power-down mode can also be initiated by the HOLD signal.

When HOLD initiates power-down and ...

HM (in ST1) = 1: The CPU stops executing and address, data and control lines go into high impedance state. All peripherals remain active.

HM (in ST1) = 0: The address, data and control lines go into high impedance state. All peripherals remain active. The CPU continues to execute internally until an external access occurs, at which point the processor will halt.

Power-down mode is terminated when HOLD becomes inactive.


Power Management - CLKOUT

All C54x devices can disable the internal clock of external interfaces using CLKOUT, which will place the interface into a lower power consumption mode.

BSCR(0) = 0    CLKOUT pin enabled*
BSCR(0) = 1    CLKOUT pin disabled
PMST(2) = 0    CLKOUT pin enabled*
PMST(2) = 1    CLKOUT pin disabled

* Condition at Reset

Program Security

On-chip ROM security
ROM / RAM security


JTAG Emulation

JTAG .. Joint Test Action Group

The JTAG port on the 'C54x allows:

• Boundary Scan
• Emulation

through a 14 pin Test / Emulation header.

[Block diagram: the IEEE 1149.1 JTAG Test Bus feeds the JTAG Control Block, which connects to the internal scan chain (registers and state machines), the Boundary Scan logic at the PINS, and the Analysis Block.]

[Wiring diagram: 'C54x pins EMU0, EMU1, TRST, TMS, TDI, TDO and TCK connect to the header pins of the same names. The header also carries TCK_RET, PD (shown with Vcc) and five GND pins. A resistor labelled 4.7K or more is shown to Vcc. The header to device distance is marked as 6 inches.]

Header to device lengths greater than 6 inches require extra circuitry and attention to noise.

Multiprocessor Issues

Major how-to's, signals involved, etc.


'LC548 ROM Features

The on-chip ROM of the 'LC548 is 2K words in length and is mapped from 0F800h to 0FFFFh if the MP/MC pin is low.

Program space

Start address   Contents
0x0000          External program space
0xF800          Boot loader
0xFC00          µ-law table
0xFD00          A-law table
0xFE00          Sine lookup table
0xFF00          Built-in self test
0xFF80          Vector table

Run=, Load=

linker protocol

Load = eprom
Run = SRAM

problems with linker linking symbols

assem lang user guide
.label

see sheet1

this will be the lab topic for the module


549 features

two voltages ... 2.5 V core, 3.3 V external

power up issues

Level Shifting

3.3 V - 5 V


Power Supply power up sequence

2.5 V core, 3.3 V I/O

3.3 V core, 5 V I/O


DSP54x - Using the C Compiler 14 - 1

Using the C Compiler

Learning Objectives

• Invoke the compiler or shell program
  - Options and Switches
  - The RTS library
  - The Optimizer

• Write code in C
  - Numerical Types supported
  - Accessing MMRs and IO Ports
  - Inlining C and ASM functions
  - Interrupt service routines
  - Optimization tips

• Use the C support files:
  - C.CMD: Linker file issues when using C
  - BOOT.ASM: Pre-main initialization process

• Intermix assembly files within the C environment
  - Stack Model
  - Register Usage
  - Argument passing and result return


Module 14

Compiler Tool Flow

[Diagram]
FILE.C
  -> C Compiler CC500: Parser, Code Generator, Optimizer (-o)
  -> FILE.ASM
  -> Assembler: ASM500
  -> FILE.OBJ
  -> Linker: LNK500 (-z)
  -> FILE.OUT

Shell Program: CL500 (covers the compiler, assembler and linker steps)

Invoking the Shell:

    CL500 x.c y.asm -z c.cmd

Common Compiler Options

Switch  Description
-g      Global: symbols for debugging
-s      Source: C interlist in ASM file
-al     Assembler: List file request
-as     Assembler: global Symbols
-ms     Model Size - optimize for size
-mn     Model Normal - full opt. despite -g
-o0     Optimize register use
-o1     Opt. -o0 + local opt.
-o2     Opt. -o1 + global opt.
-o3     Opt. -o2 + file opt.
-oe     Eliminate dead code
-x      Enable Inlining and -o3
-z      LNK500 invoked (link options follow)


Compiler Switch Issues

Optimizer should be invoked incrementally:

    CL500 -g test -z c.cmd             Symbols kept for debug
    CL500 -g -o3 test -z c.cmd         Add optimizer, keep symbols
    CL500 -g -o3 -mn test -z c.cmd     Full optimize, some symbols
    CL500 -o3 test -z c.cmd            Final rev: optimize, no symbols

Preferred switches can be selected in several ways:

    On command line:            As above.
    With batch file, e.g.:      CL500 -g -o3 -mn %1 %2 -z c.cmd
    Via environment variable:   SET C_OPTION=-g -o3 -mn

Lab 14-a : Invoking the Compiler

1. Inspect a C file that performs the sine routine
2. Compile the file using CL500
3. Observe the resultant .ASM file
4. Load the .OUT file to the simulator
5. Run the program and
   a. Verify correct results obtained
   b. Benchmark cycles for sine routine
   c. Note lines of code required for sine routine
6. Recompile with optimizer (-o). Repeat steps a - c
7. Compare the results of steps 5 and 6


Writing Code in C

• Write code in C
  - Numerical Types supported
  - Accessing MMRs and IO Ports
  - Inlining C and ASM functions
  - Interrupt service routines
  - Optimization tips

Inline Assembly

• Allows direct access to assembly language from C
• Useful for operating on components not used by C, e.g.:

      asm("label RSBX INTM ");

• Note: first column after leading quote is label field
• Avoid modifying components used by C (especially with -o)
• Long operations should be written in ASM and called from C
  - main C file retains portability
  - yields more easily maintained structures
  - eliminates risk of interfering with registers in use by C


Accessing MMRs from C

• Using pointers to access Memory-Mapped Registers:
  - Create a pointer and set its value to the assigned memory address:

        volatile unsigned int *SPC_REG = (volatile unsigned int *) 0x0022;

  - Read and write to the register as any other pointer:

        *SPC_REG = 0xC8;

• Volatile modifier:
  - Especially important with optimizer (-o)
  - Tells compiler to always recheck actual memory whenever encountered
  - Otherwise, the optimizer might keep the value in a register, or eliminate the construct
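Register updates often need to change only some bits, which is a read-modify-write through the same volatile pointer. The helpers below are a minimal sketch with hypothetical names (`reg_set_bits`, `reg_clear_bits`); they take the pointer as a parameter so they can be tried on a host with an ordinary variable standing in for the register.

```c
/* Set the bits in mask, leaving the other bits of the register unchanged. */
void reg_set_bits(volatile unsigned int *reg, unsigned int mask)
{
    *reg |= mask;      /* volatile forces a real read, then a real write */
}

/* Clear the bits in mask, leaving the other bits of the register unchanged. */
void reg_clear_bits(volatile unsigned int *reg, unsigned int mask)
{
    *reg &= ~mask;
}
```

On the target, the pointer from the slide would be passed directly, for example `reg_set_bits(SPC_REG, mask)`, where the mask value depends on the register field being changed.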

Accessing I/O Ports from C

Accessing I/O Ports from C:

1. create the port:

       ioport type portHEXNO;        (HEXNO = port address in hex)
       ioport unsigned port8000;

2. access the port:

       x = port8000;
       port8000 = y;

Accessing I/O Ports from Simulator:

    ma 0x8000,2,1,ioport
    mc 0x8000,2,1,out.dat,W
    mc 0x8000,2,1,in.dat,R

Accessing I/O Ports from ASM:

    Label   PORTR  8000h,x
            PORTW  y,8000h


Interrupts in C

• Interrupt Service Routine
  - C function to run when interrupt occurs
  - All necessary context save/restore performed automatically

• Interrupt Initialization Code
  - Should be called prior to run-time process
  - Interrupt status may be modified during run-time

• Interrupt Vector Table
  - Written in ASM

Writing ISRs in C

    ioport unsigned port0001;    /* declaration assumed; the slide only uses port0001 */
    int x[100];
    int *p = x;

    main() { … }

    interrupt void name(void)
    {
        static int y = 0;
        y += 1;
        if (y < 100)
            *p++ = port0001;
        else
            asm(" intr 17 ");
    }

• Global variables allow sharing of data between main functions & ISR
• Keyword: interrupt
• Name of ISR function: name
• Void input and return values
• Locals are lost across calls; statics persist across calls
• ISRs should not include calls
• Return is with enable (RETE)
• Avoid -e or -oe options


Initializing Interrupts in C

Setup pointers to IMR & IFR. Initialize IMR, IFR, INTM:

    volatile unsigned int *IMR = (volatile unsigned int *) 0x0000;
    volatile unsigned int *IFR = (volatile unsigned int *) 0x0001;
    *IFR = 0xFFFF;
    *IMR = 0xFFFF;
    asm(" RSBX INTM ");

Create Vector Table:

        .sect  ".vectors"
        …
        B      ISR1
        nop
        nop
        …

Compiled ISR Sequence:

• I$$SAVE performs context save (from RTS.LIB)
• ISR function runs
• I$$RESTORE performs context restore (RTS.LIB)
• RETE - Return with Enable

Numerical Types in C

      xxxx xxxx xxxx xxxx                        16-bit int
    * yyyy yyyy yyyy yyyy                        16-bit int
      zzzz zzzz zzzz zzzz zzzz zzzz zzzz zzzz    32-bit product

    z = x * y;                            z (Q0): the low 16 bits of the product
    z = ((long)(x) * (long)(y)) >> 15;    z (Q15): bits 30-15 of the product

• short, char, etc. all occupy full 16-bit memories
• no byte-addressing/packing on '54x
• float operations supported via rts.lib
• float math is multicycle
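The Q15 form of the multiply can be wrapped in a small function. This is a sketch that uses the fixed-width types from `<stdint.h>` so it behaves the same on a host where `int` is wider than 16 bits; `q15_t` and `q15_mul` are hypothetical names, and a Q15 value is assumed to represent raw / 32768.

```c
#include <stdint.h>

typedef int16_t q15_t;   /* assumed convention: value = raw / 32768 */

/* Multiply two Q15 numbers. The 32-bit product is Q30, so shifting right
   by 15 brings it back to Q15. No saturation is applied: -1.0 * -1.0
   (0x8000 * 0x8000) does not fit and is not handled here. The shift of a
   negative product relies on the compiler using an arithmetic shift. */
q15_t q15_mul(q15_t x, q15_t y)
{
    int32_t product = (int32_t)x * (int32_t)y;
    return (q15_t)(product >> 15);
}
```

For example, 0.5 is 16384 in Q15, and `q15_mul(16384, 16384)` returns 8192, which is 0.25.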


The Optimizer

• '54x Specific Optimizations
• General Optimizations
• Data-flow Optimizations
• Branch & Control-flow Optimizations
• Loop Optimizations

'54x Specific Optimizations

• Cost-based register allocation        ARn, A, B
• Auto-increment                        *ARn+
• Block repeat                          RPTB
• Delayed Branch, Call, and Return      BD, CALLD, RETD


General Optimizations

• Algebraic re-ordering
      example:  (a+b) - (c+d)      = 6 cycles
      becomes:  (((a+b)-c)-d)      = 4 cycles

• Constant folding
      example:  a = (b+4) - (c+1)
      becomes:  a = b - c + 3

• Symbolic simplification

• Alias Disambiguation
      When only one pointer accesses a given memory array, compiler may allow registers to hold values
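The re-ordering and folding rewrites above keep the result unchanged, which can be checked by writing each pair as two functions. The function names are hypothetical, and the equality is only claimed for inputs where none of the intermediate sums overflow.

```c
/* Algebraic re-ordering: the same value computed with two groupings. */
int grouped(int a, int b, int c, int d) { return (a + b) - (c + d); }
int chained(int a, int b, int c, int d) { return ((a + b) - c) - d; }

/* Constant folding: the constants 4 and 1 are combined into 3. */
int unfolded(int b, int c) { return (b + 4) - (c + 1); }
int folded(int b, int c)   { return b - c + 3; }
```

The cycle counts quoted on the slide come from the '54x code the compiler generates; nothing in this host-side sketch measures them.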

Data-flow Optimizations

• Copy propagation
      Following assignment to a variable, references to the variable are replaced with the value

• Common sub-expression elimination
      If two (or more) equations perform the same sub-action, the value is saved after the first and recalled later

• Redundant Assignment Elimination
      Drop assignments not used in later equations

    example (int j)
    {
        int a = 3;                  /* 3 assigned to a & propagated down; a elim'd */
        int b = (j*a) + (j*2);      /* becomes (j*5) */
        int c = (j<<a);             /* dead var: replaced with expression */
        int d = (j>>3) + (j<<b);    /* assignment unused - eliminated */
        call (a,b,c);
    }
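To make the effect of these data-flow steps concrete, the slide's example can be written out next to a hand-simplified version. This is an illustration of the idea, not compiler output: `call` here is a hypothetical stand-in that just combines its arguments, and both functions assume 0 <= j <= 3 so that every shift count stays in range.

```c
/* Hypothetical stand-in for the slide's call(); it only mixes its inputs. */
static int call(int a, int b, int c)
{
    return a + 10 * b + 100 * c;
}

/* The example as written on the slide. Precondition: 0 <= j <= 3. */
int example_as_written(int j)
{
    int a = 3;
    int b = (j * a) + (j * 2);
    int c = (j << a);
    int d = (j >> 3) + (j << b);   /* never used afterwards */
    (void)d;
    return call(a, b, c);
}

/* After copy propagation (a -> 3), simplification (j*3 + j*2 -> j*5) and
   removal of the unused assignment to d. Precondition: 0 <= j <= 3. */
int example_after_dataflow(int j)
{
    return call(3, j * 5, j << 3);
}
```

Both functions return the same value for every j in the stated range.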


Branch & Control-flow Optimizations

• Rearrange code to remove branches or redundancies

• Unreachable code is deleted

• Branch to branch is bypassed

• Conditional branch over an unconditional branch becomes a single conditional branch on the inverted ('not') condition

• Conditional branches whose conditions are resolved at compile-time are replaced with unconditional branches
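The "conditional over unconditional" case is easiest to see with explicit gotos. This is a hand-written sketch with hypothetical function names; the compiler performs the equivalent rewrite on its own branch instructions.

```c
/* Hypothetical functions; both return 1 when x > 0 and 0 otherwise. */

/* Before: a conditional branch hops over an unconditional branch. */
int sign_before(int x)
{
    if (x > 0) goto positive;   /* conditional branch over ...   */
    goto not_positive;          /* ... this unconditional branch */
positive:
    return 1;
not_positive:
    return 0;
}

/* After: one conditional branch on the inverted ('not') condition. */
int sign_after(int x)
{
    if (!(x > 0)) goto not_positive;
    return 1;
not_positive:
    return 0;
}
```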


Loop Optimizations

• Loop induction variables ("LIVs") are, for example, the "i" in "for i=…"

• The process of making LIV ops more efficient is called strength reduction, e.g.:

    for (i=1; i<100; i++)
        y += x[i];

    becomes

        y += *x++           using *ARn+
        counters --> BANZ or RPTB

• Often the loop control variable is removed entirely - a debug issue

• Other loop optimizations:
    - Loop Rotation: Evaluate loop condition at end vs. beginning
    - Loop Invariant Code Motion: Move static equations out of loop & reference result only
    - Inline Expansion of RTS Library Functions: small functions are inlined, not called. Size is user-specifiable, default = 10 lines
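Strength reduction can be written out by hand to see what the optimizer is aiming for. The sketch below uses hypothetical function names and a generic length n starting at index 0 (the slide's loop runs from 1 to 99); it shows the transformation, not the code the compiler emits.

```c
#include <stddef.h>

/* As written: the index i is the loop induction variable, and each
   x[i] implies an address computation (base + i). */
int sum_indexed(const int *x, size_t n)
{
    int y = 0;
    size_t i;
    for (i = 0; i < n; i++)
        y += x[i];
    return y;
}

/* After strength reduction, written by hand: the index becomes a pointer
   that is stepped directly (the slide's *ARn+), and the counter only
   counts iterations (the slide's BANZ or RPTB). */
int sum_pointer(const int *x, size_t n)
{
    int y = 0;
    while (n-- > 0)
        y += *x++;
    return y;
}
```

In the second form there is no variable i left to watch, which is the debug issue the slide mentions.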


Inlining C Functions

Code must be present in file for inlining:
    Put code in file
    Put source in header: #include
    Library of dual function types

Benefit - Faster:
    no branch
    no return
    no clear of parent fn
    no setup of sub-fn
    merging of fn's with optimizer

[Diagram: "Call of Fn" (each "call fn" site branches to a single "Fn ... ret" body) versus "Inline Fn" (the fn body is expanded in place at each site)]
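A minimal sketch of the "put source in header" approach follows. The header name mac.h and both function names are hypothetical, and the `static inline` spelling is the standard C99 form; the exact keyword and rules accepted by a given C54x compiler release should be checked in its user's guide.

```c
/* mac.h (hypothetical header): because the body is visible in every file
   that includes it, the compiler is free to expand it at each call site. */
static inline int mac(int acc, int a, int b)
{
    return acc + a * b;
}

/* A caller: with inlining, both calls can become plain multiply-adds
   with no call, no return, and no argument setup. */
int dot2(const int *x, const int *h)
{
    int acc = 0;
    acc = mac(acc, x[0], h[0]);
    acc = mac(acc, x[1], h[1]);
    return acc;
}
```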


Optimization Steps

• Optimize: Use -o, -mn when compiling
    - Use #define instead of variables for parameters
    - Globals may be faster than locals
    - Minimize mixing signed & unsigned integers

• Inline short/key functions: compile with -x
    - Declare function as inline
    - Automatically invoked for short routines within file
    - Inlines can be passed between files via header

• Give compiler project visibility
    - #include sub-files within main
    - Optimizer will operate over all files allowing better inlining, register tracking, etc.

• Tune memory map via C.CMD

• Re-write key code segments in assembly
    - Bulletin Board, App notes, 3rd Parties
    - S/W Cooperative, Hand written
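The "#define instead of variables" tip comes down to what the optimizer knows at compile time. In the sketch below NUM_TAPS, num_taps_var and both functions are hypothetical names chosen for illustration.

```c
#define NUM_TAPS 4            /* hypothetical parameter as a constant */

int num_taps_var = 4;         /* same parameter held in a variable    */

/* Loop bound known at compile time: the optimizer can size or unroll it. */
int sum_fixed(const int *x)
{
    int i, y = 0;
    for (i = 0; i < NUM_TAPS; i++)
        y += x[i];
    return y;
}

/* Loop bound must be read from memory at run time. */
int sum_variable(const int *x)
{
    int i, y = 0;
    for (i = 0; i < num_taps_var; i++)
        y += x[i];
    return y;
}
```

Both functions return the same result; only the first gives the compiler a fixed trip count to work with.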


Optimization Process

[Flowchart: after each step, check "Real-time goal met?" Y: Done. N: go on to the next step.]

    1. Write & debug in C, benchmark
    2. Perform C & C.CMD Optimizations
    3. Profile. Convert Key Functions to ASM


Lab 14-b: Writing Code in C


C Support Files

• Invoke the compiler or shell program
    - Options and Switches
    - The RTS library
    - The Optimizer

• Write code in C
    - Numerical Types supported
    - Accessing MMRs and IO Ports
    - Inlining C and ASM functions
    - Interrupt service routines
    - Optimization tips

• Use the C support files:
    - C.CMD: Linker file issues when using C
    - BOOT.ASM: Pre-main initialization process

• Intermix assembly files within the C environment
    - Stack Model
    - Register Usage
    - Argument passing and result return


Components of C.CMD

    file1.obj          Files: list here or pass via shell
    vectors.obj        Must be written in asm, listed here
    -c                 Boot.asm is included
    -o test.out        Output file name
    -m test.map        Map file name
    -i c:\filepath     Paths to search
    -l rts.lib         Libraries to search - last on list
    -stack 400h        Override stack size
    -heap 200h         Override heap size

    MEMORY
    {
      P or D, RAM or ROM, F, M or S        (Pgm, Data, Fast, Med, Slow)
    }

    SECTIONS
    {
      .vectors :>   P ROM M      Vector table
      .text    :>   P ROM F      Code
      .cinit   :>   P ROM S      Init table for global/statics
      .const   :>   D ROM M      Constants - several options here
      .switch  :>   P ROM M      Case statement arrays
      .bss     :>   D RAM M      Globals and statics
      .stack   :>   D RAM F      Stack allocation
      .sysmem  :>   D RAM M      Heap allocation
    }


Options for Handling .const

• Put a ROM in data memory for .const.
    + True constant
    - Extra cost

• Link .const to a ROM whose CS- is an AND of PS- & DS-
    + Lower cost, true constants
    - Reduces total memory space, extra gate

• Use: LOAD=PgmRom, RUN=DataRam, and write a routine to copy Rom to Ram on reset.
    + Low cost
    - Extra design effort, not true constants

• Use a host processor to init. constants to data RAM on reset
    + No extra cost if there is already a host and I/F
    - Not true constants, extra design effort

• Use initialized globals instead of constants and link with "-c" to auto-initialize pgm ROM to data RAM
    + Way to autoinit, good use of memory space
    - RTS.LIB fns may not apply, not "true" constants
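The LOAD=/RUN= option needs a copy routine that runs once at reset. The sketch below is a host-side model only: load_image, run_copy and copy_const_table are hypothetical names, and both tables are ordinary arrays. On the target the source sits in program memory, which plain C data pointers do not normally reach, so the real routine is usually written in assembly.

```c
#include <stddef.h>

/* Hypothetical names. On a real target 'load_image' would be the copy the
   linker placed in program ROM (LOAD=) and 'run_copy' the data-RAM
   location (RUN=); here both are ordinary host arrays. */
const int load_image[4] = { 10, 20, 30, 40 };
int run_copy[4];

/* Called once at reset, before any code reads the table. */
void copy_const_table(const int *load, int *run, size_t words)
{
    while (words-- > 0)
        *run++ = *load++;
}
```

Because the run-time copy lives in RAM, nothing stops later code from overwriting it, which is why the slide calls these "not true constants".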


Global and Static Variable Initialization

• Global and Static variables (G/S vars) are linked under .bss

• G/S vars with no explicit init val are assumed 0 by ANSI

• Compiler does not support the assumed 0 init value

• Solutions:
    - Initialize all G/S vars to 0 explicitly
    - Link with a specified initial value, e.g.:
          .bss :> DatRam, fill=0
    - Add an ASM routine pre-main:
          STM   #.bss,AR7
          RPTZ  A,#len
          STL   A,*AR7+
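Two of these solutions can be shown in C. The sketch below is a host-side illustration with hypothetical names: the variables show the explicit-initializer approach, and zero_fill() models what the pre-main assembly loop (STM / RPTZ / STL) does, with start and len standing in for the start and length of .bss.

```c
#include <stddef.h>

/* Explicit initializers: each global/static is given its 0 value in the
   source instead of relying on the ANSI default. */
int sample_count = 0;
int history[4] = { 0, 0, 0, 0 };

/* C model of the pre-main clearing loop. On the target, 'start' and
   'len' would come from the linker, not from the caller. */
void zero_fill(int *start, size_t len)
{
    while (len-- > 0)
        *start++ = 0;
}
```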


BOOT.ASM - invoked with "-c"

Reset: PC <- FF80

              .ref  _c_int00

    FF80:     B     _c_int00
              nop
              nop

    _c_int00  1. Allocate stack
              2. Init SP to end of stack
              3. Initialize status bits (esp. CPL)
              4. Copy .cinit to .bss (skip if "-cr")
              5. Call "_main"

    _main     ...
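Step 4 is a table-driven copy. The sketch below models the idea on a host using a simplified, hypothetical record layout (count, offset into .bss, then that many data words, with a count of 0 ending the table); the actual .cinit record format is defined in the compiler documentation and may differ.

```c
#include <stddef.h>

/* Simplified, hypothetical record layout: {count, offset into bss,
   count data words}, repeated, ending with a count of 0. */
void copy_cinit(const int *cinit, int *bss)
{
    while (*cinit != 0) {
        int count = *cinit++;          /* number of data words        */
        int *dest = bss + *cinit++;    /* where the variable lives    */
        while (count-- > 0)
            *dest++ = *cinit++;        /* copy the initial values     */
    }
}
```

Variables that have no record in the table are simply left alone, which ties in with the earlier point that uninitialized globals are not guaranteed to be 0.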


Runtime Support Library: RTS.LIB

• Use -L RTS.LIB at end of file list in LINK.CMD to get access to specified libraries as needed by prior listed files:

    file1.obj        /* can access rts.lib */
    -l rts.lib       /* run-time support library */
    file2.obj        /* won't access rts.lib */

• Library functions must be declared to be used in a C file, usually via a header file.

• Headers can be inserted or included via the #include directive

• Example - to access math.h type: #include <math.h>

• See compiler UG for full list of headers

• Note: no stdio.h - why?


The Archiver: AR500

Command line options:
    x - extract
    r - reinstall
    a - append

Sequence to modify lib fn, e.g. Boot.asm:

    1. extract file:              AR500 x rts.src boot.asm

    2. modify as desired using ASCII editor

    3. refresh both archives:     AR500 r rts.src boot.asm
                                  AR500 r rts.lib boot.obj


Lab 14-c: C.CMD and BOOT.ASM


Mixing ASM into C System

• Invoke the compiler or shell program
    - Options and Switches
    - The RTS library
    - The Optimizer

• Write code in C
    - Numerical Types supported
    - Accessing MMRs and IO Ports
    - Inlining C and ASM functions
    - Interrupt service routines
    - Optimization tips

• Use the C support files:
    - C.CMD: Linker file issues when using C
    - BOOT.ASM: Pre-main initialization process

• Intermix assembly files within the C environment
    - Stack Model
    - Register Usage
    - Argument passing and result return


Calling ASM function from C

C side:

    extern int slope(int,int,int);      • Declare / Prototype the assembly language function

    void main (void)
    {
      int x,y,b,m ;
      y = slope(b,m,x);                 • Call the assembly language function.
    }

ASM side:

            .def  _slope                • Declare function name as a global
    _slope:                             • Define function name (code entry point)
            LD    *SP(1),T
            MPY   *SP(2),B
            ADD   B,15,A
            RET

    Same routine using RETD:

    _slope:
            LD    *SP(1),T
            RETD
            MPY   *SP(2),B
            ADD   B,15,A

Data Mem, stack area:

    SP ->   old PC
            arg. m
            arg. x
            local x
            local y
            local b
            local m

    A on entry:  arg b
    A on return: mx+b

• Note: *SP(1) implies use of SP, but requires CPL=1 to work properly
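For testing the C side before the assembly routine exists, a plain C stand-in with the same prototype can be linked in its place. The sketch below is such a stand-in, not a translation of the assembly: it returns m*x + b as the slide's result label states and ignores any fixed-point scaling the assembly version applies. The function line_value() is a hypothetical caller modeled on the slide's main().

```c
/* Same prototype the C caller uses for the assembly routine. */
extern int slope(int b, int m, int x);

/* Host-side stand-in for _slope: returns m*x + b. On the target this
   body is the assembly routine; argument 'b' arrives in accumulator A
   and 'm' and 'x' are read from the stack. */
int slope(int b, int m, int x)
{
    return m * x + b;
}

/* Caller, following the slide's main(). */
int line_value(int b, int m, int x)
{
    int y;
    y = slope(b, m, x);
    return y;
}
```

Note the argument order: b is first so that it lands in A, matching the register table for the first argument and return value.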


Register Caveats for C

ST0 fields:   ARP | TC | C | OVA | OVB | DP

ST1 fields:   BRAF | CPL | XF | HM | INTM | 0 | OVM | SXM | C16 | FRCT | CMPT | ASM

Registers not free on function call

    Reg.     Use by C:
    AR7      Long frame ptr
    SP       Stack Pointer
    AR1,6    Register Variables
    A        1st Arg. / Rtn Value

    Use: Frame -N, PSHM, POPM, Frame N

Registers free on function call

    Reg.     Use by C
    B        Expression Analysis
    T        Expression Analysis
    AR0      Pointers and expressions
    AR2-5    Expression Analysis
    BRC      Loop reg's (RSA, REA)


Lab 14-d: ASM routine in C
