TMS320C54x DSP Design Workshop
Student Guide
DSP54x-NOTES-1.2
May 1997
Technical Training
Copyright © 1997 Texas Instruments Incorporated. All rights reserved.
Notice
No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording or otherwise, without the prior written permission of Texas Instruments.
Texas Instruments reserves the right to update this Guide to reflect the most current product information for the spectrum of users. If there are any differences between this Guide and a technical reference manual, references should always be made to the most current reference manual. Information contained in this publication is believed to be accurate and reliable. However, responsibility is assumed neither for its use nor any infringement of patents or rights of others that may result from its use. No license is granted by implication or otherwise under any patent or patent right of Texas Instruments or others.
Revision History
Welcome to the TMS320C54x DSP Design Workshop
Texas Instruments Technical Training
Introductions
• Name
• Company
• Project Responsibilities
• DSP Experience
• 320 Experience
• Hardware/Software, Asm/C
• Interests
TMS320C54x Workshop Agenda
I:
  1. Introduction and Overview
  2. Assembly Language Environment
  3. Addressing Modes
II:
  4. Basic Programming Techniques
  5. Advanced Programming Control
  6. Pipeline Issues
III:
  7. Numerical Issues
  8. Fundamental DSP Applications
  9. Advanced DSP Applications
  10. Interrupts
IV:
  11. Hardware Interfacing
  12. Other Interfacing
  13. System Design Issues
  14. Using the C Compiler
Introduction and Overview
Learning Objectives
• Describe the requirements of a DSP system.
• Identify the CPU components of the ’C54x.
• List the ’C54x internal buses and their usage.
• List the ’C54x pipeline stages and their actions.
• Describe the memory map of the ’C54x.
• List memory and peripherals of the ’C54x devices.
• Become familiar with the ’C54x simulator.
Module 1
DSP: Sum-of-Products
$$y = \sum_{n=1}^{100} x_n a_n$$
[Figure: x and a feed the multiplier (MPY); the product feeds the adder (ADD), which accumulates y.]
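The sum-of-products slide can be sketched in C as a multiply followed by an add on every loop pass. This is only an illustration: the function name `sum_of_products`, the 16-bit operand type, and the 64-bit accumulator (standing in for the DSP's wide accumulator) are assumptions of the sketch, not part of the workshop material.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Sum of products: y = x[0]*a[0] + x[1]*a[1] + ... + x[count-1]*a[count-1].
 * Each loop pass is one multiply (MPY) followed by one add (ADD).
 * The 64-bit accumulator is a stand-in chosen for this sketch. */
int64_t sum_of_products(const int16_t *x, const int16_t *a, size_t count)
{
    assert(x != NULL && a != NULL);
    int64_t y = 0;
    for (size_t n = 0; n < count; n++) {
        int32_t product = (int32_t)x[n] * (int32_t)a[n];  /* MPY */
        y += product;                                     /* ADD */
    }
    return y;
}
```

Calling it with 100-element arrays corresponds to the slide's sum over n = 1 to 100.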
MAC Unit Details
[Figure: MAC unit block diagram. Labels: MPY, ADD, D and C buses, M bus, acc A, acc B; operand selections labeled A, B, 0 and P, A, T, D, A, each signed/unsigned (s/u); FRCT.]
MAC *AR2+, *AR3+, A
Accumulators + ALU
General-Purpose Math, ex: t = s + e - r
[Figure: ALU with accumulators acc A and acc B. Labels: MUX, U bus, A bus, B bus; input selections A, B, M and A, B, C, T and D, S.]
    LD   s, A
    ADD  e, A
    SUB  r, A
    STL  A, t
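The four-instruction sequence for t = s + e - r can be mirrored step by step in C. In this sketch the name `add_sub_store` is hypothetical, and a 64-bit integer stands in for accumulator A; the final step keeps only the low 16 bits, which is what STL (store accumulator low) writes to memory.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Mirrors LD / ADD / SUB / STL for t = s + e - r.
 * `acc` is a stand-in for accumulator A; its width here is an
 * assumption of the sketch. */
void add_sub_store(int16_t s, int16_t e, int16_t r, uint16_t *t)
{
    assert(t != NULL);
    int64_t acc;
    acc  = s;                        /* LD  s, A */
    acc += e;                        /* ADD e, A */
    acc -= r;                        /* SUB r, A */
    *t = (uint16_t)(acc & 0xFFFF);   /* STL A, t: low 16 bits only */
}
```

Because only the low word is stored, a result that does not fit in 16 bits is truncated; the wider accumulator is what lets intermediate values exceed that range without loss.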
Barrel Shifter
[Figure: SHIFTER (-16 to +31). Labels: inputs C, D and A, B; S bus to the ALU; W bus.]
    LD   X, 16, A
    STH  B, y
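The two shifter examples can be sketched in C: loading an operand with a shift, as in LD X, 16, A, and storing the high word of an accumulator, as in STH B, y. The function names `load_shifted` and `store_high` are hypothetical, and a 64-bit integer again stands in for the accumulator. The shift range check follows the -16 to +31 range shown on the slide.

```c
#include <assert.h>
#include <stdint.h>

/* Load a sign-extended 16-bit operand shifted by `shift` bits.
 * Positive counts shift left, negative counts shift right. */
int64_t load_shifted(int16_t x, int shift)
{
    assert(shift >= -16 && shift <= 31);
    int64_t value = x;
    if (shift >= 0)
        return value * ((int64_t)1 << shift);   /* left shift, safe for negatives */
    if (value >= 0)
        return value >> -shift;
    return ~((~value) >> -shift);               /* portable arithmetic right shift */
}

/* Store-high: return bits 31..16 of the accumulator value. */
uint16_t store_high(int64_t acc)
{
    return (uint16_t)(((uint64_t)acc >> 16) & 0xFFFFu);
}
```

Loading with a shift of 16 places the operand in the high word, so `store_high(load_shifted(x, 16))` returns the original 16-bit pattern.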
Temporary Register
[Figure: temporary register T. Labels: T bus, T, D, ALU, MAC, EXP, A, B, X.]
ex: A = xa
    LD   x, T
    MPY  a, A
’C54x Buses
[Figure: the P, D, C, and E buses connect INTERNAL MEMORY and, through MUXes, EXTERNAL MEMORY to the CPU blocks: ALU, SHIFT, accumulators A and B, MAC, and T. Further labels: D, C, M.]
MAC *AR2+, *AR3+, A
Pipeline - Concept
F: Fetch     Get instruction from memory.
D: Decode    Schedule activity.
R: Read      Get operand from memory.
X: Execute   Perform operation.
Memory Interaction
• Broken into two phases:
  1. Calculate address
  2. Collect data
• Allows more time for memory interface.
’C54x Pipeline - Enhanced
P  Prefetch  Calculate address of instruction.
F  Fetch     Collect instruction.
D  Decode    Interpret instruction.
A  Access    Calculate address of operand.
R  Read      Collect operand.
X  Execute   Perform operation.
Memory Write
• When storing results back to memory
• Two phases:
  - Address set up
  - Data written
• Overlaid onto R + X phases
• Best balance of:
  - Processor loading
  - Speed
  - Cost
’C54x Pipeline Events
P  Drive address of instruction (PA)
F  Collect instruction (PD)
D  Interpret instruction, plan job (ctlr)
A  Set up pointers, calc data address (DA)
R  Collect operand (DD); calculate write address (EA)
X  Execute operation (*, +); send result (ED)
’C54x Pipeline Hardware
P  PC, PA
F  Program Mem, PD
D  Controller
A  ARs, DA, ARAUs
R  Data Mem, DD; AR, ARAU, EA
X  CALU (MAC, ALU); ED, Data Mem
’C54x Components and Bus Usage
[Figure: INTERNAL MEMORY and, through MUXes, EXTERNAL MEMORY connect over the P, C, D, and E buses to the PC, CNTL, ARs, ALU, SHIFT, accumulators A and B, MAC, and T.]
Pipeline Performance
TIME
P1 F1 D1 A1 R1 X1
   P2 F2 D2 A2 R2 X2
      P3 F3 D3 A3 R3 X3
         P4 F4 D4 A4 R4 X4
            P5 F5 D5 A5 R5 X5
               P6 F6 D6 A6 R6 X6
FULLY LOADED ’PIPE’: in the sixth cycle (X1, R2, A3, D4, F5, P6) every stage is busy.
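The staircase in the pipeline chart follows a simple rule: with one instruction entering per cycle and no stalls, instruction i occupies stage number (cycle - i) in the order P, F, D, A, R, X. The sketch below encodes that rule. The function names and the 1-based numbering are assumptions made for illustration, and the model covers only the ideal, conflict-free case.

```c
#include <assert.h>

/* Stage letters in pipeline order:
 * Prefetch, Fetch, Decode, Access, Read, Execute. */
static const char STAGES[] = "PFDARX";
enum { NUM_STAGES = 6 };

/* Stage occupied by instruction `instr` (1-based) during cycle `cycle`
 * (1-based), assuming one instruction enters per cycle and nothing stalls.
 * Returns '-' when the instruction is not in the pipeline in that cycle. */
char pipeline_stage(int instr, int cycle)
{
    assert(instr >= 1 && cycle >= 1);
    int stage = cycle - instr;   /* 0 = P ... 5 = X */
    if (stage < 0 || stage >= NUM_STAGES)
        return '-';
    return STAGES[stage];
}

/* Cycles needed to finish `count` instructions in the ideal case. */
int total_cycles(int count)
{
    assert(count >= 1);
    return count + NUM_STAGES - 1;
}
```

For six instructions the model gives 11 cycles in total, and in cycle 6 instructions 1 through 6 sit in stages X, R, A, D, F, P respectively, which is the fully loaded column of the chart.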
Pipeline Conflicts - External Memory
[Figure: pipeline chart for instructions 1 to 6 with both program (P) and data (D) memory outside the ’54x. Instructions 1 to 3 flow through P, F, D, A, R, X without gaps; stall slots (----) appear between P4 and F4 and between P5 and F5.]
Pipeline Flow: Internal and External Memories
[Figure: program (P) memory external with data (D) memory inside the ’54x, or program memory inside the ’54x with data memory external.]
P1 F1 D1 A1 R1 X1
   P2 F2 D2 A2 R2 X2
      P3 F3 D3 A3 R3 X3
         P4 F4 D4 A4 R4 X4
            P5 F5 D5 A5 R5 X5
               P6 F6 D6 A6 R6 X6
NO CONFLICT
Pipeline: Internal Memory Only
[Figure: inside the ’C54x, ROM (4K blocks) and RAM (1K blocks) connect to the ALU and MAC over the P, D, and C buses. Further label: DA.]
Two accesses per block per cycle
’C541 Memory Maps
PROGRAM
0000   EXT (RAM ? below 1400, OVLY)
9000   Internal ROM ?
FF80   VECTORS
FFFF
DATA
0000   MMR / RAM
1400   EXT
E000   ROM ? (DROM)
FFFF
’C541 Program Memory Options
All External (MP/MC = 1)
0000-FFFF   EXT, with VECTORS* at FF80
Internal ROM (MP/MC = 0)
0000-8FFF   EXT
9000-FFFF   28K ROM**
  9000   2K ROM
  9800   2K ROM
  A000   4K ROM
  B000   4K ROM
  C000   4K ROM
  D000   4K ROM
  E000   4K ROM
  F000   4K ROM w VECs*
’RAM’ Option (OVLY = 1)
0080-13FF   RAM
1400-FFFF   EXT or ROM, with VECTORS*
* FF80 - FFFF are the default locations for vectors.
** Internal ROM FF00 - FF7F reserved for TI test.
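The program memory options can be read as a small decision rule on the address and the MP/MC and OVLY settings. The sketch below encodes only the boundaries shown on the slide (RAM at 0080-13FF when OVLY = 1, ROM at 9000-FFFF when MP/MC = 0). The type and function names are hypothetical, the order in which the two checks are applied is an assumption, and addresses below 0080 with OVLY = 1 are reported as unspecified because the slide does not label them.

```c
#include <assert.h>
#include <stdint.h>

typedef enum {
    PM_EXTERNAL,      /* EXT on the slide */
    PM_INTERNAL_ROM,  /* 9000-FFFF when MP/MC = 0 */
    PM_INTERNAL_RAM,  /* 0080-13FF when OVLY = 1 */
    PM_UNSPECIFIED    /* 0000-007F with OVLY = 1: not labeled on the slide */
} prog_region_t;

/* Classify a 16-bit program address using the slide's boundaries only. */
prog_region_t classify_program_address(uint16_t addr, int mp_mc, int ovly)
{
    assert((mp_mc == 0 || mp_mc == 1) && (ovly == 0 || ovly == 1));
    if (ovly == 1 && addr < 0x1400)
        return (addr >= 0x0080) ? PM_INTERNAL_RAM : PM_UNSPECIFIED;
    if (mp_mc == 0 && addr >= 0x9000)
        return PM_INTERNAL_ROM;
    return PM_EXTERNAL;
}
```

A lookup like this is mainly useful for checking a linker memory description against the chosen mode; the device documentation remains the authority for the actual map.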
’C541 Data Memory
0000-13FF   MMR / RAM
1400-DFFF   EXT
E000-FFFF   EXT or ROM
MMR / RAM detail:
0000   MMR + RAM a
0400   RAM b
0800   RAM c
0C00   RAM d
1000   RAM e (to 13FF)
First block detail:
0000   MMR
0060   SPRAM
0080   RAM a (to 03FF)
’C542 Memory Maps
PROGRAM (P)
0000   EXT (RAM ? below 2800, OVLY)
F800   ROM
FF80   VECTORS
FFFF
DATA (D)
0000   RAM
2800   EXT
FFFF
’C54x Memory Mix
C54x   RAM   ROM   DROM
1      5     28    8
2      10    2
3      10    2
4      4     24    8
5      6     48    16
6      6     48    16
9      32    16
’C54x Peripheral Mix
C54x   SER   TDM   BSP   HPI
1      2
2            1     1     1
3            1     1
4      2
5      1           1     1
6      1           1
9            1     2     1
Lab 1: Debugger Walkthrough
• Window Management: Select, Close, Open, Move, Size, Edit
• Running Code: Reset, Step, Run, Breakpoint, Benchmark
• Display and Automation: Saving configurations, Using log files
Debugger Screen
[Figure: debugger screen with callouts for the command menu bar, code window, command window, CPU registers, and memory window.]
Lab 1: The Debugger Interface
The Texas Instruments DSP family has moved to a common user interface called the Source Debugger. Almost all TMS320 tools, including the TMS320C54x simulator, use this interface.
This module will guide you through the basic commands of the source debugger. Upon completion of the walkthrough, you will be able to:
• Set up and manipulate windows to display variables and data structures
• Single-step C statements and/or assembly instructions
• Set breakpoints and benchmark code
• Issue debugger commands via command menus, keyboard entry, or a mouse
Note: This walkthrough is intended to demonstrate the use of the debugger interface. It is not meant to be an opportunity to get to know the ’C54x assembly language or C. Please do not attempt to dwell upon them, as this adds considerable time (and effort) to the process. The assembly language will be thoroughly presented in succeeding modules.
Simulator Files and Directory
Verify that you are in the proper directory by typing:
cd \dsp54x\labs ↵
The demo program is a C file which simply loads an incrementing value to a variety of data types. Although of little interest in terms of DSP, it is a useful platform for exercising the debugger interface and commands.
Sample Program - Source Debugger Walkthrough (1 of 2)
/*-------------------------------------------------------------------------*/
/* Sample program for Source Debugger Walkthrough                          */
/*-------------------------------------------------------------------------*/
/* declare globals: int, float array, mixed type structure */
int i;
float a[10];
struct {
    int i;
    float j;
    int k[4];
    int *p;
} example;
void init();
/* count from 0 to 999 forever, call init each count */
main() {
    int count;
    for (;;)
        for (count = 0; count < 1000; count++)
            init(count);
}
Sample Program - Source Debugger Walkthrough (2 of 2)
/* load all globals with the current count value */
void init(x)
int x;
{
    for (i = 0; i < 10; i++)
        a[i] = x;
    example.i = x;
    example.j = x;
    for (i = 0; i < 4; i++)
        example.k[i] = x;
    example.p = (int *)(0x0200 + x);
}
Starting the Simulator
To start the debugger and load your linked output file, type:
SIM5XX lab1 ↵
The debugger assumes that the file to be loaded has a default extension of .out. We will learn how to create output files in Module 2.
You should now see the debugger screen.
Note: If, in the process of this lab, you reach a point where the system no longer responds, or is otherwise corrupted, you may reload the file by typing LOAD lab1 at the command prompt. In rare cases, you may have to exit the simulator entirely by typing QUIT and starting over.
Selecting the Active Window
The active window is shown with a highlighted border. To change the active window, point the mouse at the desired window and press the left button. Repeat this a few times to cycle through the active windows.
Make the DISASSEMBLY window active. You can scroll through the code displayed in the DISASSEMBLY window in two ways: by using the keyboard up-arrow, down-arrow, PgUp, and PgDn keys, or by pointing the mouse at the up and down arrows on the window border and pressing the left button.
Note: Be careful. If you click while over an element of a window, you may set a breakpoint (if you are in a FILE or DISASSEMBLY window), or select a register or memory location for modification. To remove the breakpoint, simply point and click at the highlighted instruction.
Try scrolling through the DISASSEMBLY window several ways.
When you want to return to a particular label or address, use the command addr nnnn, where nnnn is the label or address to return to. For example, type:
addr c_int00 ↵
Move to an absolute address by typing:
addr 0x0005 ↵
Move to a function by typing:
addr main ↵
Sizing and Moving Windows
You can size and move any window. Make the CPU window active with the mouse. To change the size of the window, grab the left or right corner of the window by holding the left mouse button down, and drag the corner to a new position. Release the mouse button.
To move the window, grab the top of the window by holding the left mouse button down and drag the window to a new position.
To restore the screen to its original state, use the “screen configuration” command with no arguments:
sconfig ↵
To load a particular screen configuration you may specify the desired file with the SCONFIG command:
sconfig tc.clr ↵
To save a configuration, use the ssave <file name> command. There is no default extension, although .CLR (for color) is the extension generally used.
Typing sconfig once again will return you to the original configuration. The sconfig command uses the default filename init.clr. You may use either of these configurations, or any of your own creation, whenever using the debugger.
Running the Program
The sample program begins execution at the C reset function labeled c_int00. To reposition the disassembly window, type:
addr c_int00 ↵
The assembly code shown at c_int00 can be single-stepped by pressing <F8> on the keyboard or by pointing and clicking the left mouse button.
Try running a few instructions by pressing <F8> and watching the PC value (in the CPU window) change as the corresponding instruction is executed (highlighted). Modified register and memory contents are also highlighted.
To skip past this reset function, type:
go main ↵
Notice that the display changes to display the C program in the FILE window. Also notice that the CPU registers are no longer displayed. The CALLS window is opened to show which C functions have been called.
The ability to view C source code in its native format is why our debugger is termed a “source” debugger.
Watch Window
Suppose you want to watch the value of a C variable while single-stepping the program. Type:
wa count ↵
This creates a watch window with the value of the variable count displayed. The value displayed for count is not meaningful at this point since it has not been initialized yet. You may discover that opening a watch window on a variable not found in the current function will generate a warning.
Single-step the program (execute one C statement at a time) by pressing
<F8>
<F8>
Notice that the variable count was assigned the value zero. You should now be at the init() function call. Press:
<F8>
and you will go to the function. Notice the change in the CALLS window.
Add another variable to the watch window by typing:
wa i ↵
Single-step some more C statements by pressing <F8>.
To watch an array element, type:
wa a[0] ↵
Notice that the display shows a[0] as a floating-point value automatically. The debugger displays values according to their defined type.
When a watch or display is no longer needed on screen, it may be closed by first selecting it (using <F6> or a mouse click), and then using the close window command: the <F4> key. <F4> does not apply to the main simulator windows (CPU, MEM, Disassembly, etc.).
Displaying arrays and structures is a powerful debugging feature. Type:
wa a ↵
You receive the error message Invalid watch expression because you are allowed to watch only single, scalar values. If you have forgotten the type of the variable a, type:
whatis a ↵
To display the entire array of floating-point values, use the display command:
disp a ↵
You might want to move the DISP window over to the right of the screen.
Display the structure called example by typing:
disp example ↵
This structure has four members called i, j, k, and p. Note that they are displayed in accordance with their type. Move this window over to the right just below the DISP: a window.
To display the contents of the array example.k, move the cursor down to highlight the line showing k: [...] and select this by pressing:
<F9> or the left mouse button
A new window is opened which shows the elements of the array. If this had been another structure (instead of an array), it would be shown as k: {...}. Brackets indicate arrays and braces indicate structures.
Since this new window showing the array k is opened directly on top of the previous window, you should move it down to make the example window visible.
Single STEP and NEXT instructions
Now that the display windows are opened, let’s restart from the beginning and then single-step some instructions. Type:
restart ↵
go main ↵
Press:
<F8>
Continue executing instructions by repeatedly pressing <F8>. Observe how the values in the watch window and display windows change. Continue stepping through the init function until it returns to the main function. If you do not wish to see the remainder of the function in step mode, you can complete the function and return immediately by entering:
ret ↵
Note: If you are not in a sub-function at this point, the simulator will never reach a return and, therefore, will never halt. To stop the simulator in such an event, simply press <Esc>.
Suppose you want to single-step without seeing the details of each individual function call. You can step across function calls using:
Next ↵
Alternatively, you can press <F10>. Notice that the next C statement is executed without showing function calls. (Called functions are not skipped; they are just not executed in single-step mode.)
Both the step command <F8> and next command <F10> can be executed from the command line with an argument specifying the number of instructions to execute. For example, type:
step 10000 ↵
To stop execution, press:
<Esc>
Note: You can give the step command a Boolean expression instead of a numeric count; e.g., step (AR0 != 0)
If you are executing within the init() function and want to return, type:
ret ↵
Now try the next command with a count value:
next 10000 ↵
You can sit back and observe the single-step operation.
To stop execution, press:
<Esc>
and you will see a User halt message displayed in the command window.
Debugging Assembly Language and C Programs
This part of the tutorial assumes you have already completed the first part of the walkthrough and have loaded the lab1.out program into the debugger.
To start execution over again, type:
restart ↵
go main ↵
MIXED Mode
To debug in mixed mode, which allows you to observe assembly instructions and C statements simultaneously, type:
mix ↵
You should see both the C source code and the corresponding assembly code. The DISASSEMBLY window shows highlighted memory locations which are associated with the current C statement.
You may have to move and size your display windows and watch windows to see the CPU and REGISTER windows. A suggestion is to remove (reset) the watch window using the command:
wr ↵
Try single-stepping by repeatedly pressing:
<F8>
Notice that assembly instructions are stepped. If you are currently executing within the init() function and want to return from the function, type:
ret ↵
Try the next command by repeatedly pressing:
<F10>
Continue this while observing that the assembly instruction CALL init is skipped over.
To single-step C statements while you are in mixed mode, type either:
cstep ↵
or
cnext ↵
As with their counterparts step and next, you can give cstep and cnext a count so that a fixed number of statements is executed. For example:
cstep 10 ↵
will execute 10 C statements.
ASM Mode
If you are interested only in debugging an assembly language program, you can switch to assembly mode by typing:
asm ↵
Notice that the windows that display C data structures disappear when you are in assembly mode. This is a convenient way to clear up the screen if you want to observe CPU register values or display memory contents. Try single-stepping by repeatedly pressing:
<F8>
and observe the changing register values in the CPU window. Changed values are highlighted so you will notice when a change occurs.
You can go back to mixed mode by simply typing:
mix ↵
Notice that your DISP windows reappear.
Review of Modes
In summary, there are three modes of operation:
• Mixed mode (mix command) shows assembly and C (if C source exists).
• Assembly mode (asm command) shows assembly code only.
• C mode (c command) automatically switches from C to assembly displays, depending on what type of source code is executing.
Breakpoints and Benchmarking
Restart your program and execute to the first call to the init() function. Type:
restart ↵
mix ↵
go init ↵
To set a breakpoint, use the command ba xxxx, where xxxx is an absolute memory location or a valid label. This method requires that you know the address (or label). For example, type:
ba init ↵
This sets a breakpoint at the entry point to the function. Notice that the instruction is highlighted when a breakpoint is set.
To list breakpoints that are set, type:
bl ↵
To delete all breakpoints, use the breakpoint reset command. Type:
br ↵
Verify the process by listing again. Type:
bl ↵
In addition to using the ba command, you can add a breakpoint by pointing the mouse at the line where the breakpoint is desired and pressing the left mouse button. The line that the breakpoint is set on should now be highlighted. Pressing the left mouse button again will remove the breakpoint.
To execute your program up to the breakpoint, type:
run ↵
The program should stop at the breakpoint. If the breakpoint is not reached, press <Esc> and verify that the breakpoint has been set (use the bl command or look at the FILE/DISASSEMBLY window to see a highlighted instruction).
To use a previously entered debugger command (for lazy typists), press:
<Tab> ↵
Notice that pressing <Tab> backs up to the previous command entered. Pressing ↵ causes that command to be executed again. In fact, you can cycle back through all previous commands you have entered by repeatedly pressing <Tab>. Pressing <Shift><Tab> takes you forward through this command buffer.
Let’s assume you still have a breakpoint set at the for statement. To “benchmark” the execution time required to execute from one breakpoint to another, you need to set a second breakpoint. Go ahead and select another instruction for a breakpoint using either <F9>, a mouse click, or the ba command. To benchmark, type:
run ↵
runb ↵
? clk ↵
The run command executes to the first breakpoint. The runb command is the “run-with-benchmarking” command. The ? command tells the debugger to evaluate the following C expression and display the result. The clk debugger variable is valid only after a runb command and is set to the number of clock cycles between the run and runb commands.
Evaluating Expressions
To evaluate a C expression, you can use the ? command. This is one way to modify register values, since C expressions may have side effects such as assignment. Type:
? pc ↵
You should see the pc value displayed. To modify the current pc, type:
? pc = main ↵
To modify a register, type:
? ar0 = 0 ↵
To evaluate an expression without displaying the result in the COMMAND window, use the eval command instead of the ? command. Type:
eval pc = 0 ↵
eval pc = main ↵
CPU, MEMORY, and WATCH window registers can be modified by pointing the mouse to the desired register and pressing the left mouse button. When the register is selected, it will be highlighted and ready for input from the keyboard.
Point to the CPU window AR0 and press the left mouse button. Enter a new value of 5 and press ↵ when complete.
Displaying Files
You can display any file in the FILE window. Type:
file siminit.cmd ↵
You should see the debugger’s initialization command file displayed. At this point, you can go back to debugging and the previous C source file will automatically be displayed when you start executing instructions.
Within the debugger COMMAND window, you can perform DOS-like commands to examine and change the current directory. Use the command dir nnnn, where nnnn is the directory name, to display a directory listing. Type:
dir ↵
to display the current directory.
The command cd nnnn, where nnnn is the new directory name, changes the current directory.
To clear the COMMAND window, type:
cls ↵
Some other miscellaneous commands are:
• quit which exits the debugger and returns you to DOS
• restart which sets the PC to the code entry point.
Drop Down Menus
To access the drop-down menus from the menu bar at the top of the screen, press <Alt><key>, where <key> is the highlighted menu letter (L, B, W, M, C, or D). Once a menu is displayed, you can execute a command either by typing the designated letter, or by using the arrow keys to move the selector bar to the desired command and pressing ↵. For example, press:
<Alt>L
then repeatedly press the right arrow key to look at the drop-down menus.
The drop-down menus can also be selected by pointing and pressing the left mouse button. For example, select the mode menu with the mouse.
Changing the Display Sizes
If you have a display capable of greater than 80 x 25 character resolution, you can get more information on the screen using the debugger -b[bbbb] option when you invoke the debugger. Let’s try it. Exit the debugger by typing:
quit ↵
From the DOS prompt, enter:
sim54xx lab1 -bb ↵
and you should get a display that shows more detail, but may also cause more eye strain. A larger monitor will allow you to take full advantage of the source debugger’s high resolution modes.
The -bb switch creates a 50-line display. Another switch, -b, offers an intermediate-sized 43-line display. Your preferred display size may be made the default by saving the screen configuration as init.clr with the ssave command described earlier. Then the need to explicitly use the -b switch is eliminated.
Batch Operation of Debugger
You can execute debugger commands from a batch file. This can be useful if there is a certain sequence of commands you want to enter every time you start a debug session for a given application. The filename should have a .log extension. To execute a .log file while in the debugger, use the command take <filename>.log. For example, try the batch command file:
take lab1.log ↵
Congratulations, you have completed the walkthrough. To exit the debugger, press:
<Esc>
Type:
quit ↵
Simulator Quick Reference

Window Management
  Selecting Window
    F6                    rotates to next window
    WIN <name>            selects <name> window
    Click window frame    select window
    F4                    close selected window
  Moving Inside Window
    Up Arrow / Down Arrow
    Page Up / Page Down
    Click on window frame arrows
    For DISASSEM window: type ADDR <value>
    For MEMORY window: type MEM <value>
  Moving Window
    Click on top of frame; drag to new location
    Type MOVE and use arrows or type coordinates
  Sizing Window
    Click on bottom right corner; drag to new shape
    Type SIZE and use arrows or type coordinates
    ZOOM      click on top left corner
    UNZOOM    click again on top left corner
  Screen Configuration
    SCONFIG <name>    load configuration <name>
    SSAVE <name>      save configuration <name>
  Modes
    ASM    display ASM info, or <Alt> D,A
    C      display C info, or <Alt> D,C
    MIX    display both ASM and C, or <Alt> D,M

Running Code
  Reset
    Type RESET      forces PC to zero
    Type RESTART    return to "entry point"
  Stepping
    F8 or type STEP    for one step
    F10 or type NEXT   condense subroutines
    Type STEP <n>      for <n> steps
    Type NEXT <n>      for <n> nexts
  Running
    RUN           run until <Esc> or breakpoint
    RUNB          run with benchmark
    GO <label>    run to <label>

Watches and Breakpoints
  Operation   Watch   Breakpoint
  ADD         WA      BA
  RESET       WR      BR
  LIST        WL      BL
  DELETE      WD #    BD #
  or hot keys or mouse clicks

Other Actions
  ? <label>          display value of <label>
  ? <label> = <n>    load <label> with <n>
  file <name>        load file <name> to file window
  TAB                scroll to prior commands
  SHIFT TAB          scroll to subsequent commands
  F9                 alternate form of mouse click
  TAKE <name>        simulator ’batch’ file
  LOAD <name>        download file <name>

Entry/Exit
  SIM5xx <file>    start simulator with <file>.out
  SIM5xx -bb       high resolution mode
  QUIT             exit simulator
  SYSTEM           go to DOS shell
’C54x Review - CALU
• CALU supports:
  - General-purpose operations:
    - MAC
    - ALU
  - Special functions:
    - CSSU (Viterbi)
    - EXP (Norm)
    - FIRS: MAC + ALU
  - 16- or 32-bit operations:
    - C16 mode
    - ’Double’ operations
’C54x Review - System
• Four buses allow 1 fetch, 2 reads, and 1 write each cycle.
• Built from and for cDSP:
  - Fast growing family
  - Easy to modify for custom use.
• Attributes
  - Static design
  - Low power
  - Any clock below maximum
  - Low $/MIP
  - Fast/dense instructions
  - Small size for functionality
  - LC version for 3V operation
DSP54x - Assembly Language Tools 2 - 1
Assembly Language Tools
Learning Objectives
• Describe steps to create executable output files
• Create an assembly file containing:
  - Code
  - Constants (initialized data)
  - Variables
• Create a linker command file which:
  - Identifies input and output files
  - Describes a system’s available memory
  - Indicates where code and data shall be located
• Develop multi-file systems
Module 2
Software Development Tools
[Flow diagram: Text Editor -> .asm -> ASM500 -> .obj -> LNK500 -> .out -> Debug or HEX500]
  ASM500 -L    also produces the listing file (.lst)
  LNK500       reads the command file (.cmd)
  LNK500 -o    names the output file (.out)
  LNK500 -m    produces the map file (.map)

Command lines:
  ASM500 -LS TEST
  LNK500 TEST.CMD
Software Debug Tools
[Diagram: the .out file is loaded by the debugger onto one of the following targets; HEX500 converts it for a ROM programmer.]
  SIM5xx   Software only
  EVM500   Contains DSP; ISA card
  XDS510   ISA card; no DSP; PC <-> Target board
  HEX500   ROM programmer
Lab 2a: COFF Tools
1. Assemble LAB2A.ASM.
   Note error message - inspect .LST file.
2. Edit LAB2A.ASM.
   Replace label ’strt’ with ’start’ - update and exit file.
3. Reassemble LAB2A.ASM.
   Verify error-free assembly - reinspect .LST file.
4. Link using LAB2A.CMD.
   Verify the result in LAB2A.MAP.
5. Simulate LAB2A.OUT.
   Step through the code to verify performance.
6. Inspect batch files: A.BAT, L.BAT, S.BAT, ALS.BAT.
   Consider their use to save time in later labs.
7. Add a NOP before the loop to separate the .text label from start.
   Reassemble, link and simulate. Note any change from before.
Assembly Files
• Create an assembly file containing:
  - Code
  - Constants (initialized data)
  - Variables
Assembly Conventions
label:  mnemonic  operand,operand  ;comment
  - the colon after the label is optional
  - the mnemonic is an instruction or directive
  - fields are separated by tabs or spaces
• Any ASCII text is O.K.
• Use .asm extension
• Instructions and directives cannot be in first column
• Comments O.K. in any column after semicolon
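The field layout above can be illustrated with a small host-side sketch. This is not how ASM500 is implemented; it is a minimal Python model, assuming only the rules stated on the slide (label in the first column, optional colon, whitespace between fields, comment after a semicolon). The function name `split_asm_line` is hypothetical, and quoted strings containing semicolons are not handled.

```python
def split_asm_line(line):
    """Split one source line into (label, mnemonic, operands, comment)."""
    # Everything after the first semicolon is the comment.
    code, _, comment = line.partition(";")
    label = ""
    # A non-blank character in the first column starts a label.
    if code[:1].strip():
        parts = code.split(None, 1)
        label = parts[0].rstrip(":")      # the colon is optional
        code = parts[1] if len(parts) > 1 else ""
    # The next field is the instruction or directive; the rest are operands.
    fields = code.split(None, 1)
    mnemonic = fields[0] if fields else ""
    operands = []
    if len(fields) > 1:
        operands = [op.strip() for op in fields[1].split(",")]
    return label, mnemonic, operands, comment.strip()


print(split_asm_line("start:  LD   x,A   ; get x"))
print(split_asm_line("\tB start"))
```

A line that begins with a tab or space has no label, which is why instructions and directives must stay out of the first column.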
Assembly Files
• Mnemonics
  - Lines of 320 code
  - Generally written in upper case
  - Become components of program memory
• Directives
  - Begin with a period (.) and are lower case
  - Can create constants and variables
  - May occupy no memory space when used to control ASM and LNK process
COFF Data Types
Type                Examples
Decimal             1234 or +1234 or -1234 (default type)
Hexadecimal         0A40h or 0A40H or 0xA40
Binary              1110001b or 11111001B
Octal               226q or 572Q
Floating-point      1.623e-23 (sign and decimal point optional)
Character           ‘D’
Character strings   “this is a string”
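To make the integer notations in the table concrete, here is a hedged Python sketch that converts the decimal, hexadecimal, binary and octal forms to a value. It only mirrors the suffix and prefix conventions listed above; the function name `parse_int_constant` is an assumption for illustration, and the real assembler's range checks and expression handling are not modeled.

```python
def parse_int_constant(text):
    """Convert an assembler-style integer constant to a Python int."""
    s = text.strip()
    sign = 1
    if s and s[0] in "+-":
        sign = -1 if s[0] == "-" else 1
        s = s[1:]
    if s.lower().startswith("0x"):        # C-style hexadecimal: 0xA40
        return sign * int(s[2:], 16)
    suffix = s[-1].lower()
    if suffix == "h":                     # hexadecimal: 0A40h
        return sign * int(s[:-1], 16)
    if suffix == "b":                     # binary: 1110001b
        return sign * int(s[:-1], 2)
    if suffix == "q":                     # octal: 226q
        return sign * int(s[:-1], 8)
    return sign * int(s, 10)              # default type: decimal


for example in ("1234", "-1234", "0A40h", "0xA40", "1110001b", "572Q"):
    print(example, "=", parse_int_constant(example))
```

The `0x` prefix is tested before the suffix so that a value such as `0x1b` is not mistaken for a binary constant.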
Coding Example: z = x + y
Code:        get x, add y, store z, loop
Constants:   x = 2, y = 7
Variables:   z

        .text
start   LD    x,A       ; get x
        ADD   y,A       ; add y
        STL   A,z       ; store z
        B     start     ; loop

        .data
x       .int  2
y       .int  7

        .bss  z,1
The .bss Directive
• Only directive with assembly label defined in the operand field
• Use separate .bss statements for each named variable
• Remember .bss by thinking:
  - Block  - reserves a block of memory
  - Symbol - beginning at address symbol
  - Size   - of the specified size
• Example: Create a 5-word array ’x’
        .bss  x,5
Basic Assembler Directives
Directive       Example              Definition
.text           .text                Code to follow
.data           .data                Constants to follow
.bss            .bss x,10            Allocate space for variables
.int, .word     TBL .int 53h, 5Ah    Create 16-bit integer constant(s)
Exercise 2b: ASM Files and Sections
; a = 0,1,2,3,4
        _______
_____   _______ ________________

; x = input array of length 5
        _______ _______, _______

; y = result array of length 1
        _______ _______, _______

[Memory diagram: a holds 0, 1, 2, 3, 4; x and y are empty.]
Lab 2b: Assembly Files
[Memory diagram: table holds 1, 2, 3, 4, 8, 6, 4, 2, 0; arrays x, a and y are empty.]
Lab 2b: Procedure
1. Copy LAB2A.ASM to LAB2B.ASM. In LAB2B :
2. Define three arrays in RAM (x, a, y).
3. Define an initialized data table that contains the nine values above.
4. Write code that begins with the label start, contains four NOP instructions, and ends with a branch (B) back to start.
5. Assemble the file and inspect the list (.LST) file.
What is the opcode for NOP? ______________
What are the addresses for the .text and .data sections?
.text ____________
.data ____________
Why? ______________________________________________
Linking
• Create a linker command file which:
  - Identifies input and output files
  - Describes a system’s available memory
  - Indicates where code and data shall be located
Linking
[Flow diagram: .obj -> LNK500 -> .out and .map, controlled by link.cmd]
link.cmd specifies:
• Files: input and output
• Memory description
• How to place s/w into h/w
Example System
[Memory diagram of the ’C54x example system:]
Program Memory
  PROM    8000 to FFFF    code
Data Memory
  SRAM    4000 to 6000    var
  EPROM   8000 to A000    const
Linker Command File
example1.obj
-o example1.out
-m example1.map

MEMORY
{
  Page 0: /* Program */
    PROM:  org = 8000h , len = 8000h
  Page 1: /* Data */
    SRAM:  org = 4000h , len = 2000h
    EPROM: org = 8000h , len = 2000h
}

SECTIONS
{
  .text: > PROM  PAGE 0
  .bss:  > SRAM  PAGE 1
  .data: > EPROM PAGE 1
}
Memory Descriptor Suggestions
1. Describe each memory resource on the processor (internal RAM and/or ROM)
2. Describe each external memory chip in your system
3. Combine contiguous memory segments, if desired
4. Split any memory segment into multiple segments, if desired
5. Name memory segments with useful names; e.g.:
   - Types of memory chips (EPROM, RAM, EEPROM)
   - Usage (vectors, code, variables)
   - Chip layout names (U1, E2)
Exercise 2c: Example System
[Memory diagram: ‘C541 (µP mode)]
Program
  16K SRAM      starting at 0
  32K EPROM     starting at 8000
Data
  SPRAM, DARAM  (on-chip)
  32K DEPROM    starting at 8000
Exercise 2c: Link Command File
example1.obj
-o example1.out
-m example1.map

MEMORY
{
  PAGE ___: /* Program Memory */
    ______: org = ______, len = ______
    ______: org = ______, len = ______
  ________:
    ______: org = ______, len = ______
    ______: org = ______, len = ______
    ______: org = ______, len = ______
}

SECTIONS
{
  .text: > EPROM  PAGE 0
  .bss:  > SPRAM  PAGE 1
  .data: > DEPROM PAGE 1
}
Lab 2c: Linking
[Memory diagram: ‘C541 (µP mode)]
Program
  8K EPROM      ending at FFFF
Data
  SPRAM, DARAM  (on-chip)
  32K DEPROM    starting at 8000
Lab 2c: Linking
Procedure
1. Copy the linker command file LAB2A.CMD to LAB2C.CMD
2. Specify LAB2B as the input; and request output and map files
3. Define a system memory map to include:
   - TMS320C541 (µP mode) with all internal RAM mapped as Data
   - 8K Program EPROM ending at 64K
   - 32K Data EPROM beginning at 8000h
4. Place code sections as follows:
   - Code into EPROM
   - Table into DEPROM
   - Variable arrays in SPRAM
5. Link and inspect the .MAP file. What addresses are assigned to:
   .text _____________
   .data _____________
   .bss  _____________
Multiple Sections
[Diagram: ’C54x with Program Memory and Data Memory. Memory blocks shown: ROM, RAM, SRAM, EPROM, DEPROM and a second EPROM. Section labels shown: code, var, const, vectors.]
How do we put a particular code section into specific memory?
Named Sections
Directive   Example                    Description
.sect       .sect "vectors"            Creates initialized sections for code or constants
.usect      label .usect "name", 23    Creates uninitialized sections for variables
Adding Reset
sum.asm
        .def  start
        .text
start   LD    x,A
        ADD   y,A
        STL   A,z
        B     start
        .data
x       .int  2
y       .int  7
        .bss  z,1

vectors.asm
        .ref  start
        .sect ".vectors"
        B     start
Linker CMD File with Vectors
sum.obj
vectors.obj
-o sum.out
-m sum.map

MEMORY
{
  Page 0: /* Program Memory */
    EPROM:  org = 0E000h , len = 1F80h   /* was 2000h; shortened to leave room for VECS */
    VECS:   org = 0FF80h , len = 0080h
  Page 1: /* Data Memory */
    SPRAM:  org = 0060h , len = 20h
    DARAM:  org = 0080h , len = 1380h
    DEPROM: org = 8000h , len = 8000h
}

SECTIONS
{
  .text:    > EPROM  PAGE 0
  .data:    > DEPROM PAGE 1
  .bss:     > SPRAM  PAGE 1
  .vectors: > VECS   PAGE 0
}
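The change of the EPROM length from 2000h to 1F80h follows from simple address arithmetic: the last 80h words of the 8K EPROM are given to VECS, so the two ranges must not overlap. The Python sketch below checks that arithmetic on the host; the helper names `end_address` and `overlaps` are hypothetical and are not part of LNK500.

```python
def end_address(org, length):
    """Last address occupied by a memory range."""
    return org + length - 1


def overlaps(a, b):
    """True if two (org, len) ranges on the same page share any address."""
    (a_org, a_len), (b_org, b_len) = a, b
    return a_org < b_org + b_len and b_org < a_org + a_len


EPROM_OLD = (0xE000, 0x2000)   # original 8K descriptor
EPROM_NEW = (0xE000, 0x1F80)   # shortened descriptor from the slide
VECS = (0xFF80, 0x0080)        # vector table at the top of program memory

print(hex(end_address(*EPROM_NEW)))   # last EPROM word before VECS
print(overlaps(EPROM_OLD, VECS))      # the 2000h length would collide
print(overlaps(EPROM_NEW, VECS))      # the 1F80h length does not
```

The same check can be run over any pair of ranges placed on one page of a MEMORY description.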
Lab 2d: Multi-file Linking
[Memory diagram:]
Program (P)
  .text      start: NOP, NOP, NOP, B start
  .vectors   at FF80: B start
Data (D)
  .bss       x, a, y
  .data      table: 1 2 3 4 8 6 4 2
Lab 2d: Multi-file Systems
Procedure
1. Create VECTORS.ASM
2. Copy LAB2B.ASM to LAB2D.ASM
   Modify LAB2D to make start accessible
3. Assemble LAB2D and VECTORS
4. Copy LAB2C.CMD to LAB2D.CMD
   Modify LAB2D.CMD to specify the desired input and output files and the routing of the RESET vector
5. Link the system and inspect the .MAP file
6. Step through the code on the simulator to verify performance.
COFF Directive Summary
Type                     Directive   Purpose
Initialized sections     .text       Program code
                         .data       Data constants
                         .sect       User-named
Uninitialized sections   .bss        Data variables
                         .usect      User-named
Constants                .int        Create integer
                         .word       Create integer
                         .long       Create aligned 32-bit constant
Labels                   .def        Define global variable
                         .ref        Reference global variable
                         .global     Global declaration (.ref + .def)
Misc                     .set        Assign a value, similar to .equ or #define
                         .end        Halt assembler
Exercise 2d: Multi-file Issues
Procedure
1. Fill in blanks in EX2C.CMD to support reset vector.
2. Per EX2C.CMD, fill in the post-link addresses in the left-side blanks in the ASM files.
   - Branch is a 2-word instruction.
   - Other instructions are single-word.
   - Put an ’X’ in any blank that has no address.
   - Linkage is performed in the order you specify, with sections as the major and files as the minor sort criteria.
3. Resolve symbolic references in the right-side blanks.
Exercise 2d: EX2C.CMD File
mult.obj
sum.obj
vectors.obj
-o system.out
-m system.map

MEMORY
{
  Page 0:
    SRAM:   org = 0000h  , len = 4000h
    EPROM:  org = 0E000h , len = _____
    VECS:   org = _____  , len = _____
  Page 1:
    SPRAM:  org = 0060h  , len = 0020h
    DARAM:  org = 0100h  , len = 0400h
    DEPROM: org = 8000h  , len = 8000h
}

SECTIONS
{
  .text: > EPROM  PAGE 0
  .data: > DEPROM PAGE 1
  .bss:  > SPRAM  PAGE 1
  _____: > _______________
}
Exercise 2d: mult.asm
mult.asm
____            .ref   z, y
____            .def   c, mult
____    K       .set   [illegible]
____    mult    LD     z,A        ____
____            MPY    y,A        ____
____            ADD    c,A        ____
____            STH    A,z        ____
____    done    B      done       ____
                .data
____    c       .int   [illegible; uses K]   ____

Note: order of link is SECTION major, FILE minor.
Example: linking file1.obj and file2.obj with
  SECTIONS{
    .text: > ROM
    .data: > ROM }
yields:
  file1.text
  file2.text
  file1.data
  file2.data
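The "SECTION major, FILE minor" rule can be sketched as two nested loops: the outer loop walks the sections in the order the SECTIONS block lists them, and the inner loop walks the input files in command-line order. The Python model below is only an illustration of that ordering; the function `link_order`, the section sizes and the ROM origin are all assumed values, not output from LNK500.

```python
def link_order(files, sections, origin):
    """Return [(file, section, address)] for one memory range.

    files    : list of (file_name, {section_name: size_in_words})
    sections : section names in SECTIONS order (major sort key)
    origin   : first address of the target memory range
    """
    layout = []
    address = origin
    for section in sections:              # major: section order
        for name, sizes in files:         # minor: input file order
            size = sizes.get(section, 0)
            if size:
                layout.append((name, section, address))
                address += size
    return layout


# Hypothetical sizes, chosen only to show the ordering.
files = [
    ("file1", {".text": 6, ".data": 1}),
    ("file2", {".text": 5, ".data": 2}),
]
for name, section, address in link_order(files, [".text", ".data"], 0xE000):
    print(f"{name}{section:6s} {address:04X}h")
```

Both `.text` contributions are placed before any `.data`, which matches the order shown in the example above.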
Exercise 2d: sum and vectors.asm
sum.asm
____            .ref   c, mult
____            .def   start, z, y
____    start   LD     x,A        ____
____            ADD    c,A        ____
____            STH    A,z        ____
____            B      mult       ____
____            .data
____    x       .int   [illegible]
____    y       .int   [illegible]
____            .bss   z,1

vectors.asm
____            .ref   start
____            .sect  ".vectors"
____            B      start      ____
LAB2A.ASM : Solution
; SOLUTION FILE FOR LAB2A.ASM
        NOP
start:  NOP
        NOP
        B     start
Exercise 2b: Solution
; a = 0,1,2,3,4
; x = input array of length 5
; y = result
        .data
a       .int  0,1,2,3,4
        .bss  x,5
        .bss  y,1

[Memory diagram: a holds 0, 1, 2, 3, 4; x and y are empty.]
LAB2B.ASM : Solution
        .bss  x,4
        .bss  a,4
        .bss  y,1
        .data
        .word 1,2,3,4
        .word 8,6,4,2,0
        .text
        NOP
start:  NOP
        NOP
        NOP
        NOP
        B     start
Exercise 2c: Link Command File
example1.obj
-o example1.out
-m example1.map

MEMORY
{
  PAGE 0: /* Program Memory */
    SRAM  : org = 0000h , len = 4000h
    EPROM : org = 8000h , len = 8000h
  PAGE 1: /* Data Memory */
    SPRAM : org = 0060h , len = 0020h
    DARAM : org = 0080h , len = 1380h
    DEPROM: org = 8000h , len = 8000h
}

SECTIONS
{
  .text: > EPROM  PAGE 0
  .bss:  > SPRAM  PAGE 1
  .data: > DEPROM PAGE 1
}
LAB2C.CMD : Solution
        .bss data,4
        .
lab2b.obj
-o lab2c.out
-m lab2c.map
MEMORY {
  PAGE 0: EPROM  : org = 0E000h len = 02000h
  PAGE 1: SPRAM  : org = 00060h len = 00020h
          DARAM  : org = 00080h len = 01380h
          DEPROM : org = 08000h len = 08000h
}
SECTIONS{
  .text : > EPROM  PAGE 0
  .data : > DEPROM PAGE 1
  .bss  : > SPRAM  PAGE 1
}
Exercise 2d: CMD File Solution
mult.obj
sum.obj
vectors.obj
-o system.out
-m system.map

MEMORY
{
  Page 0:
    SRAM:   org = 0000h  , len = 4000h
    EPROM:  org = 0E000h , len = 1F80h
    VECS:   org = 0FF80h , len = 0080h
  Page 1:
    SPRAM:  org = 0060h  , len = 0020h
    DARAM:  org = 0100h  , len = 0400h
    DEPROM: org = 8000h  , len = 8000h
}

SECTIONS
{
  .text:    > EPROM  PAGE 0
  .data:    > DEPROM PAGE 1
  .bss:     > SPRAM  PAGE 1
  .vectors: > VECS   PAGE 0
}
Exercise 2d: ASM Files Solution
mult.asm
x               .ref   z, y
x               .def   c, mult
[illegible] K   .set   [illegible]
E000    mult    LD     z,A        0060
E001            MPY    y,A        [illegible]
E002            ADD    c,A        8000
E003            STH    A,z        0060
E004    done    B      done       E004
                .data
8000    c       .int   [illegible; uses K]

sum.asm
x               .ref   c, mult
x               .def   start, z, y
E006    start   LD     x,A        [illegible]
E007            ADD    c,A        8000
E008            STH    A,z        0060
E009            B      mult       E000
x               .data
[illegible] x   .int   [illegible]
[illegible] y   .int   [illegible]
0060            .bss   z,1

vectors.asm
x               .ref   start
x               .sect  ".vectors"
FF80            B      start      E006
LAB2D & VECTORS : Solution
LAB2D.ASM
        .def  start
        .bss  x,4
        .bss  a,4
        .bss  y,1
        .data
        .word 1,2,3,4
        .word 8,6,4,2,0
        .text
        NOP
start:  NOP
        NOP
        NOP
        NOP
        B     start

VECTORS.ASM
        .ref  start
        .sect ".vectors"
        b     start
LAB2D.CMD : Solution
lab2d.obj
vectors.obj
-o lab2d.out
-m lab2d.map
MEMORY {
  PAGE 0: EPROM:  org = 0E000h len = 01F80h
          VECS:   org = 0FF80h len = 00080h
  PAGE 1: SPRAM:  org = 00060h len = 00020h
          DARAM:  org = 00080h len = 01380h
          DEPROM: org = 08000h len = 08000h }
SECTIONS{
  .vectors: > VECS   PAGE 0
  .text   : > EPROM  PAGE 0
  .data   : > DEPROM PAGE 1
  .bss    : > SPRAM  PAGE 1 }
DSP54x - Addressing Modes 3 - 1
Addressing Modes
Learning Objectives
• List the four basic addressing modes and identify the purpose of each.
• Express constants via immediate addressing.
• Access tables and arrays via indirect addressing - a pointer-like process.
• Select the optimal mode when using indirect addressing.
• Perform general purpose access to Data Memory via direct addressing (two methods).
• Define and implement methods for controlling page boundary crossings.
• Access stack variables and MMRs via special versions of direct addressing.
Module 3
Addressing Modes
Type            Symbol             Purpose, Benefit
Immediate       #                  Using constants/initialization
  Long                             16-bit values
  Short                            Single cycle
Indirect        *                  Support for pointers - access arrays, lists, tables
  w. Inc/Dec                       0 cycle auto increment/decrement by +/- 1
  w. Index                         0 cycle auto increment by “n”
Direct          <default> or @     General-purpose access to data
  Absolute                         Access any location in data memory - ‘flat memory’
  Paged                            Single-cycle access within boundary
  SP-relative                      Optimal for stack-based values (C)
  MMR                              Optimal for DP0 values (MMR and SPRAM)
Register                           Operate between Acc A and B
Immediate Addressing
• Long Immediate
  - Allows use of constant
  - Up to 16-bit operand
  - 2 words, 2 cycles
  - Optimal for initialization
  Example:
        LD    #1234h,A
  [Encoding: the opcode word "Load to A #" is followed by a second word holding 1234.]
• Short Immediate
  - Available in limited cases
  - 9-bit or smaller values
  - 1 word, 1 cycle
  - Init. Acc (8), DP (9), ASM (5), etc.
  Example:
        LD    #12h,A
  [Encoding: a single word holds both "Load A #" and the constant 12.]
Indirect Addressing
• Hardware support of pointer concept
• Eight ARs (Address or Auxiliary Registers) available
• AR0 also used as (optional) index
• Allows fast, efficient access to arrays, lists, tables, etc.

Example: $y = \sum_{n=1}^{100} x_n$

        .bss  x,100
        .text
        STM   #x,AR1
        LD    *AR1+,A
        ADD   *AR1+,A
        ADD   *AR1+,A
        ...
        STL   A,y

[Data memory diagram: AR1 points to x, which holds x1, x2, x3, ..., x100; y follows.]
Indexed Addressing
• Add step size option to auto increment.
• AR0 holds step size.
• Mode selected by using *ARn+0 or *ARn-0.
• Pre-mod fixed index w. extra cycle: *+ARn(K)

Example: $y = \sum_{n=1}^{100} x_{2n}$

        .bss  x,200
        .text
        STM   #2,AR0
        STM   #x,AR1
        ADD   *AR1+0,A
        ADD   *AR1+0,A
        ...
        STL   A,*(y)

[Data memory diagram: AR1 steps through x2, x4, x6, ..., x200; y follows.]
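The effect of post-incrementing a pointer by AR0 can be modeled on the host in a few lines. The Python sketch below is an assumption-laden illustration, not DSP code: `memory` stands in for data memory, `ar0` and `ar1` stand in for the auxiliary registers, the starting offset is chosen so that the first element read is x2, and accumulator width and overflow are ignored.

```python
def sum_with_step(memory, start, step, count):
    """Model `ADD *AR1+0,A` repeated `count` times."""
    ar0 = step        # models STM #2,AR0 (step size)
    ar1 = start       # models STM #x,AR1 (pointer)
    acc = 0
    for _ in range(count):
        acc += memory[ar1]    # read the operand through the pointer ...
        ar1 += ar0            # ... then post-increment the pointer by AR0
    return acc, ar1


# x1..x200 hold the values 1..200, so x_n == n (hypothetical test data).
x = list(range(1, 201))
total, final_pointer = sum_with_step(x, start=1, step=2, count=100)
print(total, final_pointer)
```

With `step=1` the same function models the plain `*AR1+` post-increment from the previous slide.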
Indirect Addressing Options
Option        Syntax        Action
No Mod        *ARn          no modification to ARn
Inc/Dec       *ARn+         post increment by 1
              *ARn-         post decrement by 1
Index         *ARn+0        post increment by AR0
              *ARn-0        post decrement by AR0
Circular      *ARn+%        post increment by 1 - circular
              *ARn-%        post decrement by 1 - circular
              *ARn+0%       post increment by AR0 - circular
              *ARn-0%       post decrement by AR0 - circular
Bit-Reverse   *ARn+0B       post increment by n - bit rev (for FFT)
              *ARn-0B       post decrement by n - bit rev (for FFT)
Pre-mod       *ARn(lk)      use *(ARn+LK), ARn unchanged
              *+ARn(lk)     use *(ARn+LK), ARn changed
              *+ARn(lk)%    use *(ARn+LK), ARn changed - circular
              *+ARn         pre-increment by 1, during write only
Absolute      *(lk)         absolute direct
Indirect Addressing Caveats
• Load pointers before using
• Pointer (MMR) latencies:
  - no latency   STM, MVDK
  - 1 cycle      MVDM, MVKD, MVDD
  - 2 cycles     STLM, ST, etc.
• ARs are read/modified in access phase, so during debug, will appear to change early.
• CMPT must = 0 (bit 5, ST1)
  - is 0 on reset
  - is forced to 0 with RSBX CMPT
  - CMPT = 1 allows old 5x-styled NARP operation for ARs.
Absolute Direct Addressing
• Actually a form of indirect addressing.
• Allows access to any data memory operand.
• Requires extra word of code and extra cycle(s).

Example
        .data
x:      .word 1000h
y:      .word 0500h
        .text                   Acc A
        LD    *(x),A            00 0000 1000
        ADD   *(y),A            00 0000 1500

Data Memory
     Addr   Data
x:   01FF   1000
y:   0200   0500
Paged Direct Addressing
• Allows single-word/single-cycle operation
• Seven-bit address field allows access to 128 words
• Pages are selected by DP field in ST0.

        .data
x:      .word 01000h
y:      .word 00500h
        .text                   Acc A    DP
        LD    #x,DP             -----    003
        LD    x,A               01000    003
        ADD   y,A               01001    003

Data Memory
     Addr   Data
     0180   0001
x:   01FF   1000
y:   0200   0500

Here y lies on page 4, so with DP = 3 the ADD reads address 0180 (data 0001) instead of y.
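The address arithmetic behind this example can be checked with a short host-side model. The Python sketch below assumes only what the slide states: a 9-bit data page in DP and a 7-bit offset taken from the low bits of the symbol's address. The function names and the `memory` dictionary are hypothetical helpers for illustration.

```python
def data_page(address):
    """Page number that `LD #sym,DP` would load: the upper 9 address bits."""
    return (address >> 7) & 0x1FF


def paged_direct_address(dp, symbol_address):
    """Effective address: DP supplies the page, the symbol only the low 7 bits."""
    offset = symbol_address & 0x7F
    return (dp << 7) | offset


memory = {0x0180: 0x0001, 0x01FF: 0x1000, 0x0200: 0x0500}
X, Y = 0x01FF, 0x0200

dp = data_page(X)                                   # LD #x,DP
acc = memory[paged_direct_address(dp, X)]           # LD x,A
acc += memory[paged_direct_address(dp, Y)]          # ADD y,A (wrong page)
print(hex(dp), hex(paged_direct_address(dp, Y)), hex(acc))
```

Because x and y sit on different pages, the second access lands 128 words below y, which is the 01001 result shown in the accumulator column.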
Paged Direct Addressing - Blocking
Single DP can be assured in either of two simple methods:

Specify the blocking argument in the linker command file:
        .bss : > RAM BLOCK=128

Group and block variables in ASM file:
        .bss  x,2,1     ;request all vars together,
                        ;third field requests block
y       .set  x+1       ;assign vars within block
                        ;orig sets
Paged Direct - Blocking Example
        .bss  x,2,1
y       .set  x+1
        .text                   Acc A    DP
        LD    #x,DP             -----    004
        LD    x,A               01000    004
        ADD   y,A               01500    004

Data Memory
Addr   Data
100
1FF    ----
200    1000
201    0500
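One way to picture what blocking does in this example: a 2-word block that would start at 1FF straddles the boundary between pages 3 and 4, so it is moved up to 200, where both words share one data page. The Python sketch below models that placement rule as an assumption based on the slide; it is not taken from the assembler or linker, and `fits_in_one_page` and `place_blocked` are hypothetical names.

```python
PAGE_SIZE = 128   # words reachable through one DP value


def fits_in_one_page(start, size):
    """True if [start, start + size) stays inside a single 128-word page."""
    return start // PAGE_SIZE == (start + size - 1) // PAGE_SIZE


def place_blocked(next_free, size):
    """Return the start address for a blocked allocation.

    Assumed rule: keep the current address if the block fits in the page
    (or already starts on a page boundary); otherwise skip to the next page.
    """
    if next_free % PAGE_SIZE == 0 or fits_in_one_page(next_free, size):
        return next_free
    return (next_free // PAGE_SIZE + 1) * PAGE_SIZE


print(fits_in_one_page(0x1FF, 2))      # would cross from page 3 to page 4
print(hex(place_blocked(0x1FF, 2)))    # moved to the start of page 4
```

After the move, one `LD #x,DP` covers both x and y, which is why the accumulator shows 01500 here instead of 01001.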
Paged Direct Addressing - Caveats
• Data page must be managed by programmer.
  - No warnings issued by tools for crossed page.
• CPL bit in ST1 must be 0 for paged direct.
  - Default condition on reset.
  - Invoked with RSBX CPL.
• Useful for fast, random access to <100 variables at a time.
  - For >100 variables, use pointers.
  - Not speed critical - use absolute direct.
• Recommended: Watch DP when debugging code:
        WA ST0<<7, Base = , x
  will display "Base = " and the first address active for paged direct addressing cast in hex.
3 - 14
Stack Relative Direct Addressing
- Alternative to paged direct mode
- Uses 16-bit SP instead of 9-MSB DP as base
- Useful for stack-based operations

Example
Instruction         Acc A
        .text
        SSBX  CPL
        LD    1,A       00 0000 0100
        ADD   2,A       00 0000 0150

Data Memory
Addr    Data
SP+1    0100
SP+2    0050

Notes:
1. SP and DP relative direct are mutually exclusive!
2. Restore CPL = 0 (RSBX CPL) before using paged direct again.
3. CPL = 0 on reset.
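As a rough C sketch of how CPL selects the base for a direct operand, the function below follows the two rules stated on these slides. The memory array, the SP value, and the function names are assumptions made for illustration. Sign extension is ignored because the example values are positive.

```c
#include <stdint.h>

/* Effective address of a direct operand:
   CPL = 0 -> DP (9 MSBs) concatenated with the 7-bit offset
   CPL = 1 -> SP + 7-bit offset */
uint16_t direct_addr(int cpl, uint16_t dp, uint16_t sp, uint16_t offset)
{
    offset &= 0x7Fu;
    if (cpl)
        return (uint16_t)(sp + offset);
    return (uint16_t)(((dp & 0x1FFu) << 7) | offset);
}

/* Hypothetical replay of the slide with CPL = 1. */
long stack_relative_example(const uint16_t *mem, uint16_t sp)
{
    long a = mem[direct_addr(1, 0, sp, 1)];   /* LD  1,A */
    a += mem[direct_addr(1, 0, sp, 2)];       /* ADD 2,A */
    return a;
}
```

The same offset reaches a different word as soon as CPL changes, which is why the two modes cannot be mixed.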
3 - 15
MMR Direct Addressing
- DP is ignored - not used or modified
- CPL is ignored - not used or modified
- Allows access to all DP0 resources (MMRs and SPRAM)
- Invoked via MMR-specific mnemonics:
        LDM, STLM       MMR ↔ Acc
        STM             # → MMR
        PSHM, POPM      MMR ↔ Stack
        MVDM, MVMD      MMR ↔ DMem
        MVMM            AR, SP, BK ↔ AR, SP, BK

Example
        .mmregs                 ;allows MMR names as addresses
        LDM   ST1,B
        OR    #4000h,B          ;4000h = bit 14 of ST1 (CPL)
        STLM  B,ST1
3 - 16
Memory-Mapped Registers (MMR)
Name    Addr. (Hex)   Description
IMR     0000          Interrupt Mask Register
IFR     0001          Interrupt Flag Register
---     0002-0005     Reserved
ST0     0006          Status 0 Register
ST1     0007          Status 1 Register
AL      0008          A accumulator low (A[15:00])
AH      0009          A accumulator high (A[31:16])
AG      000A          A accumulator guard (A[39:32])
BL      000B          B accumulator low (B[15:00])
BH      000C          B accumulator high (B[31:16])
BG      000D          B accumulator guard (B[39:32])
T       000E          Temporary Register
TRN     000F          Transition Register
AR0     0010          Auxiliary Register 0
AR1     0011          Auxiliary Register 1
AR2     0012          Auxiliary Register 2
AR3     0013          Auxiliary Register 3
AR4     0014          Auxiliary Register 4
AR5     0015          Auxiliary Register 5
AR6     0016          Auxiliary Register 6
AR7     0017          Auxiliary Register 7
SP      0018          Stack Pointer Register
BK      0019          Circular Size Register
BRC     001A          Block Repeat Counter
RSA     001B          Block Repeat Start Address
REA     001C          Block Repeat End Address
PMST    001D          PMST Register
---     001E-001F     Reserved
3 - 17
Register Addressing
- Allows interchange between accumulators
- Examples:
        LD    A,B       A → B
        ADD   B,A       A = A + B
- Can sometimes be merged with other action
        ADD   x,B,A     A = B + x
3 - 18
Exercise 3: Addressing
Assume: CPL=0, CMPT=0

Address/Data (hex)
Scratch (DP=0)      Data1 (DP=4)        Data2 (DP=6)
60    20h           200   100h          300   100h
61    120h          201   60h           301   30h
62                  202   40h           302   60h

Program
        LD    #0,DP
        STM   #2,AR0
        STM   #200h,AR1
        STM   #300h,AR2
        LD    61h,A
        ADD   *AR1+,A
        SUB   60h,A,B
        ADD   *AR1+,B,A
        LD    #6,DP
        ADD   1,A
        ADD   *AR2+,A
        SUB   *AR2+,0,A
        SUB   #32,A
        ADD   *AR1-0,A,B
        SUB   *AR2-0,B,A
        STL   A,62h

Fill in after each instruction: DP, AR0, AR1, AR2, A, B
Checkpoint values printed on the slide (hex): A: 120, 260, 390; B: 200, 380; final result: 320
3 - 19
Lab 3: Addressing
[Figure: program memory (P) holds .text and vectors; data memory (D) holds .bss and .data. AR(src) points into the data table, AR(dst) points into the allocated RAM, and each value passes through ACC.]

Code Abstract
        LD    (src1)
        STL   (dst1)
        LD    (src2)
        STL   (dst2)
        ...
done:   B     done

Caveats
- Don't put in a loop
- Use best addressing mode
- Optimizations come in later labs
3 - 20
Lab 3: Procedure
1. Copy LAB2D.ASM to LAB3.ASM. Modify LAB3 by replacing the NOPs with code to copy the nine data table values into the allocated RAM, as shown in the diagram above.
2. Copy LAB2D.CMD to LAB3.CMD. Modify LAB3 as required.
3. Assemble and link your code. Check the .LST and .MAP files for expected results.
4. Step through the code on the simulator. Verify performance; debug as necessary.
Optional: If time permits, add components to create a location "status" and copy ST0 to status. Which addressing modes are best here? Why?
3 - 22
Exercise 3: Addressing - Solution
Assume: CPL=0, CMPT=0

Address/Data (hex)
Scratch (DP=0)      Data1 (DP=4)        Data2 (DP=6)
60    20h           200   100h          300   100h
61    120h          201   60h           301   30h
62                  202   40h           302   60h

Register values (hex) after each instruction; blank = unchanged
Instruction             DP    AR0   AR1   AR2   A     B
LD    #0,DP             0
STM   #2,AR0                  2
STM   #200h,AR1                     200
STM   #300h,AR2                           300
LD    61h,A                                     120
ADD   *AR1+,A                       201         220
SUB   60h,A,B                                         200
ADD   *AR1+,B,A                     202         260
LD    #6,DP             6
ADD   1,A                                       290
ADD   *AR2+,A                             301   390
SUB   *AR2+,0,A                           302   360
SUB   #32,A                                     340
ADD   *AR1-0,A,B                    200               380
SUB   *AR2-0,B,A                          300   320
STL   A,62h

Note: DP is still 6 at the final STL, so 320h is written to 362h, not to scratch location 62h.
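The exercise can be checked with a small C model. It is a sketch under simplifying assumptions: all values are treated as positive 16-bit words (no sign extension or overflow), memory is a plain array, and the struct and function names are invented for this illustration.

```c
#include <stdint.h>

typedef struct {
    long a, b;                  /* accumulators A and B */
    uint16_t dp, ar0, ar1, ar2; /* data page and auxiliary registers */
    uint16_t stored_at;         /* address written by the final STL */
} Regs;

static uint16_t mem[0x400];

/* Paged direct address: DP concatenated with the 7-bit offset. */
static uint16_t dpaddr(uint16_t dp, uint16_t off)
{
    return (uint16_t)((dp << 7) | (off & 0x7Fu));
}

Regs run_exercise3(void)
{
    Regs r = {0};
    mem[0x60]  = 0x20;  mem[0x61]  = 0x120;
    mem[0x200] = 0x100; mem[0x201] = 0x60; mem[0x202] = 0x40;
    mem[0x300] = 0x100; mem[0x301] = 0x30; mem[0x302] = 0x60;

    r.dp  = 0;                                /* LD   #0,DP      */
    r.ar0 = 2;                                /* STM  #2,AR0     */
    r.ar1 = 0x200;                            /* STM  #200h,AR1  */
    r.ar2 = 0x300;                            /* STM  #300h,AR2  */
    r.a = mem[dpaddr(r.dp, 0x61)];            /* LD   61h,A      */
    r.a += mem[r.ar1++];                      /* ADD  *AR1+,A    */
    r.b = r.a - mem[dpaddr(r.dp, 0x60)];      /* SUB  60h,A,B    */
    r.a = r.b + mem[r.ar1++];                 /* ADD  *AR1+,B,A  */
    r.dp = 6;                                 /* LD   #6,DP      */
    r.a += mem[dpaddr(r.dp, 1)];              /* ADD  1,A        */
    r.a += mem[r.ar2++];                      /* ADD  *AR2+,A    */
    r.a -= mem[r.ar2++];                      /* SUB  *AR2+,0,A  */
    r.a -= 32;                                /* SUB  #32,A      */
    r.b = r.a + mem[r.ar1]; r.ar1 -= r.ar0;   /* ADD  *AR1-0,A,B */
    r.a = r.b - mem[r.ar2]; r.ar2 -= r.ar0;   /* SUB  *AR2-0,B,A */
    r.stored_at = dpaddr(r.dp, 0x62);         /* STL  A,62h      */
    mem[r.stored_at] = (uint16_t)r.a;
    return r;
}
```

The model shows the trap in the last instruction: the store lands on page 6 because DP was never set back to 0.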
3 - 23
LAB3.ASM : Solution
; LAB3.ASM: Data Xfer solution
.def start,table,x
.bss x,4
.bss a,4
.bss y,1
.data
table: .word 1,2,3,4
.word 8,6,4,2,0
.text
NOP
start: STM #table,AR1
STM #x,AR2
LD *AR1+,A ;1
STL A,*AR2+
LD *AR1+,A ;2
STL A,*AR2+
LD *AR1+,A ;3
STL A,*AR2+
LD *AR1+,A ;4
STL A,*AR2+
LD *AR1+,A ;5
STL A,*AR2+
LD *AR1+,A ;6
STL A,*AR2+
LD *AR1+,A ;7
STL A,*AR2+
LD *AR1+,A ;8
STL A,*AR2+
LD *AR1+,A ;9
STL A,*AR2+
; Optional process solution
.mmregs
.bss status,1
.def status
option: LDM ST0,A
STL A,*(status)
done: B done
3 - 24
LAB3.CMD : Solution
lab3.obj
vectors.obj
-o lab3.out
-m lab3.map
MEMORY { PAGE 0: EPROM: org = 0E000h len = 01F80h
VECS: org = 0FF80h len = 00080h
PAGE 1: SPRAM: org = 00060h len = 00020h
DARAM: org = 00080h len = 01380h }
SECTIONS{
.vectors: > VECS PAGE 0
.text : > EPROM PAGE 0
.data : > DARAM PAGE 1
.bss : > SPRAM PAGE 1
}
DSP54x - Basic Programming Techniques 4 - 1
Basic Programming Techniques
Learning Objectives
4 - 2
Learning Objectives
- Perform simple branch, loop control, and subroutine operations.
- Set up and employ the stack for subroutine call and return.
- Use the accumulator to load, store, add and subtract 16-bit values from data and program memory.
- Use the multiplier to implement sum-of-products equations.
Module 4
4 - 3
Basic Program Control
Branch                  Call                    Return
B     next              CALL  sub               RET
BACC  src               CALA  src
BC    next,cnd,...      CC    sub,cnd,...       RC    cnd,...

Instruction     Cycles
B, CALL         4
RET             5
BACC, CALA      6
BC, CC, RC      5/3 (condition true/false)
4 - 4
Condition Operators
Group 1                         Group 2
EQ    NEQ   |  OV               TC    |  C    |  BIO
LEQ   GEQ   |  NOV              NTC   |  NC   |  NBIO
LT    GT    |
Pick 1  and/or  Pick 1    OR    Pick 1  and/or  Pick 1  and/or  Pick 1

Examples
        RC    TC
        CC    sub,BNEQ
        BC    new,AGT,AOV
4 - 5
Loop Counter: BANZ
$y = \sum_{n=1}^{5} x_n$

        .bss  x,5
        STM   #x,AR1
        STM   #4,AR2
        LD    #0,A
loop:   ADD   *AR1+,A
        BANZ  loop,*AR2-
        STL   A,y
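The reason the counter is loaded with 4 for five passes can be seen in a C sketch of the same loop. This is a model of the behavior described on the slide (test the register for non-zero, post-decrement, branch), not TI code. The function name is an assumption, and n must be at least 1.

```c
#include <stdint.h>

/* Sum n values with a BANZ-style counter that starts at n-1. */
long sum_with_banz(const int16_t *x, uint16_t n)
{
    long acc = 0;                       /* LD   #0,A         */
    uint16_t ar2 = (uint16_t)(n - 1);   /* STM  #4,AR2       */
    for (;;) {
        acc += *x++;                    /* ADD  *AR1+,A      */
        if (ar2-- == 0)                 /* BANZ loop,*AR2-   */
            break;                      /* counter was zero: fall through */
    }
    return acc;
}
```

The body runs once before the first test, so a counter of n-1 gives n passes.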
4 - 6
Comparison: CMPR
for (n=5; n<10; n++)

        STM   #5,AR1
        STM   #10,AR0
loop:   ...
        ...   *AR1+   ...
        ...
        CMPR  LESS,AR1
        BC    loop,TC

EQUAL   .set  00b
LESS    .set  01b
GRTR    .set  10b
NOTEQ   .set  11b

Useful: .include files
4 - 7
The Stack
Setup:
STACK   .usect "STK",100
        STM    #STACK+100,SP

[Figure: data memory from 0 to 64K with section STK starting at label STACK. Locations below SP are open, SP marks the last used location, and locations above it are used.]

Use:
        CALL :  PC → *--SP
        RET  :  *SP++ → PC
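The pre-decrement push and post-increment pop can be sketched in C with an array standing in for the STK section. The array, pointer, and function names are assumptions for illustration only.

```c
#include <stdint.h>

#define STACK_LEN 100                       /* mirrors .usect "STK",100 */

static uint16_t stk[STACK_LEN];
static uint16_t *sp = stk + STACK_LEN;      /* STM #STACK+100,SP */

/* CALL: PC -> *--SP (decrement first, then store) */
void push_word(uint16_t v) { *--sp = v; }

/* RET: *SP++ -> PC (read first, then increment) */
uint16_t pop_word(void) { return *sp++; }

/* Words currently on the stack. */
unsigned words_in_use(void) { return (unsigned)(stk + STACK_LEN - sp); }
```

SP starts one word past the end of the section, so the first push lands on the last word of STK and SP always points to the last used location.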
4 - 8
Measuring Stack Required
Determining the amount of stack to allocate can be done in four steps:
1. Allocate a large stack and fill it with known values:
        LD    #-8531,A          ;-8531 = 0DEADh
        MVMM  SP,AR7
        RPT   #length
        STL   A,*AR7-
   [Figure: every stack location holds DEAD; AR7 walks down from SP.]
2. Run system to exercise all operations
3. Halt and inspect stack for prior value
   [Figure: stack contents 6B14, 0013, ..., 7AB3, DEAD, DEAD, 0000 (SP); the locations still holding DEAD were never used.]
4. Delete excess (unused) stack
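Steps 1 and 3 amount to painting the stack and then finding the high-water mark. A minimal C sketch, assuming the stack is a plain array that is used from its high end downward; the helper names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#define FILL 0xDEADu    /* -8531 as a 16-bit value */

/* Step 1: fill the whole stack area with the known value. */
void stack_fill(uint16_t *stack, size_t len)
{
    for (size_t i = 0; i < len; i++)
        stack[i] = FILL;
}

/* Step 3: count untouched words from the low end, then subtract.
   A stored value that happens to equal FILL would be undercounted. */
size_t stack_words_used(const uint16_t *stack, size_t len)
{
    size_t unused = 0;
    while (unused < len && stack[unused] == FILL)
        unused++;
    return len - unused;
}
```

The result is only as good as the test run in step 2, so leave some margin when deleting the excess.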
4 - 9
Exercise 4-1. Program Control
1. What is the difference between Branch and Call?
2. Why is there no Return using Accumulator?
3. Are multi-conditions based on the AND or OR of conditions?
4. When looping "n" times, what value do you place in the loop counter?
5. Which register(s) may be used for loop counting?
6. When adding to the stack, what happens to SP?
7. What does SP point to?
8. How many cycles do Branch operations require? Why?
4 - 10
Lab 4a
1. Modify VECTORS.ASM to allocate a stack, set up the SP, and call start

   VECTORS.ASM
           .sect ".vectors"
           B     BEGIN
   ;allocate stack [.usect]
           .text
   BEGIN
   ;[setup SP]
   ;[call START]

2. Copy LAB3.CMD to LAB4A.CMD
3. Modify LAB4A.CMD to route the stack to Data RAM

   LAB4A.CMD
   MEMORY
   { Page 1
       RAM: org=___, len=___
       . . .
   }
   SECTIONS
       STACK: > RAM

4. Link LAB3.OBJ with the modified VECTORS.OBJ to produce LAB4A.OUT
5. Simulate LAB4A.OUT to verify your results, especially the placement of a return address on the stack
4 - 11
Dual Accumulators
        39-32   31-16   15-0
A:      AG      AH      AL
B:      BG      BH      BL

        LD    x,A
        STL   A,*AR1+
        ADD   *AR2-,16,B
        STH   B,y

[Figure: ALU fed by two multiplexers, one with inputs labeled A, B, C, T and one with inputs labeled A, B, D, S; outputs labeled A, B, M.]
4 - 12
Instruction Formats
- Load acc-Lo with Smem
- Load acc-Hi with Smem
- Load acc w. T-SHIFT: Smem
- Load A with shift: shft, Xmem
- Load A with SHIFT: SHIFT, Smem
4 - 13
Load Accumulator: LD
LD _____, dst

Shift Type      Data Memory
Low Acc         Smem
High Acc        Smem, 16
T-reg Value     Smem, TS
Fixed Value     Xmem, [shft]
Extended        Smem, SHIFT !

Constant        #k8
                #K, 16 !
                #K, [shft] !

Accumulator     src, ASM
                src, [SHIFT]

LEGEND
Smem: single data        shft: 0<=S<=15        ASM: Acc. Shifter    K: 16-bit const.
Xmem: ptr. data          SHIFT: -16<=S<=15     TS: TREG(5-0)        k8: 8-bit const.
src, dst: Acc. A or B    ! = 2-word size
4 - 14
Add and Subtract: ADD, SUB
ADD _____ or SUB _____

Shift Type      Data Memory                     Constant                    Accumulator
Low Acc         Smem, src
High Acc        Smem, 16, src, [dst]            #K, 16, src, [dst] !
T-reg Value     Smem, TS, src                                               src, ASM, [dst]
Fixed Value     Xmem, [shft], src               #K, [shft], src !
Extended        Smem, SHIFT, src, [dst] !                                   src, [SHIFT], [dst]
Dual Op         Xmem, Ymem, dst

LEGEND
Smem: single data        shft: 0<=S<=15        ASM: Acc. Shifter    K: 16-bit const.
Xmem: ptr. data          SHIFT: -16<=S<=15     TS: TREG(5-0)        k8: 8-bit const.
src, dst: Acc. A or B    ! = 2-word size
4 - 15
MIN, MAX
MAX   dst       dst = max (A, B)        if A > B then C = 0
MIN   dst       dst = min (A, B)        if A < B then C = 0

Example: z = max (x_n)
        .bss  x,100
        .bss  z,1
        STM   #x,AR1
        STM   #98,BRC
        LD    *AR1+,B
        RPTB  loop
        LD    *AR1+,A
loop:   MAX   B
        STL   B,*(z)

[Figure: AR1 steps through array x; the maximum is stored to z.]
4 - 16
Store Accumulator: STL, STH
Shift type      STL                         STH
None            AccLo → Smem                Acc >> 16 → Smem
ASM             Acc << ASM → Smem           Acc >> (16-ASM) → Smem
Short (Xmem)    Acc << shft → Xmem          Acc >> (16-shft) → Xmem
Extended !      Acc << SHIFT → Smem         Acc >> (16-SHIFT) → Smem

LEGEND
Smem: single data        shft: 0<=S<=15        ASM: Acc. Shifter    K: 16-bit const.
Xmem: ptr. data          SHIFT: -16<=S<=15     TS: TREG(5-0)        k8: 8-bit const.
src, dst: Acc. A or B    ! = 2-word size
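The shift table can be read as "shift the accumulator, then take the low or high word". A C sketch of that reading, with the 40-bit accumulator held in an int64_t; the function names are illustrative, and the right shift of a negative value relies on the usual arithmetic-shift behavior of common compilers.

```c
#include <stdint.h>

/* Shift an accumulator value; a negative count shifts right. */
static int64_t shift_acc(int64_t acc, int shift)
{
    return shift >= 0 ? acc * ((int64_t)1 << shift) : acc >> -shift;
}

/* STL: low 16 bits of the shifted accumulator. */
uint16_t stl(int64_t acc, int shift)
{
    return (uint16_t)(shift_acc(acc, shift) & 0xFFFF);
}

/* STH: bits 31-16 of the shifted accumulator,
   the same as Acc >> (16 - shift) in the table. */
uint16_t sth(int64_t acc, int shift)
{
    return (uint16_t)((shift_acc(acc, shift) >> 16) & 0xFFFF);
}
```

For an accumulator holding 12345678h, a shift of 4 stores 2345h with STH and 6780h with STL.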
4 - 17
Store Constant to Memory
        ST    #K, Smem
- Direct store of constant to memory
- Accumulator not affected
- Two words, two cycles
- Alt. syntax allows store of T or TRN registers
4 - 18
MAC Unit
[Figure: MAC unit block diagram. A 17 x 17 multiplier with inputs labeled T, D, P, C, and A drives a 40-bit adder whose other input is A, B, or 0; results go to accumulator A or B.]

17 x 17 Multiplier:
- Signed / unsigned support
- 8000h x 8000h = 7FFFh in SMUL=1 mode

40-Bit Adder:
- Separate from ALU
- Sum & Add in single cycle
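The 8000h x 8000h case is the one fractional product that does not fit. A hedged C sketch, assuming fractional mode shifts the product left by one bit and that saturation clamps the overflow to the largest positive value; the slide's 7FFFh is taken here to be the upper word of that result. The function name is an assumption.

```c
#include <stdint.h>

/* Q15 x Q15 -> Q31 multiply with saturation of the single overflow case. */
int32_t q15_mul_sat(int16_t x, int16_t y)
{
    int64_t p = (int64_t)x * y * 2;     /* fractional shift left by one */
    if (p > INT32_MAX)
        p = INT32_MAX;                  /* -1.0 x -1.0 would be +1.0 */
    return (int32_t)p;
}
```

Every other input pair produces a product that fits in 32 bits, so the clamp only fires for -1.0 times -1.0.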
4 - 19
Multiplier Instructions
OP      Options                         Execution
LD      Smem, T                         T = S
MPY     Smem, dst                       dst = S * T
        Xmem, Ymem, dst                 dst = X * Y
        Smem, #K, dst !                 dst = S * K
        #K, dst !                       dst = K * T
MAC     Smem, src                       src = src + S * T
        Xmem, Ymem, src, [dst]          dst = src + X * Y
        Smem, #K, src, [dst] !          dst = src + S * K
        #K, src, [dst] !                dst = src + K * T
MAS     Smem, src                       src = src - S * T
        Xmem, Ymem, src, [dst]          dst = src - X * Y
4 - 20
Additional Multiplier Instructions
OpCode  Options                 Execution
MPYA    Smem                    B = S * AH
        dst                     dst = T * AH
MACA    Smem                    B = B + S * AH
        T, src, [dst]           dst = src + T * AH
MASA    Smem                    B = B - S * AH
        T, src, [dst]           dst = src - T * AH
SQUR    Smem, dst               dst = S^2
        A, dst                  dst = AH^2
SQURA   Smem, src               src = src + S^2
SQURS   Smem, src               src = src - S^2
4 - 21
Examples
z = x + y - w
        LD    x,A
        ADD   y,A
        SUB   w,A
        STL   A,z

y = mx + b
        LD    m,T
        MPY   x,A
        ADD   b,A
        STL   A,y

y = x1 * a1 + x2 * a2
        LD    x1,T
        MPY   a1,B
        LD    x2,T
        MAC   a2,B
        STL   B,y
        STH   B,y+1
4 - 22
Lab 4b: Basic Programming
$y = \sum_{n=1}^{4} a_n x_n$

[Figure: program memory holds the Lab 3 code, the vector, and the Lab 4 code. Data memory holds the table in ROM (1 2 3 4 / 8 6 4 2) and its copy in RAM from Lab 3. AR1 and AR2 point to the operands, which pass through T and A to produce y. Done.]
4 - 23
Lab 4b: Procedure
1. Copy LAB3.ASM to LAB4B.ASM. Open LAB4B.ASM.
2. Modify the initialization process to use a BANZ loop.
3. Call a routine that does the following:
   a. Initialize pointers to the x and a arrays.
   b. Multiply the first two array elements into the accumulator.
   c. Multiply and accumulate the remaining pairs using in-line code; don't use BANZ.
   d. Store the result to memory location y.
   e. Return to the main routine.
4. Set up an appropriate linker command file.
5. Assemble, link, simulate and debug your code.
Optional: Obtain the maximum value of an individual product.
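A C reference for the values the lab should produce can help when checking the simulator. This is a sketch with assumed function names; it uses the table values from the lab (x = 1, 2, 3, 4 and a = 8, 6, 4, 2) and a 32-bit accumulator, which is ample for these inputs.

```c
#include <stdint.h>

/* Sum of products: first pair is a plain multiply, the rest accumulate. */
int32_t sum_of_products(const int16_t *x, const int16_t *a, int n)
{
    int32_t acc = (int32_t)x[0] * a[0];
    for (int i = 1; i < n; i++)
        acc += (int32_t)x[i] * a[i];
    return acc;
}

/* Optional part: largest individual product. */
int32_t max_product(const int16_t *x, const int16_t *a, int n)
{
    int32_t best = (int32_t)x[0] * a[0];
    for (int i = 1; i < n; i++) {
        int32_t p = (int32_t)x[i] * a[i];
        if (p > best)
            best = p;
    }
    return best;
}
```

For the lab data the products are 8, 12, 12, 8, so y should read 40 (28h) and the maximum product 12 (0Ch).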
4 - 25
VECTOR4.ASM : Solution
;Solution for VECTORS.ASM for LAB4A
.ref start
LEN .set 100
STACK .usect "STK",LEN
.sect ".vectors"
B BEGIN
.text
BEGIN STM #STACK+LEN,SP
call start
4 - 26
LAB4A.CMD : Solution
lab3.obj
vector4.obj
-o lab4a.out
-m lab4a.map
MEMORY { PAGE 0: EPROM: org = 0E000h len = 01F80h
VECS : org = 0FF80h len = 00080h
PAGE 1: SPRAM: org = 00060h len = 00020h
DARAM: org = 00080h len = 01380h }
SECTIONS{ .vectors : > VECS PAGE 0
.text : > EPROM PAGE 0
.data : > DARAM PAGE 1
.bss : > SPRAM PAGE 1
STK : > DARAM PAGE 1 }
4 - 27
LAB4B.ASM : Solution
.def start,table,x
.bss x,4
.bss a,4
.bss y,1
.text
NOP
start: STM #table,AR1
STM #x,AR2
STM #8,AR7
loop: LD *AR1+,A
STL A,*AR2+
BANZ loop,*AR7-
CALL sop
CALL maxi
done: B done
sop: STM #x,AR1
STM #a,AR2
LD *AR1+,T ;1
MPY *AR2+,A
LD *AR1+,T ;2
MAC *AR2+,A
LD *AR1+,T ;3
MAC *AR2+,A
LD *AR1,T ;4
MAC *AR2,A
STL A,*(y)
RET
.data
table: .word 1,2,3,4
.word 8,6,4,2,0
4 - 28
LAB4B.ASM Optional : Solution
.bss max,1
maxi: STM #x,AR1
STM #a,AR2
LD *AR1+,T ;1
MPY *AR2+,B
LD *AR1+,T ;2
MPY *AR2+,A
MAX B
LD *AR1+,T ;3
MPY *AR2+,A
MAX B
LD *AR1+,T ;4
MPY *AR2+,A
MAX B
STL B,max
RET
4 - 29
LAB4B.CMD : Solution
lab4b.obj
vector4.obj
-o lab4b.out
-m lab4b.map
MEMORY { PAGE 0: EPROM: org = 0E000h len = 01F80h
VECS: org = 0FF80h len = 00080h
PAGE 1: SPRAM: org = 00060h len = 00020h
DARAM: org = 00080h len = 01380h }
SECTIONS{ .vectors: > VECS PAGE 0
.text : > EPROM PAGE 0
.data : > DARAM PAGE 1
.bss : > SPRAM PAGE 1
STK : > DARAM PAGE 1 }
DSP54x - Advanced Programming 5 - 1
Advanced Programming
Learning Objectives
5 - 2
Learning Objectives
- Repeat Functions
- Data Move Functions
- Dual Operands (Xmem, Ymem)
- Long, Double, & Parallel Ops
Module 5
5 - 3
Repeat Next: RPT
- Features
  - Next instruction iterated N+1 times
  - Saves code space (1 or 2 words)
  - Low overhead (1 or 2 cycles)
  - Easy to use
  - Non-interruptible
- Options
  - RPT #k8       up to 256 iterations
  - RPT #K        up to 64K iterations
  - RPT Smem      ref. data mem for count value

Example:
int x[5]={0,0,0,0,0};
        .bss  x,5
        STM   #x,AR1
        LD    #0,A
        RPT   #4
        STL   A,*AR1+
5 - 4
Enhanced Performance with RPT
These instructions execute faster when in a RPT loop:
        MVDM    MVKD    MACD
        MVMD    MVDK    MACP
        MVDP    READA   FIRS
        MVPD    WRITA
Pointer setup and usage becomes more efficient while RPT loop is active.
5 - 5
Non-Repeatable Instructions
Generally, these are operations not useful to repeat, e.g., branches, status register ops, etc.:
        B[D]        BC[D]       BANZ[D]     INTR        RETE[D]
        CALL[D]     CC[D]       RPT         TRAP        RETF[D]
        RET[D]      RC[D]       RPTZ        RESET       BACC[D]
        Far Ops     XC          RPTB[D]     IDLE        CALA[D]

        ANDM        LD DP       MVMM        RSBX
        ORM         LD ASM      CMPR        SSBX
        XORM        LD ARP      DST         RND
        ADDM

Can yield errors. Won't damage device.
5 - 6
Repeat and Zero: RPTZ
- Repeats following instruction N+1 times
- Additionally, zeros specified accumulator
- Uses long constant only
- Requires two words and two cycles

Example:
int x[5]={0,0,0,0,0};
        .bss  x,5
        STM   #x,AR2
        RPTZ  B,#4
        STL   B,*AR2+
5 - 7
Block Repeat: RPTB
- Allows 'zero overhead' looping on any size code segment
- Is a 2-word, 4-cycle instruction
- Is interruptible
- RSA is next line of code
- REA is operand for RPTB
- BRC must be pre-loaded with 'count-1'
- May operate on any length block
5 - 8
RPTB Example
Add 1 to each element in the array x[5]

        .bss  x,5
begin:  LD    #1,16,B
        STM   #4,BRC
        STM   #x,AR4
        RPTB  next-1
        ADD   *AR4,16,B,A       ;} Loop 5x
        STH   A,*AR4+           ;}
next:   LD    #0,B
        ...

'next-1' assures complete fetch of possible multiword final instruction
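In C terms the block repeat is a do-while whose counter starts at count-1. The sketch below mirrors the example above under that reading; the function name is an assumption, count must be at least 1, and overflow of an element is not modeled.

```c
#include <stdint.h>

/* Add 1 to each of count elements, BRC style. */
void add_one_to_each(int16_t *x, uint16_t count)
{
    uint16_t brc = (uint16_t)(count - 1);   /* STM #4,BRC */
    do {
        *x = (int16_t)(*x + 1);             /* ADD *AR4,16,B,A / STH A,*AR4+ */
        x++;
    } while (brc-- != 0);                   /* repeat until BRC was zero */
}
```

As with BANZ, the body executes before the first test, which is why BRC holds 4 for five passes.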
5 - 9
Nested Loops
                                        Level
        STM   #L-1,AR7
1st:    out                             3
        out
        STM   #M-1,BRC
        RPTB  2nd-1
        mid                             2
        mid
        RPT   #N-1
        inner                           1
        mid
        mid
2nd:    out
        out
        BANZ  1st,*AR7-

Level   Operator    Cycles
1       RPT         1
2       RPTB        4+2
3       BANZ        2+4*N

- RPT uses (invisible) RC
- RPTB uses BRC, RSA, REA
- Nesting RPTB possible, but not efficient
5 - 10
Exercise 5-1: Repeat Operations
1. Which repeat functions are interruptible?
2. How many/few lines of code can be in a RPTB?
3. Which repeat function is fastest?
4. What does RPT 5 do?
Write code to:
- Add array x[10] to y[10]
- Add 100 values in the array x
5 - 11
Data Move
- Faster than Load and Store
- Transfer avoids accumulator
- Allows access to program memory
- Optimal with RPT (speed and code size)

Optimal Initialization: MVPD
Example: x[5]={1,2,3,4,5};

Program Memory (ROM)
        .sect “.vectors”
        B     START
        .text
START:  STM   #x,AR5
        RPT   #4
        MVPD  TBL,*AR5+
        …
        .data
TBL:    .word 1,2,3,4,5

Data Memory (RAM)
        .bss  x,5

No data ROM required!
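
The repeated MVPD above can be pictured with the following Python sketch. The dictionaries standing in for program and data memory and the addresses 0x2000 and 0x0060 are hypothetical; the model assumes that under RPT the program-memory source address advances by one word per repetition while the destination advances through *AR5+.

```python
def rpt_mvpd(program_mem, data_mem, pmad, ar, repeat):
    """Model RPT #repeat followed by MVPD pmad,*AR+ (copies repeat+1 words)."""
    for _ in range(repeat + 1):
        data_mem[ar] = program_mem[pmad]
        pmad += 1   # assumed: source address steps once per repetition
        ar += 1     # post-increment of the auxiliary register
    return ar


TBL = 0x2000   # hypothetical program-memory address of the table
X = 0x0060     # hypothetical data-memory address of x
program_mem = {TBL + i: v for i, v in enumerate([1, 2, 3, 4, 5])}
data_mem = {}
final_ar = rpt_mvpd(program_mem, data_mem, TBL, X, 4)
x = [data_mem[X + i] for i in range(5)]
```

RPT #4 gives five transfers, which is why the slide loads 4 for a five-element array.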

Move Instructions

DATA ↔ DATA                     # w/c
        MVDK   Smem, dmad       2/2
        MVKD   dmad, Smem       2/2
        MVDD   Xmem, Ymem       1/1

DATA ↔ MMR                      # w/c
        MVDM   dmad, MMR        2/2
        MVMD   MMR, dmad        2/2
        MVMM   mmr, mmr         1/1

PGM ↔ DATA                      # w/c
        MVPD   pmad, Smem       2/3
        MVDP   Smem, pmad       2/4

PGM(Acc) ↔ DATA                 # w/c
        READA  Smem             1/5
        WRITA  Smem             1/5

LEGEND
Smem: regular data memory address        dmad: 16-bit data memory address
Xmem, Ymem: dual operand data mems       pmad: 16-bit pgm. memory address
MMR: any memory map register             mmr: AR0-AR7, or SP

Exercise 5-2: Move Operations
1. Which instructions would be best for a context save and restore?
2. Which mmrs does MVMM access?
3. Which move operations allow a run-time selectable pmad?
4. Write a routine to copy x[20] to y[20].

Dual Operand Multiplication
[Figure: DATA MEMORY feeds the MAC UNIT over the C BUS and the D BUS; the MAC UNIT drives accumulators A and B.]

Dual Operand MPY
y = mx + b

Standard Solution
        LD    m,T
        MPY   x,A
        ADD   b,A
        STL   A,y

Dual Op Solution
        MPY   *AR2,*AR3,A
        ADD   b,A
        STL   A,y

Dual Op Caveats
• May use only AR2-AR5
• Requires less code space
• Executes more quickly

Dual Operand MPY Example
$y = \sum_{n=1}^{20} x_n a_n$

Single operand version (3 cycles per loop iteration):
        LD    #0,B
        STM   #a,AR2
        STM   #x,AR3
        STM   #19,BRC
        RPTB  done-1
        LD    *AR2+,T
        MPY   *AR3+,A
        ADD   A,B
done:   STH   B,y
        STL   B,y+1

Dual operand version (2 cycles per loop iteration):
sop1:   LD    #0,B
        STM   #a,AR2
        STM   #x,AR3
        STM   #19,BRC
        RPTB  done-1
        MPY   *AR2+,*AR3+,A
        ADD   A,B
done:   STH   B,y
        STL   B,y+1

Total savings: 1 cycle * 20 iterations = 20 cycles

Dual Operand MAC Example
$y = \sum_{n=1}^{20} x_n a_n$

sop2:   STM   #x,AR2
        STM   #a,AR3
        RPTZ  A,19
        MAC   *AR2+,*AR3+,A
        STH   A,y
        STL   A,y+1

Performance: N+2 cycles for N iterations
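
The cycle figures quoted for the three sum-of-products versions can be tabulated with a short Python sketch. The function names are assumptions for illustration, and the counts cover only the loop bodies as the slides do (3 and 2 cycles per RPTB iteration, N+2 for the repeated MAC); setup and store instructions are not included.

```python
def rptb_loop_cycles(iterations, cycles_per_iteration):
    """Cycles spent in an RPTB loop body, ignoring setup and stores."""
    return iterations * cycles_per_iteration


def rptz_mac_cycles(iterations):
    """Cycles for RPTZ + dual-operand MAC, using the slide's N+2 figure."""
    return iterations + 2


N = 20
single_operand = rptb_loop_cycles(N, 3)   # LD / MPY / ADD per iteration
dual_operand = rptb_loop_cycles(N, 2)     # dual-operand MPY / ADD per iteration
savings = single_operand - dual_operand
repeated_mac = rptz_mac_cycles(N)
```

Under these figures the repeated dual-operand MAC needs 22 cycles for 20 terms, against 40 for the dual-operand MPY loop and 60 for the single-operand loop.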

Dual Operand MAC and MPY
        MPY    Xmem,Ymem,dst            dst = Xmem * Ymem
        MAC    Xmem,Ymem,src,[dst]      dst = src + Xmem * Ymem
        MAS    Xmem,Ymem,src,[dst]      dst = src - Xmem * Ymem
        MACP   Smem,pmad,src,[dst]      dst = src + Smem * pmad

LEGEND
Smem: regular data memory address        dmad: 16-bit data memory address
Xmem, Ymem: dual operand data mems       pmad: 16-bit pgm. memory address
src: source accumulator                  dst: destination accumulator

X,Y Addressing Rules
Dual operand addressing allows only certain pointers and modes:

Pointers    Modes
AR2         *ARn
AR3         *ARn+
AR4         *ARn-
AR5         *ARn+0%

Modifiers: BK + AR0

Since the only index offered is circular, regular index is possible only if BK is set to 0, or made very large, e.g., FFFFh.
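
The *ARn+0% mode steps a pointer by AR0 and wraps it inside a buffer of BK words. The Python sketch below is an assumption-laden model rather than a statement of the hardware algorithm: it assumes the buffer base is found by clearing the low N bits of ARn, where N is the smallest value with 2^N > BK, and that the step is no larger than BK. The function name is hypothetical, and the BK = 0 case mentioned on the slide is deliberately left out.

```python
def circular_update(ar, step, bk):
    """Return the new ARn after a circular step of `step` words (model only)."""
    if not 0 < bk <= 0xFFFF:
        raise ValueError("this sketch only models 0 < BK <= FFFFh")
    n_bits = bk.bit_length()           # smallest N with 2**N > bk
    mask = (1 << n_bits) - 1
    base = ar & ~mask & 0xFFFF         # assumed: buffer starts on an N-bit boundary
    index = (ar & mask) + step
    if index >= bk:                    # ran off the end: wrap to the start
        index -= bk
    elif index < 0:                    # ran off the start: wrap to the end
        index += bk
    return base | index


wrapped = circular_update(0x0104, 1, 5)       # last element of a 5-word buffer
linear = circular_update(0x1234, 1, 0xFFFF)   # very large BK behaves like *ARn+0
```

With BK = FFFFh the wrap point is almost never reached, which is why the slide suggests a very large BK to obtain an ordinary indexed step.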

Exercise 5-3: Dual Op MAC
• What addressing options exist for dual operand mode?
• Which multiplication instructions support dual operands?
• Write the code to solve, for i = 1 to 10:
        y(i) = y(i-1) + x(i) · e

Long Word Operations
Example: Z32 = X32 + Y32

Standard Operations
        LD    XHI,16,A
        ADDS  XLO,A
        ADD   YHI,16,A
        ADDS  YLO,A
        STH   A,ZHI
        STL   A,ZLO
Words = 6
Cycles = 6

Long Word Operations
        DLD   XHI,A
        DADD  YHI,A
        DST   A,ZHI
Words = 3
Cycles = 4
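
Both code sequences on the long word slide compute the same 32-bit sum. The Python sketch below models the arithmetic only, under the assumption that ADDS adds the low word as an unsigned value; the function names are hypothetical and the accumulator is truncated to 32 bits rather than modelled at its full width.

```python
MASK16 = 0xFFFF
MASK32 = 0xFFFFFFFF


def add32_standard(xhi, xlo, yhi, ylo):
    """Model LD/ADDS/ADD/ADDS/STH/STL on 16-bit halves."""
    a = xhi << 16                  # LD   XHI,16,A
    a += xlo                       # ADDS XLO,A   (low word added unsigned)
    a += yhi << 16                 # ADD  YHI,16,A
    a += ylo                       # ADDS YLO,A
    a &= MASK32
    return (a >> 16) & MASK16, a & MASK16   # STH A,ZHI / STL A,ZLO


def add32_long(x32, y32):
    """Model DLD/DADD/DST on whole 32-bit operands."""
    a = (x32 + y32) & MASK32
    return (a >> 16) & MASK16, a & MASK16


halves = add32_standard(0x0001, 0xFFFF, 0x0000, 0x0001)
whole = add32_long(0x0001FFFF, 0x00000001)
```

The carry out of the low word reaches the high word in both versions, so the two routes agree.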

Long Operand Instructions
        DLD    Lmem, dst             dst = Lmem
        DST    src, Lmem             Lmem = src
        DADD   Lmem, src, [dst]      dst = src + Lmem
        DSUB   Lmem, src, [dst]      dst = src - Lmem
        DRSUB  Lmem, src, [dst]      dst = Lmem - src

• Double store requires two cycles for dual E-bus activity
• Double Load/Add/Sub are single cycle in DARAM
• Double operations to single access memories take two cycles
• Default auto-increment step size is TWO

Long Operand Issues
• Long operand instructions read the MSW from the specified address and the LSW from the same address with the LSB toggled
        Ex1:  DLD  100,A        A = @100 @101
        Ex2:  DLD  201,B        B = @201 @200
• Recommended: Align words in memory so that the MSW is at an even address
        Ex1:  .long 12345678h       even: 1 2 3 4
                                    odd:  5 6 7 8
        Ex2:  .bss  XHI,2,1,1       even: XHI
                                    odd:  XLO
              (.bss operands: Name, Size, Page Contig, EVEN ALIGN)
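
The LSB-toggle rule for long operands can be written out directly. This Python sketch uses a hypothetical function name; the two checks are the slide's own examples.

```python
def long_operand_addresses(addr):
    """Return (MSW address, LSW address) for a long-operand access at addr."""
    return addr, addr ^ 1   # LSW comes from the same address with the LSB toggled


ex1 = long_operand_addresses(100)   # even address: LSW follows the MSW
ex2 = long_operand_addresses(201)   # odd address: LSW precedes the MSW
```

An even MSW address therefore gives the natural MSW-then-LSW layout, which is the reason for the alignment recommendation.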

Double Word Operations
Example: Z = X + Y and F = D + E
• Split accumulators into separate LO and HI halves: SSBX C16

1. Interleave Data (memory order: X, D, Y, E, Z, F)
        .data
        .align 2
        .word  X,D,Y
        .word  E,Z,F
   --- or ---
        .bss   X,6,1,1

2. Write Code
        SSBX   C16
        …
        DLD    X,B
        DADD   Y,B
        DST    B,Z
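
With C16 set, one DADD performs the two 16-bit additions of the example at once. The Python sketch below models that split under the assumption that no carry passes between the halves; overflow and saturation behaviour are not modelled, and the function names are hypothetical.

```python
def pack(hi, lo):
    """Pack two 16-bit words as they sit in interleaved memory (hi first)."""
    return ((hi & 0xFFFF) << 16) | (lo & 0xFFFF)


def dadd_c16(a32, b32):
    """Model DADD with C16 = 1: two independent 16-bit additions."""
    hi = ((a32 >> 16) + (b32 >> 16)) & 0xFFFF
    lo = ((a32 & 0xFFFF) + (b32 & 0xFFFF)) & 0xFFFF   # carry does not reach hi
    return (hi << 16) | lo


X, D = 1, 0xFFFF
Y, E = 2, 1
result = dadd_c16(pack(X, D), pack(Y, E))
Z, F = result >> 16, result & 0xFFFF
```

Here F wraps to 0 without disturbing Z, which is the behaviour that separates double (dual 16-bit) operations from long (32-bit) operations.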

Parallel Operations
Example: Z = X + Y and F = D + E
• Parallel load/store instructions use D Bus and E Bus in same cycle.

Memory layout: AR5 points to X, Y, Z; AR6 points to D, E, F
        .bss  X,3
        .bss  D,3

        STM   #X,AR5
        STM   #D,AR6
        LD    #0,ASM
        LD    *AR5+,16,A
        ADD   *AR5+,16,A
        ST    A,*AR5
     || LD    *AR6+,B
        ADD   *AR6+,16,B
        STH   B,*AR6

• Parallel ops focus on high accumulator.
• Store in parallel ops are offset by ASM value.
        ◦ ASM is a 5-bit signed field in ST1 (bits 4-0)
        ◦ ASM is best loaded with: LD #k5,ASM
• What is the error in the above example?

Parallel Instructions

Instruction                  Example                      Operation
LD || MAC[R]                 LD      Xmem,dst             dst = Xmem << 16
LD || MAS[R]                 ||MAC[R] Ymem,[dst2]         dst2 = dst2 + T * Ymem

ST || MPY                    ST      src,Ymem             Ymem = src >> (16-ASM)
ST || MAC[R]                 ||MAC[R] Xmem,dst            dst = dst + T * Xmem
ST || MAS[R]

ST || ADD                    ST      src,Ymem             Ymem = src >> (16-ASM)
ST || SUB                    ||ADD   Xmem,dst             dst = dst + Xmem
ST || LD

ST || LD                     ST      src,Ymem             Ymem = src >> (16-ASM)
                             ||LD    Xmem,T               T = Xmem
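
The store half of a parallel instruction writes src >> (16-ASM), with ASM read as a 5-bit signed value. The Python sketch below works through that formula; the function names are hypothetical and the accumulator is treated as a plain non-negative integer, so sign handling of the accumulator itself is not modelled.

```python
def decode_asm(field):
    """Interpret the 5-bit ASM field (ST1 bits 4-0) as a signed value."""
    field &= 0x1F
    return field - 32 if field & 0x10 else field   # range -16..15


def parallel_store_word(src, asm):
    """Word written by ST src,Ymem in a parallel instruction (model only)."""
    return (src >> (16 - asm)) & 0xFFFF


high_word = parallel_store_word(0x12345678, 0)    # ASM = 0 stores bits 31-16
shifted = parallel_store_word(0x12345678, 4)      # ASM = 4 stores bits 27-12
```

With ASM = 0 the parallel store behaves like STH, which matches the slide's remark that parallel ops focus on the high accumulator.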

Double, Long, and Parallel Review
• How many program words are double, long, and parallel ops?
• How many cycles do they take to execute?
• If a ST || LD refers to the same Acc and DMEM, what happens?
• What is the ASM field? What does it affect? How is it loaded?
• How should 32-bit data be aligned in memory? How is that accomplished?

Bus Usage

Instruction Activity             PB      CB          DB          EB
Program Read                     A,D
Program Write                    A                               D
Data Single Read                                     A,D
Data Dual Read                           A,D         A,D
Data Long (32-bit) Read                  A,D(ms)     A1,D(ls)
Data Single Write                                                A,D
Data Read / Data Write                               A,D         A,D
Dual Read / Coefficient Read     A,D     A,D         A,D
Peripheral Write                                                 A,D
Peripheral Read*                                     A,D

* MMRs only accessible via D Bus, MMR access as Ymem op yields bad data!

Module Review
• Fast loops:                  Repeat
• Fast data transfer:          Move Ops
• Faster math:                 Dual Operands
• Fast 32-bit math:            Long Ops
• Double math:                 Double Ops
• Two actions in one cycle:    Parallel Ops

Lab 5: Advanced Programming

Program Memory (ROM)
        .sect “.vectors”
        B     start
        .text
start:  …
        MVPD  tbl,*…
        …
        MAC   *,*…
        .data
tbl:    .word 1,2,3,4
        .word 8,6,4,2

Data Memory (RAM)
        .bss
x       ___ ___ ___ ___
a       ___ ___ ___ ___
y       ___

Lab 5: Procedure
1. Copy LAB4B.ASM to LAB5.ASM. Modify LAB5 to:
        a. Perform initialization with a repeated MVPD.
        b. Perform the sum-of-products with a repeated dual operand MAC.
2. Copy LAB4.CMD to LAB5.CMD. Modify LAB5.CMD to:
        a. Load .data to program memory.
        b. Input LAB5.OBJ and create LAB5.OUT and LAB5.MAP.
3. Assemble, link, and simulate the program. Debug and verify performance.
4. Optional: If time permits, modify LAB5.ASM to use MACP. What effects would using MACP have on the system implementation?

Lab 5: Optional

Program Memory (ROM)
        .sect “.vectors”
        B     start
        .text
start:  …
        MVPD  tbl,*+
        …
        MACP  coeff,…
        .data
tbl:    .word 1,2,3,4
a:      .word 8,6,4,2

Data Memory (RAM)
        .bss
x       ___ ___ ___ ___
y       ___

Exercise 5-1: Solution

Add 100 values in the array x
        .bss  x,100
        STM   #x,AR6
        RPTZ  A,99
        ADD   *AR6+,A

Add array x[10] to y[10]
        .bss  x,10
        .bss  y,10
        STM   #x,AR2
        STM   #y,AR3
        STM   #9,BRC
        RPTB  next-1
        LD    *AR2+,A
        ADD   *AR3,A
        STL   A,*AR3+
next:   LD    #0,A

Exercise 5-2 & 5-3: Solutions

Copy x[20] to y[20]
        .bss  x,20
        .bss  y,20
        STM   #x,AR2
        STM   #y,AR3
        RPT   #19
        MVDD  *AR2+,*AR3+

y(i) = y(i-1) + x(i) · e, where i = 1 to 10
        .bss  x,10
        .bss  y,10
        .bss  e,1
        STM   #x,AR2
        STM   #y,AR3
        STM   #e,AR4
        LD    #0,A
        STM   #10-1,BRC
        RPTB  loop-1
        MAC   *AR2+,*AR4,A
        STL   A,*AR3+
loop:

Lab 5: Solution
        .bss  x,4
        .bss  a,4
        .bss  y,1
        .data
tbl:    .word 1,2,3,4
        .word 8,6,4,2,0
        .text
start:  STM   #data,AR1
        RPT   #8
        MVPD  tbl,*AR1+
        STM   #data,AR2
        STM   #coeff,AR3
        RPTZ  A,3
        MAC   *AR2+,*AR3+,A
        STL   A,*(result)

Lab 5 Optional: Solution
        .bss  x,4
        .bss  y,1
        .data
tbl:    .word 1,2,3,4,0
a:      .word 8,6,4,2
        .text
        STM   #data,AR1
        RPT   #4
        MVPD  tbl,*AR1+
        STM   #data,AR1
        RPTZ  A,3
        MACP  *AR1+,coeff,A
        STL   A,*(result)

Advantages: Faster initialization, Less RAM required

Module 6
Pipeline Issues

Learning Objectives
• Describe the ’C54x pipeline events.
• Implement delayed branching.
• Identify and resolve pipeline conflicts.

Pipeline Operation
P  PREFETCH   PAB loaded with PC contents.
F  FETCH      PB loaded by wrapper manager.
D  DECODE     IR loaded with either PB content or IQ content. IR content is decoded.
A  ACCESS     DAB loaded with data1 read address if required. CAB loaded with data2 read address if required. Auxiliary register update.
R  READ       DB loaded by wrapper manager with data1 if required. CB loaded by wrapper manager with data2 if required. EAB loaded with data3 write address if required.
X  EXECUTE    Execution of the instruction and EB loaded with write data.

Pipe Flow
TIME →
P1  F1  D1  A1  R1  X1
    P2  F2  D2  A2  R2  X2
        P3  F3  D3  A3  R3  X3
            P4  F4  D4  A4  R4  X4
                P5  F5  D5  A5  R5  X5
                    P6  F6  D6  A6  R6  X6

Standard vs. Delayed Branch: B & BD
[Pipeline diagram, B addr: 2 WORDS, 4 CYCLES. The branch is recognized at decode; the fetches that follow it are flushed before the instruction at ADDR enters the pipeline.]
[Pipeline diagram, BD new: 2 WORDS, 2 CYCLES. The 2 final code words after the branch (instructions 3 and 4) run to completion, followed by the instruction at NEW.]

Delayed Branch Examples

        LD    x,A
        ADD   y,A
        MPY   z,B
        STL   A,r
        B     next
6 words, 8 cycles

Move branch up two words of code.

        LD    x,A
        ADD   y,A
        BD    next
        MPY   z,B
        STL   A,r
6 words, 6 cycles
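
The word and cycle totals in the two listings can be reproduced by summing per-instruction costs. In this Python sketch the B and BD figures (2 words with 4 and 2 cycles) come from the preceding slide, while the one-word, one-cycle cost for the other four instructions is an assumption that happens to match the slide's totals.

```python
# (mnemonic, words, cycles); the 1/1 entries are assumed costs
STANDARD = [("LD", 1, 1), ("ADD", 1, 1), ("MPY", 1, 1), ("STL", 1, 1), ("B", 2, 4)]
DELAYED = [("LD", 1, 1), ("ADD", 1, 1), ("BD", 2, 2), ("MPY", 1, 1), ("STL", 1, 1)]


def totals(sequence):
    """Return (total words, total cycles) for a list of instruction costs."""
    words = sum(w for _, w, _ in sequence)
    cycles = sum(c for _, _, c in sequence)
    return words, cycles


standard_totals = totals(STANDARD)
delayed_totals = totals(DELAYED)
```

The code size is unchanged; the two cycles saved are the ones the standard branch spends flushing the pipeline, now filled by MPY and STL in the delay slot.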

Delayed Operations
        BD        CALLD     BCD
        BACCD     CALAD     CCD
        RETD                RCD
        BANZD     RETED
        RPTBD     RETFD

Delayed branches are effectively two words faster than their non-delayed version.

Delay Slot Caveats
• Delay slot is two words deep - cycles or lines of code are not relevant
• Delay operation may not be a branch of any kind (B, CALL, RET, RPT, etc.)
• Conditions in delay slot will be too late
• Do not load BRC in slot of RPTBD
• No PUSH/POP in CALLD or RETD delay slots

Conditional Execution: XC
• Allows fast choice of running one or two words of code or substitution of NOPs.
• Condition evaluated early, so must be set two instructions prior.
• Avoid change of condition in last two lines prior to XC, as they can be recognized in event of interrupt prior to XC.

        XC    n,cnd,cnd,cnd

Using BC (3 words, 5/4 cycles):
        -pre-
        -pre-
        CMPR  GRTR,AR1
        BC    next,TC
        LD    *AR3+,A
next:   ABS   A

Using XC (2 words, 2 cycles):
        CMPR  GRTR,AR1
        -other-
        -other-
        XC    1,NTC
        LD    *AR3+,A
        ABS   A

Exercise 6-1: Delayed Operations
• How does BD differ from B? How do they differ in code?
• What should not appear in delay slots?
• Why shouldn’t PUSH or POP appear in a CCD slot?
• What should be done if a condition is set in the delay slot of BC?
• Write code using RPTBD to perform: $y = \sum_{n=1}^{10} x_n a_n$
• When would this approach be better than using RPTZ?

Exercise 6-1: Solution
        STM   #7,BRC
        RPTBD next-1
        MPY   *AR2+,*AR3+,A
        MAC   *AR2+,*AR3+,A
        MAC   *AR2+,*AR3+,A
next:   STL   A,y
        STH   A,y+1

Lab 6
1. Modify VECTORS.ASM to employ BD.
2. What code would be most useful in the delay slot?
Optional: If time permits, modify your sum-of-products to be interruptible.

Pipeline Cases

Average C54x System Code
        30%  C Code                                   No Problem      (case 1)
        70%  Assembly Code
                65%  CALU Operations                  No Problem      (case 2)
                5%   MMR Writes
                        1%  Regular MMR Write         Use Key         (case 3)
                        2%  Early Write               No Problem      (case 4)
                        2%  Protected MMR Write
                                1.9%  Usual Case              No Problem      (case 5)
                                0.1%  Prior Reg MMR Write     Add 1 Cycle     (case 6)

Analysis:
• 99% of ’C54x code requires no special attention.
• Latency requirements are resolved via a table.

Pipeline Case 1 - C Code
• Compiler does not produce code with latency issues
• User need not debug C code for pipeline-related issues
• C code is ideal for non-critical speed path code.
        ◦ Operating system
        ◦ Diagnostics
        ◦ Etc.
• Allows portability of software to other platforms as required.
• Systems can easily mix C and ASM code.

Pipeline Case 2 - CALU Operations
• No pipeline errors exist between CALU operations.
• Special efforts have been made to avoid errors without slowing down the pipeline.
• Only one rare case exists where a CALU activity results in a slowdown, and it is handled automatically by the ’C54x without data errors (W,W,R||R).

CALU Operations - Analysis
• The ’C54x may need to perform a fetch, two reads, and a write in any given cycle. Depending on the system setup, this event could occur in one cycle or be spread over several cycles. In no case are errors generated. Consider the following environments:
        ◦ More than one external access: multiple cycles
        ◦ Each resource in separate memories: single cycle
        ◦ Note: ’C54x memories are broken into blocks.
        ◦ More than one resource in a single ’C54x memory block - Dual Access RAM:
                Early phase: P and D
                Late phase: C and E

Pipeline Events
[Timing diagram: bus activity per instruction type. Subscript A marks an address bus, subscript D a data bus.]
Single read instructions:            PA  PD  DA  DD
Dual read instructions:              PA  PD  DA + CA  DD + CD
Single write instructions:           PA  PD  EA  ED
Dual write instruction (2 cycles):   PA  PD  EA  ED, then EA  ED
Read/write instructions:             PA  PD  DA  DD  EA  ED

DARAM Events
[Timing diagram: DARAM accesses per instruction type.]
Single read instructions:            P, D
Dual read instructions:              P, D, C
Single write instructions:           P, E
Dual write instruction (2 cycles):   P, E, E
Read/write instructions:             P, E, D

Case Study - Latencies Avoided

WRITE       STL   A,*AR3+       [accesses: P, E]
READ        LD    *AR2+,A       [accesses: P, D]
What if both are to the same address?

WRITE       STL   A,*AR3+       [accesses: P, E]
---         LD    #0,A          [accesses: P]
DUAL READ   ADD   *AR4,*AR5,A   [accesses: P, C, D]
Early write held off to allow dual access to operate w/o delay.

Case Study - Automatic Latency

WRITE       STL   A,*AR3+       [accesses: P, E]
WRITE       STH   A,*AR3        [accesses: P, E]
DUAL READ   ADD   *AR4,*AR5,A   [accesses: P, C, D]
One cycle latency automatically inserted by decoder

Pipeline Issues for MMR Activity

5%  MMR Writes
        1%  Regular MMR Op              Use Key         (case 3)
        2%  Early Write                 No Problem      (case 4)
        2%  Protected MMR Write
                1.9%  Usual Case            No Problem      (case 5)
                0.1%  Prior Reg MMR Op      Add 1 Cycle     (case 6)

MMRs That Affect Pipeline

Name   Description                    Ph
AR0    Auxiliary Register 0           R*
AR1    Auxiliary Register 1           R
AR2    Auxiliary Register 2           R
AR3    Auxiliary Register 3           R
AR4    Auxiliary Register 4           R
AR5    Auxiliary Register 5           R
AR6    Auxiliary Register 6           R
AR7    Auxiliary Register 7           R
SP     Stack Pointer Register         R/A
BK     Circular Size Register         A

Name   Description                    Ph
BRC    Block Repeat Counter           P
RSA    Block Repeat Start Address     P
REA    Block Repeat End Address       P
T      Temporary Register             X
A      Acc A - written as MMR         X
B      Acc B - written as MMR         X
ST0    Status Register 0              --
ST1    Status Register 1              --
PMST   Proc. Mode Status Register     --

* AR’s have been designed specially to operate ‘late’: R instead of A.

Pipeline Control Fields

ST0:   DP
ST1:   BRAF, CPL, SXM, C16, FRCT, ASM, OVM
PMST:  OVLY, DROM, MP/MC, IPTR

Pipeline phase used by each field:
X      OVM, SXM, C16, FRCT, ASM
R
A      DP, CPL, DROM
D
F
P      BRAF, MP/MC-, OVLY, IPTR

Pipeline Case 3: Standard Ops on MMRs
• Standard write operations will create latency issues that must be considered by the programmer!
• Consider the following pipeline diagram to understand how much latency a control field requires:

Instr. 0 writes to a control field:   P0  F0  D0  A0  R0  X0

Instr.   Stage   Effect on these stages ready now
1                (1 word for effect)
2        X2      A, B, T, SXM, ASM, OVM, FRCT, C16
3        R3      ARn, SP(0)
4        A4      SP(1), BK, DP, CPL, DROM
5        D5
6        F6
7        P7      OVLY, MP/MC-, IPTR, & BRC, RSA, REA, BRAF

example:
        SSBX  SXM
        NOP
        LD    x,B

Calculating Minimum Number of Protected Cycles
• Latency diagram shows worst case number of NOPs to insert between store to control field and effect to be valid.
• NOPs need not be used; any other non-involved code may intervene.
• Extra cycle from double word dependent instructions may be counted, reducing the number of other intervening cycles required; e.g.:

EXPLICIT PROTECTED CYCLE
        SSBX  SXM
        NOP
        LD    x,B

IMPLICIT PROTECTED CYCLE
        SSBX  SXM
        LD    *(x),B
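
The counting rule above can be sketched as a small calculation. Everything in this Python block is an illustrative assumption: the worst-case latencies are read off the Case 3 diagram as one word per pipeline stage between the consuming stage and X (X = 1 up to P = 6), the function and table names are hypothetical, and the shorter latencies of the protected instructions are not modelled.

```python
# Assumed worst-case latency, in words, by the stage that consumes the field
WORST_CASE_LATENCY = {"X": 1, "R": 2, "A": 3, "D": 4, "F": 5, "P": 6}


def nops_needed(stage, intervening_words=0, dependent_words=1):
    """NOPs still required before the dependent instruction (worst case)."""
    latency = WORST_CASE_LATENCY[stage]
    # A two-word dependent instruction supplies one protected cycle itself
    covered = intervening_words + (dependent_words - 1)
    return max(0, latency - covered)


explicit = nops_needed("X")                      # SSBX SXM / LD x,B
implicit = nops_needed("X", dependent_words=2)   # SSBX SXM / LD *(x),B
```

Under these assumptions the one-word LD x,B needs one NOP after SSBX SXM, while the two-word LD *(x),B needs none, matching the explicit and implicit examples.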

Pipeline Case 4: Early Write
• Many MMRs and bit fields are set up during initialization and don’t get changed during runtime, e.g.:

begin:  …
        SSBX  SXM
        …
        …
        CALL  MAIN

main:   …
        LD    x,A

Any pipeline read more than 6 words removed from the write is immune.

• These cases are very common and do not present any pipeline concerns to the user.

Pipeline Case 5: Protected Instructions
• Given the pipeline latency issue, it would be helpful to have optimized instructions that operate early
        ◦ Allow faster code
        ◦ Easier to write
• Therefore, these instructions offer a one-cycle early execution on MMR writes:
        STM   #K,MMR            ST    #K,MMR
        POPD  Smem              POPM  MMR
        MVDK  Smem,dmad         MVMD  MMR,Smem
        FRAME n
• Initialization of AR’s with no explicit latency:
        ST, STM, MVMM, MVDK, MVMD        LD #k9,DP        LD #k5,ASM
• Modify allows early increment, so no latency issues arise: MAR *ARn+

Pipeline Case 6: Protected Instruction Exception

Problem: Protected instructions attempting to write early (in the R phase) can be blocked if a prior standard instruction writes to an addressing register in the X phase:
        STLM  A,AR0
        MVMM  AR2,AR1
        LD    *AR1,B

Solution: Add one protected cycle before or after STM:
        STLM  A,AR0
        MVMM  AR2,AR1
        nop
        LD    *AR1,B

Note: Problem can be extended through a chain of special instructions:
        STLM  A,AR0
        MVMM  AR7,AR6
        MVMM  AR2,AR1
        nop
        LD    *AR1,B
Execute Unit Latencies

Control Field          Latency 0                     Latency 1
SXM, C16, FRCT, OVM                                  All stores, incl. SSXM, RSXM
T                      STM; MVDK; LD x,T; ST || LD   All other stores, incl. EXP
ASM                    LD #k5,ASM; LD Smem,ASM       All other stores
A or B                 All except:                   1. mod acc  2. read mmr
6 - 30
Access Unit LatenciesAccess Unit Latencies
Control FieldControl Field Latency 0Latency 0 Latency 1Latency 1 Latency 2Latency 2 Latency 3Latency 3
AR, AR, SP (CPL=0)SP (CPL=0) STM; STSTM; ST MVKD; MVDMMVKD; MVDM All otherAll other
MVDK; MVMD MVDK; MVMD MVPD; MVDDMVPD; MVDD storesstoresMVMMMVMM POPM SPPOPM SP
POPD SPPOPD SP
BKBKSPSP (CPL=1) (CPL=1) STM; STSTM; ST MVKD; MVDMMVKD; MVDM All otherAll other
MVDK; MVMD MVDK; MVMD MVPD; MVDDMVPD; MVDD storesstoresMVMM; FRAMEMVMM; FRAME POPM SPPOPM SPPUSH; POPPUSH; POP POPD SPPOPD SPRETFDRETFD
DP (CPL=0)DP (CPL=0) LD #K,DPLD #K,DP STM; STSTM; ST All otherAll other LDLD Smem Smem,DP,DP MVDK; MVMDMVDK; MVMD storesstores
CPLCPL STM; STSTM; ST All otherAll otherMVDK; MVMDMVDK; MVMD storesstores incl incl..
SSBX RSBXSSBX RSBX
Module 6
6 - 18 DSP54x - Pipeline Issues
6 - 31
Other LatenciesOther Latencies
CtlCtl Field Field Latency 2Latency 2 Latency 3Latency 3 Latency 4Latency 4 Latency 5Latency 5 Latency 6Latency 6
DROMDROM STM; STSTM; ST All otherAll otherMVDKMVDK storesstoresMVMDMVMD
OVLYOVLY STM; STSTM; ST All otherAll otherIPTRIPTR MVDKMVDK storesstoresMP/MC-MP/MC- MVMDMVMD
BRC BRC ** SRCCDSRCCD STM; ST STM; ST All storesAll storespre-looppre-loop MVDKMVDK pre-looppre-loop
MVMDMVMD
BRAFBRAF **** All storesAll storespre-looppre-loop
*  Note: Writing to BRC before RPTB has zero latency for ST, STM, MVDK, MVMD, and one latency for all other stores
** Avoid modifying BRAF in line prior to RPTB[D]
Latency Caveats

• No latency for CALU operations
• Use protected MMR writes whenever possible
• Set status early
• Use latency diagram when writing to MMRs
• For debug: focus on unprotected MMR writes
• Reference Guide has chapter on pipeline use
Exercise 6-2a

1. Determine if latency condition exists.
2. Note why.
3. Add appropriate number of NOPs to correct.

      LD     GAIN,T
      STM    #input,AR1
      MPY    *AR1+,A

      STLM   B,AR2
      STM    #input,AR3
      MPY    *AR2+,*AR3+,A

      MPY    *AR1,A
      POPM   AR0
      MVKD   #table,*AR0

      ADD    y,A
      LD     #table,DP
      ADD    table,A,B
Exercise 6-2b

      MAC    x,B
      STLM   B,ST0
      ADD    table,A,B

      STM    #pointer,AR4
      STM    #stack,SP
      LD     VAR1,A

      STL    B,coeff
      STLM   A,SP
      POPM   AR0

      LD     *AR2,A
      SSBX   SXM
      LD     data,B
Exercise 6-2a - Solution

      LD     GAIN,T
      STM    #input,AR1
      MPY    *AR1+,A            ; 0 latency: STM

      STLM   B,AR2
      STM    #input,AR3
      nop
      MPY    *AR2+,*AR3+,A      ; 1 latency: STM exc'n

      MPY    *AR1,A
      POPM   AR0
      nop
      MVKD   #table,*AR0        ; 1 latency: pop ARn

      ADD    y,A
      LD     #table,DP
      ADD    table,A,B          ; 0 latency: LD DP
Exercise 6-2b: Solution

      MAC    x,B
      STLM   B,ST0
      nop
      nop
      nop
      ADD    table,A,B          ; 3 latencies: DP

      STM    #pointer,AR4
      STM    #stack,SP
      LD     VAR1,A             ; 0 latency: STM

      STL    B,coeff
      STLM   A,SP
      nop
      nop
      POPM   AR0                ; 2 latencies: SP(0)

      LD     *AR2,A
      SSBX   SXM
      nop
      LD     data,B             ; 1 latency: SXM
VECTOR6.ASM: Solution

              .ref    start
      LEN     .set    100
      STACK   .usect  "STK",LEN
              .sect   ".vectors"
              BD      start
              STM     #STACK+LEN,SP
DSP54x - Numerical Issues 7 - 1
Numerical Issues
Learning Objectives
Learning Objectives

• Identify & resolve issues for:
  - Multiplication
  - Addition / Subtraction
  - Division
• Select the appropriate numerical models
  - Integer vs. Fraction
  - Signed vs. Unsigned math
  - Rounding vs. Truncation
  - Overflow vs. Carry
  - Fixed point vs. Floating point
• List mnemonics to perform
  - Extended precision math
  - Boolean Operations
Module 7
Integer Multiplication

• Integer multiplication yields products larger than the inputs, as can be seen in the example below, using single digit decimal values as inputs:

        9       value
      x 9       times value
      ---
      8 1       yields double size result

• Does the user store the lower (1) or upper (8) result?
• Both must be kept, resulting in additional resources (two cycles, words of code, and RAM locations) to complete the store.
• Worse, how can the double-sized result be used recursively as an input in later calculations, given that the multiplier inputs are single-width?
Fractional Multiplication

• Multiplication of fractions yields products that never exceed the range of a fraction, as can be seen in the example below, using single digit decimal fractions as inputs:

        . 9       value
      x . 9       times value
      -----
      . 8 1       yields double size result

• Don't we still have a double sized result to store?
• In this case, we can store just the upper result (.8)
• This allows storage of result with fewer resources
• Results may be used recursively
• Has accuracy been lost by dropping the lower accumulator value?
Accuracy vs. Precision

• Often the programmer wants to retain the fullest accuracy of a calculation, thus dropping the 16 LSBs of the result in the previous example seems a bad choice.
• Note the inputs, though: how much accuracy do they offer?
• The product offers double precision, but its accuracy is based on the single-width inputs.
• Thus, storing a single precision result is not only an efficient solution, but represents the limit of the accuracy of the result.
• The accumulator is double-sized for two reasons:
  - To allow for integer operations, which would possibly require the LSBs for the result.
  - So that sum-of-product operations accumulate noise at the 32nd rather than the 16th bit.
Two's Complement Fractions

• How can fractions be represented in binary?
• Since fractions have a range of +1 to -1, we will have to create a system capable of representing this range.
• Since negative numbers are involved, a two's complement system is required. Two's complement numbers follow these rules:
  1. The bits are a binary weighted progression
  2. The MSB and only the MSB is of negative sign
  3. Complement equals invert plus one
  4. Small values written to large registers require sign extension
• Given items 1 and 2 above, we can create the following fractional model:

      -1    1/2    1/4    1/8    ...
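The bit weights above can be checked with a short host-side sketch. It is written in Python purely as an illustration of the arithmetic (it is not 'C54x code), and the helper name `q_value` and the 4-bit word size are assumptions made for this example.

```python
def q_value(bits: str) -> float:
    """Value of a two's complement fraction given as a bit string, MSB first.

    The MSB carries weight -1; the remaining bits carry +1/2, +1/4, +1/8, ...
    """
    total = -1.0 * int(bits[0])
    weight = 0.5
    for b in bits[1:]:
        total += weight * int(b)
        weight /= 2
    return total


print(q_value("0100"))  # 0.5
print(q_value("1101"))  # -0.375
print(q_value("1000"))  # -1.0
print(q_value("0111"))  # 0.875
```

The last two lines show the asymmetry of the range: -1 is representable exactly, while the largest positive value falls one LSB short of +1.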
Fractional Example

• The following example demonstrates how two's complement fractions perform under multiplication.
• The 4/8 bit model shown here behaves identically to the 16/32 bit TMS320 device

             0100
           x 1101
           ------
             0100
            0000
           0100
          1100
          -------
          1110100

• What values do the inputs represent?
• What is the result?

      Accumulator    1111 0100

• What should be stored to memory?

      Data Memory    1110

• What are the Q-types of the input, accumulator, and output values?
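One way to check the worked example is to model it on a host machine. The sketch below is an assumption-laden illustration in Python, not device code: the helper names are hypothetical, the inputs are treated as Q3 values, the accumulator as 8 bits, and the stored word is taken from accumulator bits 6 to 3 (dropping the extra sign bit and the three LSBs).

```python
def to_signed(value: int, bits: int) -> int:
    """Interpret the low `bits` bits of `value` as a two's complement integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value & (1 << (bits - 1)) else value


def q3_multiply(a_bits: int, b_bits: int):
    """Multiply two 4-bit Q3 fractions; return (8-bit accumulator, 4-bit stored word)."""
    product = to_signed(a_bits, 4) * to_signed(b_bits, 4)  # Q3 * Q3 = Q6
    acc = product & 0xFF                # 8-bit accumulator image
    stored = (acc >> 3) & 0xF           # keep bits 6..3 as a Q3 result
    return acc, stored


acc, stored = q3_multiply(0b0100, 0b1101)
print(f"{acc:08b}")                  # 11110100
print(f"{stored:04b}")               # 1110
print(to_signed(acc, 8) / 64)        # -0.1875 (Q6 accumulator value)
print(to_signed(stored, 4) / 8)      # -0.25   (Q3 stored value)
```

Under these assumptions the inputs are 0.5 and -0.375, the exact product is -0.1875, and the stored 4-bit word reads as -0.25 because the low bits are truncated.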
Redundant Sign Bit

• Multiplication of two signed numbers yields product with two sign bits

              S x x x          Q3
            * S y y y          Q3
      S S z z z z z z          Q6

  or, with FRCT=1:

      S z z z z z z 0          Q7

• Extra sign bit causes problems if stored to memory as result:
  - Wastes space
  - Creates off-size Q
• Solution: Fractional mode bit!
• When FRCT (mode bit in ST1) is set, the multiplier output is left-shifted by one

      SSBX   FRCT
      ...
      MPY    *AR2,*AR3,A
      STH    A,*(z)

• For 16-bit 'C54x: Q15*Q15=Q15
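The effect of the fractional mode bit can be sketched on a host machine. The Python below is only a model of the arithmetic described on this slide; the function names `mpy` and `sth` are hypothetical stand-ins for the multiply and the store of the high accumulator word, and the operands are assumed to be Q15 values held as signed integers.

```python
def mpy(x: int, y: int, frct: bool) -> int:
    """Signed 16x16 multiply; with frct=True the product is left-shifted by one."""
    product = x * y
    if frct:
        product <<= 1   # removes the redundant sign bit
    return product


def sth(acc: int) -> int:
    """Return the high 16-bit word of the 32-bit result, as STH would store it."""
    return (acc >> 16) & 0xFFFF


half = 0x4000  # 0.5 in Q15

plain = sth(mpy(half, half, frct=False))
shifted = sth(mpy(half, half, frct=True))

print(f"{plain:04X}", plain / 32768)      # 1000 0.125  (off-size Q if read as Q15)
print(f"{shifted:04X}", shifted / 32768)  # 2000 0.25   (correct Q15 result)
```

Without the shift, the stored word would have to be interpreted as Q14 to give 0.25; with FRCT set it is directly usable as a Q15 input to the next multiply.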
Exercise 7a: Multiplier Issues

1. Does the C54x support integer operations?
2. What is the optimal numerical type? Why?
3. How is the extra sign bit in fractional multiply handled?
Accumulation

• With fractions, we were able to guarantee that no multiplicative overflow could occur, i.e., F*F<=F.
• For addition, this rule does not apply: F+F can exceed F.
• Therefore, we need additional measures to manage the possibility of overflow for accumulation. Two general methods apply:
  - Guard Bits: the 'C54x offers an 8-bit extension above the high accumulator to allow valid representation of the result of up to 256 summations.
  - Non-gain Systems: offer additional criteria that allow a simple solution for unlimited length summations.
Guard Bits

• Guard Bits: the 'C54x offers an 8-bit extension above the high accumulator to allow valid representation of the result of up to 256 summations.

          AG        AH        AL
          BG        BH        BL
      39      31        15         0

• At the conclusion of the summation, what should be done?
  - Store all accumulator components?
  - Store only high accumulator?
  - What should be done about guard values?
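The "256 summations" figure can be checked with a little arithmetic. The following Python sketch is an illustration only; it assumes each term is the largest positive 32-bit value and compares the running sum against the 32-bit and 40-bit signed limits. The constant and function names are made up for the example.

```python
ACC32_MAX = (1 << 31) - 1   # largest value of AH:AL alone (0x7FFFFFFF)
ACC40_MAX = (1 << 39) - 1   # largest value including the 8 guard bits


def accumulate(term: int, count: int) -> int:
    """Add `term` to a running total `count` times, with unlimited precision."""
    total = 0
    for _ in range(count):
        total += term
    return total


total_256 = accumulate(ACC32_MAX, 256)
total_257 = accumulate(ACC32_MAX, 257)

print(total_256 > ACC32_MAX)    # True: the sum no longer fits in 32 bits
print(total_256 <= ACC40_MAX)   # True: it still fits in the 40-bit accumulator
print(total_257 <= ACC40_MAX)   # False: a 257th worst-case term overflows
```

Eight guard bits give a factor of 2^8 = 256 of headroom, which is where the limit on the slide comes from.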
Saturation (SAT)

• SAT instruction saturates value exceeding 32-bit range in the selected accumulator:

      SAT  A     -or-     SAT  B

• Provides single-cycle 'clipping' function:
  - Values not overflowed are unchanged
  - Positive overflows are set to: 00 7FFF FFFF h
  - Negative overflows are set to: FF 8000 0000 h
• Is automatic on store if SST=1 (LP devices)

[Figure: accumulator value before saturating and after saturating; axis marks at 0, +/-1, and +/-256]
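The clipping rule above can be modelled in a few lines. This Python sketch is a host-side illustration, not the device instruction; `sat` and `as_hex40` are hypothetical helper names, and the accumulator is represented as an ordinary signed integer.

```python
POS_LIMIT = 0x7FFFFFFF     # shown on the slide as 00 7FFF FFFF h
NEG_LIMIT = -0x80000000    # shown on the slide as FF 8000 0000 h


def sat(acc: int) -> int:
    """Clip a signed accumulator value to the 32-bit range."""
    if acc > POS_LIMIT:
        return POS_LIMIT
    if acc < NEG_LIMIT:
        return NEG_LIMIT
    return acc


def as_hex40(value: int) -> str:
    """Format a signed value as a 40-bit two's complement hex string."""
    return f"{value & 0xFFFFFFFFFF:010X}"


print(as_hex40(sat(0x1234567890)))    # 007FFFFFFF
print(as_hex40(sat(-0x1200000000)))   # FF80000000
print(as_hex40(sat(0x00123456)))      # 0000123456 (unchanged)
```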
Overflow Bits (OVA, OVB, OVM)

• Overflow (OV) is used to record if the range of AccHi is ever exceeded.
  - OV is a latched event: once set it remains set
  - OV is cleared only by:
      - Test of OV: BC oflow,OVB
      - Write to OV: RSBX OVA or STM #val,ST0
      - System reset
  - OV is largely obsolete, given the presence of the Acc Guard.
• Overflow Mode (OVM) causes accumulator to saturate at 32nd bit:
  - Acc = 0x00 7FFF FFFF is the positive limit
  - Acc = 0xFF 8000 0000 is the negative limit
• Setting OVM (SSBX OVM) causes guard bits to be unused.
• Setting OVM makes accumulator values non-linear even if subsequent terms would have corrected for intermediate overflows.
• Overflow mode is generally undesirable, and should usually be turned off (RSBX OVM)
Non-gain Systems

• Many systems can be modeled to have no DC gain:
  - Filters with low Q.
  - Any system scaled by its maximum gain value.
• Input values from A/D converters are automatically fractions, if the limits of the A/D are presumed to be +/- 1.
• Coefficient values can similarly be bounded by making the largest value the scaling factor for all other values.
• For these systems, it is known that the final value of the process is less than or equal to the input values.
• The accumulator therefore can be allowed to temporarily overflow, since the final result is known to be bounded by +/- 1.
• Allows maximum usage of selected A/D and D/A converters
  - D/A bits for gain are more expensive than using analog components
Number Circle

[Figure: 16-bit two's complement number circle. 0000h = 0, 4000h = +1/2, 7FFFh = ~1, 8000h = -1, C000h = -1/2, with 0001h and FFFFh on either side of zero; the carry (C) and overflow (OV) crossing points are marked]

        7FF0h
      +  100h = 80F0h     Overflowed intermediate result
      +   10h = 8100h     Overflowed intermediate result
      -  200h = 7F00h     Valid final result
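The number-circle sequence can be reproduced with a short host-side model. The Python below is an illustrative sketch: it uses a 16-bit word as in the figure, and the two helper functions are hypothetical names for "let the sum wrap" and "clip every intermediate sum" (the latter stands in for saturating arithmetic).

```python
def signed16(value: int) -> int:
    """Interpret a 16-bit word as a two's complement integer."""
    return value - 0x10000 if value & 0x8000 else value


def add_wrap(acc: int, term: int) -> int:
    """16-bit add that lets the result wrap around the number circle."""
    return (acc + term) & 0xFFFF


def add_clip(acc: int, term: int) -> int:
    """16-bit add that clips each intermediate result to the signed range."""
    result = max(-0x8000, min(0x7FFF, signed16(acc) + term))
    return result & 0xFFFF


wrapped = clipped = 0x7FF0
for term in (0x0100, 0x0010, -0x0200):
    wrapped = add_wrap(wrapped, term)
    clipped = add_clip(clipped, term)

print(f"{wrapped:04X}")   # 7F00: intermediate overflows cancel, final result is valid
print(f"{clipped:04X}")   # 7DFF: clipping the intermediates corrupts the final result
```

This is the same effect the OVM discussion describes: saturating intermediate sums makes the accumulation non-linear even when the final result would have been in range.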
Fractional Representation

      Fractions    Integers (* 32768)    Hex
      ~1           32K                   7FFFh
      1/2          16K                   4000h
      0            0                     0000h
      -1/2         -16K                  C000h
      -1           -32K                  8000h

To store 0.707 type:

      .word  32768*707/1000
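The scaling expression above can be evaluated by hand or with a short script. This Python sketch assumes the assembler evaluates the expression with integer arithmetic and truncating division, which is an assumption made here rather than something stated on the slide; `q15_word` is a hypothetical helper name.

```python
def q15_word(numerator: int, denominator: int) -> int:
    """Evaluate 32768*numerator/denominator with integer math, as a 16-bit word."""
    return (32768 * numerator // denominator) & 0xFFFF


word = q15_word(707, 1000)
print(word, f"{word:04X}")        # 23166 5A7E
print(word / 32768)               # about 0.70697, slightly below 0.707

# +1.0 itself does not fit: 32768 wraps to 8000h, which reads as -1.
print(f"{q15_word(1, 1):04X}")    # 8000
```

The last line is why the table lists the largest positive fraction as ~1 (7FFFh) rather than exactly 1.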
Handling Amplifier Functions

• Gain is best handled external to the 'C54x
• Allows DSP to perform frequency shaping functions
  - higher precision than analog
  - lower cost
  - more stable
  - readily supports adaptive systems
• Analog system can perform gain functions
  - Op Amps and resistors are very low cost
  - DC gain can easily be made very accurate
  - Adaptive DC gain in analog
      - Not as easy, but reasonable
      - May be controlled by 'C54x
Exercise 7b: Accumulation Issues

1. How wide are the accumulators?
2. What are guard bits for?
3. What is the easiest way to avoid accumulative overflow?
4. When is saturation useful?
5. What benefit do OVA, OVB, and OVM serve?
LAB 7: Fractional Math

Program Memory (ROM)

              .vectors
              B       start

              .text
      start:  …
              MVPD    tbl,*…
              …
              MAC     *,*…

              .data
      tbl:    .word   1, 2
              .word   3, 4
              .word   8, 6
              .word   4, 2

Data Memory (RAM)

              .bss
      x       ___ ___ ___ ___
      a       ___ ___ ___ ___
      y       ___

(On the slide each of the eight table values carries a "0." prefix, i.e. the entries are to be entered as fractions.)
LAB 7: Procedure

1. Copy LAB5.ASM to LAB7.ASM. Modify LAB7 to:
   a. Use the fractional data table shown above
   b. Perform fractional multiplication
   What status bits will be important for this routine to perform correctly?
2. Copy LAB5.CMD to LAB7.CMD. Modify LAB7.CMD to input LAB7.OBJ and create LAB7.OUT and LAB7.MAP.
3. Assemble, link, and simulate the program. Debug and verify performance. What answer did you get?
4. To better view the result on the simulator, try:

      WA *(y)/327,y = 0 .,d

5. Optional: Time permitting, repeat your experiment using some negative array values. Was your result as expected?
Division

• The 'C54x does not have a single cycle 16-bit divide instruction
  - Divide is a rare function in DSP
  - Division hardware is expensive
• The 'C54x does have a single cycle 1-bit divide instruction: conditional subtract or SUBC
  - Preceded by RPT #15, a 16-bit divide is performed
  - Is much faster than without SUBC
• The SUBC process operates only on unsigned operands, thus software must:
  - Compare the signs of the input operands
      - If they are alike, plan a positive quotient
      - If they differ, plan to negate (NEG) the quotient
  - Strip the signs of the inputs
  - Perform the unsigned division
  - Attach the proper sign based on the comparison of the inputs
Division Routine

      LD     @den,16,A      ; A(high) = den
      MPYA   @num           ; B = num*den (tells sign)
      ABS    A              ; Strip sign of denominator
      STH    A,@den         ; den = |den|
      LD     @num,A
      ABS    A              ; Strip sign of numerator
      RPT    #15            ; 16 iterations
      SUBC   @den,A         ; 1-bit divide
      XC     1,BLT          ; If result needs to be negative
      NEG    A              ; Invert sign
      STL    A,@quot        ; Store quotient
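The repeated conditional subtract can be followed more easily in a host-side model. The Python sketch below is an illustration under stated assumptions, not device code: `subc` models one step as "subtract the divisor shifted left by 15; if the result is non-negative, keep it, shift left and set the LSB, otherwise just shift the accumulator left", and `divide` wraps it with the sign handling listed on the Division slide. Operands are assumed to be 16-bit values with a non-zero divisor.

```python
def subc(acc: int, den: int) -> int:
    """One conditional-subtract step (a 1-bit divide)."""
    diff = acc - (den << 15)
    if diff >= 0:
        return (diff << 1) + 1   # subtraction succeeded: quotient bit = 1
    return acc << 1              # subtraction failed: quotient bit = 0


def divide(num: int, den: int) -> int:
    """Signed 16-bit divide built from 16 conditional subtracts."""
    negative = (num * den) < 0   # sign of the product tells the sign of the quotient
    acc = abs(num)               # strip sign of numerator
    den = abs(den)               # strip sign of denominator
    for _ in range(16):          # RPT #15 gives 16 iterations
        acc = subc(acc, den)
    quotient = acc & 0xFFFF      # quotient ends up in the low word
    return -quotient if negative else quotient


print(divide(100, 7))     # 14
print(divide(-100, 7))    # -14
print(divide(100, -7))    # -14
print(divide(-100, -7))   # 14
```

After the 16 steps the high word of the modelled accumulator holds the remainder (2 for 100/7); the routine on the slide stores only the low word.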
Rounding

• Result of multiplication can be rounded for MPY, MAC and MAS operations. This is specified by appending the instruction with an "R" suffix.
  - Example: MAC with rounding is MACR.
  - Rounding consists of adding 2^15 to the result and then clearing the low accumulator.
• In a long sum-of-products, only the last MAC operation should specify rounding:

      RPTZ   A,#98
      MAC    *AR2+,*AR3+,A
      MACR   *AR2+,*AR3+,A

• Rounding can also be achieved with a load operation: LDR Smem,dst
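The rounding rule ("add 2^15, then clear the low accumulator") is easy to compare against plain truncation. The Python sketch below is a host-side illustration; `round_acc` and `truncate_acc` are hypothetical names, and the accumulator is modelled as an ordinary integer.

```python
def round_acc(acc: int) -> int:
    """Add 2^15, then clear the low 16 bits."""
    return (acc + 0x8000) & ~0xFFFF


def truncate_acc(acc: int) -> int:
    """Clear the low 16 bits without rounding."""
    return acc & ~0xFFFF


for acc in (0x12347FFF, 0x12348000):
    print(f"{acc:08X} -> round {round_acc(acc):08X}, truncate {truncate_acc(acc):08X}")
# 12347FFF -> round 12340000, truncate 12340000
# 12348000 -> round 12350000, truncate 12340000
```

A low word of 8000h or more bumps the high word up by one; truncation always rounds toward the lower value.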
Sign Extension (SXM)

Example: LD #0F794h,8,A

                   Before                        After
               C   G    ACC                  C   G    ACC
      SXM=1    X   00   EF13 648C            X   FF   FFF7 9400
      SXM=0    X   00   EF13 648C            X   00   00F7 9400
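The two "After" rows can be reproduced with a small host-side model. This Python sketch is only an illustration of the example on the slide; `ld_shift` is a hypothetical helper that sign-extends (or not) a 16-bit constant, shifts it left, and returns the 40-bit accumulator image.

```python
def ld_shift(value16: int, shift: int, sxm: bool) -> int:
    """Load a 16-bit value with a left shift into a 40-bit accumulator image."""
    if sxm and value16 & 0x8000:
        value16 -= 0x10000               # sign-extend when SXM=1
    return (value16 << shift) & 0xFFFFFFFFFF


print(f"{ld_shift(0xF794, 8, sxm=True):010X}")    # FFFFF79400 -> G=FF, ACC=FFF7 9400
print(f"{ld_shift(0xF794, 8, sxm=False):010X}")   # 0000F79400 -> G=00, ACC=00F7 9400
```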
Carry Bit (C)

• Carry is:
  - Used with unsigned numbers to indicate an overflow/underflow condition
  - Set or cleared with each calculation - it is not latched
  - Optimal for extending 32-bit accumulators to larger-word-size calculations
• Example - 64-bit addition:

[Figure: a 64-bit value held as four 16-bit words XD XC XB XA, added through accumulator B with the carry bit C]
Special Load, Add, Subtract

Carry & Borrow
      ADDC   Smem,src       src = src + Smem + C
      SUBB   Smem,src       src = src - Smem - (NOT C)

Sign-suppressed Math
      ADDS   Smem,src       src = src + u(Smem)
      SUBS   Smem,src       src = src - u(Smem)

Load unsigned
      LDU    Smem,dst       dst = u(Smem)
64-bit Add & Subtract Code

Example: z64 = w64 + x64 - y64

      w3 w2 w1 w0
      x3 x2 x1 x0
      y3 y2 y1 y0
      z3 z2 z1 z0

      DLD    @w1,A          ; A = w1+w0
      DADD   @x1,A          ; A += x1+x0
      DLD    @w3,B          ; B = w3+w2
      ADDC   @x2,B          ; B += x2+C
      ADD    @x3,16,B       ; B += x3
      DSUB   @y1,A          ; A -= y1+y0
      DST    A,@z1          ; z1 = w1+w0+x1+x0-y1-y0
      SUBB   @y2,B          ; B -= y2+C'
      SUB    @y3,16,B       ; B -= y3
      DST    B,@z3          ; z3 = w3+w2+x3+x2+C-y3-y2-C'
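The carry and borrow chaining in the routine above can be modelled on a host machine. The Python sketch below is an illustration only: it works in 32-bit halves rather than the individual 16-bit words, the function names are hypothetical, and it assumes the carry (or borrow) from the low half is simply added to (or subtracted from) the high half.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


def add64(w: int, x: int) -> int:
    """64-bit add built from two 32-bit adds linked by the carry."""
    low = (w & MASK32) + (x & MASK32)
    carry = low >> 32                              # 1 if the low half overflowed
    high = ((w >> 32) + (x >> 32) + carry) & MASK32
    return (high << 32) | (low & MASK32)


def sub64(a: int, y: int) -> int:
    """64-bit subtract built from two 32-bit subtracts linked by the borrow."""
    low = (a & MASK32) - (y & MASK32)
    borrow = 1 if low < 0 else 0                   # borrow is the inverse of carry
    high = ((a >> 32) - (y >> 32) - borrow) & MASK32
    return (high << 32) | (low & MASK32)


w, x, y = 0x00000001FFFFFFFF, 0x0000000000000001, 0x0000000000000002
z = sub64(add64(w, x), y)
print(f"{z:016X}")   # 00000001FFFFFFFE
```

The example values are chosen so that the addition carries out of the low half and the subtraction borrows back from the high half.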
Long Multiplication

                      X1   X0          S  U
                  x   Y1   Y0          S  U
      ------------------------
                      X0 * Y0          U * U
                 Y1 * X0               S * U
                 X1 * Y0               S * U
            Y1 * X1                    S * S
      ------------------------
            W3   W2   W1   W0          S  U  U  U

      MACSU  Xmem,Ymem,src      src = src + u(Xmem)*Ymem
      MPYU   Smem,dst           dst = u(TREG)*u(Smem)
Long Multiply Routine

      STM    #X0,AR2
      STM    #Y0,AR3
      LD     *AR2,T             ; T = x0
      MPYU   *AR3+,A            ; A = ux0*uy0
      STL    A,@W0              ; w0 = low word of ux0*uy0
      LD     A,-16,A            ; A = A>>16
      MACSU  *AR2+,*AR3-,A      ; A += ux0*y1
      MACSU  *AR3+,*AR2,A       ; A += uy0*x1
      STL    A,@W1              ; w1 = A(low)
      LD     A,-16,A            ; A = A>>16
      MAC    *AR2,*AR3,A        ; A += x1*y1
      STL    A,@W2              ; w2 = A(low)
      STH    A,@W3              ; w3 = A(high)
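The four partial products and the two shift-by-16 steps can be traced with a host-side model. This Python sketch is an illustration of the scheme, not of the device: the function name is hypothetical, the operands are assumed to be signed 32-bit values, the low words are treated as unsigned and the high words as signed, and Python integers stand in for the accumulator.

```python
def long_multiply(x: int, y: int) -> int:
    """32x32 signed multiply from 16-bit partial products; returns the 64-bit image."""
    x0, x1 = x & 0xFFFF, x >> 16      # low word unsigned, high word signed
    y0, y1 = y & 0xFFFF, y >> 16

    acc = x0 * y0                     # U * U
    w0 = acc & 0xFFFF
    acc >>= 16                        # carry the upper part into the next column

    acc += x0 * y1                    # U * S
    acc += y0 * x1                    # U * S
    w1 = acc & 0xFFFF
    acc >>= 16

    acc += x1 * y1                    # S * S
    w2 = acc & 0xFFFF
    w3 = (acc >> 16) & 0xFFFF

    return (w3 << 48) | (w2 << 32) | (w1 << 16) | w0


x, y = -123456789, 987654321
print(long_multiply(x, y) == (x * y) & 0xFFFFFFFFFFFFFFFF)   # True
```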
Exponent Encoder

• One cycle exponent ([-8, +31] range) computation
• Result in T register as 2's complement value

[Figure: accumulators A and B feed the exponent encoder, whose 6-bit result is written to T; exponent scale marked at -8, 0, 16, 31]

      exp    A      ; 1 cycle for exp
      norm   A      ; 1 cycle normalize
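The exponent range on the slide can be explored with a host-side model. The Python sketch below rests on an assumption that should be checked against the instruction set reference: that the exponent is the number of redundant sign bits in the 40-bit accumulator minus 8, with 0 returned for a zero accumulator. That reading is consistent with the [-8, +31] range shown above. The function name `exp40` is hypothetical.

```python
def exp40(acc: int) -> int:
    """Exponent of a 40-bit accumulator value (assumed: redundant sign bits - 8)."""
    acc &= 0xFFFFFFFFFF
    if acc == 0:
        return 0
    sign = (acc >> 39) & 1
    redundant = 0
    for bit in range(38, -1, -1):        # scan down from the bit below the sign
        if (acc >> bit) & 1 != sign:
            break
        redundant += 1
    return redundant - 8


print(exp40(0x7FFFFFFFFF))   # -8: guard bits are in use
print(exp40(0x0040000000))   # 0: already normalized to 32 bits
print(exp40(0x0000001000))   # 18
print(exp40(0xFFFFFFFFFF))   # 31

# Shifting left by the exponent normalizes the value, as the norm step would.
print(f"{0x1000 << exp40(0x1000):08X}")   # 40000000
```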
Floating Point Usage

Full Floating Point (2*N RAM & Cycles)

      e1  m1
      e2  m2
      e3  m3

      LD     e1,T
      LD     m1,T,A
      LD     e2,T
      ADD    m2,T,A
      LD     e3,T
      ADD    m3,T,A

Block Floating Point (N+1 RAM & Cycles)

      e   m1
          m2
          m3

      LD     e,T
      LD     m1,T,A
      ADD    m2,T,A
      ADD    m3,T,A
      …
Exercise 7c: Numerical Issues

1. How is division performed on the 54x?
2. How is rounding performed on the 54x?
3. How are fractions represented in the assembler?
4. What benefit does the carry bit offer? What instructions employ/affect the carry bit?
5. When are unsigned operations useful?
6. Does the 54x offer any form of floating point operation?
Bitfield Test & Bit Extraction

      CMPM   Smem,#K        TC=1 if Smem=K
      BITF   Smem,#K        TC=0 if Smem&K=0
      BIT    Xmem,bit       TC=Xmem(15-bit)
      BITT   Smem           TC=Smem(15-T(3-0))

[Figure: bit n of the 16-bit memory word (bits 15 to 0), selected by "bit", is copied to TC]

      BIT    *AR2,5
      BC     true,TC

      LD     @bit,T
      BITT   @x
      BC     false,NTC
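The TC results in the table can be modelled directly from the expressions given there. The Python sketch below is a host-side illustration; the function names mirror the mnemonics for readability but are ordinary hypothetical helpers, and memory operands are plain 16-bit integers.

```python
def cmpm(smem: int, k: int) -> int:
    """TC = 1 if the memory word equals the constant."""
    return 1 if smem == k else 0


def bitf(smem: int, k: int) -> int:
    """TC = 0 if no bit selected by the mask is set, else 1."""
    return 0 if (smem & k) == 0 else 1


def bit(xmem: int, bit_code: int) -> int:
    """TC = bit number (15 - bit_code) of the memory word."""
    return (xmem >> (15 - bit_code)) & 1


def bitt(smem: int, t: int) -> int:
    """TC = bit number (15 - T(3-0)) of the memory word."""
    return (smem >> (15 - (t & 0xF))) & 1


print(bit(0x0400, 5))        # 1: bit code 5 selects bit number 10
print(bit(0x0400, 10))       # 0: bit code 10 selects bit number 5
print(bitf(0x00F0, 0x000F))  # 0
print(cmpm(0x1234, 0x1234))  # 1
```

Note the reversed numbering: the bit code counts from the MSB, so the code 5 used in the slide's `BIT *AR2,5` tests bit number 10.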
Boolean Operations

AND / OR / XOR (1 cycle)
      Smem,src                  src = src (op) Smem
      src,[SHIFT],[dst]         dst = dst (op) src << SHIFT

AND / OR / XOR (2 cycles)
      #K,[shft],src,[dst]       dst = src (op) #K << shft
      #K,16,src,[dst]           dst = src (op) #K << 16

ANDM / ORM / XORM / ADDM (2 cycles)
      #K,Smem                   Smem = Smem (op) #K
7 - 38
Shift and Rotate Operations
SFTA  src,SHIFT,[dst]    left:   C <- [39 ... 32 | 31 ... 0] <- 0
                         right:  Sx -> [39 ... 32 | 31 ... 0] -> C
SFTL  src,SHIFT,[dst]    left:   C <- [0 ... 0 | 31 ... 0] <- 0
                         right:  0 -> [0 ... 0 | 31 ... 0] -> C
ROLTC src                C <- [0 ... 0 | 31 ... 0] <- TC
ROL   src                C <- [0 ... 0 | 31 ... 0] <- C
ROR   src                C -> [0 ... 0 | 31 ... 0] -> C
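To make the shift/rotate distinction concrete, the following Python sketch models a one-bit rotate through carry on the low 32 bits of an accumulator. It assumes, as the diagrams above indicate, that the guard bits are cleared and that the carry bit C takes part in the rotation; the function names are hypothetical.

```python
MASK32 = 0xFFFFFFFF


def rol(acc: int, carry: int) -> tuple[int, int]:
    """Rotate left through carry: C enters bit 0, bit 31 leaves into C."""
    new_carry = (acc >> 31) & 1
    new_acc = ((acc << 1) | (carry & 1)) & MASK32
    return new_acc, new_carry


def ror(acc: int, carry: int) -> tuple[int, int]:
    """Rotate right through carry: C enters bit 31, bit 0 leaves into C."""
    new_carry = acc & 1
    new_acc = ((acc & MASK32) >> 1) | ((carry & 1) << 31)
    return new_acc, new_carry


def shift_left_logical(acc: int) -> tuple[int, int]:
    """One-bit logical left shift: a zero enters bit 0, bit 31 leaves into C."""
    return (acc << 1) & MASK32, (acc >> 31) & 1


if __name__ == "__main__":
    acc, c = rol(0x80000001, 0)
    print(hex(acc), c)
```

A rotate keeps every bit in the 33-bit ring formed by the accumulator and C, so 33 rotations restore the starting value. A shift discards the bits that leave through C and fills with zeros (or with the sign, for an arithmetic right shift).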
Module 7
DSP54x - Numerical Issues 7 - 21
7 - 39
Shifter Hardware
[Figure: barrel shifter block diagram. Inputs are accumulators A and B and the C and D buses, with sign control from SXM. Outputs go to the ALU and, through an MSW/LSW write select, to the E bus. C and TC are attached to the shifter. Bus widths of 16, 32, and 40 bits are labeled. Barrel shifter range: (-16, +31).]
Shift count sources:
  T(5-0)      (-16, +31) Range
  ASM(4-0)    (-16, +15) Range
  Constant    (-16, +15) Range or (0, +15) Range
7 - 40
Other Numerical Operations
ABS   src,[dst]    dst = |src|
NEG   src,[dst]    dst = -src
CMPL  src,[dst]    dst = ~src (one's complement)
Module 7
7 - 22 DSP54x - Numerical Issues
7 - 41
Exercise 7d : Boolean Operations
1. How are bits tested on the 54x? What's unusual about it?
2. What boolean operations are present on the 54x?
3. Must boolean functions operate on the accumulator?
4. What is the difference between shift and rotate?
5. What is the difference between SFTA and SFTL?
6. What is the difference between NEG and CMPL?
Module 7
DSP54x - Numerical Issues 7 - 23
7 - 43
LAB7.ASM : Solution
.def start,table,y
.bss x,4
.bss a,4
.bss y,1
.data
table: .word 32768*1/10
.word 32768*2/10
.word 32768*3/10
.word 32768*4/10
.word 32768*8/10
.word 32768*6/10
.word 32768*4/10
.word 32768*2/10
.text
NOP
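start: STM #x,AR2 ;AR2 -> start of x
RPT #7 ;RPT repeats the next instruction N+1 times, so 8 passes
MVPD table,*AR2+ ;copy the 8 table words into x and a
CALL sop
done: B done
sop: STM #x,AR2
STM #a,AR3
RSBX OVM
SSBX SXM
SSBX FRCT
RPTZ A,#3 ;clear A, then 4 MAC passes
MAC *AR2+,*AR3+,A
STH A,*(y)
RET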
Module 7
7 - 24 DSP54x - Numerical Issues
DSP54x - Fundamental DSP Applications 8 - 1
Fundamental DSP Applications
Learning Objectives
8 - 2
Objectives
• Describe how FIR/IIR filters operate
• Implement delay lines in two ways
• Write code for FIR/IIR filters on the 54x
• Translate signal flow diagrams to 54x code
• Employ techniques to avoid IIR instability
• Select the best filter type for a given need
8 - 2 DSP54x - Fundamental DSP Applications
Module 8
DSP54x - Fundamental DSP Applications 8 - 3
Module 8
8 - 3
Finite Impulse Response (FIR) Filter
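[Figure: 3-tap FIR signal flow. Input xin feeds a delay line X0, X1, X2 (two z^-1 delays); the taps are multiplied by a0, a1, a2 and the products are summed to give yout.]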
y(n) = a0 × x(n) + a1 × x(n–1) + a2 × x(n–2)
LD x2, T
MAC a2, A
Circular Buffer or Linear Buffer
8 - 4
I/O Memory Read & Write
PORTR  PA,Smem    PA -> Smem
PORTW  Smem,PA    PA <- Smem
• Port operations access I/O devices
• Requires two words & two cycles
• I/O range can be up to 64K locations
• There are no I/O resources on-chip
Module 8
8 - 4 DSP54x - Fundamental DSP Applications
8 - 5
Linear Buffer (Delay Line)
[Figure: six-entry linear buffer holding JUN (newest, top) down to JAN (oldest, bottom), shown before and after two new samples arrive.]
1. Access from the oldest to newest sample (*ARn-), moving each sample down one location with DELAY *AR2-.
2. Input the newest sample on the top of the buffer (PORTR). JUL enters and JAN is dropped; on the next pass AUG enters and FEB is dropped.
Note: DELAY operates in DARAM only!
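The data movement in a linear delay line can be sketched in a few lines of Python. This is an illustrative model only: the list index stands in for the memory address, with index 0 as the top (newest) entry, and `delay_line_insert` is a hypothetical name.

```python
def delay_line_insert(buf: list, new_sample) -> list:
    """Shift every sample one place toward the old end, then store the new one.

    buf[0] is the newest sample and buf[-1] the oldest. The loop walks from
    the oldest entry toward the newest, as the slide prescribes, so no sample
    is overwritten before it has been copied.
    """
    for i in range(len(buf) - 1, 0, -1):
        buf[i] = buf[i - 1]          # same effect as one DELAY per location
    buf[0] = new_sample              # same effect as PORTR to the top entry
    return buf


if __name__ == "__main__":
    months = ["JUN", "MAY", "APR", "MAR", "FEB", "JAN"]
    print(delay_line_insert(months, "JUL"))
```

Every new sample costs one move per buffer entry, which is the overhead the circular buffer on the next slide avoids.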
8 - 6
Six-Level Circular Buffer
[Figure: six-entry circular buffer between "start" and "end", shown at three instants. It first holds JAN through JUN; JUL then overwrites JAN, and AUG overwrites FEB. ARn marks the current access point and advances by one entry each time; no other samples are moved.]
Module 8
DSP54x - Fundamental DSP Applications 8 - 5
8 - 7
Circular Addressing Hardware
[Figure: circular buffer range, from Element 0 to Element N-1.]
Top of Buffer        A ... A  0 ... 0
End of Buffer + 1    A ... A  BK
(ARn) Index          A ... A  x ... x
BK = Length "n" of Delay Line
8 - 8
Circular Addressing Code
FIR.ASM
X0      .usect "D_LINE",32
        .text
        STM   #32,BK          ;BK = size of circular buf
        . . .
        *AR3+%                ;circular addressing
LINK.CMD
SECTIONS
{
    D_LINE: align(64) { } > RAM PAGE 1
    . . .
}
Circular buffers of length n must be aligned on 2^K > n boundaries
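The alignment rule and the wrap-around behavior can be checked with a small Python model. This is a sketch under stated assumptions: the low K bits of ARn act as the index, the remaining bits are the buffer base, and the step size never exceeds BK. The names `alignment_for` and `circ_step` are hypothetical.

```python
def alignment_for(n: int) -> int:
    """Smallest power of two strictly greater than the buffer length n."""
    size = 1
    while size <= n:
        size <<= 1
    return size


def circ_step(ar: int, step: int, bk: int) -> int:
    """Advance address ar by step inside a circular buffer of length bk."""
    if abs(step) > bk:
        raise ValueError("this model assumes |step| <= BK")
    size = alignment_for(bk)
    base = ar & ~(size - 1)       # upper address bits: top of buffer
    index = (ar & (size - 1)) + step
    if index >= bk:               # ran past the end: wrap to the top
        index -= bk
    elif index < 0:               # ran past the top: wrap to the end
        index += bk
    return base | index


if __name__ == "__main__":
    # A 32-word buffer needs 64-word alignment, matching align(64) above.
    print(alignment_for(32), hex(circ_step(0x0440 + 31, 1, 32)))
```

Because the base is recovered by masking the address, a buffer that is not aligned to the 2^K boundary would wrap to the wrong location, which is why the linker command file carries the `align` directive.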
Module 8
8 - 6 DSP54x - Fundamental DSP Applications
8 - 9
Circular Addressing Caveats
• Allows all AR modifications:
  - increment or decrement
  - indexing (1, K, AR0)
• Is invoked on any AR with the modulo (%) operator
• Implements true modulo addressing (pointer will never exit array even if incremented past end of array)
• Alignment is to next larger binary boundary
• Alignment can leave gaps in RAM
• Linker will attempt to backfill unused RAM if possible on a whole-file basis
• Recommended: Link largest blocks first. Why?
8 - 10
FIR Filter
[Figure: 5-tap FIR signal flow. Input xin feeds a delay line X0 ... X4 (four z^-1 delays); the taps are multiplied by a0 ... a4 and the products are summed to give yout.]
y(n) = a0 × x(n) + a1 × x(n-1) + a2 × x(n-2) + a3 × x(n-3) + a4 × x(n-4)
or
Y0 = a0 × X0 + a1 × X1 + a2 × X2 + a3 × X3 + a4 × X4
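As a cross-check of the equation above, here is a minimal Python sketch of one FIR output sample computed over a linear delay line. It is an illustration in floating point rather than Q15, and `fir_step` is a hypothetical name.

```python
def fir_step(delay: list, coeffs: list, x_in: float) -> float:
    """Shift x_in into the delay line and return y(n) = sum(a_k * x(n-k)).

    delay[0] holds x(n) (X0) and delay[-1] the oldest sample.
    """
    delay.insert(0, x_in)   # new sample becomes X0
    delay.pop()             # oldest sample falls off the end
    return sum(a * x for a, x in zip(coeffs, delay))


if __name__ == "__main__":
    coeffs = [0.1, 0.2, 0.3, 0.2, 0.1]
    delay = [0.0] * len(coeffs)
    # Feeding a unit impulse returns the coefficients one by one.
    print([fir_step(delay, coeffs, x) for x in (1.0, 0.0, 0.0, 0.0, 0.0)])
```

The impulse response of an FIR filter is just its coefficient list, which makes an impulse a convenient first test for any of the assembly versions that follow.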
Module 8
DSP54x - Fundamental DSP Applications 8 - 7
8 - 11
FIR Filter - Linear Buffer
Direct Addressing
        LD    #X0,DP
        SSBX  FRCT
LOOP    LD    X4,T
        MPY   A4,A
        LTD   X3
        MAC   A3,A
        LTD   X2
        MAC   A2,A
        LTD   X1
        MAC   A1,A
        LTD   X0
        MAC   A0,A
        STH   A,X0
        PORTW X0,PA0
        BD    LOOP
        PORTR PA1,X0
Indirect Addressing
        STM   #A+3,AR2
        STM   #X+3,AR1
        STM   #4,AR0
        SSBX  FRCT
LOOP    LD    *AR1-,T
        MPY   *AR2-,A
        LTD   *AR1-
        MAC   *AR2-,A
        LTD   *AR1-
        MAC   *AR2-,A
        LTD   *AR1-
        MAC   *AR2-,A
        LTD   *AR1
        MAC   *AR2+0,A
        STH   A,*AR1
        PORTW *AR1,PA0
        BD    LOOP
        PORTR PA1,*AR1+0
Note: location X0 used as temporary output location
8 - 12
FIR Filter - Dual Op w. Delay
INIT    STM   #X+5,AR2          ;point to last datum
        STM   #4,AR0            ;ptr reset value
        SSBX  FRCT              ;fractional numbers
FIR     RPTZ  A,#4              ;5 iterations
        MACD  *AR2-,COEF,A      ;Mpy, Acc, Delay
        STH   A,*AR2            ;X=result
        PORTW *AR2+,PA0         ;X to DAC, inc to X0
        BD    FIR               ;loop (soon)
        PORTR PA1,*AR2+0        ;get new X0, inc to X4
        .data
COEF    .word A4,A3,A2,A1,A0    ;coeffs: old to new
X       .usect "daram",1+5+1    ;X, d.line, 1st delay
Module 8
8 - 8 DSP54x - Fundamental DSP Applications
8 - 13
FIR Filter - Dual Op w. Circ. Buffer
        STM   #A4,AR3              ;coeff ptr
        STM   #X4,AR2              ;circ buf ptr
        STM   #-1,AR0              ;dual op dec
        STM   #5,BK                ;circ buf size
        SSBX  FRCT                 ;fractions
FIR     RPTZ  A,#4                 ;5 iterations
        MAC   *AR2+0%,*AR3+0%,A    ;SOP & circ
        STH   A,*AR2               ;result to old x
        PORTW *AR2,PA0             ;result to DAC
        BD    FIR                  ;branch soon
        PORTR PA1,*AR2+0%          ;get new data, inc to old
Note: in dual op mode *ARn-% is indirectly supported via *ARn+0% by setting AR0 = -1
8 - 14
Second-Order IIR Filter
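[Figure: second-order IIR signal flow. The input x(n) is summed with the feedback taps -A1 × X1 and -A2 × X2 to form w(n), which is stored as X0. The delay line X0, X1, X2 (two z^-1 delays) feeds the forward taps B0, B1, B2, whose products are summed to give y(n).]
Feedback Path - Poles          Forward Path - Zeros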
Module 8
DSP54x - Fundamental DSP Applications 8 - 9
8 - 15
IIR Filter - Single Operand
        LD    #x0,DP
        SSBX  FRCT
IIR:    PORTR 0000,x0         ;x0 as Input Handler
        LD    x0,16,A         ;Feedback Section
        LD    x1,T
        MAC   a1,A
        LD    x2,T
        MAC   a2,A
        STH   A,x0            ;x0 as Delay Element
        MPY   b2,A            ;Forward Section
        LTD   x1
        MAC   b1,A
        LTD   x0
        MAC   b0,A
        STH   A,x0            ;x0 as Output Handler
        BD    IIR
        PORTW x0,0001
8 - 16
IIR Filter - Dual Operand
        SSBX  FRCT
        STM   #X2,AR3
        STM   #Coeff+4,AR4
        MVMM  AR4,AR1
        STM   #6,BK
        STM   #-1,AR0
IIR:    PORTR 0001h,*AR3
        LD    *AR3,16,A              ;Feedback Path
        MAC   *AR3+0%,*AR4-,A
        MAC   *AR3+0%,*AR4-,A
        STH   A,*AR3
        MPY   *AR3+0%,*AR4-,A        ;Forward Path
        MAC   *AR3+0%,*AR4-,A
        MAC   *AR3,*AR4-,A
        STH   A,*AR3
        MVMM  AR1,AR4
        BD    IIR
        PORTW *AR3,0002h
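The two-stage structure of these listings (feedback section first, then forward section, sharing one delay line) can be sketched in Python as follows. This is a floating-point illustration, not a translation of the assembly. The parameters `fb1` and `fb2` are assumed to hold the values actually multiplied in the feedback path (that is, -A1 and -A2 from the flow diagram), and all names are hypothetical.

```python
def iir2_step(state: list, x: float, fb1: float, fb2: float,
              b0: float, b1: float, b2: float) -> float:
    """One sample of a second-order IIR with a shared delay line.

    state = [w(n-1), w(n-2)] and is updated in place.
    """
    w1, w2 = state
    w0 = x + fb1 * w1 + fb2 * w2          # feedback section (poles)
    y = b0 * w0 + b1 * w1 + b2 * w2       # forward section (zeros)
    state[0], state[1] = w0, w1           # age the delay line
    return y


if __name__ == "__main__":
    state = [0.0, 0.0]
    # One pole at z = 0.5 and no zeros: the impulse response halves each step.
    print([iir2_step(state, x, 0.5, 0.0, 1.0, 0.0, 0.0)
           for x in (1.0, 0.0, 0.0, 0.0)])
```

Only two state words are needed because both sections read the same delayed values of w(n).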
Module 8
8 - 10 DSP54x - Fundamental DSP Applications
8 - 17
Classical Form IIR
[Figure: classical-form second-order section. x(n) passes through a forward (zero) section with two z^-1 delays and coefficients a11, a12, then through a feedback (pole) section with two z^-1 delays and coefficients b11, b12, giving y(n).]
• Gain (pole, feedback section) after attenuation (zero, forward section)
• Less need for input scaling
• More robust
• Alternate coding model
8 - 18
Classical IIR Code
        STM   #X,AR2
        STM   #A,AR3
        STM   #Y,AR4
        STM   #B,AR5
        STM   #3,BK
        STM   #-1,AR0
IIR:    PORTR 0001h,*AR2             ;Even Iteration
        MPY   *AR2+0%,*AR3+0%,A
        MAC   *AR2+0%,*AR3+0%,A
        MAC   *AR2,*AR4+0%,A
        MAS   *AR4+,*AR5+,A
        MAS   *AR4,*AR5-,A
        STH   A,*AR4
        PORTW *AR4,0002h
        ...                          ;Odd Iteration
        MAS   *AR4+,*AR5+,A
        MAS   *AR3,*AR5-,A
        STH   A,*AR4
        BD    IIR
        PORTW *AR3,0002h
Module 8
DSP54x - Fundamental DSP Applications 8 - 11
8 - 19
IIR Solutions Comparison
Parameter      1 Operand       2 Operand      Classical
Cycle Count    12(M) + 4(P)    9(M) + 4(P)    6(M) + 4(P)
Code Size      20              24             34
Reg's Used     0               3 + BK         5 + BK
8 - 20
IIR Implementation Issues
• Break down high-order systems
• Scale down coefficients that are ≥ 1
• Input scaling
• Optimal Topology
Module 8
8 - 12 DSP54x - Fundamental DSP Applications
8 - 21
Break Down High-Order IIR
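[Figure: a high-order IIR realized as a cascade of two second-order sections. The first section (coefficients a11, a12, b11, b12) takes x(n); the second section (coefficients a21, a22, b21, b22) produces y(n). Internal nodes d(n) and w(n) are labeled, and each section has two z^-1 delays.]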
8 - 22
Scale Down Coefficient ≥1
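[Figure: the second-order IIR flow diagram (delay line X0, X1, X2; forward taps B0, B1, B2; feedback tap -A2), with the single feedback tap -A1 replaced by two parallel taps of -(A1)/2 each, so that every stored coefficient has a magnitude below 1.]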
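A short Python sketch shows why the tap is split. A Q15 word can only hold values in [-1, 1), so a coefficient such as 1.975 (an illustrative value) cannot be stored directly, but half of it can be stored and applied twice. The helper names and the truncating multiply are assumptions made for this illustration.

```python
def to_q15(value: float) -> int:
    """Convert a fraction in [-1, 1) to a Q15 integer; reject anything else."""
    if not -1.0 <= value < 1.0:
        raise ValueError(f"{value} does not fit in Q15")
    return int(round(value * 32768))


def q15_mul(a: int, b: int) -> int:
    """Q15 x Q15 -> Q15 product, keeping the high part (truncating)."""
    return (a * b) >> 15


def scaled_tap(coeff: float, sample: float) -> float:
    """Apply a coefficient with 1 <= |coeff| < 2 as two half-size taps."""
    half = to_q15(coeff / 2)                 # now representable in Q15
    y = q15_mul(half, to_q15(sample))        # first -(A1)/2 tap
    y += q15_mul(half, to_q15(sample))       # second -(A1)/2 tap
    return y / 32768


if __name__ == "__main__":
    print(scaled_tap(1.975, 0.14), 1.975 * 0.14)
```

The sum of the two half-size products matches the full-size product to within Q15 rounding, at the cost of one extra multiply-accumulate. The sum itself may exceed the Q15 range, which is what the accumulator guard bits are for.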
Module 8
DSP54x - Fundamental DSP Applications 8 - 13
8 - 23
Input Scaling
        PORTR 0001h,Xin
        LD    Xin,16,A          ;Q31 format
        PORTR 0001h,Xin
        LD    Xin,16-3,A        ;Divide by 8
8 - 24
Optimal IIR Topology
[Figure: cascade of three second-order sections between x(n) and y(n), with coefficients (a11, a12, b11, b12), (a21, a22, b21, b22), and (a31, a32, b31, b32). Each section has two z^-1 delays, and the zero and pole stages alternate along the chain.]
Best blend of efficiency and performance by preceding a gain stage (pole) with a zero stage
Module 8
8 - 14 DSP54x - Fundamental DSP Applications
8 - 25
IIR vs. FIR Filters
• FIR:
  - All zero implementation
  - Unconditionally stable
  - Linear Phase possible
  - Best for phase encoded data
• IIR:
  - Pole & zero implementation
  - Stable if no errors made
  - Much better frequency performance
  - Best for frequency discrimination
8 - 26
Lab 8 : Recursive Filter
Implement the signal flow diagram on the 'C54x.
[Figure: recursive filter. Y0 = A × Y1 + B × Y2, where Y1 and Y2 are Y0 delayed by one and two samples (one z^-1 each). Y0 is written to I/O Port 0.]
A    = 1.975
B    = -1.000
y(0) = 0.000
y(1) = 0.1400
y(2) = ?
Notes: Y0 is the current output based on the two prior outputs, Y1 and Y2.
Initial conditions y(0) and y(1) are given, so the '54x will begin processing at t=2.
Since location Y0 is not an input value, results can be directly written to Y1.
Module 8
DSP54x - Fundamental DSP Applications 8 - 15
8 - 27
Lab 8 : Procedure
• Create a new assembly file to:
  1. Allocate RAM for coefficients and delay line
  2. Establish a ROM table for coefficients and initial conditions
  3. Initialize ROM into RAM
  4. Initialize processor modes
  5. Write code to implement signal flow diagram in infinite loop
  6. Build reset vector
• Assemble the program
• Link the program using an appropriate linker command file
• Run the program on the simulator through 40 loops
• Exit the simulator and view your results by typing: PLOT OUT.DAT
• Verify the results with the instructor
• Time permitting, consider optimizing your code
8 - 28
Lab 8: Equations
y(n) = A*y(n-1) + B*y(n-2)
Y(z) = A*z^-1*Y(z) + B*z^-2*Y(z)
Y(z)*[1 - A*z^-1 - B*z^-2] = 0
solving for roots:
z = [ A +/- (A^2 + 4B)^(1/2) ] / 2
if z is complex, then A^2 + 4B < 0, so
z = [ A +/- j*(-A^2 - 4B)^(1/2) ] / 2
|z| = [ A^2/4 + (-A^2 - 4B)/4 ]^(1/2)
|z| = [ -B ]^(1/2)
Therefore, if B = -1,
|z| = 1
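The derivation above can be checked numerically. The following Python sketch computes the roots for the lab's A and B and runs the recursion in floating point; it is only a reference model for comparing against the simulator output, and the function names are hypothetical.

```python
import cmath


def poles(A: float, B: float) -> tuple[complex, complex]:
    """Roots of z^2 - A*z - B = 0, i.e. z = [A +/- sqrt(A^2 + 4B)] / 2."""
    d = cmath.sqrt(A * A + 4 * B)
    return (A + d) / 2, (A - d) / 2


def recursive_filter(A: float, B: float, y0: float, y1: float,
                     count: int) -> list[float]:
    """Generate y(0) .. y(count-1) from y(n) = A*y(n-1) + B*y(n-2)."""
    ys = [y0, y1]
    while len(ys) < count:
        ys.append(A * ys[-1] + B * ys[-2])
    return ys[:count]


if __name__ == "__main__":
    z1, z2 = poles(1.975, -1.0)
    print(abs(z1), abs(z2))
    print(recursive_filter(1.975, -1.0, 0.0, 0.14, 5))
```

With both poles on the unit circle the output neither grows nor decays, so the loop behaves as a sine-wave generator whose amplitude is set by the initial conditions.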
Module 8
8 - 16 DSP54x - Fundamental DSP Applications
8 - 30
Lab 8: Solution - Parts 1 & 2
**** 1. Allocate RAM for coefficients and delay line
        .bss  a,4,1                 ; Alloc RAM in 1 page
b       .set  a+1
y1      .set  a+2
y2      .set  a+3
**** 2. Establish a ROM table for coeff's and initial conditions
        .data
TBL     .word 32768*1975/2000       ; A/2 (q15)
        .word 32768*(-1)            ; B   (q15)
        .word 32768*14/100          ; Y1  (q15)
        .word 32768*0               ; Y2  (q15)
SIZE    .set  $-TBL
8 - 31
Lab 8: Solution - Parts 3 & 4
**** 3. Initialize ROM into RAM
        .text                       ; Begin code space
start   STM   #a,AR7                ; Pointer to RAM array
        RPT   #SIZE-1               ; Loop # of TBL elements
        MVPD  TBL,*AR7+             ; Copy ROMs to RAMs
**** 4. Initialize processor modes
        SSBX  FRCT                  ; For Q15*Q15 -> Q31
        LD    #a,DP                 ; Set page for direct addressing
        RSBX  OVM                   ; Allow use of guard bits
        SSBX  SXM                   ; Two's comp numbers
Module 8
DSP54x - Fundamental DSP Applications 8 - 17
8 - 32
Lab 8: Solution - Parts 5 & 6
**** 5. Write code to implement signal flow diagram ...
SINE    LD    y2,T                  ; T = y2
        MPY   b,A                   ; A = b*y2
        LTD   y1                    ; T = y1 , y1 -> y2
        MAC   a,A                   ; A = (a*y1)/2 + b*y2
        MAC   a,A                   ; A = a*y1 + b*y2
        STH   A,y1                  ; y0 -> y1
        PORTW y1,0000               ; write to out.dat file
        B     SINE                  ; loop ...
**** 6. Build reset vector
        .include VECTOR6.SSM
8 - 33
SIMINIT.CMD
ma 0x0000,0, 0x0100, R|W|EX     Small Ext'l Pgm Mem at 0
ma 0x9000,0, 0x1000, R|W|EX     4k of Ext'l Pgm Mem
ma 0xe000,0, 0x1000, R|W|EX     4k of Ext'l Pgm Mem
ma 0xff80,0, 0x0080, R|W|EX     Vector Area, Pgm Mem
ma 0x0000,1, 0x0060, R|W        MMRs in Data Mem
ma 0x0060,1, 0x0020, R|W        SPRAM, Data Mem
ma 0x0080,1, 0x0380, R|W        RAM 0, Data Mem
ma 0x0400,1, 0x0400, R|W        RAM 1, Data Mem
ma 0x1400,1, 0x0400, R|W|EX     Small X Data Mem at 1400
ma 0x8000,1, 0x1000, R|W|EX     4k of Extl Mem Org 8000
ma 0x0,2,1,oport                Output Port 0
mc 0x0,2,1,out.dat,W
Module 8
8 - 18 DSP54x - Fundamental DSP Applications
DSP54x - Algorithms 9 - 1
Algorithms
Learning Objectives
9 - 2
Learning Objectives
• List the advanced C54x instructions
• Associate the advanced mnemonic with the algorithmic need
• Identify the architectural components that provide advanced performance
• Experiment with some of these instructions on the simulator
9 - 2 DSP54x - Algorithms
Module 9
DSP54x - Algorithms 9 - 3
Module 9
9 - 3
Advanced Applications
• FIRS                   Symmetrical FIR filter
• LMS                    Adaptive filtering
• POLY                   Polynomial evaluation
• STRCD, SACCD, SRCCD    Code book search
• DADST, DSADT, CMPS     Viterbi algorithm
This section: FIRS (symmetrical FIR filter)
9 - 4
Symmetric FIR Filter
[Figure: 8-sample data buffer split into a "New" half, x(8) ... x(5), and an "Old" half, x(4) ... x(1). The coefficient set a0 ... a3 is applied in mirror image: a0 a1 a2 a3 to the New half and a3 a2 a1 a0 to the Old half.]
Symmetric FIR filters are commonly used in applications where phase distortion may degrade the signal quality, e.g. modems.
The general form of this FIR equation is written
Y(n) = a0 × x(8) + a1 × x(7) + a2 × x(6) + a3 × x(5) + a3 × x(4) + a2 × x(3) + a1 × x(2) + a0 × x(1)
using 8 Mult's, 7 Adds
In the specific case of a Symmetric FIR we can write
Y(n) = a0 × (x(8)+x(1)) + a1 × (x(7)+x(2)) + a2 × (x(6)+x(3)) + a3 × (x(5)+x(4))
using 4 Mult's, 7 Adds
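The equivalence of the two forms is easy to confirm with a short Python sketch. Here `x[0]` stands for x(1) and `x[7]` for x(8); both function names are hypothetical, and integers are used so the comparison is exact.

```python
def fir_general(x: list, a: list) -> int:
    """8 multiplies: every tap gets its own product (taps a0..a3, a3..a0)."""
    taps = a + a[::-1]                  # a0 a1 a2 a3 a3 a2 a1 a0
    newest_first = x[::-1]              # x(8) ... x(1)
    return sum(t * s for t, s in zip(taps, newest_first))


def fir_symmetric(x: list, a: list) -> int:
    """4 multiplies: mirrored samples are added before the multiply."""
    n = len(x)
    return sum(a[k] * (x[n - 1 - k] + x[k]) for k in range(len(a)))


if __name__ == "__main__":
    x = [1, 2, 3, 4, 5, 6, 7, 8]        # x(1) ... x(8)
    a = [1, 2, 3, 4]                    # a0 ... a3
    print(fir_general(x, a), fir_symmetric(x, a))
```

Folding halves the multiply count without changing the number of additions, which is the saving the FIRS instruction exploits by doing the add and the multiply-accumulate in the same cycle.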
Module 9
9 - 4 DSP54x - Algorithms
9 - 5
FIRS Implementation
1. Split the data into two parts; New and Old.
2. Set up circular buffers for each part. Set up the pointers for the buffers to the newest of "New" and the oldest of "Old". Set up a coefficient table.
3. Sum the first two data points into the high A accumulator (AH) and decrement the data pointers.
4. Zero the B accumulator and repeat the following four times:
   a. Multiply AH times the coefficient, accumulate the result into the high B accumulator (BH) and increment the coefficient pointer.
   b. Sum the next two data points and decrement the data pointers.
5. Store the result (BH) & set data pointers to oldest "Old" and oldest "New".
6. Replace oldest "Old" value with oldest "New" value. Dec. "Old" pointer.
7. Replace oldest "New" value with a new input datum and go to step 3.
[Figure: animation of the steps above. AR2 walks the "New" buffer x(5) ... x(8) and AR3 walks the "Old" buffer x(4) ... x(1), with higher addresses toward the bottom. The coefficient table holds a0 ... a3. BH accumulates a0(x(8)+x(1)) + a1(x(7)+x(2)) + a2(x(6)+x(3)) + a3(x(5)+x(4)). Afterwards x(5) replaces x(1) in "Old", and a new sample x(9) from the A/D replaces x(5) in "New".]
9 - 6
FIRS Code Example
X_new   .usect "DATA1",4
X_old   .usect "DATA2",4
        LD    #Y,DP
        SSBX  FRCT
        STM   #X_new,AR2              ;AR2 points to NEW buf
        STM   #X_old+3,AR3            ;AR3 points to OLD buf
        STM   #4,BK                   ;Circular buffer length = 4
        STM   #-1,AR0                 ;Emulates *ARn-%
FIR     ADD   *AR2+0%,*AR3+0%,A       ;AH = x(8)+x(1)
        RPTZ  B,#3                    ;B = 0; do the following 4 times:
        FIRS  *AR2+0%,*AR3+0%,COEFS   ;B=AH*a0; AH=x(7)+x(2), etc...
        STH   B,Y
        PORTW Y,0000h                 ;Output the result
        MAR   *+AR2(2)%               ;Point to oldest OLD buf
        MAR   *AR3+%                  ;Point to oldest NEW buf
        MVDD  *AR2,*AR3+0%            ;Xfer old NEW over old OLD
        BD    FIR
        PORTR 0001h,*AR2              ;Input new X to NEW buf
        .data
COEFS   .word a0,a1,a2,a3
Module 9
DSP54x - Algorithms 9 - 5
9 - 7
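Architecture - FIRS
FIRS *AR2+0% , *AR3+0% , COEFS
[Figure: FIRS data path. The ALU adds the two data operands from the C and D buses into accumulator A. The multiplier (MPY) takes accumulator A and the coefficient from the P bus, and the MAC adder (ADD) accumulates the product into accumulator B. A MUX selects the ALU inputs.]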
9 - 8
Advanced Applications
• FIRS                   Symmetrical FIR filter
• LMS                    Adaptive filtering
• POLY                   Polynomial evaluation
• STRCD, SACCD, SRCCD    Code book search
• DADST, DSADT, CMPS     Viterbi algorithm
This section: LMS (adaptive filtering)
Module 9
9 - 6 DSP54x - Algorithms
9 - 9
Least Mean Square
A least mean square (LMS) approach is widely used for adaptive filter routines.
The technique minimizes an error term by tuning the filter coefficients.
[Figure: the input x(n) drives both the real system H(z), which produces d(n), and the synthesized system W(z), which produces y(n). A summing node forms the error e(n) = d(n) - y(n).]
x(n) = input data
d(n) = desired response
y(n) = actual response
H(z) = real system
W(z) = synthesized system
e(n) = error
9 - 10
Adaptive FIR Filtering using LMS
[Figure: FIR filter with a z^-1 delay line and coefficients b0, b1, ..., bn-1 producing y(n) from x(n). The error e(n) = d(n) - y(n) feeds the LMS block, which updates the coefficients.]
FIR type filters are usually used in an adaptive algorithm since they are more tolerant of non-optimal coefficients.
Module 9
DSP54x - Algorithms 9 - 7
9 - 11
LMS Loading
Each Iteration ( only once )
1 - determine error :                       e(i) = d(i) - y(i)
2 - scale by "rate" term B :                e'(i) = 2*B*e(i)
Each Term ( N sets )
3 - Qualify error with signal strength :    e''(i) = x(i-k) * e'(i)
4 - Sum error with coefficient :            b(i+1) = b(i) + e''(i)
5 - Update coefficient :                    b(i) = b(i+1)
Analysis :
       Step   Count   Instruction
LMS:   1      1       SUB
       2      1       MPY
       3      N       MPY
       4      N       ADD
       5      N       STH
FIR:   a      N       MPY
       b      N       ADD
       c      1       STH
@ 100 tap: 500+ cycles
Using ST || MPY together with LMS (which combines the MAC and the ADD):
@ 100 tap: 200+ cycles
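The five steps above map directly onto a few lines of Python. The sketch below identifies a small "real system" from noiseless data; it is a floating-point illustration only. The rate term is called `mu` here to avoid a clash with accumulator B, and the system `h`, the seed, and the function name are all assumptions made for the example.

```python
import random


def lms_identify(h: list, mu: float, steps: int, seed: int = 0) -> list:
    """Adapt FIR coefficients b so that the filter output tracks system h."""
    rng = random.Random(seed)
    n = len(h)
    b = [0.0] * n                      # adaptive coefficients
    x = [0.0] * n                      # delay line, x[0] is the newest sample
    for _ in range(steps):
        x = [rng.uniform(-1.0, 1.0)] + x[:-1]
        d = sum(hk * xk for hk, xk in zip(h, x))    # desired response
        y = sum(bk * xk for bk, xk in zip(b, x))    # actual response
        e = d - y                                   # step 1: error
        e_scaled = 2.0 * mu * e                     # step 2: scale by rate
        # steps 3-5: qualify with each sample, sum, and update in place
        b = [bk + xk * e_scaled for bk, xk in zip(b, x)]
    return b


if __name__ == "__main__":
    print(lms_identify([0.5, -0.25], mu=0.05, steps=2000))
```

Steps 1 and 2 run once per sample, while steps 3 to 5 run once per tap, which is why the per-tap work dominates the cycle counts in the analysis.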
9 - 12
LMS Instruction
LMS  Xmem, Ymem      ;A += (Xmem) << 16 + 2^15
                     ;B += (Xmem) * (Ymem)
Example: LMS *AR3+, *AR4+
                     Before instruction    After instruction
A                    00 1111 2222          00 2111 A222
B                    00 1000 0000          00 1200 0000
FRCT                 0                     0
AR3                  0100                  0101
AR4                  0200                  0201
Data memory 0100h    1000                  1000
Data memory 0200h    2000                  2000
A:  00 1111 2222h + 00 1000 0000h + 8000h = 00 2111 A222h
B:  00 1000 0000h + 1000h * 2000h = 00 1200 0000h
The LMS instruction adapts the coefficient and performs the MAC for the filtering in the same cycle.
Storing the coefficient will require 1 additional cycle.
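The worked example on this slide can be reproduced with a small Python model of the two accumulator updates. This is a behavioral sketch with 40-bit wrap-around only; saturation is not modeled, and the extra left shift applied when FRCT = 1 is included as an assumption about the multiplier. The function name is hypothetical.

```python
MASK40 = (1 << 40) - 1


def signed16(value: int) -> int:
    """Interpret a 16-bit word as a two's-complement number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def lms_instruction(a: int, b: int, xmem: int, ymem: int,
                    frct: int = 0) -> tuple[int, int]:
    """Return (A, B) after LMS Xmem,Ymem on 40-bit accumulators."""
    a_new = (a + (signed16(xmem) << 16) + (1 << 15)) & MASK40   # coeff + round
    product = signed16(xmem) * signed16(ymem)
    if frct:
        product <<= 1            # assumed fractional-mode shift
    b_new = (b + product) & MASK40                              # filter MAC
    return a_new, b_new


if __name__ == "__main__":
    a, b = lms_instruction(0x0011112222, 0x0010000000, 0x1000, 0x2000)
    print(f"A = {a:010X}  B = {b:010X}")
```

With the slide's register contents this yields A = 00 2111 A222 and B = 00 1200 0000, matching the "after instruction" column.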
Module 9
9 - 8 DSP54x - Algorithms
9 - 13
LMS Adaptive Filter Code
        . . .                         ;Pre-calculate 2Beta*e(n) ...
        .asg  AR3, Coeffs             ;AR3 points to Coefficient table ... a(n)
        .asg  AR4, Data               ;AR4 points to Data table ... x(n)
        LD    B2e, T                  ;T holds the error step amount
        LD    #0,B                    ;Zero out B
        STM   #N-2, BRC               ;Load Branch Repeat Counter
        RPTBD End-1                   ;Start RPTB, next two are delay slots
        MPY   *Data+0%, A             ;A = error * oldest sample
        LMS   *Coeffs, *Data          ;B += a(n)*x(n) ... filter tap
                                      ;A += (a(n) << 16)+2^15 ... coeff. update
        ST    A, *Coeffs+             ;Store updated coefficient
     || MPY   *Data+0%, A             ;and form A = x(n-1)*2Beta*e(n)
        LMS   *Coeffs, *Data          ;B = accumulated filter output
                                      ;A = updated filter coefficients
End     STH   A, *Coeffs              ;Store the final updated coefficient
        STH   B, *Result              ;Store final filter result
9 - 14
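Architecture - LMS
LMS *AR2+0%, *AR3+0%
[Figure: LMS data path, two units working in parallel. ALU : LMS, where the ALU (inputs selected by a MUX from accumulator A and the D bus) updates the coefficient in accumulator A. MAC : FIR, where the multiplier (MPY) takes its operands from the D and C buses and the adder (ADD) accumulates the product into accumulator B.]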
Module 9
DSP54x - Algorithms 9 - 9
9 - 15
Advanced Applications
• FIRS                   Symmetrical FIR filter
• LMS                    Adaptive filtering
• POLY                   Polynomial evaluation
• STRCD, SACCD, SRCCD    Code book search
• DADST, DSADT, CMPS     Viterbi algorithm
This section: POLY (polynomial evaluation)
9 - 16
Polynomial Evaluation
The general form of a 3rd order polynomial equation can be written as:
P(x) = a3*x^3 + a2*x^2 + a1*x + a0
The equation can be rewritten as:
P(x) = [(a3*x + a2)*x + a1]*x + a0
Polynomial evaluation is commonly used in convolutional encoding.
This process can be extended to any order polynomial
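The rewritten (nested) form is Horner's rule, and a minimal Python sketch of it looks like this. The function name `horner` and the convention of listing coefficients from the highest order down are choices made for this illustration.

```python
def horner(coeffs: list, x):
    """Evaluate a polynomial given coefficients [a_n, ..., a1, a0].

    Each pass does one multiply and one add: acc = acc * x + next_coefficient.
    """
    acc = coeffs[0]
    for c in coeffs[1:]:
        acc = acc * x + c
    return acc


if __name__ == "__main__":
    # P(x) = x^3 + 2x^2 + 3x + 4 at x = 2
    print(horner([1, 2, 3, 4], 2))
```

An order-N polynomial needs N multiply-add passes in this form, with no explicit powers of x, which is the pattern one POLY instruction per pass implements.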
Module 9
9 - 10 DSP54x - Algorithms
9 - 17
POLY Operation
1. Set up a pointer to the coefficients.
2. Load x into the T register.
3. Load a3 into the high A accumulator (AH). Decrement pointer.
4. Load a2 into the high B accumulator (BH). Decrement pointer.
5. Repeat the following three times:
   a. Multiply AH times T, accumulate with BH and round in AH.
   b. Load the next coefficient into BH. Decrement pointer.
6. Store AH as result.
[Figure: animation of the steps above, showing T, accumulator A (AG/AH/AL), accumulator B (BG/BH/BL), and ARn stepping through the coefficient table a3, a2, a1, a0. AH holds (a3*x + a2) after the first pass, [(a3*x + a2)*x + a1] after the second, and P(x) = [(a3*x + a2)*x + a1]*x + a0 after the third.]
9 - 18
Polynomial Evaluation
        SSBX  FRCT              ;POLY operation is affected by these bits
        SSBX  OVM
        SSBX  SXM
        LD    *AR4+,T           ;T=X(0)
        LD    *AR3+,16,A        ;A=A(order)=PX init
        LD    *AR3+,16,B        ;B=A(order-1)
        RPT   #2                ;3 times
        POLY  *AR3+             ;A=PX=Rnd(B+A*T)  B=An<<16
        STH   A,*AR2+           ;PX=A>>16
Note: The POLY instruction "expects" Q15 numbers!
A parallel load may be added to do iterative POLY operations with no penalty.
     || LD    *AR4+,T           ;T=new x
Module 9
DSP54x - Algorithms 9 - 11
9 - 19
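Architecture - POLY
POLY *AR3+
[Figure: POLY data path. The multiplier (MPY) takes accumulator A and the T register; the MAC adder (ADD) adds accumulator B to the product and writes the result to accumulator A. In parallel, the ALU path (through a MUX) loads the next coefficient from the D bus into accumulator B.]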
9 - 20
Advanced Applications
• FIRS                   Symmetrical FIR filter
• LMS                    Adaptive filtering
• POLY                   Polynomial evaluation
• STRCD, SACCD, SRCCD    Code book search
• DADST, DSADT, CMPS     Viterbi algorithm
This section: STRCD, SACCD, SRCCD (code book search)
Module 9
9 - 12 DSP54x - Algorithms
9 - 21
Code Book Search
A code-excited linear predictive (CELP) speech coder is widely used for applications requiring speech coding with a bit rate under 16K bps. The speech coder uses a vector quantization technique from codebooks to produce an excitation signal. This excitation signal is then applied to a linear predictive-coding (LPC) synthesis filter.
[Figure: CELP search loop. The input speech passes through a weighting filter to give p(n). A codebook entry (0, 1, 2, ...) is scaled by the gain and passed through the synthesis filter to give g(n). The difference between p(n) and g(n) feeds a mean-square error minimization block, which selects the codebook entry.]
9 - 22
Code Book Search
Obtaining the optimum code vector involves minimizing the mean-square error generated from the weighted input speech and from the zero-input response of the synthesis filter.
Mean-Square Error
E_i = Σ (n = 0 to N-1) [ p(n) - β*g_i(n) ]^2
Optimum code vector localization
* p(n) is the weighted input speech
* g_i(n) is the zero-input response of the synthesis filter
* β is the gain of the codebook
* N is the subframe length
9 - 23
Code Book Search
The cross-correlation, c_i, of p(n) and g_i(n) is represented by:
c_i = \sum_{n=0}^{N-1} g_i(n) p(n)
The energy variable, G_i, is given by:
G_i = \sum_{n=0}^{N-1} g_i(n)^2
Minimize E_i by maximizing c_i^2 / G_i. If a code vector with i = opt is optimal, the following inequality is met for any i. The codebook search routine evaluates it for each code vector and finds the optimum one.
c_i^2 / G_i < c_opt^2 / G_opt    or    c_i^2 * G_opt < c_opt^2 * G_i
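The division-free comparison above can be sketched in a few lines of Python. This is only an illustrative model, not the workshop's code: the function name `codebook_search` and the list arguments are assumptions, the energies G(i) are assumed positive, and the start values (Iopt = 0, Copt^2 = 0, Gopt = 1) follow the initialization used on the assembly slide. It records the index i directly, whereas the assembly routine stores the block repeat counter.

```python
def codebook_search(c, g):
    """Return (i_opt, c2_opt, g_opt) for the entry that maximises c[i]**2 / g[i].

    c: cross-correlations c_i, g: energies G_i (assumed positive).
    The ratio test is done by cross-multiplication, so no division is needed.
    """
    i_opt, c2_opt, g_opt = 0, 0, 1          # Iopt = 0, Copt^2 = 0, Gopt = 1
    for i, (ci, gi) in enumerate(zip(c, g)):
        c2 = ci * ci                        # A = C(i)^2
        b = c2 * g_opt - gi * c2_opt        # B = C(i)^2 * Gopt - G(i) * Copt^2
        if b >= 0:                          # same condition as BGEQ
            i_opt, c2_opt, g_opt = i, c2, gi
    return i_opt, c2_opt, g_opt


best = codebook_search([3, -8, 5], [2, 4, 10])
```

Because the test is B >= 0, a later entry with an equal ratio replaces an earlier one.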
9 - 24
Code Book Search
        .mmregs
        .text
CBS:    STM    #C, AR5           ; AR5 -> C(0) ...
        STM    #G, AR2           ; AR2 -> G(0) ...
        STM    #G-opt, AR3       ; AR3 -> Gopt, Copt
        STM    #I-opt, AR4       ; AR4 -> Iopt
        ST     #0, *AR4          ; Iopt = 0
        ST     #1, *AR3+         ; Gopt = 1
        ST     #0, *AR3-         ; Copt = 0
        STM    #N-1, BRC
        RPTB   done
        SQUR   *AR5+, A          ; A = C(i)^2
        MPYA   *AR3+             ; B = C(i)^2 * Gopt
        MAS    *AR2+, *AR3-, B   ; T = G(i), B = C(i)^2 * Gopt - G(i) * Copt^2
        SRCCD  *AR4, BGEQ        ; If (B >= 0) then BRC --> Iopt
        STRCD  *AR3+, BGEQ       ;   and T --> Gopt
        SACCD  A, *AR3-, BGEQ    ;   and A --> Copt^2
done:   NOP
(Overlay text on the slide: D / SQUR *AR5+, A / done: MPYA *AR3+)
9 - 25
Codebook Search Instructions
Store T register conditionally ...
    STRCD Xmem, cond          Xmem = T if condition is true
Store Block Repeat Counter conditionally ...
    SRCCD Xmem, cond          Xmem = BRC if condition is true
Store Accumulator conditionally ...
    SACCD src, Xmem, cond     Xmem = src << (ASM - 16) if condition is true
9 - 26
Advanced Applications
• FIRS: Symmetrical FIR filter
• LMS: Adaptive filtering
• POLY: Polynomial evaluation
• STRCD, SACCD, SRCCD: Code book search
• DADST, DSADT, CMPS: Viterbi algorithm
9 - 27
Data Transmission
[Figure: XMIT modulates the bit stream 0010110 ...; the channel adds fading, multipath and noise; RCV demodulates it as 0010100 ...]
• Digital source data is modulated to XMIT
• Signal is demodulated at RCV
• Noise acquired on RCV can cause data errors
• For greater reliability, an error detection and correction (EDAC) technique is desired
9 - 28
Viterbi Encoder
[Figure: encoder network. Input bits pass through four Z^-1 delay elements; two adders (+) form the G0 bits and G1 bits outputs.]
• N bits are fed into the network.
• M (> N) bits flow out (... G0 G1 G0 G1 ...)
• e.g. 3 in : 4 out, 4 in : 8 out, etc.
• Recognizable "holes" are created in the data path, e.g.:

                 3 in : 4 out    4 in : 8 out
  valid codes    2^3 = 8         2^4 = 16
  total codes    2^4 = 16        2^8 = 256
  "holes"        8               240

Receiver can use a table of valid vs. invalid codes to detect errors.
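The "holes" arithmetic can be checked with a short Python sketch. The helper name `code_holes` is hypothetical; it simply counts how many of the possible output words are never produced by the encoder.

```python
def code_holes(bits_in, bits_out):
    """Return (valid codes, total codes, holes) for a bits_in : bits_out code."""
    valid = 2 ** bits_in          # one valid output word per input word
    total = 2 ** bits_out         # every possible output word
    return valid, total, total - valid


three_in_four_out = code_holes(3, 4)
four_in_eight_out = code_holes(4, 8)
```

The more holes there are relative to valid codes, the more single-bit corruptions land on an invalid word that the receiver can recognize.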
9 - 29
Viterbi Decoder Concept
[Figure: trellis section from state n to state n+1, with branches labeled 0 and 1.]
• Receive data
• There are now 4 possible sets: 00, 01, 10, 11
• Traditional approach: keep only 1. Viterbi method: keep 'best' 2
• 'Best' is determined by maximum value along paths between states
• 'Pruning' less likely paths from consideration reduces the storage needed for n samples of data from 2^n locations to just 2*n locations
• After 'M' samples are acquired, the two best paths are compared to a table of valid/invalid codes (traceback)
• Invalid set is dropped, valid set is saved as received data. If both are valid, the maximum likelihood set is saved
9 - 30
Viterbi Decoder
[Figure: butterfly. Old states 2*J and 2*J+1 connect to new states J and J+8 with branch metrics +M and -M; the candidate path metrics are held in AH, AL and BH, BL.]
D-cod:  LD     *AR2, T      ; T = M
        DADST  *AR5, A      ; AH = (2*J) + M, AL = (2*J+1) - M
        DSADT  *AR5+, B     ; BH = (2*J) - M, BL = (2*J+1) + M
        CMPS   A, *AR4+     ; (J) = max(AH, AL), etc
        CMPS   B, *AR3+     ; (J+8) = max(BH, BL), etc
Often, local distance is the same for consecutive butterflies, so the benchmark approaches 4 cycles per butterfly.
9 - 31
Viterbi Memory Map
• In one symbol time interval, 8 butterflies yield 16 new states
• This operation repeats over a number of symbol time intervals
• At the end of the sequence of time intervals, a back track routine is performed to find the optimal path out of the 16 paths calculated
• This path represents the bit sequence to be decoded
[Figure: metrics memory map by relative location. Old states: metrics 2*J & 2*J+1 at locations 0 to 15, addressed by AR5. New states: metrics J starting at location 16, addressed by AR4, and metrics J + 8 at locations 24 to 31, addressed by AR3.]
9 - 32
Viterbi Instructions
CMPS src, Smem
    IF { [ src(31-16) ] > [ src(15-0) ] }
    THEN : (src(31-16)) → Smem
           0 → TC
           (TRN << 1) + 0 → TRN
    ELSE : (src(15-0)) → Smem
           1 → TC
           (TRN << 1) + 1 → TRN
DADST Lmem, dst
    Lmem(31-16) + (T) → dst(39-16)
    Lmem(15-0) - (T) → dst(15-0)
DSADT Lmem, dst
    Lmem(31-16) - (T) → dst(39-16)
    Lmem(15-0) + (T) → dst(15-0)
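To see how these three instructions combine into one butterfly, here is a small Python model. It is a sketch under simplifying assumptions: the function names are hypothetical, path metrics are plain Python integers (no 16-bit wrap or saturation), and only TRN is masked to 16 bits.

```python
def dadst(hi, lo, t):
    """Dual add/subtract: high half + T, low half - T."""
    return hi + t, lo - t


def dsadt(hi, lo, t):
    """Dual subtract/add: high half - T, low half + T."""
    return hi - t, lo + t


def cmps(hi, lo, trn):
    """Compare-select-store: keep the larger half, shift the decision bit into TRN."""
    if hi > lo:
        return hi, 0, (trn << 1) & 0xFFFF
    return lo, 1, ((trn << 1) | 1) & 0xFFFF


def butterfly(old_2j, old_2j1, m, trn):
    """One butterfly: old states 2*J and 2*J+1 produce new states J and J+8."""
    ah, al = dadst(old_2j, old_2j1, m)
    bh, bl = dsadt(old_2j, old_2j1, m)
    new_j, _, trn = cmps(ah, al, trn)
    new_j8, _, trn = cmps(bh, bl, trn)
    return new_j, new_j8, trn


result = butterfly(10, 4, 3, 0)
```

Each butterfly appends two decision bits to TRN, which is what the traceback routine later reads to recover the surviving path.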
9 - 33
Compare Select Store (CSS) Unit
[Figure: CSS unit block diagram showing AH/AL and BH/BL, COMP, MSB/LSB write select, TRN, TC, the T register, the ALU with C16=1 (32-bit and 16-bit paths), and the CB[15:0], DB[15:0] and EB[15:0] buses.]
• Dual 16-bit ALU operations
• T register input to ALU as dual 16-bit operand
• 16-bit transition shift register (TRN)
• One cycle store Max and Shift decision
9 - 34
Absolute and Square Distance
ABDST Xmem, Ymem : Absolute Distance
    B += |AH|
    AH = Xmem - Ymem
SQDST Xmem, Ymem : Square Distance
    B += AH^2
    AH = Xmem - Ymem
9 - 35
Review
• What instruction is used to perform adaptive filtering?
• What instructions are used to perform Viterbi decoding?
• What features of the C54x architecture allow the FIRS instruction to execute in a single cycle? What might slow it down?
• What features of the C54x architecture allow the Viterbi operations to execute so quickly?
• What mnemonic is used for solving polynomials? What concept allows this to run so quickly?
9 - 36
LAB9A .. Acoustic Echo Cancellation
The goal of this adaptive filter is to create a replica of the echo so that when the signal, y(n), is subtracted from the echo signal, z(n), the result is zero.
[Figure: echo canceller. The reference signal x(n) (ref.dat) drives the speaker and the adaptive filter (LMS update of coefficients b_k), which outputs y(n). The microphone picks up the echo signal z(n) (echo.dat) together with near-end speech and room noise. y(n) is subtracted from z(n) to give the error signal e(n) (error.dat).]
9 - 37
LAB9A .. Acoustic Echo Cancellation
• Assemble and link LAB9A
• Enter the simulator and invoke LAB9A.TAK. The take file loads LAB9A and connects the simulator to the input and output files.
• Run the program. Depending on the speed of the simulator it may take several minutes to process all 2000 points. However, you can stop the simulator at any time and proceed to the next step in this procedure.
• Exit the simulator
• Plot the ERROR.DAT file by typing: DRAWHEX ERROR.DAT. Note that the error converges from a large amplitude to a small amplitude. This is the time it takes the LMS filter to adapt. The remaining signal is ambient noise present in the room when the data was collected. Analysis shows that the echo is attenuated by an average of 28 dB.
• Determine the number of clock cycles required for the filter and LMS update.
• Compare this to the 4N clock cycles required by most DSPs. N is the number of taps of the adaptive filter.
• Try changing the number of the filter taps and Beta. What is the effect?
9 - 38
LAB9A .. Acoustic Echo Cancellation
The code for the LMS filter is in LAB9A.ASM. The input x(n) is stored in a file called REF.DAT, representing a 1 kHz sine wave sampled at 8 kHz. To create the REF.DAT file, a signal was generated with a function generator, then sampled with a 13-bit A/D converter. At the same time REF.DAT was collected, the 1 kHz signal was sent to a speaker.
A microphone was used to collect the resulting echoes in a 10' x 15' room.
The echo signal z(n) was sampled in the same manner as the reference signal. The echo signal is stored in a file called ECHO.DAT. Two thousand samples, or 0.25 seconds of data, are stored in the files. LAB9A.ASM uses REF.DAT and ECHO.DAT as inputs to the LMS filter. When the program is run, the resulting error signal is stored in ERROR.DAT. The LMS filter length is set for 16 taps. For a sampling rate of 8 kHz, a 16-tap filter can cancel up to 16/8000 = 2 msec of echo delay.
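The behaviour described above can be imitated with a floating-point Python sketch. This is not the lab's LAB9A.ASM: the function name, the step size `beta = 0.01`, the factor 2 in the update, and the synthetic echo (the reference attenuated by 0.5 and delayed by 3 samples, with no room noise) are all assumptions chosen for illustration.

```python
import math


def lms_echo_canceller(x, z, taps=16, beta=0.01):
    """Adapt an FIR filter so that y(n) tracks z(n); return the error signal e(n)."""
    b = [0.0] * taps              # coefficients b_k
    history = [0.0] * taps        # x(n), x(n-1), ..., x(n-taps+1)
    errors = []
    for xn, zn in zip(x, z):
        history = [xn] + history[:-1]
        y = sum(bk * xk for bk, xk in zip(b, history))    # filter output y(n)
        e = zn - y                                        # e(n) = z(n) - y(n)
        b = [bk + 2.0 * beta * e * xk for bk, xk in zip(b, history)]  # LMS update
        errors.append(e)
    return errors


FS = 8000                                                 # sampling rate in Hz
x = [math.sin(2 * math.pi * 1000 * n / FS) for n in range(2000)]  # 1 kHz reference
z = [0.0] * 3 + [0.5 * v for v in x[:-3]]                 # synthetic echo (assumed)
e = lms_echo_canceller(x, z)
max_echo_delay_ms = 1000 * 16 / FS                        # 16 taps at 8 kHz
```

With this noise-free input the error starts near the echo amplitude and decays toward zero, which mirrors the converging ERROR.DAT plot; with real room data a residual of ambient noise would remain.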
9 - 39
LAB9B .. GSM Channel Coding
GSM stands for Global System for Mobile communications. It is the digital cellular standard used in Europe and throughout the world.
[Figure: GSM transmit and receive chains; the blocks covered by LAB9B are marked on the slide.]
Transmitter:
• RPE-LPC speech encoder with voice activity detector: input 160 samples, output 260 bits
• Output is split into class 1a, class 1b (132 bits) and class 2 (78 bits)
• 3 parity check bits on 50 class 1a bits: 53 bits
• Reorder class 1 bits and add 4 zero trailing bits: 189 bits
• 1/2 rate, constraint length 5 convolutional encoding on class 1a and 1b bits: 378 bits
• Interleaving, A5 encryption and slot formatting
• GMSK modulation performed in RF codec
• RF transmission of s(t)
Channel
Receiver:
• RF reception of s'(t)
• GMSK demodulation performed in RF codec
• Equalization, slot disassembly, de-encryption and de-interleaving
• Viterbi decode 378 class 1 bits
• Bit reordering
• Test parity check bits (53 bits in, 50 bits out), discard block if check fails
• 50 class 1a bits, 132 class 1b bits and 78 class 2 bits: 260 bits
• RPE-LPC speech decoder with comfort noise: input 260 bits, output 160 samples
9 - 40
LAB9B .. GSM Channel Coding
• Examine the file LAB9B.ASM. Note the special instructions used for implementing the Viterbi butterfly.
• Assemble and link LAB9B.ASM.
• Simulate LAB9B program by typing: take LAB9B.TAK
• Examine the input data array in the MEMORY window.
[Figure: convolutional encoder. Input bits pass through four Z^-1 delay elements; two adders (+) form the G0 bits and G1 bits outputs.]
9 - 41
LAB9B .. continued

  Convolutional encoder output (G0 and G1)    Signed Antipodal format
  1                                           -7
  0                                            7

• Encode the input data by running to the first break point. Break points were set by the take file. The encoded data is in the MEMORY1 window. Simulate transmission errors by making changes to the encoded data. Valid changes are between -7 and 7. Note: the encoded data is in signed antipodal format, the format that the GSM equalizer output would be in.
• Run the Viterbi decoder and compare the input data array to the output data array. Is the output correct?
9 - 42
LAB9B .. GSM Channel Coding
Every 20 ms, 160 sampled values from the ADC are analyzed by the Regular Pulse Excitation (RPE) Linear Predictive Coding (LPC) voice encoder. The filter amounts to a model of the speaker's vocal tract (pharynx, teeth, tongue, etc.) and the excitation signal represents sounds (pitch, loudness, etc.). Finding suitable filter coefficients and an excitation signal yields an appropriate speech signal.
The real reduction in bit rate comes from further analyzing the excitation signal. The difference between current and previous excitation signals is found by using Long Term Predictive analysis (LTP). The LTP algorithm searches all of the previous sequences (15 ms of history) for the sequence that has the highest correlation to the current sequence. The difference is transmitted along with a pointer to the sequence that should be selected for use. The 160 samples are reduced to 260 bits. The resulting bit rate is 13 kbit/s.
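The bit counts quoted in this lab can be cross-checked with a few lines of Python. The constant names are hypothetical; the numbers are the ones given on the slides (260 bits per 20 ms frame, 50 class 1a bits, 3 parity bits, 132 class 1b bits, 4 trailing zero bits, 78 class 2 bits).

```python
FRAME_MS = 20                     # one speech frame every 20 ms
SPEECH_BITS = 260                 # encoder output per frame
CLASS_1A, PARITY, CLASS_1B, TAIL, CLASS_2 = 50, 3, 132, 4, 78

speech_rate_bps = SPEECH_BITS * 1000 // FRAME_MS       # source bit rate
conv_input_bits = CLASS_1A + PARITY + CLASS_1B + TAIL  # bits entering the encoder
conv_output_bits = 2 * conv_input_bits                 # rate 1/2 doubles them
frame_bits = conv_output_bits + CLASS_2                # class 2 bits are not coded
```

The last value is the total number of bits per 20 ms frame after channel coding, as implied by the block diagram.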
9 - 43
LAB9C : Polynomial Evaluation
• Examine (or create your own) LAB9C.ASM.
• Assemble and link LAB9C.
• Simulate LAB9C and observe the operation of the POLY instruction by single stepping through the code until the "End" label.
• Verify that the code generates the expected result.
• Optional: Modify the inputs to generate new results.
• Note: The POLY instruction "expects" Q15 numbers!
x  = 3/4 = 0x6000
a0 = 1/8 = 0x1000
a1 = 1/4 = 0x2000
a2 = 3/8 = 0x3000
a3 = 1/2 = 0x4000
P(x) = a_3 x^3 + a_2 x^2 + a_1 x + a_0
P(x) = [(a_3 x + a_2) x + a_1] x + a_0
P(x) = 94/128 = 0x5E00 in Q15
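The nested form of P(x) is Horner's rule, and the expected result can be checked with a fixed-point Python sketch. The function name `poly_q15` is hypothetical, and the model truncates each Q15 product; it ignores any rounding or saturation that the hardware applies.

```python
def poly_q15(coeffs_high_to_low, x):
    """Evaluate a polynomial by Horner's rule on Q15 integers (truncating products)."""
    acc = coeffs_high_to_low[0]
    for a in coeffs_high_to_low[1:]:
        acc = ((acc * x) >> 15) + a     # Q15 * Q15 -> Q30, shift back to Q15, add next term
    return acc


x = 0x6000                              # 3/4
coeffs = [0x4000, 0x3000, 0x2000, 0x1000]   # a3, a2, a1, a0
result = poly_q15(coeffs, x)
```

Each pass through the loop is one multiply and one add, which is why a degree-3 polynomial needs only three such steps.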
9 - 44
Additional Resources
1. S. M. Redl, M. K. Weber, M. W. Oliphant, "An Introduction to GSM", Artech House, 1995.
2. H. Hendrix, "A Brief Tutorial on GSM Decoding Techniques", TI Internal paper, 1995.
3. H. Hendrix, "Viterbi Decoding Techniques on the TMS320C54x Family", TI Application Report, 1995.
DSP54x - Interrupts 10 - 1
Interrupts
Learning Objectives
10 - 2
Objectives
• Describe the 'C54x state upon reset.
• Identify interrupt sources.
• Identify the requirements for interrupt recognition.
• Describe the sequence of events during an interrupt.
• Build vector tables.
Module 10
10 - 3
Hardware Reset Actions
• All control signals are driven inactive high
• Address lines are driven to FF80h
• Data bus is driven to high impedance state
• Interrupts are disabled : 1 → INTM
• Prior interrupts are purged : 0 → IFR
• The repeat counter (RC) is cleared
• IACK- is driven low
• An internal reset is sent to the peripherals.
• Seven CLKOUT cycles after RS- is released the processor will fetch from 0FF80h
10 - 4
Processor Status on Reset
Math:    0 → OVA, 0 → OVB, 0 → OVM, 1 → C, 0 → C16, 0 → ASM, 0 → FRCT, 1 → SXM
Misc:    0 → BRAF, 0 → DP, 0 → CPL, 0 → CMPT, 0 → ARP, 1 → INTM
Memory:  0 → OVLY, 0 → DROM, ? → MP/MC-, 1FFh → IPTR
Pins:    1 → XF, 1 → CLKEN, 0 → AVIS, 0 → HM
10 - 5
Interrupt Locations

  Interrupt    Offset (Hex)   Description
  RS           0              Reset
  NMI          4              Nonmaskable Interrupt
  SINT17-30    8-3C           Software Interrupt 17-30
  INT0         40             External User Interrupt #0
  INT1         44             External User Interrupt #1
  INT2         48             External User Interrupt #2
  TINT         4C             Internal Timer Interrupt
  RINT0        50             Serial Port 0 Receive Interrupt
  XINT0        54             Serial Port 0 Transmit Interrupt
  RINT1        58             Serial Port 1 Receive Interrupt
  XINT1        5C             Serial Port 1 Transmit Interrupt
  INT3         60             External User Interrupt #3
               64-7F          Reserved
10 - 6
Interrupt Management
IFR and IMR share the same bit layout:

  Bit     15-9       8      7       6       5       4       3      2      1      0
  Field   Reserved   INT3   XINT1   RINT1   XINT0   RINT0   TINT   INT2   INT1   INT0

ST1 bit 11: INTM

  Master Enable  :   RSBX  INTM
  Master Inhibit :   SSBX  INTM
  Set IMR Bits   :   ST    #102h, *(IMR)
  Modify IMR     :   ORM   #40h, *(IMR)
  Clear IFR Bit  :   ST    #1, *(IFR)
10 - 7
Recognition of Interrupts
[Figure: flow chart]
1. INT high 2 cycles
2. INT low 3 cycles
3. IFR Bit Latched
4. IMR Bit = 1?
5. INTM Bit = 0?
6. Interrupt Begins
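The last three decisions in this flow reduce to a bit test, sketched below in Python. The function name and the bit-number table are illustrative assumptions built from the IFR/IMR layout shown on the Interrupt Management slide; the pin timing (high 2 cycles, low 3 cycles) is not modeled.

```python
BIT = {"INT0": 0, "INT1": 1, "INT2": 2, "TINT": 3, "RINT0": 4,
       "XINT0": 5, "RINT1": 6, "XINT1": 7, "INT3": 8}


def interrupt_begins(ifr, imr, intm, name):
    """True when the flag is latched in IFR, enabled in IMR, and INTM is 0."""
    mask = 1 << BIT[name]
    return bool(ifr & mask) and bool(imr & mask) and intm == 0


imr = 0x102                    # value from "ST #102h, *(IMR)": INT3 and INT1
imr_after_orm = imr | 0x40     # "ORM #40h, *(IMR)" additionally enables RINT1
```

A latched but masked interrupt stays pending in IFR; it is serviced only once both its IMR bit is set and INTM is cleared.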
10 - 8
Post Interrupt Hardware Sequence

  CPU Action          Description
  1 → INTM            Disable global interrupts
  PC → --*(SP)        Push PC onto predecremented stack
  Vector(n) → PC      Load PC with int. vector "n" address
  0 → IACK pin        IACK signal goes low
  0 → IFR(n)          Clear corresponding interrupt flag bit
10 - 9
IACK Decoder

  Addr   6 5 4 3 2 1 0
  INT0   1 0 0 0 0 0 0
  INT1   1 0 0 0 1 0 0
  INT2   1 0 0 1 0 0 0
  INT3   1 1 0 0 0 0 0

[Figure: 'C54x driving a '138 (3 - 8 DeMux). Select inputs C, B, A are A5, A3, A2; enable G1 is A6 and enable G2- is IACK-. Outputs: Y0 = IACK0, Y1 = IACK1, Y2 = IACK2, Y4 = IACK3.]
Note: For internal vectors set AVIS = 1
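The decoder wiring can be checked against the address table with a short Python model. The function name is hypothetical, and the model assumes the connections read from the figure (C, B, A = A5, A3, A2; G1 = A6; G2- = IACK-); it returns the index of the active '138 output, or None when the decoder is disabled.

```python
def iack_decoder_output(addr, iack_n):
    """Return the selected '138 output number (0-7), or None if not enabled."""
    g1 = (addr >> 6) & 1
    if g1 != 1 or iack_n != 0:            # G1 must be high and G2- (IACK-) low
        return None
    c = (addr >> 5) & 1                   # C = A5
    b = (addr >> 3) & 1                   # B = A3
    a = (addr >> 2) & 1                   # A = A2
    return (c << 2) | (b << 1) | a


outputs = {name: iack_decoder_output(offset, 0)
           for name, offset in [("INT0", 0x40), ("INT1", 0x44),
                                ("INT2", 0x48), ("INT3", 0x60)]}
```

INT3 lands on Y4 rather than Y3 because its vector offset sets A5, which is wired to the most significant select input.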
10 - 10
Context Save & Restore Instructions

  Instruction    Description
  PSHM mmr       Push MMR onto Stack (SP - 1 → SP)
  POPM mmr       Pop from Stack to MMR (SP + 1 → SP)
  PSHD Smem      Push Data memory value onto Stack (SP - 1 → SP)
  POPD Smem      Pop top of Stack to Data memory (SP + 1 → SP)
  FRAME K        Modify Stack Pointer (SP + K → SP)
10 - 11
Context Save
        .ref    ISR1
        .sect   ".vectors"
        ...
INT1:   BD      ISR1
        PSHM    ST0
        PSHM    ST1
        ...

        .mmregs
        .def    ISR1
        .text
ISR1:   PSHM    AL
        PSHM    AH
        PSHM    AG
        PSHM    AR1
        PSHM    IMR
        PSHM    PMST
        ; ISR FOLLOWS...
10 - 12
Context Restore
        ; ISR CONCLUDES...
        ; Context Restore:
        POPM    PMST
        POPM    IMR
        POPM    AR1
        POPM    AG
        POPM    AH
        POPM    AL
        POPM    ST1
        POPM    ST0
        RETF
10 - 13
Return Instructions

  Instruction   Actions                               Cycles
  RET[D]        *(SP)++ → PC                          RET 5, RETD 3
  RETE[D]       *(SP)++ → PC, 0 → INTM                RETE 5, RETED 3
  RETF[D]       RTN → PC, 0 → INTM, SP + 1 → SP       RETF 3, RETFD 1
10 - 14
Nested Interrupts
        PSHM    IMR         ; Save IMR
        STM     #5, IMR     ; Enable only Interrupts 0 and 2
        RSBX    INTM        ; Enable Interrupts INTM=0
        ; Nestable ISR . . .
        SSBX    INTM        ; Disable Interrupts INTM=1
        POPM    IMR         ; Restore IMR value
        RETE
10 - 15
Vector Table Structure
        .sect   ".vectors"
RSV:    BD      Reset
        STM     #STK+LEN, SP
NMV:    Put NMI routine here ...
IV1:    BD      ISR1
        PSHM    ST0
        PSHM    ST1
IV2:    BD      ISR2
        PSHM    ST0
        PSHM    ST1
        ...
        .loop   40h-$
        RETE
        .endloop
10 - 16
Filling Empty Vectors
Standard Vector
IVn:    BD      ISRn
        PSHM    ST0
        PSHM    ST1
Unused Vector
IVn:    NOP
        NOP
        NOP
        NOP
Unused Vector - Debug
IVn:    BD      IVn
        NOP
        NOP
Unused Vector - Production
IVn:    XORM    #10b, *(IMR)
        RETE
10 - 17
NMI Interrupt
• Supersedes all regular activity.
• Can serve as ultra-high priority interrupt event.
• Ignores state of INTM and IMR.
• Sets INTM=1.
Cautions:
• Cannot supersede RPT or not RDY.
• Can slow time-critical interrupt response.
• Can interrupt itself.
• Can lead to ambiguous return state.
10 - 18
Using Address for NMI Return Status
Put main and ISR code in separate areas
[Figure: program memory regions labeled Main, Ints_Off, Vectors and ISRs.]
When about to return from NMI:
        POPM    AL              ; Get Return Address
        PSHM    AL              ; Put Return Address Back
        SUB     #Ints_Off, A    ; Is Return Address in ISR region?
        RETC    GEQ             ; If yes, return w/o clearing INTM
        RETE                    ; Else clear INTM (allow interrupts)
10 - 19
Using Flag for NMI Return Status
VecN:   BD      IsrN        ; Jump to ISR in 2 words
        SSBX    TC          ; Set flag (non Interruptible)
        NOP                 ; Room for 1 more...

IsrN:   ...
        RETED               ; Return from ISR in 2 cycles
        RSBX    TC          ; Clear flag
        NOP                 ; Last word...

NMI:    ...                 ; NMI ISR code starts here
        RETC    TC          ; If TC=1 ret to ISR w INTM=1
        RETE                ; Else allow interrupts and return
10 - 20
Fast Interrupts
Allows 3-cycle ISR, e.g.:
RINT0:  NOP
        RETFD
        MVKD    DRR0, *AR7+%
• Only 2 words of code may follow RETFD.
• One word may precede the RETFD.
• Creates "unsupervised" action.
10 - 21
INTR and TRAP Instruction

[Label] INTR k    ; 0 ≤ k ≤ 31
[Label] TRAP k    ; 0 ≤ k ≤ 31

INTR = TRAP + 1 -> INTM

k     Interrupt   Offset        k     Interrupt   Offset
0     RS          0h            16    INT0        40h
1     NMI         4h            17    INT1        44h
2     SINT17      8h            18    INT2        48h
3     SINT18      Ch            19    TINT        4Ch
4     SINT19      10h           20    RINT        50h
5     SINT20      14h           21    XINT        54h
...   ...         ...           22    RINT1       58h
15    SINT30      3Ch           23    XINT1       5Ch
                                24    INT3        60h
                                25    Reserved    64h
                                ...   ...         ...
                                31    Reserved    7Ch
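As a rough illustration of the table and of the relation INTR = TRAP + (1 -> INTM), the Python sketch below models both instructions. The function name and its return tuple are hypothetical; only the offset rule (four words per vector) and the INTM behaviour are taken from the slide.

```python
def software_interrupt(instruction: str, k: int, intm: int) -> tuple[int, int]:
    """Return (vector offset, INTM after the instruction) for INTR k or TRAP k."""
    if not 0 <= k <= 31:
        raise ValueError("k must be in the range 0..31")
    offset = 4 * k                  # vector slots are 4 words apart: 0h, 4h, 8h, ...
    if instruction == "INTR":
        return offset, 1            # INTR additionally sets INTM
    if instruction == "TRAP":
        return offset, intm         # TRAP leaves INTM unchanged
    raise ValueError("instruction must be 'INTR' or 'TRAP'")


if __name__ == "__main__":
    for name in ("INTR", "TRAP"):
        offset, intm = software_interrupt(name, 19, 0)   # k = 19 is TINT
        print(f"{name} 19: offset {offset:02X}h, INTM = {intm}")
```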
10 - 22
RESET Instruction

(IPTR)<<7 → PC
0 → IFR
0 → CLOCK OFF (C541)

Math
0 → OVA
0 → OVB
0 → OVM
1 → C
0 → C16
0 → ASM
0 → FRCT
1 → SXM

Misc
0 → OVM
0 → BRAF
0 → DP
0 → CPL
0 → CMPT
0 → ARP
1 → INTM

Memory
0 → OVLY
0 → DROM
? → MP/MC-

Pins
1 → XF
1 → CLKEN
0 → AVIS
0 → HM
10 - 23
Interrupt Vector Address

Bit:     15 14 13 12 11 10  9  8  7 |  6  5  4  3  2  1  0
Field:   IPTR                       |  vector offset
Reset:    1  1  1  1  1  1  1  1  1 |  0  0  0  0  0  0  0

PMST

Bit:     15 14 13 12 11 10  9  8  7 |  6      |  5    |  4    |  3    |  2      |  1  0
Field:   IPTR                       |  MP/MC- |  OVLY |  AVIS |  DROM |  CLKOFF |  Reserved
Reset:    1  1  1  1  1  1  1  1  1 |
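The address formation shown above can be sketched in Python as follows. This is a minimal model, assuming the vector address is simply the nine IPTR bits of PMST followed by the 4*k offset from the INTR/TRAP table; the constant and function names are hypothetical.

```python
IPTR_SHIFT = 7        # IPTR occupies PMST bits 15-7
IPTR_MASK = 0x1FF     # nine bits


def vector_address(pmst: int, k: int) -> int:
    """Return the 16-bit address of interrupt vector k for a given PMST value."""
    if not 0 <= k <= 31:
        raise ValueError("k must be in the range 0..31")
    iptr = (pmst >> IPTR_SHIFT) & IPTR_MASK
    return (iptr << IPTR_SHIFT) | (4 * k)


if __name__ == "__main__":
    # With IPTR all ones (the reset state shown on the slide) the vectors start at FF80h.
    print(f"{vector_address(0xFF80, 0):04X}h")    # reset vector
    print(f"{vector_address(0xFF80, 19):04X}h")   # TINT vector
```

Only the upper nine bits of PMST take part, so the mode bits (MP/MC-, OVLY and so on) do not change the result.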
10 - 24
Timer Block Diagram

[Block diagram: CLKOUT clocks PSC (4 bits), which is loaded from TDDR (4 bits); PSC clocks TIM (16 bits), which is loaded from PRD (16 bits); TIM generates TINT.]

TINT rate = 1 / ( TCLK1 x (TDDR+1) x (PRD+1) )
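The rate formula can be checked numerically with a short Python sketch. The helper names are hypothetical, and the 25 ns clock period in the example is an assumption borrowed from the 25 ns devices discussed elsewhere in this guide.

```python
def tint_rate_hz(tclk1_s: float, tddr: int, prd: int) -> float:
    """TINT rate = 1 / (TCLK1 x (TDDR+1) x (PRD+1))."""
    if not 0 <= tddr <= 0xF:
        raise ValueError("TDDR is a 4-bit field")
    if not 0 <= prd <= 0xFFFF:
        raise ValueError("PRD is a 16-bit register")
    return 1.0 / (tclk1_s * (tddr + 1) * (prd + 1))


def prd_for_rate(tclk1_s: float, tddr: int, target_hz: float) -> int:
    """Nearest PRD value that gives the requested TINT rate for a chosen TDDR."""
    prd = round(1.0 / (tclk1_s * (tddr + 1) * target_hz)) - 1
    if not 0 <= prd <= 0xFFFF:
        raise ValueError("target rate is out of range for this TDDR")
    return prd


if __name__ == "__main__":
    prd = prd_for_rate(25e-9, 9, 1000.0)
    print(prd, tint_rate_hz(25e-9, 9, prd))
```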
10 - 25
Timer Control Register

Fields (bit 15 down to bit 0):  Reserved | PSC | TRB | TSS | TDDR
(diagram labels on the control bits: enable, start / stop)

PSC    Timer prescaler counter
TDDR   Timer divide down ratio
TSS    1 = stop timer, 0 = timer run
TRB    1 = load TIM from PRD
10 - 26
Lab 10 - Interrupt Driven Event

VECTORS.ASM
    RS
    TINT

LAB9_M.ASM
    START:
        CALL TIMER_INIT
        CALL SINE_INIT
        ENABLE INTS
    MAIN:
        A = A + 1
        LOOP

LAB9_T.ASM
    TIMER_INIT
        RET

LAB9_S.ASM
    SINE_INIT
        RET

    CONTEXT SAVE
    SINE_ISR
    OUT TO PORT0
    CONTEXT RESTORE
    RET

OUT.DAT
MODIFY LAB9.CMD

(Diagram callouts: 1, 2, 3 and A, B, C, D)
10 - 27
Lab Procedure
• Write code and get files talking
• Verify that interrupts are working
• Get sine wave working (watch DP)
• Perform/verify full context save/restore
10 - 28
Review
• What are the interrupt sources?
• How do you poll for interrupts?
• What must you set up to respond to an interrupt?
• What conditions affect interrupt latency?
DSP54x - Hardware Interfacing 11 - 1
Hardware Interfacing
Learning Objectives
11 - 2
Objectives
• Describe the purpose of each interface pin.
• Connect the ‘C54x to various memory and peripheral devices.
• Identify the key timing for external reads and writes.
• Implement software wait states.
11 - 3
Interfacing Memory and Peripherals

[Block diagram: the TMS320C54x drives a 16-bit ADDRESS bus (A) and a 16-bit DATA bus (D), shown as (8-15) and (0-7), to three external devices: PGM, DATA and I/O. PS, DS and IS feed CS1 of the PGM, DATA and I/O devices respectively. MSTRB feeds CS2 of the PGM and DATA devices; IOSTRB feeds CS2 of the I/O device. R/W feeds WE. OE is tied to GND.]
11 - 4
Read Timing

[Timing diagram: address, data and MSTRB- over one cycle (0 to 25 ns), with intervals 1, 2 and 3 marked.]

#    Parameter        Time (ns)
     Cycle Time       25
1    Setup Address     5
2    Data Valid        5
3    Memory Speed     15

Notes:
1. Address timing also includes the PS, DS, IS and MSTRB signals.
2. All times are in nanoseconds.
3. H = one-half CLOCKOUT1 cycle time.
4. MSTRB stays low across reads.
11 - 5
Bus Collision Avoidance
• Only one external memory - no collision possible
• Multiple external memories
    • Sequential reads within one memory - MS address lines don’t change
      Only one memory will respond - no bus collision possible
    • Sequential reads across memories - MS address lines change
        • Multiple devices may respond - can yield early data collisions,
          or the new device may not turn on in time, esp. if a de-mux is used to feed CS-
        • Won’t corrupt read data, since the 54x reads only at the end of the cycle
        • Won’t damage memory, since the event is brief
        • Can yield noise, wastes power
• Solutions
    • If noise/power is not a concern: no problem
    • Use faster memory (higher cost, power, etc.)
    • Add a wait state only when reading across devices (how?)
11 - 6
BSCR: Bank Switch Control Register

Bits:    15 - 12    11       10 - 0
Field:   BNKCMP     PS-DS    res.

BNKCMP value    MSBs compared    Bank Size
0 0 0 0         None             64K
1 0 0 0         15               32K
1 1 0 0         15 - 14          16K
1 1 1 0         15 - 13          8K
1 1 1 1         15 - 12          4K

Note: Use only specified values of BNKCMP.

Bit 11: If PS-DS is set, add 1 wait state when access changes between PS and DS.
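To make the bank comparison concrete, here is a small Python sketch. It assumes the extra cycle is inserted whenever the compared MSBs of two consecutive external read addresses differ; the function names are hypothetical and the model is parameterised by the "MSBs compared" column rather than by the raw BNKCMP encoding.

```python
def bank_mask(msbs_compared: int) -> int:
    """Address mask covering the top `msbs_compared` bits (0..4) of a 16-bit address."""
    if not 0 <= msbs_compared <= 4:
        raise ValueError("between 0 and 4 MSBs can be compared")
    return (0xFFFF << (16 - msbs_compared)) & 0xFFFF


def bank_size_words(msbs_compared: int) -> int:
    """Bank size implied by the number of compared MSBs: 64K, 32K, 16K, 8K or 4K."""
    return 0x10000 >> msbs_compared


def crosses_bank(prev_addr: int, next_addr: int, msbs_compared: int) -> bool:
    """True when two consecutive reads fall in different banks."""
    return ((prev_addr ^ next_addr) & bank_mask(msbs_compared)) != 0


if __name__ == "__main__":
    # 16K banks (bits 15-14 compared): 3FFFh -> 4000h crosses a bank boundary.
    print(crosses_bank(0x3FFF, 0x4000, 2))
    # 32K banks (bit 15 compared): the same pair stays inside one bank.
    print(crosses_bank(0x3FFF, 0x4000, 1))
```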
11 - 7
Interface Comparison: 2 vs 4 Phase

Device    Phases    I.Rate (ns)    Sram (ns)    Ratio
C10       4         200            75           38%
C25       4         100            35           35%
C50       2          50            32           64%
C54x      2          25            15           60%

Four phase systems allow 1/3 cycle for memory, while the two phase approach offers about 2/3 cycle.

Memory: Cost vs Speed
11 - 8
Memory Interface Protocols
• Four phase memory interface is an industry standard, as shown in the diagram below:
• The first phase is a strobe off time to allow the address to stabilize
• Phase two is strobe on, with valid address
• Phase three is for sending/receiving data
• In the fourth phase the strobe goes off to latch data, allow data hold time, and to relinquish the bus before the next memory cycle

[Timing diagram: strobe, address and data across the four phases: off, A, D, off.]
11 - 9
Memory Interface Protocols
• Read cycles do not require the dead phases, though.
• Since the 54x is the only device ‘listening’ on the bus, it can latch data near the end of “D” time and not be confused by any spurious early data, nor require an explicit data strobe signal.
• By eliminating the ‘dead’ phases, the 54x is able to offer a much larger time window to external memory, as seen previously.

[Timing diagram: strobe, address and data across the four phases: off, A, D, off.]
11 - 10
Memory Interface Protocols
• Write cycles do require the dead phases, however.
• Since here the external memory is ‘listening’, the protocol must guard against attempting a write before the address is stabilized (the first ‘off’ time) and provide a data latch and hold time (the last ‘off’ time).
• Therefore, writes must have a four phase protocol.
• These two ‘off’ phases must be implemented with an extra CPU cycle each.

[Timing diagram: strobe, address and data across the four phases: off, A, D, off.]
11 - 11
External Interface Write Timing

[Timing diagram: ADDR, DATA, MSTRB and R/W over a read (R), dead, write (W), dead sequence. The read address and read data (15 ns) are followed by the write address and write data; marked intervals include 5, 2H-5, 4H-2 and the numbered parameters 1, 2 and 3.]

#    Parameter                                       Formula    Cycle Time 25    Cycle Time 20
1    Address valid to MSTRB (min)                    2H-5       20               15
2    Data valid before MSTRB (min) (setup time)      2H-5       20               15
3    Data valid after MSTRB (min) (hold time)        H-5        8                5

Notes:
1. All times are in nanoseconds.
2. H = one-half CLOCKOUT1 cycle time.
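The H-based formulas can be evaluated for any cycle time with a few lines of Python. This is only a sketch with hypothetical names; it applies the three expressions from the slide (2H-5, 2H-5, H-5) and does no rounding, so the 25 ns hold time comes out as 7.5 ns where the slide shows 8.

```python
def write_timing_ns(cycle_ns: float) -> dict[str, float]:
    """Minimum write-timing parameters for a given CLOCKOUT1 cycle time."""
    h = cycle_ns / 2.0                       # H = one-half cycle time
    return {
        "addr_valid_to_mstrb": 2 * h - 5,    # parameter 1
        "data_setup": 2 * h - 5,             # parameter 2
        "data_hold": h - 5,                  # parameter 3
    }


if __name__ == "__main__":
    for cycle in (25.0, 20.0):
        print(cycle, write_timing_ns(cycle))
```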
11 - 12
Three Cycle Write Overhead
• The two phase memory interface reduces memory cost, but yields a three cycle write.
• How much performance is lost as a result?
• Consider the usual activity of DSP: sum of products.

Order    Data    Coeffs    Code    Writes    Cycles    Ops    A.B.R.
20       20      20        10      1         53        51     1.04
50       50      50        10      1         113       111    1.02
100      100     100       10      1         213       211    1.01

• In these examples we can see that the overhead due to external writes is small for normal DSP functions.
• Can the overhead be reduced further if necessary?
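The table rows can be reproduced with a short Python sketch. The model is inferred from the numbers, not stated on the slide: each data read, coefficient read and code word is assumed to cost one cycle, each external write three cycles, and the last column is taken to be Cycles divided by Ops. The function name is hypothetical.

```python
def sum_of_products_overhead(order: int, code: int = 10, writes: int = 1):
    """Return (cycles, ops, ratio) for an order-N sum of products with external writes."""
    ops = order + order + code + writes          # data + coeffs + code + writes
    cycles = order + order + code + 3 * writes   # each external write takes 3 cycles
    return cycles, ops, round(cycles / ops, 2)


if __name__ == "__main__":
    for n in (20, 50, 100):
        print(n, sum_of_products_overhead(n))
```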
11 - 13
Write Timing Details
• Single write requires three cycles.
• Chained writes use 2N+1 cycles.
• Single write is only one CPU cycle.
• Internal write is one cycle.

Note: writing an array of internal memory to external memory is full speed: while the external bus is dead, the CPU reads the next datum from internal memory. Thus N reads and N writes take ~ 2*N cycles.
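A brief Python sketch of the cycle counts quoted above; the function name is hypothetical, and the single-write case is simply the N = 1 instance of the 2N+1 rule.

```python
def chained_write_cycles(n_writes: int) -> int:
    """External bus cycles for N back-to-back external writes (2N + 1)."""
    if n_writes < 1:
        raise ValueError("at least one write is required")
    return 2 * n_writes + 1


if __name__ == "__main__":
    for n in (1, 2, 8, 64):
        # Cost per write approaches 2 cycles as the chain gets longer.
        print(n, chained_write_cycles(n), chained_write_cycles(n) / n)
```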
11 - 14
IO Memory Timing

[Timing diagram: CLKOUT with time marks at 0, 12, 25, 37 and 50 ns; A(15-0) and R/W-; Read data; Write data; IOSTRB-. Marked intervals include 5, 7, 27 and 2 ns.]
11 - 15
Software Wait States

[Memory map regions: I/O Mem, Lo Data, Hi Data, Lo Prog, Hi Prog]

SWWSR fields (bit 15 down to bit 0):  R | I/O | Hi Data | Low Data | Hi Prog | Low Prog

SWWSR: Software Wait State Register. Data Addr: 0028h
• 3 bit fields = 0 to 7 software wait states (SW-WS)
• On reset, all P/D memory is 7 WS (SWWSR = 7FFFh).
• On last SW-WS: MSC- will go LOW for 1 cycle.
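A possible way to assemble an SWWSR value is sketched below in Python. The field positions are inferred from the field order on the slide (five 3-bit fields below a reserved top bit). The wait-state estimate is a simplifying assumption, not a data-sheet rule: it takes the 15 ns zero-wait memory window from the read-timing slide and assumes each wait state adds one 25 ns cycle. All names are hypothetical.

```python
import math

# Bit position of each 3-bit field, LSB first.
FIELD_SHIFT = {"lo_prog": 0, "hi_prog": 3, "lo_data": 6, "hi_data": 9, "io": 12}


def pack_swwsr(**wait_states: int) -> int:
    """Build an SWWSR word; fields that are not named default to 0 wait states."""
    value = 0
    for name, count in wait_states.items():
        if name not in FIELD_SHIFT:
            raise KeyError(f"unknown field: {name}")
        if not 0 <= count <= 7:
            raise ValueError("each field holds 0..7 wait states")
        value |= count << FIELD_SHIFT[name]
    return value


def wait_states_needed(access_ns: float, cycle_ns: float = 25.0,
                       zero_ws_window_ns: float = 15.0) -> int:
    """Assumed model: every wait state widens the memory window by one cycle."""
    return max(0, math.ceil((access_ns - zero_ws_window_ns) / cycle_ns))


if __name__ == "__main__":
    print(f"{pack_swwsr(io=7, hi_data=7, lo_data=7, hi_prog=7, lo_prog=7):04X}h")
    print(wait_states_needed(15.0), wait_states_needed(200.0))
```

Under this model a result above 7 means software wait states alone cannot cover the device.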
11 - 16
Hardware Wait States
• Software wait states may not be sufficient for all systems.
• Therefore hardware wait states may be used when:
    • More than 7 wait states are required
    • More than 2 speeds of memory exist in a map
    • Variable wait-states exist
• Hardware wait-states are not considered for 0 and 1 SWWS areas
• For 2-7 SWWS areas, the MSC- (Micro State Complete) pin falls at the end of the last SWWS. Therefore, this signal may be used to indicate n cycles have already transpired, upon which external delay may be added, if required.
• Hardware wait is completed by a high signal input into the READY pin
    • READY is ignored for 0 and 1 SWWS areas
    • READY is sampled on falling CLOCKOUT1 (mid-cycle)
    • READY is not sampled before MSC- falls
11 - 17
Mixed Wait-State Example

[Block diagram: a 25 ns TMS320C54x with 16-bit ADDRESS (A) and DATA (D) buses connected to two program memories: Hi Pgm in 15 ns SRAM and Lo Pgm in 200 ns EPROM. Each chip select (CS) is formed by ORing PS-, MSTRB- and A15 (A15- for one device, A15+ for the other). MSC and CLOCKOUT1 feed OR gates and a flip-flop (FF-) that drives READY.]

SWWSR fields:  R | I/O | Hi Data | Low Data | Hi Prog | Low Prog
Values shown:  0   7     1         1          x         4
11 - 18
Memory Timing Summary
• All internal accesses are single cycle.
• Internal DARAM action may be two accesses per cycle.
• External-read timing is biased for read access times:
    • 15 ns for 25-ns devices
• Write-cycle timing (cost/performance tradeoffs):
    • 3 cycles for a single write
    • 2 cycles per write with multiple writes
    • 1 CPU cycle to initiate bus cycle(s)
• Software-generated wait states allow for slower memories.
11 - 19
Review Questions
• What signals select program, data or I/O?
• How many wait states can the software wait-state generator assign?
• What size boundaries are software wait states assigned in program, data and I/O?
• What is the advantage of slower write cycles?
11 - 20
Lab 11 - Hardware Interface

[Block diagram: a TMS320C54x-40 with 16-bit ADDRESS and DATA buses and the PS, DS, IS, MSTRB, IOSTRB, R/W and MP/MC pins, connected to three devices:
    PROG: 8K EPROM, 70 ns (pins CS1, CS2, OE, A, D)
    DATA: 8K SRAM, 15 ns (pins CS1, CS2, OE, WE, A, D)
    I/O: 1 ADC & 1 DAC, 120 ns (pins CS1, CS2, OE, WE, A, D)]

SWWSR fields:  I/O | Hi Data | Low Data | Hi Prog | Low Prog

STM #________, SWWSR
DSP54x - Ports 12 - 1
Ports
Learning Objectives
12 - 2
Learning Objectives
• List the available types of ports
• Demonstrate how to initialize each port to a given status
• Demonstrate how to connect each port to external devices
• Write code to send & receive data to a given port
• Describe when & how a given port is best utilized
12 - 3
TMS32054x Ports
• Standard Serial Port (current section)
• Buffered Serial Port (BSP)
• TDM Serial Port
• Host Port Interface (HPI)
12 - 4
SP Pins and Signals

          Transmit    Receive
Clock     CLKX        CLKR
Frame     FSX         FSR
Data      DX          DR
12 - 5
Dual 54x Serial Port Interconnect

[Diagram: two 54x devices, # 1 and # 2, each with CLKX, FSX, DX, CLKR, FSR and DR pins; the transmit pins of each device are wired to the receive pins of the other.]
12 - 6
Serial Port Diagram

[Block diagram: the CPU reaches SPC, DRR and DXR over the Data Bus. DR shifts into RSR, which loads DRR; DXR loads XSR, which shifts out on DX. Control Logic uses FSR, CLKR, CLKX and FSX and generates RINT and XINT.]
12 - 7
Serial Transmit Timing Example

[Timing diagram signals: CLKX, FSX, DX, XINT]
DX:      D7 D6 D5 D4 D3 D2 D1 D0 E7
Events:  D → DXR,  D → XSR,  E → DXR,  E → XSR
12 - 8
Maximum Data Rate Example

[Timing diagram signals: CLKX, FSX, DX, XINT]
DX:      C0 D7 D6 D5 D4 D3 D2 D1 D0 E7 E6 E5 E4
Events:  D → XSR,  E → DXR,  E → XSR,  F → DXR
12 - 9
Frame Types - Burst/Continuous

[Timing diagram: frame sync and data words 1, 2, 3 in Continuous and Burst modes]

FSM = 1 : Burst
FSM = 0 : Continuous
12 - 10
Serial Port Control Register

BIT   NAME         Function               0 State     1 State     RS
0     N/A          Std / enhance mode     Std         Enh         0
1     DLB          Digital Loopback       Run         Test        0
2     FO           Format                 16 b.       8 b.        0
3     FSM          Frame Synch Mode       Cont.       Burst       0
4     MCM          Master Clock Mode      Ext’l       I.Rate/4    0
5     TXM          Transmit Mode          Follow      Lead        0
6     XRST-        Transmit Reset         Reset       Run         0
7     RRST-        Receive Reset          Reset       Run         0
8     IN0          Input value on CLKR    In.Val=0    In.Val=1    x
9     IN1          Input value on CLKX    In.Val=0    In.Val=1    x
10    RRDY         Rcv Ready = RINT       No Data     Ready       0
11    XRDY         Xmit Ready = XINT      DXR Full    Ready       1
12    XSREMPTY-    Xmit Shft Reg Empty    OK          Error       0
13    RSRFULL      Rcv Shft Reg Full      OK          Error       0
14    FREE         Free Run on Break      Halt        FreeRun     0
15    SOFT         Soft stop on Break     Hard        Soft        0
12 - 11
Serial Port Exercise
• Initialize the ‘541 XMIT port 0 as follows:
    • Frame synch is internal and burst mode
    • Word size is 16 bit
    • Run at the fastest possible speed.

SSBX INTM Disable Ints
ORM #______,IMR Enable Xmit Interrupt (XINT)
STM #______,SPC Halt SP & Conf SP Ctl reg
STM #______,IFR Clear any old Xmit ints (XINT)
ORM #______,SPC Start Xmit process
RSBX INTM Enable ints
12 - 12
Serial Port Exercise
• Initialize the ‘541 XMIT port 0 as follows:
    • Frame synch is internal and burst mode
    • Word size is 16 bit
    • Run at the fastest possible speed.

SSBX INTM Disable Ints
ORM #0020h ,IMR Enable Xmit Interrupt (XINT)
STM #00B8h ,SPC Halt SP & Conf SP Ctl reg (FO = 0 selects 16-bit words)
STM #0020h ,IFR Clear any old Xmit ints (XINT)
ORM #0040h ,SPC Start Xmit process
RSBX INTM Enable ints
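The SPC value for this exercise can be derived from the Serial Port Control Register table with a small Python sketch. The helper and dictionary names are hypothetical; the bit positions are the ones listed in the table. The settings assumed here are FO = 0 (16-bit words), FSM = 1 (burst), MCM = 1 (internal clock at I.Rate/4, the fastest internal rate), TXM = 1 (internal frame sync), RRST = 1 and XRST = 0 for the first, halting write.

```python
# Bit positions taken from the Serial Port Control Register table.
SPC_BITS = {"DLB": 1, "FO": 2, "FSM": 3, "MCM": 4, "TXM": 5, "XRST": 6, "RRST": 7}


def encode_spc(**fields: int) -> int:
    """Combine named single-bit SPC fields into a register value."""
    value = 0
    for name, bit_value in fields.items():
        if name not in SPC_BITS:
            raise KeyError(f"unknown SPC field: {name}")
        if bit_value not in (0, 1):
            raise ValueError("SPC fields modelled here are single bits")
        value |= bit_value << SPC_BITS[name]
    return value


# First write: configure the port while the transmitter is held in reset.
halted = encode_spc(FO=0, FSM=1, MCM=1, TXM=1, XRST=0, RRST=1)
# Second write: set XRST to take the transmitter out of reset.
running = halted | (1 << SPC_BITS["XRST"])

if __name__ == "__main__":
    print(f"{halted:04X}h then OR {1 << SPC_BITS['XRST']:04X}h -> {running:04X}h")
```

Splitting the setup into two writes follows the caveat that the port should be halted while its modes are set and released from reset only afterwards.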
12 - 13
Serial Port Caveats
• On reset DX=HiZ, all other pins are inputs
• When initializing SP, use two writes:
    • One to halt the SP & set desired modes
    • Second to take SP out of reset
• In Continuous mode XMIT stops if no new data is present
    • XMIT restarts when new data is written to DXR
    • FSX is asserted to indicate new packet initiated
• After last bit sent, DX = HiZ
    • Allows other devices to share bus
    • Should add pullup to avoid chatter
• For self-test (DLB)
    • Use TXM = 1
    • If MCM = 1, CLKX -> CLKR
    • If MCM = 0, CLKR -> CLKX
• For Low Power: clear MCM, XRST, RRST
12 - 14
TMS32054x Ports
• Standard Serial Port
• Buffered Serial Port (BSP) (current section)
• TDM Serial Port
• Host Port Interface (HPI)
12 - 15
Buffered Serial Port

[Block diagram: the SP (DR, DX, DRR, DXR) raises RINT and XINT to the CPU (RCV-ISR, XMT-ISR). The ABU (ARR, AXR, BKR, BKX; 11 bits each) sits between the SP and data memory on DBUS. DMEM spans 0 to FFFFh; the ABU buffers R-Ping, R-Pong, X-Ping and X-Pong lie between 800h and 1000h, with RINT and XINT raised at the buffer boundaries.]
12 - 16
Serial Port Control Expansion Register

BIT    NAME     Function                 0 State      1 State        RS
0-4    CLKDV    Clock Divisor            1 : 1        1 : n+1        3
5      FSP      Frame Sync Polarity      Active Hi    Active Lo      0
6      CLKP     Clk Polarity: XMIT on    Rising       Falling        0
7      FE       Format Extension         16/8         10/12          0
8      FIG      Frame Ignore             See 2nd      No 2nd         0
9      PCM      Pulse Code Mod’n         Normal       neg=HiZ        0
10     BXE      Buffer XMIT Enable       Std SP       BSP on         0
11     XH       XMIT Half                in 2nd       in 1st         0
12     HALTX    Halt XMIT                Continue     Halt on Int    0
13     BRE      Buffer RCV Enable        Std SP       BSP on         0
14     RH       RCV Half                 in 2nd       in 1st         0
15     HALTR    Halt RCV                 Continue     Halt on Int    0
12 - 17
Buffered Serial Port Exercise

Initialize the Serial Port to Transmit using:
    FSM = Burst         TXM = Ext’l         Polarities = 0
    Clock = Int’l       Rate = Max.         Format = 10 bit
    PCM = off           ABU(X) = on         Halt = off
    Rcv.Locn = 800h     Array size = 100h

____ ____               ; Disable interrupts
____ #____h , ____      ; Work on MMRs
____ #____h , ____      ; Enable XMIT Int (XINT)
____ #____h , ____      ; Config. SPC (XRST=0)
____ #____h , ____      ; Config. SPCE (ABU on)
____ #____h , ____      ; Init AXR to start of buffers
____ #____h , ____      ; Init buffer size
____ #____h , ____      ; Clear any old XINT
____ #____h , ____      ; Start SPI XMIT
____ ____               ; Enable interrupts
12 - 18
Buffered Serial Port Solution

Initialize the Serial Port to Transmit using:
    FSM = Burst         TXM = Ext’l         Polarities = 0
    Clock = Int’l       Rate = Max.         Format = 10 bit
    PCM = off           ABU(X) = on         Halt = off
    Rcv.Locn = 800h     Array size = 100h

SSBX INTM               ; Disable interrupts
LD   #0000h , DP        ; Work on MMRs
ORM  #0020h , IMR       ; Enable XMIT Int (XINT)
STM  #0098h , BSPC      ; Config. SPC (XRST=0)
STM  #0480h , BSPCE     ; Config. SPCE (ABU on)
STM  #0800h , AXR       ; Init AXR to start of buffers
STM  #0200h , BKX       ; Init buffer size
ORM  #0020h , IFR       ; Clear any old XINT
ORM  #0040h , BSPC      ; Start SPI XMIT
RSBX INTM               ; Enable interrupts
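The BSPCE constant in the solution can be cross-checked against the Serial Port Control Expansion Register table. The Python sketch below uses hypothetical names and the bit positions from that table; it covers only the fields the exercise touches plus CLKDV.

```python
# Bit positions from the Serial Port Control Expansion Register table.
BSPCE_BITS = {"FSP": 5, "CLKP": 6, "FE": 7, "FIG": 8, "PCM": 9,
              "BXE": 10, "XH": 11, "HALTX": 12, "BRE": 13, "RH": 14, "HALTR": 15}


def encode_bspce(clkdv: int = 0, **flags: int) -> int:
    """Build a BSPCE value from the 5-bit clock divisor and named single-bit flags."""
    if not 0 <= clkdv <= 0x1F:
        raise ValueError("CLKDV is a 5-bit field")
    value = clkdv
    for name, bit_value in flags.items():
        if name not in BSPCE_BITS:
            raise KeyError(f"unknown BSPCE field: {name}")
        if bit_value not in (0, 1):
            raise ValueError("flags are single bits")
        value |= bit_value << BSPCE_BITS[name]
    return value


if __name__ == "__main__":
    # Rate = Max. (CLKDV = 0), Format = 10 bit (FE = 1), ABU(X) = on (BXE = 1),
    # polarities, PCM and halt left at 0.
    print(f"{encode_bspce(clkdv=0, FE=1, BXE=1):04X}h")
```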
12 - 19
Interrupt Service Routine for BSP

Respond to RINT of Buffered Serial Port with iteration counter

Basic version:

RINT    LD   #0000h , DP        ; Work on MMRs
        BIT  1 , BSPCE          ; Extract RH: ping or pong?
        ...                     ; Do (at least) two other words...
        XC   2 , TC             ; If in ping, reset AR7 to top of ping
        STM  #0800h , AR7       ;   (runs only when the XC condition holds)
        ...                     ; Process current array . . .

With iteration counter:

cntr    .usect “SPRAM”, 1       ; Allocate one SPRAM
        .text
        ANDM #0000h , *(cntr)   ; Clear counter location
        ...                     ; Other code...

RINT    LD   #0000h , DP        ; Work on MMRs
        BIT  1 , BSPCE          ; Extract RH: ping or pong?
        XC   2 , TC             ; If in ping, reset AR7 to top of ping
        STM  #0800h , AR7       ;   (runs only when the XC condition holds)
        ADDM #0001h , cntr      ; Increment counter
        CMPM #count , cntr      ; Is counter at next to last iteration?
        XC   2 , TC             ; If so, tell ABU to
        ORM  #8000h , BSPCE     ;   stop after next RCV of next array
        ...                     ; Process current array . . .
12 - 20
ABU Caveats
• BKX, BKR range: min = 2, max = 2047 (not 2048)
• Buffers must be aligned on a 2^N > BK boundary
• Odd size arrays have ping = pong+1
• AXR, ARR, BKX, BKR are all 11-bit registers
• If AXR and ARR are not initialized to the base of the ‘ping’ arrays, the 1st data set will be incomplete
• Xmit & Rcv arrays can overlap to extend array size
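The alignment and ping/pong rules above can be checked with a few lines of Python. This is a sketch with hypothetical names; it reads "2^N > BK" as the smallest power of two strictly greater than the buffer size, which is an interpretation of the slide rather than a quoted rule.

```python
def abu_alignment(bk: int) -> int:
    """Smallest power of two strictly greater than BK (the required base alignment)."""
    if not 2 <= bk <= 2047:
        raise ValueError("BKX/BKR must be in the range 2..2047")
    return 1 << bk.bit_length()


def base_is_aligned(base: int, bk: int) -> bool:
    """True when a buffer base address sits on the required boundary."""
    return base % abu_alignment(bk) == 0


def ping_pong_sizes(bk: int) -> tuple[int, int]:
    """Split a buffer into (ping, pong); an odd size gives ping = pong + 1."""
    return (bk + 1) // 2, bk // 2


if __name__ == "__main__":
    # A 200h-word buffer based at 800h.
    print(hex(abu_alignment(0x200)), base_is_aligned(0x800, 0x200), ping_pong_sizes(0x200))
```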
12 - 21
Using ABU with Overlapping Arrays
• Initiate ABU to receive Array1 in ‘ping’
• RHALT should not be necessary
• On RINT begin to process Array1
• When finished initiate ABU to send Array1, wait for RINT
• On RINT begin to process Array2 in ‘pong’
• ABU RCV will return to ‘ping’, but will not pass ABU XMIT if rates are equal
• ABU Xmit may surpass CPU, so XHALT is recommended

[Diagram: three rows (Input, Process, Output) showing arrays 1 A, 2 B, 3 A, 4 B, 5 A, 6 B, ... moving through the pipeline, alternating between buffer halves A and B, with Output one stage behind Process and Process one stage behind Input.]
12 - 22
TMS32054x Ports
• Standard Serial Port
• Buffered Serial Port (BSP)
• TDM Serial Port (current section)
• Host Port Interface (HPI)
12 - 23
TDM Serial Port
• Eight or more devices may share the bus.
• Any ‘54x may own multiple slices and/or listener IDs.
• During its slice, a ‘54x may talk to any combination of listeners.
• May also be used as regular serial port.

[Diagram: TDM frame drawn as a ring of eight time slices, numbered 0 to 7.]
12 - 24
TDM Four-Wire Bus

[Diagram: TMS320C54x Device 0, Device 1, ... Device 7 share four bus lines. On each device TCLKX and TCLKR connect to TCLK, TFSX connects to TFRM, TDX and TDR connect to TDAT, and TFSR connects to TADD.]
12 - 25
TDM Signals

[Timing diagram: TCLK; TFRM; TDAT carrying bit 1 and bit 0 of slot 7 followed by bit 15, bit 14, bit 13, bit 12, ... of slot 0; TADD carrying a7, a6, a5, a4, ...]

• Any one ’C5x generates the clock and frame signals.
• Data to transmit and listener address are generated by the ’C5x which "owns" the current signal.
• All ’C5x’s capture address and data bits. TDM RCV interrupt is generated if routing list includes device’s ID.
TDM Port Registers

bit    TRCV       TDXR        TSPC        TCSR    TRTA    TRAD
15     receive    transmit    res         x       ta7     x
14     data       data        res         x       ta6     x
13     .          .           rsrfull     x       ta5     x2
12     .          .           xsrempty    x       ta4     x1
11     .          .           xrdy        x       ta3     x0
10     .          .           rrdy        x       ta2     s2
9      .          .           in1         x       ta1     s1
8      .          .           in0         x       ta0     s0
7      .          .           rrst        ch7     ra7     a7
6      .          .           xrst        ch6     ra6     a6
5      .          .           txm         ch5     ra5     a5
4      .          .           mcm         ch4     ra4     a4
3      .          .           fsm         ch3     ra3     a3
2      .          .           fo          ch2     ra2     a2
1      .          .           dlb         ch1     ra1     a1
0      .          .           tdm         ch0     ra0     a0

TCSR ch7-ch0:   my time(s) to talk
TRTA ta7-ta0:   routing list for my xmit data
TRTA ra7-ra0:   my listener ID
TRAD x2-x0:     time when last msg for me came in
TRAD s2-s0:     current time
TRAD a7-a0:     current routing list
Any ra# and a# both = 1 means “msg for me”
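The "msg for me" rule and the slot fields can be expressed as a short Python sketch. The function names are hypothetical, and the bit positions follow the register layout on this slide (ra7-ra0 in the low byte of TRTA, a7-a0 in the low byte of TRAD, s2-s0 in TRAD bits 10-8, ch7-ch0 in the low byte of TCSR).

```python
def message_for_me(trta: int, trad: int) -> bool:
    """True when any listener-ID bit (ra#) matches a set bit in the current routing list (a#)."""
    my_listener_id = trta & 0xFF      # ra7-ra0
    routing_list = trad & 0xFF        # a7-a0
    return (my_listener_id & routing_list) != 0


def current_slot(trad: int) -> int:
    """Current time slot, taken from s2-s0 (TRAD bits 10-8)."""
    return (trad >> 8) & 0x7


def my_turn_to_talk(tcsr: int, slot: int) -> bool:
    """True when the ch bit for this slot is set in TCSR."""
    if not 0 <= slot <= 7:
        raise ValueError("slot must be in the range 0..7")
    return ((tcsr >> slot) & 1) == 1


if __name__ == "__main__":
    trta = 0x0504     # transmit routing list 05h, own listener ID bit 2
    trad = 0x030C     # slot 3, routing list on the bus = 0Ch (bits 2 and 3)
    print(message_for_me(trta, trad), current_slot(trad), my_turn_to_talk(0x0008, 3))
```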
12 - 27
TMS32054x Ports
• Standard Serial Port
• Buffered Serial Port (BSP)
• TDM Serial Port
• Host Port Interface (HPI) (current section)
12 - 28
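HPI Concept

[Block diagram: the HOST connects to the HPI over 8 DATA lines plus CONTROL lines. The HPI contains the HPIC and HPIA registers and accesses a shared memory block addressed 000h to 800h on the HPI side. On the ’54x side the CPU sees MMRs (including HPIC) at 0000h, Bk 0 at 0800h, Bk 1 (BSP), and Bk 2 (HPI) between 1000h and 1800h. Other labels on the diagram: FFFh, 10.]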
12 - 29
HPI Control Signals

Pin       From    Function
HBIL      Host    0: 1st Byte     1: 2nd Byte
HCNTL0    Host    00 Control      01 Address
HCNTL1            11 Data         10 Data: ++W and R++
HRW-      Host    0: to DSP       1: From DSP. Use R/W- or A(n)
HDS1-     Host    Host Pins       HDS1     HDS2    HRW-
HDS2-             RD-, WE-        RD-      WE-     WE-
                  STRB-, R/W-     STRB-    VDD     R/W-
                  STRB, R/W-      STRB     gnd     R/W-
HAS-      Host    Host w Mux(A,D): ALE → HAS-, else VDD → ALE
HCS-      Host    Chip select: Use Device select or A(n)
HRDY-     DSP     to Host (Ready) if Host rate > DSP rate / 5
HINT-     DSP     to Host Int(n): Int from DSP
12 - 30
HPIC Register

Bits     15-8          7-4    3      2        1      0
Field    Copy of 7:0   0000   HINT   DSPINT   SMOD   BOB
                              Host   DSP      Host   Both

Bit  Name     Meaning
0    BOB      Byte Order Bit: 0 = MSByte 1st (Big Endian), 1 = LSByte 1st (Little Endian)
1    SMOD     Shared Mode: 0 = Host Only Mode (HOM), 1 = Shared Access Mode (SAM)
2    DSPINT   DSP Interrupt: 0 = No interrupt, 1 = DSP interrupted by host
3    HINT     Host Interrupt: DSP writes 1: HINT- driven low. Host writes 1: HINT cleared (ack)

Mode   Max Rate   Details
SAM    5 cycles   Asserted by DSP. 320 active only.
HOM    50 ns      Reset condition. When DSP is: active, idle 1 or 2, reset, no clock.
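The bit positions on this slide can be captured as masks. The following is a minimal host-side sketch in C, not TI code: the macro names and the two helper functions are hypothetical, and the mirroring of the low byte into bits 15-8 simply follows the "Copy of 7:0" field shown above.

```c
#include <stdint.h>

/* Bit positions taken from the HPIC register slide. */
#define HPIC_BOB     (1u << 0)  /* byte order bit */
#define HPIC_SMOD    (1u << 1)  /* 1 = shared access mode, 0 = host only mode */
#define HPIC_DSPINT  (1u << 2)  /* host writes 1 to interrupt the DSP */
#define HPIC_HINT    (1u << 3)  /* DSP writes 1 to drive HINT- low */

/* Hypothetical helper: build the 16-bit HPIC image a host would send.
   The slide shows bits 15-8 as a copy of bits 7-0, so the low byte is mirrored. */
static uint16_t hpic_image(uint8_t low_byte)
{
    return (uint16_t)(((uint16_t)low_byte << 8) | low_byte);
}

/* Hypothetical helper: nonzero when the image reports shared access mode. */
static int hpic_is_sam(uint16_t hpic)
{
    return (hpic & HPIC_SMOD) != 0;
}
```

A host routine that wants to interrupt the DSP would, under these assumptions, send `hpic_image(HPIC_DSPINT)` as two identical bytes, so the byte order setting does not matter for that access.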
12 - 31
HPI Process

INFO    0 1 B R  DD  HPIC  HPIA  DatL  1234  1235  1236  1237  1238
Ctrl.   0 0 0 0  00  0000  XXXX  XXXX  0102  0304  0506  0708  090A
        0 0 1 0  00  0000  XXXX  XXXX  0102  0304  0506  0708  090A
Addr.   0 1 0 0  12  0000  12XX  XXXX  0102  0304  0506  0708  090A
        0 1 1 0  34  0000  1234  0102  0102  0304  0506  0708  090A
W:D1.   1 1 0 0  AA  0000  1234  AA02  0102  0304  0506  0708  090A
        1 1 1 0  BB  0000  1234  AABB  AABB  0304  0506  0708  090A
W:+D2   1 0 0 0  CC  0000  1235  CCBB  AABB  0304  0506  0708  090A
        1 0 1 0  DD  0000  1235  CCDD  AABB  CCDD  0506  0708  090A
W:+D3   1 0 0 0  EE  0000  1236  EEDD  AABB  CCDD  0506  0708  090A
        1 0 1 0  FF  0000  1236  EEFF  AABB  CCDD  EEFF  0708  090A
Addr.   0 1 0 0  12  0000  1237  0708  AABB  CCDD  EEFF  0708  090A
        0 1 1 0  37  0000  1237  0708  AABB  CCDD  EEFF  0708  090A
R:D4+   1 0 0 1  07  0000  1237  0708  AABB  0304  0506  0708  090A
        1 0 1 1  08  0000  1237  0708  AABB  CCDD  0506  0708  090A
R:D5+   1 0 0 1  09  0000  1238  090A  AABB  CCDD  0506  0708  090A
        1 0 1 1  0A  0000  1238  090A  AABB  CCDD  EEFF  0708  090A

Columns: 0 = HCNTL0, 1 = HCNTL1, B = HBIL, R = HRW-, DD = byte on HD7-0, DatL = HPI data latch, 1234 ... 1238 = memory contents at those addresses.
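The access sequence on this slide can be followed with a small word-level model. This is only a sketch written for a host PC, under stated assumptions: the type `hpi_model` and all function names are hypothetical, the two byte transfers of each access are collapsed into one call, and only the five words shown on the slide (1234h to 1238h) are modeled. It reproduces the pre-increment on write and post-increment on read shown as "++W and R++".

```c
#include <assert.h>
#include <stdint.h>

enum { HPI_BASE = 0x1234, HPI_WORDS = 5 };

typedef struct {
    uint16_t hpia;             /* HPI address register */
    uint16_t latch;            /* data latch ("DatL" on the slide) */
    uint16_t mem[HPI_WORDS];   /* words at 1234h..1238h */
} hpi_model;

/* Map HPIA onto the small modeled window. */
static uint16_t *hpi_cell(hpi_model *h)
{
    uint16_t offset = (uint16_t)(h->hpia - HPI_BASE);
    assert(offset < HPI_WORDS);
    return &h->mem[offset];
}

/* Two address bytes, high byte first; the latch is preloaded from memory. */
static void hpi_set_address(hpi_model *h, uint8_t hi, uint8_t lo)
{
    h->hpia = (uint16_t)((hi << 8) | lo);
    h->latch = *hpi_cell(h);
}

/* Two data bytes; with auto-increment HPIA is incremented before the store. */
static void hpi_write_word(hpi_model *h, uint8_t hi, uint8_t lo, int autoinc)
{
    if (autoinc)
        h->hpia++;
    h->latch = (uint16_t)((hi << 8) | lo);
    *hpi_cell(h) = h->latch;
}

/* Return the latched word; with auto-increment HPIA advances afterwards
   and the latch is refilled from the new address. */
static uint16_t hpi_read_word(hpi_model *h, int autoinc)
{
    uint16_t value = h->latch;
    if (autoinc) {
        h->hpia++;
        h->latch = *hpi_cell(h);
    }
    return value;
}
```

Starting from memory contents 0102, 0304, 0506, 0708, 090A, the calls set address 1234h, write AABB without increment, then write CCDD and EEFF with auto-increment, which leaves the three words at 1234h, 1235h and 1236h as in the W: rows of the table.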
12 - 32
’C54x Host to ’C54x HPI
[Figure: connection diagram between a ’C54x host and the ’C54x HPI.
’C54x HPI pins: HD7-0, HCNTRL0, HCNTRL1, HBIL, HRW-, HAS-, HCS-, HDS1-, HDS2-, HRDY, HINT-.
’C54x host pins: D7-0, A2-0 (A2, A1, A0), R/W-, IS-, IOSTRB-, READY, INT1-, VCC.]
12 - 33
Motorola 68HC11F1 to ’54x HPI
[Figure: connection diagram between an MC68HC11F1 host and the ’C54x HPI.
’C54x HPI pins: HD7-0, HCNTRL0, HCNTRL1, HBIL, HRW-, HAS-, HCS-, HDS1-, HDS2-, HRDY, HINT-.
MC68HC11F1 pins: PC7-0, PF2-0 (PF2, PF1, PF0), R/W-, CSIO2, E, IRQ-, VCC, NC.]
12 - 34
Intel 80C51 to ’54x HPI
[Figure: connection diagram between an Intel 80C51BH host and the ’C54x HPI.
’C54x HPI pins: HD7-0, HCNTRL0, HCNTRL1, HBIL, HRW-, HCS-, HAS-, HDS1-, HDS2-, HRDY, HINT-.
Intel 80C51BH pins: P0.7:0.0, ALE, P3.7/RD-, P3.6/WR-, P3.2/INT0-, N/C, and P0.3, P0.2, P0.1, P0.0 with an HPI SELECT LOGIC block.]
Note: HCS- must be low when HDS1- or HDS2- is low. HCS- may be tied to ground or driven low by some HPI select logic.
DSP54x - System Considerations 13 - 1
System Considerations
Learning Objectives
13 - 2
Learning Objectives
Become familiar with system level design considerations for the C54x like:
• Boot loader
• Clock options
• Power management
• Program security
• JTAG emulation
• Memory interfacing
• Multiprocessor issues
Module 13
13 - 3
Boot Loader
The main function of the boot loader is to transfer user code from an external source to the program memory at power-up.
Depending on the C54x variant, the part can be booted from:
• 8 or 16 bit serial mode (SSP, BSP or TDM)
• 8 or 16 bit parallel I/O mode
• 8 or 16 bit parallel EPROM mode
• Warm boot mode
• HPI boot mode
13 - 4
Boot Sequence
If the MP/MC pin is sampled low during a hardware reset, execution begins at location 0FF80h of the on-chip ROM.
This location contains a branch instruction to the start of the boot loader program.
Unless specified otherwise, the on-chip ROM is factory programmed with the boot loader program.
13 - 5
Boot Loader Operation
The boot loader program sets up the CPU status registers before initiating the boot load ...
• Interrupts are globally disabled ( INTM = 1 )
• Internal DARAM is mapped in program / data space ( OVLY = 1 )
• 7 wait states are selected for the entire program, data and I/O spaces
• External memory bank size is set to 4K words
• 1 cycle is inserted when accesses switch between program and data space
The boot routine then reads the I/O port address 0FFFFh by driving the I/O strobe pin low.
The lower 8 bits of the word read from this port address specify the mode of transfer.
13 - 6
Boot Mode Selection
1. Begin, initialize.
2. Test INT2: HPI mode?
   Yes: begin execution at HPI RAM.
   No: read Boot Routine Selection (BRS) word from I/O address 0FFFFh.
3. BRS = ?
   Serial Boot Mode
   I/O Boot Mode
   Parallel Boot Mode
   Warm Boot Mode
13 - 7
HPI boot
The first step of the boot loader is to check if the Host Port Interface (HPI) boot option is selected.
In order to do that, HINT is asserted low. In HPI mode, this pin is normally tied to INT2.
If INT2 and HINT are tied together, INT2’s bit in the Interrupt Flag Register (IFR) will be set. The bootloader waits 20 CLOCKOUT cycles after asserting HINT and then reads IFR bit #2.
• If bit #2 is a 1, control is transferred to the start of HPI RAM. NOTE: HPI RAM must already be loaded by the host before bringing the C54x out of reset.
• If bit #2 is a 0, the boot routine skips the HPI mode.
[Figure: ’C54x with pin HINT asserted (0) and tied to INT2 (0); bit 2 of IFR set.]
13 - 8
HPI boot
Alternative methods:
If it’s inconvenient to tie INT2 and HINT together, the following methods will work.
Send a valid interrupt to the INT2 input pin within 30 CLOCKOUT cycles after the DSP fetches the reset vector.
or ...
Use the warm boot option described later in this section. This method is preferred.
13 - 9
Serial boot
At address 0FFFFh:

Bits    15-8        7-4     3-0
        XXXXXXXX    XXkm    0n00

k = 0, standard serial port
k = 1, TDM serial port
n = 0, 8 bit
n = 1, 16 bit
m = 0, CLKX, FSX output
m = 1, CLKX, FSX input

The ’541 serial boot option can use either the buffered serial port (BSP) or the time-division multiplexed (TDM) serial port in standard mode during booting.
13 - 10
Serial boot process
1. Configure SPC register to put SP in reset and then pull SP out of reset. Configure BSPCE register.
2. Read DA from SP.
3. Read code length from SP.
4. Read code word from SP and save the code word in DM.
5. Transfer data from DM into PM and increment PM.
6. Decrement code length.
7. Code length = 0? No: go back to step 4. Yes: branch to DA and start executing code.
Note: 8 bit read is high byte then low byte.
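The flowchart on this slide can be walked through in C on a host PC. This is a hedged sketch, not the ROM boot loader: the function names, the `PM_WORDS` array size and the idea of passing the received words as an array are assumptions made for illustration, and DA is read here as the destination address. A `while` loop is used so that a code length of zero copies nothing, which the flowchart itself does not address.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PM_WORDS 0x10000u   /* hypothetical 64K-word program memory image */

/* Combine two bytes received in 8-bit mode: high byte first, then low byte. */
static uint16_t word_from_bytes(uint8_t first, uint8_t second)
{
    return (uint16_t)((first << 8) | second);
}

/* Walk the flowchart over an already-received word stream:
   stream[0] = DA, stream[1] = code length, stream[2..] = code words.
   pm must hold PM_WORDS entries. Returns DA, the branch target. */
static uint16_t serial_boot(const uint16_t *stream, size_t count, uint16_t *pm)
{
    assert(count >= 2);
    uint16_t da = stream[0];
    uint16_t length = stream[1];
    assert(count >= 2u + length);

    uint16_t dst = da;
    size_t next = 2;
    while (length != 0) {
        uint16_t code_word = stream[next++];  /* read code word, hold it in "DM" */
        pm[dst++] = code_word;                /* DM -> PM, increment PM */
        length--;                             /* decrement code length */
    }
    return da;                                /* "Branch to DA" */
}
```

For example, the stream 0080h, 3, 1111h, 2222h, 3333h places three words at 0080h to 0082h and returns 0080h as the address to branch to.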
13 - 11
I/O Parallel Boot
At address 0FFFFh:

               Bits 15-8    7-4     3-0
8 bit mode     XXXXXXXX     XXX1    1000
16 bit mode    XXXXXXXX     XXXX    1100

Most common use of this mode is to boot from a slow microprocessor.
13 - 12
EPROM (Parallel) boot
At address 0FFFFh:

Bits    15-8        7-2    1-0
        XXXXXXXX    SRC    AA

AA = 01, 8 bit mode
AA = 10, 16 bit mode
SRC = 6 bit page address
13 - 13
Warm Boot
At address 0FFFFh:

Bits    15-8        7-2     1-0
        XXXXXXXX    ADDR    11

ADDR = 6 bit page address
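The BRS layouts on the serial, I/O, EPROM and warm boot slides can be combined into one decoder. The following C sketch only illustrates the bit tests implied by those layouts; it is not the ROM code. The type and function names are hypothetical, and treating bits 7-4 as don't-care for the I/O modes is an assumption (the 8 bit I/O layout above shows bit 4 as 1).

```c
#include <stdint.h>

typedef enum {
    BOOT_SERIAL, BOOT_IO, BOOT_PARALLEL, BOOT_WARM, BOOT_UNKNOWN
} boot_mode;

typedef struct {
    boot_mode mode;
    int       bits;      /* 8 or 16; 0 when not applicable */
    int       tdm;       /* serial only: k bit */
    int       ext_clk;   /* serial only: m bit (CLKX, FSX are inputs) */
    unsigned  page;      /* parallel/warm only: 6-bit SRC or ADDR field */
} boot_select;

static boot_select decode_brs(uint16_t brs)
{
    boot_select s = { BOOT_UNKNOWN, 0, 0, 0, 0 };
    unsigned low = brs & 0xFFu;   /* only the lower 8 bits select the mode */

    switch (low & 0x3u) {
    case 0x1: s.mode = BOOT_PARALLEL; s.bits = 8;  s.page = low >> 2; break;
    case 0x2: s.mode = BOOT_PARALLEL; s.bits = 16; s.page = low >> 2; break;
    case 0x3: s.mode = BOOT_WARM;     s.page = low >> 2; break;
    default:
        if (low & 0x8u) {          /* bit 3 set: I/O boot, bit 2 picks the width */
            s.mode = BOOT_IO;
            s.bits = (low & 0x4u) ? 16 : 8;
        } else {                   /* bit 3 clear: serial boot, X X k m 0 n 0 0 */
            s.mode = BOOT_SERIAL;
            s.bits = (low & 0x4u) ? 16 : 8;
            s.tdm = (int)((low >> 5) & 1u);
            s.ext_clk = (int)((low >> 4) & 1u);
        }
        break;
    }
    return s;
}
```

With this decoding, a BRS value of 0024h selects a 16 bit serial boot from the TDM port with CLKX and FSX as outputs, and 00FFh selects a warm boot with ADDR = 3Fh.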
13 - 14
Clock Options
The Phase Locked Loop (PLL) mode is determined at start-up by the input states on three pins: CLKMD1, CLKMD2, CLKMD3.
These pins should not be reconfigured during normal operation.

PLL options for ’541, ’2, ’3, ’4, ’5 and ’6

CLKMD1  CLKMD2  CLKMD3  Option 1+                  Option 2+
0       0       0       PLL x 3, ext. osc.         PLL x 5, ext. osc.
1       1       0       PLL x 2, ext. osc.         PLL x 4, ext. osc.
1       0       0       PLL x 3, int. osc.         PLL x 5, int. osc.
0       1       0       PLL x 1.5, ext. osc.       PLL x 4.5, ext. osc.
0       0       1       Divide by 2, ext. osc.     Divide by 2, ext. osc.
0       1       1       Stop mode*                 Stop mode*
1       0       1       PLL x 1, ext. osc.         PLL x 1, ext. osc.
1       1       1       Divide by 2, int. osc.     Divide by 2, int. osc.

+ You can select your device with either option 1 or 2, but not both.
* PLL is disabled. System clock is not provided to CPU / peripherals.
13 - 15
Clock Options - ’548

CLKMD1  CLKMD2  CLKMD3  Clock mode / CLKMD value on reset
0       0       0       1/2 with ext. source / CLKMD = 0000h
1       1       0       1/2 with ext. source / CLKMD = 6000h
1       0       0       1/2 with ext. source / CLKMD = 4000h
0       1       0       1/2 with ext. source / CLKMD = 2000h
0       0       1       1/2 with ext. source / CLKMD = 1000h
0       1       1       Stop mode / CLKMD = na
1       0       1       PLL*1 with ext. source / CLKMD = 0007h
1       1       1       1/2 with ext. source / CLKMD = 7000h

Check this and add more
13 - 16
PLL Lockup Time
Since it is an analog system, the PLL requires a lockup time before it is stable.
13 - 17
Power Management
The current consumption of a DSP can vary depending on many factors including:
• the instructions being executed
• whether the external pins are exercised or not
• temperature
• supply voltage
• capacitance of the external traces

TMS320LC548, CLKOUT = 40 MHz, Vcc = 3.0 V, room temperature, PLL x1 clock mode, internal consumption only:

IDLE1, IDLE2, IDLE3:       0.55 mA, 0.3 mA, 0.4 mA
Repeat NOP, Inline NOP:    5.6 mA, 22 mA
Repeat MAC, Inline MAC:    16 - 44 mA, 20 - 52 mA
13 - 18
Power Management Hints
Some design techniques for minimizing power consumption ...
• Minimize external trace lengths and their associated capacitance
• Set Address Visibility (AVIS) = 0
• When not being used, make sure the timer and serial ports are in reset and MCM = 0
• Assure all input pins are grounded or pulled high
• Set SWWR to 0 wait states when possible
• Use circular addressing instead of DMOV’s
• Use internal instead of external memory accesses
• Minimize the clock frequency to match the task required
• Implement power down modes where possible
13 - 19
Power Management - IDLE
Idle mode is entered by executing the IDLE instruction:
    IDLE K    ( where K = 1, 2 or 3 )
The device will stay in this mode until it is interrupted.

IDLE1
    All CPU activities stopped
    Peripherals active
    Wake on: Reset, peripheral interrupts and external interrupts

IDLE2
    All CPU activities stopped
    Peripherals inactive
    CLKOUT inactive
    Wake on: Reset and external interrupts*

IDLE3 !
    All CPU activities stopped
    Peripherals inactive
    CLKOUT inactive
    PLL halted +
    Wake on: Reset and external interrupts*

* Ints are not latched in idle mode, they must be low for 5 cycles to be acked
+ PLL will require a transitory locking time of 50 µs for restart
! IDLE3 mode on the 545A, 546A and 548 has additional features. See Technical Reference.
13 - 20
Power Management - HOLD
Power-down mode can also be initiated by the HOLD signal.

When HOLD initiates power-down and ...

HM (in ST1) = 1    The CPU stops executing and address, data and control lines go into high impedance state. All peripherals remain active.

HM (in ST1) = 0    The address, data and control lines go into high impedance state. All peripherals remain active. The CPU continues to execute internally until an external access occurs, at which point the processor will halt.

Power-down mode is terminated when HOLD becomes inactive.
13 - 21
Power Management - CLKOUT
All C54x devices can disable the internal clock of external interfaces using CLKOUT, which will place the interface into a lower power consumption mode.

BSCR(0) = 0    CLKOUT pin enabled*
BSCR(0) = 1    CLKOUT pin disabled
PMST(2) = 0    CLKOUT pin enabled*
PMST(2) = 1    CLKOUT pin disabled

* Condition at Reset
13 - 22
Program Security
On-chip ROM security
ROM / RAM security
13 - 23
JTAG Emulation
JTAG .. Joint Test Action Group (IEEE 1149.1 JTAG Test Bus)
The JTAG port on the ’C54x allows:
• Boundary Scan
• Emulation
through a 14 pin Test / Emulation header.
[Figure: ’C54x block diagram showing the JTAG Control Block, the internal scan chain (registers and state machines), the Analysis Block, Boundary Scan and PINS.]
[Figure: wiring between the ’C54x and the header. Signals: EMU0, EMU1, TRST, TMS, TDI, TDO, TCK, TCK_RET, Vcc, PD, GND. Labels shown: 6 inches, 4.7K or more.]
Header to device lengths greater than 6 inches require extra circuitry and attention to noise.
13 - 24
Multiprocessor Issues
Major how-to’s, signals involved, etc.
13 - 25
’LC548 ROM Features
The on-chip ROM of the ’LC548 is 2K words in length and is mapped from 0F800h to 0FFFFh if the MP/MC pin is low.

Program space
0x0000    External program space
0xF800    Boot loader
0xFC00    µ-law table
0xFD00    A-law table
0xFE00    Sine lookup table
0xFF00    Built-in self test
0xFF80    Vector table
13 - 26
Run=, Load=
linker protocol
Load = eprom, Run = SRAM
problems with linker linking symbols
assem lang user guide: .label
see sheet1
this will be the lab topic for the module
13 - 27
549 features
two voltages ... 2.5 V core, 3.3 V external
power up issues
13 - 28
Level Shifting
3.3 V - 5 V
13 - 29
Power Supply power up sequence
2.5 V core, 3.3 V I/O
3.3 V core, 5 V I/O
DSP54x - Using the C Compiler 14 - 1
Using the C Compiler
Learning Objectives
14 - 2
Learning Objectives
• Invoke the compiler or shell program
  - Options and Switches
  - The RTS library
  - The Optimizer
• Write code in C
  - Numerical Types supported
  - Accessing MMRs and IO Ports
  - Inlining C and ASM functions
  - Interrupt service routines
  - Optimization tips
• Use the C support files:
  - C.CMD: Linker file issues when using C
  - BOOT.ASM: Pre-main initialization process
• Intermix assembly files within the C environment
  - Stack Model
  - Register Usage
  - Argument passing and result return
Module 14
14 - 3
Compiler Tool Flow
FILE.C
  -> C Compiler CC500: Parser, Optimizer (-o), Code Generator
  -> FILE.ASM
  -> Assembler: ASM500
  -> FILE.OBJ
  -> Linker: LNK500 (-z)
  -> FILE.OUT
Shell Program: CL500
Invoking the Shell:
    CL500 x.c y.asm -z c.cmd
14 - 4
Common Compiler Options

Switch   Description
-g       Global: symbols for debugging
-s       Source: C interlist in ASM file
-al      Assembler: List file request
-as      Assembler: global Symbols
-ms      Model Size - optimize for size
-mn      Model Normal - full opt. despite -g
-o0      Optimize register use
-o1      Opt. -o0 + local opt.
-o2      Opt. -o1 + global opt.
-o3      Opt. -o2 + file opt.
-oe      Eliminate dead code
-x       Enable Inlining and -o3
-z       LNK500 invoked (link options follow)
14 - 5
Compiler Switch Issues
Optimizer should be invoked incrementally:
    CL500 -g test -z c.cmd             Symbols kept for debug
    CL500 -g -o3 test -z c.cmd         Add optimizer, keep symbols
    CL500 -g -o3 -mn test -z c.cmd     Full optimize, some symbols
    CL500 -o3 test -z c.cmd            Final rev: optimize, no symbols
Preferred switches can be selected in several ways:
    On command line:             As above.
    With batch file, e.g.:       CL500 -g -o3 -mn %1 %2 -z c.cmd
    Via environment variable:    SET C_OPTION=-g -o3 -mn
14 - 6
Lab 14-a : Invoking the Compiler
1. Inspect a C file that performs the sine routine
2. Compile the file using CL500
3. Observe the resultant .ASM file
4. Load the .OUT file to the simulator
5. Run the program and
   a. Verify correct results obtained
   b. Benchmark cycles for sine routine
   c. Note lines of code required for sine routine
6. Recompile with optimizer (-o). Repeat steps a - c
7. Compare the results of steps 5 and 6
14 - 7
Writing Code in C
• Write code in C
  - Numerical Types supported
  - Accessing MMRs and IO Ports
  - Inlining C and ASM functions
  - Interrupt service routines
  - Optimization tips
14 - 8
Inline Assembly
• Allows direct access to assembly language from C
• Useful for operating on components not used by C, ex:
    asm("label RSBX INTM");
• Note: first column after leading quote is label field
• Avoid modifying components used by C (especially with -o)
• Long operations should be written in ASM and called from C
  - main C file retains portability
  - yields more easily maintained structures
  - eliminates risk of interfering with registers in use by C
14 - 9
Accessing MMRs from C
• Using pointers to access Memory-Mapped Registers:
  - Create a pointer and set its value to the assigned memory address:
        volatile unsigned int *SPC_REG = (volatile unsigned int *) 0x0022;
  - Read and write to the register as any other pointer:
        *SPC_REG = 0xC8;
• Volatile modifier:
  - Especially important with optimizer (-o)
  - Tells compiler to always recheck actual memory whenever encountered
  - Otherwise, optimizer might register-base value, or eliminate construct
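The pointer technique on this slide extends naturally to read-modify-write helpers. The sketch below is an illustration under assumptions rather than workshop code: the helper names are hypothetical, and `uint16_t` stands in for the 16-bit `unsigned int` of the ’54x so the code can also be tried on a host PC by passing the address of an ordinary variable instead of a register address such as 0x0022.

```c
#include <stdint.h>

/* Set bits in a memory-mapped register through a volatile pointer.
   Because the pointer is volatile, the compiler performs one real read
   and one real write, even with the optimizer enabled. */
static void reg_set_bits(volatile uint16_t *reg, uint16_t mask)
{
    *reg = (uint16_t)(*reg | mask);
}

/* Clear bits the same way. */
static void reg_clear_bits(volatile uint16_t *reg, uint16_t mask)
{
    *reg = (uint16_t)(*reg & (uint16_t)~mask);
}
```

On the target, a call such as `reg_set_bits(SPC_REG, mask)` would use the pointer declared on the slide, assuming the pointer type is adapted to match.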
14 - 10
Accessing I/O Ports from C
Accessing I/O Ports from C:
1. create the port:
       ioport type portHEXNO
       ioport unsigned port8000;
2. access the port:
       x = port8000;
       port8000 = y;

Accessing I/O Ports from ASM:
    Label   PORTR 8000h,x
            PORTW y,8000h

Accessing I/O Ports from Simulator:
    ma 0x8000,2,1,ioport
    mc 0x8000,2,1,out.dat,W
    mc 0x8000,2,1,in.dat,R
14 - 11
Interrupts in C
• Interrupt Service Routine
  - C function to run when interrupt occurs
  - All necessary context save/restore performed automatically
• Interrupt Initialization Code
  - Should be called prior to run-time process
  - Interrupt status may be modified during run-time
• Interrupt Vector Table
  - Written in ASM
14 - 12
Writing ISRs in C

    int x[100];
    int *p = x;

    main() { … }

    interrupt void name(void)
    {
        static int y = 0;
        y += 1;
        if (y < 100)
            *p++ = port0001;
        else
            asm(" intr 17 ");
    }

• Global variables allow sharing of data between main functions & ISR
• interrupt: Keyword
• name: Name of ISR function
• Void input and return values
• Locals are lost across calls. Statics persist across calls
• ISRs should not include calls
• Return is with enable (RETE)
• Avoid -e or -oe options
14 - 13
Initializing Interrupts in C
Setup pointers to IMR & IFR. Initialize IMR, IFR, INTM:
    volatile unsigned int *IMR = (volatile unsigned int *) 0x0000;
    volatile unsigned int *IFR = (volatile unsigned int *) 0x0001;
    *IFR = 0xFFFF;
    *IMR = 0xFFFF;
    asm(" RSBX INTM ");

Create Vector Table:
    .sect ".vectors"
    …
    B ISR1
    nop
    nop
    …

Compiled ISR Sequence:
• I$$SAVE performs context save (from RTS.LIB)
• ISR function runs
• I$$RESTORE performs context restore (RTS.LIB)
• RETE - Return with Enable
14 - 14
Numerical Types in C
      xxxx xxxx xxxx xxxx                        16-bit int
    * yyyy yyyy yyyy yyyy                        16-bit int
      zzzz zzzz zzzz zzzz zzzz zzzz zzzz zzzz    32-bit product

    z = x * y;                            z (Q0)    zzzz zzzz zzzz zzzz
    z = ((long)(x) * (long)(y)) >> 15;    z (Q15)   zzz zzzz zzzz zzzz z

• short, char, etc. all occupy full 16-bit memories
• no byte-addressing/packing on '54x
• float operations supported via rts.lib
• float math is multicycle
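The two expressions above can be compared on a host with fixed-width types. This is a small sketch under assumptions: `int16_t`/`int32_t` stand in for the '54x 16-bit `int` and 32-bit `long`, the function names `mul_q0` and `mul_q15` are illustrative, and right-shifting a negative value is assumed to be an arithmetic shift (true on common compilers, but implementation-defined in C).

```c
#include <stdint.h>

/* Q0: plain integer multiply; only the low 16 bits of the product are kept.
 * The narrowing conversion is implementation-defined if the product
 * does not fit in 16 bits. */
int16_t mul_q0(int16_t x, int16_t y)
{
    return (int16_t)((int32_t)x * (int32_t)y);
}

/* Q15: widen both operands, multiply to a Q30 product, then shift
 * right by 15 to return to Q15. */
int16_t mul_q15(int16_t x, int16_t y)
{
    int32_t product = (int32_t)x * (int32_t)y;
    return (int16_t)(product >> 15);
}
```

For example, 0.5 in Q15 is 16384, and `mul_q15(16384, 16384)` gives 8192, which is 0.25 in Q15. The one case that does not fit is -1.0 times -1.0 (both operands -32768), which a production routine would need to saturate.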
14 - 15
The Optimizer
• '54x Specific Optimizations
• General Optimizations
• Data-flow Optimizations
• Branch & Control-flow Optimizations
• Loop Optimizations
14 - 16
'54x Specific Optimizations
• Cost-based register allocation        ARn, A, B
• Auto-increment                        *ARn+
• Block repeat                          RPTB
• Delayed Branch, Call, and Return      BD, CALLD, RETD
14 - 17
General Optimizations
• Algebraic re-ordering
    example : (a+b) - (c+d)     = 6 cycles
    becomes : (((a+b)-c)-d)     = 4 cycles
• Constant folding
    example : a = (b+4) - (c+1)
    becomes : a = b - c + 3
• Symbolic simplification
• Alias Disambiguation
    When only one pointer accesses a given memory array, compiler may allow registers to hold values
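The first two rewrites can be checked by hand in C. The sketch below simply writes each "example" and "becomes" form as a separate function so the results can be compared; the function names are illustrative, and the cycle counts quoted on the slide are not reproduced by host code.

```c
/* Algebraic re-ordering: same value, fewer temporaries. */
int reorder_before(int a, int b, int c, int d)
{
    return (a + b) - (c + d);
}

int reorder_after(int a, int b, int c, int d)
{
    return (((a + b) - c) - d);
}

/* Constant folding: the constants 4 and 1 are combined at compile time. */
int fold_before(int b, int c)
{
    return (b + 4) - (c + 1);
}

int fold_after(int b, int c)
{
    return b - c + 3;
}
```

For small operands both forms of each pair return the same value, for instance `fold_before(10, 2)` and `fold_after(10, 2)` are both 11.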
14 - 18
Data-flow Optimizations
• Copy propagation
    Following assignment to a variable, references to the variable are replaced with the value
• Common sub-expression elimination
    If two (or more) equations perform the same sub-action, the value is saved after the first and recalled later
• Redundant Assignment Elimination
    Drop assignments not used in later equations

    example (int j)
    {
        int a = 3;                  /* 3 assigned to a & propagated down; a elim'd */
        int b = (j*a) + (j*2);      /* becomes (j*5) */
        int c = (j<<a);             /* dead var: replaced with expression */
        int d = (j>>3) + (j<<b);    /* assignment unused - eliminated */
        call (a,b,c);
    }
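Applying the three data-flow steps to the slide's example by hand gives the following sketch. It is an illustration of what the optimizer is described as doing, not actual compiler output; `call()` is a hypothetical stub that records its arguments so the two versions can be compared.

```c
/* Hypothetical stub: remembers the most recent arguments. */
static int last_args[3];

static void call(int a, int b, int c)
{
    last_args[0] = a;
    last_args[1] = b;
    last_args[2] = c;
}

/* As written on the slide. Only safe for small j on a host,
 * because the shift count b grows with j. */
void example_original(int j)
{
    int a = 3;
    int b = (j * a) + (j * 2);
    int c = (j << a);
    int d = (j >> 3) + (j << b);   /* never used afterwards */
    (void)d;
    call(a, b, c);
}

/* After copy propagation, common sub-expression handling and
 * redundant assignment elimination, applied by hand. */
void example_optimized(int j)
{
    call(3, j * 5, j << 3);
}
```

Both versions pass the same three values to `call()`; the local variables exist only in the source.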
14 - 19
Branch & Control-flow Optimizations
• Rearrange code to remove branches or redundancies
• Unreached code is deleted
• Branch to branch is bypassed
• Conditional branch over unconditional branch becomes single conditional branch 'not'
• Conditional branches whose conditions are resolved at compile-time are replaced with unconditional branches
14 - 20
Loop Optimizations
• Loop induction variables "LIVs" are, for example, the "i" in 'for i=…'
• Process of making LIV op's more efficient is called strength reduction, eg:
    for (i=1;i<100;i++)
        y+=x[i];
  becomes
    y+=*x++             using *ARn+
    counters -->        BANZ or RPTB
• Often loop control variable is removed entirely - debug issue
• Other loop optimizations:
  - Loop Rotation: Evaluate loop condition at end vs. beginning
  - Loop Invariant Code Motion: Move static equations out of loop & reference result only
  - Inline Expansion of RTS Library Functions: small functions are inlined, not called. Size is user specifiable, default = 10 lines
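Strength reduction on the slide's loop can be written out by hand as below. This is a sketch of the source-level equivalent only; the function names are illustrative, and the mapping to `*ARn+` and `BANZ`/`RPTB` happens in the generated assembly, which host C does not show.

```c
/* Indexed form: each iteration computes the address x + i. */
int sum_indexed(const int *x)
{
    int y = 0;
    for (int i = 1; i < 100; i++)
        y += x[i];
    return y;
}

/* Strength-reduced form: the index is replaced by a walking pointer
 * and a down-counter, which suits auto-increment addressing. */
int sum_pointer(const int *x)
{
    int y = 0;
    const int *p = x + 1;           /* loop starts at element 1 */
    for (int n = 99; n > 0; n--)    /* 99 iterations: i = 1..99 */
        y += *p++;
    return y;
}
```

In the second form the variable `i` no longer exists, which is the debugging issue the slide mentions.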
14 - 21
Inlining C Functions
Code must be present in file for inlining :
    Put code in file
    Put source in header : # include
    Library of dual function types
Benefit - Faster:
    no branch
    no return
    no clear of parent fn
    no setup of sub-fn
    merging of fn's with optimizer
[Diagram: "Call of Fn" (each call fn branches to a single Fn ... ret) vs. "Inline Fn" (each call site replaced by inline fn)]
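A short sketch of the "put source in header" approach follows. It uses the standard C `static inline` form rather than any TI-specific option, and the helper `saturate16()` and its caller `add_sat()` are hypothetical examples, not functions from the workshop.

```c
/* Would normally sit in a header so that every file that includes it
 * sees the function body and can expand it in place. */
static inline int saturate16(long v)
{
    if (v > 32767L)
        return 32767;
    if (v < -32768L)
        return -32768;
    return (int)v;
}

/* Caller: with inlining enabled there is no call/return pair here,
 * and the optimizer can merge the helper with the surrounding code. */
int add_sat(int a, int b)
{
    return saturate16((long)a + (long)b);
}
```

A small, frequently called helper such as this is the typical candidate; large functions inlined at many call sites mainly cost code space.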
14 - 22
Optimization Steps
• Optimize : Use -o, -mn when compiling
  - Use #define instead of variables for parameters
  - Globals may be faster than locals
  - Minimize mixing signed & unsigned integers
• Inline short/key functions : compile with -x
  - Declare function as inline
  - Automatically invoked for short routines within file
  - Inlines can be passed between files via header
• Give compiler project visibility
  - #include sub-files within main
  - Optimizer will operate over all files allowing better inlining, register tracking, etc.
• Tune memory map via C.CMD
• Re-write key code segments in assembly
  - Bulletin Board, App notes, 3rd Parties
  - S/W Cooperative, Hand written
14 - 23
Optimization Process
[Flowchart]
1. Write & debug in C, benchmark
     Real-time goal met ?   Y: Done   N: continue
2. Perform C & C.CMD Optimizations
     Real-time goal met ?   Y: Done   N: continue
3. Profile. Convert Key Functions to ASM
     Real-time goal met ?   Y: Done   N: continue
14 - 24
Lab 14-b : Writing Code in C
14 - 25
C Support Files
• Invoke the compiler or shell program
  - Options and Switches
  - The RTS library
  - The Optimizer
• Write code in C
  - Numerical Types supported
  - Accessing MMRs and IO Ports
  - Inlining C and ASM functions
  - Interrupt service routines
  - Optimization tips
• Use the C support files :
  - C.CMD : Linker file issues when using C
  - BOOT.ASM : Pre-main initialization process
• Intermix assembly files within the C environment
  - Stack Model
  - Register Usage
  - Argument passing and result return
14 - 26
Components of C.CMD
    file1.obj           Files : list here or pass via shell
    vectors.obj         Must be written in asm, listed here
    -c                  Boot.asm is included
    -o test.out         Output file name
    -m test.map         Map file name
    -i c:\filepath      Paths to search
    -l rts.lib          Libraries to search - last on list
    -stack 400h         Override stack size
    -heap 200h          Override heap size

    MEMORY
    {
        P or D, RAM or ROM, F, M or S       (Pgm, Data, Fast, Med, Slow)
    }

    SECTIONS
    {
        .vectors :>     P ROM M     Vector table
        .text    :>     P ROM F     Code
        .cinit   :>     P ROM S     Init table for global/statics
        .const   :>     D ROM M     Constants - several options here
        .switch  :>     P ROM M     Case statement arrays
        .bss     :>     D RAM M     Globals and statics
        .stack   :>     D RAM F     Stack allocation
        .sysmem  :>     D RAM M     Heap allocation
    }
14 - 27
Options for Handling .const
• Put a ROM in data memory for .const.
    + True constant
    - Extra cost
• Link .const to a ROM whose CS- is an AND of PS- & DS-
    + Lower cost, true constants
    - Reduces total memory space, extra gate
• Use: LOAD=PgmRom,RUN=DataRam, and write a routine to copy Rom to Ram on reset.
    + Low cost
    - Extra design effort, not true constants
• Use a host processor to init. constants to data ram on reset
    + No extra cost if there is already a host and I/F
    - Not true constants, extra design effort
• Use initialized globals instead of constants and link with "-c" to auto-initialize pgm ROM to data RAM
    + Way to autoinit, good use of memory space
    - RTS.LIB fns may not apply, not "true" constants
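The third option (separate load and run addresses plus a copy routine) can be modelled on a host as follows. This is a sketch under assumptions: two ordinary arrays stand in for the program-ROM load image and the data-RAM run copy, and every name here (`const_load_image`, `const_run_copy`, `copy_consts_on_reset`, `consts`) is hypothetical. On the target, the two addresses would come from the linker rather than from C arrays.

```c
#include <string.h>

#define CONST_LEN 4

/* Stands in for the .const image placed in program ROM (load address). */
static const int const_load_image[CONST_LEN] = { 10, 20, 30, 40 };

/* Stands in for the data-RAM region the program reads from (run address). */
static int const_run_copy[CONST_LEN];

/* Called once after reset, before any code reads the constants. */
void copy_consts_on_reset(void)
{
    memcpy(const_run_copy, const_load_image, sizeof const_load_image);
}

/* Read-only view of the run-time copy. */
const int *consts(void)
{
    return const_run_copy;
}
```

The copy lives in RAM, so nothing stops a stray write from changing it later, which is why the slide calls these "not true constants".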
14 - 28
Global and Static Variable Initialization
• Global and Static variables (G/S vars) are linked under .bss
• G/S vars with no explicit init val are assumed 0 by ANSI
• Compiler does not support the assumed 0 init value
• Solutions:
  - Add an ASM routine pre-main:
        STM   #.bss,AR7
        RPTZ  A,#len
        STL   A,*AR7+
  - Initialize all G/S vars to 0 explicitly
  - Link with a specified Initial value, eg:
        .bss:>DatRam, fill=0
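The pre-main clearing routine amounts to writing zero to every word of .bss. A minimal host-side model is sketched below; `zero_words()` is an illustrative name, and the start address and length are passed in as parameters because, on the target, they would be supplied by the linker.

```c
#include <stddef.h>

/* Clears `len` words starting at `start`, one word per store with a
 * post-incremented pointer, in the spirit of a repeated STL A,*AR7+. */
void zero_words(unsigned *start, size_t len)
{
    while (len-- > 0)
        *start++ = 0;
}
```

Explicit initializers in the C source (`int count = 0;`) achieve the same result without extra start-up code, at the cost of a larger initialization table.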
14 - 29
BOOT.ASM - invoked with "-c"
Reset : PC <- FF80
            .ref  _c_int00
    FF80:   B     _c_int00
            nop
            nop
_c_int00
    1. Allocate stack
    2. Init SP to end of stack
    3. Initialize status bits (esp. CPL)
    4. Copy .cinit to .bss (skip if "-cr")
    5. Call "_main"
_main ...
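Step 4 can be pictured with a small host model. The record layout used here is an assumption for illustration and is not taken from the slide: each record is taken to hold a word count, a destination offset into .bss, and then that many data words, with a zero count ending the table. `copy_cinit()` is a hypothetical name.

```c
#include <stddef.h>

/* Walks an initialization table and copies each record's data into
 * the variable area.
 * Assumed record layout: [count][dest offset][count data words],
 * terminated by a record whose count is 0. */
void copy_cinit(const unsigned *cinit, unsigned *bss)
{
    while (*cinit != 0) {
        unsigned count = *cinit++;
        unsigned dest  = *cinit++;
        for (unsigned i = 0; i < count; i++)
            bss[dest + i] = *cinit++;
    }
}
```

Variables that have no record in the table are left untouched by the copy, which is consistent with the earlier point that uninitialized globals are not cleared automatically.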
14 - 30
Runtime Support Library: RTS.LIB
• Use -L RTS.LIB at end of file list in LINK.CMD to get access to specified libraries as needed by prior listed files:
    file1.obj       /* can access rts.lib */
    -l rts.lib      /* run-time support library */
    file2.obj       /* won't access rts.lib */
• Library functions must be declared to be used in a C file, usually via Header file.
• Headers can be inserted or included via the #include declaration
• Example - to access math.h type: #include <math.h>
• See compiler UG for full list of headers
• Note: no stdio.h - why?
14 - 31
The Archiver: AR500
Command line options:
    x - extract
    r - reinstall
    a - append
Sequence to modify lib fn, eg: Boot.asm:
    1. extract file:                AR500 x rts.src boot.asm
    2. modify as desired using ASCII editor
    3. refresh both archives:       AR500 r rts.src boot.asm
                                    AR500 r rts.lib boot.obj
14 - 32
Lab 14-c : C.CMD and BOOT.ASM
14 - 33
Mixing ASM into C System
• Invoke the compiler or shell program
  - Options and Switches
  - The RTS library
  - The Optimizer
• Write code in C
  - Numerical Types supported
  - Accessing MMRs and IO Ports
  - Inlining C and ASM functions
  - Interrupt service routines
  - Optimization tips
• Use the C support files :
  - C.CMD : Linker file issues when using C
  - BOOT.ASM : Pre-main initialization process
• Intermix assembly files within the C environment
  - Stack Model
  - Register Usage
  - Argument passing and result return
14 - 34
Calling ASM function from C
• Declare / Prototype the assembly language function
    extern int slope(int,int,int);
• Call the assembly language function.
    void main (void)
    {
        int x,y,b,m ;
        y = slope(b,m,x);
    }
• Declare function name as a global
        .def  _slope
• Define function name (code entry point)
    _slope:
        LD    *SP(1),T
        MPY   *SP(2),B
        ADD   B,15,A
        RET
  or, with delayed return:
    _slope:
        LD    *SP(1),T
        RETD
        MPY   *SP(2),B
        ADD   B,15,A
• Note: *SP(1) implies use of SP, but requires CPL=1 to work properly

Data Mem, Stack area (from SP upward):
    SP ->   old PC
            arg. m
            arg. x
            local x
            local y
            local b
            local m
A on entry :  arg b
A on return : mx+b
14 - 35
Register Caveats for C
ST0
    ARP   TC   C   OVA   OVB   DP
ST1
    BRAF  CPL  XF  HM  INTM  0  OVM  SXM  C16  FRCT  CMPT  ASM

Registers not free on function call
    Reg.        Use by C :
    AR7         Long frame ptr
    SP          Stack Pointer
    AR1,6       Register Variables
    A           1st Arg. / Rtn Value
    Use Frame -N, PSHM, POPM, Frame N

Registers free on function call
    Reg.        Use by C
    B           Expression Analysis
    T           Expression Analysis
    AR0         Pointers and expressions
    AR2-5       Expression Analysis
    BRC         Loop reg's (RSA,REA)
14 - 36
Lab 14-d : ASM routine in C