Architecture Selection of a Flexible DSP Core Using Re-configurable System Software
July 18, 1998
Jong-Yeol Lee
Department of Electrical Engineering, KAIST
Agenda
Introduction
MetaCore C Compiler
MetaCore Assembler
MetaCore Instruction Set Simulator
“Compile-Simulate-Refine” Procedure of Architecture Selection
Experimental Results
Conclusion
Introduction
Application-Specific Instruction-set Processor (ASIP)
  Maximizes performance on a specific application
  Parameterized architecture
Major issues in ASIP design
  Performance and cost efficiency
    Instruction set and micro-architecture
    Diverse exploration of the design space
  Short design turnaround time
    How efficiently the higher-level specification is transformed into a lower-level implementation
  Application program development tools
    Compiler, assembler, ISS (Instruction Set Simulator)
    Re-configurability
Introduction
MetaCore is a flexible DSP core
  MetaCore can be modified easily by just changing hardware parameters
  MetaCore has special features for DSP applications (e.g. MAC unit and hardware loop unit)
To support the flexible core
  The system software must be re-configurable
  For re-configurability, each software tool has its own form of machine description
Using re-configurable system software
  Many architectures can be tested by iterating “compile-simulate-refine” cycles
MetaCore C Compiler (MCC)
MetaCore C Compiler
  Can be configured by changing parameters
  Can be used to explore architectures
Operation of MCC
  [Figure: MCC takes an application C program and a machine description (parameter file) and generates assembly code. Parameter files A and B yield different code for the same source.]
  Application C program:
    main() { ... m = min(m,t) ... }
  Parameter File A:
    inst = 16bit
    general_reg = 28
    address_reg = 8
    minmaxALU = 0
  Assembly Code A:
      :
    Amin :
      cmp A0, R1
      b.lt LLmin
      cla A0
      add A0, R1
    LLmin :
      mvtom *AR2(0), A0
      :
  Parameter File B:
    inst = 16bit
    general_reg = 28
    address_reg = 8
    minmaxALU = 1
  Assembly Code B:
      :
      min A0, R1
      mvtom *AR2(0), A0
      :
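How a single parameter such as minmaxALU can change the generated code may be sketched as follows. This is only an illustration, not MCC's actual implementation: the helper names `parse_params` and `emit_min` and the "name = value" file layout are assumptions; only the parameter names and the two instruction sequences come from the slide.

```python
def parse_params(text):
    """Parse 'name = value' lines into a dict (assumed file layout)."""
    params = {}
    for line in text.splitlines():
        if "=" in line:
            key, value = line.split("=", 1)
            params[key.strip()] = value.strip()
    return params


def emit_min(params, acc="A0", reg="R1"):
    """Return assembly lines computing acc = min(acc, reg) (hypothetical helper)."""
    if params.get("minmaxALU") == "1":
        # The ALU provides a min/max operation: one instruction suffices.
        return [f"min {acc}, {reg}"]
    # No min/max operation: use the compare-and-branch sequence of Assembly Code A.
    return [
        f"cmp {acc}, {reg}",
        "b.lt LLmin",
        f"cla {acc}",
        f"add {acc}, {reg}",
        "LLmin:",
    ]


PARAMS_A = "inst = 16bit\ngeneral_reg = 28\naddress_reg = 8\nminmaxALU = 0"
PARAMS_B = "inst = 16bit\ngeneral_reg = 28\naddress_reg = 8\nminmaxALU = 1"

code_a = emit_min(parse_params(PARAMS_A))  # five lines, as in Assembly Code A
code_b = emit_min(parse_params(PARAMS_B))  # one line, as in Assembly Code B
```

With parameter file B the five-line sequence collapses to a single `min` instruction, which is the kind of difference the simulator can later measure in cycles and code size.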
MetaCore Assembler (MASM)
  User can define new instructions
  Mapping table is automatically reconstructed from the Assembly Language Definition
Operation of MASM
  [Figure: MASM block diagram. Components: Format Converter, Mapping Table, Assembler. Inputs: Assembly Language Definition, Assembly Code. Output: Object Code.]
  Assembly Language Definition:
    // instruction : (operation field) (operand field)
    add : 80 (OP_ALU)
    sub : AF (OP_ALU)
    u_add : 8F (OP_ALU)
  Assembly Code:
    ADD A1, R7
    SUB A0, R1
    U_ADD A0, A1
    B LOOP
  Object Code:
    80 F0
    AF 47
    8F F1
    31 41
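Rebuilding a mapping table from the Assembly Language Definition can be sketched roughly as below. This is a minimal illustration under assumptions: the regular expression, the function names, and the table layout are hypothetical, and operand encoding is left out because the slide does not describe it. Only the three definition lines are taken from the slide.

```python
import re

DEFINITION = """\
// instruction : (operation field) (operand field)
add : 80 (OP_ALU)
sub : AF (OP_ALU)
u_add : 8F (OP_ALU)
"""

# mnemonic : two hex digits (operand class), e.g. "add : 80 (OP_ALU)"
LINE = re.compile(r"^\s*(\w+)\s*:\s*([0-9A-Fa-f]{2})\s*\((\w+)\)\s*$")


def build_mapping_table(definition):
    """Map each mnemonic to (opcode byte, operand class)."""
    table = {}
    for line in definition.splitlines():
        if not line.strip() or line.strip().startswith("//"):
            continue  # skip blank lines and comments
        match = LINE.match(line)
        if match is None:
            raise ValueError(f"bad definition line: {line!r}")
        mnemonic, opcode, operand_class = match.groups()
        table[mnemonic.lower()] = (int(opcode, 16), operand_class)
    return table


def opcode_of(table, statement):
    """Look up the operation field of one assembly statement."""
    mnemonic = statement.split()[0].lower()
    return table[mnemonic][0]


table = build_mapping_table(DEFINITION)
opcodes = [opcode_of(table, s) for s in ("ADD A1, R7", "SUB A0, R1", "U_ADD A0, A1")]
```

Because the table is derived from the definition text, defining a new instruction in this sketch only requires adding one line to `DEFINITION`.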
MetaCore Instruction Set Simulator (MISS)
MetaCore Instruction Set Simulator
  Enables fast simulation at the instruction level
  Supports re-configurability using the Instruction Description Format
  Can be used as a debugger for assembly code
Operation of MISS
  [Figure: MISS block diagram. Components: Tree Builder, Decoding Tree, Simulation Core. Inputs: Instruction Description Format, Object Code. Outputs: Instruction Usage Statistics, Program Output.]
  Instruction Description Format:
    U_ADD:B10001111,2,6 {
      Operands1 = GetSource 1;
      ….
      Result = AddOperands;
      SetConditionCodes;
    }
  Object Code:
    80 F0
    AF 47
    8F F1
    31 41
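The idea of a decoding tree built from instruction bit patterns can be sketched as a binary trie. This is an illustration, not the MISS implementation: the function names are hypothetical, the meaning of the `2,6` fields in the description is not given on the slide and is ignored here, and the ADD and SUB patterns are assumed to be the binary forms of the opcodes 80 and AF from the assembler definition (as 8F is for U_ADD).

```python
from collections import Counter


def build_tree(patterns):
    """Build a binary decoding tree from {instruction name: bit string}."""
    root = {}
    for name, bits in patterns.items():
        node = root
        for bit in bits:
            node = node.setdefault(bit, {})  # descend, creating nodes as needed
        node["name"] = name                  # leaf records the instruction
    return root


def decode(tree, opcode, width=8):
    """Walk the tree bit by bit; return the instruction name or None."""
    node = tree
    for bit in format(opcode, f"0{width}b"):
        node = node.get(bit)
        if node is None:
            return None
    return node.get("name")


PATTERNS = {
    "U_ADD": "10001111",  # from the Instruction Description Format example
    "ADD": "10000000",    # assumed: opcode 80 written in binary
    "SUB": "10101111",    # assumed: opcode AF written in binary
}

tree = build_tree(PATTERNS)

# Instruction usage statistics over a made-up opcode stream.
usage = Counter(decode(tree, op) for op in (0x80, 0xAF, 0x8F, 0x8F))
```

A simulation core would attach the semantic actions from each description block to the leaf nodes; here the leaves carry only the instruction name.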
Architecture Selection Procedure
  [Figure: flowchart of the “compile-simulate-refine” cycle. Application C code -> Compiler -> Assembler -> Instruction-set Simulator -> Simulation Result -> OK? If no: Architecture Refinement, which updates the Architecture Parameter, Assembler Language Definition, and Instruction Description Format, and the cycle repeats. If yes: Selected Architecture.]
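The control flow of the “compile-simulate-refine” cycle can be sketched as a simple loop. Everything concrete here is an assumption made for illustration: the function names, the toy cost model inside `toy_evaluate`, the refinement step, and the 500,000-cycle target are placeholders, not the tools or measurements from the presentation.

```python
def select_architecture(initial, evaluate, is_ok, refine, max_iterations=100):
    """Iterate evaluate -> check -> refine until the result is acceptable."""
    params = dict(initial)
    for iteration in range(1, max_iterations + 1):
        result = evaluate(params)  # stands for compile + assemble + simulate
        if is_ok(result):
            return params, result, iteration
        params = refine(params, result)
    raise RuntimeError("no acceptable architecture found")


def toy_evaluate(params):
    """Placeholder cost model: more registers, fewer cycles (made-up numbers)."""
    return {"cycles": 600_000 - 5_000 * params["general_reg"]}


def toy_is_ok(result):
    return result["cycles"] <= 500_000


def toy_refine(params, result):
    """Placeholder refinement: add four general-purpose registers."""
    refined = dict(params)
    refined["general_reg"] += 4
    return refined


selected, result, iterations = select_architecture(
    {"general_reg": 8, "address_reg": 8}, toy_evaluate, toy_is_ok, toy_refine
)
```

In the real flow, `evaluate` would run the re-configured compiler, assembler, and simulator on the application, and `refine` would rewrite the three machine descriptions consistently.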
Experimental Results
Performance impact of accumulators
  The reference architecture has six address registers and ten general-purpose registers
  [Figure: speedup (y-axis, 1.000 to 1.200) versus number of accumulators (x-axis, 2 to 4) for the benchmarks adpcm, conv, edge, fft, fir2dim, idct, iir, lms, viterbi.]
Experimental Results
Parameter selection considering generated code size
  If the code size should be smaller than 32K bytes, 16 general-purpose registers and 4 address registers will be enough
  Register configurations compared (No. of GPR's and AR's): GR=28 AR=16, GR=8 AR=8, GR=28 AR=4, GR=16 AR=4, GR=8 AR=4
  [Figure (a) Cycle Counts: cycle count (x1000 cycles) per register configuration; bar values as extracted: 498, 498, 552, 555, 668.]
  [Figure (b) Code Sizes: code size (x1000 bytes) per register configuration; bar values as extracted: 11.2, 12, 12.8, 15.4, 33.]
Experimental Results
Find minimum area under speedup constraints
  Use simple iterative heuristic

Benchmark   Speedup      Obtained   Optimal   Area       Number of
            constraint   area       area      overhead   iterations
ADPCM        5%          20467      19923     2.7%       16
            10%          20467      20019     2.2%       15
            15%          20563      20115     2.2%       13
            20%          20707      20259     2.2%       10
IDCT         5%          20467      19971     2.5%       16
            10%          20515      20067     2.2%       14
            15%          20563      20115     2.2%       13
            20%          20659      20211     2.2%       11
Viterbi      5%          20419      19923     2.5%       17
            10%          20419      19923     2.5%       17
            15%          20467      19971     2.5%       16
            20%          20467      19971     2.5%       16
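The area overhead column appears to be the relative excess of the obtained area over the optimal area. The formula below is inferred rather than stated on the slide, but it reproduces every value in the table when rounded to one decimal place; the function and variable names are illustrative.

```python
# (benchmark, speedup constraint in %, obtained area, optimal area), from the table
ROWS = [
    ("ADPCM", 5, 20467, 19923),
    ("ADPCM", 10, 20467, 20019),
    ("ADPCM", 15, 20563, 20115),
    ("ADPCM", 20, 20707, 20259),
    ("IDCT", 5, 20467, 19971),
    ("IDCT", 10, 20515, 20067),
    ("IDCT", 15, 20563, 20115),
    ("IDCT", 20, 20659, 20211),
    ("Viterbi", 5, 20419, 19923),
    ("Viterbi", 10, 20419, 19923),
    ("Viterbi", 15, 20467, 19971),
    ("Viterbi", 20, 20467, 19971),
]


def area_overhead_percent(obtained, optimal):
    """Inferred formula: how far the heuristic's area exceeds the optimum, in %."""
    return 100.0 * (obtained - optimal) / optimal


overheads = [round(area_overhead_percent(o, p), 1) for _, _, o, p in ROWS]
```

For example, ADPCM at the 5% constraint gives 100 * (20467 - 19923) / 19923, which rounds to 2.7%.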
Conclusion
Presented re-configurable system software for a flexible DSP core
Showed that the re-configurable system software can be used to select the most suitable architecture for a given application
Code optimization will be a major topic of future work