1/31/12
High-Level Synthesis using Catapult-C
System-on-Chip Design Methodologies
Olivier Sentieys IRISA/INRIA
ENSSAT - Université de Rennes 1
EII3/M2R - 2
Outline
§ Introduction
§ Design Flow and Tool Basics
§ Data Types
§ Writing C++ for Synthesis
§ Optimizing your Design
  • Loops
§ Interface and Memory Synthesis
EII3/M2R - 3
Catapult C Design Methodology
§ Compatible Environments
  • Matlab/Simulink
  • C++
  • SystemC
EII3/M2R - 4
Design Steps with Catapult
§ Algorithm design and hardware/software partitioning
§ Write C code for hardware
  • technology independent, fast simulation, compact, etc.
§ Analyze your C code inside Catapult
§ Constrain the micro-architecture
  • technology, resource constraints, I/O, frequency, etc.
§ Generate, analyze and validate hardware
  • Gantt chart
  • testbench generation, RTL generation
EII3/M2R - 5
Writing C code for Hardware: basics
§ Big rules:
  • No dynamic memory allocation
  • Pointer restrictions
  • Integer or fixed-point data types: bit accurate, no floats
  • The style of your C code will have a great impact on design quality
§ Add #pragma hls_design top
§ Use pragmas in C++ code
EII3/M2R - 6
C Design Example
  #define num_taps 8
  #pragma hls_design top
  void fir_filter (int *input,
                   int coeffs[num_taps],
                   int *output) {
    static int regs[num_taps];
    short temp = 0;
    for (int i = num_taps-1; i >= 0; i--) {
      if (i == 0)
        regs[0] = *input;
      else
        regs[i] = regs[i-1];
      temp += coeffs[i] * regs[i];
    }
    *output = temp;
  }
Virtual WAIT
Outputs are registered
Inputs are read with handshake
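A quick way to gain confidence in the FIR example before synthesis is to run it on the host with an impulse input: the outputs should then reproduce the coefficients one by one. The sketch below copies the slide's function (the pragma is left out because it only matters to the HLS tool); `impulse_response` is a hypothetical helper added for illustration, not part of the original slides.

```cpp
#include <cassert>
#include <vector>

const int num_taps = 8;

// Same body as the slide's fir_filter. temp is a short, as on the slide,
// so the accumulated sum is assumed to fit in 16 bits.
void fir_filter(int *input, int coeffs[num_taps], int *output) {
  static int regs[num_taps];  // delay line, zero-initialised
  short temp = 0;
  for (int i = num_taps - 1; i >= 0; i--) {
    if (i == 0)
      regs[0] = *input;       // newest sample enters the delay line
    else
      regs[i] = regs[i - 1];  // shift older samples by one position
    temp += coeffs[i] * regs[i];
  }
  *output = temp;
}

// Hypothetical helper: feeds 1, 0, 0, ... and collects the outputs.
// regs is static, so state persists between calls; running more than
// num_taps samples leaves the delay line all zero again.
std::vector<int> impulse_response(int coeffs[num_taps], int n_samples) {
  assert(n_samples >= 0);
  std::vector<int> out;
  for (int n = 0; n < n_samples; n++) {
    int in = (n == 0) ? 1 : 0;
    int y = 0;
    fir_filter(&in, coeffs, &y);
    out.push_back(y);
  }
  return out;
}
```

Calling `impulse_response(coeffs, 10)` with coefficients 1 to 8 should return those eight values followed by zeros.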
EII3/M2R - 7
EII3/M2R - 8
Synthesis Flow
§ Set working directory
§ Add input file(s)
§ Setup design
§ Architectural constraints
§ Resource constraints
§ Schedule
§ Generate RTL
§ Invoke simulation
EII3/M2R - 9
Catapult Window
§ Invoke catapult
§ Load scripts
§ “Crossprobing” between C code and
  • constraints
  • Gantt chart
  • generated HDL
  • reports
  • schematic
  • etc.
EII3/M2R - 10
Setting Up Design
EII3/M2R - 11
Setting Up Design
§ Technology: FPGA, ASIC
§ Synthesis tool
§ IP blocks: RAMs, pipelined multipliers
§ Design frequency constraints
  • Clock frequency
§ Interface
  • One, and only one, clock ;-) (I told you so)
  • Synchronous or asynchronous reset
  • Enable (optional)
  • Start and Done flags
EII3/M2R - 12
Specifying Architectural Constraints
EII3/M2R - 13
Analysing your design
EII3/M2R - 14
Analysing your design
EII3/M2R - 15
Analysing your design § Gantt Chart View of Data Dependencies
EII3/M2R - 16
Schematic Viewer
EII3/M2R - 17
Schematic Viewer
EII3/M2R - 18
EII3/M2R - 19
Data Types
§ Type constraints
  • signed or unsigned
  • bit-width for integers
  • bit-width for fixed-point types
    o 0111.11001
§ C++ extensions: bitvectors
  • package mc_bitvector.h
  • int5 my_variable;
  • uint5 my_unsigned_variable;
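To make the fixed-point example concrete: the bit pattern 0111.11001 has 4 integer bits and 5 fractional bits, so its value is the stored integer 011111001 (249) divided by 2^5. The helper below is a hypothetical illustration in plain C++ and does not use the bit-accurate types from the slides.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: value represented by a two's-complement word `raw`
// when the binary point sits `frac_bits` positions from the right.
double fixed_to_double(int32_t raw, int frac_bits) {
  assert(frac_bits >= 0 && frac_bits < 31);
  return static_cast<double>(raw) / static_cast<double>(1 << frac_bits);
}
```

For the slide's pattern, `fixed_to_double(0xF9, 5)` gives 7 + 0.5 + 0.25 + 0.03125 = 7.78125.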
EII3/M2R - 20
SystemC Datatypes
§ #include "systemc.h"
§ sc_int/sc_bigint
  • sc_int<10> tenbitInt;
§ sc_uint/sc_biguint
§ sc_fixed/sc_ufixed
  • sc_fixed<20,10> a;
  • sc_fixed<20,10,SC_RND,SC_SAT_ZERO> c;
EII3/M2R - 21
AC Datatypes
§ #include <ac_fixed.h>
§ ac_int<int W, bool S>
  • ac_int<10> tenbitInt;
  • ac_int<10,true> tenbitIntSigned;
§ ac_fixed<int W, int I, bool S, ac_q_mode Q, ac_o_mode O>
  • ac_fixed<20,10,true> a;
  • ac_fixed<20,10,true,AC_RND,AC_SAT_ZERO> c;
EII3/M2R - 22
AC Datatypes
EII3/M2R - 23
AC Datatypes: Quantization Modes
EII3/M2R - 24
AC Datatypes: Overflow Modes
EII3/M2R - 25
EII3/M2R - 26
C++ File Format
§ Files: .c, .cxx or .cpp
§ C++ parser
§ C++ preprocessor

  #ifndef MY_HEADER_FILE_NAME
  #define MY_HEADER_FILE_NAME
  code goes here ...
  #endif

  #pragma hls_design top
  void my_design (int *input, int array[8], int *output) {
    static int temp;
    short var = *input;
    …
    *output = temp;
  }
EII3/M2R - 27
Storage Types
§ Static variables may only be initialized with constants, in their declaration.
  • static int a = 5; // Correct
  • static int b = x; // Incorrect if x is not a constant declared in the local scope
§ Static variables are set to their initial value during reset
§ Storage types “const”, “extern” and “mutable” have no effect on synthesis
EII3/M2R - 28
Condition Statements
§ All branches of a conditional statement will be balanced to have the same length
§ Every branch of a case statement should have a break
§ Don't write code with conditional looping statements
§ Supported: if, switch

  switch (a) {
    case 1:
      c = a + b;
      break;
    case 12:
      c = a - b;
      break;
  }
EII3/M2R - 29
Loop Statements
§ The variable used to decide if a loop should exit should have a constant start value and increment
§ Conditional break from a loop is preferred
§ Each loop should have only one “exit”
§ Supported: do, for, while
§ Partial unroll of any loop
  • #pragma unroll yes // unroll a loop
  • #pragma unroll no // leave a loop rolled
  • #pragma unroll 5 // Unroll the loop 5 times
  • Default is to leave loops rolled
EII3/M2R - 30
Branching and Functions
§ Branching
  • “continue” statement should be avoided
  • “goto” statement is not supported
  • Supported: break, continue, return
§ Functions
  • Should have only one “return” statement at the end
  • Recursive functions are not supported

  int my_addsub (int a, int b, bool c) {
    if ( c )
      return a + b;
    else
      return a - b;
  }

  if ( c )
    return a + b;
  return a - b;
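The single-return guideline above can be applied to `my_addsub` by selecting the result into a local variable and returning once at the end. This is a minimal sketch; the name `my_addsub_single_exit` is hypothetical and only used here to distinguish it from the slide's version.

```cpp
#include <cassert>

// Same behaviour as the slide's my_addsub, restructured so the function
// has exactly one return statement, placed at the end.
int my_addsub_single_exit(int a, int b, bool c) {
  int result;
  if (c)
    result = a + b;
  else
    result = a - b;
  return result;
}
```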
EII3/M2R - 31
Expressions
§ Don't index arrays with signed expressions
§ Don't use the pre- and post-increment expressions (“++” and “--”) on a variable if that variable is used somewhere else in that expression
  • b = 5; a = ++b * --b;
  • The evaluation order is not defined: a = 6 * 5; or a = 5 * 5;
§ Divide “/” and Modulo “%” should be avoided
§ Keep C integer shifts in their defined range: 0 to 31 for int, 0 to 63 for long
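One way to remove the ambiguity of `a = ++b * --b` is to split the increments into separate statements, so the order is fixed by the source. The sketch below picks the "increment first, then decrement" reading, which is an assumption about the intended meaning; the function name is hypothetical.

```cpp
#include <cassert>

// Hypothetical helper: performs the increment and the decrement as two
// separate statements, then multiplies the two observed values.
int inc_then_dec_product(int b) {
  assert(b >= 0 && b < 1000);  // keep the product well inside int range
  b = b + 1;                   // effect of ++b
  const int first = b;
  b = b - 1;                   // effect of --b
  const int second = b;
  return first * second;
}
```

With b = 5 this always evaluates to 6 * 5, whatever compiler or synthesis tool is used.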
EII3/M2R - 32
Using Parentheses and Common Sub-Expressions
EII3/M2R - 33
Multiply and Divide by Constants
§ Constant shifts have virtually no cost in hardware
§ Constant multiplications, divisions and modulus are converted into shifts and adds or subtracts
  • y = a * 6;
  • converted into y = (a << 1) + (a << 2);
  • y = (int)a/2;
  • converted into y = ((int)a + (a < 0 ? 1 : 0)) >> 1; (the +1 correction is needed only for negative a, so that the result rounds toward zero)
  • y = (unsigned int)a/2;
  • converted into y = (unsigned int) a >> 1;
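The shift-and-add rewrites can be checked on the host against the plain operators. This sketch applies the +1 rounding correction only when a is negative, which is what truncation toward zero requires, and it assumes that `>>` on a negative int is an arithmetic shift (guaranteed since C++20, and the usual behaviour of mainstream compilers before that). All function names are hypothetical.

```cpp
#include <cassert>

// a * 6 = 2a + 4a, written with shifts (unsigned to avoid shifting negatives).
unsigned mul6_shift_add(unsigned a) { return (a << 1) + (a << 2); }

// Signed a / 2 with C semantics (round toward zero).
// Assumption: >> on a negative int is an arithmetic shift.
int div2_signed(int a) { return (a + (a < 0 ? 1 : 0)) >> 1; }

// Unsigned a / 2 is a plain shift.
unsigned div2_unsigned(unsigned a) { return a >> 1; }

// Compares each rewrite with the original operator over [lo, hi].
bool rewrites_match(int lo, int hi) {
  assert(lo <= hi);
  for (int a = lo; a <= hi; a++) {
    if (div2_signed(a) != a / 2) return false;
    if (a >= 0) {
      const unsigned u = static_cast<unsigned>(a);
      if (mul6_shift_add(u) != u * 6u) return false;
      if (div2_unsigned(u) != u / 2u) return false;
    }
  }
  return true;
}
```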
EII3/M2R - 34
Loops
§ FOR loop

  FOR_LOOP:for(int i=0;i<4;i++) {
    dout[i] = din[i];
  }

§ WHILE loop

  int i=0;
  WHILE_LOOP:while(i<4){
    dout[i] = din[i];
    i++;
  }

§ DO loop

  int i=0;
  DO_LOOP:do{
    dout[i] = din[i];
    i++;
  }while(i<4);
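The three loop forms above describe the same four-element copy. A small host-side sketch wrapping each one in a function makes that easy to check; the function names and the `kLen` constant are hypothetical, and the loop labels are kept only because the slides use them to name loops.

```cpp
#include <cassert>

const int kLen = 4;

void copy_for(const int din[kLen], int dout[kLen]) {
  FOR_LOOP: for (int i = 0; i < kLen; i++) {
    dout[i] = din[i];
  }
}

void copy_while(const int din[kLen], int dout[kLen]) {
  int i = 0;
  WHILE_LOOP: while (i < kLen) {
    dout[i] = din[i];
    i++;
  }
}

void copy_do(const int din[kLen], int dout[kLen]) {
  int i = 0;
  // A do loop always runs its body at least once, which is fine here
  // because the trip count is a non-zero constant.
  DO_LOOP: do {
    dout[i] = din[i];
    i++;
  } while (i < kLen);
}
```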
EII3/M2R - 35
Loops § Loop using one iterator
EII3/M2R - 36
Loops § Loop using multiple iterators
EII3/M2R - 37
Conditions § Conditional “if” Creates Dependency Chain
EII3/M2R - 38
Conditions § Conditional “else” Splits Dependency Chain
EII3/M2R - 39
EII3/M2R - 40
Loop
§ Loop example

  int acc=0;
  ACCUM:for(int i=0;i<4;i++){
    acc += din[i];
  }
EII3/M2R - 41
Partial Loop Unrolling

  int acc=0;
  ACCUM:for(int i=0;i<4;i+=2){
    acc += din[i];
    acc += din[i+1];
  }
EII3/M2R - 42
Fully Unrolled Loop
  int acc=0;
  acc += din[0];
  acc += din[1];
  acc += din[2];
  acc += din[3];
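The rolled, partially unrolled and fully unrolled accumulators compute the same sum; only the amount of work per loop iteration changes. The sketch below writes the three variants by hand as ordinary C++ functions (hypothetical names) so they can be compared on the host. In the tool, the unrolling would normally come from the unroll pragma rather than from rewriting the source.

```cpp
#include <cassert>

// Rolled: one addition per iteration, 4 iterations.
int accum_rolled(const int din[4]) {
  int acc = 0;
  for (int i = 0; i < 4; i++) acc += din[i];
  return acc;
}

// Unrolled by 2: two additions per iteration, 2 iterations.
int accum_unroll2(const int din[4]) {
  int acc = 0;
  for (int i = 0; i < 4; i += 2) {
    acc += din[i];
    acc += din[i + 1];
  }
  return acc;
}

// Fully unrolled: no loop left, four additions in sequence.
int accum_unrolled(const int din[4]) {
  int acc = 0;
  acc += din[0];
  acc += din[1];
  acc += din[2];
  acc += din[3];
  return acc;
}
```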
EII3/M2R - 43
Loops with Conditional Bounds
• Loop bound is an input

  #include "accum.h"
  #include <ac_int.h>
  void accumulate( int din[4], int &dout,
                   unsigned int ctrl){
    int acc=0;
    ACCUM:for(int i=0;i<ctrl;i++){
      acc += din[i];
    }
    dout = acc;
  }
EII3/M2R - 44
Optimizing the Loop Control

  #include "accum.h"
  #include <ac_int.h>
  void accumulate(int din[4], int &dout,
                  ac_int<3,false> ctrl){
    int acc=0;
    int i_old=0;
    ACCUM:for(int i=0;i<4;i++){
      acc += din[i];
      if(i_old==ctrl)
        break;
      i_old = i;
    }
    dout = acc;
  }
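When a variable loop bound is replaced by a constant maximum bound plus a conditional break, it is worth checking on the host that both forms agree for every legal value of the control input. The sketch below is one possible variant of that rewrite, not the exact code from the slide: it uses plain `unsigned` instead of `ac_int<3,false>`, places the break before the accumulation, and all names are hypothetical.

```cpp
#include <cassert>

// Reference: the trip count depends directly on the input ctrl.
int accumulate_variable_bound(const int din[4], unsigned ctrl) {
  assert(ctrl <= 4);  // din has only 4 elements
  int acc = 0;
  for (unsigned i = 0; i < ctrl; i++) acc += din[i];
  return acc;
}

// Variant: constant maximum trip count (4) with a conditional break,
// so the worst-case number of iterations is visible in the loop header.
int accumulate_fixed_bound(const int din[4], unsigned ctrl) {
  int acc = 0;
  for (unsigned i = 0; i < 4; i++) {
    if (i >= ctrl) break;  // stop once ctrl elements have been added
    acc += din[i];
  }
  return acc;
}
```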
EII3/M2R - 45
Nested Loops

  #include "accum.h"
  #include <ac_int.h>
  #define MAX 100000
  void accumulate(int din[2][4],
                  int &dout){
    int acc=0;
    ROW:for(int i=0;i<2;i++){
      if(acc>MAX)
        acc = MAX;
      COL:for(int j=0;j<4;j++){
        acc += din[i][j];
      }
      dout = acc;
    }
  }
EII3/M2R - 46
Nested Loop: Unrolling the Innermost Loop
  ROW:for(int i=0;i<2;i++){
    acc=0;
    acc += din[i][0];
    acc += din[i][1];
    acc += din[i][2];
    acc += din[i][3];
    dout[i] = acc;
  }
EII3/M2R - 47
Nested Loop: Unrolling the Outer Loop
  int acc[2];
  acc[0] = 0;
  COL_0: for(int j=0;j<4;j++){
    acc[0] += din[0][j];
  }
  dout[0] = acc[0];
  acc[1] = 0;
  COL_1:for(int j=0;j<4;j++){
    acc[1] += din[1][j];
  }
  dout[1] = acc[1];

=> Loop Merging

  int acc[2];
  acc[0] = 0;
  acc[1] = 0;
  COL_0_1:for(int j=0;j<4;j++){
    acc[0] += din[0][j];
    acc[1] += din[1][j];
  }
  dout[0] = acc[0];
  dout[1] = acc[1];
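Loop merging is legal here because the two column loops have the same bounds and touch independent accumulators. The following host-side sketch wraps the two versions in functions (hypothetical names) so their results can be compared directly.

```cpp
#include <cassert>

// Outer loop unrolled: two separate column loops, one per row.
void rows_separate(const int din[2][4], int dout[2]) {
  int acc[2];
  acc[0] = 0;
  for (int j = 0; j < 4; j++) acc[0] += din[0][j];
  dout[0] = acc[0];
  acc[1] = 0;
  for (int j = 0; j < 4; j++) acc[1] += din[1][j];
  dout[1] = acc[1];
}

// Merged: a single column loop updates both accumulators per iteration.
void rows_merged(const int din[2][4], int dout[2]) {
  int acc[2] = {0, 0};
  for (int j = 0; j < 4; j++) {
    acc[0] += din[0][j];
    acc[1] += din[1][j];
  }
  dout[0] = acc[0];
  dout[1] = acc[1];
}
```

The merged form halves the number of loop iterations, at the cost of two adders working in the same cycle.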
EII3/M2R - 48
But loop parallelization is complex!

  for(i=1; i<=n-1; i++)
    for(j=1; j<=n-1; j++)
      a[i][j] = ( a[i-1][j] + a[i][j]
                + a[i][j-1] ) / 3.0;

§ Is the loop parallel?
§ Apply the following transform
  • t=i+j, p=j, i.e. i=t-p, j=p
  • Is it better?
Leslie Lamport, "The parallel execution of DO loops", Communications of the ACM 17(2), 1974
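The effect of the change of variables t = i + j, p = j can be tried out on the host: all points with the same t depend only on points with t - 1, so the inner loop over p carries no dependency. The sketch below is an illustrative assumption about how the transformed loop nest could be written (the `Grid` type and function names are hypothetical); it runs the wavefront order sequentially and is meant to be compared against the original order.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

typedef std::vector<std::vector<double> > Grid;

// Original loop nest: row by row, column by column.
void relax_sequential(Grid &a) {
  const int n = static_cast<int>(a.size());
  for (int i = 1; i <= n - 1; i++)
    for (int j = 1; j <= n - 1; j++)
      a[i][j] = (a[i - 1][j] + a[i][j] + a[i][j - 1]) / 3.0;
}

// Transformed loop nest: t = i + j is the outer (sequential) index,
// p = j the inner one. Iterations of the p loop are independent.
void relax_wavefront(Grid &a) {
  const int n = static_cast<int>(a.size());
  for (int t = 2; t <= 2 * (n - 1); t++) {
    const int p_lo = std::max(1, t - (n - 1));  // keeps i = t - p <= n - 1
    const int p_hi = std::min(n - 1, t - 1);    // keeps i = t - p >= 1
    for (int p = p_lo; p <= p_hi; p++) {
      const int i = t - p;
      const int j = p;
      a[i][j] = (a[i - 1][j] + a[i][j] + a[i][j - 1]) / 3.0;
    }
  }
}
```

Both functions visit the same points and read the same already-updated neighbours, so they should produce identical arrays for a square grid.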