Graphical Design Environment for a Reconfigurable Processor
Abstract
The Field Programmable Processor Array (FPPA) is a new reconfigurable architecture developed by NASA/GSFC and the University of Idaho under ESTO funding. The FPPA architecture promises high-throughput, radiation-tolerant, low-power data processing for spacecraft instruments.
FPPA implements a synchronous integer data flow computational model, which is not easily captured in procedural languages like C, but is easy to represent graphically. This motivates our Simulink-based design environment for the FPPA. In a process familiar to all Simulink users, the algorithm designer selects functional blocks from the menu, places them on a work screen, and connects them by drawing interconnect lines. A click of a button executes the simulation. The goals of this effort are to implement the following:
1. Verify algorithm; this is the familiar Simulink operational mode, which runs the simulation, invoking underlying Matlab functions and verifying the functional correctness of the program.
2. Translate to FPPA; incorporating design parameters such as value ranges and topology, the software will translate the floating point Matlab representation to the FPPA fixed point in an optimal fashion, and generate an interface to the FPPASim simulator software.
3. Verify the FPPA implementation; the designer now executes a simulation that invokes the FPPASim program, which faithfully duplicates the FPPA behavior.
4. Generate FPPA code; when the implementation has been verified, the software will map the design to FPPA configuration and run-time code, enabling the design to be ported to FPPA chips.
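Step 2 can be illustrated with a minimal sketch of range-driven fixed-point conversion. It assumes a 16-bit signed word (the poster states 16-bit ports but not the internal number format), and the helper names `choose_frac_bits`, `to_fixed`, and `to_float` are hypothetical. It is not the translation algorithm used by the tool; it only shows how a known value range can determine the position of the binary point.

```python
import math

WORD_BITS = 16  # assumption: 16-bit signed words, matching the 16-bit I/O ports


def choose_frac_bits(max_abs, word_bits=WORD_BITS):
    """Pick the largest number of fractional bits that still fits max_abs."""
    if max_abs <= 0:
        return word_bits - 1
    # Bits needed to the left of the binary point (excluding the sign bit).
    int_bits = max(0, math.floor(math.log2(max_abs)) + 1)
    return word_bits - 1 - int_bits


def to_fixed(value, frac_bits, word_bits=WORD_BITS):
    """Scale, round, and saturate a float to a signed fixed-point integer."""
    lo = -(1 << (word_bits - 1))
    hi = (1 << (word_bits - 1)) - 1
    raw = round(value * (1 << frac_bits))
    return max(lo, min(hi, raw))


def to_float(raw, frac_bits):
    """Convert a fixed-point integer back to a float for comparison."""
    return raw / (1 << frac_bits)


# A signal known to stay within +/-3.7 gets 13 fractional bits.
frac = choose_frac_bits(3.7)
quantized = to_fixed(3.7, frac)
```

Comparing `to_float(quantized, frac)` with the original value gives the quantization error that a floating-point versus fixed-point validation step would have to tolerate.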
FPPA architecture
An embedded data processor VLSI chip for spacecraft:
• Radiation-tolerant, 0.25 µm CMOS process
• Fixed-point processing elements
• Implements a reconfigurable synchronous data flow processor
  1. Run-time reconfigurable
  2. Extensible by tiling multiple chips
• Serves as an accelerator to a host CPU

Features:
• 16 configurable on-board Processing Elements
• Four 16-bit-wide, bidirectional I/O ports
• One 16-bit-wide dedicated output port
• On-board program memory and execution unit

Application development:
• Text-based development
  1. Configuration and run-time compilers
  2. Standalone functional simulator
• FPPA Simulink graphical design environment (GUI)
Processing Element components
Components:
• 17-bit multipliers
• ALU
• Data format
• Primary and secondary output
• Conditional output select module
• Delay elements
Design Flow
Note: The SIFOpt tool is a result of David M. Buehler's dissertation at the University of Idaho.
[Figure 1 diagram labels: 16 processing elements (PE00 to PE33), local buses LBUS0 to LBUS3, XBAR, BSEL0 to BSEL3, DOM, IOM0 to IOM3]
Figure 1: FPPA architecture
General model of the Processing Element (PE)
Behavior of the PE: The PE works in two different modes: configuration and run time. In configuration mode, the constants C0 and C1, the data path, and the run-time settings shown in Figure 4 are configured for a given topology, together with the enable/disable sequence of the PE.
In run-time mode, the PE takes the input data X, Y, and W shown in Figure 4 and produces an output based on the configured topology and on the status of the PE (enabled or disabled).
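The two modes can be sketched as a toy behavioral model. Everything here is illustrative: the class `ProcessingElement`, its method names, and the rule that a disabled PE holds its previous output are assumptions made for the example, not the FPPASim implementation.

```python
class ProcessingElement:
    """Toy behavioral model of one PE with a configuration and a run-time mode."""

    def __init__(self):
        self.c0 = 0
        self.c1 = 0
        self.datapath = None    # function (x, y, w, c0, c1) -> value
        self.enable_seq = [1]   # cyclic run-time enable/disable sequence
        self.tick = 0
        self.output = 0         # assumption: a disabled PE holds its last output

    def configure(self, datapath, c0=0, c1=0, enable_seq=(1,)):
        """Configuration mode: load constants, data path, and run-time sequence."""
        self.datapath = datapath
        self.c0, self.c1 = c0, c1
        self.enable_seq = list(enable_seq)
        self.tick = 0
        self.output = 0

    def step(self, x, y, w):
        """Run-time mode: process one set of inputs on one clock tick."""
        enabled = self.enable_seq[self.tick % len(self.enable_seq)]
        self.tick += 1
        if enabled:
            self.output = self.datapath(x, y, w, self.c0, self.c1)
        return self.output


pe = ProcessingElement()
# Configure (X + Y) * C0 with C0 = 3, enabled on every second tick.
pe.configure(lambda x, y, w, c0, c1: (x + y) * c0, c0=3, enable_seq=(0, 1))
outputs = [pe.step(x, 1, 0) for x in range(4)]
```

Separating `configure` from `step` mirrors the poster's distinction: the topology and enable sequence are fixed once, and only the data inputs change from tick to tick.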
Configure PE with Simulink Graphical Design Environment
The PE can perform numerical as well as logical computation. Figure 5 shows a sample of the numerical and logical operations that the Processing Element components can perform.
[Figure 2 diagram labels: Inputs, Control, multiplier (mul_X, mul_Y, MUL_OUT) with Format, ALU (alu_X, alu_Y, ALU_OUT) with Format, Conditional Output Select, Primary Output, Secondary Output, Output, Delay Elements; buses are 16 bits wide]
Figure 2: A look at the Processing Element architecture and its components
Tu Le, Institute of Advanced Microelectronics, ECE/CAMBR, University of Idaho
Gregory Donohoe, Institute of Advanced Microelectronics, ECE/CAMBR, University of Idaho
David M. Buehler, Institute of Advanced Microelectronics, ECE/CAMBR, University of Idaho
Pen-Shu Yeh, NASA GSFC
[Figure 4 diagram labels: inputs X, Y, W; three 1/Z delay blocks; configuration inputs Constants (C0, C1), Data Path (DP), Run Time (RT); Processing Element (PE) computing Function (X, Y, W, C0, C1, DP, RT) => output]
Figure 4: General model of the Processing Element
[Figure 3 diagram labels: Algorithm; Simulink model (floating point): design data path, provide input data, data format, run time; SIFOpt; ConfigASM; FPPA C++ simulator (fixed point model) with data path, data format, and run time; PERL (floating fixed point); Validation; result; Golden model result]
Figure 3: Design flow of the graphical design environment for a reconfigurable processor
1. Unconditional PE
• Delay
• Shift right or left
• (X + Y)
• (X - Y)
• (X + Y) * Z
• (X - Y) * Z
• (X * Y - Z)
• (X * Y + Z)
• C0, C1
2. Conditional PE
If (condition) then
    Perform task A
Else
    Perform task B
Figure 5: Unconditional and conditional PE configuration window
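The operations listed for Figure 5 can be written out as a small lookup table, with the conditional PE as a wrapper that selects between two tasks. The dictionary keys, the helper `conditional_pe`, and the way the condition is expressed as a function of the operands are hypothetical; the poster does not specify how conditions are encoded.

```python
# Arithmetic operations named in Figure 5, using the operand names X, Y, Z.
UNCONDITIONAL_OPS = {
    "add":     lambda x, y, z: x + y,
    "sub":     lambda x, y, z: x - y,
    "add_mul": lambda x, y, z: (x + y) * z,
    "sub_mul": lambda x, y, z: (x - y) * z,
    "mul_sub": lambda x, y, z: x * y - z,
    "mul_add": lambda x, y, z: x * y + z,
}


def conditional_pe(condition, task_a, task_b):
    """Build a PE function: if condition holds perform task A, else task B."""
    def pe(x, y, z):
        if condition(x, y, z):
            return task_a(x, y, z)
        return task_b(x, y, z)
    return pe


# Example: absolute difference |X - Y| built from two subtractions.
abs_diff = conditional_pe(
    lambda x, y, z: x >= y,
    UNCONDITIONAL_OPS["sub"],
    lambda x, y, z: y - x,
)
```

For instance, `abs_diff(3, 7, 0)` and `abs_diff(7, 3, 0)` both evaluate to 4, since each call takes a different branch of the same configured PE.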
Example Application using the FPPA
Multi-rate filter bank: Each of the low-pass filters shown in Figure 6 is a four-tap FIR filter with Daubechies coefficients.
Figure 6 shows a filter bank that forms part of a circuit implementing data compression by wavelet decomposition with Daubechies coefficients. As Figure 7 shows, the FPPA is easily configured as this filter bank through the Simulink graphical interface. Down sampling at each of the levels L1, L2, L3, and L4 is accomplished by enabling or disabling the appropriate processing elements. Figure 8 shows the source signal and the output at each down-sampling level.
[Figure 6 diagram labels: Source, followed by four stages of Low Pass (L) and Dec by 2, with outputs L1, L2, L3, and L4]
Figure 6: Block diagram of the multi-rate filter bank
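The filter bank of Figure 6 can be sketched in floating point as a cascade of four stages, each a four-tap low-pass FIR filter followed by decimation by 2. The coefficients are the standard four-tap Daubechies scaling coefficients; the function names, the zero initial filter state, and the choice of which sample of each pair is kept are assumptions of this sketch, which does not model the FPPA's fixed-point arithmetic.

```python
import math

# Standard four-tap Daubechies low-pass coefficients; they sum to sqrt(2).
S3 = math.sqrt(3.0)
D4_LOWPASS = [c / (4.0 * math.sqrt(2.0))
              for c in (1 + S3, 3 + S3, 3 - S3, 1 - S3)]


def fir(signal, taps):
    """Direct-form FIR filter with zero initial state."""
    out = []
    for n in range(len(signal)):
        acc = 0.0
        for k, h in enumerate(taps):
            if n - k >= 0:
                acc += h * signal[n - k]
        out.append(acc)
    return out


def filter_bank(source, levels=4):
    """Return [L1, ..., Ln]; each level low-pass filters, then keeps every 2nd sample."""
    outputs = []
    stage_input = source
    for _ in range(levels):
        stage_input = fir(stage_input, D4_LOWPASS)[1::2]
        outputs.append(stage_input)
    return outputs


# A constant source: each level halves the sample count and scales by sqrt(2).
levels = filter_bank([1.0] * 64)
```

With a constant input, the settled output of level k is sqrt(2) raised to the power k, which gives a quick sanity check on the coefficients and the cascade.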
[Figure 7 diagram labels: Run Time: Down Sampling, with enable patterns 1, 01, 0001, 00000001, and 0000000000000001]
Figure 7: Multi-rate filter implementation using the FPPA
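The run-time patterns in Figure 7 (1, 01, 0001, 00000001, 0000000000000001) each have a period that is a power of two with a single enabled tick. A minimal sketch of how such a pattern produces down sampling follows; the function names, the reading of the patterns left to right in time, and the mapping of one pattern per decimation level are assumptions, since the poster lists the patterns without further explanation.

```python
def enable_pattern(level):
    """Pattern for decimation by 2**level: zeros followed by a single 1."""
    period = 2 ** level
    return "0" * (period - 1) + "1"


def gate(stream, pattern):
    """Keep the samples that arrive on ticks where the cyclic pattern is '1'."""
    return [s for i, s in enumerate(stream)
            if pattern[i % len(pattern)] == "1"]


# Level 0 is the full-rate source; levels 1 to 4 halve the rate each time.
patterns = [enable_pattern(k) for k in range(5)]
kept = gate(list(range(16)), enable_pattern(2))
```

Here `kept` contains every fourth sample of the input stream, which is the effect of enabling a processing element on only one tick out of four.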
Figure 8: Source signal and the output signals at each of the down sampling levels