Using a CSP based Programming Model for Reconfigurable Processor Arrays
By: Zain-ul-Abdin [email protected]
"Using a CSP based Programming Model for Reconfigurable Processor Arrays", Zain-ul-Abdin 2
Motivation
• Emergence of new heterogeneous parallel architectures
  – Increased performance
  – Power efficiency
• Traditional methods
  – Automatic parallelization by compilers
  – Use of the thread model of computation
    • Highly non-deterministic
• Use of a concurrent programming model
  – Expresses computations in a productive manner by matching them to the target hardware
  – Supported by a compiler to allow portability
Array of Processors
• Consists of heterogeneous processors with specialized interconnection networks
• Improves performance by exploiting parallelism rather than by scaling clock frequency
• Flexible, thanks to a dynamically reconfigurable interconnection network
• Energy efficient
  – Individual brics can be switched off when not in use
  – The clock frequency of each bric can be optimized
Ambric Programming Model
• A design consists of:
  – Objects: define the functionality, written in either a Java subset or assembly
  – Structured composition, described in aStruct
Ambric: Simple Example
Design Toplevel
design SimpleDesigntop {
  Root_IF root_Inst;
}
interface Root_IF {}
binding CompRoot implements Root_IF {
  simpledesign process1;
  Vio inOut = {NumSources = 1, NumSinks = 1};
  channel c0 = {inOut.out[0], process1.in};
  channel c1 = {process1.out, inOut.in[0]};
}
Object Structure
interface simpledesign {
  inbound in;
  outbound out;
}
binding Javasimpledesign implements simpledesign {
  implementation "simpledesign.java";
}
Object Implementation
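import ajava.io.InputStream;
import ajava.io.OutputStream;

public class simpledesign {
  // Entry point of the object: one inbound and one outbound channel.
  public void run(InputStream<Integer> in, OutputStream<Integer> out) {
    while (true) {
      // Forward each word read from the input channel to the output channel.
      out.writeInt(in.readInt());
    }
  }
}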
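The wiring in the example above (a source feeding one pass-through object, which feeds a sink) can be mimicked in ordinary software as a rough analogy. The sketch below is not Ambric code: it assumes Python threads in place of processors and `queue.Queue(maxsize=1)` in place of hardware channels. The `STOP` marker and the `run_design` helper are hypothetical names introduced only so that the example terminates and can be checked.

```python
import queue
import threading

STOP = object()  # hypothetical end-of-stream marker; the aJava loop itself never ends


def simpledesign(inbound, outbound):
    """Forward every word from inbound to outbound, like the aJava run() loop."""
    while True:
        word = inbound.get()
        outbound.put(word)
        if word is STOP:
            return


def run_design(words):
    """Wire a source and a sink to one object through two bounded channels."""
    c0 = queue.Queue(maxsize=1)  # stands in for channel c0 (inOut.out[0] -> process1.in)
    c1 = queue.Queue(maxsize=1)  # stands in for channel c1 (process1.out -> inOut.in[0])

    def source():
        for word in words:
            c0.put(word)
        c0.put(STOP)

    process1 = threading.Thread(target=simpledesign, args=(c0, c1))
    feeder = threading.Thread(target=source)
    process1.start()
    feeder.start()

    received = []
    while True:
        word = c1.get()
        if word is STOP:
            break
        received.append(word)

    feeder.join()
    process1.join()
    return received
```

Calling `run_design([1, 2, 3])` returns the same words in the same order, which is all the pass-through object is meant to do.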
Why use Occam-pi?
• Language-level support for concurrency
• Provides higher-order combinators that facilitate the composition of re-targetable data-parallel descriptions
• Semantically transparent PAR/SEQ style
• Explicit control of the granularity of parallelism and of data locality
Occam-pi Language
• Based on the ideas of CSP combined with the pi-calculus
• Abstractions for the underlying hardware
  – Processes
  – Channels (unbuffered message passing)
• Rendezvous behavior of channels
  – The receiver blocks until the sender has written the value
  – The sender continues only after the receiver has read the value
Occam-pi Language
• Primitive actions
  – Variable assignment
  – Channel output !
  – Channel input ?
  – PAR
  – SEQ
• Among processes running in parallel, a variable can be written by only one process
  – Likewise, only a single process can read from a channel, and another single process can write to it
PROC SimpleEx()
  INT x, y:
  CHAN OF INT c, d:
  PAR
    SEQ
      c ! 117
      d ? x
    SEQ
      c ? y
      d ! 118
:
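To make the behavior of SimpleEx concrete, here is a minimal Python sketch that emulates it. It is only an approximation, not how Occam-pi is implemented: the `Channel` class is a hypothetical helper that imitates an unbuffered, single-writer, single-reader channel with two one-slot queues, and each branch of the PAR runs as a thread.

```python
import queue
import threading


class Channel:
    """Hypothetical unbuffered channel with rendezvous semantics."""

    def __init__(self):
        self._data = queue.Queue(maxsize=1)
        self._ack = queue.Queue(maxsize=1)

    def send(self, value):
        """Emulates `c ! value`: returns only after the receiver took the value."""
        self._data.put(value)
        self._ack.get()

    def recv(self):
        """Emulates `c ? x`: blocks until a sender has written a value."""
        value = self._data.get()
        self._ack.put(None)
        return value


def simple_ex():
    """Emulate PROC SimpleEx and return the final values of (x, y)."""
    c, d = Channel(), Channel()
    result = {}

    def first_branch():   # SEQ: c ! 117, then d ? x
        c.send(117)
        result["x"] = d.recv()

    def second_branch():  # SEQ: c ? y, then d ! 118
        result["y"] = c.recv()
        d.send(118)

    # PAR: start both branches and wait for both to finish.
    branches = [threading.Thread(target=first_branch),
                threading.Thread(target=second_branch)]
    for branch in branches:
        branch.start()
    for branch in branches:
        branch.join()
    return result["x"], result["y"]
```

After both branches finish, `x` holds 118 and `y` holds 117. Each variable is written by exactly one branch, which matches the single-writer rule stated on the slide.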
Compilation Methodology
• Implemented a backend for Ambric in Tock (a translator from Occam to C, developed at the University of Kent)
• Staged compilation
• Native SOPL code generation for Ambric
  – Uses the concurrency of Occam-pi
  – Reduced memory footprint
Occam-Ambric Compilation
Ambric-related Transformations
• Introduction of channel-end specifiers
• Enables use of flat data parallelism
• Replicator transformations:
  – SEQ replicators become for loops
  – PAR replicators are unrolled into multiple PROCs
• Emission of aStruct structural interface and binding code for each PROC
• Emission of aJava class code corresponding to each PROC
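The two replicator transformations can be illustrated with a toy Python sketch. This is not the Tock backend and does not reproduce its output: the function names, the `name_index` naming scheme for unrolled instances, and the exact text of the emitted strings are all assumptions made for illustration. Only the shape of the instance declaration is borrowed from the aStruct example earlier in the deck.

```python
def unroll_par_replicator(proc_name, base, count):
    """Unroll `PAR i = base FOR count` into one instance declaration per PROC.

    The `<proc> <proc>_<i>;` form is a hypothetical naming scheme.
    """
    return [f"{proc_name} {proc_name}_{i};" for i in range(base, base + count)]


def seq_replicator_to_for(index, base, count, body):
    """Render `SEQ index = base FOR count` as a Java-style for loop (illustrative)."""
    return (
        f"for (int {index} = {base}; {index} < {base + count}; {index}++) {{\n"
        f"    {body}\n"
        f"}}"
    )


if __name__ == "__main__":
    for line in unroll_par_replicator("worker", 0, 4):
        print(line)
    print(seq_replicator_to_for("i", 0, 8, "sum += x[i];"))
```

The asymmetry is the point: a SEQ replicator stays inside one process as a loop, while a PAR replicator multiplies the number of processes, and so the number of processors the design occupies.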
1D-Discrete Cosine Transform
Performance Results
• 8-point DCT Implementations
Implementation                     NP    CLFO    Throughput
Serial DCT                         1     220     TS
Coarse-grained Parallelized DCT    4     51      3TS
Fine-grained Parallelized DCT      34    33      27TS
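For readers unfamiliar with the kernel being measured, an 8-point 1-D DCT can be written as a short serial routine. The slides do not say which DCT variant or scaling was used, so the sketch below assumes the unnormalized DCT-II, X[k] = sum over n of x[n] * cos(pi/8 * (n + 1/2) * k), evaluated directly. It is a reference computation in Python, not the author's Occam-pi code, and `dct_8point` is a name chosen for this example.

```python
import math


def dct_8point(samples):
    """Unnormalized 1-D DCT-II of eight samples (direct O(N^2) evaluation)."""
    n_points = 8
    if len(samples) != n_points:
        raise ValueError("expected exactly 8 samples")
    return [
        sum(x * math.cos(math.pi / n_points * (n + 0.5) * k)
            for n, x in enumerate(samples))
        for k in range(n_points)
    ]


if __name__ == "__main__":
    print(dct_8point([1.0] * 8))
```

Each of the eight outputs is an independent sum of eight products, which is what leaves room for both coarse-grained and fine-grained parallel decompositions.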
Conclusions
• Proposed the use of Occam-pi for programming a coarse-grained processor architecture
• Raises the abstraction level without compromising efficiency
• Future work: extend the compiler to support the mobility features of Occam-pi for reconfigurable logic