Machine Assisted Code Generation for Manycore Processors

Jerker Bengtsson
Jerker.Bengtsson@hh.se

EPC meeting at Lindholmen, March 19, 2008

Centre for Research on Embedded Systems

Outline: Introduction, Code Generation, Synchronous Dataflow, Current Status


Introduction

Are new languages necessary for multi-/many-/whatever-cores?

  • The solution is well-defined parallel models of computation (MoC)
  • Dataflow is a good match for manycore targets

Why investigate code generation?

  • Consensus 1: the programming complexity needs to be reduced
  • Consensus 2: we want code with a high degree of portability
  • Alternative: program for a machine API that abstracts shared machine resources
      • (+) solves parts of the portability issues
      • (-) does not reduce the multicore programming complexity

What do we mean by "machine assisted"?

  • Latency optimisation is different from throughput optimisation
    → different problems require different optimisation strategies
  • Provide means to specialize the parallel mapping strategy


Code Generation Framework

[Figure: Abstract code generator framework. Boxes: Program, Machine Parameters, Program Graph, Machine Graph, Configuration Graph, C-code generation & compilation, Manycore Executable; the stages are grouped into a Front End and a Back End.]


Synchronous Dataflow

[Figure: Hierarchical SDF model of a WCDMA Adaptive Multi Rate (AMR) transmitter processing chain. Actors shown: Segm, CRC, Code, Rm, Dtx1, Intrl, Mux, Dtx2, Phy. Each edge is annotated with integer token rates.]

  • The graph shows the top-level composite actors
  • The integers associated with the edges specify the token consumption and production rates of each actor when fired
  • Hierarchical SDF models are composed of atomic and composite actors
  • SDF is a well-defined, restricted subset of dataflow
  • Pipeline, task and data parallelism can be discovered by a code generator
  • The properties of SDF guarantee
      • buffer-bounded execution
      • execution without deadlock
  • Limitations of SDF
      • Expressiveness is limited
      • Not efficient for dynamic computation problems
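
The fixed token rates are what make static analysis possible. For an edge on which actor A produces `prod` tokens per firing and actor B consumes `cons` tokens per firing, a periodic bounded-buffer schedule needs firing counts q that satisfy the balance equation q[A] × prod = q[B] × cons. The following is a minimal sketch of solving these equations, assuming a connected graph. The function name `repetition_vector` and the three-actor example are invented for illustration and are not taken from the WCDMA model or from the framework itself.

```python
from fractions import Fraction
from math import lcm


def repetition_vector(actors, edges):
    """Solve the SDF balance equations q[src] * prod == q[dst] * cons.

    actors: list of actor names (assumed to form one connected graph).
    edges:  list of (src, dst, prod, cons) with positive integer rates.
    Returns the smallest positive integer firing counts per actor, or
    None when the rates are inconsistent (no periodic schedule with
    bounded buffers exists).
    """
    q = {actors[0]: Fraction(1)}
    changed = True
    while changed:
        # Propagate firing ratios along edges until nothing new is learned.
        changed = False
        for src, dst, prod, cons in edges:
            if src in q and dst not in q:
                q[dst] = q[src] * prod / cons
                changed = True
            elif dst in q and src not in q:
                q[src] = q[dst] * cons / prod
                changed = True
    if len(q) != len(actors):
        raise ValueError("graph is not connected")
    # Every edge must balance, including edges that close a loop.
    for src, dst, prod, cons in edges:
        if q[src] * prod != q[dst] * cons:
            return None
    # Scale the rational solution to the smallest integer solution.
    scale = lcm(*(f.denominator for f in q.values()))
    return {a: int(f * scale) for a, f in q.items()}


# Hypothetical example: A produces 2, B consumes 3; B produces 1, C consumes 2.
example = repetition_vector(
    ["A", "B", "C"],
    [("A", "B", 2, 3), ("B", "C", 1, 2)],
)
```

Here `example` maps A, B and C to 3, 2 and 1 firings: A produces 6 tokens that B consumes in two firings, and B produces 2 tokens that C consumes in one.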


What manycore targets?

[Figure: an m × n array of processing elements, PE1,1 to PEm,n. Labelled components: Switch/Router, Instruction mem, Data mem, Instr. execution, Regfile.]


Machine abstraction: computational capacity

Computational capacity is described by $\langle P, p, m, b_l, b_g \rangle$

  • $P$ is the number of cores
  • $p$ is the processing power per core
  • $m$ is local memory size
  • $b_l$ is local memory bandwidth
  • $b_g$ is global shared memory bandwidth


Machine abstraction: network capacity

The network capacity is described by $\langle s_o, s_l, n_b, n_{hl}, r_o, r_l \rangle$

  • $s_o$ is send occupancy
  • $s_l$ is send latency
  • $n_b$ is network buffer capacity
  • $c$ is link bandwidth
  • $n_{hl}$ is network hop latency
  • $r_l$ is receive latency
  • $r_o$ is receive occupancy


Performance functions

  • computation time: $T_p = \lceil r_p / p \rceil$
  • send time: $T_s = \lceil r_{c_{out}} / messlen \rceil \times s_o + T_{blocked}()$
  • network injection time: $s_l + Q_{blocked}()$
  • receive time: $T_r = \lceil r_{c_{in}} / messlen \rceil \times r_o + T_{blocked}()$
  • network extraction time: $r_l + Q_{blocked}()$
  • communication latency for global memory access:
    $T_c = n_{hl} \times nhops + b_l + (L - 1) \times \max(1/c,\ P/b_g)$
  • communication latency for core-to-core communication:
    $T_c = dist \times n_{hl} + (L - 1) \times 1/c$

The state-dependent performance functions $T_{blocked}()$ and $Q_{blocked}()$ can be set constant (if we know the worst case...)

...contact me if you want the details
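
As a rough illustration, the two machine tuples and the performance functions can be written down directly. This is only a sketch under stated assumptions. The slides do not define the units or exact meaning of $r_p$, $r_{c_{out}}$, $r_{c_{in}}$, $messlen$, $L$, $nhops$ and $dist$, so they are treated here as plain numeric inputs. The class and function names are hypothetical, and the blocking term defaults to a constant 0, which assumes no blocking.

```python
from dataclasses import dataclass
from math import ceil


@dataclass(frozen=True)
class Compute:
    P: int       # number of cores
    p: float     # processing power per core
    m: int       # local memory size
    b_l: float   # local memory bandwidth
    b_g: float   # global shared memory bandwidth


@dataclass(frozen=True)
class Network:
    s_o: float   # send occupancy
    s_l: float   # send latency
    n_b: int     # network buffer capacity
    c: float     # link bandwidth
    n_hl: float  # network hop latency
    r_o: float   # receive occupancy
    r_l: float   # receive latency


def compute_time(r_p, cpu):
    """T_p = ceil(r_p / p)."""
    return ceil(r_p / cpu.p)


def send_time(r_c_out, messlen, net, t_blocked=0):
    """T_s = ceil(r_c_out / messlen) * s_o + T_blocked (constant here)."""
    return ceil(r_c_out / messlen) * net.s_o + t_blocked


def receive_time(r_c_in, messlen, net, t_blocked=0):
    """T_r = ceil(r_c_in / messlen) * r_o + T_blocked (constant here)."""
    return ceil(r_c_in / messlen) * net.r_o + t_blocked


def latency_global(nhops, L, cpu, net):
    """T_c for global memory access, term by term as on the slide."""
    return net.n_hl * nhops + cpu.b_l + (L - 1) * max(1 / net.c, cpu.P / cpu.b_g)


def latency_core_to_core(dist, L, net):
    """T_c for core-to-core communication."""
    return dist * net.n_hl + (L - 1) * (1 / net.c)


# Hypothetical machine parameters, chosen only to make the arithmetic easy.
cpu = Compute(P=16, p=8, m=65536, b_l=2, b_g=4)
net = Network(s_o=2, s_l=1, n_b=4, c=1, n_hl=1, r_o=3, r_l=1)
```

With these made-up numbers, `compute_time(100, cpu)` is 13 and `latency_global(3, 5, cpu, net)` is 21.0. In the second case the shared-memory term P/b_g = 4 dominates the link term 1/c = 1.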


Where we are now

  • Machine abstraction
  • The framework is being implemented in Ptolemy http://ptolemy.berkeley.edu/
  • The intermediate representation (IR) is a hierarchical heterogeneous model
      • multicore level is Process Networks
      • core internals are SDF models
  • We can generate the IR from SDF input
      • ...but no clustering or optimisation yet
  • C code can be generated from the IR
      • parallel code using POSIX threads
      • can be modified to generate target-specific code
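
The execution model behind "Process Networks at the multicore level" can be illustrated with a small threaded pipeline. This is a Python stand-in, not the generated C code: each thread plays the role of one core-level process, and the processes communicate only through FIFO channels. The bounded channel capacity, the stage names and the `STOP` marker are assumptions made for this sketch.

```python
import queue
import threading


def run_pipeline(samples, capacity=4):
    """Run source -> scale -> sink as three threads joined by bounded FIFOs."""
    a_to_b = queue.Queue(maxsize=capacity)
    b_to_c = queue.Queue(maxsize=capacity)
    out = []
    STOP = object()  # end-of-stream marker (illustrative convention)

    def source():
        for s in samples:
            a_to_b.put(s)          # blocks while the channel is full
        a_to_b.put(STOP)

    def scale():
        while (tok := a_to_b.get()) is not STOP:   # blocks while empty
            b_to_c.put(2 * tok)
        b_to_c.put(STOP)

    def sink():
        while (tok := b_to_c.get()) is not STOP:
            out.append(tok)

    threads = [threading.Thread(target=f) for f in (source, scale, sink)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return out
```

Because every channel has a single writer and a single reader and all reads block, the output order does not depend on thread scheduling. That determinism is what makes the model attractive as a target for code generation.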


What is going on now?

  • Investigation of clustering and parallelisation strategies
      • User-specified clustering
      • Different types of automated clustering
          • detect and cluster non-parallel actor chains
          • constraint-driven clustering (RT constraints)
          • exploit "hidden" data parallelism
  • New spin-off proposal: codegen for multicore RTOS
      • Problem: for non-trivial systems, we will need a higher degree of run-time flexibility
      • Approach?: Runtime (real-time scheduling) support for multi-cores
      • To approach this problem, we need RT scheduling theory (Hoai Hoang)
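
One simple reading of "detect and cluster non-parallel actor chains" is to fuse actors that can only ever run one after another. The sketch below is one possible heuristic, not the strategy under investigation. It assumes an acyclic graph with at most one edge per actor pair, it ignores token rates, and the function name `cluster_chains` and the example graph are hypothetical.

```python
def cluster_chains(actors, edges):
    """Group actors into maximal linear chains.

    edges: iterable of (src, dst) pairs. Two neighbouring actors are fused
    when src has exactly one successor and dst has exactly one predecessor,
    so no parallelism is lost by mapping them to the same core.
    """
    succ = {a: [] for a in actors}
    pred = {a: [] for a in actors}
    for s, d in edges:
        succ[s].append(d)
        pred[d].append(s)

    def continues_chain(a):
        # True when a's only predecessor has a as its only successor.
        return len(pred[a]) == 1 and len(succ[pred[a][0]]) == 1

    clusters = []
    for a in actors:
        if continues_chain(a):
            continue  # a belongs to a chain that starts at an earlier actor
        chain = [a]
        # Extend the chain while the link to the next actor is one-to-one.
        while len(succ[chain[-1]]) == 1:
            nxt = succ[chain[-1]][0]
            if len(pred[nxt]) != 1:
                break
            chain.append(nxt)
        clusters.append(chain)
    return clusters


# Hypothetical graph: A -> B -> C, then C forks to D and E, which join in F.
clusters = cluster_chains(
    list("ABCDEF"),
    [("A", "B"), ("B", "C"), ("C", "D"), ("C", "E"), ("D", "F"), ("E", "F")],
)
```

For this graph the chain A, B, C becomes one cluster, while D, E and F stay separate because D and E can run in parallel and F joins two inputs.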
