The information contained in this document and any attachments are the property of THALES. You are hereby notified that any review, dissemination, distribution, copying or otherwise use of this document is strictly prohibited without Thales prior written approval. © THALES 2011. Template trtp version 7.0.8
A Town close to Madrid
www.thalesgroup.com
Research & Technology
2012/07/17
www.flextiles.eu
Philippe MILLET, SAMOS
Project coordinator: THALES
Funding budget: 3,670,000€
Starting date: 15/10/2011
Duration: 36 months
Towards Future Adaptive MultiProcessor Systems-On-Chip: an Innovative Approach for Flexible Architecture
Some future applications within THALES
Embedded real-time market:
- low power consumption
- low volumes
- long lifetime (~20 years)
Adapt to the environment: dynamicity, flexibility and dependability
Examples: smart camera, cognitive radio, UAV
More than static dataflow
Manycore is a major issue for the industry
Programmability (industrial view):
- Time to market
- SW development costs
- Reuse of legacy code
What about manycores? Homogeneous? Heterogeneous?
Why take risks with manycores?
We want to carry on as in the good old days: compile "without thinking" and still get performance (keep it simple for as long as possible)!
Homogeneous manycores: good at parallelism
Parallelisation raises computing power and lowers power consumption.
Homogeneity eases programming (C-like languages + tools), but:
- Maximum performance is reached only with static applications: automatic optimisation (data parallelism), static allocation and scheduling.
- Otherwise: average performance, no guarantee.
Examples:
- Tilera Tile-Gx100: 100 cores, programmed in C/C++
- Nvidia Fermi: 512 cores, programmed in OpenCL/CUDA (C-like + kernels)
- Kalray MPPA: 256 cores, programmed in SigmaC (C++-like, for dataflow)
Parallelisation is not enough: did we miss something?
Homogeneous?
Parallelisation is not enough
Customization: optimised for the job!
Australian Desert Animal: the Thorny Devil
Customized/customizable chips vs. FPGA
- Xilinx Zynq: FPGA with a dual ARM A9 core (MPCore with reconfiguration capabilities). POOR parallelization, GOOD customization.
- ST P2012 (heterogeneous manycore fabric: a fabric controller core and a fabric of clusters). GOOD parallelization, POOR customization. Once done, it is dedicated to a specific domain of applications and is affordable only for large series of products.
Main issue: domain dedication. The same holds for MPSoCs (TI OMAPs).
FlexTiles proposes
A 3D stacked chip based on:
- a manycore layer (GPPs and DSPs)
- an FPGA layer
- a 3D NoC
GOOD parallelization, GOOD customization
Customization at low price
Opportunity: self-adaptive capabilities
Future application needs
Self-adaptive?
- Adapt the architecture to application requests in "real time"
- Improve yield and extend the lifetime of sub-micron technologies: fault tolerance
- Increase energy efficiency: give the right task to the best available processor, finalize the mapping at runtime
- Temperature management: re-mapping
How to program it?
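The runtime-mapping idea in the list above ("give the right task to the best available processor") can be pictured with a small sketch. It is illustrative only: the node names, the energy figures and the `pick_node` helper are assumptions made for this example, not part of FlexTiles.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    kind: str            # "GPP", "DSP" or "eFPGA"
    available: bool = True


# Hypothetical energy cost (arbitrary units) of each task on each kind of node.
ENERGY = {
    "fft":     {"GPP": 9.0, "DSP": 3.0, "eFPGA": 1.5},
    "control": {"GPP": 1.0},
}


def pick_node(task, nodes):
    """Return the available node with the lowest energy cost for `task`."""
    costs = ENERGY[task]
    candidates = [n for n in nodes if n.available and n.kind in costs]
    if not candidates:
        raise RuntimeError(f"no available node can run {task!r}")
    return min(candidates, key=lambda n: costs[n.kind])


nodes = [Node("gpp0", "GPP"), Node("dsp0", "DSP"), Node("efpga0", "eFPGA")]
first = pick_node("fft", nodes)        # the eFPGA is the cheapest choice
nodes[2].available = False             # e.g. a fault or a hot spot is reported
second = pick_node("fft", nodes)       # the mapping falls back to the DSP
```

The decision is taken when the task is about to run, so the same call gives a different answer once a node is marked unavailable.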
Holistic Approach: Model of Execution
Goals:
- Programming efficiency
- Self-adaptive capabilities
Building blocks:
- Programming model
- Model of Computation
- Model of Execution
- Optimisation tools
- Common interfaces
- Relocation strategies
- Flexible hardware
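Model of Execution
Master-slave execution model:
- Master nodes: GPP nodes
- Slave nodes: DSP nodes and eFPGA nodes (accelerator nodes)
[Figure: a GPP node and an accelerator node, each attached to the NoC through a network interface (NI). The accelerator node sits behind an Accelerator Interface (AI) that carries accelerator requests, control/status, DMA requests and data.]
The AI keeps hardware and software independent of accelerator specificities.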
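A minimal sketch of what this independence means for software, assuming a single abstract interface shared by all slaves. The class and method names (`AcceleratorInterface`, `run`, `offload`) are hypothetical and are not the FlexTiles API.

```python
from abc import ABC, abstractmethod


class AcceleratorInterface(ABC):
    """Common face shown to the master, whatever accelerator sits behind it."""

    @abstractmethod
    def run(self, data):
        """Process one block of data and return the result."""


class DspNode(AcceleratorInterface):
    def run(self, data):
        return [2 * x for x in data]          # stands in for DSP code


class EfpgaNode(AcceleratorInterface):
    def run(self, data):
        return [x << 1 for x in data]         # stands in for a hardware IP


class GppMaster:
    """Master node: talks to its slaves through the common interface only."""

    def __init__(self, slaves):
        self.slaves = slaves

    def offload(self, slave_name, data):
        return self.slaves[slave_name].run(data)


master = GppMaster({"dsp0": DspNode(), "efpga0": EfpgaNode()})
same = master.offload("dsp0", [1, 2, 3]) == master.offload("efpga0", [1, 2, 3])
```

The master code does not change when a function moves from one kind of slave to another, which is the property the slide attributes to the AI.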
Model of Computation & Programming Model
[Figure: holistic approach diagram]
Model of Computation & Programming Model
- Optimisation and parallelisation tools work on static applications.
- Find static clusters inside the applications, based on the SDF/CSDF MoC.
- Bring dynamicity with a higher hierarchical level.
Actor (SDF, CSDF MoC): consumes and produces tokens of data with predefined, static rules.
[Figure: a cluster group holding three static clusters of actors (state 1, state 2, state 3); a states management block selects among them on events. Legend: Act = actor (~ one or more tasks); static cluster; cluster input/output; cluster group managed by one states management block; cluster group input/output.]
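Because SDF actors have fixed token rates, a tool can compute at compile time how often each actor must fire per iteration (the repetition vector) by solving the balance equations. The sketch below shows this for a small chain; the function name and the example rates are assumptions made for illustration, and it requires Python 3.9 or later for `math.lcm`.

```python
from fractions import Fraction
from math import lcm


def repetition_vector(actors, edges):
    """Solve q[src] * produced == q[dst] * consumed for a connected SDF graph.

    edges: list of (src, dst, produced, consumed) token rates.
    Returns the smallest positive integer firing counts, or raises if the
    rates are inconsistent.
    """
    q = {actors[0]: Fraction(1)}
    pending = list(edges)
    while pending:
        progressed = False
        for edge in list(pending):
            src, dst, prod, cons = edge
            if src in q and dst in q:
                if q[src] * prod != q[dst] * cons:
                    raise ValueError("inconsistent rates")
            elif src in q:
                q[dst] = q[src] * prod / cons
            elif dst in q:
                q[src] = q[dst] * cons / prod
            else:
                continue                      # neither end known yet
            pending.remove(edge)
            progressed = True
        if not progressed:
            raise ValueError("graph is not connected")
    scale = lcm(*(f.denominator for f in q.values()))
    return {a: int(f * scale) for a, f in q.items()}


# A produces 2 tokens per firing and B consumes 3; B produces 1 and C consumes 2.
q = repetition_vector(["A", "B", "C"], [("A", "B", 2, 3), ("B", "C", 1, 2)])
```

Here A fires 3 times, B twice and C once per iteration: 3 x 2 tokens produced match 2 x 3 consumed, and 2 x 1 match 1 x 2. Static allocation and scheduling rely on this kind of fixed result.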
Programming Model
[Figure: an application built from five cluster groups (cluster group 1 to cluster group 5) fed by sensor data. Each cluster group has its own states management block, which reacts to events and selects the active static cluster of actors (state 1, state 2, or nop). Scatter and gather actors distribute and collect data. Legend: Act = actor; static cluster; cluster input/output; cluster group managed by one states management block; cluster group input/output.]
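The two-level structure in the figure (static clusters inside, event-driven state selection outside) can be sketched as a small state machine. Everything named here (`ClusterGroup`, the event names, the toy actors) is hypothetical and only illustrates the idea.

```python
class ClusterGroup:
    """A group of static clusters; exactly one of them is active at a time."""

    def __init__(self, clusters, transitions, initial):
        self.clusters = clusters          # state name -> list of actor functions
        self.transitions = transitions    # (state, event) -> next state
        self.state = initial

    def on_event(self, event):
        """States management: switch the active cluster if a rule matches."""
        self.state = self.transitions.get((self.state, event), self.state)

    def fire(self, token):
        """Run the actors of the active cluster in their fixed (static) order."""
        for actor in self.clusters[self.state]:
            token = actor(token)
        return token


group = ClusterGroup(
    clusters={
        "state 1": [lambda x: x + 1, lambda x: x * 2],
        "state 2": [lambda x: x - 1],
        "nop": [],
    },
    transitions={("state 1", "low_light"): "state 2", ("state 2", "idle"): "nop"},
    initial="state 1",
)
a = group.fire(3)            # state 1: (3 + 1) * 2
group.on_event("low_light")
b = group.fire(3)            # state 2: 3 - 1
```

Each cluster stays static and can be optimised offline, while the dynamic behaviour is confined to the transition table.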
Programming Model
[Figure: the five-cluster-group application graph again, now with one static cluster shown as two copies, state 1.1 and state 1.2, placed between the scatter and gather actors.]
Programming efficiency: Model of Computation
[Figure: holistic approach diagram]
Programming efficiency: Model of Computation
Tool flow and MoC
The tool flow is based on SpearDE (Thales) and Cosy (ACE).
Inputs:
- Application (C code), or manual graphic input + C kernels
- Architecture representation (master cores, slave cores)
- Library of IPs
Stages:
- Conversion of C to the SpearDE representation (Cosy)
- Data parallelisation and mapping (SpearDE)
- Streaming optimisation (Cosy)
- Compilation & link (Cosy)
- Accelerator compiler or C2VHDL tools
Output: binaries
Modularity and scalability: common interfaces
- Homogeneous GPP nodes
- Heterogeneous accelerator nodes
- Generic interfaces
[Figure: GPP nodes, a DSP node, dedicated accelerator nodes, an eFPGA domain (reconfigurable HW accelerators), a configuration controller, a DDR controller and I/O, all attached to the NoC through network interfaces (NI). Accelerators sit behind Accelerator Interfaces (AI).]
Relocation Strategies
[Figure: holistic approach diagram]
Relocation Strategies
[Figure: three snapshots over time of the same actor graph (A1.1 to A1.4, A2.1 to A2.4, A3, A4, A5). Actors are relocated between snapshots; the targets shown change from FPGA / GPP / FPGA, to DSP / GPP / DSP, to DSP / DSP / DSP.]
Self-adaptation
- Heterogeneous hardware, controlled by the kernel and the virtualization layer
- Accelerator/virtual code (examples shown: Ethernet, IMDCT, MatrixMult), with dynamic allocation / binding
- Control loop: MONITORING of the SYSTEM, DIAGNOSIS (O = F(L)), ACTION
[Figure: the NoC-based platform with GPP nodes, a DSP node, dedicated accelerator nodes, the eFPGA domain (reconfigurable HW accelerators), configuration controller, DDR controller and I/O.]
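The monitoring, diagnosis and action loop can be sketched as three small functions. This is a toy example under stated assumptions: the temperature threshold, the node and task names and the "move to the coolest node" policy are all invented here, and a real system would also need an implementation of each task for the target node.

```python
TEMP_LIMIT = 85.0   # hypothetical threshold, in degrees Celsius


def monitor(temperatures):
    """MONITORING: return a snapshot of per-node temperatures."""
    return dict(temperatures)


def diagnose(snapshot):
    """DIAGNOSIS: list the nodes that are too hot."""
    return [node for node, temp in snapshot.items() if temp > TEMP_LIMIT]


def act(mapping, hot_nodes, snapshot):
    """ACTION: move every task on a hot node to the coolest remaining node."""
    cool = [n for n in snapshot if n not in hot_nodes]
    if not cool:
        return mapping                      # nowhere to go: keep the mapping
    target = min(cool, key=snapshot.get)
    return {task: (target if node in hot_nodes else node)
            for task, node in mapping.items()}


mapping = {"imdct": "dsp0", "matrix_mult": "efpga0"}
temps = {"gpp0": 50.0, "dsp0": 91.0, "efpga0": 60.0}
snapshot = monitor(temps)
new_mapping = act(mapping, diagnose(snapshot), snapshot)
```

In this run `dsp0` exceeds the limit, so `imdct` moves to `gpp0` while `matrix_mult` stays where it is.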
Flexible Hardware
[Figure: holistic approach diagram]
FlexTiles: a 3D stack chip
- Homogeneous manycore: tiles connected by a NoC
- 3D stacked reconfigurable layer: new dynamic reconfigurable technology
[Figure: a 3x3 grid of tiles with the reconfigurable layer stacked on it.]
FlexTiles: a 3D stack chip
[Figure: the same chip view repeated over three slides, each showing one operation: map accelerated functions, duplicate, migrate.]
Holistic Approach
[Figure: holistic approach diagram]
3D network
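NoC QoS
[Figure: chip-level view. A GPP (with icache, dcache and data local memory, dLMEM), an eFPGA and a DSP (each with instruction and data local memories, iLMEM and dLMEM), on-chip shared memory (shMEM) and a DDR controller are connected through network interfaces (NI). Traffic is split into control, bitstream, data, instruction and test/debug NoCs.]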
ANoC (CEA)
GALS: asynchronous logic in nodes, locally synchronous cores
- highly scalable
- between nodes: no global clock, not even a local clock
- power efficient and dependable
- packet switching
- wormhole protocol
- low latency
Aethereal NoC (TUe)
- Guaranteed levels of service and performance
- Contention-free routing by construction
- Wormhole routing specified at design time
- Globally synchronous, with time slots
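To illustrate how time slots can make routing contention-free by construction, here is a simplified slot-table allocator. It is a sketch of the general time-division idea only, not the actual Aethereal allocator; the link names, the table layout and the rule that a flit advances by one slot per hop are assumptions of this example.

```python
def allocate(table, path, num_slots):
    """Reserve one slot per link along `path`; return the start slot or None.

    table: dict mapping (link, slot) -> connection id, updated in place.
    A flit that leaves in slot s crosses the i-th link of the path in
    slot (s + i) % num_slots.
    """
    conn_id = len(set(table.values()))
    for start in range(num_slots):
        wanted = [(link, (start + hop) % num_slots) for hop, link in enumerate(path)]
        if all(key not in table for key in wanted):
            for key in wanted:
                table[key] = conn_id
            return start
    return None                                     # no free slot sequence


table = {}
s0 = allocate(table, ["A-B", "B-C"], num_slots=4)   # first connection
s1 = allocate(table, ["A-B", "B-D"], num_slots=4)   # shares link A-B with it
s2 = allocate(table, ["E-B", "B-C"], num_slots=4)   # shares link B-C with it
```

Since every (link, slot) pair belongs to at most one connection, two connections can never compete for a link at run time; the price is that the table is fixed ahead of time.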
Conclusion
- Parallelisation is the only way to reach high-performance computing at low power consumption.
- But parallelisation is not enough: customisation is also necessary, and it is only affordable for high volumes.
- Reconfigurable customisation is the solution: it increases accessibility to heterogeneous manycore technology and offers self-adaptive capabilities.
Our proposal: a 3D stacked chip and …
A 3D stacked chip based on:
- a manycore layer (GPPs and DSPs)
- an FPGA layer
- a 3D NoC (3D network)
…a complete platform
- Toolchain: application; parallelisation and partitioning; compilation (relocatable binary code); synthesis, P&R (relocatable bitstream)
- Operating library: Operating Library API; virtualisation layer; kernel; resource monitoring & allocation (MONITORING of the SYSTEM, DIAGNOSIS O = F(L), ACTION); Hardware Abstraction Layer API; Hardware Abstraction Layer
- Heterogeneous manycore: hardware nodes
Consortium and questions
8 partners in 5 countries
Partners & Third Party   Country          Main scientific and technical contributions
THALES                   France           Infrastructure and applications
KIT                      Germany          Virtualisation layer
TUE                      Netherlands      Kernel; NoC
CSEM                     Switzerland      DSP
CEA                      France           NoC; 3D stacking
UR1                      France           Reconfigurable technology
SUNDANCE                 United Kingdom   FPGA demonstrator
ACE                      Netherlands      Parallelisation and compilation tools
FlexTiles
With FlexTiles, the industry will be able to take the plunge into the manycore world!