November 22, 2002, ENE, UnB, Brasilia. Data-stream-based Computing, Enabling Technology for Reconfigurable Computing. Friday, November 22, 2002, 17.00 hrs. Reiner Hartenstein*, Kaiserslautern University of Technology (TU Kaiserslautern). *) IEEE fellow.
Seminar by Prof. Dr. José Camargo da Costa
Data-stream-based Computing, Enabling Technology for Reconfigurable Computing
Friday, November 22, 2002, 17.00 hrs.
Reiner Hartenstein*
Kaiserslautern University of Technology (TU Kaiserslautern)
November 22, 2002, ENE, UnB, Brasilia
*) IEEE fellow
© 2002, [email protected] http://hartenstein.de
TU Kaiserslautern, Xputer Lab
>> Microelectronics History
• Microelectronics History
• fine grain and coarse grain Morphware
• Anti Matter of Computing
• Anti Machine and its Resources
• Problems to be solved
http://www.uni-kl.de
The History of Paradigm Shifts
“Mainstream Silicon Application is switching every 10 Years”
[Figure: Makimoto’s Wave (published in 1989), swinging between standard and custom over the years 1957, 1967, 1977, 1987, 1997, 2007. Labels: TTL; LSI, MSI; µproc., memory; ASICs, accel’s; morphware; 1st Design Crisis; 2nd Design Crisis.]
“The Programmable System-on-a-Chip is the next wave“
The Impact of Makimoto’s Paradigm Shifts
[Figure: Makimoto’s Wave, standard vs. custom, 1957 to 2007. Labels: TTL; LSI, MSI; µproc., memory; ASICs, accel’s; reconfigurable. Source: Dr. Makimoto, FPL 2000 keynote.]
• Personalization (CAD) before fabrication
• Procedural personalization via RAM-based Machine Paradigm
• Software Industry’s Secret of Success
• structural personalization: RAM-based, before run time
• Repeat Success Story by new Machine Paradigm!
Makimoto’s 3rd wave
The next EDA Industry Revolution
EDA industry paradigm switching every 7 years (McKinsey Curves [Keutzer / Newton]):
• 1978 Transistor entry: Applicon, Calma, CV ...
• 1985 Schematics entry: Daisy, Mentor, Valid ...
• 1992 Synthesis (HDLs): Cadence, Synopsys ...
• 1999 (Co-) Compilation: data-stream-based DPAs [Hartenstein]
Von Neumann does not support Morphware
“The Programmable System-on-a-Chip is the next wave“
Ubiquitous embedded systems
20 billion µprocessors (2001)
> 90% in embedded systems
10 times more programmers will write embedded applications than computer software by 2010
That’s where our graduates will go
Embedded systems means:
• hardware / software co-design
• configware / software co-design
• hardware / configware / software co-design
Embedded Systems Requirement: Hardware/Configware and Software as Alternatives
[Figure: an Algorithm is partitioned between Software and Hardware, Configware. Alternatives shown: Software only; Software & Hardw/Configw; Hardw/Configw only.]
>> fine grain and coarse grain Morphware
• Microelectronics History
• fine grain and coarse grain Morphware
• Anti Matter of Computing
• Anti Machine and its Resources
• Problems to be solved
http://www.uni-kl.de
Top 4 FPGA Manufacturers 2000
Top 4 PLD Manufacturers 2000, total: $3.7 Bio
• Xilinx 42%
• Altera 37%
• Lattice 15%
• Actel 6%
• [Dataquest] > $7 billion by 2003
• FPGAs going into every type of application, also SoC
• fastest growing segment of semiconductor market
You do not need specific silicon!
[Figure: NRE and mask cost [Dataquest] and mask set cost [eASIC], cost in mio $ (1 to 4), over no. of masks (12, 12, 16, 20, 26, 28, 30, >30) and feature size (0.8, 0.6, 0.35, 0.25, 0.18, 0.15, 0.13, 0.1, 0.07).]
Configware and EDA as the Key Enabler
• Growing no. of independent configware houses (soft IP core vendors) and design services provide libraries of "pre-fabricated" re-usable IP cores
• Emerging separate EDA software market - FPGA synthesis [2001: Dataquest]:
• Synplicity 57%
• Mentor 37%
• Synopsys 7%
Throughput vs. Efficiency
[Figure: MOPS / mW (log scale, 0.001 to 1000) over µ feature size (2, 1, 0.5, 0.25, 0.13, 0.1, 0.07). Curves: hardwired; FPGAs (reconfigurable logic); rDPAs (reconfigurable computing)*; instruction set processors; DSP; standard microprocessor. Source: T. Claasen et al.: ISSCC 1999. *) R. Hartenstein: ISIS 1997]
[Figure: 1 Bit CLB, area used by application compared with resources needed for reconfigurability (cells marked L and S).]
Wiring by abutment: 32 Bit example
Commercial rDPAs
• XPU128
• ACM: Quicksilver Tech
• XPU family (IP cores): PACT AG., Munich
http://pactcorp.com
SNN filter KressArray Mapping Example
array size: 10 x 16 = 160 rDPUs
[Figure: the SNN filter mapped onto the array, with data streams. Legend: rDPU not used; used for routing only (rout thru only); operator and routing; port location marker; backbus connect.]
http://kressarray.de
KressArray Family generic Fabrics: a few examples
• Select Function Repertory
• select Nearest Neighbour (NN) Interconnect: an example
• Select mode, number, width of NN ports
• more NN ports: rich Rout Resources
• rout-through only; rout-through and function
• Examples of 2nd Level Interconnect: layouted over rDPU cell - no separate routing areas!
[Figure labels: 16, 32, 8, 24, 4, 2, rDPU]
http://kressarray.de
Antimatter of Computing is available
• Using FPGAs (fine grain morphware) has been just Logic Synthesis on a strange platform
• Coarse Grain rDPAs (Reconfigurable Computing): a fundamental Paradigm Shift
• up several abstraction levels
• Data-stream-based Computing
>> Anti Matter of Computing
• Microelectronics History
• fine grain and coarse grain Morphware
• Anti Matter of Computing
• Anti Machine and its Resources
• Problems to be solved
http://www.uni-kl.de
The anti universe
•Paul Dirac predicted a complete anti universe consisting of antimatter
•“There are regions in the universe which consist of antimatter ..... But there are asymmetries”
•when a particle hits its antiparticle, both are converted into energy: Annihilation
•We are not aware that there is a new area in computing sciences which consists of antimatter of computing
•Reconfigurable Computing is made from this antimatter: data-stream-based computing
anti particles
•1928: Paul Dirac: „there should be an anti electron having positive charge“ (Nobel prize 1933)
•1932: Carl David Anderson detected this „positron“ in cosmic radiation (Nobel prize 1936)
•1954: new accelerators: cyclotron, like Berkeley‘s Bevatron
•1955: Owen Chamberlain et al. create anti proton on Bevatron
•1956: anti neutron created on Bevatron
•1965: creation of a deuterium anti nucleus at CERN
•1995: hydrogen anti atom created at CERN by forcing positron and anti proton to merge at very low energy
[Figure: hydrogen and anti hydrogen]
Matter & Antimatter: Atom and Anti Atom
[Figure: The World of Matter, machine paradigm: the Atom, with an Electron (-) spinning around a positive (+) nucleus. Anti Matter, machine paradigm: Anti Atom, with a Positron (+) spinning around a negative (-) nucleus.]
Matter & Antimatter of Informatics: Machine and Anti Machine
Machine paradigm: „von Neumann“ (CPU, instruction stream spinning)
• 1936 1st electronic computer (Konrad Zuse)
• 1946 v. N. machine paradigm
• 1971 1st microprocessor (Ted Hoff)
Anti Machine paradigm (DPU, data stream spinning)
• 1979 „data streams“ (systolic array: Kung / Leiserson)
• 1990 anti machine paradigm published
• 1995 rDPA / DPSS (supersystolic: Rainer Kress)
• novel compilation techniques
all ingredients available
Matter vs. antimatter: CPU vs. DPU
[Figure: CPU (+) = DataPath plus instruction sequencer, driven by an instruction stream. DPU (-) = DataPath Unit, driven by data streams. Note: there are asymmetries.]
heavy anti atoms: DPA = DPU array
[Figure: a DPA built from several DPUs (-), with coherent data streams (+) spinning around.]
Parallelism by Concurrency
[Figure: several atoms (+ -) side by side.]
independent instruction streams: difficult ...
>> Anti Machine and its Resources
• Microelectronics History
• fine grain and coarse grain Morphware
• Anti Matter of Computing
• Anti Machine and its Resources
• Problems to be solved
http://www.uni-kl.de
Dichotomy of machine paradigms
[Figure: on one side the CPU = DPU plus instruction sequencer, fed by an instruction stream from memory M. On the other side an (r)DPU or (r)DPU Array ((r)DPA), fed by data streams from memories M, each an asM with an address generator. Note: there are asymmetries.]
Terminology: DPU versus CPU ...
• DPU: data path unit
• DPA: DPU array
• GA: gate array
• rDPU: reconfigurable DPU
• rDPA: reconfigurable DPA
• rGA: reconfigurable GA
• DPU is no CPU: there is nothing central - like in a DPA
[Figure: CPU = DPU plus instruction sequencer; (r)DPA = array of (r)DPUs.]
Terminology: Digital System Platforms clearly distinguished
platform | source running on it | machine paradigm
hardware | (not running on it) | none
morphware, fine grain: rGA (FPGA) | configware | none
morphware, coarse grain: rDPU, rDPA (reconfigurable data stream processor) | flowware & configware | anti machine
data stream processor (hardwired) | flowware | anti machine
instruction stream processor | software | von Neumann machine
flowware defines ....
[Figure: a DPA with input data streams entering and output data streams leaving; each stream is drawn over the axes time and port #.]
... which data item at which time at which port
flowware manipulates the data counter(s) ...
... software manipulates the program counter
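As a minimal sketch of this idea, flowware can be pictured as a schedule that fixes which data item arrives at which port at which time. The Python below is illustrative only: the names `skewed_schedule` and `items_at` are hypothetical, and the diagonal skew (item j of stream p arrives at time p + j, a choice common for systolic arrays) is an assumption, not something stated on the slides.

```python
def skewed_schedule(streams):
    """Map (time, port) -> data item for a list of input streams.

    Assumption (not from the slides): stream p feeds port p, and its
    j-th item arrives at time p + j, i.e. the streams are skewed
    diagonally as is common for systolic arrays.
    """
    schedule = {}
    for port, stream in enumerate(streams):
        for j, item in enumerate(stream):
            schedule[(port + j, port)] = item
    return schedule


def items_at(schedule, time):
    """Return {port: item} for everything that arrives at the given time step."""
    return {port: item for (t, port), item in schedule.items() if t == time}


streams = [["a0", "a1", "a2"], ["b0", "b1", "b2"], ["c0", "c1", "c2"]]
schedule = skewed_schedule(streams)
for t in range(5):
    print(t, items_at(schedule, t))
```

In this picture no program counter appears at all: the order of execution follows from where the data counters place each item in time and space.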
Configware / Flowware Compilation
[Figure: compilation flow from a high level source program via an rDPA intermediate form. The mapper produces configware for the r. DataPath Array; the scheduler produces flowware for the data sequencers (address generators) of the asM memories M, which exchange data streams with the array. Further label: wrapper.]
... for a Stream-based Soft Machine
[Figure: block diagram with Compiler, rDPA, Scheduler, Sequencers (data stream generator), Memory (data memory) made of several memory banks, and “instructions”.]
JPEG zigzag scan pattern
Flowware language example (MoPL)
*> Declarations
EastScan is
  step by [1,0]
end EastScan;
SouthScan is
  step by [0,1]
end SouthScan;
SouthWestScan is
  loop 8 times until [1,*]
    step by [-1,1]
  endloop
end SouthWestScan;
NorthEastScan is
  loop 8 times until [*,1]
    step by [1,-1]
  endloop
end NorthEastScan;
HalfZigZag is
  EastScan
  loop 3 times
    SouthWestScan
    SouthScan
    NorthEastScan
    EastScan
  endloop
end HalfZigZag;

goto PixMap[1,1]
HalfZigZag;
SouthWestScan
uturn (HalfZigZag)

[Figure: pixel map with axes x and y; the data counter traces HalfZigZag in both halves of the block.]
The same language principles
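The scan that the MoPL example describes can be reproduced with a short Python sketch. This is not a translation of the MoPL program: the function name `zigzag_scan` is hypothetical, and the HalfZigZag / uturn structure is replaced by a simple border test, which is an assumption made for brevity. What it keeps from the slide is the idea that a data counter is moved only by the four step vectors east, south, south-west and north-east, starting at position [1,1].

```python
def zigzag_scan(n=8):
    """Return the (x, y) positions of an n x n block in JPEG zigzag order.

    Positions are 1-based like PixMap[1,1] in the MoPL example; x grows
    to the east and y to the south. The data counter is moved only by
    the four step vectors used there.
    """
    EAST, SOUTH = (1, 0), (0, 1)
    SOUTH_WEST, NORTH_EAST = (-1, 1), (1, -1)

    x, y = 1, 1
    path = [(x, y)]
    while (x, y) != (n, n):
        if (x + y) % 2 == 0:          # this diagonal is traversed north-east
            if x == n:
                dx, dy = SOUTH        # reached the east border
            elif y == 1:
                dx, dy = EAST         # reached the north border
            else:
                dx, dy = NORTH_EAST
        else:                         # this diagonal is traversed south-west
            if y == n:
                dx, dy = EAST         # reached the south border
            elif x == 1:
                dx, dy = SOUTH        # reached the west border
            else:
                dx, dy = SOUTH_WEST
        x, y = x + dx, y + dy
        path.append((x, y))
    return path


path = zigzag_scan(8)
print(path[:10])
```

The first positions follow the MoPL steps: one EastScan, a SouthWestScan down to x = 1, one SouthScan, a NorthEastScan up to y = 1, and so on.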
GAG Slider Model
GAG: Generic Address Generator
[Figure: a GAG built from a Limit Stepper, a Base Stepper and an Address Stepper, shown twice. Labels: B0, B, L0, L, A; the sliders are drawn as brackets [ ] between floor and ceiling.]
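To make the slider picture more concrete, here is a rough Python sketch of one possible reading of it. The semantics are assumptions and are not taken from the slides: the address stepper sweeps from the base slider to the limit slider, both sliders then move by fixed increments, and generation stops once a slider would leave the range between floor and ceiling. The function name `gag_addresses` and the parameter names `db`, `dl`, `da` and `max_sweeps` are hypothetical.

```python
def gag_addresses(b0, db, l0, dl, da, floor, ceiling, max_sweeps=1000):
    """Yield addresses from a simplified slider-style address generator.

    Assumed semantics (illustrative, not taken from the slides):
    - the address stepper sweeps from the base slider to the limit
      slider in steps of da;
    - after each sweep the base slider moves by db and the limit
      slider by dl;
    - generation stops when a slider leaves [floor, ceiling], when the
      base passes the limit, or after max_sweeps sweeps.
    """
    if da <= 0:
        raise ValueError("da must be positive")
    base, limit = b0, l0
    for _ in range(max_sweeps):
        if base < floor or limit > ceiling or base > limit:
            return
        address = base
        while address <= limit:
            yield address
            address += da
        base += db
        limit += dl


# A 4-word window sliding over rows that are 8 words apart.
addresses = list(gag_addresses(b0=0, db=8, l0=3, dl=8, da=1, floor=0, ceiling=23))
print(addresses)
```

With these example parameters the generator visits three windows of four consecutive addresses, which is roughly how a narrow scan window over a wider row-major memory could be addressed without any instruction stream.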
GAG Slider Operation Demo Example
[Figure: address space over x and y with floor (F), ceiling (C), floor slider and ceiling slider. Labels: address, L, B, L0, B0, A.]
GAG Complex Sequencer Implementation
all been published in 1990
GAU: Generic Addressing Unit
[Figure: GAGs and a GAU, each built from Limit Slider, Base Slider and Address Stepper (labels B0, A, L0). Further labels: SDS, VLIW stack.]
Generic Sequence Examples
published in 1990
[Figure: GAU with Limit Slider, Base Slider and Address Stepper (labels B0, A, L0), and example scan patterns a) to g). Further label: until.]
• atomic scan
• linear scan
• video scan
• -90º rotated video scan
• -45º rotated (mirx (v scan))
• sheared video scan
• non-rectangular video scan
• zigzag video scan
• spiral scan
• perfect shuffle
• feed-back-driven scans
MoM anti machine architecture
Storage scheme optimization: scanline unrolling
[Figure: example of a scan window moving over x and y. The scan pattern (high level sequencing) is given by handle positions; the intra scan window accesses (low level sequencing) are marked r, w/r and r/w and are distributed over Bank a and Bank b. Stages shown: initial design; after scan line unrolling; after inner scan line loop unrolling; hardw. level access optim.; final design.]
>> Problems to be solved
• Microelectronics History
• fine grain and coarse grain Morphware
• Anti Matter of Computing
• Anti Machine and its Resources
• Problems to be solved
http://www.uni-kl.de
What is the trend ?
•vN is needed for embedded systems, OS, compilers, Sauerkraut software, non-performance-critical applications, others ….
•vN is obsolete for massive parallelism, except some special application areas
•Anti machine is the way to go for massive parallelism, also data-intensive applications
•Morphware is the way for high performance with short product life cycles, unstable standards
•Data-stream-based Computing is heading for mainstream
–1979 „data streams“ (Kung / Leiserson)
–1997 SCCC (LANL) Streams-C Configurable Computing
–SCORE (UCB) Stream Computations Organized for Reconfigurable Execution
–ASPRC (UCB) Adapting Software Pipelining for Reconfigurable Computing
–2000 Bee (UCB), ...
–Most stream-based multimedia systems, etc.
–Many other areas ....
Conclusion: all knowledge needed is available
•machine paradigm
•anti architectural resources
•sequencing methodology: hw & sw
•parallel memory IP core and module generator vendors
•compilation techniques
•hw / sw partitioning methodology
•languages
•anything else needed
courses / embedded tutorials:
• DATE, Munich, 2001
• ASP-DAC, Yokohama, 2001
• SBCCI, Brasilia, 2001
full day courses:
• Univ. Montpellier, 1998
• Nokia / Univ. Tampere, Finland, 2002
• CNRS Paris, France, 2002
• UnB, Brasilia, 2002
• 10 keynotes 2001 / 2002
• 5 invited talks 2001 / 2002
Main problems to be solved
[Figure: computing in time vs. computing in space; migration by re-timing and other transformations; systolic arrays etc.]
this dichotomy is completely ignored by our CS curricula
•Each programmer should have qualified awareness of this dichotomy and of morphware
•curricular innovations are urgently needed
•Lack of qualified users and implementers
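The dichotomy between computing in time and computing in space named on this slide can be illustrated with a small Python sketch. The FIR filter is chosen here purely as an example and does not appear on the slides; the function names are hypothetical. `fir_in_time` reuses one multiply-accumulate step tap after tap, as an instruction stream would, while `fir_in_space` models the re-timed (transposed) form in which every tap has its own cell and partial sums flow between cells, as in a systolic array.

```python
def fir_in_time(xs, taps):
    """Direct form: one multiply-accumulate step reused tap after tap."""
    out = []
    for n in range(len(xs)):
        acc = 0
        for k, h in enumerate(taps):
            if n - k >= 0:
                acc += h * xs[n - k]
        out.append(acc)
    return out


def fir_in_space(xs, taps):
    """Transposed form: one multiply-add cell per tap, all active per sample.

    regs[i] holds the partial sum travelling from cell i + 1 to cell i.
    Assumes taps is non-empty.
    """
    k = len(taps)
    regs = [0] * (k - 1)
    out = []
    for x in xs:
        # Every cell sees the same input sample in the same step.
        sums = [taps[i] * x + (regs[i] if i < k - 1 else 0) for i in range(k)]
        out.append(sums[0])
        regs = sums[1:]
    return out


xs, taps = [1, 2, 3, 4], [1, 10, 100]
print(fir_in_time(xs, taps))
print(fir_in_space(xs, taps))
```

Both functions return the same output sequence; only the organisation differs. In the first the work is laid out along the time axis, in the second along an array of cells fed by a data stream.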
CS education .....
[Figure: software person (procedural) meets hardware person (structural): Annihilation?]
Hardware / Software Co-Design?
Configware / Software Co-Design?
Annihilation?
[Figure: atom and anti atom colliding: crash]
avoidable by careful methodology
However, current CS Education …. is based on the Submarine Model
[Figure: layer stack of Algorithm, procedural high level Programming Language and Assembly Language (Software) on top of Hardware. Hardware invisible: under the surface.]
Brain usage: procedural-only
Software Faculty Colleagues shy away from the Paradigm Shift: their Brain hurts? - can’t be: this Half has been amputated
This model disables ...
Hardware and Software as Alternatives
[Figure: an Algorithm is partitioned between Software (procedural) and Hardware, Configware (structural). Alternatives shown: Software only; Software & Hardw/Configw; Hardw/Configw only.]
Brain Usage: both Hemispheres
The Dominance of the Submarine Model ...
[Figure label: Hardware]
... indicates, that our CS education system produces zillions of mentally disabled Persons
(procedural) structurally disabled
… completely disabled to cope with solutions other than software only
It‘s time to attack the software faculty dictatorship. Get involved!
Antimatter Search
in EE & CS we do not need to search
>>> thank you
thank you for your patience