A 1024-core 70GFLOP/W Floating Point Manycore Microprocessor
Andreas Olofsson, Roman Trogan, Oleg Raikhman
Adapteva, Lexington, MA
ANSI C Programmable IEEE Floating Point 70 GFLOPS/W
                                   E1G3     E16G3    E1G4     E16G4    E64G4    E256G4   E1KG4    E4KG4
Cores                              1        16       1        16       64       256      1024     4096
Process Geometry                   65G      65G      28LP     28LP     28LP     28LP     28LP     28LP
Max Frequency (MHz)                1000     1000     700      700      700      700      700      700
Performance (GFLOPS)               2        32       1.4      22       88       352      1408     5632
Performance (CoreMark)             1288     1288*16  900      900*16   900*64   900*256  900*1k   900*4k
Peak Energy Efficiency (GFLOPS/W)  35       35       70       70       70       70       70       70
Total Area (mm²)                   0.5      8.96     0.13     2.05     8.2      32.7     131.1    524.3
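The performance row can be approximated from the core count and clock frequency. The sketch below assumes two floating-point operations per core per cycle, which is an inference from the single-core entry (1 core, 1000 MHz, 2 GFLOPS) and not a figure stated on the poster. The function name and the constant are illustrative. The 28LP entries are listed about 2% below this simple product (88 rather than 89.6 GFLOPS for 64 cores), so treat the formula as an estimate.

```python
# Assumption: each core retires 2 floating-point operations per cycle,
# inferred from the E1G3 entry (1 core at 1000 MHz listed as 2 GFLOPS).
FLOPS_PER_CYCLE = 2


def peak_gflops(cores, mhz, flops_per_cycle=FLOPS_PER_CYCLE):
    """Estimate peak GFLOPS as cores x clock (MHz) x FLOPs per cycle."""
    return cores * mhz * flops_per_cycle / 1000.0


# (cores, max frequency in MHz, GFLOPS as listed in the table)
products = {
    "E1G3": (1, 1000, 2),
    "E16G3": (16, 1000, 32),
    "E1G4": (1, 700, 1.4),
    "E16G4": (16, 700, 22),
    "E64G4": (64, 700, 88),
    "E256G4": (256, 700, 352),
    "E1KG4": (1024, 700, 1408),
    "E4KG4": (4096, 700, 5632),
}

for name, (cores, mhz, listed) in products.items():
    estimate = peak_gflops(cores, mhz)
    print(f"{name:7s} estimated {estimate:8.1f} GFLOPS, listed {listed}")
```

The same linear scaling shows up in the area row: 131.1 mm² for 1024 cores is roughly 0.128 mm² per core, in line with the 0.13 mm² listed for a single 28LP core.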
20-100W
[Figure: Energy Efficiency (28nm); axes: GFLOPS/W and GFLOPS]
Epiphany Introduction
Accelerator Model
A More Balanced Approach?
64 CORES @ 2 WATTS
28nm IP Offering
Programming Model
Silicon Measurements
Built to Scale
1024 Cores 1 Core
Architecture designed from scratch to scale
LEGO approach to chip design
Array generator minimizes chip design
1024 cores easily manufactured in 28nm
100% C/C++ Programmable
Architecture is programming model agnostic
Could support SIMD, MIMD, Threading, Message Passing, Functional Programming.
Focusing on message passing model
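As a rough illustration of the message passing style, the sketch below models each core as a worker with its own inbox. It uses Python threads and queues purely as a stand-in. It is not the Epiphany SDK, and every name in it (`core_main`, `dot_product`, `num_cores`) is hypothetical. A host splits a dot product into chunks, sends one chunk to each core as a message, and sums the partial results that come back.

```python
import queue
import threading


def core_main(core_id, inbox, outbox):
    """Hypothetical per-core program: receive work, compute, reply."""
    while True:
        msg = inbox.get()
        if msg is None:  # sentinel message: no more work for this core
            break
        a, b = msg
        partial = sum(x * y for x, y in zip(a, b))
        outbox.put((core_id, partial))


def dot_product(a, b, num_cores=4):
    """Host side: scatter chunks to the cores, then gather partial sums."""
    inboxes = [queue.Queue() for _ in range(num_cores)]
    outbox = queue.Queue()
    cores = [
        threading.Thread(target=core_main, args=(i, inboxes[i], outbox))
        for i in range(num_cores)
    ]
    for core in cores:
        core.start()

    chunk = (len(a) + num_cores - 1) // num_cores  # ceiling division
    for i in range(num_cores):
        lo, hi = i * chunk, (i + 1) * chunk
        inboxes[i].put((a[lo:hi], b[lo:hi]))  # one work message per core
        inboxes[i].put(None)                  # then tell it to stop

    for core in cores:
        core.join()

    total = 0
    for _ in range(num_cores):  # each core sent exactly one reply
        _, partial = outbox.get()
        total += partial
    return total


print(dot_product(list(range(8)), list(range(8))))
```

No core touches another core's data directly; all sharing happens through explicit messages, which is the property that makes the model attractive as the core count grows.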
Benchmarks
CoreMark C-Benchmark
Matrix Multiplication
(Assembly 90%)
Software Development Kit
64 Cores
800 MHz
102 GFLOPS
70 GFLOP/W
3mm x 4mm
<2 Watt Chip Power
28nm Process
Available Q1 2012
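The 64-core figures can be cross-checked with simple arithmetic. The sketch below again assumes two floating-point operations per core per cycle (an inference, not a number given on the poster) and uses illustrative function names. At 800 MHz that assumption gives 102.4 GFLOPS, matching the listed 102 GFLOPS, and dividing by the peak efficiency of 70 GFLOPS/W gives about 1.46 W, which fits within the stated chip power of under 2 W.

```python
def peak_gflops(cores, mhz, flops_per_cycle=2):
    """Peak GFLOPS, assuming 2 FLOPs per core per cycle (an inference)."""
    return cores * mhz * flops_per_cycle / 1000.0


def implied_power_watts(gflops, gflops_per_watt):
    """Power implied by a throughput and an energy-efficiency figure."""
    return gflops / gflops_per_watt


gflops = peak_gflops(64, 800)
watts = implied_power_watts(gflops, 70)
print(f"{gflops:.1f} GFLOPS at 70 GFLOPS/W implies {watts:.2f} W")
```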