28
Open-Source Hardware in the Post Moore Era Dr. George Michelogiannakis Research scientist Computer architecture group Lawrence Berkeley National Laboratory These are not DOE’s or LBNL’s official views

Open-Source Hardware in the Post Moore Era Hardware in the Post Moore Era Dr. George Michelogiannakis Research scientist Computer architecture group Lawrence Berkeley National Laboratory

  • Upload
    vokhanh

  • View
    214

  • Download
    0

Embed Size (px)

Citation preview

Open-Source Hardwarein the Post Moore Era

Dr. George Michelogiannakis

Research scientist

Computer architecture group

Lawrence Berkeley National Laboratory

These are not DOE’s or LBNL’s official views

Technology Scaling Trends

Figure courtesy of Kunle Olukotun, Lance Hammond, Herb Sutter, and Burton Smith

2020 2025 2030

Year

Pe

rfo

rman

ce

Transistors

Thread Performance

Clock Frequency

Power (watts)

# CoresExascale

Happens in

2021-2023

Atomic scale limit case for 2D Lithography Scaling

Moore’s Law of Documentation

Scaling Already Slowing Down

Peter Bright “Intel retires “tick-tock” development model, extending the life of each process “, 2016

Preserve Performance Scaling With Emerging Technologies

Perfo

rman

cePer

form

ance

Some Paths Forward in Post Moore

Superconducting

Spintronics

Photonics

CMOS

Modeling / Simulation

ApproximateComputing

Async

ReconfigurableComputing

Specialized

Hardware

New

Devices

New Models of

Computation

y

x

z

3D StackingAdv. Packages

TFETs

MRAM

SpecializedArchitectures

Neuromorphic

Big Data

Dataflow

Exascale

General Purpose

CarbonNanotubes

SoC

DarkSilicon

10 Years

+10 Year Lead

Time

Revolutionary

Heterogeneous

HPC Architectures

First Line of Defense:

More Efficient

Architectures

3D Integration of Tomorrow

Shulaker “Transforming Emerging Technologies into Working Systems”

Lets Get The Most Out of CMOSBefore we Jump Ship

General-Purpose Architectures Trade Overhead for Programmability

“Accelerator-Rich Architectures: Opportunities and Progresses”, DAC 2014

Superscalarout-of-order

pipeline

Architectures Trade Overhead for Programmability

“Accelerator-Rich Architectures: Opportunities and Progresses”, DAC 2014

Compare against 12-core 1.9 GHz Intel Xeon E5-2420 processor

Accelerators Have Been Growing in HPC

Strohmaier “Top 500”, SC17

0

10

20

30

40

50

60

70

80

90

100

110

120

2006

2007

2008

2009

2010

2011

2012

2013

2014

2015

2016

2017

Sys

tem

s

PEZY-SC

Kepler/Phi

Xeon Phi Main

Intel Xeon Phi

Clearspeed

IBM Cell

ATI Radeon

Nvidia Volta

Nvidia Pascal

Nvidia Kepler

Nvidia Fermi

Top 500 systems

Specialization

Hardware that is more suited for specific kinds of computation

Can also have accelerators for data transfer

General

purpose

Fixed

functionAccelerators

Programmability

High Low

GPU Acceleration is Popular

NVIDIA

FPGA Acceleration Just Beginning

FPGA accelerators used as

programmable array of soft

cores – more like a GPU model

Parallels early days of GPGPU

computing

Capable hardware

New languages raising

abstraction levels

Tools lacking

Wadler, Intel’s Response to ARM Servers: Xeon D Processors with a Twist, 2016

Fixed-Function Hardware

How fine-grain accelerators?

How to schedule and transfer data?

Yakun S et al “Aladdin”

Hardware Development EffortIs a Challenge

Behavioral simulators: Fast but

less accurate

Typically used to prune design

space

No substitute for real

hardware

Lets make hardware development

faster!

High level synthesis

languages

Open-source hardware

Zipcpu, “FPGAs vs ASICs”, 2017

12-18 month cycle

Reduce Hardware Development Effort to Explore the Specialization Spectrum With Open-Source

Hardware

The Rise of Open-Source Hardware

10

In all likelihood, Weddington concedes, the

resulting technology “will never be as good as what

is commercially available.” But perhaps it could

be made good enough “to bring the power and

ability to design your own IC, or microprocessor,

to smaller and smaller groups of people and drive

down the enormous capital requirements of an

entrenched, dinosaur industry.”

Similarly, Michael Cooney of Network World15

describes the state of open-source hardware today

as roughly where open-source software was during

the mid-1990s – waiting for commercial suppliers

to provide higher levels of support. “What made

open-source software acceptable for many

businesses was the arrival of support for it, such as

Red Hat,” he says, adding, “Something similar may

take place with the hardware.”

• Rapid growth in the adoption and number of open source software projects

• More than 95% of web servers run Linux variants, approximately 85%

of smartphones run Android variants

• Will open source hardware ignite the semiconductor industry?

Is RISC-V the hardware industry’s Linux?

The Rise of Open Source Software: Will Hardware Follow Suit?

The Economics of Open-Source Innovation

GSA 2016

Accelerating the Design Process

A complete set of tools

OpenSoC Fabric

OpenSoC Compiler

OpenSoC Cores& Open2C

OpenSoC System Architect

OpenSoC System Architect

Frontend

Verilog

Chisel

Spec

LLVM

Compiler

Frontend &

CoreGen

SC15 Demo: 96-core SoC for HPC

Shockingly but accidentally

similar to Sunway node

architecture

4 Z-Scale processors connected

on a 4x4 mesh and Micron HMC

memory

Two people spent two months

to create

FPGA 0 FPGA 1 FPGA 2

FPGA 5FPGA 4FPGA 3

Core Core

Core Core

Core DDR

Core Core

Core Core

DDR Core

Core Core

CoreOff-Chip

Core Core

Core Core

Core DDR

Core Core

Core Core

DDR Core

Core Core

CoreOff-Chip

Core Core

Core Core

Core DDR

Core Core

Core Core

DDR Core

Core Core

CoreOff-Chip

Core Core

Core Core

Core DDR

Core Core

Core Core

DDR Core

Core Core

CoreOff-Chip

Core Core

Core Core

Core DDR

Core Core

Core Core

DDR Core

Core Core

CoreOff-Chip

Core Core

Core Core

Core DDR

Core Core

Core Core

DDR Core

Core Core

CoreOff-Chip

Use Open-Source Hardware:Specialization Opportunities

A Specialization Opportunity

On-detector processing

Future detectors have data rates

exceeding 1 Tb/s

Proposed solution:

Process data before it leaves

the sensor

Application-tailored,

programmable processing

Programmability allows

processing to be tailored to the

experiment

0

10

20

30

40

50

60

2010 2011 2012 2013 2014 2015

Incre

ase

ov

er

20

10

Projected Rates

Sequencers

Detectors

Processors

Memory

Create an Architecture per Motif

Spatial Specialization

Architecture to match data set shape to help communication

StreamingIntegerALU

R [t=n+1]

R [t=n]

R[t=n-1]

Rx

Mux

Demux

OperandArbiter

NoCRouter

MasterControlUnitInstructionBuffer

Inte r-PDEce ll L inks

...

Scalar/ControlRegisters

StreamingIntegerALU

R [t=n+1]

R [t=n]

R[t=n-1]

Rx

Mux

Demux

OperandArbiter

NoCRouter

MasterContro lUnitInstructionBuffer

Inter -PDE ce ll L inks

...

Scalar/ControlRegisters

1000LayersMonolithicStacking

PDEcell /PICcell:Ultra-simplecomputeengine(50kgates)calculatesfinite-differenceupdates,andparticleforcesfromneighbors.MicroinstructionsspecifythePDEequation,stencil,andPICoperators.Novelfeatures:variablelengthstreamingintegerarithmeticandnovelPICparticlevirtualizationscheme.

ComputationalLattice:PDECells aretilesinalattice/arrayoneach2Dplanarchiplayer.Target120x120tilespermm2

@28nmlithography.NovelFeatures:eachtilerepresentssinglecellofcomputationaldomain(pushestolimitofstrong-scaling).

Monolithic3DIntegration:Integratelayersofcomputeelementsusingemergingmonolithic3Dchipstacking.NovelFeatures:1000layerstacking(20xmorethancurrentpractice).Areaefficientinter-layerconnectivityandnewenergyefficienttransistorlogic(ncFET).1Petaflopequivalentperformancein300mm^2for<200Watts.

Quantum Control Processor

𝑄𝑢𝑎𝑛𝑡𝑢𝑚 𝐶𝑜𝑚𝑝𝑢𝑡𝑒𝑟 = 𝑄𝑢𝑎𝑛𝑡𝑢𝑚 𝑃𝑈 + 𝐶𝑜𝑛𝑡𝑟𝑜𝑙 𝐻𝑎𝑟𝑑𝑤𝑎𝑟𝑒

Qubit Digitizer

Large amount of data

PC

IE

PC

RA

M

HD

D

Low speed

Tektronix AWG

High cost

Control

Measurement-based feedback

FPGA

Measurement

Off the shelf and high cost Large amount of data and slow speed

Qubit Digitizer

Large amount of data

PC

IE

PC

RA

M

HD

D

Low speed

Tektronix AWG

High cost

Control

Measurement-based feedback

FPGA

MeasurementQubit Digitizer

Large amount of data

PC

IE

PC

RA

M

HD

D

Low speed

Tektronix AWG

High cost

Control

Measurement-based feedback

FPGA

Measurement

Qubit Digitizer

Large amount of data

PC

IE

PC

RA

M

HD

D

Low speed

Tektronix AWG

High cost

Control

Measurement-based feedback

FPGA

MeasurementQubit Digitizer

Large amount of data

PC

IE

PC

RA

M

HD

D

Low speed

Tektronix AWG

High cost

Control

Measurement-based feedback

FPGA

Measurement

1000 qubits, gate time 10ns,

3 ops/qubit300 billion ops per second

Conclusion

Open-source projects rely on community

Need a collection of accelerators

Open-source hardware may be the key to ubiquitous specialization

Programmability and compilers must not be neglected

It is an exciting time to be an architect

Questions