Yiran Chen
Electrical and Computer Engineering, University of Pittsburgh
Sponsors: NSF, DARPA, AFRL, and HP Labs

Emerging NVM Enabled Storage Architecture: From Evolution to Revolution

Emerging NVM Enabled Storage Architectureacs.ict.ac.cn/ncis2014/slides/NCIS2014_Plenary_ChenYiran.pdf · Memristor –Rebirth of Neuromorphic Circuits • Two terminal, high density


Outline

• Introduction
• Evolution with eNVM:
  – On-chip high-speed storage;
  – Off-chip secondary storage;
• Revolution with eNVM:
  – Memristor-based neuromorphic accelerator
• Conclusion


Conventional Memory Scaling

2012-2013, 38nm-32nm. M: Stacked MIM; P: Planar; A: 6F², bWL; G: poly/SiO2; C: Si; V: 1.35V
2014-2015, 29nm-22nm. M: Stacked MIM; P: Planar, HKMG; A: 6F², bWL; G: HKMG; C: Si; V: 1.2V
2016-2017, 22nm-16nm. M: Stacked MIM; P: Planar; A: 6F², bBL, LBL, 1T1C (VFET); G: HKMG; C: Si; V: 1.1V
2018-2019, 16nm-14nm. M: FBRAM, STT-RAM, RRAM, PCRAM; P: Planar; A: 4F², 1T, 1T1R, 1TMTJ (VFET); G: HKMG; C: Si; V: ~1V

[Figure: aspect ratio (A/R) and TOX (11Å, 9Å, 8Å, 7Å) versus technology node; Burj Khalifa, A/R = 6, shown for comparison.]
[Figure: Mb/chip versus year (1990 to 2010), with interface speeds in Mbps: EDO 50, SDRAM 133, DDR1 200-400, DDR2 400-800, DDR3 800-1600.]

Sources: ASML, ITRS, IMEC, Hynix, IBM

Intrinsic difficulty of charge-based computing and storage!


Emerging Nonvolatile Memory


Memory Technologies Comparison

                                  SRAM             DRAM           PCRAM    NAND FLASH  STT-RAM  ReRAM
Data retention                    N                4 ms           >10 y    10 y        >10 y    >10 y
Memory cell (F²)                  120-140          7-9            4        4           8        <1
Read time                         0.2 ns           2 ns           12 ns    0.1 ms      5-10 ns  5-10 ns
Write/erase time                  70 ps            1 ns           <50 ns   1/0.1 ms    <10 ns   <10 ns
Number of rewrites                10^16            10^16          10^8     10^5        10^15    10^15
Power consumption, read/write     Low              Low            Low      High        Low      Low
Power consumption other than R/W  Leakage current  Refresh power  None     None        None     None

Source: ITRS ERD workshop presentation by Prof. Y. Chen


Challenges:

• Identifying the evolutionary applications that can
  – Be easily and seamlessly integrated into the current memory hierarchy and computing platform;
  – Fully leverage the advantages of emerging NVM;
  – Not be easily replaced by other alternative technologies or architectures.
• Inventing a revolutionary computing and storage architecture that can
  – Offer a high-performance, power-efficient, and scalable computing model;
  – Provide a truly seamless integration between computing and memory.


Outline

• Introduction
• Evolution with eNVM:
  – On-chip high-speed storage;
    • STT-RAM based 3D cache for CPU.
    • Racetrack based register file for GPU.
  – Off-chip secondary storage;
• Revolution with eNVM:
  – Memristor-based neuromorphic accelerator.
• Conclusion


STT-RAM based 3D cache
Spin-Transfer Torque Random Access Memory

• A scalable technology

[Figure: 1T-1MTJ STT-RAM schematic. Labels: source-line, bit-line, word-line, and the MTJ (magnetic tunneling junction) with its reference layer, MgO layer, and free layer. Two panels show writing '1' and writing '0'.]


SRAM vs. MRAM (STT-RAM)

• Pros: Low leakage power, high density.
• Cons: Long write latency and large write power.

                 SRAM       MRAM
Area (65nm)      3.66mm²    3.30mm²
Capacity/Bank    128KB      512KB
Read latency     2.25ns     2.32ns
Write latency    2.26ns     11.02ns
Read energy      0.90nJ     0.86nJ
Write energy     0.80nJ     5.00nJ

Cache configurations          Leakage power
2MB (16x128KB) SRAM cache     2.09W
8MB (16x512KB) MRAM cache     0.26W
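The pros and cons follow directly from the numbers in the table. A small sketch that works out the ratios (the dictionary keys and variable names are illustrative only; the values are copied from the table):

```python
# Per-bank figures copied from the SRAM vs. MRAM table (65 nm).
sram = {"area_mm2": 3.66, "capacity_kb": 128, "write_ns": 2.26, "write_nj": 0.80}
mram = {"area_mm2": 3.30, "capacity_kb": 512, "write_ns": 11.02, "write_nj": 5.00}

# Cache-level leakage power in watts (2MB SRAM cache vs. 8MB MRAM cache).
sram_leakage_w = 2.09
mram_leakage_w = 0.26

# Density: capacity per unit area, MRAM relative to SRAM.
density_gain = (mram["capacity_kb"] / mram["area_mm2"]) / (
    sram["capacity_kb"] / sram["area_mm2"]
)
# Write penalties: MRAM relative to SRAM.
write_latency_penalty = mram["write_ns"] / sram["write_ns"]
write_energy_penalty = mram["write_nj"] / sram["write_nj"]
# Leakage: SRAM cache relative to the 4x larger MRAM cache.
leakage_reduction = sram_leakage_w / mram_leakage_w

print(f"density gain:          {density_gain:.2f}x")
print(f"write latency penalty: {write_latency_penalty:.2f}x")
print(f"write energy penalty:  {write_energy_penalty:.2f}x")
print(f"leakage reduction:     {leakage_reduction:.2f}x")
```

Read latency and read energy are nearly identical in the table, so the trade-off is essentially density and leakage against write cost.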


STT-RAM based 3D cache

• Baseline 3D architecture
  – Core layer + cache layers.
  – NUCA caches with NoC connections.

[Figure: two-layer 3D stack. Labels: Layer 1, Layer 2, core, cache controller, cache banks, routers (R), TSV, horizontal hop, vertical hop, data migration.]

G. Sun, X. Dong, Y. Xie, J. Li, Y. Chen, HPCA, 2009.


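STT-RAM based 3D cache

• Challenge: long write latency of STT-RAM.
• Solution 1 (S1): Read-Preemptive Write Buffer.

[Figure: cores issue read and write operations to the STT-RAM caches. Write requests go through a write buffer (FIFO); read requests return read data. Two cases are shown for a read that arrives during a write: the write has just begun, or the write is almost done.]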
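The two cases in the figure suggest a policy in which a read may cancel a write that has just begun but waits for one that is almost done. A minimal sketch of that idea follows. The class name, the 10-cycle write latency, and the 50% preemption threshold are all assumptions made for illustration; the slide does not give them.

```python
from collections import deque

WRITE_CYCLES = 10     # assumed STT-RAM write latency, in cycles
PREEMPT_LIMIT = 0.5   # assumed: preempt only if the write is less than half done


class ReadPreemptiveWriteBuffer:
    """Toy model of a FIFO write buffer in front of one STT-RAM cache bank."""

    def __init__(self):
        self.fifo = deque()    # buffered write requests (addresses)
        self.in_flight = None  # address currently being written, or None
        self.progress = 0      # cycles already spent on the in-flight write

    def enqueue_write(self, addr):
        self.fifo.append(addr)

    def tick(self):
        """Advance one cycle: start or continue draining one buffered write."""
        if self.in_flight is None and self.fifo:
            self.in_flight = self.fifo.popleft()
            self.progress = 0
        if self.in_flight is not None:
            self.progress += 1
            if self.progress == WRITE_CYCLES:
                self.in_flight = None

    def read(self):
        """Return how many cycles an arriving read waits for the bank."""
        if self.in_flight is None:
            return 0
        if self.progress / WRITE_CYCLES < PREEMPT_LIMIT:
            # Write just began: cancel it, requeue it at the head, serve the read.
            self.fifo.appendleft(self.in_flight)
            self.in_flight = None
            return 0
        # Write is almost done: let it finish first.
        wait = WRITE_CYCLES - self.progress
        self.in_flight = None
        return wait


buf = ReadPreemptiveWriteBuffer()
buf.enqueue_write(0x10)
buf.tick()
buf.tick()
early_wait = buf.read()   # write only 20% done, so it is preempted
for _ in range(8):
    buf.tick()
late_wait = buf.read()    # restarted write is 80% done, so the read waits
print(early_wait, late_wait)
```

A cancelled write restarts from scratch in this model, which is why preempting a nearly finished write would waste most of its latency.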


STT-RAM based 3D cache

• Solution 2 (S2): SRAM-MRAM Hybrid L2 Cache

[Figure: 3D stack of cores with MRAM banks and SRAM banks, connected by TSVs. Configurations shown: 32-way STT-RAM, and 31-way STT-RAM & 1-way SRAM.]


STT-RAM based 3D cache

• Result (S1 & S2):
  – Performance is improved by 4.91% compared with the STT-RAM baseline.
  – Power consumption is reduced by 73.5%.

[Figure: normalized IPC and power (0 to 1) for 2M-SRAM-DNUCA, 8M-MRAM-DNUCA, and 8M Hybrid DNUCA.]


Racetrack for GPU

• Racetrack cell:
  – Two fixed pinning regions: free region and fixed region
  – Write '0'
  – Write '1'
  – Read

[Figure: racetrack cell with WWL, RWL, SL, BL, two pinning layers, free layer, and reference layer.]

• Racetrack
  – Racetrack: magnetic track
  – Inject current to move cell
  – Access port


Racetrack for GPU

• Benefits from racetrack:
  – Extremely small cell size;
• Major challenges:
  – Shifting-caused delay/energy.
• Warp register remapping (WRR)
  – 60.0% of the RF is allocated during execution
  – Non-optimal warp register mapping: max shift distance of 8 cells
  – WRR interleaves the warp registers across the access ports: max shift distance of 4 cells

[Figure: racetrack register file array with WWL/RWL rows and SL/BL columns, row decoder, write/read/shifter driver, column mux, sense amplifier arrays, shift controller, and arbitrator; Warp 0 is marked.]

M. Mao, W. Wen, Y. Zhang, Y. Chen, H. Li, DAC 2014
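Why interleaving shortens the worst-case shift can be seen in a toy model. The sketch below assumes a track with 2 access ports, 8 cells per port, and a shift cost equal to a register's offset from its port; these parameters and function names are hypothetical and are not taken from the DAC 2014 paper.

```python
N_PORTS = 2    # assumed number of access ports on a track
SEG_LEN = 8    # assumed number of cells served by each port
N_REGS = 10    # roughly 60% of the 16 cells are allocated


def contiguous_map(reg):
    """Fill one port's segment completely before using the next port."""
    return reg // SEG_LEN, reg % SEG_LEN   # (port, offset from port)


def interleaved_map(reg):
    """WRR-style mapping: spread consecutive registers across the ports."""
    return reg % N_PORTS, reg // N_PORTS   # (port, offset from port)


def max_shift(mapping, n_regs):
    """Worst-case shift distance, taken as the largest offset in use."""
    return max(mapping(reg)[1] for reg in range(n_regs))


contiguous_worst = max_shift(contiguous_map, N_REGS)
interleaved_worst = max_shift(interleaved_map, N_REGS)
print(contiguous_worst, interleaved_worst)
```

Because the register file is only partly allocated, spreading the live registers over all ports keeps every one of them close to a port, while the contiguous layout pushes some to the far end of a segment.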


Racetrack for GPU

• Write buffer
  – "Piggyback-write" to write back to the RF from the write buffer;
  – Relies on the track movement triggered by the read requests;
  – Positive side effect: filters redundant RF reads/writes by leveraging RAW and WAW.

[Figure: write buffer data path with numbered steps 1 to 9 and output to EXE/MEM.]
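The write buffer behavior described above can be sketched as follows. This is a simplified model under stated assumptions: registers are grouped 8 per track, a read to a track lets every buffered write for that track be written back, and the class and attribute names are invented for illustration.

```python
class PiggybackWriteBuffer:
    """Toy write buffer in front of a racetrack register file (RF)."""

    def __init__(self, regs_per_track=8):
        self.regs_per_track = regs_per_track  # assumed track organization
        self.pending = {}    # register id -> newest value not yet in the RF
        self.rf = {}         # register file contents
        self.rf_reads = 0    # reads that actually reached the RF
        self.rf_writes = 0   # writes that actually reached the RF

    def write(self, reg, value):
        # WAW: a newer write replaces the buffered one, so the older
        # value never has to be written to the RF.
        self.pending[reg] = value

    def read(self, reg):
        # RAW: a read of a buffered register is served from the buffer.
        if reg in self.pending:
            return self.pending[reg]
        self.rf_reads += 1
        # Piggyback: this read moves the track anyway, so buffered writes
        # that target the same track are written back now.
        track = reg // self.regs_per_track
        same_track = [r for r in self.pending if r // self.regs_per_track == track]
        for r in same_track:
            self.rf[r] = self.pending.pop(r)
            self.rf_writes += 1
        return self.rf.get(reg)


buf = PiggybackWriteBuffer()
buf.write(1, "old")
buf.write(1, "new")          # WAW: "old" is filtered out
raw_value = buf.read(1)      # RAW: served from the buffer
buf.read(2)                  # RF read on the same track triggers write-back
print(raw_value, buf.rf, buf.rf_reads, buf.rf_writes)
```

In this model three requests to register 1 cost a single RF write, which is the filtering effect the last bullet refers to.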


Racetrack for GPU

• Experiment results:
  – Baseline: SRAM-based register files.
  – Energy reduction: 59%.
  – Performance improvement: 4%.


Outline

• Introduction
• Evolution with eNVM:
  – On-chip high-speed storage;
  – Secondary storage;
    • PCRAM and NAND hybrid SSD;
• Revolution with eNVM:
  – Memristor-based neuromorphic accelerator.
• Conclusion


Hybrid SSD

• Memory hierarchy
  – On-chip memory: 1~30 cycles
  – Off-chip memory: 100~300 cycles
  – Solid State Disk (Flash): 25K~2M cycles
• Page mode → random access
• Erase-before-write (EBW) → in-place update (IPU)

[Figure: erase unit holding pages PN=0, V through PN=n, V.]

Courtesy: Al Fazio (Intel)


PRAM (PCM) Cell

• One transistor/diode and one GST (GeSbTe).
• In-place updating (IPU)
• High resistance: '0'; low resistance: '1'

[Figure: two PRAM cell cross-sections, amorphous and crystalline, each with top electrode, GST, heater, bottom electrode, N+ region, and substrate.]


Hybrid SSD

• Conventional SSD: FLASH.
• Promising candidate: PRAM (phase change).
• To combine the benefits of both technologies:
  – Hybrid SSD.
• Two usages:
  – Performance;
  – Reliability.


Hybrid SSD: performance enhancement

[Figure: merge operation (time consuming). Erase Units 1, 2, and 3 hold pages PN=0 to PN=n; updated pages such as PN=2 and PN=n change from valid to invalid, and the valid pages are merged into the empty pages of another erase unit.]

PN = Page Number; V = Valid; I = Invalid
Erase Unit = 128/256KB, Page = 512 Bytes ~ 8KB

G. Sun, Y. Joo, Y. Chen, Y. Xie, Y. Chen, H. Li, HPCA, 2010.


Hybrid SSD: performance enhancement

[Figure: hybrid architecture, physical view and structural view. Data region: NAND flash, organized in erase units. Log region: PRAM, with in-place updating at sector (512 Bytes) granularity. A data buffer is kept in memory.]
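The benefit of a PRAM log region can be illustrated with a toy model: repeated updates to the same page are absorbed by in-place updates in the log, and the costly merge into flash happens only when the log runs out of space. The class below is a sketch under those assumptions; its name, the page-level granularity, and the merge trigger are illustrative and not taken from the HPCA 2010 design.

```python
class HybridSSD:
    """Toy model: NAND flash data region plus a PRAM log region."""

    def __init__(self, log_capacity):
        self.flash = {}                   # data region: page number -> data
        self.log = {}                     # PRAM log region: page number -> newest data
        self.log_capacity = log_capacity  # assumed number of log entries
        self.merges = 0                   # count of time-consuming merge operations

    def write(self, page, data):
        # A new page needs a free log entry; merge first if the log is full.
        if page not in self.log and len(self.log) >= self.log_capacity:
            self.merge()
        # PRAM supports in-place updating, so rewriting a logged page
        # simply overwrites its entry.
        self.log[page] = data

    def read(self, page):
        # The log always holds the newest copy, so check it first.
        if page in self.log:
            return self.log[page]
        return self.flash.get(page)

    def merge(self):
        """Fold the log into the flash data region and empty the log."""
        self.flash.update(self.log)
        self.log.clear()
        self.merges += 1


ssd = HybridSSD(log_capacity=2)
for version in range(100):
    ssd.write(7, version)       # 100 updates to one page, no merge needed
merges_after_hot_page = ssd.merges
ssd.write(8, "x")
ssd.write(9, "y")               # log is full, so a merge happens first
print(merges_after_hot_page, ssd.merges, ssd.read(7), ssd.read(9))
```

With a flash-only log, each of those 100 updates would consume a fresh page because of erase-before-write, so merges would be triggered far more often.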


Different Log Assignments

• Static log assignment: fixed assignment
• Group log assignment: organize log pages in groups
• Dynamic log assignment: dynamic assignment

[Figure: for each scheme, the mapping between erase units in the data region and the log region.]


Hybrid SSD: performance enhancement


Outline

• Introduction
• Evolution with eNVM:
  – On-chip high-speed storage;
  – Secondary storage;
• Revolution with eNVM:
  – Memristor-based neuromorphic accelerator.
• Conclusion


Computing: Present and Future

[Figure: clock frequency (MHz) versus year, 1990 to 2010, with the multi-core era marked.]
[Figure: power density (mW/mm²) versus year, 1990 to 2010, with hot plate, nuclear reactor, and rocket launch as reference levels.]

Source: CPUDB, Intel

New trend:
– Multi-core, advanced power management, large on-chip storage.

Future:
– Heterogeneous system, brain-like computing.

[Figure: neural network]


Brain – The Most Efficient Computing Machine

Brain: 15–30B neurons; extremely complex organ; 4 km/mm³; 35 W
Neocortex: 6 layers; signals travel within and between layers
Neuron: processes signals from other neurons.
Synapse: memory; weights signals

[Figure: gray matter and white matter; neural network]


Brain-like Neuromorphic Circuits

Highly parallel; ultra power efficient; flexible; extremely robust
Real-world input; human-friendly output; data friendly

Slow progress in neuromorphic hardware implementation
• Lack of efficient synapse design
• Not supportive of mass connection


Memristor – Rebirth of Neuromorphic Circuits

Memristor ↔ Synapse
• Two terminal, high density
• Non-volatility
• Analog/multi-level states

Crossbar ↔ Network
• Natural matrix function
• A MIMO system
• Good combination with memristor

[Figure: TiN-TaOx device (EI lab & HP lab): resistance (Ω) and applied voltage (V) versus pulse number; pulses grow linearly in amplitude.]
[Figure: n × n crossbar array with rows 1 to n and columns 1 to n.]

Figure labels and credits: TaN1+x; HP lab, 2012; EI lab, DAC'12; EI lab, APL'13
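The "natural matrix function" of a crossbar can be made concrete with an idealized model: if each crosspoint memristor has conductance G[i][j], applying voltages V_i to the rows produces column currents I_j = Σ_i V_i · G[i][j], which is a vector-matrix product. The sketch below assumes ideal ohmic devices and ignores wire resistance and sneak paths; the function name and the sample values are illustrative.

```python
def crossbar_output(voltages, conductances):
    """Column currents of an ideal crossbar: I_j = sum_i V_i * G[i][j]."""
    n_cols = len(conductances[0])
    return [
        sum(v * row[j] for v, row in zip(voltages, conductances))
        for j in range(n_cols)
    ]


# Row voltages in volts and crosspoint conductances in siemens (made-up values).
V = [1.0, 0.5]
G = [
    [0.002, 0.001],
    [0.004, 0.003],
]
currents = crossbar_output(V, G)
print(currents)
```

Every column sums its currents at the same time, so the whole product is computed in one step, with the memristor conductances acting as the stored synaptic weights.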


Conclusion

• Emerging nonvolatile memory (NVM) technologies such as STT-RAM, racetrack, and PRAM deliver significant improvements for various applications.

• Challenges exist and can be solved by architecture-level optimization.

• Innovation in revolutionary architecture promises multi-order speedup, power efficiency improvement, and hardware cost reduction.