Memory Intensive Architectures - Technion – Israel...

Preview:

Citation preview

Memory Intensive Architectures

1

Shahar KvatinskyViterbi Faculty of Electrical EngineeringTechnion – Israel Institute of Technology

ICRI-CI June 2017

MemristorsEmerging Nonvolatile Memory Technologies

STT MRAMResistive

RAM

(RRAM)

Phase Change

Memory

(PCM)

2

Example: Intel 3D Xpoint

3

• 16 GB, PCIe 3.0, 241 mm2

• 20nm process, 4F2, 1S1R cells between M4 and M5

• 91.4% memory efficiency (4.5X higher than DRAM)

Memristors Add New Capabilities to CMOSSea of memory above the logic

4 Dense, nonvolatile, fast, and CMOS compatible

Memory

Memory Intensive Architectures

• Tight integration of memory and logic

–Bringing memory to logic

– In-memory computing5

Input OutputCPU

MIA Research Projects

6

Memory design

Memristive Memory Processing Unit

Embedded memory

Neuromorphic computing

Memory Design and Methodologies

• Understanding the fundamental issues in resistive memory

• Circuits for memory (Ramadan et al., submitted to TCAS-I)

• Coding for RRAM (Cassuto et al., ISIT 13, 16, TIT 16)

• RRAM/PCM in the memory system (Nishil Talati)

7

On-Die Intensive Memory

Circuits and Architectures

8

Multistate Register (TVLSI 15)

ContinuousFlow

Multithreading(CAL 14)

IoTRFIC (Wainstein et al.,ISCAS 17, Memrisys 17)

Neuromorphic Computing

9

• Online gradient descent training (TNNLS 15, ISCAS 16)

• Machine learning accelerators (Tzofnat Greenberg)

• Configurable mixed signal circuits (Loai Danial)

Agenda

• Memristors and MIA

• Memristive MPU (mMPU) architecture

• Summary

10

Processing “In-Memory” (PIM)Reducing Data Movement

4

Micron Automata Memory

Processor

Prior Art

SA connected to SIMD pipeline

Configuration PIM machine

Active Pages90’s

Recent

M. Gokhale et al., “Processing in memory: the Terasys massively parallel PIM array,” Computer, 1995

M. Oskin et al., “Active pages: A computation model for intelligent memory,” Comput. Archit. News, 1998

D. Elliott et al., “Computational ram: Implementing processors in memory,” IEEE Des. Test, 1999

P. Dlugosch et al., "An Efficient and Scalable Semiconductor Architecture for Parallel Automata Processing," IEEE TPDS, 2014

Processing “In-Memory” (PIM)Reducing Data Movement

4

Data transfer is still required to/from DRAM and PUs

CPU

CMOS

Processing

Units (PUs)

Memory

(DRAM) Memory

(DRAM)

Real Computing within the MemoryBeyond von Neumann Architecture

13

Memory

Control Unit

Arithmetic/Logic Unit

Input Device

Output Device

CPU

MemoryProcessing

Unit(MPU)

mMPU: Solving thevon Neumann Bottleneck

mMPU: performing

computation USING

the memristive

memory cells

5

Moving from DRAM to

memristive memory

Clock, Address, Data, and Controls

CPU

mMPUmMPU

Logic within MemoryLogic Families

15

MAGIC(TCAS II 14, TNANO 16)

IMPLY(ICCD 11, TVLSI 14)

Unipolar logic(ICSEE 16, VLSI-Soc 16)

Akers array(MEJ 14)

𝑦0

𝑥0

𝑀𝑍

𝑓01

𝑥1

𝑀𝑍

𝑦1

𝑓10

𝑀𝑍

𝑓𝑜𝑢𝑡

𝑓𝑜𝑢𝑡

𝑀𝑍

𝑀𝑍

𝐴

𝐴 𝐵

𝐵 ′1′

0 ′1′

𝑓𝑜𝑢𝑡

𝐛 Array for 𝑿𝑶𝑹 𝑨, 𝑩 𝐚 Array 2x2 model

′0′

′0′

𝑀𝑍

𝑀𝑍

𝑀𝑍

Memristor – Memory ResistorResistor with Varying Resistance

Decrease resistanceIncrease resistance

Current

Voltage

Current

16

MAGIC – Memristor Aided LoGICExample of MAGIC NOR

17

NORIN2IN1

100

010

001

011

RON

ROFF >> RON

ROFF

ROFF <<VG

Increase resistance

>VG/2

ROFF

RON

RON

Initialize OUT to RON

S. Kvatinsky, D. Belousov, S. Liman, G. Satat, N. Wald, E. G. Friedman, A. Kolodny, and U. C. Weiser, "MAGIC – Memristor Aided LoGIC," IEEE TCAS II, Nov. 2014

R ON =Logic ‘1’R OFF=Logic ‘0’

18

Real MAGIC

B. C. Jung et al., “Zero-static-power nonvolatile logic-in-memory circuits for flexible electronics,” Nano Research, April 2017

VG VG

IN1 IN2OUT

19

MAGIC NOR in a Crossbar

VG VG

IN1 IN2OUT

20

MAGIC NOR in a Crossbar

21

VIsolate

VG VG

IN1 IN2OUT

IN INOU

T

VG VG

IN1 IN2 OUT

MAGIC NOR in a Memristive Memory

N. Talati, S. Gupta, P. Mane, and S. Kvatinsky, “Logic Design within Memristive Memories Using MAGIC," IEEE Transactions on Nanotechnology, July 2016

Hierarchy of Logical Functions

22

MUL

COPYNOT

DIV

NANDOR

Convolution

MAGIC - NOR

ADD SUB

XOR

NOR AND

Matrix multiplication

SQRTPOW

23

𝑓𝑛:

,

=

a0

a1

a2

an

b0

b1

b2

bn

c0

c1

c2

cn

Latency of the vector operation is

independent of the length of the vector

Control𝑓𝑛: 𝑅𝑛 × 𝑅𝑛 → 𝑅𝑛

Parallel Vector Operation within Memristive MPU

24

MemristiveMemory

Ro

w C

on

trol

Column Control

mM

PU

Co

ntro

ller

a0

a1

a2

an

b0

b1

b2

bn

mMPU µArchitecture

R. Ben-Hur and S. Kvatinsky, "Memory Processing Unit for In-Memory Processing," Proceedings of the IEEE/ACM International Symposium

on Nanoscale Architectures, July 2016

MemristiveMemory

25

Ro

w C

on

trol

Column Control

mM

PU

Co

ntro

ller

a0

a1

a2

an

b0

b1

b2

bn

mMPU µArchitecture

R. Ben-Hur and S. Kvatinsky, "Memory Processing Unit for In-Memory Processing," Proceedings of the IEEE/ACM International Symposium

on Nanoscale Architectures, July 2016

MemristiveMemory

26

Ro

w C

on

trol

Column Control

mM

PU

Co

ntro

ller

a0

a1

a2

an

b0

b1

b2

bn

mMPU µArchitecture

R. Ben-Hur and S. Kvatinsky, "Memory Processing Unit for In-Memory Processing," Proceedings of the IEEE/ACM International Symposium

on Nanoscale Architectures, July 2016

MemristiveMemory

27

Ro

w C

on

trol

Column Control

mM

PU

Co

ntro

ller

a0

a1

a2

an

b0

b1

b2

bn

c0

c1

c2

cn

mMPU µArchitecture

R. Ben-Hur and S. Kvatinsky, "Memory Processing Unit for In-Memory Processing," Proceedings of the IEEE/ACM International Symposium

on Nanoscale Architectures, July 2016

DRAM

DIMMMemristive

MemorymMPU

mMPU SystemsAccelerator or Main Memory?

28

Clock, Address, Data, and Controls

CPU

mMPU

Accelerators

?Memristive

memory with

processing

capabilities

29

Issues Involved in mMPU Architecture

CPU

mMPU

mMPU Architecture

mMPU Controller

?

Memory

Design

Periphery

Design

mMPU

Controller

Design and

Optimization

Programming

Model

Software

Applications

Agenda

• Memristors and MIA

• Memristive MPU (mMPU) architecture

• Summary

30

Memristors to the Rescue?

• New technologies enable memory intensive

architectures

• Better processors (multithreading, low power)

• Accelerators (machine learning)

• Smart memories (memory processing unit)

31

Thanks!

32

Recommended