Uniform Methodology of Benchmarking Beyond-CMOS Devices. Dmitri Nikonov, Ian Young. “Benchmarking is a tool, not a weapon.” - Ian Young. Exploratory Integrated Circuits Group / Components Research

Uniform Methodology of Benchmarking Beyond-CMOS Devices


Page 1: Uniform Methodology of Benchmarking Beyond-CMOS Devices

Uniform Methodology of Benchmarking Beyond-CMOS Devices

Dmitri Nikonov, Ian Young

“Benchmarking is a tool, not a weapon.”

- Ian Young

Exploratory Integrated Circuits Group / Components Research

Page 2: Uniform Methodology of Benchmarking Beyond-CMOS Devices

Acknowledgement

• Benchmarking thrust led by K. Bernstein for the last 3 years.

• It was a momentous step: from concepts of devices to envisioning practical circuits.

• We wanted to equalize the assumptions and approaches to benchmarking.

• All device PIs were very supportive of this work. Worked hard to understand and refresh their data. More than 3 meetings with each group.

• We take all the responsibility for possible errors in the new analysis.

• Hoping that this overview will stimulate discussions and improvements.


Page 3: Uniform Methodology of Benchmarking Beyond-CMOS Devices

Insights of Benchmarking

Practical considerations

1. Area is determined by metal pitch (4F) to connect to the device [terminal connections]

2. Parasitics can overwhelm the intrinsic device attributes.

3. Majority gates permit more compact, faster circuits

4. Power delivery dominates for very low-voltage devices

Results

1. Spintronic devices are dominated by either:
– switching energy (spin torque), or
– magnetization switching speed (magnetoelectric).

2. Charge-based devices are an attractive option: good E*d, compatible with CMOS circuits

3. Spintronic devices still competitive on throughput at low power


Page 4: Uniform Methodology of Benchmarking Beyond-CMOS Devices

Computational Variables and Transduction


Class | Variables | Example
Charge | Q, I, V | CMOS, TFET
Electric dipole | P | (FeFET)
Magnetic dipole | M, Ispin | ASL, SWD, NML
Orbital state | Bose condensate | BisFET

Page 5: Uniform Methodology of Benchmarking Beyond-CMOS Devices

Computational Variables and Transduction


A. Transistor-like devices (CMOS HP, CMOS LP, III-V FET, HJFET, gnrFET, spinFET); also GpnJ, BisFET

B. STT/DW, STOlogic, STTriad: current-driven = spin torque switching

C. SMG, SWD, NML: voltage-driven = magnetoelectric switching; also ASLD (current-driven)

The output of a device needs to be the same computational variable (and in the same range) as its input. Otherwise a transducer is needed.

Page 6: Uniform Methodology of Benchmarking Beyond-CMOS Devices

Beyond CMOS Devices


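[Figure: schematics of the benchmarked devices, grouped into three classes: 1. Electronic, 2. Spintronic, 3. Orbitronic. Figure labels: e, spin-current, spin-torque.]

Devices shown: Tunneling FET, Graphene pn Junction, SpinFET, All Spin Logic, Spintronic Majority Gate, Spin Wave Device, Nano Magnet Logic, Domain Wall Logic, Spin Torque Triad, Spin Torque Oscillator, BisFET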

Page 7: Uniform Methodology of Benchmarking Beyond-CMOS Devices

Barriers, Collectives, Thermodynamics


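[Figure: generic electronic switch (source, gate, drain; electron e- crossing an energy barrier) and generic spintronic switch (energy vs. angle θ, with θ = 0/2π and π, driven by -V and +V). Figure annotations: leakage not related to barrier; leakage determined by barrier.]

$I_{on}/I_{off} < \exp(e\Delta V / kT)$  (1)

$E = \frac{1}{2}\mu_0 \mu_B N_s H_k \sim 60\,kT$  (2)

$E = N_e e \Delta V \sim 4000\,kT$  (3)

Property | Generic Electronic Switch | Generic Spintronic Switch
Barrier | 20 kT (from Ion/Ioff) | 60 kT (non-volatile)
Voltage | 0.5 – 1 V | 10 – 100 mV
Particles | Ne = 200 electrons | Ns = 10000 spins
Switching energy limit | 4000 kT = Ne*20 kT | 60 kT
Phenomenon | Non-collective | Collective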
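The arithmetic behind the barrier comparison can be sketched in a few lines. This is only an illustration: the function names are hypothetical, and the conversion to joules assumes a temperature of 300 K, which the slide does not state.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant, J/K


def on_off_ratio_bound(barrier_kt):
    """Upper bound on Ion/Ioff for a barrier given in units of kT, as in Eq. (1)."""
    return math.exp(barrier_kt)


def electronic_switching_energy_kt(n_electrons, barrier_kt):
    """Non-collective switch: every electron pays the barrier, as in Eq. (3)."""
    return n_electrons * barrier_kt


def kt_to_joules(energy_kt, temperature_k=300.0):
    """Convert an energy in units of kT to joules (300 K is an assumption)."""
    return energy_kt * K_B * temperature_k


electronic_kt = electronic_switching_energy_kt(200, 20)  # 200 electrons, 20 kT barrier
spintronic_kt = 60  # collective switch: one barrier for all spins

print(f"Ion/Ioff bound for 20 kT: {on_off_ratio_bound(20):.3g}")
print(f"Electronic limit: {electronic_kt} kT = {kt_to_joules(electronic_kt):.3g} J")
print(f"Spintronic limit: {spintronic_kt} kT = {kt_to_joules(spintronic_kt):.3g} J")
```

The point of the comparison is that the spintronic barrier is paid once for the whole collective of spins, while the electronic barrier is paid per electron.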

Page 8: Uniform Methodology of Benchmarking Beyond-CMOS Devices

Beyond CMOS Device Density


Electrode pitch (4F=8λ) limits the Beyond CMOS device density.

[Figure panels: transistor NAND layout; Spintronic Majority Gate (SMG) adder layout; plot of Intel transistor pitch vs. F (Moore’s law)]

We benchmarked beyond-CMOS devices with λ-based scalable design rules.

• Technology node parameter: F = 15 nm
• Scalable design rules unit: F = 2λ
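As a quick check of the design-rule numbers, the sketch below derives λ and the electrode pitch from F. The function name is hypothetical, and the density figure assumes exactly one device per pitch-by-pitch square, which is an idealization and not a number from the slides.

```python
def design_rule_lengths(f_nm):
    """Derive lambda and electrode pitch from the node parameter F (in nm)."""
    lam_nm = f_nm / 2.0        # F = 2*lambda
    pitch_nm = 4.0 * f_nm      # electrode pitch = 4F
    return {
        "lambda_nm": lam_nm,
        "pitch_nm": pitch_nm,
        "pitch_in_lambda": pitch_nm / lam_nm,
    }


def max_density_per_cm2(pitch_nm):
    """Idealized bound: one device per pitch x pitch square (an assumption)."""
    nm2_per_cm2 = 1e14
    return nm2_per_cm2 / (pitch_nm * pitch_nm)


rules = design_rule_lengths(15.0)
print(rules)
print(f"Idealized density bound: {max_density_per_cm2(rules['pitch_nm']):.3g} per cm^2")
```

For F = 15 nm this gives λ = 7.5 nm and a 60 nm pitch, which equals 8λ as stated on the slide.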

Page 9: Uniform Methodology of Benchmarking Beyond-CMOS Devices

Adder: Transistor or Majority Gates


• Spintronic circuits can be more compact

Adder = 28 transistors (at least)

… or just 3 majority gates (Nanomagnetic Logic)

… or just 2 majority gates (All Spin Logic)

… or just 1 majority gate (Spin Wave Devices)!
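To illustrate why majority gates shrink an adder, here is a minimal sketch of a 1-bit full adder built from three 3-input majority gates plus inverters. It is one well-known textbook construction; whether it matches the nanomagnetic logic layout in the slides is an assumption, and the function names are hypothetical.

```python
from itertools import product


def maj(a, b, c):
    """3-input majority gate: returns 1 when at least two inputs are 1."""
    return int(a + b + c >= 2)


def inv(a):
    """Inverter for a single bit."""
    return 1 - a


def full_adder_maj(a, b, cin):
    """1-bit full adder from three majority gates and two inverters."""
    cout = maj(a, b, cin)               # carry is simply the majority
    partial = maj(a, b, inv(cin))       # helper term
    s = maj(inv(cout), cin, partial)    # sum bit
    return s, cout


# Exhaustive check against ordinary binary addition.
for a, b, cin in product((0, 1), repeat=3):
    s, cout = full_adder_maj(a, b, cin)
    assert a + b + cin == s + 2 * cout
print("full adder verified for all 8 input combinations")
```

The carry output alone needs just one majority gate, which hints at how devices with richer primitives can reduce the gate count further.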


Page 10: Uniform Methodology of Benchmarking Beyond-CMOS Devices

Energy-Delay of Beyond CMOS Electronic Devices


Rely on device simulations done by the device research groups to obtain input parameters, but audit the numbers.

• Add thorough accounting of driven capacitance, parasitics, and interconnect [from realistic layout].
• Obtain switching time and energy of gates (inverter, NAND, XOR, 1-bit adder).
• Values validated against Purdue’s electronic simulator, PETE.

Simple underlying equations for intrinsic switching time and energy:

$t_{int} = C_{dev} V_{dd} / I_{dev}$  (4)

$E_{int} = C_{dev} V_{dd}^2$  (5)

Device | Vdd, V | Ion, A/m
CMOS HP | 0.73 | 1805
CMOS LP | 0.3 | 2
IIIvTFET | 0.2 | 25
HJTFET | 0.3 | 112
gnrTFET | 0.1 | 20
GpnJ | 0.7 | 3932
BisFET | NA | NA
SpinFET | 0.7 | 700
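Equations (4) and (5) are simple enough to evaluate directly. The sketch below does so for the CMOS HP row; the device width (60 nm) and device capacitance (50 aF) are hypothetical placeholders chosen only to make the example run, not values from the slides.

```python
def intrinsic_switching(c_dev_f, v_dd_v, i_dev_a):
    """Return (t_int, E_int) from Eqs. (4) and (5), in seconds and joules."""
    t_int = c_dev_f * v_dd_v / i_dev_a
    e_int = c_dev_f * v_dd_v ** 2
    return t_int, e_int


# CMOS HP row of the table: Vdd = 0.73 V, Ion = 1805 A/m.
V_DD = 0.73
ION_A_PER_M = 1805.0

# Hypothetical values, NOT from the slides:
WIDTH_M = 60e-9    # assumed device width
C_DEV_F = 50e-18   # assumed device capacitance

i_dev = ION_A_PER_M * WIDTH_M  # on-current of one device, in A
t_int, e_int = intrinsic_switching(C_DEV_F, V_DD, i_dev)
print(f"I_dev = {i_dev * 1e6:.1f} uA, t_int = {t_int * 1e12:.3f} ps, E_int = {e_int * 1e18:.1f} aJ")
```

These are intrinsic numbers only; the methodology on this slide adds parasitics and interconnect on top of them.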

Page 11: Uniform Methodology of Benchmarking Beyond-CMOS Devices

Switching Current vs. Supply Voltage

[Figure: log-log scatter plot of current (µA) vs. voltage (V) for CMOS HP, CMOS LP, IIIvTFET, HJTFET, gnrTFET, GpnJ, SpinFET, STT/DW, SMG, STTtriad, STOlogic, ASLD, SWD, NML. Annotated regions: high resistance, low resistance.]

• Spin torque: 10 mV
• Magnetoelectric: 100 mV
• Electronic: 100-700 mV

Page 12: Uniform Methodology of Benchmarking Beyond-CMOS Devices

Energy-Delay of Beyond CMOS Spintronic Devices


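[Figure: device stacks with layer labels STO, SRO, PZT and CoFe, MgO, CoFe, IrMn, Cu; terminals Vin and Iout; FE polarization indicated]

Current-induced spin torque (STT)

• Energy barrier: $U_b = K_u v_{nm}$
• Critical current: $I_c = e \alpha U_b / (\hbar P)$
• Time to switch: $t_{stt}$ [equation garbled in extraction; it combines $e$, $M_s$, $v_{nm}$, $g$, $\mu_B$, $P$, $I_c$, $k_B T$, $U_b$ and a logarithm]
• Energy: $E_{stt} = I_{dev} V_{dd} t_{stt}$ (dominates)

Voltage-driven magnetoelectric switching

• Polarization: $P_{ms} = \varepsilon_0 \varepsilon_{ms} E_{ms}$
• Charge: $Q_{ms} = P_{ms} w_S^2 + c_{tl} w_X V_{dd}$
• Time to switch: $t_{mag}$ [equation garbled in extraction; it combines $\pi$, $\gamma$, $B_{me}$ and a factor of 2] (dominates)
• Energy: $E_{ms} = Q_{ms} V_{dd}$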
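The two energy expressions on this slide can be compared with a short sketch. Only the formulas $E_{stt} = I_{dev} V_{dd} t_{stt}$, $E_{ms} = Q_{ms} V_{dd}$ and $P_{ms} = \varepsilon_0 \varepsilon_{ms} E_{ms}$ come from the slide; every numerical input below is a hypothetical placeholder, picked near the voltage ranges quoted elsewhere in the deck.

```python
EPS0 = 8.8541878128e-12  # vacuum permittivity, F/m


def stt_energy(i_dev_a, v_dd_v, t_stt_s):
    """Spin torque switching energy: current flows for the whole switching time."""
    return i_dev_a * v_dd_v * t_stt_s


def me_polarization(eps_ms, field_v_per_m):
    """Magnetoelectric polarization induced by an electric field, in C/m^2."""
    return EPS0 * eps_ms * field_v_per_m


def me_energy(q_ms_c, v_dd_v):
    """Magnetoelectric switching energy: charge delivered at the supply voltage."""
    return q_ms_c * v_dd_v


# Hypothetical inputs, NOT from the slides:
e_stt = stt_energy(i_dev_a=100e-6, v_dd_v=10e-3, t_stt_s=1e-9)
e_ms = me_energy(q_ms_c=100e-18, v_dd_v=0.1)
print(f"E_stt = {e_stt * 1e15:.3f} fJ, E_ms = {e_ms * 1e15:.3f} fJ")
```

With placeholder inputs like these, the current-driven case costs more energy because the current must be sustained for the full switching time, which is the qualitative point of the "dominates" labels.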

Page 13: Uniform Methodology of Benchmarking Beyond-CMOS Devices

Benchmarks with spin torque

[Figure: log-log scatter plot of energy (fJ) vs. delay (ps) for a 32-bit adder, for CMOS HP, CMOS LP, IIIvTFET, HJTFET, gnrTFET, GpnJ, SpinFET, STT/DW, SMG, STTtriad, STOlogic, ASLD, SWD, NML. Lines of constant energy*delay (E*d) from 10^-26 to 10^-21. Groups labeled: Electronics, Spin torque.]

Page 14: Uniform Methodology of Benchmarking Beyond-CMOS Devices

Benchmarks with magnetoelectric

[Figure: log-log scatter plot of energy (fJ) vs. delay (ps) for a 32-bit adder, for CMOS HP, CMOS LP, IIIvTFET, HJTFET, gnrTFET, GpnJ, SpinFET, STT/DW, SMG, STTtriad, STOlogic, ASLD, SWD, NML. Lines of constant energy*delay (E*d) from 10^-26 to 10^-21. Groups labeled: Electronics, Spin torque, Magnetoelectric.]

Page 15: Uniform Methodology of Benchmarking Beyond-CMOS Devices


Changes in Benchmarking (selected)

Devices | NRI Oct 2011 | Intel March 2012
CMOS HP | Purdue 15nm MOSFET simulated parameters, V=0.7V | Taken from ITRS, V=0.73V
STTriad | Only spin torque switching | Possibility of magnetoelectric switching
STOlogic | In-plane magnetization | Perpendicular magnetization, optimistic inputs
STT/DW | Proponent’s old scheme, not well-founded calculations | New device scheme and architecture
ASLD | Smaller voltage. Volatile. Only intrinsic contribution. | Larger voltage. Non-volatile.
NML | Smaller size, favorable assumptions of clocking | Magnetoelectric switching, larger size, interconnect contribution


Page 16: Uniform Methodology of Benchmarking Beyond-CMOS Devices

Before and After

[Figure: two log-log scatter plots of energy (fJ) vs. delay (ps) for NAND2, labeled NRI Oct 2011 and Intel March 2012, covering CMOS HP, CMOS LP, IIIvTFET, HJTFET, gnrTFET, GpnJ, BisFET, SpinFET, STT/DW, SMG, STTtriad, STOlogic, ASLD, SWD, NML. Changes between the two are annotated: size, voltage, ITRS, magnetoelectric, perp M, scheme.]

Page 17: Uniform Methodology of Benchmarking Beyond-CMOS Devices

Differences, CMOS


Values from ITRS 2011

[Figure panels: Before, Now]

A. Khakifirooz et al., IEEE TED vol. 55, pp. 1391–1400, 2008 (MIT)

Page 18: Uniform Methodology of Benchmarking Beyond-CMOS Devices

Differences, BisFET - TBD


Reddy et al., IEEE TED 57, 755 (2010), UT Austin
Gilbert, IEEE TED 57, 3059 (2010), UIUC

UT Austin, Vdd = 25 mV
• Curve fitted to one back-of-the-envelope calculated point.
• Thermal distribution and SS not accounted for.
• Need a simulated I-V curve. Not ready to benchmark.

UIUC, Vdd = 600 mV
• Simulations of transport via Landauer transmission probability.
• But, according to the device group: the parameters may not be optimal, and the wiring is different.


Page 19: Uniform Methodology of Benchmarking Beyond-CMOS Devices

Power delivery


• Limited by Cu conductivity.
• Current per device: 0.1 mA
• 1000 devices on the network.
• Size: 6 µm × 6 µm
• Voltage drop of 2 mV in the device itself.

[Vias not shown]

• Contribution of wires is NOT negligible.
• Estimate for the power and ground networks: a minimum supply of 10 mV is needed.
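A rough budget for these numbers can be sketched as follows. It assumes that all 1000 devices draw their current at the same time and that the power and ground networks can be lumped into a single series resistance; both are simplifying assumptions for illustration, and the function name is hypothetical.

```python
def supply_budget(i_device_a, n_devices, v_supply_v, v_device_v):
    """Lumped estimate of network voltage drop and resistance (simplified model)."""
    total_current_a = i_device_a * n_devices
    v_network_v = v_supply_v - v_device_v          # drop left for power + ground wires
    r_network_ohm = v_network_v / total_current_a  # lumped series resistance
    fraction_in_device = v_device_v / v_supply_v
    return total_current_a, v_network_v, r_network_ohm, fraction_in_device


total_i, v_net, r_net, frac = supply_budget(
    i_device_a=0.1e-3,   # 0.1 mA per device
    n_devices=1000,
    v_supply_v=10e-3,    # minimum supply from the slide
    v_device_v=2e-3,     # drop in the device itself
)
print(f"Total current: {total_i:.3f} A")
print(f"Drop across networks: {v_net * 1e3:.1f} mV, lumped resistance: {r_net * 1e3:.0f} mOhm")
print(f"Fraction of supply across the device: {frac:.0%}")
```

Under these assumptions only a fifth of the supply voltage reaches the device, which is why the wires cannot be neglected at millivolt supplies.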

Page 20: Uniform Methodology of Benchmarking Beyond-CMOS Devices

Differences, ASLD

[Figure: ASLD with input and output, fed through power and ground distribution networks shared by 1000 devices; labels VSS = 2 mV, Vdd = 10 mV, and 2 mV across the device]

Parameter | Purdue | Intel
Size | 10x10x1 nm | 15x15x2 nm
Number of spins | 3000 | 14500
Retention | Volatile, ∆=10kT | Non-volatile, ∆=65kT
Power delivery | Voltage supplied directly to the device | Considering the resistance of the power and ground distribution networks, hierarchical
Supply voltage | Vss = 2 mV | Vdd = 10 mV

Page 21: Uniform Methodology of Benchmarking Beyond-CMOS Devices

Device vs. Circuit, Time

[Figure: log-log scatter plot of adder/device delay ratio vs. device delay (ps) for CMOS HP, CMOS LP, IIIvTFET, HJTFET, gnrTFET, GpnJ, SpinFET, STT/DW, SMG, STTtriad, STOlogic, ASLD, SWD, NML. Annotations: fast devices, fast circuits, majority gates.]

• Majority gates => faster circuits

Page 22: Uniform Methodology of Benchmarking Beyond-CMOS Devices

Device vs. Circuit, Energy*Delay

[Figure: log-log scatter plot of adder/device E*d ratio vs. device E*d (fJ·ps) for CMOS HP, CMOS LP, IIIvTFET, HJTFET, gnrTFET, GpnJ, SpinFET, STT/DW, SMG, STTtriad, STOlogic, ASLD, SWD, NML. Annotations: efficient devices, efficient circuits.]

• Fewer elements => efficient circuits

Page 23: Uniform Methodology of Benchmarking Beyond-CMOS Devices

Switching time and energy, closer look

[Figure: log-log scatter plot of energy (fJ) vs. delay (ps) for a 32-bit adder, for CMOS HP, CMOS LP, IIIvTFET, HJTFET, gnrTFET, GpnJ, SpinFET, STT/DW, SMG, STTtriad, STOlogic, ASLD, SWD, NML. Axes annotated worse/better and fast/slow. Group annotations: limited by capacitor charging; steep turn-on/off (TFETs); limited by spin dynamics; magneto-electric; potentially nonvolatile.]

Page 24: Uniform Methodology of Benchmarking Beyond-CMOS Devices

Energy Aware Figure of Merit: Throughput with Capped Power

New FOM = Throughput with Capped Power

“Computational throughput with capped power measured as Operations per second per logic die area measures how useful a computer is, in a power constrained computing environment.”

Choose 10W/cm2* as the cap

Capping the power re-scales throughput by the same factor, through either:

i. Less dense circuits

ii. Slower circuits

* Clocking and long interconnect dissipation are not included


Throughput @ Capped Power = Switching Operations/Area/Time

T@CP Units = [Operations/s/cm2]
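A minimal sketch of the capped-power figure of merit follows, assuming that dissipated power density is simply throughput times energy per operation (consistent with the footnote that clocking and long interconnects are excluded). The function name and the example inputs are hypothetical.

```python
POWER_CAP_W_PER_CM2 = 10.0  # cap chosen on the slide


def capped_throughput(ops_per_s_per_cm2, energy_per_op_j, cap=POWER_CAP_W_PER_CM2):
    """Return (throughput, power density) after applying the power cap.

    If the uncapped power exceeds the cap, throughput is scaled down by the
    same factor as the power, which models either sparser or slower circuits.
    """
    power = ops_per_s_per_cm2 * energy_per_op_j
    if power <= cap:
        return ops_per_s_per_cm2, power
    scale = cap / power
    return ops_per_s_per_cm2 * scale, cap


# Hypothetical circuits, NOT data from the slides:
print(capped_throughput(1e15, 1e-13))  # 100 W/cm^2 uncapped, so throughput drops 10x
print(capped_throughput(1e15, 1e-15))  # 1 W/cm^2, under the cap, so unchanged
```

A low-energy device keeps its full throughput under the cap, while a fast but power-hungry one is throttled, which is what lets slower spintronic devices stay competitive on this metric.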

Page 25: Uniform Methodology of Benchmarking Beyond-CMOS Devices

Throughput and Power Comparison

[Figure: log-log scatter plot of power (W/cm2) vs. throughput (PetaIntegerOps/s/cm2) for a 32-bit adder, for CMOS HP, CMOS LP, IIIvTFET, HJTFET, gnrTFET, GpnJ, SpinFET, STT/DW, SMG, STTtriad, STOlogic, ASLD, SWD, NML. Axes annotated worse/better. Annotations: limited by power dissipation; energy efficient, lower voltage.]

* Cap for power 10 W/cm2, slowed down circuits

• SWD, HJTFET = high throughput, low power

Page 26: Uniform Methodology of Benchmarking Beyond-CMOS Devices

Insights of Benchmarking

Practical considerations

1. Area is determined by metal pitch (4F) to connect to the device [terminal connections]

2. Parasitics can overwhelm the intrinsic device attributes.

3. Majority gates permit more compact, faster circuits

4. Power delivery dominates for very low-voltage devices

Results

1. Spintronic devices are dominated by either:
– switching energy (spin torque), or
– magnetization switching speed (magnetoelectric).

2. Charge-based devices are an attractive option: good E*d, compatible with CMOS circuits

3. Spintronic devices still competitive on throughput at low power


Page 27: Uniform Methodology of Benchmarking Beyond-CMOS Devices

Proposed Next Steps for Benchmarking

• Need a metric for standby power (leakage)

• Architectural aspects (incl. activity factor, pipelining)

• How to utilize non-volatility of devices in circuits

• Material study for magnetoelectric switching

• Expand to other circuits: mux, latch, ALU, FFT, …?

• Can this technology do analog circuit functions?

• Can it also lend itself to neuromorphic/non-Boolean computing?


Page 28: Uniform Methodology of Benchmarking Beyond-CMOS Devices

BACKUP


Page 29: Uniform Methodology of Benchmarking Beyond-CMOS Devices

Devices Off the Table

[Figure: schematics of devices excluded from benchmarking, with class labels Electronic, Spintronic, Orbitronic, Phononic. Schematic labels: current, substrate, contacts, conductor, Hbias.]

• Excitonic FET
• Graphene thermal transistor
• Resonant Injection Enhanced FET
• MTJ+STT
• SET/binary decision diagram
• Electron structure modulation FET
• Domain wall ring
• RAMA

Page 30: Uniform Methodology of Benchmarking Beyond-CMOS Devices

Device vs. Circuit, Energy

[Figure: log-log scatter plot of adder/device energy ratio vs. device energy (fJ) for CMOS HP, CMOS LP, IIIvTFET, HJTFET, gnrTFET, GpnJ, SpinFET, STT/DW, SMG, STTtriad, STOlogic, ASLD, SWD, NML. Annotations: low energy devices, low energy circuits.]

Page 31: Uniform Methodology of Benchmarking Beyond-CMOS Devices

Charge vs. Voltage

[Figure: log-log scatter plot with axes labeled current (µA) and voltage (V) for CMOS HP, CMOS LP, IIIvTFET, HJTFET, gnrTFET, GpnJ, SpinFET, STT/DW, SMG, STTtriad, STOlogic, ASLD, SWD, NML. Annotations: high energy; shot noise?; low capacitance.]

Page 32: Uniform Methodology of Benchmarking Beyond-CMOS Devices

Charge vs. Resistance

[Figure: log-log scatter plot of charge (e) vs. resistance (Ohm) for CMOS HP, CMOS LP, IIIvTFET, HJTFET, gnrTFET, GpnJ, SpinFET, STT/DW, SMG, STTtriad, STOlogic, ASLD, SWD, NML. Annotations: larger energy*delay, smaller energy*delay.]

Page 33: Uniform Methodology of Benchmarking Beyond-CMOS Devices

Capacitance vs. Resistance

[Figure: log-log scatter plot of capacitance (aF) vs. resistance (Ohm) for CMOS HP, CMOS LP, IIIvTFET, HJTFET, gnrTFET, GpnJ, SpinFET, STT/DW, SMG, STTtriad, STOlogic, ASLD, SWD, NML. Annotations: slower devices, faster devices.]

Page 34: Uniform Methodology of Benchmarking Beyond-CMOS Devices

Energy*Delay vs. Q2*R

[Figure: log-log scatter plot of energy*delay (in units of h) vs. Q^2 R (in units of h) for CMOS HP, CMOS LP, IIIvTFET, HJTFET, gnrTFET, GpnJ, SpinFET, STT/DW, SMG, STTtriad, STOlogic, ASLD, SWD, NML. Annotations: switch with Beff, slower than electric; too good to be true?]
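The link between energy*delay and Q^2*R follows from the intrinsic expressions $t_{int} = C_{dev} V_{dd} / I_{dev}$ and $E_{int} = C_{dev} V_{dd}^2$ if one takes Q = C V and R = V / I. The sketch below checks that identity numerically; reading the axis unit "h" as Planck's constant is an assumption, and the input values are hypothetical.

```python
H_PLANCK = 6.62607015e-34  # Planck constant, J*s (assumed meaning of the unit "h")


def energy_delay(c_f, v_v, i_a):
    """Energy*delay from E = C*V^2 and t = C*V/I, in J*s."""
    return (c_f * v_v ** 2) * (c_f * v_v / i_a)


def q_squared_r(c_f, v_v, i_a):
    """Q^2 * R with Q = C*V and R = V/I, in J*s."""
    q = c_f * v_v
    r = v_v / i_a
    return q * q * r


def in_units_of_h(action_js):
    """Express a quantity with units of action as a multiple of h."""
    return action_js / H_PLANCK


# Hypothetical device values, NOT from the slides:
c, v, i = 50e-18, 0.73, 1.0e-4
ed = energy_delay(c, v, i)
qr = q_squared_r(c, v, i)
print(f"E*d = {in_units_of_h(ed):.3g} h, Q^2*R = {in_units_of_h(qr):.3g} h")
```

For a purely capacitive, charge-based switch the two quantities coincide, so points far from the diagonal in such a plot indicate devices whose delay is set by something other than charging, such as magnetization dynamics.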