26
Integrated Circuit Design with Integrated Circuit Design with NEM Relays NEM Relays Fred Chen Fred Chen 2 , Hei Kam , Hei Kam 1 , Dejan Markovic , Dejan Markovic 3 , , Tsu Tsu-Jae King Jae King Liu Liu 1 , Vladimir Stojanovic , Vladimir Stojanovic 2 , Elad Alon , Elad Alon 1 1 Department of EECS, University of California, Berkeley, CA Department of EECS, University of California, Berkeley, CA 2 Department of EECS, Massachusetts Institute of Technology, Cambridge, MA Department of EECS, Massachusetts Institute of Technology, Cambridge, MA 3 EE Department, University of California, Los Angeles, CA EE Department, University of California, Los Angeles, CA

Integrated Circuit Design with NEM Relays...Integrated Circuit Design with NEM Relays Fred Chen2, Hei Kam1, Dejan Markovic3, Tsu-Jae KingLiu1, Vladimir Stojanovic2, Elad Alon1 1Department

  • Upload
    others

  • View
    0

  • Download
    0

Embed Size (px)

Citation preview

  • Integrated Circuit Design with Integrated Circuit Design with

    NEM RelaysNEM Relays

    Fred ChenFred Chen22, Hei Kam, Hei Kam11, Dejan Markovic, Dejan Markovic33, ,

    TsuTsu--Jae KingJae King LiuLiu11, Vladimir Stojanovic, Vladimir Stojanovic22, Elad Alon, Elad Alon11

    11Department of EECS, University of California, Berkeley, CADepartment of EECS, University of California, Berkeley, CA22Department of EECS, Massachusetts Institute of Technology, Cambridge, MADepartment of EECS, Massachusetts Institute of Technology, Cambridge, MA

    33EE Department, University of California, Los Angeles, CAEE Department, University of California, Los Angeles, CA

  • 10.110.1

    100

    101

    102

    103

    104

    0

    20

    40

    60

    80

    100

    No

    rma

    lize

    d E

    ne

    rgy

    /op

    1/throughput (ps/op)

    CMOS is Scaling, Power Density is NotCMOS is Scaling, Power Density is Not

    VVdddd and and VVtt not scaling well not scaling well power/area not scalingpower/area not scaling

    ICCAD 2008 ICCAD 2008 –– November 12, 2008November 12, 2008 22

    400480088080

    8085

    8086

    286386

    486Pentium® proc

    P6

    1

    10

    100

    1000

    10000

    1970 1980 1990 2000 2010Year

    Po

    we

    r D

    en

    sit

    y (

    W/cm

    2)

    Hot Plate

    Nuclear Reactor

    Rocket Nozzle

    S. Borkar

    Sun’s Surface

    Power Density Prediction circa 2000

  • 10.110.1

    CMOS is Scaling, Power Density is NotCMOS is Scaling, Power Density is Not

    VVdddd and and VVtt not scaling well not scaling well power/area not scalingpower/area not scaling

    Parallelism to improve performance within power budgetParallelism to improve performance within power budget

    ICCAD 2008 ICCAD 2008 –– November 12, 2008November 12, 2008 33

    100

    101

    102

    103

    104

    0

    20

    40

    60

    80

    100

    No

    rmalized

    En

    erg

    y/o

    p

    1/throughput (ps/op)

    Operate at a

    lower energy

    point

    Run in parallel to

    recoup performance

    400480088080

    8085

    8086

    286386

    486Pentium® proc

    P6

    1

    10

    100

    1000

    10000

    1970 1980 1990 2000 2010Year

    Po

    we

    r D

    en

    sit

    y (

    W/cm

    2)

    Hot Plate

    Nuclear Reactor

    Rocket Nozzle

    S. Borkar

    Sun’s Surface

    Power Density Prediction circa 2000

    Core 2

  • 10.110.1

    101

    102

    103

    104

    105

    0

    20

    40

    60

    80

    100

    No

    rma

    lize

    d E

    ne

    rgy

    /op

    1/throughput (ps/op)

    Where Parallelism Doesn’t HelpWhere Parallelism Doesn’t Help

    ICCAD 2008 ICCAD 2008 –– November 12, 2008November 12, 2008 44

    CMOS circuits have wellCMOS circuits have well--defined minimum energydefined minimum energy

    Caused by leakage and finite sub-threshold swing

    Need to balance leakage and active energy

    Limits energy-efficiency, regardless how slow the circuit runs

    Energy/op vs. Vdd

    0.1 0.2 0.3 0.4 0.5

    5

    10

    15

    20

    25

    No

    rmalized

    En

    erg

    y/c

    ycle

    Vdd (V)

    Etotal

    Edynamic

    Eleak

    Energy/op vs. 1/throughput

    Scale Vdd & VT:

  • 10.110.1

    What if There Was No Leakage?What if There Was No Leakage?

    ICCAD 2008 ICCAD 2008 –– November 12, 2008November 12, 2008 55

    No longer need to balance increasing component of No longer need to balance increasing component of energy as energy as VddVdd decreasesdecreases

    Relays offer near infinite subRelays offer near infinite sub--threshold slope and threshold slope and no no

    leakage currentleakage current

    0.0

    0.2

    0.4

    0.6

    0.8

    1.0

    1.2

    34 35 36 37

    Measured relay I-V

    VGB (V)

    I DS

    (mA

    ) compliance limit

    VDS = 1V

    W x L = 2mm x 20mm

    H = g = 0.6mm

    Normalized Energy/cycle in CMOS

    0.1 0.2 0.3 0.4 0.5

    5

    10

    15

    20

    25

    No

    rma

    lize

    d E

    ne

    rgy

    /cyc

    le

    Vdd (V)

    Etotal

    Edynamic

  • 10.110.1

    OutlineOutline

    If we could If we could reliably reliably build relays, would circuits build relays, would circuits

    implemented using relays offer implemented using relays offer superior superior

    energyenergy--efficiencyefficiency to CMOS?to CMOS?

    Compact Circuit Model for NEM RelaysCompact Circuit Model for NEM Relays

    Fast yet key functionality captured

    Circuit Design Paradigms for NEM RelaysCircuit Design Paradigms for NEM Relays

    How to deal with ―slow‖ relays

    Compare vs. CMOSCompare vs. CMOS

    Digital Logic: 32-bit adder

    Mixed Signal I/O: DAC, ADC

    ICCAD 2008 ICCAD 2008 –– November 12, 2008November 12, 2008 66

  • 10.110.1 ICCAD 2008 ICCAD 2008 –– November 12, 2008November 12, 2008 77

    Plan-View

    Schematic Cross-Section

    L

    W

    Body Electrode(under body

    dielectric)

    DrainSource

    Channel

    Contact Dimples

    Electrical

    Gate

    Anchor

    NEM Relay Structure & OperationNEM Relay Structure & Operation

    Metal Channel Gate DielectricElectrical Gate

    Source Drain

    Bodyody

    Body Dielectric

    H

    HSource Drain

    Body

    Body Dielectric

    t gap

    Htdimple

    W

    Air gap

    Infinite off-state resistance Zero off-state leakage

    Open Relay (“OFF”):

    |Vgb| < Vpo (pull-out voltage)

    Closed Relay (“ON”):

    |Vgb| > Vpi (pull-in voltage)

    Relatively low on-state resistance (100Ω < Ron < 1kΩ @ 90nm)

  • 10.110.1

    NEM Relay ModelNEM Relay Model

    Compact Compact VerilogVerilog--A model for circuit simulation:A model for circuit simulation:

    Mechanical dynamics: spring (k), damper (b), mass (m)

    Electrical parasitics: non-linear gate-body (Cgb), gate-channel

    (Cgc), and source/drain-body cap (Csd_b), contact resistance (Rcs,d)

    VVpipi and delay verified against device measurementsand delay verified against device measurements

    ICCAD 2008 ICCAD 2008 –– November 12, 2008November 12, 2008 88

    0.5Rc

    RsdRsdCgb

    0.5Rc

    RcdRcs

    DS

    Cgc

    So DoCsd_bCsd_b

    B

    Go

    C

    Cs Cd

    Anchor

    Spring k Damper b

    Mechanical Model

    Mass m

  • 10.110.1

    NEM Relay ScalingNEM Relay Scaling

    ICCAD 2008 ICCAD 2008 –– November 12, 2008November 12, 2008 99

    Constant EConstant E--field scaling has similar benefits as CMOSfield scaling has similar benefits as CMOS

    Pull-in Voltage with Beam Scaling

    Measured pullMeasured pull--in voltages scale linearlyin voltages scale linearly

    If we can overcome reliability issues & surface forces scale:

    {W,L,tgap} = {90nm,2.3um,10nm} Vpi = 200mV

    Mechanical delay also scales linearly Mechanical delay also scales linearly (~10ns @90nm)(~10ns @90nm)

  • 10.110.1

    NEM Relay as a Logic ElementNEM Relay as a Logic Element

    ICCAD 2008 ICCAD 2008 –– November 12, 2008November 12, 2008 1010

    44--terminal design mimics MOSFET operationterminal design mimics MOSFET operation Electrostatic actuation is Electrostatic actuation is ambipolarambipolar

    NonNon--inverting logic possible: actuation independent of inverting logic possible: actuation independent of drain/source voltagesdrain/source voltages

    Mechanical delay (~10ns) >> electrical Mechanical delay (~10ns) >> electrical tt (~1ps)(~1ps) Can stack 200 devices (with 1kW on-resistance) before

    electrical delay is comparable to mechanical delay

  • 10.110.1

    Digital Circuit Design with RelaysDigital Circuit Design with Relays

    CMOS: delay set by electrical time constantCMOS: delay set by electrical time constant

    Quadratic delay penalty for stacking devices (pass transistor logic)

    Buffer & distribute logical/electrical effort over many stages

    Relays: delay dominated by mechanical movementRelays: delay dominated by mechanical movement

    Want all relays to switch simultaneously

    Implement logic as a single complex gate

    ICCAD 2008 ICCAD 2008 –– November 12, 2008November 12, 2008 1111

  • 10.110.1

    Digital Circuit Design with RelaysDigital Circuit Design with Relays

    Delay Comparison vs. CMOSDelay Comparison vs. CMOS

    Single mechanical delay vs. several electrical gate delays

    For reasonable loads, relay delay unaffected by fan-out/fan-in

    Area Comparison vs. CMOSArea Comparison vs. CMOS

    Larger individual devices

    Fewer devices needed to implement the same logic function

    ICCAD 2008 ICCAD 2008 –– November 12, 2008November 12, 2008 1212

    4 gate delays 1 mechanical delay

  • 10.110.1

    101

    102

    103

    104

    105

    0

    20

    40

    60

    80

    100

    No

    rma

    lize

    d E

    ne

    rgy

    /op

    1/throughput (ps/op)

    CMOS vs. Relays: Digital LogicCMOS vs. Relays: Digital Logic

    ICCAD 2008 ICCAD 2008 –– November 12, 2008November 12, 2008 1313

    Most processor components exhibit the energy vs. Most processor components exhibit the energy vs.

    performance tradeoff of static CMOSperformance tradeoff of static CMOS

    Control, Control, DatapathDatapath, Clock, Clock

    Adder energy performance tradeoff is representativeAdder energy performance tradeoff is representative

  • 10.110.1

    RelayRelay--based Adderbased Adder

    Full adder cell:Full adder cell:

    12 relays vs. 24

    transistors

    XOR for free

    Use fully complementary

    signals to avoid

    additional mechanical

    delay (to invert)

    Relays all sized Relays all sized

    minimallyminimally

    ICCAD 2008 ICCAD 2008 –– November 12, 2008November 12, 2008 1414

  • 10.110.1

    NN--bit Relaybit Relay--based Adderbased Adder

    Ripple carry Ripple carry

    configurationconfiguration

    Cascade full adder Cascade full adder

    cells to create larger cells to create larger

    complex gatecomplex gate

    Stack of N relaysStack of N relays——

    still 1 mechanical still 1 mechanical

    delaydelay

    ICCAD 2008 ICCAD 2008 –– November 12, 2008November 12, 2008 1515

  • 10.110.1

    Snapshot: NEM Relays vs. CMOSSnapshot: NEM Relays vs. CMOS

    32-bit adder CMOS Relay

    Supply

    Voltage

    0.5 V 0.32 - 0.9 V

    Load Cap

    per Output

    25 fF 25 fF

    Total Gate

    Cap

    4.0 pF 125 fF

    Area 600 mm2 480 mm2

    ICCAD 2008 ICCAD 2008 –– November 12, 2008November 12, 2008 1616

    Compare vs. Compare vs. SklanskySklansky CMOS adderCMOS adder11

    For similar area: For similar area:

    >9x lower energy/op, >10x greater delay>9x lower energy/op, >10x greater delay

    1D. Patil et. al., “Robust Energy-Efficient Adder Topologies,” in Proc. 18th IEEE Symp. on Computer Arithmetic (ARITH'07).

    9x

    10x

    Energy/op vs. Delay/op across Vdd

    Energy is for adder only

  • 10.110.1

    EnergyEnergy--Delay ResultsDelay Results

    ICCAD 2008 ICCAD 2008 –– November 12, 2008November 12, 2008 1717

    Energy/op vs. Delay/op across Vdd & CL

    30x capacitance gain30x capacitance gain

    Lower device Cg, Cd

    Fewer devices

    2.4x 2.4x VddVdd gaingain

    No leakage energy

    Relays always provide a more energy efficient solutionRelays always provide a more energy efficient solution

    > 10x better, even in the GOPS range

  • 10.110.1

    Area vs. Throughput TradeoffArea vs. Throughput Tradeoff

    For high throughputs, relays require area overhead to For high throughputs, relays require area overhead to

    achieve similar throughputs as CMOSachieve similar throughputs as CMOS

    Area overhead is bounded (max. 6xArea overhead is bounded (max. 6x--25x) : 25x) :

    To maintain minimum energy/op for throughputs > 1GOPS,

    CMOS needs parallelism too

    ICCAD 2008 ICCAD 2008 –– November 12, 2008November 12, 2008 1818

    0.5 1 1.5 2 2.5 3

    5

    10

    15

    20

    25

    30

    Throughput (GOPS)

    AR

    ela

    y/A

    CM

    OS

    W Relay (buffered, 25fF)

    W Relay (buffered, 100fF)

    Au Relay (25fF)

    Au Relay (100fF)

    Erelay = 20fJ/op

  • 10.110.1

    CMOS vs. Relays: Mixed SignalCMOS vs. Relays: Mixed Signal

    ICCAD 2008 ICCAD 2008 –– November 12, 2008November 12, 2008 1919

    I/O energy dominant if core is ~30x more efficientI/O energy dominant if core is ~30x more efficient

    See if I/O circuits can benefit as wellSee if I/O circuits can benefit as well

    Representative examples: DAC & ADCRepresentative examples: DAC & ADC

    How to process/generate analog signals?How to process/generate analog signals?

    No ―linear gain‖ in relays No ―linear gain‖ in relays –– rely on switchingrely on switching

  • 10.110.1

    NEM Relay Based DACNEM Relay Based DAC

    Same topology as CMOS except:Same topology as CMOS except:

    Need to add passive R to set impedance

    Don’t have to buffer up or size relays to drive output

    Relays’ gate voltage independent of I/O voltage

    Energy dominated by I/O energy:Energy dominated by I/O energy:

    @VIO=200mV, CL=1pF: 62.8fJ/bit vs. ~780fJ/bit for CMOS

    ICCAD 2008 ICCAD 2008 –– November 12, 2008November 12, 2008 2020

    Voltage OutputDAC Architecture AC Impedance

    2

    2BIT IO LE V C

  • 10.110.1

    NEM Relay Based Flash ADCNEM Relay Based Flash ADC

    Topologically similar to CMOSTopologically similar to CMOS

    Sample/Hold input

    Flash Converter – bank of

    comparators

    Key differences…Key differences…

    Sample & Hold: Unbalanced ―on‖ vs.

    ―off‖ switching times

    Comparators: Body terminal used as

    threshold, ambipolar actuation,

    hysteretic switching

    ICCAD 2008 ICCAD 2008 –– November 12, 2008November 12, 2008 2121

  • 10.110.1

    Body terminal used to set Body terminal used to set

    thresholdthreshold

    Hysteretic switching impactHysteretic switching impact

    Comparator will have different

    threshold if previous state varies

    Need to reset all comparators to

    the same state before each

    evaluation dynamic logic

    Relay Based ComparatorsRelay Based Comparators

    ICCAD 2008 ICCAD 2008 –– November 12, 2008November 12, 2008 2222

    VT

  • 10.110.1

    AmbipolarAmbipolar actuation impactactuation impact

    Comparators above & below input will evaluate need a different

    decoder

    Each comparator has a ―deadEach comparator has a ―dead--zone‖ where it stays offzone‖ where it stays off

    Can use ―Max‖ or ―Min‖ of this range to determine code

    ICCAD 2008 ICCAD 2008 –– November 12, 2008November 12, 2008 2323

    -0.2 0 0.2 0.4 0.6 0.8 10

    10

    20

    30

    40

    50

    60

    Input Voltage (V)

    Ou

    tpu

    t C

    od

    e

    Max

    Min

    Relay Based ComparatorsRelay Based Comparators

    VT

  • 10.110.1

    ADC ResultsADC Results

    Energy dominated by resistor string (320fJ out of 350fJ)Energy dominated by resistor string (320fJ out of 350fJ)

    Largely dependent on input dynamic range (VREF), not core voltage

    FOM improves with more resolution as resistor string still

    dominates

    Most advantages stem from lower RMost advantages stem from lower Ronon at smaller Cat smaller Cgg’s’s

    ICCAD 2008 ICCAD 2008 –– November 12, 2008November 12, 2008 2424

    Vcore = 0.3V

    C = 500fF

    R = 4kW

    VREF = 1V

    fsamp = 10 MS/s

    6-bit res: 5.5fJ/conv

  • 10.110.1

    ConclusionsConclusions

    NEM relays offer unique characteristics NEM relays offer unique characteristics

    Superior Ion/Ioff ratio vs. CMOS

    Superior Ron*Cg vs. CMOS

    Switching delay largely independent of electrical t

    Relay circuits show Relay circuits show order of magnitudeorder of magnitude energy energy

    efficiency improvements over CMOSefficiency improvements over CMOS

    Use parallelism (and area) to achieve comparable

    throughputs in digital circuits

    Mixed signal circuits see similar benefits from low Ron at

    small Cgs

    Key challenge is scaling devices reliablyKey challenge is scaling devices reliably

    ICCAD 2008 ICCAD 2008 –– November 12, 2008November 12, 2008 2525

  • 10.110.1

    AcknowledgementsAcknowledgements

    Berkeley Wireless Research CenterBerkeley Wireless Research Center

    NSFNSF

    DARPADARPA

    FCRP (C2S2)FCRP (C2S2)

    Intel Foundation Graduate FellowshipIntel Foundation Graduate Fellowship

    ICCAD 2008 ICCAD 2008 –– November 12, 2008November 12, 2008 2626