L3 - System on Chip


  • 8/11/2019 L3 - System on Chip

    1/68


    Outline

    The framework: computing platforms in the broad sense

    Historical trends towards multi-core through Moore's law

    Aggressive voltage scaling, minimum-energy computation and limits

    Opportunities to improve energy efficiency/voltage scalability

    Beyond-CMOS ultra-low voltage circuits

    2 prof. Massimo Alioto

    The Framework: Computing Platforms in the Broad Sense

    Computing Platforms: The Big Picture

    Computing/sensing platforms are rapidly expanding*

    networks move towards macro and nano scale

    nano scale (self-powered nodes)

    meso scale (portable/handheld)

    macro scale (data centers)

    macro: cloud computing

    nano: ubiquitous computing/sensing

    * adapted from MuSyC FCRP center

    New concepts: Internet of Things, Central Nervous System for the Earth, collective intelligence

    New applications: advanced water/energy management

    Historical Trends towards Multi-Core


    CMOS Integrated Circuits

    MOS transistor; first transistor: Shockley, Brattain, Bardeen (1947, Bell Labs)

    Integrated Circuit (IC) chip: multiple transistors + interconnects

    Jack Kilby (1958) demonstrated the 1st IC

    [Figure: packaged chip, PCB]

    Gordon Moore's Prediction

    CMOS technology scaling: device dimensions X, Y, Z in the previous generation become 0.7X, 0.7Y, 0.7Z in the next generation

    2X more transistors/chip

    Prediction in 1965 (not a law)

    Moore's law: 1 generation/24 months

    exponential growth in transistor count
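    The arithmetic behind the 0.7X shrink can be checked with a short sketch. The function names and the starting count are illustrative assumptions, not part of the slides; the only inputs taken from the slide are the 0.7 linear shrink and the 24-month generation.

```python
def area_scale(linear_shrink: float = 0.7) -> float:
    """Area of a scaled device relative to the previous generation."""
    return linear_shrink * linear_shrink


def transistor_count(start_count: float, years: float,
                     months_per_generation: float = 24.0) -> float:
    """Transistors per chip after `years`, doubling once per generation."""
    generations = years * 12.0 / months_per_generation
    return start_count * 2.0 ** generations


if __name__ == "__main__":
    # 0.7 x 0.7 is about 0.49, so roughly 2X devices fit in the same area.
    print(f"area per transistor: {area_scale():.2f}x")
    print(f"density gain: {1.0 / area_scale():.2f}x")
    for years in (2, 10, 20):
        print(years, "years:", transistor_count(1.0, years), "x transistors")
```

    With one doubling every 24 months, ten years correspond to five generations, i.e. a 32X increase in transistor count.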


    colossal investments, coordinated by the International Technology Roadmap for Semiconductors (ITRS)

    roadmap covers process, device, circuits: challenges, performance, consumption, capabilities

    As a Result of CMOS Scaling

    CMOS scaling trends for microprocessors (macro scale) before 2005: Moore's law + Dennard scaling (supply voltage scaled down together with dimensions)

    exponential growth in # of transistors and performance


    Power vs. energy efficiency

    $P_{chip} = \text{throughput} \cdot E_{op} + P_{lkg}$

    keep leakage power $P_{lkg}$ small enough (10-15%)

    $E_{op} \propto V^2$

    single-core performance $\propto V$

    reduce $E_{op}$ by using lower $V$

    improve performance

    Multi-Core: Numerical Example

    Post-Dennard scaling: keep V_TH, V_DD constant, so performance becomes power limited

    use area (more transistors available: use them to improve efficiency)

    use silicon for low power density blocks (cache: 10 W/cm^2) that strongly impact total speed, rather than logic (30 W/cm^2)

    [D. Frank, "Power-Constrained CMOS Scaling Limits," IBM J. Res. & Dev., vol. 46, no. 2/3, March/May 2002]

    Example (iso power/technology):

    1 core: V_DD, f, area = 1, throughput = 1

    2 cores: 0.8 V_DD, 0.8 f, area = 2, throughput = 1.6

    4 cores: 0.63 V_DD, 0.63 f, area = 4, throughput = 1.6^2 = 2.5
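    The numbers in this example can be reproduced with a minimal sketch. It rests on assumptions that the slide does not state explicitly: frequency proportional to V_DD, dynamic power per core proportional to V_DD^2 * f (so proportional to V_DD^3), leakage neglected, and a perfectly parallel workload. The function name is hypothetical.

```python
def iso_power_scaling(n_cores: int):
    """Return (v_dd, f, throughput), all normalized to the single-core case.

    Assumed model: f ~ V_DD and power per core ~ V_DD^2 * f ~ V_DD^3.
    Keeping total power equal to one core at nominal voltage gives
    n_cores * v^3 = 1, hence v = n_cores^(-1/3).
    """
    v_dd = n_cores ** (-1.0 / 3.0)
    f = v_dd                      # frequency tracks voltage in this model
    throughput = n_cores * f      # ideal parallel speed-up
    return v_dd, f, throughput


if __name__ == "__main__":
    for n in (1, 2, 4):
        v, f, t = iso_power_scaling(n)
        print(f"{n} core(s): V_DD = {v:.2f}, f = {f:.2f}, throughput = {t:.2f}")
```

    Under these assumptions two cores run at about 0.79 V_DD (rounded to 0.8 on the slide) for a throughput of about 1.6, and four cores run at about 0.63 V_DD for a throughput of about 2.5.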

    Multi-Core Scaling

    Multi-core era will not last long [ISCA2011]: announced catastrophe ("Dark Silicon and the End of Multicore Scaling") due to inadequate energy efficiency

    percentage of unusable dark silicon is growing fast

    new power crisis in 2016 for processors: no reason for scaling

    [ISCA2011] H. Esmaeilzadeh et al., "Dark Silicon and the End of Multicore Scaling," ISCA, June 2011

    A Broader View of Dark Silicon

    At macro scale, "dark" refers to the spatial dimension

    At nano scale (self-powered nodes): inadequate energy efficiency leads to dark silicon along the temporal dimension (intermittently available power)

    [Figure: available energy vs. time t, alternating intervals of "no operation" and "normal operation"]

    At meso scale (portable): dark silicon in both the spatial (power constraint 1-2 W) and temporal dimension (limited lifetime @ given functionality)

    Green IC Group

    Painting silicon green: mission of Green IC group

    www.green-ic.org

    Aggressive Voltage Scaling, Minimum-Energy Computation and Limits

    Voltage Scaling: Dynamic Energy

    If dynamic energy per clock cycle dominates:

    $E_{dyn} = \alpha_{SW} \, C \, V_{DD}^2$

    affected by switching activity $\alpha_{SW}$, capacitance $C$, voltage $V_{DD}$

    reduce V_DD as much as possible

    energy reduction limited by V_DD,min (functional/timing failures)

    [Figure: E_dyn vs. V_DD, down to V_DD,min]

    Voltage Scaling: Leakage Energy

    If leakage (static) energy per cycle dominates:

    $E_{lkg} = V_{DD} \, I_{off} \, T_{CK}$

    affected by supply voltage $V_{DD}$, leakage current $I_{off}$, clock cycle $T_{CK}$ (logic depth * gate delay)

    [Figure: pipeline of n stages; each stage is a register (D, Q, clk) followed by combinational logic comb 1 ... comb n]

    V_DD reduction and trends of the three factors: V_DD linear, I_off constant, T_CK exponentially growing

    E_lkg exponentially increases at low V_DD

    [Figure: E_lkg vs. V_DD, down to V_DD,min]

    Voltage Scaling: Total Energy

    Total energy vs. V_DD: tradeoff between E_dyn and E_lkg, so a minimum-energy point (MEP) exists

    [Figure: E_TOT, E_dyn and E_lkg vs. V_DD, with V_DD,min and V_DD,opt marked]

    MEP determined by optimal balance of E_dyn and E_lkg
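    The tradeoff can be illustrated numerically with a minimal sketch. The model below is an assumption for illustration only: energies are normalized to the gate capacitance, gate delay is taken as C * V_DD / I_on, and the on/off current ratio is modeled with the subthreshold exponential exp(V_DD / (n * v_t)) over the whole sweep, which gives E_lkg = depth * V_DD^2 * exp(-V_DD / (n * v_t)). All parameter values are hypothetical.

```python
import math

ALPHA_SW = 0.1       # assumed switching activity
LOGIC_DEPTH = 20     # assumed gate delays per clock cycle
N_VT = 0.039         # assumed n * thermal voltage, in volts
V_DD_MIN_MV = 100    # assumed functional limit, in millivolts
V_DD_MAX_MV = 1000


def e_dyn(v_dd: float) -> float:
    """Dynamic energy per cycle, normalized to C: alpha_SW * V_DD^2."""
    return ALPHA_SW * v_dd ** 2


def e_lkg(v_dd: float) -> float:
    """Leakage energy per cycle V_DD * I_off * T_CK under the assumed model."""
    return LOGIC_DEPTH * v_dd ** 2 * math.exp(-v_dd / N_VT)


def e_tot(v_dd: float) -> float:
    return e_dyn(v_dd) + e_lkg(v_dd)


def minimum_energy_point() -> float:
    """Sweep V_DD in 1 mV steps from V_DD,min upward and return V_DD,opt."""
    candidates = [mv / 1000.0 for mv in range(V_DD_MIN_MV, V_DD_MAX_MV + 1)]
    return min(candidates, key=e_tot)


if __name__ == "__main__":
    v_opt = minimum_energy_point()
    print(f"V_DD,opt = {v_opt:.3f} V, E_TOT = {e_tot(v_opt):.4f} (normalized)")
```

    With these hypothetical parameters the sweep finds V_DD,opt well below the nominal supply: above it E_dyn dominates, below it the exponentially longer clock cycle makes E_lkg dominate.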


    Importance of Voltage Scaling: Broader View

    Minimum-energy operation for better (10X) energy efficiency + circuit/architectural/SW integration

    permits performance increase at macro scale

    reduces battery size and extends lifetime at meso/nano scale

    Voltage scaling is powerful: it was intrinsic in Dennard scaling

    Post-Dennard scaling: aggressive voltage scaling, do it by yourself

    as much as possible (give up something), variable workload

    deal with related issues

    Ultra-Low Voltage (ULV) Operation: Limits and Challenges

    Energy reduction comes at a price: performance, leakage energy, resiliency

    [Figure: energy, E_lkg, performance, failure rate and yield vs. V_DD]

    Limits and Challenges

    Resiliency degraded at ULV: process/voltage/temperature variations

    5-10X more process variations (delay: easily 2X variations) [M. Alioto (TCAS-I 2012)]

    5X higher sensitivity to V_DD

    design margining: nominal delay plus margins for process, voltage and temperature [R. Krishnamurthy (Micro 2012)]

    at near threshold, easily 2X margin (in speed binning, many discarded)

    margin impacts performance/energy efficiency

    Limits and Challenges

    Aging (depends on history, workload, voltage, temperature)

    Soft errors: higher failure rate at ULV

    Degraded functionality at ULV

    degraded I_on/I_off (incomplete switching)

    MEM arrays: much less scalability (0.6-0.7 V)

    V_DD,min increase due to variations

    V_DD,min increase due to intrinsic NMOS/PMOS imbalance

    V_DD,min increase due to residual PUN/PDN imbalance

    [M. Alioto (TCAS-I 2012)]

    Opportunities to Improve Energy Efficiency/Voltage Scalability

    Near-Threshold ICs

    Parallelism compensates speed loss, enhanced by 3D chip stacking

    near-threshold computing very promising: efficiency enables data center scalability [D. Blaauw/D. Sylvester]

    can enable exascale computing by 2020 (Shekhar Borkar, Intel)

    Near-threshold computers will be different

    logic/MEM: different scaling (MEM becomes faster)

    fewer cache levels, bigger cache

    better logic/MEM coupling through 3D integration

    More efficient and scalable microarchitectures

    deep pipeline: lower leakage energy, $E_{lkg} = V_{DD} \, I_{off} \, T_{CK}$

    [Figure: pipeline of n stages; each stage is a register (D, Q, clk) followed by combinational logic comb 1 ... comb n]

    ultra-low power = high speed: only 17 FO4/stage in 1,024-point complex FFT, 4X lower energy (D. Blaauw, D. Sylvester, ISSCC 2011)

    Finer-grain voltage domains

    currently: cores share same voltage, different frequency

    $P_{chip} = \text{throughput} \cdot E_{op} + P_{lkg}$

    slower cores might operate at lower voltage (lower E_op, P_lkg), but this is not possible (they share the same voltage)

    multiple on-chip regulators in sight

    [Figure: cores at different frequencies, each with its own on-chip regulator]

    can exploit workload reduction to further reduce E_op and P_lkg
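    A rough sketch of why per-core regulators help, under assumptions that are not in the slides: voltage proportional to the required frequency, dynamic power proportional to V^2 * f, and leakage ignored. The function names and the example frequencies are hypothetical.

```python
def dynamic_power(v: float, f: float) -> float:
    """Normalized dynamic power of one core: V^2 * f."""
    return v * v * f


def shared_rail_power(freqs):
    """All cores share one supply, set by the fastest core (assumed V ~ f)."""
    v_shared = max(freqs)
    return sum(dynamic_power(v_shared, f) for f in freqs)


def per_core_rail_power(freqs):
    """Each core has its own regulator and runs at the voltage its f needs."""
    return sum(dynamic_power(f, f) for f in freqs)


if __name__ == "__main__":
    freqs = [1.0, 0.5]  # one core at full speed, one at half speed
    print("shared rail:  ", shared_rail_power(freqs))
    print("per-core rail:", per_core_rail_power(freqs))
```

    In this example the slower core drops from 0.5 to 0.125 units of power once it can lower its own voltage, so the total goes from 1.5 to 1.125.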


    Enhance Energy Efficiency: Heterogeneity

    Exploit heterogeneity (different scaling at ULV)

    area is commodity: give up flexibility for better efficiency [R. Krishnamurthy (Micro 2012)]

    HW accelerators (media, image, crypto, radio, ...)

    same function in different IPs

    more extreme: use different replicas with different variations (module 1, module 2)

    better energy efficiency, more testing

    [Figure: delay ranges over which module 1 or module 2 is used]

    Enhance Energy Efficiency

    Limit communication energy: exploit locality at different scales

    limit off-chip communication (2-10X intra-chip)

    limit intra-chip communication (1-10X computation) [B. Dally (CICC 2012)]

    restrict data structure and flow (SIMD)

    better flip-flops (post-silicon tuning)

    better clock domain design

    clock slope optimization: 35% better energy efficiency [Alioto TCAS-I 2010], [M. Alioto (ISSCC 2012)]

    Margin Elimination: Design vs. Testing Time

    Uncertainty margin at design time is too expensive: post-silicon (self-)tuning absolutely needed

    eliminate margin: optimally allocate cost/design effort at design / testing / boot / run time

    decisions related to design time: increase design margin, improve understanding/modeling, more robust design...

    decisions related to testing / boot / run time (post-silicon): tune at testing time, adapt at boot/run time

    growing complexity and uncertainty

    circuit people, architects and testing people need to play in the same field

    Margin Elimination: Timing Error Detection

    Reduce/eliminate worst-case margin by catching delay faults

    correct at run-time, tune to compensate actual variations

    run-time testing improves energy efficiency

    in-situ monitoring: no margin; invasive, limited tuning

    fault prediction (Tunable Replica Circuit): needs some margin (false positives, mimics only critical path); little invasive, tuning required, low overhead

    Margin Elimination: Timing Error Detection

    Timing monitoring: some circuit approaches

    double sampling: Razor (UMich), DSTB (Intel)

    transition detection: Razor II (UMich), TDTB (Intel)

    error signals go to architecture through OR tree

    hold-time/detection window (TD)

    metastability in data path (Razor) or error path (others)
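    The double-sampling idea can be sketched behaviorally: the main flip-flop samples at the clock edge, a shadow element samples again after a detection window, and a mismatch flags a timing error. This is a simplified illustration with hypothetical names and normalized times, not a model of the actual Razor or DSTB circuits (metastability and hold-time constraints are ignored).

```python
from dataclasses import dataclass


@dataclass
class DoubleSamplingFF:
    clock_period: float      # time of the main sampling edge
    detection_window: float  # extra delay before the shadow sample

    def capture(self, old_value: int, new_value: int, arrival_time: float):
        """Return (main_sample, shadow_sample, error_flag).

        Data switches from old_value to new_value at arrival_time.
        A sample taken before the data arrives still sees old_value.
        """
        main = new_value if arrival_time <= self.clock_period else old_value
        shadow_edge = self.clock_period + self.detection_window
        shadow = new_value if arrival_time <= shadow_edge else old_value
        return main, shadow, main != shadow


if __name__ == "__main__":
    ff = DoubleSamplingFF(clock_period=1.0, detection_window=0.2)
    print(ff.capture(0, 1, arrival_time=0.9))   # on time: no error
    print(ff.capture(0, 1, arrival_time=1.1))   # late, inside window: flagged
    print(ff.capture(0, 1, arrival_time=1.5))   # beyond window: undetected
```

    The last case shows why the detection window must cover the worst expected lateness: a transition arriving after the shadow sample is missed by both elements.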


    Margin Elimination: Error Correction

    Faults can be corrected at various levels

    faster correction (slowest to fastest): SW, Architecture, Microarchitecture, Circuit

    less HW resources (most to least): Circuit, Microarchitecture, Architecture, SW

    lower energy/performance penalty (highest to lowest): SW, Architecture, Microarchitecture, Circuit

    [Figure: energy per operation E_op, throughput/IPC and error rate vs. V_DD. From the margined V_DD of a traditional design down to the Point of First Failure (PoFF): energy reduction through margin elimination. Below PoFF: further energy reduction, but the error rate increases, throughput degrades due to the increased error rate, and the energy overhead of correction grows. The minimum energy under error detection/correction is reached at the energy-optimum V_DD.]

    Margin Elimination: Error Correction

    Existing approaches

    circuit: clock gating (UMich), clock stretching (Georgia Tech); error propagation within a clock cycle (very hard)

    microarchitecture: counterflow pipelining (UMich), micro-rollback (UMich), Bubble Razor (UMich); interferes with microarchitecture/cycle-based timing

    architecture: instruction re-execution (Intel), simple, large penalty; checkpoint-restart (Wisc), simple, very large penalty

    The Next Step: Sub-Cycle Detection/Correction

    Existing approaches are cycle-based (from J. Crop et al., JLPEA, 2011)

    correction interferes with microarchitecture (design effort)

    errors affect timing at boundary: difficult SoC integration

    large energy penalty in high error rate regime (future)

    Our vision: sub-cycle detection/correction

    errors detected/corrected in the same cycle or, at least, errors do not have to propagate to the boundary

    so that errors are confined and determine low energy penalty

    Approximate Computing as Extreme Scaling

    Some apps do not need to have perfect computation: aggressively push voltage and tolerate errors

    approximate computing (voltage overscaling by N. Shanbhag, K. Roy)

    ex.: multimedia (occasionally wrong pixels/samples)

    errors not corrected on the fly

    rather, avg error rate kept within bound (slow correction loop)

    degradation of signal quality can be dynamically adjusted (application level)
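    A slow correction loop of this kind can be sketched as a simple controller that nudges the supply voltage once per epoch. Everything here is an illustrative assumption: the exponential error-rate model stands in for a measured average, and the step size, bounds and hysteresis factor are hypothetical.

```python
import math


def modeled_error_rate(v_dd_mv: int) -> float:
    """Hypothetical model: errors grow exponentially as V_DD drops."""
    return min(1.0, math.exp(-(v_dd_mv - 500) / 20.0))


def slow_correction_loop(v_start_mv: int, error_bound: float,
                         step_mv: int = 10, epochs: int = 100,
                         v_min_mv: int = 400, v_max_mv: int = 1000) -> int:
    """Adjust V_DD once per epoch so the average error rate stays in bound.

    Individual errors are never corrected; only the average rate is steered.
    """
    v = v_start_mv
    for _ in range(epochs):
        rate = modeled_error_rate(v)  # stands in for a measured average
        if rate > error_bound:
            v = min(v_max_mv, v + step_mv)   # too many errors: back off
        elif rate < error_bound / 2.0:
            v = max(v_min_mv, v - step_mv)   # comfortable slack: push lower
    return v


if __name__ == "__main__":
    v_final = slow_correction_loop(v_start_mv=1000, error_bound=1e-3)
    print(v_final, "mV, error rate", modeled_error_rate(v_final))
```

    The factor-of-two hysteresis band is a design choice that keeps the loop from oscillating around the bound: in this model one downward step raises the error rate by about 1.65X, so a step taken below half the bound cannot overshoot it.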


    Our Approach: User Experience-Centric Design

    Voltage/energy reduction in portable multimedia for a given quality of user experience

    tight link between circuit and final user: errors are acceptable

    metrics for quality of user experience (PSNR): close circuit design loop at application level

    minimize energy for given quality

    energy scalability: reduce energy if lower quality is accepted (dynamic scaling)

    [Figure: the same image at PSNR = 24 dB and at PSNR = 36 dB]
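    For reference, PSNR for 8-bit samples is 10 * log10(255^2 / MSE). The sketch below is a minimal pure-Python version operating on flat lists of pixel values; the function name and the list-based interface are assumptions made for illustration.

```python
import math


def psnr(reference, degraded, peak: int = 255) -> float:
    """Peak signal-to-noise ratio in dB between two equal-length pixel lists."""
    if len(reference) != len(degraded) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((r - d) ** 2 for r, d in zip(reference, degraded)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(peak ** 2 / mse)


if __name__ == "__main__":
    original = [100, 120, 140, 160]
    noisy = [110, 110, 150, 150]   # every sample off by 10
    print(f"{psnr(original, noisy):.2f} dB")
```

    A uniform error of 10 grey levels gives an MSE of 100 and therefore a PSNR of about 28 dB, which lies between the 24 dB and 36 dB cases shown on the slide.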


    Limits of recent work on energy scalability (SRAM)

    [Wolf2009], [Kurdahi2008]: aggressive V_DD scaling to reduce energy at the cost of higher BER

    very limited voltage/energy scalability: BER depends exponentially on V_DD and abruptly increases

    [Figure: BER (or PSNR) and energy vs. V_DD, with the targeted quality marked]

    same limitation in mixed 6T/8T SRAM [Roy 2011]

    near threshold, 6T array almost always fails, 8T almost never fails

    not really scalable either

    Our approach

    errors have different impact depending on where they occur

    optimal energy allocation: protect (= spend energy) only important bits to have graceful degradation (various knobs)

    when limiting precision, use unused bits to improve resiliency

    can push more on V_DD to reduce energy at same quality

    currently, chip under test
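    Why protecting only the important bits pays off can be seen from a first-order estimate: flipping bit b of a pixel changes its value by 2^b, so the squared error grows as 4^b. The sketch below assumes independent bit errors with small probabilities (cross terms between simultaneous flips are neglected); the function names and the bit error rates are hypothetical.

```python
def flip_bit(pixel: int, bit: int) -> int:
    """Return the pixel value with one bit inverted."""
    return pixel ^ (1 << bit)


def squared_error_of_flip(bit: int) -> int:
    """A flip of bit b changes the value by 2^b, i.e. squared error 4^b."""
    return (1 << bit) ** 2


def expected_mse(bit_error_rates) -> float:
    """First-order expected MSE; index i is the BER of bit i (0 = LSB)."""
    return sum(ber * squared_error_of_flip(bit)
               for bit, ber in enumerate(bit_error_rates))


if __name__ == "__main__":
    uniform = [1e-3] * 8                  # all 8 bits equally unreliable
    protected = [1e-3] * 4 + [0.0] * 4    # 4 MSBs protected, 4 LSBs not
    print("uniform BER:   MSE =", expected_mse(uniform))
    print("MSBs protected: MSE =", expected_mse(protected))
```

    In this example protecting the four most significant bits cuts the expected MSE from about 21.8 to about 0.085, while the four least significant bits can still run at the higher error rate.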


    Results in 28-nm 32-kb SRAM, YUV format (QCIF 144x176), Akiyo video, frame #30

    PSNR w.r.t. voltage scaling: 41% better PSNR (dB) at same energy

    [Figure: three versions of the frame, labeled A, B and Original]

    Other Opportunities

    Enable burst very high-speed computation: just violate reliability constraint

    temporarily exceed Thermal Design Power

    leverage thermal capacitance for DVFS: Turbo Boost 2.0 [Intel, Rotem et al., HOTCHIPS 2011]

    enhance thermal capacitance via phase change materials: Computational Sprinting [Raghavan, HPCA 2012]

    Our Vision of Distributed Power Management

    Globally green systems: energy-efficient, widely energy-scalable and externally tunable components

    need for communication (energy state, knob tuning)

    global policies based on information on energy state

    [Figure: a MODULE (inputs, REG, processing, outputs) attached to two channels. TRADITIONAL COMMUNICATION CHANNEL, ex.: bus, NoC, crossbar... ENERGY MANAGEMENT CHANNEL, added to enable energy scalability and dynamic tradeoff with other assets. Over the energy management channel the module receives instantaneous requirements (ex.: throughput, arithmetic precision...) and reports energy-related parameters from its sensors (ex.: timing slack, bit error rate...). Internally, knobs self-adjust the settings to minimize energy.]

    Our Vision of Distributed Power Management

    keep it simple (integration), yet maintain global view: hierarchical structure

    [Figure: hierarchy of modules, with the higher level in the hierarchy on top]

    enables remote power management (global view and intelligence kept out of nano-scale nodes)

    move computation where more efficient (computation vs. communication, locality, heterogeneity)

    Beyond-CMOS Ultra-Low Voltage Circuits


    Tunnel-FETs: a Very Promising Alternative

    Main limit to voltage scaling of CMOS transistor

    V_TH can be reduced only if subthreshold slope is lowered at given leakage

    use new devices with lower subthreshold slope

    Tunnel FETs: very promising (ITRS: after 2020)

    Physical structure: p+ / i / n+

    [Figure: device cross-section with metal gate]
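    The link between subthreshold slope and voltage scaling can be made concrete with a small calculation. For a conventional MOS transistor the slope is bounded by ln(10) * kT/q, about 60 mV/decade at room temperature; the gate swing needed to change the current by a given number of decades is slope * decades. The function names, the 4-decade target and the 30 mV/decade device are hypothetical illustration values, not figures from the slides.

```python
import math

K_BOLTZMANN = 1.380649e-23        # J/K
Q_ELECTRON = 1.602176634e-19      # C


def thermal_limit_mv_per_decade(temp_k: float = 300.0,
                                body_factor: float = 1.0) -> float:
    """Minimum MOSFET subthreshold slope: ln(10) * n * kT/q, in mV/decade."""
    return 1000.0 * math.log(10.0) * body_factor * K_BOLTZMANN * temp_k / Q_ELECTRON


def gate_swing_mv(slope_mv_per_decade: float, decades: float) -> float:
    """Voltage swing needed to change the current by `decades` decades."""
    return slope_mv_per_decade * decades


if __name__ == "__main__":
    limit = thermal_limit_mv_per_decade()
    print(f"MOSFET limit at 300 K: {limit:.1f} mV/decade")
    print("swing for 4 decades at the limit:", round(gate_swing_mv(limit, 4)), "mV")
    print("swing for 4 decades at 30 mV/decade:", gate_swing_mv(30.0, 4), "mV")
```

    A device with a steeper (lower) slope reaches the same on/off ratio with a smaller swing, which is what allows V_TH and V_DD to be lowered at a given leakage.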


    Tunnel-FETs: Robustness Comparison

    Comparison with CMOS bulk (FinFET) / SOI

    fair: all optimized for ULV, same targets (leakage)

    Noise margin degradation at ULV: linear

    Tunnel-FETs: Energy Comparison

    FO4 inverter chain (10% activity, 16 slices)

    min. energy vs. logic depth

    max. TFET advantage w.r.t. SOI: 35% @ 60 FO4

    Tunnel-FETs: Energy Comparison

    Impact of transistor stacking

    at ULV, leakage reduction in 2-4 stacked TFETs is 5-8X better than SOI, 3-6X better than bulk

    at ULV, I on reduction in 2-4 stacked TFETs is u

    TFET cells with larger fan-in provide more benefits: faster, lower leakage, lower min. energy

    TFET standard cell libraries must include higher fan-in cells

    Example: zero-detector with 4-input gates, min. energy improved by 1.79X (1.84X) w.r.t. SOI (bulk)

    Tunnel-FETs: SRAM cell

    System voltage scalability limited by SRAM cell

    small margins, sensitive to variations

    8T cell

    about same area (33 X 13.4 F^2)

    TFET SNM scales better

    [Figure: SNM vs. V_DD (mV) for TFET, SOI and bulk; annotations: V_DD > 140 mV, 30% V_DD, 35% V_DD, 30% V_DD]

    Conclusions

    Future computing platforms (macro, meso, nano)

    Green: energy efficiency is key in any component

    Ultra-low voltage is really challenging: speed, leakage, resiliency (design margin)

    Opportunities to overcome challenges:

    margin reduction

    heterogeneity

    fine-grain/independent power domains

    coordinated architecture/circuit design

    use better devices

    Speaker's Contacts

    Massimo Alioto, Ph.D.

    E-mail: [email protected]

    ECE Department, National University of Singapore (NUS)

    4 Engineering Drive 3, Singapore 117576