
memory


DESCRIPTION

SRAM memory design


  • Lecture Contents

    - Introduction
    - SRAM Overview
    - Novel SRAM Bitcell
    - Test Chip Architecture
    - Summary

  • Memory Classification

    Memory is classified by four major categories: volatility, access speed, capacity, and cost.

  • Motivation & Goal

    The minimum energy point of digital circuits is reached at subthreshold supply voltages (Vdd < Vt). Low-voltage operation of SRAM memories in the subthreshold region offers substantial power and energy savings at the cost of speed. This project focuses on the design and implementation of a novel SRAM bitcell for use in the subthreshold region.

  • SRAM OVERVIEW

  • Overview

    - Bistability Principle (Q, QB)
    - Differential Read (Sense Amp)
    - Fast Access Speeds (read, write)
    - Differential Write
    - Large Noise Margins
    - Large Area (6 transistors)
    - Prechargeable Bitlines
    - Power Consuming

  • Bistability Butterfly Curve

    - Positive feedback creates two stable points, 1 and 0.
    - The regenerative property ensures that a noisy cell converges back to its nominal values.
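    The regenerative behavior described on this slide can be illustrated with a toy model of two cross-coupled inverters. This is a minimal sketch, not a device-level simulation: the tanh transfer curve, the 1.0 V supply, and the gain value are assumptions chosen only to make the loop bistable.

```python
import math
from typing import Tuple

VDD = 1.0    # assumed supply voltage (V), illustrative only
GAIN = 5.0   # assumed inverter steepness; loop gain above 1 gives bistability


def inverter(vin: float) -> float:
    """Idealized inverter transfer curve (smooth tanh model, not a transistor model)."""
    return 0.5 * VDD * (1.0 - math.tanh(GAIN * (vin - 0.5 * VDD)))


def settle(q0: float, steps: int = 50) -> Tuple[float, float]:
    """Iterate the cross-coupled loop starting from a noisy Q value."""
    q = q0
    qb = inverter(q)
    for _ in range(steps):
        qb = inverter(q)   # first inverter drives QB from Q
        q = inverter(qb)   # second inverter drives Q from QB
    return q, qb


if __name__ == "__main__":
    for start in (0.8, 0.6, 0.4, 0.3):
        q, qb = settle(start)
        print(f"Q starts at {start:.1f} V -> Q = {q:.3f} V, QB = {qb:.3f} V")
```

    Starting points above VDD/2 are pulled toward the "1" state and starting points below it toward the "0" state, which is the convergence the butterfly curve depicts.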

  • SRAM Read Access

    - Bitlines (BL, BLB) are precharged to VDD.
    - The wordline signal (WL) is asserted.
    - One of the bitlines is pulled down toward GND.
    - The differential signal (BL - BLB) is amplified to accelerate the process.

    M1 > M5 constraint!

  • SRAM Write Access

    - Bitlines are precharged to complementary values.
    - The wordline signal (WL) is asserted.
    - Q is pulled down to GND while QB is driven to VDD.

    M6 > M4 constraint!
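    The read and write sequences can be summarized as a logic-level sketch. The class and method names below are hypothetical, and the model assumes that the sizing constraints hold (the read does not disturb the cell and the write always succeeds); it captures the order of events, not the analog behavior.

```python
from dataclasses import dataclass


@dataclass
class SixTCell:
    """Logic-level 6T cell with complementary data nodes Q and QB."""
    q: int = 0
    qb: int = 1

    def write(self, bit: int) -> None:
        # Bitlines are driven to complementary values.
        bl, blb = bit, 1 - bit
        # WL is asserted: the access transistors overpower the cell
        # (assumes the write constraint holds), so the nodes follow the bitlines.
        self.q, self.qb = bl, blb

    def read(self) -> int:
        # Both bitlines are precharged to VDD (logic 1).
        bl, blb = 1, 1
        # WL is asserted: the bitline on the side storing 0 is pulled down.
        if self.q == 0:
            bl = 0
        if self.qb == 0:
            blb = 0
        # A sense amplifier resolves the sign of the differential signal.
        return 1 if bl > blb else 0


if __name__ == "__main__":
    cell = SixTCell()
    cell.write(1)
    print("stored 1, read back", cell.read())
    cell.write(0)
    print("stored 0, read back", cell.read())
```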

  • SRAM Subthreshold Challenges

    - In general, ratioed digital circuits are more likely to fail at subthreshold voltages.
    - 6T bitcells cannot operate below 600mV to 700mV.
    - Read SNM problem: degraded read noise margins decrease bitcell stability.
    - Write fails under 600mV due to the relative increase of the pMOS drive strength in subthreshold.
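    One way to see why ratioed circuits become fragile is the standard first-order subthreshold current model, in which drain current depends exponentially on the threshold voltage. The sketch below uses that textbook model; the slope factor, the reference current, and the example voltages are assumptions for illustration and are not taken from the slides.

```python
import math

V_T = 0.02585   # thermal voltage kT/q at about 300 K (V)
N_SLOPE = 1.5   # assumed subthreshold slope factor
I0 = 1e-7       # assumed current at Vgs = Vth (A), illustrative only


def subthreshold_current(vgs: float, vds: float, vth: float) -> float:
    """First-order subthreshold drain current (DIBL and body effect ignored)."""
    gate_term = math.exp((vgs - vth) / (N_SLOPE * V_T))
    drain_term = 1.0 - math.exp(-vds / V_T)
    return I0 * gate_term * drain_term


def current_ratio_for_vth_shift(delta_vth: float) -> float:
    """Factor by which the current changes when Vth shifts by delta_vth volts."""
    return math.exp(delta_vth / (N_SLOPE * V_T))


if __name__ == "__main__":
    nominal = subthreshold_current(vgs=0.3, vds=0.3, vth=0.40)
    shifted = subthreshold_current(vgs=0.3, vds=0.3, vth=0.43)
    print(f"current ratio for a 30 mV Vth shift: {nominal / shifted:.2f}")
```

    Under these assumptions a 30 mV threshold shift changes the current by roughly a factor of two, while transistor width only scales current linearly. That gap is why a sizing ratio that is safe at nominal voltage may no longer guarantee which device wins in subthreshold.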

  • NOVEL 9T SRAM BITCELL

  • The Research Work

    Numerous novel low-power SRAM memories have been proposed in recent years. We studied and analyzed many of the important proposals, including 6T, 7T, 8T, 9T, and 10T bitcells, Virtual VDD, Virtual GND, DCVSL, Voltage Boost, Read Buffer, Read Assist, and more.

  • Brainstorming

  • Major Achievements

    Two innovative 9T SRAM bitcells, named PSRAM and SFSRAM, aimed at eliminating static power consumption and operating in the subthreshold region, were fully designed and analyzed. Three types of 8-kb 40 nm SRAM test chips, nicknamed RAMBO, were designed for operation at 600mV and below.

    We are the first academic research team in Israel to fully design and fabricate a state-of-the-art 40nm CMOS silicon chip.

  • Chip Design Workflow

  • Standard 8T Schematic and Layout

  • Pseudo SRAM (PSRAM)

    - Pseudo-static behavior: a novel bitcell mechanism disposes of both data node charges while holding a logical 1.
    - Leakage current is practically eliminated during this low-power standby mode.
    - Up to 3.75X less static power consumption than a standard 8T cell at 0.9V.

  • PSRAM Write 1 Operation

    - WBL is driven to 1 and WBLB to 0.
    - Write wordline (enable) is asserted.
    - Q is driven to 1 and QB to 0.
    - Q is discharged during standby.
    - CLK synchronizes write access.

  • PSRAM Power Reduction

  • SFSRAM (Supply Feedback SRAM)

    - Enables subthreshold write with a Virtual-VDD technique that weakens the supply (VDD) during write operations.
    - A new approach to the design of the Virtual-VDD scheme reduces periphery and thus reduces write power.
    - Operates at ultra-low voltages, down to 200mV.

  • Standard 8T Revisited

  • SFSRAM Power Reduction

  • 40NM TEST CHIP

  • Chip Architecture

    - 8-kb Array
    - Read-Bitline division
    - Level Shifters
    - Row Decoder
    - Sense-Amps
    - Precharge Units
    - Write Drivers
    - BIST

    [Figure residue: bitcell schematics, write driver and level shifter schematics, and chip block diagrams. Recoverable labels:]

    - Bitcell nodes and lines: Q, QB, VVDD, NFB, WWL, RWL, WBL, WBLB, RBL
    - Supply domains: DVDD, SVDD, DVSS, GND; voltage labels 1.1V and 0.6V
    - 8-bit diagram: SRAM Array (128 Rows X 8 Columns), Row Decoder (7 to 128), Level Shifting Wordline Drivers x128, 8 x Precharge + Write Driver Units, 8 x Sensing Unit + Level Shifter, BIST
    - 32-bit diagram: SRAM Array 64x32 in Block 0 to Block 3, Row Decoder (8:256), Level Shifting Wordline Drivers x256, 32 x Precharge + Write Driver Units, 32 x Sensing Unit + Level Shifter, BIST, IO Ring
    - Control signals: CLK, CS, WR, RD, ADD, DIN, DOUT, PC, PASS_FAIL, BYPASS_BIST
    - Schematic note: "Will be changed to NWELL for pmos layout"
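    The 32-bit block diagram labels (SRAM Array 64x32, Block 0 to Block 3, Row Decoder 8:256) and the speaker note "Total of 8-kb in 4 blocks, no column mux/decoder" imply the organization sketched below. The capacity arithmetic follows directly from those labels. The split of the address into block and row bits is an assumption made for illustration; the slides do not state which address bits select the block.

```python
BLOCKS = 4            # Block 0 to Block 3
ROWS_PER_BLOCK = 64   # WWL[63:0] / RWL[63:0] per block
COLUMNS = 32          # 32-bit words, no column mux
ADDRESS_BITS = 8      # 8:256 row decoder

TOTAL_BITS = BLOCKS * ROWS_PER_BLOCK * COLUMNS


def decode(address: int):
    """Split an 8-bit address into (block, row). The bit assignment is assumed."""
    if not 0 <= address < (1 << ADDRESS_BITS):
        raise ValueError("address out of range")
    block = address >> 6   # assumed: upper 2 bits select the block
    row = address & 0x3F   # assumed: lower 6 bits select the row inside the block
    return block, row


def wordline_one_hot(address: int) -> int:
    """256-bit one-hot wordline vector, as an 8:256 decoder would produce."""
    decode(address)        # reuse the range check
    return 1 << address


if __name__ == "__main__":
    print("total bits:", TOTAL_BITS)
    print("address 200 ->", decode(200))
```

    With no column mux, every access reads or writes a full 32-bit row, so the row decoder output alone identifies the word.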

  • 40nm Test Chip - Periphery

  • Test Chip Top Level Layout

    [Figure: top-level layout with dimension labels 1.40 um and 2.90 um]

  • Chip Timing Diagrams

    SRAM access is synchronized by a clock. Bitline precharge, write driving, and digital logic execute during the high phase, and read/write take place during the low phase.

    [Timing diagram signals: CLK, ADDR, WR_DATA, WR, RD, Selected WL, Selected WWL, Selected RWL, Selected WBL, Selected WBLB, Selected RBL, RD_DATA]
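    The two-phase clocking scheme can be sketched as a small behavioral model. This is a hypothetical illustration: the class and method names are invented, and the model only captures which work is assigned to each clock phase, not the analog timing.

```python
class ClockedSram:
    """Behavioral model: setup work in the high phase, array access in the low phase."""

    def __init__(self, rows: int = 256):
        self.mem = [0] * rows
        self.pending = None   # request latched during the high phase
        self.rd_data = None   # value captured by the most recent read

    def high_phase(self, addr: int, wr: bool = False, rd: bool = False,
                   wr_data: int = 0) -> None:
        # Digital logic latches the request, bitlines are precharged,
        # and the write drivers set up the write bitlines.
        self.pending = (addr, wr, rd, wr_data)

    def low_phase(self) -> None:
        # The selected wordline is asserted and the access takes place.
        if self.pending is None:
            return
        addr, wr, rd, wr_data = self.pending
        if wr:
            self.mem[addr] = wr_data
        elif rd:
            self.rd_data = self.mem[addr]
        self.pending = None


if __name__ == "__main__":
    sram = ClockedSram()
    sram.high_phase(3, wr=True, wr_data=0xA5)
    sram.low_phase()
    sram.high_phase(3, rd=True)
    sram.low_phase()
    print(hex(sram.rd_data))
```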

  • SUMMARY

  • Summary

    A fully functional 8-kb array was designed and laid out in the TSMC 40nm LP process. The SFSRAM memory operates successfully at subthreshold voltages with no additional periphery required.

    Additional power savings can be achieved in the PSRAM with a majority bit algorithm.
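    The slides do not spell out the majority bit algorithm. One plausible reading, sketched below as an assumption, is a word-level inversion scheme: since the PSRAM standby mode is described for cells holding a logical 1, a word with mostly 0 bits is stored inverted together with a one-bit flag. The function names and the 32-bit width are illustrative choices.

```python
WIDTH = 32  # assumed word width in bits


def encode(word: int, width: int = WIDTH):
    """Return (stored_word, flag) so that at least half of the stored bits are 1."""
    mask = (1 << width) - 1
    word &= mask
    ones = bin(word).count("1")
    if ones * 2 < width:           # majority of bits are 0: store the inverse
        return (~word) & mask, 1
    return word, 0


def decode(stored: int, flag: int, width: int = WIDTH) -> int:
    """Recover the original word from the stored word and its inversion flag."""
    mask = (1 << width) - 1
    return (~stored) & mask if flag else stored & mask


if __name__ == "__main__":
    stored, flag = encode(0x0000000F)
    print(hex(stored), flag, hex(decode(stored, flag)))
```

    The cost of such a scheme is one extra flag bit per word plus inversion logic on the read and write paths.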

  • Summary Continued

    PSRAM consumes up to 3.75X less static power than a standard 8T cell.

    We are the first academic research team in Israel to fully design and fabricate a state-of-the-art 40nm chip.

  • Questions?

    [Image captions: Digital Chip, Chocolate Chip]

    Lidor: Introduction and PSRAM

    Omer: SFSRAM, Architecture and summary

    Print note cards with the VLSI center symbol on the back.

    How do we describe a layout to someone with no knowledge of VLSI design?

    Review the asynchronous and synchronous SRAM access methods.

    Three main approaches for lowering power in SRAM arrays:

    - Bitcell modification (6T, 8T, 10T)
    - Additional periphery (VVDD as an example)
    - Algorithm (majority bit algorithm)

    The bitcell is basically back-to-back inverters that store the cell state (Q, QB) and two access transistors for read/write access.

    Introduce listeners to all SRAM terms: precharge, data nodes, bitlines, wordlines, sense amp, access transistors, differential write, differential read.

    Reading a 1 from the cell.

    Differential write.

    Generally, ratioed circuits are more sensitive at subthreshold voltages.

    Write fails under 600mV because of the inability of the Write-Driver and nMOS access transistor to win the ratioed fight against the pMOS inside the cell.

    Read fails due to the cell-invasive read.

    The three types of test chips include an 8T array, an SFSRAM array, and a PSRAM array.

    Add a picture of RAMBO.

    GDS files were sent successfully to TSMC in Taiwan two weeks ago.

    A one-time opportunity to design a chip through Zoran.

    Chip design workflow:

    - Memory cell schematic SPICE functionality simulations (read, write, hold)
    - MC performance tests (speed, power)
    - MC optimizations
    - MC layout
    - Test chip architecture design
    - 8-kb array layout
    - Comprehensive post-layout simulations
    - Tapeout (GDS files sent to TSMC in Taiwan)

    Total of 8-kb in 4 blocks.

    No column mux/decoder.


    Describe the subthreshold write problem.

    Single-ended read. Add a glow effect to the read buffer.

    Differential write.

    Reduces static power consumption at 1 and at 0.

    Enables subthreshold write (?)

    Animate this slide.

    Both data nodes are discharged during standby.

    Timing diagrams on post-layout.

    Describe the subthreshold write problem: pMOS and nMOS have the same strength.

    Describe the subthreshold read problem.


    Describe the two power domains in the chip: DVDD, SVDD.

    Measure power accurately.

    Verilog digital logic.

    Read bitline dividing scheme for faster read access.

    What kind of pass/fail tests does the BIST do?

    We are also the first VLSI project team to fully design and tape out a chip.