EEL 5708 Memory technology Lotzi Bölöni Fall 2003


Page 1: lecture_18_memory.ppt

EEL 5708

Memory technology

Lotzi Bölöni

Fall 2003

Page 2: lecture_18_memory.ppt

Acknowledgements

• All the lecture slides were adapted from the slides of David Patterson (1998, 2001) and David E. Culler (2001), Copyright 1998-2002, University of California, Berkeley

Page 3: lecture_18_memory.ppt

Standing on shoulders of giants

“Ideally one would desire an indefinitely large memory capacity such that any particular… word would be immediately available… We are… forced to recognize the possibility of constructing a hierarchy of memories, each of which has a greater capacity than the preceding but which is less quickly accessible.”

A. W. Burks, H. H. Goldstine and J. von Neumann

Preliminary Discussion of the Logical Design of an Electronic Computing Instrument (1946)

Page 4: lecture_18_memory.ppt

Elements of Memory Organization

• The technologies (SRAM, DRAM, etc.)
• The components
– Cache (L1, L2)
– Main memory
– Virtual memory

Page 5: lecture_18_memory.ppt

Main Memory Background

• Random Access Memory (vs. Serial Access Memory)
• Different flavors at different levels
– Physical Makeup (CMOS, DRAM)
– Low Level Architectures (FPM, EDO, BEDO, SDRAM)
• Cache uses SRAM: Static Random Access Memory
– No refresh (6 transistors/bit vs. 1 transistor/bit for DRAM)
– Size: DRAM/SRAM 4-8, Cost/Cycle time: SRAM/DRAM 8-16
• Main Memory is DRAM: Dynamic Random Access Memory
– Dynamic since it needs to be refreshed periodically (8 ms, 1% time)
– Addresses divided into 2 halves (Memory as a 2D matrix):
» RAS or Row Address Strobe
» CAS or Column Address Strobe
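
The "8 ms, 1% time" refresh figure can be sanity-checked with a rough model: if every row is refreshed once per refresh period and the chip is busy for one row-refresh time per row, the lost fraction is rows × row-refresh time / period. The sketch below is a simplification. The row count (1,024) and per-row refresh time (80 ns) are illustrative assumptions, not values from the slides; only the 8 ms period is quoted above.

```python
def refresh_overhead(rows: int, row_refresh_s: float, period_s: float) -> float:
    """Fraction of time the DRAM is busy refreshing.

    Assumes every row is refreshed exactly once per period and that the
    chip is unavailable for one row-refresh time per row (simplified model).
    """
    return rows * row_refresh_s / period_s


# Hypothetical chip: 1,024 rows and 80 ns per row refresh (both assumed),
# refreshed every 8 ms (the period quoted on the slide).
overhead = refresh_overhead(rows=1024, row_refresh_s=80e-9, period_s=8e-3)
print(f"refresh overhead: {overhead:.2%}")
```

With these assumed numbers the overhead comes out at roughly 1%, the same order as the figure on the slide.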

Page 6: lecture_18_memory.ppt

Static RAM (SRAM)

• Six transistors in cross connected fashion
– Provides regular AND inverted outputs
– Implemented in CMOS process

[Figure: single-port 6-T SRAM cell]

Page 7: lecture_18_memory.ppt

Dynamic RAM

• SRAM cells exhibit high speed/poor density
• DRAM: simple transistor/capacitor pairs in high density form

[Figure: DRAM cell with word line, bit line, storage capacitor C, and sense amp]

Page 8: lecture_18_memory.ppt

DRAM Operations

• Write
– Charge bitline HIGH or LOW and set wordline HIGH
• Read
– Bit line is precharged to a voltage halfway between HIGH and LOW, and then the word line is set HIGH.
– Depending on the charge in the cap, the precharged bitline is pulled slightly higher or lower.
– Sense Amp Detects change
• Explains why Cap can’t shrink
– Need to sufficiently drive bitline
– Increase density => increase parasitic capacitance
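
The read step is charge sharing between the cell capacitor and the bit line, which is why the capacitor cannot shrink freely. A minimal sketch, assuming the standard charge-sharing relation dV = (V_cell − V_precharge) · C_cell / (C_cell + C_bitline). The supply voltage and both capacitance values are illustrative assumptions, not numbers from the slides.

```python
def bitline_swing(v_cell: float, v_precharge: float,
                  c_cell: float, c_bitline: float) -> float:
    """Voltage change on the bit line after the word line opens the cell.

    Charge sharing between the cell capacitor and the bit-line
    capacitance: dV = (v_cell - v_precharge) * c_cell / (c_cell + c_bitline).
    """
    return (v_cell - v_precharge) * c_cell / (c_cell + c_bitline)


VDD = 2.5            # assumed supply voltage, volts
C_BITLINE = 300e-15  # assumed parasitic bit-line capacitance, farads

for c_cell in (30e-15, 15e-15):  # assumed cell capacitances, farads
    # Cell stores HIGH (VDD); bit line is precharged to VDD / 2.
    dv = bitline_swing(VDD, VDD / 2, c_cell, C_BITLINE)
    print(f"C_cell = {c_cell * 1e15:.0f} fF -> swing = {dv * 1e3:.1f} mV")
```

Halving the cell capacitance roughly halves the swing the sense amp has to detect, and a larger bit-line capacitance shrinks it further.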

[Figure: DRAM cell with word line, bit line, storage capacitor C, and sense amp]

Page 9: lecture_18_memory.ppt

DRAM logical organization (4 Mbit)

• Square root of bits per RAS/CAS

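[Figure: block diagram of a 4 Mbit DRAM. 11 address lines (A0…A10) feed the row decoder and the column decoder; the 2,048 x 2,048 memory array holds one storage cell per word line/bit line crossing; sense amps & I/O connect to data pins D and Q]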
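
The multiplexed addressing can be sketched as follows: a 4 Mbit part has 2^22 cells, and the same 11 pins carry the row half and then the column half of the address. Which half of the address is used as the row (the upper 11 bits here) is an assumption made for illustration, and `split_address` is a hypothetical helper name.

```python
ROW_BITS = 11   # A0…A10 carry the row address while RAS is asserted
COL_BITS = 11   # the same 11 pins carry the column address on CAS


def split_address(bit_address: int) -> tuple[int, int]:
    """Split a 22-bit cell address into (row, column) for a 2,048 x 2,048 array."""
    if not 0 <= bit_address < 1 << (ROW_BITS + COL_BITS):
        raise ValueError("address out of range for a 4 Mbit part")
    row = bit_address >> COL_BITS               # upper 11 bits, sent first with RAS
    col = bit_address & ((1 << COL_BITS) - 1)   # lower 11 bits, sent second with CAS
    return row, col


print(split_address(0))               # first cell
print(split_address(2048))            # first cell of the second row
print(split_address(4 * 2**20 - 1))   # last cell
```

Because 2,048 = 2^11 is the square root of 4 Mbit, the row and column halves are the same width, so 11 pins are enough.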

Page 10: lecture_18_memory.ppt

So, Why do I freaking care?

• By its nature, DRAM isn’t built for speed
– Response times depend on capacitive circuit properties, which get worse as density increases
• DRAM process isn’t easy to integrate into CMOS process
– DRAM is off chip
– Connectors, wires, etc. introduce slowness
– IRAM efforts are looking at integrating the two
• Memory Architectures are designed to minimize impact of DRAM latency
– Low Level: Memory chips
– High Level: memory designs
– You will pay $$$$$$ and then some $$$ for a good memory system.

Page 11: lecture_18_memory.ppt

So, Why do I freaking care?

• 1960-1985: Speed = ƒ(no. operations)
• 1990
– Pipelined Execution & Fast Clock Rate
– Out-of-Order execution
– Superscalar Instruction Issue
• 1998: Speed = ƒ(non-cached memory accesses)
• What does this mean for
– Compilers? Operating Systems? Algorithms? Data Structures?

[Figure: log-scale plot (1 to 1000) over the years 1980 to 2000, with two curves labeled CPU and DRAM]

Page 12: lecture_18_memory.ppt

DRAM Performance

• A 60 ns (tRAC) DRAM can
– perform a row access only every 110 ns (tRC)
– perform column access (tCAC) in 15 ns, but time between column accesses is at least 35 ns (tPC).
» In practice, external address delays and turning around buses make it 40 to 50 ns
• These times do not include the time to drive the addresses off the microprocessor nor the memory controller overhead!
• Can it be made faster?
• Many techniques trade higher latency for higher bandwidth
– The idea is that the latency will be taken care of by the cache.
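
The timing parameters above can be combined into a simple cost model for a burst of reads. This is a sketch under simplifying assumptions: it uses only the chip-level tRAC, tRC and tPC values quoted on the slide, ignores the external bus and controller delays the slide warns about, and the function names are hypothetical.

```python
T_RAC = 60   # ns, row access time (latency to first data)
T_RC = 110   # ns, minimum time between row accesses
T_PC = 35    # ns, minimum time between column accesses in an open row


def time_random_rows(n: int) -> int:
    """ns to read n words that each live in a different row."""
    return (n - 1) * T_RC + T_RAC


def time_same_row(n: int) -> int:
    """ns to read n words from one open row (page-mode style access)."""
    return T_RAC + (n - 1) * T_PC


for n in (1, 4, 8):
    print(n, time_random_rows(n), time_same_row(n))
```

In this model the first word costs 60 ns either way. The gain comes from the later words, which cost 35 ns each inside an open row instead of 110 ns each across rows.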

Page 13: lecture_18_memory.ppt

Synchronous DRAM

• Has a clock input.
– Data output is in bursts w/ each element clocked
• Flavors: SDRAM, DDR
• PC100: Intel spec to meet 100 MHz memory bus designs. Introduced w/ i440BX chipset

[Figure: write and read timing diagrams]

Page 14: lecture_18_memory.ppt

RAMBUS

• “Intellectual property company”.
– Located in Los Altos, CA
– Designed a memory architecture
– Licensed to manufacturers
– They have no factories.

• Picked up by Intel, who signed an exclusive deal with them for Pentium 4 motherboards.

• Litigation regarding the intellectual property.

Page 15: lecture_18_memory.ppt

RAMBUS (RDRAM)

• Protocol based RAM w/ narrow (16-bit) bus
– High clock rate (400 MHz), but long latency
– Pipelined operation
• Multiple arrays w/ data transferred on both edges of clock

[Figures: RAMBUS bank; RDRAM memory system]
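
The peak transfer rate implied by the bus width and clock rate can be worked out directly. A minimal sketch: the RDRAM numbers (16-bit bus, 400 MHz, both clock edges) come from the slide, while the 64-bit width used for the PC100 SDRAM comparison is an assumption, since the slides do not state it. Peak figures ignore protocol overhead and the long latency noted above.

```python
def peak_bandwidth_mb_s(bus_bits: int, clock_mhz: float, transfers_per_clock: int) -> float:
    """Peak transfer rate in MB/s (1 MB = 10**6 bytes); ignores protocol overhead and latency."""
    return bus_bits / 8 * clock_mhz * transfers_per_clock


# RDRAM channel from the slide: 16-bit bus, 400 MHz, data on both clock edges.
rdram = peak_bandwidth_mb_s(bus_bits=16, clock_mhz=400, transfers_per_clock=2)

# PC100 SDRAM for comparison: 100 MHz, one transfer per clock.
# The 64-bit module width is an assumption; the slides do not state it.
pc100 = peak_bandwidth_mb_s(bus_bits=64, clock_mhz=100, transfers_per_clock=1)

print(f"RDRAM: {rdram:.0f} MB/s, PC100 SDRAM: {pc100:.0f} MB/s")
```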

Page 16: lecture_18_memory.ppt

RDRAM Timing

Page 17: lecture_18_memory.ppt

DRAM History

• DRAMs: capacity +60%/yr, cost –30%/yr
– 2.5X cells/area, 1.5X die size in 3 years
• ‘98 DRAM fab line costs $2B
– DRAM only: density, leakage v. speed
• Rely on increasing no. of computers & memory per computer (60% market)
– SIMM or DIMM is replaceable unit => computers use any generation DRAM
• Commodity, second source industry => high volume, low profit, conservative
– Little organization innovation in 20 years
– Don’t want to be chip foundries (bad for RDRAM)

• Order of importance: 1) Cost/bit 2) Capacity

– First RAMBUS: 10X BW, +30% cost => little impact
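
The yearly rates on this slide compound, and the three-year figures follow from them. A short worked check, using only the rates quoted above (`compound` is a hypothetical helper name):

```python
def compound(rate_per_year: float, years: int) -> float:
    """Growth factor after `years` at a fixed yearly rate (0.60 means +60%/yr)."""
    return (1 + rate_per_year) ** years


capacity_3yr = compound(0.60, 3)    # capacity at +60%/yr
cost_3yr = compound(-0.30, 3)       # cost at -30%/yr, as quoted on the slide
density_times_die = 2.5 * 1.5       # 2.5X cells/area times 1.5X die size

print(f"capacity after 3 years: {capacity_3yr:.2f}x")
print(f"cells/area x die size:  {density_times_die:.2f}x")
print(f"cost after 3 years:     {cost_3yr:.2f}x")
```

The two routes to the three-year capacity gain agree to within rounding: about 4X from the yearly rate, and 3.75X from density times die size.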

Page 18: lecture_18_memory.ppt

Read-only memory (ROM)

• Programmed at time of manufacture
– Cannot be written by the computer
– It is not erased by loss of power
– Some of them can be erased and rewritten by special hardware (EEPROM)
• One transistor / bit.
• Used in:
– BIOS of desktop computers
– Embedded devices (also serves as a code protection device)

Page 19: lecture_18_memory.ppt

FLASH Memory

• Floating gate transistor
– Presence of charge => “0”
– Erase Electrically or UV (EPROM)
• Performance
– Reads like DRAM (~ns)
– Writes like DISK (~ms). Write is a complex operation