Phase Change Memory What to wear out today?
Chris Craik, Aapo Kyrola, Yoshihisa Abe
Memory Technologies
• Concerns
  – Density
  – Latency
  – Energy
• Off Chip Technologies
  – DRAM
    • Moderately dense, but not very fast
  – Flash
    • Fairly dense, but near-disk slowness
Evaluation of Technologies
                            DRAM                  NAND Flash   NOR Flash
Density (relative to DRAM)  1                     4            0.25
Read latency                60ns                  25,000ns     300ns
Write speed                 1000MB/s              2.4MB/s      0.5MB/s
Endurance (writes)          Effectively infinite  10^4         10^4
Retention                   Needs refresh         10 years     10 years
Phase Change Memory
• Bit recorded in ‘Phase Change Material’
  – SET to 1 by heating to crystallization point
  – RESET to 0 by heating to melting point
  – Resistance indicates state
Phase Change Memory
• Density
  – 4x increase over DRAM
• Latency
  – 4x higher (slower) than DRAM
• Energy
  – No leakage
  – Reads are worse (2x), writes much worse (40x)
• Wear out
  – Limited number of writes (but better than Flash)
• Non-volatile
  – Data persists in memory
Evaluation of Technologies
                            DRAM                  NAND Flash   NOR Flash   PCM
Density (relative to DRAM)  1                     4            0.25        2-4
Read latency                60ns                  25,000ns     300ns       200-300ns
Write speed                 1000MB/s              2.4MB/s      0.5MB/s     100MB/s
Endurance (writes)          Effectively infinite  10^4         10^4        10^6 to 10^8
Retention                   Needs refresh         10 years     10 years    10 years
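To get a feel for what the PCM endurance figures mean in practice, here is a rough back-of-the-envelope sketch. It assumes ideal wear leveling (every cell absorbs an equal share of writes) and a hypothetical 4 GB PCM array written continuously at the table's 100MB/s. The capacity and the function name lifetime_years are illustrative assumptions, not figures from the slides.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600


def lifetime_years(capacity_bytes, write_bytes_per_s, endurance_cycles):
    """Estimate time until wear-out under ideal wear leveling.

    With perfectly even wear, one "cycle" on every cell costs one full
    overwrite of the array, so lifetime = endurance * time per overwrite.
    """
    seconds_per_full_overwrite = capacity_bytes / write_bytes_per_s
    return endurance_cycles * seconds_per_full_overwrite / SECONDS_PER_YEAR


CAPACITY = 4 * 10**9       # hypothetical 4 GB array (assumption, not from the slides)
WRITE_RATE = 100 * 10**6   # 100MB/s PCM write speed, from the table

low = lifetime_years(CAPACITY, WRITE_RATE, 10**6)   # pessimistic endurance
high = lifetime_years(CAPACITY, WRITE_RATE, 10**8)  # optimistic endurance
print(f"{low:.1f} to {high:.1f} years")
```

Under these assumptions the estimate spans roughly 1.3 to 127 years, which suggests why both the endurance figure and the quality of wear leveling matter so much. Without leveling, a single hot cell could reach its write limit far sooner.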
Solutions to wearing & energy
• Partial writes = write only bits that have changed
  a) Caches keep track of written bytes/words per cache line (Lee et al.)
     • Storage overhead vs. accuracy
  b) When writing a row to memory, first read old row and compare => write only modified bits (Zhou et al.)
Writes cause thermal expansion and contraction, which wears the material, and they require a strong current. Unlike DRAM, however, PCM does not leak energy.
Most written bits are redundant!
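The read-compare-write idea in (b) can be sketched in a few lines. This is a minimal software model, not the hardware mechanism from Zhou et al.: the function name partial_write and the byte-string representation of a row are assumptions made for illustration.

```python
def partial_write(old_row: bytes, new_row: bytes):
    """Model a read-before-write: only bits that differ get programmed.

    Returns the row as stored afterwards and the number of bits that
    actually had to be written (the cells that would be stressed).
    """
    if len(old_row) != len(new_row):
        raise ValueError("rows must have the same length")
    flipped = 0
    for old, new in zip(old_row, new_row):
        # XOR marks exactly the bit positions whose value changes.
        flipped += bin(old ^ new).count("1")
    return bytes(new_row), flipped


old = bytes([0b10101010, 0xFF, 0x00, 0x0F])
new = bytes([0b10101011, 0xFF, 0x00, 0xF0])
stored, bits_written = partial_write(old, new)
total_bits = 8 * len(new)
print(f"programmed {bits_written} of {total_bits} bits")
```

In this toy row only 9 of 32 bits change, so a full-row write would have rewritten 23 cells for nothing. The extra read costs time, but reads are much cheaper than writes in PCM.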
Solutions to wearing & energy (cont.)
• Buffer organisation (Lee et al.)
  – DRAM uses one row buffer (2048B)
  – Propose using up to 32 × 64B narrow buffers, each with own association
    • Capture coalescing writes: temporal locality more important than spatial locality
    • Find 4 × 512B most effective
    • Area-neutral
    • Also helps decrease latency
• Small DRAM buffer for PCM (Qureshi et al.)
  – Combine low latency of DRAM with high capacity of PCM
  – Similarly use Flash cache for Disk
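To illustrate why several narrow buffers can coalesce more writes than one wide row buffer, here is a toy model. It assumes LRU replacement, treats every access as a write, and uses a made-up trace that alternates between four distant addresses. The function name array_writes, the replacement policy, and the trace are illustrative assumptions and are not taken from Lee et al.

```python
from collections import OrderedDict


def array_writes(addresses, num_buffers, buffer_bytes):
    """Count write-backs to the PCM array for a stream of write addresses.

    Each buffer holds one aligned block of buffer_bytes. A write that hits
    a buffered block is coalesced; a miss evicts the least recently used
    block, which costs one array write.
    """
    buffers = OrderedDict()  # block id -> dirty flag, kept in LRU order
    writes = 0
    for addr in addresses:
        block = addr // buffer_bytes
        if block in buffers:
            buffers.move_to_end(block)       # hit: coalesce, refresh LRU position
        else:
            if len(buffers) == num_buffers:
                buffers.popitem(last=False)  # evict the LRU block
                writes += 1
            buffers[block] = True
    return writes + len(buffers)             # final flush of remaining dirty blocks


# Hypothetical trace: repeated writes to four hot locations far apart.
trace = [0, 4096, 8192, 12288] * 8

one_wide = array_writes(trace, num_buffers=1, buffer_bytes=2048)
four_narrow = array_writes(trace, num_buffers=4, buffer_bytes=512)
print(one_wide, four_narrow)
```

Both configurations hold 2048B in total, but on this trace the single wide buffer thrashes (32 array writes) while four narrow buffers absorb every repeat (4 array writes). A trace dominated by sequential accesses would favour the wide buffer instead, which is the temporal versus spatial locality trade-off noted above.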
Solutions to wearing & energy
• Wear leveling (Zhou et al.)
  – Row shifting: even out writes among cells in a row
    • Needs extra hardware
  – Segment swapping: even out writes between pages
    • Implemented in memory controller
Spatial locality is now a problem!
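The two wear-leveling ideas above can be sketched as follows. This is a simplified model under stated assumptions: the class SegmentSwapper, the function shifted_index, the fixed swap interval, and the "swap with the least-written segment" policy are all hypothetical, and the cost of copying data during a swap is ignored. It is not the scheme from Zhou et al.

```python
def shifted_index(bit_index, shift, row_bits):
    """Row shifting: map a logical bit position to a rotated physical cell."""
    return (bit_index + shift) % row_bits


class SegmentSwapper:
    """Segment swapping: remap a hot logical segment onto a cold physical one."""

    def __init__(self, num_segments, swap_interval):
        self.mapping = list(range(num_segments))  # logical -> physical segment
        self.writes = [0] * num_segments          # writes seen per physical segment
        self.swap_interval = swap_interval
        self.since_swap = 0

    def write(self, logical):
        phys = self.mapping[logical]
        self.writes[phys] += 1
        self.since_swap += 1
        if self.since_swap >= self.swap_interval:
            self._swap(logical)
            self.since_swap = 0
        return phys

    def _swap(self, hot_logical):
        hot_phys = self.mapping[hot_logical]
        # Pick the least-written physical segment as the new home.
        cold_phys = min(range(len(self.writes)), key=self.writes.__getitem__)
        if cold_phys == hot_phys:
            return
        cold_logical = self.mapping.index(cold_phys)
        # A real controller would also copy the data; that cost is ignored here.
        self.mapping[hot_logical] = cold_phys
        self.mapping[cold_logical] = hot_phys


swapper = SegmentSwapper(num_segments=4, swap_interval=10)
for _ in range(100):
    swapper.write(0)  # hammer a single logical segment
print(swapper.writes)
```

Even though all 100 writes target one logical segment, the physical write counts end up spread across all four segments instead of landing on a single one. The rotation in shifted_index plays the same role at a finer grain, moving frequently written bit positions across the cells of a row.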
PCM as On-chip Cache
• Hybrid on-chip cache architecture consisting of multiple memory technologies
  – PCM, SRAM, embedded DRAM (eDRAM), and Magnetic RAM (MRAM)
• PCM is slow compared to SRAM etc.
  – But high density, non-volatility etc. help
• Use as complement to faster memory technologies
  – As “slow” L2 cache, as L3 cache etc.
Cache Structure Example
• Use PCM as huge L3 cache
• SRAM and eDRAM both as L2
  – Faster and smaller SRAM region
  – Slower and larger eDRAM region
[Figure: two cache hierarchies with the same footprint. Hybrid: Core w/ L1, L2 SRAM (Fast: 256KB), L2 eDRAM (Slow: <4MB), L3 PCM (32MB). 3-level SRAM: Core w/ L1, L2 SRAM 256KB, L3 SRAM 1MB.]
• Compared to 3-level SRAM cache model:
  – 18% improvement in instructions per cycle
  – Comparable power consumption
    • Despite additional layer of PCM and its large capacity
• Various design possibilities
  – PCM as “third” L2 cache etc.
Summary
• PCM can be a viable approach towards next-generation memory architecture
  – High density, non-volatility
  – Various techniques to overcome shortcomings
    • Short endurance, high-energy writes, latencies
  – Could be used as main memory or in on-chip cache hierarchy
Questions
• How well do results obtained on benchmark apps translate to real usage?
• Variance of endurance of memory cells?
  – Might some cells wear out very quickly?
• Possibilities of PCM non-volatility
  – Instant wake-up from hibernation etc.