Exploiting Managed Language Semantics to Mitigate Wear-out in Persistent Memory
Shoaib AkramGhent University, Belgium
Flash Memory Summit Santa Clara, CA
20191
Main memory capacity expansion
Flash Memory Summit Santa Clara, CA
20192
Charge storage in DRAM a scaling limitation
Manufacturing complexity makes DRAM pricing volatile
Source: WSTS, IC Insights0.6
0.7
0.8
0.9
1
Pric
e/Gb
($)
Jan’17 Jan’18
Phase change memory (PCM)
Flash Memory Summit Santa Clara, CA
20193
Scalable → More Gb for the same price
Byte addressable like DRAM
Latency closer to DRAM
🙁🙁 Low write endurance
Why PCM has low write endurance?
Flash Memory Summit Santa Clara, CA
20194
tem
pera
ture
set to crystalline
reset to amorphous
time
Electric pulses to program PCMcells wear them out over time
Hybrid DRAM-PCM memory
Flash Memory Summit Santa Clara, CA
20195
DRAM PCM
Endurance
PCM alone as a DRAM replacement wears out in a few months for popular Java applications
CapacityPersistence
Mitigating PCM wear-out
Flash Memory Summit Santa Clara, CA
20196
Wear-leveling to spread writes across PCM
This talk → Use DRAM to limit PCM writes
OS to limit PCM writes
Flash Memory Summit Santa Clara, CA
20197
Coarse-grained page migrations hurt application performance and PCM lifetime
DRAM PCM
Flash Memory Summit Santa Clara, CA
20198
Managed runtimes
Managed Runtime
OperatingSystem
Hardware
ApplicationPlatform independenceAbstract hardware/OS→ Aka Virtual Machine
Ease programmer’s burdenGarbage collectionSecurity
Flash Memory Summit Santa Clara, CA
20199
GC to limit PCM writes
GC understands memory semantics
OperatingSystem
Hardware
Application
GC approaches are pro-activeand fine-grained
Flash Memory Summit Santa Clara, CA
201910
Write Distribution in GC heap
nursery matureGC
70%of writes
Flash Memory Summit Santa Clara, CA
201911
Write Distribution in GC heap
22%to 2% of objects
nursery matureGC
70%of writes
Flash Memory Summit Santa Clara, CA
201912
Write-Rationing Garbage Collection
Limit PCM writes by discovering highly written objects
Kingsguard dynamically monitor writes
Kingsguard-Nursery (KG-N)
Flash Memory Summit Santa Clara, CA
201913
mature largenursery
DRAM PCM
Kingsguard-Writers (KG-W)
Flash Memory Summit Santa Clara, CA
201914
mature large
observer
PCM
mature largeDRAM
nursery
KG-W drawbacks
Flash Memory Summit Santa Clara, CA
201915
Overhead of dynamic monitoring
Limited time window to predict write intensity
Excessive & fixed DRAM consumption
Flash Memory Summit Santa Clara, CA
201916
Write-Rationing Garbage Collection
Limit PCM writes by discovering highly written objects
Kingsguard dynamically monitor writes
Crystal Gazer statically profiles objects
Allocation site as a write predictor
Flash Memory Summit Santa Clara, CA
201917
a = new Object()b = new Object()c = new Object()d = new Object()
a = new Object()b = new Object()c = new Object()d = new_dram Object()
Uniform distribution 🙁🙁Skewed distribution 🙂🙂
Write distribution by allocation site
Flash Memory Summit Santa Clara, CA
201918
Few sites capture majority of writes
0
25
50
75
100
0 50 100 150
% m
atur
e ob
ject
s
Sites sorted by writes
WritesVolume Pjbb2005
Crystal Gazer operation
Flash Memory Summit Santa Clara, CA
201919
ApplicationProfiling
Advice Generation
Bytecode Compilation
a = new Object()…b = new_dram Object()
a = new Object()…b = new Object()
ObjectPlacement
Advice generation
Flash Memory Summit Santa Clara, CA
201920
Goal: Generate <alloc-site, advice> pairs advice → DRAM or PCMinput is a write-intensity trace
Two heuristics to classify allocation sites as DRAM or PCM
Classification heuristic (1)
Flash Memory Summit Santa Clara, CA
201921
Freq: A threshold % of objects from a site get more than a threshold # writes → DRAM
🙂🙂 Aggressively limits PCM writes
🙁🙁 No distinction based on object size
Classification heuristic (2)
Flash Memory Summit Santa Clara, CA
201922
Dens: A threshold % of objects from a site have more than a threshold write density → DRAM
Write density → Ratio of # writes to object size
Classification thresholds
Flash Memory Summit Santa Clara, CA
201923
Homogeneity threshold → 1%
Frequency threshold → 1
Density threshold → 1
Classification examples
Flash Memory Summit Santa Clara, CA
201924
ObjectIdentifier # Writes # Bytes
Allocation site
O1 0 4 A() + 10O2 0 4 A() + 10O3 128 4 A() + 10O4 128 4096 B() + 4
Frequency threshold = 1PCM writes = ?, DRAM bytes = ?
Classification examples
Flash Memory Summit Santa Clara, CA
201925
ObjectIdentifier # Writes # Bytes
Allocation site
O1 0 4 A() + 10O2 0 4 A() + 10O3 128 4 A() + 10O4 128 4096 B() + 4
Frequency threshold = 1PCM writes = ?, DRAM bytes = ?
→ →
Classification examples
Flash Memory Summit Santa Clara, CA
201926
ObjectIdentifier # Writes # Bytes
Allocation site
O1 0 4 A() + 10O2 0 4 A() + 10O3 128 4 A() + 10O4 128 4096 B() + 4
Frequency threshold = 1PCM writes = 0/256, DRAM bytes = 5008
→ →
Classification examples
Flash Memory Summit Santa Clara, CA
201927
ObjectIdentifier # Writes # Bytes
Allocation site
O1 0 4 A() + 10O2 0 4 A() + 10O3 128 4 A() + 10O4 128 4096 B() + 4
Density threshold = 1PCM writes = ?, DRAM bytes = ?
Classification examples
Flash Memory Summit Santa Clara, CA
201928
ObjectIdentifier # Writes # Bytes
Allocation site
O1 0 4 A() + 10O2 0 4 A() + 10O3 128 4 A() + 10O4 128 4096 B() + 4
Density threshold = 1PCM writes = ?, DRAM bytes = ?
→ 32
Classification examples
Flash Memory Summit Santa Clara, CA
201929
ObjectIdentifier # Writes # Bytes
Allocation site
O1 0 4 A() + 10O2 0 4 A() + 10O3 128 4 A() + 10O4 128 4096 B() + 4
Density threshold = 1PCM writes = ?, DRAM bytes = ?
→ <1
Classification examples
Flash Memory Summit Santa Clara, CA
201930
ObjectIdentifier # Writes # Bytes
Allocation site
O1 0 4 A() + 10O2 0 4 A() + 10O3 128 4 A() + 10O4 128 4096 B() + 4
Density threshold = 1PCM writes = 128/256, DRAM bytes = 12
Flash Memory Summit Santa Clara, CA
201931
new_dram() → Set a bit in the object header
GC → Inspect the bit on nursery collection to copy object in DRAM or PCM
Object placement in Crystal Gazer
Flash Memory Summit Santa Clara, CA
201932
Object placement in Crystal Gazer
mature large
PCM
mature largeDRAM
nursery🧐🧐
Is marked highly written? ✓
Key features of Crystal Gazer
Flash Memory Summit Santa Clara, CA
201933
Eliminates overhead of dynamic monitoring
Less mispredictions due to pro-active nature
Pareto optimal trade-offs b/w capacity and lifetime
Flash Memory Summit Santa Clara, CA
201934
Evaluation methodology
15 Applications → DaCapo, GraphChi, SpecJBB
Medium-end server platform
Different inputs for production and advice
Jikes RVM
Flash Memory Summit Santa Clara, CA
201935
Emulation platform
CPU CPU✗
Jikes RVMApp
OS
Flash Memory Summit Santa Clara, CA
201936
PCM write rates
PCM-Only write rate is above 1 GB/s on average
Safe operation is 200 MB/s
Flash Memory Summit Santa Clara, CA
201937
PCM write rates
0200400600800
Writ
e ra
te in
MB/
s KG-N KG-W Dens Freq
Flash Memory Summit Santa Clara, CA
201938
Execution time
0.0
0.5
1.0
1.5Ex
ec. t
ime
norm
. to
KG-N
KG-W Dens Freq
30% 8%
Flash Memory Summit Santa Clara, CA
201939
0.30.40.50.60.70.8
100 150 200 250
PCM
writ
es
rela
tive
to K
G-N
DRAM capacity in MB
Pjbb2005
KG-W
KG-W versus Crystal Gazer
Flash Memory Summit Santa Clara, CA
201940
0.30.40.50.60.70.8
100 150 200 250
PCM
writ
es
rela
tive
to K
G-N
DRAM capacity in MB
Pjbb2005
KG-WCrystal Gazer
KG-W versus Crystal Gazer
Crystal Gazeropens up Pareto-optimaltrade-offs
Flash Memory Summit Santa Clara, CA
201941
Write-rationing garbage collection
Hybrid memory is inevitable
All layers can contribute to manage hybrid memory
Write-rationing GC is pro-active and fine-grained
DRAM PCM