Leveraging Heterogeneity in DRAM Main Memories to Accelerate Critical Word Access
Niladrish Chatterjee, Manjunath Shevgoor, Rajeev Balasubramonian, Al Davis, Zhen Fang‡†, Ramesh Illikkal*, Ravi Iyer*
University of Utah, NVidia‡ and Intel Labs*
†Work done while at Intel
Memory Bottleneck
• DRAM power as high as 25% of total datacenter power
• Low-Power DRAM in place of DDR3
– BOOM from HP Labs
– Energy Proportional Memory from Stanford
[Figure: BASELINE (CPU with DDR3 memory) vs. Low Power Memory (CPU with LPDDR memory)]
Latency Wall
• Memory latency wall not going away
– Emerging scale-out workloads, e.g. CloudSuite
– Move towards energy-efficient in-order cores
• Reduced Latency DRAM offers very low latency
– Row-cycle time (tRC) of 8-12 ns (DDR3 tRC = 48.75 ns, LPDDR2 tRC = 60 ns)
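To put the tRC figures above side by side, the short sketch below computes how many times longer the DDR3 and LPDDR2 row-cycle times are than RLDRAM3's. The numbers are the ones quoted on the slide; the dictionary and function names are illustrative only, and tRC is just one timing parameter, not the full access latency.

```python
# Row-cycle times (tRC) in nanoseconds, as quoted on the slide.
# RLDRAM3 is given as a range, so every entry is stored as (low, high).
T_RC_NS = {
    "RLDRAM3": (8.0, 12.0),
    "DDR3": (48.75, 48.75),
    "LPDDR2": (60.0, 60.0),
}

def trc_ratio(slow, fast="RLDRAM3"):
    """Return the (smallest, largest) ratio of slow's tRC to fast's tRC."""
    slow_lo, slow_hi = T_RC_NS[slow]
    fast_lo, fast_hi = T_RC_NS[fast]
    return slow_lo / fast_hi, slow_hi / fast_lo

for device in ("DDR3", "LPDDR2"):
    lo, hi = trc_ratio(device)
    print(f"{device} tRC is {lo:.1f}x to {hi:.1f}x that of RLDRAM3")
```

With the slide's numbers, the DDR3 row-cycle time is roughly 4x to 6x that of RLDRAM3, and the LPDDR2 row-cycle time is 5x to 7.5x.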
No one memory works best
[Figure: two bar charts comparing RLDRAM3, DDR3, and LPDDR2. Left: Power (mW). Right: Latency (ns).]
Heterogeneous Memory
• Combine high-performance and low-power DRAM to outperform DDR3 at a lower energy cost
• Large number of possible designs
– Different DRAM device combinations
– Channel organization
– Data placement granularity
[Figure: CPU connected to a performance-optimized DRAM and a power-optimized DRAM]
Critical Word Regularity
• Most DRAM requests are for word-0 of the cache-line
[Figure: Frequency of accesses to individual words of a cache-line]
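A minimal sketch of how such a per-word frequency count could be gathered from a list of miss addresses. It assumes a 64-byte cache line made of eight 8-byte words (the slides only imply eight words per line); the function names and the sample addresses are hypothetical, not taken from the presentation.

```python
LINE_BYTES = 64  # assumption: cache-line size in bytes
WORD_BYTES = 8   # assumption: word size in bytes, giving 8 words per line

def critical_word(addr):
    """Index (0-7) of the word within its cache line that addr falls in."""
    return (addr % LINE_BYTES) // WORD_BYTES

def word_histogram(miss_addrs):
    """Count how often each word position is the one first requested."""
    counts = [0] * (LINE_BYTES // WORD_BYTES)
    for addr in miss_addrs:
        counts[critical_word(addr)] += 1
    return counts

# Made-up addresses for illustration only, not measured data.
sample = [0x1000, 0x1040, 0x1088, 0x20C0, 0x2138]
print(word_histogram(sample))
```

A histogram dominated by index 0 would correspond to the regularity the slide describes.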
Critical Word Acceleration
• Critical word fetched from RLDRAM to boost performance
• Rest of the cache-line placed in and retrieved from LPDRAM for energy efficiency
[Figure: CPU attached to RLDRAM and LPDDR devices, with labels "Word 0" and "Words 1 - 7"]
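The placement described above can be sketched as a simple split of each cache line: word 0 goes to the fast device and words 1-7 go to the low-power device. This is only an illustration under simplifying assumptions (eight words per line, both parts requested in parallel, no queuing or bus contention). All function names and the latency values in the usage lines are hypothetical.

```python
WORDS_PER_LINE = 8  # assumption: word 0 plus words 1-7, as on the slide

def split_line(words):
    """Split a cache line into (RLDRAM part, LPDRAM part)."""
    if len(words) != WORDS_PER_LINE:
        raise ValueError("expected a full cache line")
    return words[:1], words[1:]

def merge_line(rldram_part, lpdram_part):
    """Reassemble the full cache line from the two parts."""
    return list(rldram_part) + list(lpdram_part)

def critical_word_latency(critical_idx, rldram_ns, lpdram_ns):
    """Time until the requested word arrives in this simplified model."""
    return rldram_ns if critical_idx == 0 else lpdram_ns

line = list(range(WORDS_PER_LINE))
fast_part, slow_part = split_line(line)
assert merge_line(fast_part, slow_part) == line

# Hypothetical latencies, chosen only to show the two cases.
print(critical_word_latency(0, rldram_ns=10.0, lpdram_ns=50.0))
print(critical_word_latency(3, rldram_ns=10.0, lpdram_ns=50.0))
```

In this model the fast device only helps when the requested word is word 0, which is why the access regularity from the previous slide matters.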
Results
• Throughput improved by 12.9%
• System energy improved by 6%