Xylem: Enhancing Vertical Thermal Conduction in 3D
Processor-Memory Stacks
University of Illinois at Urbana-Champaign and Nvidia Corporation
http://iacoma.cs.uiuc.edu
MICRO, October 2017
Aditya Agrawal, Josep Torrellas and Sachin Idgunji
2
Xylem
Image source: http://repasosdeharold.blogspot.com/2015/02/plants-test-review.html
3
Processor-Memory Stacks:
▪ Reduced interconnect length and power
▪ Higher memory bandwidth
▪ Smaller form factors
▪ Heterogeneous integration
▪ 2.5D processor-memory stacks exist
▪ 3D processor-memory stacking is the future
Major challenge: Thermals
Motivation: Thermal Issues in 3D Stacking
Xylem, MICRO 2017
4
3D Stacking Technologies
[Figure: Face-to-back (f2b) die interface (not to scale). Labels: Upper Die (Face), Lower Die (Back), Frontside Metal Layers (M1 to Mn), Backside Metal Layers (BM1), Silicon, Devices, D2D Layer, Electrical μbump, Dummy μbump, TSV, TTSV. Dimension labels: 2 μm, 20 μm, 2 μm.]
5
✓ Processor power and I/O signals do not traverse TSVs
✓ Processor IR drop similar to current designs
✓ DRAM and processor die floorplans are independent because TSV count and location are governed by stacked DRAM standards
Thermal challenges
Stack Organization: Memory on Top
[Figure: memory-on-top stack. Labels: Heat Sink, Integrated Heat Spreader (IHS), Thermal Interface Material (TIM), Package, Processor Silicon, Processor Frontside Metal (Cu), DRAM Frontside Metal (Al), DRAM Silicon, Die to Die (D2D) Layer, Through Silicon Vias (TSVs), C4 pads.]
6
▪ Identify the thermal bottleneck in 3D stacks: Die-to-Die (D2D) layers
▪ Improve vertical conduction through the D2D layer:
– Align and short dummy μbumps with Thermal TSVs
– Generic and custom TTSV placement schemes
▪ Use the resulting thermal headroom:
– Boost processor frequency (400-720 MHz) & performance (11-18%)
▪ Exploit thermal heterogeneity: cores closer to TTSVs conduct heat better
– Conductivity-aware thread placement and migration, and frequency boosting
Contributions
7
Thermal Resistance in the Stack
Layer          Rth (mm²-K/W)
Bulk Silicon   0.83
Proc. Metal    1.00
D2D            13.33
▪ Thermal resistance per unit area:
D2D layer is 13-16x more resistive than bulk silicon or metal layers
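The 13-16x figure follows directly from the table values. A minimal sketch of the arithmetic, where the dictionary and function names are illustrative only:

```python
# Per-unit-area thermal resistances from the table above (mm^2-K/W).
RTH = {"Bulk Silicon": 0.83, "Proc. Metal": 1.00, "D2D": 13.33}


def resistance_ratio(layer: str, reference: str) -> float:
    """Return how many times more resistive `layer` is than `reference`."""
    return RTH[layer] / RTH[reference]


ratio_vs_silicon = resistance_ratio("D2D", "Bulk Silicon")
ratio_vs_metal = resistance_ratio("D2D", "Proc. Metal")

print(f"D2D vs bulk silicon: {ratio_vs_silicon:.1f}x")
print(f"D2D vs proc. metal:  {ratio_vs_metal:.1f}x")
```

The two ratios come out at roughly 16x (against bulk silicon) and 13x (against the processor metal layers), which are the two ends of the range quoted on the slide.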
8
Shortcomings of Prior Work
▪ Underestimated the thermal resistance of the D2D layer by assuming:
– High conductivity
– Small thickness
▪ Focused on increasing the conductivity of the bulk silicon using TTSVs
▪ Concluded that TTSVs alone are effective
Our approach: Combine TTSVs with a mechanism to reduce D2D resistance
9
Proposal: Dummy μbump-TTSV Alignment & Shorting
[Figure: two cross-section panels, "Before" and "Proposed". Labels in each panel: Upper Die (Face), Lower Die (Back), D2D, Silicon, Frontside Metal Layers, Backside Metal Layers, Underfill, Dummy μbump, TTSV.]
10
TTSVs:
▪ Cannot disrupt regular DRAM arrays: Place in the DRAM peripheral logic
▪ Distribute TTSVs and avoid TTSV farms
▪ Maintain Keep Out Zone (KOZ) around each TTSV
Dummy μbumps:
▪ Anywhere in the D2D layer except the electrical μbump locations
TTSV Placement: Constraints
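The constraints above can be read as a filter over candidate sites. The sketch below is one possible way to express them and is not the placement procedure from the talk: the coordinates, the rectangle representation of DRAM arrays, and the choice to model the Keep Out Zone as a minimum centre-to-centre distance are all assumptions made for illustration. Avoiding TTSV farms is only approximated here by that same spacing rule.

```python
from math import hypot


def inside(rect, point):
    """True if `point` (x, y) lies within `rect` (x0, y0, x1, y1)."""
    x0, y0, x1, y1 = rect
    x, y = point
    return x0 <= x <= x1 and y0 <= y <= y1


def valid_ttsv_sites(candidates, dram_arrays, koz):
    """Greedily keep candidate sites that respect the constraints.

    A site is kept only if it is outside every DRAM array rectangle and
    at least `koz` away from every site already kept (assumed KOZ model).
    """
    chosen = []
    for site in candidates:
        if any(inside(rect, site) for rect in dram_arrays):
            continue  # would disrupt a regular DRAM array
        if any(hypot(site[0] - c[0], site[1] - c[1]) < koz for c in chosen):
            continue  # violates the keep-out zone of an earlier TTSV
        chosen.append(site)
    return chosen


def valid_dummy_bumps(candidates, electrical_bumps):
    """Dummy μbumps may go anywhere except electrical μbump locations."""
    taken = set(electrical_bumps)
    return [p for p in candidates if p not in taken]


# Hypothetical layout: one DRAM array and four candidate TTSV sites.
arrays = [(0, 0, 4, 4)]
sites = valid_ttsv_sites([(2, 2), (5, 1), (5.5, 1), (5, 3)], arrays, koz=1.0)
bumps = valid_dummy_bumps([(0, 0), (1, 0), (2, 0)], [(1, 0)])
print(sites, bumps)
```

In this toy layout the first site is rejected because it falls inside the array, and the third because it sits within the keep-out distance of the second.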
11
DRAM and Processor Baseline Floorplans
[Figure: two panels. DRAM (Wide IO) die floorplan: Bank, TSV Bus, Logic. Processor die floorplan: Core 1 to Core 8, each with IL1, DL1 and L2; Memory Controllers; Coherent Bus; TSV Bus.]
12
Proposal: TTSV Placement Schemes
[Figure: two DRAM die panels, "Generic (oblivious to hotspots)" and "Custom (aligned with hotspots)". Labels in each panel: Bank, TTSV, TSV Bus.]
13
▪ TTSV placement & TTSV-μbump alignment and shorting:
– Increases thermal conduction from the processor die to the heat sink
– Reduces the temperature of the processor die
▪ Proposal: Increase processor frequency to consume the thermal headroom
– Increase application performance
Proposal: Frequency Boosting
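One way to picture consuming the thermal headroom is a loop that raises frequency in steps until the next step would exceed the temperature limit. The sketch below is purely illustrative: the linear temperature model, its slope, the 90 °C limit, the 100 MHz step, and the 5.5 °C cooling value are made-up placeholders, not numbers or mechanisms from the talk.

```python
def boost_frequency(base_mhz, temp_at, temp_limit_c, step_mhz=100, max_mhz=4000):
    """Raise frequency in fixed steps while the predicted temperature fits.

    `temp_at` is any callable mapping a frequency in MHz to a predicted
    processor temperature in degrees C (hypothetical model supplied by
    the caller).
    """
    freq = base_mhz
    while freq + step_mhz <= max_mhz and temp_at(freq + step_mhz) <= temp_limit_c:
        freq += step_mhz
    return freq


def toy_temp(freq_mhz, cooling_c):
    """Toy model: at the limit at 2400 MHz, +0.01 C per extra MHz,
    minus whatever cooling the improved vertical conduction provides."""
    return 90.0 + 0.01 * (freq_mhz - 2400) - cooling_c


baseline = boost_frequency(2400, lambda f: toy_temp(f, 0.0), 90.0)
improved = boost_frequency(2400, lambda f: toy_temp(f, 5.5), 90.0)
print(baseline, improved)
```

With no extra cooling the loop cannot leave the base frequency; once the toy model is given some temperature reduction, the same limit admits a higher frequency.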
14
▪ TTSV-μbump alignment and shorting creates high conductivity paths
– Areas closer to TTSVs dissipate heat more easily
– Result is thermal spatial heterogeneity in the stack
▪ Proposal: Three λ-aware optimizations to further improve performance
– λ-aware thread placement
– λ-aware frequency boosting
– λ-aware thread migration
Proposal: Conductivity (λ) Aware Techniques
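As an illustration of λ-aware thread placement, the sketch below pairs the most power-hungry threads with the cores that conduct heat best. This greedy pairing is an assumed heuristic chosen for clarity, not necessarily the policy used in the work, and the conductivity and power values are hypothetical.

```python
def lambda_aware_placement(core_lambda, thread_power):
    """Map threads to cores, hottest thread onto the best-conducting core.

    core_lambda:  {core name: relative vertical conductivity (λ)}
    thread_power: {thread name: estimated power draw}
    """
    if len(thread_power) > len(core_lambda):
        raise ValueError("more threads than cores")
    cores = sorted(core_lambda, key=core_lambda.get, reverse=True)
    threads = sorted(thread_power, key=thread_power.get, reverse=True)
    return dict(zip(threads, cores))


# Hypothetical values: core2 sits closest to the TTSVs.
placement = lambda_aware_placement(
    {"core1": 1.0, "core2": 1.4, "core3": 1.2},
    {"t0": 3.0, "t1": 5.0},
)
print(placement)
```

Here the higher-power thread `t1` lands on `core2`, and `t0` takes the next-best core. The same ranking of cores by λ could in principle drive per-core frequency boosting or migration decisions.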
15
▪ 8-core OoO processor die @ 2.4 GHz
▪ 8-high Wide IO memory on top
▪ Processor timing and power: SESC & McPAT
▪ DRAM timing and power: DRAMSim2
▪ Thermal analysis: 3D HotSpot
▪ Applications: SPLASH-2, PARSEC & NAS
Evaluation Setup
[Figure: Memory-on-top configuration. Labels: Heat Sink, IHS, TIM, Motherboard, Proc. Silicon, Proc. Metal, DRAM Metal, DRAM Silicon, D2D Layer, TSVs, C4 pads.]
16
λ-aware techniques enable further 100-200 MHz improvements
Result Summary
TTSV Placement          Generic    Custom
Area Overhead           0.63%      0.81%
Proc. Temp. Reduction   5.0 °C     8.4 °C
Avg. Frequency Boost    400 MHz    720 MHz
Avg. Performance Gain   11%        18%
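To put the frequency boosts in proportion, they can be expressed relative to the 2.4 GHz processor frequency given in the evaluation setup. The sketch assumes the average boosts are measured against that base frequency; the variable and function names are illustrative.

```python
BASE_MHZ = 2400  # processor frequency from the evaluation setup

# Values copied from the result summary table.
RESULTS = {
    "Generic": {"boost_mhz": 400, "perf_gain_pct": 11},
    "Custom": {"boost_mhz": 720, "perf_gain_pct": 18},
}


def relative_boost_pct(boost_mhz, base_mhz=BASE_MHZ):
    """Frequency boost as a percentage of the base frequency."""
    return 100.0 * boost_mhz / base_mhz


for scheme, row in RESULTS.items():
    pct = relative_boost_pct(row["boost_mhz"])
    print(f"{scheme}: +{pct:.1f}% frequency, +{row['perf_gain_pct']}% performance")
```

Under that assumption the boosts amount to roughly 17% and 30% of the base frequency, so the reported performance gains of 11% and 18% are somewhat smaller than the frequency gains themselves.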
17
▪ Identified that the D2D layer is the thermal bottleneck in 3D stacks
▪ Improved vertical conduction through the D2D layer:
– Align and short dummy μbumps with TTSVs
– Generic and custom TTSV placement schemes
▪ Used the resulting thermal headroom to
– Boost processor frequency (400-720 MHz) & performance (11-18%)
▪ Exploited thermal heterogeneity: cores closer to TTSVs conduct heat better
– Conductivity-aware thread placement and migration, and frequency boosting
– Enable further 100-200 MHz improvements
Conclusion