18
Xylem: Enhancing Vertical Thermal Conduction in 3D Processor-Memory Stacks University of Illinois at Urbana Champaign and Nvidia Corporation http://iacoma.cs.uiuc.edu MICRO, October 2017 Aditya Agrawal, Josep Torrellas and Sachin Idgunji

Xylem: Enhancing Vertical Thermal Conduction in 3D Processor-Memory Stacks · 2017. 12. 3. · Processor-Memory Stacks: Reduced interconnect length and power Higher memory bandwidth

  • Upload
    others

  • View
    3

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Xylem: Enhancing Vertical Thermal Conduction in 3D Processor-Memory Stacks · 2017. 12. 3. · Processor-Memory Stacks: Reduced interconnect length and power Higher memory bandwidth

Xylem: Enhancing Vertical Thermal Conduction in 3D

Processor-Memory Stacks

University of Illinois at Urbana Champaign and Nvidia Corporation

http://iacoma.cs.uiuc.edu

MICRO, October 2017

Aditya Agrawal, Josep Torrellas and Sachin Idgunji

Page 2: Xylem: Enhancing Vertical Thermal Conduction in 3D Processor-Memory Stacks · 2017. 12. 3. · Processor-Memory Stacks: Reduced interconnect length and power Higher memory bandwidth

2

Xylem

Image source: http://repasosdeharold.blogspot.com/2015/02/plants-test-review.html

Page 3: Xylem: Enhancing Vertical Thermal Conduction in 3D Processor-Memory Stacks · 2017. 12. 3. · Processor-Memory Stacks: Reduced interconnect length and power Higher memory bandwidth

3

Processor-Memory Stacks:

▪ Reduced interconnect length and power

▪ Higher memory bandwidth

▪ Smaller form factors

▪ Heterogeneous integration

▪ 2.5D processor-memory stacks exist

▪ 3D processor-memory stacking is the future

Major challenge: Thermals

Motivation: Thermal Issues in 3D Stacking

Xylem, MICRO 2017

Page 4: Xylem: Enhancing Vertical Thermal Conduction in 3D Processor-Memory Stacks · 2017. 12. 3. · Processor-Memory Stacks: Reduced interconnect length and power Higher memory bandwidth

4

3D Stacking Technologies

Xylem, MICRO 2017

TSV

TSV

BM1 Backside Metal Layers

Frontside Metal LayersMn

M1

Silicon

Devices

Silicon

D2D Layer

Electricalμbump

Low

er D

ie

(Bac

k)U

pp

er D

ie(F

ace)

2 μ

m2

0 μ

m

2 μ

m

Dummyμbump

TTSV

TTSV

Face-to-back (f2b) die interface (not to scale)

Page 5: Xylem: Enhancing Vertical Thermal Conduction in 3D Processor-Memory Stacks · 2017. 12. 3. · Processor-Memory Stacks: Reduced interconnect length and power Higher memory bandwidth

5

✓ Processor power and I/O signals

do not traverse TSVs

✓ Processor IR drop similar to current

designs

✓ DRAM and processor die floorplans

are independent because TSV

count and location are governed by

stacked DRAM standards

Thermal challenges

Stack Organization: Memory on Top

Xylem, MICRO 2017

Heat Sink

Integrated Heat Spreader (IHS)

Thermal Interface Material (TIM)

Package

Processor SiliconProcessor Frontside Metal (Cu)

DRAM Frontside Metal (Al)

DRAM Silicon

Die to Die (D2D) Layer

Through Silicon Vias (TSVs)

C4 pads

Page 6: Xylem: Enhancing Vertical Thermal Conduction in 3D Processor-Memory Stacks · 2017. 12. 3. · Processor-Memory Stacks: Reduced interconnect length and power Higher memory bandwidth

6

▪ Identify the thermal bottleneck in 3D stacks: Die-to-Die(D2D) layers

▪ Improve vertical conduction through the D2D layer:

– Align and short dummy μbumps with Thermal TSVs

– Generic and custom TTSV placement schemes

▪ Use the resulting thermal headroom:

– Boost processor frequency (400-720 MHz) & performance (11-18%)

▪ Exploit thermal heterogeneity: cores closer to TTSVs conduct heat better

– Conductivity-aware thread placement and migration, and frequency boosting

Contributions

Xylem, MICRO 2017

Page 7: Xylem: Enhancing Vertical Thermal Conduction in 3D Processor-Memory Stacks · 2017. 12. 3. · Processor-Memory Stacks: Reduced interconnect length and power Higher memory bandwidth

7

Thermal Resistance in the Stack

Xylem, MICRO 2017

Layer Rth (mm2-K/W)

Bulk Silicon 0.83

Proc. Metal 1.00

D2D 13.33

▪ Thermal resistance per unit area:

D2D layer is 13-16x more resistive than bulk silicon or metal layers

Page 8: Xylem: Enhancing Vertical Thermal Conduction in 3D Processor-Memory Stacks · 2017. 12. 3. · Processor-Memory Stacks: Reduced interconnect length and power Higher memory bandwidth

8

Shortcomings of Prior Work

Xylem, MICRO 2017

▪ Underestimated the thermal resistance of D2D layer by assuming:

– High conductivity

– Small thickness

▪ Focused on increasing the conductivity of the bulk silicon using TTSVs

▪ Concluded that TTSVs alone are effective

Our approach: Combine TTSVs with a mechanism to reduce D2D resistance

Page 9: Xylem: Enhancing Vertical Thermal Conduction in 3D Processor-Memory Stacks · 2017. 12. 3. · Processor-Memory Stacks: Reduced interconnect length and power Higher memory bandwidth

9

Before Proposed

Propose: Dummy μbump-TTSV Alignment & Shorting

Xylem, MICRO 2017

Silicon

Dummyμbump

TTSV

Silicon

Underfill

TTSV

Up

per

Die

(Fa

ce)

Low

er D

ie (

Bac

k)D

2D

Frontside Metal Layers

BacksideMetalLayers

Silicon

Dummyμbump

TTSV

Silicon

Underfill

TTSV

Up

per

Die

(Fa

ce)

Low

er D

ie (

Bac

k)D

2D

FrontsideMetalLayers

BacksideMetalLayers

Page 10: Xylem: Enhancing Vertical Thermal Conduction in 3D Processor-Memory Stacks · 2017. 12. 3. · Processor-Memory Stacks: Reduced interconnect length and power Higher memory bandwidth

10

TTSVs:

▪ Cannot disrupt regular DRAM arrays: Place in the DRAM peripheral logic

▪ Distribute TTSVs and avoid TTSV farms

▪ Maintain Keep Out Zone (KOZ) around each TTSV

Dummy μbumps:

▪ Anywhere in the D2D layer except the electrical μbump locations

TTSV Placement: Constraints

Xylem, MICRO 2017

Page 11: Xylem: Enhancing Vertical Thermal Conduction in 3D Processor-Memory Stacks · 2017. 12. 3. · Processor-Memory Stacks: Reduced interconnect length and power Higher memory bandwidth

11

DRAM (Wide IO) die floorplan Processor die floorplan

DRAM and Processor Baseline Floorplans

Xylem, MICRO 2017

Bank

TSV Bus

Logic

L2

IL1

DL1

L2

Co

re 1

IL1

DL1

L2

Co

re 2

DL1 IL1

L2

Co

re 3

DL1 IL1

L2

Co

re 4

L2 L2 L2

IL1

DL1

Co

re 5

IL1

DL1

Co

re 6

DL1 IL1

Co

re 7

DL1 IL1

Co

re 8

MemoryControllersCoherent Bus

TSV Bus

Page 12: Xylem: Enhancing Vertical Thermal Conduction in 3D Processor-Memory Stacks · 2017. 12. 3. · Processor-Memory Stacks: Reduced interconnect length and power Higher memory bandwidth

12

Generic (oblivious to hotspots) Custom (aligned with hotspots)

Proposal: TTSV Placement Schemes

Xylem, MICRO 2017

Bank

TTSV TSV Bus

Bank

TTSV TSV Bus

Page 13: Xylem: Enhancing Vertical Thermal Conduction in 3D Processor-Memory Stacks · 2017. 12. 3. · Processor-Memory Stacks: Reduced interconnect length and power Higher memory bandwidth

13

▪ TTSV placement & TTSV-μbump alignment and shorting:

– Increases thermal conduction from the processor die to the heat sink

– Reduces the temperature of the processor die

▪ Proposal: Increase processor frequency to consume the thermal headroom

– Increase application performance

Proposal: Frequency Boosting

Xylem, MICRO 2017

Page 14: Xylem: Enhancing Vertical Thermal Conduction in 3D Processor-Memory Stacks · 2017. 12. 3. · Processor-Memory Stacks: Reduced interconnect length and power Higher memory bandwidth

14

▪ TTSV-μbump alignment and shorting creates high conductivity paths

– Areas closer to TTSVs dissipate heat more easily

– Result is thermal spatial heterogeneity in the stack

▪ Proposal: Three λ-aware optimizations to further improve performance

– λ-aware thread placement

– λ-aware frequency boosting

– λ-aware thread migration

Proposal: Conductivity (λ) Aware Techniques

Xylem, MICRO 2017

Page 15: Xylem: Enhancing Vertical Thermal Conduction in 3D Processor-Memory Stacks · 2017. 12. 3. · Processor-Memory Stacks: Reduced interconnect length and power Higher memory bandwidth

15

▪ 8-core OoO processor die @ 2.4 GHz

▪ 8 high Wide IO memory on top

▪ Processor timing and power: SESC & McPAT

▪ DRAM timing and power: DRAMSim2

▪ Thermal analysis: 3D HotSpot

▪ Applications: SPLASH-2, PARSEC & NAS

Evaluation Setup

Xylem, MICRO 2017

Heat Sink

IHS

TIM

Motherboard

Proc. SiliconProc.Metal

DRAM Metal

DRAM Silicon

D2D Layer

TSVs

C4 pads

Memory-on-top configuration

Page 16: Xylem: Enhancing Vertical Thermal Conduction in 3D Processor-Memory Stacks · 2017. 12. 3. · Processor-Memory Stacks: Reduced interconnect length and power Higher memory bandwidth

16

λ-aware techniques enable further 100-200 MHz improvements

Result Summary

Xylem, MICRO 2017

TTSV Placement Generic Custom

Area Overhead 0.63% 0.81%

Proc. Temp. Reduction 5.0 oC 8.4 oC

Avg. Frequency Boost 400 MHz 720 MHz

Avg. Performance Gain 11% 18%

Page 17: Xylem: Enhancing Vertical Thermal Conduction in 3D Processor-Memory Stacks · 2017. 12. 3. · Processor-Memory Stacks: Reduced interconnect length and power Higher memory bandwidth

17

▪ Identified that D2D layer is the thermal bottleneck in 3D stacks

▪ Improved vertical conduction through the D2D layer:

– Align and short dummy μbumps with TTSVs

– Generic and custom TTSV placement schemes

▪ Used the resulting thermal headroom to

– Boost processor frequency (400-720 MHz) & performance (11-18%)

▪ Exploited thermal heterogeneity: cores closer to TTSVs conduct heat better

– Conductivity-aware thread placement and migration, and frequency boosting

– Enable further 100-200 MHz improvements

Conclusion

Xylem, MICRO 2017

Page 18: Xylem: Enhancing Vertical Thermal Conduction in 3D Processor-Memory Stacks · 2017. 12. 3. · Processor-Memory Stacks: Reduced interconnect length and power Higher memory bandwidth

18

Xylem: Enhancing Vertical Thermal Conduction in 3D

Processor-Memory Stacks

Aditya Agrawal, Josep Torrellas and Sachin Idgunji

University of Illinois at Urbana Champaign and Nvidia Corporation

http://iacoma.cs.uiuc.edu