Computational Sprinting on a Hardware/Software Testbed

Arun Raghavan*, Laurel Emurian*, Lei Shao#, Marios Papaefthymiou+, Kevin P. Pipe+#, Thomas F. Wenisch+, Milo M. K. Martin*

University of Pennsylvania, Computer and Information Science*

University of Michigan, Electrical Eng. and Computer Science+

University of Michigan, Mechanical Engineering#

This work is licensed under the Creative Commons Attribution-Share Alike 3.0 United States License.

• You are free:
  • to Share: to copy, distribute, display, and perform the work
  • to Remix: to make derivative works
• Under the following conditions:
  • Attribution. You must attribute the work in the manner specified by the author or licensor (but not in any way that suggests that they endorse you or your use of the work).
  • Share Alike. If you alter, transform, or build upon this work, you may distribute the resulting work only under the same, similar or a compatible license.
• For any reuse or distribution, you must make clear to others the license terms of this work. The best way to do this is with a link to: http://creativecommons.org/licenses/by-sa/3.0/us/
• Any of the above conditions can be waived if you get permission from the copyright holder.
• Apart from the remix rights granted under this license, nothing in this license impairs or restricts the author's moral rights.


Overview

• Computational sprinting [HPCA’12]
  • Targets responsiveness in thermally constrained environments
  • Far exceed sustainable power for short bursts of computation
  • Simulation-based feasibility study
• This work: what can we learn with today’s hardware?
  • Engineer hardware/software testbed for sprinting
    • Reduce heat venting capacity
    • Sustain only lowest power mode
  • Can sprint on today’s system
    • Longer with phase-change material
  • Sprinting improves energy efficiency
    • Even for sustained computations

Computational Sprinting Using Dark Silicon [HPCA’12]

[Figure: power and temperature versus time, with the temperature limit Tmax marked]

Effect of thermal capacitance

• State of the art: Turbo Boost 2.0 exceeds sustainable power with DVFS (~25%)
• Our goal: 10x

Evaluating Sprinting

• Simulation-based feasibility study [HPCA’12]
  • Thermal models: buffer heat using thermal capacitance
  • Electrical models: stabilize voltage with gradual core activation
  • Architectural models:
    • Large responsiveness improvements
    • Little dynamic energy overhead
• Next steps: understanding sprinting on a real system
  • Build a real chip?
  • Sprint on today’s mobile chips?

Our approach: study sprinting on hardware available today

This Work: Testbed for Computational Sprinting

• How long can the testbed sprint?
• How to select sprint intensity?
• How can we extend sprint duration?
• How does sprinting impact energy?

Designing a testbed for sprinting

• Quad-core Intel i7-2600
  • With heatsink and fan: 95 W
  • Remove heatsink, slow fan: 10 W thermal design power (TDP)

3 operating modes:

Mode         Cores    Freq.    Power  Normalized power  Peak speedup
sustainable  1 core   1.6 GHz  10 W   1x                1x
sprinting    4 cores  1.6 GHz  20 W   ~2x               4x
sprinting    4 cores  3.2 GHz  50 W   ~5x               8x
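
The three operating modes trade power for speed at different rates. As a rough illustration, the sketch below divides each mode's power by its peak speedup to get energy per unit of work relative to the 1-core baseline. It assumes the mode sustains its peak speedup for the whole task and ignores idle energy, so it is an optimistic bound rather than a measurement. The names `MODES` and `relative_energy_per_work` are hypothetical and not part of the testbed software.

```python
# Operating modes as listed on the slide: (label, power in watts, peak speedup).
MODES = [
    ("1 core, 1.6 GHz", 10.0, 1.0),
    ("4 cores, 1.6 GHz", 20.0, 4.0),
    ("4 cores, 3.2 GHz", 50.0, 8.0),
]


def relative_energy_per_work(power_w, speedup, base_power_w=10.0):
    """Energy per unit of work relative to the 1-core baseline.

    Assumption: the mode runs at its peak speedup for the whole task and
    idle energy is ignored.
    """
    return (power_w / speedup) / base_power_w


for label, power_w, speedup in MODES:
    ratio = relative_energy_per_work(power_w, speedup)
    print(f"{label}: {power_w / 10.0:.0f}x power, {ratio:.3f}x energy per unit of work")
```

Under these assumptions both sprint modes use less energy per unit of work than the baseline, with the cores-only mode the most efficient of the three.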

Sprinting Performance

[Figure: normalized speedup (0 to 8) over the no-sprint baseline for sobel, disparity, segment, kmeans, feature, and texture, each at 1.6 GHz and 3.2 GHz, with the maximum for 4 cores at 1.6 GHz and for 4 cores at 3.2 GHz marked]

• Cores + frequency (3.2 GHz): 6.3x speedup
• Cores only (1.6 GHz): 3.5x speedup

How long can the testbed sprint?

Testbed Thermal Response

[Figure: power (W) and temperature (°C) versus time (s) for sustained, sprint (3.2 GHz), and sprint (1.6 GHz) operation, with Tmax marked]

• Sprint (3.2 GHz): 5x power, 3 s
• Sprint (1.6 GHz): 2x power, 21 s
• Heat spreader: 20 g copper, Δ 25 °C, ~188 J, 3.5 s @ 50 W
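
The heat-spreader figures (20 g of copper, a 25 °C rise, about 188 J, a few seconds at 50 W) follow from a lumped heat-capacity estimate. The sketch below is a back-of-the-envelope model, not the authors' thermal model: the specific heat of copper is a textbook value, the function names are hypothetical, and `vented_power_w` is an assumed knob for heat that escapes during the sprint.

```python
COPPER_SPECIFIC_HEAT_J_PER_G_K = 0.385  # textbook value near room temperature


def sensible_heat_j(mass_g, specific_heat_j_per_g_k, delta_t_k):
    """Heat absorbed by a mass warming by delta_t_k (Q = m * c * dT)."""
    return mass_g * specific_heat_j_per_g_k * delta_t_k


def sprint_seconds(heat_budget_j, sprint_power_w, vented_power_w=0.0):
    """Time until the heat budget is used up at constant power.

    Assumption: a single lumped thermal mass; heat leaves at a constant
    rate vented_power_w (0 means no heat escapes during the sprint).
    """
    net_power_w = sprint_power_w - vented_power_w
    if net_power_w <= 0:
        return float("inf")  # the sprint is sustainable indefinitely
    return heat_budget_j / net_power_w


budget_j = sensible_heat_j(20.0, COPPER_SPECIFIC_HEAT_J_PER_G_K, 25.0)
print(f"estimated heat budget: {budget_j:.1f} J")
print(f"50 W sprint on a 188 J budget: {sprint_seconds(188.0, 50.0):.2f} s")
```

The estimate lands near 190 J and under 4 s, the same order as the ~188 J and 3.5 s quoted on the slide; a lumped model like this cannot reproduce the measured curves exactly.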

What if computation doesn’t complete during the sprint?

Truncated Sprint Performance

[Figure: normalized speedup (0 to 8) versus computation length for 4 cores at 1.6 GHz, 4 cores at 3.2 GHz, and adaptive sprint pacing]

• 4 cores, 3.2 GHz: near-peak performance for short tasks; little benefit for long tasks
• 4 cores, 1.6 GHz: lower peak performance; benefits longer tasks
• Best sprint intensity depends on task size
• How to sprint when task size is unknown?
• Sprint pacing
  • Max intensity sprint for half thermal capacitance
  • Cores-only sprinting for other half
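
A small constant-rate model can illustrate why the best sprint intensity depends on task size and how pacing splits the difference. The sketch below is illustrative only: the speedups (6.3x, 3.5x) and sprint durations (3 s, 21 s) are taken from the earlier slides, "half the thermal capacitance" is approximated as half of each mode's sprint duration, and work after the sprint runs at the 1x baseline. All names are hypothetical.

```python
def completion_time(work, phases):
    """Seconds to finish `work` (measured in baseline-seconds).

    phases: list of (speedup, max_seconds) sprint phases run in order;
    any work left after the last phase runs at the 1x sustainable rate.
    """
    elapsed = 0.0
    remaining = work
    for speedup, max_seconds in phases:
        capacity = speedup * max_seconds  # work this phase can absorb
        if remaining <= capacity:
            return elapsed + remaining / speedup
        elapsed += max_seconds
        remaining -= capacity
    return elapsed + remaining


def speedup_over_baseline(work, phases):
    return work / completion_time(work, phases)


# Assumed parameters, read off the slides.
MAX_INTENSITY = [(6.3, 3.0)]           # 4 cores, 3.2 GHz for 3 s
CORES_ONLY = [(3.5, 21.0)]             # 4 cores, 1.6 GHz for 21 s
PACED = [(6.3, 1.5), (3.5, 10.5)]      # half the budget in each mode

for work in (5.0, 60.0, 1000.0):
    row = ", ".join(
        f"{name}={speedup_over_baseline(work, phases):.2f}x"
        for name, phases in (("max", MAX_INTENSITY),
                             ("cores-only", CORES_ONLY),
                             ("paced", PACED))
    )
    print(f"work={work:g}: {row}")
```

In this model short tasks favor the maximum-intensity sprint, mid-sized tasks favor cores-only sprinting, and the paced schedule matches the former on short tasks while staying well ahead of it on mid-sized ones.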

Increasing sprint duration


Two Ways of Adding Thermal Capacitance

• Specific heat capacity: introduce thermal mass
  • More specific heat takes longer to heat
  • 20 g copper, Δ 25 °C: ~188 J
• Latent heat: absorb heat to change phase (e.g. melting)
  • Phase change absorbs heat while holding temperature constant
  • 1 g of wax: ~200 J

[Figure: temperature versus time (s) for baseline and sprinting, one panel per mechanism]
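
The two figures quoted here (20 g of copper storing about 188 J over a 25 °C rise, and 1 g of wax absorbing about 200 J) can be compared per gram. The sketch below uses only those two slide values; the constant and function names are hypothetical, and the result is an order-of-magnitude comparison rather than a material datasheet.

```python
COPPER_HEAT_J = 188.0    # slide: 20 g copper warmed by 25 °C
COPPER_MASS_G = 20.0
WAX_HEAT_J_PER_G = 200.0  # slide: ~200 J absorbed per gram of wax


def copper_equivalent_g(wax_g):
    """Grams of copper (over the same 25 °C rise) storing as much heat as wax_g of wax."""
    copper_j_per_g = COPPER_HEAT_J / COPPER_MASS_G
    return wax_g * WAX_HEAT_J_PER_G / copper_j_per_g


print(f"copper stores {COPPER_HEAT_J / COPPER_MASS_G:.1f} J/g over 25 °C")
print(f"1 g of wax is worth about {copper_equivalent_g(1.0):.0f} g of copper")
```

By this estimate a gram of wax buffers roughly as much heat as twenty grams of copper, which is why a few grams of phase-change material can matter in a mass-constrained device.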

Extending Sprints with Phase Change Material

• 4 g of wax, 1 g of aluminum foam
• Copper shim

Impact of Phase Change

[Figure: temperature (°C, 40 to 80) versus time (s, 0 to 400) for air, foam, water, and wax]

• Small extension from heat capacity of encasement
• 6x increase in sprint duration with wax, due to phase change

Phase Change Material in Action

[Video: time lapse at 15x speed]

How does sprinting impact energy?


Energy Impact of Sprinting

[Figure: normalized energy (0 to 1.5) for sobel, disparity, segment, kmeans, feature, and texture, each at 1.6 GHz and 3.2 GHz, split into sprint (3.2 GHz), sprint (1.6 GHz), and idle components]

• Race-to-idle: 7% energy savings!
• If sprinting is more energy efficient, why not sprint all the time…

Sprint-and-Rest

[Figure: power (W), temperature (°C), and cumulative work versus time (0 to 500 s) for sustained and sprint-and-rest operation]

• Sprint-and-rest: 35% faster
• 5 s @ 20 W sprint, 12 s @ 5 W rest: < 10 W average
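
The claim that a 5 s sprint at 20 W followed by a 12 s rest at 5 W stays under the 10 W budget is a time-weighted average. A minimal check, with a hypothetical helper name:

```python
def average_power_w(phases):
    """Time-weighted average power for a repeating cycle.

    phases: sequence of (seconds, watts) pairs making up one cycle.
    """
    phases = list(phases)
    total_energy_j = sum(seconds * watts for seconds, watts in phases)
    total_seconds = sum(seconds for seconds, _ in phases)
    return total_energy_j / total_seconds


SPRINT_AND_REST = [(5.0, 20.0), (12.0, 5.0)]  # values from the slide
avg_w = average_power_w(SPRINT_AND_REST)
print(f"average power: {avg_w:.2f} W (budget: 10 W)")
```

One cycle dissipates 160 J over 17 s, so the average sits below the 10 W sustainable limit.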

Conclusions

• Testbed confirms sprinting improves responsiveness

• Sprint pacing can extend benefits of sprinting

• Exploiting phase change allows longer sprints

• Sprinting can save energy in thermally limited systems


www.cis.upenn.edu/acg/sprinting/