The implications of energetic and thermal constraints on current and future processors. Pierre Michaud, June 2008


The implications of energetic and thermal constraints on current and future processors

Pierre Michaud

June 2008


Outline

1. Why temperature, power and energy must be limited

2. Some basics

3. Why power consumption became a problem

4. How the power problem has been tackled

5. The temperature problem

6. What future processors may look like


Why temperature, power and energy must be limited


Processing consumes energy

• When a processor executes a program it consumes some energy. This energy is transformed into internal energy.

• Temperature is a measure of the average kinetic energy associated with the disordered microscopic motion of particles

• A processor consuming some energy increases its own temperature and that of its environment


Temperature must be limited

• We don’t want the processor to burn

• A 10 ºC temperature increase halves the processor lifetime
  – Several aging phenomena are at work, each exponential with temperature
  – Example: electromigration

• Circuits get slower when temperature is higher

• ➔ Maximum temperature between 80 ºC and 100 ºC


Power consumption must be limited

• Power = energy per unit of time

• Electric power = voltage × current

• Power is limited by the power supply and by the maximum current

• A high sustained power generates a high temperature ➔ limiting power is a way to limit temperature


Energy consumption must be limited

• Some processors are battery-powered
  – Laptop computers, hand-held devices
  – A battery stores a finite amount of energy
  – If you spend less energy to do a given amount of work, you can do more work before you need to recharge the battery

• Energy costs money
  – It will cost more and more (Google cares!)
  – There is also an environmental cost

• Consuming less energy decreases power and temperature


Example: data center

• For each watt dissipated in the room, one extra watt must be consumed for the air conditioning

• Assume machines dissipate 100 kW, of which 30% (= 30 kW) comes from CPUs

• + 100 kW for cooling ➔ 200 kW

• If we halve the power consumed by each CPU, we can …
  – Decrease the electric bill (15 + 15 = 30 kW saved)
  – Or put more CPUs in the room
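The arithmetic of this example can be checked with a short sketch. The function name `total_power_kw` and the `cooling_overhead` parameter are illustrative names, not part of the slides; the only inputs are the figures quoted above (100 kW of machines, 30% from CPUs, one watt of cooling per watt dissipated).

```python
def total_power_kw(machine_power_kw, cooling_overhead=1.0):
    """Total electric power: machines plus air conditioning.

    cooling_overhead is the number of extra watts spent on cooling
    for each watt dissipated in the room (1.0 in the slide's example).
    """
    return machine_power_kw * (1.0 + cooling_overhead)


machines_kw = 100.0                      # power dissipated by the machines
cpu_kw = machines_kw * 0.30              # share coming from CPUs

before_kw = total_power_kw(machines_kw)
# Halving CPU power removes cpu_kw / 2 from the room's dissipation.
after_kw = total_power_kw(machines_kw - cpu_kw / 2)
saved_kw = before_kw - after_kw

print(f"before: {before_kw:.0f} kW, after: {after_kw:.0f} kW, saved: {saved_kw:.0f} kW")
```

Half of the saving comes from the CPUs themselves and half from the cooling that no longer has to remove their heat.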


Some basics


MOSFET (Metal-Oxide-Semiconductor Field-Effect Transistor)

[Figure: MOSFET structure with labels source, drain, gate, substrate and dielectric; dimensions $L$, $W$ and $t_{ox}$]


Switching energy

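[Figure: circuit charging a capacitance $C$ from the supply $V_{dd}$, with labels $i$, $v$ and 0]

Joule heating:

$\int_0^{\infty} (V_{dd} - v)\, i\, dt = C \int_0^{V_{dd}} (V_{dd} - v)\, dv = \frac{1}{2} C V_{dd}^2$

Energy is consumed when a gate output voltage switches from low to high or from high to low

The switching energy depends on capacitance and supply voltage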
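As a numerical illustration of the integral above, the sketch below charges a capacitor through a resistor and accumulates the heat dissipated in the resistor. The resistor, the component values (1 fF, 1 V, 1 kΩ and 10 kΩ) and the function name are assumptions made for the example; the slides only state the result $\frac{1}{2} C V_{dd}^2$.

```python
def joule_heating_during_charge(c, vdd, r, steps=20_000):
    """Integrate (Vdd - v) * i * dt while a capacitor c charges through r.

    Forward-Euler integration over 10 time constants (hypothetical
    simulation set-up, chosen only to illustrate the formula).
    """
    dt = 10 * r * c / steps
    v = 0.0
    heat = 0.0
    for _ in range(steps):
        i = (vdd - v) / r            # current through the resistor
        heat += (vdd - v) * i * dt   # energy dissipated during this step
        v += i * dt / c              # capacitor voltage rises
    return heat


C, VDD = 1e-15, 1.0                  # assumed values: 1 fF, 1 V
expected = 0.5 * C * VDD ** 2
for R in (1e3, 1e4):                 # assumed resistances
    heat = joule_heating_during_charge(C, VDD, R)
    print(f"R = {R:.0e} ohm: heat = {heat:.4e} J, C*Vdd^2/2 = {expected:.4e} J")
```

The dissipated energy does not depend on the resistance, which is why only capacitance and supply voltage appear in the switching energy.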


Dynamic power

• Dynamic power = switching energy consumed per second

$P_d = C_e \times F \times V_{dd}^2$

where $F$ is the clock frequency and $C_e$ the equivalent capacitance

• Equivalent capacitance takes into account contributions from all switching gates on the chip


Gate delay

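Approximate the transistor as a current source

[Figure: capacitance $C$ charged from 0 to $V_{dd}$, output voltage $v$]

$t \propto \frac{C \times V_{dd}}{I_{dsat}}$

Shockley model:

$I_{dsat} \propto \mu \, \frac{\varepsilon_{ox}}{t_{ox}} \times \frac{W}{L} \times \frac{(V_{dd} - V_t)^2}{2}$

• $t_{ox}$: gate dielectric thickness
• $L$: channel length
• $W$: channel width
• $V_t$: threshold voltage
• $\varepsilon_{ox}$: gate dielectric permittivity
• $\mu$: carrier mobility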


Why power consumption became a problem


Moore’s law

• The number of transistors on a processor chip doubles every 2 years


Classical scaling rules

• Dennard, Gaensslen, Yu, Rideout, Bassous, Leblanc, “Design of ion-implanted MOSFET’s with very small physical dimensions”, IEEE Journal of Solid-State Circuits, Oct. 1974
  – On each technology generation, divide all transistor and wire dimensions by $\sqrt{2}$
  – Divide all voltages by $\sqrt{2}$

• Dimension scaling ➔ all parasitic capacitances (transistors & wires) are divided by $\sqrt{2}$

$\text{parallel plate capacitance} = \frac{\text{dielectric permittivity} \times \text{plate area}}{\text{dielectric thickness}}$


Classical scaling: impact on delay

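$t \propto \frac{2}{\mu} \times \left(\frac{L}{W}\right) \times \left(\frac{t_{ox}}{\varepsilon_{ox}}\right) \times C \times \frac{V_{dd}}{(V_{dd} - V_t)^2}$

Scaling per generation: $L$, $W$, $t_{ox}$, $C$ and the voltages are each multiplied by $\frac{1}{\sqrt{2}}$, so the delay $t$ is also multiplied by $\frac{1}{\sqrt{2}}$

Under classical scaling, clock frequency (inverse of delay) can be multiplied by $\sqrt{2} \approx 1.4$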
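The frequency gain can be checked numerically from the gate delay and Shockley formulas of the "Gate delay" slide. In the sketch below the baseline values (all set to 1, with a threshold voltage of 0.25) are arbitrary placeholders in arbitrary units, since only the ratio matters; the function and variable names are illustrative.

```python
import math


def gate_delay(mu, eps_ox, t_ox, w, l, c, vdd, vt):
    """Gate delay t = C * Vdd / Idsat, with Idsat from the Shockley model."""
    i_dsat = mu * (eps_ox / t_ox) * (w / l) * (vdd - vt) ** 2 / 2
    return c * vdd / i_dsat


# Arbitrary baseline in arbitrary units (assumption for illustration).
base = dict(mu=1.0, eps_ox=1.0, t_ox=1.0, w=1.0, l=1.0, c=1.0, vdd=1.0, vt=0.25)

# Classical scaling: dimensions, capacitance and voltages divided by sqrt(2).
scaled = dict(base)
for name in ("t_ox", "w", "l", "c", "vdd", "vt"):
    scaled[name] /= math.sqrt(2)

delay_ratio = gate_delay(**scaled) / gate_delay(**base)
frequency_gain = 1 / delay_ratio
print(f"delay x {delay_ratio:.3f}, frequency x {frequency_gain:.3f}")
```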


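Classical scaling: impact on power

$P_d = C_e \times F \times V_{dd}^2 \propto \left(\frac{C_e}{C}\right) \times \left(\frac{W}{L}\right) \times \left(\frac{\varepsilon_{ox}}{t_{ox}}\right) \times V_{dd} (V_{dd} - V_t)^2$

Scaling per generation: $\frac{\varepsilon_{ox}}{t_{ox}}$ is multiplied by $\sqrt{2}$ and each of the three voltage factors by $\frac{1}{\sqrt{2}}$, so the dynamic power of a given circuit is halved

• Before 1990, supply voltage was kept constant (5 V)
• In the 1990s, power became a concern and voltage was scaled
• Since 2000, voltage keeps decreasing, but more slowly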
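A minimal sketch of the power side of classical scaling, using the proportionality $P_d \propto \frac{C_e}{C} \times \frac{W}{L} \times \frac{\varepsilon_{ox}}{t_{ox}} \times V_{dd}(V_{dd} - V_t)^2$. The function name and the `scale_voltage` switch are illustrative; the switch contrasts full classical scaling with the constant-voltage regime mentioned in the bullets.

```python
import math


def dynamic_power_factor(k, scale_voltage=True):
    """Factor applied to P_d of a fixed circuit when dimensions shrink by k.

    C_e/C and W/L are ratios of quantities that scale together, so they
    stay unchanged; eps_ox/t_ox grows by k; Vdd * (Vdd - Vt)^2 shrinks by
    k^3 when the voltages are also divided by k.
    """
    eps_over_tox = k
    voltage_term = (1 / k) ** 3 if scale_voltage else 1.0
    return eps_over_tox * voltage_term


k = math.sqrt(2)
area_factor = 1 / k ** 2                                 # circuit area halves

scaled_v = dynamic_power_factor(k)                       # classical scaling
constant_v = dynamic_power_factor(k, scale_voltage=False)

print(f"voltage scaled:   power x {scaled_v:.2f}, density x {scaled_v / area_factor:.2f}")
print(f"voltage constant: power x {constant_v:.2f}, density x {constant_v / area_factor:.2f}")
```

With voltage scaling the power density stays constant; with a constant supply voltage it grows on every generation, which is why voltage had to be scaled once power became a concern.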


Leakage currents

• Subthreshold leakage current between drain and source when the gate-to-source voltage is below the threshold voltage

• Gate leakage current due to tunneling of electrons through the dielectric layer

• Classical scaling requires that the threshold voltage be decreased when the supply voltage is decreased
  – But this increases subthreshold leakage
  – ➔ Difficult to scale supply voltage further

• Classical scaling requires that the gate dielectric thickness be decreased
  – But this increases gate leakage
  – ➔ Use gate dielectric with higher permittivity (high-K)

[Figure: transistor cross-section with labels drain, source, gate and dielectric]


Static power

• Leakage currents ➔ static power consumption Ps

$\text{total power} = P_d + P_s = C_e \times F \times V_{dd}^2 + V_{dd} \times I$

where $I$ is the total leakage current

• We spend energy doing no work ☹

• Subthreshold leakage increases with temperature !


Microarchitects are guilty !

• Extra transistors have been used to increase processor performance, at the cost of more complexity and less energy efficiency
  – Superscalar, out-of-order, 64-bit operations, floating-point, SIMD, multicore, etc.

• Until 2000, clock frequency increased not only because of faster transistors but also because of pipelining

$P_d = C_e \times F \times V_{dd}^2$

At constant chip area, $C_e$ grows with each generation; with pipelining, $F$ grows faster than transistor speed alone


Core complexity across generations

R. Kumar, D. Tullsen, N. Jouppi, P. Ranganathan, “Heterogeneous Chip Multiprocessors”, IEEE Computer, Nov. 2005

Alpha 21064 (EV4) to Alpha 21464 (EV8)


What happened ?

• In 2000, we thought that by 2008 we would have processors clocked at 10 GHz ➔ this did not happen

• In 2004, the Intel Tejas microprocessor was cancelled ➔ too much power, too much heat
  – It was very difficult to keep pushing the complexity of superscalar processors and the clock frequency together

• But Moore’s law continues, so what do we do with the extra transistors?
• ➔ big caches & multiple “simple” cores on the chip

• Power consumption has become a first-class constraint


How the power problem has been tackled


Energy-efficiency needed everywhere

• Technology
  – High-threshold-voltage transistors where speed is not critical (e.g., caches)
  – High-K gate dielectric
  – …

• Circuit
  – Sometimes, sacrificing a little speed saves significant energy
  – Fine-grained clock gating ➔ don’t clock a flip-flop unless it has valid data on its input
  – …

• Microarchitecture
  – Find a good balance between complexity and performance
  – Disconnect parts that are not used
    • Example: if a program does not perform floating-point computations, turn the FP units off


Voltage / frequency

• Circuit can be clocked at a frequency proportional to supply voltage
  – As long as supply voltage is not too close to transistor threshold voltage

• Example
  – Assume 75% of power is dynamic and 25% static
  – Multiply voltage and frequency simultaneously by 0.8

$P_d + P_s = C_e \times F \times V_{dd}^2 + V_{dd} \times I$

with $F \times 0.8$, $V_{dd}^2 \times (0.8)^2$ and $V_{dd} \times 0.8$

$0.75 \times 0.8^3 + 0.25 \times 0.8 \approx 0.6$ ➔ we get a 40% decrease of power if we decrease frequency by 20%
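The example can be reproduced with a small helper. The name `relative_power` and its parameters are illustrative; like the slide, the sketch assumes that the leakage current $I$ itself does not change when the voltage is lowered.

```python
def relative_power(scale, dynamic_share=0.75):
    """Power after scaling V and F by `scale`, relative to the original power.

    Dynamic power follows F * V^2 (scale cubed); static power follows V
    (scale to the first power), with the leakage current held constant.
    """
    static_share = 1.0 - dynamic_share
    return dynamic_share * scale ** 3 + static_share * scale


p = relative_power(0.8)
print(f"relative power at 0.8x voltage and frequency: {p:.3f}")
```

The exact value is 0.584, which the slide rounds to 0.6. A fully static budget (`dynamic_share=0.0`) would only gain the 20% that comes from the lower voltage.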


Parallelism is power efficient

[Figure: 1 processor at frequency F and voltage V, versus 2 processors at frequency F/2 and voltage V/2]

• Parallelism makes it possible to get the same performance while consuming less power

• A multicore processor makes it possible to obtain more performance with the same power consumption
  – provided the application is parallel
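A rough sketch of why this works, using only the dynamic power formula $P_d = C_e \times F \times V^2$. It assumes a perfectly parallel application, voltage scaled in proportion to frequency as in the figure, and no static power; the function name is illustrative.

```python
def parallel_dynamic_power(n):
    """Dynamic power of n processors at F/n and V/n, relative to 1 processor at F, V.

    Each processor consumes (1/n) * (1/n)^2 of the original dynamic power,
    and there are n of them, so the total is 1/n^2.
    """
    per_processor = (1 / n) * (1 / n) ** 2
    return n * per_processor


for n in (1, 2, 4):
    print(f"{n} processor(s): relative dynamic power = {parallel_dynamic_power(n):.4f}")
```

Under these idealized assumptions, two processors deliver the same aggregate throughput (2 × F/2) for a quarter of the dynamic power. In practice the voltage cannot follow the frequency all the way down, because it must stay above the threshold voltage.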


The temperature problem

It is related to the power problem, but is not strictly equivalent to it


Processor heat sink

Put on top of the processor chip

An air-blowing fan takes heat away from the chip


Cooling a laptop

[Figure: laptop cooling system, with labels heat pipe, heat sink and CPU]


Fourier’s law of heat conduction

$\vec{q} = -K \times \vec{\nabla} T$

• $\vec{q}$: heat flux, in W/m²
• $K$: thermal conductivity, in W/mK
• $\vec{\nabla} T$: temperature gradient, in K/m

Heat flows from high-temperature regions to low-temperature ones at a rate proportional to the temperature difference

Thermal conductivity:
• silicon: 100-150 W/mK
• aluminum: 240 W/mK
• copper: 400 W/mK


Thermal resistance

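[Figure: bar of length $L$ and section $S$ with a thermally insulated side; end temperatures $T_1$ and $T_2$]

$\text{power } P = Q \times S$

$\text{Fourier's law: } Q = \frac{P}{S} = K \, \frac{T_1 - T_2}{L}$

$\text{Thermal resistance: } R = \frac{T_1 - T_2}{P} = \frac{L}{K \times S}$ (in kelvin per watt)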


Convection cooling

[Figure: solid at temperature $T$ in an ambient fluid in motion at temperature $T_0$]

Newton’s law of cooling:

$Q = H (T - T_0)$

• $Q$: heat flux, in W/m²
• $H$: heat transfer coefficient, in W/m²K

Forced convection: the heat transfer coefficient increases with the fluid velocity

$\text{Thermal resistance: } R = \frac{T - T_0}{P} = \frac{1}{H \times S}$ (in kelvin per watt), where $S$ is the area in contact with the fluid


Example

Primary heat path, from the circuit to the ambient air (die area 150 mm²):

• Transistors & wires: 5 µm
• Silicon die: 500 µm, $R_{si}$ = 0.02 K/W
• Interface material: 50 µm, 3.33 W/mK, $R_{im}$ = 0.1 K/W
• Heat sink: $R_{hs}$ = 0.3 K/W

$T_{circuit} - T_{air} = P \times (R_{si} + R_{im} + R_{hs})$

Each watt dissipated contributes a 0.42 ºC temperature increase
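The resistances in this example follow from the conduction formula $R = \frac{L}{K \times S}$. The sketch below recomputes them; the silicon conductivity of 150 W/mK (upper end of the range quoted earlier) and the 100 W power figure are assumptions for illustration, and the heat sink resistance is taken directly from the slide.

```python
def conduction_resistance(thickness_m, conductivity_w_per_mk, area_m2):
    """Thermal resistance of a slab, R = L / (K * S), in kelvin per watt."""
    return thickness_m / (conductivity_w_per_mk * area_m2)


area = 150e-6                                           # 150 mm^2 in m^2

r_si = conduction_resistance(500e-6, 150.0, area)       # assumed K = 150 W/mK
r_im = conduction_resistance(50e-6, 3.33, area)         # interface material
r_hs = 0.3                                              # heat sink, from the slide

print(f"R_si = {r_si:.3f} K/W, R_im = {r_im:.3f} K/W")

# Temperature rise using the slide's rounded values and an assumed 100 W chip.
power_w = 100.0
rise = power_w * (0.02 + 0.1 + 0.3)
print(f"temperature rise at {power_w:.0f} W: {rise:.0f} K above ambient air")
```

The heat sink dominates the total resistance, so most of the temperature difference between circuit and air appears across it.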


Temperature is not uniform

J.D. Warnock et al., “The circuit and physical design of the POWER4 microprocessor”, IBM Journal of Research & Development, Jan. 2002.


Point source

[Figure: temperature (relative to ambient) as a function of the distance from a point source dissipating 1 watt]

For multiple sources, add the contributions from each source


Impact of miniaturization on temperature

[Figure: chip with linear dimensions multiplied by $\frac{1}{\sqrt{2}}$]

If power remains constant, temperature increases


Power must be decreased !

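[Figure: chip dissipating power $P$, shrunk by $\frac{1}{\sqrt{2}}$ in each dimension into a chip dissipating power $P'$]

• $P' = P$ ➔ temperature increases
• $P' = \frac{P}{2}$ ➔ same power density (W/m²) ➔ temperature decreases
• $P' = \frac{P}{\sqrt{2}}$ ➔ temperature roughly the same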


From single to dual-core

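[Figure: after the shrink, use the freed area for a second core]

One core before the shrink: $P_d = C_e \times F \times V^2$

Each core after the shrink: $P_d' = \frac{C_e}{\sqrt{2}} \times F' \times V'^2$

Same total power ➔ $P_d' = \frac{P_d}{2} \Rightarrow F' V'^2 = \frac{F V^2}{\sqrt{2}}$

If we want $F' > F$, we must have $V' < \frac{V}{\sqrt[4]{2}} \approx 0.84 \times V$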
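The voltage bound can be computed for any target frequency from the constraint $F' V'^2 = \frac{F V^2}{\sqrt{2}}$. The function below is an illustrative helper, not something defined in the slides.

```python
import math


def max_voltage_ratio(frequency_ratio=1.0):
    """Largest V'/V that keeps total dual-core power equal to the single-core power.

    From F' * V'^2 = F * V^2 / sqrt(2):
        (V'/V)^2 = 1 / (sqrt(2) * F'/F)
    """
    return math.sqrt(1.0 / (math.sqrt(2.0) * frequency_ratio))


print(f"same frequency:    V'/V <= {max_voltage_ratio(1.0):.3f}")
print(f"10% higher clock:  V'/V <= {max_voltage_ratio(1.1):.3f}")
```

At equal frequency the bound is $2^{-1/4} \approx 0.84$; any frequency increase on top of that requires a further voltage reduction.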


“Dual-core” with a single valid core

• Yield issue: only a fraction of the chips on a wafer will eventually be sold
  – Other chips have defects

• Valid chips have either 1 or 2 valid cores

• Chips with a single valid core can use a higher voltage and frequency
  – (ignoring commercial considerations …)

We are limited by temperature ➔ $P_d' = \frac{P_d}{\sqrt{2}} \Rightarrow F' V'^2 = F V^2$, which is $\sqrt{2}$ times higher than for the chip with 2 valid cores


Dynamic voltage / frequency scaling

• Processors can vary voltage and frequency dynamically (DVFS)
  – Keep frequency proportional to voltage: $V = \alpha F$

• The operating system sets frequency and voltage depending on the situation
  – Thermal sensor indicates that temperature is too high ➔ decrease V & F
  – System activity is low ➔ decrease V & F to save energy

• When a single core is used, put the inactive core in low-power mode and increase V & F of the active core to boost performance
  – Intel Penryn processor ➔ 10% frequency boost when 2nd core inactive
    • Intel “Dynamic Acceleration Technology”


DVFS in multicores

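2N cores active ➔ $F_{2N}, V_{2N}$

N cores active ➔ $F_N, V_N$

$\frac{F_N V_N^2}{F_{2N} V_{2N}^2} = \sqrt{2}$ ➔ $\frac{\alpha^2 F_N^3}{\alpha^2 F_{2N}^3} = \sqrt{2}$ ➔ $\frac{F_N}{F_{2N}} = 2^{1/6}$

• $\frac{F_1}{F_2} = 2^{1/6} \approx 1.12$ (Intel Penryn)
• $\frac{F_1}{F_4} = 2^{1/3} \approx 1.26$ (Intel Nehalem ?)
• $\frac{F_1}{F_8} = 2^{1/2} \approx 1.41$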


Activity migration

• When using a single core at a time, migrating the execution periodically to a different core decreases temperature
  – Spreads the same heat over a larger area


DVFS in multicores + activity migration

2N cores active ➔ $F_{2N}, V_{2N}$

N cores active ➔ $F_N, V_N$

$\frac{F_N V_N^2}{F_{2N} V_{2N}^2} = 2$ ➔ $\frac{\alpha^2 F_N^3}{\alpha^2 F_{2N}^3} = 2$ ➔ $\frac{F_N}{F_{2N}} = 2^{1/3}$

• $\frac{F_1}{F_2} = 2^{1/3} \approx 1.26$
• $\frac{F_1}{F_4} = 2^{2/3} \approx 1.59$
• $\frac{F_1}{F_8} = 2$

DVFS is more efficient when it is combined with activity migration ➔ potential speed-up of 2 for sequential execution on an 8-core processor
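Both DVFS slides follow the same pattern: each time the number of active cores is halved, the allowed $F \times V^2$ per active core grows by a fixed ratio ($\sqrt{2}$ without migration, 2 with migration), and with $V$ proportional to $F$ the frequency grows by the cube root of that ratio. The helper below is an illustrative sketch of that reasoning, with hypothetical names.

```python
import math


def single_core_boost(total_cores, ratio_per_halving):
    """Frequency of one active core relative to all `total_cores` active.

    ratio_per_halving is the allowed increase of F * V^2 per active core
    each time the number of active cores is halved. With V proportional
    to F, F^3 grows by that ratio, so F grows by its cube root.
    """
    halvings = math.log2(total_cores)
    return ratio_per_halving ** (halvings / 3)


for cores in (2, 4, 8):
    without = single_core_boost(cores, math.sqrt(2))   # temperature-limited
    with_migration = single_core_boost(cores, 2.0)     # heat spread by migration
    print(f"{cores} cores: x{without:.2f} without migration, x{with_migration:.2f} with migration")
```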


The future ?


Let’s assume this scenario

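$\text{frequency } F \propto \left(\frac{1}{C}\right) \times \left(\frac{W}{L}\right) \times \left(\frac{\varepsilon_{ox}}{t_{ox}}\right) \times \frac{(V_{dd} - V_t)^2}{V_{dd}}$

$\text{dynamic core power } P_d \propto \left(\frac{C_e}{C}\right) \times \left(\frac{W}{L}\right) \times \left(\frac{\varepsilon_{ox}}{t_{ox}}\right) \times V_{dd} (V_{dd} - V_t)^2$

On each generation, the geometry-dependent factors multiply $F$ by 2 and $P_d$ by $\sqrt{2}$; the threshold voltage $V_t$ stays constant (0.1 V or 0.2 V)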


My guess / wish for 2010-2020

[Figure: chip floorplan with a cache, big cores and small cores; labels: low freq., high freq., low Vt, high Vt]

• Constant chip area

• Several big cores for high sequential performance
  – Vdd × 0.9 for constant core power
  – Frequency × 1.7 on each generation
  – Use a single big core at a time
  – Migrate periodically for temperature

• Many small cores for high parallel performance
  – Vdd × 0.8 for halving core power
  – Frequency increases slowly
  – Parallel performance doubles on each generation
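The big-core figures can be sanity-checked against the frequency and power proportionalities of the scenario. The sketch below rests on several assumptions that are not stated in the slides: a starting supply voltage of 1 V, a constant threshold voltage of 0.2 V, and $\frac{1}{C}$ and $\frac{\varepsilon_{ox}}{t_{ox}}$ each growing by $\sqrt{2}$ per generation as under classical scaling. The function name is illustrative.

```python
import math


def generation_factors(vdd_scale, vdd=1.0, vt=0.2):
    """Per-generation factors on core frequency and dynamic core power.

    Assumptions (not from the slides): vdd = 1.0 V initially, vt = 0.2 V
    held constant, 1/C and eps_ox/t_ox each multiplied by sqrt(2).
    """
    new_vdd = vdd * vdd_scale
    # F ~ (1/C) * (eps_ox/t_ox) * (Vdd - Vt)^2 / Vdd
    freq = 2.0 * ((new_vdd - vt) ** 2 / new_vdd) / ((vdd - vt) ** 2 / vdd)
    # P_d ~ (eps_ox/t_ox) * Vdd * (Vdd - Vt)^2, with C_e/C and W/L unchanged
    power = math.sqrt(2) * (new_vdd * (new_vdd - vt) ** 2) / (vdd * (vdd - vt) ** 2)
    return freq, power


big_freq, big_power = generation_factors(0.9)
print(f"big core, Vdd x 0.9: frequency x {big_freq:.2f}, core power x {big_power:.2f}")
```

Under these assumptions a 10% voltage reduction keeps core power roughly constant while frequency grows by about 1.7. The result is sensitive to the assumed voltages, so it should be read as an order-of-magnitude check rather than a derivation of the slide's numbers.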


Conclusion

• Sequential performance must increase; it is a necessity
  – Some applications have little parallelism
  – Amdahl’s law
  – Legacy code
  – Software productivity
  – Efficient activity migration may be the only long-term solution

• Peak parallel performance is likely to increase faster than sequential performance


Questions ?