Application Transformations for Energy and Performance-Aware Device Management

Taliver Heath, Eduardo Pinheiro, Jerry Hom, Ulrich Kremer, and Ricardo Bianchini

Rutgers University

www.darklab.rutgers.edu

The Problem

Conserve energy in devices
  Must take advantage of lower power states
  State transitions have overhead
  Cost in both energy and performance
Challenge: non-interactive applications and fast processors
  Short device idle times
  Devices cannot use lower power states

Our Solution

[Figure: CPU and disk timelines showing idle and active periods for An Unmodified Application (UM) and for the Transformed Application]

Our Goals

Conserve energy by exploiting transformations that increase idle time
Evaluate ideas using:
  Hand-modified programs
  Automated compiler transformations
Specific policies: Energy Oblivious, Fixed Threshold, Direct Deactivation, Pre-Activation, and Combined

Application Transformations

Increase idle times with help of compiler or programmer
  Identify loops where accesses occur
  Perform loop transformations
  Estimate device idle times
  Insert system calls
Idle time limited by memory or real-time constraints

Example: Original Application

i = 1;
while i <= N {
    read chunk[i] of file;
    compute on chunk[i];
    i = i+1;
}

Example: Transformed Application

available = how_much_memory();
numchunks = available/sizeof(chunks);
compute_time = appfunc(numchunks);
i = 1;
while i <= N {
    read chunk[i…i+numchunks-1] of file;
    next_R(compute_time);
    compute on chunk[i…i+numchunks-1];
    i = i+numchunks;
}
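The two loops above can be compared in runnable form. What follows is a minimal Python sketch under stated assumptions, not the authors' implementation: an in-memory list of chunks stands in for the file, `compute_time_for` is a hypothetical stand-in for `appfunc`, and `notify_idle` is a hypothetical stand-in for the `next_R` system call. Each function returns the number of device accesses it would issue.

```python
from typing import Callable, Sequence


def run_original(chunks: Sequence[bytes],
                 compute: Callable[[bytes], None]) -> int:
    """One device access per chunk, so idle periods stay short."""
    accesses = 0
    for chunk in chunks:              # read chunk[i] of file
        accesses += 1
        compute(chunk)                # compute on chunk[i]
    return accesses


def run_transformed(chunks: Sequence[bytes],
                    compute: Callable[[bytes], None],
                    buffer_bytes: int,
                    chunk_bytes: int,
                    compute_time_for: Callable[[int], float],
                    notify_idle: Callable[[float], None]) -> int:
    """Read as many chunks as fit in the buffer, then compute on all of them."""
    numchunks = max(1, buffer_bytes // chunk_bytes)
    compute_time = compute_time_for(numchunks)   # hypothetical appfunc()
    accesses = 0
    i = 0
    while i < len(chunks):
        batch = chunks[i:i + numchunks]          # one large read
        accesses += 1
        notify_idle(compute_time)                # hypothetical next_R()
        for chunk in batch:
            compute(chunk)
        i += numchunks
    return accesses
```

With ten 4-byte chunks and a 16-byte buffer, the transformed loop issues three accesses instead of ten, and the device sees three long idle periods instead of ten short ones.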

Compiler Framework

Annotations to file descriptors
Replace disk calls using interprocedural analysis
Profiling
Buffered I/O
Notify OS of idle times
Based on SUIF infrastructure

Device Management

Energy-Oblivious (EO)
Fixed-Threshold (FT)
Direct-Deactivation (DD)
Pre-Activation (PA)
Combined (CO): DD+PA
Final state based on model [Heath02]
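The slide names the policies without defining them. The Python sketch below rests on one reading of the names, which is an assumption: EO never leaves the idle state, FT deactivates after a fixed inactivity threshold, DD deactivates at the start of the idle period, PA is FT plus reactivation ahead of the next access, and CO is DD plus that same early reactivation. It models a single low-power state over one run-length. The idle and standby powers are taken from the Experimental Setup slide. The transition energies, the reactivation time, and the threshold are hypothetical values chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Disk:
    p_idle: float = 0.9    # W, "Idle" power from the Experimental Setup slide
    p_low: float = 0.22    # W, "Standby" power from the Experimental Setup slide
    e_deact: float = 1.0   # J, hypothetical deactivation energy
    e_act: float = 3.0     # J, hypothetical reactivation energy
    t_act: float = 1.5     # s, hypothetical reactivation time


def idle_cost(policy: str, run_length: float, disk: Disk = Disk(),
              threshold: float = 5.0) -> Tuple[float, float]:
    """Return (disk energy in J, delay added in s) for one run-length."""
    if policy not in ("EO", "FT", "DD", "PA", "CO"):
        raise ValueError(f"unknown policy: {policy}")
    if policy == "EO":
        return disk.p_idle * run_length, 0.0
    # FT and PA wait for the threshold; DD and CO deactivate at once.
    wait = threshold if policy in ("FT", "PA") else 0.0
    if run_length <= wait:
        return disk.p_idle * run_length, 0.0     # threshold never reached
    low_time = run_length - wait
    # PA and CO overlap reactivation with the end of the idle period.
    hidden = min(disk.t_act, low_time) if policy in ("PA", "CO") else 0.0
    energy = (disk.p_idle * wait + disk.e_deact
              + disk.p_low * (low_time - hidden) + disk.e_act)
    return energy, disk.t_act - hidden
```

Calling `idle_cost(p, 20.0)` for each policy name gives a quick comparison of energy against added delay. The numbers depend entirely on the hypothetical transition costs, so they illustrate the trade-off and do not reproduce the measured results.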

Sample Disk Power Graphs (mp3 player)

[Figure: three disk power graphs, labelled FT, UM, and CO. Vertical axis: Power (W), 0 to 3. Horizontal axis: 0 to 30.]

Experimental Setup

Fujitsu Disk
  6-GB, 4200-rpm laptop disk
  3 states
    Idle: 0.9 W
    Standby: 0.22 W
    Sleep: 0.09 W
Buffer memory available: 19 MB
Time allowed for reading: 0.3 seconds
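The buffer size bounds how long the disk can stay idle, since a full buffer is all the application can consume before its next read. A rough Python sketch of that arithmetic follows. The 128 kbit/s consumption rate is a hypothetical value for a streaming audio player, and 1 MB is assumed to be 2**20 bytes. The 0.3-second reading limit is not modelled.

```python
def max_run_length_s(buffer_bytes: int, consume_bytes_per_s: float) -> float:
    """Longest time the application can compute from one full buffer."""
    if buffer_bytes < 0 or consume_bytes_per_s <= 0:
        raise ValueError("buffer must be non-negative and rate positive")
    return buffer_bytes / consume_bytes_per_s


BUFFER_BYTES = 19 * 2**20          # 19 MB from the slide, assuming 1 MB = 2**20 bytes
STREAM_BYTES_PER_S = 128_000 / 8   # hypothetical 128 kbit/s audio stream

if __name__ == "__main__":
    seconds = max_run_length_s(BUFFER_BYTES, STREAM_BYTES_PER_S)
    print(f"{seconds:.1f} s between disk accesses")
```

Under these assumptions the run-length is on the order of twenty minutes. An application that consumes data faster gets a proportionally shorter run-length.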

Experiment

6 applications
  MP3 player, MPEG player
  Gzip, sftp, MPEG encoder, image smoother
Variables investigated
  Disk management policies
  Compiler vs. hand-optimized
  OS prefetching on/off

Non-streaming: SFTP

Streaming: MP3 player

Average Hand-Modified Results

Policy  Energy  Performance
EO      40%     0%
FT      60%     5%
DD      73%     7%
PA      60%     1%
CO      70%     4%

Average Compiler Results

Policy  Energy  Performance
EO      46%     1%
FT      68%     4%
DD      79%     7%
CO      75%     3%

Related Work (partial list)

Application-controlled power states
  Concept, but no implementation [Ellis99,Lu99]
Compiler infrastructure [Delaluz01]
Direct deactivation and preactivation [Hom01,Heath02]
Conserving disk energy [Douglis94]
Modifying disk access API [Weissel02]

Conclusions

Application transformations
  55-89% savings in energy
  Minimal effect on performance
  Idle time predictions are difficult
Prefetching has little impact
Compiler transformations work well
  As good as hand modifications
Generic framework: other disks and devices

For more information

www.darklab.rutgers.edu

Technique

Create model of disk energy
Transform applications
Realize model on real disk
Predict disk energy usage
Measure disk on 4 applications

Future Work

More disks
Other devices
Multiple active processes
Asynchronous I/O

Summary

Historical Use of States

Change to lower state during period of idleness
  Fixed-threshold
  Adaptive/heuristic
  OS hints
Based on general knowledge of system

Runlength vs. Energy

Projected Application Gain

Projected Application Gain

Overhead for DD

Combined (CO)

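$$E^{co} = E^{deact}_{1,f} + (R - T^{act}_{f})\,P_{f} + E^{act}_{f}$$

$$T^{co} = R$$

[Equation: final-state condition comparing $E^{deact}_{1,f} + (R - T^{act}_{f})\,P_{f} + E^{act}_{f}$ with the same expression for an adjacent state]

[Figure: CPU and disk timelines showing idle and active periods]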

Parameter Description

Parameter            Explanation
$E^{pol}$            Energy consumed by policy pol
$T^{pol}$            CPU time consumed by policy pol
$R$                  Run-length
$P_{s}$              Average power consumed in s
$T_{s}$              Inactivity threshold for s
$E^{act}_{s}$        Average reactivation energy
$E^{deact}_{s,s'}$   Average deactivation energy
$T^{act}_{s}$        Average reactivation time

Reality Departs from Model

Hidden states in several transitions
  Transition from active to idle
  Behavior on activation
For CO: [Equation: adjustment term adj with constants 0.4 and 1.75; the other symbols in the extracted text are f, P, W, and R]

Experiments

Modified App Runlengths

Application     s1   s2    s3
MP3 player      0    0     1
MPEG player     0    0     1
Image smoother  0    0     1
Gzip            0    0.36  0.64
Sftp            0    0     1
MPEG encoder    0    0.5   0.5

Energy, mpg123

Energy, sftp

Performance, mpg123

Performance, sftp

Experimental Results

MP3 player

Summary

Direct deactivation and pre-activation (CO)
  Can save up to 89% of disk energy
  No performance penalty, except for MPEG player (<10%)
Just increasing runlengths, we can save up to 50% energy
Error in model can be significant: up to 50% for the entire application

Energy Oblivious(EO)

$$E^{eo} = R\,P_{1}$$

$$T^{eo} = R$$

[Figure: CPU and disk timelines showing idle and active periods]

Direct Deactivation(DD)

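$$E^{dd} = E^{deact}_{1,f'} + R\,P_{f'} + E^{act}_{f'}$$

$$T^{dd} = R + T^{act}_{f'}$$

[Equation: final-state condition comparing $E^{deact}_{1,f'} + R\,P_{f'} + E^{act}_{f'}$ with the same expression for an adjacent state]

[Figure: CPU and disk timelines showing idle and active periods]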

Pre-Activation(PA)

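$$E^{pa} = \sum_{s=1}^{f''-1} \left( P_{s}\,T_{s} + E^{deact}_{s,s+1} \right) + P_{f''} \left( R - \sum_{s=1}^{f''-1} T_{s} - T^{act}_{f''} \right) + E^{act}_{f''}$$

$$T^{pa} = R$$

$$\sum_{s=1}^{f''-1} T_{s} + T^{act}_{f''} \le R$$

[Figure: CPU and disk timelines showing idle and active periods]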

Fixed-Threshold(FT)

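$$E^{ft} = \sum_{s=1}^{f^{*}-1} \left( P_{s}\,T_{s} + E^{deact}_{s,s+1} \right) + P_{f^{*}} \left( R - \sum_{s=1}^{f^{*}-1} T_{s} \right) + E^{act}_{f^{*}}$$

$$T^{ft} = R + T^{act}_{f^{*}}$$

$$\sum_{s=1}^{f^{*}-1} T_{s} \le R$$

[Figure: CPU and disk timelines showing idle and active periods]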

Terminology

Blocking device accesses (reads)
Single ready-to-run application
Runlength (R): time between device accesses by the processor

[Figure: timeline labelled CPU Time and Device Time, with each runlength R marked]
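The run-length definition above can be applied directly to a trace of device accesses. A minimal Python sketch follows, assuming each blocking access is recorded as a hypothetical `(start, end)` pair in seconds and that the trace is sorted by start time. The trace format is an illustration and does not come from the slides.

```python
from typing import List, Tuple


def run_lengths(accesses: List[Tuple[float, float]]) -> List[float]:
    """Time from the end of each device access to the start of the next."""
    return [
        next_start - prev_end
        for (_, prev_end), (next_start, _) in zip(accesses, accesses[1:])
    ]


if __name__ == "__main__":
    trace = [(0.0, 0.5), (2.5, 3.0), (7.0, 7.25)]   # hypothetical trace
    print(run_lengths(trace))
```

A trace with n accesses yields n-1 run-lengths. Longer run-lengths give the device more opportunity to use a lower power state.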