Scheduling for Reduced CPU Energy
M. Weiser, B. Welch, A. Demers and S. Shenker
Appears in "Proceedings of the First Symposium on Operating Systems Design and Implementation," Usenix Association, November 1994
Presented by Ricardo Carrano for Sistemas de Tempo Real e Embarcados (Real-Time and Embedded Systems) – IC / UFF
Abstract
• Introduces MIPJ (millions of instructions per joule), a new metric for CPU energy performance
• Examines a class of methods to improve MIPJ based on dynamic control of the system clock speed by the OS scheduler
• Question: What are the right scheduling algorithms for taking advantage of reduced clock speed, especially in the presence of applications demanding ever more instructions per second?
• Result: Adjusting clock speed at a fine grain saves substantial CPU energy (with little impact on performance)
Outline
• An Energy Metric for CPUs
• Motivation
• The experiment
  − Trace Data
  − Assumptions and Simulations
• Three algorithms (OPT, FUTURE and PAST)
• Evaluating the Algorithms
• Conclusions
Motivation: components’ energy use
• Dominated by display and disk
• But the CPU is significant
• Common approach (at the time):
  − Power down when idle
• Proposed (new) approach:
  − Minimize idle time
An Energy Metric for CPUs
• MIPJ: new metric for CPU energy performance
• MIPJ = MIPS/WATTS
• MIPS stands for any workload-per-time benchmark
• Examples
  − 1984: 2-MIPS 68020, 2 W → MIPJ: 1
  − 1994: 200-MIPS Alpha, 40 W → MIPJ: 5
  − Laptops: Motorola 68349, 6 MIPS at 300 mW → MIPJ: 20
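The example figures above follow directly from the definition MIPJ = MIPS/WATTS. A minimal sketch that recomputes them (the function name `mipj` and the dictionary layout are illustrative choices, not from the paper):

```python
def mipj(mips: float, watts: float) -> float:
    """Millions of instructions per joule: MIPS divided by power in watts."""
    return mips / watts


# (MIPS, watts) pairs taken from the examples on the slide.
examples = {
    "1984 68020": (2.0, 2.0),
    "1994 Alpha": (200.0, 40.0),
    "Motorola 68349 (laptop)": (6.0, 0.3),
}

for name, (mips, watts) in examples.items():
    print(f"{name}: {mipj(mips, watts):.0f} MIPJ")
```

A higher MIPJ means more work done per unit of energy, which is why the low-power laptop part scores best despite being the slowest in absolute terms.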
More recent data: Energy per Instruction Trends in Intel® Microprocessors
• “Core Duo and Pentium M reverse the trend towards ever-greater EPI” (but the arrow still points up)
• “(…) improving IPC has emerged as the more energy-efficient of the two techniques” (as opposed to increasing frequency).
Grochowski and Annavaram, Microarchitecture Research Lab, Intel Corporation
How to improve MIPJ?
• Other things equal, MIPJ is unchanged by changes in clock speed.
  − Reducing clock speed causes a linear reduction in power consumption, but also a linear reduction in MIPS → the two cancel
• But a reduced clock speed creates an opportunity for quadratic energy savings
  − Clock speed reduced by n → energy per cycle reduced by n².
• So dynamic control of the system clock speed by the OS scheduler does save energy
• Techniques that allow this: reducing voltage, reversible logic, adiabatic logic
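The quadratic saving can be made concrete with a small calculation. This is a sketch under the slide's idealized model only: slowing the clock by a factor n cuts energy per cycle by n², and nothing else (leakage, other components) is counted. The function names are hypothetical.

```python
def energy_per_cycle(n: float) -> float:
    """Relative energy per cycle when the clock is slowed by a factor n (n >= 1)."""
    return 1.0 / n ** 2


def total_energy(cycles: float, n: float) -> float:
    """Relative energy needed to execute a fixed number of cycles at slowdown n."""
    return cycles * energy_per_cycle(n)


def run_time(cycles: float, n: float) -> float:
    """Run time, measured in full-speed cycle times, at slowdown n."""
    return cycles * n


work = 100.0
for n in (1.0, 2.0, 4.0):
    print(f"slowdown {n}: time {run_time(work, n)}, energy {total_energy(work, n)}")
```

Under this model, halving the clock doubles the run time of a fixed piece of work but needs only a quarter of the energy, which is the saving the scheduler tries to harvest from idle time.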
Adiabatic Logic
• Benjamin Gojman (August 8, 2004)
  − As circuits get smaller and faster, their energy dissipation greatly increases
  − a problem that adiabatic circuits promise to solve
• Adiabatic process → total heat or energy in the system remains constant.
• Term given to low-power electronic circuits that implement reversible logic.
• CMOS technology dissipates energy as heat mostly when switching.
• There are two fundamental rules CMOS adiabatic circuits must follow:
  − never turn on a transistor when there is a voltage difference between the drain and source.
  − never turn off a transistor that has current flowing through it.
The experiment
• Simulations over real traces
• Lengthen the runtime of individually scheduled segments of the trace in order to eliminate idle time.
• The idea is to stretch runtime into idle times
The experiment: Trace Data
• Taken from UNIX workstations
• Over periods of up to several hours on a work day
• Workload includes software development, documentation, e-mail, simulation, etc.
• Other traces taken during specific workloads
The experiment: assumptions (1/2)
• No reordering of tasks
• Sleep events classified into hard and soft
  − Waits for disk requests are hard (non-deterministic)
  − Waits for keystrokes, for example, are soft: runtime can be stretched into them
• No energy consumption when idle
• Energy per instruction is proportional to n² when running at speed n
• n varies between the minimum relative speed and 1.0 (full speed)
The experiment: assumptions (2/2)
• Switching speeds takes no time
• Periods in which the machine would be turned off for power saving are skipped/ignored
• Lower bound on practical speed (5 V = full speed):
  − minimum relative speed of 0.2, 0.44 or 0.66, corresponding to 1.0, 2.2 and 3.3 V
• Speed adjusted linearly with voltage
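The three speed bounds follow from the linear speed/voltage assumption with 5 V as full speed. A small sketch of that mapping, combined with the n² energy-per-instruction assumption from the previous slide (the constant and function names are illustrative, not from the paper):

```python
FULL_SPEED_VOLTS = 5.0  # voltage at relative speed 1.0, as stated on the slide


def min_relative_speed(min_volts: float) -> float:
    """Lowest relative speed reachable when voltage cannot drop below min_volts."""
    return min_volts / FULL_SPEED_VOLTS


def energy_per_instruction(speed: float) -> float:
    """Relative energy per instruction at a given relative speed (n squared model)."""
    return speed ** 2


for volts in (1.0, 2.2, 3.3):
    speed = min_relative_speed(volts)
    print(f"{volts} V: min speed {speed:.2f}, "
          f"energy per instruction {energy_per_instruction(speed):.2f}")
```

This shows why the minimum voltage matters so much: the floor on speed is also a floor on how little energy each instruction can cost.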
Scheduling algorithms
• Three types of scheduling
  − OPT: unbounded-delay, perfect-future
  − FUTURE: bounded-delay, limited-future
  − PAST: bounded-delay, limited-past
OPT
• Takes the entire trace
• Stretches all the runtimes to fill all the idle times
• Off periods (90% of idle times over 30s) are not available for stretching
• Impractical: requires future knowledge
• Undesirable: large delays
  − no regard for interactivity
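The idea behind OPT can be sketched as choosing one speed for the whole trace so that the stretched runtime exactly fills the available idle time. This is a simplified illustration, not the paper's simulator: the trace format (a list of `(run, idle)` pairs in full-speed time units), the function name and the default minimum speed are assumptions, and off periods are taken to be already excluded from the idle figures.

```python
def opt_schedule(segments, min_speed=0.2):
    """Return (speed, energy) for a whole trace under a single stretched speed.

    segments: list of (run, idle) pairs, both in full-speed time units.
    Energy is relative: run time at full speed times speed squared.
    """
    total_run = sum(run for run, _idle in segments)
    total_idle = sum(idle for _run, idle in segments)
    total = total_run + total_idle
    # Speed at which all runtime, stretched, just fills run + idle time.
    speed = total_run / total if total > 0 else min_speed
    speed = max(min_speed, min(1.0, speed))
    energy = total_run * speed ** 2
    return speed, energy


speed, energy = opt_schedule([(10.0, 30.0), (20.0, 40.0)])
print(f"speed {speed:.2f}, relative energy {energy:.2f} (full speed would use 30.00)")
```

Because the single speed ignores where the idle time falls, work may be finished long after it was requested, which is the large-delay drawback noted above.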
FUTURE
• Like OPT, but peers only a small window into the future
• Stretches runtime into idle time only within this window
• With a window size of 10 to 50 ms, interactive response remains high
• Impractical: requires future knowledge
• Desirable: limited delay
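A sketch of the FUTURE idea: the same stretching as OPT, but decided separately for each window, using perfect knowledge of that window only. The window format (one `(run, idle)` pair per window, in full-speed time units, with `idle` meaning idle time available for stretching), the function name and the default minimum speed are assumptions made for illustration.

```python
def future_schedule(windows, min_speed=0.2):
    """Return (per-window speeds, total relative energy).

    windows: list of (run, idle) pairs, one per window, in full-speed time units.
    Each window's speed is chosen so its own runtime fills its own idle time.
    """
    speeds = []
    energy = 0.0
    for run, idle in windows:
        total = run + idle
        speed = run / total if total > 0 else min_speed
        speed = max(min_speed, min(1.0, speed))
        speeds.append(speed)
        energy += run * speed ** 2  # energy per unit of work scales with speed squared
    return speeds, energy


speeds, energy = future_schedule([(10.0, 10.0), (5.0, 15.0), (20.0, 0.0)])
print(speeds, energy)
```

Work never spills outside its own window here, so delay is bounded by the window length; the price is that a fully busy window must run at full speed even if its neighbours are mostly idle.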
PAST
• Practical version of FUTURE
• Looks at a fixed window into the past
• Assumes the next window will be like the previous one
• The algorithm follows...
PAST algorithm
• Processes the previous window and computes:
  − run_cycles: number of non-idle CPU cycles
  − idle_cycles: idle CPU cycles, hard and soft
  − excess_cycles: cycles left over because we ran too slow
  − run_percent = run_cycles / (idle_cycles + run_cycles)
• Adjusts speed accordingly:
  − if excess_cycles > idle_cycles → newspeed = 1.0
  − elseif run_percent > 0.7 → newspeed = speed + 0.2
  − elseif run_percent < 0.5 → newspeed = speed − (0.6 − run_percent)
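The speed-setting rule above translates almost line for line into code. This is a minimal sketch of that rule only, not a full simulator. Two details are assumptions added here: the speed is left unchanged when no branch fires (run_percent between 0.5 and 0.7), and the result is clamped to the range [min_speed, 1.0], following the bounds stated in the experiment's assumptions. The function name and default `min_speed` are illustrative.

```python
def past_next_speed(speed, run_cycles, idle_cycles, excess_cycles, min_speed=0.44):
    """Pick the speed for the next window from statistics of the previous one."""
    total = run_cycles + idle_cycles
    run_percent = run_cycles / total if total > 0 else 0.0

    if excess_cycles > idle_cycles:
        new_speed = 1.0                           # badly behind: go to full speed
    elif run_percent > 0.7:
        new_speed = speed + 0.2                   # busy window: speed up
    elif run_percent < 0.5:
        new_speed = speed - (0.6 - run_percent)   # mostly idle: slow down
    else:
        new_speed = speed                         # assumed: otherwise keep speed

    # Assumed clamp to the allowed range of relative speeds.
    return max(min_speed, min(1.0, new_speed))


print(past_next_speed(0.6, run_cycles=80, idle_cycles=20, excess_cycles=0))
print(past_next_speed(0.6, run_cycles=10, idle_cycles=90, excess_cycles=0))
print(past_next_speed(0.5, run_cycles=60, idle_cycles=40, excess_cycles=50))
```

The asymmetry is deliberate in the rule itself: speed rises in fixed steps of 0.2 but falls by an amount that grows as the previous window gets idler.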
Evaluating the Algorithms
[Figure: results by algorithm and minimum speed allowed]
• PAST beats FUTURE, because excess cycles are deferred
Penalty at 20ms
[Figure: excess cycles built up per 20 ms interval, measured as the time it would take to execute them at full speed]
• Most intervals have no excess cycles
Penalty at 2.2V
• The peak shifts right as the interval length increases
PAST (Min Volts, 20ms)
• Minimum speed does not always result in the minimum energy
• 2.2V almost as good as 1.0V
[Figure label: Kestrel march 1]
PAST (2.2V vs. Interval)
Longer adjustment periods result in more savings
Excess Cycles
Lower minimum voltage → more excess cycles
Longer interval → more excess cycles
Conclusions (1/2)
• PAST, with a 50ms window, saves energy:
  − up to 50% under conservative assumptions (3.3V)
  − up to 70% under more aggressive assumptions (2.2V)
• Savings depend on the interval between speed adjustments.
  − too fine: less power saved (CPU usage is bursty).
  − too coarse: excess cycles built up during a slow interval adversely affect interactive response.
  − an interval of 20 or 30 milliseconds is a good compromise between power savings and interactive response.
Conclusions (2/2)
• Too low a minimum speed → less efficiency
  − more excess cycles → must speed up to catch up.
• If an effective way of predicting workload can be found, then significant power can be saved
  − by adjusting the processor speed at a fine grain so it is just fast enough to accommodate the workload.
• The tortoise is more efficient than the hare:
  − better to spread work out by reducing clock speed (and voltage) than to run the CPU at full speed for short bursts and then idle.
• But QoS is not actually taken into account
  − The hard/soft classification of idle cycles gives no guarantees for real-time (RT) systems