-1-
Optimal Reliability-Constrained Overdrive Frequency Selection in Multicore Systems
Andrew B. Kahng and Siddhartha Nath
VLSI CAD LABORATORY, UC San Diego
-2-
Outline
Motivation
Previous Work
Our Work
Problem Formulation
Optimal (Discretized) Solution Flow
Results
Conclusions
-3-
Reliability in MultiCore Systems
Modern multicore processors operate in multiple operating modes
– E.g., nominal, supply voltage scaling, turbo, etc.
Reliability is a key processor design consideration at leading-edge technology nodes to guarantee a prescribed system lifetime
Task scheduling affects how cores are used
– A subset of cores can fail before others
-4-
Scheduling in Multicore Systems
Scheduler packs tasks using some or all the available processing cores
[Figure: task schedules (#Cores vs. Time) for Application A and Application B, and a bar chart of %Active Time (0% to 120%) vs. #Active cores (1 to 8) for Application A, Application B, and Packed A, B]
-5-
Core Wearout
Mean time to failure (MTTF) is a measure of the lifetime of a core
Reliability mechanisms degrade MTTF of a core
– E.g., electromigration (EM), stress migration, hot carrier injection, bias temperature instability, etc.
When not all cores are simultaneously active
– Adjust task scheduling on a subset of active cores for balanced wearout
-6-
Impact of Overdrive Frequency
Overdrive frequency: frequency due to overclocking the cores to meet performance and throughput requirements
Overdrive frequencies cause faster MTTF degradation
Two challenges
– Can violate “acceptable throughput” for tasks: cores fail before all assigned tasks are completed
– Can violate minimum “acceptable performance” for tasks: cores operate at lower frequencies
-7-
Terminology
Power-on-hours
– Effective number of lifetime hours consumed
– Measure of a core’s lifetime degradation due to operating conditions, e.g., temperature, frequency
Nominal temperature
– Temperature at which MTTF degradation is the same as the number of hours a core is active
Acceleration factor (AF)
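The terminology above can be illustrated with a small sketch. It assumes, as the slide implies but does not state as a formula, that power-on-hours accumulate as active hours multiplied by the acceleration factor, with AF = 1 at nominal temperature. The function name `power_on_hours` and the usage pattern are hypothetical; the 7-year lifetime and AF = 9.77 are the values used in the counterexample slides.

```python
# Sketch only: power-on-hours (POH) consumed by a single core.
# Assumption (not a formula given in the deck): POH = sum(active_hours * AF),
# where AF = 1 means the core ages one lifetime hour per active hour.

def power_on_hours(intervals):
    """intervals: iterable of (active_hours, acceleration_factor) pairs."""
    return sum(hours * af for hours, af in intervals)

INITIAL_LIFETIME_H = 7 * 365 * 24  # 7 years expressed in hours

# Hypothetical usage of one continuously active core:
# 1000 h at nominal conditions (AF = 1), then 3000 h in overdrive (AF = 9.77).
consumed = power_on_hours([(1000, 1.0), (3000, 9.77)])
remaining = INITIAL_LIFETIME_H - consumed
print(f"consumed = {consumed:.1f} POH, remaining = {remaining:.1f} h")
```

Under this assumption, 3000 active hours in overdrive cost almost 30 times as much lifetime as 1000 nominal hours, which is why the choice of overdrive frequency dominates the lifetime budget.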
-9-
Classification of Existing Works
Work Type
Reiss12 NRC, NLG, NPG
Karpuzcu09 RC, NLG, NPG
Mihic04 RC, LG (Dynamic power management), NPG
Rosing07 RC, LG (Dynamic power management), NPG
Rong06 RC, LG (Dynamic power management), NPG
Coskun09 RC, LG (Dynamic thermal management), NPG
Srinivasan04 RC, LG (Dynamic reliability management), NPG
Karl08 RC, LG (Dynamic reliability management), NPG
(N)RC – (Non-) Reliability Constrained
(N)LG – (No) Lifetime Guarantee
(N)PG – (No) Performance Guarantee
-10-
Counterexample to NRC Policies
Task schedule
Max frequency = 3GHz
Min acceptable frequency = 1.8GHz
Initial lifetime = 7 years (61320h)
#Active cores (m)   Nominal execution time (AF = 1)   Overdrive execution time (AF = 9.77)
1                   1000h                             3000h
2                   2000h                             5000h
3                   3000h                             8000h
4                   2000h                             5000h
All cores operate always at 3GHz
– From HotSpot simulations, AF = 9.77
Lifetime after nominal tasks requiring m = 3 is 24947.5h
– Tasks requiring m = 3 cannot complete overdrive execution
– Tasks requiring m = 4 cannot complete at all
Cannot guarantee “acceptable throughput” !!!
-11-
Counterexample to RC-LG Policies
Task schedule
Max frequency = 3GHz
Min acceptable frequency = 1.8GHz
Initial lifetime = 61320h
#Active cores (m)   Nominal execution time (AF = 1)   Overdrive execution time (AF = 9.77)
1                   1000h                             3000h
2                   2000h                             5000h
3                   3000h                             8000h
4                   2000h                             5000h
All cores operate initially at 3GHz, and then at 1.6GHz
– From HotSpot simulations, AF = 9.77
All tasks are completed but
– Tasks requiring m = 3, 4 operate at 1.6GHz < 1.8GHz (acceptable performance) !!!
Cannot guarantee “acceptable performance” !!!
-13-
What Do We Do Differently?
We formulate a new Maximum-Value Reliability-Constrained Overdrive Frequencies (MVRCOF) optimization (offline) problem
Important because
– Overdrive frequencies are our optimization variables
– User experience is the value
We guarantee prescribed levels of “acceptable performance” and “acceptable throughput”
-14-
Comparison of Ours vs. Existing Works
Work Type
Reiss12 NRC, NLG, NPG
Karpuzcu09 RC, NLG, NPG
Mihic04 RC, LG (Dynamic power management), NPG
Rosing07 RC, LG (Dynamic power management), NPG
Rong06 RC, LG (Dynamic power management), NPG
Coskun09 RC, LG (Dynamic thermal management), NPG
Srinivasan04 RC, LG (Dynamic reliability management), NPG
Karl08 RC, LG (Dynamic reliability management), NPG
Our Work RC, LG (Dynamic reliability management), PG
(N)RC – (Non-) Reliability Constrained
(N)LG – (No) Lifetime Guarantee
(N)PG – (No) Performance Guarantee
-15-
What is the Optimal Solution?
Task schedule
Max frequency = 3GHz
Min acceptable frequency = 1.8GHz
Initial lifetime = 61320h
#Active cores (m)   Nominal execution time (AF = 1)   Overdrive execution time (AF = 9.77)
1                   1000h                             3000h
2                   2000h                             5000h
3                   3000h                             8000h
4                   2000h                             5000h
Optimal (discretized) solution from exhaustive search
#Active cores (m)   Nominal frequency   Overdrive frequency
1                   1.5GHz              2.85GHz
2                   1.5GHz              2.3GHz
3                   1.5GHz              1.8GHz
4                   1.5GHz              1.8GHz
We guarantee both “acceptable performance” and “acceptable throughput” if a solution exists!!!
-16-
Our Key Contributions
We develop a new MVRCOF formulation to maximize the value of operating multiple cores at overdrive frequencies
Our solutions provide guarantees for prescribed lower bounds on “acceptable performance” and “acceptable throughput”
We propose an optimal (discretized) solution using exhaustive search as well as an approximate heuristic flow
Our solutions determine optimal overdrive frequencies as well as execution times for each active core
We empirically determine that our optimal solutions improve the objective function value by up to 17.4% versus existing works
-18-
Formulation
$$\text{Maximize} \sum_{m=1}^{N} \left( w_{OD,m} \cdot f_{OD,m} \cdot E_{OD,m} + w_{nom,m} \cdot f_{nom,m} \cdot E_{nom,m} \right)$$
-19-
Formulation In English
$$\text{Maximize} \sum_{m=1}^{N} \left( w_{OD,m} \cdot f_{OD,m} \cdot E_{OD,m} + w_{nom,m} \cdot f_{nom,m} \cdot E_{nom,m} \right)$$
The value of operating at overdrive frequencies (fOD,m) described by weights (wOD,m) and the duration (EOD,m)
The value of operating at nominal frequencies (fnom,m) described by weights (wnom,m) and the duration (Enom,m)
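As a rough illustration of the objective, the sketch below evaluates the sum for one candidate solution. The execution times come from the task schedule used in the counterexamples, and the frequencies come from the "What is the Optimal Solution?" slide. The weights are taken from the 4-I testcase row, and which weight column belongs to which mode is an assumption made here, not something the deck states. The function name `objective` is hypothetical.

```python
# Sketch only: evaluate sum over m of
#   w_OD,m * f_OD,m * E_OD,m + w_nom,m * f_nom,m * E_nom,m
# for dictionaries keyed by the number of active cores m.

def objective(w_od, f_od, e_od, w_nom, f_nom, e_nom):
    return sum(
        w_od[m] * f_od[m] * e_od[m] + w_nom[m] * f_nom[m] * e_nom[m]
        for m in f_od
    )

# Execution times in hours (task schedule slides).
e_nom = {1: 1000, 2: 2000, 3: 3000, 4: 2000}
e_od = {1: 3000, 2: 5000, 3: 8000, 4: 5000}

# Frequencies in GHz ("What is the Optimal Solution?" slide).
f_nom = {1: 1.5, 2: 1.5, 3: 1.5, 4: 1.5}
f_od = {1: 2.85, 2: 2.3, 3: 1.8, 4: 1.8}

# ASSUMPTION: weight-to-mode assignment is a guess for illustration.
w_nom = {1: 0.5, 2: 0.3, 3: 0.2, 4: 0.4}
w_od = {1: 0.5, 2: 0.7, 3: 0.8, 4: 0.6}

value = objective(w_od, f_od, e_od, w_nom, f_nom, e_nom)
print(f"objective value = {value:.1f} (GHz * h, weighted)")
```

With these inputs the overdrive terms contribute far more than the nominal terms, so small changes in the overdrive frequencies move the objective the most.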
-20-
Formulation In English
Guarantees minimum “acceptable performance” and upper bounds the overdrive frequency by the maximum achievable frequency (fmax)
Guarantees “acceptable throughput”, i.e., all tasks complete within lifetime and cores wear out in a balanced manner
Upper bound on instantaneous power dissipated by any core
Upper bound on instantaneous temperature of all active cores
-21-
MVRCOF Inputs: Task Description
[Figure: App 1, App 2, ..., App X feed the Scheduler, which provides the inputs below]
El,m: Execution times in nominal and overdrive modes with different number of active cores
wl,m: Weights in nominal and overdrive modes with different number of active cores
fnom,m: Nominal frequencies at different number of active cores
-22-
MVRCOF Inputs: System Description
[Figure: the SoC Designer provides the inputs below]
N: Number of available symmetric cores
Pmax: Maximum power of any core
fmax: Maximum frequency of any core
Tmax: Maximum die temperature
Tnom: Nominal temperature
MTTF: Initial MTTF of any core
-23-
MVRCOF Outputs
[Figure: the MVRCOF solver produces the outputs below]
fOD,m: Optimal overdrive frequencies for each set of active cores
vj,m,l: %execution time in each combination of the active cores
– E.g., in a system with three available cores, two cores can be active in C(3,2) = 3 ways
ui,l: %lifetime each core operates at nominal and overdrive modes
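The combinations of active cores that the outputs are indexed by can be enumerated with a short sketch. The core labels A, B, C follow the three-core example used elsewhere in the deck; the variable names are hypothetical.

```python
# Sketch only: enumerate which sets of m cores can be active in a
# three-core system, and how many of those sets include Core A.
from itertools import combinations
from math import comb

cores = ["A", "B", "C"]
counts = {}
for m in range(1, len(cores) + 1):
    active_sets = list(combinations(cores, m))   # all ways to pick m cores
    with_a = [s for s in active_sets if "A" in s]
    counts[m] = (len(active_sets), len(with_a))
    print(f"m = {m}: {len(active_sets)} combinations, {len(with_a)} include Core A")
```

For N symmetric cores the number of combinations for a given m is the binomial coefficient C(N, m), which `math.comb` computes directly.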
-24-
MVRCOF Inputs and Outputs
[Figure: App 1, App 2, ..., App X feed the Scheduler, which provides the Task Description (El,m, wl,m, fnom,m); the SoC Designer provides the System Description (N, Pmax, fmax, Tmax, Tnom, MTTF); both feed the MVRCOF solver, which produces the Outputs (fOD,m, vj,m,l, ui,l)]
-26-
Optimal (Discretized) Solution Flow
For each core
– For each combination in which the core is active: choose discrete values of overdrive frequencies within a range, perform power and temperature simulations, and create a one-time LUT
– Example: if a system has 3 cores (Core A, B, C), the number of active cores can be 1, 2 or 3. Core A is active in one (out of three) combination when m = 1, two (out of three) combinations when m = 2, and one (out of one) combination when m = 3
Perform exhaustive search using the LUT for optimal overdrive frequencies that maximize the value of the objective function
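The search step can be sketched as a brute-force loop over a small frequency grid. This is a simplified illustration, not the MVRCOF solver: it omits the LP and the power and temperature constraints, and it assumes a balanced-wearout lifetime model in which each of the N cores carries a fraction m/N of the work that needs m active cores. The LUT values are hypothetical except AF = 9.77 at 3GHz, the weights are assumed, and all names are made up for this sketch.

```python
# Sketch only: exhaustive search over discretized overdrive frequencies.
from itertools import product

N = 4                       # number of cores
MTTF_H = 61320              # initial lifetime in hours (7 years)
E_NOM = {1: 1000, 2: 2000, 3: 3000, 4: 2000}   # hours, from the task schedule
E_OD = {1: 3000, 2: 5000, 3: 8000, 4: 5000}    # hours, from the task schedule
W_OD = {1: 0.5, 2: 0.7, 3: 0.8, 4: 0.6}        # ASSUMED overdrive weights

# One-time LUT: frequency (GHz) -> acceleration factor.
# HYPOTHETICAL values, except 9.77 at 3.0 GHz which the slides report.
AF_LUT = {1.8: 1.5, 2.1: 2.5, 2.4: 4.0, 2.7: 6.5, 3.0: 9.77}

def consumed_hours(f_od):
    """Lifetime hours consumed per core under the assumed balanced model."""
    return sum((m / N) * (E_NOM[m] + E_OD[m] * AF_LUT[f_od[m]]) for m in f_od)

def overdrive_value(f_od):
    """Overdrive part of the objective; the nominal part is constant here."""
    return sum(W_OD[m] * f_od[m] * E_OD[m] for m in f_od)

def exhaustive_search():
    best, best_value = None, float("-inf")
    for freqs in product(AF_LUT, repeat=N):        # every grid combination
        f_od = dict(zip(range(1, N + 1), freqs))
        if consumed_hours(f_od) > MTTF_H:          # violates lifetime budget
            continue
        value = overdrive_value(f_od)
        if value > best_value:
            best, best_value = f_od, value
    return best, best_value

best, best_value = exhaustive_search()
print(best, round(best_value, 1))
```

The grid here has 5 values per m, so 625 candidates; the cost grows exponentially in N, which is the motivation for a cheaper heuristic.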
-27-
Heuristic Flow
We maximize the overdrive frequency in the order of the set of active cores for which the product of weights and execution times is maximum
– Example: if a system has 3 cores, the number of active cores can be 1, 2 or 3; the fOD,m with the largest product wOD,m · EOD,m is maximized first, then the next largest, and so on
This achieves large improvements in the value of the objective function
$$\text{Maximize} \sum_{m=1}^{N} \left( \boldsymbol{w_{OD,m} \cdot f_{OD,m} \cdot E_{OD,m}} + w_{nom,m} \cdot f_{nom,m} \cdot E_{nom,m} \right)$$
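The ordering idea can be sketched as a greedy loop. This is one possible reading of the heuristic, not the authors' implementation: it ranks each m by the product of overdrive weight and overdrive execution time, then raises each overdrive frequency as far as a lifetime budget allows. The lifetime model, the LUT values (other than AF = 9.77 at 3GHz) and the weights are assumptions, and all names are hypothetical.

```python
# Sketch only: greedy frequency assignment in decreasing order of w_OD * E_OD.
N = 4
MTTF_H = 61320
E_NOM = {1: 1000, 2: 2000, 3: 3000, 4: 2000}   # hours, from the task schedule
E_OD = {1: 3000, 2: 5000, 3: 8000, 4: 5000}    # hours, from the task schedule
W_OD = {1: 0.5, 2: 0.7, 3: 0.8, 4: 0.6}        # ASSUMED overdrive weights
# HYPOTHETICAL LUT, except 9.77 at 3.0 GHz which the slides report.
AF_LUT = {1.8: 1.5, 2.1: 2.5, 2.4: 4.0, 2.7: 6.5, 3.0: 9.77}

def consumed_hours(f_od):
    """Lifetime hours consumed per core under an assumed balanced model."""
    return sum((m / N) * (E_NOM[m] + E_OD[m] * AF_LUT[f_od[m]]) for m in f_od)

# Rank the sets of active cores by weight * execution time, largest first.
order = sorted(E_OD, key=lambda m: W_OD[m] * E_OD[m], reverse=True)

# Start every m at the lowest grid frequency, then raise them in rank order.
f_od = {m: min(AF_LUT) for m in E_OD}
for m in order:
    for f in sorted(AF_LUT, reverse=True):       # try the highest frequency first
        trial = {**f_od, m: f}
        if consumed_hours(trial) <= MTTF_H:      # keep it if the budget holds
            f_od = trial
            break

print("order:", order)
print("frequencies:", f_od)
```

Unlike the exhaustive search, this visits at most one pass over the grid per m, at the cost of possibly missing the best combination.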
-29-
Experimental Setup
Each core is simulated with 72 copies of jpeg_encoder from OpenCores
– SP&R implementation with commercial tools and foundry 45nm libraries
Power simulation using Synopsys PrimeTime-PX
– Increase voltage from 0.8V to 1.2V in steps of 10mV
– Increase frequency from 1.5GHz to 3GHz in steps of 50MHz
Thermal simulation using HotSpot
LP solver is lp_solve
Baseline policy is RC-LG from existing works
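The sweep ranges above imply a fixed number of grid points, which the sketch below counts using integer units to avoid floating-point step errors. Whether every voltage is paired with every frequency is not stated on the slide, so the pair count is only an upper bound under that assumption. Variable names are hypothetical.

```python
# Sketch only: size of the voltage and frequency sweeps described above.
voltages_mv = list(range(800, 1201, 10))    # 0.8 V to 1.2 V in 10 mV steps
freqs_mhz = list(range(1500, 3001, 50))     # 1.5 GHz to 3 GHz in 50 MHz steps

# ASSUMPTION: a full cross product of voltage and frequency points.
max_pairs = len(voltages_mv) * len(freqs_mhz)

print(len(voltages_mv), "voltage points")
print(len(freqs_mhz), "frequency points")
print(max_pairs, "pairs if every voltage is combined with every frequency")
```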
-30-
Testcases
Testcases are described by the number of active cores, execution times, and weights
Example
Name   m            Enom,m (Kh)   EOD,m (Kh)   wl,m                 wl,m
4-I    1, 2, 3, 4   1, 2, 3, 2    3, 5, 8, 5   0.5, 0.3, 0.2, 0.4   0.5, 0.7, 0.8, 0.6
Eight testcases in total
– Format is N-Testcase#
– Seven have optimal solutions
– One does not have a feasible solution
-31-
Optimal, Heuristic vs. RC-LG
[Figure: bar chart of Objective Function Value (0 to 45000) per Testcase (4-I, 4-II, 4-III, 4-IV, 4-V, 6-I, 8-I) for Optimal, Heuristic, and Baseline; annotated differences: -3.3%, -17.4%, -12%, -9%]
-32-
Runtime Comparison
[Figure: bar chart of Normalized Runtime (0 to 1.2) per Testcase (4-I, 4-II, 4-III, 4-IV, 4-V, 6-I, 8-I) for Optimal and Heuristic; annotations: 10, 2.3, 2.5, 2.5]
-34-
Conclusions
We formulate and solve a new MVRCOF problem under lifetime reliability constraints
We develop an MVRCOF solver that implements our optimal (discretized) and heuristic flows
Our optimal solutions guarantee both “acceptable performance” and “acceptable throughput”
We empirically demonstrate that our optimal solutions achieve up to 17.4% greater value of the objective function than existing works
Our future works include
– Applying our methods to traces from actual server workloads
– Expanding our methods to handle other objectives
– Achieving solutions that are temperature history-aware
-35-
Thank You!
-36-
Back up
-37-
Notation
m: number of simultaneously active cores
N: number of symmetric cores in a system
i: index for a core
fOD,m, fnom,m: overdrive and nominal frequencies when m cores are active
wOD,m, wnom,m: weights for overdrive and nominal frequencies
EOD,m, Enom,m: execution time in overdrive and nominal frequencies
fmax: maximum achievable frequency of any core
Pmax: maximum power consumption of any core
Tmax: maximum die temperature
-38-
Optimal Solution Flow
[Figure: flow diagram. fOD,m → Power simulation → Power(fOD,m) → Thermal simulation → Core Temp → (fOD,m, temp, AF) LUT, built for each core i, fOD,m and combination j of m; Exhaustive Search over the LUT with an LP → Optimal obj fn value, fOD,m and tj,m,l]