http://www.cag.lcs.mit.edu/metaopt
Meta Optimization: Improving Compiler Heuristics with Machine Learning
Mark Stephenson, Una-May O’Reilly, Martin Martin, and Saman Amarasinghe
MIT Computer Architecture Group
Motivation
• Compiler writers are faced with many challenges:
– Many compiler problems are NP-hard
– Modern architectures are inextricably complex
  • Simple models can’t capture architecture intricacies
– Micro-architectures change quickly
Motivation
• Heuristics alleviate complexity woes
– Find good approximate solutions for a large class of applications
– Find solutions quickly
• Unfortunately…
– They require a lot of trial-and-error tweaking to achieve suitable performance
Priority Functions
• A heuristic’s Achilles heel
– A single priority or cost function often dictates the efficacy of a heuristic
• Priority functions rank the options available to a compiler heuristic
– Graph coloring register allocation (selecting nodes to spill)
– List scheduling (identifying instructions in worklist to schedule first)
– Hyperblock formation (selecting paths to include)
Machine Learning
• We propose using machine learning techniques to automatically search the priority function space
– Search space is feasible
– Make use of spare computer cycles
Case Study I: Hyperblock Formation
• Find predicatable regions of control flow
• Enumerate paths of control in region
– Exponential, but in practice it’s okay
• Prioritize paths based on several characteristics
– The priority function we want to optimize
• Add paths to hyperblock in priority order
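The hyperblock formation steps can be sketched in Python as follows. This is only an illustration, not IMPACT's implementation: the function names, the block budget `max_blocks`, and the rule of skipping a path that would exceed the budget are assumptions. The control-flow graph and priority values are taken from the "IMPACT's Algorithm" slide in this deck, with "~0" treated as 0.0.

```python
from typing import Callable, Dict, List

def enumerate_paths(cfg: Dict[str, List[str]], entry: str, exit_: str) -> List[List[str]]:
    """Depth-first enumeration of every entry-to-exit path in an acyclic region."""
    paths: List[List[str]] = []

    def walk(node: str, prefix: List[str]) -> None:
        prefix = prefix + [node]
        if node == exit_:
            paths.append(prefix)
            return
        for succ in cfg.get(node, []):
            walk(succ, prefix)

    walk(entry, [])
    return paths

def form_hyperblock(paths: List[List[str]],
                    priority: Callable[[List[str]], float],
                    max_blocks: int) -> List[List[str]]:
    """Add paths in descending priority order.

    max_blocks is a hypothetical stand-in for a real resource estimate:
    a path is skipped if it would push the hyperblock past that many blocks.
    """
    chosen_blocks = set()
    selected = []
    for path in sorted(paths, key=priority, reverse=True):
        merged = chosen_blocks | set(path)
        if len(merged) > max_blocks:
            continue
        chosen_blocks = merged
        selected.append(path)
    return selected

# Region from the "IMPACT's Algorithm" slide (successor lists per block).
CFG = {"A": ["B", "C"], "B": ["D", "F"], "C": ["F", "E"],
       "D": ["F"], "E": ["F", "G"], "F": ["G"]}

# Priority column from that slide; "~0" entries are approximated as 0.0.
PR = {"A-B-D-F-G": 0.0, "A-B-F-G": 0.21, "A-C-F-G": 1.44,
      "A-C-E-F-G": 0.02, "A-C-E-G": 0.0}

PATHS = enumerate_paths(CFG, "A", "G")
SELECTED = form_hyperblock(PATHS, lambda p: PR["-".join(p)], max_blocks=5)
```

With the assumed budget of five blocks, the two highest-priority paths (A-C-F-G and A-B-F-G) are selected and the remaining paths are skipped.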
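Case Study I: IMPACT’s Function
\[
\mathit{hazard}_i =
\begin{cases}
0.25 & \text{if } \mathit{path}_i \text{ has hazard} \\
1 & \text{if } \mathit{path}_i \text{ is hazard free}
\end{cases}
\]
\[
\mathit{dep\_ratio}_i = \frac{\mathit{dep\_height}_i}{\max_{j = 1 \to N} \mathit{dep\_height}_j}
\]
\[
\mathit{op\_ratio}_i = \frac{\mathit{op\_height}_i}{\max_{j = 1 \to N} \mathit{op\_height}_j}
\]
\[
\mathit{priority}_i = \mathit{exec\_ratio}_i \cdot \mathit{hazard}_i \cdot (2.1 - \mathit{dep\_ratio}_i - \mathit{op\_ratio}_i)
\]
• Favor frequently executed paths
• Favor short paths
• Favor parallel paths
• Penalize paths with hazards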
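As a rough illustration of how IMPACT's priority function combines its four characteristics, here is a minimal Python sketch. The `Path` class, the function name, and the sample values in the test are hypothetical; the ratios are assumed to be each path's value divided by the maximum over all N paths in the region.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class Path:
    exec_ratio: float   # fraction of executions that follow this path
    has_hazard: bool    # True if the path contains a hazard
    dep_height: int     # dependence height of the path
    op_height: int      # operation term of the path, as named in the formula

def impact_priorities(paths: List[Path]) -> List[float]:
    """Compute priority_i = exec_ratio_i * hazard_i * (2.1 - dep_ratio_i - op_ratio_i)."""
    max_dep = max(p.dep_height for p in paths)
    max_op = max(p.op_height for p in paths)
    priorities = []
    for p in paths:
        hazard = 0.25 if p.has_hazard else 1.0
        dep_ratio = p.dep_height / max_dep
        op_ratio = p.op_height / max_op
        priorities.append(p.exec_ratio * hazard * (2.1 - dep_ratio - op_ratio))
    return priorities
```

A frequently executed, hazard-free path with a small dependence height and few operations gets the highest priority; a hazard scales the priority down by a factor of four.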
Hyperblock Formation
• What are the important characteristics of a hyperblock formation priority function?
• IMPACT uses four characteristics
• Extract all the characteristics you can think of and have a machine learning algorithm find the priority function
Hyperblock Formation
x1   Maximum ops over paths
x2   Dependence height
x3   Number of paths
x4   Number of operations
x5   Does path have subroutine calls?
x6   Number of branches
x7   Does path have unsafe calls?
x8   Path execution ratio
x9   Does path have pointer derefs?
x10  Average ops executed in path
x11  Issue width of processor
x12  Average predictability of branches in path
…
xN   Predictability product of branches in path
\[
\mathit{priority}_i = \mathrm{function}(x_1, \ldots, x_N)
\]
IMPACT’s hand-written function, for comparison:
\[
\mathit{priority}_i = \mathit{exec\_ratio}_i \cdot \mathit{hazard}_i \cdot (2.1 - \mathit{dep\_ratio}_i - \mathit{op\_ratio}_i)
\]
Genetic Programming
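• GP’s representation is a directly executable expression
– Basically a lisp expression (or an AST)
– In our case, GP variables are interesting characteristics of the program
[Figure: expression tree with root *, inner nodes - and /, and leaves num_ops, 2.3, predictability, 4.1, i.e. (* (- num_ops 2.3) (/ predictability 4.1))]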
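To make "directly executable expression" concrete, here is a small Python sketch that stores an expression as nested tuples and evaluates it against a dictionary of program characteristics. The tuple encoding and the `evaluate` helper are assumptions made for illustration, and `EXPR` is one reading of the slide's example tree.

```python
import operator

# Binary operators allowed at inner nodes of the tree.
OPS = {"+": operator.add, "-": operator.sub,
       "*": operator.mul, "/": operator.truediv}

def evaluate(expr, env):
    """Evaluate a nested-tuple expression.

    A tuple is (operator, left, right); a string is a variable looked up
    in env; anything else is treated as a numeric constant.
    """
    if isinstance(expr, tuple):
        op, left, right = expr
        return OPS[op](evaluate(left, env), evaluate(right, env))
    if isinstance(expr, str):
        return env[expr]
    return expr

# One reading of the slide's tree: (* (- num_ops 2.3) (/ predictability 4.1))
EXPR = ("*", ("-", "num_ops", 2.3), ("/", "predictability", 4.1))
```

Calling `evaluate(EXPR, {"num_ops": 12.3, "predictability": 0.82})` computes (12.3 - 2.3) * (0.82 / 4.1). Because the tree is plain data, a search procedure can rewrite it freely and still execute the result.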
Genetic Programming
• Searching algorithm analogous to natural selection
– Maintain a population of expressions
– Selection
  • The fittest expressions in the population are more likely to reproduce
– Sexual reproduction
  • Crossing over subexpressions of two expressions
– Mutation
Genetic Programming
[Flowchart: Create initial population (initial solutions), Evaluation, Selection, Generation of variants (mutation and crossover), Generations < Limit?, END]
• Most expressions in the initial population are randomly generated
• It is also seeded with the compiler writer’s best guesses
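The overall loop in the flowchart can be sketched as below. This is a toy outline under stated assumptions, not the authors' system: `evolve`, `make_variant`, and the "keep the top half as parents" selection rule are hypothetical choices, and the fitness function is supplied by the caller.

```python
import random

def evolve(population, fitness, make_variant, generations, rng):
    """Run a fixed number of generations and return the fittest individual.

    Selection here is simple truncation (an assumption): the better half
    survives unchanged and also serves as parents for the new variants.
    """
    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)
        parents = ranked[: max(2, len(ranked) // 2)]
        children = [
            make_variant(rng.choice(parents), rng.choice(parents), rng)
            for _ in range(len(population) - len(parents))
        ]
        population = parents + children
    return max(population, key=fitness)
```

Because the best individual is always kept among the parents, the best fitness never decreases from one generation to the next. The initial `population` is where random expressions and the compiler writer's seeds would go.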
Genetic Programming
• Each expression is evaluated by compiling and running benchmark(s)
• Fitness is the relative speedup over the baseline on benchmark(s)
– The baseline expression is the one that’s distributed with Trimaran
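A minimal sketch of the fitness computation, assuming cycle counts have already been measured for each benchmark. The function name and the choice to average speedups across benchmarks are assumptions; the compile-and-run step itself is omitted.

```python
def speedup_fitness(baseline_cycles, candidate_cycles):
    """Mean speedup of a candidate priority function over the baseline.

    Both arguments map a benchmark name to its measured cycle count.
    A result above 1.0 means the candidate is faster on average.
    """
    speedups = [
        baseline_cycles[name] / candidate_cycles[name]
        for name in baseline_cycles
    ]
    return sum(speedups) / len(speedups)
```

For example, a candidate that halves the cycle count on one benchmark and leaves another unchanged would score 1.5.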
Genetic Programming
• Just as with Natural Selection, the fittest individuals are more likely to survive and reproduce.
Genetic Programming
• Use crossover and mutation to generate new expressions
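Crossover and mutation on expression trees might look like the following sketch. It assumes expressions are nested tuples of the form (operator, child, ...), and it simplifies mutation to replacing a random subtree with a random terminal; all helper names are hypothetical.

```python
import random

def subtrees(expr, path=()):
    """Yield (path, subtree) pairs; a path is a tuple of child indices."""
    yield path, expr
    if isinstance(expr, tuple):
        for i, child in enumerate(expr[1:], start=1):
            yield from subtrees(child, path + (i,))

def replace_at(expr, path, new):
    """Return a copy of expr with the subtree at path replaced by new."""
    if not path:
        return new
    i = path[0]
    return expr[:i] + (replace_at(expr[i], path[1:], new),) + expr[i + 1:]

def crossover(a, b, rng):
    """Replace a random subtree of a with a random subtree of b."""
    path_a, _ = rng.choice(list(subtrees(a)))
    _, sub_b = rng.choice(list(subtrees(b)))
    return replace_at(a, path_a, sub_b)

def mutate(expr, rng, terminals):
    """Replace a random subtree with a random terminal (a simplification)."""
    path, _ = rng.choice(list(subtrees(expr)))
    return replace_at(expr, path, rng.choice(terminals))
```

Both operators return new trees and leave their inputs untouched, so parents can be reused across several offspring.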
Hyperblock Results
Compiler Specialization
[Figure: bar chart of speedup (y-axis from 0 to 3.5) for 129.compress, g721encode, g721decode, huff_dec, huff_enc, rawcaudio, rawdaudio, toast, mpeg2dec, and Average, on the train data set and an alternate data set. Data labels: 1.54 and 1.23.]
(add (sub (cmul (gt (cmul $b0 0.8982 $d17)…$d7)) (cmul $b0 0.6183 $d28)))
(add (div $d20 $d5) (tern $b2 $d0 $d9))
Hyperblock Results
A General Purpose Priority Function
[Figure: bar chart of speedup (y-axis from 0.0 to 3.5) for decodrle4, codrle4, g721decode, g721encode, rawdaudio, rawcaudio, toast, mpeg2dec, 124.m88ksim, 129.compress, huff_enc, huff_dec, and Average, on the train data set and an alternate data set. Data labels: 1.44 and 1.25.]
Cross Validation
Testing General Purpose Applicability
[Figure: bar chart of speedup (y-axis from 0.0 to 1.6) for unepic, djpeg, rasta, 023.eqntott, 132.ijpeg, 052.alvinn, 147.vortex, 085.cc1, art, 130.li, osdemo, mipmap, and Average. Data label: 1.09.]
Case Study II: Register Allocation
A General Purpose Priority Function
[Figure: bar chart of speedup (y-axis from 0.85 to 1.15) for 129.compress, g721decode, g721encode, huff_enc, huff_dec, rawcaudio, rawdaudio, mpeg2dec, and Average, on the train data set and an alternate data set. Data labels: 1.03 and 1.03.]
Register Allocation Results
Cross Validation
[Figure: bar chart of speedup (y-axis from 0.9 to 1.2) for decodrle4, codrle4, 124.m88ksim, unepic, djpeg, 023.eqntott, 132.ijpeg, 147.vortex, 085.cc1, 130.li, and Average, with one series for 32 registers and one for 64 registers. Data labels: 1.02 and 1.03.]
Conclusion
• Machine learning techniques can identify effective priority functions
• ‘Proof of concept’ by evolving two well known priority functions
• Human cycles v. computer cycles
GP Hyperblock Solutions
General Purpose
(add (sub (mul exec_ratio_mean 0.8720) 0.9400) (mul 0.4762 (cmul (not has_pointer_deref)
(mul 0.6727 num_paths) (mul 1.1609
(add (sub (mul (div num_ops dependence_height) 10.8240) exec_ratio) (sub (mul (cmul has_unsafe_jsr predict_product_mean
0.9838) (sub 1.1039 num_ops_max)) (sub (mul dependence_height_mean
num_branches_max) num_paths)))))))
Annotations on this expression:
• Intron that doesn’t affect solution
• Favor paths that don’t have pointer dereferences: the (cmul (not has_pointer_deref) …) term
• Favor highly parallel (fat) paths: the (div num_ops dependence_height) term
• If a path calls a subroutine that may have side effects, penalize it: the (cmul has_unsafe_jsr …) term
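To experiment with the evolved expression, one could parse and evaluate it with a small interpreter like the sketch below. The primitive semantics are assumptions that the slides do not define: `cmul` is taken to mean "multiply the two reals if the Boolean is true, otherwise return the second real", and `div` is taken to return 0 on division by zero. The helper names are hypothetical.

```python
import re

def parse(text):
    """Parse a parenthesized prefix expression into nested tuples."""
    tokens = re.findall(r"[()]|[^\s()]+", text)

    def read(pos):
        tok = tokens[pos]
        if tok == "(":
            items, pos = [], pos + 1
            while tokens[pos] != ")":
                item, pos = read(pos)
                items.append(item)
            return tuple(items), pos + 1
        try:
            return float(tok), pos + 1
        except ValueError:
            return tok, pos + 1  # variable or primitive name

    tree, _ = read(0)
    return tree

# Assumed primitive semantics (not defined on the slides).
PRIMS = {
    "add": lambda a, b: a + b,
    "sub": lambda a, b: a - b,
    "mul": lambda a, b: a * b,
    "div": lambda a, b: a / b if b != 0 else 0.0,
    "not": lambda a: not a,
    "cmul": lambda cond, a, b: a * b if cond else b,
}

def evaluate(tree, env):
    """Evaluate a parsed expression against a dict of path characteristics."""
    if isinstance(tree, tuple):
        return PRIMS[tree[0]](*(evaluate(arg, env) for arg in tree[1:]))
    if isinstance(tree, str):
        return env[tree]
    return tree

EVOLVED = """
(add (sub (mul exec_ratio_mean 0.8720) 0.9400) (mul 0.4762 (cmul (not has_pointer_deref)
(mul 0.6727 num_paths) (mul 1.1609
(add (sub (mul (div num_ops dependence_height) 10.8240) exec_ratio) (sub (mul (cmul has_unsafe_jsr predict_product_mean
0.9838) (sub 1.1039 num_ops_max)) (sub (mul dependence_height_mean
num_branches_max) num_paths)))))))
"""
```

Evaluating `parse(EVOLVED)` for two paths that differ only in `has_pointer_deref` or `has_unsafe_jsr` is a quick way to check the annotated behavior under these assumed semantics.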
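Case Study I: IMPACT’s Algorithm
[Figure: control-flow graph of blocks A through G with edge execution counts (4k, 24k, 4k, 22k, 2k, 2k, 25, 28k, 28k, 10, 10)]

Path        exec   haz    ops   dep   pr
A-B-D-F-G   ~0     1.0    13    4     ~0
A-B-F-G     0.14   1.0    10    4     0.21
A-C-F-G     0.79   1.0    9     2     1.44
A-C-E-F-G   0.07   0.25   13    5     0.02
A-C-E-G     ~0     0.25   11    3     ~0

(exec = execution ratio, haz = hazard factor, ops = number of operations, dep = dependence height, pr = priority)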