AI-Augmented Algorithms – How I Learned to Stop Worrying and Love Choice

Lars Kotthoff
University of Wyoming

St Andrews, 25 July 2018
Outline

▷ Big Picture
▷ Motivation
▷ Algorithm Selection and Portfolios
▷ Algorithm Configuration
▷ Outlook
Big Picture

▷ advance the state of the art through meta-algorithmic techniques
▷ rather than inventing new things, use existing things more intelligently – automatically
▷ invent new things through combinations of existing things
Motivation – What Difference Does It Make?
Prominent Application

Fréchette, Alexandre, Neil Newman, and Kevin Leyton-Brown. “Solving the Station Repacking Problem.” In Association for the Advancement of Artificial Intelligence (AAAI), 2016.
Performance Differences

[Scatter plot: Virtual Best SAT vs. Virtual Best CSP runtimes, both axes log-scaled from 0.1 to 1000 seconds.]

Hurley, Barry, Lars Kotthoff, Yuri Malitsky, and Barry O’Sullivan. “Proteus: A Hierarchical Portfolio of Solvers and Transformations.” In CPAIOR, 2014.
Leveraging the Differences

Method                     Mean PAR10   Number Solved
VB Proteus                         97            1493
Proteus                          1774            1424
VB CSP                           3577            1349
VB SAT                          17373             775
VB Direct Order Encoding        17637             764
VB Direct Encoding              21736             593
VB Support Encoding             21986             583

Hurley, Barry, Lars Kotthoff, Yuri Malitsky, and Barry O’Sullivan. “Proteus: A Hierarchical Portfolio of Solvers and Transformations.” In CPAIOR, 2014.
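PAR10 is the penalized average runtime with penalty factor 10: every run that does not finish within the timeout counts as 10 times the timeout. A minimal sketch of the computation in Python; the 3600-second timeout is an illustrative assumption, not taken from the paper:

    # Penalized average runtime, factor 10: unsolved runs count as 10x the timeout.
    def par10(runtimes, solved, timeout=3600.0):
        scores = [t if ok else 10 * timeout for t, ok in zip(runtimes, solved)]
        return sum(scores) / len(scores)

    print(par10([12.3, 450.0], [True, False]))  # (12.3 + 36000.0) / 2 = 18006.15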
Performance Improvements

Configuration of a SAT solver for verification [Hutter et al., 2007]
– ran FocusedILS, 2 days × 10 machines, on a training set from each benchmark
– compared to the manually-engineered default (1 week of performance tuning, competitive with the state of the art)
– comparison on unseen test instances

[Two scatter plots of SPEAR runtimes, original default vs. optimized, both axes log-scaled from 10⁻² to 10⁴ seconds: a 4.5-fold speedup on hardware verification (optimized for IBM-BMC) and a 500-fold speedup (optimized for SWV) that won the QF BV category in the 2007 SMT competition. Source: Hutter & Lindauer, AC tutorial, AAAI 2016, Phoenix, USA.]

Hutter, Frank, Domagoj Babic, Holger H. Hoos, and Alan J. Hu. “Boosting Verification by Automatic Tuning of Decision Procedures.” In FMCAD ’07: Proceedings of the Formal Methods in Computer Aided Design, 27–34. Washington, DC, USA: IEEE Computer Society, 2007.
Common Theme
Performance models of black-box processes
▷ also called surrogate models
▷ replace the expensive underlying process with a cheap approximate model
▷ build the approximate model from real evaluations using machine learning techniques
▷ no knowledge of what the underlying process does is required (but can be helpful)
▷ allow better understanding of the underlying process through interrogation of the model
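A minimal sketch of the idea, assuming scikit-learn is available and using a stand-in function in place of the expensive process: fit a random-forest surrogate to a handful of real evaluations, then interrogate the cheap model instead of the process itself.

    # Sketch: replace an expensive black-box process with a cheap surrogate model.
    import numpy as np
    from sklearn.ensemble import RandomForestRegressor

    def expensive_process(x):                    # stand-in for e.g. a solver run
        return np.sin(3 * x) + 0.1 * x ** 2

    X = np.random.uniform(-2, 2, size=(30, 1))   # the real (costly) evaluations
    y = expensive_process(X.ravel())

    surrogate = RandomForestRegressor(n_estimators=100).fit(X, y)

    # Query the cheap model densely -- something the real process may not allow.
    grid = np.linspace(-2, 2, 1000).reshape(-1, 1)
    print("approximate minimiser:", grid[np.argmin(surrogate.predict(grid)), 0])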
Algorithm Selection
Algorithm Selection
Given a problem, choose the best algorithm to solve it.
Rice, John R. “The Algorithm Selection Problem.” Advances in Computers 15 (1976): 65–118.
Algorithm Selection
[Diagram: training instances (Instance 1, 2, 3, …) pass through feature extraction; together with the performance of the portfolio’s algorithms (Algorithm 1, 2, 3) on them, this trains a performance model. New instances (Instance 4, 5, 6, …) pass through the same feature extraction, and the model selects an algorithm for each, e.g. Instance 4: Algorithm 2, Instance 5: Algorithm 3, Instance 6: Algorithm 3.]
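A minimal sketch of this workflow, assuming scikit-learn and random placeholder data in place of real instance features and runtimes: the performance model here is a classifier that maps an instance’s features to the algorithm that is fastest on it.

    # Sketch: learn a mapping from instance features to the best portfolio algorithm.
    import numpy as np
    from sklearn.ensemble import RandomForestClassifier

    rng = np.random.default_rng(0)
    features = rng.random((100, 5))   # one feature vector per training instance
    runtimes = rng.random((100, 3))   # runtime of each of 3 algorithms per instance
    best = runtimes.argmin(axis=1)    # training label: index of the fastest algorithm

    selector = RandomForestClassifier().fit(features, best)

    new_instance = rng.random((1, 5))  # features of an unseen instance
    print("run algorithm", selector.predict(new_instance)[0])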
Algorithm Portfolios
▷ instead of a single algorithm, use several complementary algorithms
▷ idea from Economics – minimise risk by spreading it out across several securities
▷ same for computational problems – minimise risk of an algorithm performing poorly
▷ in practice often constructed from competition winners

Huberman, Bernardo A., Rajan M. Lukose, and Tad Hogg. “An Economics Approach to Hard Computational Problems.” Science 275, no. 5296 (1997): 51–54. doi:10.1126/science.275.5296.51.
Algorithms
“algorithm” is used in a very loose sense:
▷ algorithms
▷ heuristics
▷ machine learning models
▷ consistency levels
▷ …
Parallel Portfolios
Why not simply run all algorithms in parallel?
▷ enough resources may not be available, and running everything wastes resources
▷ the algorithms may themselves be parallelized
▷ memory contention
Building an Algorithm Selection System
▷ most approaches rely on machine learning
▷ train with representative data, i.e. the performance of all algorithms in the portfolio on a number of instances
▷ evaluate performance on a separate set of instances (see the sketch after this list)
▷ potentially a large amount of prep work
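A sketch of the evaluation step, under the same placeholder-data assumptions as before: hold out part of the instance set and compare the selector’s mean runtime to the virtual best (an oracle that always picks the fastest algorithm) and the single best fixed algorithm.

    # Sketch: evaluate a selector on held-out instances against standard baselines.
    import numpy as np
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import train_test_split

    rng = np.random.default_rng(1)
    features, runtimes = rng.random((200, 5)), rng.random((200, 3))
    X_tr, X_te, R_tr, R_te = train_test_split(features, runtimes, random_state=0)

    selector = RandomForestClassifier().fit(X_tr, R_tr.argmin(axis=1))
    chosen = selector.predict(X_te)

    print("selector    ", R_te[np.arange(len(R_te)), chosen].mean())
    print("virtual best", R_te.min(axis=1).mean())   # oracle lower bound
    print("single best ", R_te.mean(axis=0).min())   # best fixed algorithm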
Key Components of an Algorithm Selection System
▷ feature extraction
▷ performance model
▷ prediction-based selector/scheduler

optional:
▷ presolver
▷ secondary/hierarchical models and predictors (e.g. for feature extraction time)
Types of Performance Models

▷ Regression models: predict each algorithm’s performance (e.g. A1: 1.2, A2: 4.5, A3: 3.9) and select the algorithm with the best prediction.
▷ Classification model: predict the best algorithm for an instance directly (a single model whose labels are A1, A2, A3).
▷ Pairwise classification models: one model per pair of algorithms (A1 vs. A2, A1 vs. A3, …) predicts which of the two is better; select by vote count (e.g. A1: 1 vote, A2: 0 votes, A3: 2 votes).
▷ Pairwise regression models: predict the performance difference for each pair (A1 − A2, A1 − A3, …) and aggregate into per-algorithm scores (e.g. A1: −1.3, A2: 0.4, A3: 1.7).

Each model maps instance features (Instance 1, 2, 3, …) to a selection (e.g. Instance 1: Algorithm 2, Instance 2: Algorithm 1, Instance 3: Algorithm 3).
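A sketch of the pairwise-classification variant, again on placeholder data: one classifier per pair of algorithms predicts the winner of that pair, and the algorithm with the most votes is selected.

    # Sketch: pairwise classification -- one model per algorithm pair, select by votes.
    from itertools import combinations
    import numpy as np
    from sklearn.ensemble import RandomForestClassifier

    rng = np.random.default_rng(2)
    features, runtimes = rng.random((100, 5)), rng.random((100, 3))
    n_algos = runtimes.shape[1]

    pair_models = {}
    for i, j in combinations(range(n_algos), 2):
        labels = (runtimes[:, i] < runtimes[:, j]).astype(int)  # 1 if i beats j
        pair_models[(i, j)] = RandomForestClassifier().fit(features, labels)

    x = rng.random((1, 5))            # features of an unseen instance
    votes = np.zeros(n_algos)
    for (i, j), model in pair_models.items():
        winner = i if model.predict(x)[0] == 1 else j
        votes[winner] += 1
    print("votes:", votes, "-> run algorithm", votes.argmax())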
Benchmark Library – ASlib
▷ currently 29 data sets/scenarios with more in preparation
▷ SAT, CSP, QBF, ASP, MAXSAT, OR, machine learning, …
▷ includes data used frequently in the literature that you may want to evaluate your approach on
▷ performance of common approaches that you can compare to
▷ http://aslib.net

Bischl, Bernd, Pascal Kerschke, Lars Kotthoff, Marius Lindauer, Yuri Malitsky, Alexandre Fréchette, Holger H. Hoos, et al. “ASlib: A Benchmark Library for Algorithm Selection.” Artificial Intelligence Journal (AIJ), no. 237 (2016): 41–58.
(Much) More Information
http://larskotthoff.github.io/assurvey/
Kotthoff, Lars. “Algorithm Selection for Combinatorial Search Problems: A Survey.” AI Magazine 35, no. 3 (2014): 48–60.
Algorithm Configuration
Algorithm Configuration
Given a (set of) problem(s), find the best parameter configuration.
Parameters?
▷ anything you can change that makes sense to change
▷ e.g. search heuristic, variable ordering, type of global constraint decomposition
▷ not the random seed, whether to enable debugging, etc.
▷ some will affect performance, others will have no effect at all
Automated Algorithm Configuration
▷ no background knowledge about parameters or the algorithm
▷ as little manual intervention as possible
▷ failures are handled appropriately
▷ resources are not wasted
▷ can run unattended on large-scale compute infrastructure
Algorithm Configuration
Frank Hutter and Marius Lindauer, “Algorithm Configuration: A Hands-on Tutorial”, AAAI 2016.
General Approach
▷ evaluate the algorithm as a black-box function
▷ observe the effect of the parameters without knowing the inner workings
▷ decide where to evaluate next
▷ balance diversification/exploration and intensification/exploitation
When are we done?
▷ most approaches are incomplete
▷ they cannot prove optimality and are not guaranteed to find the optimal solution in finite time
▷ performance is highly dependent on the configuration space

→ How do we know when to stop?
Time Budget
How much time / how many function evaluations?
▷ too much → wasted resources
▷ too little → suboptimal result
▷ use statistical tests
▷ evaluate on parts of the instance set
▷ for runtime: adaptive capping (see the sketch after this list)
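A sketch of adaptive capping; run_with_cutoff is a hypothetical helper (not from any particular tool) that runs the target algorithm under a time limit and returns the runtime, or None on timeout. The point is that a challenger configuration never gets more time than the incumbent needed:

    # Sketch: adaptive capping -- cap a challenger's run at the incumbent's time.
    # run_with_cutoff(config, cutoff) is a hypothetical helper, assumed to return
    # the runtime in seconds, or None if the run hit the cutoff.
    def race(incumbent_time, challenger, run_with_cutoff, slack=1.0):
        t = run_with_cutoff(challenger, cutoff=slack * incumbent_time)
        if t is None or t >= incumbent_time:
            return incumbent_time, False   # keep the incumbent
        return t, True                     # challenger becomes the new incumbent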
Grid and Random Search
▷ evaluate certain points in the parameter space

Bergstra, James, and Yoshua Bengio. “Random Search for Hyper-Parameter Optimization.” J. Mach. Learn. Res. 13, no. 1 (February 2012): 281–305.
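A minimal random-search sketch; the parameter names and the evaluate function are illustrative placeholders, not tied to any particular solver:

    # Sketch: random search over a mixed discrete/continuous parameter space.
    import random

    def random_config():
        return {"restarts": random.randint(1, 100),   # integer parameter
                "decay": random.uniform(0.5, 1.0)}    # continuous parameter

    def random_search(evaluate, budget=100):
        best, best_cost = None, float("inf")
        for _ in range(budget):
            cfg = random_config()
            cost = evaluate(cfg)          # e.g. runtime of the target algorithm
            if cost < best_cost:
                best, best_cost = cfg, cost
        return best, best_cost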
Model-Based Search
▷ evaluate a small number of configurations
▷ build a model of the parameter–performance surface based on the results
▷ use the model to predict where to evaluate next
▷ repeat
▷ allows targeted exploration of new configurations
▷ can take instance features into account, like algorithm selection

Hutter, Frank, Holger H. Hoos, and Kevin Leyton-Brown. “Sequential Model-Based Optimization for General Algorithm Configuration.” In LION 5, 507–23, 2011.
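A minimal sketch of this loop, assuming scikit-learn’s Gaussian process as the surrogate and expected improvement as the acquisition function; the objective f stands in for an expensive algorithm run:

    # Sketch: sequential model-based optimization with expected improvement.
    import numpy as np
    from scipy.stats import norm
    from sklearn.gaussian_process import GaussianProcessRegressor

    def f(x):                                  # placeholder expensive objective
        return np.sin(3 * x) + x ** 2

    X = np.random.uniform(-1, 1, (5, 1))       # initial design
    y = f(X.ravel())
    cand = np.linspace(-1, 1, 500).reshape(-1, 1)

    for _ in range(10):
        gp = GaussianProcessRegressor(normalize_y=True).fit(X, y)
        mu, sigma = gp.predict(cand, return_std=True)
        imp = y.min() - mu                     # predicted improvement over incumbent
        z = imp / np.maximum(sigma, 1e-12)
        ei = imp * norm.cdf(z) + sigma * norm.pdf(z)   # expected improvement
        x_next = cand[np.argmax(ei)]           # evaluate where EI peaks
        X = np.vstack([X, [x_next]])
        y = np.append(y, f(x_next[0]))

    print("best x:", X[np.argmin(y), 0], "best y:", y.min())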
Model-Based Search Example

[Figure series: ten iterations of model-based search on a one-dimensional test function over x ∈ [−1, 1]. Each panel shows the points evaluated so far (initial design and sequential proposals), the true function (y), the model prediction (yhat), and the expected improvement (ei); each iteration evaluates next where the expected improvement peaks.]
Benchmark Library – AClib
▷ ASP, MIP, planning, machine learning, …
▷ 4 algorithm configuration tools from the literature already integrated
▷ https://bitbucket.org/mlindauer/aclib2

Hutter, Frank, Manuel López-Ibáñez, Chris Fawcett, Marius Lindauer, Holger H. Hoos, Kevin Leyton-Brown, and Thomas Stützle. “AClib: A Benchmark Library for Algorithm Configuration.” In Learning and Intelligent Optimization, 36–40. Cham: Springer International Publishing, 2014.
Outlook
Quo Vadis, Software Engineering?

[Figure: a software development workflow ending in “Run”; a second build of the slide augments the picture with “+ AI”.]

Hoos, Holger H. “Programming by Optimization.” Communications of the Association for Computing Machinery (CACM) 55, no. 2 (February 2012): 70–80. https://doi.org/10.1145/2076450.2076469.
Meta-Algorithmics in the Physical Realm – AI and Lasers
Tools and Resources
LLAMA         https://bitbucket.org/lkotthoff/llama
SATzilla      http://www.cs.ubc.ca/labs/beta/Projects/SATzilla/
iRace         http://iridia.ulb.ac.be/irace/
mlrMBO        https://github.com/mlr-org/mlrMBO
SMAC          http://www.cs.ubc.ca/labs/beta/Projects/SMAC/
Spearmint     https://github.com/HIPS/Spearmint
TPE           https://jaberg.github.io/hyperopt/
autofolio     https://bitbucket.org/mlindauer/autofolio/
Auto-WEKA     http://www.cs.ubc.ca/labs/beta/Projects/autoweka/
Auto-sklearn  https://github.com/automl/auto-sklearn
Summary
Algorithm Selection: choose the best algorithm for solving a problem
Algorithm Configuration: choose the best parameter configuration for solving a problem with an algorithm

▷ mature research areas
▷ configuration and selection can be combined
▷ effective tools are available
▷ COnfiguration and SElection of ALgorithms group COSEAL: http://www.coseal.net

Don’t set parameters prematurely, embrace choice!
I’m hiring!
Several funded graduate positions available.