For many optimization problems it is possible to define a distance metric between problem variables that correlates with the likelihood and strength of interactions between the variables. For example, one may define a metric so that the dependencies between variables that are closer to each other with respect to the metric are expected to be stronger than the dependencies between variables that are further apart. The purpose of this paper is to describe a method that combines such a problem-specific distance metric with information mined from probabilistic models obtained in previous runs of estimation of distribution algorithms with the goal of solving future problem instances of similar type with increased speed, accuracy and reliability. While the focus of the paper is on additively decomposable problems and the hierarchical Bayesian optimization algorithm, it should be straightforward to generalize the approach to other model-directed optimization techniques and other problem classes. Compared to other techniques for learning from experience put forward in the past, the proposed technique is both more practical and more broadly applicable.
Distance-Based Bias in Model-Directed Optimization of Additively Decomposable Problems
Martin Pelikan and Mark W. Hauschild
Missouri Estimation of Distribution Algorithms Laboratory
Department of Mathematics and Computer Science
University of Missouri, St. Louis, MO
E-mail: [email protected]
WWW: http://martinpelikan.net/
Background
• Model-directed optimizers (MDOs) learn and use models in optimization to solve difficult optimization problems scalably and reliably.
• MDOs often provide more than the solution; they provide a set of models that reveal information about the problem.
• Learning from experience: Use models from prior runs of MDOs to introduce bias when solving problems of similar type in the future.
Purpose
• Combine prior models with a problem-specific distance metric to solve new problem instances with increased speed, accuracy, and reliability.
• Demonstrate significant speedups across a broad array of problem domains.
• Focus on the hBOA algorithm and additively decomposable functions, although the approach can be generalized to other MDOs and other problem classes.
Outline
1. Hierarchical BOA (hBOA).
2. Distance metric for ADFs.
3. Learning from experience via distance-based bias.
4. Experiments.
5. Summary and conclusions.
Hierarchical Bayesian Optimization Algorithm (hBOA)
[Figure: hBOA cycle with elements labeled "Current population", "Selection", "Bayesian network", and "New population"]
[Pelikan, Goldberg, & Cantu-Paz, 2001]
Decision Trees Represent Dependencies
[Figure: a dependency among variables X1, X2, X3, X4, stored either as a probability table or as a decision tree (more efficient)]
Learning from Experience (Transfer Learning)
• Motivation
  – When solving a problem, hBOA provides the user with a set of probabilistic models.
  – Each model encodes information about the problem, such as dependencies between variables.
  – Why not use this information when solving new problem instances of similar type?
• Example: hBOA solves 99 scheduling problems; why not use the knowledge obtained when solving the 100th instance?
How to Make it Work?
• It is straightforward to keep statistics from past hBOA runs, for example, capturing the number of dependencies between any pair of variables.
• In hBOA, this can be done by looking at the number of "splits" on variable Xi in a decision tree storing dependencies for variable Xj.
• But it is important to ensure that the statistics are meaningful with respect to the problem being solved, so that the statistics help us solve future problem instances faster and better.
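Counting splits per variable pair can be sketched as follows. This is a minimal illustration, not hBOA's actual data structures: the tree encoding (a leaf is `None`, an internal node is a tuple `(split_variable, left, right)`), the per-model dictionary, and the names `splits_in_tree` and `count_splits` are all assumptions made for the example.

```python
from collections import Counter

# Hypothetical encoding (an assumption, not hBOA's real format):
# a leaf is None; an internal node is (split_variable, left_subtree, right_subtree).

def splits_in_tree(tree):
    """Yield the split variable of every internal node of one decision tree."""
    if tree is None:
        return
    split_var, left, right = tree
    yield split_var
    yield from splits_in_tree(left)
    yield from splits_in_tree(right)

def count_splits(models):
    """Count how often the tree for X_j splits on X_i, over all models.

    Each model maps a target variable index j to its decision tree.
    Returns a Counter keyed by (i, j).
    """
    counts = Counter()
    for model in models:
        for j, tree in model.items():
            for i in splits_in_tree(tree):
                counts[(i, j)] += 1
    return counts

# Two toy models over variables X0, X1, X2.
model_a = {0: (1, None, (2, None, None)), 1: None, 2: (1, None, None)}
model_b = {0: (1, None, None), 1: None, 2: None}
counts = count_splits([model_a, model_b])
print(counts[(1, 0)])  # number of splits on X1 in the trees for X0
```

Raw pair counts like these are only meaningful if variable indices play the same role across instances, which is the concern the last bullet raises.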
Learning from Experience via Distance-Based Bias: Basic Idea
• Learning from experience using distance-based bias
  – Define distances between problem variables.
  – Mine probabilistic models from previous runs for model regularities with respect to distances.
• Mine models to estimate how strongly variables influence each other depending on their distance.
  – This should work whenever the strength of dependencies is correlated with distance.
• Apply the idea to hBOA and additively decomposable functions.
Additively Decomposable Functions
• Additively decomposable function (ADF):
  $f(X_1, \ldots, X_n) = \sum_{i=1}^{m} f_i(S_i)$
  – {S_i} are subsets of variables.
  – {f_i} are functions defining overall solution quality.
• Additively decomposable functions are often difficult to solve! Many NP-complete problems are ADFs with subproblems of 2 or 3 variables.
Define Distance Metric for ADFs Using Dependency Graph
• Create a dependency graph where variables in the same subset Si are connected.
• Define the distance between two variables as the length of the shortest path between them in the dependency graph.
• If no such path exists, set the distance to the number of variables (any existing path is shorter).
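The three steps above can be sketched with a breadth-first search. The function names `dependency_graph` and `distances_from` and the toy subsets are assumptions for illustration; the rule that unreachable variables get distance n follows the slide.

```python
from collections import deque
from itertools import combinations

def dependency_graph(n, subsets):
    """Connect every pair of variables that share a subset S_i."""
    adjacency = [set() for _ in range(n)]
    for s in subsets:
        for u, v in combinations(s, 2):
            adjacency[u].add(v)
            adjacency[v].add(u)
    return adjacency

def distances_from(source, adjacency):
    """Shortest-path distances from `source`; unreachable variables get n."""
    n = len(adjacency)
    # n doubles as the "not visited" marker: any real path has length < n.
    dist = [n] * n
    dist[source] = 0
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adjacency[u]:
            if dist[v] == n:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

# Chain X0 - X1 - X2 - X3 plus an isolated variable X4.
adjacency = dependency_graph(5, [(0, 1), (1, 2), (2, 3)])
dist0 = distances_from(0, adjacency)
print(dist0)
```

Running the search once per variable yields the full distance matrix.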
[Hauschild et al., 2008]
Define Distance Metric for ADFs Using Dependency Graph: Example
[Hauschild et al., 2008]
Motivating Example
• The proportions of splits for variables at various distances show a clear correlation between distance and split frequency:
[Figure: two panels, "NK landscapes" and "2D spin glass"]
Details of the Approach
• Denote by M the set of models from prior runs.
• Record the number of splits on any variable Xi in the decision tree for any variable Xj in model m such that the distance between Xi and Xj is d.
• Compute the probability of the kth split on a variable Xi in the decision tree for any Xj such that the distance between Xi and Xj is d, assuming (k-1) such splits:
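One estimator consistent with this description is sketched below. It is an assumption, not necessarily the authors' formula: the probability of the kth split at distance d is taken as the number of recorded trees with at least k such splits divided by the number with at least k-1. Pooling the counts over all trees and models, and the name `split_probabilities`, are also assumptions.

```python
def split_probabilities(split_counts, max_k):
    """Estimate p_1..p_max_k for one fixed distance d.

    split_counts: one integer per recorded (model, tree) pair, giving the
    number of splits at distance d in that tree (assumed pooled statistics).
    p_k = #(trees with >= k splits) / #(trees with >= k-1 splits).
    """
    probabilities = []
    for k in range(1, max_k + 1):
        at_least_k = sum(1 for s in split_counts if s >= k)
        at_least_k_minus_1 = sum(1 for s in split_counts if s >= k - 1)
        if at_least_k_minus_1 == 0:
            probabilities.append(0.0)  # no evidence for this many splits
        else:
            probabilities.append(at_least_k / at_least_k_minus_1)
    return probabilities

# Four recorded trees with 2, 1, 0 and 2 splits at distance d = 1.
probs = split_probabilities([2, 1, 0, 2], max_k=3)
print(probs)
```

Indexing the statistics by the target variable as well as by distance would be a direct variation of the same counting scheme.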
Details of the Approach
• Set the prior probability of each network structure based on the learned probabilities (kappa denotes the strength of the bias).
• Evaluate each network using a Bayesian metric
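A hedged sketch of how such a prior might be computed in log space follows. It assumes, without confirmation from the slides, that the prior is the product of the learned probabilities of every split in the network, each raised to the power kappa. The function name `log_structure_prior`, the input layout, and the `floor` guard against log(0) are all hypothetical.

```python
import math

def log_structure_prior(network_splits, learned, kappa, floor=1e-6):
    """Log prior of a network under the assumed product-of-splits form.

    network_splits: one dict per decision tree, mapping distance d to the
        number of splits at that distance in the tree.
    learned: dict mapping distance d to [p_1, p_2, ...] from prior runs.
    kappa: strength of the bias; kappa = 0 gives a uniform prior.
    floor: assumed lower bound that keeps log() finite for unseen splits.
    """
    total = 0.0
    for tree_splits in network_splits:
        for d, n_splits in tree_splits.items():
            p_at_d = learned.get(d, [])
            for k in range(1, n_splits + 1):
                p_k = p_at_d[k - 1] if k <= len(p_at_d) else 0.0
                total += kappa * math.log(max(p_k, floor))
    return total

learned = {1: [0.75, 0.5], 2: [0.25]}
network = [{1: 2}, {1: 1, 2: 1}]  # two trees
biased = log_structure_prior(network, learned, kappa=1.0)
unbiased = log_structure_prior(network, learned, kappa=0.0)
print(biased, unbiased)
```

Under this assumed form, the log prior would be added to the log of the Bayesian metric when comparing candidate networks.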
Test Problems
• Included in this paper
  – NK landscapes with nearest-neighbor interactions.
  – 2D spin glass.
• Done later on
  – 3D spin glass.
  – Minimum vertex cover for random graphs.
  – MAXSAT for 3-CNF formulas.
• Large number of different instances for each problem class (100s to 1000s each).
Experimental Methodology
• 10-fold cross-validation
  – Divide instances into 10 sets.
  – Build the bias from models on 9 sets and test it on the remaining set; repeat for every set.
  – Bottom line: No problem instance is ever used for both creating the bias and testing it.
• Bisection for getting population sizes, 10 runs for each problem instance.
• Focus on multiplicative speedups
  – How many times faster with the use of bias?
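The methodology above can be sketched in a few functions. Everything here is illustrative: `cross_validation_folds`, `bisect_population_size`, and `speedup` are hypothetical names, the 10% tolerance is an assumption, and `succeeds` stands in for a predicate that would, for example, run the optimizer several times at a given population size and report whether every run found the optimum.

```python
def cross_validation_folds(instances, folds=10):
    """Yield (training_set, test_set) pairs; each instance is tested once."""
    parts = [instances[i::folds] for i in range(folds)]
    for i, test_set in enumerate(parts):
        training_set = [x for j, part in enumerate(parts) if j != i for x in part]
        yield training_set, test_set

def bisect_population_size(succeeds, start=10, tolerance=0.1):
    """Find a near-minimal population size for which succeeds(n) is True.

    Assumes success is monotone in n and start >= 2. First doubles the size
    until success, then bisects between the last failure and first success.
    """
    high = start
    while not succeeds(high):
        high *= 2
    low = high // 2
    while high - low > 1 and (high - low) / low > tolerance:
        mid = (low + high) // 2
        if succeeds(mid):
            high = mid
        else:
            low = mid
    return high

def speedup(cost_without_bias, cost_with_bias):
    """Multiplicative speedup: how many times faster the biased run is."""
    return cost_without_bias / cost_with_bias

splits = list(cross_validation_folds(list(range(25))))
population = bisect_population_size(lambda n: n >= 137)  # toy predicate
print(len(splits), population, speedup(300.0, 100.0))
```

In a real experiment the bias would be built from models of the training instances only, which is what keeps the test instances unseen.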
Results on NK Landscapes
Results on Minimum Vertex Cover
Results on 2D Spin Glass
Results on 3D Spin Glass
Results on MAXSAT
More Results to be Published Soon
• Nearly identical speedups if bias is based on problems of smaller size.
• Significant speedups even if bias is based on another class of ADFs (e.g. models from NK landscapes used to solve MVC).
• Nearly multiplicative speedups in combination with other efficiency enhancements (e.g. sporadic model building).
• So far not a single problem class for which the bias does not yield significant speedups.
Results Applicable in Other Contexts
• Approach can be applied to other model-directed optimizers, such as ECGA, LTGA, or mGA.
• Approach can be applied to other problem classes for which a distance metric can be defined, such as QAP or scheduling problems.
• This work demonstrates the potential, but more work remains to be done in the future.
Summary and Conclusions
• Proposed a practical approach to using models from prior runs of model-directed optimizers to bias optimization of future problem instances.
• Demonstrated significant speedups across a number of problem domains and settings, including a number of scenarios that are not possible with related techniques proposed in the past.
• Approach is ready to be applied in a different context.
Acknowledgments
• Support was provided by
  – NSF grants ECS-0547013 and IIS-1115352.
  – ITS at the University of Missouri in St. Louis.
  – University of Missouri Bioinformatics Consortium.
• Get the papers at
  http://medal-lab.org/files/2012001.pdf
  http://medal-lab.org/files/2012004.pdf