
Global Optimization Software

Doron Pearl, Jonathan Li, Olesya Peshko, Xie Feng

What is global optimization?

Global optimization aims at finding the best solution of a constrained optimization problem which may also have various local optima.

General global optimization problem (GOP)

Given a bounded, robust set D in the real n-space R^n and a continuous function f: D → R, find

global min f(x), subject to the constraint x ∈ D.

Note: a robust set is the closure of its nonempty interior.

First, we have to tell you:

No single optimization package can solve all global optimization problems efficiently.

Two General Classes In Global Optimization

Deterministic
- Grid search
- Branch and bound

Stochastic
- Simulated annealing
- Tabu search
- Genetic algorithms
- Statistical algorithms

Deterministic class and software

Actually, we can further classify the deterministic class into two different sub-classes:

An explicit function is required, such as BARON.

An explicit function isn't required, such as LGO (Lipschitz Global Optimization).

Remark:
1. Among present deterministic solvers, the first class contains more solvers than the second.
2. Even though LGO is regarded as solving the problem in a deterministic way, the solution isn't always guaranteed to be the deterministic global optimum.
3. There are some more solvers in the first class that won't be discussed in detail, but in later slides they will be included in the comparison.

LGO Lipschitz Global Optimization

min f(x)

x ∈ D = {x ∈ D0 : f_j(x) ≤ 0, j = 1, ..., J}

D0 ⊆ R^n represents a 'simple' explicit constraint set: frequently, it is a finite n-dimensional interval or simplex, or R^n.

Furthermore, the objective function and constraint functions are Lipschitz-continuous on D0. That is, they satisfy the relation

|f_j(x1) − f_j(x2)| ≤ L_j ||x1 − x2||

LGO Lipschitz Global Optimization

Three Key Components in the approach:

Lipschitz Continuous Function

Adaptive Partition Strategy

Branch and Bound

Lipschitz Continuous Function

With the Lipschitz continuity property

|f_j(x1) − f_j(x2)| ≤ L_j ||x1 − x2||

we can conclude the following observations about the function:

The 'slope' is bounded with respect to the system input variables x.

If a function is Lipschitz continuous on a compact domain, the bound of the function is guaranteed to exist.

On the other hand, without this property, on the sole basis of sample points and corresponding function values, one cannot provide a lower bound after any finite number of function evaluations on D.

Remark: It's not necessary to compute L in global optimization, but its existence is a necessary condition for having a lower bound.

Lipschitz Continuous Function

For a Lipschitz continuous function, the more sample points we have, the more accurate an approximation of the lower bound we can obtain.
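As an illustration only (not LGO's implementation), here is a minimal Python sketch of such a sample-based Lipschitz lower bound over a box; the toy function, box, and Lipschitz constant below are made-up values:

```python
import itertools
import math

def lipschitz_lower_bound(f, samples, lo, hi, L):
    """Valid lower bound for min f over the box [lo, hi], using the
    Lipschitz inequality f(x) >= f(x_i) - L * ||x - x_i||.  The maximum
    distance from a point to a box is attained at one of its corners."""
    corners = list(itertools.product(*zip(lo, hi)))
    bounds = []
    for s in samples:
        far = max(math.dist(s, c) for c in corners)
        bounds.append(f(s) - L * far)
    return max(bounds)   # each sample gives a valid bound; take the best

# toy check: f(x) = x1^2 + x2^2 on [-1, 1]^2 is Lipschitz there with L = 2*sqrt(2)
f = lambda x: sum(v * v for v in x)
lb = lipschitz_lower_bound(f, [(0.0, 0.0)], [-1.0, -1.0], [1.0, 1.0], 2 * math.sqrt(2))
```

Every additional sample contributes another candidate bound, and taking the maximum shows why more samples can only tighten the estimate.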

Adaptive partition strategy

Usually implemented on the relaxed feasible set, for example:

- Interval set: a ≤ x ≤ b (x, a, b are vectors). The strategy is to partition the interval into sub-intervals by bisection. In higher dimensions, the interval can be regarded as a box.

- Simplex set: the strategy is to partition the simplex into sub-simplices by cutting one vertex off each time.

- Convex cone set: the strategy is to partition the cone into sub-cones.

Remark: As you may see, a partition usually should:
- create linear bound constraints for each partition;
- fulfill "exhaustive search".
The choice among partition strategies usually depends on how good the relaxation is, e.g. its tightness.
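As a small illustrative sketch (the slides do not fix a branching rule; bisecting the longest edge is assumed here as a common choice), the interval/box bisection step might look like:

```python
def bisect_box(lo, hi):
    """Bisect an n-dimensional box [lo, hi] along its longest edge,
    producing two sub-boxes whose bounds are linear constraints."""
    k = max(range(len(lo)), key=lambda i: hi[i] - lo[i])   # longest edge
    mid = 0.5 * (lo[k] + hi[k])
    left_hi, right_lo = list(hi), list(lo)
    left_hi[k], right_lo[k] = mid, mid
    return (lo, left_hi), (right_lo, hi)

# example: the box [0,2] x [0,1] splits along its first (longest) edge
left, right = bisect_box([0.0, 0.0], [2.0, 1.0])
```

Each sub-box is again described purely by linear bound constraints, which is the property the remark above asks of a partition.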

Example of computing L when having relaxed bound constraints

Let

f(x) = Σ_{k=1}^{n} ((1/2) p_k x_k² + q_k x_k + r_k)

P = {x ∈ R^n : a_k ≤ x_k ≤ b_k, k = 1, ..., n}

with p_k ≠ 0 and a_k < b_k (k = 1, 2, ..., n). Then

L = [ Σ_{k∈I1} (p_k a_k + q_k)² + Σ_{k∈I2} (p_k b_k + q_k)² ]^{1/2}

where

I1 = {k : −q_k / p_k ≥ (1/2)(a_k + b_k)}
I2 = {k : −q_k / p_k < (1/2)(a_k + b_k)}
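The formula can be checked numerically with a small Python sketch (the function name and the toy one-dimensional data are illustrative):

```python
import math

def lipschitz_L(p, q, a, b):
    """L for f(x) = sum_k ((1/2) p_k x_k^2 + q_k x_k + r_k) on the box
    a_k <= x_k <= b_k: each gradient component p_k x_k + q_k attains its
    largest magnitude at the endpoint farther from the vertex -q_k/p_k."""
    s = 0.0
    for pk, qk, ak, bk in zip(p, q, a, b):
        if -qk / pk >= 0.5 * (ak + bk):     # a_k is the farther endpoint
            s += (pk * ak + qk) ** 2
        else:                               # b_k is the farther endpoint
            s += (pk * bk + qk) ** 2
    return math.sqrt(s)

# example: f(x) = x^2 (p=2, q=0) on [-1, 3]; max |f'(x)| = |2*3| = 6
L = lipschitz_L([2.0], [0.0], [-1.0], [3.0])
```

The index sets I1 and I2 simply pick, per coordinate, whichever endpoint of [a_k, b_k] gives the larger gradient magnitude.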

Branch and Bound

Branch literally means that the algorithm tries to partition the feasible region in some fashion.

Bound means that while searching, we estimate the objective value using an upper bound and a lower bound.

Upper bound: in each feasible region, the local optimum found gives the upper bound, or else the function value at a randomly sampled point.

Lower bound: usually obtained from a certain approximation.

Branch and Bound

Flowchart of the branch-and-bound procedure:

1. Set up a domain D' with simple explicit constraints.
2. Pick sample points x1, x2, ..., and calculate f(x1), f(x2), ....
3. Do a local search; set the local optimum as the upper bound, and record it.
4. Compute the lower bound for the bounded area.
5. If upper bound = lower bound, stop.
6. Otherwise, partition the domain D.
7. Compute the lower bound for each partition.
8. Do a local search; update the upper bound with the local optimum, and record it.
9. If a partition's lower bound exceeds the latest upper bound, stop searching that partition; otherwise, repeat from step 6 on the remaining partitions.
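The flowchart can be sketched in Python for the one-dimensional case; this is a simplified illustration (midpoint evaluations stand in for the local-search step, and `tol` is an assumed stopping tolerance, neither taken from the slides):

```python
import heapq

def lipschitz_bb(f, a, b, L, tol=1e-4):
    """One-dimensional branch and bound following the flowchart:
    midpoint evaluations give the upper-bound record, f(mid) - L*w/2
    gives a valid Lipschitz lower bound for an interval of width w,
    and partitions whose lower bound exceeds the latest upper bound
    are discarded."""
    mid = 0.5 * (a + b)
    upper, best_x = f(mid), mid
    heap = [(upper - L * (b - a) / 2, a, b)]    # (lower bound, interval)
    while heap:
        lb, lo, hi = heapq.heappop(heap)
        if lb > upper - tol:                    # lower bound > latest upper bound:
            continue                            # stop searching this partition
        m = 0.5 * (lo + hi)
        for sa, sb in ((lo, m), (m, hi)):       # partition the domain (bisection)
            c = 0.5 * (sa + sb)
            fc = f(c)
            if fc < upper:
                upper, best_x = fc, c           # update and record the upper bound
            heapq.heappush(heap, (fc - L * (sb - sa) / 2, sa, sb))
    return best_x, upper

# example: f(x) = (x - 0.3)^2 on [0, 1] is Lipschitz there with L = 2
x_best, f_best = lipschitz_bb(lambda t: (t - 0.3) ** 2, 0.0, 1.0, 2.0)
```

The priority queue always expands the partition with the smallest lower bound first, so pruning removes regions that provably cannot beat the recorded upper bound.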

Three approaches in LGO

LGO integrates a suite of robust and efficient global and local scope solvers. These include:

adaptive partition and search (branch-and-bound)

adaptive global random search (single & multi-start)

constrained local optimization (reduced gradient method)

Remark: The random-search options are also commonly used to handle black-box functions.

General global optimization model in LGO

x is a real n-vector (to describe feasible decisions)

a, b are finite, component-wise vector bounds imposed on x

f(x) is a continuous function (to describe the model objective)

g(x) is a continuous vector function (to describe the model constraints; the inequality sign is interpreted component-wise).

min f(x)
g(x) ≤ 0
a ≤ x ≤ b

LGO interface

Library: LGO solver suite for C and Fortran compilers, with a text I/O interface, or embedded in a Windows GUI

Spreadsheets: Excel Premium Solver Platform/LGO solver engine, in cooperation with Frontline Systems

Modeling language: GAMS/LGO solver engine, in cooperation with the GAMS Development Corporation

Integrated technical computing systems:
- AIMMS/LGO solver engine, in cooperation with Paragon Decision Technologies
- Global Optimization Toolbox for Maple, in cooperation with Maplesoft
- MPL/LGO solver engine, in cooperation with Maximal Software
- MathOptimizer for Mathematica, a native Mathematica product
- MathOptimizer Professional (LGO for Mathematica), in cooperation with Dr. Frank Kampas
- TOMLAB/LGO for Matlab, in cooperation with TOMLAB Optimization

LGO

LGO has been used to solve models with up to one thousand variables and constraints.

These packages were developed by J. D. Pinter who, since his PhD in optimization (1982, Moscow State University), has become an internationally known expert in the field. One of his textbooks won an international award (the INFORMS Computing Society Prize for Research Excellence).

Further detail will be discussed in later slides.

LGO testing

In our numerical experiments described here, we have used LGO to solve a set of GAMS models based on the Handbook of Test Problems in Local and Global Optimization by Floudas et al. (1999). For brevity, we shall refer to the model collection studied as HTPLGO. The set of models considered is available from GLOBALLib (GAMS Global World, 2003).

GLOBALLib is a collection of nonlinear models that provides GO solver developers with a large and varied set of theoretical and practical test models.

The entire test set used consists of 117 models.

The test models included have up to 142 variables, 109 constraints, 729 non-zero and 567 nonlinear-non-zero model terms.

LGO test result

Figure 3. Efficiency profiles: all LGO solver modes are applied to GLOBALLib models.

Operational mode(for brevity we shall use opmode)

.opmode 0: local search from a given nominal solution, without a preceding global search mode (LS)

·opmode 1: global branch-and-bound search and local search (BB+LS)

· opmode 2: global adaptive random search and local search (GARS+LS)

· opmode 3: global multi-start random search and local search (MS+LS)

Using LGO

There are usually five stages in using LGO: problem definition, problem compilation, model parameters, model solution, and result analysis.

- Problem definition: define the functions.
- Problem compilation: link to the object and library files.
- Model parameters: set up the lower bounds, upper bounds, number of constraints, etc.
- Model solution: there is an automatic mode and an interactive mode.
  - Automatic mode: the program determines which of the four modules to use, based on the input file.
  - Interactive mode: the user determines which modules to use, in which order, and the maximum search effort.
- Result analysis.

Price

GAMS/LGO: commercial $1,600; academic $320

Premium Solver Platform: $1,495

TOMLAB/LGO: commercial $1,600; academic $600

Some important facts

Continuity of the functions (objective and constraints) defining the global optimization model is sufficient to use the LGO software.

Naturally, in such cases only a statistical guarantee can be given for the global lower bound estimate. The lower bound generated by LGO is statistical in all cases, since it is based partially on pseudo-random sampling.

LGO can give the global optimum deterministically only when it is based on a deterministic L and deterministic bounds.

Comparison of complete global optimization solvers

Solvers being compared:We present test results for the global optimization systems BARON, COCOS, GlobSol, ICOS, LGO/GAMS, LINGO, OQNLP Premium Solver, and for comparison the local solver MINOS. All tests were made on the COCONUT benchmarking suite.

Outline of the test set: The test set, drawn from three libraries, consists of 1322 models varying in dimension (number of variables) between 1 and over 1000, coded in the modeling language AMPL.

Library 1: GAMS Global library; real-life global optimization problems with industrial relevance, but currently most problems on this site are without computational results.

Library 2: CUTE library; consists of global (and some local) optimization problems with nonempty feasible domains.

Library 3: EPFL library; consists of pure constraint satisfaction problems (constant objective function), almost all being feasible.

Comparison of complete global optimization solvers(2)

Models excluded from the libraries:
1. Certain models difficult for testing, where the difficulty is unrelated to the solver.
2. Models containing functions that aren't supported by the ampl2dag converter.
3. Models in Library 3 that actually contain an objective function.
4. Models showing strange behavior, which might be caused by a bug in the converter.
5. Models for which no solver can get an optimal solution.

Brief overview of special characteristics of other solvers

GlobSol and Premium Solver exploit interval methods.

ICOS is a pure constraint solver, which currently cannot handle models with an objective function

COCOS contains many modules that can be combined to yield various combination strategies for global optimization.

Characteristic comparison

Important related details

All solvers are tested with the default options suggested by the providers of the codes.

The timeout limit used was (scaled to a 1000 MHz machine) around 180 seconds of CPU time for models of size 1, 900 seconds for models of size 2, and 1800 seconds for models of size 3.

The solvers LGO and GlobSol required a bounded search region, and we bounded each variable between −1000 and 1000, except in a few cases where this leads to a loss of the global optimum.

The reliability of claimed results is the most poorly documented aspect of current global optimization software.

Reliability

Reliability

Performance

Note: Different solvers have different stopping criteria, which should also be considered. For example:
- BARON, LINGO: stop when time is up
- LGO, OQNLP: stop based on certain statistics

Final Remark

In a few cases, GlobSol and Premium Solver found solutions where BARON failed, which suggests that BARON would benefit from some of the advanced interval techniques implemented in GlobSol and Premium Solver.

However, GlobSol and Premium Solver are much less efficient in both time and solving capacity than BARON. To a large extent this may be due to the fact that both GlobSol and Premium Solver strive for mathematical rigor, resulting in a significant slowdown due to the need for rigorously validated techniques.

References

Janos D. Pinter's (LGO's creator) website: http://myweb.dal.ca/jdpinter/index.html

"Global Optimization in Action: Continuous and Lipschitz Optimization: Algorithms, Implementations and Applications", Janos D. Pinter

"Introduction to Global Optimization", Reiner Horst, Panos M. Pardalos and Nguyen V. Thoai

"A comparison of complete global optimization solvers", Arnold Neumaier, Oleg Shcherbina, Waltraud Huyer, Tamas Vinko, Mathematical Programming

Website maintained by Arnold Neumaier: http://www.mat.univie.ac.at/~neum/glopt.html

P.S. If you would like to check the above two books, go ask Prof. Tamas; he will be generous to anyone who wants to learn.

Global Optimization: BARON

Feng Xie

BARON Branch And Reduce Optimization Navigator

It derives its name from “its combining interval analysis and duality in its reduce arsenal with enhanced branch and bound concepts as it winds its way through the hills and valleys of complex optimization problems in search of global solutions” (N. V. Sahinidis, University of Illinois at Urbana-Champaign, Department of Chemical Engineering).

• Basically an improved branch and bound algorithm.

• Range reduction is the major feature of the branch and reduce algorithm.

Two range reduction techniques are used:

• Optimality-based.• Feasibility-based.

Branch and Reduce: Algorithm Overview

Relaxation

Sub-problem:

min f(x)
s.t. g(x) ≤ 0
x ∈ X

Relaxed problem (convex optimization):

min f̂(x)       (f̂(x) is convex)
s.t. ĝ(x) ≤ 0   (ĝ(x) is convex)
x ∈ X           (X is a box)

(Figure: converting a non-convex objective function into a convex one, plotted as objective vs. variable for both problems.)

Relaxation – an Example

(Figure: the problem and its relaxed problem, shown side by side.)

Perturbation Function

Given a relaxed problem (R):

min f(x)
s.t. g(x) ≤ 0
x ∈ X

the corresponding perturbed problem (Ry) is:

p(y) = min f(x)
s.t. g(x) ≤ y      (perturbation)
x ∈ X

Properties of p(y):

• p(0) is the solution of problem (R);

• p(y) is non-increasing (bigger y, bigger feasible set for (Ry));

• p(y) is a convex function (proof ignored).
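The omitted proof can be sketched briefly; assuming f and g are convex (as they are in the relaxed problem), take λ ∈ [0, 1] and minimizers x1, x2 of (R_{y1}) and (R_{y2}):

```latex
% Feasibility of x_l = \lambda x_1 + (1-\lambda) x_2 for the combined level:
%   g(x_l) \le \lambda g(x_1) + (1-\lambda) g(x_2) \le \lambda y_1 + (1-\lambda) y_2.
\begin{aligned}
p(\lambda y_1 + (1-\lambda) y_2)
  &\le f(\lambda x_1 + (1-\lambda) x_2)      && \text{($x_l$ is feasible)} \\
  &\le \lambda f(x_1) + (1-\lambda) f(x_2)   && \text{(convexity of $f$)} \\
  &=   \lambda p(y_1) + (1-\lambda) p(y_2).  && \text{($x_1, x_2$ optimal)}
\end{aligned}
```

This is exactly the convexity inequality for p.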

Perturbation Function – Another View

Let G be the set {(y, z) : y = g(x), z = f(x) for some x in X}. Then p(y) is the lower envelope of G.

(Figure, in the (y, z) plane: the region G = {(g(x), f(x))}; the curve p(y), non-increasing and convex; L = p(0), the solution of (R); and p(y1), the solution of (Ry1).)

Optimality-Based Range Reduction

A simple case:

x_j ∈ [x_j^L, x_j^U], and the constraint x_j − x_j^U ≤ 0 is active.

U is the upper bound of the global minimum.

(Figure: G and the perturbation function p over the original range [x_j^L, x_j^U], with the bounds U and L and the optimizer x_j^* marked.)

Optimality-Based Range Reduction

However, the perturbation function p is not explicitly given.

The range of x_j, [x_j^L, x_j^U], can be reduced to [κ_j, x_j^U].

(Figure: as before, but p is unknown; the reduced range [κ_j, x_j^U] is marked.)

Optimality-Based Range Reduction

By nonlinear programming duality, the line passing through the optimum point and supporting G has slope −λ_j, where λ_j is the Lagrange multiplier corresponding to the constraint x_j − x_j^U ≤ 0 in the solution of the Lagrangian dual problem.

(Figure: the supporting line of G.)

Optimality-Based Range Reduction

Use the support line z = L − λ_j (x_j − x_j^U) as the underestimator of p. We can reduce the range from [x_j^L, x_j^U] to [κ_j, x_j^U], where

κ_j = x_j^U − (U − L) / λ_j.

(Figure: the support line of G and the reduced range.)

Optimality-Based Range Reduction

The above process of range reduction can be extended to arbitrary constraints of the type g_i(x) ≤ 0.

The above range reduction process is called "optimality-based" because the new range is derived based on the optimal solution of the relaxed problem.

μ and λ are the optimal dual multipliers of the nonlinear/linear constraints and the simple bound constraints, respectively.

Range Reduction – Example Continued

Relaxed problem:

• The optimum of the relaxed problem lies at (6, 0.89), with an objective function value of −6.89 = L. A local search starting from (6, 0.89) gives U = −6.66 at (6, 0.66).

• In the solution of the relaxed problem, x1 is at its maximum (the constraint x1 − 6 ≤ 0 is active) with dual Lagrange multiplier λ = 0.2. So the lower bound of x1 can be tightened to 6 − (U − L)/λ = 4.86. Similarly, x2 can be tightened to 0.66.

• The new bounds are 4.86 ≤ x1 ≤ 6, 0.66 ≤ x2 ≤ 4.

• Reconstruct the relaxation with the new bounds (see next page).
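As a quick check of the κ_j formula with these numbers (the helper name is illustrative; with the rounded slide values L = −6.89, U = −6.66, λ = 0.2 it gives about 4.85, close to the slide's 4.86, which presumably comes from unrounded data):

```python
def reduce_lower_bound(x_upper, U, L, lam):
    """Optimality-based reduction for an active bound x_j <= x_upper:
    the support line z = L - lam * (x_j - x_upper) underestimates the
    perturbation function, so any point beating the incumbent U must
    satisfy x_j >= x_upper - (U - L) / lam."""
    return x_upper - (U - L) / lam

# slide's (rounded) numbers: L = -6.89, U = -6.66, lambda = 0.2, x1 <= 6 active
new_lo = reduce_lower_bound(6.0, -6.66, -6.89, 0.2)
```

Note how a small duality gap U − L and a large multiplier λ give the strongest tightening.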

Range Reduction – Example Continued

Relaxed problem on new bounds:

• Reconstruct the relaxation with the new bounds (the new feasible set is indicated by the blue contour).

• The optimum of the new relaxed problem lies at (6, 0.66) with an objective function value of −6.66. Thus, the lower and upper bounds of the global optimum coincide: the global optimum is reached (no branching is needed).

• If no range reduction is used, 4 branches have to be explored before the global optimum can be found (BFS traversal and bisection branching is used).

U = −6.66, L = −6.89

Feasibility-Based Range Reduction

Feasibility-based range reduction is a process that tightens the bounds of problem variables by cutting off the infeasible portions.

Example:

Given

Σ_{j=1}^{n} a_ij x_j ≤ b_i,  i = 1, ..., m      (linear constraints)

x_j^L ≤ x_j ≤ x_j^U,  j = 1, ..., n             (variable bounds)

find tighter bounds for the variables, that is,

find κ_j^L and κ_j^U s.t. x_j^L ≤ κ_j^L ≤ x_j ≤ κ_j^U ≤ x_j^U for j = 1, ..., n.

Feasibility-Based Range Reduction

Best-effort range reduction through linear programming.

For each variable x_j, j = 1, ..., n, solve two LPs:

min ±x_j
s.t. Σ_{j=1}^{n} a_ij x_j ≤ b_i,  i = 1, ..., m.

• It gives the best/tightest new ranges.
• But it is expensive (2n LPs to solve).

(Figure: best-effort range reduction in 2D.)

Feasibility-Based Range Reduction

“Poor man’s linear programming” heuristic.

Given any inequality Σ_{j=1}^{n} a_ij x_j ≤ b_i, single out x_k:

a_ik x_k ≤ b_i − Σ_{j≠k} a_ij x_j ≤ b_i − Σ_{j≠k} min{a_ij x_j^U, a_ij x_j^L}

So,

x_k ≤ (1/a_ik)(b_i − Σ_{j≠k} min{a_ij x_j^U, a_ij x_j^L})   if a_ik > 0

x_k ≥ (1/a_ik)(b_i − Σ_{j≠k} min{a_ij x_j^U, a_ij x_j^L})   if a_ik < 0
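A sketch of one pass of this heuristic for a single inequality (pure Python, names illustrative; this is not BARON's actual code):

```python
def poor_mans_lp(a, b, lo, hi):
    """One pass of the 'poor man's LP' tightening for a single
    inequality sum_j a[j] * x[j] <= b with current bounds lo, hi.
    Isolating x_k and bounding the other terms from below yields a
    new upper bound (a[k] > 0) or lower bound (a[k] < 0) on x_k."""
    new_lo, new_hi = list(lo), list(hi)
    for k, ak in enumerate(a):
        if ak == 0:
            continue
        # guaranteed smallest value of the remaining terms on the box
        rest = sum(min(a[j] * hi[j], a[j] * lo[j])
                   for j in range(len(a)) if j != k)
        bound = (b - rest) / ak
        if ak > 0:
            new_hi[k] = min(new_hi[k], bound)
        else:
            new_lo[k] = max(new_lo[k], bound)
    return new_lo, new_hi

# example: x + y <= 1 with 0 <= x, y <= 2 tightens both upper bounds to 1
lo2, hi2 = poor_mans_lp([1.0, 1.0], 1.0, [0.0, 0.0], [2.0, 2.0])
```

One loop over the variables replaces an LP solve per bound, which is the entire appeal of the heuristic.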

Feasibility-Based Range Reduction

Compared to the linear programming approach, "Poor man's LP" is not guaranteed to give the maximum range reduction, but it is very cheap.

The best-effort feasibility-based range reduction is only used in the preprocessing phase, on a one-time basis.

(Figures: three cases are illustrated; one where the bounds are not improved by "Poor man's LP" at all, one where "Poor man's LP" gives a suboptimal range reduction, and one where it gives the optimal range reduction.)

Software Structure

BARON core-user interaction

High flexibility - The solver is highly customizable (by providing a certain number of user-written subroutines).

Software Structure

BARON specialized modules

• Separable Concave Programming (The objective function is the sum of concave functions.)

• Fractional Programming(The objective function is a fraction with linear functions as the numerator and denominator.)

• Mixed Integer Linear Programming

• Others

Availability

Available under the GAMS or AIMMS modeling languages.

Price list of related GAMS products (www.gams.com):

             Base Module  GAMS/BARON  GAMS/CPLEX  GAMS/MINOS  GAMS/SNOPT
Commercial   $3,200.00    $1,600.00   $6,000.00   $3,200.00   $3,200.00
Academic     $640.00      $320.00     $1,280.00   $640.00     $640.00

Through NEOS Server in GAMS format.

(As stated in the user's manual) BARON is available as a callable library where users can supply problem-specific subroutines (range reduction, branching, local search, ...) to improve the performance.

Limits (BARON)

Convex relaxation
• Needs knowledge of the functions to get a good relaxation.
• Performs poorly on black-box or "unexpected" functions, for which the analytical information is limited.

Purely deterministic
• It has to walk through a large number of branches before reaching the global optimum, resulting in slow convergence for certain problems, especially ones with large bounds or no bounds at all.
• As a comparison, solvers using statistical methods focus on the branches that contain the global optimum with high probability.

Limits (GAMS/BARON)

Variable and expression bounds

All nonlinear variables and expressions should be bounded below and above by finite numbers. If not, default bounds are applied, which are usually large, and the global optimum is not guaranteed.

Allowable nonlinear functions: e^x, ln x, x^α, β^x, |x|, x^y, where α, β ∈ R.

Trigonometric functions are not allowed.

Solver dependency
• An LP solver is required in most cases (CPLEX, OSL, MINOS or SNOPT).
• An NLP solver is optional, for the purpose of optimality-based range reduction (MINOS or SNOPT).

Limits – An Example

A Chemical Equilibrium problem with 6 variables (Ba, SO4, BaOH, OH, HSO4, H) and 6 constraints. The optimal solution is -1.000.

min Ba
s.t.  Ba · SO4 = 1
      BaOH / (Ba · OH) = 4.8
      HSO4 / (SO4 · H) = 0.98
      H · OH = 1
      Ba + 10^-7 · BaOH = SO4 + 10^-5 · HSO4
      2 · Ba + 10^-7 · BaOH + 10^-2 · H = 2 · SO4 + 10^-5 · HSO4 + 10^-2 · OH

Source: wall.gms from GAMS model library

Limits – An Example

GAMS/BARON: Branch and Reduce algorithm. GAMS/LGO: Multistart Random Sampling algorithm (default). Hardware: Mobile AMD Sempron 3300+, 1.59 GHz, 896 MB RAM.

Unbounded intervals

Solver  Time (s)  Solution Found  Default Bounds
BARON   186.280   -1.000          Inferred by preprocessor
LGO     0.313     -1.000          Inferred by preprocessor

Bounded intervals

Bounds           [-1E7,1E7]  [-1E8,1E8]  [-1E9,1E9]  [-1E10,1E10]
BARON Time (s)   0.532       0.640       13.532      127.953
BARON Solution   -1.000      -1.000      -1.000      -1.000
LGO   Time (s)   0.312       81.687      0.172       0.203
LGO   Solution   1.000       1.000       0.996       0.996

References

BARON User’s Manual (Version 4.0).

http://archimedes.scs.uiuc.edu/baron

http://www.gams.com

“Global Optimization, Deterministic Approaches”, Reiner Horst, Hoang Tuy.

“Nonlinear Programming, Theory and Algorithms”, Mokhtar S. Bazaraa, Hanif D. Sherali, C. M. Shetty.

“Optimization Theory and Methods, Nonlinear Programming”, Wenyu Sun, Ya-Xiang Yuan.

"Convex Optimization", Stephen Boyd, Lieven Vandenberghe (recommended).