
11. Lecture

Stochastic Optimization

Simulated Annealing

Soft Control

(AT 3, RMA)


SC

WS 19/20, Georg Frey

11. Structure of the lecture

1. Soft control: definition and limitations, basics of "expert" systems
2. Knowledge representation and knowledge processing (symbolic AI); application: expert systems
3. Fuzzy systems: dealing with fuzzy knowledge; application: fuzzy control
4. Connectionist systems: neural networks; application: identification and neural controllers
5. Genetic Algorithms: Stochastic Optimization
   Genetic Algorithms
   Simulated Annealing
   Differential Evolution
   Application: Optimization
6. Summary and literature references


Simulated Annealing: Introduction 1/2

• Simulated annealing
  Annealing: heating followed by slow cooling
  The method is inspired by physics
  The model is the cooling process in crystal structures
  A substance with a lattice structure (e.g. silicon) is heated
  Observed effect
  • If the substance is cooled very quickly ("quenching"), the result is a very irregular (impure) lattice structure
  • If the substance is left to cool slowly, the result at the end of cooling is a particularly uniform lattice structure

[Figure: heated substance; lattice structure after quenching; lattice structure after annealing]


Simulated Annealing: Introduction 2/2

• Explanation
  In nature, bodies generally tend towards a state with the lowest possible energy
  Cooling corresponds to the search for a lattice structure with this property: a natural optimization method
  The warmer the body, the more mobile the particles of the lattice: existing (non-optimal) lattice structures can be dissolved
  The colder the body becomes, the less mobile the particles: they settle into fixed positions and form a lattice structure
  Worth noting: the transition from a sub-optimal lattice structure to an optimal one often requires an intermediate state that is more "sub-optimal" than the initial state

[Figure: "bad" lattice, "very bad" lattice, "good" lattice]


Simulated Annealing: Disadvantages of local optimization methods

• Local optimization methods
  Procedures that search for local extrema within a specific neighbourhood
  Most popular example: gradient descent methods
  • Find the minimum of a function from a given starting point
  Problem: to reach the global minimum from the starting point, local minima generally have to be passed
  A solution that is worse than the current one must therefore be accepted temporarily (but not permanently)

[Figure: search path from the start point via points 0 to 3; the end point of the search is a local minimum, not the global minimum]
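
The problem described above can be reproduced with a few lines of Python. This is only a sketch: the test function `f`, the step size and the starting point are assumptions chosen for illustration and are not taken from the lecture. The function has a local minimum near x = 1.13 and its global minimum near x = -1.30.

```python
def f(x):
    """Hypothetical test function with two minima."""
    return x**4 - 3 * x**2 + x


def df(x):
    """Derivative of f."""
    return 4 * x**3 - 6 * x + 1


def gradient_descent(x0, step=0.01, iterations=2000):
    """Plain gradient descent: always moves downhill, never accepts a worse point."""
    x = x0
    for _ in range(iterations):
        x -= step * df(x)
    return x


x_end = gradient_descent(2.0)
print(f"end point of search: x = {x_end:.3f}, f(x) = {f(x_end):.3f}")
print(f"near the global minimum: x = -1.30, f(x) = {f(-1.30):.3f}")
```

Starting at x = 2, the search ends in the local minimum near x = 1.13 because it would have to climb over the hill in between to reach the lower minimum on the left.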


Simulated Annealing: Local search with varying search radius 1/2

• Simulated annealing allows local minima to be overcome
• Basic algorithm
  1. Assume an initial solution (the current solution at the start of the optimization); it is the centre of the local search area
  2. Choose a candidate solution within a radius around the centre (local search)
  3. Decide whether the candidate solution becomes the new solution
  4. If the candidate solution is accepted as the new solution, the centre moves to the new solution (adjustment of the local search)
  5. Continue from step 2 until the termination criterion is met


Simulated Annealing: Local search with varying search radius 2/2

• Illustration
  2-dimensional search

[Figure: global search area over X1 and X2 with successive local search areas around the centres 0 to 11, and the course of the goodness. Simplification: the goodness depends only on X2.]


Simulated Annealing: Stochastic elements of simulated annealing 1/2

• Metropolis algorithm (1953, Metropolis et al.)
  The algorithm for choosing a test solution within the local search area and for deciding whether a worse solution is used as the new centre is called the Metropolis algorithm
  Original purpose: generating a Boltzmann distribution
• Choosing a test solution
  $y \in \{x \pm r\}$
  y: test solution
  x: centre of the local search
  r: radius of the local search
  In practice, y is chosen using a probability distribution
  • Frequently used: Gaussian distribution
  • Test solutions near the centre are preferred
  The test solution is selected at random

[Figure: probability distribution of the test solution around the centre x, over x1 and x2]
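
Choosing a test solution with a Gaussian distribution might look as follows. This is a minimal sketch: the function name `propose`, the use of the search radius as the standard deviation, and the example values are assumptions made for illustration.

```python
import random


def propose(center, radius, rng=random):
    """Draw a test solution y around the centre x.

    Each coordinate gets Gaussian noise with standard deviation `radius`
    (an assumed interpretation of the search radius), so test solutions
    near the centre are preferred.
    """
    return [xi + rng.gauss(0.0, radius) for xi in center]


rng = random.Random(42)      # fixed seed, only for reproducibility
x = [1.0, -2.0]              # centre of the local search (x1, x2)
y = propose(x, radius=0.5, rng=rng)
print(y)
```

Because a Gaussian has unbounded support, the radius here acts as a typical distance rather than a hard limit; a hard limit would need an additional clipping or rejection step.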


Simulated Annealing: Stochastic elements of simulated annealing 2/2

• Acceptance of the test solution
  Determine the energy (goodness) of the centre: E(x)
  Determine the energy (goodness) of the test solution: E(y)
  Compare E(x) with E(y)
  If E(y) < E(x), y becomes the new centre: x := y
  Otherwise, examine the energy difference ∆E = E(y) - E(x)
  • The new (worse) solution is accepted with a probability that decays exponentially in ∆E: $p(\Delta E, T) = e^{-\Delta E / T}$
  • ∆E: difference in goodness (abstract energy difference)
  • T: (abstract) temperature
  The smaller the energy difference and the higher the temperature, the more likely a worse solution is accepted

[Figure: p(∆E) falls from 1 as ∆E increases; p(T) rises towards 1 as T increases]
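
The acceptance rule can be written down directly. The sketch below assumes the reading p(∆E, T) = exp(-∆E/T) for worse solutions; the function names are chosen for illustration only.

```python
import math
import random


def acceptance_probability(delta_e, temperature):
    """p(dE, T) = exp(-dE / T) for a worse test solution (dE >= 0)."""
    return math.exp(-delta_e / temperature)


def accept(e_x, e_y, temperature, rng=random):
    """Always take a better y; take a worse y with probability p(dE, T)."""
    if e_y < e_x:
        return True
    return rng.random() < acceptance_probability(e_y - e_x, temperature)


# Same energy difference, falling temperature: acceptance becomes less likely.
for t in (10.0, 1.0, 0.1):
    print(t, acceptance_probability(1.0, t))
```

For ∆E = 1 the probability is about 0.90 at T = 10, about 0.37 at T = 1, and about 0.00005 at T = 0.1.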


Simulated Annealing: Interpretation of energy and temperature

• Interpretation of energy
  For a minimization problem, the energy is to be minimized
  For a maximization problem, the energy would have to be maximized
  • The problem must therefore be transformed into a minimization problem
  • e.g. by inversion (1/E) or by multiplying by -1 (-E)
  • Note: 1/E is nonlinear
  The energy is a metaphor for the goodness function
• Interpretation of temperature
  High temperature: high probability of acceptance
  Low temperature: low probability of acceptance
  Temperature is a measure of the probability of acceptance
  Description of heating followed by cooling
  • Heating: initial temperature
  • Cooling: lowering the temperature (e.g. exponential cooling)
  With decreasing temperature, the probability of accepting a worse solution decreases
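
The note that 1/E is nonlinear matters because the energy difference enters the acceptance probability. The following sketch compares both transformations on a few hypothetical goodness values (the numbers are made up for illustration).

```python
goodness = [1.0, 2.0, 4.0, 8.0]            # hypothetical values to be maximized

energy_neg = [-g for g in goodness]        # multiplying by -1
energy_inv = [1.0 / g for g in goodness]   # inversion, requires g > 0

# Both transformations put the best solution (goodness 8.0) at the lowest energy.
best_neg = min(range(len(goodness)), key=lambda i: energy_neg[i])
best_inv = min(range(len(goodness)), key=lambda i: energy_inv[i])
print(best_neg, best_inv)   # 3 3

# Energy gaps between neighbouring solutions.
gaps_neg = [energy_neg[i] - energy_neg[i + 1] for i in range(3)]
gaps_inv = [energy_inv[i] - energy_inv[i + 1] for i in range(3)]
print(gaps_neg)   # [1.0, 2.0, 4.0]
print(gaps_inv)   # [0.5, 0.25, 0.125]
```

Negation keeps the size of the goodness differences, whereas inversion shrinks the gaps between good solutions, so at the same temperature a step back from a good solution is accepted more readily.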


Simulated Annealing: The optimization algorithm

1. Assume an initial solution (the solution at the start of the optimization)
2. Choose a candidate solution within the radius around the centre (local search)
   • For example, using a Gaussian distribution
3. Decide whether the candidate solution becomes the new solution
   • Calculate E(y); E should be easy to calculate
   • A better solution is always accepted
   • A worse solution is accepted with probability $p(\Delta E, T) = e^{-\Delta E / T}$
4. Determine the new centre and cool down
   • Shift the centre (or not)
   • Cooling: T = α·T, α ∈ [0,1), cooling coefficient
5. Continue from step 2 until the termination criterion is met
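
The five steps can be sketched as a short Python function. Everything specific here is an assumption for illustration: the one-dimensional test energy, the parameter values, the fixed number of steps as termination criterion, and the bookkeeping of the best solution seen so far.

```python
import math
import random


def energy(x):
    """Hypothetical test energy: local minimum near x = 1.13, global minimum near x = -1.30."""
    return x**4 - 3 * x**2 + x


def simulated_annealing(x0, t0=10.0, alpha=0.999, radius=0.5, steps=5000, seed=0):
    rng = random.Random(seed)
    x, e_x = x0, energy(x0)             # 1. initial solution = first centre
    best_x, best_e = x, e_x             # optional extra: remember the best solution
    t = t0
    for _ in range(steps):              # 5. fixed step budget as termination criterion
        y = x + rng.gauss(0.0, radius)  # 2. candidate around the centre
        e_y = energy(y)
        delta_e = e_y - e_x
        # 3. better solutions always, worse ones with probability exp(-dE/T)
        if delta_e < 0 or rng.random() < math.exp(-delta_e / t):
            x, e_x = y, e_y             # 4. shift the centre
            if e_x < best_e:
                best_x, best_e = x, e_x
        t *= alpha                      # 4. cooling: T = alpha * T
    return best_x, best_e


best_x, best_e = simulated_annealing(x0=2.0)
print(best_x, best_e)
```

With these settings the search starts on the side of the local minimum and, while the temperature is still high, can cross the hill towards the lower minimum near x = -1.30.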


Simulated Annealing: Properties

• Combined global and local search
  The extremum search itself takes place in a local search area
  But the centre of the search moves
  Local search within the global search area
• Independence from the initial solution
  An initial solution must be given
  If the initial temperature is high enough, a local extremum can be left easily
  As the temperature decreases, however, the probability of leaving a local extremum drops
  At the beginning of the optimization: search for a minimum in the global search area
  At the end of the optimization: search for the minimum in the local search area
• Hybrid optimization method
  Bit coding of the solution: discrete optimization
  Floating-point coding of the solution: continuous optimization


Simulated Annealing: Typical search course

• Example of a typical search course
  2-dimensional solution space (x1, x2)
  Several local minima

[Figure: search path over X1 and X2, from the initial solution through the global search (searching for a minimum within a global environment) to the search for a local minimum]


Simulated Annealing: Application example 1/5

• Travelling Salesman Problem
  One of the hardest known discrete optimization problems
  It belongs to the class of NP-complete problems
  • For all known exact algorithms, the computational effort grows faster than polynomially with the size of the problem: $O_{\mathrm{TSP}} > O(n^k)$
• Problem statement
  A traveller wants to make a round trip to different cities and offer his products there
  Start and end point are fixed
  Each city is visited exactly once
  The distance travelled should be minimal (optimization problem)
• Solution with simulated annealing
  Coding of a solution as a list of cities
  Energy: total distance travelled (to be minimized)
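
With a solution coded as a list of cities, the energy is simply the length of the round trip. The sketch below assumes cities given as 2-D coordinates and Euclidean distances; the lecture does not specify the distance measure, and the four example cities are made up.

```python
import math


def tour_length(tour, coords):
    """Energy of a tour: total length of the closed round trip."""
    total = 0.0
    for k in range(len(tour)):
        a = coords[tour[k]]
        b = coords[tour[(k + 1) % len(tour)]]   # wrap around to the start city
        total += math.dist(a, b)
    return total


coords = [(0, 0), (1, 0), (1, 1), (0, 1)]   # hypothetical cities on a unit square
print(tour_length([0, 1, 2, 3], coords))     # 4.0, tour along the edges
print(tour_length([0, 2, 1, 3], coords))     # longer tour using both diagonals
```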


Simulated Annealing: Application example 2/5

• Global search area
  All possible routes on which every city is visited
  Exact size of the search area: $(n-1)!/2$ solutions
  • 10 cities: 181440 solutions
  • 20 cities: 60822550204416000 solutions ≈ $6 \cdot 10^{16}$ solutions
  Even with 20 cities it is not feasible to check every solution: at $10^6$ solutions per second, guaranteeing the optimal solution takes about $6 \cdot 10^{10}$ seconds, i.e. more than 1900 years
• Determining a candidate solution
  Starting solution: 1,2,3,…,i,i+1,…,j-1,j,…,n
  Copy the starting solution and cut the segment i,…,j out of the copy
  Invert the segment: i,i+1,…,j-1,j becomes j,j-1,…,i+1,i
  Inserting the inverted segment in place of the original one yields the derived (candidate) solution
  i and j are determined randomly (e.g. with a normal distribution); the chain is understood as a ring, so the mean of the normal distribution can also be shifted
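
The segment inversion can be sketched as follows. The helper names are hypothetical, and `random_candidate` draws i and j uniformly as a simplification, whereas the slide suggests e.g. a normal distribution. The tour is treated as a ring, so a segment may wrap around the end of the list.

```python
import random


def invert_segment(tour, i, j):
    """Return a copy of `tour` with the ring segment from position i to position j reversed."""
    n = len(tour)
    length = (j - i) % n + 1                          # segment length along the ring
    positions = [(i + k) % n for k in range(length)]
    candidate = list(tour)                            # copy, the original stays untouched
    values = [tour[p] for p in positions]
    for p, v in zip(positions, reversed(values)):
        candidate[p] = v
    return candidate


def random_candidate(tour, rng=random):
    """Pick two distinct positions i and j at random (uniformly here) and invert the segment."""
    i, j = rng.sample(range(len(tour)), 2)
    return invert_segment(tour, i, j)


tour = [1, 2, 3, 4, 5, 6, 7, 8]
print(invert_segment(tour, 2, 5))   # [1, 2, 6, 5, 4, 3, 7, 8]
print(invert_segment(tour, 6, 1))   # wraps around the ring: [8, 7, 3, 4, 5, 6, 2, 1]
```

Every candidate produced this way is again a valid tour, because inverting a segment only reorders cities and never duplicates or drops one.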


Simulated Annealing: Application example 3/5

• Example: determination of the solution candidates (continued)
  Starting solution as a list of cities: 1,2,3,…,i,i+1,…,j-1,j,…,n

[Figure: two drawings of the tour 1, 2, …, i, i+1, …, j-1, j, …, n; representation as a ring]


Simulated Annealing: Application example 4/5

• Demo applet for the TSP from TU Clausthal
  http://www.math.tu-clausthal.de/Arbeitsgruppen/Stochastische-Optimierung/
• Example TSP
  Problem
  • 50 cities
  • Initial solution: E = 11603
  Parameters
  • T0 = 10
  • α = 0.999
  Number of solutions
  • $3 \cdot 10^{62}$
  For comparison
  • The Sun consists of approx. $10^{57}$ atoms (source: http://fma2.math.uni-magdeburg.de/~bessen/krypto/krypto8.htm)
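
The sizes of the search area quoted for 10, 20 and 50 cities follow from the count of (n-1)!/2 distinct round trips and can be checked with exact integer arithmetic. The year estimate below assumes 10^6 evaluated solutions per second and a year of 365 days.

```python
import math


def number_of_tours(n):
    """Number of distinct round trips through n cities: (n - 1)! / 2."""
    return math.factorial(n - 1) // 2


for n in (10, 20, 50):
    print(n, number_of_tours(n))

# Time for a full enumeration of 20 cities at 10**6 solutions per second.
seconds = number_of_tours(20) / 10**6
years = seconds / (365 * 24 * 3600)
print(round(years))
```

For 20 cities this gives roughly 1900 years, and for 50 cities the count is about 3·10^62.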


Simulated Annealing: Application example 5/5

• Solution found within 60 seconds of CPU time
  E = 2361
  36188 solution candidates were examined
  Optimal solution: unknown!


Simulated Annealing: Assessment 1/2

• Simulated annealing
  Stochastic optimization method
  Global optimization
• No guarantee of optimality
  In practice there is no guarantee that the global optimum is found
  In general, however, quasi-optimal solutions are found in finite time
  A formal proof shows that the global optimum is found given infinite computing time (of almost no practical relevance)
  Even at low temperature and with an arbitrarily large goodness difference, the probability of leaving a local minimum is never 0
• Practical aspects
  The algorithm is very simple: fast processing
  Easy to implement even in scripting languages: ideal for testing whether the algorithm is applicable to a problem


Simulated Annealing: Assessment 2/2

• Many variants
  Determination of the solution candidates (probability distribution)
  Remembering the best solution (a kind of elitism)
  Periodic partial raising of the temperature
  • Opportunity to leave a local extremum
• Successful application to many practical problems
  Travelling Salesman Problem
  Controller parameter optimization
• The coding must be chosen anew for every problem
  With an inappropriate coding the optimization method fails
  The user's knowledge (heuristics) goes into the coding

[Figure: two plots of the temperature T over time t, each starting at T0]
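
One way to realise a periodic partial raising of the temperature is sketched below. The concrete rule is an assumption and is not taken from the lecture: every `reheat_every` steps the temperature is reset to a peak that is `reheat_factor` times the previous peak.

```python
def temperature_schedule(t0, alpha, steps, reheat_every=None, reheat_factor=0.5):
    """Exponential cooling T = alpha * T, optionally with periodic partial reheating."""
    temps = []
    t = t0
    peak = t0
    for step in range(steps):
        if reheat_every and step > 0 and step % reheat_every == 0:
            peak *= reheat_factor   # each reheat reaches only part of the previous peak
            t = peak
        temps.append(t)
        t *= alpha                  # exponential cooling
    return temps


plain = temperature_schedule(10.0, 0.99, 300)
reheated = temperature_schedule(10.0, 0.99, 300, reheat_every=100)
print(plain[0], plain[100], plain[200])
print(reheated[0], reheated[100], reheated[200])
```

After each reheat the acceptance probability for worse solutions rises again for a while, which gives the search a further chance to leave a local extremum.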


Summary and learning objectives of the 11th lecture

The basic idea of simulated annealing
  Model in physics
  Problems of local optimization methods
Describe the stochastic elements of simulated annealing
  Selection of the solution candidates
  Decision on accepting a solution
  Metropolis algorithm
Travelling Salesman Problem
  Describe the problem
  Complexity
  Solution with simulated annealing