ON THE OPTIMAL USE OF THE BETHE APPROXIMATION FOR MODELS ON GRAPHS WITH LOOPS

GABRIELE PERUGINI

TUTOR: PROF. FEDERICO RICCI TERSENGHI



OUTLINE OF TALK

INTRODUCTION

‣ the Bethe approximation

‣ Belief propagation algorithm

ONGOING PROJECTS

‣ Bethe approximation and the T=0 random field Ising model

‣ Belief propagation inspired heuristics for spin glass optimization problems

FUTURE PROJECTS

INTRODUCTION

MODELS

Interacting Ising spins on a generic graph

cost function

Gibbs distribution: probability of finding the system in a configuration {s} with a given energy

PROBLEMS ONE WOULD LIKE TO SOLVE [on a given graph]

compute marginal distributions

sampling from the Gibbs distribution

compute the free energy

HOW CAN WE DO IT

approximate the Gibbs distribution

exact Gibbs distribution: too many variables !

approximate distribution: few variables, hope errors are under control
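As a reference for the notation, a standard way to write the cost function, the Gibbs distribution and the free energy of Ising spins on a graph G = (V, E) is sketched below. The symbols J_ij (couplings), h_i (local fields), beta (inverse temperature) and Z (partition function) are the usual conventions and are assumed here.

```latex
\begin{align}
  H(\{s\}) &= -\sum_{(ij)\in E} J_{ij}\, s_i s_j \;-\; \sum_{i\in V} h_i\, s_i ,
            \qquad s_i = \pm 1, \\
  P(\{s\}) &= \frac{e^{-\beta H(\{s\})}}{Z},
            \qquad Z = \sum_{\{s\}} e^{-\beta H(\{s\})},
            \qquad F = -\frac{1}{\beta}\,\ln Z .
\end{align}
```

The sum defining Z runs over 2^N configurations, which is why the exact distribution has "too many variables".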

NAIVE MEAN FIELD APPROXIMATION

approximate the Gibbs distribution:

parametrize the marginals: (local magnetization)

write the free energy in terms of local magnetizations:

local magnetizations minimize the free energy:

exact for weak interactions (e.g. fully connected models)

No correlations !

Useful only for oversimplified models
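The naive mean field recipe above can be sketched in a few lines. This is a minimal illustration, assuming the standard self-consistency equations m_i = tanh(beta (h_i + sum_j J_ij m_j)) that follow from minimizing the factorized free energy. The function name, the dict-of-dicts coupling format and the damping parameter are hypothetical choices made for this sketch.

```python
import math


def naive_mean_field(J, h, beta, n_iter=1000, tol=1e-12, damping=0.5):
    """Iterate m_i = tanh(beta * (h_i + sum_j J[i][j] * m_j)) to a fixed point.

    J is a symmetric dict-of-dicts of couplings (J[i][j] == J[j][i]),
    h is a list of local fields. Both formats are assumptions of this sketch.
    """
    n = len(h)
    m = [0.0] * n  # start from the paramagnetic point
    for _ in range(n_iter):
        new_m = []
        for i in range(n):
            # Each spin only feels the average magnetization of its neighbours
            field = h[i] + sum(Jij * m[j] for j, Jij in J[i].items())
            new_m.append(math.tanh(beta * field))
        delta = max(abs(a - b) for a, b in zip(new_m, m))
        # Damped update, to reduce oscillations
        m = [damping * a + (1 - damping) * b for a, b in zip(new_m, m)]
        if delta < tol:
            break
    return m
```

For a single isolated spin the iteration returns tanh(beta h), the exact result. As soon as spins are coupled, correlations are neglected, which is the limitation noted on the slide.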

BETHE APPROXIMATION (1/2)

approximate the Gibbs distribution:

parametrize with:

local magnetizations

connected correlations between neighboring spins

write down the parametrized (Bethe) free energy

find the parameters that minimize the free energy
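For concreteness, the textbook form of the Bethe ansatz for pairwise Ising models is sketched below. The symbols m_i (local magnetization), c_ij (connected correlation) and d_i (degree of site i) are assumed notation, and the Hamiltonian is taken to be the usual one with couplings J_ij and fields h_i.

```latex
\begin{align}
  P(\{s\}) &\approx \prod_{i} p_i(s_i)
             \prod_{(ij)} \frac{p_{ij}(s_i,s_j)}{p_i(s_i)\,p_j(s_j)}, \\
  p_i(s_i) &= \frac{1 + m_i s_i}{2}, \qquad
  p_{ij}(s_i,s_j) = \frac{1 + m_i s_i + m_j s_j + (c_{ij} + m_i m_j)\, s_i s_j}{4}, \\
  \beta F_{\mathrm{Bethe}} &= -\beta \sum_{(ij)} J_{ij}\,(c_{ij} + m_i m_j)
     - \beta \sum_i h_i m_i
     + \sum_{(ij)} \sum_{s_i,s_j} p_{ij} \ln p_{ij}
     - \sum_i (d_i - 1) \sum_{s_i} p_i \ln p_i .
\end{align}
```

Setting c_ij = 0 removes the pair terms and gives back the naive mean field factorization.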

BETHE APPROXIMATION (2/2)

exact on trees

asymptotically exact on locally tree-like graphs

Many states (RSB?)

Small loops ?

BELIEF PROPAGATION ALGORITHM (1/2)

Born as a way of computing exact marginals on trees

Later, it was realized that:

FIXED POINTS OF B.P. ⟷ STATIONARY POINTS (MINIMA) OF THE BETHE FREE ENERGY [Yedidia, 2001]

Each edge carries a message

Update each message using incoming messages

Compute marginals (local magnetizations m_i) with fixed point messages
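The three steps above (one message per directed edge, update from incoming messages, marginals from the fixed point) can be sketched for Ising spins as follows. This is a minimal illustration that assumes the common cavity-field parametrization of the messages. The function names and the data layout (a list of edges with a parallel list of couplings) are hypothetical. A brute-force enumeration is included only to check the result on a small tree.

```python
import math
from itertools import product


def bp_magnetizations(edges, J, h, beta, n_sweeps=100):
    """Parallel BP for Ising spins; messages are cavity fields on directed edges."""
    n = len(h)
    nbrs = {i: [] for i in range(n)}
    coupling = {}
    for (i, j), Jij in zip(edges, J):
        nbrs[i].append(j)
        nbrs[j].append(i)
        coupling[(i, j)] = coupling[(j, i)] = Jij

    def u(Jij, cav):
        # Effective field transmitted through a bond Jij by a cavity field cav
        return math.atanh(math.tanh(beta * Jij) * math.tanh(beta * cav)) / beta

    # Each directed edge (i, j) carries the cavity field of i in the absence of j
    cavity = {(i, j): 0.0 for i in range(n) for j in nbrs[i]}
    for _ in range(n_sweeps):
        cavity = {
            (i, j): h[i] + sum(u(coupling[(k, i)], cavity[(k, i)])
                               for k in nbrs[i] if k != j)
            for (i, j) in cavity
        }
    # Marginals: use all incoming messages
    return [
        math.tanh(beta * (h[i] + sum(u(coupling[(k, i)], cavity[(k, i)])
                                     for k in nbrs[i])))
        for i in range(n)
    ]


def exact_magnetizations(edges, J, h, beta):
    """Brute-force magnetizations, feasible only for very small systems."""
    n = len(h)
    Z = 0.0
    mags = [0.0] * n
    for s in product((-1, 1), repeat=n):
        energy = -sum(Jij * s[i] * s[j] for (i, j), Jij in zip(edges, J))
        energy -= sum(hi * si for hi, si in zip(h, s))
        w = math.exp(-beta * energy)
        Z += w
        for i in range(n):
            mags[i] += s[i] * w
    return [m / Z for m in mags]
```

On a tree, such as the star `edges = [(0, 1), (1, 2), (1, 3)]`, the two functions agree to numerical precision. On a graph with loops the BP result is only an approximation, and the sweep may fail to converge.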

BELIEF PROPAGATION ALGORITHM (2/2)

Well established in various disciplines

Coding theory

Probabilistic inference

Artificial intelligence

Statistical mechanics

(almost) LINEAR ALGORITHM, and easy to implement

up to now, limited to safe-ground applications

PROBLEMS:

It may not converge…

Rigorous results are limited to the single fixed point scenario

Is the Bethe free energy a good approximation when the model has loops ?

Sensitive to the initial condition

ONGOING PROJECTS

AIMS OF THE PROJECT

Quantify how good or bad the Bethe approximation is

Improve our understanding of the B.P. algorithm

Develop fast and efficient (approximate) algorithms for inference and optimization

MODELS CURRENTLY UNDER STUDY

T = 0 RANDOM FIELD ISING MODEL

T = 0 SPIN GLASS

Random regular graphs (best case)

finite dimensional lattices (worst case)

ZERO TEMPERATURE RANDOM FIELD ISING MODEL

‣ ferromagnet + i.i.d. random fields acting on each site

‣ can be studied at zero temperature (optimization problem)

‣ “peculiar” second order phase transition

‣ lots of metastable states (1 spin flip stable states)

‣ anomalous slow dynamics (Griffiths singularities)

the ground state can be obtained in polynomial time (max-flow / minimum-cut algorithm)
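To make "1 spin flip stable states" concrete, here is a brute-force sketch for a tiny RFIM. It assumes the usual energy H = -J sum_(ij) s_i s_j - sum_i h_i s_i with uniform J > 0, and calls a configuration stable when every spin is aligned with its local field. The function names are hypothetical. This is an exhaustive enumeration for illustration, not the max-flow / min-cut method.

```python
from itertools import product


def energy(s, edges, J, h):
    """RFIM energy with uniform ferromagnetic coupling J."""
    return -J * sum(s[i] * s[j] for i, j in edges) - sum(hi * si for hi, si in zip(h, s))


def single_flip_stable_states(edges, J, h):
    """All configurations in which no single spin flip lowers the energy.

    Exhaustive over 2^n configurations, so only usable for very small n.
    Returned sorted by increasing energy: the first entry is the ground state.
    """
    n = len(h)
    nbrs = [[] for _ in range(n)]
    for i, j in edges:
        nbrs[i].append(j)
        nbrs[j].append(i)
    stable = []
    for s in product((-1, 1), repeat=n):
        # Stable if every spin points along its local field
        if all(s[i] * (h[i] + J * sum(s[k] for k in nbrs[i])) > 0 for i in range(n)):
            stable.append(s)
    return sorted(stable, key=lambda s: energy(s, edges, J, h))
```

On a 4-site ring with J = 1 and fields [0.5, -0.5, 0.5, -0.3], this finds two stable states: the all-up ground state and the all-down metastable state.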

BELIEF PROPAGATION APPROACH TO THE T=0 R.F.I.M.

GLOBAL MINIMUM OF THE BETHE FREE ENERGY ⟷ GROUND STATE OF THE RFIM [Chertkov, 2008]

What should we expect ?

[Sketch: F versus M, showing unbalanced ferromagnetic-like minima and some higher energy states]

BELIEF PROPAGATION APPROACH TO THE T=0 R.F.I.M.

GLOBAL MINIMUM OF THE BETHE FREE ENERGY ⟷ GROUND STATE OF THE RFIM [Chertkov, 2008]

Metastable states are relevant (dominant?) at criticality

[Sketch: F versus M, showing unbalanced ferromagnetic-like minima and some (not so) higher energy states]

T=0 R.F.I.M.: MAXIMAL SOLUTIONS

BP turns out to be very sensitive to the initialization of the messages. However:

We proved that two special initial conditions, “(+)” and “(-)”, exist that bound every fixed point

[Figure, 3D lattice (worst case): n(-) and n(+) versus J, for L = 8, 10, 12, 16, 24]

[Figure, random regular graphs (best case): n(+) and n(-) versus J, for N = 2^11, 2^12, 2^13, 2^14, 2^15]

n(+): prob “up” fixed point is the G.S.; n(-): prob “down” fixed point is the G.S.
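The role of the two special initial conditions can be illustrated with a small sketch of zero-temperature BP for a ferromagnetic RFIM. It assumes the standard T = 0 limit of the update, in which a message is the cavity field clamped to [-J, J], and it takes "(+)" and "(-)" to mean all messages initialized at +J and -J. The function names, and the rule that a spin is frozen when its local field has the same sign in both extremal fixed points, are assumptions of this sketch.

```python
def bp_zero_temperature(edges, J, h, init, max_sweeps=10_000):
    """Parallel T=0 BP with uniform coupling J > 0.

    Returns (messages, converged). messages[(i, j)] is the message sent from
    i to j, always within [-J, J]; every message starts at the value `init`.
    """
    n = len(h)
    nbrs = [[] for _ in range(n)]
    for i, j in edges:
        nbrs[i].append(j)
        nbrs[j].append(i)
    u = {(i, j): init for i in range(n) for j in nbrs[i]}
    for _ in range(max_sweeps):
        new_u = {}
        for (i, j) in u:
            cavity = h[i] + sum(u[(k, i)] for k in nbrs[i] if k != j)
            new_u[(i, j)] = max(-J, min(J, cavity))  # clamp to [-J, J]
        if new_u == u:
            return u, True
        u = new_u
    return u, False


def local_fields(h, u):
    """Total field on each spin: external field plus all incoming messages."""
    fields = list(h)
    for (_, i), msg in u.items():
        fields[i] += msg
    return fields


def frozen_spins(edges, J, h):
    """True where the (+) and (-) fixed points agree on the sign of the spin."""
    up, _ = bp_zero_temperature(edges, J, h, init=J)
    down, _ = bp_zero_temperature(edges, J, h, init=-J)
    return [a * b > 0 for a, b in zip(local_fields(h, up), local_fields(h, down))]
```

Because the clamped update is monotone for ferromagnetic couplings, the (+) run can only decrease and the (-) run can only increase, so any other fixed point lies between the two. On a 4-site ring with J = 1 and fields [0.5, -0.5, 0.5, -0.5] the two runs end in opposite states and no spin is frozen. Raising the first field to 2.5 makes them coincide, and every spin is frozen.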

T=0 R.F.I.M.: PERCOLATION OF UNFROZEN VARIABLES

Every fixed point is bounded by the (+) and (-) solutions

A spin that takes the same value in both bounding solutions is frozen

[Figure, random regular graphs: FRACTION OF FROZEN VARIABLES and GIANT CLUSTER OF UNFROZEN VARIABLES versus J, for N = 2^10, 2^11, ..., 2^18. #clusters ~ 1]

[Figure, 3D lattice: FRACTION OF FROZEN VARIABLES and GIANT CLUSTER OF UNFROZEN VARIABLES versus J, for L = 10, 12, 16, 24, 32, 48. #clusters ~ N]

Correlated percolation?
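The two quantities plotted above, the giant cluster of unfrozen variables and the number of clusters, can be measured with a standard union-find pass once the frozen spins are known. The sketch below assumes a boolean list `frozen` is already available (for instance from comparing two extremal fixed points). The function name is hypothetical.

```python
def unfrozen_clusters(n, edges, frozen):
    """Sizes of the connected clusters of unfrozen nodes, largest first."""
    parent = list(range(n))

    def find(x):
        # Find the cluster root, with path halving
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i, j in edges:
        # Only edges between two unfrozen nodes connect clusters
        if not frozen[i] and not frozen[j]:
            ri, rj = find(i), find(j)
            if ri != rj:
                parent[ri] = rj

    sizes = {}
    for i in range(n):
        if not frozen[i]:
            root = find(i)
            sizes[root] = sizes.get(root, 0) + 1
    return sorted(sizes.values(), reverse=True)
```

The first entry divided by n gives the giant cluster fraction, and the length of the list gives the number of clusters, which can then be compared across system sizes.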

T=0 R.F.I.M.: SEARCHING BETWEEN SOLUTIONS

We designed a modified version of B.P. that efficiently finds many different fixed points, starting from different initial conditions

INITIAL CONDITIONS = CONVEX COMBINATIONS OF FIXED POINTS ALREADY FOUND

competitive for optimization, Prob[g.s.] ~ 1

Too many states; however, at present it is the fastest way to find metastable states

Time complexity ~

Random regular graphs (best case)

3D lattice (worst case)

[Figure: <N_sol> versus J, for N = 2^10, 2^11, 2^12, 2^13, 2^14, 2^15]

T=0 R.F.I.M.: PERSPECTIVES

percolation of unconstrained variables ⟷ anomalous slow dynamics ?

are all these stable states relevant for the thermodynamics ?

are there situations where “faster” and “flexible” is better than “exact” ?

SPIN GLASS OPTIMIZATION

One of the hardest optimization problems

Many open questions can be answered only through numerical simulations

Ubiquitous in applications

ALGORITHMS

EXACT

HEURISTICS

GENETIC ALGORITHMS

CLUSTER EXACT APPROXIMATION

BP-BASED HEURISTICS

S.G. OPTIMIZATION: IMPROVING C.E.A.

build an unfrustrated cluster

compute the g.s. of the cluster with min-cut algorithm

BP is faster than min-cut

BP can handle a small amount of frustration
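The first step, building an unfrustrated cluster, can be sketched with a greedy gauge construction. A spin joins the cluster only if a single gauge sign t_i makes all of its bonds to spins already in the cluster ferromagnetic, that is J_ij t_i t_j > 0. This is a minimal sketch of that idea: the function name, the bond dictionary format and the visiting order are assumptions, and couplings are assumed nonzero.

```python
import random


def grow_unfrustrated_cluster(n, bonds, order=None, seed=0):
    """Greedily grow a cluster of spins with no frustrated loop inside it.

    bonds maps (i, j) to a nonzero coupling J_ij. Returns (cluster, gauge),
    where gauge[i] = +1 or -1 is the gauge sign chosen for each accepted spin.
    """
    nbrs = [[] for _ in range(n)]
    for (i, j), Jij in bonds.items():
        nbrs[i].append((j, Jij))
        nbrs[j].append((i, Jij))
    if order is None:
        order = list(range(n))
        random.Random(seed).shuffle(order)

    gauge = {}
    for i in order:
        # Gauge sign that each already accepted neighbour requires for spin i
        required = {gauge[j] * (1 if Jij > 0 else -1)
                    for j, Jij in nbrs[i] if j in gauge}
        if len(required) <= 1:  # no conflict: accept the spin
            gauge[i] = required.pop() if required else 1
    return set(gauge), gauge
```

On a frustrated triangle (two ferromagnetic bonds, one antiferromagnetic) visited in order 0, 1, 2, the third spin is rejected and the cluster has two spins. Allowing a few conflicting bonds instead of rejecting the spin would give the "numFrus" variants that BP can still handle.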

[Figure: E / N versus number of i.c., for tree, cluster no frus, numFrus = 1, 2, 3, 4, 5. N = 16^3 = 4096]

FUTURE WORKS

A BETTER UNDERSTANDING OF BETHE APPROXIMATION (B.P.) VIA…

▸ Characterization of unfrozen variables percolation

▸ Correct counting of metastable states

▸ Use of the Bethe approximation as a heuristic when it is not exact:

▸ when B.P. does not converge

▸ in presence of strongly frustrated short loops

▸ Comparison with exact (and slow) solutions and with other fast (and approximate) heuristics

THANKS !

S.G. OPTIMIZATION: BP + PINNING

fix a fraction of variables

let them act as an external field
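The pinning step can be sketched as a simple reduction of the problem. Each pinned spin s_j is removed, and its bond J_ij adds J_ij s_j to the field of its free neighbour i. The function name and the dictionary formats are hypothetical. How the pinned values are chosen (for example from BP beliefs) is left outside this sketch.

```python
def pin_spins(bonds, h, pinned):
    """Absorb pinned spins into effective fields on the remaining free spins.

    bonds:  {(i, j): J_ij}
    h:      list of external fields
    pinned: {i: s_i} with s_i = +1 or -1
    Returns (free_bonds, effective_fields), the latter a dict over free spins.
    """
    effective = {i: hi for i, hi in enumerate(h) if i not in pinned}
    free_bonds = {}
    for (i, j), Jij in bonds.items():
        if i in pinned and j in pinned:
            continue  # constant energy term, irrelevant for the free spins
        if i in pinned:
            effective[j] += Jij * pinned[i]
        elif j in pinned:
            effective[i] += Jij * pinned[j]
        else:
            free_bonds[(i, j)] = Jij
    return free_bonds, effective
```

The reduced problem has fewer variables and fewer loops, so a BP run on it is more likely to converge.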

[Figure: unlabeled quantity (axis range 0 to 1000) versus fraction of pinned variables, for L = 8, 10, 12, 14, 64]

No phase transition :(

Seems to be great for optimization

…however

beliefs can be better than we think