
Graph-Cuts

Michael Bleyer
LVA Stereo Vision

What happened last time? (1)

We have defined an energy function to measure the quality of a disparity map D:

$$E(D) = \sum_{p \in I} m(p, d_p) + \sum_{(p,q) \in N} s(p, q)$$

where
- m(p,dp) computes the color dissimilarity for matching pixel p at disparity dp
- N denotes all spatial neighbors in 4-connectivity
- s() is the smoothness function. We use the Potts model:

$$s(p, q) = \begin{cases} 0 & \text{if } d_p = d_q \\ P & \text{otherwise.} \end{cases}$$

This energy function is important for many computer vision problems.
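The energy above can be evaluated directly for a given disparity map. The following is a minimal sketch, not code from the lecture: the function name `potts_energy`, the row-by-row image layout and the lookup table `cost[p][d]` standing in for m(p,dp) are assumptions made for illustration, and each pair of 4-connected neighbors is assumed to be counted once.

```cpp
#include <vector>

// Hypothetical helper: evaluates E(D) on a W x H grid stored row by row.
// cost[p][d] plays the role of m(p, d); P is the Potts penalty.
// Assumption: every pair of 4-connected neighbors is counted exactly once.
double potts_energy(const std::vector<int>& D,
                    const std::vector<std::vector<double>>& cost,
                    int W, int H, double P) {
    double E = 0.0;
    for (int y = 0; y < H; ++y) {
        for (int x = 0; x < W; ++x) {
            const int p = y * W + x;
            E += cost[p][D[p]];                         // data term m(p, d_p)
            if (x + 1 < W && D[p] != D[p + 1]) E += P;  // right neighbor
            if (y + 1 < H && D[p] != D[p + W]) E += P;  // lower neighbor
        }
    }
    return E;
}
```

Visiting only the right and lower neighbor of each pixel is one simple way to touch every neighboring pair once.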

What happened last time? (2)

Smoothness interactions define a graph known as the 4-connected grid.

Computing the energy optimum on the 4-connected grid is an NP-hard problem.

We have learned about dynamic programming:
• Computes the exact energy optimum
• Requires the graph to be a tree
=> We had to remove smoothness interactions

[Figure: 4-connected grid]

What is Going to Happen Today?

Just one point on the agenda:

• Graph-Cuts


What is Graph-Cuts?

Powerful optimization method.

Finds strong local minima of our NP-hard energy function.

Graph-cuts have been around in computer vision for quite some time (e.g. [Roy,ICCV98]).

I will speak about modern graph-cuts, i.e. move making algorithms.

Move Making Algorithms

We are given a labeled image as input. (In our case, the image is labeled with disparity values, i.e. label α can for example mean a disparity of 10 pixels.)

We want to modify the assignment of pixels to labels to obtain a better solution, i.e. one of lower energy.

An operation that changes labels is called a move.

We will learn about 3 types of moves:
• αβ-swap
• α-expansion
• fusion move

[Figure: a move turns the current labeling (regions α, β, γ) into a new labeling, preferably of lower energy than the current labeling]

αβ-Swap [Boykov,PAMI01]

Select two labels: α and β.

A pixel that is assigned to α in the current labeling can either:
1. switch its label to β or
2. keep its old label α in the new labeling.

Analogously, a pixel that is currently assigned to β can either:
1. switch its label to α or
2. keep its old label β in the new labeling.

Simply put:
• Some pixels that had the label α are now assigned to β.
• Some pixels that had the label β are now assigned to α.

[Figure: current labeling (regions α, β, γ) and one possible labeling after αβ-swap]

α-Expansion [Boykov,PAMI01]

Select one label: α.

Any pixel can either
1. switch its label to α or
2. keep its old label.

More global than αβ-swap:
• All pixels can change their labels simultaneously.

In experiments, α-expansion moves typically outperform αβ-swaps.

We will therefore concentrate on α-expansions.

[Figure: current labeling (regions α, β, γ) and one possible labeling after α-expansion]

The Key Problem

There is an extremely large number of possible α-expansions.

The key challenge is to find the “best” α-expansion, i.e. the one that leads to the largest decrease of our energy.

Good news:
• For our energy function, we can solve this problem exactly and fast by solving a min-cut problem in a graph.

[Figure: the current labeling (regions α, β, γ) and four candidate moves, α-exp 1 to α-exp 4. The labelings are annotated with the energies E = 1000, E = 2000, E = 500, E = 750 and E = 1000. We should take the expansion of lowest energy (E = 500).]

Iterative Algorithm – α-Expansion

Let us for now assume that we know how to compute the optimal α-expansion.

We can incorporate the α-expansion as follows.

Iterative Algorithm:
Start with an arbitrary labeling f.
Loop (e.g. 3 times)
• For each allowed label α:
- Find f* = argmin E(f’) among f’ within one α-expansion of f
- f := f*

Comment:
• Note that we compute the optimal α-expansion.
• Therefore, the energy will either decrease after α-expansion or stay the same (not changing the labeling at all is a feasible α-expansion).
• The algorithm will in any case converge to a (strong) local energy optimum.
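The outer loop of the iterative algorithm can be sketched as follows. This is an illustration under stated assumptions, not the lecture's implementation: the names `Labeling`, `Energy`, `best_expansion` and `alpha_expansion` are hypothetical, and the optimal α-expansion is found by brute-force enumeration of all binary choices, which is only feasible for a handful of pixels. In practice this step is replaced by the min-cut computation.

```cpp
#include <cstddef>
#include <functional>
#include <vector>

using Labeling = std::vector<int>;
using Energy = std::function<double(const Labeling&)>;

// Brute-force stand-in for the graph-cut step: tries all 2^n ways of
// switching a subset of pixels to alpha (only usable for very small n).
Labeling best_expansion(const Labeling& f, int alpha, const Energy& E) {
    const std::size_t n = f.size();
    Labeling best = f;           // "change nothing" is a feasible expansion
    double best_e = E(f);
    for (unsigned long mask = 0; mask < (1UL << n); ++mask) {
        Labeling g = f;
        for (std::size_t i = 0; i < n; ++i)
            if (mask & (1UL << i)) g[i] = alpha;
        const double e = E(g);
        if (e < best_e) { best_e = e; best = g; }
    }
    return best;
}

// Outer loop: a fixed number of sweeps over all labels 0 .. num_labels-1.
Labeling alpha_expansion(Labeling f, int num_labels, const Energy& E,
                         int sweeps = 3) {
    for (int s = 0; s < sweeps; ++s)
        for (int alpha = 0; alpha < num_labels; ++alpha)
            f = best_expansion(f, alpha, E);   // energy never increases
    return f;
}
```

Because the unchanged labeling is always among the candidates, the energy cannot increase from one step to the next, which matches the comment above.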

Iterative Algorithm – Example Video

(α-expansions for stereo matching)


Computing the Optimal α-Expansion

There are 3 things you have to do to find the optimal α-expansion via graph-cuts:
1. Write your energy as a pseudo-boolean function
2. Construct a graph that represents your pseudo-boolean function
3. Compute the minimum cut in this graph

These steps are discussed in the following.

Writing the Energy as a Pseudo-Boolean Function (1)

We associate a boolean variable xp with each pixel p where:
• xp = 0 means that pixel p keeps its old label after α-expansion
• xp = 1 means that pixel p takes label α after α-expansion

For example, if this is the current labeling:

β β β γ γ

then x =

1 1 0 0 1

leads to the label configuration:

α α β γ α

after α-expansion.

We can represent all possible α-expansions by the boolean variables x.
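The encoding can be written as a small function that turns a current labeling and a binary vector x into the labeling after the α-expansion. This is only a sketch; the function name `apply_expansion` and the use of strings as labels are assumptions chosen to mirror the example.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// x[i] == 0: pixel i keeps its old label; x[i] == 1: pixel i takes label alpha.
std::vector<std::string> apply_expansion(const std::vector<std::string>& current,
                                         const std::vector<int>& x,
                                         const std::string& alpha) {
    std::vector<std::string> result = current;
    for (std::size_t i = 0; i < current.size() && i < x.size(); ++i)
        if (x[i] == 1) result[i] = alpha;
    return result;
}
```

Called with the labeling (beta, beta, beta, gamma, gamma) and x = (1, 1, 0, 0, 1), it reproduces the configuration from the example: (alpha, alpha, beta, gamma, alpha).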


Writing the Energy as a Pseudo-Boolean Function (2)

Let us assume we have two pixels p and q. Both pixels are assigned to label β in the current labeling:

β β

Recall our energy function:

$$E(D) = \sum_{p \in I} m(p, d_p) + \sum_{(p,q) \in N} s(p, q)$$

The first sum is the dissimilarity function. The second sum is the Potts model (impose penalty P if p and q have different labels).

We can write our energy as a function of binary variables xp and xq:

$$E(x_p, x_q) = E_p(x_p) + E_q(x_q) + E_{p,q}(x_p, x_q)$$

We call E_p and E_q unary terms, since they depend on one variable. We call E_{p,q} a pairwise term, since it depends on two variables.

where:

$$E_p(0) = m(p, \beta), \qquad E_p(1) = m(p, \alpha)$$

$$E_q(0) = m(q, \beta), \qquad E_q(1) = m(q, \alpha)$$

$$E_{p,q}(0,0) = 0 \quad (\text{labels } \beta\,\beta)$$
$$E_{p,q}(0,1) = P \quad (\text{labels } \beta\,\alpha)$$
$$E_{p,q}(1,0) = P \quad (\text{labels } \alpha\,\beta)$$
$$E_{p,q}(1,1) = 0 \quad (\text{labels } \alpha\,\alpha)$$

We have to find the settings of binary variables xp and xq that minimize the energy.

This comes next.

The Min-Cut Problem

We have two dedicated nodes, the source and the sink.

We partition the graph into two sets S and T where source ∈ S and sink ∈ T.

The cut consists of all edges that lead from S to T.

The cost of a cut is the sum of the weights of these edges.

The minimum cut is the cut of minimum cost among all possible cuts.

[Figure: directed graph with the nodes source, p, q and sink and the edge weights 2, 9, 2, 1, 5 and 4. One example cut has costs 5 + 2 + 9 = 16. The minimum cut has costs 2 + 1 + 4 = 7.]

Example taken from Pushmeet Kohli’s ICCV09 tutorial
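For a graph this small, the minimum cut can be checked by enumerating every partition. The sketch below is an illustration only. The edge directions of the example graph are not recoverable from the slide text, so the layout in `example_graph` (source→p: 2, source→q: 9, p→sink: 5, q→sink: 4, p→q: 2, q→p: 1) is an assumption, chosen because it reproduces the two cut costs quoted above (16 and 7). The names `Edge`, `brute_force_min_cut` and `example_graph` are hypothetical.

```cpp
#include <limits>
#include <vector>

struct Edge { int from, to; double w; };

// Enumerates every partition with node 0 (source) in S and node n-1 (sink)
// in T, and returns the smallest sum of weights of edges leading from S to T.
double brute_force_min_cut(int n, const std::vector<Edge>& edges) {
    double best = std::numeric_limits<double>::infinity();
    const int inner = n - 2;                    // nodes 1 .. n-2 are free
    for (unsigned mask = 0; mask < (1u << inner); ++mask) {
        std::vector<bool> in_S(n, false);
        in_S[0] = true;                         // the source is always in S
        for (int i = 0; i < inner; ++i)
            if (mask & (1u << i)) in_S[i + 1] = true;
        double cost = 0.0;
        for (const Edge& e : edges)
            if (in_S[e.from] && !in_S[e.to]) cost += e.w;  // edge from S to T
        if (cost < best) best = cost;
    }
    return best;
}

// Assumed layout of the example graph: 0 = source, 1 = p, 2 = q, 3 = sink.
std::vector<Edge> example_graph() {
    return {{0, 1, 2.0}, {0, 2, 9.0}, {1, 3, 5.0},
            {2, 3, 4.0}, {1, 2, 2.0}, {2, 1, 1.0}};
}
```

Exhaustive enumeration grows exponentially with the number of nodes; it only serves to make the definition concrete. The algorithms discussed in [Boykov,PAMI04] avoid this enumeration.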

The Min-Cut Problem

The min-cut problem has been extensively studied in graph theory.

There exists a variety of algorithms that
1. Can find the exact solution
2. Are computationally very fast.

Side notes:
• The min-cut problem and the max-flow problem are dual problems:
=> Solving min-cut also gives the solution for max-flow and vice versa.
• Max-flow and min-cut are therefore often used synonymously.
• If you are interested in algorithms for computing min-cut/max-flow:
- Read [Boykov,PAMI04]

Nice, but does this help us to optimize our pseudo-boolean function?

Optimization of our Pseudo-Boolean Function

We insert a node for each pixel.

If a node p is a member of S after the cut, then xp = 0.

If p is a member of T, then xp = 1.

We adjust the edges so that the costs of the cut are equal to the energy of our binary variables x.

The minimum cut therefore also represents the minimum of our energy.

[Figure: graph with source, sink and the pixel nodes p and q. In the cut shown, p lies in S (=> xp = 0) and q lies in T (=> xq = 1). The costs of this cut have to be equal to the energy of xp = 0 and xq = 1.]

How can we do this for our example?

Optimization of our Pseudo-Boolean Function

Our unary terms:

$$E_p(0) = m(p, \beta), \qquad E_p(1) = m(p, \alpha)$$

$$E_q(0) = m(q, \beta), \qquad E_q(1) = m(q, \alpha)$$

Our pairwise term:

$$E_{p,q}(0,0) = 0, \quad E_{p,q}(0,1) = P, \quad E_{p,q}(1,0) = P, \quad E_{p,q}(1,1) = 0$$

[Figure: graph with source, p, q and sink. The edges between the pixel nodes and the source and sink carry the weights m(p,β), m(p,α), m(q,β) and m(q,α). The two edges between p and q carry the weight P each.]

Let us check whether this graph really represents our energy.

• xp = 0, xq = 0:
- Cut costs: C = m(p,β)+m(q,β)
- Energy: E(0,0) = Ep(0)+Eq(0)+Ep,q(0,0) = m(p,β)+m(q,β)+0

• xp = 0, xq = 1:
- Cut costs: C = m(p,β)+m(q,α)+P
- Energy: E(0,1) = Ep(0)+Eq(1)+Ep,q(0,1) = m(p,β)+m(q,α)+P

• xp = 1, xq = 0:
- Cut costs: C = m(p,α)+m(q,β)+P
- Energy: E(1,0) = Ep(1)+Eq(0)+Ep,q(1,0) = m(p,α)+m(q,β)+P

• xp = 1, xq = 1:
- Cut costs: C = m(p,α)+m(q,α)
- Energy: E(1,1) = Ep(1)+Eq(1)+Ep,q(1,1) = m(p,α)+m(q,α)+0

We have shown that the graph represents our energy.
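The four checks can also be run mechanically. The sketch below builds the two-pixel graph as an edge list and compares the cut costs with the energy for every assignment. The edge directions are deduced from the cut costs listed above and should be read as an assumption: source→p and source→q carry Ep(1) and Eq(1), p→sink and q→sink carry Ep(0) and Eq(0), and p→q and q→p carry P. All names (`Edge`, `build_graph`, `cut_cost`, `energy`) are hypothetical.

```cpp
#include <array>
#include <vector>

struct Edge { int from, to; double w; };

// Node ids: 0 = source, 1 = p, 2 = q, 3 = sink.
// Ep[0] = m(p,beta), Ep[1] = m(p,alpha); Eq analogously; P = Potts penalty.
std::vector<Edge> build_graph(const std::array<double, 2>& Ep,
                              const std::array<double, 2>& Eq, double P) {
    return {{0, 1, Ep[1]}, {1, 3, Ep[0]},   // terminal edges of p
            {0, 2, Eq[1]}, {2, 3, Eq[0]},   // terminal edges of q
            {1, 2, P},     {2, 1, P}};      // smoothness edges
}

// Nodes with x = 0 are in S, nodes with x = 1 are in T.
double cut_cost(const std::vector<Edge>& edges, int xp, int xq) {
    const bool in_T[4] = {false, xp == 1, xq == 1, true};
    double c = 0.0;
    for (const Edge& e : edges)
        if (!in_T[e.from] && in_T[e.to]) c += e.w;  // edge leads from S to T
    return c;
}

// The pseudo-boolean energy E(xp, xq) with a Potts pairwise term.
double energy(const std::array<double, 2>& Ep,
              const std::array<double, 2>& Eq, double P, int xp, int xq) {
    return Ep[xp] + Eq[xq] + (xp != xq ? P : 0.0);
}
```

Looping over the four assignments (0,0), (0,1), (1,0), (1,1) and comparing `cut_cost` with `energy` repeats the check made above with concrete numbers.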

What Energy Function Can be Optimized via Graph-Cuts?

Not every boolean energy function can be represented by a graph!

The pairwise terms have to fulfill the following constraint [Kolmogorov,PAMI04]:

$$E_{p,q}(0,0) + E_{p,q}(1,1) \le E_{p,q}(0,1) + E_{p,q}(1,0)$$

In our example, this has been the case:

$$0 + 0 \le P + P$$

If there is at least one pairwise term in the boolean energy function that violates this constraint, the energy is said to be non-submodular. Otherwise, it is called submodular.

Optimizing non-submodular energies is an NP-hard problem.
=> Computing the optimal α-expansion becomes very difficult (but not impossible).
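The constraint is easy to test for a given pairwise term. A minimal sketch, assuming the four values of a pairwise term are stored in a 2x2 table; the names `PairwiseTerm`, `is_submodular` and `all_submodular` are hypothetical.

```cpp
#include <array>
#include <vector>

// E[a][b] holds the pairwise term E_{p,q}(a, b) for a, b in {0, 1}.
using PairwiseTerm = std::array<std::array<double, 2>, 2>;

// Checks E(0,0) + E(1,1) <= E(0,1) + E(1,0) for one pairwise term.
bool is_submodular(const PairwiseTerm& E) {
    return E[0][0] + E[1][1] <= E[0][1] + E[1][0];
}

// An energy is submodular only if every one of its pairwise terms is.
bool all_submodular(const std::vector<PairwiseTerm>& terms) {
    for (const PairwiseTerm& E : terms)
        if (!is_submodular(E)) return false;
    return true;
}
```

For the Potts term of the example, with entries 0, P, P, 0 and P ≥ 0, the check succeeds.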

Max-Flow/Min-Cut Library

“I would like to use graph-cuts, but I do not want to mess around with graphs.”

Good news:
• You don’t have to. It is sufficient to define your energy as a pseudo-boolean function.

You can then download the Max-Flow/Min-Cut library from
http://www.cs.ucl.ac.uk/staff/V.Kolmogorov/software.html

The library will:
• Construct the graph that represents your boolean function
• Compute the min-cut
• Provide you the optimal labeling

See example on next slide.

Example Code for the Max-Flow/Min-Cut Library

#include <cstdio>  // printf

// Simplified interface; the function names in the downloaded library may differ.
// Ep, Eq and Epq stand for the unary and pairwise terms of the boolean energy.

// Set up graph and add 2 nodes
Graph *g = new Graph();
int p = g->AddNode();
int q = g->AddNode();

// Define boolean energy
g->AddUnaryTerm(p, Ep(0), Ep(1));
g->AddUnaryTerm(q, Eq(0), Eq(1));
g->AddPairwiseTerm(p, q, Epq(0,0), Epq(0,1), Epq(1,0), Epq(1,1));

// Construct graph that represents the energy
// Compute min-cut
g->Solve();

// Write optimal labels
printf("optimal label p %d\n", g->GetLabel(p));
printf("optimal label q %d\n", g->GetLabel(q));

// Release the graph
delete g;

The Fusion Move [Lempitsky,ICCV07]

Two proposals are fused to obtain a new solution of lower energy.

Fusion Move:
• Let fp denote pixel p’s label in proposal 1.
• Let gp denote p’s label in proposal 2.
• After fusion, p is assigned to either fp or gp.

α-expansion is a special case of a fusion move where the second proposal contains only a single label.

[Figure: Proposal 1, Proposal 2, and one possible labeling after fusion of proposals 1 and 2]

Iterative Algorithm – Fusion Moves

Iterative Algorithm:
Start with an arbitrary labeling f.
For each proposal g:
- Find f* = argmin E(f’) among f’ being one possible fusion of f and g.
- f := f*

Iterative Algorithm – Example Video

(Fusion moves for stereo matching)


Why Fusion Moves? (1)

Parallelization:
• Parallel implementations of min-cut algorithms are very difficult to accomplish.

We can do the following parallel implementation:
• CPU1 computes α-expansions for disparities 0-8
• CPU2 computes α-expansions for disparities 9-16
• The results of both CPUs are then fused

[Figure: the results of the two CPUs are combined by a fusion move]

Why Fusion Moves? (2)

You have two algorithms that have different failure modes.

Optical flow example:
1. Horn-Schunck:
- works well in untextured regions
- fails at flow borders
2. Lucas-Kanade:
- fails in untextured regions
- works well at flow borders

• We run both algorithms and fuse their results
• Fusion move will
- pick the Horn-Schunck result for untextured regions
- pick the Lucas-Kanade result at flow borders
- => much better result
• Can even work in real-time

[Figure: first frame, ground truth optical flow, result of the Horn-Schunck algorithm, result of the Lucas-Kanade algorithm, and the fusion of both algorithms obtained by a fusion move]

Why Fusion Moves? (3)

α-expansions will become intractable if there is a very large or infinite label set.

For example:
• High-resolution stereo:
- You might need to test > 1000 disparity labels
• Optical flow:
- The space of all possible discrete flow vectors is very large (2 dimensions)
• Assigning pixels to continuous disparity values:
- The set of all continuous disparities is of infinite size
• Assigning pixels to 3D surfaces:
- There is an infinite number of 3D surfaces.
- You will hear more about surface stereo in a different session.

Fusion moves can handle all of these cases! Probably the most important argument.

Writing the Boolean Fusion Move Energy

Let us assume we have two pixels p and q. Our 2 proposals have the following labeling:

Proposal 1: α β
Proposal 2: γ α

This time xp has a different meaning:
• xp = 0, if p takes the label of proposal 1
• xp = 1, if p takes the label of proposal 2

As before, we write our energy as a function of binary variables xp and xq:

$$E(x_p, x_q) = E_p(x_p) + E_q(x_q) + E_{p,q}(x_p, x_q)$$

where:

$$E_p(0) = m(p, \alpha), \qquad E_p(1) = m(p, \gamma)$$

$$E_q(0) = m(q, \beta), \qquad E_q(1) = m(q, \alpha)$$

$$E_{p,q}(0,0) = P \quad (\text{labels } \alpha\,\beta)$$
$$E_{p,q}(0,1) = 0 \quad (\text{labels } \alpha\,\alpha)$$
$$E_{p,q}(1,0) = P \quad (\text{labels } \gamma\,\beta)$$
$$E_{p,q}(1,1) = P \quad (\text{labels } \gamma\,\alpha)$$

Can you see a problem here?

The Fusion Energy Can Be Non-Submodular

Remember the condition for sub-modularity:

$$E_{p,q}(0,0) + E_{p,q}(1,1) \le E_{p,q}(0,1) + E_{p,q}(1,0)$$

Our example energy is non-submodular:

$$P + P > 0 + P$$

Finding the optimal fusion move is, in general, an NP-hard problem.
• That is actually the reason why fusion moves have not been used before 2007.

Good news:
• Nowadays there exist powerful graph-cut-based optimization algorithms that can handle non-submodular energies.
• In particular, I mean Quadratic Pseudo Boolean Optimization (QPBO)
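The pairwise term of a fusion move can be built directly from the two proposals, which makes it easy to see when the sub-modularity condition fails. The sketch below assumes integer labels and the Potts model; `fusion_pairwise` and `is_submodular` are hypothetical names, not part of any library.

```cpp
#include <array>

// E[a][b] holds the pairwise term E_{p,q}(a, b) for a, b in {0, 1}.
using PairwiseTerm = std::array<std::array<double, 2>, 2>;

// fp, fq: labels of p and q in proposal 1 (x = 0);
// gp, gq: labels of p and q in proposal 2 (x = 1).
PairwiseTerm fusion_pairwise(int fp, int fq, int gp, int gq, double P) {
    const int lp[2] = {fp, gp};   // label of p for xp = 0 / 1
    const int lq[2] = {fq, gq};   // label of q for xq = 0 / 1
    PairwiseTerm E{};
    for (int a = 0; a < 2; ++a)
        for (int b = 0; b < 2; ++b)
            E[a][b] = (lp[a] == lq[b]) ? 0.0 : P;   // Potts model
    return E;
}

// Checks E(0,0) + E(1,1) <= E(0,1) + E(1,0).
bool is_submodular(const PairwiseTerm& E) {
    return E[0][0] + E[1][1] <= E[0][1] + E[1][0];
}
```

With the labels encoded as α = 0, β = 1, γ = 2, the call `fusion_pairwise(0, 1, 2, 0, P)` yields the table P, 0, P, P of the example, which violates the condition for any P > 0. If the second proposal is constant (gp = gq = α), the move is an α-expansion and the Potts term stays submodular.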

Quadratic Pseudo Boolean Optimization (QPBO) [Kolmogorov,PAMI07]

QPBO can only compute a part of the global optimal solution.

This means
• Instead of a complete labeling such as
- xp = 0, xq = 1, xr = 0
QPBO will in general provide an incomplete labeling such as
- xp = 0, xq = ø, xr = ø
where ø means “unknown”.
• Those pixels whose label ≠ ø would also have this label in the “complete” global optimal solution.

[Figure: Proposal 1, Proposal 2, and the fused result (computed via QPBOI). Pixels labeled as unknown by QPBO are shown in black.]

What to do with pixels labeled as unknown?

Autarky property of QPBO:
• If you assign all unknown pixels to label 0, the energy is guaranteed to be lower than or equal to that of the labeling <0,0,0,…,0>.
• In case of a fusion move, this means that assigning unknown pixels to the labels of proposal 1 will lead to an energy lower than or equal to that of proposal 1.
• Assigning unknown pixels to label 0 is known as QPBOF.

You can do more [Rother,CVPR07]:
• QPBOI (I stands for Improve):
- Tries to improve the QPBOF solution.
• QPBOP (P stands for Probe):
- Tries to find more pixels of the global optimal solution.

You can download QPBO:
• http://www.cs.ucl.ac.uk/staff/V.Kolmogorov/software.html
• Also includes QPBOI and QPBOP
• Interface almost identical to MaxFlow Library.
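The QPBOF completion described above amounts to a simple post-processing step on the partial labeling. The sketch below is an illustration only and does not use the QPBO library: the sentinel `kUnknown`, standing for ø, and the helper names `fill_unknown_with_zero` and `fuse` are assumptions.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical sentinel standing for the "unknown" label.
constexpr int kUnknown = -1;

// QPBOF-style completion: every unknown variable is set to 0.
std::vector<int> fill_unknown_with_zero(const std::vector<int>& x) {
    std::vector<int> out = x;
    for (int& xi : out)
        if (xi == kUnknown) xi = 0;
    return out;
}

// Fusion move: x[i] == 0 takes the label of proposal 1 (f),
// x[i] == 1 takes the label of proposal 2 (g).
std::vector<int> fuse(const std::vector<int>& f, const std::vector<int>& g,
                      const std::vector<int>& x) {
    std::vector<int> out(f.size());
    for (std::size_t i = 0; i < f.size(); ++i)
        out[i] = (x[i] == 0) ? f[i] : g[i];
    return out;
}
```

Applying `fuse` to the completed vector gives unknown pixels the labels of proposal 1, which is the case covered by the autarky property.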

Corrected Iterative Algorithm

I have cheated in the definition of the iterative algorithm.

Iterative Algorithm:
Start with an arbitrary labeling f.
For each proposal g:
- Find f* = argmin E(f’) among f’ being one possible fusion of f and g.
- f := f*

In the general case, we cannot really compute the global optimal fusion move (NP-hard problem).

We just find a “good” one.

The energy of f* is guaranteed to be equal to or lower than that of f (autarky property of QPBO).

The iterative algorithm will therefore converge to a local energy minimum.

Summary

Move making algorithms

α-expansions:
• Iterative algorithm
• Computing the optimal α-expansion

Sub-modularity condition

Fusion moves:
• Handle large label spaces
• Computing a “good” fusion move

QPBO

References

[Boykov,PAMI01] Y. Boykov, O. Veksler, R. Zabih, Fast Approximate Energy Minimization via Graph Cuts, PAMI, vol. 23, no. 11, pp. 1222-1239, 2001.

[Boykov,PAMI04] Y. Boykov, V. Kolmogorov, An Experimental Comparison of Min-Cut/Max-Flow Algorithms for Energy Minimization in Vision. PAMI, vol. 26, no. 9, pp. 1124-1137, 2004.

[Kolmogorov,PAMI07] V. Kolmogorov, C. Rother, Minimizing Nonsubmodular Functions with Graph Cuts-A Review, PAMI, vol. 29, no. 7, pp. 1274-1279, 2007.

[Lempitsky,ICCV07] V. Lempitsky, C. Rother, A. Blake, LogCut - Efficient Graph Cut Optimization for Markov Random Fields, ICCV 2007.

[Rother,CVPR07] C. Rother, V. Kolmogorov, V. Lempitsky, M. Szummer, Optimizing Binary MRFs Via Extended Roof Duality, CVPR 2007.

[Roy,ICCV98] S. Roy, I. Cox, A Maximum-Flow Formulation of the N-Camera Stereo Correspondence Problem, ICCV 1998.
