59
Visual Recognition: Inference in Graphical Models Raquel Urtasun TTI Chicago Feb 28, 2012 Raquel Urtasun (TTI-C) Visual Recognition Feb 28, 2012 1 / 31

Visual Recognition: Inference in Graphical Modelsttic.uchicago.edu/~rurtasun/courses/VisualRecognition/lecture13.pdf · Visual Recognition: Inference in Graphical Models Raquel Urtasun

  • Upload
    others

  • View
    3

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Visual Recognition: Inference in Graphical Modelsttic.uchicago.edu/~rurtasun/courses/VisualRecognition/lecture13.pdf · Visual Recognition: Inference in Graphical Models Raquel Urtasun

Visual Recognition: Inference in Graphical Models

Raquel Urtasun

TTI Chicago

Feb 28, 2012

Raquel Urtasun (TTI-C) Visual Recognition Feb 28, 2012 1 / 31

Page 2: Visual Recognition: Inference in Graphical Modelsttic.uchicago.edu/~rurtasun/courses/VisualRecognition/lecture13.pdf · Visual Recognition: Inference in Graphical Models Raquel Urtasun

Graphical models

Applications

Representation

Inference

message passing (LP relaxations)graph cuts

Learning

Raquel Urtasun (TTI-C) Visual Recognition Feb 28, 2012 2 / 31

Page 3: Visual Recognition: Inference in Graphical Modelsttic.uchicago.edu/~rurtasun/courses/VisualRecognition/lecture13.pdf · Visual Recognition: Inference in Graphical Models Raquel Urtasun

Inference with graph cuts

Raquel Urtasun (TTI-C) Visual Recognition Feb 28, 2012 3 / 31

Page 4: Visual Recognition: Inference in Graphical Modelsttic.uchicago.edu/~rurtasun/courses/VisualRecognition/lecture13.pdf · Visual Recognition: Inference in Graphical Models Raquel Urtasun

Submodular Functions

A Pseudo-boolean function f : {0, 1}n → < is submodular if

f (A) + f (B) ≥ f (A ∨ B)︸ ︷︷ ︸OR

+ f (A ∧ B)︸ ︷︷ ︸AND

∀A,B ∈ {0, 1}n

Example: n = 2, A = [1, 0], B = [0, 1]

f ([1, 0]) + f ([0, 1]) ≥ f ([1, 1]) + f ([0, 0])

Sum of submodular functions is submodular → Easy to proof.

Some energies in computer vision can be submodular

Raquel Urtasun (TTI-C) Visual Recognition Feb 28, 2012 4 / 31

Page 5: Visual Recognition: Inference in Graphical Modelsttic.uchicago.edu/~rurtasun/courses/VisualRecognition/lecture13.pdf · Visual Recognition: Inference in Graphical Models Raquel Urtasun

Minimizing submodular Functions

Pairwise submodular functions can be transformed to st-mincut/max-flow[Hammer, 65].

Very low running time ∼ O(n)

Raquel Urtasun (TTI-C) Visual Recognition Feb 28, 2012 5 / 31

Page 6: Visual Recognition: Inference in Graphical Modelsttic.uchicago.edu/~rurtasun/courses/VisualRecognition/lecture13.pdf · Visual Recognition: Inference in Graphical Models Raquel Urtasun

The ST-mincut problem

Suppose we have a graph G = {V ,E ,C}, with vertices V , Edges E andcosts C .

[Source: P. Kohli]Raquel Urtasun (TTI-C) Visual Recognition Feb 28, 2012 6 / 31

Page 7: Visual Recognition: Inference in Graphical Modelsttic.uchicago.edu/~rurtasun/courses/VisualRecognition/lecture13.pdf · Visual Recognition: Inference in Graphical Models Raquel Urtasun

The ST-mincut problem

An st-cut (S,T) divides the nodes between source and sink.

The cost of a st-cut is the sum of cost of all edges going from S to T

[Source: P. Kohli]Raquel Urtasun (TTI-C) Visual Recognition Feb 28, 2012 7 / 31

Page 8: Visual Recognition: Inference in Graphical Modelsttic.uchicago.edu/~rurtasun/courses/VisualRecognition/lecture13.pdf · Visual Recognition: Inference in Graphical Models Raquel Urtasun

The ST-mincut problem

The st-mincut is the st-cut with the minimum cost

[Source: P. Kohli]Raquel Urtasun (TTI-C) Visual Recognition Feb 28, 2012 8 / 31

Page 9: Visual Recognition: Inference in Graphical Modelsttic.uchicago.edu/~rurtasun/courses/VisualRecognition/lecture13.pdf · Visual Recognition: Inference in Graphical Models Raquel Urtasun

Back to our energy minimization

Construct a graph such that

1 Any st-cut corresponds to an assignment of x

2 The cost of the cut is equal to the energy of x : E(x)

[Source: P. Kohli]Raquel Urtasun (TTI-C) Visual Recognition Feb 28, 2012 9 / 31

Page 10: Visual Recognition: Inference in Graphical Modelsttic.uchicago.edu/~rurtasun/courses/VisualRecognition/lecture13.pdf · Visual Recognition: Inference in Graphical Models Raquel Urtasun

St-mincut and Energy Minimization

[Source: P. Kohli]

Raquel Urtasun (TTI-C) Visual Recognition Feb 28, 2012 10 / 31

Page 11: Visual Recognition: Inference in Graphical Modelsttic.uchicago.edu/~rurtasun/courses/VisualRecognition/lecture13.pdf · Visual Recognition: Inference in Graphical Models Raquel Urtasun

How are they equivalent?

A B

C D

0 1

0

1 xi

xj

= A + 0 0

C-A C-A

0 1

0

1

0 D-C

0 D-C

0 1

0

1

0 B+C-A-D

0 0

0 1

0

1 + +

if x1=1 add C-A

if x2 = 1 add D-C

B+C-A-D ! 0 is true from the submodularity of !ij

A = !ij (0,0) B = !ij(0,1) C = !ij

(1,0) D = !ij (1,1)

!ij (xi,xj) = !ij(0,0) + (!ij(1,0)-!ij(0,0)) xi + (!ij(1,0)-!ij(0,0)) xj + (!ij(1,0) + !ij(0,1) - !ij(0,0) - !ij(1,1)) (1-xi) xj

[Source: P. Kohli]Raquel Urtasun (TTI-C) Visual Recognition Feb 28, 2012 11 / 31

Page 12: Visual Recognition: Inference in Graphical Modelsttic.uchicago.edu/~rurtasun/courses/VisualRecognition/lecture13.pdf · Visual Recognition: Inference in Graphical Models Raquel Urtasun

Graph Construction

[Source: P. Kohli]Raquel Urtasun (TTI-C) Visual Recognition Feb 28, 2012 12 / 31

Page 13: Visual Recognition: Inference in Graphical Modelsttic.uchicago.edu/~rurtasun/courses/VisualRecognition/lecture13.pdf · Visual Recognition: Inference in Graphical Models Raquel Urtasun

Graph Construction

[Source: P. Kohli]Raquel Urtasun (TTI-C) Visual Recognition Feb 28, 2012 12 / 31

Page 14: Visual Recognition: Inference in Graphical Modelsttic.uchicago.edu/~rurtasun/courses/VisualRecognition/lecture13.pdf · Visual Recognition: Inference in Graphical Models Raquel Urtasun

Graph Construction

[Source: P. Kohli]Raquel Urtasun (TTI-C) Visual Recognition Feb 28, 2012 12 / 31

Page 15: Visual Recognition: Inference in Graphical Modelsttic.uchicago.edu/~rurtasun/courses/VisualRecognition/lecture13.pdf · Visual Recognition: Inference in Graphical Models Raquel Urtasun

Graph Construction

[Source: P. Kohli]Raquel Urtasun (TTI-C) Visual Recognition Feb 28, 2012 12 / 31

Page 16: Visual Recognition: Inference in Graphical Modelsttic.uchicago.edu/~rurtasun/courses/VisualRecognition/lecture13.pdf · Visual Recognition: Inference in Graphical Models Raquel Urtasun

Graph Construction

[Source: P. Kohli]Raquel Urtasun (TTI-C) Visual Recognition Feb 28, 2012 12 / 31

Page 17: Visual Recognition: Inference in Graphical Modelsttic.uchicago.edu/~rurtasun/courses/VisualRecognition/lecture13.pdf · Visual Recognition: Inference in Graphical Models Raquel Urtasun

Graph Construction

[Source: P. Kohli]Raquel Urtasun (TTI-C) Visual Recognition Feb 28, 2012 12 / 31

Page 18: Visual Recognition: Inference in Graphical Modelsttic.uchicago.edu/~rurtasun/courses/VisualRecognition/lecture13.pdf · Visual Recognition: Inference in Graphical Models Raquel Urtasun

Graph Construction

[Source: P. Kohli]Raquel Urtasun (TTI-C) Visual Recognition Feb 28, 2012 12 / 31

Page 19: Visual Recognition: Inference in Graphical Modelsttic.uchicago.edu/~rurtasun/courses/VisualRecognition/lecture13.pdf · Visual Recognition: Inference in Graphical Models Raquel Urtasun

Graph Construction

[Source: P. Kohli]Raquel Urtasun (TTI-C) Visual Recognition Feb 28, 2012 12 / 31

Page 20: Visual Recognition: Inference in Graphical Modelsttic.uchicago.edu/~rurtasun/courses/VisualRecognition/lecture13.pdf · Visual Recognition: Inference in Graphical Models Raquel Urtasun

Graph Construction

[Source: P. Kohli]Raquel Urtasun (TTI-C) Visual Recognition Feb 28, 2012 12 / 31

Page 21: Visual Recognition: Inference in Graphical Modelsttic.uchicago.edu/~rurtasun/courses/VisualRecognition/lecture13.pdf · Visual Recognition: Inference in Graphical Models Raquel Urtasun

How to compute the St-mincut?

[Source: P. Kohli]

Raquel Urtasun (TTI-C) Visual Recognition Feb 28, 2012 13 / 31

Page 22: Visual Recognition: Inference in Graphical Modelsttic.uchicago.edu/~rurtasun/courses/VisualRecognition/lecture13.pdf · Visual Recognition: Inference in Graphical Models Raquel Urtasun

How does the code look like

[Source: P. Kohli]Raquel Urtasun (TTI-C) Visual Recognition Feb 28, 2012 14 / 31

Page 23: Visual Recognition: Inference in Graphical Modelsttic.uchicago.edu/~rurtasun/courses/VisualRecognition/lecture13.pdf · Visual Recognition: Inference in Graphical Models Raquel Urtasun

How does the code look like

[Source: P. Kohli]

Raquel Urtasun (TTI-C) Visual Recognition Feb 28, 2012 14 / 31

Page 24: Visual Recognition: Inference in Graphical Modelsttic.uchicago.edu/~rurtasun/courses/VisualRecognition/lecture13.pdf · Visual Recognition: Inference in Graphical Models Raquel Urtasun

How does the code look like

[Source: P. Kohli]Raquel Urtasun (TTI-C) Visual Recognition Feb 28, 2012 14 / 31

Page 25: Visual Recognition: Inference in Graphical Modelsttic.uchicago.edu/~rurtasun/courses/VisualRecognition/lecture13.pdf · Visual Recognition: Inference in Graphical Models Raquel Urtasun

How does the code look like

[Source: P. Kohli]Raquel Urtasun (TTI-C) Visual Recognition Feb 28, 2012 14 / 31

Page 26: Visual Recognition: Inference in Graphical Modelsttic.uchicago.edu/~rurtasun/courses/VisualRecognition/lecture13.pdf · Visual Recognition: Inference in Graphical Models Raquel Urtasun

Graph cuts for multi-label problems

Exact Transformation to QPBF [Roy and Cox 98] [Ishikawa 03] [Schlesingeret al. 06] [Ramalingam et al. 08]

Very high computational cost

[Source: P. Kohli]Raquel Urtasun (TTI-C) Visual Recognition Feb 28, 2012 15 / 31

Page 27: Visual Recognition: Inference in Graphical Modelsttic.uchicago.edu/~rurtasun/courses/VisualRecognition/lecture13.pdf · Visual Recognition: Inference in Graphical Models Raquel Urtasun

Alternative: Move making

[Source: P. Kohli]

Raquel Urtasun (TTI-C) Visual Recognition Feb 28, 2012 16 / 31

Page 28: Visual Recognition: Inference in Graphical Modelsttic.uchicago.edu/~rurtasun/courses/VisualRecognition/lecture13.pdf · Visual Recognition: Inference in Graphical Models Raquel Urtasun

Alternative: Move making

[Source: P. Kohli]

Raquel Urtasun (TTI-C) Visual Recognition Feb 28, 2012 16 / 31

Page 29: Visual Recognition: Inference in Graphical Modelsttic.uchicago.edu/~rurtasun/courses/VisualRecognition/lecture13.pdf · Visual Recognition: Inference in Graphical Models Raquel Urtasun

Computing the Optimal Move

[Source: P. Kohli]Raquel Urtasun (TTI-C) Visual Recognition Feb 28, 2012 17 / 31

Page 30: Visual Recognition: Inference in Graphical Modelsttic.uchicago.edu/~rurtasun/courses/VisualRecognition/lecture13.pdf · Visual Recognition: Inference in Graphical Models Raquel Urtasun

Move Making Algorithms

[Source: P. Kohli]

Raquel Urtasun (TTI-C) Visual Recognition Feb 28, 2012 18 / 31

Page 31: Visual Recognition: Inference in Graphical Modelsttic.uchicago.edu/~rurtasun/courses/VisualRecognition/lecture13.pdf · Visual Recognition: Inference in Graphical Models Raquel Urtasun

Energy Minimization

Consider pairwise MRFs

E (f ) =∑

{p,q}∈N

Vp,q(fp, fq) +∑p

Dp(fp)

with N defining the interactions between nodes, e.g., pixels

Dp non-negative, but arbitrary.

Same as before, where Vp,q ≡ −θα and Dp ≡ −θp.

This is the graph-cuts notation.

Important to notice it’s the same thing.

Raquel Urtasun (TTI-C) Visual Recognition Feb 28, 2012 19 / 31

Page 32: Visual Recognition: Inference in Graphical Modelsttic.uchicago.edu/~rurtasun/courses/VisualRecognition/lecture13.pdf · Visual Recognition: Inference in Graphical Models Raquel Urtasun

Energy Minimization

Consider pairwise MRFs

E (f ) =∑

{p,q}∈N

Vp,q(fp, fq) +∑p

Dp(fp)

with N defining the interactions between nodes, e.g., pixels

Dp non-negative, but arbitrary.

Same as before, where Vp,q ≡ −θα and Dp ≡ −θp.

This is the graph-cuts notation.

Important to notice it’s the same thing.

Raquel Urtasun (TTI-C) Visual Recognition Feb 28, 2012 19 / 31

Page 33: Visual Recognition: Inference in Graphical Modelsttic.uchicago.edu/~rurtasun/courses/VisualRecognition/lecture13.pdf · Visual Recognition: Inference in Graphical Models Raquel Urtasun

Energy Minimization

Consider pairwise MRFs

E (f ) =∑

{p,q}∈N

Vp,q(fp, fq) +∑p

Dp(fp)

with N defining the interactions between nodes, e.g., pixels

Dp non-negative, but arbitrary.

Same as before, where Vp,q ≡ −θα and Dp ≡ −θp.

This is the graph-cuts notation.

Important to notice it’s the same thing.

Raquel Urtasun (TTI-C) Visual Recognition Feb 28, 2012 19 / 31

Page 34: Visual Recognition: Inference in Graphical Modelsttic.uchicago.edu/~rurtasun/courses/VisualRecognition/lecture13.pdf · Visual Recognition: Inference in Graphical Models Raquel Urtasun

Energy Minimization

Consider pairwise MRFs

E (f ) =∑

{p,q}∈N

Vp,q(fp, fq) +∑p

Dp(fp)

with N defining the interactions between nodes, e.g., pixels

Dp non-negative, but arbitrary.

Same as before, where Vp,q ≡ −θα and Dp ≡ −θp.

This is the graph-cuts notation.

Important to notice it’s the same thing.

Raquel Urtasun (TTI-C) Visual Recognition Feb 28, 2012 19 / 31

Page 35: Visual Recognition: Inference in Graphical Modelsttic.uchicago.edu/~rurtasun/courses/VisualRecognition/lecture13.pdf · Visual Recognition: Inference in Graphical Models Raquel Urtasun

Metric vs Semimetric

Two general classes of pairwise interactions

Metric if it satisfies for any set of labels α, β, γ

V (α, β) = 0 ↔ α = β

V (α, β) = V (β, α) ≥ 0

V (α, β) ≤ V (α, γ) + V (γ, β)

Semi-metric if it satisfies for any set of labels α, β, γ

V (α, β) = 0 ↔ α = β

V (α, β) = V (β, α) ≥ 0

Raquel Urtasun (TTI-C) Visual Recognition Feb 28, 2012 20 / 31

Page 36: Visual Recognition: Inference in Graphical Modelsttic.uchicago.edu/~rurtasun/courses/VisualRecognition/lecture13.pdf · Visual Recognition: Inference in Graphical Models Raquel Urtasun

Metric vs Semimetric

Two general classes of pairwise interactions

Metric if it satisfies for any set of labels α, β, γ

V (α, β) = 0 ↔ α = β

V (α, β) = V (β, α) ≥ 0

V (α, β) ≤ V (α, γ) + V (γ, β)

Semi-metric if it satisfies for any set of labels α, β, γ

V (α, β) = 0 ↔ α = β

V (α, β) = V (β, α) ≥ 0

Raquel Urtasun (TTI-C) Visual Recognition Feb 28, 2012 20 / 31

Page 37: Visual Recognition: Inference in Graphical Modelsttic.uchicago.edu/~rurtasun/courses/VisualRecognition/lecture13.pdf · Visual Recognition: Inference in Graphical Models Raquel Urtasun

Examples for 1D label set

Truncated quadratic is a semi-metric

V (α, β) = min(K , |α− β|2)

with K a constant.

Truncated absolute distance is a metric

V (α, β) = min(K , |α− β|)

with K a constant.

For multi-dimensional, replace | · | by any norm.

Potts model is a metric

V (α, β) = K · T (α 6= β)

with T (·) = 1 if the argument is true and 0 otherwise.

Raquel Urtasun (TTI-C) Visual Recognition Feb 28, 2012 21 / 31

Page 38: Visual Recognition: Inference in Graphical Modelsttic.uchicago.edu/~rurtasun/courses/VisualRecognition/lecture13.pdf · Visual Recognition: Inference in Graphical Models Raquel Urtasun

Examples for 1D label set

Truncated quadratic is a semi-metric

V (α, β) = min(K , |α− β|2)

with K a constant.

Truncated absolute distance is a metric

V (α, β) = min(K , |α− β|)

with K a constant.

For multi-dimensional, replace | · | by any norm.

Potts model is a metric

V (α, β) = K · T (α 6= β)

with T (·) = 1 if the argument is true and 0 otherwise.

Raquel Urtasun (TTI-C) Visual Recognition Feb 28, 2012 21 / 31

Page 39: Visual Recognition: Inference in Graphical Modelsttic.uchicago.edu/~rurtasun/courses/VisualRecognition/lecture13.pdf · Visual Recognition: Inference in Graphical Models Raquel Urtasun

Examples for 1D label set

Truncated quadratic is a semi-metric

V (α, β) = min(K , |α− β|2)

with K a constant.

Truncated absolute distance is a metric

V (α, β) = min(K , |α− β|)

with K a constant.

For multi-dimensional, replace | · | by any norm.

Potts model is a metric

V (α, β) = K · T (α 6= β)

with T (·) = 1 if the argument is true and 0 otherwise.

Raquel Urtasun (TTI-C) Visual Recognition Feb 28, 2012 21 / 31

Page 40: Visual Recognition: Inference in Graphical Modelsttic.uchicago.edu/~rurtasun/courses/VisualRecognition/lecture13.pdf · Visual Recognition: Inference in Graphical Models Raquel Urtasun

Examples for 1D label set

Truncated quadratic is a semi-metric

V (α, β) = min(K , |α− β|2)

with K a constant.

Truncated absolute distance is a metric

V (α, β) = min(K , |α− β|)

with K a constant.

For multi-dimensional, replace | · | by any norm.

Potts model is a metric

V (α, β) = K · T (α 6= β)

with T (·) = 1 if the argument is true and 0 otherwise.

Raquel Urtasun (TTI-C) Visual Recognition Feb 28, 2012 21 / 31

Page 41: Visual Recognition: Inference in Graphical Modelsttic.uchicago.edu/~rurtasun/courses/VisualRecognition/lecture13.pdf · Visual Recognition: Inference in Graphical Models Raquel Urtasun

Binary Moves

α− β moves works for semi-metrics

α expansion works for V being a metric

Figure: Figure from P. Kohli tutorial on graph-cuts

For certain x1 and x2, the move energy is sub-modular QPBF

Raquel Urtasun (TTI-C) Visual Recognition Feb 28, 2012 22 / 31

Page 42: Visual Recognition: Inference in Graphical Modelsttic.uchicago.edu/~rurtasun/courses/VisualRecognition/lecture13.pdf · Visual Recognition: Inference in Graphical Models Raquel Urtasun

Swap Move

[Source: P. Kohli]Raquel Urtasun (TTI-C) Visual Recognition Feb 28, 2012 23 / 31

Page 43: Visual Recognition: Inference in Graphical Modelsttic.uchicago.edu/~rurtasun/courses/VisualRecognition/lecture13.pdf · Visual Recognition: Inference in Graphical Models Raquel Urtasun

Swap Move

[Source: P. Kohli]

Raquel Urtasun (TTI-C) Visual Recognition Feb 28, 2012 23 / 31

Page 44: Visual Recognition: Inference in Graphical Modelsttic.uchicago.edu/~rurtasun/courses/VisualRecognition/lecture13.pdf · Visual Recognition: Inference in Graphical Models Raquel Urtasun

Expansion Move

[Source: P. Kohli]Raquel Urtasun (TTI-C) Visual Recognition Feb 28, 2012 24 / 31

Page 45: Visual Recognition: Inference in Graphical Modelsttic.uchicago.edu/~rurtasun/courses/VisualRecognition/lecture13.pdf · Visual Recognition: Inference in Graphical Models Raquel Urtasun

Expansion Move

[Source: P. Kohli]Raquel Urtasun (TTI-C) Visual Recognition Feb 28, 2012 24 / 31

Page 46: Visual Recognition: Inference in Graphical Modelsttic.uchicago.edu/~rurtasun/courses/VisualRecognition/lecture13.pdf · Visual Recognition: Inference in Graphical Models Raquel Urtasun

More formally

Any labeling can be uniquely represented by a partition of image pixelsP = {Pl |l ∈ L}, where Pl = {p ∈ P|fp = l} is a subset of pixels assignedlabel l .

There is a one to one correspondence between labelings f and partitions P.

Given a pair of labels α, β, a move from a partition P (labeling f ) to a newpartition P’ (labeling f ′) is called an α− β swap if Pl = P ′ for any labell 6= α, β.

The only difference between P and P ′ is that some pixels that were labeledin P are now labeled in P ′, and vice-versa.

Given a label l , a move from a partition P (labeling f ) to a new partition P ′

(labeling f ′) is called an α-expansion if Pα ⊂ P ′α and P ′

l ⊂ Pl .

An α-expansion move allows any set of image pixels to change their labelsto α.

Raquel Urtasun (TTI-C) Visual Recognition Feb 28, 2012 25 / 31

Page 47: Visual Recognition: Inference in Graphical Modelsttic.uchicago.edu/~rurtasun/courses/VisualRecognition/lecture13.pdf · Visual Recognition: Inference in Graphical Models Raquel Urtasun

More formally

Any labeling can be uniquely represented by a partition of image pixelsP = {Pl |l ∈ L}, where Pl = {p ∈ P|fp = l} is a subset of pixels assignedlabel l .

There is a one to one correspondence between labelings f and partitions P.

Given a pair of labels α, β, a move from a partition P (labeling f ) to a newpartition P’ (labeling f ′) is called an α− β swap if Pl = P ′ for any labell 6= α, β.

The only difference between P and P ′ is that some pixels that were labeledin P are now labeled in P ′, and vice-versa.

Given a label l , a move from a partition P (labeling f ) to a new partition P ′

(labeling f ′) is called an α-expansion if Pα ⊂ P ′α and P ′

l ⊂ Pl .

An α-expansion move allows any set of image pixels to change their labelsto α.

Raquel Urtasun (TTI-C) Visual Recognition Feb 28, 2012 25 / 31

Page 48: Visual Recognition: Inference in Graphical Modelsttic.uchicago.edu/~rurtasun/courses/VisualRecognition/lecture13.pdf · Visual Recognition: Inference in Graphical Models Raquel Urtasun

More formally

Any labeling can be uniquely represented by a partition of image pixelsP = {Pl |l ∈ L}, where Pl = {p ∈ P|fp = l} is a subset of pixels assignedlabel l .

There is a one to one correspondence between labelings f and partitions P.

Given a pair of labels α, β, a move from a partition P (labeling f ) to a newpartition P’ (labeling f ′) is called an α− β swap if Pl = P ′ for any labell 6= α, β.

The only difference between P and P ′ is that some pixels that were labeledin P are now labeled in P ′, and vice-versa.

Given a label l , a move from a partition P (labeling f ) to a new partition P ′

(labeling f ′) is called an α-expansion if Pα ⊂ P ′α and P ′

l ⊂ Pl .

An α-expansion move allows any set of image pixels to change their labelsto α.

Raquel Urtasun (TTI-C) Visual Recognition Feb 28, 2012 25 / 31

Page 49: Visual Recognition: Inference in Graphical Modelsttic.uchicago.edu/~rurtasun/courses/VisualRecognition/lecture13.pdf · Visual Recognition: Inference in Graphical Models Raquel Urtasun

More formally

Any labeling can be uniquely represented by a partition of image pixelsP = {Pl |l ∈ L}, where Pl = {p ∈ P|fp = l} is a subset of pixels assignedlabel l .

There is a one to one correspondence between labelings f and partitions P.

Given a pair of labels α, β, a move from a partition P (labeling f ) to a newpartition P’ (labeling f ′) is called an α− β swap if Pl = P ′ for any labell 6= α, β.

The only difference between P and P ′ is that some pixels that were labeledin P are now labeled in P ′, and vice-versa.

Given a label l , a move from a partition P (labeling f ) to a new partition P ′

(labeling f ′) is called an α-expansion if Pα ⊂ P ′α and P ′

l ⊂ Pl .

An α-expansion move allows any set of image pixels to change their labelsto α.

Raquel Urtasun (TTI-C) Visual Recognition Feb 28, 2012 25 / 31

Page 50: Visual Recognition: Inference in Graphical Modelsttic.uchicago.edu/~rurtasun/courses/VisualRecognition/lecture13.pdf · Visual Recognition: Inference in Graphical Models Raquel Urtasun

More formally

Any labeling can be uniquely represented by a partition of image pixelsP = {Pl |l ∈ L}, where Pl = {p ∈ P|fp = l} is a subset of pixels assignedlabel l .

There is a one to one correspondence between labelings f and partitions P.

Given a pair of labels α, β, a move from a partition P (labeling f ) to a newpartition P’ (labeling f ′) is called an α− β swap if Pl = P ′ for any labell 6= α, β.

The only difference between P and P ′ is that some pixels that were labeledin P are now labeled in P ′, and vice-versa.

Given a label l , a move from a partition P (labeling f ) to a new partition P ′

(labeling f ′) is called an α-expansion if Pα ⊂ P ′α and P ′

l ⊂ Pl .

An α-expansion move allows any set of image pixels to change their labelsto α.

Raquel Urtasun (TTI-C) Visual Recognition Feb 28, 2012 25 / 31

Page 51: Visual Recognition: Inference in Graphical Modelsttic.uchicago.edu/~rurtasun/courses/VisualRecognition/lecture13.pdf · Visual Recognition: Inference in Graphical Models Raquel Urtasun

More formally

Any labeling can be uniquely represented by a partition of image pixelsP = {Pl |l ∈ L}, where Pl = {p ∈ P|fp = l} is a subset of pixels assignedlabel l .

There is a one to one correspondence between labelings f and partitions P.

Given a pair of labels α, β, a move from a partition P (labeling f ) to a newpartition P’ (labeling f ′) is called an α− β swap if Pl = P ′ for any labell 6= α, β.

The only difference between P and P ′ is that some pixels that were labeledin P are now labeled in P ′, and vice-versa.

Given a label l , a move from a partition P (labeling f ) to a new partition P ′

(labeling f ′) is called an α-expansion if Pα ⊂ P ′α and P ′

l ⊂ Pl .

An α-expansion move allows any set of image pixels to change their labelsto α.

Raquel Urtasun (TTI-C) Visual Recognition Feb 28, 2012 25 / 31

Page 52: Visual Recognition: Inference in Graphical Modelsttic.uchicago.edu/~rurtasun/courses/VisualRecognition/lecture13.pdf · Visual Recognition: Inference in Graphical Models Raquel Urtasun

Example

Figure: (a) Current partition (b) local move (c) α− β-swap (d) α-expansion.

Raquel Urtasun (TTI-C) Visual Recognition Feb 28, 2012 26 / 31

Page 53: Visual Recognition: Inference in Graphical Modelsttic.uchicago.edu/~rurtasun/courses/VisualRecognition/lecture13.pdf · Visual Recognition: Inference in Graphical Models Raquel Urtasun

Algorithms

Raquel Urtasun (TTI-C) Visual Recognition Feb 28, 2012 27 / 31

Page 54: Visual Recognition: Inference in Graphical Modelsttic.uchicago.edu/~rurtasun/courses/VisualRecognition/lecture13.pdf · Visual Recognition: Inference in Graphical Models Raquel Urtasun

Finding optimal Swap move

Given an input labeling f (partition P) and a pair of labels α, β we want tofind a labeling f̂ that minimizes E over all labelings within one α− β-swapof f .

This is going to be done by computing a labeling corresponding to aminimum cut on a graph Gαβ = (Vαβ , Eαβ).

The structure of this graph is dynamically determined by the currentpartition P and by the labels α, β.

Raquel Urtasun (TTI-C) Visual Recognition Feb 28, 2012 28 / 31

Page 55: Visual Recognition: Inference in Graphical Modelsttic.uchicago.edu/~rurtasun/courses/VisualRecognition/lecture13.pdf · Visual Recognition: Inference in Graphical Models Raquel Urtasun

Finding optimal Swap move

Given an input labeling f (partition P) and a pair of labels α, β we want tofind a labeling f̂ that minimizes E over all labelings within one α− β-swapof f .

This is going to be done by computing a labeling corresponding to aminimum cut on a graph Gαβ = (Vαβ , Eαβ).

The structure of this graph is dynamically determined by the currentpartition P and by the labels α, β.

Raquel Urtasun (TTI-C) Visual Recognition Feb 28, 2012 28 / 31

Page 56: Visual Recognition: Inference in Graphical Modelsttic.uchicago.edu/~rurtasun/courses/VisualRecognition/lecture13.pdf · Visual Recognition: Inference in Graphical Models Raquel Urtasun

Finding optimal Swap move

Given an input labeling f (partition P) and a pair of labels α, β we want tofind a labeling f̂ that minimizes E over all labelings within one α− β-swapof f .

This is going to be done by computing a labeling corresponding to aminimum cut on a graph Gαβ = (Vαβ , Eαβ).

The structure of this graph is dynamically determined by the currentpartition P and by the labels α, β.

Raquel Urtasun (TTI-C) Visual Recognition Feb 28, 2012 28 / 31

Page 57: Visual Recognition: Inference in Graphical Modelsttic.uchicago.edu/~rurtasun/courses/VisualRecognition/lecture13.pdf · Visual Recognition: Inference in Graphical Models Raquel Urtasun

Graph Construction

The set of vertices includes the two terminals α and β, as well as imagepixels p in the sets Pα and Pβ (i.e., fp ∈ {α, β}).

Each pixel p ∈ Pαβ is connected to the terminals α and β, called t-links.

Each set of pixels p, q ∈ Pαβ which are neighbors is connected by an edgeep,q

Raquel Urtasun (TTI-C) Visual Recognition Feb 28, 2012 29 / 31

Page 58: Visual Recognition: Inference in Graphical Modelsttic.uchicago.edu/~rurtasun/courses/VisualRecognition/lecture13.pdf · Visual Recognition: Inference in Graphical Models Raquel Urtasun

Computing the Cut

Any cut must have a single t-link not cut.

This defines a labeling

There is a one-to-one correspondences between a cut and a labeling.

The energy of the cut is the energy of the labeling.

See Boykov et al, ”fast approximate energy minimization via graph cuts”PAMI 2001.

Raquel Urtasun (TTI-C) Visual Recognition Feb 28, 2012 30 / 31

Page 59: Visual Recognition: Inference in Graphical Modelsttic.uchicago.edu/~rurtasun/courses/VisualRecognition/lecture13.pdf · Visual Recognition: Inference in Graphical Models Raquel Urtasun

Properties

For any cut, then

Raquel Urtasun (TTI-C) Visual Recognition Feb 28, 2012 31 / 31