Hierarchical Graph Cuts for Semi-Metric Labeling M. Pawan Kumar Joint work with Daphne Koller


Aim: To obtain accurate MAP estimate for Semi-Metric MRFs efficiently

[Figure: grid of random variables V1, V2, …, Vn]

Random Variables V = { V1, V2, …, Vn }

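Aim

[Figure: neighboring variables Va and Vb taking labels li and lj, with unary potentials $\theta_a(i)$, $\theta_b(j)$ and pairwise potential $\theta_{ab}(i,j)$]

$\theta_a(i)$ : arbitrary

$\theta_{ab}(i,j) = s_{ab} \, d(i,j)$, with $s_{ab} \geq 0$

Semi-metric Distance Function

$d(i,i) = 0$ for all $i$, and $d(i,j) = d(j,i) > 0$ for all $i \neq j$

Metric Distance Function

A semi-metric that also satisfies $d(i,j) - d(j,k) \leq d(i,k)$ for all $i, j, k$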
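The two definitions can be checked mechanically on a distance table. The sketch below is only an illustration: the function names are hypothetical, and the truncated linear and truncated quadratic distances are used as test cases because they appear later as "T. Lin." and "T. Quad." in the results.

```python
from itertools import product


def is_semi_metric(d):
    """Check d(i,i) = 0 and d(i,j) = d(j,i) > 0 for all i != j."""
    h = len(d)
    for i in range(h):
        if d[i][i] != 0:
            return False
        for j in range(h):
            if i != j and not (d[i][j] == d[j][i] and d[i][j] > 0):
                return False
    return True


def is_metric(d):
    """A semi-metric that also satisfies d(i,j) - d(j,k) <= d(i,k)."""
    if not is_semi_metric(d):
        return False
    h = len(d)
    return all(d[i][j] - d[j][k] <= d[i][k]
               for i, j, k in product(range(h), repeat=3))


def truncated_linear(h, M):
    """d(i,j) = min(|i - j|, M) over labels 0..h-1."""
    return [[min(abs(i - j), M) for j in range(h)] for i in range(h)]


def truncated_quadratic(h, M):
    """d(i,j) = min((i - j)^2, M) over labels 0..h-1."""
    return [[min((i - j) ** 2, M) for j in range(h)] for i in range(h)]
```

With 5 labels, the truncated linear distance passes both checks, while the truncated quadratic distance is a semi-metric but fails the triangle inequality (for example d(0,2) - d(2,1) = 3 exceeds d(0,1) = 1).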

Aim

$f^* = \arg\min_f \sum_a \theta_a(f(a)) + \sum_{(a,b)} \theta_{ab}(f(a), f(b))$
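To make the objective concrete, here is a minimal sketch that evaluates the energy of a labeling and finds the MAP labeling by exhaustive search. It is only meant for toy problems; the data layout (`unary[a][i]`, an `edges` dictionary holding the weights s_ab, a distance table `d`) is an assumption made for this illustration.

```python
from itertools import product


def energy(f, unary, edges, d):
    """Sum of unary potentials plus s_ab * d(f(a), f(b)) over all edges.

    f: sequence of labels, one per variable
    unary: unary[a][i] is the unary potential of variable a taking label i
    edges: dict mapping (a, b) to the non-negative weight s_ab
    d: d[i][j] is the (semi-)metric distance between labels i and j
    """
    total = sum(unary[a][f[a]] for a in range(len(f)))
    total += sum(s * d[f[a]][f[b]] for (a, b), s in edges.items())
    return total


def brute_force_map(unary, edges, d):
    """Exhaustive MAP estimation; exponential in the number of variables."""
    n, h = len(unary), len(d)
    return min(product(range(h), repeat=n),
               key=lambda f: energy(f, unary, edges, d))
```

For three variables on a chain with two labels, `brute_force_map([[0, 2], [3, 0], [1, 1]], {(0, 1): 1.0, (1, 2): 2.0}, [[0, 1], [1, 0]])` returns the labeling `(0, 1, 1)`.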

Visualizing Metrics

[Figure: weighted graph whose nodes include the labels l1, …, l5, with edge weights w1, …, w9]

d( i , j ) : length of the shortest path between li and lj in the graph
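A shortest-path distance of this kind can be computed with the Floyd-Warshall algorithm. The sketch below assumes an undirected graph given as `(u, v, weight)` triples with positive weights; the function name is hypothetical.

```python
import math


def shortest_path_metric(num_nodes, weighted_edges):
    """All-pairs shortest paths on an undirected graph (Floyd-Warshall)."""
    d = [[0 if i == j else math.inf for j in range(num_nodes)]
         for i in range(num_nodes)]
    for u, v, w in weighted_edges:
        # Keep the lightest edge if the same pair is listed twice.
        d[u][v] = min(d[u][v], w)
        d[v][u] = min(d[v][u], w)
    for k in range(num_nodes):
        for i in range(num_nodes):
            for j in range(num_nodes):
                if d[i][k] + d[k][j] < d[i][j]:
                    d[i][j] = d[i][k] + d[k][j]
    return d
```

Because shortest paths always satisfy the triangle inequality, the resulting table is a metric whenever the graph is connected and all weights are positive.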

Overview

[Figure: labelings f1 and f2 combined (+) into a labeling f]

Outline

• Simpler Metrics

• Labeling for Simpler Metrics

• Approximating General Metrics/Semi-Metrics

• Combining Labelings

• Results

r-HST Metrics

[Figure: tree with leaves l1, …, l6; edges below the root have length w1, edges above the leaves have length w2 or w3]

Graph is a tree. Labels are leaves.

Edge lengths for all children are the same.

Edge lengths decrease by factor r ≥ 2: w2 ≤ w1/r, w3 ≤ w1/r
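As an illustration of these properties, the sketch below stores a tree as parent pointers with the length of the edge to the parent, computes the distance between two leaves as the length of the tree path between them, and checks the two edge-length conditions. The representation and the names `tree_distance` and `is_r_hst` are assumptions made for this example.

```python
def tree_distance(parent, length, u, v):
    """Length of the tree path between nodes u and v.

    parent: dict mapping each node to its parent (root maps to None)
    length: dict mapping each non-root node to the length of its parent edge
    """
    # Distance from u to each of its ancestors (including u itself).
    dist_u = {}
    node, acc = u, 0.0
    while node is not None:
        dist_u[node] = acc
        acc += length.get(node, 0.0)
        node = parent.get(node)
    # Walk up from v until we meet an ancestor of u.
    node, acc = v, 0.0
    while node not in dist_u:
        acc += length[node]
        node = parent[node]
    return acc + dist_u[node]


def is_r_hst(parent, length, r):
    """Siblings share one edge length; child edges are at most parent edge / r."""
    children = {}
    for node, p in parent.items():
        if p is not None:
            children.setdefault(p, []).append(node)
    for p, kids in children.items():
        if len({length[k] for k in kids}) != 1:
            return False
        if parent.get(p) is not None and length[kids[0]] > length[p] / r:
            return False
    return True
```

For a tree shaped like the one on the slide with w1 = 4, w2 = 2 and w3 = 1, two leaves under the same internal node are at distance 2·w2 or 2·w3, and leaves under different internal nodes are at distance w2 + 2·w1 + w3.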


r-HST Metric Labeling

r-HST Metrics admit Divide-and-Conquer

Divide original problem into subproblems

[Figure: r-HST tree over labels l1, …, l6 with edge lengths w1, w2, w3]

Subproblem defined at vertex ‘m’: minimize the energy such that $f(a) \in$ { labels below m }

• Trivial problem (leaf): such that $f(a) \in \{ l_4 \}$

• Original problem (root): such that $f(a) \in \{ l_1, \ldots, l_6 \}$

Problems get tougher as we move up

Solve the simple subproblems (starting with trivial subproblems)

Use their solutions to solve difficult subproblems

[Figure: vertex with three children connected by edges of length w, with child labelings f1, f2, f3]

Find new labeling using α-Expansion

Continue till we reach the root
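The bottom-up procedure can be sketched as a recursion over the tree. In this illustration the combination step is done by exhaustive search over which child labeling each variable copies its label from; that is a stand-in chosen to keep the example short and exact on toy problems, not the α-Expansion step used in the talk. The tree is given as a hypothetical `children` dictionary whose leaves are label indices, and `d` is the tree distance between labels.

```python
from itertools import product


def hierarchical_labeling(children, root, unary, edges, d):
    """Solve subproblems bottom-up and combine child labelings at each vertex.

    children: dict mapping each internal vertex to its list of children;
              leaves are label indices and do not appear as keys
    unary, edges, d: as in the energy (unary[a][i], {(a, b): s_ab}, d[i][j])
    """
    n = len(unary)

    def energy(f):
        e = sum(unary[a][f[a]] for a in range(n))
        return e + sum(s * d[f[a]][f[b]] for (a, b), s in edges.items())

    def solve(vertex):
        if vertex not in children:
            # Trivial subproblem: only one label is allowed.
            return [vertex] * n
        child_labelings = [solve(c) for c in children[vertex]]
        best = None
        # Brute-force stand-in for the move-making step: each variable
        # picks the child labeling from which it takes its label.
        for choice in product(range(len(child_labelings)), repeat=n):
            f = [child_labelings[choice[a]][a] for a in range(n)]
            if best is None or energy(f) < energy(best):
                best = f
        return best

    f = solve(root)
    return f, energy(f)
```

The exhaustive combination costs k^n evaluations for k children and n variables, which is exactly why an efficient move-making step is needed in practice.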

Analysis

Mathematical induction over the vertices m of the tree, for all variables Va such that f*(a) is a label below m:

• Factor 1 bound on the unary potentials

• Factor 2r/(r-1) bound on the pairwise potentials

Initial step of M.I. is trivial (for leaf nodes)

Given children, prove for parent

[Figure: vertex m with three children connected by edges of length w]

Energy of the labeling f at m, built from the child labelings fi:

$\sum_a \theta_a(f(a)) + \sum_i \sum_{(a,b)} \theta_{ab}(f_i(a), f_i(b)) + \sum_{i \neq j} \sum_{(a,b)} \theta_{ab}(f_i(a), f_j(b))$

where the second term covers f(a) = fi(a), f(b) = fi(b) and the third term covers f(a) = fi(a), f(b) = fj(b).

Bounding each term against the optimal labeling f*:

$\sum_a \theta_a(f^*(a)) + \frac{2r}{r-1} \sum_i \sum_{(a,b)} \theta_{ab}(f^*(a), f^*(b)) + 2 \frac{d_{\max}}{d_{\min}} \sum_{i \neq j} \sum_{(a,b)} \theta_{ab}(f^*(a), f^*(b))$

$d_{\max} = 2w(1 + 1/r + 1/r^2 + \ldots)$

$d_{\min} = 2w$

so the third term is also bounded with factor $\frac{2r}{r-1}$.

Overall approximation bound 2r/(r-1)

Previous best bound 2r/(r-2)

Not tight?
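The step from dmax and dmin to the factor 2r/(r-1) is a geometric series. The short derivation below spells it out, assuming (as the slides indicate) that the pairwise bound for labels under different children is 2·dmax/dmin.

```latex
\begin{align*}
d_{\max} &= 2w\left(1 + \frac{1}{r} + \frac{1}{r^2} + \cdots\right)
          = 2w \sum_{k=0}^{\infty} r^{-k}
          = \frac{2wr}{r-1}, \\
d_{\min} &= 2w, \\
2\,\frac{d_{\max}}{d_{\min}} &= \frac{2r}{r-1}.
\end{align*}
```

For r = 2 this gives a factor of 4, whereas the previous bound 2r/(r-2) is undefined at r = 2.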


Approximating Metrics

Given distance d(.,.)

D = { dt(.,.), t = 1, 2, …, T }, dt(i,j) ≥ d(i,j)

Pr(.) over the elements of D

$\min_{D, \Pr(\cdot)} \max_{i \neq j} \frac{\sum_t \Pr(t) \, d_t(i,j)}{d(i,j)}$

[Figure: r-HST tree over labels l1, …, l6 with edge lengths w1, w2, w3]

r-HST : hierarchical clustering of labels

Use a clustering algorithm
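The quantity minimized above is the worst-case ratio between the expected approximating distance and the true distance. A minimal sketch of how it could be evaluated for a given set of distances and probabilities follows; the function names are hypothetical, and distances are plain nested lists.

```python
def dominates(dt, d):
    """Check dt(i,j) >= d(i,j) for every pair of labels."""
    h = len(d)
    return all(dt[i][j] >= d[i][j] for i in range(h) for j in range(h))


def distortion(d, metrics, probs):
    """max over i != j of sum_t Pr(t) * dt(i,j) / d(i,j)."""
    h = len(d)
    worst = 0.0
    for i in range(h):
        for j in range(h):
            if i != j:
                expected = sum(p * dt[i][j] for p, dt in zip(probs, metrics))
                worst = max(worst, expected / d[i][j])
    return worst
```

Since every dt dominates d, the distortion is always at least 1; a value close to 1 means the mixture represents d faithfully.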

Approximating Metrics
Fakcharoenphol, Rao and Talwar, 2003

Assume $\max d(i,j) = 2^M$ and $\min_{i \neq j} d(i,j) > 1$

Sample β ∈ [1,2]

Choose a permutation π of labels { l1, …, lh }

Given the clusters at level m-1 (for example level 1), how do we obtain the clustering at level m (for example level 2)?

[Figure: a cluster of labels among l1, …, l6 is split by visiting labels in the order π and testing distances: d(1,4) ≤ $2^{M-m}$ ?, d(2,4) ≤ $2^{M-m}$ ?, d(2,1) ≤ $2^{M-m}$ ?, d(3,4) ≤ $2^{M-m}$ ?]

Edge length = Diameter of cluster / 2

Choose β. Choose π

Initialize root node as trivial cluster (all labels)

Choose a cluster at level m-1

Run procedure to get clusters at level m

Repeat for all clusters at level m-1

Stop when all clusters are singletons

Repeat to get a set of r-HST metrics
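The clustering steps listed above can be sketched as follows. This is a hedged illustration in the style of the Fakcharoenphol, Rao and Talwar construction, not the talk's exact procedure: it assumes the root is level 0, that the threshold at level m is β·2^(M-m), and that a label joins the first label in π within that threshold. The function name and the `seed` parameter are hypothetical, and the edge lengths of the tree are not computed here.

```python
import random


def frt_hierarchy(d, seed=0):
    """Random hierarchical clustering of labels 0..h-1.

    Returns a list of levels; each level is a list of clusters
    (lists of label indices). Requires d(i,j) > 0 for i != j.
    """
    rng = random.Random(seed)
    labels = list(range(len(d)))
    diameter = max(d[i][j] for i in labels for j in labels)
    M = 0
    while 2 ** M < diameter:          # smallest M with 2^M >= max distance
        M += 1
    beta = rng.uniform(1.0, 2.0)      # sample beta in [1, 2]
    pi = labels[:]
    rng.shuffle(pi)                   # random permutation of the labels

    levels = [[labels]]               # root: trivial cluster of all labels
    m = 0
    while any(len(c) > 1 for c in levels[-1]):
        m += 1
        radius = beta * 2 ** (M - m)  # assumed threshold at level m
        next_level = []
        for cluster in levels[-1]:
            remaining = set(cluster)
            for center in pi:
                members = [x for x in cluster
                           if x in remaining and d[center][x] <= radius]
                if members:
                    next_level.append(members)
                    remaining -= set(members)
        levels.append(next_level)
    return levels
```

Each call with a different seed yields a different hierarchy, so repeating the call produces the set of trees from which a mixture of r-HST metrics can be built.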


Analysis

$d(i,j) \leq \sum_t \Pr(t) \, d_t(i,j) \leq O(\log h) \, d(i,j)$

How many r-HST metrics? O(h log h)

Charikar, Chekuri, Goel, Guha and Plotkin, 1998

Approximating Semi-Metrics

$d(i,j) \leq \sum_t \Pr(t) \, d_t(i,j) \leq O((\log h)^2) \, d(i,j)$

How many r-HST metrics? O(h log h)

d(i,j) - d(j,k) ≤ d(i,k)


Combining Labelings

Use α-Expansion!

Analysis

Bound for r-HST Labeling = $O(1)$

Distortion for Metrics = $O(\log h)$

Bound for Metric Labeling = $O(\log h)$

Distortion for Semi-Metrics = $O((\log h)^2)$

Bound for Semi-Metric Labeling = $O((\log h)^2)$

Analysis

When h < n, all known LP bounds can be obtained using move making algorithms.

Refining the Labeling

Current energy Q(f; d) = Q(f; dt)

Q(f’; d) ≤ Q(f’; dt), f’ ≠ f

Find best ft according to dt(.,.)



f = ft. Repeat till convergence.


Synthetic Data (energy)

Method    T. Lin.   T. Quad.   r-HST    Met      S-Met
Exp       48645     52094      50221    48112    47613
Swap      48721     51938      51055    48487    47579
TRW-S     47506     51318      48132    47355    46612
BP-S      50942     60269      52841    48136    47402
R-Swap    48045     51842      -        -        -
R-Exp     47998     51641      -        -        -
Our       47850     51587      48146    47538    46651
Our+EM    47823     51413      48146    47382    46638

Synthetic Data (time)

Method    T. Lin.   T. Quad.   r-HST    Met      S-Met
Exp       0.44      0.36       0.29     0.30     0.36
Swap      0.65      0.86       0.52     0.51     0.47
TRW-S     104.29    178.97     713.70   703.82   709.36
BP-S      15.78     45.63      150.36   129.68   141.79
R-Swap    1.97      10.73      -        -        -
R-Exp     5.78      30.73      -        -        -
Our       10.22     12.84      1.86     10.58    12.25
Our+EM    25.66     64.08      5.02     32.75    57.50

Image Denoising

Method    Energy    Time
Exp       75641     5.09
Swap      74426     25.22
TRW-S     68226     174.33
BP-S      105845    32.94
Our       72828     70.55
Our+EM    72332     204.55

Image Denoising

Method    Energy    Time
Exp       86163     26.13
Swap      89264     90.74
TRW-S     73383     529.60
BP-S      526969    115.84
Our       81820     294.72
Our+EM    81820     465.57

Stereo Reconstruction

Method    Energy    Time
Exp       78776     12.07
Swap      97999     34.59
TRW-S     62777     263.28
BP-S      126824    50.38
Our       65116     152.74
Our+EM    65008     361.81

Scene Registration

Method    Energy    Time
Exp       82036     1.66
Swap      83023     8.15
TRW-S     81118     1371.11
BP-S      84396     218.04
Our       81315     104.89
Our+EM    81258     373.60

Scene Registration

Method    Energy    Time
Exp       68572     1.27
Swap      69767     2.78
TRW-S     67616     1058.25
BP-S      70239     159.98
Our       67682     73.61
Our+EM    67676     240.49

Scene Segmentation

Method    Energy    Accuracy   Timing
Exp       302272    60.62      3.18
Swap      302389    60.60      3.73
TRW-S     302211    60.68      451.02
BP-S      310825    60.44      102.14
Our       302265    60.64      157.03

Future Work

• Tighter approximations for semi-metrics

• Higher-order potentials?

• Learning the parameters?

A Diffusion Algorithm for Upper Envelope Potentials

M. Pawan Kumar

Joint work with

Pushmeet Kohli

Aim: Efficient MAP estimation of sparse higher order potentials

[Figure: grid of random variables V1, V2, …, Vn, with an auxiliary variable Z]

In general, $f(z) \in L^c$

Some special cases computationally feasible

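Lower Envelope Potentials

$\min_i \left[ \theta_z(i) + \sum_{a \in C} \theta_{za}(i, f(a)) \right]$

$f(z) \in L' \subseteq L^c$

[Figure: three example energy plots]

Example with $f(z) \in \{0,1\}$:

$\min_i \left[ \theta_z(i) + \sum_{a \in C} \theta_{za}(i, f(a)) \right]$

[Figure: energy plot]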

Robust Pn Model


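$f^* = \arg\min_f \sum_a \theta_a(f(a)) + \sum_{(a,b)} \theta_{ab}(f(a), f(b)) + \sum_z \min_i \left[ \theta_z(i) + \sum_{a \in C} \theta_{za}(i, f(a)) \right]$

Equivalently, with an auxiliary variable per clique taking a label $f(z) \in L'$:

$f^* = \arg\min_f \sum_a \theta_a(f(a)) + \sum_{(a,b)} \theta_{ab}(f(a), f(b)) + \sum_z \left[ \theta_z(f(z)) + \sum_{a \in C} \theta_{za}(f(z), f(a)) \right]$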

Use your favorite pairwise MRF algorithm
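The reduction to a pairwise MRF works because minimizing over the auxiliary label reproduces the lower envelope exactly. The sketch below checks this on a single clique; the data layout `theta_za[i][a][label]` and the function names are assumptions made for this illustration.

```python
def lower_envelope(theta_z, theta_za, f_clique):
    """min over i of theta_z(i) + sum over a in clique of theta_za(i, f(a))."""
    return min(
        theta_z[i] + sum(theta_za[i][a][f_clique[a]]
                         for a in range(len(f_clique)))
        for i in range(len(theta_z))
    )


def with_auxiliary(theta_z, theta_za, f_clique, f_z):
    """Unary term of the auxiliary variable plus its pairwise terms."""
    return theta_z[f_z] + sum(theta_za[f_z][a][f_clique[a]]
                              for a in range(len(f_clique)))
```

For example, with `theta_z = [0, 2]`, a cost of 1 per variable labeled 1 under auxiliary label 0, and no cost under auxiliary label 1, the potential equals min(number of variables labeled 1, 2), a truncated count of the kind modeled by robust higher-order potentials.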

Upper Envelope Potentials

$\max_i \left[ \theta_z(i) + \sum_{a \in C} \theta_{za}(i, f(a)) \right]$

$f(z) \in L' \subseteq L^c$


[Figure: camera center, rays through the silhouette, and the object]

At least one voxel on the ray labeled ‘object’

$\max_i \left[ \theta_z(i) + \sum_{a \in C} \theta_{za}(i, f(a)) \right]$

$f(z) \in \{0,1\}$


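$f^* = \arg\min_f \sum_a \theta_a(f(a)) + \sum_{(a,b)} \theta_{ab}(f(a), f(b)) + \sum_z \max_i \left[ \theta_z(i) + \sum_{a \in C} \theta_{za}(i, f(a)) \right]$

Replacing each maximum by a variable $t_z$:

$\sum_a \theta_a(f(a)) + \sum_{(a,b)} \theta_{ab}(f(a), f(b)) + \sum_z t_z$

$t_z \geq \theta_z(i) + \sum_{a \in C} t_{za}(i)$

$t_{za}(i) \geq \theta_{za}(i, f(a))$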

LP Relaxation

Dual

max a mini a(i) + (a,b) mini,j ab(i,j)

+ z mini z(i) + (z,a) mini,j za(i,j)

∑iz(i) = 1∑jza(i,j) = z(i) za(i,j)≥ 0

a(i) = a(i)

ab(i,j) = ab(i,j)

z(i) = z(i)a(i)

za(i,j) = za(i,j)ab(i,j)

Dual Without Z

max a mini a(i) + (a,b) mini,j ab(i,j)

Diffusion

[Figure: four-step example of standard diffusion at a random variable Va, showing how its unary and pairwise potentials change]
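One common form of a min-sum diffusion step can be sketched as below. This is an assumption about what "standard diffusion" refers to here, not a transcription of the slides: for each label of the chosen variable, the unary potential and the minimum of every incident pairwise potential are pooled and then shared equally, which leaves the energy of every labeling unchanged. The data layout and function names are hypothetical.

```python
def energy(f, unary, pairwise):
    """Energy of labeling f (dict or sequence indexed by variable)."""
    e = sum(unary[a][f[a]] for a in unary)
    return e + sum(t[f[u]][f[v]] for (u, v), t in pairwise.items())


def lower_bound(unary, pairwise):
    """Sum of independent minima of all potentials (dual objective value)."""
    lb = sum(min(theta) for theta in unary.values())
    return lb + sum(min(min(row) for row in t) for t in pairwise.values())


def diffusion_step(a, unary, pairwise):
    """Reparameterize in place around variable a.

    unary: dict var -> list of potentials
    pairwise: dict (u, v) -> table indexed [label of u][label of v]
    """
    incident = [e for e in pairwise if a in e]
    for i in range(len(unary[a])):
        # Minimum of each incident pairwise potential with a fixed to i.
        mins = []
        for (u, v) in incident:
            if u == a:
                mins.append(min(pairwise[(u, v)][i]))
            else:
                mins.append(min(row[i] for row in pairwise[(u, v)]))
        # Pool the unary term with those minima and share equally.
        share = (unary[a][i] + sum(mins)) / (len(incident) + 1)
        unary[a][i] = share
        for (u, v), m in zip(incident, mins):
            if u == a:
                pairwise[(u, v)][i] = [x + share - m for x in pairwise[(u, v)][i]]
            else:
                for row in pairwise[(u, v)]:
                    row[i] += share - m
```

After the step, the unary potential of each label of the chosen variable equals the corresponding minimum of every incident pairwise potential, and the lower bound does not decrease.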

Diffusion for Auxiliary Variable

[Figure: example potentials at an auxiliary variable z]

z(i) = ’z(i) + (z(i) - ’z(i))a(i)

za(i,j) = ’za(i,j) + (za(i,j) - ’za(i,j))za(i,j)


max ( mini z(i) + ∑a mini,j za(i,j) )

∑I z(i) = 1

z(i)≥ 0

∑j za(i,j) = z(i)

za(i,j)≥ 0

Solve for Expensive


max ( mini z(i) ) max ( mini,j za(i,j) )

∑I z(i) = 1

z(i)≥ 0

∑j za(i,j) = z(i) ?

∑ij za(i,j) = 1

za(i,j)≥ 0



∑I z(i) = 1

z(i)≥ 0

∑ij za(i,j) = 1

za(i,j)≥ 0

+ ∑z;iz(i) + ∑za;iza(i,j)

z;i + ∑aza;i = 0Fractional Packing Problem



∑I z(i) = 1

z(i)≥ 0

∑ij za(i,j) = 1

za(i,j)≥ 0

+ ∑z;iz(i) + ∑za;iza(i,j)

z;i + ∑aza;i = 0Plotkin, Shmoys and Tardos, 1995


[Figure: example potentials at an auxiliary variable z]

Run Standard Diffusion on

The Algorithm

Choose a variable (random or auxiliary)

If random variable, run standard diffusion

If auxiliary variable, obtain and then run standard diffusion

Repeat till convergence

Future Work

• Write the code

• Do the experiments

• A better way to get ??
