1
A dual ascent framework for Lagrangean decomposition of combinatorial problems Paul Swoboda 1 Jan Kuske 2 Bogdan Savchynskyy 3 1 IST Austria 2 Heidelberg University, Germany 3 TU Dresden, Germany e-mail: [email protected] Goal and contribution Goal: Efficient dual block coordinate ascent solvers for a wide range of large scale combina- torial problems from computer vision and beyond. Contribution: We propose a new class of general LP-relaxations of ILP-problems, called Integer Relaxed Pairwise Separable Linear Programs (IRPS-LP). (IRPS-LP) generalize the local polytope relaxation for CRFs. (IRPS-LP) allows explicit modelling of allowed configurations, instead of forbidding them with -costs. This leads to more compact formulations and lower computational cost. Relaxations of the multicut [4] (poster #75) and graph matching [5] (poster #69 tommorow afternoon) problems can be written as (IRPS-LP). We analyze dual block coordinate ascent for (IRPS-LP). We prove monotonical improvement of a dual lower bound. Popular algorithms TRWS [2], SRMP [3] and MPLP [1] for inference in CRFs are special cases. Practical impact: Efficient algorithms for multicut and graph matching have been written in our framework (for extensive details see corresponding posters). C++ implementation available at https://github.com/pawelswoboda/LP_MP. Problem formulation (IRPS-LP) Factor graph G =(F, E): factors i F with variables μ i X i ⊆{0, 1} d i couplings ij E F 2 with constraints A (i ,j ) μ i = A (j ,i ) μ j . (IRPS-LP): min μΛ G k X i =1 hθ i i i Λ G := (μ 1 ...μ k ) μ i conv(X i ) i F A (i ,j ) μ i = A (j ,i ) μ j ij E . μ 1 X 1 μ 2 X 2 μ 3 X 3 μ 4 X 4 A (1,2) μ 1 =A (2,1) μ 2 A (1,3) μ 1 =A (3,1) μ 3 A (2,4) μ 2 =A (4,2) μ 4 A (3,4) μ 3 =A (4,3) μ 4 Example: MAP-inference in CRFs as (IRPS-LP) CRF: Graph G =(V, E), label space X = Q u V X u , u V: unary costs θ u : X u R, uv E: pairwise costs θ uv : X u × X v R. MAP-inference: min x X X u V θ u (x u )+ X uv E θ uv (x uv ) . (IRPS-LP): F = V E, E = {{u , uv }, {v , uv } : uv E}. Simplex: Δ n = {x R n + : n i =1 x i i = 1}. min μL G hθ, μi := X u V hθ u u i + X uv E hθ uv uv i L G = μ u conv(X u ): μ u Δ |X u | , u V μ uv conv(X uv ): μ uv Δ |X uv | , uv E A (uv ,u ) μ uv = A (u ,uv ) μ u : x v X v μ uv (x u , x v )= μ u (x u ) . u v w CRF μ u Δ X u μ v Δ X v μ w Δ X w μ vw Δ X v ·X w μ uv Δ X u ·X v μ uw Δ X u ·X w μ u (x u )= x v μ uv (x uv ) μ(x v )= x u μ uv (x uv ) μ(x v )= x w μ vw (x vw ) 0 μ(x w )= x v μ vw (x vw ) μ(x u )= x w μ uw (x uw ) 0 μ(x w )= x u μ uw (x uw ) CRF as (IRPS-LP) Example: graph matching Given: CRF. Label spaces X u ⊂L are part of an universe. min x X X u V θ u (x u )+ X uv E θ uv (x uv ) | {z } CRF s.t. x u 6= x v u 6= v | {z } uniqueness constraints Example: V = {u , v , w }, L = {l , l 0 , l 00 }. Unary factors: Each node takes a label. Label factors: Each label can be taken once. = 1 = 1 = 1 = 1 = 1 = 1 u v w l l 0 l 00 (IRPS-LP): min μL G , ˜ μ hθ, μi s.t. ˜ μ s ∈L : X u V ˜ μ u (s ) 1 additional (IRPS-LP) factors for uniqueness constraint μ u (s ) = ˜ μ s (u ), s X u couple CRF and uniqueness constraints . Example: multicut Optimization problem: Weighted graph G =(V , E ). Find partition of G. min x ∈{0,1} |E| X eE θ e x e , s.t. cycles C : e 0 C : X eC\{e 0 } x e x e 0 | {z } cycle inequalities . (IRPS-LP): Factors: F = E 3-cycle(E ). Couplings: E = {{e , C } : C 3-cycle(E ), e C } Motivation: Dual block coordinate ascent for CRFs Observation: TRWS [2] is state of the art among LP-based methods for MAP-inference for CRFs: Contribution: Develop dual block coordinate ascent methods inspired by TRWS [2] for (IRPS-LP). Dual lower bound Dualize primal problem w.r.t. coupling constraints: Primal variables P := μ = μ 1 . . . μ k : μ conv(x 1 ) . . . conv(X k ) . Coupling constraints: Aμ =(A (i ,j ) μ i - A (j ,i ) μ j ) ij E = 0. Maximize dual lower bound: max φ [D (φ) := min μP hθ, μi + hφ, Aμi]. Reparametrization: θ φ := θ + A > φ. Elementary updates – Admissible messages For factor i BF take subset of neighbors N ⊆{j : ij E}. Consider dual variables φ i j related to couplings j N . Change (φ i j ) j N by Δ such that D (φ - Δ) D (φ). Illustration in the figure: factor 1, neighbors N = {2, 3}. μ 1 X 1 μ 2 X 2 μ 3 X 3 μ 4 X 4 φ 12 φ 13 Elementary updates – Admissible messages Computation of Δ: ˆ x i arg min x i X i hθ φ i , x i i : opt. configuration before update. ˆ x i arg min x i X i hθ φ-Δ i , x i i : opt. configuration after update. hθ φ-Δ i , ˆ x i i≥hθ φ i , ˆ x i i : costs do not decrease. Δ = arg max Δ-admissible hδ, θ φ-Δ i i, δ (s ) ( > 0, ˆ x i (s )= 1 < 0, ˆ x i (s )= 0 Algorithm Visit all factors i F and perform elementary updates: Special cases: TRWS [2], SRMP [3] and MPLP [1] for MAP-inference in CRFs. Results: graph matching 10 -1 10 0 10 1 10 1 10 2 10 3 runtime(s) car log(energy - lower bound) AMP HUNGARIAN-BP DD 10 -1 10 0 10 1 10 0 10 1 10 2 10 3 runtime(s) motor log(energy - lower bound) 10 -1 10 0 10 1 10 2 10 2 10 3 10 4 10 5 runtime(s) worms log(energy - lower bound) For more details please see [5] and corresponding poster #69 on Tuesday afternoon. Results: multicut 10 -2 10 -1 10 0 -5,000 -4,800 -4,600 -4,400 -4,200 -4,000 runtime(s) knott-3d-150 energy MP-C CGC MC-ILP CC-Fusion-RWS CC-Fusion-RHC 10 -1 10 0 10 1 10 2 -2.75 -2.74 -2.73 -2.72 -2.71 ·10 4 runtime(s) knott-3d-300 10 -1 10 0 10 1 10 2 -8 -7.8 -7.6 ·10 4 runtime(s) knott-3d-450 For more details please see [4] and corresponding poster #75. References [1] A. Globerson and T. S. Jaakkola. Fixing max-product: Convergent message passing algorithms for MAP LP-relaxations. In NIPS, pages 553–560, 2007. [2] V. Kolmogorov. Convergent tree-reweighted message passing for energy minimization. IEEE Trans. Pattern Anal. Mach. Intell., 28(10):1568–1583, 2006. [3] V. Kolmogorov. A new look at reweighted message passing. IEEE Trans. Pattern Anal. Mach. Intell., 37(5):919–930, 2015. [4] P. Swoboda and B. Andres. A message passing algorithm for the minimum cost multicut problem. In CVPR, 2017. [5] P. Swoboda, C. Rother, H. Abu Alhaija, D. Kainmueller, and B. Savchynskyy. Study of Lagrangean decomposition and dual ascent solvers for graph matching. In CVPR, 2017. Acknowledgements. The first author was supported by the European Research Council under the European Unions Seventh Framework Programme (FP7/2007-2013)/ERC grant agreement no 616160, the second author by DFG Grant GRK 1653 and the last author by the European Research Council under the European Union’s Horizon 2020 research and innovation programme (grant agreement No 647769) and the DFG Grant ”ERBI” SA 2640/1-1.

A dual ascent framework for Lagrangean decomposition of ... · A dual ascent framework for Lagrangean decomposition of combinatorial problems Paul Swoboda1 Jan Kuske2 Bogdan Savchynskyy3

  • Upload
    ngocong

  • View
    223

  • Download
    0

Embed Size (px)

Citation preview

A dual ascent framework for Lagrangean decomposition of combinatorial problemsPaul Swoboda1 Jan Kuske2 Bogdan Savchynskyy3

1 IST Austria 2 Heidelberg University, Germany 3 TU Dresden, Germanye-mail: [email protected]

Goal and contributionGoal: Efficient dual block coordinate ascent solvers for a wide range of large scale combina-torial problems from computer vision and beyond.Contribution:

We propose a new class of general LP-relaxations of ILP-problems, calledInteger Relaxed Pairwise Separable Linear Programs (IRPS-LP).

(IRPS-LP) generalize the local polytope relaxation for CRFs.(IRPS-LP) allows explicit modelling of allowed configurations, instead of forbidding them with∞-costs.This leads to more compact formulations and lower computational cost.Relaxations of the multicut [4] (poster #75) and graph matching [5] (poster #69 tommorow afternoon)problems can be written as (IRPS-LP).

We analyze dual block coordinate ascent for (IRPS-LP).We prove monotonical improvement of a dual lower bound.Popular algorithms TRWS [2], SRMP [3] and MPLP [1] for inference in CRFs are special cases.

Practical impact:Efficient algorithms for multicut and graph matching have been written in our framework(for extensive details see corresponding posters).C++ implementation available at https://github.com/pawelswoboda/LP_MP.

Problem formulation (IRPS-LP)Factor graph G = (F,E):

factors i ∈ F with variables µi ∈ Xi ⊆ {0,1}di

couplings ij ∈ E ⊆(F2

)with constraints A(i ,j)µi = A(j ,i)µj.

(IRPS-LP):

minµ∈ΛG

k∑i=1

〈θi, µi〉

ΛG :=

{(µ1 . . . µk)

∣∣∣∣ µi ∈ conv(Xi) ∀i ∈ FA(i ,j)µi = A(j ,i)µj ∀ij ∈ E

}.

µ1 ∈ X1 µ2 ∈ X2

µ3 ∈ X3 µ4 ∈ X4

A(1,2)µ1=A(2,1)µ2

A(1,3) µ

1 =A(3,1) µ

3

A(2,4) µ

2 =A(4,2) µ

4

A(3,4)µ3=A(4,3)µ4

Example: MAP-inference in CRFs as (IRPS-LP)CRF:

Graph G = (V,E),label space X =

∏u∈V Xu,

∀u ∈ V: unary costs θu : Xu → R,∀uv ∈ E: pairwise costs θuv : Xu × Xv → R.

MAP-inference:

minx∈X

∑u∈V

θu(xu) +∑

uv∈Eθuv(xuv) .

(IRPS-LP): F = V ∪ E, E = {{u,uv}, {v ,uv} : uv ∈ E}. Simplex: ∆n = {x ∈ Rn+ :∑n

i=1 xi〉 =1}.

minµ∈LG

〈θ, µ〉 :=∑u∈V

〈θu, µu〉 +∑uv∈E

〈θuv , µuv〉

LG =

µu ∈ conv(Xu) : µu ∈ ∆|Xu|,u ∈ Vµuv ∈ conv(Xuv) : µuv ∈ ∆|Xuv |,uv ∈ EA(uv ,u)µuv = A(u,uv)µu :

∑xv∈Xv

µuv(xu, xv) = µu(xu) .

u v

w

CRF

µu ∈ ∆Xu µv ∈ ∆Xv

µw ∈ ∆Xw µvw ∈ ∆Xv ·Xw

µuv ∈ ∆Xu·Xv

µuw ∈ ∆Xu·Xw

µ u(x

u)=

∑ x vµ u

v(x

uv) µ

(xv )=

∑xu µ

uv (xuv )

µ(x

v )=

∑xwµ

vw (xvw )

′µ(xw)=∑

xv µvw(xvw)

µ(x

u)=

∑ x wµ u

w(x

uw)

′µ(xw)=∑

xu µuw(xuw)

CRF as (IRPS-LP)

Example: graph matchingGiven:

CRF.Label spaces Xu ⊂ L are part of an universe.

minx∈X

∑u∈V

θu(xu) +∑uv∈E

θuv(xuv)︸ ︷︷ ︸CRF

s.t. xu 6= xv ∀u 6= v︸ ︷︷ ︸uniqueness constraints

Example: V = {u, v ,w}, L = {l , l ′, l ′′}.Unary factors: Each node takes a label.Label factors: Each label can be takenonce.

∑ =1

∑ =1

∑ =1

∑= 1∑= 1∑= 1

u v w

l

l ′

l ′′

(IRPS-LP):

minµ∈LG,µ̃ 〈θ, µ〉

s.t. µ̃ ∈

∀s ∈ L :∑u∈V

µ̃u(s) ≤ 1

additional (IRPS-LP) factorsfor uniqueness constraint

µu(s) = µ̃s(u), s ∈ Xu couple CRF and uniqueness constraints .

Example: multicutOptimization problem:

Weighted graph G = (V ,E).Find partition of G.

minx∈{0,1}|E|

∑e∈E

θexe , s.t. ∀ cycles C : ∀e′ ∈ C :∑

e∈C\{e′}xe ≥ xe′︸ ︷︷ ︸

cycle inequalities

.

(IRPS-LP):Factors: F = E ∪ 3-cycle(E).Couplings: E = {{e,C} : C ∈ 3-cycle(E),e ∈ C}

Motivation: Dual block coordinate ascent for CRFs

Observation: TRWS [2] is state of the artamong LP-based methods forMAP-inference for CRFs:Contribution: Develop dual blockcoordinate ascent methods inspired byTRWS [2] for (IRPS-LP).

Dual lower boundDualize primal problem w.r.t. coupling constraints:

Primal variables P :=

{µ =

(µ1...µk

): µ ∈

(conv(x1)...conv(Xk)

)}.

Coupling constraints: Aµ = (A(i ,j)µi − A(j ,i)µj)ij∈E = 0.Maximize dual lower bound: maxφ [D(φ) := minµ∈P〈θ, µ〉 + 〈φ,Aµ〉].Reparametrization: θφ := θ + A>φ.

Elementary updates – Admissible messages

For factor i ∈ BF take subset ofneighbors N ⊆ {j : ij ∈ E}.Consider dual variables φi j related tocouplings j ∈ N.Change (φi j)j∈N by ∆ such thatD(φ−∆) ≥ D(φ).

Illustration in the figure:factor 1, neighbors N = {2,3}.

µ1 ∈ X1 µ2 ∈ X2

µ3 ∈ X3 µ4 ∈ X4

φ12

φ13

Elementary updates – Admissible messagesComputation of ∆:

x̂i ∈ arg minxi∈Xi〈θφi , xi〉 : opt. configuration before update.

x̂i ∈ arg minxi∈Xi〈θφ−∆i , xi〉 : opt. configuration after update.

〈θφ−∆i , x̂i〉 ≥ 〈θφi , x̂i〉 : costs do not decrease.

∆ = arg max∆−admissible〈δ, θφ−∆i 〉, δ(s){> 0, x̂i(s) = 1< 0, x̂i(s) = 0

AlgorithmVisit all factors i ∈ F and perform elementary updates:

Special cases: TRWS [2], SRMP [3] and MPLP [1] for MAP-inference in CRFs.

Results: graph matching

10−1 100 101

101

102

103

runtime(s)car

log(en

ergy

-lower

bou

nd)

AMPHUNGARIAN-BP

DD

10−1 100 101

100

101

102

103

runtime(s)motor

log(en

ergy

-lower

bou

nd)

10−1 100 101 102102

103

104

105

runtime(s)worms

log(en

ergy-lower

bou

nd)

For more details please see [5] and corresponding poster #69 on Tuesday afternoon.

Results: multicut

10−2 10−1 100−5,000

−4,800

−4,600

−4,400

−4,200

−4,000

runtime(s)knott-3d-150

energy

MP-CCGC

MC-ILPCC-Fusion-RWSCC-Fusion-RHC

10−1 100 101 102−2.75

−2.74

−2.73

−2.72

−2.71

·104

runtime(s)knott-3d-300

10−1 100 101 102

−8

−7.8

−7.6

·104

runtime(s)knott-3d-450

For more details please see [4] and corresponding poster #75.

References[1] A. Globerson and T. S. Jaakkola. Fixing max-product: Convergent message passing algorithms for

MAP LP-relaxations. In NIPS, pages 553–560, 2007.

[2] V. Kolmogorov. Convergent tree-reweighted message passing for energy minimization. IEEE Trans.Pattern Anal. Mach. Intell., 28(10):1568–1583, 2006.

[3] V. Kolmogorov. A new look at reweighted message passing. IEEE Trans. Pattern Anal. Mach. Intell.,37(5):919–930, 2015.

[4] P. Swoboda and B. Andres. A message passing algorithm for the minimum cost multicut problem. InCVPR, 2017.

[5] P. Swoboda, C. Rother, H. Abu Alhaija, D. Kainmueller, and B. Savchynskyy. Study of Lagrangeandecomposition and dual ascent solvers for graph matching. In CVPR, 2017.

Acknowledgements. The first author was supported by the European Research Council under the European Unions Seventh Framework Programme(FP7/2007-2013)/ERC grant agreement no 616160, the second author by DFG Grant GRK 1653 and the last author by the European Research Council under the

European Union’s Horizon 2020 research and innovation programme (grant agreement No 647769) and the DFG Grant ”ERBI” SA 2640/1-1.