Upload
others
View
0
Download
0
Embed Size (px)
Citation preview
Lecture 3
C25 Optimization Hilary 2013 A. Zisserman
Dynamic Programming on graphs
• Terminology: chains, stars, trees, loops
• Application
• Message passing
• Shortest path - Dijkstra's algorithm
The Optimization Tree
Consider a cost function of the form
where xi can take one of h values
e.g. h=5, n=6
x1 x2 x3 x4 x5 x6
f(x) : IRn → IR
f(x) =
find shortest
path
Complexity of minimization:
• exhaustive search O(hn)
• dynamic programming O(nh2)
f(x) =nXi=1
mi(xi) +nXi=2
φi(xi−1, xi)
m1(x1) +m2(x2) +m3(x3) +m4(x4) +m5(x5) +m6(x6)
φ(x1, x2) + φ(x2, x3) + φ(x3, x4) + φ(x4, x5) + φ(x5, x6)
trellis
RECAP
x1 x2 x3 x4 x5 x6
Key idea: the optimization can be broken down into n sub-optimizations
f(x) =nXi=1
mi(xi) +nXi=2
φ(xi−1, xi)
Step 1: For each value of x2 determine the best value of x1
• Compute
S2(x2) = minx1{m2(x2) +m1(x1) + φ(x1, x2)}
= m2(x2) +minx1{m1(x1) + φ(x1, x2)}
• Record the value of x1 for which S2(x2) is a minimum
To compute this minimum for all x2 involves O(h2) operations
RECAP
x1 x2 x3 x4 x5 x6
Step 2: For each value of x3 determine the best value of x2 and x1
• Compute
S3(x3) = m3(x3) +minx2{S2(x2) + φ(x2, x3)}
• Record the value of x2 for which S3(x3) is a minimum
Again, to compute this minimum for all x3 involves O(h2) operations
Note Sk(xk) encodes the lowest cost partial sum for all nodes up to k
which have the value xk at node k, i.e.
Sk(xk) = minx1,x2,...,xk−1
kXi=1
mi(xi) +kXi=2
φ(xi−1, xi)
RECAP
Viterbi Algorithm
Complexity O(nh2)
• Initialize S1(x1) = m1(x1)
• For k = 2 : n
Sk(xk) = mk(xk) + minxk−1
{Sk−1(xk−1) + φ(xk−1, xk)}bk(xk) = argmin
xk−1{Sk−1(xk−1) + φ(xk−1, xk)}
• Terminatex∗n = argmin
xnSn(xn)
• Backtrackxi−1 = bi(xi)
RECAP
Dynamic Programming on graphs
• Graph (V,E)
• Vertices vi for i = 1, . . . , n
• Edges eij connect vi to other vertices vj
f(x) =Xvi∈V
mi(vi) +Xeij∈E
φ(vi, vj)
3
1
2
e.g.
• 3 vertices, each can take one of h values
• 3 edges
Dynamic Programming on graphs
So far have considered chains
Can dynamic programming be applied to these configurations?
(i.e. to reduce the optimization complexity from O(hn) to O(nh2))
54 61 2 3
1
3
4 5
6
2
Fully connected
1
3
4 5
6
2
Star structure
1
3
4
5
6
2
Tree structure
Terminlogy
simple (or open) chain
54 61 2 3
1
3
4 5
6
2
Graphs with loops
1
2
6
3
5
4
closed chain
1
3
4
5
6
2
1
3
4 5
6
2
Fully connected
Tree structure
Star structure
1
3
4
5
6
2
Terminlogy
1
3
4
5
6
2
degree = 1
degree (or valency) of a vertex
• is the number of edges incident to the vertex
degree = 3
Terminlogy for trees
root node0
1 2
3 4 5
leaf node
depth0
1
2
depth of node
• number of edges between node and root
children of node i
• are neighbouring vertices of depth di + 1
Dynamic programming on stars and trees
1
3
4 5
6
21
3
4
5
6
2
• for a value of the central node determine the best value of each of the other nodes in turn O(nh)
• repeat for all values of the central node O(h)
• final complexity O(nh2)
• order the nodes by their depth from the root, where d is max depth
• start with nodes at depth d-1, and compute best value for all (child) nodes at depth d, O(h2)
• decrease depth and repeat, O(n)
• final complexity O(nh2)
Dynamic programming on graphs with loops
• select one node and choose its value
• for the other nodes, the graph is then equivalent to an open chain and can be optimized with O(nh2) complexity
• repeat for all values of the selected node and choose lowest overall cost from these
• final complexity O(nh3)
1
2
6
3
5
4
Summary of different graph structures
1
3
4 5
6
2
Fully connected
O(hn)
1
3
4 5
6
2
Star structure
O(nh2)
1
3
4
5
6
2
Tree structure
O(nh2)
1
2
6
3
5
4Open chain O(nh2)
54 61 2 3
Closed chain O(nh3)
Application: facial feature detection in images
• Parts V= {v1, … vn}
• Connected by springs in star configuration to nose
• Quadratic cost for spring
Model
high spring cost 1 - NCC with appearance template
f(x) =Xvi∈V
mi(vi) +Xeij∈E
φ(vi, vj)
=Xvi∈V
mi(vi) +Xj
φ(v1, vj)
Spring extension
from v1 to vj
v1
v2 v3
v4
Fitting the model to an imageFind the configuration with the lowest energy
?
Model
v1
v2
v4
v3
h = 10^6 n = 4
?
Model
v1
v2
v4
v3
Fitting the model to an imageFind the configuration with the lowest energy
h = 10^6 n = 4
?
Model
v1
v2
v4
v3
Fitting the model to an imageFind the configuration with the lowest energy
h = 10^6 n = 4
Appearance templates and springs
f(x) =Xvi∈V
mi(vi) +Xj
φ(v1, vj)
x = (x1, y1, . . . , x4, y4)>
Each (xi, yi) ranges over h (x,y) positions in the image
NCC = 1−mi
Requires pair wise terms for correct detection
Example
1−mi
Example
Application of trees – computer vision example
Detect person layout
Message Passing
Example: counting people in a queue
How to count the number of people in a queue?
Method 1:
• ask them all to shout out their names and count these
Method 2:
• person at either end sends “1” to their neighbour
• if a person receives a number from their neighbour, add one, and pass total to neighbour on other side
• when anyone receives a message from both sides, they can compute the length of the queue by adding both messages and one (for themselves)
54 61 2 3
Counting vertices on a tree
How would the algorithm have to be modified to count vertices on a tree?
1
3
4
5
6
2
From David Mackay’s book
Viterbi Algorithm
Complexity O(nh2)
• Initialize S1(x1) = m1(x1)
• For k = 2 : n
Sk(xk) = mk(xk) + minxk−1
{Sk−1(xk−1) + φ(xk−1, xk)}bk(xk) = argmin
xk−1{Sk−1(xk−1) + φ(xk−1, xk)}
• Terminatex∗n = argmin
xnSn(xn)
• Backtrackxi−1 = bi(xi)
RECAP
Message passing for Dynamic Programming
• nodes send messages to their neighbours
• messages are vectors of dimension h
• notation: is message sent from p to q at time t
• also known as Belief Propagation
mtp→q(vi), i.e. the ith component of the vector,is low if vertex p “believes” that i is a good
(low cost) solution for vertex q
Non Examinable
minxf(x) =
Xvp∈V
ψp(vp) +Xepq∈E
φ(vp, vq)
Algorithm
• Initialize mtp→q(vq) = 0
• Repeat for each vertex
mtp→q(vq) = minvp{ψp(vp) + φ(vp, vq)
+X
s∈N(p)\qmt−1s→p(vp)}
Viterbi Algorithm using Message Passing
Non Examinable
Shortest Path Algorithms
The Shortest Path Problem
Given: a weighted directed graph, with a single source s
Goal: Find the shortest path from s to t
Length of path = Σ Length of arcs
Application
Minimize numberof stops(lengths = 1)
Minimize amountof time(positive lengths)
Dijkstra’s Algorithm (1959)
Given: a directed graph with non-negative edge costs, Find: the shortest path from s to every other vertex
• pick the unvisited vertex with the lowest distance to s
• calculate the distance through it to each unvisited neighbour, and update the neighbour's distance if smaller
• mark vertex as visited once all neighbours explored
Example: Dijkstra's Shortest Path Algorithm
Find shortest path from s to t.
s
3
t
2
6
7
45
24
18
2
9
14
155
30
20
44
16
11
6
19
6
Slide: Robert Sedgewick and Kevin Wayne
Dijkstra's Shortest Path Algorithm
s
3
t
2
6
7
45
24
18
2
9
14
155
30
20
44
16
11
6
19
6
∞
∞ ∞
∞
∞
∞
∞
0
distance label
S = { }PQ = { s, 2, 3, 4, 5, 6, 7, t }
Slide: Robert Sedgewick and Kevin Wayne
Dijkstra's Shortest Path Algorithm
s
3
t
2
6
7
45
24
18
2
9
14
155
30
20
44
16
11
6
19
6
∞
∞ ∞
∞
∞
∞
∞
0
distance label
S = { }PQ = { s, 2, 3, 4, 5, 6, 7, t }
select
Dijkstra's Shortest Path Algorithm
s
3
t
2
6
7
45
24
18
2
9
14
155
30
20
44
16
11
6
19
6
15
9 ∞
∞
∞
14
∞
0
distance label
S = { s }PQ = { 2, 3, 4, 5, 6, 7, t }
decrease
∞X
∞
∞X
X
compute distance through vertex to each unvisited neighbour, and update with min if smaller
Dijkstra's Shortest Path Algorithm
s
3
t
2
6
7
45
24
18
2
9
14
155
30
20
44
16
11
6
19
6
15
9 ∞
∞
∞
14
∞
0
distance label
S = { s }PQ = { 2, 3, 4, 5, 6, 7, t }
∞X
∞
∞X
X
Select pick the unvisited vertex with the lowest distance to s
Dijkstra's Shortest Path Algorithm
s
3
t
2
6
7
45
24
18
2
9
14
155
30
20
44
16
11
6
19
6
15
9 ∞
∞
∞
14
∞
0
S = { s, 2 }PQ = { 3, 4, 5, 6, 7, t }
∞X
∞
∞X
X
Dijkstra's Shortest Path Algorithm
s
3
t
2
6
7
45
24
18
2
9
14
155
30
20
44
16
11
6
19
6
15
9 ∞
∞
∞
14
∞
0
S = { s, 2 }PQ = { 3, 4, 5, 6, 7, t }
∞X
∞
∞X
X
decrease
X 33
compute distance through vertex to each unvisited neighbour, and update with min if smaller
Dijkstra's Shortest Path Algorithm
s
3
t
2
6
7
45
24
18
2
9
14
155
30
20
44
16
11
6
19
6
15
9 ∞
∞
∞
14
∞
0
S = { s, 2 }PQ = { 3, 4, 5, 6, 7, t }
∞X
∞
∞X
X
X 33
select
pick the unvisited vertex with the lowest distance to s
Dijkstra's Shortest Path Algorithm
s
3
t
2
6
7
45
24
18
2
9
14
155
30
20
44
16
11
6
19
6
15
9 ∞
∞
∞
14
∞
0
S = { s, 2, 6 }PQ = { 3, 4, 5, 7, t }
∞X
∞
∞X
X
X 33
44X
X32
Dijkstra's Shortest Path Algorithm - solution
s
3
t
2
6
7
45
24
18
2
9
14
155
30
20
44
16
11
6
19
6
15
9
∞
∞
14
∞
0
S = { s, 2, 3, 4, 5, 6, 7, t }PQ = { }
∞X
∞
∞X
X
44X
35X
59 XX51
X 34
X50
X45
∞X 33X32
Applications of shortest path algorithms:
• plotting routes on Google maps
• robot path planning
• urban traffic planning (e.g. through congestion/road works)
• routing of communications messages
• and many more …
Shortest path using linear programming
Given: a weighted directed graph, with a single source s
Distance from s to v: length of the shortest part from s to v
Goal: Find distance (and shortest path) to every vertex
More …
• Applications:• Snakes in computer vision: dynamic programming on a
closed or open chain
• Detecting human limb layout in images: dynamic programming on trees
• David Mackay’s book for message passing
• More on web page: • http://www.robots.ox.ac.uk/~az/lectures/opt