52
Pictorial Structures and Distance Transforms Computer Vision CS 543 / ECE 549 University of Illinois Ian Endres 03/31/11

Pictorial Structures and Distance Transforms Computer Vision CS 543 / ECE 549 University of Illinois Ian Endres 03/31/11

Embed Size (px)

Citation preview

Page 1: Pictorial Structures and Distance Transforms Computer Vision CS 543 / ECE 549 University of Illinois Ian Endres 03/31/11

Pictorial Structures and Distance Transforms

Computer VisionCS 543 / ECE 549

University of Illinois

Ian Endres

03/31/11

Page 2: Pictorial Structures and Distance Transforms Computer Vision CS 543 / ECE 549 University of Illinois Ian Endres 03/31/11

Goal: Detect all instances of objectsCars

Faces

Cats

Page 3: Pictorial Structures and Distance Transforms Computer Vision CS 543 / ECE 549 University of Illinois Ian Endres 03/31/11

Object model: last class• Statistical Template in Bounding Box

– Object is some (x,y,w,h) in image– Features defined wrt bounding box coordinates

Image Template Visualization

Images from Felzenszwalb

Page 4: Pictorial Structures and Distance Transforms Computer Vision CS 543 / ECE 549 University of Illinois Ian Endres 03/31/11

Last class: sliding window detection

Page 5: Pictorial Structures and Distance Transforms Computer Vision CS 543 / ECE 549 University of Illinois Ian Endres 03/31/11

Last class: statistical template

• Object model = log linear model of parts at fixed positions

+3 +2 -2 -1 -2.5 = -0.5

+4 +1 +0.5 +3 +0.5= 10.5

> 7.5?

> 7.5?

Non-object

Object

Page 6: Pictorial Structures and Distance Transforms Computer Vision CS 543 / ECE 549 University of Illinois Ian Endres 03/31/11

When are statistical templates useful?

Caltech 101 Average Object Images

Page 7: Pictorial Structures and Distance Transforms Computer Vision CS 543 / ECE 549 University of Illinois Ian Endres 03/31/11

Deformable objects

Images from Caltech-256

Slide Credit: Duan Tran

Page 8: Pictorial Structures and Distance Transforms Computer Vision CS 543 / ECE 549 University of Illinois Ian Endres 03/31/11

Deformable objects

Images from D. Ramanan’s datasetSlide Credit: Duan Tran

Page 9: Pictorial Structures and Distance Transforms Computer Vision CS 543 / ECE 549 University of Illinois Ian Endres 03/31/11

Object models: this class• Articulated parts model

– Object is configuration of parts– Each part is detectable

Images from Felzenszwalb

Page 10: Pictorial Structures and Distance Transforms Computer Vision CS 543 / ECE 549 University of Illinois Ian Endres 03/31/11

Parts-based Models

Define object by collection of parts modeled by1. Appearance2. Spatial configuration

Slide credit: Rob Fergus

Page 11: Pictorial Structures and Distance Transforms Computer Vision CS 543 / ECE 549 University of Illinois Ian Endres 03/31/11

How to model spatial relations?• Star-shaped model

Root

Part

Part

Part

Part

Part

Page 12: Pictorial Structures and Distance Transforms Computer Vision CS 543 / ECE 549 University of Illinois Ian Endres 03/31/11

How to model spatial relations?• Star-shaped model

=X X

X

≈Root

Part

Part

Part

Part

Part

Page 13: Pictorial Structures and Distance Transforms Computer Vision CS 543 / ECE 549 University of Illinois Ian Endres 03/31/11

How to model spatial relations?• Tree-shaped model

Page 14: Pictorial Structures and Distance Transforms Computer Vision CS 543 / ECE 549 University of Illinois Ian Endres 03/31/11

How to model spatial relations?

Fergus et al. ’03Fei-Fei et al. ‘03

Leibe et al. ’04, ‘08Crandall et al. ‘05Fergus et al. ’05

Crandall et al. ‘05 Felzenszwalb & Huttenlocher ‘05

Bouchard & Triggs ‘05 Carneiro & Lowe ‘06Csurka ’04Vasconcelos ‘00

from [Carneiro & Lowe, ECCV’06]

O(N6) O(N2) O(N3) O(N2)

• Many others...

Page 15: Pictorial Structures and Distance Transforms Computer Vision CS 543 / ECE 549 University of Illinois Ian Endres 03/31/11

Today’s class

1. Tree-shaped model–Example: Pictorial structures

• Felzenszwalb Huttenlocher 2005

2. Optimization with Dynamic Programming3. Distance Transforms

Page 16: Pictorial Structures and Distance Transforms Computer Vision CS 543 / ECE 549 University of Illinois Ian Endres 03/31/11

Pictorial Structures Model

Part = oriented rectangle Spatial model = relative size/orientation

Felzenszwalb and Huttenlocher 2005

Page 17: Pictorial Structures and Distance Transforms Computer Vision CS 543 / ECE 549 University of Illinois Ian Endres 03/31/11

Part representation• Background subtraction

Page 18: Pictorial Structures and Distance Transforms Computer Vision CS 543 / ECE 549 University of Illinois Ian Endres 03/31/11

Pictorial Structures Model

Appearance likelihood Geometry likelihood

Page 19: Pictorial Structures and Distance Transforms Computer Vision CS 543 / ECE 549 University of Illinois Ian Endres 03/31/11

Modeling the Appearance• Any appearance model could be used

– HOG Templates, etc.– Here: rectangles fit to background subtracted

binary map

• Train a detector for each part independently

Appearance likelihood Geometry likelihood

Page 20: Pictorial Structures and Distance Transforms Computer Vision CS 543 / ECE 549 University of Illinois Ian Endres 03/31/11

Pictorial Structures Model

Minimize Energy (-log of the likelihood):

Appearance Cost Geometry Cost

Page 21: Pictorial Structures and Distance Transforms Computer Vision CS 543 / ECE 549 University of Illinois Ian Endres 03/31/11

Optimization – Brute Force

H

T

F

Appearance Cost Deformation Cost

Page 22: Pictorial Structures and Distance Transforms Computer Vision CS 543 / ECE 549 University of Illinois Ian Endres 03/31/11

Optimization

H

T

F

Page 23: Pictorial Structures and Distance Transforms Computer Vision CS 543 / ECE 549 University of Illinois Ian Endres 03/31/11

Optimization - Dynamic Programming

H

T

F

Page 24: Pictorial Structures and Distance Transforms Computer Vision CS 543 / ECE 549 University of Illinois Ian Endres 03/31/11

Optimization - Dynamic Programming

H

T

F

Page 25: Pictorial Structures and Distance Transforms Computer Vision CS 543 / ECE 549 University of Illinois Ian Endres 03/31/11

Optimization - Dynamic Programming

H

T

F

Page 26: Pictorial Structures and Distance Transforms Computer Vision CS 543 / ECE 549 University of Illinois Ian Endres 03/31/11

Optimization - Dynamic Programming

H

T

F

Page 27: Pictorial Structures and Distance Transforms Computer Vision CS 543 / ECE 549 University of Illinois Ian Endres 03/31/11

Optimization - Complexity

H

T

F

?

For all lT?

Each part can take N locations

Overall? (for K parts)

Page 28: Pictorial Structures and Distance Transforms Computer Vision CS 543 / ECE 549 University of Illinois Ian Endres 03/31/11

• For each pixel p, how far away is the nearest pixel q of set S– – G is often the set of edge pixels

Distance Transform

G: black pixelsp1

q1

p2

q2

Page 29: Pictorial Structures and Distance Transforms Computer Vision CS 543 / ECE 549 University of Illinois Ian Endres 03/31/11

Computing the Distance Transform• 1-Dimension

Page 30: Pictorial Structures and Distance Transforms Computer Vision CS 543 / ECE 549 University of Illinois Ian Endres 03/31/11

• Extending to N-Dimensions• Savings can be extended from 1-D to N-D cases if deformation cost is

separable:

Computing the Distance Transform

Page 31: Pictorial Structures and Distance Transforms Computer Vision CS 543 / ECE 549 University of Illinois Ian Endres 03/31/11

Distance Transform - Applications• Set distances – e.g. Hausdorff Distance• Image processing – e.g. Blurring• Robotics – Motion Planning• Alignment

– Edge images– Motion tracks– Audio warping

• Deformable Part Models (of course!)

Page 32: Pictorial Structures and Distance Transforms Computer Vision CS 543 / ECE 549 University of Illinois Ian Endres 03/31/11

Generalized Distance Transform• Original form: • General form:

– m(q): arbitrary function sampled at discrete points– For each p, find a nearby q with small m(q)

– For original DT:

– For part models:

• For some deformation costs,

Page 33: Pictorial Structures and Distance Transforms Computer Vision CS 543 / ECE 549 University of Illinois Ian Endres 03/31/11

How do we do it?• Key idea: Construct lower “envelope” of data

• Given envelope, compute f(p) in O(N) time• Goal: Compute envelope in O(N) time

Page 34: Pictorial Structures and Distance Transforms Computer Vision CS 543 / ECE 549 University of Illinois Ian Endres 03/31/11

• Key idea: Keep track of intersection points– fi(x), fj(x) only intersect once (let i<j)

Computing the Lower Envelope

Start(1) = -InfEnd(1) = Inf

Page 35: Pictorial Structures and Distance Transforms Computer Vision CS 543 / ECE 549 University of Illinois Ian Endres 03/31/11

• Key idea: Keep track of intersection points– fi(x), fj(x) only intersect once (let i<j)

Computing the Lower Envelope

Start(1) = -InfEnd(1) = 3

Start(2) = 3End(2) = Inf

X

Page 36: Pictorial Structures and Distance Transforms Computer Vision CS 543 / ECE 549 University of Illinois Ian Endres 03/31/11

• Key idea: Keep track of intersection points– fi(x), fj(x) only intersect once (let i<j)

Computing the Lower Envelope

Start(1) = -InfEnd(1) = 3

Start(2) = 3End(2) = 4

Start(3) = 4End(3) = Inf

X

Page 37: Pictorial Structures and Distance Transforms Computer Vision CS 543 / ECE 549 University of Illinois Ian Endres 03/31/11

• Key idea: Keep track of intersection points– fi(x), fj(x) only intersect once (let i<j)

Computing the Lower Envelope

X

Start(1) = -InfEnd(1) = 3

Start(2) = 3End(2) = 4

Start(3) = 4End(3) = Inf

Page 38: Pictorial Structures and Distance Transforms Computer Vision CS 543 / ECE 549 University of Illinois Ian Endres 03/31/11

• Key idea: Keep track of intersection points– fi(x), fj(x) only intersect once (let i<j)

Computing the Lower Envelope

Start(1) = -InfEnd(1) = 3

Start(2) = 3End(2) = 4

Start(3) = []End(3) = []

Start(4) = 4End(4) = Inf

X

Page 39: Pictorial Structures and Distance Transforms Computer Vision CS 543 / ECE 549 University of Illinois Ian Endres 03/31/11

• Key idea: Keep track of intersection points– fi(x), fj(x) only intersect once (let i<j)

Computing the Lower Envelope

Start(1) = -InfEnd(1) = 3

Start(2) = 3End(2) = 4

Start(3) = []End(3) = []

Start(4) = 4End(4) = Inf

X

Page 40: Pictorial Structures and Distance Transforms Computer Vision CS 543 / ECE 549 University of Illinois Ian Endres 03/31/11

• Key idea: Keep track of intersection points– fi(x), fj(x) only intersect once (let i<j)

Computing the Lower Envelope

Start(1) = -InfEnd(1) = 3

Start(2) = 3End(2) = 4

Start(3) = []End(3) = []

Start(4) = 4End(4) = Inf

X

Page 41: Pictorial Structures and Distance Transforms Computer Vision CS 543 / ECE 549 University of Illinois Ian Endres 03/31/11

• Key idea: Keep track of intersection points– fi(x), fj(x) only intersect once (let i<j)

Computing the Lower Envelope

Start(1) = -InfEnd(1) = 3

Start(2) = []End(2) = []

Start(3) = []End(3) = []

Start(4) = 3End(4) = Inf

X

Page 42: Pictorial Structures and Distance Transforms Computer Vision CS 543 / ECE 549 University of Illinois Ian Endres 03/31/11

• Key idea: Keep track of intersection points– fi(x), fj(x) only intersect once (let i<j)

Computing the Lower Envelope

Start(1) = -InfEnd(1) = 2.6

Start(2) = []End(2) = []

Start(3) = []End(3) = []

Start(4) = 2.6End(4) = Inf

X

Page 43: Pictorial Structures and Distance Transforms Computer Vision CS 543 / ECE 549 University of Illinois Ian Endres 03/31/11

• Key idea: Keep track of intersection points– fi(x), fj(x) only intersect once (let i<j)

Computing the Lower Envelope

Start(1) = -InfEnd(1) = 2.6

Start(2) = []End(2) = []

Start(3) = []End(3) = []

Start(4) = 2.6End(4) = 3.5

Start(5) = 3.5End(5) = Inf

X

Page 44: Pictorial Structures and Distance Transforms Computer Vision CS 543 / ECE 549 University of Illinois Ian Endres 03/31/11

Is This O(N)?

• Yes!• N parabolas are added

– For each addition, may remove up to O(N) parabolas

• But, each parabola can only be deleted once• Thus, O(N) additions, and at most O(N)

deletions

Page 45: Pictorial Structures and Distance Transforms Computer Vision CS 543 / ECE 549 University of Illinois Ian Endres 03/31/11

What distances can we use?

• Quadratic (with linear shift)

• Abs. diff

• Min-composition

• Bounded (Requires extra bookkeeping)

Page 46: Pictorial Structures and Distance Transforms Computer Vision CS 543 / ECE 549 University of Illinois Ian Endres 03/31/11

Back to the Pictorial structures model

Problem: May need to infer more than one configuration

a) May be more than one object• Report scores for every root position, apply NMS

b) Optimal solution may be incorrect• Sampling

– Sample root node, then each node given parent, until all parts are sampled

Page 47: Pictorial Structures and Distance Transforms Computer Vision CS 543 / ECE 549 University of Illinois Ian Endres 03/31/11

Sample poses from likelihood and choose best match with Chamfer distance to map

Page 48: Pictorial Structures and Distance Transforms Computer Vision CS 543 / ECE 549 University of Illinois Ian Endres 03/31/11

Results for person matching

48

Page 49: Pictorial Structures and Distance Transforms Computer Vision CS 543 / ECE 549 University of Illinois Ian Endres 03/31/11

Results for person matching - Mistakes• Ambiguous Background subtracted image

49

Page 50: Pictorial Structures and Distance Transforms Computer Vision CS 543 / ECE 549 University of Illinois Ian Endres 03/31/11

Recently enhanced pictorial structures

BMVC 2009

Page 51: Pictorial Structures and Distance Transforms Computer Vision CS 543 / ECE 549 University of Illinois Ian Endres 03/31/11

Deformable Latent Parts Model

Useful parts discovered during training

Detections

Template Visualization

Felzenszwalb et al. 2008

Page 52: Pictorial Structures and Distance Transforms Computer Vision CS 543 / ECE 549 University of Illinois Ian Endres 03/31/11

Things to Remember• Rather than searching for whole object, can

locate parts that compose the object– Better encoding of spatial variation

• Models can be broken down into part appearance and spatial configuration– Wide variety of models

• Efficient optimization is often tricky, but many tricks available– Dynamic programming, Distance transforms