What we didn’t have time for CS664 Lecture 26 Thursday 12/02/04 Some slides c/o Dan Huttenlocher,...


What we didn’t have time for

CS664 Lecture 26, Thursday 12/02/04

Some slides c/o Dan Huttenlocher, Stefano Soatto, Sebastian Thrun

Administrivia

Final project is due at noon on Friday 12/17
  Write-up only (5MB max)
  Be sure to include some pictures
Send me email if you missed any quiz for a good reason

Outline

Geometry
Graph-based segmentation
Statistics

Geometry

Homogeneous coordinates

Identify a point in the image plane with the ray passing through that point (pixel)
$(x, y) \equiv (\lambda x, \lambda y, \lambda)$ for non-zero $\lambda$
$(X, Y, Z) \equiv (X/Z, Y/Z, 1)$ for non-zero $Z$

Advantages

Many non-linear operations become linear in homogeneous coordinates
Example: $(X, Y, Z)$ projects to $(fX/Z, fY/Z)$
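The projection example above can be sketched in a few lines. This is a minimal illustration, assuming NumPy; the helper names `to_homogeneous` and `from_homogeneous` and the focal length value are hypothetical and not from the lecture.

```python
import numpy as np

def to_homogeneous(p):
    """Append a 1, e.g. (x, y) -> (x, y, 1)."""
    return np.append(np.asarray(p, dtype=float), 1.0)

def from_homogeneous(h):
    """Divide by the last coordinate, e.g. (X, Y, Z) -> (X/Z, Y/Z). Needs Z != 0."""
    h = np.asarray(h, dtype=float)
    return h[:-1] / h[-1]

f = 2.0  # hypothetical focal length, chosen only for the example

# Perspective projection written as a linear map on homogeneous coordinates.
P = np.array([[f, 0, 0, 0],
              [0, f, 0, 0],
              [0, 0, 1, 0]], dtype=float)

X = np.array([3.0, 6.0, 4.0])                 # a 3D point (X, Y, Z)
x = from_homogeneous(P @ to_homogeneous(X))   # equals (fX/Z, fY/Z)

# Scaling a homogeneous vector does not change the point it represents.
same_point = from_homogeneous(5.0 * to_homogeneous([1.0, 2.0]))
```

The division by Z, which is non-linear in ordinary coordinates, happens only once at the end; everything before it is a matrix product.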

Camera projection matrix

[Figure: the 3x4 camera projection matrix maps a 3D point to a 2D point]
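One common way to build a 3x4 projection matrix is as P = K [R | t], with intrinsics K, rotation R and translation t. That decomposition is not stated on the slide, so treat it, the function names, and all numeric values below as assumptions made for illustration.

```python
import numpy as np

def projection_matrix(K, R, t):
    """Build the 3x4 matrix P = K [R | t] (assumed decomposition)."""
    return K @ np.hstack([R, np.asarray(t, dtype=float).reshape(3, 1)])

def project(P, X):
    """Map a 3D point to 2D pixel coordinates through P."""
    x = P @ np.append(np.asarray(X, dtype=float), 1.0)  # homogeneous 2D point
    return x[:2] / x[2]

# Hypothetical intrinsics: focal length 500 pixels, principal point (320, 240).
K = np.array([[500.0, 0.0, 320.0],
              [0.0, 500.0, 240.0],
              [0.0, 0.0, 1.0]])
R = np.eye(3)        # camera aligned with the world axes
t = np.zeros(3)      # camera at the world origin

P = projection_matrix(K, R, t)
pixel = project(P, [0.2, -0.1, 2.0])
```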

Epipolar geometry

[Figure labels: epipole, epipolar plane, epipolar line. Stefano Soatto (c) 2002]

Pencil of planes

Different epipolar planes for different scene points x
Plane defined by camera origins + x

Epipolar lines are important

For pixel p in I there is a corresponding epipolar line in I’
This allows us to limit the search!
Generalization of stereo to arbitrary camera positions
  Classical stereo has parallel cameras

Example: verged stereo

Examples: motion

[Figure panels: motion parallel to the image plane; forward motion]

Essential matrix E

Ex is perpendicular to x’s epipolar line in the other image
So if x’ corresponds to x then $x'^{T} E x = 0$
Captures the scene geometry
We assume the cameras are calibrated
  Otherwise we get the fundamental matrix
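The constraint above can be checked numerically on synthetic data. The sketch below assumes the standard construction E = [t]_x R for a second camera related to the first by X' = R X + t; that formula, the pose values, and the helper names are assumptions for illustration rather than material from the slides.

```python
import numpy as np

def skew(t):
    """Cross-product matrix [t]_x, so that skew(t) @ v == np.cross(t, v)."""
    return np.array([[0.0, -t[2], t[1]],
                     [t[2], 0.0, -t[0]],
                     [-t[1], t[0], 0.0]])

def essential_from_pose(R, t):
    """E = [t]_x R for a second camera with X' = R X + t (assumed convention)."""
    return skew(t) @ R

# Hypothetical relative pose: small rotation about the y axis plus a translation.
theta = 0.1
c, s = np.cos(theta), np.sin(theta)
R = np.array([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])
t = np.array([1.0, 0.0, 0.2])
E = essential_from_pose(R, t)

# Random scene points in front of both cameras.
rng = np.random.default_rng(0)
X = rng.uniform([-1, -1, 4], [1, 1, 8], size=(20, 3))
X2 = X @ R.T + t

# Calibrated image points (x, y, 1) in each view.
x = X / X[:, 2:3]
x2 = X2 / X2[:, 2:3]

# x'^T E x for every correspondence; all values should be numerically zero.
residuals = np.einsum('ni,ij,nj->n', x2, E, x)
```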

Estimating the geometry

The essential matrix has 5 parameters
  Can estimate from 5 corresponding points
  Fundamental matrix has 7
The question of “how few perfect correspondences do you need” has spawned an unfortunately large literature

Yet more optimization

We can estimate the essential matrix from a bunch of point matches
A similar technique can be used to compute structure from motion
  Bundle adjustment

RANSAC (line fitting)

Variant of generate-and-test
Pick a small set of points at random
Fit them via least squares
Points “far” from this line are outliers
Repeat until you find a line with very few outliers
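The steps above can be sketched as follows. This is a simplified version under stated assumptions: it runs a fixed number of iterations and keeps the line with the most inliers, rather than stopping as soon as a line has very few outliers. The function names, the sample size of 2, the distance threshold, and the synthetic data are all hypothetical choices.

```python
import numpy as np

def fit_line(points):
    """Least-squares fit of y = a*x + b to an (n, 2) array of points."""
    a, b = np.polyfit(points[:, 0], points[:, 1], 1)
    return a, b

def ransac_line(points, n_iters=200, sample_size=2, threshold=0.1, seed=0):
    """Return the best (a, b) found and a boolean inlier mask."""
    rng = np.random.default_rng(seed)
    best_model = None
    best_inliers = np.zeros(len(points), dtype=bool)
    for _ in range(n_iters):
        # Generate: fit a line to a small random sample.
        idx = rng.choice(len(points), size=sample_size, replace=False)
        a, b = fit_line(points[idx])
        # Test: perpendicular distance of every point to a*x - y + b = 0.
        dist = np.abs(a * points[:, 0] - points[:, 1] + b) / np.hypot(a, 1.0)
        inliers = dist < threshold
        if inliers.sum() > best_inliers.sum():
            best_model, best_inliers = (a, b), inliers
    return best_model, best_inliers

# Synthetic data: 40 points on y = 2x + 1 plus 10 gross outliers.
data_rng = np.random.default_rng(1)
xs = np.linspace(0.0, 10.0, 40)
on_line = np.column_stack([xs, 2.0 * xs + 1.0])
outliers = np.column_stack([data_rng.uniform(0, 10, 10),
                            data_rng.uniform(-20, 40, 10)])
pts = np.vstack([on_line, outliers])

(a_hat, b_hat), inlier_mask = ransac_line(pts)
```

A common follow-up, not shown here, is to refit the line by least squares on all the inliers of the winning model.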

RANSAC (camera geometry)

Pick a small set of corresponding pixels
  At least 5 (essential) or 7 (fundamental)
Compute the matrix from these
See how many corresponding pixels this matrix explains
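A rough sketch of the same loop for camera geometry follows. Note the substitution: instead of a 5-point or 7-point minimal solver, it uses the simpler linear eight-point method, so each sample holds 8 matches rather than the minimum. It also skips the rank-2 enforcement and coordinate normalization a real implementation would likely add. Function names, thresholds, and the synthetic scene are hypothetical.

```python
import numpy as np

def eight_point(x1, x2):
    """Linear estimate of M with x2^T M x1 = 0 from >= 8 homogeneous matches."""
    # Each match contributes one linear equation in the 9 entries of M (row-major).
    A = np.einsum('ni,nj->nij', x2, x1).reshape(len(x1), 9)
    _, _, Vt = np.linalg.svd(A)
    return Vt[-1].reshape(3, 3)   # unit-norm null vector of A

def ransac_epipolar(x1, x2, n_iters=300, threshold=1e-6, seed=0):
    """Keep the matrix that explains the most correspondences."""
    rng = np.random.default_rng(seed)
    best_M = None
    best_inliers = np.zeros(len(x1), dtype=bool)
    for _ in range(n_iters):
        idx = rng.choice(len(x1), size=8, replace=False)
        M = eight_point(x1[idx], x2[idx])
        # Algebraic residual |x2^T M x1| for every correspondence.
        residual = np.abs(np.einsum('ni,ij,nj->n', x2, M, x1))
        inliers = residual < threshold
        if inliers.sum() > best_inliers.sum():
            best_M, best_inliers = M, inliers
    return best_M, best_inliers

# Synthetic, noise-free scene: 40 true matches followed by 10 random mismatches.
rng = np.random.default_rng(1)
theta = 0.1
c, s = np.cos(theta), np.sin(theta)
R = np.array([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])
t = np.array([1.0, 0.0, 0.2])
X = rng.uniform([-1, -1, 4], [1, 1, 8], size=(40, 3))
X2 = X @ R.T + t
x1 = X / X[:, 2:3]
x2 = X2 / X2[:, 2:3]
bad1 = np.column_stack([rng.uniform(-0.3, 0.3, (10, 2)), np.ones(10)])
bad2 = np.column_stack([rng.uniform(-0.3, 0.3, (10, 2)), np.ones(10)])
x1 = np.vstack([x1, bad1])
x2 = np.vstack([x2, bad2])

M_hat, inlier_mask = ransac_epipolar(x1, x2)
```

The tight threshold only makes sense because the synthetic matches are noise-free; with real detections it would need to reflect the expected pixel error.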

Graph-based Segmentation

Segmentation by min cut

[Figure labels: image pixels, edge weights w, similarity measure, minimum cut]

* From Khurram Hassan-Shafique CAP5415 Computer Vision 2003

Min cuts don’t segment well

[Figure labels: ideal cut; cuts with lesser weight than the ideal cut]

* Slide from Khurram Hassan-Shafique CAP5415 Computer Vision 2003

Normalized cuts

Instead of the min cut, minimize

$$N_{cut}(A,B) = \frac{w(A,B)}{\sum_{x \in A,\, y \in V} w(x,y)} + \frac{w(A,B)}{\sum_{z \in B,\, y \in V} w(z,y)}$$

Measure of dis-similarity between the sets A and B
NP-hard to minimize
Rely on continuous approximation
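To make the criterion concrete, the sketch below evaluates the normalized cut of a partition and computes one continuous approximation. It assumes the usual spectral relaxation, thresholding the second generalized eigenvector of (D - W) y = lambda D y, which the slide does not spell out. The function names and the toy graph are hypothetical.

```python
import numpy as np

def ncut_value(W, mask):
    """Normalized cut of the partition A = mask, B = ~mask for symmetric weights W."""
    cut = W[mask][:, ~mask].sum()        # total weight crossing the partition
    assoc_a = W[mask].sum()              # weight from A to all nodes
    assoc_b = W[~mask].sum()             # weight from B to all nodes
    return cut / assoc_a + cut / assoc_b

def spectral_bipartition(W):
    """Relaxed solution: sign of the second generalized eigenvector (assumed method)."""
    d = W.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(d)
    # Symmetric normalized Laplacian D^{-1/2} (D - W) D^{-1/2}.
    L_sym = np.eye(len(W)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    _, vecs = np.linalg.eigh(L_sym)      # eigenvalues in ascending order
    y = d_inv_sqrt * vecs[:, 1]          # map back to the generalized problem
    return y > 0

# Toy graph: two triangles with unit weights, joined by one weak edge.
W = np.zeros((6, 6))
for group in ([0, 1, 2], [3, 4, 5]):
    for i in group:
        for j in group:
            if i != j:
                W[i, j] = 1.0
W[2, 3] = W[3, 2] = 0.1

mask = spectral_bipartition(W)
value = ncut_value(W, mask)
```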

Normalized cuts examples

Limitations of normalized cuts

Works by binary partitioning
Slow and memory-intensive
Textured backgrounds are problems

Other graph-based methods

Many other variants on min cuts
  Typical cuts, nested cuts, etc.
No clear winner for segmentation
  Perhaps mean shift?

MST-based segmentation

Minimum spanning tree is the cheapest way to connect all pixels into a single component (or “region”)
Merge two components when the cheapest edge between them is cheap compared to a measure of the internal variation
Provably good segmentation under a fairly natural definition
  Neither too coarse nor too fine
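The merge rule above can be sketched with a union-find structure. The slide gives only the qualitative rule; the specific threshold used here (a component's largest internal edge weight plus k divided by its size, in the style of Felzenszwalb and Huttenlocher) is an assumption, as are the function name, the parameter `k`, and the toy one-dimensional "image".

```python
def segment_graph(num_nodes, edges, k=1.0):
    """edges: list of (weight, u, v). Returns one component label per node."""
    parent = list(range(num_nodes))
    size = [1] * num_nodes
    internal = [0.0] * num_nodes   # largest edge weight merged into each component

    def find(u):
        while parent[u] != u:
            parent[u] = parent[parent[u]]   # path halving
            u = parent[u]
        return u

    # Process edges cheapest first, as in Kruskal's MST algorithm.
    for w, u, v in sorted(edges):
        a, b = find(u), find(v)
        if a == b:
            continue
        # Merge only if the edge is cheap relative to both internal variations.
        if w <= min(internal[a] + k / size[a], internal[b] + k / size[b]):
            parent[b] = a
            size[a] += size[b]
            internal[a] = w   # edges arrive in ascending order, so w is the maximum
    return [find(u) for u in range(num_nodes)]

# Toy 1D "image": neighbouring pixels are linked by their intensity difference.
intensities = [10, 11, 12, 50, 51, 52]
edges = [(abs(intensities[i] - intensities[i + 1]), i, i + 1)
         for i in range(len(intensities) - 1)]
labels = segment_graph(len(intensities), edges, k=2.0)
```

Larger values of `k` favour larger components, because small components then tolerate more expensive connecting edges.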

Example output

Solves many problems with normalized cuts

More statistics

Dimensionality reduction

We can represent orange points only by their v1 coordinate

Eigenfaces

An n-pixel image is a point in $\mathbb{R}^n$
Find a low-dimensional representation of face images (from a training set)
Recognition by finding the closest face in face space
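A minimal sketch of this pipeline, assuming the low-dimensional representation comes from principal component analysis computed with an SVD. Random vectors stand in for face images, and every name and size below is hypothetical.

```python
import numpy as np

def fit_face_space(train, num_components):
    """train: (num_images, n_pixels). Returns the mean image and top principal directions."""
    mean = train.mean(axis=0)
    _, _, Vt = np.linalg.svd(train - mean, full_matrices=False)
    return mean, Vt[:num_components]

def to_face_space(images, mean, components):
    """Coordinates of each image along the principal directions."""
    return (images - mean) @ components.T

def recognize(query, train_coords, mean, components):
    """Index of the training image closest to the query in face space."""
    q = to_face_space(query, mean, components)
    return int(np.argmin(np.linalg.norm(train_coords - q, axis=1)))

# Stand-in data: 10 random "images" of 64 pixels each (not real faces).
rng = np.random.default_rng(0)
train = rng.normal(size=(10, 64))
mean, components = fit_face_space(train, num_components=5)
train_coords = to_face_space(train, mean, components)

# A slightly perturbed copy of training image 3 should match image 3.
query = train[3] + 0.01 * rng.normal(size=64)
match = recognize(query, train_coords, mean, components)
```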

Markov Random Fields

[Figure labels: image pixels (vertices); neighborhood relationships (n-links)]

$f_p$ - disparity at pixel p
$f = (f_1, \ldots, f_m)$ - configuration

MRF defining property:

$$\Pr(f_p \mid f_q,\ q \neq p) = \Pr(f_p \mid f_q,\ q \in N_p)$$

Hammersley-Clifford Theorem:

$$\Pr(f) \sim \exp\Big( -\sum_{(p,q)} V_{(p,q)}(f_p, f_q) \Big)$$

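MAP estimation of an MRF

$$\hat{f} = \arg\max_f \Pr(f \mid O)$$

O: observed data

Bayes rule:

$$\hat{f} = \arg\max_f \Pr(O \mid f)\, \Pr(f)$$

$\Pr(O \mid f)$: likelihood function (sensor noise)
$\Pr(f)$: prior (MRF model)

$$\hat{f} = \arg\max_f \exp\Big( \sum_p \ln g(O_p \mid f_p) - \sum_{(p,q)} V_{(p,q)}(f_p, f_q) \Big)$$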

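Energy minimization

$$E(f) = -\sum_p \ln g(O_p \mid f_p) + \sum_{(p,q)} V_{(p,q)}(f_p, f_q)$$

Data term (sensor noise): $-\sum_p \ln g(O_p \mid f_p)$
Smoothness term (MRF prior): $\sum_{(p,q)} V_{(p,q)}(f_p, f_q)$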
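To show how the two terms combine, the sketch below evaluates the energy of a labeling on a small 4-connected grid. The specific choices are assumptions made for illustration: a squared-difference data cost standing in for the negative log-likelihood, a Potts penalty for V, and the weight `smooth_weight`. None of these are specified on the slide.

```python
import numpy as np

def mrf_energy(labels, data_cost, smooth_weight=1.0):
    """E(f) = data term + smoothness term on a 4-connected grid.

    labels:    (h, w) integer array, the configuration f.
    data_cost: (h, w, num_labels) array; data_cost[y, x, l] stands in for
               -ln g(O_p | f_p = l).
    Smoothness uses a Potts penalty: a fixed cost per pair of unequal neighbours.
    """
    h, w = labels.shape
    rows, cols = np.indices((h, w))
    data = data_cost[rows, cols, labels].sum()
    horizontal = (labels[:, 1:] != labels[:, :-1]).sum()
    vertical = (labels[1:, :] != labels[:-1, :]).sum()
    return data + smooth_weight * (horizontal + vertical)

# Hypothetical observations on a 2x3 grid and two candidate labels, 0 and 1.
O = np.array([[0.1, 0.2, 0.9],
              [0.0, 0.8, 1.0]])
data_cost = (O[..., None] - np.arange(2)) ** 2   # squared difference to each label

f_good = np.array([[0, 0, 1],
                   [0, 1, 1]])
f_flat = np.zeros((2, 3), dtype=int)

e_good = mrf_energy(f_good, data_cost, smooth_weight=0.5)
e_flat = mrf_energy(f_flat, data_cost, smooth_weight=0.5)
```

The all-zero labeling pays nothing for smoothness but a large data cost, so the labeling that follows the observations ends up with lower energy here.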
