88
Stereo Vision ECE 847: Digital Image Processing Stan Birchfield Clemson University

Stereo Vision ECE 847: Digital Image Processing Stan Birchfield Clemson University

Embed Size (px)

Citation preview

Page 1: Stereo Vision ECE 847: Digital Image Processing Stan Birchfield Clemson University

Stereo Vision

ECE 847:Digital Image Processing

Stan BirchfieldClemson University

Page 2: Stereo Vision ECE 847: Digital Image Processing Stan Birchfield Clemson University

Modeling from multiple views

time

# cameras

photographbinocular stereo

trinocular stereo

multi-baseline stereo

camcorder

human vision

camera dome

two frames ...

...

– Greek for solid

Page 3: Stereo Vision ECE 847: Digital Image Processing Stan Birchfield Clemson University

Stereoscope

Invented by Wheatstone in 1838

Page 4: Stereo Vision ECE 847: Digital Image Processing Stan Birchfield Clemson University

Modern version

Page 5: Stereo Vision ECE 847: Digital Image Processing Stan Birchfield Clemson University

Can you fuse these?

rightleft

No special instrument needed

Just relax your eyes

L

R

Page 6: Stereo Vision ECE 847: Digital Image Processing Stan Birchfield Clemson University

Random dot stereogram

invented by Bela Juleszin 1959

http://www.magiceye.com/faq.htm

Page 7: Stereo Vision ECE 847: Digital Image Processing Stan Birchfield Clemson University

Autostereogram

Do you see the shark?

http://en.wikipedia.org/wiki/Autostereogram

Page 8: Stereo Vision ECE 847: Digital Image Processing Stan Birchfield Clemson University

Can you cross-fuse these?

right leftNote: Cross-fusion is necessary if distance between images

is greater than inter-ocular distance

L

R

R

L

impossible: instead, trickthe brain:

Tsukuba stereo images courtesy of Y. Ohta and Y. Nakamura at the University of Tsukuba

Page 9: Stereo Vision ECE 847: Digital Image Processing Stan Birchfield Clemson University

Human stereo geometry

http://webvision.med.utah.edu/space_perception.html

fixationpoint

corresponding points

aR

aL

disparity

Page 10: Stereo Vision ECE 847: Digital Image Processing Stan Birchfield Clemson University

Horopter

• Horopter: surfacewhere disparity is zero

• For round retina,the theoretical horopteris a circle(Vieth-Muller circle)

http://webvision.med.utah.edu/space_perception.html

Page 11: Stereo Vision ECE 847: Digital Image Processing Stan Birchfield Clemson University

Cyclopean image

http://webvision.med.utah.edu/space_perception.html

http://bearah718.tripod.com/sitebuildercontent/sitebuilderpictures/cyclops.jpg

Page 12: Stereo Vision ECE 847: Digital Image Processing Stan Birchfield Clemson University

Panum’s fusional area (volume)

• Human visual system is only capable of fusing the two images with a narrow range of disparities around fixation point

• This area (volume) is Panum’s fusional area

• Outside this area we get double-vision (diplopia)

http://www.allaboutvision.com/conditions/double-vision.htm

Page 13: Stereo Vision ECE 847: Digital Image Processing Stan Birchfield Clemson University

Human visual pathway

Page 14: Stereo Vision ECE 847: Digital Image Processing Stan Birchfield Clemson University

photos courtesy California Academy of Science

Cheetah:More accurate

depthestimation

Antelope:larger field

of view

Prey and predator

Page 15: Stereo Vision ECE 847: Digital Image Processing Stan Birchfield Clemson University

Stereo geometry for pinhole cameras with flat retinas

C,C’,x,x’ and X are coplanar

Left camera Right camera

world point

center ofprojection

epipolarplane epipolar line for x

epipole

baseline

M. Pollefeys, http://www.cs.unc.edu/Research/vision/comp256fall03/

Page 16: Stereo Vision ECE 847: Digital Image Processing Stan Birchfield Clemson University

epipoles e,e’= intersection of baseline with image plane = projection of projection center in other image= vanishing point of camera motion direction

an epipolar plane = plane containing baseline (1-D family)

an epipolar line = intersection of epipolar plane with image(always come in corresponding pairs)

Epipolar geometry

M. Pollefeys, http://www.cs.unc.edu/Research/vision/comp256fall03/

Page 17: Stereo Vision ECE 847: Digital Image Processing Stan Birchfield Clemson University

What if only C,C’,x are known?

Epipolar geometry

epipole

centerof

projection baseline

All points on p project on l and l’M. Pollefeys, http://www.cs.unc.edu/Research/vision/comp256fall03/

Page 18: Stereo Vision ECE 847: Digital Image Processing Stan Birchfield Clemson University

Family of planes and lines l and l’ Intersection in e and e’

Epipolar geometry

M. Pollefeys, http://www.cs.unc.edu/Research/vision/comp256fall03/

Page 19: Stereo Vision ECE 847: Digital Image Processing Stan Birchfield Clemson University

Example: Converging cameras

M. Pollefeys, http://www.cs.unc.edu/Research/vision/comp256fall03/

Page 20: Stereo Vision ECE 847: Digital Image Processing Stan Birchfield Clemson University

e

e’

Example: Forward motion

M. Pollefeys, http://www.cs.unc.edu/Research/vision/comp256fall03/

Page 21: Stereo Vision ECE 847: Digital Image Processing Stan Birchfield Clemson University

Epipolar geometry andFundamental matrix

epipolar line

(epipole: intersection of all epipolar lines)

(computed with 20 points)

Page 22: Stereo Vision ECE 847: Digital Image Processing Stan Birchfield Clemson University

Epipolar geometry andFundamental matrix

epipolar line

(epipole: intersection of all epipolar lines)

(computed with 50 points)

Page 23: Stereo Vision ECE 847: Digital Image Processing Stan Birchfield Clemson University

Fundamental matrix

point in image 1

point in image 2

fundamentalmatrix

Page 24: Stereo Vision ECE 847: Digital Image Processing Stan Birchfield Clemson University

Epipolar lines (1)

epipolar line in image 2associated with (x,y) in image 1

Page 25: Stereo Vision ECE 847: Digital Image Processing Stan Birchfield Clemson University

Epipolar lines (2)

epipolar line in image 1associated with (x’,y’) in image 2

Page 26: Stereo Vision ECE 847: Digital Image Processing Stan Birchfield Clemson University

Computing the fundamental matrix

knownunknown

Page 27: Stereo Vision ECE 847: Digital Image Processing Stan Birchfield Clemson University

Computing the fundamental matrix

knownunknown

Page 28: Stereo Vision ECE 847: Digital Image Processing Stan Birchfield Clemson University

Computing the fundamental matrix

1. Construct A (nx9) from correspondences2. Compute SVD of A: A = UVT

3. Unit-norm solution of Af=0 is given by vn (the right-most singular vector of A)

4. Reshape f into F1

5. Compute SVD of F1: F1=UFFVFT

6. Set F(3,3)=0 to get F’(The enforces rank(F)=2)

7. Reconstruct F=UFF’VFT

8. Now xTF is epipolar line in image 2,and Fx’ is epipolar line in image 1

Page 29: Stereo Vision ECE 847: Digital Image Processing Stan Birchfield Clemson University

(simple for stereo rectification)

Example: Motion parallel to image plane

M. Pollefeys, http://www.cs.unc.edu/Research/vision/comp256fall03/

Page 30: Stereo Vision ECE 847: Digital Image Processing Stan Birchfield Clemson University

Example: Motion parallel to image scanlines

Epipoles are at infinity

Scanlines are the epipolar lines

In this case, the images are said to be “rectified”

Tsukuba stereo images courtesy of Y. Ohta and Y. Nakamura at the University of Tsukuba

Page 31: Stereo Vision ECE 847: Digital Image Processing Stan Birchfield Clemson University

Standard (rectified) stereo geometry

pure translation along X-axis

M. Pollefeys, http://www.cs.unc.edu/Research/vision/comp256fall03/

Page 32: Stereo Vision ECE 847: Digital Image Processing Stan Birchfield Clemson University

Perspective projection

X

x

X

x

M. Pollefeys, http://www.cs.unc.edu/Research/vision/comp256fall03/

Page 33: Stereo Vision ECE 847: Digital Image Processing Stan Birchfield Clemson University

Rectified geometry

xL xR

XL -XR

b

Page 34: Stereo Vision ECE 847: Digital Image Processing Stan Birchfield Clemson University

Standard stereo geometry

• disparity is inversely proportional to depth• stereo vision is less useful for distant objects

M. Pollefeys, http://www.cs.unc.edu/Research/vision/comp256fall03/

Page 35: Stereo Vision ECE 847: Digital Image Processing Stan Birchfield Clemson University

Binocular rectified stereo

rightleft

disparity map depth discontinuities

epipolarconstraint

Page 36: Stereo Vision ECE 847: Digital Image Processing Stan Birchfield Clemson University

Matching scanlines

inte

nsi

ty

L

R

dis

par

ity

lamp

wall

pixel

rightleft

Page 37: Stereo Vision ECE 847: Digital Image Processing Stan Birchfield Clemson University

Stereo matching

• Search is limited to epipolar line (1D)• Look for most similar pixel

?

M. Pollefeys, http://www.cs.unc.edu/Research/vision/comp256fall03/

Page 38: Stereo Vision ECE 847: Digital Image Processing Stan Birchfield Clemson University

Aggregation

• Use more than one pixel• Assume neighbors have similar

disparities*

– Use correlation window containing pixel

– Allows to use SSD, ZNCC, Census, etc.

M. Pollefeys, http://www.cs.unc.edu/Research/vision/comp256fall03/

Page 39: Stereo Vision ECE 847: Digital Image Processing Stan Birchfield Clemson University

Block matching

Page 40: Stereo Vision ECE 847: Digital Image Processing Stan Birchfield Clemson University

Dissimilarity measures

Connection between SSD and cross correlation:

Also normalized correlation, rank, census, sampling-insensitive ...

Most common:

Page 41: Stereo Vision ECE 847: Digital Image Processing Stan Birchfield Clemson University

More efficient implementation

Key idea: Summation over window is correlation with box filter, which is separable

Running sum improves efficiency even more

Note: w is half-width

Page 42: Stereo Vision ECE 847: Digital Image Processing Stan Birchfield Clemson University

Compare intensities pixel-by-pixel

Comparing image regions

I(x,y) I´(x,y)

Sum of Square Differences

Dissimilarity measures

Note: SAD is fast approximation (replace square with absolute value)

M. Pollefeys, http://www.cs.unc.edu/Research/vision/comp256fall03/

Page 43: Stereo Vision ECE 847: Digital Image Processing Stan Birchfield Clemson University

Compare intensities pixel-by-pixel

Comparing image regions

I(x,y) I´(x,y)

Dissimilarity measures

If energy does not change much, then minimizing SSD equals maximizing cross-correlation:

M. Pollefeys, http://www.cs.unc.edu/Research/vision/comp256fall03/

Page 44: Stereo Vision ECE 847: Digital Image Processing Stan Birchfield Clemson University

Compare intensities pixel-by-pixel

Comparing image regions

I(x,y) I´(x,y)

Zero-mean Normalized Cross Correlation

Similarity measures

M. Pollefeys, http://www.cs.unc.edu/Research/vision/comp256fall03/

Page 45: Stereo Vision ECE 847: Digital Image Processing Stan Birchfield Clemson University

Compare intensities pixel-by-pixel

Comparing image regions

I(x,y) I´(x,y)

Census

Similarity measures

125 126 125

127 128 130

129 132 135

0 0 0

0 1

1 1 1

(Real-time chip from TYZX based on Census)

only compare bit signature

M. Pollefeys, http://www.cs.unc.edu/Research/vision/comp256fall03/

Page 46: Stereo Vision ECE 847: Digital Image Processing Stan Birchfield Clemson University

Sampling-InsensitivePixel Dissimilarity

d(xL,xR)

xL xR

d(xL,xR) = min{d(xL,xR) ,d(xR,xL)}Our dissimilarity measure:

[Birchfield & Tomasi 1998]

IL IR

Page 47: Stereo Vision ECE 847: Digital Image Processing Stan Birchfield Clemson University

Given: An interval A such that [xL – ½ , xL + ½] _ A, and

[xR – ½ , xR + ½] _ A

Dissimilarity Measure Theorems

If | xL – xR | ≤ ½, then d(xL,xR) = 0

| xL – xR | ≤ ½ iff d(xL,xR) = 0

∩∩

Theorem 1:

Theorem 2:

(when A is convex or concave)

(when A is linear)

Page 48: Stereo Vision ECE 847: Digital Image Processing Stan Birchfield Clemson University

Aggregation window sizes

Small windows • disparities similar• more ambiguities• accurate when correct

Large windows • larger disp. variation• more discriminant• often more robust• use shiftable windows to

deal with discontinuities

(Illustration from Pascal Fua)

Page 49: Stereo Vision ECE 847: Digital Image Processing Stan Birchfield Clemson University

Occlusions

(Slide from Pascal Fua)

Page 50: Stereo Vision ECE 847: Digital Image Processing Stan Birchfield Clemson University

Left-right consistency check

• Search left-to-right, then right-to-left• Retain disparity only if they agree

xL

d

Do minima coincide?

Page 51: Stereo Vision ECE 847: Digital Image Processing Stan Birchfield Clemson University

Results: correlation

disparity mapleft

with left-right consistency check

Page 52: Stereo Vision ECE 847: Digital Image Processing Stan Birchfield Clemson University

Constraints

• Epipolar – match must lie on epipolar line• Piecewise constancy – neighboring pixels should usually

have same disparity• Piecewise continuity – neighboring pixels should usually

have similar disparity• Disparity – impose allowable range of disparities (Panum’s

fusional area)• Disparity gradient – restricts slope of disparity• Figural continuity – disparity of edges across scanlines• Uniqueness – each pixel has no more than one match

(violated by windows and mirrors)• Ordering – disparity function is monotonic (precludes thin

poles)

Page 53: Stereo Vision ECE 847: Digital Image Processing Stan Birchfield Clemson University

Exploiting scene constraints

M. Pollefeys, http://www.cs.unc.edu/Research/vision/comp256fall03/

Page 54: Stereo Vision ECE 847: Digital Image Processing Stan Birchfield Clemson University

Ordering constraint

11 22 33 4,54,5 66 11 2,32,3 44 55 66

2211 33 4,54,5 6611

2,32,3

44

55

66

surface slicesurface slice surface as a pathsurface as a path

occlusion right

occlusion left

M. Pollefeys, http://www.cs.unc.edu/Research/vision/comp256fall03/

Page 55: Stereo Vision ECE 847: Digital Image Processing Stan Birchfield Clemson University

Uniqueness constraint

• In an image pair each pixel has at most one corresponding pixel– In general one corresponding pixel– In case of occlusion there is none

M. Pollefeys, http://www.cs.unc.edu/Research/vision/comp256fall03/

Page 56: Stereo Vision ECE 847: Digital Image Processing Stan Birchfield Clemson University

Disparity constraint

surface slicesurface slice surface as a pathsurface as a path

bounding box

dispa

rity b

and

use reconstructed features to determine bounding box

constantdisparitysurfaces

M. Pollefeys, http://www.cs.unc.edu/Research/vision/comp256fall03/

Page 57: Stereo Vision ECE 847: Digital Image Processing Stan Birchfield Clemson University

Figural continuity constraint

right left

[University of Tsukuba]

Page 58: Stereo Vision ECE 847: Digital Image Processing Stan Birchfield Clemson University

Cooperative algorithm

Page 59: Stereo Vision ECE 847: Digital Image Processing Stan Birchfield Clemson University

Disparity space image

Page 60: Stereo Vision ECE 847: Digital Image Processing Stan Birchfield Clemson University

Dynamic Programming: 1D Search

Dis

par

ity

map

occlusion

depthdiscontinuity

RIGHTL

EF

T

c a r t

ca

t 3 2 1 1 12 1 0 1 21 0 1 2 30 1 2 3 4

string editing:

stereo matching:

penalties: mismatch = 1 insertion = 1 deletion = 1

c a t

c a r t

Page 61: Stereo Vision ECE 847: Digital Image Processing Stan Birchfield Clemson University

Minimizing a 2D Cost FunctionalMinimize:

disparity

p,q

p,q

d(p, )2D:

GL

OB

AL

dis

par

ity

pixel?

p,q

d(p, )

1D:

E E d(p, ) u(l )data smoothness p p,q p,q

{p,q} N

Global

u(l )p,q

Discontinuity penalty:

lp,q

minimum cut = disparity surface

u(l )= lp,q p,q p,qsolves

LO

CA

L Local (GOOD)

(BAD)

Page 62: Stereo Vision ECE 847: Digital Image Processing Stan Birchfield Clemson University

Multiway-Cut:2D Search

pixels

labels

pixels

labels

[Boykov, Veksler, Zabih 1998]

Page 63: Stereo Vision ECE 847: Digital Image Processing Stan Birchfield Clemson University

Multiway-Cut Algorithm

),( x'x ))(, x(x fg

minimum cut

),(

)]()()[,())(,x'xx

x'xx'xx(x fffg Minimizes

source label

sink label

pixels

(cost of label discontinuity)

(cost of assigninglabel to pixel)

pixels

labels

Page 64: Stereo Vision ECE 847: Digital Image Processing Stan Birchfield Clemson University

Energy minimization

(Slide from Pascal Fua)

Page 65: Stereo Vision ECE 847: Digital Image Processing Stan Birchfield Clemson University

Graph Cut

(Slide from Pascal Fua)

(general formulation requires multi-way cut!)

Page 66: Stereo Vision ECE 847: Digital Image Processing Stan Birchfield Clemson University

(Boykov et al ICCV‘99)

(Roy and Cox ICCV‘98)

Simplified graph cut

Page 67: Stereo Vision ECE 847: Digital Image Processing Stan Birchfield Clemson University

Correspondence as Segmentation

• Problem: disparities (fronto-parallel) O()surfaces (slanted) O( 2 n)=> computationally intractable!

• Solution: iteratively determine which labels to use

labelpixels

find affineparametersof regions

multiway-cut(Expectation)

Newton-Raphson(Maximization)

Page 68: Stereo Vision ECE 847: Digital Image Processing Stan Birchfield Clemson University

Stereo Results (Dynamic Programming)

Page 69: Stereo Vision ECE 847: Digital Image Processing Stan Birchfield Clemson University

Stereo Results (Multiway-Cut)

Page 70: Stereo Vision ECE 847: Digital Image Processing Stan Birchfield Clemson University

Stereo Results on Middlebury Database

imag

eB

irch

fiel

dT

om

asi 1

999

Ho

ng

-C

hen

200

4

Page 71: Stereo Vision ECE 847: Digital Image Processing Stan Birchfield Clemson University

Untextured regions remain a challenge

Multiway-cutDynamic programming

Page 72: Stereo Vision ECE 847: Digital Image Processing Stan Birchfield Clemson University

Results: dynamic programming

disparity map

[Bobick & Intille]

left

Page 73: Stereo Vision ECE 847: Digital Image Processing Stan Birchfield Clemson University

Results: multiway cut

disparity mapleft

[Kolmogorov & Zabih]

Page 74: Stereo Vision ECE 847: Digital Image Processing Stan Birchfield Clemson University

Results: multiway cut (untextured)

disparity map

Page 75: Stereo Vision ECE 847: Digital Image Processing Stan Birchfield Clemson University

Multi-camera configurations

Okutami and Kanade

(illustration from Pascal Fua)

M. Pollefeys, http://www.cs.unc.edu/Research/vision/comp256fall03/

Page 76: Stereo Vision ECE 847: Digital Image Processing Stan Birchfield Clemson University

Example: Tsukuba

Tsukuba dataset

Page 77: Stereo Vision ECE 847: Digital Image Processing Stan Birchfield Clemson University

Real-time stereo on GPU• Computes Sum-of-Square-Differences (use

pixelshader)• Hardware mip-map generation for aggregation over

window• Trade-off between small and large support window

(Yang and Pollefeys, CVPR2003)

290M disparity hypothesis/sec (Radeon9800pro)e.g. 512x512x36disparities at 30Hz

GPU is great for vision too!

Page 78: Stereo Vision ECE 847: Digital Image Processing Stan Birchfield Clemson University

Stereo matching

Optimal path(dynamic programming )

Similarity measure(SSD or NCC)

Constraints• epipolar

• ordering

• uniqueness

• disparity limit

Trade-off

• Matching cost (data)

• Discontinuities (prior)

Consider all paths that satisfy the constraints

pick best using dynamic programming

M. Pollefeys, http://www.cs.unc.edu/Research/vision/comp256fall03/

Page 79: Stereo Vision ECE 847: Digital Image Processing Stan Birchfield Clemson University

Hierarchical stereo matching

Dow

nsam

plin

g

(Gau

ssia

n p

yra

mid

)

Dis

pari

ty p

rop

ag

ati

on

Allows faster computation

Deals with large disparity ranges

M. Pollefeys, http://www.cs.unc.edu/Research/vision/comp256fall03/

Page 80: Stereo Vision ECE 847: Digital Image Processing Stan Birchfield Clemson University

Disparity map

image I(x,y) image I´(x´,y´)Disparity map D(x,y)

(x´,y´)=(x+D(x,y),y)

M. Pollefeys, http://www.cs.unc.edu/Research/vision/comp256fall03/

Page 81: Stereo Vision ECE 847: Digital Image Processing Stan Birchfield Clemson University

Example: reconstruct image from neighboring images

M. Pollefeys, http://www.cs.unc.edu/Research/vision/comp256fall03/

Page 82: Stereo Vision ECE 847: Digital Image Processing Stan Birchfield Clemson University

Stereo matching with general camera configuration

M. Pollefeys, http://www.cs.unc.edu/Research/vision/comp256fall03/

Page 83: Stereo Vision ECE 847: Digital Image Processing Stan Birchfield Clemson University

Image pair rectification

M. Pollefeys, http://www.cs.unc.edu/Research/vision/comp256fall03/

Page 84: Stereo Vision ECE 847: Digital Image Processing Stan Birchfield Clemson University

Planar rectification

Bring two views Bring two views to standard stereo setupto standard stereo setup

(moves epipole to )(not possible when in/close to image)

~ image size

(calibrated)(calibrated)

Distortion minimization(uncalibrated)

M. Pollefeys, http://www.cs.unc.edu/Research/vision/comp256fall03/

Page 85: Stereo Vision ECE 847: Digital Image Processing Stan Birchfield Clemson University

M. Pollefeys, http://www.cs.unc.edu/Research/vision/comp256fall03/

Page 86: Stereo Vision ECE 847: Digital Image Processing Stan Birchfield Clemson University

polarrectification

planarrectification

originalimage pair

M. Pollefeys, http://www.cs.unc.edu/Research/vision/comp256fall03/

Page 87: Stereo Vision ECE 847: Digital Image Processing Stan Birchfield Clemson University

Stereo camera configurations

(Slide from Pascal Fua)

Page 88: Stereo Vision ECE 847: Digital Image Processing Stan Birchfield Clemson University

More cameras

Multi-baseline stereo

[Okutomi & Kanade]