
Stereo

Sebastian Thrun, Gary Bradski, Daniel Russakoff
Stanford CS223B Computer Vision

http://robots.stanford.edu/cs223b

(with slides by James Rehg and Zhigang Zhu)


Sebastian Thrun Stanford University CS223B Computer Vision

Stereo Vision: Illustration

http://www.well.com/user/jimg/stereo/stereo_list.html


Stereo Vision: Outline

• Basic Equations
• Epipolar Geometry
• Image Rectification
• Reconstruction
• Correspondence
• (Active Range Imaging Techniques)

Pinhole Camera Model

[Figure: pinhole camera, showing the image plane, the focal length f, and the center of projection.]

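Pinhole Camera Model

[Figure: camera frame with origin O at the center of projection and axes x, y, z; the image plane lies at distance f along z. A scene point P = (X, Y, Z) projects to the image point p = (x, y, z).]

The image point lies on the ray from O through P:

$$\overrightarrow{Op} = \lambda\,\overrightarrow{OP} \quad\Rightarrow\quad x = \lambda X,\quad y = \lambda Y,\quad z = \lambda Z$$

On the image plane z = f, so \lambda = f/Z and

$$x = f\,\frac{X}{Z}, \qquad y = f\,\frac{Y}{Z}$$

In homogeneous form:

$$(X, Y, Z) \;\mapsto\; (x, y, 1) = \left(f\,\frac{X}{Z},\; f\,\frac{Y}{Z},\; 1\right)$$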
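The projection equations x = fX/Z and y = fY/Z can be sketched in a few lines of NumPy. This is a minimal illustration, not code from the course; the function name `project_pinhole` and the sample points are assumptions made for the example.

```python
import numpy as np

def project_pinhole(points_xyz, f):
    """Project camera-frame points (N, 3) to image coordinates (N, 2).

    Implements x = f*X/Z, y = f*Y/Z. The name and signature are
    hypothetical, chosen only for this sketch.
    """
    pts = np.asarray(points_xyz, dtype=float)
    X, Y, Z = pts[:, 0], pts[:, 1], pts[:, 2]
    if np.any(Z <= 0):
        raise ValueError("points must lie in front of the camera (Z > 0)")
    return np.stack([f * X / Z, f * Y / Z], axis=1)

# Two points on the same ray through O project to the same image point.
pts = np.array([[1.0, 2.0, 4.0],
                [2.0, 4.0, 8.0]])
xy = project_pinhole(pts, f=2.0)
print(xy)
```

Both sample points lie on one ray through the center of projection, which is why depth cannot be recovered from a single image.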

Basic Stereo Derivations

[Figure: two cameras with centers O1 and O2, parallel axes (x, y, z), and focal length f, separated by the baseline B along x. The point P1 = (X, Y, Z) is imaged at p1 in the first camera and at p2 in the second.]

Derive an expression for Z as a function of x1, x2, f, and B.

$$x_1 = f\,\frac{X}{Z}, \qquad x_2 = f\,\frac{X - B}{Z} = x_1 - \frac{fB}{Z} \quad\Rightarrow\quad Z = \frac{fB}{x_1 - x_2}$$
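As a quick numerical check of Z = fB / (x1 - x2), the sketch below computes depth from the disparity x1 - x2. The function name `depth_from_disparity` and the convention of returning infinity for non-positive disparity are assumptions for this example.

```python
import numpy as np

def depth_from_disparity(x1, x2, f, B):
    """Depth Z = f*B / (x1 - x2) for a parallel-axis stereo pair.

    Non-positive disparities are mapped to infinity (an assumed
    convention for this sketch: such points are treated as "at infinity").
    """
    d = np.asarray(x1, dtype=float) - np.asarray(x2, dtype=float)
    with np.errstate(divide="ignore"):
        return np.where(d > 0, f * B / d, np.inf)

# Point at X = 1, Z = 4 seen by cameras with f = 2 and baseline B = 0.5.
f, B, X, Z_true = 2.0, 0.5, 1.0, 4.0
x1 = f * X / Z_true            # image coordinate in camera 1
x2 = f * (X - B) / Z_true      # image coordinate in camera 2
Z_est = depth_from_disparity(x1, x2, f, B)
print(float(Z_est))
```

Because depth is inversely proportional to disparity, a fixed error in x1 - x2 costs more depth accuracy for distant points than for near ones.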

What If…?

[Figure: the stereo rig from the previous slide (cameras O1 and O2, baseline B, focal length f, image points p1 and p2 of P1 = (X, Y, Z)), shown next to a second rig with the same labels but a different camera arrangement.]

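Epipolar Geometry

[Figure: left and right cameras with centers Ol and Or, axes (Xl, Yl, Zl) and (Xr, Yr, Zr), and focal lengths fl and fr. The scene point P has coordinates Pl and Pr in the two camera frames and projects to pl and pr.]

The two camera frames are related by a rotation R and a translation T.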

Epipolar Geometry

[Figure: the same two-camera setup (P, Pl, Pr, pl, pr, Ol, Or) with the epipolar plane, the epipolar lines, and the epipoles el and er marked.]

Epipolar Geometry

Epipolar plane: plane going through point P and the centers of projection (COPs) of the two cameras

Epipoles: The image in one camera of the COP of the other

Epipolar Constraint: Corresponding points must lie on epipolar lines


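Essential Matrix

[Figure: epipolar geometry with camera centers Ol and Or, epipoles el and er, scene point P (coordinates Pl, Pr), and image points pl and pr.]

Coordinate transformation:

$$P_r = R\,(P_l - T)$$

Orthogonality (T, P_l, and P_l - T are coplanar):

$$(P_l - T)^T\,(T \times P_l) = 0$$

Write the cross product as a matrix product, T \times P_l = S P_l, with

$$S = \begin{pmatrix} 0 & -T_z & T_y \\ T_z & 0 & -T_x \\ -T_y & T_x & 0 \end{pmatrix}$$

Since P_l - T = R^T P_r, the constraint becomes

$$(R^T P_r)^T\,(T \times P_l) = 0 \quad\Rightarrow\quad (R^T P_r)^T S P_l = 0 \quad\Rightarrow\quad P_r^T R S P_l = 0$$

which resolves to the Essential Matrix:

$$E = R S, \qquad P_r^T E P_l = 0$$

The same constraint holds for the image points:

$$p_r^T E p_l = 0$$

Projective line:

$$u_r = E p_l$$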
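The construction E = RS can be checked numerically. The sketch below assumes the convention P_r = R(P_l - T) used in the derivation; the helper names `skew` and `essential_matrix`, and the sample rotation, translation, and point, are made up for illustration.

```python
import numpy as np

def skew(T):
    """Matrix S such that S @ P equals the cross product T x P."""
    Tx, Ty, Tz = T
    return np.array([[0.0, -Tz,  Ty],
                     [ Tz, 0.0, -Tx],
                     [-Ty,  Tx, 0.0]])

def essential_matrix(R, T):
    """E = R S, assuming the convention P_r = R (P_l - T)."""
    return R @ skew(T)

# Hypothetical rig: 10 degree rotation about the y axis plus a translation.
theta = np.deg2rad(10.0)
R = np.array([[ np.cos(theta), 0.0, np.sin(theta)],
              [ 0.0,           1.0, 0.0          ],
              [-np.sin(theta), 0.0, np.cos(theta)]])
T = np.array([0.5, 0.0, 0.1])
E = essential_matrix(R, T)

P_l = np.array([0.3, -0.2, 4.0])   # point in the left camera frame
P_r = R @ (P_l - T)                # same point in the right camera frame
residual = P_r @ E @ P_l           # should vanish up to rounding
print(residual)
```

The residual is zero up to floating-point rounding, and E is singular because S is a cross-product matrix of rank 2.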

Fundamental Matrix

Same as the Essential Matrix, but in camera pixel coordinates.

Camera coordinates:

$$p_r^T E p_l = 0$$

Pixel coordinates:

$$p_r^T F p_l = 0, \qquad F = M_r^{-T} E\, M_l^{-1}$$

M_l and M_r hold the intrinsic parameters of the left and right cameras.

Computing F: The Eight-Point Algorithm

Input: n point correspondences (n >= 8)

– Construct the homogeneous system Ax = 0 from $p_r^T F p_l = 0$
  • x = (f11, f12, f13, f21, f22, f23, f31, f32, f33): the entries of F
  • Each correspondence gives one equation
  • A is an n x 9 matrix
– Obtain the estimate $\hat F$ by SVD of A: $A = U D V^T$
  • x (up to a scale) is the column of V corresponding to the least singular value
– Enforce the singularity constraint, since rank(F) = 2
  • Compute the SVD of $\hat F$: $\hat F = U D V^T$
  • Set the smallest singular value to 0: D -> D'
  • Corrected estimate of F: $F' = U D' V^T$

Output: the estimate F' of the fundamental matrix

Similarly, we can compute E given the intrinsic parameters.
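The steps above can be sketched with NumPy's SVD. This is a bare-bones illustration on noise-free synthetic data, not a production implementation: it omits the coordinate normalization that is usually recommended for noisy pixel data, and the function name `eight_point` and the synthetic rig are assumptions.

```python
import numpy as np

def eight_point(pl, pr):
    """Estimate F from n >= 8 correspondences; pl, pr are (n, 2) arrays."""
    pl = np.asarray(pl, dtype=float)
    pr = np.asarray(pr, dtype=float)
    n = pl.shape[0]
    if n < 8:
        raise ValueError("need at least 8 correspondences")
    xl, yl = pl[:, 0], pl[:, 1]
    xr, yr = pr[:, 0], pr[:, 1]
    # One row per correspondence, from pr^T F pl = 0 with
    # x = (f11, f12, f13, f21, f22, f23, f31, f32, f33).
    A = np.stack([xr * xl, xr * yl, xr,
                  yr * xl, yr * yl, yr,
                  xl, yl, np.ones(n)], axis=1)
    # Least-squares solution: right singular vector of the smallest singular value.
    _, _, Vt = np.linalg.svd(A)
    F_hat = Vt[-1].reshape(3, 3)
    # Enforce rank 2 by zeroing the smallest singular value of F_hat.
    U, D, Vt2 = np.linalg.svd(F_hat)
    D[-1] = 0.0
    return U @ np.diag(D) @ Vt2

# Synthetic test rig (assumed values): P_r = R (P_l - T), focal length 1.
rng = np.random.default_rng(0)
P_l = rng.uniform([-1, -1, 4], [1, 1, 8], size=(12, 3))
theta = np.deg2rad(5.0)
R = np.array([[ np.cos(theta), 0.0, np.sin(theta)],
              [ 0.0,           1.0, 0.0          ],
              [-np.sin(theta), 0.0, np.cos(theta)]])
T = np.array([0.5, 0.0, 0.05])
P_r = (P_l - T) @ R.T
pl = P_l[:, :2] / P_l[:, 2:3]
pr = P_r[:, :2] / P_r[:, 2:3]

F = eight_point(pl, pr)
hl = np.column_stack([pl, np.ones(len(pl))])
hr = np.column_stack([pr, np.ones(len(pr))])
residuals = np.einsum("ni,ij,nj->n", hr, F, hl)   # pr^T F pl per correspondence
print(np.max(np.abs(residuals)))
```

With exact correspondences the residuals vanish up to rounding; with real, noisy matches they would not, and the rank-2 step then changes the estimate.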

Rectification

Idea: Align epipolar lines with scan lines.

Question: What type of transformation?

Locating the Epipoles

[Figure: epipolar geometry with camera centers Ol and Or, epipoles el and er, scene point P (coordinates Pl, Pr), and image points pl and pr.]

e_l lies on all the epipolar lines of the left image:

$$p_r^T F p_l = 0 \quad\Rightarrow\quad p_r^T F e_l = 0 \text{ for every } p_r \quad\Rightarrow\quad F e_l = 0$$

Input: Fundamental Matrix F

– Find the SVD of F: $F = U D V^T$
– The epipole e_l is the column of V corresponding to the null singular value (as shown above)
– The epipole e_r is the column of U corresponding to the null singular value (similar treatment as for e_l)

Output: Epipoles e_l and e_r
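The SVD recipe above can be sketched as follows. To have a matrix with known epipoles, the example builds F as E = RS with unit focal length and identity intrinsics, which is an assumption made only for this check; the function name `epipoles` is likewise hypothetical.

```python
import numpy as np

def epipoles(F):
    """Return (e_l, e_r): the right and left null vectors of F."""
    U, D, Vt = np.linalg.svd(F)   # singular values come sorted, largest first
    e_l = Vt[-1]                  # F @ e_l = 0
    e_r = U[:, -1]                # e_r @ F = 0
    return e_l, e_r

# Assumed test matrix: F = R S with S the cross-product matrix of T.
theta = np.deg2rad(10.0)
R = np.array([[ np.cos(theta), 0.0, np.sin(theta)],
              [ 0.0,           1.0, 0.0          ],
              [-np.sin(theta), 0.0, np.cos(theta)]])
T = np.array([0.5, 0.0, 0.1])
S = np.array([[0.0, -T[2], T[1]],
              [T[2], 0.0, -T[0]],
              [-T[1], T[0], 0.0]])
F = R @ S

e_l, e_r = epipoles(F)
print(e_l, e_r)
```

For this test matrix e_l is parallel to T and e_r is parallel to RT, matching the definition of an epipole as the image of the other camera's center of projection. Both are homogeneous vectors, defined only up to scale and sign.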

Stereo Rectification (see Trucco)

Stereo system with parallel optical axes:

• Epipoles are at infinity
• Horizontal epipolar lines

[Figure: two cameras with centers Ol and Or, axes (Xl, Yl, Zl) and (Xr, Yr, Zr), separated by the translation T. The scene point P (coordinates Pl, Pr) projects to pl and pr.]

Reconstruction (3-D): Idealized

[Figure: scene point P (coordinates Pl, Pr), camera centers Ol and Or, and image points pl and pr.]

Reconstruction (3-D): Real

[Figure: the same setup and labels as in the idealized case.]

See Trucco/Verri, pages 161-171
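With real measurements the two viewing rays generally do not intersect. One common way to reconstruct P, not necessarily the method on these slides, is the midpoint method: find the points on the two rays that are closest in the least-squares sense and take their midpoint. The sketch assumes the convention P_r = R(P_l - T) and image points in camera coordinates; `triangulate_midpoint` is a hypothetical name.

```python
import numpy as np

def triangulate_midpoint(p_l, p_r, R, T):
    """Midpoint triangulation, result expressed in the left camera frame.

    Left ray:  a * p_l.  Right ray in the left frame: T + b * R^T p_r.
    Solves min over (a, b) of |a*p_l - (T + b*R^T p_r)|^2.
    """
    d_l = np.asarray(p_l, dtype=float)
    d_r = R.T @ np.asarray(p_r, dtype=float)
    A = np.column_stack([d_l, -d_r])
    (a, b), *_ = np.linalg.lstsq(A, T, rcond=None)
    return 0.5 * (a * d_l + T + b * d_r)

# Assumed test rig with exact (noise-free) image points.
theta = np.deg2rad(5.0)
R = np.array([[ np.cos(theta), 0.0, np.sin(theta)],
              [ 0.0,           1.0, 0.0          ],
              [-np.sin(theta), 0.0, np.cos(theta)]])
T = np.array([0.5, 0.0, 0.05])
P_l = np.array([0.3, -0.2, 4.0])
P_r = R @ (P_l - T)
p_l = P_l / P_l[2]        # image point, focal length 1
p_r = P_r / P_r[2]

P_est = triangulate_midpoint(p_l, p_r, R, T)
print(P_est)
```

With exact image points the rays intersect and the midpoint reproduces P_l; with noisy points it returns the point halfway along the shortest segment between the rays.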

Correspondence

[Figure: two cameras with centers O1 and O2 and focal length f viewing the point P1, which projects to p_{1,l} in the left image and p_{1,r} in the right image. Phantom points are marked.]

Correspondence via Correlation

[Figure: rectified left and right images with a window compared along the same scanline, and a plot of SSD error against disparity.]

(Same as max-correlation / max-cosine for normalized image patches)

Image Normalization

Even when the cameras are identical models, there can be differences in gain and sensitivity.

The cameras do not see exactly the same surfaces, so their overall light levels can differ.

For these reasons and more, it is a good idea to normalize the pixels in each window:

Average pixel:

$$\bar I = \frac{1}{|W_m(x,y)|} \sum_{(u,v) \in W_m(x,y)} I(u,v)$$

Window magnitude:

$$\|I\|_{W_m(x,y)} = \sqrt{\sum_{(u,v) \in W_m(x,y)} [I(u,v)]^2}$$

Normalized pixel:

$$\hat I(x,y) = \frac{I(x,y) - \bar I}{\|I - \bar I\|_{W_m(x,y)}}$$
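A minimal sketch of this window normalization is shown below. The function name `normalize_window` is hypothetical, and returning all zeros for a perfectly flat window is an assumed convention, since the formula is undefined there.

```python
import numpy as np

def normalize_window(window):
    """Subtract the window mean, then scale the window to unit length."""
    w = np.asarray(window, dtype=float)
    centered = w - w.mean()
    norm = np.sqrt(np.sum(centered ** 2))
    if norm == 0.0:
        # Flat window: the formula divides by zero, so return zeros
        # (an assumption made for this sketch).
        return np.zeros_like(centered)
    return centered / norm

rng = np.random.default_rng(1)
w = rng.uniform(0, 255, size=(5, 5))
w_hat = normalize_window(w)
# A camera with different gain and offset sees 3*w + 10; after
# normalization the two windows agree.
w_hat_other = normalize_window(3.0 * w + 10.0)
print(np.allclose(w_hat, w_hat_other))
```

This is why normalization compensates for the gain and light-level differences mentioned above: any positive gain and any offset cancel out.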

Images as Vectors

[Figure: a window w_L in the left image and a window w_R in the right image.]

Each window is a vector in an m^2-dimensional vector space. Normalization makes them unit length.

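Image Metrics

[Figure: the window vectors w_L and w_R(d), with the angle \theta between them.]

(Normalized) Sum of Squared Differences:

$$C_{SSD}(d) = \sum_{(u,v) \in W_m(x,y)} [\hat I_L(u,v) - \hat I_R(u-d, v)]^2 = \|w_L - w_R(d)\|^2$$

Normalized Correlation:

$$C_{NC}(d) = \sum_{(u,v) \in W_m(x,y)} \hat I_L(u,v)\, \hat I_R(u-d, v) = w_L \cdot w_R(d) = \cos\theta$$

$$d^* = \arg\min_d \|w_L - w_R(d)\|^2 = \arg\max_d\; w_L \cdot w_R(d)$$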
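The SSD search over disparities along one scanline can be sketched as below, assuming rectified images stored as 2-D NumPy arrays indexed [row, column]. The names `normalize` and `best_disparity`, the window size, and the synthetic image pair are all assumptions for illustration.

```python
import numpy as np

def normalize(w):
    """Zero-mean, unit-length version of a window (zeros if the window is flat)."""
    c = np.asarray(w, dtype=float) - np.mean(w)
    n = np.sqrt(np.sum(c ** 2))
    return c / n if n > 0 else np.zeros_like(c)

def best_disparity(left, right, x, y, half, max_d):
    """Disparity d in [0, max_d] minimizing C_SSD(d) for the window at (x, y)."""
    w_l = normalize(left[y - half:y + half + 1, x - half:x + half + 1])
    costs = []
    for d in range(max_d + 1):
        xr = x - d                        # candidate column in the right image
        if xr - half < 0:                 # window would leave the image
            break
        w_r = normalize(right[y - half:y + half + 1, xr - half:xr + half + 1])
        costs.append(np.sum((w_l - w_r) ** 2))
    return int(np.argmin(costs)), costs

# Synthetic pair: the left image is the right image shifted 3 columns.
rng = np.random.default_rng(2)
right = rng.uniform(0, 255, size=(20, 40))
left = np.roll(right, 3, axis=1)          # left[:, u] == right[:, u - 3]
d_star, costs = best_disparity(left, right, x=20, y=10, half=3, max_d=8)
print(d_star)
```

Because both windows are unit vectors, C_SSD(d) = 2 - 2 C_NC(d), so minimizing SSD and maximizing normalized correlation pick the same disparity.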

Correspondence Using Correlation

[Figure: a left image and the disparity map computed from it.]

Images courtesy of Point Grey Research

Correspondence By Features

[Figure: LEFT IMAGE and RIGHT IMAGE, each annotated with the feature types corner, line, and structure.]

Search in the right image… the disparity (dx, dy) is the displacement when the similarity measure is maximum.

Stereo Correspondences

[Figure: a left scanline and a right scanline. Runs of pixels are labeled Match; unmatched runs are labeled Occlusion and Disocclusion.]

Search Over Correspondences

Three cases:

– Sequential: cost of match
– Occluded: cost of no match
– Disoccluded: cost of no match

[Figure: grid with the left scanline on one axis and the right scanline on the other; occluded pixels and disoccluded pixels are marked.]

Stereo Matching with Dynamic Programming

Scan across the grid, computing the optimal cost for each node given its upper-left neighbors. Backtrack from the terminal to get the optimal path.

Dynamic programming yields the optimal path through the grid. This is the best set of matches that satisfy the ordering constraint.

[Figure: grid of left scanline against right scanline, with occluded and dis-occluded pixels marked and the optimal path running from Start to End (the terminal).]
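The grid scan and backtracking can be sketched for a single pair of scanlines as below. The squared intensity difference as match cost, the constant `occlusion_cost`, and the name `dp_scanline` are assumptions for this example; the slides do not specify the cost terms.

```python
import numpy as np

def dp_scanline(left, right, occlusion_cost=1.0):
    """Match two scanlines by dynamic programming under the ordering constraint.

    Returns (total cost, list of matched index pairs (i_left, j_right)).
    """
    n, m = len(left), len(right)
    cost = np.zeros((n + 1, m + 1))
    move = np.zeros((n + 1, m + 1), dtype=int)   # 0 match, 1 skip left, 2 skip right
    cost[1:, 0] = occlusion_cost * np.arange(1, n + 1)
    move[1:, 0] = 1
    cost[0, 1:] = occlusion_cost * np.arange(1, m + 1)
    move[0, 1:] = 2
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            options = (
                cost[i - 1, j - 1] + (left[i - 1] - right[j - 1]) ** 2,  # match
                cost[i - 1, j] + occlusion_cost,   # left pixel unmatched
                cost[i, j - 1] + occlusion_cost,   # right pixel unmatched
            )
            move[i, j] = int(np.argmin(options))
            cost[i, j] = options[move[i, j]]
    # Backtrack from the terminal node to recover the optimal path.
    matches = []
    i, j = n, m
    while i > 0 or j > 0:
        if move[i, j] == 0:
            matches.append((i - 1, j - 1))
            i, j = i - 1, j - 1
        elif move[i, j] == 1:
            i -= 1
        else:
            j -= 1
    return float(cost[n, m]), matches[::-1]

# The right scanline is the left one shifted by one pixel.
left_line = [10, 20, 30, 40, 50]
right_line = [20, 30, 40, 50, 60]
total, matches = dp_scanline(left_line, right_line, occlusion_cost=1.0)
print(total, matches)
```

Each node looks only at its upper, left, and upper-left neighbors, so the path can never step backwards along either scanline; that is how the ordering constraint is built into the search.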


Correspondence

Correspondence is fundamentally ambiguous, even with stereo constraints.

[Figure from Forsyth & Ponce: the ordering constraint… and its failure.]

A Last Word on Correspondences

Correspondences fail for smooth surfaces.

There is currently no good solution to the correspondence problem.

Summary: Stereo Vision

• Epipolar Geometry: Corresponding points lie on the epipolar line
• Essential/Fundamental matrix: Defines this line
• Eight-Point Algorithm: Recovers the Fundamental matrix
• Rectification: Epipolar lines parallel to scanlines
• Reconstruction: Minimize quadratic distance
• Correspondence:
  – Minimize Sum of Squares over image correlation
  – Minimize Sum of Squares of feature characteristics
• Many correspondences: Dynamic programming along scanlines (but can fail)

How Can We Improve Stereo?

[Figure by James Davis, Honda Research]

Active Stereo (Structured Light)

[Figure: structured-light setup; one panel is labeled "rectified".]

Structured Light: 3-D Result

[Figure: 3D Snapshot and 3D Model. By James Davis, Honda Research]

Time of Flight Sensor: Shutter

[Figures from http://www.3dvsystems.com]

Time of Flight Sensor: Scanning

[Figure: raw data and the cleaned-up result.]