Stereo
• Outline:
• Reminder of 3D geometry
• Introduction
• How humans see depth
• The principle of triangulation
• Epipolar geometry
• Fundamental matrix
• Rectification
• Correspondence
• Correlation-based
• Feature-based
• Other 3D reconstruction methods
Reminder from the last lecture
Single view modeling
Projection in 2D
Vanishing points (2D)
Two point perspective
Vanishing lines
Multiple vanishing points: any set of parallel lines on the plane defines a vanishing point. The union of all these vanishing points is the horizon line,
also called the vanishing line. Note that different planes (can) define different vanishing lines.
Vanishing point
Perspective cues
Comparing heights
Measuring height
Cross ratio
scene cross ratio
Image cross ratio
Cross ratio
stereo
• What is stereo in computer vision?• What is it good for?
Introduction
• There is no sense of depth in an image seen from one camera.• In this chapter we will see how to produce images that convey a sense of depth.
Stereo
Many slides adapted from Steve Seitz
Introduction
• Computer stereo vision is the extraction of 3D information from digital images.• Human beings use stereo to sense distance.
How human see depth
Motivation-Application
• Stereo vision is highly important in fields such as robotics:
• Robot navigation.
• Car navigation.
• Extracting information about the relative position of 3D objects.
• Making 3D movies.
• Video games that use stereo.
Motivation (cont.)
Goal
• Given two or more images of the same object (taken from different viewpoints), we want to recover the object in the real world.
Goal (cont.) – “Demo”
The principle of triangulation (cont.)
• Given projections of a 3D point “x” in two or more images (with known focal length), find the coordinates of the point.
• The 3D point is at the intersection of the rays passing through the matching image points and the associated optical centers.
Depth from disparity
Two cameras with optical centers O and O', separated by baseline B, image a scene point X at x and x'; both cameras have focal length f.
From the similarity between the triangles (X, O, O') and (X, x, x'):

disparity = x − x' = f·B / z

where z is the depth of X.
Stereo Vision
Two cameras: Left and Right. Optical centers: OL and OR.
The virtual image plane is the projection of the actual image plane through the optical center. The baseline, b, is the separation between the optical centers. A scene point, P, is imaged at pL and pR.
pL = 9, pR = 3; disparity d = pL − pR = 6.
Disparity is the amount by which the two images of P are displaced relative to each other.
Depth: z = b·f / (p·d), where p = pixel width.
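The depth formula above can be sketched directly in code. Only pL = 9 and pR = 3 come from the slide; the baseline, focal length and pixel width below are made-up illustrative values.

```python
# Depth from disparity: z = b * f / (p * d)
# b, f, p are assumed illustrative values; pL, pR are from the slide.

def depth_from_disparity(b, f, p, pL, pR):
    """Return depth z for a point imaged at columns pL and pR."""
    d = pL - pR             # disparity in pixels
    return b * f / (p * d)  # z = b*f / (p*d)

z = depth_from_disparity(b=0.1, f=0.05, p=1e-5, pL=9, pR=3)
print(z)  # larger disparity would give smaller depth
```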
Depth from disparity
(Same geometry: optical centers O and O', baseline B, focal length f, scene point X imaged at x and x'; disparity = x − x' = f·B / z.)
Disparity is inversely proportional to depth!
Small disparity ⇒ large depth (Z'); large disparity ⇒ small depth (Z).
Reconstruction
Sometimes the rays never actually intersect because of feature localization error, so we look for the point closest to both rays.
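A minimal sketch of this idea: find the closest points on the two back-projected rays by least squares and return the midpoint of the segment between them. All vectors below are made up for illustration.

```python
import numpy as np

def closest_point_to_rays(p1, d1, p2, d2):
    """Rays: p1 + t1*d1 and p2 + t2*d2. Returns the midpoint of the
    shortest segment connecting them."""
    d1 = d1 / np.linalg.norm(d1)
    d2 = d2 / np.linalg.norm(d2)
    # Solve [d1 -d2] [t1 t2]^T = p2 - p1 in the least-squares sense.
    A = np.stack([d1, -d2], axis=1)              # 3x2 system matrix
    t, *_ = np.linalg.lstsq(A, p2 - p1, rcond=None)
    q1 = p1 + t[0] * d1                          # closest point on ray 1
    q2 = p2 + t[1] * d2                          # closest point on ray 2
    return (q1 + q2) / 2

X = closest_point_to_rays(np.array([0., 0, 0]), np.array([1., 1, 0]),
                          np.array([2., 0, 0]), np.array([-1., 1, 0]))
print(X)  # these two rays happen to intersect exactly at (1, 1, 0)
```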
Stereo with converging cameras
• Short baseline: large common field of view; large depth error.
Stereo with converging cameras
• Large baseline: small depth error; small common field of view.
Verging optical axes
• Two optical axes intersect at a fixation point:
the common field of view is increased;
small depth error.• Correspondence is more difficult.
Problems in stereo vision
• It is hard to match a point between the two images (the correspondence problem).
• So we use epipolar geometry.
Epipolar geometry
For each point p in image plane 1 there is a set of points in image plane 2 that can match (e.g. q', p').
Epipolar geometry
• Baseline – line connecting the two camera centers
• Epipolar plane – plane containing the baseline (1D family)
• Epipoles = intersections of baseline with image planes = projections of the other camera center = vanishing points of the motion direction
Epipolar geometry
(Figure: points p and p', the epipolar line, the epipolar plane, and the epipoles.)
Epipolar constraint
• Potential matches for p have to lie on the corresponding epipolar line l'.
• Potential matches for p' have to lie on the corresponding epipolar line l.
• e, e' are called epipoles.
Vector cross product as matrix–vector multiplication

A × B = (A_y B_z − A_z B_y,  A_z B_x − A_x B_z,  A_x B_y − A_y B_x)

This can be written as A × B = S·B, where S is the skew-symmetric matrix

S = [  0    −A_z    A_y ]
    [  A_z    0    −A_x ]
    [ −A_y   A_x     0  ]
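The identity above can be checked numerically; a small sketch with arbitrary vectors:

```python
import numpy as np

def skew(a):
    """Skew-symmetric matrix [a]_x such that [a]_x @ b == a x b."""
    return np.array([[0, -a[2], a[1]],
                     [a[2], 0, -a[0]],
                     [-a[1], a[0], 0]])

A = np.array([1.0, 2.0, 3.0])
B = np.array([4.0, 5.0, 6.0])
print(skew(A) @ B)  # same result as np.cross(A, B)
```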
Essential matrix
T = translation, R = rotation between the cameras.

P_r = R(P_l − T)  ⇒  R^T P_r = P_l − T

Coplanarity constraint between the vectors T, (P_l − T) and P_l:

(P_l − T)^T (T × P_l) = 0  ⇒  (R^T P_r)^T (T × P_l) = 0
Essential matrix

(R^T P_r)^T (T × P_l) = 0  ⇒  P_r^T R [T]_× P_l = 0  ⇒  P_r^T E P_l = 0

Essential matrix:

E = R [T]_×,   [T]_× = [  0    −T_z    T_y ]
                       [  T_z    0    −T_x ]
                       [ −T_y   T_x     0  ]
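A small numeric check of the constraint P_r^T E P_l = 0 with E = R [T]_×. The rotation, translation and scene point below are made-up illustrative values.

```python
import numpy as np

def skew(t):
    return np.array([[0, -t[2], t[1]],
                     [t[2], 0, -t[0]],
                     [-t[1], t[0], 0]])

# Arbitrary rotation about z and translation (illustrative values).
c, s = np.cos(0.3), np.sin(0.3)
R = np.array([[c, -s, 0], [s, c, 0], [0, 0, 1]])
T = np.array([0.5, 0.1, 0.0])

E = R @ skew(T)                  # essential matrix E = R [T]_x

P_l = np.array([1.0, 2.0, 5.0])  # point in left camera coordinates
P_r = R @ (P_l - T)              # same point in right camera coordinates

print(P_r @ E @ P_l)             # ~0: the coplanarity constraint holds
```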
Homogeneous coordinates

Converting to homogeneous coordinates:
• homogeneous image coordinates: (x, y) → (x, y, 1)
• homogeneous scene coordinates: (x, y, z) → (x, y, z, 1)

Converting from homogeneous coordinates: divide by the last component.
Intrinsic transformation - Principal Point
Camera transformation

2D point (3×1)  =  camera-to-pixel coord. trans. matrix (3×3)  ×  perspective projection matrix (3×4)  ×  world-to-camera coord. trans. matrix (4×4)  ×  3D point (4×1)
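The pipeline above can be sketched with one matrix chain. The intrinsic matrix, extrinsics and scene point below are made-up illustrative values.

```python
import numpy as np

K = np.array([[800.0, 0, 320],   # camera-to-pixel matrix (3x3):
              [0, 800.0, 240],   # focal lengths and principal point
              [0, 0, 1]])
P = np.hstack([np.eye(3), np.zeros((3, 1))])  # perspective projection (3x4)
M_ext = np.eye(4)                # world-to-camera matrix (4x4), identity here

X = np.array([0.1, -0.2, 2.0, 1.0])  # 3D point in homogeneous coords (4x1)

x = K @ P @ M_ext @ X            # 2D point in homogeneous coords (3x1)
u, v = x[:2] / x[2]              # convert from homogeneous coordinates
print(u, v)
```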
Intrinsic transformation
Fundamental matrix
• So far we assumed the camera calibration is known. What if we don't know it?
• We use the fundamental matrix to solve this problem:

x_r^T F x_l = 0
Fundamental matrix
Assume we know the camera calibration. Then

x_l = M_l p_l

where x_l are the un-normalized (pixel) coordinates, p_l the normalized coordinates, and M_l = K is the intrinsic matrix.
Fundamental matrix

x_l = M_l P_l,   x_r = M_r P_r,   P_r^T E P_l = 0

⇒  P_l = M_l^{-1} x_l   and   P_r^T = x_r^T M_r^{-T}

⇒  x_r^T M_r^{-T} E M_l^{-1} x_l = 0

⇒  x_r^T F x_l = 0   with   F = M_r^{-T} E M_l^{-1}
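A sketch of F = M_r^{-T} E M_l^{-1} in code, verifying x_r^T F x_l = 0 for a projected point pair. The rotation, translation and intrinsics are made-up values (and the same intrinsic matrix is assumed for both cameras).

```python
import numpy as np

def skew(t):
    return np.array([[0, -t[2], t[1]],
                     [t[2], 0, -t[0]],
                     [-t[1], t[0], 0]])

c, s = np.cos(0.3), np.sin(0.3)
R = np.array([[c, -s, 0], [s, c, 0], [0, 0, 1]])
T = np.array([0.5, 0.1, 0.0])
E = R @ skew(T)                  # essential matrix

M = np.array([[800.0, 0, 320],   # assumed intrinsics for both cameras
              [0, 800.0, 240],
              [0, 0, 1]])

F = np.linalg.inv(M).T @ E @ np.linalg.inv(M)   # fundamental matrix

P_l = np.array([1.0, 2.0, 5.0])  # point in left camera coordinates
P_r = R @ (P_l - T)              # in right camera coordinates
x_l = M @ P_l                    # homogeneous pixel coordinates
x_r = M @ P_r

print(x_r @ F @ x_l)             # ~0: epipolar constraint in pixel coords
```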
Fundamental matrix
• If M is a 4×3 matrix, the ordinary inverse does not exist, so how do we invert it?
• We use the pseudo-inverse:

pinv(M) = (M^T M)^{-1} M^T
Fundamental matrix
• Given a point on the left camera, what can we learn from the fundamental matrix?
• Every point x' on the corresponding epipolar line satisfies the equation

x'^T F x = [x' y' 1] [ a b c ; d e f ; g h i ] [x y 1]^T = 0

• Assume x = [x y 1]^T and write F x = [f1 f2 f3]^T. Then

x'^T (F x) = f1 x' + f2 y' + f3 = 0

We obtained the equation of the epipolar line.
Fundamental matrix (cont.)

x'^T F x = [x' y' 1] [ a b c ; d e f ; g h i ] [x y 1]^T = 0

Given a point x in the left camera, the epipolar line in the right camera is u_r = F x.
Fundamental matrix (cont.)
• 3×3 matrix with 9 components
• Rank-2 matrix (due to the skew-symmetric S)
• 7 degrees of freedom
• Given a point x in the left camera, the epipolar line in the right camera is u_r = F x
Computing the fundamental matrix
• F (the fundamental matrix) has 9 variables.
• In order to compute F we must use some point correspondences.
• How many points do we need?

x'^T F x = [x' y' 1] [ a b c ; d e f ; g h i ] [x y 1]^T = 0
Computing the fundamental matrix
• Since F is only defined up to scale, we can set i = 1.
• All epipolar lines intersect at the epipole.
• Let x = e, the epipole in the left image. Then no matter what x' is, the equation is always true:

x'^T F e = 0 for every x'  ⇒  F e = 0

The epipole is an eigenvector of F with eigenvalue 0.
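The null vector F e = 0 can be found with the SVD. As a stand-in for F, the sketch below builds a rank-2 essential matrix from made-up camera data and recovers its right null vector (the epipole direction).

```python
import numpy as np

def skew(t):
    return np.array([[0, -t[2], t[1]],
                     [t[2], 0, -t[0]],
                     [-t[1], t[0], 0]])

c, s = np.cos(0.3), np.sin(0.3)
R = np.array([[c, -s, 0], [s, c, 0], [0, 0, 1]])
E = R @ skew(np.array([0.5, 0.1, 0.0]))  # rank-2 matrix, E T = 0

# Null vector = right-singular vector for the smallest singular value.
_, S, Vt = np.linalg.svd(E)
e = Vt[-1]

print(np.linalg.norm(E @ e))  # ~0: e spans the null space (the epipole)
```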
Computing the fundamental matrix
• F has 7 degrees of freedom.
• Each correspondence gives one linear equation; with the last entry fixed to 1, the linear solution needs a minimum of 8 point correspondences, i = {1, …, 8} (7 suffice if the rank-2 constraint is enforced).

[x'_i  y'_i  1] [ f11 f12 f13 ] [ x_i ]
                [ f21 f22 f23 ] [ y_i ] = 0
                [ f31 f32  1  ] [  1  ]
Computing the fundamental matrix

x'_i (x_i f11 + y_i f12 + f13) + y'_i (x_i f21 + y_i f22 + f23) + (x_i f31 + y_i f32 + f33) = 0

Fundamental matrix (cont.)

Expanding:

x'_i x_i f11 + x'_i y_i f12 + x'_i f13 + y'_i x_i f21 + y'_i y_i f22 + y'_i f23 + x_i f31 + y_i f32 + f33 = 0

One equation for one point correspondence.
Computing the fundamental matrix

Stacking one such equation per correspondence i = 1, …, 8 gives a homogeneous linear system A f = 0:

[ x'_1 x_1  x'_1 y_1  x'_1  y'_1 x_1  y'_1 y_1  y'_1  x_1  y_1  1 ]   [ f11 ]
[    ⋮         ⋮        ⋮       ⋮         ⋮       ⋮     ⋮    ⋮   ⋮ ] · [  ⋮  ] = 0
[ x'_8 x_8  x'_8 y_8  x'_8  y'_8 x_8  y'_8 y_8  y'_8  x_8  y_8  1 ]   [ f33 ]
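A minimal sketch of this linear (8-point) step: build the 8×9 matrix A from correspondences and take f as the null vector of A via SVD. The correspondences are synthesized from made-up cameras so the result can be checked; normalization and rank-2 enforcement, which practical implementations add, are omitted.

```python
import numpy as np

def skew(t):
    return np.array([[0, -t[2], t[1]],
                     [t[2], 0, -t[0]],
                     [-t[1], t[0], 0]])

def eight_point(xl, xr):
    """xl, xr: (N, 2) matching points, N >= 8. Returns F up to scale."""
    x, y = xl[:, 0], xl[:, 1]
    xp, yp = xr[:, 0], xr[:, 1]
    A = np.stack([xp * x, xp * y, xp,          # one row per correspondence
                  yp * x, yp * y, yp,
                  x, y, np.ones_like(x)], axis=1)
    _, _, Vt = np.linalg.svd(A)                # f = null vector of A
    return Vt[-1].reshape(3, 3)

# Synthesize correspondences from made-up cameras (normalized image
# coordinates, so F here coincides with the essential matrix).
rng = np.random.default_rng(0)
c, s = np.cos(0.3), np.sin(0.3)
R = np.array([[c, -s, 0], [s, c, 0], [0, 0, 1]])
T = np.array([0.5, 0.1, 0.2])
P = rng.uniform([-1, -1, 4], [1, 1, 8], size=(8, 3))  # 8 scene points
xl = P[:, :2] / P[:, 2:]            # left image points
Pr = (P - T) @ R.T                  # points in the right camera frame
xr = Pr[:, :2] / Pr[:, 2:]          # right image points

F = eight_point(xl, xr)
h = lambda p: np.hstack([p, np.ones((len(p), 1))])
# Residuals x_r^T F x_l should be ~0 for all correspondences.
print(np.abs(np.einsum('ij,jk,ik->i', h(xr), F, h(xl))).max())
```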
Epipolar lines
Rectification
• Why do we need rectification?
Because after rectification matching points lie in the same row, so the correspondence search becomes one-dimensional.
Rectification
• Rectification: warping the input images (perspective transformation) so that epipolar lines are horizontal
Rectification
Image reprojection: reproject the image planes onto a common plane parallel to the baseline. Notice that only the focal point of each camera really matters. (Seitz)
Rectification
slide: R. Szeliski
Rectification
• Any stereo pair can be rectified by rotating and scaling the two image planes (= homography).
• We will assume images have been rectified, so:
• The image planes of the cameras are parallel.
• The focal points are at the same height.
• The focal lengths are the same.
• Then the epipolar lines fall along the horizontal scan lines of the images.
Correspondence
• For every point in image plane 1 we have a set of points that may match in image plane 2.
• How do we find the best matching point?
• Correlation-based:
attempt to establish correspondence by matching image intensities, usually over a window of pixels in each image.
• Feature-based:
attempt to establish correspondence by matching sparse sets of image features (edges, …).
Matching cost
(Figure: left and right scanlines; the matching cost plotted as a function of disparity.)
Correspondence search via correlation
• Slide a window along the right scanline and compare contents of that window with the reference window in the left image
• Matching cost: SSD or normalized correlation
Correlation methods
• Sum of squared differences (SSD) = Σ (I_left − I_right)²
• Absolute difference (AD) = Σ |I_left − I_right|
• CC = Σ I_left · I_right
• Normalized correlation (NC) = Σ I_left I_right / (‖I_left‖ · ‖I_right‖)
• MC = (1/64) Σ (I_left − Ī_left)(I_right − Ī_right), where Ī denotes the window mean
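The sliding-window search described above can be sketched with the SSD cost. The tiny synthetic scanlines below are made up; the right one is the left one shifted by 3 pixels, so the best disparity should be 3.

```python
import numpy as np

def best_disparity(left, right, x, half, max_d):
    """SSD match for the window centred at column x of `left`."""
    ref = left[x - half:x + half + 1]
    costs = []
    for d in range(max_d + 1):                   # candidate disparities
        cand = right[x - d - half:x - d + half + 1]
        costs.append(np.sum((ref - cand) ** 2))  # SSD matching cost
    return int(np.argmin(costs))

left = np.array([0, 0, 0, 0, 0, 1, 5, 9, 5, 1, 0, 0], dtype=float)
right = np.roll(left, -3)                        # shifted left by 3 px
print(best_disparity(left, right, x=7, half=1, max_d=5))  # -> 3
```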
Window size
• If we take too small a window we get more detail but also more noise!• If we take too large a window the matching is less sensitive to noise, but fine detail is smoothed out.
W=3 W=20
Disparity map
• A disparity map expresses the matches after we have done the correspondence (correlation) between the left and right images. It uses gray-level intensity to encode the disparity between the two matched windows: bright pixels indicate large disparity, dark pixels small disparity.
Disparity map (cont.)
• If we perform this matching process for every pixel in the left-hand image, finding its match in the right-hand frame and computing the distance between them, we end up with an image where every pixel contains the disparity value for that pixel in the left image.
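The per-pixel process just described can be sketched for a single scanline; real code would loop over all rows of the rectified images. The synthetic scanline below (a shifted copy with known disparity 2) is made up for illustration.

```python
import numpy as np

def disparity_scanline(left, right, half=1, max_d=4):
    """SSD-best disparity for every (valid) pixel of one scanline."""
    disp = np.zeros(len(left), dtype=int)
    for x in range(half + max_d, len(left) - half):
        ref = left[x - half:x + half + 1]
        costs = [np.sum((ref - right[x - d - half:x - d + half + 1]) ** 2)
                 for d in range(max_d + 1)]
        disp[x] = int(np.argmin(costs))          # best disparity per pixel
    return disp

left = np.arange(16, dtype=float) ** 2           # textured synthetic row
right = np.empty_like(left)
right[:-2], right[-2:] = left[2:], left[-1]      # right = left shifted by 2
print(disparity_scanline(left, right))
```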
Right imageLeft image
Disparity map
Correlation method
• Does not work well enough in some cases (images with few details).• Easy to implement.• Produces a dense disparity map.
Failures of correspondence search
Textureless surfaces; occlusions, repetition
Feature based approach
Feature based
• Features:
• Edge points
• Lines
• Corners
• Matching algorithm:
• Extract features in the stereo pair
• Define a similarity measure
• Search correspondence using the similarity measure and the epipolar geometry
Feature-based methods

S = 1 / [ w0 (l_l − l_r)² + w1 (θ_l − θ_r)² + w2 (m_l − m_r)² + w3 (i_l − i_r)² ]

l – length; θ – orientation; m – coordinates of the midpoint; i – average intensity along the line; w – the weights.
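A minimal sketch of this line-feature similarity score. The two "lines" and the weights below are made-up illustrative values; higher S means a better match.

```python
import numpy as np

def line_similarity(a, b, w):
    """a, b: dicts with length 'l', orientation 'theta', midpoint 'm',
    average intensity 'i'. w: four weights."""
    d = (w[0] * (a["l"] - b["l"]) ** 2
         + w[1] * (a["theta"] - b["theta"]) ** 2
         + w[2] * np.sum((a["m"] - b["m"]) ** 2)   # midpoint distance term
         + w[3] * (a["i"] - b["i"]) ** 2)
    return 1.0 / d if d > 0 else np.inf            # identical lines -> inf

left_line = {"l": 10.0, "theta": 0.5, "m": np.array([4.0, 2.0]), "i": 120.0}
right_line = {"l": 9.5, "theta": 0.55, "m": np.array([1.0, 2.0]), "i": 118.0}
print(line_similarity(left_line, right_line, w=[1.0, 1.0, 0.1, 0.01]))
```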
Feature based approach
• Pros:
• Relatively insensitive to illumination changes
• Good for man-made scenes with strong lines but weak texture or textureless surfaces
• Works well on edges
• Faster than the correlation approach
• Cons:
• Only a sparse depth map
• May be tricky
Winner take all
• Two pixels (in image plane 1) may correspond to the same pixel in image plane 2.
Ordering
Global approach
• Use dynamic programming to enforce ordering of pixels along each scan line.• If a is to the left of b, then a's match will be to the left of b's match.• We will match every pixel.
Global approach
Global approach
• We want to minimize the cost of the correspondence (going from the left to the right).
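A compact sketch of the dynamic-programming idea: choose a disparity for every pixel of a scanline so that the total matching cost plus a smoothness penalty between neighbours is minimal. The cost table and penalty are made-up; real systems also model occlusions.

```python
import numpy as np

def dp_scanline(cost, penalty=1.0):
    """cost: (N, D) matching cost per pixel and disparity.
    Returns the disparity per pixel minimizing total cost + penalty."""
    n, ndisp = cost.shape
    total = cost.copy()
    back = np.zeros((n, ndisp), dtype=int)
    for x in range(1, n):
        for d in range(ndisp):
            # cost of arriving at disparity d from each previous disparity
            trans = total[x - 1] + penalty * np.abs(np.arange(ndisp) - d)
            back[x, d] = int(np.argmin(trans))
            total[x, d] += trans[back[x, d]]
    # Backtrack the cheapest path through the disparity table.
    disp = np.zeros(n, dtype=int)
    disp[-1] = int(np.argmin(total[-1]))
    for x in range(n - 1, 0, -1):
        disp[x - 1] = back[x, disp[x]]
    return disp

cost = np.array([[0, 5, 5],   # pixel 0 clearly prefers disparity 0
                 [5, 0, 5],   # pixels 1 and 2 prefer disparity 1
                 [5, 0, 5],
                 [5, 5, 0]])  # pixel 3 prefers disparity 2
print(dp_scanline(cost.astype(float)))
```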
Three Views
Matches between points in the first two images can be checked by re-projecting the corresponding three-dimensional point into the third image.