C18 Computer Vision - University of Oxford · dwm/Courses/4CV_2015/4CV-N3.pdf · 2015-10-21

C18 2015 1 / 35

C18 Computer Vision

David Murray

[email protected]/∼dwm/Courses/4CV

Michaelmas 2015


Computer Vision: This time ...

1. Introduction; imaging geometry; camera calibration
2. Salient feature detection – edges, lines and corners
3. Recovering 3D from two images I: epipolar geometry.
4. Recovering 3D from two images II: stereo correspondence algorithms; triangulation.


Recovering 3D from two images I: epipolar geometry

3.1 Introduction

3.2 Epipolar Geometry

3.3 Algebraic Representation and the Fundamental Matrix

3.4 Computing the Fundamental Matrix


3.1 Introduction: Forward and inverse mappings

Geometrical image entities back-project to entities of higher dimensionality in the scene.

So, points in the image back-project to lines in the scene, lines to planes...

$$ \mathbf{X} = \alpha \mathbf{x}, \qquad 0 < \alpha < \infty $$


Introduction: What do single-view ambiguities tell us?

Single views are NOT sufficient to solve geometric problems in data-driven vision.

We need methods of understanding multiple views.

Shape-from-stereo: different cameras, different viewpoints, same time.

Structure-from-motion: same camera, different viewpoints, different times.

Note that single views can be sufficient to solve geometric problems in top-down model-driven vision (but not always).


Introduction: Reconstruction from two views

In principle, recovering 3D structure is straightforward. Find a bit of the scene that is observable in two or more cameras, and backproject the two rays to find their intersection in the world.

[Figure: right (camera C′) view, left (camera C) view, and backprojection. The image pair is arranged for cross-eyed fusion.]

There are three things to cover:
1. Understanding the geometry: epipolar geometry
2. Determining which points in the images are from the same scene location: the correspondence problem
3. Determining the 3D structure by back-projecting rays: triangulation


What clues from identical parallel cameras?

Assume two cameras with the same f, separated by t_x in the x-direction.

Inhomogeneous coordinates: same scene Y, Z but different X ...

$$ Y' = Y, \qquad Z' = Z, \qquad \text{but} \qquad X' = X + t_x . $$

As

$$ x' = fX'/Z, \qquad x = fX/Z $$

$$ \Rightarrow \quad \frac{1}{Z} = \frac{1}{f t_x}\,(x' - x) $$

In this case the reciprocal depth 1/Z is proportional to the horizontal disparity (x′ − x).
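A minimal numerical sketch of the disparity relation above. The focal length, offset and scene point are made-up values chosen only for illustration; they do not come from the lecture.

```matlab
% Hypothetical values, chosen only for illustration
f  = 500;            % focal length (pixels)
tx = 0.1;            % offset between the camera frames, X' = X + tx
X  = 0.3;  Z = 2.0;  % scene point in the first camera frame

x  = f * X / Z;          % image x-coordinate in the first camera
xp = f * (X + tx) / Z;   % image x-coordinate in the second camera

disparity = xp - x;            % horizontal disparity x' - x
Z_rec = f * tx / disparity;    % invert 1/Z = (x' - x) / (f*tx)

fprintf('disparity = %g, recovered Z = %g\n', disparity, Z_rec);
```

Halving Z doubles the disparity, which is why nearby points shift more between the two views than distant ones.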


What clues from identical parallel cameras? /ctd

The point x in the left image can come from any point X on the backprojected ray from the left camera C.

Could the corresponding x′ be anywhere in C′?

$$ x' = (f/Z)(X + t_x) \;\Rightarrow\; x' = x + f t_x / Z $$

$$ y' = (f/Z)\,Y \;\Rightarrow\; y' = y $$

As Z ranges from 0 to ∞, $0 < f t_x / Z < \infty$.

So the locus of possible matches appears to be on a straight line.

Does this generalize to arbitrary camera geometry? And how?


3.2 Epipolar geometry in arbitrary cameras

The locus of matches is the projection into C′ of the backprojected ray in C.

This is always a straight line, and is called the epipolar line, labelled l′.

As X moves along the ray, the other ray sweeps out the epipolar plane. The intersection of the epipolar plane with the image plane is the epipolar line.


Epipolar geometry in arbitrary cameras

If x is moved then the entire plane moves too, generating a new epipolar line.

Epipolar planes hinge about the camera baseline, forming a pencil of planes.

This means that all the epipolar lines in camera C′ meet at its epipole e′, where the baseline pierces the image plane.

This point is the projection of the optical centre of camera C into C′.

There are equivalent constructs in camera C. All points on the same epipolar line in C share the same epipolar line in C′.


Epipolar geometry examples

1. Converging cameras

Notice that the epipoles most often lie off the physical image planes

What would be a quick test you could carry out to see whether the epipole was on the image plane?


Epipolar geometry examples

2. (Close to) parallel cameras

Epipolar geometry depends only on the relative pose of the cameras (i.e. the rotation and translation between them) and on the cameras’ intrinsic parameters.

It does not depend on the scene structure.
Can you reason qualitatively why not?


3.3 Algebraic representation & the F matrix

There are three bits of preamble to get across:

1. The first explains how homogeneous notation handles points at infinity.

2. The second introduces the homogeneous notation for lines. You may notice a duality between lines and points ...

3. The last is the matrix representation of the vector product.


Preamble I: Points at ∞ in homogeneous coordinates

A line of points in 3D through the point A with direction D is

X(µ) = A+ µD

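Writing this in homogeneous notation

$$
\begin{bmatrix} X_1(\mu) \\ X_2(\mu) \\ X_3(\mu) \\ X_4(\mu) \end{bmatrix}
\stackrel{P}{=}
\begin{bmatrix} \mathbf{A} + \mu\mathbf{D} \\ 1 \end{bmatrix}
\stackrel{P}{=}
\begin{bmatrix} \mathbf{A} \\ 1 \end{bmatrix} + \mu \begin{bmatrix} \mathbf{D} \\ 0 \end{bmatrix}
\stackrel{P}{=}
\frac{1}{\mu} \begin{bmatrix} \mathbf{A} \\ 1 \end{bmatrix} + \begin{bmatrix} \mathbf{D} \\ 0 \end{bmatrix}
$$

In the limit µ → ∞ the point on the line is $\begin{bmatrix} \mathbf{D} \\ 0 \end{bmatrix}$.

So, homogeneous vectors with X4 = 0 represent points “at infinity”.
Points at infinity are equivalent to directions.
Parallel lines in the scene meet at the same point at ∞.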


Points at ∞ and vanishing points

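The projection of a point at ∞ into the image is the vanishing point.

To find it, simply project the point-at-∞ into the image ...

$$
\mathbf{v} = K \,[\, R \mid \mathbf{t} \,] \begin{bmatrix} \mathbf{D} \\ 0 \end{bmatrix}
= K R \begin{bmatrix} D_x \\ D_y \\ D_z \end{bmatrix}
$$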
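As a hedged illustration of the vanishing point, the sketch below projects two parallel 3D lines and checks that their images meet at the projection of the shared point at infinity. The camera matrix, the direction D and the points A1, A2 are all made-up values, and `proj` is a helper name introduced here.

```matlab
% Hypothetical camera and scene, for illustration only
K = [500 0 320; 0 500 240; 0 0 1];
R = eye(3);  t = [0; 0; 0];
P = K * [R t];

D  = [1; 0; 2];                       % common direction of two parallel lines
A1 = [0; 1; 4];  A2 = [1; -1; 5];     % one point on each line

proj = @(X) P * [X; 1];               % project an inhomogeneous 3D point

% Image line through the projections of two points on each 3D line
l1 = cross(proj(A1), proj(A1 + D));
l2 = cross(proj(A2), proj(A2 + 3*D));

v_meet = cross(l1, l2);               % where the two image lines meet
v      = P * [D; 0];                  % projection of the point at infinity

v_meet = v_meet / v_meet(3);          % normalise third component to 1
v      = v / v(3);
disp([v_meet v]);                     % the two columns agree
```

Because the fourth homogeneous component of the point at infinity is zero, the translation t has no influence on where the vanishing point lands.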


Preamble II: Homogeneous notation for lines

Recall that a point $(x, y)^\top$ in 2D is represented by the homogeneous 3-vector

$$ \mathbf{x} \stackrel{P}{=} (x_1, x_2, x_3)^\top $$

where x = x1/x3, y = x2/x3.

Equivalently $\mathbf{x} = \lambda (x, y, 1)^\top$.

The line l1 x + l2 y + l3 = 0 in 2D is represented by the homogeneous 3-vector

$$ \mathbf{l} \stackrel{P}{=} (l_1, l_2, l_3)^\top $$

For example, the line y = 1 is written −y + 1 = 0, and so $\mathbf{l} = (0, -1, 1)^\top$ is a homogeneous representation, as would be $(0, 47, -47)^\top$, etc.

A point x on the line l has

$$ \mathbf{l}^\top \mathbf{x} = \mathbf{x}^\top \mathbf{l} = 0 $$

or equivalently, thinking about scalar products, l · x = 0.


Preamble II: Homogeneous notation for lines

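Reminder: A point x on the line l has $\mathbf{l}^\top\mathbf{x} = 0$, $\mathbf{x}^\top\mathbf{l} = 0$, or $\mathbf{l}\cdot\mathbf{x} = 0$.

The line through two points p and q is given by

$$ \mathbf{l} \stackrel{P}{=} \mathbf{p} \times \mathbf{q} . $$

Proof: Use the properties of the scalar triple product,

$$ \mathbf{p}\cdot\mathbf{l} = \mathbf{p}\cdot(\mathbf{p}\times\mathbf{q}) \equiv 0, \qquad \mathbf{q}\cdot\mathbf{l} = \mathbf{q}\cdot(\mathbf{p}\times\mathbf{q}) \equiv 0 . $$

The intersection of two lines is the point

$$ \mathbf{x} \stackrel{P}{=} \mathbf{l}_1 \times \mathbf{l}_2 $$

Proof: For you to fill in ...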


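[**] Example ... DIY ♣

Find (i) the point of intersection of the lines l and m; (ii) the point of intersection of the line l and the x-axis; and (iii) the equation of the line n joining (0, 1) and (2, 0).

l = (0, −1, 1) and m = (−1, 0, 2).

(i)

$$
\mathbf{x} = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ 0 & -1 & 1 \\ -1 & 0 & 2 \end{vmatrix}
= \begin{pmatrix} -2 \\ -1 \\ -1 \end{pmatrix}
$$

which is the point (2, 1).

(ii) The x-axis is (0, 1, 0) and so it and l meet at

$$
\mathbf{x} = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ 0 & -1 & 1 \\ 0 & 1 & 0 \end{vmatrix}
= \begin{pmatrix} -1 \\ 0 \\ 0 \end{pmatrix}
$$

which is a 2D point at infinity at x = ±∞.

(iii)

$$
\mathbf{n} = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ 0 & 1 & 1 \\ 2 & 0 & 1 \end{vmatrix}
= \begin{pmatrix} 1 \\ 2 \\ -2 \end{pmatrix}
$$

which is the line x + 2y − 2 = 0 or

y = −(1/2)x + 1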
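The three parts of this example can be checked numerically with Matlab's built-in `cross`. This is only a sketch; the variable names are ours.

```matlab
l = [0; -1; 1];                  % the line y = 1
m = [-1; 0; 2];                  % the line x = 2

x_lm = cross(l, m);              % (i) intersection of l and m
x_lm = x_lm / x_lm(3);           % normalise so the third component is 1

x_axis = [0; 1; 0];              % the x-axis, y = 0
x_inf  = cross(l, x_axis);       % (ii) third component is 0: point at infinity

p = [0; 1; 1];  q = [2; 0; 1];   % the points (0,1) and (2,0)
n = cross(p, q);                 % (iii) line joining p and q

disp(x_lm');  disp(x_inf');  disp(n');
```

The same operator serves both purposes (joining points, intersecting lines), which is the point-line duality mentioned in the preamble.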


Preamble III: Matrix representation of vector products

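The vector product

$$
\mathbf{a}\times\mathbf{b}
= \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \end{vmatrix}
= \begin{pmatrix} a_2 b_3 - a_3 b_2 \\ a_3 b_1 - a_1 b_3 \\ a_1 b_2 - a_2 b_1 \end{pmatrix}
= \begin{bmatrix} 0 & -a_3 & a_2 \\ a_3 & 0 & -a_1 \\ -a_2 & a_1 & 0 \end{bmatrix}
\begin{pmatrix} b_1 \\ b_2 \\ b_3 \end{pmatrix}
= [\mathbf{a}]_\times \mathbf{b}
$$

[a]× is a 3 × 3 skew-symmetric matrix and has rank 2.

a is the kernel of [a]×. (Why?)

Example: compute the vector product of l = (1, 2, 3) and m = (2, 3, 4).
Pseudo-determinant method gives $(-1, 2, -1)^\top$.
Skew-sym method gives

$$
\begin{bmatrix} 0 & -3 & 2 \\ 3 & 0 & -1 \\ -2 & 1 & 0 \end{bmatrix}
\begin{pmatrix} 2 \\ 3 \\ 4 \end{pmatrix} = \;\ldots
$$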
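A short sketch for checking the skew-symmetric form against the built-in vector product. The helper name `skew` is ours, not part of the course code.

```matlab
% Skew-symmetric matrix [a]_x as an anonymous function (hypothetical helper)
skew = @(a) [ 0    -a(3)  a(2);
              a(3)  0    -a(1);
             -a(2)  a(1)  0   ];

l = [1; 2; 3];
m = [2; 3; 4];

c1 = cross(l, m);          % built-in vector product
c2 = skew(l) * m;          % matrix form [l]_x m

disp([c1 c2]);             % the two columns agree
disp(rank(skew(l)));       % the matrix has rank 2
disp(norm(skew(l) * l));   % l is in the kernel, since l x l = 0
```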


Algebraic representation of Epipolar Geometry

We now know that the epipolar geometry defines the mapping from point x to line l′.

The mapping depends only on the cameras, not on the structure. This means that the mapping depends on the overall projection matrices P and P′.

We will show that the mapping is linear, and can be written as

$$ \mathbf{l}' = F\mathbf{x}, \quad \text{where } F \text{ is the fundamental matrix.} $$


Algebraic representation of Epipolar Geometry /ctd

With no loss of generality we can use the first camera C to define the world coordinate frame, so that its overall 3 × 4 projection matrix is

$$ P = K \,[\, I \mid \mathbf{0} \,] $$

We will define the rotation and translation between camera frames as

$$ \mathbf{X}' = \begin{bmatrix} R & \mathbf{t} \\ \mathbf{0}^\top & 1 \end{bmatrix} \mathbf{X} $$

so that

$$ P' = K' \,[\, R \mid \mathbf{t} \,] $$

NB, the camera intrinsics can be different.



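Step 1: back project a ray from C

Point x back-projects to ray X(ζ) that satisfies

$$ P\,\mathbf{X}(\zeta) = K \,[\, I \mid \mathbf{0} \,]\, \mathbf{X}(\zeta) = \mathbf{x} $$

where we use ζ as a parameter.

Now

$$
\mathbf{x} \stackrel{P}{=} \begin{pmatrix} x \\ y \\ 1 \end{pmatrix}
= K \,[\, I \mid \mathbf{0} \,] \begin{pmatrix} X \\ Y \\ Z \\ 1 \end{pmatrix}
= K \begin{pmatrix} X \\ Y \\ Z \end{pmatrix}
$$

$$
\Rightarrow \; \mathbf{X}(\zeta) = \begin{bmatrix} \zeta K^{-1}\mathbf{x} \\ 1 \end{bmatrix}
\quad \text{and} \quad
\Rightarrow \; \mathbf{X}(\infty) = \begin{bmatrix} K^{-1}\mathbf{x} \\ 0 \end{bmatrix}
$$

In effect, $K^{-1}$ corrects the direction of the ray. Direction $\begin{bmatrix} \mathbf{x} \\ 0 \end{bmatrix}$ would be incorrect, because x was measured in a non-ideal camera.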



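Step 2: Choose two points on the ray and project them into the second camera C′

Choose Point (1) with ζ = 0 ⇒ It is the optical centre $\begin{bmatrix} \mathbf{0} \\ 1 \end{bmatrix}$.

Choose Point (2) with ζ = ∞ ⇒ A point at infinity $\begin{bmatrix} K^{-1}\mathbf{x} \\ 0 \end{bmatrix}$.

Project these two points into C′:

$$ (1) \quad \mathbf{e}' = K' \,[\, R \mid \mathbf{t} \,] \begin{bmatrix} \mathbf{0} \\ 1 \end{bmatrix} = K'\mathbf{t} $$

$$ (2) \quad \mathbf{q} = K' \,[\, R \mid \mathbf{t} \,] \begin{bmatrix} K^{-1}\mathbf{x} \\ 0 \end{bmatrix} = K' R K^{-1}\mathbf{x} $$

Note that the first point is the epipole.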


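Algebraic representation of Epipolar Geometry /ctd

Step 3: Use vector product to find epipolar line.

$$ \mathbf{l}' = (K'\mathbf{t}) \times (K' R K^{-1}\mathbf{x}) $$

Now tidy up ... (1) using the general identity $(M\mathbf{a}) \times (M\mathbf{b}) = M^{-\top}(\mathbf{a}\times\mathbf{b})$, where $M^{-\top} = [M^{-1}]^\top$ (the identity holds up to the scale factor det M, which does not matter for homogeneous quantities), and (2) using the skew-symmetric matrix representation of the vector product.

Hence

$$ \mathbf{l}' = K'^{-\top}(\mathbf{t} \times R K^{-1}\mathbf{x}) = K'^{-\top} [\mathbf{t}]_\times R K^{-1}\mathbf{x} $$

So
Epipolar line is: $\mathbf{l}' = F\mathbf{x}$
Fundamental matrix is: $F = K'^{-\top} [\mathbf{t}]_\times R K^{-1}$

As x′ lies on l′, $\mathbf{x}'^\top \mathbf{l}' = 0$, and hence $\mathbf{x}'^\top F \mathbf{x} = 0$.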
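The construction of F can be sanity-checked numerically. The sketch below builds F from made-up intrinsics, rotation and translation, projects one scene point into both cameras, and confirms that the epipolar constraint holds. All numbers and the helper `skew` are assumptions introduced for illustration.

```matlab
skew = @(a) [0 -a(3) a(2); a(3) 0 -a(1); -a(2) a(1) 0];

% Hypothetical cameras, for illustration only
K  = [500 0 320; 0 500 240; 0 0 1];      % intrinsics of C
Kp = [450 0 300; 0 450 250; 0 0 1];      % intrinsics of C'
th = 0.1;                                 % rotation about the y-axis (radians)
R  = [cos(th) 0 sin(th); 0 1 0; -sin(th) 0 cos(th)];
t  = [0.2; 0.05; 0.01];

F = inv(Kp)' * skew(t) * R * inv(K);     % F = K'^{-T} [t]_x R K^{-1}

% Project one scene point into both cameras
X  = [0.3; -0.2; 4; 1];
x  = K  * [eye(3) zeros(3,1)] * X;       % P  = K [I | 0]
xp = Kp * [R t] * X;                     % P' = K'[R | t]

lp = F * x;                              % epipolar line of x in C'
ep = Kp * t;                             % epipole e' = K' t

% Scale-free residuals: both x' and e' should lie on l'
res_x = xp' * lp / (norm(xp) * norm(lp));
res_e = ep' * lp / (norm(ep) * norm(lp));
fprintf('x'' on l'': %g,  e'' on l'': %g\n', res_x, res_e);
```

Changing the depth of X moves x′ along l′ but leaves the residuals at rounding level, which is the structure-independence noted earlier.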


Example: identical parallel cameras

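$$
K = K' = \begin{bmatrix} f & 0 & 0 \\ 0 & f & 0 \\ 0 & 0 & 1 \end{bmatrix}, \qquad R = I, \qquad \mathbf{t} = (t_x, 0, 0)^\top
$$

$$
F = K'^{-\top} [\mathbf{t}]_\times R K^{-1}
= \begin{bmatrix} 1/f & 0 & 0 \\ 0 & 1/f & 0 \\ 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & -t_x \\ 0 & t_x & 0 \end{bmatrix}
[\, I \,]
\begin{bmatrix} 1/f & 0 & 0 \\ 0 & 1/f & 0 \\ 0 & 0 & 1 \end{bmatrix}
= \frac{t_x}{f} \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & -1 \\ 0 & 1 & 0 \end{bmatrix}
\stackrel{P}{=} \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & -1 \\ 0 & 1 & 0 \end{bmatrix}
$$

$$
\Rightarrow \; \mathbf{x}'^\top F \mathbf{x} = [\, x' \;\; y' \;\; 1 \,]
\begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & -1 \\ 0 & 1 & 0 \end{bmatrix}
\begin{pmatrix} x \\ y \\ 1 \end{pmatrix} = 0
$$

which reduces to .................................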


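Example /ctd

In the exercises you are asked to show that the epipole is in the right nullspace of the fundamental matrix, that is,

$$ F\mathbf{e} = \mathbf{0} $$

By inspection, for the parallel identical cameras example:

$$
\begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & -1 \\ 0 & 1 & 0 \end{bmatrix}
\begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix} = \mathbf{0}
\quad \Rightarrow \quad \text{Evidently } \mathbf{e} \stackrel{P}{=} \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix}
$$

What is the geometric interpretation of this result?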


Summary of properties

F is a rank 2 matrix with 7 d.o.f.

If x ↔ x′ then $\mathbf{x}'^\top F \mathbf{x} = 0$

The epipolar lines in C and C′ are

$$ \mathbf{l} = F^\top \mathbf{x}', \qquad \mathbf{l}' = F\mathbf{x} $$

The epipoles in C and C′ are obtained from

$$ F\mathbf{e} = \mathbf{0}, \qquad F^\top \mathbf{e}' = \mathbf{0} $$

(The last is $\mathbf{e}'^\top F = \mathbf{0}^\top$. That is, e′ is in the left nullspace of F.)

For P = K[I|0] and P′ = K′[R|t] the fundamental matrix is derived as

$$ F = K'^{-\top} [\mathbf{t}]_\times R K^{-1} $$

where $-\top$ denotes transpose of the inverse.
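These properties can be probed numerically with an SVD. In the sketch below the camera values are made up, `skew` is our helper, and reading the epipoles off the last columns of V and U is one common way to obtain null vectors rather than a method stated on the slides.

```matlab
skew = @(a) [0 -a(3) a(2); a(3) 0 -a(1); -a(2) a(1) 0];

% Hypothetical camera pair, for illustration only
K  = [500 0 320; 0 500 240; 0 0 1];
Kp = [450 0 300; 0 450 250; 0 0 1];
th = 0.1;
R  = [cos(th) 0 sin(th); 0 1 0; -sin(th) 0 cos(th)];
t  = [0.2; 0.05; 0.01];
F  = inv(Kp)' * skew(t) * R * inv(K);

[U, S, V] = svd(F);
s  = diag(S);
e  = V(:, 3);      % right null vector, F e = 0: epipole in C
ep = U(:, 3);      % left null vector, F' e' = 0: epipole in C'

fprintf('s3/s1 = %g (rank 2 means this is ~0)\n', s(3) / s(1));
fprintf('|F e|/s1 = %g, |F'' e''|/s1 = %g\n', ...
        norm(F*e)/s(1), norm(F'*ep)/s(1));

% e' should agree with K' t up to scale, so their cross product vanishes
par = norm(cross(ep, Kp*t)) / norm(Kp*t);
fprintf('|e'' x K''t| / |K''t| = %g\n', par);
```

The ratio s3/s1 is used instead of `rank(F)` because the entries of F are small in absolute terms, so a default rank tolerance can be unreliable here.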


Computing F : Algebraic minimizations

The basis for several methods of computing F lies in re-writing the constraint $\mathbf{x}'^\top F \mathbf{x} = 0$ for each match x ↔ x′ as

$$
[\, x'x \;\; x'y \;\; x' \;\; y'x \;\; y'y \;\; y' \;\; x \;\; y \;\; 1 \,]
\begin{pmatrix} F_{11} \\ \vdots \\ F_{33} \end{pmatrix} = 0 .
$$

Inserting a row for each of M matches builds the system

$$ A_{M\times 9}\, \mathbf{f}_{9\times 1} = \mathbf{0}_{M\times 1} $$

The properties of A and f are then exploited in various ways.

In Lecture 5 you’ll see an 8-point method and a more expensive least squares method using a proper image cost, but here let’s look at a 7-point algorithm which minimizes an algebraic cost.


Minimal linear 7-point solution for F

With 7 matches, matrix $A_{7\times 9}$ has rank 7 and nullity 2.

Let v and w be two vectors spanning the nullspace of A. Every f = (αv + βw) satisfies Af = 0, but we need to find an f that gives det F = 0.

Surprisingly, we don’t need to consider every α and β ...

For any f which is a solution, any scaling (+ve or -ve) of f is also a solution. So to explore the entire plane containing v and w we need only map out a path that traverses a half plane.
One such path is f = αv + (1 − α)w, where α runs from −∞ to +∞.
E.g., α = −2 deals with all solutions on the green line.

[Figure: the plane containing v and w, showing the path f = αv + (1 − α)w and the green line.]

Notice that α → ±∞ generates points → ±α(v − w) which define the half plane division.


7-point solution for F

We want to find an α along that path that makes det F = 0.

However, we have to take care! det F = 0 is a cubic in α

$$ \det F = a_0 + a_1\alpha + a_2\alpha^2 + a_3\alpha^3 = 0 . $$

So there are 3 solutions (one or three of them real).

The algorithm goes like
1. Generate the matches.
2. Statistically centre all sets of x and y values.
3. Build A from seven of the matches.
4. Use SVD to find the two vectors v, w spanning the nullspace.
5. Use v, w to find the coeffs of the cubic.
6. Solve the cubic, and test which α is best.

C18 2015 31 / 35

Skeleton Matlab code

% xc1 and xc2 are 3 x npoint matrices of matched and centred image points
%
% Choose 7 of them
% Generate a complete A matrix
A = zeros(7,9);
for i = 1:7
    A(i,1) = xc2(1,i) * xc1(1,i);
    A(i,2) = xc2(1,i) * xc1(2,i);
    A(i,3) = xc2(1,i);
    A(i,4) = xc2(2,i) * xc1(1,i);
    A(i,5) = xc2(2,i) * xc1(2,i);
    A(i,6) = xc2(2,i);
    A(i,7) = xc1(1,i);
    A(i,8) = xc1(2,i);
    A(i,9) = 1;
end

% Perform SVD to obtain nullspace of A
[U,S,V] = svd(A);
v = V(:,8);
w = V(:,9);

% find the cubic coeffs from a subroutine
[a0,a1,a2,a3] = f_cubic_coeffs(v,w);
coeffs = [a3,a2,a1,a0];

% Solve the roots of polynomial
alpha = roots(coeffs);

% for alpha(1), (2), (3) generate f 9x1 vector
f = alpha(1)*v + (1-alpha(1))*w;
% convert into 3x3 matrix
F1 = fvec_to_Fmat(f);

% ... etc ... for F2 and F3

% then test which is best!
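The skeleton calls `fvec_to_Fmat` without showing it. The sketch below gives one plausible definition, assumed from the row ordering of A (f = (F11, F12, ..., F33)), and shows one way of discarding complex roots before testing. Neither the definition nor the tolerance comes from the course code.

```matlab
% Hypothetical helper: f is row-major, whereas reshape fills column by
% column, hence the transpose.
fvec_to_Fmat = @(f) reshape(f, 3, 3)';

f = (1:9)';
F = fvec_to_Fmat(f);
disp(F);                               % rows are [1 2 3], [4 5 6], [7 8 9]

% Consistency with the row layout of A for one made-up match x1 <-> x2:
% the row [x'x x'y x' y'x y'y y' x y 1] times f equals x2' * F * x1
x1 = [2; 3; 1];
x2 = [5; 7; 1];
a  = [x2(1)*x1', x2(2)*x1', x1'];      % one row of A
check = a * f - x2' * F * x1;
disp(check);                           % zero

% Keep only the real roots of a cubic (example coefficients, not real data)
alpha = roots([1 0 1 0]);              % alpha^3 + alpha = 0
alpha_real = real(alpha(abs(imag(alpha)) < 1e-9));
disp(alpha_real);
```

Only real values of α give a real matrix F, so complex roots can be dropped before deciding which candidate fits the remaining matches best.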


Cubic coefficients ...

Why cubic? Writing d = v − w, then f = αd + w, and

$$
\det F = \begin{vmatrix}
\alpha d_1 + w_1 & \alpha d_2 + w_2 & \alpha d_3 + w_3 \\
\alpha d_4 + w_4 & \alpha d_5 + w_5 & \alpha d_6 + w_6 \\
\alpha d_7 + w_7 & \alpha d_8 + w_8 & \alpha d_9 + w_9
\end{vmatrix}
$$

Minutes of fun later, you’ll find

$$
\begin{aligned}
a_0 ={}& + w_1 w_5 w_9 + w_2 w_6 w_7 + w_3 w_4 w_8 - w_1 w_6 w_8 - w_2 w_4 w_9 - w_3 w_5 w_7 \\
a_1 ={}& d_1 w_5 w_9 + w_1 d_5 w_9 + w_1 w_5 d_9 + d_2 w_6 w_7 + w_2 d_6 w_7 + w_2 w_6 d_7 + d_3 w_4 w_8 + w_3 d_4 w_8 + w_3 w_4 d_8 \\
      & - d_1 w_6 w_8 - w_1 d_6 w_8 - w_1 w_6 d_8 - d_2 w_4 w_9 - w_2 d_4 w_9 - w_2 w_4 d_9 - d_3 w_5 w_7 - w_3 d_5 w_7 - w_3 w_5 d_7 \\
a_2 ={}& + d_1 d_5 w_9 + d_1 w_5 d_9 + w_1 d_5 d_9 + d_2 d_6 w_7 + d_2 w_6 d_7 + w_2 d_6 d_7 + d_3 d_4 w_8 + d_3 w_4 d_8 + w_3 d_4 d_8 \\
      & - d_1 d_6 w_8 - d_1 w_6 d_8 - w_1 d_6 d_8 - d_2 d_4 w_9 - d_2 w_4 d_9 - w_2 d_4 d_9 - d_3 d_5 w_7 - d_3 w_5 d_7 - w_3 d_5 d_7 \\
a_3 ={}& + d_1 d_5 d_9 + d_2 d_6 d_7 + d_3 d_4 d_8 - d_1 d_6 d_8 - d_2 d_4 d_9 - d_3 d_5 d_7
\end{aligned}
$$

Tedious, but easy enough to write a matlab routine

function [a0,a1,a2,a3] = f_cubic_coeffs(v,w)
d = v-w;
a0 = w(1)*w(5)*w(9) + w(2)*w(6)*w(7) + w(3)*w(4)*w(8) ...
   - w(1)*w(6)*w(8) - w(2)*w(4)*w(9) - w(3)*w(5)*w(7);
a1 = ... blah blah blah ...
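As an alternative to typing out the long expressions, the coefficients can be recovered numerically: a cubic is fixed by its values at four distinct points, so evaluating det F at four values of α and fitting a degree-3 polynomial gives the same a0 to a3. This is a different technique from the closed-form routine above, sketched here with arbitrary example vectors v and w.

```matlab
% Example null-space vectors (arbitrary numbers, not from real matches)
v = [1 2 0 -1 3 1 2 0 1]';
w = [0 1 1 2 -1 0 1 1 3]';

% det F(alpha) for f = alpha*v + (1-alpha)*w, with row-major reshaping
detF = @(alpha) det(reshape(alpha*v + (1-alpha)*w, 3, 3)');

% Sample the cubic at four distinct points and interpolate exactly
alphas = [-1 0 1 2];
dets   = arrayfun(detF, alphas);
coeffs = polyfit(alphas, dets, 3);     % [a3 a2 a1 a0], highest power first

% Cross-check a0 against the closed form, and a3 against det of d = v - w
a0_closed = w(1)*w(5)*w(9) + w(2)*w(6)*w(7) + w(3)*w(4)*w(8) ...
          - w(1)*w(6)*w(8) - w(2)*w(4)*w(9) - w(3)*w(5)*w(7);
a3_closed = det(reshape(v - w, 3, 3)');

disp(coeffs);
disp([coeffs(4) - a0_closed, coeffs(1) - a3_closed]);   % both close to zero
```

The output of `polyfit` is already in the order expected by `roots`, so it can replace the `coeffs = [a3,a2,a1,a0]` line of the skeleton directly.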


[**For info**] Centering the data

For numerical stability it is essential that the elements of A are not of markedly different orders of magnitude.

The standard method of “statistical centering” an image quantity x involves shifting and scaling the data so that the centred data have mean $\mu_{x_C} = 0$ and standard deviation $\sigma_{x_C} = \sqrt{2}$.

Treating x and y as independent, a centered point is

$$
\mathbf{x}_C = C\mathbf{x} =
\begin{bmatrix}
\sqrt{2}/\sigma_x & 0 & -\sqrt{2}\mu_x/\sigma_x \\
0 & \sqrt{2}/\sigma_y & -\sqrt{2}\mu_y/\sigma_y \\
0 & 0 & 1
\end{bmatrix}
\begin{pmatrix} x \\ y \\ 1 \end{pmatrix}
$$

and similarly for points x′ in the other image.

Using centered points in A, the required matrix is obtained from the $F_C$ recovered via SVD as

$$ F = C'^\top F_C\, C $$

You are just undoing the linear transformation.
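A small sketch of the centering step for one image. The pixel coordinates are made up, and normalising the standard deviation by N (rather than N − 1) is our assumption, since the slide does not say which is intended.

```matlab
% Hypothetical pixel coordinates of five points (3 x npoint, homogeneous)
x = [100 220 340 410 505;
      80 150 260 300 420;
       1   1   1   1   1];

mu_x = mean(x(1,:));   sigma_x = std(x(1,:), 1);   % std normalised by N
mu_y = mean(x(2,:));   sigma_y = std(x(2,:), 1);

s = sqrt(2);
C = [s/sigma_x  0          -s*mu_x/sigma_x;
     0          s/sigma_y  -s*mu_y/sigma_y;
     0          0           1];

xc = C * x;                              % centred points

m_c  = mean(xc(1:2,:), 2);               % both means are ~0
sd_c = std(xc(1:2,:), 1, 2);             % both standard deviations ~sqrt(2)
disp([m_c sd_c]);

% After estimating Fc from centred points in both images, undo the centring:
%   F = Cp' * Fc * C;    (Cp is the centring matrix of the second image)
```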


A note on the Essential Matrix

The fundamental matrix $F = K'^{-\top} [\mathbf{t}]_\times R K^{-1}$ requires $\mathbf{x}'^\top F \mathbf{x} = 0$.

If however we know the intrinsic calibrations, we can transform the matching points into their respective ideal images.

Points in the ideal images are related by

$$ \mathbf{x}'^\top E\, \mathbf{x} = 0 $$

where the Essential Matrix is

$$ E = [\mathbf{t}]_\times R $$

Properties of the essential matrix were derived (in a variety of forms) in the early 1980s. The broad sweep of methods to be described for computing F have counterparts for E, although opportunities (and challenges) arise from its reduced number of degrees of freedom and special form.
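The link between the two constraints can be made explicit. The short derivation below is a sketch that uses hatted symbols (our notation, not the slides') for points in the ideal images, and assumes K and K′ are known and invertible.

```latex
\hat{\mathbf{x}} = K^{-1}\mathbf{x}, \qquad \hat{\mathbf{x}}' = K'^{-1}\mathbf{x}'
\quad\Rightarrow\quad
\mathbf{x}'^{\top} F\, \mathbf{x}
= \hat{\mathbf{x}}'^{\top} \left( K'^{\top} F K \right) \hat{\mathbf{x}}
= \hat{\mathbf{x}}'^{\top} [\mathbf{t}]_{\times} R \,\hat{\mathbf{x}}
= \hat{\mathbf{x}}'^{\top} E \,\hat{\mathbf{x}} = 0 ,
\qquad\text{so}\qquad
E = K'^{\top} F K, \quad F = K'^{-\top} E K^{-1}.
```

The middle step substitutes $F = K'^{-\top} [\mathbf{t}]_\times R K^{-1}$, after which the intrinsics cancel.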


Summary

3.1 Introduction

3.2 Epipolar Geometry
Clues from Parallel Cameras; the Epipolar Plane and lines; Cat’s whiskers for converging cameras

3.3 Algebraic Representation and the Fundamental Matrix
Representing the epipolar plane; linear relationship between point in one image and epipolar line in the other.

3.4 Computing the F matrix: a 7-point method
A minimal linear method. Simple code available from course page.
