Formation et Analyse d’Images: Session 11
Daniela Hall
12 December 2005
Course Overview
• Session 1 (19/09/05): Overview; Human vision; Homogeneous coordinates; Camera models
• Session 2 (26/09/05): Tensor notation; Image transformations; Homography computation
• Session 3 (3/10/05): Camera calibration; Reflection models; Color spaces
• Session 4 (10/10/05): Pixel based image analysis
• 17/10/05: course is replaced by Modelisation surfacique
• Session 5 + 6 (24/10/05), 9:45 – 12:45: Contrast description; Hough transform
• Session 7 (7/11/05): Kalman filter
• Session 8 (14/11/05): Tracking of regions, pixels, and lines
• Session 9 (21/11/05): Gaussian filter operators
• Session 10 (5/12/05): Scale Space
• Session 11 (12/12/05): Stereo vision; Epipolar geometry
• Session 12 (16/01/06): exercises and questions
Session overview
1. Stereo vision
2. Epipolar geometry
3. 3d point position from two views using epipolar geometry
4. 3d point position from two views when camera models are known.
Human stereo vision
• Two Eyes = Three Dimensions (3D)! Each eye captures its own view and the two separate images are sent on to the brain for processing.
• When the two images arrive simultaneously in the back of the brain, they are united into one picture. The mind combines the two images by matching up the similarities and adding in the small differences.
• The small differences between the two images add up to a big difference in the final picture! The combined image is more than the sum of its parts. It is a three-dimensional stereo picture.
• The word "stereo" comes from the Greek word "stereos" which means firm or solid. With stereo vision you see an object as solid in three spatial dimensions--width, height and depth--or x, y and z.
Computer stereo vision
• Stereo vision estimates the 3D position of a scene point X from its positions x, x’ in two images taken from different camera positions.
• The two views can be acquired simultaneously with two cameras or sequentially with one camera in motion.
• Each view has an associated camera matrix P, P’.
• The 3d point X is imaged as x = PX in the first view and x’ = P’X in the second view.
• x and x’ correspond because they are the image of the same point in 3d.
Source: Hartley, Zisserman: Multiple view geometry in computer vision, Cambridge, 2000. http://www.robots.ox.ac.uk/~vgg/hzbook/
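As a small illustration of x = PX and x’ = P’X, the sketch below projects one scene point with two camera matrices. The two matrices and the point are hypothetical values chosen for the example (identity intrinsics, second camera shifted by one unit along x); they do not come from the course.

```python
import numpy as np

def project(P, X):
    """Project a 3d point X (3-vector) with a 3x4 camera matrix P; return (x, y)."""
    Xh = np.append(X, 1.0)      # homogeneous scene point
    xh = P @ Xh                 # x = P X, defined up to scale
    return xh[:2] / xh[2]       # divide out the homogeneous scale

# Hypothetical cameras: the first at the world origin, the second shifted
# by one unit along x (so its translation column is (-1, 0, 0)).
P1 = np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = np.hstack([np.eye(3), np.array([[-1.0], [0.0], [0.0]])])

X = np.array([0.5, 0.2, 4.0])   # hypothetical scene point
x1 = project(P1, X)             # position x in the first view
x2 = project(P2, X)             # position x' in the second view
```

The two image positions differ only in their x coordinate here, because the assumed camera displacement is purely horizontal.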
Topics in stereo vision
• Correspondence geometry (epipolar geometry):
– Given an image point x in the first view, how does it constrain the corresponding point x’ in the second view?
• Camera geometry (motion):
– Given a set of corresponding points {xi, x’i}, what are the cameras P, P’ of the two views?
• Scene geometry:
– Given corresponding image points {xi, x’i} and cameras P, P’, what is the position of X in 3d?
Epipolar geometry
• A point in one view defines an epipolar line in the other view on which the corresponding point lies.
• The epipolar geometry depends only on the cameras: their relative position and their internal parameters.
• The epipolar geometry is represented by a 3x3 matrix called the fundamental matrix F.
Thanks to Andrew Zisserman and Richard Hartley for all figures.
Notations
• X: 3d point
• C, C’: 3d positions of the camera centers
• I, I’: image planes
• x, x’: 2d positions of the 3d point X in images I, I’ of cameras C, C’
• pi: epipolar plane. C, C’, X, x, x’, e, e’ all lie on pi.
• e, e’: epipoles. e is the 2d position of the camera center C’ in image I; e’ is the 2d position of the camera center C in image I’. C, e, e’, C’ lie on the baseline.
• l, l’: epipolar lines. l’ is the intersection of the image plane I’ with the epipolar plane pi, which is spanned by the baseline CC’ and the ray Cx. The corresponding point x’ must lie on l’.
Epipolar geometry
• For any two fixed cameras we have one baseline.
• For any 3d point X we have a different epipolar plane pi.
• All epipolar planes intersect at the baseline.
Epipolar line
• Suppose we know only x and the baseline.
• How is the corresponding point x’ in the other image constrained?
• pi is defined by the baseline and the ray Cx.
• The epipolar line l’ is the image of this ray in the other image. x’ must lie on l’.
• The benefit of the epipolar line is that the correspondence search can be restricted to l’ instead of searching the entire image.
Example
Epipolar terminology
• Epipole:
– intersection of the line joining the camera centers (baseline) and the image plane.
– the image of the other camera center in the image plane.
– intersection of the epipolar lines.
• Epipolar plane:
– a plane containing the baseline. There is a one-parameter family of epipolar planes for a fixed camera pair.
• Epipolar line:
– intersection of the epipolar plane with the image plane.
– all epipolar lines intersect in the epipole.
– an epipolar plane intersects both image planes and defines correspondences between the lines.
The fundamental matrix
• The fundamental matrix is the algebraic representation of the epipolar geometry.
• Derivation of F:
– map the point x to some point x’ = Hx in the other image
– l’ is obtained as the line joining x’ and the epipole e’
– F can be computed from these elements

Relation of x and epipolar line:
$l' = [e']_\times x' = [e']_\times H x = F x$

Equation for F:
$F = [e']_\times H$

e is the epipole and $[e]_\times$ its skew-symmetric matrix:
$e = (e_1, e_2, e_3)^T, \quad [e]_\times = \begin{pmatrix} 0 & -e_3 & e_2 \\ e_3 & 0 & -e_1 \\ -e_2 & e_1 & 0 \end{pmatrix}$

Relation between the cross product and the skew-symmetric matrix:
$a \times b = [a]_\times b$
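The relation between the cross product and the skew-symmetric matrix can be checked numerically. The following is a minimal sketch; the helper name `skew` and the two test vectors are illustrative choices, not part of the course material.

```python
import numpy as np

def skew(e):
    """Return the 3x3 skew-symmetric matrix [e]_x of a 3-vector e."""
    e1, e2, e3 = e
    return np.array([[0.0, -e3,  e2],
                     [ e3, 0.0, -e1],
                     [-e2,  e1, 0.0]])

a = np.array([1.0, 2.0, 3.0])     # arbitrary test vectors
b = np.array([-4.0, 0.5, 2.0])

lhs = np.cross(a, b)              # a x b
rhs = skew(a) @ b                 # [a]_x b, expected to equal a x b
self_product = skew(a) @ a        # a x a, expected to be the zero vector
```

The last line reflects the fact that $[e']_\times e' = 0$, which is why the epipole e’ lies on every epipolar line l’ = Fx.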
Fundamental matrix
Correspondence condition
• The fundamental matrix satisfies the following condition for any pair of corresponding points x, x’ in the two images:
$x'^T F x = 0$
• This is true because, if x and x’ correspond, then x’ lies on the epipolar line l’. Since we know l’ = Fx, we can write:
$0 = x'^T l' = x'^T F x$
• The importance of this relation is that the fundamental matrix can be computed from point correspondences alone. We need at least 7 point correspondences (details chap 10, Hartley, Zisserman book).
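Computing the fundamental matrix
• Given sufficiently many point matches xi, xi’, the equation x’TFx = 0 can be used to compute F.
• Writing x = (x, y, 1)T and x’ = (x’, y’, 1)T, each point match gives rise to one linear equation in the unknowns of F:
$x'x f_{11} + x'y f_{12} + x' f_{13} + y'x f_{21} + y'y f_{22} + y' f_{23} + x f_{31} + y f_{32} + f_{33} = 0$
• Writing the 9 unknowns of F as a vector f, we get the following two equations, the second one stacking n matches:
$(x'x, x'y, x', y'x, y'y, y', x, y, 1)\, f = 0$
$A f = \begin{pmatrix} x'_1 x_1 & x'_1 y_1 & x'_1 & y'_1 x_1 & y'_1 y_1 & y'_1 & x_1 & y_1 & 1 \\ \vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots \\ x'_n x_n & x'_n y_n & x'_n & y'_n x_n & y'_n y_n & y'_n & x_n & y_n & 1 \end{pmatrix} f = 0$
• Using the last equation, SVD provides a direct solution for F.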
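A minimal sketch of the linear solution Af = 0 by SVD is given below. It assumes at least 8 matches (the purely linear system needs 8; the 7-point minimum quoted earlier requires an additional non-linear step that is not shown). The rank-2 correction at the end is a common extra step that is not on the slide. The function name, the synthetic cameras and the random points are all hypothetical.

```python
import numpy as np

def estimate_fundamental(x1, x2):
    """Linear estimate of F from n >= 8 matches.

    x1, x2: (n, 2) arrays of matching points (x, y) and (x', y').
    Returns a 3x3 matrix F, defined up to scale, with x'^T F x ~ 0.
    """
    x, y = x1[:, 0], x1[:, 1]
    xp, yp = x2[:, 0], x2[:, 1]
    # One row (x'x, x'y, x', y'x, y'y, y', x, y, 1) per match.
    A = np.column_stack([xp * x, xp * y, xp, yp * x, yp * y, yp,
                         x, y, np.ones(len(x))])
    # f is the right singular vector of A with the smallest singular value.
    _, _, Vt = np.linalg.svd(A)
    F = Vt[-1].reshape(3, 3)
    # Extra step (assumption, not on the slide): force rank 2 so that all
    # epipolar lines meet in a single epipole.
    U, S, Vt2 = np.linalg.svd(F)
    S[2] = 0.0
    return U @ np.diag(S) @ Vt2

# Synthetic check with two hypothetical cameras and 12 random scene points.
rng = np.random.default_rng(0)
n = 12
X = np.column_stack([rng.uniform(-1, 1, n), rng.uniform(-1, 1, n),
                     rng.uniform(4, 8, n), np.ones(n)])
P1 = np.hstack([np.eye(3), np.zeros((3, 1))])
c, s = np.cos(0.1), np.sin(0.1)
P2 = np.array([[c, 0.0, s, -1.0],
               [0.0, 1.0, 0.0, 0.2],
               [-s, 0.0, c, 0.1]])
h1, h2 = X @ P1.T, X @ P2.T
x1, x2 = h1[:, :2] / h1[:, 2:], h2[:, :2] / h2[:, 2:]

F = estimate_fundamental(x1, x2)
x1h = np.column_stack([x1, np.ones(n)])
x2h = np.column_stack([x2, np.ones(n)])
residuals = np.abs(np.einsum('ni,ij,nj->n', x2h, F, x1h))  # |x'^T F x|
```

With noisy measurements the residuals are no longer zero, and it is usual to normalise the image coordinates before building A; that refinement is left out of this sketch.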
The fundamental matrix
• F allows us to compute the epipolar line l’ in I’ for a point x in I. x’ lies on l’.
• F allows us to compute the epipolar line l in I for a point x’ in I’. We can verify the point correspondence x, x’, because x must lie on l.
• In the course 3d vision, you will see that F is used to compute the camera projection model P for each camera. With the camera model you can estimate the 3d position of a point without calibrating the cameras (self-calibration).
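The use of F to verify a candidate correspondence can be sketched as follows: compute l’ = Fx and measure how far the candidate x’ lies from that line. The matrix F below is a hypothetical example (two identical cameras displaced horizontally), and the function names and test points are illustrative assumptions.

```python
import numpy as np

def epipolar_line(F, x):
    """Epipolar line l' = F x in the second image for a point x = (x, y)."""
    return F @ np.array([x[0], x[1], 1.0])

def point_line_distance(l, x):
    """Distance of the point x = (x, y) from the line l = (a, b, c), a x + b y + c = 0."""
    a, b, c = l
    return abs(a * x[0] + b * x[1] + c) / np.hypot(a, b)

# Hypothetical F for two identical cameras displaced along the x axis:
# every epipolar line is then the horizontal line through the same y.
F = np.array([[0.0,  0.0, 0.0],
              [0.0,  0.0, 1.0],
              [0.0, -1.0, 0.0]])

x = (0.125, 0.05)                 # point in the first image
l_prime = epipolar_line(F, x)     # line in the second image

d_good = point_line_distance(l_prime, (-0.125, 0.05))  # consistent match
d_bad = point_line_distance(l_prime, (-0.125, 0.30))   # inconsistent match
```

In practice a match would be accepted when the distance is below a small threshold rather than exactly zero; the threshold is application dependent.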
Computing the 3d position of a point
1. Compute the fundamental matrix from at least 7 point correspondences.
2. Determine two camera projection matrices.
3. Determine a point correspondence x, x’ in the two views.
4. The 3d position X of the image points x and x’ can be computed directly as intersection of the two rays defined by Cx and C’x’.
[Figure: the lines L and U through the camera centers C and C’]
Camera projection matrices P, P’
• We set the origin of the world to the camera center C of the first camera. Then the projection matrix P of the first camera is
$P = (I \mid 0) = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \end{pmatrix}$, with pseudo-inverse $P^+ = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{pmatrix}$, $P P^+ = I$
• The projection matrix of the second camera has the form
$P' = (R \mid t)$, with R a 3D rotation and t a translation
• It can be computed by solving $F = [e']_\times P' P^+$ for P’.
• C is the null vector of P:
$P C = 0$
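The relations on this slide can be checked in the forward direction: given two known cameras, compute C as the null vector of P, the epipole e’ = P’C, and then F = [e’]x P’ P+. This is only a sketch of the relation, not the procedure for recovering P’ from F, and the rotation angle, translation and test points are hypothetical values.

```python
import numpy as np

def skew(e):
    """Skew-symmetric matrix [e]_x of a 3-vector e."""
    return np.array([[0.0, -e[2], e[1]],
                     [e[2], 0.0, -e[0]],
                     [-e[1], e[0], 0.0]])

def camera_center(P):
    """Null vector C of a 3x4 camera matrix P (P C = 0), found by SVD."""
    _, _, Vt = np.linalg.svd(P)
    return Vt[-1]

def fundamental_from_cameras(P1, P2):
    """F = [e']_x P' P^+ with e' = P' C."""
    C = camera_center(P1)
    e2 = P2 @ C                               # epipole in the second image
    return skew(e2) @ P2 @ np.linalg.pinv(P1)

# First camera P = (I | 0); hypothetical second camera P' = (R | t).
P1 = np.hstack([np.eye(3), np.zeros((3, 1))])
c, s = np.cos(0.1), np.sin(0.1)
R = np.array([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])
t = np.array([[-1.0], [0.2], [0.1]])
P2 = np.hstack([R, t])

C = camera_center(P1)
F = fundamental_from_cameras(P1, P2)

# Correspondence condition x'^T F x = 0 for a few hypothetical scene points.
X = np.array([[0.5, 0.2, 4.0, 1.0],
              [-0.3, 0.7, 6.0, 1.0],
              [0.1, -0.4, 5.0, 1.0]])
residuals = np.array([(P2 @ Xk) @ F @ (P1 @ Xk) for Xk in X])
```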
Defining the rays
• The ray backprojected from x by P is obtained
– by solving PX = x, or
– by taking 2 points on the ray and computing the line tensor (see Session 1).
• Two points on the ray are known: the camera center C and the 3d point P+x, which projects to x.
• Line equations using tensor notation:
$L_i = E_{ijk} (P^+ x)^j C^k$ (line defined by x and C)
$U_i = E_{ijk} (P'^+ x')^j C'^k$ (line defined by x’ and C’)
$X_i = E_{ijk} L^j U^k$ (3d point as intersection of L and U)
Intersecting the rays
• In real world applications x’TFx = 0 may not hold exactly, because the measurements are imprecise.
• This means that the rays L and U may not intersect (they are skew).
• In this case you need to find the point with minimum distance to both L and U. You can solve this by SVD.
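One common way to handle skew rays is to take the midpoint of the shortest segment joining them. The sketch below assumes each ray is given by a camera center and a direction vector, and solves the small least-squares problem with `np.linalg.lstsq`, which is SVD based. The function name and the two example rays are hypothetical.

```python
import numpy as np

def midpoint_of_rays(C1, d1, C2, d2):
    """Point closest to two (possibly skew) rays C1 + s*d1 and C2 + t*d2.

    Returns the midpoint of the shortest segment joining the rays and the
    length of that segment (zero when the rays really intersect).
    """
    # Minimise |(C1 + s*d1) - (C2 + t*d2)| over s and t.
    A = np.column_stack([d1, -d2])
    (s, t), *_ = np.linalg.lstsq(A, C2 - C1, rcond=None)
    p1 = C1 + s * d1               # closest point on the first ray
    p2 = C2 + t * d2               # closest point on the second ray
    return 0.5 * (p1 + p2), np.linalg.norm(p1 - p2)

# Two hypothetical rays that miss each other by 0.2 in the y direction.
C1, d1 = np.array([0.0, 0.0, 0.0]), np.array([0.0, 0.0, 1.0])
C2, d2 = np.array([1.0, 0.2, 0.0]), np.array([-1.0, 0.0, 4.0])

X, gap = midpoint_of_rays(C1, d1, C2, d2)
```

The returned gap is a useful sanity check: a large value suggests that x and x’ are not a true correspondence.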
Direct computation of 3d point position
• Choose a calibration object whose 3d position is known.
• Calibrate the cameras (compute the camera models $M^I_S$ and $N^I_S$ from at least 5 ½ points).
• Then, from a correspondence P, Q, the 3d position of R can be computed directly.
• R is at the intersection of 3 planes
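Camera model
$P^I = C^I_R M^R_C T^C_S P^S = M^I_S P^S$

Equation:
$\begin{pmatrix} w i \\ w j \\ w \end{pmatrix} = M^I_S \begin{pmatrix} x_s \\ y_s \\ z_s \\ 1 \end{pmatrix}$

Image coordinates, where $(M^I_S)_r$ denotes row r of $M^I_S$:
$i = \frac{w i}{w} = \frac{(M^I_S)_1 \cdot P^S}{(M^I_S)_3 \cdot P^S} = \frac{m_{11} x_s + m_{12} y_s + m_{13} z_s + m_{14}}{m_{31} x_s + m_{32} y_s + m_{33} z_s + m_{34}}$
$j = \frac{w j}{w} = \frac{(M^I_S)_2 \cdot P^S}{(M^I_S)_3 \cdot P^S} = \frac{m_{21} x_s + m_{22} y_s + m_{23} z_s + m_{24}}{m_{31} x_s + m_{32} y_s + m_{33} z_s + m_{34}}$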
Transformation image-scene
$P^S = T^S_C M^C_R C^R_I P^I = M^S_I P^I$
• Problem: we need to know the depth $z_s$ for each image position. Since $z_s$ can change, a general form of $M^S_I$ cannot exist.
• Any point in I is the image of all the points on a ray.
Calibration
1. Construct a calibration object whose 3D position is known.
2. Measure image coordinates.
3. Determine correspondences between 3D points $R^S_k$ and image points $P^I_k$.
4. We have 11 DoF. We need at least 5 ½ correspondences.
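Calibration
• For each correspondence of scene point $R^S_k$ and image point $P^I_k$ (with $(M^I_S)_r$ row r of $M^I_S$ and $R^S_{kh}$ the point $R^S_k$ in homogeneous coordinates):
$i_k = \frac{w_k i_k}{w_k} = \frac{(M^I_S)_1 \cdot R^S_{kh}}{(M^I_S)_3 \cdot R^S_{kh}}, \quad j_k = \frac{w_k j_k}{w_k} = \frac{(M^I_S)_2 \cdot R^S_{kh}}{(M^I_S)_3 \cdot R^S_{kh}}$
• which gives the following equations for k = 1, ..., 6:
$(M^I_S)_1 \cdot R^S_{kh} - i_k \, (M^I_S)_3 \cdot R^S_{kh} = 0$
$(M^I_S)_2 \cdot R^S_{kh} - j_k \, (M^I_S)_3 \cdot R^S_{kh} = 0$
• from which $M^I_S$ can be computed.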
Properties of $M^I_S$
$(M^I_S)_1 \cdot R^S_{kh} - i_k \, (M^I_S)_3 \cdot R^S_{kh} = 0$
$(M^I_S)_2 \cdot R^S_{kh} - j_k \, (M^I_S)_3 \cdot R^S_{kh} = 0$
• The first equation defines a plane that goes through the camera center and the image plane in x direction.
• The second equation defines a plane that goes through the camera center and the image plane in y direction.
Calibration using many points
• For k = 5 ½ correspondences, M has one solution.
– The solution depends on precise measurements of the 3D and 2D points.
– If you use another 5 ½ points you will get a different solution.
• A more stable solution is found by using a large number of points and optimising over all of them.
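Calibration using many points
• For each point correspondence we know (i, j) and R.
• We want to know $M^I_S$.
$(M^I_S)_1 \cdot R^S_{kh} - i_k \, (M^I_S)_3 \cdot R^S_{kh} = 0$
$(M^I_S)_2 \cdot R^S_{kh} - j_k \, (M^I_S)_3 \cdot R^S_{kh} = 0$
In matrix form, for a scene point (x, y, z) with image point (i, j):
$\begin{pmatrix} x & y & z & 1 & 0 & 0 & 0 & 0 & -ix & -iy & -iz & -i \\ 0 & 0 & 0 & 0 & x & y & z & 1 & -jx & -jy & -jz & -j \end{pmatrix} (m_{11}, m_{12}, m_{13}, m_{14}, m_{21}, m_{22}, m_{23}, m_{24}, m_{31}, m_{32}, m_{33}, m_{34})^T = 0$
• Solve the equation with your favorite algorithm (least squares, Levenberg-Marquardt, SVD, ...).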
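As one possible way to solve this system, the sketch below stacks the two rows per correspondence and takes the singular vector with the smallest singular value (the SVD option from the list above). The function name, the ground-truth matrix and the random scene points are hypothetical and serve only to check that the matrix is recovered up to scale.

```python
import numpy as np

def calibrate_dlt(R, ij):
    """Estimate the 3x4 camera matrix M from n >= 6 correspondences.

    R:  (n, 3) scene points (x, y, z); ij: (n, 2) image points (i, j).
    Returns M up to scale (unit norm), the least-squares solution of A m = 0.
    """
    rows = []
    for (x, y, z), (i, j) in zip(R, ij):
        rows.append([x, y, z, 1, 0, 0, 0, 0, -i * x, -i * y, -i * z, -i])
        rows.append([0, 0, 0, 0, x, y, z, 1, -j * x, -j * y, -j * z, -j])
    A = np.array(rows, dtype=float)
    _, _, Vt = np.linalg.svd(A)
    return Vt[-1].reshape(3, 4)    # m = (m11, ..., m34) as a 3x4 matrix

# Hypothetical ground-truth camera and 10 random, non-coplanar scene points.
M_true = np.array([[2.0, 0.0, 0.5, 0.1],
                   [0.0, 2.0, 0.3, -0.2],
                   [0.0, 0.0, 1.0, 5.0]])
rng = np.random.default_rng(1)
R = rng.uniform(-1.0, 1.0, size=(10, 3))
proj = np.column_stack([R, np.ones(10)]) @ M_true.T
ij = proj[:, :2] / proj[:, 2:]     # image coordinates (i, j)

M_est = calibrate_dlt(R, ij)
M_est = M_est / M_est[2, 3] * M_true[2, 3]   # fix the free scale for comparison
```

With measured (noisy) points the SVD solution is often used as the starting value for a non-linear refinement such as Levenberg-Marquardt.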
Computation of 3d point position
• We have the camera model $M^I_S$ of camera 1 and the camera model $N^I_S$ of camera 2.
• We have a point $P^I$ in camera 1 and a point $Q^I$ in camera 2 which correspond (that means they are the image of the same scene point $R^S$).
• The position of $R^S$ can be computed by the intersection of 3 planes.
Computation of 3d point position
• $P^I = M^I_S R^S$, $P^I = (i, j)$, $Q^I = N^I_S R^S$, $Q^I = (u, v)$
• We have the following equations:
$(M^I_S)_1 \cdot R^S_h - i \, (M^I_S)_3 \cdot R^S_h = 0$
$(M^I_S)_2 \cdot R^S_h - j \, (M^I_S)_3 \cdot R^S_h = 0$
$(N^I_S)_1 \cdot R^S_h - u \, (N^I_S)_3 \cdot R^S_h = 0$
$(N^I_S)_2 \cdot R^S_h - v \, (N^I_S)_3 \cdot R^S_h = 0$
• $R^S$ can be found by using 3 of those 4 equations.
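Computation of 3d point position
The point $R^S = (x, y, z, 1)$ is computed as follows, using the equations for i, j and v and writing $m_{rc}$, $n_{rc}$ for the elements of $M^I_S$, $N^I_S$:
$\begin{pmatrix} (M^I_S)_1 - i\,(M^I_S)_3 \\ (M^I_S)_2 - j\,(M^I_S)_3 \\ (N^I_S)_2 - v\,(N^I_S)_3 \end{pmatrix} R^S_h = 0$
$\begin{pmatrix} m_{11} - i m_{31} & m_{12} - i m_{32} & m_{13} - i m_{33} \\ m_{21} - j m_{31} & m_{22} - j m_{32} & m_{23} - j m_{33} \\ n_{21} - v n_{31} & n_{22} - v n_{32} & n_{23} - v n_{33} \end{pmatrix} \begin{pmatrix} x \\ y \\ z \end{pmatrix} + \begin{pmatrix} m_{14} - i m_{34} \\ m_{24} - j m_{34} \\ n_{24} - v n_{34} \end{pmatrix} = 0$
$\begin{pmatrix} x \\ y \\ z \end{pmatrix} = - \begin{pmatrix} m_{11} - i m_{31} & m_{12} - i m_{32} & m_{13} - i m_{33} \\ m_{21} - j m_{31} & m_{22} - j m_{32} & m_{23} - j m_{33} \\ n_{21} - v n_{31} & n_{22} - v n_{32} & n_{23} - v n_{33} \end{pmatrix}^{-1} \begin{pmatrix} m_{14} - i m_{34} \\ m_{24} - j m_{34} \\ n_{24} - v n_{34} \end{pmatrix}$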
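The three-plane intersection can be sketched in a few lines: each of the equations for i, j and v is a plane (a 4-vector), and the scene point is the solution of the resulting 3x3 linear system. The function name and the two camera matrices below are hypothetical; the second camera is displaced both horizontally and vertically so that the v equation carries information.

```python
import numpy as np

def intersect_three_planes(M, N, i, j, v):
    """Scene point (x, y, z) from image point (i, j) in camera M and row
    coordinate v in camera N, as the intersection of three planes."""
    planes = np.array([M[0] - i * M[2],    # plane from the i equation
                       M[1] - j * M[2],    # plane from the j equation
                       N[1] - v * N[2]])   # plane from the v equation
    A, b = planes[:, :3], planes[:, 3]
    return np.linalg.solve(A, -b)          # A (x, y, z)^T + b = 0

# Hypothetical camera models: camera 1 at the origin, camera 2 displaced.
M = np.hstack([np.eye(3), np.zeros((3, 1))])
N = np.hstack([np.eye(3), np.array([[-1.0], [-0.5], [0.0]])])

# Image measurements of the scene point (0.5, 0.2, 4.0) in both cameras.
i, j = 0.125, 0.05
v = -0.075

R = intersect_three_planes(M, N, i, j, v)
```

Which three of the four equations to use matters: if the second camera is displaced only along the image x direction, the v plane nearly coincides with the j plane and the system becomes ill-conditioned, so the u equation would be the better choice in that configuration.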