3D Computer Vision and Video Computing 3D Vision Topic 3 of Part II Stereo Vision CSc I6716 Spring...
If you can't read please download the document
3D Computer Vision and Video Computing 3D Vision Topic 3 of Part II Stereo Vision CSc I6716 Spring 2011 Zhigang Zhu, City College of New York [email protected]
3D Computer Vision and Video Computing 3D Vision Topic 3 of
Part II Stereo Vision CSc I6716 Spring 2011 Zhigang Zhu, City
College of New York [email protected]
Slide 2
3D Computer Vision and Video Computing Stereo Vision n Problem
l Infer 3D structure of a scene from two or more images taken from
different viewpoints n Two primary Sub-problems l Correspondence
problem (stereo match) -> disparity map n Similar instead of
Same n Occlusion problem: some parts of the scene are visible only
in one eye l Reconstruction problem -> 3D n What we need to know
about the cameras parameters n Often a stereo calibration problem n
Lectures on Stereo Vision l Stereo Geometry Epipolar Geometry (*) l
Correspondence Problem (*) Two classes of approaches l 3D
Reconstruction Problems Three approaches
Slide 3
3D Computer Vision and Video Computing A Stereo Pair n Problems
l Correspondence problem (stereo match) -> disparity map l
Reconstruction problem -> 3D CMU CIL Stereo Dataset : Castle
sequence
http://www-2.cs.cmu.edu/afs/cs/project/cil/ftp/html/cil-ster.html ?
3D?
Slide 4
3D Computer Vision and Video Computing More Images n Problems l
Correspondence problem (stereo match) -> disparity map l
Reconstruction problem -> 3D
Slide 5
3D Computer Vision and Video Computing More Images n Problems l
Correspondence problem (stereo match) -> disparity map l
Reconstruction problem -> 3D
Slide 6
3D Computer Vision and Video Computing More Images n Problems l
Correspondence problem (stereo match) -> disparity map l
Reconstruction problem -> 3D
Slide 7
3D Computer Vision and Video Computing More Images n Problems l
Correspondence problem (stereo match) -> disparity map l
Reconstruction problem -> 3D
Slide 8
3D Computer Vision and Video Computing More Images n Problems l
Correspondence problem (stereo match) -> disparity map l
Reconstruction problem -> 3D
Slide 9
3D Computer Vision and Video Computing Part I. Stereo Geometry
n A Simple Stereo Vision System l Disparity Equation l Depth
Resolution l Fixated Stereo System n Zero-disparity Horopter n
Epipolar Geometry l Epipolar lines Where to search correspondences
n Epipolar Plane, Epipolar Lines and Epipoles n
http://www.ai.sri.com/~luong/research/Meta3DViewer/EpipolarGeo.html
l Essential Matrix and Fundamental Matrix n Computing E & F by
the Eight-Point Algorithm n Computing the Epipoles n Stereo
Rectification
Slide 10
3D Computer Vision and Video Computing Stereo Geometry n
Converging Axes Usual setup of human eyes n Depth obtained by
triangulation n Correspondence problem: p l and p r correspond to
the left and right projections of P, respectively. P(X,Y,Z)
Slide 11
3D Computer Vision and Video Computing A Simple Stereo System Z
w =0 LEFT CAMERA Left image: reference Right image: target RIGHT
CAMERA Elevation Z w disparity Depth Z baseline
Slide 12
3D Computer Vision and Video Computing Disparity Equation
P(X,Y,Z) p l (x l,y l ) Optical Center O l f = focal length Image
plane LEFT CAMERA B = Baseline Depth Stereo system with parallel
optical axes f = focal length Optical Center O r p r (x r,y r )
Image plane RIGHT CAMERA Disparity: dx = x r - x l
Slide 13
3D Computer Vision and Video Computing Disparity vs. Baseline
P(X,Y,Z) p l (x l,y l ) Optical Center O l f = focal length Image
plane LEFT CAMERA B = Baseline Depth f = focal length Optical
Center O r p r (x r,y r ) Image plane RIGHT CAMERA Disparity dx = x
r - x l Stereo system with parallel optical axes
Slide 14
3D Computer Vision and Video Computing Depth Accuracy n Given
the same image localization error l Angle of cones in the figure n
Depth Accuracy (Depth Resolution) vs. Baseline l Depth Error 1/B
(Baseline length) l PROS of Longer baseline, n better depth
estimation l CONS n smaller common FOV n Correspondence harder due
to occlusion n Depth Accuracy (Depth Resolution) vs. Depth l
Disparity (>0) 1/ Depth l Depth Error Depth 2 l Nearer the
point, better the depth estimation n An Example l f = 16 x 512/8
pixels, B = 0.5 m l Depth error vs. depth Z2Z2 Two viewpoints
Z2>Z1Z2>Z1 Z1Z1 Z1Z1 OlOl OrOr Absolute error Relative
error
Slide 15
3D Computer Vision and Video Computing Stereo with Converging
Cameras n Stereo with Parallel Axes l Short baseline n large common
FOV n large depth error l Long baseline n small depth error n small
common FOV n More occlusion problems n Two optical axes intersect
at the Fixation Point converging angle l The common FOV Increases
FOV Leftright
Slide 16
3D Computer Vision and Video Computing Stereo with Converging
Cameras n Stereo with Parallel Axes l Short baseline n large common
FOV n large depth error l Long baseline n small depth error n small
common FOV n More occlusion problems n Two optical axes intersect
at the Fixation Point converging angle l The common FOV Increases
FOV Leftright
Slide 17
3D Computer Vision and Video Computing Stereo with Converging
Cameras n Two optical axes intersect at the Fixation Point
converging angle l The common FOV Increases n Disparity properties
l Disparity uses angle instead of distance l Zero disparity at
fixation point n and the Zero-disparity horopter l Disparity
increases with the distance of objects from the fixation points n
>0 : outside of the horopter n
3D Computer Vision and Video Computing Stereo with Converging
Cameras n Two optical axes intersect at the Fixation Point
converging angle l The common FOV Increases n Disparity properties
l Disparity uses angle instead of distance l Zero disparity at
fixation point n and the Zero-disparity horopter l Disparity
increases with the distance of objects from the fixation points n
>0 : outside of the horopter n
3D Computer Vision and Video Computing Stereo with Converging
Cameras n Two optical axes intersect at the Fixation Point
converging angle l The common FOV Increases n Disparity properties
l Disparity uses angle instead of distance l Zero disparity at
fixation point n and the Zero-disparity horopter l Disparity
increases with the distance of objects from the fixation points n
>0 : outside of the horopter n l d > 0 Horopter
Slide 20
3D Computer Vision and Video Computing Stereo with Converging
Cameras n Two optical axes intersect at the Fixation Point
converging angle l The common FOV Increases n Disparity properties
l Disparity uses angle instead of distance l Zero disparity at
fixation point n and the Zero-disparity horopter l Disparity
increases with the distance of objects from the fixation points n
>0 : outside of the horopter n
3D Computer Vision and Video Computing Stereo with Converging
Cameras n Two optical axes intersect at the Fixation Point
converging angle l The common FOV Increases n Disparity properties
l Disparity uses angle instead of distance l Zero disparity at
fixation point n and the Zero-disparity horopter l Disparity
increases with the distance of objects from the fixation points n
>0 : outside of the horopter n
3D Computer Vision and Video Computing Essential Matrix n
Essential Matrix E = RS l A natural link between the stereo point
pair and the extrinsic parameters of the stereo system n One
correspondence -> a linear equation of 9 entries n Given 8 pairs
of (pl, pr) -> E l Mapping between points and epipolar lines we
are looking for n Given p l, E -> p r on the projective line in
the right plane n Equation represents the epipolar line of pr (or
pl) in the right (or left) image n Note: l pl, pr are in the camera
coordinate system, not pixel coordinates that we can measure
Slide 28
3D Computer Vision and Video Computing Fundamental Matrix n
Mapping between points and epipolar lines in the pixel coordinate
systems l With no prior knowledge on the stereo system n From
Camera to Pixels: Matrices of intrinsic parameters n Questions: l
What are fx, fy, ox, oy ? l How to measure p l in images? Rank (M
int ) =3
Slide 29
3D Computer Vision and Video Computing Fundamental Matrix n
Fundamental Matrix l Rank (F) = 2 l Encodes info on both intrinsic
and extrinsic parameters l Enables full reconstruction of the
epipolar geometry l In pixel coordinate systems without any
knowledge of the intrinsic and extrinsic parameters l Linear
equation of the 9 entries of F
Slide 30
3D Computer Vision and Video Computing Computing F: The
Eight-point Algorithm n Input: n point correspondences ( n >= 8)
l Construct homogeneous system Ax= 0 from n x = (f 11,f 12,,f 13, f
21,f 22,f 23 f 31,f 32, f 33 ) : entries in F n Each correspondence
give one equation n A is a nx9 matrix l Obtain estimate F^ by SVD
of A n x (up to a scale) is column of V corresponding to the least
singular value l Enforce singularity constraint: since Rank (F) = 2
n Compute SVD of F^ n Set the smallest singular value to 0: D ->
D n Correct estimate of F : n Output: the estimate of the
fundamental matrix, F n Similarly we can compute E given intrinsic
parameters
Slide 31
3D Computer Vision and Video Computing Locating the Epipoles
from F n Input: Fundamental Matrix F l Find the SVD of F l The
epipole e l is the column of V corresponding to the null singular
value (as shown above) l The epipole e r is the column of U
corresponding to the null singular value n Output: Epipole e l and
e r e l lies on all the epipolar lines of the left image F is not
identically zero For every p r p l p r P OlOl OrOr elel erer PlPl
PrPr Epipolar Plane Epipolar Lines Epipoles
Slide 32
3D Computer Vision and Video Computing Break n Homework #4
online, due on May 03 before class
Slide 33
3D Computer Vision and Video Computing Stereo Rectification n
Rectification l Given a stereo pair, the intrinsic and extrinsic
parameters, find the image transformation to achieve a stereo
system of horizontal epipolar lines l A simple algorithm: Assuming
calibrated stereo cameras p l r P OlOl OrOr X r PlPl PrPr Z l Y l Y
r TX l Z r n Stereo System with Parallel Optical Axes n Epipoles
are at infinity n Horizontal epipolar lines
Slide 34
3D Computer Vision and Video Computing Stereo Rectification n
Algorithm l Rotate both left and right camera so that they share
the same X axis : O r -O l = T l Define a rotation matrix R rect
for the left camera l Rotation Matrix for the right camera is R
rect R T l Rotation can be implemented by image transformation p l
p r P OlOl OrOr XlXl XrXr PlPl PrPr ZlZl YlYl ZrZr YrYr R, T TX l X
l = T, Y l = X l xZ l, Z l = X l xY l
Slide 35
3D Computer Vision and Video Computing Stereo Rectification n
Algorithm l Rotate both left and right camera so that they share
the same X axis : O r -O l = T l Define a rotation matrix R rect
for the left camera l Rotation Matrix for the right camera is R
rect R T l Rotation can be implemented by image transformation p l
p r P OlOl OrOr XlXl XrXr PlPl PrPr ZlZl YlYl ZrZr YrYr R, T TX l X
l = T, Y l = X l xZ l, Z l = X l xY l
Slide 36
3D Computer Vision and Video Computing Stereo Rectification n
Algorithm l Rotate both left and right camera so that they share
the same X axis : O r -O l = T l Define a rotation matrix R rect
for the left camera l Rotation Matrix for the right camera is R
rect R T l Rotation can be implemented by image transformation ZrZr
p l r P OlOl OrOr X r PlPl PrPr Z l Y l Y r R, T TX l T = (B, 0,
0), P r = P l T
Slide 37
3D Computer Vision and Video Computing Epipolar Geometry:
Summary n Purpose l where to search correspondences n Epipolar
plane, epipolar lines, and epipoles l known intrinsic (f) and
extrinsic (R, T) n co-planarity equation l known intrinsic but
unknown extrinsic n essential matrix l unknown intrinsic and
extrinsic n fundamental matrix n Rectification l Generate stereo
pair (by software) with parallel optical axis and thus horizontal
epipolar lines
Slide 38
3D Computer Vision and Video Computing Part II. Correspondence
problem n Three Questions l What to match? n Features: point, line,
area, structure? l Where to search correspondence? n Epipolar line?
l How to measure similarity? n Depends on features n Approaches l
Correlation-based approach l Feature-based approach n Advanced
Topics l Image filtering to handle illumination changes l Adaptive
windows to deal with multiple disparities l Local warping to
account for perspective distortion l Sub-pixel matching to improve
accuracy l Self-consistency to reduce false matches l
Multi-baseline stereo
Slide 39
3D Computer Vision and Video Computing Correlation Approach n
For Each point (x l, y l ) in the left image, define a window
centered at the point (x l, y l ) LEFT IMAGE
Slide 40
3D Computer Vision and Video Computing Correlation Approach n
search its corresponding point within a search region in the right
image (x l, y l ) RIGHT IMAGE
Slide 41
3D Computer Vision and Video Computing Correlation Approach n
the disparity (dx, dy) is the displacement when the correlation is
maximum (x l, y l )dx(x r, y r ) RIGHT IMAGE
Slide 42
3D Computer Vision and Video Computing Correlation Approach n
Elements to be matched l Image window of fixed size centered at
each pixel in the left image n Similarity criterion l A measure of
similarity between windows in the two images l The corresponding
element is given by window that maximizes the similarity criterion
within a search region n Search regions l Theoretically, search
region can be reduced to a 1-D segment, along the epipolar line,
and within the disparity range. l In practice, search a slightly
larger region due to errors in calibration
Slide 43
3D Computer Vision and Video Computing Correlation Approach n
Equations n disparity n Similarity criterion l Cross-Correlation l
Sum of Square Difference (SSD) l Sum of Absolute
Difference(SAD)
Slide 44
3D Computer Vision and Video Computing Correlation Approach n
PROS l Easy to implement l Produces dense disparity map l Maybe
slow n CONS l Needs textured images to work well l Inadequate for
matching image pairs from very different viewpoints due to
illumination changes l Window may cover points with quite different
disparities l Inaccurate disparities on the occluding
boundaries
Slide 45
3D Computer Vision and Video Computing Correlation Approach n A
Stereo Pair of UMass Campus texture, boundaries and occlusion
Slide 46
3D Computer Vision and Video Computing Feature-based Approach n
Features l Edge points l Lines (length, orientation, average
contrast) l Corners n Matching algorithm l Extract features in the
stereo pair l Define similarity measure l Search correspondences
using similarity measure and the epipolar geometry
Slide 47
3D Computer Vision and Video Computing Feature-based Approach n
For each feature in the left image LEFT IMAGE corner line
structure
Slide 48
3D Computer Vision and Video Computing Feature-based Approach n
Search in the right image the disparity (dx, dy) is the
displacement when the similarity measure is maximum RIGHT IMAGE
corner line structure
Slide 49
3D Computer Vision and Video Computing Feature-based Approach n
PROS l Relatively insensitive to illumination changes l Good for
man-made scenes with strong lines but weak texture or textureless
surfaces l Work well on the occluding boundaries (edges) l Could be
faster than the correlation approach n CONS l Only sparse depth map
l Feature extraction may be tricky n Lines (Edges) might be
partially extracted in one image n How to measure the similarity
between two lines?
Slide 50
3D Computer Vision and Video Computing Break n Homework #4
online, due on May 03 before class
Slide 51
3D Computer Vision and Video Computing Advanced Topics n Mainly
used in correlation-based approach, but can be applied to
feature-based match n Image filtering to handle illumination
changes l Image equalization n To make two images more similar in
illumination l Laplacian filtering (2 nd order derivative) n Use
derivative rather than intensity (or original color)
Slide 52
3D Computer Vision and Video Computing Advanced Topics n
Adaptive windows to deal with multiple disparities l Adaptive
Window Approach (Kanade and Okutomi) n statistically adaptive
technique which selects at each pixel the window size that
minimizes the uncertainty in disparity estimates n A Stereo
Matching Algorithm with an Adaptive Window: Theory and Experiment,
T. Kanade and M. Okutomi. Proc. 1991 IEEE International Conference
on Robotics and Automation, Vol. 2, April, 1991, pp. 1088-1095 A
Stereo Matching Algorithm with an Adaptive Window: Theory and
ExperimentT. Kanade l Multiple window algorithm (Fusiello, et al) n
Use 9 windows instead of just one to compute the SSD measure n The
point with the smallest SSD error amongst the 9 windows and various
search locations is chosen as the best estimate for the given
points n A Fusiello, V. Roberto and E. Trucco, Efficient stereo
with multiple windowing, IEEE CVPR pp858-863, 1997
Slide 53
3D Computer Vision and Video Computing Advanced Topics n
Multiple windows to deal with multiple disparities Smooth regions
Corners edges near far
Slide 54
3D Computer Vision and Video Computing Advanced Topics n
Sub-pixel matching to improve accuracy l Find the peak in the
correlation curves n Self-consistency to reduce false matches esp.
for occlusions l Check the consistency of matches from L to R and
from R to L n Multiple Resolution Approach l From coarse to fine
for efficiency in searching correspondences n Local warping to
account for perspective distortion l Warp from one view to the
other for a small patch given an initial estimation of the (planar)
surface normal n Multi-baseline Stereo l Improves both
correspondences and 3D estimation by using more than two cameras
(images)
Slide 55
3D Computer Vision and Video Computing 3D Reconstruction
Problem n What we have done l Correspondences using either
correlation or feature based approaches l Epipolar Geometry from at
least 8 point correspondences n Three cases of 3D reconstruction
depending on the amount of a priori knowledge on the stereo system
l Both intrinsic and extrinsic known - > can solve the
reconstruction problem unambiguously by triangulation l Only
intrinsic known -> recovery structure and extrinsic up to an
unknown scaling factor l Only correspondences -> reconstruction
only up to an unknown, global projective transformation (*)
Slide 56
3D Computer Vision and Video Computing Reconstruction by
Triangulation n Assumption and Problem l Under the assumption that
both intrinsic and extrinsic parameters are known l Compute the 3-D
location from their projections, pl and pr n Solution l
Triangulation: Two rays are known and the intersection can be
computed l Problem: Two rays will not actually intersect in space
due to errors in calibration and correspondences, and pixelization
l Solution: find a point in space with minimum distance from both
rays p p r P OlOl OrOr l
Slide 57
3D Computer Vision and Video Computing Reconstruction up to a
Scale Factor n Assumption and Problem Statement l Under the
assumption that only intrinsic parameters and more than 8 point
correspondences are given l Compute the 3-D location from their
projections, pl and pr, as well as the extrinsic parameters n
Solution l Compute the essential matrix E from at least 8
correspondences l Estimate T (up to a scale and a sign) from E
(=RS) using the orthogonal constraint of R, and then R n End up
with four different estimates of the pair (T, R) l Reconstruct the
depth of each point, and pick up the correct sign of R and T. l
Results: reconstructed 3D points (up to a common scale); l The
scale can be determined if distance of two points (in space) are
known
Slide 58
3D Computer Vision and Video Computing Reconstruction up to a
Projective Transformation n Assumption and Problem Statement l
Under the assumption that only n (>=8) point correspondences are
given l Compute the 3-D location from their projections, pl and pr
n Solution l Compute the Fundamental matrix F from at least 8
correspondences, and the two epipoles l Determine the projection
matrices n Select five points ( from correspondence pairs) as the
projective basis l Compute the projective reconstruction n Unique
up to the unknown projective transformation fixed by the choice of
the five points (* not required for this course; needs advanced
knowledge of projective geometry )
Slide 59
3D Computer Vision and Video Computing Summary n Fundamental
concepts and problems of stereo n Epipolar geometry and stereo
rectification n Estimation of fundamental matrix from 8 point pairs
n Correspondence problem and two techniques: correlation and
feature based matching n Reconstruct 3-D structure from image
correspondences given l Fully calibrated l Partially calibration l
Uncalibrated stereo cameras (*)
Slide 60
3D Computer Vision and Video Computing Next n Understanding 3D
structure and events from motion Motion n Homework #4 online, due
on May 03 before class