3D Computer Vision and Video Computing 3D Vision Topic 3 of Part II Stereo Vision CSc I6716 Spring...
Preview:
Citation preview
- Slide 1
- 3D Computer Vision and Video Computing 3D Vision Topic 3 of
Part II Stereo Vision CSc I6716 Spring 2011 Zhigang Zhu, City
College of New York zhu@cs.ccny.cuny.edu
- Slide 2
- 3D Computer Vision and Video Computing Stereo Vision n Problem
l Infer 3D structure of a scene from two or more images taken from
different viewpoints n Two primary Sub-problems l Correspondence
problem (stereo match) -> disparity map n Similar instead of
Same n Occlusion problem: some parts of the scene are visible only
in one eye l Reconstruction problem -> 3D n What we need to know
about the cameras parameters n Often a stereo calibration problem n
Lectures on Stereo Vision l Stereo Geometry Epipolar Geometry (*) l
Correspondence Problem (*) Two classes of approaches l 3D
Reconstruction Problems Three approaches
- Slide 3
- 3D Computer Vision and Video Computing A Stereo Pair n Problems
l Correspondence problem (stereo match) -> disparity map l
Reconstruction problem -> 3D CMU CIL Stereo Dataset : Castle
sequence
http://www-2.cs.cmu.edu/afs/cs/project/cil/ftp/html/cil-ster.html ?
3D?
- Slide 4
- 3D Computer Vision and Video Computing More Images n Problems l
Correspondence problem (stereo match) -> disparity map l
Reconstruction problem -> 3D
- Slide 5
- 3D Computer Vision and Video Computing More Images n Problems l
Correspondence problem (stereo match) -> disparity map l
Reconstruction problem -> 3D
- Slide 6
- 3D Computer Vision and Video Computing More Images n Problems l
Correspondence problem (stereo match) -> disparity map l
Reconstruction problem -> 3D
- Slide 7
- 3D Computer Vision and Video Computing More Images n Problems l
Correspondence problem (stereo match) -> disparity map l
Reconstruction problem -> 3D
- Slide 8
- 3D Computer Vision and Video Computing More Images n Problems l
Correspondence problem (stereo match) -> disparity map l
Reconstruction problem -> 3D
- Slide 9
- 3D Computer Vision and Video Computing Part I. Stereo Geometry
n A Simple Stereo Vision System l Disparity Equation l Depth
Resolution l Fixated Stereo System n Zero-disparity Horopter n
Epipolar Geometry l Epipolar lines Where to search correspondences
n Epipolar Plane, Epipolar Lines and Epipoles n
http://www.ai.sri.com/~luong/research/Meta3DViewer/EpipolarGeo.html
l Essential Matrix and Fundamental Matrix n Computing E & F by
the Eight-Point Algorithm n Computing the Epipoles n Stereo
Rectification
- Slide 10
- 3D Computer Vision and Video Computing Stereo Geometry n
Converging Axes Usual setup of human eyes n Depth obtained by
triangulation n Correspondence problem: p l and p r correspond to
the left and right projections of P, respectively. P(X,Y,Z)
- Slide 11
- 3D Computer Vision and Video Computing A Simple Stereo System Z
w =0 LEFT CAMERA Left image: reference Right image: target RIGHT
CAMERA Elevation Z w disparity Depth Z baseline
- Slide 12
- 3D Computer Vision and Video Computing Disparity Equation
P(X,Y,Z) p l (x l,y l ) Optical Center O l f = focal length Image
plane LEFT CAMERA B = Baseline Depth Stereo system with parallel
optical axes f = focal length Optical Center O r p r (x r,y r )
Image plane RIGHT CAMERA Disparity: dx = x r - x l
- Slide 13
- 3D Computer Vision and Video Computing Disparity vs. Baseline
P(X,Y,Z) p l (x l,y l ) Optical Center O l f = focal length Image
plane LEFT CAMERA B = Baseline Depth f = focal length Optical
Center O r p r (x r,y r ) Image plane RIGHT CAMERA Disparity dx = x
r - x l Stereo system with parallel optical axes
- Slide 14
- 3D Computer Vision and Video Computing Depth Accuracy n Given
the same image localization error l Angle of cones in the figure n
Depth Accuracy (Depth Resolution) vs. Baseline l Depth Error 1/B
(Baseline length) l PROS of Longer baseline, n better depth
estimation l CONS n smaller common FOV n Correspondence harder due
to occlusion n Depth Accuracy (Depth Resolution) vs. Depth l
Disparity (>0) 1/ Depth l Depth Error Depth 2 l Nearer the
point, better the depth estimation n An Example l f = 16 x 512/8
pixels, B = 0.5 m l Depth error vs. depth Z2Z2 Two viewpoints
Z2>Z1Z2>Z1 Z1Z1 Z1Z1 OlOl OrOr Absolute error Relative
error
- Slide 15
- 3D Computer Vision and Video Computing Stereo with Converging
Cameras n Stereo with Parallel Axes l Short baseline n large common
FOV n large depth error l Long baseline n small depth error n small
common FOV n More occlusion problems n Two optical axes intersect
at the Fixation Point converging angle l The common FOV Increases
FOV Leftright
- Slide 16
- 3D Computer Vision and Video Computing Stereo with Converging
Cameras n Stereo with Parallel Axes l Short baseline n large common
FOV n large depth error l Long baseline n small depth error n small
common FOV n More occlusion problems n Two optical axes intersect
at the Fixation Point converging angle l The common FOV Increases
FOV Leftright
- Slide 17
- 3D Computer Vision and Video Computing Stereo with Converging
Cameras n Two optical axes intersect at the Fixation Point
converging angle l The common FOV Increases n Disparity properties
l Disparity uses angle instead of distance l Zero disparity at
fixation point n and the Zero-disparity horopter l Disparity
increases with the distance of objects from the fixation points n
>0 : outside of the horopter n
- 3D Computer Vision and Video Computing Stereo with Converging
Cameras n Two optical axes intersect at the Fixation Point
converging angle l The common FOV Increases n Disparity properties
l Disparity uses angle instead of distance l Zero disparity at
fixation point n and the Zero-disparity horopter l Disparity
increases with the distance of objects from the fixation points n
>0 : outside of the horopter n
- 3D Computer Vision and Video Computing Stereo with Converging
Cameras n Two optical axes intersect at the Fixation Point
converging angle l The common FOV Increases n Disparity properties
l Disparity uses angle instead of distance l Zero disparity at
fixation point n and the Zero-disparity horopter l Disparity
increases with the distance of objects from the fixation points n
>0 : outside of the horopter n l d > 0 Horopter
- Slide 20
- 3D Computer Vision and Video Computing Stereo with Converging
Cameras n Two optical axes intersect at the Fixation Point
converging angle l The common FOV Increases n Disparity properties
l Disparity uses angle instead of distance l Zero disparity at
fixation point n and the Zero-disparity horopter l Disparity
increases with the distance of objects from the fixation points n
>0 : outside of the horopter n
- 3D Computer Vision and Video Computing Stereo with Converging
Cameras n Two optical axes intersect at the Fixation Point
converging angle l The common FOV Increases n Disparity properties
l Disparity uses angle instead of distance l Zero disparity at
fixation point n and the Zero-disparity horopter l Disparity
increases with the distance of objects from the fixation points n
>0 : outside of the horopter n
- 3D Computer Vision and Video Computing Essential Matrix n
Essential Matrix E = RS l A natural link between the stereo point
pair and the extrinsic parameters of the stereo system n One
correspondence -> a linear equation of 9 entries n Given 8 pairs
of (pl, pr) -> E l Mapping between points and epipolar lines we
are looking for n Given p l, E -> p r on the projective line in
the right plane n Equation represents the epipolar line of pr (or
pl) in the right (or left) image n Note: l pl, pr are in the camera
coordinate system, not pixel coordinates that we can measure
- Slide 28
- 3D Computer Vision and Video Computing Fundamental Matrix n
Mapping between points and epipolar lines in the pixel coordinate
systems l With no prior knowledge on the stereo system n From
Camera to Pixels: Matrices of intrinsic parameters n Questions: l
What are fx, fy, ox, oy ? l How to measure p l in images? Rank (M
int ) =3
- Slide 29
- 3D Computer Vision and Video Computing Fundamental Matrix n
Fundamental Matrix l Rank (F) = 2 l Encodes info on both intrinsic
and extrinsic parameters l Enables full reconstruction of the
epipolar geometry l In pixel coordinate systems without any
knowledge of the intrinsic and extrinsic parameters l Linear
equation of the 9 entries of F
- Slide 30
- 3D Computer Vision and Video Computing Computing F: The
Eight-point Algorithm n Input: n point correspondences ( n >= 8)
l Construct homogeneous system Ax= 0 from n x = (f 11,f 12,,f 13, f
21,f 22,f 23 f 31,f 32, f 33 ) : entries in F n Each correspondence
give one equation n A is a nx9 matrix l Obtain estimate F^ by SVD
of A n x (up to a scale) is column of V corresponding to the least
singular value l Enforce singularity constraint: since Rank (F) = 2
n Compute SVD of F^ n Set the smallest singular value to 0: D ->
D n Correct estimate of F : n Output: the estimate of the
fundamental matrix, F n Similarly we can compute E given intrinsic
parameters
- Slide 31
- 3D Computer Vision and Video Computing Locating the Epipoles
from F n Input: Fundamental Matrix F l Find the SVD of F l The
epipole e l is the column of V corresponding to the null singular
value (as shown above) l The epipole e r is the column of U
corresponding to the null singular value n Output: Epipole e l and
e r e l lies on all the epipolar lines of the left image F is not
identically zero For every p r p l p r P OlOl OrOr elel erer PlPl
PrPr Epipolar Plane Epipolar Lines Epipoles
- Slide 32
- 3D Computer Vision and Video Computing Break n Homework #4
online, due on May 03 before class
- Slide 33
- 3D Computer Vision and Video Computing Stereo Rectification n
Rectification l Given a stereo pair, the intrinsic and extrinsic
parameters, find the image transformation to achieve a stereo
system of horizontal epipolar lines l A simple algorithm: Assuming
calibrated stereo cameras p l r P OlOl OrOr X r PlPl PrPr Z l Y l Y
r TX l Z r n Stereo System with Parallel Optical Axes n Epipoles
are at infinity n Horizontal epipolar lines
- Slide 34
- 3D Computer Vision and Video Computing Stereo Rectification n
Algorithm l Rotate both left and right camera so that they share
the same X axis : O r -O l = T l Define a rotation matrix R rect
for the left camera l Rotation Matrix for the right camera is R
rect R T l Rotation can be implemented by image transformation p l
p r P OlOl OrOr XlXl XrXr PlPl PrPr ZlZl YlYl ZrZr YrYr R, T TX l X
l = T, Y l = X l xZ l, Z l = X l xY l
- Slide 35
- 3D Computer Vision and Video Computing Stereo Rectification n
Algorithm l Rotate both left and right camera so that they share
the same X axis : O r -O l = T l Define a rotation matrix R rect
for the left camera l Rotation Matrix for the right camera is R
rect R T l Rotation can be implemented by image transformation p l
p r P OlOl OrOr XlXl XrXr PlPl PrPr ZlZl YlYl ZrZr YrYr R, T TX l X
l = T, Y l = X l xZ l, Z l = X l xY l
- Slide 36
- 3D Computer Vision and Video Computing Stereo Rectification n
Algorithm l Rotate both left and right camera so that they share
the same X axis : O r -O l = T l Define a rotation matrix R rect
for the left camera l Rotation Matrix for the right camera is R
rect R T l Rotation can be implemented by image transformation ZrZr
p l r P OlOl OrOr X r PlPl PrPr Z l Y l Y r R, T TX l T = (B, 0,
0), P r = P l T
- Slide 37
- 3D Computer Vision and Video Computing Epipolar Geometry:
Summary n Purpose l where to search correspondences n Epipolar
plane, epipolar lines, and epipoles l known intrinsic (f) and
extrinsic (R, T) n co-planarity equation l known intrinsic but
unknown extrinsic n essential matrix l unknown intrinsic and
extrinsic n fundamental matrix n Rectification l Generate stereo
pair (by software) with parallel optical axis and thus horizontal
epipolar lines
- Slide 38
- 3D Computer Vision and Video Computing Part II. Correspondence
problem n Three Questions l What to match? n Features: point, line,
area, structure? l Where to search correspondence? n Epipolar line?
l How to measure similarity? n Depends on features n Approaches l
Correlation-based approach l Feature-based approach n Advanced
Topics l Image filtering to handle illumination changes l Adaptive
windows to deal with multiple disparities l Local warping to
account for perspective distortion l Sub-pixel matching to improve
accuracy l Self-consistency to reduce false matches l
Multi-baseline stereo
- Slide 39
- 3D Computer Vision and Video Computing Correlation Approach n
For Each point (x l, y l ) in the left image, define a window
centered at the point (x l, y l ) LEFT IMAGE
- Slide 40
- 3D Computer Vision and Video Computing Correlation Approach n
search its corresponding point within a search region in the right
image (x l, y l ) RIGHT IMAGE
- Slide 41
- 3D Computer Vision and Video Computing Correlation Approach n
the disparity (dx, dy) is the displacement when the correlation is
maximum (x l, y l )dx(x r, y r ) RIGHT IMAGE
- Slide 42
- 3D Computer Vision and Video Computing Correlation Approach n
Elements to be matched l Image window of fixed size centered at
each pixel in the left image n Similarity criterion l A measure of
similarity between windows in the two images l The corresponding
element is given by window that maximizes the similarity criterion
within a search region n Search regions l Theoretically, search
region can be reduced to a 1-D segment, along the epipolar line,
and within the disparity range. l In practice, search a slightly
larger region due to errors in calibration
- Slide 43
- 3D Computer Vision and Video Computing Correlation Approach n
Equations n disparity n Similarity criterion l Cross-Correlation l
Sum of Square Difference (SSD) l Sum of Absolute
Difference(SAD)
- Slide 44
- 3D Computer Vision and Video Computing Correlation Approach n
PROS l Easy to implement l Produces dense disparity map l Maybe
slow n CONS l Needs textured images to work well l Inadequate for
matching image pairs from very different viewpoints due to
illumination changes l Window may cover points with quite different
disparities l Inaccurate disparities on the occluding
boundaries
- Slide 45
- 3D Computer Vision and Video Computing Correlation Approach n A
Stereo Pair of UMass Campus texture, boundaries and occlusion
- Slide 46
- 3D Computer Vision and Video Computing Feature-based Approach n
Features l Edge points l Lines (length, orientation, average
contrast) l Corners n Matching algorithm l Extract features in the
stereo pair l Define similarity measure l Search correspondences
using similarity measure and the epipolar geometry
- Slide 47
- 3D Computer Vision and Video Computing Feature-based Approach n
For each feature in the left image LEFT IMAGE corner line
structure
- Slide 48
- 3D Computer Vision and Video Computing Feature-based Approach n
Search in the right image the disparity (dx, dy) is the
displacement when the similarity measure is maximum RIGHT IMAGE
corner line structure
- Slide 49
- 3D Computer Vision and Video Computing Feature-based Approach n
PROS l Relatively insensitive to illumination changes l Good for
man-made scenes with strong lines but weak texture or textureless
surfaces l Work well on the occluding boundaries (edges) l Could be
faster than the correlation approach n CONS l Only sparse depth map
l Feature extraction may be tricky n Lines (Edges) might be
partially extracted in one image n How to measure the similarity
between two lines?
- Slide 50
- 3D Computer Vision and Video Computing Break n Homework #4
online, due on May 03 before class
- Slide 51
- 3D Computer Vision and Video Computing Advanced Topics n Mainly
used in correlation-based approach, but can be applied to
feature-based match n Image filtering to handle illumination
changes l Image equalization n To make two images more similar in
illumination l Laplacian filtering (2 nd order derivative) n Use
derivative rather than intensity (or original color)
- Slide 52
- 3D Computer Vision and Video Computing Advanced Topics n
Adaptive windows to deal with multiple disparities l Adaptive
Window Approach (Kanade and Okutomi) n statistically adaptive
technique which selects at each pixel the window size that
minimizes the uncertainty in disparity estimates n A Stereo
Matching Algorithm with an Adaptive Window: Theory and Experiment,
T. Kanade and M. Okutomi. Proc. 1991 IEEE International Conference
on Robotics and Automation, Vol. 2, April, 1991, pp. 1088-1095 A
Stereo Matching Algorithm with an Adaptive Window: Theory and
ExperimentT. Kanade l Multiple window algorithm (Fusiello, et al) n
Use 9 windows instead of just one to compute the SSD measure n The
point with the smallest SSD error amongst the 9 windows and various
search locations is chosen as the best estimate for the given
points n A Fusiello, V. Roberto and E. Trucco, Efficient stereo
with multiple windowing, IEEE CVPR pp858-863, 1997
- Slide 53
- 3D Computer Vision and Video Computing Advanced Topics n
Multiple windows to deal with multiple disparities Smooth regions
Corners edges near far
- Slide 54
- 3D Computer Vision and Video Computing Advanced Topics n
Sub-pixel matching to improve accuracy l Find the peak in the
correlation curves n Self-consistency to reduce false matches esp.
for occlusions l Check the consistency of matches from L to R and
from R to L n Multiple Resolution Approach l From coarse to fine
for efficiency in searching correspondences n Local warping to
account for perspective distortion l Warp from one view to the
other for a small patch given an initial estimation of the (planar)
surface normal n Multi-baseline Stereo l Improves both
correspondences and 3D estimation by using more than two cameras
(images)
- Slide 55
- 3D Computer Vision and Video Computing 3D Reconstruction
Problem n What we have done l Correspondences using either
correlation or feature based approaches l Epipolar Geometry from at
least 8 point correspondences n Three cases of 3D reconstruction
depending on the amount of a priori knowledge on the stereo system
l Both intrinsic and extrinsic known - > can solve the
reconstruction problem unambiguously by triangulation l Only
intrinsic known -> recovery structure and extrinsic up to an
unknown scaling factor l Only correspondences -> reconstruction
only up to an unknown, global projective transformation (*)
- Slide 56
- 3D Computer Vision and Video Computing Reconstruction by
Triangulation n Assumption and Problem l Under the assumption that
both intrinsic and extrinsic parameters are known l Compute the 3-D
location from their projections, pl and pr n Solution l
Triangulation: Two rays are known and the intersection can be
computed l Problem: Two rays will not actually intersect in space
due to errors in calibration and correspondences, and pixelization
l Solution: find a point in space with minimum distance from both
rays p p r P OlOl OrOr l
- Slide 57
- 3D Computer Vision and Video Computing Reconstruction up to a
Scale Factor n Assumption and Problem Statement l Under the
assumption that only intrinsic parameters and more than 8 point
correspondences are given l Compute the 3-D location from their
projections, pl and pr, as well as the extrinsic parameters n
Solution l Compute the essential matrix E from at least 8
correspondences l Estimate T (up to a scale and a sign) from E
(=RS) using the orthogonal constraint of R, and then R n End up
with four different estimates of the pair (T, R) l Reconstruct the
depth of each point, and pick up the correct sign of R and T. l
Results: reconstructed 3D points (up to a common scale); l The
scale can be determined if distance of two points (in space) are
known
- Slide 58
- 3D Computer Vision and Video Computing Reconstruction up to a
Projective Transformation n Assumption and Problem Statement l
Under the assumption that only n (>=8) point correspondences are
given l Compute the 3-D location from their projections, pl and pr
n Solution l Compute the Fundamental matrix F from at least 8
correspondences, and the two epipoles l Determine the projection
matrices n Select five points ( from correspondence pairs) as the
projective basis l Compute the projective reconstruction n Unique
up to the unknown projective transformation fixed by the choice of
the five points (* not required for this course; needs advanced
knowledge of projective geometry )
- Slide 59
- 3D Computer Vision and Video Computing Summary n Fundamental
concepts and problems of stereo n Epipolar geometry and stereo
rectification n Estimation of fundamental matrix from 8 point pairs
n Correspondence problem and two techniques: correlation and
feature based matching n Reconstruct 3-D structure from image
correspondences given l Fully calibrated l Partially calibration l
Uncalibrated stereo cameras (*)
- Slide 60
- 3D Computer Vision and Video Computing Next n Understanding 3D
structure and events from motion Motion n Homework #4 online, due
on May 03 before class