Upload
barrie-spencer
View
217
Download
0
Embed Size (px)
Citation preview
Stereo Vision
ECE 847:Digital Image Processing
Stan BirchfieldClemson University
Modeling from multiple views
time
# cameras
photographbinocular stereo
trinocular stereo
multi-baseline stereo
camcorder
human vision
camera dome
two frames ...
...
– Greek for solid
Stereoscope
Invented by Wheatstone in 1838
Modern version
Can you fuse these?
rightleft
No special instrument needed
Just relax your eyes
L
R
Random dot stereogram
invented by Bela Juleszin 1959
http://www.magiceye.com/faq.htm
Autostereogram
Do you see the shark?
http://en.wikipedia.org/wiki/Autostereogram
Can you cross-fuse these?
right leftNote: Cross-fusion is necessary if distance between images
is greater than inter-ocular distance
L
R
R
L
impossible: instead, trickthe brain:
Tsukuba stereo images courtesy of Y. Ohta and Y. Nakamura at the University of Tsukuba
Human stereo geometry
http://webvision.med.utah.edu/space_perception.html
fixationpoint
corresponding points
aR
aL
disparity
Horopter
• Horopter: surfacewhere disparity is zero
• For round retina,the theoretical horopteris a circle(Vieth-Muller circle)
http://webvision.med.utah.edu/space_perception.html
Cyclopean image
http://webvision.med.utah.edu/space_perception.html
http://bearah718.tripod.com/sitebuildercontent/sitebuilderpictures/cyclops.jpg
Panum’s fusional area (volume)
• Human visual system is only capable of fusing the two images with a narrow range of disparities around fixation point
• This area (volume) is Panum’s fusional area
• Outside this area we get double-vision (diplopia)
http://www.allaboutvision.com/conditions/double-vision.htm
Human visual pathway
photos courtesy California Academy of Science
Cheetah:More accurate
depthestimation
Antelope:larger field
of view
Prey and predator
Stereo geometry for pinhole cameras with flat retinas
C,C’,x,x’ and X are coplanar
Left camera Right camera
world point
center ofprojection
epipolarplane epipolar line for x
epipole
baseline
M. Pollefeys, http://www.cs.unc.edu/Research/vision/comp256fall03/
epipoles e,e’= intersection of baseline with image plane = projection of projection center in other image= vanishing point of camera motion direction
an epipolar plane = plane containing baseline (1-D family)
an epipolar line = intersection of epipolar plane with image(always come in corresponding pairs)
Epipolar geometry
M. Pollefeys, http://www.cs.unc.edu/Research/vision/comp256fall03/
What if only C,C’,x are known?
Epipolar geometry
epipole
centerof
projection baseline
All points on p project on l and l’M. Pollefeys, http://www.cs.unc.edu/Research/vision/comp256fall03/
Family of planes and lines l and l’ Intersection in e and e’
Epipolar geometry
M. Pollefeys, http://www.cs.unc.edu/Research/vision/comp256fall03/
Example: Converging cameras
M. Pollefeys, http://www.cs.unc.edu/Research/vision/comp256fall03/
e
e’
Example: Forward motion
M. Pollefeys, http://www.cs.unc.edu/Research/vision/comp256fall03/
Epipolar geometry andFundamental matrix
epipolar line
(epipole: intersection of all epipolar lines)
(computed with 20 points)
Epipolar geometry andFundamental matrix
epipolar line
(epipole: intersection of all epipolar lines)
(computed with 50 points)
Fundamental matrix
point in image 1
point in image 2
fundamentalmatrix
Epipolar lines (1)
epipolar line in image 2associated with (x,y) in image 1
Epipolar lines (2)
epipolar line in image 1associated with (x’,y’) in image 2
Computing the fundamental matrix
knownunknown
Computing the fundamental matrix
knownunknown
Computing the fundamental matrix
1. Construct A (nx9) from correspondences2. Compute SVD of A: A = UVT
3. Unit-norm solution of Af=0 is given by vn (the right-most singular vector of A)
4. Reshape f into F1
5. Compute SVD of F1: F1=UFFVFT
6. Set F(3,3)=0 to get F’(The enforces rank(F)=2)
7. Reconstruct F=UFF’VFT
8. Now xTF is epipolar line in image 2,and Fx’ is epipolar line in image 1
(simple for stereo rectification)
Example: Motion parallel to image plane
M. Pollefeys, http://www.cs.unc.edu/Research/vision/comp256fall03/
Example: Motion parallel to image scanlines
Epipoles are at infinity
Scanlines are the epipolar lines
In this case, the images are said to be “rectified”
Tsukuba stereo images courtesy of Y. Ohta and Y. Nakamura at the University of Tsukuba
Standard (rectified) stereo geometry
pure translation along X-axis
M. Pollefeys, http://www.cs.unc.edu/Research/vision/comp256fall03/
Perspective projection
X
x
X
x
M. Pollefeys, http://www.cs.unc.edu/Research/vision/comp256fall03/
Rectified geometry
xL xR
XL -XR
b
Standard stereo geometry
• disparity is inversely proportional to depth• stereo vision is less useful for distant objects
M. Pollefeys, http://www.cs.unc.edu/Research/vision/comp256fall03/
Binocular rectified stereo
rightleft
disparity map depth discontinuities
epipolarconstraint
Matching scanlines
inte
nsi
ty
L
R
dis
par
ity
lamp
wall
pixel
rightleft
Stereo matching
• Search is limited to epipolar line (1D)• Look for most similar pixel
?
M. Pollefeys, http://www.cs.unc.edu/Research/vision/comp256fall03/
Aggregation
• Use more than one pixel• Assume neighbors have similar
disparities*
– Use correlation window containing pixel
– Allows to use SSD, ZNCC, Census, etc.
M. Pollefeys, http://www.cs.unc.edu/Research/vision/comp256fall03/
Block matching
Dissimilarity measures
Connection between SSD and cross correlation:
Also normalized correlation, rank, census, sampling-insensitive ...
Most common:
More efficient implementation
Key idea: Summation over window is correlation with box filter, which is separable
Running sum improves efficiency even more
Note: w is half-width
Compare intensities pixel-by-pixel
Comparing image regions
I(x,y) I´(x,y)
Sum of Square Differences
Dissimilarity measures
Note: SAD is fast approximation (replace square with absolute value)
M. Pollefeys, http://www.cs.unc.edu/Research/vision/comp256fall03/
Compare intensities pixel-by-pixel
Comparing image regions
I(x,y) I´(x,y)
Dissimilarity measures
If energy does not change much, then minimizing SSD equals maximizing cross-correlation:
M. Pollefeys, http://www.cs.unc.edu/Research/vision/comp256fall03/
Compare intensities pixel-by-pixel
Comparing image regions
I(x,y) I´(x,y)
Zero-mean Normalized Cross Correlation
Similarity measures
M. Pollefeys, http://www.cs.unc.edu/Research/vision/comp256fall03/
Compare intensities pixel-by-pixel
Comparing image regions
I(x,y) I´(x,y)
Census
Similarity measures
125 126 125
127 128 130
129 132 135
0 0 0
0 1
1 1 1
(Real-time chip from TYZX based on Census)
only compare bit signature
M. Pollefeys, http://www.cs.unc.edu/Research/vision/comp256fall03/
Sampling-InsensitivePixel Dissimilarity
d(xL,xR)
xL xR
d(xL,xR) = min{d(xL,xR) ,d(xR,xL)}Our dissimilarity measure:
[Birchfield & Tomasi 1998]
IL IR
Given: An interval A such that [xL – ½ , xL + ½] _ A, and
[xR – ½ , xR + ½] _ A
Dissimilarity Measure Theorems
If | xL – xR | ≤ ½, then d(xL,xR) = 0
| xL – xR | ≤ ½ iff d(xL,xR) = 0
∩∩
Theorem 1:
Theorem 2:
(when A is convex or concave)
(when A is linear)
Aggregation window sizes
Small windows • disparities similar• more ambiguities• accurate when correct
Large windows • larger disp. variation• more discriminant• often more robust• use shiftable windows to
deal with discontinuities
(Illustration from Pascal Fua)
Occlusions
(Slide from Pascal Fua)
Left-right consistency check
• Search left-to-right, then right-to-left• Retain disparity only if they agree
xL
d
Do minima coincide?
Results: correlation
disparity mapleft
with left-right consistency check
Constraints
• Epipolar – match must lie on epipolar line• Piecewise constancy – neighboring pixels should usually
have same disparity• Piecewise continuity – neighboring pixels should usually
have similar disparity• Disparity – impose allowable range of disparities (Panum’s
fusional area)• Disparity gradient – restricts slope of disparity• Figural continuity – disparity of edges across scanlines• Uniqueness – each pixel has no more than one match
(violated by windows and mirrors)• Ordering – disparity function is monotonic (precludes thin
poles)
Exploiting scene constraints
M. Pollefeys, http://www.cs.unc.edu/Research/vision/comp256fall03/
Ordering constraint
11 22 33 4,54,5 66 11 2,32,3 44 55 66
2211 33 4,54,5 6611
2,32,3
44
55
66
surface slicesurface slice surface as a pathsurface as a path
occlusion right
occlusion left
M. Pollefeys, http://www.cs.unc.edu/Research/vision/comp256fall03/
Uniqueness constraint
• In an image pair each pixel has at most one corresponding pixel– In general one corresponding pixel– In case of occlusion there is none
M. Pollefeys, http://www.cs.unc.edu/Research/vision/comp256fall03/
Disparity constraint
surface slicesurface slice surface as a pathsurface as a path
bounding box
dispa
rity b
and
use reconstructed features to determine bounding box
constantdisparitysurfaces
M. Pollefeys, http://www.cs.unc.edu/Research/vision/comp256fall03/
Figural continuity constraint
right left
[University of Tsukuba]
Cooperative algorithm
Disparity space image
Dynamic Programming: 1D Search
Dis
par
ity
map
occlusion
depthdiscontinuity
RIGHTL
EF
T
c a r t
ca
t 3 2 1 1 12 1 0 1 21 0 1 2 30 1 2 3 4
string editing:
stereo matching:
penalties: mismatch = 1 insertion = 1 deletion = 1
c a t
c a r t
Minimizing a 2D Cost FunctionalMinimize:
disparity
p,q
p,q
d(p, )2D:
GL
OB
AL
dis
par
ity
pixel?
p,q
d(p, )
1D:
E E d(p, ) u(l )data smoothness p p,q p,q
{p,q} N
Global
u(l )p,q
Discontinuity penalty:
lp,q
minimum cut = disparity surface
u(l )= lp,q p,q p,qsolves
LO
CA
L Local (GOOD)
(BAD)
Multiway-Cut:2D Search
pixels
labels
pixels
labels
[Boykov, Veksler, Zabih 1998]
Multiway-Cut Algorithm
),( x'x ))(, x(x fg
minimum cut
),(
)]()()[,())(,x'xx
x'xx'xx(x fffg Minimizes
source label
sink label
pixels
(cost of label discontinuity)
(cost of assigninglabel to pixel)
pixels
labels
Energy minimization
(Slide from Pascal Fua)
Graph Cut
(Slide from Pascal Fua)
(general formulation requires multi-way cut!)
(Boykov et al ICCV‘99)
(Roy and Cox ICCV‘98)
Simplified graph cut
Correspondence as Segmentation
• Problem: disparities (fronto-parallel) O()surfaces (slanted) O( 2 n)=> computationally intractable!
• Solution: iteratively determine which labels to use
labelpixels
find affineparametersof regions
multiway-cut(Expectation)
Newton-Raphson(Maximization)
Stereo Results (Dynamic Programming)
Stereo Results (Multiway-Cut)
Stereo Results on Middlebury Database
imag
eB
irch
fiel
dT
om
asi 1
999
Ho
ng
-C
hen
200
4
Untextured regions remain a challenge
Multiway-cutDynamic programming
Results: dynamic programming
disparity map
[Bobick & Intille]
left
Results: multiway cut
disparity mapleft
[Kolmogorov & Zabih]
Results: multiway cut (untextured)
disparity map
Multi-camera configurations
Okutami and Kanade
(illustration from Pascal Fua)
M. Pollefeys, http://www.cs.unc.edu/Research/vision/comp256fall03/
Example: Tsukuba
Tsukuba dataset
Real-time stereo on GPU• Computes Sum-of-Square-Differences (use
pixelshader)• Hardware mip-map generation for aggregation over
window• Trade-off between small and large support window
(Yang and Pollefeys, CVPR2003)
290M disparity hypothesis/sec (Radeon9800pro)e.g. 512x512x36disparities at 30Hz
GPU is great for vision too!
Stereo matching
Optimal path(dynamic programming )
Similarity measure(SSD or NCC)
Constraints• epipolar
• ordering
• uniqueness
• disparity limit
Trade-off
• Matching cost (data)
• Discontinuities (prior)
Consider all paths that satisfy the constraints
pick best using dynamic programming
M. Pollefeys, http://www.cs.unc.edu/Research/vision/comp256fall03/
Hierarchical stereo matching
Dow
nsam
plin
g
(Gau
ssia
n p
yra
mid
)
Dis
pari
ty p
rop
ag
ati
on
Allows faster computation
Deals with large disparity ranges
M. Pollefeys, http://www.cs.unc.edu/Research/vision/comp256fall03/
Disparity map
image I(x,y) image I´(x´,y´)Disparity map D(x,y)
(x´,y´)=(x+D(x,y),y)
M. Pollefeys, http://www.cs.unc.edu/Research/vision/comp256fall03/
Example: reconstruct image from neighboring images
M. Pollefeys, http://www.cs.unc.edu/Research/vision/comp256fall03/
Stereo matching with general camera configuration
M. Pollefeys, http://www.cs.unc.edu/Research/vision/comp256fall03/
Image pair rectification
M. Pollefeys, http://www.cs.unc.edu/Research/vision/comp256fall03/
Planar rectification
Bring two views Bring two views to standard stereo setupto standard stereo setup
(moves epipole to )(not possible when in/close to image)
~ image size
(calibrated)(calibrated)
Distortion minimization(uncalibrated)
M. Pollefeys, http://www.cs.unc.edu/Research/vision/comp256fall03/
M. Pollefeys, http://www.cs.unc.edu/Research/vision/comp256fall03/
polarrectification
planarrectification
originalimage pair
M. Pollefeys, http://www.cs.unc.edu/Research/vision/comp256fall03/
Stereo camera configurations
(Slide from Pascal Fua)
More cameras
Multi-baseline stereo
[Okutomi & Kanade]