Depth estimation from Multi-View sources based on full search and Total Variation regularization
Carlos Vazquez, Wa James Tam
Advanced Video Systems, Broadcasting Technologies
Communications Research Centre Canada (CRC)
International Workshop on Computer Vision and Its Application to Image Media Processing
Tokyo, Japan
Outline
1 Introduction
2 Depth information for 3D-TV
3 Depth from Multi-View sources
    Algorithm overview
    Error volume generation
    First depth approximation
    Depth refining
4 Experimental results
    Application: Multi-View image coding
5 Conclusions
Vazquez, Tam (CRC) 3D–TV: Depth estimation WCVIM’09 2 / 24
Introduction
3D-TV is on the way! Next step in television broadcasting
1 More content available in 3D:
  ◮ 3D cinema (IMAX, RealD)
  ◮ Live 3D (U2-3D, sport events)
  ◮ Video games (3D at home)
2 Availability of 3D displays:
  ◮ Stereoscopic (with glasses)
  ◮ Auto-stereoscopic (no glasses)
3 Ongoing work to develop coding standards:
  ◮ Stereo extension to MPEG
  ◮ Depth coding extension to MPEG (2D+Depth)
  ◮ Multi-View coding standard (JMVM)
  ◮ 3D@Home consortium
Depth information for 3D-TV
Depth information in 3D-TV broadcasting: essential information
Large variety of viewers and viewing devices:
  ◮ Need to adjust the amount of depth perceived.
  ◮ Need to adjust the depth to the size of the display.
  ◮ Coding of multi-view or stereoscopic sources.
How to fulfill these requirements?
  ◮ Generation of new views from the ones available.
    ⋆ Depth-Image-Based rendering.
    ⋆ Intermediate View Reconstruction.
  ◮ Predictive coding of 3D sources.
⇒ Knowledge of depth becomes essential for 3D-TV.
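Depth information in 3D-TV broadcasting: depth is embedded in Multi-View sources
[Figure: a Multi-View source represented as 2D + D. Camera geometry with cameras 1, 2, ..., N, image planes P1, P2, ..., PN, a scene point P, image coordinates x1, x2, ..., xN, baseline B_N, focal length f, depth z, and axes X, Y, Z.]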
Problem statement
Recover the depth information from a Multi-View source to be used in the transmission, processing and coding of the Multi-View video content.
Depth from Multi-View sources Algorithm overview
Depth estimation from Multi-View sources: proposed algorithm overview
Depth estimation from Multi-View sources with TV regularization
Full scan of possible depth values, followed by refinement of the depth with Total-Variation regularization combined with edge correspondence and visibility consistency.
1 Pre-processing of the Multi-View source
  ◮ Noise reduction: a general noise-removal step is applied.
  ◮ Gradient computation: the gradient information $\nabla I_o$ is added as two new 'color' channels to the color image.
  ◮ Edge extraction: image edges are used in the depth estimation process. Edge map $\epsilon_o = \delta_c(I_o)$.
2 Error volume generation
3 First depth approximation
  ◮ Median filter
4 Depth refining
  ◮ TV regularization
  ◮ Edge correspondence
  ◮ Visibility consistency
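The pre-processing step can be sketched as follows. This is a minimal illustration, not the authors' implementation: the slides do not say how the gradient or the edge detector $\delta_c$ is computed, so the channel-average luminance, the central differences of `np.gradient`, the thresholded gradient magnitude, and the names `add_gradient_channels` and `edge_map` are all assumptions.

```python
import numpy as np

def add_gradient_channels(image):
    """Append the x and y gradients of the luminance as two extra 'color' channels."""
    image = np.asarray(image, dtype=float)
    luminance = image.mean(axis=2)            # assumption: plain channel average
    grad_y, grad_x = np.gradient(luminance)   # central differences, one-sided at borders
    return np.dstack([image, grad_x, grad_y])

def edge_map(image, threshold):
    """Binary edge map: gradient magnitude above a threshold (stand-in for delta_c)."""
    luminance = np.asarray(image, dtype=float).mean(axis=2)
    grad_y, grad_x = np.gradient(luminance)
    return np.hypot(grad_x, grad_y) > threshold

# Toy image: dark left half, bright right half, 3 color channels.
img = np.zeros((4, 6, 3))
img[:, 3:, :] = 1.0
features = add_gradient_channels(img)   # shape (4, 6, 5): 3 colors + 2 gradients
edges = edge_map(img, threshold=0.25)   # True on the two columns around the step
```

Carrying the gradients as extra channels means the matching error defined later treats them exactly like colors, with no change to the error formula.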
Depth from Multi-View sources Error volume generation
Error volume generation: overview
[Figure: candidate depths d1 to d5 for a point X and views v1 to v5 of the view set V.]
Motivation
For each pixel in the central view and each candidate depth value, a similarity measure is evaluated over the corresponding pixels in all views. The depth with the best similarity measure is accepted as the best estimate.
Error volume generation: equations
Mean square error across 'colors':
$$E_v(x, d) = \frac{1}{C} \sum_{c=1}^{C} \left( I_v(T_{o,v}(x, d), c) - I_o(x, c) \right)^2$$
Mean error across 'views':
$$E(x, d) = \frac{1}{N(x, d)} \sum_{v \in R_m(x, d)} E_v(x, d)$$
Matched views:
$$R_m = \{ v : E_v(x, d) < T_m \}$$
Number of matched views:
$$N(x, d) = \sum_{v \in V(x, d)} \left( E_v(x, d) < T_m \right)$$
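A minimal sketch of these four quantities, under a strong simplifying assumption that is not in the slides: cameras are rectified along a horizontal line, so the mapping $T_{o,v}(x, d)$ reduces to an integer column shift of `off * d` pixels. The function name `error_volume`, its parameters, and the treatment of pixels that project outside a view (excluded from $V$) are illustrative choices.

```python
import numpy as np

def error_volume(ref, views, offsets, depths, t_match):
    """Return E(x,d), N(x,d) and V(x,d), each of shape (len(depths), H, W).

    ref:     H x W x C reference (central) view I_o
    views:   list of H x W x C views I_v
    offsets: signed camera position of each view relative to the reference
    depths:  candidate integer disparities d (pixels per unit offset)
    """
    h, w, _ = ref.shape
    cols = np.arange(w)
    E = np.full((len(depths), h, w), np.inf)
    N = np.zeros((len(depths), h, w), dtype=int)
    V = np.zeros((len(depths), h, w), dtype=int)
    for k, d in enumerate(depths):
        err_sum = np.zeros((h, w))
        for view, off in zip(views, offsets):
            src = cols + off * d                      # T_{o,v}(x, d) under the shift model
            inside = (src >= 0) & (src < w)           # pixel projects inside view v
            warped = view[:, np.clip(src, 0, w - 1), :]
            e_v = ((warped - ref) ** 2).mean(axis=2)  # mean square error across channels
            visible = np.broadcast_to(inside, (h, w))
            matched = visible & (e_v < t_match)       # v belongs to R_m(x, d)
            V[k] += visible
            N[k] += matched
            err_sum += np.where(matched, e_v, 0.0)
        has_match = N[k] > 0
        E[k][has_match] = err_sum[has_match] / N[k][has_match]
    return E, N, V

# Toy scanline: a ramp seen by two neighbours shifted by one pixel each way.
ref = np.arange(8.0).reshape(1, 8, 1)
views = [np.roll(ref, 1, axis=1), np.roll(ref, -1, axis=1)]
E, N, V = error_volume(ref, views, offsets=[1, -1], depths=[0, 1, 2], t_match=0.5)
best = E.argmin(axis=0)   # index of the best candidate depth per pixel
```

Depths with no matched view keep an infinite error here, which keeps them out of any later minimization; the slides do not say how that case is handled.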
Error volume generation: error volume and visibility example
[Figure: two panels, "Error volume" and "Number of matching views", each plotted over x and depth.]
Depth from Multi-View sources First depth approximation
First depth approximation: direct minimization of the error measure
1 Minimize the error while penalizing disparities with fewer matching views:
$$D^{(0)}(x) = \arg\min_d \; E(x, d) \left( \frac{V(x, d)}{N(x, d)} \right)^2$$
2 Apply a median filter to remove noise from the estimated depth map:
$$D^{(1)} = H_M(D^{(0)})$$
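The two steps above can be sketched as below. The penalty follows the formula on the slide; the 3x3 window with edge padding for $H_M$ is an assumption (the slides do not give the filter size), and `first_depth` and `median_filter3` are hypothetical names.

```python
import numpy as np

def first_depth(E, N, V, depths):
    """D0(x) = argmin_d E(x,d) * (V(x,d) / N(x,d))**2, ignoring depths with no match."""
    cost = np.full(E.shape, np.inf)
    ok = N > 0
    cost[ok] = E[ok] * (V[ok] / N[ok]) ** 2
    return np.asarray(depths)[cost.argmin(axis=0)]

def median_filter3(depth):
    """3x3 median filter with replicated borders (assumed window size)."""
    padded = np.pad(depth, 1, mode="edge")
    h, w = depth.shape
    windows = [padded[i:i + h, j:j + w] for i in range(3) for j in range(3)]
    return np.median(np.stack(windows), axis=0)

# Two pixels, two candidate depths. At the second pixel the first depth has the
# lowest raw error (0.4) but only 1 of 2 visible views matches, so it is penalized.
E = np.array([[[1.0, 0.4]], [[2.0, 0.5]]])
N = np.array([[[2, 1]], [[2, 2]]])
V = np.full((2, 1, 2), 2)
d0 = first_depth(E, N, V, depths=[10, 20])

# A single outlier in a flat depth map is removed by the median filter.
noisy = np.full((5, 5), 2.0)
noisy[2, 2] = 9.0
d1 = median_filter3(noisy)
```

In the toy data the raw error alone would pick the first depth at both pixels; the $(V/N)^2$ factor multiplies the half-matched candidate's cost by 4 and flips the second pixel to the other depth.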
Depth from Multi-View sources Depth refining
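Depth refining: total variation regularization
Depth as a function that minimizes a two-term global energy:
$$D(x) = \arg\min_D \left( G_d(D, E) + \lambda G_r(D) \right)$$
Data term
$$G_d(D, E) = \frac{1}{2} \sum_{x \in \Lambda_o} \| E(x, D[x]) \|^2$$
Regularization term
$$G_r(D) = \int_{W_o} \| \nabla_x D^{(n)} \| \, dW_o$$
Level set minimization
$$D^{(n+1)} = D^{(n)} + \Delta T \left( \lambda \kappa \| \nabla_x D^{(n)} \| - \left( \frac{\partial E}{\partial d} E(D^{(n)}) \right) \right)$$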
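One explicit iteration of this update can be sketched as follows. Several details are assumptions rather than the authors' scheme: $\kappa$ is taken as the divergence of the normalized gradient, derivatives use the central differences of `np.gradient`, a small `eps` avoids division by zero, and the depth map is expressed as an index into the candidate depth axis of a finite error volume `E`. The names `curvature_term`, `data_term` and `tv_step` are illustrative.

```python
import numpy as np

def curvature_term(D, eps=1e-8):
    """kappa * |grad D|, with kappa = div(grad D / |grad D|)."""
    gy, gx = np.gradient(D)
    mag = np.sqrt(gx ** 2 + gy ** 2 + eps)
    kappa = np.gradient(gy / mag, axis=0) + np.gradient(gx / mag, axis=1)
    return kappa * mag

def data_term(E, D):
    """(dE/dd) * E at the current depth, using a central difference along depth."""
    k = np.clip(np.rint(D).astype(int), 1, E.shape[0] - 2)
    rows, cols = np.indices(D.shape)
    dE_dd = 0.5 * (E[k + 1, rows, cols] - E[k - 1, rows, cols])
    return dE_dd * E[k, rows, cols]

def tv_step(D, E, lam, dt):
    """D_next = D + dt * (lam * kappa * |grad D| - (dE/dd) * E)."""
    return D + dt * (lam * curvature_term(D) - data_term(E, D))

# Toy volume: the error is quadratic in depth with its minimum at index 2.
E = ((np.arange(5.0) - 2.0) ** 2)[:, None, None] * np.ones((5, 3, 3))
D = np.full((3, 3), 1.0)                 # flat initial guess, one level too low
D_next = tv_step(D, E, lam=1.0, dt=0.25)
```

On this flat map the curvature term vanishes, so the step is driven by the data term alone and moves every pixel from 1.0 to 1.5, toward the error minimum.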
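Depth refining: edge correspondence
1 Image edges
2 Distance to image edges:
$$F(x) = \max(\operatorname{dist}(x, \epsilon_o), F_M)$$
3 Depth edges:
$$\eta^{(n)} = \delta_c(D^{(n)})$$
4 Edge correction term:
$$\phi(x) = \eta^{(n)}(x) \, F(x) \, \operatorname{sign}\left( \nabla D^{(n)}(x) \cdot \nabla F(x) \right)$$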
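The four steps can be sketched on a toy example. The code applies the formulas exactly as written on the slide, including the `max` with $F_M$. The brute-force distance computation, the thresholded gradient used as a stand-in for $\delta_c$, and the names `distance_to_edges` and `edge_correction` are assumptions.

```python
import numpy as np

def distance_to_edges(edges):
    """Euclidean distance from every pixel to the nearest edge pixel (brute force).

    Requires at least one True pixel in `edges`.
    """
    ys, xs = np.nonzero(edges)
    rows, cols = np.indices(edges.shape)
    d2 = (rows[..., None] - ys) ** 2 + (cols[..., None] - xs) ** 2
    return np.sqrt(d2.min(axis=2))

def edge_correction(D, depth_edges, image_edges, f_m):
    """phi(x) = eta(x) * F(x) * sign(grad D(x) . grad F(x))."""
    F = np.maximum(distance_to_edges(image_edges), f_m)
    dDy, dDx = np.gradient(D)
    dFy, dFx = np.gradient(F)
    return depth_edges * F * np.sign(dDx * dFx + dDy * dFy)

# Depth step between columns 2 and 3, image edge two to three pixels away at column 5.
D = np.zeros((3, 7))
D[:, 3:] = 1.0
image_edges = np.zeros((3, 7), dtype=bool)
image_edges[:, 5] = True
depth_edges = np.hypot(*np.gradient(D)) > 0.25   # stand-in for delta_c(D)
phi = edge_correction(D, depth_edges, image_edges, f_m=1.0)
```

The correction is non-zero only on depth edges, its magnitude grows with the distance to the nearest image edge, and its sign depends on whether depth increases toward or away from that edge.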
Depth refining: visibility consistency
Estimated visibility vs. matching visibility
Compare the visibility resulting from the estimated depth map to the visibility suggested by the number of matching views.
Estimated visibility
$$Q(x) = \frac{V(x, D^{(n)}(x)) - \sum_{v=1}^{L} \left( O_v(x_v) \neq x_v \right)}{V(x, D^{(n)}(x))}$$
Matching visibility
$$S(x) = \frac{N(x)}{V(x)}$$
Occluded and occluding regions
$$B_a = \{ x \mid (Q(x) < 1) \wedge (S(x) > Q(x)) \}$$
$$J_a = \{ x = O_v(u) \mid Q(x) = 1 \}$$
Conflict
$$B = \{ y \in B_a \mid x \in J_a \}$$
$$J = \{ x \in J_a \mid S(x) < 1 \}$$
Correction
B ⇒ pushed to foreground
J ⇒ pushed to background
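A sketch of the estimated visibility $Q$ and the candidate set $B_a$ on one scanline. It rests on an interpretation the slides do not spell out: $O_v$ is read as the reference pixel that ends up in front at a given position of view $v$ (a z-buffer winner), and views are related by integer horizontal shifts. The function name `estimated_visibility` and the all-ones `S` are hypothetical.

```python
import numpy as np

def estimated_visibility(disp, offsets):
    """Q(x) for one scanline: fraction of views where x is not hidden by a nearer pixel."""
    w = disp.size
    x = np.arange(w)
    visible = np.zeros(w)    # V(x, D(x)): views in which x projects inside the image
    occluded = np.zeros(w)   # views in which another pixel wins the projection of x
    for off in offsets:
        xv = x + off * disp
        inside = (xv >= 0) & (xv < w)
        owner = np.full(w, -1)
        # Paint from far to near so the nearest (largest disparity) pixel wins.
        for i in np.argsort(disp, kind="stable"):
            if inside[i]:
                owner[xv[i]] = i
        visible += inside
        occluded += inside & (owner[np.clip(xv, 0, w - 1)] != x)
    return np.where(visible > 0, (visible - occluded) / np.maximum(visible, 1), 0.0)

# Foreground object (disparity 2) in front of a background (disparity 0).
disp = np.array([0, 0, 0, 2, 2, 0, 0, 0])
Q = estimated_visibility(disp, offsets=[1, -1])
S = np.ones(disp.size)        # hypothetical matching visibility: every view matched
B_a = (Q < 1) & (S > Q)       # depth map predicts occlusion that matching contradicts
```

With these inputs the background pixels next to the object are hidden in one of the two views, so their $Q$ drops to 0.5 while $S$ stays at 1, which puts them in $B_a$.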
Depth refining: final evolution equation
Level sets evolution equation
$$D^{(n+1)} = D^{(n)} + \Delta T \left( \lambda \kappa \| \nabla_x D^{(n)} \| - \frac{\partial E}{\partial d} E(D^{(n)}) + \mu \Phi + \beta (B - J) \right)$$
1 Total variation regularization
2 Minimization of Multi-View matching error
3 Image and depth edges correspondence
4 Occlusion correction by visibility check
Experimental results
Experimental results: test images and depth maps
Original color images: View 2
Original depth images: View 2
Experimental results: resulting depth maps and error
Estimated depth image: View 2
Error with respect to ground-truth: 1 pixel differences
Experimental results: error with respect to ground truth

Image        Venus   Teddy   Cones   Art     Bowling2
PSNR (dB)    51.96   44.02   44.76   36.72   36.26
E > 1 (%)     6.93   10.96    8.01   18.99   17.80
E > 2 (%)     2.19    6.49    4.13   11.88   10.46
1 PSNR (36.26 to 51.96 dB) indicates that the results are close to the ground truth
2 The share of errors larger than 1 pixel is high (6.93% to 18.99%)
3 The share of errors larger than 2 pixels drops significantly (2.19% to 11.88%)
4 A 2-pixel error is manageable in the intended application
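The three rows of the table correspond to standard measures that can be computed as in the sketch below. The peak value of 255 used for the PSNR and the function name `depth_errors` are assumptions; the slides do not state how the depth maps were scaled or which peak value was used.

```python
import numpy as np

def depth_errors(estimate, truth, peak=255.0):
    """Return (PSNR in dB, % of pixels with error > 1, % of pixels with error > 2)."""
    diff = np.abs(np.asarray(estimate, dtype=float) - np.asarray(truth, dtype=float))
    mse = np.mean(diff ** 2)
    psnr = np.inf if mse == 0 else 10.0 * np.log10(peak ** 2 / mse)
    return psnr, 100.0 * np.mean(diff > 1), 100.0 * np.mean(diff > 2)

# Toy maps: 10 pixels, two off by 1.5 and one off by 3.
truth = np.zeros((2, 5))
estimate = truth.copy()
estimate[0, :2] = 1.5
estimate[1, 4] = 3.0
psnr, bad1, bad2 = depth_errors(estimate, truth)
```

In the toy data three of ten pixels exceed a 1-pixel error and one exceeds 2 pixels, mirroring the drop between the E > 1 and E > 2 rows.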
Experimental results: application to Multi-View image coding
2D+Depth+Occlusions Multi-View coding system
[Figure: block diagram of the coding system. Blocks: Depth Estimation, Edges, Mask, Wav. Tran., Encode, Decode, Embed, Disocclu., WCM, Tx. Signals: View 1, View 2, View N, 2D, D, 2D+D.]
Decoded images: Estimated depth map
Venus 32.19 dB, Teddy 31.40 dB, Cones 30.84 dB
Decoded images: Real depth map
Venus 35.96 dB, Teddy 31.93 dB, Cones 31.81 dB
Conclusions
High quality depth estimation from Multi-View sources.
Occlusion processing by analysis of visibility consistency.
Total-Variation regularization ensures smooth depth with sharp edges.
Application to Multi-View image coding
Outlook
  ◮ Improve the visibility consistency step.
  ◮ Speed up the algorithm execution.
  ◮ Integrate into an MPEG-2 standard stream.