1
Media Re-expression for Stereo Cinema
Félix Raimbault, François Pitié and Anil Kokaram
2
Overview 1/2
• Motion Magnification for Stereo Videos
• Stereo Video Inpainting
3
Overview 2/2
• Automatic Cartoonization
• Stereo Video Segmentation
Segmentation of a video volume
[Collomosse et al. 2005]
4
Stereo Video Inpainting
5
Context
• Initial motivation: fill in revealed areas for stereo motion magnification
• Correct artefacts arising during stereo shooting
• Remove unwanted objects
• Fill in dis-occluded areas for disparity remapping
6
Exemplar-based Framework: Priority
• Priority: [Criminisi, A. et al. 2004]
  • process first the pixels with more available information nearby
  • try to reconstruct first the areas with “structure” (edges and depth discontinuities)
[Figure panels: target frame, edge map, initial priority, order of filling, reconstruction]
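The priority rule above can be sketched as follows. This is a minimal illustration and not the authors' implementation: the function names (`confidence`, `fill_front`, `structure`, `priorities`), the window size `half`, the floor `eps`, and the use of a product of a confidence term and an edge-strength term are all assumptions, loosely following the confidence/data split of [Criminisi, A. et al. 2004]. The `edge` map is assumed to already combine image edges and depth discontinuities.

```python
def confidence(known, y, x, half=1):
    """Fraction of known pixels in the window around (y, x), clipped at the border."""
    h, w = len(known), len(known[0])
    total = hits = 0
    for j in range(max(0, y - half), min(h, y + half + 1)):
        for i in range(max(0, x - half), min(w, x + half + 1)):
            total += 1
            hits += 1 if known[j][i] else 0
    return hits / total


def fill_front(known):
    """Missing pixels that have at least one known 4-neighbour."""
    h, w = len(known), len(known[0])
    front = []
    for y in range(h):
        for x in range(w):
            if known[y][x]:
                continue
            for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                ny, nx = y + dy, x + dx
                if 0 <= ny < h and 0 <= nx < w and known[ny][nx]:
                    front.append((y, x))
                    break
    return front


def structure(known, edge, y, x, half=1):
    """Strongest edge response among the known pixels of the window (assumed data term)."""
    h, w = len(known), len(known[0])
    best = 0.0
    for j in range(max(0, y - half), min(h, y + half + 1)):
        for i in range(max(0, x - half), min(w, x + half + 1)):
            if known[j][i]:
                best = max(best, edge[j][i])
    return best


def priorities(known, edge, half=1, eps=1e-3):
    """Priority of every fill-front pixel: confidence times (structure + eps)."""
    return {
        (y, x): confidence(known, y, x, half) * (structure(known, edge, y, x, half) + eps)
        for (y, x) in fill_front(known)
    }
```

The pixel to fill next would be `max(p, key=p.get)` for `p = priorities(known, edge)`; the small `eps` keeps flat regions fillable once no structured pixel remains on the front.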
7
Exemplar-based Framework: Matching
• Patch-matching strategy: [Efros, A. and Leung, T. 1999]
• find similar neighbourhood to current missing pixel
• SSIM-based distance: compares the structure of patches
[Figure: target patch around x; $S(\hat{x})$: best match at site x; $I(x)$ is replaced by $I(\hat{x})$]
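A minimal sketch of the matching step, assuming grey-level images stored as lists of rows. The SSIM here is the standard single-window formula computed over the known pixels of the target patch only; the function names (`ssim`, `best_match`), the constants `c1` and `c2` (the usual defaults for 8-bit data), and the exhaustive local search are assumptions for illustration, not the authors' code.

```python
def ssim(a, b, c1=(0.01 * 255) ** 2, c2=(0.03 * 255) ** 2):
    """Single-window SSIM between two equally sized lists of grey levels."""
    n = len(a)
    mu_a = sum(a) / n
    mu_b = sum(b) / n
    var_a = sum((v - mu_a) * (v - mu_a) for v in a) / n
    var_b = sum((v - mu_b) * (v - mu_b) for v in b) / n
    cov = sum((u - mu_a) * (v - mu_b) for u, v in zip(a, b)) / n
    num = (2 * mu_a * mu_b + c1) * (2 * cov + c2)
    den = (mu_a * mu_a + mu_b * mu_b + c1) * (var_a + var_b + c2)
    return num / den


def best_match(img, known, y, x, half=1, search=3):
    """Return ((cy, cx), distance) of the fully known patch closest to the target at (y, x)."""
    h, w = len(img), len(img[0])
    # Only the known pixels of the target patch take part in the comparison.
    offsets = [
        (dy, dx)
        for dy in range(-half, half + 1)
        for dx in range(-half, half + 1)
        if 0 <= y + dy < h and 0 <= x + dx < w and known[y + dy][x + dx]
    ]
    if not offsets:
        return None, float("inf")
    target = [img[y + dy][x + dx] for dy, dx in offsets]
    best, best_d = None, float("inf")
    for cy in range(max(half, y - search), min(h - half, y + search + 1)):
        for cx in range(max(half, x - search), min(w - half, x + search + 1)):
            # Candidate patches must be inside the image and contain no missing pixel.
            if not all(
                known[cy + dy][cx + dx]
                for dy in range(-half, half + 1)
                for dx in range(-half, half + 1)
            ):
                continue
            cand = [img[cy + dy][cx + dx] for dy, dx in offsets]
            d = 1.0 - ssim(target, cand)  # SSIM-based distance
            if d < best_d:
                best, best_d = (cy, cx), d
    return best, best_d
```

With the returned site `(cy, cx)`, the missing pixel would be filled by copying `img[cy][cx]` into `img[y][x]` and marking it as known.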
8
Patch Tracking
• [Raimbault, F. and Kokaram, A. SPIE 2011]
• use long-term data
  • motion vector reconstruction inside the hole
• use data across views
  • disparity vector reconstruction inside the hole
9
Smoothness
• Spatial smoothness: (to be submitted to WIAMIS’12)
  • “coherent patch sewing”
  • estimate the average distance of selected patches as a criterion to prune patch copying
[Figure panels: target frame, missing data, pixel by pixel, coherence sewing]
10
Smoothness
• Stereo-temporal smoothness: (to be submitted to WIAMIS’12)
  • preferentially select patches in previously reconstructed frames
  • stereo-spatio-temporal patch-matching
• de-activate smoothness for outliers (bad motion and disparity estimates)
[Figure: frames t-1, t and t+1 in views v and v’]
11
Luminance Correction
• [Raimbault, F. and Kokaram, A. SPIE 2011]:
  • colour discrepancy
  • due to sampling from frames far away in time from the current frame (lighting can change)
• colour correction needed
• 1-tap linear predictive model:
• weighted least squares solution
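The model equation is not reproduced here, so the following is only a sketch under a stated assumption: that the 1-tap model is a single gain `a` such that target ≈ a · source over the known pixels of the patch. Under that assumption the weighted least squares solution has the closed form a = Σ w·s·t / Σ w·s². The names `wls_gain` and `correct`, the weights, and the clipping to [0, 255] are hypothetical.

```python
def wls_gain(source, target, weights=None):
    """Weighted least squares gain a minimising sum(w * (t - a * s) ** 2)."""
    if weights is None:
        weights = [1.0] * len(source)
    num = sum(w * s * t for w, s, t in zip(weights, source, target))
    den = sum(w * s * s for w, s in zip(weights, source))
    # Fall back to no correction when the source patch carries no energy.
    return num / den if den > 0 else 1.0


def correct(patch, gain):
    """Apply the gain to the copied pixels, clipped to the 8-bit range."""
    return [min(255.0, max(0.0, gain * v)) for v in patch]
```

In use, `source` would hold the candidate patch values at the positions where the target patch is known, `target` the corresponding known target values, and the estimated gain would then be applied to the pixels copied into the hole. Setting a weight to zero excludes an unreliable pixel from the estimate.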
12
Results
• video “spywalk”
  • reconstruction of the twist in the strap of the girl’s bag
  • average SSIM: 0.9997, slightly better than Rig Removal (0.9989)
  • small hole -> block matching is enough to estimate offsets
• video “water_drop”
  • “fresh” data input from the other view in our technique, whereas the reconstruction with Rig Removal degrades
• video “walking_girl”
  • block matching at pixel accuracy -> not enough
• Published in SPIE’11
  • Accepted for publication in JEI’12
13
Issues
• Issues:
  • greedy => lack of global coherence
  • more accurate motion vectors needed
  • parameter choice can be complicated (patch size, search size)
  • lack of temporal smoothness
• To be explored:
  • experiment with global optimisation (QPBO)
  • use trajectories (Viterbi tracker)
  • patch stitching
  • impose constraints by first doing stereo-video segmentation
14
Plan
• Internship for Sony Research in Stuttgart
  • temporal stabilisation of videos
• Stereo Video Segmentation
  • based on [Baugh, G. and Kokaram, A. 2010]
• Return to Stereo Video Inpainting
  • feature point trajectories [Baugh, G. and Kokaram, A. 2009]