By:
Francisco J. Hernandez-Lopez
Mariano Rivera Meraz
VSCREEN: A REAL-TIME AUGMENTED VIDEO METHOD
OUTLINE
• What is VScreen?
• Method for including an image into another image
• Method for including a video into another video
• Automatic Fg/Bg video segmentation
• Quantitative evaluation
• Running VScreen
• Rate of VScreen
• Experiments and Results
• Conclusions
WHAT IS VSCREEN?
• VScreen is an image and video editing tool that inserts part of a video or image into another video or image in real time.
Panels: Original I/V; New I/V; Composed I/V (I/V = image/video).
METHOD FOR INCLUDING AN IMAGE INTO ANOTHER IMAGE
Augmenting images. (a) Original image. (b) Trimap image. (c) Interactive binary segmentation. (d) New image. (e) Homography calculation. (f) Composed image.
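The homography step in panel (e) can be illustrated with a minimal sketch. It assumes the user-selected region is given as four corner points and that the homography is normalized so its last entry is 1; the function names are hypothetical and the slides do not say how VScreen estimates the homography.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate point configuration")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def homography_from_points(src, dst):
    """3x3 homography mapping four src points onto four dst points (h33 = 1)."""
    a, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        # u = (h11 x + h12 y + h13) / (h31 x + h32 y + 1), same for v
        a.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        b.append(u)
        a.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.append(v)
    h = solve_linear(a, b)
    return [h[0:3], h[3:6], [h[6], h[7], 1.0]]


def apply_homography(h, point):
    """Map a point of the new image into the coordinates of the original image."""
    x, y = point
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    return ((h[0][0] * x + h[0][1] * y + h[0][2]) / w,
            (h[1][0] * x + h[1][1] * y + h[1][2]) / w)
```

In use, `src` would hold the corners of the new image and `dst` the corners of the selected region; each pixel of the new image is then mapped with `apply_homography` before composition.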
METHOD FOR INCLUDING A VIDEO INTO ANOTHER VIDEO
Augmenting videos. (a) Original video. (b) Automatic Fg-Bg video segmentation. (c) Composed frame.
AUTOMATIC FG/BG VIDEO SEGMENTATION
• Given a video, we segment the Fg from the Bg in real time.
• Let $BM$ be the Bg model (a). For each subsequent frame (b), we compute the likelihood $VM$ of each pixel belonging to the Bg (c).
• We update $VM$ to make it robust to illumination changes, cast shadows, and camouflage situations (d).
• We segment $VM$ with the QMMF method [1], which eliminates video artifacts and noise (e).
• Finally, we substitute the background (f).
Panels (a) to (f) illustrate the steps listed above.
[1] M. Rivera, O. Ocegueda, J. L. Marroquin, Entropy-Controlled Quadratic Markov Measure Field Models for Efficient Image Segmentation, IEEE Trans. on Image Processing 16 (12) (2007) 3047-3057.
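The final substitution step (f) can be sketched as follows. This is an illustration only: it assumes the segmentation yields a per-pixel field `p_bg` holding the probability of belonging to the Bg and that a hard threshold of 1/2 is used; the function name and the threshold parameter are hypothetical.

```python
def substitute_background(frame, new_bg, p_bg, threshold=0.5):
    """Replace pixels classified as Bg with the pixels of the new background.

    frame, new_bg and p_bg are 2D lists of equal size; p_bg holds the
    (assumed) probability that each pixel belongs to the Bg.
    """
    return [
        [nb if p >= threshold else px
         for px, nb, p in zip(frame_row, bg_row, p_row)]
        for frame_row, bg_row, p_row in zip(frame, new_bg, p_bg)
    ]
```

Foreground pixels are kept from the current frame, so a person moving in front of the augmented region still occludes the inserted content.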
• CUDA Implementation
QMMF binary segmentation
We implement the Red-Black Gauss-Seidel iterations with MGrid.
Input: $VM(x,t)$.
$$d_B(x,t) = -\log VM(x,t), \qquad d_F(x,t) = -\log\left(1 - VM(x,t)\right)$$
$$W_\gamma(x,y) = \frac{\gamma}{\gamma + \| f(x,t) - f(y,t) \|^2}$$
$$U(p(x,t)) = p^2(x,t)\, d_B(x,t) + \left(1 - p(x,t)\right)^2 d_F(x,t) + \lambda \sum_{y \in N_x} \left(p(x,t) - p(y,t)\right)^2 W_\gamma(x,y)$$
The complete process is implemented on the GPU.
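The red-black Gauss-Seidel iteration can be sketched on the CPU as follows. Treating $U$ as the local energy at $x$ with the neighbors held fixed, setting its derivative with respect to $p(x,t)$ to zero gives the update $p = (d_F + \lambda \sum_y W_\gamma p_y) / (d_B + d_F + \lambda \sum_y W_\gamma)$. The sketch assumes a grayscale frame, a 4-connected neighborhood, and a single grid (no multigrid); the function names and default parameter values are hypothetical.

```python
import math


def edge_weight(fa, fb, gamma):
    """W_gamma for two neighboring grayscale intensities."""
    return gamma / (gamma + (fa - fb) ** 2)


def qmmf_red_black(vm, frame, lam=1.0, gamma=0.1, sweeps=20):
    """Minimize the quadratic energy with red-black Gauss-Seidel sweeps.

    vm: 2D list with values strictly between 0 and 1 (Bg likelihood).
    frame: 2D list of grayscale intensities (an assumption of this sketch).
    """
    h, w = len(vm), len(vm[0])
    d_b = [[-math.log(v) for v in row] for row in vm]
    d_f = [[-math.log(1.0 - v) for v in row] for row in vm]
    p = [[0.5] * w for _ in range(h)]
    for _ in range(sweeps):
        for colour in (0, 1):  # red pixels first, then black pixels
            for i in range(h):
                for j in range(w):
                    if (i + j) % 2 != colour:
                        continue
                    num = d_f[i][j]
                    den = d_b[i][j] + d_f[i][j]
                    for ni, nj in ((i - 1, j), (i + 1, j), (i, j - 1), (i, j + 1)):
                        if 0 <= ni < h and 0 <= nj < w:
                            wgt = edge_weight(frame[i][j], frame[ni][nj], gamma)
                            num += lam * wgt * p[ni][nj]
                            den += lam * wgt
                    p[i][j] = num / den
    return p
```

Pixels of one colour depend only on pixels of the other colour, so each half-sweep can update all its pixels independently, which is what makes the scheme a natural fit for one GPU thread per pixel.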
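• CUDA Implementation
Tonal Stabilization
Input: $p(x,t-1)$.
$$R(x,t) = \begin{cases} \dfrac{I(BM(x,t))}{I(f(x,t))} & \text{if } p(x,t-1) \ge \tfrac{1}{2} \\ 1 & \text{otherwise} \end{cases}$$
$$D^t(b_j) = \frac{\sum_x \delta\left(I(f(x,t)) - b_j\right) R(x,t)}{\sum_x \delta\left(I(f(x,t)) - b_j\right)}$$
$$\mathcal{T}(f(x,t)) = \beta\, D^t(f(x,t))\, f(x,t) + (1 - \beta)\, f(x,t)$$
$$VM(x,t) = \begin{cases} 1 - \epsilon & \text{if } \| \mathcal{T}(f(x,t)) - BM(x,t) \|^2 \le \theta_1 \\ \epsilon & \text{otherwise} \end{cases}$$
Execution labels in the diagram: GPU, CPU, GPU.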
[2] P. Spagnolo, T. D'Orazio, M. Leo, A. Distante, Moving object segmentation by background subtraction and temporal analysis, Image Vision Comput. 24 (5) (2006) 411-423.
Tonal Stabilization. (a) Previous frame. (b) Current frame. (c) $VM(x,t)$ without $\mathcal{T}$. (d) $VM(x,t)$ with $\mathcal{T}$.
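A minimal sketch of the tonal stabilization may help to read the formulas. It makes several assumptions that the slides do not state: frames are flat lists of integer grayscale intensities (so $I(f) = f$), each intensity level is its own bin $b_j$, the ratio $R$ is set to 1 for pixels that were not Bg in the previous frame, and the function names and the default `eps` are hypothetical.

```python
def tonal_stabilization(frame, bg_model, p_prev, beta=1.0):
    """Correct a global tonal change of the frame toward the Bg model.

    frame, bg_model: flat lists of integer intensities.
    p_prev: (assumed) Bg probability of each pixel in the previous frame.
    """
    sums, counts = {}, {}
    for f, b, p in zip(frame, bg_model, p_prev):
        # R(x,t): intensity ratio, trusted only where the pixel was Bg
        ratio = b / f if (p >= 0.5 and f > 0) else 1.0
        sums[f] = sums.get(f, 0.0) + ratio
        counts[f] = counts.get(f, 0) + 1
    # D^t(b_j): average ratio over all pixels with intensity b_j
    gain = {level: sums[level] / counts[level] for level in sums}
    return [beta * gain[f] * f + (1.0 - beta) * f for f in frame]


def background_likelihood(stabilized, bg_model, theta1, eps=0.01):
    """VM(x,t): 1 - eps where the stabilized frame matches the Bg model."""
    return [1.0 - eps if (s - b) ** 2 <= theta1 else eps
            for s, b in zip(stabilized, bg_model)]
```

For example, if the whole scene becomes half as bright, every ratio is 2 and, with `beta = 1`, the stabilized frame matches the Bg model again, so the pixels are not misclassified as Fg.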
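• CUDA Implementation
Cast Shadows
Input: $VM(x,t)$.
$$V_L(x,t) = \exp\left(-\| \mathcal{T}(f(x,t)) - BM(x,t) \|^2\right)$$
$$V_H(x,t) = \exp\left(-\| C(f(x,t)) - C(BM(x,t)) \|^2\right)$$
$$V_G(x,t) = \exp\left(-\| G(f(x,t)) - G(BM(x,t)) \|^2\right)$$
$$V_S(x,t) = V_L(x,t) \times V_H(x,t) \times V_G(x,t)$$
$$VM(x,t) = \begin{cases} 1 - \epsilon & \text{if } V_S(x,t) > \theta_2 \\ \epsilon & \text{otherwise} \end{cases}$$
The complete process is implemented on the GPU.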
[3] G. F. Fung, N. H. Yung, G. K. Pang, A. H. Lai, Effective moving cast shadow detection
for monocular color traffic image sequences, Optical Engineering 41 (6) (2002) 1425-1440.
Cast Shadows Detection. (a) Current frame. (b) $V_L(x,t)$. (c) $V_H(x,t)$. (d) $V_G(x,t)$. (e) Shadows detected.
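The combination of the three cues can be written as a short per-pixel sketch. The slides do not define the operators $C$ and $G$, so the sketch takes the three squared distances as inputs instead of computing them; the function names and the default `eps` are hypothetical.

```python
import math


def shadow_likelihood(dist_l, dist_h, dist_g):
    """V_S as the product of V_L, V_H and V_G.

    Each argument is the squared distance between the frame and the Bg
    model under the corresponding cue, computed elsewhere.
    """
    v_l = math.exp(-dist_l)
    v_h = math.exp(-dist_h)
    v_g = math.exp(-dist_g)
    return v_l * v_h * v_g


def apply_shadow_control(v_s, theta2, eps=0.01):
    """VM(x,t): a pixel whose V_S exceeds theta2 is treated as Bg."""
    return 1.0 - eps if v_s > theta2 else eps
```

Because the three factors are multiplied, a pixel is accepted as Bg only when all three distances are small at the same time.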
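• CUDA Implementation
Camouflage Situations
Input: $p(x,t-1)$.
$$V_T(x,t) = \begin{cases} 1 - \epsilon & \text{if } \| f(x,t) - f(x,t-1) \|^2 \le \theta_3 \\ \epsilon & \text{otherwise} \end{cases}$$
$$V_C(x,t) = VM(x,t) \times V_T(x,t) \times \left[1 - p(x,t-1)\right]$$
$$VM(x,t) = \begin{cases} \tfrac{1}{2} & \text{if } V_C(x,t) > \tfrac{1}{2} \\ VM(x,t) & \text{otherwise} \end{cases}$$
The complete process is implemented on the GPU.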
Camouflage Situations. (a) Bg model. (b) Current frame. (c) $VM(x,t)$ without CS Control. (d) $VM(x,t)$ with CS Control.
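The camouflage control can be sketched per pixel as follows, assuming grayscale intensities stored in flat lists; the function name and the default `eps` are hypothetical.

```python
def camouflage_control(vm, frame, prev_frame, p_prev, theta3, eps=0.01):
    """Reset VM to 1/2 where a camouflage situation is likely.

    vm: Bg likelihood of each pixel in the current frame.
    frame, prev_frame: grayscale intensities at times t and t - 1.
    p_prev: p(x, t - 1) from the previous segmentation.
    """
    updated = []
    for v, f, f_prev, p in zip(vm, frame, prev_frame, p_prev):
        # V_T: high when the pixel did not change between frames
        v_t = 1.0 - eps if (f - f_prev) ** 2 <= theta3 else eps
        # V_C: high only if VM, V_T and 1 - p(x, t - 1) are all high
        v_c = v * v_t * (1.0 - p)
        updated.append(0.5 if v_c > 0.5 else v)
    return updated
```

Setting $VM$ to 1/2 makes both data terms $d_B$ and $d_F$ equal, so for such pixels the decision is left to the spatial regularization of the segmentation.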
• Visual Profiler
QMMF binary segmentation
Tonal Stabilization
Cast Shadows
Camouflage Situations
QUANTITATIVE EVALUATION
Panels: Original Video; Segmented Video. Labels in the figure: GT, TS, 54 G, TT, S51.
Microsoft database: http://research.microsoft.com/vision/cambridge/i2i/DSWeb.htm
Panels: Tree-based Classifiers [4]; VScreen method. Labels in the figure: GT, TS, 54 G, TT, S51.
[4] P. Yin, A. Criminisi, J. Winn, I. Essa, Bilayer Segmentation of Webcam Videos Using Tree-Based Classifiers, IEEE Trans. on Pattern Analysis and Machine Intelligence 33 (1) (2011) 30-42.
Sequence          Method              False Neg.   False Pos.   Total Error
Intelligent_room  García et al. [5]        159.1         74.1         233.2
                  VScreen                   58.0         45.4         103.4
Video2_long       García et al. [5]        302.5         59.3         361.8
                  VScreen                  138.7         18.2         156.9
Video5_long       García et al. [5]       1179.3        275.9        1455.2
                  VScreen                  346.1         24.0         370.1
Video6_long       García et al. [5]       1195.0        310.6        1505.6
                  VScreen                  402.3         46.9         449.2
[5] A. García and J. Bescós. Video object segmentation based on feedback schemes guided by a low-level scene ontology. In Proc. ACIVS ’08, pages 322–333, 2008.
RUNNING VSCREEN
Size of video: 720 x 304. Rate: 32 frames/s.
Size of video: 1280 x 720. Rate: 11 frames/s.
RATE OF VSCREEN
Rate in frames/second, by image size and GPU:
Image Size   GeForce 8800 GT   Tesla C1060   GeForce GTX 480
320 x 240                 48            63                65
640 x 480                 12            18                20
These rates were computed assuming that the augmented region is the whole frame of the background video.
EXPERIMENTS
• VScreen is robust to Illumination Changes (IC)
Panels: Original video; VScreen without IC Control; VScreen with IC Control.
• VScreen is robust to Cast Shadows (CtSd)
Panels: Original video; VScreen without CtSd Control; VScreen with CtSd Control.
• VScreen is robust to Camouflage Situations (CmSt)
Panels: Original video; VScreen without CmSt Control; VScreen with CmSt Control.
RESULTS
Panels: Original video; Modified video.
CONCLUSIONS
• We propose VScreen, an interactive and automatic tool for real-time video composition. The user only selects the region of the original video to be augmented with a new video or image.
• VScreen handles occlusions with the Fg/Bg video segmentation [6], implemented in CUDA so that the composition is computed in real time.
• Experiments and quantitative evaluation show that VScreen obtains a lower total error than the method of García et al. [5] on the four evaluated sequences and is robust in different scenarios (indoor/outdoor).
• To watch the videos, visit http://www.cimat.mx/~mrivera/vscreen, where more original and modified videos are available in high resolution (1280 x 720).
[6] F.J. Hernandez-Lopez and M. Rivera, "Binary segmentation of video sequences in real time," Proc. MICAI, 163–168, IEEE Press, 2010.