VScreen: A Real-Time Augmented Video Method - GTC 2012

By:

Francisco J. Hernandez-Lopez

Mariano Rivera Meraz

VSCREEN: A REAL-TIME AUGMENTED VIDEO METHOD

http://www.gputechconf.com/page/home.html

OUTLINE

• What is VScreen?

• Method for including an image into another image

• Method for including a video into another video

• Automatic Fg/Bg video segmentation

• Quantitative evaluation

• Running VScreen

• Rate of VScreen

• Experiments and Results

• Conclusions


WHAT IS VSCREEN?

• It’s an image and video editing tool that allows us to insert part of a video or image into another video or image in real time.

Panels: Original I/V, New I/V, Composed I/V (I/V: image or video).

METHOD FOR INCLUDING AN IMAGE INTO ANOTHER IMAGE

Augmenting images. (a) Original image. (b) Trimap image. (c) Interactive binary segmentation. (d) New image. (e) Homography calculation. (f) Composed image.
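The homography and composition steps in the caption above can be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: the helper names `apply_homography` and `composite` are hypothetical, the homography is assumed to map pixel coordinates of the original image to coordinates of the new image, the mask is assumed to be 1 where the new image should appear, and nearest-neighbor sampling is used for brevity.

```python
import math

def apply_homography(h, x, y):
    """Map point (x, y) through the 3x3 homography h (nested lists)."""
    xh = h[0][0] * x + h[0][1] * y + h[0][2]
    yh = h[1][0] * x + h[1][1] * y + h[1][2]
    w = h[2][0] * x + h[2][1] * y + h[2][2]  # assumed non-zero
    return xh / w, yh / w

def composite(original, new_image, mask, h_orig_to_new):
    """Copy new_image into original wherever mask is 1 (hypothetical helper).

    Each selected pixel of the original is mapped into the new image and
    the nearest pixel there is sampled (inverse warping).
    """
    out = [row[:] for row in original]
    for v, mask_row in enumerate(mask):
        for u, selected in enumerate(mask_row):
            if not selected:
                continue
            x, y = apply_homography(h_orig_to_new, u, v)
            xi, yi = math.floor(x + 0.5), math.floor(y + 0.5)
            if 0 <= yi < len(new_image) and 0 <= xi < len(new_image[0]):
                out[v][u] = new_image[yi][xi]
    return out
```

With the identity homography, `composite` simply copies the masked pixels of the new image into the original; a real homography would be estimated from the corners of the user-selected region.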

METHOD FOR INCLUDING A VIDEO INTO ANOTHER VIDEO

Augmenting videos. (a) Original video. (b) Automatic Fg-Bg video segmentation. (c) Composed frame.

AUTOMATIC FG/BG VIDEO SEGMENTATION

• Given a video, we segment the Fg from the Bg in real time.

• Let 𝐵𝑀 be the Bg model (a). For each subsequent frame (b), we compute the likelihood 𝑉𝑀 of each pixel belonging to the Bg (c).

• We update 𝑉𝑀 to make it robust to illumination changes, cast shadows and camouflage situations (d).

• We segment 𝑉𝑀 with the QMMF method [1], which eliminates video artifacts and noise (e).

• Finally, we substitute the background (f).

[1] M. Rivera, O. Ocegueda, J. L. Marroquin, Entropy-Controlled Quadratic Markov Measure Field Models for Efficient Image Segmentation, IEEE Trans. on Image Processing 16 (12) (2007) 3047-3057.
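The last step of the list above, background substitution, can be sketched as below. This is a minimal illustration rather than the authors' code: the function name `substitute_background` is hypothetical, frames are assumed to be 2D lists of pixel values, and `p` is assumed to hold the per-pixel probability of belonging to the Bg, with 1/2 as the decision threshold.

```python
def substitute_background(frame, new_bg, p, threshold=0.5):
    """Replace pixels classified as Bg (p >= threshold) with new_bg pixels."""
    return [
        [nb if prob >= threshold else fv
         for fv, nb, prob in zip(f_row, nb_row, p_row)]
        for f_row, nb_row, p_row in zip(frame, new_bg, p)
    ]
```

Pixels with `p` below the threshold are treated as Fg and kept from the current frame, which is how occluding objects stay in front of the inserted content.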


• CUDA Implementation

QMMF binary segmentation

We implement the Red-Black Gauss-Seidel iterations with MGrid.

Input: $\mathit{VM}(x,t)$.

$$d_B(x,t) = -\log \mathit{VM}(x,t), \qquad d_F(x,t) = -\log\bigl(1 - \mathit{VM}(x,t)\bigr)$$

$$W_\gamma(x,y) = \frac{\gamma}{\gamma + \lVert f(x,t) - f(y,t) \rVert^2}$$

$$U[p(x,t)] = p^2(x,t)\, d_B(x,t) + \bigl(1 - p(x,t)\bigr)^2 d_F(x,t) + \lambda \sum_{y \in N_x} \bigl(p(x,t) - p(y,t)\bigr)^2\, W_\gamma(x,y)$$

The complete process is implemented on the GPU.
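A CPU sketch of the Red-Black Gauss-Seidel iteration may help here. It is an illustration under assumptions, not the CUDA implementation from the slides: it minimizes the per-pixel expression for U as written above with the neighbors held fixed, which gives the update p(x,t) = (d_F + λ Σ_y W_γ p(y,t)) / (d_B + d_F + λ Σ_y W_γ). It also assumes a grayscale image f, a 4-neighborhood for N_x, VM strictly inside (0, 1), and it omits the multigrid part. The name `qmmf_red_black` is hypothetical.

```python
import math

def qmmf_red_black(vm, f, lam=1.0, gamma=0.1, iters=50):
    """Red-black Gauss-Seidel sweeps for the binary QMMF energy (sketch)."""
    h, w = len(vm), len(vm[0])
    d_b = [[-math.log(vm[i][j]) for j in range(w)] for i in range(h)]
    d_f = [[-math.log(1.0 - vm[i][j]) for j in range(w)] for i in range(h)]
    p = [row[:] for row in vm]  # initialize p with the likelihood
    for _ in range(iters):
        for color in (0, 1):  # update "red" pixels, then "black" pixels
            for i in range(h):
                for j in range(w):
                    if (i + j) % 2 != color:
                        continue
                    num = d_f[i][j]
                    den = d_b[i][j] + d_f[i][j]
                    for di, dj in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                        yi, yj = i + di, j + dj
                        if 0 <= yi < h and 0 <= yj < w:
                            diff = f[i][j] - f[yi][yj]
                            wgt = gamma / (gamma + diff * diff)
                            num += lam * wgt * p[yi][yj]
                            den += lam * wgt
                    p[i][j] = num / den
    return p
```

Pixels of one color have no 4-neighbors of the same color, so all of them can be updated independently; that property is what makes the scheme suitable for one GPU thread per pixel.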


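Tonal Stabilization

Input: $p(x,t-1)$.

$$R(x,t) = \begin{cases} \dfrac{I(\mathit{BM}(x,t))}{I(f(x,t))} & \text{if } p(x,t-1) \ge \tfrac{1}{2} \\ 1 & \text{otherwise} \end{cases}$$

$$D_t(b_j) = \frac{\sum_x \delta\bigl(I(f(x,t)) - b_j\bigr)\, R(x,t)}{\sum_x \delta\bigl(I(f(x,t)) - b_j\bigr)}$$

$$\mathcal{T}[f(x,t)] = \beta\, D_t(f(x,t))\, f(x,t) + (1 - \beta)\, f(x,t)$$

$$\mathit{VM}(x,t) = \begin{cases} 1 - \epsilon & \text{if } \lVert \mathcal{T}[f(x,t)] - \mathit{BM}(x,t) \rVert^2 \le \theta_1 \\ \epsilon & \text{otherwise} \end{cases}$$

Slide annotations on these steps: GPU, CPU, GPU.

Tonal Stabilization. (a) Previous frame. (b) Current frame. (c) $\mathit{VM}(x,t)$ without $\mathcal{T}$. (d) $\mathit{VM}(x,t)$ with $\mathcal{T}$.

[2] P. Spagnolo, T. Orazio, M. Leo, A. Distante, Moving object segmentation by background subtraction and temporal analysis, Image Vision Comput. 24 (5) (2006) 411-423.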
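The tonal stabilization formulas above can be sketched on the CPU as follows. This is an illustration only, with several assumptions that are not in the slides: frames are grayscale 2D lists of integers in [0, 255], so that I(f) = f and each intensity value is its own bin b_j; pixels with f = 0 use a ratio of 1 to avoid division by zero; and the function names are hypothetical.

```python
def tonal_stabilization(f, bm, p_prev, beta=0.5, bins=256):
    """Apply T[f] using the per-intensity mean ratio D_t (sketch)."""
    ratio_sum = [0.0] * bins
    count = [0] * bins
    for f_row, bm_row, p_row in zip(f, bm, p_prev):
        for fv, bv, pv in zip(f_row, bm_row, p_row):
            # R(x,t): Bg-model/frame ratio for pixels that were Bg at t-1
            r = bv / fv if (pv >= 0.5 and fv > 0) else 1.0
            ratio_sum[fv] += r
            count[fv] += 1
    # D_t(b_j): mean ratio over the pixels whose intensity equals b_j
    d = [ratio_sum[b] / count[b] if count[b] else 1.0 for b in range(bins)]
    return [[beta * d[fv] * fv + (1.0 - beta) * fv for fv in row] for row in f]

def background_likelihood(stabilized, bm, theta1=100.0, eps=0.01):
    """VM(x,t): 1 - eps where the stabilized frame matches the Bg model."""
    return [
        [1.0 - eps if (s - b) ** 2 <= theta1 else eps
         for s, b in zip(s_row, b_row)]
        for s_row, b_row in zip(stabilized, bm)
    ]
```

For example, if the scene darkens so that Bg pixels drop from 100 to 50, the mean ratio for intensity 50 is 2, and with `beta=1.0` the stabilized Bg pixels return to 100 before being compared with the Bg model.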


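Cast Shadows

Input: $\mathit{VM}(x,t)$.

$$\mathit{VL}(x,t) = \exp\bigl(-\lVert \mathcal{T}[f(x,t)] - \mathit{BM}(x,t) \rVert^2\bigr)$$

$$\mathit{VH}(x,t) = \exp\bigl(-\lVert C[f(x,t)] - C[\mathit{BM}(x,t)] \rVert^2\bigr)$$

$$\mathit{VG}(x,t) = \exp\bigl(-\lVert G[f(x,t)] - G[\mathit{BM}(x,t)] \rVert^2\bigr)$$

$$\mathit{VS}(x,t) = \mathit{VL}(x,t) \times \mathit{VH}(x,t) \times \mathit{VG}(x,t)$$

$$\mathit{VM}(x,t) = \begin{cases} 1 - \epsilon & \text{if } \mathit{VS}(x,t) > \theta_2 \\ \epsilon & \text{otherwise} \end{cases}$$

Cast Shadows Detection. (a) Current frame. (b) $\mathit{VL}(x,t)$. (c) $\mathit{VH}(x,t)$. (d) $\mathit{VG}(x,t)$. (e) Shadows detected.

[3] G. F. Fung, N. H. Yung, G. K. Pang, A. H. Lai, Effective moving cast shadow detection for monocular color traffic image sequences, Optical Engineering 41 (6) (2002) 1425-1440.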
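The shadow test above reduces to a product of three exponentials followed by a threshold. The sketch below shows that per-pixel arithmetic only. It assumes the three squared differences (for the stabilized frame, the C feature and the G feature) have already been computed and scaled so the exponentials are meaningful; the slides do not define C and G, so they are not implemented here, and the function names and the default values of `theta2` and `eps` are hypothetical.

```python
import math

def shadow_likelihood(dl2, dh2, dg2):
    """VS = VL * VH * VG from the three squared differences at one pixel."""
    return math.exp(-dl2) * math.exp(-dh2) * math.exp(-dg2)

def update_vm_for_shadows(vs, theta2=0.5, eps=0.01):
    """VM = 1 - eps when the shadow likelihood VS exceeds theta2, else eps."""
    return 1.0 - eps if vs > theta2 else eps
```

Because VS is a product, a large difference in any one of the three features is enough to push it below the threshold.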


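Camouflage Situations

Input: $p(x,t-1)$.

$$\mathit{VT}(x,t) = \begin{cases} 1 - \epsilon & \text{if } \lVert f(x,t) - f(x,t-1) \rVert^2 \le \theta_3 \\ \epsilon & \text{otherwise} \end{cases}$$

$$\mathit{VC}(x,t) = \mathit{VM}(x,t) \times \mathit{VT}(x,t) \times [1 - p(x,t-1)]$$

$$\mathit{VM}(x,t) = \begin{cases} \tfrac{1}{2} & \text{if } \mathit{VC}(x,t) > \tfrac{1}{2} \\ \mathit{VM}(x,t) & \text{otherwise} \end{cases}$$

Camouflage Situations. (a) Bg model. (b) Current frame. (c) $\mathit{VM}(x,t)$ without CS Control. (d) $\mathit{VM}(x,t)$ with CS Control.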
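The camouflage rule above can be read as: a pixel that looks like Bg now, did not change since the previous frame, and was Fg in the previous frame is treated as undecided (VM = 1/2), leaving the decision to the spatial regularization. A per-pixel sketch follows, assuming grayscale values; the function name and the default values of `theta3` and `eps` are hypothetical.

```python
def camouflage_control(vm, f, f_prev, p_prev, theta3=25.0, eps=0.01):
    """Return the updated VM for one pixel under the camouflage rule."""
    # VT: likelihood that the pixel did not change between t-1 and t
    vt = 1.0 - eps if (f - f_prev) ** 2 <= theta3 else eps
    # VC: looks like Bg, is temporally stable, and was Fg at t-1
    vc = vm * vt * (1.0 - p_prev)
    return 0.5 if vc > 0.5 else vm
```

For instance, `camouflage_control(0.99, 120, 121, 0.0)` returns 0.5 because the pixel was Fg in the previous frame, while the same call with `p_prev=1.0` leaves VM at 0.99.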

• Visual Profiler

Stages shown: QMMF binary segmentation, Tonal Stabilization, Cast Shadows, Camouflage Situations.

QUANTITATIVE EVALUATION

Panels: Original Video, Segmented Video. Row labels: GT, TS, 54 G, TT, S51.

Microsoft database: http://research.microsoft.com/vision/cambridge/i2i/DSWeb.htm


Panels: Tree-based Classifiers [4], VScreen method. Row labels: GT, TS, 54 G, TT, S51.

[4] P. Yin, A. Criminisi, J. Winn, I. Essa, Bilayer Segmentation of Webcam Videos Using Tree-Based Classifiers, IEEE Trans. on Pattern Analysis and Machine Intelligence 33 (1) (2011) 30-42.


Sequence           Method              F. Neg.   F. Pos.   Total Error
Intelligent_room   García et al. [5]     159.1      74.1         233.2
                   VScreen                58.0      45.4         103.4
Video2_long        García et al. [5]     302.5      59.3         361.8
                   VScreen               138.7      18.2         156.9
(name missing)     VScreen               346.1      24.0         370.1
(name missing)     VScreen               402.3      46.9         449.2

[5] A. García and J. Bescós. Video object segmentation based on feedback schemes guided by a low-level scene ontology. In Proc. ACIVS ’08, pages 322–333, 2008.
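In the table above, each Total Error equals F. Neg. plus F. Pos. (for example, 58.0 + 45.4 = 103.4). The sketch below shows how such counts could be obtained for one frame. It assumes the errors are counts of misclassified pixels against a ground-truth mask, with 1 marking Fg; the slides do not state the unit or the averaging used, and the function name is hypothetical.

```python
def segmentation_errors(pred, truth):
    """Count false negatives, false positives and their sum (1 = Fg)."""
    fn = fp = 0
    for p_row, t_row in zip(pred, truth):
        for p, t in zip(p_row, t_row):
            if t == 1 and p == 0:
                fn += 1  # Fg pixel labeled as Bg
            elif t == 0 and p == 1:
                fp += 1  # Bg pixel labeled as Fg
    return fn, fp, fn + fp
```

Averaging these per-frame counts over a sequence would give non-integer values like those in the table, although that averaging is itself an assumption.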

RUNNING VSCREEN

Size of Video: 720 x 304. Rate: 32 f/s

Size of Video: 1280 x 720. Rate: 11 f/s

RATE OF VSCREEN

Rate in frames/second:

Image Size   GeForce 8800 GT   Tesla C1060   GeForce GTX 480
320 x 240                 48            63                65
640 x 480                 12            18                20

These rates were computed by assuming that the augmented region is the whole frame in the background video.
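One way to read the rates above is as pixel throughput, that is, width x height x frames/second. The snippet below does that arithmetic with the values from the table; the dictionary layout and the function name are only illustrative.

```python
# Frame rates from the table above, in frames/second.
RATES_FPS = {
    (320, 240): {"GeForce 8800 GT": 48, "Tesla C1060": 63, "GeForce GTX 480": 65},
    (640, 480): {"GeForce 8800 GT": 12, "Tesla C1060": 18, "GeForce GTX 480": 20},
}

def pixel_throughput(size, fps):
    """Pixels processed per second for a given frame size and rate."""
    width, height = size
    return width * height * fps

throughput = {
    (size, gpu): pixel_throughput(size, fps)
    for size, per_gpu in RATES_FPS.items()
    for gpu, fps in per_gpu.items()
}
```

On the GeForce 8800 GT the throughput is 3,686,400 pixels/second at both sizes, so quadrupling the pixel count divides the frame rate by four. On the GTX 480 it rises from 4,992,000 to 6,144,000 pixels/second at the larger size.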

EXPERIMENTS

• VScreen is robust to Illumination Changes (IC)

Panels: Original video, VScreen without IC Control, VScreen with IC Control.

• VScreen is robust to Cast Shadows (CtSd)

Panels: Original video, VScreen without CtSd Control, VScreen with CtSd Control.

• VScreen is robust to Camouflage Situations (CmSt)

Panels: Original video, VScreen without CmSt Control, VScreen with CmSt Control.

RESULTS

Panels: Original video, Modified video.

CONCLUSIONS

• We propose VScreen, an interactive and automatic tool for real-time video composition. The user just selects the region of the original video to be augmented with a new video or image.

• VScreen manages occlusions with the Fg/Bg video segmentation [6], implemented in CUDA so that the composition is computed in real time.

• Experiments and quantitative evaluation show that VScreen is robust to different scenarios (indoor/outdoor), with a lower total error than García et al. [5] on the Intelligent_room and Video2_long sequences.

• To watch the videos, visit http://www.cimat.mx/~mrivera/vscreen, where more videos (original and modified) are available in high resolution (1280 x 720).

[6] F.J. Hernandez-Lopez and M. Rivera, "Binary segmentation of video sequences in real time," Proc. MICAI, 163–168, IEEE Press, 2010.
