
1 Introduction to Video Background Subtraction

2 Motivation

Video action analysis has many popular applications, such as security surveillance, home care, and analysis of athletes' actions to predict and develop strategy.

The first step is to capture the humans or other moving parts of the scene, which we call the foreground. That information is then used to recognize the actions.

In this presentation, I will introduce the methods to extract the foreground.

3 Overall structure

[Diagram blocks: Background Subtraction, Activity Recognition, 3D model Estimation, Human Joints Estimation, Model learning, Feature points]

4 Methods of Background Subtraction

Early on, the widely used methods were the Gaussian Mixture Model (GMM) [1] and statistical methods [2]. The main advantage of GMM is that it can run in real time. However, it is sensitive to small disturbances such as changes in luminance.

In recent years, robust principal component analysis (RPCA) models [3][4][5] have been found to perform better than other state-of-the-art methods. In exchange, they require more time because they are solved iteratively.

5 GMM (Method 1)

GMM fits 3 to 5 Gaussian functions to each pixel of each color channel of the video frame.

For each new input pixel, the algorithm checks whether the value's distance from a Gaussian is smaller than that Gaussian's deviation. The Gaussians are determined by the past values at the same pixel location, and the outcome of the check decides whether the pixel is foreground or not.
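The per-pixel test described above can be sketched as follows. This is a simplified illustration and not the exact algorithm of [1]: the learning rate `alpha`, the 2.5-sigma match threshold, the initial and minimum variances, and the background weight threshold are all assumed values chosen for the example, and the model handles a single pixel in a single channel.

```python
import math
from dataclasses import dataclass


@dataclass
class Gaussian:
    weight: float
    mean: float
    var: float


def classify_and_update(components, x, alpha=0.05, k_sigma=2.5,
                        init_var=225.0, min_var=4.0, bg_weight=0.5):
    """Classify pixel value x and update the mixture in place.

    Returns True if x is judged foreground. All default parameters are
    illustrative assumptions, not values taken from [1].
    """
    # Try the most heavily weighted components first.
    ranked = sorted(components, key=lambda g: g.weight, reverse=True)
    match = next((g for g in ranked
                  if abs(x - g.mean) < k_sigma * math.sqrt(g.var)), None)

    # Decay every weight; the matched component gets its share back below.
    for g in components:
        g.weight *= (1.0 - alpha)

    if match is None:
        # No Gaussian explains x: replace the weakest one and flag foreground.
        weakest = min(components, key=lambda g: g.weight)
        weakest.weight, weakest.mean, weakest.var = alpha, float(x), init_var
        is_foreground = True
    else:
        match.weight += alpha
        diff = x - match.mean
        match.mean += alpha * diff
        match.var = max(min_var, match.var + alpha * (diff * diff - match.var))
        # A matched but rarely seen Gaussian still counts as foreground.
        is_foreground = match.weight < bg_weight

    total = sum(g.weight for g in components)
    for g in components:
        g.weight /= total
    return is_foreground
```

After the model has seen a stable value at a pixel for many frames, that value is classified as background, while a value far from every Gaussian is classified as foreground.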

6 GMM (cont’d)

Traffic sequence. This picture is the result shown in [1]. Left: the original frame. Right: the overlapped frames.

7 GMM (cont’d)

We used the code provided by [1] on a clip that we shot ourselves. There are holes and noise in the left image.

8 GMM (cont’d)

We apply morphological closing as a post-processing step to obtain cleaner images.
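As a minimal sketch of what closing does, the example below dilates and then erodes a binary mask stored as nested lists. The 3x3 square structuring element and the border handling are assumptions made for illustration; the slides do not state which element was used.

```python
def _apply_3x3(mask, reduce_fn, outside):
    """Apply reduce_fn (max or min) over each pixel's 3x3 neighbourhood."""
    h, w = len(mask), len(mask[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            window = [
                mask[yy][xx] if 0 <= yy < h and 0 <= xx < w else outside
                for yy in (y - 1, y, y + 1)
                for xx in (x - 1, x, x + 1)
            ]
            out[y][x] = reduce_fn(window)
    return out


def dilate(mask):
    # Pixels outside the image count as 0, so they never grow the mask.
    return _apply_3x3(mask, max, 0)


def erode(mask):
    # Pixels outside the image count as 1, so the border is not eaten away.
    return _apply_3x3(mask, min, 1)


def closing(mask):
    """Morphological closing: dilation followed by erosion."""
    return erode(dilate(mask))
```

Closing fills holes smaller than the structuring element while leaving the outer shape of the foreground region unchanged, which is why it suits the holes seen in the GMM result.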

9 Foreground classification & Statistics (Method 2)

The work [2] divides the foreground into 4 parts: moving visual object (MVO), ghost due to deinterlacing of TV, moving visual object shadow (MVO shadow), and ghost shadow.

The method judges whether each pixel belongs to the foreground. If it does not, several previous frames are used to estimate the current value.

Examples from [6] to show the effect of ghost
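The update step for non-foreground pixels can be illustrated with a small sketch. Using the temporal median over the previous frames is an assumption made here for the example; the slides only say that several previous frames are used. The function name and the nested-list frame layout are also hypothetical.

```python
from statistics import median


def update_background(background, history, foreground_mask):
    """Return a new background estimate.

    background:      H x W nested list, current background values
    history:         list of previous frames, each H x W
    foreground_mask: H x W nested list, truthy where the pixel is foreground

    Foreground pixels keep their old background value; all other pixels
    are re-estimated as the median of the previous frames (an assumption).
    """
    h, w = len(background), len(background[0])
    return [
        [
            background[y][x] if foreground_mask[y][x]
            else median(frame[y][x] for frame in history)
            for x in range(w)
        ]
        for y in range(h)
    ]
```

Skipping the update where a pixel is foreground keeps a moving object from being absorbed into the background estimate.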

10 Foreground classification & Statistics

Figure legend: MVO (moving visual object), ghost, moving visual object shadow, ghost shadow.

The result from [2]

11 RPCA (Method 3)

The method first transforms each video frame into a vector, and then combines the vectors into one large matrix.
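A minimal sketch of this construction is shown below. Storing each flattened frame as one column is an assumption made for illustration (a row layout would work equally well), and `frames_to_matrix` is a hypothetical helper name.

```python
def frames_to_matrix(frames):
    """Flatten each H x W frame row by row and stack the results as columns.

    Returns a nested list with H*W rows and one column per frame.
    """
    vectors = [[pixel for row in frame for pixel in row] for frame in frames]
    n_pixels = len(vectors[0])
    if any(len(v) != n_pixels for v in vectors):
        raise ValueError("all frames must have the same size")
    return [[v[i] for v in vectors] for i in range(n_pixels)]
```

For two 2x2 frames the result has 4 rows and 2 columns, so each row follows one pixel location through time.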

The decomposition assumes that the background matrix B is low rank and that the foreground matrix F is sparse. The sum of the two matrices is the matrix of original frames.

$$\arg\min_{B,F} \; \operatorname{rank}(B) + \gamma \|F\|_1 \quad \text{s.t.} \quad D = B + F$$

$F$: foreground matrix

$B$: background matrix

$D$: original frame matrix

$\gamma$: constant

$\|\cdot\|_1$: $\ell_1$ norm

12 RPCA (cont’d)

The objective function on the previous slide is an ill-posed problem, so the solution is to optimize B and F iteratively.

This is captured from [3].
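One common way to make the iteration concrete is to replace the rank by the nuclear norm, which is the convex relaxation studied in [3], and then alternate closed-form updates of B and F. The scheme below is a standard augmented-Lagrangian sketch and not necessarily the solver used in the code from [3]; the multiplier $Y$, the penalty parameter $\mu$, and the operator names $\operatorname{soft}$ and $\operatorname{SVT}$ are notation introduced here.

$$\min_{B,F} \; \|B\|_* + \gamma \|F\|_1 \quad \text{s.t.} \quad D = B + F$$

$$B \leftarrow \operatorname{SVT}_{1/\mu}\!\left(D - F + Y/\mu\right), \qquad F \leftarrow \operatorname{soft}_{\gamma/\mu}\!\left(D - B + Y/\mu\right), \qquad Y \leftarrow Y + \mu\,(D - B - F)$$

$$\operatorname{soft}_{\tau}(x) = \operatorname{sign}(x)\,\max(|x| - \tau,\, 0), \qquad \operatorname{SVT}_{\tau}(X) = U \operatorname{soft}_{\tau}(\Sigma)\, V^{\top} \ \text{ for } X = U \Sigma V^{\top}$$

Each B update needs a singular value decomposition of a matrix as large as D, which is the main reason this family of methods is much slower per frame than GMM.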

13 RPCA (cont’d)

We also ran the code provided by [3] on our own video clips.

Result of the code from [3] on our clips

Compared with GMM, RPCA is clearly more robust on our clips.

14 Generalized Fused Lasso (GFL) (Method 4)

The most recent work on background subtraction covered here is [5].

It keeps the same low-rank objective function, but the sparsity term on F is modeled by the generalized fused lasso function.

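$$\min_{B,F} \; \operatorname{rank}(B) + \gamma \|F\|_{gfl} \quad \text{s.t.} \quad D = B + F$$

where the GFL penalty combines an $\ell_1$ term with weighted differences between neighboring pixels:

$$\|F\|_{gfl} = \sum_{k} \Big( \|f^{(k)}\|_1 + \sum_{(i,j) \in N} w_{ij}^{(k)} \, \big| f_i^{(k)} - f_j^{(k)} \big| \Big)$$

$F$: foreground matrix

$B$: background matrix

$D$: original frame matrix

$\gamma$: constant

$N$: spatial neighborhood pixels

$f^{(k)}$: $k$th foreground vector

$w_{ij}^{(k)}$: weighting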
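To show why a fused lasso penalty favors contiguous foreground regions, the sketch below evaluates an $\ell_1$ term plus weighted absolute differences between neighboring pixels for one foreground frame. The 4-connected neighborhood and the single uniform `weight` are assumptions made for the example; [5] may define the neighborhood and weights differently.

```python
def gfl_penalty(f, weight=1.0):
    """Fused-lasso style penalty for one foreground frame.

    f is an H x W nested list. The result is the sum of absolute values
    plus `weight` times the absolute difference of every horizontally or
    vertically adjacent pixel pair (4-neighbourhood, an assumption).
    """
    h, w = len(f), len(f[0])
    l1_term = sum(abs(v) for row in f for v in row)
    fused_term = 0.0
    for y in range(h):
        for x in range(w):
            if x + 1 < w:
                fused_term += weight * abs(f[y][x] - f[y][x + 1])
            if y + 1 < h:
                fused_term += weight * abs(f[y][x] - f[y + 1][x])
    return l1_term + fused_term


# Same number of foreground pixels, arranged as one blob or scattered.
blob = [[1, 1, 0, 0], [1, 1, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
scattered = [[1, 0, 1, 0], [0, 0, 0, 0], [1, 0, 1, 0], [0, 0, 0, 0]]
```

Both masks have the same $\ell_1$ term, but the scattered one pays a much larger difference term, so minimizing the penalty pushes the foreground toward connected regions rather than isolated pixels.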

15 Generalized Fused Lasso (GFL) (cont’d)

Comparison of the GFL and RPCA results from [5]

Result from [5]. Panels: original image, ground truth, GFL.

16 Conclusion

• GMM costs only 0.03 seconds per frame, but it always leaves holes in the foreground regions.

• The low-rank matrix approach is appealing, but its running time is high: almost 30 seconds for 10 frames, or about 3 seconds per frame, roughly 100 times slower than GMM.

• The shadow parts of the images are not considered foreground. Some constraints may be needed to remove the shadows.

• All of the methods discussed above apply only to static cameras.

17 References

[1] Zoran Zivkovic. Improved Adaptive Gaussian Mixture Model for Background Subtraction. In Proc. ICPR, 2004.

[2] Rita Cucchiara, Costantino Grana, Massimo Piccardi, Andrea Prati. Detecting Moving Objects, Ghosts, and Shadows in Video Streams. IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 25, no. 10, October 2003.

[3] John Wright, Yigang Peng, Yi Ma, Arvind Ganesh, Shankar Rao. Robust Principal Component Analysis: Exact Recovery of Corrupted Low-Rank Matrices by Convex Optimization. NIPS 2009.

[4] Xiaowei Zhou, Can Yang, Weichuan Yu. Moving Object Detection by Detecting Contiguous Outliers in the Low-Rank Representation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2012.

[5] Bo Xin, Yuan Tian, Yizhou Wang, Wen Gao. Background Subtraction via Generalized Fused Lasso Foreground Modeling. CVPR 2015.

[6] https://en.wikipedia.org/wiki/Interlaced_video
