Online Stochastic Tensor Decomposition for Background Subtraction in Multispectral Video Sequences

Andrews Sobral¹, Sajid Javed², Soon Ki Jung², Thierry Bouwmans¹, and El-hadi Zahzah¹

¹ Laboratoire MIA (Mathématiques, Image et Applications), Université de La Rochelle, France
² Virtual Reality Laboratory, School of Computer Science and Engineering, Kyungpook National University, Republic of Korea

18 December, 2015


Main Contents

• Introduction
• Tensor Decomposition
  – Methods
  – Challenges
• Proposed Methodology
• Experimental Evaluations
• Conclusion

Introduction

• What is a tensor?
  – A multi-dimensional numerical array
    • a generalization of conventional arrays
      – Vector
        o first-order tensor (also called a rank-1 tensor, where "rank" means the number of indices)
      – Matrix
        o second-order tensor (rank-2 tensor)
    • Higher-order tensors (order ≥ 3) store data in a multi-dimensional array
  – Main operation
    • unfolding, or matricization
      – reformatting a tensor into a matrix
        o along frontal, vertical, or horizontal slices
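To make unfolding concrete, here is a minimal NumPy sketch. The helper names `unfold` and `fold` are illustrative, not from the slides, and the column ordering follows NumPy's row-major flattening, which is only one of several conventions used in the tensor literature.

```python
import numpy as np

def unfold(tensor, mode):
    """Mode-`mode` unfolding: bring that axis to the front, flatten the rest."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)

def fold(matrix, mode, shape):
    """Inverse of `unfold` for a tensor with the given full shape."""
    rest = [s for i, s in enumerate(shape) if i != mode]
    return np.moveaxis(matrix.reshape([shape[mode]] + rest), 0, mode)

# A small third-order tensor and its three unfoldings
X = np.arange(24).reshape(2, 3, 4)
X0, X1, X2 = (unfold(X, m) for m in range(3))
print(X0.shape, X1.shape, X2.shape)

# Folding an unfolding recovers the original tensor
X_back = fold(X1, 1, X.shape)
```

Each unfolding keeps one mode as the row index and merges the remaining modes into columns, so matrix tools such as the SVD can be applied per mode.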

Introduction (cont.)

• A video, or sequence of images, can be represented as a tensor

Tensor Decomposition

• Can a tensor be decomposed for background subtraction?
  – Two components
    • Multi-dimensional low-rank tensor (corresponds to the background model)
    • Multi-dimensional sparse tensor (corresponds to the moving objects)
• Matrix-based decomposition
  – a matrix considers only a single channel (i.e., grayscale)
  – spatial correlation is lost
    • leading to erroneous foreground regions
• Tensor-based decomposition
  – multi-dimensional data is considered (3rd- or 4th-order tensor)
  – a multi-aspect generalization of matrices
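A toy illustration of the low-rank plus sparse idea, assuming a perfectly static background and a single moving pixel (both are simplifications chosen for this sketch, not properties of real video):

```python
import numpy as np

rng = np.random.default_rng(0)
h, w, t = 8, 8, 10                      # toy frame size and number of frames

# Static background repeated over time: every frame is identical
background = rng.random((h, w))
low_rank = np.repeat(background[:, :, None], t, axis=2)   # h x w x t tensor

# One "moving object" pixel per frame, sliding along row 3
sparse = np.zeros((h, w, t))
for k in range(t):
    sparse[3, k % w, k] = 0.5

video = low_rank + sparse               # observed tensor

# Unfold along time: each column is one vectorized frame
frames = low_rank.reshape(h * w, t)
print("rank of background unfolding:", np.linalg.matrix_rank(frames))
print("non-zero fraction of sparse part:", np.count_nonzero(sparse) / sparse.size)
```

Because all background frames are identical, the time unfolding has rank one, while the moving object occupies only a small fraction of the entries.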

Tensor Decomposition

• Example: background subtraction via tensor decomposition under a convex optimization framework

[Figure: frontal slices of the Input, Low-rank, Sparse, and Mask tensors]

Tensor Decomposition

• Methods
  – Tucker/HOSVD
  – CANDECOMP/PARAFAC (CP)
  – NTF (Non-negative Tensor Factorization)
  – NTD (Non-negative Tucker Decomposition)
  – NCP (Non-negative CP Decomposition)
• Major challenges
  – Batch optimization
  – Higher-order SVD computation
  – Computational complexity
  – Designed only for monochromatic (i.e., grayscale) or trichromatic (i.e., RGB) cameras
  – Real-time processing is not achievable

Proposed Methodology

• Is an online tensor decomposition method possible for background subtraction on RGB as well as multispectral bands?
  – Main contributions
    • Online Stochastic framework for Tensor Decomposition (OSTD)
      – computationally efficient
      – lower memory cost
    • OSTD for Multispectral Video Sequences (MSVS)
      – RGB alone is not sufficient under color saturation, shadows, and reflections
      – Multispectral bands can improve foreground segmentation

Proposed Framework

[Figure: input multispectral bands → $N$th-order tensor → OSTD (Online Stochastic Tensor Decomposition) → low-rank and sparse components]

OSTD: The Model

• Let $\mathcal{X}$ be an $N$th-order observation tensor
  – corrupted by outliers
• Main assumption
  – $\mathcal{X}$ can be reconstructed as the combination of
    • a low-rank component, $\mathcal{L}$
    • a sparse component, $\mathcal{S}$
  – within a convex optimization framework
    • $\|\mathcal{L}^{[i]}\|_*$ represents the nuclear norm of the $i$th-mode unfolding
    • $\|\mathcal{S}^{[i]}\|_1$ represents the $\ell_1$ norm
• Stochastic/online optimization as proposed by [Feng et al. 2013]
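As a sketch only, one objective consistent with these bullets is the robust PCA form applied to each mode-$i$ unfolding. The symbols $\mathcal{X}$, $\mathcal{L}$, $\mathcal{S}$, the superscript $[i]$ for the unfolding, and the exact placement of the weights $\lambda_1$, $\lambda_2$ are notational assumptions here, not text recovered from the slide.

```latex
\min_{\mathcal{L},\,\mathcal{S}} \;
\sum_{i=1}^{N} \left(
  \tfrac{1}{2}\,\bigl\| \mathcal{X}^{[i]} - \mathcal{L}^{[i]} - \mathcal{S}^{[i]} \bigr\|_F^2
  + \lambda_1 \bigl\| \mathcal{L}^{[i]} \bigr\|_*
  + \lambda_2 \bigl\| \mathcal{S}^{[i]} \bigr\|_1
\right)
```

The first term measures reconstruction error, the nuclear norm encourages a low-rank background, and the $\ell_1$ norm encourages a sparse foreground.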

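OSTD: Online Optimization

• Main notion
  – process only one frame per time instance $t$
    • MSVS: process each band $k$
• The nuclear norm is reformulated
  – the nuclear norm is decomposed into
    • an explicit product of basis and coefficients
    • using the reformulation proposed by [Feng et al. 2013]:

      $$\|\mathcal{L}^{[i]}\|_* = \inf_{L_i \in \mathbb{R}^{p \times r},\; R_i \in \mathbb{R}^{n \times r}} \left\{ \tfrac{1}{2}\left( \|L_i\|_F^2 + \|R_i\|_F^2 \right) \;:\; \mathcal{L}^{[i]} = L_i R_i^T \right\}$$

      where $\mathcal{L}^{[i]}$ is the $i$th-mode unfolding of the low-rank component
    – $p$ is the ambient dimension and $r$ is the rank
• Stochastic optimization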
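The factorized form of the nuclear norm can be checked numerically. The sketch below assumes the balanced factors $L = U\sqrt{\Sigma}$ and $R = V\sqrt{\Sigma}$ taken from an SVD, which is the standard choice that attains the infimum; the sizes `p`, `n`, `r` are arbitrary toy values.

```python
import numpy as np

rng = np.random.default_rng(0)
p, n, r = 8, 6, 3
X = rng.standard_normal((p, r)) @ rng.standard_normal((r, n))   # rank-r matrix

# Truncated SVD of X
U, s, Vt = np.linalg.svd(X, full_matrices=False)
U, s, Vt = U[:, :r], s[:r], Vt[:r]

# Balanced factors: split the singular values evenly between L and R
L = U * np.sqrt(s)        # p x r basis
R = Vt.T * np.sqrt(s)     # n x r coefficients

nuclear = np.linalg.norm(X, "nuc")
factored = 0.5 * (np.linalg.norm(L, "fro") ** 2 + np.linalg.norm(R, "fro") ** 2)
print(nuclear, factored)
```

Both printed values agree up to rounding, which is why the nuclear norm can be replaced by Frobenius norms of an explicit basis and coefficient matrix that are cheap to update online.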

OSTD cont…

• Advantages
  – no batch processing
  – the basis is updated iteratively
  – applied to each $i$th mode
• Main processing: 3 steps
  – Low-rank approximation
    • Initialize the basis, $L$
      – Bilateral Random Projections (BRP) method:

        $$L = Y_1 \left( A_2^T Y_1 \right)^{-1} Y_2^T$$

        o $A_1$, $A_2$ are random matrices, and $Y_1$, $Y_2$ are the resulting random projections of the input
        o speeds up low-rank recovery: fast convergence
          • singular values decay slowly
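A minimal sketch of a bilateral random projection, assuming the standard BRP form $Y_1 = X A_1$, $Y_2 = X^T A_2$ with Gaussian $A_1$, $A_2$ and the inner matrix $A_2^T Y_1$, so that all dimensions agree. The function name `brp_lowrank` is illustrative.

```python
import numpy as np

def brp_lowrank(X, r, rng):
    """Rank-r approximation of X from two random projections."""
    m, n = X.shape
    A1 = rng.standard_normal((n, r))   # right random matrix
    A2 = rng.standard_normal((m, r))   # left random matrix
    Y1 = X @ A1                        # m x r right projection
    Y2 = X.T @ A2                      # n x r left projection
    # L = Y1 (A2^T Y1)^{-1} Y2^T, using a linear solve instead of an explicit inverse
    return Y1 @ np.linalg.solve(A2.T @ Y1, Y2.T)

rng = np.random.default_rng(0)
X = rng.standard_normal((50, 3)) @ rng.standard_normal((3, 40))   # exactly rank 3
L = brp_lowrank(X, 3, rng)
rel_err = np.linalg.norm(L - X) / np.linalg.norm(X)
print("relative error:", rel_err)
```

For an exactly rank-`r` input the reconstruction is exact up to rounding. For noisy data it is only an approximation, which is adequate as an initial basis that later updates refine.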

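OSTD cont…

• Find the coefficients $R$ (one vector $r_t$ per sample) as

  $$r_t = \left( L^T L + \lambda_1 I \right)^{-1} L^T \left( z_t - e_{t-1} \right)$$

  where $z_t$ denotes the observed sample at time $t$
• Fix $R$ and update the basis $L$
  – using the block-coordinate descent method
  – updated incrementally
• Sparse outlier estimation
  – $M_t = z_t - L r_t$, thresholded at each element $k$:

    $$e_t(k) = \begin{cases} M_t(k) - \lambda_2, & \text{if } M_t(k) > \lambda_2 \\ M_t(k) + \lambda_2, & \text{if } M_t(k) < -\lambda_2 \\ 0, & \text{otherwise} \end{cases}$$

• Background model: $L r_t$
• Sparse component: $e_t$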
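The per-sample updates can be sketched as follows. This follows the general online robust PCA recipe (ridge coefficients, soft-thresholded outliers, block-coordinate descent on the basis) under assumed names and toy parameters; the accumulators `A` and `B`, the iteration count, and the values of `lam1` and `lam2` are illustrative choices, not the authors' settings.

```python
import numpy as np

def soft_threshold(x, lam):
    """Element-wise soft-thresholding: shrink towards zero by lam."""
    return np.sign(x) * np.maximum(np.abs(x) - lam, 0.0)

def solve_sample(z, L, lam1, lam2, n_iter=50):
    """Alternate between ridge coefficients r and sparse outliers e for one sample z."""
    # Precompute (L^T L + lam1 I)^{-1} L^T, reused in every iteration
    P = np.linalg.solve(L.T @ L + lam1 * np.eye(L.shape[1]), L.T)
    e = np.zeros_like(z)
    for _ in range(n_iter):
        r = P @ (z - e)                       # coefficient update
        e = soft_threshold(z - L @ r, lam2)   # outlier update
    return r, e

def update_basis(L, A, B, lam1):
    """One block-coordinate descent pass over the columns of L (returns a new array)."""
    L = L.copy()
    A_reg = A + lam1 * np.eye(A.shape[0])
    for j in range(L.shape[1]):
        L[:, j] += (B[:, j] - L @ A_reg[:, j]) / A_reg[j, j]
    return L

# Toy data: one sample from a rank-3 subspace with five large outliers
rng = np.random.default_rng(0)
p, rank = 100, 3
L = rng.standard_normal((p, rank))
z = L @ rng.standard_normal(rank)
z[:5] += 10.0

r, e = solve_sample(z, L, lam1=1e-3, lam2=0.5)
background = L @ r          # low-rank estimate for this sample
foreground_mask = e != 0    # support of the sparse component

# Accumulate sufficient statistics for this sample, then refresh the basis
A = np.outer(r, r)
B = np.outer(z - e, r)
L_new = update_basis(L, A, B, lam1=1e-3)
```

In a streaming setting `A` and `B` would be running sums over all samples seen so far, so the memory cost depends on the frame size and rank rather than on the number of frames.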

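Experimental Evaluations

• Synthetic evaluation
  – A true low-rank tensor of size 30 × 30 × 30 is generated from rank-3 factor matrices
    • $Z_n \in \mathbb{R}^{30 \times 3}$, where $n = 1, 2, 3$
    • random entries are corrupted
  – The Relative Root Square Error (RRSE) measure is computed:

    $$\mathrm{RRSE} = \frac{\|\hat{\mathcal{L}} - \mathcal{L}\|_F}{\|\mathcal{L}\|_F}$$

    where $\mathcal{L}$ is the true low-rank tensor and $\hat{\mathcal{L}}$ its estimate
• Two different cases are considered
  – smaller magnitude, relative to the true data
  – higher magnitude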
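A sketch of this synthetic setup in NumPy. The corruption rate (10%), the outlier range, and the definition of RRSE as the Frobenius norm of the error divided by the Frobenius norm of the true tensor are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Rank-3 factor matrices Z_n in R^{30x3}, n = 1, 2, 3
Z1, Z2, Z3 = (rng.standard_normal((30, 3)) for _ in range(3))

# CP construction: sum over r of the outer products of the r-th columns
L_true = np.einsum("ir,jr,kr->ijk", Z1, Z2, Z3)

# Corrupt a random subset of entries (rate and magnitude are assumed values)
X = L_true.copy()
mask = rng.random(X.shape) < 0.10
X[mask] += rng.uniform(-10, 10, size=mask.sum())

def rrse(estimate, truth):
    """Relative root square error between an estimate and the ground truth."""
    return np.linalg.norm(estimate - truth) / np.linalg.norm(truth)

print("RRSE of the corrupted input:", rrse(X, L_true))
```

A decomposition method is then judged by how far its recovered low-rank tensor brings the RRSE below that of the corrupted input.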

Experimental Evaluations

• Comparison methods: batch processing algorithms
  – Bayesian Robust Tensor Factorization (BRTF) [Q. Zhao et al. 2014]
  – Higher-Order RPCA (HORPCA) [D. Goldfarb et al. 2013]
  – Tensor factorization method CP-ALS [T. Kolda 2009]
  – Higher-Order SVD (HOSVD) [L. De Lathauwer et al. 2013]

Experimental Evaluations

• Multispectral Video Sequences (MSVS)
  – Acquisition
    • commercial camera (FD-1665-MS)
      – 7 narrow spectral bands = 6 visible + 1 NIR band
  – 5 video sequences
    • 1 indoor scene
    • 4 outdoor scenes
    • frame size: [658 × 491 × 3], with 250 to 2300 frames per sequence
    • frame rate: depends on the overall scene illumination
      – 5 fps for dark scenes and 15 fps for brighter ones
  – Main challenges
    • gradual illumination changes, shadows, and intermittent object motion
    • camouflage (color similarity between background and objects)

Experimental Evaluations

• MSVS dataset
  – “Integration of MS bands improves the foreground segmentation”

[Figure: sample frames from Video 1, Video 2, Video 3, Video 4, and Video 5]

Experimental Evaluations

• Visual results of Video 1

[Figure: rows show Input, Low-rank, Sparse, and Mask; columns show RGB, VS-1, VS-2, VS-3, VS-4, VS-5, VS-6, and NIR]

Experimental Evaluations

• Visual results of Video 2

[Figure: rows show Input, Low-rank, Sparse, and Mask; columns show RGB, VS-1, VS-2, VS-3, VS-4, VS-5, VS-6, and NIR]

Experimental Evaluations

[Figure: rows show Video 1 to Video 5; columns show Input, Low-rank, Ground Truth, RGB Mask, 6 VSB Mask, and 1 NIR Mask]

Experimental Evaluations

• Qualitative comparison
  – White: true positive (TP) pixels
  – Black: true negative (TN) pixels
  – Red: false positive (FP) pixels
  – Green: false negative (FN) pixels

[Figure: rows show Video 2, Video 3, and Video 5; columns show Input, Ground Truth, Proposed, BRTF, HORPCA, and CP-ALS]
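The color coding used in the qualitative comparison can be reproduced with a few lines of NumPy. The function name `color_code` and the toy masks are illustrative.

```python
import numpy as np

def color_code(pred, truth):
    """RGB image marking TP (white), TN (black), FP (red), and FN (green)."""
    pred, truth = pred.astype(bool), truth.astype(bool)
    out = np.zeros(pred.shape + (3,), dtype=np.uint8)   # TN pixels stay black
    out[pred & truth] = (255, 255, 255)                 # TP: white
    out[pred & ~truth] = (255, 0, 0)                    # FP: red
    out[~pred & truth] = (0, 255, 0)                    # FN: green
    return out

truth = np.array([[1, 1, 0], [0, 0, 0]])
pred = np.array([[1, 0, 1], [0, 0, 0]])
image = color_code(pred, truth)
```

The resulting array can be saved or displayed with any image library to compare a predicted mask against the ground truth at a glance.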

Experimental Evaluations

• Quantitative analysis
  – The F-measure score is computed for RGB and MS bands for comparison
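The F-measure is the harmonic mean of precision and recall. A minimal sketch for binary foreground masks follows; the function name and the toy masks are illustrative, and returning 0 when a denominator vanishes is a convention chosen here.

```python
import numpy as np

def f_measure(pred, truth):
    """F-measure of a binary foreground mask against the ground truth."""
    pred, truth = pred.astype(bool), truth.astype(bool)
    tp = np.sum(pred & truth)     # foreground correctly detected
    fp = np.sum(pred & ~truth)    # background marked as foreground
    fn = np.sum(~pred & truth)    # foreground missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

truth = np.array([[1, 1, 0], [0, 0, 0]])
pred = np.array([[1, 0, 1], [0, 0, 0]])
score = f_measure(pred, truth)   # one TP, one FP, one FN
```

Computing this score per sequence, once for masks obtained from RGB and once for masks obtained from the multispectral bands, gives the kind of comparison described above.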

Experimental Evaluations

• Time complexity
  – independent of the number of samples
    • grows linearly with the image resolution

Experimental Evaluations

• Video demo 1
  – color saturation issue

[Video: Input, Ground Truth, RGB Mask, and MS Mask]

Experimental Evaluations

• Video demo 2
  – color saturation issue

[Video: Input, Ground Truth, RGB Mask, and MS Mask]

Experimental Evaluations

• Video demo 3
  – shadows, dynamic backgrounds, intermittent object motion

[Video: Input, Ground Truth, RGB Mask, and MS Mask]

Conclusion

• An online stochastic optimization framework is proposed
  – decomposes a tensor into a low-rank and a sparse tensor
    • computationally attractive
    • achieves real-time processing
  – shows great potential for multispectral bands
• Limitation
  – the proposed method is not stable for RGB image features
    • promising accuracy is achieved with the integration of MS bands
• Future work
  – disparity features will be integrated
  – can be extended to visual tracking, as low-rank sparse tracking
