1 Performance Evaluation of Object Detection Algorithms for Video Surveillance Jacinto C. Nascimento, Member, IEEE, and Jorge S. Marques IEEE Transactions

1

Performance Evaluation of Object Detection Algorithms for Video Surveillance

Jacinto C. Nascimento, Member, IEEE, and Jorge S. Marques

IEEE Transactions on Multimedia, Vol. 8, No. 4, August 2006

2

Outline

Introduction
Related Work
Segmentation Algorithms
Proposed Framework
Tests on PETS2001 Dataset
Conclusions

3

Introduction (1/4)

Video surveillance systems rely on the ability to detect moving objects in the video streams.

Detection should be reliable and effective in:
  unconstrained environments
  non-stationary backgrounds
  different motion patterns, etc.

4

Introduction (2/4)

Approaches to characterize the performance of video segmentation:
  Pixel-based methods
  Template-based methods
  Object-based methods

Three major drawbacks:
  Several types of error should be considered.
  Some methods are based on the selection with or without persons.
  It is not possible to define a unique ground truth.

5

Introduction (3/4)

Five segmentation algorithms are considered as examples and evaluated:
  BBS, W4, SGM, MGM, and LOTS.

Several types of errors are considered:
  Correct Detections, Detection Failures, Splits, Merges, Splits/Merges, and False Alarms.

Segmentation results of these algorithms are provided on the PETS2001 sequence.

Multiple interpretations are also considered.

6

Introduction (4/4)

Segmentation Algorithms

(BBS, W4, SGM, MGM, LOTS)

Proposed Framework

(User Friendly Interface)

Performance Evaluation

(CD, DF, Split, Merge, S/M, and FA)

Segmentation of video images

Create the ground truth

7

Related Work (1/3)

Background subtraction is a simple way to detect moving objects in video sequences:
  the difference between the current frame and the background image is compared with a threshold.

Several difficulties arise when the background image is corrupted by noise:
  camera movements
  fluttering objects (e.g., waving trees)
  illumination changes
  clouds, shadows

8

Related Work (2/3)

Some works use a deterministic background model:
  admissible interval for each pixel
  maximum rate of change in consecutive images, etc.

Most works rely on statistical models of the background:
  Each pixel is a random variable with a probability distribution.
  e.g., the Pfinder system uses a Gaussian model.
  mixture of Gaussian models

9

Related Work (3/3)

For shadows and non-stationary backgrounds:
  slow changes (e.g., sun motion) and rapid changes (clouds, rain, abrupt changes, etc.)
  recursively update the background parameters and thresholds

Presence of ghosts:
  A static object suddenly starts to move.
  Handled by combining background subtraction with frame differencing or by high-level operations.

10

Segmentation Algorithms

BBS: Basic Background Subtraction
W4: detection algorithm used in the W4 system [17]
SGM: Single Gaussian Model
MGM: Multiple Gaussian Model
LOTS: Lehigh Omnidirectional Tracking System, used to detect small non-cooperative targets [18]

[17] “W4: real time surveillance of people and their activities”

[18] “Into the woods: Visual surveillance of non-cooperative camouflaged targets in complex outdoor settings”

11

Segmentation Algorithms

BBS: Basic Background Subtraction

Compute the difference between the current frame and the background image.

Classify each pixel as foreground if

$\| I_t(x,y) - \mu_t(x,y) \| > T$

where $I_t(x,y)$ is the intensity of the current frame (3x1 vector), $\mu_t(x,y)$ is the mean intensity of the background, and $T$ is the threshold.

Pixels associated with the same object are grouped by connected component analysis.
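The BBS test can be sketched in a few lines of NumPy. This is only an illustrative sketch: the function name `bbs_foreground`, the use of the Euclidean norm over the three color channels, and intensities scaled to [0, 1] are assumptions, not details taken from the slides.

```python
import numpy as np

def bbs_foreground(frame, background_mean, threshold):
    """Return a boolean mask that is True where ||I_t - mu_t|| > T.

    frame, background_mean: arrays of shape (H, W, 3).
    The Euclidean norm over the channel axis is an assumption.
    """
    diff = frame.astype(np.float64) - background_mean.astype(np.float64)
    distance = np.linalg.norm(diff, axis=-1)
    return distance > threshold

# Toy 2x3 image: a black background and two changed pixels.
background = np.zeros((2, 3, 3))
frame = background.copy()
frame[0, 1] = [0.3, 0.0, 0.4]   # distance 0.5 from the background
frame[1, 2] = [0.1, 0.0, 0.0]   # distance 0.1 from the background

mask = bbs_foreground(frame, background, threshold=0.2)
print(mask)
```

Only the pixel whose distance exceeds the threshold is marked as foreground; grouping the foreground pixels into regions would be a separate connected component step.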

12

Segmentation Algorithms

Algorithm used in the W4 System

Designed for grayscale images. Three features:
  Min: minimum intensity
  Max: maximum intensity
  D: maximum intensity difference between consecutive frames

Classify the pixel I(x,y) as a foreground pixel if

$\big( I_t(x,y) < Min(x,y) \;\lor\; I_t(x,y) > Max(x,y) \big) \;\land\; | I_t(x,y) - I_{t-1}(x,y) | > D(x,y)$

Note: the condition is modified, with a modified threshold!

[17] W4: real-time surveillance of people and their activities
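A minimal sketch of the W4-style test as presented on this slide: a pixel is foreground when its intensity leaves the [Min, Max] interval and its change from the previous frame exceeds D. The function name `w4_foreground` and the toy values are hypothetical, and the per-pixel features are assumed to have been learned beforehand.

```python
import numpy as np

def w4_foreground(current, previous, min_img, max_img, diff_img):
    """Foreground mask for grayscale frames.

    A pixel is foreground if it lies outside [Min, Max] AND its
    frame-to-frame change is larger than D (all arrays share one shape).
    """
    current = current.astype(np.float64)
    previous = previous.astype(np.float64)
    outside_range = (current < min_img) | (current > max_img)
    fast_change = np.abs(current - previous) > diff_img
    return outside_range & fast_change

# Toy 1x3 example with hypothetical learned features.
min_img = np.full((1, 3), 100.0)
max_img = np.full((1, 3), 120.0)
diff_img = np.full((1, 3), 10.0)
previous = np.array([[110.0, 110.0, 122.0]])
current = np.array([[112.0, 150.0, 125.0]])

mask = w4_foreground(current, previous, min_img, max_img, diff_img)
print(mask)
```

The first pixel stays inside the interval, the second leaves it with a large change, and the third leaves it with a change smaller than D, so only the second is flagged.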

13

Segmentation Algorithms

SGM: Single Gaussian Model

The mean and covariance of each pixel are updated recursively:

$\mu_t(x,y) = (1-\alpha)\,\mu_{t-1}(x,y) + \alpha\, I_t(x,y)$

$\Sigma_t(x,y) = (1-\alpha)\,\Sigma_{t-1}(x,y) + \alpha\,(I_t(x,y) - \mu_t(x,y))(I_t(x,y) - \mu_t(x,y))^T$

α: constant, I(x,y): pixel of the current frame (YUV)

Classify each pixel as active or background using

$l(x,y) = -\frac{1}{2}\,(I_t(x,y) - \mu_t(x,y))^T\,(\Sigma_t)^{-1}\,(I_t(x,y) - \mu_t(x,y)) - \frac{1}{2}\ln|\Sigma_t| - \frac{m}{2}\ln(2\pi)$

If l(x,y) is small, the pixel is classified as active!

[1] Pfinder: real-time tracking of the human body
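The recursive update and the likelihood test for a single pixel can be sketched as follows. This is an illustration under assumptions: the function names are hypothetical, m is taken to be the dimension of the pixel vector (3 here), and the threshold T = -400 is borrowed from the parameter choice reported later in these slides.

```python
import numpy as np

def sgm_update(mean, cov, pixel, alpha):
    """Recursive update of the per-pixel mean and covariance."""
    new_mean = (1 - alpha) * mean + alpha * pixel
    d = pixel - new_mean
    new_cov = (1 - alpha) * cov + alpha * np.outer(d, d)
    return new_mean, new_cov

def sgm_log_likelihood(mean, cov, pixel):
    """Gaussian log-likelihood l(x,y) of one pixel vector."""
    m = pixel.shape[0]
    d = pixel - mean
    _, logdet = np.linalg.slogdet(cov)
    mahalanobis = d @ np.linalg.solve(cov, d)
    return -0.5 * mahalanobis - 0.5 * logdet - 0.5 * m * np.log(2 * np.pi)

mean = np.zeros(3)
cov = np.eye(3)
threshold = -400.0  # assumed value of T

l_near = sgm_log_likelihood(mean, cov, np.array([0.0, 0.0, 0.0]))
l_far = sgm_log_likelihood(mean, cov, np.array([30.0, 0.0, 0.0]))
near_is_active = l_near < threshold
far_is_active = l_far < threshold

new_mean, new_cov = sgm_update(mean, cov, np.array([1.0, 0.0, 0.0]), alpha=0.05)
print(near_is_active, far_is_active, new_mean)
```

A pixel far from the background mean gets a small log-likelihood and is classified as active, while a pixel at the mean is kept as background.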

14

Segmentation Algorithms

MGM: Multiple Gaussian Model

MGM models each pixel I(x,y) as a mixture of N (N=3) Gaussian distributions.
  I(x,y) is a 3x1 vector (R,G,B).

The mixture model is dynamically updated.
  N Gaussian distributions with respective weights
  weight: non-matching components of the mixture are not modified

[2] Learning Patterns of Activity Using Real-Time Tracking

15

Segmentation Algorithms

LOTS: Lehigh Omnidirectional Tracking System

Use two gray-level background images B1, B2, initialized using a set of T consecutive frames:

$B_1(x,y) = \min\{ I_t(x,y),\; t = 1, \dots, T \}$

$B_2(x,y) = \max\{ I_t(x,y),\; t = 1, \dots, T \}$

Targets are detected using two thresholds, a low threshold and a high threshold:

$T_L(x,y) = | B_1(x,y) - B_2(x,y) | + c_U$

$T_H(x,y) = T_L(x,y) + c_S$

$c_U, c_S \in [0, 255]$ (user specified!)

16

Segmentation Algorithms

LOTS: Lehigh Omnidirectional Tracking System

Each pixel is considered active if

$\min_i | I_t(x,y) - B_i^t(x,y) | > T_L(x,y)$

A target is a set of connected active pixels such that a subset of them verifies:

$\min_i | I_t(x,y) - B_i^t(x,y) | > T_H(x,y)$

$B_i^t(x,y),\ i = 1, 2$, the high threshold (TH), and the low threshold (TL) are updated recursively!

[18] Into the woods: Visual surveillance of non-cooperative camouflaged targets in complex outdoor settings
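The two-threshold detection described for LOTS can be sketched as below: backgrounds are the per-pixel minimum and maximum over the initial frames, active pixels exceed the low threshold, and a connected group of active pixels is kept as a target only if at least one of its pixels also exceeds the high threshold. The function names, the 4-connectivity, and the toy values of c_U and c_S are assumptions; the recursive update of the backgrounds and thresholds is left out.

```python
from collections import deque
import numpy as np

def lots_init(frames, c_u, c_s):
    """Build B1 (min), B2 (max) and the low/high thresholds."""
    stack = np.stack(frames).astype(np.float64)
    b1 = stack.min(axis=0)
    b2 = stack.max(axis=0)
    t_low = np.abs(b1 - b2) + c_u
    t_high = t_low + c_s
    return b1, b2, t_low, t_high

def lots_detect(frame, b1, b2, t_low, t_high):
    """Keep connected active regions that contain a high-threshold pixel."""
    frame = frame.astype(np.float64)
    dist = np.minimum(np.abs(frame - b1), np.abs(frame - b2))
    active = dist > t_low
    strong = dist > t_high
    targets = np.zeros_like(active)
    visited = np.zeros_like(active)
    rows, cols = active.shape
    for r0, c0 in zip(*np.nonzero(active)):
        if visited[r0, c0]:
            continue
        # Breadth-first flood fill over 4-connected active pixels.
        component = []
        queue = deque([(r0, c0)])
        visited[r0, c0] = True
        while queue:
            r, c = queue.popleft()
            component.append((r, c))
            for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
                if (0 <= nr < rows and 0 <= nc < cols
                        and active[nr, nc] and not visited[nr, nc]):
                    visited[nr, nc] = True
                    queue.append((nr, nc))
        if any(strong[r, c] for r, c in component):
            for r, c in component:
                targets[r, c] = True
    return targets

frames = [np.full((1, 5), 10.0), np.full((1, 5), 12.0)]
b1, b2, t_low, t_high = lots_init(frames, c_u=5, c_s=20)
frame = np.array([[11.0, 20.0, 45.0, 11.0, 20.0]])
targets = lots_detect(frame, b1, b2, t_low, t_high)
print(targets)
```

In the toy frame, the isolated active pixel at the end is discarded because no pixel in its component passes the high threshold, while the two adjacent pixels form a target.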

17

Proposed Framework

Principles:
  Select a set of sequences
  Object Detection (by automatic procedure)
  Manual Correction (user friendly interface), at 1 frame/second
  Error Analysis and Classification (Detection Failure, False Alarm, etc.)
  Statistics (Performance Evaluation!)

18

Proposed Framework Interface used to create ground truth manually.

four active regions

User can easily edit it and provide the correct segmentation!

output of the detector

4 false alarms!

19

Proposed Framework

Compare the output of the algorithm with the ground truth segmentation:
  Region Matching
  Region Overlap
  Area Matching
  Multiple Interpretations

20

Proposed Framework

Several cases are considered:
  Correct Detection (CD): 1-1 match
  False Alarm (FA): 0-1 match
  Detection Failure (DF): 1-0 match
  Merge Region (M): many-1 match
  Split Region (S): 1-many match
  Split-Merge Region (SM)

Correspondence: ground truth – detector output

21

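Proposed Framework

Region Matching

Binary Correspondence Matrix $C_t$: defines the correspondence between active regions.

$C_t(i,j) = \begin{cases} 1 & \text{if } \dfrac{\#(\tilde{R}_i \cap R_j)}{\#(\tilde{R}_i \cup R_j)} > T \\ 0 & \text{if } \dfrac{\#(\tilde{R}_i \cap R_j)}{\#(\tilde{R}_i \cup R_j)} < T \end{cases} \qquad \forall\, i \in \{1,\dots,N\},\ j \in \{1,\dots,M\}$

$L(i) = \sum_{j=1}^{M} C(i,j), \quad i \in \{1,\dots,N\}$

$C(j) = \sum_{i=1}^{N} C(i,j), \quad j \in \{1,\dots,M\}$

Classify detected region $R_j$ as:

Correct Detection: $\exists i:\ L(i) = C(j) = 1 \land C(i,j) = 1$
Merge: $\exists i:\ C(j) > 1 \land C(i,j) = 1$
Split: $\exists i:\ L(i) > 1 \land C(i,j) = 1$
Split-Merge: $\exists i:\ L(i) > 1 \land C(j) > 1 \land C(i,j) = 1$
False Alarm: $C(j) = 0$
Detection Failure: $L(i) = 0$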
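The matching rules can be illustrated with a small pure-Python sketch in which each region is a set of (row, column) pixels. The helper names, the string labels, the 20% overlap threshold, and the choice to test Split-Merge before Split and Merge are assumptions made for the example; rows of the matrix index ground truth regions and columns index detected regions.

```python
def overlap_ratio(truth, detected):
    """Shared pixels divided by the pixels covered by either region."""
    return len(truth & detected) / len(truth | detected)

def correspondence_matrix(truth_regions, detected_regions, threshold):
    """Binary matrix: rows are ground truth, columns are detections."""
    return [[1 if overlap_ratio(gt, det) > threshold else 0
             for det in detected_regions]
            for gt in truth_regions]

def classify(c, n_detected):
    """Label each detected region and list undetected ground truth."""
    n = len(c)
    row_sum = [sum(row) for row in c]                      # L(i)
    col_sum = [sum(c[i][j] for i in range(n))              # C(j)
               for j in range(n_detected)]
    failures = [i for i in range(n) if row_sum[i] == 0]    # DF
    labels = []
    for j in range(n_detected):
        matched = [i for i in range(n) if c[i][j] == 1]
        split = any(row_sum[i] > 1 for i in matched)
        merge = col_sum[j] > 1
        if not matched:
            labels.append("FA")
        elif split and merge:
            labels.append("SM")
        elif split:
            labels.append("S")
        elif merge:
            labels.append("M")
        else:
            labels.append("CD")
    return labels, failures

def block(row, col_start, col_end):
    """Hypothetical helper: a horizontal strip of pixels."""
    return {(row, col) for col in range(col_start, col_end)}

truth = [block(0, 0, 4), block(1, 0, 10), block(2, 0, 5),
         block(2, 5, 10), block(3, 0, 4)]
detected = [block(0, 0, 4), block(1, 0, 5), block(1, 5, 10),
            block(2, 0, 10), block(5, 0, 3)]

c = correspondence_matrix(truth, detected, threshold=0.2)
labels, failures = classify(c, len(detected))
print(labels, failures)
```

The toy scene contains one exact match, one ground truth region split into two detections, two ground truth regions merged into one detection, one spurious detection, and one missed ground truth region.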

22

Different matching cases (figure panels): Correct Detection, Detection Failure, Merge, False Alarm, Split-Merge, Split

23

Proposed Framework

Region Overlap

Overlap Requirement = 20%

Figure panels: DF! (Overlap < 20%); CD! (Overlap > 20%); Split (Overlap > 20%); 2 DF! (Overlap < 20%)

24

Proposed Framework

Area Matching

The higher the percentage of match size, the better the active regions produced by the algorithm.

For correctly detected regions, the match size is

$\dfrac{\#(\tilde{R}_i \cap R_j)}{\#(\tilde{R}_i \cup R_j)}$

This characterizes the performance of the detector!

25

Proposed Framework

Multiple Interpretations

Correct Split Example:

Figure panels: manual segmentation, SGM segmentation

Should be considered as valid!

26


Wrong Split Example:

Figure panels: manual segmentation, W4 segmentation

Wrong segmentation!

27


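Labeling Matrix M: different merged regions groups

Region Linking Procedure with three objects A, B, C

$M = \begin{bmatrix} 1 & 1 & 1 \\ 1 & 1 & 2 \\ 1 & 2 & 2 \\ 1 & 2 & 3 \end{bmatrix}$

$M_{FINAL} = \begin{bmatrix} 1 & 1 & 1 \\ 1 & 1 & 2 \\ 1 & 2 & 1 \\ 1 & 2 & 2 \\ 1 & 2 & 3 \end{bmatrix}$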
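One way to picture the rows of a labeling matrix is as group labels assigned to the objects A, B, C, where objects sharing a label are merged into one region. The sketch below simply enumerates every such grouping for n objects; it is an assumption that this matches the final matrix of the slide, and it is not the region linking procedure itself. The function name `group_labelings` is hypothetical.

```python
def group_labelings(n):
    """Enumerate all groupings of n objects as rows of group labels.

    Each row starts with label 1, and a new label may exceed the
    largest label used so far by at most one, so every grouping
    appears exactly once.
    """
    rows = []

    def extend(prefix):
        if len(prefix) == n:
            rows.append(tuple(prefix))
            return
        for label in range(1, max(prefix) + 2):
            extend(prefix + [label])

    if n > 0:
        extend([1])
    return rows

labelings = group_labelings(3)
for row in labelings:
    print(row)
```

For three objects this gives five rows, from all objects merged into one group (1, 1, 1) to all objects kept separate (1, 2, 3).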

28

Tests on PETS2001 Dataset

Evaluate several object detection algorithms using the PETS2001 dataset.
  Training (3064 frames) and test (2688 frames) sequences are used.
  The first 100 images were used to build the background model.
  The algorithms were evaluated at 1 frame/second.
  An area of 25 pixels was chosen, and the overlap requirement is 10%.

Ground Truth vs. Detector Output

29

Tests on PETS2001 Dataset

Choice of the Model Parameter (BBS)

ROC for different values of α: BBS

α = 0.05, α = 0.1, α = 0.15

Performance of BBS is independent of α.

T = 0.2 is the best value!

30

Tests on PETS2001 Dataset

Choice of the Model Parameter (SGM)

ROC for different values of α: SGM

α = 0.01, α = 0.05, α = 0.15

Choose α = 0.05, T = -400!

See the range -400 < T < -150.

Seems to give fewer DF and FA!

31

Tests on PETS2001 Dataset

Choice of the Model Parameter (MGM)

ROC for different values of α: MGM

α = 0.008, α = 0.01, α = 0.05

Performance of MGM depends strongly on the value of T.

Choose α = 0.008, T > 0.9!

32

Tests on PETS2001 Dataset

Choice of the Model Parameter (LOTS)

ROC for different background update rates: LOTS

Background update at every: 1024th frame, 256th frame, 128th frame

Variation of sensitivity from 10% to 110%

33

Tests on PETS2001 Dataset

Performance Evaluation (Case I)

Performance of five object detection algorithms

If a moving object stops and remains still, it is considered an active region.

34

Tests on PETS2001 Dataset

Performance Evaluation (Case II)

Performance of five object detection algorithms

If a moving object stops and remains still, it is integrated in the background after one minute.

35

Tests on PETS2001 Dataset

Complexity vs. Performance

According to the Appendix, BBS, LOTS, W4, and SGM have similar computational complexity. MGM has a higher computational cost!

MGM has higher complexity, but its performance is not as good as that of LOTS and SGM.

36

Conclusion

This paper proposes a framework for the evaluation of object detection algorithms:
  Detector Output vs. Ground Truth
  Multiple interpretations are considered.
  The percentage of each type of error is measured.

The best results were achieved by the LOTS and SGM algorithms.
