1
Performance Evaluation of Object Detection Algorithms for Video Surveillance
Jacinto C. Nascimento, Member, IEEE, and Jorge S. Marques
IEEE Transactions on Multimedia, Vol. 8, No. 4, August 2006
2
Outline
Introduction
Related Work
Segmentation Algorithms
Proposed Framework
Tests on PETS2001 Dataset
Conclusions
3
Introduction (1/4)
Video surveillance systems rely on the ability to detect moving objects in the video streams.
Detection should be reliable and effective under difficult conditions:
  unconstrained environments
  non-stationary backgrounds
  different motion patterns, etc.
4
Introduction (2/4)
Approaches to characterize the performance of video segmentation:
  Pixel-based methods
  Template-based methods
  Object-based methods
Three major drawbacks:
  Several types of error should be considered.
  Some methods are based on the selection with or without persons.
  It is not possible to define a unique ground truth.
5
Introduction (3/4)
Five segmentation algorithms are considered as examples and evaluated:
  BBS, W4, SGM, MGM, and LOTS.
Several types of outcome are considered:
  Correct Detections, Detection Failures, Splits, Merges, Splits/Merges, and False Alarms.
Segmentation results of these algorithms are provided on the PETS2001 sequences.
Multiple interpretations are also considered.
6
Introduction (4/4)
Segmentation Algorithms
(BBS, W4, SGM, MGM, LOTS)
Proposed Framework
(User Friendly Interface)
Performance Evaluation
(CD, DF, Split, Merge, S/M, and FA)
Segmentation of video images
Create the ground truth
7
Related Work (1/3)
Background subtraction is a simple way to detect moving objects in video sequences:
  the difference between the current frame and the background image is compared with a threshold.
Several difficulties arise when the background image is corrupted by noise:
  camera movements
  fluttering objects (e.g., waving trees)
  illumination changes
  clouds, shadows
8
Related Work (2/3)
Some works use a deterministic background model:
  admissible interval for each pixel
  maximum rate of change in consecutive images, etc.
Most works rely on statistical models of the background:
  Each pixel is a random variable with a probability distribution.
  e.g., the Pfinder system uses a Gaussian model.
  Other works use a mixture of Gaussian models.
9
Related Work (3/3)
For shadows and non-stationary backgrounds:
  slow changes (e.g., sun motion) and rapid changes (clouds, rain, abrupt changes, etc.)
  recursively update the background parameters and thresholds
Presence of ghosts:
  A static object suddenly starts to move.
  Handled by combining background subtraction with frame differencing, or by high-level operations.
10
Segmentation Algorithms
BBS: Basic Background Subtraction
W4: detection algorithm used in the W4 system [17]
SGM: Single Gaussian Model
MGM: Multiple Gaussian Model
LOTS: Lehigh Omnidirectional Tracking System
  Used to detect small non-cooperative targets [18]
[17] “W4: real time surveillance of people and their activities”
[18] “Into the woods: Visual surveillance of non-cooperative camouflaged targets in complex outdoor settings”
11
Segmentation Algorithms
BBS: Basic Background Subtraction
Compute the difference between the current frame and the background image.
Classify each pixel as foreground if
  $\| I^t(x,y) - \mu^t(x,y) \| > T$
where $I^t(x,y)$ is the intensity of the current frame (3x1 vector), $\mu^t(x,y)$ is the mean intensity of the background, and $T$ is the threshold.
Pixels associated with the same object are grouped by connected component analysis.
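The BBS test can be sketched in a few lines of Python. This is only an illustration: the slide does not say which norm is used, so the Euclidean distance between the 3-component pixel and the background mean is an assumption, and the function name `bbs_foreground` and the sample values are hypothetical.

```python
import math

def bbs_foreground(frame, background, threshold):
    """Return a mask that is True where ||I - mu|| > threshold.

    frame, background: 2-D lists of 3-component colour tuples.
    The Euclidean norm is an assumption, not taken from the slides.
    """
    mask = []
    for frame_row, bg_row in zip(frame, background):
        mask.append([
            math.dist(pixel, mean) > threshold
            for pixel, mean in zip(frame_row, bg_row)
        ])
    return mask

# Hypothetical 2x2 example: only one pixel differs strongly from the background.
background = [[(10, 10, 10), (10, 10, 10)],
              [(10, 10, 10), (10, 10, 10)]]
frame = [[(10, 11, 10), (200, 180, 190)],
         [(12, 10, 9), (10, 10, 10)]]
mask = bbs_foreground(frame, background, threshold=30.0)
```

Foreground pixels in `mask` would then be grouped into regions by connected component analysis.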
12
Segmentation Algorithms
Algorithm used in the W4 System
Designed for grayscale images.
Three features per pixel:
  Min: minimum intensity
  Max: maximum intensity
  D: maximum intensity difference between consecutive frames
Classify the pixel I(x,y) as a foreground pixel if
  $\left( I^t(x,y) < \mathrm{Min}(x,y) \;\lor\; I^t(x,y) > \mathrm{Max}(x,y) \right) \;\land\; \left| I^t(x,y) - I^{t-1}(x,y) \right| > D(x,y)$
Note: modified test, with a modified threshold!
[17] W4: real-time surveillance of people and their activities
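A minimal sketch of the W4-style test on grayscale images, assuming the rule is "outside the [Min, Max] interval and a frame-to-frame change larger than D". The function name `w4_foreground` and the sample values are hypothetical.

```python
def w4_foreground(curr, prev, min_img, max_img, diff_img):
    """Mark a pixel as foreground when it leaves [Min, Max] AND its change
    from the previous frame exceeds D. All inputs are 2-D lists of ints."""
    mask = []
    for y in range(len(curr)):
        row = []
        for x in range(len(curr[y])):
            outside = curr[y][x] < min_img[y][x] or curr[y][x] > max_img[y][x]
            changed = abs(curr[y][x] - prev[y][x]) > diff_img[y][x]
            row.append(outside and changed)
        mask.append(row)
    return mask

# Hypothetical 1x3 example.
min_img = [[50, 50, 50]]
max_img = [[60, 60, 60]]
diff_img = [[5, 5, 5]]
prev = [[55, 55, 58]]
curr = [[56, 90, 62]]
w4_mask = w4_foreground(curr, prev, min_img, max_img, diff_img)
```

The third pixel is outside the interval but changes by only 4 gray levels, so it is not flagged.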
13
Segmentation Algorithms
SGM: Single Gaussian Model
The mean and covariance of each pixel are updated recursively:
  $\mu^t(x,y) = (1-\alpha)\,\mu^{t-1}(x,y) + \alpha\, I^t(x,y)$
  $\Sigma^t(x,y) = (1-\alpha)\,\Sigma^{t-1}(x,y) + \alpha\,(I^t(x,y) - \mu^t(x,y))(I^t(x,y) - \mu^t(x,y))^T$
α: constant; I(x,y): pixel of the current frame (YUV).
Classify each pixel as active or background using
  $l(x,y) = -\frac{1}{2}\,(I^t(x,y) - \mu^t(x,y))^T\,(\Sigma^t)^{-1}\,(I^t(x,y) - \mu^t(x,y)) - \frac{1}{2}\ln|\Sigma^t| - \frac{m}{2}\ln(2\pi)$
If l(x,y) is small, the pixel is classified as active!
[1] Pfinder: real-time tracking of the human body
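The recursive update and the likelihood test can be sketched as follows. To stay dependency-free, this sketch assumes a diagonal covariance (independent colour channels), which is a simplification of the full covariance on the slide. The function names, the threshold of -50, and the sample values are hypothetical.

```python
import math

def sgm_update(mean, var, pixel, alpha):
    """Recursive update of the per-pixel mean and (diagonal) variance."""
    new_mean = [(1 - alpha) * m + alpha * p for m, p in zip(mean, pixel)]
    new_var = [(1 - alpha) * v + alpha * (p - m) ** 2
               for v, p, m in zip(var, pixel, new_mean)]
    return new_mean, new_var

def sgm_log_likelihood(mean, var, pixel):
    """Gaussian log-likelihood l(x,y) under the diagonal-covariance assumption."""
    m = len(pixel)
    mahalanobis = sum((p - mu) ** 2 / v for p, mu, v in zip(pixel, mean, var))
    log_det = sum(math.log(v) for v in var)
    return -0.5 * mahalanobis - 0.5 * log_det - 0.5 * m * math.log(2 * math.pi)

def sgm_is_active(mean, var, pixel, threshold):
    """A pixel is active when its log-likelihood falls below the threshold."""
    return sgm_log_likelihood(mean, var, pixel) < threshold

# Hypothetical background model for one pixel.
mean, var = [100.0, 120.0, 130.0], [4.0, 4.0, 4.0]
near_active = sgm_is_active(mean, var, [101.0, 120.0, 129.0], threshold=-50.0)
far_active = sgm_is_active(mean, var, [150.0, 120.0, 130.0], threshold=-50.0)
```

A suitable threshold depends on the intensity scale of the images, so the value used here is for illustration only.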
14
Segmentation Algorithms
MGM: Multiple Gaussian Model
MGM models each pixel I(x,y) as a mixture of N (N=3) Gaussian distributions.
  I(x,y) is a 3x1 vector (R,G,B).
The mixture model is dynamically updated.
  N Gaussian distributions with respective weights
  Weights: non-matching components of the mixture are not modified.
[2] Learning Patterns of Activity Using Real-Time Tracking
15
Segmentation Algorithms
LOTS: Lehigh Omnidirectional Tracking System
Uses two gray-level background images, B1 and B2,
  initialized using a set of T consecutive frames:
  $B_1(x,y) = \min\{ I^t(x,y),\ t = 1,\dots,T \}$
  $B_2(x,y) = \max\{ I^t(x,y),\ t = 1,\dots,T \}$
Targets are detected using two thresholds, a low threshold $T_L$ and a high threshold $T_H$:
  $T_L(x,y) = | B_1(x,y) - B_2(x,y) | + c_U$
  $T_H(x,y) = T_L(x,y) + c_S$
  $c_U, c_S \in [0, 255]$ (user specified!)
16
Segmentation Algorithms
LOTS: Lehigh Omnidirectional Tracking System
Each pixel is considered active if
  $\min_i | I^t(x,y) - B_i^t(x,y) | > T_L(x,y)$
A target is a set of connected active pixels in which a subset verifies
  $\min_i | I^t(x,y) - B_i^t(x,y) | > T_H(x,y)$
$B_i^t(x,y),\ i = 1,2$, the high threshold (TH), and the low threshold (TL) are updated recursively!
[18] Into the woods: Visual surveillance of non-cooperative camouflaged targets in complex outdoor settings
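The two LOTS slides can be summarised in a small sketch: build the two backgrounds and both thresholds from T initial frames, mark active pixels with the low threshold, and keep a connected group only if at least one of its pixels also exceeds the high threshold. The 4-connectivity, the function names, and the sample values are assumptions; the recursive update of backgrounds and thresholds is not shown.

```python
from collections import deque

def lots_init(frames, c_u, c_s):
    """Build B1 (min), B2 (max), T_L and T_H from a list of grayscale frames."""
    h, w = len(frames[0]), len(frames[0][0])
    b1 = [[min(f[y][x] for f in frames) for x in range(w)] for y in range(h)]
    b2 = [[max(f[y][x] for f in frames) for x in range(w)] for y in range(h)]
    t_low = [[abs(b1[y][x] - b2[y][x]) + c_u for x in range(w)] for y in range(h)]
    t_high = [[t_low[y][x] + c_s for x in range(w)] for y in range(h)]
    return b1, b2, t_low, t_high

def lots_targets(frame, b1, b2, t_low, t_high):
    """Return connected groups of active pixels that contain a high-threshold pixel."""
    h, w = len(frame), len(frame[0])
    dist = [[min(abs(frame[y][x] - b1[y][x]), abs(frame[y][x] - b2[y][x]))
             for x in range(w)] for y in range(h)]
    active = [[dist[y][x] > t_low[y][x] for x in range(w)] for y in range(h)]
    seen = [[False] * w for _ in range(h)]
    targets = []
    for y0 in range(h):
        for x0 in range(w):
            if not active[y0][x0] or seen[y0][x0]:
                continue
            seen[y0][x0] = True
            queue, region, strong = deque([(y0, x0)]), [], False
            while queue:  # breadth-first flood fill, 4-connectivity (assumed)
                y, x = queue.popleft()
                region.append((y, x))
                strong = strong or dist[y][x] > t_high[y][x]
                for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                    if 0 <= ny < h and 0 <= nx < w and active[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            if strong:
                targets.append(region)
    return targets

# Hypothetical 1x5 example with two initial frames.
init_frames = [[[10, 10, 10, 10, 10]], [[12, 12, 12, 12, 12]]]
b1, b2, t_low, t_high = lots_init(init_frames, c_u=5, c_s=20)
targets = lots_targets([[11, 25, 60, 11, 22]], b1, b2, t_low, t_high)
```

In this example the isolated pixel at the right edge is active but never exceeds the high threshold, so it is not reported as a target.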
17
Proposed Framework
Principles: Select a set of sequences
Object Detection
Manual Correction
Error Analysis and Classification
Statistics
1 frame/second
Performance Evaluation!
User Friendly Interface
By Automatic Procedure
Detection Failure, False Alarm, etc.
18
Proposed Framework
Interface used to create the ground truth manually.
four active regions
User can easily edit it and provide the correct segmentation!
output of the detector
4 false alarms!
19
Proposed Framework
Compare the output of the algorithm with the ground truth segmentation:
  Region Matching
  Region Overlap
  Area Matching
  Multiple Interpretations
20
Proposed Framework
Several cases are considered:
  Correct Detection (CD): 1-1 match
  False Alarm (FA): 0-1 match
  Detection Failure (DF): 1-0 match
  Merge Region (M): many-1 match
  Split Region (S): 1-many match
  Split-Merge Region (SM)
Correspondence:
ground truth – detector output
21
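Proposed Framework
Region Matching
Binary Correspondence Matrix $C^t$: defines the correspondence between active regions ($\tilde{R}_i$: ground-truth regions, $R_j$: detected regions).
  $C^t(i,j) = \begin{cases} 1 & \text{if } \dfrac{\#(\tilde{R}_i \cap R_j)}{\#(\tilde{R}_i \cup R_j)} > T \\ 0 & \text{if } \dfrac{\#(\tilde{R}_i \cap R_j)}{\#(\tilde{R}_i \cup R_j)} < T \end{cases} \qquad i \in \{1,\dots,N\},\ j \in \{1,\dots,M\}$
  $L(i) = \sum_{j=1}^{M} C(i,j), \quad i \in \{1,\dots,N\}$
  $C(j) = \sum_{i=1}^{N} C(i,j), \quad j \in \{1,\dots,M\}$
Classify detected region $R_j$ as:
  Correct Detection: $\exists i : L(i) = C(j) = 1 \land C(i,j) = 1$
  Merge: $\exists i : C(j) > 1 \land C(i,j) = 1$
  Split: $\exists i : L(i) > 1 \land C(i,j) = 1$
  Split-Merge: $\exists i : L(i) > 1 \land C(j) > 1 \land C(i,j) = 1$
  False Alarm: $C(j) = 0$
  Detection Failure: $L(i) = 0$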
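The matching rules can be illustrated with a short sketch in which each region is a set of pixel identifiers. It assumes the overlap test is intersection-over-union against a threshold, with rows indexing ground-truth regions and columns indexing detected regions. The function name `classify_regions` and the sample regions are hypothetical.

```python
def classify_regions(truth, detected, overlap=0.1):
    """Classify detections against ground truth via the correspondence matrix.

    truth, detected: lists of non-empty sets of pixel identifiers.
    Returns (labels, failures): one label per detected region, plus the
    indices of ground-truth regions with no match (detection failures).
    """
    n, m = len(truth), len(detected)
    # C(i, j) = 1 when intersection-over-union exceeds the overlap requirement.
    c = [[1 if len(t & d) / len(t | d) > overlap else 0 for d in detected]
         for t in truth]
    row_sum = [sum(c[i]) for i in range(n)]                       # L(i)
    col_sum = [sum(c[i][j] for i in range(n)) for j in range(m)]   # C(j)

    labels = []
    for j in range(m):
        matched = [i for i in range(n) if c[i][j] == 1]
        if not matched:
            labels.append("false alarm")
            continue
        split = any(row_sum[i] > 1 for i in matched)
        merge = col_sum[j] > 1
        if split and merge:
            labels.append("split-merge")
        elif split:
            labels.append("split")
        elif merge:
            labels.append("merge")
        else:
            labels.append("correct detection")
    failures = [i for i in range(n) if row_sum[i] == 0]
    return labels, failures

# Hypothetical 1-D "images": pixels are identified by integers.
truth = [set(range(0, 10)), set(range(20, 30)), set(range(40, 50))]
detected = [set(range(0, 9)), set(range(20, 25)),
            set(range(25, 30)), set(range(60, 65))]
labels, failures = classify_regions(truth, detected, overlap=0.1)
```

Here the second ground-truth region is covered by two detections (a split), the third is missed, and the last detection matches nothing.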
22
Different matching cases (figure panels): Correct Detection, Detection Failure, Merge, False Alarm, Split-Merge, Split.
23
Proposed Framework
Region Overlap
Overlap Requirement = 20%
DF! (Overlap < 20%)
CD! (Overlap > 20%)
Split (Overlap > 20%)
2 DF! (Overlap < 20%)
24
Proposed Framework
Area Matching
The higher the percentage of matched size, the better the active regions produced by the algorithm.
For correctly detected regions, the match for region i is measured by
  $\dfrac{\#(\tilde{R}_i \cap R_j)}{\#(\tilde{R}_i \cup R_j)}$
This characterizes the performance of the detector!
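As a small worked example, the match ratio for one correctly detected region can be computed from pixel sets. The function name `match_ratio` and the sample regions are hypothetical, and regions are assumed to be non-empty.

```python
def match_ratio(truth_region, detected_region):
    """Shared pixels divided by the pixels covered by either region."""
    shared = len(truth_region & detected_region)
    covered = len(truth_region | detected_region)
    return shared / covered

# Hypothetical regions: the detector misses 2 of the 10 ground-truth pixels.
ratio = match_ratio(set(range(0, 10)), set(range(0, 8)))
```

A ratio of 1.0 would mean the detected region coincides exactly with the ground truth.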
25
Proposed Framework
Multiple Interpretations
Correct Split Example:
(figure panels: manual segmentation, SGM segmentation)
Should be considered as valid!
26
Proposed Framework
Multiple Interpretations
Wrong Split Example:
(figure panels: manual segmentation, W4 segmentation)
Wrong Segmentation!
27
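Proposed Framework
Multiple Interpretations
Different merged region groups
Region Linking Procedure with three objects A, B, C
Labeling Matrix M:
  $M = \begin{bmatrix} 1 & 1 & 1 \\ 1 & 1 & 2 \\ 1 & 2 & 2 \\ 1 & 2 & 3 \end{bmatrix}$
  $M_{FINAL} = \begin{bmatrix} 1 & 1 & 1 \\ 1 & 1 & 2 \\ 1 & 2 & 1 \\ 1 & 2 & 2 \\ 1 & 2 & 3 \end{bmatrix}$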
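One way to read the labeling matrix is that each row assigns a group label to each of the objects A, B, C, so that objects sharing a label are merged into one region. Under that reading, and assuming every grouping is admissible (the slide does not state how the matrix is built or whether adjacency restricts it), the rows can be enumerated with a short recursive sketch. The function name `labelings` is hypothetical.

```python
def labelings(n):
    """Enumerate all groupings of n regions as label vectors.

    Each new region either reuses an existing label or opens the next one,
    so every grouping appears exactly once.
    """
    results = []

    def extend(prefix, max_label):
        if len(prefix) == n:
            results.append(tuple(prefix))
            return
        for label in range(1, max_label + 2):
            extend(prefix + [label], max(max_label, label))

    extend([], 0)
    return results

rows = labelings(3)
```

For three objects this yields five label vectors, one per way of grouping A, B, and C.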
28
Tests on PETS2001 Dataset
Evaluate several object detection algorithms using the PETS2001 dataset.
Training (3064 frames) and test sequences (2688 frames) are used.
The first 100 images were used to build the background model.
The algorithms were evaluated at 1 frame/second.
An area of 25 pixels was chosen, and the overlap requirement is 10%.
Ground Truth vs. Detector Output
29
Tests on PETS2001 Dataset
Choice of the Model Parameter (BBS)
ROC for different values of α: BBS
(figure panels: α = 0.05, α = 0.1, α = 0.15)
Performance of BBS is independent of α.
T=0.2 is the best value!
30
Tests on PETS2001 Dataset
Choice of the Model Parameter (SGM)
ROC for different values of α: SGM
(figure panels: α = 0.01, α = 0.05, α = 0.15)
Choose α = 0.05, T = -400!
For -400 < T < -150, there seem to be fewer DF and FA!
31
Tests on PETS2001 Dataset
Choice of the Model Parameter (MGM)
ROC for different values of α: MGM
(figure panels: α = 0.008, α = 0.01, α = 0.05)
Performance of MGM depends strongly on the value of T.
Choose α = 0.008, T > 0.9!
32
Tests on PETS2001 Dataset
Choice of the Model Parameter (LOTS)
ROC for different background update rates: LOTS
(figure panels: background update at every 1024th frame, 256th frame, 128th frame)
Variation of sensitivity from 10% to 110%
33
Tests on PETS2001 Dataset
Performance Evaluation (Case I.)
Performance of five object detection algorithms
If a moving object stops and remains still, it is considered an active region.
34
Tests on PETS2001 Dataset
Performance Evaluation (Case II.)
Performance of five object detection algorithms
If a moving object stops and remains still, it is integrated in the background after one minute.
35
Tests on PETS2001 Dataset
Complexity vs. Performance
According to the Appendix, BBS, LOTS, W4, and SGM have similar computational complexity.
MGM requires a higher computational cost, but its performance is not as good as that of LOTS and SGM.
36
Conclusion
This paper proposes a framework for the evaluation of object detection algorithms:
  Detector output vs. ground truth
  Considers multiple interpretations
  Measures the percentage of each type of error
The best results were achieved by the LOTS and SGM algorithms.