RESEARCH PAPER

SCIENCE CHINA Information Sciences

November 2012 Vol. 55 No. 11: 2635–2645

doi: 10.1007/s11432-012-4701-9

© Science China Press and Springer-Verlag Berlin Heidelberg 2012

A variational method for contour tracking via covariance matching

WU YuWei, MA Bo∗ & LI Pei

Beijing Laboratory of Intelligent Information Technology, School of Computer Science, Beijing Institute of Technology, Beijing 100081, China
Received February 10, 2012; accepted March 28, 2012
Abstract This paper presents a novel formulation for contour tracking. We model the second-order statistics
of image regions and perform covariance matching under the variational level set framework. Specifically,
the covariance matrix is adopted as a visual object representation for partial differential equation (PDE) based contour tracking. Log-Euclidean calculus is used as the covariance distance metric instead of the Euclidean distance, which is unsuitable for measuring similarities between covariance matrices because the matrices typically lie on a non-Euclidean manifold. A novel image energy functional is formulated by minimizing the distance metric
between the candidate object region and a given template, and maximizing the one between the background
region and the template. The corresponding gradient flow is then derived according to a variational approach,
enabling partial differential equations (PDEs) based contour tracking. Experiments on several challenging
sequences prove the validity of the proposed method.
Keywords contour tracking, covariance region descriptor, level set, Log-Euclidean Riemannian metric
Citation Wu Y W, Ma B, Li P. A variational method for contour tracking via covariance matching. Sci China
Inf Sci, 2012, 55: 2635–2645, doi: 10.1007/s11432-012-4701-9
1 Introduction
Object tracking is a well-known topic in the computer vision community and has found applications in a wide range of areas including visual surveillance, medical imaging, human-computer interaction, and autonomous vehicles. Different tracking strategies have been implemented and applied to overcome the related difficulties, from Bayesian-based trackers [1,2], kernel-based methods [3,4] and online-learning-based methods [5–8] to detection-based methods [9,10]. A thorough review can be found in [11,12].
Most of the methods mentioned above model object shapes as simple geometric primitives, such as rectangles and ellipses, leading to inaccurate extraction of the object contour. In addition, such simple geometric shapes cannot support high-level motion analysis such as pose recognition. In contrast, contour-based methods aim to obtain the accurate contour of an object in each frame rather than its rough location. One technique that ideally provides precise localization of the target and its boundary is the use of an implicit contour, or level set, to represent the object boundary [13–16].
Recently, an important feature known as the region covariance descriptor has been proven to be a
very effective tool for visual tracking [17], pedestrian detection [18], face recognition [19] and texture classification.

∗Corresponding author (email: [email protected])

By taking into account multiple-feature fusion, the region covariance descriptor is able to capture both spatial and statistical properties of each pixel in an object region with a low-dimensional representation. This paper proposes covariance matching for partial differential equation-based (PDE-based) contour tracking. To further reduce the computational complexity, the Log-Euclidean
Riemannian metric [20] is employed as the similarity measure between covariance matrices. Based on the
Log-Euclidean metric, a novel image energy functional incorporating both the similarities and the dissimilarities between the features of an object is presented. Specifically, similarity is defined by the distance between the covariance of the internal image region enclosed by the evolving contour and the predefined template covariance. Dissimilarity is defined by the distance between the covariance of the external image region outside the evolving contour and the template. Starting from the variational approach, we derive the gradient flow, enabling PDE-based contour tracking. A group of tracking experiments that demonstrate the validity of the proposed contour tracking method is presented.
The rest of the paper is organized as follows. Section 2 provides a brief review of previous contour tracking studies that use the active contour model and covariance features. Section 3 presents the proposed method. We first introduce the covariance matrix as a region-level or object-level feature descriptor, and then present Log-Euclidean calculus for measuring the similarities between covariance matrices. Furthermore, we design image energies for contour tracking that, in essence, seek optimal covariance matching over the image domain given a known object template. Finally, we derive the corresponding gradient flow equation. Section 4 presents experimental results on several real image sequences using the proposed method. Section 5 concludes the paper.
2 Related work
A number of methods that use a parameterized, or explicit, representation of contours have been proposed [21,22] for active contour tracking. The traditional active contour framework involves parameterization of curves for performing visual tracking. Isard and Blake [1] proposed the condensation algorithm, which employs a particle filter to estimate the nonlinear probability density function of the state of the tracked contour represented by a B-spline. Ray and Acton [21] presented motion gradient vector flow to track both slow- and fast-rolling leukocytes by minimizing an energy functional involving the motion direction and the image gradient magnitude.
Other models, such as geometric active contours [23,24], represent contours with the level set method, in which a contour is the zero level set of a higher-dimensional function. Paragios and Deriche [25] proposed a geodesic active region framework for multiple-object tracking. In their work, several visual cues are integrated within an objective function that contains a change-detection foreground/background separation energy, an edge-driven tracking energy and a visual consistency energy.
Bertalmio et al. [26] presented morphing geometric active contours for visual tracking. Given two
consecutive frames, the morphing active contours can deform the first frame toward the second one via a
PDE, and track the deformation of the curves of interest in the first frame with an additional coupled PDE.
Mansouri [27] used the optical flow constraint for geometric active contour evolution. Sundaramoorthi
et al. [28] developed Sobolev active contours to perform coarse-to-fine segmentation and visual tracking.
Based on a well-structured Riemannian metric, Sobolev active contours evolve more globally and are less
attracted to certain intermediate local minima compared with traditional active contours. By simply
switching elements between two linked lists instead of solving any PDEs, Shi and Karl [29] proposed
a fast implementation of the level set method for real-time visual tracking. Like [14], Bibby and Reid
[30] introduced a probabilistic framework for real-time tracking using pixel-wise posteriors. Later, they
extended their method to achieve real-time tracking of multiple occluding objects [15].
To make active contour tracking more robust to noise, occlusion and illumination changes, the adoption of shape priors in the contour evolution process has been shown to be effective. Cremers [14]
developed a Bayesian formulation for level set-based walking person tracking, and dynamical statistical
shape models are introduced for implicitly represented shapes. Rathi et al. [31] combined the geometric
active contour with particle filtering for visual tracking. The dynamic shape prior is obtained using the
locally linear embedding algorithm, which enables the exploration of local neighboring linear structures within the training shape dataset.

Figure 1 The flow chart of covariance matching for contour tracking.

Yilmaz et al. [32] proposed an energy functional that combines color
features with the filter responses of spatial orientation selective filters using probability theory. Although the results obtained in [32] are desirable, the approach relies on one significant assumption that is not accurate enough: the pixels in the foreground and background regions are assumed to be independent of one another. Freedman et al. [16] used the geometric active contour for distribution tracking and adopted Euclidean similarity transformations and nonrigid transformations to constrain object shapes.
Our work is mainly inspired by a number of recent works [17,24]. Porikli et al. [17] proposed a covariance-based object description and a Lie algebra-based update mechanism for visual tracking. They represented an object window as the covariance matrix of features, enabling the capture of spatial and statistical properties, as well as their correlation, within the same representation. The tracking is performed by globally or locally searching for the image region whose covariance matrix best matches a given template covariance matrix. The distance metric uses the sum of the squared logarithms of the generalized eigenvalues to compute the similarities between covariance matrices. Our work differs from [17] in that we track the covariance matrix using the geometric active contour, instead of performing a direct search over the image domain. As a result, our method can track deformable objects while inheriting the advantages of the covariance matrix as a visual object descriptor.
3 Our method
Given an image I(X), where X = (x, y) denotes pixel coordinates, we compute, for a predefined image region, the covariance matrix of its features as a template covariance matrix. In the current frame, the region that optimally matches the template matrix is searched for using the geometric active contour model. Figure 1 demonstrates the flow of covariance matching for contour tracking. In this framework, the covariance matrix is defined as an image region descriptor. Then, we construct a region-based energy functional that uses the second-order statistics of both the candidate foreground region and the background region for contour tracking. The corresponding gradient flow is derived under the variational level set framework, in accordance with the variational approach [24]. Finally, combining the image energy term and the shape energy term, we present a complete energy functional and its corresponding gradient flow equation.
3.1 Visual object representation
Around each pixel on the image plane, we can extract a feature vector f whose components can include the pixel coordinates, image gray level or color, image gradients, edge magnitude, edge orientation, filter responses, etc. For example,

f = [x, y, I, I_x, I_y, \sqrt{I_x^2 + I_y^2}]^T.
Given the extracted feature vector f defined on the image plane Ω, we suppose that the image domain is divided into two parts (i.e., the interior region R and the exterior region R^c, with Ω = R ∪ R^c) by an evolving contour C(s), where s is the arc length parameter. The d × d covariance matrix S_R for a given image region R is computed by

S_R = \frac{\int_R (f - \mu_R)(f - \mu_R)^T \, dX}{\int_R dX},  (1)
where \mu_R = \int_R f \, dX / \int_R dX. The covariance matrix represents not only the variance of each feature but also their pairwise correlations. By taking into account multiple-feature fusion, the region covariance descriptor is able to capture both spatial and statistical properties of each pixel in an object region with a low-dimensional representation [17].
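As a minimal sketch of Eq. (1) (Python/NumPy rather than the authors' MATLAB implementation; helper names are ours), the descriptor can be computed from per-pixel feature maps, here using the example features f = [x, y, I, I_x, I_y, |∇I|]:

```python
import numpy as np

def feature_maps(img):
    """Stack per-pixel features f = [x, y, I, Ix, Iy, |grad I|] for a gray image."""
    h, w = img.shape
    y, x = np.mgrid[0:h, 0:w].astype(float)
    Iy, Ix = np.gradient(img.astype(float))      # image gradients
    mag = np.sqrt(Ix**2 + Iy**2)                 # edge magnitude
    return np.stack([x, y, img.astype(float), Ix, Iy, mag])  # shape (d, h, w)

def region_covariance(f, mask):
    """Covariance matrix S_R of Eq. (1) over the pixels where mask is True."""
    d = f.shape[0]
    F = f.reshape(d, -1)[:, mask.reshape(-1)]    # d x |R| feature matrix
    mu = F.mean(axis=1, keepdims=True)           # region mean vector mu_R
    Fc = F - mu
    return Fc @ Fc.T / F.shape[1]                # d x d covariance
```

The result is symmetric positive semidefinite; in practice a small ridge may be added before taking matrix logarithms.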
3.2 Image energy term
Given a template covariance matrix S_T and a candidate image region in the current frame, the distance between S_T and the candidate region's covariance S_R needs to be computed. However, covariance matrices lie on a Riemannian manifold rather than in a Euclidean space; therefore, an arithmetic subtraction of two matrices does not measure the distance between the corresponding regions. To address this issue, Porikli et al. [17] used the sum of the squared logarithms of the generalized eigenvalues of the covariance matrices as the distance metric. This metric is the affine-invariant Riemannian metric expressed in Eq. (2); however, the computational cost of the associated Riemannian mean grows linearly as time progresses:

\rho(S_T, S_R) = \sqrt{\sum_{k=1}^{d} \ln^2 \lambda_k(S_T, S_R)},  (2)

where \lambda_k is the kth generalized eigenvalue of S_T and S_R.
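Assuming SciPy is available, Eq. (2) can be evaluated directly with the generalized symmetric eigensolver; a sketch (not the authors' code):

```python
import numpy as np
from scipy.linalg import eigh

def affine_invariant_distance(S1, S2):
    """rho(S1, S2) of Eq. (2): sqrt of the sum of squared logs of the
    generalized eigenvalues of the pair (S1, S2)."""
    lam = eigh(S1, S2, eigvals_only=True)   # solves S1 v = lambda S2 v
    return np.sqrt(np.sum(np.log(lam) ** 2))
```

For identical inputs all generalized eigenvalues are 1, so the distance is 0, as a metric requires.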
Directly deriving the gradient flow from Eq. (2) is difficult. To simplify the derivation of the gradient flow for visual tracking, we adopt the Log-Euclidean Riemannian metric proposed by Arsigny et al. [20]. Riemannian means take a much simpler form under the Log-Euclidean metric than under the affine-invariant metric. Under the Log-Euclidean Riemannian metric, the distance between two points X and Y is calculated as ‖log(Y) − log(X)‖. Thus, the energy functional for geometric active contour tracking can be defined as

E_{i,F} = \| \ln S_R - \ln S_T \|_F,  (3)
where ‖·‖_F denotes the Frobenius norm of a matrix. Let φ(x, y) be an embedding function whose zero level set corresponds to the evolving curve C(s), and let H(φ) be the Heaviside function. Suppose that the set of points {x ∈ Ω : φ(x) ⩾ 0} corresponds to the interior region R surrounded by the curve C(s). The energy defined in Eq. (3) can then be regarded as a functional of the level set function φ. The expression for S_R as a matrix function of φ is given by

S_R(\phi) = \frac{\int_\Omega H(\phi)\,(f - \mu_R(\phi))(f - \mu_R(\phi))^T \, dX}{\int_\Omega H(\phi)\, dX},  (4)

where \mu_R(\phi) = \int_\Omega H(\phi) f \, dX / \int_\Omega H(\phi)\, dX is the mean vector within region R. By minimizing the image energy functional defined in Eq. (3), we try to deform the evolving contour toward the moving object of interest.
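A sketch of evaluating E_{i,F} in Eq. (3) (Python/NumPy; an illustration, not the authors' implementation): for a symmetric positive-definite matrix, the matrix logarithm can be taken through its eigendecomposition.

```python
import numpy as np

def spd_logm(S):
    """Matrix logarithm of a symmetric positive-definite matrix
    via eigendecomposition: log(S) = V diag(log w) V^T."""
    w, V = np.linalg.eigh(S)
    return (V * np.log(w)) @ V.T

def log_euclidean_energy(S_R, S_T):
    """E_{i,F} = || ln S_R - ln S_T ||_F, Eq. (3)."""
    return np.linalg.norm(spd_logm(S_R) - spd_logm(S_T), ord='fro')
```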
In general, good tracking results can be obtained by minimizing Eq. (3). However, when a local area of the object region produces the same covariance as the entire object region does, minimizing this image energy functional may fail to extract the complete object boundary, because the functional only takes into account the statistics of the candidate foreground region. Thus, based on information from both the foreground and the background, the modified energy functional is defined as

E_{i,B} = \alpha \| \ln S_R - \ln S_T \|_F - \beta \| \ln S_{R^c} - \ln S_T \|_F,  (5)

where S_{R^c} is the covariance matrix of region R^c, and α and β are adjusting parameters. As a functional of the level set function φ, S_{R^c} can be expressed as

S_{R^c}(\phi) = \frac{\int_\Omega H(-\phi)\,(f - \mu_{R^c}(\phi))(f - \mu_{R^c}(\phi))^T \, dX}{\int_\Omega H(-\phi)\, dX},  (6)
Figure 2 The selection of background region for computing Ei,B .
where \mu_{R^c}(\phi) = \int_\Omega H(-\phi) f \, dX / \int_\Omega H(-\phi)\, dX is the mean vector of the background region R^c. To better model the distributions both inside and outside the object curve, we empirically extract a bounding box of 2H × 2W pixels as the region of interest, as shown in Figure 2. The background is defined as the surrounding region of (2H × 2W − area) pixels, where area denotes the area of the object. Figure 2 illustrates the process of selecting the background. Intuitively, Eq. (5) minimizes the covariance distance between the foreground region and the object template while maximizing the covariance distance between the background and the object template. Note that when β = 0, Eq. (5) degenerates to the functional in Eq. (3).
Minimizing the energy functional in Eq. (5), the level set formulation is expressed as

\frac{\partial \phi_{i,B}}{\partial t} = -\frac{\partial E_{i,B}}{\partial \phi}
= -\frac{\alpha}{\|\Theta_1\|_F} \sum_{i=1}^{d} \sum_{j=1}^{d} (\Theta_1)_{i,j} \left( S_R^{-1}(\phi) \frac{\partial S_R(\phi)}{\partial \phi} \right)_{i,j}
+ \frac{\beta}{\|\Theta_2\|_F} \sum_{i=1}^{d} \sum_{j=1}^{d} (\Theta_2)_{i,j} \left( S_{R^c}^{-1}(\phi) \frac{\partial S_{R^c}(\phi)}{\partial \phi} \right)_{i,j},  (7)

where

\Theta_1 = \ln S_R - \ln S_T,
\Theta_2 = \ln S_{R^c} - \ln S_T,
\frac{\partial S_R(\phi)}{\partial \phi} = \frac{\delta(\phi)}{A_R} \left( f f^T - \frac{1}{A_R} \iint_\Omega H(\phi) f f^T \, d\Omega - f \mu_R^T - \mu_R f^T + 2 \mu_R \mu_R^T \right),
\frac{\partial S_{R^c}(\phi)}{\partial \phi} = \frac{\delta(\phi)}{A_{R^c}} \left( -f f^T + \frac{1}{A_{R^c}} \iint_\Omega H(-\phi) f f^T \, d\Omega + f \mu_{R^c}^T + \mu_{R^c} f^T - 2 \mu_{R^c} \mu_{R^c}^T \right).  (8)
Here δ(φ) denotes the delta function, A_R = \iint_\Omega H(\phi)\, d\Omega, and A_{R^c} = \iint_\Omega H(-\phi)\, d\Omega. Please visit http://mcislab.cs.bit.edu.cn/member/wuyuwei/Research.htm for the detailed derivation.
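To make Eqs. (7)-(8) concrete, here is a hedged Python/NumPy sketch of the foreground (α) term evaluated per pixel; the background (β) term is analogous. It uses the identity \iint_\Omega H(\phi) f f^T d\Omega / A_R = S_R + \mu_R \mu_R^T, which collapses the bracket in Eq. (8) to (f - \mu_R)(f - \mu_R)^T - S_R, and the identity \sum_{ij} A_{ij} B_{ij} = \mathrm{tr}(A^T B), which turns the double sum in Eq. (7) into a per-pixel scalar. The smoothed H and δ are those of Section 4; the small ridge on \|\Theta_1\|_F is our implementation choice, not the paper's.

```python
import numpy as np

def spd_logm(S):
    w, V = np.linalg.eigh(S)
    return (V * np.log(w)) @ V.T

def smooth_heaviside(phi, eps=1.5):
    return 0.5 + np.arctan(phi / eps) / np.pi    # H_eps from Section 4

def smooth_delta(phi, eps=1.5):
    return eps / (np.pi * (eps**2 + phi**2))     # its derivative

def foreground_flow(f, phi, S_T, alpha=1.0, eps=1.5):
    """Per-pixel alpha-term of Eq. (7), using
    dS_R/dphi = delta(phi)/A_R * [(f - mu)(f - mu)^T - S_R]."""
    d, h, w = f.shape
    F = f.reshape(d, -1)
    Hw = smooth_heaviside(phi, eps).reshape(-1)
    A_R = Hw.sum()
    mu = F @ Hw / A_R                            # region mean, Eq. (4)
    Fc = F - mu[:, None]
    S_R = (Fc * Hw) @ Fc.T / A_R                 # region covariance
    Theta = spd_logm(S_R) - spd_logm(S_T)        # Theta_1 in Eq. (8)
    B = Theta @ np.linalg.inv(S_R)
    # (f - mu)^T Theta S_R^{-1} (f - mu) at every pixel
    quad = np.einsum('ip,ij,jp->p', Fc, B, Fc)
    flow = -alpha * smooth_delta(phi, eps).reshape(-1) \
           / ((np.linalg.norm(Theta) + 1e-12) * A_R) * (quad - np.trace(Theta))
    return flow.reshape(h, w)
```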
3.3 Shape energy term
To make active contour tracking more robust to noise, occlusion and illumination changes, the adoption of shape priors in the contour evolution process has been shown to be effective. Given a single curve template, determined by the zero level set of a template embedding function \hat{\phi}(X), where X = (x, y) denotes pixel coordinates, we can minimize the following energy functional to drive the evolving contour toward the curve template up to a Euclidean similarity transformation:

E_{sh} = \int_\Omega \left[ H(\phi) - H(\hat{\phi}(\gamma \cdot \Lambda_r \cdot X + \Lambda_t)) \right]^2 dX,  (9)

where γ, Λ_r and Λ_t are the scaling, rotation, and translation parameters, respectively. The Euclidean similarity transformation adopted in our method maps the template embedding function as

\hat{\phi}(X) \mapsto \hat{\phi}(\gamma \cdot \Lambda_r \cdot X + \Lambda_t).  (10)
From Eq. (10), the corresponding first variation can be derived as

\frac{\partial \phi_{sh}}{\partial t} = 2\,\delta(\phi) \left[ H(\phi) - H(\hat{\phi}(\gamma \cdot \Lambda_r \cdot X + \Lambda_t)) \right].  (11)

For each current frame, we choose the similarity parameters that keep \hat{\phi}(\gamma \cdot \Lambda_r \cdot X + \Lambda_t) as close to the tracking result of the previous frame as possible. Let \phi_{n-1} and \phi_{n-2} be the final level set functions of frame n − 1 and frame n − 2, respectively, and let the interior region be R(\phi) = \{X : \phi(X) < 0\}. Then we can estimate \gamma = A(\phi_{n-1}) / A(\phi_{n-2}), where A(\phi_{n-1}) and A(\phi_{n-2}) denote the areas of the internal regions of the final curves in frame n − 1 and frame n − 2, respectively. The translation vector \Lambda_t is estimated from \phi_{n-1} and \phi_{n-2} as follows:

\Lambda_t = \frac{\sum_{\phi_{n-1}(X) < 0} X}{\sum_{\phi_{n-1}(X) < 0} 1} - \frac{\sum_{\phi_{n-2}(X) < 0} X}{\sum_{\phi_{n-2}(X) < 0} 1}.  (12)

The rotation matrix Λ_r is determined by the Procrustes method [33]. For more details, we refer the reader to [34].
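The scale and translation estimates above (the area ratio and Eq. (12)) can be sketched as follows (Python/NumPy, names ours; the rotation fit via Procrustes [33] is omitted):

```python
import numpy as np

def estimate_similarity_params(phi_prev1, phi_prev2):
    """Estimate gamma and Lambda_t from the interior regions {X : phi < 0}
    of the level sets of frames n-1 and n-2."""
    m1 = phi_prev1 < 0
    m2 = phi_prev2 < 0
    gamma = m1.sum() / m2.sum()          # area ratio A(phi_{n-1}) / A(phi_{n-2})
    c1 = np.argwhere(m1).mean(axis=0)    # centroid of frame n-1 interior
    c2 = np.argwhere(m2).mean(axis=0)    # centroid of frame n-2 interior
    return gamma, c1 - c2                # translation, Eq. (12)
```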
3.4 Complete energy functional
By integrating the image energy term and the shape energy term, the complete energy functional is expressed as a linear combination of E_i(\phi) and E_{sh}(\phi):

E(\phi) = \lambda E_i(\phi) + (1 - \lambda) E_{sh}(\phi),  (13)

where E_i = E_{i,F} or E_{i,B} and E_{sh} denote the image energy functional and the shape energy functional, respectively. We set E_i = E_{i,B} in our experiments. In practice, tuning the weighting factor λ is not easy and cannot be done automatically; we give more explanation about setting λ in Subsection 4.1. The gradient descent flow of Eq. (13) for the level set function is

\frac{\partial \phi}{\partial t} = \lambda \frac{\partial \phi_i}{\partial t} + (1 - \lambda) \frac{\partial \phi_{sh}}{\partial t}.  (14)

In implementing Eq. (14), it is numerically necessary to periodically re-initialize the level set function to a signed distance function during the evolution. Specifically, re-initialization of an embedding function φ is the process of making |∇φ| = 1 while keeping the zero level set of the curve C unchanged. One efficient method for re-initialization is the fast marching method [36].
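Besides fast marching [36], a common practical alternative (our illustration, assuming SciPy; not the method used in the paper) rebuilds an approximate signed distance function with Euclidean distance transforms, keeping the sign pattern and hence the zero level set:

```python
import numpy as np
from scipy import ndimage

def reinitialize(phi):
    """Rebuild phi as an approximate signed distance function (|grad phi| ~ 1),
    with the interior convention phi < 0 preserved."""
    inside = phi < 0
    dist_out = ndimage.distance_transform_edt(~inside)  # distance for exterior pixels
    dist_in = ndimage.distance_transform_edt(inside)    # distance for interior pixels
    return dist_out - dist_in
```

This is only pixel-accurate (the boundary is quantized to the grid), whereas fast marching supports sub-pixel initialization of the interface.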
4 Experiments
The performance of the proposed tracking method was demonstrated on several real image sequences. All experiments were run on a 1.97 GHz Pentium machine using a MATLAB implementation; no code optimization was performed. For the original color images, we extract the feature vectors

f = [R, G, B, I_x, I_y, I_{xx}, I_{yy}, I_{xy}, \sqrt{I_x^2 + I_y^2}]^T,

whereas for gray images the first three components (R, G, B) are replaced by a single component, the gray value. To solve Eq. (14), the Heaviside function H is approximated by the smooth function H_\varepsilon(\phi) = 1/2 + \arctan(\phi/\varepsilon)/\pi. Its derivative gives a regularized version of the delta function, \delta_\varepsilon(\phi) = \varepsilon / [\pi(\varepsilon^2 + \phi^2)]. As \varepsilon \to 0, H_\varepsilon(\phi) \to H(\phi) and \delta_\varepsilon(\phi) \to \delta(\phi). In all tests, we set \varepsilon = 1.5. Various experimental results (in rmvb format) shown in this article can be found at http://mcislab.cs.bit.edu.cn/member/wuyuwei/Research.htm.
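The regularized Heaviside and delta functions above can be sketched and sanity-checked as follows (Python/NumPy; an illustration of the formulas, not the authors' code):

```python
import numpy as np

def H_eps(phi, eps=1.5):
    """Smoothed Heaviside: H_eps(phi) = 1/2 + arctan(phi/eps)/pi."""
    return 0.5 + np.arctan(phi / eps) / np.pi

def delta_eps(phi, eps=1.5):
    """Its derivative, the regularized delta: eps / [pi (eps^2 + phi^2)]."""
    return eps / (np.pi * (eps**2 + phi**2))
```

At ε = 1.5 (the value used in the paper), H_eps(0) = 1/2, delta_eps integrates to 1 over the real line, and delta_eps is exactly the derivative of H_eps.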
Figure 3 A comparison of image energy Ei,B indicated in the top row with the image energy Ei,F shown in the bottom
row on the flag sequence.
4.1 Setting parameter λ
In this section, we test the performance of our approach versus λ in the presence of disturbing factors (e.g. pose changes). Generally speaking, the larger a weight, the more important its corresponding term becomes when the minimization is carried out. When tracking a nonrigid object (e.g. the female skater), we may set a large weight on the shape prior term, since it prevents the contour from being wrongly attracted to background objects. When tracking a rigid object (e.g. the Sylvester sequence), we may set a large weight on the image energy term. In conclusion, there is no evidence of a best possible way to choose a value for λ; it should therefore be decided according to the application through a trial-and-error procedure.
4.2 Qualitative comparison
To demonstrate the performance of our tracking model, we first test the energy functional with image energy E_{i,B}. Compared with image energy E_{i,F}, image energy E_{i,B} also integrates knowledge of the background region, and in most cases it may outperform E_{i,F}. Another choice for this image energy is to replace the subtraction with a division, so that the weighting parameters vanish. In addition, a number of recent studies show that minimizing these image energies cannot guarantee good tracking results; a possibly better alternative is to make the image energy as close as possible to a constant learned beforehand.
The top row of Figure 3 illustrates the tracking results of the flag sequence, which undergoes a certain deformation, using E_{i,B}. The main difficulty with this sequence is that the covariance of some internal areas is similar to that of the flag as a whole. Therefore, minimizing only E_{i,F} together with the shape energy term often drives the final contour to converge to a smaller internal region within the flag, as shown in the bottom row of Figure 3. In contrast, the top row of Figure 3 indicates that good tracking results are obtained when both matching similarities and matching dissimilarities are considered using Eq. (5). We adopt the image energy E_{i,B} for all subsequent experiments.
We provide a qualitative comparison between our method and the distribution matching tracker [16]. The main reason we choose the method in [16] for comparison is as follows. Both the distribution [16] and the region covariance descriptor have proven to be very effective features for non-contour tracking, and Freedman et al. proved the feasibility of distribution matching for contour tracking in their work. Motivated by this, we propose covariance matching for contour tracking under the level set framework; it is therefore natural to evaluate our method by comparison with the distribution matching method. We adopted the Bhattacharyya distance as the criterion, and set the bins to 16 × 16 × 16 in the distribution matching-based active contour algorithm.
In the first experiment, Figure 4 presents the comparison results on the face sequence, in which occlusion occurs. Our proposed method achieves better tracking results. In both methods, we set the shape parameters to zero, because inferior tracking results may be obtained when severe occlusion occurs. We have also tested our method extensively on other challenging sequences. The female skater sequence is obtained from http://www.cise.ufl.edu/smshahed/. It contains over 150 frames, and the dazzling performance is accompanied by large pose variations. Figure 5 shows some representative tracking results. We can see that when the tracked skater turns her body, the distribution matching tracker deviates from the
object center at frame 51 (e.g. part of the object is outside the contour). Overall, our method is able to track the skater well and provides much more accurate and consistent tracking contours. In this experiment, the parameter λ is set at 0.2.

Figure 4 A comparison of our tracker in the top row with the distribution matching-based active contour tracker [16] in the bottom row on the face sequence. Columns from left to right show frames 17, 40, 67, 130 and 157.

Figure 5 A comparison of our tracker in the top row with the distribution matching-based active contour tracker [16] in the bottom row on the female skater sequence. Columns from left to right show frames 2, 21, 31, 51, 71 and 91.

Figure 6 A comparison of our tracker in the top row with the distribution matching-based active contour tracker [16] in the bottom row on the Sylvester sequence. Columns from left to right show frames 7, 73, 127, 223, 307 and 400.

Figure 7 A comparison of our method with the Bhattacharyya-based tracker [16] on the CAVIAR sequence. Top: results of [16]; bottom: our tracker results. Frame numbers from left to right are 45, 63, 80, 103, 121 and 137, respectively.
Next, the Sylvester sequence shows a moving animal doll and is obtained from http://www.cs.toronto.edu/ross/ivt/. In this sequence, the target frequently changes its pose as well as its scale, and the changing lighting conditions make the target hard to distinguish. Our method achieves better tracking results than distribution matching-based active contour tracking, as shown in Figure 6. In this experiment, the parameter λ is set at 0.6. Figure 7 shows the results of tracking a pedestrian on the
CAVIAR1) sequence.

1) http://homepages.inf.ed.ac.uk/rbf/CAVIAR/

Figure 8 Comparisons of the Jaccard similarity coefficient of our method with the distribution-based tracker [16].

The color of the clothes is very similar to that of the background regions in this sequence, which makes accurate extraction of the object region more difficult. From Figure 7, it is clear
that our method achieves desirable results, whereas the distribution matching-based tracker loses the target. In this experiment, the parameter λ is set at 0.3.
4.3 Quantitative comparison
To make a quantitative comparison between the proposed method and the distribution matching tracker [16], we adopt a simple measure for tracker evaluation, namely the Jaccard similarity coefficient [35]. The Jaccard similarity coefficient is defined as the area of the intersection of the extracted contour region with the ground truth divided by the area of their union:

\rho = \frac{|C_{tracking\,contour} \cap C_{ground\,truth}|}{|C_{tracking\,contour} \cup C_{ground\,truth}|},  (15)

where |·| denotes region area.
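Eq. (15) on binary region masks can be sketched as (Python/NumPy, names ours):

```python
import numpy as np

def jaccard(mask_a, mask_b):
    """Jaccard similarity coefficient of Eq. (15) for two binary region masks."""
    inter = np.logical_and(mask_a, mask_b).sum()
    union = np.logical_or(mask_a, mask_b).sum()
    return inter / union if union else 1.0
```

A coefficient of 1 means the tracked contour and the ground truth coincide exactly; 0 means they are disjoint.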
For each tracking task, we manually mark the ground truth every three frames. The quantitative performance of our method and the distribution matching tracker [16] is summarized in Figure 8. As is evident from Figure 8, our method obtains a higher similarity coefficient with the ground truth on these four sequences.
5 Conclusions
In this work, we have proposed a variational tracking method using a level set formulation that models the second-order statistics of image regions. Under the Log-Euclidean Riemannian metric, the region energy functional maximizes the covariance distance between the region outside the evolving contour and the template while minimizing the distance between the internal image region and a given template covariance. The shape constraints make our model more robust in cluttered environments. Experiments on real image sequences demonstrate the effectiveness of our method.
Acknowledgements
This work was supported by National High-tech Research & Development Program of China (863 Program) (Grant
No. 2009AA01Z323) and Major State Basic Research Development Program of China (Grant No. 2012CB720003).
The authors appreciate the anonymous reviewers for their invaluable comments and suggestions.
References
1 Isard M, Blake A. Condensation: conditional density propagation for visual tracking. Int J Comput Vis, 1998, 29: 5–28
2 Li P, Zhang T, Ma B. Unscented Kalman filter for visual curve tracking. Image Vis Comput, 2004, 22: 157–164
3 Comaniciu D, Ramesh V, Meer P. Kernel-based object tracking. IEEE Trans Pattern Anal Mach Intell, 2003, 25:
564–575
4 Yilmaz A. Kernel based object tracking using asymmetric kernels with adaptive scale and orientation selection. Mach
Vis Appl, 2011, 22: 255–268
5 Collins R, Liu Y, Leordeanu M. Online selection of discriminative tracking features. IEEE Trans Pattern Anal Mach
Intell, 2005, 27: 1631–1643
6 Babenko B, Yang M, Belongie S. Robust object tracking with online multiple instance learning. IEEE Trans
Pattern Anal Mach Intell, 2011, 33: 1619–1632
7 Grabner H, Leistner C, Bischof H. Semi-supervised on-line boosting for robust tracking. In: Forsyth D, Torr P,
Zisserman A, eds. Proceedings of the 10th European Conference on Computer Vision (ECCV). Berlin, Heidelberg:
Springer-Verlag, 2008. 234–247
8 Ross D, Lim J, Lin R, et al. Incremental learning for robust visual tracking. Int J Comput Vis, 2008, 77: 125–141
9 Andriluka M, Roth S, Schiele B. People-tracking-by-detection and people-detection-by-tracking. In: Proceedings of
IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR), Anchorage, 2008. 1–8
10 Breitenstein M, Reichlin F, Leibe B, et al. Online multi-person tracking-by-detection from a single, uncalibrated camera. IEEE Trans Pattern Anal Mach Intell, 2011, 33: 1820–1833
11 Yilmaz A, Javed O, Shah M. Object tracking: A survey. ACM Comput Surv, 2006, 38: 13–57
12 Cannons K. A Review of Visual Tracking. Technical Report CSE-2008-07, York University, 2008
13 Paragios N, Deriche R. Geodesic active regions and level set methods for motion estimation and tracking. Comput Vis
Image Underst, 2005, 97: 259–282
14 Cremers D. Dynamical statistical shape priors for level set-based tracking. IEEE Trans Pattern Anal Mach Intell,
2006, 28: 1262–1273
15 Bibby C, Reid I. Real-time tracking of multiple occluding objects using level sets. In: Proceedings of the 23rd IEEE Conference on Computer Vision and Pattern Recognition (CVPR), San Francisco, 2010. 1307–1314
16 Freedman D, Zhang T. Active contours for tracking distributions. IEEE Trans Image Process, 2004, 13: 518–526
17 Porikli F, Tuzel O, Meer P. Covariance tracking using model update based on Lie algebra. In: Proceedings of IEEE
Computer Society Conference on Computer Vision and Pattern Recognition (CVPR), New York, 2006. 728–735
18 Tuzel O, Porikli F, Meer P. Pedestrian detection via classification on Riemannian manifolds. IEEE Trans Pattern Anal
Mach Intell, 2008, 30: 1713–1727
19 Pang Y, Yuan Y, Li X. Gabor-based region covariance matrices for face recognition. IEEE Trans Circuits Syst Video
Technol, 2008, 18: 989–993
20 Arsigny V, Fillard P, Pennec X, et al. Fast and simple calculus on tensors in the Log-Euclidean framework. In:
Duncan J S, Gerig G, eds. Proceedings of the 14th International Conference on Medical Image Computing and
Computer-Assisted Intervention (MICCAI). Berlin, Heidelberg: Springer-Verlag, 2005. 115–122
21 Ray N, Acton S. Motion gradient vector flow: An external force for tracking rolling leukocytes with shape and size
constrained active contours. IEEE Trans Med Imaging, 2004, 23: 1466–1478
22 Kass M, Witkin A, Terzopoulos D. Snakes: Active contour models. Int J Comput Vis, 1988, 1: 321–331
23 Caselles V, Kimmel R, Sapiro G. Geodesic active contours. Int J Comput Vis, 1997, 22: 61–79
24 Chan T F, Vese L A. Active contours without edges. IEEE Trans Image Process, 2001, 10: 266–277
25 Paragios N, Deriche R. Geodesic active regions for motion estimation and tracking. In: Proceedings of the Seventh
IEEE International Conference on Computer Vision, Kerkyra, 1999. 688–694
26 Bertalmio M, Sapiro G, Randall G. Morphing active contours. IEEE Trans Pattern Anal Mach Intell, 2000, 22: 733–737
27 Mansouri A R. Region tracking via level set PDEs without motion computation. IEEE Trans Pattern Anal Mach
Intell, 2002, 24: 947–961
28 Sundaramoorthi G, Yezzi A, Mennucci A. Coarse-to-fine segmentation and tracking using Sobolev active contours.
IEEE Trans Pattern Anal Mach Intell, 2008, 30: 851–864
29 Shi Y, Karl W C. Real-time tracking using level sets. In: Proceedings of IEEE Computer Society Conference on
Computer Vision and Pattern Recognition (CVPR), San Diego, 2005. 34–41
30 Bibby C, Reid I. Robust real-time visual tracking using pixel-wise posteriors. In: Forsyth D, Torr P, Zisserman A,
eds. Proceedings of the 10th European Conference on Computer Vision (ECCV). Berlin, Heidelberg: Springer-Verlag,
2008. 831–844
31 Rathi Y, Vaswani N, Tannenbaum A. A generic framework for tracking using particle filter with dynamic shape prior.
IEEE Trans Image Process, 2007, 16: 1370–1382
32 Yilmaz A, Li X, Shah M. Contour-based object tracking with occlusion handling in video acquired using mobile
cameras. IEEE Trans Pattern Anal Mach Intell, 2004, 26: 1531–1536
33 Dryden I, Mardia K. Statistical Shape Analysis. Chichester: John Wiley & Sons, 1998
34 Zhang T, Freedman D. Tracking objects using density matching and shape priors. In: Proceedings of the Ninth IEEE International Conference on Computer Vision, Nice, 2003. 1056–1062
35 Udupa J, Leblanc V, Zhuge Y, et al. A framework for evaluating image segmentation algorithms. Comput Med
Imaging Graph, 2006, 30: 75–87
36 Sethian J A. Level Set Methods and Fast Marching Methods. Cambridge: Cambridge University Press, 1999