Bilateral Symmetry Detection for Real Time Robotics Applications

Wai Ho Li, Alan M. Zhang and Lindsay Kleeman
Intelligent Robotics Research Centre
Department of Electrical and Computer Systems Engineering
Monash University, Clayton, Victoria 3800, Australia
{Wai.Ho.Li, Alan.Zhang, Lindsay.Kleeman}@eng.monash.edu.au
Abstract—Bilateral symmetry is a salient visual feature of many man-made objects. In this paper, we present research that uses bilateral symmetry to identify, segment and track objects in real time using vision. Apart from the assumption of symmetry, the algorithms presented do not require any object models, such as colour, shape or three-dimensional primitives. In order to remedy the high computational cost of traditional symmetry detection methods, a novel computationally efficient algorithm is proposed. To investigate symmetry as an object feature, our fast detection scheme is applied to the tasks of object detection, segmentation and tracking. We find that objects with a line of symmetry can be segmented without relying on colour or shape models by using a dynamic programming approach. Object tracking is achieved by estimating symmetry line parameters using a Kalman filter. The tracker operates at 40 frames per second on 640x480 video while running on a standard laptop PC. We use ten difficult real world tracking sequences to test our approach. We also quantitatively analyze symmetry as a tracking feature by comparing detected symmetry lines against ground truth. Colour tracking is also performed to provide a qualitative comparison.
Index Terms—bilateral symmetry, feature detection, real time, fast, model-free, computer vision, segmentation, tracking
I. INTRODUCTION
Computer Vision systems employed in robotics for the pur-
poses of detecting, segmenting and tracking objects generally
require a priori models. These object models range in com-
plexity from simple colour histograms to three dimensional
mesh grids consisting of thousands of polygons. This initial
knowledge allows for robust operation, especially when several
models are used in a synergistic manner. For example, Boosted
Haar classifiers [Viola and Jones, 2001] allow robust multi-
scale tracking of objects after offline training on positive and
negative data sets of target objects.
Collecting data sets and constructing prior models for every
object that can appear in a robot's environment is neither prac-
tical nor cost effective for many real world situations. In many
environments, novel objects may appear without warning. For
example, a domestic robot can expect to encounter new objects
regularly, such as cans of soft drinks, cups and bowls, when
performing cleaning tasks. For increased adaptability, a robot
should possess some means of segmenting and tracking novel
objects, in real time, without any a priori models.
The work presented here attempts to remedy this problem
by providing a set of model-free solutions for object detection,
segmentation and tracking. In order to do this robustly, the set
of objects we target are limited to those with strong bilateral
symmetry. The algorithms presented here can operate rapidly
on rigid objects with bilateral symmetry. The algorithms work
especially well for objects with surfaces of revolution, such as
cups, bottles and cans, as they appear bilaterally symmetric
from many view points. Our methods will not work for
deformable symmetric objects such as humans or animals. This
research is designed to deal with objects in a limited context,
which allows for the use of a model-free approach.
The motivations for using symmetry as an object feature
are as follows. Gestalt psychology suggests that symmetry is one of
several salient features humans use to visually locate and
model objects. Indeed, many man-made objects are bilaterally
symmetric. Apart from aesthetics, many objects are intention-
ally designed to be bilaterally symmetric for practical reasons.
For example, furniture may be bilaterally symmetric to provide
better balance under load. Most drinking utensils, such as
bottles and cups, are solids of revolution to allow for easy manipulation, which makes them bilaterally symmetric when
viewed from the side. As such, bilateral symmetry appears
to be a salient feature of objects and should be useful for a
variety of robotic applications.
To avoid confusion in later sections, our definition of bilat-
eral symmetry is as follows. Bilateral symmetry is represented
as a line parameterized in a polar fashion, as shown in
Figure 1. A bilateral symmetry line is a mirror line that
bisects an object to provide two symmetric halves. This kind
of bilateral symmetry can be seen in Subfigures 7(a) and 7(b).
Another point of note is the use of the term ground truth.
By ground truth, we mean any measurement of an observed
quantity, such as an object symmetry line, that can be used
to validate other measurements. The main assumption is that
ground truth is a more accurate measurement of the physical
world than the quantities it is compared against.
Bilateral symmetry has traditionally been used in offline
applications, due to the high computational cost of detection.
To remedy this, the authors have developed a fast bilat-
eral symmetry detection algorithm [Li et al., 2005]. This
algorithm's implementation and performance are detailed in
Section II. A comparison between our algorithm and the
Generalized Symmetry Transform [Reisfeld et al., 1995] is
also presented in the same section. The detection algorithm
has been successfully applied to the tasks of static object
segmentation and object tracking. Earlier versions of the object
segmentation and tracking algorithms can be found in [Li
et al., 2006] and [Li and Kleeman, 2006] respectively. This
paper provides more detailed coverage of our segmentation
and tracking research. In addition, new experimental results
on long tracking sequences provide quantitative analysis of
symmetry as a tracking feature. We also use the colour centroid
as a tracking feature for the same video sequences to provide a qualitative comparison with a well established model-free
tracking approach operating on different visual cues.
Object segmentation is a useful tool for robots that must
find, classify and interact with objects. In situations where pre-
built object models are unavailable, a fast model-free segmen-
tation approach is needed. Our symmetry detection algorithm
has been applied to the task of locating and segmenting static
objects. We show that object contours can be found in noisy
images, without the use of prior object models, by applying
a dynamic programming approach to find symmetric edge
contours. This model-free segmentation approach can operate
in real time on 640x480 pixel images. Images of symmetric
and partially-symmetric household objects, such as cups and
bottles, are used to test the segmentation approach. The
algorithm and experimental results are detailed in Section III.
Robots that deal with moving objects generally require
the ability to perform visual tracking in real time. Object
movement can come about through purposeful robotic ma-
nipulation or accidentally, as an unintended consequence of
the robot's actions. A human user may also move objects
for teaching purposes. As mentioned earlier, many man-made
objects are bilaterally symmetric. The task of collecting and
labeling images for every single object that can appear may be
highly difficult or impossible. As such, a robot operating in
such environments should be equipped with some means to track novel objects in real time. Ideally, the method should also
allow the construction of better models over time through the
collection of object data. Section IV covers our work on real
time object tracking using symmetry, which directly addresses these
issues. Experiments are carried out on difficult tracking se-
quences, including cases where the target object is transparent,
subjected to occlusions and undergoing large orientation and
scale changes. Some sample result frames from the tracking
experiments can be found in the Appendix. Time trials show
that the tracking system can operate at 40 frames per second
on 640x480 images.
A quantitative analysis of bilateral symmetry as a tracking
feature is also performed by mounting a test object on a
custom-made pendulum. The details of the experiment and
analysis are available in Section V. The symmetry tracker
is also qualitatively compared against a simple HSV colour
tracker, which operates on different visual cues. Sample video
frames from the test sequences and error plots are located in
the Appendices.
II. FAST SYMMETRY DETECTION
A. Introduction
This section provides a brief survey of research on symme-
try detection. Levitt was the first to detail a Hough transform
scheme to detect bilateral symmetry in point clusters [Levitt,
1984]. Ogawa suggested a method of symmetry detection that
can be used to find symmetry between edge segments [Ogawa,
1991]. The Generalized Symmetry Transform [Reisfeld et al.,
1995] can detect bilateral and radial symmetry at different
scales using gradient information. Yip's symmetry detector [Yip, 2000] can detect skew symmetry in edge images using
a multi-pass Hough transform approach. More recently, a
feature-based bilateral and radial detection scheme [Loy and
Eklundh, 2006] has been used to find symmetry in clusters
of feature points. A method based on matching quartets of
SIFT [Lowe, 2004] features to detect bilateral symmetry under
perspective [Cornelius and Loy, 2006] has also been proposed.
While real time radial symmetry detection [Loy and Zelin-
sky, 2003] has been achieved, bilateral symmetry detectors
are generally used in offline processing applications due to
their high computational cost. For example, the Generalized
Symmetry Transform operates on every possible pixel pair
in the input image. It has a computational complexity of
O(n^2), where n is the total number of pixels in the input image. Yip's symmetry detector uses mid-point pairs, each
generated from two edge pixel pairs. The algorithm has a complexity of O(n_edge^4), where n_edge is the number of edge pixels. Due to their high complexity, real time detection using
these algorithms cannot be achieved for large images using
standard computing hardware at the time of writing.
Our Fast Global Reflectional Symmetry Detection algo-
rithm [Li et al., 2005] is inspired by the Hough transform
approach of Levitt [Levitt, 1984]. We improve detection speed
by rotating edge pixels through discrete detection angles as
described in Subsection II-B. When used in tracking applica-
tions, we limit the angle range of detection to further improve
performance.
B. Algorithm Description
Our approach performs symmetry detection on an image's
edge pixels. In our experiments, we have found that a million-
pixel image reduces down to an edge image with roughly
10000 to 30000 non-zero pixels. Of course, this number
will depend on the visual complexity of the scene, and the
characteristics of the edge filter. Apart from reducing data size,
symmetry detection also benefits from the noise rejection, edge
linking and weak edge retention properties of edge filters. The
Canny [Canny, 1986] edge filter is used to generate all the edge
images used in our experiments.
The polar parametrization described in Figure 1 is used for
the detected symmetry lines. Symmetry lines are represented
by their angle and distance relative to the center of the image.
Edge pixels are grouped into pairs and each pair votes for
a single line in parameter space. Unlike traditional Hough
Transform [Duda and Hart, 1972], which requires multiple
votes per edge pixel, our approach only requires a single vote
per edge pixel pair. This convergent voting scheme is similar
to that utilized in Randomized Hough Transform [Xu and Oja,
1993].
Algorithm 1 details the fast symmetry detection method.
Edge pixels are rotated about the center of the image for each discrete angle used in Hough voting, as illustrated in Figure 2.
Fig. 1. An edge pixel pair, shown in black, voting for a symmetry line with parameters R and θ.

Algorithm 1: Fast Symmetry Detection
Input: edge pixel locations
Output: sym — (R, θ) parameters of detected symmetry lines
Parameters:
BINSθ, BINSR — angle and radius quantization of the accumulator H
Dmin, Dmax — distance thresholds for edge pixel pairs
Nlines — number of symmetry lines to return

H[ ][ ] ← 0
for θindex ← 0 to BINSθ − 1 do
    Rot ← edge pixels rotated by −θ(θindex) about the image centre, grouped by scanline
    for each pair (x1, x2) in the same row of Rot do
        if |x2 − x1| < Dmin or |x2 − x1| > Dmax then
            continue to next pair
        x0 ← (x2 + x1)/2
        Increment H[x0][θindex] by 1
for i ← 1 to Nlines do
    sym[i] ← max(Rindex, θindex) of H
    Bins around sym[i] in H ← 0

Fig. 2. Edge pixel rotation and discretization procedure. Edge pixels (•) are rotated by θ about the image center, marked as a +. Then, the horizontal coordinates of the rotated pixels are inserted into the 2D array Rot. Pixels from the same scanline are placed into the same row in Rot. Pixels in the same row are paired up and each pair votes for a single symmetry line in (R, θ) parameter space. The rows containing [3, 1] in Rot will vote for the dashed symmetry line a total of five times.
The computational cost of the voting process depends on the Hough angle quantization and the vertical quantization
used in Rot. Assuming uniformly distributed edge pixels across the rows of Rot, the algorithm requires (BINSθ/D) × n_edge^2
voting operations, where BINSθ is the number of Hough angle divisions and D is the number of rows in Rot. The
accuracy of the method can be improved by increasing the
number of angle divisions, sacrificing execution time as a trade
off. The reverse is true if we increase the number of rows in
Rot. In essence, the BINSθ/D term allows for an adjustable trade
off between detection accuracy and computational efficiency.
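To make the voting stage concrete, the following C++ sketch implements the convergent voting described above. The helper names (Point, houghVote) and the row and radius quantization mappings are our own illustrative assumptions; the paper's optimized implementation may differ.

#include <cmath>
#include <vector>

struct Point { float x, y; };

// Sketch of the Hough voting core of Algorithm 1. H must be
// pre-sized to binsR rows by binsTheta columns.
void houghVote(const std::vector<Point>& edges,
               float cx, float cy,        // image centre
               int binsTheta, int binsR,  // accumulator quantization
               int numRows,               // vertical quantization of Rot
               float dMin, float dMax,    // pair distance thresholds
               float diag,                // image diagonal in pixels
               std::vector<std::vector<int>>& H)
{
    const float kPi = 3.14159265f;
    for (int t = 0; t < binsTheta; ++t) {
        const float theta = kPi * t / binsTheta;
        const float c = std::cos(-theta), s = std::sin(-theta);

        // Rotate edge pixels by -theta about the centre and group the
        // rotated x coordinates by quantized scanline (the Rot array).
        std::vector<std::vector<float>> rot(numRows);
        for (const Point& p : edges) {
            const float x = c * (p.x - cx) - s * (p.y - cy);
            const float y = s * (p.x - cx) + c * (p.y - cy);
            const int row = int((y / diag + 0.5f) * numRows);
            if (row >= 0 && row < numRows) rot[row].push_back(x);
        }

        // Each pair of pixels in the same row casts a single vote at
        // the radius bin of its midpoint.
        for (const auto& rowPix : rot)
            for (size_t i = 0; i < rowPix.size(); ++i)
                for (size_t j = i + 1; j < rowPix.size(); ++j) {
                    const float d = std::fabs(rowPix[j] - rowPix[i]);
                    if (d < dMin || d > dMax) continue; // reject pair
                    const float x0 = 0.5f * (rowPix[i] + rowPix[j]);
                    const int rBin = int((x0 / diag + 0.5f) * binsR);
                    if (rBin >= 0 && rBin < binsR) ++H[rBin][t];
                }
    }
}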
Symmetry lines are detected by looking for peaks in the
Hough accumulator. The final for-loop in Algorithm 1 describes
the non-maxima suppression algorithm used for peak
finding. Maxima in the Hough accumulator H are found iteratively. Each iteration is followed by setting the maxima
and its surrounding neighbourhood of bins to zero. As with
the edge rotation operation, the contribution of peak finding to
execution time is negligible when compared with the Hough
voting stage of the algorithm.
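The peak finding step can be sketched as follows; the suppression radii suppR and suppT are tunable values we introduce here, not parameters specified by the paper.

#include <algorithm>
#include <utility>
#include <vector>

// Iterative non-maxima suppression: repeatedly take the accumulator
// maximum, then zero its neighbourhood. H is taken by value so the
// caller's accumulator is left untouched.
std::vector<std::pair<int, int>> findPeaks(std::vector<std::vector<int>> H,
                                           int nLines, int suppR, int suppT)
{
    std::vector<std::pair<int, int>> peaks;
    const int binsR = int(H.size()), binsT = int(H[0].size());
    for (int k = 0; k < nLines; ++k) {
        int bestR = 0, bestT = 0;
        for (int r = 0; r < binsR; ++r)
            for (int t = 0; t < binsT; ++t)
                if (H[r][t] > H[bestR][bestT]) { bestR = r; bestT = t; }
        peaks.emplace_back(bestR, bestT);
        // Zero the peak and its neighbourhood so the next iteration
        // returns a different symmetry line.
        for (int r = std::max(0, bestR - suppR); r <= std::min(binsR - 1, bestR + suppR); ++r)
            for (int t = std::max(0, bestT - suppT); t <= std::min(binsT - 1, bestT + suppT); ++t)
                H[r][t] = 0;
    }
    return peaks;
}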
Our detection scheme has also been extended to allow for the detection of skew symmetry resulting from weak
perspective projection. Figure 7(d) shows detection results for
a horizontally-skewed skull-and-crossbones poison logo. This
is achieved by modifying the voting operation to vote for all
lines passing through the mid-point of an edge pair, in the
same way as the standard Hough transform. The algorithm's
order of computational complexity remains the same when
detecting skew symmetry, as the number of angle divisions in
the Hough accumulator is fixed. In addition to the constant-
factor increase in computational cost, two additional matrices,
the same size as the Hough accumulator, are required. Please
refer to Lei and Wong's work on skew symmetry detection [Lei
and Wong, 1999] for additional information concerning these
extra Hough accumulators.
C. Comparison with Generalized Symmetry
The Generalized Symmetry Transform calculates weights
based on image gradient symmetry, distance between pixels
and image gradient intensity to generate symmetry maps.
The symmetry map is essentially an image of symmetry
contributions made by all pixel pairs in the original image.
Two symmetry maps are produced by the transform, one
containing the magnitude of symmetry contribution and the
other the phase. The contributions of all points are used in the generation of the symmetry maps. For a detailed description of
the algorithm, such as the way in which the different weighting
functions are combined, please refer to the seminal paper by
Reisfeld et al. [Reisfeld et al., 1995].
In order to obtain an axis of bilateral symmetry from
the symmetry map, thresholding followed by a line search
method, such as Hough transform, is required. Various ad-
hoc techniques can also be applied to the early stages of
Generalized Symmetry to limit the required processing. This
includes ignoring pixels of low gradient magnitude and sam-
pling pixel values at a coarse scale. However, due to the use
of computationally expensive weighting functions and the fact
that every image pixel is processed, the algorithm does not
lend itself easily to real time implementation.
Generalized symmetry and fast symmetry detection results
are presented below. In order to quantify the comparison
between these two methods, synthetic test images are used.
The test images each contain a grey vase-like shape with a
vertical line of symmetry through its center. The location of
this line of bilateral symmetry is used as ground truth in our
tests. If the maximum distance between the symmetry line found
using fast symmetry and the ground truth is less than or equal
to 1 pixel, the test is successful. As generalized symmetry
does not return a symmetry line, the location with maximum
contributed value, that is, the point in the symmetry map with
maximum isotropic symmetry, is used instead. If this point is
located within 1 pixel of the ground truth symmetry line, the
test is considered a success.
Three test images are used. The first has a dark shape set
against a light background. The remaining two images have
the same vase set against backgrounds with smooth changes in
intensity. The tests are repeated after adding Gaussian noise
to the test images. All the test images are 64x81 pixels in size. Note that the detection scale of generalized symmetry is
governed by the variable σ.
For all three test images, the fast symmetry detector is able
to find the line of symmetry at the exact, ground truth, location.
The generalized symmetry transform is able to find the axis of
symmetry in test image 1, with no noise added. Lowering the
scale parameter produces a corner-detecting behaviour, as seen
in Figure 3(c). With added noise, the generalized symmetry
algorithm is only able to detect the axis of symmetry with
σ = 10. For test images with intensity variations in their backgrounds, the generalized symmetry algorithm is only
successful for test image 3 with a σ of 10. As seen in Figure 5(d), the line of symmetry is found near the top of the vase.
With the addition of Gaussian noise, the generalized approach
failed for both images, regardless of the scale factor. To save
space and needless repetition, the results of adding Gaussian
noise to test image 3 have been omitted.
The problems the Generalized Symmetry Transform has with
varying background intensity stem from its core assumptions.
The transform is designed to favour opposing image gradients,
while rejecting image gradients in the same direction. In high-
level terms, the algorithm assumes either light objects against a
dark background, or dark objects on a light background. With
variations in the background perpendicular to the symmetry
line, this leads to zero contributions being made by pixel pairs across the left and right edges of the vase. This can
be seen in the results for test image 2. The algorithm is
still able to find the correct symmetry in Figure 5 as the
gradient variations only affect pixel pairings that contribute
to horizontal symmetry.
The computational complexity of both symmetry algorithms
is O(n^2), with n equal to the number of input pixels, as the
number of possible pixel pairs is given by n(n−1)/2.
However, the fast symmetry algorithm only operates on edge
pixels, while the generalized symmetry algorithm operates
on all image pixels. The number of edge pixels is generally
much smaller than the total number of pixels in an image.
Additionally, the complexity of computations in the inner loop
of the fast symmetry detector has been drastically reduced
by edge pixel rotation. Hence, the fast symmetry algorithm
requires fewer computations than the generalized approach. Our
approach has also incorporated the post-processing stage of
applying Hough transform line detection to the symmetry map.
The calculation of local gradient intensities has been removed
by the use of edge images, which discard pixels with low
image gradient magnitude.
In order to evaluate the performance and suitability of
both algorithms for real time applications, they have been
implemented in C++. Input images of size 80x60 pixels are
used as test data in the experiments. The test image set contained the three vase images described above and their noise-added counterparts.
(a) Test Image 1 (b) Fast Symmetry
(c) Generalized Symmetry, σ = 2.5
(d) Generalized Symmetry, σ = 10, with edge image overlayed
(e) Test Image 1 with Gaussian noise added
(f) Fast Symmetry
(g) Generalized Symmetry, σ = 2.5
(h) Generalized Symmetry, σ = 10
Fig. 3. Symmetry detection results for Test Image 1. (b)-(d) contain results for the test image. (e) is Test Image 1 with added Gaussian noise. The noise has σ = 0.1, with image intensity defined between 0 and 1. (f)-(h) contain detection results for the noisy image. Bright pixels in the generalized symmetry results have high levels of detected symmetry.
(a) Test Image 2 (b) Fast Symmetry
(c) Generalized Symmetry, σ = 2.5
(d) Generalized Symmetry, σ = 10
(e) Test Image 2 with Gaussian noise added
(f) Fast Symmetry
(g) Generalized Symmetry, σ = 2.5
(h) Generalized Symmetry, σ = 10
Fig. 4. Symmetry detection results for Test Image 2. Note the intensity variation in the background of (a). (b)-(d) contain results for the test image. (e) is Test Image 2 with added Gaussian noise. The noise has σ = 0.1, with image intensity defined between 0 and 1. (f)-(h) contain detection results for the noisy image.
(a) Line 2
(b) Line 4
Fig. 8. Detection of non-object symmetry lines from Subfigure 7(c). Edge pixels have been dilated for improved visibility and are shown in black. The red edge pixels are those that voted for the green symmetry line.
In Subfigure 7(a), both the symmetry of the forearm and the symmetry
between the forearm and its shadow contribute votes to the
accumulator. However, the bottle's total symmetry contribution
is much higher. Subfigure 7(d) shows the detection of skew
symmetry using the aforementioned modified voting proce-
dure.
Subfigure 7(c) displays the detection results for a more
complicated arrangement of objects. The lines are labeled
according to the number of votes they received, with 1
being the symmetry line with the most votes. Notice that
the symmetry lines of all three objects are found. However,
background symmetries are also detected. Line 2 is due to
a combination of edge pixel noise and the symmetry of the long
horizontal shadow. Line 4 is caused by inter-object symmetry,
primarily between the two cups.
Figure 8 shows the edge pixel pairs that voted for the
non-object symmetry lines of Subfigure 7(c). Notice the large
number of edge pixels contributed by the multi-coloured mug
in both cases. This is caused by the use of low Canny edge
filtering thresholds, which produced many noisy edges due to
the mug's textured surface. We chose not to raise the Canny
thresholds, which would remove much of the edge noise, in order to
fully test the noise robustness of our symmetry detection
approach. Note also that the Canny thresholds are kept constant
TABLE II
EXECUTION TIME OF THE FAST SYMMETRY DETECTION ALGORITHM

Image Number   Image Dimensions   No. of Edge Pixels   Execution Time (ms)
1              640 x 480          9766                 136
2              640 x 480          15187                224
3              640 x 480          9622                 153
4              640 x 480          9946                 141
5              640 x 480          9497                 128
6              640 x 480          9698                 145
7              640 x 480          11688                167
8              640 x 480          11061                180
9              640 x 480          12347                196
10             640 x 480          8167                 81
11             610 x 458          6978                 97
during our experiments. Unless otherwise specified, detection
parameters are not adjusted for different objects or lighting
conditions.
The detection of background symmetry and inter-object
symmetry is unavoidable due to the lack of high-level knowledge available to our algorithm. However, the distance threshold Dmin in Algorithm 1 can be increased to reject narrow symmetry, such as line 2 in Subfigure 7(c). The expected
orientation of symmetry lines can also be used to reject
unwanted symmetry, especially when some knowledge of the
scene is available. For example, a humanoid robot trying to
manipulate cups and bottles on a table will only deal with near-
vertical lines of symmetry. As such, the angular range of the
symmetry detector can be constrained accordingly. Limiting
the range of detection angles will also improve detection
speed.
A C++ implementation of Algorithm 1 is used for all experiments. The experiment platform is a desktop PC with a
Xeon 2.2GHz CPU and 1GB of main memory. No platform-
specific optimizations, such as MMX or SSE2, are used in
the code. Referring to the parameters defined in Algorithm 1,
the Hough accumulator has 180 angle divisions (BINSθ). The number of radius divisions (BINSR) is equivalent to the size of the image diagonal in pixel units. Dmax is half the image width and Dmin is set to 5 pixels.
Borrowing from the Randomized Hough Transform [Xu and Oja,
1993], the list of edge pixels is sampled before detection. The
sampling occurs as follows. After edge detection, a random
subset of edge pixels are chosen from the edge image. In
our experiments, the subset is one quarter the size of all
edge pixels, meaning only one-in-four edge pixels is kept.
This subset is given as input to fast symmetry detection. We
have found that detection reliability and accuracy degrade
noticeably when the sampling ratio drops below 0.1, that
is, one-in-ten edge pixels. The timing results are shown in
Table II. Note that the execution times include edge filtering
as well as non-maxima suppression peak finding.
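The sampling itself is a simple shuffle-and-truncate; a minimal sketch (helper name ours), called with ratio = 0.25 for the one-in-four subset used here:

#include <algorithm>
#include <random>
#include <vector>

// Keep a random fraction of the edge pixels before voting.
template <typename PointT>
std::vector<PointT> sampleEdges(std::vector<PointT> edges, double ratio)
{
    static std::mt19937 rng{std::random_device{}()};
    std::shuffle(edges.begin(), edges.end(), rng);
    edges.resize(static_cast<std::size_t>(edges.size() * ratio));
    return edges;
}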
The execution times confirm that the amount of computation
increases as the number of edge pixels extracted from the input
image increases. More complicated images require more time as
they tend to generate more edge pixels. The detection time for
640x480 images ranges from 80 to 224 milliseconds. This will
allow for frame rates between 5 and 12 Hz, which is acceptable
for many real time applications. The use of a processing
window or smaller images, a faster PC and processor-specific
optimizations can further improve the speed of the algorithm
to meet more stringent real time requirements. The sampling
ratio of edge pixels can also be adjusted at run time to alter
the detection speed. Frame rates of 20Hz have been achieved
by reducing the input image size to 320x240 pixels.
III. SEGMENTATION USING SYMMETRY
A. Introduction
There are many definitions of Object Segmentation in
Robotics and Computer Vision. For digital images, it can
be seen as an intelligent version of Image Segmentation, a
review of which can be found in [Pal and Pal, 1993]. Here,
we define object segmentation as the task of finding sections
of an image that correspond to a 3D object in the real
world. Through object segmentation, a robot can obtain useful
information about its surroundings. For domestic robots, the
ability to quickly and robustly segment man-made objects in
the household or office environment is highly desirable. For
example, a robot designed to clean and tidy desks will need
to locate and segment common objects such as cups, pens and
books.
Object segmentation methodologies differ in their assump-
tions, as well as in their level of prior knowledge. When
models of objects are available, it can be argued that the
system is in fact performing object recognition, by matching
sensor data with pre-built models. The Generalized Hough
Transform [Ballard, 1981] is an example of a model-based
segmentation approach. A predefined parameterized model of
a 2D shape, essentially an object model, is required before the
transform can be applied.
In many situations, a priori object information, such as
shape and colour, may not be available. Also, the generation
of detailed object models can be costly and in many cases,
not fully automated. Hence, a robot that can segment objects
without requiring a prior model is much desired, especially for
use in domestic environments. Returning to the desk cleaning
robot example, in the case where it encounters a cup without
its model, a solution would be to have the robot generate its
own learning data by physically interacting with the cup. In
order to do this, the robot must begin by using a model-free
approach to object segmentation. The ability to detect and
segment objects quickly, ideally in real time, will also greatly
benefit the robot's responsiveness and robustness to changing
environments.
Colour, gradients and shape are some common visual cues
used for segmentation. Colour has proved to be useful in
segmenting a variety of entities and is relatively simple to
detect. For example, skin colour filters are widely used in face
recognition and hand tracking applications. However, many
man-made household objects are multi-colour, consisting of
several segments of different colour. For a survey of colour-
based image segmentation techniques, refer to [Skarbek and
Koschan, 1994].
There are several symmetry-based segmentation methods in
the existing literature. Methods such as [Gupta et al., 2005] apply
an existing segmentation algorithm, such as the normalized-cut
algorithm, but modify the affinity matrix using the property
of symmetry. This is a region based segmentation method and
requires the pixel values within the object to be symmetric.
As such, the method cannot segment transparent objects or
objects with asymmetric textures.
To overcome this problem, we use a purely edge based
method. Simply identifying all edge pixels that voted for the symmetry line is not acceptable due to the possibility
of coincidentally matching pairs of edge points. A more
robust method is needed. Before continuing, we must define
object segmentation. Because no prior model or geometric
properties of the object are assumed apart from its symmetry,
a definition is difficult. We define an object segmentation as
the most continuous contour symmetric about the object's line
of symmetry. While the definition is not perfect, it does allow
for the problem to be solved robustly.
The task then becomes finding the most continuous and
symmetric contour in the edge image about a detected sym-
metry line. For real time applications, the proposed algorithm
must have predictable execution times. This criterion rejects
approaches that require initialization and multiple iterations
such as active contours. Our proposed algorithm uses a single
pass Dynamic Programming (DP) approach. While much re-
search has been performed on the use of DP to find contours in
images [Yan and Kassim, 2004], [Lee et al., 2001], [Mortensen
et al., 1992], [Yu and Luo, 2002], they require a human-
selected starting point. For the object outlines being consid-
ered, these methods would require a human user to provide
an initial pair of symmetric pixels on the outline. As a major
goal of object segmentation in robotics is image understanding
without human intervention, we engineered our approach to
requires none. Section III-B describes the pre-processing stepused to remove asymmetric edge pixel pairs, which we call the
Symmetric Edge Pair Transform. Section III-C describes the
dynamic programming segmentation algorithm. Results and
processing times are presented in Section III-D.
B. The Symmetric Edge Pair Transform
We introduce the Symmetric Edge Pair Transform (SEPT)
as a preprocessing step applied prior to dynamic program-
ming segmentation. The edge image is first rotated such that
symmetric edge pairs lie in the same row. The idea behind
the transform is to parameterize a point pair by its distance
of separation and the deviation of its midpoint from the object's
symmetry line. The algorithm has also been generalized to
accommodate skew symmetry by using a non-vertical sym-
metry line after rotating the edge pixel pairs. The transform
is described in Algorithm 2.
The weighting function W() in Algorithm 2 is a monotonically decreasing function, such that the larger the deviation
(d) of the midpoint from the symmetry line, the lower the
weight. That is, the more asymmetric an edge pair is about
the symmetry line, the lower its weight in the resulting
SEPT buffer. In our implementation, the weighting function is
W(d) = 1 − d/(2 × WND). The variables d and WND are defined
in Algorithm 2. Note that the distance threshold MAXhw can
be used to limit the maximum expected width of segmented objects.
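Since Algorithm 2 is not reproduced in this section, the following C++ sketch reflects our reading of the transform as described above. It assumes the edge image has already been rotated so the symmetry line is the vertical line x = xSym, and that empty SeptBuf cells are marked with −1; these conventions and the helper names are ours.

#include <algorithm>
#include <cmath>
#include <vector>

// Fill the SEPT buffer: each edge pixel pair in a scanline is encoded
// by its half-width (column) and weighted by its midpoint deviation d
// from the symmetry line, using W(d) = 1 - d / (2 * wnd).
void septTransform(const std::vector<std::vector<float>>& rowsOfX,
                   float xSym, int maxHw, float wnd,
                   std::vector<std::vector<float>>& septBuf) // [row][maxHw], init -1
{
    for (std::size_t r = 0; r < rowsOfX.size(); ++r)
        for (std::size_t i = 0; i < rowsOfX[r].size(); ++i)
            for (std::size_t j = i + 1; j < rowsOfX[r].size(); ++j) {
                const float hw = 0.5f * std::fabs(rowsOfX[r][j] - rowsOfX[r][i]);
                const float mid = 0.5f * (rowsOfX[r][j] + rowsOfX[r][i]);
                const float d = std::fabs(mid - xSym);        // midpoint deviation
                if (d >= wnd || hw >= float(maxHw)) continue; // too asymmetric or too wide
                const float w = 1.0f - d / (2.0f * wnd);      // decreasing in d
                const int c = int(hw);
                septBuf[r][c] = std::max(septBuf[r][c], w);
            }
}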
Fig. 10. Object Segmentation and Contour Refinement. The object outline discovered by back tracking from the maximum value in the score table is shown on the left. The object outline after Contour Refinement is shown on the right.
The horizontal continuity steps can cause cycles to form in the same row of the backPtr array. To prevent these cycles, a copy of backPtr is made in Step 1. If the score from horizontal continuity is higher than from
vertical continuity then the higher score is recorded, and
backPtr is updated. Horizontal continuity is given less reward than vertical continuity. The reason for this is to reject long
horizontal edge segments, which are technically symmetric
about their mid points. The symmetry detected for these straight
lines very rarely represents actual symmetric objects. Humans
generally consider symmetric objects as those that have symmetric boundaries along the direction of the mirror line. If the
horizontal continuity reward is too high, it may also lead to
unwanted zigzag patterns in the generated contours.
After filling the score table, the best symmetric contour can
be found by starting at the highest scoring cell, and back
tracking through cells of lower weight. The back tracking
algorithm is described in Algorithm 4. Note that both copies
of back pointers are utilized by the algorithm so that there will
be no purely horizontal segment in the contour. The contour
of the object can be extracted by keeping a list of position
indices {r, c} during the back tracking process. The column index c indicates the horizontal distance of the contour from the symmetry line. An example of the resulting contour can
be seen superimposed on to SeptBuf in Figure 9(b) and in the left image of Figure 10.
The contour obtained thus far does not directly correspond
to edge pixels. This is due to the tolerance introduced in the
SEPT preprocessing, which also caused our aforementioned
edge weighting ambiguity. In order to produce a contour that
corresponds to actual edge pixels, a refinement step is taken.
In this step, the same window size used in the SEPT, WND, is employed to refine the contour. The algorithm produces a
near-symmetric outline by looking for edge pixels near the
symmetric contour within the window. The contour refinement
step is similar to Algorithm 3, substituting the SeptBuf with
Algorithm 3: Finding Continuous Symmetric Contours with Dynamic Programming
Input: SeptBuf
Output: sTab — table of scores, same size as SeptBuf
        backPtr — back pointers
Parameters:
Himg — image height
MAXhw — half of the maximum expected width of symmetric objects
{Pver, Rver} — penalty/reward for vertical continuity
{Phor, Rhor} — penalty/reward for horizontal continuity

sTab[ ][ ] ← 0
for r ← 1 to Himg do
    Step 1, vertical continuity
    for c ← 1 to MAXhw do
        if SeptBuf[r][c] is not −1 then
            cost ← SeptBuf[r][c] × Rver
        else
            cost ← Pver
        vScore[c] ← max( 0, sTab[r−1][c−1] + cost, sTab[r−1][c] + cost, sTab[r−1][c+1] + cost )
        if vScore[c] > 0 then
            Set backPtr[r][c] to record which of the 3 neighbouring cells is used to produce vScore[c]
        backPtrAux[r][c] ← backPtr[r][c]
    Step 2, horizontal continuity from left to right
    prevScore ← neg. inf.
    for c ← 1 to MAXhw do
        if SeptBuf[r][c] is not −1 then
            cost ← SeptBuf[r][c] × Rhor
        else
            cost ← Phor
        hScore ← prevScore + cost
        if vScore[c] >= hScore then
            prevScore ← vScore[c]
            columnPtr ← c
        else
            prevScore ← hScore
        if sTab[r][c] < prevScore then
            sTab[r][c] ← prevScore
            Set backPtr[r][c] to record position {r, columnPtr}
    Step 3, horizontal continuity from right to left
    Repeat Step 2, moving right to left in column index
Algorithm 4: Back tracking highest score in the score table
Input: sTab, backPtr, backPtrAux
Output: {r, c} — {row, column} indices

{r, c} ← position of MAX(sTab)
while sTab[r][c] is not zero do
    {r, c} ← backPtr[r][c]
    if r did not change, i.e. no vertical position change, then
        {r, c} ← backPtrAux[r][c]
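As a compact illustration, the sketch below implements only the vertical continuity pass (Step 1 of Algorithm 3) together with the back tracking of Algorithm 4; the two horizontal continuity passes follow the same scoring pattern. The flattened structure and names are ours, and pVer is expected to be negative (a penalty).

#include <utility>
#include <vector>

struct BackPtr { int dr, dc; }; // offset to the predecessor cell

// Score vertical continuity over the SEPT buffer (-1 marks empty
// cells), then back track from the best cell through lower scores.
void scoreAndTrack(const std::vector<std::vector<float>>& sept,
                   float rVer, float pVer, // reward / (negative) penalty
                   std::vector<std::pair<int, int>>& contour)
{
    const int hgt = int(sept.size()), wid = int(sept[0].size());
    std::vector<std::vector<float>> sTab(hgt, std::vector<float>(wid, 0.f));
    std::vector<std::vector<BackPtr>> back(hgt, std::vector<BackPtr>(wid, {0, 0}));

    for (int r = 1; r < hgt; ++r)
        for (int c = 0; c < wid; ++c) {
            const float cost = sept[r][c] >= 0.f ? sept[r][c] * rVer : pVer;
            float best = 0.f;
            BackPtr bp{0, 0};
            for (int dc = -1; dc <= 1; ++dc) { // the 3 cells above
                if (c + dc < 0 || c + dc >= wid) continue;
                const float s = sTab[r - 1][c + dc] + cost;
                if (s > best) { best = s; bp = {-1, dc}; }
            }
            sTab[r][c] = best;
            back[r][c] = bp;
        }

    // Start from the highest scoring cell and follow back pointers.
    int r = 0, c = 0;
    for (int i = 0; i < hgt; ++i)
        for (int j = 0; j < wid; ++j)
            if (sTab[i][j] > sTab[r][c]) { r = i; c = j; }
    while (sTab[r][c] > 0.f && back[r][c].dr != 0) {
        contour.push_back({r, c});
        const BackPtr bp = back[r][c];
        r += bp.dr;
        c += bp.dc;
    }
}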
(a) (b)
Fig. 11. Segmentation of a multi-colour object. The purple contour has been manually thickened and overlaid on to the edge image. The symmetry line, in yellow, is identical to the symmetry detection results shown in Figure 7(b).
the edge image. Results from contour refinement are shown in
the right image of Figure 10.
Objects with internal symmetries, such as pliers, may be
difficult to segment accurately using our approach. Both the
inside and outside edges of a set of pliers are symmetric about
the same line. In such cases it may be more useful to identify
the outer edges for a robot manipulation task. One of the
possible solutions investigated is the use of a weighting that favours symmetric edge pairs that harbour more symmetric
edge pairs between their edges. For example, in the case of the
pliers, the pairs of outer handle edges contain in between
them the inner handle edges, so the outer edge pairs are more
heavily weighted. However, this may not be a satisfactory
solution as it tends to favour widely separated edge pairs.
D. Results
Figure 11 shows the segmentation of a multi-colour object.
This result demonstrates the algorithm's ability to segment
objects of non-uniform colour. Note that the edge image is
quite noisy due to texture on the cup surface and on the book.
This noise did not adversely affect the segmentation results. In
Figure 12, all three symmetric objects are segmented using our
approach. Note that, in all of our results, no prior information
such as geometric models, object colour or texture is used. The
only information received by the segmentation algorithm is the
detected symmetry line parameters and the edge image. Due
to shadows and specular reflections, the vertical side edges of
the small, multi-coloured cup are distorted and have very large
gaps. Hence, the more symmetric and continuous elliptical
contour of the cup's opening is returned by the segmentation
algorithm. There is a slight distortion in the detected ellipse.
This distortion is caused by large gaps in the outer rim of the
Fig. 12. Object segmentation performed on a scene with multiple objects using the results from Figure 7(c). The object outlines have been thickened and rotated such that their symmetry lines are vertical.
TABLE III
EXECUTION TIME OF THE OBJECT SEGMENTATION ALGORITHM

Image No.   Size of Score Table     Cumulative No. of Edge Pairs   SEPT+DP (ms)   CR (ms)
1           356 x 576 (= 205056)    77983                          26             4
2           322 x 482 (= 155204)    142137                         25             4
3           322 x 482 (= 155204)    65479                          19             3
4           326 x 493 (= 160718)    68970                          21             4
5           402 x 476 (= 191352)    67426                          36             7
6           382 x 801 (= 305982)    44901                          36             8
7           349 x 556 (= 194044)    90104                          26             6
8           345 x 546 (= 188370)    133784                         28             4
9           402 x 777 (= 312354)    121725                         40             8
10          393 x 705 (= 277065)    177077                         32             7
11          383 x 722 (= 276526)    51475                          32             6

CR: contour refinement stage
cup in the edge image. This produced a contour that contains
a combination of the inner and outer rims of the cup's elliptical
opening.
Table III contains execution times of a C++ implementation
of our object segmentation algorithm. The same computer
described in Section II-D is used for these experiments. The
image numbers are the same as those used in Table II. The test
cases with smaller DP score tables are able to be processed
at 30 frames per second (FPS). Test cases with larger tables
can still be processed at 20FPS. The third column of Table III,
labeled No. of Edge Pairs, is the number of edge pixel pairs
processed by SEPT. In our implementation, the SEPT code that
fills the SeptBuf and the dynamic programming code are placed within the same loop to improve efficiency. As such,
their combined execution time is shown in the SEPT+DP
column. Looking at Table III, the size of the cumulative score
table appears to be the main factor affecting the execution time.
This agrees with expectations as a score is calculated for each
entry in the table. The maximum expected size of objects is
set to be the width of the image. In practice, the size of the
objects can be restricted to more reasonable bounds, especially
Fig. 13. System Diagram of Symmetry Tracker
considering the use of distance thresholds in our symmetry
detection algorithm. This will further improve execution time.
IV. OBJECT TRACKING USING SYMMETRY
A. Introduction
To represent an object without its prior model, features that
are robust to affine transformation and illumination changes
are needed. Descriptive and noise robust features, such as
SIFT [Lowe, 2004] or MSER [Matas et al., 2002], are difficult
to apply in real time applications due to their high computa-
tional costs, especially when matching against large descriptor
databases. The need for a prebuilt database of features can
also be an issue when dealing with novel objects. In order to
perform tracking in real time, model-free approaches generally
use features that are computationally inexpensive to extract
and match. Huang et al. [Huang et al., 2002] used region
templates of similar intensity. Satoh et al. [Satoh et al.,
2004] utilized colour histograms. Both approaches can tolerate
occlusions, but are unable to handle shadows and colour
changes caused by variations in lighting. Tracking objects
under different illumination conditions requires features that
do not directly rely on colour or intensity information.
Figure 13 gives an overview of the tracking process. Motion
detection results are used to limit symmetry detection to
areas with movement. The Kalman filter prediction, before the
measurement update, is used to speed up symmetry detection
by limiting the detection angle. The detection results are then
passed to the Kalman filter as measurements. The motion
detection results are refined using the symmetry line estimate
produced by the Kalman filter. This produces a near-symmetric segmentation of the object. A rotated bounding box is then
computed based on the segmentation.
B. Improving Symmetry Detection for use in Object Tracking
The raw symmetry detection results cannot be used directly
as measurements for tracking. Inter-object symmetry as well
as symmetric portions of the background, like table corners,
can overshadow the symmetry of the object being tracked.
Figure 14(a) is an example where background symmetry lines
may cause problems in tracking. The bottle's symmetry line
is weaker, in terms of its Hough vote total, than the orange
symmetry line (line 1). As such, non-object edge pixels should
(a) Top three symmetry lines (b) Angle limits
Fig. 14. Symmetry Detection for use in Object Tracking.
Left: Top three symmetry lines returned by our detector. Lines are numbered according to the quantity of votes they received, with line 1 having received the most votes. Notice that the object's symmetry line is not the strongest one in the image.
Right: Angle limits (black) imposed on symmetry detection. The angle limits are generated using the Kalman filter prediction and the prediction covariance.
be rejected before applying symmetry detection, to improve
the robustness of tracking. This is achieved by only allowing
edges in the moving portions of an image to cast votes. A
motion mask, generated using the algorithm detailed in Section
IV-C, is used to suppress background edge pixels. By doing
this, the majority of votes will be cast by edge pixel pairs
belonging to the moving object.
The state prediction of the Kalman filter is used to improve
the computational efficiency of symmetry detection. Recall
that the symmetry detector iteratively rotates the edge pixels
to find symmetry lines at different angles. The range of
rotation angles can be limited by using the Kalman filter
prediction and the prediction covariance. Figure 14(b) is an example of such angle limits provided by the Kalman
filter. By limiting the Hough voting angle, the total number
of votes cast is reduced. This greatly improves the execution
time of our symmetry detection algorithm. In our experiments, three standard deviations are added to the symmetry line angle
prediction to generate the angle limits.
C. Block Motion Masking
As seen in Figure 14(a), the amount of background sym-
metry needs to be reduced before applying the symmetry
detector. In order to do this, a binary motion mask is used
to eliminate static portions of video frames. Background
modeling approaches are inappropriate for our application due
to their assumption of a near-static background and consistent
illumination conditions. Also, background modeling is not
suitable for the detection and tracking of transparent and re-
flective objects. Instead, a fast block-based frame differencing
approach is employed to generate the motion masks. We use
the classic two-frame difference [Nagel, 1978].
The colour video frames are first converted to grayscale
images. The absolute difference between time-adjacent images
is calculated. The resulting difference image is then converted
into a block image by spatially grouping pixels into 8x8 blocks.
The choice of block size is arbitrary, and should be determined
based on the smallest scale of motion to be considered by the
tracker. The sum of pixel values in the difference image is
calculated for each 8x8 block. Each block's sum is compared
against the average value across all blocks. Blocks with a sum
higher than a multiple of the average are classified as moving
parts of a video frame. This multiple constant is determined
experimentally, by starting at a value of 1, and increasing it
until camera noise and small movements can be successfully
ignored. In all our experiments, we use a factor of 1.5.
Algorithm 5: Block Motion Detection
Input: I0, I1 — video frames at time t, t + 1
Output: mask — motion mask
Parameters: mf — motion threshold factor

diff ← |I1 − I0|
res, sum — images 1/blocksize the size of diff
sum[ ][ ] ← 0
for each block (ii, jj) of sum do
    for each pixel (m, n) of diff within block (ii, jj) do
        sum[ii][jj] ← sum[ii][jj] + diff[m][n]
res ← THRESHOLD(sum, AVERAGE(sum) × mf)
Median filter res, then dilate res
mask ← res resized by a factor of blocksize
Algorithm 5 details the procedure used to generate the
motion mask. The AVERAGE function returns the average
of its input elements. The THRESHOLD(A, b) function returns a binary image, consisting of 0s and 1s. An output element
is set to 1 if the corresponding element in A is above the threshold value b. Otherwise, it is set to 0. Median filtering on the block level is used to remove spurious motion blocks
caused by small movements and camera noise. The result, res, is then dilated to ensure that all edge pixels belonging to the
moving object are included in the masked result. The mask
is then produced by resizing the res image by a factor of blocksize.
D. Motion Mask Refinement
In Figures 15(a) and 15(c), motion masks have been over-
layed on to video frames for illustrative purposes. In actual
operation, the mask is used to suppress static edge pixels
before passing the edge image to the symmetry detector.
Images in the right column of the same figure are produced
by applying a refined motion mask to the source image. The
refined mask is produced through a two step process. Firstly,
the location of each block with motion, b, is reflected across the symmetry line. The reflected location is searched using a
local window. If none of the blocks in the window is classified
as moving, the original block b is re-classified as static. This first step removes motion that is not symmetric about the
object's symmetry line, which may have been caused by the
Fig. 15. Block Motion Masking.
Left column: Images masked by the unrefined block motion mask.
Right column: Images masked using the refined block motion mask. The symmetry line estimate from the tracker is shown in red. The refined mask is generated based on this symmetry line estimate.
end effector, and other moving objects. After the generation
of a near-symmetric mask, the second step attempts to remove
holes and gaps in the mask. This is achieved by looking for
blocks that are surrounded by multiple neighbours that contain
motion. These two steps are very efficient as they only operate
on res, which has fewer pixels than the source image. The refinement process is a single pass operation.
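The symmetry test in the first step reduces to reflecting a block centre across the polar-parameterized symmetry line and checking a window around the result. The reflection is the standard point-line reflection, sketched below with our own names; the window search itself is omitted.

#include <cmath>

struct Pt { float x, y; };

// Reflect p across the line { q : (q - centre) . (cos t, sin t) = R },
// following the (R, theta) parametrization of Figure 1.
Pt reflectAcrossLine(Pt p, float R, float theta, float cx, float cy)
{
    const float nx = std::cos(theta), ny = std::sin(theta); // line normal
    const float d = (p.x - cx) * nx + (p.y - cy) * ny - R;  // signed distance
    return { p.x - 2.f * d * nx, p.y - 2.f * d * ny };
}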
E. Kalman Filter
A Kalman filter, as described in [Bar-Shalom et al., 2002], is
used to estimate symmetry line parameters. Hough (R, θ) index values are used directly as measurements. We use a linear
acceleration motion model. The filter plant and measurement
matrices are shown below.
A = [ 1 0 1 0 1/2  0
      0 1 0 1  0  1/2
      0 0 1 0  1   0
      0 0 0 1  0   1
      0 0 0 0  1   0
      0 0 0 0  0   1 ]

H = [ 1 0 0 0 0 0
      0 1 0 0 0 0 ]

x = [ R, θ, dR/dt, dθ/dt, d^2R/dt^2, d^2θ/dt^2 ]^T
Process and measurement noise are chosen empirically.
Measurement and process noise variables are assumed to be
independent. The noise values used for all experiments are as
follows. The R measurement variance is 9 pixels^2 and the θ variance is 9 degrees^2. The diagonal elements of the process covariance matrix are (1, 0.1, 10, 1, 10, 1). The odd elements
are the position, velocity and acceleration covariances of R; the even elements are the θ covariances.
Data association and validation are performed using a validation gate. The top symmetry lines, in terms of their Hough
votes, are given to the Kalman filter's validation gate. Symmetry line parameters that generate an error above 9.21 (2-
DOF Chi-square, P = 0.01) are discarded by the gate. If no symmetry line passes through the gate without exceeding the
Chi-square error threshold, the next state will be estimated
using the state model alone.
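A sketch of the gate computation, using Eigen for the matrix algebra (our choice; the paper does not name a library). z is a candidate (R, θ) measurement, xPred and PPred are the Kalman prediction and its covariance, and Rm is the measurement noise covariance.

#include <Eigen/Dense>

bool passesGate(const Eigen::Vector2d& z,
                const Eigen::Matrix<double, 6, 1>& xPred,
                const Eigen::Matrix<double, 6, 6>& PPred,
                const Eigen::Matrix<double, 2, 6>& H,
                const Eigen::Matrix2d& Rm)
{
    const Eigen::Vector2d innov = z - H * xPred;         // innovation
    const Eigen::Matrix2d S = H * PPred * H.transpose() + Rm;
    const double err = innov.dot(S.ldlt().solve(innov)); // Mahalanobis^2
    return err <= 9.21; // 2-DOF chi-square gate, P = 0.01
}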
To use the tracker in situations where new objects are being
discovered by a robot, it must have an automatic initialization
scheme. The initial state must be set to a value close to the
moving object's symmetry line to ensure convergence. An
automatic initialization method is used to find the object's
initial state. The number of moving blocks returned by the
motion detector is continuously monitored. By looking for
a sharp jump in the detected motion, frames where objects begin to move can be found. Symmetry lines detected from the
three time-consecutive frames after an object begins to move
are used to initialize the Kalman filter. Firstly, all possible
data associations across the three frames are generated. In
our experiments, the top three symmetry lines are used as
measurements for each frame. This produces 3^3 = 27 permutations. Each data association permutation is used as a Kalman
filter measurement set. The Kalman filter is initialized using
the first measurement in the permutation, and updated using
the second and third. The validation gate errors for the updates
are accumulated and logged. After iterating through all 27
permutations, the permutations are ranked according to their
errors. The best permutation, that is, the data association
sequence with minimum error, is used to initialize the Kalman
filter. This automatic initialization procedure is used to start
the tracker for all video sequences used in our experiments,
without any manual intervention.
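The permutation search can be sketched as follows, with KalmanFilter reduced to a stub standing in for the filter described above; initFrom, update and the gate error it returns are hypothetical interfaces, not the paper's actual code.

#include <array>
#include <limits>

struct Line { double r, theta; };

struct KalmanFilter { // stub for the filter of Section IV-E
    void initFrom(const Line&) {}
    double update(const Line&) { return 0.0; } // returns gate error
};

// top3[f][k] is the k-th strongest symmetry line in frame f.
std::array<int, 3> bestAssociation(const std::array<std::array<Line, 3>, 3>& top3)
{
    double bestErr = std::numeric_limits<double>::infinity();
    std::array<int, 3> best = {0, 0, 0};
    for (int a = 0; a < 3; ++a)
        for (int b = 0; b < 3; ++b)
            for (int c = 0; c < 3; ++c) { // 3^3 = 27 permutations
                KalmanFilter kf;
                kf.initFrom(top3[0][a]);            // frame 1
                double err = kf.update(top3[1][b]); // frame 2
                err += kf.update(top3[2][c]);       // frame 3
                if (err < bestErr) { bestErr = err; best = {a, b, c}; }
            }
    return best;
}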
F. Results
The entire tracking system is implemented using C++, with
no platform specific optimizations. A notebook PC with a
1.73GHz Pentium M processor is used as the test computer.
The video frames are 640x480 pixels in size, and are recorded
at 25 frames per second. All experiments are performed using the same tracker parameter values. The Canny edge filter
thresholds are set to 30 and 60, with an aperture size of 3
pixels. The block motion detector uses a motion factor of 1.5.
Borrowing from Randomized Hough Transform [Xu and Oja,
1993], a sampling ratio of 0.6 is used to obtain a random
subset of the edge pixels.
Table IV contains the execution times of the tracking code.
Each sequence, numbered 1 to 10, contains up to 400 video
frames. The code responsible for symmetry detection, block
motion masking, mask refinement and Kalman filtering are
timed independently. The average run time of these code
segments can be found under the Average Time heading of
the table. The column labeled Init contains the time taken
to perform automatic initialization as discussed at the end of
Section IV-E. The average frame rates obtained are shown in
the column labeled as FPS. Note that the tracker is able to
perform at above 40 frames per second for many sequences.
The tracking system generates a rotated bounding box
around the object being tracked. The bounding box is oriented
such that two of its edges are parallel with the object's
symmetry line. The size of the box is determined by the refined
motion mask. Figure 16 shows two example bounding boxes,
and the motion masks from which they are generated.
Frame sequences of the tracking results can be found
at the end of this paper. Videos of tracking results can be
TABLE IV
OBJECT TRACKER EXECUTION TIMES AND FRAME RATES

#    Sym (ms)   Motion (ms)   Refine (ms)   Kalman (ms)   Init (ms)   FPS (Hz)
1    37.87      4.84          0.86          0.09          10.41       22.91
2    16.76      4.76          0.75          0.06          9.74        44.77
3    17.95      4.85          0.85          0.04          10.69       42.22
4    18.31      4.74          0.75          0.04          11.90       41.96
5    33.69      4.87          0.87          0.05          11.38       25.33
6    20.84      4.94          0.85          0.04          13.18       37.50
7    35.29      5.01          0.87          0.13          11.32       24.22
8    34.48      4.94          0.79          0.14          11.14       24.79
9    18.19      4.91          0.79          0.06          11.83       41.75
10   27.01      4.89          0.82          0.06          12.50       30.51
Fig. 16. Generation of rotated bounding boxes from refined motion masks.
Left column: Symmetry-refined motion masks.
Right column: Bounding boxes in green, symmetry lines in red.
downloaded from:
www.ecse.monash.edu.au/centres/irrc/li_iro2006.php
V. ANALYSIS OF BILATERAL SYMMETRY AS A TRACKING
FEATURE
To evaluate the accuracy of symmetry as a tracking feature
under various background conditions, detected symmetry lines
are compared to the ground truth symmetry line of an object,
as it appears in the camera image. In many cases, such ground
truth data is unobtainable due to the lack of constraints in the object's trajectory. Also, manually extracting the object's
symmetry line in long tracking sequences is not practical due
to the large number of video frames and the high likelihood
of introducing human errors. The following approach is used
to overcome these problems.
A. Finding Ground Truth
A custom-built pendulum, as seen in Figure 17, is used
to provide predictable oscillatory object motion along with
measurable ground truth. Our test object, a red plastic squeeze
bottle, is affixed to the end of the pendulum. The bottle's
symmetry line is mechanically aligned with the pendulum arm
by drilling through the center of the bottle and then passing
the pendulum arm through it. Blue markers are placed above
and below the object on the pendulum arm. The centroids of
the coloured markers are used to determine the ground truth
polar parameters of the object's symmetry line. The markers
are found automatically using colour segmentation. An exam-
ple ground truth symmetry line, extracted automatically by
segmenting the coloured markers, is shown in Figure 18.
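Converting the two marker centroids into the (R, θ) parameters of Figure 1 amounts to the following computation; a sketch with our own names, with R measured from the image centre along the line normal.

#include <cmath>

struct P { double x, y; };

// Polar parameters of the line through marker centroids c1 and c2,
// relative to the image centre (cx, cy).
void lineThroughMarkers(P c1, P c2, double cx, double cy,
                        double& R, double& theta)
{
    const double dx = c2.x - c1.x, dy = c2.y - c1.y;
    const double len = std::hypot(dx, dy);
    const double nx = -dy / len, ny = dx / len; // unit normal
    theta = std::atan2(ny, nx);
    R = (c1.x - cx) * nx + (c1.y - cy) * ny;    // signed distance
}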
[Figure labels: 1 degree-of-freedom pivot; carbon fiber tube; ground truth markers]
Fig. 17. Pendulum hardware used to generate ground truth data
Fig. 18. Ground truth symmetry axis, drawn in black. The centroids of the markers are shown as red and green circles.
The test sequences each consist of 1000 video frames
captured with the pendulum swinging in front of different
backgrounds. The following four backgrounds are used. A
white background is used as a control experiment to obtain
the errors of our detector under ideal background conditions.
However, specular reflections and shadows are still quite
prominent at some object poses. In order to test the robustness
of the detector to missing edges due to similar foreground
and background colours, red distracters are added to the back-
ground in the second tracking sequence. To increase input edge
noise, random edge noise is added to the background of the third tracking sequence. The fourth sequence consists of both
red distracters and edge noise in the pendulum's background.
Example frames taken from these sequences can be found at
the bottom of the error plots located in Appendix II.
Another advantage of using the pendulum to actuate our test object is the predictability of the object's pose over time. The accuracy of our ground truth data is analyzed using a damped pendulum model. As the range of angles during our experiments is relatively small, the small-angle approximation sin(θ) ≈ θ is applied, which makes the equation of motion linear and its solution an exponentially damped sinusoid. The damped pendulum described by Equations 1 and 2 is used as our model. Note that R(t) is a function of θ(t). The damping is modelled as an exponential, with parameter λ governing the rate of decay. MATLAB's nlinfit function is used to perform a non-linear regression, which simultaneously estimates A, λ, ω, t0, B, L and L0. The θ(t) and R(t) regressions are performed separately.
θ(t) = A e^(−λt) cos(ω(t − t0)) + B    (1)

R(t) = L θ(t) + L0    (2)
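The model is straightforward to reproduce. Below is a minimal C++ sketch of Equations 1 and 2 and of the mean absolute residual reported in Table V; the fitting itself was done with MATLAB's nlinfit as noted above, and the parameter values would come from that regression.

```cpp
#include <cmath>
#include <vector>

// Damped pendulum model of Equations 1 and 2.
struct PendulumModel {
    double A, lambda, omega, t0, B; // theta(t) parameters
    double L, L0;                   // R(t) parameters
    double theta(double t) const {
        return A * std::exp(-lambda * t) * std::cos(omega * (t - t0)) + B;
    }
    double R(double t) const { return L * theta(t) + L0; }
};

// Mean of absolute regression residuals, as reported in Table V.
double meanAbsResidual(const PendulumModel& m,
                       const std::vector<double>& t,
                       const std::vector<double>& thetaMeasured) {
    double sum = 0.0;
    for (std::size_t i = 0; i < t.size(); ++i)
        sum += std::fabs(thetaMeasured[i] - m.theta(t[i]));
    return sum / t.size();
}
```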
The absolute means of the regression residuals for the ground truth and for our symmetry detector are listed in Table V. For all four sequences, and in both R and θ, the marker-based ground truth provides a better fit than our symmetry detector. This is especially true for the θ residuals, where the ground truth results are at least three times more accurate than our detected symmetry. These results clearly demonstrate the validity of our automatic marker-based method of determining ground truth.
TABLE V
MEAN OF ABSOLUTE REGRESSION RESIDUALS

Background  Parameter     Ground Truth   Symmetry
White       R (pixels)       0.39          0.46
            θ (radians)      0.0014        0.0041
Red         R (pixels)       0.76          1.29
            θ (radians)      0.0021        0.0081
Edge        R (pixels)       1.82          2.34
            θ (radians)      0.0025        0.0063
Mixed       R (pixels)       0.51          3.06
            θ (radians)      0.0014        0.0188
B. Quantitative Comparison of Detected Symmetry and
Ground Truth
To compare detected symmetry against ground truth, we
detect symmetry for each frame in each sequence. As the
motivation behind the comparison is to evaluate our detected
symmetry as a tracking feature, not to evaluate the tracker
itself, we do not use a Kalman filter or any other temporal
estimation technique. However, we do use the block motion masking method described in Subsection IV-C on these tracking sequences prior to detection. This simulates the kind of edge data our detector will receive during a tracking operation.

Fig. 19. Example symmetry detection result from the pendulum sequence with edge noise in the background: (a) edge pixels; (b) symmetry line. The motion-masked edge pixels show that many non-object edge pixels are passed to our fast symmetry detector. The symmetry line returned by our detector is shown in blue.
The symmetry detection error for each frame is found by taking the difference between the polar parameters of our detected symmetry line and the ground truth data. The error results are shown as line plots in Appendix II. Due to the length of the experiments, only the detection errors of the first 400 frames are shown. The mean-subtracted ground truth data is plotted against a different vertical axis as a reference. The detection error is shown in blue. Histograms of the detection errors are included after the error line plots.
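One detail worth making explicit when differencing polar line parameters is that the orientation of an undirected line is only unique modulo π, so θ errors should be wrapped into (−π/2, π/2]. Whether this exact convention matches the authors' implementation is an assumption on our part; a minimal sketch:

```cpp
#include <cmath>

// Signed theta error wrapped into (-pi/2, pi/2]. An undirected line's
// orientation is only defined modulo pi, so raw differences near the
// wrap-around would otherwise appear as spurious large errors.
double thetaError(double thetaDetected, double thetaTruth) {
    const double kPi = 3.14159265358979323846;
    double e = std::fmod(thetaDetected - thetaTruth, kPi);
    if (e > kPi / 2.0)   e -= kPi;
    if (e <= -kPi / 2.0) e += kPi;
    return e;
}
```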
Figure 23 contains the symmetry detection error with the test object placed against a white background. Both the radius and θ errors are very small. This suggests that our symmetry detection scheme will provide accurate measurements to a tracker when the target object is placed against a plain background. The jumps in error magnitude tend to occur during the zero crossings of the overlaid ground truth plot. This increase in detection error appears to be correlated with object motion, as the ground truth zero crossings occur at the middle of the swing, where the object is moving the fastest. The application of a temporal filter, such as our Kalman filter tracker, should further improve the error characteristics.
Figure 24 shows that a background littered with distracters similar in colour to the test object has only a modest effect on detection error. The detection errors are larger than in the white background sequence. This is due to a reduction in the quality of detected edges, caused by a lower intensity difference between object and background pixels. This reduction in pixel contrast also adversely affects block motion masking, as seen in the colour-extracted blob of Figure 21(b). As in the white background sequence, increases in error magnitude occur at the middle of the pendulum swing.
The results in Figure 25 indicate that edge noise in the background affects detection in a similar way to red distracters. Figure 19 contains an example of the edge data given to our symmetry detector during this tracking sequence. Notice that the large amount of background edge noise has minimal impact on detection performance. Unlike the previous plots, the detection error magnitude is not noticeably higher when the object is moving the fastest. This may be due to an improved intensity contrast between the object and the background caused by the edge-laden piece of white paper.

TABLE VI
SYMMETRY DETECTION ERROR STATISTICS

Background  Parameter     Abs Mean   STD      Abs Median
White       R (pixels)     1.1256    1.1350    0.5675
            θ (radians)    0.0057    0.0048    0.0043
Red         R (pixels)     2.0550    1.7955    2.8732
            θ (radians)    0.0134    0.0110    0.0129
Edge        R (pixels)     1.2118    1.0529    0.8765
            θ (radians)    0.0078    0.0053    0.0079
Mixed       R (pixels)     3.4147    1.6186    7.4565
            θ (radians)    0.0192    0.0099    0.0375
The mixed background sequence plot, Figure 26, contains
the only large detection errors found in our pendulum ex-
periments. As with the red distracter and white background
sequences, these large errors occur during periods of fast
object motion. Given the sparseness of these error peaks,
temporal filtering should be able to correct them and a tracker
should successfully ignore them. The latter is confirmed by
our successful real world tracking experiments described in
Section IV-F.
Table VI provides a statistical summary of the symmetry detection errors. The columns, from left to right, are the mean of absolute errors, the standard deviation of errors and the median of absolute errors. Looking at the statistics, it appears that missing edges, due to distracters of similar colour to the target object, cause larger detection errors than noisy edges in the background. This result agrees with expectations, as the Hough transform voting method is inherently robust to noisy edge data. On qualitative inspection of the detection results, the relatively large detection errors of the mixed sequence appear to be caused by missing object edges during periods of fast object motion.
C. Qualitative Feature Comparison: Colour Blob Centroid
This section provides a qualitative comparison of symmetry with another commonly used tracking feature, a colour blob centroid. The centroid errors should not be compared against the symmetry errors directly, as the comparison is inherently biased, since our test object is symmetric. Colour tracking does not constrain the target object's shape, and the target can in fact be deformable. Symmetry can be seen as a cue visually orthogonal to colour, and each has its own advantages and disadvantages depending on the target application.
We use a Hue-Saturation-Value (HSV) colour filter in our experiments. We implemented the filter using the OpenCV library [Intel, 2006] histogram functions and our own C++ code. A two-dimensional histogram is used to represent Hue and Saturation. The Value component of HSV is used to reject pixels of extreme darkness or brightness, which have noisy hue characteristics. The hue and saturation are discretized into 45 and 8 histogram bins respectively. An example HSV histogram is shown in Subfigure 20(e).
The colour blob centroid is obtained as follows. We apply
the HSV filter to the input image to obtain a histogram back
projection as described by [Swain and Ballard, 1991]. The
back projection image approximately represents the probability that a pixel in the input image belongs to the target object, based on its colour. Example back projection results are shown in Figure 20, where darker pixels represent higher object probability. The object's colour histogram used to generate the back projection is built offline and optimized manually prior to any centroid detection. A binary blob is produced by thresholding the back projection image. The largest contiguous blob is kept as the object's blob; the rest are discarded. Example binary blobs are shown as yellow pixels in the images of Figure 21. The colour blob centroid, drawn as a black dot in the same images, is the center of mass of the yellow binary blob (its first moments normalized by the zeroth moment).

Fig. 20. HSV histogram back projection: (a), (c) input images; (b), (d) back projections; (e) Hue-Saturation histogram. In the back projection images, dark pixels have a high probability of being the object's colour, as shown in the Hue-Saturation histogram.
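The pipeline above is straightforward to reproduce. The sketch below uses the modern OpenCV C++ interface rather than the 2006-era library the authors used; the Value-channel limits and back projection threshold are illustrative values of our choosing, not the paper's, and only the bin counts (45 hue, 8 saturation) follow the text.

```cpp
#include <opencv2/opencv.hpp>

// Minimal sketch of the colour blob centroid pipeline: HSV histogram
// back projection [Swain and Ballard, 1991], thresholding, largest-blob
// selection, and centroid extraction.
cv::Point2d colourBlobCentroid(const cv::Mat& bgr, const cv::Mat& hueSatHist) {
    cv::Mat hsv;
    cv::cvtColor(bgr, hsv, cv::COLOR_BGR2HSV);

    // Back projection over the Hue and Saturation channels.
    const int channels[] = {0, 1};
    float hueRange[] = {0, 180}, satRange[] = {0, 256};
    const float* ranges[] = {hueRange, satRange};
    cv::Mat backProj;
    cv::calcBackProject(&hsv, 1, channels, hueSatHist, backProj, ranges);

    // Reject pixels of extreme darkness or brightness via the Value
    // channel (these limits are illustrative, not the paper's values).
    cv::Mat vMask;
    cv::inRange(hsv, cv::Scalar(0, 0, 30), cv::Scalar(180, 255, 230), vMask);
    backProj &= vMask;

    // Threshold into a binary image and keep the largest contiguous blob.
    cv::Mat bin, labels, stats, centroids;
    cv::threshold(backProj, bin, 64, 255, cv::THRESH_BINARY);
    int n = cv::connectedComponentsWithStats(bin, labels, stats, centroids);
    int best = -1, bestArea = 0;
    for (int i = 1; i < n; ++i) { // label 0 is the background
        int area = stats.at<int>(i, cv::CC_STAT_AREA);
        if (area > bestArea) { bestArea = area; best = i; }
    }
    if (best < 0) return {-1.0, -1.0}; // no blob found

    // Centroid of the largest blob.
    return {centroids.at<double>(best, 0), centroids.at<double>(best, 1)};
}
```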
Ideally, the best error measure would be the distance between the detected centroid and the ground truth centroid of the object. However, the latter would require manual segmentation of the object in all frames, including those where the object is in front of distracters of similar colour. As we are only using the errors in a qualitative manner, our error measure is simply the minimum distance between the object's centroid and the ground truth symmetry line, that is, |x cos θ + y sin θ − R| for a detected centroid (x, y) and a ground truth line (R, θ). As the object is symmetric, its actual centroid is located somewhere along its symmetry line, so zero distance means perfect centroid detection. While not ideal, this error metric should provide some indication of feature detection accuracy and reliability.
Line plots of the centroid detection errors are located in Appendix III. The mean-subtracted ground truth radius is shown alongside the error data as a dotted black line to provide a visual reference of the pendulum motion. As mentioned earlier, the zero crossings of the dotted ground truth curve coincide with the middle of the pendulum swing, where the object is moving at maximum speed.

Fig. 21. HSV blob extraction and centroid detection against different backgrounds: (a) white background; (b) red distractors in background. The extracted blob is in yellow and the centroid is shown as a black dot.
Figures 31 and 33 suggest that centroid detection is very accurate when no distracters of similar colour to the target are present. The magnitude of the average error is around 1 to 2 pixels in both cases. By inspection, the average centroid error in Figure 32 is 4 to 5 times larger than in the white background sequence. This agrees with expectations, as the background is filled with distracters of similar colour to the test object, which distort the shape of the object's binary blob. An example of this distortion can be found in Figure 21(b). The lopsided errors of Figure 34 confirm the detrimental effect of red distracters, with the error magnitude climbing much higher when the object is swinging in front of the red portion of the background. From these results, it is clear that both feature modalities have their own strengths and weaknesses. Each feature should only be applied to tracking after careful consideration of the expected object and background characteristics.
VI. CONCLUSION
We have qualitatively and quantitatively analysed the use of
bilateral symmetry as an object feature. We show that bilateral
symmetry can be detected in real time under noisy conditions
using our Hough-based fast symmetry detector. Applying the
fast symmetry detector to object segmentation, a dynamic pro-
gramming based approach is able to segment multi-coloured
objects without using any prior shape or colour information.
Real time object tracking using bilateral symmetry has also
been achieved. Our Kalman filter tracker has been successfully
tested on 10 video sequences, which include situations where
the target object is transparent or partially occluded. The tracker can also handle large changes in object scale and
orientation. Quantitative analysis of symmetry as a tracking
feature shows a minimal increase in detection error in the
presence of similarly coloured distracters and background edge
noise. A qualitative comparison with HSV colour centroid
suggests that bilateral symmetry has the level of accuracy
and reliability required of a tracking feature. Overall, bilateral
symmetry appears to be a useful and surprisingly robust object
feature for robotic applications, especially those where robots
have to deal with novel symmetric objects.
APPENDIX II
SYMMETRY ERROR PLOTS
Fig. 23. White Background: Symmetry Error Plots. Sample video frames shown at the bottom. [Two line plots versus frame number (0-400): fast symmetry radius error (pixels) and fast symmetry θ error (radians), each overlaid with the mean-subtracted ground truth radius/θ on a secondary axis.]
Fig. 24. Background with Red Distractors: Symmetry Error Plots. Sample video frames shown at the bottom. [Same layout as Fig. 23.]
Fig. 25. Background with Edge Noise: Symmetry Error Plots. Sample video frames shown at the bottom. [Same layout as Fig. 23.]
Fig. 26. Mixed Background: Symmetry Error Plots. Sample video frames shown at the bottom. [Same layout as Fig. 23.]
Fig. 27. White Background: Histograms of Symmetry Errors. [(a) symmetry line radius error (pixels); (b) symmetry line θ error (radians).]
Fig. 28. Background with Red Distractors: Histograms of Symmetry Errors. [(a) symmetry radius error (pixels); (b) symmetry θ error (radians).]
Fig. 29. Background with Edge Noise: Histograms of Symmetry Errors. [(a) symmetry radius error (pixels); (b) symmetry θ error (radians).]
Fig. 30. Mixed Background: Histograms of Symmetry Errors. [(a) symmetry radius error (pixels); (b) symmetry θ error (radians).]
APPENDIX III
COLOUR BLOB CENTROID ERROR PLOTS
Fig. 31. White Background: Colour Centroid Error Plot. [Centroid displacement error (pixels) versus frame number (0-400), overlaid with the mean-subtracted ground truth radius on a secondary axis.]
Fig. 32. Background with Red Distractors: Colour Centroid Error Plot. [Same layout as Fig. 31.]
Fig. 33. Background with Edge Noise: Colour Centroid Error Plot. [Same layout as Fig. 31.]
Fig. 34. Mixed Background: Colour Centroid Error Plot. [Same layout as Fig. 31.]
ACKNOWLEDGEMENTS
The authors would like to thank Monash University, the
Intelligent Robotics Research Centre and PIMCE ARC Centre
for their financial support. The first author would also like
to thank Konrad Schindler of the Institute of Vision Systems
Engineering at Monash University for his suggestions and
comments regarding the colour and symmetry feature compar-
ison. We also thank the anonymous reviewers for their helpful comments.
REFERENCES
[Ballard, 1981] Ballard, D. H. (1981). Generalizing the Hough transform to detect arbitrary shapes. Pattern Recognition, 13(2):111-122.

[Bar-Shalom et al., 2002] Bar-Shalom, Y., Kirubarajan, T., and Li, X.-R. (2002). Estimation with Applications to Tracking and Navigation. John Wiley & Sons, Inc.

[Canny, 1986] Canny, J. (1986). A computational approach to edge detection. IEEE Transactions on Pattern Analysis and Machine Intelligence, 8(6):679-698.

[Cornelius and Loy, 2006] Cornelius, H. and Loy, G. (2006). Detecting bilateral symmetry in perspective. page 191, Los Alamitos, CA, USA. IEEE Computer Society.

[Duda and Hart, 1972] Duda, R. O. and Hart, P. E. (1972). Use of the Hough transformation to detect lines and curves in pictures. Communications of the ACM, 15(1):11-15.

[Gupta et al., 2005] Gupta, A., Prasad, V. S. N., and Davis, L. S. (2005). Extracting regions of symmetry. In IEEE International Conference on Image Processing (ICIP), volume 3, pages 1336, Genova.

[Huang et al., 2002] Huang, Y., Huang, T. S., and Niemann, H. (2002). A region-based method for model-free object tracking. In International Conference on Pattern Recognition (ICPR), pages 592-595, Quebec, Canada.

[Intel, 2006] Intel (2006). OpenCV: Open source computer vision library. Online. http://www.intel.com/technology/computing/opencv/.

[Lee et al., 2001] Lee, B., Yan, J.-Y., and Zhuang, T.-G. (2001). A dynamic programming based algorithm for optimal edge detection in medical images. In Proceedings of the International Workshop on Medical Imaging and Augmented Reality, pages 193-198, Hong Kong, China.

[Lei and Wong, 1999] Lei, Y. and Wong, K. C. (1999). Detection and localisation of reflectional and rotational symmetry under weak perspective projection. Pattern Recognition, 32(2):167-180.

[Levitt, 1984] Levitt, T. S. (1984). Domain independent object description and decomposition. In AAAI, pages 207-211.

[Li and Kleeman, 2006] Li, W. H. and Kleeman, L. (2006). Real time object tracking using reflectional symmetry and motion. IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).

[Li et al., 2005] Li, W. H., Zhang, A., and Kleeman, L. (2005). Fast global reflectional symmetry detection for robotic grasping and visual tracking. In Matthews, M. M., editor, Proceedings of Australasian Conference on Robotics and Automation.

[Li et al., 2006] Li, W. H., Zhang, A. M., and Kleeman, L. (2006). Real time detection and segmentation of reflectionally symmetric objects in digital images. IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).

[Lowe, 2004] Lowe, D. G. (2004). Distinctive image features from scale-invariant keypoints. International Journal of Computer Vision, 60(2):91-110.

[Loy and Eklundh, 2006] Loy, G. and Eklundh, J.-O. (2006). Detecting symmetry and symmetric constellations of features. In Proceedings of European Conference on Computer Vision (ECCV), Graz, Austria.

[Loy and Zelinsky, 2003] Loy, G. and Zelinsky, A. (2003). Fast radial symmetry for detecting points of interest. IEEE Transactions on Pattern Analysis and Machine Intelligence, 25(8):959-973.

[Matas et al., 2002] Matas, J., Chum, O., Urban, M., and Pajdla, T. (2002). Robust wide baseline stereo from maximally stable extremal regions. Proceedings of the British Machine Vision Conference, 1:384-393.

[Mortensen et al., 1992] Mortensen, E., Morse, B., Barrett, W., and Udupa, J. (1992). Adaptive boundary detection using live-wire two-dimensional dynamic programming. In IEEE Proceedings of Computers in Cardiology, pages 635-638, Durham, North Carolina.

[Nagel, 1978] Nagel, H. H. (1978). Formation of an object concept by analysis of systematic time variations in the optically perceptible environment. Computer Graphics and Image Processing, 7(2):149-194.

[Ogawa, 1991] Ogawa, H. (1991). Symmetry analysis of line drawings using the Hough transform. Pattern Recognition Letters, 12(1):9-12.

[Pal and Pal, 1993] Pal, N. R. and Pal, S. K. (1993). A review on image segmentation techniques. Pattern Recognition, 26(9):1277-1294.

[Reisfeld et al., 1995] Reisfeld, D., Wolfson, H., and Yeshurun, Y. (1995). Context-free attentional operators: the generalized symmetry transform. International Journal of Computer Vision, 14(2):119-130.

[Satoh et al., 2004] Satoh, Y., Okatani, T., and Deguchi, K. (2004). A color-based tracking by Kalman particle filter. In International Conference on Pattern Recognition (ICPR), pages 50250