5 Rotated object detection
Portions of this chapter were previously presented at the 20th Australian Joint
Conference on Artificial Intelligence (Horton et al., 2007).
As described in section 2.6.7, Haar Classifier Cascades cannot detect objects with
a wide range of orientations. There are two simple ways to deal with this:
1. Rotated images: one cascade is trained and the test images are rotated past it to
detect rotated objects.
2. Rotated cascades: multiple cascades are trained, each to detect objects at different
orientations. The entire set of cascades is run on every test image.
After either method, the detections for different orientations may be merged into a
single set of detections, as described for single cascades in section 2.6.6.
Both of these methods multiply classification time by the number of orientations
used. Jones and Viola avoided this for the second method by training a pose estimator
decision tree which predicts which of 12 binary detection cascades is best for each image
region (Jones & Viola, 2003a). This had approximately twice the classification time of
a single binary object detector, a significant improvement over the 12 times required
by using every cascade. However, accuracy was lost with the gain in speed.
Some uncertainty remains over how far apart the image or cascade orientations can
be, and how much the training images should be perturbed for good performance in
rotated object detection. In (Jones & Viola, 2003a) the cascades were trained in 30°
steps, each covering a 30° (±15°) range, while (Kölsch & Turk, 2004a) used 15° steps
and tested angle ranges from 0° to 15° (±7.5°). In the latter paper the most accurate
detections were by cascades with 15° ranges, the highest ranges tested. This suggests
that increasing the range further (which would lead to overlap between the cascade
ranges) may also be worth testing.
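The relationship between angle step and random angle range can be made concrete with a small sketch. It is illustrative only: the function name `coverage` and its return labels are assumptions introduced here, and the range is taken to be the total width (so a 30° range means ±15°), as in the text above.

```python
def coverage(step_deg, range_deg):
    """Describe how orientations spaced step_deg apart, each trained on a
    total random angle range of range_deg (+/- range_deg / 2), tile the
    orientation axis. Returns a label and the gap or overlap in degrees."""
    if range_deg < step_deg:
        # Some orientations between neighbouring cascades are never trained on.
        return "gaps", step_deg - range_deg
    if range_deg == step_deg:
        # Neighbouring ranges meet exactly.
        return "exact", 0
    # Neighbouring ranges share some orientations.
    return "overlap", range_deg - step_deg
```

Under this reading, the 30° steps and 30° ranges of (Jones & Viola, 2003a) give `coverage(30, 30)`, an exact tiling, while the ranges tested in (Kölsch & Turk, 2004a) never exceed their 15° step, so overlapping cases such as `coverage(15, 30)` were not covered by either study.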
5.1 Cascade training and testing implementation
As mentioned in section 2.6.8, the Open Computer Vision (OpenCV) libraries were used as a
starting point for the object detection algorithms in this and the following chapters.
However, apart from the new algorithms added, four significant changes were made to
the OpenCV implementation:
1. In the original program, the training process does not correctly build the set of all
possible features if the cascade window is rectangular rather than square, a property
required of the fish detection cascades trained in this chapter. This error was corrected
by correctly identifying the cascade width and height.
2. One of the feature candidates created by OpenCV, a ‘chequerboard’ listed internally
as ‘haar_x2_y2’ (shown in fig. 2.7), was used by Viola and Jones (Viola & Jones, 2001b)
but left out of Lienhart and Maydt’s feature set (Lienhart & Maydt, 2002) and did not
have a corresponding tilted feature. With the emphasis on training cascades to detect
rotated objects in this thesis, it was felt that cascade training for straight and tilted
objects should be as comparable as possible. The haar_x2_y2 feature was therefore
disabled before cascade training began. All other features in fig. 2.7 were used.
3. At classification time, OpenCV halved tilted feature weights. This weight adjustment
was not applied to tilted features in training, so the same region was evaluated
differently at training and testing time. The halving at classification time was
therefore removed.
4. In the OpenCV merging process described in section 2.6.6, detections completely
within the rectangle of a larger detection with more neighbours are erased. Such
detections may easily be correct in marine animal detection, where the creatures do not
fill their detection rectangle. This option was therefore disabled for the fish and seahorse
detection.
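The fourth change can be illustrated with a simplified sketch of the nested-detection rule and a switch to disable it. This is not the OpenCV code: the `Detection` type, the function names and the `drop_nested` flag are hypothetical, and the containment test here is exact, whereas a real implementation may allow some tolerance.

```python
from typing import List, NamedTuple


class Detection(NamedTuple):
    x: int
    y: int
    w: int
    h: int
    neighbours: int  # number of raw detections merged into this one


def contains(outer: Detection, inner: Detection) -> bool:
    """True if inner lies completely within outer's rectangle."""
    return (outer.x <= inner.x and outer.y <= inner.y
            and outer.x + outer.w >= inner.x + inner.w
            and outer.y + outer.h >= inner.y + inner.h)


def filter_nested(dets: List[Detection], drop_nested: bool = True) -> List[Detection]:
    """Erase detections nested inside a larger detection with more neighbours.
    With drop_nested=False (the setting used for fish and seahorses in this
    sketch), every detection is kept."""
    if not drop_nested:
        return list(dets)
    kept = []
    for d in dets:
        nested = any(o.w * o.h > d.w * d.h
                     and o.neighbours > d.neighbours
                     and contains(o, d)
                     for o in dets)
        if not nested:
            kept.append(d)
    return kept
```

With the rule enabled, a small fish detected inside the loose rectangle of a larger, better-supported detection would be discarded; with it disabled, both survive to the final detection set.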
5.2 Cascade training data
5.2.1 Fish
As explained in section 4.2, the 235 fish images were divided into training and testing
sets, with 119 images annotated for testing. The remaining 116 training images were
annotated with the locations of 409 mostly-visible fish. Each training fish annotation
was a line segment from nose to tail, as in the examples in fig. 5.1. These training fish
subimages could then be rotated to any angle to train a cascade to detect fish at that
orientation.
As noted above, existing research on hand detection suggests that the positive
training examples for a given angle should not be fixed at that angle, but should be
perturbed by some random amount (Jones & Viola, 2003a; Kölsch & Turk, 2004a), as
illustrated in fig. 5.2. However, these studies report only on random angle ranges up
to the angle step, as in fig. 5.2(c), and do not consider overlapping random angle
ranges such as those in figs. 5.2(d) and 5.2(e).
Along with the rotation, the positive samples were scaled to be as large as possible
while centred in and fitted within the sample rectangle. Figs. 5.3(a) and 5.3(b) show
some of the training fish being forced to exactly 0° and 45° respectively. Figs. 5.3(d)
and 5.3(e) show the same fish randomly rotated within a range of 30° (±15°). The
rotation and scaling were done in a single step with an affine transformation function
from the OpenCV libraries; it interpolated using the pixel area relation to minimise
loss of detail.
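A combined rotate-and-scale step of this kind can be sketched as the construction of a single 2×3 affine matrix from a nose-to-tail annotation. The sketch below is an assumption-laden illustration rather than the thesis code: the function names are hypothetical, the fish is assumed to occupy a box whose length-to-height ratio is 3:1, angles follow the image coordinate frame as given, and only the matrix is built (the pixel interpolation done by the OpenCV warp is not reproduced).

```python
import math


def sample_transform(nose, tail, target_deg, win_w, win_h, aspect=3.0):
    """Return a 2x3 affine matrix (nested lists) that maps source-image
    coordinates to sample-window coordinates, so that the tail-to-nose
    segment points along target_deg, is centred in the window, and the
    assumed fish box is as large as possible while still fitting."""
    dx, dy = nose[0] - tail[0], nose[1] - tail[1]
    length = math.hypot(dx, dy)
    src_angle = math.atan2(dy, dx)
    t = math.radians(target_deg)
    # Axis-aligned bounding box of a length x (length / aspect) box at angle t.
    box_w = length * abs(math.cos(t)) + (length / aspect) * abs(math.sin(t))
    box_h = length * abs(math.sin(t)) + (length / aspect) * abs(math.cos(t))
    scale = min(win_w / box_w, win_h / box_h)
    rot = t - src_angle
    a, b = scale * math.cos(rot), scale * math.sin(rot)
    cx, cy = (nose[0] + tail[0]) / 2, (nose[1] + tail[1]) / 2
    # [[a, -b], [b, a]] rotates and scales; the translation column moves the
    # segment midpoint to the window centre.
    return [[a, -b, win_w / 2 - (a * cx - b * cy)],
            [b, a, win_h / 2 - (b * cx + a * cy)]]


def apply_transform(m, p):
    """Apply a 2x3 affine matrix to a point (x, y)."""
    return (m[0][0] * p[0] + m[0][1] * p[1] + m[0][2],
            m[1][0] * p[0] + m[1][1] * p[1] + m[1][2])
```

For a horizontal 96-pixel annotation mapped into a 48 × 16 window at a target angle of 0, the scale works out to 0.5 and the segment spans the full window width. A matrix of this form is what an affine warp function would take as input.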
The Haar Classifier Cascade training process needs negative training examples as
well, so 722 regions containing no complete fish were extracted from the training images;
fig. 5.4 contains examples. The cascades were required to ignore negative objects in
any orientation. To provide such negative training examples, the negative
regions were flipped 50% of the time, mirrored twice across each axis and rotated to
random orientations, and the largest possible square was then used as a negative
training sample. Fig. 5.5
shows the process, and fig. 5.6 shows some of the results. The negative sample in the
first step of fig. 5.5 has merely been shrunk to fit the page; negative training images
were not scaled and, as with the positive samples, the rotation step interpolated pixels
to minimise information loss.
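The size of the square that survives a rotation can be worked out geometrically. The sketch below assumes the square is axis-aligned in the output and must lie entirely inside the rotated w × h region; the function names are hypothetical, and the mirroring step is not modelled because its exact form is shown only in fig. 5.5.

```python
import math
import random


def largest_square_side(w, h, angle_deg):
    """Side of the largest axis-aligned square that fits inside a w x h
    rectangle after the rectangle has been rotated by angle_deg."""
    t = math.radians(angle_deg)
    # Measured in the rectangle's own frame, a square of side s spans
    # s * (|cos t| + |sin t|) along each axis, so the shorter rectangle
    # side is the binding constraint.
    return min(w, h) / (abs(math.cos(t)) + abs(math.sin(t)))


def random_negative_params(w, h, rng):
    """Draw the random choices for one negative sample: whether to flip,
    the rotation angle, and the resulting square side."""
    flip = rng.random() < 0.5
    angle = rng.uniform(0.0, 360.0)
    return flip, angle, largest_square_side(w, h, angle)
```

The usable square is largest for rotations that are multiples of 90° (the full shorter side) and smallest at odd multiples of 45°, where it shrinks by a factor of √2.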
Figure 5.1: Example fish images, annotated with training positives
(a) Range=0° (b) Range=10° (c) Range=15° (d) Range=20° (e) Range=60°
Figure 5.2: Positive sample orientations for 7 cascades fixed on angles from −45° to
+45° with random angle ranges; darker areas show where random ranges overlap.
Each set of rotated cascade fish detectors tested in section 5.4.2 onwards contained
seven Haar Classifier Cascades. Each was trained to detect fish oriented around an
angle of −45°, −30°, −15°, 0°, +15°, +30° or +45°. The dimensions of each cascade were
chosen to detect fish with an aspect ratio of 3:1 (consistent with the mean aspect ratio
of fish in the training images) and to have an area of approximately 48 × 16 = 768
units (table 5.1). This would not detect all possible fish, as it does not cover every
possible orientation, but was sufficient for the images being used, where the fish were
consistently close to horizontal and almost always swimming left, as seen in fig. 4.4.
Table 5.1: Fish detection cascade window sizes
(a) First 8 positive fish training regions forced to 0° orientation
(b) First 8 positive fish training regions forced to 45° orientation
(c) First positive fish training region forced to −45°..45°
(d) First positive fish training region forced to 0°±15°
(e) First positive fish training region forced to 45°±15°
Figure 5.3: Positive fish training regions at different orientations
Figure 5.4: First 10 negative fish training regions before flipping, mirroring and rotation
Figure 5.5: Creating a fish negative training sample from the first negative training
region
Figure 5.6: First 10 negative fish training regions after flipping, mirroring and rotation
5.2.2 Seahorses
As explained in section 4.3, there were 263 seahorse images, with 131 used for training.
The remaining 132 test images were annotated using the rules described there and
illustrated in fig. 4.7. Annotations made following these rules on the example distant
and close-up images from fig. 4.6 are shown in fig. 5.7. The training images contained
748 such annotations. As with the fish, these segments could be forced to any
angle for training, as seen for heads in fig. 5.8 and bodies in fig. 5.9.
The seahorse tank was less crowded than the fish tank, so the entire images were
used as negative training samples. Positive examples for each segment were removed by
drawing blank circles over the appropriate parts of each image (fig. 5.10). To encourage
rotation-independent cascades these negative training images were also flipped 50% of
the time, mirrored and randomly rotated. This provided negative training examples in
all orientations (fig. 5.11).
Seahorse cascades were then trained for anti-clockwise-oriented seahorse segments
at 15° angle increments, from 0° to 75°. Trained cascades can be rotated by any multiple
of 90° by modifying the coordinates (Jones & Viola, 2003a), so each of the 6 cascades
was rotated through 0°, 90°, 180° and 270°. The resulting 24 cascades covered all
orientations from 0° to 345°. These would only detect anti-clockwise seahorse segments,
so during classification the images were flipped horizontally to detect clockwise-oriented
seahorse segments; this also doubles classification time. Each seahorse segment detector
had a window size of 24 × 24 units.
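Rotating a trained cascade by a multiple of 90° amounts to remapping each feature rectangle inside the square window. The sketch below shows one way this could be done for upright rectangles; the function name and the (x, y, w, h) rectangle layout are assumptions, and tilted features would need an analogous mapping that is not shown.

```python
def rotate_rect_90(rect, size, quarter_turns):
    """Rotate an upright rectangle (x, y, w, h) inside a size x size window
    by quarter_turns * 90 degrees (anti-clockwise as seen on screen, with
    the y axis pointing down)."""
    x, y, w, h = rect
    for _ in range(quarter_turns % 4):
        # One quarter turn sends point (px, py) to (py, size - px), so the
        # rectangle's width and height swap and its top-left corner moves.
        x, y, w, h = y, size - (x + w), h, w
    return (x, y, w, h)


# Six trained orientations, each reused at four quarter turns.
orientations = sorted(base + 90 * k for base in range(0, 90, 15) for k in range(4))
# Each orientation is run on the image and on its horizontal mirror.
passes_per_image = 2 * len(orientations)
```

Here `orientations` runs from 0° to 345° in 15° steps (24 values), and the horizontal flip brings the total to 48 cascade passes per test image.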
Figure 5.7: Example seahorse images with segments annotated for training
Figure 5.8: First 8 seahorse heads extracted from example images and forced to 0°
orientation
Figure 5.9: First 8 seahorse bodies extracted from example images and forced to 90°
orientation
Figure 5.10: Example seahorse images with heads blanked out
Figure 5.11: Example seahorse images with heads blanked out after flipping, mirroring
and rotation
5.3 Cascade training settings
All cascades used in this chapter were trained with the OpenCV default maximum stage
false positive rate of 0.5 and maximum stage false negative rate of 0.005. Preliminary
tests showed that the OpenCV default stage count of 14 was insufficient for fish and
seahorse detection; 20 stages were tested and found to be more appropriate. This was
also consistent with some of Lienhart et al.’s 20-stage face detection cascades (Lienhart
et al., 2003a). By default, OpenCV also creates and tests features in symmetric pairs,
mirrored across the y axis of the cascade training window; this was deactivated since
the fish and seahorse segments are not symmetric from the side. Following (Lienhart
et al., 2003a), Gentle AdaBoost was selected as the boosting algorithm.
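The effect of the stage count can be seen by compounding the per-stage limits, under the usual simplifying assumption that every stage just meets its targets on the samples passed to it. The function name below is hypothetical and the results are idealised bounds, not measured rates.

```python
def cascade_bounds(stages, max_stage_fp, max_stage_fn):
    """Compound per-stage limits over a whole cascade.
    Returns (bound on overall false positive rate,
             bound on overall true positive rate)."""
    fp_bound = max_stage_fp ** stages
    hit_bound = (1.0 - max_stage_fn) ** stages
    return fp_bound, hit_bound


fp14, hit14 = cascade_bounds(14, 0.5, 0.005)
fp20, hit20 = cascade_bounds(20, 0.5, 0.005)
```

With these settings, 14 stages allow roughly 6 × 10⁻⁵ of negative windows through, while 20 stages cut this to roughly 1 × 10⁻⁶, at the price of the guaranteed true positive rate falling from about 0.93 to about 0.90.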
The settings chosen here, and in sections 5.1, 5.2.1 and 5.2.2, are summarised in
table 5.2.
To find the best random angle ranges to use for training cascades, training positive
image sets were constructed with random ranges from 0° to 90° (±0° to ±45°). The
fish detection cascade ranges were 0°, 5°, 10°, 15°, 20°, 25°, 30°, 45°, 60° and 90°; the
seahorse detection cascade ranges were 0°, 5°, 10°, 15°, 20°, 25°, 30°, 35°, 40°, 45°,
50°, 60° and 90°. They were tested using both rotated cascades and rotated images in
steps of 15°, 30° and 45°. Smaller steps have a cost in computation time: if detections
are made at n different angles, the detection time is that for a single cascade
multiplied by n. If n different cascades are trained on different angles and used to
classify a single image, similar time is taken unless the two-stage process discussed
at the start of the chapter and described in (Jones & Viola, 2003a) is used. However,
more rotations give more opportunities to detect each object, so may be expected to
increase accuracy.
Table 5.2: List of cascade training settings
5.4 Results
For comparisons, ROC curves were constructed for each cascade or set of cascades and
visually compared, as described in section 2.7.3. Comparisons of different random angle
ranges in this thesis involve a pair of graphs, with the first showing ranges ascending
from 0° to the apparent best range (e.g. fig. 5.14(a)), and the second showing ranges
ascending from it to 90° (e.g. fig. 5.14(b)). All ranges were tested, but for readability
not all are shown on the graphs. Comparisons of different methods use plots that combine
the best results from each set of cascades, as described in section 2.7.3.1. For example,
the ‘15° steps’ line at the top of fig. 5.21(a) follows the best points from
figs. 5.14(a) and 5.14(b).
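One plausible way to combine the best results from several ROC curves is to keep only the points that no other point beats on both axes. The sketch below assumes that reading; the procedure actually used is the one in section 2.7.3.1, and the function name and the (false positives, true positives) point layout are hypothetical.

```python
def best_points(curves):
    """Merge several ROC curves, each a list of (false_positives,
    true_positives) points, into the upper envelope: a point is kept only
    if it has more true positives than every point with no more false
    positives."""
    points = sorted({p for curve in curves for p in curve},
                    key=lambda p: (p[0], -p[1]))
    best, top = [], -1
    for fp, tp in points:
        if tp > top:
            best.append((fp, tp))
            top = tp
    return best
```

Applied to the curves for every random angle range at a given angle step, this would yield a single combined curve for that step.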
Appendix C contains example images with detections made by some of these cascades
and methods. Figs. C.4 and C.7 contain fish detections from rotated images and
rotated cascades respectively; fig. C.11 contains seahorse detections on rotated images.
Seahorse detections by rotated cascades are not shown. In the example fish image,
there is one large fish which was not found during the rotated cascade detection in
fig. C.7. This is probably because a cascade to detect it would have to pass outside
the image bounds.
First, however, trends in the number and areas of the features chosen for each
cascade will be considered.
5.4.1 Cascade features
Table 5.3 counts the features for each cascade, where each ‘feature’ is a rectangular
block of the type shown in fig. 2.7. As explained in section 5.2.2, the seahorse segment
detection cascades fixed around angles 90°..345° are created through 90° rotations of
the 0°..75° cascades listed here. Note also that these are total counts for 20-stage
cascades; comparisons with other cascades should be made by dividing the number of
features by the number of stages.
The areas covered by those features are listed in table 5.4; tilted features are
considered to have four times the area of straight features.
The mean counts for each random angle range listed in the right-hand column of the
tables are plotted in fig. 5.12; the mean area columns are similarly plotted in fig. 5.13.
Increasing the random angle range makes the positive samples more varied, so the
corresponding increase in features and feature area as the angle range increases is not
surprising.
A property that was not anticipated is that each seahorse segment detection cascade
has approximately twice as many features as the fish detection cascades trained with
the same random angle range. This may be because the fish are more consistently
shaped or because their image backgrounds contain fewer objects. Testing this would
involve moving dozens of fish and seahorses into different environments, followed by
extensive image photography, annotating, training and classification. Such tests were
not considered within the scope of this thesis.
Also, while the feature counts for seahorse heads and bodies in fig. 5.12 are very
similar, the body detection features generally cover a larger area than their head detection
counterparts, as seen in fig. 5.13.
Table 5.3: Rotated cascade feature counts
(a) Fish detection cascade feature counts
(b) Seahorse head detection cascade feature counts
(c) Seahorse body detection cascade feature counts
Fig. 5.12 contains a graph of the mean columns from these tables.
Table 5.4: Rotated cascade area per feature
(a) Fish detection cascade mean area per feature
(b) Seahorse head detection cascade mean area per feature
(c) Seahorse body detection cascade mean area per feature
Fig. 5.13 contains a graph of the mean columns from these tables.
Areas are in the units of the cascades’ coordinate systems.
[Plot: mean cascade features (0 to 800) against random angle range (0° to 90°); series: seahorse bodies, seahorse heads, fish]
Figure 5.12: Graph of rotated cascade feature counts
[Plot: mean cascade feature area (0 to 140) against random angle range (0° to 90°); series: seahorse bodies, seahorse heads, fish]
Figure 5.13: Graph of rotated cascade feature areas
5.4.2 Angle ranges for fish detection
With detection on rotated images at 15° steps, the best fish detection cascades were
trained on an angle range of 30° (±15°), resulting in overlapping angle ranges. This
may be seen in figs. 5.14(a) and 5.14(b). When the angle step was increased to 30°,
this remained the best range, as shown in figs. 5.14(c) and 5.14(d). Finally, when the
angle step was increased to 45°, the best angle range also increased to 45° (±22.5°)
(figs. 5.14(e), 5.14(f)). When the cascades were rotated instead of images, the best
angle ranges were different but followed similar trends, with 25° angle ranges best for
15° and 30° angle steps (figs. 5.15(a), 5.15(b) and 5.15(c), 5.15(d) respectively). When
the step increased to 45°, the best angle range increased to 60° (figs. 5.15(e), 5.15(f)).
[Plots: true positives (0 to 500) and true positive rate (0 to 1) against false positives (0 to 800)]
(a) 15° steps, 0°..30° range (curves: Range=0°, 10°, 20°, 25°, 30°)
(b) 15° steps, 30°..90° range (curves: Range=30°, 45°, 60°, 90°)
(c) 30° steps, 0°..30° range (curves: Range=0°, 10°, 20°, 25°, 30°)
(d) 30° steps, 30°..90° range (curves: Range=30°, 45°, 60°, 90°)
(e) 45° steps, 0°..45° range (curves: Range=0°, 10°, 20°, 30°, 45°)
(f) 45° steps, 45°..90° range (curves: Range=45°, 60°, 90°)
Figure 5.14: ROC curves for fish detection on rotated images, varying the cascade
random angle range
[Plots: true positives (0 to 500) and true positive rate (0 to 1) against false positives (0 to 800)]
(a) 15° steps, 0°..25° range (curves: Range=0°, 10°, 15°, 20°, 25°)
(b) 15° steps, 25°..90° range (curves: Range=25°, 30°, 45°, 60°, 90°)
(c) 30° steps, 0°..25° range (curves: Range=0°, 10°, 15°, 20°, 25°)
(d) 30° steps, 25°..90° range (curves: Range=25°, 30°, 45°, 60°, 90°)
(e) 45° steps, 0°..60° range (curves: Range=0°, 15°, 30°, 45°, 60°)
(f) 45° steps, 60°..90° range (curves: Range=60°, 90°)
Figure 5.15: ROC curves for fish detection by rotated cascades, varying the cascade
random angle range
5.4.3 Angle ranges for seahorse segment detection
The best random angle ranges for effective seahorse detection were generally lower than
those found for effective fish detection in section 5.4.2. When the images were rotated,
the best random angle ranges were always smaller than or equal to the angle step, both
for head detection (fig. 5.16) and body detection (fig. 5.17).
The results were similar for rotated cascades detecting heads (fig. 5.18) and bodies
(fig. 5.19), although for the high angle step of 45°, the best angle ranges exceeded the
step: 60° for heads (figs. 5.18(e), 5.18(f)) and 50° for bodies (figs. 5.19(e), 5.19(f)).
[Plots: true positives (0 to 800) and true positive rate (0 to 1) against false positives (0 to 800)]
(a) 15° steps, 0°..5° range (curves: Range=0°, 5°)
(b) 15° steps, 5°..90° range (curves: Range=5°, 10°, 15°, 30°, 90°)
(c) 30° steps, 0°..20° range (curves: Range=0°, 5°, 10°, 15°, 20°)
(d) 30° steps, 20°..90° range (curves: Range=20°, 25°, 30°, 45°, 90°)
(e) 45° steps, 0°..40° range (curves: Range=0°, 20°, 30°, 35°, 40°)
(f) 45° steps, 40°..90° range (curves: Range=40°, 45°, 50°, 60°, 90°)
Figure 5.16: ROC curves for seahorse head detection on rotated images, varying the
cascade random angle range
[Plots: true positives (0 to 800) and true positive rate (0 to 1) against false positives (0 to 800)]
(a) 15° steps, 0°..10° range (curves: Range=0°, 5°, 10°)
(b) 15° steps, 10°..90° range (curves: Range=10°, 15°, 20°, 30°, 90°)
(c) 30° steps, 0°..30° range (curves: Range=0°, 10°, 20°, 25°, 30°)
(d) 30° steps, 30°..90° range (curves: Range=30°, 35°, 40°, 45°, 90°)
(e) 45° steps, 0°..30° range (curves: Range=0°, 10°, 20°, 25°, 30°)
(f) 45° steps, 30°..90° range (curves: Range=30°, 35°, 40°, 45°, 90°)
Figure 5.17: ROC curves for seahorse body detection on rotated images, varying the
cascade random angle range
[Plots: true positives (0 to 800) and true positive rate (0 to 1) against false positives (0 to 800)]
(a) 15° steps, 0°..10° range (curves: Range=0°, 5°, 10°)
(b) 15° steps, 10°..90° range (curves: Range=10°, 15°, 20°, 30°, 90°)
(c) 30° steps, 0°..20° range (curves: Range=0°, 5°, 10°, 15°, 20°)
(d) 30° steps, 20°..90° range (curves: Range=20°, 25°, 30°, 45°, 90°)
(e) 45° steps, 0°..60° range (curves: Range=0°, 20°, 40°, 50°, 60°)
(f) 45° steps, 60°..90° range (curves: Range=60°, 90°)
Figure 5.18: ROC curves for seahorse head detection by rotated cascades, varying the
cascade random angle range
[Plots: true positives (0 to 800) and true positive rate (0 to 1) against false positives (0 to 800)]
(a) 15° steps, 0°..10° range (curves: Range=0°, 5°, 10°)
(b) 15° steps, 10°..90° range (curves: Range=10°, 15°, 20°, 30°, 90°)
(c) 30° steps, 0°..20° range (curves: Range=0°, 5°, 10°, 15°, 20°)
(d) 30° steps, 20°..90° range (curves: Range=20°, 25°, 30°, 45°, 90°)
(e) 45° steps, 0°..50° range (curves: Range=0°, 30°, 40°, 45°, 50°)
(f) 45° steps, 50°..90° range (curves: Range=50°, 60°, 90°)
Figure 5.19: ROC curves for seahorse body detection by rotated cascades, varying the
cascade random angle range
5.4.4 Seahorse segment comparison
An unexpected result of the segment detections above is that seahorse body detection
consistently had a large accuracy advantage over seahorse head detection. This was
verified by plotting the best head and body detection curves on the same graph, as in
fig. 5.20. This difference appeared despite both sets of cascades being trained on the
same number of images and being built from the same pool of possible features.
As found in section 5.4.1, the two sets of cascades also have similar feature
counts, but the body detection cascades had consistently greater area per feature than
the head detection cascades. This may make the head detection cascades more ‘brittle’
and less likely to report a positive when placed close to but not exactly over a seahorse
head in the test images.
[Plots: true positives (0 to 800) and true positive rate (0 to 1) against false positives (0 to 800); curves: seahorse bodies, seahorse heads]
(a) Rotated images
(b) Rotated cascades
Figure 5.20: ROC curves for seahorse head detection compared with seahorse body
detection
5.4.5 Angle steps
As predicted in section 5.3, the best angle step was almost always the smallest of those
tested, 15°. For fish detection on rotated images 30° steps were comparable with 15°,
although 45° steps were much worse (fig. 5.21(a)), but for rotated cascades 15° steps
were consistently better than 30° steps (fig. 5.21(b)). The seahorse segment results
were similar, although the differences were smaller (figs. 5.22, 5.23).
[Plots: true positives (0 to 500) and true positive rate (0 to 1) against false positives (0 to 800); curves: Step=15°, 30°, 45°]
(a) Rotated images
(b) Rotated cascades
Figure 5.21: ROC curves for fish detection with varying angle steps
[Plots: true positives (0 to 800) and true positive rate (0 to 1) against false positives (0 to 800); curves: Step=15°, 30°, 45°]
(a) Seahorse heads
(b) Seahorse bodies
Figure 5.22: ROC curves for seahorse segment detection on rotated images with varying
angle steps
[Plots: true positives (0 to 800) and true positive rate (0 to 1) against false positives (0 to 800); curves: Step=15°, 30°, 45°]
(a) Seahorse heads
(b) Seahorse bodies
Figure 5.23: ROC curves for seahorse segment detection by rotated cascades with
varying angle steps
5.4.6 Rotated images against rotated cascades
Detection by a single cascade on rotated images was consistently more accurate
than detection on a single image by multiple rotated cascades. This is shown for fish
detection in fig. 5.24(a), seahorse head detection in fig. 5.24(b) and seahorse body
detection in fig. 5.24(c). This is unlikely to be merely because the 0° cascades used for
rotated image detection were ‘less distorted’: the fish and seahorses in the original
images were not conveniently posed at 0°, so still had to be rotated and scaled by some
amount to make them horizontal for 0° cascade training.
[Plots: true positives (0 to 500 for fish, 0 to 800 for seahorse segments) and true positive rate (0 to 1) against false positives (0 to 800); curves: rotated images, rotated cascades]
(a) Fish detection
(b) Seahorse head detection
(c) Seahorse body detection
Figure 5.24: ROC curves comparing rotated images with rotated cascades
5.5 Conclusions
Haar Classifier Cascades are, by the nature of their features, not rotation-invariant.
This chapter considered first how to train them to detect objects over a small range
of orientations, then how to apply them in combination to detect objects over a wide
range of orientations. Three different parameters were varied: the training sample
random angle range, whether the image or cascade was rotated, and the steps between
image or cascade rotations.
An expected result is that the minimum angle step was best, although with rotated
images going from 15° steps to 30° steps caused minimal accuracy loss. While the best
training random angle range was close to this step, results show that it is potentially
worthwhile to train and compare classifiers with angle ranges both above and below
the angle step.
Detections on rotated images by a single cascade were also consistently more
accurate than detections on a single image by rotated cascades. At 15° and 30° steps,
the best random angle ranges for detection by rotated cascades were smaller than or
equal to the best ranges for detection on rotated images in all but one case (seahorse
heads at 15° steps); at 45° steps they were larger.
The classifiers trained in this chapter are used again in chapter 6, where their
confidence is measured, and in chapter 7, where detections made by the seahorse head
and body detectors are joined together.