
5 Rotated object detection

Portions of this chapter were previously presented at the 20th Australian Joint Conference on Artificial Intelligence (Horton et al., 2007).

As described in section 2.6.7, Haar Classifier Cascades cannot detect objects with

a wide range of orientations. There are two simple ways to deal with this:

1. Rotated images: one cascade is trained and the test images are rotated past it to

detect rotated objects.

2. Rotated cascades: multiple cascades are trained, each to detect objects at a different orientation. The entire set of cascades is run on every test image.

After either method, the detections for different orientations may be merged into a single set of detections, as described for single cascades in section 2.6.6.
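The two methods can be sketched side by side as follows. This is a minimal illustration only: `rotate_image`, `run_cascade` and the `cascades` mapping are hypothetical callables supplied by the caller, not functions from OpenCV or from the implementation used in this thesis, and the detections are left in the coordinates in which they were found.

```python
def detect_rotated_images(image, angles, rotate_image, run_cascade):
    """Method 1: rotate the image past a single cascade.

    Detections are reported in the rotated image's coordinates, tagged
    with the angle at which they were found.
    """
    found = []
    for angle in angles:
        rotated = rotate_image(image, angle)  # hypothetical helper
        for rect in run_cascade(rotated):     # one cascade pass per angle
            found.append((angle, rect))
    return found


def detect_rotated_cascades(image, cascades):
    """Method 2: run one cascade per orientation on the unrotated image.

    `cascades` maps each orientation in degrees to its detector callable.
    """
    found = []
    for angle, run_cascade in cascades.items():
        for rect in run_cascade(image):       # one cascade pass per angle
            found.append((angle, rect))
    return found
```

Either way, one cascade pass is made per orientation, which is where the multiplied classification time comes from.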

Both of these methods multiply classification time by the number of orientations

used. Jones and Viola avoided this for the second method by training a pose estimator

decision tree which predicts which of 12 binary detection cascades is best for each image

region (Jones & Viola, 2003a). This had approximately twice the classification time of

a single binary object detector, a significant improvement over the 12 times required

by using every cascade. However, accuracy was lost with the gain in speed.

Some uncertainty remains over how far apart the image or cascade orientations can be, and how much the training images should be perturbed for good performance in rotated object detection. In (Jones & Viola, 2003a) the cascades were trained in 30° steps, each covering a 30° (±15°) range, while (Kolsch & Turk, 2004a) used 15° steps and tested angle ranges from 0° to 15° (±7.5°). In the latter paper the most accurate detections were by cascades with 15° ranges, the highest ranges tested. This suggests that increasing the range further (which would lead to overlap between the cascade ranges) may also be worth testing.
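The relationship between angle step and angle range can be made concrete with a small helper. This is an illustrative sketch; the function name and its three labels are assumptions introduced here, and the range is taken to be the full width (so a 30° range means ±15°).

```python
def range_coverage(step_deg: float, range_deg: float) -> str:
    """Classify how training ranges centred `step_deg` apart relate.

    `range_deg` is the full width of the random angle range, so a
    30 degree range means +/-15 degrees about the centre angle.
    """
    if range_deg < step_deg:
        return "gaps"      # some orientations fall between neighbouring ranges
    if range_deg == step_deg:
        return "adjacent"  # neighbouring ranges meet exactly
    return "overlap"       # neighbouring ranges share orientations
```

Under this reading, both earlier studies stop at the "adjacent" case (30° ranges at 30° steps, 15° ranges at 15° steps), and the "overlap" case is the one left untested.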

5.1 Cascade training and testing implementation

As mentioned in section 2.6.8, the Open Computer Vision (OpenCV) libraries were used as a starting point for the object detection algorithms in this and the following chapters.


However, apart from the new algorithms added, four significant changes were made to

the OpenCV implementation:

1. In the original program, the training process does not correctly build the set of all possible features if the cascade window is rectangular (not square), a property required of the fish detection cascades trained in this chapter. This error was fixed by making the training code identify the cascade width and height correctly.

2. One of the feature candidates created by OpenCV, a ‘chequerboard’ listed internally as ‘haar_x2_y2’ (shown in fig. 2.7), was used by Viola and Jones (Viola & Jones, 2001b) but left out of Lienhart and Maydt’s feature set (Lienhart & Maydt, 2002) and did not have a corresponding tilted feature. With the emphasis on training cascades to detect rotated objects in this thesis, it was felt that cascade training for straight and tilted objects should be as comparable as possible. The haar_x2_y2 feature was therefore disabled before cascade training began. All other features in fig. 2.7 were used.

3. At classification time, OpenCV halved tilted feature weights. This weight adjustment was not applied to tilted features in training, so the same region was evaluated differently at training and testing time. The halving at classification time was therefore removed.

4. In the OpenCV merging process described in section 2.6.6, detections completely within the rectangle of a larger detection with more neighbours are erased. Such detections may easily be correct in marine animal detection, where the creatures don’t fill their detection rectangle. This option was therefore disabled for the fish and seahorse detection.
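The rule in change 4 can be sketched as below. This is a hedged illustration of the rule as worded above, not OpenCV's actual code: the `Detection` type, the `filter_nested` function and its `remove_nested` flag are hypothetical names, and "larger" is assumed to mean larger rectangle area.

```python
from typing import List, NamedTuple


class Detection(NamedTuple):
    x: int
    y: int
    w: int
    h: int
    neighbours: int  # number of raw detections merged into this one


def contains(outer: Detection, inner: Detection) -> bool:
    """True if `inner` lies completely within the rectangle of `outer`."""
    return (outer.x <= inner.x and outer.y <= inner.y
            and inner.x + inner.w <= outer.x + outer.w
            and inner.y + inner.h <= outer.y + outer.h)


def filter_nested(dets: List[Detection], remove_nested: bool = True) -> List[Detection]:
    """Erase detections nested inside a larger, better-supported detection."""
    if not remove_nested:  # the setting used for fish and seahorses
        return list(dets)
    kept = []
    for d in dets:
        swallowed = any(
            o is not d
            and o.w * o.h > d.w * d.h        # assumed meaning of "larger"
            and o.neighbours > d.neighbours
            and contains(o, d)
            for o in dets
        )
        if not swallowed:
            kept.append(d)
    return kept
```

With `remove_nested=False`, a small fish detected inside the rectangle of a larger one survives merging.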

5.2 Cascade training data

5.2.1 Fish

As explained in section 4.2, the 235 fish images were divided into training and testing

sets, with 119 images annotated for testing. The remaining 116 training images were

annotated with the locations of 409 mostly-visible fish. Each training fish annotation

was a line segment from nose to tail, as in the examples in fig. 5.1. These training fish

subimages could then be rotated to any angle to train a cascade to detect fish at that

orientation.
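A nose-to-tail segment is enough to recover the fish's current orientation and hence the rotation needed to bring it to a target angle. The sketch below assumes angles are measured anti-clockwise from the positive x axis, along the tail-to-nose direction, with image rows increasing downwards; these conventions and the function names are assumptions made for illustration and may differ from those used in the thesis.

```python
import math


def segment_angle(nose, tail):
    """Orientation in degrees of the tail-to-nose direction."""
    dx = nose[0] - tail[0]
    dy = tail[1] - nose[1]  # flip sign: image rows increase downwards
    return math.degrees(math.atan2(dy, dx))


def rotation_to_target(current_deg, target_deg):
    """Smallest signed rotation, in (-180, 180], from current to target."""
    delta = (target_deg - current_deg) % 360.0
    return delta - 360.0 if delta > 180.0 else delta


def segment_length(nose, tail):
    """Length of the annotation, useful when scaling the subimage."""
    return math.hypot(nose[0] - tail[0], nose[1] - tail[1])
```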


As noted above, existing research on hand detection suggests that the positive

training examples for a given angle should not be fixed at that angle, but should be

perturbed by some random amount (Jones & Viola, 2003a; Kolsch & Turk, 2004a), as

illustrated in fig. 5.2. However, these studies only report on random angle ranges as

high as the angle step, as in fig. 5.2(c), and don’t consider overlapping random angle

ranges such as those in figs. 5.2(d) and 5.2(e).
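The perturbation can be sketched as drawing each positive sample's angle from an interval centred on the cascade's angle. A uniform distribution is assumed here for illustration; the text above only says the samples are perturbed by a random amount, and the function name is hypothetical.

```python
import random


def sample_training_angle(centre_deg, range_deg, rng):
    """Draw an orientation within +/- range_deg / 2 of centre_deg.

    `rng` is a random.Random instance so that sample sets are repeatable.
    A uniform distribution is an assumption made for this sketch.
    """
    half = range_deg / 2.0
    return rng.uniform(centre_deg - half, centre_deg + half)
```

For example, `sample_training_angle(45, 30, random.Random(0))` yields an angle between 30° and 60°, while a range of 0° always returns the centre angle itself.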

Along with the rotation, the positive samples were scaled to be as large as possible while centred in and fitted within the sample rectangle. Figs. 5.3(a) and 5.3(b) show some of the training fish being forced to exactly 0° and 45° respectively. Figs. 5.3(d) and 5.3(e) show the same fish randomly rotated within a range of 30° (±15°). The rotation and scaling were done in a single step with an affine transformation function from the OpenCV libraries; it interpolated using the pixel area relation to minimise loss of detail.
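Combining rotation and scaling in one step amounts to building a single 2×3 affine matrix. The sketch below is written in plain Python rather than calling OpenCV; the matrix layout is believed to match the one OpenCV documents for its rotation-matrix helper (positive angles rotate anti-clockwise as displayed, with the origin at the top-left), but that correspondence and the function names are assumptions here.

```python
import math


def rotation_scale_matrix(cx, cy, angle_deg, scale):
    """2x3 affine matrix: rotate by angle_deg about (cx, cy), then scale."""
    a = scale * math.cos(math.radians(angle_deg))
    b = scale * math.sin(math.radians(angle_deg))
    return [[a, b, (1 - a) * cx - b * cy],
            [-b, a, b * cx + (1 - a) * cy]]


def apply_affine(m, x, y):
    """Map the point (x, y) through the 2x3 matrix m."""
    return (m[0][0] * x + m[0][1] * y + m[0][2],
            m[1][0] * x + m[1][1] * y + m[1][2])
```

The centre of rotation is a fixed point of the transform, so a fish centred there stays centred in the sample rectangle whatever the angle and scale.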

The Haar Classifier Cascade training process needs negative training examples as

well, so 722 regions containing no complete fish were extracted from the training images;

fig. 5.4 contains examples. The cascades were supposed to ignore negative objects in

any orientation. In order to provide such negative training examples, the negative

regions were flipped 50% of the time, mirrored twice across each axis, rotated to random

orientations and the largest possible square used as a negative training sample. Fig. 5.5

shows the process, and fig. 5.6 shows some of the results. The negative sample in the

first step of fig. 5.5 has merely been shrunk to fit the page; negative training images

were not scaled and, as with the positive samples, the rotation step interpolated pixels

to minimise information loss.
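The size of the largest upright square that fits inside a rotated rectangle follows from simple geometry. The sketch below covers only that geometric step for a plain rectangle; it does not model the flipping or mirroring steps, which change the region being rotated, and the function name is an assumption.

```python
import math


def largest_axis_aligned_square(width, height, angle_deg):
    """Side of the largest upright square inside a rotated rectangle.

    Seen from the rectangle's own frame, the square is tilted by the
    rotation angle, and its bounding box has side
    s * (|cos| + |sin|), which must fit within the shorter side.
    """
    t = math.radians(angle_deg)
    return min(width, height) / (abs(math.cos(t)) + abs(math.sin(t)))
```

The square is largest at multiples of 90° and smallest at 45°, where it shrinks to about 71% of the rectangle's shorter side.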

Figure 5.1: Example fish images, annotated with training positives


(a) Range=0° (b) Range=10° (c) Range=15° (d) Range=20° (e) Range=60°

Figure 5.2: Positive sample orientations for 7 cascades fixed on angles from -45° to +45° with random angle ranges; darker areas show where random ranges overlap.

Each set of rotated cascade fish detectors tested in section 5.4.2 onwards contained seven Haar Classifier Cascades. Each was trained to detect fish oriented around an angle of -45°, -30°, -15°, 0°, +15°, +30° or +45°. The dimensions of each cascade were chosen to detect fish with an aspect ratio of 3:1 (consistent with the mean aspect ratio of fish in the training images) and to have an area of approximately 48 × 16 = 768 units (table 5.1). This would not detect all possible fish, as it does not cover every possible orientation, but was sufficient for the images being used, where the fish were consistently close to horizontal and almost always swimming left, as seen in fig. 4.4.
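For the unrotated case, the window size follows directly from the target area and aspect ratio. This sketch reproduces only the 48 × 16 figure quoted above; it makes no claim about how the windows for the other orientations in table 5.1 were derived, and the function name is hypothetical.

```python
import math


def window_for(area, aspect_ratio):
    """Width and height of a window with the given area and width:height ratio."""
    width = math.sqrt(area * aspect_ratio)
    height = math.sqrt(area / aspect_ratio)
    return round(width), round(height)
```

With an area of 768 units and a 3:1 ratio, the width is √2304 = 48 and the height is √256 = 16.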

Table 5.1: Fish detection cascade window sizes


(a) First 8 positive fish training regions forced to 0° orientation

(b) First 8 positive fish training regions forced to 45° orientation

(c) First positive fish training region forced to -45°..45°

(d) First positive fish training region forced to 0°±15°

(e) First positive fish training region forced to 45°±15°

Figure 5.3: Positive fish training regions at different orientations

Figure 5.4: First 10 negative fish training regions before flipping, mirroring and rotation


Figure 5.5: Creating a fish negative training sample from the first negative training

region

Figure 5.6: First 10 negative fish training regions after flipping, mirroring and rotation


5.2.2 Seahorses

As explained in section 4.3, there were 263 seahorse images, with 131 used for training.

The remaining 132 test images were annotated using the rules described there and

illustrated in fig. 4.7. Annotations made following these rules on the example distant

and close-up images from fig. 4.6 are shown in fig. 5.7. The training images were filled

with 748 such annotations. As with the fish, these segments could be forced to any

angle for training, as seen for heads in fig. 5.8 and bodies in fig. 5.9.

The seahorse tank was less crowded than the fish tank, so the entire images were

used as negative training samples. Positive examples for each segment were removed by

drawing blank circles over the appropriate parts of each image (fig. 5.10). To encourage

rotation-independent cascades these negative training images were also flipped 50% of

the time, mirrored and randomly rotated. This provided negative training examples in

all orientations (fig. 5.11).

Seahorse cascades were then trained for anti-clockwise-oriented seahorse segments at 15° angle increments, from 0° to 75°. Trained cascades can be rotated by any multiple of 90° by modifying the coordinates (Jones & Viola, 2003a), so each of the 6 cascades was rotated through 0°, 90°, 180° and 270°. The resulting 24 cascades covered all orientations from 0° to 345°. These would only detect anti-clockwise seahorse segments, so during classification the images were flipped horizontally to detect clockwise-oriented seahorse segments; this also doubles classification time. Each seahorse segment detector had a window size of 24 × 24 units.
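The quarter-turn trick and the resulting set of orientations can be sketched as follows. The sketch handles upright feature rectangles only (tilted features need separate treatment), and which displayed direction corresponds to a positive cascade angle depends on the angle convention in use; the function and constant names are assumptions for illustration.

```python
def rotate_rect_90(rect, win_w, win_h):
    """Rotate an upright rectangle a quarter turn inside its window.

    rect is (x, y, w, h) with the origin at the window's top-left corner
    and y pointing down; the turn is clockwise as displayed. Returns the
    rotated rectangle and the rotated window size.
    """
    x, y, w, h = rect
    return (win_h - y - h, x, h, w), (win_h, win_w)


BASE_ANGLES = range(0, 90, 15)        # the six trained cascades: 0, 15, ..., 75
QUARTER_TURNS = (0, 90, 180, 270)     # rotations obtained by remapping coordinates
ALL_ANGLES = sorted(b + q for b in BASE_ANGLES for q in QUARTER_TURNS)
```

`ALL_ANGLES` contains the 24 orientations from 0° to 345° in 15° steps, and four successive quarter turns return any rectangle to where it started.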


Figure 5.7: Example seahorse images with segments annotated for training

Figure 5.8: First 8 seahorse heads extracted from example images and forced to 0° orientation

Figure 5.9: First 8 seahorse bodies extracted from example images and forced to 90° orientation


Figure 5.10: Example seahorse images with heads blanked out

Figure 5.11: Example seahorse images with heads blanked out after flipping, mirroring

and rotation


5.3 Cascade training settings

All cascades used in this chapter were trained with the OpenCV default maximum stage false positive rate of 0.5 and maximum stage false negative rate of 0.005. Preliminary tests showed that the OpenCV default stage count of 14 was insufficient for fish and seahorse detection; 20 stages were tested and found to be more appropriate. This was also consistent with some of Lienhart et al.’s 20-stage face detection cascades (Lienhart et al., 2003a). By default, OpenCV also creates and tests features in symmetric pairs, mirrored across the y axis of the cascade training window; this was deactivated since the fish and seahorse segments are not symmetric from the side. Following (Lienhart et al., 2003a), Gentle AdaBoost was selected as the boosting algorithm.
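The per-stage rates compound across stages, which is why the stage count matters. The sketch below assumes each stage exactly meets its targets on the training data and that stages behave independently, the usual back-of-envelope reasoning for cascades; real cascades will differ on unseen images, and the function name is hypothetical.

```python
def cascade_bounds(stages, max_stage_fp, max_stage_fn):
    """Nominal overall false positive rate and hit rate of a cascade.

    Assumes every stage exactly meets its per-stage targets and that
    the stages are independent.
    """
    overall_fp = max_stage_fp ** stages
    overall_hit = (1.0 - max_stage_fn) ** stages
    return overall_fp, overall_hit
```

With the settings above, 20 stages give a nominal false positive rate of 0.5^20 (just under one in a million windows) and a nominal hit rate of 0.995^20, roughly 0.905; 14 stages would only reach 0.5^14, about 6 in 100,000.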

The settings chosen here, and in sections 5.1, 5.2.1 and 5.2.2, are summarised in

table 5.2.

To find the best random angle ranges to use for training cascades, training positive image sets were constructed with random ranges from 0° to 90° (±0° to ±45°). The fish detection cascade ranges were 0°, 5°, 10°, 15°, 20°, 25°, 30°, 45°, 60° and 90°; the seahorse detection cascade ranges were 0°, 5°, 10°, 15°, 20°, 25°, 30°, 35°, 40°, 45°, 50°, 60° and 90°. They were tested using both rotated cascades and rotated images in steps of 15°, 30° and 45°. This has a cost in computation time; if detections are made at n different angles, the detection time is that for a single cascade multiplied by n. If n different cascades are trained on different angles and used to classify a single image, similar time is taken unless the two-stage process discussed at the start of the chapter and described in (Jones & Viola, 2003a) is used. However, more rotations give more opportunities to detect each object, so may be expected to increase accuracy.

Table 5.2: List of cascade training settings


5.4 Results

For comparisons, ROC curves were constructed for each cascade or set of cascades and visually compared, as described in section 2.7.3. Comparisons of different random angle ranges in this thesis involve a pair of graphs, with the first showing ranges ascending from 0° to the apparent best range (e.g. fig. 5.14(a)), and the second showing ranges ascending from it to 90° (e.g. fig. 5.14(b)). All ranges were tested, but not all are shown on the graphs for readability. Comparisons of different methods use plots that combine the best results from each set of cascades, as described in section 2.7.3.1. For example, the ‘15° steps’ line at the top of fig. 5.21(a) follows the best points from figs. 5.14(a) and 5.14(b).
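One plausible reading of "following the best points" is to keep, across all the curves being combined, only those points that no other point beats on both axes. The sketch below implements that reading; section 2.7.3.1 defines the actual procedure, so this function and its name are assumptions rather than the thesis's method.

```python
def best_points(curves):
    """Combine ROC curves by keeping only undominated points.

    `curves` is an iterable of lists of (false_positives, true_positives).
    A point is kept if it has more true positives than every point with
    the same or fewer false positives.
    """
    points = sorted({p for curve in curves for p in curve},
                    key=lambda p: (p[0], -p[1]))
    envelope = []
    best_tp = float("-inf")
    for fp, tp in points:
        if tp > best_tp:
            envelope.append((fp, tp))
            best_tp = tp
    return envelope
```

For instance, combining `[(10, 50), (40, 90)]` with `[(10, 60), (40, 80), (100, 95)]` keeps `(10, 60)`, `(40, 90)` and `(100, 95)`.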

Appendix C contains example images with detections made by some of these cascades and methods. Figs. C.4 and C.7 contain fish detections from rotated images and rotated cascades respectively; fig. C.11 contains seahorse detections on rotated images. Seahorse detections by rotated cascades are not shown. In the example fish image, there is one large fish which was not found during the rotated cascade detection in fig. C.7. This is probably because a cascade to detect it would have to pass outside the image bounds.

First, however, trends in the number and areas of the features chosen for each

cascade will be considered.


5.4.1 Cascade features

Table 5.3 counts the features for each cascade, where each ‘feature’ is a rectangular block of the type shown in fig. 2.7. As explained in section 5.2.2, the seahorse segment detection cascades fixed around angles 90°..345° are created through 90° rotations of the 0°..75° cascades listed here. Note also that these are total counts for 20-stage cascades; comparisons with other cascades should be made by dividing the number of features by the number of stages.

The areas covered by those features are also given in table 5.4; tilted features are considered to have four times the area of straight features. The mean counts for each random angle range listed in the right-hand column of the tables are plotted in fig. 5.12; the mean area columns are similarly plotted in fig. 5.13. Increasing the random angle range makes the positive samples more varied, so the corresponding increase in features and feature area as the angle range increases is not surprising.

A property that was not anticipated is that each seahorse segment detection cascade has approximately twice as many features as the fish detection cascades trained with the same random angle range. This may be because the fish are more consistently shaped or because their image backgrounds contain fewer objects. Testing this would involve moving dozens of fish and seahorses into different environments, followed by extensive image photography, annotating, training and classification. Such tests were not considered within the scope of this thesis.

Also, while the feature counts for seahorse heads and bodies in fig. 5.12 are very similar, the body detection features generally cover a larger area than their head detection counterparts, as seen in fig. 5.13.


Table 5.3: Rotated cascade feature counts

(a) Fish detection cascade feature counts

(b) Seahorse head detection cascade feature counts

(c) Seahorse body detection cascade feature counts

Fig. 5.12 contains a graph of the mean columns from these tables.


Table 5.4: Rotated cascade area per feature

(a) Fish detection cascade mean area per feature

(b) Seahorse head detection cascade mean area per feature

(c) Seahorse body detection cascade mean area per feature

Fig. 5.13 contains a graph of the mean columns from these tables.

Areas are in the units of the cascades’ coordinate systems.


Axes: random angle range (0° to 90°) against mean cascade features (0 to 800). Series: seahorse bodies, seahorse heads, fish.

Figure 5.12: Graph of rotated cascade feature counts

Axes: random angle range (0° to 90°) against mean cascade feature area (0 to 140). Series: seahorse bodies, seahorse heads, fish.

Figure 5.13: Graph of rotated cascade feature areas


5.4.2 Angle ranges for fish detection

With detection on rotated images at 15° steps, the best fish detection cascades were trained on an angle range of 30° (±15°), resulting in overlapping angle ranges. This may be seen in figs. 5.14(a) and 5.14(b). When the angle step was increased to 30°, this remained the best range, as shown in figs. 5.14(c) and 5.14(d). Finally, when the angle step was increased to 45°, the best angle range also increased to 45° (±22.5°) (figs. 5.14(e), 5.14(f)). When the cascades were rotated instead of images, the best angle ranges were different but followed similar trends, with 25° angle ranges best for 15° and 30° angle steps (figs. 5.15(a), 5.15(b) and 5.15(c), 5.15(d) respectively). When the step increased to 45°, the best angle range increased to 60° (figs. 5.15(e), 5.15(f)).


Axes for each panel: false positives (0 to 800) against true positives (0 to 500) and true positive rate (0 to 1).

(a) 15° steps, 0°..30° range; legend: Range=30°, 25°, 20°, 10°, 0°

(b) 15° steps, 30°..90° range; legend: Range=30°, 45°, 60°, 90°

(c) 30° steps, 0°..30° range

(d) 30° steps, 30°..90° range; legend: Range=30°, 45°, 60°, 90°

(e) 45° steps, 0°..45° range

(f) 45° steps, 45°..90° range; legend: Range=45°, 60°, 90°

Figure 5.14: ROC curves for fish detection on rotated images, varying the cascade random angle range


Axes for each panel: false positives (0 to 800) against true positives (0 to 500) and true positive rate (0 to 1).

(a) 15° steps, 0°..25° range

(b) 15° steps, 25°..90° range

(c) 30° steps, 0°..25° range

(d) 30° steps, 25°..90° range

(e) 45° steps, 0°..60° range

(f) 45° steps, 60°..90° range; legend: Range=60°, 90°

Figure 5.15: ROC curves for fish detection by rotated cascades, varying the cascade random angle range


5.4.3 Angle ranges for seahorse segment detection

The best random angle ranges for effective seahorse detection were generally lower than those found for effective fish detection in section 5.4.2. When the images were rotated, the best random angle ranges were always smaller than or equal to the angle step, both for head detection (fig. 5.16) and body detection (fig. 5.17).

The results were similar for rotated cascades detecting heads (fig. 5.18) and bodies (fig. 5.19), although for the high angle step of 45°, the best angle ranges exceeded the step: 60° for heads (figs. 5.18(e), 5.18(f)) and 50° for bodies (figs. 5.19(e), 5.19(f)).


Axes for each panel: false positives (0 to 800) against true positives (0 to 800) and true positive rate (0 to 1).

(a) 15° steps, 0°..5° range; legend: Range=5°, 0°

(b) 15° steps, 5°..90° range

(c) 30° steps, 0°..20° range

(d) 30° steps, 20°..90° range

(e) 45° steps, 0°..40° range

(f) 45° steps, 40°..90° range

Figure 5.16: ROC curves for seahorse head detection on rotated images, varying the cascade random angle range


Axes for each panel: false positives (0 to 800) against true positives (0 to 800) and true positive rate (0 to 1).

(a) 15° steps, 0°..10° range; legend: Range=10°, 5°, 0°

(b) 15° steps, 10°..90° range

(c) 30° steps, 0°..30° range

(d) 30° steps, 30°..90° range

(e) 45° steps, 0°..30° range

(f) 45° steps, 30°..90° range

Figure 5.17: ROC curves for seahorse body detection on rotated images, varying the cascade random angle range



Axes for each panel: false positives (0 to 800) against true positives (0 to 800) and true positive rate (0 to 1).

(a) 15° steps, 0°..10° range; legend: Range=10°, 5°, 0°

(b) 15° steps, 10°..90° range

(c) 30° steps, 0°..20° range

(d) 30° steps, 20°..90° range

(e) 45° steps, 0°..60° range

(f) 45° steps, 60°..90° range; legend: Range=60°, 90°

Figure 5.18: ROC curves for seahorse head detection by rotated cascades, varying the cascade random angle range



Axes for each panel: false positives (0 to 800) against true positives (0 to 800) and true positive rate (0 to 1).

(a) 15° steps, 0°..10° range; legend: Range=10°, 5°, 0°

(b) 15° steps, 10°..90° range

(c) 30° steps, 0°..20° range

(d) 30° steps, 20°..90° range

(e) 45° steps, 0°..50° range

(f) 45° steps, 50°..90° range; legend: Range=50°, 60°, 90°

Figure 5.19: ROC curves for seahorse body detection by rotated cascades, varying the cascade random angle range



5.4.4 Seahorse segment comparison

An unexpected result of the segment detections above is that seahorse body detection consistently had a large accuracy advantage over seahorse head detection. This was verified by plotting the best head and body detection curves on the same graph, as in fig. 5.20. This difference appeared despite both sets of cascades being trained on the same number of images and being built from the same pool of possible features.

However, as found in section 5.4.1, the two sets of cascades also have similar feature

counts, but the body detection cascades had consistently greater area per feature than

the head detection cascades. This may make the head detection cascades more ‘brittle’

and less likely to report a positive when placed close to but not exactly over a seahorse

head in the test images.

Axes for each panel: false positives (0 to 800) against true positives (0 to 800) and true positive rate (0 to 1). Series: seahorse bodies, seahorse heads.

(a) Rotated images

(b) Rotated cascades

Figure 5.20: ROC curves for seahorse head detection compared with seahorse body detection


5.4.5 Angle steps

As predicted in section 5.3, the best angle step was almost always the smallest of those tested, 15°. For fish detection on rotated images 30° steps were comparable with 15°, although 45° steps were much worse (fig. 5.21(a)), but for rotated cascades 15° steps were consistently better than 30° steps (fig. 5.21(b)). The seahorse segment results were similar, although the differences were smaller (figs. 5.22, 5.23).

Axes for each panel: false positives (0 to 800) against true positives (0 to 500) and true positive rate (0 to 1).

(a) Rotated images; legend: Step=15°, 30°, 45°

(b) Rotated cascades

Figure 5.21: ROC curves for fish detection with varying angle steps


Axes for each panel: false positives (0 to 800) against true positives (0 to 800) and true positive rate (0 to 1).

(a) Seahorse heads

(b) Seahorse bodies

Figure 5.22: ROC curves for seahorse segment detection on rotated images with varying angle steps

Axes for each panel: false positives (0 to 800) against true positives (0 to 800) and true positive rate (0 to 1).

(a) Seahorse heads

(b) Seahorse bodies

Figure 5.23: ROC curves for seahorse segment detection by rotated cascades with varying angle steps


5.4.6 Rotated images against rotated cascades

Detection by a single cascade on rotated images was consistently more accurate than detection on a single image by multiple rotated cascades. This is shown for fish detection in fig. 5.24(a), seahorse head detection in fig. 5.24(b) and seahorse body detection in fig. 5.24(c). This would not merely be because the 0° cascades used for rotated image detection were ‘less distorted’: the fish and seahorses in the original images were not conveniently posed at 0°, so still had to be rotated and scaled by some amount to make them horizontal for 0° cascade training.

Axes for each panel: false positives (0 to 800) against true positives and true positive rate (0 to 1). Series: rotated images, rotated cascades.

(a) Fish detection (true positives 0 to 500)

(b) Seahorse head detection (true positives 0 to 800)

(c) Seahorse body detection (true positives 0 to 800)

Figure 5.24: ROC curves comparing rotated images with rotated cascades


5.5 Conclusions

Haar Classifier Cascades are, by the nature of their features, not rotation-invariant. This chapter considered first how to train them to detect objects over a small range of orientations, then how to apply them in combination to detect objects over a wide range of orientations. Three different parameters were varied: the training sample random angle range, whether the image or cascade was rotated, and the steps between image or cascade rotations.

An expected result is that the minimum angle step was best, although with rotated images going from 15° steps to 30° steps caused minimal accuracy loss. While the best training random angle range was close to this step, results show that it is potentially worthwhile to train and compare classifiers with angle ranges both above and below the angle step.

Detections on rotated images by a single cascade were also consistently more accurate than detections on a single image by rotated cascades. The best random angle ranges for detection by rotated cascades were also usually smaller than the best ranges for detection on rotated images.

The classifiers trained in this chapter are used again in chapter 6, where their

confidence is measured, and in chapter 7, where detections made by the seahorse head

and body detectors are joined together.
