6
Fine-Grained Visual Classification of Aircraft Subhransu Maji TTI Chicago [email protected] Esa Rahtu Juho Kannala University of Oulu, Finland {erahtu, jkannala}@ee.oulu.fi Matthew Blaschko ´ Ecole Centrale Paris [email protected] Andrea Vedaldi University of Oxford [email protected] Abstract This paper introduces FGVC-Aircraft, a new dataset containing 10,000 images of aircraft spanning 100 aircraft models, organised in a three-level hierarchy. At the finer level, differences between models are often subtle but al- ways visually measurable, making visual recognition chal- lenging but possible. A benchmark is obtained by defin- ing corresponding classification tasks and evaluation pro- tocols, and baseline results are presented. The construc- tion of this dataset was made possible by the work of air- craft enthusiasts, a strategy that can extend to the study of number of other object classes. Compared to the do- mains usually considered in fine-grained visual classifica- tion (FGVC), for example animals, aircraft are rigid and hence less deformable. They, however, present other inter- esting modes of variation, including purpose, size, designa- tion, structure, historical style, and branding. 1. Introduction In this paper, we introduce FGVC-Aircraft, a novel dataset aimed at studying the problem of fine-grained recog- nition of aircraft models (Fig. 1, Sect. 2). The new data in- cludes 10,000 airplane images spanning 100 different mod- els, organised in a hierarchical manner. All models are vi- sually distinguishable, even though in many cases the dif- ferences are subtle, making classification challenging and interesting. Airplanes are an alternative to objects typically consid- ered in fine-grained visual classification (FGVC) such as birds [5] and pets [24]. Compared to these domains, air- craft classification has several interesting aspects. First, air- craft designs vary significantly depending on the plane size (from home-built to large carriers), designation (private, civil, military), purpose (transporter, carrier, training, sport, fighter, etc.), and technological factors such as propulsion (glider, propellor, jet). Overall, thousands of different air- plane models exist or have existed. An interesting mode of variation, which is is not shared with categories such as ani- mals, is the fact that the structure of aircraft can change with their design. For example, the number of wings, undercar- riages, wheels per undercarriage, engines, etc. varies. Sec- ond, the aircraft designs exhibit systematic historical vari- ations in their style. Thirdly, the same aircraft models are used by different airliner companies, resulting in variable livery branding. Finally, aircraft are largely rigid objects, reducing the impact of deformability on classification per- formance, and allowing one to focus on the other aspects of FGVC. Our contributions are three-fold: (i) we introduce a new large dataset of aircraft images with detailed model anno- tations; (ii) we describe how this data was collected using on-line resources and the work of hobbyists and enthusiasts – a method that may be applicable to other object classes; and (iii) we present baseline results on aircraft model iden- tification. Sect. 2 describes the content of FGVC-Aircraft, including task definitions and evaluation protocols, Sect. 3 the dataset construction, Sect. 4 examines the performance of a baseline classifier on the data, and Sect. 5 summarises the contributions, giving further details on the data usage policy. 2. The dataset: content, tasks, and evaluation FGVC-Aircraft contains 10,000 images of airplanes an- notated with the model and bounding box of the dominant aircraft they contain. Aircraft models are organised in a four-level hierarchy, of which only the last three levels are of interest here. Model. This is the most specific class label, such as Boe- ing 737-76J. This level is not considered meaningful for FGVC as differences between models may not be visu- ally measurable, at least given an image of the exterior of the aircraft. Variant. Model variants are the finer distinction level that are visually detectable, and were obtained by merg- ing visually indistinguishable models. For example, the variant Boeing 737-700 includes 87 models such as 737- 7H4, 737-76N, 737-7K2, etc. The dataset contains 100 variants. Family. Families group together model variants that dif- fer in subtle ways, making differences between families 1 arXiv:1306.5151v1 [cs.CV] 21 Jun 2013

Fine-Grained Visual Classification of Aircraft

  • Upload
    others

  • View
    4

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Fine-Grained Visual Classification of Aircraft

Fine-Grained Visual Classification of Aircraft

Subhransu MajiTTI [email protected]

Esa Rahtu Juho KannalaUniversity of Oulu, Finland{erahtu, jkannala}@ee.oulu.fi

Matthew BlaschkoEcole Centrale [email protected]

Andrea VedaldiUniversity of [email protected]

Abstract

This paper introduces FGVC-Aircraft, a new datasetcontaining 10,000 images of aircraft spanning 100 aircraftmodels, organised in a three-level hierarchy. At the finerlevel, differences between models are often subtle but al-ways visually measurable, making visual recognition chal-lenging but possible. A benchmark is obtained by defin-ing corresponding classification tasks and evaluation pro-tocols, and baseline results are presented. The construc-tion of this dataset was made possible by the work of air-craft enthusiasts, a strategy that can extend to the studyof number of other object classes. Compared to the do-mains usually considered in fine-grained visual classifica-tion (FGVC), for example animals, aircraft are rigid andhence less deformable. They, however, present other inter-esting modes of variation, including purpose, size, designa-tion, structure, historical style, and branding.

1. IntroductionIn this paper, we introduce FGVC-Aircraft, a novel

dataset aimed at studying the problem of fine-grained recog-nition of aircraft models (Fig. 1, Sect. 2). The new data in-cludes 10,000 airplane images spanning 100 different mod-els, organised in a hierarchical manner. All models are vi-sually distinguishable, even though in many cases the dif-ferences are subtle, making classification challenging andinteresting.

Airplanes are an alternative to objects typically consid-ered in fine-grained visual classification (FGVC) such asbirds [5] and pets [2–4]. Compared to these domains, air-craft classification has several interesting aspects. First, air-craft designs vary significantly depending on the plane size(from home-built to large carriers), designation (private,civil, military), purpose (transporter, carrier, training, sport,fighter, etc.), and technological factors such as propulsion(glider, propellor, jet). Overall, thousands of different air-plane models exist or have existed. An interesting mode ofvariation, which is is not shared with categories such as ani-mals, is the fact that the structure of aircraft can change with

their design. For example, the number of wings, undercar-riages, wheels per undercarriage, engines, etc. varies. Sec-ond, the aircraft designs exhibit systematic historical vari-ations in their style. Thirdly, the same aircraft models areused by different airliner companies, resulting in variablelivery branding. Finally, aircraft are largely rigid objects,reducing the impact of deformability on classification per-formance, and allowing one to focus on the other aspects ofFGVC.

Our contributions are three-fold: (i) we introduce a newlarge dataset of aircraft images with detailed model anno-tations; (ii) we describe how this data was collected usingon-line resources and the work of hobbyists and enthusiasts– a method that may be applicable to other object classes;and (iii) we present baseline results on aircraft model iden-tification. Sect. 2 describes the content of FGVC-Aircraft,including task definitions and evaluation protocols, Sect. 3the dataset construction, Sect. 4 examines the performanceof a baseline classifier on the data, and Sect. 5 summarisesthe contributions, giving further details on the data usagepolicy.

2. The dataset: content, tasks, and evaluationFGVC-Aircraft contains 10,000 images of airplanes an-

notated with the model and bounding box of the dominantaircraft they contain. Aircraft models are organised in afour-level hierarchy, of which only the last three levels areof interest here.• Model. This is the most specific class label, such as Boe-

ing 737-76J. This level is not considered meaningful forFGVC as differences between models may not be visu-ally measurable, at least given an image of the exterior ofthe aircraft.

• Variant. Model variants are the finer distinction levelthat are visually detectable, and were obtained by merg-ing visually indistinguishable models. For example, thevariant Boeing 737-700 includes 87 models such as 737-7H4, 737-76N, 737-7K2, etc. The dataset contains 100variants.

• Family. Families group together model variants that dif-fer in subtle ways, making differences between families

1

arX

iv:1

306.

5151

v1 [

cs.C

V]

21

Jun

2013

Page 2: Fine-Grained Visual Classification of Aircraft

707−320 727−200 737−200 737−300 737−400 737−500 737−600 737−700 737−800 737−900

747−100 747−200 747−300 747−400 757−200 757−300 767−200 767−300 767−400 777−200

777−300 A300B4 A310 A318 A319 A320 A321 A330−200 A330−300 A340−200

A340−300 A340−500 A340−600 A380 ATR−42 ATR−72 An−12 BAE 146−200 BAE 146−300 BAE−125

Beechcraft 1900 Boeing 717 C−130 C−47 CRJ−200 CRJ−700 CRJ−900 Cessna 172 Cessna 208 Cessna 525

Cessna 560 Challenger 600 DC−10 DC−3 DC−6 DC−8 DC−9−30 DH−82 DHC−1 DHC−6

DHC−8−100 DHC−8−300 DR−400 Dornier 328 E−170 E−190 E−195 EMB−120 ERJ 135 ERJ 145

Embraer Legacy 600 Eurofighter Typhoon F−16A/B F/A−18 Falcon 2000 Falcon 900 Fokker 100 Fokker 50 Fokker 70 Global Express

Gulfstream IV Gulfstream V Hawk T1 Il−76 L−1011 MD−11 MD−80 MD−87 MD−90 Metroliner

Model B200 PA−28 SR−20 Saab 2000 Saab 340 Spitfire Tornado Tu−134 Tu−154 Yak−42

Figure 1. Our dataset contains 100 variants of aircrafts shown above. These are also annotated with their family and manufacturer, as wellas bounding boxes.

more substantial. The goal of this level is to create a clas-sification task of intermediate difficulty. For example, thefamily Boeing 737 contains variants 737-200, 737-300,. . . , 737-900. The dataset contains 70 families.

• Manufacturer. A manufacturer is a grouping of familiesproduced by the same company. For example, Boeingcontains the families 707, 727, 737, . . . . The dataset con-tains airplanes made by 30 different manufacturers.

The list of model variants and corresponding example im-ages are given in Fig. 1 and the hierarchy is given in Fig. 2.

FGVC-Aircraft contains 100 example images for each ofthe 100 model variants. The image resolution is about 1-2Mpixels. Image quality varies as images were captured ina span of decades, but it is usually very good. The domi-nant aircraft is generally well centred, which helps focus-ing on fine-grained discrimination rather than object detec-tion. Images are equally divided into training, validation,and test subsets, so that each subset contains either 33 or 34images for each variant. Algorithms should be designed onthe training and validation subsets, and tested just once onthe test subset to avoid over fitting.

Bounding box information can be used for training theaircraft classifiers, but should not be used for testing.

We define three tasks: aircraft variant recognition, air-craft family recognition, and aircraft manufacturer recog-nition. The performance is evaluated as class-normalisedaverage classification accuracy, obtained as the average ofthe diagonal elements of the normalised confusion matrix.Formally, let yi ∈ {1, . . . ,M} the ground truth label forimage i = 1, . . . , N (where N = 10, 000 and M = 100for variant recognition). Let yi be the label estimated auto-matically. The entry Cpq of the confusion matrix is givenby

Cpq =|{i : yi = q ∧ yi = p}||{i : yi = p}|

where | · | denotes the cardinality of a set. The class-normalised average accuracy is then

∑Mp=1 Cpp/M .

The dataset is made publicly available for re-search purposes only at http://www.robots.ox.ac.uk/˜vgg/data/fgvc-aircraft/. Please note(Sect. 3.1) that the data contains images that were gener-ously made available for research purposes by several pho-tographers; however, these images should not be used forany other purpose without obtaining prior and explicit con-sent by the respective authors (see Sect. 5.1 for further de-tails).

2

Page 3: Fine-Grained Visual Classification of Aircraft

Manufacturer Family VariantBoeing Boeing 707 707−320

Boeing 727 727−200Boeing 737 737−200

737−300737−400737−500737−600737−700737−800737−900

Boeing 747 747−100747−200747−300747−400

Boeing 757 757−200757−300

Boeing 767 767−200767−300767−400

Boeing 777 777−200777−300

Airbus A300 A300B4A310 A310A320 A318

A319A320A321

A330 A330−200A330−300

A340 A340−200A340−300A340−500A340−600

A380 A380ATR ATR−42 ATR−42

ATR−72 ATR−72Antonov An−12 An−12

British Aerospace BAE 146 BAE 146−200BAE 146−300

BAE−125 BAE−125Beechcraft Beechcraft 1900 Beechcraft 1900

Boeing Boeing 717 Boeing 717Lockheed Corporation C−130 C−130

Douglas Aircraft Company C−47 C−47Canadair CRJ−200 CRJ−200

CRJ−700 CRJ−700CRJ−900

Cessna Cessna 172 Cessna 172Cessna 208 Cessna 208

Cessna Citation Cessna 525

Manufacturer Family VariantCessna Cessna Citation Cessna 560

Canadair Challenger 600 Challenger 600McDonnell Douglas DC−10 DC−10

Douglas Aircraft Company DC−3 DC−3DC−6 DC−6DC−8 DC−8

McDonnell Douglas DC−9 DC−9−30de Havilland DH−82 DH−82

DHC−1 DHC−1DHC−6 DHC−6Dash 8 DHC−8−100

DHC−8−300Robin DR−400 DR−400

Dornier Dornier 328 Dornier 328Embraer Embraer E−Jet E−170

E−190E−195

EMB−120 EMB−120Embraer ERJ 145 ERJ 135

ERJ 145Embraer Legacy 600 Embraer Legacy 600

Eurofighter Eurofighter Typhoon Eurofighter TyphoonLockheed Martin F−16 F−16A/B

McDonnell Douglas F/A−18 F/A−18Dassault Aviation Falcon 2000 Falcon 2000

Falcon 900 Falcon 900Fokker Fokker 100 Fokker 100

Fokker 50 Fokker 50Fokker 70 Fokker 70

Bombardier Aerospace Global Express Global ExpressGulfstream Aerospace Gulfstream Gulfstream IV

Gulfstream VBritish Aerospace Hawk T1 Hawk T1

Ilyushin Il−76 Il−76Lockheed Corporation L−1011 L−1011

McDonnell Douglas MD−11 MD−11MD−80 MD−80

MD−87MD−90 MD−90

Fairchild Metroliner MetrolinerBeechcraft King Air Model B200

Piper PA−28 PA−28Cirrus Aircraft SR−20 SR−20

Saab Saab 2000 Saab 2000Saab 340 Saab 340

Supermarine Spitfire SpitfirePanavia Tornado TornadoTupolev Tu−134 Tu−134

Tu−154 Tu−154Yakovlev Yak−42 Yak−42

Figure 2. Label hierarchy shown as the manufacturer, family and the variant. Our dataset contains aircrafts of 100 different variantsgrouped under 70 families and 30 manufacturers.

Authorship information is contained in a banner at thebottom of each image (20 pixels high). Do not forget toremove this banner before using the images in experiments.

3. Dataset construction

Identifying the detailed model of an aircraft from animage is challenging for anyone but aircraft experts, andcollecting 10,000 such annotations is daunting in general.Sect. 3.1 explains how leveraging aircraft data collected byaircraft spotters was the key in the construction of FGVC-Aircraft. However, collecting data from a restricted numberof sources presents its own challenges. Sect. 3.2 introducesa notion of diversity and applies it to select a subset of thedata that is maximally uncorrelated. Sect. 3.3 explains howbounding box annotations were crowdsourced using Ama-zon Mechanical Turk, and Sect. 3.4 how the hierarchicallabels were obtained.

3.1. Initial data collection

Enthusiasts, collectors, and other hobbyists may be anexcellent source of annotated visual data. In particular,data obtained from aircraft spotters was instrumental in theconstruction of this FGVC-Aircraft. A large number ofsuch annotated images is available online in Airliners.net(http://www.airliners.net/), a repository of air-

craft spotting data (similar collections exists, for example,for cars and trains). While using such images for researchpurposes may be considered fair use, nevertheless we foundappropriate to ask for explicit permission to the photogra-phers due to the large quantity of data involved. Of abouttwenty photographers that were contacted, permission touse the data for research purposes was granted by about tenof them (Sect. 5.1), and an explicit negative answer was re-ceived only from two of them. FGVC-Aircraft contains dataonly from the photographers that explicitly made their pic-tures available (see Sect. 2 and Sect. 5.1 for further details).

About 70,000 images were downloaded from the tenphotographers, resulting in images spanning thousands ofdifferent aircraft models. Even after grouping these modelsinto variants, there was still a very large number of differentclasses, with a very skewed distribution. Popular familiessuch as Airbus and Boeing included thousand of images permodel variant, whereas rarer models counted only a dozenimages. The 100 most frequent variants were retained, re-sulting in at least 120 images per variant.

3.2. Diversity maximisation

One drawback of relying on a small set of photographersis that unwanted correlation may be introduced in the data.While these photographers tend to be active in the span ofseveral years, it is natural to expect at least regional de-

3

Page 4: Fine-Grained Visual Classification of Aircraft

Model Accuracy Model Accuracy Model AccuracyDR-400 94.1% DHC-8-100 57.6% ERJ 135 35.3%

Eurofighter Typhoon 94.1% Embraer Legacy 600 57.6% 747-100 33.3%F-16A/B 90.9% F/A-18 57.6% 747-300 33.3%

Cessna 172 88.2% 757-300 54.5% 767-200 33.3%SR-20 88.2% 767-400 54.5% 777-200 33.3%

BAE-125 84.8% A340-500 54.5% BAE 146-200 33.3%DH-82 84.8% Cessna 208 54.5% DC-10 33.3%

Tornado 84.8% Challenger 600 54.5% DC-8 33.3%C-130 81.8% E-170 54.5% MD-87 33.3%

Hawk T1 81.8% Gulfstream V 54.5% 737-500 32.4%Model B200 81.8% ATR-42 51.5% 727-200 30.3%

DHC-1 78.8% CRJ-900 51.5% A300B4 30.3%Il-76 76.5% EMB-120 51.5% A330-300 30.3%

An-12 75.8% DC-3 50.0% E-190 29.4%Falcon 900 75.8% DHC-6 50.0% BAE 146-300 26.5%

PA-28 75.8% Tu-134 48.5% 737-700 24.2%Spitfire 70.6% Gulfstream IV 47.1% A340-300 24.2%DC-6 69.7% Tu-154 47.1% MD-80 23.5%E-195 69.7% 737-900 45.5% A310 21.2%

Cessna 560 67.6% Fokker 100 42.4% A319 21.2%Fokker 50 67.6% L-1011 42.4% A330-200 21.2%

Cessna 525 66.7% Boeing 717 41.2% C-47 21.2%Global Express 66.7% CRJ-200 41.2% 747-200 20.6%

Saab 2000 66.7% DHC-8-300 39.4% 737-200 17.6%Yak-42 66.7% ERJ 145 39.4% 737-800 17.6%A318 64.7% ATR-72 38.2% 757-200 17.6%

Falcon 2000 64.7% 707-320 36.4% A320 15.2%Metroliner 64.7% 747-400 36.4% 767-300 14.7%

Beechcraft 1900 63.6% CRJ-700 36.4% DC-9-30 14.7%Dornier 328 63.6% MD-11 36.4% 737-400 12.1%Fokker 70 63.6% MD-90 36.4% A321 11.8%Saab 340 63.6% 777-300 35.3% 737-300 06.1%737-600 57.6% A340-200 35.3%

A380 57.6% A340-600 35.3% Average 48.69%Table 1. Accuracy of variant prediction sorted according to the accuracy for each of the 100 variants in our dataset.

pendencies (for example certain airliners may fly more fre-quently to certain airports). Therefore, the data was firstfiltered to maximise internal diversity. Each pair of imagesfor a given variant was compared based on photographer,time, airliner, and airport, obtaining an “a priori” similarityscore (i.e., without looking at the pictures). Then, 100 im-ages per variant were incrementally and greedily selectedin order of decreasing diversity to the images already addedto the collection. After doing so, images were randomlyassigned to the training, validation, and test subsets. Thissimple procedure was effective at reducing internal correla-tion in the data, as reflected by a substantial reduction of theclassification performance of baseline classifiers. In partic-ular, sequences of photos are broken whenever possible.

Isolating different photographers in different splits wasalso considered as an option, but ultimately it was rejecteddue to the complex dependency structure that such a choice

would have introduced in the data.

3.3. Bounding boxes

About 110 images were initially selected for each variantand submitted to Amazon Mechanical Turk for boundingbox annotation. Annotators were instructed to skip imagesthat did not contain the exterior of an aircraft, so that theseimages could be identified and discarded. Three annotationswere collected for each image, presenting annotators withbatches of 10 images at a time and paying 0.03 USD perbatch. Overall, the cost of annotating all the images was 110USD and annotations were complete in less than 48 hours.Out of three annotations, we sought at least two whose over-lap over union similarity score was above 0.85% (fairly re-strictive in practice), discarding other annotations. The re-maining annotations were then averaged to obtain the finalbounding box, and images without a bounding box (usually

4

Page 5: Fine-Grained Visual Classification of Aircraft

due to a problematic image) were discarded. Since slightlymore than 100 images were submitted for annotation, thiseventually resulted in a sufficient number of validated im-ages.

3.4. Hierarchy

The hierarchy (Fig. 2) was obtained largely by manualinspection. Fortunately, sorting models by name is verylikely to suggest possible merges in a straightforward way.These were verified manually by searching example im-ages, Wikipedia, and the manufacturer websites for clearevidence that two model would differ visually. If no evi-dence was found, then the two models were merged in avariant. Sometimes, differences are fairly subtle; for exam-ple, Boeing variants -200, -300, -400, . . . differ mostly inlength, an attribute that is difficult to estimate from monoc-ular images (in this case counting windows may be the bestway of telling a model from another).

4. Baselines

We consider the classification tasks given in Sect. 2. Forexample, the variant classification for our dataset is a 100-way binary classification problem and performance is mea-sured in term of class-normalised average accuracy as de-scribed earlier.

Fig. 3 shows the confusion matrix for a strong base-line model (non-linear SVM on a χ2 kernel, bag-of-visualwords, 600 k-means words dictionary, multi-scale denseSIFT features, and 1× 1, 2× 2 spatial pyramid [1]). Thesemodels were trained on the entire image ignoring the bound-ing box information. As seen in Tab. 1 the performance isquite good for a few relatively distinctive categories (e.g.,the “Eurofighter Typhoon” has error of just 5.9%). On theother hand, bag-of-visual-words is much worse at pickingup subtle variations, such as for Airbus or Boeing family,resulting in large intra-family confusion (Fig. 3). The over-all accuracy of the classifier is 48.69%.

Fig. 4 shows the accuracy of the classifier when mea-sured on the hierarchical label classification tasks. The ac-curacy for the variant classification is 58.48%, whereas, theaccuracy for manufacturer classification is 71.30%. At thetop level the two manufacturers, Boeing and Airbus, aremost confused with one another perhaps due to the sim-ilar kinds of aircrafts they manufacture – large passengerplanes catering to airliners. Note that for the hierarchicalevaluation we trained our models for the variant classifica-tion task and simply measured the performance at differentlevels of the hierarchy by merging the labels below. An al-ternative strategy, which is to train the models directly forthe labels at a given level in the hierarchy, performed sig-nificantly worse in our experiments.

5. Summary

We have introduced FGVC-Aircraft, a new large datasetof aircraft images for fine-grained visual categorisation.The data contains 10,000 images, 100 airplane model vari-ants, 70 families, and 30 manufacturers. We believe thatFGVC-Aircraft has the potential of introducing aircraftrecognition as a novel domain in FGVC to the wider com-puter vision community (FGVC-Aircraft will be part ofthe ImageNet 2013 FGVC challenge). Compared to otherclasses used frequently in FGVC, aircraft have different andinteresting modes of variation.

Images in FGVC-Aircraft were obtained from aircraftspotter collections, maximising internal diversity in orderto reduce unwanted correlation between images taken by alimited number of photographers; in the future, we plan tosubstantially increase the size of the FGVC-Aircraft datasetby including more models as more and more photographersprovide permission to use their photos, and apply the sameconstruction to other object categories as well.

5.1. Acknowledgments

The creation of this dataset started during the JohnsHopkins CLSP Summer Workshop 2012, Towards a De-tailed Understanding of Objects and Scenes in Natural Im-ages1 with, in alphabetical order, Matthew B. Blaschko,Ross B. Girshick, Juho Kannala, Iasonas Kokkinos, Sid-dharth Mahendran, Subhransu Maji, Sammy Mohamed,Esa Rahtu, Naomi Saphra, Karen Simonyan, Ben Taskar,Andrea Vedaldi, and David Weiss. The CLSP workshop wassupported by the National Science Foundation via Grant No1005411, the Office of the Director of National Intelligencevia the JHU Human Language Technology Center of Ex-cellence; and Google Inc. A special thanks goes to PekkaRantalankila for helping with the creation of the airplanehierarchy.

Many thanks to the photographers that kindly madeavailable their images for research purposes. These are, inalphabetical order, Mick Bajcar, Aldo Bidini, Wim Callaert,Tommy Desmet, Thomas Posch, James Richard Coving-ton, Gerry Stegmeier, Ben Wang, Darren Wilson and Kon-stantin von Wedelstaedt. Please note that images aremade available exclusively for non-commercial researchpurposes. The original authors retain the copyright onthe respective pictures and should be contacted for anyother usage of them. Photographers may be contactedthrough their http://www.airliners.net profilepages, which are linked from http://www.robots.ox.ac.uk/˜vgg/data/fgvc-aircraft/.

1http://www.clsp.jhu.edu/workshops/archive/ws-12/groups/tduosn/.

5

Page 6: Fine-Grained Visual Classification of Aircraft

747−

100

747−

200

747−

300

747−

400

747−100

747−200

747−300

747−400

0

0.2

0.4

0.6

0.8

1

Figure 3. Confusion matrix for the 100 variant classification challenge. Some high confusion, due to the similarity of the models are alsoshown. These correspond to the Boeing 737 family, Boeing 747 family, Airbus family, McDonnell Douglas (MD) and the Embraer family.The average diagonal accuracy is 48.69%.

Confusion matrix: Family classification (58.48 % accuracy)

A300

A310

A320

A330

A340

A380

ATR−4

2AT

R−7

2An

−12

BAE

146

BAE−

125

Beec

hcra

ft 19

00Bo

eing

707

Boei

ng 7

17Bo

eing

727

Boei

ng 7

37Bo

eing

747

Boei

ng 7

57Bo

eing

767

Boei

ng 7

77C−1

30C−4

7C

RJ−

200

CR

J−70

0C

essn

a 17

2C

essn

a 20

8C

essn

a C

itatio

nC

halle

nger

600

DC−1

0D

C−3

DC−6

DC−8

DC−9

DH−8

2D

HC−1

DH

C−6

DR−4

00D

ash

8D

orni

er 3

28EM

B−12

0Em

brae

r E−J

etEm

brae

r ER

J 14

5Em

brae

r Leg

acy

600

Euro

fight

er T

ypho

onF−

16F/

A−18

Falc

on 2

000

Falc

on 9

00Fo

kker

100

Fokk

er 5

0Fo

kker

70

Glo

bal E

xpre

ssG

ulfs

tream

Haw

k T1

Il−76

King

Air

L−10

11M

D−1

1M

D−8

0M

D−9

0M

etro

liner

PA−2

8SR

−20

Saab

200

0Sa

ab 3

40Sp

itfire

Torn

ado

Tu−1

34Tu−1

54Ya

k−42

A300A310A320A330A340A380

ATR−42ATR−72

An−12BAE 146

BAE−125Beechcraft 1900

Boeing 707Boeing 717Boeing 727Boeing 737Boeing 747Boeing 757Boeing 767Boeing 777

C−130C−47

CRJ−200CRJ−700

Cessna 172Cessna 208

Cessna CitationChallenger 600

DC−10DC−3DC−6DC−8DC−9

DH−82DHC−1DHC−6

DR−400Dash 8

Dornier 328EMB−120

Embraer E−JetEmbraer ERJ 145

Embraer Legacy 600Eurofighter Typhoon

F−16F/A−18

Falcon 2000Falcon 900Fokker 100Fokker 50Fokker 70

Global ExpressGulfstream

Hawk T1Il−76

King AirL−1011MD−11MD−80MD−90

MetrolinerPA−28SR−20

Saab 2000Saab 340

SpitfireTornadoTu−134Tu−154Yak−42

Confusion matrix: Manufacturer classification (71.30 % accuracy)AT

RAi

rbus

Anto

nov

Beec

hcra

ftBo

eing

Bom

bard

ier A

eros

pace

Briti

sh A

eros

pace

Can

adai

rC

essn

aC

irrus

Airc

raft

Das

saul

t Avi

atio

nD

orni

erD

ougl

as A

ircra

ft C

ompa

nyEm

brae

rEu

rofig

hter

Fairc

hild

Fokk

erG

ulfs

tream

Aer

ospa

ceIly

ushi

nLo

ckhe

ed C

orpo

ratio

nLo

ckhe

ed M

artin

McD

onne

ll D

ougl

asPa

navi

aPi

per

Rob

inSa

abSu

perm

arin

eTu

pole

vYa

kovl

evde

Hav

illand

ATRAirbus

AntonovBeechcraft

BoeingBombardier Aerospace

British AerospaceCanadair

CessnaCirrus Aircraft

Dassault AviationDornier

Douglas Aircraft CompanyEmbraer

EurofighterFairchild

FokkerGulfstream Aerospace

IlyushinLockheed Corporation

Lockheed MartinMcDonnell Douglas

PanaviaPiperRobinSaab

SupermarineTupolev

Yakovlevde Havilland

747−

100

747−

200

747−

300

747−

400

747−100

747−200

747−300

747−400

0

0.2

0.4

0.6

0.8

1

Figure 4. Confusion matrix for the family (left) and manufacturer (right) classification tasks.

References[1] K. Chatfield, V. Lempitsky, A. Vedaldi, and A. Zisserman. The devil

is in the details: an evaluation of recent feature encoding methods. InProc. BMVC, 2011.

[2] Aditya Khosla, Nityananda Jayadevaprakash, Bangpeng Yao, andLi Fei-Fei. Novel dataset for fine-grained image categorization. InCVPR Workshop on Fine-Grained Visual Categorization, 2011.

[3] J. Liu, A. Kanazawa, D. Jacobs, and P. Belhumeur. Dog breed classi-fication using part localization. In Proc. ECCV, 2012.

[4] O. Parkhi, A. Vedaldi, C. V. Jawahar, and A. Zisserman. Cats vs dogs.In Proc. CVPR, 2012.

[5] C. Wah, S. Branson, P. Welinder, P. Perona, and S. Belongie. Thecaltech-ucsd birds-200-2011 dataset. Technical report, California In-stitute of Technology, 2011.

6