Learning Deep Features for Scene Recognition using Places Database · 2017-10-27


Learning Deep Features for Scene Recognition using Places Database

Bolei Zhou, Agata Lapedriza, Jianxiong Xiao, Antonio Torralba, Aude Oliva · NIPS 2014

Bora Çelikkale

INTRODUCTION

Human Visual Recognition

Samples the world several times per second

~Millions of images within a year

INTRODUCTION

Primate Brain

Hierarchical organization in layers of increasing processing complexity

Inspired CNNs

PROBLEM & MOTIVATION

Object classification has achieved astonishing performance with large databases (ImageNet)

Iconic object-centric images do not capture the richness and diversity of visual information in scenes

CONTRIBUTIONS

Scene-centric database 60x larger than SUN

Comparison metrics for scene datasets: density, diversity

SCENE DATASETS

Scene15 (Lazebnik et al. 2006)

15 categories

~3000 imgs

MIT Indoor67 (Quattoni & Torralba 2009)

67 categories of indoor places

15,620 imgs

SUN (Xiao et al. 2010)

397 (well-sampled) categories

130,519 imgs

Places (Zhou et al. 2014)

476 categories

7,076,580 imgs

PLACES DATASET

Same scene categories as SUN

Queries combined with 696 popular English adjectives

Google Images

Bing Images

Flickr

>40M imgs are downloaded


PLACES DATASET

PCA-based duplicate removal within Places and across SUN

Places & SUN therefore contain different images

Allows combining Places & SUN
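The slide only says "PCA-based" duplicate removal, so the exact procedure is not specified here. A minimal sketch of one plausible approach, projecting image features onto the top principal components and dropping images whose codes nearly coincide, could look like this (function names, feature arrays, and the threshold are all illustrative):

```python
import numpy as np

def pca_codes(X, n_components=8):
    """Project feature vectors onto the top principal components."""
    Xc = X - X.mean(axis=0)
    # top principal directions via SVD of the centered data
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T

def dedup(X, thresh=1e-6):
    """Greedily keep images whose PCA code is not within `thresh`
    of any already-kept image; returns the kept indices."""
    codes = pca_codes(X)
    kept = []
    for i, c in enumerate(codes):
        if all(np.linalg.norm(c - codes[j]) > thresh for j in kept):
            kept.append(i)
    return kept
```

The same comparison run across two datasets (e.g. Places codes against SUN codes) would flag cross-dataset duplicates.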

PLACES DATASET

Annotations (with AMT)

Questions (eg: is this a living room?)

Two-round setup:

1. Default answer is NO

2. Default answer is YES

Imgs shown per round: 750, plus 60 control imgs from SUN

Keep only workers with >90% accuracy on the control imgs
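The control-image quality filter can be sketched as follows (a toy sketch; the data layout, with per-worker answer dicts keyed by control-image id, is hypothetical):

```python
def control_accuracy(answers, ground_truth):
    """answers / ground_truth: dicts mapping control image id -> 'yes'/'no'."""
    hits = sum(answers[i] == ground_truth[i] for i in ground_truth)
    return hits / len(ground_truth)

def keep_worker(answers, ground_truth, cutoff=0.9):
    """Accept a worker's HITs only if control accuracy exceeds the cutoff."""
    return control_accuracy(answers, ground_truth) > cutoff
```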

COMPARISON METRICS

Relative Density

Dataset A is denser than B if images in A have more similar nearest neighbors

[Figure: nearest neighbor of a1 in dataset A vs. nearest neighbor of b1 in dataset B]

COMPARISON METRICS

Relative Diversity

Simpson index: the probability that two randomly chosen individuals belong to the same species

[Figure: image pairs from dataset A vs. dataset B]
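Both metrics can be estimated by Monte Carlo sampling of feature distances: relative density compares nearest-neighbor distances between the two datasets, while relative diversity compares random-pair distances in the Simpson-index style. A sketch under those definitions (the feature arrays are placeholders for image descriptors):

```python
import numpy as np

def relative_density(A, B, n_pairs=1000, rng=None):
    """Estimate the probability that a random image from A is closer to its
    nearest neighbor than a random image from B is to its own nearest
    neighbor.  A, B: (n, d) feature arrays."""
    rng = np.random.default_rng(rng)

    def nn_dist(X, i):
        d = np.linalg.norm(X - X[i], axis=1)
        d[i] = np.inf  # exclude the image itself
        return d.min()

    wins = 0
    for _ in range(n_pairs):
        a = rng.integers(len(A))
        b = rng.integers(len(B))
        wins += nn_dist(A, a) < nn_dist(B, b)
    return wins / n_pairs

def relative_diversity(A, B, n_pairs=1000, rng=None):
    """Estimate 1 - p(d(a1, a2) < d(b1, b2)) over RANDOM pairs from each
    dataset: high when random images in A are far apart."""
    rng = np.random.default_rng(rng)
    wins = 0
    for _ in range(n_pairs):
        a1, a2 = rng.choice(len(A), size=2, replace=False)
        b1, b2 = rng.choice(len(B), size=2, replace=False)
        wins += np.linalg.norm(A[a1] - A[a2]) < np.linalg.norm(B[b1] - B[b2])
    return 1 - wins / n_pairs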

EXPERIMENTS

Density & Diversity Comparison (AMT)

1 Relative diversity vs. relative density per each category and dataset

Show 12 pairs of images

Workers select the most similar pair

Diversity: pairs are chosen random for each db

Density: 5th NN (avoid near duplicates) is chosen as pair with GIST

EXPERIMENTS

Cross Dataset Generalization

2 Training and testing across different datasets

ImageNet-CNN and linear SVM

EXPERIMENTS

Comparison with Hand-designed Features

3

EXPERIMENTS

Training CNN for Scene Recognition

2,5M imgs from 205 categories, on AlexNet 4

PLACES-CNNs

Hybrid-AlexNet

Places + ImageNet 3.5M imgs, 1183 categoriesAccuracy = 0.5230 on validation set

Places205-GoogLeNet (on 205 categories)

Accuracy: top1 = 0.5567, top5 = 0.8541 on validation set

Places205-VGG16 (on 205 categories)

Accuracy: top1 = 0.5890, top5 = 0.8770 on validation set

PLACES2 DATASET

400+ unique scene categories

>10M images

AlexNet top1 accuracy: 43.0%

VGG16 top1 accuracy: 47.6%

DEMO

http://places.csail.mit.edu/demo.html

http://places2.csail.mit.edu/demo.html

THANK YOU

Recommended