Using Deep Learning to rank and tag millions of hotel images
15/11/2018 - PyParis 2018

Christopher Lennan (Senior Data Scientist) @chris_lennan
Tanuj Jain (Data Scientist) @tjainn
#idealoTech


Agenda


1. idealo.de

2. Business Motivation

3. Models and Training

4. Image Tagging

5. Image Aesthetics

6. Summary


Some Key Facts

More than 18 years of experience

700 “idealos” from 40 nations

Active in 6 different countries (DE, AT, ES, IT, FR, UK)

16 million users/month

50.000 shops

Over 330 million offers for 2 million products

TÜV-certified comparison portal

Germany's 4th largest eCommerce website


Motivation


idealo hotel price comparison
hotel.idealo.de

● 2.306.658 accommodations

● 308.519.299 images

● ~ 133 images per accommodation


Importance of Photography for Hotels


“.. after price, photography is the most important factor for travelers and prospects scanning OTA sites..”

“.. Photography plays a role of 60% in the decision to book with a particular hotel ..”

“.. study published today by TripAdvisor, it would seem like photos have the greatest impact driving engagement from travelers researching on hotel and B&B pages ..”


Image Aesthetics

Current image placement

Position: 19

Position: 1


Image Aesthetics

Current image placement

Position: 17

Position: 3


Beautiful images should appear earlier in the gallery


Ensure different areas get depicted

Example areas: Bedroom, Bathroom, Restaurant, Facade, Fitness Studio, Kitchen


Understanding Image Content

Two-part problem:

1. Tag the image with the hotel property area

2. Predict aesthetic quality
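One way the two parts could feed a gallery ordering is sketched below. The slides do not describe the actual ranking logic, so the `order_gallery` function, the dictionary keys and the round-robin rule are all hypothetical. The idea is simply to show high-scoring images first while cycling through property areas so that different areas appear early.

```python
from collections import defaultdict


def order_gallery(images):
    """Order images so that each round shows the best remaining image per area.

    `images` is a list of dicts with hypothetical keys:
    "id", "tag" (predicted area) and "score" (predicted aesthetic score).
    """
    by_tag = defaultdict(list)
    for img in images:
        by_tag[img["tag"]].append(img)
    # Best image first within each area.
    for group in by_tag.values():
        group.sort(key=lambda img: img["score"], reverse=True)

    ordered = []
    while any(by_tag.values()):
        # One round: take the best remaining image of every area,
        # then show the round's picks from highest to lowest score.
        round_picks = [group.pop(0) for group in by_tag.values() if group]
        round_picks.sort(key=lambda img: img["score"], reverse=True)
        ordered.extend(round_picks)
    return ordered


# Made-up tags and scores for illustration only.
gallery = [
    {"id": "a", "tag": "Bedroom", "score": 6.1},
    {"id": "b", "tag": "Bedroom", "score": 7.4},
    {"id": "c", "tag": "Bathroom", "score": 5.9},
    {"id": "d", "tag": "Facade", "score": 8.1},
]
print([img["id"] for img in order_gallery(gallery)])
```

The second Bedroom image is pushed behind the first Bathroom and Facade images even though its score is higher than the Bathroom one, which is the diversity effect this sketch is meant to illustrate.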


Models & Training


Transfer Learning

● Use a pre-trained CNN that was trained on millions of images (e.g. MobileNet or VGG16)

● Replace the top layers so that the output fits the classification task

● Train both the existing and the new layer weights


Transfer Learning

CNN architecture (VGG16)


Training regime

1. Train only the newly added dense layers with a high learning rate

2. Then train all layers with a low learning rate

Goal: avoid perturbing the pre-trained convolutional weights too much
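The two-phase regime can be illustrated with a framework-free toy, assuming a model with just one "pre-trained" weight and one "new" weight. This is not the authors' training code; the model, data, learning rates and step counts are made up. In a framework such as Keras, the freezing would typically be done through each layer's `trainable` attribute instead of a flag like `train_backbone`.

```python
def mse_and_grads(backbone, head, xs, ys):
    """Return loss and gradients for the toy model y_hat = head * backbone * x."""
    n = len(xs)
    loss = grad_backbone = grad_head = 0.0
    for x, y in zip(xs, ys):
        err = head * backbone * x - y
        loss += err * err / n
        grad_backbone += 2 * err * head * x / n
        grad_head += 2 * err * backbone * x / n
    return loss, grad_backbone, grad_head


def train(backbone, head, xs, ys, lr, steps, train_backbone):
    """Plain gradient descent; the backbone stays frozen unless train_backbone is True."""
    for _ in range(steps):
        _, g_backbone, g_head = mse_and_grads(backbone, head, xs, ys)
        head -= lr * g_head
        if train_backbone:
            backbone -= lr * g_backbone
    return backbone, head


xs = [0.5, 1.0, 1.5, 2.0]
ys = [3.0 * x for x in xs]   # toy target
pretrained_backbone = 2.0    # stands in for pre-trained convolutional weights
new_head = 0.1               # stands in for freshly initialised dense layers

start_loss, _, _ = mse_and_grads(pretrained_backbone, new_head, xs, ys)

# Phase 1: only the new head, high learning rate.
b1, h1 = train(pretrained_backbone, new_head, xs, ys,
               lr=0.05, steps=3, train_backbone=False)
loss1, _, _ = mse_and_grads(b1, h1, xs, ys)

# Phase 2: all weights, low learning rate.
b2, h2 = train(b1, h1, xs, ys, lr=0.001, steps=50, train_backbone=True)
loss2, _, _ = mse_and_grads(b2, h2, xs, ys)

print(start_loss, loss1, loss2, b2)
```

Because the head is already close to a good value when the backbone is unfrozen, the gradients reaching the backbone are small and the pre-trained weight moves only slightly, which is the stated goal of the regime.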


Loss functions

Cross-entropy loss (CEL)

● CEL is generally used for “one-class” ground-truth classifications (e.g. image tagging)

● CEL ignores inter-class relationships between score buckets

source: https://ssq.github.io/2017/02/06/Udacity%20MLND%20Notebook/
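A small sketch of why cross-entropy ignores the ordering of score buckets. The five buckets and both predictions are made up for illustration: each prediction gives the true bucket the same probability, but one puts the remaining mass on a neighbouring bucket and the other on a distant one.

```python
import math


def cross_entropy(true_dist, pred_dist, eps=1e-12):
    """Cross-entropy H(p, q) = -sum_k p_k * log(q_k)."""
    return -sum(p * math.log(q + eps) for p, q in zip(true_dist, pred_dist))


# Ground truth: all mass on the third of five ordered score buckets.
truth = [0.0, 0.0, 1.0, 0.0, 0.0]
# Both predictions give the true bucket 0.6; the rest goes to a near or a far bucket.
near_miss = [0.0, 0.4, 0.6, 0.0, 0.0]
far_miss = [0.4, 0.0, 0.6, 0.0, 0.0]

print(cross_entropy(truth, near_miss), cross_entropy(truth, far_miss))
```

Both losses are identical, because only the probability assigned to the true bucket enters the sum. For aesthetic scores, where being off by one bucket is better than being off by two, this is the limitation the slide points at.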


Loss functions

Earth Mover’s Distance (EMD)

● For ordered classes, classification settings can outperform regressions

● Training on datasets with intrinsic ordering can benefit from an EMD loss objective
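A minimal sketch of an EMD loss over ordered buckets, assuming the normalised form that compares cumulative distributions. The exponent `r=2` follows the NIMA paper and is an assumption here; the slides do not state it. The example distributions are made up.

```python
def emd_loss(true_dist, pred_dist, r=2):
    """Normalised Earth Mover's Distance between two distributions over ordered buckets.

    Compares the cumulative distributions bucket by bucket
    (r=2 is an assumption taken from the NIMA paper, not from the slides).
    """
    n = len(true_dist)
    cdf_true = cdf_pred = 0.0
    total = 0.0
    for p, q in zip(true_dist, pred_dist):
        cdf_true += p
        cdf_pred += q
        total += abs(cdf_true - cdf_pred) ** r
    return (total / n) ** (1.0 / r)


truth = [0.0, 0.0, 1.0, 0.0, 0.0]
near_miss = [0.0, 0.4, 0.6, 0.0, 0.0]   # wrong mass one bucket away
far_miss = [0.4, 0.0, 0.6, 0.0, 0.0]    # wrong mass two buckets away

print(emd_loss(truth, near_miss), emd_loss(truth, far_miss))
```

Unlike cross-entropy, the loss grows with the distance between the misplaced probability mass and the true bucket, so the far miss is penalised more than the near miss.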


GPU training workflow (diagram: Local and AWS)

Setup: Dockerfile → build → Docker image → push → ECR; custom AMI with datasets and nvidia-docker

Train: train script → Docker Machine → launch EC2 GPU instance → pull image → copy existing model → launch training container with nvidia-docker → store train outputs (S3)

Evaluate: evaluation script / Jupyter notebook (SSH) → Docker Machine → launch EC2 GPU instance → pull image → copy existing model → launch evaluation container with nvidia-docker


Image Tagging


Tagging Problem

● Given an image, tag it as belonging to a single class

● Multiclass classification model with classes:
    ○ Bedroom
    ○ Bathroom
    ○ Foyer
    ○ Restaurant
    ○ Swimming Pool
    ○ Kitchen
    ○ View of Exterior (Facade)
    ○ Reception
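As a sketch of the single-label decision, assuming the network ends in a softmax over the eight classes: the logits below are made up, and `tag_image` is a hypothetical helper name, not part of any library.

```python
import math

CLASSES = [
    "Bedroom", "Bathroom", "Foyer", "Restaurant",
    "Swimming Pool", "Kitchen", "View of Exterior (Facade)", "Reception",
]


def softmax(logits):
    """Numerically stable softmax."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def tag_image(logits):
    """Return (class name, probability) for the single most likely class."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return CLASSES[best], probs[best]


# Hypothetical network outputs for one image.
label, prob = tag_image([0.2, 3.1, 0.4, -1.0, 0.0, 0.3, -0.5, 1.2])
print(label, round(prob, 3))
```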


Multiple Datasets

We will go over them one by one and look at:

● Dataset properties

● Results

● Issues


Wellness Dataset

● idealo in-house pre-labelled images

● Mostly pictures of 2- or 3-star properties


Wellness Dataset

● Balanced: Equal sample count in all categories for all sets


Wellness Dataset: Metrics

Top-1 accuracy: 86%
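For reference, a short sketch of how top-1 accuracy and per-class confusions can be computed from true and predicted tags. The label lists are made up for illustration and are unrelated to the reported figure.

```python
from collections import Counter


def top1_accuracy(true_labels, predicted_labels):
    """Share of images whose predicted class equals the true class."""
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)


def confusion_counts(true_labels, predicted_labels):
    """Count (true, predicted) pairs; pairs with differing labels are the confusions."""
    return Counter(zip(true_labels, predicted_labels))


# Made-up labels for illustration only.
y_true = ["Bathroom", "Bathroom", "Reception", "Foyer", "Bedroom"]
y_pred = ["Bathroom", "Reception", "Reception", "Reception", "Bedroom"]

print(top1_accuracy(y_true, y_pred))
print(confusion_counts(y_true, y_pred)[("Bathroom", "Reception")])
```

Looking at the off-diagonal pairs, rather than the single accuracy number, is what surfaces systematic confusions between two specific classes.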


Wellness Dataset: Wrong Predictions

True class of these images: BATHROOM, predicted as: RECEPTION

Rectangular structure = Reception with high probability → BIAS!


Wellness Dataset: Wrong Predictions

True class of these images: BATHROOM

Wrong true label of images → NOISE in the dataset!


Correcting Bias

● Augmentation operations, same for every class:
    ○ Random cropping
    ○ Rotation
    ○ Horizontal flipping

● Data enrichment:
    ○ External data from Google Images
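A dependency-free sketch of two of the augmentation operations, random cropping and horizontal flipping, on an image stored as a list of pixel rows. Rotation is left out for brevity, and the crop size, flip probability and function names are assumptions rather than the values used in the talk.

```python
import random


def horizontal_flip(image):
    """Mirror an image (list of pixel rows) left to right."""
    return [row[::-1] for row in image]


def random_crop(image, crop_h, crop_w, rng):
    """Cut a crop_h x crop_w window at a random position."""
    top = rng.randint(0, len(image) - crop_h)
    left = rng.randint(0, len(image[0]) - crop_w)
    return [row[left:left + crop_w] for row in image[top:top + crop_h]]


def augment(image, crop_h, crop_w, rng):
    """Apply the same random operations regardless of the image's class."""
    out = random_crop(image, crop_h, crop_w, rng)
    if rng.random() < 0.5:
        out = horizontal_flip(out)
    return out


rng = random.Random(0)
image = [[r * 10 + c for c in range(6)] for r in range(4)]  # tiny 4x6 "image"
augmented = augment(image, crop_h=3, crop_w=4, rng=rng)
print(augmented)
```

Applying the same operations to every class matters here: if only one class were cropped or flipped, the augmentation itself could become a cue the model learns.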


Augmented Wellness + Google Dataset: Metrics

Top-1 accuracy: 88%


Gotta Clean!


Cleaning Dataset

● Hand-cleaned each category:

○ Deleted pictures that do not belong in their category

○ Removed duplicates (the presence of duplicates can give us wrong metrics)

○ Added more images from external sources for classes with a small number of images left after cleaning
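The cleaning described above was done by hand. As a complement, byte-identical duplicates could be found automatically with a content hash, as in the sketch below. The file names and bytes are made up, and this approach would miss resized or re-encoded copies.

```python
import hashlib


def remove_exact_duplicates(images):
    """Keep the first occurrence of every distinct image.

    `images` maps a (hypothetical) file name to the raw image bytes.
    Only byte-identical copies are detected.
    """
    seen = set()
    unique = {}
    for name, data in images.items():
        digest = hashlib.sha256(data).hexdigest()
        if digest not in seen:
            seen.add(digest)
            unique[name] = data
    return unique


images = {
    "bathroom_001.jpg": b"\xff\xd8 fake image bytes A",
    "bathroom_002.jpg": b"\xff\xd8 fake image bytes B",
    "bathroom_003.jpg": b"\xff\xd8 fake image bytes A",  # copy of 001
}
print(sorted(remove_exact_duplicates(images)))
```

One plausible reason duplicates distort metrics is that the same image can land in both the training and the test set, so the test score partly measures memorisation.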


Cleaned Data: Metrics

Top-1 accuracy: 91%


Cleaned Dataset: Results

● Bathroom vs. Reception confusion has almost vanished!

● View_of_exterior vs. Pool confusion has been reduced

● Foyer performance:

○ Most misclassifications of Foyer get assigned to Reception

○ This is a hard problem for humans as well!


Foyer or Reception?


Learnings so far

● The model can only be as good as the data (cleaning)

● Foyer is a hard category to predict


Understanding Model Decisions


Understanding Decisions: Class Activation Maps

● Use the penultimate Global Average Pooling (GAP) layer to get the class activation map

● Highlights the discriminative regions that lead to a classification
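A minimal sketch of the class activation map computation, assuming the standard CAM setup in which the last convolutional feature maps go through global average pooling and one dense layer. The feature maps and weights below are made up, and the bias term is ignored.

```python
def class_activation_map(feature_maps, class_weights):
    """Weighted sum of the last convolutional feature maps for one class.

    feature_maps: list of K maps, each an H x W list of lists.
    class_weights: K weights connecting the GAP outputs to the class score.
    """
    height, width = len(feature_maps[0]), len(feature_maps[0][0])
    cam = [[0.0] * width for _ in range(height)]
    for fmap, weight in zip(feature_maps, class_weights):
        for i in range(height):
            for j in range(width):
                cam[i][j] += weight * fmap[i][j]
    return cam


def class_score(feature_maps, class_weights):
    """Class score as computed through global average pooling (no bias term)."""
    score = 0.0
    for fmap, weight in zip(feature_maps, class_weights):
        pooled = sum(sum(row) for row in fmap) / (len(fmap) * len(fmap[0]))
        score += weight * pooled
    return score


# Two made-up 2x2 feature maps and their weights for one class.
maps = [[[1.0, 0.0], [0.0, 0.0]],
        [[0.0, 0.0], [0.0, 2.0]]]
weights = [0.5, 1.5]
cam = class_activation_map(maps, weights)
score = class_score(maps, weights)
print(cam, score)
```

The class score equals the spatial average of the map, so large values in the map mark the regions that contributed most to the decision. In practice the map is upsampled to the input size and overlaid on the image.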


Insights With CAM

Swimming Pool misclassified as Bathroom

Using rails to misidentify Pool as Bathroom.


Insights With CAM

Bathroom correct classification

Using faucets to correctly identify Bathroom.


Learnings so far

● Attribution techniques like CAM lend interpretability

● CAM can drive data collection in specific directions


Tagging Next Steps

1. Add still more data

a. Explore manual tagging options for training (example: Amazon Mechanical Turk)

2. Add more classes

a. Fitness Studio

b. Conference Room

c. Other


Image Aesthetics


Ground Truth Labels

For the NIMA model we need the “true” probability distribution over all classes for each image:

● AVA dataset: we have frequencies over all classes for each image

→ normalize frequencies to get the “true” probability distribution

(6.151 / 1.334)
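The normalisation step can be sketched as follows, together with the mean score and standard deviation that summarise such a distribution. The vote counts are made up; the only assumption carried over from AVA is its 1 to 10 rating scale.

```python
def normalise(frequencies):
    """Turn vote counts per score bucket into a probability distribution."""
    total = sum(frequencies)
    return [f / total for f in frequencies]


def mean_and_std(distribution, scores=None):
    """Mean score and standard deviation of a distribution over score buckets."""
    scores = scores or list(range(1, len(distribution) + 1))
    mean = sum(s * p for s, p in zip(scores, distribution))
    variance = sum(p * (s - mean) ** 2 for s, p in zip(scores, distribution))
    return mean, variance ** 0.5


# Made-up vote counts for scores 1..10.
votes = [0, 1, 3, 10, 30, 60, 50, 30, 12, 4]
dist = normalise(votes)
mean, std = mean_and_std(dist)
print(round(mean, 3), round(std, 3))
```

Ranking images by the mean of the predicted distribution gives a single aesthetic score per image, while the standard deviation indicates how much raters would be expected to disagree.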


Iterations


We have gone through two iterations of the aesthetic model:

● First iteration - Train on AVA Dataset

● Second iteration - Fine-tune first iteration model on in-house labelled data


Results - first iteration

Aesthetic model - MobileNet

Linear correlation coefficient (LCC): 0.5987

Spearman's correlation coefficient (SRCC): 0.6072

Earth Mover's Distance: 0.2018

Accuracy (threshold at 5): 0.74
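For readers unfamiliar with these metrics, here is a dependency-free sketch of how LCC, SRCC and the thresholded accuracy can be computed from true and predicted mean scores. The scores are made up, and treating "threshold at 5" as a strict `> 5` comparison is an assumption.

```python
def pearson(xs, ys):
    """Linear correlation coefficient (LCC)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def ranks(values):
    """1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=values.__getitem__)
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(xs, ys):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(xs), ranks(ys))


def binary_accuracy(true_means, predicted_means, threshold=5.0):
    """Share of images placed on the same side of the threshold."""
    hits = sum((t > threshold) == (p > threshold)
               for t, p in zip(true_means, predicted_means))
    return hits / len(true_means)


# Made-up mean scores for five images.
true_means = [3.2, 4.8, 5.5, 6.1, 7.0]
pred_means = [3.9, 5.2, 5.1, 6.4, 6.6]
print(pearson(true_means, pred_means),
      spearman(true_means, pred_means),
      binary_accuracy(true_means, pred_means))
```

SRCC only looks at the ordering of the images, which is the property that matters most when the scores are used to sort a gallery.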

Examples - first iteration

Aesthetic model

Results - second iteration

● We built a simple labeling application

● http://image-aesthetic-labelling-app-nima.apps.eu.idealo.com/

● ~ 12 people from idealo Reise and Data Science labeled 1000 hotel images for aesthetics

● We fine-tuned the aesthetic model with 800 training images

● We built an aesthetic test dataset with 200 images
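The slides do not say how the 1000 labelled images were divided into 800 training and 200 test images. A seeded random split, as sketched below with hypothetical file names, is one common way to do it reproducibly.

```python
import random


def train_test_split(items, n_test, seed=0):
    """Shuffle reproducibly and hold out n_test items for evaluation."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    return shuffled[n_test:], shuffled[:n_test]


# Hypothetical image identifiers standing in for the 1000 labelled hotel images.
labelled = [f"hotel_{i:04d}.jpg" for i in range(1000)]
train, test = train_test_split(labelled, n_test=200)
print(len(train), len(test))
```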


Results - second iteration

Aesthetic model - MobileNet

Linear correlation coefficient (LCC): 0.7986

Spearman's correlation coefficient (SRCC): 0.7743

Earth Mover's Distance: 0.1236

Accuracy (threshold at 5): 0.85

Examples - second iteration

Aesthetic model

Production

Aesthetic model

● To date we have scored ~280 million images

● Distribution of scores (sample of 1 million scores):


Production - Low Scores Aesthetic model


Production - Medium Scores Aesthetic model


Production - High Scores Aesthetic model


Understanding Model Decisions


Convolutional Filter Visualisations

Layers 23, 51 and 79: MobileNet original vs. MobileNet Aesthetic

Aesthetic Learnings

● Hotel-specific labeled data is key: the aesthetic model improved markedly from 800 additional training samples

● NIMA requires only a few samples to achieve good results (EMD loss)

● Labeled hotel images are also important for the test set (model evaluation)

● Training on GPU significantly reduced training time (~30 fold)


Aesthetics Next Steps

● Continue labeling images for aesthetic classifier

● Introduce new desirable biases in labeling (e.g. low technical quality == low aesthetics)

● Improve prediction speed of models (e.g. lighter CNN architectures)


Summary


● Transfer learning allowed us to train image tagging and aesthetic classifiers with a few thousand domain-specific samples

● We showed the importance of having noise-free data for quality predictions

● Use of attribution & visualization techniques helps to understand model decisions and improve them


Check us out! #idealoTech

https://github.com/idealo

https://medium.com/idealo-tech-blog


We’re hiring!

Data Engineers, DevOps Engineers across different teams

Check out our job postings: jobs.idealo.de


Tanuj Jain @tjainn

Christopher Lennan @chris_lennan


THE END
