34
DeepFix: A Fully Convolutional Neural Network for Predicting Human Fixations Srinivas S S Kruthiventi, Kumar Ayush, and R. Venkatesh Babu (arXiv October 2015) [URL ] Slides by Xavier Giró-i-Nieto , from the Computer Vision Reading Group. (27/10/2015) https://imatge.upc.edu/web/teaching/computer-vision-reading-group

DeepFix: a fully convolutional neural network for predicting human fixations (UPC Reading Group)

Embed Size (px)

Citation preview

Page 1: DeepFix:  a fully convolutional neural network for predicting human fixations (UPC Reading Group)

DeepFix: A Fully ConvolutionalNeural Network for Predicting

Human Fixations

Srinivas S S Kruthiventi, Kumar Ayush, and R. Venkatesh Babu (arXiv October 2015) [URL]

Slides by Xavier Giró-i-Nieto, from the Computer Vision Reading Group. (27/10/2015)https://imatge.upc.edu/web/teaching/computer-vision-reading-group

Page 2: DeepFix:  a fully convolutional neural network for predicting human fixations (UPC Reading Group)

Introduction

2

Page 3: DeepFix:  a fully convolutional neural network for predicting human fixations (UPC Reading Group)

Introduction

3

Bottom-up attention

AutomaticReflexiveStimulus-driven

Page 4: DeepFix:  a fully convolutional neural network for predicting human fixations (UPC Reading Group)

Introduction

4

Top-down attention

Subjective’s prior knowledgeExpectationsTask orientedMemoryBehavioral goals

Page 5: DeepFix:  a fully convolutional neural network for predicting human fixations (UPC Reading Group)

Introduction

5

Visual Attentional Mechanisms

Bottom-upAutomaticReflexiveStimulus-driven

Top-downSubjective’s prior knowledgeExpectationsTask orientedMemoryBehavioral goals

Page 6: DeepFix:  a fully convolutional neural network for predicting human fixations (UPC Reading Group)

Introduction

Page 7: DeepFix:  a fully convolutional neural network for predicting human fixations (UPC Reading Group)

Introduction

7

DeepFixClassic method

Page 8: DeepFix:  a fully convolutional neural network for predicting human fixations (UPC Reading Group)

Introduction

8

mit300 benchmark [URL]

Page 9: DeepFix:  a fully convolutional neural network for predicting human fixations (UPC Reading Group)

Introduction

9

cat200 benchmark [URL]

Page 10: DeepFix:  a fully convolutional neural network for predicting human fixations (UPC Reading Group)

The ingredients

10

Page 11: DeepFix:  a fully convolutional neural network for predicting human fixations (UPC Reading Group)

Very deep network

11

Simonyan, Karen, and Andrew Zisserman. "Very deep convolutional networks for large-scale image recognition." arXiv preprint arXiv:1409.1556 (2014)

● Inspired by Oxford’s VGG net (19 layers).● 20 layers● Small kernel sizes.

Page 12: DeepFix:  a fully convolutional neural network for predicting human fixations (UPC Reading Group)

Fully convolutional network (FCN)

12

● Fully connected layers at the end are replaced by convolutional layers with very large receptive fields.

● They capture the global context of the scene.

● End-to-end training

Long, J., Shelhamer, E., & Darrell, T. (2015). Fully Convolutional Networks for Semantic Segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 3431-3440)

Page 13: DeepFix:  a fully convolutional neural network for predicting human fixations (UPC Reading Group)

13

Inception layers

Szegedy, C., Liu, W., Jia, Y., Sermanet, P., Reed, S., Anguelov, D., ... & Rabinovich, A. (2015). Going Deeper With Convolutions. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 1-9)

● GoogLeNet● Different kernel sizes

operating in parallel.

Page 14: DeepFix:  a fully convolutional neural network for predicting human fixations (UPC Reading Group)

14

Location Biased Convolutional (LBC) layer

● Centre-bias●

Page 15: DeepFix:  a fully convolutional neural network for predicting human fixations (UPC Reading Group)

The network

15

Page 16: DeepFix:  a fully convolutional neural network for predicting human fixations (UPC Reading Group)

Architecture

16

Small convolutional filters of 3x3 with stride of 1 to allow a large depth without increasing the memory requirement

Page 17: DeepFix:  a fully convolutional neural network for predicting human fixations (UPC Reading Group)

Architecture

17

Max pooling layers (in red) reduce computation.

Page 18: DeepFix:  a fully convolutional neural network for predicting human fixations (UPC Reading Group)

Architecture

18

Gradual increase in the amount of channels to progressively learn richer semantic representations: 64, 128, 256, 512...

Page 19: DeepFix:  a fully convolutional neural network for predicting human fixations (UPC Reading Group)

Architecture

19

Weights initialized from VGG-16 net for stable and effective learning

Page 20: DeepFix:  a fully convolutional neural network for predicting human fixations (UPC Reading Group)

Architecture

20

Convolution kernel 3x3 with hole size 2 have a receptive field of 5x5.

Page 21: DeepFix:  a fully convolutional neural network for predicting human fixations (UPC Reading Group)

Architecture

21

Capture multi-scale semantic structure using two inception style convolutional modules

Page 22: DeepFix:  a fully convolutional neural network for predicting human fixations (UPC Reading Group)

Architecture

22

Very large receptive fields of 25x25 by introducing holes of size 6 in kernels

Page 23: DeepFix:  a fully convolutional neural network for predicting human fixations (UPC Reading Group)

Architecture

23

Location Biased Convolutional (LBC) layers

Page 24: DeepFix:  a fully convolutional neural network for predicting human fixations (UPC Reading Group)

Architecture

24

Location Biased Convolutional (LBC) layers

Page 25: DeepFix:  a fully convolutional neural network for predicting human fixations (UPC Reading Group)

Architecture

25

constant during training learnt during training

weights from c’th filter in a convolutional layer

input blob

Page 26: DeepFix:  a fully convolutional neural network for predicting human fixations (UPC Reading Group)

Architecture

26

Final output W/8xH/8 is upsampled.

Page 27: DeepFix:  a fully convolutional neural network for predicting human fixations (UPC Reading Group)

Experiments

27

Page 28: DeepFix:  a fully convolutional neural network for predicting human fixations (UPC Reading Group)

Training

28

2nd stage

MIT 1003

CAT2000Mouse clicks from Microsoft CoCo

Not mentioned how to go from eye fixations to heat mapa !!

Page 29: DeepFix:  a fully convolutional neural network for predicting human fixations (UPC Reading Group)

Training

29

● End to end (as JuntingNet)● Caffeframework● 1 day in K40 GOU!

Page 30: DeepFix:  a fully convolutional neural network for predicting human fixations (UPC Reading Group)

Results

30

Page 31: DeepFix:  a fully convolutional neural network for predicting human fixations (UPC Reading Group)

Results

31

Page 32: DeepFix:  a fully convolutional neural network for predicting human fixations (UPC Reading Group)

Results

32

Page 33: DeepFix:  a fully convolutional neural network for predicting human fixations (UPC Reading Group)

Results

33

Page 34: DeepFix:  a fully convolutional neural network for predicting human fixations (UPC Reading Group)

Results

34