Predicting Human Eye Fixations via an LSTM-based Saliency Attentive Model

Junting Pan

M. Cornia, L. Baraldi, G. Serra, R. Cucchiara. Predicting Human Eye Fixations via an LSTM-based Saliency Attentive Model. arXiv preprint arXiv:1611.09571.

Predicting Human Eye Fixations via an LSTM-based Saliency Attentive Model (UPC Reading Group 2017)


1. Introduction

Outline
1. Introduction
2. Model Architecture
   a. Dilated Residual Convolutional Networks
   b. Attentive Convolutional LSTM
   c. Learned Priors
3. Experimental Evaluation
4. Implementation Details
5. Results
6. Conclusion

2. Model Architecture

2. Model Architecture - DRCN

ResNet-50

conv4 uses dilated rate = 1
conv5 uses dilated rate = 3
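Dilated filters let a network enlarge its receptive field without striding, so the feature map keeps its spatial size. The following is a minimal 1-D NumPy sketch of that idea, not the model's actual layers. The function name `dilated_conv1d` is hypothetical, and `dilation` here is the spacing between adjacent taps (the common convention), which may not match the "rate" convention used on the slide.

```python
import numpy as np

def dilated_conv1d(x, kernel, dilation):
    """'Same'-padded 1-D correlation with taps spaced `dilation` samples apart."""
    k = len(kernel)
    span = dilation * (k - 1)            # distance between first and last tap
    pad_left = span // 2
    pad_right = span - pad_left
    xp = np.pad(x, (pad_left, pad_right))  # zero padding keeps the output length
    out = np.zeros(len(x), dtype=float)
    for i in range(len(x)):
        for j in range(k):
            out[i] += kernel[j] * xp[i + j * dilation]
    return out

x = np.arange(16, dtype=float)
kernel = np.array([1.0, 1.0, 1.0])
for d in (1, 2, 4):
    y = dilated_conv1d(x, kernel, d)
    # Output length equals input length for every dilation (stride is 1).
    print(d, y.shape)
```

Because the stride stays at 1 and only the tap spacing grows, the output has as many samples as the input, whatever the dilation.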

2. Model Architecture - Attentive ConvLSTM

2. Model Architecture - DRCN

A 5x5 filter with dilated rate = 3 has a receptive field of 17x17: counting the rate as the number of holes between adjacent taps, each side spans 5 + 4 x 3 = 17 positions.
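The 17x17 figure can be checked with a few lines of arithmetic. The sketch below assumes the slide's rate counts the holes inserted between adjacent kernel taps, since that reading reproduces 17. Both function names are hypothetical. The second function uses the more common convention, where the dilation factor is the spacing between taps.

```python
def effective_size_from_holes(kernel_size, holes):
    """Side length covered when `holes` zeros sit between adjacent taps."""
    return kernel_size + (kernel_size - 1) * holes

def effective_size_from_dilation(kernel_size, dilation):
    """Side length covered when adjacent taps are `dilation` positions apart."""
    return dilation * (kernel_size - 1) + 1

print(effective_size_from_holes(5, 3))      # 17
print(effective_size_from_dilation(5, 4))   # 17 (3 holes == dilation factor 4)
print(effective_size_from_dilation(5, 3))   # 13
```

Under the tap-spacing convention, a factor of 3 would give 13x13, so the slide's 17x17 corresponds to a spacing of 4.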

3. Experimental Evaluation

- SALICON: 5,000 testing images
- MIT1003: 1003 images
- MIT300: 300 testing images
- CAT2000: 2000 testing images

4. Implementation Details

- Batch size: 10
- Optimizer: RMSprop
- The residual network is initialized with ResNet-50 weights pre-trained on ImageNet
- Loss function: KL-Divergence
- Learning rate: 10e-4, decayed by a factor of 10 every two epochs
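The loss and learning-rate schedule listed above can be sketched in NumPy as follows. This is an illustration under stated assumptions, not the authors' training code: both maps are normalized to sum to 1 before the divergence is computed, and the names `kl_divergence`, `step_decay`, and the `eps` stabilizer are hypothetical. The initial learning rate is left as a parameter, and the value passed in the usage lines is a placeholder.

```python
import numpy as np

def kl_divergence(pred, target, eps=1e-7):
    """KL divergence of a predicted saliency map from a ground-truth map.

    Both maps are treated as probability distributions over pixels.
    """
    pred = pred / (pred.sum() + eps)
    target = target / (target.sum() + eps)
    return float(np.sum(target * np.log(eps + target / (pred + eps))))

def step_decay(initial_lr, epoch, drop=10.0, every=2):
    """Divide the learning rate by `drop` once every `every` epochs."""
    return initial_lr / (drop ** (epoch // every))

uniform = np.ones((4, 4))
peaked = np.zeros((4, 4))
peaked[1, 2] = 1.0

print(kl_divergence(uniform, uniform))   # close to 0 for identical maps
print(kl_divergence(uniform, peaked))    # large when the prediction misses the peak

placeholder_lr = 1e-4  # placeholder value for illustration only
for epoch in range(6):
    print(epoch, step_decay(placeholder_lr, epoch))
```

The divergence is asymmetric: it penalizes a prediction that puts little mass where the ground truth has fixations more than the reverse.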

5. Results - Model Ablation Analysis

[Figure: qualitative comparison. Columns: DRCN | + ConvLSTM | + Priors | Ground Truth]

5. Results - SALICON, MIT300, CAT2000

5. Results

6. Conclusions

- New Saliency Attentive Model for fixation prediction.

- Attentive Convolutional LSTM that is specifically designed to sequentially focus on the most salient regions of input images.

- Residual Architecture with dilated filters that maintains the spatial resolution.

Thanks for your attention!