New directions in saliency research: Developments in architectures, datasets, and evaluation
Tutorial OverviewZoya Bylinskii & Tilke Judd, ECCV Oct. 8, 2016
Ali BorjiCenter for Research in
Computer Vision,University of Central Florida
Zoya BylinskiiComputer Science and
Artificial Intelligence Lab,MIT
Tilke JuddProduct Manager,
GooglePreviously: MIT
MIT Saliency Benchmark Team
Tutorial organized by:
Judd et al. “A Benchmark of Computational Models of Saliency to Predict Human Fixations” [MIT Tech Report 2012]
• 10 saliency models
• 3 baslines
• 3 metrics (AUC, SIM, EMD)
• 1 dataset
• 66 saliency models
• 5 baslines
• 8 metrics
• 2 datasets
2011 2016
large boosts in performance
in recent yearsdriven by neural
network architectures
tracking progress with the MIT Saliency Benchmark
• 66 models evaluated on MIT benchmark (as of Sept 30, 2016)
• top 8/10 are neural networks
• neural networks make up 25% of all submissions (17 models since 2014)
What are the common architectures used for neural network models of saliency?
How are these models trained?
PDP - Jetley et al. [CVPR 2016]
DeepGaze - Kümmerer et al. [ICLR 2015 workshop]
ML-Net - Cornia et al. [ICPR 2016]
SALICON - Huang et al. [ICCV 2015]
DeepFix - Kruthiventi et al. [ArXiv 2015]
eDN - Vig et al. [CVPR 2014]
loss functions & evaluation
datasets
architectures
applications
semanticsegmentation
objectdetection
objectproposals
image clustering & retrieval
cognitivesaliency
Tutorial overview“Deep networks for
saliency map prediction” Naila Murray
tracking progress with the MIT Saliency Benchmark
• 66 models evaluated on MIT benchmark
• using 8 evaluation metrics
• on 2 datasets (MIT300, CAT2000)
tracking progress with the MIT Saliency Benchmark
0 0.2 0.4 0.6 0.8 1159
1317212529333741454953576165
AUC-Judd
0 0.2 0.4 0.6 0.8159
1317212529333741454953576165
sAUC
0 0.5 1 1.5 2 2.5159
1317212529333741454953576165
NSS
0 0.2 0.4 0.6 0.8 11591317212529333741454953576165
CC
0 0.2 0.4 0.6 0.8159
1317212529333741454953576165
SIM
gradual improvementover time
but saturating
tracking progress with the MIT Saliency Benchmarkm
odel
s
0 0.2 0.4 0.6 0.8 1159
1317212529333741454953576165
AUC-Judd
0 0.2 0.4 0.6 0.8159
1317212529333741454953576165
sAUC
0 0.5 1 1.5 2 2.5159
1317212529333741454953576165
NSS
0 0.2 0.4 0.6 0.8 11591317212529333741454953576165
CC
0 0.2 0.4 0.6 0.8159
1317212529333741454953576165
SIM
tracking progress with the MIT Saliency Benchmarkm
odel
s
0 0.2 0.4 0.6 0.8 1159
1317212529333741454953576165
AUC-Judd
0 0.2 0.4 0.6 0.8159
1317212529333741454953576165
sAUC
0 0.5 1 1.5 2 2.5159
1317212529333741454953576165
NSS
0 0.2 0.4 0.6 0.8 11591317212529333741454953576165
CC
0 0.2 0.4 0.6 0.8159
1317212529333741454953576165
SIM
large boosts in performance
in recent yearsdriven by neural
network architectures
tracking progress with the MIT Saliency Benchmarkm
odel
s
As model performances continue to approach ground truth, how can we have
more meaningful evaluation?
loss functions & evaluation
datasets
architectures
applications
semanticsegmentation
objectdetection
objectproposals
image clustering & retrieval
cognitivesaliency
Tutorial overview“Evaluating saliency models in a
probabilistic framework”Matthias Kümmerer
New data collection methodologies
How can we collect enough training data for neural network models of saliency?
Krafka et al. [CVPR 2016]
Papoutsaki et al. [IJCAI 2016]
Xu et al. [ArXiv 2015]
New data collection methodologies
iSUN
Webcam-based
Krafka et al. [CVPR 2016]
Papoutsaki et al. [IJCAI 2016]
Xu et al. [ArXiv 2015]
New data collection methodologies
Kim et al. [CHI EA 2015]
Jiang et al. [CVPR 2015]
iSUN SALICON
Click-basedWebcam-based
What kind of new applications are possiblewith state-of-the-art saliency architectures
and big data?
loss functions & evaluation
datasets
architectures
applications
semanticsegmentation
objectdetection
objectproposals
image clustering & retrieval
cognitivesaliency
Tutorial overview“Saliency for image
understanding and manipulation”Ming-Ming Cheng
Is saliency solved?
Is saliency solved?NO!
“Towards cognitive saliency”Zoya Bylinskii
Tutorial overview
Gaze Text Face Action Main Focus
Inpu
tH
uman
Fi
xatio
nsM
odel
Pr
edic
tions
“Towards cognitive saliency”Zoya Bylinskii
Tutorial overview