Saliency-based Object Discovery
Simone Frintrop
Cognitive Computer Vision Group, Rheinische Friedrich-Wilhelms-Universität Bonn
13.10.2014
Object Discovery
What is object discovery? (also called: general/generic object detection, object proposal detection)
Find objects without prior knowledge (“What is an object?”). Capture the ‘objectness’.
Infants as young as 5 months can reliably distinguish objects from background! [von Hofsten & Spelke, J. of Exp. Psych. 1985]
Simone Frintrop 2
Object Discovery
Applications:
• Pre-processing for object classification: reduce the number of queries for the recognition program
[Figure labels: Classify; Google Glass]
Object Discovery
Applications:
• Analyzing photos, e.g.:
– Automatic cropping
– Automatic thumbnailing
[Stentiford, ICVS 2007] [Marchesotti et al., ICCV 2009]
Object Discovery
Applications:
• Robotics:
– Detect candidates for manipulation
– Exploration: create database of all objects
Figure labels: Rhino; iCub; PlayBot: a robotic wheelchair for disabled children [Rotenstein et al. 2007]
Object Discovery
In the following, object discovery on 3 types of input data:
1) Photos [Frintrop et al.: ICPR 2014]
2) Videos [Horbert, Martín-García, Frintrop, Leibe, submitted]
3) RGB-D data [Martín-García/Frintrop: CogSci 2013] [Martín-García/Frintrop/Cremers: J. of KI 2013]
From human object perception to an object discovery system
2D Object Discovery
Human perception:
• Object detection takes place before object recognition [Pylyshyn 2001]
• Segmentation processes on all levels of the visual system bundle parts of the visual input [Scholl 2001]. Result: proto-objects (superpixels)
• Proto-objects are combined by focused attention to form coherent objects (attention “grabs” proto-objects [Rensink, 2000]).
• Saliency map in V1? [Zhang et al. 2012]
Attention prioritizes processing
Attention consists of:
• Bottom-up (saliency)
• Top-down
An image region is salient if it automatically attracts human attention
[Diagram labels: Segmentation [Felzenszwalb/Huttenlocher 2004]; Proto-objects; Saliency computation; Saliency map; Object candidate]
Saliency
Saliency systems from our group:
VOCUS: [Frintrop: LNAI 2006]
BITS: [Klein/Frintrop: ICCV 2011]
CoDi: [Klein/Frintrop: DAGM 2012]
Simple CoDi: [Frintrop et al: ICPR 2014]
Most recently: VOCUS 2
[Diagram: Computational Attention Systems. Labels: Input image; Scale representations; Feature 1 ... Feature n; center-surround contrast; Conspicuity maps; Saliency map; Max Finder; Trajectory of FOAs; Inhibition of return; Top-down information; Saliency Systems; step markers 1 to 7]
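The center-surround contrast mentioned above can be illustrated with a minimal sketch. It is not the VOCUS or CoDi implementation: the function names, the square windows, the radii, and the toy image are all assumptions chosen for illustration. Each pixel's response is the absolute difference between the mean of a small center window and the mean of a larger surround window.

```python
def window_mean(img, y, x, radius):
    """Mean intensity in a square window of the given radius, clipped at the border."""
    h, w = len(img), len(img[0])
    ys = range(max(0, y - radius), min(h, y + radius + 1))
    xs = range(max(0, x - radius), min(w, x + radius + 1))
    values = [img[j][i] for j in ys for i in xs]
    return sum(values) / len(values)


def center_surround(img, center_radius=1, surround_radius=3):
    """Per-pixel |center mean - surround mean| (radii are illustrative assumptions)."""
    h, w = len(img), len(img[0])
    return [[abs(window_mean(img, y, x, center_radius)
                 - window_mean(img, y, x, surround_radius))
             for x in range(w)]
            for y in range(h)]


# Toy input (assumption): a bright 2x2 patch on a dark 9x9 background.
image = [[0.0] * 9 for _ in range(9)]
for y in (4, 5):
    for x in (4, 5):
        image[y][x] = 1.0

contrast = center_surround(image)
# Location of the strongest response as (value, (row, col)).
peak = max((v, (y, x)) for y, row in enumerate(contrast) for x, v in enumerate(row))
print("strongest response at", peak[1])
```

Uniform regions give a response of zero, while the small bright patch stands out from its surround. A full saliency system would compute such contrasts for several features and scales and fuse them into one map.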
Saliency on Web Images
MSRA Salient Object dataset [Liu et al., PAMI 2009]
[Klein & Frintrop, ICCV 2011] [Klein & Frintrop, DAGM 2012] [Frintrop, Martín-García, Cremers, ICPR 2014]
[Figure rows: Image; Saliency map]
2D Object Discovery
Human perception:
• Segmentation processes on all levels of the visual system bundle parts of the visual input [Scholl 2001]. Result: proto-objects (superpixels)
• Proto-objects are combined by focused attention to form coherent objects (attention “grabs” proto-objects [Rensink, 2000]).
• Saliency map in V1? [Zhang et al. 2012]
[Diagram labels: Segmentation [Felzenszwalb/Huttenlocher 2004]; Superpixels; Saliency computation; Saliency map; Object candidate]
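How superpixels and a saliency map might be combined into an object candidate can be sketched as follows. This is one plausible selection rule, not necessarily the rule used in the cited work: the function name `object_candidate`, both thresholds, and the toy data are assumptions. A segment joins the candidate if a sufficient fraction of its pixels lies inside the thresholded salient blob.

```python
from collections import Counter


def object_candidate(segments, saliency, blob_thresh=0.5, min_overlap=0.3):
    """Union of segments that sufficiently overlap the salient blob.

    segments: 2D list of integer segment labels (e.g. superpixel ids)
    saliency: 2D list of saliency values in [0, 1], same shape
    Both thresholds are illustrative assumptions.
    """
    size = Counter()    # pixels per segment
    inside = Counter()  # pixels per segment that fall inside the salient blob
    for seg_row, sal_row in zip(segments, saliency):
        for label, s in zip(seg_row, sal_row):
            size[label] += 1
            if s >= blob_thresh:
                inside[label] += 1
    selected = {label for label in size if inside[label] / size[label] >= min_overlap}
    mask = [[label in selected for label in row] for row in segments]
    return selected, mask


# Toy data (assumption): four segments, the salient blob covers segment 2.
segments = [
    [0, 0, 1, 1],
    [0, 2, 2, 1],
    [0, 2, 2, 1],
    [3, 3, 3, 3],
]
saliency = [
    [0.1, 0.1, 0.2, 0.1],
    [0.1, 0.9, 0.8, 0.2],
    [0.0, 0.7, 0.9, 0.6],
    [0.1, 0.2, 0.3, 0.1],
]
selected, mask = object_candidate(segments, saliency)
print("selected segments:", sorted(selected))
```

The segmentation supplies precise boundaries, the saliency map supplies the location, and the candidate inherits both.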
Discovery on Web Images
MSRA Salient Object dataset [Liu et al., PAMI 2009]
[Frintrop, Martín-García, Cremers, ICPR 2014]
[Figure rows: Image; Saliency map; Segmentation; Results: object candidates; Ground truth]
Discovery on Web Images
MSRA Salient Object dataset [Liu et al., PAMI 2009]
[Frintrop, Martín-García, Cremers, ICPR 2014]
[Figure rows: Image; Saliency map; Segmentation; Results: object candidates; Ground truth. Plot labels: Saliency; Saliency + Segmentation]
2D Discovery in Real-world
Cooperation with Bastian Leibe and Esther Horbert: object discovery in real-world indoor sequences
[Diagram labels: Saliency map; Salient blobs; Segmentation; Object candidates]
Computation on 4th pyramid layer
Results 2D Discovery
• Experiments on new sequence-based dataset with RWTH Aachen: 5 sequences of real-world indoor scenarios
• Precision-Recall curves (frame-based):
[Plot legend: Manén et al., ICCV 2013; Alexe et al., PAMI 2012; Arbelaez et al., PAMI 2011]
[Horbert, Martín-García, Frintrop, Leibe 2014 (submitted)]
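A frame-based precision and recall computation could look like the sketch below. The slides do not state the matching criterion, so the bounding-box representation, the intersection-over-union (IoU) threshold of 0.5, and all names are assumptions.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix = max(0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def frame_precision_recall(candidates, ground_truth, thresh=0.5):
    """Precision and recall for one frame (matching rule is an assumption).

    A candidate counts as correct if it overlaps some ground-truth box with
    IoU >= thresh; a ground-truth object counts as found if some candidate
    overlaps it that well.
    """
    correct = sum(any(iou(c, g) >= thresh for g in ground_truth) for c in candidates)
    found = sum(any(iou(c, g) >= thresh for c in candidates) for g in ground_truth)
    precision = correct / len(candidates) if candidates else 0.0
    recall = found / len(ground_truth) if ground_truth else 0.0
    return precision, recall


# Toy frame (assumption): two annotated objects, two candidates, one of them a hit.
gt = [(0, 0, 10, 10), (20, 20, 30, 30)]
cands = [(1, 1, 10, 10), (50, 50, 60, 60)]
precision, recall = frame_precision_recall(cands, gt)
print(precision, recall)
```

Sweeping the number of candidates kept per frame (for example, the top k by saliency) and averaging over frames would give one point per k on a precision-recall curve.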
Results 2D Discovery
How does the recall evolve over time?
At the end of the sequence, we have found 90% of the objects
[Plot legend: Our approach; Maximal possible recall; Manén et al., ICCV 2013; Alexe et al., PAMI 2012; Arbelaez et al., PAMI 2011]
[Horbert, Martín-García, Frintrop, Leibe 2014 (submitted)]
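Recall over a sequence can be tracked cumulatively, as in the sketch below. Two assumptions are made here that the slides do not confirm: an object counts as found once it has been discovered in any frame so far, and "maximal possible recall" is read as the fraction of objects that have been visible in any frame so far. Object ids and function names are hypothetical.

```python
def recall_over_time(found_per_frame, visible_per_frame, total_objects):
    """Cumulative recall and its upper bound after each frame.

    found_per_frame:   list of sets of object ids discovered in each frame
    visible_per_frame: list of sets of object ids visible in each frame
    Returns a list of (recall, max_possible_recall) pairs, one per frame.
    """
    found, seen = set(), set()
    curve = []
    for found_now, visible_now in zip(found_per_frame, visible_per_frame):
        found |= found_now
        seen |= visible_now
        curve.append((len(found) / total_objects, len(seen) / total_objects))
    return curve


# Toy sequence (assumption): three objects, three frames.
visible = [{"cup"}, {"cup", "book"}, {"book", "lamp"}]
found = [set(), {"cup"}, {"lamp"}]
curve = recall_over_time(found, visible, total_objects=3)
for t, (r, r_max) in enumerate(curve):
    print(t, r, r_max)
```

Under these assumptions the recall curve can only rise, and it stays below the upper bound as long as visible objects remain undiscovered.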
Results: Object Discovery
[Video: Esther Horbert]
Cooperation with Bastian Leibe and Esther Horbert: object discovery in real-world indoor sequences
From 2D to 3D
HVS (human visual system): two pathways for object perception [Ungerleider 1982]:
– ventral stream (“what pathway”) processes color & form, responsible for object detection & recognition
– dorsal stream (“where pathway”) processes depth & motion, responsible for spatially localizing objects
[Diagram labels: RGB-D Sensor; Ventral stream; Dorsal stream; Color processing stream: generating object candidates; Depth processing stream: creating 3D map; Incrementally update map with new measurements; Map with 3D object models]
[Martín-García/Frintrop, Proc. of the annual meeting of Cognitive Sciences (CogSci), 2013] [Martín-García/Frintrop/Cremers, German Journal of Artificial Intelligence, 2013]
From Frames to Sequences: Visual Scene Exploration
Strategy to process image sequence:
Two-stage processing as in human vision [Neisser, Cognitive Psychology, 1967] [Treisman, Cognitive Psychology, 1985]:
[Diagram labels: Scene; Parallel, pre-attentive stage (e.g. saliency system); Serial, attentive stage (e.g. recognition)]
• Prioritization: visual attention directs the processing to the regions of most potential interest [Pashler, 1997].
[Frintrop et al: Computational Visual Attention Systems and their Cognitive Foundation: A Survey, ACM Trans. on Applied Perception (TAP), 2010]
Spatial Inhibition of Return
• Inhibition of return (IOR) mechanisms inhibit cells that correspond to previously fixated locations and objects [Posner and Cohen, 1984]. IOR supports orienting towards novelty and enables scene exploration.
[Figure labels: Image; Saliency map; Inhibition]
[Martín-García/Frintrop, Proc. of the annual meeting of Cognitive Sciences (CogSci), 2013] [Martín-García/Frintrop/Cremers, German Journal of Artificial Intelligence, 2013]
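The interplay of a max finder and inhibition of return on a single saliency map can be sketched as below. This is a generic illustration, not the implementation from the cited papers: the function name `scan_path`, the square inhibition region, its radius, and the toy map are assumptions.

```python
def scan_path(saliency, n_fixations=3, radius=1):
    """Return successive foci of attention (FOAs) as (row, col) tuples."""
    sal = [row[:] for row in saliency]  # work on a copy
    h, w = len(sal), len(sal[0])
    foas = []
    for _ in range(n_fixations):
        # Max finder: most salient location that has not been inhibited yet.
        value, y, x = max((sal[j][i], j, i) for j in range(h) for i in range(w))
        if value <= 0:
            break  # nothing salient left
        foas.append((y, x))
        # Inhibition of return: suppress a square region around the FOA
        # (shape and radius are illustrative assumptions).
        for j in range(max(0, y - radius), min(h, y + radius + 1)):
            for i in range(max(0, x - radius), min(w, x + radius + 1)):
                sal[j][i] = 0.0
    return foas


# Toy saliency map (assumption) with three separate salient regions.
saliency_map = [
    [0.1, 0.1, 0.0, 0.0, 0.0],
    [0.1, 0.9, 0.1, 0.0, 0.0],
    [0.0, 0.1, 0.0, 0.0, 0.6],
    [0.0, 0.0, 0.0, 0.5, 0.7],
    [0.3, 0.0, 0.0, 0.0, 0.0],
]
foas = scan_path(saliency_map)
print(foas)
```

Without the inhibition step the max finder would return the same location every time; with it, attention moves on to the next most salient region.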
Spatial Inhibition of Return
• Inhibition of return (IOR) mechanisms inhibit cells that correspond to previously fixated locations and objects [Posner and Cohen, 1984]. IOR supports orienting towards novelty and enables scene exploration.
• IOR happens in spatial coordinates and not in retinotopic coordinates
[Figure labels: Image; Saliency map; Inhibition]
[Martín-García/Frintrop, Proc. of the annual meeting of Cognitive Sciences (CogSci), 2013] [Martín-García/Frintrop/Cremers, German Journal of Artificial Intelligence, 2013]
Spatial Inhibition of Return
• Inhibition of return (IOR) mechanisms inhibit cells that correspond to previously fixated locations and objects [Posner and Cohen, 1984]. IOR supports orienting towards novelty and enables scene exploration.
• IOR happens in spatial coordinates and not in retinotopic coordinates
Each voxel stores inhibition data:
• Inhibition flag
• Inhibition weight
[Diagram labels: Depth processing stream: creating 3D map; Map with 3D object models; Color processing stream; Inhibition map; IOR flags; Object candidates]
[Martín-García/Frintrop, Proc. of the annual meeting of Cognitive Sciences (CogSci), 2013] [Martín-García/Frintrop/Cremers, German Journal of Artificial Intelligence, 2013]
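A voxel map that stores an inhibition flag and an inhibition weight per voxel might be organized as in the sketch below. Only those two fields come from the slide. The class names, the sparse dictionary layout, the voxel size, and the way the weight attenuates saliency are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass
class Voxel:
    inhibited: bool = False  # inhibition flag
    weight: float = 0.0      # inhibition weight in [0, 1]


class InhibitionMap:
    """Sparse voxel grid keyed by integer voxel coordinates (layout is an assumption)."""

    def __init__(self, voxel_size=0.1):
        self.voxel_size = voxel_size
        self.voxels = {}

    def key(self, point):
        """Voxel index of a 3D point (x, y, z) given in world coordinates."""
        return tuple(int(c // self.voxel_size) for c in point)

    def inhibit(self, points, weight=1.0):
        """Flag the voxels containing the given 3D points as already attended."""
        for p in points:
            voxel = self.voxels.setdefault(self.key(p), Voxel())
            voxel.inhibited = True
            voxel.weight = max(voxel.weight, weight)

    def weight_at(self, point):
        voxel = self.voxels.get(self.key(point))
        return voxel.weight if voxel is not None and voxel.inhibited else 0.0


def inhibited_saliency(saliency, points_3d, imap):
    """Attenuate each pixel's saliency by the inhibition weight of its 3D point."""
    return [[s * (1.0 - imap.weight_at(p)) for s, p in zip(sal_row, pt_row)]
            for sal_row, pt_row in zip(saliency, points_3d)]


imap = InhibitionMap()
imap.inhibit([(0.12, 0.03, 1.04)])  # object attended in an earlier frame

# Later frame (toy data): the first pixel sees the same world location,
# the second pixel sees a different one.
saliency = [[0.8, 0.6]]
points_3d = [[(0.15, 0.08, 1.07), (0.55, 0.03, 1.04)]]
result = inhibited_saliency(saliency, points_3d, imap)
print(result)
```

Because inhibition is attached to world coordinates, it stays on the attended object when the camera moves, which is the point of performing IOR spatially rather than retinotopically.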
3D Object Discovery
[Martín-García/Frintrop, Proc. of the annual meeting of Cognitive Sciences (CogSci), 2013] [Martín-García/Frintrop/Cremers, German Journal of Artificial Intelligence, 2013]
Object Discovery
Summary: our object discovery method performs well on:
1) Photos [Frintrop et al.: ICPR 2014]
2) Videos [Horbert, Martín-García, Frintrop, Leibe, submitted]
3) RGB-D data [Martín-García/Frintrop: CogSci 2013] [Martín-García/Frintrop/Cremers: J. of KI 2013]