Features-based Object Recognition
Pierre Moreels, California Institute of Technology
Thesis defense, Sept. 24, 2007
2
The recognition continuum
variability
Individual objects
means of transportation
BMW logo
Categories
cars
Applications
Autonomous navigation
Identification, Security.
Help Daiki find his toys!
4
• Problem setup
• Features
• Coarse-to-fine algorithm
• Probabilistic model
• Experiments
• Conclusion
Outline
5
…
The detection problem
New scene (test image)
Models from database
Find models and their pose (location, orientation…)
6
…
Hypotheses – models + positions
New scene (test image)
Models from database
1
2
Θ = affine transformation
7
…
Matching features
Models from database
New scene (test image)
Set of correspondences = assignment vector
8
Features detection
9
Image characterization by features
• Features = high information content
‘locations in the image where the signal changes two-dimensionally’ (C. Schmid)
• Reduce the volume of information
edge strength map
features
– [Sobel 68]
– Diff of Gaussians [Crowley84]
– [Harris 88]
– [Foerstner94]
– Entropy [Kadir&Brady01]
10
Correct vs. incorrect descriptor matches
Mutual Euclidean distances in appearance space of descriptors
- Pixel intensities within a patch
- Steerable filters [Freeman1991]
- SIFT [Lowe1999,2004]
- Shape context [Belongie2002]
- Spin [Johnson1999]
- HOG [Dalal2005]
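As an illustration of matching in appearance space, the sketch below pairs each test descriptor with its nearest database descriptor by Euclidean distance. The distance-ratio check and its threshold `max_ratio` are assumptions added for the example (a common way to discard ambiguous matches), and the function names are hypothetical; the slides do not state the acceptance rule used.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two descriptors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def match_descriptors(test_descs, db_descs, max_ratio=0.8):
    """Return (test_index, db_index, distance) for each accepted match.

    A match is kept only when the nearest database descriptor is clearly
    closer than the second nearest. max_ratio is a hypothetical threshold.
    """
    matches = []
    for i, d in enumerate(test_descs):
        ranked = sorted((euclidean(d, e), j) for j, e in enumerate(db_descs))
        best_dist, best_j = ranked[0]
        if len(ranked) > 1 and best_dist > max_ratio * ranked[1][0]:
            continue  # ambiguous: nearest and second nearest are too similar
        matches.append((i, best_j, best_dist))
    return matches


# Toy 2-D descriptors (real descriptors such as SIFT have many more dimensions).
db = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
test = [[0.9, 0.1], [0.5, 0.5]]
matches = match_descriptors(test, db)
```

In this toy example the first test descriptor matches database entry 1, while the second is equidistant from all entries and is rejected as ambiguous.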
11
Stability with respect to nuisances
Which detector / descriptor combination is best for recognition?
Past work on evaluation of features
• Use of flat surfaces, ground truth easily established
• In 3D images, appearance changes more!
[Schmid&Mohr00] [Mikolajczyk&Schmid 03,05,05]
13
Database : 100 3D objects
14
Testing setup
[Moreels&Perona ICCV05, IJCV07]
Used by [Winder, CVPR07]
Results – viewpoint change
[Figure labels: Mahalanobis distance; No ‘background’ images]
2D vs. 3D
The ranking of detector/descriptor combinations changes when switching from 2D to 3D objects
17
Features matching algorithm
18
Features assignments
models from database
New scene (test image)
. . .
Interpretation
. . .
19
Coarse-to-fine strategy
• We do it every day!
Search for my place: Los Angeles area – Pasadena – Loma Vista – 1351
my car
Coarse-to-fine example
[Fleuret & Geman 2001,2002]
Face identification in complex scenes
Coarse resolution
Intermediate resolution
Fine resolution
21
• Progressively narrow the focus down to the correct region of hypothesis space
• Reject irrelevant regions of the search space at little computational cost
• Use first the information that is easy to obtain
• Simple building blocks organized in a cascade
• Probabilistic interpretation of each step
Coarse-to-Fine detection
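The cascade idea can be sketched as a list of stages, cheapest first, each of which discards hypotheses that fall below a threshold. Everything in the example is hypothetical: the stage names echo the steps of this talk, but the scores are precomputed toy numbers and the thresholds are made up. A real cascade would compute the expensive scores only for the survivors of earlier stages.

```python
def run_cascade(hypotheses, stages):
    """Filter hypotheses through stages given as (name, score_fn, threshold).

    Stages are applied in order, so cheap tests should come first.
    Returns the surviving hypotheses and the survivor count after each stage.
    """
    survivors = list(hypotheses)
    history = []
    for name, score_fn, threshold in stages:
        survivors = [h for h in survivors if score_fn(h) >= threshold]
        history.append((name, len(survivors)))
    return survivors, history


# Hypothetical hypotheses with precomputed toy scores.
hypotheses = [
    {"model": "toy_car", "votes": 12, "bin_votes": 7, "inliers": 6},
    {"model": "teapot", "votes": 9, "bin_votes": 2, "inliers": 0},
    {"model": "mug", "votes": 1, "bin_votes": 1, "inliers": 0},
]

# Hypothetical thresholds, ordered from cheapest to most expensive test.
stages = [
    ("model voting", lambda h: h["votes"], 3),
    ("Hough bin", lambda h: h["bin_votes"], 3),
    ("geometric consistency", lambda h: h["inliers"], 4),
]

survivors, history = run_cascade(hypotheses, stages)
```

Here `history` records how the hypothesis set shrinks from stage to stage, which is the pruning effect the bullets describe.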
22
Coarse data : prior knowledge
• Which objects are likely to be there, and which poses are they likely to have?
unlikely situations
23
New scene (test image)…
Models from database
4 votes
2 votes
0 votes
Model voting
Search tree (appearance space – leaves = database features)
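A minimal sketch of model voting: each test feature looks up its nearest database feature in appearance space and casts one vote for the model that owns it. For brevity the example uses a brute-force nearest-neighbor search in place of the search tree mentioned on the slide. The model names and descriptors are hypothetical.

```python
from collections import Counter


def squared_distance(a, b):
    """Squared Euclidean distance between two descriptors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def vote_for_models(test_descs, db_features):
    """Count one vote per test feature for the model of its nearest database feature.

    db_features is a list of (model_name, descriptor) pairs.
    """
    votes = Counter()
    for d in test_descs:
        model, _ = min(db_features, key=lambda f: squared_distance(d, f[1]))
        votes[model] += 1
    return votes


# Hypothetical database: two features for one model, one each for two others.
db_features = [
    ("toy_car", [0.0, 0.0]),
    ("toy_car", [0.0, 1.0]),
    ("teapot", [5.0, 5.0]),
    ("mug", [9.0, 0.0]),
]
test_descs = [
    [0.1, 0.1], [0.2, 0.9], [0.0, 0.2], [0.1, 1.1],  # near toy_car features
    [4.8, 5.1], [5.2, 4.9],                          # near the teapot feature
]
votes = vote_for_models(test_descs, db_features)
```

With this toy data the three models collect 4, 2 and 0 votes, mirroring the counts shown on the slide. Models with few votes can be dropped before any geometric reasoning.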
24
(x1, y1, s1, θ1)
(x2, y2, s2, θ2)
Transform predicted by this match:
Δx = x2 − x1
Δy = y2 − y1
Δs = s2 / s1
Δθ = θ2 − θ1
Each match is represented by a dot in
the space of 2D similarities (Hough space)
[Figure axes: x, y, s]
Use of rich geometric information
[Lowe1999,2004]
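The transform predicted by a single match can be computed directly from the two feature frames. The sketch below follows the four differences given on the slide. Wrapping the rotation into [−π, π) is an assumption added here so that nearby angles compare sensibly, and the function name is hypothetical.

```python
import math


def predicted_transform(model_feat, scene_feat):
    """Similarity transform (dx, dy, ds, dtheta) predicted by one match.

    Each feature is (x, y, scale, orientation in radians).
    """
    x1, y1, s1, t1 = model_feat
    x2, y2, s2, t2 = scene_feat
    # Assumption: wrap the rotation difference into [-pi, pi).
    dtheta = (t2 - t1 + math.pi) % (2 * math.pi) - math.pi
    return (x2 - x1, y2 - y1, s2 / s1, dtheta)


# One hypothetical match: model feature and matching scene feature.
transform = predicted_transform((10.0, 20.0, 2.0, 0.1), (40.0, 25.0, 4.0, 0.6))
```

Each match therefore becomes one point in the 4-D space of similarity transforms.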
• Prediction of position of model center after transform
• The space of transform parameters is discretized into ‘bins’
• Coarse bins to limit boundary issues and have a low false-alarm rate for this stage
• We count the number of votes collected by each bin.
Coarse Hough transform
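The binning and vote counting described above can be sketched as follows. The bin sizes are hypothetical defaults chosen for the example, not the values used in the thesis. Scale is binned on a log2 axis so that a bin covers a constant scale factor.

```python
import math
from collections import Counter


def hough_bin(transform, xy_bin=50.0, log2_scale_bin=1.0, angle_bin=math.pi / 6):
    """Map a (dx, dy, ds, dtheta) transform to a coarse integer bin index."""
    dx, dy, ds, dtheta = transform
    return (
        math.floor(dx / xy_bin),
        math.floor(dy / xy_bin),
        math.floor(math.log2(ds) / log2_scale_bin),
        math.floor(dtheta / angle_bin),
    )


def count_votes(transforms, **bin_sizes):
    """Count how many match-predicted transforms fall into each bin."""
    return Counter(hough_bin(t, **bin_sizes) for t in transforms)


# Three roughly consistent matches and one outlier (toy values).
transforms = [
    (30.0, 5.0, 2.1, 0.50),
    (35.0, 8.0, 2.4, 0.45),
    (28.0, 2.0, 2.2, 0.40),
    (-120.0, 60.0, 0.5, -1.0),
]
votes = count_votes(transforms)
best_bin, best_count = votes.most_common(1)[0]
```

Coarse bins keep the three consistent matches together despite their small disagreements. With fixed bin edges, matches near a boundary can still be split across neighboring bins, which is the boundary issue the slide mentions.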
Model
Test scene
correct transformation
26
Output of PROSAC: pose transformation + set of feature correspondences
Correspondence or clutter? PROSAC
• Similar to RANSAC – robust statistic for parameter estimation
• Priority to candidates with good quality of appearance match
• 2D affine transform: 6 parameters, so each sample contains 3 candidate correspondences (each correspondence provides 2 equations).
[Fischler 1973] [Chum&Matas 2005]
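The sketch below illustrates the two ideas on this slide: a 2D affine transform fitted exactly from 3 correspondences, and sampling that gives priority to the best appearance matches. It is a simplified, PROSAC-like loop under stated assumptions. The sampling pool simply grows by one candidate per iteration, which is not the growth schedule of the published algorithm, and `tol`, `iters` and all function names are hypothetical.

```python
import random


def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def solve3(a, b):
    """Solve the 3x3 system a x = b by Cramer's rule; None if singular."""
    d = det3(a)
    if abs(d) < 1e-9:
        return None
    out = []
    for col in range(3):
        m = [row[:] for row in a]
        for r in range(3):
            m[r][col] = b[r]
        out.append(det3(m) / d)
    return out


def fit_affine(sample):
    """Fit x' = a*x + b*y + c, y' = d*x + e*y + f from 3 correspondences."""
    a = [[x, y, 1.0] for (x, y), _ in sample]
    row_x = solve3(a, [q[0] for _, q in sample])
    row_y = solve3(a, [q[1] for _, q in sample])
    if row_x is None:
        return None  # the three model points are collinear
    return row_x, row_y


def apply_affine(t, p):
    (a, b, c), (d, e, f) = t
    return (a * p[0] + b * p[1] + c, d * p[0] + e * p[1] + f)


def prosac_like(corrs, tol=2.0, iters=200, seed=0):
    """corrs: ((x, y), (x', y')) pairs sorted by decreasing match quality."""
    rng = random.Random(seed)
    best_t, best_inliers = None, []
    n = 3  # start by sampling only from the best-ranked candidates
    for _ in range(iters):
        sample = rng.sample(corrs[:n], 3)
        n = min(n + 1, len(corrs))  # assumption: grow the pool by one each time
        t = fit_affine(sample)
        if t is None:
            continue
        inliers = []
        for c in corrs:
            px, py = apply_affine(t, c[0])
            if (px - c[1][0]) ** 2 + (py - c[1][1]) ** 2 <= tol ** 2:
                inliers.append(c)
        if len(inliers) > len(best_inliers):
            best_t, best_inliers = t, inliers
    return best_t, best_inliers


# Five correspondences following x' = 2x + 10, y' = 2y - 5, then two clutter matches.
corrs = [
    ((0.0, 0.0), (10.0, -5.0)), ((10.0, 0.0), (30.0, -5.0)),
    ((0.0, 10.0), (10.0, 15.0)), ((10.0, 10.0), (30.0, 15.0)),
    ((5.0, 3.0), (20.0, 1.0)),
    ((3.0, 6.0), (100.0, 100.0)), ((8.0, 1.0), (-50.0, 40.0)),
]
transform, inliers = prosac_like(corrs)
```

The returned pair corresponds to the output named on the slide: a pose transformation plus the set of correspondences consistent with it. The remaining candidates are treated as clutter.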
27
Probabilistic model
28
Generative model
29
Recognition steps
Score of an extended hypothesis
Hypothesis: model + position
Observed features: geometry + appearance
Database of models
Constant
Consistency (after PROSAC)
Prior on model and poses
Features assignments
Votes per model
Votes per model pose bin (Hough transform)
Prior on assignments (before actual observations)
Consistency: consistency between observations and predictions from hypothesis
Model m
Position of model m
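One way to read the labels above is as a Bayes decomposition of the hypothesis score. The sketch below is an assumption about how the terms combine, not the exact expression from the slide. The symbols H (hypothesis), V (assignment vector) and F (observed features) are chosen here for illustration, and 1/P(F) plays the role of the constant.

```latex
P(H, V \mid F) \;=\; \frac{P(F \mid V, H)\, P(V \mid H)\, P(H)}{P(F)}
\;\propto\;
\underbrace{P(F \mid V, H)}_{\text{consistency}}\;
\underbrace{P(V \mid H)}_{\text{prior on assignments}}\;
\underbrace{P(H)}_{\text{prior on model and pose}}
```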
Common-frame approximation: parts are conditionally independent once the reference position of the object is fixed. [Lowe1999, Huttenlocher90, Moreels04]
Constellation model
Common-frame
32
foreground features: geometry, appearance
‘null’ assignments: geometry, appearance
Consistency - appearance
Consistency - geometry
Consistency: consistency between observations and predictions from hypothesis
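Under the common-frame approximation, the consistency term can be sketched as a product over features, split into foreground features and ‘null’ assignments, each with an appearance part and a geometry part. The notation is hypothetical: A_i and X_i stand for the appearance and geometry of scene feature i, V(i) = 0 marks a ‘null’ assignment, and treating appearance and geometry as independent is an assumption made for the sketch.

```latex
P(F \mid V, H) \;\approx\;
\prod_{i \,:\, V(i) \neq 0}
  p_{\mathrm{fg}}\!\left(A_i \mid V(i), H\right)\,
  p_{\mathrm{fg}}\!\left(X_i \mid V(i), H\right)
\prod_{i \,:\, V(i) = 0}
  p_{\mathrm{bg}}(A_i)\, p_{\mathrm{bg}}(X_i)
```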
Learning foreground & background densities
• Ground truth pairs of matches are collected
• Gaussian densities, centered on the nominal value that appearance / pose should have according to H
• Learning background densities is easy: match to random images.
[Moreels&Perona, IJCV, 2007]
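A minimal sketch of how such densities could be learned and used, in one dimension. The foreground density is a Gaussian centered on the nominal value, with its spread estimated from ground-truth match residuals. Modeling the background as a wide zero-mean Gaussian fitted to residuals of matches against random images is an assumption made here for simplicity. All sample values and function names are hypothetical.

```python
import math


def gaussian_logpdf(x, mean, sigma):
    """Log of the 1-D Gaussian density N(x; mean, sigma^2)."""
    return -0.5 * math.log(2 * math.pi * sigma ** 2) - (x - mean) ** 2 / (2 * sigma ** 2)


def fit_sigma(residuals):
    """Maximum-likelihood standard deviation of residuals around zero."""
    return math.sqrt(sum(r * r for r in residuals) / len(residuals))


def log_likelihood_ratio(residual, fg_sigma, bg_sigma):
    """log p_fg(residual) - log p_bg(residual); positive favors foreground."""
    return gaussian_logpdf(residual, 0.0, fg_sigma) - gaussian_logpdf(residual, 0.0, bg_sigma)


# Residual = observed value minus the nominal value predicted by the hypothesis.
fg_residuals = [1.0, -2.0, 0.5, -0.5, 1.5]        # from ground-truth match pairs (toy values)
bg_residuals = [40.0, -35.0, 60.0, -80.0, 15.0]   # from matches to random images (toy values)

fg_sigma = fit_sigma(fg_residuals)
bg_sigma = fit_sigma(bg_residuals)

close_llr = log_likelihood_ratio(1.0, fg_sigma, bg_sigma)   # near the prediction
far_llr = log_likelihood_ratio(40.0, fg_sigma, bg_sigma)    # far from the prediction
```

An observation close to the prediction gets a positive log-likelihood ratio and one far from it gets a negative ratio, which is how each feature contributes evidence for or against the hypothesis.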
34
Experiments
An example
Model voting
Hough bins
36
An example
After PROSAC
Probabilistic scores
37
Efficiency of coarse-to-fine processing
38
Giuseppe Toys database – Models
61 objects, 1-2 views/object
Giuseppe Toys database – Test scenes
141 test scenes
40
Home objects database – Models
49 objects, 1-2 views/object
41
Home objects database – Test scenes
141 test scenes
42
Results – Giuseppe Toys database
Lowe’99,’04
Lower false-alarm rate
- more systematic verification of geometric consistency
- more consistent verification of geometric consistency
Undetected objects: features with poor appearance distinctiveness index to incorrect models
-
+
43
Results – Home objects database
44
Failure mode
Test image hand-labeled before the experiments
45
Test – Text and graphics
46
Test – no texture
Test – Clutter
48
• Coarse-to-fine strategy prunes irrelevant search branches at early stages.
• Probabilistic interpretation of each step.
• Higher performance than Lowe, especially in cluttered environments.
• Front end (features) needs more work for smooth or shiny surfaces.
Conclusions