Evaluation of features detectors and descriptors based on 3D objects
P. Moreels - P. Perona
California Institute of Technology
Features – what for ?
• Large baseline stereo [Tuytelaars & Van Gool ’00]
• Stitching [Brown & Lowe ’03]
• Object recognition [Dorko & Schmid ’05] [Lowe ’04]
Moving the viewpoint
232 keypoints extracted
Features stability is not perfect…
240 keypoints extracted
First stage – feature detector
• Difference of Gaussians [Crowley ’84]
• Kadir & Brady [Kadir ’02]
• Harris [Harris ’88]
• Affine-invariant Harris [Mikolajczyk ’02]
Second stage – feature descriptor
• SIFT [Lowe ’04]
• Steerable filters [Freeman ’91]
• Differential invariants [Schmid ’97]
• Shape context [Belongie ’02]
Evaluations – Mikolajczyk ’03-’05
• Large viewpoint change
• Computation of ground truth positions via a homography
Evaluations – Mikolajczyk ’03-’05
[CVPR’03] [PAMI’04]
[submitted]
• SIFT-based descriptors rule !
• All affine-invariant detectors are good, they should all be used together.
2D vs. 3D
Rankings of detector/descriptor combinations change when switching from 2D to 3D objects
Dataset – 100 3D objects
Viewpoints 45° apart
Ground truth - Epipolar constraints
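To make the idea concrete, here is a minimal sketch of an epipolar check between views, assuming the fundamental matrices are already known from calibration. The function names, the pixel tolerance, and the three example matrices (pure camera translations with identity intrinsics) are hypothetical and chosen only for illustration; they are not taken from the slides.

```python
import math

def epipolar_line(F, x):
    """Line l = F x in the second image, as homogeneous coefficients (a, b, c)."""
    xh = (x[0], x[1], 1.0)
    return tuple(sum(F[r][c] * xh[c] for c in range(3)) for r in range(3))

def point_line_distance(line, p):
    """Perpendicular distance in pixels from point p to the line a*x + b*y + c = 0."""
    a, b, c = line
    return abs(a * p[0] + b * p[1] + c) / math.hypot(a, b)

def satisfies_epipolar_constraint(F, x1, x2, tol_px=2.0):
    """True if x2 lies within tol_px of the epipolar line of x1 (tolerance is an assumption)."""
    return point_line_distance(epipolar_line(F, x1), x2) <= tol_px

def triplet_is_consistent(F_ab, F_ac, F_bc, xa, xb, xc, tol_px=2.0):
    """Accept a triplet only if all three pairwise epipolar constraints hold."""
    return (satisfies_epipolar_constraint(F_ab, xa, xb, tol_px)
            and satisfies_epipolar_constraint(F_ac, xa, xc, tol_px)
            and satisfies_epipolar_constraint(F_bc, xb, xc, tol_px))

# Hypothetical fundamental matrices (illustration only):
F_AB = [[0.0, 0.0, 0.0], [0.0, 0.0, -1.0], [0.0, 1.0, 0.0]]   # epipolar lines y = const
F_AC = [[0.0, 0.0, 1.0], [0.0, 0.0, 0.0], [-1.0, 0.0, 0.0]]   # epipolar lines x = const
F_BC = [[0.0, 0.0, 1.0], [0.0, 0.0, 1.0], [-1.0, -1.0, 0.0]]  # epipolar lines x + y = const

xa, xb = (100.0, 50.0), (90.0, 50.0)
good_xc = (100.0, 40.0)   # consistent with both xa and xb
bad_xc = (100.0, 45.0)    # on the epipolar line of xa, but inconsistent with xb

two_view_accepts_bad = satisfies_epipolar_constraint(F_AC, xa, bad_xc)
three_view_accepts_good = triplet_is_consistent(F_AB, F_AC, F_BC, xa, xb, good_xc)
three_view_accepts_bad = triplet_is_consistent(F_AB, F_AC, F_BC, xa, xb, bad_xc)
```

In this toy configuration the wrong candidate passes the single two-view constraint because it happens to lie on the epipolar line, while the extra constraint from the third view rejects it.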
Testing setup
Unrelated images used to load the database of features.
Distance ratio
• Correct matches are highly distinctive → lower ratio
• Incorrect correspondences are ‘random correspondences’ → low distinctiveness and ratio close to 1
[Lowe’04]
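A minimal sketch of the distance ratio, assuming Euclidean distance between descriptor vectors: the ratio compares the distance to the nearest database feature with the distance to the second nearest. The function names, the toy 2-D "descriptors", and the acceptance threshold are illustrative assumptions, not values from the slides.

```python
import math

def distance_ratio(query, database):
    """Ratio of nearest to second-nearest descriptor distance (needs >= 2 database entries)."""
    dists = sorted(math.dist(query, d) for d in database)
    if dists[1] == 0.0:
        return 1.0  # two identical candidates: maximally ambiguous
    return dists[0] / dists[1]

def is_distinctive(query, database, max_ratio=0.8):
    """Accept the nearest neighbour only if the ratio is below max_ratio (threshold is an assumption)."""
    return distance_ratio(query, database) < max_ratio

query = (0.0, 0.0)
# One close candidate, the others far away: distinctive match, low ratio.
distinct_db = [(1.0, 0.0), (10.0, 0.0), (0.0, 12.0)]
# Two candidates at nearly the same distance: ambiguous match, ratio close to 1.
ambiguous_db = [(1.0, 0.0), (0.0, 1.1), (5.0, 5.0)]

ratio_distinct = distance_ratio(query, distinct_db)
ratio_ambiguous = distance_ratio(query, ambiguous_db)
```

Sweeping the threshold from 0 to 1 trades the number of accepted matches against the fraction of wrong ones, which is one way to trace a detection/false-alarm curve.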
Are we accepting wrong matches ?
• Manual user classification into correct and incorrect triplets
• Comparison with a simpler system: 2 views, only one epipolar constraint.
Pietro said maybe don’t need this slide – I think it is important to justify our 3-cameras setup
Detectors / descriptors tested
Detectors
• Harris
• Hessian
• Harris-affine
• Hessian-affine
• Difference-of-gaussians
• MSER
• Kadir-Brady
Descriptors
• SIFT
• steerable filters
• differential invariants
• shape context
• PCA-SIFT
Results – viewpoint change
[Figure: result plots; panel labels: Mahalanobis distance, No ‘background’ images]
Results – lighting / scale changes
Change in light – result averaged over 3 lighting conditions.
Change in scale - 7.0mm to 14.6mm
Conclusions
• Automated ground truth for 3D objects/scenes
• Ranking changes from 2D to 3D
• Stability is much lower for 3D
• Detectors – affine-rectified detectors are indeed best
• Descriptors – SIFT and shape context performed best.
• Application: use ground truth in order to learn probability densities: ‘what does a correct match look like ?’