Evaluation of features detectors and descriptors based on 3D objects P. Moreels - P. Perona California Institute of Technology

Evaluation of features detectors and descriptors based on 3D objects

P. Moreels - P. Perona

California Institute of Technology

Features – what for?

• Large baseline stereo [Tuytelaars & Van Gool ’00]

• Stitching [Lowe ’04] [Brown & Lowe ’03]

• Object recognition [Dorko & Schmid ’05]

Moving the viewpoint

232 keypoints extracted

Features stability

Features stability is not perfect…

240 keypoints extracted

First stage – feature detector

• Difference of Gaussians [Crowley ’84]

• Kadir & Brady [Kadir ’02]

• Harris [Harris ’88]

• Affine-invariant Harris [Mikolajczyk ’02]

Second stage – feature descriptor

• SIFT [Lowe ’04]

• Steerable filters [Freeman ’91]

• Differential invariants [Schmid ’97]

• Shape context [Belongie ’02]

Evaluations – Mikolajczyk ’03-’05

• Large viewpoint change

• Computation of ground truth positions via a homography
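For a planar scene, ground truth via a homography can be sketched as follows. This is only an illustration, not the code used in the cited evaluations: the function names, the 3×3 nested-list representation of `H`, and the pixel tolerance `tol` are assumptions made for the example.

```python
import math

def apply_homography(H, point):
    """Map a 2D point through a 3x3 homography H (nested lists)."""
    x, y = point
    u = H[0][0] * x + H[0][1] * y + H[0][2]
    v = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    # Divide by the homogeneous coordinate to get back to pixel coordinates.
    return (u / w, v / w)

def is_ground_truth_match(H, p1, p2, tol=1.5):
    """True if p2 lies within tol pixels of where H predicts p1 should land.

    tol is a hypothetical tolerance chosen for this sketch.
    """
    px, py = apply_homography(H, p1)
    return math.hypot(px - p2[0], py - p2[1]) <= tol

# Example: a homography that is a pure translation by (10, -5).
H = [[1, 0, 10], [0, 1, -5], [0, 0, 1]]
print(apply_homography(H, (2, 3)))
print(is_ground_truth_match(H, (2, 3), (12.5, -2)))
```

A homography only relates two views of a plane (or views from a rotating camera), which is why this kind of ground truth does not carry over directly to general 3D objects.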

Evaluations – Mikolajczyk ’03-’05

[CVPR ’03] [PAMI ’04] [submitted]

• SIFT-based descriptors rule!

• All affine-invariant detectors are good; they should all be used together.

2D vs. 3D

The ranking of detector/descriptor combinations changes when switching from 2D to 3D objects

Dataset – 100 3D objects

Viewpoints 45° apart




Ground truth - Epipolar constraints
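An epipolar check between one pair of views can be sketched as below. It assumes a fundamental matrix `F` is already known for that pair and measures how far a candidate point in the second image lies from the epipolar line of a point in the first. The function name, the nested-list representation of `F`, and the example matrix are assumptions for illustration; the setup in this talk uses three cameras, so the sketch covers only a single constraint of that kind.

```python
import math

def epipolar_distance(F, x1, x2):
    """Distance (in pixels) from x2 to the epipolar line F @ x1 in image 2.

    F is a 3x3 fundamental matrix given as nested lists; x1 and x2 are
    (x, y) pixel coordinates in image 1 and image 2 respectively.
    """
    x, y = x1
    # Epipolar line l = F * [x, y, 1]^T, written as a*u + b*v + c = 0.
    a, b, c = (F[i][0] * x + F[i][1] * y + F[i][2] for i in range(3))
    norm = math.hypot(a, b)
    if norm == 0:
        raise ValueError("degenerate epipolar line")
    return abs(a * x2[0] + b * x2[1] + c) / norm

# Example: F for a purely horizontal camera translation, where
# corresponding points must lie on the same image row.
F = [[0, 0, 0], [0, 0, -1], [0, 1, 0]]
print(epipolar_distance(F, (10, 20), (35, 20)))  # same row
print(epipolar_distance(F, (10, 20), (35, 23)))  # three rows off
```

A single epipolar constraint only restricts a match to a line, so an incorrect match that happens to lie on that line would still pass.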

Testing setup

Unrelated images used to load the database of features.

Distance ratio

• Correct matches are highly distinctive → lower ratio

• Incorrect correspondences are ‘random correspondences’ → low distinctiveness and ratio close to 1

[Lowe’04]
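The distance ratio can be sketched as the distance to the nearest database descriptor divided by the distance to the second nearest. The snippet below is a minimal illustration using Euclidean distance on small tuples; the function names and the `0.8` threshold are assumptions for the example, not values taken from these slides.

```python
import math

def distance_ratio(query, database):
    """Ratio of nearest to second-nearest descriptor distance.

    query is a descriptor (sequence of floats); database is a list of
    at least two descriptors of the same length.
    """
    dists = sorted(math.dist(query, d) for d in database)
    nearest, second = dists[0], dists[1]
    # If the two best candidates are both exact matches, the match is ambiguous.
    return nearest / second if second > 0 else 1.0

def is_distinctive(query, database, max_ratio=0.8):
    """Accept the nearest neighbour only if the ratio is below max_ratio.

    max_ratio is a hypothetical threshold chosen for this sketch.
    """
    return distance_ratio(query, database) < max_ratio

# Toy 2D "descriptors": distances to the query are 1, 2 and 5.
db = [(1.0, 0.0), (0.0, 2.0), (3.0, 4.0)]
print(distance_ratio((0.0, 0.0), db))
print(is_distinctive((0.0, 0.0), db))
```

A ratio near 1 means the best and second-best candidates are about equally close, which is what a random correspondence tends to look like.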

Are we accepting wrong matches?

• Manual user classification into correct and incorrect triplets

• Comparison with a simpler system: 2 views, only one epipolar constraint.


Detectors / descriptors tested

Detectors

• Harris
• Hessian
• Harris-affine
• Hessian-affine
• Difference-of-Gaussians
• MSER
• Kadir-Brady

Descriptors

• SIFT
• Steerable filters
• Differential invariants
• Shape context
• PCA-SIFT

Results – viewpoint change

Mahalanobis distance

No ‘background’ images

Results – lighting / scale changes

Change in light – result averaged over 3 lighting conditions.

Change in scale - 7.0mm to 14.6mm

Conclusions

• Automated ground truth for 3D objects/scenes

• Ranking changes from 2D to 3D

• Stability is much lower for 3D

• Detectors – affine-rectified detectors are indeed best

• Descriptors – SIFT and shape context performed best.

• Application: use ground truth to learn probability densities: ‘what does a correct match look like?’
