2
Outline
• Introduction
• Model-Based Object Recognition
  • AAM
  • Inverse Compositional AAM
• View-Based Object Recognition
  • Recognition based on boundary fragments
  • Recognition based on SIFT
• Proposed Research
• Conclusion and Future Work
3
Introduction
Object Recognition
• The task of finding 3D objects in 2D images (or even video) and classifying them into one of many known object types
• Closely tied to the success of many computer vision applications: robotics, surveillance, registration, etc.
• A difficult problem for which no general and comprehensive solution has yet been found
4
Introduction
Two main streams of approaches:
• Model-Based Object Recognition
  • A 3D model of the object being recognized is available
  • Compare the 2D representation of the object's structure with the 2D projection of the model
• View-Based Object Recognition
  • 2D representations of the same object, viewed at different angles and distances, are available
  • Extract features (as the representation of the object) and compare them to the features in the feature database
5
Introduction
Pros and cons of each main stream:
• Model-Based Object Recognition
  • Pro: model features can be predicted from just a few detected features, based on the geometric constraints
  • Con: the models sacrifice generality
• View-Based Object Recognition
  • Pro: greater generality, and more easily trainable from visual data
  • Con: matching is done by comparing entire objects, so some methods may be sensitive to clutter and occlusion
6
Model-Based Object Recognition
Commonly used in face recognition
General steps:
• Locate the object
• Locate and label its structure
• Adjust the model's parameters until the model generates an image similar enough to the real object
Active Appearance Models (AAMs) have proven to be highly useful models for face recognition
7
Active Appearance Models
They model the shape and the appearance of objects separately
• Shape: the vertex locations of a mesh
• Appearance: the pixel values within the mesh
• Both are modeled with PCA, so that the model generalizes from the training faces to a generic face
Fitting an AAM: a non-linear optimization is applied that iteratively solves for incremental additive updates to the shape and appearance coefficients
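The PCA-based shape and appearance models and the fitting goal can be written out as follows. This is a sketch in the notation commonly used for AAMs; the symbols $s_0$, $s_i$, $p_i$, $A_0$, $A_i$, $\lambda_i$ and the warp $W(\mathbf{x};p)$ are assumptions introduced here and do not appear on the slides.

```latex
\begin{align}
  s &= s_0 + \sum_{i=1}^{n} p_i \, s_i
    && \text{(shape: mean mesh plus PCA shape modes)} \\
  A(\mathbf{x}) &= A_0(\mathbf{x}) + \sum_{i=1}^{m} \lambda_i \, A_i(\mathbf{x})
    && \text{(appearance: mean image plus PCA appearance modes)} \\
  \min_{p,\,\lambda} \; & \sum_{\mathbf{x} \in s_0}
    \Bigl[ A_0(\mathbf{x}) + \sum_{i=1}^{m} \lambda_i \, A_i(\mathbf{x})
           - I\bigl(W(\mathbf{x}; p)\bigr) \Bigr]^2
    && \text{(fitting: match the model to the input image } I\text{)}
\end{align}
```

Here $W(\mathbf{x};p)$ warps a pixel $\mathbf{x}$ of the mean mesh $s_0$ into the input image for shape parameters $p$; fitting searches for the coefficients $p$ and $\lambda$ that make the synthesized appearance agree with the warped image.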
8
Inverse Compositional AAMs
The major difference between these models and AAMs is the fitting algorithm
• AAM: additive incremental updates to the shape and appearance parameters
• ICAAM: inverse compositional update. The algorithm updates the entire warp by composing the current warp with the inverse of the computed incremental warp
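The two update rules can be contrasted in one line each. This is a sketch using assumed notation that is not on the slides: $p$ for the shape parameters, $\Delta p$ for the computed increment, and $W(\mathbf{x};p)$ for the warp.

```latex
\begin{align}
  \text{Additive (AAM):} \quad
    & p \leftarrow p + \Delta p \\
  \text{Inverse compositional (ICAAM):} \quad
    & W(\mathbf{x}; p) \leftarrow W(\mathbf{x}; p) \circ W(\mathbf{x}; \Delta p)^{-1}
\end{align}
```

In the additive rule the parameters themselves are incremented, whereas in the inverse compositional rule the warp as a whole is updated by composition, and the new parameters are then read off from the composed warp.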
9
View-Based Object Recognition
Common approaches:
• Correlation-based template matching (Li, W. et al. 95)
  • SEA, PDE, etc.
  • Not effective when any of the following happens:
    • Illumination of the environment changes
    • Posture and scale of the object change
    • Occlusion
• Color histogram (Swain, M.J. 90)
  • Construct a histogram for an object and match it over the image
  • Robust to changes of viewpoint and to occlusion
  • But it requires good isolation and segmentation of objects
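The color histogram idea can be illustrated with histogram intersection, the matching score usually associated with this approach. The following is a minimal sketch; the function names, the 4 bins per channel, and the toy pixel lists are assumptions made for illustration only.

```python
from collections import Counter

def color_histogram(pixels, bins_per_channel=4):
    """Count pixels per quantized (r, g, b) bin; pixels are triples in 0..255."""
    step = 256 // bins_per_channel
    return Counter((r // step, g // step, b // step) for r, g, b in pixels)

def histogram_intersection(image_hist, model_hist):
    """Fraction of the model histogram that is also present in the image histogram."""
    overlap = sum(min(image_hist[b], count) for b, count in model_hist.items())
    return overlap / sum(model_hist.values())

# Hypothetical toy data: a mostly red model object and two candidate regions.
model = [(250, 10, 10)] * 6 + [(10, 10, 250)] * 2
same_object = [(240, 20, 5)] * 6 + [(5, 20, 240)] * 2   # slightly different shades
other_object = [(10, 250, 10)] * 8                      # a green object

model_hist = color_histogram(model)
score_same = histogram_intersection(color_histogram(same_object), model_hist)
score_other = histogram_intersection(color_histogram(other_object), model_hist)
```

Because only bin counts are compared, the score ignores where the colors appear, which is why the method tolerates viewpoint changes but depends on the object being well segmented from the background.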
10
View-Based Object Recognition
Common approaches:
• Feature-based
  • Extract salient features from the image and match only against those features when searching all locations for matches
  • Feature types: groupings of edges, SIFT, etc.
  • Preferred feature properties:
    • View invariant
    • Detected frequently enough for reliable recognition
    • Distinctive
  • An image descriptor is created from the detected features to improve matching performance
  • Image descriptor = key / index into the database of features
  • Preferred descriptor properties:
    • Invariant to scaling, rotation, illumination, affine transformation and noise
11
Nelson’s Approach
Recognition based on 2D Boundary Fragments
Prepare 53 clean images of each object and build a 3D recognition database
[Figure labels: Object, Camera]
13
Nelson’s Approach
Nelson’s experiments show that his approach achieves high accuracy:
• 97.0% success rate on a 24-object database
under the following conditions:
• Large number of images
• Clean images
• Very different objects
• No occlusion or clutter
14
Lowe’s Approach
Recognition based on the Scale Invariant Feature Transform (SIFT)
• SIFT generates distinctive invariant features
• SIFT-based image descriptors are generally the most resistant to common image deformations (Mikolajczyk 2005)
• SIFT has four steps:
  • Scale-space extrema detection
  • Keypoint localization
  • Orientation assignment
  • Keypoint descriptor computation
15
Scale-space extrema detection
• DoG ≈ LoG (the difference of Gaussians approximates the Laplacian of Gaussian)
• Search over all sample points in all scales and find the extrema, i.e. local maxima or minima in the Laplacian space
• Small keypoints: help with the occlusion problem
• Large keypoints: robust to noise and image blur
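The extrema search can be sketched as a comparison of each sample with its 26 neighbours in a 3x3x3 block spanning the current and the two adjacent scales. The sketch below assumes the difference-of-Gaussians images have already been computed and are stored as nested lists indexed `dog[scale][row][col]`; the function names and this layout are illustrative assumptions.

```python
def is_extremum(dog, s, y, x):
    """True if dog[s][y][x] is strictly above or strictly below all 26 neighbours."""
    value = dog[s][y][x]
    neighbours = [
        dog[s + ds][y + dy][x + dx]
        for ds in (-1, 0, 1)
        for dy in (-1, 0, 1)
        for dx in (-1, 0, 1)
        if (ds, dy, dx) != (0, 0, 0)
    ]
    return value > max(neighbours) or value < min(neighbours)

def find_extrema(dog):
    """Scan all interior samples of the DoG stack and collect (scale, row, col) extrema."""
    found = []
    for s in range(1, len(dog) - 1):
        for y in range(1, len(dog[s]) - 1):
            for x in range(1, len(dog[s][y]) - 1):
                if is_extremum(dog, s, y, x):
                    found.append((s, y, x))
    return found

# Toy stack: 3 scales of 3x3 zeros with a single peak in the middle scale.
dog = [[[0.0] * 3 for _ in range(3)] for _ in range(3)]
dog[1][1][1] = 1.0
extrema = find_extrema(dog)
```

Only interior samples are tested, so the first and last scale and the image border never produce keypoints in this sketch.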
16
Keypoint localization
Reject keypoints with the following properties:
• Low contrast (sensitive to noise)
• Localized along an edge (sliding effect)
Solution:
• Filter out points whose value |D| is below 0.03
• Apply a Hessian-based edge test (ratio of principal curvatures)
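Both rejection tests can be sketched in a few lines. The 0.03 contrast threshold comes from the slide; the edge test below uses the trace and determinant of the 2x2 Hessian of D with a curvature ratio of r = 10, which is the value used in Lowe's paper and is an assumption relative to these slides. The function name and argument layout are hypothetical.

```python
def keep_keypoint(d_value, dxx, dyy, dxy, contrast_threshold=0.03, r=10.0):
    """Return True if a candidate keypoint passes the contrast and edge tests.

    d_value is the DoG value at the keypoint; dxx, dyy, dxy are the second
    derivatives of D there (the entries of the 2x2 Hessian).
    """
    # Low-contrast points are sensitive to noise.
    if abs(d_value) < contrast_threshold:
        return False
    trace = dxx + dyy
    det = dxx * dyy - dxy * dxy
    # Curvatures of opposite sign: not a well-defined extremum.
    if det <= 0:
        return False
    # Edge-like points have one large and one small principal curvature,
    # which makes trace^2 / det large.
    return trace * trace / det < (r + 1) ** 2 / r
```

For example, `keep_keypoint(0.1, 1.0, 1.0, 0.0)` describes a blob-like point with equal curvatures and is kept, while `keep_keypoint(0.1, 100.0, 1.0, 0.0)` describes an edge-like point and is rejected.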
17
Orientation assignment
• Pre-compute the gradient magnitude and orientation at each sample point
• Use them to construct the keypoint descriptor
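The gradient magnitude and orientation can be sketched with central pixel differences on the smoothed image. The function name and the nested-list image layout `image[row][col]` are assumptions made for illustration.

```python
import math

def gradient(image, y, x):
    """Gradient magnitude and orientation (radians) at an interior pixel."""
    dx = image[y][x + 1] - image[y][x - 1]   # horizontal central difference
    dy = image[y + 1][x] - image[y - 1][x]   # vertical central difference
    return math.hypot(dx, dy), math.atan2(dy, dx)

# Toy image: intensity increases by 1 per column, constant along rows.
ramp = [[float(col) for col in range(3)] for _ in range(3)]
magnitude, orientation = gradient(ramp, 1, 1)
```

On this horizontal ramp the gradient points along the positive x direction, so the orientation is 0 and the magnitude equals the intensity difference between the left and right neighbours.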
18
Keypoint descriptor computation
• Create orientation histograms over 4x4 sample regions around the keypoint location
• Each histogram contains 8 orientation bins
• 4x4x8 = 128-element vector (distinctively representing a feature)
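The 4x4x8 layout can be sketched as follows, assuming a 16x16 patch of pre-computed gradient magnitudes and orientations around the keypoint. This is a simplified illustration: it omits the Gaussian weighting, the interpolation between bins, and the rotation of the patch to the keypoint orientation, and the function name is hypothetical.

```python
import math

def descriptor(magnitudes, orientations, cells=4, bins=8):
    """Build a cells*cells*bins vector from a square patch of gradients."""
    size = len(magnitudes)            # e.g. 16 for a 16x16 patch
    cell_size = size // cells         # e.g. 4 pixels per cell side
    vector = [0.0] * (cells * cells * bins)
    for y in range(size):
        for x in range(size):
            cell = (y // cell_size) * cells + (x // cell_size)
            angle = orientations[y][x] % (2 * math.pi)
            b = int(angle / (2 * math.pi) * bins) % bins
            # Each pixel adds its gradient magnitude to one orientation bin.
            vector[cell * bins + b] += magnitudes[y][x]
    # Normalize to unit length to reduce the effect of contrast changes.
    norm = math.sqrt(sum(v * v for v in vector))
    return [v / norm for v in vector] if norm > 0 else vector

# Toy patch: unit gradients that all point in the same direction.
mags = [[1.0] * 16 for _ in range(16)]
oris = [[0.0] * 16 for _ in range(16)]
vec = descriptor(mags, oris)
```

With 4x4 cells and 8 bins per cell the result always has 128 elements, matching the count on the slide.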
19
Object Recognition based on SIFT
• Nearest-neighbor algorithm
  • Matching: assign features to objects
  • There can be many wrong matches
• Solution
  • Identify clusters of features
  • Generalized Hough transform
  • Determine the pose of the object and then discard outliers
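The matching and clustering steps can be sketched as nearest-neighbor lookup followed by Hough-style voting. This is a deliberately reduced illustration: votes are cast only for the object and a coarse rotation bin, whereas a full pose would also include location and scale. The data, the 30-degree bin width, and all names are hypothetical.

```python
from collections import defaultdict

def nearest_neighbor(descriptor, database):
    """Return the database feature with the smallest squared Euclidean distance."""
    def distance(entry):
        return sum((a - b) ** 2 for a, b in zip(descriptor, entry["descriptor"]))
    return min(database, key=distance)

def hough_vote(query_features, database, angle_bin=30.0):
    """Each match votes for (object, rotation bin); return the winning bin and its matches."""
    votes = defaultdict(list)
    for feature in query_features:
        match = nearest_neighbor(feature["descriptor"], database)
        rotation = (feature["orientation"] - match["orientation"]) % 360.0
        votes[(match["object"], int(rotation // angle_bin))].append(feature)
    best = max(votes, key=lambda key: len(votes[key]))
    return best, votes[best]

# Hypothetical 2-element descriptors, for readability only.
database = [
    {"object": "cup", "descriptor": [1.0, 0.0], "orientation": 0.0},
    {"object": "cup", "descriptor": [0.0, 1.0], "orientation": 90.0},
    {"object": "box", "descriptor": [0.7, 0.7], "orientation": 0.0},
]
query = [
    {"descriptor": [0.9, 0.1], "orientation": 40.0},
    {"descriptor": [0.1, 0.9], "orientation": 130.0},
    {"descriptor": [0.6, 0.8], "orientation": 200.0},   # an inconsistent match
]
best_pose, inliers = hough_vote(query, database)
```

The two cup features agree on a rotation of about 40 degrees and fall into the same bin, while the third match votes alone for a different object and rotation and is discarded as an outlier.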
20
Proposed Research
Personally, I think the model-based approach does perform better
The success of the model-based approach requires:
• Models of all objects to be detected
• Automatic construction of models
• Automatic selection of the best model
How does the system know which 3D model to use on a specific image of an object?
• By a view-based approach
• A human looks at an image of an object for a moment and then realizes which model to use for that object
• The specific model is then used to refine the identification of the specific object
21
Hybrid of bottom-up and top-down
The view-based approaches just presented are bottom-up approaches:
• Features: edges, extrema (low level)
• Descriptors of features
• Matching
• Identification of object (high level)
Could it work like this instead?
• Features
• …
• Matching (lower level)
• Guessing of object (higher level)
• Matching (lower level)
• Guessing of object (higher level)
• …
• Identification of object
22
Hierarchy of features
Lowe’s system
• All features have equal weight when voting for an object during identification (to be verified by examining the available source code)
• Special features do not have enough voting power to shift the result to the correct one
• Consider the following scenario:
  • Two objects have many similar features: a1 to a100 are similar to b1 to b100, and each object has just one very different feature, a* for object A and b* for object B
  • Many of a1 to a100 may be poorly captured by the imaging device and mismatched as b1 to b100. Even if we can still recognize the feature a*, the system may still decide that the object is B
[Figure labels: Object A, Object B]
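The scenario can be made concrete with a small voting sketch. It assumes that 60 of the 100 shared features are mismatched to object B, and it uses a hypothetical weight of 50 for the distinctive feature a*; neither number comes from the slides, and the code does not reproduce Lowe's actual voting scheme.

```python
from collections import Counter

def vote(matches, weights=None):
    """Tally (feature, object) matches; features without a weight count as 1."""
    weights = weights or {}
    tally = Counter()
    for feature, obj in matches:
        tally[obj] += weights.get(feature, 1.0)
    return tally.most_common(1)[0][0]

# Object A is in the image, but 60 of its 100 common features match B instead.
matches = [(f"b{i}", "B") for i in range(1, 61)]
matches += [(f"a{i}", "A") for i in range(61, 101)]
matches.append(("a*", "A"))          # the one distinctive feature is found

equal_weight_result = vote(matches)
weighted_result = vote(matches, weights={"a*": 50.0})
```

With equal weights B collects 60 votes against 41 for A, so the wrong object wins; giving the distinctive feature more voting power is enough to shift the result to A.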
23
Extension of SIFT
• Color descriptors
• Local texture measures incorporated into feature descriptors
• Scale-invariant edge groupings
*Generic object class recognition
24
Conclusion and Future Work
• Discussed the different approaches to object recognition
• Discussed what SIFT is and how it works
• Discussed possible extensions to SIFT
• Future work: design the hybrid approach
• Future work: design the extensions