Download pdf - Feature Detection: Correlation-based Matchingcse399b/SP05/lectures/features.pdf · 1 1 Feature Detection: Correlation-based Matching Goal : Develop matching procedures that can recognize

1

1

Feature Detection: Correlation-based Matching

� Goal: Develop matching procedures that can recognize possibly partially-occluded objects or features specified as patterns of intensity values

� Template matching � gray level correlation

� edge correlation

� Hough Transform

� Chamfer Matching

2

Applications

� Feature detectors� Line detectors

� Corner detectors

� Spot detectors

� Known shapes� Character fonts

� Faces

� Stereo

� Motion tracking

- + -+

-

2

3

Harris Corner Detector

C. Harris, M. Stephens, “A Combined Corner and Edge Detector,” 1988

4

The Basic Idea

� We should easily recognize the point by looking through a small window

� Shifting a window in any directionshould give a large changein response

3

5

Harris Detector: Basic Idea

“flat” region:no change in all directions

“edge”:no change along the edge direction

“corner”:significant change in all directions

6

Harris Detector: Mathematics

[ ]2

,

( , ) ( , ) ( , ) ( , )x y

E u v w x y I x u y v I x y= + + −∑

Change of intensity for the shift [u,v]:

IntensityShifted intensity

Window function

orWindow function w(x,y)=

Gaussian1 in window, 0 outside

4

7


[ ]( , ) ,u

E u v u v Mv

≅

For small shifts [u,v] we have a bilinear approximation:

2

2,

( , ) x x y

x y x y y

I I IM w x y

I I I

=

∑

where M is a 2×2 matrix computed from image derivatives:

8


[ ]( , ) ,u

E u v u v Mv

≅

Intensity change in shifting window: eigenvalue analysis

λ1, λ2 – eigenvalues of M

direction of the slowest change

direction of the fastest change

(λmax)-1/2

(λmin)-1/2

Ellipse E(u,v) = const

5

9


λ1

λ2

“Corner”λ1 and λ2 are large,

λ1 ~ λ2;

E increases in all directions

λ1 and λ2 are small;

E is almost constant in all directions

“Edge”λ1 >> λ2

“Edge”λ2 >> λ1

“Flat”region

Classification of image points using eigenvalues of M:

10


Measure of corner response:

( )2det traceR M k M= −

1 2

1 2

det

trace

M

M

λ λλ λ

== +

(k is an empirically determined constant; k = 0.04 - 0.06)

6

11


λ1

λ2 “Corner”

“Edge”

“Edge”

“Flat”

• Rdepends only on eigenvalues of M

• R is large for a corner

• R is negative with large magnitude for an edge

• |R| is small for a flatregion

R> 0

R< 0

R< 0|R| small

12

Harris Detector

� The Algorithm:

� Find points with large corner response function R

(R > threshold)

� Take the points of local maxima of R

7

13

Harris Detector: Example

14

Harris Detector: ExampleCompute corner response R

8

15

Harris Detector: ExampleFind points with large corner response: R>threshold

16

Harris Detector: ExampleTake only the points of local maxima of R

9

17

Harris Detector: Example

18

Harris Detector: Summary

� Average intensity change in direction [u,v] can be expressed as a bilinear form:

� Describe a point in terms of eigenvalues of M:measure of corner response

� A good (corner) point should have a large intensity changein all directions, i.e., Rshould be a large positive value

[ ]( , ) ,u

E u v u v Mv

≅

( )2

1 2 1 2R kλ λ λ λ= − +

10

19

Harris Detector: Some Properties

� Rotation invariance

Ellipse rotates but its shape (i.e., eigenvalues) remains the same

Corner response R is invariant to image rotation

20


� Partial invariance to affine intensitychange

� Only derivatives are used => invariance to intensity shift I → I + b

� Intensity scale: I → a I

R

x (image coordinate)

threshold

R

x (image coordinate)

11

21


� But not invariant to image scale

Fine scale: All points will be classified as edges

Coarse scale:Corner

22


� Quality of Harris detector for different scale changes

Repeatability rate:# correspondences

# possible correspondences

C. Schmid et al., “Evaluation of Interest Point Detectors,” IJCV 2000

12

23

Tomasi and Kanade’s Corner Detector

� Idea: Intensity surface has 2 directions with significant intensity discontinuities

� Image gradient [Ix, Iy]T gives information about direction and magnitude of one direction, but not two

� Compute 2 x 2 matrix

where Q is a 2n+1 x 2n+1 neighborhood of a given point p

=∑∑

∑∑

Qy

Qyx

Qyx

Qx

III

III

M 2

2

24

Corner Detection (cont.)

� DiagonalizeM converting it to the form

� Eigenvaluesλ1 and λ2 , λ1 ≥ λ2, give measure of the edge strength (i.e., magnitude) of the two strongest, perpendicular edge directions (specified by the eigenvectors of M)

� If λ1 ≈ λ2 ≈ 0, then p’s neighborhood is approximately constant intensity

� If λ1 > 0 and λ2 ≈ 0, then single step edge in neighborhood of p

� If λ2 > threshold and no other point within p’s neighborhood has greater value of λ2, then mark p as a corner point

=

2

1

0

0

λλ

M

13

25

Tomasi and Kanade Corner Algorithm

� Compute the image gradient over entire image f

� For each image point p:� form the matrix M over (2N+1) x (2N+1) neighborhood Q of p

� compute the smallest eigenvalue of M

� if eigenvalue is above some threshold, save the coordinates of p into a list L

� Sort L in decreasing order of eigenvalues

� Scanning the sorted list top to bottom: For each current point, p, delete all other points on the list that belong to the neighborhood of p

26

Results

14

27

Results

28

Results

15

29

Moravec’s Interest Operator

� Compute four directional variances in horizontal, vertical, diagonal and anti-diagonal directions for each 4 x 4 window

� If the minimum of four directional variances is a local maximum in a 12 x 12 overlapping neighborhood, then that window (point) is “interesting”

30

∑∑

∑∑

∑∑

∑∑

= =

= =

= =

= =

++−+−++=

++++−++=

+++−++=

+++−++=

2

0

3

1

2

2

0

2

0

2

2

0

3

0

2

3

0

2

0

2

))1,1(),((

))1,1(),((

))1,(),((

)),1(),((

j ia

j id

j iv

j ih

jyixPjyixPV

jyixPjyixPV

jyixPjyixPV

jyixPjyixPV

16

31

=

=

otherwise,0

max local a is),( if,1),(

)),(),,(),,(),,(min(),(

yxVyxI

yxVyxVyxVyxVyxV advh

32

17

33

Scale Invariant Detection

� Consider regions (e.g., circles) of different sizes around a point

� Regions of corresponding sizes will look the same in both images

34


� The problem: How do we choose corresponding circles independently in each image?

18

35


� Solution:

� Design a function on the region (circle) that is “scale invariant,” i.e., the same for corresponding regions, even if they are at different scales

Example: Average intensity. For corresponding regions (even of different sizes) it will be the same

scale = 1/2

– For a point in one image, we can consider it as a function of

region size (circle radius)

f

region size

Image 1 f

region size

Image 2

36


� Common approach:

scale = 1/2

f

region size

Image 1 f

region size

Image 2

Take a local maximum of this function

Observation: Region size, for which the maximum is achieved, should be invariant to image scale

s1 s2

Important: This scale invariant region size is found in each image independently!

19

37


� A “good” function for scale detection has one stable sharp peak

� For usual images: a good function would be a one which responds to contrast (sharp local intensity change

f

region size

bad

f

region size

bad

f

region size

Good !

38

Scale Invariance

Requires a method to repeatably select points in location and scale:

� The only reasonable scale-space kernel is a Gaussian (Koenderink, 1984; Lindeberg, 1994)

� An efficient choice is to detect peaks in the difference of Gaussian pyramid (Burt & Adelson, 1983; Crowley & Parker, 1984 – but examining more scales)

� Difference-of-Gaussian with constant ratio of scales is a close approximation to Lindeberg’s scale-normalized Laplacian (can be shown from the heat diffusion equation)

Blur

Subtract

Blur

Subtract

20

39

SIFT Operator: Scale Space Processed One Octave (i.e., Doubling σσσσ) at a Time

Gσ ⊗ I

Gσ ⊗ Gσ = G√2σ

(Gσ)3 = G2σ

(Gσ)4 = G2√2σ

G2σ ⊗ I

(Gσ)4 = G4σ

40

SIFT: Key Point Localization

� Detect maxima and minima of difference-of-Gaussian in scale space

� Fit a quadratic to surrounding values for sub-pixel and sub-scale interpolation (Brown & Lowe, 2002)

� Taylor expansion around point:

� Offset of extremum (use finite differences for derivatives):

Blur

Subtract

21

43

SIFT: Select Canonical Orientation

� Create histogram of local gradient directions computed at selected scale

� Assign canonical orientation at peak of smoothed histogram

� Each key specifies stable 2D coordinates (x, y, scale, orientation)

0 2π

44

Example of Keypoint Detection

Threshold on value at DOG peak and on ratio of principle curvatures (Harris approach)

(a) 233x189 image(b) 832 DOG extrema(c) 729 left after peak

value threshold(d) 536 left after testing

ratio of principlecurvatures

22

45

Creating Features Stable to Viewpoint Change

� Edelman, Intrator & Poggio (97) showed that complex cell outputs are better for 3D recognition than simple correlation

46

Stability to Viewpoint Change

� Classification of rotated 3D models (Edelman 97):

� Complex cells: 94% vs simple cells: 35%

23

47

SIFT Vector Formation

� Thresholded image gradients are sampled over 16 x 16 array of locations in scale space around each keypoint position

� Create array of orientation histograms from Gaussian-weighted gradients in 4 x 4 region

� 8 orientations x 4x4 histogram array = 128 dimensions

48

Feature Stability to Noise

� Match features after random change in image scale and orientation, with differing levels of image noise

� Find nearest neighbor in database of 30,000 features

24

49

Feature Stability to Affine Change

� Match features after random change in image scale and orientation, with 2% image noise, and affine distortion

� Find nearest neighbor in database of 30,000 features

50

Distinctiveness of Features

� Vary size of database of features, with 30 degree affine change,2% image noise

� Measure % correct for single nearest-neighbor match

25

51

Image Correlation

� Given:� n x n image, M, of an object of interest, called a template

� n x n image, N, that possibly contains that object (usually a window of a larger image)

� Goal: Develop functions that compare images M and Nand measure their similarity� Sum-of-Squared-Difference (SSD):

� (Normalized) Cross-Correlation (CC):

∑∑ ∑∑

∑∑

= = = =

= =

−−= n

i

n

j

n

i

n

j

n

i

n

j

jiNjiM

jiNljkiM

lkCC

1 1 1 1

2/122

1 1

]),(),([

),(),(

),(

SSD(k,l) = ∑∑ [M(I-k, j-l) - N(i, j)]2

52

Sum-of-Squared-Difference (SSD)

� Perfect match: SSD = 0

� If N = M + c, SSD = c2n2, so sensitive to constant

illumination change in image N. Fix by grayscale

normalization of N before SSD

26

53

Cross-Correlation (CC)

� CC measure takes on values in the range [0, 1] (or [0, √ ∑∑M2] if first term in denominator removed)� it is 1 if and only if N = cM for some constant c

� so N can be uniformly brighter or darker than the template, M, and the correlation will still be high

� SSD is sensitive to these differences in overall brightness

� The first term in the denominator, ΣΣM2, depends only on the template, and can be ignored because it is constant

� The second term in the denominator, ΣΣN2, can be eliminated if we first normalize the gray levels of N so that their total value is the same as that of M - just scale each pixel in N by ΣΣM/ΣΣN

� practically, this step is sometimes ignored, or M is scaled to have average gray level of the big image from which the unknown images, N, are drawn

54

Cross-Correlation

� Suppose that N(i,j) = cM(i,j)

1

]),(),([

),(

]),(),([

),(),(

]),(),([

),(),(

1 1 1 1

2/122

1 1

2

1 1 1 1

2/1222

1 1

1 1 1 1

2/122

1 1

=

=

=

=

∑∑ ∑∑

∑∑

∑∑ ∑∑

∑∑

∑∑ ∑∑

∑∑

= = = =

= =

= = = =

= =

= = = =

= =

n

i

n

j

n

i

n

j

n

i

n

j

n

i

n

j

n

i

n

j

n

i

n

j

n

i

n

j

n

i

n

j

n

i

n

j

jiNjiNc

jiNc

jiNjiNc

jiNjicN

jiNjiM

jiNjiM

CC

27

55

Cross-Correlation

� Alternatively, we can rescale both M and N to have unit total intensity� N′(i, j) = N(i, j)//ΣΣN

� M′(i, j) = M(i, j)/ΣΣM

� Now, we can view these new images, M′ and N′ as unit vectors of length n2

� The correlation measure ΣΣM′(i,j)N′(i,j) is the familiar dot product between the two n2 vectors M′′′′ andN′′′′. Recall that the dot product is the cosine of the angle between the two vectors� it is equal to 1 when the vectors are the same vector, or the normalized

images are identical

� These are BIG vectors

56

Cross-Correlation Example 1

� Template M = 0 0 0 Image N = 0

1 1 1 0 1 1 1 0

0 0 0 0

� ∑∑NM = 0 ∑∑N2 = 0

0 1 2 3 2 1 0 0

0 0 1 2 3 2 1 0

0

0

� ∑∑NM / √∑∑N2 = 0 1 √2 √3 √2 1 0

NOTE: Many near misses

28

57


� Template M = 0 0 0 Image N = 0

1 1 1 4 4 4

0 0 0 0

� ∑∑NM = 0 ∑∑N2 = 0

4 8 12 8 4 16 32 48 32 16

0 0

� ∑∑NM / √∑∑N2 = 1 √2 √3 √2 1

58


� Template M = 0 0 0 Image N = 1 1 1

1 1 1 0 1 1 1 0

0 0 0 1 1 1

� ∑∑NM = 1 2 3 2 1 ∑∑N2 = 1 2 3 2 1

1 2 3 2 1 2 4 6 4 2

1 2 3 2 1 3 6 9 6 3

2 4 6 4 2

1 2 3 2 1

29

59

Example 3 (cont.)

� ∑∑NM / √∑∑N2 = 0

1 √6/2 1

0 0 1 0 0

1 √6/2 1

0

� Lots of near misses!

60

Reducing the Computational Cost of Correlation Matching

� A number of factors lead to large costs in correlation matching:� the image N is much larger than the template M, so we have to perform

correlation matching of M against every n x n window of N

� we might have many templates, Mi, that we have to compare against a given image N

� face recognition - have a face template for every known face; this might easily be tens of thousands

� character recognition - template for each character

� we might not know the orientation of the template in the image

� template might be rotated in the image N - example: someone tilts their head for a photograph

� would then have to perform correlation of rotated versions of Magainst N

30

61

Reducing the Computational Cost of Correlation Matching

� A number of factors lead to large costs in correlation matching:� we might not know the scale, or size, of the template in the unknown

image

� the distance of the camera from the object might only be known approximately

� would then have to perform correlation of scaled versions of M against N

� Most generally, the image N contains some mathematical transformation of the template image M� if M is the image of a planar pattern, like a printed page or

(approximately) a face viewed from a great distance, then the transformation is an affine transformation, which has six degrees of freedom

62

Correlation Matching

� Let T(p1, p2, ..., pr) be the class of mathematical transformations of interest� For rotation, we have T(θ)

� For scaling, we have T(s)

� General goal is to find the values of p1, p2, ..., pr for which� C(T(p1, p2, ..., pr)M, N) is “best”

� highest for normalized cross-correlation

� smallest for SSD