Lecture 4: Feature Extraction II


Searching for Patches

• sliding window
• image pyramid

Local feature points

• Advantages of this approach
• Harris point detector

Scale invariant region selection

• Searching over scale
• Characteristic scale

Story so far

• We have introduced some methods to describe the appearance of an image/image patch via a feature vector (SIFT, HOG, etc.).

• For patches of similar appearance, the computed feature vectors should be similar; they should be dissimilar if the patches differ in appearance.

• Feature vectors are designed to be invariant to common transformations that superficially change the pixel appearance of the patch.

Next Problem

We have a reference image patch described by a feature vector fr.

Face Finder: Training

• Positive examples:

– Preprocess ~1,000 example face images into 20 × 20 inputs.

– Generate 15 “clones” of each with small random rotations, scalings, translations, reflections.

• Negative examples:

– Test the net on 120 known “no-face” images.


Given a novel image, identify the patches in this image that correspond to the reference patch.

We have already explored one part of the problem.

A patch from the novel image generates a feature vector fn. If ‖fr − fn‖ is small then this patch can be considered an instance of the texture pattern represented by the reference patch.

However, which and how many different image patches do we extract from the novel image?

Remember..

The sought-after image patch can appear at:

• any spatial location in the image

• any size (the size of an imaged object depends on its distance from the camera)

• multiple locations

Variation in position and size ⇒ multiple detection windows.


Sliding Window Technique

Must examine patches centred at many different pixel locations and sizes.

Naive option: exhaustive search using the original image.

for j = 1:n_s
    n = n_min + j * n_step
    for x = 0:x_max
        for y = 0:y_max
            Extract the image patch centred on pixel (x, y) of size n × n.
            Rescale it to the size of the reference patch.
            Compute its feature vector f.

This is computationally intensive, especially if it is expensive to compute f, as f may be computed up to n_s × x_max × y_max times.

Also, if n is large, then it is probably very costly to compute f.
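The naive loop above can be sketched in Python/NumPy (a sketch, not the course code: `feature_fn`, the tolerance `tau`, and the step size are hypothetical stand-ins for whatever feature and threshold are in use):

```python
import numpy as np

def sliding_window_search(image, f_ref, feature_fn, sizes, step=4, tau=0.5):
    """Exhaustively compare every window's feature vector against f_ref.
    feature_fn and tau are hypothetical stand-ins for the real feature
    and matching tolerance."""
    H, W = image.shape
    hits = []
    for n in sizes:                                  # window sizes to try
        for y in range(0, H - n + 1, step):
            for x in range(0, W - n + 1, step):
                patch = image[y:y + n, x:x + n]      # n x n patch at (x, y)
                f = feature_fn(patch)                # e.g. HOG/SIFT after rescaling
                if np.linalg.norm(f_ref - f) < tau:  # ||fr - fn|| small
                    hits.append((x, y, n))
    return hits
```

With n_s sizes and an H × W image this calls `feature_fn` up to n_s · (H/step) · (W/step) times, which is exactly the cost the text warns about.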

What about a multi-scale search?

Scale Pyramid Option

Construct an image pyramid that represents the image at several resolutions. Then either:

• Use the coarse scale to highlight promising image patches and then explore only these areas in more detail at the finer resolutions. (Quick, but may miss the best image patches.)

• Visit every pixel in the fine resolution image as a potential centre pixel, but simulate changing the window size by applying the same window size to the different images in the pyramid.

We now review the construction of the image pyramid.

Image Scale Pyramids

Naive Subsampling


Pick every other pixel in both directions

Subsampling Artifacts


Artifacts are particularly noticeable in high-frequency areas, such as on the hair: the lowest resolution level represents the highest one very poorly.

Synthetic Example

(Figure: a high-frequency signal sampled at a much lower frequency, showing 1-D aliasing.)

Under-sampling

• An undersampled signal can look just like a lower frequency signal.

• It can equally look like a higher frequency signal.

Aliasing: higher frequency information can appear as lower frequency information.

Input signal (Matlab):

x = 0:.05:5; imagesc(sin((2.^x).*x))

In the Matlab output there are not enough samples to represent the rapidly oscillating right-hand part of the signal. The same effect causes aliasing in video.

Slide credit: S. Seitz
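A NumPy analogue of the Matlab snippet above (an illustrative sketch; the frequencies and sample rate are chosen only to make the aliasing exact): a 9 Hz sine sampled at 10 Hz produces exactly the same samples as a sign-flipped 1 Hz sine.

```python
import numpy as np

fs = 10.0                          # sampling frequency, well below Nyquist (18 Hz)
t = np.arange(0, 2, 1 / fs)        # sample times over 2 seconds
high = np.sin(2 * np.pi * 9 * t)   # 9 Hz signal
low = np.sin(2 * np.pi * 1 * t)    # 1 Hz "impostor"

# At t = k/10: sin(2*pi*9*t) = -sin(2*pi*1*t), so high + low vanishes
# up to floating-point rounding.
print(np.max(np.abs(high + low)))  # ~0: the two signals are indistinguishable
```

The 9 Hz component has aliased to |9 − 10| = 1 Hz, which is exactly the "higher frequency appears as lower frequency" effect described above.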


2-D Aliasing


A high-frequency signal sampled at a frequency lower than that of the signal yields a poor representation. Therefore we must remove high frequencies before sub-sampling.

Aliasing Summary

• We cannot shrink an image by simply taking every second pixel, since that samples below the Nyquist rate.

• If we do, characteristic errors appear, such as:

– jaggedness in line features
– spurious highlights
– the appearance of frequency patterns not present in the original image

Gaussian Pyramid

• Gaussian smooth the image.

• Pick every other pixel in both directions.

⇒ No aliasing, but details are lost as the high frequencies are progressively removed.

Laplacian Pyramid

Each level of the Laplacian pyramid is the difference between the corresponding level and the next higher level of the Gaussian pyramid.

Laplacian Reconstruction

• Upsample by interpolation.

• Add the upsampled image and the difference image.

P. Burt and E. Adelson, The Laplacian Pyramid as a Compact Image Code, IEEE Transactions on Communications, 1983.

• Pixels in the difference images are relatively uncorrelated.

• Their values are concentrated around zero.

Entropy and quantization ⇒ effective compression through short, variable-length code words.
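The Gaussian pyramid recipe (smooth, then pick every other pixel) can be sketched in Python/NumPy; the helper names and the choice σ = 1 are illustrative, not prescribed by the lecture:

```python
import numpy as np

def gaussian_blur(img, sigma=1.0):
    """Separable Gaussian smoothing via direct 1-D convolutions (edge-padded)."""
    r = int(3 * sigma)
    x = np.arange(-r, r + 1)
    k = np.exp(-x ** 2 / (2 * sigma ** 2))
    k /= k.sum()                                   # normalize so flat areas stay flat
    pad = np.pad(img, r, mode='edge')
    # convolve along rows, then along columns
    tmp = np.apply_along_axis(lambda m: np.convolve(m, k, 'valid'), 1, pad)
    return np.apply_along_axis(lambda m: np.convolve(m, k, 'valid'), 0, tmp)

def gaussian_pyramid(img, levels=3, sigma=1.0):
    """Smooth, then keep every other pixel in both directions, repeatedly."""
    pyr = [img]
    for _ in range(levels - 1):
        smoothed = gaussian_blur(pyr[-1], sigma)
        pyr.append(smoothed[::2, ::2])             # decimate AFTER smoothing
    return pyr
```

Smoothing before decimation is the whole point: it removes the high frequencies that would otherwise alias, at the cost of progressively losing detail.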

Images in the Pyramid


No aliasing, but details are lost as high frequencies are progressively removed.
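The Laplacian pyramid construction and reconstruction described above can be sketched as follows (a sketch: nearest-neighbour upsampling stands in for proper interpolation, and a Gaussian pyramid is taken as given):

```python
import numpy as np

def upsample2(img):
    """Nearest-neighbour 2x upsampling (placeholder for real interpolation)."""
    return np.repeat(np.repeat(img, 2, axis=0), 2, axis=1)

def laplacian_pyramid(gauss_pyr):
    """Each level = Gaussian level minus the upsampled next-coarser level."""
    lap = []
    for fine, coarse in zip(gauss_pyr[:-1], gauss_pyr[1:]):
        up = upsample2(coarse)[:fine.shape[0], :fine.shape[1]]
        lap.append(fine - up)                      # difference image
    lap.append(gauss_pyr[-1])                      # keep the coarsest level as-is
    return lap

def reconstruct(lap):
    """Exact inverse: add the upsampled image and the difference image."""
    img = lap[-1]
    for diff in reversed(lap[:-1]):
        img = upsample2(img)[:diff.shape[0], :diff.shape[1]] + diff
    return img
```

Because the differences are stored exactly, reconstruction recovers the finest Gaussian level exactly, while the near-zero, decorrelated difference values are what makes the representation compressible.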

Scaled representation advantages

• Find template matches at all scales

– Template size is constant, but image size changes

• Efficient search for correspondence

– look at coarse scales, then refine with finer scales
– much less cost, but may miss the best match

• Examine all levels of detail

– Find edges with different amounts of blur
– Find textures with different spatial frequencies

Back to Sliding Windows

Summary: Sliding Windows

Pros

• Simple to implement
• Good feature choices are critical
• Past successes for certain classes
• Good detectors available

Cons/Limitations

• High computational complexity

– 250,000 locations × 30 orientations × 4 scales = 30,000,000 evaluations!
– Puts constraints on the type of classifiers we can use
– If training binary detectors independently, cost increases linearly with the number of classes

• With so many windows, the false positive rate had better be low

Limitations of sliding windows

• Not all objects are box-shaped.


• Non-rigid, deformable objects are not captured well with representations assuming a fixed 2D structure, or we must assume a fixed viewpoint.

• Objects with less-regular textures are not captured well with holistic appearance-based descriptions.


• If considering windows in isolation, context is lost.

Sliding Window Detector’s View


• In practice, often entails a large, cropped training set.

• Using a global appearance description can lead to sensitivity to partial occlusions.


Another Approach

Local invariant features: The Motivation

• Global representations have major limitations

• Instead, describe and match only local regions

• Increased robustness to:

– occlusions
– intra-category variation
– articulation

Applications: Image Matching


Applications: Image Stitching

Procedure

• Detect feature points in both images.

• Find corresponding pairs.

• Use these pairs to align the images.

Applications: Object Recognition

• Image content is transformed into local features that are invariant to translation, rotation, and scale.
• Goal: verify whether they belong to a consistent configuration.


General Approach

1. Find a set of distinctive keypoints.

2. Define a region around each keypoint.

3. Extract and normalize the region content.

4. Compute a local descriptor from the normalized regions.

5. Match local descriptors.
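Step 5, matching local descriptors, can be sketched as a nearest-neighbour search under Euclidean distance (the function name and the `max_dist` rejection threshold are illustrative assumptions, not part of the lecture):

```python
import numpy as np

def match_descriptors(desc_a, desc_b, max_dist=np.inf):
    """For each descriptor in desc_a (one per row), find the nearest
    descriptor in desc_b under Euclidean distance; reject matches whose
    distance exceeds max_dist."""
    matches = []
    for i, fa in enumerate(desc_a):
        d = np.linalg.norm(desc_b - fa, axis=1)   # distance to every candidate
        j = int(np.argmin(d))                     # nearest neighbour index
        if d[j] <= max_dist:
            matches.append((i, j, float(d[j])))   # (index in A, index in B, distance)
    return matches
```

This is the same ‖fr − fn‖ test used earlier for patches, applied per keypoint rather than per window.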

(Figure: keypoints A1, A2, A3 in one image and B1, B2, B3 in another; each region is described by a feature vector, e.g. colour, and pairs are compared with a similarity measure.)

Necessary conditions for this approach

Need a repeatable detector: detect the same point independently in both images. If the detections do not coincide, there is no chance to match!

Necessary conditions for this approach

Need a reliable and distinctive descriptor: for each point, correctly recognize the corresponding one.

Descriptor Invariances

Geometric Transformations

An aside: levels of geometric invariance.

Photometric Transformations

Often modelled as a linear transformation: Scaling + offset.

Requirements

Repeatable and Accurate Region Extraction

• Invariant to translation, rotation, scale changes

• Robust or covariant to out-of-plane (≈affine) transformations

• Robust to lighting variations, noise, blur, quantization

Locality: features are local, therefore robust to occlusion and clutter.

Quantity: we need a sufficient number of regions to cover the object.

Distinctiveness: the regions should contain interesting structure.

Efficiency: close to real-time performance.

Keypoint Localization

Goals
• Repeatable detection
• Precise localization
• Interesting content

Look for two-dimensional signal changes.

Find Corners

Key property: in the region around a corner, the image gradient has two or more dominant directions.

Corners are repeatable and distinctive

Corners as distinctive interest points

Design criteria:
• We should easily recognize the point by looking through a small window (locality).
• Shifting the window in any direction should give a large change in intensity (good localization).

Harris Detector Formulation

Change in the intensity for a shift (u, v):

E(u, v) = ∑_{(x,y)∈W} w(x, y) [I(x + u, y + v) − I(x, y)]²

where the window/weight mask w(x, y) is either:

• uniform: 1 inside the window, 0 outside; or
• a Gaussian.

Slide credit: Rainer Lienhart, Universität Augsburg

Harris Detector Formulation

This measure of change can be approximated by:

E(u, v) ≈ (u, v) M (u, v)ᵀ

where M is a 2 × 2 matrix computed from the image derivatives:

M = ∑_{(x,y)∈W} w(x, y) [ Ix(x, y)²         Ix(x, y)Iy(x, y) ]
                        [ Ix(x, y)Iy(x, y)  Iy(x, y)²        ]

  = ∑_{(x,y)∈W} w(x, y) [ Ix(x, y) ] [ Ix(x, y)  Iy(x, y) ]
                        [ Iy(x, y) ]

where Ix is the gradient with respect to x.
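The second form of M, a sum of outer products of gradient vectors, can be sketched directly (a sketch using a uniform window and central-difference gradients; the window half-size is an arbitrary illustrative choice):

```python
import numpy as np

def structure_tensor(img, x, y, half=2):
    """Second-moment matrix M over a (2*half+1)^2 uniform window at (x, y),
    accumulated as a sum of outer products [Ix; Iy][Ix Iy]."""
    Iy, Ix = np.gradient(img.astype(float))       # d/dy (rows), d/dx (cols)
    M = np.zeros((2, 2))
    for v in range(y - half, y + half + 1):
        for u in range(x - half, x + half + 1):
            g = np.array([Ix[v, u], Iy[v, u]])
            M += np.outer(g, g)                   # accumulate [Ix;Iy][Ix Iy]
    return M
```

On a vertical step edge the accumulated M has one large and one zero eigenvalue, matching the "edge" case in the eigenvalue classification that follows.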

Harris Detector Formulation

M is computed from these image derivatives:

Image I Ix Iy IxIy

What does this matrix reveal?

Consider an axis-aligned corner:

What does this matrix reveal?

For an axis-aligned corner:

M = [ ∑Ix²   ∑IxIy ] = [ λ1  0  ]
    [ ∑IxIy  ∑Iy²  ]   [ 0   λ2 ]

• The dominant gradient directions align with the x and y axes.

• If either λ1 or λ2 is close to 0 then this is not a corner, so look for locations where both are large.

What if we have a corner that is not aligned with the image axes?

General case

Since M is symmetric, we can write

M = R⁻¹ [ λ1  0  ] R
        [ 0   λ2 ]

We can visualize M as an ellipse, with axis lengths determined by its eigenvalues and orientation determined by R.

Interpretation of the eigenvalues

Intensity change in a shifting window: eigenvalue analysis. Let λ1 and λ2 be the eigenvalues of M. The ellipse E(u, v) = const has its axes along the directions of fastest and slowest change, with axis lengths proportional to (λmax)^(−1/2) and (λmin)^(−1/2).

Classification of image points using the eigenvalues of M:

• “Corner”: λ1 and λ2 are both large, λ1 ∼ λ2; E increases in all directions.

• “Edge”: λ1 ≫ λ2, or λ2 ≫ λ1.

• “Flat” region: λ1 and λ2 are small; E is almost constant in all directions.

Corner response function

Measure of corner response:

R = det(M)− κ (trace(M))2

where

det(M) = λ1λ2

trace(M) = λ1 + λ2

κ is a constant whose value is chosen empirically from the range [0.04, 0.06].

Properties of R:

• R depends only on the eigenvalues of M.

• R is large for a corner (R > 0).

• R is negative with large magnitude for an edge (R < 0).

• |R| is small for a flat region.

Window function

M = ∑_{(x,y)∈W} w(x, y) [ Ix²   IxIy ]
                        [ IxIy  Iy²  ]

Option 1: Uniform window. Sum over a square window:

M = ∑_{(x,y)∈W} [ Ix²   IxIy ]
                [ IxIy  Iy²  ]

Problem: not rotationally invariant.

Option 2: Smooth with a Gaussian. The Gaussian convolution already performs a weighted sum:

M = G(σ) ∗ [ Ix²   IxIy ]
           [ IxIy  Iy²  ]

The result is rotation invariant.

Summary: Harris Detector

Compute the second moment matrix

M(σI, σD) = G(σI) ∗ [ Ix²(σD)    IxIy(σD) ]
                    [ IxIy(σD)   Iy²(σD)  ]

Compute the cornerness function

R = det(M(σI, σD)) − κ (trace(M(σI, σD)))²
  = (G ∗ Ix²)(G ∗ Iy²) − (G ∗ (IxIy))² − κ (G ∗ Ix² + G ∗ Iy²)²

Perform non-maximum suppression

Original Image

Harris response
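The summary steps can be sketched end-to-end in Python/NumPy (a sketch: a single smoothing scale stands in for the (σI, σD) pair, and non-maximum suppression is omitted for brevity):

```python
import numpy as np

def harris_response(img, sigma=1.0, kappa=0.04):
    """Harris cornerness R = det(M) - kappa * trace(M)^2 at every pixel,
    with M built from Gaussian-weighted products of image derivatives."""
    Iy, Ix = np.gradient(img.astype(float))         # central differences

    def smooth(a):                                  # separable Gaussian blur
        r = int(3 * sigma)
        x = np.arange(-r, r + 1)
        k = np.exp(-x ** 2 / (2 * sigma ** 2))
        k /= k.sum()
        p = np.pad(a, r, mode='edge')
        t = np.apply_along_axis(lambda m: np.convolve(m, k, 'valid'), 1, p)
        return np.apply_along_axis(lambda m: np.convolve(m, k, 'valid'), 0, t)

    Sxx = smooth(Ix * Ix)                           # G * Ix^2
    Syy = smooth(Iy * Iy)                           # G * Iy^2
    Sxy = smooth(Ix * Iy)                           # G * (Ix Iy)
    det = Sxx * Syy - Sxy ** 2                      # lambda1 * lambda2
    trace = Sxx + Syy                               # lambda1 + lambda2
    return det - kappa * trace ** 2
```

On a synthetic step corner, R is positive at the corner, strongly negative along the edges, and zero in flat regions, matching the classification above; thresholding and non-maximum suppression then pick out the final points.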

Harris: In action


Harris: Corner Response

Compute the corner response R.

Harris: Threshold

Find points with large corner response: R > threshold.

Harris: Local Maxima

Take only the points that are local maxima of R.

Harris: Final Points


Harris Detector: Responses

Effect: A very precise corner detector.

Harris Detector: Properties

Corner response R is invariant to image rotation

Ellipse rotates but its shape (i.e. eigenvalues) remains the same.

Harris Detector: Properties

Not invariant to changes in scale: at a fine scale, every point along a curved structure is classified as an edge, while at a coarser scale the same structure is detected as a corner.

Scale Invariant Region Selection

From points to regions

• The Harris operator defines interest points

– precise localization
– high repeatability

• In order to compare those points, we need to compute a descriptor over a region.

How can we define such a region in a scale invariant manner?

• How can we detect scale invariant interest regions?

Scale Invariant Detection

Consider regions of different sizes around a point. Regions of corresponding sizes will look the same in both images.


Harris Detector: Some Properties

Partial invariance to affine intensity change I → aI + b:

• Only derivatives are used ⇒ invariance to an intensity shift I → I + b.

• An intensity scaling I → aI rescales R, so which points pass a fixed threshold can change.


Naive Approach: Exhaustive Search

Compare descriptors while varying the patch size.


• Computationally inefficient.

• Feasible, though inefficient, for matching a pair of images.

• Prohibitive for retrieval in large databases.

• Prohibitive for recognition.

Scale Invariant Detection

Problem: How do we choose corresponding circles independently in each image?


Solution: choose the scale of the “best” corner.

Feature selection

Distribute points evenly over the image.

Adaptive Non-maximal Suppression

Desired: a fixed number of features per image.

• Want them evenly distributed spatially.
• Search over the non-maximal suppression radius. [Brown, Szeliski, Winder, CVPR’05]

Feature descriptors

We know how to detect points. Next question: how to match them?

A point descriptor should be: 1. invariant, 2. distinctive.

Automatic Scale Detection

Design a function on the region which is scale invariant (the same for corresponding regions, even if they are at different scales).

• Take a local maximum of this function.
• The region size at the maximum should be invariant to image scale.

(Figure: the response f plotted against region size for Image 1 and Image 2. The maxima occur at region sizes s1 and s2 with s1 = 0.5·s2, matching the scale factor of 0.5 between the images.)

Important: this scale invariant region size is found in each image independently.

Automatic Scale Detection

Function responses for increasing scale (the scale signature).

Automatic Scale Detection

Normalize: Rescale to fixed size

What is a useful signature function? The Laplacian-of-Gaussian (blob detection).

Characteristic Scale

We define the characteristic scale as the scale that produces a peak of the Laplacian response.
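The characteristic-scale idea can be sketched by evaluating the scale-normalized Laplacian-of-Gaussian response at one point over a range of scales and taking the scale with the largest response magnitude (the kernel truncation at 4σ and the scale list are illustrative choices):

```python
import numpy as np

def norm_log_response(img, x, y, sigma):
    """Scale-normalized LoG response sigma^2 * (LoG * I) at pixel (x, y),
    computed by direct correlation with a truncated LoG kernel."""
    r = int(4 * sigma)
    u = np.arange(-r, r + 1)
    X, Y = np.meshgrid(u, u)
    g = np.exp(-(X ** 2 + Y ** 2) / (2 * sigma ** 2)) / (2 * np.pi * sigma ** 2)
    log = ((X ** 2 + Y ** 2 - 2 * sigma ** 2) / sigma ** 4) * g   # Laplacian of Gaussian
    patch = img[y - r:y + r + 1, x - r:x + r + 1]                 # assumes kernel fits
    return sigma ** 2 * np.sum(patch * log)                       # sigma^2 normalization

def characteristic_scale(img, x, y, sigmas):
    """The scale whose |response| is maximal: the characteristic scale."""
    resp = [abs(norm_log_response(img, x, y, s)) for s in sigmas]
    return sigmas[int(np.argmax(resp))]
```

For a Gaussian blob of width s0, the scale-normalized response at the centre peaks exactly at σ = s0, so the selected scale tracks the size of the underlying structure, independently in each image.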

Summary: Scale invariant detector

Given: two images of the same scene with a large scale difference between them.

Goal: find the same interest points independently in each image.

Solution: search for maxima of suitable functions in scale and in space (over the image).

What do we do now?

We know how to detect points.

Next question: How to match them?


A point descriptor should be: 1. invariant, 2. distinctive. You know this: what are our options?

Today’s programming assignment

Programming Assignment

• Details available on the course website.

• You will write Matlab functions to compute Harris interest points at different scales and to estimate the dominant gradient direction around each interest point.

• Important: if you have a lecture clash, read the course website for a solution.

• Mail me about any errors you spot in the exercise notes.

• I will notify the class about spotted errors and corrections via the course website and mailing list.

Recommended