
Page 1: CEng 583 - Computational Vision

CEng 583 - Computational Vision 2011-2012 Spring

Week – 3

12th of March, 2011

Page 2: CEng 583 - Computational Vision
Page 3: CEng 583 - Computational Vision

Early Vision Corners

Texture

Segmentation

Optic flow

Today

Page 4: CEng 583 - Computational Vision

Image matching

by Diva Sian

by swashford Slide: Trevor Darrell

Page 5: CEng 583 - Computational Vision

Harder case

by Diva Sian by scgbt

Slide: Trevor Darrell

Page 6: CEng 583 - Computational Vision

Harder still?

NASA Mars Rover images

Slide: Trevor Darrell

Page 7: CEng 583 - Computational Vision

Non-accidental features (Witkin & Tenenbaum, 1983)

Corners, Junctions

Page 8: CEng 583 - Computational Vision

Non-accidental features (Witkin & Tenenbaum, 1983)

Corners or Junctions

Page 9: CEng 583 - Computational Vision

What is accidental?

http://en.wikipedia.org/wiki/Penrose_triangle

Page 10: CEng 583 - Computational Vision

Corners as distinctive interest points

Shifting a window in any direction should give a large change in intensity

“edge”: no change along the edge direction

“corner”: significant change in all directions

“flat” region: no change in all directions

Source: A. Efros

Page 11: CEng 583 - Computational Vision

SUSAN Detector

Moravec Detector

Harris Detector

We will talk about a few widely used corner detectors.

Page 12: CEng 583 - Computational Vision

The center pixel (the “nucleus”) is compared with the pixels in a circular mask around it (see the sketch below).

If (nearly) all of them have a similar brightness, the pixel lies in a homogeneous region.

If about half of them are similar, the pixel is “edge-like”.

If only about a quarter of them are similar, the pixel is a corner.

SUSAN Detector (Smallest Univalue Segment Assimilating Nucleus)
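A minimal sketch of the USAN computation described above, assuming a grayscale image; the mask radius r and brightness threshold t are illustrative choices, not the original SUSAN parameters:

```python
import numpy as np

def usan_area(img, r=3, t=27):
    """For every pixel (the "nucleus"), count how many pixels inside a circular
    mask of radius r have a brightness within t of the nucleus."""
    ys, xs = np.mgrid[-r:r + 1, -r:r + 1]
    offsets = np.argwhere(xs ** 2 + ys ** 2 <= r ** 2) - r   # circular mask offsets
    h, w = img.shape
    img = img.astype(float)
    padded = np.pad(img, r, mode='edge')
    area = np.zeros((h, w))
    for dy, dx in offsets:
        shifted = padded[r + dy:r + dy + h, r + dx:r + dx + w]
        area += np.abs(shifted - img) < t                    # 1 if "similar" to the nucleus
    return area
```

Flat regions give an area close to the full mask, edge pixels roughly half of it, and corner pixels about a quarter or less, which is exactly the classification above.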

Page 13: CEng 583 - Computational Vision

Based on “self-similarity”

Move a window in horizontal, vertical and diagonal directions.

Compute the similarity of the original patch with the shifted ones.

A corner is a local minimum in this similarity space.

Moravec Detector
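A rough sketch of the Moravec measure under simple assumptions (a square window of half-width `win`, one-pixel shifts in the eight principal directions); the score below is a dissimilarity (windowed SSD), so corners appear as local maxima of its minimum over shifts:

```python
import numpy as np
from scipy.ndimage import uniform_filter

def moravec_score(img, win=2):
    """Minimum over eight one-pixel shifts of the sum of squared differences
    between the window around each pixel and the shifted window."""
    img = img.astype(float)
    score = np.full(img.shape, np.inf)
    for dy, dx in [(-1, -1), (-1, 0), (-1, 1), (0, -1),
                   (0, 1), (1, -1), (1, 0), (1, 1)]:
        diff = (np.roll(img, (dy, dx), axis=(0, 1)) - img) ** 2
        ssd = uniform_filter(diff, size=2 * win + 1) * (2 * win + 1) ** 2  # window sum
        score = np.minimum(score, ssd)        # worst case = most similar shifted patch
    return score
```

Phrased as on the slide, in terms of similarity rather than dissimilarity, a corner is a local minimum of the self-similarity.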

Page 14: CEng 583 - Computational Vision

Harris Detector formulation

Change of intensity for the shift [u, v]:

$$E(u,v) \;=\; \sum_{x,y} w(x,y)\,\bigl[\,I(x+u,\,y+v) - I(x,y)\,\bigr]^2$$

Here $I(x+u, y+v)$ is the shifted intensity, $I(x, y)$ the original intensity, and $w(x, y)$ the window function: either 1 inside the window and 0 outside, or a Gaussian (a direct evaluation is sketched below).

Source: R. Szeliski
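To make the definition concrete, here is a brute-force evaluation of E(u, v) for a single window, using the "1 inside the window, 0 outside" choice of w(x, y); the window centre, half-width and shift are arbitrary example values, and the window plus shift is assumed to stay inside the image:

```python
import numpy as np

def E(img, cx, cy, u, v, half=7):
    """Sum of squared differences between the window centred at (cx, cy)
    and the same window shifted by (u, v)."""
    img = img.astype(float)
    ys, xs = np.mgrid[cy - half:cy + half + 1, cx - half:cx + half + 1]
    return np.sum((img[ys + v, xs + u] - img[ys, xs]) ** 2)

# A corner-like centre gives a large E for every (u, v); an edge gives a small E
# for shifts along the edge; a flat region gives a small E everywhere.
```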

Page 15: CEng 583 - Computational Vision
Page 16: CEng 583 - Computational Vision
Page 17: CEng 583 - Computational Vision

Harris Detector formulation

This measure of change can be approximated by:

$$E(u,v) \;\approx\; \begin{bmatrix} u & v \end{bmatrix} M \begin{bmatrix} u \\ v \end{bmatrix}$$

where $M$ is a 2×2 matrix computed from image derivatives:

$$M \;=\; \sum_{x,y} w(x,y) \begin{bmatrix} I_x^2 & I_x I_y \\ I_x I_y & I_y^2 \end{bmatrix}$$

The sum runs over the image region (the window we are checking for a corner); the off-diagonal entries are the gradient with respect to x times the gradient with respect to y.

Source: R. Szeliski

Page 18: CEng 583 - Computational Vision


Page 19: CEng 583 - Computational Vision

Interpreting the eigenvalues of M

Classification of image points using the eigenvalues $\lambda_1, \lambda_2$ of $M$:

“Flat” region: $\lambda_1$ and $\lambda_2$ are both small; $E$ is almost constant in all directions.

“Edge”: $\lambda_1 \gg \lambda_2$ (or $\lambda_2 \gg \lambda_1$); $E$ changes strongly in one direction only.

“Corner”: $\lambda_1$ and $\lambda_2$ are both large, with $\lambda_1 \sim \lambda_2$; $E$ increases in all directions.

Source: R. Szeliski

Page 20: CEng 583 - Computational Vision

Corner response function

$$R \;=\; \det(M) - \alpha\,(\operatorname{trace} M)^2 \;=\; \lambda_1 \lambda_2 - \alpha\,(\lambda_1 + \lambda_2)^2$$

α: constant (0.04 to 0.06)

“Corner”: $R > 0$

“Edge”: $R < 0$ (one eigenvalue dominates)

“Flat” region: $|R|$ is small

Source: R. Szeliski
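Putting the pieces together, a compact sketch of the response computation with numpy/scipy, assuming a grayscale float image, a Gaussian window, and illustrative values for sigma and alpha:

```python
import numpy as np
from scipy.ndimage import gaussian_filter, sobel

def harris_response(img, sigma=1.5, alpha=0.05):
    """R = det(M) - alpha * trace(M)^2, with the entries of M built from
    Gaussian-weighted products of the image derivatives."""
    img = img.astype(float)
    Ix = sobel(img, axis=1)                 # derivative along x (columns)
    Iy = sobel(img, axis=0)                 # derivative along y (rows)
    Sxx = gaussian_filter(Ix * Ix, sigma)   # window-weighted sums of Ix^2, Iy^2, Ix*Iy
    Syy = gaussian_filter(Iy * Iy, sigma)
    Sxy = gaussian_filter(Ix * Iy, sigma)
    det_M = Sxx * Syy - Sxy ** 2            # = lambda1 * lambda2
    trace_M = Sxx + Syy                     # = lambda1 + lambda2
    return det_M - alpha * trace_M ** 2
```

Large positive R marks corners, strongly negative R marks edges, and |R| near zero marks flat regions, matching the diagram above.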

Page 21: CEng 583 - Computational Vision

Algorithm steps:

Compute the matrix M in every image window and its corner response R (sketched in code below).

Find points with a large corner response (R > threshold).

Keep only the points that are local maxima of R.

Harris Corner Detector

Slide: Trevor Darrell
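The thresholding and local-maximum steps can be sketched as follows; the relative threshold and the non-maximum-suppression neighbourhood size are arbitrary example values:

```python
import numpy as np
from scipy.ndimage import maximum_filter

def harris_corners(R, rel_thresh=0.01, nms_size=7):
    """Return (row, col) coordinates of responses that exceed the threshold
    and are local maxima of R within an nms_size x nms_size neighbourhood."""
    thresh = rel_thresh * R.max()
    is_peak = R == maximum_filter(R, size=nms_size)     # non-maximum suppression
    rows, cols = np.nonzero(is_peak & (R > thresh))
    return np.column_stack([rows, cols])
```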

Page 22: CEng 583 - Computational Vision

Rotation invariance

Harris Detector: Properties

Ellipse rotates but its shape (i.e. eigenvalues) remains the same

Corner response R is invariant to image rotation

Slide: Trevor Darrell

Page 23: CEng 583 - Computational Vision

Not invariant to image scale

Harris Detector: Properties

At a fine scale, every point along the curve is classified as an edge; only at a coarser scale is the same structure detected as a corner!

Slide: Trevor Darrell

Page 24: CEng 583 - Computational Vision

Kalkan et al., Int. Conf. on Computer Vision Theory and Applications, 2007.

Noise, Thresholding, Incompleteness

Page 25: CEng 583 - Computational Vision

Another problem: Localization

Kalkan, 2008; Kalkan et al., 2007.

Page 26: CEng 583 - Computational Vision

A corner is where lines intersect.

Since we know the edges and their orientations, we can check whether the lines in a window intersect at its center (see the sketch below).

Intersection Consistency as a Corner Measure

Kalkan, 2008; Kalkan et al., 2007.
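The cited papers define their own measure; the sketch below only illustrates the stated idea. Given edge pixels inside a window together with their orientations, it measures how close the lines through those pixels pass to the window centre. The function name, the exponential weighting, and all parameters are hypothetical:

```python
import numpy as np

def intersection_consistency(edge_xy, edge_theta, center):
    """edge_xy: (N, 2) edge-pixel positions; edge_theta: (N,) edge orientations.
    Returns a score near 1 when the edge lines all pass through the centre."""
    cx, cy = center
    dists = []
    for (x, y), theta in zip(edge_xy, edge_theta):
        nx, ny = -np.sin(theta), np.cos(theta)            # unit normal of the edge line
        dists.append(abs((cx - x) * nx + (cy - y) * ny))  # point-to-line distance
    dists = np.asarray(dists)
    return float(np.exp(-dists).mean()) if len(dists) else 0.0
```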

Page 27: CEng 583 - Computational Vision

Intersection Consistency as a Corner Measure

Kalkan, 2008; Kalkan et al., 2007.

Page 28: CEng 583 - Computational Vision

Kalkan et al., Int. Conf. on Computer Vision Theory and Applications, 2007.

Page 29: CEng 583 - Computational Vision

Kalkan et al., Int. Conf. on Computer Vision Theory and Applications, 2007.

Page 30: CEng 583 - Computational Vision

Kalkan et al., Int. Conf. on Computer Vision Theory and Applications, 2007.

Page 31: CEng 583 - Computational Vision

Problems with Corner Detection

Localization

Representation

Viewpoint

Scale

Page 32: CEng 583 - Computational Vision

Texture

Page 33: CEng 583 - Computational Vision

What is texture?

No unique definition.

Certain aspects:

Repetition

Sometimes random

Sometimes involving “edges”

Texture

Page 34: CEng 583 - Computational Vision

Because the world is full of them.

Why study texture?

Page 35: CEng 583 - Computational Vision
Page 36: CEng 583 - Computational Vision

Julesz

Textons: analyze the texture in terms of statistical relationships between fundamental texture elements, called “textons”.

Slide: Trevor Darrell

Page 37: CEng 583 - Computational Vision

Texture representation

Textures are made up of repeated local patterns, so (see the sketch after this list):

Find the patterns

Use filters that look like patterns (spots, bars, raw patches…)

Consider magnitude of response

Describe their statistics within each local window

Mean, standard deviation

Histogram

Histogram of “prototypical” feature occurrences

Slide: Trevor Darrell
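The recipe above, reduced to the two-filter example used on the next slides (squared x- and y-derivative responses, averaged over a local window); the window size is an illustrative choice:

```python
import numpy as np
from scipy.ndimage import sobel, uniform_filter

def window_texture_stats(img, win=15):
    """Per-pixel texture description: mean squared d/dx and d/dy filter
    responses over a win x win window."""
    img = img.astype(float)
    dx2 = sobel(img, axis=1) ** 2            # squared horizontal-derivative response
    dy2 = sobel(img, axis=0) ** 2            # squared vertical-derivative response
    return np.dstack([uniform_filter(dx2, size=win),
                      uniform_filter(dy2, size=win)])   # H x W x 2 feature image
```

Each window then becomes a point in a 2-D feature space, exactly the plot on the following slides; richer filter banks simply add more dimensions.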

Page 38: CEng 583 - Computational Vision

Texture representation: example

original image

derivative filter responses, squared

statistics to summarize patterns in small windows:

Window     mean d/dx value   mean d/dy value
Win. #1          4                 10

Slide: Trevor Darrell

Page 39: CEng 583 - Computational Vision

Texture representation: example

original image

derivative filter responses, squared

statistics to summarize patterns in small windows:

Window     mean d/dx value   mean d/dy value
Win. #1          4                 10
Win. #2         18                  7

Slide: Trevor Darrell

Page 40: CEng 583 - Computational Vision

Texture representation: example

original image

derivative filter responses, squared

statistics to summarize patterns in small windows:

Window     mean d/dx value   mean d/dy value
Win. #1          4                 10
Win. #2         18                  7
Win. #9         20                 20

Slide: Trevor Darrell

Page 41: CEng 583 - Computational Vision

Texture representation: example

Each window is plotted as a point in a 2-D feature space: Dimension 1 = mean d/dx value, Dimension 2 = mean d/dy value (the window statistics from the previous slide).

Slide: Trevor Darrell

Page 42: CEng 583 - Computational Vision

Texture representation: example

In the same 2-D feature space (Dimension 1 = mean d/dx value, Dimension 2 = mean d/dy value):

Windows with a small gradient in both directions lie near the origin.

Windows with primarily vertical edges have a large mean d/dx.

Windows with primarily horizontal edges have a large mean d/dy.

Windows with both kinds of edges are large along both dimensions.

Slide: Trevor Darrell

Page 43: CEng 583 - Computational Vision

Texture representation: example

original image

derivative filter responses, squared

visualization of the assignment to texture

“types”

Slide: Trevor Darrell

Page 44: CEng 583 - Computational Vision

Texture representation: example

In this feature space (Dimension 1 = mean d/dx value, Dimension 2 = mean d/dy value), windows whose points are far apart have dissimilar textures, and windows whose points are close together have similar textures.

Slide: Trevor Darrell

Page 45: CEng 583 - Computational Vision

Problem: Scale

We’re assuming we know the relevant window size over which to collect these statistics.

It is possible to perform scale selection by looking for the window scale at which the texture description stops changing.

Slide: Trevor Darrell

Page 46: CEng 583 - Computational Vision

Texture Analysis Using Oriented Filter Banks

Malik, J., and Perona, P., “Preattentive texture discrimination with early vision mechanisms,” J. Opt. Soc. Am. A, 7(5):923–932, May 1990.

Slide: A. Torralba
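As a rough illustration of an oriented filter bank (not the Malik-Perona filters themselves), one can build oriented derivative-of-Gaussian kernels at several orientations and keep the squared responses; kernel size, sigma and the number of orientations are arbitrary example values:

```python
import numpy as np
from scipy.ndimage import convolve

def oriented_kernel(theta, sigma=2.0, half=7):
    """Odd-symmetric oriented filter: derivative of a Gaussian along direction theta."""
    ys, xs = np.mgrid[-half:half + 1, -half:half + 1]
    along = xs * np.cos(theta) + ys * np.sin(theta)
    gauss = np.exp(-(xs ** 2 + ys ** 2) / (2 * sigma ** 2))
    k = along * gauss
    return k / np.abs(k).sum()

def filter_bank_responses(img, n_orient=6):
    """Squared responses of the image to the bank of oriented filters."""
    img = img.astype(float)
    thetas = np.arange(n_orient) * np.pi / n_orient
    return np.dstack([convolve(img, oriented_kernel(t)) ** 2 for t in thetas])
```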

Page 47: CEng 583 - Computational Vision

Texture Analysis Using Oriented Filter Banks

Forsyth, Ponce, “Computer Vision: A Modern Approach”, Ch11., 2002.

Page 48: CEng 583 - Computational Vision

Slide: Trevor Darrell

Page 49: CEng 583 - Computational Vision

Slide: Trevor Darrell

Page 50: CEng 583 - Computational Vision

Texture Analysis Using Oriented Filter Banks

http://www.robots.ox.ac.uk/~vgg/research/texclass/with.html

Page 51: CEng 583 - Computational Vision

Problems involving texture

Texture Segmentation

Texture Classification

http://www.texturesynthesis.com/texture.htm

Texture Synthesis

Shape from Texture

Page 52: CEng 583 - Computational Vision

Texture Synthesis Using Pyramids

Forsyth, Ponce, “Computer Vision: A Modern Approach”, Ch11/Ch8., 2002.

Page 53: CEng 583 - Computational Vision

Freeman

Gaussian pyramid
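A minimal Gaussian pyramid, assuming the usual blur-then-subsample construction; the number of levels and the blur sigma are example values:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def gaussian_pyramid(img, levels=4, sigma=1.0):
    """Each level is a blurred, 2x-subsampled copy of the previous one."""
    pyramid = [img.astype(float)]
    for _ in range(levels - 1):
        blurred = gaussian_filter(pyramid[-1], sigma)
        pyramid.append(blurred[::2, ::2])    # keep every second row and column
    return pyramid
```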

Page 54: CEng 583 - Computational Vision

Laplacian pyramid

$x_1$: input image; $G_k$: reduce (blur and subsample), so that $x_{k+1} = G_k x_k$; $F_k$: expand back to the finer resolution.

Each Laplacian level is the image at that scale minus its reduced-then-expanded version (see the sketch below):

$$L_k = (I - F_k G_k)\,x_k, \qquad k = 1, 2, 3$$

Freeman
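A sketch of this construction, where each level stores the difference between the image at that scale and its reduced-then-expanded version; the expand step here is plain pixel repetition for brevity (a real implementation uses a smoother interpolation filter):

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def reduce_level(x, sigma=1.0):
    """G: blur and subsample by 2."""
    return gaussian_filter(x, sigma)[::2, ::2]

def expand_level(x, shape):
    """F: upsample back to `shape` by nearest-neighbour pixel repetition."""
    up = np.repeat(np.repeat(x, 2, axis=0), 2, axis=1)
    return up[:shape[0], :shape[1]]

def laplacian_pyramid(img, levels=3):
    """Band-pass levels L_k = x_k - F_k(G_k x_k), plus the coarsest low-pass image."""
    x = img.astype(float)
    levels_out = []
    for _ in range(levels):
        small = reduce_level(x)
        levels_out.append(x - expand_level(small, x.shape))
        x = small
    levels_out.append(x)                     # low-pass residual
    return levels_out
```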

Page 55: CEng 583 - Computational Vision

Laplacian pyramid

Freeman

Page 56: CEng 583 - Computational Vision

Texture Synthesis Using Pyramids

De Bonet, 1997.

Page 57: CEng 583 - Computational Vision

De Bonet, 1997.

Page 58: CEng 583 - Computational Vision

Source S. Seitz

Page 59: CEng 583 - Computational Vision

Markov Chain Example: Text “A dog is a man’s best friend. It’s a dog eat dog world out there.”

[Figure: word-transition diagram for the sentence above. Each word (a, dog, is, man’s, best, friend, it’s, eat, world, out, there, “.”) is a state. “a” is followed by “dog” with probability 2/3 and by “man’s” with probability 1/3; “dog” is followed by “is”, “eat”, and “world” with probability 1/3 each; every other word has a single successor with probability 1.]

Source: S. Seitz

Page 60: CEng 583 - Computational Vision
Page 61: CEng 583 - Computational Vision

Text synthesis

Results:

“As I've commented before, really relating to someone involves standing next to impossible.”

"One morning I shot an elephant in my arms and kissed him.”

"I spent an interesting evening recently with a grain of salt"

Dewdney, “A potpourri of programmed prose and prosody” Scientific American, 1989.

Slide from Alyosha Efros, ICCV 1999

Page 62: CEng 583 - Computational Vision

Synthesizing Computer Vision text

What do we get if we extract the probabilities from the F&P chapter on Linear Filters, and then synthesize new statements?

Check out Yisong Yue’s website implementing text generation: build your own text Markov Chain for a given text corpus. http://www.yisongyue.com/shaney/index.php

Slide: Trevor Darrell
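A tiny word-level Markov chain in the spirit of the earlier example: count which word follows which in a corpus, then sample. The corpus here is just the dog sentence from the example; any text file would do:

```python
import random
from collections import defaultdict

text = "a dog is a man's best friend . it's a dog eat dog world out there ."
words = text.split()

# Transition table: word -> list of the words that follow it in the corpus.
# Repetitions in the list encode the transition probabilities (e.g. "a" -> "dog" twice).
chain = defaultdict(list)
for current, following in zip(words, words[1:]):
    chain[current].append(following)

def synthesize(start="a", length=12):
    out = [start]
    for _ in range(length):
        followers = chain.get(out[-1])
        if not followers:                    # no observed successor: stop
            break
        out.append(random.choice(followers)) # sample proportionally to the counts
    return " ".join(out)

print(synthesize())
```

With this corpus, “a” is followed by “dog” two times out of three and by “man’s” one time out of three, reproducing the 2/3 and 1/3 transition probabilities of the example.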

Page 63: CEng 583 - Computational Vision

Synthesized text

This means we cannot obtain a separate copy of the best studied regions in the sum.

All this activity will result in the primate visual system.

The response is also Gaussian, and hence isn’t bandlimited.

Instead, we need to know only its response to any data vector, we need to apply a low pass filter that strongly reduces the content of the Fourier transform of a very large standard deviation.

It is clear how this integral exist (it is sufficient for all pixels within a 2k +1 × 2k +1 × 2k +1 × 2k + 1 — required for the images separately.

Page 64: CEng 583 - Computational Vision

Markov Random Field

A Markov random field (MRF) is a generalization of Markov chains to two or more dimensions.

First-order MRF: the probability that pixel X takes a certain value depends only on the values of its neighbors A, B, C, and D.

[Figure: pixel X surrounded by its four neighbors A, B, C, and D.]

Source: S. Seitz

Page 66: CEng 583 - Computational Vision

Adapted from A. Torralba

• Model the local conditional dependency of pixels using Markov Random Field.
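In this MRF spirit (a pixel depends only on its local neighbourhood), non-parametric texture synthesis in the style of Efros & Leung chooses each new pixel by matching its already-synthesized neighbourhood against every neighbourhood of the sample texture. The sketch below is heavily simplified (one pixel at a time, square neighbourhood, no border handling) and assumes `out` is a float array initialised to NaN except for a seeded region:

```python
import numpy as np

def sample_pixel(sample, out, y, x, half=5, eps=0.1, rng=None):
    """Pick a value for out[y, x] from the sample texture, based on how well the
    sample neighbourhoods match the known neighbours of (y, x)."""
    if rng is None:
        rng = np.random.default_rng()
    nb = out[y - half:y + half + 1, x - half:x + half + 1]
    known = ~np.isnan(nb)                    # neighbours that are already synthesized
    H, W = sample.shape
    centers, costs = [], []
    for sy in range(half, H - half):
        for sx in range(half, W - half):
            patch = sample[sy - half:sy + half + 1, sx - half:sx + half + 1]
            costs.append(np.sum((patch[known] - nb[known]) ** 2))
            centers.append(patch[half, half])
    costs = np.asarray(costs)
    good = costs <= costs.min() * (1 + eps)  # keep all near-best matches
    return rng.choice(np.asarray(centers)[good])
```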

Page 67: CEng 583 - Computational Vision

white bread brick wall

Synthesis results

Slide from Alyosha Efros, ICCV 1999

Page 68: CEng 583 - Computational Vision

Synthesis results

Slide from Alyosha Efros, ICCV 1999

Page 69: CEng 583 - Computational Vision

Hole Filling

Slide from Alyosha Efros, ICCV 1999

Page 70: CEng 583 - Computational Vision

[Figure: image quilting. Square blocks B1, B2 are taken from the input texture and placed side by side. Three strategies are compared: random placement of blocks; neighboring blocks constrained to agree in their overlap region; and a minimal-error boundary cut through the overlap.]

Page 71: CEng 583 - Computational Vision

Minimal error boundary

[Figure: two overlapping blocks with a vertical overlap region. Subtracting the overlapping strips and squaring gives the overlap error; the minimal-error boundary (seam) is cut through this error surface (see the sketch below).]

Slide from Alyosha Efros
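A sketch of the minimal-error-boundary step only: given the squared differences between the two blocks in their vertical overlap region, dynamic programming finds the cheapest top-to-bottom seam along which to cut:

```python
import numpy as np

def min_error_boundary(overlap_err):
    """overlap_err: (rows, overlap_width) array of squared block differences.
    Returns the column index of the minimal-cost vertical seam in each row."""
    h, w = overlap_err.shape
    cost = overlap_err.astype(float).copy()
    for y in range(1, h):
        for x in range(w):
            lo, hi = max(0, x - 1), min(w, x + 2)
            cost[y, x] += cost[y - 1, lo:hi].min()   # best of the cells just above
    seam = [int(np.argmin(cost[-1]))]                # cheapest end point
    for y in range(h - 2, -1, -1):                   # backtrack upwards
        x = seam[-1]
        lo, hi = max(0, x - 1), min(w, x + 2)
        seam.append(lo + int(np.argmin(cost[y, lo:hi])))
    return seam[::-1]                                # top-to-bottom column indices
```

Pixels left of the seam are then taken from block B1 and pixels right of it from B2, which hides the overlap error much better than a straight cut.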

Page 72: CEng 583 - Computational Vision

Slide from Alyosha Efros

Page 73: CEng 583 - Computational Vision

Slide from Alyosha Efros

Page 74: CEng 583 - Computational Vision

More on texture

Page 75: CEng 583 - Computational Vision

Representation

Scale

View-point

Matching

Problems with Texture

Page 76: CEng 583 - Computational Vision

Segmentation

http://web.mit.edu/manoli/www/imagina/imagina.html

Page 77: CEng 583 - Computational Vision

Why study segmentation?

Page 78: CEng 583 - Computational Vision

Agglomerative (merging) Clustering

Divisive Clustering

Segmentation as Clustering

Slide: A. Torralba

Page 79: CEng 583 - Computational Vision

Segmentation by Clustering

Slide: A. Torralba

Page 80: CEng 583 - Computational Vision

K-means clustering using intensity alone and color alone

Image Clusters on intensity (K=5) Clusters on color (K=5)

Slide: A. Torralba
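A minimal k-means on raw pixel colours, written directly in numpy (no spatial term yet); K, the iteration count and the initialisation are illustrative choices:

```python
import numpy as np

def kmeans_segment(features, k=5, iters=20, seed=0):
    """features: H x W x C array (e.g. RGB). Returns an H x W label image."""
    rng = np.random.default_rng(seed)
    h, w, c = features.shape
    X = features.reshape(-1, c).astype(float)
    centers = X[rng.choice(len(X), size=k, replace=False)]   # random initial centres
    for _ in range(iters):
        dists = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
        labels = dists.argmin(1)                             # assign to nearest centre
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(0)          # recompute centres
    return labels.reshape(h, w)
```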

Page 81: CEng 583 - Computational Vision

K-means using color alone, 11 segments

Image Clusters on color

Slide: A. Torralba

Page 82: CEng 583 - Computational Vision

Including spatial relationships

Augment the data to be clustered with spatial coordinates (see the sketch below): each pixel is described by a feature vector z = (Y, u, v, x, y) combining its color coordinates (Y, u, v) with its spatial coordinates (x, y).

Slide: A. Torralba
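Adding the spatial term is just a matter of appending scaled pixel coordinates to the colour features before clustering; the weight `lam` trades colour similarity against spatial compactness and is an arbitrary example value (a 3-channel colour image is assumed):

```python
import numpy as np

def color_plus_position_features(img, lam=0.5):
    """Stack the colour channels with scaled (x, y) coordinates, so that the
    same k-means groups pixels that are close in colour *and* in the image."""
    h, w, _ = img.shape
    ys, xs = np.mgrid[0:h, 0:w]
    return np.dstack([img.astype(float), lam * xs, lam * ys])   # H x W x 5
```

Feeding this into the kmeans_segment sketch above (for example with k=11) produces spatially more coherent segments than colour alone.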

Page 83: CEng 583 - Computational Vision

Mean Shift Algorithm (sketched in code below):
1. Choose a search window size.
2. Choose the initial location of the search window.
3. Compute the mean location (centroid of the data) in the search window.
4. Center the search window at the mean location computed in Step 3.
5. Repeat Steps 3 and 4 until convergence.

The mean shift algorithm seeks the “mode” or point of highest density of a data distribution:

Slide: A. Torralba
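The five steps above, written out for a set of data points and a flat circular window; the window radius (bandwidth) and tolerance are example values:

```python
import numpy as np

def mean_shift_mode(points, start, radius=1.0, tol=1e-3, max_iter=100):
    """Move a window from `start` to the centroid of the points it contains,
    and repeat until it stops moving (a mode of the point density)."""
    center = np.asarray(start, dtype=float)
    for _ in range(max_iter):
        inside = points[np.linalg.norm(points - center, axis=1) <= radius]
        if len(inside) == 0:
            break
        new_center = inside.mean(axis=0)           # step 3: mean location in the window
        if np.linalg.norm(new_center - center) < tol:
            break                                   # step 5: converged
        center = new_center                         # step 4: re-centre the window
    return center
```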

Page 84: CEng 583 - Computational Vision

1. Convert the image into tokens (via color, gradients, texture measures, etc.).
2. Choose initial search window locations uniformly in the data.
3. Compute the mean shift window location for each initial position.
4. Merge windows that end up on the same “peak” or mode.
5. The data these merged windows traversed are clustered together.

Mean Shift Segmentation

Slide: A. Torralba

Page 85: CEng 583 - Computational Vision

Mean Shift Segmentation

http://www.caip.rutgers.edu/~comanici/MSPAMI/msPamiResults.html

Slide: A. Torralba

Page 86: CEng 583 - Computational Vision

Mean Shift color&spatial Segmentation Results:

http://www.caip.rutgers.edu/~comanici/MSPAMI/msPamiResults.html Slide: A. Torralba

Page 87: CEng 583 - Computational Vision

Mean Shift color&spatial Segmentation Results:

Slide: A. Torralba

Page 88: CEng 583 - Computational Vision

Minimum Cut and Clustering

* From Khurram Hassan-Shafique CAP5415 Computer Vision 2003 Slide: A. Torralba

Page 89: CEng 583 - Computational Vision

Results on color segmentation

http://www.cs.berkeley.edu/~malik/papers/SM-ncut.pdf

Slide: A. Torralba

Page 90: CEng 583 - Computational Vision

Contains a large dataset of images with human “ground truth” labeling.

Slide: A. Torralba

Page 91: CEng 583 - Computational Vision

Do we need recognition to take the next step in performance?

Slide: A. Torralba

Page 92: CEng 583 - Computational Vision

Aim

Given an image and an object category, segment the object.

Segmentation should (ideally) be

• shaped like the object e.g. cow-like

• obtained efficiently in an unsupervised manner

• able to handle self-occlusion

[Figure: the Object Category Model and the Cow Image are combined by the Segmentation to produce the Segmented Cow.]

Slide from Kumar ‘05

Page 93: CEng 583 - Computational Vision

Feature-detector view

Slide: A. Torralba

Page 94: CEng 583 - Computational Vision

Slide: A. Torralba

Page 95: CEng 583 - Computational Vision

Slide: A. Torralba

Page 96: CEng 583 - Computational Vision

Slide: A. Torralba

Page 97: CEng 583 - Computational Vision

Object-Specific Figure-Ground Segregation

Some segmentation/detection results

Yu and Shi, 2002 Slide: A. Torralba

Page 98: CEng 583 - Computational Vision

Implicit Shape Model - Leibe and Schiele, 2003

[Figure: recognition pipeline. Interest Points → Matched Codebook Entries → Probabilistic Voting → Voting Space (continuous) → Backprojection of Maxima → Backprojected Hypotheses → Refined Hypotheses (uniform sampling).]

Leibe and Schiele, 2003, 2005 Slide: A. Torralba

Page 99: CEng 583 - Computational Vision

Slide: A. Torralba

Page 100: CEng 583 - Computational Vision

Determining similarity between pixels.

Determining “k” or a threshold.

Representing them.

Implicit vs. explicit.

Matching.

Problems with Segmentation

Page 101: CEng 583 - Computational Vision

Perceptual Grouping

Page 102: CEng 583 - Computational Vision

Slide: A. Torralba

Page 103: CEng 583 - Computational Vision

Slide: A. Torralba

Page 104: CEng 583 - Computational Vision

Familiarity

Slide: A. Torralba

Page 105: CEng 583 - Computational Vision

Familiarity

Slide: A. Torralba

Page 106: CEng 583 - Computational Vision

Influences of grouping

Grouping influences other perceptual mechanisms, such as lightness perception.

http://web.mit.edu/persci/people/adelson/publications/gazzan.dir/koffka.html Slide: A. Torralba

Page 107: CEng 583 - Computational Vision

Perceptual Grouping & Statistics of the Environment

Page 108: CEng 583 - Computational Vision
Page 109: CEng 583 - Computational Vision

Perceptual Grouping

Field, 1992.

Elder, Goldberg, 2002.

Page 110: CEng 583 - Computational Vision

Popular descriptors like:

SIFT

SURF

MSER

Contours/Boundaries

What did I skip?

Page 111: CEng 583 - Computational Vision

I will supply material for:

Edges

Corners/Junctions

Texture

Segmentation

Reading