31
TA Section - 4 Zixuan Wang TA Section: Problem Set 4 11/16/2012 Zixuan Wang 11/16/2012

TA Section: Problem Set 4 - Artificial Intelligencevision.stanford.edu/teaching/cs231a_autumn1213/ta...Zixuan Wang TA Section - 4 A word about Q2 in PS4 •We’d like you to understand

  • Upload
    others

  • View
    0

  • Download
    0

Embed Size (px)

Citation preview

Page 1: TA Section: Problem Set 4 - Artificial Intelligencevision.stanford.edu/teaching/cs231a_autumn1213/ta...Zixuan Wang TA Section - 4 A word about Q2 in PS4 •We’d like you to understand

TA Section - 4 Zixuan Wang

TA Section: Problem Set 4

11/16/2012

Zixuan Wang

11/16/2012

Page 2: TA Section: Problem Set 4 - Artificial Intelligencevision.stanford.edu/teaching/cs231a_autumn1213/ta...Zixuan Wang TA Section - 4 A word about Q2 in PS4 •We’d like you to understand

TA Section - 4 Zixuan Wang

Outline

• Discriminative vs. Generative Classifiers

• Image representation and recognition models

– Bag of Words Model

– Part-based Model

• Constellation Model

• Pictorial Structures Model

– Spatial Pyramid Matching (SPM)

– ObjectBank

11/16/2012

Page 3: TA Section: Problem Set 4 - Artificial Intelligencevision.stanford.edu/teaching/cs231a_autumn1213/ta...Zixuan Wang TA Section - 4 A word about Q2 in PS4 •We’d like you to understand

TA Section - 4 Zixuan Wang

Discriminative vs Generative Classifiers

Training classifiers involves estimating f: X Y, or P(Y|X)

• Discriminative classifiers (e.g. logistic regression, SVM):

– We want to model P(Y|X)

– Assume some functional form for P(Y|X)

– Estimate parameters of P(Y|X) directly from training data

• Generative classifiers (e.g. naïve bayes):

– We want to model P(X, Y)

– Assume some functional form for P(X|Y), P(X)

– Estimate parameters of P(X|Y), P(X) directly from training data

– Use Bayes rule to calculate P(Y|X=xi)

11/16/2012

Page 4: TA Section: Problem Set 4 - Artificial Intelligencevision.stanford.edu/teaching/cs231a_autumn1213/ta...Zixuan Wang TA Section - 4 A word about Q2 in PS4 •We’d like you to understand

TA Section - 4 Zixuan Wang

Discriminative vs Generative Classifiers

• Advantages of discriminative classifiers: – Typically faster at making predictions

– Tend to have better performance

– Direct modeling of what we want to optimize

• Advantage of generative classifiers: – Can handle missing/partially labeled data

– A new class (Y+1) can be added incrementally without training the complete model

– Can generate samples from the training distribution

[Ulusoy & Bishop, 2005]

11/16/2012

Page 5: TA Section: Problem Set 4 - Artificial Intelligencevision.stanford.edu/teaching/cs231a_autumn1213/ta...Zixuan Wang TA Section - 4 A word about Q2 in PS4 •We’d like you to understand

TA Section - 4 Zixuan Wang

Image Representation

11/16/2012

Spatial Specificity of Parts No spatial info. Very strong but sparse

Bag of Words Part-based

•Constellation Model •Pictorial Structure

Weakly Spatial Models

•Spatial Pyramid Matching

•Object-Bank

Page 6: TA Section: Problem Set 4 - Artificial Intelligencevision.stanford.edu/teaching/cs231a_autumn1213/ta...Zixuan Wang TA Section - 4 A word about Q2 in PS4 •We’d like you to understand

TA Section - 4 Zixuan Wang

Image Representation

11/16/2012

Bag of Words Part-based

•Constellation Model •Pictorial Structure

Weakly Spatial Models

•Spatial Pyramid Matching

•Object-Bank

Spatial Specificity of Parts No spatial info. Very strong but sparse

Page 7: TA Section: Problem Set 4 - Artificial Intelligencevision.stanford.edu/teaching/cs231a_autumn1213/ta...Zixuan Wang TA Section - 4 A word about Q2 in PS4 •We’d like you to understand

TA Section - 4 Zixuan Wang

Histogram Assignment

Bag-of-words Representation

11/16/2012

Local features (e.g. SIFT features)

Clustering

(e.g. K-means)

Dictionary or Code book

Assign feature to a dictionary word

Feature generation Using

minimum Euclidean distance!

Bag-of-words!

19

19

19

Normalization

(Why?) Count

Page 8: TA Section: Problem Set 4 - Artificial Intelligencevision.stanford.edu/teaching/cs231a_autumn1213/ta...Zixuan Wang TA Section - 4 A word about Q2 in PS4 •We’d like you to understand

TA Section - 4 Zixuan Wang

Image Representation

11/16/2012

Bag of Words Part-based

•Constellation Model •Pictorial Structure

Weakly Spatial Models

•Spatial Pyramid Matching

•Object-Bank

Spatial Specificity of Parts No spatial info. Very strong but sparse

Page 9: TA Section: Problem Set 4 - Artificial Intelligencevision.stanford.edu/teaching/cs231a_autumn1213/ta...Zixuan Wang TA Section - 4 A word about Q2 in PS4 •We’d like you to understand

TA Section - 4 Zixuan Wang

Image Representation

11/16/2012

Bag of Words Part-based

•Constellation Model •Pictorial Structure

Weakly Spatial Models

•Spatial Pyramid Matching

•Object-Bank

Spatial Specificity of Parts No spatial info. Very strong but sparse

Page 10: TA Section: Problem Set 4 - Artificial Intelligencevision.stanford.edu/teaching/cs231a_autumn1213/ta...Zixuan Wang TA Section - 4 A word about Q2 in PS4 •We’d like you to understand

TA Section - 4 Zixuan Wang

Foreground model Gaussian shape pdf

Poission pdf on # detections

Uniform shape pdf

Generat ive probabilist ic model ( 2)

Clutter model

Gaussian part appearance pdf

Gaussian background appearance pdf

Prob. of detection

0.8 0.75 0.9

Gaussian relative scale pdf

log(scale)

Uniform relative scale pdf

log(scale)

11/16/2012

Page 11: TA Section: Problem Set 4 - Artificial Intelligencevision.stanford.edu/teaching/cs231a_autumn1213/ta...Zixuan Wang TA Section - 4 A word about Q2 in PS4 •We’d like you to understand

TA Section - 4 Zixuan Wang

Image Representation

11/16/2012

Bag of Words Part-based

•Constellation Model •Pictorial Structure

Weakly Spatial Models

•Spatial Pyramid Matching

•Object-Bank

Spatial Specificity of Parts No spatial info. Very strong but sparse

Page 12: TA Section: Problem Set 4 - Artificial Intelligencevision.stanford.edu/teaching/cs231a_autumn1213/ta...Zixuan Wang TA Section - 4 A word about Q2 in PS4 •We’d like you to understand

TA Section - 4 Zixuan Wang

Pictorial Structures

• Basic idea: We would like to represent an object by

– a collection of parts

– arranged in a deformable configuration

11/16/2012

Examples of detected models

Parts sharing similar appearance

Page 13: TA Section: Problem Set 4 - Artificial Intelligencevision.stanford.edu/teaching/cs231a_autumn1213/ta...Zixuan Wang TA Section - 4 A word about Q2 in PS4 •We’d like you to understand

TA Section - 4 Zixuan Wang

Pictorial Structures

• Local model of appearance with non-local geometric or spatial constraints

• Simultaneous use of appearance and spatial information

– Simple part models alone are not discriminative

• The model needs to solve the tasks:

– determine whether an object is visible in an image

– determine where an object is in the image

11/16/2012

Page 14: TA Section: Problem Set 4 - Artificial Intelligencevision.stanford.edu/teaching/cs231a_autumn1213/ta...Zixuan Wang TA Section - 4 A word about Q2 in PS4 •We’d like you to understand

TA Section - 4 Zixuan Wang

Pictorial Structures

• Model is represented as an undirected graph structure G= (V, E), where V are the vertices and E are the edges

11/16/2012

Sum over all parts

Deformation score of connected parts

Matching score of individual parts

Sum over all edges

Optimal parts configuration

Page 15: TA Section: Problem Set 4 - Artificial Intelligencevision.stanford.edu/teaching/cs231a_autumn1213/ta...Zixuan Wang TA Section - 4 A word about Q2 in PS4 •We’d like you to understand

TA Section - 4 Zixuan Wang

Image Representation

11/16/2012

Bag of Words Part-based

•Constellation Model •Pictorial Structure

Weakly Spatial Models

•Spatial Pyramid Matching

•Object-Bank

Spatial Specificity of Parts No spatial info. Very strong but sparse

Page 16: TA Section: Problem Set 4 - Artificial Intelligencevision.stanford.edu/teaching/cs231a_autumn1213/ta...Zixuan Wang TA Section - 4 A word about Q2 in PS4 •We’d like you to understand

TA Section - 4 Zixuan Wang

Start with Pyramid Matching Kernel for BoW Models

11/16/2012

Sets of features

[Grauman & Darrell, 2005]

Page 17: TA Section: Problem Set 4 - Artificial Intelligencevision.stanford.edu/teaching/cs231a_autumn1213/ta...Zixuan Wang TA Section - 4 A word about Q2 in PS4 •We’d like you to understand

TA Section - 4 Zixuan Wang

Pyramid Matching Kernel

• How do we build a discriminative classifier using the set representation?

• Kernel-based methods (e.g. SVM) are appealing for efficiency and generalization power.

• But what is an appropriate kernel?

– Each instance is an unordered set of vectors

– Varying number of vectors per instance

11/16/2012

Page 18: TA Section: Problem Set 4 - Artificial Intelligencevision.stanford.edu/teaching/cs231a_autumn1213/ta...Zixuan Wang TA Section - 4 A word about Q2 in PS4 •We’d like you to understand

TA Section - 4 Zixuan Wang

Pyramid Matching Kernel

• We can compare sets by computing a partial matching between their features

11/16/2012

Number of newly

matched pairs at level i

Measure of difficulty

of a match at level i

Approximate

partial match

similarity

Page 19: TA Section: Problem Set 4 - Artificial Intelligencevision.stanford.edu/teaching/cs231a_autumn1213/ta...Zixuan Wang TA Section - 4 A word about Q2 in PS4 •We’d like you to understand

TA Section - 4 Zixuan Wang

Pyramid Matching Kernel (Example)

11/16/2012

Level 0

Page 20: TA Section: Problem Set 4 - Artificial Intelligencevision.stanford.edu/teaching/cs231a_autumn1213/ta...Zixuan Wang TA Section - 4 A word about Q2 in PS4 •We’d like you to understand

TA Section - 4 Zixuan Wang

Pyramid Matching Kernel (Example)

11/16/2012

Level 1

Page 21: TA Section: Problem Set 4 - Artificial Intelligencevision.stanford.edu/teaching/cs231a_autumn1213/ta...Zixuan Wang TA Section - 4 A word about Q2 in PS4 •We’d like you to understand

TA Section - 4 Zixuan Wang

Pyramid Matching Kernel

11/16/2012

optimal partial

matching between

sets of features

number of new matches at level i difficulty of a match at level i

Page 22: TA Section: Problem Set 4 - Artificial Intelligencevision.stanford.edu/teaching/cs231a_autumn1213/ta...Zixuan Wang TA Section - 4 A word about Q2 in PS4 •We’d like you to understand

TA Section - 4 Zixuan Wang

Spatial Pyramid Matching

• Pyramid Match Kernel (Grauman & Darrell)

Pyramid in feature space, ignore location

• Spatial Pyramid (Lazebnik et al)

Pyramid in image space, quantize features

11/16/2012

Features:

Weak (edge orientations) Strong (SIFT)

OR

Page 23: TA Section: Problem Set 4 - Artificial Intelligencevision.stanford.edu/teaching/cs231a_autumn1213/ta...Zixuan Wang TA Section - 4 A word about Q2 in PS4 •We’d like you to understand

TA Section - 4 Zixuan Wang

Spatial Pyramid Matching

11/16/2012

Page 24: TA Section: Problem Set 4 - Artificial Intelligencevision.stanford.edu/teaching/cs231a_autumn1213/ta...Zixuan Wang TA Section - 4 A word about Q2 in PS4 •We’d like you to understand

TA Section - 4 Zixuan Wang

Spatial Pyramid Matching

11/16/2012

Page 25: TA Section: Problem Set 4 - Artificial Intelligencevision.stanford.edu/teaching/cs231a_autumn1213/ta...Zixuan Wang TA Section - 4 A word about Q2 in PS4 •We’d like you to understand

TA Section - 4 Zixuan Wang

Image Representation

11/16/2012

Bag of Words Part-based

•Constellation Model •Pictorial Structure

Weakly Spatial Models

•Spatial Pyramid Matching

•Object-Bank

Spatial Specificity of Parts No spatial info. Very strong but sparse

Page 26: TA Section: Problem Set 4 - Artificial Intelligencevision.stanford.edu/teaching/cs231a_autumn1213/ta...Zixuan Wang TA Section - 4 A word about Q2 in PS4 •We’d like you to understand

TA Section - 4 Zixuan Wang

Image representation

High Level Objects based

Semantic Gap

High level tasks

Object Bank

Sailboat, water, sky, tree, …

HoG, Gist , SIFT, Color, Texture, Bag of Words (BoW), Spatial Pyramid (SPM) Low level feature

Image classification

Event: “Sailing”

Li e

t al

. 20

10

11/16/2012

Page 27: TA Section: Problem Set 4 - Artificial Intelligencevision.stanford.edu/teaching/cs231a_autumn1213/ta...Zixuan Wang TA Section - 4 A word about Q2 in PS4 •We’d like you to understand

TA Section - 4 Zixuan Wang

Sailboat response

Object Filters Spatial pooling

Object Bank representation

Li e

t al

. 20

10

11/16/2012

Page 28: TA Section: Problem Set 4 - Artificial Intelligencevision.stanford.edu/teaching/cs231a_autumn1213/ta...Zixuan Wang TA Section - 4 A word about Q2 in PS4 •We’d like you to understand

TA Section - 4 Zixuan Wang

Object size variance …

… …

Object Filters Spatial pooling

Sailboat response

Object Bank representation

Small

Median

Large

Li e

t al

. 20

10

11/16/2012

Page 29: TA Section: Problem Set 4 - Artificial Intelligencevision.stanford.edu/teaching/cs231a_autumn1213/ta...Zixuan Wang TA Section - 4 A word about Q2 in PS4 •We’d like you to understand

TA Section - 4 Zixuan Wang

… …

Bear response

Water response

Object Filters Spatial pooling

Sailboat response

• ~ 200 object detectors • Felzenswalb et al. 2008 • Hoeim et al. 2005

• 3-level spatial pyramid • for each grid: max of each object

Object Bank representation

Implementation details

Li e

t al

. 20

10

11/16/2012

Page 30: TA Section: Problem Set 4 - Artificial Intelligencevision.stanford.edu/teaching/cs231a_autumn1213/ta...Zixuan Wang TA Section - 4 A word about Q2 in PS4 •We’d like you to understand

TA Section - 4 Zixuan Wang

… …

Bear response

Water response

Object Filters Spatial pooling

Sailboat response

Object Bank representation

N: nGridsperScale, M: nScale, O: nObject

N

N

M x N

N

N

M x N

M x N x O= thousands

E.g., In our setting: 12 x 21x 177=44604

Li e

t al

. 20

10

11/16/2012

Page 31: TA Section: Problem Set 4 - Artificial Intelligencevision.stanford.edu/teaching/cs231a_autumn1213/ta...Zixuan Wang TA Section - 4 A word about Q2 in PS4 •We’d like you to understand

TA Section - 4 Zixuan Wang

A word about Q2 in PS4

• We’d like you to understand the differences between BoW, SPM, and ObjectBank

• We’d like you to use what you’ve learned so far, be creative, and come up with interesting ways of encoding image information for an image recognition task

• Extra credits are given especially to innovation and good performances

11/16/2012