6.S093 Visual Recognition through Machine Learning Competition

Aditya Khosla

Image by kirkh.deviantart.com

Slides: viscomp.csail.mit.edu/resource/slides/lecture3.pdf


Today’s class

• Part 1: Competition details

• Part 2: Image representation lecture

– Bag-of-words

– Spatial pyramid

• Part 3: Feature extraction tutorial


Competition details: dataset

10 object categories: airplane, bicycle, car, cup/mug, dog(s), guitar, hamburger, sofa, traffic light, person


Competition details: dataset

• Training set: 8,000 images

• Validation set: 2,000 images

• Testing set: 5,000 images

• Leaderboard set

[Figure labels: “labels provided”, “NO labels provided”]


Competition details: submission

• For each image, you provide the probability that each class is present in it (as returned by your algorithm)

[Figure: per-class probabilities on a 0 to 1 axis for airplane, bicycle, car, cup, dog, guitar, hamburger, sofa, traffic light, person]
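A minimal sketch of building one such row of probabilities, assuming the classifier returns one raw score per class and that a logistic squashing is an acceptable way to map scores into [0, 1]. The `CLASSES` order, the function name `scores_to_probabilities`, and the example scores are illustrative assumptions; the slides do not specify the submission file format.

```python
import math

# Class names from the dataset slide; this ordering is an assumption.
CLASSES = ["airplane", "bicycle", "car", "cup", "dog", "guitar",
           "hamburger", "sofa", "traffic light", "person"]

def scores_to_probabilities(raw_scores):
    """Map one raw classifier score per class to a value in [0, 1]."""
    if len(raw_scores) != len(CLASSES):
        raise ValueError("expected one score per class")
    # Logistic function applied independently per class. The values are
    # not forced to sum to 1 (assumption: classes are scored separately).
    return [1.0 / (1.0 + math.exp(-s)) for s in raw_scores]

# Hypothetical raw scores for a single image.
row = scores_to_probabilities(
    [2.0, -1.5, 0.0, -3.0, 0.5, -2.0, -4.0, 1.0, -0.5, 3.0])
for name, p in zip(CLASSES, row):
    print(f"{name:14s} {p:.3f}")
```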


Competition details: evaluation

• Average precision
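Average precision can be sketched as follows for a single class. This uses one common non-interpolated definition (the mean of the precision values at the rank of each positive image); the slides do not say which variant the competition uses, so treat the exact formula as an assumption.

```python
def average_precision(labels, scores):
    """Non-interpolated average precision for one class.

    labels: 1/0 (or True/False) ground truth per image.
    scores: predicted probability per image, same order as labels.
    """
    # Rank images from highest to lowest score.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    n_pos = sum(1 for y in labels if y)
    if n_pos == 0:
        return 0.0
    hits = 0
    total = 0.0
    for rank, i in enumerate(order, start=1):
        if labels[i]:
            hits += 1
            total += hits / rank  # precision at this positive's rank
    return total / n_pos

# Positives ranked 1st and 3rd: (1/1 + 2/3) / 2
print(average_precision([1, 0, 1, 0], [0.9, 0.8, 0.7, 0.1]))
```

A ranking that places every positive image above every negative one scores 1.0.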


Competition details: prizes

[Figure: prizes for first, second, and third place; labels: “Cash”, “+ cash”, “+ cash”]


Competition details: thank you!


Image representation: bag-of-words


Document representation: bag-of-words

• Order-less document representation: frequencies of words from a dictionary (Salton & McGill, 1983)


US Presidential Speeches Tag Cloud
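A small sketch of the order-less document representation: count how often each dictionary word occurs and ignore word order. The four-word `dictionary`, the tokenizing regular expression, and the two example sentences are made up for illustration.

```python
from collections import Counter
import re

def bag_of_words(text, dictionary):
    """Frequency of each dictionary word in text, ignoring word order."""
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter(tokens)
    return [counts[word] for word in dictionary]

# Hypothetical dictionary and documents.
dictionary = ["people", "nation", "freedom", "economy"]
a = bag_of_words("The people of this nation value freedom.", dictionary)
b = bag_of_words("Freedom the value nation this of people!", dictionary)

print(a)       # [1, 1, 1, 0]
print(a == b)  # True: reordering the words leaves the representation unchanged
```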


Image representation: bag-of-words

document: bag-of-words

image: bag-of-visual-words

[Figure: Object; Bag of ‘words’]

[Figure: Object; Ugly bag of ‘words’]

[Figure: Object; Stylish bag of ‘words’]

visual dictionary


Image representation: bag-of-words

1. Extract descriptors

2. Learn “visual dictionary”

3. Quantize features using visual vocabulary

4. Represent images by frequencies of “visual words”
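Steps 3 and 4 can be sketched as below, assuming the descriptors and the visual vocabulary are already available as lists of equal-length vectors. Nearest-neighbor assignment under Euclidean distance and normalizing the histogram to sum to 1 are common choices taken here as assumptions; the helper names and toy vectors are hypothetical.

```python
def nearest_word(descriptor, vocabulary):
    """Index of the vocabulary entry closest to descriptor."""
    best_index, best_dist = 0, float("inf")
    for k, word in enumerate(vocabulary):
        dist = sum((d - w) ** 2 for d, w in zip(descriptor, word))
        if dist < best_dist:
            best_index, best_dist = k, dist
    return best_index

def bag_of_visual_words(descriptors, vocabulary):
    """Normalized histogram of visual-word frequencies for one image."""
    hist = [0.0] * len(vocabulary)
    for d in descriptors:
        hist[nearest_word(d, vocabulary)] += 1.0  # step 3: quantize
    total = sum(hist)
    return [h / total for h in hist] if total else hist  # step 4: frequencies

# Toy 2-D data; real descriptors have many more dimensions.
vocabulary = [[0.0, 0.0], [1.0, 1.0], [0.0, 1.0]]
descriptors = [[0.1, 0.0], [0.9, 1.1], [0.2, 0.1], [0.1, 0.9]]
print(bag_of_visual_words(descriptors, vocabulary))  # [0.5, 0.25, 0.25]
```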


1. Extracting descriptors

• Regular grid

• Interest points
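For the regular-grid option, a sketch of choosing patch locations: slide a fixed-size window over the image with a fixed stride. The function name and the `patch_size` and `stride` values are illustrative assumptions, not settings from the lecture.

```python
def grid_patch_origins(height, width, patch_size, stride):
    """Top-left (row, col) corners of all patches that fit inside the image."""
    rows = range(0, height - patch_size + 1, stride)
    cols = range(0, width - patch_size + 1, stride)
    return [(r, c) for r in rows for c in cols]

# Hypothetical 32 x 48 image, 16 x 16 patches, 8-pixel stride.
origins = grid_patch_origins(height=32, width=48, patch_size=16, stride=8)
print(len(origins))  # 15 patches: 3 rows x 5 columns
print(origins[-1])   # (16, 32)
```

A descriptor would then be computed for the patch at each origin.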

Image representation: yesterday

gradient magnitude

gradient orientation

feature vector (descriptor)
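A simplified sketch of turning gradient magnitude and orientation into a descriptor: a histogram of orientations in which each pixel votes with its gradient magnitude. This is a HOG/SIFT-style illustration under my own assumptions (central differences, 8 unsigned orientation bins, L2 normalization); the previous lecture's exact feature is not shown in these slides.

```python
import math

def gradient_orientation_histogram(patch, n_bins=8):
    """Magnitude-weighted histogram of gradient orientations.

    patch: list of rows of grayscale intensities (floats).
    """
    height, width = len(patch), len(patch[0])
    hist = [0.0] * n_bins
    for r in range(1, height - 1):
        for c in range(1, width - 1):
            # Central differences approximate the image gradient.
            gx = (patch[r][c + 1] - patch[r][c - 1]) / 2.0
            gy = (patch[r + 1][c] - patch[r - 1][c]) / 2.0
            magnitude = math.hypot(gx, gy)
            # Unsigned orientation in [0, pi).
            orientation = math.atan2(gy, gx) % math.pi
            b = min(int(orientation / math.pi * n_bins), n_bins - 1)
            hist[b] += magnitude
    norm = math.sqrt(sum(v * v for v in hist))
    return [v / norm for v in hist] if norm > 0 else hist

# Intensity increases left to right, so every gradient points horizontally.
patch = [[float(c) for c in range(6)] for _ in range(6)]
print(gradient_orientation_histogram(patch))  # all weight falls in bin 0
```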


2. Learning “visual dictionary”

Compute descriptor


2. Learning “visual dictionary”

descriptors

Clustering

visual vocabulary
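The slides say only "Clustering"; k-means is a common choice for this step and is used here as an assumption. A minimal sketch with plain Python lists, where the resulting centroids play the role of the visual vocabulary. The toy descriptors, `k`, iteration count, and seed are illustrative.

```python
import random

def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(descriptors, k, n_iters=20, seed=0):
    """Cluster descriptors with Lloyd's algorithm; returns k centroids."""
    rng = random.Random(seed)
    # Initialize centroids with k distinct descriptors chosen at random.
    centroids = [list(d) for d in rng.sample(descriptors, k)]
    for _ in range(n_iters):
        # Assignment step: attach each descriptor to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for d in descriptors:
            nearest = min(range(k),
                          key=lambda j: squared_distance(d, centroids[j]))
            clusters[nearest].append(d)
        # Update step: move each centroid to the mean of its members.
        for j, members in enumerate(clusters):
            if members:  # an empty cluster keeps its previous centroid
                dim = len(members[0])
                centroids[j] = [sum(m[i] for m in members) / len(members)
                                for i in range(dim)]
    return centroids

# Two well-separated toy groups of 2-D descriptors.
descriptors = [[0.0, 0.1], [0.1, 0.0], [0.0, 0.0],
               [5.0, 5.1], [5.1, 5.0], [5.0, 5.0]]
vocabulary = kmeans(descriptors, k=2)
print(sorted(vocabulary))
```

In practice the descriptors would be pooled from many training images before clustering.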


Example visual vocabulary

Fei-Fei et al. 2005


Image patch examples

Sivic et al. 2005

How to choose the vocabulary size?


Bag-of-words: limitations

• What about the structure of the image?

[Figure: two images compared, “=?”]
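The limitation can be illustrated with a short sketch: if the patches of an image are rearranged, the bag-of-words histogram does not change, because it records only how often each visual word occurs. The word indices below are made-up values.

```python
from collections import Counter
import random

def histogram(word_ids, vocab_size):
    """Count of each visual word, regardless of where it occurs."""
    counts = Counter(word_ids)
    return [counts[k] for k in range(vocab_size)]

# Hypothetical visual-word index of each patch, in scan order.
word_ids = [3, 3, 0, 1, 2, 2, 2, 0]

# Rearranging the patches destroys the spatial layout...
shuffled = word_ids[:]
random.Random(0).shuffle(shuffled)

# ...but the representation is identical.
print(histogram(word_ids, 4))                            # [2, 1, 3, 2]
print(histogram(word_ids, 4) == histogram(shuffled, 4))  # True
```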


Image representation: spatial pyramids

[Figure: spatial pyramid at level 0, level 1, level 2]
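A sketch of a spatial pyramid representation, assuming level l divides the image into 2^l x 2^l cells and that the per-cell visual-word histograms are simply concatenated. The cell layout and the absence of per-level weights are assumptions; published variants often weight the levels differently. The function name and sample points are hypothetical.

```python
def spatial_pyramid(points, vocab_size, height, width, levels=3):
    """Concatenate per-cell visual-word histograms over pyramid levels.

    points: list of (row, col, word_id), one entry per patch, with
            0 <= row < height and 0 <= col < width.
    """
    feature = []
    for level in range(levels):
        cells = 2 ** level  # cells per side at this level (assumed layout)
        hists = [[0] * vocab_size for _ in range(cells * cells)]
        for row, col, word in points:
            cell_row = min(row * cells // height, cells - 1)
            cell_col = min(col * cells // width, cells - 1)
            hists[cell_row * cells + cell_col][word] += 1
        for h in hists:
            feature.extend(h)
    return feature

# Four patches near the corners of a 64 x 64 image, vocabulary of 2 words.
points = [(2, 3, 0), (2, 60, 1), (50, 5, 1), (60, 62, 0)]
f = spatial_pyramid(points, vocab_size=2, height=64, width=64)
print(len(f))  # 42 = 2 * (1 + 4 + 16)
print(f[:2])   # level 0: [2, 2]
print(f[2:10]) # level 1: [1, 0, 0, 1, 0, 1, 1, 0]
```

Level 0 is the plain bag-of-words histogram; the finer levels record where in the image each visual word occurred.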


Tutorial