The Duality of Object Retrieval: Unsupervised and Supervised Approaches

TUAN NGUYEN ANH

THE UNIVERSITY OF TOKYO

A Survey about Object Retrieval



Index

•  Part 1: Basic Object Retrieval

Ø Unsupervised approaches

•  Part 2: State-of-the-art results

•  Part 3: Future attempts

Ø Duality & supervised approaches

•  Conclusion


Part 1: Basic Object Retrieval


Object Retrieval

[Figure: a query image (?) and its ranked retrieval results: 1st, 2nd, 3rd, 4th]


Similar images

Related info

Source: https://www.yandex.com/images


Key words for images

Similar images

Source: https://www.google.com/imghp

Related info


Pinterest: Zoom-in Search

Source: https://www.pinterest.com/


Overview of the system

[Diagram: Query, Features, Matching, Database]


Features in object retrieval

[Diagram: Query, Features, Matching, Database]


Local features

•  SIFT [Lowe, 1999, 2004]

•  HOG [Dalal & Triggs, 2005]


Global and deep features

•  GIST features [Oliva et al., 2001]

Ø Describe the images by spectral information

•  Deep features [Krizhevsky et al., 2012]

Ø Extracted from neural networks


Aggregated Features

•  BoF [Sivic et al., 2003]

•  Hamming Embedding [Jégou et al., 2008]

•  Fisher Vector [Perronnin et al., 2007]

•  VLAD [Jégou et al., 2012]


Bag of Features (BoF)

•  Cluster local descriptors to build a dictionary.

•  Compute the BoF vector as a histogram of visual words.

[Diagram: Images, Dictionary (c2, c3), Bag of Features]

[Sivic et al., 2003]
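The histogram step can be sketched as follows. This is a minimal illustration, assuming the dictionary has already been learned (for example by k-means); the function name `bof_histogram` and the toy 2-D values are hypothetical, and real descriptors such as SIFT would be 128-dimensional.

```python
from math import dist

def bof_histogram(descriptors, dictionary):
    """Return the L1-normalized histogram of visual words for one image."""
    hist = [0.0] * len(dictionary)
    for d in descriptors:
        # Assign the descriptor to its nearest visual word (centroid).
        word = min(range(len(dictionary)), key=lambda k: dist(d, dictionary[k]))
        hist[word] += 1.0
    total = sum(hist)
    return [h / total for h in hist] if total else hist

# Toy 2-D "descriptors" and a 3-word dictionary (hypothetical values).
dictionary = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
descriptors = [(1.0, 1.0), (9.0, 1.0), (0.5, 0.2), (1.0, 9.0)]
bof = bof_histogram(descriptors, dictionary)
```

Two images can then be compared through their BoF vectors instead of their raw descriptor sets.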


Hamming Embedding

•  Each local descriptor of an image is encoded by a binary signature, in addition to its visual word.

[Jégou et al., 2008]
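A rough sketch of the idea, assuming the projection matrix and the per-word thresholds are already available (in practice they are learned offline; the values below are hypothetical). The names `he_signature` and `he_match` are illustrative only.

```python
def he_signature(descriptor, projection, thresholds):
    """Binary signature: bit i is 1 when the i-th projection exceeds its threshold."""
    bits = []
    for row, tau in zip(projection, thresholds):
        z = sum(p * x for p, x in zip(row, descriptor))
        bits.append(1 if z > tau else 0)
    return bits

def he_match(word_a, sig_a, word_b, sig_b, max_hamming):
    """Two descriptors match if they share a visual word and have close signatures."""
    if word_a != word_b:
        return False
    hamming = sum(a != b for a, b in zip(sig_a, sig_b))
    return hamming <= max_hamming

# Hypothetical 2-bit embedding of 3-D descriptors; thresholds belong to one visual word.
projection = [(1.0, 0.0, 0.0), (0.0, 1.0, -1.0)]
thresholds = [0.5, 0.0]
sig_q = he_signature((0.9, 0.2, 0.1), projection, thresholds)
sig_d = he_signature((0.8, 0.1, 0.4), projection, thresholds)
```

The signature refines the coarse visual-word match: descriptors in the same cell are rejected when their signatures differ by more than `max_hamming` bits.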

Page 16: A Survey about Object Retrieval

Fisher Vector (FV)

•  Cluster the local descriptors by GMM

•  Fisher Kernel

•  Fisher Vector

[Diagram: Images → Local descriptors → GMM → Fisher Vector]

[Perronnin et al., 2007]
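A simplified sketch of the encoding, assuming a GMM with diagonal covariances has already been fitted. Only the gradient with respect to the means is computed here; a full Fisher Vector usually also includes the variance terms. The function name and the toy parameters are hypothetical.

```python
from math import exp, pi, sqrt

def fisher_vector_means(descriptors, weights, means, variances):
    """Gradient of the GMM log-likelihood w.r.t. the means (diagonal covariances)."""
    n = len(descriptors)
    # Soft assignment (posterior) of each descriptor to each Gaussian.
    posts = []
    for x in descriptors:
        likes = []
        for w, mu, var in zip(weights, means, variances):
            p = w
            for xi, mi, vi in zip(x, mu, var):
                p *= exp(-((xi - mi) ** 2) / (2 * vi)) / sqrt(2 * pi * vi)
            likes.append(p)
        total = sum(likes)  # assumed non-zero for this toy data
        posts.append([l / total for l in likes])
    fv = []
    for k, (w, mu, var) in enumerate(zip(weights, means, variances)):
        for j in range(len(mu)):
            g = sum(posts[i][k] * (descriptors[i][j] - mu[j]) / sqrt(var[j])
                    for i in range(n))
            fv.append(g / (n * sqrt(w)))
    return fv

# Toy GMM with K = 2 components in D = 2 dimensions (hypothetical values).
weights = [0.5, 0.5]
means = [(0.0, 0.0), (4.0, 4.0)]
variances = [(1.0, 1.0), (1.0, 1.0)]
fv = fisher_vector_means([(0.5, 0.0), (4.0, 3.5)], weights, means, variances)
```

The output has K x D entries, one block per Gaussian.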


VLAD

•  Replace the GMM in FV by k-means clustering

•  Approximate FV by accumulating, for each centroid, the residuals of the descriptors assigned to it

[Diagram: Images → Local descriptors → K-means → VLAD Vector]

[Jégou et al., 2012]
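A minimal sketch of VLAD aggregation, assuming the k-means centroids are given. Each descriptor adds its residual to the block of its nearest centroid, and the concatenated vector is L2-normalized. The function name and toy values are hypothetical.

```python
from math import dist, sqrt

def vlad(descriptors, centroids):
    """Concatenate per-centroid sums of residuals, then L2-normalize."""
    dim = len(centroids[0])
    blocks = [[0.0] * dim for _ in centroids]
    for x in descriptors:
        # Hard assignment to the nearest centroid.
        k = min(range(len(centroids)), key=lambda i: dist(x, centroids[i]))
        for j in range(dim):
            blocks[k][j] += x[j] - centroids[k][j]
    flat = [c for block in blocks for c in block]
    norm = sqrt(sum(c * c for c in flat))
    return [c / norm for c in flat] if norm else flat

# Two centroids in 2-D, three toy descriptors (hypothetical values).
centroids = [(0.0, 0.0), (10.0, 0.0)]
v = vlad([(1.0, 0.0), (2.0, 0.0), (10.0, 3.0)], centroids)
```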


Overview of the system

[Diagram: Query, Features, Matching, Database]


Distances and similarities

•  Euclidean distances

•  Hamming distances

•  Inner product

•  Approximated distances with asymmetric distance computation (ADC):

Ø Distance between the query vector and the compressed database vector.

Ø [Jégou et al., 2011]
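The exact measures in the list above can be sketched in a few lines. The helper names and the example vectors are hypothetical; binary codes are assumed to be stored as Python integers.

```python
from math import dist

def hamming(a, b):
    """Number of differing bits between two integer-encoded binary codes."""
    return bin(a ^ b).count("1")

def inner_product(a, b):
    return sum(x * y for x, y in zip(a, b))

q, x = (0.6, 0.8), (1.0, 0.0)      # both vectors have unit length
euclid_sq = dist(q, x) ** 2        # squared Euclidean distance
similarity = inner_product(q, x)
# For L2-normalized vectors: ||q - x||^2 = 2 - 2 <q, x>,
# so ranking by Euclidean distance and by inner product agree.
bits_apart = hamming(0b1011, 0b0010)
```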


Nearest neighbor search

[Diagram: Query, Features, Matching (nearest neighbor search), Database]


Nearest neighbor search

[Figure: nearest neighbor]


Indexing and compressing data

•  Coarse-to-fine strategy

Ø Use quantization techniques to build an inverted file (IVF)

[Figure: inverted file with lists c1: 1, 3; c2: 2; c3: 4, 5, 6. Each list entry stores an id and an m-byte code. Inverted file: faster search. Compressed vector: better memory footprint.]

[Jégou et al., 2011]
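The coarse-to-fine idea can be sketched with an uncompressed inverted file. This is a simplification: the toy index stores raw vectors rather than m-byte codes, the coarse centroids are assumed to be given, and all names (`build_ivf`, `ivf_search`, `nprobe`) are hypothetical. Ids are 0-based, so the lists mirror the figure's c1: 1, 3; c2: 2; c3: 4, 5, 6.

```python
from math import dist

def nearest(vec, centroids):
    return min(range(len(centroids)), key=lambda k: dist(vec, centroids[k]))

def build_ivf(vectors, centroids):
    """Map each coarse centroid to the list of vector ids that fall in its cell."""
    lists = {k: [] for k in range(len(centroids))}
    for vid, vec in enumerate(vectors):
        lists[nearest(vec, centroids)].append(vid)
    return lists

def ivf_search(query, vectors, centroids, lists, nprobe=1):
    # Coarse step: visit only the nprobe cells closest to the query.
    order = sorted(range(len(centroids)), key=lambda k: dist(query, centroids[k]))
    candidates = [vid for k in order[:nprobe] for vid in lists[k]]
    # Fine step: exhaustive comparison inside the selected cells only.
    return min(candidates, key=lambda vid: dist(query, vectors[vid]), default=None)

centroids = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
vectors = [(1.0, 0.0), (9.0, 1.0), (0.0, 1.0), (1.0, 9.0), (2.0, 8.0), (0.0, 11.0)]
lists = build_ivf(vectors, centroids)
best = ivf_search((1.0, 8.8), vectors, centroids, lists, nprobe=1)
```

With `nprobe=1` only three of the six vectors are compared against the query, which is where the speed-up comes from.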


Quantization techniques

•  Compress the data for better memory footprint

•  Search accuracy is acceptable with appropriate parameters

Recall = 95% with 64-bit codes [Jégou et al., 2011]

[Figure: inverted file with lists c1: 1, 3; c2: 2; c3: 4, 5, 6. Each list entry stores an id and an m-byte code.]
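As an illustration of how a vector becomes a short code, here is a toy product quantizer with an asymmetric distance, in the spirit of [Jégou et al., 2011]. The sub-codebooks are assumed to be already trained (hypothetical values below), and a real implementation would precompute per-query lookup tables instead of calling `dist` for every code.

```python
from math import dist

def pq_encode(vec, codebooks):
    """Split vec into len(codebooks) sub-vectors; keep the nearest codeword index of each."""
    sub = len(vec) // len(codebooks)
    return [min(range(len(cb)), key=lambda k: dist(vec[i * sub:(i + 1) * sub], cb[k]))
            for i, cb in enumerate(codebooks)]

def adc_distance(query, code, codebooks):
    """Squared distance between the raw query and the quantized database vector."""
    sub = len(query) // len(codebooks)
    return sum(dist(query[i * sub:(i + 1) * sub], cb[k]) ** 2
               for i, (cb, k) in enumerate(zip(codebooks, code)))

# Two sub-quantizers with two codewords each: a 4-D vector becomes 2 small indices.
codebooks = [[(0.0, 0.0), (1.0, 1.0)], [(0.0, 0.0), (2.0, 2.0)]]
code = pq_encode((0.9, 1.2, 1.8, 2.1), codebooks)
approx = adc_distance((0.0, 0.0, 0.0, 0.0), code, codebooks)
```

The query is never quantized, only the database side, which is why the distance is called asymmetric.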


Feature processing

•  Square rooting [Arandjelovic & Zisserman, 2012]

•  L2-normalization [Jain et al., 2012]

•  Centralization [Tolias et al., 2013]

•  Down-weight highly populated cells in aggregation [Jégou et al., 2009]

•  Whitening [Jégou et al., 2010]
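Three of these steps are simple enough to sketch directly. The helper names are hypothetical; square rooting is written as a signed power normalization with exponent 0.5, and centralization is taken here to mean subtracting the mean vector, which is an assumption about the slide's intent. Whitening is left out because it needs an eigendecomposition.

```python
from math import copysign, sqrt

def power_normalize(vec, alpha=0.5):
    """Signed power normalization; alpha = 0.5 is square rooting."""
    return [copysign(abs(v) ** alpha, v) for v in vec]

def l2_normalize(vec):
    norm = sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else list(vec)

def center(vectors):
    """Subtract the mean vector of the set from every vector."""
    dim = len(vectors[0])
    mean = [sum(v[j] for v in vectors) / len(vectors) for j in range(dim)]
    return [[v[j] - mean[j] for j in range(dim)] for v in vectors]

rooted = power_normalize([4.0, -9.0, 0.0])
unit = l2_normalize([3.0, 4.0])
centered = center([[1.0, 2.0], [3.0, 6.0]])
```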


Image processing: re-ranking

•  Estimate a transformation between the query region and each target image.

•  Target images are re-ranked based on the discriminability of the spatially verified visual words.

mAP with BoF: 0.618→0.645 [Philbin et al., 2007]

Dataset: Oxford Buildings

[Figure: Queries]


Image processing: query expansion

•  Requery after reconstructing the original query.

•  The new query is constructed from the verified results of the first retrieval round.

mAP with BoF: 0.645→0.696 [Chum et al., 2007]

Dataset: Oxford Buildings
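One simple way to build the new query is to average the original query vector with the vectors of the verified results. The sketch below assumes the verification step has already selected those results; the function name and the toy vectors are hypothetical.

```python
def average_query_expansion(query, verified_results):
    """Average the query vector with the vectors of the verified top results."""
    vectors = [query] + list(verified_results)
    return [sum(v[j] for v in vectors) / len(vectors) for j in range(len(query))]

# Toy BoF-like vectors (hypothetical values).
query = [1.0, 0.0]
verified = [[0.0, 1.0], [2.0, 2.0]]
new_query = average_query_expansion(query, verified)
```

The database is then searched again with `new_query` in place of the original query.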


Part 2: State-of-the-art results


Nearest neighbor search

•  Datasets: 1M~1B vectors with ground truth data

Ø BIGANN dataset: http://corpus-texmex.irisa.fr/

•  Evaluation

Ø recall@R = the proportion of queries whose true nearest neighbor is ranked in the top-R results.
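The metric is short enough to write out. This sketch assumes one ground-truth nearest neighbor id per query; the function name and the toy rankings are hypothetical.

```python
def recall_at_r(ranked_lists, true_nn, r):
    """Fraction of queries whose true nearest neighbor appears in the top-r results."""
    hits = sum(1 for ranked, nn in zip(ranked_lists, true_nn) if nn in ranked[:r])
    return hits / len(true_nn)

# Three queries, each with a ranked list of returned ids and one true neighbor id.
ranked_lists = [[5, 2, 9], [1, 7, 3], [4, 8, 6]]
true_nn = [2, 3, 0]
r1 = recall_at_r(ranked_lists, true_nn, 1)
r2 = recall_at_r(ranked_lists, true_nn, 2)
r3 = recall_at_r(ranked_lists, true_nn, 3)
```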


Quantization techniques

•  Additive Quantization [Babenko et al., 2014]

Ø Approximate a vector by the sum of codewords.

Ø Learn codewords by an iterative optimization.

•  Composite Quantization [Zhang et al., 2014]

Ø Minimize the approximation error while keeping the dictionaries near-orthogonal.


Indexing techniques

•  Multi-indexing [Babenko et al., 2012, 2015]

•  Performance on a dataset of one billion SIFT vectors

Ø Memory: 12 GB

Ø Search time: 2 ms/query

Ø recall@100 = 70%


Image search

•  Datasets: Oxford Buildings dataset [Philbin et al., 2007]

•  Evaluation

Ø mAP: Mean average precision for a set of queries is the mean of the average precision scores for each query.
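A simplified sketch of the metric, assuming each query comes with a ranked result list and a set of relevant ids. The official Oxford Buildings protocol also handles "junk" images, which this toy version ignores; function names and data are hypothetical.

```python
def average_precision(ranked, relevant):
    """Mean of the precision values measured at the rank of each relevant item."""
    hits, precisions = 0, []
    for rank, item in enumerate(ranked, start=1):
        if item in relevant:
            hits += 1
            precisions.append(hits / rank)
    return sum(precisions) / len(relevant) if relevant else 0.0

def mean_average_precision(runs):
    """runs is a list of (ranked_list, relevant_set) pairs, one per query."""
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs)

runs = [
    (["a", "x", "b"], {"a", "b"}),   # AP = (1/1 + 2/3) / 2
    (["x", "c"], {"c"}),             # AP = (1/2) / 1
]
m_ap = mean_average_precision(runs)
```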


Selective Match Kernel

•  [Tolias et al., 2013]

•  Apply the power normalization to each VLAD component to improve the accuracy.

•  Use hashing to reduce the memory footprint.

•  mAP = 0.817 on Oxford5K dataset [Philbin et al., 2007]


Neural Codes

•  [Babenko et al., 2014]

•  Use features extracted from a neural network for object retrieval.

•  Features are fine-tuned.

•  mAP = 0.435 with fc6 features on Oxford5K dataset.


Sum-pooled convolutional features

•  [Babenko et al., 2015]

•  Deep features are sum-pooled and Gaussian weighted to improve the accuracy.

•  mAP = 0.657 on Oxford5K dataset.
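The pooling step can be sketched as a Gaussian-weighted sum over the spatial positions of a convolutional feature map. This is only an illustration: the default `sigma` is an assumed choice, the toy map is hypothetical, and any PCA or whitening applied afterwards is omitted.

```python
from math import exp, sqrt

def spoc(feature_map, sigma=None):
    """feature_map[y][x] is a list of C activations; returns an L2-normalized C-dim vector."""
    h, w = len(feature_map), len(feature_map[0])
    cy, cx = (h - 1) / 2, (w - 1) / 2
    if sigma is None:
        sigma = max(min(h, w) / 6, 1e-6)   # assumed default, not taken from the slides
    pooled = [0.0] * len(feature_map[0][0])
    for y in range(h):
        for x in range(w):
            # Positions near the image center receive a larger weight.
            a = exp(-((y - cy) ** 2 + (x - cx) ** 2) / (2 * sigma ** 2))
            for c, f in enumerate(feature_map[y][x]):
                pooled[c] += a * f
    norm = sqrt(sum(p * p for p in pooled))
    return [p / norm for p in pooled] if norm else pooled

# 3x3 map with 2 channels: channel 0 fires at the center, channel 1 at a corner.
fmap = [[[0.0, 0.0] for _ in range(3)] for _ in range(3)]
fmap[1][1][0] = 1.0
fmap[0][0][1] = 1.0
descriptor = spoc(fmap)
```

With equal activations, the centered channel dominates the descriptor because of the Gaussian weighting.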


Summary of image retrieval results

•  Search frameworks with deep features in object retrieval still need to be improved.

Method                                Feature        Framework  mAP
ASMK [Tolias et al., 2013]            SIFT           VLAD       0.817
Neural codes [Babenko et al., 2014]   Deep features  -          0.435
SPoC [Babenko et al., 2015]           Deep features  SPoC       0.657


Part 3: Future attempts


Attempts on current topics

•  Improve the features:

Ø Feature fusion

Ø Find new match kernels

Ø Improve the system with deep features?

•  Improve the distance metrics and NN search.


Dual-process system

•  [Stanovich et al., 1999, 2004]

Fast, high capacity, implicit knowledge and basic emotions only.

Slow, limited capacity, explicit knowledge and complicated emotions.


Supervised Object Retrieval?

•  More than just applying deep features to retrieval.

•  Learning while searching?

•  Learning with feedback?


The Duality of Object Retrieval

•  The collaboration between unsupervised learning and supervised learning in object retrieval.

[Stanovich et al., 1999, 2004]


Conclusion

•  Basic Object Retrieval

Ø Features: SIFT, HOG, GIST, deep features

Ø Distance metrics and NN search

Ø Hamming Embedding and Aggregation

Ø Pre-processing and post-processing

•  State-of-the-art results

•  Future attempts: Duality & Supervised & Unsupervised?


Thank you for listening