
Lost in Quantization: Improving Particular Object Retrieval in Large Scale Image Databases

CVPR 2008

James Philbin, Ondřej Chum, Michael Isard, Josef Sivic, Andrew Zisserman

[7] O. Chum, J. Philbin, J. Sivic, M. Isard, and A. Zisserman. Total recall: Automatic query expansion with a generative feature model for object retrieval. In Proc. ICCV, 2007.

Outline

• Introduction
• Methods in this paper
• Experiment & Result
• Conclusion


Introduction

• Goal
  – Specific object retrieval from an image database

• For large databases
  – Achieved by systems inspired by text retrieval (visual words)

Flow

1. Get features
  – SIFT

2. Cluster
  – Approximate k-means

3. Feature quantization
  – Visual words
  – Soft-assignment (on the query side)

4. Re-ranking
  – RANSAC spatial verification

5. Query expansion
  – Average query expansion

Outline

• Introduction
• Methods in this paper
• Experiment & Result
• Conclusion

Feature

• SIFT


Quantization (visual word)

• Point list: [(2,3), (5,4), (9,6), (4,7), (8,1), (7,2)]
• Sorted list: [(2,3), (4,7), (5,4), (7,2), (8,1), (9,6)]
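Hard quantization assigns each descriptor to its single nearest cluster center. A minimal sketch in pure Python, reusing the toy 2-D points above (the real system quantizes 128-D SIFT descriptors with approximate k-means rather than brute-force search; `nearest_word` is a hypothetical helper):

```python
import math

def nearest_word(descriptor, centers):
    """Hard-assignment: index of the closest cluster center (visual word).

    descriptor: tuple of coordinates; centers: list of tuples.
    Brute-force toy version; the paper uses approximate nearest-neighbor
    search over a 1M-word vocabulary instead.
    """
    best, best_dist = 0, float("inf")
    for i, c in enumerate(centers):
        d = math.dist(descriptor, c)
        if d < best_dist:
            best, best_dist = i, d
    return best

# The point list from the slide, treated as toy 2-D "descriptors":
points = [(2, 3), (5, 4), (9, 6), (4, 7), (8, 1), (7, 2)]
print(nearest_word((6, 4), points))  # closest center is (5, 4), index 1
```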

Soft-assignment of visual words

• Matching two image features under bag-of-visual-words hard-assignment:
  – Yes if assigned to the same visual word
  – No otherwise

• Soft-assignment:
  – A weighted combination of visual words

Soft-assignment of visual words

(Figure) A–E represent cluster centers (visual words); points 1–4 are features

Soft-assignment of visual words

• Weight assigned to a visual word: w = exp(−d² / (2σ²))
  – d is the distance from the cluster center to the descriptor
• In practice, σ is chosen so that a substantial weight is only assigned to a few cells
• The essential parameters:
  – the spatial scale σ
  – r, the number of nearest neighbors considered

Soft-assignment of visual words

• Assigning these weights to the r nearest neighbors, the descriptor is represented by an r-vector, which is then L1-normalized
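The soft-assignment step above can be sketched as follows: Gaussian weights exp(−d²/(2σ²)) for the r nearest centers, L1-normalized into an r-vector. A toy 2-D sketch in pure Python (the σ and r values, and the helper name, are illustrative, not the paper's tuned settings):

```python
import math

def soft_assign(descriptor, centers, r=3, sigma=2.0):
    """Soft-assignment sketch: weight exp(-d^2 / (2*sigma^2)) for each of
    the r nearest cluster centers, then L1-normalize so the weights sum
    to one.  Returns (center_index, weight) pairs.
    """
    # Distances to all centers, keep the r nearest.
    nearest = sorted(
        (math.dist(descriptor, c), i) for i, c in enumerate(centers)
    )[:r]
    weights = [(i, math.exp(-d * d / (2 * sigma * sigma))) for d, i in nearest]
    total = sum(w for _, w in weights)
    return [(i, w / total) for i, w in weights]  # L1-normalized r-vector
```

With a small σ, almost all of the weight lands on the closest cell, recovering hard-assignment as a limiting case.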

TF–IDF weighting

• Standard index architecture

TF–IDF weighting

• tf
  – 'a' appears 3 times in a 100-word document
  – tf = 3 / 100 = 0.03

• idf
  – 1,000 documents contain 'a', out of 10,000,000 documents in total
  – idf = ln(10,000,000 / 1,000) ≈ 9.21

• tf–idf = 0.03 × 9.21 ≈ 0.28
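The worked example above is a one-liner in code (hypothetical helper, using the same ln-based idf as the slide):

```python
import math

def tf_idf(term_count, doc_len, docs_with_term, total_docs):
    """tf-idf as in the worked example:
    tf = term_count / doc_len, idf = ln(total_docs / docs_with_term)."""
    tf = term_count / doc_len
    idf = math.log(total_docs / docs_with_term)
    return tf * idf

# 'a' appears 3 times in a 100-word document;
# 1,000 of 10,000,000 documents contain 'a'.
print(round(tf_idf(3, 100, 1_000, 10_000_000), 2))  # 0.28
```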

TF–IDF weighting

• In this paper
  – For the term frequency (tf), we simply use the normalized weight value for each visual word
  – For the inverse document frequency (idf), counting an occurrence of a visual word as one, no matter how small its weight, gave the best results

Re-ranking

• RANSAC
  – Affine transform Θ: Y = AX + b

• Algorithm
  1. Randomly choose n points
  2. Use the n points to estimate Θ
  3. Apply Θ to the remaining N − n points
  4. Count the inliers
  – Repeat steps 1–4 K times
  – Pick the best Θ
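The RANSAC loop above can be sketched with a deliberately simplified 1-D model y = a·x + b (the paper fits a full 2-D affine transform Y = AX + b, but the loop structure is identical; the function name, iteration count, and threshold here are illustrative):

```python
import random

def ransac_line(points, k=100, threshold=1.0, seed=0):
    """Simplified RANSAC: fit y = a*x + b to (x, y) points with outliers.

    Repeat k times: sample a minimal set (2 points), fit the model,
    count inliers within `threshold`, and keep the model with the most
    inliers.
    """
    rng = random.Random(seed)
    best_model, best_inliers = None, -1
    for _ in range(k):
        (x1, y1), (x2, y2) = rng.sample(points, 2)
        if x1 == x2:
            continue  # degenerate sample, cannot fit a slope
        a = (y2 - y1) / (x2 - x1)
        b = y1 - a * x1
        inliers = sum(1 for x, y in points if abs(a * x + b - y) < threshold)
        if inliers > best_inliers:
            best_model, best_inliers = (a, b), inliers
    return best_model, best_inliers

# Mostly points on y = 2x + 1, plus two gross outliers:
pts = [(x, 2 * x + 1) for x in range(10)] + [(3, 40), (7, -20)]
model, inliers = ransac_line(pts)  # recovers a ≈ 2, b ≈ 1 with 10 inliers
```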

Re-ranking

• In this paper
  – Not only counting the number of inlier correspondences, but also adding the scoring function: the cosine similarity cos(q, d) = (q · d) / (‖q‖ ‖d‖) between tf–idf vectors
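The cosine score mentioned above, between two tf–idf vectors, can be sketched as (dense toy vectors; the real system uses sparse vectors over a 1M-word vocabulary; `cosine` is a hypothetical helper):

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length tf-idf vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0
```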

Average query expansion

• Obtain the top (m < 50) spatially verified results of the original query
• Construct a new query using the average of these results:

  d_avg = ( d0 + d1 + … + dm ) / (m + 1)

  – where d0 is the normalized tf vector of the query region
  – di is the normalized tf vector of the i-th result

• Re-query once
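The averaging step can be sketched as (dense toy tf vectors; `average_query` is a hypothetical helper):

```python
def average_query(d0, results):
    """Average query expansion: the new query is the mean of the original
    tf vector d0 and the tf vectors of the top verified results,
    d_avg = (d0 + sum(d_i)) / (m + 1) with m = len(results)."""
    m = len(results)
    return [
        (q + sum(d[j] for d in results)) / (m + 1)
        for j, q in enumerate(d0)
    ]

d0 = [1.0, 0.0, 0.0]
results = [[0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
print(average_query(d0, results))  # each component equals 1/3
```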

Outline

• Introduction
• Methods in this paper
• Experiment & Result
• Conclusion

Dataset

• Crawled from Flickr, high resolution (1024×768)

• Oxford buildings
  – 5,062 images
  – 11 landmarks used as queries

• Paris
  – Used for building the quantization (vocabulary)
  – 6,300 images

• Flickr1
  – 145 most popular tags
  – 99,782 images

Dataset

• Query– 55 queries: 5 queries for each of 11 landmarks

Baseline

• Follow the architecture of previous work [15]
• A visual vocabulary of 1M words is generated using approximate k-means

[15] J. Philbin, O. Chum, M. Isard, J. Sivic, and A. Zisserman. Object retrieval with large vocabularies and fast spatial matching. In Proc. CVPR, 2007


Evaluation

• Compute an Average Precision (AP) score for each of the 5 queries for a landmark
  – Area under the precision–recall curve
  – Precision = RPI / TNIR
  – Recall = RPI / TNPC
    • RPI = retrieved positive images
    • TNIR = total number of images retrieved
    • TNPC = total number of positives in the corpus

• Average these to obtain a Mean Average Precision (MAP)

(Figure: precision–recall curve)
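The AP computation can be sketched as below. This is a common formulation (mean of the precision values at each rank where a positive appears), assuming the ranked list covers every positive in the corpus; it is not the authors' exact code:

```python
def average_precision(ranked_relevance):
    """AP from a ranked list of 0/1 relevance flags.

    Averages the precision at each rank where a relevant item appears.
    Assumes every positive in the corpus occurs somewhere in the list.
    """
    hits, precisions = 0, []
    for rank, rel in enumerate(ranked_relevance, start=1):
        if rel:
            hits += 1
            precisions.append(hits / rank)  # precision at this rank
    return sum(precisions) / len(precisions) if precisions else 0.0

# Positives at ranks 1 and 3: AP = (1/1 + 2/3) / 2 = 5/6
print(average_precision([1, 0, 1, 0]))
```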

Evaluation

• Datasets
  – Oxford only (D1): 5,062 images
  – Oxford (D1) + Flickr1 (D2): 104,844 images

• Vector quantizers
  – Built from Oxford or Paris

Result

[14] D. Nister and H. Stewenius. Scalable recognition with a vocabulary tree. In Proc. CVPR, 2006.

[18] T. Tuytelaars and C. Schmid. Vector quantizing feature space with a regular lattice. In Proc. ICCV, 2007.

[15] J. Philbin, O. Chum, M. Isard, J. Sivic, and A. Zisserman. Object retrieval with large vocabularies and fast spatial matching. In Proc. CVPR, 2007.

Parameter variation
Comparison with other methods

Result

Effect of vocabulary size

Spatial verification

Result

Query expansion

Scaling-up to 100K images

Result

Result

ashmolean_3 goes from 0.626 AP to 0.874 AP
christ_church_5 increases from 0.333 AP to 0.813 AP

Outline

• Introduction
• Methods in this paper
• Experiment & Result
• Conclusion

Conclusion

• A new method of visual word assignment was introduced:
  – descriptor-space soft-assignment

• It recovers descriptor information that is lost in the quantization step of previously published methods.
