Semi-Supervised Hashing for Scalable Image Retrieval

Jun Wang (1), Sanjiv Kumar (2), and Shih-Fu Chang (1)
(1) Department of Electrical Engineering, Columbia University, New York, USA
(2) Google Research, New York, USA

Introduction and Motivation

The emerging need for fast and accurate indexing in large-scale content-based image retrieval (CBIR):
- Exhaustive search is infeasible due to its computational and storage cost.
- Instead of exact nearest neighbors, approximate nearest neighbor (ANN) search is more practical for large databases.
- Compared with tree-based ANN methods, the hashing approach is more scalable and efficient.

Main issues:
- Existing hashing methods mostly rely on random or principal projections, which are not very compact or accurate.
- Simple metrics are usually not enough to express semantic similarity.

Goal: Given a large unlabeled set with a few pair-wise labeled points, learn data-dependent compact hash codes. (Figure: a good partition separates dissimilar pairs while keeping similar pairs together; a poor partition does not.)

Related Work

- Locality Sensitive Hashing (LSH), Indyk et al., 1998: hash functions are random projections sampled from a suitable distribution, combined with a random shift and a bucket width tied to the data range.
- Spectral Hashing (SH), Weiss et al., 2008: assumes zero-centered data and uses a mean threshold; bits come from sinusoidal binarization over PCA projections (the principal projections of the data), with bit assignment governed by spatial frequency.
- Restricted Boltzmann Machines (RBMs), Hinton et al., 2006: learn deep belief networks to obtain compact binary codes; training has two steps, unsupervised pre-training and supervised fine-tuning; estimating a large number of weights requires lots of labeled data and long training time.
- (Figure: conceptual comparison of the proposed SSH method with LSH, SH, and RBMs.)
- The other semi-supervised hashing work in this year's CVPR: Y. Mu et al., Weakly-Supervised Hashing in Kernel Space.

A toy sketch of the LSH and SH baselines follows this section.
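To make the two unsupervised baselines above concrete, here is a minimal NumPy sketch. It is not the authors' code, and several details are simplifying assumptions: a Gaussian projection distribution and fixed bucket width stand in for LSH's data-range-dependent quantization, and a single sinusoid per PCA direction stands in for SH's selection of eigenfunctions by spatial frequency.

```python
import numpy as np

rng = np.random.default_rng(0)

def lsh_hash(X, n_bits=16, bucket_width=1.0):
    """Toy LSH: random projections plus a random shift, quantized into buckets."""
    d = X.shape[1]
    W = rng.normal(size=(d, n_bits))            # projections sampled from a Gaussian
    b = rng.uniform(0, bucket_width, n_bits)    # random shift
    return np.floor((X @ W + b) / bucket_width).astype(int)

def sh_hash(X, n_bits=16):
    """Toy Spectral-Hashing-style codes: mean threshold (zero-centering),
    PCA projections, then sinusoidal binarization."""
    Xc = X - X.mean(axis=0)                     # zero-centered data / mean threshold
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    proj = Xc @ Vt[:n_bits].T                   # principal projections of the data
    scale = np.abs(proj).max(axis=0) + 1e-12    # simplified per-bit frequency choice
    return (np.sin(np.pi * proj / scale) > 0).astype(np.uint8)

X = rng.normal(size=(1000, 64))                 # stand-in for image descriptors
print(lsh_hash(X).shape, sh_hash(X).shape)      # (1000, 16) (1000, 16)
```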
Semi-Supervised Formulation

Incorporate both metric similarity and semantic similarity for fast and accurate indexing.

Given pair-wise labeled data (a small set of similar pairs and dissimilar pairs) together with the large unlabeled set:
- Empirical fitness: hash bits should agree on similar pairs and disagree on dissimilar pairs.
- Regularization of maximum variance: each bit should have maximum variance over the full data set.
- Final objective function: empirical fitness over the labeled pairs plus the maximum-variance regularizer.

A hedged reconstruction of the objective in equation form follows this section.
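The poster only names the pieces of the formulation; the equations below are a hedged reconstruction based on the surrounding descriptions and the accompanying CVPR 2010 paper. The notation is assumed: hash functions h_k(x) = sgn(w_k^T x) on mean-centered data, labeled points X_l with pair-wise labels S_ij = +1 for similar and -1 for dissimilar pairs, all data X, projection matrix W = [w_1, ..., w_K], and a weighting coefficient eta.

```latex
% Empirical fitness over the labeled pairs (reconstruction, notation assumed):
J(W) \;=\; \sum_{k=1}^{K} \sum_{(i,j)\ \text{labeled}} S_{ij}\, h_k(x_i)\, h_k(x_j)

% Relaxed matrix form plus the maximum-variance regularizer over all data:
J(W) \;=\; \tfrac{1}{2}\,\mathrm{tr}\!\big(W^{\top} X_l S X_l^{\top} W\big)
        \;+\; \tfrac{\eta}{2}\,\mathrm{tr}\!\big(W^{\top} X X^{\top} W\big)
      \;=\; \tfrac{1}{2}\,\mathrm{tr}\!\big(W^{\top} M W\big),
\qquad M \;=\; X_l S X_l^{\top} + \eta\, X X^{\top}
```

Maximizing tr(W^T M W) subject to W^T W = I is what the next section calls the standard eigen-problem on the "adjusted" covariance matrix M.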
Orthogonal Hash Codes

- Simplified objective function in matrix form: find maximum-variance projections over the data covariance matrix "adjusted" by incorporating the pair-wise labeled data.
- By imposing orthogonality constraints, the objective function can be converted to a standard eigen-problem.
- Orthogonal solution: the principal projections of the "adjusted" covariance matrix.
- Issues: most real datasets have their variance concentrated in the top few projections; the orthogonality constraints force the method to progressively pick low-variance directions, substantially reducing the quality of the Hamming embedding.

Non-Orthogonal Hash Codes

- Relax the orthogonality constraints and combine them into the objective function as a penalty term.
- Can be solved by a simple modification of the previous orthogonal solution: choose a proper coefficient to guarantee a positive-definite matrix, then use a Cholesky factorization to obtain the non-orthogonal solution.
- (A numerical sketch of both solvers follows the Conclusions.)

Experiments and Conclusions

- MNIST (70K images, 2K training labels, 1K test samples).
- One million GIST descriptors (10K training labels, 2K test samples).
- (Figures: retrieval performance on both datasets, comparing supervised and unsupervised methods.)

Conclusions:
- A semi-supervised formulation for hash function learning.
- Orthogonal and non-orthogonal solutions by simple SVD.
- Easy implementation, fast training, and superior performance.
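The NumPy sketch below illustrates the two solvers just described, using the reconstructed objective from the formulation section. It is an illustration rather than the authors' implementation: the adjusted matrix M, the weighting eta, the penalty coefficient rho, and especially the way the Cholesky factor is combined with the orthogonal solution are assumptions; the paper gives the exact derivation.

```python
import numpy as np

def adjusted_covariance(X, X_l, S, eta=1.0):
    """Matrix form from the previous section, written for row-major data
    (rows of X and X_l are points): M = X_l^T S X_l + eta * X^T X."""
    return X_l.T @ S @ X_l + eta * (X.T @ X)            # (d, d), symmetric

def orthogonal_solution(M, n_bits):
    """Orthogonal hash codes: top eigenvectors (principal projections) of M."""
    eigvals, eigvecs = np.linalg.eigh(M)                 # ascending eigenvalues
    return eigvecs[:, ::-1][:, :n_bits]                  # (d, n_bits)

def non_orthogonal_solution(M, n_bits, rho=None):
    """Non-orthogonal relaxation (assumed form): pick rho so that I + M/rho is
    positive-definite, Cholesky-factorize it, and use the factor to modify the
    orthogonal solution. The exact transform is given in the paper."""
    if rho is None:
        rho = abs(np.linalg.eigvalsh(M).min()) + 1.0     # guarantees positive-definiteness
    L = np.linalg.cholesky(np.eye(M.shape[0]) + M / rho)
    return L @ orthogonal_solution(M, n_bits)            # assumed combination step

# Toy usage: random data with a handful of similar (+1) / dissimilar (-1) pairs.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 32))
X_l = X[:20]
S = np.sign(rng.normal(size=(20, 20)))
S = (S + S.T) / 2                                        # symmetric toy label matrix
W = orthogonal_solution(adjusted_covariance(X, X_l, S), n_bits=16)
codes = (X @ W > 0).astype(np.uint8)                     # sign binarization -> 16-bit codes
```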

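For completeness, here is a small sketch of how such codes would be used at query time: rank database items by Hamming distance to the query code. This mirrors the usual hashing evaluation setup; the exact evaluation metrics used in the experiments are described in the paper, and the helper below is only illustrative.

```python
import numpy as np

def hamming_rank(query_code, db_codes):
    """Indices of database items sorted by Hamming distance to the query code."""
    dists = np.count_nonzero(db_codes != query_code, axis=1)
    return np.argsort(dists, kind="stable")

# Example with random 16-bit codes; in practice these come from (X @ W > 0).
rng = np.random.default_rng(1)
db_codes = rng.integers(0, 2, size=(1000, 16), dtype=np.uint8)
query_code = db_codes[0]
print(hamming_rank(query_code, db_codes)[:10])           # ten nearest database items
```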
