
IEEE TRANSACTIONS ON IMAGE PROCESSING, VOL. 25, NO. 10, OCTOBER 2016

Query-Adaptive Hash Code Ranking for Large-Scale Multi-View Visual Search

Xianglong Liu, Member, IEEE, Lei Huang, Cheng Deng, Member, IEEE, Bo Lang, and Dacheng Tao, Fellow, IEEE

Abstract— Hash-based nearest neighbor search has become attractive in many applications. However, the quantization in hashing usually degenerates the discriminative power when using Hamming distance ranking. Besides, for large-scale visual search, existing hashing methods cannot directly support efficient search over data with multiple sources, while the literature has shown that adaptively incorporating complementary information from diverse sources or views can significantly boost the search performance. To address these problems, this paper proposes a novel and generic approach to building multiple hash tables with multiple views and generating fine-grained ranking results at bitwise and tablewise levels. For each hash table, a query-adaptive bitwise weighting is introduced to alleviate the quantization loss by simultaneously exploiting the quality of hash functions and their complement for nearest neighbor search. From the tablewise aspect, multiple hash tables are built for different data views as a joint index, over which a query-specific rank fusion is proposed to rerank all results from the bitwise ranking by diffusing in a graph. Comprehensive experiments on image search over three well-known benchmarks show that the proposed method achieves up to 17.11% and 20.28% performance gains on single and multiple table search over the state-of-the-art methods.

Index Terms— Locality-sensitive hashing, hash code ranking, multiple views, multiple hash tables, nearest neighbor search, query adaptive ranking.

I. INTRODUCTION

HASH based nearest neighbor search has attracted great attention in many areas, such as large-scale visual search [1]–[7], object detection [8], classification [9], [10],

Manuscript received November 29, 2015; revised May 13, 2016; accepted July 13, 2016. Date of publication July 19, 2016; date of current version August 5, 2016. This work was supported in part by the Foundation of the State Key Laboratory of Software Development Environment under Grant SKLSDE-2016ZX-04, in part by the Innovation Foundation of BUAA for Ph.D. Graduates, in part by the National Natural Science Foundation of China under Grant 61402026 and Grant 61572388, and in part by the Australian Research Council under Grant DP-140102164, Grant LE-140100061, and Grant FT-130101457. The associate editor coordinating the review of this manuscript and approving it for publication was Prof. Ling Shao. (Corresponding author: Dacheng Tao.)

X. Liu, L. Huang, and B. Lang are with the State Key Laboratory of Software Development Environment, Beihang University, Beijing 100191, China (e-mail: [email protected]; [email protected]; [email protected]).

C. Deng is with the School of Electronic Engineering, Xidian University, Xi’an 710071, China (e-mail: [email protected]).

D. Tao is with the Centre for Quantum Computation & Intelligent Systems and the Faculty of Engineering and Information Technology, University of Technology Sydney, Ultimo, NSW 2007, Australia (e-mail: [email protected]).

Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.

Digital Object Identifier 10.1109/TIP.2016.2593344

recommendation [11], etc. Locality-Sensitive Hashing (LSH) serves as a wide paradigm of hash methods that map similar data into similar codes [12]. As the most well-known instance, random projection based hashing projects and quantizes high-dimensional data into binary hash codes, offering compressed storage and fast queries by computing the Hamming distance [13].

Even with sound theoretical guarantees, the random strategy usually requires a large number of hash functions to achieve the desired discriminative power. Motivated by the concept of locality sensitivity, many subsequent works have been devoted to pursuing more compact hash codes in unsupervised [14]–[19] and supervised manners [20]–[22], using different techniques including nonlinear hash functions [22]–[25], multiple features [3], [6], [26], [27], multiple bits [15], [28], [29], bit selection [30], [31], multiple tables [27], [32], [33], discrete optimization [34], deep learning [35], structured quantization [36], [37], projection bank [38], online sketch [39], etc.

The hashing techniques, integrating the property of subspace methods [40], [41] and efficient bit manipulations, can help achieve compressed storage and fast computation in large-scale nearest neighbor search. However, more than one bucket may share the same Hamming distance to the query, and consequently samples falling into these buckets will be ranked equally according to the Hamming distance. Therefore, the quantization in hashing loses the exact ranking information among the samples, and thus degenerates the discriminative power of the Hamming distance measurement. To improve the ranking accuracy using Hamming distance, it is necessary to enable fine-grained ranking by alleviating the quantization loss.

As in many traditional applications like classification [42] and retrieval [43], a weighting scheme can serve as one of the most powerful and successful techniques in hashing, which assesses the quality of each hash bit to improve the discriminative power of the hash code. Reference [44] proposed a query-adaptive Hamming distance ranking method using bitwise weights learned for a diverse set of predefined semantic concept classes. References [45] and [46] respectively studied query-sensitive hash code ranking algorithms (QsRank and WhRank) based on data-adaptive and query-sensitive bitwise weights, without compressing query points to discrete hash codes. Despite this progress and the promising performance, there are still certain limitations when using these methods. For instance, these methods are usually designed for projection based hashing algorithms



rather than cluster based ones like K-means hashing [47]. Moreover, methods like WhRank heavily rely on assumptions about the data distribution (e.g., a Laplace distribution for spectral hashing [14]), which do not always hold in practice.

Another promising solution to enabling fine-grained ranking is incorporating information from multiple views. For example, in many applications, especially large-scale search, prior studies have proved that adaptively incorporating complementary information from diverse sources or views of the same object (e.g., images can be described by different visual features like SIFT and GIST) can help boost the performance [3], [6], [33], [48]–[51]. Though there is some research on multiple feature hashing [3], [6], [27], which learns discriminative hash functions and improves the search performance by adaptively leveraging multiple cues, until now there has been little work on a unified solution to building multiple hash tables from complementary source views with fine-grained ranking. In practice, indexing using hash tables is more computationally feasible, due to the fast search in constant time using table lookup [8], [32], [33], and can significantly enhance the discriminative power of the hash codes [27], [32].

To this end, in this paper we simultaneously study both techniques and propose a query-adaptive hash code fusion based ranking method over multiple tables with multiple views. For each hash table learned from each view, we exploit the similarities between the query and database samples, and learn a set of query-adaptive bitwise weights that characterize both the discriminative power of each hash function and their complement for nearest neighbor search. Assigning different weights to individual hash bits distinguishes the results sharing the same Hamming distance, and yields a more fine-grained and accurate ranking order. Compared to existing methods, our method is more general for different types of hashing algorithms, without strict assumptions on the data distribution. Meanwhile, it can faithfully enhance the overall discriminative power of the weighted Hamming distance.

As to multi-view multiple tables, a straightforward yet efficient solution is to build hash tables from each source view separately and then combine them as a joint table index. Such a solution directly supports any type of source and existing hashing algorithm, and can be easily accomplished for general applications. To further adaptively fuse search results from multiple sources in a desired order, many studies have been proposed in the literature [48], [49], [52]–[54]. These powerful rank fusion techniques faithfully boost the search performance, but in practice are usually not computationally feasible for hash table search, because they heavily depend on fine-grained ranking. For instance, [50] and [51] computed the data similarities relying on raw features and exact neighbor structures respectively, which can neither be loaded into memory nor be dynamically updated for a dynamic dataset. To adaptively fuse the query-specific ordered results without operating on the raw features, an anchor representation for candidates in each source view is introduced to characterize their neighbor structure by a neighbor graph according to their weighted Hamming distances, and then all the candidates are reranked by randomly walking on the merged multiple graphs.

Fig. 1. Demonstration of the proposed query-adaptive bitwise weighting.

To the best of our knowledge, this is the first work that comprehensively studies large-scale search based on hash table indexing with multiple sources. Compared to existing methods, our method is more generic for different types of hashing algorithms and data sources, without strict assumptions on the data distribution and memory consumption. Meanwhile, it can faithfully enhance the overall ranking performance for nearest neighbor search from the aspects of fine-grained code ranking and multiple feature fusion. Note that this paper extends a previous conference publication [55], which mainly concentrated on hash code ranking in a single hash table generated from one view. In this paper, we additionally explore the fine-grained ranking technique over multiple tables with multiple features, and provide expanded experimental results. The remaining sections are organized as follows: The query-adaptive bitwise weighting approach is presented in Section II. Section III elaborates on the graph-based fusion of multiple table rankings. In Section IV we evaluate the proposed method against state-of-the-art methods over several large datasets. Finally, we conclude in Section V.

II. QUERY-ADAPTIVE HASH CODE RANKING

In this section, we first focus on the fine-grained ranking in each hash table, and propose the query-adaptive ranking method (named QRank) using the weighted Hamming distance. It utilizes the similarity relationship between the query point and its neighbors in the database to measure the overall discriminative power of the hash code, simultaneously considering both the quality of hash functions and their correlations.

A. Query-Adaptive Weight

Given a set of N training samples $\{x_i \in \mathbb{R}^D, i = 1, \ldots, N\}$ and a set of B hash functions $H(\cdot) = \{h_1(\cdot), \ldots, h_B(\cdot)\}$, each training sample $x_i$ is encoded into the hash bit $y_{ik} = h_k(x_i) \in \{-1, 1\}$ by the k-th hash function. With the hash functions and corresponding weights w, the weighted Hamming distance between any two points $x_i$ and $x_j$ is usually defined as

$$d_h(x_i, x_j) = \sum_{k=1}^{B} w_k\, (y_{ik} \otimes y_{jk}).$$

To improve the ranking precision of the weighted distance $d_h$, a data-dependent and query-adaptive $w_k$ should be learnt to characterize the overall quality of the k-th hash function in H. Intuitively, for a query point q, a hash function well preserving q's nearest neighbors NN(q) (e.g., $h_3$ and $h_4$ in Figure 1) should play a more important role in the


weighted Hamming distance, and thus a larger weight should be assigned to it.

Formally, for a high-quality hash function $h_k$, if p ∈ NN(q), then the higher its similarity s(p, q) to q, the larger the probability that $h_k(q) = h_k(p)$. Therefore, based on the neighbor preservation of each hash function, we define its weight using the spectral embedding loss [14], [31]:

$$w_k = -\frac{1}{2} \sum_{p \in NN(q)} s(q, p)\, \|h_k(q) - h_k(p)\|^2 = \sum_{p \in NN(q)} s(q, p)\, h_k(q)\, h_k(p) + \mathrm{const}, \qquad (1)$$

where we constrain $\sum_{p \in NN(q)} s(q, p) = 1$. Note that in the above definition the similarity can be adaptively tailored for different scenarios.

To make the weight positive and sensitive to the capability of neighbor preservation, in practice we use the following form with γ > 0:

$$w_k = \exp\Big[\gamma \sum_{p \in NN(q)} s(q, p)\, h_k(q)\, h_k(p)\Big]. \qquad (2)$$

In the above definition, one important question is how to efficiently find the query's nearest neighbors NN(q) at the online query stage. It is infeasible to find the exact ones among the whole database [56]. One way to speed up the computation is choosing $N_l \ll N$ landmarks to represent the database using various techniques like K-means, where the approximate nearest neighbors can be quickly discovered by linear scan. We adopt the simple way of randomly sampling $N_l$ points as the landmarks at the offline training stage.
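To make the bitwise weighting concrete, the following is a minimal sketch (ours, not the authors' released code) of Eq. (2) and the weighted Hamming distance, assuming hash codes are stored as {−1, +1} vectors and the landmark similarities s(q, p) are precomputed; all function and variable names are illustrative:

    import numpy as np

    def query_adaptive_weights(q_code, nn_codes, nn_sims, gamma=1.0):
        # Eq. (2): w_k = exp(gamma * sum_p s(q,p) h_k(q) h_k(p)).
        # q_code: (B,) query code in {-1,+1}; nn_codes: (n,B) codes of the
        # approximate nearest neighbors; nn_sims: (n,) similarities s(q,p).
        nn_sims = nn_sims / nn_sims.sum()            # enforce sum_p s(q,p) = 1
        agreement = nn_sims @ (nn_codes * q_code)    # per-bit neighbor preservation
        return np.exp(gamma * agreement)

    def weighted_hamming(q_code, db_codes, w):
        # d_h(q, x) = sum_k w_k * [h_k(q) != h_k(x)] for codes in {-1,+1}.
        mismatch = (db_codes != q_code).astype(float)
        return mismatch @ w                          # (N,) distances, lower is closer

Candidates are then sorted by the returned distances, which breaks ties among buckets sharing the same unweighted Hamming distance.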

B. Weight Calibration

As Figure 1 demonstrates, hash functions $h_3$ and $h_4$ show a satisfying capability of neighbor preservation for query q, but the large correlation between them indicates undesired redundancy. In contrast, though function $h_2$ performs worse than $h_3$ and $h_4$, it serves as a complement to $h_4$, with which they together can well preserve all neighbor relations of q. This observation motivates us to further calibrate the query-adaptive weights, taking the correlations among all hash functions into consideration.

Given any pair of hash functions $h_i$ and $h_j$ from H, if they behave similarly on a certain set of data points (i.e., the hash bits encoded by them are quite similar), then we can regard the two hash functions as strongly correlated. In practice, to improve the overall discriminative power of the hash codes, uncorrelated (or independent) and high-quality hash functions should be given higher priority.

Since computing higher-order independence among hash functions is quite expensive, we approximately evaluate the independence based on the pairwise correlations between them. Specifically, we introduce the mutual independence between hash functions based on the mutual information $\mathrm{MI}(y_i, y_j)$ between the bit variables $y_i$ and $y_j$ generated by hash functions $h_i$ and $h_j$:

$$a_{ij} = \exp\big[-\lambda\, \mathrm{MI}(y_i, y_j)\big], \qquad (3)$$

where λ > 0 and $a_{ij} = a_{ji}$, forming a symmetric independence matrix $A = (a_{ij})$.

Then we calibrate the query-adaptive weight $w_k$ by reweighing it with a positive variable $\pi_k$. Namely, the new bitwise query-adaptive weight is given by

$$w^*_k = w_k \pi_k, \qquad (4)$$

which should overall maximize both the neighbor preservation and the independence between hash functions. We formulate it as the following quadratic programming problem:

$$\max_{\pi} \; \sum_{ij} w^*_i\, w^*_j\, a_{ij} \quad \text{s.t.} \quad \mathbf{1}^T \pi = 1, \; \pi \geq 0. \qquad (5)$$

The above problem can be efficiently solved by a number of powerful techniques like replicator dynamics [31].
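A sketch of this calibration step under our reading of Eqs. (3)–(5) follows; the empirical mutual information estimator and all helper names are our own assumptions, not the authors' implementation:

    import numpy as np

    def mutual_information(bi, bj):
        # Empirical MI between two {-1,+1} bit variables over a sample set.
        mi = 0.0
        for a in (-1, 1):
            for b in (-1, 1):
                p_ab = np.mean((bi == a) & (bj == b))
                p_a, p_b = np.mean(bi == a), np.mean(bj == b)
                if p_ab > 0:
                    mi += p_ab * np.log(p_ab / (p_a * p_b))
        return mi

    def calibrate_weights(w, bits, lam=1.0, iters=100):
        # bits: (n, B) sample codes; returns w* = w * pi per Eq. (4).
        B = len(w)
        A = np.array([[np.exp(-lam * mutual_information(bits[:, i], bits[:, j]))
                       for j in range(B)] for i in range(B)])   # Eq. (3)
        M = np.outer(w, w) * A            # objective matrix of Eq. (5)
        pi = np.ones(B) / B               # start at the simplex barycenter
        for _ in range(iters):
            pi = pi * (M @ pi)            # replicator dynamics update
            pi /= pi.sum()                # renormalize onto the simplex
        return w * pi

Since M is nonnegative, the multiplicative update increases the quadratic objective while keeping π on the simplex, which is why replicator dynamics fits this problem.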

C. Data-Dependent Similarity

As aforementioned, the similarity between the query and database samples plays an important role in pursuing query-adaptive weights in Section II-A. Since in practice data points are usually distributed on a certain manifold, the standard Euclidean metric cannot strictly capture their global similarities. To obtain a similarity measurement adaptive to different datasets, we adopt the anchor graph to represent any sample x by z(x) based on its local neighbor relations to anchor points $U = \{u_k \in \mathbb{R}^{d}\}_{k=1}^{N_l}$, which can be generated efficiently by clustering or sampling [15]:

$$[z(x)]_j = \begin{cases} \dfrac{K(x, u_j)}{\sum_{u_{j'} \in NN(x)} K(x, u_{j'})}, & \text{if } u_j \in NN(x) \\ 0, & \text{otherwise} \end{cases} \qquad (6)$$

where NN(x) denotes x's nearest anchors in U according to the predefined kernel function $K(x, u_j)$ (e.g., a Gaussian kernel). The highly sparse z(x) serves as a nonlinear and discriminative feature representation, and can be used to efficiently approximate the data-dependent similarities between samples. Specifically, for query q and any point p, their similarity can be computed by

$$s(p, q) = \exp\big(-\|z(p) - z(q)\|^2 / \sigma^2\big), \qquad (7)$$

where σ is usually set to the largest distance between z(p) and z(q).
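The anchor embedding of Eq. (6) and the similarity of Eq. (7) can be sketched as follows, assuming a Gaussian kernel; the function names and kernel bandwidth are illustrative choices:

    import numpy as np

    def anchor_embedding(x, anchors, s=5, bw=1.0):
        # Sparse z(x) per Eq. (6): nonzero only on the s nearest anchors NN(x).
        d2 = np.sum((anchors - x) ** 2, axis=1)
        nn = np.argsort(d2)[:s]                    # indices of NN(x)
        k = np.exp(-d2[nn] / (2 * bw ** 2))        # Gaussian kernel K(x, u_j)
        z = np.zeros(len(anchors))
        z[nn] = k / k.sum()                        # normalize over NN(x)
        return z

    def data_dependent_similarity(zp, zq, sigma):
        # Eq. (7): s(p, q) = exp(-||z(p) - z(q)||^2 / sigma^2).
        return np.exp(-np.sum((zp - zq) ** 2) / sigma ** 2)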

III. MULTI-VIEW GRAPH-BASED RANK FUSION

Now we turn to the fine-grained ranking over multiple hash tables with multiple view information. In this section, we present a query specific rank fusion method that adaptively combines the ranking results of multiple tables. The weighted Hamming distance enables fine-grained ranking within each hash table, which improves the discriminative power of hash codes from the views of hash function quality and their correlations. To adaptively organize results from different tables in a desired order without any supervision or user feedback, we represent them as edge-weighted graphs, where the comparable similarities across different tables can be approximated


Fig. 2. Demonstration of the proposed table-wise rank fusion.

efficiently online, based on the neighborhood captured by anchors in the weighted Hamming space. Finally, with the consistent similarity measurement, the common candidates and their neighbors from different tables are promoted with higher probabilities to be ranked top, using a localized PageRank algorithm over the fused graph. Figure 2 demonstrates the proposed rank fusion method based on probability diffusion on the merged graph.

A. Graph Construction

For the query q, multiple candidate sets will be retrieved from hash tables built from M different sources. We represent the candidate set corresponding to the m-th source/table, including the query, as an edge-weighted graph $G^{(m)} = (V^{(m)}, E^{(m)}, \Omega^{(m)})$. The vertices $V^{(m)}$ in the graph correspond to the elements of the candidate set, while the edge connections $E^{(m)}$ indicate the neighbor relations, whose weights $\Omega^{(m)}_{ij}$ are defined by the similarity $S^{(m)}_{ij}$ between the candidates $x_i$ and $x_j$. The graph is quite powerful for characterizing the local neighbor structures of the candidates along a manifold.

In practice, similar candidates usually share a large portion of common neighbors. Therefore, it is critical to capture the local neighbor structure centered at each candidate point in the similarity measurement. In traditional methods the similarity measurement usually depends on the raw features or on neighbor structures processed beforehand, e.g., using the Euclidean distance between raw features [49] or the Jaccard index (common neighbors) of the neighborhoods [48]. Both solutions are infeasible for large-scale datasets, due to the explosive memory consumption and expensive updating.

To avoid this bottleneck in the online graph construction, we estimate the neighbor relations by employing the anchor approximation in the weighted Hamming space. A small set of K anchors is selected as prototypes that together locate each data point in the specific space. Therefore, each point can be characterized by its nearest anchors, and the neighbor relations can be determined by checking whether points share similar nearest anchors.

Formally, for the candidate set $V^{(m)}$ of the m-th source, K ($K \ll N$) anchor points $U^{(m)} = \{u^{(m)}_k \in \mathbb{R}^{D_m}\}_{k=1}^{K}$ are selected from the database to characterize the inherent neighbor structures. Thus, any data point $x_i$ can be represented by its nearest anchors in a vector form, whose elements are denoted by the truncated similarities $Z^{(m)}_{ij}$ with respect to the m-th source:

$$Z^{(m)}_{ij} = \begin{cases} \dfrac{\exp\big(-d_H(x^{(m)}_i, u^{(m)}_j)/\sigma_H\big)}{\sum_{j' \in \langle i \rangle^{(m)}} \exp\big(-d_H(x^{(m)}_i, u^{(m)}_{j'})/\sigma_H\big)}, & \text{if } j \in \langle i \rangle^{(m)} \\ 0, & \text{otherwise} \end{cases} \qquad (8)$$

where $\langle i \rangle^{(m)} \subset [1 : K]$ denotes the indices of the s ($s \ll K$) nearest anchors of point $x_i$ in $U^{(m)}$ according to the weighted Hamming distance, and $\sigma_H$ is set to the maximum Hamming distance.

The symmetric similarity $S^{(m)}$ between candidates in $V^{(m)}$ can be defined by the inner product between their anchor representations in the m-th feature space:

$$S^{(m)}_{ij} = \frac{1}{\lambda^{(m)}_i} Z^{(m)}_{i\cdot} Z^{(m)T}_{j\cdot} + \frac{1}{\lambda^{(m)}_j} Z^{(m)}_{j\cdot} Z^{(m)T}_{i\cdot} \qquad (9)$$

with $\lambda^{(m)}_i = \sum_j Z^{(m)}_{i\cdot} Z^{(m)T}_{j\cdot}$. Since $Z^{(m)} \in \mathbb{R}^{N \times K}$ is highly sparse, with s nonzero elements in each row summing to 1, the approximated similarity $S^{(m)}$ will also be quite sparse, where only those data sharing the same nearest anchors will be regarded as neighbors with non-zero similarities.
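Our simplified reading of Eqs. (8)–(9) is sketched below: the sparse representation Z is built from precomputed weighted Hamming distances between candidates and anchors, and S follows from inner products of its rows; all names are illustrative:

    import numpy as np

    def candidate_similarity(d_ha, s=5):
        # d_ha: (N, K) weighted Hamming distances from candidates to anchors.
        N, K = d_ha.shape
        sigma_h = d_ha.max()                     # maximum Hamming distance
        Z = np.zeros((N, K))
        for i in range(N):
            nn = np.argsort(d_ha[i])[:s]         # s nearest anchors <i>
            e = np.exp(-d_ha[i, nn] / sigma_h)
            Z[i, nn] = e / e.sum()               # Eq. (8), rows sum to 1
        G = Z @ Z.T                              # inner products Z_i. Z_j.^T
        lam = G.sum(axis=1)                      # lambda_i = sum_j Z_i. Z_j.^T
        return G / lam[:, None] + G.T / lam[None, :]   # Eq. (9)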

B. Rank Fusion

After obtaining multiple graphs $G^{(m)} = (V^{(m)}, E^{(m)}, \Omega^{(m)})$ from different hash tables corresponding to each type of source, we fuse them together as one graph $G = (V, E, \Omega)$, where the vertices correspond to all candidates without repeats. Subsequently, for each pair of candidates, if there is an edge between them in any $G^{(m)}$, there will be a connection between them in G, and its weight will be the superposition of all weights in the $G^{(m)}$. This guarantees that the common candidates and their neighbors from different hash tables show high homogeneity, and thus have priority to be ranked top. Note that the anchor-based representation not only captures the local neighbor structure along the manifold, but also makes the similarity comparable across different rank lists after normalization. Therefore, without any prior, we regard multiple retrieval results equally and simply sum up their edge weights. Formally,

$$V = \bigcup_m V^{(m)}, \quad E = \bigcup_m E^{(m)}, \quad \Omega_{ij} = \sum_m \Omega^{(m)}_{ij} = \sum_m S^{(m)}_{ij}. \qquad (10)$$

For multiple table search with different sources, strongly connected candidates in G show great and consistent visual similarities in different views, which implies inherent neighbor relations between them, forming a dense subgraph. To discover the most relevant candidates for the query, this motivates us to perform connectivity analysis on the graph, where the desired candidates will have a high probability of being visited from the query.

One popular and powerful technique is the random walk, interpreted as a diffusion process [48], which succeeded in the Google PageRank system. Specifically, the walk in G will


Algorithm 1 Query-Specific Rank Fusion Over Hash Tables With Multiple Sources (QsRF)

sometimes (with a small probability 1 − α, where empirically α > 0.8) jump to a random result according to a fixed distribution π, or transit in a random walk manner according to the vertex connection matrix P built from the edge weights Ω. For query-specific ranking, the distribution π assigns a large probability (e.g., 0.99) to the query q, while the rest is uniform, satisfying $\pi^T \mathbf{1} = 1$. Formally, the visiting probability r of each vertex after a one-step random walk from the t-th step is updated as follows:

$$r^{(t+1)} = (1 - \alpha)\, \pi + \alpha\, P^T r^{(t)}. \qquad (11)$$

After a number of random walk steps, the visiting probability r of each candidate converges to an equilibrium state, where a higher probability reflects a higher relevance to the query.

Using the above updating strategy, the equilibrium distribution can be obtained in closed form, i.e., $r^* = (1 - \alpha)(I - \alpha P^T)^{-1} \pi$. Note that this solution requires a matrix inverse, which brings expensive computation when a large number of candidates participate in the rank fusion. To speed up this process, the iterative updating in (11) is usually preferred in practice. Algorithm 1 lists the detailed procedures at both the offline and online stages.
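The iterative update of Eq. (11) can be sketched as follows; row-normalizing Ω into the transition matrix P is our assumption here, and the parameter defaults mirror the values quoted in the text:

    import numpy as np

    def rank_by_diffusion(omega, query_idx, alpha=0.9, q_prob=0.99, iters=50):
        n = omega.shape[0]
        P = omega / np.maximum(omega.sum(axis=1, keepdims=True), 1e-12)
        pi = np.full(n, (1 - q_prob) / max(n - 1, 1))   # uniform off the query
        pi[query_idx] = q_prob                          # concentrated on the query
        r = pi.copy()
        for _ in range(iters):
            r = (1 - alpha) * pi + alpha * (P.T @ r)    # Eq. (11)
        return np.argsort(-r)                           # candidates by relevance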

C. Online Query

At the offline stage, for each view we build a hash table using certain hashing algorithms. Both the $N_l$ sampled landmarks and the K anchors can be obtained efficiently, and the landmarks can be represented using anchors in $O(N_l K)$. At the online stage, for each table the query in the corresponding feature representation is transformed into the anchor representation and used to compute the query's similarities to the landmarks in $O(N_l)$; the query-adaptive weights can then be obtained by the quadratic programming (5) in polynomial time, and the results are ranked quickly according to the weighted Hamming distances. With the ranked result sets from different tables, multiple graphs can be built over all the candidates, which in practice are empirically limited to the top $N_k \ll N$ (e.g., 1,000 in our experiments) ones. Subsequently, the graph construction based on the anchor representation costs $O(M N_k K + N_k^2 K)$ time, which is much less than the database size N. Finally, the reranking process fusing multiple ranking lists can be completed in a few random walk iterations of $O(N_k^2)$ complexity. On the whole, the proposed rank fusion method supports fast online search in sublinear time with respect to the database size.

IV. EXPERIMENTS

In this section, we comprehensively evaluate the effectiveness of our proposed method, consisting of the query-adaptive hash code ranking (QRank) and query specific rank fusion (QsRF) methods, over several real-world image datasets.

A. Data Sets

To evaluate the proposed methods, we conduct extensive experiments on the following real-world image datasets:

MNIST includes 70K images, each of which is associated with a digit label from ‘0’ to ‘9’. Each example is a 28×28 grayscale image and is represented by a 784-dimensional vector.

CIFAR-10 contains 60K 32×32 color images of 10 classes, with 6K images per class. For each image, we extract a 300D bag-of-words (BoW) feature quantized from dense SIFT features and a 384D GIST feature.

TRECVID [57] is a large-scale image dataset built from the TRECVID 2011 Semantic Indexing annotation set with 126 fully labeled concepts, from which we select the 25 most frequent concepts. For each image, we extract a 512D GIST feature and a 1000D spatial pyramid bag-of-words feature.

1) Protocols: In the literature, there are very few studies on Hamming distance ranking besides WhRank [46] and QsRank [45]. Since WhRank significantly outperformed QsRank [45] as reported in [46], we focus on the comparison between our method and WhRank [46] over a variety of baseline hashing algorithms, including projection based and cluster based ones, which cover most types of state-of-the-art hashing research, including linear/nonlinear and random/optimized hashing.

Specifically, in our experiments each hash table is constructed using hash functions generated by basic hashing algorithms. State-of-the-art hashing algorithms, including Locality Sensitive Hashing (LSH) [13], Spectral Hashing (SH) [14], Kernelized LSH (KLSH) [23], PCA-based Hashing (PCAH), Iterative Quantization (ITQ) [16], Spherical Hashing (SPH) [58], and K-means Hashing (KMH) [47], are involved in generating hash functions. Among these methods, LSH, SH, KLSH, PCAH, and ITQ adopt the projection paradigm to generate hash bits, while SPH and KMH exploit the cluster structure in the data. Meanwhile, LSH, PCAH, and ITQ are linear methods, and SH, KLSH, SPH, and KMH are nonlinear ones.


Fig. 3. Performance comparison of different hash code ranking methods on MNIST. (a) Precision @ 48 bits. (b) Precision @ 96 bits. (c) Recall @ 48 bits. (d) Recall @ 96 bits.

Fig. 4. Performance comparison of different hash code ranking methods on CIFAR-10. (a) Precision @ 48 bits. (b) Precision @ 96 bits. (c) Recall @ 48 bits. (d) Recall @ 96 bits.

The common search method, Hamming distance ranking, is adopted in the evaluation, which ranks all points in the database according to their (weighted) Hamming distances from the query. We adopt the popular performance metrics of precision and recall in our experiments. All results are averaged over 10 independent runs to suppress randomness. For each run, we randomly sample 5,000 images as the training data, 3,000 as query images, and 300 as the anchors for both query-adaptive weighting and similarity approximation in each view. The true nearest neighbors are defined as the images sharing at least one common tag.
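For reference, the two metrics can be computed per query as in the following illustrative helper (ours, not part of the authors' evaluation code):

    def precision_recall_at_k(ranked_ids, relevant_ids, k):
        # Fraction of the top-k that is relevant, and of the relevant set retrieved.
        rel = set(relevant_ids)
        hits = sum(1 for r in ranked_ids[:k] if r in rel)
        return hits / k, hits / max(len(rel), 1)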

In our experiments, we first investigate the performance of the proposed query-adaptive hash code ranking using a single feature type in one hash table. On MNIST we employ the grayscale pixel intensities of each image as the feature, and on CIFAR-10 and TRECVID we use GIST descriptors for evaluation. As to the rank fusion over multiple hash tables with multiple sources, for simplicity and similar to prior multiple feature work [3], [26], we adopt two types of visual features for each dataset (see the dataset descriptions above) to verify the efficiency of our proposed method. Namely, without loss of generality, we evaluate our method on image search over CIFAR-10 (60K) and TRECVID (260K), where each source corresponds to one type of visual feature.

B. Evaluation on Query-Adaptive Hash Code Ranking

Figure 3 shows the precision and recall curves respectively using 48 and 96 bits on MNIST. We can easily find that both hash bit weighting methods (WhRank and QRank) achieve better performance than the baseline hashing algorithms. This observation indicates that hash code ranking based on the weighted Hamming distance serves as a promising solution to boosting the performance of hash-based retrieval. In all cases, our proposed QRank consistently achieves performance superior to WhRank over different hashing algorithms. Figure 6(a) and (d) depict the overall performance evaluation using mean average precision (MAP) with 48 and 96 hash bits, where we make a similar observation that QRank outperforms WhRank significantly; e.g., compared to WhRank, QRank obtains 17.11%, 11.36%, 14.06%, and 11.39% performance gains over LSH, ITQ, SH, and KLSH, respectively.

Besides MNIST, Figures 4 and 5 present the recall and precision curves on CIFAR-10 and TRECVID using 48 and 96 bits. By comparing the performance of QRank with that of WhRank and the baseline hashing algorithms, we reach a similar conclusion as on MNIST: both QRank and WhRank can enhance the discriminative power of hash functions by weighting them elegantly, and meanwhile in all cases QRank clearly


Fig. 5. Performance comparison of different hash code ranking methods on TRECVID. (a) Precision @ 48 bits. (b) Precision @ 96 bits. (c) Recall @ 48 bits. (d) Recall @ 96 bits.

Fig. 6. Average precision comparison on MNIST, CIFAR-10, and TRECVID. (a) MNIST @ 48 bits. (b) CIFAR-10 @ 48 bits. (c) TRECVID @ 48 bits. (d) MNIST @ 96 bits. (e) CIFAR-10 @ 96 bits. (f) TRECVID @ 96 bits.

outperforms WhRank, owing to its comprehensive capability of distinguishing high-quality functions. A similar conclusion can be drawn from the MAP performance comparison, as the overall evaluation in Figure 6 (b), (c), (e), and (f) with respect to different numbers of hash bits per table.

It is worth noting that since WhRank depends on the distribution of the projected samples, it cannot be directly applied to hashing algorithms like SPH and KMH. In contrast, the proposed method derives the bitwise weights relying only on the data-dependent similarity. Therefore, it not only captures the neighbor relations of the query, but also generalizes to all hashing algorithms. Figures 3–6 show that QRank obtains significant performance gains over SPH and KMH in all cases.


TABLE I

MAP (%) OF DIFFERENT PARTS OF QRANK ON MNIST

We investigate the effect of the different parts of QRank in Table I by comparing the performance of QRank with and without the weight calibration (the latter named QRank−). The results are reported over LSH, SH, PCAH, and ITQ, respectively, using 96 bits on MNIST. It is obvious that QRank−, using only the query-adaptive weights without considering the redundancy among hash functions, is able to boost the hashing performance, but QRank with the appended weight calibration can further bring significant (up to 46.44%) performance gains. This indicates that both the individual quality of each hash function and their complementarity are critical for fine-grained neighbor ranking. Finally, we also compare the time cost of different query adaptive ranking strategies. Though QRank spends a little more time than WhRank on learning the query-adaptive weights, this part is relatively small on the whole, and the online query is quite efficient in practice (see Table III).

C. Evaluation on Query Specific Rank Fusion

In this part, we further evaluate our query specific rank fusion method over several hashing algorithms. Note that our method serves as a generic solution to building multiple tables using various sources, which is compatible with most types of hashing algorithms. To illustrate the efficiency of the proposed method (QsRF), we compare it on several datasets with several baseline methods: the basic hashing algorithms that build a single table using information from one source based on the standard Hamming distance (F1/F2), the query-specific ranking method over each hashing algorithm without rank fusion (F1/F2-Qs), the multiple feature hashing algorithm (MFH) [26], and the very recent rank fusion method (HRF) that reranks candidates based on the minimal Hamming distances, which shows promising performance in multiple table search [31], [33].

In our experiments, a specific number of hash functions are first generated using different hashing algorithms. Then all methods except MFH build hash tables using these functions in different manners. MFH learns its own hash functions and then builds tables by simultaneously considering information from multiple sources.

Table II shows the averaged precision (AP) with respect to different cutting points (@5 and @10) on CIFAR-10 and TRECVID, respectively. We can easily find that in most cases, query specific ranking (F1/F2-Qs) achieves better performance than the baseline hashing algorithms (F1/F2). This observation indicates that hash code ranking based on the weighted Hamming distance serves as a promising solution to boosting the performance of hash table search.

Fig. 7. Recall performance of different ranking methods over multiple tables on CIFAR-10 and TRECVID. (a) Recall@10 CIFAR-10. (b) Recall@10 TRECVID.

Furthermore, in all cases our proposed QsRF consistently achieves performance superior to all baselines over different hashing algorithms; e.g., on TRECVID, it obtains 20.28%, 9.80%, and 8.62% AP@5 performance gains over the best competitors when using LSH, KLSH, and SPH. This is because our graph based rank fusion can faithfully exploit the complementarity between different features and thus discover more true semantic neighbors. Besides, note that with LSH we obtain more significant performance improvement than with other learning based hashing algorithms that own better discriminative power, which indicates that the proposed method can largely compensate the quantization loss of the basic hashing algorithms.

Besides averaged precision, recall is also an important performance metric for nearest neighbor search. Figure 7 presents the recall rate of the top 10 results using different methods on both datasets. By observing the performance of the methods with the query specific scheme, we reach a similar conclusion that the fine-grained weighted Hamming distance can enhance the discriminative power of hash functions. Meanwhile, QsRF can further adaptively capture and fuse the local neighbor structures in different views, and subsequently outperforms the other ranking methods in all cases, with many more truly relevant images returned after rank fusion. Even though the semantic complexity is quite different on CIFAR-10 and TRECVID, our method consistently captures the intrinsic relations and obtains the best performance in most cases.

It is worth noting that our rank fusion framework is generic for any type of binary hashing algorithm, including linear (LSH, PCAH, ITQ, etc.), nonlinear (KLSH, SPH, etc.), random (LSH, KLSH, etc.), and optimized (PCAH, ITQ, SPH, etc.) hashing.


TABLE II

AP (%) OF HASH TABLE SEARCH USING DIFFERENT RANKING METHODS OVER MULTIPLE TABLES ON CIFAR-10 AND TRECVID

TABLE III

SEARCH TIME (ms) OF DIFFERENT METHODS ON CIFAR-10

Therefore, it can be easily applied to state-of-the-art hashing research to support effective hash table search using multiple sources, achieving significant performance gains without much effort.

Finally, we investigate the efficiency of different methods in Table III. For multiple hash tables, the search process can be completed in parallel, so its cost is close to that of single table search. Due to the online computation of the query adaptive bit weights, query-specific ranking costs a little more than standard Hamming distance ranking. For QsRF, since the candidate graph is quite sparse, the convergence to the equilibrium state distribution is computationally efficient (O(|E|) time complexity), especially when adopting parallel computing and fast iterative updating. Therefore, QsRF only spends slightly more time than the baseline methods. In practice, since usually only tens of hash tables are employed, it is very beneficial that QsRF achieves significant performance gains while still guaranteeing fast online search.

V. CONCLUSIONS

As described in this paper, we proposed a novel method for fine-grained ranking over multiple hash tables incorporating information from multiple views. At the bitwise level, the hash code ranking method named QRank learns query-adaptive bitwise weights by simultaneously considering both the individual quality of each hash function and their complement for nearest neighbor search. At the tablewise level, a query specific rank fusion method represents the results of each table under QRank as an edge-weighted graph with comparable similarities across different tables, and promotes the common candidates and their neighbors from different tables using a localized PageRank algorithm over the merged graph.

Compared to state-of-the-art weighting methods, our rank fusion method serves as a general solution to enabling fine-grained ranking over multiple hash tables with multiple sources, supporting different kinds of hashing algorithms and different feature types. Due to both the query-adaptive ranking and the data-dependent fusion, our method significantly and efficiently boosts the ranking performance in practice. Future work will concentrate on learning a more discriminative similarity metric and on building each hash table from multiple views with the query specific ranking.

REFERENCES

[1] D. Tao, X. Tang, X. Li, and X. Wu, “Asymmetric bagging and random subspace for support vector machines-based relevance feedback in image retrieval,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 28, no. 7, pp. 1088–1099, Jul. 2006.

[2] J. He et al., “Mobile product search with bag of hash bits and boundary reranking,” in Proc. IEEE CVPR, Jun. 2012, pp. 3005–3012.

[3] X. Liu, J. He, and B. Lang, “Multiple feature kernel hashing for large-scale visual search,” Pattern Recognit., vol. 47, no. 2, pp. 748–757, Feb. 2014.

[4] J. Song, Y. Yang, X. Li, Z. Huang, and Y. Yang, “Robust hashing with local models for approximate similarity search,” IEEE Trans. Cybern., vol. 44, no. 7, pp. 1225–1236, Jul. 2014.

[5] J. Qin, L. Liu, M. Yu, Y. Wang, and L. Shao, “Fast action retrieval from videos via feature disaggregation,” in Proc. BMVC, 2015, pp. 180.1–180.11.

[6] L. Liu, M. Yu, and L. Shao, “Multiview alignment hashing for efficient image search,” IEEE Trans. Image Process., vol. 24, no. 3, pp. 956–966, Mar. 2015.

[7] X. Zhang, W. Liu, M. Dundar, S. Badve, and S. Zhang, “Towards large-scale histopathological image analysis: Hashing-based image retrieval,” IEEE Trans. Med. Imag., vol. 34, no. 2, pp. 496–506, Feb. 2015.

[8] T. Dean, M. Ruzon, M. Segal, J. Shlens, S. Vijayanarasimhan, and J. Yagnik, “Fast, accurate detection of 100,000 object classes on a single machine,” in Proc. IEEE CVPR, Jun. 2013, pp. 1814–1821.

[9] Y. Mu, G. Hua, W. Fan, and S.-F. Chang, “Hash-SVM: Scalable kernel machines for large-scale visual classification,” in Proc. IEEE CVPR, Jun. 2014, pp. 979–986.

[10] X. Liu, X. Fan, C. Deng, Z. Li, H. Su, and D. Tao, “Multilinear hyperplane hashing,” in Proc. IEEE CVPR, Jun. 2016, pp. 5119–5127.

[11] X. Liu, J. He, C. Deng, and B. Lang, “Collaborative hashing,” in Proc. IEEE CVPR, Jun. 2014, pp. 2147–2154.

[12] M. S. Charikar, “Similarity estimation techniques from rounding algorithms,” in Proc. 34th ACM STOC, 2002, pp. 380–388.

[13] M. Datar, N. Immorlica, P. Indyk, and V. S. Mirrokni, “Locality-sensitive hashing scheme based on p-stable distributions,” in Proc. 20th SCG, 2004, pp. 253–262.

[14] Y. Weiss, A. Torralba, and R. Fergus, “Spectral hashing,” in Proc. NIPS, 2008, pp. 1–8.


[15] W. Liu, J. Wang, S. Kumar, and S.-F. Chang, “Hashing with graphs,” in Proc. ICML, 2011, pp. 1–8.

[16] Y. Gong and S. Lazebnik, “Iterative quantization: A procrustean approach to learning binary codes,” in Proc. IEEE CVPR, Jun. 2011, pp. 817–824.

[17] M. Norouzi and D. J. Fleet, “Cartesian k-means,” in Proc. IEEE CVPR, Jun. 2013, pp. 3017–3024.

[18] F. Shen, C. Shen, Q. Shi, A. van den Hengel, and Z. Tang, “Inductive hashing on manifolds,” in Proc. IEEE CVPR, Jun. 2013, pp. 1562–1569.

[19] L. Liu and L. Shao, “Sequential compact code learning for unsupervised image hashing,” IEEE Trans. Neural Netw. Learn. Syst., vol. PP, no. 99, pp. 1–11, 2015.

[20] W. Liu, J. Wang, R. Ji, Y.-G. Jiang, and S.-F. Chang, “Supervised hashing with kernels,” in Proc. IEEE CVPR, Jun. 2012, pp. 2074–2081.

[21] X. Liu, Y. Mu, B. Lang, and S.-F. Chang, “Mixed image-keyword query adaptive hashing over multilabel images,” ACM Trans. Multimedia Comput. Commun. Appl., vol. 10, no. 2, pp. 22:1–22:21, Feb. 2014.

[22] G. Lin, C. Shen, and A. van den Hengel, “Supervised hashing using graph cuts and boosted decision trees,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 37, no. 11, pp. 2317–2331, Nov. 2015.

[23] B. Kulis and K. Grauman, “Kernelized locality-sensitive hashing for scalable image search,” in Proc. IEEE ICCV, Sep./Oct. 2009, pp. 2130–2137.

[24] F. Shen, C. Shen, Q. Shi, A. V. D. Hengel, Z. Tang, and H. T. Shen, “Hashing on nonlinear manifolds,” IEEE Trans. Image Process., vol. 24, no. 6, pp. 1839–1851, Jun. 2015.

[25] X. Liu, B. Du, C. Deng, M. Liu, and B. Lang, “Structure sensitive hashing with adaptive product quantization,” IEEE Trans. Cybern., vol. PP, no. 99, pp. 1–13, 2015.

[26] J. Song, Y. Yang, Z. Huang, H. T. Shen, and R. Hong, “Multiple feature hashing for real-time large scale near-duplicate video retrieval,” in Proc. 19th ACM MM, 2011, pp. 423–432.

[27] X. Liu, L. Huang, C. Deng, J. Lu, and B. Lang, “Multi-view complementary hash tables for nearest neighbor search,” in Proc. IEEE ICCV, Dec. 2015, pp. 1107–1115.

[28] W. Kong and W.-J. Li, “Double-bit quantization for hashing,” in Proc. 26th AAAI, 2012, pp. 634–640.

[29] C. Deng, H. Deng, X. Liu, and Y. Yuan, “Adaptive multi-bit quantization for hashing,” Neurocomputing, vol. 151, pp. 319–326, Mar. 2015.

[30] Y. Mu, X. Chen, X. Liu, T.-S. Chua, and S. Yan, “Multimedia semantics-aware query-adaptive hashing with bits reconfigurability,” Int. J. Multimedia Inf. Retr., vol. 1, no. 1, pp. 59–70, 2012.

[31] X. Liu, J. He, B. Lang, and S.-F. Chang, “Hash bit selection: A unified solution for selection problems in hashing,” in Proc. IEEE CVPR, Jun. 2013, pp. 1570–1577.

[32] X. Liu, J. He, and B. Lang, “Reciprocal hash tables for nearest neighbor search,” in Proc. AAAI, 2013, pp. 626–632.

[33] X. Liu, C. Deng, B. Lang, D. Tao, and X. Li, “Query-adaptive reciprocal hash tables for nearest neighbor search,” IEEE Trans. Image Process., vol. 25, no. 2, pp. 907–919, Feb. 2016.

[34] W. Liu, C. Mu, S. Kumar, and S.-F. Chang, “Discrete graph hashing,” in Proc. NIPS, 2014, pp. 1–8.

[35] H. Lai, Y. Pan, Y. Liu, and S. Yan, “Simultaneous feature learning and hash coding with deep neural networks,” in Proc. IEEE CVPR, Jun. 2015, pp. 3270–3278.

[36] Y. Gong, S. Kumar, H. A. Rowley, and S. Lazebnik, “Learning binary codes for high-dimensional data using bilinear projections,” in Proc. IEEE CVPR, Jun. 2013, pp. 484–491.

[37] F. X. Yu, S. Kumar, Y. Gong, and S.-F. Chang, “Circulant binary embedding,” in Proc. ICML, 2014, pp. 1–9.

[38] L. Liu, M. Yu, and L. Shao, “Projection bank: From high-dimensional data to medium-length binary codes,” in Proc. IEEE ICCV, Jun. 2015, pp. 2821–2829.

[39] C. Leng, J. Wu, J. Cheng, X. Bai, and H. Lu, “Online sketching hashing,” in Proc. CVPR, Jun. 2015, pp. 2503–2511.

[40] D. Tao, X. Li, X. Wu, and S. J. Maybank, “General tensor discriminant analysis and Gabor features for gait recognition,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 29, no. 10, pp. 1700–1715, Oct. 2007.

[41] D. Tao, X. Li, X. Wu, and S. J. Maybank, “Geometric mean for subspace selection,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 31, no. 2, pp. 260–274, Feb. 2009.

[42] T. Liu and D. Tao, “Classification with noisy labels by importance reweighting,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 38, no. 3, pp. 447–461, Mar. 2016.

[43] X. Liu, B. Lang, Y. Xu, and B. Cheng, “Feature grouping and local soft match for mobile visual search,” Pattern Recognit. Lett., vol. 33, no. 3, pp. 239–246, 2012.

[44] Y.-G. Jiang, J. Wang, and S.-F. Chang, “Lost in binarization: Query-adaptive ranking for similar image search with compact codes,” in Proc. 1st ACM ICMR, 2011, pp. 16:1–16:8.

[45] X. Zhang, L. Zhang, and H.-Y. Shum, “QsRank: Query-sensitive hash code ranking for efficient ε-neighbor search,” in Proc. CVPR, Jun. 2012, pp. 2058–2065.

[46] L. Zhang, Y. Zhang, J. Tang, K. Lu, and Q. Tian, “Binary code ranking with weighted Hamming distance,” in Proc. CVPR, Jun. 2013, pp. 1586–1593.

[47] K. He, F. Wen, and J. Sun, “K-means hashing: An affinity-preserving quantization method for learning binary compact codes,” in Proc. IEEE CVPR, Jun. 2013, pp. 2938–2945.

[48] S. Zhang, M. Yang, T. Cour, K. Yu, and D. N. Metaxas, “Query specific fusion for image retrieval,” in Proc. 12th ECCV, 2012, pp. 660–673.

[49] C. Deng, R. Ji, W. Liu, D. Tao, and X. Gao, “Visual reranking through weakly supervised multi-graph learning,” in Proc. IEEE ICCV, Jun. 2013, pp. 2600–2607.

[50] R. Hong, M. Wang, Y. Gao, D. Tao, X. Li, and X. Wu, “Image annotation by multiple-instance learning with discriminative feature mapping and selection,” IEEE Trans. Cybern., vol. 44, no. 5, pp. 669–680, May 2014.

[51] R. Hong, L. Zhang, and D. Tao, “Unified photo enhancement by discovering aesthetic communities from Flickr,” IEEE Trans. Image Process., vol. 25, no. 3, pp. 1124–1135, Mar. 2016.

[52] C. Xu, D. Tao, and C. Xu, “Multi-view intact space learning,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 37, no. 12, pp. 2531–2544, Dec. 2015.

[53] Q. Wang, S. Li, H. Qin, and A. Hao, “Robust multi-modal medical image fusion via anisotropic heat diffusion guided low-rank structural analysis,” Inf. Fusion, vol. 26, pp. 103–121, Nov. 2015.

[54] C. Chen, S. Li, H. Qin, and A. Hao, “Structure-sensitive saliency detection via multilevel rank analysis in intrinsic feature space,” IEEE Trans. Image Process., vol. 24, no. 8, pp. 2303–2316, Aug. 2015.

[55] T. Ji, X. Liu, C. Deng, L. Huang, and B. Lang, “Query-adaptive hash code ranking for fast nearest neighbor search,” in Proc. 22nd ACM MM, 2014, pp. 1005–1008.

[56] R. Hong, Y. Yang, M. Wang, and X.-S. Hua, “Learning visual semantic relationships for efficient visual retrieval,” IEEE Trans. Big Data, vol. 1, no. 4, pp. 152–161, Dec. 2015.

[57] F. X. Yu, R. Ji, M.-H. Tsai, G. Ye, and S.-F. Chang, “Weak attributes for large-scale image retrieval,” in Proc. IEEE CVPR, Jun. 2012, pp. 2949–2956.

[58] J.-P. Heo, Y. Lee, J. He, S.-F. Chang, and S.-E. Yoon, “Spherical hashing,” in Proc. IEEE CVPR, Jun. 2012, pp. 2957–2964.

Xianglong Liu received the B.S. and Ph.D. degrees in computer science from Beihang University, Beijing, China, in 2008 and 2014, respectively. From 2011 to 2012, he visited the Digital Video and Multimedia Laboratory, Columbia University, as a joint Ph.D. student. He is currently an Associate Professor with the School of Computer Science and Engineering, Beihang University. He has published over 30 research papers at top venues like the IEEE TRANSACTIONS ON IMAGE PROCESSING, the IEEE TRANSACTIONS ON CYBERNETICS, the Conference on Computer Vision and Pattern Recognition, the International Conference on Computer Vision, and the Association for the Advancement of Artificial Intelligence. His research interests include machine learning, computer vision, and multimedia information retrieval.


Lei Huang received the B.S. degree in computer science from Beihang University, Beijing, China, in 2010. He is currently pursuing the Ph.D. degree with Beihang University, and is visiting the Artificial Intelligence Laboratory, University of Michigan, as a joint Ph.D. student. His research interests include deep learning, semi-supervised learning, active learning, and large-scale data annotation and retrieval.

Cheng Deng received the B.Sc., M.Sc., and Ph.D. degrees in signal and information processing from Xidian University, Xi’an, China. He is currently a Full Professor with the School of Electronic Engineering, Xidian University. He has authored or co-authored over 50 scientific articles at top venues, including the IEEE TRANSACTIONS ON IMAGE PROCESSING, the IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, the IEEE TRANSACTIONS ON MULTIMEDIA, the IEEE TRANSACTIONS ON SYSTEMS, MAN, AND CYBERNETICS, the IEEE TRANSACTIONS ON CYBERNETICS, the International Conference on Computer Vision, the Conference on Computer Vision and Pattern Recognition, the International Joint Conference on Artificial Intelligence, and the Association for the Advancement of Artificial Intelligence. His research interests include computer vision, multimedia processing and analysis, and information hiding.

Bo Lang received the Ph.D. degree from Beihang University, Beijing, China, in 2004. She has been a Visiting Scholar with the Argonne National Laboratory, University of Chicago, for one year. She is currently a Professor with the School of Computer Science and Engineering. Her current research interests include data management, information security, and distributed computing. As a Principal Investigator or Core Member, she has engaged in many national research projects funded by the National Natural Science Foundation, the National High Technology Research and Development (863) Program, and so on.

Dacheng Tao (F’15) is a Professor of Computer Science with the Centre for Quantum Computation & Intelligent Systems, and the Faculty of Engineering and Information Technology, University of Technology Sydney. He mainly applies statistics and mathematics to data analytics problems, and his research interests spread across computer vision, data science, image processing, machine learning, and video surveillance. His research results have been expounded in one monograph and 200+ publications at prestigious journals and prominent conferences, such as the IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, the IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, the IEEE TRANSACTIONS ON IMAGE PROCESSING, the Journal of Machine Learning Research, the International Journal of Computer Vision, NIPS, ICML, CVPR, ICCV, ECCV, AISTATS, ICDM, and ACM SIGKDD. He received several best paper awards, such as the best theory/algorithm paper runner-up award at IEEE ICDM’07, the best student paper award at IEEE ICDM’13, and the 2014 ICDM 10-year highest-impact paper award. He received the 2015 Australian Scopus-Eureka Prize, the 2015 ACS Gold Disruptor Award, and the 2015 UTS Vice-Chancellor’s Medal for Exceptional Research. He is a fellow of the OSA, IAPR, and SPIE.