Landmark Ordinal Embedding
Nikhil Ghosh†  Yuxin Chen‡  Yisong Yue§
†University of California, Berkeley  ‡University of Chicago  §California Institute of Technology
Motivation
• Want representations respecting object similarity
• Difficult to obtain quantitative measurements
• Instead use triplet preference feedback from humans
"Is [image A] closer to [image B] than [image C]?"
Problem Statement
- Given $n$ objects with unknown embeddings $x^*_1, \dots, x^*_n \in \mathbb{R}^d$
- $D^*$ is the Euclidean Distance Matrix (EDM), i.e. $D^*_{ij} = \|x^*_i - x^*_j\|_2^2$
- Receive noisy answers to the triplet query "is object $j$ closer to $i$ than $k$?"
- Given $\langle i, j, k \rangle$, receive a "yes" or "no" label, where $\Pr[\text{"yes"}] = f(D^*_{ik} - D^*_{ij})$
- We consider the Bradley-Terry-Luce model: $f(\theta) = \frac{1}{1 + \exp(-\theta)}$
- Goal: using $m$ queries, estimate $\hat{x}_1, \dots, \hat{x}_n$ minimizing the Frobenius norm error $\|X^* - \hat{X}\|_F$
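As a concrete illustration of this noise model, the following sketch simulates a BTL triplet oracle (the function name and setup are our own illustration, not code from the poster):

```python
import numpy as np

def triplet_oracle(X, i, j, k, rng):
    """Simulate a noisy answer to "is object j closer to i than k?"
    under the BTL model: P["yes"] = sigmoid(D*_ik - D*_ij)."""
    d_ij = np.sum((X[i] - X[j]) ** 2)  # squared Euclidean distance
    d_ik = np.sum((X[i] - X[k]) ** 2)
    p_yes = 1.0 / (1.0 + np.exp(-(d_ik - d_ij)))
    return rng.random() < p_yes

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 2))  # 5 objects in R^2
answers = [triplet_oracle(X, 0, 1, 2, rng) for _ in range(100)]
```

When the distance gap is large the answer is nearly deterministic; when $D^*_{ij} \approx D^*_{ik}$ the label is close to a coin flip.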
Primary Related Work
Let $T$ be the set of triplet queries $\langle i, j, k \rangle$ that received the label "yes".

• Stochastic Triplet Embedding (STE) [VDMW12]

  $\max_X \sum_{\langle i,j,k \rangle \in T} f(D_{ik} - D_{ij})$

• Generalized Non-metric Multidimensional Scaling (GNMDS) [Aga+07]

  $\min_{X,\, \xi_{ijk} \ge 0} \sum_{\langle i,j,k \rangle \in T} \xi_{ijk}$ subject to $D_{ik} - D_{ij} \ge 1 - \xi_{ijk}$
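To make the GNMDS objective concrete, here is a small sketch (our own illustration, ignoring the low-rank/PSD constraints) that evaluates the total hinge slack for a fixed candidate embedding; at a fixed $X$, the optimal slacks are $\xi_{ijk} = \max(0,\, 1 - (D_{ik} - D_{ij}))$:

```python
import numpy as np

def gnmds_hinge_loss(X, triplets):
    """Sum of hinge slacks max(0, 1 - (D_ik - D_ij)) over "yes" triplets,
    i.e. the GNMDS objective value at a fixed embedding X."""
    loss = 0.0
    for i, j, k in triplets:
        d_ij = np.sum((X[i] - X[j]) ** 2)
        d_ik = np.sum((X[i] - X[k]) ** 2)
        loss += max(0.0, 1.0 - (d_ik - d_ij))
    return loss
```

A triplet satisfied with margin at least 1 contributes zero; a violated triplet contributes its violation plus the margin.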
Issues with Existing Approaches
- Existing algorithms require solving expensive optimization problems
- (Projected) gradient descent algorithms require operations that take Ω(n²) time
- This becomes prohibitively slow when n is large (≥ 10⁴)
- We would like to compute embeddings for large n
Landmark Multidimensional Scaling [DST04]
Let $L \subset [n]$ be a set of landmark points. Given the distances $D^*_{ij}$ for $(i, j) \in L \times [n]$, Landmark Multidimensional Scaling (LMDS) allows us to recover the full embedding $X$.
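A minimal numpy sketch of the LMDS step, assuming exact squared distances: classical MDS on the landmark submatrix, then distance-based triangulation of the remaining points (our simplification of [DST04]; function and variable names are ours):

```python
import numpy as np

def landmark_mds(D2_landmarks, D2_rest, d):
    """Landmark MDS sketch.
    D2_landmarks: (l, l) squared distances among landmarks.
    D2_rest: (l, n-l) squared distances from landmarks to the other points.
    Returns an (n, d) embedding (landmarks first), up to rigid motion."""
    l = D2_landmarks.shape[0]
    J = np.eye(l) - np.ones((l, l)) / l            # centering matrix
    B = -0.5 * J @ D2_landmarks @ J                # landmark Gram matrix
    w, V = np.linalg.eigh(B)                       # ascending eigenvalues
    w, V = w[::-1][:d], V[:, ::-1][:, :d]          # keep top-d
    L = V * np.sqrt(np.maximum(w, 0.0))            # landmark coordinates (l, d)
    # triangulate the remaining points from their distances to the landmarks
    Lsharp = V / np.sqrt(np.maximum(w, 1e-12))     # pseudo-inverse factor
    delta_mean = D2_landmarks.mean(axis=1, keepdims=True)
    X_rest = -0.5 * (D2_rest - delta_mean).T @ Lsharp
    return np.vstack([L, X_rest])
```

With noiseless distances and enough landmarks ($\ell > d$), the recovered pairwise distances match the true ones exactly; the embedding itself is determined only up to rotation and translation.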
Our Contributions
• A fast embedding algorithm, Landmark Ordinal Embedding (LOE), inspired by Landmark Multidimensional Scaling (LMDS).
• A thorough analysis of both sample complexity and computational complexity.
• LOE allows us to warm-start existing state-of-the-art embedding approaches that are statistically more efficient but computationally more expensive.
• By warm-starting with LOE we can find accurate embeddings on massive datasets much more quickly than existing methods.
Landmark Ordinal Embedding
Algorithm Sketch
Input: number of items n; number of landmarks ℓ; number of samples m; dimension d
1: Randomly select ℓ landmarks from [n]
2: Compute rankings $R_1, \dots, R_\ell$ of the landmark columns
3: Estimate the ℓ × ℓ landmark submatrix of $D^*$
4: Estimate the landmark column shifts $\text{mean}(D^*_i)$
5: Recover the columns $D^*_i$ and output the embedding using LMDS
Figure 1: LOE (illustration of the ℓ × n landmark block of $D$, with one column $D_i$ highlighted).
Key Ideas
1. Queries $\langle i, j, k \rangle$ correspond to pairwise comparisons $\langle j, k \rangle$ based on distance from $i$
2. View recovering the column $D^*_i$ as a ranking problem
3. The ranking model is not identifiable, but we can recover $R_i = D^*_i - \text{mean}(D^*_i) \cdot \mathbf{1}$
4. The submatrix of landmarks is itself a distance matrix, which allows identification of $\text{mean}(D^*_i)$
5. LMDS can compute an embedding using only $\ell = O(d)$ columns of $D^*$
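Ideas 3 and 4 can be checked numerically: centering a column loses only an additive shift, and the zero diagonal of an EDM pins that shift down, since $D^*_{ii} = 0$ implies $\text{mean}(D^*_i) = -R_i[i]$. A toy sanity check (our own illustration, not the paper's noise-robust estimator):

```python
import numpy as np

def recover_column(R_i, i):
    """Undo the centering of column i: since D_ii = 0,
    the lost shift mean(D_i) equals -R_i[i]."""
    return R_i - R_i[i]

rng = np.random.default_rng(0)
X = rng.normal(size=(6, 2))
D = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)  # exact squared EDM
for i in range(6):
    R_i = D[:, i] - D[:, i].mean()   # centered column (what ranking recovery gives)
    assert np.allclose(recover_column(R_i, i), D[:, i])
```

With noisy rankings the shift must instead be estimated jointly across landmark columns, which is what steps 3-4 of the algorithm sketch do.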
Theoretical Analysis
Theorem 1 (Consistency). $\Omega(\text{poly}(d)\, n \log n)$ triplets suffice for LOE to recover the embedding in Frobenius norm with high probability.
- Proof sketch: first bound the propagated error of the landmark column estimates; then combine this with a perturbation bound for LMDS to obtain the result.
Theorem 2 (Computational Complexity). Recovering the embedding up to a fixed error using LOE requires $O(m + nd^2 + d^3)$ time, which is linear in $n$.
Experimental Results
- STE, GNMDS: baselines
- LOE: our algorithm
- LOE-STE: using LOE to warm-start STE
Datasets
- Synthetic (x˚i „ normal dist.)
- MNIST (handwritten digits)
- FOOD (images of food)
Food Embedding using LOE-STE
We sample a total of $c\, n \log n$ triplets per experiment.
References
[VDMW12] Laurens Van Der Maaten and Kilian Weinberger. "Stochastic triplet embedding". In: Machine Learning for Signal Processing (MLSP). 2012, pp. 1–6.
[Aga+07] Sameer Agarwal, Josh Wills, Lawrence Cayton, Gert Lanckriet, David Kriegman, and Serge Belongie. "Generalized non-metric multidimensional scaling". In: AISTATS. 2007, pp. 11–18.
[DST04] Vin De Silva and Joshua B Tenenbaum. Sparse multidimensional scaling using landmark points. Tech. rep. 2004.