
Landmark Ordinal Embedding - Yuxin Chen · 2021. 2. 1.


Landmark Ordinal Embedding

Nikhil Ghosh†, Yuxin Chen‡, Yisong Yue⋆

†University of California, Berkeley ‡University of Chicago ⋆California Institute of Technology


Motivation

• Want representations respecting object similarity

• Difficult to obtain quantitative measurements

• Instead use triplet preference feedback from humans

Example triplet query: “Is [image] closer to [image] than [image]?”

Problem Statement

- Given $n$ objects with unknown embeddings $x^*_1, \dots, x^*_n \in \mathbb{R}^d$

- $D^*$ is the Euclidean Distance Matrix (EDM), i.e. $D^*_{ij} = \|x^*_i - x^*_j\|_2^2$.

- Receive noisy answers to the triplet query “is object j closer to i than k?”

- Given $\langle i, j, k \rangle$, receive a “yes” or “no” label, where $\Pr[\text{“yes”}] = f(D^*_{ij} - D^*_{ik})$

- We consider the Bradley-Terry-Luce model: $f(\theta) = \frac{1}{1 + \exp(-\theta)}$

- Goal: using $m$ queries, estimate $\hat{x}_1, \dots, \hat{x}_n$ minimizing the Frobenius norm error $\|X^* - \hat{X}\|_F$
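The query model above can be simulated in a few lines. The following is a minimal sketch, assuming NumPy and synthetic Gaussian embeddings; the function names (`squared_distance_matrix`, `btl`, `sample_triplet_labels`) and the sizes `n`, `d`, `m` are illustrative choices, not part of the poster. The label probability follows the formula exactly as written in the problem statement.

```python
import numpy as np

def squared_distance_matrix(X):
    """Return D with D[i, j] = ||x_i - x_j||_2^2 for the rows x_i of X."""
    sq_norms = np.sum(X ** 2, axis=1)
    D = sq_norms[:, None] + sq_norms[None, :] - 2.0 * X @ X.T
    return np.maximum(D, 0.0)  # clip tiny negatives caused by round-off

def btl(theta):
    """Bradley-Terry-Luce link f(theta) = 1 / (1 + exp(-theta))."""
    return 1.0 / (1.0 + np.exp(-theta))

def sample_triplet_labels(D, triplets, rng):
    """Draw one noisy label per triplet <i, j, k>; True stands for "yes".

    Uses P["yes"] = f(D[i, j] - D[i, k]), as written in the problem statement.
    """
    i, j, k = triplets.T
    p_yes = btl(D[i, j] - D[i, k])
    return rng.random(len(triplets)) < p_yes

# Hypothetical sizes chosen only for illustration.
rng = np.random.default_rng(0)
n, d, m = 50, 2, 1000
X_true = rng.normal(size=(n, d))
D_true = squared_distance_matrix(X_true)
# Each query uses three distinct objects i, j, k.
triplets = np.array([rng.choice(n, size=3, replace=False) for _ in range(m)])
labels = sample_triplet_labels(D_true, triplets, rng)
```

Only `triplets` and `labels` would be visible to an embedding algorithm; `X_true` and `D_true` play the role of the unknown ground truth.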

Primary Related Work

Let $T$ be the set of triplet queries $\langle i, j, k \rangle$ that received label “yes”.

• Stochastic Triplet Embedding (STE) [VDMW12]

$\max_{X} \sum_{\forall \langle i,j,k \rangle \in T} f(D_{ij} - D_{ik})$

• Generalized Non-metric Multidimensional Scaling (GNMDS) [Aga+07]

$\min_{X,\ \xi_{ijk} \ge 0} \sum_{\forall \langle i,j,k \rangle \in T} \xi_{ijk}$ subject to $D_{ik} - D_{ij} \ge 1 - \xi_{ijk}$
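To make the two objectives concrete, here is a minimal sketch that only evaluates them for a fixed candidate embedding `X`; it does not perform the optimization. It assumes NumPy, the function names are illustrative, and the signs follow the formulas as written in this section rather than any reference implementation of STE or GNMDS. For GNMDS, the smallest feasible slack for a fixed `X` is $\xi_{ijk} = \max(0,\ 1 - (D_{ik} - D_{ij}))$, which is what the second function sums.

```python
import numpy as np

def squared_distances(X):
    """All pairwise squared Euclidean distances between rows of X."""
    diffs = X[:, None, :] - X[None, :, :]
    return np.sum(diffs ** 2, axis=-1)

def ste_objective(X, yes_triplets):
    """Sum over T of f(D_ij - D_ik), with f(theta) = 1 / (1 + exp(-theta))."""
    D = squared_distances(X)
    i, j, k = yes_triplets.T
    return np.sum(1.0 / (1.0 + np.exp(-(D[i, j] - D[i, k]))))

def gnmds_slack_total(X, yes_triplets):
    """Smallest feasible sum of slacks xi_ijk for a fixed X.

    The constraints D_ik - D_ij >= 1 - xi_ijk and xi_ijk >= 0 give
    xi_ijk = max(0, 1 - (D_ik - D_ij)), i.e. a hinge loss per triplet.
    """
    D = squared_distances(X)
    i, j, k = yes_triplets.T
    return np.sum(np.maximum(0.0, 1.0 - (D[i, k] - D[i, j])))

# Toy example: three points on a line and a single "yes" triplet <0, 1, 2>.
X = np.array([[0.0, 0.0], [1.0, 0.0], [3.0, 0.0]])
T = np.array([[0, 1, 2]])
ste_value = ste_objective(X, T)
gnmds_value = gnmds_slack_total(X, T)
```

Both functions build the full $n \times n$ distance matrix for simplicity, which is only practical for small $n$.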

Issues with Existing Approaches

- Existing algorithms require solving expensive optimization problems

- (Projected) Gradient Descent algorithms require operations that take $\Omega(n^2)$ time

- Becomes prohibitively slow when $n$ is large ($\ge 10^4$)

- We would like to compute embeddings for large n

Landmark Multidimensional Scaling [DST04]

Let $L \subset [n]$ be a set of landmark points. Given distances $D^*_{ij}$, $(i, j) \in L \times [n]$, Landmark Multidimensional Scaling (LMDS) allows us to recover the full embedding $X$.
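One way to see how landmark distances determine the full embedding is the following minimal sketch of the LMDS procedure described in [DST04]: classical MDS on the landmark-to-landmark block, followed by a linear triangulation of every point from its squared distances to the landmarks. It assumes NumPy, exact (noise-free) squared distances, and landmarks that span all $d$ dimensions; the function name `lmds` and the array layout (one column per landmark) are illustrative choices.

```python
import numpy as np

def lmds(D_cols, landmark_idx, d):
    """Embed n points in R^d from their squared distances to l landmarks.

    D_cols:       (n, l) array, D_cols[a, i] = squared distance from point a
                  to the i-th landmark.
    landmark_idx: length-l array with the row index of each landmark.
    """
    D_LL = D_cols[landmark_idx, :]            # (l, l) landmark block
    l = len(landmark_idx)
    # Classical MDS on the landmarks: double-center, then eigendecompose.
    J = np.eye(l) - np.ones((l, l)) / l
    B = -0.5 * J @ D_LL @ J
    eigvals, eigvecs = np.linalg.eigh(B)      # eigenvalues in ascending order
    top = np.argsort(eigvals)[::-1][:d]       # indices of the d largest
    lam, V = eigvals[top], eigvecs[:, top]
    # Triangulation map: rows are v_i^T / sqrt(lambda_i).
    pinv_T = V.T / np.sqrt(lam)[:, None]      # (d, l)
    delta_mu = D_LL.mean(axis=0)              # mean squared distance per landmark
    return -0.5 * (D_cols - delta_mu) @ pinv_T.T   # (n, d)

# Usage on synthetic data (sizes are hypothetical).
rng = np.random.default_rng(1)
X_true = rng.normal(size=(30, 3))
diffs = X_true[:, None, :] - X_true[None, :, :]
D_true = np.sum(diffs ** 2, axis=-1)
landmark_idx = np.arange(6)
X_hat = lmds(D_true[:, landmark_idx], landmark_idx, d=3)
D_hat = np.sum((X_hat[:, None, :] - X_hat[None, :, :]) ** 2, axis=-1)
```

The recovered `X_hat` matches `X_true` only up to a rigid transformation, so the check compares distance matrices rather than coordinates.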

Our Contributions

• A fast embedding algorithm, Landmark Ordinal Embedding (LOE), inspired by Landmark Multidimensional Scaling (LMDS).

• A thorough analysis of both sample complexity and computational complexity.

• LOE allows us to warm-start existing state-of-the-art embedding approaches that are statistically more efficient but computationally more expensive.

• By warm-starting with LOE we can find accurate embeddings on massive datasets much more quickly than existing methods.

Landmark Ordinal Embedding

Algorithm Sketch

Input: # of items $n$; # of landmarks $\ell$; # of samples $m$; dimension $d$

1: Randomly select $\ell$ landmarks from $[n]$

2: Compute rankings $R_1, \dots, R_\ell$ of landmark cols.

3: Estimate $\ell \times \ell$ landmark submatrix of $D^*$

4: Estimate landmark column shifts $\mathrm{mean}(D^*_i)$

5: Recover $D^*_i$, output embedding using LMDS

Figure 1: LOE (diagram labels: $\ell$, $n$, $D$, $D_i$)

Key Ideas

1. Queries $\langle i, j, k \rangle$ correspond to pairwise comparisons $\langle j, k \rangle$ based on distance from $i$

2. View recovering column $D^*_i$ as a ranking problem

3. Ranking model is not identifiable, but we can recover $R_i = D^*_i - \mathrm{mean}(D^*_i) \cdot \mathbf{1}$

4. Submatrix of landmarks is a distance matrix, which allows identification of $\mathrm{mean}(D^*_i)$

5. LMDS can compute an embedding using only $\ell = O(d)$ columns of $D^*$
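The non-identifiability in idea 3 can be checked numerically. The sketch below, assuming NumPy and a made-up column of distances, shows that adding a constant to a column leaves every comparison probability $f(D_{ij} - D_{ik})$ unchanged, so comparisons alone can determine at most the centered column $R_i$; the column is recovered once its mean is known. It does not implement the shift-estimation step itself, and all variable names are illustrative.

```python
import numpy as np

def btl(theta):
    """Bradley-Terry-Luce link f(theta) = 1 / (1 + exp(-theta))."""
    return 1.0 / (1.0 + np.exp(-theta))

rng = np.random.default_rng(0)
n = 6
column = rng.uniform(1.0, 5.0, size=n)   # stands in for a landmark column D*_i
shifted = column + 3.7                    # same column plus an arbitrary constant

# A comparison <j, k> only sees the difference of two entries,
# so the constant shift cancels.
j, k = 1, 4
p_original = btl(column[j] - column[k])
p_shifted = btl(shifted[j] - shifted[k])

# Both columns therefore share the same centered version R_i.
centered = column - column.mean()
centered_shifted = shifted - shifted.mean()

# Once the mean is known, the column is recovered exactly.
recovered = centered + column.mean()
```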

Theoretical Analysis

Theorem 1 (Consistency) $\Omega(\mathrm{poly}(d)\, n \log n)$ triplets suffice for LOE to recover the embedding in Frobenius norm with high probability.

- Proof sketch: first bound the propagated error of the landmark columns estimate; this is combined with a perturbation bound for LMDS to obtain the above result.

Theorem 2 (Computational Complexity) Recovering the embedding up to a fixed error using LOE requires time $O(m + nd^2 + d^3)$, which is linear in $n$.

Experimental Results

- STE, GNMDS: baselines

- LOE: our algorithm

- LOE-STE: using LOE to warm-start STE

Datasets

- Synthetic ($x^*_i \sim$ normal dist.)

- MNIST (handwritten digits)

- FOOD (images of food)

Food Embedding using LOE-STE

We sample a total of $c\, n \log n$ triplets per experiment.

References

[VDMW12] Laurens Van Der Maaten and Kilian Weinberger. “Stochastic triplet embedding”. In: Machine Learning for Signal Processing (MLSP). 2012, pp. 1–6.

[Aga+07] Sameer Agarwal, Josh Wills, Lawrence Cayton, Gert Lanckriet, David Kriegman, and Serge Belongie. “Generalized non-metric multidimensional scaling”. In: AISTATS. 2007, pp. 11–18.

[DST04] Vin De Silva and Joshua B Tenenbaum. Sparse multidimensional scaling using landmark points. Tech. rep. 2004.