
Andy Hsieh, Longqi Yang, Yin Cui, Tsung-Yi Lin, Serge Belongie, Deborah Estrin

Connected Experiences Lab, Cornell Tech

AOL CONNECTED EXPERIENCES LAB CORNELL TECH

Collaborative Metric Learning

Collaborative Metric Learning

• A different perspective on collaborative filtering

• Better accuracy

• Extremely efficient Top-K recommendations

• Easy to interpret and extend

User-Item Matrix

[Figure: user-item matrix with axes labeled Users and Items]

Matrix Factorization (MF)

[Figure: matrix factorization diagram with panels labeled Users and Items]

Implicit Feedback

• Ubiquitous in today’s online services

• Only positive feedback is available

• Traditional MF does not work

[Figure: user-item matrix whose unobserved entries are marked "?"; implicit feedback signals shown: click, thumbs up, like]

Matrix Factorization for Implicit Feedback

• Weighted Regularized Matrix Factorization (WRMF) [Hu08]

• Probabilistic Matrix Factorization (PMF) [Salakhutdinov08]

• Bayesian Personalized Ranking (BPR) [Rendle09]

and many more…

Think Beyond Matrix

• No longer about estimating ratings

• But about modeling the relationships between different user/item pairs

[Figure: explicit vs. implicit feedback; unknown entries are marked "?"]


Metric Learning

[Figure legend: known relationships, unknown relationships]

Collaborative Metric Learning

• Learn a joint user-item distance metric.

• The Euclidean distances reflect the relationships between users/items.
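As a minimal sketch of what a joint user-item metric space means, the snippet below places users and items in one vector space and compares them with plain Euclidean distance. The coordinates and the names `embeddings` and `euclidean` are made up for illustration and do not come from the slides.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two points of equal dimension."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Hypothetical 2-D embeddings: users and items share one space.
embeddings = {
    "user_1": (0.0, 0.0),
    "user_2": (0.1, 0.0),
    "item_a": (0.0, 0.3),   # assumed to be liked by user_1
    "item_b": (2.0, 2.0),   # assumed to be unrelated to user_1
}

# The same function measures user-item, user-user, and item-item closeness.
d_user_item = euclidean(embeddings["user_1"], embeddings["item_a"])
d_user_user = euclidean(embeddings["user_1"], embeddings["user_2"])
d_item_item = euclidean(embeddings["item_a"], embeddings["item_b"])
```

Because every entity lives in the same space, a small distance can be read the same way for any pair, whether it involves users, items, or both.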


Based on the triangle inequality inherent to metric learning: if A is close to B, and B is close to C, then A is close to C.

• Fit the model with implicit feedback

1. A user is pulled closer to the items she liked.

2. Other similar users are pulled closer.

3. The items users liked are also pulled closer.

• Top-K recommendations are simply KNN search (a well-optimized task)
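To illustrate why Top-K recommendation reduces to nearest-neighbor search once the embedding is learned, here is a small brute-force sketch. The function name `top_k_items`, the `exclude` parameter, and the toy vectors are assumptions for illustration. A production system would likely use an approximate KNN index instead of scanning every item.

```python
import heapq

def squared_distance(a, b):
    """Squared Euclidean distance; it ranks items the same as the distance."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def top_k_items(user_vec, item_vecs, k, exclude=()):
    """Return ids of the k items nearest to user_vec (brute-force KNN)."""
    candidates = (
        (squared_distance(user_vec, vec), item_id)
        for item_id, vec in item_vecs.items()
        if item_id not in exclude          # e.g. items the user already liked
    )
    return [item_id for _, item_id in heapq.nsmallest(k, candidates)]

# Hypothetical learned item embeddings.
items = {"a": (0.0, 0.1), "b": (1.0, 1.0), "c": (0.2, 0.0), "d": (3.0, 0.0)}
recommended = top_k_items((0.0, 0.0), items, k=2, exclude={"a"})
```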


Collaborative Large Margin Nearest Neighbor

[Figure: a user's neighborhood before and after training; legend: user, positive item, imposter, safety margin, gradients]

*The outline of the figure is inspired by Weinberger, Kilian Q., John Blitzer, and Lawrence Saul. "Distance metric learning for large margin nearest neighbor classification." Advances in Neural Information Processing Systems 18 (2006): 1473.
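The slide gives no formula, so the following is only a sketch of a large-margin objective in the spirit of the figure. It penalizes an imposter that sits inside the safety margin around a positive item, then takes one gradient step. The squared Euclidean distance, the margin and learning-rate values, and the names `hinge_loss` and `sgd_step` are all assumptions.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def hinge_loss(user, pos, imp, margin=1.0):
    """Zero when the imposter is farther than the positive item by the margin."""
    return max(0.0, margin + sq_dist(user, pos) - sq_dist(user, imp))

def sgd_step(user, pos, imp, margin=1.0, lr=0.1):
    """One gradient step on the hinge loss; returns updated (user, pos, imp)."""
    if hinge_loss(user, pos, imp, margin) == 0.0:
        return user, pos, imp                 # margin satisfied: no gradient
    # Gradients of margin + |u - p|^2 - |u - q|^2 (u: user, p: pos, q: imp).
    g_user = [2 * (q - p) for p, q in zip(pos, imp)]
    g_pos = [-2 * (u - p) for u, p in zip(user, pos)]
    g_imp = [2 * (u - q) for u, q in zip(user, imp)]
    new_user = [u - lr * g for u, g in zip(user, g_user)]
    new_pos = [p - lr * g for p, g in zip(pos, g_pos)]
    new_imp = [q - lr * g for q, g in zip(imp, g_imp)]
    return new_user, new_pos, new_imp
```

In this sketch, a violating triplet pulls the positive item toward the user and pushes the imposter away, which matches the before/after picture. Triplets that already respect the margin are left untouched.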

Pitfalls of Matrix Factorization (Dot-Product)

• The dot product violates the triangle inequality, which leads to a misleading embedding.

$V_1^{\top} V_2 = 0$: does not reflect that they are both liked by $U_3$

$U_1^{\top} U_2 = 0$: does not reflect that they both share the same interest as $U_3$
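A toy numerical check can make this pitfall concrete. The factors below are hypothetical values chosen so that every observed like has dot product 1; they are not taken from the slides. The last lines show the contrast with a metric, where the triangle inequality bounds the distance between two items by their distances to a shared user.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Hypothetical factors: U1 likes V1, U2 likes V2, U3 likes both.
U1, U2, U3 = (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)
V1, V2 = (1.0, 0.0), (0.0, 1.0)

observed_likes = [(U1, V1), (U2, V2), (U3, V1), (U3, V2)]
fits_all_likes = all(dot(u, v) == 1.0 for u, v in observed_likes)

# The dot-product model fits the data, yet these pairs look unrelated.
item_similarity = dot(V1, V2)   # 0.0
user_similarity = dot(U1, U2)   # 0.0

# With a metric, closeness to the shared user U3 bounds the item distance.
bound = euclidean(V1, U3) + euclidean(U3, V2)
triangle_holds = euclidean(V1, V2) <= bound
```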


Collaborative Metric Learning Embedding

• Euclidean distance faithfully reflects the relative relationships.

Integrating Item Features

• Use a learnable function (e.g., a Multi-Layer Perceptron) to project features into the user-item embedding.

• Treat the projections as a prior for items' locations.
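One way to read "projection as a prior" is as a penalty that keeps each item embedding near the projection of its features. The sketch below assumes a one-hidden-layer perceptron with ReLU and a squared-distance penalty. The layer shape, the penalty form, and the names `mlp_project` and `feature_loss` are assumptions, not details confirmed by the slides.

```python
def matvec(matrix, vec):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vec)) for row in matrix]

def mlp_project(features, w1, b1, w2, b2):
    """Project raw item features into the embedding space."""
    hidden = [max(0.0, h + b) for h, b in zip(matvec(w1, features), b1)]  # ReLU
    return [o + b for o, b in zip(matvec(w2, hidden), b2)]

def feature_loss(item_vecs, item_features, params):
    """Sum of squared distances between item embeddings and projections."""
    total = 0.0
    for vec, feats in zip(item_vecs, item_features):
        projected = mlp_project(feats, *params)
        total += sum((p - v) ** 2 for p, v in zip(projected, vec))
    return total

# Hypothetical parameters: identity weights and zero biases.
identity = [[1.0, 0.0], [0.0, 1.0]]
params = (identity, [0.0, 0.0], identity, [0.0, 0.0])
loss = feature_loss([[1.0, 3.0]], [[1.0, 2.0]], params)
```

In training, a term like this would be added to the ranking loss, so that the projection weights and the item embeddings are learned together.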


Evaluation

• 6 Datasets from Different Domains

• Papers - CiteULike

• Books - BookCrossing

• Photography - Flickr

• Articles - Medium

• Movies - MovieLens

• Music - EchoNest


Accuracy (Recall@50)

[Figure: bar chart of Recall@50 improvements over BPR (%) for WRMF, WARP, and CML on CiteULike, BookCX, Flickr, Medium, MovieLens, and EchoNest; asterisks mark some of the bars]

* Indicates that CML's improvement over the second-best algorithm is statistically significant according to the Wilcoxon signed-rank test.
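For reference, the metric and the "improvement over a baseline" on the chart's axis can be computed as in the sketch below. The function names and the numbers in the example are placeholders for illustration only and are not results from the talk.

```python
def recall_at_k(ranked_items, held_out, k=50):
    """Fraction of a user's held-out items that appear in the top k."""
    if not held_out:
        return 0.0
    hits = sum(1 for item in ranked_items[:k] if item in held_out)
    return hits / len(held_out)

def relative_improvement(model_score, baseline_score):
    """Improvement over a baseline, in percent."""
    return (model_score - baseline_score) / baseline_score * 100.0

# Placeholder data: two of the three held-out items are ranked in the top 3.
ranking = ["a", "b", "c", "d", "e"]
score = recall_at_k(ranking, held_out={"b", "c", "z"}, k=3)

# Placeholder scores, not measurements.
gain = relative_improvement(0.75, 0.5)
```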

Accuracy (with Item Features)

[Figure: bar chart of Recall@50 improvements over Factorization Machine (%) for VBPR, CDL, and CML+F on CiteULike, BookCX, Flickr, Medium, and MovieLens; asterisks mark some of the bars]

* Indicates that CML's improvement over the second-best algorithm is statistically significant according to the Wilcoxon signed-rank test.

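Efficiency

• All models are optimized with locality-sensitive hashing (LSH)

• CML’s throughput is improved by 106x with only a 2% reduction in accuracy

• Over 8x faster than (optimized) MF models given the same accuracy

[Figure: efficiency comparison annotated "8x faster"; a marker whose symbol was lost in extraction denotes brute-force search]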
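The slides do not say which LSH scheme was used. As an illustration of the general idea only, the sketch below hashes vectors with random projections quantized into buckets, a common family for Euclidean distance, so that a query is compared against the items in its own bucket instead of the full catalog. The class name, parameters, and single-table design are assumptions.

```python
import math
import random
from collections import defaultdict

class EuclideanLSH:
    """Toy single-table LSH for Euclidean distance (random projections)."""

    def __init__(self, dim, n_hashes=4, bucket_width=1.0, seed=0):
        rng = random.Random(seed)
        self.bucket_width = bucket_width
        self.projections = [[rng.gauss(0.0, 1.0) for _ in range(dim)]
                            for _ in range(n_hashes)]
        self.offsets = [rng.uniform(0.0, bucket_width) for _ in range(n_hashes)]
        self.buckets = defaultdict(list)

    def key(self, vec):
        """Quantize each random projection; nearby vectors tend to collide."""
        return tuple(
            math.floor((sum(p * x for p, x in zip(proj, vec)) + off)
                       / self.bucket_width)
            for proj, off in zip(self.projections, self.offsets)
        )

    def add(self, item_id, vec):
        self.buckets[self.key(vec)].append(item_id)

    def candidates(self, query_vec):
        """Items sharing the query's bucket; rank these by exact distance."""
        return list(self.buckets.get(self.key(query_vec), []))
```

The trade-off on the slide shows up here as well: wider buckets or fewer hash functions return more candidates, which raises accuracy and lowers throughput, while narrower buckets do the opposite.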


Embedding Interpretability

[Figure: embedding visualization with labels A, B, and C]

Conclusions

• The notion of a user-item matrix and matrix factorization becomes less applicable with implicit feedback.

• CML is a metric learning model that has better accuracy, efficiency, interpretability, and extensibility.

• Applying metric-based algorithms, such as K-means and SVMs, to other recommendation problems.

Thank you!
