
On Top-k Recommendation using Social Networks


Page 1: On Top-k Recommendation using Social Networks


On Top-k Recommendation using Social Networks

Xiwang Yang, Harald Steck*+, Yang Guo* and Yong Liu

Polytechnic Institute of NYU *Bell Labs +Netflix Inc.


Page 2: On Top-k Recommendation using Social Networks

Outline

Background & Motivation
• Social network based top-k recommendation
• Related Work: AllRank, SoRec, STE, SocialMF, Trust-cf

Top-k recommender using social networks
• Top-k MF using Social Networks
• Nearest Neighbor Methods

Evaluation

Conclusion

Page 3: On Top-k Recommendation using Social Networks

Social Recommenders Everywhere


Page 4: On Top-k Recommendation using Social Networks

Social network based top-k recommendation

[Figure: target customer, recommender, list of top movies ??]

Social network based top-k recommendation is not well studied

Page 5: On Top-k Recommendation using Social Networks

Social Top-K Recommendation

Top-k recommendation: a more realistic RS task
Integrate social network information into RS

Matrix Factorization (MF)
• SoRec, STE, SocialMF: optimize RMSE
• AllRank: without social network information
• Our approach directly optimizes social network based top-k recommendation

Nearest Neighbor (NN)
• Trust-cf (RecSys'09): combines the CF neighborhood with the social neighborhood; the items rated by the combined neighborhood are considered, each is scored by its average rating, and items are ranked by predicted rating to form the top-k recommendation
• Our approach employs a new neighborhood construction and a voting mechanism

Page 6: On Top-k Recommendation using Social Networks

AllRank (Steck, KDD'10)

Use AllRank to optimize top-k recommendation
• Users' selection bias causes the observed feedback (e.g. ratings, purchases, clicks) in the data to be missing not at random (MNAR) (RecSys'09)
• Lower ratings are missing with higher probability: a missing rating tends to indicate that the user does not like the item

Prediction:

$$\hat{R}_{u,i} = r_m + Q_u P_i^T$$

Objective:

$$\sum_{\text{all } u} \sum_{\text{all } i} W_{u,i} \left( R^{o\&i}_{u,i} - \hat{R}_{u,i} \right)^2 + \lambda \left( \|P\|_F^2 + \|Q\|_F^2 \right)$$

$$W_{u,i} = \begin{cases} 1 & \text{if } R_{u,i} \text{ observed} \\ w_m & \text{otherwise} \end{cases} \qquad R^{o\&i}_{u,i} = \begin{cases} R_{u,i} & \text{if } R_{u,i} \text{ observed} \\ r_m & \text{otherwise} \end{cases}$$

• $w_m > 0$: training on all items
• BaseMF: $w_m = 0$, training on observed ratings only
• Rank items based on predicted rating to form the top-k list

Tailor existing social-trust enhanced MF models for top-k recommendation
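The AllRank objective on this slide can be sketched in a few lines of NumPy. This is only an illustration under assumptions: the function name, the dense-matrix layout, the boolean `observed` mask, and the default values of `r_m`, `w_m` and `lam` are hypothetical choices, not taken from the slides.

```python
import numpy as np

def allrank_objective(R, observed, Q, P, r_m=4.0, w_m=0.1, lam=0.1):
    """Weighted squared error over ALL (user, item) pairs plus Frobenius regularization.

    R        : (n_users, n_items) ratings; entries where `observed` is False are ignored
    observed : boolean mask with the same shape as R
    Q, P     : user and item latent factors, shapes (n_users, d) and (n_items, d)
    """
    R_hat = r_m + Q @ P.T                  # prediction: r_m + Q_u P_i^T
    R_oi = np.where(observed, R, r_m)      # observed rating, otherwise imputed r_m
    W = np.where(observed, 1.0, w_m)       # weight 1 if observed, otherwise w_m
    fit = np.sum(W * (R_oi - R_hat) ** 2)
    reg = lam * (np.sum(P ** 2) + np.sum(Q ** 2))
    return fit + reg
```

Setting `w_m=0` drops every unobserved pair from the sum, which corresponds to the BaseMF case of training on observed ratings only.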

Page 7: On Top-k Recommendation using Social Networks

Outline

Background & Motivation
• Social network based top-k recommendation
• Related Work: AllRank, SoRec, STE, SocialMF

Top-k recommender using social networks
• Top-k MF using Social Networks
• Nearest Neighbor Methods

Evaluation

Conclusion

Page 8: On Top-k Recommendation using Social Networks

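SoRec

Prediction:

$$\hat{R}_{u,i} = r_m + Q_u P_i^T, \qquad \hat{S}^*_{u,v} = s_m + Q_u Z_v^T$$

Objective (optimizes RMSE):

$$\sum_{(u,i)\ \text{obs.}} \left( R_{u,i} - \hat{R}_{u,i} \right)^2 + \gamma \sum_{(u,v)\ \text{obs.}} \left( S^*_{u,v} - \hat{S}^*_{u,v} \right)^2 + \lambda \left( \|P\|_F^2 + \|Q\|_F^2 + \|Z\|_F^2 \right)$$

Modified objective (optimizes top-k hit rate):

$$\sum_{\text{all } u} \sum_{\text{all } i} W_{u,i} \left( R^{o\&i}_{u,i} - \hat{R}_{u,i} \right)^2 + \sum_{\text{all } u} \sum_{\text{all } v} W^{(S)}_{u,v} \left( S^{*(o\&i)}_{u,v} - \hat{S}^*_{u,v} \right)^2 + \lambda \left( \|P\|_F^2 + \|Q\|_F^2 + \|Z\|_F^2 \right)$$

$$W_{u,i} = \begin{cases} 1 & \text{if } R_{u,i} \text{ observed} \\ w_m > 0 & \text{otherwise} \end{cases} \qquad W^{(S)}_{u,v} = \begin{cases} 1 & \text{if } S^*_{u,v} \text{ observed} \\ w_m^{(S)} > 0 & \text{otherwise} \end{cases}$$

$$R^{o\&i}_{u,i} = \begin{cases} R_{u,i} & \text{if } R_{u,i} \text{ observed} \\ r_m & \text{otherwise} \end{cases} \qquad S^{*(o\&i)}_{u,v} = \begin{cases} S^*_{u,v} & \text{if } S^*_{u,v} \text{ observed} \\ s_m & \text{otherwise} \end{cases}$$

Top-k list is generated by ranking the predicted ratings of all items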
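Generating the top-k list from predicted ratings can be sketched as follows. The function name and the optional `exclude` argument (for skipping items the user has already rated) are assumptions added for illustration; the slide only states that all items are ranked by predicted rating.

```python
import numpy as np

def top_k_items(Q_u, P, r_m, k, exclude=()):
    """Return the indices of the k items with the highest predicted rating for one user."""
    scores = (r_m + P @ Q_u).astype(float)       # predicted rating for every item
    idx = np.fromiter(exclude, dtype=int)        # assumption: items to leave out of the list
    scores[idx] = -np.inf
    order = np.argsort(-scores, kind="stable")   # descending score; ties keep item order
    return [int(i) for i in order if np.isfinite(scores[i])][:k]
```

The same ranking step applies to any of the MF models, since they differ only in how the latent factors are trained and how the prediction is formed.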

Page 9: On Top-k Recommendation using Social Networks

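STE: modified objective (optimizes top-k hit rate)

$$\hat{R}_{u,i} = r_m + \alpha\, Q_u P_i^T + (1 - \alpha) \sum_{v} S_{u,v}\, Q_v P_i^T$$

$$\sum_{\text{all } u} \sum_{\text{all } i} W_{u,i} \left( R^{o\&i}_{u,i} - \hat{R}_{u,i} \right)^2 + \lambda \left( \|P\|_F^2 + \|Q\|_F^2 \right)$$

SocialMF: modified objective (optimizes top-k hit rate)

$$\hat{R}_{u,i} = r_m + Q_u P_i^T$$

$$\sum_{\text{all } u} \sum_{\text{all } i} W_{u,i} \left( R^{o\&i}_{u,i} - \hat{R}_{u,i} \right)^2 + \beta \sum_{\text{all } u} \left( Q_u - \sum_{v} S^*_{u,v} Q_v \right) \left( Q_u - \sum_{v} S^*_{u,v} Q_v \right)^T + \lambda \left( \|P\|_F^2 + \|Q\|_F^2 \right)$$

In both models:

$$W_{u,i} = \begin{cases} 1 & \text{if } R_{u,i} \text{ observed} \\ w_m > 0 & \text{otherwise} \end{cases} \qquad R^{o\&i}_{u,i} = \begin{cases} R_{u,i} & \text{if } R_{u,i} \text{ observed} \\ r_m & \text{otherwise} \end{cases}$$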
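The two social terms on this slide, the STE prediction and the SocialMF trust penalty, can be written compactly in matrix form. This is a hedged sketch: the function names and the dense trust matrices `S` and `S_star` (one row per user) are assumptions for illustration.

```python
import numpy as np

def ste_predict(Q, P, S, r_m, alpha):
    """STE prediction: blend a user's own taste with the taste of the users they trust."""
    own = Q @ P.T             # Q_u P_i^T for every (u, i)
    social = S @ own          # sum_v S_{u,v} Q_v P_i^T
    return r_m + alpha * own + (1.0 - alpha) * social

def socialmf_penalty(Q, S_star, beta):
    """SocialMF term: beta * sum_u ||Q_u - sum_v S*_{u,v} Q_v||^2."""
    diff = Q - S_star @ Q     # row u holds Q_u minus the trust-weighted neighbor average
    return beta * np.sum(diff ** 2)
```

With `alpha = 1` the STE prediction reduces to the plain MF prediction, and with `beta = 0` the SocialMF objective reduces to the AllRank objective.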

Page 10: On Top-k Recommendation using Social Networks

Nearest Neighbor Methods: CF-ULF approach

• Use AllRank to obtain user latent features
• Cluster users by Pearson correlation coefficient (PCC) in the latent feature space
• Select the k1 nearest neighbors of target user u
• Relevant items of these nearest neighbors are voted to the target user; the voting weight is the PCC similarity
• Top-k list is generated based on the voting value

$$\mathrm{Vote}_{u,i} = \sum_{v \in N_u} \mathrm{sim}(u, v)\, \delta_{i \in I_v}$$
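The CF-ULF voting rule can be sketched as below. It assumes that `Q` holds the user latent features (one row per user, at least two non-constant features each), that `relevant` maps each user index to that user's set of relevant items, and that the neighborhood is simply the `k1` users with the highest PCC; all names are hypothetical.

```python
import numpy as np

def cf_ulf_votes(Q, relevant, u, k1):
    """Vote_{u,i}: sum of sim(u, v) over the k1 nearest neighbors v that have item i as relevant."""
    sim = np.corrcoef(Q)[u].copy()      # PCC between user u and every user (rows are users)
    sim[u] = -np.inf                    # a user is not their own neighbor
    neighbors = np.argsort(-sim, kind="stable")[:k1]
    votes = {}
    for v in neighbors:
        for i in relevant[int(v)]:
            votes[i] = votes.get(i, 0.0) + float(sim[v])
    return votes
```

The top-k list then follows from sorting the items by vote, for example `sorted(votes, key=votes.get, reverse=True)[:k]`.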

Page 11: On Top-k Recommendation using Social Networks

Nearest Neighbor Methods: PureTrust approach

• Run breadth-first search (BFS) in the social network to find k2 users trusted by the target user u
• Relevant items of these trusted users are voted to the target user; the voting weight is proportional to $1/d_v$

$$\mathrm{Vote}_{u,i} = \sum_{v \in N_u^{t}} w_t(u, v)\, \delta_{i \in I_v}, \qquad w_t(u, v) = \frac{1}{d_v}$$

• $N_u^{t}$ is the set of trusted users of u
• $w_t(u, v)$ is the voting weight from user v
• $d_v$ is the depth of user v in the BFS tree rooted at user u
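A minimal sketch of PureTrust voting, assuming the trust network is given as an adjacency dict `trust` (user to list of trusted users) and `relevant` maps users to their sets of relevant items. Both names, and the choice to stop BFS as soon as `k2` users are found, are assumptions for illustration.

```python
from collections import deque

def pure_trust_votes(trust, relevant, u, k2):
    """Vote_{u,i}: sum of 1/d_v over the first k2 users v reached by BFS from u with i relevant."""
    depth = {u: 0}                      # BFS depth of every visited user
    queue = deque([u])
    trusted = []                        # visited users in BFS order, excluding u
    while queue and len(trusted) < k2:
        x = queue.popleft()
        for v in trust.get(x, ()):
            if v not in depth:
                depth[v] = depth[x] + 1
                queue.append(v)
                trusted.append(v)
                if len(trusted) == k2:
                    break
    votes = {}
    for v in trusted:
        for i in relevant.get(v, ()):
            votes[i] = votes.get(i, 0.0) + 1.0 / depth[v]   # weight w_t(u, v) = 1/d_v
    return votes
```

Direct friends (depth 1) vote with weight 1, friends of friends with weight 1/2, and so on, so closer users in the trust network count more.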

Page 12: On Top-k Recommendation using Social Networks

Nearest Neighbor Methods: Trust-CF-ULF approach

• Combination of the CF-ULF approach and PureTrust
• Find k1 nearest neighbors from the CF-ULF neighborhood
• Find k2 nearest neighbors from the trust neighborhood which are not in the k1 set (k2 = k1)
• Relevant items of these users are voted to the target user
• Top-k list is generated based on the voting value

Trust-CF-ULF-best approach
• Given the total neighborhood size, dynamically tune the values of k1 and k2 to obtain the best recall
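The neighborhood combination can be sketched as follows. The inputs `cf_ranked` and `trust_ranked` are assumed to be lists of neighbors already ordered from nearest to farthest, and `recall_of` is a hypothetical callback that evaluates a candidate neighborhood. The slides do not say how k1 and k2 are tuned, so the exhaustive search in `best_split` is only one possible reading.

```python
def combined_neighborhood(cf_ranked, trust_ranked, k1, k2):
    """Take k1 CF-ULF neighbors, then k2 trust neighbors that are not already in the CF set."""
    cf_part = list(cf_ranked[:k1])
    seen = set(cf_part)
    trust_part = [v for v in trust_ranked if v not in seen][:k2]
    return cf_part, trust_part

def best_split(cf_ranked, trust_ranked, total, recall_of):
    """Assumed tuning strategy: try every k1 + k2 = total and keep the split with the best recall."""
    best = None
    for k1 in range(total + 1):
        cf_part, trust_part = combined_neighborhood(cf_ranked, trust_ranked, k1, total - k1)
        score = recall_of(cf_part, trust_part)
        if best is None or score > best[0]:
            best = (score, k1, total - k1)
    return best   # (recall, k1, k2)
```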

Page 13: On Top-k Recommendation using Social Networks

Outline

Background & Motivation
• Social network based top-k recommendation
• Related Work: AllRank, SoRec, STE, SocialMF

Top-k recommender using social networks
• Top-k MF using Social Networks
• Nearest Neighbor Methods

Evaluation

Conclusion

Page 14: On Top-k Recommendation using Social Networks

Evaluation Metrics

Top-k hit rate (Recall)
• The fraction of relevant items in the test set that are in the top-k of the ranking list

RMSE

$$\mathrm{RMSE} = \sqrt{\frac{\sum_{(u,i) \in R_{test}} \left( R_{u,i} - \hat{R}_{u,i} \right)^2}{|R_{test}|}}$$
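Both metrics are short to compute. The sketch below evaluates the hit rate for a single user's ranked list; how it is aggregated over users is not stated on the slide, so that part is left out. Function and argument names are hypothetical.

```python
import math

def top_k_hit_rate(ranked_items, relevant_test_items, k):
    """Fraction of a user's relevant test items that appear in the top-k of the ranking."""
    if not relevant_test_items:
        return 0.0
    top = set(ranked_items[:k])
    hits = sum(1 for i in relevant_test_items if i in top)
    return hits / len(relevant_test_items)

def rmse(pairs):
    """pairs: iterable of (actual rating, predicted rating) over the test set."""
    pairs = list(pairs)
    return math.sqrt(sum((r - r_hat) ** 2 for r, r_hat in pairs) / len(pairs))
```

Note that the hit rate depends only on the order of the items, while RMSE depends only on the predicted values for rated test items, which is why a model can do well on one metric and poorly on the other.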

Page 15: On Top-k Recommendation using Social Networks

Top-k hit rate on Epinions

Dataset: 71K users, 104K items, 571K item reviews, 509K trust statements

• Up to ~10× increase compared with training on observed ratings only
• The social network is very helpful for top-k recommendation, especially for recommendation to cold start users
• Modified SoRec outperforms modified No Trust (AllRank) by 23.1% in overall recall and by 101.8% in cold user recall
• Recall for cold users in SoRec is better than for all users
• An item rated by a cold user has received 102 ratings on average
• An item rated by all users has received 93 ratings on average

Page 16: On Top-k Recommendation using Social Networks

RMSE on Epinions Dataset

Settings: $j_0 = 10$, $\lambda = 0.1$, $r_m = 4.0$, $w_m = 0$
• RMSE = 1.174 for BaseMF
• RMSE = 1.095 for SocialMF ($\beta = 20$)
• RMSE = 1.157 for STE ($\alpha = 0.5$)
• RMSE = 1.117 for SoRec ($\gamma = 50$ and $w_m^{(S)} = 0$)

Consistent with RMSE results in the published literature
SocialMF performs best in terms of RMSE but worst in terms of top-k hit rate

Page 17: On Top-k Recommendation using Social Networks

Experiments on Epinions Dataset: NN

• Greatly outperforms existing work (Trust-cf)
• Trust-cf predicts the rating value of the target user as the average rating value of the user's neighbors, which is based on the observed ratings only
• Our CF neighbors are derived from user latent features obtained from AllRank, which accounts for the data being MNAR by training on all items
• Voting is the simplest possible way of accounting for all ratings, i.e. counting 0 for an absent rating and 1 for an observed relevant rating

Page 18: On Top-k Recommendation using Social Networks

Experiments on Flixster Dataset

• ~1M users, 49K movies, 8.2M ratings, 26.7M connections
• Results are similar

Page 19: On Top-k Recommendation using Social Networks

Impact of Dimensionality and Top-k

• Top-k hit rate on the Flixster data is much better than on the Epinions data
• The Epinions dataset has about twice as many items as the Flixster dataset, while recall on Flixster is more than twice that on Epinions for top-5 to top-500 recommendations
• Epinions is a multi-category dataset (cars, movies, books, etc.)
• Users in the Flixster dataset have more social connections and item ratings on average

Page 20: On Top-k Recommendation using Social Networks

Conclusion

• Comprehensive study on improving the accuracy of top-k recommendation using social networks
• Tailored existing social-trust enhanced MF models for top-k recommendation by considering missing ratings
• Proposed an NN-based top-k recommendation method that combines users' neighborhoods in the trust network with their neighborhoods in the latent feature space, and uses voting instead of average rating to consider all ratings
• Among social recommenders that consider missing feedback, the one that works best for minimizing RMSE works worst for maximizing the hit rate, and vice versa
• First developing a good RMSE approach and then modifying the training for top-k is not necessarily a viable strategy for obtaining a good top-k approach

Page 21: On Top-k Recommendation using Social Networks

Thanks!

Q & A
