On Top-k Recommendation using Social Networks
Xiwang Yang, Harald Steck*+, Yang Guo* and Yong Liu
Polytechnic Institute of NYU *Bell Labs +Netflix Inc.
Outline
• Background & Motivation
  • Social network based top-k recommendation
  • Related Work: AllRank, SoRec, STE, SocialMF, Trust-cf
• Top-k recommender using social networks
  • Top-k MF using Social Networks
  • Nearest Neighbor Methods
• Evaluation
• Conclusion
Social Recommenders Everywhere
Social network based top-k recommendation
[Figure labels: Target Customer; Recommender; List of Top Movies ??]
Social network based top-k recommendation is not well studied
Social Top-K Recommendation
• Top-k recommendation: a more realistic RS task
• Integrate social network information into RS
• Matrix Factorization (MF)
  • SoRec, STE, SocialMF: optimize RMSE
  • AllRank: without social network information
  • Our approach directly optimizes social network based top-k recommendation
• Nearest Neighbor (NN)
  • Trust-cf (RecSys'09): combines the CF neighborhood with the social neighborhood; items rated by the combined neighborhood are considered, their ratings are averaged, and items are ranked by predicted rating to form the top-k recommendation
  • Our approach employs a new neighborhood construction and a voting mechanism
AllRank (Steck, KDD'10)
• Use AllRank to optimize top-k recommendation
• Users' selection bias causes the observed feedback (e.g. ratings, purchases, clicks) in the data to be missing not at random (MNAR) (RecSys'09)
  • Lower ratings are missing with higher probability
  • Missing ratings tend to indicate that a user does not like the item
• Prediction:
$$\hat{R}_{u,i} = r_m + Q_u P_i^T$$
• Objective:
$$\sum_{\text{all } u} \sum_{\text{all } i} W_{u,i} \left( R^{o\&i}_{u,i} - \hat{R}_{u,i} \right)^2 + \lambda \left( \|P\|_F^2 + \|Q\|_F^2 \right)$$
$$W_{u,i} = \begin{cases} 1 & \text{if } R_{u,i} \text{ observed} \\ w_m & \text{otherwise} \end{cases} \qquad R^{o\&i}_{u,i} = \begin{cases} R_{u,i} & \text{if } R_{u,i} \text{ observed} \\ r_m & \text{otherwise} \end{cases}$$
• $w_m > 0$: training on all items
• BaseMF: $w_m = 0$, training on observed ratings only
• Rank items based on predicted rating to form the top-k list
• Tailor existing social-trust enhanced MF models for top-k recommendation
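To make the AllRank objective concrete, here is a minimal sketch that evaluates it over every user-item pair in plain Python. The function name `allrank_objective`, the dictionary storage for observed ratings, and the list-of-lists latent factors are assumptions made for illustration; this is not the authors' implementation and it only evaluates the loss, it does not train anything.

```python
def allrank_objective(ratings, Q, P, r_m, w_m, lam):
    """Weighted squared error over ALL (user, item) pairs plus an L2 penalty.

    ratings: dict mapping (u, i) -> observed rating (hypothetical storage format)
    Q, P:    user and item latent vectors, one list of floats per row
    r_m:     value imputed for missing ratings (also the prediction offset)
    w_m:     weight of missing entries; w_m = 0 reduces to training on
             observed ratings only (the BaseMF setting)
    """
    total = 0.0
    for u, q_u in enumerate(Q):
        for i, p_i in enumerate(P):
            pred = r_m + sum(a * b for a, b in zip(q_u, p_i))
            if (u, i) in ratings:
                weight, target = 1.0, ratings[(u, i)]
            else:
                weight, target = w_m, r_m
            total += weight * (target - pred) ** 2
    # Squared Frobenius norms of both factor matrices.
    frob = sum(x * x for row in Q for x in row) + sum(x * x for row in P for x in row)
    return total + lam * frob
```

With `w_m = 0` every missing entry drops out of the sum, so the same function covers both the all-items setting and the observed-only baseline.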
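SoRec
• Prediction:
$$\hat{R}_{u,i} = r_m + Q_u P_i^T \qquad \hat{S}^*_{u,v} = s_m + Q_u Z_v^T$$
• Objective (optimize RMSE):
$$\sum_{(u,i)\ \text{obs.}} \left( R_{u,i} - \hat{R}_{u,i} \right)^2 + \gamma \sum_{(u,v)\ \text{obs.}} \left( S^*_{u,v} - \hat{S}^*_{u,v} \right)^2 + \lambda \left( \|P\|_F^2 + \|Q\|_F^2 + \|Z\|_F^2 \right)$$
• Modified objective (optimize top-k hit rate):
$$\sum_{\text{all } u} \sum_{\text{all } i} W_{u,i} \left( R^{o\&i}_{u,i} - \hat{R}_{u,i} \right)^2 + \sum_{\text{all } u} \sum_{\text{all } v} W^{(S)}_{u,v} \left( S^{*(o\&i)}_{u,v} - \hat{S}^*_{u,v} \right)^2 + \lambda \left( \|P\|_F^2 + \|Q\|_F^2 + \|Z\|_F^2 \right)$$
$$W_{u,i} = \begin{cases} 1 & \text{if } R_{u,i} \text{ observed} \\ w_m > 0 & \text{otherwise} \end{cases} \qquad W^{(S)}_{u,v} = \gamma \begin{cases} 1 & \text{if } S^*_{u,v} \text{ observed} \\ w^{(S)}_m > 0 & \text{otherwise} \end{cases}$$
$$R^{o\&i}_{u,i} = \begin{cases} R_{u,i} & \text{if } R_{u,i} \text{ observed} \\ r_m & \text{otherwise} \end{cases} \qquad S^{*(o\&i)}_{u,v} = \begin{cases} S^*_{u,v} & \text{if } S^*_{u,v} \text{ observed} \\ s_m & \text{otherwise} \end{cases}$$
• Top-k list is generated by ranking the predicted ratings of all items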
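The final ranking step can be sketched as follows: score every item with the prediction rule `r_m + Q_u · P_i` and keep the k highest. The helper name `top_k_items` and the optional `exclude` argument (for example, to skip items the user already rated) are assumptions for illustration and are not taken from the slides.

```python
import heapq


def top_k_items(q_u, P, r_m, k, exclude=()):
    """Return the indices of the k items with the highest predicted rating.

    q_u:     latent vector of the target user
    P:       item latent vectors, one list of floats per item
    exclude: item indices to skip (hypothetical convenience parameter)
    """
    skip = set(exclude)
    scored = []
    for i, p_i in enumerate(P):
        if i in skip:
            continue
        score = r_m + sum(a * b for a, b in zip(q_u, p_i))
        scored.append((score, i))
    # Highest score first; ties are broken in favour of the lower item index.
    best = heapq.nlargest(k, scored, key=lambda s: (s[0], -s[1]))
    return [i for _, i in best]
```

Because the offset `r_m` is the same for every item it does not change the order; it is kept only so the scores match the prediction formula.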
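STE: modified objective (optimize top-k hit rate)
• Prediction:
$$\hat{R}_{u,i} = r_m + \alpha\, Q_u P_i^T + (1 - \alpha) \sum_{v} S_{u,v}\, Q_v P_i^T$$
• Objective:
$$\sum_{\text{all } u} \sum_{\text{all } i} W_{u,i} \left( R^{o\&i}_{u,i} - \hat{R}_{u,i} \right)^2 + \lambda \left( \|P\|_F^2 + \|Q\|_F^2 \right)$$
SocialMF: modified objective (optimize top-k hit rate)
• Prediction:
$$\hat{R}_{u,i} = r_m + Q_u P_i^T$$
• Objective:
$$\sum_{\text{all } u} \sum_{\text{all } i} W_{u,i} \left( R^{o\&i}_{u,i} - \hat{R}_{u,i} \right)^2 + \beta \sum_{\text{all } u} \left( Q_u - \sum_{v} S^*_{u,v} Q_v \right) \left( Q_u - \sum_{v} S^*_{u,v} Q_v \right)^T + \lambda \left( \|P\|_F^2 + \|Q\|_F^2 \right)$$
In both models:
$$W_{u,i} = \begin{cases} 1 & \text{if } R_{u,i} \text{ observed} \\ w_m > 0 & \text{otherwise} \end{cases} \qquad R^{o\&i}_{u,i} = \begin{cases} R_{u,i} & \text{if } R_{u,i} \text{ observed} \\ r_m & \text{otherwise} \end{cases}$$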
Nearest Neighbor Methods: CF-ULF approach
• Use AllRank to obtain user latent features
• Cluster users by PCC (Pearson correlation coefficient) in the latent feature space
• Select the k1 nearest neighbors of target user u
• Relevant items of these nearest neighbors are voted to the target user; the voting weight is the PCC similarity
$$Vote_{u,i} = \sum_{v \in N_u} sim(u,v)\, \delta_{i \in I_v}$$
• Top-k list is generated based on voting value
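The voting scheme can be sketched in a few lines, assuming the user latent vectors are already available (for example from an AllRank-style factorization). The names `pearson` and `cf_ulf_votes` and the dictionary of relevant items per user are hypothetical; the slides do not specify an implementation.

```python
import math


def pearson(a, b):
    """Pearson correlation coefficient between two equal-length vectors."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # assumption: treat a constant vector as uncorrelated
    return cov / math.sqrt(var_a * var_b)


def cf_ulf_votes(u, Q, relevant, k1):
    """Sum PCC-weighted votes from the k1 users closest to u in latent space.

    Q:        user latent vectors, one list of floats per user
    relevant: dict mapping user -> iterable of that user's relevant items
    """
    sims = [(pearson(Q[u], Q[v]), v) for v in range(len(Q)) if v != u]
    neighbors = sorted(sims, key=lambda s: (-s[0], s[1]))[:k1]
    votes = {}
    for sim, v in neighbors:
        for item in relevant.get(v, ()):
            votes[item] = votes.get(item, 0.0) + sim
    return votes
```

Sorting the returned dictionary by vote value in descending order and keeping the first k entries gives the top-k list.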
Nearest Neighbor Methods: PureTrust approach
• Breadth-first search (BFS) in the social network finds k2 users trusted by the target user u
• Relevant items of these trusted users are voted to the target user; the voting weight is proportional to $1/d_v$
$$Vote_{u,i} = \sum_{v \in N_u^{(t)}} w_t(u,v)\, \delta_{i \in I_v} \qquad w_t(u,v) = \frac{1}{d_v}$$
• $N_u^{(t)}$ is the set of trusted users of u
• $w_t(u,v)$ is the voting weight from user v
• $d_v$ is the depth of user v in the BFS tree rooted at user u
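A minimal sketch of the PureTrust procedure, assuming the trust network is stored as an adjacency dictionary: BFS from the target user collects the first k2 reachable users, and each of them votes for their relevant items with weight 1/depth. The function name `pure_trust_votes` and the data layout are assumptions for illustration.

```python
from collections import deque


def pure_trust_votes(u, trust, relevant, k2):
    """Collect votes from the k2 users found first by BFS from u.

    trust:    dict mapping user -> list of directly trusted users
    relevant: dict mapping user -> iterable of that user's relevant items
    """
    depth = {u: 0}
    trusted = []  # users in BFS discovery order
    queue = deque([u])
    while queue and len(trusted) < k2:
        current = queue.popleft()
        for v in trust.get(current, ()):
            if v not in depth:
                depth[v] = depth[current] + 1
                trusted.append(v)
                queue.append(v)
                if len(trusted) == k2:
                    break
    votes = {}
    for v in trusted:
        for item in relevant.get(v, ()):
            # Direct friends (depth 1) count fully, friends of friends half, etc.
            votes[item] = votes.get(item, 0.0) + 1.0 / depth[v]
    return votes
```

BFS visits users in order of increasing depth, so stopping after k2 discoveries keeps the closest part of the trust network.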
Nearest Neighbor Methods: Trust-CF-ULF approach
• Combination of the CF-ULF approach and PureTrust
• Find k1 nearest neighbors from the CF-ULF neighborhood
• Find k2 nearest neighbors from the trust neighborhood which are not in the k1 set (k2 = k1)
• Relevant items of these users are voted to the target user
• Top-k list is generated based on voting value
Trust-CF-ULF-best approach
• Given the total neighborhood size, dynamically tune the values of k1 and k2 to obtain the best recall result
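The neighborhood combination can be sketched as below. It assumes both neighborhoods arrive as lists of `(user, weight)` pairs already ranked by closeness, and that each neighbor keeps the voting weight from its own method; the slides do not state how weights from the two neighborhoods are combined, so that choice and all function names are assumptions.

```python
def combine_neighborhoods(cf_ranked, trust_ranked, k1, k2):
    """Take k1 CF neighbors, then k2 trust neighbors not already chosen.

    cf_ranked, trust_ranked: lists of (user, voting_weight), best first
    """
    cf_part = cf_ranked[:k1]
    chosen = {v for v, _ in cf_part}
    trust_part = []
    for v, w in trust_ranked:
        if len(trust_part) == k2:
            break
        if v not in chosen:
            trust_part.append((v, w))
            chosen.add(v)
    return cf_part + trust_part


def vote(neighbors, relevant):
    """Sum each neighbor's weight over the items that neighbor found relevant."""
    votes = {}
    for v, w in neighbors:
        for item in relevant.get(v, ()):
            votes[item] = votes.get(item, 0.0) + w
    return votes


def top_k(votes, k):
    """Items with the highest vote value first; ties broken by item name."""
    return sorted(votes, key=lambda item: (-votes[item], item))[:k]
```

Under these assumptions, a Trust-CF-ULF-best style search would amount to repeating this for every split of a fixed total k1 + k2 and keeping the split with the best recall.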
Evaluation Metrics
• Top-k hit rate (recall): the fraction of relevant items in the test set that are in the top-k of the ranking list
• RMSE:
$$RMSE = \sqrt{\frac{\sum_{(u,i) \in R_{test}} \left( R_{u,i} - \hat{R}_{u,i} \right)^2}{|R_{test}|}}$$
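Both metrics are short enough to sketch directly. The version of the hit rate below is computed for a single user's ranked list; how per-user values are aggregated is not stated on the slide, so that part is left out. The function names and argument formats are assumptions for illustration.

```python
import math


def recall_at_k(ranked_items, relevant_test_items, k):
    """Fraction of the relevant test items that appear in the top-k list."""
    relevant = set(relevant_test_items)
    if not relevant:
        return 0.0  # assumption: no relevant test items counts as zero recall
    hits = len(set(ranked_items[:k]) & relevant)
    return hits / len(relevant)


def rmse(test_ratings, predict):
    """Root mean squared error over the test ratings.

    test_ratings: dict mapping (u, i) -> true rating
    predict:      callable (u, i) -> predicted rating
    """
    squared_error = sum((r - predict(u, i)) ** 2 for (u, i), r in test_ratings.items())
    return math.sqrt(squared_error / len(test_ratings))
```

The two functions consume different things: recall only looks at the order of the ranked list, while RMSE only looks at predicted values on rated items, which is why a model can do well on one and poorly on the other.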
Top-k hit rate on Epinions Dataset
• 71K users, 104K items, 571K item reviews, 509K trust statements
• Up to ~10× increase compared with training on observed ratings only
• The social network is very helpful for top-k recommendation, especially for cold start users
• Modified SoRec outperforms the modified no-trust model (AllRank) by 23.1% in overall recall and by 101.8% in cold user recall
• In SoRec, recall for cold users is better than recall for all users
  • An item rated by a cold user has received 102 ratings on average
  • An item rated by all users has received 93 ratings on average
RMSE on Epinions Dataset
• Settings: $j_0 = 10$, $\lambda = 0.1$, $r_m = 4.0$, $w_m = 0$
  • BaseMF: RMSE = 1.174
  • SocialMF ($\beta = 20$): RMSE = 1.095
  • STE ($\alpha = 0.5$): RMSE = 1.157
  • SoRec ($\gamma = 50$ and $w^{(S)}_m = 0$): RMSE = 1.117
• Consistent with RMSE results in published literature
• SocialMF performs best in terms of RMSE but worst in terms of top-k hit rate
Experiments on Epinions Dataset: NN
• Greatly outperforms existing work (Trust-cf)
• Trust-cf predicts the rating of the target user as the average rating of the user's neighbors, which is based on the observed ratings only
• Our CF neighbors are derived from user latent features obtained from AllRank, which accounts for MNAR data by training on all items
• Voting is the simplest possible way of accounting for all ratings, i.e. counting 0 for an absent rating and 1 for an observed relevant rating
Experiments on Flixster Dataset
• ~1M users, 49K movies, 8.2M ratings, 26.7M connections
• Results are similar
Impact of Dimensionality and Top-k
• Top-k hit rate on the Flixster data is much better than on the Epinions data
  • The Epinions dataset has about twice as many items as the Flixster dataset, while recall on Flixster is more than twice that on Epinions for top-5 to top-500 recommendations
  • Epinions is multi-category data (cars, movies, books, etc.)
  • Users in the Flixster dataset have, on average, more social connections and item ratings
Conclusion
• Comprehensive study on improving the accuracy of top-k recommendation using social networks
• Tailored existing social-trust enhanced MF models for top-k recommendation by considering missing ratings
• Proposed an NN based top-k recommendation method that combines users' neighborhoods in the trust network with their neighborhoods in the latent feature space, and uses voting instead of average rating to consider all ratings
• Among social recommenders that account for missing feedback, the approach that works best for minimizing RMSE works worst for maximizing the hit rate, and vice versa
  • First developing a good RMSE approach and then modifying the training for top-k is not necessarily a viable strategy for obtaining a good top-k approach
Thanks!
Q & A