Cold Start Solutions for Recommender Systems
Amin Mantrach
[email protected]
Research Scientist – Yahoo Labs Barcelona
Outline
§ Recommending at cold start;
› Learning representations for the item cold start:
• Recommending cold articles to users;
› Enriching user profiles by using users’ implicit feedback:
• Learning representations for completing the user profile;
› Enriching user profiles by using query logs;
§ Discussion: matrix factorization and skip-gram.
RECOMMENDING AT COLD START
Item Cold Start Problem on Yahoo Properties
§ The majority of items (~80%) are never shown or clicked;
§ Personalization uses content as the main signal (CTR cannot be used on cold items);
§ Motivation: why recommend cold-start items?
› Diversify the offer;
› Avoid the “Kim Kardashian” effect;
› Avoid quick sell-out of advertising.
Weakly-engaged users on Yahoo Properties
§ User engagement is power-law distributed → ~80% of users have sparse profiles, on Netflix, Amazon, Yahoo News, and Yahoo Search.
§ In other words, we are facing a coverage problem;
§ Recommendations cannot be effective for the majority of users due to the sparsity of their profiles.
[Figure: user coverage (0–1) vs. number of clicks (4–300).]
Drawbacks of the State of the Art in Cold-Start Recommendation
Item cold start
§ State-of-the-art collaborative filtering approaches cannot be applied
[Koren et al. 2009, Matrix factorization techniques for recommender systems; Rendle et al. UAI 2009, BPR: Bayesian personalized ranking from implicit feedback];
§ Basic approaches rely on content-based (CB) models;
§ The state of the art for item cold start consists of hybrid methods
[Agarwal et al. WSDM 2010, fLDA; Gantner et al. ICDM 2010, learning mappings…].
Weakly-engaged user
§ 80% of users are weakly engaged and thus have sparse profiles;
§ State-of-the-art user-profile enrichment techniques rely on
› kNN to enrich the user profile (this does not work for weakly-engaged users);
› external information, but with low coverage.
Our Contributions to the Cold Start
§ A novel collective representation-learning framework:
› Common framework:
• We address both the item cold start and the user cold start;
• Our representations are interpretable (non-negative) and can be used to reconstruct the user profile;
• Our implementation relies on simple alternating least squares (ALS) or multiplicative updates (MU).
§ Weakly-engaged users:
• We complete user profiles better than the state of the art.
The Cold Start: Research Questions
Content + User feedback → Collective factorization [Singh and Gordon, KDD 2008]
Item cold start:
1. Can we learn collective representations from content and collaborative information that outperform state-of-the-art item cold-start recommenders? [Saveski and Mantrach, RecSys 2014]
Weakly-engaged user:
2. Can we design collective representations for enriching the weakly-engaged user profile using implicit user feedback? [ongoing work]
3. Can we use query logs as external information to improve recommendations on Homerun? [patent filed]
1. RECOMMENDING FOR THE ITEM COLD START
Collective Representation Learning
§ Why collective?
› It allows learning from multiple sources: users’ feedback + items’ features.
§ Why representation?
› By learning embeddings we extract latent factors that capture the essence of the data.
§ Why collective representations for the cold start?
› When observing just one view, we can reconstruct the missing one: by projecting items’ features onto the joint representation, we can reconstruct the missing user–item interactions.
[Diagram: two factorizations sharing a common factor W —
Xs ≈ W·Hs: the item–feature matrix (#items × #features) factorized into W (#items × k) and Hs, a global topic model (Topic 1 … Topic k);
Xu ≈ W·Hu: the item–user matrix (#items × #users) factorized into the same W and Hu, a personalized per-user (community, C1 … Ck) model.]
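Under this joint factorization, a cold item (content features only, no clicks) can be scored for every user by projecting its feature vector onto the shared latent space through Hs and reading affinities off Hu. A minimal NumPy sketch with random stand-in factors; the least-squares projection is an illustrative choice, not necessarily the exact LCE inference step:

```python
import numpy as np

rng = np.random.default_rng(0)
k, n_features, n_users = 5, 50, 20

# Stand-ins for the learned factors:
# Hs maps latent topics -> item features, Hu maps latent topics -> users.
Hs = rng.random((k, n_features))
Hu = rng.random((k, n_users))

def score_cold_item(x_s, Hs, Hu):
    """Score a cold item for every user.

    1) Project the item's feature vector onto the latent space:
       solve min_w ||x_s - w @ Hs||^2 (non-negativity relaxed for brevity).
    2) Reconstruct the missing user view: scores = w @ Hu.
    """
    w, *_ = np.linalg.lstsq(Hs.T, x_s, rcond=None)  # shape (k,)
    return np.clip(w, 0.0, None) @ Hu               # one affinity per user

x_s = rng.random(n_features)           # a brand-new article's content features
scores = score_cold_item(x_s, Hs, Hu)
top_users = np.argsort(scores)[::-1][:3]
```

Ranking users by `scores` gives the cold-item recommendation list; the same mechanism works in the other direction for reconstructing a missing content view.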
Collective Representation Learning
Non-negative representations + locality constraints (LCE): two similar items should share similar representations.
Optimization Problem
§ We implemented an alternating least squares algorithm and a multiplicative update algorithm to learn the decomposition.
[https://github.com/amantrac/JNMF]
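A minimal sketch of the multiplicative-update variant, assuming the joint objective is (roughly) the minimization over non-negative W, Hs, Hu of α‖Xs − W·Hs‖² + (1−α)‖Xu − W·Hu‖²; the locality (graph) constraint of LCE is omitted here for brevity — see the JNMF repository for the full implementation:

```python
import numpy as np

def collective_nmf_mu(Xs, Xu, k=10, alpha=0.5, n_iter=200, eps=1e-9, seed=0):
    """Jointly factorize Xs ~ W @ Hs and Xu ~ W @ Hu (all factors >= 0)
    with Lee-Seung-style multiplicative updates; W is shared across views."""
    rng = np.random.default_rng(seed)
    W = rng.random((Xs.shape[0], k))
    Hs = rng.random((k, Xs.shape[1]))
    Hu = rng.random((k, Xu.shape[1]))
    for _ in range(n_iter):
        # Update the shared factor W using both views, weighted by alpha.
        num = alpha * (Xs @ Hs.T) + (1 - alpha) * (Xu @ Hu.T)
        den = alpha * (W @ Hs @ Hs.T) + (1 - alpha) * (W @ Hu @ Hu.T) + eps
        W *= num / den
        # Update each view-specific factor with the standard NMF rule.
        Hs *= (W.T @ Xs) / (W.T @ W @ Hs + eps)
        Hu *= (W.T @ Xu) / (W.T @ W @ Hu + eps)
    return W, Hs, Hu

# Toy data: 30 items, 40 content features, 15 users.
rng = np.random.default_rng(1)
Xs = rng.random((30, 40))
Xu = (rng.random((30, 15)) < 0.2).astype(float)   # sparse implicit feedback
W, Hs, Hu = collective_nmf_mu(Xs, Xu, k=5)
```

The multiplicative form keeps all factors non-negative by construction, which is what makes the learned representations interpretable.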
Item Cold-Start Recommendations
Offline evaluation:
§ Enron: 10 mailboxes, 36K emails, 5K users, explicit feedback;
§ Yahoo News articles: 40 days, a random sample of 41K articles, 650K users, implicit user feedback (3.5M comments).
A/B testing:
› Average number of items surfaced per day;
› Dwell time of the items
[Yi et al., RecSys 2014, Beyond clicks: dwell time for personalization.]
Item Cold Start: Baselines
1. Content-Based Recommender (CB)
2. Content Topic-Based Recommender
3. Latent Semantic Indexing on user profiles [Soboroff ’99]
4. Author Topic Model [M. Rosen-Zvi ’04]
5. Bayesian Personalized Ranking + kNN (BPR-kNN) [Gantner ’10]
6. fLDA [Agarwal ’10]
Offline Evaluation: Email Recipients Recommendation
[Bar chart: performance (MicroF1, MacroF1, MAP, NDCG; y-axis 0.00–0.50) of BPR-kNN, CB, LCE (No GR), and LCE.]
Offline Evaluation: Cold News Articles Recommendation
[Bar chart: ranking accuracy at RA@3, RA@5, RA@7, RA@10 (y-axis 0.00–0.40) for CB, BPR-kNN, LCE (No GR), and LCE.]
Next directions…
§ Use dwell time / duration (i.e., the proportion of a video watched) instead of intentional plays;
§ Incorporate a profile enrichment strategy based on representation learning to diversify recommendations for the weakly-engaged user.
The Cold Start: Research Questions
Content + User feedback + Matrix factorization → Collective factorization [Singh and Gordon, KDD 2008]
Item cold start:
1. Can we learn collective representations from content and collaborative information that outperform state-of-the-art item cold-start recommenders? [Saveski and Mantrach, RecSys 2014]
Weakly-engaged user:
2. Can we design collective representations for enriching the weakly-engaged user profile using internal user feedback?
3. Can we use external information – such as query logs – to improve recommendations? [1 patent – submitted to Techpulse 2014]
Recommending in the Long Tail: User Profile Completion
Why it is important:
§ Current recommender systems are only effective for the ~20% of loyal users who have a dense profile.
Why enrich the weakly-engaged user?
§ It improves recommendations for the remaining 80% of users;
§ It encourages weakly-engaged users to become loyal users;
§ It is easy to integrate: we feed the existing system with enriched user profiles and do not need to change existing algorithms;
§ Advertising can benefit from better enriched profiles as well.
Endogenous vs Exogenous Profile Enrichment
A. Endogenous: using implicit feedback
§ We have this information for free for loyal users;
§ We do not need to rely on any external source.
Our solution:
§ Learn embedding spaces designed to reconstruct user profiles, to improve news recommendation.
B. Exogenous: using external sources
§ Many external sources of information are available inside Yahoo; they can be used to enrich user profiles.
Our solution:
§ Use search query logs to enrich user profiles for news recommendation.
The Cold Start: Research Questions
Content + User feedback + Matrix factorization → Collective factorization [Singh and Gordon, KDD 2008]
Item cold start:
1. Can we learn collective representations from content and collaborative information that outperform state-of-the-art item cold-start recommenders? [Saveski and Mantrach, RecSys 2014]
Weakly-engaged user:
2. Can we design collective representations for enriching the weakly-engaged user profile using implicit user feedback?
3. Can we use external information – such as query logs – to improve recommendations? [1 patent – submitted to Techpulse 2014]
2. USING IMPLICIT USER FEEDBACK
User Coverage Against Click Count for News Data Set
[Figure: user coverage (0–1) vs. number of clicks (4–300) for the news data set.]
Collective Representation Learning for User Profile Reconstruction
[Diagram: Xs ≈ W·Hs (#items × #features) and Xu ≈ W·Hu (#items × #users) as before; the user profile matrix Xp = Xuᵀ·Xs (#users × #features) is then reconstructed from the learned factors Hu, W, and Hs.]
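Spelling this out: substituting the two factorizations into Xp = Xuᵀ·Xs gives Xp ≈ Huᵀ·(WᵀW)·Hs. The enrichment step can then be sketched as follows; the factors are random stand-ins here, and the back-off rule for empty profile entries is an illustrative choice, not necessarily the exact CEUP procedure:

```python
import numpy as np

rng = np.random.default_rng(2)
n_items, n_features, n_users, k = 30, 40, 15, 5

Xs = rng.random((n_items, n_features))                       # item content
Xu = (rng.random((n_items, n_users)) < 0.05).astype(float)   # sparse clicks

# Observed user profile: aggregate the features of each user's clicked items.
Xp = Xu.T @ Xs                                               # (#users x #features)

# Stand-ins for factors learned jointly (Xs ~ W @ Hs, Xu ~ W @ Hu).
W = rng.random((n_items, k))
Hs = rng.random((k, n_features))
Hu = rng.random((k, n_users))

# Dense reconstruction: substitute both factorizations into Xp = Xu.T @ Xs.
Xp_hat = Hu.T @ (W.T @ W) @ Hs                               # (#users x #features)

# Enriched profile: keep observed entries, back off to the reconstruction
# where the sparse profile has no signal.
Xp_enriched = np.where(Xp > 0, Xp, Xp_hat)
```

The enriched matrix can be fed to the existing recommender unchanged, which is why the approach is easy to integrate.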
Optimization Problem
User Profile Reconstruction Regularization
[Plot: NDCG (0.25–0.55) vs. regularization weight α (0–0.7), one curve per user click count (clicks = 1 … 5).]
Performance in Terms of Sparsity
[Plot: NDCG (0.1–0.6) vs. number of clicks (1–6) for CEUP-ACLS, CEUP-MU, kNN, and no enrichment.]
The Cold Start: Research Questions
Content + User Feedback + Matrix Factorization → Collective Factorization [Singh and Gordon, KDD 2008]
Item cold start:
1. Can we learn collective representations from content and collaborative information that outperform state-of-the-art item cold-start recommenders? [Saveski and Mantrach, RecSys 2014]
Weakly-engaged user:
2. Can we design collective representations for enriching the weakly-engaged user profile using internal user feedback? [submitted to Techpulse 2014]
3. Can query logs be used as external information to improve recommendations? [1 patent]
3. USING AN EXTERNAL SIGNAL: QUERY LOGS
News Personalization
§ Reading profile (endogenous): aggregated clicked news (implicit feedback) or skipped news;
§ Search profile (exogenous): aggregated queries submitted by the user (explicit feedback).
Search Profiles
§ Motivation: use other sources of available information to improve news personalization.
§ Why search?
› More familiar;
› Explicit user intent.
Query · Titles · Abstracts
[Slides: an example search query, and the titles and abstracts of the results returned for it.]
Coverage
66% of Homerun users on a specific target day also used Search during the last month.
[Bar chart: overlap in page views (%) between Homerun users and other Yahoo! properties (Others, Finance, FrontPage, Mail, News, Search, Sports), shown for unique yuids and unique bcookies.]
Coverage
§ Considering users who clicked at least once on a Homerun recommendation on a target day, how many queries did each of them submit during the last 3 months?
[Histogram: number of users (0–1.8M) vs. number of queries per user during 90 days (log-scale x-axis, 10⁰–10⁴).]
Data Set
§ Users who clicked at least once on a recommended article during a target day;
§ We consider only users who submitted at least 1,000 queries during the last 3 months (~10 queries/day);
→ 70K users with 140K recommendations.
User Query · User QTitles · User QAbstracts
[Slides: a user’s daily search activity represented three ways — the raw queries, the titles of the results (QTitles), and their abstracts (QAbstracts).]
[Diagram: per-day query vectors (User Query 1 … User Query N, days 1–90) are aggregated into a single Query user profile.]
[Diagram: per-day QTitles vectors (days 1–90) are aggregated into a single QTitle user profile.]
[Diagram: per-day QAbstracts vectors (days 1–90) are aggregated into a single QAbstracts user profile.]
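The three aggregation slides describe the same operation: merging 90 days of per-day term vectors into one profile. A minimal bag-of-words sketch (the actual weighting scheme — e.g., TF-IDF or recency decay — is not specified on the slides):

```python
from collections import Counter

def aggregate_profile(daily_docs):
    """Aggregate per-day texts (queries, result titles, or abstracts)
    into a single bag-of-words user profile."""
    profile = Counter()
    for day_text in daily_docs:
        profile.update(day_text.lower().split())
    return profile

# Toy example: three days of one user's queries.
days = ["world cup schedule", "World Cup results", "barcelona weather"]
profile = aggregate_profile(days)
# profile["world"] == 2, profile["cup"] == 2, profile["weather"] == 1
```

The same function serves all three profile types; only the input texts (queries, QTitles, or QAbstracts) change.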
I. Do search profiles help improve the quality of news personalization?
II. What are the important features to be considered in a search profile?
III. How many queries do we need?
Limitation: 400 queries corresponds to a coverage of only ~200K users.
IV. Which period should the historical search information span in order to produce high-quality recommendations?
V. How does the recency of search profiles affect the quality of news personalization?
Status and Limitations
§ The main limitation is coverage:
› the approach scales up to ~200K users.
§ Further work:
› improve coverage;
› complete user profiles by learning collective representations from (1) implicit feedback, (2) query logs, and (3) items’ features.
Discussion: Matrix Factorization and Skip-Gram-Based Representations
§ Tomas Mikolov, Kai Chen, Greg Corrado, and Jeffrey Dean. Efficient Estimation of Word Representations in Vector Space. In Proceedings of the ICLR Workshop, 2013.
§ “We analyze skip-gram with negative-sampling (SGNS), a word embedding method introduced by Mikolov et al., and show that it is implicitly factorizing a word–context matrix, whose cells are the pointwise mutual information (PMI) of the respective word and context pairs (shifted by a global constant).” [Omer Levy and Yoav Goldberg, Neural Word Embeddings as Implicit Matrix Factorization, NIPS 2014]
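The Levy–Goldberg connection can be illustrated concretely: build the shifted positive PMI (SPPMI) word–context matrix from a corpus and factorize it with truncated SVD, which approximates what SGNS with k negative samples optimizes. A toy sketch:

```python
import numpy as np
from collections import Counter

def sppmi_embeddings(tokens, window=2, k=5, dim=2):
    """Shifted positive PMI matrix + SVD, approximating skip-gram with
    negative sampling (Levy & Goldberg, NIPS 2014)."""
    vocab = sorted(set(tokens))
    idx = {w: i for i, w in enumerate(vocab)}
    # Count word-context co-occurrences within a symmetric window.
    pairs = Counter()
    for i, w in enumerate(tokens):
        for j in range(max(0, i - window), min(len(tokens), i + window + 1)):
            if i != j:
                pairs[(idx[w], idx[tokens[j]])] += 1
    n = len(vocab)
    C = np.zeros((n, n))
    for (a, b), c in pairs.items():
        C[a, b] = c
    total = C.sum()
    pw = C.sum(axis=1) / total                  # word marginals
    pc = C.sum(axis=0) / total                  # context marginals
    with np.errstate(divide="ignore"):
        pmi = np.log((C / total) / np.outer(pw, pc))
    sppmi = np.maximum(pmi - np.log(k), 0)      # shift by log k, clip at 0
    U, S, _ = np.linalg.svd(sppmi)
    return vocab, U[:, :dim] * np.sqrt(S[:dim]) # word embeddings

tokens = "the cat sat on the mat the dog sat on the rug".split()
vocab, emb = sppmi_embeddings(tokens)
```

This makes the discussion point concrete: the skip-gram objective and the matrix factorizations used throughout this deck are two views of the same family of techniques.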
Questions? Doubts? Concerns? Queries? Advice? Issues?