Recommendation Systems Alpa Jain Twitter Revenue 09.10.012

Recommendation Systems - blogs.ischoolblogs.ischool.berkeley.edu/.../2012/08/alpa_twitter_recommenders1.p… · The+Chronicles+of+Narnia(N)+ 1.0 0.3 Star+Wars+(S)+ 0.2 0.8. Computing




Outline

•  What are recommendation systems?

•  How to build a recommendation system?

•  What challenges do we need to further address?

•  Twitter promoted products – a real-life recommendation system.


A recommendation system provides information or items that are likely to be of interest to a user, in an automated fashion.


Daily examples of recommendation systems


Snapshot from: http://itunes.apple.com/us/app/netflix/id363590051?mt=8


Promoted tweets in timeline


Promoted tweets in search results


Why do we need recommendation systems?


Need for Recommendation Systems

•  Solution to large amounts of good data

•  Reduce cognitive load on users

•  Introduce quality control


Overview of Recommendation Systems

[Diagram: a four-stage pipeline connecting users and items]

•  Candidate generation: automatically identify items of interest to users (focus of talk)

•  Filtering: near duplicates, already seen, dismissed

•  Rank: order recommendations (temporal, diversity, personalization)

•  User feedback: track dislike, click, purchase
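The stages above can be sketched as a small pipeline. This is only an illustration under stated assumptions: the `Candidate` type, the `recommend` function, and the use of exact id matches as a stand-in for near-duplicate detection are hypothetical and not taken from the talk.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    item_id: str
    score: float  # relevance score assumed to come from candidate generation


def recommend(candidates, seen, dismissed, limit=3):
    """Filter out seen, dismissed, and repeated items, then rank by score."""
    kept = []
    emitted = set()
    for c in candidates:
        # Filtering stage: skip items the user saw or dismissed, and skip
        # repeats (exact id match stands in for near-duplicate detection).
        if c.item_id in seen or c.item_id in dismissed or c.item_id in emitted:
            continue
        emitted.add(c.item_id)
        kept.append(c)
    # Ranking stage: highest score first.
    ranked = sorted(kept, key=lambda c: c.score, reverse=True)
    return [c.item_id for c in ranked[:limit]]


candidates = [
    Candidate("a", 0.9),
    Candidate("b", 0.5),
    Candidate("a", 0.4),  # repeat of "a", dropped by the filter
    Candidate("c", 0.7),
    Candidate("d", 0.8),
]
result = recommend(candidates, seen={"b"}, dismissed={"d"})
print(result)
```

User feedback (dislike, click, purchase) would feed back into `seen`, `dismissed`, and the scores on the next request.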


Recommendation Algorithms

•  Collaborative filtering (CF)

•  Hypothesis: Similar users tend to like similar items

•  Two forms of CF
–  Item-based collaborative filtering
–  User-based collaborative filtering

•  Data collection methods
–  Explicit feedback. Example: ratings, dismiss
–  Implicit feedback. Example: number of views, purchases


Data Representation

•  Items: i1, i2, i3, …, in

•  User u1 has provided ratings on items

        I1   I2   I3   I4   I5
U1      1    -    5    -    2
U2      4    2    -    -    5
U3      1    2    4    4    5
…
Um      -    3    -    1    5


User-Item Ratings Matrix

[Figure: sparse user-item ratings matrix; axes labeled users and items]


Example of User / Movie Ratings Matrix

                              Alice   Bob   Charlie   Dave
Harry Potter …                3       5     2         3
The Shawshank Redemption      4       4     2         -
Rabbit-Proof Fence            5       1     -         -
American Pie                  -       1     1         5

•  The Netflix challenge
–  20 thousand movies
–  500 thousand users
–  100 million ratings

•  Training set of 99 million ratings
–  RMSE < 0.8572 on a test set of 1 million ratings
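RMSE (root mean squared error) is the metric named in the challenge target above. A minimal sketch of how it is computed; the function name and the sample predictions are made up for illustration.

```python
import math


def rmse(predicted, actual):
    """Root mean squared error between predicted and actual ratings."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical predictions against four held-out ratings.
error = rmse([3.5, 4.0, 2.0, 5.0], [3, 5, 2, 4])
print(error)  # 0.75
```

Because errors are squared before averaging, one rating that is off by a full star costs as much as four ratings that are each off by half a star.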


A Naïve Recommendation System

1. Aggregate ratings for each item

2. Recommend item with maximum rating

$\mathrm{score}(i, u) = f(i) = \sum_{u \in \mathrm{users}} \mathrm{rating}(u, i)$

•  Does everybody like Harry Potter movies?

Historical information about users is important!  
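A minimal sketch of the naïve scheme, using the ratings from the user / movie example matrix (unrated cells are simply omitted). The function names are illustrative.

```python
ratings = {
    "Harry Potter": {"Alice": 3, "Bob": 5, "Charlie": 2, "Dave": 3},
    "The Shawshank Redemption": {"Alice": 4, "Bob": 4, "Charlie": 2},
    "Rabbit-Proof Fence": {"Alice": 5, "Bob": 1},
    "American Pie": {"Bob": 1, "Charlie": 1, "Dave": 5},
}


def naive_scores(ratings):
    """score(i, u) = f(i): sum of all ratings for item i, the same for every user."""
    return {item: sum(by_user.values()) for item, by_user in ratings.items()}


def naive_recommend(ratings):
    """Recommend the single item with the maximum aggregate rating."""
    scores = naive_scores(ratings)
    return max(scores, key=scores.get)


print(naive_scores(ratings))
print(naive_recommend(ratings))  # Harry Potter
```

Every user gets "Harry Potter", including Alice, who rated it 3 but gave "Rabbit-Proof Fence" a 5. The score ignores who is asking.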


Item-based Collaborative Filtering

•  Predict a user’s rating for an item i based on the user’s ratings for other items

•  Given a user u with I(u) preferred items:

$\mathrm{score}(i, u) = \sum_{j \in I(u)} \mathrm{rating}(u, j) \cdot \mathrm{sim}(i, j)$

–  sim(i, j): similarity between items i and j
–  rating(u, j): rating provided by user u for item j


Example: Item-based CF

•  Given a user with ratings for items x and y (here Harry Potter and The Matrix)

          Harry Potter   The Matrix
rating    0.8            0.3

•  Items a and b with similarities to x and y (here N and S)

item                              Harry Potter   The Matrix
The Chronicles of Narnia (N)      1.0            0.3
Star Wars (S)                     0.2            0.8

score(u, N) = 1.0 * 0.8 + 0.3 * 0.3 = 0.89
score(u, S) = 0.2 * 0.8 + 0.8 * 0.3 = 0.40
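The same arithmetic as a short sketch of the item-based score, the sum over rated items j of rating(u, j) times sim(i, j). The dictionary layout and function name are illustrative choices.

```python
def item_based_score(user_ratings, similarities):
    """Sum of rating(u, j) * sim(i, j) over the items j the user has rated."""
    return sum(rating * similarities[j] for j, rating in user_ratings.items())


# The user's ratings from the example.
user_ratings = {"Harry Potter": 0.8, "The Matrix": 0.3}

# Similarity of each candidate item to the items the user rated.
sim = {
    "The Chronicles of Narnia": {"Harry Potter": 1.0, "The Matrix": 0.3},
    "Star Wars": {"Harry Potter": 0.2, "The Matrix": 0.8},
}

scores = {item: item_based_score(user_ratings, s) for item, s in sim.items()}
for item, score in scores.items():
    print(f"{item}: {score:.2f}")
```

The candidate with the higher score (The Chronicles of Narnia) would be recommended first.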


Computing Similarity between Items

•  Cosine similarity
–  Items are represented as u-dimensional vectors over user space
–  Similarity is cosine of the angle between two vectors
–  Score ranges between 1 (perfect) and -1 (opposite)

Example: 2 users

      u1     u2
A     0.8    0.45
B     0.4    0.8
C     0.3    0.3

[Figure: items A, B, and C plotted as points; axes U1 and U2, each from 0 to 1]
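A small sketch of cosine similarity applied to the three example items A, B, and C in the two-user space. The helper name `cosine` is an illustrative choice.

```python
import math


def cosine(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # convention chosen here for all-zero vectors
    return dot / (norm_a * norm_b)


# Each item is a vector of ratings over users (u1, u2).
items = {"A": (0.8, 0.45), "B": (0.4, 0.8), "C": (0.3, 0.3)}

for first, second in [("A", "B"), ("A", "C"), ("B", "C")]:
    print(first, second, round(cosine(items[first], items[second]), 3))
```

Cosine looks only at direction, so C is close to both A and B even though its ratings are much smaller in magnitude.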


Similarity Measures – contd.

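•  Cosine similarity:

$\mathrm{sim}(i, j) = \frac{I \cdot J}{\|I\| \cdot \|J\|}$

•  Magnitude-aware measure:

$\mathrm{sim}(i, j) = \frac{|U(i) \cap U(j)|}{|U(i)| \cdot |U(j)|}$

•  Jaccard, Pearson correlation …

Here I and J are the vectors for items i and j, and U(i) is taken to be the set of users who rated item i.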

Harry Potter movie problem?
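A sketch of two set-based measures, assuming U(i) is the set of users who rated item i. The user sets are invented for illustration, and the function names are not from the slides.

```python
def overlap_measure(users_i, users_j):
    """|U(i) ∩ U(j)| / (|U(i)| * |U(j)|), following the set-based formula above."""
    if not users_i or not users_j:
        return 0.0
    return len(users_i & users_j) / (len(users_i) * len(users_j))


def jaccard(users_i, users_j):
    """|U(i) ∩ U(j)| / |U(i) ∪ U(j)|."""
    union = users_i | users_j
    if not union:
        return 0.0
    return len(users_i & users_j) / len(union)


# Hypothetical sets of users who rated each item.
harry_potter = {"u1", "u2", "u3", "u4"}  # a very popular item
narnia = {"u1", "u2"}
american_pie = {"u3"}

print(overlap_measure(harry_potter, narnia), jaccard(harry_potter, narnia))
print(overlap_measure(harry_potter, american_pie), jaccard(harry_potter, american_pie))
```

In this toy data the popular item overlaps with every other item, which is one way to read the question on the slide: popularity alone can make unrelated items look similar.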


User-based Collaborative Filtering

•  K-nearest neighbors (KNN)
–  Cluster users after representing them as feature vectors

[Diagram: users, clusters, items]
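A minimal sketch of the nearest-neighbor idea: represent each user as a rating vector, find the k most similar users, and predict a rating from theirs. The user vectors, the function names, and the similarity-weighted average are assumptions made for illustration; the slides do not specify a prediction rule.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length rating vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def nearest_users(target, users, k):
    """Return the k (name, similarity) pairs most similar to `target`."""
    others = [
        (name, cosine(users[target], vec))
        for name, vec in users.items()
        if name != target
    ]
    others.sort(key=lambda pair: pair[1], reverse=True)
    return others[:k]


def predict(target, item_index, users, k):
    """Similarity-weighted average of the neighbours' ratings for one item."""
    neighbours = nearest_users(target, users, k)
    total = sum(similarity for _, similarity in neighbours)
    if total == 0:
        return 0.0
    return sum(s * users[name][item_index] for name, s in neighbours) / total


# Hypothetical users, each a vector of ratings over four items (0 = unrated).
users = {
    "u1": [5, 4, 0, 1],
    "u2": [4, 5, 2, 0],
    "u3": [0, 1, 5, 4],
    "u4": [5, 5, 0, 0],
}

print(nearest_users("u4", users, k=2))
print(predict("u4", item_index=3, users=users, k=2))
```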


Challenges in Building Recommender Systems


Challenges and Interesting Problems

•  Data sparsity
–  Users rarely purchase, rate, or click

•  The more you see the less you know
–  Increasing users or items increases the dimensions we need to learn

•  Cold-start problem
–  No historical information for new users or items

•  Correlation between nearest neighbors
–  Harry Potter sequels

•  Scalability and recommendation accuracy are not production-friendly


Dimensionality Reduction

•  Say every user who likes “Harry Potter” also likes “The Chronicles of Narnia”

•  Generalize movies into generic latent semantic characteristics
–  {fantasy, novel-based movies, …}

•  Reduces dimensions to track and improves scalability

•  Reduces data sparsity and improves prediction accuracy


Singular Value Decomposition

Singular value decomposition takes an (m x n) matrix M and produces three matrices:

•  U: an (m x m) orthogonal matrix
•  S: an (m x n) diagonal matrix with non-negative numbers (the singular values)
•  V: an (n x n) orthogonal matrix

$M = U \cdot S \cdot V^T$


User Movie Rating Matrix

     Alice   Bob   Charlie   Dave   Ed
A    5       2     1         1      4
B    0       1     1         0      3
C    3       4     1         0      2
D    4       0     0         0      3
E    3       0     2         5      4
F    2       5     1         5      1

The matrix has 6 rows and 5 columns, denoted as an M (6 x 5) matrix.
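One way to decompose this matrix numerically is NumPy's `numpy.linalg.svd`. The sketch below assumes NumPy is available and only checks that the three factors multiply back to M; the signs of the singular vectors may differ between implementations.

```python
import numpy as np

# Rows are movies A..F, columns are users Alice, Bob, Charlie, Dave, Ed.
M = np.array(
    [
        [5, 2, 1, 1, 4],
        [0, 1, 1, 0, 3],
        [3, 4, 1, 0, 2],
        [4, 0, 0, 0, 3],
        [3, 0, 2, 5, 4],
        [2, 5, 1, 5, 1],
    ],
    dtype=float,
)

# full_matrices=True gives U as (6 x 6) and Vt as (5 x 5).
U, s, Vt = np.linalg.svd(M, full_matrices=True)

# Place the singular values on the diagonal of a (6 x 5) matrix S.
S = np.zeros(M.shape)
S[: s.size, : s.size] = np.diag(s)

reconstructed = U @ S @ Vt
print(np.round(s, 2))                # singular values, largest first
print(np.allclose(reconstructed, M)) # True when M = U S V^T holds
```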


Computing SVD


Matrix Decomposition

$M_{(m \times n)} = U_{(m \times m)} \cdot S_{(m \times n)} \cdot V^T_{(n \times n)}$


Apply SVD – example contd.

•  SVD collapses the matrix to a smaller matrix retaining important features

•  Pick k dimensions and truncate the matrices (here k = 2)

U (6 x 2)
-0.59    0.37
-0.17    0.13
-0.36    0.016
-0.31    0.51
-0.5    -0.03
-0.47   -0.75

S (2 x 2)
12.65    0
0        5.77

VT (2 x 5)
-0.59   -0.39   -0.20   -0.42   -0.53
 0.40    0.49    0.53    0.62    0.44

We have reduced our dataset to a 2-dimensional space!

[Figure: users plotted in the 2-dimensional space; labeled points include Alice and Ed]


Finding Recommendations

•  A user X comes in with some ratings in the original feature space: [0, 0, 2, 0, 4, 1]

•  Map it to a k-dimensional vector, where $U_k$ and $S_k$ are the truncated U and S:

$X_k = X^T \cdot U_k \cdot S_k^{-1}$

•  X → [-0.25, -0.15]

•  Note this is similar to the problem we solved earlier using cosine similarity
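The mapping can be checked by hand with the truncated U (6 x 2) and S (2 x 2) values shown on the earlier slide. A short sketch in plain Python; the function name `fold_in` is an illustrative choice.

```python
# Truncated U (one row per movie A..F) and the two singular values from the slides.
U_k = [
    (-0.59, 0.37),
    (-0.17, 0.13),
    (-0.36, 0.016),
    (-0.31, 0.51),
    (-0.5, -0.03),
    (-0.47, -0.75),
]
S_k = (12.65, 5.77)


def fold_in(x, U_k, S_k):
    """Compute x^T * U_k * S_k^{-1}: project ratings into the k-dimensional space."""
    projected = []
    for d in range(len(S_k)):
        dot = sum(x_i * row[d] for x_i, row in zip(x, U_k))
        projected.append(dot / S_k[d])  # S_k is diagonal, so its inverse divides
    return projected


x = [0, 0, 2, 0, 4, 1]  # ratings of the new user X over movies A..F
x_k = fold_in(x, U_k, S_k)
print([round(v, 2) for v in x_k])
```

With the rounded slide values this gives roughly -0.25 and -0.15, matching the vector quoted for X.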


SVD : Putting it All Together

•  Represent datasets from multiple users and items as a matrix

•  Apply SVD and pick k dimensions to reduce our datasets

•  Map new users into this low k-dimensional space

•  Compute similarity between users in this space

•  Provide recommendations based on similar users
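The five steps can be strung together in a short NumPy sketch. It assumes NumPy is available, reuses the 6 x 5 movie rating matrix and the new user X from the earlier slides, and uses cosine similarity in the reduced space; the helper names and the final "movies the neighbour rated that X has not" rule are assumptions made for illustration.

```python
import numpy as np

movies = ["A", "B", "C", "D", "E", "F"]
users = ["Alice", "Bob", "Charlie", "Dave", "Ed"]
M = np.array(
    [
        [5, 2, 1, 1, 4],
        [0, 1, 1, 0, 3],
        [3, 4, 1, 0, 2],
        [4, 0, 0, 0, 3],
        [3, 0, 2, 5, 4],
        [2, 5, 1, 5, 1],
    ],
    dtype=float,
)


def truncated_svd(M, k):
    """Keep the k largest singular values and the matching singular vectors."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return U[:, :k], s[:k], Vt[:k, :].T  # V_k has one row per user


def fold_in(x, U_k, s_k):
    """Map a rating vector over movies into the k-dimensional space."""
    return (x @ U_k) / s_k


def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


U_k, s_k, V_k = truncated_svd(M, k=2)
x = np.array([0, 0, 2, 0, 4, 1], dtype=float)
x_k = fold_in(x, U_k, s_k)

# Similarity between the new user and every existing user in the reduced space.
sims = {name: cosine(x_k, V_k[j]) for j, name in enumerate(users)}
best = max(sims, key=sims.get)

# Suggest movies the most similar user rated that X has not rated yet.
column = M[:, users.index(best)]
suggestions = [m for m, mine, theirs in zip(movies, x, column) if mine == 0 and theirs > 0]
print(best, suggestions)
```

Singular vectors are only defined up to sign, so the coordinates of `x_k` may be flipped relative to the slides; the cosine similarities are unaffected because `x_k` and `V_k` flip together.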


Computing SVD - Partial

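$A = \begin{pmatrix} 8 & 0 \\ -2 & 1 \end{pmatrix}, \quad A^T = \begin{pmatrix} 8 & -2 \\ 0 & 1 \end{pmatrix}$

$A^T \cdot A = \begin{pmatrix} 68 & -2 \\ -2 & 1 \end{pmatrix}$

For example, the top-left entry is 8 * 8 + (-2) * (-2) = 68.

$A^T \cdot A - cI = \begin{pmatrix} 68 - c & -2 \\ -2 & 1 - c \end{pmatrix}$

$|A^T \cdot A - cI| = (68 - c)(1 - c) - (-2)(-2) = c^2 - 69c + 64 = 0$

Solution: $c_1 \approx 68.06$, $c_2 \approx 0.94$

The singular values are the square roots of these eigenvalues:

$S = \begin{pmatrix} \sqrt{c_1} & 0 \\ 0 & \sqrt{c_2} \end{pmatrix} \approx \begin{pmatrix} 8.25 & 0 \\ 0 & 0.97 \end{pmatrix}$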
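A numerical cross-check of this procedure for the 2 x 2 matrix A = [[8, 0], [-2, 1]]: the singular values are the square roots of the eigenvalues of A^T A, which for a 2 x 2 symmetric matrix follow from the quadratic formula. The function name is an illustrative choice.

```python
import math


def singular_values_2x2(a):
    """Singular values of a 2 x 2 matrix via the eigenvalues of A^T A."""
    (p, q), (r, s) = a
    # Entries of the symmetric matrix A^T A.
    m11 = p * p + r * r
    m12 = p * q + r * s
    m22 = q * q + s * s
    # Eigenvalues c solve c^2 - trace * c + det = 0.
    trace = m11 + m22
    det = m11 * m22 - m12 * m12
    root = math.sqrt(trace * trace - 4 * det)
    c1 = (trace + root) / 2
    c2 = (trace - root) / 2
    return math.sqrt(c1), math.sqrt(max(c2, 0.0))


s1, s2 = singular_values_2x2([[8, 0], [-2, 1]])
print(round(s1, 2), round(s2, 2))
```

A quick sanity check is that the product of the singular values equals the absolute value of the determinant of A, here 8 * 1 - 0 * (-2) = 8.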


http://abeautifulwww.com/wp-content/uploads/2007/04/netflixAllMovies-blackBack3[5].jpg


Cold start problem : unknown users


The Cold Start Problem

A brand-new user or item is introduced into a recommendation system, which cannot draw any inferences about it due to the lack of historical data.

•  Ratings or other metrics may contain biases
–  Harry Potter 1 is more popular than Harry Potter 2
–  Ads or documents often suffer from this problem
–  First sight, fatigue, seasonality, …


Explore – Exploit Strategy

•  Recommend items at random to a small, randomly chosen set of users

•  Ensures user information on each item

•  Full randomization may not be possible

•  Recommender systems need to choose between
–  Exploiting a model to improve quality
–  Exploring a new item to reduce uncertainty

•  Active research area

Explore / Exploit Schemes for Web Content Optimization, ICDM 2009
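One simple way to trade off the two options is an epsilon-greedy rule: explore a random item with a small probability, otherwise exploit the item with the best observed click-through rate. This is a generic textbook scheme used here for illustration, not the method of the cited paper; the item statistics and names are hypothetical.

```python
import random


def choose_item(stats, epsilon, rng):
    """Pick an item given stats mapping item -> (clicks, impressions)."""
    items = sorted(stats)  # fixed order keeps the choice reproducible
    if rng.random() < epsilon:
        # Explore: show a random item to reduce uncertainty about it.
        return rng.choice(items)

    def click_rate(item):
        clicks, impressions = stats[item]
        return clicks / impressions if impressions else 0.0

    # Exploit: show the item with the best observed click-through rate.
    return max(items, key=click_rate)


stats = {"a": (5, 100), "b": (12, 100), "new": (0, 0)}
rng = random.Random(0)
picks = [choose_item(stats, epsilon=0.1, rng=rng) for _ in range(1000)]
print({item: picks.count(item) for item in stats})
```

With pure exploitation the brand-new item would never be shown, which is the cold start problem again; the exploration share is what gives it a chance to collect feedback.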


Twitter Promoted Products


Promoted Accounts

•  Suggest accounts that you do not follow and may find interesting

•  Advertisers increase their followers’ reach


Promoted Tweets in Timeline or Search

Users may expand, retweet, click, favorite, dismiss, …


Further Reading

1.  Edwin’s blog: http://blog.echen.me/2011/10/24/winning-the-netflix-prize-a-summary/

2.  Netflix challenge: http://www.technologyreview.com/news/406637/the-1-million-netflix-challenge/

3.  Collaborative Filtering: A Tutorial, Carnegie Mellon University

4.  Content-Boosted Collaborative Filtering for Improved Recommendations

5.  Supervised Random Walks: Predicting and Recommending Links in Social Networks, WSDM 2011

6.  The Link Prediction Problem for Social Networks, CIKM 2003

7.  Explore/Exploit Schemes for Web Content Optimization, ICDM 2009

8.  Multi-armed bandit problem: http://en.wikipedia.org/wiki/Multi-armed_bandit


•  Questions?