

RecMax: Exploiting Recommender Systems for Fun and Profit

RecMax – Recommendation Maximization

Previous research in Recommender Systems mostly focused on improving the accuracy of recommendations.

In this paper we propose a novel problem, RecMax: can we launch a targeted marketing campaign over an existing, operational Recommender System (RS)?

“Select k users such that if they provide high ratings to a new product, then the number of other users to whom the product is recommended by the underlying recommender system (RS) algorithm is maximum.”

Benefits of RecMax

1. Targeted marketing in RS
• Marketers can effectively advertise new products on an RS platform.
• Business opportunity for the RS platform (service provider).
2. Beneficial to seed users
• They get free/discounted samples of a new product.
• Helpful to other users: they receive recommendations of new products, which is a solution to the cold-start problem.

Problem Formulation

The single most important challenge in studying RecMax is the wide diversity of RS algorithms.

•We focus on User-based (with Pearson Correlation as similarity function) and Item-based RS (with Adjusted Cosine similarity function).
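The two similarity functions named above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the toy `ratings` dictionary is hypothetical, and the choice to compute Pearson means over co-rated items only is an assumption (other variants use each user's overall mean).

```python
from math import sqrt

# Hypothetical toy ratings (user -> {item: rating}); not data from the poster.
ratings = {
    "u1": {"a": 5.0, "b": 3.0, "c": 4.0},
    "u2": {"a": 4.0, "b": 2.0, "c": 3.0},
    "u3": {"a": 1.0, "b": 5.0},
}

def pearson(ratings, u, v):
    """Pearson correlation between users u and v over their co-rated items.

    Assumption: means are taken over the co-rated items only.
    """
    common = sorted(set(ratings[u]) & set(ratings[v]))
    if len(common) < 2:
        return 0.0  # too little overlap to correlate
    mu_u = sum(ratings[u][i] for i in common) / len(common)
    mu_v = sum(ratings[v][i] for i in common) / len(common)
    num = sum((ratings[u][i] - mu_u) * (ratings[v][i] - mu_v) for i in common)
    den = sqrt(sum((ratings[u][i] - mu_u) ** 2 for i in common)) * \
          sqrt(sum((ratings[v][i] - mu_v) ** 2 for i in common))
    return num / den if den else 0.0

def adjusted_cosine(ratings, i, j):
    """Adjusted cosine similarity between items i and j.

    Each rating is centered on its user's mean rating before the cosine
    is taken over the users who rated both items.
    """
    means = {u: sum(r.values()) / len(r) for u, r in ratings.items()}
    raters = [u for u, r in ratings.items() if i in r and j in r]
    num = sum((ratings[u][i] - means[u]) * (ratings[u][j] - means[u]) for u in raters)
    den = sqrt(sum((ratings[u][i] - means[u]) ** 2 for u in raters)) * \
          sqrt(sum((ratings[u][j] - means[u]) ** 2 for u in raters))
    return num / den if den else 0.0
```

On the toy data, `pearson(ratings, "u1", "u2")` is close to 1 because the two users rank the items identically, while `adjusted_cosine(ratings, "a", "b")` is close to -1 because every user who rates item a above their own mean rates item b below it.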

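Hit Score: the number of users outside the seed set S to whom the new product is recommended, denoted f(S).

The goal of RecMax is to find a seed set S such that the hit score f(S) is maximized.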

Does Seeding Help?

Hit Rate achieved by a random seed set on the Movielens dataset on User-based (left) and Item-based (right). The plots show that even when the seed sets are selected randomly, seeding does help and exhibits impressive gains in hit score.

Conclusion and Future Work

• We propose a new problem, RecMax, which has real-world applications. We show that RecMax makes marketing sense, even though it is NP-hard to approximate.

• Developing more effective heuristics is interesting and challenging.

• Studying RecMax on more sophisticated recommender systems, such as Matrix Factorization, would be exciting.

Amit Goyal

University of British Columbia

Laks V. S. Lakshmanan

University of British Columbia

Paper ID: 727

Recommendation List for user v (l recommendations)

Recommendations     Expected Rating
Harry Potter        4.8
American Pie        4.3
…                   …
The Dark Knight     1.2

The rating threshold of user v is denoted by θv. If the expected rating R(v,i) > θv, then the new item is recommended to v.
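The recommendation rule and the hit score can be sketched together for the User-based case. This is only an illustration under stated assumptions: the mean-centered, similarity-weighted prediction formula is a common choice that the poster does not spell out, and the users, similarities, means, thresholds and the seed rating of 5.0 are all hypothetical.

```python
# Hypothetical toy inputs; none of these values come from the poster.
users = ["a", "b", "c", "d"]
mean = {"a": 4.0, "b": 3.5, "c": 2.5, "d": 4.0}    # average rating per user
theta = {"a": 4.0, "b": 4.0, "c": 3.0, "d": 4.5}   # rating threshold per user
pairs = {("a", "b"): 0.9, ("a", "c"): 0.1, ("a", "d"): -0.5,
         ("b", "c"): 0.4, ("b", "d"): 0.0, ("c", "d"): 0.2}

# Build a symmetric similarity lookup sim[u][v].
sim = {u: {} for u in users}
for (u, v), s in pairs.items():
    sim[u][v] = s
    sim[v][u] = s

def predict_rating(v, seed_ratings, sim, mean):
    """Expected rating R(v, i) of the new item i for user v.

    Assumed formula: v's mean plus the similarity-weighted average of the
    seed users' mean-centered ratings. Returns None when no seed carries
    any similarity signal for v.
    """
    num = sum(sim[v][u] * (r - mean[u]) for u, r in seed_ratings.items())
    den = sum(abs(sim[v][u]) for u in seed_ratings)
    if den == 0:
        return None
    return mean[v] + num / den

def hit_score(seeds, users, sim, mean, theta, seed_rating=5.0):
    """f(S): number of non-seed users v with R(v, i) > theta[v]."""
    seed_ratings = {u: seed_rating for u in seeds}
    hits = 0
    for v in users:
        if v in seeds:
            continue  # seed users are not counted as hits
        predicted = predict_rating(v, seed_ratings, sim, mean)
        if predicted is not None and predicted > theta[v]:
            hits += 1
    return hits
```

With these toy numbers, seeding user a alone pushes the prediction above the threshold for b and c but not for d, whose similarity to a is negative. Predictions are not clamped to the rating scale in this sketch.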

Key Theoretical Results

• RecMax is NP-hard to solve exactly.

• RecMax is NP-hard to approximate within a factor of 1/|V|^(1-ε) for any ε > 0.

• It is as hard as the Maximum Independent Set problem.

Idea behind the reduction: in order to achieve a hit score of 2, {B,C,E} must rate the new product.
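For readers unfamiliar with Maximum Independent Set, the problem that the hardness result refers to, here is a brute-force sketch on a five-node graph. The edge list is hypothetical (the edges of the poster's figure are not recoverable from the text); it is chosen only so that {A, D} is the unique maximum independent set. The search is exponential in the number of nodes and is meant for tiny examples only.

```python
from itertools import combinations

# Hypothetical graph: every pair of nodes is connected except A and D.
nodes = ["A", "B", "C", "D", "E"]
edges = {frozenset(p) for p in combinations(nodes, 2)} - {frozenset(("A", "D"))}

def is_independent(candidate, edges):
    """True if no two nodes in the candidate set share an edge."""
    return all(frozenset(p) not in edges for p in combinations(candidate, 2))

def maximum_independent_set(nodes, edges):
    """Exhaustive search from the largest subset size downward."""
    for size in range(len(nodes), 0, -1):
        for candidate in combinations(sorted(nodes), size):
            if is_independent(candidate, edges):
                return set(candidate)
    return set()
```

On this hypothetical graph the search returns {A, D}, and the remaining nodes {B, C, E} are exactly the neighbours that surround it.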

Datasets

Heuristics

Random: the seed set is selected randomly.

Most-Active: select the top-k users with the largest number of ratings.

Most-Positive: select the top-k users with the most positive average ratings.

Most-Critical: select the top-k users with the most critical average ratings.

Most-Central: select the top-k central users, that is, those with the highest aggregate similarity scores.
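The five heuristics above can be sketched as simple ranking functions. The interpretations are assumptions: "most critical" is read as lowest average rating, and "aggregate similarity" as the sum of a user's similarities to all other users. The toy `ratings` and `sim` values and the tie-breaking by user id are hypothetical.

```python
import random

# Hypothetical toy data; not taken from the poster's datasets.
ratings = {
    "u1": {"a": 5, "b": 4, "c": 5},
    "u2": {"a": 2, "b": 1},
    "u3": {"a": 4},
    "u4": {"a": 3, "b": 3, "c": 3, "d": 3},
}
sim = {
    "u1": {"u2": -0.2, "u3": 0.8, "u4": 0.5},
    "u2": {"u1": -0.2, "u3": -0.1, "u4": 0.3},
    "u3": {"u1": 0.8, "u2": -0.1, "u4": 0.2},
    "u4": {"u1": 0.5, "u2": 0.3, "u3": 0.2},
}

def top_k(scores, k):
    """Users with the k highest scores; ties broken by user id."""
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [u for u, _ in ranked[:k]]

def average(r):
    return sum(r.values()) / len(r)

def random_seeds(ratings, k, rng_seed=0):
    """Random: k users sampled uniformly without replacement."""
    return random.Random(rng_seed).sample(sorted(ratings), k)

def most_active(ratings, k):
    """Most-Active: users with the largest number of ratings."""
    return top_k({u: len(r) for u, r in ratings.items()}, k)

def most_positive(ratings, k):
    """Most-Positive: users with the highest average rating."""
    return top_k({u: average(r) for u, r in ratings.items()}, k)

def most_critical(ratings, k):
    """Most-Critical: users with the lowest average rating (assumed reading)."""
    return top_k({u: -average(r) for u, r in ratings.items()}, k)

def most_central(sim, k):
    """Most-Central: users with the highest summed similarity to others."""
    return top_k({u: sum(nbrs.values()) for u, nbrs in sim.items()}, k)
```

Each function returns a list of k user ids, so any of them can be passed directly as the seed set to a hit score computation.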

Experiments

            Movielens   Yahoo! Music   Jester Joke
#Users      6040        10K            25K
#Items      3706        5069           100
#Ratings    1M          1M             1.8M

Figure (reduction example on nodes A, B, C, D, E): nodes {A,D} form the Maximum Independent Set; nodes {B,C,E} encircle {A,D} (Maximum Encirclement Problem).

Hit Score Variation on User-based:
• Follows an S-curve.
• Most-Central and Most-Positive perform well; Most-Central wins overall.
• Most-Active and Most-Critical perform poorly on all datasets.

Hit Score Variation on Item-based:
• The tipping point is reached very early.
• Less seeding is required for convergence.
• Most-Central performs better overall.

RecMax on User-based vs Item-based
• The initial rise is much steeper in Item-based.
• The eventual gain is much higher in User-based.
• Out of 1000 seeds, the number of common seeds is 103, 219 and 62 on the three datasets, so the seed sets differ between the two RS algorithms.