Manipulating the Capacity of Recommendation Models in Recall-Coverage Optimization Ing. Tomáš ŘEHOŘEK

Manipulating the Capacity Ing. Tomáš ŘEHOŘEK of ...ai.ms.mff.cuni.cz/~sui/rehorek.pdf · Wide & Deep Recommendation – optimizing AUC. Contributions of This Thesis Capacity manipulating

Manipulating the Capacity of Recommendation Models in Recall-Coverage Optimization

Ing. Tomáš ŘEHOŘEK

Motivation

● Recommender Systems = actively studied topic in the last decades
  ○ Large number of novel algorithms continuously published
● Confusion in:
  ○ Evaluating and comparing the models
    ■ Netflix Prize = RMSE
    ■ Recall, Precision, F-Measure
  ○ Tasks solved by the models
  ○ Long-tail vs. popular items recommendation
● Missing systematic framework for long-tail recommendations
  ○ Novelty and Serendipity of the models
    ■ Evaluation
    ■ Control

State of the Art

● Before 2006: GroupLens era
  ○ Badrul Sarwar: ItemKnn, Association Rules, SVD
  ○ Optimizing MAE, NMAE
● 2006-2014: Netflix Prize, Matrix Factorization era
  ○ $1M Prize, 40K teams from 186 countries
  ○ Optimizing RMSE
  ○ Robert Bell, Yehuda Koren – Top-N recommendation
  ○ Harald Steck [100,101] – MNAR, popularity-stratified recall
● 2015-2019: Deep learning approaches
  ○ Collaborative Deep Learning – optimizing Recall@N, mAP
  ○ AutoRec – RMSE
  ○ Wide & Deep Recommendation – optimizing AUC

Contributions of This Thesis

● Capacity-manipulating hyperparameters for different learning algorithms
  ○ Showing shared nature of most recommendation models
  ○ Generalizing large number of published algorithms under a common framework
● Popularity-biasing coefficient β as universal capacity hyperparameter
  ○ Thanks to putting models under common score(u,i) framework
  ○ Generalizing approach already proposed in [101] to full scale of models
● Recall-Coverage Optimization (RCO)
  ○ Searching for Pareto-Optimal states while manipulating the capacity
  ○ Using Recall@N coupled with Catalog Coverage to optimize models

Formal Framework for Hyperparametrized Learning

[Diagram labels: Learning Algorithm, Hyperparametrization, Training Dataset, Recommendation Model; tasks: Rating Prediction, Binary Classification, Top-N Recommendation, Learning to Rank]

Solving Recommendation Tasks Using score(u,i)

[Panels: Rating Prediction, Binary Classification, Top-N Recommendation, Learning to Rank]
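
The formulas on this slide did not survive extraction. As a rough sketch of the idea in the slide title, the snippet below shows how a single score(u,i) function could serve all four tasks. The function names, the clipping range for ratings, and the classification threshold are assumptions made for illustration, not the thesis's definitions.

```python
from typing import Callable, Hashable, List, Sequence

# Any model is treated as a function score(user, item) -> float.
ScoreFn = Callable[[Hashable, Hashable], float]


def rank_items(score: ScoreFn, user: Hashable, candidates: Sequence[Hashable]) -> List[Hashable]:
    """Learning to rank: order candidates by descending score (ties keep input order)."""
    return sorted(candidates, key=lambda item: score(user, item), reverse=True)


def top_n(score: ScoreFn, user: Hashable, candidates: Sequence[Hashable], n: int) -> List[Hashable]:
    """Top-N recommendation: the first n items of the ranking."""
    return rank_items(score, user, candidates)[:n]


def predict_rating(score: ScoreFn, user: Hashable, item: Hashable,
                   lo: float = 1.0, hi: float = 5.0) -> float:
    """Rating prediction: read the score as a rating, clipped to an assumed 1-5 scale."""
    return min(hi, max(lo, score(user, item)))


def classify(score: ScoreFn, user: Hashable, item: Hashable, threshold: float = 0.5) -> bool:
    """Binary classification: compare the score with an assumed threshold."""
    return score(user, item) >= threshold


# Hypothetical toy scores, for illustration only.
toy_scores = {("u", "a"): 0.2, ("u", "b"): 0.9, ("u", "c"): 0.5}


def toy_score(user: Hashable, item: Hashable) -> float:
    return toy_scores[(user, item)]
```

With the toy scores above, `top_n(toy_score, "u", ["a", "b", "c"], 2)` ranks item `b` first and item `c` second.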

Existing Recommendation Models as score(u,i)

Popularity (“Bestseller”) Model
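
A minimal sketch of a popularity model expressed as score(u,i), assuming the score of an item is simply the number of distinct users who interacted with it. The slide's own formula is not recoverable from this text, and `fit_popularity` is a hypothetical name.

```python
from collections import Counter
from typing import Callable, Hashable, Iterable, Tuple


def fit_popularity(interactions: Iterable[Tuple[Hashable, Hashable]]) -> Callable[[Hashable, Hashable], float]:
    """Build score(user, item) from (user, item) pairs; the user argument is ignored."""
    # Deduplicate pairs so repeated interactions by one user count once (an assumption).
    counts = Counter(item for _, item in set(interactions))

    def score(user: Hashable, item: Hashable) -> float:
        return float(counts.get(item, 0))

    return score
```

Because the score does not depend on the user, every user receives the same ranking, which is why this model is a natural low-coverage baseline.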

Existing Recommendation Models as score(u,i)

UserKnn Model

Unweighted

Weighted

With Non-Normalized Neighborhood

Using Cosine Similarity

Using Pearson Correlation Coefficient

Ochiai for Binary Matrices
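
The variants named on this slide can be sketched in one function. This is an illustrative implementation under stated assumptions, not the thesis's exact formulas: ratings are stored as a dict of dicts, missing ratings count as 0, and the `weighted` and `normalize` flags are hypothetical names for the weighted/unweighted and normalized/non-normalized variants. For binary vectors, the cosine below reduces to the Ochiai coefficient |A∩B| / sqrt(|A|·|B|).

```python
import math
from typing import Dict, Hashable

Ratings = Dict[Hashable, Dict[Hashable, float]]


def cosine(a: Dict[Hashable, float], b: Dict[Hashable, float]) -> float:
    """Cosine similarity of two sparse rating vectors."""
    dot = sum(r * b[i] for i, r in a.items() if i in b)
    norm_a = math.sqrt(sum(r * r for r in a.values()))
    norm_b = math.sqrt(sum(r * r for r in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def user_knn_score(ratings: Ratings, user: Hashable, item: Hashable, k: int,
                   weighted: bool = True, normalize: bool = True) -> float:
    """score(u,i) from the k users most similar to `user`."""
    sims = [(cosine(ratings[user], r), v) for v, r in ratings.items() if v != user]
    neighbors = sorted(sims, key=lambda pair: pair[0], reverse=True)[:k]
    num = den = 0.0
    for sim, v in neighbors:
        w = sim if weighted else 1.0      # unweighted: every neighbor votes equally
        num += w * ratings[v].get(item, 0.0)
        den += w
    if not normalize:                     # non-normalized neighborhood: raw weighted sum
        return num
    return num / den if den else 0.0
```

Swapping `cosine` for a Pearson correlation function would give the Pearson variant without changing the scoring loop.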

Existing Recommendation Models as score(u,i)

ItemKnn Model

Unweighted

Weighted

With Non-Normalized Neighborhood

With rating similarity functions as in UserKnn (transposed)

With attribute (content) similarity
● tokenization
● embedding
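
As one possible reading of the attribute-similarity variant with tokenization, the sketch below scores an item by summing its text similarity to the k most similar items in the user's history (a non-normalized neighborhood). The tokenizer, the bag-of-words cosine, and the names `tokenize` and `item_knn_score` are assumptions for illustration; an embedding-based variant would replace the token counts with dense vectors.

```python
import math
import re
from collections import Counter
from typing import Dict, Hashable, Iterable


def tokenize(text: str) -> Counter:
    """Lowercase the text and count alphanumeric tokens."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity of two token-count vectors (a Counter returns 0 for missing keys)."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def item_knn_score(descriptions: Dict[Hashable, str], history: Iterable[Hashable],
                   item: Hashable, k: int) -> float:
    """score(u,i): sum of similarities between `item` and its k nearest items in the history."""
    target = tokenize(descriptions[item])
    sims = sorted(
        (cosine(target, tokenize(descriptions[j])) for j in history if j != item),
        reverse=True,
    )
    return sum(sims[:k])
```

Because the similarity uses only item attributes, the score is defined even for items with no interactions yet.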

Existing Recommendation Models as score(u,i)

Association Rules (Survey and Novel Framework)

Multiple Rule-Quality Measures
● confidence
● lift
● conviction

Different Ways of Combining Rules Together for Recommendation
● best-rule
● weighted voting
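
The three rule-quality measures have standard textbook definitions, which the sketch below computes for single-item rules A → C over a list of transactions. The two combination functions are simplified assumptions: best-rule takes the maximum confidence over the user's history, and weighted voting sums confidences as votes. The thesis's framework may define both differently.

```python
import math
from typing import Dict, Hashable, Iterable, List, Set


def rule_measures(transactions: List[Set[Hashable]], antecedent: Hashable,
                  consequent: Hashable) -> Dict[str, float]:
    """Confidence, lift and conviction of the rule antecedent -> consequent."""
    n = len(transactions)
    n_a = sum(1 for t in transactions if antecedent in t)
    n_c = sum(1 for t in transactions if consequent in t)
    n_both = sum(1 for t in transactions if antecedent in t and consequent in t)
    confidence = n_both / n_a if n_a else 0.0
    support_c = n_c / n if n else 0.0
    lift = confidence / support_c if support_c else 0.0
    # Conviction is infinite for rules that never fail.
    conviction = math.inf if confidence == 1.0 else (1.0 - support_c) / (1.0 - confidence)
    return {"confidence": confidence, "lift": lift, "conviction": conviction}


def rule_score(transactions: List[Set[Hashable]], history: Iterable[Hashable],
               item: Hashable, combine: str = "best-rule") -> float:
    """score(u,i) from rules j -> item for every j in the user's history."""
    confidences = [rule_measures(transactions, j, item)["confidence"]
                   for j in history if j != item]
    if not confidences:
        return 0.0
    return max(confidences) if combine == "best-rule" else sum(confidences)
```

Replacing `"confidence"` with `"lift"` or `"conviction"` in `rule_score` changes which measure drives the ranking.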

Association Rules

Existing Recommendation Models as score(u,i)

Matrix Factorization (Survey)

Simple

...trained using different optimization algorithms (Stochastic Gradient Descent, Alternating Least Squares)

With biases

With mixed implicit and explicit feedback data

With imputed unobserved ratings

● Yifan Hu, Yehuda Koren

● Harald Steck
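
A small sketch of matrix factorization with biases trained by stochastic gradient descent, using the common form score(u,i) = μ + b_u + b_i + p_u · q_i. The hyperparameter values and the name `train_mf` are assumptions chosen for the toy example, and the implicit-feedback and imputation variants listed above are not covered.

```python
import random
from typing import Callable, Hashable, List, Sequence, Tuple


def dot(p: Sequence[float], q: Sequence[float]) -> float:
    return sum(x * y for x, y in zip(p, q))


def train_mf(ratings: List[Tuple[Hashable, Hashable, float]], n_factors: int = 2,
             lr: float = 0.05, reg: float = 0.02, epochs: int = 200,
             seed: int = 0) -> Callable[[Hashable, Hashable], float]:
    """Fit biased MF on (user, item, rating) triples and return score(user, item)."""
    rng = random.Random(seed)
    users = sorted({u for u, _, _ in ratings}, key=str)
    items = sorted({i for _, i, _ in ratings}, key=str)
    mu = sum(r for _, _, r in ratings) / len(ratings)   # global mean
    bu = {u: 0.0 for u in users}
    bi = {i: 0.0 for i in items}
    P = {u: [rng.gauss(0.0, 0.1) for _ in range(n_factors)] for u in users}
    Q = {i: [rng.gauss(0.0, 0.1) for _ in range(n_factors)] for i in items}

    for _ in range(epochs):
        for u, i, r in ratings:
            err = r - (mu + bu[u] + bi[i] + dot(P[u], Q[i]))
            bu[u] += lr * (err - reg * bu[u])
            bi[i] += lr * (err - reg * bi[i])
            for f in range(n_factors):
                pu, qi = P[u][f], Q[i][f]
                P[u][f] += lr * (err * qi - reg * pu)
                Q[i][f] += lr * (err * pu - reg * qi)

    def score(user: Hashable, item: Hashable) -> float:
        return mu + bu[user] + bi[item] + dot(P[user], Q[item])

    return score
```

In this sketch, the number of factors and the regularization strength `reg` are the obvious knobs that change how closely the model can fit the training data.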

Existing Recommendation Models as score(u,i)

AutoRec

U-AutoRec

I-AutoRec

Generalized Validation Loss and Validation Reward

Rating Prediction Binary Classification

Generalized Validation Loss and Validation Reward

Accuracy Measures

Top-N Recommendation

Coverage Measures

Validation Reward Function = Generalization of Accuracy and Coverage
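
To make the two families of measures concrete, here is a minimal sketch of one accuracy measure (Recall@N with one held-out item per user) and one coverage measure (catalog coverage). The data layout and function names are assumptions, and the thesis's generalized reward function is not reproduced here.

```python
from typing import Dict, Hashable, List, Sequence


def recall_at_n(recommendations: Dict[Hashable, List[Hashable]],
                held_out: Dict[Hashable, Hashable], n: int) -> float:
    """Fraction of users whose single held-out item appears in their top-n list."""
    hits = sum(1 for user, item in held_out.items()
               if item in recommendations.get(user, [])[:n])
    return hits / len(held_out) if held_out else 0.0


def catalog_coverage(recommendations: Dict[Hashable, List[Hashable]],
                     catalog: Sequence[Hashable], n: int) -> float:
    """Fraction of catalog items recommended to at least one user in a top-n list."""
    recommended = set()
    for items in recommendations.values():
        recommended.update(items[:n])
    return len(recommended & set(catalog)) / len(catalog) if catalog else 0.0
```

A model that recommends the same bestsellers to everyone can score well on recall while covering only a small part of the catalog, which is why the two measures are tracked together.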

Model Capacity Hyperparameters

For Individual Learning Algorithms

β = Universal Capacity Hyperparameter

Using score(u,i)
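
The slide's formula for β is missing from this text. The sketch below assumes one plausible form, dividing any model's score by the item's popularity raised to the power β, so that β = 0 leaves the model unchanged and β > 0 shifts recommendations toward the long tail. Both the functional form and the name `with_popularity_bias` are assumptions, not the thesis's definition.

```python
from typing import Callable, Dict, Hashable

ScoreFn = Callable[[Hashable, Hashable], float]


def with_popularity_bias(score: ScoreFn, popularity: Dict[Hashable, int],
                         beta: float) -> ScoreFn:
    """Wrap any score(u,i) with an assumed popularity penalty 1 / popularity(i)**beta."""

    def biased(user: Hashable, item: Hashable) -> float:
        # Treat unseen items as popularity 1 to avoid dividing by zero.
        pop = max(popularity.get(item, 0), 1)
        return score(user, item) / (pop ** beta)

    return biased
```

Because the wrapper only needs a score function and item counts, the same β can be applied to any model expressed as score(u,i).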

Recall-Coverage Optimization

Searching for Pareto-Optimal States

[Plot axes: Leave-One-Recall@N, Catalog Coverage]
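
Searching for Pareto-optimal states can be sketched as filtering (recall, coverage) pairs measured at different capacity settings: a setting is kept unless another setting is at least as good on both measures and strictly better on one. The function name and the numbers in the example are made up for illustration and are not results from the thesis.

```python
from typing import Dict, Hashable, List, Tuple


def pareto_front(results: Dict[Hashable, Tuple[float, float]]) -> List[Hashable]:
    """Return the labels of settings whose (recall, coverage) pair is not dominated."""
    front = []
    for label, (recall, coverage) in results.items():
        dominated = any(
            r >= recall and c >= coverage and (r > recall or c > coverage)
            for other, (r, c) in results.items()
            if other != label
        )
        if not dominated:
            front.append(label)
    return front


# Hypothetical measurements for four capacity settings (illustrative numbers only).
example = {
    "beta=0.0": (0.30, 0.10),
    "beta=0.5": (0.28, 0.40),
    "beta=1.0": (0.15, 0.60),
    "other": (0.14, 0.30),
}
```

In the made-up example, the setting labeled `other` is dropped because `beta=0.5` beats it on both recall and coverage, while the remaining three settings trade one measure against the other.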

[Plot: Catalog Coverage against Recall@N, with regions labeled "High capacity" and "Low capacity"]

Experiments

● Different learning algorithms
● Mixture of academic and industrial datasets

Experiments

● Different characteristics of academic and industrial datasets

UserKnn

ItemKnn: Rating Similarity

ItemKnn: Attribute Similarity

Association Rules: Best-Rule Method

Association Rules: Weighted Voting Method

Matrix Factorization

AutoRec

Related Activities of the Author

● Supervised 20 Bachelor+Master theses
  ○ 7× received the Dean’s award
● Created and lectured Fundamentals of Artificial Intelligence course
  ○ 200+ students / 1 run
  ○ created full course from scratch, giving lectures and seminars for 5 years
● Co-Founded Recombee, designed and implemented Recombee Engine
  ○ Supervising demand-inspired research in embeddings, deep learning, auto ML
  ○ SaaS, 3000 online registrations
  ○ 50 recurrently paying customers from 30 countries
  ○ running real-time service on 120 servers, 1000 CPUs, 10TB RAM
  ○ currently maintained and improved by team of 10 people
  ○ customers with
    ■ 200 M interactions / month
    ■ 300 recommendations / second
    ■ 20 M items with text descriptions and average 5 images per item

Thank you for your attention!

Questions?

Performance in Production

● Presented models are running live in Recombee
  ○ Incremental implementation (response to new data)
  ○ Dozens of use-cases (E-Commerce, news, classifieds, movies, job boards, cultural events...)
  ○ Initial implementation done by author of this thesis
● Capacity manipulation from dissertation frequently A/B tested in production
● Using β > 0 by default for some models
● Maximizing recall: better than random, but sometimes suboptimal
● RCO:
  ○ sometimes better than recall
  ○ sometimes both recall and RCO are deceiving for user response (CTR, CR)

IsThereAnyDeal.com Experiment

Online Experiment: Large Czech Job Board

Other Experiments (BUBB.Store)

Other Experiments (Moodings)

Online Model Optimization (CMA-ES)

● Bc. Radek Bartyzal

Comparing Offline and Online Evaluation Results of Recommender Systems (RecSys Paper)

Model Ensembles in RCO

Other Research Supervised by Author

Cold-Start

● Rating data = high quality, but limited
● Recommendation based on attributes needed
● Researched topics:
  ○ Feature selection by Semi-supervised learning (Bc. Michal Režnický)
  ○ Neural prediction of interaction similarity from attributes (Bc. Petr Kasalický)
  ○ Stacked denoising autoencoders (Bc. Radek Bartyzal)

Other Research Supervised by Author

Image Embeddings

● Image-Based ItemKnn (Bc. Martin Pavlíček)
● Re-training convolutional networks from interactions (Bc. Petr Kasalický)

Other Research Supervised by Author

Hyper-Embeddings

● Ing. Ivan Povalyaev (Recombee)

... 1 0 0 0 0 1 0 0 1 0 0 0 1 0 0 0 1 0 ...

Other Research Supervised by Author

Showmax Scene Embeddings for Recommendation

● Ing. Ivan Povalyaev (Recombee), Bc. Ondřej Bíža (ShowmaxLab)