Predicting Performance in Recommender Systems
ir.ii.uam.es/~alejandro/2011/recsys-dc-slides.pdf


ACM Conference on Recommender Systems 2011 – Doctoral Symposium

October 23, Chicago, USA
IRGIR Group @ UAM

Predicting Performance in Recommender Systems

Alejandro Bellogín

Supervised by Pablo Castells and Iván Cantador
Escuela Politécnica Superior

Universidad Autónoma de Madrid

@abellogin

alejandro.bellogin@uam.es


Motivation

Is it possible to predict the accuracy of a recommendation?


Hypothesis

Data that are commonly available to a Recommender System could contain signals that enable an a priori estimation of the success of the recommendation.


Research Questions

1. Is it possible to define a performance prediction theory for recommender systems in a sound, formal way?

2. Is it possible to adapt query performance techniques (from IR) to the recommendation task?

3. What kind of evaluation should be performed? Is IR evaluation still valid in our problem?

4. What kind of recommendation problems can these models be applied to?


Predicting Performance in Recommender Systems

RQ1. Is it possible to define a performance prediction theory for recommender systems in a sound, formal way?

a) Define a predictor of performance $\gamma = \gamma(u, i, r, \ldots)$

b) Agree on a performance metric $\psi = \psi(u, i, r, \ldots)$

c) Check predictive power by measuring the correlation $\mathrm{corr}([\gamma(x_1), \ldots, \gamma(x_n)], [\psi(x_1), \ldots, \psi(x_n)])$

d) Evaluate final performance: dynamic vs static
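A minimal sketch of step (c), assuming the per-user predictor and metric values have already been computed; the helper name predictive_power and the commented example variables are illustrative, not from the thesis:

```python
# Hypothetical sketch: correlate a per-user performance predictor gamma
# with a per-user performance metric psi, as in step (c) above.
from scipy.stats import pearsonr, spearmanr

def predictive_power(gamma_values, psi_values):
    """Return Pearson's r and Spearman's rho between predictor and metric."""
    r, _ = pearsonr(gamma_values, psi_values)     # linear correlation
    rho, _ = spearmanr(gamma_values, psi_values)  # rank correlation
    return r, rho

# Example (hypothetical data): one value per user x1, ..., xn
# gamma = [clarity(u) for u in users]
# psi   = [ndcg_at_50(u) for u in users]
# r, rho = predictive_power(gamma, psi)
```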


Predicting Performance in Recommender Systems

RQ2. Is it possible to adapt query performance techniques (from IR) to the recommendation task?

In IR: “Estimation of the system’s performance in response to a specific query”

Several predictors proposed

We focus on query clarity → user clarity


User clarity

It captures the uncertainty in the user’s data

• Distance between the user’s and the system’s probability models

• X may be: users, items, ratings, or a combination

$\mathrm{clarity}(u) = \sum_{x \in X} p(x \mid u)\, \log \frac{p(x \mid u)}{p_c(x)}$

where $p(x \mid u)$ is the user’s model and $p_c(x)$ is the system’s model.
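A minimal sketch of this formula, assuming a discrete vocabulary (e.g., rating values) and maximum-likelihood estimates for both models; the eps constant is an illustrative safeguard, not the smoothing used in the papers:

```python
import math
from collections import Counter

def user_clarity(user_events, system_events, eps=1e-12):
    """clarity(u) = sum_x p(x|u) * log(p(x|u) / p_c(x)): the KL divergence
    between the user's model p(x|u) and the system's model p_c(x)."""
    user_counts, system_counts = Counter(user_events), Counter(system_events)
    n_u, n_c = sum(user_counts.values()), sum(system_counts.values())
    clarity = 0.0
    for x, count in user_counts.items():
        p_u = count / n_u                    # user's model p(x|u)
        p_c = system_counts.get(x, 0) / n_c  # system's model p_c(x)
        clarity += p_u * math.log(p_u / (p_c + eps))  # eps avoids log(0)
    return clarity
```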


User clarity

Three user clarity formulations:

$\mathrm{clarity}(u) = \sum_{x \in X} p(x \mid u)\, \log \frac{p(x \mid u)}{p_c(x)}$

(user model: $p(x \mid u)$; background model: $p_c(x)$)

Name                   | Vocabulary               | User model | Background model
Rating-based           | Ratings                  | p(r|u)     | p_c(r)
Item-based             | Items                    | p(i|u)     | p_c(i)
Item-and-rating-based  | Items rated by the user  | p(r|i,u)   | p_ml(r|i)
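For instance, the rating-based and item-based formulations can be instantiated by reusing the user_clarity sketch above with different event vocabularies; the profile data structure is a hypothetical stand-in for the system's rating matrix:

```python
# profile[u] is assumed to be a list of (item, rating) pairs for user u
def rating_based_clarity(u, profile):
    user_ratings = [r for (_, r) in profile[u]]
    all_ratings = [r for v in profile for (_, r) in profile[v]]
    return user_clarity(user_ratings, all_ratings)  # vocabulary X = ratings

def item_based_clarity(u, profile):
    user_items = [i for (i, _) in profile[u]]
    all_items = [i for v in profile for (i, _) in profile[v]]
    return user_clarity(user_items, all_items)      # vocabulary X = items
```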


User clarity

Seven user clarity models implemented:

Name        | Formulation           | User model             | Background model
RatUser     | Rating-based          | p_U(r|i,u)             | p_c(r)
RatItem     | Rating-based          | p_I(r|i,u)             | p_c(r)
ItemSimple  | Item-based            | p_R(i|u)               | p_c(i)
ItemUser    | Item-based            | p_UR(i|u)              | p_c(i)
IRUser      | Item-and-rating-based | p_U(r|i,u); p_UR(i|u)  | p_ml(r|i)
IRItem      | Item-and-rating-based | p_I(r|i,u); p_UR(i|u)  | p_ml(r|i)
IRUserItem  | Item-and-rating-based | p_UI(r|i,u)            | p_ml(r|i)


User clarity

A predictor that captures the uncertainty in the user’s data

Different formulations capture different nuances

More dimensions in RS than in IR: users, items, ratings, features, …

[Figure: example of the RatItem formulation, comparing the background distribution p_c(x) with two user models p(x|u1) and p(x|u2) over rating values]


Predicting Performance in Recommender Systems

RQ3. What kind of evaluation should be performed? Is IR evaluation still valid in our problem?

In IR: Mean Average Precision + correlation

• 50 points (queries) vs 1000+ points (users)

The performance metric is not clear: error-based or precision-based?

What is performance?

• It may depend on the final application

Possible bias

• E.g., towards users or items with larger profiles (observed r ≈ 0.57)


Predicting Performance in Recommender Systems

RQ4. What kind of recommendation problems can these models be applied to?

Whenever a combination of strategies is available

• Example 1: dynamic neighbor weighting

• Example 2: dynamic ensemble recommendation


Dynamic neighbor weighting

The user’s neighbors are weighted according to their similarity

Can we take the uncertainty in the neighbors’ data into account?

User neighbor weighting [1]

• Static: $g(u,i) = C \sum_{v \in N[u]} \mathrm{sim}(u,v)\, \mathrm{rat}(v,i)$

• Dynamic: $g(u,i) = C \sum_{v \in N[u]} \gamma(v)\, \mathrm{sim}(u,v)\, \mathrm{rat}(v,i)$
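A minimal sketch of both scoring rules, assuming dictionary-based sim, rat, and neighbors structures (all hypothetical) and a precomputed predictor value gamma[v] per neighbor, e.g. v's normalized clarity:

```python
def score_static(u, i, neighbors, sim, rat, C=1.0):
    # g(u,i) = C * sum_{v in N[u]} sim(u,v) * rat(v,i)
    return C * sum(sim[u][v] * rat[v][i]
                   for v in neighbors[u] if i in rat[v])

def score_dynamic(u, i, neighbors, sim, rat, gamma, C=1.0):
    # g(u,i) = C * sum_{v in N[u]} gamma(v) * sim(u,v) * rat(v,i)
    # gamma[v] weights each neighbor by her predicted performance
    return C * sum(gamma[v] * sim[u][v] * rat[v][i]
                   for v in neighbors[u] if i in rat[v])
```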


Dynamic hybrid recommendation

The weight is the same for every item and user (learnt from training)

What about boosting those users who are predicted to perform better for some recommender?

Hybrid recommendation [3]

• Static: $g(u,i) = \lambda\, g_{R1}(u,i) + (1 - \lambda)\, g_{R2}(u,i)$

• Dynamic: $g(u,i) = \gamma(u)\, g_{R1}(u,i) + (1 - \gamma(u))\, g_{R2}(u,i)$
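A minimal sketch, where g1 and g2 stand for the two recommenders' scoring functions and gamma maps each user to a weight in [0, 1] derived from a performance predictor (all names illustrative):

```python
def hybrid_static(g1, g2, u, i, lam=0.5):
    # g(u,i) = lambda * g_R1(u,i) + (1 - lambda) * g_R2(u,i)
    return lam * g1(u, i) + (1 - lam) * g2(u, i)

def hybrid_dynamic(g1, g2, u, i, gamma):
    # g(u,i) = gamma(u) * g_R1(u,i) + (1 - gamma(u)) * g_R2(u,i)
    w = gamma(u)  # e.g. u's clarity, rescaled to [0, 1]
    return w * g1(u, i) + (1 - w) * g2(u, i)
```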


Results – Neighbor weighting

Correlation analysis [1]

• With respect to the Neighbor Goodness metric: “how good a neighbor is to her vicinity”

Performance [1] (MAE = Mean Absolute Error; the lower, the better)

[Figure: MAE of Standard CF vs Clarity-enhanced CF, as a function of the % of ratings used for training (neighbourhood size 500) and of the neighbourhood size (80% training)]



Improvement of over 5% with respect to the baseline

Moreover, it does not degrade performance

Positive, although not very strong, correlations
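For reference, a minimal sketch of the MAE metric reported in the figure above; the dictionaries keyed by (user, item) pairs are an assumed layout:

```python
def mae(predicted, actual):
    """Mean Absolute Error over the rated (user, item) pairs; lower is better."""
    errors = [abs(predicted[key] - actual[key]) for key in actual]
    return sum(errors) / len(errors)
```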


Results – Hybrid recommendation

Correlation analysis [2]

• With respect to nDCG@50 (nDCG: normalized Discounted Cumulative Gain)

Performance [3]

[Figure: MAP@50 and nDCG@50 of the Adaptive vs Static strategies for four hybrid recommenders H1–H4]



On average, most of the predictors obtain positive, strong correlations

The dynamic strategy outperforms the static one for different combinations of recommenders


Summary

Inferring a user’s performance in a recommender system

Different adaptations of query clarity techniques

Building dynamic recommendation strategies

• Dynamic neighbor weighting: according to the expected goodness of a neighbor

• Dynamic hybrid recommendation: based on predicted performance

Encouraging results

• Positive predictive power (good correlations between predictors and metrics)

• Dynamic strategies obtain better (or equal) results than static ones


Related publications

[1] A. Bellogín and P. Castells. A Performance Prediction Approach to Enhance Collaborative Filtering Performance. In ECIR 2010.

[2] A. Bellogín, P. Castells, and I. Cantador. Predicting the Performance of Recommender Systems: An Information Theoretic Approach. In ICTIR 2011.

[3] A. Bellogín, P. Castells, and I. Cantador. Performance Prediction for Dynamic Ensemble Recommender Systems. In press.


Future Work

Explore other input sources

• Item predictors

• Social links

• Implicit data (with time)

We need a theoretical background

• Why do some predictors work better?

Larger datasets


FW – Other input sources

Item predictors

Social links

Implicit data (with time)

Item predictors could be very useful:

• Different recommender behavior depending on item attributes

• They would allow us to capture popularity, diversity, etc.


FW – Other input sources

Item predictors

Social links

Implicit data (with time)

First results using social-based predictors

• Combination of social and CF information

• Graph-based measures as predictors

• “Indicators” of the user’s strength


FW – Theoretical background

We need a theoretical background

• Why do some predictors work better?


Thank you!

Predicting Performance in Recommender Systems

Alejandro Bellogín

Supervised by Pablo Castells and Iván Cantador
Escuela Politécnica Superior

Universidad Autónoma de Madrid

@abellogin

alejandro.bellogin@uam.es

Acknowledgements to the National Science Foundation for the funding to attend the conference.


Reviewer’s comments: Confidence

Other methods to measure the self-performance of a RS

• Confidence

These methods capture the performance of the RS, not the user’s performance


Reviewer’s comments: Neighbor’s goodness

Neighbor goodness seems to be a little bit ad hoc

We need a measurable definition of neighbor performance:

NG(u) ≈ “total MAE reduction by u” ≈ “MAE without u” − “MAE with u”

Some attempts in trust research: sign and error deviation [Rafter et al. 2009]
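A minimal sketch of this definition, assuming a hypothetical evaluate_mae(neighbor_pool) closure that trains and evaluates the kNN recommender restricted to a given pool of candidate neighbors:

```python
def neighbor_goodness(u, all_users, evaluate_mae):
    """NG(u) = "MAE without u" - "MAE with u": how much the system's
    error drops when u is available as a neighbor to her vicinity."""
    mae_with_u = evaluate_mae(all_users)
    mae_without_u = evaluate_mae(all_users - {u})  # all_users is a set
    return mae_without_u - mae_with_u
```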


Reviewer’s comments: Neighbor’s weighting issues

Neighborhood size vs dynamic neighborhood weighting

• So far, only dynamic weighting, with the same training time as static weighting

• Future work: dynamic size

Apply this method to larger datasets

• Current work

Apply this method to other CF methods (e.g., latent factor models, SVD)

• More difficult to identify the combination

• Future work


Reviewer’s comments: Dynamic hybrid issues

Other methods to combine recommenders

• Stacking

• Multi-linear weighting

We focus on linear weighted hybrid recommendation

Future work: cascade, stacking
