Predicting Performance in Recommender Systems

Alejandro Bellogín (@abellogin)
Supervised by Pablo Castells and Iván Cantador
IR Group @ UAM, Escuela Politécnica Superior, Universidad Autónoma de Madrid

ACM Conference on Recommender Systems 2011, Doctoral Symposium
October 23, Chicago, USA




Motivation

Is it possible to predict the accuracy of a recommendation?


Hypothesis

Data that are commonly available to a Recommender System could contain signals that enable an a priori estimation of the success of the recommendation


Research Questions

1. Is it possible to define a performance prediction theory for recommender systems in a sound, formal way?

2. Is it possible to adapt query performance techniques (from IR) to the recommendation task?

3. What kind of evaluation should be performed? Is IR evaluation still valid in our problem?

4. What kind of recommendation problems can these models be applied to?


Predicting Performance in Recommender Systems

RQ1. Is it possible to define a performance prediction theory for recommender systems in a sound, formal way?

a) Define a predictor of performance $\gamma = \gamma(u, i, r, \ldots)$

b) Agree on a performance metric $\mu = \mu(u, i, r, \ldots)$

c) Check predictive power by measuring the correlation $\mathrm{corr}([\gamma(x_1), \ldots, \gamma(x_n)], [\mu(x_1), \ldots, \mu(x_n)])$

d) Evaluate final performance: dynamic vs static
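
Step c) can be illustrated with a minimal Python sketch. It assumes one predictor value and one metric value per user and uses the Pearson coefficient; the slides do not say which correlation coefficient is used, and the helper name `pearson` and the sample numbers are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation between two equally long sequences of numbers."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


# Hypothetical per-user values: predictor output and observed metric.
predictor_values = [0.10, 0.35, 0.20, 0.60, 0.45]
metric_values = [0.02, 0.05, 0.03, 0.09, 0.04]

r = pearson(predictor_values, metric_values)
print(f"correlation between predictor and metric: {r:.2f}")
```

A value of r close to 1 would suggest that the predictor ranks users in roughly the same way as the metric does; a rank-based coefficient could be swapped in when only the ordering matters.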


Predicting Performance in Recommender Systems

RQ2. Is it possible to adapt query performance techniques (from IR) to the recommendation task?

In IR: “Estimation of the system’s performance in response to a specific query”

Several predictors proposed

We focus on query clarity → user clarity


User clarity

It captures uncertainty in user’s data

• Distance between the user’s and the system’s probability model

• X may be: users, items, ratings, or a combination

$\mathrm{clarity}(u) = \sum_{x \in X} p(x|u) \log \frac{p(x|u)}{p_c(x)}$

where $p(x|u)$ is the user’s model and $p_c(x)$ is the system’s model
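
As a rough illustration, the clarity formula can be computed as a Kullback-Leibler divergence between two discrete distributions. The sketch below is not the authors' implementation: the function name `user_clarity`, the base-2 logarithm, and the toy rating distributions are all assumptions.

```python
from math import log


def user_clarity(user_model, background_model, base=2.0):
    """Sum over x of p(x|u) * log(p(x|u) / p_c(x)).

    Both arguments map each vocabulary element x to a probability.
    Terms with p(x|u) == 0 contribute nothing (0 * log 0 is taken as 0).
    """
    total = 0.0
    for x, p_xu in user_model.items():
        if p_xu == 0.0:
            continue
        p_cx = background_model.get(x, 0.0)
        if p_cx <= 0.0:
            raise ValueError(f"background model gives zero mass to {x!r}")
        total += p_xu * log(p_xu / p_cx, base)
    return total


# Hypothetical rating distributions on a 1-5 scale.
background = {1: 0.05, 2: 0.10, 3: 0.25, 4: 0.35, 5: 0.25}
typical_user = dict(background)                        # same shape as the system
peaked_user = {1: 0.0, 2: 0.0, 3: 0.0, 4: 0.1, 5: 0.9}  # far from the system

clarity_typical = user_clarity(typical_user, background)
clarity_peaked = user_clarity(peaked_user, background)
```

A user whose distribution matches the background gets a clarity of zero, and the value grows as the user's model departs from the system's model.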


User clarity

Three user clarity formulations:

$\mathrm{clarity}(u) = \sum_{x \in X} p(x|u) \log \frac{p(x|u)}{p_c(x)}$, with user model $p(x|u)$ and background model $p_c(x)$

Name                    Vocabulary                User model     Background model
Rating-based            Ratings                   $p(r|u)$       $p_c(r)$
Item-based              Items                     $p(i|u)$       $p_c(i)$
Item-and-rating-based   Items rated by the user   $p(r|i,u)$     $p_{ml}(r|i)$


User clarity

Seven user clarity models implemented:

Name         Formulation             User model                      Background model
RatUser      Rating-based            $p_U(r|i,u)$; $p_{UR}(i|u)$     $p_c(r)$
RatItem      Rating-based            $p_I(r|i,u)$; $p_{UR}(i|u)$     $p_c(r)$
ItemSimple   Item-based              $p_R(i|u)$                      $p_c(i)$
ItemUser     Item-based              $p_{UR}(i|u)$                   $p_c(i)$
IRUser       Item-and-rating-based   $p_U(r|i,u)$                    $p_{ml}(r|i)$
IRItem       Item-and-rating-based   $p_I(r|i,u)$                    $p_{ml}(r|i)$
IRUserItem   Item-and-rating-based   $p_{UI}(r|i,u)$                 $p_{ml}(r|i)$


User clarity

Predictor that captures uncertainty in user’s data

Different formulations capture different nuances

More dimensions in RS than in IR: users, items, ratings, features, …

[Figure: RatItem clarity example. Probability (0.0 to 0.7) per rating value for the background model p_c(x) and for two users, p(x|u1) and p(x|u2).]


Predicting Performance in Recommender Systems

RQ3. What kind of evaluation should be performed? Is IR evaluation still valid in our problem?

In IR: Mean Average Precision + correlation

50 points (queries) vs 1000+ points (users)

Performance metric is not clear: error-based, precision-based?

What is performance?

It may depend on the final application

Possible bias

E.g., towards users or items with larger profiles

r ~ 0.57


Predicting Performance in Recommender Systems

RQ4. What kind of recommendation problems can these models be applied to?

Whenever a combination of strategies is available

Example 1: dynamic neighbor weighting

Example 2: dynamic ensemble recommendation


Dynamic neighbor weighting

The user’s neighbors are weighted according to their similarity

Can we take into account the uncertainty in neighbor’s data?

User neighbor weighting [1]

• Static: $g(u, i) = C \sum_{v \in N[u]} \mathrm{sim}(u, v)\, \mathrm{rat}(v, i)$

• Dynamic: $g(u, i) = C \sum_{v \in N[u]} \gamma(v)\, \mathrm{sim}(u, v)\, \mathrm{rat}(v, i)$
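
The difference between the static and dynamic forms can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the implementation from [1]: the function name `predict_rating`, the dictionary-based data layout, the toy values, and the choice of C as one over the sum of absolute weights are all hypothetical.

```python
def predict_rating(user, item, neighbors, sim, rat, gamma=None):
    """Weighted average of the neighbors' ratings for `item`.

    neighbors: iterable of neighbor ids, playing the role of N[u]
    sim:       dict mapping (u, v) to a similarity score
    rat:       dict mapping (v, i) to a known rating
    gamma:     optional dict mapping v to a predictor value;
               None gives the static form, a dict gives the dynamic form
    """
    numerator = 0.0
    normalizer = 0.0
    for v in neighbors:
        if (v, item) not in rat:
            continue  # this neighbor has not rated the item
        weight = sim[(user, v)]
        if gamma is not None:
            weight *= gamma[v]  # scale similarity by the neighbor's predictor
        numerator += weight * rat[(v, item)]
        normalizer += abs(weight)
    if normalizer == 0.0:
        return None  # no usable neighbor
    return numerator / normalizer  # assumed C = 1 / sum of |weights|


# Hypothetical toy data: two equally similar neighbors who disagree on item "i".
sim = {("u", "a"): 0.5, ("u", "b"): 0.5}
rat = {("a", "i"): 5.0, ("b", "i"): 1.0}
gamma = {"a": 0.9, "b": 0.1}  # assumed predictor values scaled to [0, 1]

static_score = predict_rating("u", "i", ["a", "b"], sim, rat)
dynamic_score = predict_rating("u", "i", ["a", "b"], sim, rat, gamma)
```

With these numbers the static form averages both neighbors equally, while the dynamic form leans toward neighbor a, whose predictor value is higher.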


Dynamic hybrid recommendation

Weight is the same for every item and user (learnt from training)

What about boosting those users predicted to perform better for some recommender?

Hybrid recommendation [3]

• Static: $g(u, i) = \lambda\, g_{R1}(u, i) + (1 - \lambda)\, g_{R2}(u, i)$

• Dynamic: $g(u, i) = \gamma(u)\, g_{R1}(u, i) + (1 - \gamma(u))\, g_{R2}(u, i)$
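
A minimal Python sketch of the two hybrid forms follows. It assumes the per-user predictor has to be rescaled to [0, 1] before it can act as a mixing weight; the min-max rescaling, the function names, and all numbers are hypothetical and are not taken from [3].

```python
def min_max_normalize(values):
    """Rescale raw predictor values to [0, 1] so they can act as weights."""
    low, high = min(values.values()), max(values.values())
    if high == low:
        return {key: 0.5 for key in values}  # no spread: fall back to an even mix
    return {key: (value - low) / (high - low) for key, value in values.items()}


def hybrid_score(score_r1, score_r2, weight):
    """Linear combination weight * g_R1 + (1 - weight) * g_R2."""
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    return weight * score_r1 + (1.0 - weight) * score_r2


# Hypothetical per-user predictor output (e.g. a clarity value).
raw_predictor = {"u1": 0.2, "u2": 1.0, "u3": 1.8}
gamma = min_max_normalize(raw_predictor)

static_weight = 0.5             # hypothetical weight learnt from training
score_r1, score_r2 = 4.0, 2.0   # hypothetical scores for one (user, item) pair

static_scores = {u: hybrid_score(score_r1, score_r2, static_weight) for u in gamma}
dynamic_scores = {u: hybrid_score(score_r1, score_r2, gamma[u]) for u in gamma}
```

In the static case every user receives the same mix of the two recommenders; in the dynamic case users with a higher predictor value rely more on the first recommender.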


Results – Neighbor weighting

Correlation analysis [1]

• With respect to the Neighbor Goodness metric: “how good a neighbor is to her vicinity”

• Positive, although not very strong, correlations

Performance [1] (MAE = Mean Absolute Error, the lower the better)

[Figure: MAE (0.80 to 0.98) against % of ratings used for training (10 to 90), neighborhood size 500; Standard CF vs Clarity-enhanced CF]

[Figure: MAE (0.80 to 0.88) against neighborhood size (100 to 500), 80% training; Standard CF vs Clarity-enhanced CF]

• Improvement of over 5% wrt. the baseline

• Plus, it does not degrade performance


Results – Hybrid recommendation

Correlation analysis [2]

• With respect to nDCG@50 (nDCG, normalized Discounted Cumulative Gain)

• On average, most of the predictors obtain positive, strong correlations

Performance [3]

[Figure: MAP@50 (0 to 0.07) and nDCG@50 (0 to 0.2) for hybrids H1 to H4; Adaptive vs Static]

• Dynamic strategy outperforms static for different combinations of recommenders


Summary

Inferring user’s performance in a recommender system

Different adaptations of query clarity techniques

Building dynamic recommendation strategies

• Dynamic neighbor weighting: according to expected goodness of neighbor

• Dynamic hybrid recommendation: based on predicted performance

Encouraging results

• Positive predictive power (good correlations between predictors and metrics)

• Dynamic strategies obtain better (or equal) results than static


Related publications

[1] A Performance Prediction Approach to Enhance Collaborative Filtering Performance. A. Bellogín and P. Castells. In ECIR 2010.

[2] Predicting the Performance of Recommender Systems: An Information Theoretic Approach. A. Bellogín, P. Castells, and I. Cantador. In ICTIR 2011.

[3] Performance Prediction for Dynamic Ensemble Recommender Systems. A. Bellogín, P. Castells, and I. Cantador. In press.


Future Work

Explore other input sources

• Item predictors

• Social links

• Implicit data (with time)

We need a theoretical background

• Why do some predictors work better?

Larger datasets


FW – Other input sources


Item predictors could be very useful:

• Different recommender behavior depending on item attributes

• They would make it possible to capture popularity, diversity, etc.


First results using social-based predictors

• Combination of social and CF

• Graph-based measures as predictors

• “Indicators” of the user strength


Thank you!


Acknowledgements to the National Science Foundation for the funding to attend the conference.


Reviewer’s comments: Confidence

Other methods to measure self-performance of RS

o Confidence

These methods capture the performance of the RS, not the user’s performance


Reviewer’s comments: Neighbor’s goodness

Neighbor goodness seems to be a little bit ad-hoc

We need a measurable definition of neighbor performance

NG(u) ~ “total MAE reduction by u” ~ “MAE without u” – “MAE with u”

Some attempts in trust research: sign and error deviation [Rafter et al. 2009]

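$NG(u) = \frac{1}{|R_{U \setminus \{u\}}|} \sum_{v \in U \setminus \{u\}} CE_{U \setminus \{u\}}(v) - \frac{1}{|R_{U \setminus \{u\}}|} \sum_{v \in U \setminus \{u\}} CE_U(v)$

$CE_X(v) = \sum_{i : \mathrm{rat}(v, i) \neq \emptyset} |\tilde{r}_X(v, i) - r(v, i)|$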
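
The idea of “MAE without u” minus “MAE with u” can be sketched in Python as follows. This is only an illustration: the item-mean predictor stands in for a real collaborative filtering algorithm, and the function names, the `default` fallback rating, and the toy data are assumptions rather than anything taken from [1].

```python
def mean_abs_error(ratings, targets, pool, default=3.0):
    """MAE over the ratings of `targets`, predicting each rating as the mean
    rating that the other users in `pool` gave to the same item."""
    total_error = 0.0
    count = 0
    for v in targets:
        for item, true_rating in ratings[v].items():
            others = [ratings[w][item] for w in pool
                      if w != v and item in ratings[w]]
            # Fall back to an assumed default when nobody else rated the item.
            prediction = sum(others) / len(others) if others else default
            total_error += abs(prediction - true_rating)
            count += 1
    return total_error / count if count else 0.0


def neighbor_goodness(ratings, u):
    """MAE without u minus MAE with u, measured on every user except u."""
    everyone = list(ratings)
    without_u = [v for v in everyone if v != u]
    mae_without = mean_abs_error(ratings, without_u, without_u)
    mae_with = mean_abs_error(ratings, without_u, everyone)
    return mae_without - mae_with


# Hypothetical ratings: "good" agrees with the majority, "bad" does not.
ratings = {
    "a": {"x": 5, "y": 1},
    "b": {"x": 5, "y": 1},
    "good": {"x": 5, "y": 1},
    "bad": {"x": 1, "y": 5},
}

ng_good = neighbor_goodness(ratings, "good")
ng_bad = neighbor_goodness(ratings, "bad")
```

A positive value means the other users are predicted more accurately when u is available as a neighbor; a negative value means u makes their predictions worse.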


Reviewer’s comments: Neighbor’s weighting issues

Neighbor size vs dynamic neighborhood weighting

So far, only dynamic weighting

• Same training time as static weighting

Future work: dynamic size

Apply this method to larger datasets (current work)

Apply this method to other CF methods (e.g., latent factor models, SVD) (future work)

• More difficult to identify the combination


Reviewer’s comments: Dynamic hybrid issues

Other methods to combine recommenders

o Stacking

o Multi-linear weighting

We focus on linear weighted hybrid recommendation

Future work: cascade, stacking