77
Research Challenges in Group Recommender Systems Judith Masthoff

Research Challenges in Group Recommender Systems Judith Masthoff

Embed Size (px)

Citation preview

Page 1: Research Challenges in Group Recommender Systems Judith Masthoff

Research Challenges in Group Recommender

Systems

Judith Masthoff

Page 2: Research Challenges in Group Recommender Systems Judith Masthoff

The University of Aberdeen, Scotland

• Founded 1495• 5th oldest university in the UK• 14,000 students• 120 nationalities• 550+ first degree programmes• 110+ taught postgraduate

programmes

Page 3: Research Challenges in Group Recommender Systems Judith Masthoff

Contents

• Introduction to group recommender systems• Five challenges:

• Evaluation and metrics• Recommending sequences• Modelling satisfaction• Incorporating group attributes• Explaining group recommendations

Page 4: Research Challenges in Group Recommender Systems Judith Masthoff

Introduction to Group Recommender

Systems

Page 5: Research Challenges in Group Recommender Systems Judith Masthoff

Introduction

• Group Recommender Systems: how to adapt to a group of people rather than to an individual

I know individual ratings of Peter, Jane,

and Mary. What to recommend to the

group?

Page 6: Research Challenges in Group Recommender Systems Judith Masthoff

How do groups make decisions?

• Split yourself into groups of 4-6 people• Decide what your group would listen to

if time for 1 song, for 2 songs, for 3 songs, for 4 songs

A B C D E

F G H I J K

Page 7: Research Challenges in Group Recommender Systems Judith Masthoff

What would you recommend?

A B C D E F G H I J

Peter 10 4 3 6 10 9 6 8 10 8

Jane 1 9 8 9 7 9 6 9 3 8

Mary 10 5 2 7 9 8 5 6 7 6

Page 8: Research Challenges in Group Recommender Systems Judith Masthoff

[Masthoff, 2004]• Plurality Voting• Average• Multiplicative• Borda count• Copeland rule• Approval voting• Least misery• Most pleasure• Average without misery• Fairness• Most respected person

Many strategies exist

• Graph-based ranking [Kim et al, 2013]

• Spearman footrule rank [Baltrunas et al, 2010]

• Nash equilibrium [Carvalho & Macedo, 2013]

• Purity [Salamó et al, 2012]• Completeness

[Salamó et al, 2012]• ……….

Page 9: Research Challenges in Group Recommender Systems Judith Masthoff

Average Strategy

A B C D E F G H I J

Peter 10 4 3 6 10 9 6 8 10 8

Jane 1 9 8 9 7 9 6 9 3 8

Mary 10 5 2 7 9 8 5 6 7 6

Group list: (E, F), H, (D, J), A, I, B, G, C

7 6 4.3 7.3 8.7 8.7 5.7 7.7 6.7 7.3

Use average of the individual ratings

Page 10: Research Challenges in Group Recommender Systems Judith Masthoff

Least misery strategy

A B C D E F G H I J

Peter 10 4 3 6 10 9 6 8 10 8

Jane 1 9 8 9 7 9 6 9 3 8

Mary 10 5 2 7 9 8 5 6 7 6

Group list: F, E, (H, J, D), G, B, I, C, A

1 4 2 6 7 8 5 6 3 6

Use minimum of the individual ratings

Page 11: Research Challenges in Group Recommender Systems Judith Masthoff

Challenge 1: Evaluation and Metrics

Page 12: Research Challenges in Group Recommender Systems Judith Masthoff

Slicing and Dicing

• Want to know why a group recommender system works / does not work

• Slicing: Layered evaluation (Paramythis et al, 2010)

– Break adaptation process down into its constituents (“layers”)

– Evaluate layers separately• Dicing

– Break system down into separate functionalities (e.g. provide recommendations, explain recommendations)

– Evaluate functionalities separately

Page 13: Research Challenges in Group Recommender Systems Judith Masthoff

Layered evaluation

Layered Evaluation – Recap

Presence tracking

Group recommendation

Explicit ratings, user’s viewing

actions

Show top recommendations

by stars

Rank items to recommend

How much user liked viewed items

What the individuals in the group (dis)like, how

they are feeling

Most of this presentation focusses on one layer (DA or UM)

Page 14: Research Challenges in Group Recommender Systems Judith Masthoff

• What does it mean for a group recommender strategy to be good?

• For the group to be satisfied? • But how do you measure the satisfaction of a

group?

How to evaluate how good a strategy is?

Page 15: Research Challenges in Group Recommender Systems Judith Masthoff

Metrics (1)

• Utility for the group

This is what most researchers do, they take the average of the individuals’ ratings (or average of a comparison of rankings of items).

What is the problem with this?

Page 16: Research Challenges in Group Recommender Systems Judith Masthoff

Metrics (2)

• Whether all individuals exceeded a minimum level of satisfaction

When? After a sequence of items? At each point in the sequence?

What is the problem with this?

Page 17: Research Challenges in Group Recommender Systems Judith Masthoff

Metrics (3)

Extent to which group members• Think it is fair?• Think it is best for the group?• Accept the recommendation for the group?• Do not exhibit negative emotions?

With or without having seen the options and individual preferences?

What is the problem with this?

Page 18: Research Challenges in Group Recommender Systems Judith Masthoff

Metrics (4)

Extent to which independent observers• Think it is fair?• Think it is good / best for the group?

Having seen the options and individual preferences

Having seen the reactions of the group members?

What is the problem with this?

Page 19: Research Challenges in Group Recommender Systems Judith Masthoff

Metrics (5)

Extent to which the recommendations correspond to

• What groups would decide themselves?• What human facilitators would decide for the

group?

What is the problem with this?

Page 20: Research Challenges in Group Recommender Systems Judith Masthoff

How to obtain groups for evaluation?

• Artificially construct groups – From existing data about individuals– Or: of invented individuals

• Use real groups:– But without group data– Or: to generate group data

(e.g. What the group decides to watch when together)

– Or: to provide recommendations and measure effect

Page 21: Research Challenges in Group Recommender Systems Judith Masthoff

Groups matter

• Group size

• Homogeneity in opinions in group

• And many other attributes (discussed later)

Page 22: Research Challenges in Group Recommender Systems Judith Masthoff

Domains matter

Music

Tourist attractions

NewsRestaurants / Food

Why do they matter?

Page 23: Research Challenges in Group Recommender Systems Judith Masthoff

Exp1: What do people do?

Compare what people do with what strategies do

I know individual ratings of Peter, Mary,

and Jane. What to recommend to the

group? If time to watch 1-2-3-4-5-6-7 clips…

Why?

Page 24: Research Challenges in Group Recommender Systems Judith Masthoff

Exp1: Results

• Participants do ‘use’ some of the strategies• Care about Misery, Fairness, Preventing

starvation

Page 25: Research Challenges in Group Recommender Systems Judith Masthoff

Exp2: What do people like?

You know the individual ratings of you and your

two friends. I have decided to show you the

following sequence. How satisfied would you be?

And your friends?

Why?

Which strategy does best?Which prediction function does best?

Page 26: Research Challenges in Group Recommender Systems Judith Masthoff

Exp2: Results

• Multiplicative strategy performs best (FEHJDI is the only sequence that has ratings 4 for all subjects for all individuals)

• Prediction functions: Some evidence of normalization, Misery taken into account, Quadratic is better than linear

Page 27: Research Challenges in Group Recommender Systems Judith Masthoff

Evaluation of aggregation strategies (1)

• Our work: Best Multiplicative (from Exp 2 above and simulations with satisfaction functions below)

• Unfortunatly, Multiplicative often not investigated, indicated the ones that did in blue

• Where it was included, it performed best

Page 28: Research Challenges in Group Recommender Systems Judith Masthoff

Evaluation of aggregation strategies

• Senot et al (2010): TV, historic TV use (individual and group data), size 2-5, Best: Average most groups, Dictatorship 20% of groups

• Carvalho & Macedo (2013): MovieLens data, Average metric, size 3-7, Best: Average

• Bourke et al (2011): TV movies, User study, Average metric, size 3-10, Best: Multiplicative

• De Pessemier et al (2013): Movies, Synthetic data, Average metric, size 2-5, Best: Average and Average without Misery

Page 29: Research Challenges in Group Recommender Systems Judith Masthoff

Evaluation of aggregation strategies (3)

• Baltrunas et al (2010): MovieLens data, Weighted average metric, Size 2-8, Average, Borda, Least Misery, Spearman footrule very similar

• Berkovsky & Freyne (2010): Recipes, User study, Size 2-4, Best: Weighted average (only compared to Average)

• Salamo et al (2012): Holidays, Synthetic groups, Average metric, Size 3-8, Best: Multiplicative, Borda, Completeness

Page 30: Research Challenges in Group Recommender Systems Judith Masthoff

Challenge 2: Recommending

Sequences

Page 31: Research Challenges in Group Recommender Systems Judith Masthoff

Why sequences?

• Sequences for groups are a lot more interesting than individual items

• With a sequence, it is harder to please everybody

• Fairness has a larger role• Example domains:

tourist attractions, music in shop, TV news

Page 32: Research Challenges in Group Recommender Systems Judith Masthoff

How to deal with order?

Determine Group List

Show items in the order of the list

Determine top N items to show

Determine Group List

Determine order to show items in

Determine top N items to show

Show items in that order

Determine Group List

UpdateRatings

Show first item of list

Time left

But: mood consistency, strong ending, narrative flow,..

But: given all this, perhaps other items are more suitable..

Page 33: Research Challenges in Group Recommender Systems Judith Masthoff

Exp3: Effect of mood, topic

How much would you want to watch these 7 news items?

How would they make you feel?

[Insert name of your favorite sport’s club] wins important game,Fleet of limos for Jennifer Lopez 100-metre trip,Heart disease could be halved, Is there room for God in Europe?,Earthquake hits Bulgaria, UK fire strike continues,Main three Bulgarian players injured after Bulgaria-Spain football match

The first item on the news is “England football team has to play Bulgaria”. Rate interest, resulting mood.

Rate interest in 7 news items again

Page 34: Research Challenges in Group Recommender Systems Judith Masthoff

Exp3: Results

• Mood can influence ratings• Topical relatedness can influence ratings• Effect of topical relatedness can depend on

rating for first item (if interested then more likely to increase)

• Importance dimension

Page 35: Research Challenges in Group Recommender Systems Judith Masthoff

Domain specific aspects of sequences

For example, in tourist guide domain:• Mutually exclusive / hard to combine items• Physical proximity • Diversity concerns

In news domain:• Novelty concerns• Topical relatedness

How about music?

Page 36: Research Challenges in Group Recommender Systems Judith Masthoff

Challenge 3:

Modelling Satisfaction

Page 37: Research Challenges in Group Recommender Systems Judith Masthoff

Why model satisfaction?

• When adapting to a group of people, you cannot give everyone what they like all the time

• But you don’t want somebody to get too dissatisfied…

• When adapting a sequence to an individual, the order may impact satisfaction

Page 38: Research Challenges in Group Recommender Systems Judith Masthoff

Strategies that use satisfaction

,

,

,

Know how satisfied each user is with the items so far

And their profile

B C H I J

9 8 9 3 8

5 2 6 7 6

4 3 8 10 8

Decide which item to present next, trying to please the least satisfied user

Page 39: Research Challenges in Group Recommender Systems Judith Masthoff

Strongly support grumpiest strategy

• Pick item most liked by the least satisfied person

• If multiple items most liked, use existing strategy (e.g. Multiplicative) to choose between them

Problem: Suppose Mary least satisfied so far - Strategy would pick A - Very bad for Jane - Better to show E?

A B D E

Peter 10 4 6 10

Jane 1 9 9 7

Mary 10 5 7 9

Page 40: Research Challenges in Group Recommender Systems Judith Masthoff

Alternative strategies using satisfaction

• Weakly support grumpiest strategy– Consider all items quite liked (say rating>7) by the

least satisfied person– Use existing strategy to choose between them

• Strategies using weights– Assign weights to users depending on satisfaction– Use weighted form of existing strategy, e.g.

weighted Average– Cannot be done with some strategies, such as

Least Misery

Page 41: Research Challenges in Group Recommender Systems Judith Masthoff

Challenge is to model satisfaction

• Would like a model that predicts satisfaction of an individual userafter a sequence of items

Page 42: Research Challenges in Group Recommender Systems Judith Masthoff

Basic model

Page 43: Research Challenges in Group Recommender Systems Judith Masthoff

Impact

Quadratic(Rebalanced(

Normalized(Rating( )))

Page 44: Research Challenges in Group Recommender Systems Judith Masthoff

Variant 1: Satisfaction decreases over time

x

0 1 =0: No memory =1: Perfect memory

Page 45: Research Challenges in Group Recommender Systems Judith Masthoff

Variant 2: Satisfaction is bounded

x

(1+)

Page 46: Research Challenges in Group Recommender Systems Judith Masthoff

Mood impacts evaluative judgement

How often has your television broken down in the last years?

Hardly ever.

A lot!

Isen et al, 1978

Page 47: Research Challenges in Group Recommender Systems Judith Masthoff

Mood impacts evaluative judgement

How much have you

been persuaded?

A little.

A lot.

Mackie & Worth, 1989

Page 48: Research Challenges in Group Recommender Systems Judith Masthoff

Affective forecasting can change actual emotional experience

I really hate it..

It is ok.I am expecting to

like this…

I am expecting to hate this…

Assimilation

Wilson & Klaaren, 1992

Page 49: Research Challenges in Group Recommender Systems Judith Masthoff

Variant 3: Impact depends on mood

x

x

,

Page 50: Research Challenges in Group Recommender Systems Judith Masthoff

Impact depends on mood

,

x ( )

0 1 =0: No impact mood =1: Mood determines all

Page 51: Research Challenges in Group Recommender Systems Judith Masthoff

Variant 4: Combination of Variants 2 and 3

x

x

,

(1+)

Page 52: Research Challenges in Group Recommender Systems Judith Masthoff

Evaluation by simulation

-20

0

20

40

60

Jane, sequence from Multiplicative strategy

0

10

20

30

40

50

60

70

0 0.2 0.4 0.6 0.8 0.9 1

0

10

20

30

40

50

60

70

0 0.2 0.4 0.6 0.8 0.9 1

-10

-5

0

5

10

15

Variant 1 Variant 2

=

• Models predict how satisfied Peter, Jane, Mary would be with a sequence of items

• If we know , …• UM05 contains 54

prediction graphs• Compare these to

predictions by humans (from Exp 2)

Page 53: Research Challenges in Group Recommender Systems Judith Masthoff

Simulation results

• Some strategies result in misery for any

• Variant 2 predicts misery for the Average strategy, contradicting our subjects

• Value of should be high (>0.5)• With high , Multiplicative performed best

• Lower values of might be better

Page 54: Research Challenges in Group Recommender Systems Judith Masthoff

Inherent complexity of evaluation

Evaluation challenge of EAS workshop at UM05

• Need accurate ratings which is hard due to order effect

• Need to know satisfaction after each item• Items should be ‘unrelated’ and preferably not have

emotional content

Initial empirical evaluation has been done, using an educational task

Page 55: Research Challenges in Group Recommender Systems Judith Masthoff

Exp 4: Which model is best?

• Three lexical decision tasks:

• Subjects split into two groups:Group A: Easy – Hard – MediumGroup B: Hard – Easy – Medium

• Self-reported satisfaction with performance on task, and over all tasks so far

Is this an English word? (2s to answer)

Easy Medium Hard

Women PhialDebut

Page 56: Research Challenges in Group Recommender Systems Judith Masthoff

Exp4: Results

-4

-2

0

2

4

6

8

10

1 2 3

Group A

Group B

Satisfaction with overall performance after each task

Variants 1 and 2 predict lower satisfaction for group B (easy first) after 2 tasks, due to emotions wearing off.

Assimilation could result in higher satisfaction for B.

Comparison with individual task satisfaction, shows more like averaging.

Variant 4 is best?? Task number

Page 57: Research Challenges in Group Recommender Systems Judith Masthoff

Exp4: Results (2)

• Analysed Error Rates of best fit to data of Variants 2 and 4

• Variant 2 fitted the data rather badly• Variant 4 was a lot better, so assimilation parameter

ε helps• Found individual differences in δ and ε

(as expected)

Page 58: Research Challenges in Group Recommender Systems Judith Masthoff

Emotional contagion

Totterdell et al, 1998; Barsade 2002; Bartel & Saavedra, 2000

Page 59: Research Challenges in Group Recommender Systems Judith Masthoff

Emotional contagion

Totterdell et al, 1998; Barsade 2002; Bartel & Saavedra, 2000

Page 60: Research Challenges in Group Recommender Systems Judith Masthoff

Emotional contagion

Totterdell et al, 1998; Barsade 2002; Bartel & Saavedra, 2000

Page 61: Research Challenges in Group Recommender Systems Judith Masthoff

Emotional contagion

, x

Or x -

Page 62: Research Challenges in Group Recommender Systems Judith Masthoff

Susceptibility of emotional contagion

User Dependent

So, should be user dependent

Laird et al, 1994

Page 63: Research Challenges in Group Recommender Systems Judith Masthoff

Types of relationship

Authority Ranking

Market PricingEquality Matching

“Somebody you respect highly”

“Somebody you do deals with / compete with”

“Somebody you share everything with, e.g. a best friend”

Communal Sharing

“Somebody you are on equal footing with”

Fiske, 1992; Haslam, 1994

Page 64: Research Challenges in Group Recommender Systems Judith Masthoff

Susceptibility and types of relationship

=

When calculating of by

Need to take account of ’s susceptibility

And the relationship between and

Page 65: Research Challenges in Group Recommender Systems Judith Masthoff

Exp5: Emotional contagion

• Susceptibility to emotional contagion measured using existing scale (Doherty, 1997)

• “Think of somebody [relationship type]. Assume you and this person are watching TV together. You are enjoying the program a little. How would it make you feel to know that the other person is [enjoying it greatly / really hating it]? My enjoyment would…”

• We expect Authority Ranking and Communal Sharing to have more contagion.

• Will Market Pricing have negative ?

Page 66: Research Challenges in Group Recommender Systems Judith Masthoff

Exp5: Results

• Contagion happens• More contagion for Authority Ranking and

Communal Sharing relationships• No difference between negative and positive

contagion• Susceptibility only seemed to make a difference for

Communal Sharing relationships

Page 67: Research Challenges in Group Recommender Systems Judith Masthoff

Challenge 4: Incorporating Group

Attributes

Page 68: Research Challenges in Group Recommender Systems Judith Masthoff

What attributes matter?

• Remember the task I gave you at the start

• What attributes of the people in your group influenced the decion making (excluding their opinions on the music items)?

• Or could have influenced the decision making if they had beeen present in your group

Page 69: Research Challenges in Group Recommender Systems Judith Masthoff

Attributes of group members

• Demographics and roles [Ardissono et al, 2002; Senot et al, 2010]

• Personality– Propensity to emotional contagion – Agreeableness?– Assertiveness and cooperativeness

[Quijano-Sanchez et al, 2013]

• Expertise [Berkovsky & Freyne; Gatrell et al, 2010, Herr et al, 2012]

• Personal impact/cognitive centrality [Liu et al, 2012; Herr et al, 2012]

Typically used to vary the weights of group members

Page 70: Research Challenges in Group Recommender Systems Judith Masthoff

Attributes of the group as a whole

• Relationship strengthGatrell et al (2010) propose:

Most Pleasure for strong relations, Least Misery for weak, Average for intermediate

• Relationship type:Wang et al (2010) distinguish:– Positionally homogeneous vs heterogenous groups– Tightly coupled versus loosly coupled groups

Typically used to select a different strategy

Page 71: Research Challenges in Group Recommender Systems Judith Masthoff

Attributes of pairs in the group

• Relationship strength/social trust [Quijano-Sanchez et al, 2013]

• Personal impact[Liu et al, 2012; Ye et al, 2012, Ioannidis et al, 2013]

Typically used to adjust the ratings of an individual in light of the ratings of the other person in the pair.

Page 72: Research Challenges in Group Recommender Systems Judith Masthoff

Challenge 5: Explaining Group

Recommendations

Page 73: Research Challenges in Group Recommender Systems Judith Masthoff

Improve:– Trust– Effectiveness– Persuasiveness– Efficiency– Transparency– Scrutability– Satisfaction

[Tintarev & Masthoff]

And these aims can conflict

Aim of explanations in any rec sys

Explanations may be even more important in group recommender systems

Which aims?

Page 74: Research Challenges in Group Recommender Systems Judith Masthoff

Sequence issue

• More work is needed on explaining sequences, particularly sequences that contain items the user will not like

Page 75: Research Challenges in Group Recommender Systems Judith Masthoff

Privacy issue

• Many aims may require explanations that reflect on other group members….

• How to do this without disclosing sensitive information?

• Even general statements such as “this item was not chosen as it was hated by somebody in your group” may cause problems

Page 76: Research Challenges in Group Recommender Systems Judith Masthoff

Conclusion

Many challenges

Needs a lot more work

Page 77: Research Challenges in Group Recommender Systems Judith Masthoff

Questions ?