Efficient Pseudo-Relevance Feedback Methods for Collaborative Filtering Recommendation [ECIR 2016]


Outline

1. Pseudo-Relevance Feedback (PRF)

2. Collaborative Filtering (CF)

3. PRF Methods for CF

4. Experiments

5. Conclusions and Future Work


PSEUDO-RELEVANCE FEEDBACK (PRF)

Pseudo-Relevance Feedback (I)

Pseudo-Relevance Feedback provides an automatic method for query expansion:

# It assumes that the top documents retrieved with the original query are relevant (the pseudo-relevant set).

# The query is expanded with the most representative terms from this set.

# The expanded query is expected to yield better results than the original one (see the sketch below).
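As a rough illustration of the idea (not the exact method evaluated in this work), here is a minimal Python sketch of Rocchio-style expansion; the function name `rocchio_expand`, the weights `alpha` and `beta`, and the tokenised-document representation are all illustrative assumptions.

```python
from collections import Counter

def rocchio_expand(query_terms, ranked_docs, k=10, n_terms=5,
                   alpha=1.0, beta=0.75):
    """Expand a query with the most representative terms of the
    top-k (pseudo-relevant) documents, Rocchio style.
    ranked_docs: documents as lists of tokens, best first (assumed)."""
    centroid = Counter()
    for doc in ranked_docs[:k]:
        for term, tf in Counter(doc).items():
            # average term frequency over the pseudo-relevant set
            centroid[term] += tf / k
    weights = Counter({t: alpha for t in query_terms})
    for term, w in centroid.items():
        weights[term] += beta * w
    expansion = [t for t, _ in weights.most_common()
                 if t not in set(query_terms)][:n_terms]
    return list(query_terms) + expansion
```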


Pseudo-Relevance Feedback (II)

[Figure: the PRF loop. An information need is expressed as a query, the retrieval system returns an initial ranking, the query expansion step builds an expanded query from the top results, and the expanded query is submitted again to produce the final ranking.]

Pseudo-Relevance Feedback (III)

Some popular PRF approaches:

# Based on Rocchio's model (Rocchio, 1971; Carpineto et al., ACM TOIS 2001)

# Relevance-Based Language Models (Lavrenko & Croft, SIGIR 2001)

# Divergence Minimization Model (Zhai & Lafferty, CIKM 2001)

# Mixture Models (Tao & Zhai, SIGIR 2006)


COLLABORATIVE FILTERING (CF)

Recommender Systems

Notation:

# The set of users $U$

# The set of items $I$

# The rating that user $u$ gave to item $i$ is $r_{u,i}$

# The set of items rated by user $u$ is denoted by $I_u$

# The set of users that rated item $i$ is denoted by $U_i$

# The neighbourhood of user $u$ is denoted by $V_u$

Top-N recommendation: create a ranked list containing relevant and unknown items for each user $u \in U$.


Collaborative Filtering (I)

Collaborative Filtering (CF) exploits the past interactions between users and items to generate recommendations.

Idea: if a user who is similar to you likes an item, you may also like it.

Different input data:

# Explicit feedback: ratings, reviews...

# Implicit feedback: clicks, purchases...

It is perhaps the most popular approach to recommendation, given the increasing amount of available information about users.


Collaborative Filtering (II)

Collaborative Filtering (CF) techniques can be classified into:

# Model-based methods: learn a predictive model from the user-item ratings.
◦ Matrix factorisation (e.g., SVD)

# Neighbourhood-based (or memory-based) methods: compute recommendations directly from a subset of the ratings (a sketch follows below).
◦ k-NN approaches
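A minimal sketch of the neighbourhood-based flavour, assuming a dense NumPy rating matrix `R` (rows = users, columns = items, 0 = unrated) and cosine similarity; `user_knn_scores` and its parameters are illustrative choices, not this paper's exact configuration.

```python
import numpy as np

def user_knn_scores(R, u, k=50):
    """Score items for user u with a user-based k-NN approach:
    take the k users most similar to u (cosine similarity) and
    aggregate their ratings weighted by similarity."""
    norms = np.linalg.norm(R, axis=1) + 1e-12
    sims = (R @ R[u]) / (norms * norms[u])   # cosine similarity to u
    sims[u] = -np.inf                        # exclude the target user
    neighbours = np.argsort(sims)[-k:]       # indices of the k nearest users
    scores = sims[neighbours] @ R[neighbours]
    scores[R[u] > 0] = -np.inf               # never recommend known items
    return scores

# Top-N recommendation: rank unseen items by score and keep the best N.
# top_n = np.argsort(user_knn_scores(R, u))[::-1][:10]
```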


PRF METHODS FOR CF

PRF for CF

The analogy between PRF and CF:

PRF                               CF
User's query                      User's profile
most^1, populated^2, state^2      Titanic^2, Avatar^3, Matrix^5
Documents                         Neighbours
Terms                             Items

Previous Work on Adapting PRF Methods to CF

Relevance-Based Language Models

# Originally devised for PRF (Lavrenko & Croft, SIGIR 2001).

# Adapted to CF (Parapar et al., Inf. Process. Manage. 2013).

# Two models: RM1 and RM2.

# High precision figures in recommendation...

# ... but high computational cost!

RM1: $p(i | R_u) \propto \sum_{v \in V_u} p(v)\, p(i|v) \prod_{j \in I_u} p(j|v)$

RM2: $p(i | R_u) \propto p(i) \prod_{j \in I_u} \sum_{v \in V_u} \frac{p(i|v)\, p(v)}{p(i)}\, p(j|v)$
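To make the cost concrete, here is a minimal sketch of RM2 scoring under the CF adaptation, assuming a dense rating matrix, a uniform neighbour prior $p(v)$, and a Jelinek-Mercer-style smoothed $p(i|v)$; the names and the smoothing choice are assumptions, not the exact estimates of Parapar et al. Note the sum over $V_u$ nested inside the product over $I_u$, which is what makes RM2 expensive.

```python
import numpy as np

def rm2_scores(R, u, neighbours, lam=0.1):
    """RM2 for CF: p(i|R_u) ∝ p(i) * prod_{j in I_u} sum_{v in V_u}
    p(i|v) * p(v)/p(i) * p(j|v). Computed in log space."""
    eps = 1e-12
    p_i = R.sum(axis=0) / R.sum() + eps              # collection prior p(i)
    Rv = R[neighbours]
    # smoothed conditional p(.|v) per neighbour (illustrative choice)
    p_given_v = (1 - lam) * Rv / (Rv.sum(axis=1, keepdims=True) + eps) \
                + lam * p_i
    p_v = 1.0 / len(neighbours)                      # uniform prior p(v)
    rated = np.flatnonzero(R[u])                     # the set I_u
    w = p_given_v * (p_v / p_i)                      # |V_u| x |I|
    inner = w.T @ p_given_v[:, rated]                # sum over V_u per (i, j)
    scores = np.log(p_i) + np.log(inner + eps).sum(axis=1)
    scores[R[u] > 0] = -np.inf                       # hide rated items
    return scores
```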



Our Proposals based on Rocchio’s Framework

Rocchio's Weights

$p_{Rocchio}(i|u) \triangleq \frac{\sum_{v \in V_u} r_{v,i}}{|V_u|}$

Robertson Selection Value

$p_{RSV}(i|u) \triangleq \frac{\sum_{v \in V_u} r_{v,i}}{|V_u|}\, p(i|V_u)$

CHI-2

$p_{CHI\text{-}2}(i|u) \triangleq \frac{\left( p(i|V_u) - p(i|C) \right)^2}{p(i|C)}$

Kullback–Leibler Divergence

$p_{KLD}(i|u) \triangleq p(i|V_u) \log \frac{p(i|V_u)}{p(i|C)}$



Probability Estimation

Maximum Likelihood Estimate under a Multinomial Distribution over the ratings:

$p_{mle}(i|V_u) = \frac{\sum_{v \in V_u} r_{v,i}}{\sum_{v \in V_u} \sum_{j \in I} r_{v,j}}$

$p_{mle}(i|C) = \frac{\sum_{u \in U} r_{u,i}}{\sum_{u \in U} \sum_{j \in I} r_{u,j}}$


Neighbourhood Length Normalisation (I)

Neighbourhoods are computed using clustering algorithms:

# Hard clustering: every user belongs to exactly one cluster, and clusters may have different sizes. Example: k-means.

# Soft clustering: each user has their own neighbours. When we set k to a high value, we may find different numbers of neighbours per user. Example: k-NN.

Idea: consider the variability of the neighbourhood lengths:

# A big neighbourhood is equivalent to a query with many results: the collection model is close to the target user's model.

# A small neighbourhood implies that the neighbours are highly specific: the collection model is very different from the target user's.



Neighbourhood Length Normalisation (II)

We bias the MLE to perform neighbourhood length normalisation:

$p_{nmle}(i|V_u) \stackrel{rank}{=} \frac{1}{|V_u|} \, \frac{\sum_{v \in V_u} r_{v,i}}{\sum_{v \in V_u} \sum_{j \in I} r_{v,j}}$

$p_{nmle}(i|C) \stackrel{rank}{=} \frac{1}{|U|} \, \frac{\sum_{u \in U} r_{u,i}}{\sum_{u \in U} \sum_{j \in I} r_{u,j}}$
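In code, the rank-equivalent biased estimate is just the plain MLE divided by the number of users involved (a sketch under the same dense-matrix assumption as before):

```python
import numpy as np

def p_nmle(R, users):
    """Length-normalised MLE (rank-equivalent form): the plain MLE
    over the given user subset, divided by the subset size."""
    sub = R[users]
    return sub.sum(axis=0) / (sub.sum() + 1e-12) / len(users)
```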


EXPERIMENTS

Experimental settings

Baselines:

# UB: traditional user-based neighbourhood approach.

# SVD: matrix factorisation.

# UIR-Item: probabilistic approach.

# RM1 and RM2: Relevance-Based Language Models.

Our algorithms:

# Rocchio's Weights (RW)

# Robertson Selection Value (RSV)

# CHI-2

# Kullback–Leibler Divergence (KLD)


Efficiency

[Figure: recommendation time per user in seconds (log scale, 0.01–10 s) on the ML 100k, ML 1M and ML 10M datasets for UIR, RM1, RM2, SVD++, RSV, UB, RW, CHI-2 and KLD.]

Accuracy (nDCG@10)

Algorithm           ML 100k       ML 1M        R3-Yahoo!     LibraryThing
UB                  0.0468        0.0313       0.0108        0.0055^b
SVD                 0.0936^a      0.0608^a     0.0101        0.0015
UIR-Item            0.2188^ab     0.1795^abd   0.0174^abd    0.0673^abd
RM1                 0.2473^abc    0.1402^ab    0.0146^ab     0.0444^ab
RM2                 0.3323^abcd   0.1992^abd   0.0207^abcd   0.0957^abcd
Rocchio's Weights   0.2604^abcd   0.1557^abd   0.0194^abcd   0.0892^abcd
RSV                 0.2604^abcd   0.1557^abd   0.0194^abcd   0.0892^abcd
KLD (MLE)           0.2693^abcd   0.1264^ab    0.0197^abcd   0.1576^abcde
KLD (NMLE)          0.3120^abcd   0.1546^ab    0.0201^abcd   0.1101^abcde
CHI-2 (MLE)         0.0777^a      0.0709^ab    0.0149^ab     0.0939^abcd
CHI-2 (NMLE)        0.3220^abcd   0.1419^ab    0.0204^abcd   0.1459^abcde

Table: Values of nDCG@10. Pink = best algorithm. Blue = not significantly different from the best (Wilcoxon two-sided p < 0.01).

Diversity (Gini@10)

Algorithm    ML 100k   ML 1M    R3-Yahoo!   LibraryThing
UIR-Item     0.0124    0.0050   0.0137      0.0005
RM2          0.0256    0.0069   0.0207      0.0019
CHI-2 NMLE   0.0450    0.0106   0.0506      0.0539

Table: Values of the complement of the Gini index at 10. Pink = best algorithm.
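For reference, a common formulation of the Gini index over how evenly the recommendations are spread across the item catalogue (this is my reading of the metric; the slide does not spell out the definition):

```latex
% Items i_1, ..., i_{|I|} sorted by increasing probability p(i_k) of
% appearing in a top-10 recommendation list; the table reports the
% complement, so higher values mean more diverse recommendations.
\mathrm{Gini} = \frac{1}{|I| - 1} \sum_{k=1}^{|I|} \left( 2k - |I| - 1 \right) p(i_k),
\qquad \mathrm{Gini@10} = 1 - \mathrm{Gini}
```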


Novelty (MSI@10)

Algorithm    ML 100k    ML 1M       R3-Yahoo!   LibraryThing
UIR-Item     5.2337^e   8.3713^e    3.7186^e    17.1229^e
RM2          6.8273^c   8.9481^c    4.9618^c    19.27343^c
CHI-2 NMLE   8.1711^ec  10.0043^ec  7.5555^ec   8.8563

Table: Values of Mean Self-Information at 10. Pink = best algorithm.
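Self-information novelty is typically computed as below; the exact definition used in the experiments is an assumption on my part:

```latex
% Mean Self-Information of a top-N list L_u: items rated by few
% users carry more information, hence count as more novel.
\mathrm{MSI@N} = \frac{1}{N} \sum_{i \in L_u} \log_2 \frac{|U|}{|U_i|}
```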


Trade-off Accuracy-Diversity

Figure: G-measure of nDCG@10 and Gini@10 on MovieLens 100k, varying the number of neighbours k, using Pearson's correlation similarity (curves for RM2 and CHI-2 NMLE).
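Assuming the standard definition, the G-measure in these trade-off plots is the geometric mean of the two metrics, which rewards methods that do well on both axes at once:

```latex
% geometric mean of accuracy and diversity (analogously for novelty)
G(\mathrm{Gini}, \mathrm{nDCG}) = \sqrt{\mathrm{Gini@10} \cdot \mathrm{nDCG@10}}
```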

Trade-off Accuracy-Novelty

Figure: G-measure of nDCG@10 and MSI@10 on MovieLens 100k, varying the number of neighbours k, using Pearson's correlation similarity (curves for RM2 and CHI-2 NMLE).

CONCLUSIONS AND FUTURE WORK

Conclusions

We proposed to use fast PRF methods (Rocchio's Weights, RSV, KLD and CHI-2):

# They are orders of magnitude faster than the Relevance Models (up to 200x).

# They generate quite accurate recommendations.

# They achieve good novelty and diversity figures, with a better trade-off than RM2.

# They have no parameters of their own (only the clustering parameters).


Future Work

Other approaches for computing neighbourhoods:

# Posterior Probability Clustering (a non-negative matrix factorisation).

# Normalised Cut (spectral clustering).

Explore other PRF methods:

# Divergence Minimization Models.

# Mixture Models.



THANK YOU!

@DVALCARCE  http://www.dc.fi.udc.es/~dvalcarce
