GAUSSIAN PROCESS FACTORIZATION MACHINES FOR CONTEXT-AWARE RECOMMENDATIONS

Trung V. Nguyen, Alexandros Karatzoglou, Linas Baltrunas

SIGIR 2014

Presentation: Vu Tien Duong

CONTENT
• Introduction
• Gaussian processes
• GPFM
• GPPW
• Evaluation
• Conclusion

Introduction

• Context: the environment in which a recommendation is provided.

• Multidimensional latent factors: variables are represented as latent features in a low-dimensional space

• Context-aware recommendation (CAR): the user-item-context interactions are modeled with factor models, e.g.
  • Tensor Factorization
  • Factorization Machines

Introduction (1)
• Problem: given the many possible types of interactions between users, items, and contextual variables, it may be unrealistic to restrict these interactions to linearity.
⇒ Solution: Gaussian Process Factorization Machines (GPFM), a non-linear context-aware collaborative filtering method based on Gaussian Processes.

Introduction (2)

Contributions:
• Applicable to both explicit and implicit feedback
• Uses stochastic gradient descent (SGD) optimization to keep the model scalable
• The first GP-based approach to context-aware recommendations

Steps of the method

1. Convert the observed data to a latent representation
2. Specify the prior and likelihood of the model
3. Learn the latent features and hyperparameters
4. Predict the utility of unseen (item, context) pairs

Gaussian Processes
• Widely used for modeling relational data
• Use flexible covariance functions
• An important tool for modeling non-linear, complex patterns

Gaussian Processes (1)
• A GP defines a distribution over functions f ~ GP(m, k), specified by a mean function m and a covariance function k
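
To make this concrete, below is a minimal sketch of a GP prior with a squared-exponential (RBF) covariance, one common choice of k; the inputs and hyperparameters are illustrative, not the paper's:

```python
import numpy as np

def rbf_kernel(X1, X2, lengthscale=1.0, variance=1.0):
    """Squared-exponential covariance: k(x, x') = s^2 * exp(-||x - x'||^2 / (2 l^2))."""
    sq_dists = (np.sum(X1**2, axis=1)[:, None] + np.sum(X2**2, axis=1)[None, :]
                - 2.0 * X1 @ X2.T)
    return variance * np.exp(-0.5 * sq_dists / lengthscale**2)

# Draw sample functions from a zero-mean GP prior at 50 test inputs.
rng = np.random.default_rng(0)
X = np.linspace(-3, 3, 50)[:, None]
K = rbf_kernel(X, X) + 1e-8 * np.eye(len(X))  # jitter for numerical stability
samples = rng.multivariate_normal(np.zeros(len(X)), K, size=3)
```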

Gaussian Process Factorization Machines (GPFM)
• First, convert the data to a latent representation (sketched below)
• Applying GPs to CAR: GPFM is specified by its prior and likelihood
• Many types of covariance function k can be chosen
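
A minimal sketch of the latent-representation step, assuming t(j, c) concatenates the item's and the context's latent feature vectors (the factor matrices below are random stand-ins for the learned X):

```python
import numpy as np

D = 5                                       # latent dimensionality (illustrative)
rng = np.random.default_rng(1)
item_factors = rng.normal(size=(100, D))    # stand-in for learned item features
context_factors = rng.normal(size=(10, D))  # stand-in for learned context features

def t(j, c):
    """Latent representation of an (item, context) pair by concatenation."""
    return np.concatenate([item_factors[j], context_factors[c]])

x = t(3, 7)  # GP input for item 3 in context 7; its utility f(x) gets a GP prior
```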

Pairwise Comparison for Implicit Feedback

• A pairwise comparison (j1, c1) >_i (j2, c2) says that user i has higher utility for item j1 in context c1 than for item j2 in context c2

Pairwise Preference Model (GPPW)
• Similar to GPFM, but defined over pairwise comparisons rather than absolute utilities
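
Because a difference of jointly Gaussian function values is again Gaussian, pairwise preferences f(x1) − f(x2) inherit a closed-form covariance from the base kernel. A sketch of this standard construction (the paper's exact GPPW formulation may differ in details):

```python
import numpy as np

def k(x, xp, lengthscale=1.0, variance=1.0):
    """Base RBF covariance between two latent inputs."""
    return variance * np.exp(-0.5 * np.sum((x - xp) ** 2) / lengthscale**2)

def pairwise_cov(x1, x2, x1p, x2p):
    """cov[f(x1)-f(x2), f(x1')-f(x2')] =
       k(x1,x1') - k(x1,x2') - k(x2,x1') + k(x2,x2')."""
    return k(x1, x1p) - k(x1, x2p) - k(x2, x1p) + k(x2, x2p)
```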

LEARNING
• Use empirical Bayes: optimize the variables {X, θ, σ} by maximizing the marginal likelihood
• Use Stochastic Gradient Descent (SGD) to train: iterate over each user and update that user's parameters according to the update rule:

θ ← θ − η ∇_θ L, where L = −log p(y | X, θ, σ) is the negative log marginal likelihood of that user's observations and η is the learning rate (a schematic loop follows below)
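
A schematic of the per-user SGD loop; `nlml_grad` is a hypothetical helper that returns the gradient of the negative log marginal likelihood for one user's observations (the paper derives the exact gradients for its kernels):

```python
def sgd_train(users, params, nlml_grad, lr=0.01, n_epochs=10):
    """Per-user SGD: each step applies theta <- theta - lr * grad to the
    parameters {X, theta, sigma} involved in one user's observations."""
    for _ in range(n_epochs):
        for user in users:
            grads = nlml_grad(params, user)  # dict: parameter name -> gradient array
            for name, g in grads.items():
                params[name] = params[name] - lr * g
    return params
```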

PREDICTIVE DISTRIBUTION
• Once X and θ are learned, use them to make predictions for unseen (item, context) pairs
• Given a test observation (j*, c*):
  • Convert it to the latent representation x* = t(j*, c*)
  • Apply p(f(x*) | y) = N(μ, s), where μ and s are computed from k, K, y, and θ
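
The μ and s above are the standard GP regression posterior moments, μ = k*ᵀ(K + σ²I)⁻¹y and s = k(x*, x*) − k*ᵀ(K + σ²I)⁻¹k*. A minimal sketch (variable names are illustrative):

```python
import numpy as np

def gp_predict(K, k_star, k_star_star, y, noise_var):
    """Posterior mean and variance of f(x*) given the training covariance K,
    cross-covariance vector k_star, prior variance k_star_star, and targets y."""
    L = np.linalg.cholesky(K + noise_var * np.eye(len(y)))  # stable inversion
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    mean = k_star @ alpha
    v = np.linalg.solve(L, k_star)
    var = k_star_star - v @ v
    return mean, var
```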

Evaluation
• Implicit datasets: FRAPPE, converted FOOD, converted COMODA
• Explicit datasets: ADOM, COMODA, FOOD, SUSHI
• Compared methods: fm, multiverse, mf, constant
• Metrics:
  • Overall quality: MAE, RMSE
  • Top items in a list: NDCG, ERR

Dataset detail

Dataset           Detail
ADOM              Movies (from students)
COMODA            Movies
FOOD              Food menus
SUSHI             Sushi types (from Japanese users)
FRAPPE            Android applications
Converted FOOD    FOOD with rating > 3 treated as positive
Converted COMODA  COMODA with rating > 4 treated as positive

Evaluation
• Split each dataset into 5 folds and iterate 5 times: each time, one fold is used for testing and 4 for training (see the sketch below).
• Tune parameters empirically using one fold as validation.
• Fix the tuned parameters when running experiments on the other 4 folds.
• Report performance as the average over the 5 folds.
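
A sketch of this protocol using scikit-learn's KFold; `model_factory` is a hypothetical constructor that builds a model with the tuned, fixed parameters:

```python
import numpy as np
from sklearn.model_selection import KFold

def cross_validate(model_factory, X, y, n_splits=5, seed=0):
    """Train on 4 folds, test on the held-out fold, report the 5-fold average."""
    scores = []
    for train_idx, test_idx in KFold(n_splits, shuffle=True, random_state=seed).split(X):
        model = model_factory()
        model.fit(X[train_idx], y[train_idx])
        scores.append(model.score(X[test_idx], y[test_idx]))
    return float(np.mean(scores))
```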

Evaluation of GPFM for Explicit Feedback

• Context-aware vs. context-agnostic: gpfm and fm significantly outperform mf
  ⇒ benefit of contextual information
• multiverse outperforms mf on ADOM and FOOD, but does poorly on COMODA and SUSHI
  ⇒ it struggles with high-dimensional context
• gpfm significantly outperforms fm, the best context-aware baseline
  ⇒ the non-linearity of GPFM leads to substantial performance gains in CAR

GPFM - Explicit Feedback

Evaluation of GPPW for Implicit Feedback

• gppw significantly outperforms both gpfm and fm on FOOD and COMODA
• Comparable ERR@10 and MAP@10 on FRAPPE
⇒ Learning from paired comparisons can lead to substantial improvements in ranking (compared to optimizing item-based scores)
⇒ GPPW is more effective than GPFM on implicit feedback, with little computational overhead

CONCLUSION
• The utility of an item under a context is modeled as a function in the latent feature space of the item and context
• By placing Gaussian process priors on these utility functions, GPFM captures complex, non-linear user-item-context interactions, giving it powerful and flexible modeling capacity