Multiverse Recommendation: N-dimensional Tensor Factorization for Context-aware Collaborative Filtering


DESCRIPTION

Slides from the RecSys 2010 presentation. Context has been recognized as an important factor to consider in personalized Recommender Systems. However, most model-based Collaborative Filtering approaches such as Matrix Factorization do not provide a straightforward way of integrating context information into the model. In this work, we introduce a Collaborative Filtering method based on Tensor Factorization, a generalization of Matrix Factorization that allows for a flexible and generic integration of contextual information by modeling the data as a User-Item-Context N-dimensional tensor instead of the traditional 2D User-Item matrix. In the proposed model, called Multiverse Recommendation, different types of context are considered as additional dimensions in the representation of the data as a tensor.


Multiverse Recommendation: N-dimensional Tensor Factorization for Context-aware Collaborative Filtering

Alexandros Karatzoglou (1), Xavier Amatriain (1), Linas Baltrunas (2), Nuria Oliver (1)

(1) Telefonica Research, Barcelona, Spain

(2) Free University of Bolzano, Bolzano, Italy

November 4, 2010


Context in Recommender Systems

Context is an important factor to consider in personalized Recommendation


Current State of the Art in Context-aware Recommendation

- Pre-Filtering Techniques
- Post-Filtering Techniques
- Contextual Modeling

The approach presented here fits in the Contextual Modeling category


Collaborative Filtering problem setting

Typical data sizes, e.g. Netflix data: n = 5 × 10^5 users, m = 17 × 10^3 movies


Standard Matrix Factorization

Find U ∈ R^(n×d) and M ∈ R^(d×m) so that F = UM

minimize_{U,M} L(F, Y) + λ Ω(U, M)

[Figure: the Users × Movies rating matrix approximated by the product of user factors U and movie factors M]

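As a minimal sketch of this setting (not code from the talk), assuming numpy, an illustrative latent dimension d = 10 and random initialization, the factor matrices and a single-rating prediction F_ij = U_i* M_*j look like:

    import numpy as np

    n, m, d = 500_000, 17_000, 10      # users, movies, latent dimension (d chosen for illustration)
    U = 0.1 * np.random.randn(n, d)    # user factors, U in R^(n x d)
    M = 0.1 * np.random.randn(d, m)    # movie factors, M in R^(d x m)

    def predict(i, j):
        # F_ij = (UM)_ij = U_i* . M_*j
        return U[i] @ M[:, j]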

Multiverse Recommendation: Tensors for Context-Aware Collaborative Filtering

[Figure: the Users × Movies rating data extended with context as an additional tensor dimension]

F_ijk = S ×_U U_i* ×_M M_j* ×_C C_k*

R[U, M, C, S] := L(F, Y) + Ω[U, M, C] + Ω[S]

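A minimal numpy sketch of this multilinear prediction, under assumed (illustrative) sizes for the data and the latent dimensions; the three mode products with the core tensor S collapse into a single einsum:

    import numpy as np

    n, m, c = 100, 50, 3                 # users, movies, context conditions (illustrative sizes)
    dU, dM, dC = 5, 5, 2                 # latent dimensions per mode (illustrative)
    U = np.random.randn(n, dU)           # user factors
    M = np.random.randn(m, dM)           # movie factors
    C = np.random.randn(c, dC)           # context factors
    S = np.random.randn(dU, dM, dC)      # core tensor

    # F_ijk = S x_U U_i* x_M M_j* x_C C_k*, computed for all (i, j, k) at once
    F = np.einsum('abc,ia,jb,kc->ijk', S, U, M, C)

For a single entry, np.einsum('abc,a,b,c->', S, U[i], M[j], C[k]) gives F_ijk directly.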

Tensors for Context-Aware Collaborative Filtering

[Figure: the Users × Movies × Context rating tensor decomposed into factor matrices U, M, C and core tensor S]

F_ijk = S ×_U U_i* ×_M M_j* ×_C C_k*

R[U, M, C, S] := L(F, Y) + Ω[U, M, C] + Ω[S]


Regularization

Ω[U, M, C] = λ_M ‖M‖_F^2 + λ_U ‖U‖_F^2 + λ_C ‖C‖_F^2

Ω[S] := λ_S ‖S‖_F^2

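A direct transcription of these penalties as a sketch; the λ values below are illustrative placeholders, not the values used in the experiments:

    import numpy as np

    def omega(U, M, C, S, lam_U=0.01, lam_M=0.01, lam_C=0.01, lam_S=0.001):
        # squared Frobenius norms of the factor matrices and the core tensor
        return (lam_U * np.sum(U ** 2) + lam_M * np.sum(M ** 2)
                + lam_C * np.sum(C ** 2) + lam_S * np.sum(S ** 2))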

Squared Error Loss Function

Many implementations of MF use a simple squared error regression loss function

l(f, y) = (1/2) (f − y)^2

thus the loss over all users and items is:

L(F, Y) = Σ_{i}^{n} Σ_{j}^{m} l(f_ij, y_ij)

Note that this loss provides an estimate of the conditional mean

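In code, a sketch of the pointwise loss and of its derivative with respect to the prediction (the quantity ∂_F l needed later for the SGD gradients):

    def squared_loss(f, y):
        # l(f, y) = 1/2 (f - y)^2
        return 0.5 * (f - y) ** 2

    def squared_loss_grad(f, y):
        # derivative of l with respect to the prediction f
        return f - y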

Absolute Error Loss Function

Alternatively one can use the absolute error loss function

l(f, y) = |f − y|

thus the loss over all users and items is:

L(F, Y) = Σ_{i}^{n} Σ_{j}^{m} l(f_ij, y_ij)

which provides an estimate of the conditional median

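The corresponding sketch for the absolute error loss; a subgradient (the sign) is used, since the loss is not differentiable at f = y:

    import numpy as np

    def absolute_loss(f, y):
        # l(f, y) = |f - y|
        return abs(f - y)

    def absolute_loss_grad(f, y):
        # subgradient with respect to f; sign() returns 0 at f == y
        return np.sign(f - y)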

Optimization - Stochastic Gradient Descent for TF

The partial gradients with respect to U, M, C and S can then be written as:

∂_{U_i*} l(F_ijk, Y_ijk) = ∂_{F_ijk} l(F_ijk, Y_ijk) S ×_M M_j* ×_C C_k*

∂_{M_j*} l(F_ijk, Y_ijk) = ∂_{F_ijk} l(F_ijk, Y_ijk) S ×_U U_i* ×_C C_k*

∂_{C_k*} l(F_ijk, Y_ijk) = ∂_{F_ijk} l(F_ijk, Y_ijk) S ×_U U_i* ×_M M_j*

∂_S l(F_ijk, Y_ijk) = ∂_{F_ijk} l(F_ijk, Y_ijk) U_i* ⊗ M_j* ⊗ C_k*

We then iteratively update the parameter matrices and tensors using the following update rules:

U_i*^(t+1) = U_i*^(t) − η ∂_U L − η λ_U U_i*

M_j*^(t+1) = M_j*^(t) − η ∂_M L − η λ_M M_j*

C_k*^(t+1) = C_k*^(t) − η ∂_C L − η λ_C C_k*

S^(t+1) = S^(t) − η ∂_S l(F_ijk, Y_ijk) − η λ_S S

where η is the learning rate.

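A minimal numpy sketch of one stochastic gradient step on a single observed rating (i, j, k), assuming the squared error loss and, for brevity, a single regularization weight lam in place of the separate λ_U, λ_M, λ_C, λ_S; the learning rate eta is illustrative:

    import numpy as np

    def sgd_step(S, U, M, C, i, j, k, y, eta=0.01, lam=0.01):
        u, m, c = U[i], M[j], C[k]
        f = np.einsum('abc,a,b,c->', S, u, m, c)        # prediction F_ijk
        dF = f - y                                      # dl/dF_ijk for the squared error loss
        # partial gradients: the core tensor contracted with the other two factor rows
        grad_u = dF * np.einsum('abc,b,c->a', S, m, c)
        grad_m = dF * np.einsum('abc,a,c->b', S, u, c)
        grad_c = dF * np.einsum('abc,a,b->c', S, u, m)
        grad_S = dF * np.einsum('a,b,c->abc', u, m, c)  # outer product U_i* x M_j* x C_k*
        # update rules with learning rate eta and regularization weight lam
        U[i] = u - eta * grad_u - eta * lam * u
        M[j] = m - eta * grad_m - eta * lam * m
        C[k] = c - eta * grad_c - eta * lam * c
        S   -= eta * grad_S + eta * lam * S

Sweeping this step over all observed (i, j, k, y) tuples for several epochs gives the full training loop.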


Optimization - Stochastic Gradient Descent for TF

[Figure: the SGD updates illustrated on the Users × Movies × Context tensor and the factors U, M, C and core tensor S]


Experimental evaluation

We evaluate our model on contextual rating data by computing the Mean Absolute Error (MAE), using 5-fold cross validation, defined as follows:

MAE = (1/K) Σ_{ijk}^{n,m,c} D_ijk |Y_ijk − F_ijk|

where D_ijk indicates whether rating Y_ijk is observed in the test set and K is the number of such ratings

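A sketch of this computation, assuming Y and F are the observed and predicted rating tensors, D is a 0/1 indicator marking the held-out test ratings, and K is the number of such ratings:

    import numpy as np

    def mae(Y, F, D):
        # MAE = (1/K) * sum_ijk D_ijk * |Y_ijk - F_ijk|
        K = D.sum()
        return np.sum(D * np.abs(Y - F)) / K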

Data

Data set   Users   Movies   Context Dim.   Ratings   Scale
Yahoo!     7642    11915    2              221K      1-5
Adom.      84      192      5              1464      1-13
Food       212     20       2              6360      1-5

Table: Data set statistics


Context-Aware Methods

- Pre-filtering based approach (G. Adomavicius et al.), which computes recommendations using only the ratings made in the same context as the target one
- Item splitting method (L. Baltrunas, F. Ricci), which identifies items that have significant differences in their ratings under different context situations


Results: Context vs. No Context

[Figure: two MAE bar charts, panels (a) and (b), comparing No Context against Tensor Factorization]

Figure: Comparison of matrix (no context) and tensor (context) factorization on the Adom and Food data.


Yahoo Artificial Data

[Figure: three MAE bar charts for α = 0.1, α = 0.5 and α = 0.9, comparing No Context, Reduction, Item-Split and Tensor Factorization]

Figure: Comparison of context-aware methods on the Yahoo! artificial data.


Yahoo Artificial Data

[Figure: MAE as a function of the probability of contextual influence (0.0 to 0.9) for No Context, Reduction, Item-Split and Tensor Factorization]

Figure: Evolution of MAE values for different methods with increasing influence of the context variable on the Yahoo! data.


Tensor Factorization

[Figure: MAE bar chart comparing Reduction, Item-Split and Tensor Factorization]

Figure: Comparison of context-aware methods on the Adom data.


Tensor Factorization

[Figure: MAE bar chart comparing No Context, Reduction and Tensor Factorization]

Figure: Comparison of context-aware methods on the Food data.


Conclusions

- Tensor Factorization methods seem to be promising for CARS (Context-Aware Recommender Systems)
- Many different TF methods exist
- Future work: extend to implicit taste data
- Tensor representation of context data seems promising


Thank You!

