Content Based Recommendations Enhanced with Collaborative Information
POLITECNICO DI MILANO
School of Industrial and Information Engineering
Master of Science in Computer Engineering
Academic Year 2014 – 2015
Candidate: Alessandro Liparoti (819828)
Advisor: Prof. Paolo Cremonesi
RECOMMENDER SYSTEMS
software tools that analyze different sources of data in order to predict the rating that a user would give to an item
Main Families:
• Collaborative Filtering
• Content-based Filtering
• Hybrid algorithms
COLLABORATIVE FILTERING
Collaborative Filtering assumption: users who agreed in the past will also agree in the future
analyze past users’ ratings to compute predictions
User-Rating-Matrix (URM): each entry is the rating given by user u to item i
✔ good performance
✘ not applicable when there are not enough ratings, for both users and items (cold-start problem)
CONTENT-BASED FILTERING
Content-Based Filtering assumption: users will like items similar to those they liked in the past
compute item similarity scores from item features
Item-Content-Matrix (ICM): each entry marks whether item i has the feature k
✔ no need for item ratings (i.e. works in a new-item scenario)
✘ ignoring relations among users leads to worse performance
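As a rough illustration of the content-based idea above, the sketch below scores an unrated item from the rows of a toy ICM. Cosine similarity and the similarity-weighted average are assumptions chosen for illustration, not necessarily what the thesis uses, and the item and feature names are hypothetical.

```python
import math

def cosine_similarity(features_i, features_j):
    """Cosine similarity between two items given as rows of the ICM."""
    dot = sum(a * b for a, b in zip(features_i, features_j))
    norm_i = math.sqrt(sum(a * a for a in features_i))
    norm_j = math.sqrt(sum(b * b for b in features_j))
    if norm_i == 0 or norm_j == 0:
        return 0.0  # an item without features is similar to nothing
    return dot / (norm_i * norm_j)

def predict_score(user_ratings, icm, target_item):
    """Similarity-weighted average of the ratings the user already gave."""
    num = 0.0
    den = 0.0
    for item, rating in user_ratings.items():
        sim = cosine_similarity(icm[item], icm[target_item])
        num += sim * rating
        den += sim
    return num / den if den > 0 else 0.0

# Toy ICM: rows are items, columns are hypothetical features
# (comedy, drama, actor_x).
icm = {
    "item_a": [1, 0, 1],
    "item_b": [1, 0, 0],
    "item_c": [0, 1, 0],
    "new_item": [1, 0, 1],  # never rated by anyone
}
user_ratings = {"item_a": 5.0, "item_b": 4.0, "item_c": 1.0}

# The new item can be scored because only its features are needed.
score = predict_score(user_ratings, icm, "new_item")
```

The new item shares all its features with item_a, so its score is pulled toward the rating the user gave to item_a, even though nobody has rated it yet.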
HYBRID ALGORITHMS
• new-item recommendations
- Factorization Machines (FM)
  - generic factorization model
  - can represent different types of models
- UFSM
  - computes item similarities as a CB approach
  - uses collaborative data to personalize them for each user
• no new-item recommendations
- SSLIM
  - learns a matrix of item-item coefficients
  - improves SLIM adding side information
GOAL OF THE THESIS
usual hybridization: uses item content data to improve collaborative models
our hybridization: builds a content-based model enhanced with collaborative data
CONTENT-BASED COLLABORATIVE
✔ exploits collaborative data (also in a new-item scenario)
✔ uses weights for features and user-feature relations
CONTENT BASED COLLABORATIVE
content-based similarity function
CBC similarity function
• b_k controls the importance of feature k (e.g. genre usually matters more than year of production for movies)
• c_{u,k} controls the importance of the relation between user u and feature k (e.g. a user likes a particular actor)
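The similarity formulas did not survive in this transcript, so the sketch below only illustrates the role of the two kinds of weights. The functional form (shared features weighted by b_k + c_{u,k}) is an assumption made for illustration and may differ from the thesis; all feature names and weight values are hypothetical.

```python
def cb_similarity(f_i, f_j):
    """Plain content-based similarity: number of shared features."""
    return sum(a * b for a, b in zip(f_i, f_j))

def cbc_like_similarity(f_i, f_j, b, c_u):
    """Shared features weighted by a global weight b[k] plus a
    user-specific weight c_u[k] (ASSUMED functional form)."""
    return sum((b[k] + c_u[k]) * f_i[k] * f_j[k] for k in range(len(b)))

# Hypothetical features: genre_comedy, year_1999, actor_x
item_1 = [1, 1, 1]
item_2 = [1, 0, 1]

b = [1.0, 0.2, 0.5]        # global weights: genre counts more than year
c_alice = [0.0, 0.0, 2.0]  # Alice likes actor_x
c_bob = [0.0, 0.0, 0.0]    # Bob has no personal preference

sim_plain = cb_similarity(item_1, item_2)                    # same for every user
sim_alice = cbc_like_similarity(item_1, item_2, b, c_alice)  # personalized
sim_bob = cbc_like_similarity(item_1, item_2, b, c_bob)
```

Under this assumed form the same pair of items is more similar for Alice than for Bob, because the feature they share is one she cares about.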
CBC VERSIONS
rating prediction
CBCrmse
partial effects: analytical sequential approach
item recommendation
CBCbpr
stochastic gradient descent approach
CBC parameters θ are learned by minimizing an error function
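The names CBCrmse and CBCbpr suggest the two error functions involved. A minimal sketch of their textbook definitions follows, assuming the standard RMSE and the standard Bayesian Personalized Ranking (BPR) pairwise loss. Regularization terms are omitted and the exact objectives used in the thesis are not shown in the slides.

```python
import math

def rmse(predicted, actual):
    """Root mean squared error between predicted and observed ratings."""
    squared_error = sum((p - a) ** 2 for p, a in zip(predicted, actual))
    return math.sqrt(squared_error / len(actual))

def bpr_loss(score_pairs):
    """BPR pairwise loss: sum of -ln(sigmoid(s_pos - s_neg)).

    score_pairs holds (score of an item the user interacted with,
    score of an item the user did not interact with).
    """
    total = 0.0
    for s_pos, s_neg in score_pairs:
        # -ln(sigmoid(d)) equals ln(1 + exp(-d))
        total += math.log(1.0 + math.exp(-(s_pos - s_neg)))
    return total

rating_error = rmse([4.0, 3.0], [5.0, 3.0])
good_ranking = bpr_loss([(2.0, 1.0)])  # preferred item scored higher
bad_ranking = bpr_loss([(1.0, 2.0)])   # preferred item scored lower
```

RMSE penalizes the distance between predicted and observed ratings, while the BPR loss only cares about the order of two items for the same user, which is why it suits item recommendation.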
DATASET
HetRec2011-Movielens
RecSys-Polimi IMDB
SIMULATIONS
the URM was split into three parts (A, B and C)
two types of simulation
collaborative: train on A+B, test on C
new-item: train on A, test on C
TESTING
- rating prediction metrics
RMSE
RMSEp (only on positive ratings)
- item recommendation metrics
precision
recall
mean average precision (MAP)
mean reciprocal rank (MRR)
normalized discounted cumulative gain (NDCG)
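The item recommendation metrics listed above can be sketched for a single user as follows, assuming binary relevance and a ranked list of recommended items. MAP and MRR are the means of average precision and reciprocal rank over all test users. Normalizing average precision by the number of relevant items is one common convention and may not match the one used in the thesis.

```python
import math

def precision_at_k(ranked, relevant, k):
    """Fraction of the top-k recommendations that are relevant."""
    hits = sum(1 for item in ranked[:k] if item in relevant)
    return hits / k

def recall_at_k(ranked, relevant, k):
    """Fraction of the relevant items found in the top-k."""
    if not relevant:
        return 0.0
    hits = sum(1 for item in ranked[:k] if item in relevant)
    return hits / len(relevant)

def average_precision(ranked, relevant):
    """Mean of the precision values at each position holding a relevant item."""
    if not relevant:
        return 0.0
    hits = 0
    total = 0.0
    for pos, item in enumerate(ranked, start=1):
        if item in relevant:
            hits += 1
            total += hits / pos
    return total / len(relevant)

def reciprocal_rank(ranked, relevant):
    """Inverse of the position of the first relevant item (0 if none)."""
    for pos, item in enumerate(ranked, start=1):
        if item in relevant:
            return 1.0 / pos
    return 0.0

def ndcg_at_k(ranked, relevant, k):
    """DCG of the top-k divided by the DCG of an ideal ranking."""
    dcg = sum(1.0 / math.log2(pos + 1)
              for pos, item in enumerate(ranked[:k], start=1)
              if item in relevant)
    ideal_hits = min(len(relevant), k)
    idcg = sum(1.0 / math.log2(pos + 1) for pos in range(1, ideal_hits + 1))
    return dcg / idcg if idcg > 0 else 0.0

# Hypothetical recommendation list and held-out relevant items for one user.
ranked = ["a", "b", "c", "d"]
relevant = {"b", "d"}
```

With this toy list, two of the four recommended items are relevant and the first hit is at position 2.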
ALGORITHMS OF COMPARISON
• Collaborative
Matrix Factorization (MF)
Asymmetric-SVD
Bayesian Personalized Ranking MF
• Hybrid
Factorization Machines
• Content-Based
item k-nearest-neighbor
RESULTS
collaborative simulations
RESULTS – ITEM RECOMMENDATION
new-item simulations
collaborative method (BPR-MF) not applicable
CBC is the best algorithm
THINGS TO DO
use weights to control the importance of relations among items’ features
(e.g. two actors appearing together in many movies)
use a CBC-like similarity function to compare users instead of items
(e.g. gender, age, demographic information, …)
THE END
THANK YOU