Recommendation (Recommender) Systems - UZH

Recommendation (Recommender) Systems

Prof. Dr. Daning Hu, Department of Informatics, University of Zurich, Nov 12th, 2013


Outline
- Introduction
- Approaches used by Recommendation Systems
  - Collaborative Filtering
  - Content-based
  - Social Contagion

Introduction: Motivation
- Recommendation systems (RS) are designed to tailor product search to each user's needs. They are a type of social computing technology.
- They give online retailers (e.g., Amazon) and e-commerce platforms (e.g., eBay) a way to compete with traditional brick-and-mortar competitors.
  - They ease information overload.
- They aim to answer questions such as:
  - Which digital camera should I buy?
  - What is the best holiday for me and my family?
  - Which movie should I rent?

Introduction
- Recommendation systems are a subclass of information filtering systems that seek to predict the "rating" or "preference" that a user would give to an item or social element they have not yet considered (Wikipedia). The prediction can be made by:
  - drawing on the user's social environment (collaborative filtering approaches),
  - using a model built from the characteristics of an item (content-based approaches), or
  - studying consumers' social behavior.

When does an RS do its job well?
- "Recommend widely unknown items that users might actually like!"
- 20% of items accumulate 74% of all positive ratings
  - measured on items rated > 3 in the MovieLens 100K dataset
- Recommend items from the long tail
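A concentration figure like the one above can be computed along the following lines. This is only a sketch: the function name `head_share`, the (user, item, rating) tuple layout, and the toy ratings are hypothetical and are not taken from MovieLens.

```python
from collections import Counter

def head_share(ratings, head_fraction=0.2, positive_above=3):
    """Share of positive ratings that fall on the most popular items.

    ratings: iterable of (user, item, rating) tuples (assumed layout).
    """
    ratings = list(ratings)
    items = {item for _, item, _ in ratings}
    positives = Counter(item for _, item, r in ratings if r > positive_above)
    total = sum(positives.values())
    if total == 0:
        return 0.0
    # Items sorted by number of positive ratings, most popular first.
    counts = sorted((positives[item] for item in items), reverse=True)
    head_size = max(1, round(head_fraction * len(counts)))
    return sum(counts[:head_size]) / total

# Hypothetical toy data: five items, one of them a blockbuster.
toy_ratings = [
    ("u1", "A", 5), ("u2", "A", 4), ("u3", "A", 5), ("u4", "A", 4),
    ("u1", "B", 4), ("u2", "C", 2), ("u3", "D", 2), ("u4", "E", 1),
]
share = head_share(toy_ratings)
```

A high share means positive ratings are concentrated on a few popular items, which is why recommending from the long tail is the harder and more useful task.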


Underlying Technologies: Machine Learning
- Recommendation systems are instances of personalization software:
  - they adapt to the individual needs, interests, and preferences of each user.
- Machine Learning (ML) aims to learn a user model or profile of a particular user based on:
  - Sample interactions
  - Rated examples
- The learned model is used to filter information and predict consumer behavior.


Collaborative Filtering
- A very prominent approach to generating recommendations
  - used by large commercial e-commerce sites such as Amazon
  - well understood; various algorithms and variations exist
  - applicable in many domains (books, movies, DVDs, ...)
- CF uses the "wisdom/intelligence of people who have similar tastes to yours".
- The basic assumption: customers who had similar tastes in the past will have similar tastes in the future.


Collaborative Filtering
- Start from a database of many users' ratings of a variety of items.
- For a given user, find other similar users whose ratings strongly correlate with the current user's.
- Recommend items that these similar users rated highly but that the current user has not yet rated.
- Input: a matrix of given user–item ratings.
- Output: a (numerical) prediction indicating to what degree the current user will like a certain item.

Collaborative Filtering
- Example
  - Determine whether Jack will like Item5, which he has not yet rated or seen
  - How do we measure similarity?
  - How do we make a prediction from the neighboring users?

        Item1  Item2  Item3  Item4  Item5
Jack      5      3      4      4      ?
User1     3      1      2      3      3
User2     4      3      4      3      5
User3     3      3      1      5      4
User4     1      5      5      2      1

Collaborative Filtering: Measuring User Similarity

        Item1  Item2  Item3  Item4  Item5  sim(Jack, user)
Jack      5      3      4      4      ?
User1     3      1      2      3      3         0.85
User2     4      3      4      3      5         0.70
User3     3      3      1      5      4         0.00
User4     1      5      5      2      1        -0.79
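The similarity formula on this slide did not survive extraction. The listed values are consistent with the Pearson correlation computed over the items both users have rated, so the sketch below assumes that measure; the function name `pearson` and the dictionary layout are illustrative choices.

```python
from math import sqrt

# Ratings from the example table; None marks the missing Item5 rating.
ratings = {
    "Jack":  [5, 3, 4, 4, None],
    "User1": [3, 1, 2, 3, 3],
    "User2": [4, 3, 4, 3, 5],
    "User3": [3, 3, 1, 5, 4],
    "User4": [1, 5, 5, 2, 1],
}

def pearson(a, b):
    """Pearson correlation over co-rated items (assumed similarity measure)."""
    pairs = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if len(pairs) < 2:
        return 0.0
    mean_a = sum(x for x, _ in pairs) / len(pairs)
    mean_b = sum(y for _, y in pairs) / len(pairs)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in pairs)
    norm = sqrt(sum((x - mean_a) ** 2 for x, _ in pairs)) * \
           sqrt(sum((y - mean_b) ** 2 for _, y in pairs))
    # A user with constant ratings has zero variance; treat as uncorrelated.
    return cov / norm if norm else 0.0

sims = {user: pearson(ratings["Jack"], row)
        for user, row in ratings.items() if user != "Jack"}
```

Under this assumption the computed values match the slide to within rounding (User2 comes out at about 0.71 rather than 0.70).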

Making predictions
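The prediction formula on this slide was lost in extraction. A commonly used choice, assumed here, is a similarity-weighted average of the neighbors' mean-centered ratings, added to the active user's own mean. The sketch takes User1 and User2 as the two nearest neighbors with the similarities 0.85 and 0.70, and computes each neighbor's mean over all five of their ratings; both choices are assumptions.

```python
def predict(active_mean, neighbors):
    """Mean-centered weighted average (assumed prediction function).

    neighbors: list of (similarity, neighbor_rating_for_item, neighbor_mean).
    """
    norm = sum(abs(sim) for sim, _, _ in neighbors)
    if norm == 0:
        return active_mean  # no usable neighbors: fall back to the user's mean
    weighted = sum(sim * (rating - mean) for sim, rating, mean in neighbors)
    return active_mean + weighted / norm

jack_mean = (5 + 3 + 4 + 4) / 4            # 4.0
neighbors = [
    (0.85, 3, (3 + 1 + 2 + 3 + 3) / 5),    # User1: rated Item5 with 3, mean 2.4
    (0.70, 5, (4 + 3 + 4 + 3 + 5) / 5),    # User2: rated Item5 with 5, mean 3.8
]
prediction = predict(jack_mean, neighbors)
```

With these inputs the predicted rating for Item5 is roughly 4.87, so Item5 would be a reasonable recommendation for Jack.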

Techniques improving the prediction function
- Not all neighbor ratings might be equally "valuable"
  - Agreement on commonly liked items is less informative than agreement on controversial items
  - Possible solution: give more weight to items that have a higher variance
- Value of the number of co-rated items
  - Use "significance weighting", e.g., by linearly reducing the weight when the number of co-rated items is low
- Neighborhood selection
  - Use a similarity threshold or a fixed number of neighbors
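Significance weighting and neighborhood selection can be sketched as follows. The cutoff of 50 co-rated items is a conventional value assumed for illustration, and the function names are hypothetical.

```python
def significance_weighted(sim, n_corated, cutoff=50):
    """Linearly shrink a similarity when few items are co-rated.

    cutoff is an assumed parameter: at or above it the similarity is unchanged.
    """
    return sim * min(n_corated, cutoff) / cutoff

def select_neighbors(sims, k=None, min_sim=None):
    """Keep neighbors above a similarity threshold and/or the top k."""
    candidates = [(user, s) for user, s in sims.items()
                  if min_sim is None or s >= min_sim]
    candidates.sort(key=lambda pair: pair[1], reverse=True)
    return candidates if k is None else candidates[:k]

sims = {"User1": 0.85, "User2": 0.70, "User3": 0.00, "User4": -0.79}
top_two = select_neighbors(sims, k=2)
above_half = select_neighbors(sims, min_sim=0.5)
```

With only four co-rated items, as in the running example, significance weighting would shrink every similarity substantially, which reflects how little evidence such a small overlap provides.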


Cons
- Cold start / data sparsity problem: there are not enough users or ratings in the user-ratings matrix, making it very sparse. It is hard to find users that have rated the same items.
- First rater: CF does not work for an item that has not been previously rated.
- Popularity bias: CF cannot recommend items to someone with unique tastes.
  - It tends to recommend popular items since they have more previous raters/ratings.

Graph-based methods
- "Spreading activation" (Huang et al. 2004)
  - Exploit the supposed "transitivity" of customer tastes and thereby augment the matrix with additional information
  - Assume that we are looking for a recommendation for User1
  - When using a standard CF approach, User2 will be considered a peer for User1 because they both bought Item2 and Item4
  - Thus Item3 will be recommended to User1 because the nearest neighbor, User2, also bought or liked it

Graph-based methods
- "Spreading activation" (Huang et al. 2004)
  - In a standard user-based or item-based CF approach, paths of length 3 will be considered – that is, Item3 is relevant for User1 because there exists a three-step path (User1–Item2–User2–Item3) between them
  - Because the number of such paths of length 3 is small in sparse rating databases, the idea is to also consider longer paths (indirect associations) to compute recommendations
  - Using path length 5, for instance

Graph-based methods
- "Spreading activation" (Huang et al. 2004)
  - Idea: use paths of length > 3 to recommend items
  - Length 3: recommend Item3 to User1
  - Length 5: Item1 is also recommendable
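The path-length idea can be illustrated with a breadth-first search over the bipartite user-item graph. This is a reachability simplification, not the spreading-activation algorithm of Huang et al. itself. The purchases of User1 and User2 follow the slides; User3's purchases are an assumption, chosen so that Item1 becomes reachable at length 5.

```python
from collections import deque

purchases = {
    "User1": {"Item2", "Item4"},
    "User2": {"Item2", "Item3", "Item4"},
    "User3": {"Item1", "Item3"},  # assumed; not given in the slides
}

def recommend_by_path_length(purchases, user, max_length):
    """Items the user has not bought that are reachable within max_length steps."""
    # Undirected bipartite graph: users link to items and items back to users.
    neighbors = {}
    for u, items in purchases.items():
        for item in items:
            neighbors.setdefault(u, set()).add(item)
            neighbors.setdefault(item, set()).add(u)
    distance = {user: 0}
    queue = deque([user])
    while queue:
        node = queue.popleft()
        if distance[node] == max_length:
            continue  # do not expand beyond the allowed path length
        for nxt in neighbors.get(node, ()):
            if nxt not in distance:
                distance[nxt] = distance[node] + 1
                queue.append(nxt)
    all_items = set().union(*purchases.values())
    return sorted(i for i in all_items
                  if i in distance and i not in purchases[user])

length_3 = recommend_by_path_length(purchases, "User1", 3)
length_5 = recommend_by_path_length(purchases, "User1", 5)
```

Here the length-5 path is User1–Item2–User2–Item3–User3–Item1. A fuller implementation would also weight items by how many paths reach them and how long those paths are.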

Content-based Approaches
- Recommendations are based on information about the content of items rather than on other users' opinions.
- Machine learning algorithms induce a profile of the user's preferences from examples, based on content features.
  - No need for data on other users, which avoids the cold-start and sparsity problems that arise from missing ratings by other users.
  - Able to recommend to users with unique tastes.
  - Often used to recommend text documents.

Title                 Genre              Author             Type       Price  Keywords
The Night of the Gun  Memoir             David Carr         Paperback  29.90  Press and journalism, drug addiction, personal memoirs, New York
The Lace Reader       Fiction, Mystery   Brunonia Barry     Hardcover  49.90  American contemporary fiction, detective, historical
Into the Fire         Romance, Suspense  Suzanne Brockmann  Hardcover  45.90  American fiction, murder, neo-Nazism
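A simple content-based match compares the keywords of each item with the keywords in a user profile. The sketch below uses the Dice coefficient as the overlap measure, which is one common choice and an assumption here. The user profile is hypothetical.

```python
# Keywords taken from the table above.
books = {
    "The Night of the Gun": {"Press and journalism", "drug addiction",
                             "personal memoirs", "New York"},
    "The Lace Reader": {"American contemporary fiction", "detective",
                        "historical"},
    "Into the Fire": {"American fiction", "murder", "neo-Nazism"},
}

def dice(a, b):
    """Dice coefficient between two keyword sets (assumed overlap measure)."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return 2 * len(a & b) / (len(a) + len(b))

# Hypothetical profile of keywords the user has liked in the past.
profile = {"detective", "historical", "murder"}

ranked = sorted(((title, dice(profile, keywords))
                 for title, keywords in books.items()),
                key=lambda pair: pair[1], reverse=True)
```

For this profile, The Lace Reader ranks first because two of its three keywords match. Exact keyword matching treats "American fiction" and "American contemporary fiction" as unrelated, which is one reason plain keyword overlap is a weak representation.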

Term Frequency - Inverse Document Frequency (TF-IDF)
- Simple keyword representation has its problems
  - not every word has similar importance
  - longer documents have a higher chance of overlapping with the user profile
- Standard measure: TF-IDF
  - Encodes text documents in multi-dimensional Euclidean space as weighted term vectors
  - TF: measures how often a term appears (density in a document)
    - assuming that important terms appear more often
    - normalization has to be done in order to take document length into account
  - IDF: aims to reduce the weight of terms that appear in all documents

TF-IDF II
- Given a keyword $i$ and a document $j$
- $TF(i,j)$
  - term frequency of keyword $i$ in document $j$
- $IDF(i)$
  - inverse document frequency, calculated as $IDF(i) = \log \frac{N}{n(i)}$
    - $N$: number of all recommendable documents
    - $n(i)$: number of documents from $N$ in which keyword $i$ appears
- $TF\text{-}IDF$
  - is calculated as: $TF\text{-}IDF(i,j) = TF(i,j) \cdot IDF(i)$
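The definitions above translate into a few lines of code. This sketch makes two assumptions the slides leave open: TF is normalized by dividing the raw count by the document length, and the logarithm is the natural log. The function name `tf_idf` and the toy documents are illustrative.

```python
import math
from collections import Counter

def tf_idf(documents):
    """Return one {term: weight} dict per document.

    documents: list of token lists. TF is count / document length (assumed
    normalization); IDF(i) = log(N / n(i)) as defined above.
    """
    n_docs = len(documents)
    doc_freq = Counter()
    for tokens in documents:
        doc_freq.update(set(tokens))  # count each term once per document
    weights = []
    for tokens in documents:
        counts = Counter(tokens)
        weights.append({
            term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights

documents = [
    ["fiction", "murder"],
    ["fiction", "detective"],
]
weights = tf_idf(documents)
```

The term "fiction" appears in every document, so its IDF is log(1) = 0 and it contributes nothing, while "murder" and "detective" keep a positive weight.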

Limitations of content-based recommendation methods
- Keywords alone may not be sufficient to judge the quality/relevance of a document or web page
  - up-to-dateness, usability, aesthetics, writing style
  - content may also be limited / too short
  - content may not be automatically extractable (multimedia)
- Overspecialization
  - Algorithms tend to propose "more of the same"
  - Or: too similar news items

Discussion & summary
- The content-based approach does not require a user community in order to work.
- It aims to learn a model of the user's interests and preferences based on explicit or implicit feedback.
- Danger exists that recommendation lists contain too many similar items
  - All learning techniques require a certain amount of training data
  - Some learning methods tend to overfit the training data
- Pure content-based systems are rarely found in commercial environments


Using Social Contagion for Recommendations
- Intelligent advertising, product recommendation
- Who are the most influential people?
- What are the patterns of information diffusion?


Social Contagion Theory – Le Bon et al. 1895
- Le Bon, Park, and Blumer, the three major theorists, assumed that something happens in a crowd situation that can cause people to become irrational.
- The social pathology and social contagion perspectives: the idea that someone who already has the affliction (behavior) can pass it on to someone else, and that it can rapidly infect others
  - Gabriel Tarde's work on the "laws of imitation"
- Applications: viral marketing, social media marketing


Social Recommendations for Marketing
- Mass marketing is not the best way to attract people
  - Expensive
  - Usually not very focused
- Recommendations by people we know are more effective than input from unknown individuals
  - Content: our friends know what we like
  - Homophily: our friends and we are more likely to share interests and preferences
  - Biased: we (usually) listen more to what our friends say
  - Inexpensive


Data
- The dataset for this study was collected from a large online open source software (OSS) community, Ohloh, which provides information about 11,800 OSS projects involving 94,330 people
  - Positive evaluation relationships
  - Developers' sociological features
    - Nationality, geographical location, etc.
  - OSS project related information
    - Primary programming language, development activity, ratings, etc.
  - From software revision control repositories: Subversion, CVS, and Git
- The Ohloh web site provides a REST-based application programming interface (API) for users to access and query its data.

Figure 1. Sample data from Ohloh developers

Name             Created_at            Location      Country  Kudo_rank  Programming Language  Total Commits
Jason Allen      2006-09-15T02:23:01Z  Sammamish WA  US       9          Java                  789
Robin Luckey     2006-09-15T02:23:01Z  Seattle WA    US       9          Java                  11358
Scott Collison   2006-09-15T02:23:01Z  Seattle WA    US       8          C                     254
The Ohloh Slave  2006-09-15T02:23:02Z  Redmond WA    US       9          Php                   13


Statistical Analysis on Link Formation
- Dependent variable: the outcome of whether developer D participates in OSS project P at time T, coded as a binary "Kudo" link variable.
- Independent variables include three types of possible determinants
  - Homophily factors
  - Shared affiliation factors
  - Preferential attachment factors


Summary
- Recommender systems have their roots in various research areas, such as
  - information retrieval, information filtering, and text classification.
- Recommender systems apply methods from different fields, such as
  - machine learning, data mining, etc.
- Main topics addressed
  - Basic recommendation algorithms
  - Evaluation of recommender systems and their business value

Outlook on recommendation systems
- Improved collaborative filtering techniques
  - Use more data sources such as tagging data, demographic information, and time data
  - Automatic fine-tuning of parameters
- Context awareness
  - Taking time aspects, geographical location, and additional context aspects of the user into account
  - Emotional context ("I fell in love with a boy. I want to watch a romantic movie.")
- Group recommendations
  - Accompanying persons? ("Recommendations for a couple or for friends?")
