Recommender Systems
Robin Burke
DePaul University
Chicago, IL
About myself
PhD 1993 Northwestern University
– Intelligent Multimedia Retrieval
1993-1998
– Post-doc at University of Chicago with Kristian Hammond
– Helped found Recommender, Inc. (became Verb, Inc.)
1998-2000
– Dir. of Software Development
– Adjunct at University of California, Irvine
2000-2002
– California State University, Fullerton
2002-present
– DePaul University
My Interests
Memory
– How do we remember the right thing at the right time?
– Why is it that computers are so bad at this?
– How does knowledge of different types shape the activity of memory?
Organization
3 days
21 hours
Not me talking all the time!
Partners
– For in-class activities
– For coding labs
For labs
– Must be one laptop per pair
– Using Eclipse / Java
Activity 1
With your partner
One person should recommend a movie or DVD to the other
– asking questions as necessary
– in the end, you should be confident that the recommendation is right
No right or wrong way to do this!
Take note of
– the questions you ask
– the reasons for the recommendation
Discussion
Recommender
– What did you have to ask?
– How did you use this information?
Recommendee
– What made you sure the recommendation was good?
Example: Amazon.com
Product similarity
Market-basket analysis
Profitability analysis
Sequential pattern mining
Application: Recommender.com
Similar movies
Applying a critique
New results
Knowledge employed
Similarity metric
– what makes something "alike"?
– # of features in common is not sufficient
Movies
– genres of movies
– types of actors
– directorial styles
– meaning of ratings
  NR could mean adult, but it could just be a foreign movie
This class
Tuesday
A. 8:00 – 10:30
B. 10:45 – 13:00
C. 15:00 – 18:00
Wednesday
D. 8:00 – 10:00
E. 10:15 – 13:00
F. 17:00 – 19:00
Thursday
G. 8:00 – 11:00
H. 14:30 – 16:00
I. 18:00 – 20:00
Roadmap
Session A: Basic Techniques I
– Introduction
– Knowledge Sources
– Recommendation Types
– Collaborative Recommendation
Session B: Basic Techniques II
– Content-based Recommendation
– Knowledge-based Recommendation
Session C: Domains and Implementation I
– Recommendation domains
– Example Implementation
– Lab I
Session D: Evaluation I
– Evaluation
Session E: Applications
– User Interaction
– Web Personalization
Session F: Implementation II
– Lab II
Session G: Hybrid Recommendation
Session H: Robustness
Session I: Advanced Topics
– Dynamics
– Beyond accuracy
Recommender Systems
Wikipedia:
– Recommendation systems are programs which attempt to predict items (movies, music, books, news, web pages) that a user may be interested in, given some information about the user's profile.
My definition:
– Any system that guides the user in a personalized way to interesting or useful objects in a large space of possible options or that produces such objects as output.
Historical note
Used to be a more restrictive definition
– “people provide recommendations as inputs, which the system then aggregates and directs to appropriate recipients” (Resnick & Varian 1997)
Aspects of the definition
basis for recommendation
– personalization
process of recommendation
– interactivity
results of recommendation
– interest / useful objects
Personalization
– Any system that guides the user in a personalized way to interesting or useful objects in a large space of possible options or that produces such objects as output.
Definitions agree that recommendations are personalized
– Some might say that suggesting a best-seller to everyone is a form of recommendation
Meaning
– the process is guided by some user-specific information
  could be a long-term model
  could be a query
Interactivity
– Any system that guides the user in a personalized way to interesting or useful objects in a large space of possible options or that produces such objects as output.
Many possible interaction styles
– query / retrieve
– recommendation list
– predicted rating
– dialog
Results
– Any system that guides the user in a personalized way to interesting or useful objects in a large space of possible options or that produces such objects as output.
Recommendation = Search?
Search
– a query matching process
– given a query, return all items that match it
Recommendation
– a need satisfaction process
– given a need, return items that are likely to satisfy it
Some definitions
Recommendation
Items
Domain
Users
Ratings
Profile
Recommendation
A prediction of a given user's likely preference regarding an item
Issues
– Negative prediction
– Presentation / Interface
Notation
– Pred(u,i)
Items
The things being recommended
– can be products
– can be documents
Assumption
– Discrete items are being recommended
– Not, for example, contract terms
Issues
– Cost
– Frequency of purchase
– Customizability
– Configurations
Notation
– I = set of all items
– i = an individual item
Recommendation Domain
What is being recommended?
– a $0.99 music track?
– a $1.9 M luxury condo?
Much depends on the characteristics of the domain
– cost
  how costly is a false positive? how costly is a false negative?
– portfolio
  OK to recommend something that the user has already seen? compatibility with owned items?
– individual vs group
  are we recommending something for individual or group consumption?
– single item vs configuration
  are we recommending a single item or a configuration of items? what are the constraints that tie configurations together?
– constraints
  what types of constraints are users likely to impose (hard vs soft)?
Example 1
Music track (à la iTunes)
– low cost
– individual
– configuration: fit into existing playlist?
– portfolio: should not be already owned
– constraints: likely to be soft
Example 2
Course advising
– high cost
– individual
– configuration
  must fit with other courses (prerequisites)
– portfolio: should not have already been taken
– constraints: may be hard
  graduation requirements
  time and day
Example 3
DVD rental
– low cost
– group consumption
– no configuration issues
– portfolio: possible to recommend a favorite title again
  e.g., Christmas movies
– constraints: likely to be soft
  some could be hard, like maximum allowed rating
Users
People who need / want items
Assumption
– (Usually) repeat users
Issues
– Portfolio effects
Notation
– U = set of all users
– u = a particular user
Ratings
A (numeric) score given by a user to a particular item representing the user's preference for that item.
Assumption
– Preferences are static (or at least of long duration)
Issues
– Multi-dimensional ratings
– Context-dependencies
Notation
– r_{u,i} = the rating of item i by user u
– R_{U,i} = R_i = the ratings of item i by all users
Explicit vs Implicit Ratings
An explicit rating is one that has been provided by a user
– via a user interface
An implicit rating is inferred from user behavior
– for example, as recorded in web log data
Issues
– effort threshold
– noise
Collecting Explicit Ratings
Profile
A user profile is everything that the system knows about a particular user
Issues
– profile dimensionality
Notation
– P = all profiles
– P_u = the profile of user u
Knowledge Sources
An AI system requires knowledge
Takes various forms
– raw data
– algorithm
– heuristics
– ontology
– rule base
In Recommendation
Social knowledge
User knowledge
Content knowledge
Knowledge source: Collaborative
A collaborative knowledge source is one that holds information about peer users in a system
Examples
– ratings of items
– age, sex, income of other users
Knowledge source: User
A user knowledge source is one that holds information about the current user
– the one who needs a recommendation
Example
– a query the user has entered
– a model of the user's preferences
Knowledge source: Content
A content knowledge source holds information about the items being recommended
Example
– knowledge about how items satisfy user needs
– knowledge about the attributes of items
Recommendation Knowledge Sources Taxonomy
Recommendation Knowledge
– Collaborative
  Opinion Profiles
  Demographic Profiles
– User
  Opinions
  Demographics
  Requirements
    Query
    Constraints
    Preferences
  Context
– Content
  Item Features
  Domain Knowledge
    Feature Ontology
    Means-ends
    Domain Constraints
  Contextual Knowledge
Break
Roadmap
Session A: Basic Techniques I
– Introduction
– Knowledge Sources
– Recommendation Types
– Collaborative Recommendation
Session B: Basic Techniques II
– Content-based Recommendation
– Knowledge-based Recommendation
Session C: Domains and Implementation I
– Recommendation domains
– Example Implementation
– Lab I
Session D: Evaluation I
– Evaluation
Session E: Applications
– User Interaction
– Web Personalization
Session F: Implementation II
– Lab II
Session G: Hybrid Recommendation
Session H: Robustness
Session I: Advanced Topics
– Dynamics
– Beyond accuracy
Recommendation Types
Default (non-personalized)
– "Would you like fries with that?"
Collaborative
– "Most people who bought hamburgers also bought fries."
Demographic
– "Most 45-year-old computer scientists buy fries."
Content-based
– "You usually buy fries with your burgers."
Knowledge-based
– "A large order of curly fries would really complement the flavor of a Western Bacon Cheeseburger."
Collaborative
Key knowledge source
– opinion database
Process
– given a target user, find similar peer users
– extrapolate from peer user ratings to the target user
Demographic
Key knowledge sources
– Demographic profiles
– Opinion profiles
Process
– for target user, find users of similar demographic
– extrapolate from similar users to target user
Content-based
Key knowledge sources
– User’s opinion
– Item features
Process
– learn a function that maps from item features to user’s opinion
– apply this function to new items
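A minimal sketch of this learn-then-apply loop, in Java to match the labs (the class and method names are my own illustration, and a plain feature average stands in for a real learning algorithm):

import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class ContentBasedSketch {
    // "Learn" a user model: average the feature vectors of the items the user liked.
    static Map<String, Double> buildProfile(List<Map<String, Double>> likedItems) {
        Map<String, Double> profile = new HashMap<>();
        for (Map<String, Double> item : likedItems) {
            for (Map.Entry<String, Double> f : item.entrySet()) {
                profile.merge(f.getKey(), f.getValue() / likedItems.size(), Double::sum);
            }
        }
        return profile;
    }

    // Apply the model to a new item: higher score = better predicted fit.
    static double score(Map<String, Double> profile, Map<String, Double> item) {
        double s = 0.0;
        for (Map.Entry<String, Double> f : item.entrySet()) {
            s += profile.getOrDefault(f.getKey(), 0.0) * f.getValue();
        }
        return s;
    }
}

In practice the learned function would come from a classifier or regression model trained on the user's opinions; the averaging here just keeps the sketch self-contained.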
Knowledge-based
Key knowledge source
– Domain knowledge
Process
– determine user’s requirements
– apply domain knowledge to determine best item
Collaborative Recommendation
Identify peers
Generate recommendation
Recommendation Knowledge Sources Taxonomy
Recommendation Knowledge
– Collaborative
  Opinion Profiles
  Demographic Profiles
– User
  Opinions
  Demographics
  Requirements
    Query
    Constraints
    Preferences
  Context
– Content
  Item Features
  Domain Knowledge
    Feature Ontology
    Means-ends
    Domain Constraints
  Contextual Knowledge
Two Problems
Generate neighborhood
– Peers should be users with similar needs / tastes
– How to identify peer users?
Generate predictions
– Basic assumption = consistency in preference
– Prefer those items generally liked by peers
Opinion Profile
Consists of ratings of items
– P_u = {r_{u,i} : i ∈ I}
– usually discrete numerical values
We can think of such a profile as a vector
– <r_0, r_1, ..., r_k>
– some (most) ratings will be missing
– the vector is sparse
The collection of all ratings for all users
– the rating matrix
– usually very sparse
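Since the labs use Java, one minimal way to represent this sparse structure (an illustrative sketch, not the course code):

import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Sparse rating matrix: only ratings that actually exist are stored.
// Outer key = user id, inner key = item id, value = rating r_{u,i}.
public class RatingMatrix {
    private final Map<Integer, Map<Integer, Double>> ratings = new HashMap<>();

    // Record a single rating r_{u,i}.
    public void addRating(int user, int item, double rating) {
        ratings.computeIfAbsent(user, u -> new HashMap<>()).put(item, rating);
    }

    // P_u: the sparse profile of user u (empty if the user is unknown).
    public Map<Integer, Double> profile(int user) {
        return ratings.getOrDefault(user, Collections.emptyMap());
    }
}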
Cosine
The cosine of the angle θ between two rating vectors is given by

\cos\theta = \frac{\vec{u} \cdot \vec{v}}{\|\vec{u}\| \, \|\vec{v}\|}
Example
Cosine similarity with Alice
Cosine, cont'd
Useful as a metric
– varies between -1 and 1
approaches 1 if angle is small
approaches -1 if angle is near 180º
Common in information retrieval
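A minimal Java sketch of the computation, assuming the sparse map-based profiles from the earlier sketch (missing ratings are implicitly zero, which is exactly the issue the next slide addresses):

import java.util.Map;

public class CosineSimilarity {
    // Cosine of the angle between two sparse rating vectors.
    // A missing rating is implicitly zero: it adds nothing to the
    // dot product but full weight to the other vector's norm.
    public static double cosine(Map<Integer, Double> a, Map<Integer, Double> b) {
        double dot = 0.0;
        for (Map.Entry<Integer, Double> e : a.entrySet()) {
            Double other = b.get(e.getKey());
            if (other != null) {
                dot += e.getValue() * other;   // only co-rated items contribute
            }
        }
        double normA = 0.0, normB = 0.0;
        for (double r : a.values()) normA += r * r;
        for (double r : b.values()) normB += r * r;
        if (normA == 0.0 || normB == 0.0) return 0.0;  // no ratings at all
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }
}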
Mean Adjustment
Cosine is sensitive to the actual values in the vector
– but users often have different "baseline" preferences
– one might never rate an item below 3 / 5
– another might only rarely give a 5 / 5
These differences in scale
– can mask real similarities between preferences
Missing entries
– are effectively zero (a very negative rating)
Solution
– mean-adjustment
– subtract the user's mean from each rating
  an item that gets an average score becomes 0
  below average becomes negative
Mean Adjusted Cosine
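In its usual form (subtract each user's mean rating, then take the cosine), the mean-adjusted cosine similarity is

\mathrm{sim}(u,v) = \frac{\sum_i (r_{u,i} - \bar{r}_u)(r_{v,i} - \bar{r}_v)}{\sqrt{\sum_i (r_{u,i} - \bar{r}_u)^2} \, \sqrt{\sum_i (r_{v,i} - \bar{r}_v)^2}}

where \bar{r}_u is user u's mean rating; a missing rating contributes 0 after adjustment, i.e. it is treated as the user's mean.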
Example
User6 now most similar
– because missing items aren't a penalty
Problem
How to handle missing ratings?
– sparsity
Cosine
– assumes a value for the missing ratings
– regular cosine
  assumes zero (not a valid rating)
– adjusted cosine
  assumes the user's mean
Neither is really satisfactory
Correlation
Don't think of ratings as dimensions
Think of them as samples of a random variable
– user opinion
– taken at different points
Try to estimate whether two users' opinions move in the same way
– whether they are correlated
Correlation
[Chart: ratings of Items 1–4 (scale 0–6) by Users A, B, and C]
Pearson's r
Measurement of the correlation tendency of paired measurements
– covariance / product of std. dev.
Items not co-rated are not considered
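For two users u and v this takes the standard form

w_{u,v} = \frac{\sum_{i \in I_{uv}} (r_{u,i} - \bar{r}_u)(r_{v,i} - \bar{r}_v)}{\sqrt{\sum_{i \in I_{uv}} (r_{u,i} - \bar{r}_u)^2} \, \sqrt{\sum_{i \in I_{uv}} (r_{v,i} - \bar{r}_v)^2}}

where I_{uv} is the set of items rated by both u and v; unlike adjusted cosine, the sums run over co-rated items only.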
Cosine vs Correlation
[Chart: ratings of Items 1–4 (scale 0–6) by Users A, B, and C]
Example
Neighborhood Size
Too few
– prediction based on only a few neighbors
Too many
– distant neighbors included
– niche not specifically identified
– taken to the extreme
  prediction becomes the overall average
Sparsity
What if the neighbor has only a few ratings in common with the target?
Possible to compute correlation with just two ratings in common
Example
Considerations in Prediction
Proximity
– should nearer neighbors get more say?
Sparsity
– should neighbors with less overlap get less (or no) say?
Baseline
– different users have different average ratings
All of these factors can be included in making predictions
Typical prediction formula
Take the user’s average
– add a weighted average of the neighbors' ratings (relative to their own averages)
– weight using the similarity scores
Pred(u,i) = \bar{r}_u + \frac{\sum_{v \in N} w_{u,v} \, (r_{v,i} - \bar{r}_v)}{\sum_{v \in N} |w_{u,v}|}

where N is the neighborhood of user u and w_{u,v} is the similarity weight between users u and v.
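A minimal Java sketch of this prediction step (illustrative names, building on the sparse profiles used above; the neighbors and their similarity weights are assumed to have been selected already):

import java.util.List;
import java.util.Map;

public class Predictor {
    // Mean of a user's ratings; 0.0 if the profile is empty.
    static double mean(Map<Integer, Double> profile) {
        if (profile.isEmpty()) return 0.0;
        double sum = 0.0;
        for (double r : profile.values()) sum += r;
        return sum / profile.size();
    }

    // Pred(u,i): target user's mean plus the similarity-weighted
    // average of the neighbors' mean-adjusted ratings of item i.
    static double predict(Map<Integer, Double> target,
                          List<Map<Integer, Double>> neighbors,
                          List<Double> weights,
                          int item) {
        double num = 0.0, den = 0.0;
        for (int k = 0; k < neighbors.size(); k++) {
            Map<Integer, Double> v = neighbors.get(k);
            Double rvi = v.get(item);
            if (rvi == null) continue;          // neighbor hasn't rated the item
            double w = weights.get(k);
            num += w * (rvi - mean(v));         // mean-adjusted neighbor rating
            den += Math.abs(w);
        }
        if (den == 0.0) return mean(target);    // no usable neighbors: fall back
        return mean(target) + num / den;
    }
}

Weighting by similarity gives nearer neighbors more say, and neighbors who have not rated the item are skipped rather than penalized, one simple answer to the proximity and sparsity considerations above.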
Collaborative Recommendation
Advantages
– possible to make recommendations knowing nothing about the items
– extends common social practice, exchange of opinions
– possible to find niches of users with obscure combinations of interests
– possible to make disparate connections (serendipity)
Disadvantages
– vulnerability to manipulation (more later)
– source of ratings needed
  explicit ratings preferred
– cold start problems (next slide)
Cold Start Problem
New item
– how can a new item be recommended?
  no users have rated it
– must wait for the first person to rate it
– possible solution: genre bot
New user
– how can a new user get a recommendation?
  needs a profile that can be compared with others
– possible solutions
  wait for user to rate items
  require users to rate items
  give some default recommendations while waiting for data
Roadmap
Session A: Basic Techniques I
– Introduction
– Knowledge Sources
– Recommendation Types
– Collaborative Recommendation
Session B: Basic Techniques II
– Content-based Recommendation
– Knowledge-based Recommendation
Session C: Domains and Implementation I
– Recommendation domains
– Example Implementation
– Lab I
Session D: Evaluation I
– Evaluation
Session E: Applications
– User Interaction
– Web Personalization
Session F: Implementation II
– Lab II
Session G: Hybrid Recommendation
Session H: Robustness
Session I: Advanced Topics
– Dynamics
– Beyond accuracy