
By Rachsuda Jiamthapthaksin, 10/09/2009. Edited by Christoph F. Eick


Recommender Systems (RSs)

Goal: To help users find items that they will likely appreciate (and buy/lease) from huge catalogues.


The recommendation problem

Let
○ $C$ be the set of all users,
○ $S$ be the set of all possible items that can be recommended, and
○ $u$ be a utility function that measures the usefulness of item $s$ to user $c$, $u: C \times S \rightarrow R$.

For each user $c \in C$, find the item $s' \in S$ that maximizes the user's utility:

$\forall c \in C, \quad s'_c = \arg\max_{s \in S} u(c, s)$    (1)
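Equation (1) can be illustrated with a minimal Python sketch. It assumes the utility $u$ is already known for every (user, item) pair, which is typically not the case in practice. The names `recommend`, `toy_ratings`, and `toy_utility`, and the toy values, are hypothetical and chosen only for illustration.

```python
from typing import Callable, Dict, Hashable, Sequence


def recommend(
    users: Sequence[Hashable],
    items: Sequence[Hashable],
    utility: Callable[[Hashable, Hashable], float],
) -> Dict[Hashable, Hashable]:
    """For each user c, return the item s that maximizes utility(c, s)."""
    # max(..., key=...) plays the role of argmax over the item set S.
    return {c: max(items, key=lambda s: utility(c, s)) for c in users}


# Hypothetical, fully known utility table (assumption: no missing entries).
toy_ratings = {
    ("alice", "m1"): 4.0,
    ("alice", "m2"): 2.0,
    ("bob", "m1"): 1.0,
    ("bob", "m2"): 5.0,
}


def toy_utility(c, s):
    """Look up the utility of item s for user c in the toy table."""
    return toy_ratings[(c, s)]


best = recommend(["alice", "bob"], ["m1", "m2"], toy_utility)
print(best)
```

When $u$ is only partially known, the missing values have to be estimated before this argmax step can be applied.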


Netflix Recommender System Scenario


[Figure not recovered; its legend defines a symbol := unknown]

Remark: Typically, a lot of such unknown symbols appear.


Survey of the Netflix Contest

The Netflix Prize competition offers a grand prize of US $1M for an algorithm that is 10% more accurate than “Cinematch”, the system Netflix uses to predict customers’ movie preferences.

The best score will win a $50K Progress Prize.


The Basic Structure of the Contest

Provide 100 million ratings that 480K anonymous customers had given to 17K movies.

Withhold 3M of the most recent ratings and ask the contestants to predict them.

Assess each contestant’s 3M predictions by comparing them with the actual ratings.

Evaluation metric: the Root Mean Squared Error (RMSE).
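The evaluation metric can be sketched in a few lines of Python. This is a minimal illustration, not the contest's own scoring script. The function names `rmse` and `improvement_percent` are hypothetical, and reading "10% more accurate" as a 10% lower RMSE than the baseline is an assumption made here.

```python
import math
from typing import Sequence


def rmse(predicted: Sequence[float], actual: Sequence[float]) -> float:
    """Root mean squared error between predicted and actual ratings."""
    if len(predicted) != len(actual) or len(predicted) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_error = sum((p - a) ** 2 for p, a in zip(predicted, actual))
    return math.sqrt(squared_error / len(predicted))


def improvement_percent(baseline_rmse: float, new_rmse: float) -> float:
    """Relative RMSE reduction over a baseline, in percent (assumed reading)."""
    return 100.0 * (baseline_rmse - new_rmse) / baseline_rmse


# Toy values for illustration only; these are not contest data.
toy_actual = [3, 4]
toy_predicted = [1, 2]
print(rmse(toy_predicted, toy_actual))
```

Because the errors are squared before averaging, a few large misses raise the RMSE more than many small ones.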


Netflix Dataset (1)

The data were collected between October 1998 and December 2005 and reflect the distribution of all ratings received during this period.

The ratings are on a scale from 1 to 5 (integral) stars.

The date of each rating, and the title and year of release for each movie id, are also provided.


Netflix Dataset (2)

training_set.tar (2 GB)
movie_titles.txt (575 KB)
qualifying.txt (51,224 KB)
probe.txt (10,530 KB)
rmse.pl (1 KB)
