The Effect of Dimensionality Reduction in Recommendation Systems
Juntae Kim
Department of Computer Engineering, Dongguk University
Contents
Introduction
Collaborative Recommendation
Data Sparseness Problem
Dimensionality Reduction by using SVD
An Example
Experiments
Conclusion
Introduction
e-CRM
Provides personalized service
Enhances sales through product recommendation, targeted advertisement, etc.
Recommendation System
[Figure: recommendation system. Labels: demographic features, item features, sales history, purchase history, customer, recommend items.]
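Introduction
Use item-to-item similarity – content-based
Use item-to-item similarity – association
[Figure, content-based: items A, B, C; the user likes one item, and an item with similar contents is recommended.]
[Figure, association: items A, B, C; the user likes one item, and an item highly correlated with it is recommended.]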
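Introduction
Use people-to-people similarity – demographic
Use people-to-people similarity – collaborative
[Figure, demographic: users A, B, C; items liked by a user with similar features are recommended.]
[Figure, collaborative: users A, B, C; items liked by a user with highly correlated preferences are recommended.]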
Collaborative Method
Advantages
No need for content analysis
Items whose contents are difficult to analyze can be recommended (e.g., movies, music)
No need for user information
High precision
Method
1. Find similar users
2. Predict preferences based on the similar users' preferences
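Collaborative Method
Computing similarity
Pearson correlation coefficient (range [-1, 1])
$w_{a,u} = \frac{\sum_i (r_{a,i} - \bar{r}_a)(r_{u,i} - \bar{r}_u)}{\sqrt{\sum_i (r_{a,i} - \bar{r}_a)^2} \sqrt{\sum_i (r_{u,i} - \bar{r}_u)^2}}$
$r_{a,i}$: rating of user a for item i
$\bar{r}_a$: average rating of user a
Example (ratings -> deviations from the user's average)
User a: (1, 8, 9) -> (-5, +2, +3)
User b: (2, 9, 7) -> (-4, +3, +1), so user a is similar to user b
User c: (9, 3, 3) -> (+4, -2, -2), so user a is not similar to user c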
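The example above can be checked with a few lines of Python. This is a minimal sketch of the standard Pearson correlation; the function name `pearson` and the variable names are illustrative and do not come from the slides.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation between two equal-length rating vectors."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    dx = [v - mean_x for v in x]          # deviations from the user's mean
    dy = [v - mean_y for v in y]
    num = sum(a * b for a, b in zip(dx, dy))
    den = sqrt(sum(a * a for a in dx)) * sqrt(sum(b * b for b in dy))
    return num / den if den else 0.0      # 0.0 when a user rates everything the same

user_a = (1, 8, 9)
user_b = (2, 9, 7)
user_c = (9, 3, 3)

w_ab = pearson(user_a, user_b)   # strongly positive: a and b rate alike
w_ac = pearson(user_a, user_c)   # strongly negative: a and c rate oppositely
```

With these ratings, `w_ab` comes out a little above 0.9 and `w_ac` close to -1, which matches the slide's reading that user a is similar to user b but not to user c.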
Collaborative Method
Prediction of preferences
Weighted sum of similar users' preferences
$p_{a,i} = \bar{r}_a + \sum_u w_{a,u} (r_{u,i} - \bar{r}_u)$
$w_{a,u}$: similarity between users a and u
Example
Average rating of user a: 5
User b: (2, 8, 8), $w_{a,b}$ = 0.5
User c: (4, 4, 7), $w_{a,c}$ = 0.1
Preferences of user a = (5, 5, 5) + (-4, 2, 2)*0.5 + (-1, -1, 2)*0.1
= (2.9, 5.9, 6.2)
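The arithmetic of this example can be reproduced as follows. This is a sketch under the assumption that the prediction is the plain weighted sum used in the example, without dividing by the sum of the weights (many formulations add that normalization); `predict` is a hypothetical helper name.

```python
def predict(mean_a, neighbours):
    """Predict user a's ratings from (ratings, weight) pairs of similar users.

    mean_a: average rating of user a
    neighbours: list of (ratings of user u, similarity w_{a,u})
    """
    n_items = len(neighbours[0][0])
    prediction = [float(mean_a)] * n_items
    for ratings, weight in neighbours:
        mean_u = sum(ratings) / len(ratings)
        for i, r in enumerate(ratings):
            # add the neighbour's deviation from its own mean, scaled by similarity
            prediction[i] += weight * (r - mean_u)
    return prediction

# Values from the slide: user b with weight 0.5, user c with weight 0.1
prediction = predict(5, [((2, 8, 8), 0.5), ((4, 4, 7), 0.1)])
```

Up to floating-point rounding, `prediction` equals the (2.9, 5.9, 6.2) shown in the example.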
Data Sparseness Problem
Example data: user-item matrix A

         Monster Co.  Lion King  Pocahontas  Halloween  Scream
user 1        1           0          1           1         0
user 2        0           1          1           0         0
user 3        1           0          0           0         0
user 4        0           0          0           1         1
user 5        0           0          0           1         0
user 6        0           0          0           0         1
Data Sparseness Problem
Explicit ratings are usually not available
Available data: purchases, clicks, etc. (0 or 1)
Computing correlation is not appropriate (no negative preference information)
-> use cosine similarity
$w_{a,u} = \frac{\vec{r}_a \cdot \vec{r}_u}{\lVert \vec{r}_a \rVert \, \lVert \vec{r}_u \rVert}$
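A short sketch of cosine similarity on 0/1 purchase vectors may help here. The three vectors below are hypothetical illustrations, not data from the slides, and `cosine` is an illustrative function name.

```python
from math import sqrt

def cosine(x, y):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = sqrt(sum(a * a for a in x))
    norm_y = sqrt(sum(b * b for b in y))
    if norm_x == 0 or norm_y == 0:
        return 0.0                      # undefined for an empty purchase history
    return dot / (norm_x * norm_y)

# Hypothetical 0/1 purchase vectors over five items
u = (1, 0, 1, 1, 0)
v = (0, 1, 1, 0, 0)
w = (0, 0, 0, 0, 1)

sim_uv = cosine(u, v)   # one shared item out of 3 and 2 purchases
sim_vw = cosine(v, w)   # no shared item
```

Here `sim_uv` is 1/sqrt(6), about 0.41, while `sim_vw` is exactly 0 because the two users have no item in common. With very sparse data most user pairs fall into the second case.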
Data Sparseness Problem
Available data are usually very sparse
Users buy 2~3 items among thousands of items
Cosine similarity cannot be computed reliably (few users share any item)
-> Reduce dimension
[Figure: the sparse user-item matrix A, to be mapped to a lower-dimensional representation.]
Dimensionality Reduction
Using category information
Represent the user preference vector with item categories
Monster Co., Lion King, Pocahontas -> animation
Halloween, Scream -> horror
Reduced matrix (the 6 x 5 matrix A becomes 6 x 2)

         animation  horror
user 1       1         1
user 2       1         0
user 3       1         0
user 4       0         1
user 5       0         1
user 6       0         1
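The category mapping can be sketched in Python as below. The matrix `A` is the six-user example read as users 1 to 6 by Monster Co., Lion King, Pocahontas, Halloween, Scream; that row and column order is an assumption, and the names `CATEGORY` and `reduce_by_category` are illustrative.

```python
ITEMS = ["Monster Co.", "Lion King", "Pocahontas", "Halloween", "Scream"]
CATEGORIES = ["animation", "horror"]
CATEGORY = {
    "Monster Co.": "animation",
    "Lion King": "animation",
    "Pocahontas": "animation",
    "Halloween": "horror",
    "Scream": "horror",
}

# Example user-item matrix (rows: user 1 to user 6), assumed ordering
A = [
    [1, 0, 1, 1, 0],
    [0, 1, 1, 0, 0],
    [1, 0, 0, 0, 0],
    [0, 0, 0, 1, 1],
    [0, 0, 0, 1, 0],
    [0, 0, 0, 0, 1],
]

def reduce_by_category(row):
    """Map a 0/1 item vector to a 0/1 category vector."""
    reduced = [0] * len(CATEGORIES)
    for item, bought in zip(ITEMS, row):
        if bought:
            reduced[CATEGORIES.index(CATEGORY[item])] = 1
    return reduced

reduced = [reduce_by_category(row) for row in A]
```

After the reduction, users 2 and 3 have identical vectors even though they bought no item in common, which is the effect the heuristic is after.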
Dimensionality Reduction
Singular Value Decomposition (SVD)
Decompose the m x n user-item matrix A:
$A_{m \times n} = U_{m \times m} S_{m \times n} (V_{n \times n})^T$
S: diagonal matrix that contains the singular values of A in descending order
U, V: orthogonal matrices
Rotates the axes of the n-dimensional space so that the 1st axis runs along the direction of largest variation
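The decomposition can be tried with NumPy's `np.linalg.svd`. This is a minimal sketch; the matrix `A` is the six-user, five-movie example, read as users 1 to 6 by Monster Co., Lion King, Pocahontas, Halloween, Scream (an assumed ordering).

```python
import numpy as np

# Example user-item matrix (assumed row/column order)
A = np.array([
    [1, 0, 1, 1, 0],
    [0, 1, 1, 0, 0],
    [1, 0, 0, 0, 0],
    [0, 0, 0, 1, 1],
    [0, 0, 0, 1, 0],
    [0, 0, 0, 0, 1],
], dtype=float)

# Full SVD: U is 6x6, s holds the 5 singular values, Vt is V^T (5x5)
U, s, Vt = np.linalg.svd(A)

# Build the 6x5 matrix S with the singular values on its diagonal
S = np.zeros(A.shape)
S[:len(s), :len(s)] = np.diag(s)

reconstruction_ok = np.allclose(U @ S @ Vt, A)   # A = U S V^T
```

NumPy returns the singular values in descending order, as the definition of S requires. The signs of the columns of U and rows of V^T are not unique, so a different library may return vectors with flipped signs.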
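Dimensionality Reduction
SVD example (entry magnitudes; signs are not shown)

U (rows: user 1 to user 6; first 5 columns)
0.75  0.29  0.28  0.00  0.53
0.28  0.53  0.75  0.00  0.29
0.20  0.19  0.45  0.58  0.63
0.45  0.63  0.20  0.00  0.19
0.33  0.22  0.12  0.58  0.41
0.12  0.41  0.33  0.58  0.22

S
2.16  0.00  0.00  0.00  0.00
0.00  1.59  0.00  0.00  0.00
0.00  0.00  1.28  0.00  0.00
0.00  0.00  0.00  1.00  0.00
0.00  0.00  0.00  0.00  0.39

V^T (columns: Monster Co., Lion King, Pocahontas, Halloween, Scream)
0.44  0.13  0.48  0.70  0.26
0.30  0.33  0.51  0.35  0.65
0.57  0.59  0.37  0.15  0.41
0.58  0.00  0.00  0.58  0.58
0.25  0.73  0.61  0.16  0.09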
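Dimensionality Reduction
Approximation of A
Select the k largest singular values:
$A'_{m \times n} = U_{m \times k} S_{k \times k} (V_{n \times k})^T$
Computing user similarity
$A A^T = U S V^T (U S V^T)^T = U S V^T V S^T U^T = (US)(US)^T$
Projection of A into k dimensions
$A'_{m \times n} V_{n \times k} = U_{m \times k} S_{k \times k}$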
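Both identities can be checked numerically. The sketch below uses NumPy's reduced SVD (`full_matrices=False`) on the example matrix, whose row and column order is an assumption; the variable names are illustrative.

```python
import numpy as np

# Example user-item matrix (assumed row/column order)
A = np.array([
    [1, 0, 1, 1, 0],
    [0, 1, 1, 0, 0],
    [1, 0, 0, 0, 0],
    [0, 0, 0, 1, 1],
    [0, 0, 0, 1, 0],
    [0, 0, 0, 0, 1],
], dtype=float)

# Reduced SVD: U is 6x5, s has 5 values, Vt is 5x5
U, s, Vt = np.linalg.svd(A, full_matrices=False)

# User similarity: A A^T equals (US)(US)^T
US = U * s                      # scales column j of U by s[j], same as U @ diag(s)
similarity_identity_ok = np.allclose(A @ A.T, US @ US.T)

# Rank-k approximation A' and its projection into k dimensions
k = 2
U_k = U[:, :k]
S_k = np.diag(s[:k])
V_k = Vt[:k, :].T               # n x k
A_k = U_k @ S_k @ V_k.T         # A' = U_k S_k V_k^T
users_2d = U_k @ S_k            # one k-dimensional row per user
projection_ok = np.allclose(A_k @ V_k, users_2d)
```

Each row of `users_2d` is the k-dimensional representation of one user; user similarity can then be computed on these short dense vectors instead of the sparse rows of A.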
An Example
User-item matrix A

         Monster Co.  Lion King  Pocahontas  Halloween  Scream
user 1        1           0          1           1         0
user 2        0           1          1           0         0
user 3        1           0          0           0         0
user 4        0           0          0           1         1
user 5        0           0          0           1         0
user 6        0           0          0           0         1
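An Example
Reduction, k = 2
Projecting A onto the first two dimensions gives the user coordinates $U_{m \times 2} S_{2 \times 2}$ (entry magnitudes; signs are not shown)

         dim 1  dim 2
user 1   1.62   0.46
user 2   0.60   0.84
user 3   0.04   0.30
user 4   0.97   1.00
user 5   0.71   0.35
user 6   0.26   0.65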
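An Example
User-user similarity: cosine of the user vectors in the 2-D space, from $(US)(US)^T$

        user 1  user 2  user 3  user 4  user 5  user 6
user 1   1.00    0.78    0.40    0.47    0.74    0.10
user 2           1.00    0.88   -0.18    0.16   -0.54
user 3                   1.00   -0.62   -0.32   -0.87
user 4                           1.00    0.94    0.93
user 5                                   1.00    0.74
user 6                                           1.00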
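A similarity matrix of this kind can be computed as sketched below, assuming that the similarity is the cosine between the rows of the rank-2 user coordinates and that the example matrix has the row and column order used here. Values computed this way need not agree with every entry printed on the slide.

```python
import numpy as np

# Example user-item matrix (assumed row/column order)
A = np.array([
    [1, 0, 1, 1, 0],
    [0, 1, 1, 0, 0],
    [1, 0, 0, 0, 0],
    [0, 0, 0, 1, 1],
    [0, 0, 0, 1, 0],
    [0, 0, 0, 0, 1],
], dtype=float)

U, s, Vt = np.linalg.svd(A, full_matrices=False)

k = 2
users_k = U[:, :k] * s[:k]                       # rows of U_k S_k

# Normalize each user vector to unit length, then take all pairwise dot products
norms = np.linalg.norm(users_k, axis=1, keepdims=True)
unit = users_k / norms
sim = unit @ unit.T                              # 6 x 6 cosine similarity matrix
```

In the reduced space, users 4, 5, and 6 (horror only) come out highly similar to one another, although users 5 and 6 share no item in the original matrix.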
An Example
User vectors in 2-D space
[Figure: users u1 to u6 plotted as points in the 2-D space.]
Experiments
Dataset: MovieLens
943 users, 1628 movies, ratings 1~5, 6.4% rated
Change ratings to 0/1 -> 3.6% rated
Experiments
Compare performance of plain collaborative filtering (CF) and reduced-dimension (SVD) recommendation
CF: 60 neighbors
SVD: rank 20
Change sparseness to 2.0%, 1.0%, 0.5%
Experiments
Metric
Hit ratio
Remove 1 rating from each user -> test data
Recommend 10 items for each user
If the test data is in the recommended items -> hit
Hit ratio = (Total # of hits) / (Total # of test data)
Result
Sparseness 3.6%: SVD improves hit ratio by x %
Sparseness 0.5%: SVD improves hit ratio by x %
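The metric can be sketched as a small function. The data structures and the toy values below are hypothetical and only illustrate the counting; they are not the experimental data.

```python
def hit_ratio(recommendations, held_out):
    """Fraction of users whose held-out item appears in their recommendation list.

    recommendations: dict mapping user -> list of recommended items
    held_out: dict mapping user -> the single item removed as test data
    """
    hits = sum(
        1 for user, item in held_out.items()
        if item in recommendations.get(user, [])
    )
    return hits / len(held_out)

# Hypothetical toy data: 4 users, 2 recommendations each
recs = {"u1": ["a", "b"], "u2": ["c", "d"], "u3": ["e", "f"], "u4": ["g", "h"]}
held = {"u1": "b", "u2": "x", "u3": "e", "u4": "z"}

ratio = hit_ratio(recs, held)   # 2 hits out of 4 test items
```

In the setting described above, each recommendation list would hold 10 items and `held_out` would contain one removed rating per user.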
Experiments
Results
[Figure: average recall (axis from 0 to 0.25) versus sparseness (3.6%, 2.0%, 1.0%, 0.5%, 0.1%) for CF 60NN and SVD Rank20.]
Conclusion
Solve the data sparseness problem
Reduce dimension: heuristics
Reduce dimension: SVD
Experimental results
SVD shows more performance improvement on sparser data
Future research
Statistical analysis
Combined methods