INHA UNIVERSITY, INCHEON, KOREA
http://eslab.inha.ac.kr/
Collaborative Tagging in Recommender Systems
AE-TTIE JI¹, CHEOL YEON¹, HEUNG-NAM KIM¹, AND GEUN-SIK JO²
¹ Intelligent E-Commerce Systems Laboratory, Department of Computer Science & Information Engineering, Inha University
{aerry13, entireboy, nami}@eslab.inha.ac.kr
² School of Computer Science & Engineering, Inha University, 253 Yonghyun-dong, Incheon, Korea 402-751
- 2 -
Introduction
Recommender System with Collaborative Tagging
Experimental Results
Conclusions
Future Work
- 3 -
Introduction
- 4 -
Introduction
Collaborative Filtering (CF)
[Figure: my preference history and my nearest neighbors' opinions are combined to produce a recommendation]
Limitations: Sparsity Problem, Cold-start User Problem
- 5 -
Introduction
Collaborative Tagging (CT)
[Figure: example tag sets assigned by users: "cooking, hobby"; "myCyworld"; "picnic, 2007, horse"; "picnic, horse"; "walking, picnic"; "couple, walking, picnic"; "wiki, tagging"; "tagging, wikipedia"]
- 6 -
Introduction
Motivation
[Figure: my tags and my nearest neighbors' tags are combined to produce a recommendation; example tag sets: "walking, picnic"; "picnic, 2007, horse, girl"; "picnic, horse"; "cooking, hobby, spaghetti"; "tagging, wikipedia, social intelligence"]
- 7 -
Recommender System with CT
System Architecture
Part 1: Catching a user's latent preference. Candidate Tag Set generation via CF.
Part 2: Probabilistic recommendation. Naïve Bayesian approach.
[Figure: users tag items; the taggings yield the user-item matrix R (r × n), the user-tag matrix A (r × m), and the tag-item matrix Q (m × n). User-user similarity (r × r) gives the target user's candidate tag model, which the Naïve Bayesian classifier uses to produce the recommendation.]
r: users, n: items, m: tags
- 8 -
Recommender System with CT
Matrices Representing Preferences
User-item binary matrix R (r × n): R_{u,i} indicates whether user u prefers item i.
User-tag matrix A (r × m): A_{u,t} is the number of times user u has used tag t.
Tag-item matrix Q (m × n): Q_{t,i} is the number of times tag t has been assigned to item i.
[Figure: example matrices. (a) user-item binary matrix R, rows u1 ... ur, columns i1 ... in; (b) user-tag matrix A, rows u1 ... ur, columns t1 ... tm; (c) tag-item matrix Q, rows t1 ... tm, columns i1 ... in]
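As a rough illustration of the three matrices, the sketch below builds R, A, and Q from a list of (user, item, tag) triples. The function name `build_matrices`, the triple format, and the toy data are assumptions made for this example, as is the reading that a user "prefers" an item once they have tagged it.

```python
import numpy as np

def build_matrices(taggings, users, items, tags):
    """Build R (r x n), A (r x m), and Q (m x n) from (user, item, tag) triples."""
    u_idx = {u: k for k, u in enumerate(users)}
    i_idx = {i: k for k, i in enumerate(items)}
    t_idx = {t: k for k, t in enumerate(tags)}
    R = np.zeros((len(users), len(items)), dtype=int)
    A = np.zeros((len(users), len(tags)), dtype=int)
    Q = np.zeros((len(tags), len(items)), dtype=int)
    for user, item, tag in taggings:
        u, i, t = u_idx[user], i_idx[item], t_idx[tag]
        R[u, i] = 1    # assumption: tagging an item counts as preferring it
        A[u, t] += 1   # how often user u used tag t
        Q[t, i] += 1   # how often tag t was assigned to item i
    return R, A, Q

# Hypothetical toy data, not taken from the slides.
users = ["u1", "u2"]
items = ["i1", "i2"]
tags = ["picnic", "horse"]
taggings = [
    ("u1", "i1", "picnic"),
    ("u1", "i1", "horse"),
    ("u1", "i2", "picnic"),
    ("u2", "i2", "picnic"),
]
R, A, Q = build_matrices(taggings, users, items, tags)
print(R.tolist())  # [[1, 1], [0, 1]]
print(A.tolist())  # [[2, 1], [1, 0]]
print(Q.tolist())  # [[1, 2], [1, 0]]
```

Dense arrays keep the sketch short; for a dataset as sparse as the one reported later, a sparse representation would likely be the more practical choice.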
- 9 -
Recommender System with CT
Recommendation Process
[Figure: Step 1, CTS generation: collaborative filtering on the user-tag matrix produces the CTS for target user u3. Step 2, recommendation: the Naïve Bayes classifier uses the CTS and the tag-item matrix to produce the Top-N items for user u3.]
Candidate Tag Set (CTS) Generation via CF: User-User Similarity, Tag Preference
- 10 -
Recommender System with CT
CTS Generation via CF
CTS (Candidate Tag Set): the latent preference of a target user.
User-user similarity: used to find the k nearest neighbors (KNN) of a target user based on the user-tag matrix A.
$$\mathrm{sim}(u,v) = \cos(\vec{u},\vec{v}) = \frac{\sum_{t \in T} A_{u,t} A_{v,t}}{\sqrt{\sum_{t \in T} (A_{u,t})^2}\,\sqrt{\sum_{t \in T} (A_{v,t})^2}}$$
Tag preference:
$$S_{u,t} = \sum_{o \in KNN(u)} A_{o,t} \times \mathrm{sim}(u,o)$$
Candidate tag set:
$$CTS_w(u) = \{\, t_x \mid x = 1, 2, \ldots, w,\ t_x \in T \,\}$$
[Figure: example user-tag matrix A, users u1 ... u5 by tags t1 ... t5, with the target user marked]
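The CTS generation step can be sketched as follows: cosine similarity over the rows of A, the k most similar users as neighbors, the tag preference score S_{u,t}, and finally the w highest-scoring tags. The function names and the toy matrix are hypothetical. Two details are assumptions of this sketch rather than statements from the slides: the CTS is taken to be the top-w tags ranked by S_{u,t}, and tags with a score of zero are left out.

```python
import numpy as np

def cosine_similarity(A):
    """Pairwise cosine similarity between the rows (users) of A."""
    A = np.asarray(A, dtype=float)
    norms = np.linalg.norm(A, axis=1)
    norms[norms == 0] = 1.0  # users without tags get similarity 0 to everyone
    unit = A / norms[:, None]
    return unit @ unit.T

def candidate_tag_set(A, u, k, w):
    """Return up to w tag indices ranked by the tag preference S_{u,t}."""
    A = np.asarray(A, dtype=float)
    sim = cosine_similarity(A)
    order = np.argsort(-sim[u], kind="stable")
    neighbors = [int(v) for v in order if v != u][:k]  # KNN(u), excluding u
    S = sim[u, neighbors] @ A[neighbors]               # S_{u,t} for every tag t
    top = np.argsort(-S, kind="stable")[:w]
    # Assumption: tags that no neighbor used (score 0) are not candidates.
    return [int(t) for t in top if S[t] > 0]

# Hypothetical user-tag matrix: 3 users, 4 tags.
A = [[2, 0, 1, 0],
     [1, 0, 2, 3],
     [0, 4, 0, 0]]
print(candidate_tag_set(A, u=0, k=1, w=2))  # [3, 2]
```

In this toy case user 0's only useful neighbor is user 1, so tag 3 enters the candidate set even though user 0 never used it, which is the sense in which the CTS captures a latent preference.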
- 11 -
Improving on the limitations of CF via CT
Case for Data Sparsity
[Figure: user-item binary matrix for the users Brandon, Courtney, and Dannis, one of whom is marked as the target user, over the items iGoogle (www.google.com/ig), Netvibes (www.netvibes.com), MS Virtual Earth (maps.live.com), and Windows Live (www.live.com)]
- 12 -
Improving on the limitations of CF via CT
Case for Data Sparsity
[Figure: the user-item matrix for Brandon, Courtney, and Dannis, shown next to a user-tag matrix over the tags personal, web2.0, map, portal, iGoogle, search, and MS. Tag sets shown on the users' bookmarks: "Windows Live, search, personal, MS"; "MS Virtual Earth, MS, map, web2.0"; "Live Search Maps, map, web2.0, MS"; "Netvibes, web2.0, personal"; "search, Google, personal, iGoogle"; "Netvibes, personal"; "web2.0, iGoogle, search"]
- 14 -
Improving on the limitations of CF via CT
Case for Cold-start User
[Figure: the user-item matrix for Brandon, Courtney, and Dannis over iGoogle, Netvibes, MS Virtual Earth, and Windows Live, with a cold-start user, Eric, added. Eric has a single entry in the matrix. The tag sets "web2.0, iGoogle, search" and "web2.0, map, MS" are shown together with the other users' tag sets.]
- 15 -
Recommender System with CT
Recommendation Process
[Figure: Step 1, CTS generation via collaborative filtering, giving the CTS for target user u3; Step 2, recommendation via the Naïve Bayes classifier, giving the Top-N items for user u3]
Item Recommendation: Naïve Bayes Classifier, Top-N Items Recommendation
- 16 -
Recommender System with CT
Item Recommendation
Naïve Bayes classifier. Posterior probability: the preference probability of user u for an item i_y given CTS_w(u), where the class is the item i_y and the features are the candidate tags t_1, t_2, ..., t_w:
$$P_{u,y} = P(I = i_y) \prod_{j=1}^{w} P(t_j \mid I = i_y)$$
Prior probability:
$$P(I = i_y) = \frac{\sum_{u=1}^{r} R_{u,y}}{\sum_{g=1}^{n} \sum_{u=1}^{r} R_{u,g}}$$
Item-conditional tag distribution:
$$P(t_j \mid I = i_y) = \frac{1 + Q_{j,y}}{m + \sum_{t=1}^{m} Q_{t,y}}$$
Top-N recommendation: TopN_u is the set of items with the highest P_{u,y}, where |TopN_u| ≤ N and TopN_u ∩ I_u = Ø.
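The scoring step can be sketched as below, using the prior and the smoothed item-conditional tag distribution defined above. The function name `recommend_top_n` and the toy matrices are hypothetical. The sketch sums logarithms instead of multiplying probabilities, which yields the same ranking while avoiding underflow for large w, and it assumes I_u denotes the items user u already prefers (the nonzero entries of row u in R).

```python
import numpy as np

def recommend_top_n(R, Q, cts, u, N):
    """Rank items for user u by log P_{u,y} and return up to N unseen items."""
    R = np.asarray(R, dtype=float)
    Q = np.asarray(Q, dtype=float)
    m = Q.shape[0]
    prior = R.sum(axis=0) / R.sum()                 # P(I = i_y)
    cond = (1.0 + Q[cts, :]) / (m + Q.sum(axis=0))  # P(t_j | I = i_y), shape (w, n)
    with np.errstate(divide="ignore"):              # log(0) -> -inf for unrated items
        log_post = np.log(prior) + np.log(cond).sum(axis=0)
    log_post[R[u] > 0] = -np.inf                    # assumption: I_u = items u already has
    order = np.argsort(-log_post, kind="stable")
    return [int(i) for i in order[:N] if np.isfinite(log_post[i])]

# Hypothetical data: 3 users, 3 items, 2 tags.
R = [[1, 1, 0],
     [0, 1, 1],
     [0, 0, 1]]
Q = [[2, 0, 1],
     [0, 3, 1]]
print(recommend_top_n(R, Q, cts=[0], u=0, N=1))  # [2]
print(recommend_top_n(R, Q, cts=[0], u=2, N=2))  # [0, 1]
```

For user 2 the unnormalized posteriors are 0.2 × 0.75 = 0.15 for item 0 and 0.4 × 0.2 = 0.08 for item 1, so item 0 is ranked first.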
- 17 -
Experimental Results
Dataset & Evaluation Metric
Dataset: http://del.icio.us (a social bookmarking service)
users   items    tags     bookmarkings   taggings
1,544   17,390   10,077   27,066         44,681
Training data: 21,653 / Testing data: 5,413
Sparsity level of user-item matrix: 0.9989
Evaluation metric:
$$\text{hit-ratio}(u) = \frac{|Test_u \cap TopN_u|}{|Test_u|}$$
$$\text{recall} = \frac{\sum_{u=1}^{k} \text{hit-ratio}(u)}{k} \times 100$$
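A minimal sketch of the two metrics follows, together with a check of the reported sparsity level. The function names and the toy test sets are hypothetical. The sketch assumes that k in the recall formula is the number of evaluated users, and that sparsity is defined as one minus the fraction of nonzero entries in the user-item matrix.

```python
def hit_ratio(test_items, top_n):
    """Fraction of a user's test items that appear in the Top-N list."""
    test_items = set(test_items)
    if not test_items:
        return 0.0
    return len(test_items & set(top_n)) / len(test_items)

def recall(test_sets, recommendations):
    """Mean hit ratio over all evaluated users, as a percentage."""
    users = list(test_sets)
    total = sum(hit_ratio(test_sets[u], recommendations.get(u, [])) for u in users)
    return 100.0 * total / len(users)

def sparsity(num_users, num_items, num_bookmarks):
    """Assumed definition: 1 - (nonzero entries) / (all entries)."""
    return 1.0 - num_bookmarks / (num_users * num_items)

# Hypothetical test sets and recommendation lists.
test_sets = {"u1": {"a", "b"}, "u2": {"c"}}
recommendations = {"u1": ["a", "x"], "u2": ["y", "z"]}
print(recall(test_sets, recommendations))          # 25.0
print(round(sparsity(1544, 17390, 27066), 5))      # 0.99899
```

With the reported counts (1,544 users, 17,390 items, 27,066 bookmarkings) this definition gives about 0.99899, in line with the stated sparsity level of 0.9989.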
- 18 -
Experimental Results
Benchmark Algorithms
User-based Collaborative Filtering (Badrul Sarwar et al., 2000)
Item-based Collaborative Filtering (Mukund Deshpande et al., 2004)
The KNN size was set to 50 for the main comparisons, the point at which the rate of performance improvement diminished.
[Figure: recall (%) versus neighborhood size k (10, 30, 50, 70, 100) for user-based CF and item-based CF; recommendation size N = 10]
- 19 -
Experimental Results
Experiments with CTS size
The size of the CTS, w, can be a significant factor affecting the quality of recommendation.
For the main comparisons, w was set to 70, which gave the best quality.
[Figure: recall (%) versus candidate tag set size w (10, 30, 50, 70, 100) for tag-based CF; neighbor size k = 50, recommendation size N = 10]
- 20 -
Experimental Results
Comparisons of Overall Performance
The sparsity of the collected dataset affected the performance of all three methods.
Even when the number of recommended items was small, our method outperformed the other two methods.
[Figure: recall (%) versus number of recommended items N (10, 20, 30, 40, 50) for tag-based CF, item-based CF, and user-based CF; neighbor size k = 50, CTS size w = 70]
- 21 -
Experimental Results
Comparisons for Cold-start User
For cold-start users, who do not have enough preference information, our method outperformed the other two methods.
[Figure: recall (%) for tag-based CF, item-based CF, and user-based CF, grouped by the user's number of bookmarks (< 3, < 6, < 300); neighbor size k = 50, CTS size w = 70, recommendation size N = 10]
- 22 -
Conclusion
We analyzed the potential of collaborative tagging systems for recommendation.
  User-created tags convey users' preferences about items as well as metadata about them.
  Using tags can partially alleviate the data sparsity and cold-start user problems, which are serious limitations of CF recommendation.
We also proposed a novel recommender system based on users' collaborative tags, using a CF scheme.
  Our algorithm obtained better recommendation quality than traditional CF schemes.
  It provided items better suited to user preferences even when the number of recommended items was small.
- 23 -
Future Work
"Noise" tags can be included in the CTS.
  Some tags are too personal or merely judge the content (e.g., bad, myWork, to read).
  These tags need special handling to allow a more personalized and valuable analysis.
There are issues common to keyword-based analysis.
  Polysemy, synonymy, and basic level variation.
  Semantic tagging is an interesting approach to addressing these issues.