Using Hierarchical Clustering for Learning the Ontologies used in Recommendation Systems
Vincent Schickel-Zuber, Boi Faltings [SIGKDD'07]
Reporter: Che-Wei Liang
Date: 2008/04/10
Outline
• Introduction
• Background
  – Collaborative Filtering
  – Ontology Filtering
• Learning the Ontologies
  – Clustering Algorithms
  – Learning Hierarchical Ontologies
• Experiments
• Conclusion
Introduction
• Recommender system
  – Helps people find the most relevant items based on their own preferences and those of others
• Item-based collaborative filtering (CF)
  – Recommends items based on the experience of the user as well as of other, similar users
Ontology
• What is an ontology?
  – A multi-inheritance graph structure
  – Edges represent features
  – An item is an instance of at least one concept
Ontology Filtering
• Infer preference ratings of items from the ratings of known items and their relative positions in the ontology
Outline
• Introduction
• Background
  – Collaborative Filtering
  – Ontology Filtering
• Learning the Ontologies
  – Clustering Algorithms
  – Learning Hierarchical Ontologies
• Experiments
• Conclusion
Background
• Users U = {u1, …, um}
• Items I = {i1, …, in}
• R(u,i) = the rating assigned to item i by user u
Collaborative Filtering (1/4)
• Collaborative filtering:
  1. Find similar items
  2. Combine similar items into a recommendation list
• Assumption: similar users like similar items
Collaborative Filtering (2/4)
• Top-N recommendation strategy:
  1. Compute pair-wise item similarities from the rating matrix R
  2. Predict the rating of an item i using the k items most similar to i (i's neighborhood)
  3. Select the best N items
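The three steps above can be sketched as a minimal item-based top-N routine. This is an illustrative sketch, not the paper's code: the function name is hypothetical, and cosine similarity is assumed for the item-item matrix S, since the slides do not fix a similarity measure.

```python
import numpy as np

def top_n_recommend(R, user, k=2, n=2):
    """Item-based top-N CF sketch (hypothetical helper).

    R: (num_users, num_items) rating matrix, 0 = unrated.
    """
    # Step 1: item-item cosine similarity matrix S from R
    norms = np.linalg.norm(R, axis=0)
    norms[norms == 0] = 1.0
    S = (R.T @ R) / np.outer(norms, norms)
    np.fill_diagonal(S, 0.0)

    rated = np.flatnonzero(R[user] > 0)
    scores = {}
    for i in np.flatnonzero(R[user] == 0):
        # Step 2: the k most similar rated items form i's neighborhood
        neigh = rated[np.argsort(S[i, rated])[::-1][:k]]
        w = S[i, neigh]
        if w.sum() > 0:
            scores[i] = float(w @ R[user, neigh] / w.sum())
    # Step 3: select the best n items by predicted rating
    return sorted(scores, key=scores.get, reverse=True)[:n]
```

For example, with a 3-user, 4-item rating matrix, the routine predicts a rating for each of the user's unrated items from its neighborhood and returns the top n of them.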
Collaborative Filtering (4/4)
• Reduces the search space!
• But:
  – The search space remains huge and unconstrained
  – Requires the user to rate many items before highly correlated neighbors can be found
  – Greatly influenced by the size of the item's neighborhood
Ontology Filtering (1/3)
• Two inputs:
  – The users' historical data R
  – An ontology modeling the domain
• The criteria defining the ontology are usually not made explicit
  – e.g., classify wine by color (white vs. red) or by taste?
Ontology Filtering (2/3)
1. Compute the a-priori score APS(c) = 1 / (n_c + 2), where n_c is the number of descendants of concept c
2. Infer the rating of an unrated concept from the closest rated concept through their lowest common ancestor (lca), using the coefficients α(y, lca) and β(x, lca)
• OSS: find the closest concept x to any given y
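A tiny sketch of step 1, assuming the a-priori score definition APS(c) = 1/(n_c + 2) from the authors' OSS work, and a hypothetical {concept: [children]} encoding of the ontology tree (helper names are illustrative, not from the paper):

```python
def descendant_counts(children):
    """Count the descendants n_c of every concept in a tree
    given as {concept: [child, ...]} (hypothetical encoding)."""
    counts = {}
    def count(c):
        total = 0
        for child in children.get(c, []):
            total += 1 + count(child)  # the child plus its own descendants
        counts[c] = total
        return total
    for root in children:
        if root not in counts:
            count(root)
    return counts

def aps(children):
    """A-priori score APS(c) = 1 / (n_c + 2) for every concept."""
    return {c: 1.0 / (n + 2) for c, n in descendant_counts(children).items()}
```

Leaf concepts have no descendants, so they all receive APS = 1/2, and scores shrink toward the root as concepts become more general.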
CF vs. OF
• Computing item similarity: CF uses the item-item similarity matrix S; OF is restricted to a hierarchical ontology
• Inferring a missing score: CF infers from the item's neighborhood; OF infers from the closest concept with preferences
• Missing user preference: CF falls back on other people's preferences; OF uses the user's past experience
Outline
• Introduction
• Background
  – Collaborative Filtering
  – Ontology Filtering
• Learning the Ontologies
  – Clustering Algorithms
  – Learning Hierarchical Ontologies
• Experiments
• Conclusion
Clustering Algorithms
• Clustering algorithms: fuzzy clustering, nearest-neighbor clustering, hierarchical clustering, artificial neural networks for clustering, statistical clustering
• Hierarchical algorithms:
  – Distance-based clustering
  – Concept-based clustering
Distance-based Clustering
• Distance-based clustering
  – Agglomerative clustering
    • Bottom-up
    • Computes all pair-wise similarities: O(n²)
  – Partitional clustering
    • Top-down
    • Low complexity
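The bottom-up procedure can be sketched as single-linkage agglomerative merging (an illustrative sketch, not the paper's implementation; single linkage is an assumption, since the slides do not name a linkage criterion). The nested pair loop is where the O(n²) pair-wise cost comes from:

```python
def agglomerative(points, num_clusters, dist):
    """Bottom-up agglomerative clustering sketch (single linkage).

    Starts with each point in its own cluster and repeatedly merges
    the two closest clusters until num_clusters remain.
    """
    clusters = [[p] for p in points]
    while len(clusters) > num_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # single linkage: distance between the closest members
                d = min(dist(a, b) for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters
```

Running the merges to completion instead of stopping at num_clusters yields the binary merge tree that the learning algorithm turns into a hierarchical ontology.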
Concept-based Clustering
• Concept-based clustering
  – Items must be represented by a set of attribute-value pairs
  – Ex: mammal (body cover, heart chambers, body temperature) = (hair, four, regulated)
  – COBWEB
    • The classification tree is not height-balanced
    • Overall complexity is exponential in the number of attributes
Learning Hierarchical Ontologies (1/5)
• Users can be categorized into different communities
  – A single ontology for all users is not appropriate
  – Select a better ontology to use based on the user's preferences
Learning Hierarchical Ontologies (4/5)
• Find-concept problem
  – In s(y|x), what if the concepts representing liked items are too distant from the disliked ones?
• Algorithm 2
  1. Select a subset of the best-performing ontologies
  2. Among those, select the ontology that minimizes the distance between liked and disliked concepts
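The two-step selection can be sketched as follows. Everything here is hypothetical scaffolding: `performance(o)` stands for some measured prediction accuracy of ontology o, and `distance(o)` for the liked/disliked concept distance in o; neither is defined this way in the paper.

```python
def select_ontology(ontologies, performance, distance, top_k=3):
    """Two-step selection sketch (hypothetical helpers).

    1. Keep the top_k best-performing ontologies.
    2. Among those, pick the one minimizing the distance
       between liked and disliked concepts.
    """
    best = sorted(ontologies, key=performance, reverse=True)[:top_k]
    return min(best, key=distance)
```

The first step filters out ontologies that predict poorly; the second breaks ties in favor of ontologies where liked and disliked items stay well separated, which is exactly the find-concept problem above.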
Learning Multi-Hierarchical Ontologies
• Some problems
  – Implicit features
    • Limit the concept representation
    • Limit OF's inference process
  – Other possible sub-optimal candidates are ignored
• Improvement: slightly increase the search space
Experiments
• Two data sets:
  – MovieLens
    • Ratings from 943 real users, each rating at least 20 movies
    • 1,682 movies in total, 19 themes
  – Jester
    • Ratings on jokes collected over a period of 4 years
    • 24,983 users, 100 jokes
Evaluating the Recommendation Algorithm
• RS: the recommendation set, of size N
• N_ok: number of relevant items in RS
• N_r: total number of relevant items in the database
• Use the F1 metric: F1 = 2 · precision · recall / (precision + recall), with precision = N_ok / N and recall = N_ok / N_r
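With the standard precision/recall definitions of F1, the computation is straightforward (the function name is hypothetical):

```python
def f1_metric(recommended, relevant):
    """F1 score of a recommendation set RS.

    N    = |RS|, the size of the recommendation set
    N_ok = number of relevant items appearing in RS
    N_r  = total number of relevant items in the database
    """
    n_ok = len(set(recommended) & set(relevant))
    if n_ok == 0:
        return 0.0
    precision = n_ok / len(recommended)   # N_ok / N
    recall = n_ok / len(relevant)         # N_ok / N_r
    return 2 * precision * recall / (precision + recall)
```

F1 is the harmonic mean of precision and recall, so it rewards recommendation sets that are both accurate and complete rather than optimizing one at the other's expense.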
Hierarchical Clustering Analysis
• Execution time in seconds required for the clustering algorithm to generate the ontology.
Multi-Hierarchical Clustering Analysis
• Tradeoff between prediction accuracy and ontology quality.
Conclusions
• Introduced three algorithms:
  – Learning a set of ontologies from historical data
  – Selecting which ontology to use based on the user's preferences
  – Building a multi-hierarchical ontology based on a predefined window size
• Experimental results on two well-known data sets show that the algorithms produce good ontologies and increase prediction accuracy
• The learnt ontologies can even outperform traditional item-based collaborative filtering