Using Hierarchical Clustering for Learning the Ontologies used in Recommendation Systems
Vincent Schickel-Zuber, Boi Faltings [SIGKDD’07]
Reporter: Che-Wei Liang
Date: 2008/04/10



Outline

• Introduction
• Background
  – Collaborative Filtering
  – Ontology Filtering
• Learning the Ontologies
  – Clustering Algorithms
  – Learning Hierarchical Ontologies
• Experiments
• Conclusion

Introduction

• Recommender system
  – Helps people find the most relevant items based on the preferences of the person and of others.
• Item-based collaborative filtering (CF)
  – Recommends items based on the experience of the user as well as of other, similar users.

• CF constructs the item-item similarity matrix S

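As a sketch, S can be built with cosine similarity between item columns, a common choice for item-based CF (the paper may use a different similarity measure; the rating matrix below is a toy example, 0 meaning unrated):

```python
import numpy as np

def item_similarity_matrix(ratings):
    """Build an item-item cosine-similarity matrix S from a
    users x items rating matrix (0 = unrated)."""
    # Normalize each item column to unit length.
    norms = np.linalg.norm(ratings, axis=0)
    norms[norms == 0] = 1.0          # avoid division by zero
    normalized = ratings / norms
    # S[i, j] = cosine similarity between item i and item j.
    return normalized.T @ normalized

R = np.array([[5.0, 3.0, 0.0],
              [4.0, 2.0, 1.0],
              [1.0, 5.0, 4.0]])
S = item_similarity_matrix(R)
```

S is symmetric with a unit diagonal, so in practice only the upper triangle needs to be stored.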

Ontology

• What is an ontology?
  – A multi-inheritance graph structure
  – Edges represent features
  – An item is an instance of at least one concept

Ontology Filtering

• Infer preference ratings of items based on the ratings of known items and their relative positions in an ontology.

Outline

• Introduction
• Background
  – Collaborative Filtering
  – Ontology Filtering
• Learning the Ontologies
  – Clustering Algorithms
  – Learning Hierarchical Ontologies
• Experiments
• Conclusion

Background

• Users U = {u1, …, um}
• Items I = {i1, …, in}
• Ru,i = the rating assigned to item i by user u
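A toy instantiation of this notation (the user and item names are illustrative only):

```python
# Hypothetical toy instantiation of the paper's notation.
users = ["u1", "u2", "u3"]            # U = {u1, ..., um}
items = ["i1", "i2", "i3", "i4"]      # I = {i1, ..., in}

# R[u, i] = rating assigned to item i by user u; missing pairs are unrated.
R = {
    ("u1", "i1"): 5,
    ("u1", "i3"): 3,
    ("u2", "i2"): 4,
    ("u3", "i1"): 2,
}

def rating(u, i, default=None):
    """Look up R[u, i], returning `default` when u has not rated i."""
    return R.get((u, i), default)
```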

Collaborative Filtering (1/4)

• Collaborative filtering
  1. Find similar items
  2. Combine similar items into a recommendation list
• Assumption: similar users like similar items

Collaborative Filtering (2/4)

• Top-N recommendation strategy
  1. Compute pair-wise similarities in matrix R
  2. Predict the rating of an item i by using the k most similar items to i (i’s neighborhood)
  3. Select the best N items
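The three steps above can be sketched as follows (a simplified illustration, not the paper's exact procedure; `S` is a toy item-item similarity matrix and `k`, `n` are hypothetical parameters):

```python
import numpy as np

def predict_rating(user_ratings, S, item, k=2):
    """Predict a rating for `item` as a similarity-weighted average
    over the k most similar items the user has already rated."""
    rated = [j for j, r in enumerate(user_ratings) if r > 0 and j != item]
    # Keep the k neighbours most similar to `item`.
    neighbours = sorted(rated, key=lambda j: S[item, j], reverse=True)[:k]
    weights = np.array([S[item, j] for j in neighbours])
    if len(neighbours) == 0 or weights.sum() == 0:
        return 0.0
    ratings = np.array([user_ratings[j] for j in neighbours])
    return float(weights @ ratings / weights.sum())

def top_n(user_ratings, S, n=2, k=2):
    """Rank unrated items by predicted rating and return the best n."""
    unrated = [i for i, r in enumerate(user_ratings) if r == 0]
    scored = [(predict_rating(user_ratings, S, i, k), i) for i in unrated]
    return [i for _, i in sorted(scored, reverse=True)[:n]]

# Toy 3-item similarity matrix and one user's rating vector.
S = np.array([[1.0, 0.9, 0.1],
              [0.9, 1.0, 0.2],
              [0.1, 0.2, 1.0]])
recommended = top_n([5, 0, 0], S, n=2)
```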

Collaborative Filtering (3/4)


Collaborative Filtering (4/4)

• Reduces the search space!
• But
  – The search space remains huge and unconstrained
  – Requires the user to rate many items to find highly correlated neighbors
  – Greatly influenced by the size of the item’s neighborhood

Ontology Filtering (1/3)

• Two inputs:
  – The users’ historical data R
  – An ontology modeling the domain
• How the ontology is defined is usually not made explicit
  – e.g., classify wine by color (white vs. red) or by taste?

Ontology Filtering (2/3)

1. Compute the a-priori score APS(c), where nc is the number of descendants of concept c
2. Infer the rating via OSS: find the closest concept x to any given y, propagating the score through their lowest common ancestor (lca) with α(y, lca) and β(x, lca)
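Assuming the a-priori score from the authors' companion OSS work, APS(c) = 1/(nc + 2), step 1 can be sketched as follows (the toy ontology and all names are illustrative):

```python
def descendants(ontology, c):
    """Count all descendants of concept c in a child-list ontology."""
    count = 0
    stack = list(ontology.get(c, []))
    while stack:
        node = stack.pop()
        count += 1
        stack.extend(ontology.get(node, []))
    return count

def aps(ontology, c):
    # APS(c) = 1 / (n_c + 2), assuming the OSS definition.
    return 1.0 / (descendants(ontology, c) + 2)

# Toy ontology: concept -> list of child concepts.
onto = {"wine": ["red", "white"], "red": ["merlot"], "white": []}
```

Under this definition a leaf concept gets APS = 1/2, and the score decreases as a concept gains descendants, i.e. as it becomes more general.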

Ontology Filtering (3/3)


CF vs. OF

                          Collaborative filtering           Ontology filtering
Compute item similarity   Using the item-item               Restricted to a
                          similarity matrix S               hierarchical ontology
Infer missing score       By neighborhood                   From the closest concept
                                                            with preferences
Missing user preference   From other people’s preferences   From the user’s past experience

Outline

• Introduction
• Background
  – Collaborative Filtering
  – Ontology Filtering
• Learning the Ontologies
  – Clustering Algorithms
  – Learning Hierarchical Ontologies
• Experiments
• Conclusion

Clustering algorithm

• Clustering algorithms
  – Fuzzy clustering, nearest-neighbor clustering, hierarchical clustering, artificial neural networks for clustering, statistical clustering
• Hierarchical algorithms
  – Distance-based clustering
  – Concept-based clustering

Hierarchical algorithm

[Figure: dendrogram]

Distance-based Clustering

• Distance-based clustering
  – Agglomerative clustering
    • Bottom-up
    • Computes all pair-wise similarities: O(n²)
  – Partitional clustering
    • Top-down
    • Low complexity

Concept-Based clustering

• Concept-based clustering
  – Items need to be represented by a set of attribute-value pairs
  – Ex: mammal (body cover, heart chambers, body temperature) = (hair, four, regulated)
  – COBWEB
    • The classification tree is not height-balanced
    • Overall complexity is exponential in the number of attributes
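The slide's mammal example can be encoded as attribute-value pairs like this (a toy encoding, not COBWEB itself; `matches` is an illustrative membership test):

```python
# The slide's mammal example as attribute-value pairs (toy encoding).
mammal = {
    "body cover": "hair",
    "heart chambers": "four",
    "body temperature": "regulated",
}

def matches(instance, concept):
    """True when the instance agrees with every attribute the
    concept specifies (a simple concept-membership test)."""
    return all(instance.get(a) == v for a, v in concept.items())

# A broader concept specifying only one attribute.
warm_blooded = {"body temperature": "regulated"}
```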

Learning Hierarchical Ontologies (1/5)

• Users can be categorized into different communities.
  – One ontology for all users is not appropriate
  – Select a better ontology to use based on the user’s preferences

Learning Hierarchical Ontologies (2/5)

• Generate a whole set of ontologies Λ


Learning Hierarchical Ontologies (3/5)


Learning Hierarchical Ontologies (4/5)

• Find-concept problem
  – In s(y | x), what if the concepts representing the liked items are too distant from the disliked ones?
• Algorithm 2
  1. Select a subset of ontologies that perform best
  2. Among the selected ontologies, pick the one that minimizes the distance between liked and disliked concepts
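One way to read the two steps of Algorithm 2 is the following sketch; the scoring functions are placeholders standing in for the paper's actual accuracy and concept-distance measures:

```python
def select_ontology(candidates, accuracy, liked_disliked_distance, top=3):
    """Paraphrase of Algorithm 2: keep the `top` best-performing
    ontologies, then pick the one whose liked and disliked concepts
    are closest together.  `accuracy` and `liked_disliked_distance`
    are placeholder scoring functions, not the paper's definitions."""
    best = sorted(candidates, key=accuracy, reverse=True)[:top]
    return min(best, key=liked_disliked_distance)
```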

Learning Hierarchical Ontologies (5/5)


Learning Multi-Hierarchical Ontologies

• Some problems
  – Implicit features
    • Limit the concept representation
    • Limit OF’s inference process
  – Other possible suboptimal candidates are ignored
• Improvement: slightly increase the search space


Classical agglomerative clustering with complete- link criterion function

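The complete-link criterion named in the caption can be sketched in pure Python (a didactic version, not the paper's implementation):

```python
import itertools

def complete_link(points, distance, n_clusters=1):
    """Classical agglomerative clustering with the complete-link
    criterion: repeatedly merge the two clusters whose farthest
    members are closest."""
    clusters = [[p] for p in points]
    while len(clusters) > n_clusters:
        # Complete-link distance = max pairwise distance between clusters.
        def link(pair):
            a, b = pair
            return max(distance(x, y) for x in a for y in b)
        a, b = min(itertools.combinations(clusters, 2), key=link)
        clusters.remove(a)
        clusters.remove(b)
        clusters.append(a + b)
    return clusters

points = [0.0, 0.1, 5.0, 5.1]
result = complete_link(points, lambda x, y: abs(x - y), n_clusters=2)
```

Recording the sequence of merges, rather than stopping at a fixed cluster count, yields the dendrogram that serves as the learnt hierarchy.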

Experiments

• Two data sets:
  – MovieLens
    • Ratings from 943 real users, each rating at least 20 movies
    • 1682 movies in total, 19 themes
  – Jester
    • Ratings on jokes collected over a period of 4 years
    • Contains 24,983 users and 100 jokes

Evaluating Recommendation Algorithm

• RS: the recommendation set
• Nok: the number of relevant items (in RS)
• Nr: the number of relevant items in the database N
• Use the F1 metric: F1 = 2 · precision · recall / (precision + recall)
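With these counts the F1 metric works out as follows (standard precision/recall definitions, which the slide leaves implicit):

```python
def f1_metric(n_ok, n_recommended, n_relevant):
    """F1 = 2PR / (P + R), with precision P = n_ok / n_recommended
    and recall R = n_ok / n_relevant."""
    if n_ok == 0:
        return 0.0
    precision = n_ok / n_recommended
    recall = n_ok / n_relevant
    return 2 * precision * recall / (precision + recall)
```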

Hierarchical Clustering Analysis

• Execution time in seconds required for the clustering algorithm to generate the ontology.

[Result figures]

Multi-Hierarchical Clustering Analysis

• Tradeoff between prediction accuracy and ontology quality.

[Result figures]

Recommendation Accuracy


Conclusions

• Introduced three algorithms
  – Learning a set of ontologies from historical data
  – Selecting which ontology to use based on the user’s preferences
  – Building a multi-hierarchical ontology based on a predefined window size
• Experimental results on two well-known data sets showed that the algorithms produce good ontologies and increase prediction accuracy.
• The learnt ontologies can even outperform traditional item-based collaborative filtering.