Ranking Resources in Folksonomies by Exploiting Semantic and Context-specific Information

Thomas Rodenhausen
[email protected]
12 January 2012

KOM - Multimedia Communications Lab
Prof. Dr.-Ing. Ralf Steinmetz (Director)
Dept. of Electrical Engineering and Information Technology
Dept. of Computer Science (adjunct Professor)
TUD – Technische Universität Darmstadt
Rundeturmstr. 10, D-64283 Darmstadt, Germany
Tel. +49 6151 166150, Fax +49 6151 166152
www.KOM.tu-darmstadt.de

© author(s) of these slides 2012 including research results of the research network KOM and TU Darmstadt otherwise as specified at the respective slide

httc – Hessian Telemedia Technology Competence-Center e.V - www.httc.de

Source: www.google.com


Folksonomies

Source: www.icon-finder.com, www.flickr.com, www.delicious.com, www.crokodil.de

[Figure: a tag assignment, in which the user Bob assigns the tag “sugar loaf” to a resource.]
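A folksonomy of this kind is commonly modelled as a set of (user, tag, resource) triples. The following is a minimal sketch under that assumption. Only the user "Bob" and the tag "sugar loaf" come from the slide; the other users, tags and resource names are hypothetical, and the co-occurrence edge weights follow the usual construction of an undirected folksonomy graph rather than anything stated here.

```python
from collections import Counter

# Each tag assignment is a (user, tag, resource) triple.
# Everything except "Bob" and "sugar loaf" is made up for illustration.
assignments = {
    ("Bob", "sugar loaf", "rio.jpg"),
    ("Bob", "brazil", "rio.jpg"),
    ("Alice", "sugar loaf", "rio.jpg"),
    ("Alice", "sugar loaf", "baking.html"),
}


def edge_weights(assignments):
    """Count how often two entities co-occur in a tag assignment."""
    weights = Counter()
    for user, tag, resource in assignments:
        # Prefix each node with its type so that a user and a tag
        # with the same name stay distinct nodes.
        u, t, r = ("user", user), ("tag", tag), ("resource", resource)
        weights[(u, t)] += 1
        weights[(t, r)] += 1
        weights[(u, r)] += 1
    return weights


weights = edge_weights(assignments)
```

The resulting weights can serve as input for any of the graph-based rankers discussed later, which only need the weighted graph and not the content of the resources.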

Ranking Resources in Folksonomies

Task of Ranking of Resources: “Rank resources, such that they are in descending order of relevance towards an information need.”

Find me a resource, with the query-entity given as (adapted from [Bog09]):

- user: Interests match
- resource: More like this
- tag: Guided search


Ranking Resources in Folksonomies

Relevance: Resources are said to be relevant for a user’s information need if they contain valuable information with respect to that need. [MRS08]

Information need: A user’s desire to have information about a certain topic, e.g. in order to solve a certain task, is described as the user’s information need. [MRS08]

Task of Ranking of Resources: “Rank resources, such that they are in descending order of relevance towards an information need.”


Example: Bibsonomy

http://www.bibsonomy.org/



Assumptions / Constraints

Content of resources is unavailable
- Multimedia content, e.g. photos, audio, video
- Other formats that are not readily readable
- Complexity of dealing with content is undesirable

Graph-based ranking
- Only option considered

How to Actually Rank in Folksonomies?

FolkRank [HJSS06]
- State-of-the-art graph-based ranking
- Based on PageRank’s random surfer [PBMW99]
- Transition probabilities: “How probable do I go to B being at A”

[Figure: example folksonomy graph with the nodes fcbarcelona.com, messi, barca and barcaFan, weighted edges, the transition probabilities derived from the weights, and a restart edge. Figure labels: “Estimates relevance”, “Estimates authority”. The resulting ranking vector (45%, 29%, 16%, 10%) describes context.]
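The random surfer with restart can be sketched in a few lines. This is a minimal illustration of a random walk with restart on an undirected weighted graph, not the FolkRank implementation evaluated in this work; the edge weights, the restart probability of 0.3 and the function names are assumptions chosen for the example. The node names are taken from the figure.

```python
from collections import defaultdict


def transition_probabilities(edges):
    """Turn undirected edge weights into 'how probable do I go to B being at A'."""
    weights = defaultdict(dict)
    for (a, b), w in edges.items():
        weights[a][b] = w
        weights[b][a] = w
    probs = {}
    for a, neighbours in weights.items():
        total = sum(neighbours.values())
        probs[a] = {b: w / total for b, w in neighbours.items()}
    return probs


def random_walk_with_restart(edges, query, restart=0.3, iterations=100):
    """Score every node by how often a surfer restarting at `query` visits it."""
    probs = transition_probabilities(edges)
    nodes = list(probs)
    score = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iterations):
        # With probability `restart` the surfer jumps back to the query-entity.
        nxt = {n: (restart if n == query else 0.0) for n in nodes}
        # Otherwise the surfer follows an edge according to its weight.
        for a in nodes:
            for b, p in probs[a].items():
                nxt[b] += (1.0 - restart) * score[a] * p
        score = nxt
    return score


# Hypothetical edge weights between the entities shown in the figure.
edges = {
    ("barcaFan", "barca"): 2,
    ("barcaFan", "messi"): 1,
    ("barca", "fcbarcelona.com"): 3,
    ("messi", "fcbarcelona.com"): 1,
}
scores = random_walk_with_restart(edges, query="barcaFan")
ranking = sorted(scores, key=scores.get, reverse=True)
```

The scores sum to one, so they can be read as the percentages shown in the figure. FolkRank as described in [HJSS06] additionally compares such a query-specific run against a run without a preferred node; that step is left out here.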


Authority & Context & Assumptions

Context (according to Merriam-Webster): “The interrelated conditions in which something exists or occurs: environment, setting”

Authority: A hyperlink from a web page a to a web page b denotes the assignment of authority or trust from a to b [PBMW99].

“How probable do I go to B being at A”


Authority & Context & Assumptions

Assumptions of FolkRank
(i) Tags assigned to a resource describe the resource’s content well
(ii) Resources a tag is assigned to describe the tag’s semantics well
(iii) Tags assigned by a user describe the user’s interests well
(iv) Users a tag was assigned by describe the tag’s semantics well
(v) Resources of a user describe the user’s interests well
(vi) Users of a resource describe the resource’s content well

Challenges of FolkRank

Assumption about folksonomy-structure violated
- Concept drift
- Ambiguity
- Multi-facetedness of entities

Including quality attributes of a resource
- Authority signals (e.g. PageRank on the Web)
- Hub signals: “hub pages are … compilations that someone with an interest in the topic has spent time putting together” [MRS08]

[Figure: example graphs illustrating the challenges: the tags AI (topic), Artificial Intelligence (topic) and Barcelona (location) around the resource IJCAI-Proceedings.pdf; the tags football, soccer and news; authority signals and hub signals for a resource that is well-maintained and ad-free. Edges carry the weight 1, uncertain edges are marked with “?”.]

Source: www.icon-finder.com


Structure

Can Semantic Information help?

Tags can be ambiguous
- Example: the tag Barcelona as topic (IJCAI-Proceedings.pdf) or as location (barcelona-city.jpg)
- Tag types [BSB+09]
- Different incentives to tag [MNBD06]

Semantic relatedness, i.e. XESA [SBD+10]

[Figure: the concepts FC Barcelona, football, soccer, Dallas Cowboys and American football, connected by example relatedness values 0.5, 0.1 and 0.005.]

Source: travel.sympatico.ca

Can Context-specific information help?

Context: “The interrelated conditions in which something exists or occurs: environment, setting”

Tag assignment context
- May allow disambiguation at a fine semantic level

[Figure: tag assignments made in different contexts, with the tag combinations “Barcelona, football”, “Cowboys, football” and “Cowboys, carnival costume” and the example meanings football, soccer and playing soccer in the street.]

Source: www.free-clipart-pictures.net, www.icon-finder.com

Proposed Approaches

IncentiveScore
- Concept drift: tag types used to adapt the folksonomy graph
- Tag types: topic, location, person/organization, event, activity, resource type, other

InteliScore
- Concept drift: semantic relatedness measures used to adapt the folksonomy graph
- XESA measures based on Wikipedia used

HITSonomy
- Inclusion of quality attributes of resources

VSScore
- Extensive description of resources/query-entity

HITSonomy

FolkRank ‘thinks’ unidirectional
- “How probable do I go to B being at A”
- Estimates relevance & authority

HITSonomy ‘thinks’ bidirectional
- Inspired by HITS [Kle99]
- Additionally: “How probable did I come from B being at A”
- Estimates relevance & hub

[Figure: two nodes A and B with edge weights 1 and 2, shown once with the forward transition probabilities only and once with both forward and backward probabilities (1/3, 2/3, 2/4, 2/4). The resulting ranking vector describes context.]
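To make the authority/hub distinction concrete, the following sketch runs plain HITS [Kle99] on a tiny directed graph and then combines both scores with a weighted arithmetic mean. It is an illustration of the underlying idea only, not HITSonomy itself: the graph, the node names and the weight `alpha` are hypothetical, and the restart used by HITSonomy is omitted.

```python
import math


def hits(edges, iterations=50):
    """Plain HITS: good hubs point to good authorities and vice versa."""
    nodes = sorted({n for edge in edges for n in edge})
    hub = {n: 1.0 for n in nodes}
    auth = {n: 1.0 for n in nodes}
    for _ in range(iterations):
        # Authority: sum of the hub scores of all nodes pointing to me.
        auth = {n: sum(hub[a] for a, b in edges if b == n) for n in nodes}
        norm = math.sqrt(sum(v * v for v in auth.values())) or 1.0
        auth = {n: v / norm for n, v in auth.items()}
        # Hub: sum of the authority scores of all nodes I point to.
        hub = {n: sum(auth[b] for a, b in edges if a == n) for n in nodes}
        norm = math.sqrt(sum(v * v for v in hub.values())) or 1.0
        hub = {n: v / norm for n, v in hub.items()}
    return auth, hub


def combined_score(auth, hub, alpha=0.5):
    """Weighted arithmetic mean of authority and hub score (alpha is assumed)."""
    return {n: alpha * auth[n] + (1.0 - alpha) * hub[n] for n in auth}


# Hypothetical graph: two compilation pages linking to two resources.
edges = [("hub1", "a"), ("hub1", "b"), ("hub2", "a")]
auth, hub = hits(edges)
combined = combined_score(auth, hub)
```

In this toy graph `a` receives the highest authority score because both compilation pages point to it, while `hub1` is the strongest hub because it points to both resources.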

VSScore

Idea
- Port the ranking task to the vector space model [MRS08] used in text retrieval
- A term (usually) represents a semantic concept

Problem
- No content information of resources (in this work)

Solution
- Entities in the folksonomy can be viewed as semantic concepts
- Represent resources’ content by their context
- Represent any entity by its context (e.g. a query-entity)

[Figure: in text retrieval, two documents (“Barcelona… Messi... Barcelona… Barcelona… FCB…” and “Dallas… Cowboys… Football… Cowboys… Dallas…”) are represented as term vectors over the terms Barcelona and Cowboys (entries 2…0 and 0…3) and compared by δ. In the folksonomy, entities such as barca, barcaFan and dallascowboys.com take the place of terms, with context-vector entries such as 1…0 and 0.8…0.3…0.2.]


Structure


Evaluation Setup

BibSonomy corpus

Methodologies
- LeavePostOut [JMH+07]
- LeaveRTOut

Assumption: “Tag assignment indicates relevance of resource towards information need represented by user or tag”

Post: All tag assignments between user and resource

RT: All tag assignments between tag and resource
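The two leave-out methodologies can be sketched directly from the definitions of Post and RT. This is a minimal sketch assuming the folksonomy is stored as a set of (user, tag, resource) triples; the function names and the sample data are hypothetical.

```python
def leave_post_out(assignments, user, resource):
    """Remove the post: all tag assignments between `user` and `resource`."""
    return {
        (u, t, r) for (u, t, r) in assignments
        if not (u == user and r == resource)
    }


def leave_rt_out(assignments, tag, resource):
    """Remove the RT: all tag assignments between `tag` and `resource`."""
    return {
        (u, t, r) for (u, t, r) in assignments
        if not (t == tag and r == resource)
    }


# Hypothetical folksonomy with two users, two tags and two resources.
assignments = {
    ("barcaFan", "barca", "fcbarcelona.com"),
    ("barcaFan", "messi", "fcbarcelona.com"),
    ("barcaFan", "barca", "news.html"),
    ("otherUser", "barca", "fcbarcelona.com"),
}

# LeavePostOut: query with the user, expect the left-out resource ranked high.
train_post = leave_post_out(assignments, "barcaFan", "fcbarcelona.com")

# LeaveRTOut: query with the tag, expect the left-out resource ranked high.
train_rt = leave_rt_out(assignments, "barca", "fcbarcelona.com")
```

A ranking algorithm is then run on the reduced set, and the position at which it returns the left-out resource indicates how well it recovers the removed relevance signal.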

Evaluation Parameters

FolkRank
- LeavePostOut, given user as query-entity find me resources
- Restart probability

Evaluation Parameters

HITSonomy
- LeavePostOut, given user as query-entity find me resources
- Restart probability
- Weighted arithmetic mean of authority and hub score

Evaluation Results

LeavePostOut: 1 out
Given user as query-entity find me resources

HITSonomy and VSScore significantly more effective than FolkRank
Wilcoxon signed rank test on AveragePrecision

Evaluation Results

LeavePostOut: 33% out
Given user as query-entity find me resources

HITSonomy and VSScore significantly more effective than FolkRank
Wilcoxon signed rank test on AveragePrecision

Conclusion

HITSonomy and VSScore can beat the state-of-the-art
- In different resource ranking tasks
- Depending on LeavePostOut/LeaveRTOut, thus the conditions of the query-entity

Other proposed algorithms do not perform as well

Most pairwise statistical significance comparisons won:

Methodology | Interests match | Guided search
LeavePostOut | HITSonomy | HITSonomy
LeaveNPostsOut | HITSonomy | HITSonomy
LeaveRTOut | FolkRank, HITSonomy, IncentiveScore, InteliScore | VSScore
LeaveNRTsOut | FolkRank, HITSonomy, IncentiveScore | HITSonomy, VSScore

Limitations

Ranking
- Novel resources
- Explanation
- Efficient computation


Thank you!

Why not Simply Use a Search Engine?

- Search engines on the web are in general not specialized for search in folksonomies
- They don't usually create rankings that take available semantic and context-specific information into account
- It is not necessarily desirable to have the search engine crawl the information contained in the folksonomy application
- The search engine may not have access to information like friendships, activities etc.
- How to give the search engine a user or a resource to obtain a resource ranking?
- You may want to determine yourself how a ranking is to be produced, without having to rely on however a search engine may produce a ranking
- Depending on the scenario this may, however, be an alternative

Find me a resource, with the query-entity given as (adapted from [Bog09]):

- user: Resource recommendation
- resource: Related stuff
- tag: Guided search

Metrics

Mean Average Precision
- Q is the set of information needs to evaluate

Average Precision
- q is a single information need

Mean Normalized Precision at k
- Q is the set of information needs to evaluate

Normalized Precision at k
- q is a single information need
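The formulas of this slide are not preserved in the text, so the following sketch uses the standard textbook definitions of Average Precision and Mean Average Precision (as in [MRS08]) rather than the exact notation of the slide. The function names and the sample rankings are hypothetical. Normalized Precision at k is not sketched because its definition cannot be recovered from the slide text.

```python
def average_precision(ranking, relevant):
    """Average Precision for a single information need q."""
    hits = 0
    total = 0.0
    for rank, item in enumerate(ranking, start=1):
        if item in relevant:
            hits += 1
            # Precision at the rank of each relevant item.
            total += hits / rank
    return total / len(relevant) if relevant else 0.0


def mean_average_precision(runs):
    """Mean of the Average Precision over all information needs in Q."""
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs)


# Two hypothetical information needs with their rankings and relevant sets.
runs = [
    (["a", "b", "c"], {"a", "c"}),
    (["x", "y"], {"y"}),
]
ap_first = average_precision(*runs[0])
ap_second = average_precision(*runs[1])
map_score = mean_average_precision(runs)
```

When exactly one resource is relevant, as when a single post or RT is left out, Average Precision reduces to the reciprocal of the position at which that resource is found.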

Why Ranking in Folksonomies?

Searching for relevant knowledge artifacts (resources) constitutes overhead, e.g. in resource-based learning (e.g. in CROKODIL)

adapted from [BSB+09]


Related Work

Ranking Vector as Context

Query-entity context-vector visualized: 45%, 29%, 16%, 10%

Resource-entities context-vectors visualized: 28%, 47%, 13%, 12% and 9%, 20%, 25%, 46%

The query-entity vector (0.45, 0.29, 0.16, 0.10) is compared by δ with the resource vectors (0.28, 0.47, 0.12, 0.13) and (0.09, 0.20, 0.25, 0.46).
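The comparison of context vectors can be tried out with the numbers from this slide. The slide does not preserve which measure δ stands for, so the sketch below assumes cosine similarity, the usual choice in the vector space model [MRS08]; the function name and the resource labels are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two context vectors (assumed choice for δ)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Context vectors as given on the slide.
query = (0.45, 0.29, 0.16, 0.10)
resources = {
    "resource 1": (0.28, 0.47, 0.12, 0.13),
    "resource 2": (0.09, 0.20, 0.25, 0.46),
}

similarities = {
    name: cosine_similarity(query, vector)
    for name, vector in resources.items()
}
ranking = sorted(similarities, key=similarities.get, reverse=True)
```

Under this assumption the first resource vector is the closer match to the query-entity, with a similarity of roughly 0.90 against roughly 0.57 for the second.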


Evaluation Results

LeaveRTOut: 1 out
Given tag as query-entity find me resources

HITSonomy and VSScore significantly more effective than FolkRank
Wilcoxon signed rank test on AveragePrecision

[Plot: Mean Average Precision]

Evaluation Results

LeaveRTOut: 33% out
Given tag as query-entity find me resources

HITSonomy and VSScore significantly more effective than FolkRank
Wilcoxon signed rank test on AveragePrecision

[Plot: Mean Average Precision]

Preliminary Experiments (interim presentation)

FR: FolkRank
VS FR: Vector space with FolkRank context-vectors
VS HITS: Vector space with HITS context-vectors

[Plots: precision; position found]


Bibliography

[Bog09] T. Bogers. Recommender Systems for Social Bookmarking. PhD Thesis, Tilburg University, 2009.

[BSB+08] D. Böhnstedt, P. Scholl, B. Benz, C. Rensing, R. Steinmetz, and B. Schmitz. Einsatz persönlicher Wissensnetze im Ressourcen-basierten Lernen. In Proceedings of the 6th e-Learning Fachtagung Informatik, pages 113–124, 2008.

[HJSS06] A. Hotho, R. Jäschke, C. Schmitz, and G. Stumme. Information Retrieval in Folksonomies: Search and Ranking. In Proceedings of the 3rd European Semantic Web Conference on the Semantic Web: Research and Applications, pages 411–426, 2006.

[JMH+07] R. Jäschke, L. Marinho, A. Hotho, L. Schmidt-Thieme, and G. Stumme. Tag Recommendations in Folksonomies. 2007.

[MRS08] C. Manning, P. Raghavan, and H. Schütze. Introduction to Information Retrieval. Cambridge University Press, 2008.

[Kle99] J. Kleinberg. Authoritative Sources in a Hyperlinked Environment. Journal of the ACM, 46:604–632, 1999.

[PBMW99] L. Page, S. Brin, R. Motwani, and T. Winograd. The PageRank Citation Ranking: Bringing Order to the Web. Technical Report 1999-66, Stanford InfoLab, 1999.