
KOM - Multimedia Communications Lab
Prof. Dr.-Ing. Ralf Steinmetz (Director)

Dept. of Electrical Engineering and Information Technology
Dept. of Computer Science (adjunct Professor)

TUD – Technische Universität Darmstadt Rundeturmstr. 10, D-64283 Darmstadt, Germany

Tel.+49 6151 166150, Fax. +49 6151 166152 www.KOM.tu-darmstadt.de

© author(s) of these slides 2012 including research results of the research network KOM and TU Darmstadt otherwise as specified at the respective slide

httc – Hessian Telemedia Technology Competence-Center e.V. - www.httc.de

Thomas Rodenhausen

[email protected]

12 January 2012


Source: www.google.com

Ranking Resources in Folksonomies by Exploiting Semantic and Context-specific Information


Source: www.icon-finder.com, www.flickr.com, www.delicious.com, www.crokodil.de

Folksonomies

[Figure: a tag assignment, with the user "Bob" and the tag "sugar loaf"]
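A tag assignment can be thought of as a (user, tag, resource) triple. The following is a minimal sketch of that data structure, not the representation used in the presented work: "Bob" and "sugar loaf" come from the slide, while the class name `TagAssignment`, the other users, tags and resources, and the helper `resources_tagged_with` are hypothetical.

```python
from typing import NamedTuple


class TagAssignment(NamedTuple):
    """One tag assignment: a user attaches a tag to a resource."""
    user: str
    tag: str
    resource: str


# "Bob" and "sugar loaf" are taken from the slide; all other names are hypothetical.
assignments = [
    TagAssignment("Bob", "sugar loaf", "rio-photo.jpg"),
    TagAssignment("Bob", "rio", "rio-photo.jpg"),
    TagAssignment("Alice", "sugar loaf", "baking-guide.html"),
]

# The three entity sets of the folksonomy follow directly from the assignments.
users = {a.user for a in assignments}
tags = {a.tag for a in assignments}
resources = {a.resource for a in assignments}


def resources_tagged_with(tag):
    """Return, in sorted order, all resources that carry the given tag."""
    return sorted({a.resource for a in assignments if a.tag == tag})
```

In this toy data the tag "sugar loaf" reaches two unrelated resources, which already hints at why ranking by tag alone can be ambiguous.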


Ranking Resources in Folksonomies

Task of Ranking of Resources: “Rank resources such that they are in descending order of relevance towards an information need.”

Ranking tasks by entity given as query-entity, “Find me a resource” (adapted from [Bog09]):
  user: Interests match
  resource: More like this
  tag: Guided search


Ranking Resources in Folksonomies

Relevance: Resources are said to be relevant for a user’s information need if they contain valuable information with respect to that information need. [MRS08]

Information need: A user’s desire to have information about a certain topic, e.g. in order to solve a certain task, is described as the user’s information need. [MRS08]

Task of Ranking of Resources: “Rank resources such that they are in descending order of relevance towards an information need.”


Example: BibSonomy

http://www.bibsonomy.org/

https://www.flickr.com/photos/unslugged/21999441399/in/explore-2015-10-15/

https://twitter.com/taxonbytes


Assumptions / Constraints

Content of resources is unavailable
  Multimedia content, e.g. photos, audio, video
  Other not readily readable formats
  Complexity of dealing with content undesirable

Graph-based ranking
  Only option considered


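How to Actually Rank in Folksonomies?

FolkRank [HJSS06]
  State-of-the-art graph-based ranking
  Based on PageRank’s random surfer [PBMW99]
  “How probable is it that I go to B, being at A?”

[Figure: example folksonomy graph with the nodes fcbarcelona.com, messi, barca and barcaFan; edge weights and the transition probabilities derived from them (e.g. 1/5, 1/4, 3/5, 1/3, 1/2); resulting scores of 45%, 29%, 16% and 10%. Figure labels: Restart; describes context; estimates relevance; estimates authority.]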
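The random surfer with restart can be sketched as follows. This is a simplified illustration (a personalized PageRank on an undirected graph), not FolkRank itself as defined in [HJSS06]. Assumptions that are not on the slide: each tag assignment adds weight 1 to the three edges between its user, tag and resource; the restart probability is 0.3; and the user `cowboysFan` and the tag `football` in the toy data are hypothetical.

```python
from collections import defaultdict


def build_graph(assignments):
    """Undirected weighted graph: every (user, tag, resource) assignment adds
    weight 1 to the edges user-tag, tag-resource and user-resource."""
    weights = defaultdict(lambda: defaultdict(float))
    for user, tag, resource in assignments:
        u, t, r = ("user", user), ("tag", tag), ("resource", resource)
        for a, b in ((u, t), (t, r), (u, r)):
            weights[a][b] += 1.0
            weights[b][a] += 1.0
    return weights


def random_surfer(weights, query, restart=0.3, iterations=100):
    """Iterate a random walk that follows an edge with probability proportional
    to its weight and jumps back to the query-entity with probability `restart`."""
    nodes = list(weights)
    score = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iterations):
        nxt = {n: 0.0 for n in nodes}
        for a in nodes:
            total = sum(weights[a].values())
            for b, w in weights[a].items():
                # "How probable is it that I go to b, being at a?"
                nxt[b] += (1.0 - restart) * score[a] * w / total
        nxt[query] += restart
        score = nxt
    return score


# Toy data: node names from the slide plus hypothetical additions.
assignments = [
    ("barcaFan", "barca", "fcbarcelona.com"),
    ("barcaFan", "messi", "fcbarcelona.com"),
    ("barcaFan", "football", "fcbarcelona.com"),
    ("cowboysFan", "football", "dallascowboys.com"),
]
graph = build_graph(assignments)
scores = random_surfer(graph, query=("user", "barcaFan"))
ranking = sorted(
    (n for n in scores if n[0] == "resource"),
    key=lambda n: scores[n],
    reverse=True,
)
```

With the user barcaFan as query-entity, `ranking` lists the resources in descending order of their score; the scores over all nodes sum to 1 and can be read as a description of the query-entity's context.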


Authority & Context & Assumptions

Context (according to Merriam-Webster): “The interrelated conditions in which something exists or occurs: environment, setting”

Authority: A hyperlink from a web page a to a web page b denotes the assignment of authority or trust from a to b [PBMW99].



Authority & Context & Assumptions

Assumptions of FolkRank
  (i) Assigned tags to a resource describe the resource’s content well
  (ii) Resources a tag is assigned to describe the tag’s semantics well
  (iii) Assigned tags by a user describe the user’s interests well
  (iv) Users a tag was assigned by describe the tag’s semantics well
  (v) Resources of a user describe the user’s interests well
  (vi) Users of a resource describe the resource’s content well


Challenges of FolkRank

Assumption about folksonomy structure violated

Concept drift
  Ambiguity
  Multi-facetedness of entities

Including quality attributes of a resource
  Authority signals (e.g. PageRank on the Web)
  Hub signals
  “hub pages are … compilations that someone with an interest in the topic has spent time putting together” [MRS08]

[Figure: the resource IJCAI-Proceedings.pdf with the tags AI (topic), Artificial Intelligence (topic) and Barcelona (location); the tags football, soccer and news; edges with weight 1 and open edges marked “?”; resource qualities “well-maintained” and “ad-free”; authority signals and hub signals.]

Source: www.icon-finder.com


Structure

Can Semantic Information help?

Tags can be ambiguous
  [Figure: the tag Barcelona assigned to IJCAI-Proceedings.pdf and to barcelona-city.jpg, once as a topic and once as a location]

Semantic relatedness, i.e. XESA [SBD+10]
  [Figure: the terms FC Barcelona, football, Dallas Cowboys, soccer and American football with relatedness values 0.5, 0.1 and 0.005]

Tag types [BSB+09]
Different incentives to tag [MNBD06]

Source: travel.sympatico.ca

Can Context-specific information help?

Context: “The interrelated conditions in which something exists or occurs: environment, setting”

Tag assignment context
  e.g. football
  e.g. soccer
  e.g. playing soccer in the street

[Figure: tags in context: “Barcelona, football”; “Cowboys, football”; “Cowboys, carnival costume”]

May allow disambiguation at a fine semantic level

Source: www.free-clipart-pictures.net, www.icon-finder.com


Proposed Approaches

IncentiveScore
  Concept drift: tag types used to adapt the folksonomy graph
  Tag types: topic, location, person/organization, event, activity, resource type, other

InteliScore
  Concept drift: semantic relatedness measures used to adapt the folksonomy graph
  XESA measures based on Wikipedia used

HITSonomy
  Inclusion of quality attributes of resources

VSScore
  Extensive description of resources/query-entity


HITSonomy

FolkRank ‘thinks’ unidirectional
  Estimates relevance & authority

HITSonomy ‘thinks’ bidirectional
  Inspired by HITS [Kle99]
  Additionally: “How probable is it that I came from B, being at A?”
  Estimates relevance & hub
  Describes context

[Figure: nodes A and B with edge weights (2, 1, 2) and the transition probabilities derived from them in both directions (1/3, 2/3, 2/4, 2/4)]
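Since HITSonomy is described as inspired by HITS [Kle99], the following sketch shows plain HITS on a small directed graph, followed by a weighted arithmetic mean of authority and hub score as named among the evaluation parameters. It is not the HITSonomy algorithm itself; the graph, the node names and the weight `alpha` are hypothetical.

```python
import math


def _normalize(values):
    """Scale a score dictionary to unit Euclidean length."""
    norm = math.sqrt(sum(v * v for v in values.values()))
    return {k: (v / norm if norm else 0.0) for k, v in values.items()}


def hits(edges, iterations=50):
    """Plain HITS: a node's authority is the sum of the hub scores of the nodes
    pointing to it; a node's hub score is the sum of the authorities it points to."""
    nodes = {n for edge in edges for n in edge}
    hub = {n: 1.0 for n in nodes}
    auth = {n: 1.0 for n in nodes}
    for _ in range(iterations):
        auth = {n: 0.0 for n in nodes}
        for src, dst in edges:
            auth[dst] += hub[src]
        auth = _normalize(auth)
        hub = {n: 0.0 for n in nodes}
        for src, dst in edges:
            hub[src] += auth[dst]
        hub = _normalize(hub)
    return auth, hub


def combined_score(auth, hub, alpha=0.5):
    """Weighted arithmetic mean of authority and hub score (alpha is hypothetical)."""
    return {n: alpha * auth[n] + (1.0 - alpha) * hub[n] for n in auth}


# Hypothetical directed graph: two linking pages and two linked pages.
edges = [("list-page", "a"), ("list-page", "b"), ("blog", "a")]
auth, hub = hits(edges)
scores = combined_score(auth, hub, alpha=0.5)
```

In this toy graph the page `a` receives links from both linking pages and therefore gets the higher authority, while `list-page` links to both targets and gets the higher hub score.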


VSScore

Idea
  Port ranking task to vector space model [MRS08] used in text retrieval
  A term (usually) represents a semantic concept

Problem
  No content information of resources (in this work)

Solution
  Entities in folksonomy can be viewed as semantic concepts
  Represent resources’ content by their context
  Represent any entity by its context (e.g. a query-entity)

[Figure: text retrieval example with two documents (“Barcelona… Messi... Barcelona… Barcelona… FCB…” and “Dallas… Cowboys… Football… Cowboys… Dallas…”) represented as term vectors over Barcelona and Cowboys (2…0 and 0…3) and compared via δ; folksonomy counterpart with vectors over the entities barca, barcaFan, dallascowboys.com and Cowboys (1…0 and 0.8…0.3…0.2)]


Structure


Evaluation Setup

BibSonomy corpus

Methodologies
  LeavePostOut [JMH+07]
    Post: all tag assignments between a user and a resource
  LeaveRTOut
    RT: all tag assignments between a tag and a resource

Assumption: “Tag assignment indicates relevance of resource towards information need represented by user or tag”
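The two hold-out schemes can be sketched directly from the definitions of post and RT above. This is a minimal illustration under the assumption that the corpus is a list of (user, tag, resource) triples; the function names and the toy data are hypothetical, and the actual evaluation pipeline on the BibSonomy corpus is not shown in the slides.

```python
def leave_post_out(assignments, user, resource):
    """Hold out one post: all tag assignments between `user` and `resource`."""
    held_out = [a for a in assignments if a[0] == user and a[2] == resource]
    training = [a for a in assignments if not (a[0] == user and a[2] == resource)]
    return training, held_out


def leave_rt_out(assignments, tag, resource):
    """Hold out one RT: all tag assignments between `tag` and `resource`."""
    held_out = [a for a in assignments if a[1] == tag and a[2] == resource]
    training = [a for a in assignments if not (a[1] == tag and a[2] == resource)]
    return training, held_out


# Hypothetical corpus of (user, tag, resource) triples.
corpus = [
    ("u1", "football", "r1"),
    ("u1", "barca", "r1"),
    ("u2", "football", "r1"),
    ("u2", "football", "r2"),
]

# LeavePostOut: rank resources for user u1 on the training part and
# check at which position the held-out resource r1 is found.
post_training, post_held_out = leave_post_out(corpus, "u1", "r1")

# LeaveRTOut: the same idea with the tag "football" as query-entity.
rt_training, rt_held_out = leave_rt_out(corpus, "football", "r1")
```

Under the stated assumption that a tag assignment indicates relevance, the held-out resource serves as the relevant item that a ranking computed on the training part should place near the top.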


Evaluation Parameters

FolkRank
  LeavePostOut: given user as query-entity, find me resources
  Restart probability


Evaluation Parameters

HITSonomy
  LeavePostOut: given user as query-entity, find me resources
  Restart probability
  Weighted arithmetic mean of authority and hub score


Evaluation Results

LeavePostOut: 1 out
Given user as query-entity, find me resources

HITSonomy and VSScore significantly more effective than FolkRank
  Wilcoxon signed rank test on Average Precision


Evaluation Results

LeavePostOut: 33% out
Given user as query-entity, find me resources



Conclusion

HITSonomy and VSScore can beat the state of the art
  In different resource ranking tasks
  Depending on LeavePostOut/LeaveRTOut, thus on the conditions of the query-entity

The other proposed algorithms do not perform as well

Most pairwise statistical significance comparisons won:

Methodology      Interests match                                     Guided search
LeavePostOut     HITSonomy                                           HITSonomy
LeaveNPostsOut   HITSonomy                                           HITSonomy
LeaveRTOut       FolkRank, HITSonomy, IncentiveScore, InteliScore    VSScore
LeaveNRTsOut     FolkRank, HITSonomy, IncentiveScore                 HITSonomy, VSScore


Limitations

Ranking
  Novel resources
  Explanation
  Efficient computation


Thank you!


Why not Simply Use a Search Engine?

Search engines on the web are in general not specialized for search in folksonomies
  They don’t usually create rankings that take available semantic and context-specific information into account

It is not necessarily desired to have the search engine crawl the information contained in the folksonomy application
  The search engine may not have access to information like friendships, activities etc.

How to give a search engine a user or a resource to obtain a resource ranking?

One may want to determine oneself how a ranking is produced, without having to rely on however a search engine may produce a ranking

Depending on the scenario, a search engine may, however, be an alternative

Ranking tasks by entity given as query-entity, “Find me a resource” (adapted from [Bog09]):
  user: Resource recommendation
  resource: Related stuff
  tag: Guided search


Metrics

Mean Average Precision
  Q is the set of information needs to evaluate

Average Precision
  q is a single information need

Mean Normalized Precision at k
  Q is the set of information needs to evaluate

Normalized Precision at k
  q is a single information need
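Average Precision and Mean Average Precision can be sketched with their usual textbook definitions, as given for example in [MRS08]; the formulas on the original slide are not part of the extracted text, so the exact variant used in the evaluation is an assumption. For a single information need q, Average Precision averages the precision at every rank where a relevant resource is found; Mean Average Precision averages that value over all information needs in Q. The rankings and relevance sets below are hypothetical.

```python
def average_precision(ranking, relevant):
    """Average Precision for one information need q.

    ranking:  list of resources in descending order of estimated relevance
    relevant: collection of resources that are actually relevant for q
    """
    relevant = set(relevant)
    if not relevant:
        return 0.0
    found = 0
    precision_sum = 0.0
    for rank, resource in enumerate(ranking, start=1):
        if resource in relevant:
            found += 1
            precision_sum += found / rank  # precision at this rank
    return precision_sum / len(relevant)


def mean_average_precision(needs):
    """Mean Average Precision over a list Q of (ranking, relevant) pairs."""
    return sum(average_precision(r, rel) for r, rel in needs) / len(needs)


# Hypothetical rankings and relevance judgements for two information needs.
needs = [
    (["r1", "r2", "r3", "r4"], {"r1", "r3"}),  # AP = (1/1 + 2/3) / 2
    (["r2", "r1"], {"r1"}),                    # AP = (1/2) / 1
]
map_value = mean_average_precision(needs)
```

In a LeavePostOut run with a single held-out resource, Average Precision reduces to the reciprocal of the position at which that resource is found.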


Why Ranking in Folksonomies?

Searching for relevant knowledge artifacts (resources) constitutes overhead, e.g. in resource-based learning, e.g. in CROKODIL

adapted from [BSB+09]


Related Work


Ranking Vector as Context

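Query-entity context-vector visualized: 45%, 29%, 16%, 10%

Resource-entities context-vectors visualized: 28%, 47%, 13%, 12% and 9%, 20%, 25%, 46%

Vectors compared via δ:
  query-entity: (0.45, 0.29, 0.16, 0.10)
  resource 1:   (0.28, 0.47, 0.12, 0.13)
  resource 2:   (0.09, 0.20, 0.25, 0.46)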
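Comparing a query-entity's context vector with the context vectors of resources can be sketched as follows. The vectors are the ones shown on this slide; how δ is defined is not part of the extracted text, so using cosine similarity, the common choice in the vector space model, is an assumption, as are the labels "resource 1" and "resource 2".

```python
import math


def cosine(u, v):
    """Cosine similarity between two equally long vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


# Context vectors as shown on the slide.
query = (0.45, 0.29, 0.16, 0.10)
resources = {
    "resource 1": (0.28, 0.47, 0.12, 0.13),
    "resource 2": (0.09, 0.20, 0.25, 0.46),
}

# Rank resources in descending order of similarity to the query-entity.
ranking = sorted(
    resources,
    key=lambda name: cosine(query, resources[name]),
    reverse=True,
)
```

Under this assumption, resource 1 is ranked first, because its weight is concentrated on the same two components as the query-entity's vector.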





Evaluation Results

LeaveRTOut: 1 out
Given tag as query-entity, find me resources

HITSonomy and VSScore significantly more effective than FolkRank
  Wilcoxon signed rank test on Average Precision

[Chart: Mean Average Precision]


Evaluation Results

LeaveRTOut: 33% out
Given tag as query-entity, find me resources

HITSonomy and VSScore significantly more effective than FolkRank
  Wilcoxon signed rank test on Average Precision

[Chart: Mean Average Precision]


Preliminary Experiments (interim presentation)

Legend:
  FR: FolkRank
  VS FR: Vector space with FolkRank context-vectors
  VS HITS: Vector space with HITS context-vectors

[Chart: precision over position found]


Bibliography

[Bog09] T. Bogers. Recommender Systems for Social Bookmarking. PhD Thesis, Tilburg University, 2009.

[BSB+08] D. Böhnstedt, P. Scholl, B. Benz, C. Rensing, R. Steinmetz, and B. Schmitz. Einsatz persönlicher Wissensnetze im Ressourcen-basierten Lernen. In Proceedings of the 6th e-Learning Fachtagung Informatik, pages 113–124, 2008.

[HJSS06] A. Hotho, R. Jäschke, C. Schmitz, and G. Stumme. Information Retrieval in Folksonomies: Search and Ranking. In Proceedings of the 3rd European Semantic Web Conference on the Semantic Web: Research and Applications, pages 411–426, 2006.

[JMH+07] R. Jäschke, L. Marinho, A. Hotho, L. Schmidt-Thieme, and G. Stumme. Tag Recommendations in Folksonomies. 2007.

[MRS08] C. Manning, P. Raghavan, and H. Schütze. Introduction to Information Retrieval. Cambridge University Press, 2008.

[Kle99] J. Kleinberg. Authoritative Sources in a Hyperlinked Environment. Journal of the ACM, 46:604–632, 1999.

[PBMW99] L. Page, S. Brin, R. Motwani, and T. Winograd. The PageRank Citation Ranking: Bringing Order to the Web. Technical Report 1999-66, Stanford InfoLab, 1999.
