KOM – Multimedia Communications Lab
Prof. Dr.-Ing. Ralf Steinmetz (Director)
Dept. of Electrical Engineering and Information Technology
Dept. of Computer Science (adjunct Professor)
TUD – Technische Universität Darmstadt, Rundeturmstr. 10, D-64283 Darmstadt, Germany
Tel. +49 6151 166150, Fax +49 6151 166152, www.KOM.tu-darmstadt.de
© author(s) of these slides 2012, including research results of the research network KOM and TU Darmstadt, unless otherwise specified on the respective slide
httc – Hessian Telemedia Technology Competence-Center e.V. – www.httc.de
Thomas Rodenhausen
12 January 2012
Source: www.google.com
Ranking Resources in Folksonomies by Exploiting Semantic and Context-specific Information
KOM – Multimedia Communications Lab
Source: www.icon-finder.com, www.flickr.com, www.delicious.com, www.crokodil.de
Folksonomies
A tag assignment: e.g. user Bob assigns the tag “sugar loaf” to a resource
Task of Ranking of Resources: “Rank resources, such that they are in descending order of relevance towards an information need.”
Figure (adapted from [Bog09]): the query-entity given may be a user (interests match), a resource (more like this), or a tag (guided search) – in each case: “find me a resource”.
Ranking Resources in Folksonomies
Ranking Resources in Folksonomies
Relevance: resources are said to be relevant for a user’s information need once they contain valuable information with respect to that need [MRS08]
Information need: a user’s desire to have information about a certain topic, e.g. in order to solve a certain task [MRS08]
Task of Ranking of Resources: “Rank resources, such that they are in descending order of relevance towards an information need.”
Example: Bibsonomy
http://www.bibsonomy.org/
Assumptions / Constraints
Content of resources is unavailable
Multimedia content, e.g. photos, audio, video
Other formats not readily readable
Complexity of dealing with content undesirable
Graph-based ranking: the only option considered
How to Actually Rank in Folksonomies?
FolkRank [HJSS06]: state-of-the-art, graph-based
Based on PageRank’s random surfer [PBMW99]
“How probable is it that I go to B, being at A?”
Figure: random surfer with restart on a folksonomy neighborhood (fcbarcelona.com, messi, barca, barcaFan); transition probabilities (e.g. 1/5, 1/4, 3/5) describe context; the resulting scores (45%, 29%, 16%, 10%) estimate relevance and authority.
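The random-surfer view on this slide can be sketched as a power iteration with a restart to the query-entity. This is a minimal illustration, not the thesis’ implementation: the toy graph, its weights, and the restart value 0.15 are assumptions.

```python
# Minimal sketch of a PageRank-style random surfer with restart, as in
# FolkRank's view of the folksonomy graph. Graph and numbers are toy data.
import numpy as np

def random_surfer(adjacency, restart, preference, iters=100):
    """With prob. (1 - restart) follow a weighted edge, with prob. restart
    jump back to the preference (query-entity) vector."""
    col_sums = adjacency.sum(axis=0)
    # Column-normalize so each column holds outgoing transition probabilities.
    transition = adjacency / np.where(col_sums == 0, 1, col_sums)
    rank = np.full(len(preference), 1.0 / len(preference))
    for _ in range(iters):
        rank = (1 - restart) * transition @ rank + restart * preference
    return rank

# Toy undirected neighborhood: fcbarcelona.com, messi, barca, barcaFan.
nodes = ["fcbarcelona.com", "messi", "barca", "barcaFan"]
A = np.array([[0, 1, 1, 1],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [1, 0, 1, 0]], dtype=float)
pref = np.array([0.0, 0.0, 1.0, 0.0])  # query-entity: the tag "barca"
scores = random_surfer(A, restart=0.15, preference=pref)
print(dict(zip(nodes, scores.round(3))))
```

The restart mass pulls the ranking toward the query-entity’s neighborhood, which is how the query context enters the otherwise global walk.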
Authority & Context & Assumptions
Context (acc. to Merriam-Webster): “the interrelated conditions in which something exists or occurs: environment, setting”
Authority: a hyperlink from web page a to web page b denotes the assignment of authority or trust from a to b [PBMW99]
“How probable is it that I go to B, being at A?”
Authority & Context & Assumptions
Assumptions of FolkRank:
(i) Tags assigned to a resource describe the resource’s content well
(ii) Resources a tag is assigned to describe the tag’s semantics well
(iii) Tags assigned by a user describe the user’s interests well
(iv) Users a tag was assigned by describe the tag’s semantics well
(v) Resources of a user describe the user’s interests well
(vi) Users of a resource describe the resource’s content well
Assumptions about the folksonomy structure are violated
Source: www.icon-finder.com
Challenges of FolkRank
Concept drift
Ambiguity
Multi-facetedness of entities
Including quality attributes of a resource: authority signals (e.g. PageRank on the Web), hub signals
“hub pages are … compilations that someone with an interest in the topic has spent time putting together” [MRS08]
Figure: authority and hub signals on an ambiguous folksonomy neighborhood – the tag Barcelona (location) vs. the topic Artificial Intelligence (AI) on IJCAI-Proceedings.pdf, and the tags football vs. soccer on a well-maintained, ad-free news resource.
Structure
Source: travel.sympatico.ca
Can Semantic Information help?
Semantic relatedness, e.g. XESA [SBD+10]
Example: relatedness among the tags soccer, football, and American football (FC Barcelona vs. Dallas Cowboys), with pairwise scores such as 0.5, 0.1, and 0.005
Tags can be ambiguous: Barcelona may denote a topic or a location (e.g. on IJCAI-Proceedings.pdf vs. barcelona-city.jpg)
Tag types [BSB+09]
Different incentives to tag [MNBD06]
Source: www.free-clipart-pictures.net, www.icon-finder.com
Can Context-specific Information help?
Tag assignment context: co-occurring tags such as (Barcelona, football), (Cowboys, football), or (Cowboys, carnival costume) point to different senses, e.g. football in general, soccer, or even playing soccer in the street
May allow to disambiguate at a fine semantic level
Context: “the interrelated conditions in which something exists or occurs: environment, setting”
Proposed Approaches
IncentiveScore – concept drift: tag types used to adapt the folksonomy graph
Tag types: topic, location, person/organization, event, activity, resource type, other
InteliScore – concept drift: semantic relatedness measures used to adapt the folksonomy graph; XESA measures based on Wikipedia
HITSonomy – inclusion of quality attributes of resources
VSScore – extensive description of resources/query-entity
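The adaptation idea shared by IncentiveScore and InteliScore – reweighting folksonomy edges using extra signals – can be sketched as follows. The slide only names the ingredients, so the relatedness table, the combination rule, and all identifiers below are illustrative assumptions, not the thesis’ actual algorithm.

```python
# Hypothetical sketch: scale a tag->resource edge weight by the tag's mean
# semantic relatedness to the other tags on the same resource.
relatedness = {  # toy symmetric relatedness scores (XESA-like in spirit)
    ("football", "soccer"): 0.5,
    ("football", "american football"): 0.1,
    ("soccer", "american football"): 0.005,
}

def rel(a, b):
    """Look up relatedness in either order; identical tags relate fully."""
    if a == b:
        return 1.0
    return relatedness.get((a, b)) or relatedness.get((b, a)) or 0.0

def adapt_edge_weight(base_weight, tag, context_tags):
    """Reweight an edge by the tag's mean relatedness to its co-assigned tags."""
    if not context_tags:
        return base_weight
    mean_rel = sum(rel(tag, c) for c in context_tags) / len(context_tags)
    return base_weight * mean_rel

w = adapt_edge_weight(1.0, "football", ["soccer"])
print(w)  # 0.5 under the toy table
```

The adapted weights would then feed into a graph-based ranker such as FolkRank in place of the raw co-occurrence counts.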
HITSonomy
FolkRank ‘thinks’ unidirectionally: “How probable is it that I go to B, being at A?” – estimates relevance & authority
HITSonomy ‘thinks’ bidirectionally, inspired by HITS [Kle99]: additionally, “How probable is it that I came from B, being at A?” – estimates relevance & hub
Figure: the same edge weights between A and B yield forward transition probabilities (e.g. 1/3, 2/3) and backward transition probabilities (e.g. 2/4, 2/4); both describe context.
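The bidirectional view above can be sketched from a single count matrix: row-normalizing gives forward transitions, column-normalizing gives backward transitions. The counts are illustrative, not data from the thesis.

```python
# Sketch: forward ("go to B, being at A") vs. backward ("came from B, being
# at A") transition probabilities derived from the same weights.
import numpy as np

counts = np.array([[0.0, 1.0, 2.0],   # toy edge weights from A to {A, B, C}
                   [2.0, 0.0, 2.0],
                   [1.0, 1.0, 0.0]])

forward = counts / counts.sum(axis=1, keepdims=True)   # rows sum to 1
backward = counts / counts.sum(axis=0, keepdims=True)  # columns sum to 1

print(forward[0])      # P(go to each node | at A) = [0, 1/3, 2/3]
print(backward[:, 0])  # P(came from each node | at A)
```

A HITSonomy-style score would combine walks over both matrices, mirroring the authority/hub duality of HITS.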
VSScore
Idea: port the ranking task to the vector space model [MRS08] used in text retrieval; a term (usually) represents a semantic concept
Problem: no content information of resources (in this work)
Solution: entities in the folksonomy can be viewed as semantic concepts; represent a resource’s content by its context; represent any entity by its context (e.g. a query-entity)
Figure: context vectors over tags such as Barcelona and Cowboys for entities like barca, barcaFan, and dallascowboys.com, compared by a similarity δ.
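The VSScore idea above – representing each entity by a context vector over tags and ranking by vector-space similarity – can be sketched as follows. The tag assignments and the choice of cosine as δ are illustrative assumptions.

```python
# Sketch: build context vectors from tag assignments and rank resources by
# cosine similarity to a query-entity's context. Data is hypothetical.
from collections import Counter
from math import sqrt

assignments = {  # resource -> tags assigned to it
    "fcbarcelona.com": ["Barcelona", "Messi", "Barcelona", "FCB"],
    "dallascowboys.com": ["Dallas", "Cowboys", "Football", "Cowboys"],
}
query_context = ["Barcelona", "Barcelona", "FCB"]  # context of the query-entity

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = sqrt(sum(v * v for v in a.values()))
    nb = sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

q = Counter(query_context)
ranking = sorted(assignments,
                 key=lambda r: cosine(q, Counter(assignments[r])),
                 reverse=True)
print(ranking)  # fcbarcelona.com ranks first
```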
Structure
Evaluation Setup
BibSonomy corpus
Methodologies: LeavePostOut [JMH+07], LeaveRTOut
Assumption: “a tag assignment indicates relevance of the resource towards the information need represented by the user or tag”
Post: all tag assignments between a user and a resource
RT: all tag assignments between a tag and a resource
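The LeavePostOut split described above can be sketched concretely: withhold all tag assignments between one user and one resource, rank with the rest, then check where the withheld resource appears. The triples below and the function name are stand-ins for illustration.

```python
# Sketch of a LeavePostOut split over (user, tag, resource) triples.
assignments = [  # hypothetical folksonomy data
    ("bob", "barca", "fcbarcelona.com"),
    ("bob", "messi", "fcbarcelona.com"),
    ("bob", "news", "bbc.com"),
    ("eve", "barca", "fcbarcelona.com"),
]

def leave_post_out(assignments, user, resource):
    """Withhold the 'post': every assignment between this user and resource."""
    held = [t for t in assignments if t[0] == user and t[2] == resource]
    train = [t for t in assignments if not (t[0] == user and t[2] == resource)]
    return train, held

train, held = leave_post_out(assignments, "bob", "fcbarcelona.com")
print(len(train), len(held))  # 2 2
```

A LeaveRTOut split works the same way with the roles of user and tag exchanged.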
Evaluation Parameters
FolkRank: LeavePostOut, given a user as query-entity (“find me resources”); restart probability varied
Evaluation Parameters
HITSonomy: LeavePostOut, given a user as query-entity (“find me resources”); restart probability varied; weighted arithmetic mean of authority and hub score
Evaluation Results
LeavePostOut: 1 out; given a user as query-entity (“find me resources”)
HITSonomy and VSScore significantly more effective than FolkRank (Wilcoxon signed-rank test on Average Precision)
Evaluation Results
LeavePostOut: 33% out; given a user as query-entity (“find me resources”)
HITSonomy and VSScore significantly more effective than FolkRank (Wilcoxon signed-rank test on Average Precision)
Conclusion
HITSonomy and VSScore can beat the state-of-the-art
In different resource ranking tasks
Depending on LeavePostOut/LeaveRTOut, i.e. on the conditions of the query-entity
Other proposed algorithms do not perform as well

Most pairwise statistical significance comparisons won:

Methodology | Interests match | Guided search
LeavePostOut | HITSonomy | HITSonomy
LeaveNPostsOut | HITSonomy | HITSonomy
LeaveRTOut | FolkRank, HITSonomy, IncentiveScore, InteliScore | VSScore
LeaveNRTsOut | FolkRank, HITSonomy, IncentiveScore | HITSonomy, VSScore
Limitations
Ranking novel resources
Explanation
Efficient computation
Thank you!
Why not Simply Use a Search Engine?
Search engines on the Web are in general not specialized for search in folksonomies
They don’t usually create rankings that take the available semantic and context-specific information into account
It is not necessarily desirable to have a search engine crawl the information contained in the folksonomy application
A search engine may not have access to information like friendships, activities, etc.
How to give a search engine a user or a resource to obtain a resource ranking?
One may want to determine oneself how a ranking is produced, without having to rely on how a search engine produces it
Depending on the scenario this may, however, be an alternative
Figure (adapted from [Bog09]): the query-entity given may be a user (resource recommendation), a resource (related stuff), or a tag (guided search) – in each case: “find me a resource”.
Metrics
Mean Average Precision: Q is the set of information needs to evaluate
Average Precision: q is a single information need
Mean Normalized Precision at k: Q is the set of information needs to evaluate
Normalized Precision at k: q is a single information need
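The first two metrics above follow standard IR definitions [MRS08] and can be sketched in code; the ranked lists below are illustrative.

```python
# Average Precision for one information need, and its mean over a set Q.
def average_precision(ranked, relevant):
    """AP: mean of precision@k over positions k where a relevant item is found,
    normalized by the number of relevant items."""
    hits, precisions = 0, []
    for k, res in enumerate(ranked, start=1):
        if res in relevant:
            hits += 1
            precisions.append(hits / k)
    return sum(precisions) / len(relevant) if relevant else 0.0

def mean_average_precision(runs):
    """MAP over the set Q of information needs: mean of the per-need APs."""
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs)

ap = average_precision(["a", "b", "c", "d"], {"a", "c"})
print(ap)  # (1/1 + 2/3) / 2 = 0.8333...
```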
Why Ranking in Folksonomies?
Search for relevant knowledge artifacts (resources) constitutes overhead, e.g. in resource-based learning as in CROKODIL
adapted from [BSB+09]
Related Work
Ranking Vector as Context
Query-entity context vector visualized: 45%, 29%, 16%, 10%
Resource-entities’ context vectors visualized: 28%, 47%, 13%, 12% and 9%, 20%, 25%, 46%
Similarity δ computed between (0.45, 0.29, 0.16, 0.10) and each of (0.28, 0.47, 0.12, 0.13) and (0.09, 0.20, 0.25, 0.46)
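The slide does not fix the similarity δ; assuming a cosine similarity over the context vectors shown, the comparison can be sketched as follows.

```python
# Cosine similarity between the query-entity's context vector and the two
# resource context vectors from this slide (δ assumed to be cosine).
from math import sqrt

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b)))

query = [0.45, 0.29, 0.16, 0.10]
resource_1 = [0.28, 0.47, 0.12, 0.13]
resource_2 = [0.09, 0.20, 0.25, 0.46]

sim_1 = cosine(query, resource_1)
sim_2 = cosine(query, resource_2)
print(round(sim_1, 3), round(sim_2, 3))  # resource_1 is more similar
```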
Evaluation Results
LeaveRTOut: 1 out; given a tag as query-entity (“find me resources”)
HITSonomy and VSScore significantly more effective than FolkRank (Wilcoxon signed-rank test on Average Precision)
Mean Average Precision
Evaluation Results
LeaveRTOut: 33% out; given a tag as query-entity (“find me resources”)
HITSonomy and VSScore significantly more effective than FolkRank (Wilcoxon signed-rank test on Average Precision)
Mean Average Precision
FR: FolkRank
VS FR: vector space with FolkRank context vectors
VS HITS: vector space with HITS context vectors
Figure axes: precision vs. position found
Preliminary Experiments (intermediate presentation)
Bibliography
[Bog09] T. Bogers. Recommender Systems for Social Bookmarking. PhD Thesis, Tilburg University,2009.
[BSB+08] D. Böhnstedt, P. Scholl, B. Benz, C. Rensing, R. Steinmetz, and B. Schmitz. Einsatz persönlicher Wissensnetze im Ressourcen-basierten Lernen. In Proceedings of the 6th e-Learning Fachtagung Informatik, pages 113–124, 2008.
[HJSS06] A. Hotho, R. Jäschke, C. Schmitz, and G. Stumme. Information Retrieval in Folksonomies: Search and Ranking. In Proceedings of the 3rd European Semantic Web Conference on the Semantic Web: Research and Applications, pages 411–426, 2006.
[JMH+07] R. Jäschke, L. Marinho, A. Hotho, L. Schmidt-Thieme, and G. Stumme. Tag Recommendations in Folksonomies. 2007.
[Kle99] J. Kleinberg. Authoritative Sources in a Hyperlinked Environment. Journal of the ACM, 46:604–632, 1999.
[MRS08] C. Manning, P. Raghavan, and H. Schütze. Introduction to Information Retrieval. Cambridge University Press, 2008.
[PBMW99] L. Page, S. Brin, R. Motwani, and T. Winograd. The PageRank Citation Ranking: Bringing Order to the Web. Technical Report 1999-66, Stanford InfoLab, 1999.