Slides presented at ISWC 2011 in Bonn, Germany. Corresponding paper: http://iswc2011.semanticweb.org/fileadmin/iswc/Papers/Research_Paper/12/70310288.pdf
Delft University of Technology
Generating Resource Profiles by Exploiting the Context of Social Annotations
ISWC, Bonn, Germany, Oct 27th 2011
Ricardo Kawase (1), George Papadakis (2), Fabian Abel (3)
(1) L3S Research Center, Leibniz University Hannover, Germany
(2) ICCS, National Technical University of Athens, Greece
(3) Web Information Systems, TU Delft
Social Annotations Folksonomies
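• Folksonomy: set of tag assignments
• Formal model [Hotho et al. '07]: F = (U, T, R, Y)
- U: users
- T: tags (armstrong, baker, dizzy, cool, jazzmusic, jazz, trumpet)
- R: resources
- Y: tag assignments (user, tag, resource)
[Figure: users attach tags such as "baker, cool", "armstrong, dizzy, jazz", "armstrong, jazzmusic" and "trumpet" to resources; each user-tag-resource link is a tag assignment]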
Generating resource profiles
• Resource profile = representation of a resource = set of weighted concepts
• Straightforward approach = occurrence frequency of tags:
SELECT tag, count(distinct user) FROM tas WHERE resource = XY
GROUP BY tag
[Figure: a resource tagged "baker, cool" and "cool" yields the profile below, which feeds applications that operate on resource profiles (e.g. search, content-based recommender)]
concept  weight
baker    1
cool     2
Applications rely on good resource profiles!
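The frequency-based approach can also be sketched outside SQL. The following Python snippet is a minimal illustration that mirrors the query above; the function name and the sample tag assignments are hypothetical and only chosen to reproduce the "baker 1, cool 2" example.

```python
from collections import defaultdict

def tag_frequency_profile(tas, resource):
    """Count, per tag, the distinct users who assigned it to `resource`."""
    users_per_tag = defaultdict(set)
    for user, tag, res in tas:
        if res == resource:
            users_per_tag[tag].add(user)
    return {tag: len(users) for tag, users in users_per_tag.items()}

# Hypothetical tag assignments (user, tag, resource); not data from the talk.
tas = [
    ("u1", "baker", "XY"),
    ("u1", "cool", "XY"),
    ("u2", "cool", "XY"),
    ("u2", "jazz", "other"),
]
profile = tag_frequency_profile(tas, "XY")
print(profile)  # {'baker': 1, 'cool': 2}
```

Using a set per tag plays the role of count(distinct user): a user who repeats the same tag on the same resource is counted once.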
Problems of traditional folksonomies
[Figure: the example folksonomy (tags: armstrong, baker, dizzy, cool, jazzmusic, jazz, trumpet) annotated with typical problems]
• resources with no tags
• ambiguity of tags
• synonyms
• descriptive vs. subjective tags
Generating valuable resource profiles becomes difficult
Exploiting Context in Folksonomies
• Context-enabled folksonomy: Fc = (U, T, R, Y, C, Z)
- C is the actual metadata information
- Z ⊆ Y x C is the set of context assignments
[Figure: context can be attached to every element of a tag assignment (user, tag, resource)]
• Context of a user, e.g. User X: age: 30 years, education: …
• Context of a tag, e.g. jazz: "Jazz (noun) is a style of music that…"
• Context of a resource, e.g. Resource Y: created: 1979-12-06, creator: …
• Context of a tag assignment, e.g. TAS XY: created: 2011-04-19, meaning: dbpedia:Jazz
Context in Social Tagging Systems
• TagMe! – Tagging and exploration front-end for Flickr pictures that allows users to attach three types of context to tag assignments:
- Spatial information (assign tags to certain areas)
- Categories (= tagging of tag assignments, e.g. buildings)
- DBpedia URIs (meaning of tag assignments, e.g. dbpedia:Opera)
• BibSonomy – Social resource sharing system for bookmarks and publications
- Context of resources = BibTeX information of publications
How can we exploit the context of social annotations to generate resource profiles?
[Figure: a resource tagged "baker, cool" and "cool" with attached context; the profile (concept, weight) is still to be determined]
Standard Resource Profiling
• Exploiting the tags that have been directly assigned to the resource
[Figure: tag assignments performed on resource R1: u1 assigns tag X; u2 assigns tag X and tag Y]
Tag-based resource profile of R1:
concept  weight
tag X    0.67
tag Y    0.33
Context Profiling
• Aggregation of (context) profiles is possible (again by means of a mixture approach)
[Figure: tag assignments in the context folksonomy that refer to context C1: u1 assigns tag A to R1, u2 assigns tag A to R2, u3 assigns tag A and tag B to R3]
Tag-based context profile of C1:
concept  weight
tag A    0.75
tag B    0.25
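A context profile of this kind can be sketched as a normalized tag count over all tag assignments linked to the context. The snippet below is only an illustration under that assumption; the function name and the representation of Z as (tag assignment, context) pairs are hypothetical.

```python
from collections import Counter

def context_profile(context_assignments, context):
    """Normalized tag weights over all tag assignments linked to `context`."""
    linked = [ta for ta, ctx in context_assignments if ctx == context]
    counts = Counter(tag for _user, tag, _res in linked)
    total = sum(counts.values())
    return {tag: n / total for tag, n in counts.items()} if total else {}

# Tag assignments from the slide's example, all referring to context C1.
Y = [
    ("u1", "tag A", "R1"),
    ("u2", "tag A", "R2"),
    ("u3", "tag A", "R3"),
    ("u3", "tag B", "R3"),
]
Z = [(ta, "C1") for ta in Y]  # context assignments, Z ⊆ Y x C
profile = context_profile(Z, "C1")
print(profile)  # {'tag A': 0.75, 'tag B': 0.25}
```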
Generating context-based resource profiles
• Generic strategy for generating context-based resource profiles:
Context-based resource profile = α · resource profile + (1 − α) · context profile

concept  context-based resource profile  resource profile  context profile
tx       0.7                             0.85              0.55
ty       0.3                             0.15              0.45
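The mixture can be written as a small function. This is a minimal sketch, assuming profiles are plain concept-to-weight mappings; the function name is hypothetical. The example weights on this slide are consistent with α = 0.5, which is the value used here.

```python
def mix_profiles(resource_profile, ctx_profile, alpha):
    """Linear mixture: alpha * resource profile + (1 - alpha) * context profile."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    concepts = set(resource_profile) | set(ctx_profile)
    return {
        c: alpha * resource_profile.get(c, 0.0) + (1 - alpha) * ctx_profile.get(c, 0.0)
        for c in sorted(concepts)
    }

resource = {"tx": 0.85, "ty": 0.15}
context = {"tx": 0.55, "ty": 0.45}
mixed = mix_profiles(resource, context, alpha=0.5)
print({c: round(w, 2) for c, w in mixed.items()})  # {'tx': 0.7, 'ty': 0.3}
```

With α = 1 the result falls back to the plain tag-based resource profile, and with α = 0 only the context contributes.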
Weighting Strategies
Context-based Weighting Strategies (1)
1. User-based co-occurrence:
• Hypothesis: users tend to annotate similar resources → tags a user assigns to other resources are also relevant for the resource profile that should be constructed
[Figure: u1 assigns "jazz" to R1 and "trumpet" to R2]
Context-based resource profile of R1:
concept  weight
jazz     0.67
trumpet  0.33
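One way to read the user-based strategy is sketched below. The weighting is an assumption, not taken from the slides: the tags assigned directly to the resource are counted, then every tag used anywhere by the annotating users is added, and the counts are normalized. This choice happens to reproduce the illustrative weights above; the function name is hypothetical.

```python
from collections import Counter

def user_based_profile(tas, resource):
    """Tags on `resource` plus all tags used by the users who annotated it."""
    direct = [(u, t) for u, t, r in tas if r == resource]
    annotators = {u for u, _t in direct}
    counts = Counter(t for _u, t in direct)
    # User context (assumed weighting): every tag an annotator assigned anywhere.
    counts.update(t for u, t, _r in tas if u in annotators)
    total = sum(counts.values())
    return {t: n / total for t, n in counts.items()} if total else {}

tas = [("u1", "jazz", "R1"), ("u1", "trumpet", "R2")]
profile = user_based_profile(tas, "R1")
print({t: round(w, 2) for t, w in profile.items()})  # {'jazz': 0.67, 'trumpet': 0.33}
```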
Context-based Weighting Strategies (2)
2. Category-based co-occurrence:
• Hypothesis: resources that occur in tag assignments that are classified in the same category are similar → "tags of that category" are also relevant for the resource
[Figure: u1 assigns "jazz" to R1 and u3 assigns "trumpet" to R3; both tag assignments are classified in Category: music]
Context-based resource profile of R1:
concept  weight
jazz     0.67
trumpet  0.33
Context-based Weighting Strategies (3)
3. Semantic Meaning – URI-based co-occurrence:
• Hypothesis: tags that have the same meaning complement the tag-based resource profile positively
[Figure: u1 assigns "jazz" to R1 and u3 assigns "jazzmusic" to R3; both tag assignments have the meaning dbpedia:Jazz]
Context-based resource profile of R1:
concept    weight
jazz       0.67
jazzmusic  0.33
Context-based Weighting Strategies (4)
4. Semantic Meaning – "binary":
• Hypothesis: tags that can be mapped to a DBpedia resource are more important than other tags
[Figure: u1 assigns "jazz" to R1, mapped to dbpedia:Jazz; u3 assigns "cb1981" to R1, with no known meaning (?)]
Context-based resource profile of R1:
concept  weight
jazz     1
cb1981   0
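The binary strategy reduces to a lookup. The sketch below assumes a hypothetical mapping from tags to DBpedia URIs; tags without an entry get weight 0.

```python
def binary_semantic_profile(tags, tag_to_uri):
    """Weight 1 for tags mapped to a DBpedia URI, 0 for all other tags."""
    return {tag: 1 if tag_to_uri.get(tag) else 0 for tag in tags}

# Hypothetical mapping; only "jazz" has a known meaning in this example.
tag_to_uri = {"jazz": "dbpedia:Jazz"}
profile = binary_semantic_profile(["jazz", "cb1981"], tag_to_uri)
print(profile)  # {'jazz': 1, 'cb1981': 0}
```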
Context-based Weighting Strategies (5)
5. Weighting based on Spatial context – area size:
• Hypothesis: the larger the area to which a tag is assigned, the more important the tag is for the resource
[Figure: u1 assigns "chet baker" and "trumpet" to areas of different size within picture R1]
Context-based resource profile of R1:
concept  weight
jazz     0.67
trumpet  0.33
Context-based Weighting Strategies (6)
6. Weighting based on Spatial context – distance from center:
• Hypothesis: the closer the (centroid of the) area is to the center of the picture, the more important the tag is for the resource
[Figure: u1 assigns tag1 and tag2 to areas of picture R5; distance(tag1) < distance(tag2)]
Context-based resource profile of R5:
concept  weight
tag1     0.83
tag2     0.17
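Both spatial strategies (area size and distance from center) can be sketched with simple geometry. Everything specific here is an assumption for illustration: areas are modelled as bounding boxes (x, y, width, height), the distance score uses 1 / (1 + distance), and the coordinates are made up. The slides do not state the actual weighting functions.

```python
import math

def normalize(scores):
    """Scale scores so that they sum to 1."""
    total = sum(scores.values())
    return {k: v / total for k, v in scores.items()} if total else {}

def area_size_weights(areas):
    """areas: tag -> (x, y, width, height); a larger area gives a higher weight."""
    return normalize({tag: w * h for tag, (_x, _y, w, h) in areas.items()})

def center_distance_weights(areas, picture_size):
    """A centroid closer to the picture center gives a higher weight (assumed 1 / (1 + d))."""
    cx, cy = picture_size[0] / 2, picture_size[1] / 2
    scores = {}
    for tag, (x, y, w, h) in areas.items():
        dist = math.hypot(x + w / 2 - cx, y + h / 2 - cy)
        scores[tag] = 1.0 / (1.0 + dist)
    return normalize(scores)

# Hypothetical tagged areas on a 100 x 100 picture.
areas = {"tag1": (40, 40, 20, 20), "tag2": (0, 0, 10, 10)}
by_area = area_size_weights(areas)
by_distance = center_distance_weights(areas, picture_size=(100, 100))
print(by_area)  # {'tag1': 0.8, 'tag2': 0.2}
print(by_distance["tag1"] > by_distance["tag2"])  # True
```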
Context-based Weighting Strategies (7)
7. Journal-based co-occurrence:
• Hypothesis: tags that are assigned to publications that were published in the same journal are also relevant for the resource
[Figure: u1 assigns "SPARQL" to R1 and u3 assigns "semantics" to R3; both publications appeared in Journal: Web Semantics]
Context-based resource profile of R1:
concept    weight
SPARQL     0.67
semantics  0.33
Context-based Weighting Strategies (8)
8. Journal-Year-based co-occurrence:
• Hypothesis: tags that are assigned to publications that were published in the same journal AND in the same year are also relevant for the resource
[Figure: u1 assigns "SPARQL" to R1, u3 assigns "RDF store" to R3 and u5 assigns "trust" to R16; the publications appeared in Journal: Web Semantics in the years 2007 and 2009, and only R3 shares its year with R1]
Context-based resource profile of R1:
concept    weight
SPARQL     0.67
RDF store  0.33
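The journal-based and journal-year-based strategies differ only in how specific the shared resource context is. The sketch below makes that explicit with a `keys` parameter. The function name, the count-and-normalize weighting and the assignment of years to R1, R3 and R16 are assumptions made for illustration; the slide does not say which publication appeared in which year.

```python
from collections import Counter

def resource_context_profile(tas, metadata, resource, keys):
    """Tags of `resource` plus tags of all resources sharing its values for `keys`."""
    target = tuple(metadata[resource][k] for k in keys)
    counts = Counter(t for _u, t, r in tas if r == resource)
    # Resource context (assumed weighting): tags of every resource with the same metadata.
    counts.update(
        t for _u, t, r in tas
        if tuple(metadata[r][k] for k in keys) == target
    )
    total = sum(counts.values())
    return {t: n / total for t, n in counts.items()} if total else {}

tas = [("u1", "SPARQL", "R1"), ("u3", "RDF store", "R3"), ("u5", "trust", "R16")]
# Hypothetical BibTeX-like metadata; the year assignment is an assumption.
metadata = {
    "R1": {"journal": "Web Semantics", "year": 2009},
    "R3": {"journal": "Web Semantics", "year": 2009},
    "R16": {"journal": "Web Semantics", "year": 2007},
}
journal_year = resource_context_profile(tas, metadata, "R1", ("journal", "year"))
journal_only = resource_context_profile(tas, metadata, "R1", ("journal",))
print(journal_only)  # {'SPARQL': 0.5, 'RDF store': 0.25, 'trust': 0.25}
```

In this toy data the journal-only context pulls in "trust" from a different year, while the journal-year context keeps only "SPARQL" and "RDF store".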
Overview on Weighting Strategies
• Baseline: tag-based co-occurrence frequency
• User-based
• Based on the context of tag assignments [TagMe!]:
- Categories
- Spatial information
- Semantic meaning
• Resource-based [BibSonomy]:
- BibTeX properties
• Combining strategies → more than 120 context-based profiling strategies for TagMe!
Which resource profiling strategy generates the most valuable profiles?
Experimental setup
• "Tag Prediction" task (leave-one-out cross validation):
- remove one tag from the resource
- create (context-based) resource profile
- use profile to create a ranking of tags → hidden tag should be at the top of the ranking
• Baseline: tag co-occurrence
• Metrics: Success@k = probability that the relevant tag appears within the top k of the ranking
• Data sets:

TagMe!
Tag Assignments (TAs)              1,288
TAs with Spatial Information         671
TAs with Category Information        917
TAs with URI Information           1,050
TAs with all information             432

BibSonomy
Resources                        566,939
Users                              6,569
Tag Assignments (TAs)          2,622,423
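Success@k for the tag prediction task can be sketched as follows. This is a minimal illustration, assuming a profile is a tag-to-weight mapping that is ranked by descending weight; the helper names and the two test cases are hypothetical and not results from the evaluation.

```python
def rank(profile):
    """Order tags by descending weight, breaking ties alphabetically."""
    return sorted(profile, key=lambda t: (-profile[t], t))

def success_at_k(ranking, hidden_tag, k):
    """1 if the hidden tag appears within the top k of the ranking, else 0."""
    return 1 if hidden_tag in ranking[:k] else 0

def evaluate(cases, k):
    """cases: list of (ranked tags, hidden tag); returns the mean Success@k."""
    hits = [success_at_k(ranking, hidden, k) for ranking, hidden in cases]
    return sum(hits) / len(hits) if hits else 0.0

# Hypothetical leave-one-out cases: a profile built without the hidden tag
# is used to rank candidate tags, then the hidden tag is looked up.
cases = [
    (rank({"jazz": 0.6, "trumpet": 0.3, "cool": 0.1}), "jazz"),
    (rank({"jazz": 0.5, "baker": 0.3, "cool": 0.2}), "cool"),
]
print(evaluate(cases, k=1))  # 0.5
print(evaluate(cases, k=3))  # 1.0
```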
Results [TagMe!]
• Context-based profiling strategies outperform baseline (tag frequency) significantly.
• Semantic meaning and spatial information allow for best performance.
• Area size is more valuable than distance to center.
• No significant difference w.r.t. category- and user-based strategy.
Combining different types of context-based profiling strategies
• Mixture of context-based strategies improves performance (by 37%).
• Context-based strategies have to be combined intelligently in order to increase cumulative gain in performance.
Results [BibSonomy]
• Again: context-based profiling strategies outperform baseline (tag frequency) significantly.
• The more specific the context, the better the performance (→ reducing noise).
Conclusions
• What we did: framework for generating resource profiles by exploiting contextual information of social annotations
- Context-based folksonomy model
- Set of context-based resource profiling strategies (both generic and application-specific strategies)
- Evaluation in two social tagging systems: TagMe! and BibSonomy
• Results:
- Context-based strategies outperform other strategies that do not exploit contextual information
- Context of tag assignments (e.g. semantic meaning) allows for best performance
- Context of the user who performs the tag assignment is competitive
- Mixing context-based strategies improves quality but does not necessarily result in a cumulative gain in performance ("over-contextualization") → smart mixing performs best (>40% improvement)
Twitter: @ricardokawase, @gpapadis, @fabianabel
Thank you!