Delft University of Technology. Generating Resource Profiles by Exploiting the Context of Social Annotations. ISWC, Bonn, Germany, Oct 27th 2011. Ricardo Kawase (1), George Papadakis (2), Fabian Abel (3). (1) L3S Research Center, Leibniz University Hannover, Germany; (2) ICCS, National Technical University of Athens, Greece; (3) Web Information Systems, TU Delft.

Generating Resource Profiles by Exploiting the Context of Social Annotations

DESCRIPTION

Slides presented at ISWC 2011 in Bonn, Germany. Corresponding paper: http://iswc2011.semanticweb.org/fileadmin/iswc/Papers/Research_Paper/12/70310288.pdf

Page 1: Generating Resource Profiles by Exploiting the Context of Social Annotations

Delft University of Technology

Generating Resource Profiles by Exploiting the Context of Social Annotations
ISWC, Bonn, Germany, Oct 27th 2011

Ricardo Kawase (1), George Papadakis (2), Fabian Abel (3)

(1) L3S Research Center, Leibniz University Hannover, Germany
(2) ICCS, National Technical University of Athens, Greece
(3) Web Information Systems, TU Delft

Page 2: Generating Resource Profiles by Exploiting the Context of Social Annotations

Social Annotations: Folksonomies

• Folksonomy: set of tag assignments
• Formal model [Hotho et al. '07]: F = (U, T, R, Y)
  - U: users
  - T: tags
  - R: resources
  - Y ⊆ U × T × R: tag assignments (user, tag, resource)

[Figure: example folksonomy in which users annotate resources with the tags armstrong, baker, dizzy, cool, jazzmusic, jazz, and trumpet; each user-tag-resource link is a tag assignment.]
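
The formal model F = (U, T, R, Y) can be made concrete with a small data structure. The following Python sketch is only an illustration: the class names `TagAssignment` and `Folksonomy`, the user and resource ids, and the pairing of tags with users and resources are assumptions, since the slide's figure does not state them.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable


@dataclass(frozen=True)
class TagAssignment:
    """One element of Y: a (user, tag, resource) triple."""
    user: str
    tag: str
    resource: str


@dataclass(frozen=True)
class Folksonomy:
    """F = (U, T, R, Y) with Y a subset of U x T x R."""
    users: FrozenSet[str]
    tags: FrozenSet[str]
    resources: FrozenSet[str]
    assignments: FrozenSet[TagAssignment]

    @classmethod
    def from_assignments(cls, tas: Iterable[TagAssignment]) -> "Folksonomy":
        # U, T and R are derived from the tag assignments themselves.
        tas = frozenset(tas)
        return cls(
            users=frozenset(ta.user for ta in tas),
            tags=frozenset(ta.tag for ta in tas),
            resources=frozenset(ta.resource for ta in tas),
            assignments=tas,
        )


# Hypothetical tag assignments that reuse some tags from the slide.
example = Folksonomy.from_assignments([
    TagAssignment("u1", "baker", "r1"),
    TagAssignment("u1", "cool", "r1"),
    TagAssignment("u2", "armstrong", "r2"),
    TagAssignment("u2", "jazz", "r2"),
    TagAssignment("u3", "trumpet", "r2"),
])
```

Deriving U, T, and R from Y keeps the four components consistent by construction; a system that also tracks users or resources without any tag assignment would need to store them separately.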

Page 3: Generating Resource Profiles by Exploiting the Context of Social Annotations

Generating resource profiles

• Resource profile = representation of a resource = set of weighted concepts
• Straightforward approach = occurrence frequency of tags:

SELECT tag, count(distinct user)
FROM tas
WHERE resource = XY
GROUP BY tag

[Figure: a resource tagged "baker, cool" by one user and "cool" by another; applications that operate on resource profiles (e.g. search, content-based recommender) need its profile.]

Profile
concept   weight
baker     1
cool      2

Applications rely on good resource profiles!
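
The query on this slide can also be expressed in a few lines of Python. This is a minimal sketch, assuming tag assignments are available as (user, tag, resource) triples; the function name `tag_frequency_profile` and the sample data are hypothetical.

```python
from collections import Counter


def tag_frequency_profile(tas, resource):
    """Count, per tag, the distinct users who assigned it to `resource`.

    `tas` is an iterable of (user, tag, resource) triples.
    """
    # A set removes repeated (user, tag) pairs, mirroring COUNT(DISTINCT user).
    distinct = {(user, tag) for user, tag, res in tas if res == resource}
    return Counter(tag for _user, tag in distinct)


# Hypothetical data matching the slide: one user tags "baker, cool",
# another user tags only "cool".
tas = [
    ("u1", "baker", "XY"),
    ("u1", "cool", "XY"),
    ("u2", "cool", "XY"),
    ("u2", "jazz", "other"),  # a different resource, ignored for XY
]
profile = tag_frequency_profile(tas, "XY")
print(dict(profile))
```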

Page 4: Generating Resource Profiles by Exploiting the Context of Social Annotations

Problems of traditional folksonomies

[Figure: the example folksonomy (tags armstrong, baker, dizzy, cool, jazzmusic, jazz, trumpet) annotated with typical problems.]

• Resources with no tags
• Ambiguity of tags
• Synonyms
• Descriptive vs. subjective tags

Generating valuable resource profiles becomes difficult

Page 5: Generating Resource Profiles by Exploiting the Context of Social Annotations

Exploiting Context in Folksonomies

• Context-enabled folksonomy: Fc = (U, T, R, Y, C, Z)
  - C is the actual metadata information
  - Z ⊆ Y × C is the set of context assignments

[Figure: a tag assignment (user, tag, resource) in which every element carries context:
  - user context: User X (Age: 30 years, Education: …)
  - tag context: Jazz (noun) is a style of music that…; related tags: music, jazz
  - resource context: Resource Y (created: 1979-12-06, creator: …)
  - tag assignment context: TAS XY (User X, jazz; created: 2011-04-19, meaning: dbpedia:Jazz)]
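
One possible way to hold Fc = (U, T, R, Y, C, Z) in memory is sketched below. The class name `ContextFolksonomy`, its methods, and the choice to store each context as a dictionary of metadata are assumptions made for illustration; the slide only fixes the components of the tuple.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

TagAssignment = Tuple[str, str, str]  # (user, tag, resource), an element of Y


@dataclass
class ContextFolksonomy:
    """Fc = (U, T, R, Y, C, Z); U, T and R can be derived from Y."""
    assignments: List[TagAssignment] = field(default_factory=list)  # Y
    contexts: Dict[str, Dict[str, str]] = field(default_factory=dict)  # C
    context_assignments: List[Tuple[TagAssignment, str]] = field(
        default_factory=list
    )  # Z, pairs of (tag assignment, context id)

    def attach(self, ta, context_id, metadata):
        """Record the tag assignment, the context, and the pair in Z."""
        if ta not in self.assignments:
            self.assignments.append(ta)
        self.contexts[context_id] = metadata
        self.context_assignments.append((ta, context_id))

    def contexts_of(self, ta):
        """Return the metadata of every context attached to `ta`."""
        return [self.contexts[c] for t, c in self.context_assignments if t == ta]


# The tag assignment context shown on the slide.
fc = ContextFolksonomy()
ta = ("User X", "jazz", "Resource Y")
fc.attach(ta, "TAS XY", {"created": "2011-04-19", "meaning": "dbpedia:Jazz"})
```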

Page 6: Generating Resource Profiles by Exploiting the Context of Social Annotations

Context in Social Tagging Systems

• TagMe!: tagging and exploration front-end for Flickr pictures that allows users to attach three types of context to tag assignments:
  - Spatial information (assign tags to certain areas)
  - Categories (= tagging of tag assignments, e.g. buildings)
  - DBpedia URIs (meaning of tag assignments, e.g. dbpedia:Opera)
• BibSonomy: social resource sharing system for bookmarks and publications
  - Context of resources = BibTeX information of publications

Page 7: Generating Resource Profiles by Exploiting the Context of Social Annotations

How can we exploit the context of social annotations to generate resource profiles?

[Figure: the resource tagged "baker, cool" and "cool", now with context attached to the tag assignments; its profile (concept, weight) is still unknown.]

Page 8: Generating Resource Profiles by Exploiting the Context of Social Annotations

Standard Resource Profiling

• Exploiting the tags that have been directly assigned to the resource

Tag assignments performed on resource R1:
  - u1 assigns tag X to R1
  - u2 assigns tag X and tag Y to R1

Tag-based resource profile R1
concept   weight
tag X     0.67
tag Y     0.33
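
The weights 0.67 and 0.33 are consistent with relative tag frequencies (2 of 3 and 1 of 3 tag assignments). A minimal sketch under that assumption follows; the function name `normalized_profile` is hypothetical.

```python
from collections import Counter


def normalized_profile(tas, resource):
    """Relative frequency of each tag among the tag assignments on `resource`."""
    counts = Counter(tag for _user, tag, res in tas if res == resource)
    total = sum(counts.values())
    return {tag: n / total for tag, n in counts.items()} if total else {}


# The three tag assignments from the slide.
tas = [("u1", "tag X", "R1"), ("u2", "tag X", "R1"), ("u2", "tag Y", "R1")]
profile = normalized_profile(tas, "R1")
print({t: round(w, 2) for t, w in profile.items()})  # {'tag X': 0.67, 'tag Y': 0.33}
```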

Page 9: Generating Resource Profiles by Exploiting the Context of Social Annotations

Context Profiling

• Aggregation of (context) profiles is possible (again by means of a mixture approach)

Tag assignments in the context folksonomy that refer to context C1:
  - u1 assigns tag A to R1
  - u2 assigns tag A to R2
  - u3 assigns tag A to R3
  - u3 assigns tag B to R3

Tag-based context profile C1
concept   weight
tag A     0.75
tag B     0.25
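
A context profile can be computed like a resource profile, except that tag assignments are selected by context rather than by resource. The sketch below assumes the context assignments Z are given as ((user, tag, resource), context) pairs; the function name and the extra assignment in context C2 are hypothetical.

```python
from collections import Counter


def context_profile(context_assignments, context):
    """Relative tag frequencies over all tag assignments that refer to `context`."""
    counts = Counter(ta[1] for ta, c in context_assignments if c == context)
    total = sum(counts.values())
    return {tag: n / total for tag, n in counts.items()} if total else {}


Z = [
    (("u1", "tag A", "R1"), "C1"),
    (("u2", "tag A", "R2"), "C1"),
    (("u3", "tag A", "R3"), "C1"),
    (("u3", "tag B", "R3"), "C1"),
    (("u4", "tag B", "R4"), "C2"),  # hypothetical assignment in another context
]
profile_c1 = context_profile(Z, "C1")
print(profile_c1)  # {'tag A': 0.75, 'tag B': 0.25}
```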

Page 10: Generating Resource Profiles by Exploiting the Context of Social Annotations

Generating context-based resource profiles

• Generic strategy for generating context-based resource profiles:

Context-based Resource Profile = α · Resource Profile + (1-α) · Context Profile

Context-based Resource Profile
concept   weight
tx        0.7
ty        0.3

Resource Profile
concept   weight
tx        0.85
ty        0.15

Context Profile
concept   weight
tx        0.55
ty        0.45
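
The mixture can be sketched directly from the formula. The slide does not state α; the value 0.5 used here is inferred from the example numbers (0.5 · 0.85 + 0.5 · 0.55 = 0.7), and the function name `mix_profiles` is an assumption.

```python
def mix_profiles(resource_profile, context_profile, alpha):
    """Return alpha * resource_profile + (1 - alpha) * context_profile per concept."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    concepts = set(resource_profile) | set(context_profile)
    return {
        c: alpha * resource_profile.get(c, 0.0)
        + (1 - alpha) * context_profile.get(c, 0.0)
        for c in concepts
    }


resource_profile = {"tx": 0.85, "ty": 0.15}
context_profile = {"tx": 0.55, "ty": 0.45}
mixed = mix_profiles(resource_profile, context_profile, alpha=0.5)
print({c: round(w, 2) for c, w in sorted(mixed.items())})  # {'tx': 0.7, 'ty': 0.3}
```

Concepts that occur in only one of the two profiles are treated as having weight 0 in the other, so the context can introduce concepts the resource itself was never tagged with.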

Page 11: Generating Resource Profiles by Exploiting the Context of Social Annotations

Weighting Strategies

[Figure: the resource tagged "baker, cool" and "cool", with context attached to the tag assignments; its profile (concept, weight) is still unknown.]

Page 12: Generating Resource Profiles by Exploiting the Context of Social Annotations

Context-based Weighting Strategies (1)

1. User-based co-occurrence:
• Hypothesis: users tend to annotate similar resources
  → tags a user assigns to other resources are also relevant for the resource profile that should be constructed

Example: u1 assigns "jazz" to R1 and "trumpet" to R2.

Context-based Resource Profile R1
concept   weight
jazz      0.67
trumpet   0.33
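
One way to realize user-based co-occurrence is to treat the tags that a resource's taggers assigned elsewhere as the context profile and mix it with the resource's own tags. The slide does not say how its weights were derived; in this sketch the function name, the normalization, and α = 2/3 are all assumptions, chosen so that the result happens to match the illustrative weights.

```python
from collections import Counter


def _normalize(counts):
    total = sum(counts.values())
    return {k: v / total for k, v in counts.items()} if total else {}


def user_based_profile(tas, resource, alpha):
    """Mix the resource's own tags with tags its taggers assigned elsewhere."""
    own = Counter(tag for _u, tag, res in tas if res == resource)
    taggers = {u for u, _t, res in tas if res == resource}
    elsewhere = Counter(
        tag for u, tag, res in tas if u in taggers and res != resource
    )
    own_p, ctx_p = _normalize(own), _normalize(elsewhere)
    return {
        t: alpha * own_p.get(t, 0.0) + (1 - alpha) * ctx_p.get(t, 0.0)
        for t in set(own_p) | set(ctx_p)
    }


# u1 assigns "jazz" to R1 and "trumpet" to R2, as on the slide.
tas = [("u1", "jazz", "R1"), ("u1", "trumpet", "R2")]
profile = user_based_profile(tas, "R1", alpha=2 / 3)
print({t: round(w, 2) for t, w in sorted(profile.items())})
```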

Page 13: Generating Resource Profiles by Exploiting the Context of Social Annotations

Context-based Weighting Strategies (2)

2. Category-based co-occurrence:
• Hypothesis: resources that occur in tag assignments that are classified in the same category are similar
  → "tags of that category" are also relevant for the resource

Example: u1 assigns "jazz" to R1 and u3 assigns "trumpet" to R3; both tag assignments are classified in Category: music.

Context-based Resource Profile R1
concept   weight
jazz      0.67
trumpet   0.33

Page 14: Generating Resource Profiles by Exploiting the Context of Social Annotations

Context-based Weighting Strategies (3)

3. Semantic Meaning – URI-based co-occurrence:
• Hypothesis: tags that have the same meaning complement the tag-based resource profile positively

Example: u1 assigns "jazz" to R1 and u3 assigns "jazzmusic" to R3; both tag assignments refer to dbpedia:Jazz.

Context-based Resource Profile R1
concept     weight
jazz        0.67
jazzmusic   0.33

Page 15: Generating Resource Profiles by Exploiting the Context of Social Annotations

Context-based Weighting Strategies (4)

4. Semantic Meaning – "binary":
• Hypothesis: tags that can be mapped to a DBpedia resource are more important than other tags

Example: u1 assigns "jazz" to R1 (mapped to dbpedia:Jazz); u3 assigns "cb1981" to R1 (DBpedia resource: ?).

Context-based Resource Profile R1
concept   weight
jazz      1
cb1981    0
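
The binary strategy reduces to a lookup: a tag receives weight 1 if a DBpedia URI is known for it and 0 otherwise. A minimal sketch, assuming the mapping is available as a dictionary; the function name and the mapping contents are hypothetical.

```python
def binary_semantic_profile(tags, tag_to_uri):
    """Weight 1 for tags with a known DBpedia URI, 0 for all other tags."""
    return {tag: 1 if tag_to_uri.get(tag) else 0 for tag in tags}


# Hypothetical mapping: only "jazz" has been linked to a DBpedia resource.
tag_to_uri = {"jazz": "dbpedia:Jazz"}
profile = binary_semantic_profile(["jazz", "cb1981"], tag_to_uri)
print(profile)  # {'jazz': 1, 'cb1981': 0}
```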

Page 16: Generating Resource Profiles by Exploiting the Context of Social Annotations

Context-based Weighting Strategies (5)

5. Weighting based on Spatial context – area size:
• Hypothesis: the larger the area to which a tag is assigned, the more important the tag is for the resource

[Figure: u1 assigns the tags "chet baker" and "trumpet" to areas of picture R1.]

Context-based Resource Profile R1
concept   weight
jazz      0.67
trumpet   0.33
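
Area-size weighting can be sketched by making each tag's weight proportional to the area it was assigned to. The proportional rule, the rectangular areas, the function name, and the sample sizes below are assumptions; the slide only states that larger areas indicate more important tags.

```python
def area_size_profile(tagged_areas):
    """Weight each tag by its share of the total tagged area.

    `tagged_areas` is an iterable of (tag, width, height) rectangles,
    measured in any consistent unit.
    """
    areas = {}
    for tag, width, height in tagged_areas:
        areas[tag] = areas.get(tag, 0.0) + width * height
    total = sum(areas.values())
    return {tag: a / total for tag, a in areas.items()} if total else {}


# Hypothetical rectangles: the first covers twice the area of the second.
profile = area_size_profile([("chet baker", 200, 100), ("trumpet", 100, 100)])
print({t: round(w, 2) for t, w in profile.items()})
```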

Page 17: Generating Resource Profiles by Exploiting the Context of Social Annotations

Context-based Weighting Strategies (6)

6. Weighting based on Spatial context – distance from center:
• Hypothesis: the closer the (centroid of the) area is to the center of the picture, the more important the tag is for the resource

[Figure: u1 assigns tag1 and tag2 to areas of picture R5, with distance(tag1) < distance(tag2).]

Context-based Resource Profile R5
concept   weight
tag1      0.83
tag2      0.17
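
A possible scoring rule for the distance strategy is sketched below: each tag scores 1 minus the distance of its area's centroid from the picture center, relative to the largest possible distance, and the scores are then normalized. The linear rule, the function name, and the coordinates are assumptions; the slide only requires that closer areas receive higher weights.

```python
import math


def center_distance_profile(tagged_areas, width, height):
    """Weight tags by how close their area's centroid is to the picture center.

    `tagged_areas` is an iterable of (tag, x, y, w, h) rectangles inside a
    picture of size width x height, with (x, y) the top-left corner.
    """
    cx, cy = width / 2, height / 2
    max_dist = math.hypot(cx, cy)  # distance from the center to a corner
    scores = {}
    for tag, x, y, w, h in tagged_areas:
        dist = math.hypot(x + w / 2 - cx, y + h / 2 - cy)
        # Keep the best (closest) area if a tag was assigned more than once.
        scores[tag] = max(scores.get(tag, 0.0), 1.0 - dist / max_dist)
    total = sum(scores.values())
    return {tag: s / total for tag, s in scores.items()} if total else {}


# Hypothetical areas: tag1 sits at the center, tag2 in the top-left corner.
profile = center_distance_profile(
    [("tag1", 40, 40, 20, 20), ("tag2", 0, 0, 20, 20)], width=100, height=100
)
print({t: round(w, 2) for t, w in profile.items()})
```

With these hypothetical coordinates the sketch yields roughly 0.83 for tag1 and 0.17 for tag2, which matches the illustrative profile, although the slide does not say that its weights were obtained this way.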

Page 18: Generating Resource Profiles by Exploiting the Context of Social Annotations

Context-based Weighting Strategies (7)

7. Journal-based co-occurrence:
• Hypothesis: tags that are assigned to publications that were published in the same journal are also relevant for the resource

Example: u1 assigns "SPARQL" to R1 and u3 assigns "semantics" to R3; Journal: Web Semantics.

Context-based Resource Profile R1
concept     weight
SPARQL      0.67
semantics   0.33

Page 19: Generating Resource Profiles by Exploiting the Context of Social Annotations

Context-based Weighting Strategies (8)

8. Journal-Year-based co-occurrence:
• Hypothesis: tags that are assigned to publications that were published in the same journal AND in the same year are also relevant for the resource

[Figure: u1 assigns "SPARQL" to R1, u3 assigns "RDF store" to R3, and u5 assigns "trust" to R16. The publications are grouped by journal (Web Semantics) and year (2007, 2009).]

Context-based Resource Profile R1
concept     weight
SPARQL      0.67
RDF store   0.33
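
The journal-based and journal-year-based strategies differ only in how many metadata fields two publications must share. The sketch below computes the context profile for a configurable set of fields; the function name is hypothetical, and because the slide does not say which year belongs to which publication, the metadata values are assumed.

```python
from collections import Counter


def shared_context_profile(tas, metadata, resource, keys):
    """Relative tag frequencies over resources that share `resource`'s
    values for every metadata field in `keys` (the resource itself included)."""
    target = tuple(metadata[resource][k] for k in keys)
    counts = Counter(
        tag
        for _user, tag, res in tas
        if tuple(metadata[res][k] for k in keys) == target
    )
    total = sum(counts.values())
    return {tag: n / total for tag, n in counts.items()} if total else {}


# Assumed metadata: R1 and R3 share journal and year, R16 only the journal.
metadata = {
    "R1": {"journal": "Web Semantics", "year": 2009},
    "R3": {"journal": "Web Semantics", "year": 2009},
    "R16": {"journal": "Web Semantics", "year": 2007},
}
tas = [("u1", "SPARQL", "R1"), ("u3", "RDF store", "R3"), ("u5", "trust", "R16")]

by_journal = shared_context_profile(tas, metadata, "R1", keys=("journal",))
by_journal_year = shared_context_profile(
    tas, metadata, "R1", keys=("journal", "year")
)
print(sorted(by_journal), sorted(by_journal_year))
```

Under these assumptions the journal-only context still admits "trust", while the narrower journal-and-year context drops it.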

Page 20: Generating Resource Profiles by Exploiting the Context of Social Annotations

Overview on Weighting Strategies

Baseline:
• Tag-based co-occurrence frequency

Context-based:
• User-based
• Based on the context of tag assignments [TagMe!]:
  - Categories
  - Spatial information
  - Semantic meaning
• Resource-based [BibSonomy]:
  - BibTeX properties

Combining strategies → more than 120 context-based profiling strategies for TagMe!

Page 21: Generating Resource Profiles by Exploiting the Context of Social Annotations

Broken Slide

Page 22: Generating Resource Profiles by Exploiting the Context of Social Annotations

Which resource profiling strategy generates the most valuable profiles?

[Figure: the resource tagged "baker, cool" and "cool", with context attached to the tag assignments; its profile (concept, weight) is still unknown.]

Page 23: Generating Resource Profiles by Exploiting the Context of Social Annotations

Experimental setup

• "Tag Prediction" task (leave-one-out cross validation):
  - remove one tag from the resource
  - create (context-based) resource profile
  - use profile to create a ranking of tags → hidden tag should be at the top of the ranking
• Baseline: tag co-occurrence
• Metrics: Success@k = probability that the relevant tag appears within the top k of the ranking
• Data sets:

TagMe!
Tag Assignments (TAs)               1,288
TAs with Spatial Information          671
TAs with Category Information         917
TAs with URI Information            1,050
TAs with all information              432

BibSonomy
Resources                         566,939
Users                               6,569
Tag Assignments (TAs)           2,622,423
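
Success@k can be estimated as the fraction of leave-one-out trials in which the hidden tag lands in the top k of the ranking. The sketch below assumes that each trial yields a profile (concept to weight) and one hidden tag; the function names and the sample trials are hypothetical and are not data from the evaluation.

```python
def rank_tags(profile):
    """Order concepts by descending weight."""
    return sorted(profile, key=profile.get, reverse=True)


def success_at_k(rankings, hidden_tags, k):
    """Fraction of trials whose hidden tag appears within the top k."""
    if len(rankings) != len(hidden_tags):
        raise ValueError("need exactly one hidden tag per ranking")
    if not rankings:
        return 0.0
    hits = sum(
        hidden in ranking[:k] for ranking, hidden in zip(rankings, hidden_tags)
    )
    return hits / len(rankings)


# Hypothetical trials: each profile was built after hiding one tag.
profiles = [
    {"jazz": 0.6, "trumpet": 0.3, "cool": 0.1},
    {"sparql": 0.5, "rdf": 0.4, "trust": 0.1},
]
hidden = ["trumpet", "trust"]
rankings = [rank_tags(p) for p in profiles]

s1 = success_at_k(rankings, hidden, k=1)
s2 = success_at_k(rankings, hidden, k=2)
s3 = success_at_k(rankings, hidden, k=3)
print(s1, s2, s3)  # 0.0 0.5 1.0
```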

Page 24: Generating Resource Profiles by Exploiting the Context of Social Annotations

Results [TagMe!]

Context-based profiling strategies outperform baseline (tag frequency) significantly.

• Semantic meaning and spatial information allow for best performance.
• Area size is more valuable than distance to center.
• No significant difference w.r.t. category- and user-based strategy.

Page 25: Generating Resource Profiles by Exploiting the Context of Social Annotations

Combining different types of context-based profiling strategies

• Mixture of context-based strategies improves performance (by 37%)
• Context-based strategies have to be combined intelligently in order to increase cumulative gain in performance.

Page 26: Generating Resource Profiles by Exploiting the Context of Social Annotations

Results [BibSonomy]

Again: Context-based profiling strategies outperform baseline (tag frequency) significantly.

The more specific the context, the better the performance (→ reducing noise)

Page 27: Generating Resource Profiles by Exploiting the Context of Social Annotations

Conclusions

• What we did: framework for generating resource profiles by exploiting contextual information of social annotations
  - Context-based folksonomy model
  - Set of context-based resource profiling strategies (both generic and application-specific strategies)
  - Evaluation in two social tagging systems: TagMe! and BibSonomy
• Results:
  - Context-based strategies outperform other strategies that do not exploit contextual information
  - Context of tag assignments (e.g. semantic meaning) allows for best performance
  - Context of the user who performs the tag assignment is competitive
  - Mixing context-based strategies improves quality but does not necessarily result in a cumulative gain in performance ("over-contextualization") → smart mixing performs best (>40% improvement)

Page 28: Generating Resource Profiles by Exploiting the Context of Social Annotations

Thank you!

Twitter: @ricardokawase, @gpapadis, @fabianabel

[Figure: the resource tagged "baker, cool" and "cool" with attached context, next to an example context-based resource profile (SPARQL 0.67, semantics 0.33).]