Enhancing the Navigability of Social Tagging Systems with Tag Taxonomies

Christoph Trattner & Christian Körner & Denis Helic

KMI, TU Graz

September 8, 2011

Introduction

“Tagging gained tremendously in popularity over the past few years”

Introduction

Figure: Tags on Flickr

Introduction

Figure: Tags on Amazon

Introduction

Figure: Tags on LastFM

Introduction

What we also like about tags, apart from the fact that they represent a cheap and lightweight alternative to common keyword-based semantic enrichment, is that they allow us to invent tools to explore or navigate an information system in a lightweight and concept-driven manner.

A popular example of such a tool is the tag taxonomy!

Introduction

Q: What is a tag taxonomy?

A: A tool that allows us to navigate the information items in an information system in a concept-driven and hierarchical manner.

Figure: Tag Taxonomy

Introduction

Popular examples of tag taxonomy induction algorithms are:

The graph based approach of Heymann (Heymann et al. 2009)

Affinity Propagation (Lerman et al. 2010)

Hierarchical K-Means (Dhillon et al. 2001)

Why is the usefulness of tag taxonomies for navigation limited?

Recent research on tagging has also shown that tag-based navigation has its limitations (Helic et al. 2010).

The basic problem with tagging is that people do not apply tags uniformly to all resources of an information system.

Why is the usefulness of tag taxonomies for navigation limited?

It has been observed (H. Halpin et al. 2007) that the tag distribution of almost all tagging systems follows a power-law function, i.e. some tags refer to a very large number of resources, while most tags refer to only a few.

(a) Austria-Forum (b) BibSonomy (c) CiteULike

Figure: Tag distributions.

Why is the usefulness of tag taxonomies for navigation limited?

Hence, to navigate from one resource to another with the help of a tag taxonomy, the user would in the worst case have to click a great many times to reach the desired target resource.

Figure: Result list of the tag “blog” in the bookmarking system Delicious.

Why is the usefulness of tag taxonomies for navigation limited?

To help the user also navigate to the resources of a tagging system efficiently, we invented the approach of so-called tag-resource taxonomies.

Figure: Tag Taxonomy vs. Tag-Resource Taxonomy. Panels: (a) Tag Taxonomy (node labels: Car; Tire, Motor; Mercedes, VW, VOLVO, BMW); (b) Tag-Resource Taxonomy (node labels: Car; Tire, Motor; VW, BMW, VW, BMW).

The beauty of such tag-resource hierarchies is that the result lists are limited to a certain branching factor b and the maximum number of clicks is bounded by log_b(n), where n is the number of resources.
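As a rough illustration of the logarithmic bound, the sketch below evaluates the logarithm of n for a few branching factors. It assumes that the logarithm is taken to base b (the depth of a balanced tree with branching factor b) and that n = 19,430, the value of n reported for the Austria-Forum taxonomies in the collision-rate table, is the number of resources. The function name max_clicks is hypothetical.

```python
import math


def max_clicks(n, b):
    """Approximate depth log_b(n) of a balanced hierarchy with n resources."""
    return math.log(n, b)


n = 19430  # assumed number of resources in the Austria-Forum dataset
for b in (2, 5, 10):
    print(f"b = {b:2d}: about {max_clicks(n, b):.1f} clicks")
```

For these inputs the values round to 14.2, 6.1 and 4.3, which agrees with the Austria-Forum values reported for mean{click(T_res)} at b = 2, 5 and 10.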

Why is the usefulness of tag taxonomies for navigation limited?

Sample calculations of the maximum number of clicks for a tag taxonomy vs. a tag-resource taxonomy, for three different tagging datasets with branching factor b = 10.

Measure              Austria-Forum   BibSonomy   CiteULike
max{click(T_tag)}    184             5,278       20,799
max{click(T_res)}    6.1             7.7         8.5

Table: Tag Taxonomy vs. Tag-Resource Taxonomy.

Why is the usefulness of tag taxonomies for navigation limited?

Sample calculations of the mean number of clicks for a tag taxonomy vs. a tag-resource taxonomy, for three different tagging datasets with branching factors ranging from b = 2 to b = 10.

Measure               b    Austria-Forum   BibSonomy   CiteULike
mean{click(T_res)}    2    14.2            17.8        19.8
mean{click(T_tag)}    2    29.5            22.4        30.7
mean{click(T_res)}    5    6.1             7.6         8.5
mean{click(T_tag)}    5    11.6            9.2         12.3
mean{click(T_res)}    10   4.3             5.3         5.9
mean{click(T_tag)}    10   6.4             5.6         7.3

Table: Tag Taxonomy vs. Tag-Resource Taxonomy.

Creating Tag-Resource Taxonomies

“How do we create tag-resource hierarchies?”

Creating Tag-Resource Taxonomies

The first step in creating a tag-resource hierarchy is to create a resource hierarchy out of a tagging dataset.

1. Compute degree centrality for each resource of the tagging dataset and take the most general resource as the root.

2. Compute cosine similarity for all resources that are related to the root node.

3. Re-rank the nodes according to their cosine*centrality values.

4. Attach at most b resources as children to the root.

5. Set the next child as root and go to step 2.
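The five steps above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: two resources are taken to be "related" if they share at least one tag, degree centrality is taken to be the number of related resources, cosine similarity is computed on binary tag vectors, and "next child" is interpreted as breadth-first order. The names build_resource_hierarchy and cosine are hypothetical. Resources that never make it into the top b candidates of any node stay unplaced in this sketch.

```python
import math
from collections import deque


def cosine(a, b):
    """Cosine similarity of two tag sets treated as binary vectors."""
    if not a or not b:
        return 0.0
    return len(a & b) / math.sqrt(len(a) * len(b))


def build_resource_hierarchy(tags_of, b):
    """Return (root, {parent: [children]}) for a dict resource -> set of tags."""
    # Assumption: resources are related if they share at least one tag.
    related = {r: {s for s in tags_of if s != r and tags_of[r] & tags_of[s]}
               for r in tags_of}
    # Step 1: degree centrality (assumed: number of related resources);
    # ties are broken alphabetically to keep the result deterministic.
    centrality = {r: len(related[r]) for r in tags_of}
    root = max(sorted(tags_of), key=centrality.get)

    children = {root: []}
    placed = {root}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        # Steps 2 and 3: rank unplaced related resources by cosine * centrality.
        candidates = sorted(
            (r for r in related[node] if r not in placed),
            key=lambda r: (-cosine(tags_of[node], tags_of[r]) * centrality[r], r),
        )
        # Step 4: attach at most b resources as children.
        for child in candidates[:b]:
            children[node].append(child)
            children[child] = []
            placed.add(child)
            queue.append(child)  # Step 5: each child is processed as a root later.
    return root, children


example_tags = {
    "r1": {"car", "bmw"},
    "r2": {"car", "vw"},
    "r3": {"car", "tire"},
    "r4": {"tire", "bmw"},
}
print(build_resource_hierarchy(example_tags, b=2))
```

The pairwise comparison used to find related resources is quadratic in the number of resources, which is fine for a toy example but would need an inverted tag index for a dataset of realistic size.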

Creating Tag-Resource Taxonomies

To generate the actual tag-resource taxonomy we invented a hierarchical labeling algorithm. Basically, the algorithm works as follows:

1. Traverse the resource taxonomy in left-order and calculate a co-occurrence vector for the currently processed resource.

2. Remove all tags from the co-occ. vector that are not in the tag set of the currently processed resource.

3. Try to apply the most general tag of the co-occ. vector. If the candidate tag has already been applied to one of the parent resources of the currently processed resource, take the next candidate tag from the co-occ. vector.
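The labeling steps above might look roughly like the sketch below. Several details are assumptions made for illustration only: the traversal is done in pre-order from left to right, the co-occurrence vector counts how often each tag appears on resources that share a tag with the current resource, "most general" means the highest count, and when every candidate is already used on the path the top candidate is reused and counted as a collision. The function name label_hierarchy is hypothetical, and every resource is assumed to carry at least one tag.

```python
from collections import Counter


def label_hierarchy(root, children, tags_of):
    """Assign one tag to every resource; return (labels, number of collisions)."""
    labels, collisions = {}, 0
    # Each stack entry holds a node and the tags already used by its ancestors.
    stack = [(root, frozenset())]
    while stack:
        node, used = stack.pop()
        # Step 1: co-occurrence vector (assumed definition, see lead-in).
        cooc = Counter()
        for other_tags in tags_of.values():
            if other_tags & tags_of[node]:
                cooc.update(other_tags)
        # Step 2: keep only the node's own tags, most frequent first.
        ranked = sorted(tags_of[node], key=lambda t: (-cooc[t], t))
        # Step 3: take the first candidate not used by an ancestor.
        free = [t for t in ranked if t not in used]
        if free:
            labels[node] = free[0]
        else:
            labels[node] = ranked[0]  # assumed fallback: reuse a tag
            collisions += 1
        # Push children in reverse so the leftmost child is visited first.
        for child in reversed(children.get(node, [])):
            stack.append((child, used | {labels[node]}))
    return labels, collisions


example_tags = {
    "r1": {"car", "bmw"},
    "r2": {"car", "vw"},
    "r3": {"car", "tire"},
    "r4": {"tire", "bmw"},
}
example_children = {"r1": ["r3", "r2"], "r3": ["r4"], "r2": [], "r4": []}
print(label_hierarchy("r1", example_children, example_tags))
```

In this toy hierarchy the root receives the tag car, and its descendants receive more specific tags (tire, bmw, vw) because car is already taken on their path to the root.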

Evaluating Tag-Resource Taxonomies

To evaluate our approach, we conducted three different experiments.

As the dataset for our analysis we used a tagging dataset from a large wiki-based information system called the Austria-Forum.

Evaluating Tag-Resource Taxonomies

Since our tag-taxonomy induction algorithm is not 100% free of collisions, we conducted a simple experiment in which we measured the number of collisions that occur during the labeling process.

Example of a collision: car > bmw > bmw

For that purpose we generated three different tag-resource taxonomies with branching factors ranging from b = 2 to b = 10 and investigated the collision rate.

Name    b    n        CR (%)
Res2    2    19,430   0.1
Res5    5    19,430   0.2
Res10   10   19,430   0.2

Table: Collision Rates (CR) for different resource taxonomies with different branching factors b.
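A collision such as car > bmw > bmw can be detected by checking whether a node's label already occurs on its path to the root. The sketch below is one possible way to compute a collision rate on a labeled hierarchy. It assumes that CR is the percentage of nodes with such a repeated label, which the slides do not state explicitly; the function name collision_rate and the toy data are hypothetical.

```python
def collision_rate(root, children, labels):
    """Percentage of nodes whose label already occurs on an ancestor."""
    collisions, total = 0, 0
    # Each stack entry holds a node and the labels seen on its path to the root.
    stack = [(root, frozenset())]
    while stack:
        node, seen = stack.pop()
        total += 1
        if labels[node] in seen:
            collisions += 1
        for child in children.get(node, []):
            stack.append((child, seen | {labels[node]}))
    return 100.0 * collisions / total


# Toy trail reproducing the collision example car > bmw > bmw.
toy_children = {"a": ["b"], "b": ["c"], "c": []}
toy_labels = {"a": "car", "b": "bmw", "c": "bmw"}
print(f"{collision_rate('a', toy_children, toy_labels):.1f}%")
```

In the toy trail one of three nodes repeats an ancestor's label, so the rate is one third; the rates of 0.1% to 0.2% in the table correspond to far rarer repeats among the 19,430 nodes.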

Evaluating Tag-Resource Taxonomies

In the second experiment we compared the semantic structure of the tag-resource taxonomy with that of popular tag taxonomy induction algorithms such as Heymann, K-Means, Affinity Propagation and Co-Occurrence.

As measures for this experiment we used Taxonomic Recall/Precision and Taxonomic Overlap.

As ground truth we used the GermaNet ontology.

For the experiment we again generated three different tag-resource taxonomies with different branching factors b.

Evaluating Tag-Resource Taxonomies

Figure: Results of the semantic evaluation of the three generated tag-resource taxonomies Res2, Res5 and Res10. (Plot of Taxonomic F-Measure and Taxonomic Overlap for Res2, Res5, Res10, Deg/Cooc, Aff. Prop, K-Means and Heymann; y-axis: Count (1 = 100%), from 0 to 0.4.)

Evaluating Tag-Resource Taxonomies

In the third and last experiment, a user study was conducted to evaluate whether our approach is also useful for humans and could be used in a practical setting.

To compare our approach against a gold standard, we used the best tag taxonomy induction algorithm known so far (Deg/Cooc).

To measure the performance of our approach, we invited 9 test users to judge 200 tag trails extracted from both hierarchies.

Evaluating Tag-Resource Taxonomies

To ensure that the users would not know which hierarchy a trail came from, we shuffled the trails uniformly at random.

To evaluate the trails, we asked our test users to start from the leftmost concept and to move on to the rightmost concept in the trail.

The evaluation schema given to the users was the following:

Classification   Description
Correct          Correct hierarchy relation
Related          Correct relation, but not hierarchical or reverse hierarchical
Equivalent       Synonym
Not Related      The tags do not have anything to do with each other
Unknown          The evaluator does not recognize the meaning of the tag(s)

Table: Classification Labels for the User Evaluation.

Evaluating Tag-Resource Taxonomies

The user study showed a high performance of our approach compared to a Deg/Cooc tag taxonomy.

Name         b    Correct (%)   Related (%)   Equivalent (%)   Not Related (%)   Unknown (%)
Deg/Cooc10   10   33.2          27.3          13               21.9              5.1
Res10        10   27.3          36.2          12.3             19.8              4.2

Table: Results of the empirical analysis of the tag-resource taxonomy with branching factor b = 10 compared to a Deg/Cooc tag taxonomy with branching factor b = 10.

Summary

We showed that tag taxonomies are in general not well suited for finding resources in a small number of clicks.

To tackle that issue we introduced the novel approach of so-called tag-resource hierarchies.

We illustrated in theory that with a tag-resource taxonomy it is possible to navigate to resources efficiently.

In addition to these findings, we introduced an algorithm to generate such hierarchies and presented a number of experiments showing that tag-resource taxonomies perform on a semantic level nearly as well as or even better than popular tag taxonomy approaches.

End of presentation

Thank you very much for your attention!

Christoph Trattner ([email protected])
