
Retrieval Effectiveness of Tagging Systems

Isabella Peters, Laura Schumann, Jens Terliesner, Wolfgang G. Stock

{isabella.peters | laura.schumann | jens.terliesner}@uni-duesseldorf.de

Heinrich-Heine-University, Department of Information Science

Universitätsstraße 1, 40225 Düsseldorf, Germany

Paper and full reference list at: http://tinyurl.com/retrieval-test

Abstract

Social tagging is a widespread activity for indexing user-generated content on Web services. This poster summarizes research on folksonomies and their retrieval effectiveness. A TREC-like retrieval test was conducted with tags and resources from the social bookmarking system delicious, which resulted in recall and precision values for tag-only searches. Moreover, several experimental tag-based databases (i.e., Power tags, Luhn tags) have been tested regarding their retrieval effectiveness. Test results show that folksonomies work best with short queries, although recall values are high and precision values are low. Here, a search function “Power tags only” greatly enhances precision values.

1. Extract documents and tags

- 1,989 resources consisting of the docsonomies of the delicious bookmarks
- collected from delicious.com in October 2010
- tags form the indexed databases for the retrieval test runs
- each resource was tagged by at least 30 users
- each resource was tagged with “folksonomy”, “seo” or “folksonomies”

2. Adjust tags

- original tags (Information Retrieval)
- tags unified, i.e. without special characters (informationretrieval)
- tags unified and stemmed (informationretriev)
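The three tag variants can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: the poster does not name the stemmer that was used, so `toy_stem` and its suffix list are hypothetical stand-ins, and `unify` assumes that unification also lowercases tags, as the example above suggests.

```python
import re

# Hypothetical suffix list for illustration only; a real stemmer
# applies far more rules than this.
SUFFIXES = ("ing", "al", "es", "s")


def unify(tag: str) -> str:
    """Lowercase a tag and drop everything except ASCII letters and digits."""
    return re.sub(r"[^a-z0-9]", "", tag.lower())


def toy_stem(tag: str) -> str:
    """Strip the first matching suffix, keeping at least three characters."""
    for suffix in SUFFIXES:
        if tag.endswith(suffix) and len(tag) - len(suffix) >= 3:
            return tag[: -len(suffix)]
    return tag


original = "Information Retrieval"
unified = unify(original)
stemmed = toy_stem(unified)
print(unified)  # informationretrieval
print(stemmed)  # informationretriev
```

Unifying before stemming matters here: the stemmer only sees one concatenated token, so only the end of the whole tag is reduced.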

[Diagram, not reproduced here: Relevance, Queries and Tags, with the tag databases "All tags", "Power tags" and "Luhn tags"]

4. Create information needs

55 information needs and search tasks were created. They vary in complexity, as the following examples show: a simple lookup, a complex lookup, and an exploratory search task.

- Find a thesaurus
- Find articles which present social bookmarking tools
- Find articles which advise the combination of folksonomies and controlled vocabularies for indexing

5. Judge relevance of documents

- 24 students and 9 staff members of the department acted as assessors
- assessments were binary and were conducted manually
- assessors had access to the resources (i.e., websites); tags were hidden during relevance assessment
- if two assessors agreed in their decision, the relevance judgment was saved immediately
- this procedure resulted in 109,395 relevance judgments

6. Create queries

- 25 students and 2 staff members created queries for each information need → 730 queries
- Boolean operators and brackets were allowed (they can be used in delicious)
- minimum number of query terms: 1, maximum number of query terms: 13
- average query length: 3 terms
- experts from staff built queries for each search task with a maximum of 5 query terms (simulation of real-world users, who use 2 terms or fewer)
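To illustrate how a Boolean query with brackets can be run against the tags of one resource, here is a minimal sketch. The operator keywords `AND`, `OR` and `NOT`, the precedence (NOT over AND over OR) and the function names are assumptions made for this example; the poster does not describe the query syntax of delicious.

```python
import re


def tokenize(query):
    """Split a query into brackets and whitespace-separated words."""
    return re.findall(r"\(|\)|[^\s()]+", query)


def matches(query, tags):
    """Return True if the set of (lowercase) tags satisfies the query."""
    tokens = tokenize(query)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        return tok

    def parse_or():
        value = parse_and()
        while peek() == "OR":
            take()
            rhs = parse_and()  # always parse the right side before combining
            value = value or rhs
        return value

    def parse_and():
        value = parse_not()
        while peek() == "AND":
            take()
            rhs = parse_not()
            value = value and rhs
        return value

    def parse_not():
        if peek() == "NOT":
            take()
            return not parse_not()
        return parse_atom()

    def parse_atom():
        tok = take()
        if tok == "(":
            value = parse_or()
            if peek() != ")":
                raise ValueError("missing closing bracket")
            take()
            return value
        return tok.lower() in tags

    result = parse_or()
    if pos != len(tokens):
        raise ValueError("unexpected token: " + tokens[pos])
    return result


resource_tags = {"folksonomy", "tagging", "seo"}
print(matches("folksonomy AND (tagging OR thesaurus)", resource_tags))
print(matches("seo AND NOT folksonomy", resource_tags))
```

A tag-only search then amounts to applying `matches` to the tag set of every resource in the chosen database and collecting the resources for which it returns `True`.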

3. Extract Power tags and Luhn tags

The algorithms developed for Power tag and Luhn tag extraction work differently and depend on the tag distribution of the docsonomy.
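Both extraction algorithms start from the tag distribution of a docsonomy, that is, how often each tag was assigned to one resource. The sketch below only computes such a distribution; it does not reproduce the Power tag or Luhn tag algorithms, which this poster does not specify. The function name, the cut-off at 20 ranks and the normalization by the most frequent tag are assumptions made for illustration.

```python
from collections import Counter


def tag_distribution(assignments, top_n=20):
    """Return (tag, relative frequency) pairs ranked by frequency.

    `assignments` is a list of tags, one entry per user-tag assignment
    for a single resource. Frequencies are divided by the count of the
    most frequent tag (an assumption), so the top tag has value 1.0.
    """
    counts = Counter(assignments)
    if not counts:
        return []
    ranked = counts.most_common(top_n)
    top_count = ranked[0][1]
    return [(tag, count / top_count) for tag, count in ranked]


# Made-up docsonomy for illustration only.
docsonomy = ["folksonomy"] * 4 + ["tagging"] * 2 + ["web2.0"]
for rank, (tag, rel) in enumerate(tag_distribution(docsonomy), start=1):
    print(rank, tag, rel)
```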

[Two charts, not reproduced here: x-axis from 1 to 20, y-axis from 0 to 1.2]

7. Retrieval and evaluation

Figures A to F visualize the results of the retrieval test and display average recall values, average precision values and F-measure values for the search runs conducted. Figures A to C show the results of the expert searches; figures D to F show the results of simple lookup search tasks that require one query term. Figures G and H visualize the relative benefit or loss of using Power tags and Luhn tags with expert queries (G) and one-word queries (H).

Test results show that retrieval in folksonomies works best with very short queries. Here, a search function “Power tags only” greatly enhances retrieval effectiveness.
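The evaluation measures named above can be sketched as follows. This is an illustration under stated assumptions, not the evaluation code of the study: the F-measure is taken to be the balanced harmonic mean of precision and recall, and the averaging is a plain mean over search runs. The function names and the sample data are hypothetical.

```python
def precision_recall_f(retrieved, relevant):
    """Return (precision, recall, F-measure) for one search run.

    `retrieved` and `relevant` are sets of resource ids. The F-measure
    is the balanced harmonic mean of precision and recall (an assumption).
    """
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure


def mean_scores(runs):
    """Average precision, recall and F-measure over (retrieved, relevant) pairs."""
    scores = [precision_recall_f(ret, rel) for ret, rel in runs]
    n = len(scores)
    return tuple(sum(s[i] for s in scores) / n for i in range(3))


# Made-up runs for illustration only, not data from the study.
runs = [
    ({1, 2, 3, 4}, {1, 2}),  # precision 0.5, recall 1.0
    ({5}, {5, 6}),           # precision 1.0, recall 0.5
]
print(mean_scores(runs))
```

The first run shows the pattern the poster reports for tag-only search: everything relevant is found, but half of the hits are not relevant.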

[Figures A to C, not reproduced here: expert queries, 55 information needs. One panel each for original tags, unified tags, and unified and stemmed tags. Each panel shows mean average recall, mean average precision and F-measure (scale 0 to 1) for four databases: all delicious tags, restricted to Luhn tags, restricted to Luhn tags and Power tags, restricted to Power tags.]

[Figure G, not reproduced here: expert queries. Scale 0% to 160% for original tags, unified tags, and unified and stemmed tags, each unrestricted, restricted to Luhn tags, and restricted to Power tags. Baseline = F-measure of delicious-tags for expert queries.]

[Figures D to F, not reproduced here: one-word queries, 5 information needs. One panel each for original tags, unified tags, and unified and stemmed tags. Each panel shows average recall, average precision and F-measure (scale 0 to 1) for the same four databases.]

[Figure H, not reproduced here: one-word queries. Scale 0% to 160% for the same tag variants and restrictions as in Figure G. Baseline = F-measure of delicious-tags for one word queries.]
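The relative benefit or loss against a baseline can be computed as in the sketch below. It assumes that the baseline F-measure corresponds to 100% on the percentage scale of Figures G and H; the poster does not state the formula, and the function name and sample values are hypothetical.

```python
def relative_to_baseline(f_measure, baseline_f):
    """Express an F-measure as a percentage of the baseline F-measure."""
    if baseline_f <= 0:
        raise ValueError("baseline F-measure must be positive")
    return 100.0 * f_measure / baseline_f


# Made-up values for illustration only, not results from the poster.
baseline = 0.20
power_tags_only = 0.25
print(relative_to_baseline(power_tags_only, baseline))
```

Under this reading, a value above 100% indicates a benefit over searching all delicious tags and a value below 100% indicates a loss.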

74th Annual Meeting of the American Society for Information Science and Technology, October 9-13, 2011, New Orleans, Louisiana

The research is funded by the DFG (STO 764/4-1). We thank students and staff for their valuable contribution.
