Crowdsourcing metadata for audiovisual collections: from free text tags to semantic concepts
Lotte Belice Baltussen – Sound and Vision
7 December 2011 | DISH


Crowdsourcing metadata for audiovisual collections: from free text tags to semantic concepts. 7 December 2011 | DISH | Rotterdam. Session: http://www.dish2011.nl/sessions/new-models-of-interaction-glams-linked-open-data-and-user-participation

Page 1: Crowdsourcing metadata for audiovisual collections

Crowdsourcing metadata for audiovisual collections

from free text tags to semantic concepts

Lotte Belice Baltussen – Sound and Vision

7 December 2011 | DISH

Page 2: Crowdsourcing metadata for audiovisual collections
Page 3: Crowdsourcing metadata for audiovisual collections

Waisda? What’s that?

Waisda? allows people to annotate audiovisual archive material in the form of a game.

Page 4: Crowdsourcing metadata for audiovisual collections

Added value

• Time-related metadata
• Social tagging (bridging the semantic gap)
• Interaction between the archive / broadcaster and the public
• Gathering data for further research
• Efficiency? Annotating video takes up to 5 x the length of the video
• New business model?

Page 5: Crowdsourcing metadata for audiovisual collections

Project partners pilot

• Netherlands Institute for Sound and Vision (project management, content, research)
• KRO (concept, content, PR)
• VU (research within PrestoPRIME)
• Q42 (developer)

Page 6: Crowdsourcing metadata for audiovisual collections

Man bijt hond Woordentikkertje

After evaluation:
• Improved interface
• New scoring mechanisms (semantics)
• New content
• More feedback

Page 7: Crowdsourcing metadata for audiovisual collections
Page 8: Crowdsourcing metadata for audiovisual collections
Page 9: Crowdsourcing metadata for audiovisual collections

How does it work?

Players choose from ‘channels’ with different episodes

Page 10: Crowdsourcing metadata for audiovisual collections

How does it work?

Scoring:
• Basic rule: players score points when their tag exactly matches the tag entered by another player within 10 seconds
• Multiple other scoring mechanisms to create various tag incentives

Scoring as filter
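
The basic scoring rule can be illustrated with a minimal sketch. This is not the Waisda? implementation: the `TagEntry` type, the function names and the point value are hypothetical, and the sketch assumes that both players in a match score and that tags are compared as raw strings. Only the exact-match requirement and the 10-second window come from the slide.

```python
from dataclasses import dataclass

MATCH_WINDOW_SECONDS = 10  # from the slide: "within 10 seconds"
POINTS_PER_MATCH = 50      # hypothetical value, not taken from the game


@dataclass(frozen=True)
class TagEntry:
    player: str
    tag: str
    time: float  # seconds from the start of the video


def find_matches(entries, window=MATCH_WINDOW_SECONDS):
    """Return pairs of identical tags entered by different players within the window."""
    matches = []
    ordered = sorted(entries, key=lambda e: e.time)
    for i, first in enumerate(ordered):
        for second in ordered[i + 1:]:
            if second.time - first.time > window:
                break  # later entries are even further away in time
            if first.tag == second.tag and first.player != second.player:
                matches.append((first, second))
    return matches


def score(entries):
    """Award points to both players of every match (an assumption of this sketch)."""
    points = {}
    for first, second in find_matches(entries):
        for entry in (first, second):
            points[entry.player] = points.get(entry.player, 0) + POINTS_PER_MATCH
    return points


def matched_tags(entries):
    """Scoring as filter: keep only tags confirmed by a second player."""
    return {first.tag for first, _ in find_matches(entries)}
```

Under these assumptions, `matched_tags` shows how scoring can double as a filter: a tag that no other player confirms within the window earns nothing and is left out of the result.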

Page 11: Crowdsourcing metadata for audiovisual collections

Evaluation

Martorrel

Page 12: Crowdsourcing metadata for audiovisual collections

Generating a constant flow of traffic is a challenge! Important: partners, and publicity on external websites that have relevant communities and a large number of visitors.

Example FWAW, in one week:

• Number of tags tripled to 160,000

• Number of registered players doubled to 362

Page 13: Crowdsourcing metadata for audiovisual collections

Outcomes

• Matches in Waisda?
• Matches GTAA / Cornetto

• Stats
  • 340,551 tags added to 604 items, 42,068 unique tags
  • 39,134 pageviews, 555 registered players, 10,926 visits
  • Average playing time 6 min 45 s, 4,287 sessions

Page 14: Crowdsourcing metadata for audiovisual collections

Evaluation: AV documentalist

Page 15: Crowdsourcing metadata for audiovisual collections

Evaluation: AV documentalist

• Tags mostly describe short fragments and are often not very specific. They don’t describe a programme as a whole.

• BUT! This can be solved by filtering and mapping free text tags to existing vocabularies.

• The WNW tags were the most useful and specific; content influences specificity.

• Tags can be used in different ways and the relevance varies per user group.

• Documentalists excited about further development!
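
The idea of filtering and mapping free text tags to an existing vocabulary can be sketched as a normalise-then-look-up step. This is an illustration only: the vocabulary entries and concept identifiers below are made up and do not come from GTAA or Cornetto, and the function names are hypothetical.

```python
import unicodedata

# Hypothetical vocabulary: labels mapped to made-up concept identifiers.
VOCABULARY = {
    "rotterdam": "example:place/rotterdam",
    "fiets": "example:subject/bicycle",
    "fietsen": "example:subject/bicycle",
}


def normalize(tag):
    """Lowercase the tag, trim whitespace and drop diacritics."""
    decomposed = unicodedata.normalize("NFKD", tag.strip().lower())
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def map_tags(tags, vocabulary=VOCABULARY):
    """Split tags into those that map to a concept and those that do not."""
    mapped, unmapped = {}, []
    for tag in tags:
        concept = vocabulary.get(normalize(tag))
        if concept is None:
            unmapped.append(tag)  # kept as a free text tag
        else:
            mapped.setdefault(concept, []).append(tag)
    return mapped, unmapped
```

In this sketch, spelling variants and inflected forms end up under one concept, while tags without a vocabulary entry are kept separately rather than discarded, since their relevance may differ per user group.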

Page 16: Crowdsourcing metadata for audiovisual collections

Evaluation

Page 17: Crowdsourcing metadata for audiovisual collections

Evaluation

Page 18: Crowdsourcing metadata for audiovisual collections

Source: Jakob Nielsen’s Alertbox, 9 October 2006

Page 19: Crowdsourcing metadata for audiovisual collections

‘Fun’ + Competition + Altruism + Content + Reward + … = Motivation

Page 20: Crowdsourcing metadata for audiovisual collections

                      Waisda?            Woordentikkertje
Months                8                  4.5
Videos                648                2,892
Players               2,435              689
Tags – total          428,832            392,860
Tags – unique         48,242 (11%)       43,407 (11%)
Matches
• Players             156,546 (37%)      215,156 (55%)
• Geo. names*         6,089 (1.4%)       23,142 (5.8%)
• Persons*            107 (0.25%)        2,423 (0.6%)

* For Waisda? we looked at unique tags, for Woordentikkertje at the total number of tags

Page 21: Crowdsourcing metadata for audiovisual collections

Tips and lessons learned so far

• What are your success criteria?
• How do you define your target users, and how do you reach them?
• How do you motivate your target users?

• Read existing reports and literature!
• Keep learning and improving!

Page 22: Crowdsourcing metadata for audiovisual collections

And beyond…

Page 23: Crowdsourcing metadata for audiovisual collections
Page 24: Crowdsourcing metadata for audiovisual collections
Page 25: Crowdsourcing metadata for audiovisual collections

Future work

• Open Source version of Waisda?
• Crowdsourcing Olympics
• More research into the added value of tags for retrieval (subtitle comparison, tests with various end users, more research on linking semantically rich sources to tags)

Page 26: Crowdsourcing metadata for audiovisual collections

...recommended sources: blogs, feeds, people

• http://museumtwo.blogspot.com/
• http://80gb.wordpress.com/
• http://themuseumofthefuture.com/
• http://www.delicious.com/RuncocoProject/
• @ammeveleigh
• @archivesopen
• @digitalst
• @microtask
• @mia_out
• @museweb
• @runcoco
• @wittylama

This presentation is partly based on Oomen & Aroyo 2011: http://www.slideshare.net/PaulaUdondek/crowdsourcing-in-het-cultureel-erfgoed-kansen-uitdagingen

Page 27: Crowdsourcing metadata for audiovisual collections

Thanks!

@lottebelice / [email protected]

Big thank you to:
B&G: @johanoomen / @mbrinkerink
VU: @laroyo / @McHildebrand

http://blog.waisda.nl
http://woordentikkertje.manbijthond.nl