Systematization of Crowdsourcing for Data Annotation
Aobo, Feb. 2010
Outline
- Overview
- Related Work
- Analysis and Classification
- Recommendations
- Future Work
- Conclusions
- References
Overview
Contributions
- Provide a faceted analysis of existing crowdsourcing annotation applications.
- Discuss recommendations on how practitioners can take advantage of crowdsourcing.
- Discuss the potential opportunities in this area.
Definitions
- Crowdsourcing
- GWAP (Game With A Purpose)
- Distributed Human-based Computation
- AMT (Amazon Mechanical Turk)
- HIT (Human Intelligence Task); a HIT-creation sketch follows below
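To make the last two terms concrete, here is a minimal sketch of posting a single HIT to AMT. It uses Amazon's boto3 Python SDK, which postdates this 2010 talk; the title, reward, and question form are hypothetical placeholders, not values from any system surveyed here.

import boto3

# Requester-side client; the sandbox endpoint lets you test without paying workers.
mturk = boto3.client(
    "mturk",
    region_name="us-east-1",
    endpoint_url="https://mturk-requester-sandbox.us-east-1.amazonaws.com",
)

# A real HIT needs a QuestionForm or ExternalQuestion XML document here;
# this placeholder only marks where it goes.
question_xml = "<QuestionForm>...</QuestionForm>"

hit = mturk.create_hit(
    Title="Choose the correct sense of the highlighted word",  # hypothetical task
    Description="Word sense disambiguation for a research corpus.",
    Reward="0.05",                    # USD per assignment, passed as a string
    MaxAssignments=5,                 # redundant labels enable later voting
    AssignmentDurationInSeconds=300,  # time a worker gets per assignment
    LifetimeInSeconds=86400,          # how long the HIT stays on the market
    Question=question_xml,
)
print(hit["HIT"]["HITId"])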
Related Work
“A Taxonomy of Distributed Human Computation”
- Authors: A. J. Quinn and B. B. Bederson
- Year: 2009

Contributions
- Divide DHC applications into seven genres.
- Propose six dimensions to help characterize the different approaches.
- Propose recommendations and future directions.
Related Work
“A Survey of Human Computation Systems”
- Authors: M.-C. Yuen, L.-J. Chen, and I. King
- Year: 2009

Contributions
- Survey various human computation systems individually.
- Compare GWAPs based on game structure, verification method, and game mechanism.
- Present performance issues of GWAPs.
Analysis and Classification: Dimensions
Analysis and Classification: GWAP
- High scores: GUI design, Implementation cost, Annotation speed
- Low scores: Annotation cost, Difficulty, Participation time, Domain coverage, Popularization
- Medium scores: Annotation accuracy, Data size

NLP tasks:
- Word Sense Disambiguation
- Coreference Annotation
Analysis and Classification: AMT
- High scores: Annotation cost
- Low scores: GUI design, Implementation cost, Number of participants, Data size
- Medium scores: Popularization, Difficulty, Domain coverage, Participation time, Annotation accuracy

NLP tasks:
- Parsing
- Part-of-Speech Tagging
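AMT's middling annotation accuracy is usually compensated by redundancy: each item is labeled by several workers and the noisy labels are aggregated, as Snow et al. (2008) do with majority voting. A minimal sketch with made-up labels:

from collections import Counter

def majority_vote(labels):
    # most_common(1) yields the (label, count) pair with the highest count.
    return Counter(labels).most_common(1)[0][0]

# Hypothetical POS-tagging HIT: five worker labels per token.
worker_labels = {
    "token_1": ["NOUN", "NOUN", "VERB", "NOUN", "NOUN"],
    "token_2": ["VERB", "VERB", "VERB", "NOUN", "VERB"],
}

for item, labels in worker_labels.items():
    print(item, "->", majority_vote(labels))  # token_1 -> NOUN, token_2 -> VERB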
Analysis and Classification: Wisdom of Volunteers
- High scores: Number of participants, Data size, Difficulty, Participation time
- Low scores: GUI design, Fun
- Medium scores: Implementation cost, Annotation accuracy

NLP tasks:
- Paraphrasing
- Machine Translation
- Summarization
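The faceted scores from the three slides above can also be packed into a single lookup table, which makes it easy to ask which approach suits a given dimension. The sketch below simply re-encodes the slide contents; it is not part of the original analysis.

# Scores taken directly from the preceding slides.
SCORES = {
    "GWAP": {
        "GUI design": "high", "Implementation cost": "high", "Annotation speed": "high",
        "Annotation cost": "low", "Difficulty": "low", "Participation time": "low",
        "Domain coverage": "low", "Popularization": "low",
        "Annotation accuracy": "medium", "Data size": "medium",
    },
    "AMT": {
        "Annotation cost": "high",
        "GUI design": "low", "Implementation cost": "low",
        "Number of participants": "low", "Data size": "low",
        "Popularization": "medium", "Difficulty": "medium", "Domain coverage": "medium",
        "Participation time": "medium", "Annotation accuracy": "medium",
    },
    "Wisdom of Volunteers": {
        "Number of participants": "high", "Data size": "high",
        "Difficulty": "high", "Participation time": "high",
        "GUI design": "low", "Fun": "low",
        "Implementation cost": "medium", "Annotation accuracy": "medium",
    },
}

def best_approaches(dimension):
    # Return the approaches that score "high" on the given dimension.
    return [a for a, dims in SCORES.items() if dims.get(dimension) == "high"]

print(best_approaches("Data size"))  # ['Wisdom of Volunteers']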
Recommendations: GWAP
- Submit GWAP games to popular game websites that host and recommend new games to players.
- Build a uniform game development platform.
Recommendations: AMT
- Make the tasks fun.
- Rank the employers by their contribution.
- Reward employers who provide original data to be annotated.
- Donate all or part of the proceeds to a charity.
Recommendations: Wisdom of Volunteers
- Rank the users by their contribution.
- Push the tasks to the public.
Conclusions
- Propose dimensions for characterizing existing crowdsourcing annotation applications.
- Discuss recommendations for each crowdsourcing approach.
- Discuss the potential opportunities in this area.
References
1. Alexander J. Quinn and Benjamin B. Bederson. 2009. A taxonomy of distributed human computation.
2. Aniket Kittur, Ed H. Chi, and Bongwon Suh. 2008. Crowdsourcing user studies with Mechanical Turk.
3. Rion Snow, Brendan O’Connor, Daniel Jurafsky, and Andrew Y. Ng. 2008. Cheap and fast – but is it good? Evaluating non-expert annotations for natural language tasks.
4. A. Sorokin and D. Forsyth. 2008. Utility data annotation with Amazon Mechanical Turk.
5. Luis von Ahn and Laura Dabbish. 2008a. Designing games with a purpose. Communications of the ACM, 51(8):58–67.
6. Luis von Ahn and Laura Dabbish. 2008b. General techniques for designing games with a purpose. Communications of the ACM, 51(8):58–67.
7. Man-Ching Yuen, Ling-Jyh Chen, and Irwin King. 2009. A survey of human computation systems. IEEE International Conference on Computational Science and Engineering, 4:723–728.