These are slides from a presentation I gave on the TurKit paper and toolkit by Greg Little and others at MIT.
TurKit: Tools for Iterative Tasks on Mechanical Turk
Paper by Greg Little, Lydia B. Chilton, Rob Miller, and Max Goldman (MIT CSAIL)
Presented by Sanjay Kairam (Stanford)
Human Computation
•There are still some tasks that are too difficult for computers to do well.
• Examples:
▫ Labeling Images
▫ Tagging Documents
▫ Proofreading Text
▫ Writing Novel Content
•Simple solution: Get humans to do it!
Motivation and Participation
• Why would humans want to do these things?
▫ Reputation (Q&A Sites, Review Sites)
▫ Contribution (Wikipedia)
▫ Fun (Games with a Purpose)
▫ $$$
Amazon Mechanical Turk
• Marketplace for “Human Intelligence Tasks” (HITs).
• Small amounts of money per task.
Traditional Workflow
[Diagram: Requester posts HIT group to Mechanical Turk → HITs → data collected in CSV file → data exported for use]
Traditional Workflow: Pros & Cons
• Easy to run simple, parallelized tasks.
• Not so easy to run tasks in which turkers improve on or validate each other's work.
• TurKit to the rescue!
Iterative Improvement and Voting
[Diagram: text → Improve → Vote (improved vs. original) → output]
“Improve” and “Vote” Tasks
• Improve-HIT: Improve a given body of text according to some instructions.
•Vote-HIT: Choose which of two bodies of text is better according to some criteria.
• “Improve Task”
▫ Please improve the description for this image.
▫ People will vote whether to approve your changes.
▫ Use no more than 500 characters.
• “Vote Task”
▫ Please select the better description for this image.
▫ Your vote must agree with the majority to be approved.
▫ Differences are highlighted in yellow.
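The approval rule in the Vote task ("your vote must agree with the majority") can be sketched as a small tally. This is only an illustration: `tallyVotes`, the vote record shape, and the tie rule (ties keep the original text) are assumptions made for this sketch, not part of TurKit or the paper.

```javascript
// Hypothetical helper, not a TurKit function: tally votes between two texts
// and work out which individual votes agree with the majority.
function tallyVotes(votes) {
  // votes: array of { workerId, choice }, where choice is "original" or "improved"
  const counts = { original: 0, improved: 0 };
  for (const v of votes) {
    counts[v.choice] += 1;
  }
  // Assumption for this sketch: a tie keeps the original text.
  const winner = counts.improved > counts.original ? "improved" : "original";
  const approved = votes.filter(v => v.choice === winner).map(v => v.workerId);
  const rejected = votes.filter(v => v.choice !== winner).map(v => v.workerId);
  return { winner, counts, approved, rejected };
}

const result = tallyVotes([
  { workerId: "w1", choice: "improved" },
  { workerId: "w2", choice: "original" },
  { workerId: "w3", choice: "improved" },
]);
console.log(result.winner, result.approved, result.rejected);
```

Here w1 and w3 agree with the majority and would be approved, while w2 would not.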
Image Description
• Version 1:
▫ A parial view of a pocket calculator together with some coins and a pen.
Image Description
• Version 2:
▫ A view of personal items a calculator, and some gold and copper coins, and a round tip pen, these are all pocket and wallet sized item used for business, writing, calculating prices or solving math problems and purchasing items.
• [Voted down in favor of Version 1]
Image Description
• Version 3:
▫ A close-up photograph of the following items:
▫ A CASIO multi-function calculator
▫ A ball point pen, uncapped
▫ Various coins, apparently European, both copper and gold
▫ Seems to be a theme illustration for a brochure or document cover treating finance, probably personal finance.
Image Description
• Version 4:
▫ [Builds on Version 3]
▫ …Various British coins; two of £1 value, three of 20p value and one of 1p value...
• Further versions continued to iterate on Version 3.
Experiment: Iterative vs. Single HIT
• 11 image description tasks
• In each task:
▫ Image selected randomly from set of 10
▫ Budget chosen (either $0.25 or $0.50)
• Iterative HIT Condition: $0.02/Improve & $0.01/Vote
• Single HIT Condition: Single HIT with $0.25 or $0.50 Reward
Experiment: Iterative vs. Single HIT
• 20 Turkers voted on each pair.
• Votes favored the iteratively generated description in 9 out of 11 cases (82%).
Writing Tasks: Outline to Prose
• Improve letter to convey this outline:
▫ Somehow I need this to sound diplomatic: I am grateful for job offer; I got another offer; they pay $10,000 more; I’ll work for you if you can match their offer
• Version 1:
▫ I want to thank you for your gracious job offer. However, I have had another job offer willing to pay me $10,000 more for the same job. But, I am willing to forgoe that job offer because working with your company would be a great opportunity, that is, if you would like to match their offer.
Writing Tasks: Outline to Prose
• Version 3 (same outline):
▫ I would like to thank you very much for your offer, however, I have received a better package from another firm. I still would like to work for company and would like to speak with you to see if there is anyway we could make that occur.
Writing Tasks: Outline to Prose
• Version 9 (same outline):
▫ Thank you very much for your offer. At this time, I have a competing offer on the table at a higher salary. If the remuneration packages were equal I would much prefer to work for yourselves, and I am open to discussing the possibility of your matching this offer. If this is possible, please contact me at your earliest convenience.
Handwriting Recognition
• Version 1:
▫ You (?) (?) (?) (work). (?) (?) (?) work (not) (time). I (?) (?) a few grammatical mistakes. Overall your writing style is a bit too (phoney). You do (?) have good (points), but they got lost amidst the (writing). (signature)
Handwriting Recognition
• Version 2:
▫ You (?) (?) (?) (work). (?) (?) (?) work (not) (time). I (?) (?) a few grammatical mistakes. Overall your writing style is a bit too (phoney). You do (?) have good (points), but they got lost amidst the (writing). (signature)
Handwriting Recognition
• Version 4:
▫ You (misspelled) (several) (words). (?) (?) (?) work next (time). I also notice a few grammatical mistakes. …
Handwriting Recognition
• Version 5:
▫ You (misspelled) (several) (words). (Plan?) (spellcheck) (your) work next time. I also notice a few grammatical mistakes. Overall your writing style is a bit too phoney. You do make some good (points), but they got lost amidst the (writing). (signature)
Handwriting Recognition
• Final Version:
▫ You (misspelled) (several) (words). Please spellcheck your work next time. I also notice a few grammatical mistakes. Overall your writing style is a bit too phoney. You do make some good (points), but they got lost amidst the (writing). (signature)
Task Time & Cost

                              Latency (min)       Turker time (min)     Cost
                              per iteration       per HIT               per iteration
Task              Iterations  Improve  All Votes  Improve  Single Vote  Improve  All Votes  Total Cost
Outline to prose  10          23.98    59.64      4.15     0.43         $0.05    $0.027     $0.38
Active Voice      13          37.47    7.77       5.22     0.23         $0.05    $0.027     $0.39
Grammatical Tense 7           9.45     16.38      1.56     0.35         $0.02    $0.022     $0.18
Handwriting       9           21.20    14.57      3.30     0.38         $0.05    $0.023     $0.46
Brainstorming     24          13.34    7.81       1.37     0.32         $0.02    $0.024     $0.88
The TurKit Toolkit
• Arrows indicate the flow of information.
• Programmer writes 2 sets of source code:
▫ HTML files for web servers
▫ JavaScript executed by TurKit
• Output is retrieved via a JavaScript database.
[Diagram components: Programmer, *.html (Web Server), *.js (TurKit), Mechanical Turk, Turkers, JavaScript Database]
TurKit APIs
• MTurk API: JavaScript wrapper for the MTurk API
• Trace API: Uses the database to store information about program execution
• Utility API: Covers some common higher-level MTurk tasks.
▫ waitForHIT: accepts a HIT ID and returns a JavaScript object containing answers.
▫ vote: manages a HIT where turkers vote between two or more options.
▫ sort: takes two parameters and a comparator
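One way to picture the Trace API is as a record of steps that have already completed, so that re-running the script does not repeat paid work such as posting a HIT. The sketch below is an assumption-laden illustration: `traceDb`, `memoStep`, and the key names are hypothetical and are not TurKit's actual interface, and the in-memory `Map` merely stands in for the JavaScript database.

```javascript
// Stand-in for the JavaScript database; a real run would persist this
// between executions of the script.
const traceDb = new Map();

// Hypothetical helper: run `step` only if no result is recorded under `key`.
function memoStep(key, step) {
  if (traceDb.has(key)) {
    return traceDb.get(key); // replay the recorded result
  }
  const value = step();
  traceDb.set(key, value);
  return value;
}

let hitsCreated = 0;

function script() {
  // Pretend this step posts a paid HIT; it must not run again on a re-run.
  return memoStep("create-improve-hit", () => {
    hitsCreated += 1;
    return "HIT-" + hitsCreated;
  });
}

const firstRun = script();
const secondRun = script(); // simulated re-run of the whole script
console.log(firstRun, secondRun, hitsCreated);
```

Both runs return the same HIT identifier and only one HIT is "created", which is the property a stored execution trace is meant to provide.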
TurKit Demo
Iterative Text Improvement
TurKit Demo: Properties
• Fields to define:
▫ Mode = {Sandbox, Offline, Real}
▫ maxMoney & maxHITs = budget, HIT limits
▫ repeatInterval = wait time before re-running script
• You will also need:
▫ AWS Developer Access Key
▫ AWS Secret Key
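To make the budget fields concrete, here is a minimal sketch of how limits like these might be checked before posting another HIT. The field names follow the slide; the values, the `canPostHit` helper, the use of cents, and the units of `repeatInterval` are all assumptions for illustration and do not describe how TurKit enforces its limits internally.

```javascript
// Field names follow the slide; the values are made up for illustration.
const properties = {
  mode: "Sandbox",     // one of "Sandbox", "Offline", "Real"
  maxMoney: 0.25,      // total budget in dollars
  maxHITs: 20,         // maximum number of HITs to create
  repeatInterval: 60,  // wait before re-running the script (units assumed)
};

// Hypothetical guard: would posting one more HIT stay within both limits?
// Amounts are tracked in whole cents to avoid floating-point drift.
function canPostHit(props, spentCents, hitsPosted, rewardCents) {
  const budgetCents = Math.round(props.maxMoney * 100);
  const withinMoney = spentCents + rewardCents <= budgetCents;
  const withinCount = hitsPosted + 1 <= props.maxHITs;
  return withinMoney && withinCount;
}

const fitsBudget = canPostHit(properties, 23, 9, 2);    // 23 + 2 = 25 cents
const exceedsBudget = canPostHit(properties, 24, 9, 2); // 24 + 2 = 26 cents
console.log(fitsBudget, exceedsBudget);
```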
TurKit Demo: Code
    // improve text
    var hitId = createImproveHIT(text, 0.02)
    var hit = mturk.waitForHIT(hitId)
    var newText = hit.assignments[0].answer.newText
    print("-------------")
    print(newText)
    print("-------------")
TurKit Demo: Code
    // verify improvement
    if (vote(text, newText, 0.01)) {
        text = newText
        mturk.approveAssignment(hit.assignments[0])
        print("\nvote = keep\n")
    }
    else {
        mturk.rejectAssignment(hit.assignments[0])
        print("\nvote = reject\n")
    }
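The improve-then-vote step from the demo is meant to run repeatedly. The following is a self-contained sketch of that loop which can be run without Mechanical Turk: the `improve` and `vote` callbacks are mock stand-ins for the HITs, and `iterativeImprove` is a hypothetical name, not a TurKit function.

```javascript
// Run a fixed number of improve/vote rounds. `improve` proposes a new text,
// `vote` returns true if the new text should replace the current one.
function iterativeImprove(initialText, iterations, improve, vote) {
  let text = initialText;
  const history = [];
  for (let i = 1; i <= iterations; i++) {
    const candidate = improve(text, i);
    const kept = vote(text, candidate);
    history.push({ iteration: i, candidate, kept });
    if (kept) {
      text = candidate; // a rejected candidate leaves the text unchanged
    }
  }
  return { text, history };
}

// Mock workers: each "improvement" appends a detail, and the mock voters
// reject any candidate longer than 40 characters.
const run = iterativeImprove(
  "A calculator.",
  3,
  (text, i) => text + " Detail " + i + ".",
  (oldText, newText) => newText.length <= 40
);
console.log(run.text);
console.log(run.history.map(h => h.kept));
```

With these mocks the first two candidates are kept and the third is rejected, mirroring how a voted-down version (such as Version 2 of the image description) is discarded while later iterations build on the last accepted text.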
TurKit Demo: Code
•HITs are created using XML schemas defined by Amazon.
• Referred to by URL:
▫ Example: http://mechanicalturk.amazonaws.com/AWSMechanicalTurkDataSchemas/2005-10-01/QuestionForm.xsd