Crowdsourcing with Amazon Mechanical Turk Alex Quinn and Tom Yeh HCIL Symposium May 25, 2011


Interface options

• Easy web-based tool
  – http://requester.mturk.com

• REST and SOAP APIs (HTTP)
• Wrapper SDKs: Java, .NET, Perl, PHP, Ruby
• Command line tool
• TurKit (JavaScript)
• Boto (Python)
• CrowdLib (Python)

Posting HITs – external

Your URL in an IFRAME

Your form must route results to Amazon.

<form method="post" action="https://mturk.com/...">
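A minimal sketch of how a page served in the IFRAME might render such a form. The function name, the template, and the example URL are hypothetical. The hidden assignmentId field reflects the usual external-HIT convention of echoing back the assignment ID the page received, and should be checked against the API documentation. The submit URL is passed in rather than hard-coded because the slide elides it.

```python
from html import escape
from string import Template

# Hypothetical template; "food_name" mirrors the field name used in the talk's examples.
FORM_TEMPLATE = Template("""\
<form method="post" action="$submit_url">
  <input type="hidden" name="assignmentId" value="$assignment_id">
  <label>What food is $state known for?
    <input type="text" name="food_name">
  </label>
  <input type="submit" value="Submit">
</form>""")


def render_form(submit_url, assignment_id, state):
    """Return the HTML form for one assignment, escaping every inserted value."""
    return FORM_TEMPLATE.substitute(
        submit_url=escape(submit_url, quote=True),
        assignment_id=escape(assignment_id, quote=True),
        state=escape(state),
    )


if __name__ == "__main__":
    # The URL below is a placeholder, not the real Amazon endpoint.
    print(render_form("https://example.invalid/submit", "ABC123", "Maine"))
```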

Posting HITs – standard

QuestionForm XML

Standard vs. External HITs

Standard

• Amazon hosts everything

• Content: HTML, Flash, applets

• Answers: Standard controls

• No JavaScript / CSS

• More secure

• Easier to get started

External

• Requires your own web server

• Content: anything!

• Answers: Your CGI form fields

• Inner scrolling (potentially)

• More flexible

• Easier to do fancy things

API Documentation

Question XML

<QuestionForm xmlns="...">
  <Question>
    <QuestionIdentifier>food_name</QuestionIdentifier>
    <IsRequired>true</IsRequired>
    <QuestionContent>
      <Text>What food is Maine known for?</Text>
    </QuestionContent>
    <AnswerSpecification>
      <FreeTextAnswer></FreeTextAnswer>
    </AnswerSpecification>
  </Question>
</QuestionForm>
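Writing this XML by hand gets tedious once there are many questions. The sketch below generates the same structure with Python's standard library. The function name is hypothetical, and the namespace is left as an optional argument because the slide elides it. Take the real value from the API documentation.

```python
import xml.etree.ElementTree as ET


def build_question_form(identifier, prompt, namespace=None):
    """Build a one-question QuestionForm with a free-text answer field."""
    # The xmlns value is elided on the slide, so it is only set when supplied.
    attrs = {"xmlns": namespace} if namespace else {}
    form = ET.Element("QuestionForm", attrs)
    question = ET.SubElement(form, "Question")
    ET.SubElement(question, "QuestionIdentifier").text = identifier
    ET.SubElement(question, "IsRequired").text = "true"
    content = ET.SubElement(question, "QuestionContent")
    ET.SubElement(content, "Text").text = prompt  # special characters are escaped on output
    spec = ET.SubElement(question, "AnswerSpecification")
    ET.SubElement(spec, "FreeTextAnswer")
    return ET.tostring(form, encoding="unicode")


if __name__ == "__main__":
    print(build_question_form("food_name", "What food is Maine known for?"))
```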

Example - TurKit

// questionXML holds the QuestionForm XML shown earlier;
// stateName is the state being asked about (here, "Maine").
var params = {
    title : "Find food for a state",
    desc : "Given a US state find a food it is known for",
    question : questionXML,
    reward : 0.01,
    maxAssignments : 3
};

var hit = mturk.createHIT(params);

hit = mturk.getHIT(hit, true);

for (var j = 0; j < hit.assignments.length; j++) {
    var foodName = hit.assignments[j].answer.food_name;
    print(stateName + " : " + foodName);
}

Example - CrowdLib

import crowdlib as cl, crowdlib_settings

hit_type = cl.create_hit_type("Find food for a state", "Given a US state find a food it is known for", 0.01)

fields = [cl.text_field("What food is Maine known for?")]

hit = hit_type.create_hit(fields, max_assignments=3)

Example – Command Line

• Many little files!
  – state_foods.question: question XML
  – state_foods.input: parameters (i.e. states)
  – state_foods.properties: title, description, etc.
  – state_foods_start.bat: calls loadHITs.bat
  – state_foods_get_results.bat: fetch results
  – state_foods_review_results.bat: approve/reject

• Creates even more…
  – state_foods.success: contains the HIT IDs
  – state_foods.results: all result details

Example – Command Line: state_foods.question

<QuestionForm xmlns="...">
  <Question>
    <QuestionIdentifier>food_name</QuestionIdentifier>
    <IsRequired>true</IsRequired>
    <QuestionContent>
      <Text>What food is Maine known for?</Text>
    </QuestionContent>
    <AnswerSpecification>
      <FreeTextAnswer/>
    </AnswerSpecification>
  </Question>
</QuestionForm>

Example – Command Line: state_foods.input

state
Alabama
Alaska
Arizona
Arkansas
California
Colorado
Connecticut
Delaware
District of Columbia
Florida
Georgia
Hawaii
Idaho
Illinois
Indiana
Iowa
Kansas
Kentucky
Louisiana
Maine
...
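The input file supplies one row of parameters per HIT, and a placeholder such as ${state} is filled in from the column of the same name. The sketch below imitates that substitution with Python's string.Template, which happens to use the same ${name} syntax. The function name, the tab-separated layout, and the template text are assumptions for illustration, not the command line tool's actual implementation.

```python
from string import Template


def expand_rows(template_text, input_text):
    """Fill the template once per data row, using the header row as variable names."""
    lines = [line for line in input_text.splitlines() if line.strip()]
    header = lines[0].split("\t")  # assumption: columns are tab-separated
    template = Template(template_text)
    return [
        template.substitute(dict(zip(header, line.split("\t"))))
        for line in lines[1:]
    ]


if __name__ == "__main__":
    sample_input = "state\nLouisiana\nMaine\n"
    for text in expand_rows("What food is ${state} known for?", sample_input):
        print(text)
```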

Example – Command Line: state_foods.properties

title:Find food for a state

description:Given a US state find a food it is known for

keywords:culture, food, search

reward:0.02

assignments:1

annotation:${state}

# time limit: 30 minutes
assignmentduration:1800

# expires in: 3 days
hitlifetime:259200

# auto-pay after: 15 days
autoapprovaldelay:1296000
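All three time settings are given in seconds, so it is worth checking the arithmetic before posting. The sketch below parses key:value lines like those above and converts the durations. The function names are hypothetical. Note that 1296000 seconds works out to 15 days.

```python
def parse_properties(text):
    """Parse simple key:value lines, skipping blank lines and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition(":")  # split on the first colon only
        props[key.strip()] = value.strip()
    return props


def seconds_to_days(seconds):
    """Convert a duration in seconds to days (86400 seconds per day)."""
    return int(seconds) / 86400


if __name__ == "__main__":
    sample = "assignmentduration:1800\nhitlifetime:259200\nautoapprovaldelay:1296000\n"
    props = parse_properties(sample)
    print(int(props["assignmentduration"]) / 60, "minutes to complete")
    print(seconds_to_days(props["hitlifetime"]), "days until the HIT expires")
    print(seconds_to_days(props["autoapprovaldelay"]), "days until auto-approval")
```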

Example – Command Line: state_foods_start.bat

pushd bin

call loadHITs %1 %2 %3 %4 %5 ^
    -label ..\sf\state_foods ^
    -input ..\sf\state_foods.input ^
    -question ..\sf\state_foods.question ^
    -properties ..\sf\state_foods.properties

popd

Example – Command Line: state_foods_get_results.bat

pushd ..\bin

call getResults %1 %2 %3 %4 %5 ^
    -successfile ..\sf\state_foods.success -outputfile ..\sf\state_foods.results

popd

Example – Command Line: state_foods_review_results.bat

pushd ..\..\bin

call reviewResults %1 %2 %3 %4 %5 ^
    -resultsfile ..\sf\state_foods.results

popd

Example – Command Line: state_foods.success

hitid                           hittypeid
11MFAR2I9GUBPPX3H302KECWDGG45C  1MQHAMAJ67VR562077Y5NLM6WNOIO8
1N2WE274D6AYYDS5F6ENYRH1NR7JAO  1MQHAMAJ67VR562077Y5NLM6WNOIO8
1KBVYNQOT8Y85Z3T3H6YTS1CV3596T  1MQHAMAJ67VR562077Y5NLM6WNOIO8
1NPDB0OPE4D7JGXHYWSMAPKKCVAELG  1MQHAMAJ67VR562077Y5NLM6WNOIO8
1HNBWMEOVNP9NL2O6SYS8WM19JGNIZ  1MQHAMAJ67VR562077Y5NLM6WNOIO8
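The success file is what later steps use to find the HITs again. A short sketch of pulling the HIT IDs out of it follows. It assumes whitespace-separated columns with a header row, as the listing suggests, and the function name is hypothetical.

```python
def read_hit_ids(text):
    """Return the values of the hitid column from a success-file listing."""
    rows = [line.split() for line in text.splitlines() if line.strip()]
    header, data = rows[0], rows[1:]
    column = header.index("hitid")  # locate the column by name, not position
    return [row[column] for row in data]


if __name__ == "__main__":
    sample = (
        "hitid hittypeid\n"
        "11MFAR2I9GUBPPX3H302KECWDGG45C 1MQHAMAJ67VR562077Y5NLM6WNOIO8\n"
        "1N2WE274D6AYYDS5F6ENYRH1NR7JAO 1MQHAMAJ67VR562077Y5NLM6WNOIO8\n"
    )
    for hit_id in read_hit_ids(sample):
        print(hit_id)
```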

Example – Command Line: state_foods.results

"hitid" "hittypeid" "title" "description" "keywords" "reward" "creationtime" "assignments" "numavailable" "numpending" "numcomplete" "hitstatus" "reviewstatus" "annotation" "assignmentduration" "autoapprovaldelay" "hitlifetime" "viewhit" "assignmentid" "workerid" "assignmentstatus" "autoapprovaltime" "assignmentaccepttime" "assignmentsubmittime" "assignmentapprovaltime" "assignmentrejecttime" "deadline" "feedback" "reject" "Answer.answer"

"11MFAR2I9GUBPPX3H302KECWDGG45C" "1MQHAMAJ67VR562077Y5NLM6WNOIO8" "Find food for a state" "Given a US state find a food it is known for" "US culture, food, search" "$0.02" "Mon Dec 06 21:50:25 EST 2010" "1" "0" "0" "0" "Reviewable" "NotReviewed" "Maine" "3600" "1296000" "Thu Dec 09 21:50:25 EST 2010" "http://requester.mturk.com/mturk/manageHIT?HITId=11MFAR2I9GUBPPX3H302KECWDGG45C" "1SPKY563G67J32JCIGFCFF0CVAKGF0" "A1JYO3377GTS1O" "Submitted" "Tue Dec 21 21:52:56 EST 2010" "Mon Dec 06 21:51:26 EST 2010" "Mon Dec 06 21:52:56 EST 2010" "" "" "" "" "" "lobster"

Reviewing assignments

• Approve
• Reject
• Grant bonus

Word to the wise: preview the results in Excel!
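When a spreadsheet is not handy, the results file can also be skimmed with a few lines of code. The sketch below assumes the file is tab-delimited with double-quoted fields and uses the column names that appear in the example results (annotation and Answer.answer). The function name is hypothetical.

```python
import csv
import io


def read_answers(text):
    """Return (annotation, answer) pairs from a results-file listing."""
    # Assumption: fields are tab-delimited and wrapped in double quotes.
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    return [(row["annotation"], row["Answer.answer"]) for row in reader]


if __name__ == "__main__":
    sample = (
        '"hitid"\t"annotation"\t"assignmentstatus"\t"Answer.answer"\n'
        '"11MFAR2I9GUBPPX3H302KECWDGG45C"\t"Maine"\t"Submitted"\t"lobster"\n'
    )
    for state, food in read_answers(sample):
        print(state, ":", food)
```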

Worker issues

• Approval rating
• Blocked by multiple requesters → banned!!!
• Too many rejections → banned (???)

• My rule: If they made a reasonable attempt to follow the directions, then pay.
  – If quality is bad, get in touch.

Worker issues

• Indiscriminate rejections…

Qualification Requirements

• Qualification Type ~ HIT Type
• Qualification Requirement ~ HIT
• Qualification ~ Assignment

• Custom
• Built-in
• Granted

Informed Consent

Ways to receive results

• Ad-hoc script
• Polling
• Notification receptor
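Of these, polling is the easiest to sketch: ask for the assignments at a fixed interval until enough have arrived or a limit is reached. In the sketch below, fetch_assignments is a hypothetical callable standing in for whichever API wrapper is in use. The interval and poll limit are arbitrary defaults.

```python
import time


def poll_until_complete(fetch_assignments, expected,
                        interval_seconds=60, max_polls=10, sleep=time.sleep):
    """Call fetch_assignments repeatedly until `expected` results have arrived.

    Returns the last list fetched, which may be short if max_polls runs out.
    """
    assignments = []
    for attempt in range(max_polls):
        assignments = fetch_assignments()
        if len(assignments) >= expected:
            break
        if attempt < max_polls - 1:
            sleep(interval_seconds)  # wait before asking again
    return assignments


if __name__ == "__main__":
    # Simulated service: results trickle in over three polls.
    batches = iter([[], ["lobster"], ["lobster", "blueberries", "lobster"]])
    print(poll_until_complete(lambda: next(batches), expected=3, sleep=lambda s: None))
```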

Terminology review

• HIT Type
• HIT
• Assignment (HIT, worker)
• Answer: part of an Assignment

• Qualification Type
• Qualification Requirement
• Qualification (QR, worker)
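One way to keep the HIT-side terms straight is to write them down as data structures: a HIT Type carries the shared settings, a HIT belongs to one HIT Type, and an Assignment pairs a HIT with a worker and holds that worker's answer. The classes and field names below are an illustrative model only, not the API's actual schema. The qualification terms would follow the same pattern.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass(frozen=True)
class HitType:
    """Settings shared by many HITs."""
    title: str
    description: str
    reward: float


@dataclass
class Assignment:
    """One worker's attempt at one HIT; the answer is part of the assignment."""
    hit_id: str
    worker_id: str
    answer: Dict[str, str] = field(default_factory=dict)


@dataclass
class Hit:
    """A single task instance belonging to a HIT Type."""
    hit_id: str
    hit_type: HitType
    max_assignments: int
    assignments: List[Assignment] = field(default_factory=list)

    def add_assignment(self, worker_id, answer):
        """Record a worker's answer, respecting the assignment limit."""
        if len(self.assignments) >= self.max_assignments:
            raise ValueError("HIT already has its maximum number of assignments")
        assignment = Assignment(self.hit_id, worker_id, dict(answer))
        self.assignments.append(assignment)
        return assignment


if __name__ == "__main__":
    hit_type = HitType("Find food for a state",
                       "Given a US state find a food it is known for", 0.02)
    hit = Hit("HIT-1", hit_type, max_assignments=1)  # "HIT-1" is a made-up ID
    hit.add_assignment("WORKER-1", {"food_name": "lobster"})
    print(hit.assignments[0].answer["food_name"])
```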

HIT Type

HIT
