
Administering Panels on Amazon Mechanical Turk: A Guide to Within-Subjects Experiments

Kyle A. Dropp1

March 7, 2014

Click here for latest version. Comments Welcome!

This document provides step-by-step instructions for implementing multiple, complex treatments across survey waves. Data analysis will be conducted in R, surveys will be programmed in Qualtrics, and cases will be collected via Amazon Mechanical Turk (AMT), though the instructions are applicable to other samples.2 Note: this tutorial assumes very limited knowledge or familiarity with Qualtrics, Amazon Mechanical Turk, HTML or server applications, but more advanced applications require basic knowledge of PHP, SQL and MySQL. All supporting materials are available at kyleadropp.com/panels

1 Assistant Professor, Department of Government, Dartmouth College, [email protected]. I thank Solomon Messing for providing information on MTurkR: http://solomonmessing.wordpress.com/2013/06/24/streamline-your-mechanical-turk-workflow-with-mturkr/

2 These instructions can be utilized for implementing panels via Survey Sampling International, Inc. (SSI), for instance.


Contents

1. Why Panels?
2. Wave 1
2.1. Programming Survey in Qualtrics
2.1.1. Login
2.1.2. Uploading Survey
2.1.3. Obtaining Worker ID
2.1.4. Foreign Direct Investment Experiment
2.1.5. Confirmation Code
2.2. Programming in Amazon Mechanical Turk
2.3. Fielding Wave 1
2.3.1. Evaluating Results from Wave 1
3. Wave 2
3.0.2. Upload Wave 2
3.0.3. Branch Logic and Wave 2 Treatment
3.1. Managing Wave 2 from R
3.1.1. Install MTurkR
3.1.2. Enter Amazon Mechanical Turk credentials
3.1.3. Invite Wave 1 respondents to Wave 2
3.1.4. Compensating Respondents
4. Data Analysis

2 Materials: kyleadropp.com/panels

1. Why Panels?

Panels are a powerful tool that can identify changes in opinion (or behavior) over time, reinforce inferences made from single-shot, between-subjects experiments, increase the precision of statistical estimates, and address important methodological questions such as the rate of experimental decay. This tutorial provides step-by-step instructions for implementing multiple, complex treatments across survey waves.

2. Wave 1

2.1. Programming Survey in Qualtrics. Prior to creating our HIT in Amazon Mechanical Turk (AMT), we will program a survey in Qualtrics, a top-notch interface for programming surveys.

2.1.1. Login. Log in to your Qualtrics account (e.g., http://stanforduniversity.qualtrics.com/ for Stanford University, princeton.qualtrics.com for Princeton University, or tuck.qualtrics.com/ for Dartmouth College). University affiliates should have free access to an account, but see your department or university administrator if you have difficulty accessing Qualtrics.

2.1.2. Uploading Survey. Surveys for Waves 1 and 2 are pre-programmed for this tutorial. You can simply upload them to Qualtrics. On the ‘Edit Survey’ tab, select ‘Advanced Options’ on the far right, then select ‘Import Survey,’ and browse to the file ‘panel wave1 tutorial.qsf’ in the supporting materials folder. Choose the file and then select ‘Import.’ This survey is the first wave of the study and includes a basic experiment on how Americans’ preferences toward foreign direct investment vary based on the foreign country making the investment. The following subsections describe key segments of the uploaded survey.

2.1.3. Obtaining Worker ID. Each worker on AMT has an ID called a ‘Worker ID’ that we will obtain to prevent duplicate entries and to link individual responses across multiple waves. On the ‘Edit Survey’ tab, select ‘Survey Flow.’ There is an empty embedded data variable called ‘MID’ in the survey flow that captures each AMT Worker ID. When AMT workers enter the survey, their URL carries an ‘MID’ parameter, and this embedded data variable captures its value. For Survey Sampling International (SSI), the respondent identifier is typically ‘psid.’
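As a minimal sketch of how the identifier travels on the URL, the R lines below assemble a survey link for one worker and then read the ‘MID’ value back out of it. The base link and the Worker ID are made-up placeholders, not values from the tutorial materials.

```r
# Hypothetical placeholders: substitute your own Qualtrics link and a real Worker ID.
base_url  <- "http://example.qualtrics.com/SE/?SID=SV_xxxxxxxx"
worker_id <- "A1EXAMPLEWORKER"

# Append the Worker ID as the 'MID' query parameter; the embedded data
# variable named 'MID' in the Survey Flow is what picks this value up.
survey_url <- paste0(base_url, "&MID=", worker_id)

# Read the value back out of the link, as a sanity check on the format.
mid_in_url <- sub(".*[?&]MID=([^&]*).*", "\\1", survey_url)

survey_url
mid_in_url
```

If the value recovered from the link differs from the Worker ID you started with, the link is malformed and Qualtrics will not record the right identifier.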


2.1.4. Foreign Direct Investment Experiment. Respondents view a fake newspaper article describing a battery-making company that has been acquired. By random draw, the company purchasing the battery maker is Japanese, German, Chinese, or American. After viewing the article, respondents state whether they favor or oppose the acquisition. In a subsequent question battery, they state whether they believe the acquisition will harm national security, lead to job losses, or harm American culture and values. See below for a picture of the newspaper article in the ‘Japanese’ condition:

The two figures below, contained in the ‘Survey Flow,’ indicate that the country (‘randCountry’) is randomized among four options and the first option displays the Germany treatment. The other randomizations (‘randCountry’ = 2, 3, or 4) display the Chinese, German, and American treatments, respectively.


In Wave 2, the respondent will view a similar article but will be randomly assigned to another value of ‘randCountry.’ That is, the respondent who sees the Germany treatment in Wave 1 will have an equal probability of viewing the American, Chinese, or Japanese treatment in Wave 2. This design supports both between-subjects and within-subjects analysis.

Following the main post-test dependent variable, respondents answer a simple question battery to identify mechanisms for support or opposition to the acquisition.

2.1.5. Confirmation Code. We generate a confirmation code between 5,000,000 and 9,999,999 and ask respondents to enter the code into the AMT interface to receive payment.

A corresponding descriptive text question at the end of the survey provides the confirmation code generated from the web service. See the figure below.
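In the tutorial survey the code is generated inside Qualtrics. Purely to illustrate what a draw from that range looks like, here is a small R sketch; the function name draw_conf_code is hypothetical and is not part of the supporting materials.

```r
# Draw one whole number uniformly from 5,000,000 to 9,999,999 (inclusive).
# sample.int(5000000, 1) returns a value in 1..5,000,000, so adding
# 4,999,999 shifts it into the desired range.
draw_conf_code <- function() {
  4999999 + sample.int(5000000, 1)
}

set.seed(1)  # arbitrary seed, only so the example is reproducible
conf_code <- draw_conf_code()
conf_code
```

A seven-digit code from this range is long enough that a worker is unlikely to guess a valid one by chance.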


2.2. Programming in Amazon Mechanical Turk. Log in to Amazon Mechanical Turk as a ‘Requester.’ If you are not familiar with AMT, there are many good tutorials.3 Please note that jobs, also known as HITs, can also be set up with the R package ‘MTurkR.’4

Enter the ‘Create’ tab in Amazon Mechanical Turk, select ‘New Project,’ then select ‘Survey,’ and finally click ‘Create Project.’

Here are the details I suggest entering for this project. Enter the title ‘Answer a short survey,’ the description ‘Answer a short, fun survey,’ and the keyword ‘surveys.’ Set the reward to $0.20, assign 100 HITs, allot 1 hour to complete, have the HIT expire in 6 hours, and have the results automatically approved in 6 hours.

3 http://docs.aws.amazon.com/AWSMechTurk/latest/AWSMechanicalTurkGettingStartedGuide/Welcome.html

4 http://thomasleeper.com/MTurkR/


Select the ‘Advanced’ tab on the bottom right, select ‘Worker requirements,’ click ‘Customize Worker Requirements...’ in the drop-down menu, and specify Location as ‘United States,’ HIT Approval rate ‘greater than or equal to 95,’ and Number of HITs Approved ‘greater than or equal to 100.’

Select ‘Design Layout’ to move to the next pane, and click ‘OK’ when prompted regarding Master Workers. You will now add the content to the HIT. This will include a brief description of the survey, a link to the survey in Qualtrics, and a confirmation code for the AMT respondent to enter upon completion of the Qualtrics survey.


Open the ‘AMT wave1.html’ file in the supporting materials folder and click ‘Source’ on the right side of the ‘Design Layout’ page. Paste the entire HTML file into the body. This file includes code with a brief description of the project, commands to extract the AMT respondent’s unique Worker ID, and a confirmation code. Modify lines 1, 4, and 5, which pertain to survey length and eligibility. In line 28, change the quoted ‘href’ portion to the full Qualtrics link for the survey. You can find the link to your survey by clicking on the ‘Distribute Survey’ tab in Qualtrics.5

If you click ‘Source’ again, the body should look like this (figure below) and contain a warning message regarding JavaScript. You can ignore this warning message.

Click ‘Preview’ to preview the HIT and click ‘Finish’ to finish the programming portion of the HIT.

5 Thanks to these scholars for the HTML code - http://www.academia.edu/1803170/Screening_participants_from_previous_studies_on_Amazon_Mechanical_Turk_and_Qualtrics


2.3. Fielding Wave 1. Now click the ‘Create’ tab in AMT, find your Project, select ‘New Batch,’ click ‘Next,’ and then click ‘Publish HITs.’ You may need to add funds to your account. Your HIT is now live and you are collecting data! You can review the status of your project by clicking the ‘Manage’ tab. There are a number of methods for reviewing and approving completed HITs. From the least to the most automated: you can select the ‘Results’ tab on a given project and review the confirmation codes for completed HITs; you can merge a .csv of the results from Amazon with your Qualtrics .csv file (on the ‘confCode’ variable) and check whether each confirmation code is correct; or you can use the package ‘MTurkR’ to batch approve respondents who have provided the correct confirmation code.
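The second option, merging the two files on the confirmation code, might look like the sketch below. The data are invented, and the AMT column name ‘Answer.surveycode’ is an assumption; check the header of your own batch file for the column that holds the code workers typed in.

```r
# Toy stand-in for the AMT batch results file (all values invented).
amt <- data.frame(
  WorkerId          = c("A1", "A2", "A3"),
  AssignmentId      = c("asg1", "asg2", "asg3"),
  Answer.surveycode = c(5123456, 7000001, 1111111),  # hypothetical column name
  stringsAsFactors  = FALSE
)

# Toy stand-in for the Qualtrics download.
qualtrics <- data.frame(
  MID              = c("A1", "A2", "A3"),
  confCode         = c(5123456, 7000001, 8222222),
  stringsAsFactors = FALSE
)

# Merge on the confirmation code, keeping every AMT submission.
# Submissions whose code never appeared in Qualtrics get NA for 'MID'.
checked <- merge(amt, qualtrics,
                 by.x = "Answer.surveycode", by.y = "confCode",
                 all.x = TRUE)

# A submission passes if its code exists and belongs to the same worker.
checked$code_ok <- !is.na(checked$MID) & checked$MID == checked$WorkerId

checked[, c("WorkerId", "AssignmentId", "code_ok")]
```

Here worker A3 entered a code that Qualtrics never issued, so that submission is flagged for manual review rather than approval.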


2.3.1. Evaluating Results from Wave 1. After you have finished Wave 1 data collection, log in to Qualtrics, select ‘View Results,’ click ‘Download Data,’ scroll down and click the highlighted CSV to download a .csv.

The key variables in this file are the Worker ID (‘MID’), the Wave 1 treatment assignment (‘randCountry’), the confirmation code (‘confCode’), support for the acquisition (‘Q16’), and post-test mechanism questions (e.g., Q25 1 through Q25 5).

Using R,6 I randomized the Wave 2 treatment assignment (‘randCountry2’). This file has been saved as ‘wave2_assignment.csv’ online.

## Read file with completed responses from Dropbox public folder
## First read picks up the column names; second read skips the two header rows
df0 = read.csv("http://dl.dropboxusercontent.com/u/24660992/panels/data_wave1.csv",skip=0,head=T)
df1 = read.csv("http://dl.dropboxusercontent.com/u/24660992/panels/data_wave1.csv",skip=2,head=F)
names(df1)=names(df0)
## Add Randomization to Wave 2
df1$randCountry2 = rep(NA,dim(df1)[1]) ## create empty variable
for(i in 1:nrow(df1)){ ## loop through Wave 1 rows
  hline1 = df1$randCountry[i] ## identify Wave 1 treatment
  conds = 1:4 ## list of possible Wave 2 treatments
  df1$randCountry2[i] = sample(conds[conds!=hline1],1) ## sample from non-Wave 1 treatments
}
df1$randCountry2
write.csv(df1[order(df1$randCountry2),],"wave2_assignment.csv",row.names=F) # specify location
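Before uploading the assignments, it is worth confirming that no respondent was re-assigned to the Wave 1 treatment. The sketch below runs the same randomization on simulated data (200 invented respondents, arbitrary seed) and checks the result; the data frame name toy is hypothetical.

```r
set.seed(2014)  # arbitrary seed so the toy example is reproducible

# Simulated Wave 1 assignments for 200 invented respondents.
toy <- data.frame(randCountry = sample(1:4, 200, replace = TRUE))

# For each respondent, draw one of the three treatments not seen in Wave 1.
toy$randCountry2 <- vapply(
  toy$randCountry,
  function(w1) {
    others <- setdiff(1:4, w1)               # the three remaining treatments
    others[sample.int(length(others), 1)]    # pick one with equal probability
  },
  integer(1)
)

# Check 1: nobody repeats the Wave 1 treatment.
stopifnot(all(toy$randCountry2 != toy$randCountry))

# Check 2: the cross-tabulation should have zeros on its diagonal.
xtab <- table(wave1 = factor(toy$randCountry,  levels = 1:4),
              wave2 = factor(toy$randCountry2, levels = 1:4))
xtab
```

Indexing with sample.int() rather than calling sample() on the remaining values avoids a known R pitfall: sample(x, 1) treats a single number x as the range 1:x.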

In an ideal world, we would wait a few weeks to administer Wave 2 of this panel. For this tutorial, I will administer it less than 24 hours after Wave 1. Let’s get to it.

6 R can be downloaded here - http://cran.us.r-project.org/

3. Wave 2

This section will provide instructions for administering Wave 2. I typically use MySQL databases for this wave, but this tutorial does not use any servers and should be accessible to anyone with basic knowledge of R and Qualtrics. See the Appendix (or message me) for methods for administering Wave 2 treatments using MySQL.

3.0.2. Upload Wave 2. Enter Qualtrics and upload the survey ‘panel tutorial wave2.qsf’ in the same way you uploaded the first survey at the beginning of the tutorial (i.e., edit survey, advanced options, import survey). This survey contains the same questions and question blocks as Wave 1.

3.0.3. Branch Logic and Wave 2 Treatment. On the ‘Edit Survey’ tab, select ‘Survey Flow.’ Scroll down to see the new branch logic. Open the file ‘wave2_assignment.csv’, which is sorted by Wave 2 treatment assignment (‘randCountry2’) and also includes the AMT Worker ID. I have copied the AMT Worker IDs associated with each unique treatment into the logic of the Survey Flow.


We have assigned embedded data variables to each of the four possible Wave 2 treatment assignments. Now, when respondents enter the survey we will link their ‘MID’ to their Wave 2 treatment group. Next, we use branching logic to assign respondents to an appropriate treatment.
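Copying Worker IDs by hand is error-prone, so one option is to let R collect the IDs for each Wave 2 treatment into a single string that can be pasted into the matching branch. This is only a sketch on invented data; the separator your branch condition needs may differ from the comma used here.

```r
# Toy stand-in for 'wave2_assignment.csv' (Worker IDs are invented).
assignment <- data.frame(
  MID              = c("A1", "A2", "A3", "A4", "A5"),
  randCountry2     = c(2, 1, 2, 4, 3),
  stringsAsFactors = FALSE
)

# One comma-separated string of Worker IDs per Wave 2 treatment.
# The result is named by treatment value: "1", "2", "3", "4".
id_lists <- tapply(assignment$MID, assignment$randCountry2,
                   paste, collapse = ", ")

id_lists[["2"]]  # IDs to paste into the branch for randCountry2 = 2
```

Each element of id_lists corresponds to one branch in the Survey Flow.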

3.1. Managing Wave 2 from R.

3.1.1. Install MTurkR.

library(devtools)
## install.packages("MTurkR") if you don't have the package installed
require("MTurkR")

3.1.2. Enter Amazon Mechanical Turk credentials. You must provide two unique identifiers, your AWS Access Key ID and your AWS Secret Access Key, to control AMT from the command line.

## Sign into Amazon Mechanical Turk with AWS Access Key ID (‘xxxx’)

## and AWS Secret Access Key (‘yyyy’)

credentials(c("xxxx","yyyy"))

AccountBalance()

Here is how you acquire your AWS Access Key ID. First, go to http://aws.amazon.com/, select Security Credentials in the top right, sign into your account, click ‘Continue to Security Credentials,’ click the ‘Access Keys’ tab, and you will see your Access Key ID. Copy this into the ‘xxxx’ portion of the credentials command. This is the first value. Now, to find your AWS Secret Access Key, select ‘Security Credentials’ in the yellow box below your Access Key ID, select ‘Access Credentials,’ and click ‘Show’ under Secret Access Key. This is the AWS Secret Access Key. Copy this into the ‘yyyy’ portion of the credentials command.

3.1.3. Invite Wave 1 respondents to Wave 2. Now, you will contact workers to invite them to the Wave 2 panel. Use the ContactWorker() command to send messages to each respondent, specify a bonus, and add a subject to the email. A typical subject line is “Thanks for completing my HIT!”, a typical body is

“I will pay a $0.20 bonus if you complete a brief follow-up study. The survey can be completed at http://tuck.qualtrics.com/SE/?SID=SV_ePSM9ygoprOmMkZ&MID=xxxxx”

Please note the MID is the unique Amazon Mechanical Turk ID for the respondent. Here is the appropriate R code (using the same Wave 1 data file). See the ‘Distribute Survey’ link in Qualtrics to obtain the correct Qualtrics survey link. Use the paste command to append each respondent’s ‘MID’ identifier to the link in his or her invitation.

## Read the Wave 1 responses (column names first, then data after the two header rows)
df0 = read.csv("http://dl.dropboxusercontent.com/u/24660992/panels/data_wave1.csv",skip=0,head=T)
df = read.csv("http://dl.dropboxusercontent.com/u/24660992/panels/data_wave1.csv",skip=2,head=F)
names(df)=names(df0)
df = df[as.character(df$MID)!="",] ## drop rows without a Worker ID
## Email subject
a <- "Complete a brief follow-up survey for a $0.20 bonus!"
## Email body: one message per worker, each ending in that worker's MID
b <- paste("The survey can be completed at ",
  "http://tuck.qualtrics.com/SE/?SID=SV_6LOw3p5hGeaF9uR&MID=",
  df$MID,sep="")
## Worker IDs to contact
c = as.character(df$MID)
d <- ContactWorker(subjects=a,msgs=b,workers=c)

Your screen will look like this when you run the command.

Respondents will now take Wave 2 of the survey. Download the results from the ‘View Results’ tab after a sufficient number of individuals have responded. Typically, I have received Wave 2 response rates of 70% or higher. The Wave 2 data file is saved as ‘data_wave2.csv.’
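To see where your own panel stands, the Wave 2 response rate can be computed by checking which Wave 1 Worker IDs reappear in the Wave 2 file. The sketch below uses invented IDs; in practice the two vectors would come from the ‘MID’ columns of the Wave 1 and Wave 2 downloads.

```r
# Invented Worker IDs standing in for the 'MID' columns of each wave.
wave1_ids <- c("A1", "A2", "A3", "A4", "A5")
wave2_ids <- c("A2", "A4", "A5", "A5")  # A5 started the survey twice

# TRUE for each Wave 1 respondent who appears at least once in Wave 2.
returned <- wave1_ids %in% wave2_ids

response_rate <- mean(returned)       # share of Wave 1 respondents who returned
not_returned  <- wave1_ids[!returned] # candidates for a reminder message

response_rate
not_returned
```

Using %in% rather than counting Wave 2 rows keeps duplicate Wave 2 entries from inflating the rate.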

3.1.4. Compensating Respondents. After the worker has submitted the job, or HIT, you must pay him or her using the GrantBonus() command. First, in the ‘Manage’ tab in AMT, select ‘Results’ on your latest batch of HITs, click ‘Download,’ then click ‘here’ on the Download Batch Results page to download an individual-level file with completed HITs.

The file includes the ‘AssignmentId’ column, along with a ‘WorkerId’ column. Merge this dataset with the Wave 2 results file on the ‘MID’ variable to determine which respondents successfully entered Wave 2. Then, using AMT Worker IDs and Assignment IDs, send a bonus to the workers.

## Wave 1 batch results from AMT and Wave 2 responses from Qualtrics
wave1AMT = read.csv("http://dl.dropboxusercontent.com/u/24660992/panels/wave1_AMT_results.csv",skip=0,head=T)
wave2Data = read.csv("http://dl.dropboxusercontent.com/u/24660992/panels/data_wave2.csv",skip=0,head=T)
## Keep Wave 2 respondents who also appear in the AMT batch file
merge1 = merge(wave2Data,wave1AMT,by.y="WorkerId",by.x="MID")
dim(merge1)
a1 <- as.character(merge1$MID) ## Worker IDs
b1 <- as.character(merge1$AssignmentId) ## Assignment IDs
c1 <- ".20" ## bonus amount in dollars
d1 <- "Thanks for your great work on my HIT! I really appreciate it!" ## message to worker
## Uncomment the next line to pay the bonuses
##GrantBonus(workers=a1,assignments=b1,amounts=c1,reasons=d1)
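Because a worker who opened Wave 2 more than once will appear in several merged rows, it may be prudent to drop duplicates before granting bonuses. The sketch below illustrates the idea on an invented version of the merged data frame; it is a suggested safeguard, not a step from the original workflow.

```r
# Toy version of the merged file: worker A2 appears twice (values invented).
merge1 <- data.frame(
  MID              = c("A1", "A2", "A2"),
  AssignmentId     = c("asg1", "asg2", "asg2"),
  stringsAsFactors = FALSE
)

# Keep only the first row for each Worker ID so nobody is paid twice.
to_pay <- merge1[!duplicated(merge1$MID), ]

# Total bonus outlay at $0.20 per worker.
total_bonus <- nrow(to_pay) * 0.20

to_pay
total_bonus
```

The vectors passed to the bonus command would then be built from to_pay rather than from the full merged file.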

4. Data Analysis

You now have two separate .csv files: one containing completed responses to Wave 1 and a second containing responses to Wave 2. Merge the two files based on the respondent’s Amazon Mechanical Turk ID and then start your between-subjects and within-subjects analysis.

Here is code to merge the files and examine the treatment assignments. A figure below demonstrates that the Wave 1 treatment assignments differ from Wave 2 assignments.

## Read Wave 1 (column names first, then data after the two header rows) and Wave 2
df0 = read.csv("http://dl.dropboxusercontent.com/u/24660992/panels/data_wave1.csv",skip=0,head=T)
wave1data = read.csv("http://dl.dropboxusercontent.com/u/24660992/panels/data_wave1.csv",skip=2,head=F)
names(wave1data)=names(df0)
wave2data = read.csv("http://dl.dropboxusercontent.com/u/24660992/panels/data_wave2.csv",skip=0,head=T)
dim(wave1data)
dim(wave2data)
## Tag every column with its wave so names stay unique after the merge
names(wave1data)=paste(names(wave1data),"_W1",sep="")
names(wave2data)=paste(names(wave2data),"_W2",sep="")
## Link the waves on the Worker ID
waves12 = merge(wave1data,wave2data,by.x="MID_W1",by.y="MID_W2")
## Compare Wave 1 and Wave 2 treatments side by side for ten respondents
cbind(waves12$headline_W1[11:20],waves12$headline_W2[11:20])
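Once the waves are linked, a first within-subjects quantity is each respondent’s change in support between waves. The sketch below uses invented data and assumes, hypothetically, that support is stored numerically as ‘Q16’ in both waves with higher values meaning more support; adapt the column names and coding to your own files.

```r
# Invented merged data: six respondents with support measured in both waves.
waves12 <- data.frame(
  MID_W1 = paste0("A", 1:6),
  Q16_W1 = c(3, 4, 2, 5, 3, 4),  # hypothetical Wave 1 support scores
  Q16_W2 = c(4, 4, 3, 5, 2, 5)   # hypothetical Wave 2 support scores
)

# Within-subject change: Wave 2 minus Wave 1 for the same respondent.
waves12$change <- waves12$Q16_W2 - waves12$Q16_W1
mean_change <- mean(waves12$change)

# Paired t-test of the same comparison.
paired_test <- t.test(waves12$Q16_W2, waves12$Q16_W1, paired = TRUE)

mean_change
paired_test$estimate
```

The paired test uses each respondent as his or her own control, which is the source of the precision gain that panels offer over single-shot designs.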
