Collaboratively Crowdsourcing Complex Work With Turkomatic

Anand Kulkarni, Björn Hartmann (University of California, Berkeley)
Matthew Can (Stanford University)

Microtask marketplaces excel at simple, repetitive work.

Transcribe a business card.

Look up a fact online.

Much of the work we do in our daily lives is not simple or repetitive.

“Create algebra problems for my mathematics exam.”

“Write a research paper.”

“Create a small piece of software.”

“Arrange my trip to Seattle.”

“Write a blog about Mechanical Turk with a few good entries.”

How do we crowdsource complex work?

Complex work with crowds

Soylent: Editing word processing documents (Bernstein et al. ’10)

VizWiz: Answering queries about visual scenes (Bigham et al. ’10)

More complex applications: PlateMate [NHZG11], Adrenaline [BBMK11], CrowdForge [KSK11]…

Workflows: Crowd Algorithms

Divide complex tasks into a sequence of microtasks arranged in a workflow

Soylent, Bernstein et al, UIST 2010

Workflow design is labor-intensive

1. Design individual HITs
2. Implement parallelism to make sure tasks are done correctly
3. Write software to launch HITs and parse worker results
4. Test workflow by running program
5. Identify errors
6. Iterate from step 1

Difficult and domain-specific: Workflow design requires extensive up-front iteration and experimentation and is specific to a given task domain.

Inaccessible to non-experts: Few have the patience to implement this process in code

What is Turkomatic?

Turkomatic is a system for crowdsourcing high-level complex and creative work where the crowd designs the workflow.

Example request: “Create a new blog about Mechanical Turk with two posts.”

Price-Divide-Solve (PDS)

How do we induce the crowd to design a workflow?

PDS is a divide-and-conquer algorithm to create workflows.

Price: Can this task be solved for 20 cents?
  If yes: Solve the task and return the answer.
  If no: Divide the task into multiple steps.
    For each step, recurse.
    Merge the steps into a solution.
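The recursion above can be sketched in a few lines of Python. This is a minimal illustration, not the Turkomatic implementation: the function name, the four callbacks standing in for crowd responses, and the `max_depth` guard are all assumptions introduced here. In the real system each callback would be answered by workers through HITs.

```python
from typing import Callable, List


def price_divide_solve(
    task: str,
    can_solve: Callable[[str], bool],
    solve: Callable[[str], str],
    divide: Callable[[str], List[str]],
    merge: Callable[[str, List[str]], str],
    max_depth: int = 5,
) -> str:
    """Recursively price, divide, solve, and merge a task (illustrative only)."""
    # Price: can this task be solved for the fixed reward (20 cents on the slide)?
    # max_depth is an added guard, not part of the slides: at the limit we solve directly.
    if max_depth <= 0 or can_solve(task):
        return solve(task)
    # Divide the task into steps, then recurse on each step.
    answers = [
        price_divide_solve(step, can_solve, solve, divide, merge, max_depth - 1)
        for step in divide(task)
    ]
    # Merge the solved steps into one solution for the parent task.
    return merge(task, answers)


# Stand-ins for crowd responses, using the blog example from the slides.
BLOG = "Create a new blog about Mechanical Turk with two posts."
STEPS = {
    BLOG: [
        "Create a new blog on Wordpress.com.",
        "Write one entry for a blog.",
        "Write a second entry for a blog.",
    ]
}

result = price_divide_solve(
    BLOG,
    can_solve=lambda task: task not in STEPS,
    solve=lambda task: f"[answer: {task}]",
    divide=lambda task: STEPS[task],
    merge=lambda task, answers: " ".join(answers),
)
print(result)
```

The depth guard is one possible way to bound the recursion when the crowd keeps dividing a task without ever judging it solvable.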

Redundancy is used at each step to ensure quality.

  • Price check: consensus on price (majority)
  • Divide task: best subdivision (vote)
  • Solve task: best solution (vote)
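A rough sketch of how redundant answers might be aggregated: a majority rule for the price check and a vote among candidate subdivisions or solutions. The function names, the strict-majority rule, and the tie-breaking choice are assumptions made for illustration; the slides do not specify them.

```python
from collections import Counter
from typing import Sequence


def majority_price(votes: Sequence[bool]) -> bool:
    """Return True only if a strict majority of workers say the task is solvable."""
    return sum(votes) * 2 > len(votes)


def best_by_vote(candidates: Sequence[str], ballots: Sequence[int]) -> str:
    """Return the candidate with the most ballots.

    Each ballot is the index of the candidate a worker voted for.
    Ties go to the earliest candidate (an arbitrary choice for this sketch).
    """
    counts = Counter(ballots)
    winner = max(range(len(candidates)), key=lambda i: counts[i])
    return candidates[winner]


# Three workers price the task, then three workers vote on two subdivisions.
print(majority_price([True, False, True]))
print(best_by_vote(["split into 3 steps", "split into 2 steps"], [0, 1, 0]))
```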

Price-Divide-Solve (PDS): example

Task: Create a new blog about Mechanical Turk with two posts.

Price: Can we solve it for 20 cents? No.

Divide: Divide it into two or more steps.
  • Create a new blog on Wordpress.com.
  • Write one entry for a blog.
  • Write a second entry for a blog.

Price (asked for each step): Can we solve it for 20 cents? Yes. Yes. Yes.

Solve

“Welcome to my blog about Mechanical Turk! Here, I’ll be posting some of my favorite recipes for Mechanical Turk. You’ll be able to follow along at home and create delicious HITs. From the comfort of your own home! Stay tuned and i’ll show you some of the best strategies for keeping your Turk workers engaged.”

“You may be inclined to price your HITs at the lowest possible rate, but this isn’t always the best choice. Instead, you should base your pricing on:
- How long will the HIT take?
- Is the HIT similar to other HITs? If so, price it slightly less than theirs.
- If the HIT involves a lot of qualifications, you may want to price it higher, to attract more qualified workers.”

Merge: Combine the results of solved steps.

mtworker.wordpress.com

Can this task be solved for 20 cents?

Write a blog about Mechanical Turk

Yes / No

Submit

Break down the following task.

Write a blog about Mechanical Turk

Step 1:
Step 2:

Add Step / Submit

Solve the following task.

Create a new blank blog on Wordpress

Submit

Merge the following subtasks.

Workers previously divided this task into simpler steps and solved each step. Combine their work into a complete solution.

Write a blog about Mechanical Turk

Step 1: Write a blog post about Mechanical Turk. [answer: This post is…]
Step 2: Create a blank blog about Mechanical Turk [answer: www...]

Submit

Price-Divide-Solve (PDS)

PDS guides the crowd to design workflows in a particular way.

It can attempt to create a workflow for any task, but it can’t produce all workflows.

Write a sentence.
Improve the previous worker’s answer.
Check that the previous answer was improved.

System Recap

Requester Interface
Algorithm: Price, Divide, Solve
Worker Interface
System Output

Experiment 1: Can the crowd plan and execute workflows using PDS?

Over 150 trials, including:
• Java programming
• Booking restaurants
• Sorting and cleaning data
• Blogging
• Creating self-portraits
• Solving an SAT
• Logo design
• Travel planning
• Writing essays
• Web research

Experiment 1: Success Modes

Write a 3-paragraph essay about whether it’s ever OK to lie.
  • Write one paragraph arguing it’s OK to lie sometimes.
  • Write one paragraph suggesting it’s never OK to lie.
  • Write a conclusion reconciling the two.
    – Write one sentence to open the conclusion.
    – Write 2-3 sentences in the middle of the conclusion.
    – Write a concluding sentence.

Data:
• 6 subnodes were produced
• 44 separate worker judgments were used
• Task completed with a full essay

“…although many people believe it is always essential to tell the truth, sometimes it may be better to lie. There is credibility in both views. And like many ethical decisions, sometimes the circumstances dictate.

When you tell the truth you develop a stronger bond of trust with those around you. A relationship can not exist without trust. If you lie, you end up telling more lies to cover the first….”

Experiment 1: Failure Modes

There are two ways we found that the algorithm could fail:
- Failing to terminate at all
- Completing, but producing wrong answers

Experiment 1: Failing to terminate

Plan a trip from New York to S.F. that visits 5 interesting places.

Think about where to go next in Ohio.

Think about where to go next in Ohio.

Experiment 1: Wrong answers

List the department chairs of the top 20 US programs in CS.

aalto armchair, poang lounge chair, adirondack chair, aeron chair, balans chair, ball chair….

Why does the crowd lose context?

Turkomatic worker: “…I’ve taken a look at your instructions, and I understand them perfectly. However, this task seems to have been inadvertently sabotaged by other turkers who do not understand what you are asking them to do…”

Long workflows involve increasing chains of trust.

Each individual worker has a ~30% probability of failure [Chi/Kittur/Suh ’08, Bernstein et al ’10]

Weakest link problem: If one worker early in the workflow design process makes mistakes, the subsequent decompositions will fail.
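To make the weakest-link point concrete, here is a rough back-of-the-envelope calculation. It assumes, beyond what the slide states, that workers fail independently and that a single failure anywhere in the chain breaks the workflow; it also ignores the redundancy PDS applies at each step. The function name is hypothetical.

```python
def chain_success_probability(failure_rate: float, chain_length: int) -> float:
    """Probability that every worker in a chain succeeds, assuming independent failures."""
    if not 0.0 <= failure_rate <= 1.0:
        raise ValueError("failure_rate must be between 0 and 1")
    if chain_length < 0:
        raise ValueError("chain_length must be non-negative")
    return (1.0 - failure_rate) ** chain_length


# The ~30% per-worker failure rate quoted on the slide.
for n in (1, 3, 5, 10):
    print(n, round(chain_success_probability(0.3, n), 3))
```

Under these assumptions a chain of five workers succeeds about 17% of the time, and a chain of ten less than 3% of the time.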

Including context doesn’t suffice

One explanation

What if we used more competent workers?

Experiment 2: Can expert workers make Turkomatic work?

Setup:

We recruited five graduate students with experience as requesters on Mechanical Turk.

We ran the PDS algorithm on three complex tasks with this crowd: online research, essay writing, and creating a blog.

Results:

Each of the three tested tasks completed correctly when we used only expert workers!

Conclusion: PDS works well with qualified crowds.

How can we successfully run PDS with unskilled workers?

Experiment 3: Can requester management help the crowd?

Workflow visualizer: Monitor the workflow in real-time.

Interactive task editor: Selectively invalidate parts of a workflow.

Workflow seeding: Run previously-designed parts of workflows in the crowd.

Task Graphs (Requester)

Task Graph Nodes

Each node shows a task prompt, a status, and the submitted answer.

Status: queued, in progress, completed

Task Graph Edges: Parallel

Parent Task [Split]
  Sub Task 1 [Solve]
  Sub Task 2 [Decide]

Task Graph Edges: Sequential

Parent Task [Split]
  Sub Task 1 [Solve]
  Sub Task 2 [Decide]

Task Graph Example

Write an essay [Split]
  Write an outline [Solve]: 1. Thesis: …
  Expand the outline [Decide]
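One way to represent such a task graph in code is sketched below. The class and field names are hypothetical, and the pairing of stage labels (Decide, Split, Solve) with statuses (queued, in progress, completed) is an assumption made for illustration; the slides show only the labels themselves.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class Status(Enum):
    QUEUED = "queued"
    IN_PROGRESS = "in progress"
    COMPLETED = "completed"


class Edges(Enum):
    PARALLEL = "parallel"      # subtasks can be worked on independently
    SEQUENTIAL = "sequential"  # each subtask follows the previous one


@dataclass
class TaskNode:
    prompt: str
    stage: str = "Decide"            # Decide, Split, or Solve, as labelled on the slides
    status: Status = Status.QUEUED
    answer: Optional[str] = None     # submitted answer, if any
    edges: Edges = Edges.PARALLEL    # how this node's subtasks relate to each other
    subtasks: List["TaskNode"] = field(default_factory=list)


def render(node: TaskNode, depth: int = 0) -> List[str]:
    """Return an indented text outline of the graph, one line per node."""
    line = f"{'  ' * depth}{node.prompt} [{node.stage}, {node.status.value}]"
    if node.answer:
        line += f" -> {node.answer}"
    lines = [line]
    for child in node.subtasks:
        lines.extend(render(child, depth + 1))
    return lines


# The essay example from the slide, with assumed statuses.
essay = TaskNode(
    "Write an essay",
    stage="Split",
    status=Status.COMPLETED,
    edges=Edges.SEQUENTIAL,
    subtasks=[
        TaskNode("Write an outline", stage="Solve",
                 status=Status.COMPLETED, answer="1. Thesis: …"),
        TaskNode("Expand the outline"),
    ],
)
print("\n".join(render(essay)))
```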

Task Graph Editing

Write a 3-paragraph essay… [Split]
Think about the topic… [Split]
Collect information about… [Decide]
Write the paragraphs… [Decide]
Pick one of the topics [Split]
List possible topics [Solve]: 1. The word…

Edit Task Details
Task: Think about the topic you want to write about
Status: Split
Actions: Edit Task, Edit Solution, Edit Subtask, Delete Node

Task Graph Editing

Write a 3-paragraph essay… [Split]
Think about the topic… [Split]
Collect information about… [Decide]
Write the paragraphs… [Decide]
Pick one of the topics [Split]
List possible topics [Solve]: 1. The word…
List three main topics… [Solve]

Recomputing Task Graphs

• Delete subtree of edited task
• Recursively:
  – Delete stale solutions in parent tasks
  – Delete stale solutions in subsequent sibling tasks (for serial decompositions)
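The recomputation rules above could be sketched as follows. This is one reading of the slide, not the system's actual code: the `Node` class, the `serial` flag, and the choice to clear answers throughout a later sibling's subtree are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)  # compare nodes by identity, since they hold parent links
class Node:
    prompt: str
    answer: Optional[str] = None
    serial: bool = False  # True if this node's subtasks run in sequence
    children: List["Node"] = field(default_factory=list)
    parent: Optional["Node"] = field(default=None, repr=False)

    def add(self, child: "Node") -> "Node":
        """Attach a subtask and return it."""
        child.parent = self
        self.children.append(child)
        return child


def clear_answers(node: Node) -> None:
    """Drop the answer of a node and of everything below it."""
    node.answer = None
    for child in node.children:
        clear_answers(child)


def invalidate(edited: Node) -> None:
    """Discard work made stale by editing a task."""
    # Delete the subtree of the edited task; its old decomposition no longer applies.
    edited.children.clear()
    edited.answer = None
    # Walk up the tree, clearing solutions that were built on the stale work.
    node = edited
    while node.parent is not None:
        parent = node.parent
        parent.answer = None
        if parent.serial:
            # In a serial decomposition, later siblings depended on this task.
            position = parent.children.index(node)
            for sibling in parent.children[position + 1:]:
                clear_answers(sibling)
        node = parent
```

Earlier siblings keep their answers, so only the affected part of the workflow has to be sent back to the crowd.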

Seeding workflows

We mitigate poor worker performance by starting with partial workflows.

Run Workflow with Crowd

Experiment 3: Collaboration

Setup:

We ran the PDS algorithm using Turkers on three sets of tasks, but actively monitored and intervened only to eliminate errors.

Outcomes:

Each of the three tested tasks completed correctly with 1 to 4 requester interventions.

Experiment 3: Collaboration

Paragraph 1: Crowdsourcing is a term…

Paragraph 2: Chaordix crowd consulting is… (later replaced by: Crowdsourcing works best on tasks where…)

Paragraph 3: One of the best known crowdsourcing platforms…

Conclusion

We presented Turkomatic, a system that lets requesters harness the crowd to design complex workflows.

Our first experiment showed that letting the crowd design its own tasks can produce both successful and unsuccessful workflows.

Our second experiment showed that expert workers could successfully design workflows using PDS.

Last, we showed that an interactive, real-time interface for visualizing and selectively editing worker interfaces could produce viable workflows.

One finding of note

In Turkomatic, highly motivated workers could not step in to correct others’ errors.

Excessive structure in workflow design prevents the emergence of leaders.

To scale, we may consider giving editing abilities to more capable workers.

Contributions

• A simplified interface for crowdsourcing that lowers the threshold for crowdsourcing complex tasks
• A new algorithm, techniques, and interfaces enabling the crowd to decompose complex tasks
• A new interface for letting requesters edit, visualize, and seed workflows