
Page 1: Crowdscreen : Algorithms for Filtering Data using Humans

Crowdscreen: Algorithms for Filtering Data using Humans

Aditya Parameswaran

Stanford University

(Joint work with Hector Garcia-Molina, Hyunjung Park, Neoklis Polyzotis, Aditya Ramesh, and Jennifer Widom)

Page 2: Crowdscreen : Algorithms for Filtering Data using Humans

Crowdsourcing: A Quick Primer

Asking the crowd for help to solve problems

Why? Many tasks are done better by humans.
Example questions: Pick the "cuter" cat. Is this a photo of a car?

How? We use an internet marketplace.
Example task listing: Requester: Aditya, Reward: $1, Time: 1 day

Page 3: Crowdscreen : Algorithms for Filtering Data using Humans

Crowd Algorithms

Working on fundamental data processing algorithms that use humans:
- Max [SIGMOD12]
- Filter [SIGMOD12]
- Categorize [VLDB11]
- Cluster [KDD12]
- Search
- Sort

Using human unit operations:
- Predicate evaluation, comparisons, ranking, rating

Goal: Design efficient crowd algorithms

Page 4: Crowdscreen : Algorithms for Filtering Data using Humans


Efficiency: Fundamental Tradeoffs

Cost: How much $$ can I spend?
Latency: How long can I wait?
Uncertainty: What is the desired quality?

Key design questions:
- Which questions do I ask humans?
- Do I ask in sequence or in parallel?
- How much redundancy in questions?
- How do I combine the answers?
- When do I stop?

Page 5: Crowdscreen : Algorithms for Filtering Data using Humans

Filter


[Figure: a dataset of items flows through the filter; for each item, crowd workers answer "Item X satisfies predicate?" (e.g., Y, Y, N), and the items that pass form the filtered dataset. The filter may be a single predicate or several (Predicate 1, Predicate 2, ..., Predicate k); example questions: "Is this an image of Paris?", "Is the image blurry?", "Does it show people's faces?"]

Applications: Content Moderation, Spam Identification, Determining Relevance, Image/Video Selection, Curation and Management, …

Page 6: Crowdscreen : Algorithms for Filtering Data using Humans

Parameters



Given:
- Per-question human error probability (FP/FN)
- Selectivity (fraction of items that satisfy the predicate)

Goal: Compose filtering strategies, minimizing across all items:
- Overall expected cost (# of questions)
- Overall expected error

Page 7: Crowdscreen : Algorithms for Filtering Data using Humans

Our Visualization of Strategies

[Figure: a strategy drawn on a grid. A point (x, y) means the item has received x NO and y YES answers so far; each point is marked with one of three decisions: continue (ask another question), decide PASS, or decide FAIL.]
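To make the representation concrete, here is a minimal sketch (our illustration, not code from the talk or paper) of a strategy stored as a map from grid points to decisions; the names CONT, PASS, FAIL and the tiny example grid are ours.

```python
# A strategy as a map from points (no_count, yes_count) to one of three decisions.
CONT, PASS, FAIL = "continue", "pass", "fail"

# Hypothetical example: pass on the first YES, fail after two NOs,
# asking at most two questions per item.
example_strategy = {
    (0, 0): CONT,   # no answers yet: ask a question
    (0, 1): PASS,   # one YES: accept the item
    (1, 0): CONT,   # one NO: ask again
    (1, 1): PASS,   # one NO then one YES: accept
    (2, 0): FAIL,   # two NOs: reject the item
}
```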

Page 8: Crowdscreen : Algorithms for Filtering Data using Humans

Common Strategies

- Always ask X questions, return the most likely answer: triangular strategy
- If X YESs return "Pass", Y NOs return "Fail", else keep asking: rectangular strategy (sketched in code below)
- Ask until |#YES - #NO| > X, or at most Y questions: chopped-off triangle
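As one concrete example, the rectangular strategy above can be built as a decision grid. This is a hedged sketch in the same (no_count, yes_count) representation as the previous slide; the function name and constants are of our choosing, not from the paper.

```python
CONT, PASS, FAIL = "continue", "pass", "fail"

def rectangular_strategy(x, y):
    """Decide PASS once x YES answers arrive, FAIL once y NO answers arrive,
    and keep asking at every other point (so at most x + y - 1 questions)."""
    grid = {}
    for no in range(y + 1):
        for yes in range(x + 1):
            if yes == x:
                grid[(no, yes)] = PASS
            elif no == y:
                grid[(no, yes)] = FAIL
            else:
                grid[(no, yes)] = CONT
    return grid
```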


Page 9: Crowdscreen : Algorithms for Filtering Data using Humans

Filtering: Outline

- How do we evaluate strategies?
- Hasn't this been done before?
- What is the best strategy? (Formulation 1)
  - Formal statement
  - Brute-force approach
  - Pruning strategies
  - Probabilistic strategies
  - Experiments
- Extensions

Page 10: Crowdscreen : Algorithms for Filtering Data using Humans

Evaluating Strategies

[Figure: a small example strategy grid over (NOs, YESs), used to illustrate the computation below.]

Cost = Σ over terminating points (x, y) of (x + y) · Pr[reaching (x, y)]

Error = Σ over terminating points (x, y) of Pr[reaching (x, y) and filtering the item incorrectly]

Pr[reaching (x, y)] = Pr[reaching (x, y-1) and getting a YES] + Pr[reaching (x-1, y) and getting a NO]
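The recurrence translates directly into a short dynamic program. The sketch below is our illustration, not the paper's implementation: it assumes independent worker answers, the selectivity and per-question FP/FN rates from the Parameters slide, and a deterministic strategy grid in the (no_count, yes_count) representation used earlier.

```python
# Minimal evaluation sketch (our illustration): propagate the probability of
# reaching each point with the recurrence above, then accumulate expected cost
# and expected error at every PASS/FAIL point.
CONT, PASS, FAIL = "continue", "pass", "fail"

def evaluate(strategy, max_questions, selectivity, fp, fn):
    """strategy: dict mapping (no_count, yes_count) -> CONT / PASS / FAIL; it must
    reach a PASS/FAIL decision within max_questions questions.
    selectivity: Pr[item truly satisfies the predicate].
    fp: Pr[worker says YES | item does not satisfy];
    fn: Pr[worker says NO  | item satisfies]."""
    exp_cost = exp_error = 0.0
    # Handle truly-satisfying and truly-non-satisfying items separately.
    for satisfies, prior in ((True, selectivity), (False, 1.0 - selectivity)):
        p_yes = 1.0 - fn if satisfies else fp        # chance one answer is YES
        reach = {(0, 0): 1.0}                        # Pr[reaching each point]
        for asked in range(max_questions + 1):
            for no in range(asked + 1):
                yes = asked - no
                pr = reach.get((no, yes), 0.0)
                if pr == 0.0:
                    continue
                decision = strategy[(no, yes)]
                if decision == CONT:
                    # Recurrence from the slide: push mass to the two successors.
                    reach[(no, yes + 1)] = reach.get((no, yes + 1), 0.0) + pr * p_yes
                    reach[(no + 1, yes)] = reach.get((no + 1, yes), 0.0) + pr * (1.0 - p_yes)
                else:
                    exp_cost += prior * pr * (no + yes)
                    if (decision == PASS) != satisfies:   # incorrectly filtered
                        exp_error += prior * pr
    return exp_cost, exp_error
```

For instance, evaluate(rectangular_strategy(2, 2), 3, 0.5, 0.1, 0.1), using the constructor sketched on the Common Strategies slide and made-up parameter values, gives the expected number of questions and the expected error of the 2-YES/2-NO rectangular strategy.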

Page 11: Crowdscreen : Algorithms for Filtering Data using Humans

Hasn’t this been done before?

Solutions from elementary statistics guarantee the same error per item. Important in contexts like:
- Automobile testing
- Medical diagnosis

We're worried about aggregate error over all items: a uniquely data-oriented problem.
- We don't care if every item is perfect as long as the overall error target is met.
- As we will see, this results in $$$ savings.


Page 12: Crowdscreen : Algorithms for Filtering Data using Humans

What is the best strategy?

Find the strategy with minimum overall expected cost, such that:
1. The overall expected error is less than a threshold
2. The number of questions per item never exceeds m

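For intuition only (this is not the paper's algorithm, which considers far richer strategy spaces), the formulation can be explored by brute force over a small restricted family such as rectangular strategies, reusing the rectangular_strategy and evaluate sketches from earlier slides:

```python
def best_rectangular(m, error_threshold, selectivity, fp, fn):
    """Cheapest rectangular strategy using at most m questions per item whose
    overall expected error stays below error_threshold. Names are ours."""
    best = None
    for x in range(1, m + 1):            # YESs needed to pass
        for y in range(1, m + 1):        # NOs needed to fail
            if x + y - 1 > m:            # would exceed m questions per item
                continue
            cost, error = evaluate(rectangular_strategy(x, y), m,
                                   selectivity, fp, fn)
            if error < error_threshold and (best is None or cost < best[0]):
                best = (cost, x, y)
    return best
```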

Page 13: Crowdscreen : Algorithms for Filtering Data using Humans

Brute Force Approaches

Try all O(3^p) strategies, where p = O(m^2) is the number of grid points: too long!

Try all "hollow" strategies: still too long!

[Figure: two example strategy grids (axes: NOs, YESs).]

Page 14: Crowdscreen : Algorithms for Filtering Data using Humans

Pruning Hollow Strategies

[Figure: an example hollow strategy on the (NOs, YESs) grid.]

For every hollow strategy, there is a ladder strategy that is as good or better.

Page 15: Crowdscreen : Algorithms for Filtering Data using Humans

Other Pruning Examples

[Figure: two example strategies on the (NOs, YESs) grid, one ladder and one hollow.]

Page 16: Crowdscreen : Algorithms for Filtering Data using Humans

Probabilistic Strategies

At each point (x, y), the strategy now specifies three probabilities: continue(x, y), pass(x, y), fail(x, y).

[Figure: an example probabilistic strategy on a small grid; each point is labeled with its (continue, pass, fail) triple, e.g. (1,0,0), (0,1,0), (0,0,1), or mixed values such as (0.5,0.5,0).]
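Evaluating a probabilistic strategy needs only a small change to the earlier dynamic-programming sketch: at each point, a continue-fraction of the reach probability flows onward and the rest terminates. Again this is our illustration under the same independence assumptions, not the paper's code.

```python
def evaluate_probabilistic(strategy, max_questions, selectivity, fp, fn):
    """strategy: dict mapping (no_count, yes_count) -> (continue, pass, fail)
    probabilities summing to 1; points with no_count + yes_count == max_questions
    must have continue probability 0."""
    exp_cost = exp_error = 0.0
    for satisfies, prior in ((True, selectivity), (False, 1.0 - selectivity)):
        p_yes = 1.0 - fn if satisfies else fp
        reach = {(0, 0): 1.0}
        for asked in range(max_questions + 1):
            for no in range(asked + 1):
                yes = asked - no
                pr = reach.get((no, yes), 0.0)
                if pr == 0.0:
                    continue
                p_cont, p_pass, p_fail = strategy[(no, yes)]
                # Part of the reach probability continues to the two successors...
                reach[(no, yes + 1)] = reach.get((no, yes + 1), 0.0) + pr * p_cont * p_yes
                reach[(no + 1, yes)] = reach.get((no + 1, yes), 0.0) + pr * p_cont * (1.0 - p_yes)
                # ...and the rest terminates here, contributing cost and (maybe) error.
                exp_cost += prior * pr * (p_pass + p_fail) * (no + yes)
                exp_error += prior * pr * (p_fail if satisfies else p_pass)
    return exp_cost, exp_error
```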

Page 17: Crowdscreen : Algorithms for Filtering Data using Humans

Best probabilistic strategy

Finding the best strategy can be posed as a Linear Program!

Insight 1: Pr[reaching (x, y)] = (number of paths into (x, y)) × Pr[one path]

Insight 2: The probability of filtering incorrectly at a point is independent of the number of paths

Insight 3: At least one of pass(x, y) and fail(x, y) must be 0

Page 18: Crowdscreen : Algorithms for Filtering Data using Humans

Experimental Setup

Goal: Study the cost savings of probabilistic strategies relative to the others

Workflow: Parameters → Generate Strategies → Compute Cost

Two sample plots:
- Varying false positive error (other parameters fixed)
- Varying selectivity (other parameters varying)

Strategy types compared: Ladder, Hollow, Probabilistic, Rect, Deterministic, Growth, Shrink

Page 19: Crowdscreen : Algorithms for Filtering Data using Humans

Varying false positive error

[Plot: expected cost of each strategy type as the false positive error varies.]

Page 20: Crowdscreen : Algorithms for Filtering Data using Humans

Varying selectivity

[Plot: expected cost of each strategy type as the selectivity varies.]

Page 21: Crowdscreen : Algorithms for Filtering Data using Humans

Other Issues and Factors

- Other formulations
- Multiple filters
- Categorize (output > 2 types)

Ref: "CrowdScreen: Algorithms for Filtering Data with Humans" [SIGMOD 2012]

Page 22: Crowdscreen : Algorithms for Filtering Data using Humans

Natural Next Steps

Factors to incorporate: Expertise, Spam Workers, Task Difficulty, Latency, Error Models, Pricing

Algorithms: compute the skyline of cost, latency, and error

Page 23: Crowdscreen : Algorithms for Filtering Data using Humans

Related Work on Crowdsourcing

Workflows, Platforms and Libraries: Turkit [Little et al. 2009], HProc [Heymann 2010], CrowdForge [Kittur et al. 2011], Turkomatic [Kulkarni and Can 2011], TurKontrol/Clowder [Dai, Mausam and Weld 2010-11]

Games: GWAP, Matchin, Verbosity, Input Agreement, Tagatune, Peekaboom [Von Ahn & group 2006-10], Kisskissban [Ho et al. 2009], Foldit [Cooper et al. 2010-11], Trivia Masster [Deutch et al. 2012]

Marketplace Analysis: [Kittur et al. 2008], [Chilton et al. 2010], [Horton and Chilton 2010], [Ipeirotis 2010]

Apps: VizWiz [Bigham et al. 2010], Soylent [Bernstein et al. 2010], ChaCha, CollabMap [Stranders et al. 2011], Shepherd [Dow et al. 2011]

Active Learning: Survey [Settles 2010], [Raykar et al. 2009-10], [Sheng et al. 2008], [Welinder et al. 2010], [Dekel 2010], [Snow et al. 2008], [Shahaf 2010], [Dasgupta, Langford et al. 2007-10]

Databases: CrowdDB [Franklin et al. 2011], Qurk [Marcus et al. 2011], Deco [Parameswaran et al. 2011], Hlog [Chai et al. 2009]

Algorithms: [Marcus et al. 2011], [Gomes et al. 2011], [Ailon et al. 2008], [Karger et al. 2011],


Page 24: Crowdscreen : Algorithms for Filtering Data using Humans


Thanks for listening! Questions?

[Image: Schrödinger's cat]