
CUbRIK research at CIKM 2012: Map to Humans and Reduce Error


Poster presenting the research described in the publication "Map to Humans and Reduce Error – Crowdsourcing for Deduplication Applied to Digital Libraries".


Map to Humans and Reduce Error - Crowdsourcing for Deduplication Applied to Digital Libraries

Iterative workflow:

1. Start with initial DSParams and threshold, and Pcand = φ.
2. Identify pairs with ADS = threshold ± ε, sample them, and add them to Pcand.
3. Get crowd labels for the pairs in Pcand.
4. Compute crowd decisions and worker confidences.
5. Promote high-confidence pairs to Ptrain; set Pcand = Pcand - Ptrain.
6. Optimize DSParams and the threshold to fit the data in Ptrain, yielding better DSParams and threshold.
7. Identify the duplicate pairs from Ptrain, add them to Pdupl, and repeat from step 2.
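The loop can be made concrete in code. Below is a minimal sketch assuming synthetic pairs, a simulated majority-vote crowd, and a grid-searched threshold in place of the full DSParams optimization; all names and parameters are ours, not from the paper.

```python
import random

random.seed(0)
FIELDS = ["title", "authors", "year"]

def make_pair(dup):
    # Duplicate pairs get high per-field similarities, non-duplicates low ones.
    base = 0.8 if dup else 0.3
    return {"dup": dup,
            "sims": {f: min(1.0, max(0.0, random.gauss(base, 0.15))) for f in FIELDS}}

def ads(pair, params):
    # Automatic Duplicate Score: weighted average of per-field similarities.
    return sum(w * pair["sims"][f] for f, w in params.items()) / sum(params.values())

def crowd_decision(pair, workers=3):
    # Majority vote of simulated workers, each correct 85% of the time.
    votes = sum((pair["dup"] if random.random() < 0.85 else not pair["dup"])
                for _ in range(workers))
    return 2 * votes > workers

def fit_threshold(train, params):
    # Toy optimization step: grid-search the threshold maximizing accuracy on Ptrain.
    return max((t / 100 for t in range(101)),
               key=lambda t: sum((ads(p, params) >= t) == lbl for p, lbl in train))

pairs = [make_pair(random.random() < 0.5) for _ in range(500)]
params = {f: 1.0 for f in FIELDS}   # initial DSParams: uniform field weights
threshold, eps = 0.5, 0.05          # initial threshold and sampling band
p_train, labeled = [], set()

for _ in range(5):
    # Sample unlabeled candidates whose ADS lies within threshold +/- eps.
    p_cand = [p for p in pairs
              if id(p) not in labeled and abs(ads(p, params) - threshold) <= eps][:40]
    if not p_cand:
        break
    # Crowd-label the candidates and promote them to Ptrain.
    p_train += [(p, crowd_decision(p)) for p in p_cand]
    labeled |= {id(p) for p in p_cand}
    # Re-fit the threshold on Ptrain (field weights are kept fixed in this toy).
    threshold = fit_threshold(p_train, params)

p_dupl = [p for p in pairs if ads(p, params) >= threshold]
print(f"threshold={threshold:.2f}  |Ptrain|={len(p_train)}  |Pdupl|={len(p_dupl)}")
```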

• Compare CD to AD and optimize DSParams and the threshold to maximize accuracy.
• Compare ADS to CSD and optimize DSParams to:
o minimize the sum of errors
o minimize the sum of log errors
o maximize the Pearson correlation
then compare CD to AD and optimize the threshold to maximize accuracy (toy versions of the three objectives are sketched below).
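A minimal sketch of the three fitting objectives, assuming the per-pair error is the absolute difference between ADS and CSD (the poster does not spell this out); function names are ours.

```python
import math

def objectives(ads_scores, csd_values):
    # Toy versions of the three DSParams-fitting objectives: compare the
    # automatic scores (ADS) with the crowd soft decisions (CSD) pairwise.
    errs = [abs(a - c) for a, c in zip(ads_scores, csd_values)]
    sum_err = sum(errs)
    sum_log_err = sum(math.log(e) for e in errs if e > 0)  # skip log(0)
    # Pearson correlation between ADS and CSD.
    n = len(ads_scores)
    ma, mc = sum(ads_scores) / n, sum(csd_values) / n
    cov = sum((a - ma) * (c - mc) for a, c in zip(ads_scores, csd_values))
    sa = math.sqrt(sum((a - ma) ** 2 for a in ads_scores))
    sc = math.sqrt(sum((c - mc) ** 2 for c in csd_values))
    pearson = cov / (sa * sc) if sa and sc else 0.0
    return sum_err, sum_log_err, pearson

# An optimizer would tune DSParams to drive the first two down, or the last up.
print(objectives([0.9, 0.2, 0.6], [1.0, 0.0, 0.7]))
```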

• Find duplicate entities based on metadata.
• Focus on scientific publications in the FreeSearch system.
• An automatic method and human labelers work together to improve their performance at identifying duplicate entities.
• Actively learn how to deduplicate from the crowd by optimizing the parameters of the automatic method.
• Use MTurk HITs to get labeled data, while tackling the quality issues of the crowdsourced work.
• The DuplicatesScorer produces an ADS (automatic duplicate score). DSParams = {(fieldName, fieldWeight)} plus a threshold; comparing the ADS to the threshold yields AD ∈ {1, 0} (see the sketch after this list).
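A minimal sketch of such a field-weighted scorer; the string-similarity measure and all names are our assumptions, not the authors' implementation.

```python
from difflib import SequenceMatcher

def field_sim(a: str, b: str) -> float:
    # String similarity in [0, 1]; a stand-in for the real per-field measure.
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def duplicates_score(rec1: dict, rec2: dict, ds_params: dict) -> float:
    # ADS: weighted average of per-field similarities, weights from DSParams.
    total = sum(ds_params.values())
    return sum(w * field_sim(rec1.get(f, ""), rec2.get(f, ""))
               for f, w in ds_params.items()) / total

ds_params = {"title": 0.6, "authors": 0.3, "year": 0.1}  # {(fieldName, fieldWeight)}
threshold = 0.8

r1 = {"title": "Comparing Heuristic, Evolutionary and Local Search Approaches to Scheduling",
      "authors": "Soraya Rana, Adele E. Howe, L. Darrell Whitley, Keith Mathias",
      "year": "1996"}
r2 = {"title": "Comparing Heuristic, Evolutionary and Local Search Approaches to Scheduling.",
      "authors": "Soraya B. Rana, Adele E. Howe, L. Darrell Whitley, Keith E. Mathias",
      "year": "1996"}

ads_val = duplicates_score(r1, r2, ds_params)
ad = int(ads_val >= threshold)  # AD in {1, 0}
print(f"ADS={ads_val:.3f} -> AD={ad}")
```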

Mihai Georgescu, Dang Duc Pham, Claudiu S. Firan, Julien Gaugaz, Wolfgang Nejdl

Accuracy (%) per optimization strategy (rows) and crowd decision strategy (columns):

              3 workers | 5 workers
              MV        | MV     Iter   Manual  Boost  Heur
Accuracy      79.19     | 80.00  79.73  80.00   78.92  79.73
Sum-Err       76.49     | 79.46  79.46  79.46   79.46  79.19
Sum-log-err   71.89     | 78.11  78.38  78.92   80.27  76.76
Pearson       73.24     | 79.46  79.46  80.54   79.46  81.08

• The aggregated decision from all workers for a pair produces a CSD (Crowd Soft Decision).
• A worker's contribution to the CSD is proportional to the confidence c_k we have in that worker.
• Comparing the CSD to 0.5 yields CD ∈ {1, 0}.

Crowd Soft Decision: the aggregation of all individual votes W_{i,j}(k) ∈ {-1, 1} into a CSD ∈ [0, 1].

Worker confidence:
• Assess how reliable individual workers are compared to the overall performance of the crowd.
• Simple measure: the proportion of pairs for which a worker gave the same label as the crowd.
• Use an EM algorithm to compute worker confidences iteratively: compute the CSD, then update c_k, and repeat.

Crowd Decision Strategies:
• MV: majority voting; all workers are equal, c_k = 1.
• Iter: c_k computed using the EM algorithm.
• Boost: c_k computed using the EM algorithm, with boosted weights in the computation of the CSD.
• Heur: heuristic; require 3/3 or 4/5 agreeing votes.

The CSD for a pair (i, j) is the confidence-weighted aggregation of the individual votes:

CSD_{i,j} = \sum_{k \in W_{i,j}} \mathrm{weight}(c_k) \cdot \frac{W_{i,j}(k) + 1}{2},
\qquad
\mathrm{weight}(c_k) = \frac{c_k}{\sum_{v \in W_{i,j}} c_v}

where W_{i,j} is the set of workers who voted on pair (i, j) and W_{i,j}(k) ∈ {-1, 1} is worker k's vote.
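A minimal sketch of this aggregation together with the iterative confidence update described above; the fixed iteration count and the exact agreement-based update are our assumptions.

```python
def csd(votes, conf):
    # votes: {worker_id: -1 or 1} for one pair; conf: {worker_id: c_k}.
    total = sum(conf[k] for k in votes) or 1e-9
    return sum(conf[k] / total * (v + 1) / 2 for k, v in votes.items())

def em_confidences(all_votes, iters=10):
    # all_votes: {pair: {worker_id: vote}}; returns {worker_id: c_k}.
    workers = {k for v in all_votes.values() for k in v}
    conf = {k: 1.0 for k in workers}  # start as in MV: all workers equal
    for _ in range(iters):
        # E-step: crowd decisions under the current confidences.
        cd = {p: csd(v, conf) >= 0.5 for p, v in all_votes.items()}
        # M-step: c_k = share of a worker's votes matching the crowd decision.
        for k in workers:
            voted = [(p, v[k]) for p, v in all_votes.items() if k in v]
            conf[k] = sum((vote == 1) == cd[p] for p, vote in voted) / len(voted)
    return conf

votes = {("a", "b"): {"w1": 1, "w2": 1, "w3": -1},
         ("c", "d"): {"w1": -1, "w2": -1, "w3": -1},
         ("e", "f"): {"w1": 1, "w2": -1, "w3": 1}}
print(em_confidences(votes))  # w1 always agrees with the crowd, so c_w1 = 1.0
```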

Contact: Mihai Georgescu

email: [email protected]

L3S Research Center / Leibniz Universität Hannover
Appelstrasse 4, 30167 Hannover, Germany

phone: +49 511 762-19715

Recall (R), Accuracy (A), and Precision (P) per duplicate detection strategy (shown as a bar chart on the poster):

Strategy     R     A     P
sign         0.20  0.77  0.95
sign+DS/m    0.20  0.77  0.95
sign+DS/o    0.20  0.77  1.00
DS/m         0.67  0.70  0.48
DS/o         0.56  0.79  0.66
CD-MV        0.97  0.83  0.63

Duplicate Detection Strategies

Automatic method:
• sign: just signatures.
• DS/m, DS/o: just the DuplicatesScorer.
• sign + DS/m, sign + DS/o: first compute signatures, then base the decision on the DuplicatesScorer (a toy version of this cascade is sketched below).

Crowd decision:
• CD-MV: directly use the crowd decision obtained via majority voting.

Experiment setup, 3 batches:
o 60 HITs with qualification test
o 60 HITs without qualification test
o 120 HITs without qualification test

Crowdsourcing: 1 HIT = 5 pairs; 5 ct per HIT; 3 to 5 assignments per HIT.
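A minimal sketch of a sign + DS cascade, assuming signatures are cheap exact-match blocking keys that pre-filter pairs before the scorer runs; the signature definition and all names here are hypothetical.

```python
import re
from difflib import SequenceMatcher

def signature(rec: dict) -> str:
    # Cheap blocking key: first five normalized title tokens plus the year.
    title = re.sub(r"[^a-z0-9 ]", "", rec["title"].lower())
    return " ".join(title.split()[:5]) + "|" + rec.get("year", "")

def sign_plus_ds(rec1, rec2, scorer, threshold):
    # sign + DS: only pairs that share a signature are scored at all.
    if signature(rec1) != signature(rec2):
        return False
    return scorer(rec1, rec2) >= threshold

scorer = lambda a, b: SequenceMatcher(None, a["title"].lower(),
                                      b["title"].lower()).ratio()
r1 = {"title": "Comparing Heuristic, Evolutionary and Local Search Approaches to Scheduling",
      "year": "1996"}
r2 = {"title": "Comparing Heuristic, Evolutionary and Local Search Approaches to Scheduling.",
      "year": "1996"}
print(sign_plus_ds(r1, r2, scorer, 0.8))  # True: same signature, high score
```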

Crowd Decision and Optimization Strategies

Example HIT, showing the metadata of two publication records:

Record 1
Title: Comparing Heuristic, Evolutionary and Local Search Approaches to Scheduling
Authors: Soraya Rana, Adele E. Howe, L. Darrell, Whitley Keith Mathias
Venue: Proceedings of the Third International Conference on Artificial Intelligence Planning Systems, Menlo Park, CA
Publisher: The AAAI Press
Year: 1996
Language: English
Type: conference
Abstract: The choice of search algorithm can play a vital role in the success of a scheduling application. In this paper, we investigate the contribution of search algorithms in solving a real-world warehouse scheduling problem. We compare performance of three types of scheduling algorithms: heuristic, genetic algorithms and local search.

Record 2
Title: Comparing Heuristic, Evolutionary and Local Search Approaches to Scheduling.
Authors: Soraya B. Rana, Adele E. Howe, L. Darrell Whitley, Keith E. Mathias
Book: AIPS, pp. 174-181
Year: 1996
Language: English
Type: conference (inproceedings)

After carefully reviewing the publication metadata presented to you, how would you classify the 2 publications referred to?

Judgment for publications pair:
o Duplicates
o Not Duplicates

www.cubrikproject.eu

dblp.kbs.uni-hannover.de