Wisdom of Crowds in Human Memory: Reconstructing Events by Aggregating Memories across Individuals. Mark Steyvers, Department of Cognitive Sciences, University of California, Irvine. Joint work with: Brent Miller, Pernille Hemmer, Mike Yi, Michael Lee, Bill Batchelder, Paolo Napoletano.
Wisdom of Crowds in Human Memory: Reconstructing Events by Aggregating Memories across Individuals
Mark Steyvers
Department of Cognitive Sciences
University of California, Irvine
Joint work with: Brent Miller, Pernille Hemmer, Mike Yi, Michael Lee, Bill Batchelder, Paolo Napoletano
Ulysses S. Grant
James Garfield
Rutherford B. Hayes
Abraham Lincoln
Andrew Johnson
James Garfield
Ulysses S. Grant
Rutherford B. Hayes
Andrew Johnson
Abraham Lincoln
What is the correct chronological order?
time
Research goal: aggregating responses
Individual responses: D A B C, A B D C, B A D C, A C B D, A D B C
Aggregation algorithm -> group answer: A B C D
Does the group answer match the ground truth (A B C D)?
Task constraints
No communication between individuals
There is always a true answer (ground truth)
Aggregation algorithm is unsupervised: ground truth is used only for evaluation
Wisdom of crowds phenomenon
Group estimate often performs as well as or better than best individual in the group
Examples of wisdom of crowds phenomenon
Who Wants to Be a Millionaire?
Galton's Ox (1907): median of individual estimates comes close to the true answer
Relation to Cultural Consensus Theory (CCT)
Developed by Batchelder and Romney
CCT can recover the answer key of a multiple-choice test by analyzing responses across individuals
Key assumption: questions vary in difficulty and individuals vary in ability
Our models are similar to the ideas of CCT, but the emphasis is different:
  each problem studied has a ground truth
  we focus on the "wisdom of crowds" phenomenon
Overview of talk
Ordering problems – general knowledge: what is the order of US presidents?
Ordering problems – episodic memory: what is the order of events you have experienced?
Matching problems – memory for pairs: what object was paired with what person?
Recognition memory problems: what words were studied?
Experiment: 26 individuals order all 44 US presidents
George Washington John Adams Thomas Jefferson James Madison
James Monroe John Quincy Adams Andrew Jackson Martin Van Buren
William Henry Harrison John Tyler James Knox Polk Zachary Taylor
Millard Fillmore Franklin Pierce James Buchanan Abraham Lincoln
Andrew Johnson Ulysses S. Grant Rutherford B. Hayes James Garfield
Chester Arthur Grover Cleveland 1 Benjamin Harrison Grover Cleveland 2
William McKinley Theodore Roosevelt William Howard Taft Woodrow Wilson
Warren Harding Calvin Coolidge Herbert Hoover Franklin D. Roosevelt
Harry S. Truman Dwight Eisenhower John F. Kennedy Lyndon B. Johnson
Richard Nixon Gerald Ford James Carter Ronald Reagan
George H.W. Bush William Clinton George W. Bush Barack Obama
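Measuring performance
Kendall's tau: the number of adjacent pairwise swaps needed to turn an individual's ordering into the true order
Ordering by individual: A B E C D
True order: A B C D E
A B E C D -> A B C E D (1 swap) -> A B C D E (1 + 1 swaps)
Kendall's tau = 2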
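The distance measure can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: it counts the item pairs whose relative order differs between a response and the true order, which equals the number of adjacent pairwise swaps needed to sort the response. The function name `kendall_tau_distance` is an assumption made for this example.

```python
from itertools import combinations

def kendall_tau_distance(ordering, true_order):
    """Count item pairs that appear in opposite relative order in the two lists.

    This equals the minimum number of adjacent pairwise swaps needed to turn
    `ordering` into `true_order`.
    """
    position = {item: idx for idx, item in enumerate(true_order)}
    ranks = [position[item] for item in ordering]
    return sum(1 for a, b in combinations(ranks, 2) if a > b)

# The slide's example: A B E C D against the true order A B C D E.
print(kendall_tau_distance("ABECD", "ABCDE"))  # 2
```

For N items the distance ranges from 0 (identical orderings) to N(N-1)/2 (fully reversed).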
Empirical Results
[Figure: performance of each individual, with individuals ordered from best to worst on the x-axis; a reference level marks random guessing]
A Bayesian (generative) approach
[Diagram: a latent "input" ordering (? ? ? ?) passes through a separate model for each individual 1, 2, 3, …, N and yields the observed orderings D A B C, A B D C, B A D C, A C B D]
Bayesian models
We extend two models:
  Thurstone's (1927) model
  Estes' (1972) perturbation model
Bayesian Thurstonian Approach
Each item (A, B, C) has a true coordinate on some dimension
… but there is noise because of encoding and/or retrieval error
Each person's mental representation is based on (latent) samples of these distributions
The observed ordering is based on the ordering of the samples
People draw from distributions with common means but different variances
  Person 1, observed ordering: A < B < C
  Person 2, observed ordering: A < C < B
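The generative story can be sketched as a short simulation. This is an illustrative sketch only, assuming Gaussian noise around each item's coordinate; the coordinates in `true_means` and the noise levels in `person_sigmas` are hypothetical values chosen for the example, and `simulate_ordering` is not a name from the talk.

```python
import random

def simulate_ordering(true_means, sigma, rng):
    """Draw one noisy sample per item and return the items sorted by sample."""
    samples = {item: rng.gauss(mu, sigma) for item, mu in true_means.items()}
    return sorted(samples, key=samples.get)

rng = random.Random(0)  # fixed seed so the sketch is repeatable
true_means = {"A": 1.0, "B": 2.0, "C": 3.0}         # hypothetical coordinates
person_sigmas = {"Person 1": 0.2, "Person 2": 1.5}  # hypothetical noise levels

for person, sigma in person_sigmas.items():
    ordering = simulate_ordering(true_means, sigma, rng)
    print(person, "observed ordering:", " < ".join(ordering))
```

A person with a small noise level almost always reproduces the true order, while a person with a large noise level produces frequent transpositions.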
Graphical Model Notation
A plate around $x_j$ with $j = 1..3$ stands for the three nodes $x_1$, $x_2$, $x_3$
shaded = observed; not shaded = latent
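Graphical Model of Bayesian Thurstonian Model
Latent group means: $\mu$
Individual noise level: $\sigma_j \sim \mathrm{Gamma}(\lambda, 1/\lambda)$
Mental representation: $x_{ij} \mid \mu_i, \sigma_j \sim \mathcal{N}(\mu_i, \sigma_j)$
Observed ordering: $y_j = \mathrm{rank}(x_j)$
Plate over j individuals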
Inference
Need the posterior distribution $p(\mu, \sigma, x \mid y)$
Markov chain Monte Carlo: Gibbs sampling on $x$; Metropolis-Hastings on $\mu$ and $\sigma$
Inferred Distributions for 44 US Presidents
George Washington (1); John Adams (2)
Thomas Jefferson (3); James Madison (4); James Monroe (6)
John Quincy Adams (5); Andrew Jackson (7)
Martin Van Buren (8); William Henry Harrison (21)
John Tyler (10); James Knox Polk (18)
Zachary Taylor (16); Millard Fillmore (11); Franklin Pierce (19)
James Buchanan (13); Abraham Lincoln (9)
Andrew Johnson (12); Ulysses S. Grant (17)
Rutherford B. Hayes (20); James Garfield (22); Chester Arthur (15)
Grover Cleveland 1 (23); Benjamin Harrison (14)
Grover Cleveland 2 (25); William McKinley (24)
Theodore Roosevelt (29); William Howard Taft (27)
Woodrow Wilson (30); Warren Harding (26); Calvin Coolidge (28); Herbert Hoover (31)
Franklin D. Roosevelt (32); Harry S. Truman (33)
Dwight Eisenhower (34); John F. Kennedy (37)
Lyndon B. Johnson (36); Richard Nixon (39)
Gerald Ford (35); James Carter (38)
Ronald Reagan (40); George H.W. Bush (41)
William Clinton (42); George W. Bush (43)
Barack Obama (44)
median and minimum sigma
Model can predict individual performance
[Figure: inferred noise level for each individual plotted against distance to ground truth, one point per individual; R = 0.941]
(Weak) Wisdom of Crowds Effect
[Figure: performance of individuals (x-axis: Individuals) compared with the Thurstonian model]
Model's ordering is as good as the best individual (but not better)
Extension of Estes (1972) Perturbation Model
Main idea: item order is perturbed locally
Our extension: perturbation noise varies between individuals and items
[Diagram: true order A B C D E next to a recalled order of the same five items]
Modified Perturbation Model
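The idea of local perturbation can be illustrated with a small simulation. This is a simplified sketch under stated assumptions, not the modified model from the talk: it uses a fixed number of steps, swaps one randomly chosen adjacent pair per step with probability `theta`, and uses a single `theta` per person, ignoring the item-specific noise mentioned above. The names `perturb`, `theta` and `steps` are hypothetical.

```python
import random

def perturb(true_order, theta, steps, rng):
    """Locally perturb an ordering.

    Assumption for this sketch: on each step, with probability `theta`,
    one randomly chosen pair of adjacent items trades places.
    """
    order = list(true_order)
    if len(order) < 2:
        return order
    for _ in range(steps):
        if rng.random() < theta:
            i = rng.randrange(len(order) - 1)
            order[i], order[i + 1] = order[i + 1], order[i]
    return order

rng = random.Random(0)  # fixed seed so the sketch is repeatable
print(perturb("ABCDE", theta=0.3, steps=5, rng=rng))
```

Because every move is an adjacent swap, items tend to stay near their true positions, which is the local character the perturbation model is built on.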
Inferred Perturbation Matrix and Item Accuracy
1. George Washington (1); 2. John Adams (2)
3. Thomas Jefferson (3); 4. James Madison (4); 5. James Monroe (6)
6. John Quincy Adams (5); 7. Andrew Jackson (7)
8. Martin Van Buren (8); 9. William Henry Harrison (21)
10. John Tyler (11); 11. James Knox Polk (16)
12. Zachary Taylor (18); 13. Millard Fillmore (9)
14. Franklin Pierce (20); 15. James Buchanan (13); 16. Abraham Lincoln (15); 17. Andrew Johnson (10); 18. Ulysses S. Grant (17)
19. Rutherford B. Hayes (19); 20. James Garfield (22); 21. Chester Arthur (14)
22. Grover Cleveland 1 (23); 23. Benjamin Harrison (12)
24. Grover Cleveland 2 (25); 25. William McKinley (24)
26. Theodore Roosevelt (28); 27. William Howard Taft (26)
28. Woodrow Wilson (30); 29. Warren Harding (27); 30. Calvin Coolidge (29); 31. Herbert Hoover (31)
32. Franklin D. Roosevelt (32); 33. Harry S. Truman (33)
34. Dwight Eisenhower (34); 35. John F. Kennedy (35)
36. Lyndon B. Johnson (36); 37. Richard Nixon (38)
38. Gerald Ford (37); 39. James Carter (39)
40. Ronald Reagan (40); 41. George H.W. Bush (41)
42. William Clinton (42); 43. George W. Bush (43)
44. Barack Obama (44)
[Figure: inferred perturbation matrix, true position against output position, with an item accuracy panel; labeled items: Abraham Lincoln, Richard Nixon, James Carter]
Strong wisdom of crowds effect
[Figure: performance of individuals (x-axis: Individuals) compared with the Thurstonian model and the perturbation model]
Perturbation model's ordering is better than the best individual
Alternative Heuristic Models
Many heuristic methods from voting theory, e.g., the Borda count method
Suppose we have 10 items:
  assign a count of 10 to the first item, 9 to the second item, etc.
  add counts over individuals
  order items by the Borda count
i.e., rank by average rank across people
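The Borda count steps listed above can be sketched as follows. This is a minimal illustration with hypothetical responses; the alphabetical tie-break is an assumption added so that the result is deterministic.

```python
from collections import defaultdict

def borda_order(orderings):
    """Aggregate orderings: with n items, position p (0-based) scores n - p."""
    scores = defaultdict(int)
    for ordering in orderings:
        n = len(ordering)
        for position, item in enumerate(ordering):
            scores[item] += n - position
    # Highest total first; ties broken alphabetically (an assumption of this sketch).
    return sorted(scores, key=lambda item: (-scores[item], item))

# Hypothetical responses from four people ordering four items.
responses = ["ABCD", "BACD", "ABDC", "ACBD"]
print(borda_order(responses))  # ['A', 'B', 'C', 'D']
```

Every person's ordering carries the same weight here, which is the main contrast with models that infer a noise level for each individual.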
Model Comparison
[Figure: performance of individuals (x-axis: Individuals) compared with the Thurstonian model, perturbation model, and Borda count]
Ordering Ten Amendments
Freedom of speech & religion (1)
Right to bear arms (2)
No quartering of soldiers (4)
No unreasonable searches (3)
Due process (5)
Trial by Jury (6)
Civil Trial by Jury (7)
No cruel punishment (8)
Right to non-specified rights (10)
Power for the States & People (9)
Ordering Ten Commandments
Worship any other God (1)
Make a graven image (7)
Take the Lord's name in vain (2)
Break the Sabbath (3)
Dishonor your parents (4)
Murder (6)
Commit adultery (8)
Steal (5)
Bear false witness (9)
Covet (10)
Recollecting order from episodic memory
http://www.youtube.com/watch?v=a6tSyDHXViM&feature=related
Place scenes in correct order (serial recall)
[Diagram: scenes A, B, C, D arranged along a time axis]
Recollecting Order from Episodic Memory
Study this sequence of images
Place the images in correct sequence (serial recall)
[Display: images labeled A, B, C, D, E, F, G, H, I, J]
Average results across 6 problems
[Figure: mean performance of individuals (x-axis: Individuals) compared with the Thurstonian model, perturbation model, and Borda count]
Example calibration result for individuals
[Figure: inferred noise level plotted against distance to ground truth, one point per individual; R = 0.920 (pizza sequence; perturbation model)]
Study these combinations
[Display: items 1 2 3 4 5 paired with A B C D E]
Find all matching pairs
Bayesian Matching Model
Proposed process:
  match "known" items
  guess between remaining ones
Individual differences:
  some items are easier to know
  some participants know more
Graphical Model
Latent answer key: $z$
Person ability: $a_j$
Item easiness: $d_i$
Prob. of knowing: $\mathrm{logit}(s_{ij}) = d_i + a_j$
Knowledge state: $x_{ij} \sim \mathrm{Bernoulli}(s_{ij})$
Observed matching: $p(y_{ij} = z_i \mid x_{ij}) = 1$ if $x_{ij} = 1$, and $1/n!$ if $x_{ij} = 0$
Plates: i items, j individuals
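The proposed process (match the known items, guess among the rest) can be sketched as a forward simulation. This is an illustrative sketch, assuming the logit of the probability of knowing an item is the sum of item easiness and person ability; the function name `simulate_matching` and all parameter values are hypothetical.

```python
import math
import random

def simulate_matching(answer_key, easiness, ability, rng):
    """Simulate one person's matching response.

    answer_key[i] is the correct response for item i. Each item is "known"
    with probability logistic(easiness[i] + ability); known items are matched
    correctly and the leftover responses are shuffled over the unknown items.
    """
    n = len(answer_key)
    known = [rng.random() < 1.0 / (1.0 + math.exp(-(easiness[i] + ability)))
             for i in range(n)]
    unknown_items = [i for i in range(n) if not known[i]]
    leftover = [answer_key[i] for i in unknown_items]
    rng.shuffle(leftover)  # uniform guess among the remaining responses
    response = list(answer_key)
    for i, guess in zip(unknown_items, leftover):
        response[i] = guess
    return response

rng = random.Random(0)  # fixed seed so the sketch is repeatable
key = ["A", "B", "C", "D", "E"]
print(simulate_matching(key, easiness=[0.0] * 5, ability=1.0, rng=rng))
```

A guessed item can still land on its correct response by chance, so an observed correct match does not prove that the item was known.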
Results across 8 problems
[Figure: mean accuracy of individuals (x-axis: Individuals) compared with Bayesian matching and the Hungarian algorithm]
General Knowledge Matching Problems
Dutch
Danish
Yiddish
Thai
Vietnamese
Chinese
Georgian
Russian
Japanese
A
B
C
D
E
F
G
H
I
godt nytår
gelukkig nieuwjaar
a gut yohr
С Новым Годом
สวสัดีปีใหม่
Chúc Mừng Nǎm Mới
გილოცავთ ახალწელს
Modeling Results – General Knowledge Tasks
[Figure: mean accuracy of individuals (x-axis: Individuals) compared with Bayesian matching and the Hungarian algorithm]
Systematic Errors and Biases
Some memory errors are systematic
When averaging over biased individuals, the group estimate will also be systematically biased
… unless the aggregation model can explain the bias
Listen to these words…
Associative structure influences false memories
[Diagram: associative network linking the words cow, calf, bull, herd, pasture, cattle, milk, graze]
Experiment
Study list:
  10 lists of 15 spoken words
Recognition memory test:
  Targets (15 items)
  Lure (1 item)
  Related distractors (15 items)
  Unrelated distractors (15 items)
Confidence ratings:
  5-point confidence ratings
  1 = definitely not on list; 2 = probably not on list; 3 = not sure; 4 = probably on list; 5 = sure it was on the list
Mean Confidence ratings for 12 individuals
[Figure: mean confidence rating (1 to 5) by item type (T, L, R, U), one panel each for Individuals 1 to 12 plus a panel labeled METHOD1]
Signal Detection Aggregation Model
[Figure: distributions for new (z = 0) and old (z = 1) items, with confidence ratings 1 to 5 marked along the axis]
Important: model needs to infer z, whether an item is old or new
Incorporating Associative Structure
[Diagram: associative network linking the words cow, calf, bull, herd, pasture, cattle, milk, graze]
Incorporating Associative “Boost”
[Figure: distributions for new (z = 0) and old (z = 1) items, with confidence ratings 1 to 5 marked along the axis]
Associative "boost" depends on:
  1) set of items that are considered "old"
  2) vulnerability of individuals to associative influences
Inferred target status over MCMC iterations
[Figure: inferred status of each test word across iterations 5 to 50. Words shown: MOO, CALF, HERD, BULL, PASTURE, CATTLE, MILK, GRAZE, BEEF, FARM, BARN, STEER, DAIRY, VEAL, LEATHER, COW, FIELD, MEAT, HORSE, EAT, HAY, SHEEP, GRASS, LEG, BUFFALO, FOOD, DRINK, BELT, STEAK, JACKET, ROAST, PLAY, VASE, FRESH, COMEDIAN, REAR, EMPLOYEE, GRAND, REFRAIN, BLANK, LOSE, ITEM, BARGAIN, GREAT, REEL, PEDAL]
ROC Curves for SDT Aggregation Models
[Figure: ROC curves, hit rate against false alarm rate. SDT + assoc.: AUC = 0.995; SDT: AUC = 0.977]
Performance of Individuals and Aggregate
[Figure: area under the curve (AUC) for individuals (x-axis: Individuals) compared with SDT + assoc. and SDT]
Summary
Aggregation of combinatorially complex data
  going beyond numerical estimates or multiple choice questions
Incorporate individual differences
  going beyond models that treat every vote equally
  assume some individuals might be "experts"
Take cognitive processes into account
  going beyond mere statistical aggregation
  allows us to correct for systematic errors and biases
Predictive Rankings: fantasy football
South Australian Football League (32 people rank 9 teams)
Australian Football League (29 people rank 16 teams)
[Figures: for each league, performance of individuals (x-axis: Individuals) compared with the Thurstonian model, perturbation model, and Borda count]
Experiment
78 participants
17 ordering problems, each with 10 items:
  chronological events
  physical measures
  purely ordinal problems, e.g., Ten Amendments, Ten Commandments
Ordering states west-east
Oregon (1)
Utah (2)
Nebraska (3)
Iowa (4)
Alabama (6)
Ohio (5)
Virginia (7)
Delaware (8)
Connecticut (9)
Maine (10)
Question
How many individuals do we need to average over?
Effect of Group Size: random groups
[Figure: performance as a function of group size (0 to 80) for randomly composed groups; legend entries T=0, T=2, T=12]
How effective are small groups of experts?
Want to find experts endogenously – without feedback
Approach: select individuals with the smallest estimated noise levels based on previous tasks
We are identifying general expertise ("Spearman's g")
Group Composition based on prior performance
[Figure: performance as a function of group size (0 to 80), with curves labeled by the number of previous tasks T used to select individuals; labels shown: T=0, T=2, T=8, T=12]
Group size (best individuals first)
Endogenous: no feedback required
Exogenous: selecting people based on actual performance
[Figures: performance as a function of group size for each selection method]
Online Experiments
Experiment 1 (Prior knowledge): http://madlab.ss.uci.edu/dem2/examples/
Experiment 2a (Serial Recall), study sequence of still images: http://madlab.ss.uci.edu/memslides/
Experiment 2b (Serial Recall), study video: http://madlab.ss.uci.edu/dem/
MDS solution of pairwise tau distances
[Figure: MDS solution for the states west-east problem. Points show individuals (each labeled with its distance to truth), the truth, and the Thurstonian model]
MDS solution of pairwise tau distances
[Figure: MDS solution for the ten commandments problem. Points show individuals (each labeled with its distance to truth), the truth, and the Thurstonian model]
Thurstonian Model – stereotyped event sequences
Bus (Recall), R=0.890: event1 (1), event2 (2), event3 (3), event4 (4), event5 (5), event6 (7), event7 (6), event8 (8), event9 (9), event10 (10)
Morning (Recall), R=0.982: event1 (1), event2 (2), event3 (3), event4 (4), event5 (5), event6 (6), event7 (7), event8 (8), event9 (9), event10 (10)
Wedding (Recall), R=0.973: event1 (1), event2 (2), event3 (3), event4 (4), event5 (5), event6 (6), event7 (7), event8 (8), event9 (9), event10 (10)
[Figures: one panel per sequence with inferred distributions for the ten events and a scatter plot reporting the R value]
Thurstonian Model – “random” videos
Yogurt (Recall), R=0.908: event1 (1), event2 (2), event3 (3), event4 (5), event5 (7), event6 (6), event7 (4), event8 (8), event9 (9), event10 (10)
Pizza (Recall), R=0.851: event1 (1), event2 (3), event3 (4), event4 (5), event5 (2), event6 (6), event7 (7), event8 (9), event9 (10), event10 (8)
Clay (Recall), R=0.928: event1 (1), event2 (2), event3 (3), event4 (4), event5 (6), event6 (5), event7 (7), event8 (8), event9 (9), event10 (10)
[Figures: one panel per sequence with inferred distributions for the ten events and a scatter plot reporting the R value]
Heuristic Aggregation Approach
Combinatorial optimization problem
  maximizes agreement in assigning N items to N responses
Hungarian algorithm
  construct a count matrix M
  M_ij = number of people that paired item i with response j
  find row and column permutations to maximize diagonal sum
  O(n^3)
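The count-matrix step and the optimization target can be sketched as follows. Note that this sketch does not implement the Hungarian algorithm: it substitutes an exhaustive search over permutations, which finds the same optimum but takes O(n!) time and is only practical for small n. The responses and the function names are hypothetical.

```python
from itertools import permutations

def count_matrix(responses, n):
    """M[i][j] = number of people who paired item i with response j."""
    M = [[0] * n for _ in range(n)]
    for person in responses:
        for item, resp in enumerate(person):
            M[item][resp] += 1
    return M

def best_assignment(M):
    """Exhaustive search for the assignment with the largest total count.

    Stand-in for the Hungarian algorithm: same optimum, but O(n!) time.
    """
    n = len(M)
    return max(permutations(range(n)),
               key=lambda perm: sum(M[i][perm[i]] for i in range(n)))

# Hypothetical responses: person[i] is the response index chosen for item i.
responses = [(0, 1, 2), (0, 2, 1), (1, 0, 2), (0, 1, 2)]
M = count_matrix(responses, 3)
print(M)                   # [[3, 1, 0], [1, 2, 1], [0, 1, 3]]
print(best_assignment(M))  # (0, 1, 2)
```

For problems of realistic size, a polynomial-time assignment solver would replace `best_assignment`.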
Hungarian Algorithm Example
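Count matrix (rows: phrases; columns: languages; each cell is the number of people who made that pairing; cell shading in the original marks correct and incorrect assignments)
Phrase | Dutch | Danish | French | Japanese | Spanish | Arabic | Chinese | German | Italian | Russian | Thai | Vietnamese | Welsh | Georgian | Yiddish
gelukkig Nieuwjaar | 7 | 3 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 2 | 0 | 2
godt nytår | 2 | 3 | 0 | 0 | 0 | 0 | 0 | 2 | 0 | 2 | 0 | 0 | 1 | 3 | 2
bonne année | 0 | 0 | 14 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0
(phrase not extracted) | 0 | 0 | 0 | 9 | 0 | 0 | 2 | 0 | 1 | 0 | 3 | 0 | 0 | 0 | 0
feliz año nuevo | 0 | 0 | 0 | 0 | 14 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0
عامسعيد | 0 | 1 | 0 | 0 | 0 | 14 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0
(phrase not extracted) | 0 | 0 | 0 | 2 | 0 | 0 | 12 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0
ein gutes neues Jahr | 3 | 1 | 0 | 0 | 0 | 0 | 0 | 9 | 0 | 0 | 0 | 0 | 1 | 0 | 1
felice anno nuovo | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 14 | 1 | 0 | 0 | 0 | 0 | 0
С Новым Годом | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 11 | 0 | 0 | 1 | 2 | 0
สวัสดีปีใหม่ | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 0 | 0 | 7 | 1 | 1 | 4 | 0
Chúc Mừng Nǎm Mới | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 11 | 1 | 2 | 0
Blwyddyn Newydd Dda | 0 | 4 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 6 | 1 | 2
გილოცავთ ახალ წელს | 0 | 0 | 0 | 2 | 0 | 0 | 0 | 1 | 0 | 0 | 3 | 2 | 0 | 1 | 6
a gut yohr | 3 | 3 | 0 | 0 | 0 | 0 | 0 | 3 | 0 | 0 | 0 | 0 | 2 | 2 | 2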
What are methods for finding experts?
1) Self-reported expertise:
  unreliable
  has led to claims of "myth of expertise"
2) Based on explicit scores by comparing to ground truth
  but ground truth might not be immediately available
3) Endogenously discover experts
  use the crowd to discover experts
  small groups of experts can be effective
Predicting problem difficulty
[Figure: dispersion (std) of noise levels across individuals plotted against distance of group answer to ground truth, one point per problem (1 to 17); R = -0.752. Labeled problems: ordering states geographically, city size rankings]
Mean p( “yes” )
[Figure: mean p("yes") by item type (T, L, R, U), one panel per participant: aaa, ardor, azs, incognito, indigo, jshi, nobody, peter griffin, piper michelle, plutonium, scott bakula, sky, plus a panel labeled METHOD1]
note: confidence ratings were converted to yes/no judgments. Yes = rating >= 3; No = rating < 3
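The conversion rule in the note can be written out directly. This is a minimal sketch with hypothetical ratings; the function names are chosen for this example only.

```python
def to_yes_no(rating):
    """Convert a 1-5 confidence rating to a yes/no judgment (yes = rating >= 3)."""
    return rating >= 3

def mean_p_yes(ratings):
    """Proportion of ratings that count as "yes"."""
    return sum(to_yes_no(r) for r in ratings) / len(ratings)

# Hypothetical ratings for five test items.
print(mean_p_yes([5, 4, 3, 2, 1]))  # 0.6
```

Under this rule the "not sure" rating (3) counts as a "yes".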
Recollection of 9/11 Event Sequence (Altmann, 2003)
[Figure: number of people (y-axis, 0 to 30) giving each response ordering; the correct ordering and the most frequent response (i.e., the mode) are marked]
Response orderings (one ordering per column; rows are positions 1 to 6):
A A A A A A A A A A A A C C A A A A A A A A C E E E
B B B B B C C D D B C B A A B B B B C C D E D A C C
C C D C D B B B B D B E B B D D E F D E F B A B A A
D E F D C D E F F C D C D E E F D D B B B C B C B D
F D C E F F D C E E E D F D F E F C F D C D F D D B
E F E F E E F E C F F F E F C C C E E F E F E F F F
A = One plane hits the WTC
B = A second plane hits the WTC
C = One plane crashes into the Pentagon
D = One tower at the WTC collapses
E = One plane crashes in Pennsylvania
F = A second tower at the WTC collapses
Example tasks studied in our research
Ordering problems: what is the order of US presidents?
Matching problems (memory for pairs): what object was paired with what person?
Recognition memory problems: what set of words was studied?
All are problems involving combinatorially complex inference