Overconfidence in judgment: Why experience might not be a good teacher Tom Stewart September 24, 2007


Overconfidence in judgment: Why experience might not be a good teacher

Tom Stewart

September 24, 2007

2

Einhorn, H. J., & Hogarth, R. M. (1978). Confidence in judgment: Persistence of the illusion of validity. Psychological Review, 85(5), 395-416.

“How can the contradiction between the considerable evidence on the fallibility of human judgment be reconciled with the seemingly unshakable confidence people exhibit in their judgmental ability? In other words, why does the illusion of validity persist?” (p. 396)

3

Experience, performance, confidence

[Diagram: Experience, Performance, and Confidence, each connection marked "?"]

4

Experience, performance, confidence

[Diagram: Experience, Performance, and Confidence, each connection marked "?", with Uncertainty and Feedback added]

5

[Scatterplot: Judgment (x-axis, 0 to 100) vs. "Truth" (y-axis, 0 to 100); r = .50]

6

[Scatterplot: Judgment (x-axis, 0 to 100) vs. "Truth" (y-axis, 0 to 100); r = .50]

7

[Scatterplot: Judgment (x-axis, 0 to 100) vs. "Truth" (y-axis, 0 to 100); r = .80]

8

[Scatterplot: Judgment (x-axis, 0 to 100) vs. "Truth" (y-axis, 0 to 100); r = .95]

See speaker note

9

Judgments are continuous

Decisions are discrete

10

Threshold model

[Diagram: Judgment axis running from Low to High; Threshold 1 separates Decision A from Decision B, and Threshold 2 separates Decision B from Decision C]
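A minimal sketch of the threshold model, assuming a 0 to 100 judgment scale. The cutoff values 40 and 70 and the function name `decide` are hypothetical and chosen only for illustration; the slide gives no numeric thresholds.

```python
from bisect import bisect_right

# Hypothetical cutoffs on a 0-100 judgment scale (not taken from the slides).
THRESHOLDS = [40.0, 70.0]  # Threshold 1, Threshold 2
DECISIONS = ["Decision A", "Decision B", "Decision C"]


def decide(judgment: float) -> str:
    """Map a continuous judgment to one of three discrete decisions."""
    # bisect_right counts how many thresholds the judgment has reached,
    # which is exactly the index of the decision region it falls in.
    return DECISIONS[bisect_right(THRESHOLDS, judgment)]


if __name__ == "__main__":
    for judgment in (10, 40, 55, 95):
        print(judgment, "->", decide(judgment))
```

A judgment exactly on a threshold is assigned to the higher region here; that tie-breaking rule is also an assumption.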

11

[Scatterplot: Judgment (x-axis, 0 to 100) vs. "Truth" (y-axis, 0 to 100); r = .50; a decision threshold on the Judgment axis separates "Don't act" from "Act"]

12

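[Scatterplot: Judgment (x-axis, 0 to 100) vs. "Truth" (y-axis, 0 to 100); r = .50; a criterion threshold on the "Truth" axis separates cases where action is appropriate from cases where action is inappropriate]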

13

[Scatterplot: Judgment (x-axis, 0 to 100) vs. "Truth" (y-axis, 0 to 100); r = .50; the decision threshold and the criterion threshold divide the cases into Hits, False alarms, Misses, and Correct rejections]

14

[Scatterplot: Judgment (x-axis, 0 to 100) vs. "Truth" (y-axis, 0 to 100); r = .50; the decision threshold and the criterion threshold divide the cases into Hits, False alarms, Misses, and Correct rejections]
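The four outcomes follow from comparing each case with the two thresholds. A short sketch, assuming that "act" means the judgment is at or above the decision threshold and that action is appropriate when "truth" is at or above the criterion threshold; the default cutoffs of 50 and the name `classify` are illustrative assumptions.

```python
def classify(judgment: float, truth: float,
             decision_threshold: float = 50.0,
             criterion_threshold: float = 50.0) -> str:
    """Label one case as hit, false alarm, miss, or correct rejection."""
    act = judgment >= decision_threshold        # what the judge decides
    appropriate = truth >= criterion_threshold  # what should have been done
    if act and appropriate:
        return "hit"
    if act and not appropriate:
        return "false alarm"
    if not act and appropriate:
        return "miss"
    return "correct rejection"


if __name__ == "__main__":
    for judgment, truth in [(80, 90), (80, 10), (20, 90), (20, 10)]:
        print(judgment, truth, "->", classify(judgment, truth))
```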

15

[Scatterplot: Judgment (x-axis, 0 to 100) vs. "Truth" (y-axis, 0 to 100); r = .50; the decision threshold and the criterion threshold divide the cases into Hits, False alarms, Misses, and Correct rejections]

16

[Scatterplot: Judgment (x-axis, 0 to 100) vs. "Truth" (y-axis, 0 to 100); r = .95; the decision threshold and the criterion threshold divide the cases into Hits, False alarms, Misses, and Correct rejections]

17

Research on judging contingencies between x and y based on information in 2x2 tables suggests that people focus on frequency of Hits.

This may be due to the difficulty people have in using disconfirming information.

[Scatterplot: Judgment (x-axis, 0 to 100) vs. "Truth" (y-axis, 0 to 100), divided into Hits, False alarms, Misses, and Correct rejections]

18

How do people learn to make decisions if feedback (knowledge of results) is incomplete?

• Selective feedback example – selection task
– If an employer chooses not to hire an applicant, she will not learn how that applicant would have performed.

• Selective feedback example – detection task
– If a customs officer chooses not to search an airline passenger entering the country, he will not learn whether the passenger is smuggling goods into the country.

19

Knowledge of results: Full feedback

[Scatterplot: Judgment (x-axis, 0 to 100) vs. "Truth" (y-axis, 0 to 100), divided into Hits, False alarms, Misses, and Correct rejections]

20

Knowledge of results: Selective feedback

[Scatterplot: Judgment (x-axis, 0 to 100) vs. "Truth" (y-axis, 0 to 100), divided into Hits, False alarms, Misses, and Correct rejections]

21

Typical results

Cases: 100. Base rate: 0.500. Correlation: 0.700.

[Two scatterplot panels, one per feedback condition: Judgment vs. "Truth", divided into Hit, False alarm, Miss, and Correct rejection]

Full feedback

                       Decision
Correct decision    Reject   Select   Total
Select                  13       37      50
Reject                  37       13      50
Total                   50       50     100

0.740 = proportion correct decisions

Selective feedback

                       Decision
Correct decision    Reject   Select   Total
Select                  23       27      50
Reject                  44        6      50
Total                   67       33     100

0.710 = proportion correct decisions
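The table in which 50 of 100 cases are selected is consistent with a simple model: judgment and "truth" are bivariate normal with correlation 0.700, and both cutoffs sit at the median. That distributional assumption is not stated on the slide; the sketch below only shows that it reproduces the 37/13/13/37 split. The function name `median_split_table` is hypothetical.

```python
import math


def median_split_table(r: float, n_cases: int = 100) -> dict:
    """Expected outcome counts when base rate and selection rate are both 0.5.

    Assumes (not stated in the slides) a bivariate normal model, for which
    P(judgment > median and truth > median) = 1/4 + asin(r) / (2 * pi).
    """
    both_high = 0.25 + math.asin(r) / (2 * math.pi)
    one_high = 0.5 - both_high
    return {
        "hits": n_cases * both_high,
        "false_alarms": n_cases * one_high,
        "misses": n_cases * one_high,
        "correct_rejections": n_cases * both_high,
    }


if __name__ == "__main__":
    table = median_split_table(0.7)
    print({name: round(count) for name, count in table.items()})
    correct = table["hits"] + table["correct_rejections"]
    print("proportion correct (unrounded):", round(correct / 100, 3))
```

The unrounded proportion correct comes out slightly above 0.740 because the slide's figure is computed from whole-number cell counts.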

22

Possible explanation

• Encoding of cases when no feedback is available. Two possibilities (not exhaustive):

– Positivist – People assume that when feedback is missing, accuracy is the same as when feedback is present.

– Constructivist (optimistic) – People assume perfect accuracy when feedback is missing.

Elwin, E., Juslin, P., Olsson, H., & Enkvist, T. (2007). Constructivist Coding: Learning From Selective Feedback. Psychological Science, 18(2), 105-110.

23

Selective feedback – possible types of encoding

Cases: 100. Base rate: 0.500. Correlation: 0.700.

Objective results

                       Decision
Correct decision    Reject   Select   Total
Select                  23       27      50
Reject                  44        6      50
Total                   67       33     100

0.710 = proportion correct decisions
0.500 = base rate

Subjective results – Positivist encoding

                       Decision
Correct decision    Reject   Select   Total
Select                  12       27      39
Reject                  55        6      61
Total                   67       33     100

0.818 = subjective proportion correct decisions
0.392 = subjective base rate

Subjective results – Constructivist (optimistic) encoding

                       Decision
Correct decision    Reject   Select   Total
Select                   0       27      27
Reject                  67        6      73
Total                   67       33     100

0.940 = subjective proportion correct decisions
0.270 = subjective base rate

[Legend: marked cells = subjective encoding]
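The two subjective tables can be reproduced from the objective counts (27 hits, 6 false alarms, 23 misses, 44 correct rejections). This sketch reads positivist encoding as "accuracy among rejected cases equals the observed accuracy among selected cases" and constructivist encoding as "every rejection was correct". The function name `encode` and its signature are illustrative, not from the source.

```python
def encode(hits: float, false_alarms: float, misses: float,
           correct_rejections: float, mode: str) -> dict:
    """Subjective results when rejected cases give no feedback."""
    selected = hits + false_alarms
    rejected = misses + correct_rejections
    total = selected + rejected
    if mode == "positivist":
        # Carry the observed accuracy on selected cases over to rejected ones.
        assumed_accuracy = hits / selected
    elif mode == "constructivist":
        # Optimistic: assume every rejected case was correctly rejected.
        assumed_accuracy = 1.0
    else:
        raise ValueError(f"unknown mode: {mode}")
    subjective_cr = assumed_accuracy * rejected
    subjective_misses = rejected - subjective_cr
    return {
        "misses": subjective_misses,
        "correct_rejections": subjective_cr,
        "proportion_correct": (hits + subjective_cr) / total,
        "base_rate": (hits + subjective_misses) / total,
    }


if __name__ == "__main__":
    for mode in ("positivist", "constructivist"):
        result = encode(27, 6, 23, 44, mode)
        print(mode, {name: round(value, 3) for name, value in result.items()})
```

With these inputs the positivist case gives about 12 misses and 55 correct rejections, matching the rounded cells in the table.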

24

Encoding and values affect the cutoff

Subjective encoding

If people assume they are correct when they don’t get feedback, the cutoff will move up (fewer cases selected).

Values of the four outcomes

– There is evidence that people value hits more than other outcomes.

– This could result in selecting more cases.

– However, if people pay attention to the positive hit rate, they might select fewer cases.

Einhorn and Hogarth, 1978

25

Hits as a function of selection rate

Base rate: 0.500. Correlation: 0.700.

[Line chart: x-axis is Selection rate (0 to 1); y-axis is Proportion (0 to 1); three curves: Hit rate, Proportion correct, and Proportion of all decisions that are hits]

(Hit rate is number of hits divided by number of positive decisions.)

Note that hit rate can be high even if accuracy is not.

Full feedback results

                       Decision
Correct decision    Reject   Select
Select                  41        9
Reject                  49        1

Full feedback results

                       Decision
Correct decision    Reject   Select
Select                   0       50
Reject                   0       50
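The gap between hit rate and accuracy can be checked directly from the two tables on this slide. A small sketch using the slide's own definition of hit rate (hits divided by positive decisions); the function names are illustrative.

```python
def hit_rate(hits: int, false_alarms: int) -> float:
    """Hits divided by the number of positive (select) decisions."""
    return hits / (hits + false_alarms)


def proportion_correct(hits: int, false_alarms: int,
                       misses: int, correct_rejections: int) -> float:
    """Hits plus correct rejections, divided by all cases."""
    total = hits + false_alarms + misses + correct_rejections
    return (hits + correct_rejections) / total


if __name__ == "__main__":
    # First table: only 10 of 100 cases are selected.
    print(hit_rate(9, 1), proportion_correct(9, 1, 41, 49))
    # Second table: every case is selected.
    print(hit_rate(50, 50), proportion_correct(50, 50, 0, 0))
```

With only 10 cases selected, the hit rate is 0.90 while the proportion of correct decisions is 0.58.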

26

Plot of expected value vs. decision cutoff

Base rate: 0.500. Correlation: 0.700.

[Line chart: x-axis is Cutoff; y-axis is expected value (E.V., 0 to 1.4); three curves:
– Full feedback, objective expected value
– Selective feedback, constructivist encoding, subjective expected value
– Selective feedback, positivist encoding, subjective expected value]

Payoff matrix (assumes greater value for hits)

                       Decision
Correct decision    Reject   Select
Select                  -1        2
Reject                   1       -1
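Expected value per case is the payoff-weighted average of the four outcome counts. The sketch below applies the slide's payoff matrix (hit 2, correct rejection 1, miss -1, false alarm -1) to outcome counts from the earlier tables. Whether the plotted curves were computed exactly this way is an assumption, and the names `PAYOFF` and `expected_value` are illustrative.

```python
# Payoffs from the slide's matrix; hits are valued more than other outcomes.
PAYOFF = {"hit": 2, "false_alarm": -1, "miss": -1, "correct_rejection": 1}


def expected_value(counts: dict) -> float:
    """Average payoff per case for a table of outcome counts."""
    total = sum(counts.values())
    return sum(PAYOFF[outcome] * n for outcome, n in counts.items()) / total


if __name__ == "__main__":
    tables = {
        "full feedback (objective)":
            {"hit": 37, "false_alarm": 13, "miss": 13, "correct_rejection": 37},
        "selective feedback (objective)":
            {"hit": 27, "false_alarm": 6, "miss": 23, "correct_rejection": 44},
        "selective feedback (constructivist, subjective)":
            {"hit": 27, "false_alarm": 6, "miss": 0, "correct_rejection": 67},
    }
    for label, counts in tables.items():
        print(label, expected_value(counts))
```

The constructivist table yields a higher subjective value than either objective table, which is the pattern the summary slide attributes to overconfidence.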

27

Summary: Selective feedback increases confidence while reducing performance

• Research suggests that, with limited feedback, people will learn to select fewer cases.

– This results in a decision bias that increases the error rate.

• Other research suggests that people pay more attention to hits than to other outcomes.

– This could result in either more cases being selected in order to increase the number of hits, or fewer cases to increase the hit rate.

• The constructivist encoding hypothesis can account for the experimental results.*

• Furthermore, with constructivist encoding subjective performance will be better than objective performance, accounting for overconfidence.

• It appears that while selective feedback results in more decision errors, it may not affect the accuracy of judgment.

*Of course, this does not prove that people are actually doing constructivist encoding, and there are certainly individual differences.

28

End

29

Confidence

• People pay attention to positive hit rate.

– Inability to use disconfirming information

– Limited feedback when action not taken

• Positive hit rate is often high, even when accuracy is not.

– Positive hit rate can always be increased by reducing selection rate/increasing threshold.

– Treatment effects increase positive hit rate, and this increase is greater for high selection rates.
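The effect of raising the threshold on positive hit rate can be illustrated with a small simulation. This is a sketch under assumptions not stated on the slide: judgment and "truth" are standard bivariate normal with correlation r, and the criterion threshold sits at the median. The function name, sample size, and seed are arbitrary choices.

```python
import math
import random


def simulated_hit_rate(r: float, judgment_cutoff: float,
                       truth_cutoff: float = 0.0,
                       n: int = 50_000, seed: int = 1) -> float:
    """Estimate hits / selected cases for a given judgment cutoff."""
    rng = random.Random(seed)
    hits = 0
    selected = 0
    for _ in range(n):
        judgment = rng.gauss(0.0, 1.0)
        # Build "truth" so that it correlates r with judgment.
        truth = r * judgment + math.sqrt(1.0 - r * r) * rng.gauss(0.0, 1.0)
        if judgment > judgment_cutoff:
            selected += 1
            if truth > truth_cutoff:
                hits += 1
    return hits / selected


if __name__ == "__main__":
    for cutoff in (0.0, 0.5, 1.0, 1.5):
        print(cutoff, round(simulated_hit_rate(0.5, cutoff), 3))
```

Under these assumptions the estimated hit rate rises as the cutoff increases, even though the correlation between judgment and "truth" stays fixed at .50.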

30

If people judge their skill by the true positive rate, what affects that rate?

• Base rate

• Correlation

• Selection rate

• Treatment effects

Illustrate with spreadsheet C:/Documents and Settings/Tom/My Documents/aaDocuments/AAPRJCTS/2005/NSF-TR-SDT-Feedback/6-Talks/T-R-634Assignment-Einhorn-Hogarth-treatment.xls

31

Treatment effect

[Scatterplot: Judgment (x-axis) vs. Gold Standard (y-axis); r = .50; regions labeled True positive, False positive, False negative, and True negative]

32

Treatment effect

[Scatterplot: Judgment (x-axis) vs. Gold Standard (y-axis); r = .50; regions labeled True positive, False positive, False negative, and True negative]