Decision-Principles to Justify Carnap's Updating Method and to Suggest Corrections of Probability Judgments

Peter P. Wakker
Economics Dept., Maastricht University
Nir Friedman (opening)
[Word-cloud slide. "Good words" for this audience: agent, Bayesian, network, learning, elicitation, diagram, causality, utility, reasoning. "Bad words": dimension, map, density, labels, player, ancestral, generative, dynamics, bound, filtering, iteration, graph.]
"Decision theory = probability theory + utility theory."

Bayesian networkers care about probability theory. However, why care about utility theory?
(1) It is important for decisions.
(2) It helps in studying probabilities: if you are interested in the processing of probabilities, the tools of utility theory can still be useful.
Outline
1. Decision Theory: Empirical Work (on Utility);
2. A New Foundation of (Static) Bayesianism;
3. Carnap's Updating Method;
4. Corrections of Probability Judgments Based on Empirical Findings.
1. Decision Theory: Empirical Work

(Hypothetical) measurement of the popularity of internet sites.
For simplicity, Assumption. We compare internet sites that differ only regarding (randomness in) waiting time.
Question: How does random waiting time affect the popularity of internet sites? Through the average?
More refined procedure: not the average of waiting time, but the average of how people feel about waiting time, i.e., the (subjectively perceived) cost of waiting time.

Problem: users' subjectively perceived cost of waiting time may be nonlinear.
[Figure: subjectively perceived cost of waiting (y-axis, 0 to 1 in steps of 1/6) against waiting time in seconds (x-axis); the nonlinear curve passes through the points (0, 0), (3, 1/6), (5, 2/6), (7, 3/6), (9, 4/6), (14, 5/6), (20, 1).]
For simplicity, Assumption. The internet can be in two states only: fast or slow. P(fast) = 2/3; P(slow) = 1/3.

How to measure the subjectively perceived cost of waiting time?

Tradeoff (TO) method. Elicit a chain of indifferences (with t0 = 0):

(slow: t1, fast: 25) ~ (slow: t0, fast: 35)
(slow: t2, fast: 25) ~ (slow: t1, fast: 35)
...
(slow: t6, fast: 25) ~ (slow: t5, fast: 35)

Under expected cost (EC), each indifference gives
(1/3)C(t_{j+1}) + (2/3)C(25) = (1/3)C(t_j) + (2/3)C(35),
so
C(t1) − C(t0) = C(t2) − C(t1) = ... = C(t6) − C(t5) = 2(C(35) − C(25)).
Normalize: C(t0) = 0; C(t6) = 1.
Consequently: C(tj) = j/6.

[Figure: subjective cost against waiting time; the elicited standard sequence 0 = t0, t1, ..., t6 is mapped to the equally spaced heights 0, 1/6, ..., 5/6, 1.]
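The standard-sequence construction can be simulated end to end. A minimal sketch, assuming a hypothetical concave cost curve C as a stand-in for the user's true (unknown) perception:

```python
# Sketch of the tradeoff (TO) method under expected cost (EC).
def C(t):
    """Hypothetical subjectively perceived cost of waiting t seconds."""
    return (t / 60.0) ** 0.5

def C_inv(y):
    """Inverse of C, used to solve each indifference for t_{j+1}."""
    return 60.0 * y * y

P_FAST, P_SLOW = 2 / 3, 1 / 3
# Each indifference (slow: t_{j+1}, fast: 25) ~ (slow: t_j, fast: 35)
# forces C(t_{j+1}) - C(t_j) = (P_FAST / P_SLOW) * (C(35) - C(25)).
step = (P_FAST / P_SLOW) * (C(35) - C(25))

ts = [0.0]                        # t_0 = 0
for _ in range(6):                # build the standard sequence t_1, ..., t_6
    ts.append(C_inv(C(ts[-1]) + step))

# Normalizing C(t_0) = 0 and C(t_6) = 1 recovers C(t_j) = j/6:
norm = [(C(t) - C(ts[0])) / (C(ts[6]) - C(ts[0])) for t in ts]
```

Any strictly increasing C yields the same normalized values j/6, since the chained indifferences force equal C-increments by construction.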
Tradeoff (TO) method revisited. The same chain of indifferences,

(slow: t1, fast: 25) ~ (slow: t0, fast: 35)
...
(slow: t6, fast: 25) ~ (slow: t5, fast: 35),

again gives
C(t1) − C(t0) = C(t2) − C(t1) = ... = C(t6) − C(t5) = 2(C(35) − C(25)).
[The same TO derivation goes through when the probabilities 2/3 and 1/3 are replaced by unknown, or even misperceived, probabilities.]

Measure subjective/unknown probabilities from elicited choices. Write p = P(slow). If
(slow: 25, fast: t1) ~ (slow: 35, fast: t0),
then under EC
p(C(35) − C(25)) = (1 − p)(C(t1) − C(t0)),
so
p = (C(t1) − C(t0)) / (C(35) − C(25) + C(t1) − C(t0)).

Abdellaoui (2000), Bleichrodt & Pinto (2000), Management Science.
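A numeric sketch of this probability elicitation, with a hypothetical cost curve and a hypothetical true probability (not the cited papers' data):

```python
# Recovering an unknown p = P(slow) from one elicited indifference under EC.
def C(t):
    """Hypothetical subjective cost curve."""
    return (t / 60.0) ** 0.5

def C_inv(y):
    return 60.0 * y * y

p_true = 0.4                      # unknown to the analyst
t0 = 0.0

# Simulate the elicited t1 from (slow: 25, fast: t1) ~ (slow: 35, fast: t0):
# p*C(25) + (1-p)*C(t1) = p*C(35) + (1-p)*C(t0).
t1 = C_inv(C(t0) + p_true / (1 - p_true) * (C(35) - C(25)))

# Recover p from the observed t1 via the slide's formula:
p_hat = (C(t1) - C(t0)) / (C(35) - C(25) + C(t1) - C(t0))
```

The recovered p_hat equals p_true regardless of the shape of C, which is the point: probabilities are measured without knowing the cost curve's form.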
Say, some observations show:
C(t2) − C(t1) = C(t1) − C(t0).
Other observations show:
C(t2′) − C(t1) = C(t1) − C(t0), for t2′ > t2.
Then you have empirically falsified the EC model!

Definition. Tradeoff consistency holds if this never happens.
What if the data are inconsistent?

Theorem. The EC model holds iff tradeoff consistency holds.

Descriptive application: the EC model is falsified iff tradeoff consistency is violated.

2. A New Foundation of (Static) Bayesianism

Normative application: you can convince a client to use EC iff you can convince the client that tradeoff consistency is reasonable.
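Tradeoff consistency can be checked mechanically on elicited data. A minimal sketch; the data format (pairs of a step index and the elicited waiting time for that step) is invented here for illustration:

```python
def tradeoff_consistent(elicitations, tol=1e-6):
    """Each elicitation (j, t) claims t realizes the j-th standard-sequence
    step, i.e. C(t) - C(t_{j-1}) equals the common increment.
    Two elicitations of the same step that disagree falsify EC."""
    seen = {}
    for j, t in elicitations:
        if j in seen and abs(seen[j] - t) > tol:
            return False          # same increment, different t: EC falsified
        seen[j] = t
    return True

# Consistent data: each step elicited once.
assert tradeoff_consistent([(2, 9.0), (3, 14.0)])
# The slide's violation: a second elicitation yields t2' > t2.
assert not tradeoff_consistent([(2, 9.0), (2, 11.5)])
```

By the theorem, passing this check on all elicitable comparisons is exactly what licenses the EC representation.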
3. Carnap's Updating Method

We examine: Rudolf Carnap's (1952, 1980) ideas about the Dirichlet family of probability distributions.
Example. A doctor, say YOU, has to choose the treatment of a patient standing before you.
The patient has exactly one ("true") disease from the set D = {d1,...,ds} of possible diseases.
You are uncertain about which is the true disease.
For simplicity, Assumption. The results of treatment can be expressed in monetary terms.
Definition. Treatment (di:1): if the true disease is di, it saves $1 compared to the common treatment; otherwise, it is equally expensive.

treatment (di:1):  d1 ... di ... ds
yields:             0 ...  1 ...  0

Uncertainty about which disease dj is true ⟹ uncertainty about what the outcome (money saved) of the treatment will be.
20
When deciding on your patient, you have observed t similar patientsin the past, and found out their true disease.
Notation.E = (E1,...,Et), Ei describes disease of ith patient.
Assumption.
21
You are Bayesian.
So, expected uility.
Assumption.
22
Given info E, probs are to be taken as follows:
Imagine someone, say me, gives you advice:
p_i^E = (λ/(λ+t)) · p_i^0 + (t/(λ+t)) · (n_i/t)

n_i/t: observed relative frequency of d_i in E1,…,Et;
λ > 0: a subjective parameter (as are the p_i^0's).

Appealing! A natural way to integrate
- subject-matter info (the p_i^0's);
- statistical information (the n_i/t's).

The subjective parameters disappear as t → ∞.
Alternative interpretation: combining evidence.
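The advice is a one-line mixture of prior opinion and observed frequencies. A sketch (function name mine):

```python
def carnap_update(prior, counts, lam):
    """Carnap's inductive method:
    p_i^E = (lam/(lam+t)) * p_i^0 + (t/(lam+t)) * (n_i/t),
    equivalently (lam * p_i^0 + n_i) / (lam + t), with t = total count."""
    t = sum(counts)
    return [(lam * p0 + n) / (lam + t) for p0, n in zip(prior, counts)]

# Subject-matter info p^0 blended with the observed frequencies n_i/t:
post = carnap_update([0.5, 0.3, 0.2], [6, 3, 1], lam=2.0)
```

As t grows, the subjective λ and p_i^0 wash out and the result approaches the relative frequencies n_i/t; the output sums to 1 whenever the prior does.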
Appealing advice, but ad hoc!
Why not weight by t² instead of t? Why not take a geometric mean? Why not have λ depend on t and n_i, and on the other n_j's?
Decision theory can make things less ad hoc.
An aside. The main mathematical problem: to formulate everything in terms of the "naïve space," as Grünwald & Halpern (2002) call it.
Let us change the subject. Forget about the advice, for the time being.

(1) Wouldn't you want to satisfy:
Positive relatedness of the observations:
(di:1) ~E $x ⟹ (di:1) ≽(E,di) $x.
(After observing di once more, the treatment (di:1) is worth at least as much.)
27
Past-exchangeability:(di:1) ~E $x (di:1) ~E' $x
whenever:E = (E1,...,Em1,dj,dk,E
m+2,...,Et)
andE' = (E1,...,Em1, , ,Em+2,...,Et)
(2) Wouldn’t you want to satisfy:
dk dj
for some m < t, j,k.
28
Ej. . . . . . Et
¬ni
di attime t+1
E1
ni ns. . . . . .n1
past-exchange-bility
disjoint causality
next, 29
31
31
(3) Wouldn't you want to satisfy:
Future-exchangeability:
Assume $x ~E (dj:y) and $y ~(E,dj) (dk:z). Interpretation: $x ~E (dj and then dk: z).
Assume $x′ ~E (dk:y′) and $y′ ~(E,dk) (dj:z′). Interpretation: $x′ ~E (dk and then dj: z′).
Now: x = x′ ⟹ z = z′.
Interpretation: [dj then dk] is as likely as [dk then dj], given E.
(4) Wouldn't you want to satisfy:
Disjoint causality: for all E and distinct i, j, k,
(di:1) ~(E,dj) $x ⟺ (di:1) ~(E,dk) $x.

[Figure: a violation, involving diseases d1, d2, d3 with a common cause, bad nutrition, versus an "other cause".]
31
Theorem. Assume s3. Equivalent are: (i) (a) Tradeoff consistency;
Decision-theoretic surprise:
pEi =ip0 +
ni
tt
+ t
(b) Positive relatedness of obsns; (c) Exchangeability (past and future); (d) Disjoint causality.(ii) EU holds for each E with fixed U, and Carnap’s inductive method:
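One direction of the equivalence can be spot-checked numerically: Carnap's rule satisfies positive relatedness. A sketch (an illustration on a few hypothetical count vectors, not a proof):

```python
def p_carnap(i, prior, counts, lam):
    """Carnap probability of disease i given the observed counts."""
    t = sum(counts)
    return (lam * prior[i] + counts[i]) / (lam + t)

prior, lam = [0.5, 0.3, 0.2], 1.5
for counts in ([0, 0, 0], [4, 1, 2], [0, 9, 3]):
    for i in range(len(prior)):
        before = p_carnap(i, prior, counts, lam)
        extra = list(counts)
        extra[i] += 1             # observe d_i once more
        after = p_carnap(i, prior, extra, lam)
        assert after >= before    # (d_i:1) never loses value
```

Past-exchangeability holds trivially for the rule, since only the counts n_i enter, never the order of the observations.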
4. Corrections of Probability Judgments Based on Empirical Findings

Abdellaoui (2000), Bleichrodt & Pinto (2000) (and many others): subjective probabilities are nonadditive.
Assume a simple model: (A:x) ↦ W(A)U(x); U(0) = 0; W nonadditive; W may be a Dempster-Shafer belief function. Only nonnegative outcomes.
Tversky & Fox (1995): two-stage model, W = w∘b;
b: direct psychological judgment of probability;
w: turns judgments of probability into decision weights; w can be measured from cases where objective probabilities are known.
W(A∪B) ≥ W(A) + W(B) for disjoint A, B (superadditivity); e.g., Dempster-Shafer belief functions.
Economists/AI: w is convex. This enhances superadditivity.

[Figure: a convex probability weighting function w on [0,1].]
Psychologists: for moderate p, q,
w(p + q) ≤ w(p) + w(q) (subadditivity).
The w component of W enhances subadditivity of W:
W(A∪B) ≤ W(A) + W(B) for disjoint events A, B, contrary to the common assumptions about belief functions above.
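The two camps' assumptions about w can be contrasted with toy weighting functions (both chosen here purely for illustration):

```python
# Toy weighting functions: a convex w (economists/AI) is superadditive,
# while a square-root w (psychologists' subadditivity) is subadditive.
convex_w = lambda p: p ** 2
subadd_w = lambda p: p ** 0.5

for p in (0.1, 0.2, 0.3):
    for q in (0.1, 0.2, 0.3):
        # convex: w(p+q) >= w(p) + w(q)
        assert convex_w(p + q) >= convex_w(p) + convex_w(q)
        # subadditive: w(p+q) <= w(p) + w(q)
        assert subadd_w(p + q) <= subadd_w(p) + subadd_w(q)
```

Composing W = w∘b, a subadditive w pushes W toward W(A∪B) ≤ W(A) + W(B), the opposite of the belief-function inequality above.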
b = w⁻¹∘W: a behavioral derivation of the expert's judgment.
Tversky & Fox (1995): more nonlinearity in b than in w; b's and W's deviations from linearity are of the same nature as in Figure 3.
Tversky & Wakker (1995): formal definitions.
Non-Bayesians: alternatives to the Dempster-Shafer belief functions; no degeneracy after multiple updating.
Figure 3 for b and W: lack of sensitivity towards varying degrees of uncertainty. Figure 3 better reflects absence of information than convexity does.
Figure 3 comes from data and suggests new concepts, e.g., info-sensitivity instead of conservativeness/pessimism.
Bayesians: Figure 3 suggests how to correct expert judgments.
Support theory (Tversky & Koehler 1994). Typical finding: for disjoint Aj,
b(A1) + ... + b(An) − b(A1 ∪ ... ∪ An)
increases as n increases.
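This unpacking effect falls out of any subadditive judgment function. A toy model with b(A) = P(A)^0.7 (the exponent is my illustrative choice, not an estimate from the support-theory data):

```python
# Toy support-theory demo: judged probability b(A) = P(A) ** 0.7.
# Unpacking an event into n equally likely disjoint pieces inflates the
# summed judgments relative to the judgment of the union.
def excess(n, b=lambda p: p ** 0.7):
    pieces = [1.0 / n] * n        # disjoint A_1, ..., A_n covering the union
    return sum(b(p) for p in pieces) - b(sum(pieces))

gaps = [excess(n) for n in range(1, 6)]
assert all(g2 > g1 for g1, g2 in zip(gaps, gaps[1:]))   # grows with n
```

Here excess(n) = n^0.3 − 1, so the gap between summed judgments and the judged union grows without bound as the event is unpacked further.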